- Requires high-quality mesh-based Gaussian avatars as input.
- FullbodyFix is currently triggered by visual inspection.
- Animatable transfer remains future work.
AvatarMix performs free-viewpoint outfit personalization by composing a user’s identity cues (head–neck, body shape and scale, and skin tone) with a model’s clothed outfit in a 3DGS representation. The examples show variations in height and body proportions, cross-ethnicity skin tones, and diverse garments, while preserving fine details such as printed text and skirt folds under composition and body-shape retargeting. Examples are rendered in metric scale without post-hoc rescaling.
Existing 3D avatar outfit transfer methods face distinct challenges: approaches that lift 2D edits to 3D often suffer from outfit or identity quality degradation, while those that separately model body and clothing layers are prone to intersection artifacts. We introduce AvatarMix, a compositional paradigm that bypasses these issues by directly composing the head and body from two high-fidelity Gaussian avatars. While this paradigm inherently preserves outfit quality and avoids intersections, it introduces challenges in creating a seamless join and maintaining appearance fidelity after body reshaping.
To this end, we propose a two-tier refinement strategy: SeamFix, a localized diffusion module that refines hair and neck to ensure an artifact-free join, and an optional full-body refinement, FullbodyFix, that restores garment appearance when retargeting degrades the clothed body. Both operate on renders from an already 3D-consistent Gaussian avatar, which limits multi-view artifacts compared to 2D-to-3D lifting. To preserve the user's body identity, our mesh-based Gaussian representation enables the adaptation of a robust mesh retargeting technique, precisely reshaping the clothed body to the user's physique and robustly handling diverse body shapes. Extensive experiments demonstrate that our method achieves state-of-the-art results in outfit fidelity and identity preservation, providing a new perspective for realistic 3D outfit personalization.
Reference numbers follow the paper/poster.
For outfit personalization, combine the User’s identity with the Model’s outfit while preserving high-fidelity 3DGS appearance.
Failure example of 2D-to-3D try-on [1]: red dotted boxes show view-inconsistent garment details.
Given multi-view images of a User and a Model, we first employ Mesh-Based Avatar Reconstruction with semantic segmentation. We then perform Cross-Avatar Geometric Composition by aligning the User’s head and neck to the Model’s pose and reshaping the Model’s clothed body via GSReshape so that the body geometry matches the User’s physique. Finally, our two-tier diffusion refinement operates on rendered views, followed by 3D Gaussian fine-tuning, to produce the final identity-transfer result.
Reconstruct mesh-based 3DGS avatars from multi-view images [3,4], with semantic segmentation and SMPL fitting.
Align the User’s head and neck to the Model’s pose, then combine them with the Model’s clothed body. Transfer the User’s skin tone to the body for identity preservation.
Retarget the clothed body to the User’s physique by jointly deforming the mesh and the attached Gaussians [5].
Repair seam and reshaping artifacts, then use the refined views to update the composed 3DGS avatar [6].
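The joint deformation of mesh and Gaussians in the reshaping step above can be illustrated with a minimal sketch. It assumes, as is common for mesh-based Gaussian avatars but not confirmed here for GSReshape [5], that each Gaussian is bound to one triangle through barycentric coordinates and a signed offset along the face normal, so that its center follows the mesh when the vertices move. The function names and the binding parameters (`face_idx`, `bary`, `offset`) are hypothetical, and updates to Gaussian rotation and scale are omitted.

```python
import numpy as np

def face_normals(vertices, faces):
    """Unit normal of every triangle; degenerate faces are not handled."""
    tri = vertices[faces]                                  # (F, 3, 3)
    n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def gaussian_centers(vertices, faces, face_idx, bary, offset):
    """Recompute Gaussian centers from the current mesh vertices.

    face_idx: (G,)   index of the triangle each Gaussian is bound to (assumed)
    bary:     (G, 3) barycentric coordinates inside that triangle (assumed)
    offset:   (G,)   signed distance along the face normal (assumed)
    """
    tri = vertices[faces[face_idx]]                        # (G, 3, 3)
    on_surface = np.einsum("gk,gkd->gd", bary, tri)        # point on the triangle
    normals = face_normals(vertices, faces)[face_idx]      # (G, 3)
    return on_surface + offset[:, None] * normals
```

With this binding, reshaping only has to move the mesh vertices: calling `gaussian_centers` again with the deformed vertices gives the new Gaussian positions, with each Gaussian keeping its place on the surface and its height above it.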
| Method | Outfit DINO ↑ [7] | Head + Neck DINO ↑ [7] | Warp RMSE ↓ [8] | User Preference ↑ |
|---|---|---|---|---|
| VTON360 [1] | 0.633 | 0.786 | 0.0276 | 8.7% |
| TIP-Editor [10] | N/A | 0.356 | 0.0388 | 2.6% |
| Ours | 0.883 | 0.818 | 0.0175 | 88.7% |
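As a rough guide to reading the table, the sketch below shows the two basic quantities such scores are usually built from. It assumes that a DINO score is the cosine similarity between feature vectors of two images and that Warp RMSE is a root-mean-square error between two arrays; the exact protocols are those of [7] and [8] and are not reproduced here. Feature extraction and image warping are out of scope, so the inputs are plain arrays and the function names are hypothetical.

```python
import numpy as np

def cosine_similarity(feat_a, feat_b):
    """Cosine similarity between two feature vectors (higher is more similar)."""
    a = np.asarray(feat_a, dtype=np.float64).ravel()
    b = np.asarray(feat_b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rmse(pred, target):
    """Root-mean-square error between two equally shaped arrays (lower is better)."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Under this reading, the up arrow on the DINO columns means feature vectors that point in more similar directions, and the down arrow on Warp RMSE means a smaller average error.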
Qualitative comparison with TIP-Editor and VTON360. AvatarMix better preserves facial identity, garment texture, and seam quality while avoiding view inconsistency, unnatural wrinkles, and degraded hands.
SeamFix cleans head-neck seams, FullbodyFix restores garment appearance degraded by reshaping, and GSReshape adapts the clothed body to the User’s physique.
Additional GSReshape examples showing body-shape-aware garment adaptation.
Hand-aware skin tightness examples for robust Gaussian-avatar reshaping.
Additional comparisons on more User-Model pairs, demonstrating preservation of identity and outfit appearance across diverse viewpoints.
Demo video.
Zhaorong Wang
University of Tsukuba
zhaorong.wang1997@gmail.com · larsph.github.io
@inproceedings{wang2026avatarmix,
author = {Wang, Zhaorong and Kanamori, Yoshihiro and Endo, Yuki},
title = {{AvatarMix}: Identity-Preserving Cross-Avatar Composition for Outfit Personalization},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
month = {June},
year = {2026},
pages = {425--435}
}