Noise-Based Avatar Geometry Generation and Gaussian Splatting Visualization

Reading time: 5 minutes
...

📝 Original Info

  • Title: Noise-Based Avatar Geometry Generation and Gaussian Splatting Visualization
  • ArXiv ID: 2512.07459
  • Date: 8 Dec 2025 (arXiv v1)
  • Authors: Xiangjun Tang, Biao Zhang, Peter Wonka (King Abdullah University of Science and Technology, KAUST)

📝 Abstract

Figure 1 (panels: (a) Points, (b) Depth/Color, (c) Normal, (d) Mesh). Our generative framework produces diverse avatar geometry sequences from noise X ~ 𝒩, with geometries represented as points (a). For visualization, these points can be rendered via Gaussian splatting (GS), producing depth images (b) and normal images (c). Colors (b) can then be obtained by GS optimization, using a depth-guided video generation model (Wan 2.1), while the normal images (c) effectively highlight fine folds and wrinkles. Our synthesized geometries are of high quality and can be directly converted into meshes (d) via Poisson reconstruction. The highlighted regions demonstrate fine-grained garment dynamics that faithfully follow human motion.
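The caption's final step, converting the generated points directly into a mesh via Poisson reconstruction, can be sketched with off-the-shelf tooling. The snippet below is a minimal sketch assuming Open3D; the paper does not specify its implementation, and the file names and parameter values here are placeholders.

```python
# A minimal sketch, assuming Open3D, of the caption's "points -> mesh" step.
# Poisson reconstruction requires oriented normals, so we estimate and orient
# them from the point cloud first. The file names are placeholders.
import numpy as np
import open3d as o3d

points = np.load("avatar_points.npy")  # hypothetical (N, 3) generated points
pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(points))
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
pcd.orient_normals_consistent_tangent_plane(k=30)

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)
o3d.io.write_triangle_mesh("avatar_mesh.ply", mesh)
```

The `depth` parameter controls the octree resolution and trades reconstruction detail against memory; higher values preserve finer surface folds.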

💡 Deep Analysis

Figure 1

📄 Full Content

Human Geometry Distribution for 3D Animation Generation

Xiangjun Tang, Biao Zhang, Peter Wonka* (*corresponding author)
King Abdullah University of Science and Technology
{xiangjun.tang, biao.zhang, peter.wonka}@kaust.edu.sa

arXiv:2512.07459v1 [cs.GR], 8 Dec 2025

Abstract

Generating realistic human geometry animations remains a challenging task, as it requires modeling natural clothing dynamics with fine-grained geometric details under limited data. To address these challenges, we propose two novel designs. First, we propose a compact distribution-based latent representation that enables efficient and high-quality geometry generation. We improve upon previous work by establishing a more uniform mapping between SMPL and avatar geometries. Second, we introduce a generative animation model that fully exploits the diversity of limited motion data. We focus on short-term transitions while maintaining long-term consistency through an identity-conditioned design. These two designs formulate our method as a two-stage framework: the first stage learns a latent space, while the second learns to generate animations within this latent space. We conducted experiments on both our latent space and animation model. We demonstrate that our latent space produces high-fidelity human geometry surpassing previous methods (90% lower Chamfer Distance). The animation model synthesizes diverse animations with detailed and natural dynamics (2.2× higher user study score), achieving the best results across all evaluation metrics.

1. Introduction

Generating 3D human geometry animation is a fundamental task in visual generation and human modeling. The goal is to synthesize natural dynamics with fine-grained geometric details, which poses significant challenges. First, capturing fine-grained details requires modeling subtle geometric structures such as folds and wrinkles. Second, learning natural dynamics is challenging due to the limited availability of 3D animation data, where models may easily overfit and fail to reproduce realistic garment deformation in response to human movement.

Early methods [28, 33, 34] learn dynamics for specific garments, or model avatars from video or scanned data [11, 17–19, 23, 29, 32, 40, 41, 45, 53, 61]. While these approaches can synthesize plausible dynamics with limited data, they are not generative methods and fail to generalize to unseen avatars or garments. In contrast, generative avatar models [6, 13, 16, 22, 51, 59, 62] extend to diverse identities and offer better generalization, yet they struggle to preserve high-fidelity geometry and learn realistic clothing deformations. Overall, no existing approach satisfies both requirements.

To address these challenges, we propose two key designs. First, we propose a latent representation based on the Human Geometry Distribution (HuGeoDis) [39], which enables the synthesis of high-fidelity geometry from a compact latent representation. However, the original HuGeoDis suffers from imbalanced sampling: it requires a large number of points to adequately cover a geometry, and under-sampled areas often lead to reconstruction artifacts. To mitigate this, we design a new training scheme that first establishes more uniform mappings between SMPL and avatar geometries, and then learns from these correspondences. This design enables high-quality geometry generation with significantly fewer points, thereby improving efficiency for long animation sequences.
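The imbalanced-sampling problem above has a standard uniform-coverage analogue: farthest point sampling. The sketch below is purely illustrative (it is not the paper's training scheme, and every name in it is hypothetical); it shows how a dense point set can be resampled so that far fewer points still cover the surface evenly.

```python
# Illustrative only: uniform resampling via farthest point sampling (FPS),
# a standard heuristic for even surface coverage. Not the paper's method.
import numpy as np

def farthest_point_sampling(points: np.ndarray, n_samples: int) -> np.ndarray:
    """Greedily pick n_samples points, each farthest from those chosen so far."""
    n = points.shape[0]
    chosen = np.zeros(n_samples, dtype=np.int64)
    dist = np.full(n, np.inf)           # squared distance to the chosen set
    chosen[0] = np.random.randint(n)    # arbitrary seed point
    for i in range(1, n_samples):
        delta = points - points[chosen[i - 1]]
        dist = np.minimum(dist, np.einsum("ij,ij->i", delta, delta))
        chosen[i] = int(np.argmax(dist))  # farthest remaining point
    return points[chosen]

# Example: resample a dense avatar point set down to 8k near-uniform points.
dense = np.random.rand(100_000, 3).astype(np.float32)  # stand-in geometry
uniform = farthest_point_sampling(dense, 8_000)
```

FPS greedily spreads samples over the geometry; this balanced coverage is the kind of property the paper's uniform SMPL-to-avatar mapping is designed to achieve with fewer points.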
Second, we introduce a generative animation model that captures temporal dynamics from limited 3D human animation data. We employ a conditional diffusion model that models short-term transitions, which has been empirically shown to leverage diverse motion data more effectively than directly modeling long sequences [37]. Long sequences are generated autoregressively from these transitions, with long-term consistency preserved via conditional inputs to the diffusion model (see the rollout sketch below).

We conduct experiments to validate both our distribution-based latent representation and the generative animation model. For the latent space, we evaluate reconstruction accuracy and efficiency, and further assess its performance on the static random avatar generation task, a standard benchmark in avatar generation.
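To make the rollout concrete, here is a hedged sketch of chaining diffusion-generated short-term transitions into a long sequence, with an identity latent as the consistency condition. The `denoise_step` interface, latent shapes, and step count are hypothetical stand-ins, not the paper's actual API.

```python
# A hedged sketch of the autoregressive rollout described above: a conditional
# diffusion model produces short latent transitions, and long sequences are
# chained from them. The identity latent passed at every step is the
# long-term consistency signal. All interfaces here are hypothetical.
import torch

@torch.no_grad()
def rollout(model, z0: torch.Tensor, identity: torch.Tensor,
            n_windows: int, window_len: int, n_steps: int = 50) -> torch.Tensor:
    """Chain short-term transitions into one long latent sequence."""
    seq = [z0]  # z0: (B, D) latent of the starting frame
    for _ in range(n_windows):
        # Each window starts from pure noise, conditioned on the last
        # generated frame and on the identity latent.
        x = torch.randn(z0.shape[0], window_len, z0.shape[-1], device=z0.device)
        for t in reversed(range(n_steps)):  # reverse diffusion over the window
            x = model.denoise_step(x, t, prev=seq[-1], identity=identity)
        seq.extend(x.unbind(dim=1))  # append the window's frames one by one
    return torch.stack(seq, dim=1)  # (B, 1 + n_windows * window_len, D)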

📸 Image Gallery

supervised_model.png

Reference

This content is AI-processed based on open access ArXiv data.
