Multi-Rigid-Body Approximation of Human Hands with Application to Digital Twin

📝 Original Info

  • Title: Multi-Rigid-Body Approximation of Human Hands with Application to Digital Twin
  • ArXiv ID: 2512.07359
  • Date: 2025-12-08
  • Authors: Bin Zhao, Yiwen Lu, Haohua Zhu, Xiao Li, Sheng Yi

📝 Abstract

Human hand simulation plays a critical role in digital twin applications, requiring models that balance anatomical fidelity with computational efficiency. We present a complete pipeline for constructing multi-rigid-body approximations of human hands that preserve realistic appearance while enabling real-time physics simulation. Starting from optical motion capture of a specific human hand, we construct a personalized MANO (hand Model with Articulated and Non-rigid defOrmations) model and convert it to a URDF (Unified Robot Description Format) representation with anatomically consistent joint axes. The key technical challenge is projecting MANO's unconstrained SO(3) joint rotations onto the kinematically constrained joints of the rigid-body model. We derive closed-form solutions for single degree-of-freedom joints and introduce a Baker-Campbell-Hausdorff (BCH)-corrected iterative method for two degree-of-freedom joints that properly handles the non-commutativity of rotations. We validate our approach through digital twin experiments where reinforcement learning policies control the multi-rigid-body hand to replay captured human demonstrations. Quantitative evaluation shows sub-centimeter reconstruction error and successful grasp execution across diverse manipulation tasks.

💡 Deep Analysis

Figure 1

📄 Full Content

Digital twin technology requires accurate real-time simulation of human manipulation, which motivates hand models that balance visual realism with computational efficiency. Current approaches face a fundamental trade-off. High-fidelity mesh models like MANO [9] and its extensions [12,13] provide excellent visual quality but require expensive soft-body simulation unsuitable for real-time applications. Whereas optimization-based methods require 3.44 ms, our approach achieves 5.36° mean error in 0.41 ms, over 8× faster (Table 1). Skeleton-only approaches enable fast simulation but lack visual fidelity. Recent digital twin systems [2,5] demonstrate strong task performance yet remain limited by hand models that cannot simultaneously maintain visual realism and high-frequency updates. This gap becomes critical in virtual training, teleoperation, and human-robot collaboration, where both speed and visual accuracy matter.

We propose a multi-rigid-body approximation that represents the hand as a kinematic tree of rigid links with fixed meshes, preserving the visual appearance of MANO (hand Model with Articulated and Non-rigid defOrmations) while enabling standard rigid-body physics simulation. This retains the benefits of both approaches: the computational efficiency of rigid-body dynamics and the visual quality of mesh-based models. The rigid links map naturally to URDF (Unified Robot Description Format), which is compatible with standard robotics simulators and control algorithms.
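To make the link-and-joint structure concrete, here is a minimal Python sketch that emits a two-link URDF with a single hinge joint. It is an illustration only, not the authors' exporter: the link names, joint name, mesh paths, origin offset, and limits are all hypothetical placeholders. Only the URDF element names (`robot`, `link`, `joint`, `axis`, `limit`, and so on) follow the format itself.

```python
import xml.etree.ElementTree as ET


def make_finger_urdf(axis=(1.0, 0.0, 0.0)):
    """Build a two-link URDF string with one revolute (hinge) joint.

    All names, mesh paths, and numeric values are illustrative assumptions.
    """
    robot = ET.Element("robot", name="hand_sketch")

    # Each rigid link carries one fixed visual mesh (placeholder file path).
    for link_name in ("index_proximal", "index_middle"):
        link = ET.SubElement(robot, "link", name=link_name)
        visual = ET.SubElement(link, "visual")
        geometry = ET.SubElement(visual, "geometry")
        ET.SubElement(geometry, "mesh", filename=f"meshes/{link_name}.obj")

    # A 1-DOF hinge connecting parent and child link about a fixed axis.
    joint = ET.SubElement(robot, "joint", name="index_pip", type="revolute")
    ET.SubElement(joint, "parent", link="index_proximal")
    ET.SubElement(joint, "child", link="index_middle")
    ET.SubElement(joint, "origin", xyz="0 0 0.04", rpy="0 0 0")
    ET.SubElement(joint, "axis", xyz=" ".join(str(v) for v in axis))
    # Revolute joints need a limit element; these bounds are made up.
    ET.SubElement(joint, "limit", lower="0.0", upper="1.6",
                  effort="1.0", velocity="10.0")

    return ET.tostring(robot, encoding="unicode")
```

A full hand would repeat this pattern for every phalanx, with 2-DOF joints typically expressed as two stacked revolute joints joined by a massless intermediate link.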

The technical challenge lies in bridging two different rotation representations. MANO parameterizes each joint as an unconstrained 3-DOF rotation in SO(3), while robotic joints are typically constrained to 1-DOF (hinge) or 2-DOF (universal) rotations. Simply discarding degrees of freedom loses essential motion, while naive projection onto constrained axes produces kinematically inconsistent results due to rotation non-commutativity. Previous work either accepts these approximation errors or relies on optimization-based retargeting that lacks real-time guarantees.
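The non-commutativity mentioned above is easy to check numerically. The sketch below composes a flexion and an abduction rotation in both orders and measures the angle between the two results. The axis assignment (x for flexion, y for abduction) and the angle values are assumptions chosen purely for illustration.

```python
import numpy as np


def rot(axis, angle):
    """Rodrigues' formula: rotation matrix about a (normalized) axis."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def geodesic_angle(R1, R2):
    """Rotation angle (radians) of the relative rotation R1^T R2."""
    c = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))


flex, abd = 0.8, 0.3  # illustrative joint angles in radians
R_flex_then_abd = rot([1, 0, 0], flex) @ rot([0, 1, 0], abd)
R_abd_then_flex = rot([0, 1, 0], abd) @ rot([1, 0, 0], flex)

# Non-zero whenever the two axes differ: the composition order matters.
order_gap = geodesic_angle(R_flex_then_abd, R_abd_then_flex)
```

Because `order_gap` is non-zero, reading off per-axis angles independently and recomposing them does not in general reproduce the original rotation.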

Our approach addresses this through a mathematically principled projection framework. For single-DOF joints, we derive a closed-form projection formula based on the tangent space structure of SO(3). For two-DOF joints, we develop an iterative method using the Baker-Campbell-Hausdorff (BCH) formula to handle the non-linear interaction between rotation axes. The method converges rapidly (typically 3-5 iterations) and produces kinematically consistent joint angles that best approximate the original MANO pose.
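The excerpt does not reproduce the paper's formulas, so the following Python sketch is a stand-in under stated assumptions rather than the authors' method. For the 1-DOF case it uses the angle that maximizes trace(Rᵀ exp(θ[a]×)), which has the closed form θ = atan2(a·v, tr(R) − aᵀRa) with v the vee of R − Rᵀ. For the 2-DOF case it assumes R ≈ exp(θ₁[a₁]×) exp(θ₂[a₂]×) and runs a fixed-point iteration that subtracts BCH terms truncated at third order before a least-squares fit. All function names, the truncation order, and the iteration count are illustrative choices.

```python
import numpy as np


def hat(v):
    """Map a 3-vector to its skew-symmetric matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def exp_so3(w):
    """Exponential map: rotation vector to rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(np.asarray(w, dtype=float) / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def vee_antisym(R):
    """Vee of (R - R^T); equals 2*sin(angle)*axis for a rotation matrix."""
    return np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])


def log_so3(R):
    """Log map to a rotation vector (sketch: not robust near angle = pi)."""
    v = vee_antisym(R)
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v


def project_1dof(R, axis):
    """Closed-form hinge angle about unit `axis` closest to R (trace-optimal)."""
    a = np.asarray(axis, dtype=float)
    return float(np.arctan2(a @ vee_antisym(R), np.trace(R) - a @ R @ a))


def bch3(x, y):
    """log(exp(x^) exp(y^)) as a vector, truncated after third-order terms."""
    xy = np.cross(x, y)
    return x + y + 0.5 * xy + (np.cross(x, xy) + np.cross(y, np.cross(y, x))) / 12.0


def project_2dof(R, a1, a2, iters=5):
    """Fit (theta1, theta2) so that exp(theta1 a1^) exp(theta2 a2^) ~ R."""
    a1, a2 = np.asarray(a1, dtype=float), np.asarray(a2, dtype=float)
    A = np.column_stack([a1, a2])
    omega = log_so3(R)
    # Naive start: least-squares projection of the rotation vector.
    theta = np.linalg.lstsq(A, omega, rcond=None)[0]
    for _ in range(iters):
        # Higher-order BCH terms that the linear model A @ theta misses.
        correction = bch3(theta[0] * a1, theta[1] * a2) - A @ theta
        theta = np.linalg.lstsq(A, omega - correction, rcond=None)[0]
    return theta
```

Calling `project_2dof(R, a1, a2, iters=0)` returns the uncorrected linear projection, which makes it straightforward to compare against the corrected estimate on a rotation built from known angles. For moderate joint angles the correction term is small, so a handful of iterations is usually enough in this sketch.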

Our contributions are threefold. First, we present a complete pipeline from human hand capture to multi-rigid-body URDF model, including automated mesh segmentation and joint axis determination based on anatomical heuristics. Second, we develop closed-form and BCH-corrected projection methods for mapping unconstrained SO(3) rotations to kinematically constrained joints, with mathematical analysis of convergence and accuracy. Third, we validate our approach through digital twin experiments showing successful replay of human demonstrations using RL-trained policies, achieving sub-centimeter tracking error across diverse manipulation tasks.

Hand Modeling Approaches. MANO [9] established parametric hand representation by mapping pose and shape parameters to 3D meshes via linear blend skinning. Extensions include MS-MANO [12] with biomechanical constraints, MeMaHand [11] combining parametric and nonparametric accuracy for two-hand reconstruction, and SMPL-X [7] for whole-body modeling with 54 hand joints. PhysHand [13] uses multi-layer geometry with constraint-based dynamics for realistic contact. However, mesh-based methods require expensive soft-body simulation unsuitable for real-time use. Traditional rigid-body models enable faster simulation but lack visual fidelity. Our approach preserves MANO’s visual quality while enabling rigid-body physics at 1000+ Hz.

Motion Retargeting. Converting between kinematic representations is challenging when projecting unconstrained rotations to constrained joints. Optimization methods [1] achieve high accuracy but lack real-time guarantees. Analytical approaches [3] compute faster but often target specific morphologies. Rotation non-commutativity creates fundamental issues: naive projection yields kinematically inconsistent results, while the Baker-Campbell-Hausdorff formula [4] handles rotation composition but sees limited hand retargeting use. Neural methods like GeoRT [14] provide ultrafast retargeting but lack interpretability. We derive closed-form solutions for 1-DOF joints and BCH-corrected iteration for 2-DOF joints, achieving mathematical rigor and efficiency.

Digital Twin and Simulation. Digital twins require real-time hand simulation for teleoperation and collaboration. DexSim2Real² [5] constructs world models through active interactions, while BiDexHD [2] achieves 74% task success via multi-task RL. Isaac Gym reaches 30,000+ FPS for manipulation training vers

Reference

This content is AI-processed based on open access ArXiv data.
