DynaRetarget: Dynamically-Feasible Retargeting using Sampling-Based Trajectory Optimization


In this paper, we introduce DynaRetarget, a complete pipeline for retargeting human motions to humanoid control policies. The core component of DynaRetarget is a novel Sampling-Based Trajectory Optimization (SBTO) framework that refines imperfect kinematic trajectories into dynamically feasible motions. SBTO incrementally advances the optimization horizon, enabling optimization over the entire trajectory for long-horizon tasks. We validate DynaRetarget by successfully retargeting hundreds of humanoid-object demonstrations and achieving higher success rates than the state of the art. The framework also generalizes across varying object properties, such as mass, size, and geometry, using the same tracking objective. This ability to robustly retarget diverse demonstrations opens the door to generating large-scale synthetic datasets of humanoid loco-manipulation trajectories, addressing a major bottleneck in real-world data collection.


💡 Research Summary

DynaRetarget presents a complete pipeline for converting human‑object interaction demonstrations into dynamically feasible whole‑body motions for humanoid robots. The core contribution is a novel Sampling‑Based Trajectory Optimization (SBTO) method that refines imperfect kinematic retargets into physically consistent trajectories by incrementally expanding the optimization horizon. Unlike prior approaches that either rely solely on inverse‑kinematics (IK) retargeting or employ short‑horizon sampling‑based model predictive control (SBMPC), SBTO optimizes control “knots” sequentially: it first optimizes the earliest knot, then adds the next knot once the covariance of the sampling distribution has converged below a threshold, and repeats until the full horizon is covered. This progressive horizon expansion mitigates the curse of dimensionality inherent in whole‑body humanoid control (hundreds of DoFs) while preserving global consistency across long‑duration tasks.
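The progressive horizon expansion described above can be illustrated with a toy sketch. It is not the authors' implementation: it assumes one scalar value per knot, a diagonal Gaussian, a plain elite-mean update, and a hand-written quadratic cost in place of a physics rollout. The names `incremental_cem`, `tracking_cost`, `n_samples`, `n_elite`, and `max_iters`, and the value chosen for `sigma_min`, are hypothetical.

```python
import random

def incremental_cem(reference, cost, sigma_0=0.25, sigma_min=0.02,
                    n_samples=64, n_elite=6, max_iters=100, seed=0):
    """Toy incremental-horizon optimizer: one scalar per knot (assumption)."""
    rng = random.Random(seed)
    mean, std = [], []
    for ref_knot in reference:
        # Extend the horizon by one knot, initialized from the reference.
        mean.append(ref_knot)
        std.append(sigma_0)
        for _ in range(max_iters):
            # Sample all knots added so far, including earlier ones.
            samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                       for _ in range(n_samples)]
            elites = sorted(samples, key=cost)[:n_elite]
            # Refit the mean and standard deviation to the elites.
            for i in range(len(mean)):
                column = [e[i] for e in elites]
                mean[i] = sum(column) / n_elite
                std[i] = (sum((c - mean[i]) ** 2 for c in column) / n_elite) ** 0.5
            # Add the next knot only once the distribution has converged.
            if max(std) < sigma_min:
                break
    return mean

# Hypothetical stand-in for dynamic feasibility: the "feasible" knots differ
# from the imperfect reference, and the cost scores whatever prefix it gets.
target = [0.2, 0.3, 0.1, 0.4]

def tracking_cost(knots):
    return sum((k - t) ** 2 for k, t in zip(knots, target))

reference = [0.0, 0.1, 0.2, 0.3]
refined = incremental_cem(reference, tracking_cost)
```

Because earlier knots stay in the sampled vector, they can still be adjusted after later knots are added, which is one way to read the "global consistency" claim. Their spread has already shrunk by then, so each expansion mostly explores the newest knot.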

The implementation uses the Cross‑Entropy Method (CEM) to update the mean and covariance of the sampling distribution, with an elite‑sample fraction (ρ_e) and a keep‑elite proportion (ρ_k) to prevent premature collapse of the distribution. An exponentially weighted moving average (EWMA) with momentum parameters α_μ and α_Σ further stabilizes the update. The initial mean is set to the joint positions of the kinematic reference at each knot, and the initial standard deviation σ_0 is 0.25 rad; full covariance matrices are used to capture inter‑joint dependencies.
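A minimal sketch of one such CEM update step follows, under stated assumptions. The summary does not spell out the exact formulas, so this sketch reads ρ_e as the fraction of samples treated as elites and ρ_k as the fraction of those elites carried into the next batch. It also assumes that α_μ and α_Σ weight the new elite statistics against the previous estimate. The function names `select_elites` and `ewma_update` are hypothetical.

```python
def select_elites(samples, costs, rho_e, rho_k):
    """Return (elites, kept): lowest-cost samples and the subset to carry over."""
    order = sorted(range(len(samples)), key=costs.__getitem__)
    n_elite = max(1, round(rho_e * len(samples)))
    elites = [samples[i] for i in order[:n_elite]]
    n_keep = round(rho_k * n_elite)  # assumed meaning of the keep-elite proportion
    return elites, elites[:n_keep]

def ewma_update(mean, cov, elites, alpha_mu, alpha_sigma):
    """Blend elite statistics into the previous mean and full covariance."""
    d, n = len(mean), len(elites)
    elite_mean = [sum(e[i] for e in elites) / n for i in range(d)]
    # Full covariance of the elites, including off-diagonal terms.
    elite_cov = [[sum((e[i] - elite_mean[i]) * (e[j] - elite_mean[j])
                      for e in elites) / n
                  for j in range(d)] for i in range(d)]
    new_mean = [(1 - alpha_mu) * mean[i] + alpha_mu * elite_mean[i]
                for i in range(d)]
    new_cov = [[(1 - alpha_sigma) * cov[i][j] + alpha_sigma * elite_cov[i][j]
                for j in range(d)] for i in range(d)]
    return new_mean, new_cov

# Tiny worked example with two joints and four samples.
samples = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
costs = [3.0, 0.0, 1.0, 2.0]
elites, kept = select_elites(samples, costs, rho_e=0.5, rho_k=0.5)
new_mean, new_cov = ewma_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                                elites, alpha_mu=0.5, alpha_sigma=0.5)
```

With smoothing factors below 1, the covariance shrinks gradually rather than jumping to the elite spread in one step, which is the stabilizing effect the paragraph attributes to the moving average. The off-diagonal terms in `new_cov` are what a full covariance adds over a diagonal one.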

Evaluation is conducted in MuJoCo with the G1 humanoid robot on the OmniRetarget dataset, which contains hundreds of human‑object demonstrations (kicking, lifting, pushing, dragging). SBTO corrects missing contacts, penetrations, and discontinuities that plague pure IK retargets. Compared to a state‑of‑the‑art SBMPC baseline, SBTO achieves a 23 percentage‑point increase in retargeting success rate, especially in contact‑rich scenarios where SBMPC often fails due to its myopic horizon. The refined trajectories are then used as demonstration data for a reinforcement‑learning (RL) tracking policy. Policies trained on SBTO‑generated motions converge faster, exhibit smoother control signals, and transfer zero‑shot to a real humanoid robot with robust performance across multiple loco‑manipulation tasks.

Additional experiments demonstrate SBTO’s ability to generalize across varying object properties (mass, size, geometry) without modifying the cost function, highlighting the method’s robustness to environmental changes. Ablation studies on the σ_min threshold and covariance handling confirm that the incremental horizon strategy is essential for escaping local minima and achieving high‑quality solutions.

In summary, DynaRetarget introduces the first full‑horizon, sampling‑based trajectory optimizer for humanoid retargeting, enabling large‑scale synthetic dataset generation and reliable real‑world deployment of complex contact‑rich behaviors. The work bridges the gap between easy‑to‑collect human demonstrations and the stringent dynamic feasibility requirements of humanoid robots, offering a scalable solution for future research in whole‑body manipulation, multi‑agent cooperation, and real‑time online retargeting.

