Partial Motion Imitation for Learning Cart Pushing with Legged Manipulators

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Loco-manipulation is a key capability for legged robots to perform practical mobile manipulation tasks, such as transporting and pushing objects, in real-world environments. However, learning robust loco-manipulation skills remains challenging due to the difficulty of maintaining stable locomotion while simultaneously performing precise manipulation behaviors. This work proposes a partial imitation learning approach that transfers the locomotion style learned from a locomotion task to cart loco-manipulation. A robust locomotion policy is first trained with extensive domain and terrain randomization, and a loco-manipulation policy is then learned by imitating only lower-body motions using a partial adversarial motion prior. We conduct experiments demonstrating that the learned policy successfully pushes a cart along diverse trajectories in IsaacLab and transfers effectively to MuJoCo. We also compare our method to several baselines and show that the proposed approach achieves more stable and accurate loco-manipulation behaviors.


💡 Research Summary

This paper presents a learning framework that enables legged manipulators to perform cart-pushing loco-manipulation, a challenging task that requires maintaining stable locomotion while applying precise manipulation forces over extended periods. The core challenge is that coordinated whole-body behaviors for mobility and manipulation must be discovered jointly, which often leads to unstable or inefficient policies under naive reinforcement learning.

To address this, the authors propose a two-stage “Partial Motion Imitation” pipeline. In the first stage, a robust locomotion policy is trained in simulation with extensive domain and terrain randomization. This policy learns to track various velocity, heading, and end-effector commands while walking on rough terrain, resulting in a robust and cyclic walking style. A dataset of state transitions generated by this policy is collected for later use.
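The data-collection step at the end of this stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter names and ranges in `sample_domain_params`, the callables `policy`, `reset_fn`, and `step_fn`, the 24-dimensional state, and the toy dynamics in the usage lines are all hypothetical stand-ins for the simulator and the trained locomotion policy.

```python
import numpy as np


def sample_domain_params(rng):
    """Draw one set of randomized physics parameters (hypothetical names and ranges)."""
    return {
        "friction": rng.uniform(0.4, 1.2),
        "payload_kg": rng.uniform(0.0, 3.0),
        "terrain_roughness": rng.uniform(0.0, 0.1),
    }


def collect_transitions(policy, reset_fn, step_fn, num_episodes, horizon, rng):
    """Roll out `policy` and return arrays of consecutive states (s, s')."""
    states, next_states = [], []
    for _ in range(num_episodes):
        params = sample_domain_params(rng)  # fresh randomization per episode
        s = reset_fn(params, rng)
        for _ in range(horizon):
            s_next = step_fn(s, policy(s), params, rng)
            states.append(s)
            next_states.append(s_next)
            s = s_next
    return np.stack(states), np.stack(next_states)


# Toy stand-ins so the sketch runs without a simulator (assumptions only).
STATE_DIM = 24


def toy_reset(params, rng):
    return rng.normal(size=STATE_DIM)


def toy_step(s, a, params, rng):
    # Damped linear dynamics plus noise scaled by terrain roughness.
    noise = params["terrain_roughness"] * rng.normal(size=STATE_DIM)
    return 0.9 * s + 0.1 * a + noise


def toy_policy(s):
    return -s  # placeholder for the trained locomotion policy


rng = np.random.default_rng(0)
S, S_next = collect_transitions(
    toy_policy, toy_reset, toy_step, num_episodes=4, horizon=50, rng=rng
)
print(S.shape, S_next.shape)
```

Storing transitions as paired arrays `(S, S_next)` keeps the dataset in the form a transition-based discriminator consumes. In a real pipeline the toy functions would be replaced by simulator calls.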

The second and key stage learns the cart-pushing policy. Imitating the full-body motion of the reference policy would unnecessarily constrain the arm movements needed for manipulation, so the authors introduce a “Partial Adversarial Motion Prior (Partial AMP).” In this method, only a projected subset of the state, covering the base motion and the lower-body joint kinematics (hips, thighs, calves), is used for imitation. A discriminator is trained to tell projected state transitions drawn from the reference locomotion dataset apart from those generated by the new cart-pushing policy. The style reward from this discriminator encourages the policy to keep the stable lower-body locomotion style of the first-stage policy, while the task reward (for cart position tracking, contact maintenance, etc.) leaves the arm free to adapt to the manipulation task. This separation of concerns is the paper’s central insight.
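The projection and reward combination described above can be sketched in a few lines. The state layout below (6 base-motion entries, 12 leg-joint entries, 6 arm entries) is a hypothetical assumption chosen for illustration, as are the reward weights. The style reward uses the least-squares form from the original AMP formulation, max(0, 1 - 0.25 (d - 1)^2). The summary does not state which reward form this paper uses, so treat that choice as an assumption too.

```python
import numpy as np

# Hypothetical state layout (an assumption, not taken from the paper):
# [base motion (6) | leg joints (12) | arm joints (6)]
BASE = slice(0, 6)
LEGS = slice(6, 18)
ARM = slice(18, 24)  # deliberately excluded from imitation


def project_lower_body(state):
    """Keep base motion and leg-joint entries; drop the arm entries."""
    state = np.asarray(state, dtype=float)
    return np.concatenate([state[..., BASE], state[..., LEGS]], axis=-1)


def discriminator_input(s, s_next):
    """Projected transition (s, s') that the discriminator would score."""
    return np.concatenate(
        [project_lower_body(s), project_lower_body(s_next)], axis=-1
    )


def style_reward(d_score):
    """Least-squares AMP reward form: max(0, 1 - 0.25 * (d - 1)^2)."""
    d_score = np.asarray(d_score, dtype=float)
    return np.maximum(0.0, 1.0 - 0.25 * (d_score - 1.0) ** 2)


def total_reward(task_r, style_r, w_task=0.5, w_style=0.5):
    """Weighted sum of task and style terms (weights are placeholders)."""
    return w_task * task_r + w_style * style_r


# Two states that differ only in the arm joints yield the same
# discriminator input, so the arm is not constrained by the style reward.
s_a = np.zeros(24)
s_b = s_a.copy()
s_b[ARM] = 1.0
same = np.array_equal(discriminator_input(s_a, s_a),
                      discriminator_input(s_b, s_b))
print(same)
```

Because the arm entries never reach the discriminator, the style term cannot penalize arm motion. Only the task reward shapes it, which is the decoupling the summary describes.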

The proposed method is evaluated in the IsaacLab simulator on a platform consisting of a Unitree Go2 quadruped with a mounted WidowX arm, pushing a shopping cart. The policy is tested on tracking diverse trajectories, including straight lines and curves. The reported results show that the Partial AMP approach outperforms several baselines: a policy trained with only task rewards (no imitation), a policy using standard full-state AMP, and a hierarchical RL method. The Partial AMP policy achieves more stable locomotion, higher success rates, and more accurate path tracking. The paper also shows successful sim-to-sim transfer: the policy trained in IsaacLab performs effectively in the MuJoCo simulation environment without additional fine-tuning, which points to the robustness of the learned behavior. Overall, the results suggest that partial imitation of lower-body motion is an effective way to decouple and simplify the learning of complex loco-manipulation skills.

