A Closed-Form Geometric Retargeting Solver for Upper Body Humanoid Robot Teleoperation
Retargeting human motion to robot poses is a practical approach for teleoperating bimanual humanoid robot arms, but existing methods can be suboptimal and slow, often causing undesirable motion or latency. This stems from optimizing to match the robot end-effector to the human hand's position and orientation, which can also limit the robot's workspace to that of the human. Instead, this paper reframes retargeting as an orientation alignment problem, enabling a closed-form, geometric solution algorithm with an optimality guarantee. The key idea is to align a robot arm to a human's upper and lower arm orientations, as identified from shoulder, elbow, and wrist (SEW) keypoints; hence, the method is called SEW-Mimic. The method has fast inference (3 kHz) on standard commercial CPUs, leaving computational headroom for downstream applications; an example in this paper is a safety filter to avoid bimanual self-collision. The method suits most 7-degree-of-freedom robot arms and humanoids, and is agnostic to the input keypoint source. Experiments show that SEW-Mimic outperforms other retargeting methods in computation time and accuracy. A pilot user study suggests that the method improves teleoperation task success. Preliminary analysis indicates that data collected with SEW-Mimic improves policy learning because it is smoother. SEW-Mimic is also shown to be a drop-in way to accelerate full-body humanoid retargeting. Finally, hardware demonstrations illustrate SEW-Mimic's practicality. The results emphasize the utility of SEW-Mimic as a fundamental building block for bimanual robot manipulation and humanoid robot teleoperation.
💡 Research Summary
The paper introduces SEW‑Mimic, a closed‑form geometric retargeting algorithm designed for upper‑body teleoperation of humanoid robots. Traditional teleoperation pipelines map the human hand pose to the robot end‑effector pose using Jacobian‑based inverse kinematics or iterative optimization over keypoints. These approaches suffer from latency (often >0.5 s), singularities, and uncontrolled null‑space motion of the elbow, which can cause self‑collisions and limit the robot's reachable workspace.
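For contrast, a typical Jacobian-based pipeline repeats a damped least-squares update until the end-effector error shrinks, and the cost of those iterations is where the latency comes from. A hypothetical toy sketch on a planar 2-link arm (not one of the paper's robots, and not the authors' code):

```python
import numpy as np

def dls_ik_step(J, task_error, damping=0.05):
    """One damped least-squares update: dq = J^T (J J^T + lambda^2 I)^-1 e."""
    JJt = J @ J.T
    reg = (damping ** 2) * np.eye(JJt.shape[0])
    return J.T @ np.linalg.solve(JJt + reg, task_error)

def planar_jacobian(q, l1=1.0, l2=1.0):
    """Jacobian of the end-effector position of a toy planar 2-link arm."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def fk(q, l1=1.0, l2=1.0):
    """Forward kinematics: end-effector position for joint angles q."""
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

q = np.array([0.3, 0.5])
target = np.array([1.2, 0.8])
for _ in range(100):  # many iterations per query -> the latency a closed form avoids
    q = q + dls_ik_step(planar_jacobian(q), target - fk(q))
```

Each teleoperation command triggers such a loop, whereas a closed-form solver answers each query in a single pass.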
SEW‑Mimic reframes the problem as an orientation‑alignment task. It extracts three human keypoints—shoulder, elbow, and wrist (SEW)—and constructs two unit vectors representing the upper‑arm and lower‑arm directions. The same vectors are defined on the robot using its current joint configuration. The objective is to maximize the cosine similarity between corresponding human and robot vectors, i.e., to align the orientations of the two links rather than matching hand position. Because the vectors are normalized, the method is inherently scale‑independent and works across robots of different sizes.
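The construction above can be sketched in a few lines; the variable names and example coordinates are illustrative assumptions, not the authors' code:

```python
import numpy as np

def sew_directions(shoulder, elbow, wrist):
    """Unit vectors for the upper arm (shoulder->elbow) and lower arm (elbow->wrist)."""
    upper = elbow - shoulder
    lower = wrist - elbow
    return upper / np.linalg.norm(upper), lower / np.linalg.norm(lower)

def alignment_score(human_sew, robot_sew):
    """Sum of cosine similarities between corresponding link directions.
    The maximum is 2.0 (both links perfectly aligned); because the vectors
    are normalized, the score is independent of arm length."""
    hu, hl = sew_directions(*human_sew)
    ru, rl = sew_directions(*robot_sew)
    return float(hu @ ru + hl @ rl)

# Example: a robot arm half the human's size but with identical link
# directions scores the maximum, illustrating scale independence.
human = (np.array([0.0, 0.0, 0.0]),     # shoulder
         np.array([0.0, 0.0, -0.3]),    # elbow
         np.array([0.25, 0.0, -0.3]))   # wrist
robot = (np.array([1.0, 0.0, 0.0]),
         np.array([1.0, 0.0, -0.15]),
         np.array([1.125, 0.0, -0.15]))
```

The retargeting objective is then to choose robot joint angles that maximize this score, rather than to match the hand position.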
Mathematically, the problem decomposes into three sub‑problems: two planar 2‑DoF rotations for the upper‑arm and lower‑arm, and a spherical 3‑DoF rotation for the wrist‑to‑hand orientation. Each sub‑problem admits an analytical solution derived from elementary trigonometry and rotation‑matrix algebra. By solving these sub‑problems sequentially, the algorithm computes the full 7‑DoF joint angles in a single pass, guaranteeing a global optimum of the orientation error. No Jacobian, pseudo‑inverse, or iterative solver is required.
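Each sub-problem ultimately reduces to finding the rotation that carries one unit link direction onto another, which has a well-known closed-form answer. A minimal sketch using Rodrigues' formula, illustrative of the closed-form flavor rather than the paper's exact angle decomposition:

```python
import numpy as np

def align_rotation(a, b):
    """Closed-form minimal rotation R with R @ a == b for unit vectors a, b
    (Rodrigues' formula); degenerate only when a and b are antiparallel."""
    v = np.cross(a, b)          # rotation axis, scaled by sin(angle)
    c = float(a @ b)            # cos(angle)
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) / (1.0 + c)

# Example: rotate a rest direction (arm hanging along -z, an assumed
# convention) onto a desired human upper-arm direction, in one evaluation.
rest = np.array([0.0, 0.0, -1.0])
desired = np.array([1.0, 1.0, -1.0]) / np.sqrt(3.0)
R = align_rotation(rest, desired)
```

Reading joint angles off such a rotation for a given axis convention is again analytic, which is why no Jacobian or iterative solver appears anywhere in the pipeline.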
Implementation on commodity CPUs yields inference rates exceeding 3 kHz, far surpassing optimization-based methods, whose latency can reach 0.5–1 s per query. The authors also embed SEW-Mimic into a lightweight safety filter that monitors the relative geometry of the two arms. When a potential self-collision is detected (e.g., when the upper arms are about to intersect), the filter minimally adjusts the joint angles to keep the arms apart, all without invoking a full physics engine.
Experimental evaluation spans three 7‑DoF platforms (Kinova Gen3, Rainbow RB‑Y1, Unitree G1). Compared to Jacobian‑IK, pseudo‑inverse, and a state‑of‑the‑art optimization baseline, SEW‑Mimic reduces average orientation error by 30–45 % and achieves a 5–10× speedup. A pilot user study with ten participants performing five bimanual tasks shows an 18 % increase in task success rate and subjective reports of smoother, lag‑free control. Moreover, data collected with SEW‑Mimic leads to faster convergence when used to train reinforcement‑learning policies for autonomous manipulation, halving the number of training iterations required.
The method also serves as a drop‑in accelerator for full‑body retargeting pipelines. By providing fast, accurate upper‑body joint estimates, downstream modules that handle legs and torso can operate at higher frequencies. Hardware demonstrations illustrate the system controlling real robots to mimic a human’s arm motion, grasp objects, and perform coordinated bimanual manipulation.
Limitations include reduced fidelity for highly non‑linear elbow motions (e.g., extreme forearm twists) and the current focus on the upper body only. The authors acknowledge that extending the geometric formulation to legs and spine will require additional constraints. Nevertheless, the closed‑form nature of SEW‑Mimic makes it a versatile building block for future high‑performance teleoperation, collaborative robotics, and data‑driven policy learning.