Reliable odometry for legged robots without cameras or LiDAR remains challenging due to IMU drift and noisy joint velocity sensing. This paper presents a purely proprioceptive state estimator that uses only IMU and motor measurements to jointly estimate body pose and velocity, with a unified formulation applicable to biped, quadruped, and wheel-legged robots. The key idea is to treat each contacting leg as a kinematic anchor: joint-torque-based foot wrench estimation selects reliable contacts, and the corresponding footfall positions provide intermittent world-frame constraints that suppress long-term drift. To prevent elevation drift during extended traversal, we introduce a lightweight height clustering and time-decay correction that snaps newly recorded footfall heights to previously observed support planes. To improve foot velocity observations under encoder quantization, we apply an inverse-kinematics cubature Kalman filter that directly filters foot-end velocities from joint angles and velocities. The implementation further mitigates yaw drift through multi-contact geometric consistency and degrades gracefully to a kinematics-derived heading reference when IMU yaw constraints are unavailable or unreliable. We evaluate the method on four quadruped platforms (three Astrall robots and a Unitree Go2 EDU) using closed-loop trajectories. On Astrall point-foot robot~A, a $\sim$200\,m horizontal loop and a $\sim$15\,m vertical loop return with 0.1638\,m and 0.219\,m error, respectively; on wheel-legged robot~B, the corresponding errors are 0.2264\,m and 0.199\,m. On wheel-legged robot~C, a $\sim$700\,m horizontal loop yields 7.68\,m error and a $\sim$20\,m vertical loop yields 0.540\,m error. The Unitree Go2 EDU closes a $\sim$120\,m horizontal loop with 2.2138\,m error and an $\sim$8\,m vertical loop with less than 0.1\,m vertical error. Code: https://github.com/ShineMinxing/Ros2Go2Estimator.git
Keywords: quadruped robots, proprioceptive odometry, sensor fusion, cubature Kalman estimator
Accurate estimation of a legged robot’s pose and velocity is a prerequisite for motion planning and stable locomotion control [1], [2]. In many deployed systems, exteroceptive sensors such as depth cameras and LiDAR are coupled with SLAM to provide global state feedback [3]–[5]. However, these sensors can become unreliable in the presence of challenging illumination, airborne particulates, or vegetation clutter, and their computational and installation requirements may be undesirable for lightweight or cost-sensitive platforms. This motivates proprioceptive state estimation methods that remain effective when no vision or LiDAR feedback is available.
Most legged robots provide rich proprioceptive measurements, including an inertial measurement unit (IMU) and motor encoder readings; many platforms additionally provide estimated joint torques or contact-related signals. Early approaches relied heavily on IMU acceleration integration (with frame transformations and additional dynamic cues) [6]–[8]. In practice, double integration is extremely sensitive to bias, timing mismatch, and discrete sampling noise, leading to rapid drift. To mitigate this, modern estimators incorporate leg kinematics to introduce velocity constraints from stance feet: when a foot is stationary relative to the ground, the body velocity can be inferred from the relative foot-body motion. A key difficulty is determining which legs provide valid constraints, especially under imperfect contact conditions. Probabilistic contact estimation and impact detection have been used to gate these measurements and detect slippage [9]. Building on contact-aware kinematic constraints, invariant EKF formulations have been explored for legged state estimation [10], and learning-based measurement models have further improved contact classification under difficult terrains [11]. Hybrid systems that incorporate cameras can provide stronger observability [12], but they reintroduce the fragility of exteroception that motivates proprioceptive odometry in the first place. Moreover, even when velocity estimation is improved, position drift can still accumulate over time, and foot velocity computed directly from noisy joint velocities may exhibit spikes caused by encoder quantization and numerical differentiation.

This work targets robust, purely proprioceptive odometry for biped, quadruped, and wheel-legged robots using only IMU and motor measurements. The key idea is to anchor the body state to discrete footfall events: we record the world-frame footfall position at touchdown and treat it as an intermittent constraint during the stance phase. In our implementation, stance selection is enabled by joint-torque-based foot wrench estimation, while wheel-legged locomotion is handled by compensating the effective wheel rotation during ground contact. To prevent long-term elevation drift, we introduce a lightweight height clustering and time-decay correction that snaps newly recorded footfall heights to previously observed support planes within a configurable resolution. To address spiky velocity observations from joint encoders, we further employ a cubature Kalman filter (CKF) under an inverse-kinematics observation model to estimate and smooth foot-end velocities directly from joint angles and angular velocities.
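As a concrete illustration of the footfall-anchoring idea, consider the minimal Python sketch below (all names are hypothetical and not taken from the released code): the world-frame foot position is recorded at touchdown, and while the foot remains planted, the body position is recovered by subtracting the rotated body-to-foot forward-kinematics vector from that anchor.

```python
import numpy as np

def contact_anchor_update(p_body, R_wb, q_leg, leg_id, in_contact, anchors, fk_foot):
    """One contact-anchored position update for a single leg (illustrative).

    p_body    : current world-frame body position estimate, shape (3,)
    R_wb      : 3x3 world-from-body rotation from the IMU orientation filter
    q_leg     : measured joint angles of this leg
    in_contact: stance decision, e.g. from joint-torque-based wrench estimation
    anchors   : dict leg_id -> world-frame footfall position recorded at touchdown
    fk_foot   : forward kinematics returning the body-frame foot position
    Returns the body position implied by this leg's anchor, or None in swing.
    """
    p_foot_b = fk_foot(q_leg)                       # body-frame body-to-foot vector
    if not in_contact:
        anchors.pop(leg_id, None)                   # swing phase: drop the stale anchor
        return None
    if leg_id not in anchors:
        anchors[leg_id] = p_body + R_wb @ p_foot_b  # touchdown: record the footfall
    # A planted foot is a fixed world point, so it pins the body position:
    return anchors[leg_id] - R_wb @ p_foot_b

# Toy one-joint leg: at touchdown the anchor is recorded, so the implied body
# position equals the current estimate; subsequent calls correct drift.
fk = lambda q: np.array([0.2 * np.cos(q[0]), 0.0, -0.2 * np.sin(q[0])])
print(contact_anchor_update(np.zeros(3), np.eye(3), np.array([0.3]), 0, True, {}, fk))
```

With several legs in stance, each planted foot yields such an estimate, and the estimates can be fused (e.g., averaged or weighted by contact confidence).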
Compared with linear Kalman filtering and EKF-style approximations [13]–[18] and with alternatives such as the invariant EKF and UKF [10], [11], [19]–[22], the CKF provides a numerically stable, Jacobian-free nonlinear filtering rule and avoids UKF-style scaling parameters, which is convenient for inverse-kinematics measurements [23]–[25]. Finally, we correct long-horizon yaw drift by enforcing multi-contact geometric consistency between body-frame kinematics and world-frame contact records, which also provides a fallback yaw reference when IMU yaw constraints are degraded or disabled.
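For reference, the third-degree spherical-radial cubature rule at the core of the CKF evaluates the nonlinearity at $2n$ equally weighted points, with no Jacobians and no UKF-style tuning parameters. The sketch below computes the measurement-update moments for a generic observation function h; it is the textbook cubature step, not the paper's specific inverse-kinematics observation model.

```python
import numpy as np

def ckf_measurement_moments(mu, P, h):
    """Third-degree spherical-radial cubature: propagate N(mu, P) through a
    nonlinear measurement h using 2n points with equal weights 1/(2n)."""
    n = mu.size
    L = np.linalg.cholesky(P)                             # P = L L^T
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])  # unit cubature points
    X = mu[:, None] + L @ xi                              # 2n cubature points
    Z = np.array([h(X[:, i]) for i in range(2 * n)]).T    # propagated points
    z_mean = Z.mean(axis=1)                               # predicted measurement
    dZ, dX = Z - z_mean[:, None], X - mu[:, None]
    Pzz = dZ @ dZ.T / (2 * n)                             # innovation cov. (add R)
    Pxz = dX @ dZ.T / (2 * n)                             # state-measurement cross-cov.
    return z_mean, Pzz, Pxz
```

The state update then follows the standard Kalman form, $K = P_{xz}(P_{zz}+R)^{-1}$; unlike the UKF, there are no $\alpha$, $\beta$, $\kappa$ parameters to tune, and the equal positive weights avoid the negative-weight covariance issues the UKF can exhibit in high dimensions.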
Contributions. This paper makes the following contributions:
• Unified kinematics-based proprioceptive odometry across morphologies. We formulate a contact-anchored odometry framework that relies only on IMU and motor measurements (joint angles, velocities, and estimated torques). The framework is morphology-agnostic: at each time step, it forms constraints only from the currently contacting end-effectors (feet or wheels), so the same estimator applies to bipeds, quadrupeds, and wheel-legged robots with different numbers of contacts over time.
In our implementation, stance is selected via joint-torque-based foot wrench estimation, and wheel-legged platforms are supported through effective wheel rotation compensation during contact.
• Footfall recording with height correction for accurate elevation. We record world-frame contact (footfall) positions at touchdown and use them as intermittent constraints during stance to suppress long-horizon drift. To prevent elevation drift over extended traversal, we snap newly recorded footfall heights to previously observed support planes within a configurable resolution, combined with a time-decay correction (a minimal sketch of this snapping rule is given below).
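A minimal sketch of the height-snapping rule referenced above, under the assumption that support planes are stored as (height, weight) records whose weights decay over time; the resolution and decay constants here are placeholders, not values from the paper:

```python
def snap_footfall_height(z_new, planes, resolution=0.03, decay=0.99, min_weight=0.05):
    """Snap a newly recorded footfall height to a known support plane.

    Illustrative only: `planes` holds [height, weight] records; weights decay
    each call so stale planes stop attracting footfalls, and any plane within
    `resolution` of the new height captures it."""
    for plane in planes:
        plane[1] *= decay                       # time-decay of plane confidence
    planes[:] = [p for p in planes if p[1] > min_weight]
    for plane in planes:
        if abs(z_new - plane[0]) < resolution:
            plane[1] += 1.0                     # reinforce the matched plane
            return plane[0]                     # snap: reuse the stored height
    planes.append([z_new, 1.0])                 # unseen elevation: new plane
    return z_new
```

Snapping repeated footfalls on the same floor to a shared plane height prevents small per-step elevation errors from accumulating over long traversals.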