ImplicitRDP: An End-to-End Visual-Force Diffusion Policy with Structural Slow-Fast Learning

Reading time: 1 minute
...

📝 Original Info

  • Title: ImplicitRDP: An End-to-End Visual-Force Diffusion Policy with Structural Slow-Fast Learning
  • ArXiv ID: 2512.10946
  • Date: 2025-12-11
  • Authors: Wendi Chen, Han Xue, Yi Wang, Fangyuan Zhou, Jun Lv, Yang Jin, Shirun Tang, Chuan Wen, Cewu Lu

📝 Abstract

Human-level contact-rich manipulation relies on the distinct roles of two key modalities: vision provides spatially rich but temporally slow global context, while force sensing captures rapid, high-frequency local contact dynamics. Integrating these signals is challenging due to their fundamental frequency and informational disparities. In this work, we propose Implic-itRDP, a unified end-to-end visual-force diffusion policy that integrates visual planning and reactive force control within a single network. We introduce Structural Slow-Fast Learning, a mechanism utilizing causal attention to simultaneously process asynchronous visual and force tokens, allowing the policy to perform closed-loop adjustments at the force frequency while maintaining the temporal coherence of action chunks. Furthermore, to mitigate modality collapse where end-to-end models fail to adjust the weights across different modalities, we propose Virtual-target-based Representation Regularization. This auxiliary objective maps force feedback into the same space as the action, providing a stronger, physics-grounded learning signal than raw force prediction. Extensive experiments on contact-rich tas...

📄 Full Content

...(본문 내용이 길어 생략되었습니다. 사이트에서 전문을 확인해 주세요.)

Start searching

Enter keywords to search articles

↑↓
ESC
⌘K Shortcut