Sim-to-real transfer for contact-rich manipulation remains challenging because contact dynamics differ between simulation and the real world. Existing methods often rely on costly real-world data or apply blind compliance through fixed controllers; we instead propose a framework that transfers expert-designed controller logic. Inspired by the success of privileged supervision in kinematic tasks, we employ a human-designed, finite-state-machine-based position/force controller in simulation to provide privileged guidance. The resulting policy is trained to predict the end-effector pose, the contact state, and, crucially, the desired contact force direction. Unlike force magnitudes, which are highly sensitive to simulation inaccuracies, force directions encode high-level task geometry and remain robust across the sim-to-real gap. At deployment, these predictions configure a force-aware admittance controller. By combining the policy's directional intent with a constant, manually tuned force magnitude, the system generates adaptive, task-aligned compliance. This tuning is lightweight, typically requiring only a single scalar per contact state. We provide a theoretical analysis of stability and robustness to disturbances. Experiments on four real-world tasks (microwave opening, peg-in-hole, whiteboard wiping, and door opening) demonstrate that our approach significantly outperforms strong baselines in both success rate and robustness.
In simulation, we implement an expert finite state machine (FSM) that uses privileged state to generate diverse demonstrations. We identify force direction and contact state as dynamics-invariant quantities that encode task intent, and we train the policy on simulation data to predict these signals alongside poses. In the real world, the policy outputs configure a force-aware admittance controller, which combines the predicted force direction with a manually specified magnitude to achieve adaptive, task-aligned compliance.
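To make the deployment step concrete, the following is a minimal sketch of how a predicted force direction and a manually chosen scalar magnitude could drive an admittance law. It is not the controller from the paper: the second-order form, the choice to drop the position spring along the predicted direction during contact, all parameter values, and the names `AdmittanceParams`, `admittance_step`, and `simulate_press` are assumptions made for illustration. Orientation is omitted, and the environment is modeled as a linear spring.

```python
import math
from dataclasses import dataclass


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v, eps=1e-9):
    """Return v scaled to unit length, or the zero vector if v is near zero."""
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v) if n > eps else (0.0, 0.0, 0.0)


@dataclass
class AdmittanceParams:
    # All values are hypothetical and chosen only for this sketch.
    mass: float = 1.0             # virtual mass [kg]
    damping: float = 60.0         # virtual damping [N s/m]
    stiffness: float = 200.0      # virtual stiffness [N/m]
    force_magnitude: float = 2.0  # manually tuned scalar [N]


def admittance_step(x, v, x_ref, f_ext, direction, contact, p, dt):
    """One semi-implicit Euler step of an assumed admittance law.

    x, v      : current commanded position and velocity
    x_ref     : pose target predicted by the policy (position only here)
    f_ext     : measured force exerted by the environment on the end-effector
    direction : force direction predicted by the policy
    contact   : contact state predicted by the policy
    """
    # Without contact, no force is requested and the spring acts on all axes.
    d = normalize(direction) if contact else (0.0, 0.0, 0.0)
    err = tuple(xi - ri for xi, ri in zip(x, x_ref))
    # Assumption: remove the position spring along d so force is tracked there.
    along = dot(err, d)
    err_perp = tuple(e - along * di for e, di in zip(err, d))
    f_des = tuple(p.force_magnitude * di for di in d)
    acc = tuple(
        (fe + fd - p.damping * vi - p.stiffness * ep) / p.mass
        for fe, fd, vi, ep in zip(f_ext, f_des, v, err_perp)
    )
    v_new = tuple(vi + ai * dt for vi, ai in zip(v, acc))
    x_new = tuple(xi + vi * dt for xi, vi in zip(x, v_new))
    return x_new, v_new


def simulate_press(steps=3000, dt=0.001, k_env=1000.0):
    """Press down on a spring-like surface at z = 0; return final contact force."""
    p = AdmittanceParams()
    x, v = (0.0, 0.0, 0.01), (0.0, 0.0, 0.0)
    x_ref = x
    for _ in range(steps):
        f_ext = (0.0, 0.0, k_env * max(0.0, -x[2]))  # surface pushes back up
        x, v = admittance_step(x, v, x_ref, f_ext, (0.0, 0.0, -1.0), True, p, dt)
    return k_env * max(0.0, -x[2])
```

In this toy setting, the steady-state contact force is set by `force_magnitude` alone, while the direction decides along which axis the controller yields. This mirrors the idea of pairing a predicted direction with a single tuned scalar per contact state.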
We leverage privileged states in simulation to build an expert finite state machine (FSM). The FSM switches behavior based on the contact state. During free motion, it simply tracks object-centric key poses. During contact interaction, it switches to task-specific rules that generate the targets and adopts contact-aware pose tracking. This allows us to automatically generate large-scale, high-quality demonstrations across diverse tasks.
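The two-mode switching described above can be sketched as a small state machine. This is an illustrative outline under stated assumptions, not the authors' implementation: the names `PrivilegedState`, `ExpertCommand`, `expert_fsm_step`, and `wiping_rule` are hypothetical, poses are reduced to 3D positions, and the wiping rule (slide along +x while pressing along -z) is invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple

Vec3 = Tuple[float, float, float]


class Phase(Enum):
    FREE_MOTION = 0
    CONTACT_INTERACTION = 1


@dataclass
class PrivilegedState:
    # Quantities assumed to be readable from the simulator (hypothetical fields).
    in_contact: bool
    key_pose: Vec3      # object-centric key pose (position only in this sketch)
    ee_position: Vec3   # current end-effector position


@dataclass
class ExpertCommand:
    phase: Phase
    target_position: Vec3
    contact_state: int      # label the policy would learn to predict
    force_direction: Vec3   # label the policy would learn to predict


# A task-specific rule maps the privileged state to (target, force direction).
ContactRule = Callable[[PrivilegedState], Tuple[Vec3, Vec3]]


def expert_fsm_step(state: PrivilegedState, contact_rule: ContactRule) -> ExpertCommand:
    """Select the expert behavior from the privileged contact state."""
    if not state.in_contact:
        # Free motion: track the object-centric key pose, no force requested.
        return ExpertCommand(Phase.FREE_MOTION, state.key_pose, 0, (0.0, 0.0, 0.0))
    # Contact interaction: defer to the task-specific rule.
    target, direction = contact_rule(state)
    return ExpertCommand(Phase.CONTACT_INTERACTION, target, 1, direction)


def wiping_rule(state: PrivilegedState) -> Tuple[Vec3, Vec3]:
    """Hypothetical rule: slide 1 cm along +x while pressing along -z."""
    x, y, z = state.ee_position
    return (x + 0.01, y, z), (0.0, 0.0, -1.0)
```

Calling `expert_fsm_step` at every simulation step and logging the returned `ExpertCommand` would give one demonstration trajectory with pose, contact-state, and force-direction labels, which is the kind of supervision the policy is described as learning from.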
Red, green, and blue lines indicate X, Y, and Z, respectively.
Yellow regions indicate time steps where our predicted contact state = 1.
Red regions indicate time steps with a disturbance applied.
Green regions indicate time steps where a safety stop is triggered.
Purple regions indicate time steps where the microwave door lock is released.
L-A / M-A / H-A: isotropic admittance controller with low / medium / high stiffness
[Video comparisons. Ours vs. E2VLA+M-A and E2VLA+L-A. Ours (insertion depth = 25 mm, target normal force magnitude = 2 N) vs. E2VLA+H-A (insertion depths of 10 mm and 20 mm where reported). Ours (target normal force magnitude = 4 N) vs. E2VLA+H-A. Ours vs. $\pi_0$+M-A and $\pi_0$+L-A.]
@misc{yang2026directionmatterslearningforce,
title={Direction Matters: Learning Force Direction Enables Sim-to-Real Contact-Rich Manipulation},
author={Yifei Yang and Anzhe Chen and Zhenjie Zhu and Kechun Xu and Yunxuan Mao and Yufei Wei and Lu Chen and Rong Xiong and Yue Wang},
year={2026},
eprint={2602.14174},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2602.14174},
}