Latest writing
Inside Fe0: What Cross-Embodiment Data Teaches an Embodied Foundation Model
A close look at how heterogeneous robot and human data helps an embodied foundation model generalize across visual, semantic, relational, planning, and control axes.
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling
A unified physical language that maps heterogeneous human and robot motion into shared action primitives.
DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA
Latent world modeling that decouples high-level intent from low-level actions in end-to-end vision-language-action policies.