News, 23 June 2026
VLA-JEPA: a latent world model for robots, not pixel prediction
A Chinese team from USTC, Zhongguancun Academy, SJTU, and Eastern Institute of Technology Ningbo built VLA-JEPA, a JEPA-style pretraining framework for vision-language-action (VLA) models that learns world dynamics in latent space instead of predicting pixels. It reached 97.2% on LIBERO and 78.1% on the out-of-distribution (OOD) benchmark LIBERO-Plus. The team also showed that just 13 trajectories suffice for simple assembly tasks.
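To make the difference between latent prediction and pixel prediction concrete, here is a minimal toy sketch in plain Python. It is not the VLA-JEPA architecture: the linear `encoder`, `predictor`, and `decoder`, the sizes, and the random data are all hypothetical stand-ins chosen for illustration. It only shows where each objective measures its error.

```python
import random

def linear(weights, x):
    """Apply a dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
PIXELS, LATENT, ACTION = 64, 8, 4  # hypothetical toy sizes

encoder = random_matrix(LATENT, PIXELS, rng)             # observation -> latent
predictor = random_matrix(LATENT, LATENT + ACTION, rng)  # (latent, action) -> next latent
decoder = random_matrix(PIXELS, LATENT, rng)             # used only by the pixel baseline

def latent_prediction_loss(obs, action, next_obs):
    """JEPA-style objective: the error is measured between latents."""
    z = linear(encoder, obs)
    z_pred = linear(predictor, z + action)  # list concatenation of latent and action
    z_target = linear(encoder, next_obs)    # in training this target would carry no gradient
    return mse(z_pred, z_target)

def pixel_prediction_loss(obs, action, next_obs):
    """Baseline objective: decode the prediction and compare raw pixels."""
    z = linear(encoder, obs)
    z_pred = linear(predictor, z + action)
    return mse(linear(decoder, z_pred), next_obs)

obs = [rng.random() for _ in range(PIXELS)]
action = [rng.uniform(-1.0, 1.0) for _ in range(ACTION)]
next_obs = [rng.random() for _ in range(PIXELS)]

latent_loss = latent_prediction_loss(obs, action, next_obs)
pixel_loss = pixel_prediction_loss(obs, action, next_obs)
print(f"latent loss: {latent_loss:.4f}, pixel loss: {pixel_loss:.4f}")
```

The latent objective never reconstructs an image, so the model is not asked to account for every pixel of the next frame. The pixel baseline needs a decoder and is penalized for every pixel it gets wrong, including ones irrelevant to the task.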