Mind-0

MindOn's universal robot foundation model — a single VLA driving heterogeneous platforms (humanoids, dual-arm rigs), trained exclusively on human-centric data.
🔬 Research · 🔬 Research only · Robotics foundation model · Vision-Language-Action model
Release date
18 June 2026
Deployment: 📱 On-device

Overview

Mind-0 is an embodied-AI foundation model for robotics built by MindOne Robotics (MindOn), a startup based in Shenzhen, China. It is a Vision-Language-Action (VLA) model designed as a single "mind" that drives heterogeneous hardware platforms, from Unitree G1 humanoids to stationary dual-arm rigs. Its core thesis is that, instead of training a separate model per platform on expensive teleoperation data, a single model can be trained on human-centric data (whole-body motion capture, egocentric cameras, handheld devices) and generalize across embodiments.

Two-layer architecture

Mind-0 decouples intelligence from embodiment. The high-level layer handles scene understanding, task reasoning, and behavior generation. The low-level Whole-Body Action Foundation Model — trained on tens of thousands of hours of motion-capture data — translates intentions into physical motion respecting each robot's dynamics, achieving sub-3 cm end-effector tracking accuracy while maintaining global motion coherence and balance.
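
As a rough illustration of the decoupling described above, the following Python sketch separates an embodiment-agnostic intent from a per-robot execution step. Everything in it is hypothetical: the `Intent` and `Embodiment` types, the `high_level` and `low_level` functions, and the reach values are illustrative stand-ins, not MindOn's interfaces or Unitree specifications.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Embodiment-agnostic goal from the high-level layer (hypothetical type)."""
    skill: str
    target_xyz: tuple[float, float, float]  # metres, robot base frame


@dataclass(frozen=True)
class Embodiment:
    """Minimal per-robot description; max_reach_m is an assumed parameter."""
    name: str
    max_reach_m: float


def high_level(instruction: str) -> Intent:
    # Stand-in for scene understanding and task reasoning: a fixed lookup table.
    known = {"pick the box": Intent("reach", (0.9, 0.0, 0.3))}
    return known[instruction]


def low_level(intent: Intent, body: Embodiment) -> tuple[float, ...]:
    # Stand-in for the whole-body action model: pull the target back onto
    # the sphere this robot can reach, leaving nearer targets unchanged.
    dist = math.dist((0.0, 0.0, 0.0), intent.target_xyz)
    scale = min(1.0, body.max_reach_m / dist) if dist > 0 else 1.0
    return tuple(c * scale for c in intent.target_xyz)


intent = high_level("pick the box")  # computed once, shared by every robot
humanoid = Embodiment("humanoid", max_reach_m=0.6)  # assumed value
rig = Embodiment("dual_arm_rig", max_reach_m=1.2)   # assumed value
humanoid_target = low_level(intent, humanoid)
rig_target = low_level(intent, rig)
```

The point of the sketch is only the interface boundary: the same `intent` object is passed to both robots, and all embodiment-specific adjustment happens inside `low_level`.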

Cross-Embodiment Data Pipeline

The cross-embodiment pipeline converts large-scale human demonstrations into action representations executable by different robots, effectively transferring human dexterity to hardware with fundamentally different kinematics, dynamics, and workspaces.
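
The source does not describe how the conversion is implemented. As a minimal sketch of the general idea, the Python below retargets a human wrist path to a robot by scaling with the arm-length ratio and clamping to the robot's workspace. The function name `retarget_path`, the arm lengths, and the workspace bounds are all assumptions for illustration; a real pipeline would also have to handle inverse kinematics, dynamics, and contact.

```python
def retarget_path(human_path, human_arm_m, robot_arm_m, workspace):
    """Map shoulder-relative human wrist positions into a robot's workspace.

    Each point is scaled by the arm-length ratio, then clamped per axis to
    the (low, high) bounds in `workspace`. Purely geometric; no IK.
    """
    scale = robot_arm_m / human_arm_m
    robot_path = []
    for point in human_path:
        robot_path.append(tuple(
            min(max(coord * scale, lo), hi)
            for coord, (lo, hi) in zip(point, workspace)
        ))
    return robot_path


# Illustrative numbers only (metres).
human_path = [(0.6, 0.0, 0.2), (0.6, 0.4, 0.2)]
workspace = [(0.0, 0.5), (-0.15, 0.15), (0.0, 0.5)]  # x, y, z bounds
robot_path = retarget_path(human_path, human_arm_m=0.6, robot_arm_m=0.3,
                           workspace=workspace)
```

Here the second point's lateral coordinate falls outside the assumed workspace after scaling and is clamped to the boundary, which is the simplest possible way to respect a smaller workspace.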

Real-World Execution Compensation Model

A lightweight compensation model trained on a small amount of real deployment data closes the sim-to-real gap. It corrects tracking errors, dynamics mismatch, and embodiment-specific deviations, reportedly achieving sub-1 cm manipulation accuracy on the Unitree G1 — a platform typically known for limited arm precision.
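
The form of the compensation model is not published. To make the idea of a correction fitted from a small amount of deployment data concrete, here is a deliberately simple Python sketch that estimates a constant per-axis offset from a few (commanded, measured) pairs and subtracts it from future commands. The names `fit_offset`, `compensate`, and `simulated_robot`, and the bias values, are hypothetical; a learned model would capture far richer, state-dependent errors.

```python
from statistics import fmean


def fit_offset(commanded, measured):
    """Mean per-axis error (measured minus commanded) over the samples."""
    axes = range(len(commanded[0]))
    return tuple(
        fmean(m[i] - c[i] for c, m in zip(commanded, measured)) for i in axes
    )


def compensate(target, offset):
    """Pre-correct a target so the biased robot lands closer to it."""
    return tuple(t - o for t, o in zip(target, offset))


def simulated_robot(command, bias=(0.02, -0.01, 0.0)):
    """Toy plant with a made-up constant bias in metres (assumption)."""
    return tuple(c + b for c, b in zip(command, bias))


# A handful of "real deployment" samples from the toy plant.
commanded = [(0.3, 0.0, 0.1), (0.2, 0.1, 0.2), (0.4, -0.1, 0.15)]
measured = [simulated_robot(c) for c in commanded]
offset = fit_offset(commanded, measured)

target = (0.25, 0.05, 0.12)
raw_error = max(abs(a - t) for a, t in zip(simulated_robot(target), target))
fixed = simulated_robot(compensate(target, offset))
fixed_error = max(abs(a - t) for a, t in zip(fixed, target))
```

In this toy setting the error is a pure constant, so the fitted offset removes it almost entirely; the accuracy figures quoted above come from the source, not from this sketch.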

Hierarchical Coordination Reasoning

Human data is inherently delay-free, while robots suffer from perception and control latency. Mind-0 addresses this with a hierarchical reasoning loop — the high-level policy continuously monitors low-level feedback and adaptively decides when and how to invoke specific skills, rather than directly imitating human demonstrations.
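
One way to picture such a feedback-driven loop is a high-level step that picks the next skill from low-level feedback rather than from a fixed timeline. The Python sketch below is an assumption-laden illustration: `Feedback`, `next_skill`, the `"recover"` skill, and the `max_error_cm` threshold are hypothetical names and placeholder values, not details from Mind-0.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Low-level status reported back to the high-level policy (hypothetical)."""
    skill_done: bool
    tracking_error_cm: float


def next_skill(plan, index, fb, max_error_cm=3.0):
    """Choose the next skill from feedback instead of a fixed schedule.

    Returns (skill_name, new_index). skill_name is None once the plan is
    finished. max_error_cm is an arbitrary placeholder threshold.
    """
    if fb.tracking_error_cm > max_error_cm:
        return "recover", index  # stabilise first, retry the same step later
    if fb.skill_done:
        index += 1               # advance only when the robot reports completion
    if index >= len(plan):
        return None, index
    return plan[index], index


plan = ["reach", "grasp", "place"]
still_reaching = next_skill(plan, 0, Feedback(False, 0.5))
reach_finished = next_skill(plan, 0, Feedback(True, 0.5))
drifted = next_skill(plan, 0, Feedback(False, 5.0))
plan_finished = next_skill(plan, 2, Feedback(True, 0.2))
```

Because progress is gated on `skill_done` rather than elapsed time, perception or control latency only delays the next decision instead of desynchronising the robot from a demonstration's timing.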

Public demonstrations

Mind-0's first viral demo (November 2025) showed a Unitree G1 autonomously performing complex household chores with no speed-ups and no teleoperation. The second (18 June 2026) showcased a heterogeneous fleet of two Unitree G1 humanoids and two stationary dual-arm rigs running an end-to-end logistics workflow (shelf picking, transport, sorting, packing, tape sealing), with all four robots driven by a single Mind-0 model.

Classification
Robotics foundation model · Vision-Language-Action model
Access & deployment
On-device
Weights: Closed
Key parameters
📥 Input: robot sensors, robot state data, image, video
Robotics
Robot manipulation · Bimanual manipulation · Dexterous manipulation · Robot control · Robot navigation · Motion planning · Scene understanding · Embodied task planning

Technical specification

License
Proprietary (closed)
Hardware requirements
Deployed on commercial Unitree G1 humanoids and stationary dual-arm rigs (embodiment-agnostic architecture).
Modalities
⬇ Input
robot_sensors · robot_state_data · image · video
⬆ Output
robot_actions · robot_commands · motion_trajectories · manipulator_control

Capabilities and applications

Native model capabilities
Cross-embodiment transfer
The ability of a single model to control robots with different morphologies (humanoids, dual-arm rigs, mobile platforms) without training a separate model per platform. Intelligence is decoupled from embodiment, so the same policy runs on hardware with different kinematics and dynamics.
Category: robotics
Vision-language-action grounding
The ability of a VLA model to ground visual perception and a language instruction into a concrete physical robot action. The model understands the scene and intent, then generates an executable action sequence, closing the loop from observation to motion.
Category: robotics
Planning
Forming and executing action plans for complex tasks.
Category: planning
Reasoning
The model's ability to reason logically and solve complex problems.
Category: reasoning
Multimodal understanding
Category: multimodal

Deployment and security

🤖 Related robots