GO-1 Air
GO-1 Air is the lightweight variant of the GO-1 Vision-Language-Action foundation model from AgiBot/OpenDriveLab, released without the Latent Planner. It has 3B parameters, is built on InternVL 2.5-2B, and was pretrained on AgiBot World Beta. License: CC BY-NC-SA 4.0.
✓ Active | Robotics foundation model | Vision-Language-Action model | 📁 GO-1 (Genie Operator-1)
Parameters: 3B
Release date: 19 September 2025
Access: open weights, open source
Deployment: 💻 Local, 📱 On-device, ☁ Cloud

Overview

GO-1 Air is the variant of the GO-1 (Genie Operator-1) Vision-Language-Action foundation model that omits the Latent Planner component. It was open-sourced on 19 September 2025 by the AgiBot-World team (OpenDriveLab + AgiBot). Compared with the full GO-1, GO-1 Air is smaller (3B parameters) and faster at inference, at the cost of dropping the latent planning layer.

Architecture

GO-1 Air is built on the InternVL 2.5-2B Vision-Language Model (OpenGVLab) and adds an Action Expert layer that generates robot control trajectories. Actions are predicted in absolute joint space with a chunk size of 30, matching the 30 Hz frequency of the AgiBot World dataset. The model omits the Latent Planner, the component of the full GO-1 that predicts latent action plans at a higher level of abstraction.
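
As a rough illustration of what a 30-step chunk means in time, the sketch below computes the horizon covered by one predicted chunk and splits a trajectory into consecutive chunks. The function names are hypothetical, and the assumption that actions are executed at the dataset's 30 Hz rate is ours; none of this is taken from the GO-1 codebase.

```python
from typing import List, Sequence

CHUNK_SIZE = 30    # actions predicted per chunk (figure quoted above)
CONTROL_HZ = 30.0  # AgiBot World recording frequency (figure quoted above)


def chunk_horizon_seconds(chunk_size: int = CHUNK_SIZE, hz: float = CONTROL_HZ) -> float:
    """Time covered by one chunk, assuming actions are executed at `hz`."""
    return chunk_size / hz


def split_into_chunks(actions: Sequence, chunk_size: int = CHUNK_SIZE) -> List[Sequence]:
    """Split a trajectory into consecutive chunks; the last one may be shorter."""
    return [actions[i:i + chunk_size] for i in range(0, len(actions), chunk_size)]


if __name__ == "__main__":
    # A 75-step trajectory yields two full chunks and one partial chunk.
    chunks = split_into_chunks(list(range(75)))
    print(chunk_horizon_seconds(), [len(c) for c in chunks])
```

Under these assumptions one chunk corresponds to one second of motion.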

Training and data

The model was pretrained on the AgiBot World Beta dataset (1,003,672 trajectories, ~43.8 TB), which contains data from the AgiBot G1 humanoid and covers 100+ scenarios across 5 target domains (retail, industry, catering, home, office). Contrary to common intuition, pretraining on a single embodiment (AgiBot G1) is reported to yield better cross-embodiment transfer than multi-robot pretraining: after fine-tuning on fewer than 200 demonstrations, the model transfers to AgileX Cobot Magic (Aloha), Dual Franka (LIBERO) and RoboTwin.

Hardware requirements

Inference: ~7 GB GPU memory (runs on a single RTX 4090). Full fine-tuning (all weights): ~70 GB at batch size 16 (requires A100 80 GB or H100). Action Expert-only fine-tuning: ~24 GB at batch size 16 (RTX 4090, A100 40 GB). Recommended: CUDA 12.4, Flash Attention 2.4.2, and the LeRobot framework with dataset v2.1 (commit 2b71789).
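
The memory figures above can be turned into a quick feasibility check. This is a minimal sketch: the `feasible_modes` helper and the mode names are hypothetical, the thresholds are the approximate values quoted above (fine-tuning at batch size 16), and real usage will vary with sequence length and framework overhead.

```python
from typing import List

# Approximate GPU memory needs in GB, from the figures quoted above.
# Fine-tuning numbers assume batch size 16.
MEMORY_GB = {
    "inference": 7,
    "action_expert_finetune": 24,
    "full_finetune": 70,
}


def feasible_modes(gpu_memory_gb: float) -> List[str]:
    """Return the modes whose quoted memory need fits in `gpu_memory_gb`."""
    return [mode for mode, need in MEMORY_GB.items() if gpu_memory_gb >= need]


if __name__ == "__main__":
    for name, mem in [("RTX 4090", 24), ("A100 40 GB", 40), ("A100 80 GB", 80)]:
        print(name, feasible_modes(mem))
```

On a 24 GB card, Action Expert-only fine-tuning sits right at the quoted limit, so a smaller batch size may be needed in practice.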

Openness and availability

GO-1 Air is publicly available on Hugging Face at agibot-world/GO-1-Air under CC BY-NC-SA 4.0 (non-commercial). The checkpoint is in Safetensors format, stored in BF16. It is loaded with transformers.AutoModel and trust_remote_code=True. The model is described in arXiv paper 2503.06669, which was an IROS 2025 Best Paper Award finalist and was published in IEEE TRO in 2026.
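
A minimal loading sketch based on the details above. The `load_go1_air` wrapper is hypothetical, and it assumes the remote code returns a standard PyTorch module. How observations are preprocessed and how actions are generated is defined by the repository's own code and is not shown here.

```python
MODEL_ID = "agibot-world/GO-1-Air"  # Hugging Face repository named above


def load_go1_air(device: str = "cuda"):
    """Load the checkpoint; the weights are downloaded on the first call."""
    # Imported lazily so this module can be imported without torch installed.
    import torch
    from transformers import AutoModel

    model = AutoModel.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,      # required loader flag, per the text above
        torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
    )
    # Assumes the returned object is a torch.nn.Module.
    return model.to(device).eval()
```

Because trust_remote_code=True executes Python code shipped in the model repository, it is worth reviewing that code before calling `load_go1_air()`.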

Technical specification

Parameters: 3B
License: CC BY-NC-SA 4.0
Hardware requirements: Inference: ~7 GB VRAM (RTX 4090). Full fine-tuning: ~70 GB (A100 80 GB, H100). Action Expert-only fine-tuning: ~24 GB (RTX 4090, A100 40 GB). Recommended: CUDA 12.4, Flash Attention 2.4.2, LeRobot dataset v2.1.
Features: Fine-tuning
Input modalities: image, text, robot sensors, robot state data
Output modalities: robot actions, robot commands, motion trajectories
