GO-1 Air

GO-1 Air is the lightweight variant of the GO-1 Vision-Language-Action foundation model from AgiBot/OpenDriveLab, released without the Latent Planner. It has 3B parameters, is built on InternVL 2.5-2B, and was pretrained on AgiBot World Beta. License: CC BY-NC-SA 4.0.

✓ Active | Robotics foundation model | Vision-Language-Action model | 📁 GO-1 (Genie Operator-1)

Parameters

3B parameters

Release date

19 September 2025

Producer: AGIBOT

Access: open weights, open source

Deployment: 💻 Local, 📱 On-device, ☁ Cloud

Overview

GO-1 Air is the variant of the GO-1 (Genie Operator-1) Vision-Language-Action foundation model without the Latent Planner component. It was open-sourced on September 19, 2025 by the AgiBot-World team (OpenDriveLab + AgiBot). Compared with the full GO-1, GO-1 Air is a lightweight variant: it is smaller (3B parameters) and faster at inference, at the cost of dropping the latent planning layer.

Architecture

GO-1 Air is built on the InternVL 2.5-2B Vision-Language Model (OpenGVLab) and adds an Action Expert layer for generating robot control trajectories. Actions are predicted in absolute joint space with a chunk size of 30 (matching the 30 Hz frequency of the AgiBot World dataset). The model does not contain the Latent Planner component present in the full GO-1, which predicts latent action plans at a higher abstraction level.
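The relationship between chunk size and control frequency can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the GO-1 repository: the names `ChunkSpec`, `timestamps` and `forward_passes` are hypothetical, and the idea of executing only part of a chunk before predicting the next one is an assumption used for illustration. Only the two numbers (30 actions per chunk, 30 Hz) come from the text above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkSpec:
    """Hypothetical container for the two figures quoted in the text."""
    chunk_size: int = 30       # actions predicted per forward pass
    control_hz: float = 30.0   # recording frequency of AgiBot World

    @property
    def horizon_s(self) -> float:
        """Seconds of motion covered by one predicted chunk."""
        return self.chunk_size / self.control_hz


def timestamps(spec: ChunkSpec, start_s: float = 0.0) -> list[float]:
    """Time at which each action of one chunk would be applied."""
    dt = 1.0 / spec.control_hz
    return [start_s + i * dt for i in range(spec.chunk_size)]


def forward_passes(episode_s: float, spec: ChunkSpec, executed_per_chunk: int) -> int:
    """Model calls needed for an episode if only the first
    `executed_per_chunk` actions of every chunk are executed (assumed policy)."""
    if not 1 <= executed_per_chunk <= spec.chunk_size:
        raise ValueError("executed_per_chunk must be in 1..chunk_size")
    total_steps = episode_s * spec.control_hz
    return math.ceil(total_steps / executed_per_chunk)


spec = ChunkSpec()
print(spec.horizon_s)                   # 1.0
print(len(timestamps(spec)))            # 30
print(forward_passes(10.0, spec, 30))   # 10
print(forward_passes(10.0, spec, 15))   # 20
```

With these figures one chunk spans one second of motion, so executing whole chunks needs about one model call per second, while replanning halfway through each chunk doubles that.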

Training and data

The model was pretrained on the AgiBot World Beta dataset (1,003,672 trajectories, ~43.8 TB), which contains data from the AgiBot G1 humanoid and covers 100+ scenarios across 5 target domains (retail, industry, catering, home, office). Counter to common intuition, the authors report that pretraining on a single embodiment (AgiBot G1) yields better cross-embodiment transfer than multi-robot pretraining: after fine-tuning on fewer than 200 demonstrations, the model transfers to AgileX Cobot Magic (Aloha), Dual Franka (LIBERO) and RoboTwin.

Hardware requirements

Inference: ~7 GB GPU memory (runs on a single RTX 4090). Full fine-tuning (all weights): ~70 GB at batch size 16 (requires A100 80 GB or H100). Action Expert-only fine-tuning: ~24 GB at batch size 16 (RTX 4090, A100 40 GB). Recommended: CUDA 12.4, Flash Attention 2.4.2, and the LeRobot framework with dataset v2.1 (commit 2b71789).
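As a quick way to read the memory figures above, the sketch below checks which workloads fit on a GPU of a given size. It is illustrative only: the mode names and the `feasible_modes` helper are hypothetical, and the thresholds are the approximate values quoted in the text (the fine-tuning figures assume batch size 16), so borderline cases such as a 24 GB card should be treated with caution.

```python
# Approximate GPU memory needs in GB, copied from the figures in the text.
# The dictionary keys are hypothetical labels, not names used by the project.
REQUIRED_GB = {
    "inference": 7,
    "finetune_action_expert": 24,  # batch size 16
    "finetune_full": 70,           # batch size 16
}


def feasible_modes(gpu_memory_gb: float) -> list[str]:
    """Return the workloads whose quoted memory need fits on the GPU."""
    return [mode for mode, need in REQUIRED_GB.items() if gpu_memory_gb >= need]


print(feasible_modes(24))  # ['inference', 'finetune_action_expert']
print(feasible_modes(80))  # ['inference', 'finetune_action_expert', 'finetune_full']
print(feasible_modes(6))   # []
```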

Openness and availability

GO-1 Air is publicly available on HuggingFace at agibot-world/GO-1-Air under CC BY-NC-SA 4.0 (non-commercial). The checkpoint is in Safetensors format, BF16. Loader: transformers.AutoModel with trust_remote_code=True. The model is described in arXiv paper 2503.06669, which was an IROS 2025 Best Paper Award finalist and is published in IEEE TRO (2026).
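The loader described above could look roughly like the following. This is a hedged sketch rather than the project's official snippet: the wrapper `load_go1_air` is a hypothetical name, requesting `torch.bfloat16` is an assumption based on the BF16 checkpoint, and what the returned object exposes for inference depends on the remote code shipped with the repository, which is not described here.

```python
MODEL_ID = "agibot-world/GO-1-Air"  # repository name quoted in the text


def load_go1_air(model_id: str = MODEL_ID):
    """Load the checkpoint with transformers.AutoModel (hypothetical wrapper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModel

    return AutoModel.from_pretrained(
        model_id,
        trust_remote_code=True,      # loader setting named in the text
        torch_dtype=torch.bfloat16,  # assumption: keep weights in BF16
    )


# Usage (downloads the weights, so it is left commented out):
# model = load_go1_air()
# print(type(model).__name__)
```

Because `trust_remote_code=True` executes Python code from the model repository, it is worth reviewing that code before loading.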

Classification

Robotics foundation model, Vision-Language-Action model

Family: GO-1 (Genie Operator-1)

Access & deployment

open weights, open source

Local, On-device, Cloud

Key parameters

🧩 Parameters: 3B

✓ Fine-tuning

📥 Input: image, text, robot sensors, robot state data

Technical specification

Parameters

3B parameters

License

CC BY-NC-SA 4.0

Hardware requirements

Inference: ~7 GB VRAM (RTX 4090). Full fine-tuning: ~70 GB (A100 80 GB, H100). Action Expert-only fine-tuning: ~24 GB (RTX 4090, A100 40 GB). Recommended: CUDA 12.4, Flash Attention 2.4.2, LeRobot dataset v2.1.

Features: ✓ Fine-tuning

Modalities

⬇ Input

image, text, robot sensors, robot state data

⬆ Output

robot actions, robot commands, motion trajectories

Deployment and security

🤖 Related robots

AGIBOT G2 (Robot)

💾 Related software

Genie Sim 3.0 (Software)

Sources and related pages

5 sources

Repo: HuggingFace agibot-world/GO-1-Air (model card), huggingface.co
Repo: GitHub OpenDriveLab/AgiBot-World, github.com
Paper: AgiBot World Colosseo (arXiv 2503.06669), IROS 2025 Best Paper Award Finalist & IEEE TRO 2026, arxiv.org
Blog: OpenGO1: The Bitter Lessons of Building VLA Systems at Scale (19.09.2025), opendrivelab.com
Web: AgiBot World Project Page, agibot-world.com
