
Unitree G1
Bipedal humanoid robot by Unitree Robotics, designed as a compact platform for research and development.
- Research
- Home Assistance

**OpenVLA** is the first fully open-source replication of the RT-2 architecture, announced in June 2024 (paper 'OpenVLA: An Open-Source Vision-Language-Action Model', Kim et al., arXiv:2406.09246). It was developed jointly by Stanford AI Lab, UC Berkeley (Robot Learning Lab), Google DeepMind, Toyota Research Institute, MIT and Physical Intelligence. OpenVLA fills the gap left by the closed RT-2 — releasing the **model weights**, **training code**, **fine-tuning recipes**, and **complete data pipeline**.
Architecture: ~**7B parameters** built from three components. (1) **Vision encoder**: a fusion of DINOv2 (low-level spatial features) and SigLIP (CLIP-style, language-aligned semantic features); per the paper, DINOv2 is a ViT-L/14 and SigLIP a ViT-SO400M/14. (2) **LLM backbone**: Llama 2 7B. (3) **Action head**: not a separate network; each action dimension is discretized into 256 bins (as in RT-2), and the LLM predicts the resulting action tokens by next-token prediction.
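The 256-bin discretization can be illustrated with a minimal sketch. It assumes uniform bins over a per-dimension `[low, high]` range with `high > low`. The function names and the choice of range are hypothetical, and the mapping from bin indices to vocabulary token IDs is left out.

```python
from typing import List, Sequence

N_BINS = 256  # bins per action dimension, as stated in the text


def discretize(action: Sequence[float],
               low: Sequence[float],
               high: Sequence[float]) -> List[int]:
    """Map each continuous action value to a bin index in [0, N_BINS - 1]."""
    bins = []
    for a, lo, hi in zip(action, low, high):
        a = min(max(a, lo), hi)          # clip to the per-dimension range
        frac = (a - lo) / (hi - lo)      # normalise to [0, 1]; assumes hi > lo
        bins.append(min(int(frac * N_BINS), N_BINS - 1))
    return bins


def undiscretize(bins: Sequence[int],
                 low: Sequence[float],
                 high: Sequence[float]) -> List[float]:
    """Recover a continuous value per dimension as the centre of its bin."""
    return [lo + (b + 0.5) / N_BINS * (hi - lo)
            for b, lo, hi in zip(bins, low, high)]


if __name__ == "__main__":
    low, high = [-1.0] * 3, [1.0] * 3
    tokens = discretize([-1.0, 0.0, 1.0], low, high)
    print(tokens)
    print(undiscretize(tokens, low, high))
```

With uniform bins the round-trip error is bounded by half a bin width, `(high - low) / (2 * 256)` per dimension. That bound is the resolution limit of any policy that emits actions as discrete tokens.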
Training data: ~970,000 demonstrations from **Open X-Embodiment** (Google DeepMind, 21 institutions), covering 22 robots (Franka, UR5, WidowX, Sawyer, Google Robot, etc.) and ~500 tasks. Training time: 14 days on 64× A100 80 GB (about 21,500 A100-hours, as reported in the paper).
Results: OpenVLA achieves a **+16.5 pp** absolute task success rate over RT-2-X (55B), measured across 29 evaluation tasks and multiple robot embodiments, despite having roughly 7× fewer parameters. Fine-tuning on custom datasets (LoRA-style) takes 10-20 hours on 1× A100 and adapts the model to a new robot with 100-500 demonstrations.
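Why LoRA-style fine-tuning fits on a single GPU can be seen from a parameter count. The sketch below assumes the standard LoRA formulation, in which a frozen `d_out × d_in` weight is adapted by two trainable low-rank factors of shapes `d_out × r` and `r × d_in`. The matrix size 4096 × 4096 and the rank 32 are illustrative assumptions, not values taken from the text.

```python
from typing import Tuple


def lora_param_counts(d_in: int, d_out: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: parameters updated when the whole matrix is fine-tuned.
    lora: parameters in the two low-rank factors B (d_out x r) and A (r x d_in).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


if __name__ == "__main__":
    # Illustrative values only: a square 4096-wide projection, rank 32.
    full, lora = lora_param_counts(4096, 4096, 32)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumed values the adapter trains 1/64 of the parameters of the matrix it wraps, which shrinks gradient and optimizer-state memory accordingly.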
Ecosystem: full integration with **HuggingFace Transformers** (`openvla/openvla-7b`), support for 4-bit quantization (bitsandbytes), compatibility with PyTorch 2.0+. Impact: OpenVLA has become the **de facto VLA baseline** in academia, serving as the basis of, or the comparison point for, many subsequent works (for example CogACT and TraceVLA). RoboFlamingo, sometimes listed alongside them, predates OpenVLA and is not derived from it. Reproducibility: full checkpoints, dataset indices, and training scripts.
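A hedged sketch of loading the checkpoint in 4-bit through Transformers follows. The class names (`AutoModelForVision2Seq`, `AutoProcessor`, `BitsAndBytesConfig`) and the prompt template follow the public model card as recalled here. They are assumptions to check against the card and your installed `transformers` version, not a guaranteed interface. The helper names `build_prompt` and `load_openvla_4bit` are hypothetical.

```python
def build_prompt(instruction: str) -> str:
    """Format a language instruction using the prompt template assumed
    from the model card (an assumption, verify before relying on it)."""
    return f"In: What action should the robot take to {instruction.lower()}?\nOut:"


def load_openvla_4bit(model_id: str = "openvla/openvla-7b"):
    """Load processor and model with 4-bit bitsandbytes quantization.

    Heavy imports are deferred so that build_prompt stays usable on
    machines without torch, transformers, or a GPU.
    """
    import torch
    from transformers import (
        AutoModelForVision2Seq,
        AutoProcessor,
        BitsAndBytesConfig,
    )

    # trust_remote_code is needed because the model class ships with the repo.
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForVision2Seq.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,
    )
    return processor, model


if __name__ == "__main__":
    print(build_prompt("Pick up the cup"))
```

Calling `load_openvla_4bit()` downloads several gigabytes of weights and requires a CUDA GPU with `bitsandbytes` installed. Inference then goes through the repository's remote code, so its exact method names and arguments should be taken from the model card.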
A Runtime is the environment or execution layer that runs code, loads libraries, manages dependencies, and hosts applications or services while the system is operating, whether under hard real-time constraints or not. In robotics this includes real-time operating system (RTOS) runtimes, ROS 2 executor runtimes, containerised execution environments (Docker, Podman), and embedded C++ runtimes on microcontrollers.
An SDK (Software Development Kit) is a curated set of libraries, interfaces, tools, sample code, and documentation intended for building applications and integrating with a specific hardware device, platform, or service. In robotics, an SDK typically exposes device control, telemetry, sensor access, configuration, and execution functions, significantly reducing the time-to-first-integration for developers targeting a specific robot or platform.
A family of Vision-Language-Action (VLA) and foundation models and tooling for robotics: OpenVLA (Stanford/Berkeley), LeRobot (Hugging Face; an open-source library and model hub rather than a single model), RoboAgent (CMU), RT-2 (Google DeepMind; published as a paper only, weights not released). Trained on datasets such as Open X-Embodiment, BridgeData V2, and RoboNet.
The de facto VLA baseline for academic teams since H2 2024 — used in 150+ scientific publications (Google Scholar, Q1 2026). Fine-tuning experiments: TRI (autonomous driving demonstrations), Stanford (Tidybot mobile manipulation), Berkeley (BridgeData V2). Commercial fine-tunes: Skild AI, Covariant (closed). HuggingFace Spaces demo with a teleop interface.
github.com/openvla/openvla ~2.9k★, ~310 forks. HuggingFace `openvla/openvla-7b` ~50k downloads/month. The arXiv:2406.09246 paper has ~450 citations (Q1 2026). 'Open Robotics Foundation Models' Discord ~1.5k members. Active PRs with fine-tunes for specific domains.

Unitree H1 is a full-size general-purpose humanoid robot (~180 cm, ~47 kg). Bipedal, 5 DOF per leg + 4 DOF per arm, 3.3 m/s walking speed, 360° perception via 3D LiDAR + depth camera, Unitree M107 PMSM joint motors with ~360 N·m peak knee torque. Standard compute: Intel Core i5/i7; optional NVIDIA Jetson Orin NX.

Figure 03 is the third-generation humanoid robot from Figure AI, designed for Helix, home environments, and scalable mass production.

Boston Dynamics bipedal humanoid robot. The fully electric generation unveiled in 2024 succeeds the hydraulic Atlas that was retired after more than a decade of research.
Ubuntu 24.04 LTS 'Noble Numbat' — supported until April 2029. The host for ROS 2 Jazzy.
Open weights `openvla/openvla-7b` on HuggingFace. Code on GitHub `openvla/openvla` (MIT License). Llama 2 license for the backbone — requires acceptance of the Meta AI license for full commercial use.
License family: Permissive
Small 3B-parameter variant for edge inference on Jetson AGX Orin (~150 ms per action).
OpenVLA integration with the Mobile ALOHA platform (Stanford), first dual-arm demonstration on a wheeled mobile manipulator.
CMU + OpenVLA — a variant with a diffusion-based action head instead of discrete tokens. SOTA results in long-horizon manipulation.
The 'OFT' variant: the Optimized Fine-Tuning recipe (LoRA-based) with better performance on few-shot tasks.
First public release — arXiv:2406.09246 paper + the `openvla/openvla-7b` checkpoint on HuggingFace.