RoboAgent

RoboAgent v1.2 (maintenance) · Carnegie Mellon University

Active · Open source · API available

CATEGORY: Control · Control & Planning

READINESS: TRL 5

ADOPTION SCALE: Research / Prototype

LICENSES: Apache-2.0

FIRST RELEASE: 2023

**RoboAgent** is a project by the CMU Robotics Institute (Vikash Kumar) and Meta AI (FAIR), announced in August 2023 (paper 'RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking', Bharadhwaj et al., arXiv:2309.01918). The goal: to demonstrate that a **small (~75M-parameter) policy** trained on just **7,500 trajectories** can perform 12 skills across 38 tasks while generalizing to new objects and scenes.

The key innovation: **semantic augmentations** — during data preparation, synthetic image variants are generated through inpainting (Stable Diffusion) while preserving the robot's actions. From one demonstration, 10-50 variants are created (different backgrounds, lighting, distractors, object colors), which significantly increases generalization without additional real demonstrations.
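The data flow described above can be sketched as follows. This is a minimal illustration, not the project's pipeline: `repaint_background` is a hypothetical stand-in for a diffusion inpainting model (it only fills the editable region with a flat random color), and `augment_demo` and the mask layout are names assumed for this example. The point it shows is that every variant receives new pixels outside the protected region while reusing the original action sequence unchanged.

```python
import numpy as np

def repaint_background(frames, keep_masks, rng):
    """Hypothetical stand-in for an inpainting model.

    Pixels where keep_masks is True (robot, task object) are preserved;
    everything else is repainted with one random color per variant.
    """
    color = rng.integers(0, 256, size=3, dtype=np.uint8)
    out = frames.copy()
    out[~keep_masks] = color  # broadcast the RGB color over all editable pixels
    return out

def augment_demo(frames, actions, keep_masks, n_variants, seed=0):
    """Return n_variants (frames, actions) pairs derived from one demonstration.

    frames:     uint8 array of shape (T, H, W, 3)
    actions:    array of shape (T, action_dim), reused as-is in every variant
    keep_masks: bool array of shape (T, H, W), True where pixels must not change
    """
    rng = np.random.default_rng(seed)
    return [(repaint_background(frames, keep_masks, rng), actions)
            for _ in range(n_variants)]

# Toy demonstration: 4 frames of 8x8 pixels with a 3x3 protected patch.
frames = np.zeros((4, 8, 8, 3), dtype=np.uint8)
masks = np.zeros((4, 8, 8), dtype=bool)
masks[:, 2:5, 2:5] = True
frames[masks] = 200
actions = np.arange(4 * 7, dtype=np.float32).reshape(4, 7)

variants = augment_demo(frames, actions, masks, n_variants=10)
```

With the figures quoted in this profile, expanding every one of the 7,500 trajectories into 10-50 variants would yield roughly 75,000 to 375,000 training trajectories; whether the project augments every trajectory this way is an assumption here.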

Architecture: **MT-ACT** (Multi-Task Action Chunking Transformer), an extension of ACT (Action Chunking with Transformers, introduced with the ALOHA system) with multi-task conditioning via a language embedding (from a CLIP text encoder). The network predicts 'chunks' of 10-20 actions instead of individual actions, which stabilizes execution and allows 30 Hz control because a single forward pass yields several timesteps of actions.
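To make the chunking idea concrete, here is a small sketch of open-loop chunked execution. All names (`run_chunked`, `ToyEnv`, `toy_policy`) are hypothetical and the policy is a toy 1-D controller, not MT-ACT. It shows only the control-loop consequence: the policy is queried once per chunk rather than once per action. At 30 Hz, a chunk of 10-20 actions covers roughly 0.33-0.67 s of motion. The sketch omits the temporal ensembling of overlapping chunks that the ACT paper also describes.

```python
from typing import Callable, List

def run_chunked(policy: Callable[[float, int], List[float]],
                observe: Callable[[], float],
                apply_action: Callable[[float], None],
                total_steps: int,
                chunk_size: int) -> int:
    """Execute total_steps actions, querying the policy once per chunk.

    Returns the number of policy calls that were needed.
    """
    calls = 0
    done = 0
    while done < total_steps:
        chunk = policy(observe(), chunk_size)
        if not chunk:
            raise ValueError("policy returned an empty chunk")
        calls += 1
        # Execute the chunk open-loop, truncating the final one if needed.
        for action in chunk[: total_steps - done]:
            apply_action(action)
            done += 1
    return calls

class ToyEnv:
    """Hypothetical 1-D environment: the state is a single position."""
    def __init__(self) -> None:
        self.pos = 0.0
    def observe(self) -> float:
        return self.pos
    def apply(self, delta: float) -> None:
        self.pos += delta

def toy_policy(obs: float, chunk_size: int) -> List[float]:
    """Toy stand-in for a learned policy: reach position 1.0 in equal steps."""
    step = (1.0 - obs) / chunk_size
    return [step] * chunk_size

env = ToyEnv()
calls_chunked = run_chunked(toy_policy, env.observe, env.apply,
                            total_steps=60, chunk_size=20)
```

In this toy setup, two seconds of control at 30 Hz (60 steps) needs 3 policy calls with a chunk size of 20, versus 60 calls when predicting one action at a time.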

Demonstrated skills: pick, place, push, slide, rotate, hinge open/close, drawer open/close, pour, wipe, pick-and-place, multi-step assembly. Hardware: Franka Emika Panda + RealSense D455 (one front view) + RealSense D435 (wrist view). Training: 4× A100 over ~3 days. The full stack is open-source on github.com/robopen/roboagent (Apache 2.0).

Impact: RoboAgent helped popularize **semantic augmentations** in mainstream robot learning, and the technique has been adopted by OpenVLA, π0, and Genesis. Co-author Abhinav Gupta subsequently co-founded the startup Skild AI (which raised a $300M Series A in 2024), continuing the 'compact foundation models' direction.

Documentation