General-purpose model-based RL algorithm that, with a single set of hyperparameters, masters more than 150 tasks and is the first to collect diamonds in Minecraft without any human data.
Parameters
12M to 400M
parameters
Release date
10 January 2023
Access: Download
Deployment: Local
Overview
Access & deployment
Download
Local
Weights: Open source
Key parameters
Parameters: 12M to 400M
Fine-tuning
Input: image, structured data, robot state data
Robotics
Motion planning, Robot control, Environment modeling, Spatial prediction
Technical specification
Parameters
12M to 400M
parameters
License
MIT
Hardware requirements
Trains on a single GPU. Reported training times range from about 12 hours (small configurations) to several days (large models) on modern NVIDIA GPUs or on TPUs. The reference implementation is built on JAX.
Features: Fine-tuning
Modalities
Input
image, structured_data, robot_state_data
Output
robot_actions, structured_data
Capabilities and applications
Native model capabilities
Planning
The model's ability to determine a sequence of actions leading to a goal: predicting the consequences of actions and selecting an optimal path in a given environment.
Category: planning
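The planning definition above (predict the consequences of actions, then pick the best path) can be illustrated with a small sketch. This is a toy, exhaustive-search planner over a hand-written model of a one-dimensional world. It is not DreamerV3's procedure or API: the world, the names `model`, `reward`, `plan`, and the values `GOAL`, `horizon`, `discount` are all assumptions made up for illustration.

```python
from itertools import product

# Hypothetical toy world: the agent stands on an integer position in [0, 9].
ACTIONS = (-1, 0, +1)  # step left, stay, step right
GOAL = 7               # illustrative goal position (assumption)


def model(state: int, action: int) -> int:
    """Predict the next state for an action; walls clamp the position."""
    return min(9, max(0, state + action))


def reward(state: int) -> float:
    """Sparse reward: 1.0 only while the agent is on the goal."""
    return 1.0 if state == GOAL else 0.0


def plan(state: int, horizon: int, discount: float = 0.9):
    """Score every action sequence of length `horizon` under the model
    and return the sequence with the highest discounted return."""
    best_seq, best_return = None, float("-inf")
    for seq in product(ACTIONS, repeat=horizon):
        s, total = state, 0.0
        for t, a in enumerate(seq):
            s = model(s, a)                       # predicted consequence
            total += (discount ** t) * reward(s)  # discounted reward
        if total > best_return:
            best_seq, best_return = seq, total
    return best_seq, best_return


best_seq, best_return = plan(state=4, horizon=5)
print(best_seq, round(best_return, 4))
```

Exhaustive search is used here only because the toy action space is tiny (3^5 = 243 sequences); it does not scale to pixel inputs or long horizons, where learned models and learned policies are typically used instead.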
Robotics
Motion planning, Robot control, Environment modeling, Spatial prediction
Benchmark results
4 benchmarks
Minecraft (Diamond)
Setting: pixel input, sparse rewards, no curriculum
Result: first to collect diamonds without human data
Source: DreamerV3 paper (arXiv:2301.04104)
Atari 200M
Setting: single hyperparameter configuration across all games
Result: state-of-the-art with single config
Source: DreamerV3 paper (arXiv:2301.04104)
DeepMind Control Suite (Proprio)
Result: state-of-the-art
Source: DreamerV3 paper (arXiv:2301.04104)
Crafter
Result: state-of-the-art
Source: DreamerV3 paper (arXiv:2301.04104)
