Genie 3

3 · Family: Genie
Foundation world model from Google DeepMind that generates interactive 3D worlds from a text prompt, in real time at 24 fps, 720p, with consistency for several minutes.
Preview · Limited access · World Model · Genie
Release date
5 August 2025
Access: Hosted · Deployment: Cloud

Overview

Genie 3 is a general-purpose foundation world model developed by Google DeepMind, announced on 5 August 2025 by Jack Parker-Holder and Shlomi Fruchter. Given a text prompt, the model generates dynamic, interactive 3D worlds that can be navigated in real time at 24 frames per second at 720p resolution, retaining consistency for several minutes.

Progress over Genie 2

Genie 3 is the first model in the Genie family to allow real-time interaction, while also improving consistency and realism over Genie 2 (December 2024). Visual memory extends back roughly one minute: the model remembers previously seen regions and renders them correctly when they are revisited. Unlike approaches such as NeRFs or Gaussian Splatting, Genie 3 does not rely on an explicit 3D representation; worlds are generated frame by frame from the description and user actions, making them more dynamic and richer.
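
As a rough back-of-the-envelope illustration, the sketch below turns the stated figures (24 fps, 720p, roughly one minute of visual memory) into a per-frame time budget and frame counts. It assumes that 720p means 1280x720 pixels, which is the usual convention but is not stated by DeepMind, and all function names are hypothetical. Nothing here reflects how Genie 3 works internally.

```python
# Stated figures from the description above.
FPS = 24
MEMORY_SECONDS = 60          # "roughly one minute" of visual memory

# Assumption: 720p is taken to mean 1280x720 pixels.
WIDTH, HEIGHT = 1280, 720


def frame_budget_ms(fps: int) -> float:
    """Wall-clock time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def frames_in(seconds: float, fps: int) -> int:
    """Number of frames generated over a time span at a fixed frame rate."""
    return int(round(seconds * fps))


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Raw pixel throughput implied by the resolution and frame rate."""
    return width * height * fps


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.1f} ms")
    print(f"Frames in the memory window: {frames_in(MEMORY_SECONDS, FPS)}")
    print(f"Pixels per second: {pixels_per_second(WIDTH, HEIGHT, FPS):,}")
```

Under these assumptions, each frame has to be produced in under about 42 ms, and a one-minute memory window spans 1,440 generated frames.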

Promptable world events

In addition to navigational input, Genie 3 introduces promptable world events, a text-based form of interaction that lets the user change the simulated world on the fly (altering weather, introducing new objects or characters). This mechanism broadens the range of counterfactual ("what if") scenarios available to agents that learn from experience.
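
The two input channels described here, navigation and text-based world events, can be pictured as a single stream of tagged inputs. The sketch below is purely illustrative: Genie 3 has no public API, so `NavigationAction`, `WorldEvent` and `ToyWorldState` are hypothetical names, and the toy state only records what it receives instead of generating any frames.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class NavigationAction:
    """Hypothetical navigation input, e.g. 'forward' or 'back'."""
    direction: str


@dataclass(frozen=True)
class WorldEvent:
    """Hypothetical promptable world event, expressed as free text."""
    text: str


WorldInput = Union[NavigationAction, WorldEvent]


@dataclass
class ToyWorldState:
    """Stand-in for a world model: it tracks inputs, it does not render."""
    prompt: str
    position: int = 0
    active_events: List[str] = field(default_factory=list)

    def apply(self, item: WorldInput) -> None:
        if isinstance(item, NavigationAction):
            # Move along a single toy axis; unknown directions are ignored.
            step = {"forward": 1, "back": -1}.get(item.direction, 0)
            self.position += step
        else:
            # A world event changes the scene without moving the viewer.
            self.active_events.append(item.text)

    def describe(self) -> str:
        extras = ", ".join(self.active_events) or "no events"
        return f"{self.prompt} @ step {self.position} ({extras})"


if __name__ == "__main__":
    state = ToyWorldState("alpine lake")
    timeline = [
        NavigationAction("forward"),
        WorldEvent("it starts to snow"),
        NavigationAction("forward"),
    ]
    for item in timeline:
        state.apply(item)
    print(state.describe())
```

Keeping events and navigation as separate types makes it easy to replay the same navigation trace with different events, which is one way to read the "what if" scenarios mentioned above.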

Embodied agent research

Genie 3 is used to generate worlds for training and evaluating embodied agents. DeepMind demonstrated a collaboration with a recent version of the SIMA agent: in worlds generated by Genie 3, SIMA pursues stated goals by issuing navigation actions to the model, while Genie 3, unaware of the agent's goal, simulates future frames. Longer-horizon consistency makes it possible to execute longer sequences of actions and more complex tasks.
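
The division of labour described above (the agent knows the goal and issues actions; the world model sees only actions and returns frames) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not DeepMind's implementation: `ToyWorldModel`, `ToyAgent` and `rollout` are hypothetical stand-ins, and the "frames" are text strings, not video.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    index: int
    description: str


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a world model: sees actions, never the goal."""
    prompt: str
    history: List[str] = field(default_factory=list)

    def step(self, action: str) -> Frame:
        # The next frame depends on the prompt and the action history only.
        self.history.append(action)
        return Frame(
            index=len(self.history),
            description=f"{self.prompt} | after {action}",
        )


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: holds the goal and a fixed plan."""
    goal: str
    plan: List[str]

    def act(self, frame: Optional[Frame]) -> str:
        # A real agent would inspect the frame; this one cycles through its plan.
        n = 0 if frame is None else frame.index
        return self.plan[n % len(self.plan)]


def rollout(agent: ToyAgent, world: ToyWorldModel, steps: int) -> List[Frame]:
    """Alternate agent actions and world-model frames for a fixed horizon."""
    frames: List[Frame] = []
    frame: Optional[Frame] = None
    for _ in range(steps):
        action = agent.act(frame)    # agent decides, using its private goal
        frame = world.step(action)   # world model simulates the next frame
        frames.append(frame)
    return frames


if __name__ == "__main__":
    agent = ToyAgent(goal="reach the red door", plan=["forward", "left"])
    world = ToyWorldModel(prompt="warehouse")
    for f in rollout(agent, world, steps=3):
        print(f.index, f.description)
```

Note that the goal string never crosses the interface into `ToyWorldModel`, mirroring the statement that the world model is unaware of the agent's goal.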

Limitations

DeepMind states the following limitations: a limited action space directly available to the agent, imperfect modelling of interactions between multiple independent agents, imperfect geographic accuracy for real-world locations, difficulty rendering legible text (unless the text is provided in the world description), and a continuous interaction duration limited to a few minutes rather than extended hours.

Availability

Genie 3 has been released as a limited research preview to a small cohort of academics and creators. The weights are not publicly available; there is no public API. DeepMind signals plans to extend access to additional testers.

Classification
World Model
Family: Genie
Access & deployment
Hosted
Cloud
Weights: Closed
Key parameters
Input: text, structured data
Robotics
Environment modeling · Spatial prediction · Scene understanding · Spatial reasoning

Technical specification

Modalities
Input
text, structured_data
Output
video

Capabilities and applications

Native model capabilities
Video understanding
The model's ability to analyse and interpret video content: recognising actions, motion, events and relationships between objects over time.
Category: video
Planning
The model's ability to determine a sequence of actions leading to a goal: predicting the consequences of actions and selecting an optimal path in a given environment.
Category: planning
Robotics
Environment modeling · Spatial prediction · Scene understanding · Spatial reasoning

Technical architecture

Core Architecture