NVIDIA's open World Foundation Model for controllable video translation: turns simulations (for example from Omniverse) into photorealistic synthetic data for robotics and autonomous vehicles.
Parameters
7B (Cosmos Transfer 1, all variants)
Release date
19 March 2025
Access: Download, API, Hosted
Deployment: 💻 Local, ☁ Cloud
Overview
Access & deployment
Access: Download, API, Hosted
Deployment: Local, Cloud
Weights: Open weights
Key parameters
🧩 Parameters: 7B (Cosmos Transfer 1, all variants)
✓ Fine-tuning
📥 Input: video, image, text, depth…
Robotics
Environment modeling, Scene understanding, Spatial reasoning
Technical specification
Parameters
7B (Cosmos Transfer 1, all variants)
License
NVIDIA Open Model License (Cosmos Transfer 1 / 2.5)
Hardware requirements
Training runs on NVIDIA GPU clusters of the H100, B100 or GB200 class. Inference with the 7B model is feasible on a single server-grade GPU (H100 80GB) or through NVIDIA NIM. The reference implementation is written in PyTorch.
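To make the single-GPU inference claim concrete, the following is a rough sketch of the arithmetic for weight memory alone. It assumes 7 billion parameters and a fixed byte width per parameter; the function name `weight_memory_gib` and the dtype table are illustrative, not part of any NVIDIA tooling. Activations, the video tokenizer and any control branches need additional memory that this estimate does not cover.

```python
# Back-of-the-envelope estimate of weight memory for a 7B-parameter model.
# Assumption: weights only. Activations, the video tokenizer and control
# branches add to this figure and are not modelled here.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in ("fp32", "bf16"):
        print(f"{dtype}: {weight_memory_gib(7e9, dtype):.1f} GiB")
    # prints:
    # fp32: 26.1 GiB
    # bf16: 13.0 GiB
```

Under these assumptions the weights occupy roughly 13 GiB in bf16, which leaves most of an 80GB card free for activations. How much of that headroom video generation actually consumes depends on resolution and clip length and has to be measured.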
Features: ✓ Fine-tuning
Modalities
⬇ Input
video, image, text, depth, structured_data
⬆ Output
video
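A pipeline that feeds this model could check its requested inputs against the modalities listed above before submitting a job. The sketch below is one way to do that; the names `SUPPORTED_INPUTS`, `SUPPORTED_OUTPUTS` and `unsupported_inputs` are hypothetical and do not correspond to any Cosmos or NIM API.

```python
# Hypothetical helper: validate requested input modalities against the
# list on this card. None of these names come from an NVIDIA library.

SUPPORTED_INPUTS = frozenset({"video", "image", "text", "depth", "structured_data"})
SUPPORTED_OUTPUTS = frozenset({"video"})


def unsupported_inputs(requested):
    """Return the requested modalities that the card does not list, sorted."""
    return sorted(set(requested) - SUPPORTED_INPUTS)


if __name__ == "__main__":
    print(unsupported_inputs(["video", "depth"]))  # prints []
    print(unsupported_inputs(["audio", "text"]))   # prints ['audio']
```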
Capabilities and applications
Native model capabilities
Video generation
The model's ability to generate video clips from a text prompt, image or another video, with control over length, resolution and visual characteristics.
Category: video
Image-to-video
The model's ability to animate a static input image — extending it in time into a consistent video clip according to a description of motion or action.
Category: video
Video understanding
The model's ability to analyse and interpret video content — recognising actions, motion, events and relationships between objects over time.
Category: video
