OpenAI text-to-video diffusion-transformer model. Generates clips up to 60 seconds in 1080p from a text prompt, an image or another video.
Release date
15 February 2024
Access: Hosted
Deployment: Cloud
Overview
Access & deployment
Hosted
Cloud
Weights: Closed
Key parameters
Input: text, image, video
Technical specification
Max output tokens
0
tokens per response
Modalities
Input
text, image, video
Output
video
Capabilities and applications
Native model capabilities
Video generation
The model's ability to generate video clips from a text prompt, image or another video, with control over length, resolution and visual characteristics.
Category: video
Image-to-video
The model's ability to animate a static input image, extending it in time into a consistent video clip according to a description of motion or action.
Category: video
Video understanding
The model's ability to analyse and interpret video content, recognising actions, motion, events and relationships between objects over time.
Category: video
Technical architecture
Core Architecture
Model Form
