
OpenAI reasoning model released April 16, 2025 with full tool access in ChatGPT, ability to think with images, and a 200K context window. Succeeded by GPT-5.
Context window
200K
tokens
Max output
100,000
tokens
Release date
16 April 2025
Access:APIHostedDeployment:☁ Cloud
Overview
Access & deployment
APIHosted
Cloud
Weights: Closed
Key parameters
📏 Context: 200K
✓ Tools
📥 Input: text, image
Platforms
Technical specification
Context window
200K
tokens
Max output tokens
100,000
tokens per response
Knowledge cutoff
1 Jun 2024
Knowledge boundary
Features:✓ Tool use
Modalities
⬇ Input
textimage
⬆ Output
textcode
Capabilities and applications
Native model capabilities
Reasoning
Category: reasoning
Multi-step reasoning
Category: reasoning
Coding
Category: coding
Long context
Category: reasoning
Multilingual
Category: language
Image understanding
Category: vision
Multimodal understanding
Category: multimodal
Function Calling
Category: planning
Parallel Tool Calls
Ability to invoke multiple external tools simultaneously while generating a response.
Category: reasoning
Planning
Category: planning
Agentic capability
The model's ability to autonomously plan and execute multi-step tasks by sequentially using tools, maintaining context, and adapting to intermediate results.
Category: planning
Computer use
The model's ability to operate a computer interface by interpreting screenshots and generating actions such as clicks, typing, and navigating applications.
Category: planning
Structured output
Category: structured_generation
Benchmark results
6 benchmarks
Codeforces
ELO rating · High reasoning effort, with tools
2727points
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
SWE-bench
accuracy · SWE-bench Verified, fixed subset n=477, no custom scaffold
69.1%
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
MMMU
accuracy · Multimodal understanding, high reasoning effort
82.9%
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
AIME 2025
pass@1 · AIME 2025 with tool access (Python). Without tools the score is lower and not comparable to models without tool access.
98.4%
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
GPQA
accuracy · GPQA Diamond, high reasoning effort
83.3%
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
Humanity's Last Exam (HLE)
accuracy · Humanity's Last Exam, no tools
20.32%
📅 16 Apr 2025📄 OpenAI announcement (Introducing OpenAI o3 and o4-mini)
Pricing
Technical architecture
Core Architecture
Training Techniques
Deployment and security
☁ Available on platforms