GPT-5.5

gpt-5.5 · Family: GPT
GPT-5.5 is OpenAI's newest frontier model, focused on agentic coding, computer use, knowledge work, and scientific research, with a 1M-token context window.
✓ Active · ✓ Public access · LLM · Multimodal · Reasoning model · Tool-using model · Family: GPT
Context window: 1M tokens
Max output: 128,000 tokens
Release date: 23 April 2026
Access: API, Hosted · Deployment: Cloud

Overview

Classification
LLM · Multimodal · Reasoning model · Tool-using model
Family: GPT
Access & deployment
API · Hosted · Cloud
Weights: Closed
Key parameters
Context: 1M
✓ Tools
Input: text, image

Technical specification

Context window: 1M tokens
Max output tokens: 128,000 tokens per response
Knowledge cutoff: 1 Dec 2025
Features: ✓ Tool use
Modalities
Input: text, image
Output: text, code, structured_data
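
The two limits above interact when sizing a request. The following is a minimal sketch, not an official client: it assumes that "1M" means 1,000,000 tokens and that prompt and output tokens share the same context window, neither of which the source confirms. The function name `max_output_budget` is hypothetical.

```python
# Figures taken from the specification above.
# Assumption: "1M" is read as exactly 1,000,000 tokens.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 128_000


def max_output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens can still be requested for a prompt.

    Assumption (not confirmed by the source): prompt and output tokens
    are counted against the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    # The budget is capped by the per-response limit and never negative.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


print(max_output_budget(200_000))    # 128000
print(max_output_budget(950_000))    # 50000
print(max_output_budget(1_200_000))  # 0
```

Under these assumptions, the per-response cap is the binding limit until the prompt exceeds 872,000 tokens; beyond that, the remaining context becomes the constraint.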

Capabilities and applications

Native model capabilities
Reasoning · Category: reasoning
Multi-step reasoning · Category: reasoning
Long context · Category: reasoning
Coding · Category: coding
Function calling · Category: planning
Structured output · Category: structured_generation
Audio understanding · Category: audio
Image understanding · Category: vision
Video understanding · Category: video
Chart understanding · Category: vision
Diagram reasoning · Category: reasoning
OCR · Category: vision
Multilingual · Category: language
Planning · Category: planning
Streaming output · Category: reasoning
Interleaved multimodal input · Category: reasoning
Multimodal understanding · Category: multimodal

Benchmark results

20 benchmarks
SWE-Bench Pro (Public)
accuracy · Evaluation conducted with xhigh reasoning effort in a research environment.
58.6%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Results may differ slightly from production ChatGPT.
Terminal-Bench 2.0
accuracy · Tests complex command-line workflows requiring planning, iteration, and tool coordination.
82.7%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
Expert-SWE (Internal)
accuracy · Internal evaluation of long-horizon coding tasks (estimated human completion time: 20 hours).
73.1%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Internal OpenAI benchmark; no public methodology available.
OSWorld
accuracy · Measures the model's ability to independently operate real operating systems.
78.7%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GDPval
wins or ties vs. industry professionals · Tests the model's ability to perform specialized professional work across 44 occupations.
84.9%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
BrowseComp
accuracy · Evaluation of browser tool usage capabilities.
84.4%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GPQA
accuracy · Evaluation with xhigh reasoning effort in a research environment.
93.6%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Humanity's Last Exam (HLE)
accuracy · Evaluation with xhigh reasoning effort in a research environment, without tools.
41.4%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Variant without tools.
Humanity's Last Exam (HLE)
accuracy · Evaluation with xhigh reasoning effort in a research environment, with tools.
52.2%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Variant with tools.
EpochAI Frontier Math
accuracy · Evaluation with xhigh reasoning effort in a research environment.
51.7%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
EpochAI Frontier Math
accuracy · Hardest FrontierMath tier; xhigh reasoning effort.
35.4%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
Toolathlon
accuracy · Tool-use evaluation; xhigh reasoning effort.
55.6%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
CyberGym
accuracy · Cybersecurity benchmark; xhigh reasoning effort.
81.8%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
TAU-bench
accuracy · Tests complex customer-service workflows in telecommunications; results obtained without prompt tuning.
98.0%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Tau2-bench Telecom, run without prompt tuning (GPT-4.1 as user model).
MMMU
accuracy · Multimodal evaluation without tools; xhigh reasoning effort.
81.2%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
MMMU
accuracy · Multimodal evaluation with tools; xhigh reasoning effort.
83.2%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
BixBench
accuracy · Bioinformatics and data-analysis benchmark; xhigh reasoning effort.
80.5%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
GeneBench
accuracy · Multi-step analysis of scientific data in genetics and quantitative biology; xhigh reasoning effort.
25.0%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
ARC-AGI-1 (Verified)
accuracy · Abstract reasoning; xhigh reasoning effort.
95.0%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
ARC-AGI-2 (Verified)
accuracy · Abstract reasoning (harder difficulty level); xhigh reasoning effort.
85.0%
23 Apr 2026 · OpenAI (openai.com/index/introducing-gpt-5-5/)
Evaluation with xhigh reasoning effort in a research environment.
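
Two benchmarks in the list above, Humanity's Last Exam and MMMU, are reported both with and without tools. As a rough illustration only, the sketch below computes the percentage-point difference between the two variants from the listed scores. The dictionary layout and the name `tool_uplift` are hypothetical, and the comparison assumes the paired runs are otherwise comparable, which the source does not state.

```python
# Scores copied from the benchmark list above (accuracy, in percent).
scores = {
    "Humanity's Last Exam (HLE)": {"without_tools": 41.4, "with_tools": 52.2},
    "MMMU": {"without_tools": 81.2, "with_tools": 83.2},
}


def tool_uplift(entry: dict) -> float:
    """Percentage-point gain from enabling tools, rounded to one decimal."""
    return round(entry["with_tools"] - entry["without_tools"], 1)


uplift = {name: tool_uplift(entry) for name, entry in scores.items()}
for name, delta in uplift.items():
    print(f"{name}: +{delta} points with tools")
```

On these figures the difference is 10.8 points for HLE and 2.0 points for MMMU.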

Pricing

Deployment and security

Security / Enterprise
✓ Verified enterprise information

OpenAI rates GPT-5.5's cyber and biological capabilities as High under the Preparedness Framework. The model underwent a full safety and governance process, including targeted evaluations for advanced cyber and biological capabilities and testing with external experts.

Updated: 25 Apr 2026