GPT-5 Thinking

5 · Family: GPT

GPT-5 variant with deep reasoning mode, available in ChatGPT and via the API through the reasoning_effort parameter. Released August 7, 2025.

✓ Active✓ Public accessReasoning modelMultimodalLLM📁 GPT

Context window

400K

tokens

Max output

128,000

tokens

Release date

7 August 2025

🏢OpenAIProducer 🤝MicrosoftTechnology partner

Access:APIHostedDeployment:☁ Cloud

Overview

GPT-5 Thinking is the deep reasoning variant within the unified GPT-5 system released by OpenAI on August 7, 2025. The GPT-5 system consists of a fast model answering most queries, a deeper reasoning model (GPT-5 Thinking), and a real-time router that selects the path based on conversation type, complexity, tool needs, and user intent. In ChatGPT, GPT-5 Thinking is explicitly selectable from the model picker for paid users, or can be triggered by phrases such as "think hard about this". In the API, the same GPT-5 model (identifier gpt-5, snapshot gpt-5-2025-08-07) exposes a reasoning_effort parameter with values minimal/low/medium/high that controls the depth of the chain-of-thought. The context window is 400,000 tokens, maximum output is 128,000 tokens. Knowledge cutoff: September 30, 2024. Input modalities are text and image, output is text. API pricing: USD 1.25 per 1M input tokens (USD 0.125 cached) and USD 10 per 1M output tokens. The model was trained on Microsoft Azure AI supercomputers.

Classification

Reasoning modelMultimodalLLM

Family: GPT

Access & deployment

APIHosted

Cloud

Weights: Closed

Key parameters

📏 Context: 400K

✓ Tools

📥 Input: text, image

Platforms

OpenAI API Microsoft Azure AI Foundry

Technical specification

Context window

400K

tokens

Max output tokens

128,000

tokens per response

Knowledge cutoff

30 Sept 2024

Knowledge boundary

Features:✓ Tool use

Modalities

⬇ Input

textimage

⬆ Output

textcode

Capabilities and applications

Native model capabilities

Reasoning

The model's ability to reason logically and solve complex problems.

Category: reasoning

Multi-step reasoning

Carrying out multi-step chains of reasoning across long, complex tasks.

Category: reasoning

Coding

Generating, analysing and modifying code in many programming languages. Covers writing functions, debugging, refactoring, code review, and creating tests. Measured by benchmarks such as HumanEval and SWE-bench.

Category: coding

Long context

Support for large context windows — tens to hundreds of thousands (or millions) of input tokens. Enables analysis of entire codebases, long documents, and many parallel conversations without losing earlier information. GPT-5.1 supports 400,000 tokens.

Category: language

Multilingual

Competence in many natural languages (from a few to over a hundred): understanding, generation, translation, and code-switching within a single conversation. Frontier models support a wide range of languages with comparable quality.

Category: language

Image understanding

Analysing and interpreting the content of images.

Category: vision

Function Calling

Category: planning

Planning

Forming and executing action plans for complex tasks.

Category: planning

Parallel Tool Calls

Ability to invoke multiple external tools simultaneously while generating a response.

Category: reasoning

Agentic capability

The model's ability to autonomously plan and execute multi-step tasks by sequentially using tools, maintaining context, and adapting to intermediate results.

Category: planning

Structured output

Producing data in structured formats such as JSON.

Category: structured_generation

Computer use

The model's ability to operate a computer interface by interpreting screenshots and generating actions such as clicks, typing, and navigating applications.

Category: planning

Benchmark results

6 benchmarks

SWE-bench

accuracy · SWE-bench Verified, fixed subset n=477, high reasoning effort

74.9%