GPT Realtime 2

2 · Family: GPT

OpenAI's voice model with GPT-5-class reasoning, parallel tool calls and a 128K-token context window, available via the Realtime API.

✓ Active · ✓ Public access · Audio · Multimodal · Reasoning model · 📁 GPT

Context window

128K tokens

Release date

7 May 2026

🏢 Producer: OpenAI

Access: API · Deployment: ☁ Cloud

Overview

GPT-Realtime-2 is a next-generation audio model released by OpenAI on 7 May 2026 as part of the Realtime API. It combines GPT-5-class reasoning, parallel tool calls, and a context window of 128K tokens (up from 32K in the previous version). A new "preamble" feature lets the model speak a short acknowledgement ("let me check that", "one moment") before it generates a full response, and the model can audibly announce tool calls that are in progress.
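To make the context-window figure concrete, here is a minimal sketch of how a client might keep a long conversation within a fixed token budget by dropping the oldest turns first. It is an illustration only: the function name, the per-turn token counts, the `reserve` value, and the reading of "128K" as 128,000 tokens are all assumptions, and nothing here is part of the Realtime API.

```python
from collections import deque

# Assumption: "128K" is read as 128,000 tokens; the exact limit may differ.
CONTEXT_LIMIT = 128_000


def trim_history(turns, limit=CONTEXT_LIMIT, reserve=4_000):
    """Keep the most recent turns that fit within limit - reserve tokens.

    turns: list of (text, token_count) pairs, oldest first.
    reserve: hypothetical headroom left for the model's next response.
    Returns (kept_turns, tokens_used), with kept_turns still oldest first.
    """
    budget = limit - reserve
    kept = deque()
    used = 0
    # Walk from newest to oldest and stop at the first turn that does not fit.
    for text, tokens in reversed(turns):
        if used + tokens > budget:
            break
        kept.appendleft((text, tokens))
        used += tokens
    return list(kept), used


# Hypothetical conversation with made-up token counts.
history = [("long intro", 100_000), ("follow-up", 20_000), ("latest", 5_000)]
kept, used = trim_history(history)
```

With the same made-up history and a 32,000-token limit, only the last two turns would still fit, which is the practical difference the larger window makes for long sessions.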

In OpenAI's reported benchmarks, GPT-Realtime-2 (high) scores 15.2% higher than its predecessor, GPT-Realtime-1.5, on Big Bench Audio (audio reasoning), and GPT-Realtime-2 (xhigh) scores 13.8% higher on Audio MultiChallenge (multi-turn conversation). Early tester Zillow reported a 26-percentage-point increase in call success rate (95% vs. 69%) after prompt optimization. The model is accessible via WebRTC, WebSocket, and SIP, with full EU Data Residency support.
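The Zillow figure is an absolute difference in percentage points, which is not the same kind of number as a relative improvement. A short worked calculation, using only the 95% and 69% values quoted above (the function name is made up for illustration), shows both readings side by side:

```python
def point_and_relative_gain(before_pct, after_pct):
    """Return (percentage-point gain, relative gain in percent)."""
    points = after_pct - before_pct
    relative = points / before_pct * 100
    return points, relative


points, relative = point_and_relative_gain(69, 95)
print(f"{points} points, {relative:.1f}% relative")
# prints "26 points, 37.7% relative"
```

The benchmark deltas cannot be converted the same way here, because the underlying baseline scores are not given on this page.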

Classification

Audio · Multimodal · Reasoning model

Family: GPT

Access & deployment

API

Cloud

Weights: Closed

Key parameters

📏 Context: 128K

✓ Tools

📥 Input: audio, text

Technical specification

Context window

128K tokens

Features: ✓ Tool use

Modalities

⬇ Input

audio, text

⬆ Output

audio, text

Capabilities and applications

Native model capabilities

Audio understanding

Category: audio

Voice Conversation

Ability to conduct multi-turn real-time voice conversations with context retention and natural speech pacing.

Category: speech

Live Translation

Real-time speech translation between multiple languages without interrupting the audio stream.

Category: speech

Streaming Speech-to-Text

Real-time conversion of speech to text with immediate output as the speaker is talking.

Category: speech

Parallel Tool Calls

Ability to invoke multiple external tools simultaneously while generating a response.

Category: reasoning
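As a rough illustration of what handling parallel tool calls can look like on the application side, the sketch below runs several requested tools concurrently with `asyncio.gather` and returns the results in request order. The tool names, the shape of the `calls` list, and the `run_tool_calls` helper are hypothetical; this is not the Realtime API's event schema.

```python
import asyncio


# Hypothetical tools; each sleep stands in for a network or database call.
async def get_weather(city):
    await asyncio.sleep(0.1)
    return f"weather for {city}"


async def get_calendar(day):
    await asyncio.sleep(0.1)
    return f"calendar for {day}"


TOOLS = {"get_weather": get_weather, "get_calendar": get_calendar}


async def run_tool_calls(calls):
    """Run every requested tool concurrently; results keep the request order."""
    tasks = [TOOLS[call["name"]](**call["arguments"]) for call in calls]
    return await asyncio.gather(*tasks)


# Assumed shape for a batch of tool calls issued in one model turn.
calls = [
    {"name": "get_weather", "arguments": {"city": "Seattle"}},
    {"name": "get_calendar", "arguments": {"day": "Monday"}},
]
results = asyncio.run(run_tool_calls(calls))
```

Because the two coroutines wait at the same time, the batch takes roughly as long as the slowest tool rather than the sum of both, which matters for keeping a voice conversation responsive.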

Benchmark results

2 benchmarks

Big Bench Audio

relative improvement · GPT-Realtime-2 (high)

+15.2% vs GPT-Realtime-1.5

📄 OpenAI

Audio MultiChallenge

relative improvement · GPT-Realtime-2 (xhigh)

+13.8% vs GPT-Realtime-1.5

📄 OpenAI

Technical architecture

Core Architecture

Transformer · Native Multimodal

Model Form

Multimodal LLM · Reasoning model · Tool-augmented LLM

Articles

1 article

OpenAI Launches GPT-Realtime-2: Voice Intelligence with GPT-5-Class Reasoning

9 May 2026

Sources and related pages

3 sources

Blog: OpenAI, "Advancing voice intelligence with new models in the API" (openai.com)

Blog: TechCrunch, "OpenAI launches new voice intelligence features in its API" (techcrunch.com)

Docs: OpenAI Developers, "Realtime and audio guide" (platform.openai.com)
