Gemini 2.5 Pro
Gemini 2.5 Pro is Google DeepMind's flagship reasoning model, generally available since June 17, 2025. It is built on a sparse mixture-of-experts (MoE) architecture, supports a context window of up to 1M tokens, accepts text, audio, image, and video input, and includes an integrated thinking mode.
Technical specification
Modalities
Capabilities (17)
Reasoning★ (Reasoning)
Multi-step reasoning★ (Reasoning)
Long context★ (Reasoning)
Coding★ (Coding)
Function Calling (Planning)
Structured output★ (Structured generation)
Audio understanding (Audio)
Image understanding★ (Vision)
Video Understanding (Other)
Chart understanding (Vision)
Diagram reasoning (Reasoning)
OCR★ (Vision)
Multilingual★ (Language)
Planning★ (Planning)
Streaming output (Reasoning)
Interleaved Multimodal Input (Reasoning)
Multimodal understanding★ (Multimodality)
Applications
Two-tier pricing based on context length. Prompts ≤200K tokens: $1.25/MTok input, $10.00/MTok output (thinking tokens counted as output). Prompts >200K tokens: $2.50/MTok input, $15.00/MTok output. Context caching: $0.31/MTok (≤200K), $0.625/MTok (>200K), storage $4.50/MTok/h. Batch API: ~50% discount. Free tier available in Google AI Studio (data used for product training).
Input: $1.25 per 1M tokens
Output: $10.00 per 1M tokens
Cache: $0.31 per 1M tokens
Output includes thinking tokens. Context caching: $0.31/MTok read, storage $4.50/MTok/h.
Higher rate applies for long contexts exceeding 200K tokens.
Batch API: ~50% discount; requests are processed asynchronously.
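The pricing rules above can be turned into a rough cost estimate. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it makes several assumptions that the page does not state, namely that the tier is chosen by total prompt length, that cached tokens are part of the prompt and billed at the cache read rate instead of the input rate, that the batch discount is exactly 50%, and that the discount applies to token charges but not to cache storage.

```python
from dataclasses import dataclass

# Rates copied from the pricing text above (USD per 1M tokens).
LONG_CONTEXT_THRESHOLD = 200_000      # prompts above this use the higher tier
CACHE_STORAGE_PER_MTOK_HOUR = 4.50    # USD per 1M cached tokens per hour
BATCH_DISCOUNT = 0.5                  # assumption: "~50%" treated as exactly 50%


@dataclass(frozen=True)
class Tier:
    input_per_mtok: float
    output_per_mtok: float
    cache_read_per_mtok: float


STANDARD = Tier(input_per_mtok=1.25, output_per_mtok=10.00, cache_read_per_mtok=0.31)
LONG = Tier(input_per_mtok=2.50, output_per_mtok=15.00, cache_read_per_mtok=0.625)


def estimate_cost(prompt_tokens, output_tokens, thinking_tokens=0,
                  cached_tokens=0, cache_hours=0.0, batch=False):
    """Estimate the USD cost of one request (hypothetical helper)."""
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")

    # Assumption: the tier depends on the full prompt length, cached or not.
    tier = LONG if prompt_tokens > LONG_CONTEXT_THRESHOLD else STANDARD

    fresh_tokens = prompt_tokens - cached_tokens
    billed_output = output_tokens + thinking_tokens  # thinking counts as output

    token_cost = (
        fresh_tokens / 1e6 * tier.input_per_mtok
        + cached_tokens / 1e6 * tier.cache_read_per_mtok
        + billed_output / 1e6 * tier.output_per_mtok
    )
    if batch:
        token_cost *= 1 - BATCH_DISCOUNT  # assumption: storage is not discounted

    storage_cost = cached_tokens / 1e6 * CACHE_STORAGE_PER_MTOK_HOUR * cache_hours
    return token_cost + storage_cost


if __name__ == "__main__":
    # 100K-token prompt, 2K visible output tokens, 3K thinking tokens.
    print(round(estimate_cost(100_000, 2_000, thinking_tokens=3_000), 4))  # 0.175
    # A 300K-token prompt crosses the 200K threshold and uses the higher rates.
    print(round(estimate_cost(300_000, 1_000), 4))  # 0.765
```

Because thinking tokens are billed as output, a request with a short visible answer can still be dominated by output cost; in the first example, output accounts for $0.05 of the $0.175 total.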
