Gemini 3.5 Flash

3.5 Flash · Family: Gemini
Fast multimodal model from the Gemini 3.5 family, optimized for agentic coding, long context and advanced reasoning with low latency.
โณ Previewโณ Limited accessLLMMultimodalReasoning modelTool-using model๐Ÿ“ Gemini
Context window
1M
tokens
Max output
65,536
tokens
Access:APIHostedDeployment:โ˜ Cloud

Overview

Gemini 3.5 Flash is a model from the Gemini 3.5 family developed by Google DeepMind. Designed as a fast, multimodal model aimed at frontier intelligence per dollar, it combines advanced reasoning with the low latency characteristic of Flash variants.

It accepts text, images, video, audio and PDF documents as input and produces text and code as output. The model offers a 1M token context window, up to 64k output tokens, and supports function calling, structured output, code execution and search as a tool. Knowledge cutoff is January 2025.
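
The token limits above can be turned into a simple pre-flight check before sending a request. The following is a minimal sketch, not part of any Google SDK: `plan_request` and `RequestPlan` are hypothetical names, "1M" is assumed to mean exactly 1,000,000 tokens, and token counts are assumed to come from whatever tokenizer or counting endpoint the caller already uses.

```python
from dataclasses import dataclass

# Limits quoted in the listing. "1M" is taken as 1,000,000 here, which is an
# assumption: the exact figure is not stated in the source.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 65_536


@dataclass
class RequestPlan:
    input_tokens: int
    output_tokens: int  # output budget after clamping to the model limit
    fits: bool          # True if the input fits in the context window


def plan_request(input_tokens: int, requested_output_tokens: int) -> RequestPlan:
    """Clamp the output budget and check the input against the context window."""
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    output_tokens = min(requested_output_tokens, MAX_OUTPUT_TOKENS)
    fits = input_tokens <= CONTEXT_WINDOW_TOKENS
    return RequestPlan(input_tokens, output_tokens, fits)


plan = plan_request(input_tokens=900_000, requested_output_tokens=100_000)
print(plan.fits, plan.output_tokens)
```

Whether output tokens also count against the context window is not stated in the listing, so the sketch checks the two limits independently.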

The model is available through the Gemini app, Gemini API, Google AI Studio, Gemini Enterprise, Google AI Mode, Google Antigravity and Android Studio. Status: Preview.

Classification
LLM · Multimodal · Reasoning model · Tool-using model
Family: Gemini
Access & deployment
API, Hosted
Cloud
Weights: Closed
Key parameters
๐Ÿ“ Context: 1M
โœ“ Tools
๐Ÿ“ฅ Input: text, image, audio, videoโ€ฆ

Technical specification

Context window: 1M tokens
Max output tokens: 65,536 per response
Knowledge cutoff: 1 Jan 2025 (knowledge boundary)
License: proprietary
Hardware requirements: Available only through Google cloud infrastructure (Gemini API, Vertex AI, Google AI Studio).
Features: ✓ Tool use
Modalities
Input: text, image, audio, video, documents
Output: text, code

Capabilities and applications

Native model capabilities
Reasoning (category: reasoning)
Multi-step reasoning (category: reasoning)
Long context (category: reasoning)
Multimodal understanding (category: multimodal)
Coding (category: coding)
Function calling (category: planning)
Structured output (category: structured_generation)
Audio understanding (category: audio)
Image understanding (category: vision)
Video understanding (category: video)
Chart understanding (category: vision)
OCR (category: vision)
Multilingual (category: language)
Planning (category: planning)
Interleaved multimodal input (category: reasoning)

Benchmark results

14 benchmarks (source for all results: deepmind.google/models/gemini/flash)
Terminal-bench 2.1 (accuracy · Terminus-2 harness): 76.2%
SWE-Bench Pro (Public) (accuracy · Single attempt): 55.1%
MCP Atlas (accuracy): 83.6%
Toolathlon (accuracy): 56.5%
OSWorld-Verified (accuracy): 78.4%
Finance Agent v2 (accuracy): 57.9%
GDPval-AA (Elo · Economically valuable knowledge work): 1656
CharXiv Reasoning (accuracy · No tools): 84.2%
MMMU-Pro (accuracy · No tools): 83.6%
Blueprint-Bench 2 (normalized score): 33.6%
MRCR v2 (8-needle) 128k (accuracy · Long context, average): 77.3%
MRCR v2 (8-needle) 1M (accuracy · Pointwise): 26.6%
Humanity's Last Exam (accuracy · Full set, text + MM): 40.2%
ARC-AGI-2 (accuracy): 72.1%
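
The results above mix percentage accuracy with an Elo rating, so the two should not be ranked on one scale. As an illustration only, the sketch below stores a few of the listed results as records and sorts them within a single metric. The `Result` type and `by_metric` helper are hypothetical names introduced here, and only a subset of the 14 benchmarks is included.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    benchmark: str
    metric: str   # "accuracy" (percent) or "Elo" (rating)
    score: float


# A subset of the listed results, copied from the figures above.
RESULTS = [
    Result("Terminal-bench 2.1", "accuracy", 76.2),
    Result("SWE-Bench Pro (Public)", "accuracy", 55.1),
    Result("GDPval-AA", "Elo", 1656.0),
    Result("MRCR v2 (8-needle) 128k", "accuracy", 77.3),
    Result("MRCR v2 (8-needle) 1M", "accuracy", 26.6),
]


def by_metric(results: List[Result], metric: str) -> List[Result]:
    """Return the results reported in one metric, highest score first."""
    matching = [r for r in results if r.metric == metric]
    return sorted(matching, key=lambda r: r.score, reverse=True)


for r in by_metric(RESULTS, "accuracy"):
    print(f"{r.benchmark}: {r.score}%")
```

Even within the accuracy metric, the two MRCR v2 rows use different aggregations (average versus pointwise), so a comparison between them is indicative at best.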

Technical architecture