THE AI RANKINGS

Google

Gemini 3.5 Flash

Provider
Google
Status
available
Context
1,000,000 tok
Price
$1.5 / $9 /MTok

Gemini 3.5 Flash is Google’s fast, low-cost model, launched at Google I/O on 19 May 2026 and now the default model on the Gemini app’s free tier. Its standout story is that a Flash model beat last year’s Pro: Google says Gemini 3.5 Flash outperforms its own Gemini 3.1 Pro on several challenging coding and agentic benchmarks while running about 4x faster on output (MarkTechPost).

It keeps a 1-million-token context window and full multimodal input, and is priced at $1.50 input / $9 output per million tokens — a step up from the previous Gemini 3 Flash ($0.50/$3) but still roughly 40% cheaper than Gemini 3.1 Pro (TokenMix). For most everyday and high-volume agentic work, it is the value sweet spot of the current Gemini line.

Quick specs

ProviderGoogle
Released19 May 2026 (Google I/O)
StatusGenerally available (free-tier default)
API model IDgemini-3.5-flash
Context window1,000,000 tokens
Input price$1.50 / MTok
Output price$9.00 / MTok
Cached input$0.15 / MTok
Terminal-Bench 2.176.2%
Speed~4x faster output than Gemini 3.1 Pro
Best forHigh-volume agentic coding, everyday tasks, value-sensitive deployments
Limitations3x the price of the prior Flash; below the Pro tier on the hardest tasks

TRY GEMINI 3.5 FLASH →

What’s new in Gemini 3.5 Flash

The pitch is Pro-class capability at Flash speed and price. Google reports that Gemini 3.5 Flash surpasses the previous premium tier (Gemini 3.1 Pro) on several hard benchmarks while completing tasks roughly four times faster on output and often at less than half the cost (MarkTechPost). That makes it especially attractive for AI agents and coding loops, where speed and cost compound across many calls.

It launched broadly: available the same day across the Gemini app, Google AI Studio, the Gemini API, Antigravity 2.0 and Google Search’s AI Mode (llm-stats).

Benchmark performance

Google-reported figures from the I/O 2026 launch; treat vendor numbers as a ceiling (NxCode).

BenchmarkGemini 3.5 Flash
Terminal-Bench 2.176.2%
GDPval-AA1656 Elo
MCP-Atlas83.6%
CharXiv Reasoning84.2%

The headline is comparative: on these challenging coding and agentic tests, Gemini 3.5 Flash beats the prior-generation Gemini 3.1 Pro — a fast, cheap model overtaking last year’s flagship. It still sits below the very top of the field (and the forthcoming Gemini 3.5 Pro) on the hardest problems. See best AI for coding for cross-model standings.

Pricing

Input (per MTok)Output (per MTok)Cached input
Gemini 3.5 Flash$1.50$9.00$0.15
Gemini 3 Flash (prior)$0.50$3.00
Gemini 3.1 Pro$2.00$12.00$0.20

At $1.50/$9, Gemini 3.5 Flash is 3x the price of the previous Flash but about 40% cheaper than Gemini 3.1 Pro (TokenMix). Given that it outperforms 3.1 Pro on several benchmarks while running far faster, the value proposition is strong despite the Flash-tier price rise.

How to access Gemini 3.5 Flash

Generally available since 19 May 2026 across the Gemini app (the free-tier default), Google AI Studio, the Gemini API (gemini-3.5-flash), Vertex AI, Antigravity 2.0 and Google Search AI Mode.

How Gemini 3.5 Flash compares

Known limitations

Price rose 3x over the prior Flash, narrowing the gap to the Pro tier. It is below the Pro tier on the hardest reasoning and coding. And like the rest of the Gemini line, headline benchmarks are vendor-reported — lead with independent figures where decisions ride on the number.

FAQ

What is Gemini 3.5 Flash?

Google’s fast, low-cost model, launched at Google I/O on 19 May 2026. It is the default on the Gemini app free tier and, notably, outperforms the previous-generation Gemini 3.1 Pro on several coding and agentic benchmarks while running about 4x faster.

How much does Gemini 3.5 Flash cost?

$1.50 per million input tokens and $9 per million output, with cached input at $0.15. That is three times the previous Gemini 3 Flash but roughly 40% cheaper than Gemini 3.1 Pro.

Is Gemini 3.5 Flash better than Gemini 3.1 Pro?

On several challenging coding and agentic benchmarks, Google says yes — and it runs about 4x faster. On the very hardest tasks the Pro tier (and the forthcoming Gemini 3.5 Pro) still leads.

What is the context window?

One million tokens, with full multimodal input (text, images, video, audio, PDFs) and text output.


Last verified 18 June 2026. Benchmark figures are Google-reported from the I/O 2026 launch (via MarkTechPost, NxCode and llm-stats); pricing via TokenMix. Confirm against Google’s official pages before relying on specific numbers.