Gemini 3 Deep Think
- Provider: Google
- Status: available
- Context: 1,000,000 tokens
- Knowledge cutoff: 2025-01-31
Gemini 3 Deep Think is Google’s extended-reasoning mode, released on 4 December 2025 to Google AI Ultra subscribers. It is not a separate model so much as a way of running Gemini 3 Pro that explores multiple reasoning paths in parallel before converging on an answer. It is Google’s approach to raising the model’s ceiling on the hardest scientific, mathematical and coding problems.
The payoff is concentrated on frontier reasoning benchmarks. Deep Think lifts Gemini 3 Pro’s ARC-AGI-2 score from 31.1% to 45.1% and Humanity’s Last Exam from 37.5% to 41.0% (Google) — at the cost of roughly 1.4–2.3x more latency. It is gated to the $249.99/month Google AI Ultra tier, so it is a power-user feature rather than an everyday model.
Quick specs
| Spec | Value |
|---|---|
| Provider | Google |
| Released | 4 December 2025 |
| Status | Available (Google AI Ultra only) |
| Built on | Gemini 3 Pro |
| Access | Consumer only — Google AI Ultra ($249.99/month) |
| Context window | 1,000,000 tokens |
| ARC-AGI-2 | 45.1% (vs 31.1% base) |
| Humanity’s Last Exam | 41.0% (vs 37.5% base) |
| Best for | The hardest reasoning, maths and science problems |
| Limitations | Ultra-tier only; no per-token API; higher latency |
What Gemini 3 Deep Think is
Deep Think is Google’s parallel-reasoning mode: rather than producing a single chain of thought, it explores multiple hypotheses simultaneously and converges on the best solution, trading latency and compute for accuracy on hard problems (Google). It runs the Gemini 3 Pro model, so it inherits Gemini’s 1M-token context and native multimodality, and adds depth on the tasks where extra reasoning matters most.
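The idea of exploring several hypotheses at once and then converging can be illustrated with a toy sketch. Google has not published how Deep Think generates or selects its reasoning paths, so everything here is an assumption: the three "paths" are hypothetical hand-written solvers, and the convergence step is a plain majority vote, swapped in as a generic stand-in rather than Deep Think's actual method.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Three hypothetical "reasoning paths" for the same toy problem:
# the sum of the integers 1..n. One path is deliberately flawed.
def closed_form(n: int) -> int:
    return n * (n + 1) // 2

def brute_force(n: int) -> int:
    return sum(range(1, n + 1))

def off_by_one(n: int) -> int:
    return sum(range(1, n))  # flawed on purpose: stops at n - 1

def explore_in_parallel(n: int, paths) -> tuple[int, float]:
    """Run every path concurrently, then converge by majority vote.

    Returns the winning answer and the share of paths that agreed with it.
    """
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        answers = list(pool.map(lambda path: path(n), paths))
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(paths)

if __name__ == "__main__":
    answer, agreement = explore_in_parallel(100, [closed_form, brute_force, off_by_one])
    print(answer, round(agreement, 2))  # 5050 0.67
```

The sketch also shows where the extra latency and compute come from: every path has to run before the vote, so the cost scales with the number of paths explored.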
Its headline result is abstract reasoning: 45.1% on ARC-AGI-2, a benchmark of novel visual-puzzle reasoning, versus 31.1% for standard Gemini 3 Pro — a large jump that was, at release, among the highest scores any model had posted on that test.
Benchmark performance
Google-reported figures comparing Deep Think to standard Gemini 3 Pro.
| Benchmark | Gemini 3 Deep Think | Gemini 3 Pro (base) |
|---|---|---|
| ARC-AGI-2 | 45.1% | 31.1% |
| Humanity’s Last Exam | 41.0% | 37.5% |
| GPQA Diamond | 93.8% | 91.9% |
The gains concentrate on the hardest reasoning benchmarks (ARC-AGI-2, HLE); on already-saturated tests like GPQA Diamond there is little headroom to add. The trade-off is latency: Google notes Deep Think runs roughly 1.4–2.3x slower than standard inference. See the best AI models ranking for context.
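To put the two headline gains and the latency multiplier side by side, the arithmetic can be worked through in a few lines. This is a minimal sketch using only the Google-reported figures quoted above; the 10-second baseline response time is a hypothetical value chosen for illustration, not a measured one.

```python
# Google-reported scores quoted in this article: (base Gemini 3 Pro, Deep Think).
SCORES = {
    "ARC-AGI-2": (31.1, 45.1),
    "Humanity's Last Exam": (37.5, 41.0),
}
LATENCY_MULTIPLIER = (1.4, 2.3)  # reported slowdown range versus standard inference

def gain(base: float, deep: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = deep - base
    return round(points, 1), round(points / base * 100, 1)

def latency_range(baseline_seconds: float) -> tuple[float, float]:
    """Estimated Deep Think latency range for a given baseline latency."""
    low, high = LATENCY_MULTIPLIER
    return round(baseline_seconds * low, 1), round(baseline_seconds * high, 1)

if __name__ == "__main__":
    for name, (base, deep) in SCORES.items():
        points, relative = gain(base, deep)
        print(f"{name}: +{points} points ({relative}% relative)")
    # Hypothetical 10-second baseline response.
    print(latency_range(10.0))
```

On these figures, ARC-AGI-2 improves by 14.0 points (about 45% relative) and Humanity's Last Exam by 3.5 points (about 9% relative), while a response that takes 10 seconds in standard mode would take roughly 14 to 23 seconds.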
How to access Gemini 3 Deep Think
Deep Think is consumer-only and exclusive to the Google AI Ultra tier ($249.99/month, often discounted 50% for the first three months) in the Gemini app. It is not sold as a separate per-token model on the API. A Deep Think mode is also part of the announced Gemini 3.5 Pro.
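As a rough guide to what the subscription costs over time, the pricing above can be turned into a small calculation. This is a sketch under stated assumptions: it takes the introductory offer to be exactly 50% off for the first three months, ignores taxes and regional pricing, and does not model how Google rounds the discounted monthly charge. The function name `subscription_cost` is illustrative only.

```python
from decimal import Decimal

MONTHLY_PRICE = Decimal("249.99")  # Google AI Ultra list price quoted above
INTRO_DISCOUNT = Decimal("0.50")   # assumption: 50% off during the intro period
INTRO_MONTHS = 3                   # assumption: intro period lasts three months

def subscription_cost(months: int, discounted: bool = True) -> Decimal:
    """Total cost in USD over `months`, with or without the intro offer."""
    intro = min(months, INTRO_MONTHS) if discounted else 0
    intro_cost = intro * MONTHLY_PRICE * (1 - INTRO_DISCOUNT)
    full_cost = (months - intro) * MONTHLY_PRICE
    return intro_cost + full_cost

if __name__ == "__main__":
    print(subscription_cost(12))                    # first year with the intro offer
    print(subscription_cost(12, discounted=False))  # first year at list price
```

Under these assumptions the first year comes to roughly $2,625 with the introductory offer, against $2,999.88 at list price.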
How it compares
- vs standard Gemini 3 Pro — same underlying model, but Deep Think adds parallel reasoning for large gains on the hardest problems at higher latency.
- vs rival reasoning modes — comparable in spirit to OpenAI’s high-effort / Pro reasoning and Anthropic’s extended thinking; Deep Think’s standout is its ARC-AGI-2 result.
Known limitations
- Ultra-tier only. At $249.99/month and consumer-only, it is out of reach for most users and not available as an API model.
- Higher latency. The parallel reasoning is 1.4–2.3x slower.
- Narrow benefit. The gains show up on the hardest reasoning tasks, not everyday use.
FAQ
What is Gemini 3 Deep Think?
An extended-reasoning mode from Google, released 4 December 2025, that runs Gemini 3 Pro with parallel hypothesis exploration for harder problems. It is available only to Google AI Ultra subscribers.
How much does Gemini 3 Deep Think cost?
It is included with Google AI Ultra at $249.99/month (often 50% off for the first three months). It is not sold per-token on the API.
How much better is Deep Think than standard Gemini 3 Pro?
On the hardest reasoning benchmarks, substantially: ARC-AGI-2 rises from 31.1% to 45.1% and Humanity’s Last Exam from 37.5% to 41.0%, at the cost of higher latency. On everyday tasks the difference is smaller.
Last verified 18 June 2026. Benchmark figures are Google-reported (Gemini 3 Deep Think announcement). Confirm against Google’s official pages before relying on specific numbers.