Gemini 3.1 Pro
- Provider
- Status
- available
- Context
- 1,000,000 tok
- SWE-bench
- 80.6%
- Price
- $2 / $12 /MTok
Gemini 3.1 Pro is Google’s generally-available flagship model, released on 19 February 2026 as the successor to Gemini 3 Pro. It is the strongest Gemini most people can actually use today: the newer Gemini 3.5 Pro was announced at Google I/O in May 2026 but, as of mid-June 2026, remains in limited preview, so 3.1 Pro carries the GA flagship slot.
It is a frontier-class generalist with Google’s signature strengths — a 1-million-token context window and native multimodality across text, images, video, audio and PDFs. On benchmarks it is essentially tied with the best on coding and ahead on abstract reasoning: Google reports 80.6% on SWE-bench Verified, 94.3% on GPQA Diamond and 77.1% on ARC-AGI-2, and says it leads 13 of 16 benchmarks it evaluated (NxCode, Artificial Analysis).
Quick specs
| Provider | |
| Released | 19 February 2026 |
| Status | Generally available (current GA flagship) |
| API model ID | gemini-3.1-pro-preview |
| Context window | 1,000,000 tokens input / 64,000 output |
| Input price | $2.00 / MTok (≤200K) → $4.00 (>200K) |
| Output price | $12.00 / MTok (≤200K) → $18.00 (>200K) |
| SWE-bench Verified | 80.6% |
| GPQA Diamond | 94.3% |
| ARC-AGI-2 | 77.1% |
| Best for | Long-context work, multimodal tasks, reasoning, frontend/UI coding |
| Limitations | Coding narrowly trails the top Claude models; superseded-in-announcement by 3.5 Pro |
What’s new in Gemini 3.1 Pro
Gemini 3.1 Pro is an incremental but real upgrade over Gemini 3 Pro, released three months later. It keeps the 1M-token context window, dynamic thinking and the full multimodal input set, and pushes benchmark scores up across reasoning and coding — most notably lifting ARC-AGI-2 to 77.1% (from Gemini 3 Pro’s 31.1% base) and SWE-bench Verified to 80.6% (NxCode, llm-stats).
It retains Google’s structural advantages: a context window roughly five times larger than the standard windows on rival flagships, and native understanding of video and audio that most competitors lack.
Benchmark performance
Google-reported figures from the February 2026 launch; treat vendor numbers as a ceiling.
| Benchmark | Gemini 3.1 Pro | Notes |
|---|---|---|
| SWE-bench Verified | 80.6% | Just behind Claude Opus 4.6 (80.8%) |
| SWE-bench Pro | ~54.2% | ~46.1% on the standardized public leaderboard |
| GPQA Diamond | 94.3% | Among the best of any model |
| ARC-AGI-2 | 77.1% | A large jump over the Gemini 3 Pro base |
On coding it is essentially tied with the best — within 0.2 points of Claude Opus 4.6 on SWE-bench Verified — while leading on abstract reasoning and matching the field on graduate-level science (NxCode). On the hardest agentic-coding tests (Claude Opus 4.8 and GPT-5.5 still lead on SWE-bench Pro). See best AI models for the cross-model standings.
Pricing
| Input (per MTok) | Output (per MTok) | |
|---|---|---|
| Standard (≤200K) | $2.00 | $12.00 |
| Long context (>200K) | $4.00 | $18.00 |
Gemini 3.1 Pro keeps Gemini 3 Pro’s pricing structure: $2/$12 up to 200K tokens, doubling above that (llm-stats). That makes it one of the cheaper frontier-class models, especially for long-context work where it can process a 500K-token document in a single call.
How to access Gemini 3.1 Pro
Gemini 3.1 Pro is generally available through the Gemini app (free with limits; expanded on Google AI Pro at $19.99/month and Ultra at $249.99/month), Google AI Studio, the Gemini API (gemini-3.1-pro-preview) and Vertex AI.
How Gemini 3.1 Pro compares
- vs Claude Opus 4.8 — Claude leads the hardest coding benchmarks (88.6% SWE-bench Verified, 69.2% Pro); Gemini 3.1 Pro counters with a far larger context window, native multimodality and lower price.
- vs GPT-5.5 — The two trade blows on reasoning; Gemini’s edges are context length and multimodal breadth, GPT-5.5’s are agentic/terminal coding and ecosystem.
- vs Gemini 3.5 Pro — 3.5 Pro is the announced successor (2M context, Deep Think) but is still in limited preview, so 3.1 Pro is the model to use today.
Known limitations
Coding narrowly trails the top Claude models on the hardest agentic benchmarks. It is superseded in announcement by Gemini 3.5 Pro, which will take the flagship slot once it reaches general availability. And like Gemini 3 Pro, it can be overconfident when wrong — verify critical factual outputs.
FAQ
What is Gemini 3.1 Pro?
Google’s generally-available flagship AI model, released 19 February 2026. It is the strongest Gemini you can use today, with a 1M-token context window and native multimodality, pending the wider release of Gemini 3.5 Pro.
How much does Gemini 3.1 Pro cost?
$2 per million input tokens and $12 per million output (doubling to $4/$18 above 200K tokens) via the API. In the Gemini app it is free with limits, with higher limits on Google AI Pro ($19.99/month) and Ultra ($249.99/month).
Is Gemini 3.1 Pro better than Claude or GPT-5.5?
It depends on the task. Gemini 3.1 Pro leads on context length, multimodality and abstract reasoning; Claude Opus 4.8 leads the hardest coding benchmarks and GPT-5.5 leads agentic terminal coding. On SWE-bench Verified the three are close.
What is the context window?
One million tokens of input with up to 64,000 tokens of output — about five times the standard window on most rival flagships.
Last verified 18 June 2026. Benchmark figures are Google-reported from the February 2026 launch, with independent context from Artificial Analysis and llm-stats; standardized SWE-bench Pro placement comes from the public leaderboard. Pricing and standing change quickly — confirm against Google’s official pages before relying on them.