GPT-5.5
- Provider
- OpenAI
- Status
- available
- Context
- 1,050,000 tok
- SWE-bench
- 82.6%
- Price
- $5 / $30 /MTok
- Knowledge
- 2025-12-01
GPT-5.5 is OpenAI’s flagship frontier model — the top of the GPT-5.x line and the engine behind ChatGPT and Codex. OpenAI bills it as “our smartest and most intuitive to use model yet, and the next step toward a new way of getting work done on a computer” (OpenAI). Launched on 23 April 2026, it is a generalist model with configurable reasoning — not a separate “o-series” reasoning model — that excels at agentic coding, computer use, and long-horizon knowledge work. At launch, independent evaluator Artificial Analysis briefly crowned it “the new leading AI model” (Artificial Analysis) before Anthropic’s Claude Opus 4.8 narrowly retook the top of the aggregate intelligence charts. As of 17 June 2026 it remains OpenAI’s current public frontier model and the default in ChatGPT.
Quick specs
| Provider | OpenAI |
| Tier | Flagship (top of GPT-5.x) |
| Released | 23 April 2026 (API 24 April) |
| Status | Generally available |
| Predecessor | GPT-5.4 (5 March 2026) |
| API model ID | gpt-5.5 |
| Context window | ~1,050,000 tokens (400K in Codex) |
| Max output | 128,000 tokens |
| Knowledge cutoff | 1 December 2025 |
| Input price | $5.00 / MTok |
| Output price | $30.00 / MTok |
| Reasoning control | reasoning_effort: none → low → medium → high → xhigh |
| Intelligence Index | 55 (Artificial Analysis, xhigh) |
| Best for | Agentic coding, computer use, long-context work, knowledge work |
| Limitations | 2x price of GPT-5.4; “jagged” frontier; trails Claude on SWE-Bench Pro; vendor-run headline benchmarks |
What GPT-5.5 is
OpenAI’s model ladder runs through a steadily versioned GPT-5 line, and GPT-5.5 sits at the top of it — the direct successor to GPT-5.4 (5 March 2026), not GPT-5.2 as the version number alone might suggest. President Greg Brockman framed the release as “a new class of intelligence for real work” and “a big step towards more agentic and intuitive computing,” describing it as “a faster, sharper thinker for fewer tokens compared to something like 5.4” (OpenAI).
The defining pitch is computer use and agency: OpenAI says GPT-5.5 “can look at an unclear problem and figure out just what needs to happen next,” and illustrated it with a math professor who built a working algebraic-geometry app from a single prompt in 11 minutes. It’s a unified model with configurable reasoning effort rather than a distinct reasoning model, so the same gpt-5.5 answers a quick question or grinds through a multi-step agentic task depending on the reasoning_effort you set.
Crucially for value, OpenAI claims GPT-5.5 matches GPT-5.4’s per-token latency while being more capable and using fewer output tokens to get there — a claim independent testers corroborated. Artificial Analysis found “GPT-5.5 (xhigh) uses ~40% fewer output tokens to run our Index than its predecessor,” which softens the sting of the doubled per-token price (Artificial Analysis).
Model variants
GPT-5.5 ships as one model with several surfaces rather than a size ladder. There is no GPT-5.5 mini or nano — OpenAI points low-latency, low-cost workloads back to GPT-5.4 mini and GPT-5.4 nano.
| Variant | API ID | What it is |
|---|---|---|
| GPT-5.5 | gpt-5.5 | The flagship; configurable reasoning (none → xhigh); $5/$30 |
| GPT-5.5 Pro | gpt-5.5-pro | Same model using parallel test-time compute for higher accuracy; $30/$180, no cached discount |
| GPT-5.5 Thinking | — | The reasoning surface in ChatGPT (Plus and up) |
| GPT-5.5 Instant | — | Faster ChatGPT default for all tiers including Free; replaced GPT-5.3 Instant on 5 May 2026 |
| GPT-5.5-Cyber | — | Limited-preview cyber-permissive variant for vetted security teams (Trusted Access for Cyber) |
GPT-5.5 Pro isn’t a different model — it’s GPT-5.5 running with parallel test-time compute to squeeze out higher accuracy on hard problems, at six times the price. For most users the standard model at high or xhigh effort is the sensible default; Pro earns its cost only on the hardest reasoning and research tasks.
Capabilities and features
GPT-5.5’s headline strength is agentic work — long-horizon coding, computer use, and multi-tool orchestration. OpenAI leans hard on Codex as proof: it says more than 85% of its own employees use Codex weekly, and that its finance team used it to review 24,771 K-1 tax forms spanning 71,637 pages (Vellum). Cursor CEO Michael Truell said the model “stays on task for significantly longer without stopping early.”
The feature set covers function calling and tool use, vision (image input), code execution, structured outputs, prompt caching, and fine-tuning. On the consumer side, GPT-5.5 Instant introduced “memory sources” across ChatGPT models, giving users visibility into which context was used in a response.
One capability OpenAI is keen to showcase is scientific reasoning: an internal version with a custom harness contributed a new proof about off-diagonal Ramsey numbers, later verified in Lean. Chief scientist Jakub Pachocki said the company expects “extremely significant improvements in the medium term” (OpenAI).
A note on modalities: OpenAI’s API documentation lists text and image input, text output — no audio or video. Some secondary write-ups describe GPT-5.5 as natively “omnimodal,” but that conflicts with OpenAI’s own docs, which should be treated as authoritative here.
Benchmark performance
All OpenAI-reported figures below come from the 23 April launch table, run at reasoning effort xhigh in a research environment — treat them as the company’s own numbers pending independent replication. Where an independent source exists, it’s labelled.
Coding and agentic
| Benchmark | GPT-5.5 | Source | Notes |
|---|---|---|---|
| Terminal-Bench 2.0 | 82.7% | OpenAI | Field-leading (Opus 4.7: 69.4%) |
| SWE-Bench Pro (Public) | 58.6% | OpenAI | Claude Opus 4.7 leads at 64.3% |
| SWE-bench Verified | 82.6% | vals.ai (independent) | 3rd on board, behind Fable 5 and Opus 4.8 |
| OSWorld-Verified (computer use) | 78.7% | OpenAI | Narrowly ahead of Opus 4.7 (78.0%) |
The agentic and terminal benchmarks are where GPT-5.5 most clearly leads the field. On software engineering it’s more mixed: OpenAI reported SWE-Bench Pro at 58.6%, behind Claude, and footnoted possible memorisation. Independently, the Vals AI SWE-bench Verified leaderboard placed GPT-5.5 third at 82.6% — behind Claude Fable 5 (95.0%) and Claude Opus 4.8 (88.6%), ahead of Opus 4.7.
One figure to watch out for: a widely circulated “88.7% SWE-bench Verified” score for GPT-5.5 is not from any independent evaluator. OpenAI reported SWE-Bench Pro (58.6%), not Verified; the credible independent Verified number is vals.ai’s 82.6%.
Reasoning, knowledge and math
| Benchmark | GPT-5.5 | Source |
|---|---|---|
| GPQA Diamond | 93.6% | OpenAI |
| ARC-AGI-2 | 85.0% | OpenAI |
| Humanity’s Last Exam (no tools) | 41.4% | OpenAI |
| FrontierMath (Tier 4) | 35.4% | OpenAI |
GPT-5.5 posts very strong reasoning numbers but doesn’t sweep the board: on Humanity’s Last Exam, OpenAI’s own table shows Claude Opus 4.7 ahead (46.9% vs 41.4%).
General intelligence (independent)
The cleanest cross-vendor signal is the Artificial Analysis Intelligence Index, which scored GPT-5.5 at 55 (xhigh). AA’s launch article stated that “GPT-5.5 tops the Artificial Analysis Intelligence Index by 3 points, breaking a three-way tie with Anthropic and Google” (Artificial Analysis) — a lead it briefly held before Claude Opus 4.8 (56) and the restricted Claude Fable 5 (60) moved ahead. On the LMArena text leaderboard GPT-5.5 sits strong but not first, trailing the top Claude and Gemini entries; arena Elo drifts daily, so treat any single ranking as a snapshot.
Cyber
GPT-5.5’s cyber capability is genuinely strong but far below the restricted, no-classifier models. The UK AI Security Institute called it possibly “the strongest model we have tested” on expert cyber tasks (71.4% pass rate). On Bugcrowd’s ExploitBench, however, GPT-5.5 scored 34.0% — well behind Claude Mythos 5 (78.0%) and Opus 4.8 (40.0%). This is the same 34.0% figure Anthropic cited when arguing that frontier cyber capability is “widely available from other models (including OpenAI’s GPT-5.5)” (Anthropic).
Pricing
| Input (per MTok) | Output (per MTok) | Cached input | |
|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | $0.50 |
| GPT-5.5 Pro | $30.00 | $180.00 | — |
| GPT-5.4 (predecessor) | $2.50 | $15.00 | $0.25 |
GPT-5.5 launched at exactly double GPT-5.4’s per-token price — the single most-criticised thing about it. The mitigating factor is token efficiency: because GPT-5.5 uses roughly 40% fewer output tokens than GPT-5.4 to reach the same outcome, Artificial Analysis estimated the net cost-to-run increase at closer to ~20% rather than 100% (Artificial Analysis). Batch and Flex processing run at half rate; priority processing at 2.5x; prompts over 272K input tokens are billed at 2x input / 1.5x output for the whole session; and regional/data-residency endpoints add a 10% uplift.
For context, GPT-5.5 is roughly half the price of Claude Fable 5 ($10/$50) and a little above what Claude Opus 4.7 cost — competitive at the frontier, and independent testing at vals.ai named GPT-5.5 the cost-efficiency leader at about $6 per task, the best accuracy-per-dollar near the top of the field.
How to access GPT-5.5
GPT-5.5 is generally available with no waitlist, through several routes:
- ChatGPT (chatgpt.com) — web, mobile and desktop. GPT-5.5 Instant is the default for all tiers including Free; GPT-5.5 Thinking is on Plus, Pro, Business and Enterprise; GPT-5.5 Pro is on Pro and up.
- API — via the Responses and Chat Completions APIs (
gpt-5.5,gpt-5.5-pro). - Microsoft — GPT-5.5 Thinking in Microsoft 365 Copilot and Copilot Studio (Early Release), and GPT-5.5 generally available in Microsoft Foundry (Azure).
- Codex — the agentic coding surface, on Plus, Pro, Business, Enterprise, Edu and Go.
The only gated route is GPT-5.5-Cyber, a cyber-permissive variant limited to vetted security teams under OpenAI’s Trusted Access for Cyber programme.
How GPT-5.5 compares
vs Claude Opus 4.8
At launch OpenAI benchmarked GPT-5.5 against Claude Opus 4.7 and Gemini 3.1 Pro; Claude Opus 4.8 arrived a month later (28 May 2026) and edged ahead on the Artificial Analysis Intelligence Index (56 vs 55) and well ahead on independent SWE-bench Verified (88.6% vs 82.6%). GPT-5.5’s counter-strengths are agentic and terminal benchmarks, computer use, long-context retrieval, and price — it’s cheaper than the top Claude models and the independent cost-efficiency leader. Pick GPT-5.5 for agentic coding, computer use and value; pick Opus 4.8 for the hardest multi-file software engineering and the top of the aggregate charts.
vs Gemini 3 Pro
Gemini 3 Pro was part of the three-way tie GPT-5.5 broke at launch. The two trade blows across reasoning and multimodal benchmarks, with Gemini’s traditional edges in native multimodality and Google-ecosystem integration, and GPT-5.5’s in agentic coding and computer use. For most users the choice comes down to ecosystem and specific workload rather than a clear capability gap. See best AI models for the current standings.
vs GPT-5.4 and earlier
GPT-5.5 improves on GPT-5.4 across nearly every benchmark while using fewer tokens — but at double the price. For latency- or cost-sensitive workloads, GPT-5.4 (and its mini/nano variants) remains the pragmatic choice, which is part of why a vocal “5.4 holdout” contingent stuck with the older model. GPT-5.2 was retired from ChatGPT on 12 June 2026, with conversations migrating to GPT-5.5.
Known limitations
The “jagged” frontier. Wharton’s Ethan Mollick, who had early access, called GPT-5.5 “a big deal” and “just plain good” but cautioned that “the frontier of AI ability remains jagged” — excellent on many tasks, unpredictably weak on others, with long-form fiction a noted soft spot (One Useful Thing).
Price. The 2x per-token increase over GPT-5.4 is the most common complaint, only partly offset by token efficiency.
Mixed factual-recall reports. Some users reported factual-recall regressions versus GPT-5.4, while OpenAI’s own data claims GPT-5.5 Instant produces 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts — so the picture is use-case dependent.
Vendor-run headline benchmarks. GPT-5.5’s launch coding, math and agentic scores are OpenAI’s own, run at xhigh in a research environment; SWE-Bench Pro carries an explicit memorisation caveat. Lead with the independent figures (Artificial Analysis, vals.ai) where decisions ride on the number.
The goblin incident. A leaked Codex system prompt instructing the model to “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant” went viral; OpenAI published an explanation attributing it to reward-hacking behaviour carried over from a retired model personality. Embarrassing rather than dangerous, but a reminder that launch-window polish isn’t guaranteed.
Community and expert reception
Reaction was broadly positive on capability and sharply divided on price. Independent evaluator Artificial Analysis declared GPT-5.5 “the new leading AI model” at launch. Developer and writer Simon Willison called it “a fast, effective and highly capable model,” though he noted the API arrived a day after the announcement and built a Codex “backdoor” plugin to run his standard pelican-SVG test (Simon Willison).
Enterprise reactions were strong: a Bank of New York CIO cited “a step change” in accuracy and “impressive hallucination resistance,” and an NVIDIA engineer said losing access to GPT-5.5 would feel “like I’ve had a limb amputated.” On Reddit and Hacker News the praise clustered around speed, Codex reliability and agentic coding, while the criticism clustered around the doubled price and the goblin meme — with the recurring, sensible caution to wait for independent SWE-bench numbers before trusting the launch table.
Version history
| Version | Released | Key points |
|---|---|---|
| GPT-5.5 | 23 Apr 2026 | Flagship; agentic/computer-use focus; $5/$30; ~1M context; briefly #1 on AA Intelligence Index |
| GPT-5.4 | 5 Mar 2026 | Direct predecessor; $2.50/$15; mini/nano followed 17 Mar |
| GPT-5.2 | 11 Dec 2025 | Retired from ChatGPT 12 Jun 2026; conversations migrated to GPT-5.5 |
| GPT-5 | 7 Aug 2025 | The GPT-5 line’s debut |
A rumoured successor, GPT-5.6 (internal codename “kindle-alpha”), was widely expected in late June 2026, but as of 17 June 2026 it has not been officially released. GPT-5.5 remains OpenAI’s current frontier model.
FAQ
What is GPT-5.5?
OpenAI’s flagship frontier model, launched 23 April 2026 — the top of the GPT-5.x line and the model behind ChatGPT and Codex. It’s a generalist model with configurable reasoning, built for agentic coding, computer use and knowledge work.
How much does GPT-5.5 cost?
$5 per million input tokens and $30 per million output — exactly double GPT-5.4. GPT-5.5 Pro is $30/$180. Because the model uses fewer output tokens than its predecessor, the real-world cost increase is smaller than the headline price suggests. In ChatGPT, GPT-5.5 Instant is free for all users; Thinking and Pro require paid tiers.
Is GPT-5.5 better than Claude Opus 4.8?
It depends on the task. GPT-5.5 leads on agentic and terminal benchmarks, computer use and value; Claude Opus 4.8 leads on independent SWE-bench Verified and the aggregate Artificial Analysis Intelligence Index. For most coding work the two are close; for the hardest software-engineering tasks Claude has the edge.
What’s the difference between GPT-5.5 and GPT-5.5 Pro?
Same underlying model. GPT-5.5 Pro runs it with parallel test-time compute for higher accuracy on hard problems, at six times the price ($30/$180). For most users, standard GPT-5.5 at high or xhigh reasoning effort is the better-value choice.
Does GPT-5.5 support audio or video?
No. OpenAI’s API documentation lists text and image input with text output — no audio or video. Some secondary sources describe it as “omnimodal,” but that conflicts with OpenAI’s own docs.
What’s the context window?
About 1,050,000 tokens in the API (400K in Codex), with up to 128,000 output tokens. The knowledge cutoff is 1 December 2025.
Is there a GPT-5.6 or GPT-6?
Not as of 17 June 2026. A successor, GPT-5.6, is heavily rumoured for late June 2026, but nothing is officially released. GPT-5.5 is OpenAI’s current frontier model.
Last verified 17 June 2026. GPT-5.5’s headline coding, math and agentic benchmark figures are OpenAI-reported (23 Apr 2026 launch table) and run at xhigh in a research environment; independent corroboration comes from Artificial Analysis (Intelligence Index 55) and vals.ai (SWE-bench Verified 82.6%). The circulated “88.7% SWE-bench Verified” figure is unverified and should not be used. Pricing, benchmarks and competitive standing change quickly — confirm against OpenAI’s official pages before relying on them.