GPT-5.6 Sol
- Provider: OpenAI
- Status: Preview
- Price: $5 / $30 per MTok
GPT-5.6 is OpenAI’s next-generation model family, previewed on 26 June 2026 in three tiers: Sol, the flagship; Terra, a balanced everyday model; and Luna, a fast, low-cost model (OpenAI). This is a limited preview, not a general release — GPT-5.6 is currently available only through the API and Codex, and only to a small group of vetted partners, as part of a release process OpenAI is coordinating with the US government. OpenAI says it plans to make all three models generally available across ChatGPT, Codex and the API “in the coming weeks.”
GPT-5.6 Sol is OpenAI’s strongest model to date, with the headline gains concentrated in coding, biology and cybersecurity. It sets a new state of the art on Terminal-Bench 2.1, an agentic command-line coding benchmark, as run on OpenAI’s own Codex harness, and introduces two new controls: a max reasoning effort for deeper single-model reasoning, and an ultra mode that coordinates subagents working in parallel (OpenAI). The preview also debuts a new naming convention: the number (5.6) marks the generation, while Sol, Terra and Luna are durable capability tiers that can each advance on their own cadence.
Two caveats frame everything below. First, the benchmark figures are OpenAI’s own, drawn from the preview, with a fuller evaluation suite promised at general availability; treat them as a vendor ceiling pending independent testing. Second, OpenAI did not publish several standard specifications for the preview (context window, knowledge cutoff, exact API model strings), so those are marked “data not available”.
Quick specs
| Spec | Value |
|---|---|
| Provider | OpenAI |
| Family | GPT-5.6 (Sol flagship; Terra and Luna tiers) |
| Announced | 26 June 2026 (limited preview) |
| Status | Preview — not generally available |
| Predecessor | GPT-5.5 (23 April 2026) |
| Access | API + Codex, vetted partners only; ChatGPT “in the coming weeks” |
| Context window | Not stated by OpenAI; secondary coverage reports ~1.5M tokens for Sol (unconfirmed) |
| Knowledge cutoff | Data not available (not disclosed in preview) |
| Sol price | $5 input / $30 output per MTok |
| Terra price | $2.50 input / $15 output per MTok |
| Luna price | $1 input / $6 output per MTok |
| New controls | max reasoning effort; ultra subagent mode (Sol) |
| Terminal-Bench 2.1 | ~91.9% (Sol ultra), ~88.8% (Sol max) — OpenAI’s Codex harness |
| Best for | Hard agentic coding, security research, scientific/biology workflows |
| Limitations | Preview-only access; vendor-reported and partial benchmarks; key specs undisclosed |
What’s new in GPT-5.6
A three-tier family and a new naming convention
The biggest structural change is the move from a single flagship to a named three-tier family. OpenAI describes the new system plainly: the number identifies the generation (5.6), while Sol, Terra and Luna identify durable capability tiers that can advance on their own cadence (OpenAI). Sol is the high-ceiling flagship, Terra the balanced default, and Luna the fast, cheap option. The intent is clearer choices across intelligence, speed and cost — and an end to confusing labels like “Instant” standing in for the latest underlying model (DataCamp).
Two new ways to push the model harder
GPT-5.6 adds two controls beyond the existing reasoning-effort dial:
- max reasoning effort gives Sol the most time to reason deeply on a single problem; it is a new top rung above the previous xhigh.
- ultra mode goes beyond a single agent, using subagents to accelerate complex, multi-step work. In OpenAI’s Terminal-Bench results, “ultra” appears as its own line and posts the top score, so the headline ~91.9% figure is an ultra (multi-agent) result rather than a single-model score.
Step-change capabilities in coding, biology and cyber
OpenAI frames Sol as its most capable model yet for cybersecurity, shifting the performance-efficiency frontier on long-horizon security tasks such as vulnerability research and exploitation, while pairing those gains with its most robust safeguards to date. It also reports broad improvements in biology workflows (genomics and quantitative-biology analysis) and a new state of the art in agentic coding (OpenAI). See the benchmark section below for the numbers and caveats.
A government-coordinated, phased release
Unusually, OpenAI is releasing GPT-5.6 as a limited preview coordinated with the US government. The company says it previewed the models’ capabilities to the government ahead of launch and, at the government’s request, is starting with a small group of trusted partners before releasing more broadly (VentureBeat). OpenAI states plainly that it does not believe this kind of government access process should become the long-term default, framing the step as the fastest path to broad availability while it works with the Administration on a repeatable framework for future model releases. This is covered in more detail in the safety section below.
The GPT-5.6 model family: Sol, Terra, Luna
GPT-5.6 ships as three distinct models at different capability tiers and prices — a more meaningful split than GPT-5.5’s surfaces (Pro, Thinking, Instant), which were the same model presented differently.
| Tier | Role | Price (in → out, per MTok) | Notes |
|---|---|---|---|
| GPT-5.6 Sol | Flagship; hardest problems | $5.00 → $30.00 | Only tier with max effort and ultra mode; biggest cyber/bio gains |
| GPT-5.6 Terra | Balanced, everyday default | $2.50 → $15.00 | “Competitive with GPT-5.5 while ~2x cheaper” |
| GPT-5.6 Luna | Fast, low-cost, high-volume | $1.00 → $6.00 | Lowest cost; strong on routine work |
Sol is the model to reach for on hard, multi-step problems — complex coding, security research, scientific analysis — when you want the highest ceiling and accept the highest per-token cost. Terra is positioned as the new default: OpenAI says it matches GPT-5.5’s capability at roughly half the price, a pattern of “last-generation flagship quality at a mid-tier price” that analysts expect to recur (DataCamp). Luna targets high-volume, latency-sensitive and budget-conscious workloads — summarisation, drafting and routine automation — and, per OpenAI’s cyber results, “cheapest” does not mean “weakest” on every task.
OpenAI has not published the exact API model strings for each tier in the preview, so the identifiers above describe the tiers rather than confirmed API names.
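The rules of thumb above can be written down as a small routing helper. This is only a sketch: the `Workload` fields and the `pick_tier` function are hypothetical names invented for illustration, and the tier names are plain labels, not confirmed API model strings.

```python
from dataclasses import dataclass

# (input, output) USD per million tokens, from the tier table above.
TIER_PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


@dataclass(frozen=True)
class Workload:
    """Hypothetical job description; the fields are illustrative only."""
    hard_multistep: bool     # complex coding, security research, scientific analysis
    latency_sensitive: bool  # needs fast responses
    high_volume: bool        # many cheap calls (summarisation, drafting, automation)


def pick_tier(w: Workload) -> str:
    """Map a workload to a tier label using the guidance in the text."""
    if w.hard_multistep:
        return "Sol"    # highest ceiling, highest per-token cost
    if w.latency_sensitive or w.high_volume:
        return "Luna"   # fast, low-cost tier
    return "Terra"      # balanced default


for job in (
    Workload(hard_multistep=True, latency_sensitive=False, high_volume=False),
    Workload(hard_multistep=False, latency_sensitive=True, high_volume=True),
    Workload(hard_multistep=False, latency_sensitive=False, high_volume=False),
):
    tier = pick_tier(job)
    print(tier, TIER_PRICES[tier])
```

Checking for hard problems first is a design choice: it assumes capability matters more than cost when a task is genuinely difficult, which may not hold for every budget.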
Benchmark performance
OpenAI shared a focused set of evaluations for the preview — coding, biology and cyber — and promised an expanded suite at general availability. Every figure here is OpenAI-reported, and the numeric values were transcribed from OpenAI’s published charts by secondary coverage, so they should be read as a vendor ceiling until independent testing lands.
Coding — Terminal-Bench 2.1 (OpenAI’s Codex harness)
Terminal-Bench 2.1 tests command-line workflows that require planning, iteration and tool coordination. On OpenAI’s own Codex CLI harness, GPT-5.6 Sol sets a new state of the art (OpenAI, figures via kingy.ai):
| Model | Terminal-Bench 2.1 | Notes |
|---|---|---|
| GPT-5.6 Sol (ultra) | ~91.9% | New state of the art; uses subagents in parallel |
| GPT-5.6 Sol (max) | ~88.8% | Single-agent, max reasoning effort |
| Claude Mythos 5 | 88.0% | Restricted Anthropic model (suspended) |
| GPT-5.6 Terra | 84.3% | Tied with Fable 5 |
| Claude Fable 5 | 84.3% | Anthropic Mythos-class (suspended) |
| GPT-5.5 | 83.4% | Prior OpenAI flagship |
Two things to keep straight. First, the ~91.9% headline is an ultra (multi-agent) result; the like-for-like single-model number is Sol at max effort, ~88.8%, which still edges Claude Mythos 5 (88.0%). Second, this is OpenAI’s own Codex harness, the same one on which GPT-5.5 scores 83.4%. On the public Terminus-2 harness used to compare all models, scores run several points lower (GPT-5.5 78.2%, Claude Opus 4.8 74.6%), so expect lower public-harness numbers for GPT-5.6 once they exist. As ever on this benchmark, the harness explains a lot of the gap.
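The point gaps discussed above are simple subtractions, and a short script makes them explicit. The scores are copied from the table and the paragraph; the `lead` helper is a hypothetical name, and the figures marked "~" remain approximate chart readings, so the differences inherit that uncertainty.

```python
# Terminal-Bench 2.1 scores (percent) as reported on OpenAI's Codex harness.
# Sol figures are approximate chart readings ("~" in the table).
CODEX_HARNESS = {
    "GPT-5.6 Sol (ultra)": 91.9,
    "GPT-5.6 Sol (max)": 88.8,
    "Claude Mythos 5": 88.0,
    "GPT-5.6 Terra": 84.3,
    "Claude Fable 5": 84.3,
    "GPT-5.5": 83.4,
}

# GPT-5.5 on the public Terminus-2 harness, as quoted in the text.
GPT55_TERMINUS2 = 78.2


def lead(model: str, baseline: str) -> float:
    """Percentage-point lead of `model` over `baseline`, to one decimal."""
    return round(CODEX_HARNESS[model] - CODEX_HARNESS[baseline], 1)


print(lead("GPT-5.6 Sol (max)", "Claude Mythos 5"))      # 0.8
print(lead("GPT-5.6 Sol (max)", "GPT-5.5"))              # 5.4
print(lead("GPT-5.6 Sol (ultra)", "GPT-5.6 Sol (max)"))  # 3.1

# Same model, two harnesses: how much the harness alone moves GPT-5.5.
harness_gap = round(CODEX_HARNESS["GPT-5.5"] - GPT55_TERMINUS2, 1)
print(harness_gap)                                       # 5.2
```

On these numbers the harness gap for GPT-5.5 (5.2 points) is larger than Sol max's lead over Claude Mythos 5 (0.8 points), which is why the harness caveat matters.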
Biology — GeneBench v1
On GeneBench v1, which evaluates long-horizon genomics and quantitative-biology analyses, OpenAI reports Sol achieving stronger results than GPT-5.5 while using fewer tokens (OpenAI). No numeric score was published in the preview.
Cybersecurity — ExploitBench and ExploitGym
Cyber is the capability OpenAI emphasises most, and the one it pairs with the heaviest safeguards. On ExploitBench, OpenAI says GPT-5.6 Sol is competitive with the restricted Mythos Preview model while using only about one-third of the output tokens — an efficiency as much as a capability claim. On ExploitGym, a benchmark built by UC Berkeley researchers in collaboration with OpenAI and other labs, all three tiers (Sol, Terra and Luna) show strong improvements as reasoning increases (OpenAI). Crucially, OpenAI states that Sol does not cross the “Cyber Critical” threshold of its Preparedness Framework: in tests on Chromium and Firefox it identified bugs and exploitation primitives but did not autonomously produce a functional full-chain exploit under the conditions tested.
What is not yet measured
The preview did not include the standard cross-model benchmarks (SWE-bench Verified or Pro, GPQA, AIME, ARC-AGI) or an independent aggregate such as the Artificial Analysis Intelligence Index. Those stay marked “data not available” until OpenAI publishes its full suite at general availability and independent evaluators (Artificial Analysis, vals.ai) can run the models. We will add them then rather than estimate now.
Pricing
GPT-5.6 is priced per million tokens across the three tiers (OpenAI):
| Tier | Input (per MTok) | Output (per MTok) | Cache read |
|---|---|---|---|
| GPT-5.6 Sol | $5.00 | $30.00 | ~$0.50 (90% off input) |
| GPT-5.6 Terra | $2.50 | $15.00 | ~$0.25 (90% off input) |
| GPT-5.6 Luna | $1.00 | $6.00 | ~$0.10 (90% off input) |
Two pricing notes matter. Sol’s per-token price is identical to GPT-5.5 ($5/$30), so the flagship is not a price increase — the gains come at the same headline rate, with ultra mode’s parallel subagents being where heavy-task token spend can climb. Terra lands at the old GPT-5.4 price ($2.50/$15) while, per OpenAI, matching GPT-5.5’s capability — the clearest value story in the family.
GPT-5.6 also changes prompt caching. It introduces explicit cache breakpoints and a 30-minute minimum cache life for more predictable caching, and for GPT-5.6 and later models, cache writes are billed at 1.25x the uncached input rate while cache reads keep the 90% cached-input discount (OpenAI). All figures are API preview pricing; OpenAI has not set ChatGPT-tier availability or consumer pricing yet.
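To see how the caching multipliers affect a bill, here is a minimal cost sketch. It assumes input tokens split cleanly into three buckets (uncached, written to cache, read from cache) and applies the 1.25x write and 90%-off read rates described above. The function name, the bucket split and the token counts are illustrative assumptions, not OpenAI’s billing logic.

```python
MTOK = 1_000_000

# (input, output) USD per million tokens, from the pricing table above.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # cache reads: 90% off the input rate


def request_cost(tier: str, uncached_in: int, cache_write: int,
                 cache_read: int, output: int) -> float:
    """Estimated USD cost of one request under the assumed bucket split."""
    in_rate, out_rate = PRICES[tier]
    billable_input = (
        uncached_in
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return round((billable_input * in_rate + output * out_rate) / MTOK, 6)


# Hypothetical scenario: a 100,000-token shared prefix, 2,000 fresh input
# tokens and 1,000 output tokens per call, on Sol.
first_call = request_cost("Sol", uncached_in=2_000, cache_write=100_000,
                          cache_read=0, output=1_000)
later_call = request_cost("Sol", uncached_in=2_000, cache_write=0,
                          cache_read=100_000, output=1_000)
print(first_call)  # 0.665
print(later_call)  # 0.09
```

In this scenario the first call pays a premium to write the prefix, and each later call within the cache lifetime costs a fraction of that, so caching pays off once the prefix is reused more than once or twice.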
Cost comparison with contemporaries
| Model | Input | Output | Notes |
|---|---|---|---|
| GPT-5.6 Sol | $5.00 | $30.00 | Same price as GPT-5.5; preview-only |
| GPT-5.6 Terra | $2.50 | $15.00 | “GPT-5.5 quality at ~2x cheaper” (OpenAI) |
| GPT-5.6 Luna | $1.00 | $6.00 | Value tier |
| GPT-5.5 | $5.00 | $30.00 | Prior flagship |
| Claude Opus 4.8 | $5.00 | $25.00 | Anthropic’s GA flagship |
| Claude Fable 5 | $10.00 | $50.00 | Mythos-class; suspended worldwide |
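Per-token prices are easier to compare once applied to a concrete workload. The sketch below prices one hypothetical month of usage (200M input and 40M output tokens, a made-up mix) against the list prices in the table. It ignores caching, batch discounts and any difference in how many tokens each model needs for the same task.

```python
# (input, output) USD per million tokens, from the comparison table above.
PRICES = {
    "GPT-5.6 Sol": (5.00, 30.00),
    "GPT-5.6 Terra": (2.50, 15.00),
    "GPT-5.6 Luna": (1.00, 6.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Fable 5": (10.00, 50.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """List-price USD cost of a workload, with no caching or discounts."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical monthly workload; the token counts are illustrative.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 40_000_000

costs = {m: workload_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
for model, usd in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<16} ${usd:,.2f}")
```

For this mix, Terra comes out at exactly half of Sol and GPT-5.5, and Luna at a fifth; a more output-heavy mix would widen the gap between Sol and Claude Opus 4.8, whose output rate is lower.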
How to access GPT-5.6
During the preview, GPT-5.6 is not generally available. Access is limited to:
- The API — GPT-5.6 Sol, Terra and Luna for a select group of trusted partners and organisations, via OpenAI’s API platform.
- Codex — the agentic coding surface, for the same vetted partners (Codex).
OpenAI says it plans to make the three models generally available across ChatGPT, Codex and the API in the coming weeks, and to launch Sol on Cerebras at up to 750 tokens/second in July for select customers, expanding as capacity grows (OpenAI). There is no public ChatGPT availability or consumer pricing yet, and OpenAI has not published exact API model strings for the tiers. Full safeguard and preparedness details are in the GPT-5.6 Preview system card.
Safety and the phased release
The safety stack is central to this release, not a footnote. OpenAI calls it its most robust to date, with configurations matched to each tier’s capability, and describes a layered approach: safeguards trained into the model, real-time cyber and biology misuse classifiers that can pause generation for a larger reasoning model to review, account-level review across conversations, differentiated access, monitoring and enforcement (OpenAI). OpenAI says it dedicated over 700,000 A100-equivalent GPU hours to automated red-teaming aimed at universal jailbreaks, alongside third-party human expert red-teaming that continues through the preview.
The trade-off OpenAI flags directly: during the preview, users may hit safeguards that block or slow some requests, including legitimate dual-use security work where defensive and offensive activity initially look similar. Testing whether legitimate users can still complete normal work reliably is, OpenAI says, part of the point of the preview.
The release is also coordinated with the US government. OpenAI previewed GPT-5.6’s capabilities to the government ahead of launch and, at its request, began with a limited preview for vetted partners whose participation was shared with the government, before a wider release (VentureBeat). OpenAI states it does not believe this should become the long-term default, arguing it keeps capable tools from users, developers and defenders, and frames the step as a short-term path toward broad availability while it works with the Administration on a repeatable “cyber Executive Order” framework. This mirrors the policy backdrop around Anthropic’s Fable 5 and Mythos 5 suspension earlier in June 2026 — the year’s recurring theme of government involvement in frontier-model release.
How GPT-5.6 compares
A full head-to-head is not yet possible — OpenAI published only a partial, vendor-run benchmark set — so the comparisons below are directional, with gaps marked honestly.
vs GPT-5.5
GPT-5.6 succeeds GPT-5.5 as OpenAI’s top line. On OpenAI’s Terminal-Bench 2.1 harness, Sol at max effort (~88.8%) is about 5 points ahead of GPT-5.5 (83.4%), and Sol in ultra mode (~91.9%) is about 8.5 points ahead. Sol costs the same as GPT-5.5 ($5/$30), so the flagship upgrade is a capability gain at a flat price. The bigger value shift is Terra, which OpenAI says matches GPT-5.5’s capability at half the cost ($2.50/$15). A complete benchmark comparison waits on the GA suite.
vs Claude Opus 4.8, Fable 5 and Mythos 5
On OpenAI’s Terminal-Bench 2.1 harness, Sol at max effort (~88.8%) edges Claude Mythos 5 (88.0%) and beats Claude Fable 5 (84.3%) — but these are OpenAI-run numbers, and both Fable 5 and Mythos 5 are themselves suspended worldwide under a US export-control directive, so they are unusable today regardless of score. On cyber, OpenAI reports Sol is competitive with the restricted Mythos Preview on ExploitBench at roughly one-third of the output tokens. Against Anthropic’s generally-available flagship, Claude Opus 4.8, there are no shared-harness numbers yet — OpenAI did not publish SWE-bench or aggregate scores — so a clean comparison is data not available until GA.
vs Gemini 3.5 Pro
Google’s Gemini 3.5 Pro reached general availability in late June 2026 with a 2M-token context and a Deep Think mode. With no shared benchmark between it and the GPT-5.6 preview, a head-to-head is data not available; the practical contrast today is that Gemini 3.5 Pro is generally available while GPT-5.6 is a limited preview. See best AI models for where each sits once GPT-5.6’s full numbers land.
Known limitations
You probably cannot use it yet. GPT-5.6 is a limited preview, restricted to vetted API and Codex partners; there is no general ChatGPT, API or consumer access at launch.
Government-gated release. Access is coordinated with the US government, a constraint OpenAI itself says should not become the default — but which limits availability for now.
Vendor-reported, partial benchmarks. OpenAI published only Terminal-Bench 2.1, GeneBench, ExploitBench and ExploitGym, run on its own harnesses, with numbers read from launch charts. The standard cross-model suite and any independent aggregate are not yet available.
Key specs undisclosed. OpenAI did not publish the context window (secondary coverage reports ~1.5M for Sol, unconfirmed), knowledge cutoff, maximum output tokens, or exact API model strings.
Safeguard friction. OpenAI warns the preview’s safeguards may block or delay some legitimate requests, particularly in dual-use security work.
ultra cost. Ultra mode’s parallel subagents post the top scores but can consume more tokens per task, so the headline benchmark and the production bill can diverge.
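The ultra cost point can be made concrete with a rough fan-out estimate. OpenAI has not published how ultra mode is metered, so the sketch below rests on a loud assumption: each subagent is billed like an independent Sol run at list price, with no coordinator overhead and no caching. The subagent count and token figures are invented for illustration.

```python
# Sol list prices, USD per million tokens (from the pricing section).
SOL_INPUT_PER_MTOK = 5.00
SOL_OUTPUT_PER_MTOK = 30.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one Sol run at list price."""
    return (input_tokens * SOL_INPUT_PER_MTOK
            + output_tokens * SOL_OUTPUT_PER_MTOK) / 1_000_000


def fanout_cost(subagents: int, input_per_agent: int,
                output_per_agent: int) -> float:
    """ASSUMPTION: every subagent is billed as an independent Sol run."""
    return subagents * run_cost(input_per_agent, output_per_agent)


# Hypothetical task: 50,000 input and 20,000 output tokens per agent.
single = run_cost(50_000, 20_000)
ultra = fanout_cost(4, 50_000, 20_000)  # four subagents is an invented figure

print(f"single-agent: ${single:.2f}")
print(f"ultra (4 subagents): ${ultra:.2f}")
print(f"multiplier: {ultra / single:.1f}x")
```

Under these assumptions the bill scales linearly with the number of subagents while the Terminal-Bench gain from max to ultra is about three points, which is the divergence between headline score and production cost that the limitation describes.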
Community reception
Early coverage was substantial and broadly positive on capability, with two recurring threads. The first was the cleaner three-tier naming — commentators welcomed the generation-plus-tier scheme (Sol/Terra/Luna) as less confusing than labels like “Instant,” and singled out Terra as the practical story: roughly last-generation flagship quality at a mid-tier price (DataCamp). The second was the government-coordinated limited preview, which several outlets led with as the more newsworthy angle than the benchmarks, given OpenAI’s own discomfort with the precedent (VentureBeat).
The consistent caution across write-ups: this is a preview with vendor-run, partial benchmarks, so the SOTA claims should be treated as provisional until the full suite and independent testing arrive. For now, the capability picture is real but incomplete.
Version history
| Version | Released | Key points |
|---|---|---|
| GPT-5.6 (Sol / Terra / Luna) | 26 Jun 2026 (preview) | New three-tier family and naming; max effort and ultra subagent mode; coding/biology/cyber gains; government-coordinated limited preview |
| GPT-5.5 | 23 Apr 2026 | Prior flagship; $5/$30; ~1M context; briefly #1 on the AA Intelligence Index |
| GPT-5.4 | 5 Mar 2026 | $2.50/$15; mini and nano variants followed |
| GPT-5.2 | 11 Dec 2025 | Retired from ChatGPT 12 Jun 2026; conversations migrated to GPT-5.5 |
GPT-5.6 was widely rumoured before launch (an internal codename “kindle-alpha” circulated) and arrived as a preview rather than a full release. GPT-5.5 remains OpenAI’s current generally-available flagship until GPT-5.6 reaches GA.
Frequently asked questions
What is GPT-5.6?
GPT-5.6 is OpenAI’s next-generation model family, previewed on 26 June 2026 in three tiers: Sol (flagship), Terra (balanced) and Luna (fast and low-cost). It introduces a new max reasoning effort and an ultra subagent mode, and a naming convention where the number is the generation and Sol/Terra/Luna are durable capability tiers. It launched as a limited, government-coordinated preview, not a general release.
What is the difference between GPT-5.6 Sol, Terra and Luna?
Sol is the flagship for the hardest problems (complex coding, security research) and the only tier with max effort and ultra mode, at $5/$30 per million tokens. Terra is the balanced default — OpenAI says it matches GPT-5.5’s capability at about half the cost — at $2.50/$15. Luna is the fast, cheap tier for high-volume and latency-sensitive work, at $1/$6.
Is GPT-5.6 available to use yet?
Not generally. During the preview it is restricted to a select group of trusted partners via the API and Codex, as part of a release process coordinated with the US government. OpenAI says it plans general availability across ChatGPT, Codex and the API “in the coming weeks.”
How much does GPT-5.6 cost?
Per million tokens: Sol $5 input / $30 output; Terra $2.50 / $15; Luna $1 / $6. Sol matches GPT-5.5’s price exactly, and Terra matches the old GPT-5.4 price. GPT-5.6 also adds more predictable prompt caching, with cache reads at a 90% discount and cache writes billed at 1.25x the uncached input rate. These are API preview prices; ChatGPT pricing is not set yet.
What are GPT-5.6’s max and ultra modes?
max is a new reasoning-effort level that gives Sol the most time to reason on a single problem, above the previous xhigh. ultra goes beyond a single agent, using subagents in parallel to accelerate complex tasks — it is where Sol’s top Terminal-Bench score (~91.9%) comes from, so that figure is a multi-agent result rather than a single-model score.
Is GPT-5.6 Sol better than GPT-5.5?
On OpenAI’s own Terminal-Bench 2.1 harness, yes for agentic coding: Sol at max effort scores ~88.8% versus GPT-5.5’s 83.4%, and ~91.9% in ultra mode. OpenAI also reports gains in biology and cyber. But the benchmark set is partial and vendor-run, so a full verdict waits on the GA evaluation suite and independent testing. Notably, Sol costs the same as GPT-5.5.
How does GPT-5.6 Sol compare to Claude Opus 4.8 and Fable 5?
On OpenAI’s Terminal-Bench 2.1 harness, Sol at max effort (~88.8%) edges Claude Mythos 5 (88.0%) and beats Claude Fable 5 (84.3%) — but those are OpenAI-run numbers, and Fable 5 and Mythos 5 are suspended worldwide and unusable. Against the generally-available Claude Opus 4.8 there are no shared-harness benchmarks yet, so a clean comparison is data not available until GPT-5.6 reaches GA.
When will GPT-5.6 be generally available?
OpenAI says general availability across ChatGPT, Codex and the API is planned “in the coming weeks,” with a Cerebras launch for Sol (up to 750 tokens/second) in July for select customers. No firm date has been given.
GPT-5.6 is in limited preview. The benchmark figures here are OpenAI-reported and partial, transcribed from OpenAI’s launch charts, and several specifications were not disclosed; this page will be updated with independent benchmarks, full specs and consumer availability once the models reach general availability. Pricing and access are subject to change.