THE AI RANKINGS

OpenAI

gpt-oss

Provider
OpenAI
Status
available
Context
128,000 tok

gpt-oss is OpenAI’s family of open-weight models — gpt-oss-120b and gpt-oss-20b — released on 5 August 2025 under the permissive Apache 2.0 licence. They are OpenAI’s first open-weight large language models since GPT-2 in 2019, and a notable (if partial) return to openness from a company that had built its business on closed models. Unlike the proprietary GPT-5 line, anyone can download the weights, run them locally and use them commercially without restriction.

Both are mixture-of-experts reasoning models with a 128K context window and configurable reasoning effort. The larger gpt-oss-120b (117B total / 5.1B active parameters) lands near OpenAI’s o4-mini on core reasoning benchmarks while fitting on a single 80GB GPU; the smaller gpt-oss-20b (21B total / 3.6B active) is roughly o3-mini-level and runs on edge devices with about 16GB of memory.

Quick specs

ProviderOpenAI
TierOpen weights
Released5 August 2025
LicenceApache 2.0
Modelsgpt-oss-120b, gpt-oss-20b
ArchitectureMixture-of-experts (reasoning)
Context window128,000 tokens
120b params117B total / 5.1B active — runs on one 80GB GPU
20b params21B total / 3.6B active — runs on a ~16GB device
Quality~o4-mini (120b) / ~o3-mini (20b) on core reasoning
PriceFree to self-host; low-cost via third-party hosts
Best forPrivate, on-prem and edge deployment; fine-tuning; cost-sensitive inference
LimitationsBelow the current proprietary frontier; text-only

VIEW ON OPENAI →

What gpt-oss is

gpt-oss is OpenAI’s open-weight offering — a deliberate counterpart to its closed GPT-5 line. Where models like GPT-5.5 are available only through OpenAI’s API and ChatGPT, gpt-oss weights are published on Hugging Face for anyone to download, run, fine-tune and deploy commercially under Apache 2.0 — no copyleft, no patent traps.

Both models are reasoning models: they expose configurable reasoning effort (low/medium/high), produce visible chain-of-thought, and support tool use, which makes them suitable for agentic workflows that run on your own infrastructure. The design goal was strong reasoning at sizes that run on accessible hardware — a single high-end GPU for the 120b, a laptop-class device for the 20b.

The two models

ModelTotal / active paramsRuns onComparable to
gpt-oss-120b117B / 5.1BA single 80GB GPU~OpenAI o4-mini on core reasoning
gpt-oss-20b21B / 3.6BAn edge device with ~16GB memory~OpenAI o3-mini

The mixture-of-experts design is what makes this practical: only a small fraction of the parameters activate per token, so the 120b model delivers strong quality at inference cost closer to a much smaller dense model.

Benchmark performance

OpenAI positions gpt-oss-120b at near-parity with o4-mini on core reasoning benchmarks and gpt-oss-20b around o3-mini — genuinely strong for open-weight models of their size at release. They sit below the current proprietary frontier (GPT-5.5, Claude Opus 4.8, Gemini 3.x) and below the largest open-weight models such as DeepSeek V4 on the hardest coding and reasoning tests. Exact figures vary by harness and quantisation, so treat published numbers as host-dependent. See best AI models for where the open-weight field stands.

Pricing and access

gpt-oss is free to download and self-host under Apache 2.0. There is no first-party OpenAI API price; instead the models are hosted by many third-party providers — Hugging Face, OpenRouter, Groq, Together, Fireworks and others — at low per-token rates that vary by host. To run locally, gpt-oss-120b needs roughly a single 80GB GPU and gpt-oss-20b about 16GB of memory.

This makes gpt-oss the natural OpenAI choice for private, on-prem or air-gapped deployment, fine-tuning on proprietary data, and cost-sensitive or high-volume inference where sending data to a hosted API is undesirable.

How gpt-oss compares

Known limitations

Below the frontier. gpt-oss-120b is around o4-mini level — capable, but well behind current flagship models.

Text-only. No image, audio or video input; these are text-in, text-out reasoning models.

No first-party hosting or updates. OpenAI released the weights but does not host gpt-oss as a managed API, and there has been no newer open-weight release in the line — for managed, frontier capability you are back to the proprietary GPT-5 models.

FAQ

What is gpt-oss?

gpt-oss is OpenAI’s family of open-weight models — gpt-oss-120b and gpt-oss-20b — released in August 2025 under the Apache 2.0 licence. They are OpenAI’s first open-weight LLMs since GPT-2, and can be downloaded, run locally, fine-tuned and used commercially for free.

How good is gpt-oss?

gpt-oss-120b is roughly on par with OpenAI’s o4-mini on core reasoning, and gpt-oss-20b is around o3-mini level — strong for open models of their size, but below the current proprietary frontier and the largest open-weight models.

Can I run gpt-oss locally?

Yes. gpt-oss-120b runs on a single 80GB GPU and gpt-oss-20b on an edge device with about 16GB of memory. The weights are on Hugging Face under Apache 2.0, so local and commercial use is unrestricted.

How much does gpt-oss cost?

Nothing to self-host beyond your own compute. There is no first-party OpenAI price; third-party hosts offer it at low, variable per-token rates.

Is gpt-oss the same as GPT-5?

No. gpt-oss is a separate, open-weight family that is much smaller and less capable than the proprietary GPT-5 models. It exists for self-hosting, fine-tuning and edge use, not to match the frontier.


Last verified 18 June 2026. gpt-oss specifications and comparisons are drawn from OpenAI’s August 2025 release and model card; exact benchmark figures vary by harness, host and quantisation. Confirm against OpenAI’s official pages and your chosen host before relying on specific numbers.