gpt-oss

Provider: OpenAI
Status: available
Context: 128,000 tok

gpt-oss is OpenAI’s family of open-weight models — gpt-oss-120b and gpt-oss-20b — released on 5 August 2025 under the permissive Apache 2.0 licence. They are OpenAI’s first open-weight large language models since GPT-2 in 2019, and a notable (if partial) return to openness from a company that had built its business on closed models. Unlike the proprietary GPT-5 line, anyone can download the weights, run them locally and use them commercially without restriction.

Both are mixture-of-experts reasoning models with a 128K context window and configurable reasoning effort. The larger gpt-oss-120b (117B total / 5.1B active parameters) lands near OpenAI’s o4-mini on core reasoning benchmarks while fitting on a single 80GB GPU; the smaller gpt-oss-20b (21B total / 3.6B active) is roughly o3-mini-level and runs on edge devices with about 16GB of memory.

Quick specs


Provider	OpenAI
Tier	Open weights
Released	5 August 2025
Licence	Apache 2.0
Models	gpt-oss-120b, gpt-oss-20b
Architecture	Mixture-of-experts (reasoning)
Context window	128,000 tokens
120b params	117B total / 5.1B active — runs on one 80GB GPU
20b params	21B total / 3.6B active — runs on a ~16GB device
Quality	~o4-mini (120b) / ~o3-mini (20b) on core reasoning
Price	Free to self-host; low-cost via third-party hosts
Best for	Private, on-prem and edge deployment; fine-tuning; cost-sensitive inference
Limitations	Below the current proprietary frontier; text-only

VIEW ON OPENAI →

What gpt-oss is

gpt-oss is OpenAI’s open-weight offering — a deliberate counterpart to its closed GPT-5 line. Where models like GPT-5.5 are available only through OpenAI’s API and ChatGPT, gpt-oss weights are published on Hugging Face for anyone to download, run, fine-tune and deploy commercially under Apache 2.0 — no copyleft, no patent traps.

Both models are reasoning models: they expose configurable reasoning effort (low/medium/high), produce visible chain-of-thought, and support tool use, which makes them suitable for agentic workflows that run on your own infrastructure. The design goal was strong reasoning at sizes that run on accessible hardware — a single high-end GPU for the 120b, a laptop-class device for the 20b.

The two models

Model	Total / active params	Runs on	Comparable to
gpt-oss-120b	117B / 5.1B	A single 80GB GPU	~OpenAI o4-mini on core reasoning
gpt-oss-20b	21B / 3.6B	An edge device with ~16GB memory	~OpenAI o3-mini

The mixture-of-experts design is what makes this practical: only a small fraction of the parameters activate per token, so the 120b model delivers strong quality at inference cost closer to a much smaller dense model.

Benchmark performance

OpenAI positions gpt-oss-120b at near-parity with o4-mini on core reasoning benchmarks and gpt-oss-20b around o3-mini — genuinely strong for open-weight models of their size at release. They sit below the current proprietary frontier (GPT-5.5, Claude Opus 4.8, Gemini 3.x) and below the largest open-weight models such as DeepSeek V4 on the hardest coding and reasoning tests. Exact figures vary by harness and quantisation, so treat published numbers as host-dependent. See best AI models for where the open-weight field stands.

Pricing and access

gpt-oss is free to download and self-host under Apache 2.0. There is no first-party OpenAI API price; instead the models are hosted by many third-party providers — Hugging Face, OpenRouter, Groq, Together, Fireworks and others — at low per-token rates that vary by host. To run locally, gpt-oss-120b needs roughly a single 80GB GPU and gpt-oss-20b about 16GB of memory.

This makes gpt-oss the natural OpenAI choice for private, on-prem or air-gapped deployment, fine-tuning on proprietary data, and cost-sensitive or high-volume inference where sending data to a hosted API is undesirable.

How gpt-oss compares

vs OpenAI’s GPT-5 line — gpt-oss trades frontier capability for openness and control. For the best quality, GPT-5.5 and GPT-5.4 are far stronger; for self-hosting and data control, gpt-oss is the only OpenAI option.
vs other open-weight models — Against DeepSeek, Alibaba’s Qwen, Mistral and Meta’s Llama, gpt-oss is competitive at its size but the largest open models now lead on raw capability. Its draw is the OpenAI lineage, the permissive licence and efficient sizes.

Known limitations

Below the frontier. gpt-oss-120b is around o4-mini level — capable, but well behind current flagship models.

Text-only. No image, audio or video input; these are text-in, text-out reasoning models.

No first-party hosting or updates. OpenAI released the weights but does not host gpt-oss as a managed API, and there has been no newer open-weight release in the line — for managed, frontier capability you are back to the proprietary GPT-5 models.

FAQ

What is gpt-oss?

gpt-oss is OpenAI’s family of open-weight models — gpt-oss-120b and gpt-oss-20b — released in August 2025 under the Apache 2.0 licence. They are OpenAI’s first open-weight LLMs since GPT-2, and can be downloaded, run locally, fine-tuned and used commercially for free.

How good is gpt-oss?

gpt-oss-120b is roughly on par with OpenAI’s o4-mini on core reasoning, and gpt-oss-20b is around o3-mini level — strong for open models of their size, but below the current proprietary frontier and the largest open-weight models.

Can I run gpt-oss locally?

Yes. gpt-oss-120b runs on a single 80GB GPU and gpt-oss-20b on an edge device with about 16GB of memory. The weights are on Hugging Face under Apache 2.0, so local and commercial use is unrestricted.

How much does gpt-oss cost?

Nothing to self-host beyond your own compute. There is no first-party OpenAI price; third-party hosts offer it at low, variable per-token rates.

Is gpt-oss the same as GPT-5?

No. gpt-oss is a separate, open-weight family that is much smaller and less capable than the proprietary GPT-5 models. It exists for self-hosting, fine-tuning and edge use, not to match the frontier.

Last verified 18 June 2026. gpt-oss specifications and comparisons are drawn from OpenAI’s August 2025 release and model card; exact benchmark figures vary by harness, host and quantisation. Confirm against OpenAI’s official pages and your chosen host before relying on specific numbers.