THE AI RANKINGS

Llama 4

Provider: Meta
Status: available
Context: 10,000,000 tokens

Llama 4 is Meta’s open-weight model family, released on 5 April 2025 — the first open-weight, natively-multimodal models built on a mixture-of-experts (MoE) architecture, and the high-water mark of Meta’s open-source era. The “herd” came in three sizes: Scout (17B active / 109B total, with an industry-leading 10-million-token context window), Maverick (17B active / 400B total, the assistant workhorse), and Behemoth (288B active / ~2T total, a teacher model that remained in training) (Meta, Hugging Face).

In 2026 Llama 4 is legacy for Meta’s own assistant, since Muse Spark replaced it as the Meta AI engine, but it remains freely available for developers to download, run and fine-tune. Its licence is not Apache 2.0: the Llama 4 Community License carries usage obligations and a 700-million-monthly-active-user threshold above which a separate licence from Meta is required (Royfactory).

Quick specs

Provider: Meta
Released: 5 April 2025
Status: Available (open weights; legacy as the Meta AI engine)
Architecture: Mixture-of-experts, natively multimodal
Sizes: Scout (109B), Maverick (400B), Behemoth (~2T, in training)
Context window: Up to 10,000,000 tokens (Scout)
Licence: Llama 4 Community License (700M-MAU clause)
Modalities: Text + image input; text output
Best for: Self-hosting, fine-tuning, very long context (Scout)
Limitations: Superseded by Muse Spark; non-Apache licence; open frontier has moved ahead

The Llama 4 herd

Model | Active / total params | Context | Notes
Scout | 17B / 109B (16 experts) | 10M tokens | Fits a single server GPU with 4/8-bit quantisation
Maverick | 17B / 400B (128 experts) | 1M tokens | The assistant workhorse; BF16 and FP8
Behemoth | 288B / ~2T (16 experts) | - | Teacher model; remained in training

All three use a mixture-of-experts design: one shared expert is always active for general knowledge, and additional experts are selected per token. Because only a fraction of the parameters run for each token (17B for Scout and Maverick), inference compute stays low, which helps Scout serve its 10M-token context on accessible hardware (Meta). Scout was pretrained on ~40 trillion tokens of multimodal data and Maverick on ~22 trillion.
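The routing idea described above can be illustrated with a toy layer. This is a minimal sketch of a generic "shared expert plus top-k routed experts" pattern, not Meta's implementation: the softmax gating, the scalar toy experts, and every name here (`moe_layer`, `make_expert`, `router_weights`) are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Toy stand-in for an expert network: scales every feature (hypothetical)."""
    return lambda token: [scale * x for x in token]


def moe_layer(token, shared_expert, routed_experts, router_weights, top_k=1):
    """Run one token through a shared expert plus its top_k routed experts.

    router_weights holds one row per routed expert; the router score for an
    expert is the dot product of its row with the token vector.
    """
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(scores)
    # Indices of the top_k highest-probability experts for this token.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # The shared expert always runs, whatever the router decides.
    output = shared_expert(token)
    # Only the chosen experts run; the rest cost no compute for this token.
    for i in chosen:
        routed_out = routed_experts[i](token)
        output = [o + probs[i] * r for o, r in zip(output, routed_out)]
    return output, chosen


if __name__ == "__main__":
    shared = make_expert(1.0)
    experts = [make_expert(s) for s in (0.5, 2.0, 3.0)]
    router = [[0.0, 0.0], [5.0, 0.0], [-5.0, 0.0]]
    out, picked = moe_layer([1.0, 0.0], shared, experts, router)
    print("experts used:", picked, "output:", out)
```

Only the shared expert and the selected expert execute for a given token, which is why the active parameter count can be far smaller than the total.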

Benchmark performance

At its April 2025 release, Meta reported that Llama 4 Maverick beat GPT-4o and Gemini 2.0 Flash across a range of widely-reported benchmarks and matched DeepSeek v3 on reasoning and coding at less than half the active parameters, while Scout outperformed Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 (Meta, TechTalks).

A credibility caveat: Meta’s then-Chief AI Scientist Yann LeCun later alleged that the GenAI team had “fudged” some Llama 4 benchmark results. The claim is unverified, but it contributed to Zuckerberg’s loss of confidence in the team and the subsequent reorganisation (see the Meta provider page). Either way, the open frontier has since moved well ahead, with models like DeepSeek V4 now leading on open-weight capability, so Llama 4’s draw today is its licensable weights and Scout’s enormous context, not raw benchmark standing. See best AI models.

Licence and access

Llama 4 is open weights, but not Apache 2.0. The Llama 4 Community License allows free download, self-hosting, fine-tuning and commercial use, but with practical obligations and a 700-million-monthly-active-user threshold above which a separate licence from Meta is required (Royfactory) — a meaningful distinction from the fully-permissive licences on gpt-oss, Gemma 4 or DeepSeek.
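As a rough illustration of the threshold described above, the check can be written as a single comparison. This is a simplified sketch, not legal guidance: it models only the 700-million figure quoted here, treats "above" as strictly greater than, and ignores the licence's other obligations. The function name `needs_separate_licence` is hypothetical.

```python
# Threshold quoted in the text: 700 million monthly active users.
MAU_THRESHOLD = 700_000_000


def needs_separate_licence(monthly_active_users: int) -> bool:
    """Return True if usage exceeds the MAU threshold (simplified assumption).

    The real licence text defines how and when users are counted; this
    sketch only compares a single number against the threshold.
    """
    if monthly_active_users < 0:
        raise ValueError("monthly_active_users must be non-negative")
    return monthly_active_users > MAU_THRESHOLD


if __name__ == "__main__":
    for mau in (50_000_000, 700_000_000, 900_000_000):
        print(mau, needs_separate_licence(mau))
```

For anything near the threshold, the licence text itself is the authority on how users are counted.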

The weights are on Hugging Face and llama.com, with hosting across the major clouds and inference providers. Scout fits on a single server-grade GPU; Maverick ships in BF16 and FP8.
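A back-of-the-envelope calculation shows why precision matters for hosting. The sketch below estimates weights-only storage from the parameter counts given on this page, assuming 2 bytes per parameter for BF16, 1 byte for FP8 or 8-bit, and 0.5 bytes for 4-bit. It ignores activations, the KV cache and runtime overhead, so real memory needs are higher. The helper names are hypothetical.

```python
# Assumed storage cost per parameter for each precision (weights only).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}

# (active params, total params) as listed on this page.
MODELS = {
    "Scout": (17e9, 109e9),
    "Maverick": (17e9, 400e9),
}


def weight_gigabytes(total_params: float, precision: str) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of parameters that run for any single token."""
    return active_params / total_params


if __name__ == "__main__":
    for name, (active, total) in MODELS.items():
        share = active_fraction(active, total)
        print(f"{name}: {share:.1%} of parameters active per token")
        for precision in ("bf16", "fp8", "int4"):
            size = weight_gigabytes(total, precision)
            print(f"  {precision}: ~{size:.1f} GB of weights")
```

Under these assumptions Scout's weights come to roughly 54.5 GB at 4-bit and 218 GB at BF16, while Maverick's come to about 400 GB at FP8 and 800 GB at BF16.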

Known limitations

Llama 4 has been superseded as Meta’s assistant by Muse Spark. Its licence is restrictive: the 700M-MAU clause makes it less “open” than Apache/MIT rivals. The LeCun allegation leaves a cloud over its benchmark credibility. And the open frontier has moved ahead since April 2025, so Llama 4 trails the current best open-weight models on capability.

FAQ

What is Llama 4?

Llama 4 is Meta’s open-weight model family, released April 2025 — the first open-weight, natively-multimodal mixture-of-experts models. It comes in Scout (10M context), Maverick (the workhorse) and the in-training Behemoth.

Is Llama 4 free and open source?

It is open weights and free to download, but under the Llama 4 Community License rather than a fully-permissive Apache/MIT licence — there are usage obligations and a 700-million-monthly-active-user threshold that requires a separate licence from Meta.

What is Llama 4 Scout’s context window?

Up to 10 million tokens — the largest of any widely-available model at its release — while fitting on a single server-grade GPU with quantisation.

Is Llama 4 still Meta’s main model?

No. Muse Spark replaced Llama 4 as the engine behind Meta AI in April 2026. Llama 4 remains available as open weights for developers.


Last verified 18 June 2026. Llama 4 figures are Meta-reported from the April 2025 release; the “fudged benchmarks” allegation is attributed to Yann LeCun and is unverified. Licence terms and benchmarks change — confirm against Meta’s official pages and the model licence before relying on them.