Specifications

Model ID claude-sonnet-4-5
Provider Anthropic
Architecture transformer
Context Window 200K tokens
Max Input 200K tokens
Max Output 64K tokens
Knowledge Cutoff 2025-01-31
License proprietary
Open Weights No

Capabilities

Modalities

Input: text, images, pdf
Output: text

Reasoning

Reasoning Model: Agentic
Effort Levels: low, medium, high

Features

function-calling, json-mode, structured-outputs, vision, streaming, parallel-tool-calling, computer-use, web-search, citations, prompt-caching, batch-processing, extended-thinking, interleaved-thinking, code-execution, text-editor-tool, memory-tool, 1m-context-beta

Variants

VARIANT | API ID | DESCRIPTION
Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | Flagship coding model with extended thinking capabilities
Claude Sonnet 4.5 (alias) | claude-sonnet-4-5 | Always points to latest Sonnet 4.5 version

API Pricing

Input $3 per 1M tokens
Output $15 per 1M tokens
Cached Input $0.30 per 1M tokens
Batch Input $1.50 per 1M tokens
Batch Output $7.50 per 1M tokens

Price parity with Sonnet 4—pure capability upgrade at same cost. Prompt caching offers up to 90% savings on repeated context. Batch API provides 50% discount with 24-hour turnaround. 1M context beta available at Tier 4+.

Claude Access

TIER | PRICE | CONTEXT | RATE LIMIT
Free | $0 | 200K | ~9 messages per 5 hours
Pro | $20/mo | 200K | 40-80 hours Sonnet/week
Max 5× | $100/mo | 200K | 140-280 hours Sonnet/week
Max 20× | $200/mo | 200K | 240-480 hours Sonnet/week
Team Standard | $25/mo | 200K | –
Team Premium | $150/mo | 200K | –
Enterprise | Custom | 500K | Custom

Benchmarks

Coding

SWE-bench Verified 77.2%

Reasoning

GPQA Diamond 83.4%
MMLU 89.1%

Math

AIME 2025 87%

Vision

MMMU 77.8%

Rankings

Artificial Analysis #4
AA Intelligence Index 61

SWE-bench leader at launch (77.2%). OSWorld (61.4%) establishes SOTA for computer use, a 45% improvement over Sonnet 4. A perfect AIME 2025 score with Python tools. TAU-bench Telecom (98%) demonstrates exceptional agent capabilities. Output speed (63 tok/s) is among the fastest of the frontier models.

Claude Sonnet 4.5 is Anthropic’s flagship coding model, released on September 29, 2025. At launch, it achieved the highest score on SWE-bench Verified at 77.2%—establishing itself as the leading model for real-world software engineering tasks. Anthropic positioned it as their recommended model for “basically every use case,” delivering significant improvements in autonomous operation: 30+ hours of continuous work compared to 7 hours for its predecessor.

The model dominates computer use benchmarks with a 61.4% OSWorld score (45% improvement over Sonnet 4) and scores 100% on AIME 2025 when using Python tools. Critically, it maintains price parity with Sonnet 4 at $3/$15 per million tokens—making the upgrade a pure capability gain at zero additional cost. For developers building AI-assisted coding workflows, Sonnet 4.5 represents the current sweet spot between capability and cost.

Quick specs

Provider Anthropic
Released September 29, 2025
Context window 200K tokens (1M beta at Tier 4+)
Max output 64K tokens
Knowledge cutoff January 31, 2025
Input price $3.00 / MTok
Output price $15.00 / MTok
Cached input $0.30 / MTok (90% savings)
SWE-bench Verified 77.2% (82.0% high compute)
OSWorld 61.4% (best-in-class computer use)
Best for Coding, agents, computer use, high-volume production
Limitations Pure reasoning trails GPT-5.1; costs 2.4× more than GPT-5.1 on input


What’s new in Sonnet 4.5

Sonnet 4.5 represents the most significant Sonnet upgrade to date, focusing on agentic capabilities and coding performance while maintaining the speed and cost that made Sonnet the default choice for production workloads.

30+ hour autonomous operation

The headline improvement is sustained autonomous work. Anthropic reports Sonnet 4.5 can operate for 30+ hours continuously compared to 7 hours for Sonnet 4. This enables true overnight coding tasks, multi-day research projects, and complex multi-system integrations without human intervention.

State-of-the-art computer use

OSWorld score jumped from 42.2% to 61.4%—a 45% relative improvement. This makes Sonnet 4.5 the best model for computer use tasks: navigating GUIs, filling forms, interacting with web applications, and automated testing. The TAU-bench results reinforce this: 98% on Telecom (vs 71.5% for Opus 4.1) and 86.2% on Retail.

Zero code editing errors

Replit reported their code editing error rate dropped from 9% to 0% when switching from Sonnet 4 to Sonnet 4.5. Combined with the 77.2% SWE-bench score, this establishes Sonnet 4.5 as the most reliable model for automated code modification.

1M context window beta

API users at Tier 4+ ($400+ spend) can access the 1 million token context beta via the context-1m-2025-08-07 header. This enables processing entire codebases, lengthy documentation, or comprehensive research materials in a single context—though pricing doubles for inputs beyond 200K tokens.

Reduced sycophancy

Anthropic specifically trained Sonnet 4.5 to be less agreeable when users are wrong. The model pushes back more appropriately and avoids the excessive “you’re absolutely right!” responses that plagued earlier versions.

The Sonnet 4.5 model family

Sonnet 4.5 is the middle tier of the Claude 4.5 family, balancing capability and cost:

MODEL | RELEASED | API IDENTIFIER | PRICING | BEST FOR
Claude Opus 4.5 | Nov 24, 2025 | claude-opus-4-5-20251101 | $5/$25 | Complex reasoning, “when you can’t afford to be wrong”
Claude Sonnet 4.5 | Sep 29, 2025 | claude-sonnet-4-5-20250929 | $3/$15 | Coding, agents, production workloads
Claude Haiku 4.5 | Oct 2025 | claude-haiku-4-5-20251001 | $1/$5 | High-volume, latency-sensitive tasks

Sonnet 4.5 delivers approximately 95% of Opus 4.5’s coding capability at 60% of the cost, making it the recommended default for most use cases. Reserve Opus for complex architectural decisions and difficult debugging.

Benchmark performance

Sonnet 4.5 leads on coding and agentic benchmarks while showing competitive—but not leading—performance on pure reasoning tasks.

Coding benchmarks

BENCHMARK | SONNET 4.5 | OPUS 4.5 | GPT-5.1 | GEMINI 3 PRO
SWE-bench Verified | 77.2% | 80.9% | 76.3% | 76.2%
SWE-bench (high compute) | 82.0% | – | – | –
Terminal-Bench | 50.0% | 59.3% | 43.8% | –
OSWorld | 61.4% | 66.3% | ~44% | –

SWE-bench Verified tests models on real GitHub pull requests—the most realistic benchmark for software engineering capability. Sonnet 4.5’s 77.2% means it resolves roughly 4 out of 5 actual bug reports without human intervention. The high-compute configuration (parallel attempts + rejection sampling) pushes this to 82.0%.
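
The high-compute configuration is essentially a best-of-N loop: sample several candidate patches, verify each against the tests, and keep a survivor. A minimal sketch of the idea, not Anthropic's actual harness; run_tests is a hypothetical verifier and n is arbitrary:

import anthropic

client = anthropic.Anthropic()

def run_tests(patch: str) -> bool:
    # Hypothetical verifier: apply the patch and run the repo's test suite
    raise NotImplementedError

def best_of_n(task: str, n: int = 8) -> str | None:
    # Rejection sampling: draw candidates, discard failures, keep the first pass
    for _ in range(n):
        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=8192,
            messages=[{"role": "user", "content": task}],
        )
        candidate = response.content[0].text
        if run_tests(candidate):
            return candidate
    return None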

Agentic benchmarks

BENCHMARK | SONNET 4.5 | OPUS 4.1 | GPT-5 | GEMINI 2.5 PRO
TAU-bench Telecom | 98.0% | 71.5% | – | –
TAU-bench Retail | 86.2% | – | – | –
TAU-bench Airline | 70.0% | 63.0% | – | –
Finance Agent | 55-69% | 46.9% | 29.4% | –

The TAU-bench results demonstrate Sonnet 4.5’s exceptional agent capabilities—the 98% Telecom score represents near-perfect task completion in complex multi-step scenarios.
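
These agent workflows rest on the API's function calling. A minimal sketch of a single tool-use turn; the get_weather tool and its schema are illustrative, not from the source:

import anthropic

client = anthropic.Anthropic()

# Declare a tool; the model decides when to call it
tools = [{
    "name": "get_weather",  # hypothetical tool
    "description": "Return current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)

# stop_reason "tool_use" means the model wants a tool result before answering
if response.stop_reason == "tool_use":
    call = next(b for b in response.content if b.type == "tool_use")
    print(call.name, call.input)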

Reasoning and knowledge

BENCHMARK | SONNET 4.5 | OPUS 4.5 | GPT-5.1 | GEMINI 3 PRO
GPQA Diamond | 83.4% | 87.0% | ~87.0% | 91.9%
MMLU | 89.1% | 90.8% | 91.5% | 91.8%
AIME 2025 | 87.0% | 100.0% | 94.0% | –
MMMU | 77.8% | 80.7% | 85.4% | –

On pure reasoning, Sonnet 4.5 trails the flagship models. GPQA Diamond (83.4%) is notably behind Gemini 3 Pro (91.9%) and GPT-5.1 (~87%). For tasks requiring maximum reasoning capability, Opus 4.5 or GPT-5.1 may be better choices.

Speed and efficiency

Artificial Analysis testing ranks Sonnet 4.5 among the fastest frontier models:

  • Output speed: 63 tokens/second
  • Time to first token: 1.80 seconds
  • Intelligence Index: 61 (thinking mode)—4th overall

This speed advantage makes Sonnet 4.5 practical for interactive coding assistants where latency matters.
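
Streaming compounds that advantage: tokens render as they are generated rather than after the full completion. A minimal sketch using the Python SDK's streaming helper:

import anthropic

client = anthropic.Anthropic()

# Print tokens as they arrive instead of waiting for the whole message
with client.messages.stream(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain this stack trace..."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)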

Pricing breakdown

Sonnet 4.5 maintains price parity with Sonnet 4—the capability upgrade comes at zero additional cost.

Standard API pricing

TIER | INPUT | OUTPUT
Standard (≤200K context) | $3.00 / MTok | $15.00 / MTok
Long context (>200K) | $6.00 / MTok | $22.50 / MTok
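
The long-context rates kick in once a request's input crosses 200K tokens. A small sketch of the rule as the table states it, assuming the higher rate then applies to the whole request (this mirrors the table, not an official billing formula):

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    # Rates from the table above, in $ per million tokens
    if input_tokens <= 200_000:
        in_rate, out_rate = 3.00, 15.00    # standard
    else:
        in_rate, out_rate = 6.00, 22.50    # long context
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 300K-token prompt with a 4K-token reply costs $1.89
print(f"${request_cost_usd(300_000, 4_000):.2f}")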

Cost optimisation options

OPTION | INPUT | OUTPUT | SAVINGS
Prompt caching (read) | $0.30 / MTok | – | 90%
Prompt caching (write) | $3.75 / MTok | – | -25% (investment)
Extended cache (1hr write) | $6.00 / MTok | – | -100%
Batch API | $1.50 / MTok | $7.50 / MTok | 50%

Prompt caching with 5-minute TTL delivers up to 90% savings on repeated context. For high-volume production with predictable prompts, this dramatically reduces costs.
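
Caching is opt-in per content block, and the batch discount comes from the separate Batches endpoint. Minimal sketches of both; the system-prompt content is illustrative:

import anthropic

client = anthropic.Anthropic()

# Prompt caching: mark a large, stable prefix as cacheable.
# The first call writes the cache at $3.75/MTok; calls within the
# TTL read it back at $0.30/MTok.
response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "<large stable context: style guide, schema, docs...>",
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Question about that context"}],
)

# Batch API: 50% discount, results within 24 hours
batch = client.messages.batches.create(
    requests=[{
        "custom_id": "req-1",
        "params": {
            "model": "claude-sonnet-4-5-20250929",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": "Prompt here"}],
        },
    }]
)
print(batch.id)  # poll the batch, then fetch results when it completes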

Competitor pricing comparison

MODEL | INPUT | OUTPUT | VS SONNET 4.5
Claude Sonnet 4.5 | $3.00 | $15.00 | –
Claude Opus 4.5 | $5.00 | $25.00 | 1.67× more
GPT-5.1 | $1.25 | $10.00 | 2.4× cheaper input
Gemini 3 Pro | ~$2.00 | ~$12.00 | ~1.5× cheaper

The cost differential with GPT-5.1 is significant. For cost-sensitive deployments where Sonnet 4.5’s coding advantages aren’t critical, GPT-5.1 offers compelling value.

How to access Claude Sonnet 4.5

Via API

Sonnet 4.5 is generally available with no waitlist. Basic usage:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Your prompt here"}]
)
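
The reply text lives in the response's content blocks:

print(response.content[0].text)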

Enable extended thinking with budget control:

response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Minimum 1,024
    },
    messages=[{"role": "user", "content": "Complex reasoning task"}]
)
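
The response then interleaves reasoning and answer blocks; filter by type to separate them:

# Thinking blocks carry the reasoning trace; text blocks carry the answer
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print(block.text)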

Enable 1M context (Tier 4+ only):

response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=8192,
    # The Python SDK passes per-request headers via extra_headers
    extra_headers={"anthropic-beta": "context-1m-2025-08-07"},
    messages=[{"role": "user", "content": "Process this large codebase..."}]
)
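
Because crossing 200K input tokens doubles the input rate, it is worth measuring large prompts first with the token-counting endpoint:

# Check prompt size before sending; >200K tokens bills at the long-context rate
count = client.messages.count_tokens(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Process this large codebase..."}],
)
print(count.input_tokens)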

Via Claude.ai

Access varies by subscription tier:

TIER | PRICE | NOTES
Free | $0 | ~9 messages/5 hours, no extended thinking
Pro | $20/mo | 40-80 hours Sonnet/week, extended thinking
Max 5× | $100/mo | 140-280 hours Sonnet/week
Max 20× | $200/mo | 240-480 hours Sonnet/week
Team | $25-150/user/mo | Premium seats get priority
Enterprise | Custom | 500K context, custom limits

Sonnet 4.5 is the default model for all Claude.ai users, making it immediately accessible without configuration changes.

Via cloud providers

Sonnet 4.5 is available on all three major cloud platforms:

PLATFORM | STATUS | MODEL ID
Amazon Bedrock | GA | anthropic.claude-sonnet-4-5-20250929-v1:0
Google Cloud Vertex AI | GA | claude-sonnet-4-5@20250929
Microsoft Azure Foundry | Preview | –

AWS Bedrock includes GovCloud availability with cross-region inference. Vertex AI supports batch predictions and the 1M context preview.
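
On Bedrock the model is reachable through the Converse API with the ID from the table above. A minimal boto3 sketch (region and credentials are assumed to be configured):

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Converse API call against the Bedrock model ID for Sonnet 4.5
response = bedrock.converse(
    modelId="anthropic.claude-sonnet-4-5-20250929-v1:0",
    messages=[{"role": "user", "content": [{"text": "Your prompt here"}]}],
)
print(response["output"]["message"]["content"][0]["text"])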

Via coding tools

Sonnet 4.5 is fully integrated into major AI coding assistants:

  • Cursor: Regular and thinking modes available
  • GitHub Copilot: Public preview, powering agentic experiences
  • Windsurf: Full integration
  • Cline: Available with API key
  • OpenRouter: anthropic/claude-sonnet-4.5 at the same pricing (see the sketch below)
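
OpenRouter exposes the model through an OpenAI-compatible endpoint, so the standard openai client works with a swapped base URL. A minimal sketch (assumes an OPENROUTER_API_KEY environment variable):

import os
from openai import OpenAI

# OpenRouter speaks the OpenAI chat-completions protocol
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Your prompt here"}],
)
print(response.choices[0].message.content)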

How Claude Sonnet 4.5 compares

vs Claude Opus 4.5

Opus 4.5 costs 67% more ($5/$25 vs $3/$15) but delivers measurably better results on complex tasks:

  • +3.7pp on SWE-bench (80.9% vs 77.2%)
  • +9.3pp on Terminal-Bench (59.3% vs 50.0%)
  • +4.9pp on OSWorld (66.3% vs 61.4%)
  • +13pp on AIME 2025 (100% vs 87%)

However, Opus achieves this with 76% fewer output tokens at medium effort, making it more cost-effective for tasks requiring deep reasoning. Simon Willison noted he “switched back to Sonnet 4.5 and kept on working at the same pace” for routine tasks.

Choose Sonnet 4.5 for: daily coding tasks, high-volume production, cost-sensitive applications, standard agent workflows.

Choose Opus 4.5 for: complex architectural decisions, difficult debugging, long-horizon autonomous agents, “when you cannot afford to be wrong.”

vs GPT-5.1

Independent testing shows near parity on SWE-bench under standardised conditions: Sonnet 4.5 at 69.8% vs GPT-5-Codex at 69.4%. However, GPT-5.1 is 2.4× cheaper on input tokens ($1.25 vs $3.00).

Sonnet 4.5 advantages:

  • Superior computer use (OSWorld 61.4% vs ~44%)
  • Better agentic consistency
  • 30+ hour autonomous operation
  • Lower hallucination rate on code tasks

GPT-5.1 advantages:

  • 2.4× cheaper input pricing
  • Better pure reasoning (GPQA 87% vs 83.4%)
  • Larger context window (400K vs 200K standard)
  • Broader multimodal support (audio/video)

Zvi Mowshowitz’s assessment: “If I had to pick one ‘best coding model in the world’ right now it would be Sonnet 4.5. If I had to pick one coding strategy to build with, I’d use Sonnet 4.5 and Claude Code.”

vs Gemini 3 Pro

Gemini 3 Pro (released November 2025) narrows the coding gap to 76.2% SWE-bench while offering:

  • Larger context: 1M+ tokens native vs 200K (1M beta)
  • Better reasoning: GPQA 91.9% vs 83.4%
  • Lower cost: ~$2/$12 vs $3/$15
  • Broader multimodal: Native audio, video, more languages

Sonnet 4.5 maintains advantages in computer use and agentic reliability. Choose Gemini for massive context needs or when reasoning benchmarks matter more than coding benchmarks.

The practical consensus

From community feedback: developers use Sonnet 4.5 as the daily driver for coding work, reserve Opus 4.5 for hard problems, and reach for GPT-5.1 when cost matters or for “particular wicked problems and difficult bugs.”

Known limitations

Independent testing and community reports reveal several areas where Sonnet 4.5 falls short:

Cost premium over GPT-5.1: At $3/$15, Sonnet 4.5 costs 2.4× more than GPT-5.1 ($1.25/$10) on input tokens. For cost-sensitive deployments, this premium requires justification through superior coding performance.

Reasoning gap: GPQA Diamond (83.4%) trails GPT-5.1 (~87%) and Gemini 3 Pro (91.9%). For tasks requiring maximum reasoning capability—mathematical research, complex scientific analysis—other models may be better choices.

Legacy codebase struggles: Hacker News feedback suggests Sonnet 4.5 is “insanely impressive in greenfield projects and collapses in legacy codebases.” The model excels at building new systems but can struggle with complex existing architectures.

Context window constraints: The standard 200K context is smaller than GPT-5.1’s 400K and Gemini’s 1M+. The 1M beta requires Tier 4+ API access and doubles input pricing.

Superficial implementations on long tasks: Some users report that while Sonnet 4.5 is fast, it can produce “broken and superficial” implementations on longer, more complex tasks compared to GPT-5 which “took 5× longer but understood the objective better.”

Mathematical weakness: Multiple reviewers note Sonnet 4.5 is “still leagues worse than GPT-5-high at mathematical stuff.” For heavy mathematical workloads, consider alternatives.

Community reception

The developer community has responded positively to Sonnet 4.5, with particular praise for its coding capabilities and speed.

The positives

Coding excellence: Simon Willison called it “probably the ‘best coding model in the world’” at launch. The code interpreter mode successfully cloned his repository and ran 466 tests autonomously.

Real-world validation: Ethan Mollick demonstrated Sonnet 4.5 autonomously replicating published economics research—reading papers, processing data archives, converting STATA code to Python, and reproducing findings.

Speed improvements: Tasks that took 20+ minutes with competitors complete in ~3 minutes with Sonnet 4.5. Replit’s code editing error rate dropped from 9% to 0%.

Agent reliability: Devin reported 18% improvement in planning tasks. The 30+ hour autonomous operation enables overnight coding workflows that weren’t previously practical.

The criticisms

Cost concerns: The 2.4× input cost premium over GPT-5.1 is frequently cited. For high-volume deployments, this adds up significantly.

Context issues: Some users report the model “forgets rules” and doesn’t reuse existing code patterns in longer conversations.

Not universally superior: Zvi Mowshowitz notes GPT-5 remains better for “particular wicked problems and difficult bugs”—Sonnet 4.5’s speed advantage doesn’t always translate to better outcomes.

Expert verdict

The consensus: Sonnet 4.5 is the best default choice for AI-assisted coding, but the “best coding model” claim applies to specific contexts—greenfield projects, architecture planning, and rapid iteration. For legacy codebases, mathematical research, and cost-sensitive deployments, alternatives remain competitive or superior.

Version history

VERSION | RELEASED | KEY CHANGES
Claude Opus 4.5 | Nov 24, 2025 | Flagship, 80.9% SWE-bench, 67% price cut from Opus 4.1
Claude Haiku 4.5 | Oct 2025 | Fast tier, 73.3% SWE-bench at $1/$5
Claude Sonnet 4.5 | Sep 29, 2025 | 77.2% SWE-bench, 30+ hour operation, 61.4% OSWorld
Claude Sonnet 4 | May 2025 | 72.7% SWE-bench, computer use improvements
Claude 3.7 Sonnet | Early 2025 | 62.3% SWE-bench, extended thinking preview
Claude 3.5 Sonnet | 2024 | Previous generation

Sonnet 4 remains available via API for users who need backward compatibility. The upgrade path from Sonnet 4 to 4.5 requires no code changes—simply update the model identifier.
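
In practice the migration is a one-line change, or use the alias to always track the latest snapshot:

# Pin a snapshot for reproducibility...
model = "claude-sonnet-4-5-20250929"   # was "claude-sonnet-4-20250514"
# ...or let the alias resolve to the latest Sonnet 4.5
model = "claude-sonnet-4-5"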

FAQ

Is Claude Sonnet 4.5 better than GPT-5.1 for coding?

For most coding tasks—yes. Sonnet 4.5 leads on SWE-bench (77.2% vs 76.3%) and significantly outperforms on computer use (OSWorld 61.4% vs ~44%). However, GPT-5.1 costs 2.4× less on input tokens and may be better for “wicked problems and difficult bugs.” Choose based on whether capability or cost matters more.

How much does Claude Sonnet 4.5 cost?

$3.00 per million input tokens, $15.00 per million output tokens. Cached inputs drop to $0.30/MTok (90% savings). Batch API offers 50% discount. This is identical pricing to Sonnet 4—the upgrade is a pure capability gain.

Can I use Claude Sonnet 4.5 for free?

Yes. Free tier users get approximately 9 messages per 5 hours with Sonnet 4.5, though extended thinking is not available. For serious use, Pro ($20/month) or API access is recommended.

What’s the difference between Sonnet 4.5 and Opus 4.5?

Opus 4.5 (80.9% SWE-bench) is Anthropic’s most capable model for complex tasks. Sonnet 4.5 (77.2% SWE-bench) offers ~95% of the capability at 60% of the price ($3/$15 vs $5/$25). Most developers use Sonnet daily and reserve Opus for hard problems.

Does Sonnet 4.5 support 1 million token context?

Yes, but only via API at Tier 4+ ($400+ spend). Enable with the context-1m-2025-08-07 beta header. Input pricing doubles for content beyond 200K tokens.

Is Sonnet 4.5 good for agents?

Excellent. The 30+ hour autonomous operation, 61.4% OSWorld score, and 98% TAU-bench Telecom demonstrate best-in-class agent capabilities. It’s the recommended model for building autonomous coding workflows.

What is Sonnet 4.5 best at?

Coding tasks, computer use, agentic workflows, and high-volume production workloads. It excels at greenfield projects, rapid iteration, and tasks requiring sustained autonomous operation.

Where is Sonnet 4.5 available?

Claude.ai (all tiers), Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Azure Foundry (preview), Cursor, GitHub Copilot, Windsurf, Cline, and OpenRouter.

Resources

RESOURCE | URL
Claude.ai | claude.ai
Anthropic Website | anthropic.com
Sonnet 4.5 Announcement | anthropic.com/news/claude-sonnet-4-5
Sonnet Product Page | anthropic.com/claude/sonnet
API Documentation | docs.anthropic.com
Pricing | anthropic.com/pricing
Model Overview | docs.anthropic.com/en/docs/about-claude/models/overview
Migration Guide | docs.anthropic.com/en/docs/about-claude/models/migrating-to-claude-4
Status Page | status.anthropic.com