Claude Sonnet 5: Specs, Pricing, and Competitive Position

Anthropic shipped Claude Sonnet 5 on 30 June 2026, slotting it as the new default model on Claude.ai for both Free and Pro tiers and pushing it to the API, Amazon Bedrock, and Azure AI Foundry on day one. The release is notable for what it consolidates rather than what it invents: it inherits the cyber safeguards first hardened in Opus 4.7 and 4.8, ships with the Opus 4.7 tokenizer, and posts an agentic coding score of 63.2% that sits between Sonnet 4.6 (58.1%) and Opus 4.8 (69.2%). The pricing move — $2/$10 per million tokens through 31 August, then $3/$15 — is the part most likely to reshape procurement decisions in the near term.

API Specifications and Context Window

Sonnet 5 accepts up to 1 million tokens of input context, with a maximum output of 128K tokens on the standard API. Teams that can tolerate latency can push output to 300K tokens via the Batch API, which matters for document-heavy workloads such as contract review, large refactors, and multi-file code generation. The knowledge cutoff sits at January 2026, a meaningful step up from earlier Sonnet generations and a direct contributor to its stronger answers on recent library versions and current events.
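As a rough illustration of the limits above, a pre-flight check might route a job by its expected output size. This is a minimal sketch: the function name and route labels are hypothetical, "K" is assumed to mean 1,000 tokens, and the input and output limits are assumed to apply independently.

```python
# Limits taken from the text; treating "K" as 1,000 is an assumption.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_STANDARD = 128_000
MAX_OUTPUT_BATCH = 300_000


def pick_route(input_tokens: int, output_tokens: int) -> str:
    """Return 'standard', 'batch', or 'too_large' for a planned request.

    Hypothetical helper: it only compares token counts against the
    published limits and does not call any API.
    """
    if input_tokens > MAX_INPUT_TOKENS:
        return "too_large"
    if output_tokens <= MAX_OUTPUT_STANDARD:
        return "standard"
    if output_tokens <= MAX_OUTPUT_BATCH:
        return "batch"
    return "too_large"


if __name__ == "__main__":
    print(pick_route(500_000, 100_000))  # fits the standard API limits
    print(pick_route(500_000, 200_000))  # output exceeds 128K, needs batch
```

Token counts should come from the Sonnet 5 tokenizer itself, not from estimates made with an older model.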

One architectural decision worth flagging is the tokenizer. Sonnet 5 uses the Opus 4.7 tokenizer, which emits between 1.0x and 1.35x as many tokens as the Sonnet 4.6 tokenizer for equivalent text. For teams billed per token, a direct comparison of headline prices therefore understates real cost on some corpora: English prose sits near the low end of that range, while dense code with heavy symbol usage trends higher. Anyone migrating batch pipelines should re-baseline token counts before projecting spend.
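To make the re-baselining concrete, the sketch below scales a token count measured with the Sonnet 4.6 tokenizer by the 1.0x to 1.35x range quoted above. The workload size is a made-up figure for illustration, the function name is hypothetical, and the $3 input rate is the post-promo price from this article.

```python
def effective_cost_usd(baseline_tokens: int,
                       price_per_million: float,
                       multiplier: float) -> float:
    """Cost of text measured in old-tokenizer tokens, once re-tokenized.

    baseline_tokens: count under the Sonnet 4.6 tokenizer.
    multiplier: new-to-old token ratio (1.0 to 1.35 per the text).
    """
    if multiplier < 1.0:
        raise ValueError("multiplier below 1.0 is outside the quoted range")
    return baseline_tokens * multiplier * price_per_million / 1_000_000


# Hypothetical workload: 200M input tokens per month (old-tokenizer count).
baseline_input_tokens = 200_000_000
input_price = 3.00  # USD per million input tokens, post-promo rate

low = effective_cost_usd(baseline_input_tokens, input_price, 1.0)
high = effective_cost_usd(baseline_input_tokens, input_price, 1.35)

if __name__ == "__main__":
    print(f"input spend range: ${low:,.0f} to ${high:,.0f}")
```

The actual multiplier for a given corpus has to be measured by counting tokens on a representative sample; the range only bounds it.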

Tokenizer Shift and Cost Implications

The tokenizer change has a second-order effect on caching economics. Anthropic’s prompt caching discounts repeated prefixes, and a model that fragments the same source text into more tokens also inflates the cached-token count and, with it, the cache write bill. The 50% Batch API discount outweighs even the 1.35x worst case in absolute terms (1.35 × 0.5 ≈ 0.68 of the undiscounted cost at old-tokenizer counts and the same per-token price), but only for asynchronous jobs that can absorb the latency, and the multiplier still applies in full when comparing batch against batch. For synchronous agentic loops, the token multiplier is a direct surcharge.
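The interaction between the tokenizer multiplier and the batch discount is simple arithmetic, sketched below. The helper name is hypothetical, and the result is a cost factor relative to a synchronous job at old-tokenizer counts and the same per-token price; cache pricing is left out because the article gives no figures for it.

```python
BATCH_DISCOUNT = 0.50  # Batch API discount quoted in the text


def net_cost_factor(multiplier: float, batch: bool = False) -> float:
    """Cost relative to a synchronous job at old-tokenizer token counts.

    multiplier: new-to-old token ratio (1.0 to 1.35 per the text).
    batch: apply the Batch API discount when True.
    """
    factor = multiplier
    if batch:
        factor *= 1.0 - BATCH_DISCOUNT
    return factor


sync_worst = net_cost_factor(1.35)               # synchronous, worst case
batch_worst = net_cost_factor(1.35, batch=True)  # batch, worst case

if __name__ == "__main__":
    print(f"sync worst case:  {sync_worst:.3f}x")
    print(f"batch worst case: {batch_worst:.3f}x")
```

Under these assumptions the worst-case synchronous job costs 1.35x the baseline, while the same job run through batch lands below 1.0x.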

Worth noting: this tokenizer lineage suggests Sonnet 5 shares infrastructure with the Opus family rather than the prior Sonnet line. Practically, that means tooling built around Opus 4.7 token boundaries — chunking strategies, retrieval window sizing, context-budget calculators — transfers with little adjustment. Anything tuned to Sonnet 4.6 token counts needs recalibration.

Pricing Analysis Against Competitors

The promotional price of $2 input / $10 output per million tokens positions Sonnet 5 aggressively against the front-runner field. For comparison:

  • Anthropic Opus 4.8: $5 / $25 per million tokens — roughly 2.5x the promo Sonnet 5 rate.
  • OpenAI GPT-5.5: $5 / $30 per million tokens — the most expensive of the group on output.
  • Google Gemini 3.5 Flash: cheaper than Sonnet 5 on raw per-token price, but with a different capability ceiling.

At the post-promo $3 / $15 rate, Sonnet 5 still undercuts Opus 4.8 and GPT-5.5 on both axes while claiming to slightly outperform Opus 4.8 on knowledge-work benchmarks. That looks like a deliberate squeeze: Anthropic is pricing its mid-tier model to pull volume away from competitors’ flagships during a window when usage growth matters. The company is widely reported to be heading toward an IPO, and a discounted headline rate is a reliable way to lift API call volume ahead of a public filing.
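The headline rates above translate into workload-level costs as in the sketch below. The prices are the ones quoted in this article; the 100M input / 20M output workload is a hypothetical mix, and the comparison ignores tokenizer differences between vendors, which would shift the real numbers.

```python
# (input, output) USD per million tokens, as quoted in this article.
PRICES = {
    "Sonnet 5 (promo)": (2.0, 10.0),
    "Sonnet 5 (standard)": (3.0, 15.0),
    "Opus 4.8": (5.0, 25.0),
    "GPT-5.5": (5.0, 30.0),
}


def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload given in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_price * input_mtok + output_price * output_mtok


# Hypothetical monthly workload: 100M input tokens, 20M output tokens.
costs = {model: workload_cost(model, 100, 20) for model in PRICES}
baseline = costs["Sonnet 5 (promo)"]
ratios = {model: cost / baseline for model, cost in costs.items()}

if __name__ == "__main__":
    for model, cost in costs.items():
        print(f"{model:<22} ${cost:>8,.0f}  {ratios[model]:.2f}x")
```

Because output tokens cost several times more than input tokens on every model listed, the ranking is sensitive to the input-to-output ratio, so the mix should be replaced with measured figures.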

The trade-off for buyers is the agentic coding gap. Sonnet 5’s 63.2% on agentic coding evaluations trails Opus 4.8’s 69.2% by six percentage points, a margin that tends to compound over long autonomous runs because each failed step can derail the steps that follow. For workflows where a model operates unsupervised across dozens of tool calls, that gap can justify Opus’s higher unit cost. For human-in-the-loop coding, document synthesis, and retrieval-augmented answer generation, Sonnet 5’s pricing is hard to beat.

Benchmark Performance and Knowledge Work

Anthropic’s published numbers place Sonnet 5 slightly ahead of Opus 4.8 on knowledge-work tasks — a category that spans research synthesis, structured reasoning over provided documents, and multi-step analysis. The system card, released alongside the model, documents the evaluation methodology. The agentic coding result is the one place where Opus 4.8 retains a clear lead, which is consistent with Anthropic’s tiering strategy: Opus remains the flagship for autonomous tool use, while Sonnet 5 is positioned as the everyday workhorse.

Behavioral safety also improved. Anthropic reports fewer misuse and deception behaviors than Sonnet 4.6, which addresses a class of reliability concerns that surfaced in production deployments of the previous generation. For teams that ran into sycophancy or deceptive-completion issues with Sonnet 4.6, the reported reduction points to a substantive improvement rather than a marginal one, though it is worth confirming on their own workloads.

The cyber-safeguard story deserves its own treatment. Sonnet 5 ships with the offensive-capability restrictions from Opus 4.7 and 4.8 enabled by default, and Anthropic confirmed that the model scores 0% on the Firefox 147 exploit generation test, a deliberate cap rather than an accident. For API consumers building developer tooling, that removes a category of misuse risk, but it also means the model may refuse or degrade on security-adjacent tasks, which bounds what it can do for offensive security work.

Where Sonnet 5 Lands Strategically

Read against Anthropic’s roadmap, Sonnet 5 is a consolidation release dressed up as a flagship launch. The tokenizer alignment with Opus, the inherited safeguards, and the incremental benchmark gains all point to a model optimized to hold the mid-tier price band while Opus 4.8 defends the top. The promo pricing through August is the tell: Anthropic appears to want volume now, ahead of the reported IPO window, and is willing to compress margins to get it.

For practitioners, the decision tree is straightforward. If your workload is agentic coding with long unsupervised runs, stay on Opus 4.8. If it is document synthesis, retrieval-augmented generation, or human-supervised coding, Sonnet 5 at the promo rate is the most cost-effective frontier-tier option available right now, and the post-promo price remains competitive. Disputes over benchmark methodology surrounded earlier Claude releases, so it is worth running your own evals rather than relying on published numbers, especially given the tokenizer shift’s effect on context-budget calculations.

Broader context matters too. The release lands in a competitive window where OpenAI’s three-tier GPT-5.6 strategy and Google’s Gemini 3.5 Flash are both pressing on price, and where long-context accuracy degradation remains an open engineering problem that no vendor has fully solved. Sonnet 5’s 1M-token window is competitive on paper; whether it holds accuracy at the upper end is the question buyers should test before committing.