Anthropic has released Claude Sonnet 4.5, which the company claims is the best coding model in the world and the strongest model for building complex agents.

The company says Claude Sonnet 4.5 is state-of-the-art on the SWE-bench Verified evaluation, which measures real-world software coding abilities. According to Anthropic, the model has been observed maintaining focus for more than 30 hours on complex, multi-step tasks.

Anthropic claims the model represents a significant leap forward in computer use. On OSWorld, a benchmark that tests AI models on real-world computer tasks, Sonnet 4.5 now leads at 61.4 per cent. Four months ago, Sonnet 4 held the lead at 42.2 per cent, a gap of 19.2 percentage points. The company says the model also shows improved capabilities on a broad range of evaluations, including reasoning and maths.

Anthropic says Claude Sonnet 4.5 is its most aligned frontier model yet. The company says it has substantially improved the model’s behaviour, reducing concerning traits such as sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking. For the model’s agentic and computer use capabilities, Anthropic says it has made progress on defending against prompt injection attacks.

The model is being released under Anthropic’s AI Safety Level 3 protections, which include filters called classifiers that aim to detect potentially dangerous inputs and outputs, particularly those related to chemical, biological, radiological, and nuclear weapons.

Anthropic is also releasing the Claude Agent SDK, which it describes as the same infrastructure that powers Claude Code. In Claude Code itself, the company has added checkpoints that save progress and allow users to roll back instantly to a previous state, refreshed the terminal interface and shipped a native VS Code extension.

Claude Sonnet 4.5 is available via the Claude API under the model identifier claude-sonnet-4-5. Pricing remains the same as Claude Sonnet 4, at $3 per million input tokens and $15 per million output tokens. Anthropic is also releasing a temporary research preview called “Imagine with Claude”, available to Max subscribers for five days.
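To put the quoted pricing in concrete terms, the following is a minimal sketch of the arithmetic in Python. The function name and the example token counts are hypothetical and chosen only for illustration; the sketch assumes the flat per-million-token rates quoted above and ignores any discounts, caching or tiered pricing that may apply in practice.

```python
# Rates as quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge for one request (hypothetical helper, flat rates only)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 10,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 10_000):.2f}")  # prints $0.75
```

At these rates output tokens cost five times as much as input tokens, so a task that generates long responses will usually cost more than one that reads a large prompt and replies briefly.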
