As shipping agentic capabilities becomes table stakes among foundation model companies, Anthropic is releasing Claude Sonnet 5, a more powerful and agentic version of the lab’s midsize model.
“It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models,” Anthropic said in a blog post.
That framing mirrors what OpenAI and Google have said about their own recent releases. OpenAI’s GPT-5.6 Sol was launched in preview last week, and it is also the firm’s most agentic model yet, allowing users to split work across subagents for longer autonomous tasks. Google’s Gemini 3.5 Flash, which launched in May, was pitched as a shift from a conversational chatbot to an agentic tool that plans, builds, and iterates on real work with minimal human input.
Sonnet 5’s pitch is confirmation that agentic capability is the new baseline expectation at every price tier. Now the differentiator isn’t going to be who can do agentic work best, but how cheaply they can do it and how reliably they can do it without human oversight.
Sonnet 5 promises performance close to that of Opus 4.8, but at a much lower cost. Starting Tuesday, Claude Sonnet 5 will be the default model for free and Pro plans and is available for every subscription.
At launch, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens through August 31, after which the price will jump to $3 per million input tokens and $10 per million output tokens. That makes Sonnet 5 cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro. (It’s still more expensive than Gemini 3.5 Flash.)
The new model also demonstrates significant improvements over its predecessor, Sonnet 4.6, released in February, on agentic tasks such as reasoning, tool use, software coding, and knowledge work, according to Anthropic.
For example, on one agentic coding benchmark, Sonnet 5 scores 63.2%, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 actually slightly outperforms Opus 4.8, which is known for excelling at the hardest problems, such as making subtle judgment calls and conducting deep research.
“Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available,” Anthropic says. “Between Sonnet 5 and Opus 4.8, users can adjust the effort level to find the right balance of cost and performance.”
According to testers cited in the blog post, Sonnet 5 also excels at finishing complex tasks where previous model versions would have stopped short and “checks its own output without explicitly being asked.”
“We handed Claude Sonnet 5 a two-part job—update Salesforce account tiers, send a launch announcement to enterprise contacts—and it finished end to end,” Daniel Shepard, a senior engineer at Zapier, said in a statement. “That used to stall halfway. For day-to-day automation, it’s a no-brainer.”
On safety, Sonnet 5 also demonstrates a lower rate of “undesirable behaviors,” such as cooperation with misuse and deception, than its predecessor, making it safer to use in agentic contexts. It’s better at refusing malicious requests and sidestepping hijack attempts in prompt injection attacks. It also hallucinates and engages in sycophantic behavior at a lower rate than Sonnet 4.6.
That said, it’s not on the same level as Opus 4.8 and Claude Mythos Preview when it comes to misaligned behavior. “Evaluations also show that it has a much lower ability to perform dangerous cybersecurity tasks than our current Opus models,” reads the blog post.
Lovable co-founder Fabian Hedin said in a statement that Claude Sonnet 5 “refuses unsafe requests cleanly and consistently.”
“At Lovable, we’re putting powerful tools in the hands of millions of builders,” Hedin said. “A model that knows when to say no is just as important as one that knows how to build.”
Publisher: techcrunch.com




