Claude 3.5 Sonnet: Anthropic’s AI model competes with GPT-4o and Gemini 1.5.

The AI arms race continues apace: Anthropic is launching its latest model, called Claude 3.5 Sonnet, which it says can rival or even surpass OpenAI’s GPT-4o or Google’s Gemini for a wide range of tasks. The new model is already available to Claude users on the web and on iOS, and Anthropic is also making it available to developers.

Claude 3.5 Sonnet will ultimately be the middle model in the lineup: Anthropic uses the name Haiku for its smallest model, Sonnet for the mainstream middle option, and Opus for its highest-end model. (The names are weird, but every AI company seems to name things in its own weird way, so let’s leave it at that.) But the company says 3.5 Sonnet outperforms 3 Opus, and its benchmarks show it does so by a fairly wide margin. The new model is also apparently twice as fast as the previous one, which may be an even bigger deal.

Benchmarks for AI models should always be taken with a grain of salt: there are a lot of them, it’s easy to pick the ones that make you look good, and the models and products change so quickly that no one seems to hold a lead for long. That said, Claude 3.5 Sonnet looks impressive: it scored better than GPT-4o, Gemini 1.5 Pro, and Meta’s Llama 3 400B in seven of nine overall benchmarks and four of five vision benchmarks. Again, don’t read too much into that, but it seems like Anthropic has built a legitimate competitor in this space.

Claude 3.5’s benchmark scores look impressive, but these things change so quickly.
Image: Anthropic

What does all of that actually amount to? Anthropic says Claude 3.5 Sonnet will be much better at writing and translating code, handling multi-step workflows, interpreting charts and graphs, and transcribing text from images. This new and improved Claude also apparently understands humor better and can write in a much more human-like way.

Along with the new model, Anthropic is also introducing a new feature called Artifacts. Artifacts lets you see and work with the results of your Claude requests: if you ask the model to design something for you, it can now show you what it looks like and let you edit it right in the app. If Claude writes you an email, you can edit the email in the Claude app so you don’t have to copy it into a text editor. It’s a small feature, but a smart one: these AI tools need to become more than simple chatbots, and features like Artifacts simply give the app more to do.

The new Artifacts feature is a hint at what a post-chatbot Claude could look like.
Image: Anthropic

Artifacts actually seems to signal the long-term vision for Claude. Anthropic has long said it’s primarily focused on businesses (even as it hires consumer tech veterans like Instagram co-founder Mike Krieger) and said in its press release announcing Claude 3.5 Sonnet that it plans to turn Claude into a tool for companies to “securely centralize their knowledge, documents and ongoing work in one shared space.” That sounds more like Notion or Slack than ChatGPT, with Anthropic’s models being central to the entire system.

For now, however, the model is the big news. And the pace of improvement here is wild to see: Anthropic launched Claude 3 Opus in March and proudly said it was as good as GPT-4 and Gemini 1.0, before OpenAI and Google released better versions of their models. Now Anthropic has taken its next step, and it certainly won’t be long before the competition does too. Claude isn’t talked about as much as Gemini or ChatGPT, but it is very much in the running.
