Surpassing GPT-4, Claude 3 Opus becomes the new king!

Author: New Zhiyuan

Edited by alan

Claude 3 Opus has actually beaten GPT-4. In Chatbot Arena's latest chatbot battle rankings, the largest Claude 3 model has reached the top, and even the smallest, Claude 3 Haiku, has reached GPT-4 level!

Claude 3 Opus surpasses GPT-4 to become the new king!

Today, Chatbot Arena updated its chatbot battle leaderboard. After the test of time and of a large number of users, Claude 3, which had been slightly behind GPT-4, has actually overtaken it!

And it's not only the largest model, Claude 3 Opus, that has reached the top: the overall performance of the Claude 3 family is very impressive.

The mid-sized Claude 3 Sonnet is ranked 4th, and even the smallest, Claude 3 Haiku, has reached GPT-4 level!

So how authoritative is this list compared to benchmark scores?

Chatbot Arena was developed by a team at Berkeley, and each model's score on the leaderboard depends entirely on the votes of real human users.

Let's take a look at the scoring rules:

A user puts the same question, on any topic, to two anonymous models (e.g., ChatGPT, Claude, Llama) at the same time, and then votes for the model that gave the better answer;

If the winner is not clear right away, the user can continue chatting until one is determined;

If the identity of a model is revealed in the conversation, the vote is not counted.

The Chatbot Arena platform has collected more than 400,000 votes to compute this rating leaderboard for large models and determine the champion.
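
To make the idea of turning votes into a leaderboard concrete, here is a minimal sketch. The article does not say which rating formula Chatbot Arena uses, so the Elo-style update below is an assumption, as are the constants (`k`, `scale`, `initial`), the vote tuple layout, and the placeholder model names.

```python
from collections import defaultdict


def elo_ratings(votes, k=4.0, scale=400.0, initial=1000.0):
    """Elo-style ratings from pairwise votes (illustrative assumption only).

    Each vote is (model_a, model_b, winner, identity_revealed), where
    winner is "a", "b", or "tie". The constants are hypothetical defaults.
    """
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, winner, identity_revealed in votes:
        if identity_revealed:
            # Rule from the article: such votes are not counted.
            continue
        ra, rb = ratings[model_a], ratings[model_b]
        # Probability that model_a wins under the Elo model.
        expected_a = 1.0 / (1.0 + 10.0 ** ((rb - ra) / scale))
        score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta  # zero-sum update
    return dict(ratings)


# Hypothetical votes between placeholder models.
votes = [
    ("model-x", "model-y", "a", False),
    ("model-y", "model-z", "a", False),
    ("model-x", "model-z", "a", False),
    ("model-x", "model-y", "tie", False),
    ("model-z", "model-x", "a", True),  # identity revealed: ignored
]
ratings = elo_ratings(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Because each update adds to one model exactly what it takes from the other, the ratings always sum to the same total; only the ordering and the gaps carry information.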

Clearly, Claude 3 won this round.

Let's take a look at the actual numbers:

Fraction of wins for model A over model B, across all non-draw battles:

Number of battles between models (no draws):
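
Both statistics are simple pairwise tallies over the vote log. A minimal sketch, assuming each vote is a hypothetical `(model_a, model_b, winner)` tuple with `winner` equal to `"a"`, `"b"`, or `"tie"`; the function and model names are illustrative, not part of Chatbot Arena:

```python
from collections import Counter


def pairwise_stats(battles):
    """Return (win fractions, battle counts) over non-draw battles.

    fractions[(a, b)] is the share of non-draw battles between a and b
    that a won; totals[(a, b)] is the number of such battles.
    """
    wins = Counter()
    totals = Counter()
    for model_a, model_b, winner in battles:
        if winner == "tie":
            continue  # draws are excluded from both statistics
        if winner == "a":
            victor, loser = model_a, model_b
        else:
            victor, loser = model_b, model_a
        wins[(victor, loser)] += 1
        # Count the battle for both orderings so lookups are symmetric.
        totals[(model_a, model_b)] += 1
        totals[(model_b, model_a)] += 1
    fractions = {pair: wins[pair] / n for pair, n in totals.items()}
    return fractions, dict(totals)


# Hypothetical votes between two placeholder models.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "a"),
    ("model-y", "model-x", "a"),
    ("model-x", "model-y", "tie"),
]
fractions, totals = pairwise_stats(battles)
print(fractions[("model-x", "model-y")], totals[("model-x", "model-y")])
```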

GPT-4 has finally been beaten, and some netizens began to poke fun at it:

Just saw Sam Altman at his local supermarket, and he looked at his phone in shock. After a few seconds, he really collapsed and began to tremble violently. After 2 minutes of shaking and screaming, a crowd surrounded him trying to help him. But surprisingly, he stopped shaking and screaming after 2 minutes, stood up, picked up his phone and started dialing a number.

"Ready to release ......"

We don't know whether Altman is going to release GPT-5.

Netizens said that Claude is indeed much more diligent than GPT:

GPT-4-Turbo is very lazy. In any coding task, it skips parts of the code and says "you know what to put in yourself", whereas Opus outputs the entire code without skipping anything.

Even Claude-2 impressed this netizen with its diligence and patience.

More pragmatic netizens pointed out that Haiku's ranking matters more, because it is the first LLM that can respond almost instantly at very low cost while being intelligent enough to provide real-time customer service.

Claude 3 Haiku not only performs as well as the original version of GPT-4; crucially, it is also quite cheap, and on some platforms you can even use it for free.

Everyone then praised Claude 3 Haiku:

Its intelligence is on par with GPT-4, its price is lower than that of GPT-3.5, and the model is rumored to be as small as 20B parameters.

Some netizens said that OpenAI has fallen behind and Anthropic is now the leader; for a while, the mood on the platform was jubilant.

ChatGPT has zero growth in a year

Looking back at ChatGPT, from its early days as the star and undisputed king to now, one has to say it looks a little shabby.

Recently, a statistics platform revealed that ChatGPT has had zero growth over the past year!

ChatGPT has recently been accused of being lazy and of being bloated with system prompts. Meanwhile, competition has intensified: both Claude 3 and Gemini 1.5 Pro now offer 8x more context length and better recall than GPT-4.

For almost every ChatGPT use case, there is now a plethora of verticalized AI startups serving users who are no longer satisfied with the existing ChatGPT interface and bundled tools.

They have better UI options (e.g. IDEs and image/document editors), better native integrations (e.g. cron for recurring operations), better privacy/corporate protection (e.g. for healthcare and finance), and more fine-grained controls (GPT's default RAG is naive and non-configurable).

Below, some netizens list products in the relevant verticals, along with each company's funding:

In a sense, OpenAI's B2B and B2C businesses compete with each other, which is a somewhat healthy competition: OpenAI can train on RLHF data from ChatGPT.

And the new GPT Store can be seen as OpenAI's attempt to capture this demand for verticalization.

Instead of leaving the platform and paying $20/month everywhere, why not stay inside ChatGPT and pay only once, letting OpenAI distribute the theoretical revenue to GPT creators?

In response, most creators wisely release only a lite version of their app to ChatGPT, using it as a channel to their main platform.

In the console business, it's no secret that purchase decisions are often driven by platform-exclusive games. In a sense, the future of ChatGPT will likewise feature platform-exclusive models.

Therefore, when Sora or even GPT-5 is publicly released, it will surely land first on OpenAI's own platform, and that may be the next growth driver for ChatGPT.