
Tongyi Qianwen's open-source trump card: a 110-billion-parameter model tops the open-source charts, with the world's best Chinese ability

Author: Silicon Star (硅星人)

You can tell how popular an open-source model is by how quickly the products in its ecosystem add support for it.

On April 26, Tongyi Qianwen open-sourced again, releasing the 110-billion-parameter flagship Qwen1.5-110B and setting a new bar for open-source model performance. Less than 24 hours after the model was released, Ollama added support for the 110B. That means that, beyond trying free demos on the ModelScope community and Hugging Face, you can deploy the model on your own machine as soon as it ships.


Cloud deployment platforms such as SkyPilot were likewise among the first to tweet about Qwen1.5's popularity. Across the large-model open-source community, Llama used to be the only model everyone wanted to build on. After more than half a year of open source, the Qwen series' position in the ecosystem is gradually approaching Llama's.


On the day of its release, Qwen1.5-110B briefly topped the Hacker News front page; the last time it drew this much heat and discussion was when Tongyi Qianwen first announced open source last August. The conversation, though, has shifted from "what is this?" to a serious "how strong is this?". The skeptical noise has faded as Qwen's strength has grown.


Some netizens praised Qwen1.5-110B's summarization and information-extraction abilities, saying the results were better than Llama 3's.


Others expressed their love in somewhat cruder terms.


Qwen1.5-110B is the first 100-billion-parameter-class model in the Qwen series, and it performs significantly better than the 72B model in the same series. Qwen 72B has long been one of the most popular open-source models in the community; saying it has repeatedly topped the charts is no exaggeration. The 110B makes no major changes to the pre-training recipe, so the performance gain comes mainly from the increase in model size.

Like the other Qwen1.5 models, Qwen1.5-110B uses the same Transformer decoder architecture with grouped-query attention (GQA). It supports a context length of 32K tokens and multiple languages, including English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, and Arabic.
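The idea behind GQA is that several query heads share one key/value head, which shrinks the KV cache during inference. A minimal NumPy sketch of that sharing pattern (the head counts and dimensions here are illustrative, not Qwen1.5-110B's actual configuration):

```python
import numpy as np

# Illustrative sizes: 8 query heads share 2 key/value heads (4 per group).
n_q_heads, n_kv_heads, seq, d = 8, 2, 4, 16
group = n_q_heads // n_kv_heads

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d))
k = rng.standard_normal((n_kv_heads, seq, d))
v = rng.standard_normal((n_kv_heads, seq, d))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

out = np.empty_like(q)
for h in range(n_q_heads):
    kv = h // group                        # map each query head to its shared KV head
    scores = q[h] @ k[kv].T / np.sqrt(d)   # scaled dot-product attention
    out[h] = softmax(scores) @ v[kv]

print(out.shape)  # (8, 4, 16): every query head still produces its own output
```

With multi-head attention the KV tensors would have 8 heads each; here they have 2, so the KV cache is 4x smaller while the number of query heads is unchanged.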

On benchmarks, it outperforms Llama 3 70B on most tests:


Benchmark scores aside, we were more curious how Qwen1.5-110B actually performs in practice, and how it differs from Llama 3 70B.

Qwen1.5-110B VS Llama 3 70B

Let's start with a few fresh trick questions from Ruozhiba (a Chinese forum famous for absurd, deliberately silly riddles):


Without any persona in the prompt, Qwen1.5-110B's answer is more logical, more informative, and correct. Llama 3's answer is even sillier than the trick questions themselves: it not only pads out the trivial observation that "an hour and a half" equals 1.5 hours, but also hallucinates the electric car into a tricycle. Maybe that's the "right" answer for Ruozhiba?

Next, its Chinese comprehension:


The correct reading of the sentence should be: "I grabbed hold of the handlebar in one go."

Qwen's answer is essentially correct, though it misses the sense of holding the handlebar steady. Llama 3, meanwhile, just finds the sentence hilarious.

Another round of follow-up Q&A:


Asked to think it over once more, Qwen basically answers the question correctly. Llama 3 is still cracking jokes; its answer genuinely made me laugh.

Now for a serious math problem:

Mrs. Wang went to the market to sell eggs. The first customer bought half of the eggs in the basket plus one more; the second customer bought half of the remaining eggs; after that, one egg was left in the basket. How many eggs did Mrs. Wang sell?

Their answers:


Qwen reasons clearly and gets the right answer. Llama 3's process is sound, but it solves the resulting linear equation incorrectly. In terms of approach, Qwen works backwards from the last egg, which is quite elegant, while Llama 3 sets up an equation in the textbook primary-school way.
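The backwards reasoning the article praises can be checked in a few lines of Python, undoing each sale in reverse:

```python
# Work the egg puzzle backwards from the single egg left in the basket.
remaining_after_second = 1
# The second customer bought half of what was there, leaving 1, so there were 2.
before_second = remaining_after_second * 2
# The first customer bought half the basket plus one more, leaving those 2,
# so undo "half plus one": (2 + 1) * 2 = 6 eggs to start.
before_first = (before_second + 1) * 2
sold = before_first - remaining_after_second
print(before_first, sold)  # 6 eggs to start, 5 sold
```

Forward check: of 6 eggs, the first customer takes 3 + 1 = 4, leaving 2; the second takes half of 2, leaving 1. Mrs. Wang sold 5 eggs.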

When we switched to Korean without clearing the chat history, Llama 3 kept answering in Chinese out of habit, while Qwen switched to answering in Korean.


Qwen1.5-110B beats Llama 3 70B in these tests. It's not that Llama 3 doesn't work, but in Chinese at least, calling Qwen1.5-110B the strongest open-source model should not be controversial.

Seeing open source through to the end

On Hugging Face, the Qwen series has sat near the top of the popularity charts almost since it was open-sourced, and with version 1.5 and the arrival of the large 72B and 110B models, it has become the most prominent open-source family outside of Llama. In its native Chinese, there is basically no substitute for it anywhere on the web.

Tongyi Qianwen's open-source cadence has been relentless since last August. Since the Qwen1.5 series debuted in early February, ten open-source models of different parameter counts have shipped in three months, including eight large language models plus Code-series and MoE models. Late last year, Tongyi Qianwen also open-sourced two multimodal models: the visual-understanding model Qwen-VL and the audio-understanding model Qwen-Audio.


Counting the various deployment and fine-tuned variants, there are already 76 different Qwen models on Hugging Face. By comparison, Mistral and Llama each number in the single digits. Qwen is the workhorse of the open-source world.

The effort has paid off: in just over half a year, downloads of Qwen-series models have exceeded 7 million, and models and applications built on the Qwen series are easy to find on both Hugging Face and ModelScope.

For a large number of developers and enterprises, the Qwen series, spanning 0.5 billion to 110 billion parameters, offers an ideal menu of model sizes. The National Astronomical Observatories of the Chinese Academy of Sciences built a new-generation astronomical model, "Star Language 3.0," on the open-source Tongyi Qianwen models, the first time a Chinese large model has gone "into the sky" and been applied to astronomical observation.


Recently, as model capabilities converge, the open-versus-closed-source debate has grown louder. Where closed-source models pursue a self-contained commercial loop, the open-source track opens up a different kind of "anything is possible" imagination.

When some people use it and some people discuss it, open source has done its job.

By that measure, the Qwen series has become one of China's most successful open-source products.
