laitimes

Very unexpected! Baidu's Wenxin Yiyan has made a qualitative leap. Will it overtake GPT-4 within the year?

Author: I'm a tech fanatic

OpenAI released ChatGPT (based on GPT-3.5) on November 30, 2022 and GPT-4 on March 14, 2023. On March 16, 2023, Baidu released its own large language model, Wenxin Yiyan, making it the first Chinese technology giant to publish one. However, in the roughly two months after the release, more than half of Internet users probably gave Wenxin Yiyan a negative evaluation, seeing a big gap compared with ChatGPT-3.5.

But from another point of view, Baidu's decision to open a still-immature Wenxin Yiyan to the majority of users deserves recognition. It is a confident move: it acknowledges the shortcomings while trusting that the product will keep getting better. What many people probably did not expect is that over the past half month Wenxin Yiyan seems to have made a qualitative leap. Its "IQ" suddenly rose a lot, and it began to look "smart".

First of all, like Microsoft Bing Chat and Google Bard, Wenxin Yiyan can search the Internet for information by itself and answer users' questions on a wide range of topics. This shows that Wenxin Yiyan does have some language understanding and generation ability, as well as some basic knowledge and logical reasoning ability, and can already provide users with real help in some scenarios. In addition, to get the best possible answer from Wenxin Yiyan, users should phrase their questions as accurately and completely as possible.

For example, in this round of dialogue, I asked Wenxin Yiyan three questions.

1. What day of the week is today?

2. What will the weather be like in Shanghai over the next few days?

3. I am currently in Chengdu, Sichuan, and plan to go to Shanghai for a few days. Can you help plan a travel route, as simply as possible?

Wenxin Yiyan's answers were relatively satisfactory.

As another example, in this round of dialogue, I asked Wenxin Yiyan four questions.

1. Search for news about NIO's price reduction and tell me why NIO is reducing prices.

2. What were NIO's monthly car deliveries from January to May 2023?

3. Did NIO founder Li Bin say that NIO would never reduce prices?

4. Can you make a simple comparison of the May 2023 deliveries of NIO, Xpeng, and Li Auto?

Similarly, Wenxin Yiyan gave fairly satisfactory answers.

Secondly, Wenxin Yiyan's ability to translate between Chinese and English has improved. It now shows a certain capacity for cross-language conversion: it can understand the grammatical, lexical and semantic differences between the two languages and generate relatively fluent and accurate text in the target language. According to relevant statistics, no fewer than 1.5 billion people in the world can use English, and about 1.1 billion use Chinese (here mainly Mandarin). For many people, it is necessary to master both Chinese and English. As Wenxin Yiyan's Chinese-English translation continues to improve, it can serve not only as an intelligent assistant for language learning but also as a tool for translating all kinds of texts, improving both efficiency and quality.

In this round of dialogue, I asked Wenxin Yiyan four questions.

1. Translate the following Chinese into Japanese: I am not in a good mood today and want to sleep.

2. Then translate it into English.

3. Translate the following English into fluent, coherent Chinese: The launch is part of SpaceX's Transporter-8 mission which is "a dedicated smallsat rideshare mission", according to SpaceX's website. The rocket will carry 72 payloads on this flight, including CubeSats, MicroSats, a re-entry capsule and orbital transfer vehicles carrying spacecraft to be deployed at a later time.

4. What is the difference between a beautiful girl and a pretty girl?

Wenxin Yiyan's responses to questions 1 and 2 show that it understands context, which is good! For the third question, it translated the English into Chinese well. For the fourth question, its answer was also acceptable.

Third, as long as the task is not too difficult, Wenxin Yiyan can write working program code. It has a certain understanding of programming languages and can generate code that is syntactically valid, logically sound and functionally correct from the user's natural language description or from existing code fragments. It is foreseeable that the stronger Wenxin Yiyan's coding ability becomes, the more convenience it will offer developers in scenarios such as code generation, code completion, code translation and code commenting.

In this round of dialogue, I gave Wenxin Yiyan only two very simple Python programming problems.

1. Write an animation of a football in Python.

2. Write the nine-nine multiplication table in simple Python.
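For reference, here is a minimal sketch of what a solution to the second problem might look like. It is my own illustration, not Wenxin Yiyan's actual output; the function name `multiplication_table` and the `1x1=1` cell format are assumptions chosen for this example.

```python
def multiplication_table(n: int = 9) -> list[str]:
    """Return the rows of the triangular n-by-n multiplication table."""
    rows = []
    for i in range(1, n + 1):
        # Row i holds the products 1*i, 2*i, ..., i*i.
        cells = [f"{j}x{i}={i * j}" for j in range(1, i + 1)]
        rows.append("  ".join(cells))
    return rows


if __name__ == "__main__":
    for row in multiplication_table():
        print(row)
```

A task of this size is a reasonable baseline: any assistant that claims coding ability should produce something equivalent.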

Interested readers can try this for themselves in Python, Java, C/C++, JavaScript and other programming languages, setting the difficulty of the coding problems or tasks as they see fit.

Fourth, unlike humans, Wenxin Yiyan cannot yet handle logical reasoning problems beyond a certain level of difficulty, or science and engineering questions in subjects such as mathematics, physics, chemistry and biology. This means that in addition to language understanding and generation, Wenxin Yiyan also needs to improve its ability to calculate and to generalize. Here, calculation ability means carrying out numerical calculation, symbolic manipulation and logical reasoning to reach correct results while coping with uncertainties and anomalies. Generalization ability means handling different topics and fields, not being limited to the knowledge and methods in the training data, and being able to learn and apply new knowledge and methods to solve more complex and abstract problems.

In this round of dialogue, I asked Wenxin Yiyan two questions, both carefully designed so that no ready-made answers exist on the Internet.

1. Bob wants to cut a square piece of paper into two different rectangular pieces so that the sum of the perimeters of the two rectangles is as small as possible. If the side length of the square is 8 cm, how should he cut it to achieve this?

2. Students at a certain school took a math quiz of 10 questions worth 10 points each, for a total of 100 points. After the test, the teacher found that one student scored more than 90 points but did not get a perfect score. According to the school's grading system, anything less than 10 points is not counted, and only whole numbers are scored. What specific score is the student likely to have received?
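A quick check on the first question: assuming the paper is cut once, in a straight line parallel to one side (the only single straight cut that yields two rectangles), the two perimeters add up to 2(x + 8) + 2((8 - x) + 8) = 48 cm wherever the cut falls. In other words, there is no unique minimum; any off-center cut gives two different rectangles with the same total. The sketch below verifies this numerically; the function name `perimeter_sum` is my own.

```python
def perimeter_sum(side: float, cut: float) -> float:
    """Total perimeter of the two rectangles made by one straight cut
    parallel to a side, at distance `cut` from that side."""
    if not 0 < cut < side:
        raise ValueError("cut must fall strictly inside the square")
    first = 2 * (cut + side)            # rectangle of size cut x side
    second = 2 * ((side - cut) + side)  # rectangle of size (side - cut) x side
    return first + second


if __name__ == "__main__":
    for cut in (1, 2, 3, 3.5):
        # Each line shows the cut position and the resulting total perimeter.
        print(cut, perimeter_sum(8, cut))
```

Under this assumption the total is always 6 times the side length, so a good answer should point out that the position of the cut does not matter.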

Wenxin Yiyan's answers were obviously wrong. By comparison, though, iFLYTEK's Spark cognitive model and OpenAI's ChatGPT-3.5 did not do much better.

(Screenshot: iFLYTEK Spark's answer)

(Screenshot: ChatGPT-3.5's answer)

Now take a look at the answers from Microsoft's Bing Chat (powered by GPT-4). For the first question, Bing Chat said to cut the square along its diagonal into two isosceles right triangles, claiming that this minimizes the sum of the perimeters of the two "rectangles" (which are in fact triangles). Bing Chat even gave corresponding proofs as part of its reasoning, although its derivation and calculations also contain errors. For the second question, Bing Chat said there is no single definitive answer because different scoring systems can lead to different results, and it offered one possible scoring system for the user's reference.

As for Wenxin Yiyan's ability and performance in other areas, such as casual chat, writing novels, writing classical poems and writing news, it generally feels okay. I will not go through them one by one.

Conclusion: Compared with the previous version, the current Wenxin Yiyan can almost be considered reborn. It also gives the outside world the sense that, next to the AI language models of American companies such as ChatGPT, Bard and Claude, the models developed by Chinese companies are competitive too (not so bad). Following Baidu's Wenxin Yiyan and Alibaba's Tongyi Qianwen, ByteDance and Tencent will also release their own AI language models. Some domestic general-purpose large language models, Wenxin Yiyan included, should catch up with GPT-4 faster than the outside world expects. In this round of the AI boom, both American and Chinese technology companies understand the importance of the fourth AI revolution. Even if GPT-4 temporarily leads the world, it is only the starting point on the road to artificial general intelligence (AGI).
