Quite unexpected! Baidu's Wenxin Yiyan has made a qualitative leap. Could it overtake GPT-4 within the year?

OpenAI released ChatGPT-3.5 on November 30, 2022 and GPT-4 on March 14, 2023. On March 16, 2023, Baidu released its own large language model, Wenxin Yiyan, making Baidu the first Chinese technology giant to publish one. However, in the roughly two months after the release, probably more than half of internet users gave Wenxin Yiyan a negative evaluation, and the gap with ChatGPT-3.5 was large.

Seen from another angle, though, Baidu deserves recognition for opening a still-immature Wenxin Yiyan to the majority of users. It was a confident move: acknowledging the shortcomings while believing the model would keep getting better. What many people probably did not expect is that over the past half month Wenxin Yiyan seems to have made a qualitative leap: its "IQ" suddenly rose a great deal, and it began to look "smart".

First of all, like Microsoft Bing Chat and Google Bard, Wenxin Yiyan can search the internet by itself and answer users' questions on a wide range of topics. This shows that Wenxin Yiyan does have a certain ability to understand and generate language, along with some basic knowledge and logical reasoning ability, and can already give users real help in some scenarios. To get the best possible answer from Wenxin Yiyan, users should phrase their questions as accurately and completely as possible.

For example, in this round of dialogue, I asked Wenxin Yiyan three questions.

1. What day of the week is today?

2. What is the weather like in Shanghai in the next few days?

3. I am currently in Chengdu, Sichuan, and plan to go to Shanghai for a few days. Can you help plan a travel route, as simply as possible?

Wenxin Yiyan's answers were fairly satisfactory.

In another round of dialogue, I asked Wenxin Yiyan four questions.

1. Search for news about NIO's price reduction and tell me why NIO is reducing prices.

2. What was NIO's monthly car delivery volume from January to May 2023?

3. Did NIO founder Li Bin say that NIO would never reduce prices?

4. Can you make a simple comparison of the delivery volumes of NIO, Xpeng and Li Auto in May 2023?

Again, Wenxin Yiyan gave fairly satisfactory answers.

Secondly, Wenxin Yiyan's ability to translate between Chinese and English has improved. It shows a certain ability for cross-language conversion: it can understand the grammatical, lexical and semantic differences between the two languages and generate relatively fluent and accurate text in the target language. According to relevant statistics, no fewer than 1.5 billion people in the world can use English, and about 1.1 billion people use Chinese (here mainly Mandarin). For many people, it is necessary to master both Chinese and English. As Wenxin Yiyan's Chinese-English translation continues to improve, it can become not only an intelligent assistant for language learning but also a tool for translating all kinds of texts, improving both efficiency and quality.

In this round of dialogue, I asked Wenxin Yiyan four questions.

1. Translate the following Chinese into Japanese: I am not in a good mood today and want to sleep.

2. Then translate it into English.

3. Translate the following English into fluent, coherent Chinese: The launch is part of SpaceX's Transporter-8 mission which is "a dedicated smallsat rideshare mission", according to SpaceX's website. The rocket will carry 72 payloads on this flight, including CubeSats, MicroSats, a re-entry capsule and orbital transfer vehicles carrying spacecraft to be deployed at a later time.

4. What is the difference between a beautiful girl and a pretty girl?

Wenxin Yiyan's responses to questions 1 and 2 show that it understands context, which is good. For the third question, it translated the English into Chinese well. For the fourth question, its answer was also acceptable.

Third, as long as the task is not too difficult, Wenxin Yiyan can write working program code. It has a certain understanding of programming languages and can generate code that is correct in syntax, logic and function from a user's natural-language description or from existing code fragments. It is foreseeable that the stronger Wenxin Yiyan's coding ability becomes, the more convenience it can offer developers in scenarios such as code generation, code completion, code translation and code commenting.

In this round of dialogue, I gave Wenxin Yiyan only two very simple Python programming tasks.

1. Write an animation of a football in Python.

2. Write the 9x9 multiplication table in simple Python.

Interested readers can try it for themselves, setting the difficulty of the coding problems or tasks as they see fit, in Python, Java, C/C++, JavaScript and other programming languages.
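For readers who want a baseline to compare a chatbot's output against, the second task might look roughly like the sketch below. This is not Wenxin Yiyan's actual answer; the function name `multiplication_table` and the cell layout are assumptions chosen for illustration.

```python
def multiplication_table(n: int = 9) -> str:
    """Return the n-by-n multiplication table as lower-triangular text."""
    rows = []
    for i in range(1, n + 1):
        # Row i holds the products 1*i, 2*i, ..., i*i.
        cells = [f"{j}x{i}={i * j}" for j in range(1, i + 1)]
        rows.append("  ".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(multiplication_table())
```

The table is built as a string rather than printed directly, which makes it easy to check a generated answer line by line.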

Fourth, unlike humans, Wenxin Yiyan cannot yet solve logical reasoning problems beyond a certain difficulty, or science and engineering problems in mathematics, physics, chemistry, biology and so on. This means that, in addition to language understanding and generation, Wenxin Yiyan also needs to improve its computing and generalization abilities. Here, computing ability means carrying out numerical calculation, symbolic operations, logical reasoning and the like to obtain correct results while coping with uncertainty and anomalies. Generalization ability means handling different topics and fields: not being limited to the knowledge and methods in the training data, but learning and applying new knowledge and methods to solve more complex and abstract problems.

In this round of dialogue, I asked Wenxin Yiyan two questions, both carefully designed so that no ready-made answers exist on the internet.

1. Bob wants to cut a square piece of paper into two different rectangular pieces, so that the sum of the perimeters of the two rectangles is minimal. If the side length of the square is 8 cm, how should he cut it to achieve this goal?

2. Students at a certain school took a math quiz with a total of 100 points: 10 questions worth 10 points each. After the test, the teacher found that one student scored more than 90 points but did not get a perfect score. According to the school's grading system, students with fewer than 10 points are not graded, and only whole numbers are recorded. What specific score is the student likely to have received?

Wenxin Yiyan's answers were obviously wrong. By comparison, iFLYTEK's Spark and OpenAI's ChatGPT-3.5 did not do much better.

[Screenshots: answers from iFLYTEK Spark and ChatGPT-3.5]

Now look at the answer from Microsoft's Bing Chat (powered by GPT-4). For the first question, Bing Chat said to cut the square along its diagonal into two isosceles right triangles, so that the sum of the perimeters of the two "rectangles" (isosceles right triangles) is minimal. Bing Chat even gave corresponding proofs in its reasoning. It should be noted that Bing Chat's derivation and calculation also contain errors. For the second question, Bing Chat said there is no single definitive answer, because different scoring systems can lead to different results, and it offered one possible scoring system for the user's reference.
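For reference, the first question can be checked directly. The sketch below assumes a single straight cut parallel to one side, which is the only straight cut that leaves two rectangles; `perimeter_sum` is a hypothetical helper written for this check, not something any of the chatbots produced. A cut at distance x from one edge gives rectangles of x by 8 and (8 - x) by 8, so the perimeters add up to 2(x + 8) + 2(8 - x + 8) = 48 cm wherever the cut is made.

```python
def perimeter_sum(side: float, cut: float) -> float:
    """Sum of the perimeters of the two rectangles left by one straight cut
    parallel to a side, made at distance `cut` from one edge of the square."""
    if not 0 < cut < side:
        raise ValueError("cut must lie strictly inside the square")
    first = 2 * (cut + side)            # rectangle of size cut x side
    second = 2 * ((side - cut) + side)  # rectangle of size (side - cut) x side
    return first + second


if __name__ == "__main__":
    # Every cut position gives the same total, 6 * side = 48 for side = 8.
    for cut in (1, 2, 3, 5, 7):
        print(cut, perimeter_sum(8, cut))
```

Under that assumption the total is always 48 cm, so there is no unique minimum: any cut other than the midline (x = 4) yields two different rectangles with the same perimeter sum.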

As for Wenxin Yiyan's ability and performance in other areas, such as casual chat and writing novels, classical poems and news, it generally feels acceptable. I will not go through them one by one.

Conclusion: Compared with the previous version, the current Wenxin Yiyan can almost be considered reborn. It also gives the outside world the sense that, next to the AI language models of American companies such as ChatGPT, Bard and Claude, the models developed by Chinese companies are competitive too (not so bad after all). Following Baidu's Wenxin Yiyan and Alibaba's Tongyi Qianwen, ByteDance and Tencent will also release their own AI language models. Some domestic general-purpose large language models, Wenxin Yiyan included, should catch up with GPT-4 faster than the outside world expects. In this round of the AI boom, both American and Chinese technology companies understand the importance of the fourth AI revolution. Even if GPT-4 currently leads the world, it is only a starting point on the road to artificial general intelligence (AGI).
