No less than GPT-4! Wenxin Yiyan stages a remarkable turnaround

Author: Media Tea Party

Give it a topic and keywords, and it generates a video in seconds; it spots language traps and answers accurately; it has a remarkable memory and still recalls earlier content after many rounds of human-machine dialogue...

On October 17, at the Baidu World 2023 Conference, Robin Li, founder, chairman and CEO of Baidu, announced the official release of Wenxin Model 4.0 and demonstrated its multi-scenario application capabilities.

As the first generative AI product released by a major global tech company, Wenxin Yiyan has significantly improved its comprehension, generation, logic, and memory capabilities in version 4.0. After this upgrade, what exactly can it do? Where does it stand out against GPT-4? On October 29, the editor put it to the test.

Comprehension, generation, logic, and memory are the foundational core capabilities of large models; they determine a model's level of intelligence and the scope for AI applications. According to Robin Li, Wenxin Model 4.0 is the most powerful Wenxin model to date and is not inferior to GPT-4 in any of these four capabilities.

To verify its strength, the editor drew up a series of questions, put them to Wenxin Yiyan 4.0, asked GPT-4 the same questions, and compared how each performed.

Core abilities head-to-head: which is better, Wenxin Yiyan 4.0 or GPT-4?

Comprehension ability

To test comprehension, the editor asked both models the following question:

I want to go back to Hainan to buy a house, can I use the provident fund loan, and what should I do? I work in Beijing.

The answers received are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

Although the question is phrased out of order and its core intent is fairly vague, neither Wenxin Yiyan 4.0 nor GPT-4 was thrown off. Both grasped the core intent and understood the subtext: can someone with a Hainan hukou use a Beijing provident fund to buy a house in Hainan?

To avoid drawing conclusions from a single case, the editor asked another question:

Which fish is best to buy? I want to make fish-flavored shredded pork.

Their answers are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

On this question, GPT-4 and Wenxin Yiyan 4.0 tied: both spotted the language trap (fish-flavored shredded pork contains no fish) and added details about the dish.

In addition, the editor used "what is the problem of lonely" to test how well GPT-4 and Wenxin Yiyan 4.0 understand Internet buzzwords and memes.

Their answers are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

The comparison found that Wenxin Yiyan 4.0 understands Internet buzzwords and memes better than GPT-4, which suggests that its data and entries are updated more promptly than GPT's.

Generation ability

Quickly and accurately generating the videos, pictures, poems, and other content a prompt asks for, based on keywords and themes, is an important measure of a large model's generation ability.

On generation, Robin Li showed how Wenxin Yiyan, starting from a single stock image, produced a set of advertising posters, five pieces of advertising copy, and a marketing video in just a few minutes.

The editor also personally tested and compared the generation capabilities of Wenxin Yiyan 4.0 and GPT-4. For video, Wenxin Yiyan 4.0 can already generate clips for some scenarios and themes.

The editor asked for a video of a college graduation group photo. Wenxin Yiyan 4.0 generated the following:

(Video generated by Wenxin Yiyan 4.0)

Wenxin Yiyan's video fits the graduation theme, features rich scenes and characters, and conveys both the sadness of parting and the joy college students feel at graduation.

At present, however, GPT-4 cannot generate videos and can only offer video production suggestions.

(Screenshot: GPT-4's answer)

Besides video, the editor also compared the two models' ability to make posters. The results show that GPT-4 cannot make posters directly, while Wenxin Yiyan 4.0 can.

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

Logical ability

No test of logical ability is complete without math problems. The editor picked a problem on sequences: let Sn be the sum of the first n terms of the arithmetic sequence {an}; given S8 = 4a3 and a7 = -2, what is a9? Both models were asked to solve it.
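Readers who want to check the problem for themselves can do so in a few lines. The sketch below is ours, not either model's answer: the function names solve_sequence and term are made up for illustration. It assumes only the standard formulas for an arithmetic sequence, an = a1 + (n - 1)d and S8 = 8a1 + 28d, so the two given conditions become a pair of linear equations in a1 and d.

```python
from fractions import Fraction


def solve_sequence():
    """Solve for a1 and d of an arithmetic sequence with S8 = 4*a3 and a7 = -2."""
    # S8 = 8*a1 + 28*d and a3 = a1 + 2*d, so S8 = 4*a3 reduces to 4*a1 + 20*d = 0.
    # a7 = a1 + 6*d = -2 is the second equation.
    # Coefficients of the 2x2 system  p*a1 + q*d = r.
    p1, q1, r1 = Fraction(4), Fraction(20), Fraction(0)
    p2, q2, r2 = Fraction(1), Fraction(6), Fraction(-2)

    # Cramer's rule for a 2x2 linear system.
    det = p1 * q2 - q1 * p2
    a1 = (r1 * q2 - q1 * r2) / det
    d = (p1 * r2 - r1 * p2) / det
    return a1, d


def term(a1, d, n):
    """n-th term (1-indexed) of the arithmetic sequence."""
    return a1 + (n - 1) * d


a1, d = solve_sequence()
terms = [term(a1, d, n) for n in range(1, 10)]

# Check both given conditions before reading off a9.
assert sum(terms[:8]) == 4 * terms[2]  # S8 = 4*a3
assert terms[6] == -2                  # a7 = -2

print(f"a1 = {a1}, d = {d}, a9 = {terms[8]}")
```

Under these assumptions the system gives a1 = 10 and d = -2, so a9 = a1 + 8d = -6.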

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

On this problem, Wenxin Yiyan 4.0 answered correctly, but GPT-4 made a mistake at the step underlined in the screenshot, so its answer was wrong.

The editor then tried a different math problem. Wenxin Yiyan 4.0 again answered correctly, while GPT-4 again answered incorrectly, this time because of a unit-conversion error.

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

The results of the above two math questions show that Wenxin Yiyan 4.0 is better than GPT-4 in terms of logical reasoning and calculation.

Memory ability

To test memory, the editor asked Wenxin Yiyan 4.0 and GPT-4 each to write a novel from the synopsis "A reporter exposes an unscrupulous factory", then held multiple rounds of dialogue to enrich and extend the plot, interspersed with distracting questions (because the dialogue ran to many rounds and the answers were long, not all screenshots are shown here).

Finally, the editor asked about the generated novel, for example what the protagonist's name is. The two proved equally capable here: both accurately recalled details they had generated, undisturbed by the distractions, with no inconsistencies or logical problems.

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

Therefore, judging from the evaluation results, in terms of memory ability, Wenxin Yiyan 4.0 is not inferior to GPT-4 in any way.

In addition, Wenxin Yiyan is considered strong at understanding Chinese context and at creative writing in Chinese. What stands out in version 4.0? Is it better than GPT-4? The editor tested this as well.

The editor asked Wenxin Yiyan 4.0 and GPT-4 to write a few sentences in the style of: "The leader reaches for a dish and you spin the table; the leader takes a drink and you hit the brakes; the leader is one tile from winning and you win on your own draw".

The answers are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

The results show that Wenxin Yiyan 4.0 clearly understood this popular online copy, in which young people new to the workplace poke fun at themselves. GPT-4 got the meaning completely backwards and wrote a few lines that pander to the boss.

The editor also set a task in classical Chinese style: write an acrostic poem whose lines begin with the five characters "记", "者", "节", "快", and "乐" (read together, "Happy Journalists' Day"). The acrostic had to carry depth and charm while also rhyming, reading smoothly, and following the conventions of poetry.

The answers are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

Judging from the results, both understand what an acrostic poem is, but the one written by Wenxin Yiyan 4.0 is closer to the classical poetry we are familiar with, with neatly matched lines. GPT-4's is somewhat less polished, and its style is closer to modern verse.

To test how well the two models understand Chinese dialects, the editor also asked what a dialect phrase (rendered here as "What are you doing?") means.

The answers are as follows:

(Screenshot: Wenxin Yiyan 4.0's answer)

(Screenshot: GPT-4's answer)

Judging from the results, Wenxin Yiyan 4.0 identifies the dialect more accurately. GPT-4 gave two answers, "Why are you like this?" and "What are you doing?"; the former is correct and the latter is wrong, indicating that GPT-4's understanding of Chinese dialects needs further improvement.

Overall, Wenxin Yiyan 4.0 is not inferior to GPT-4 in comprehension and memory, and it outperformed GPT-4 in logic, generation, understanding of Internet buzzwords, and classical poetry writing.

From "imperfect" to comeback: Wenxin Yiyan turns things around

"What do you think of Wenxin Yiyan?" "I think it's good." "What? You call that good? It's just rote memorization, far worse than ChatGPT." In March this year, when Baidu launched Wenxin Yiyan, such comments were common, and Wenxin Yiyan was criticized as "imperfect".

But just over half a year later, Wenxin Yiyan has staged a remarkable turnaround!

On the one hand, many evaluations do confirm that Wenxin Yiyan 4.0 is "not inferior" to ChatGPT in strength. In addition, in July this year, International Data Corporation (IDC) released a report on the technical capabilities of large AI models, in which Wenxin large model 3.5 earned full marks on 7 of 12 indicators and ranked first in overall score, first in algorithm models, and first in industry coverage.

On the other hand, the usage numbers also amount to a strong report card.

One month after launch, Wenxin Yiyan's QPS (queries per second) had increased 10-fold compared with launch, and model inference performance had improved by 50%. On August 31, 12 hours after Wenxin Yiyan announced it was fully open to the public, it topped the free apps chart in the App Store. To date, Wenxin Yiyan has 45 million users and 54,000 developers, covering 4,300 application scenarios, 825 applications, and 500 plug-ins.

Looking at why Wenxin Yiyan has succeeded, heavy R&D investment, technical iteration and innovation, and an open approach are key factors that cannot be overlooked.

Baidu began investing in AI-related R&D as early as 2010, and over the past ten years its cumulative R&D investment has exceeded 140 billion yuan. R&D expenses in 2022 reached 21.416 billion yuan, or 22.4% of Baidu's core revenue. That level of investment ranks among the highest of major tech companies worldwide.

At the same time, Baidu's continuous iteration and innovation have greatly improved Wenxin Yiyan's performance, from the initial text generation and dialogue functions, to later semantic understanding and sentiment analysis, to today's multimodal interaction and cross-language applications.

For example, deep learning technology has improved the model's performance and generalization ability, so that Wenxin Yiyan adapts better to different application scenarios. Multimodal interaction lets Wenxin Yiyan process inputs such as images and voice, making applications more convenient and practical. Baidu has also developed an agent mechanism that allows Wenxin Yiyan to learn to understand, plan, reflect, and evolve, to keep learning from its environment, and to complete complex tasks autonomously. Wenxin Yiyan also uses regenerative training technology, which saves training resources and time and speeds up model iteration. It is understood that the efficiency of Wenxin Yiyan's training algorithms has improved 3.6-fold since March. In terms of training stability, the weekly average effective training rate has exceeded 98%.

The rapid development of Wenxin Yiyan also owes much to the mutual benefit that comes from opening up fully. Wenxin Yiyan is free and open, giving users intelligent tools for work and life. In turn, as Wenxin Yiyan opens its services at scale to hundreds of millions of Internet users, it gathers a great deal of real-world human feedback, which further improves the underlying model and lets it iterate faster.

Wenxin Yiyan: building a large-model ecosystem "rainforest"

"A large model begins with technology, grows strong through applications, and ultimately thrives on its ecosystem." Baidu says it will work with users, customers, and partners to cultivate a large-model ecosystem "rainforest". At Baidu World on October 17, in addition to demonstrating Wenxin Yiyan's advanced capabilities, Baidu also showed many AI-native applications, as well as results from integrating large-model technology into multiple scenarios.

Vertically, building on the large-model foundation, Baidu has rebuilt its own product ecosystem.

Starting from a single instruction, a user can quickly find an industry report in Baidu Library, generate a summary of the 70-page document in a few seconds, turn the document into a PPT, and have it polished and beautified. The large model has reopened the room for imagination at Baidu Library, making it a one-stop intelligent document creation platform. Since the new AI features launched, more than 13 million users have used them, cumulative uses have exceeded 100 million, more than 20 million pieces of content have been generated, and more than 2 million PPTs have been produced.

Baidu Input Method has newly launched "Super Writing", an all-scenario AI writing assistant that offers high-EQ replies, witty comments, associative continuation, like-winning WeChat Moments posts, catchy titles, inspiration notes, and other functions to help users create content across scenarios and platforms. Cumulative requests have exceeded 100 million, and the user like rate has exceeded 80%.

In a wide range of industries, the Wenxin Yiyan model has been applied to richer scenarios, bringing users a more intelligent, efficient, and convenient application experience.

In transportation, Wenxin Yiyan helps autonomous driving technology work better in practice: it improves recognition of pedestrians, traffic lights, and the like, helps the driving system quickly make sound decisions for the situation at hand, and provides a strong safeguard for the safety of autonomous driving.

In sports, Wenxin Yiyan helps the Chinese diving team train more efficiently and accurately by learning from massive data, understanding and carrying out complex instructions from coaches and athletes, providing accurate information promptly, and scoring movements in real time with precise quantitative analysis.

Reshaping thousands of industries with large models built on "Wenxin" may be Baidu's ultimate goal in the AI era. Judging from current trends, Baidu has taken a key, leading step toward AI's vast frontier.

Bonus at the end of the article: a giveaway!

Media Tea Party has arranged with Baidu, even though invitation codes are scarce, to give away 30 Wenxin Yiyan 4.0 internal-test invitation codes to our readers.

How to enter: like this article, tap "Wow", and share it to WeChat Moments. Then leave a comment at the end of the article with your views on AI and other technical topics.

We will pick the first 30 readers who comment and like, on a first-come, first-served basis, and send them the codes.
