As long as you can speak, you can develop without writing code! Baidu has made another big move

Author: New Zhiyuan

Editor: Editorial Department

The performance of Wenxin large model 4.0 has improved by 52.5% in half a year, with new progress in agents, code, and multi-model technology. The agent, which thinks like a human, makes its reasoning a white box to some extent; the intelligent code assistant Comate lets developers build applications simply by speaking.

The Wenxin large model has made new progress!

Just yesterday, the Create 2024 Baidu AI Developer Conference was held, and another wave of progress in agents, code, and multi-model technology was announced.

On March 16 last year, Wenxin Yiyan was released, and it has continued to iterate since then.

Based on greater computing power, more data and stronger algorithms, relying on the PaddlePaddle platform, Wenxin has evolved from 3.0 and 3.5 to version 4.0.

AI agents that think like humans

There is no doubt that agents are a direction the industry is unanimously optimistic about.

Baidu CTO Wang Haifeng likewise said that agents will bring an explosion of new applications.

And today's Baidu agent has learned to think like a human!

On top of the base model, Baidu carried out further thinking-enhancement training, including supervised fine-tuning on the thinking process, preference learning for behavioral decisions, and reinforcement learning on reflecting over outcomes, to obtain the thinking model.

As a result, like a human, it can read documentation, try out tools to learn how they work, and call tools to complete tasks.

To illustrate this process in detail, we can refer to the theory in the book "Thinking, Fast and Slow".

The human cognitive system can be divided into two parts: System 1 reacts quickly, but is prone to error. System 2 is slower, but more rational and accurate.

On top of the powerful foundational model, Baidu's R&D team further developed System 2, which includes understanding, planning, reflection and evolution.

In this way, the agent's thinking process becomes a white box to some extent, so the machine can think and act like a human, complete complex tasks autonomously, keep learning, and evolve on its own.

Let's take a look at the thinking process of Baidu Agent.

With the tool-enabled version of Wenxin large model 4.0, we can ask a question like this:

"I'm going on a business trip to the Greater Bay Area for a week. I want to know how the weather changes so I can decide what to bring. Please help me check the temperature in Beijing and Shenzhen for the coming week, tell me what clothes I should bring on a business trip, and organize it into a table."

Next, it's time to show off the real technology.

First, it invokes an "advanced networking" tool to check the local weather information.

It then calls the "code interpreter" to plot a temperature trend graph.

Depending on the weather for the coming week, it chooses the right clothing.

Finally, it thinks about and confirms the results, which are automatically summarized into a table.

Throughout the process, it demonstrates skillful thinking and planning, methodically breaking the user's request into multiple sub-tasks and carrying the whole workflow through smoothly.

Moreover, from trillions of training data, the Wenxin model has learned not only natural language ability but also coding ability.

Code agents

As the name suggests, this agent can help us write code.

The barrier between programmers and ordinary people has been completely broken, and everyone can now do what programmers could do before.

A code agent is composed of two parts: a thinking model and a code interpreter.

First, the thinking model works out what we are asking for and, after some thought, combines the instructions and information needed to complete the task into a prompt that is fed to the code interpreter.

Based on this prompt, the code interpreter translates the user's natural-language requirements into code and executes it, producing either an execution result or debugging information.

Finally, the thinking model also performs a reflective confirmation of the results of the code interpreter.

If the result is correct, it is returned to the user; if it is not, the agent keeps revising it on its own.
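The loop described above (write code, execute it, reflect on the result, retry on failure) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Baidu's implementation: `run_code`, `code_agent`, `fake_write_code`, and `fake_looks_right` are hypothetical names, and the "thinking model" is replaced by two hand-written stand-in functions.

```python
import traceback
from typing import Callable, Optional, Tuple


def run_code(source: str) -> Tuple[bool, str]:
    """Toy code interpreter: run the source and report a result or debug info."""
    scope: dict = {}
    try:
        exec(source, scope)  # not sandboxed; acceptable only in a sketch
    except Exception:
        return False, traceback.format_exc(limit=1)
    return True, repr(scope.get("result"))


def code_agent(
    request: str,
    write_code: Callable[[str, Optional[str]], str],
    looks_right: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> str:
    """Generate code, execute it, reflect on the output, and retry if needed."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        source = write_code(request, feedback)  # model turns the request into code
        ok, output = run_code(source)           # interpreter executes it
        if ok and looks_right(request, output):  # reflection step
            return output
        feedback = output  # debug info or rejected result feeds the next attempt
    raise RuntimeError("no acceptable result after %d rounds" % max_rounds)


# Hypothetical stand-ins for the thinking model.
def fake_write_code(request: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "result = total([1, 2, 3])"  # first attempt calls an undefined name
    return "result = sum([1, 2, 3])"        # corrected after seeing the error


def fake_looks_right(request: str, output: str) -> bool:
    return output != "None"


answer = code_agent("add up 1, 2 and 3", fake_write_code, fake_looks_right)
print(answer)
```

The first attempt fails with a `NameError`, the traceback is passed back as feedback, and the second attempt succeeds, which mirrors the "continue revising on its own" behavior in miniature.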

At the conference, Wang Haifeng gave a live demonstration.

The task was to have the agent generate customized invitation letters for the conference guests.

After a series of operations, each guest's name was filled into the correct position in the invitation.

The newly generated invitation letter files are also named after the guests, and they are packaged and output together.
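For a sense of what the generated code for such a task might look like, here is a minimal sketch. The article does not show the actual code or file formats, so everything here is an assumption: the plain-text template, the `.txt` extension, the ZIP packaging, the placeholder guest names, and the function name `build_invitations` are all hypothetical.

```python
import io
import zipfile
from string import Template
from typing import Iterable

# Hypothetical template; the real invitation layout is not given in the article.
INVITATION = Template(
    "Dear $name,\n\nYou are invited to the Create 2024 Baidu AI Developer Conference.\n"
)


def build_invitations(guests: Iterable[str]) -> bytes:
    """Fill each guest's name into the template and pack one file per guest into a ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in guests:
            # Each file is named after the guest it is addressed to.
            archive.writestr(f"{name}.txt", INVITATION.substitute(name=name))
    return buffer.getvalue()


# Placeholder guest names, used only for illustration.
package = build_invitations(["Guest A", "Guest B"])
```

Writing `package` to disk with `open("invitations.zip", "wb")` would produce a single archive containing one personalized file per guest.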

Development by speaking: the intelligent code assistant has arrived

As for the intelligent code assistant Comate, the name alone suggests something more professional.

That's right: its role is to be a programmer's AI partner, helping professional programmers write better code more efficiently.

In the past, developers have changed the world with code.

Now, natural language has become the new language of development. In other words, developers will in the future be able to build applications simply by speaking.

Building on continuous improvements in model quality, Baidu has further built capabilities such as context enhancement, private-domain knowledge enhancement, and seamless integration into development workflows.

As a result, the overall adoption rate of Baidu's intelligent code assistant Comate has reached 46%, and the share of new code that it generates has reached 27%.

Comate seamlessly integrates code understanding, generation, optimization, and other capabilities into every stage of R&D.

For example, just tell Comate to "help me sort out the architecture of my current project" and in a matter of seconds, it will provide a clear and coherent answer.

It's like an assistant that helps programmers improve the quality and efficiency of code development.

Here's an example of how Comate can help engineers take over an existing codebase.

With just a simple instruction, it can quickly understand the architecture of the entire codebase, right down to the specific implementation logic of each module.

For example, when you ask for something more nuanced and specific, "how is the core RAG logic of the project implemented", you can get a quick answer.

What's even more surprising is that the answer includes an index link that jumps directly to the referenced code.

In addition, it can automatically generate new code that meets the requirements based on the current project code as well as third-party code.

In the demonstration, Comate is given a piece of external reference code together with the API of the Qianfan ("Thousand Sails") large model platform, and is asked to generate code that calls ERNIE Bot 4.0.

Comate gives a basic code example in minutes.

Large and small models are trained together

In addition, Wang Haifeng also shared Baidu's "multi-model" technology at the conference.

Why do we need multi-models today?

In bringing large models into real applications, developers and enterprises need to weigh not only cost but also quality and efficiency.

Therefore, in practice, it is necessary to choose the model best suited to the actual deployment scenario.

On the one hand, there is an urgent need for efficient, low-cost model production.

To this end, Baidu has developed a collaborative training mechanism for large and small models, which effectively passes on knowledge and efficiently produces high-quality small models.

Small models not only have low inference costs but also respond quickly. Moreover, in some specific scenarios, a fine-tuned small model can perform comparably to a large model.

Conversely, small models can be used for contrastive enhancement to help large models complete training.

At the same time, Baidu has also built a seed model matrix, a data quality improvement and enhancement mechanism, and a series of supporting tool chains, from pre-training, fine-tuning alignment, model compression to inference deployment.

In this way, the efficient, low-cost model production mechanism can speed up application rollout, reduce deployment cost, and achieve better results.

The familiar MoE (mixture of experts) is a typical example of "multi-model" technology.

GPT-4 (reportedly), as well as the open-source Grok and Mistral's Mixtral, all use the MoE architecture.

They all achieved excellent performance in benchmarks.

Baidu believes that in the future, large-scale AI-native applications will basically adopt an MoE architecture, solving problems by mixing large and small models rather than relying on a single model.

Deciding which scenario calls for a large model and which for a small one therefore requires careful technical consideration.

On the other hand, there is multi-model inference.

Baidu has developed an end-to-end multi-model inference technology based on feedback learning. It builds an intelligent routing model and trains it with end-to-end feedback learning, so that each model is used for the tasks it handles best and the system strikes the best balance between quality, efficiency, and cost.
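The idea of routing requests between a small and a large model, and adjusting the routing from feedback, can be illustrated with a toy sketch. This is not Baidu's routing model: the `Router` class, the word-count difficulty heuristic, and the threshold update rule are all assumptions chosen for simplicity, whereas a real router would be a learned model.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Toy router: hard prompts go to the large model, easy ones to the small model."""

    threshold: float = 0.5  # difficulty above this is sent to the large model
    step: float = 0.05      # how far one piece of feedback moves the threshold

    def difficulty(self, prompt: str) -> float:
        # Stand-in heuristic: longer prompts count as harder (capped at 1.0).
        return min(len(prompt.split()) / 50.0, 1.0)

    def route(self, prompt: str) -> str:
        return "large" if self.difficulty(prompt) > self.threshold else "small"

    def feedback(self, routed_to: str, answer_was_good: bool) -> None:
        """Adjust the threshold from feedback on answers given by the small model."""
        if routed_to != "small":
            return
        if answer_was_good:
            # The small model coped, so send it slightly more traffic to save cost.
            self.threshold = min(1.0, self.threshold + self.step)
        else:
            # The small model fell short, so send more traffic to the large model.
            self.threshold = max(0.0, self.threshold - self.step)


router = Router()
short_prompt = "What is the capital of France?"
long_prompt = " ".join(["word"] * 40)
choices = (router.route(short_prompt), router.route(long_prompt))
router.feedback("small", answer_was_good=False)
```

Here the short prompt is routed to the small model and the 40-word prompt to the large one; after one piece of negative feedback on a small-model answer, the threshold drops so that more requests reach the large model.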

As Robin Li said at the conference, deriving a smaller model from the powerful Wenxin 4.0 works much better than fine-tuning an open-source model directly.

Recently, a chart showing the gap between open-source and closed-source models narrowing has gone viral online.

Many are optimistic that open-source models will soon break through their limits and approach GPT-4's capabilities, or even replace closed-source models.

In practice, open-source models are not ready to use out of the box; they require further customized fine-tuning.

This is also the reason why Baidu has released three lightweight models: ERNIE Speed, Lite, and Tiny.

A base model is compressed and distilled from Wenxin large model 4.0 and then trained on specialized data. According to Baidu, this works much better than starting from an open-source model or even training a model from scratch.
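The article does not describe how the distillation is done. As general background only, a common textbook formulation trains the small "student" model to match the large "teacher" model's temperature-softened output distribution by minimizing the KL divergence between the two. The sketch below shows that generic loss in plain Python; the function names and the example logits are illustrative assumptions and say nothing about Baidu's actual recipe.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Illustrative logits: the student matches the teacher exactly, then disagrees.
same = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
different = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal the student is trained to reduce.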

Wenxin 4.0 performance improved by 52.5%

In addition to the above, Wenxin 4.0's innovations also include a data system built on closed-loop model feedback, large model alignment technology based on self-feedback enhancement, and multimodal technology.

In the six months since its release, the performance of Wenxin 4.0 has increased by 52.5%.

Wenxin's large model can evolve this quickly and continuously thanks to Baidu's full-stack layout across chips, frameworks, models, and applications, and especially to the joint optimization of the PaddlePaddle deep learning platform and Wenxin. The weekly average effective training rate of the Wenxin model has reached 98.8%.

Compared with when Wenxin Yiyan was released a year ago, training efficiency has risen to 5.1 times its level then, and inference performance to 105 times.

Up to now, the PaddlePaddle Wenxin ecosystem has gathered 12.95 million developers and served 244,000 enterprises and institutions. Based on PaddlePaddle and Wenxin, 895,000 models have been created.

Today, Wenxin Yiyan has accumulated 200 million users, and the average daily call volume has reached 200 million.

The work, life and study of these 200 million users have been changed by Wenxin Yiyan.

The 5 million AI talent training plan completed ahead of schedule

Finally, it is worth mentioning that Baidu's plan to train 5 million AI talents has been completed ahead of schedule!

In 2020, Baidu proposed to cultivate 5 million AI talents for society as a whole within five years, and that goal has now been reached early.

Wang Haifeng said that Baidu will continue to devote itself to talent training, so that scattered points of starlight converge into a brilliant galaxy.

In the era of intelligence, everyone is a developer and a creator.

Resources:

https://create.baidu.com/?lang=i
