
The explosion of AI-native applications is imminent, and China needs a "one cloud, many chips" operating system

Author: Silicon Star Man

Programming is a creative process: through a specific programming language, we give a computer instructions to perform the tasks we want. From the software and hardware we use daily for work, study, and entertainment to the development of the digital economy, programming is not only the foundation of the Internet but also reaches into every corner of human life.

The origins of programming can be traced back to the 19th century. The Analytical Engine designed by the British mathematician Charles Babbage is considered the first design for a programmable computer.

Its design used punch cards for data input and output: different positions on a card represented different instructions and values, which could be combined and arranged to achieve different functions. In Babbage's time, mastery of this "programming" technique was limited to Babbage and a few of his direct collaborators.

As computer technology has developed, programming languages have been continually updated, from the earliest machine languages, to assembly languages, to today's high-level languages. The history of programming is the history of communication between humans and computers: each generation of programming languages has given people more efficient and powerful tools, and enabled more people to create with the help of programming.

While the cost of learning new programming languages has fallen considerably, it still takes people without a computer science background a great deal of time to reach a level of proficiency sufficient to solve real-world problems.

The development of generative AI is changing this: we have begun to communicate with large language models through prompts, guiding them to perform all kinds of tasks.

For the first time, natural language has become a programming language, albeit a very young one. We have begun to imagine that in the near future we will move away from cumbersome, "rigid" programming languages toward natural-language programming.

Natural language replacing programming languages does not mean the end of the intellectual labor that programming represents, nor the disappearance of programmers; rather, it is a liberation of human creativity.

This is what Baidu founder Robin Li and NVIDIA founder Jensen Huang have both emphasized: in the future, everyone can be a programmer.

But this shift will not happen overnight, and one key driver of advances in programming technology has been the evolution of operating systems, from early hardware control to software programming. Stand-alone operating systems such as Linux focused on solving hardware-software compatibility and providing an interface for developers.

As software development moved from the stand-alone era to the cloud computing era, the cloud data-center operating system emerged as a new kind of architecture: it manages vast numbers of hardware devices and processes, so developers no longer need to care about processes on any single machine.

Since then, the rise of AI has intertwined the three once-parallel lines of cloud computing, AI technology, and application development.

Cloud computing is entering the AI-native era, and the kernel of the computing operating system that supports AI needs a fundamental change as well. Traditional CPU-centric computing has given way to a GPU-driven computing structure; the kernel now also incorporates the world knowledge compressed into large models; and the focus of management has shifted from processes and microservices to fine-grained control of intelligence itself.

AI is moving from an invisible, intangible underlying technology to a tool-like, ubiquitous, industrialized application, and a new era requires a new operating system.

Against this backdrop, at the 2024 Baidu Create AI Developer Conference, Shen Dou, Executive Vice President of Baidu Group and President of Baidu Intelligent Cloud Business Group, officially released a new generation of intelligent computing operating system: Wanyuan.

AI-native application development made simple: delivering the "ultimate experience" with the "optimal solution"

Architecturally, Wanyuan consists of three layers: Kernel, Shell, and Toolkit.

The bottom layer abstracts and encapsulates the intelligent computing platform of the AI-native era, shielding developers from the complexity of cloud-native systems and heterogeneous computing power. The upper layers provide a set of reusable, extensible tools, services, and frameworks that support the agile development of AI-native applications.
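As a rough illustration of this layering, consider the sketch below. Every class and method name here is hypothetical, invented for the example rather than taken from Wanyuan's actual interfaces; the point is only how each layer hides the one beneath it.

```python
# Illustrative three-layer stack (Kernel / Shell / Toolkit).
# All names are hypothetical, not Wanyuan's real API.

class Kernel:
    """Abstracts heterogeneous compute and hosts foundation models."""
    def run_inference(self, model: str, prompt: str) -> str:
        # A real kernel would schedule GPUs/accelerators here.
        return f"[{model}] response to: {prompt}"

class Shell:
    """Wraps the kernel with model management and selection."""
    def __init__(self, kernel: Kernel):
        self.kernel = kernel
    def call(self, prompt: str, model: str = "ernie-lite") -> str:
        return self.kernel.run_inference(model, prompt)

class Toolkit:
    """Developer-facing tools built on the shell (e.g. app builders)."""
    def __init__(self, shell: Shell):
        self.shell = shell
    def build_app(self, prompt: str) -> str:
        # App developers touch only this layer.
        return self.shell.call(prompt)

app = Toolkit(Shell(Kernel()))
print(app.build_app("summarize this document"))
```

The design choice the layering illustrates: an application developer calls only the Toolkit, never the scheduler or the hardware underneath.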


From a developer's point of view, Wanyuan is like an "intelligent computing power plant" combined with a "super factory for AI-native applications". At the lower level it acts as the power plant: AI application development drives up demand for computing power, and not everyone can build their own. At the upper level, Wanyuan plays the role of the application factory, providing one-stop platform support. Like the "brain" of a factory, it commands and dispatches resources, coordinates the production process, and ensures product quality, so that developers can focus on creativity and on building the product itself.

Specifically, a huge training task requires the cooperation of computing clusters composed of large numbers of GPU servers. At Wanyuan's kernel layer, the Baidu Baige AI heterogeneous computing platform is specially optimized for large models and intelligent computing clusters: effective training time on a ten-thousand-card cluster exceeds 98.8%, and the linear speedup ratio and bandwidth effectiveness both reach 95%, leading the industry in compute efficiency.

The proportion of effective training time has risen from 95% last year to more than 98.8%, meaning almost all wall-clock time goes to actual training rather than idling or waiting for resources. The linear speedup ratio and bandwidth effectiveness both reaching 95% means there is almost no waste in data transmission, which is critical when processing large amounts of data. Like an efficient conveyor belt where almost every inch of space is used, there is no idleness and no waste, ensuring users get more efficient service while saving on overhead.
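These two metrics can be made concrete with their standard definitions. The sketch below plugs in the article's figures; the formulas are the usual textbook definitions, not Baidu's internal methodology.

```python
# Back-of-envelope check of the cluster-efficiency metrics above.
# Formulas are standard definitions; numbers come from the article.

def effective_training_ratio(training_hours: float, wall_clock_hours: float) -> float:
    """Fraction of wall-clock time spent actually training (not idle/waiting)."""
    return training_hours / wall_clock_hours

def linear_speedup_ratio(throughput_n: float, throughput_1: float, n: int) -> float:
    """Measured N-GPU throughput versus the ideal of N times one GPU."""
    return throughput_n / (n * throughput_1)

# e.g. 98.8 productive hours out of every 100 wall-clock hours
print(round(effective_training_ratio(98.8, 100.0), 3))   # 0.988

# e.g. 1000 GPUs delivering 950x a single GPU's throughput
print(round(linear_speedup_ratio(950.0, 1.0, 1000), 2))  # 0.95
```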

In terms of computing power, China faces a special situation: uncertainty in chip supply, which inevitably leads to the coexistence of multiple kinds of chips.

A heterogeneous computing platform, by definition, contains more than one form of computing power. Baige is compatible with mainstream AI chips at home and abroad, such as Kunlunxin, Ascend, Hygon DCU, NVIDIA, and Intel, and supports users in completing computing power adaptation at minimal cost.

By shielding hardware differences to the greatest extent, this broad compatibility means Wanyuan can draw on a variety of hardware resources on the market, freeing users from dependence on any single chip and providing more choice and flexibility, without having to redesign and re-optimize models for each chip platform, which saves substantial time and resources. Especially amid possible fluctuations in the global supply chain, it avoids the impact of single-vendor supply problems on project schedules.


Using chips from multiple vendors in AI model training, especially within a single training task, has long been a major industry challenge, with two main issues:

Evenly slicing computing power: the system must intelligently distribute tasks so that chips from different vendors contribute computing power fairly during training and each chip delivers its maximum performance.

Communication efficiency between chips: chips from different vendors may use different communication protocols and optimization techniques; optimizing data exchange and synchronization between them is the key to maintaining training efficiency and reducing latency.

Baige enables mixed training with chips from different vendors within a single training task, with a performance loss of no more than 3% at the hundred-card scale and no more than 5% at the thousand-card scale, which leads the industry.
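One common way to approach the first problem, sketched here purely as an illustration: partition each global batch in proportion to each chip's measured throughput, so faster chips receive more work and all chips finish a step at roughly the same time. The vendor names and throughput figures are placeholders, not Baige's actual scheduler.

```python
# Throughput-proportional batch partitioning across heterogeneous chips.
# Vendor names and throughputs are illustrative placeholders.

def partition_batch(global_batch: int, throughputs: dict[str, float]) -> dict[str, int]:
    """Split a global batch so each chip's share matches its relative speed."""
    total = sum(throughputs.values())
    shares = {chip: int(global_batch * t / total) for chip, t in throughputs.items()}
    # Hand any rounding remainder to the fastest chip.
    remainder = global_batch - sum(shares.values())
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += remainder
    return shares

shares = partition_batch(1024, {"vendor_a": 300.0, "vendor_b": 200.0, "vendor_c": 12.0})
print(shares)                 # larger share goes to the higher-throughput chip
print(sum(shares.values()))   # 1024 — nothing lost to rounding
```

The second problem (cross-vendor communication) has no equally simple sketch: it typically requires a common collective-communication abstraction over each vendor's library, which is beyond a few lines.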

Another important component of the Wanyuan kernel is the large model. Large models efficiently compress vast amounts of world knowledge and encapsulate natural-language understanding, generation, logic, and memory capabilities. At present, the Wanyuan kernel includes not only the industry-leading ERNIE 4.0 and ERNIE 3.5 large language models, but also lightweight models such as ERNIE Speed/Lite/Tiny, the Wenxin vision large models, and third-party models with different characteristics, meeting users' diverse needs across business scenarios.

ModelBuilder at the shell layer handles model management, scheduling, and secondary development for the kernel, shielding the complexity of model development and helping more people quickly fine-tune models suited to their own business with only a small amount of data, resources, and effort.

At the same time, in practical applications, the model routing service provided by ModelBuilder automatically selects a model with the appropriate parameter size for tasks of different difficulty, giving an optimal model combination that balances effect and cost. It is estimated that, with essentially the same model effect, model routing reduces inference cost by as much as 30% on average.
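The routing idea can be sketched in a few lines: score each request's difficulty, then pick the cheapest model whose capability covers it. The model names echo the ERNIE tiers mentioned above, but the thresholds, costs, and routing rule are all hypothetical, not ModelBuilder's actual logic.

```python
# Toy model router: cheapest model whose capability covers the task.
# Thresholds and relative costs are invented for illustration.

MODELS = [
    # (name, capability ceiling, relative cost per call)
    ("ernie-tiny",  0.3, 1),
    ("ernie-speed", 0.6, 4),
    ("ernie-4.0",   1.0, 20),
]

def route(difficulty: float) -> str:
    """Assume difficulty in [0, 1] comes from an upstream scorer."""
    for name, capability, _cost in MODELS:
        if difficulty <= capability:
            return name
    return MODELS[-1][0]  # fall back to the largest model

print(route(0.1))   # ernie-tiny
print(route(0.5))   # ernie-speed
print(route(0.9))   # ernie-4.0
```

The cost saving comes from the fact that most real traffic is easy: if, say, 70% of requests can be served by the 1x-cost model instead of the 20x-cost one, the average bill drops sharply at roughly the same quality.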

Beyond directly calling the APIs of Wanyuan's built-in large models, Qianfan AppBuilder and AgentBuilder together form a tool layer on top of the shell, providing developers with powerful AI-native application development capabilities.

In application-development practice, AI-native development has gradually shifted from human programming to prompt-based techniques, agent design, and even multi-agent systems. AgentBuilder offers developers two low-cost agent development modes, zero-code and low-code, genuinely helping agents become a key lever in the new round of the AI revolution.

For AI applications involving complex algorithms and heavy data processing, AppBuilder provides workflow orchestration: developers can use preset templates and components to easily customize their own business processes, integrate and extend their own specialized components, select suitable models at different nodes, and implement business logic through flexible orchestration.

This is like providing ready-made wheels for AI-native application development, so developers can get their applications up and running quickly without starting from scratch.
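A toy illustration of such orchestration: chain preset "components" into a pipeline and pass the payload through each node in turn. The component names are invented for the example, and AppBuilder's real interface will differ; the sketch only shows the shape of the idea.

```python
# Minimal workflow orchestration: each node is a function, and a
# workflow is just an ordered list of nodes. Names are illustrative.

from typing import Callable

def retrieve(query: str) -> str:
    """Stand-in for a retrieval component."""
    return f"docs for '{query}'"

def summarize(text: str) -> str:
    """Stand-in for a summarization node (could use a lightweight model)."""
    return f"summary of ({text})"

def answer(summary: str) -> str:
    """Stand-in for a final answering node (could use a larger model)."""
    return f"answer based on {summary}"

def run_workflow(steps: list[Callable[[str], str]], payload: str) -> str:
    for step in steps:
        payload = step(payload)
    return payload

print(run_workflow([retrieve, summarize, answer], "quarterly sales"))
```

Because nodes are interchangeable, a developer can swap the model behind any single node (say, a cheaper summarizer) without touching the rest of the flow, which is the point of orchestration.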

Overall, Baidu Intelligent Cloud's new-generation intelligent computing operating system Wanyuan builds a comprehensive architecture that fundamentally supports the entire life cycle of AI application development and deployment. Its bottom-up coverage ensures that everything, from basic cloud infrastructure and high-performance hardware to top-level application development and deployment, is fully supported and optimized.

This pursuit of "systems engineering" taken to the extreme ensures that no single link becomes the bottleneck of the whole, significantly improving developer efficiency and reducing costs, and realizing a "multiplier effect" through end-to-end optimization.

As an overall operating system, Wanyuan provides full-stack services for the AI era. Developers can build their own AI-native applications directly through AppBuilder and AgentBuilder, and customers can choose services at different levels according to their needs: for example, enterprises can build exclusive vertical-industry operating systems on top of Wanyuan, and Wanyuan also supports private deployment in customers' own intelligent computing centers, providing stable, secure, and efficient intelligent computing platform services.

In the future, Wanyuan will further open up ecological cooperation to provide application developers with more capabilities and interfaces, adapt to more heterogeneous chips from more manufacturers and maximize their efficiency.

MoE architecture + device-cloud collaboration: Only by taking into account user experience and implementation cost can AI be inclusive

Wanyuan is the culmination of Baidu's cloud computing and AI technology, and an active embodiment of the concept of AI inclusiveness.

Through a Mixture of Experts (MoE) architecture, users can flexibly select and combine different AI models according to their actual needs and resources, for example pairing the Wenxin foundation model with many specialized models. The breadth and depth of intelligence is guaranteed by the underlying large model, while the specialized models handle specific tasks for optimal performance and cost-effectiveness. This not only lowers the threshold for technological innovation but also offers a way to cope with limited computing resources.

The cooperation between Baidu Intelligent Cloud and Honor on MagicOS is a case in point, with the two parties adopting an MoE architecture that combines the "Wenxin Model" and Honor's "Magic Model".

In terms of deployment, a "device-cloud collaboration" approach is adopted. The cloud-side Wenxin model is good at handling complex problems and meeting users' deeper needs, while the device-side model is closer to the user and better understands user intent. The device-side model acts like a butler: it figures out what the user wants, decomposes the task into a series of subtasks, and sends them to the cloud-side large model, which acts as a think tank. By calling cloud resources and coordinating different small services, the complex task the user wanted is finally completed.
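The division of labor described above can be sketched as follows, assuming a hypothetical on-device intent decomposer; every function name here is illustrative, not MagicOS's or Wenxin's actual interface.

```python
# Device-cloud collaboration sketch: the device model decomposes the
# user's intent, and only heavyweight subtasks go to the cloud model.
# All names and the sample decomposition are invented for illustration.

def device_decompose(intent: str) -> list[tuple[str, bool]]:
    """On-device model: split an intent into (subtask, needs_cloud) pairs."""
    if intent == "plan a weekend trip":
        return [("check calendar", False),   # local: private, simple
                ("draft itinerary", True),    # cloud: complex reasoning
                ("set reminders", False)]     # local: device capability
    return [(intent, True)]  # unknown intents default to the cloud

def cloud_solve(subtask: str) -> str:
    return f"cloud result: {subtask}"

def device_solve(subtask: str) -> str:
    return f"local result: {subtask}"

def handle(intent: str) -> list[str]:
    return [cloud_solve(t) if heavy else device_solve(t)
            for t, heavy in device_decompose(intent)]

for line in handle("plan a weekend trip"):
    print(line)
```

The design intuition: keeping simple, privacy-sensitive subtasks on the device reduces latency and data exposure, while the cloud model is reserved for the steps that genuinely need its capacity.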

The next generation of computing devices has yet to appear, and if mobile phones are to become the "entrance to AI", they must combine large models with the terminal. Only when device and cloud capabilities complement and combine with each other can users truly use large models and enjoy a native AI experience.

While the outside world is still speculating about which vendor Apple will partner with for the China-market version of its on-device AI, the device-cloud collaboration that Baidu Intelligent Cloud and Honor have pioneered in MagicOS has set an excellent example for the industry.

Recently, UBTECH's humanoid robot was connected to Baidu's Wenxin model through the Baidu AppBuilder platform for task-scheduling application development, showing the great potential of combining device-cloud collaboration with embodied intelligence and allowing robots to integrate action with cognition.

In the future, we can expect Baidu Intelligent Cloud to further promote the penetration and innovation of AI model applications in various industries, so that generative AI can truly become inclusive and enter everyone's life.

Wanyuan is the inevitability of the times and the inevitability of Baidu

Why is Baidu doing this? This is the inevitability of the times, and it is also the inevitability of Baidu.

Baidu has long been deeply engaged in AI technology and leads China in large models and multimodality, so Wanyuan can fully integrate and deliver these cutting-edge technologies. Baidu also has massive data and computing resources of its own, providing strong infrastructure support for Wanyuan.

The birth of Wanyuan is in the same vein as Baidu Intelligent Cloud's "cloud-intelligence integration" strategy, which can be understood from two angles: "AI + cloud" means using AI technology to reinvent cloud computing, while "cloud + AI" means using cloud computing as the platform for delivering AI technology and products, lowering the threshold for creating and using AI applications.

Through the deep integration of AI technology and cloud computing services, Wanyuan upgrades and rebuilds the stack end to end, from the underlying infrastructure through large-model development and application to AI-native application development, welcoming the AI-native era as a complete AI-native operating system.

From "software is eating the world" to "AI is eating software", it is no exaggeration to say that whoever wins over AI developers wins the world, and Wanyuan is the product of Baidu seizing this historic opportunity.

Like a thousand sails setting out the moment they meet water, Wanyuan has become the foundation for the emergence of intelligence and the rooting of applications in the AI-native era, while also bringing Baidu Intelligent Cloud new growth points in the AI era. In this sense, Wanyuan is not only the source of a flourishing AI ecosystem, but also the source of Baidu's new growth.

We are living through a computing revolution that is rare in a lifetime, one where "the future is already here, just unevenly distributed", and Baidu's long-term investment in AI is bridging that gap. Wanyuan was born for the AI era, and Baidu in the AI era has found a dynamic growth path at just the right time.
