
Cloud + AI treats the symptoms, not the root cause; the industry needs an AI-native cloud built on "one cloud, multiple chips"

Author: Brother Bird's Notes

Source: Photon Planet

The cloud computing market has run hot and cold over the past year. Squeezed by price wars, waves of cloud repatriation, and ever more one-sided involution, cloud computing has run into a global IT slowdown. The AIGC field, however, is a different story.

Large cloud vendors are using AI as a locomotive to pull their business forward, but many attempts stop at deploying AI infrastructure and making cloud products smarter. This path greatly helps their own business processes, yet it delivers little outward, which in practice confines the industrial intelligence process to points and lines rather than spreading it across the whole surface.

In the AI era, using the cloud and developing AI on it call for new momentum, much as high-speed rail is gradually replacing traditional diesel trains.

Cloud + AI is an old green-liveried train running on legacy tracks; the AI cloud is a new high-speed rail line.

AI cloud, not cloud + AI

"Traditional cloud computing systems are still important, but they are no longer the protagonist. We need a new operating system that abstracts and encapsulates the new computing platform, namely intelligent computing, redefines human-computer interaction, and gives developers a simpler, smoother development experience," Shen Dou, executive vice president of Baidu Group and president of Baidu Intelligent Cloud Business Group, said recently.

In the past year, large models have continued to evolve and have begun to deeply reconstruct human-computer interaction. With the maturity of NLP, for example, programming in natural language is no longer out of reach. In fact, owing to application scenarios, technological development, language competition, and other factors, programming has long been under a "curse of Babel": the commonly used programming languages number only a handful, such as C, C++, Java, Python, and Go. As a result, programming has had to be process-oriented or object-oriented rather than requirement-oriented.

When programming becomes a matter of developers expressing intent, iterating the entire operating system naturally follows. Consider the underlying hardware: computing power used to be dominated by the CPU, which relies on low-latency, high-complexity operations applied to arithmetic and logic. The GPU, born for graphics rendering, relies on high-throughput, low-complexity operations and is better at processing large-scale datasets.
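The latency-versus-throughput contrast above can be shown with a small illustration of my own (simplified, not tied to any specific chip): a running sum has a serial dependency, so it rewards a low-latency core in the CPU style, while an elementwise operation has no cross-element dependency and could be spread across thousands of simple lanes in the GPU style.

```python
def running_sum(xs):
    """Serial chain: step i cannot start before step i-1 finishes."""
    total = 0
    out = []
    for x in xs:
        total += x          # each step depends on the previous result
        out.append(total)
    return out

def elementwise_square(xs):
    """Data-parallel: every element is independent of the others."""
    return [x * x for x in xs]   # each lane could run concurrently

data = [1, 2, 3, 4]
print(running_sum(data))         # [1, 3, 6, 10]
print(elementwise_square(data))  # [1, 4, 9, 16]
```

The first function is latency-bound no matter how many lanes are available; the second scales with lane count, which is why large-model workloads favor GPU-style hardware.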

In addition, the emergence of large models has compressed human knowledge, and the objects an operating system manages have quietly changed.

AI large models have gradually moved from initial framework building to the implementation stage. But as they penetrate thousands of industries, the market has begun to realize that general-purpose models, however powerful, struggle to meet personalized needs. Service providers end up as "high-tech construction crews," while users are largely left to fend for themselves when problems arise.

Over the past few years, some industries have repeatedly flip-flopped between moving onto the cloud and moving off it.

The head of the technology center at a medical institution in southern China once recalled that, because top leadership saw cloud migration as the trend, the decision to move to the cloud was made right after a meeting. They first migrated the picture archiving and communication system (PACS) to the cloud at a branch hospital with a small business volume, intending to replicate it at headquarters once cost, application, and maintenance had been verified. At the time, however, the cloud-based system repeatedly slowed down and was further dragged down by outages.

"We couldn't pinpoint the problem, so we had to call in the cloud service vendor and the PACS vendor, check the local systems, and coordinate departments across the hospital." In the end, the medical institution decided to move off the cloud and paid to migrate the data back to the old system.

The common problems facing all walks of life now seem to have an answer.

On April 16, at the Create 2024 Baidu AI Developer Conference, Shen Dou officially released a new generation of intelligent computing operating system: Wanyuan.

Wanyuan consists of three main layers: Kernel, Shell, and Toolkit. For the first time, it adds a resource beyond hardware and software: the world knowledge compressed into large models. By abstracting and encapsulating the intelligent computing platform of the AI-native era, it shields users from the complexity of cloud-native systems and heterogeneous computing power, improving the efficiency and experience of AI-native application development.

This means the operational, computing power, and language thresholds for model development and AI-native development are lowered further. More importantly, Baidu relies on it to bridge ecosystems of different development depths and the roles within them, forming a demand-driven, dynamically coupled system.

"One cloud, multiple chips" ignites a computing power revolution

Every era has an underlying carrier to support it, such as the steam engine of the Industrial Revolution, the internal combustion engine, or the CPU. By this logic, the era of pervasive intelligence opened by AIGC also needs a core carrier, and that carrier is the intelligent computing system described above.

It is important to note that intelligent computing is neither a replacement for nor a simple integration of earlier computing technologies. Rather, it is a form of computing that systematically and comprehensively optimizes existing computing methods and resources, according to the requirements of the task, to solve real-world problems.

Even though China's total computing power ranks second in the world, with nearly 30% average annual growth over the past five years, computing power anxiety still hangs over the industry: Nvidia GPUs are hard to come by, and CoreWeave, a "computing power scalper" that provides GPU hosting services to developers, has grown to 56 billion yuan in just four years.

To bridge the gap between the supply and demand of computing power and make computing power more usable, Wanyuan's "silver bullet" for intelligent computing is the Baige AI heterogeneous computing platform.

In Wanyuan's kernel layer, for computing resource management, the Baidu Baige AI heterogeneous computing platform optimizes the design, scheduling, and fault tolerance of intelligent computing clusters for tasks such as large model training and inference. On a 10,000-card cluster, Baige currently keeps the model's effective training time above 98.8%, with the linear speedup ratio and bandwidth effectiveness both reaching 95%, leading the industry in computing power performance.
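For readers unfamiliar with these metrics, the following sketch shows the common definitions behind them (my own illustrative formulas, not Baidu's published methodology): effective training time is the fraction of wall-clock time spent doing useful training, and linear speedup ratio measures how close an n-card cluster gets to ideal n-times scaling.

```python
def effective_training_ratio(useful_train_seconds, wall_clock_seconds):
    """Share of wall-clock time spent on useful training, i.e. excluding
    failures, restarts, checkpoint recovery, and stalls."""
    return useful_train_seconds / wall_clock_seconds

def linear_speedup_ratio(cluster_throughput, single_card_throughput, n_cards):
    """Measured cluster throughput divided by ideal n-times-one-card throughput."""
    return cluster_throughput / (single_card_throughput * n_cards)

# Hypothetical numbers matching the figures quoted in the text:
wall = 30 * 24 * 3600                # a 30-day training run, in seconds
useful = 0.988 * wall                # 98.8% of it spent actually training
print(round(effective_training_ratio(useful, wall), 3))   # 0.988

# Single card at 100 samples/s; a 10,000-card cluster measured at 950,000 samples/s
print(linear_speedup_ratio(950_000, 100, 10_000))         # 0.95
```

A ratio of 1.0 on either metric is the theoretical ceiling, which is why sustained values near 99% and 95% at 10,000-card scale are notable.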

Bear in mind that even among China's top large-model makers, intelligent computing clusters that achieve 90% effective training time are mostly still limited to thousand-card scale. Beyond that, Baige's more critical breakthrough lies in the strong performance of "one cloud, multiple chips" in model training scenarios, which can be said to fundamentally ease computing power anxiety.

At present, Baige is compatible with mainstream AI chips at home and abroad, such as Kunlunxin, Ascend, Haiguang DCU, Nvidia, and Intel, and lets users complete computing power adaptation at minimal cost.

In the past, each training task in an intelligent computing cluster was typically served by a single vendor's chips. Splitting computing power across chips from different vendors, communication efficiency between those chips, and the resulting training efficiency were all problems that, under hardware differences, amplified computing power anxiety without limit.

Under Baige's intelligent scheduling, however, mixed training with chips from different vendors within a single task has become reality, with performance loss no more than 3% at hundred-card scale and no more than 5% at thousand-card scale. Baidu says the Baige platform can mask hardware differences to the greatest extent, helping users break their dependence on any single chip, achieve better costs, and build a more resilient supply chain.
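One intuition behind mixed-chip training can be sketched as follows (a toy illustration of my own, not Baige's actual implementation): split each global batch in proportion to each chip pool's measured throughput, so that fast and slow chips finish a synchronous step at roughly the same time and nobody waits on a straggler.

```python
def split_batch(global_batch, throughputs):
    """Assign per-pool micro-batch sizes proportional to throughput
    (samples/s). Any rounding remainder goes to the fastest pool."""
    total = sum(throughputs.values())
    shares = {name: int(global_batch * t / total) for name, t in throughputs.items()}
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += global_batch - sum(shares.values())
    return shares

def step_time(shares, throughputs):
    """A synchronous data-parallel step is gated by the slowest pool."""
    return max(shares[name] / throughputs[name] for name in shares)

# Hypothetical pools: vendor A's chips at 1200 samples/s, vendor B's at 800 samples/s.
pools = {"vendor_A": 1200.0, "vendor_B": 800.0}
shares = split_batch(4096, pools)
print(shares)                              # proportional split, sums to 4096
print(round(step_time(shares, pools), 3))  # both pools finish nearly together
```

In practice the hard parts are elsewhere (cross-vendor communication libraries, operator compatibility, fault tolerance), which is why the low loss figures quoted above are considered difficult to achieve.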

Faced with this industry-leading breakthrough, the mood inside Baidu is elation tinged with exasperation. "With such a small performance loss, basically no vendor in the industry can complete single-task training across a mix of chips; even some of our users can't believe it's real."

When hardware differences are smoothed out as much as possible at the compute-scheduling level, the cost and entry threshold of model training fall, and the mismatch between computing power demand and supply can even be addressed directly. Without fundamental changes to the hardware itself, Baidu's underlying technology has fired the starting gun of the intelligent computing power revolution.

Teaching people to fish: reshaping the development ecosystem

Through its ability to manage intelligence, Wanyuan has become the bridge by which Baidu links computing power efficiency with application innovation: the Kernel, Shell, and Toolkit layers interconnect efficiently, forming an end-to-end performance optimization loop.

Besides the Baige AI heterogeneous computing platform, the kernel layer includes Wenxin models of different sizes and third-party models. The shell layer is ModelBuilder, which handles model management, scheduling, and secondary development. The toolkit layer comprises the development platforms for concrete applications, namely AgentBuilder and AppBuilder.

Industry-wide, most model service providers rolled out development tools reaching the C-end around 2024, including packaged applications for consumer users, AI-native development tools for developers, and finely customized MaaS for enterprises. Baidu's layering by three depths of development (infrastructure, models, and AI-native application building) is not uncommon in the industry.

What is commendable is that, through Wanyuan, Baidu has gone further and merged these mutually isolated small systems into one large system. Within this closed loop, users, developers, and enterprises can share the ecosystem's computing resources and model capabilities for efficient development.

Users and developers can build targeted AI-native applications such as agents in natural language, with minimal computing resources. Take the "Singapore Tourism Board" agent Baidu CEO Robin Li demonstrated at the conference: by adding knowledge base content on top of a directly generated basic agent, it became a seasoned, dedicated "backpacker" guide in a matter of minutes.
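The pattern described here, specializing a generic base agent by attaching a knowledge base, can be sketched in a few lines. This is a toy illustration of my own and does not use the real AgentBuilder API; the class, method names, and naive substring retrieval are all assumptions for demonstration.

```python
class BaseAgent:
    """A minimal stand-in for a directly generated basic agent."""

    def __init__(self, persona):
        self.persona = persona
        self.knowledge = []          # list of (topic, fact) pairs

    def add_knowledge(self, topic, fact):
        """Attach one knowledge base entry to the agent."""
        self.knowledge.append((topic, fact))

    def answer(self, question):
        # Naive retrieval: return facts whose topic appears in the question.
        hits = [fact for topic, fact in self.knowledge if topic in question.lower()]
        if hits:
            return f"[{self.persona}] " + " ".join(hits)
        return f"[{self.persona}] I don't have that in my knowledge base yet."

# Specialize a generic travel agent into a "Singapore backpacker" guide.
agent = BaseAgent("Singapore backpacker guide")
agent.add_knowledge("hawker", "Hawker centres offer cheap local food.")
agent.add_knowledge("mrt", "The MRT is the fastest way around the city.")

print(agent.answer("Where should I eat? Any hawker tips?"))
```

Real agent platforms replace the substring lookup with embedding-based retrieval and route the final answer through a large model, but the division of labor (base agent plus attached knowledge) is the same.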

As Shen Dou put it, "As large model technology keeps evolving, programming in natural language is becoming a reality. Programming will no longer be process-oriented or object-oriented, but requirement-oriented." Built on Wanyuan, Baidu gives the industry a development tool and distribution platform that reaches down to the "editor" level, driving AI-native applications toward the next stage.

The flywheel of device-cloud collaboration

The initial buildout of a development ecosystem is only the first step toward the next imagination space; a carrier is still needed to demonstrate the value of that development.

When the "iPhone moment" opened the smartphone era, the first carrier to show outsized value was mobile gaming, represented by Angry Birds. For AIGC, the value of AI-native development lies in device-cloud collaboration.

A cursory glance at today's mobile phone industry shows that smartphone + AI has become the core strategy of major phone makers. IDC expects global shipments of next-generation AI phones to exceed 170 million units in 2024, about 15% of overall smartphone shipments, while Counterpoint expects shipments to reach 522 million units in 2027, a penetration rate of 40%.

On the other hand, AI's reach also requires the carrier closest to the user. Whether for securely reading personal data or for further analyzing and organizing human behavior and instructions, the phone is undoubtedly an excellent choice at present.

In fact, Baidu had anchored the value realization target of the Wanyuan system well before this conference. On January 10, 2024, at the HONOR MagicOS 8.0 launch and developer conference, Zhao Ming, CEO of HONOR Device Co., Ltd., announced the "100 Model Ecosystem Plan" and, jointly with Shen Dou, announced that Baidu Intelligent Cloud had become a strategic partner of HONOR's large model ecosystem.

As early as November 2017, when on-device AI had yet to take shape, Honor released the Honor View 10 with AI applications and the Kirin 970 processor; more recently it has committed 10 billion yuan in real money and more than 2,000 related patents, declaring its firm determination to develop on-device AI. More importantly, Honor is a leading player on the smartphone track: per IDC data, in the fourth quarter of last year Honor ranked first in the Android camp with a 16.8% share.

Judging from previously disclosed details of the cooperation, unlike the industry's simple upgrades such as photography, real-time call translation, and intelligent search, the two companies are using MagicOS to carry out a paradigm innovation in device-cloud collaboration: the on-device Honor Magic model is responsible for understanding user intent, turning the user's simple prompt into a more professional prompt in the background, after which the cloud-side Wenxin model provides professional services such as knowledge Q&A and life suggestions.

For example, given everyday assistant requests such as "help me arrange a schedule" or "set me a workout plan," the Magic model analyzes the user's travel, health, and other usage data and generates a preliminary prompt, then dispatches the Wenxin model to produce a sufficiently comprehensive plan. Throughout this process, the Magic model filters out sensitive information through an on-device protection net, ensuring that personal privacy never reaches the cloud and addressing users' unspoken concerns about personal data.
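The device-cloud split described above can be sketched as follows (my own toy illustration, not Honor's or Baidu's actual implementation): the on-device step enriches the prompt from local data and redacts sensitive fields before anything leaves the phone, so only a sanitized prompt reaches the cloud model. The field names and redaction policy are assumptions for demonstration.

```python
SENSITIVE_KEYS = {"name", "phone", "address"}   # assumed redaction policy

def on_device_rewrite(user_prompt, local_context):
    """Enrich the prompt with local signals, then strip sensitive fields
    so they never leave the device."""
    safe_context = {k: v for k, v in local_context.items() if k not in SENSITIVE_KEYS}
    details = "; ".join(f"{k}={v}" for k, v in sorted(safe_context.items()))
    return f"{user_prompt} (context: {details})"

def cloud_model(prompt):
    """Stand-in for the cloud-side large model call."""
    return f"PLAN based on -> {prompt}"

local = {"name": "Alice", "phone": "123", "steps_today": 3000, "free_evenings": "Mon,Wed"}
prompt = on_device_rewrite("help me set a workout plan", local)
assert "Alice" not in prompt and "123" not in prompt   # privacy stays on device
print(cloud_model(prompt))
```

The design point is that redaction happens before the network boundary: the cloud model sees useful behavioral signals (step counts, free evenings) but never the identifying fields.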

This, however, is only an initial application of AI under the device-cloud collaboration paradigm. A step further lies a personal knowledge base built from the device-side data users upload themselves, which can raise personal productivity and extend the reach of brainpower through a very short link.

At a time when composite metrics such as security, explainability, and ease of use have long become key to the AI arms race, opening the black box of large models to public awareness will undoubtedly decide which models enter the next cycle of "innovation diffusion." From the developer's perspective, the black box lies in the algorithm and training process; for users, the invisible black box is the ease of use and generality of the model's capabilities.

Facing the "unboxing" needs of different roles, customized products keep emerging. But only Baidu has pushed down to the bottom layer first, greatly lowering the development threshold through intelligent computing capabilities and integrating relatively isolated development and feedback systems. Being first to establish the system often means a commercial moat, especially on the to-B track.

With AI and cloud deeply integrated and device-cloud collaboration deeply coupled, it becomes easier to see the confidence behind Robin Li's internal speech insisting on the closed-source route. Baidu, racing to stake out intelligent computing, has once again held its lead in the AI arms race.