
Huawei releases Pangu model 3.0, "no time to write poetry"

Author: Observer.com

(Editor: Lv Dong)

On July 7, with the 6th World Artificial Intelligence Conference (WAIC) in full swing in Shanghai, large models were the conference's undisputed keyword. Just as the large models of major vendors were "competing for beauty" in chat, painting, and poetry, Huawei launched a large model that "does not write poetry."

"The Pangu model does not write poetry, nor does it have time to write poetry, because it has to go deep into all walks of life and let AI give value to all walks of life." On the afternoon of July 7, at the Huawei Developer Conference 2023 (Cloud), HUAWEI CLOUD CEO Zhang Pingan said.

Zhang Pingan announced at the conference that Pangu Model 3.0 was officially released, a series of large models designed entirely for industry use.


Zhang Pingan, CEO of HUAWEI CLOUD

According to Zhang, Pangu Model 3.0 adopts a three-layer "5+N+X" architecture:

The L0 layer includes five foundation models covering natural language, vision, multimodality, prediction, and scientific computing, providing a range of capabilities for industry scenarios. Pangu 3.0 offers customers a series of foundation models with 10 billion, 38 billion, 71 billion, and 100 billion parameters, matching customers' diverse requirements across scenarios, latency budgets, and response speeds. It also provides a new capability set, including knowledge Q&A, copywriting generation, code generation, and the image generation and image understanding capabilities of the multimodal large models, which customers and partner companies can call directly.

The L1 layer consists of N industry-specific large models. HUAWEI CLOUD can provide industry models trained on open industry data, covering fields such as government affairs, finance, manufacturing, mining, and meteorology. Customers can also train their own proprietary large models on top of the L0 and L1 layers of the Pangu model using their own industry data.

The L2 layer provides customers with finer-grained scenario models, focused on specific industry applications or business scenarios such as government hotlines, network assistants, lead-drug screening, conveyor-belt foreign-object detection, and typhoon path prediction, delivered as "out-of-the-box" model services.


Screenshot of Huawei Developer Conference 2023 (Cloud) video

Huawei disclosed that the Pangu model adopts a fully layered and decoupled design, allowing it to adapt quickly to changing industry needs. Customers can load separate datasets for their large models, upgrade the foundation model independently, or upgrade the capability set independently. On top of the L0 and L1 models, HUAWEI CLOUD also provides an industry development kit that lets customers build their own exclusive industry models through secondary training on their own data. According to customers' differing data security and compliance requirements, the Pangu model also offers diversified deployment options: public cloud, dedicated large-model cloud zones, and hybrid cloud.
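To make the layered, decoupled idea concrete, the following is a purely hypothetical Python sketch of how a customer workflow across the three layers might be organized. None of the names below (`PanguModel`, `load_foundation_model`, `fine_tune`, the dataset strings) are real Huawei Cloud or ModelArts APIs; they only illustrate the L0 → L1 → L2 progression described above.

```python
# Purely illustrative sketch of the "5+N+X" layered idea; the names below are
# hypothetical and do NOT correspond to any real Huawei Cloud / ModelArts API.
from dataclasses import dataclass

@dataclass
class PanguModel:
    layer: str        # "L0" foundation, "L1" industry, "L2" scenario
    name: str
    parameters: str   # e.g. "38B", "71B"

def load_foundation_model(modality: str, size: str) -> PanguModel:
    """L0: pick a foundation model (NLP, vision, multimodal, prediction, sci-compute)."""
    return PanguModel("L0", f"pangu-{modality}", size)

def fine_tune(base: PanguModel, layer: str, name: str, dataset: str) -> PanguModel:
    """Secondary training on industry or customer data, as the development kit allows."""
    print(f"fine-tuning {base.name} ({base.parameters}) on {dataset} -> {layer}/{name}")
    return PanguModel(layer, name, base.parameters)

# L0 foundation model -> L1 industry model -> L2 scenario model
l0 = load_foundation_model("nlp", "38B")
l1 = fine_tune(l0, "L1", "pangu-finance", dataset="open_industry_corpus")
l2 = fine_tune(l1, "L2", "gov-hotline-assistant", dataset="customer_private_data")
```

The point of the decoupling is visible in the sketch: each layer can be retrained or swapped without rebuilding the layers above or below it.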

"Pangu is born for the industry, we must think about the industry", Zhang Pingan said, now Pangu model has played a huge value in finance, finance, manufacturing, pharmaceutical research and development, coal mining, railway and many other industries.

"Everyone knows that everyone else can use the most mature GPU and the most mature software in the industry, but Huawei can't, so Huawei can only rely on the root technology of AI that we have built." He said.

Zhang Pingan revealed that, at the bottom layer, Huawei has built an AI computing power cloud platform based on Kunpeng and Ascend, together with the CANN heterogeneous computing architecture, the all-scenario AI framework MindSpore, and the ModelArts AI development pipeline, providing key capabilities for developing and running large models, such as distributed parallel acceleration, operator and compilation optimization, and cluster-level communication optimization.
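For readers unfamiliar with MindSpore, here is a minimal sketch of what defining and running a tiny network on the framework can look like. The `device_target` value depends on the installed backend ("Ascend", "GPU", or "CPU"), and the toy `TinyNet` model is purely illustrative, unrelated to the Pangu models themselves.

```python
# Minimal MindSpore sketch (illustrative only): define and run a tiny network.
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

ms.set_context(device_target="CPU")  # use "Ascend" on Ascend hardware

class TinyNet(nn.Cell):
    """A toy two-layer network, not related to the Pangu models."""
    def __init__(self):
        super().__init__()
        self.dense1 = nn.Dense(16, 32)
        self.relu = nn.ReLU()
        self.dense2 = nn.Dense(32, 4)

    def construct(self, x):
        return self.dense2(self.relu(self.dense1(x)))

net = TinyNet()
x = Tensor(np.random.randn(2, 16).astype(np.float32))
print(net(x).shape)  # (2, 4)
```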

"Based on Huawei's AI root technology, the training efficiency of large models can be tuned to 1.1 times that of mainstream GPUs in the industry." He said.


Screenshot of Huawei Developer Conference 2023 (Cloud) video

Computing power is the basis for training large models.

At the conference, Zhang Pingan announced that the Ascend AI cloud service, offering 2,000 PFLOPS of computing power per cluster, will launch simultaneously in HUAWEI CLOUD's Ulanqab and Gui'an AI computing power centers. In addition to Huawei's all-scenario AI framework MindSpore, the Ascend AI cloud service supports mainstream AI frameworks such as PyTorch and TensorFlow, and 90% of the operators in these frameworks can be smoothly migrated to the Ascend platform using Huawei's end-to-end migration tools. For example, Meitu migrated 70 models to Ascend in just 30 days, and the HUAWEI CLOUD and Meitu teams jointly optimized more than 30 operators and parallel acceleration workflows, improving AI performance by 30% over the original solution.
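As an illustration of what migrating a PyTorch workload to Ascend typically involves, here is a minimal sketch that assumes the torch_npu Ascend adapter package is installed. Exact package and device names may differ by CANN and driver version; this is not taken from Huawei's announcement and is not their migration tooling.

```python
# Minimal sketch: running an existing PyTorch model on an Ascend NPU.
# Assumes the Ascend PyTorch adapter (torch_npu) is installed; package and
# device naming may vary by version -- this is illustrative, not official.
import torch
import torch_npu  # registers the "npu" device with PyTorch (assumption)

device = torch.device("npu:0" if torch.npu.is_available() else "cpu")

model = torch.nn.Linear(128, 10).to(device)   # unchanged PyTorch model code
x = torch.randn(4, 128, device=device)
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([4, 10])
```

In this pattern the model definition itself stays in plain PyTorch; only the device selection changes, which is why most operators can be carried over without rewriting.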

In addition, GPU failures are common during large-model training, forcing developers to restart training frequently, which is time-consuming and costly. The Ascend AI cloud service can provide longer and more stable AI computing power, with a 90% 30-day stability rate for thousand-card (1,000-accelerator) training jobs and a breakpoint recovery time of no more than 10 minutes.
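"Breakpoint recovery" refers to the general checkpoint-and-resume pattern: training state is saved periodically so that a failed job can resume from the last checkpoint instead of starting from scratch. The sketch below shows that generic pattern in PyTorch; it illustrates the mechanism only and is not Huawei's implementation, and the checkpoint path is hypothetical.

```python
# Generic checkpoint-and-resume pattern (illustrative, not Huawei's code).
import os
import torch

CKPT = "checkpoint.pt"  # hypothetical checkpoint path

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
start_step = 0

# Resume from the last checkpoint if a previous run was interrupted.
if os.path.exists(CKPT):
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_step = state["step"] + 1

for step in range(start_step, 1000):
    x = torch.randn(32, 128)
    loss = model(x).pow(2).mean()          # dummy loss for illustration
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 100 == 0:                    # periodically save training state
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "step": step}, CKPT)
```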

This article is an exclusive manuscript of Observer.com and may not be reproduced without authorization.
