Frontline: The intelligent computing center is a systems engineering project, and it still faces ten challenges

Author: Communication Industry News
Build a digital foundation for the AI era.

In the digital era, what impact will intelligent computing centers have on the industry? How can their technical advantages be put to full use to build a new computing-network foundation for the digital economy and support the industry's digital development?

"We believe that it is not only the integration of computing and networking, but the integration of computing, networking, electricity and energy, that builds a sustainable foundation for the digital economy, and the top-level design of national computing integration is the key to unlocking the door to the intelligent society," Hou Xingze, deputy chief engineer of Dr. Peng Group and head of Dr. Peng Research Institute, said in an interview with an all-media reporter from Communication Industry News. He added that attention should not be fixed on any single link of the computing power center; it must be treated as systems engineering and advanced as a whole.

In Hou Xingze's view, the intelligent computing center is a new type of infrastructure that is based on the latest artificial intelligence theory, uses a leading AI computing architecture, and provides the computing, data, and algorithm services that AI applications require. It aims to support open data sharing, the building of an intelligent ecosystem, and the clustering of industrial innovation, and to promote the industrialization of AI, the adoption of AI across industries, and intelligent government governance.

In addition, liquid cooling is more widely used in intelligent computing centers, where it supports high-power-density equipment and manages heat dissipation effectively. Intelligent computing centers often deploy large numbers of high-performance computing devices, such as GPUs, TPUs, and FPGAs, which usually consume a great deal of power. As a result, the power density of a single cabinet in an intelligent computing center is usually higher than in a traditional data center. According to survey data, a single cabinet in an intelligent computing center needs to exceed 30 kW and may even exceed 100 kW, while a single cabinet in a traditional data center generally falls between 6 kW and 15 kW.

Hou Xingze believes that the intelligent computing center is a complex systems engineering project that is actively evolving in every respect. As mainstream manufacturers rapidly iterate and upgrade their system architectures, computing power centers built in different periods differ greatly. In first-generation computing power centers already in operation, the power supply per machine is about 6.5 kW, a single machine provides 5P of computing power, and air cooling is generally used. In second-generation centers, the power supply per machine rises to 10.5 kW, a single machine provides 15P of computing power, and hybrid air-liquid cooling has begun to appear. In third-generation centers, the power supply per machine reaches 24 kW, a single machine provides 225P of computing power, and cold-plate liquid cooling is generally used.
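To make the generational comparison concrete, the figures quoted above can be turned into a rough computing-power-per-kilowatt ratio. The sketch below is only an illustration: the names `GENERATIONS`, `compute_per_kw`, and `summarize` are hypothetical, and it assumes that the quoted "P" values and kW values both refer to the same single machine, which the article implies but does not state in detail.

```python
# Figures quoted in the article, per machine:
# (power supply in kW, computing power in "P").
# Assumption: both numbers describe the same single machine.
GENERATIONS = {
    "first": (6.5, 5),
    "second": (10.5, 15),
    "third": (24.0, 225),
}


def compute_per_kw(power_kw, compute_p):
    """Return computing power delivered per kW of power supply."""
    return compute_p / power_kw


def summarize(generations):
    """Map each generation name to its computing-power-per-kW ratio."""
    return {
        name: compute_per_kw(power_kw, compute_p)
        for name, (power_kw, compute_p) in generations.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(GENERATIONS).items():
        print(f"{name} generation: {ratio:.2f} P per kW")
```

Under these assumptions, power per machine grows by a factor of about 3.7 from the first to the third generation while computing power grows 45-fold, so computing power per kilowatt improves roughly 12 times.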

It is foreseeable that the power density of a single cabinet in computing centers using immersion liquid cooling will reach 60 kW to 240 kW, and that single-machine computing power above 1440P is just around the corner. First- and second-generation intelligent computing centers have generally entered operation, third-generation centers are mostly under construction, and newer generations are still in design and pre-research.

At present, rapidly growing demand for intelligent computing power has brought the construction of intelligent computing centers into a period of rapid development. Local governments have begun to issue guidelines and related indicators for overall construction, mainly to steer the healthy development of intelligent computing centers toward being "quality-focused" and "green".

Hou Xingze pointed out that, as important infrastructure for the R&D and application of artificial intelligence technology, the intelligent computing center faces many challenges and difficulties in its development.

First, the problem of computing power integration. The intelligent computing center needs to provide general computing power and dedicated computing power to meet the diverse computing power needs of different scenarios such as autonomous driving, smart healthcare, and smart cities. It is difficult to take into account the specific needs of multiple industries and fields with a single computing power solution.

Second, insufficient coordination of software and hardware. In the construction of intelligent computing centers, vertically integrated "silos" exist across different chip platforms, algorithm models, databases, and application layers, and software-hardware compatibility urgently needs to improve.

Third, the linkage of investment, construction and operation. The investment, construction, and operation of intelligent computing centers are often handled by different entities, which may lead to the separation of construction and operation, affecting customer experience and service quality.

Fourth, energy consumption and carbon emissions. The equipment energy consumption and carbon emissions of the intelligent computing center are high, and the power consumption of AI model training is huge, which poses challenges to the environment and cost control.

Fifth, cost and price regulation. The construction and operation costs of intelligent computing centers are high, and the investment and usage costs of some centers exceed normal market prices, which calls for further standardization and optimization.

Sixth, the richness of application scenarios and the maturity of the operating model. In the process of development, the intelligent computing center needs richer application scenarios and mature operation models to achieve its effective application and commercial operation in various industries.

Seventh, openness. The intelligent computing center needs to solve the problem of openness so that it can adapt to the growing number of AI applications and their faster iteration, and serve a wider range of fields and needs.

Eighth, technical bottlenecks and optimization directions. With the rise of generative AI and large models, intelligent computing centers need to continuously break through technical bottlenecks and optimize computing power supply and algorithm support to meet the growing demand for computing power.

Ninth, security and trustworthiness. The construction of the intelligent computing center needs to fully consider information security and industrial security, and build a secure and trustworthy environment based on an independent technology system.

Tenth, network design. The network of the intelligent computing center must take into account the special needs of AI and big data applications and provide an environment with high performance, low latency, high bandwidth, and high stability that is easy to scale, manage, and maintain. The intelligent computing center places higher requirements on node hardware, energy supply, cooling and heat dissipation, network interconnection, development environment, platform functions, and continuous operation. Traditional data centers struggle to meet these requirements, so customized design and construction are required.

Written by: Hu Yuan

Chart: Dawn

Editor and proofreader: Hu Yuan

Guidance: Xin Wen
