
A sudden strike: what is the United States' intention in banning Nvidia's high-end GPU sales to China? The Ministry of Commerce has just responded...

Author: Finance

On August 31, the U.S. government ordered chipmaker NVIDIA to stop selling certain high-performance GPUs to China; according to Reuters, AMD (Advanced Micro Devices) said it had received a similar order.

On the news, shares of domestic GPU and AI-chip-related listed companies rose sharply on September 1: Cambricon gained as much as 20% at one point, while Jingjiawei, Haiguang Information and others also surged intraday.

How powerful are the banned A100 and H100 chips? And why has the United States chosen this moment to strike?

Domestic development of high-end GPUs is essentially starting from scratch; facing the high technical barriers of GPUs, the pursuit of domestic substitution has been called a "lonely road of struggle".

A spokesperson for the Ministry of Commerce said China has noted the situation. For some time, the US has repeatedly abused export control measures to restrict exports of semiconductor-related items to China, which China firmly opposes. These US practices deviate from the principle of fair competition and violate international economic and trade rules; they not only harm the legitimate rights and interests of Chinese enterprises but also seriously affect the interests of US companies, hinder international scientific and technological exchange and economic and trade cooperation, and weigh on the stability of global industrial and supply chains and the recovery of the world economy. The US should immediately stop its erroneous practices, treat enterprises from all countries, including Chinese enterprises, fairly, and do more to contribute to the stability of the world economy.

What kind of chip is a GPU, and how powerful are the A100 and H100?

GPUs are at the heart of computer graphics displays.

GPUs are better suited than CPUs to data-intensive processing. They are highly parallel, applying the same mathematical operations across large datasets simultaneously. CPUs can perform the same tasks, but without the GPU's degree of parallelism they are far less efficient at them.
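As a rough illustration of that difference, here is a minimal sketch in Python using the PyTorch library (an assumption for illustration; the article itself names no software). It applies the same arithmetic to an array once as an explicit element-by-element loop, the serial style a CPU core executes, and once as a single vectorized operation that a GPU can spread across thousands of threads.

```python
import time

import torch

n = 100_000
x = torch.rand(n)  # a large array of random numbers

# Element-by-element Python loop: serial work, one value at a time.
t0 = time.time()
slow = torch.empty(n)
for i in range(n):
    slow[i] = x[i] * 2.0 + 1.0
print(f"loop:       {time.time() - t0:.3f} s")

# One vectorized expression: the whole array is processed at once.
# If an NVIDIA GPU is present, moving the data to "cuda" lets thousands
# of GPU threads handle the elements concurrently.
device = "cuda" if torch.cuda.is_available() else "cpu"
xd = x.to(device)
t0 = time.time()
fast = xd * 2.0 + 1.0
if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
print(f"vectorized: {time.time() - t0:.3f} s on {device}")
```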

Early GPUs were used mostly for 2D and 3D graphics computation. Traditionally, the most important GPU market was gaming, but growth in that segment has slowed over the long run and has even shown signs of decline.

Today, in the era of computing power, GPUs are mainly used to accelerate digital transformation in intelligent manufacturing and beyond. Deep neural networks, data analytics, visualization, Internet recommendation algorithms, digital twins and more all depend on GPUs. Beyond game consoles and PCs, end applications span servers, automobiles, mobile devices and other fields.

In general, GPU applications fall into two main categories: graphics display and computing.

So, how powerful are the banned A100 and H100 chips?

Compared with Nvidia's previous-generation Volta GPUs, the A100 delivers up to 20x higher performance, making it well suited to artificial intelligence, data analytics, scientific computing and cloud graphics workloads. The chip contains 54 billion transistors, integrates third-generation Tensor Cores and accelerates sparse matrix operations, which is especially useful for AI inference and training. In addition, each GPU can be partitioned into multiple instances running different inference tasks, while NVIDIA NVLink interconnect technology can be used to scale up for larger AI workloads.

Yet even these figures looked modest after the arrival of the H100.

At this spring's launch event, Jensen Huang unveiled the next-generation Hopper architecture for high-performance computing (HPC) and data centers; the first accelerator card based on the new chips was named the H100, the successor to the A100.

The H100 is optimized specifically for large models and is built on a custom version of TSMC's 5nm process (4N), with 80 billion transistors on a single chip. It is also the world's first GPU to support PCIe 5.0 and HBM3, and a single H100 offers 40 terabytes per second of I/O bandwidth.

Transformer-based pre-trained models are currently the hottest direction in AI, and NVIDIA optimized the H100 specifically for them, introducing the Transformer Engine, which combines the new Tensor Cores with FP8 and FP16 precision and dynamically handles the computation of Transformer neural networks, shortening training times for such models from weeks to days.
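NVIDIA's Transformer Engine is proprietary and its FP8 path is not shown here, but the general idea of mixing low-precision and full-precision arithmetic during training can be sketched with PyTorch's automatic mixed precision. Everything in the snippet (the toy model, tensor shapes, optimizer settings) is a hypothetical illustration rather than the H100's actual mechanism.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# FP16 targets GPU Tensor Cores; bfloat16 is the CPU fallback for this sketch.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

# A tiny Transformer encoder standing in for a large pre-trained model.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=2,
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(32, 128, 256, device=device)       # (batch, sequence, features)
target = torch.randn(32, 128, 256, device=device)

optimizer.zero_grad()
# Inside autocast, matrix multiplications run in reduced precision,
# while numerically sensitive operations stay in FP32.
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = nn.functional.mse_loss(model(x), target)
scaler.scale(loss).backward()   # scale the loss to avoid low-precision underflow
scaler.step(optimizer)
scaler.update()
```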

For server applications, the H100 can also be virtualized into seven instances for seven users, each with roughly the power of two full T4 GPUs. In addition, the H100 is the first GPU in the industry to implement confidential computing.
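Partitioned instances of such a data-center GPU simply appear to software as separate CUDA devices. The short sketch below (again assuming Python with PyTorch, which the article does not mention) shows how a server process would enumerate whatever devices, whole GPUs or partitioned instances, it has been granted.

```python
import torch

if not torch.cuda.is_available():
    print("No CUDA device is visible to this process.")
else:
    # Each visible device may be a full GPU or a partitioned instance,
    # depending on how the administrator has divided the hardware.
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"device {i}: {props.name}, "
              f"{props.total_memory / 1024**3:.1f} GiB, "
              f"{props.multi_processor_count} SMs")
```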

Around the Hopper-based H100, NVIDIA has also launched a series of products such as machine learning workstations and supercomputers. Eight H100s linked by four NVLink switches combine into one giant GPU, the DGX H100, with 640 billion transistors in total (8 x 80 billion), 32 petaflops of AI computing performance, and up to 640 GB of HBM3 memory (8 x 80 GB).

At the same time, through a partnership with Equinix, a global service provider operating more than 240 data centers worldwide, the new liquid-cooled versions of the A100 and H100 cut users' energy costs. Savings of up to 11 gigawatt-hours (GWh) are claimed for this cooling approach, along with a 20x efficiency gain in AI and HPC inference work.

In May this year, NVIDIA open-sourced its Linux GPU kernel module code; whether more open-source plans will follow is unknown.

A sudden strike: what does the United States intend?

Various parties have weighed in on the US restrictions on high-end GPU sales to China.

According to a disclosure on the website of the U.S. Securities and Exchange Commission (SEC), the U.S. government notified Nvidia on August 26 of a new export control requirement covering mainland China, Hong Kong and Russia. The requirement bans NVIDIA from selling A100 GPUs and the forthcoming H100 GPUs into these markets, effective immediately. The U.S. government says the measure is intended to prevent the products from being used for "military end uses" or by "military end users."

Many in the chip industry see this as part of a combination of moves by the United States to comprehensively restrict China's technological development and slow its growth.

An AI chip practitioner commented: "Previously, the United States restricted EDA sales and enlisted semiconductor equipment makers and wafer foundries, all of which restricted the development of local foundational technologies, including chip technology. Restricting NVIDIA's and AMD's sales in China now hinders the development of China's application market, which is also a very important market for the major chip giants."

Zhu Jing, deputy secretary-general of the Beijing Semiconductor Industry Association, said that according to the reports, the blocked products are high-end GPUs with substantial double-precision computing power; low-end GPUs are not affected. High-end GPUs with strong double-precision capability are used mainly in high-performance computing, including scientific computing, CAE (computer-aided engineering) and medical applications.

Supercomputing centers, such as the national supercomputing centers, consist of thousands or more processors, offer ultra-high computing power and are known as the "Mount Everest of computers"; they mainly serve the country's high-tech sectors and cutting-edge research.

In contrast, ordinary data centers serve every scenario that requires IT support, including a vast range of Internet applications; China's telecom operators and Internet companies have built their own. Zhu Jing said enterprise data centers often purchase NVIDIA's A100 and H100, precisely the high-end GPUs with substantial double-precision computing power, so if the supply cut is implemented, the scope of impact will be relatively large.

The United States has moved against Chinese supercomputing three times before: in 2015, four Chinese institutions involved in the "Tianhe-2" project were placed on the US "entity list"; in 2019, five organizations, including Haiguang, Zhongke Sugon and the Wuxi Jiangnan Institute of Computing Technology, were added; and in 2021, seven supercomputing institutions, including Feiteng and Shenwei, were added.

Zhu Jing said this progression shows the United States escalating both the method and the scope of its attacks on Chinese supercomputing. If the news is true, the crackdown will be upgraded from "placing supercomputing-related organizations on the entity list" to "directly banning the sale of products that can serve supercomputing," expanding the impact from the supercomputing field to the Internet sector.

Zhu Jing added that cutting off the supply of high-end GPUs looks like a further blockade of China's supercomputing and intelligent computing, but the scope of impact is much wider, and choking off a single technical point also has implications for upstream and downstream industries.

How difficult is the road to domestic substitution?

This sales restriction may be a great opportunity for domestic manufacturers.

Some industry insiders believe that domestic big-data players such as BAT will be forced onto the road of domestic substitution. Functionally, GPUs from Haiguang, Biren and Suiyuan (Enflame) can replace NVIDIA GPUs in some markets.

"This policy change means that forcing domestic data centers to be replaced by localization has basically become a mandatory requirement." AI chip practitioners also said.

However, virtually all of the domestic products with a chance of replacing NVIDIA GPUs currently face considerable challenges in real-world deployment, and Chinese AI chip companies still have a long way to go.

Why is it so hard to develop high-end GPUs?

Broadly speaking, the technical architecture is the GPU's hardware barrier, while algorithms and the software ecosystem are its soft power; none of these can be lacking, and the barriers are extremely high. Specifically:

In hardware architecture, the GPU's structure is intricate and complex, the product of long-term technological evolution. Advanced graphics processing involves many steps, including vertex processing, rasterization and texture mapping, all supported by a sophisticated underlying hardware design. Take NVIDIA's Turing architecture, launched in 2018: it includes 4,608 CUDA cores, 576 Tensor Cores for deep-learning matrix computation, and 72 ray-tracing (RT) cores.

In algorithms, GPU rendering rests on computer graphics, which draws on mathematics, physics and other disciplines. To simulate the real world, even seemingly ordinary scenes such as swaying leaves, hair blowing in the wind and rippling water require a great many graphics algorithms to reproduce on a computer.

In terms of ecosystem, analysts point out that software is an important competitive barrier for GPU makers. Taking the leader NVIDIA as an example again, the company has built commercial cooperation and cross-licensing arrangements with industry partners, while its CUDA platform for software developers has fostered a developer community ecosystem.
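To give a flavor of what that developer ecosystem offers in practice, here is a minimal sketch of a CUDA kernel written from Python with the third-party Numba package (an assumption for illustration; the article names only the CUDA platform). Running it requires an NVIDIA GPU and the CUDA toolkit.

```python
import numpy as np
from numba import cuda


@cuda.jit
def scale_and_add(x, y, out):
    # Each GPU thread computes one element of the result.
    i = cuda.grid(1)
    if i < x.size:
        out[i] = 2.0 * x[i] + y[i]


n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
# Launch enough blocks of threads to cover every element.
scale_and_add[blocks, threads_per_block](x, y, out)
```

The arithmetic is trivial; the point is the tooling around it, the compilers, libraries and documentation that rivals must rebuild before they can match CUDA's ecosystem.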

Soochow Securities also points out that because the GPU field lacks third-party IP licensors like Arm, GPU designers must develop everything independently, starting from scratch, which is far harder and can truly be called a "lonely road of struggle."

How A-share listed companies are positioning themselves

Difficult as it is, homegrown GPU teams on the mainland have begun to emerge, and the industry's development continues to advance.

Among A-share listed companies, several niche leaders are quietly growing.

Jingjiawei (300474)

Jingjiawei started out in the military sector before pushing into domestic GPU chips. Early on, it focused on graphics display control and small specialized radar products, mainly for military use. In 2014, the successful development of the JM5 series marked the mainland's zero-to-one breakthrough in domestic GPU chips, followed since then by the JM7 and JM9 series.

The JM5 and JM7 series target mainly the military and xinchuang (IT innovation) markets, while the JM9 series benchmarks against NVIDIA's GTX 1080, released in 2016, gradually opening up the civilian market and competing with overseas GPU giants in differentiated niches.

Haiguang Information (688041)

Founded in 2014 and born out of the Chinese Academy of Sciences system, the company's main products are CPUs for servers and workstations and DCUs for AI training and data mining. Its early technology came from an AMD x86 license, but it has since achieved independent iteration; telecommunications and finance are its main downstream markets.

The company ranks in the first tier of domestic CPU makers and is one of only two x86-architecture CPU companies in China, though a performance gap remains against the benchmark giant Intel.

In addition, Haiguang's DCU series is based on a GPGPU architecture compatible with the mainstream "CUDA-like" environment and targets the accelerated-computing market, with technology iterating roughly every two years; its first-generation DCU, Shensuan No. 1, has reached the international level of comparable high-end products.

Loongson Zhongke (688047)

The company is one of the few in China able to design its own instruction set architecture and verify CPU IP. Unlike domestic manufacturers that buy commercial IP for CPU design, it insists on in-house development of core IP, having independently developed a series of CPU IP cores, GPU IP cores, memory controllers and PHYs, high-speed bus controllers and PHYs, and hundreds of other IP cores.

Long accumulation has produced its own instruction set architecture, LoongArch, along with the core operating system components built on it, including the kernel, the three major compilers (GCC, LLVM, Go) and the three major virtual machines (Java, JavaScript, .NET), forming two base operating systems: Loongnix for information applications and LoongOS for industrial control.

Cambricon (688256)

The company's cloud product line is continuously iterated, with the Siyuan 370 and the high-end training product Siyuan 290 working in synergy; the chips are already widely used by Internet, finance, telecom-operator and AI customers, and the cloud line is expected to take over from edge products as the company's main revenue growth driver.

In autonomous-driving chips, subsidiary Xingge Technology has planned several tiers of in-vehicle intelligent chips closely tied to the company's existing cloud-edge-device product line. It holds strong technical advantages and market competitiveness in general-purpose, high-compute automotive intelligent chips and has already begun strategic cooperation with some traditional automakers to pair high-compute chips with L3+ models.

This article originally appeared on Value Line.
