AI chips are the hardware cornerstone of computing power. The term mainly refers to chips with dedicated acceleration designs for artificial intelligence algorithms.
Today, AI workloads mostly run on traditional chips: expensive graphics processing units (GPUs), or field-programmable gate arrays paired with central processing units (FPGA + CPU), which handle deep learning training and inference in cloud data centers. General-purpose or dedicated AI chips, that is, application-specific integrated circuits (ASICs) such as tensor processors, mainly target specific application scenarios. The three types of chips (GPUs, FPGAs and ASICs) will coexist in the short term and complement each other in different application scenarios.
Comparison of various AI chips:
When used for deep learning, the four types of chips differ in two respects: 1) Versatility: CPU > GPU > FPGA > ASIC. The lower the versatility, the fewer types of algorithms a chip can support. 2) Performance-to-power ratio: CPU < GPU < FPGA < ASIC. The higher the ratio the better: a chip that performs more operations at the same power consumption needs less time to train the same algorithm.
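To make the second point concrete, here is a minimal Python sketch of how the performance-to-power ratio translates into training time under a fixed power budget. The chip names come from the text, but every number below is a made-up placeholder chosen only to respect the ordering CPU < GPU < FPGA < ASIC; none of them is a measured figure, and the function name is hypothetical.

```python
# Hypothetical performance-to-power ratios in operations per second per watt.
# Placeholder values that only preserve the ordering in the text, NOT measurements.
PERF_PER_WATT = {
    "CPU": 1e9,
    "GPU": 1e10,
    "FPGA": 3e10,
    "ASIC": 1e11,
}


def training_time_seconds(total_ops: float, power_watts: float, perf_per_watt: float) -> float:
    """Time needed to finish total_ops when the chip draws power_watts."""
    ops_per_second = perf_per_watt * power_watts
    return total_ops / ops_per_second


if __name__ == "__main__":
    total_ops = 1e18      # assumed size of the training job
    power_budget = 300.0  # assumed power budget in watts, identical for every chip
    for chip, ratio in PERF_PER_WATT.items():
        hours = training_time_seconds(total_ops, power_budget, ratio) / 3600
        print(f"{chip}: {hours:.1f} hours")
```

With the power budget held constant, training time is inversely proportional to the ratio, which is why a higher ratio means a shorter training run for the same algorithm.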
Performance of the four types of AI chips in deep learning:
Source: Fast Technology
As domestic and foreign technology giants increase capital expenditure on AI, demand for data centers and related supporting industries is also growing. Yiou Think Tank predicts that mainland China's AI chip market will reach 178 billion yuan in 2025, an increase of nearly 100% over 2022.
The CPU is the computing and control core of a computer system.
CPUs are divided into general-purpose high-performance microprocessors, embedded microprocessors, and mobile SoC MPUs/APs, and are among the most in-demand and important semiconductor products of the electronic information age.
A CPU is a general-purpose processor suitable for most computing tasks. Its small number of cores makes it better suited to single-threaded or lightly threaded tasks such as text processing, web browsing and programming, and somewhat weaker for high-performance computing tasks such as machine learning and deep learning.
CPU internal components and working principle:
In terms of computing power, the CPU can perform parallel computation through vector instructions to improve the computing performance of AI systems.
In both machine learning and deep learning models, most of the computation in AI algorithms consists of vector and matrix operations. Using efficient vectorized instruction sets, the CPU can carry out high-performance calculations such as large-scale matrix multiplication and convolution, completing multiple numerical operations per unit of time and thereby improving computing efficiency.
Because many computing tasks, such as convolutional neural networks (CNNs), are highly parallel, CPUs can accelerate AI workloads with SIMD (single instruction, multiple data) instructions and multi-threading.
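The SIMD idea can be illustrated with a small conceptual model in Python. This is only a sketch: pure Python does not issue real vector instructions, so the code simply counts how many "instructions" a scalar loop and a hypothetical 8-lane SIMD unit would need for the same element-wise addition. The function names and the lane width are assumptions made for illustration, and both input lists are assumed to have the same length.

```python
from typing import List, Tuple


def scalar_add(a: List[float], b: List[float]) -> Tuple[List[float], int]:
    """Scalar model: one addition per instruction."""
    out: List[float] = []
    instructions = 0
    for x, y in zip(a, b):
        out.append(x + y)
        instructions += 1
    return out, instructions


def simd_add(a: List[float], b: List[float], lanes: int = 8) -> Tuple[List[float], int]:
    """SIMD model: each instruction adds up to `lanes` element pairs at once."""
    out: List[float] = []
    instructions = 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        # Counted as a single vector instruction covering the whole chunk.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return out, instructions


if __name__ == "__main__":
    a = [float(i) for i in range(20)]
    b = [1.0] * 20
    scalar_result, scalar_count = scalar_add(a, b)
    simd_result, simd_count = simd_add(a, b, lanes=8)
    assert scalar_result == simd_result
    print(scalar_count, simd_count)  # prints "20 3"
```

Both versions produce the same result, but the modeled SIMD unit needs roughly one instruction per group of eight elements, which is the source of the speedup the text describes.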
CPU basic architecture diagram:
Source: Computer Science GCSE GURU
The CPU ecosystem is a combination of hardware and software, and is the product of upstream and downstream interaction in the industry.
Starting from the underlying instruction set, the hardware side builds the chip's physical circuits from IP cores, and the chips are ultimately used by board and complete-machine manufacturers in equipment for different fields. The software side includes basic development software such as the Linux kernel, compilers, Java and .NET, together with operating systems that are highly compatible with the instruction set; these provide the platform for building an application software ecosystem on that instruction set. The result is a mature software and hardware system used in government and enterprise, education, telecommunications and other industries.
Most giants of the CPU industry chain are concentrated overseas, occupy core positions in every link of the chain, and have a great influence on the global CPU industry.
According to the Line Check industry research database, in design, Intel and AMD almost monopolize the general-purpose CPU market. In equipment, materials, EDA/IP and other links, the gap between domestic and foreign leaders is large and the localization rate is low. In manufacturing, only TSMC and Samsung currently have 5nm production capacity, but both need to use American equipment. In packaging and testing, the market is currently split three ways among Taiwan (China), mainland China and the United States.
In terms of global market competition, Intel still ranks first in the world with a market share of 70.77%, but because of delays in releasing its new-generation product, Sapphire Rapids, its share fell by 9.94 percentage points compared with 2021.
Driven by its new-generation server processor, Genoa, AMD's market share rose further to 19.8% (up 8.10 percentage points year on year).
The market shares of AWS and Ampere Computing, server CPU vendors based on the ARM architecture, also increased, reaching 3.2% and 1.5% respectively in 2022, up 1.4 and 0.4 percentage points from 2021.
In terms of import substitution in the domestic CPU industry chain: in design, leaders such as Huawei Kunpeng and Feiteng have reached a world-class level; in packaging and testing, Tongfu handles packaging and testing of AMD's 7nm CPUs as well as advanced processes at 14nm and below; equipment, materials, EDA/IP, manufacturing and other links still lag far behind foreign leaders, and the model of "external circulation as the mainstay, internal circulation as the supplement" is still in use.
Domestic CPU industry chain:
GPUs directly benefit from the surge in global demand for computing power.
According to Verified Market Research, the global GPU market was worth 33.5 billion yuan in 2021 and is expected to reach 477.4 billion yuan in 2028, with a CAGR of 33.3% over 2022-2030.
IDC data shows that GPUs rank first among China's AI chips with more than 90% of the market, while ASICs, FPGAs, NPUs and other non-GPU chips are increasingly used across industries, with a combined share of nearly 10% that is expected to exceed 20% by 2025.
GPUs work similarly to CPUs in that they complete computational tasks by executing instructions.
The difference is that the CPU completes the computing task by executing instructions serially, while the GPU completes the computing task by executing instructions in parallel.
CPU and GPU comparison:
The parallel computing mode of the GPU can perform multiple tasks at the same time, which greatly improves the computing efficiency and speed. Applications that require multiple computationally and memory-intensive tasks to be performed simultaneously can be accelerated, and are mainly used in AI to accelerate the training and inference process.
In terms of computing power, GPUs can significantly improve the processing speed of AI applications through parallel processing capabilities.
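The contrast between serial and parallel execution can be sketched in Python as follows. This is a conceptual illustration only: the same per-element "kernel" is run first in a serial loop and then dispatched across a thread pool, mimicking the way a GPU assigns one logical thread to each data element. The names `saxpy_kernel`, `run_serial` and `run_parallel` are hypothetical, and Python threads will not deliver GPU-like speedups; only the programming model is shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def saxpy_kernel(i: int, alpha: float, x: List[float], y: List[float], out: List[float]) -> None:
    """Work done by one logical thread: compute a single output element."""
    out[i] = alpha * x[i] + y[i]


def run_serial(alpha: float, x: List[float], y: List[float]) -> List[float]:
    """CPU-style model: elements are processed one after another."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        saxpy_kernel(i, alpha, x, y, out)
    return out


def run_parallel(alpha: float, x: List[float], y: List[float], workers: int = 4) -> List[float]:
    """GPU-style model: the same kernel is launched once per element index."""
    out = [0.0] * len(x)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any exception from a worker.
        list(pool.map(lambda i: saxpy_kernel(i, alpha, x, y, out), range(len(x))))
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 10.0, 10.0, 10.0]
    assert run_serial(2.0, x, y) == run_parallel(2.0, x, y)
    print(run_parallel(2.0, x, y))  # prints "[12.0, 14.0, 16.0, 18.0]"
```

Because each logical thread writes to a different output index, the elements can be computed in any order or all at once, which is the property that makes such workloads a good fit for GPUs.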
NVIDIA is the leader in the GPU market, with a global discrete graphics card market share of up to 80%. Its high-end GPUs such as H100, A100 and V100 occupy the vast majority of the AI algorithm training market.
NVIDIA introduced the GeForce 256 in 1999, defining the GPU as a graphics rendering chip. Since 2010, NVIDIA has focused its research and development on AI: its Tesla GPUs provided computing power for the world's fastest supercomputer, and it launched the first GPU computing architecture, "Fermi".
Since then, NVIDIA has kept up a cadence of a new product every six months and a new chip architecture every two years. From the Fermi architecture to the latest Hopper architecture, the manufacturing process has shrunk from 40nm to 4nm, the transistor count has grown from 3 billion to 80 billion, FP32 computing power has risen from 1.5 TFLOPS to 60 TFLOPS, and memory bandwidth has increased from 192.4 GB/s to 3 TB/s.
NVIDIA Turing TU102 GPU block diagram:
The United States continues to tighten export restrictions on high-end chips bound for China, so the localization of GPU chips for high-speed computing is bound to accelerate. Domestic GPU alternatives include Cambrian, Bicheng Technology, Flinton Technology, Kunlun Chip and Huawei HiSilicon, among others.
Domestic GPU industry chain:
An FPGA is an integrated circuit chip whose defining feature is field programmability.
FPGAs were invented in 1985 by Ross Freeman, one of the founders of Xilinx, as a further development of existing programmable devices such as PAL, GAL and CPLD.
FPGAs are mainly composed of programmable I/O units, programmable logic units, programmable routing resources and so on. As a semi-custom circuit in the field of application-specific integrated circuits (ASICs), an FPGA can be reprogrammed multiple times through the accompanying EDA software to implement specific functions according to the user's needs.
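The text does not describe how a programmable logic unit is built. The Python sketch below assumes the common look-up-table (LUT) approach, in which a logic element is configured by loading a truth table, to show what "reprogrammed multiple times" means in practice. The class and method names are hypothetical and do not correspond to any vendor's tools.

```python
from typing import List


class LookupTable:
    """A k-input look-up table whose truth table acts as its configuration."""

    def __init__(self, num_inputs: int) -> None:
        self.num_inputs = num_inputs
        self.table: List[int] = [0] * (2 ** num_inputs)

    def program(self, truth_table: List[int]) -> None:
        """Load a new configuration; this can be repeated any number of times."""
        if len(truth_table) != 2 ** self.num_inputs:
            raise ValueError("truth table must have 2 ** num_inputs entries")
        self.table = list(truth_table)

    def evaluate(self, *inputs: int) -> int:
        """Use the input bits as an index into the stored truth table."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.table[index]


if __name__ == "__main__":
    lut = LookupTable(2)
    lut.program([0, 0, 0, 1])  # configure as AND
    print(lut.evaluate(1, 1))  # prints "1"
    lut.program([0, 1, 1, 0])  # reconfigure the same element as XOR
    print(lut.evaluate(1, 1))  # prints "0"
```

The same element computes a different function after each call to `program`, without any change to the structure itself; in this simplified model, that is the essence of field programmability.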
FPGA block diagram:
As one of the most widely used acceleration chips for AI computing platforms, FPGAs offer low power consumption, low latency and high flexibility, and are widely used in machine learning, network security, ultra-large-scale image processing, genetic testing and other fields.
Because FPGAs are both flexible and fast, their parallel processing capabilities can greatly improve the computing performance of AI algorithms, enabling efficient inference and training.
Because algorithms are implemented directly in hardware, FPGAs are more flexible than traditional CPUs and GPUs, so more and more AI applications are beginning to adopt FPGAs for computation.
In terms of global market structure, overseas manufacturers dominate the FPGA market, with Xilinx and Intel forming a duopoly, while domestic companies continue to expand their FPGA chip portfolios and have huge room for growth.
Domestic FPGA players include Fudan Microelectronics (a leader in high-reliability FPGA technology, the first to launch billion-gate FPGAs and PSoC chips, with a steadily widening range of application fields), Unigroup Guowei (the domestic leader in the special integrated circuit industry, with more than 500 product varieties and continuous updates to its FPGAs for special fields) and Anlu Technology (the domestic leader in civil FPGAs).
ASICs are dedicated chips designed for specific, relatively narrow AI applications.
In 2016, Google released its TPU chip, a type of ASIC. ASICs overcome the high price and high power consumption of GPUs, and they have gradually been adopted in AI, becoming an important branch of AI chips.
ASICs greatly surpass standard chips in terms of performance, energy efficiency, and cost, and are very suitable for AI computing scenarios.
According to CSET data, ASIC chips have clear advantages in inference: their efficiency and speed are about 100-1000 times those of CPUs, and they are highly competitive with GPUs and FPGAs.
Although ASIC chips can also be used for training (for example TPUv2, v3 and v4), they are expected to be adopted first in inference.
ASIC chips are more used in the field of inference:
At present, the mainstream ASICs on the market include TPU, NPU, VPU and BPU chips, designed and produced by Google, Cambrian, Intel and Horizon respectively. Because ASICs have a long development cycle, only large manufacturers have the funds and capability to develop them. At the same time, ASICs are fully customized chips that run most efficiently in specific scenarios, so when the downstream market for a scenario is large enough, mass-produced ASIC chips can generate huge profits.
In recent years, leading manufacturers have moved into the ASIC field. Google, the pioneer of AI ASIC chips, launched TPUv4 in 2021, greatly improving computing efficiency; Intel acquired Habana Labs in 2019 and launched the Gaudi2 ASIC chip in 2022; IBM, Samsung and other leading manufacturers have also entered the field.
Others have taken different routes: NVIDIA has stayed on the GPU route and released the H100 chip in 2022, which is now widely used for cloud training and inference; AMD drew on its own technical expertise to integrate CPU and GPU in the Instinct MI300 chip, which is expected to be available in the second half of 2023.
Source: Guosen Securities
The computing power chip industry chain is expected to develop rapidly in the era of artificial intelligence, bringing new opportunities upstream and downstream. Domestic manufacturers are expected to seize the current AI wave to accelerate domestic substitution, and the market potential in every link of the industrial chain is broad.