
GPU (NVIDIA) and artificial intelligence - AI basic science popularization

Author: Green Big Leaf Investment

GPUs and artificial intelligence

The total amount of global data and the workload of data centers have grown significantly, and demand for data-center computing power is rising rapidly. With the development of new technologies such as artificial intelligence, the generation and processing of massive data have become central to data-center development. According to IDC, the total amount of global data is expected to grow from 82.47 ZB in 2021 to 215.99 ZB in 2026, a CAGR of 21.24%. At the computing level, artificial intelligence relies heavily on large-scale tensor and matrix operations, and the wide application of highly parallel deep learning algorithms in vision, speech, and natural language processing has driven exponential growth in the demand for computing power.

Measured by parameter count, large language models have grown exponentially over the past five years. As parameter counts and training data grow exponentially, language-model capability improves roughly linearly, a relationship known as the scaling law. However, once the parameter count exceeds a certain scale, capability jumps sharply and the model exhibits emergent abilities, such as reasoning and learning without labeled data. For example, mainstream language models before GPT were deep neural networks with parameters on the order of billions, while the GPT-3 series models behind ChatGPT reached 175 billion parameters.

With the accumulation of data in the Internet era, neural networks became an important machine learning method in the context of big data. In 2012, the deep convolutional neural network AlexNet dramatically improved performance and reduced the error rate in image classification, a landmark event for artificial intelligence.

In the process, its creator Alex Krizhevsky innovatively used NVIDIA GPUs to train AlexNet, a deep neural network with breakthrough performance, opening a new era of artificial intelligence. With the massive computing power required for training and inference of deep learning models, NVIDIA GPUs have become a new infrastructure of the artificial intelligence era.


ChatGPT is leading the global wave of artificial intelligence, and the development of artificial intelligence requires AI chips as computing power support. Since OpenAI released its first generative pre-trained language model (GPT-1) in 2018, the GPT series has continued to grow in model scale and parameter count; GPT-1 had only 117 million parameters. In 2020, OpenAI released GPT-3, a pre-trained model with 175 billion parameters trained on a corpus of about 100 billion words, which performed well in natural language processing applications such as text analysis, machine translation, and machine writing.

In November 2022, OpenAI released ChatGPT, a chatbot based on GPT-3.5 with excellent text chat and complex language processing capabilities. The release of ChatGPT set off an explosion in the AI field: technology companies at home and abroad announced their own large language models, and the explosive growth in users created new challenges for the computing power these models require, making AI chips the key to improving computing power.

AI chips, also known as AI accelerators or compute cards, are modules designed to handle the large volume of computing tasks in AI applications. Facing massive data growth, increasingly complex algorithm models, heterogeneous processing objects, and high requirements for computing performance, AI chips can be designed specifically for AI algorithms and applications and efficiently handle the increasingly diverse and complex computing tasks of AI.

At present, mainstream AI chips mainly include graphics processing units (GPUs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and neuromorphic chips (NPUs). Among them, GPUs and FPGAs are relatively mature, general-purpose chip architectures, while ASICs are customized for specific AI scenarios. In addition, the central processing unit (CPU) is the computing and control core of a computer and the final execution unit for information processing and program operation.

GPUs have an absolute advantage in training workloads. According to IDC, GPUs accounted for 91.9% of China's AI chip market in 1H21 and remain the first choice for data-center acceleration. GPUs are general-purpose, suited to large-scale parallel computing, backed by mature design and manufacturing processes, and appropriate for advanced, complex algorithms and general-purpose artificial intelligence platforms.

The GPU (Graphics Processing Unit) achieves its performance advantage through parallel computing, which matches the needs of deep learning. GPUs originally handled graphics workloads: the goal was to improve the computer's ability to process graphics, images, and video, and to overcome the CPU's low efficiency in this domain. Because GPUs are built for parallel computing, their architecture is inherently well suited to deep learning algorithms, and further GPU optimization can meet deep learning's demand for massive computation.
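To make the parallelism concrete, here is a minimal CUDA C++ sketch (not from the report) of the kind of matrix multiplication that dominates deep learning workloads: one GPU thread computes one output element, so the entire output matrix is computed concurrently. The matrix size N and the fill values are arbitrary illustrative choices; real frameworks call highly optimized libraries such as cuBLAS rather than a naive kernel like this.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread computes one element of C = A * B. The grid of threads covers
// the whole output matrix, so all N*N elements are computed in parallel.
__global__ void matmul(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}

int main() {
    const int N = 512;                       // illustrative matrix size
    size_t bytes = (size_t)N * N * sizeof(float);

    float* hA = (float*)malloc(bytes);
    float* hB = (float*)malloc(bytes);
    float* hC = (float*)malloc(bytes);
    for (int i = 0; i < N * N; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);                      // 256 threads per block
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    matmul<<<grid, block>>>(dA, dB, dC, N);
    cudaDeviceSynchronize();

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[0] = %.1f (expected %.1f)\n", hC[0], 2.0f * N);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```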

Typical GPU architectures and the similarities and differences between GPUs and CPUs

A typical GPU architecture is organized hierarchically:

A GPU consists of multiple processor clusters

A Processor Cluster consists of multiple Streaming Multiprocessors

A Streaming Multiprocessor may contain multiple Cores; the Cores within a Streaming Multiprocessor share an L1 cache, and multiple Streaming Multiprocessors share an L2 cache. These quantities can be read directly from the hardware at runtime, as sketched below.
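As a concrete companion to the hierarchy above, the following minimal sketch (not from the report; it assumes the CUDA toolkit and at least one CUDA-capable device) asks the CUDA runtime for exactly these quantities: the number of Streaming Multiprocessors, the shared memory available per SM, the L2 cache size, and the warp size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        printf("Device %d: %s\n", dev, prop.name);
        printf("  Streaming Multiprocessors (SMs): %d\n", prop.multiProcessorCount);
        printf("  Shared memory per SM:            %zu KB\n",
               prop.sharedMemPerMultiprocessor / 1024);
        printf("  L2 cache size:                   %d KB\n", prop.l2CacheSize / 1024);
        printf("  Warp size:                       %d threads\n", prop.warpSize);
        // The number of CUDA cores per SM depends on the architecture generation
        // and is not reported directly by cudaDeviceProp.
    }
    return 0;
}
```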

The CPU (Central Processing Unit) is the computing and control core of the computer system and the final execution unit for information processing and program operation. Its structure mainly includes the arithmetic logic unit (ALU), the control unit (CU), registers, caches, and the buses that carry data, control, and status signals.

Similarities: the CPU and GPU are both computing processors, and both architectures consist of three parts: the arithmetic unit (ALU), the control unit, and the cache.

Differences: the CPU is designed for low latency and is good at complex logic and serial computing tasks. It must be highly versatile to handle many different data types, make logical judgments, and deal with numerous branch jumps and interrupts, so its internal structure is complex; it excels at logic control and general-purpose data operations.

GPUs are designed for high throughput and tailored to large-scale data-parallel computing tasks. They face large volumes of data in a relatively uniform computing environment, where the elements are of highly consistent types and independent of one another. With a large number of computing units and very deep pipelines, GPUs excel at massive concurrent operations.
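The contrast can be sketched in a few lines of CUDA C++ (an illustrative example, not from the report): the same vector addition is written once as a serial CPU loop, where a single control flow walks the data, and once as a GPU kernel in which each of roughly a million lightweight threads handles a single element. The array size n and the fill values are arbitrary.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// GPU version: one lightweight thread per element; throughput-oriented.
__global__ void addGpu(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// CPU version: a single control flow walks the data serially; latency-oriented.
void addCpu(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                       // ~1M elements (illustrative)
    std::vector<float> a(n, 1.0f), b(n, 2.0f), cCpu(n), cGpu(n);

    addCpu(a.data(), b.data(), cCpu.data(), n);  // one serial pass over the array

    float *da, *db, *dc;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;    // thousands of blocks run concurrently
    addGpu<<<blocks, threads>>>(da, db, dc, n);
    cudaMemcpy(cGpu.data(), dc, bytes, cudaMemcpyDeviceToHost);

    printf("CPU result: %.1f  GPU result: %.1f\n", cCpu[0], cGpu[0]);
    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
```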


NVIDIA is a leader in artificial intelligence computing and ranks first in the market in discrete GPU shipments

Founded in 1993 and headquartered in Santa Clara, California, NVIDIA is an artificial intelligence computing company. As a pioneer in accelerated computing, the company has expanded from PC graphics into a range of large-scale, compute-intensive fields, using its GPU products and architectures to build platforms for scientific computing, artificial intelligence (AI), data science, autonomous vehicles (AV), robotics, the metaverse, and 3D internet applications.

According to Omdia data, among the top ten companies in the world by semiconductor revenue in 2022, NVIDIA ranks eighth with a market share of about 3.5%.

The company leads the global discrete GPU market and the market for data-center acceleration chips. According to Jon Peddie Research (JPR), NVIDIA's discrete GPU shipments accounted for 82% of the market in 4Q22, ranking first; its overall PC GPU shipments accounted for 17%, second only to Intel, the world's largest processor manufacturer, whose desktop integrated graphics give it the largest share.

The stock price has experienced three rounds of rapid growth: 2016-2018, 2020-2021, and since September 2022


The company's development history and mergers and acquisitions history

In 1993, Jensen Huang and others co-founded NVIDIA. In 1994, the company partnered with SGS-Thomson to manufacture a single-chip graphical user interface accelerator. In 1995, the company launched its first product, NV1, and the release of the RIVA TNT in 1998 solidified its market position in powerful graphics adapters. Also in 1998, the company signed a strategic partnership with TSMC, which began assisting in the manufacture of its products.

In 1999, the company invented the graphics processing unit: the world's first GPU, the GeForce 256, was born. The invention of the GPU defined modern computer graphics and set the company on the path to reinventing the industry and establishing leadership in the field. Since then, GPU shipments have grown rapidly, with cumulative processor shipments exceeding 100 million units in 2002, 500 million in 2006, and 1 billion in 2011.

In 2006, the company launched CUDA, a platform for general-purpose GPU (GPGPU) computing. Software developers can use the platform to write programs in C that run on the GPU to perform complex calculations. Starting with the G80, NVIDIA's GPU architecture has fully supported general-purpose programming, so the GPU moved beyond the single purpose of image processing and became a true general-purpose GPU.

Since 2015, with the rapid advance of the AI wave, the company's business has continued to diversify into data centers, gaming, mobile devices, automotive electronics, and other markets. In 2017, the company built the Tesla V100 GPU for data centers and high-performance computing to power its DGX series of AI supercomputers. In FY23, revenue from the data center business surpassed the gaming business, becoming the company's largest source of revenue.

The company continues to consolidate its strength and expand its business boundaries through acquisitions. A representative example is the acquisition of 3dfx, a company specializing in the development and production of graphics cards and 3D chips that led the graphics chip market in the late 1990s but went bankrupt and was ultimately acquired by NVIDIA. The company has also consolidated its traditional advantages in graphics computing by acquiring a number of graphics rendering companies. In 2020, it acquired Mellanox, a leader in high-performance interconnect technology, expanding its product layout from GPUs to DPUs.

In 2016, NVIDIA CEO Jensen Huang personally delivered the first NVIDIA DGX-1 AI supercomputer to OpenAI; it became the engine behind the large language model breakthroughs that underpin ChatGPT.

In 2018, OpenAI proposed to Microsoft the idea of "building an artificial intelligence system that could forever change the way humans and computers interact." To build supercomputers supporting the OpenAI project, Microsoft spent hundreds of millions of dollars connecting tens of thousands of NVIDIA A100 chips on the Azure cloud computing platform and retrofitting server racks. On this supercomputer, the models OpenAI trained became increasingly powerful, laying the foundation for the later birth of ChatGPT.

In November 2022, Microsoft announced a partnership with NVIDIA to build "one of the world's most powerful AI supercomputers" to handle the huge computing load required to train and scale AI. Built on Microsoft's Azure cloud infrastructure, the supercomputer uses tens of thousands of NVIDIA H100 and A100 Tensor Core GPUs together with NVIDIA's Quantum-2 InfiniBand networking platform, and can be used to research and accelerate generative AI models such as DALL-E and Stable Diffusion.

On March 14, 2023, Microsoft announced that it would deepen its cooperation with NVIDIA, upgrading the GPUs from the previous A100 to the H100 and launching the new ND H100 v5 virtual machines developed specifically for artificial intelligence.

Excerpted from: Guosen Securities, Electronics Industry AI+ Series Special Report (2): Reviewing NVIDIA's AI Development Road
