
Fifth in the world, with a market value of over one trillion dollars! This vendor is monopolizing the AI industry

Author: Cairns Finance

U.S. chipmaker NVIDIA has briefly exceeded $1 trillion in market capitalization, becoming the first chip company to reach this milestone and the fifth-largest company in the US stock market by capitalization.

The popularity of the artificial intelligence and chip industries this year speaks for itself, and NVIDIA's share price rally is partly driven by higher-than-expected performance guidance. NVIDIA's first-quarter earnings per share were $1.09, above market expectations of 92 cents, and its second-quarter revenue guidance of $11 billion was nearly $4 billion above the market expectation of $7.15 billion. In terms of valuation, NVIDIA is not cheap: its forward price-to-earnings ratio for the next 12 months is as high as 47.4.

The rise in the stock price is now driven more by expectations for generative artificial intelligence: demand for NVIDIA products has risen sharply, creating more growth potential and higher market expectations for the company.

As for where NVIDIA is heading, the longer-term applications may still be at an early stage; for now, the GPU as the foundation of computing power is what matters most. So let's talk about NVIDIA today.

Many manufacturers have joined this battle of large AI models, such as OpenAI, Microsoft, Google and, in China, Baidu, Alibaba, Huawei and SenseTime. But the GPU giant NVIDIA behind them cannot stay hidden: its A100 and H100 are currently the "main force" providing computing power for large AI models, and whoever masters this "computing power" attracts plenty of attention. Right now, everyone is racing to train large models.

Two days ago, NVIDIA founder and CEO Jensen Huang announced a number of advances involving accelerated computing and artificial intelligence (AI) at COMPUTEX 2023 in Taipei.

NVIDIA launched the new large-memory AI supercomputer DGX GH200, the pinnacle of its systems built from its latest GPUs and CPUs and covering its most advanced accelerated computing and networking technologies. It is the first supercomputer to pair Grace Hopper superchips with the NVIDIA NVLink Switch System, a new interconnect that links 256 Grace Hopper superchips so they can work together like a single giant GPU, providing 1 EFLOPS of performance and 144 TB of shared memory - nearly 500 times more memory than the previous-generation DGX A100 320 GB system introduced in 2020.
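As a quick back-of-the-envelope check on the "nearly 500 times" figure quoted above (using decimal units):

```latex
\[
\frac{144\ \text{TB}}{320\ \text{GB}} = \frac{144{,}000\ \text{GB}}{320\ \text{GB}} = 450\times
\]
```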


At the same time, NVIDIA is building its own large-scale AI supercomputer, NVIDIA Helios, based on the DGX GH200 to support the work of its research and development teams. It features four DGX GH200 systems, each connected to the NVIDIA Quantum-2 InfiniBand network with up to 400 Gb/s of bandwidth to increase data throughput for training large AI models. In total, Helios will include 1,024 Grace Hopper superchips.


The GH200 Grace Hopper superchip is now in full production. It uses NVIDIA NVLink-C2C interconnect technology to combine an Arm-based NVIDIA Grace CPU and a Hopper-architecture GPU in the same package, providing a total bandwidth of up to 900 GB/s - 7 times the bandwidth of the standard PCIe Gen5 lanes in traditional accelerated systems - while cutting interconnect power consumption to one fifth, to meet demanding generative AI and high-performance computing (HPC) applications.
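The "7 times" figure is roughly consistent with taking a PCIe Gen5 x16 link as the baseline, which offers on the order of 128 GB/s of bidirectional bandwidth (the exact baseline NVIDIA uses is an assumption here):

```latex
\[
\frac{900\ \text{GB/s (NVLink-C2C)}}{\approx 128\ \text{GB/s (PCIe Gen5 x16)}} \approx 7\times
\]
```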

NVIDIA also launched Spectrum-X, a networking platform designed to improve the performance and efficiency of Ethernet-based AI clouds. Spectrum-X is highly versatile and can be used across a range of AI applications. Based on the Spectrum-X reference design blueprint and testbed, NVIDIA is building Israel-1, a hyperscale generative AI supercomputer, in its Israel data center.

There is also the NVIDIA MGX server specification, which targets data centers of all sizes. With this specification, system manufacturers can build more than 100 server configurations to accommodate a wide range of AI, HPC and NVIDIA Omniverse applications. MGX is already compatible with NVIDIA's full range of GPUs, CPUs and DPUs, as well as various x86 and Arm processors. With MGX, manufacturers start from a basic system architecture optimized for accelerated computing in their server chassis, then choose the GPU, DPU and CPU. A single machine can handle multiple workloads such as AI training and 5G, and can later be upgraded to next-generation hardware.


Another announcement concerned games: the launch of NVIDIA Avatar Cloud Engine (ACE) for Games, which has drawn considerable attention. ACE is a custom AI model foundry service that middleware, tool and game developers can use to build and deploy custom voice, conversation and animation AI models. It covers several NVIDIA technologies: NVIDIA NeMo, which uses proprietary data to build, customize and deploy language models; NVIDIA Riva, for automatic speech recognition and text-to-speech in real-time voice conversations; and NVIDIA Omniverse Audio2Face, which instantly creates facial animation for game characters to match any voice track. Although this has not yet reached the stage of broad application, directions such as CPO and gaming/media are slowly emerging.

Earlier, NVIDIA and Convai collaborated to show how game developers can use ACE to build NPCs for games. Using AI models to create game content does have many advantages: AI can create smarter NPCs, more natural dialogue systems and freer scene generation, giving players a better experience, while also "reducing costs and increasing efficiency". Generating NPCs with traditional methods is expensive - producing one minute of character animation can take days - but it is much faster with AI.

NVIDIA launched many other things at the event; if you are interested, you can look into them further. Training large AI models is indeed work that consumes computing power, algorithms and data, and the series of hardware infrastructure and software tools NVIDIA just launched shows that it wants to push into large models, using its own technology to break the bottleneck of large-scale AI computing power around the core pain points enterprises face in developing and deploying generative AI applications. If these can be adopted by more enterprises, they will help companies and research institutions save a great deal of time and cost.

Why NVIDIA?

Between the CPU and the GPU, NVIDIA bet on the GPU - and on using the GPU for general-purpose computing. At the time, GPUs were mostly used to accelerate graphics rendering, typically in gaming. But more than fifteen years ago, NVIDIA developed CUDA (Compute Unified Device Architecture), a general-purpose parallel computing platform. With CUDA's toolset and programming model, developers can have the GPU's many cores perform operations in parallel, greatly improving computing performance.
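To make the idea concrete, here is a minimal, self-contained CUDA sketch (not from the article) showing the basic pattern: copy data to the GPU, launch a kernel in which many threads each handle one element in parallel, then copy the result back.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each GPU thread adds one pair of elements; the grid as a whole
// processes the entire array in parallel.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                 // one million elements
    size_t bytes = n * sizeof(float);

    // Host (CPU) buffers
    float *ha = (float *)malloc(bytes);
    float *hb = (float *)malloc(bytes);
    float *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Device (GPU) buffers
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    int threads = 256, blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(da, db, dc, n);
    cudaDeviceSynchronize();

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", hc[0]);          // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

Deep-learning frameworks ultimately run on kernels like this (plus NVIDIA libraries such as cuBLAS and cuDNN), which is a large part of why CUDA is so central to NVIDIA's moat.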

With CUDA, GPUs moved beyond the single purpose of image processing, gained general-purpose computing capability, and were gradually adopted for AI deep learning. For example, the GPT large language models developed by OpenAI require enormous computing power, and the GPU is the main source of that computing power; NVIDIA's A100 chip is the "workhorse" GPU supporting GPT. Public data shows that GPT-3 has 175 billion parameters and 45 TB of training data, and is supported by tens of thousands of A100 chips.
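To get a feel for why this demands so much computing power, a common rule of thumb estimates training compute as roughly 6 FLOPs per parameter per training token. Using GPT-3's 175 billion parameters and the commonly cited figure of about 300 billion training tokens (the token count is an assumption, not stated above):

```latex
\[
C \approx 6\,N\,D \approx 6 \times (1.75\times 10^{11}) \times (3\times 10^{11})
  \approx 3.15\times 10^{23}\ \text{FLOPs}
  \approx 3{,}600\ \text{petaflop/s-days}
\]
```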

At COMPUTEX in Taipei a couple of days ago, Huang again made the case for GPU computing power. He noted that training one LLM (large language model) on a server cluster of 960 CPUs would cost about 10 million US dollars (about 70.7 million yuan) and consume 11 gigawatt-hours of electricity. In contrast, a GPU server cluster built at the same $10 million cost could train 44 LLMs with only 3.2 gigawatt-hours of power. And at the same 11 gigawatt-hours of power, the GPU cluster achieves a 150x speedup, training 150 LLMs, while occupying a smaller footprint.

To train a single LLM, all you need is a GPU server costing about $400,000 and consuming 0.13 gigawatt-hours of power. In other words, a GPU server can train an LLM at 4% of the cost and 1.2% of the power consumption of the CPU cluster - a decisive advantage over CPU servers.
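Those percentages follow directly from the figures above (rounding as the presentation did):

```latex
\[
\frac{\$0.4\text{M}}{\$10\text{M}} = 4\%, \qquad
\frac{0.13\ \text{GWh}}{11\ \text{GWh}} \approx 1.2\%, \qquad
44 \times \frac{11\ \text{GWh}}{3.2\ \text{GWh}} \approx 151 \approx 150\ \text{LLMs}
\]
```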

Compared with other manufacturers, NVIDIA's advantage in AI chips covers both hardware performance and the software ecosystem. With the CUDA platform, developers can program in familiar high-level languages and flexibly tap the computing power of the GPU. Since then, GPU usage is no longer limited to graphics cards but has expanded to every area suited to parallel computing. The hardware-plus-software system formed by the GPU and CUDA constitutes NVIDIA's product moat.

AMD's product focus has been on CPUs; PassMark data showed that in the fourth quarter of 2021 the AMD EPYC series grew under Intel's dominance to about 6% of the global server CPU market. But in recent years AMD has also been making GPUs and developing GPGPU products, and has established Infinity Fabric technology to connect EPYC CPUs directly with Instinct MI-series GPUs. It has not neglected software either: AMD launched the ROCm platform alongside its CDNA compute architecture. The ROCm ecosystem adopts the HIP programming model, whose syntax is very similar to CUDA's, so developers can program AMD GPUs by imitating CUDA-style code and remain compatible with CUDA at the source-code level. In other words, AMD's ROCm ecosystem borrows from CUDA, and it still has not surpassed NVIDIA technically.
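To illustrate how close the two programming models are, here is a hedged sketch: CUDA runtime calls with their HIP counterparts noted in comments. The renaming is largely mechanical (AMD also ships hipify tools that automate it), which is what source-level compatibility means here.

```cuda
#include <cuda_runtime.h>            // HIP: #include <hip/hip_runtime.h>

__global__ void scale(float *x, float s, int n) {   // identical kernel syntax in HIP
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

void run(float *hx, int n) {
    float *dx;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&dx, bytes);                            // HIP: hipMalloc
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice); // HIP: hipMemcpy, hipMemcpyHostToDevice
    scale<<<(n + 255) / 256, 256>>>(dx, 2.0f, n);      // HIP also supports <<< >>> launches
    cudaDeviceSynchronize();                           // HIP: hipDeviceSynchronize
    cudaMemcpy(hx, dx, bytes, cudaMemcpyDeviceToHost); // HIP: hipMemcpyDeviceToHost
    cudaFree(dx);                                      // HIP: hipFree
}
```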

Amazon has launched AI-specific chips: in late 2020, AWS introduced Trainium, dedicated to training machine learning models, and earlier this year it released Inferentia2, built for AI inference, with triple the compute performance, four times the total accelerator memory, four times the throughput and a tenth of the latency of its predecessor. Inf2 instances, which support distributed inference over direct ultra-high-speed connections between chips, can serve models with up to 175 billion parameters. Microsoft is also developing AI chips, and Google has deployed its TPU v4 AI chip in its own data centers; a TPU v4-based supercomputer contains 4,096 chips, and Google's own data puts TPU v4 performance at 2.1 times that of TPU v3.

Domestic Chinese AI chipmakers find it hard to do both hardware and software at once; most focus on architecture innovation, computing performance, or platform solutions. Their chip R&D has produced some results - manufacturers such as Hygon Information, Iluvatar CoreX, Biren Technology and Moore Threads are representative domestic GPU developers - but the results are still limited, and there has been less progress on software, systems and ecosystem building.

We also know that the current ChatGPT-driven large-model wave is just beginning, and some manufacturers have already benefited greatly - NVIDIA's stock price, for instance, is up more than 60% this year. With its A100 and H100 series chips, NVIDIA supplies much of this "AI computing power" and is the power source behind large language models such as ChatGPT. A few days ago, NVIDIA announced results for the first quarter of fiscal 2024: quarterly revenue of $7.19 billion, up 19% from the previous quarter. More than half of that came from data centers, which hit a record $4.28 billion, up 14% year-over-year and 18% sequentially, driven by growing demand for generative AI and large language models. On the back of this quarter's growth, NVIDIA expects revenue of $11 billion next quarter.

NVIDIA's performance and blue ocean go far beyond the A100 and H100. More and more manufacturers need large models - the technology industry is racing to develop ever-larger AI models, and large data center operators are retooling their computing infrastructure for AI - so demand for chips is continuous. As mentioned before, the H100 has been in short supply; it only went on sale this year, and sales of both the H100 and A100 are very strong and still rising. Moreover, the United States has banned the sale of high-end AI chips to mainland China, making the A100 and H100 hard to buy domestically and pushing the prices of the A800 and H800 chips about 40% above their original recommended retail prices.

So what will the performance of NVIDIA's next products be, and how will they be applied? The market is very much looking forward to finding out.

Lv Changshun (Cairns), license number: A0150619070003. [The above content represents personal views only and does not constitute a basis for trading. The stock market carries risk; invest with caution.]