Five years later, Huang is back in the air: Nvidia is king in the AI world

Author: Intelligent Driving Network

After five years, NVIDIA's GTC has finally returned to an in-person format.

At 4 a.m. Beijing time on March 19, at the SAP Center in San Jose, California, in the heart of Silicon Valley, Nvidia CEO Jensen Huang took the stage to deliver the GTC 2024 keynote.

All of Silicon Valley waited with bated breath. The 10,000-seat hockey arena was packed with practitioners, developers, and investors from industries ranging from technology to autonomous driving to robotics, all of whom had rushed to Silicon Valley to collectively witness this "AI bellwether" lay out its vision for the future of general artificial intelligence.

At the opening, Huang, in his signature leather jacket, quipped: "You have to realize that this is not a concert; it's a developer conference."

With the explosive development of AI technology and AI applications, Nvidia has become the biggest winner of the moment. Behind everything from apps to consumer electronics is NVIDIA's technology R&D, which spans the entire AI field and a 98% market share. Huang is no longer just the CEO of a trillion-dollar company; in the era of large models, he has become a symbol of AI's future.

NVIDIA is like a prism, refracting for the outside world both the confidence in and the reality of AI's development across industries.

For the fourth quarter ended January 28, 2024, Nvidia's revenue was $22.1 billion, up 22% quarter over quarter and 265% year over year. Full fiscal-2024 revenue rose 126% to $60.9 billion.

On the strength of blockbuster GPUs such as the A100 and H100, NVIDIA's market value has surged sixfold over 15 months, at one point exceeding $2 trillion and ranking it as the world's third-largest technology company after Apple and Microsoft. In a sense, Huang can fairly be called the "rock-star Taylor Swift" of the AI world.

After the first day of GTC, Nvidia's stock closed down 1.76%. With the share price setting records almost daily, divergence in the capital markets is beginning to emerge: Jordan Klein, an analyst at Mizuho Securities, tried to cool things down, saying, "Nvidia's stock price action is a bit unhealthy; it reminds me of the frenzied tech-market mentality of 1999 and 2000," while Citi pointed out in a report that "the risk of a correction facing Nvidia is intensifying."

Forbes disagrees: "Anyone who has been wondering whether Nvidia will lose its competitive advantage should rest assured that the leader will continue to lead."

GTC 2024 runs March 18-21 (U.S. time) and offers more than 900 sessions, more than 300 exhibits, and more than 20 technical workshops covering generative AI and more. In today's two-hour opening keynote, Huang introduced NVIDIA's "new engines" across hardware, software, and services, igniting curiosity about NVIDIA's new generative-AI ecosystem.

His focus on this GTC release is:

  • In hardware, the new Blackwell architecture and the GB200 superchip, which deliver 4x Hopper's training performance for trillion-parameter large models; this was the biggest draw of this GTC.
  • To bring AI into the physical world, Omniverse Cloud, a digital-twin product that combines the training and application of robots, autonomous driving, and digital twins on a single platform.
  • Updates to the Isaac robotics platform, including the Isaac Perceptor perception SDK and the Isaac Manipulator robotic-arm control library, along with the Jetson Thor computer for humanoid robots and the Project GR00T general-purpose foundation model.

Watching a GTC keynote satisfies the imagination about AI, but it is also a chance to learn "storytelling" from Huang.

At last year's GTC, Huang declared that "the iPhone moment of AI has started." At this year's, he went further with an even more resounding line: "The future is generative."

Below, we reconstruct the key content of Huang's GTC 2024 keynote in his own words, compiled and edited without changing the original meaning:

01.

Blackwell:

"If you do it with Blackwell, you only need 2,000 GPUs, four megawatts of electricity"

Named after the American mathematician and game theorist David Blackwell, the Blackwell architecture succeeds the Hopper GPU architecture. With 208 billion transistors, it is NVIDIA's first GPU with a multi-die package design, integrating two dies into a single chip.

Hopper is great, but we need a bigger GPU. Blackwell is not a chip; it is the name of a platform. While we make GPUs, modern GPUs have taken a very different shape. At the heart of the Blackwell system is precisely this new GPU, which inside the company we refer to only by a number. In short, Blackwell is the top GPU in the world today.

The size difference is evident, with the Blackwell GPU on the left and the Hopper GPU on the right.

As a superchip, the GB200 connects two B200 GPUs to a Grace CPU over a 900GB/s ultra-low-power chip-to-chip interconnect.

Each new B200 GPU has 208 billion transistors and offers up to 20 petaflops of FP4 compute. Combining two of these GPUs with a single Grace CPU can deliver 30x the performance on LLM inference workloads, along with significant efficiency gains.

In an LLM benchmark on GPT with 175 billion parameters, the GB200 delivers 7x the performance of the H100, trains 4x faster, and, even more impressively, achieves a 30x increase in inference throughput per GPU.

Now, let's take a look at how Blackwell performs in action.

Imagine training a GPT model with 1.8 trillion parameters. Using traditional Ampere chips, it would take about 25,000 GPUs and three to five months. Switching to Hopper cuts that to 8,000 GPUs, but still requires 15 megawatts of power and a three-month training cycle.

With the Blackwell platform, however, we need only 2,000 GPUs to train in the same 90 days, and power consumption drops to just four megawatts. That is not only a significant cost reduction but a major gain in energy efficiency.

In short, Blackwell, with its superior performance and efficiency, provides a more economical and environmentally friendly solution for training large AI models.
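These cluster-level figures can be sanity-checked with simple arithmetic. The sketch below uses only the numbers quoted above; note the megawatt figures are whole-cluster averages from the keynote, not per-GPU specifications:

```python
# Energy arithmetic for the 1.8-trillion-parameter GPT training scenario
# described above. All inputs are the cluster-level keynote numbers.

def training_energy_mwh(power_mw: float, days: float) -> float:
    """Total energy drawn by the cluster over the run, in megawatt-hours."""
    return power_mw * days * 24  # megawatts x hours

hopper = {"gpus": 8_000, "power_mw": 15, "days": 90}
blackwell = {"gpus": 2_000, "power_mw": 4, "days": 90}

e_hopper = training_energy_mwh(hopper["power_mw"], hopper["days"])
e_blackwell = training_energy_mwh(blackwell["power_mw"], blackwell["days"])

print(f"Hopper:    {e_hopper:,.0f} MWh on {hopper['gpus']:,} GPUs")        # 32,400 MWh
print(f"Blackwell: {e_blackwell:,.0f} MWh on {blackwell['gpus']:,} GPUs")  # 8,640 MWh
print(f"Energy ratio:    {e_hopper / e_blackwell:.2f}x")                   # 3.75x
print(f"GPU-count ratio: {hopper['gpus'] // blackwell['gpus']}x")          # 4x
```

At the same 90-day schedule, the quoted figures imply 4x fewer GPUs and 3.75x less energy for the Blackwell cluster.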

Thanks to the new, faster fifth-generation NVLink, Blackwell can scale up to 576 GPUs (versus 256 for the H100). A second-generation Transformer engine with FP4 precision and a decompression engine 20 times faster than before also contribute to the performance gains.

The Transformer engine is a technology that computes each tensor at its optimal precision, now down to FP4. This means that even if a competitor's GPU offers the same number of FLOPS, Blackwell may be twice as fast at inference thanks to the Transformer engine.
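To make the precision trade-off concrete, here is a toy sketch of 4-bit floating-point quantization. This illustrates the general idea only; it is not NVIDIA's Transformer-engine logic or its exact FP4 encoding. An E2M1-style FP4 number has just 16 codes, so each tensor element costs a quarter of the bits of FP16, which is where the throughput headroom comes from:

```python
# Toy sketch of FP4-style quantization (illustration only, NOT NVIDIA's
# actual format). E2M1 "FP4" can represent just these magnitudes:
FP4_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_GRID = FP4_POS + [-v for v in FP4_POS if v > 0]  # 15 distinct values

def quantize_fp4(x: float, scale: float = 1.0) -> float:
    """Map x to the nearest representable FP4 value after per-tensor scaling."""
    scaled = x / scale
    nearest = min(FP4_GRID, key=lambda v: abs(v - scaled))
    return nearest * scale

weights = [0.07, -0.42, 1.9, 3.3]
print([quantize_fp4(w, scale=0.5) for w in weights])
# -> [0.0, -0.5, 2.0, 3.0]: coarse, but each value now needs 4 bits, not 16.
```

The per-tensor `scale` here stands in for the kind of dynamic scaling that makes low-precision formats usable at all; choosing the right precision per tensor is exactly the job the keynote attributes to the Transformer engine.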

Much of the marketing focus is not on the Blackwell GPU itself but on a three-die superchip called the GB200, which combines two Blackwell GPUs with a Grace Arm CPU.

This 2:1 ratio, a departure from the Grace-Hopper chip's 1:1 pairing, makes sense: on the GH200, Grace's I/O and compute bandwidth were already sufficient to manage two Blackwells, i.e., four GPU dies.

The GB200 NVL72 rack with NVLink support contains 72 Blackwell GPUs and 36 Grace CPUs. This single rack is capable of training a 27 trillion parameter model. Of course, most AI factories designed for this purpose will use multiple racks to train such a large model faster.

Project Ceiba, the AI supercomputer we host at AWS, will now consist of 20,000 GB200 GPUs instead of the 16,000 Grace Hopper superchips originally announced.

We have no shortage of customers: Amazon, Google, Microsoft, and Oracle are already planning to offer NVL72 racks in their cloud services.

The GB200 NVL72 is a liquid-cooled rack system with 36 GB200 Grace Blackwell Superchips; it delivers a 30x performance improvement on inference workloads compared with our current H100 GPUs.

The Blackwell GPU and GB200 superchip, undoubtedly our new flagships for AI training and inference, will also power the NVIDIA DGX B200 system, available via the cloud for model training, fine-tuning, and inference. All NVIDIA DGX platforms include NVIDIA AI Enterprise software for enterprise-grade development and deployment.

Looking back over the past eight years, we've seen a staggering 1,000-fold increase in computing power, far faster than Moore's Law predicts. You know, in the golden age of the PC revolution, performance only improved by 100 times every 10 years. However, it took us only eight years to achieve a 1,000-fold growth, and we are expected to continue to expand this advantage in the next two years. In short, the Blackwell platform is revolutionizing computing performance and laying the foundation for the future of technology.
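As a back-of-the-envelope check on those growth figures (a sketch; the inputs are just the numbers quoted above), the implied annual growth factors are:

```python
# Annualized growth factors implied by the figures above.
nvidia_annual = 1000 ** (1 / 8)   # 1,000x over 8 years
pc_era_annual = 100 ** (1 / 10)   # 100x over 10 years (the PC "golden age")
moore_annual = 2 ** (1 / 2)       # doubling every 2 years (classic Moore's law)

print(f"NVIDIA accelerated computing: ~{nvidia_annual:.2f}x per year")  # ~2.37x
print(f"PC-era performance:           ~{pc_era_annual:.2f}x per year")  # ~1.58x
print(f"Moore's-law pace:             ~{moore_annual:.2f}x per year")   # ~1.41x
```

Compounded yearly, 1,000x in 8 years is roughly 2.4x per year, against about 1.6x per year in the PC era and 1.4x per year for a strict Moore's-law doubling every two years.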

02.

Accelerating the New Industrial Revolution:

“A new industry has emerged”

Looking back, several milestones mark NVIDIA's history: its founding in 1993, the development of CUDA in 2006, and AlexNet in 2012, humanity's first real contact with modern AI.

In 2006, the CUDA computing model was born; we foresaw its revolutionary potential and expected rapid adoption. The real breakthrough came six years later, in 2012, when AlexNet first fused deep learning with CUDA. In 2016, recognizing the importance of this computing model, we launched a new kind of computer, the DGX-1, with up to 170 teraflops of compute; in this supercomputer, eight GPUs were interconnected for the first time.

In 2016, we delivered our first supercomputer, DGX-1, to OpenAI, a startup based in San Francisco.

As the first AI supercomputer, the DGX-1, with its 170 teraflops, opened a new chapter for AI technology. From the emergence of the Transformer in 2017 to ChatGPT astonishing the world in 2022, the importance and capability of artificial intelligence have become ever more prominent. In 2023, generative AI came to the fore, giving birth to an entirely new industry: we are now using computers to write software that never existed before, generating tokens and floating-point numbers in entirely new ways.

This is similar to the early days of the Industrial Revolution, when people realized that factories and energy could produce electricity, an intangible but valuable resource. Today, through infrastructure "factories", we are generating a new kind of "electron": tokens, and with them valuable artificial intelligence. This marks the birth of a new industry.

“A new industry has emerged.”

03.

“Omniverse and DRIVE Thor”

In the future, dynamic entities will be widely robotized, including humanoid robots, self-driving cars, and other devices.

These robotic systems need to operate efficiently in large venues such as stadiums, warehouses, and factories. To coordinate and manage such complex robotic production lines, we need a unified platform: Omniverse, a digital-twin platform that serves as the operating system of our robotics world, providing the infrastructure needed to integrate, coordinate, and optimize robotic systems.

Today, we're announcing the ability for Omniverse Cloud to stream to Vision Pro, giving users easy access to Omniverse's virtual worlds. The Vision Pro's seamless connection to Omniverse, combined with the integration of numerous CAD and design tools, provides users with an unprecedented workflow experience.

The trend of the future is that everything that moves will be robotized, which will lead to greater safety and convenience.

The automotive industry is one of the most important application areas. We are building a robotics stack on our computer systems, including for self-driving cars, and it will soon be deployed in Mercedes-Benz and then Jaguar Land Rover vehicles. These autonomous robotic systems are entirely software-defined, demonstrating the technology's limitless potential.

DRIVE Thor, a centralized in-vehicle computing platform, will be powered by the new Blackwell architecture for Transformers, large language models (LLMs), and generative AI workloads.

In 2015, we entered the field of in-vehicle computing, launching the first-generation autonomous-driving platform DRIVE PX and the Tegra series of in-vehicle chips, later followed by the Xavier and Orin chips. In 2022, as growth in the automotive business slowed, we officially launched a new generation of autonomous-driving chip, DRIVE Thor.

DRIVE Thor is an in-vehicle computing platform built for the increasingly important generative AI applications in the automotive industry. As the successor to the DRIVE Orin, the DRIVE Thor offers a wide range of cockpit features.

At the same time, we also announced that a number of leading electric vehicle manufacturers are using our DRIVE Thor, including many Chinese automakers such as BYD, GAC Aion, Xpeng, Li Auto and Zeekr, as well as autonomous driving platform companies such as WeRide.

I would also add that a new industrial revolution is coming, and the data centers of the future will be fully accelerated. The computing power we provide has enabled generative AI, a new way of developing software. This AI will usher in a new industrial revolution by creating infrastructure dedicated to generation tasks rather than traditional multi-user data centers.

At 6 a.m. Beijing time, Huang returned to the stage to say goodbye, and the GTC 2024 keynote came to an end.

04.

Further discussion: Who might have a chance to replace NVIDIA?

Today, AI is also redrawing the chip market landscape.

In January, Huang visited NVIDIA's offices in Beijing, Shanghai, and Shenzhen, and attended the annual meeting in China.

In February, in a filing with the U.S. Securities and Exchange Commission (SEC), Nvidia named Huawei as a current competitor in four of five listed areas: AI-related graphics processors, large cloud-service companies with in-house teams designing AI-related chips, Arm-based CPUs, and networking products.

It is also the first time Nvidia has listed Huawei as a main competitor across areas such as AI chips, alongside the likes of Intel, AMD, Broadcom and Qualcomm, as well as large cloud-computing companies such as Amazon, Microsoft, Alibaba and Baidu.

"There is a potential for new competitors or alliances between competitors to emerge and occupy significant market share," Nvidia said in the report. ”

The Global Times published an editorial saying that NVIDIA's move can be seen as a recognition of Huawei's progress in the field of AI chips.

Nvidia made clear that Huawei competes with it not only in GPUs but also in CPUs and networking chips. Huawei, listed among its biggest competitors, is also counted as a cloud-services company and has shown real strength in designing its own hardware and software to improve AI computing.

Huang once said that Huawei is a good company with very strong technical capabilities, and such an opponent deserves respect and attention. The growing number of semiconductor startups poses a serious challenge to Nvidia's dominance in the AI accelerator market.

Huawei has long been seen as a potential beneficiary of the computing boom. Its Ascend series competes head-to-head with NVIDIA's AI chips; in particular, the Ascend 910B launched last year is regarded as a Chinese alternative to the A100, which NVIDIA launched three years earlier.

Huawei's Ascend 910B uses the self-developed Ascend architecture on a 7nm process, with 256 AI Cores (artificial-intelligence cores) and up to 32GB of HBM2 (second-generation high-bandwidth memory). These specifications are roughly benchmarked against NVIDIA's A100, showing the depth of Huawei's strength in AI hardware.

Analysts estimate that China's AI chip market is worth about $7 billion. Just before the U.S. government tightened exports and restricted U.S. companies from supplying advanced AI chips to China, Baidu ordered 1,600 Ascend 910B chips from Huawei; as of last October, Huawei had delivered 60% of the order. In the same period last year, Zhou Hongyi said that 360 had purchased 1,000 Huawei AI chips.

Nvidia CFO Colette Kress was quoted as saying: "[Market] growth is strong in all regions except China. Following the U.S. government's export controls in October, our data-center revenue in China fell significantly." Kress noted that under the tighter controls, Nvidia has turned to exporting alternative products to China that do not require a license.

In a recent interview with Reuters, Huang said: "Nvidia is providing customers with samples of two new AI chips for the Chinese market, both of which meet the requirement of not needing a [U.S.] license. We look forward to our customers' feedback." Huang did not name the two chips or say which companies received samples, and Nvidia did not officially respond on the matter.

According to foreign tech media, Nvidia is preparing to launch three chips for the Chinese market: the H20, L20 and L2. While they retain most of the new features of Nvidia's AI products, some compute capability has been reduced to comply with the export controls the U.S. government expanded last October.

Of the three, the H20 is the most powerful. Originally scheduled for release last November, it was postponed because of problems on the server-manufacturer side. Reuters has reported that Nvidia has begun taking orders for H20 chips, with wholesale pricing similar to that of competing products from Chinese tech giant Huawei.

The move was seen as Nvidia's attempt to defend its dominance in the Chinese market.

The competition between the two tech giants in AI will only intensify, which will not only drive rapid technological progress but also bring more innovation and value to consumers worldwide.

Blackwell is NVIDIA's first chiplet-designed architecture, which could simplify the production of Blackwell-based GPUs at the silicon level, since yields are easier to maximize on smaller dies.

An evolved version of Blackwell would not only further improve AI acceleration but also feature high-speed memory interfaces, improved ray tracing, and stronger parallel processing. Morgan Stanley believes Nvidia intends to defend its compute advantage and lock in its core customers, and that the B100 is its most potent weapon.

But on the other hand, the packaging of multi-chip solutions has become more complex.

The question now is how quickly NVIDIA can ramp production of B100 SXM modules, B100 PCIe cards, and DGX servers. These are brand-new products using different components; if demand runs too hot, widespread delays could recur, as happened with early H100 shipments.

In an interview after the quarterly earnings release, Huang said: "All of our products are in short supply; that is the nature of new products. We are trying to meet demand as best we can, but overall, demand is simply growing too fast."

Colette Kress, Nvidia's chief financial officer, added: "We expect the supply of next-generation products to be very tight, as demand far outstrips supply capacity."

Mike O'Rourke, chief market strategist at JonesTrading, released a report titled "Rest in Peace, the Era of the Big Seven", arguing that the stock-market dominance of the tech giants Apple, Google, Meta, Nvidia, Tesla, Amazon and Microsoft is coming to an end. The seven stocks have parted ways as their fortunes diverged this year. Dan Niles, founder and portfolio manager of the Satori Fund, said: "Earnings are struggling, they're having competitive issues, and I think you can see that in the share prices. Apple and Tesla are both down this year, Google is also underperforming the market, and a portfolio should be left with only Nvidia, Meta, Amazon and Microsoft."

Five years on, with GTC back from online to offline, host Nvidia's standing is no longer what it was in 2019.
