After flexing its muscles with the world's fastest AI supercomputer, Jensen Huang wants to use AI to seize the metaverse's lifeline

2022-03-23 21:20:47

Written | Fish three falcons

Edit | Jingyu

"The next era of AI": in the 1-hour-40-minute keynote at NVIDIA's GTC conference on March 22, Beijing time, founder Jensen Huang repeated this phrase again and again.

Against a black virtual backdrop, Huang methodically introduced a series of hardware, software, and AI and robotics application frameworks that serve AI computing, and reviewed NVIDIA's recent AI-driven achievements in autonomous driving, virtual worlds, healthcare and other fields.

At the fall GTC 2021 last November, Huang made a high-profile announcement that NVIDIA was "entering the metaverse". By comparison, GTC 2022 focused on more practical problems.

Since its birth, the "metaverse" has gone from industry sensation to a synonym for "unrealistic", a dramatic rise and fall. The metaverse players who remain in the game now that the hype has cooled have to think about a serious question: where to start in order to reach such a distant future.

"AI" is the metaverse lifeline that NVIDIA has grasped.

For the metaverse, image processing and generation capabilities need to improve by a factor on the order of tens of millions. AI can carry out more complex and refined image processing, and whether the task is replicating and simulating reality or building something new, AI is an indispensable foundation.

Behind "AI" lies something more basic and more critical: computing power.

After more than a decade of development, ever more data has been collected, ever larger models have been created, and the volume of data and parameters waiting to be processed has risen sharply.

Some professionals believe that realizing the metaverse depicted in "Snow Crash" will require at least a 1,000-fold increase in computing power, and industry giants such as Apple, Tesla and Meta are gradually turning to developing and customizing their own chips.

The industry is calling for a more efficient computing hardware foundation, and faced with the "barbarians" suddenly at the gate, NVIDIA has chosen to take the initiative.

Whether in releasing the H100 GPU based on the new Hopper architecture and the Grace CPU, or in showing its progress in AI software, NVIDIA revealed its strategy and its ambition to seize the next wave of AI.

Computing power: top priority

NVIDIA H100

The first product to appear in the keynote was the H100, the first GPU based on the new Hopper architecture.

The NVIDIA H100 is built on the TSMC 4N (TSMC 4nm) process and integrates 80 billion transistors. It significantly speeds up AI, HPC, memory bandwidth, interconnect and communication, and provides nearly 5TB/s of external interconnect bandwidth.

"20 H100 GPUs can carry the traffic of the entire global Internet!" Huang boldly declared on stage.

The H100 delivers an order-of-magnitude leap in performance and is one of the largest GPUs NVIDIA has ever built. Its computing power is 4 petaFLOPS at FP8, 2 petaFLOPS at FP16, 1 petaFLOPS at TF32, and 60 teraFLOPS at FP64 and FP32.

NVIDIA H100 | Nvidia

The H100's large-scale training performance is 9 times that of its predecessor, the A100, and its large language model inference throughput is 30 times that of the A100.

Hopper also includes a dedicated engine built specifically for Transformer models, which shortens training that used to take weeks to a few days. At the same training accuracy, performance improves sixfold.

In addition, the H100 is the world's first accelerator with confidential computing capabilities, so both AI models and customer data are protected.

Grace CPU Superchip

Alongside the H100, the Grace CPU, which Huang called "the ideal CPU for the world's AI infrastructure", is no less impressive.

Grace is NVIDIA's first CPU dedicated to AI infrastructure and high-performance computing. It is based on Arm v9, the latest data center architecture, and consists of two CPU chips with a total of 144 cores and 500W of power consumption, delivering two to three times the performance of its predecessors.

Grace CPU | Nvidia

The two CPU chips are connected via NVLink, a technology that provides high-speed, low-latency chip-to-chip interconnection. Grace CPUs and Hopper GPUs can also be combined into custom configurations via NVLink.

NVLink will be widely used in future NVIDIA chips, including CPUs, GPUs, DPUs and SoCs. With this technology, customers will be able to use NVIDIA's platform to build semi-custom chips.

Eos: the world's fastest AI supercomputer

When a single chip's computing power is not enough, numbers make up the difference.

From Huang's explanation we learn that 8 H100 GPUs and 4 NVLink switch chips are combined into a DGX H100. This giant GPU has 640 billion transistors and 32 petaFLOPS of AI computing power. 32 DGX H100 systems form a DGX POD with 256 GPUs, and 18 DGX PODs, 4,608 GPUs in total, make up Eos, the supercomputer NVIDIA announced.

DGX H100 | Nvidia

In the end, by traditional supercomputing standards Eos is expected to reach 275 petaFLOPS, 1.4 times that of Summit, the largest supercomputer in the United States. Measured in AI computing, Eos's 18.4 exaFLOPS will be four times that of Fugaku, currently the world's top-ranked supercomputer.

By then, Eos will be the world's fastest AI supercomputer.
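The scale-up figures above follow from simple multiplication of the per-GPU numbers quoted in this article. The Python sketch below reproduces that arithmetic. It rests on two assumptions that the keynote coverage does not state explicitly: that the "AI" figure corresponds to the FP8 rate, and that the "traditional" figure corresponds to the FP64 rate, each scaled linearly with GPU count and ignoring any real-world scaling losses. The constant and function names are purely illustrative.

```python
# Per-GPU figures as quoted in this article for the H100.
TRANSISTORS_PER_GPU = 80e9     # 80 billion transistors
FP8_PFLOPS_PER_GPU = 4.0       # assumed to be the "AI" rate, in petaFLOPS
FP64_TFLOPS_PER_GPU = 60.0     # assumed to be the "traditional" rate, in teraFLOPS

# System composition as described in the keynote.
GPUS_PER_DGX = 8
DGX_PER_POD = 32
PODS_IN_EOS = 18


def eos_scale():
    """Scale per-GPU peak numbers linearly up to DGX, POD and Eos level."""
    gpus_per_pod = GPUS_PER_DGX * DGX_PER_POD
    total_gpus = gpus_per_pod * PODS_IN_EOS
    return {
        "dgx_transistors": GPUS_PER_DGX * TRANSISTORS_PER_GPU,
        "dgx_ai_pflops": GPUS_PER_DGX * FP8_PFLOPS_PER_GPU,
        "gpus_per_pod": gpus_per_pod,
        "total_gpus": total_gpus,
        # 1 exaFLOPS = 1000 petaFLOPS; 1 petaFLOPS = 1000 teraFLOPS
        "eos_ai_exaflops": total_gpus * FP8_PFLOPS_PER_GPU / 1000,
        "eos_fp64_pflops": total_gpus * FP64_TFLOPS_PER_GPU / 1000,
    }


if __name__ == "__main__":
    for name, value in eos_scale().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the linear estimate gives 640 billion transistors and 32 petaFLOPS per DGX H100, 256 GPUs per POD and 4,608 GPUs in total, about 18.4 exaFLOPS of AI computing, and roughly 276 petaFLOPS by the traditional measure, close to the 275 petaFLOPS quoted.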

Software: Steady updates

On the software side, NVIDIA continues to ship steady updates.

NVIDIA released more than 60 updates to CUDA-X, its collection of libraries, tools and technologies, and reported progress on climate forecasting, the conversational AI service Riva, and the Merlin framework for recommender systems.

Earth-2 | Nvidia

At GTC 2021 last year, NVIDIA announced its first AI digital twin, Earth-2. A few months later, it developed FourCastNet, an AI model for weather forecasting.

Developed by NVIDIA and researchers from universities and research institutions such as Caltech and Berkeley Lab, the model is trained on up to 10TB of Earth system data to predict the probability of precipitation more accurately than previous models.

Huang then introduced Riva, NVIDIA's conversational AI service.

Riva 2.0 recognizes speech in 7 languages and offers neural text-to-speech with male and female voices. Users can custom-tune it with NVIDIA's TAO transfer learning toolkit.

Maxine is a toolkit of 30 AI models that optimizes audiovisual effects for video communications in real time.

Maxine | Nvidia

During a remote video conference, even if the speaker is reading from a script or browsing other web pages, Maxine helps the speaker maintain eye contact with the other attendees. If attendees come from different countries and speak different languages, Maxine can switch the speech to another language in real time using its AI models.

The Merlin framework is geared towards recommender systems.

The metaverse and a new wave of AI

While boosting computing power and making up for its shortcomings in CPUs, NVIDIA has not forgotten the "sea of stars" it is ultimately pursuing: the metaverse.

Huang's avatar, Toy Jensen, once again took the stage to talk with Huang himself. Notably, this time Toy Jensen was able to make eye contact and converse with Huang entirely in real time.

Faced with tough questions such as "what is synthetic biology" and "how did you make it?", Toy Jensen gave fluent answers.

Behind Toy Jensen is NVIDIA's Omniverse Avatar framework, which enables companies to quickly build similar avatars that mimic appearances, movements, and sounds.

The real-time conversation is powered by the Riva service mentioned earlier and the very large language model Megatron 530B, which allow the avatar to understand questions and respond in real time.

Toy Jensen talks with Jensen Huang | Nvidia

Building avatars and interacting with them in real time will undoubtedly be the norm in the future metaverse, and in a demonstration of just a few minutes, NVIDIA showed that this no longer seems impossible.

In addition, in Huang's view, the new chips, software and simulation capabilities will set off a "new wave of AI". The first wave of AI learned perception and reasoning; the next wave of AI development is heading towards robotics.

NVIDIA has gradually built end-to-end, full-stack robotics platforms: NVIDIA Avatar for avatars, DRIVE for autonomous driving, Isaac for manipulation and control systems, Metropolis for autonomous infrastructure, and Holoscan for medical devices. All of them are built around four pillars: real data generation, AI model training, the robotics stack, and Omniverse digital twins.

At the end of the keynote, Huang spent about 8 minutes walking the audience through the newly released technologies, products and platforms from the beginning, and summarized 5 trends shaping the industry: million-X leaps in computing speed, Transformers that greatly accelerate AI, data centers becoming AI factories, exponentially growing demand for robotic systems, and digital twins for the next era of AI.

The increase in "computing power" will continue to be the basis for all breakthroughs.

"We will accelerate the entire stack at data center scale over the next decade, once again achieving a million-X performance leap. I can't wait to see what the next million-X leap will bring."