Nvidia drops 20 AI "nuclear bombs" in a row! An 80-billion-transistor GPU and a 144-core CPU are coming

After a two-year wait, the Hopper architecture makes its debut!

Zhidong reported in the early morning of March 23 that NVIDIA today made a high-profile return with its new H100 GPU series, built on the latest Hopper architecture!

NVIDIA founder and CEO Jensen Huang still wore his leather jacket, but this time he appeared not in the kitchen that has become almost standard for GTC keynotes, but in a more sci-fi virtual space.

In keeping with his usual style, Huang packed every second of his keynote, announcing a string of "world firsts". This time he brought a series of blockbuster AI products that could be called the strongest on Earth, with AI performance at every precision 3 to 6 times that of the previous-generation A100.

Although NVIDIA's plan to acquire Arm has just fallen through, its data-center "three-chip" roadmap (GPU + DPU + CPU) remains unshaken: following the launch of its first data-center CPU last year, today NVIDIA unveiled the Arm-based Grace CPU Superchip.

In addition, Huang once again brought out his virtual avatar, Toy Jensen, and held a smooth real-time Q&A with the vividly animated figure.

Riding its two major tracks, graphics and artificial intelligence (AI), NVIDIA has become the world's most valuable semiconductor company. As of publication, NVIDIA's market capitalization stood above $660 billion, nearly $110 billion more than second-place TSMC.

Here are the key takeaways from this GTC conference:

1. H100 GPU: built on TSMC's 4N process with 80 billion transistors, it is the first GPU to support confidential computing; compared with the A100, FP8 performance is 6x higher and FP16, TF32, and FP64 performance are 3x higher.

2. New NVLink Switch system: highly scalable, supporting interconnection of up to 256 H100 GPUs.

3. Converged accelerator H100 CNX: couples the H100 GPU with a ConnectX-7 InfiniBand/Ethernet SmartNIC to deliver stronger performance for I/O-intensive applications.

4. DGX H100: equipped with 8 H100 GPUs (640 billion transistors in total); AI performance at the new FP8 precision is 6x the previous generation, with 900GB/s of GPU-to-GPU bandwidth.

5. DGX SuperPOD: composed of up to 32 DGX H100 systems, with AI compute reaching 1 EFLOPS.

6. Eos supercomputer: the world's fastest AI supercomputer, with 576 DGX H100 systems, delivering 18 EFLOPS of FP8 compute and 275 PFLOPS of FP64 compute.

7. Grace CPU Superchip: composed of two CPU dies on the latest Armv9 architecture, with 144 CPU cores and 1TB/s of memory bandwidth; available in the first half of 2023.

8. NVLink opened for custom chips: using advanced packaging, it is 25x more energy-efficient and 90x more area-efficient than PCIe Gen 5 on NVIDIA chips. NVIDIA will also support the universal chiplet-interconnect UCIe standard.

9. CUDA-X: more than 60 updates across CUDA-X libraries, tools, and technologies.

10. Riva 2.0: the conversational AI service Riva is now generally available; version 2.0 supports speech recognition in 7 languages and neural text-to-speech in voices of different genders.

11. Merlin 1.0: helps enterprises quickly build, deploy, and scale advanced AI recommender systems.

12. Sionna: an AI framework for 6G communications research.

13. OVX and OVX SuperPOD: data-center-scale servers and superclusters for industrial digital twins.

14. Spectrum-4: the world's first 400Gbps end-to-end networking platform, with 4x the switching throughput of previous generations, reaching 51.2Tbps.

15. Omniverse Cloud: lets collaborators work together remotely and in real time, anytime, anywhere.

16. DRIVE Hyperion 9: an automotive reference design with 14 cameras, 9 radars, 3 lidars, and 20 ultrasonic sensors, twice as many sensors as the previous generation.

17. DRIVE Map: a multimodal mapping engine that fuses data from cameras, lidar, and radar, with safety in mind.

18. Clara Holoscan MGX: a computing platform for the medical-device industry to develop and deploy real-time AI applications at the edge, delivering 254 to 610 trillion operations per second of AI compute.

19. Isaac for AMR: a reference design for autonomous mobile robot systems.

20. Jetson AGX Orin developer kit: brings server-class AI performance to the edge.

Huang also introduced the NVIDIA AI Acceleration Program, which works with developers across the AI ecosystem to build engineered solutions, ensuring customers can deploy with confidence.

01.

H100 GPU: 80 billion transistors, six innovations

NVIDIA names each new GPU architecture after a scientist, and this time is no exception.

The new Hopper architecture is named after American computer scientist Grace Hopper: Yale University's first female Ph.D. in mathematics, the world's third programmer, the inventor of the world's first compiler, and the first person to find a "bug".

On September 9, 1947, the Mark II machine Grace was working on malfunctioned. After nearly a day of troubleshooting, she found the cause of the failure: a dead moth in a relay. The terms "bug" and "debug" have been passed down as part of the computing vocabulary ever since.

The new series of Hopper-based AI computing products has been crowned with various "world firsts". By industry convention, anyone comparing AI compute takes NVIDIA's latest flagship GPU as the yardstick.

NVIDIA itself is no exception, starting by "crushing" the previous-generation A100 GPU it released two years ago.

As the world's first Hopper-based GPU, the NVIDIA H100 takes over the job of accelerating AI and high-performance computing (HPC), with AI performance at FP64, TF32, and FP16 precision reaching 3x that of the A100.

As you can see, NVIDIA is increasingly keen on lower-precision computing. Over the past six years it has developed technologies for training in FP32 and FP16; with the H100 it introduces a new Tensor processing format, FP8, whose AI performance reaches 4 PFLOPS, about 6x the A100's FP16.
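
As a quick sanity check on that "about 6x" figure: taking the commonly cited A100 spec of 624 TFLOPS of FP16 Tensor Core performance (with sparsity) as an assumption, the arithmetic works out as follows:

```python
# Back-of-the-envelope check of the "about 6x" FP8-vs-FP16 claim.
h100_fp8_tflops = 4000    # 4 PFLOPS FP8 (keynote figure)
a100_fp16_tflops = 624    # assumed A100 FP16 Tensor Core spec, with sparsity
print(f"speedup: {h100_fp8_tflops / a100_fp16_tflops:.1f}x")  # -> ~6.4x
```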

In terms of technology, the H100 brings six breakthrough innovations:

1) Advanced chip: the H100 is built on TSMC's 4N process with TSMC CoWoS 2.5D packaging; it packs 80 billion transistors (the A100 has 54 billion) and comes with HBM3 memory, achieving nearly 5TB/s of external interconnect bandwidth.

The H100 is the first GPU to support PCIe 5.0 and the first to adopt the HBM3 standard; a single H100 supports 40Tb/s of I/O bandwidth and 3TB/s of memory bandwidth. Huang quipped that 20 H100 GPUs could sustain the equivalent of the world's Internet traffic.

2) New Transformer Engine: this engine combines new Tensor Cores with software that dynamically applies the FP8 and FP16 formats across the layers of a Transformer network, cutting Transformer training time from weeks to days without losing accuracy.

3) Second-generation secure Multi-Instance GPU (MIG): MIG technology can partition a single GPU into 7 smaller, fully isolated instances to handle different kinds of jobs, and now provides a secure multi-tenant configuration for each GPU instance. The H100 can host 7 cloud tenants, whereas the A100 could host only 1, extending some of MIG's capabilities by 7x. Each H100 instance performs roughly on par with two full NVIDIA T4 cloud-inference GPUs.
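
For readers who have not used MIG, here is a minimal sketch of carving a GPU into 7 isolated instances, driving the real nvidia-smi CLI from Python. The "1g.5gb" profile name is the A100-era smallest slice; the H100's profile names may differ, so treat them as assumptions:

```python
# Sketch: enable MIG on GPU 0 and carve it into 7 one-slice instances.
# Requires root privileges and a MIG-capable GPU.
import subprocess

def run(cmd):
    print("$", cmd)
    subprocess.run(cmd.split(), check=True)

run("nvidia-smi -i 0 -mig 1")    # enable MIG mode (may require a GPU reset)
run("nvidia-smi mig -lgip")      # list the GPU instance profiles on offer
# Create 7 GPU instances ("1g.5gb" is the A100 profile name); -C also
# creates a compute instance inside each GPU instance.
run("nvidia-smi mig -cgi 1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb -C")
```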

4) Confidential computing: the H100 is the world's first GPU accelerator with confidential-computing capability, protecting AI models and customer data while they are in use. This can be applied to federated learning in privacy-sensitive industries such as healthcare and financial services, as well as to shared cloud infrastructure.

5) 4th-generation NVIDIA NVLink: to accelerate large AI models, NVLink can be combined with a new external NVLink Switch to extend NVLink into an inter-server network, connecting up to 256 H100 GPUs at 9x the bandwidth of the previous generation built on NVIDIA HDR Quantum InfiniBand.

6) DPX instructions: Hopper introduces a new instruction set called DPX that accelerates dynamic-programming algorithms, used for route optimization, genomics, and other optimization problems; compared with CPUs and previous-generation GPUs, it delivers speedups of up to 40x and 7x, respectively.
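
To make "dynamic programming" concrete, below is a classic DP kernel of exactly the kind DPX targets, Floyd-Warshall all-pairs shortest paths for route optimization, written in plain Python for illustration (on Hopper, DPX itself is exposed through NVIDIA's CUDA software stack):

```python
import math

def floyd_warshall(dist):
    """All-pairs shortest paths. dist[i][j] is the edge weight
    (math.inf if there is no edge); the matrix is updated in place."""
    n = len(dist)
    for k in range(n):                  # allow node k as an intermediate hop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k  # relax: found a shorter route via k
    return dist

INF = math.inf
graph = [[0, 3, INF],
         [INF, 0, 1],
         [2, INF, 0]]
print(floyd_warshall(graph))  # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```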

Overall, these technical optimizations make the H100 significantly more efficient at tasks such as deep recommender systems, large AI language models, genomics, complex digital twins, and climate science.

For example, Megatron 530B, a monolithic Transformer language model used for chatbots, achieves 30x higher throughput on the H100 than on the previous generation while meeting the sub-second latency required for real-time conversational AI.

Likewise, training a 395-billion-parameter mixture-of-experts model on the H100 is up to 9x faster, shortening training time from weeks to days.

The H100 will come in both SXM and PCIe form factors to meet a variety of server design needs.

The H100 SXM will be offered on HGX H100 server boards in 4-GPU and 8-GPU configurations; the H100 PCIe links two GPUs via NVLink, providing more than 7x the bandwidth of PCIe 5.0. The PCIe form factor eases integration into existing data-center infrastructure.

Power requirements have risen significantly for both form factors: the H100 SXM has a thermal design power (TDP) of 700W, 75% higher than the A100's 400W. According to Huang, the H100 comes in both air-cooled and liquid-cooled designs.

The product is expected to ship later this year. Cloud providers including Alibaba Cloud, AWS, Baidu AI Cloud, Google Cloud, Microsoft Azure, Oracle Cloud, Tencent Cloud, and Volcano Engine all plan to offer H100-based instances.

To bring the power of Hopper to mainstream servers, NVIDIA also launched the converged accelerator H100 CNX. It attaches the network directly to the GPU, coupling the H100 with an NVIDIA ConnectX-7 400Gb/s InfiniBand/Ethernet SmartNIC so that network data can be fed to the H100 at 50GB/s via DMA, avoiding bandwidth bottlenecks and delivering stronger performance for I/O-intensive applications.

02.

Stronger enterprise AI systems, and the world's fastest AI supercomputer

Built on the H100, NVIDIA's most advanced enterprise-grade AI infrastructure took the stage piece by piece: the DGX H100 system, the DGX POD, and the DGX SuperPOD. They will ship starting in the third quarter of this year.

Huang noted that 8 of the Fortune 10 and 44 of the Fortune 100 companies use DGX as their AI infrastructure.

NVIDIA DGX systems now include the NVIDIA AI Enterprise software suite, which adds support for bare-metal infrastructure. DGX customers can use the suite's pre-trained AI platform models, toolkits, and frameworks to speed up their work.

1. DGX H100: The most advanced enterprise-grade AI infrastructure

The 4th-generation NVIDIA DGX system, the DGX H100, is an AI platform built on the NVIDIA H100 Tensor Core GPU.

Each DGX H100 system carries 8 H100 GPUs (640 billion transistors in total) connected by NVLink, delivering up to 32 PFLOPS of AI performance at the new FP8 precision, 6x the previous-generation system.

The GPUs in a DGX H100 system are linked by 4th-generation NVLink at 900GB/s, 1.5x the previous generation; aggregate GPU memory bandwidth reaches 24TB/s.

Each system supports dual x86 CPUs and also contains two NVIDIA BlueField-3 DPUs for offloading, accelerating, and isolating advanced networking, storage, and security services.

Eight NVIDIA ConnectX-7 Quantum-2 InfiniBand NICs each deliver 400Gb/s of throughput for connecting compute and storage, twice the speed of the previous-generation system.

2. DGX SuperPOD: up to 1 EFLOPS of FP8 AI performance

The DGX H100 system is the building block for the new generation of NVIDIA DGX PODs and DGX SuperPOD supercomputers.

With the NVLink Switch system, a DGX POD of 32 nodes and 256 GPUs offers 20.5TB of HBM3 memory and up to 768TB/s of aggregate memory bandwidth.

"By comparison, the entire Internet carries only 100TB/s," Huang marveled. Each DGX connects to the NVLink Switch through quad-port optical transceivers; each port carries 8 lanes of 100G-PAM4 signaling, transmitting 100GB per second, and 32 NVLink transceivers connect into a 1U NVLink Switch system.

The new-generation DGX SuperPOD delivers 1 EFLOPS of FP8 AI performance, 6x the previous generation; it can run large-language-model workloads with trillions of parameters, and offers 20TB of HBM3 memory plus 192 TFLOPS of SHARP in-network computing performance.

Using Quantum-2 InfiniBand connectivity and the NVLink Switch system, the new DGX SuperPOD architecture moves data between GPUs at up to 70TB/s, 11x the previous generation.

The Quantum-2 InfiniBand switch chip has 57 billion transistors and provides 64 ports at 400Gbps each. Multiple DGX SuperPOD units can be combined.

In addition, NVIDIA launched a new DGX-Ready Managed Services program to help simplify AI deployment. Its DGX Foundry hosted development solution is expanding globally, with new locations in North America, Europe, and Asia supporting remote access to DGX SuperPODs.

Included with DGX Foundry is NVIDIA Base Command software, which lets customers easily manage the end-to-end AI development lifecycle on DGX SuperPOD infrastructure.

3. Eos: the world's fastest AI supercomputer

Huang also revealed that NVIDIA is building the Eos supercomputer, calling it "the first Hopper AI factory"; it will come online in a few months.

The machine comprises 18 DGX PODs and 576 DGX H100 systems, for a total of 4,608 H100 GPUs, and is expected to deliver 18.4 EFLOPS of AI compute, 4x faster in AI than Japan's Fugaku, currently the world's fastest supercomputer. For traditional scientific computing, Eos is expected to deliver 275 PFLOPS.
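
Eos's headline figures follow directly from the DGX H100 building blocks described above (8 GPUs per system, 4 PFLOPS of FP8 per H100):

```python
dgx_systems = 576
gpus = dgx_systems * 8        # -> 4608 H100 GPUs
fp8_eflops = gpus * 4 / 1000  # 4 PFLOPS of FP8 per GPU
print(gpus, fp8_eflops)       # -> 4608, ~18.4 EFLOPS
```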

03.

A superchip composed of two CPUs

In addition to GPUs, the CPU, another pillar of NVIDIA's data-center "three-chip" strategy, has also made new progress.

Today, NVIDIA unveiled its first Arm Neoverse-based data-center CPU purpose-built for HPC and AI infrastructure: the Grace CPU Superchip.

Huang called it "the ideal CPU for AI factories".

For context, the Grace Hopper Superchip module directly connects a CPU die and a GPU die; its key enabling technology is the memory-coherent NVLink chip-to-chip interconnect, with each link running at 900GB/s.

The Grace CPU Superchip is likewise composed of two CPU dies. They are connected via NVLink-C2C, a high-speed, low-latency chip-to-chip interconnect.

It is built on the latest Armv9 architecture; a single socket packs 144 CPU cores, delivers the highest single-threaded core performance, and supports Arm's new generation of vector extensions.

In the SPECrate2017_int_base benchmark, the Grace CPU Superchip's simulated score is 740, more than 1.5x that of the dual CPUs shipping in today's DGX A100, according to NVIDIA Labs' estimates using the same class of compilers.

In addition, the Grace CPU Superchip achieves 2x the memory bandwidth and energy efficiency of today's leading server chips.

Its innovative memory subsystem uses LPDDR5x memory with error-correcting code (ECC), offering the best balance of speed and power consumption: it provides twice the bandwidth of a traditional DDR5 design, reaching 1TB/s, while the whole CPU-plus-memory package consumes just 500 watts.

The Grace CPU Superchip runs all of NVIDIA's computing software stacks and, combined with NVIDIA ConnectX-7 NICs, can be flexibly configured into servers, either as a standalone pure-CPU system or as a GPU-accelerated server paired with 1, 2, 4, or 8 Hopper-based GPUs.

In other words, users need to maintain only one software stack and can optimize performance for their specific workloads.

Huang said Grace Superchips are expected to begin shipping next year.

04.

NVLink opened for custom-chip integration,

with support coming for the UCIe chiplet standard

Let's take a separate look at NVLink-C2C technology.

Both the Grace CPU Superchip series above and the Grace Hopper Superchip announced last year use this technology to connect their processor dies.

Ian Buck, NVIDIA's vice president of hyperscale computing, said: "To cope with the slowing of Moore's Law, chiplets and heterogeneous computing must be developed."

That is why NVIDIA has leveraged its expertise in high-speed interconnects to develop the unified, open NVLink-C2C interconnect technology.

The technology lets custom dies interconnect coherently with NVIDIA GPUs, CPUs, DPUs, NICs, and SoCs, enabling new classes of integrated products built from chiplets and helping data centers create next-generation system-level integration.

NVLink-C2C is now open to semi-custom chips, supporting their integration with NVIDIA technology.

Using advanced packaging, NVIDIA's NVLink-C2C interconnect is up to 25x more energy-efficient and 90x more area-efficient than PCIe Gen 5 on NVIDIA chips, with coherent interconnect bandwidth of 900GB/s or more.

NVLink-C2C supports the Arm AMBA Coherent Hub Interface (AMBA CHI) protocol as well as the CXL industry-standard protocol for interoperability between devices. NVIDIA and Arm are working closely to enhance AMBA CHI to support fully coherent, secure accelerators interconnected with other processors.

NVIDIA NVLink-C2C builds on NVIDIA's SERDES and LINK design technologies and scales from PCB-level integration and multi-chip modules to silicon interposers and wafer-level connections, delivering extremely high bandwidth while optimizing energy and die-area efficiency.

Beyond NVLink-C2C, NVIDIA will also support the Universal Chiplet Interconnect Express (UCIe) standard released earlier this month.

Custom chips integrating with NVIDIA silicon can use either the UCIe standard or NVLink-C2C, the latter being optimized for lower latency, higher bandwidth, and better energy efficiency.

05.

AI software: conversational AI service fully released,

recommender-system AI framework reaches version 1.0

NVIDIA can now offer full-stack AI: alongside its AI computing hardware, its AI software has also made considerable progress.

Huang said AI has fundamentally changed what software can do and how it is developed; over the past decade, NVIDIA accelerated computing has delivered a million-fold speedup in AI.

Today, NVIDIA released more than 60 updates to CUDA-X libraries, tools, and technologies, accelerating progress in quantum computing, 6G research, cybersecurity, genomics, drug discovery, and more.

NVIDIA will use Earth-2, its first AI digital twin, to tackle the challenge of climate change, building Physics-ML models that simulate the dynamic changes of global weather patterns.

NVIDIA has also developed FourCastNet, a weather-forecasting AI model trained on 10TB of Earth-system data; it is the first to achieve higher accuracy than advanced numerical models in precipitation forecasting, while predicting 4 to 5 orders of magnitude faster. What once took a year of traditional numerical simulation now takes just minutes.
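
That "4 to 5 orders of magnitude" is consistent with the year-to-minutes claim; as a rough check, assuming "minutes" means on the order of ten:

```python
import math

year_in_minutes = 365 * 24 * 60      # 525,600 minutes in a year
speedup = year_in_minutes / 10       # assume "a few minutes" ~ 10
print(speedup, math.log10(speedup))  # ~52,560 -> ~4.7 orders of magnitude
```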

NVIDIA Triton is an open-source, hyperscale model inference server, the "grand central station" of AI deployment: it supports CNN, RNN, GNN, Transformer, and other model types, every major AI framework and machine-learning platform, and runs in the cloud, on premises, at the edge, or on embedded devices.
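
As an illustration of how clients talk to Triton, here is a minimal sketch using the tritonclient HTTP API in Python; the model name ("resnet50") and the tensor names and shapes are placeholders that depend on the actual deployment:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server assumed to be listening on localhost:8000.
client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy image batch
inp = httpclient.InferInput("input__0", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)
out = httpclient.InferRequestedOutput("output__0")

resp = client.infer(model_name="resnet50", inputs=[inp], outputs=[out])
print(resp.as_numpy("output__0").shape)  # e.g. (1, 1000) class scores
```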

At the same time, Huang announced the general availability of NVIDIA's conversational AI service Riva, which supports speech recognition in 7 languages and converts text to speech with neural voices of different genders; users can customize and fine-tune it through the TAO transfer-learning toolkit.

Maxine is an AI model toolkit, now with 30 advanced models, that enhances the audio and video of real-time communications. For example, in a remote video conference, Maxine lets the speaker keep eye contact with all participants, and can translate the spoken language into another language in real time while keeping the speaker's voice unchanged.

This GTC release adds new models for echo cancellation and audio super-resolution.

In addition, Huang announced version 1.0 of Merlin, NVIDIA's AI framework for recommender systems.

Merlin helps enterprises quickly build, deploy, and scale advanced AI recommender systems. For example, using Merlin, WeChat cut short-video recommendation latency to a quarter of what it was and raised throughput 10x; by migrating the service from CPU to GPU, Tencent halved its cost for the business.

In healthcare, Huang noted that AI drug-discovery startups have attracted more than $40 billion of investment over the past few years; the conditions are ripe for a digital biology revolution, which he called "the greatest mission of NVIDIA AI to date".

The 6G standard is expected around 2026, and some of its underlying technologies are gradually taking shape. In response, Huang announced Sionna, an AI framework for 6G communications research.

06.

Omniverse: the first dedicated servers

and superclusters for digital twins

Huang believes the first wave of AI learned perception and inference, and the next wave will be robotics, that is, using AI to plan actions. NVIDIA's Omniverse platform is becoming an indispensable tool for building robotics software.

As a simulation engine for virtual worlds, the Omniverse platform obeys the laws of physics and builds realistic digital worlds, applicable to remote collaboration among designers using different tools as well as to industrial digital twins.

Huang believes industrial digital twins need a new type of purpose-built computer, so NVIDIA has built OVX servers and OVX SuperPOD superclusters for industrial digital twins.

OVX is the first Omniverse computing system, consisting of 8 NVIDIA A40 RTX GPUs, 3 ConnectX-6 200Gbps network interface cards (NICs), and 2 Intel Ice Lake Xeon CPUs.

The OVX SuperPOD supercluster is made up of 32 OVX servers, and the key fabric connecting them is NVIDIA's new Spectrum-4 Ethernet platform, announced today.

Reportedly the world's first 400Gbps end-to-end networking platform, it offers 4x the switching throughput of previous generations, with aggregate ASIC bandwidth reaching 51.2Tbps and support for 128 400GbE ports.
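
The aggregate bandwidth figure is simply port count times port speed:

```python
ports, gbps_per_port = 128, 400
print(ports * gbps_per_port / 1000)  # -> 51.2 Tbps aggregate ASIC bandwidth
```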

Spectrum-4 achieves nanosecond-level timing precision, 5 to 6 orders of magnitude better than the millisecond jitter typical of data centers. The switch also accelerates, simplifies, and secures the network architecture: compared with the previous generation, per-port bandwidth doubles, the number of switches required drops to a quarter, and power consumption falls by 40%.

The platform comprises the NVIDIA Spectrum-4 switch family, ConnectX-7 SmartNICs, BlueField-3 DPUs, and DOCA data-center infrastructure software; together they improve the performance and scalability of AI applications, digital twins, and cloud infrastructure, dramatically accelerating cloud-native applications at scale.

The Spectrum-4 ASIC and SN5000 switch family are built on a 4nm process with 100 billion transistors and a simplified transceiver design, delivering leading energy efficiency and total cost of ownership.

Spectrum-4 distributes bandwidth fairly across all ports and supports adaptive routing and enhanced congestion-control mechanisms, significantly increasing application speed in the data center.

The Spectrum-4 ASIC offers 12.8Tbps of encrypted bandwidth and leading security features, including support for MACsec and VXLANsec and secure boot enabled by default through a hardware root of trust, helping ensure the security and integrity of traffic and network management.

Major computer makers are now rolling out OVX servers. For customers who want to try Omniverse on OVX, NVIDIA offers its LaunchPad program at locations around the world. The first generation of OVX is already running at NVIDIA and early customers, and the second generation is being built; a Spectrum-4 prototype will be released by the end of the fourth quarter of this year.

Later in the keynote, Toy Jensen, the virtual avatar of Huang shown at previous GTC conferences, made a reappearance.

It is not a video recording: the avatar makes eye contact and converses entirely in real time. Huang asked it questions on the spot, such as "What is synthetic biology?" and "How were you made?", and it answered fluently.

Using NVIDIA's Omniverse Avatar framework, companies can quickly build and deploy avatars like Toy Jensen, from voice mimicry to subtle head and body movements to high-fidelity image rendering, making virtual humans far more lifelike.

Finally, thanks to Riva's latest conversational AI technology and the Megatron 530B large language model, such virtual humans can understand the questions you ask and interact with you in real time.

On this foundation, NVIDIA announced the upcoming Omniverse Cloud. With an Omniverse Cloud connection, collaborators can work together remotely in real time using NVIDIA RTX PCs, laptops, and workstations.

Users without an RTX computer can launch Omniverse from GeForce NOW with a single click.

07.

Automotive: previewing DRIVE Hyperion 9,

launching a multimodal mapping engine

The Omniverse platform sits at the heart of the entire workflow, while the DRIVE platform serves as the AI driver.

Huang announced that the next-generation DRIVE Hyperion 9 will ship in cars from 2026; it will carry 14 cameras, 9 radars, 3 lidars, and 20 ultrasonic sensors, twice as many sensors overall as Hyperion 8.

In addition, NVIDIA introduced NVIDIA DRIVE Map, a multimodal mapping engine that fuses data from cameras, lidar, and radar, with safety in mind.

DRIVE Map comprises two mapping engines: a ground-truth survey mapping engine and a crowdsourced fleet mapping engine. Huang said that by 2024 they expect to have mapped and built digital twins of all major highways in North America, Western Europe, and Asia, roughly 500,000 kilometers in total.

"We are building an Earth-scale digital twin for autonomous fleets," Huang said.

On the partnership front, BYD, the world's second-largest electric-vehicle maker, will equip cars entering production in the first half of 2023 with the DRIVE Orin computing platform. Autonomous-driving unicorn Yuanrong Qixing and Chinese self-driving startup Yunji Zhixing also announced that their production programs for L4-level autonomous vehicles will use NVIDIA DRIVE Orin SoCs.

US electric-vehicle company Lucid Motors, Chinese L4 self-driving company Wenyuan Zhixing, and Chinese electric-vehicle startup Yo-Run Technology have all announced that they will adopt NVIDIA's DRIVE Hyperion autonomous-vehicle platform.

08.

Robotic platforms: From medical devices to autonomous mobile robots

Huang believes that the next wave of AI is robotics, and NVIDIA is building multiple robotic platforms, including DRIVE for self-driving cars, Isaac for manipulating and controlling systems, Metropolis for autonomous infrastructure, and Holoscan for medical devices.

He simplifies the workflow of the robotic system into four pillars: truth data generation, AI model training, Omniverse digital twin, and robotics stack.

Clara Holoscan MGX is an open and scalable robotics platform designed to meet IEC-62304 medical-grade specifications, with a Jetson AGX Orin and ConnectX-7 smart network card at its core, and optional NVIDIA RTX A6000 GPU.

The ai computing power of the platform can reach 254 to 610 trillion operations per second, and is currently open to early experience customers, the official launch time is May, and will complete medical-grade preparation in the first quarter of 2023.

The Metropolis platform has reached 300,000 downloads, has more than 1,000 ecosystem partners, and is deployed in over 1 million facilities.

One of the fastest-growing areas of robotics is autonomous mobile robots (AMRs), which are essentially indoor driverless vehicles, operating at low speed but in highly unstructured environments.

Today, NVIDIA launched Isaac for AMR, which has four core components: NVIDIA DeepMap for ground-truth generation, NVIDIA AI for model training, an Orin-based AMR robot reference design, and new Gems in the Isaac robotics stack plus a new Omniverse-based Isaac Sim; each is available separately and fully open.

Like DRIVE Hyperion, Isaac Nova is an AMR robot system reference design on which the entire Isaac stack runs. Nova has 2 cameras, 2 lidars, 8 ultrasonic radars, and 4 fisheye cameras.

NVIDIA also announced the Jetson AGX Orin developer kit, bringing server-class AI performance to the edge.

Shipping in the second quarter, the Nova AMR will be equipped with NVIDIA's newly announced DeepMap radar mapping system, which scans and reconstructs the environment for route planning and digital-twin simulation.

09.

Conclusion: A feast of cutting-edge technology for AI developers

Over the years, NVIDIA GTC has grown into a technology feast for AI, HPC, scientific computing, digital twins, autonomous driving, and many other cutting-edge fields.

At this feast, we see not only technical breakthroughs changing productivity and ways of working across industries, but also NVIDIA's latest moves across the computing landscape.

With the arrival of a new generation of large-scale cloud technologies, data-center architecture must evolve. On the stable footing of its GPU business, NVIDIA's role is shifting from a specialist in graphics and accelerated computing to an all-round player built around the data center's three chip pillars.

Huang believes data centers are becoming "AI factories" that process massive amounts of data to refine intelligence, and today's H100 is the engine that accelerates enterprises' AI business.

The H100's technical innovations, the purpose-built design of the data-center Grace CPU Superchip, and the continuous upgrades to the AI and Omniverse platforms further extend NVIDIA's lead in accelerating AI training and inference.

Over the four days of the NVIDIA GTC conference, we will also see experts from many different fields share how they are using AI and accelerated-computing innovations to conduct groundbreaking research or tackle today's challenges.
