NVIDIA Launches DGX H100 System to Lead Enterprise AI Infrastructure

NVIDIA today announced the fourth-generation NVIDIA DGX system, the world's first AI platform built with the new NVIDIA H100 Tensor Core GPU.

NVIDIA DGX H100 Systems

The DGX H100 system meets the massive compute requirements of large language models, recommender systems, healthcare research and climate science. Each DGX H100 system is equipped with eight NVIDIA H100 GPUs connected by NVIDIA NVLink, delivering 32 petaflops of AI performance at the new FP8 precision, 6x more than the previous-generation system.
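
For context on where the 32-petaflop figure comes from, here is a minimal back-of-envelope sketch; the per-GPU FP8 throughput and the previous-generation baseline are assumptions for illustration, not numbers from this announcement.

```python
# Rough sanity check of the per-system FP8 figure quoted above.
# Assumptions (not from the article): each H100 provides roughly 4 petaflops
# of FP8 compute, and the previous-generation DGX delivers about 5 petaflops.
GPUS_PER_DGX_H100 = 8
ASSUMED_FP8_PFLOPS_PER_H100 = 4      # assumed per-GPU FP8 throughput, in petaflops
ASSUMED_PREV_GEN_PFLOPS = 5          # assumed previous-generation system figure

dgx_h100_pflops = GPUS_PER_DGX_H100 * ASSUMED_FP8_PFLOPS_PER_H100
print(f"DGX H100 FP8: ~{dgx_h100_pflops} petaflops")                                        # ~32
print(f"Speed-up vs. prior generation: ~{dgx_h100_pflops / ASSUMED_PREV_GEN_PFLOPS:.1f}x")  # ~6x
```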

The DGX H100 system is the building block of the new generation of NVIDIA DGX POD and NVIDIA DGX SuperPOD AI infrastructure platforms. The new DGX SuperPOD architecture features a new NVIDIA NVLink Switch system that connects up to 32 nodes with a total of 256 H100 GPUs.

The new generation of DGX SuperPOD delivers 1 exaflop of FP8 AI performance, 6x more than its predecessor, and can run massive LLM workloads with trillions of parameters, pushing the frontiers of AI.
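
A similar sketch shows how the per-node number scales to the 1-exaflop SuperPOD figure, assuming roughly 32 petaflops of FP8 per DGX H100 node as quoted above.

```python
# Scaling the per-node figure up to a full DGX SuperPOD.
NODES_PER_SUPERPOD = 32      # DGX H100 nodes connected by the NVLink Switch system
PFLOPS_PER_NODE = 32         # FP8 petaflops per DGX H100 node, as quoted above

superpod_pflops = NODES_PER_SUPERPOD * PFLOPS_PER_NODE
print(f"DGX SuperPOD FP8: ~{superpod_pflops} petaflops (~{superpod_pflops / 1000:.0f} exaflop)")
```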

Jen-Hsun Huang, founder and CEO of NVIDIA, said: "AI has fundamentally changed what software can do and how it is produced. Companies that use AI to revolutionize their industries are also recognizing the importance of their own AI infrastructure. NVIDIA's new DGX H100 systems will power enterprise AI factories that refine data into our most valuable resource: intelligence."

NVIDIA Eos, the world's fastest AI supercomputer

NVIDIA will be the first to build a DGX SuperPOD with this breakthrough new AI architecture, powering the work of NVIDIA researchers advancing climate science, digital biology and the future of AI.

The "Eos" supercomputer, which will begin operation later this year, is equipped with a total of 576 DGX H100 systems and a total of 4608 DGX H100 GPUs, which is expected to become the fastest AI system in the world.

NVIDIA Eos is expected to deliver 18.4 exaflops of AI computing performance, 4x faster AI processing than Japan's Fugaku supercomputer, currently the world's fastest system. For traditional scientific computing, Eos is expected to deliver 275 petaflops of performance.
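
The Eos headline numbers follow the same arithmetic; the sketch below assumes the roughly 32 petaflops of FP8 per DGX H100 system quoted earlier.

```python
# How the Eos totals line up with the per-system numbers quoted earlier.
DGX_H100_SYSTEMS = 576
GPUS_PER_SYSTEM = 8
PFLOPS_PER_SYSTEM = 32       # FP8 petaflops per DGX H100 system (from the article)

total_gpus = DGX_H100_SYSTEMS * GPUS_PER_SYSTEM              # 4,608 H100 GPUs
ai_exaflops = DGX_H100_SYSTEMS * PFLOPS_PER_SYSTEM / 1000    # ~18.4 exaflops FP8

print(f"GPUs: {total_gpus}")
print(f"AI performance: ~{ai_exaflops:.1f} exaflops FP8")
```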

For NVIDIA and its OEM and cloud computing partners, Eos will be the blueprint for its advanced AI infrastructure.

Easily scale enterprise-class AI with the DGX H100 system, DGX POD, and DGX SuperPOD

As enterprises grow from initial projects to broad deployment, DGX H100 systems scale easily to meet their AI needs.

In addition to eight H100 GPUs and a total of 640 billion transistors, each DGX H100 system contains two NVIDIA BlueField-3 DPUs for offloading, accelerating, and isolating advanced networking, storage, and security services.

Eight NVIDIA ConnectX-7 Quantum-2 InfiniBand networking adapters each deliver 400 Gb/s of throughput for connecting compute and storage, double the speed of the previous-generation system. Fourth-generation NVLink, combined with NVSwitch, provides 900 GB/s of connectivity between every GPU in each DGX H100 system, 1.5x more than the previous generation.
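
Because the two interconnect figures above use different units (gigabits per second for InfiniBand, gigabytes per second for NVLink), a quick unit check may help; this is illustrative arithmetic, not an official specification.

```python
# Unit check on the per-system I/O figures quoted above.
IB_ADAPTERS = 8
IB_GBITS_PER_ADAPTER = 400               # 400 Gb/s per ConnectX-7 adapter

aggregate_ib_gbits = IB_ADAPTERS * IB_GBITS_PER_ADAPTER   # 3,200 Gb/s per system
aggregate_ib_gbytes = aggregate_ib_gbits / 8              # ~400 GB/s to the fabric

NVLINK_GBYTES_PER_GPU = 900              # GB/s GPU-to-GPU over fourth-generation NVLink

print(f"InfiniBand: {aggregate_ib_gbits} Gb/s (~{aggregate_ib_gbytes:.0f} GB/s) per system")
print(f"NVLink: {NVLINK_GBYTES_PER_GPU} GB/s between GPUs")
```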

The DGX H100 system uses dual x86 CPUs and can be combined with NVIDIA networking and storage from NVIDIA partners, giving the DGX POD the flexibility to serve AI computing at any scale.

With DGX H100 systems, a DGX SuperPOD can serve as a scalable, enterprise-class AI center of excellence. The DGX H100 nodes and H100 GPUs in a DGX SuperPOD are connected by the NVLink Switch system and NVIDIA Quantum-2 InfiniBand, providing up to 70 TB/s of bandwidth, 11x more than the previous generation. Storage from NVIDIA partners will be tested and certified to meet the demands of DGX SuperPOD AI computing.

Multiple DGX SuperPOD units can be combined to provide industries such as automotive, healthcare, manufacturing, communications, retail, and more with the AI performance needed to develop large models.

The NVIDIA DGX Foundry hosted development solution is expanding globally, giving DGX SuperPOD customers immediate access to advanced computing infrastructure while their systems are being installed. New locations in North America, Europe and Asia support remote access to a DGX SuperPOD, or a portion of one.

DGX Foundry includes NVIDIA Base Command software, which enables customers to easily manage the end-to-end AI development lifecycle based on the DGX SuperPOD infrastructure.

With NVIDIA LaunchPad Labs hosted in Equinix IBX (International Business Exchange) data centers around the world, eligible businesses can experience NVIDIA Base Command and DGX systems for free.

Enterprise-grade MLOps software helps customers increase AI adoption

To support DGX customers who are developing AI, MLOps solutions from NVIDIA DGX-Ready software partners, including Domino Data Lab, Run:ai, and Weights & Biases, will join the NVIDIA AI Acceleration program.

The MLOps applications provided by participating partners will be validated to provide enterprise-class workflows and cluster management, scheduling and orchestration solutions to DGX customers.

In addition, NVIDIA DGX systems now include the NVIDIA AI Enterprise software suite, which adds support for bare-metal infrastructure. DGX customers can use the pretrained NVIDIA AI platform models, toolkits and frameworks included in the suite, such as NVIDIA RAPIDS, the NVIDIA TAO Toolkit and NVIDIA Triton Inference Server, to accelerate their work.
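
As a concrete illustration of the kind of workflow these components accelerate, below is a minimal, hypothetical sketch using RAPIDS cuDF, one of the libraries named above, to run a pandas-style aggregation on the GPU; the file name and column names are placeholders, not part of the announcement.

```python
# Minimal sketch of a GPU-accelerated dataframe workflow with RAPIDS cuDF,
# one of the components bundled in NVIDIA AI Enterprise.
# The CSV path and column names are hypothetical placeholders.
import cudf

df = cudf.read_csv("transactions.csv")                 # loads the data onto the GPU
summary = df.groupby("customer_id")["amount"].sum()    # pandas-style aggregation, on the GPU
print(summary.head())
```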

The DGX-Ready Managed Services program simplifies AI deployment

As enterprise AI adoption continues to increase, customers are looking for more options to add the infrastructure they need to transform their businesses. NVIDIA has launched a new DGX-Ready managed services program that supports customers who want to partner with service providers to oversee their infrastructure.

Deloitte is the first global provider to partner with NVIDIA on this initiative and will be certified to support customers in Europe, North America and Asia alongside regional vendors (CGit, ePlus, Insight Enterprises and PTC System).

Jim Rowan, Deloitte Consulting's senior partner and head of AI and Data Operations services, said: "AI can drive business breakthroughs only when enterprises are able to integrate the technology into their operations. With the new DGX-Ready Managed Services program, customers can easily adopt advanced AI, with NVIDIA DGX systems and software managed by Deloitte's experts around the world."

The DGX-Ready Lifecycle Management Program enables easy upgrades

Customers can now upgrade their existing DGX systems with the new NVIDIA DGX platform through the new DGX-Ready Lifecycle Management Program.

NVIDIA channel partners participating in the DGX-Ready Lifecycle Management Program will refresh previous-generation DGX systems for purchase by new customers, expanding worldwide access to ready-to-deploy AI infrastructure.

Availability information

Starting in the third quarter, NVIDIA's global partners will begin supplying NVIDIA DGX H100 systems, DGX PODs, and DGX SuperPODs.

Customers can also choose to deploy DGX systems in colocation facilities operated by NVIDIA DGX-Ready data center partners such as Cyxtera, Digital Realty and Equinix IBX data centers.
