NVIDIA is working on a variety of multi-chip GPU designs that can be combined for different workloads

2022-01-08 15:58:27

NVIDIA's next generation of GPUs will be based on the Hopper architecture for the data center and the Ada Lovelace architecture for the consumer market. The difference is that NVIDIA will use multi-chip module (MCM) packaging only on Hopper GPUs. Ada Lovelace GPUs will keep the traditional monolithic design, so NVIDIA will not bring MCM packaging to consumer GPUs the way AMD is doing with Navi 31, which is based on the RDNA 3 architecture.

Recently, NVIDIA researchers published an article detailing how the company is exploring multi-chip designs for future products. With the rise of heterogeneous computing, NVIDIA is looking for ways to make its semiconductor designs more flexible, so that different modules can be matched to the workload, which is where MCM packaging comes in.

NVIDIA's research on multi-chip designs first came to light in 2017, when the company presented a design built from four small chips. The approach not only improved performance but also raised yield (smaller dies yield better) and allowed more computing resources to be pooled together. A multi-chip design also helps improve power delivery efficiency and heat dissipation.

NVIDIA's current approach to MCM GPUs is called "Composable On Package GPU," or COPA. The article explains how NVIDIA handles the differences between HPC and AI workloads: as the computing needs of the two change, their hardware requirements are gradually diverging. NVIDIA worries that a single, one-size-fits-all GPU architecture will gradually lose its computing advantage in both HPC and AI workloads, even as both markets keep growing.

To better cope with future computing needs, NVIDIA has been simulating different multi-chip designs and configurations to determine which hardware modules each workload requires. According to data provided by NVIDIA, a 25% reduction in memory bandwidth on HPC workloads reduces performance by only 4%, and cutting bandwidth by another 25% adds a further 10% penalty. After reducing memory bandwidth by 50% and removing the associated hardware, the freed resources can therefore be given to modules better suited to the workload, improving efficiency. Since not all hardware modules are interchangeable and some functions are indispensable, COPA is NVIDIA's attempt to simulate the impact of multi-chip designs and their relationship to performance.
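The bandwidth trade-off quoted above can be worked through in a few lines. This is only an illustrative sketch, not NVIDIA's simulation method: it assumes the "further 10%" penalty is additive on top of the first 4% (14% in total at half bandwidth), and the names BANDWIDTH_STEPS, perf_per_bandwidth and summarize are made up for this example.

```python
# Figures quoted in the article for HPC workloads, as fractions of the baseline.
# Assumption: the second penalty adds to the first (4% + 10% = 14%).
BANDWIDTH_STEPS = [
    (1.00, 0.00),  # full memory bandwidth, no penalty
    (0.75, 0.04),  # 25% less bandwidth -> 4% slower
    (0.50, 0.14),  # 50% less bandwidth -> 14% slower in total
]


def perf_per_bandwidth(bandwidth, penalty):
    """Relative performance delivered per unit of relative bandwidth."""
    return (1.0 - penalty) / bandwidth


def summarize(steps):
    """Return (bandwidth, remaining performance, performance per bandwidth)."""
    return [
        (bw, round(1.0 - penalty, 2), round(perf_per_bandwidth(bw, penalty), 2))
        for bw, penalty in steps
    ]


if __name__ == "__main__":
    for bw, perf, ratio in summarize(BANDWIDTH_STEPS):
        print(f"bandwidth {bw:.0%}: performance {perf:.0%}, perf/bandwidth {ratio}")
```

Under these assumptions, an HPC workload running at half the memory bandwidth keeps 86% of its performance, about 1.72 times as much performance per unit of bandwidth as the baseline, which is the kind of headroom that makes swapping memory hardware for other modules attractive.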

NVIDIA currently prioritizes the HPC and AI markets. Beyond their high profit margins, many companies are gradually encroaching on NVIDIA's share of these markets with customized solutions. Of course, this workload-specific configuration could also be applied to NVIDIA's other GPU product lines, including GeForce graphics cards for the consumer market. However, rendering in games works in a fundamentally different way from professional workloads, and a multi-chip design there would require further improving the interconnect speed between the small chips.

AMD has previously discussed the direction of its 3D stacking technology, saying that package selection and chip architecture depend on the performance, power, area and cost of the specific product, which AMD calls PPAC. Counting both released and upcoming products, AMD has 14 multi-chip package architectures in progress. AMD believes the future belongs to modular multi-chip design paired with coordinated packaging.

AMD has taken the first step with products using X3D packaging, demonstrating its impact on cost, power consumption, and performance in CPUs. Adopting the same design in GPUs will be more difficult, but advances in technology will eventually bring NVIDIA's multi-chip goal within reach: a GPU composed of multiple small chips with different functional modules, which can be combined in more specialized ways according to the computing needs of the workload.
