Similarities and differences between ARM Neoverse series server CPUs and mobile AP CPUs

author:096 nucleus

This article does some preliminary research on the ARM Neoverse series of server CPUs, and tries to compare the similarities and differences between server CPUs and mobile AP CPUs.

Let's start with a family portrait of the ARM Neoverse series of CPUs. Some of the main ones are highlighted below.

Figure: Family portrait of the ARM Neoverse series of CPUs

1、ARM Neoverse N1

ARM announced its first server-oriented CPU product, the ARM Neoverse N1, in February 2019.

The Neoverse N1 belongs to the same generation as the Cortex-A76.

Figure: Some key points from the Cortex-A76 architecture design

Figure: An overview of the Cortex-A76 architecture

Figure: The Cortex-A76 supports DynamIQ and is capable of composing different cores.

For Cortex-A76, see the October 4 article:

2、ARM Neoverse V1

ARM's Cortex-A series CPUs for mobile devices are divided into the ultra-large-core Cortex-X series, the large-core Cortex-A7x series, and the small-core Cortex-A5x series.

Correspondingly, the Neoverse server lineup is divided into the V-Series, which pursues maximum performance; the N-Series, which balances performance and power consumption; and the E-Series, which optimizes for power consumption and area.

From an architectural point of view, the V1 borrows the design of the Cortex-X1 CPU.

Because Neoverse V series CPUs deliver such high performance, they fall under export controls in the United States and the United Kingdom, so Chinese companies cannot purchase them. This applies to the Neoverse V1 and to the later Neoverse V2.

For Cortex-X1, see the October 4 article:

3、ARM Neoverse N2

The Neoverse N2 is ARM's first server CPU based on ARMv9.

It belongs to the same generation as the Cortex-A710 on mobile.

The A710 is the first big core of the Armv9 family and the first to officially introduce the SVE2 instruction set extension. It retains 32-bit support, so it can run both 32-bit and 64-bit applications.

Compared with Neoverse V1, the N2 adds new features of the ARMv9 generation, such as SVE2 and the Memory Tagging Extension (MTE).

For the Cortex-A710, see the October 4 article:

3.1 Server products based on Neoverse N2

In October 2021, T-Head (Pingtouge) released the Yitian 710. It is built on TSMC's 5nm process and has 128 Neoverse N2 cores with a maximum frequency of 3.2GHz, 8-channel DDR5 with a peak total bandwidth of 281GB/s, and 96 lanes of PCIe 5.0. It scores 440 points on SPECint 2017.

The Yitian 710 is split into two dies, each containing 64 CPU cores and 4 DDR channels.

According to online information, each die is about 310 mm² in size.

The Yitian 710 uses 2.5D packaging to combine the dies, with a total of 60 billion transistors. The interconnect is most likely the CMN-700, which dates from the same period as the Neoverse N2, with one CMN mesh on each die.
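The bandwidth and die figures quoted for the Yitian 710 can be cross-checked with simple arithmetic. The sketch below assumes DDR5-4400 memory and a 64-bit (8-byte) data path per channel; neither value is stated in the article, and they are chosen only because they reproduce the quoted 281GB/s. All names in the code are illustrative.

```python
# Cross-check of the Yitian 710 figures quoted in the text.
# ASSUMPTIONS (not from the article): DDR5-4400 and an 8-byte data path per channel.
CHANNELS = 8                  # from the text
TRANSFERS_PER_SEC = 4400e6    # assumed DDR5-4400
BYTES_PER_TRANSFER = 8        # assumed 64-bit channel

DIES = 2                      # from the text
CORES_PER_DIE = 64            # from the text
CHANNELS_PER_DIE = 4          # from the text


def peak_bandwidth_gb_s(channels, transfers_per_sec, bytes_per_transfer):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * transfers_per_sec * bytes_per_transfer / 1e9


total_bw = peak_bandwidth_gb_s(CHANNELS, TRANSFERS_PER_SEC, BYTES_PER_TRANSFER)
per_die_bw = peak_bandwidth_gb_s(CHANNELS_PER_DIE, TRANSFERS_PER_SEC, BYTES_PER_TRANSFER)
total_cores = DIES * CORES_PER_DIE
total_channels = DIES * CHANNELS_PER_DIE

print(f"total cores: {total_cores}, total DDR channels: {total_channels}")
print(f"peak bandwidth: {total_bw:.1f} GB/s total, {per_die_bw:.1f} GB/s per die")
```

Under these assumptions the two dies together account for the 128 cores and 8 channels in the headline specification, and the peak bandwidth lands close to the quoted figure.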

4、ARM Neoverse V2

ARM released the Neoverse V2 CPU in September 2022.

Compared with Neoverse V1, the maximum L2 cache size increases from 1MB to 2MB. The V2 also supports new ARMv9.0 features, such as SVE2 in a 4x128b configuration.
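To make the "4x128b" notation concrete, here is a minimal sketch that reads it as four vector pipelines of 128 bits each. That reading is an assumption, as are the function and constant names; the article does not spell out the notation.

```python
# ASSUMPTION: "4x128b" means 4 vector pipelines, each 128 bits wide.
PIPES = 4
VECTOR_BITS = 128


def lanes_per_cycle(pipes, vector_bits, element_bits):
    """Number of elements the vector units can process per cycle."""
    return pipes * vector_bits // element_bits


total_bits = PIPES * VECTOR_BITS
fp32_lanes = lanes_per_cycle(PIPES, VECTOR_BITS, 32)
fp64_lanes = lanes_per_cycle(PIPES, VECTOR_BITS, 64)

print(f"{total_bits} vector bits per cycle")
print(f"{fp32_lanes} FP32 lanes, {fp64_lanes} FP64 lanes per cycle")
```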

Once again, the V2 borrows the design of the X2.

Of course, V2 is not just a core but a licensable platform specification. With the CMN-700 interconnect, licensees can build V2 CPUs that scale up to 256 cores and 512 MB of system-level cache, with 4 TB/s of bisection bandwidth across all cores, memory, and the I/O controllers on the mesh.

NVIDIA had previously announced that Grace would be based on the Neoverse design, so this week's announcement by Arm finally confirms long-held suspicions that Grace will be based on the next-generation Neoverse V core.

Who will license V2 cores besides NVIDIA and possibly AWS?

Perhaps anyone who intends to use V2 is already working on a custom design.

For Cortex-X2, see the October 4 article:

5、ARM Neoverse E2: Cortex-A510 works with N2

Arm paired the Cortex-A510, its small, efficiency-oriented Cortex CPU core, with the CMN-700 mesh.

The move gives server operators and vendors more flexibility by offering an alternative to the N2 CPU cores while still providing the modern I/O and memory capabilities of Arm's mesh.

To underline this, the E2 system backplane is even compatible with the N2 backplane.

For more information about the Cortex-A510, see the October 4 article:

6、ARM CMN-700 vs CMN-600

In addition to the ARM Neoverse series of CPUs, the CMN interconnect is an important component of the ARM server architecture.

Compared with the CMN-600, the CMN-700 increases the number of cores supported on each die, the number of nodes in the mesh, and the capacity of the system-level cache.

The figure of 256 cores per die for the CMN-700 is calculated as follows:

The CMN-700 supports up to 128 RN-Fs (fully coherent Request Nodes), and each RN-F can consist of two Neoverse CPU cores aggregated through the CMN-700's CAL (Component Aggregation Layer).

So a total of 128 * 2 = 256 cores are supported.

By the same calculation, the CMN-600 supports 64 RN-Fs, so in theory it can host up to 64 * 2 = 128 cores (in practice, slightly fewer).

This explains how Ampere Altra, a current-generation server chip, can already have 80 Neoverse N1 cores, more than the 64 cores per die that ARM quotes.

According to ARM, the figure of 64 cores per die counts only cores attached directly to a node; higher core counts are possible when CAL is used.
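The core-count reasoning above can be written out as a short sketch. The RN-F limits and the factor of two come from the text; the function names and the `needs_cal` check are illustrative and not part of any ARM tool or API.

```python
# RN-F limits as quoted in the text. The factor of 2 is the CAL aggregation
# described above (two cores sharing one RN-F).
CMN_700_RN_F = 128
CMN_600_RN_F = 64
CORES_PER_RN_F_WITH_CAL = 2
ALTRA_CORES = 80  # Ampere Altra core count, from the text


def max_cores(rn_f_nodes, cores_per_rn_f=CORES_PER_RN_F_WITH_CAL):
    """Theoretical upper bound on cores for a mesh with the given RN-F count."""
    return rn_f_nodes * cores_per_rn_f


def needs_cal(core_count, rn_f_nodes):
    """True if the cores cannot all be attached with one core per RN-F."""
    return core_count > rn_f_nodes


print(max_cores(CMN_700_RN_F))               # 256
print(max_cores(CMN_600_RN_F))               # 128
print(needs_cal(ALTRA_CORES, CMN_600_RN_F))  # True
```

An 80-core design exceeds the 64 directly attached cores but stays well under the theoretical 128, which is consistent with ARM's explanation.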

To build a 128-core server chip, you can either use a single die with the CMN-700 or interconnect multiple dies.

With 64 cores per die, two dies form a 128-core server chip. Each die is relatively small, so yield is higher, at the cost of the additional logic needed for the die-to-die interconnect.
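The yield argument can be illustrated with a rough calculation. This is a sketch under stated assumptions, not data from the article: it uses a simple Poisson yield model, a hypothetical defect density, and assumes a monolithic 128-core die would be twice the size of the roughly 310 mm² die mentioned earlier. It also ignores the extra area of the die-to-die interconnect logic.

```python
import math

# ASSUMPTIONS for illustration only: Poisson yield model, hypothetical defect
# density, and a monolithic die assumed to be exactly twice the small die.
DEFECT_DENSITY_PER_CM2 = 0.1        # hypothetical value
SMALL_DIE_CM2 = 3.1                 # about 310 mm², per-die size from the text
LARGE_DIE_CM2 = 2 * SMALL_DIE_CM2   # assumed monolithic 128-core die


def poisson_yield(area_cm2, defect_density):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-area_cm2 * defect_density)


def silicon_per_good_chip(die_area_cm2, dies_per_chip, defect_density):
    """Average silicon area consumed per working chip, assuming each die
    is tested before packaging so that bad dies are discarded individually."""
    return dies_per_chip * die_area_cm2 / poisson_yield(die_area_cm2, defect_density)


y_small = poisson_yield(SMALL_DIE_CM2, DEFECT_DENSITY_PER_CM2)
y_large = poisson_yield(LARGE_DIE_CM2, DEFECT_DENSITY_PER_CM2)
two_die_cost = silicon_per_good_chip(SMALL_DIE_CM2, 2, DEFECT_DENSITY_PER_CM2)
mono_cost = silicon_per_good_chip(LARGE_DIE_CM2, 1, DEFECT_DENSITY_PER_CM2)

print(f"yield: small die {y_small:.3f}, large die {y_large:.3f}")
print(f"silicon per good chip: two dies {two_die_cost:.2f} cm2, one die {mono_cost:.2f} cm2")
```

In this model the benefit comes from discarding defective small dies individually: a defect costs 310 mm² of silicon rather than the whole 620 mm² chip.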
