Dell is driving AI infrastructure upgrades with technological innovation

Author: Niuhua Net

Driven by the AI boom, the infrastructure market is recovering. Dell Technologies recently reported results for the fourth quarter of fiscal 2024, ended February 2, posting quarterly revenue of $9.3 billion, up 10% sequentially. The rebound was driven by growth in Dell's AI-optimized server business: orders for AI servers grew 40% in the quarter, and the AI server backlog nearly doubled to $2.9 billion, up from $1.6 billion in the previous quarter.

Dell Technologies' AI server growth is a microcosm of broader shifts in the global infrastructure market. Since last year, enterprise enthusiasm for AI applications has been rising, and the back-to-back breakouts of ChatGPT and Sora have further fueled the boom, driving strong demand for AI infrastructure. To meet this demand, Dell Technologies has introduced a range of innovative products and technologies to help businesses embrace AI.

AI applications are proliferating

Driving infrastructure upgrades

Over the past few years, AI application scenarios have grown steadily richer and more intelligent. One obvious shift is that perceptual intelligence, which once centered on image recognition and video analysis, has gradually expanded to the cognitive intelligence represented by ChatGPT: document retrieval and document generation have become mainstream application scenarios, while multimodal AI, represented by Sora, has begun to emerge.

Booming AI applications are inseparable from the underlying infrastructure that supports them. Take large-model training and inference as an example: parameter counts have grown from billions to tens of billions, trillions, or even more, and larger models demand more AI computing power. By some estimates, the computing power consumed by AI doubles every 3 to 4 months on average, and most of this added demand falls on GPUs, which directly drives the adoption of heterogeneous computing.
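As a back-of-the-envelope illustration of how model scale translates into compute demand, the widely cited rule of thumb C ≈ 6·N·D FLOPs (N parameters, D training tokens) can be sketched in a few lines of Python. The model sizes and token counts below are illustrative assumptions, not figures from any vendor:

```python
# Rough training-compute estimate using the common rule of thumb
# C ~= 6 * N * D FLOPs (N = parameters, D = training tokens).
# The (params, tokens) pairs below are illustrative only.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6.0 * params * tokens

for params, tokens in [(1e9, 20e9), (70e9, 1.4e12), (1e12, 10e12)]:
    flops = training_flops(params, tokens)
    print(f"{params:.0e} params, {tokens:.0e} tokens -> {flops:.2e} FLOPs")
```

Even this crude estimate makes the scaling pressure visible: each jump in parameter count multiplies total compute, which is why the added demand lands on GPUs and clusters rather than single machines.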

At the same time, computing is increasingly delivered at scale through clusters, and data centers with 10,000 or even 100,000 GPUs are no longer uncommon. Network communication requirements have grown along with computing scale, making high-bandwidth, low-latency network switching a necessary part of AI systems.

Emerging AI applications are also placing greater demands on storage. Stronger, larger models must be paired with more plentiful, higher-quality, and more diverse training data to achieve the desired results; otherwise, insufficient training data can easily leave a model underfit. This in turn requires storage systems to deliver higher performance, greater stability, and lower cost.

Infrastructure vendors have taken note of this demand, and many of them, Dell Technologies among them, have launched a wave of innovative products and technologies in response.

Diversified AI computing platforms

Accelerating the intelligent transition

As is well known, the typical architecture of the traditional data center is CPU-centric. Today, driven by demand for large-model training and inference, heterogeneous computing represented by CPU+GPU is becoming mainstream, and GPUs are standard equipment in more and more servers.

In response, Dell Technologies has launched a range of PowerEdge servers designed and optimized specifically for GPU computing. They support GPU accelerator cards from multiple vendors, including Intel, to meet the needs of different application scenarios.

"Focusing on GPU acceleration for AI applications, Dell Technologies can provide GPU-accelerated computing solutions such as GPU pooling, GPU distributed training, GPU cluster management and resource scheduling, and edge AI computing. Wu Yue, Enterprise Technology Architect and Global CTO Ambassador of Dell Technologies' Information Infrastructure Solutions Group, said.

He notes that PowerEdge servers incorporate many GPU-specific design elements. For example, multi-vector cooling technology ensures uniform cooling across all server components, and an enhanced power design allows instantaneous power delivery to reach 1.4 to 1.7 times the power supply's nominal rating, absorbing the surge when GPUs power on. All of this is designed to keep GPUs running optimally inside the server.

Dell PowerEdge also performs well in open AI benchmarks. MLPerf is currently the most widely followed and broadly contested computing performance benchmark in the AI field; the MLPerf Inference v3.1 results released last September included more than 13,500 test records submitted by 26 vendors. Dell Technologies took first place in seven of the 20 data center categories and second place in four others, achieving the best overall score of all GPU server products reviewed.

In addition to its variety of GPU servers, Dell's PowerEdge servers powered by Intel Xeon Max processors are also worth mentioning. Dell Technologies currently offers three servers that support Xeon Max processors: the 2U two-socket PowerEdge R760, the 1U two-socket PowerEdge R660, and the 2U four-node PowerEdge C6620.

The Xeon Max processor is the industry's first x86 CPU with integrated HBM. It packs 64GB of HBM delivering up to 1TB/s of memory bandwidth, accelerating memory-bandwidth-intensive workloads such as model inference and fine-tuning without requiring a GPU accelerator card.

Tests show that a single Xeon Max processor can load and run large language models of 6 billion to 13 billion parameters. For conversational AI, the first token of a session is generated in under 3 seconds, and each subsequent token in under 100 milliseconds. Moreover, because all major AI frameworks and acceleration libraries support x86, running AI models on the CPU minimizes code changes and greatly simplifies AI application development and deployment.
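The sub-100-millisecond token latency is consistent with a simple bandwidth argument: autoregressive decoding is typically memory-bandwidth-bound, so per-token latency is bounded below by the time needed to stream all model weights from memory once per generated token. A minimal sketch, using a 13-billion-parameter model in BF16 (2 bytes per parameter) and the ~1TB/s HBM figure above as assumed inputs:

```python
def decode_latency_s(params: float, bytes_per_param: float,
                     bandwidth_bps: float) -> float:
    """Lower-bound per-token latency for bandwidth-bound decoding:
    every weight is streamed from memory once per generated token."""
    return params * bytes_per_param / bandwidth_bps

# 13B-parameter model in BF16 (2 bytes/param) over ~1 TB/s of HBM bandwidth
latency = decode_latency_s(13e9, 2, 1e12)
print(f"~{latency * 1000:.0f} ms/token lower bound")
```

The estimate lands around 26 ms per token, comfortably inside the 100 ms figure, which suggests the reported latency is plausible for a bandwidth-bound CPU with integrated HBM.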

Wu Yue said that to keep leading AI infrastructure innovation and accelerate the adoption of AI applications, Dell Technologies has also established an AI & HPC Innovation Lab dedicated to AI computing, cutting-edge parallel computing research, performance benchmarking, and related work. The lab built Rattler, a GPU supercomputing cluster, and many of Dell's AI and GPU-accelerated HPC solutions (for CAE, molecular dynamics, the life sciences, and more) were first tested and optimized on it.

From edge to endpoint

Making AI computing power ubiquitous

In fact, infrastructure innovation is happening not only in computing but also in storage and networking, and not only in the cloud and the data center but also on the device side: a comprehensive upgrade from cloud to endpoint is brewing.

In storage, for example, to reduce data-transfer latency, Dell PowerScale added a Multipath Client driver this year that lets a single client reach a peak bandwidth of 40GB/s, providing ample back-end storage performance for dense GPU computing servers.

OneFS is Dell Technologies' own file system. On all-flash PowerScale running OneFS 9.7, streaming read performance is twice that of the previous-generation platform and streaming write performance is 2.2 times higher, effectively improving GPU efficiency during model pre-training and fine-tuning.

There is plenty of innovation on the endpoint side as well. Since Intel introduced the AI PC concept last September, AI PCs have been quickly embraced by the market: IDC predicts that more than half of new PCs shipped this year will be AI PCs, rising to 84.6% by 2027. Dell Technologies has fully embraced the trend; its newly released Precision 7960 AI workstation supports four double-wide GPUs with up to 4TB of memory and 152TB of local storage, providing strong support for running AI workloads on workstations.

When it comes to bringing AI to the edge, Intel cannot be overlooked. Beyond AI PCs, Intel continues to optimize CPU performance for AI applications, giving Dell Technologies more options to enrich its product line. For example, Intel added AMX accelerators to its 4th Gen Xeon Scalable processors, which significantly improve AI inference and training performance without a GPU: according to Intel, AMX delivers up to an 8x performance improvement at INT8 precision and up to 16x at BF16 precision. And because AMX is built into the CPU, no separate accelerator solution needs to be assembled.
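Whether a given machine exposes AMX can be checked from the CPU feature flags the Linux kernel reports for 4th Gen Xeon Scalable processors (`amx_tile`, `amx_int8`, `amx_bf16`). A minimal sketch, assuming a Linux-style `/proc/cpuinfo` layout:

```python
# Sketch: detect AMX support by inspecting CPU feature flags as exposed
# in /proc/cpuinfo on Linux. The flag names (amx_tile, amx_int8,
# amx_bf16) are those the Linux kernel reports for AMX-capable Xeons.

AMX_FLAGS = {"amx_tile", "amx_int8", "amx_bf16"}

def amx_features(cpuinfo_text: str) -> set:
    """Return the subset of AMX feature flags present in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            found |= AMX_FLAGS & set(line.split(":", 1)[1].split())
    return found

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AMX flags:", amx_features(f.read()) or "none")
    except OSError:
        print("/proc/cpuinfo not available on this platform")
```

Frameworks that route CPU inference through oneDNN generally pick AMX up automatically when these flags are present, which is why no application-level code changes are needed to benefit from it.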

Intel is also working with partners on reference designs for edge AI solutions. One example is the Intel Edge AI Box, which integrates video decoding and analytics in a single compute box powered by Intel Core processors. It can connect directly to a video source such as an IP camera or network video recorder (NVR) for real-time video analytics at the edge, or run as a standalone AI service on the network performing offline deep-learning analytics on demand, making deployment easy.

"Focusing on the implementation and deployment of edge AI, Intel has worked with partners such as Dell Technologies to create a number of edge AI solutions based on Intel's hardware and software stacks to support ubiquitous AI applications," said Wei Yu, AI Architect of Intel China Solutions Group. ”

Beyond Intel, Dell Technologies also works with many other vendors; partnering with ISVs to develop solutions for vertical markets is an important form of this cooperation.

At present, RAG (Retrieval-Augmented Generation), which pairs a vector database with a large model, is very popular. Because RAG can exploit the language-generation capabilities of large models while using a vector database to mitigate their hallucination problem, it is considered a very promising path to practical AI. Daguan Data, a national specialized and innovative "little giant" enterprise focused on intelligent text processing, has built an intelligent knowledge management system with many users across industries.
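The RAG pattern described above can be sketched end to end in a few dozen lines: retrieve the document most relevant to a query, then splice it into the prompt so the model's answer is grounded in retrieved context. The vector store below is a toy bag-of-words index with cosine similarity, and the documents are invented for illustration; a real deployment would use learned embeddings and a production vector database:

```python
# Minimal RAG sketch: retrieve the most relevant document, then build
# a grounded prompt for a language model. Toy embeddings only.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

def build_prompt(query: str, docs: list) -> str:
    """Ground the model's answer in retrieved context to curb hallucination."""
    context = retrieve(query, docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "PowerScale multipath clients reach 40GB/s of streaming bandwidth.",
    "Xeon Max integrates 64GB of HBM for 1TB/s of memory bandwidth.",
]
print(build_prompt("What bandwidth does Xeon Max HBM provide?", docs))
```

The generation step is deliberately left out: the prompt built here would be handed to whatever large model the knowledge management system uses, with the retrieved passage anchoring the answer to stored facts.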

In addition, Dell Technologies offers a variety of validated AI design solutions, accompanied by technical white papers and performance validations, all built with the support of a range of partners.

"With the iteration of AI technology, the scale of AI scenarios and AI models will become more complex, and the demand for AI computing power and AI training data will also become stronger, and AI infrastructure will continue to become one of the hot spots of global IT investment in the next few years. Wu Yue said that as a leading infrastructure provider, Dell Technologies will actively embrace AI with partners, promote infrastructure upgrades through technological innovation, and accelerate the popularization of AI applications.