laitimes

Intel released the Gaudi 3 AI chip, which is cost-effective

Lei Technology

2024-04-10 20:30 Posted in Guangdong science and technology creators

Nvidia's position in the AI chip market today is indisputable, as evidenced by skyrocketing data center revenue and market capitalization. But kingship is not eternal, and Nvidia is not unshakable.

Three weeks after Nvidia unveiled its latest generation of Blackwell GPUs, Intel made a series of major chip announcements at its Vision 2024 conference on the evening of April 9.

At the conference, Intel unveiled its sixth-generation Xeon processors for data centers and previewed the Lunar Lake processors for next-generation AI PCs. Throughout the event, however, Intel seemed to devote the most time and attention to its latest generation of AI chips:

Intel Gaudi 3.

Gaudi 3, Photo/Intel

Beyond the NVIDIA H100: The Gaudi 3 is more powerful and less expensive

Gaudi 3's most obvious improvements are in performance and cost.

According to Intel, Gaudi 3 delivers on average 50% better AI inference performance and 40% better energy efficiency than the NVIDIA H100. In Intel's benchmarks on the Llama2-7B and Llama2-13B models, Gaudi 3 cuts training time to half that of the H100 while also achieving on average 50% higher inference throughput.
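To make these percentages easier to compare, the short Python sketch below converts them into plain multipliers. It uses only the figures quoted in this article; the function names are illustrative, and reading "energy efficiency" as work done per unit of energy is an assumption rather than Intel's stated definition.

```python
# Convert the relative claims quoted in the article into plain ratios.
# Inputs are the article's percentages only; nothing here is a measured result.

def speedup_from_time_fraction(time_fraction: float) -> float:
    """A job that takes `time_fraction` of the baseline time runs 1/fraction times faster."""
    return 1.0 / time_fraction


def ratio_from_percent_gain(percent: float) -> float:
    """Express 'X% better' as a multiplier over the baseline."""
    return 1.0 + percent / 100.0


def energy_fraction_for_same_work(efficiency_gain_percent: float) -> float:
    """Energy needed for the same work relative to the baseline.

    Assumes 'efficiency' means work done per unit of energy.
    """
    return 1.0 / ratio_from_percent_gain(efficiency_gain_percent)


training_speedup = speedup_from_time_fraction(0.5)    # "half the training time"
inference_ratio = ratio_from_percent_gain(50)          # "50% higher throughput"
energy_fraction = energy_fraction_for_same_work(40)    # "40% better efficiency"

print(f"training speedup:     {training_speedup:.2f}x")
print(f"inference throughput: {inference_ratio:.2f}x")
print(f"energy for same work: {energy_fraction:.0%} of baseline")
```

Read this way, halving training time is a 2x speedup, while a 40% efficiency gain means the same work needs roughly 71% of the baseline energy, not 60%.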

Arguably, at least on key large models such as Llama, Gaudi 3 beats the H100, the GPU based on Nvidia's Hopper architecture and the most technologically advanced AI chip currently on the market.

And the Gaudi 3 has another extremely important upgrade - a much lower cost than the H100, according to Intel:

The cost (of the Gaudi 3) is a fraction of that of the Nvidia H100.

It is no wonder that, as soon as Gaudi 3 was released, Intel announced that a large number of companies, including Naver (the Korean internet giant), Bosch, IBM, and Ola, had become customers and partners for its Gaudi accelerators. Intel will begin supplying OEMs such as Dell, HPE, and Supermicro in the second quarter of this year, with the official launch following in the third quarter.

A Naver representative on stage in support, Photo/Intel

Intriguingly, Intel basically did not release new chips at the Vision conference in previous years, but this year it uncharacteristically released two blockbuster products, the sixth-generation Xeon processor and the Gaudi 3 AI chip.

Considering that Intel CEO Pat Gelsinger attacked Nvidia's CUDA ecosystem at the end of last year, calling its moat "shallow and narrow", it seems that Intel is not only confronting Nvidia on the software ecosystem but also speeding up its hardware catch-up.

Does Gaudi 3 really stand a chance to challenge Nvidia's GPU supremacy, though?

Bear in mind that, compared with Nvidia's H100, built on the Hopper GPU architecture from two years ago, the B100 announced last month on the Blackwell GPU architecture is another major upgrade. Even Musk could not help remarking, "There is no better AI chip than Nvidia GPU at present."

Blackwell GPU, Photo/Nvidia

Has Gaudi 3's hardware caught up with Nvidia?

Unlike Blackwell, which is built on a custom TSMC 4NP process, Gaudi 3 is built on TSMC's 5nm process, and its tensor cores have increased from 24 to 32.

Compared with the previous generation Gaudi 2, Gaudi 3 has been comprehensively improved in FP8 performance, BF16 performance, network bandwidth, and memory bandwidth, with FP8 throughput up to 1835 TFLOPS:

It's basically doubled.

Photo/Intel

Oddly enough, the Gaudi 3 with 128GB of RAM doesn't feature the latest HBM3 (High Bandwidth Memory), but instead uses the slightly outdated HBM2e.

Besides its lower transfer bandwidth, a single HBM2e stack holds only 16GB. HBM3, by comparison, offers much higher bandwidth, and a single stack can reach 24GB or even 32GB.

In addition, Gaudi 3 uses a dual-die design similar to Nvidia's Blackwell, with two identical dies packaged together and connected by a high-bandwidth link. Each die has 48MB of on-chip SRAM, so the whole package provides 96MB of SRAM with a total bandwidth of 12.8TB/s.
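The capacity figures above can be checked with a little arithmetic. The Python sketch below uses only the numbers quoted in this article; the constant and function names are illustrative, and splitting the SRAM bandwidth evenly between the two dies is an assumption, not something the article states.

```python
import math

# Figures quoted in the article.
TOTAL_HBM_GB = 128          # total memory per Gaudi 3
HBM2E_STACK_GB = 16         # capacity of one HBM2e stack
HBM3_STACK_GB = 24          # capacity of one HBM3 stack (lower figure quoted)
DIES_PER_PACKAGE = 2        # dual-die design
SRAM_PER_DIE_MB = 48        # on-chip SRAM per die
SRAM_TOTAL_BW_TBPS = 12.8   # total SRAM bandwidth for the package


def stacks_needed(total_gb: int, stack_gb: int) -> int:
    """Smallest whole number of stacks that reaches at least `total_gb`."""
    return math.ceil(total_gb / stack_gb)


hbm2e_stacks = stacks_needed(TOTAL_HBM_GB, HBM2E_STACK_GB)
hbm3_stacks = stacks_needed(TOTAL_HBM_GB, HBM3_STACK_GB)
sram_total_mb = DIES_PER_PACKAGE * SRAM_PER_DIE_MB
# Assumption: the quoted SRAM bandwidth is shared evenly by the two dies.
sram_bw_per_die_tbps = SRAM_TOTAL_BW_TBPS / DIES_PER_PACKAGE

print(f"HBM2e stacks for {TOTAL_HBM_GB}GB: {hbm2e_stacks}")
print(f"HBM3 stacks for {TOTAL_HBM_GB}GB:  {hbm3_stacks}")
print(f"Total SRAM: {sram_total_mb}MB")
print(f"SRAM bandwidth per die: {sram_bw_per_die_tbps:.1f}TB/s")
```

On these numbers, 128GB of HBM2e implies eight 16GB stacks, whereas 24GB HBM3 stacks would reach the same capacity with six.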

In terms of I/O, Intel has stuck with the Ethernet route, raising the rate of each Ethernet port from 100Gb/s to 200Gb/s on Gaudi 3. Given the dual-die design and 24 Ethernet ports per chip, the total Ethernet I/O bandwidth of each Gaudi 3 is put at up to 8.4TB/s.

Overall, Intel's Gaudi 3 upgrade is not aggressive and could even be called somewhat conservative: the choice of the much cheaper 5nm process and HBM2e memory speaks for itself. While it is a clear improvement over the previous-generation Gaudi 2 and surpasses the H100 on some large models, it will clearly struggle to compete with Nvidia's latest B100.

But Intel's decision isn't necessarily wrong.

Gaudi 3, Photo/Intel

On the one hand, given NVIDIA's technological and ecosystem leadership in AI-accelerated computing, even an all-out effort by Intel would likely struggle to close the gap, and expensive chips would cause Intel to miss the fast-growing AI chip market altogether.

On the other hand, with a clear cost advantage, Intel only needs to beat the NVIDIA H100 on performance to attract enough customers.

What's more, even Nvidia itself expects "a supply crunch for the next generation (B100)." Faced with the dilemma of wanting to buy but being unable to, many customers will naturally turn to alternative AI chips.

Gaudi 3 has a chance to be one of them.

Hardware and software ecosystems go hand in hand: can Intel carry the banner?

"The whole industry wants to take out CUDA, including Google, OpenAI, and other companies are looking for ways to make AI training more open. We believe that CUDA's moat is shallow and narrow," Gelsinger said.

In a recent report by Lei Technology, we analyzed the actions of global technology giants to form the UXL Unified Acceleration Foundation to fight against NVIDIA CUDA.

Diagram / UXL

Put simply, the giants are building an open-source software platform on Intel's oneAPI technology to replace the NVIDIA CUDA platform, so that AI developers can run their code on any AI chip, including NVIDIA GPUs. The core aim is to break the tight coupling between chip hardware and the software development platform, and with it the hegemony of NVIDIA GPUs in the developer ecosystem.

As Vinesh Sukumar, head of artificial intelligence and machine learning at Qualcomm, explains, "We're actually showing developers how to migrate away from the NVIDIA platform."

By attacking on software while catching up on hardware, Intel clearly understands that NVIDIA's success comes from both. Only by advancing on both fronts together can it really catch up with NVIDIA, and perhaps even eliminate the moat that CUDA gives NVIDIA.

But whether Intel can carry the banner of "down with NVIDIA" and regain the leading position in the data center market may depend on two things:

First, whether Intel can stand out from the crowd of NVIDIA challengers, including AMD and a number of AI chip companies worldwide, and seize the opportunity created by Nvidia's GPUs being "in short supply";

Second, whether the next-generation GPU codenamed "Falcon Shores", which according to Intel's roadmap merges the GPU and AI chip product lines, has the potential to surpass NVIDIA's latest generation of GPUs.

Data Center Chip Roadmap, Diagram / Intel

Final thoughts

Over the past year or so, AI has plainly set off a revolution, but however AI changes our lives, chip computing power remains the underlying driving force.

Nvidia holds 80% of the AI chip market, which is obviously unusual, yet people's biggest complaint is simply that Nvidia cannot supply everyone. On that premise, Intel, AMD, and China's domestic AI chip manufacturers all still have opportunities.

In other words, Intel Gaudi 3 still has a wide-open field ahead of it and plenty of room to make its mark.
