laitimes

"Iron Man" uses 100,000 NVIDIA H100 to build the world's strongest AI cluster

Elon Musk, the tech superstar, has a habit of moving at breakneck speed and on a jaw-dropping scale. This time he has done it again: building a supercomputer cluster of 100,000 Nvidia H100 GPUs in Memphis, Tennessee. Sounds cool, doesn't it? Today we're going to take a look at what is actually going on.

Follow [universality does not exist]

Unlock the infinite possibilities of technology and industry

01 What does Musk think?

First of all, why is Musk in such a hurry to build this supercomputer cluster? The reason is simple: he wants to train his AI model, Grok. What is Grok, you may ask? In short, it's a large language model from the company xAI, built with the goal of becoming the most powerful AI in the world. And Musk's goal, of course, is to make it unbeatable.

That raises another question: why is Musk rushing ahead with the current H100 instead of waiting for the next generation of GPUs? After all, Nvidia's H200 and the Blackwell-based B100 and B200 will be available soon. The answer is simple: Musk can't wait. He has always been impatient, and he never drags his feet. His thinking is clear: what can be done today should not be put off until tomorrow.

02 How awesome is this supercomputer cluster?

Let's take a look at how awesome this supercomputer cluster really is. First of all, it consists of 100,000 liquid-cooled H100 GPUs. The H100 is Nvidia's current-generation data center GPU, designed for AI training. Liquid cooling means these GPUs can keep working under sustained high load without overheating and throttling their clock speeds.

Second, this cluster uses an RDMA (remote direct memory access) network fabric. What does this mean? In simple terms, RDMA lets one compute node read from and write to another node's memory directly, without burdening the central processing unit (CPU) on either side. This means more efficient data transfer and lower latency, which is exactly what large-scale AI training needs.

Now for the cost of this project. At $30,000 to $40,000 per H100, 100,000 GPUs come to about $3 billion to $4 billion in total. Musk really threw a lot of money at it this time. Not only that, but xAI has also invested heavily in upgrading public infrastructure in Memphis to support such a large project: new substations and wastewater treatment facilities are needed to meet the power and cooling demands of the cluster.
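
The cost estimate above is simple multiplication, and a minimal sketch makes the range easy to check. It uses only the GPU count and per-unit price range quoted in this article; the function name `total_cost_range` is made up for illustration, and the result covers GPUs only, not buildings, networking, or power.

```python
# Back-of-the-envelope GPU cost, using only the figures quoted in the article.
GPU_COUNT = 100_000
PRICE_LOW_USD = 30_000   # low end of the quoted per-GPU price
PRICE_HIGH_USD = 40_000  # high end of the quoted per-GPU price


def total_cost_range(gpu_count: int, price_low: int, price_high: int) -> tuple[int, int]:
    """Return the (low, high) total GPU cost in US dollars."""
    return gpu_count * price_low, gpu_count * price_high


low, high = total_cost_range(GPU_COUNT, PRICE_LOW_USD, PRICE_HIGH_USD)
print(f"${low / 1e9:.1f}B to ${high / 1e9:.1f}B")  # prints "$3.0B to $4.0B"
```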

03 Operation of the Memphis Supercomputer Cluster

Now, the Memphis supercomputer cluster is up and running. This is a huge milestone for xAI and Musk. Not only is this cluster very powerful in terms of hardware, but it also has great potential for AI training.

Musk's goal is to train the world's most powerful AI, Grok 3, by December this year. That sounds a bit exaggerated, but it is Musk's style: he has always liked to set high goals and then try to hit them. By GPU count, this cluster already exceeds the world's most powerful supercomputers, such as Frontier (37,888 AMD GPUs), Aurora (60,000 Intel GPUs), and Microsoft's Eagle (14,400 Nvidia H100 GPUs). This could have a profound impact on the field of AI and perhaps change our understanding of computing power.
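
To put those GPU counts side by side, here is a rough sketch that computes how many times larger the 100,000-GPU cluster is than each system named above. It compares raw GPU counts only, using the numbers quoted in this article; since the systems use different GPU models, the ratio is not a measure of actual computing performance. The function name `count_ratio` is made up for illustration.

```python
# GPU counts as quoted in the article; different vendors and models,
# so these ratios compare unit counts, not delivered performance.
XAI_GPUS = 100_000
OTHER_SYSTEMS = {
    "Frontier": 37_888,  # AMD GPUs
    "Aurora": 60_000,    # Intel GPUs
    "Eagle": 14_400,     # Nvidia H100 GPUs
}


def count_ratio(cluster_gpus: int, other_gpus: int) -> float:
    """How many times more GPUs the cluster has than another system."""
    return cluster_gpus / other_gpus


for name, gpus in OTHER_SYSTEMS.items():
    print(f"{name}: {count_ratio(XAI_GPUS, gpus):.1f}x as many GPUs")
```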

04 Capital investment in xAI

The scale of xAI's investment in Memphis is significant. According to a report by Benzinga, the cost of each Nvidia H100 GPU is estimated to be between $30,000 and $40,000. Considering that xAI uses 100,000 Nvidia H100 units, Musk's AI startup appears to have spent around $3 billion to $4 billion on the project.

Not only that, but xAI has pledged to improve Memphis' public infrastructure to support the development of the data center, including the construction of a new substation and a wastewater treatment facility. The CEO of Memphis Light, Gas and Water (MLGW) estimates that the xAI Memphis facility could draw up to 150 megawatts of power, equivalent to the demand of 100,000 homes, while xAI expects to require at least 1 million gallons of cooling water per day.
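
A quick sanity check on those utility figures, as a sketch under stated assumptions: it treats 150 megawatts as a peak power draw (megawatts measure power, so energy is power multiplied by time), assumes the full draw is sustained for 24 hours to get a daily upper bound, and converts US gallons to liters. The per-GPU number is a facility-wide average that includes cooling, CPUs, and networking, not the power rating of an H100. All function names are made up for illustration.

```python
PEAK_POWER_MW = 150               # utility estimate quoted in the article
HOMES = 100_000                   # homes said to need equivalent power
GPUS = 100_000                    # GPUs in the cluster
WATER_GALLONS_PER_DAY = 1_000_000
LITERS_PER_US_GALLON = 3.785411784


def kw_per_unit(power_mw: float, units: int) -> float:
    """Average kilowatts per unit when power_mw is shared across units."""
    return power_mw * 1_000 / units


def daily_energy_mwh(power_mw: float, hours: float = 24.0) -> float:
    """Energy in MWh if power_mw is drawn continuously for the given hours."""
    return power_mw * hours


def gallons_to_liters(gallons: float) -> float:
    """Convert US gallons to liters."""
    return gallons * LITERS_PER_US_GALLON


print(kw_per_unit(PEAK_POWER_MW, HOMES))   # 1.5 kW per home
print(kw_per_unit(PEAK_POWER_MW, GPUS))    # 1.5 kW of facility power per GPU
print(daily_energy_mwh(PEAK_POWER_MW))     # 3600.0 MWh per day at full draw
print(f"{gallons_to_liters(WATER_GALLONS_PER_DAY) / 1e6:.2f} million liters per day")
```

The 1.5 kW per home implied by the comparison is an average instantaneous demand, which is why the utility can express a data center's draw in "homes" at all.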

An investment of this magnitude shows not only Musk's ambitions, but also his confidence in the xAI project. Memphis City Council member Pearl Walker said last week, "People are scared. They are worried about possible problems with water resources, but also about energy supply (problems)." These concerns are understandable, given that a project of this magnitude does have a significant impact on local infrastructure.

05 Train the world's most powerful artificial intelligence by the end of the year

Musk's goal is to train the world's most powerful AI by December of this year. Judging by the progress so far, this goal does not seem out of reach. The establishment of the Memphis supercomputer cluster is not only a major milestone for xAI, but also a major breakthrough in the field of AI.

In terms of GPU horsepower, this cluster surpasses every supercomputer on the latest Top500 list. The world's most powerful supercomputers, such as Frontier, Aurora, and Microsoft's Eagle, all trail xAI's machine by a wide margin. This could have a profound impact on AI research and development, and perhaps change our understanding of supercomputers.

Musk's xAI supercomputer cluster in Memphis is not only a technological breakthrough, it is part of his larger ambition. He doesn't just want to build AI, he wants to build the strongest AI. Whether he can pull it off, time will tell. But one thing is for sure: every move Musk makes sends shockwaves through the industry.

Whether it's xAI's huge investment in Memphis or its commitment to public infrastructure, both show Musk's strong belief in the development of AI. This belief not only drives technology forward, but also gives the rest of us more to imagine.

In a sense, Musk is like a modern-day explorer who is constantly pushing the envelope and exploring the unknown. And each of his new projects is like a new adventure, full of challenges and opportunities.

So, let's wait and see what surprises Musk and xAI will bring next. Whatever the outcome, we can be sure that this will be a technological feast like never before.

So how do you feel about it?

Feel free to discuss in the comment section
