
The U.S. plans to ban cloud computing vendors from training large AI models for China


In an interview with Reuters on January 26, 2024, U.S. Secretary of Commerce Gina Raimondo announced plans to restrict foreign customers, especially Chinese customers, from using the services of U.S. cloud vendors to train large AI models. In Raimondo's words: "We can't allow China, or other players we don't want, to use our cloud services to train their models. We introduced a ban on chip exports, but those chips are sitting in U.S. cloud computing data centers, so we also need to think about closing that channel to head off potentially malicious behavior." (Note: the original report appeared in a Reuters dispatch on January 27.)

There is no doubt that this measure escalates U.S. technology sanctions against China to a new level, and the potential damage to China's artificial intelligence industry is considerable. I am not an expert in chips or artificial intelligence, but luckily I have many friends in those industries. As soon as I heard the news, I asked for their views and learned a great deal. They generally agree that the Commerce Department's new initiative is understandable at the macro level, but that its timing is somewhat intriguing.

Over the past year or so, China's major Internet and technology companies have claimed remarkable achievements in large AI models, "only half a year to a year behind OpenAI". Just a few days ago, Zhou Hongyi declared: "Last year we looked at large models the way we look at an atomic bomb; this year we look at them the way we look at a tea egg." From the perspective of capital-market speculation, such statements make perfect sense (they are especially convenient for major shareholders reducing their stakes, divorce settlements included); from the perspective of actual technology R&D, they are nothing of the sort. In fact, the Chinese tech industry's "catch-up" with OpenAI has depended on three factors:

First, absorbing and borrowing from overseas open-source models.

GPT-3 and later versions are not open source, but there is no shortage of open-source large models abroad to reference (read: copy), the most popular being LLaMA, released by Meta in February 2023, and LLaMA2, released that July. LLaMA was originally open-sourced only conditionally, to the academic community, but it soon leaked on a large scale, and Meta simply made the subsequent version fully open source.

LLaMA2 comes in three public versions, with 7 billion, 13 billion, and 70 billion parameters, and Meta has said it will release larger versions in due course. Although LLaMA2 still falls short of GPT-4, it is more than adequate as a reference (read: copying) starting point. As is widely known, the "self-developed large models" of certain domestic start-ups (no names named) are based on LLaMA2, and some did not even bother to change the parameter names.

Second, "distilling" GPT's capabilities by renting access to its API.

A month ago, foreign media reported that ByteDance had reportedly had its access suspended for calling the GPT API to train its own large models. In fact, everyone does this kind of thing; in the industry it is commonly known as "distillation". "Distillation" means holding massive numbers of conversations with GPT and adjusting your own model's parameters based on the data GPT returns; in short, getting GPT to help train your model.

With enough manpower and money, any company can produce a passable "self-developed large model" in a fairly short time by first copying LLaMA2 and then renting GPT for "distillation", and its benchmark performance can even be "only half a year to a year behind OpenAI" (the exact gap depends on how much is spent on distillation). The pity is that a model built this way can never actually catch up with OpenAI, just as a student who copies another student's exam paper can never surpass that student. And, by the way, you have to be careful not to get caught by the invigilator.
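The "distillation" workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not anyone's actual pipeline: `query_teacher` is a stand-in for a paid API call to a stronger model, and the output is the JSONL-style dataset typically fed to supervised fine-tuning.

```python
import json

def query_teacher(prompt: str) -> str:
    """Stand-in for an API call to a stronger 'teacher' model
    (in the article's scenario, a rented GPT endpoint).
    A real pipeline would issue a network request here."""
    return f"teacher answer to: {prompt}"

def build_distillation_dataset(prompts):
    """Collect (prompt, teacher response) pairs as fine-tuning records."""
    return [{"prompt": p, "completion": query_teacher(p)} for p in prompts]

prompts = ["Explain attention in one sentence.",
           "Translate 'hello' into French."]
dataset = build_distillation_dataset(prompts)

# Serialize one record per line (JSONL), a common fine-tuning format.
jsonl = "\n".join(json.dumps(r) for r in dataset)
print(len(dataset), "records collected")
```

A "student" model is then fine-tuned on these records; the more money spent querying the teacher, the larger the dataset and the closer the imitation, which is exactly why the result trails the teacher rather than surpassing it.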

Third, renting overseas cloud services such as Azure and AWS to get around the computing-power bottleneck.

Since 2022, the United States has steadily tightened its ban on chip exports to China. Although Nvidia has repeatedly launched "special edition" GPUs for China, the U.S. Department of Commerce has closed each loophole almost immediately, and very little room for maneuver remains. To be fair, even setting the chip ban aside, Chinese companies would struggle to get enough GPUs: Nvidia's high-end GPUs have long been in short supply, major North American buyers such as Amazon routinely snap up new models 10,000 units at a time, and customers from China are certainly not high on the priority list.

As we know, the computing power used by large AI models falls into two types, "training" and "inference", and the former is far more demanding. Chinese technology companies have therefore generally adopted a "separate training from inference" model, handing much of their training workload to major North American cloud providers such as Microsoft Azure, Amazon AWS, and Google GCP, because only they have enough high-end computing power. The chips never enter China, but the computing power is used by Chinese companies. U.S. regulators must have noticed this loophole long ago; they simply left it alone before, and have now decided to deal with it.
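The gap between the two kinds of compute can be made concrete with a standard back-of-envelope rule (an approximation from the scaling-law literature, not a figure from the article): training a dense transformer costs roughly 6 × parameters × training tokens in FLOPs, while generating one token at inference costs roughly 2 × parameters.

```python
# Back-of-envelope comparison of training vs. inference compute.
# The 6*N*D and 2*N rules are standard approximations for dense
# transformers; the model size and token count are illustrative,
# not any specific company's numbers.
N = 70e9      # parameters (a LLaMA2-70B-class model)
D = 2e12      # training tokens

train_flops = 6 * N * D          # total training compute
infer_flops_per_token = 2 * N    # compute to generate one token

print(f"training:  {train_flops:.2e} FLOPs")
print(f"inference: {infer_flops_per_token:.2e} FLOPs per token")
# Training costs as much as generating this many tokens of inference:
print(f"ratio: {train_flops / infer_flops_per_token:.1e} tokens")
```

On these assumptions, one training run costs as much as trillions of tokens of inference, which is why the inference side can get by on weaker domestic chips while the training side cannot.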

Of course, whether the Commerce Department's proposal is legal (under U.S. domestic law, in this case) is debatable. But in the context of Sino-U.S. technology competition, the proposal will most likely be implemented, and the door of American cloud computing vendors will close to Chinese customers sooner or later. The two questions that really need answering are:

Why is the United States introducing a new ban at this time?

For the domestic companies that seriously want to catch up with and surpass GPT, where will the computing power come from next?


Let's start with the first question. From a business or technology standpoint, the Commerce Department's timing is a bit puzzling: the gap between China and the United States in large AI models is still very wide, and the claim that "large models have gone from atomic bomb to tea egg" is pure hype from major shareholders of A-share companies looking to reduce their holdings. Letting Chinese companies lease U.S. cloud computing resources would not threaten Silicon Valley's technological dominance in the short term, but it would bring in plenty of revenue. The chip ban already constrains China's AI R&D severely; is it really necessary to go further and leave no room at all? And why have cloud giants such as Microsoft and Amazon made no move to stop the Commerce Department from going this far?

There are two ways to explain this. One is political: this is an election year, the two American parties are competing over who is tougher on China, voters in the battleground states generally have little affection for globalization, and tightening the technology ban on China is a good card to play right now. As for the Silicon Valley giants, the past few years have brought surging revenues and profits; losing a little cloud revenue from Chinese customers is no great matter, and nobody wants to pick a fight over it.

The other is technological: the training compute required by the next generation of large AI models (GPT-5 and its competitors) may jump to the "ten-thousand-card scale" or even the "N-times-ten-thousand-card scale". Pushing model capability further from the current baseline means relying on brute force to work miracles and scaling up resources, the way nuclear yields leapt from 20 kilotons to 10 megatons or even 100 megatons. If Chinese companies intend to keep chasing, their demand for U.S. cloud computing resources will rise by an order of magnitude; and demand from U.S. customers will rise by an order of magnitude too, making the high-end computing power in Microsoft's and Amazon's hands scarcer still.

So the Commerce Department's proposal to ban cloud computing services for China at this moment serves two purposes: it blocks, in advance, the road for Chinese companies to catch up, and it helps reserve precious computing resources for companies at home. In a market economy, whoever bids higher gets the goods, so the surest competitive strategy is to exclude Chinese companies from the market altogether. Microsoft and Amazon surely know their computing power will only grow more expensive and never lack customers, so they have no incentive to object.

Now for the second question: where do the Chinese tech companies that genuinely want to catch up with the world leaders (as opposed to pumping their stock price and cashing out via divorce) go from here? The answer depends on how high a price they are willing to pay, and not only in money. Even if the Commerce Department formally promulgates and strictly enforces the ban, Chinese companies could presumably still buy U.S. cloud services in disguise, by registering overseas subsidiaries or finding overseas partners. The problem is that the consequences of getting caught could be severe; never underestimate the reach of U.S. regulators. Most major domestic Internet companies are listed in the United States or Hong Kong. Do they dare take such a huge risk for the sake of AI models?

Setting aside such risky maneuvers, the only option left is to tap domestic computing resources. At present, all the "domestic substitution" in AI-related chips is concentrated on the inference side, because inference does not demand much computing power. On the training side, the whole world wants a substitute for NVIDIA (design) plus TSMC (manufacturing), and nobody has produced one, the United States' own tech manufacturers included. Perhaps in another five or ten years someone will, but by then the world will be a different place. As noted above, the gap between domestic self-developed models and GPT remains obvious, GPT itself is iterating rapidly, and demand for training compute will not ease any time soon.

At present, China's several mainstream vendors (everyone knows which ones) have on average only 1,000 to 2,000 graphics cards each for general large-model training, some more, some less. Some speculate that certain big players may have hoarded large numbers of cards overseas, but given that Nvidia cards have been in short supply for years, even the hoards cannot be very large. The coming GPT-5 era may be a "ten-thousand-card" era, and all the qualifying cards in the country combined may not cover the training needs of even one self-developed large model. What to do then? I would like to know too.
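How far 1,000 to 2,000 cards falls short of a "ten-thousand-card" run can be estimated with the same rough arithmetic. Every figure here is an illustrative assumption, not reported data: an A100-class card at roughly 312 TFLOPS peak, 40% utilization in practice, and a hypothetical next-generation training budget ten times that of a 70B-parameter, 2-trillion-token run.

```python
# Rough estimate of wall-clock training time vs. cluster size.
# All figures are illustrative assumptions, not reported data.
PEAK_FLOPS = 312e12      # A100-class BF16 peak, FLOPs per second
UTILIZATION = 0.4        # fraction of peak typically achieved

def training_days(total_flops: float, num_gpus: int) -> float:
    """Days needed to burn `total_flops` on `num_gpus` cards."""
    effective_rate = num_gpus * PEAK_FLOPS * UTILIZATION
    return total_flops / effective_rate / 86_400  # seconds per day

flops_70b = 6 * 70e9 * 2e12   # ~8.4e23, a LLaMA2-70B-class run
flops_next = 10 * flops_70b   # hypothetical GPT-5-class budget

print(f"70B-class on  2,000 cards: {training_days(flops_70b, 2_000):6.1f} days")
print(f"next-gen on   2,000 cards: {training_days(flops_next, 2_000):6.1f} days")
print(f"next-gen on  10,000 cards: {training_days(flops_next, 10_000):6.1f} days")
```

On those assumptions, a 2,000-card cluster can finish a 70B-class run in a matter of weeks but would need well over a year for a ten-times-larger budget, which is the sense in which the next generation becomes a ten-thousand-card game.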

Incidentally, before ChatGPT launched in November 2022, the strongest incentive for Chinese tech companies to hoard Nvidia graphics cards was, of all things, preparing for cloud gaming: many major Internet companies at the time genuinely believed cloud gaming was the future and could be monetized quickly. Cloud gaming never took off, but the cards bought for it have become a lifeline for China's large AI models.

The irony: for the past three years, China's gaming industry has been under fire from all directions, keeping its tail between its legs while people on social media sneered, "Games count as technology?" and "What's technological about games?" Now it is the gaming industry that plays a pivotal role in rescuing China's "hard tech". We all owe the gaming industry a thank-you. Those who have consistently belittled, insulted, and denigrated games owe the industry an apology. I await their apologies, and I hope to hear them!
