
How much computing power is needed to build a large AI model?

In the AI boom sweeping the globe, one bottleneck that cannot be ignored is the shortage of computing power. Computing power is the infrastructure of AI development: training requires passing over the data set for many rounds, and the scale of available compute determines how much data can be processed and how fast.

According to OpenAI's estimates, the amount of compute used in the largest AI training runs has grown exponentially since 2012, doubling roughly every 3.4 months on average, for a total expansion of 300,000x, far outpacing the roughly two-year doubling period of hardware under Moore's Law. As Baidu, 360, Huawei, and other companies enter the AI field, demand for computing power from domestic manufacturers is set to surge.

Drawing on OpenAI's published research and the R&D progress of Chinese companies' large models, GF Securities analyst Liu Xuefeng and colleagues estimated the computing power required in the training and inference stages of domestic AI large models, along with the corresponding costs.

Computing power requirements

First, based on GPT-3 data, the analysts calculated the AI server requirements for training and inference of domestic AI large models.

According to the analysts:

According to SenseTime's prospectus, training a GPT-3-scale large model requires 355 GPU-years. Given how eager companies currently are to launch AI large models, we assume a technology company wants training completed in one month, so the number of AI accelerator cards needed to train one AI large model is 355 × 12 = 4,260.

We believe AI large models are set to become a key battleground for technology manufacturers. Assuming 10 domestic companies are willing to develop them, training alone creates incremental demand for roughly 43,000 AI accelerator cards (10 × 4,260 = 42,600), or about 5,325 new AI servers (this article assumes a single AI server carries 8 AI accelerator cards).
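To make the arithmetic explicit, here is a minimal sketch (in Python, not from the report) of the training-stage estimate; the 355 GPU-year figure, the one-month training window, the 10 companies, and the 8 cards per server are the assumptions stated above:

```python
# A minimal sketch of the report's training-stage arithmetic. The
# GPU-year figure comes from SenseTime's prospectus as cited above; the
# one-month window, 10 companies, and 8 cards per server are the
# report's stated assumptions.

GPU_YEARS_REQUIRED = 355      # GPU-years to train a GPT-3-scale model
TRAINING_MONTHS = 1           # assumed wall-clock budget per company
COMPANIES = 10                # assumed number of domestic entrants
CARDS_PER_SERVER = 8          # assumed accelerator cards per AI server

# Compressing 355 GPU-years into one month needs 355 * 12 cards.
cards_per_model = round(GPU_YEARS_REQUIRED * 12 / TRAINING_MONTHS)
total_cards = cards_per_model * COMPANIES
total_servers = total_cards / CARDS_PER_SERVER

print(f"cards per model:  {cards_per_model:,}")    # 4,260
print(f"total cards:      {total_cards:,}")        # 42,600
print(f"total AI servers: {total_servers:,.0f}")   # 5,325
```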

According to data on NVIDIA's official website, a single A100 can perform 1,757 inferences per second on BERT-class models, so the analysts assume that a single A100 serving an AI large model can generate 1,757 words per second, and use this as the per-card throughput available to serve user content requests.

As of March 27, 2023, Baidu's "Wenxin Yiyan" had received 120,000 applications for testing. We therefore expect domestic "ChatGPT-like" services to attract a relatively large number of visitors.

On March 29, 2023, at the 2023 Digital Security and Development Summit Forum, 360 demonstrated its language model running in the 360 browser. The analysts believe content-generation applications have become a direction actively explored by technology manufacturers developing AI large models, and therefore assume that 10 "ChatGPT-like" AI models will be developed and deployed in China. For different scenarios, they made the following assumptions:

If domestic "ChatGPT-like" services are positioned for internal use by registered enterprises, with 50 million visits per day and 5 conversations per visitor, the estimated incremental demand from AI large-model inference is about 43,000 AI accelerator cards, or 5,425 AI servers.

If they are instead open to individual users, with 100 million or 300 million visits per day and 5 conversations per visitor, the estimated incremental demand from inference is about 87,000 or 260,000 AI accelerator cards, or roughly 11,000 or 33,000 AI servers.
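As a rough cross-check, the sketch below (not from the report) converts these card counts into server counts at 8 cards per server and shows the daily word throughput implied by the A100 figure. Note that a plain 43,000 / 8 gives about 5,375 servers, slightly below the report's 5,425, so the report's rounding or overhead assumptions evidently differ a little:

```python
# Hedged cross-check of the inference-stage sizing. The throughput line
# reads NVIDIA's published figure of 1,757 BERT-class inferences per
# second on one A100 as ~1,757 generated words per second, as the
# report does; scenario card counts are taken from the report.

A100_WORDS_PER_SECOND = 1757
SECONDS_PER_DAY = 24 * 60 * 60
CARDS_PER_SERVER = 8

# Upper bound on words one A100 can generate per day at full utilization.
words_per_card_per_day = A100_WORDS_PER_SECOND * SECONDS_PER_DAY
print(f"one A100 at 100% utilization: ~{words_per_card_per_day:,} words/day")

# Report card counts per scenario -> implied server counts.
scenarios = {
    "enterprise-internal, 50M visits/day": 43_000,
    "open to individuals, 100M visits/day": 87_000,
    "open to individuals, 300M visits/day": 260_000,
}
for name, cards in scenarios.items():
    print(f"{name}: {cards:,} cards -> ~{cards / CARDS_PER_SERVER:,.0f} servers")
```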

Summing the two stages, domestic AI large models may generate computing power demand equivalent to roughly 11,000 high-end AI servers (5,325 for training plus 5,425 for inference in the enterprise-internal scenario) or, under the optimistic assumption, roughly 38,000 (5,325 for training plus 33,000 for inference).

Cost estimation

In addition, the analysts emphasize that multimodal large models are the development direction for AI large models and have broad application prospects. Since the beginning of this year, technology manufacturers around the world have released multimodal large models one after another, such as Google's PaLM-E, OpenAI's GPT-4, and Baidu's "Wenxin Yiyan".

Compared with natural-language-only models, multimodal models integrate multidimensional data such as text, images, and three-dimensional objects during training, handle more types of interactive information, and are therefore far more versatile. Referring to the pricing of OpenAI's and Baidu's AI large models, the analysts made the following assumptions about the cost users in various industries incur for generation tasks:

Generated text is priced at 0.003 USD per 1,000 tokens, equivalent to about 0.02 RMB per 1,000 tokens (reference exchange rate: 1 USD = 6.88 RMB). A token is a unit of text covering words and punctuation marks, and can loosely be understood as a single character or word. Generated images are priced at 0.02 USD each, taken here as roughly 0.15 RMB per image.

With multimodal large-model APIs opened up, the analysts run a sensitivity analysis, based on the assumptions above, of what content-generation tasks would cost users across industries.

We expect that, in the short to medium term, daily visits to multimodal large models will range between 50 million and 300 million. Assuming each user generates text content 5 times a day at 1,000 tokens per generation, daily text output is expected to range between 250 billion and 1.5 trillion tokens.

Assuming each user also generates 5 images per day, the number of images generated daily is expected to range from 250 million to 1.5 billion. From this, the analysts tabulate the cost that users in various industries would incur when calling multimodal large-model APIs for content-generation tasks; the arithmetic is sketched below.
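Here is a minimal sketch of that cost arithmetic under the stated assumptions (0.02 RMB per 1,000 text tokens, 0.15 RMB per image, 5 text generations and 5 images per user per day); the report's industry-by-industry table itself is not reproduced in this abridged version:

```python
# Sketch of the daily API-cost range implied by the report's pricing
# and usage assumptions. Both endpoints of the 50M-300M daily-user
# range are evaluated; per-industry splits from the report are omitted.

TEXT_PRICE_RMB_PER_1K_TOKENS = 0.02
IMAGE_PRICE_RMB = 0.15
GENERATIONS_PER_USER = 5       # text generations per user per day
TOKENS_PER_GENERATION = 1000
IMAGES_PER_USER = 5            # images per user per day

for daily_users in (50_000_000, 300_000_000):
    tokens = daily_users * GENERATIONS_PER_USER * TOKENS_PER_GENERATION
    images = daily_users * IMAGES_PER_USER
    text_cost = tokens / 1000 * TEXT_PRICE_RMB_PER_1K_TOKENS
    image_cost = images * IMAGE_PRICE_RMB
    print(f"{daily_users // 1_000_000}M users/day: "
          f"text ~{text_cost / 1e6:.0f}M RMB/day, "
          f"images ~{image_cost / 1e6:.1f}M RMB/day")
```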

The analysts also emphasize that AI large-model technology is still at an early stage, and the pace and direction of iteration are changing rapidly. Estimates of AI computing power demand must also account for algorithmic optimizations that reduce how much compute a model consumes; once such software-side cost reductions and efficiency gains are factored in, actual hardware demand and computing power costs may come in below the figures estimated above.

In summary, the analysts conclude that, before accounting for any software-level algorithmic optimization that reduces model compute consumption and cost, domestic large models may generate computing power demand equivalent to 11,000 or, under optimistic assumptions, 38,000 high-end AI servers across the training and inference stages. With a single A100 priced at 100,000 yuan and AI accelerator cards accounting for about 70% of a server's value, this corresponds to an incremental AI server market of roughly 12.6 billion or 43.4 billion yuan (RMB).
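The market sizing can be checked directly: eight A100s at 100,000 yuan each make up 70% of a server's value, implying a per-server value of roughly 1.14 million yuan. A short sketch:

```python
# Verifies the incremental market sizing: 8 A100s at 100,000 RMB each
# make up ~70% of a server's value, so a full server is worth
# 800,000 / 0.7 ~= 1.14M RMB.

A100_PRICE_RMB = 100_000
CARDS_PER_SERVER = 8
CARD_SHARE_OF_SERVER_VALUE = 0.70

server_value_rmb = A100_PRICE_RMB * CARDS_PER_SERVER / CARD_SHARE_OF_SERVER_VALUE

for servers in (11_000, 38_000):
    market_rmb = servers * server_value_rmb
    print(f"{servers:,} servers -> ~{market_rmb / 1e9:.1f}B RMB")  # 12.6B / 43.4B
```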

The analysts expect this incremental AI server demand to materialize gradually over a one-to-three-year horizon.

The main views in this article come from the report "Estimating Computing Power Requirements in the Training and Inference Stages of Domestic AI Large Models" by GF Securities analyst Liu Xuefeng (license no. S0260514030002); this article is an abridged version.

This article is from Wall Street News.
