
Catching regulators' attention: is there a new direction for large model development?

Author: Dahe Caifang (Great River Finance Cube)
[Dahe Caifang reporter Zhang Keyao]

As ChatGPT-style applications explode across the web, technologies and concepts related to AI (artificial intelligence) are being eagerly pursued. With domestic and foreign vendors piling into large models, a "war of a hundred models" is about to break out, and the construction of the large model ecosystem has drawn the attention of regulators. Large models are regarded as part of the infrastructure of the AI era; what new directions will regulators' thinking bring to their development?

Regulators turn their attention to the large model ecosystem

On July 10, Yao Qian, director of the Science and Technology Regulatory Bureau of the China Securities Regulatory Commission, published an article, "Some Thoughts on the Construction of the Large Model Ecosystem," in China Finance. The article analyzes the evolution and upgrade paths of large models and the possible modes of interaction between large models and small and medium-sized models, expounds on the data and model ecosystems of large models, and offers ideas for ensuring the safe and healthy development of the industry, avoiding data and technology risks, and building a sustainable large model ecosystem.

What is a large model? AI has its own explanation. A Dahe Caifang reporter put the question to Baidu's and 360's AI tools and received the following answers:

Baidu: "A large model refers to a language model based on deep learning and natural language processing technology. By learning from large amounts of text data, it masters the structure and semantics of language, and can thus generate and understand human language."

360: "Large models are deep learning models with large numbers of parameters and heavy computing requirements that can handle more complex tasks and data types, such as images, speech, and natural language. They typically consist of multiple convolutional, pooling, and fully connected layers that automatically learn features and perform tasks such as classification or regression."
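As a concrete illustration of the architecture the 360 answer describes, here is a minimal sketch in PyTorch of a deep network built from convolutional, pooling, and fully connected layers that learns features and performs classification. The model, layer sizes, and names are illustrative assumptions, not any vendor's actual system.

```python
# Minimal sketch (PyTorch assumed) of the architecture described above:
# convolutional, pooling, and fully connected layers that learn features
# and perform a classification task. Purely illustrative.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer
            nn.ReLU(),
            nn.MaxPool2d(2),                             # pooling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                   # features are learned automatically
        return self.classifier(x.flatten(1))   # classification head

model = TinyClassifier()
logits = model(torch.randn(1, 3, 32, 32))  # one 32x32 RGB image
print(logits.shape)  # torch.Size([1, 10])
```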

According to technical experts, for ordinary users a large model is an underlying technology: it supports the technical needs of applications and other software tools, playing a role akin to an operating system. The AI tools answering the questions above, for example, are powered by large models behind the scenes. Corresponding to large models are small models, which differ in parameter scale, application, and performance.
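To make the "operating system" analogy concrete, the hypothetical sketch below shows an application that delegates all language understanding to an underlying large model service. The endpoint URL, request fields, and response shape are assumptions for illustration, not any vendor's real API.

```python
# Hypothetical sketch: an application tool sitting on top of a large
# model "operating system" layer. The app only forwards natural-language
# requests to a model service endpoint and displays the reply. The URL,
# payload fields, and response shape are illustrative assumptions.
import requests

MODEL_ENDPOINT = "https://example.com/v1/chat"  # placeholder address

def ask_model(question: str) -> str:
    """Forward a user question to the underlying large model service."""
    resp = requests.post(
        MODEL_ENDPOINT,
        json={"prompt": question, "max_tokens": 256},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["answer"]  # assumed response field

if __name__ == "__main__":
    print(ask_model("What is a large model?"))
```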

Yao Qian described the characteristics of both large and small models in his article. In his view, the basic large model is the core engine of the large model industrial ecosystem, and its advantages lie in its foundational nature and versatility. Small models are small in scale (usually on the order of tens of billions of parameters) and easy to train and maintain, so they are suited to various vertical fields and to internal development and use within individual industries. In general, small models are cheaper to train, but their performance falls well short of that of large models.
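Some back-of-the-envelope arithmetic shows why the parameter gap matters for cost. The parameter counts and half-precision storage below are generic assumptions used only to illustrate scale, not figures from the article.

```python
# Back-of-the-envelope arithmetic on the large/small model gap described
# above. Parameter counts and fp16 storage are generic assumptions used
# only to illustrate scale, not figures from the article.
BYTES_PER_PARAM_FP16 = 2  # half-precision weight

def weight_memory_gb(num_params: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return num_params * BYTES_PER_PARAM_FP16 / 1e9

for name, params in [
    ("small vertical model", 10e9),   # tens of billions of parameters
    ("basic large model", 175e9),     # hundred-billion-class general model
]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
# small vertical model: ~20 GB of weights
# basic large model: ~350 GB of weights
```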

What is the "Achilles' heel" of large models?

"Overall, there is no generational gap between mainstream large models at home and abroad at the algorithm level, but there is a gap in computing power and data." Yao Qian said in a post.

As is well known, the three major elements of AI are data, computing power, and algorithms. Data serves as the "fuel" that trains and validates large models; computing power, as the driving force, supports their training and inference; and algorithms, as the "soul," are the core means of mining the value of data, refining information from it, and putting it to use. Large models are closely bound up with all three.

The above-mentioned technical experts said the computing power gap shows up in computing infrastructure, including both its construction and the hardware that supports it, while the data gap is tied to how long the Chinese internet has been developing and to the Chinese-language corpus accumulated over that time.

With the continuous build-out of computing infrastructure in China, the computing power gap is narrowing. In May 2021, construction of the hub nodes of the national integrated computing power network was officially launched; in July of the same year, the Three-Year Action Plan for the Development of New Data Centers (2021-2023) was issued. In February 2022, China began building national computing power hub nodes in eight regions, including Beijing-Tianjin-Hebei, the Yangtze River Delta, and the Guangdong-Hong Kong-Macao Greater Bay Area, and planned 10 national data center clusters, completing the overall layout of the national integrated big data center system; with this, the "East Data, West Computing" project was fully launched.

Behind this major project lies an industrial chain of hardware facilities. Technical experts said that, owing to the complex international trade environment, the circulation of certain hardware for computing infrastructure is restricted; this, however, opens a market opportunity for domestic hardware manufacturers, and it remains to be seen how they seize it.

A corpus is, in effect, a large model's data. It consists of large amounts of text, the "data" the model uses to learn and predict. Training on a large-scale corpus is what makes high-performing large models possible: the larger the corpus and the more parameters that can be exercised during training, the stronger the model's grasp of the regularities of language and its adaptability.
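As a minimal sketch of what "corpus as data" means in practice, the toy example below turns raw text into the integer sequences a language model actually trains on. The whitespace tokenizer and two-line corpus are illustrative stand-ins for real tokenizers and web-scale text.

```python
# Minimal sketch of turning a raw text corpus into training "data" for a
# language model. The whitespace tokenizer and toy corpus are
# illustrative stand-ins for real tokenizers and web-scale text.
from collections import Counter

corpus = [
    "large models learn the structure of language",
    "the larger the corpus the better the model generalizes",
]

def tokenize(text: str) -> list[str]:
    return text.lower().split()  # naive whitespace tokenizer

# Build a vocabulary: every distinct token gets an integer id.
counts = Counter(tok for line in corpus for tok in tokenize(line))
vocab = {tok: i for i, (tok, _) in enumerate(counts.most_common())}

# Encode the corpus as id sequences -- this is what training consumes.
encoded = [[vocab[tok] for tok in tokenize(line)] for line in corpus]
print(f"{sum(counts.values())} tokens, vocabulary size {len(vocab)}")
print(encoded[0])
```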

The above-mentioned technical experts believe the difficulty in training domestic large models is that the Chinese-language corpus is still in the accumulation stage, which is related to factors such as how long the Chinese internet has been developing. "Training a large model is like teaching a child: it learns whatever you teach it," one expert said, adding that as the Chinese internet develops with high quality, domestic large model training will benefit as well.

"100 Model Wars" fiercely fights the application market

The AI craze has made the long-dormant internet application market lively again. Domestic and foreign vendors have launched large models and AIGC (AI-generated content) applications, and a "war of a hundred models" has broken out: Microsoft, Google, and other vendors are squaring off abroad, while Baidu, Alibaba, 360, Huawei, iFLYTEK, and others maneuver for position at home.

On July 7, Huawei Developer Conference 2023 (Cloud) was held. Tianyancha and HUAWEI CLOUD jointly released the first large model for business information queries, "Tianyan Sister: Trusted Business Assistant." According to reports, "Tianyan Sister" lets users make business inquiries through natural language dialogue.

On July 11, iFLYTEK announced that as of the end of June, its open AI platform had gathered 4.974 million developers, up 45% over the past year, and hosted 1.725 million applications, up 13.7% over the past year. Together with leading enterprises, iFLYTEK is building industry open platforms on top of its industry models to empower entire sectors; this has become another route for landing iFLYTEK's AI ecosystem, with preliminary results already taking shape in energy, consumer goods, and other industries.

Mu Ruitao, vice president of the Henan Digital Economy Industry Innovation Research Institute and a former senior engineer at Oracle's Asia R&D Center, told Dahe Caifang in an interview that the construction of the large model ecosystem is key to success or failure in the market, but it cannot be built behind closed doors: builders must keep watching how market demand for applications changes, and the ecosystem's value must be proven by the test of the market.

Yao Qian proposed in the article that leading domestic technology enterprises in the general field be vigorously supported in developing independent, controllable domestic large models, and that vertical fields be encouraged to use open-source tools to build standardized, controllable, autonomous tool chains on top of large models, exploring both "large and strong" general models and "small and beautiful" vertical industry models, so as to build a healthy ecosystem in which basic large models and professional small models interact symbiotically and evolve iteratively.

In Mu Ruitao's view, China and other countries alike are still exploring how to build the large model ecosystem. The effort touches the entire AI industry chain; it cannot be driven by hot concepts or focus on only a few links, but requires collaborative innovation by hardware and software manufacturers.

Editor in charge: Wang Shidan | Reviewer: Li Jinyu | Second reviewer: Li Zhen | Executive producer: Wan Junwei