Reporter | Peng Xin
Editor |
Following algorithms, data, and computing power, another link in China's push toward intelligent computing, the large "AI model", has drawn market attention.
At the end of October, server maker Inspur Information released "Source 1.0", an open-source giant artificial intelligence model aimed mainly at natural language processing, that is, a language model, with the goal of attracting more developers to explore AI applications for natural language.
The so-called language model is a technology that lets machines understand and predict human language. Models such as Source 1.0 and GPT-3, commonly known as "large language models", are algorithms that use deep learning to learn how words and phrases string together from thousands of books and massive amounts of text from the Internet.
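As a rough illustration of the idea (not the architecture of Source 1.0 or GPT-3, which use deep neural networks with billions of parameters), a language model can be thought of as a next-word predictor. The toy Python sketch below does this with simple word-pair counts on a tiny made-up corpus:

```python
# A minimal, illustrative sketch of what a language model does: estimate which
# word is likely to come next given the words seen so far. Real large language
# models learn this with deep neural networks; this toy version just counts
# word pairs (a bigram model) in a tiny corpus.
from collections import Counter, defaultdict

corpus = "the model reads text . the model predicts the next word .".split()

# Count how often each word follows each other word.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word after `word`, or None if unseen."""
    counts = next_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))    # -> "model"
print(predict_next("model"))  # -> "reads" (ties broken by count order)
```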
In 2020, the US artificial intelligence non-profit OpenAI released the GPT-3 model, whose parameter count exceeded the 100-billion mark for the first time, reaching 175 billion, trained on a 570 GB dataset. It can answer questions, translate, write articles, and more, drawing the attention of the global AI industry. The MIT Technology Review observed that GPT-3 seemed able to write almost anything: fan fiction, philosophical debates, even code. There has even been debate over whether GPT-3 is the first artificial general intelligence.
Since then, Chinese companies have been actively pushing such large models into the Chinese-language world. Alibaba, together with Tsinghua University, released M6, a Chinese pre-trained model with roughly 100 billion parameters that can be applied to tasks such as generating e-commerce product descriptions, question answering, and Chinese poetry writing. In April this year, Huawei released the HUAWEI CLOUD Pangu models, which let developers quickly build AI models with higher accuracy and stronger generalization using only a small amount of industry data.
The newly released Source 1.0 from Inspur sets a new high for these specifications. Inspur says the Source 1.0 model has 245.7 billion parameters and was trained on a 5,000 GB Chinese dataset. Compared with GPT-3, Source 1.0's parameter count is about 40% larger and its training dataset nearly 10 times larger.
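The gap between the two sets of headline figures can be checked with simple arithmetic; the short Python sketch below just recomputes the ratios from the numbers quoted in the article:

```python
# Quick check of the comparison quoted above, using the figures in the article.
gpt3_params, source_params = 175e9, 245.7e9      # parameters
gpt3_data_gb, source_data_gb = 570, 5000         # training data, in GB

print(f"parameter lead: {source_params / gpt3_params - 1:.0%}")  # ~40%
print(f"dataset ratio:  {source_data_gb / gpt3_data_gb:.1f}x")   # ~8.8x, i.e. nearly 10x
```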
According to test data provided by Inspur, dialogues, novel continuations, news stories, poems, and couplets generated by Source 1.0 were mixed with similar works written by humans and shown to test subjects for judgment; the results show that the subjects' success rate in correctly telling human writing apart from Source 1.0's output was below 50%.
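For illustration, such a "can readers tell it apart?" test boils down to a simple accuracy calculation. The labels and guesses below are invented for the sketch; the article only reports the overall sub-50% result:

```python
# A sketch of how the human-vs-machine judging test can be scored. The data
# here is made up; a judge accuracy near or below 50% means the judges are
# doing no better than guessing at random.
truth   = ["human", "model", "model", "human", "model", "human"]
guesses = ["model", "model", "human", "human", "human", "model"]

correct = sum(t == g for t, g in zip(truth, guesses))
accuracy = correct / len(truth)
print(f"judge accuracy: {accuracy:.0%}")  # 50% would be no better than chance
```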
On the zero-shot learning leaderboard, Source 1.0 beat the previous industry best score by 18.3% and took first place in 6 tasks: literature classification, news classification, product classification, native Chinese inference, idiom reading-comprehension cloze, and noun-pronoun relation; in few-shot learning it took first place in 4 tasks, including literature classification, product classification, literature abstract recognition, and noun-pronoun relation. On the idiom reading-comprehension cloze task, Source 1.0 surpassed the human score.
Companies are racing to release large language models because the spread of AI technology has hit a bottleneck: AI application development is too slow, hindering the matching of technology to real needs, and large models are seen as a feasible way forward. "At present, training huge models with very large parameter counts on large-scale data is considered an important direction for achieving general artificial intelligence," said Wang Endong, chief scientist of Inspur.
"The most important advantage of the big model is to enter the large-scale replicable industrial landing stage, only a small sample of learning, can also achieve better results than before, and the larger the scale of the model parameters, the more obvious this advantage, can greatly reduce the development and use costs of all types of users." Wu Shaohua, chief scientist of the Inspur Artificial Intelligence Research Institute, said.
In terms of how it is being promoted, Source 1.0 takes an open-source approach and is open to AI teams at universities and research institutions, as well as to Inspur's partners and intelligent computing centers. Wu Shaohua's plan is for Source 1.0 to be open-sourced along three dimensions: data, APIs, and code. At the same time, Inspur will work with partners to migrate and develop "Source" applications on domestic chips.
Inspur expects that the release of the Source 1.0 Chinese giant model will give Chinese academia and industry a general-purpose giant language model, greatly reducing the difficulty of adapting language models to different application scenarios and improving generalization in few-shot and zero-shot learning scenarios.
AI applications in the market are gradually growing in scale. The general manager of Inspur Information's AI & HPC product line observed that computing power demand for AI models doubles every 3 to 4 months, and daily usage is expected to grow by an order of magnitude. "We can feel very directly that AI cloud service platforms are already providing many AI services for work and daily life, including speech recognition, image recognition, natural language processing, and so on. The number of calls per day will exceed a trillion."
For Inspur Information, AI models of this kind, typified by large language models, demand enormous computing power, which helps drive the rollout of projects such as intelligent computing centers. Intelligent computing centers are in fact projects that various Chinese technology companies have been pushing recently, and IT hardware makers such as Inspur, Huawei, and New H3C stand to benefit from them by selling hardware.
On the application side, flashy AI showcases have given way to industry solutions, and support from large-scale computing power has become indispensable. For example, some government citizen-service hotlines have cut processing time from a few minutes to a few seconds after adopting AI technologies such as automatic dispatching, semantic recognition, and emotion perception. In Inspur's vision, the Source 1.0 model can be used for operators' intelligent operations and maintenance, automatic report generation in smart office scenarios, conversational intelligent assistants on the mobile Internet, intelligent customer service in e-commerce, and application scenarios such as text recognition, text search, and translation.
Benefiting from the huge AI market, Inspur Information's related business has kept growing. According to IDC's 2020 data on global AI servers, Inspur, Dell, and HPE ranked as the top three in the global market, with Inspur's share reaching 16.4%. IDC also expects China's AI server market to reach US$10.8 billion by 2025.