
From the "100-model war" to the application race

Author: China Business News

Text: Li Jing

In 2023, major domestic technology companies and startups poured into the research and development of large language models. Starting with Baidu, which took the lead by launching "Wenxin Yiyan" in March 2023, Alibaba, iFLYTEK, Huawei, JD.com, ByteDance and other companies successively launched large-model products, and the domestic market entered a "100-model war".

However, while the major domestic manufacturers were fighting fiercely over large models, another voice in the market was gradually growing louder. Industry insiders, represented by Robin Li, founder and CEO of Baidu, have repeatedly said: "There is no point in competing over large models; there are greater opportunities in competing over applications."

At the end of 2023, Pika, an AI video generation product and a large-model application in the limelight in Silicon Valley, reached a valuation of $250 million with a team of only four people. Another important event in the same period was that OpenAI cut off ByteDance's API access. Li Mingshun, founding partner of Shunfu Capital and chairman of Xingxing AI, explained: "This actually reflects that large model companies are beginning to fear strong applications, because large models are gradually becoming an open hand. Everyone is no longer competing on the technical base, but on how many users, how many scenarios, and how much money they have to keep paying for computing power. The number of users, scenarios, and investment capacity will become the core of this wave of large-model competition, and in this context, the importance of the large models themselves will decline."

Technological progress is evident

Large models were a major keyword in China's Internet technology field in 2023. Under the influence of ChatGPT and led by the giants, large models with hundreds of billions of parameters were launched in China in 2023. Specifically, in March 2023, Baidu released "Wenxin Yiyan"; in April, Alibaba released "Tongyi Qianwen" and SenseTime released the "Ririxin" large model; in May, iFLYTEK released the "Spark Large Model"; and in July, Huawei released the industry-oriented "Pangu Large Model 3.0". Statistics from CCID Consulting show that from January to July 2023 alone, a total of 64 large models were released in China.

From September to October 2023, Baidu, Alibaba, Tencent, iFLYTEK and other companies successively launched the latest versions of their large models, with model capabilities on par with GPT-3.5 and GPT-4.

In November 2023, the "Artificial Intelligence Large Model Experience Report 3.0" released by the China Enterprise Development Research Center of the Xinhua News Agency Research Institute showed that domestic large model manufacturers were chasing one another in technical strength, and that Chinese large model products had made significant progress compared with August 2023. Domestic large models started later than GPT-4 and other top international large models, but with their accelerated development and the accumulation of parameters and training, some of them can already compete with foreign large models. In terms of parameter count, Zhipu's ChatGLM2-130B has reached 130 billion parameters, second only to GPT-4, and "Wenxin Yiyan" 4.0 is also at the trillion level. In terms of performance, large models such as "Wenxin Yiyan" and "Consultation" outperform Meta's LLaMA-2-70B large model in the CLiB evaluation, and the gap with GPT-4 is small.

"The biggest change in the field of large models in 2023 is the rapid expansion of the scale and application scope of models, mainly due to the explosive growth of data volume and the improvement of computing power. With the popularization of the Internet and the acceleration of digitalization, massive amounts of data are being generated every day, which provides rich 'food' for training larger, more complex models, so that large models can better understand and simulate the real world." Zhu Keli, executive director of the China Information Association and founding dean of the National Research Institute of New Economy, said that with the continuous development and maturity of deep learning technology, the performance and efficiency of models are also improving, providing a technical basis for the wide application of large models. At the same time, the development of cloud computing, edge computing and other technologies provides powerful computing resources and infrastructure support for the training and deployment of large models. Together, these factors have driven the rapid development and change in the field of large models.

"Research on intelligent dialogue began in academia very early on, and there were shipped products, such as Microsoft Xiaoice, but they could not complete complex tasks. That lasted until November 2022, when OpenAI released ChatGPT, built on GPT-3.5, and its problem-solving ability shocked the world. A large amount of capital at home and abroad has been pouring in, resulting in the rapid development of the large model industry, and nearly 300 large model companies and scientific research institutions have been born in China alone." Dr. Liang Bin, CEO of Bayou Technology, believes that the biggest change in the market is the confidence of all participants that large models will comprehensively transform the world. That confidence did not exist before, but now everyone sees a major opportunity and dares to invest.

Li Mingshun pointed out: "In fact, the gap between domestic mainstream models and foreign models is still more than a year and a half, but this does not mean that domestic models are not good. The domestic capital market and the companies that make large models are relatively pragmatic. They are more likely to consider commercialization, and are more willing to do model development or application development in combination with scenarios. Foreign giants, on the other hand, are more willing to invest in basic research and cutting-edge technology exploration."

It should be noted that large models are still moving forward, and the industry also faces some problems that urgently need to be solved. Luo Jiangchun, founder and CEO of Glance Technology and a member of the AI application working group of the Ministry of Industry and Information Technology, told a China Business News reporter that large language models still face major challenges in scalability, alignment with human values, pre-training cost, authenticity and credibility.

Liang Bin also pointed out that the chokehold on computing power is a serious challenge at present. In terms of data, there is still a big gap between China's large models and those of the United States in processing PGC data (data generated by the platform), because much high-tech content and many papers are in English.

Better prospects in developing applications?

In the general environment of the "100-model war", is it meaningful for small and medium-sized enterprises and entrepreneurs to join the fight over large models? Robin Li has said many times: "The 100-model war is a great waste of social resources, and more resources should be put into super applications."

"Just as with search engines, there may be only a few companies that can make a large model in the end," Wang Jianbo, founder and CEO of Maxima bidding big data platform, told reporters.

From a technical point of view, OpenAI simply chose the Transformer architecture from deep learning and then improved on it, until "intelligence emerged". As Amazon founder Jeff Bezos put it, large language models are more like a "discovery" than an "invention".

Wang Xiaochuan, Chairman and CEO of Cheetah Mobile and Chairman of Orion Star, said: "Once the principle of large models is understood, any company can make a large model. Training and parallel training are engineering matters."

"Large model training technology is only one side of the problem. On the other side, it also requires enough corpus and enough computing power, which in practice means a steady stream of money," said Bao Ran, vice chairman of Zhongguancun Modern Information Consumption Application Industry Technology Alliance. "As long as there is money, making a large model is actually a simple and crude thing to do, but the investment runs to tens of billions of dollars. Whether for a mature company or a start-up, spending tens of billions of dollars on a large model is very difficult."

But it has to be acknowledged that the rapid development of the technology has caused volatility in the industry. "The industry is moving from the exploration phase of AI technologies to a deep understanding of how to effectively integrate these technologies into specific business processes and services. In the face of this change, the key is to find and master real application scenarios, an immediate and effective data feedback system, effective technology application development capabilities, and relatively complete industrial chain support," Luo Jiangchun said.

For large model companies, the challenge is how to find application scenarios, because scenarios are scattered across all walks of life. Wang Jianbo told reporters that large model companies are far from their customers and often carry an AI hammer around looking for nails, but many scenarios may be pseudo-needs, and the probability of AI creating new scenarios is also very small.

This has also made some large model companies wary of companies with strong applications and strong scenarios. At the end of 2023, OpenAI cut off ByteDance's API access. "A company like ByteDance, with its huge user base, is what OpenAI is afraid of, because many ByteDance products will embed AI in the future," Li Mingshun said. In particular, once AI is used in short video production tools such as Jianying (CapCut in its overseas version), it may become the world's No. 1 short video tool, far surpassing the Pika that everyone sees today.

For small and medium-sized entrepreneurs, innovation in the field of large models is shifting to providing highly targeted and highly customized vertical industry solutions, so that even small entrepreneurial groups can make successful AI application products by standing on the shoulders of large models.

In 2023, there was also a relatively obvious trend in the domestic large model market: diversification and deep integration. "The market has become more focused on the needs of specific industries, providing more refined services in areas such as law, medicine and finance, along with a corresponding trend of deep integration, which lets AI technology better serve the specific needs of specific industries. The market's acceptance of these new technology tools is also increasing, and businesses and consumers are beginning to rely more on these AI tools to improve work efficiency and decision-making," Luo Jiangchun said. At the same time, there is an increasing demand for adaptability: these technologies need to be able to adapt to the specific needs of different industries and use cases.

"In the second half of 2023, some large-model applications appeared in industrial, gaming, e-commerce, hotel and other scenarios. In 2024, the combination of scenarios and AI will continue to produce large-model applications." However, Li Mingshun also cautioned that as the industry's recognition of large models keeps rising, it is difficult to build a competitive barrier by making lightweight applications on top of large models from scratch.

"The training of large models and of large-model applications in vertical industries requires a large amount of high-quality industry data. In this process, how to effectively obtain and process data, how to improve the performance and efficiency of the model, and how to ensure the security and reliability of the application are all problems that urgently need to be solved," Zhu Keli said.