The large-model "melee": the era of cloud competition has arrived

Daily Economic News reporter: Shi Puning; editor: Tang Yuan

"China is now in a 'hundred-model war', which is the closest road to artificial general intelligence." Speaking on June 2 at the first Chengdu Eastern New Area and Chengdu-Chongqing Twin Cities Economic Circle Urban Media Development Conference and City Opportunity List Release Event, themed "Future City and Intelligent Media", Zhang Hongzhong, dean of the School of Journalism and Communication at Beijing Normal University, said that large models give every Internet company a new opportunity to start over.

How should we understand the "large model"? One place to start is the architecture. Simply put, today's IT technology stack is divided into four layers: the chip layer, the framework layer, the model layer and the application layer.

"Today's ChatGPT, Wenxin Yiyan (ERNIE Bot) and the like belong to the model layer, and native applications of the AI era will be developed on top of large models," Baidu CEO Robin Li said at the 7th World Intelligence Conference.

Since early spring, as the AI 2.0 era marked by the application of large models has accelerated, interest in artificial intelligence has surged.

In China, major enterprises led by the big tech companies have announced their own large models and products, and general-purpose large models have set off a wave of startups. The market is bustling, and no one wants to miss the opportunity.

The large-model products being "born" come from familiar big tech companies such as Baidu and Alibaba, from companies focused on artificial intelligence such as SenseTime, and from entrepreneurs "starting from scratch". A typical example of the last group is Meituan co-founder Wang Huiwen, who said he wanted to create "China's OpenAI".

Amid the "hundred-model melee", where does each player stand today, and what kind of technological change are we entering as large models swarm into view?

Various products have been launched

"Since ChatGPT entered public use last November, there have been two major breakthroughs: the first is open-domain multi-turn conversation, and the second is generative text." Zhang Hongzhong said that these two technical breakthroughs mark the arrival of a new era of human-computer communication, "which is a huge revolutionary breakthrough."

Baidu and Alibaba, which have deep experience in large-model technology, took the lead, launching "Wenxin Yiyan" and "Tongyi Qianwen" on March 16 and April 7 respectively.

Image source: "Baidu Wenxin Yiyan" public account

Since its closed beta began, Wenxin Yiyan has completed four technical version upgrades: inference cost has dropped to one-tenth of the original, and inference performance has increased nearly tenfold.

"When others are just thinking about how to train, we have already gone a long way in inference," Robin Li said at an internal Baidu conference in May.

In addition, Wenxin Yiyan will gradually be integrated into all of Baidu's businesses.

On May 16, Baidu released its unaudited financial report for the first quarter, and Robin Li said, "We plan to gradually integrate Wenxin Yiyan into all our businesses to empower our products and services and attract a wider range of users and customers. We will build a new ecosystem for the new era around Wenxin Yiyan. This will also help us achieve long-term, sustainable growth."

Similarly, more than a month earlier, at the Alibaba Cloud Summit, Daniel Zhang, Chairman and CEO of Alibaba Group and CEO of Alibaba Cloud Intelligence Group, said that all Alibaba products will be connected to the Tongyi Qianwen model and comprehensively transformed. "Facing the AI era, all products are worth upgrading with large models."

On June 1, Alibaba Cloud disclosed the latest progress of the Tongyi large model, launching "Tongyi Tingwu", aimed at AI audio and video, and officially opening it for public beta.

According to Zhou Jingren, CTO of Alibaba Cloud, Tongyi Tingwu is an AI assistant for work and study. It is not only "good at listening", generating meeting minutes with high accuracy and distinguishing between speakers, but also has "extremely strong comprehension": it can divide audio and video into chapters with abstracts, summarize the full text and each speaker's views, and organize key points and to-do lists in one second.

Image source: "Alibaba Cloud" public account

Beyond the flurry of moves by big tech companies, AI technology companies are also eyeing a slice of this "cake".

In April, computer vision company SenseTime launched SenseNova, which includes the natural language processing model SenseChat, the text-to-image model Miaohua, and the digital human video generation platform SenseAvatar.

In the same month, Fourth Paradigm, a decision-making AI company, presented "Shishuo 3.0" to the public for the first time, targeting the growth potential and market opportunity in using generative AI to restructure and transform enterprise software.

Notably, Tencent has kept a low profile on large-model products and does not intend to compete on speed.

After Tencent released its fourth-quarter 2022 earnings report in March, Tencent President Martin Lau revealed plans for a number of upcoming products, including chatbots. He said that Tencent will not rush to launch products, but will take the time to iterate repeatedly and build a long-term development opportunity.

In addition, big tech companies such as ByteDance have chosen a role similar to Microsoft's. ByteDance's Volcano Engine has launched a large-model training cloud platform that provides computing power and other technical services to large-model companies. Tan Dai, president of Volcano Engine, said that Volcano Engine will not build large models itself, but will act as an enabler, providing AI infrastructure such as computing power to help large-model customers with their development.

Image source: "Volcano Engine" public account

Zuoyebang's education model is in closed beta

Beyond technology companies, the reporter learned from Zuoyebang that the company is currently testing a large education model built for the Chinese market. It covers educational application scenarios such as multi-subject problem solving, Chinese and English essay correction, and multilingual dialogue, and involves businesses such as tool apps, smart hardware and books. According to insiders, its overall performance in education scenarios has exceeded expectations.

Caijing Technology asked Zuoyebang to verify this, and a staff member replied that Zuoyebang's education model is currently in internal testing and that the related evaluation and filing work is in progress.

A person close to Zuoyebang said that the company launched a self-developed education GPT project at the beginning of this year, drawing top engineers from various business units to form a founding team. In March this year, an internal Zuoyebang email announced that the company would increase its investment again and make organizational adjustments. At present, CTO Luo Liang leads the project, covering underlying R&D support and the overall AIGC direction, and R&D funding is given priority.

The person added that, in addition to the organizational restructuring, nearly 100 people from Zuoyebang's product and R&D teams have been moved into GPT-related projects. "From technology to product, with projects 'running' in different deployment scenarios, the team is estimated at more than 200 people."

Another Zuoyebang insider said, "Judging from its accumulated technical capabilities, Zuoyebang has obvious advantages, so it would be strange not to do this." The person said that after half a year of research and development, Zuoyebang is more confident in its self-developed GPT. It has prioritized "unlocking" product-level applications such as problem solving, Chinese and English essay correction, and knowledge Q&A, which have progressed more smoothly than expected and have reached the industry's top level in some educational scenarios.

Image source: "Zuoyebang" public account

The era of cloud competition has arrived

"Now that large language models have emerged, the cloud has begun to be developed, and competition in the cloud will become the window onto the next round of Internet competition," Zhang Hongzhong said.

According to Canalys data, China's cloud computing market grew 10% year-on-year last year. The top four cloud vendors, Alibaba Cloud, Huawei Cloud, Tencent Cloud and Baidu Intelligent Cloud, together grew 9% and accounted for 79% of total customer spending on cloud services.

Zhang Hongzhong believes that, for applications, "conversation as a platform" has become a reality, and that conversation can handle multimodal tasks. For example, after connecting to the ChatGPT API, large models can draw pictures, do graphic design, write copy, and so on.

To understand large models at a deeper level, "emergence" is a key concept.

Simply put, emergence refers to a qualitative leap in capability that appears once a large model reaches a certain scale. This capability does not exist in small models. Zhang Peng, founder and CEO of Zhipu AI, which specializes in pre-trained large models, told the media that the general consensus in the industry is that 50 to 60 billion parameters is the threshold at which large-model intelligence emerges.

This is also the basis of ChatGPT's "miracle": the earlier GPT-3 model, whose "butterfly wings" set things in motion, already had 175 billion parameters. OpenAI has not disclosed the number of parameters for GPT-4. For comparison, Google's new PaLM 2 is reported to have 340 billion parameters.
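To put the parameter counts above side by side, the following minimal Python sketch compares each model with the 50 to 60 billion threshold cited by Zhang Peng. The figures come only from this article; the function and variable names are illustrative, and the ratio is a simple size comparison, not a measure of capability.

```python
# Figures quoted in the article, in billions of parameters.
THRESHOLD_RANGE_B = (50, 60)          # cited threshold range for "emergence"
MODEL_PARAMS_B = {"GPT-3": 175, "PaLM 2": 340}


def multiples_of_threshold(params_b, threshold_range_b=THRESHOLD_RANGE_B):
    """Return (low, high) multiples of the threshold range for a model size."""
    low_threshold, high_threshold = threshold_range_b
    # Dividing by the larger threshold gives the smaller multiple, and vice versa.
    return params_b / high_threshold, params_b / low_threshold


def summarize():
    """Map each model name to its (low, high) multiple of the cited threshold."""
    return {name: multiples_of_threshold(p) for name, p in MODEL_PARAMS_B.items()}


if __name__ == "__main__":
    for name, (low, high) in summarize().items():
        print(f"{name}: {low:.1f}x to {high:.1f}x the cited threshold")
```

By this rough comparison, both models sit several times above the cited threshold range.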

"Computing power is the basis of competition," Zhang Hongzhong said, adding that many teachers who want to do related research have moved from universities to companies because universities cannot bear such enormous computing demands and costs.

A typical example is NVIDIA, the supplier of the GPUs that underpin computing power, whose market value recently exceeded one trillion dollars and whose A100 and H100 chips hold a dominant position worldwide. Within China, A100 chips are mainly in the hands of Baidu, Alibaba and other big tech companies.

NVIDIA CEO Jensen Huang. Image source: screenshot from NVIDIA's official website

In other words, competing on large-model products is expensive, which is also the main reason some domestic large-model products are invitation-only or not offered to consumers.

Guosheng Securities estimates that a single GPT-3 training run costs as much as $1.4 million, and that for some larger LLMs (large language models), training costs range from $2 million to $12 million.

Most of that cost is electricity. Computer scientist Wu Jun once said: "It is roughly 3,000 Tesla electric vehicles, each driven 200,000 miles (about 321,900 kilometers) until it is run into the ground. That much power consumption is enough to train ChatGPT once."
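The arithmetic behind Wu Jun's comparison can be checked with a short Python sketch. It only converts and totals the distances quoted above. Turning distance into energy needs a per-mile consumption figure that the article does not give, so `wh_per_mile` is left as a caller-supplied assumption, and the value in the usage lines is a hypothetical placeholder rather than a measured number.

```python
MILES_TO_KM = 1.609344  # exact definition of the international mile in km

VEHICLES = 3_000        # number of cars in the quoted comparison
MILES_EACH = 200_000    # distance per car in the quoted comparison


def per_vehicle_km(miles_each=MILES_EACH):
    """Distance driven by one car, converted to kilometers."""
    return miles_each * MILES_TO_KM


def fleet_miles(vehicles=VEHICLES, miles_each=MILES_EACH):
    """Total distance driven by the whole fleet, in miles."""
    return vehicles * miles_each


def fleet_energy_gwh(wh_per_mile, vehicles=VEHICLES, miles_each=MILES_EACH):
    """Fleet energy in GWh for an ASSUMED consumption in Wh per mile."""
    return fleet_miles(vehicles, miles_each) * wh_per_mile / 1e9


if __name__ == "__main__":
    print(f"Per vehicle: {per_vehicle_km():,.0f} km")
    print(f"Fleet total: {fleet_miles():,} miles")
    # Hypothetical placeholder, not from the article: 250 Wh per mile.
    print(f"Energy at an assumed 250 Wh/mile: {fleet_energy_gwh(250):,.0f} GWh")
```

The per-vehicle conversion agrees with the article's rounded figure of about 321,900 kilometers. Any energy total depends entirely on the assumed consumption figure.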

Even at such high costs, Internet giants are vying to enter, perhaps because of the possibility of earning "real money" from large-model products.

Robin Li said that startups "do not need to reinvent the wheel" (the foundation model). Their opportunity lies at the application layer, where there will be "brand-new entrepreneurial opportunities ten times those of today's WeChat and Douyin". Of course, Baidu itself wants to build the "wheel".

In Zhang Hongzhong's view, the Internet has three major eras: the PC Internet era, the mobile Internet era after 2010, and the one it may be entering next, the era of large models. He said, "Our future Internet competition will almost always be based on large models."

He added that no one could have imagined that, in the mobile Internet wave, the four major portals (Sina, Tencent, NetEase and Sohu) would be replaced and overtaken by apps such as Douyin, Kuaishou and WeChat.

"Similarly, today we have reached the era of large models, and applications built on large models may give rise to new technologies. This is the hope and the uncertainty that artificial intelligence brings us, along with some new opportunities for industrial development."

The author of this article is Shi Puning, a reporter for Tianfu Cultural and Creative Cloud. For cooperation inquiries, search for the "Tianfu Cultural and Creative Cloud" public account.

Daily Economic News