Xiaochuan Wang: OpenAI is trying to connect 10 million GPUs together

Written by Vera Ye

Edited by Kang Xiao

Produced by | Deep Web, Tencent News Xiaoman Studio

The AGI large model is undoubtedly the hottest opportunity of 2023. Since 2016, often called the first year of artificial intelligence, the AI industry has gone through several rounds of reshuffling, and with the help of ChatGPT, general-purpose large-model entrepreneurship is once again in the spotlight.

"It's an era very similar to the Gold Rush. If you had gone to California to pan for gold back then, a whole lot of people would have died, but the people who sold spoons and shovels made money all along. Large models are a platform-level opportunity, and model-first platforms will be larger than information-first platforms," said Lu Qi, founder of MiraclePlus.

According to a report released by the Ministry of Industry and Information Technology and other institutions, the total number of AI large-model patent applications in China has exceeded 40,000. In the first half of the year alone, more than 70 large-model startups emerged in China, and as they compete on technology, computing power, and fundraising, the reshuffling of large-model startups is accelerating.

At the 2023 Tencent ConTech Conference, Sun Tianshu, professor and director of the Digital Transformation Center at Cheung Kong Graduate School of Business, held a dialogue with Wang Xiaochuan, founder and CEO of Baichuan Intelligence, and Qiu Xipeng, professor at the School of Computer Science and Technology of Fudan University and head of the MOSS system.

Wang Xiaochuan gave his own answer: "One step slower on the ideal, three steps faster on the landing." Wang believes that compared with ChatGPT we still have a gap in conviction and resources, but at the business level the combination of large models with applications is where China is indeed much stronger — products such as Taobao, WeChat, and Douyin show that the product experience of China's Internet is far better than that of the United States.

Personal assistants and the entertainment industry are the biggest commercial application opportunities that Wang sees in the future.

Qiu Xipeng said that catching up with ChatGPT should be grounded in the actual situation of mainland China today — for example, relying on industry, letting demand arise from it, and then turning that demand into technical research. "The problems of our technical research should be distilled from real needs, so industry-university-research cooperation becomes very important in the next step."

The following is a transcript of the dialogue between Sun Tianshu and Wang Xiaochuan and Qiu Xipeng, edited and arranged by "Deep Web":

"The application of China's large model will run faster"

Sun Tianshu: Compared with the United States, we are still learning and making breakthroughs in the research of large-scale model technology. In the field of product application, can Chinese companies achieve global leadership in product application based on their previous generation of Internet experience, product experience, and based on China's large consumer market?

Wang Xiaochuan: I put forward a slogan this year: half a step slower on the ideal, one step faster on the landing. I simply think that China's opportunity is not to run faster in technology and basic research; application is where we can run faster.

I went to the United States in June and had many exchanges with OpenAI and other peers building large models, and after I came back the slogan changed, from "half a step slower on the ideal and one step faster on the landing" to "one step slower on the ideal and three steps faster on the landing".

OpenAI has been at this for seven years now, while we have only been at it for a year, and it still has tens of billions of dollars invested to keep developing its core technology. When I was talking with them, I asked: what research are you doing now? They said: we are trying to connect 10 million GPUs together to train a large model.

What does that mean? Nvidia produces roughly 1 million GPUs a year. GPT-4 was trained on about 25,000 GPUs, and domestic efforts benchmarking GPT-3.5 use about 4,000. We are now doing the 4,000-GPU kind of work while they are studying how to do it with 10 million GPUs — at the resource level, we are nowhere near enough.

But at the business level we are indeed much stronger: Taobao, WeChat, Douyin and the like show that China's Internet product experience is far better than that of the United States.

We have many product managers, so I think we have a huge advantage in accumulated experience here. To develop it, two difficulties have to be solved. First, you need to have a model; if the model is poor you may fall behind, so we have to make up for it with more ingenuity — even pairing a model company with an application company, and solving the problem with open-source and end-to-end models.

A foreign company with strong model technology is not necessarily good at application. That is the first point: the model, and the ability to pair it with applications.

Second, product managers will have to transform and improve. In the past, a product manager — a PM — was someone who could define a good product that matched market demand. When we ran into problems, we assumed by default that technology was not the bottleneck; take WeChat, for example: everyone had requirements for the technology, but those could be met, and it was more a matter of engineering efficiency and stability. Today's products are driven by algorithms, so the product manager has to know what level of technology is enough for today's product, and even take a step ahead to lead the development of the technology: how do I evaluate the technology, what is my test set, how do I evaluate it well, how do I get the algorithm engineers to keep pace?

Therefore, the product manager of this era should be able to judge and evaluate the technology. If these two problems are solved, China's applications can be ahead of the United States.

Sun Tianshu: From the perspective of basic research, first, how can basic research on large AI models move toward more efficient approaches — small models and small data — from the perspective of popularization? The second dimension is more open: what impacts and changes do computer research, artificial intelligence research, and large models bring to research across the natural sciences?

Qiu Xipeng: The gap between our computing power and OpenAI's is too big, and I feel we can follow OpenAI closely. On the one hand we can sharpen our skills, and on the other hand we can keep the team together and cultivate talent, which can achieve very good results.

However, to catch up and overtake, we still need to work from the actual situation of mainland China today — for example, relying on industry, letting demand arise from it, and then turning that demand into technological research. The problems of our technical research should be distilled from real needs, so industry-university-research cooperation becomes very important in the next step.

On the other hand, whether in AI research itself or in the impact AI has on scientific research as a whole, the paradigm may change. AI for Science has so far followed the traditional AI model: rely on big data, have people label large amounts of data, and train a model. Such a model may be dedicated to, say, structure prediction and can do nothing else. The current large models — the new generation of general AI technology centered on large language models — first use language as the foundation to construct a very complete body of knowledge. With language as the carrier, it is easy to attach all kinds of human experience to the model.

In many scientific research applications, the patterns and forms may no longer be the same as before. In the future, the model will focus more on scientific research and discovery — something like scientific exploration — or be prompted to arrive at new conclusions on its own. This is something the earlier AI for Science could not do.

"Democratization and Two Business Opportunities"

Sun Tianshu: What are the product forms and business opportunities of large models on the C-end of the consumer Internet, and is it an opportunity for a large Internet company or an opportunity for a new generation of start-ups?

Wang Xiaochuan: Let me start with the second question: is it an opportunity for big companies or for startups? Since the large model arrived, both big companies and startups have been very enthusiastic about researching the related technology. My view is that big companies at least have a lot of room to iterate and upgrade their existing products: labor-intensive parts can be replaced by large models, and their own products can be upgraded.

But for startups the mortality rate will be higher, and yet the biggest innovation breakthroughs are likely to belong to innovative companies. Before the industry built large models, there was a saying that big innovation depends on small companies and small innovation depends on big companies. So today the big companies will have their own large models, and even a 1% improvement is a huge benefit for them. But the biggest innovations may come from startups. As for this new wave of opportunities, our view is that moving from the information age to the intelligent age is a major era, just like the industrial age before it — new companies will rise, so I still think the space is quite large.

On the C side, I would divide it into two parts. The first is something like a human assistant, because the biggest change the large model brings today is that it is no longer a tool. It is our partner: it begins to speak, it communicates with people, it knows language — and the difference between animals and humans is the mastery of language. Today the large model's language is no worse than a person's, and it also has world knowledge. So it is a companion that can accompany us. The big concept, then, is the assistant: it can become all kinds of assistants — your writing assistant, your Q&A assistant, even a private tutor, a private doctor, a private lawyer. It is an anthropomorphic role.

Second, the entertainment industry has a huge demand for constructing virtual worlds. Beyond work and creation, people need leisure, and the large model can serve as an emotional companion; it can not only be anthropomorphic but even create a virtual world. The storyline, the spatial structure of a house — the large model can make all of it up. There will be great changes for the entertainment and gaming industries, because large language models can construct worldviews and the natural, social, and cultural logic within a world, including its physical rules.

Therefore, entertainment and assistants are the two major directions I see on the C side.

Sun Tianshu: In the future, as open-source models go overseas, will China converge to one or two models? And what preparations do domestic large-model companies need to make, in terms of capability, to build an open ecosystem like OpenAI's, including Agents?

Qiu Xipeng: The large-model ecosystem has become very open, with a great many participants, and the question of democratization is at stake. Democratization can be looked at from two sides. On the one hand, because everyone gathers around an open-source model — or even a closed one, like GPT — a fairly large ecosystem can form around it. If it provides very good personalization and customization capabilities, it becomes a technology stack; with so many participants, a lot of accumulation and consolidation happens there, which does indeed cut R&D costs substantially.

But on the other hand, the cost of computing power is still relatively high, so there are certain obstacles. At present, the capabilities that support a truly vigorous ecosystem — including GPT-style personalization and the use of various tools, task planning, and so on — still require fairly large models.

And the computational cost of these large models is still relatively high. So for now, especially in China, cost prevents them from being widely applied so that everyone can have an assistant. In the future, then, how do we further improve model efficiency — compress large models further, or even adopt new architectures — to reduce the real computing cost so that everyone can afford to use it and true popularization is achieved? The opportunities there are also very numerous.
