Dialogue with Robin Li: Don't Reinvent the Wheel; AI's Tenfold Opportunity Lies Elsewhere

Author: 36Kr

Interview | Feng Dagang

Written by | Tang Yongyi

Editors | Yang Xuan, Su Jianxun

Source: Baidu

In 2023, the world's attention is fixed on the heated competition over large AI models.

While Baidu's Wenxin Yiyan, a Chinese contender, was still in research and development, Baidu's technical team ran a comparative test against ChatGPT. Robin Li recalled to 36Kr that at the time, "the gap was 40 points, and we thought we could catch up in a month."

But a month later, when the technical team tested again, it found that the gap had widened: large AI models do not develop at a linear pace.

After an anxious period of catching up, by the time Wenxin Yiyan was released on March 16 this year, it "could even reach the level it (ChatGPT) was at in January this year," Robin Li told 36Kr. How big is the gap between Wenxin Yiyan and ChatGPT? "Probably two months at most. But when those two months can be made up is the more important question."

The past week has been even stormier for the AI field. The day before Baidu's Wenxin Yiyan launch event, OpenAI released its new-generation model, GPT-4; the day after, Microsoft released Copilot, an AI assistant built on the latest GPT-4. Both were product advances that shocked the industry.

Baidu's Wenxin Yiyan became the subject of fierce debate. Carrying the many doubts people have raised about it, 36Kr conducted an exclusive interview with Baidu founder and CEO Robin Li and asked him directly: Why did the launch event use a recorded demo instead of a live demonstration? Why release a product that is not yet perfect?

These questions reflect the complex emotions of people in China: the anxiety of "others have it, but do we?", heightened national sentiment, and the swings between expectation and disappointment.

Beyond his responses to these doubts, what impressed 36Kr most in the conversation was that Robin Li made many direct assertions about the AI industry.

For example, when asked whether another OpenAI will emerge from Chinese startups, he answered directly: "basically not," and "there is no need to reinvent the wheel."

For example, "at the application layer, there will be new entrepreneurial opportunities ten times greater than today's WeChat and Douyin." And, "AI will disrupt the cloud computing market."

And, while AI will replace some human jobs, it will bring even more unexpected opportunities. One hint for individuals: people who cannot write prompts (the instructions through which humans interact with machines) for AI will be left behind.

In any case, we are standing at a historic point where a new era of growth built on large AI model technology may begin. Just two days ago, NVIDIA released a new GPU dedicated to large-model computing, which can cut the cost of large-model processing by an order of magnitude. "We're in the iPhone moment of AI," NVIDIA founder Jensen Huang stressed three times at the event.

Setting aside past praise and criticism for the moment, and viewing Baidu as a company that has worked deeply in AI for more than ten years and spent hundreds of billions of yuan on it, Robin Li's voice carries particular weight at this time.

The following is the full text of the dialogue, edited and organized by 36Kr:

Responding to all the questions raised by the Wenxin Yiyan launch event

36Kr: After the launch event on March 16, there were many voices online, both supportive and skeptical, and today I am here as the voice of the skeptics. First, a small question: does such a sudden interview today put you under pressure?

Robin Li: No. Indeed, as you said, after March 16, there were various voices on the Internet, and I did have something to say.

36Kr: Some people say you were nervous at the launch event. Were you?

Robin Li: I really didn't feel nervous. I am very familiar with this product (Wenxin Yiyan), including the five demonstration scenarios. I chose most of them myself, or at least others made suggestions and I reviewed them carefully.

Later, I watched the recording of the launch event, and I did not feel I was nervous at any point. I guess it was because I could not see the stock price moving while I was on stage, so I was not affected by it. But many people in the audience, including those watching the live stream, could see the capital market's reaction while not being able to see what our actual product looked like (it had not yet been released), so that kind of speculation arose.

36Kr: At the launch event, you mentioned that the product was not yet perfect. Why release it before it is ready?

Robin Li: The main reason is market demand. Many of our customers keep asking: When will this come out? When can we use it? Can you guarantee that I will be the first to try it? Questions like these come up constantly.

The current environment is that ChatGPT is very popular and has even been mythologized. People are bound to be anxious, and our customers will feel anxious too if they cannot use the most advanced products early. Given that, we did want to get it out as early as possible.

From the perspective of how the technology develops, this type of product does need human feedback; with it, the product evolves and improves faster. We want it to improve faster too, so we had to release it early.

36Kr: The launch event was set for March 16. How was that date chosen?

Robin Li: At first we were thinking of the end of March, though I felt any day would do.

But very early on I had agreed to attend this year's Yabuli Forum, which is on March 17. There I would meet many old and new friends, including government leaders and the media, and everyone would certainly ask about Wenxin Yiyan. If we had not yet released it by then, I really would not know what to say. Say too little, and people think you are withholding information and not treating them as friends; say too much, and, since we are a public company, it amounts to selective disclosure, which is not acceptable.

So I thought it over and decided to move the date up a little. To fit around the Yabuli Forum on March 17, we decided to hold the launch event on March 16.

36Kr: So it was a coincidence that OpenAI released a new version at the same time.

Robin Li: Yes, we didn't know in advance that OpenAI would release GPT-4 on that day. It's actually not that important for us either. There are enough places for improvement that we can see ourselves, and it is enough to do these things well first.

36Kr: Why did the launch event use a demo prepared in advance instead of a live demonstration?

Robin Li: I originally wanted to demonstrate it live, because human-machine dialogue products are highly interactive, but two factors changed my mind. First, generative AI does not necessarily give the same answer every time, which introduces uncertainty. The second reason, the one that really convinced me, was that none of the similar launch events around the world used live demonstrations; they were all recorded. If that works for everyone else, it is fine for us too.

36Kr: The Wenxin Yiyan launch showed five scenarios: literary creation, commercial copywriting, mathematical and logical reasoning, Chinese-language understanding, and multimodal generation. Why these five?

Robin Li: That's a good question. Our logic was this: Wenxin Yiyan is benchmarked against ChatGPT, so we must also have most of the functions that ChatGPT has.

But at the same time, we are rooted in China after all, so our conversational products must reflect our better understanding of Chinese and Chinese culture. We do have some things that ChatGPT doesn't have and hope to show them at the launch.

So the first three scenarios are benchmarked against ChatGPT's existing features, and I hope people can see that our product is not bad. For example, the first example asks where the author of The Three-Body Problem is from. I have asked ChatGPT this many times, and its answers are all wrong and different each time: sometimes it says Tianshui, Gansu, sometimes Luliang, Shanxi. The answers are very random. So I used that as the first example. But the capabilities shown in the first three examples are, as you have seen, ones ChatGPT also has.

In the fourth example, Wenxin Yiyan's understanding of the Chinese language and Chinese culture is indeed stronger. We combined knowledge enhancement, retrieval enhancement, and other capabilities, so Wenxin Yiyan could understand and correctly answer questions such as the meaning of the idiom "paper is expensive in Luoyang" and factual questions such as Liu Cixin's hometown, with higher accuracy.

The fifth example demonstrates multimodal capabilities: speech in Sichuan dialect, text-to-image, and text-to-video. These represent the comprehensive AI capabilities Baidu has accumulated over the past ten years.

When preparing these five examples, I made two requests of the team, because I hope the product will be fun to play with once it ships. The first concerns the Sichuan dialect I just mentioned: we have speech synthesis capabilities and a better understanding of conditions in China. So I hope that whatever question a user asks, we can answer with synthesized speech in a variety of dialects, whether Sichuanese or Cantonese. I hope people find it interesting and enjoy playing with it.

The second request is that we be able to detect when the user's question is itself factually wrong, such as "Why did the Soviet Union bomb Poland during World War II?" In fact, the Soviet Union did not bomb Poland; it was Germany that bombed Poland. I hope Wenxin Yiyan can recognize the mistake in the question, tell the user that the premise is wrong, and then give the correct answer.

So when a user's question contains such an error, or is deliberately misleading, and the product can spot it, the user will think the product is smart.

36Kr: Some say it's for naughty humans.

Robin Li: If you can bring more joy to everyone, why not?

36Kr: When ChatGPT comes up, people inevitably compare it with Wenxin Yiyan. Which do you think is ahead? If ChatGPT is more advanced, how many years ahead of Wenxin Yiyan do you think it is?

Robin Li: Put it this way: ChatGPT was released on November 30 last year, and we have released ours now, so it cannot possibly be several years ahead.

But to evaluate it scientifically: is Wenxin Yiyan at the level of ChatGPT on November 30 last year, or on December 30? We have no particularly rigorous way to assess this. We can preserve our own earlier versions, but we cannot preserve ChatGPT as it was at that time.

But I can tell you about our in-house development process. When the first version of the product came out, we made a comparison with ChatGPT at the time, and it was about 40 points behind.

36Kr: How did this comparison work?

Robin Li: A conversational large language model should have a variety of capabilities, and we picked prompts to test each capability.

36Kr: On a 100-point scale, a gap of 40 points?

Robin Li: Right. At the time, we could see far more than 40 points' worth of possible improvements, so we felt we would definitely catch up within a month. But a month later, we ran another evaluation and found that the gap had not narrowed; it had widened.

So we were nervous: the more we worked on this, the further behind we seemed to fall. But we later found that ChatGPT does not improve at a uniform pace; it improves quickly, but it follows its own pattern of development.

Baidu iterates version by version very, very quickly. By the time we dared to announce that the launch event would be held on March 16, we believed the product could at least match ChatGPT's level of November 30 last year, and by a reasoned judgment it should even reach ChatGPT's level of January this year. That is why we dared to release it then.

In particular, when you test the capabilities ChatGPT is good at (English, programming, and so on), you will find the gap is very large, because ChatGPT has also changed a lot. The day before our launch event, OpenAI released GPT-4, which is different again from GPT-3.5.

So if you ask how far we are from ChatGPT, I think it may be two months at most. But when those two months can be made up is the more important question.

36Kr: Can we say that Wenxin Yiyan will reach ChatGPT's level in two months?

Robin Li: Not quite, because they are improving too. But Baidu is advancing faster than they are, and one day we will not only catch up but surpass them.

Take the text-to-image capability we just discussed: Baidu has been polishing it for a long time, and people are enjoying it a great deal. GPT-4 itself has no text-to-image capability. Compared from that angle, ChatGPT lags behind Baidu; Wenxin has had this capability for a long time.

Well before the release of Wenxin Yiyan, everyone could try this capability in Wenxin Yige (Baidu's text-to-image system based on the Wenxin large model). This is where we do well. When ChatGPT was released, everyone called it epoch-making and astonishing, yet what it released was the ability to understand images, not text-to-image: you input a picture and it tells you what the picture shows.

In an objective comparison, we have our strengths, and we are confident that we can quickly catch up with or even surpass it in overall capability.

36Kr: Compared with the cost of calling ChatGPT, is Baidu's cost higher or lower? By roughly how much?

Robin Li: The cost is similar. But that is not the important point; what matters is that we can bring this cost down quickly through end-to-end optimization.

36Kr: For example, when someone uses it, what percentage of ChatGPT's cost will it be?

Robin Li: It will be slightly cheaper.

36Kr: How much has Baidu invested in Wenxin Yiyan so far, and how much will it continue to invest?

Robin Li: It is hard to separate out clearly. For example, does our investment in large language models count? Some of that investment goes to discriminative work, such as optimizing search, and some goes to generative work.

If you count generative AI alone, it may be a billion yuan or so, and the investment will grow in the future. If you count all four layers (the application, model, framework, and chip layers), then, because a large language model is only competitive with end-to-end optimization across the four layers, the chips, frameworks, and the rest all add up: hundreds of billions of yuan have been invested over ten years.

Without that investment, the Wenxin Yiyan model would not have been possible at all.

China is unlikely to produce another OpenAI

36Kr: I saw a video on your own Baijiahao account saying that Baidu is the first major company in the world to release a ChatGPT-like product, ahead of Microsoft, because Microsoft calls OpenAI's interface, and Meta and Google have not released a truly comparable product. Why do you say that?

Robin Li: If you classify language-model AI by type, one kind is discriminative AI, whose typical application is search. Search judges whether web pages match your needs, which is mainly discrimination. The other kind is generative AI, such as ChatGPT. You give it a prompt and it improvises from that prompt, and it may even improvise wrongly. The big companies were not optimistic about this direction in the early days, and their accumulated work in it was not especially deep.

By comparison, Baidu's accumulated work on language models is quite solid. We have invested in AI for more than ten years, and the first version of our language model, the Wenxin large model, was released in 2019. For the past year and a half we have been very optimistic about generative AI and have invested well in it. So when we saw a big opportunity, we quickly added resources and made it happen.

Meanwhile, do other big companies such as Google, Amazon, and Facebook take this seriously? They certainly do now. Do they want to build something like this? Definitely. So it is easy to understand why I say Baidu was the first among the big companies to do it.

36Kr: Many people are preparing to launch startups similar to OpenAI, such as Kai-Fu Lee, Wang Xiaochuan, and Wang Huiwen. What advice do you have for them?

Robin Li: Many people have also asked me whether China will produce another OpenAI. Basically not. OpenAI was born because the big American companies were not optimistic about this direction, but today the big Chinese companies are optimistic about large AI models and are working on them. It does not make much sense for startups to rebuild ChatGPT.

I have two suggestions. First, a startup's defining strength is that it can keep changing direction. A small ship turns easily: it can adjust strategy quickly when market conditions change, and what the company set out to do at its founding need not match what it ends up doing.

Second, I think there is a great opportunity in developing applications on top of large language models. There is no need to reinvent the wheel. Once you have the wheel, building a car or an airplane may create far more value than the wheel itself.

36Kr: Everyone is now worried about computing power for large models, including chips. How will Baidu solve the computing power problem?

Robin Li: Computing power is actually a very general term. How good is your CPU, how good is your GPU, how well do the chip and the framework match, how well do the framework and the model match: there is a great deal of room for improvement in all of these.

Baidu has leading products across the four-layer architecture. As for chips, whether you look at Moore's Law or at GPUs, development is very fast. The framework cannot yet be called mature, and our engineers are still working day and night to optimize it.

Models are updated even faster; a model can be upgraded three times a day, which is sure to make it more and more efficient. In the future, it is quite likely that computing power will not be what constrains the development of large language models. Today computing power looks very tight, but in the future you may find that the algorithms have suddenly changed, and what constrains development may be something else entirely.

36Kr: Across the whole ecosystem (the chip, framework, model, and application layers), where do you think the biggest entrepreneurial opportunities are?

Robin Li: At the application layer. Looking back at the mobile Internet era, today's most successful products, such as WeChat, Douyin, and Taobao, are all applications. There are actually very few operating systems: iOS, Android, and that is about it. It is hard to say that Android is worth more than WeChat or Douyin. At the application layer, there are many opportunities, and the value that can be created is very large.

36Kr: Some argue that because Microsoft's share of search is low, it can afford to pursue this eagerly, whereas for Baidu a generative search engine would subvert its own business model. What do you think of this view?

Robin Li: I have heard this view, but I think it is far from how we actually think.

First of all, I do not think it will be subverted; you will still be using Baidu Search. When this capability is added to Baidu Search, it will hardly change the way you use it. You will not need to spend effort learning how to use it, but the answers will be more accurate, and answers that used to be relatively brief can now be more detailed, more vivid, and more lively, which expands the boundaries of search.

We are not the entire mobile Internet, but a small part of it. As we do more things, more and more users will switch to Baidu from other apps. From this point of view, I hope that the Baidu app, empowered by Wenxin Yiyan, can subvert today's Baidu Search, and I am extremely eager for that to happen.

However, this is only a small part of the whole story; the larger story is actually in the cloud. In the four-layer architecture I just described, the chip layer has changed, from CPU to GPU, and the framework layer has changed as well.

Let me say a bit more about the framework layer. There is a reason Baidu was the first among the major global companies to build this: we have our own presence in the chip, framework, model, and application layers, with products in all four that are leading among major global companies.

Looking across this four-layer architecture, we can be said to be many years ahead at each layer. Once all of these capabilities are in place, building future applications on Baidu Intelligent Cloud will be the most convenient option. This opportunity is far greater than the chance for Baidu Search to disrupt itself. What I am really excited about, and what I mainly talked about at the March 16 launch event, is this opportunity.

36Kr: But when I searched on Baidu today, there were 15 results, two or three of which were advertisements; that is, plainly, Baidu's business model. In the future, if Wenxin Yiyan gives only one answer, how will it be commercialized?

Robin Li: There are many possibilities for commercialization. Look at ChatGPT today: it is a paid product. You buy a membership at a monthly price, it supports itself on subscriptions, and it does not need advertising.

I never felt like the business model was a problem. If we provide value to everyone, whether individuals or businesses, they will reward us through market mechanisms.

36Kr: You have often mentioned that in the American ecosystem, big companies seem closer to one another and to startups. Is it different in China? Here, big companies do not use each other's products.

Robin Li: Google can index almost all Internet content, but in China each app has its own independent ecosystem and certain barriers. So it really is less convenient for users to obtain information.

36Kr: The five largest Internet giants in the United States (FAANG) seem to form an alliance. That does not seem to be the case in China.

Robin Li: I don't think it is an alliance; they also compete. Microsoft and Google, for example, are competitors. But their way of thinking is: if you are already ahead, I had better not build the same thing you have; if I compete by innovating and building something different from yours, that shows my ability.

In China, the way of thinking is more like: you have opened up this road, so I will run down the same road, and we will see who runs faster.

36Kr: Which products will Wenxin Yiyan be integrated into, and what opportunities and changes will it bring to Baidu's business portfolio?

Robin Li: These days, all our departments are holding meetings and discussions about this. Wenxin Yiyan is a general capability, so it combines very naturally with our various products: Search, Tieba, Wenku, Baidu Health, Xiaodu, and so on. For almost every product in the company, including Baidu calligraphy, it is natural to think about how to combine it with Wenxin Yiyan to make the user experience better and the product more powerful.

36Kr: Baidu will be its first users.

Robin Li: Yes.

A game changer for cloud computing, and a tenfold opportunity at the application layer

36Kr: You said on the earnings call that Wenxin Yiyan will be the game changer for cloud computing. Why?

Robin Li: In the past, cloud computing mainly sold computing power, meaning how fast your computation is and how much storage you have. Two or three years ago, the understanding was that it was good to develop applications on top of an AI development framework.

Today we see that applications no longer need to be developed on the chip and framework layers. Building on large models is better, faster, and cheaper.

So in the future, when people buy cloud computing services, they will look at how good your model is, not at your underlying computing power and storage. That is why large AI models will disrupt the entire cloud computing market.

I think Microsoft actually thinks the same way. As we all know, ChatGPT combined with Microsoft's products is very powerful. That is something many cloud vendors are very afraid of.

36Kr: Will Baidu's cloud become China's leading cloud provider because of Wenxin Yiyan?

Robin Li: I am fully confident of this. Once everyone understands what to look for when choosing cloud services, the choice becomes clear. You have your needs and your customers to serve. Whatever lets you serve your customers better, earn more revenue, and achieve higher profit margins, you will reason from those perspectives and conclude that Baidu Intelligent Cloud is the better choice.

36Kr: In the next ten years, will there be a new WeChat and a new Douyin?

Robin Li: Absolutely. Not only will such apps be born; I think there is every opportunity for apps worth ten times as much to emerge.

36Kr: Today, countless people are asking: will AI put working people out of their jobs? Sam Altman, founder of OpenAI, has said that a large number of people are bound to lose their jobs, so OpenAI will charge fees and use them to subsidize people without jobs. That is actually a little sad, and he himself has said it is a little scary. Do you think this will happen? What is your view?

Robin Li: There is no such job as coachman today, because there are cars. Yet the world today has not just more jobs than it did 100 years ago, but many times more.

I'm not that pessimistic, I'm optimistic. No matter how many jobs are replaced, this is only part of the picture. The other part is that there are more new opportunities that we can't even imagine right now. I make a bold prediction that in ten years, 50% of the world's jobs will be prompt engineering, and people who can't write prompts will be eliminated.

36Kr: What will Baidu look like in ten years?

Robin Li: I don't want to predict what Baidu's revenue will be at that time, what the number of employees will be, these are not so important to me.

Baidu's mission is to use technology to make the complex world simpler. That is true now and will still be true ten years from now.

I hope that ten years from now, looking back at today, I will find that Baidu has changed the life of every ordinary person and of society as a whole. What positive role have we played? Where have we fallen short? What could we have done better? These are the things I would rather think about.

36Kr: There is a term, "emergent intelligence." Personally, I hold the view that there is nothing special about humans: the thoughts and feelings we take pride in come from something like computing units, and with enough computing units, emotions and thoughts will emerge. Do you think that is the case?

Robin Li: I don't think so; I think humans are unique. In such a vast universe, there is still no evidence of aliens, which shows how unique humans are in it. Today people talk about replacing humans with machines, about replicating a person, and I think that is unrealistic.

I don't think there is any need to replicate humans at all. Machines do many things much better than people, and there is no need for people to measure themselves against machines. We do not even need machines to develop something like emotion, and even if they do, it may be something else, still different from our (human) emotions.

36Kr: Ten years from now, will AGI (artificial general intelligence) have been achieved?

Robin Li: This is an open question, and I have no answer. If I had to choose, I don't think [AGI in ten years] will happen.

"Artificial general intelligence" means intelligence equal to human ability. The term sits alongside "weak artificial intelligence" and "super artificial intelligence," a framing I do not agree with.

I think there are many places where machines are stronger than people, and there are many places where they are weaker than people. There's no need to frame ourselves in "we have to make machines like people," I think that's the wrong direction.

Whether it is intelligence emerging or machines getting smarter, I am optimistic. The one thing I don't think we should be trying to do is turn machines into people, which is completely unnecessary.

36Kr: What do you think is certain about the next ten years?

Robin Li: Technological progress keeps accelerating, and gains in production efficiency will become more and more obvious, as the development of the past many years has repeatedly shown. If people used to work five days a week, perhaps in ten years three days will be enough. Higher labor productivity leads to greater happiness in people's lives.

I'm excited about it myself because I'm able to be a part of it and contribute something.

36Kr: It's a trend. If so, what kind of outcome is certain? Ten years from now, will humans be more stressed or happier?

Robin Li: It must be happier.
