Dialogue with AI Explorer | Large models are like an arms race, and there is no stopping it

AI Future Guide

2024-04-26 09:40, posted from Hebei on the Tencent Technology official account AI Future Guide


Tencent Technology's AI Future Guide: the AI Explorer Series talks to practitioners in the AI industry and focuses on the key issues in the first stage of deploying large AI models. In this issue, together with Tencent Research Institute, we talk to Zhou Jian, founder of Lanma Technology, about the key considerations in bringing large models to the B-side (enterprise) market.

Text/Tencent Technology Guo Xiaojing

The competition over large models has entered its next stage, and the focus has shifted to finding real scenarios and real applications.

For the industry, this is still a puzzle. Everyone is thinking about it and actively discussing it, but no one in the world has yet come up with a perfect answer.

Facing this difficult problem, we try to find out what lessons the previous wave, the CV (Computer Vision) boom and its industrial rollout, can offer for the development of today's large models.

We also try to find answers in current practice: how large-model technology will converge, and what benefits general-purpose large models can bring to everyone.

Against this background, Tencent Technology spoke with Zhou Jian, the founder of Lanma Technology. He is a serial entrepreneur who, as employee No. 10 at YITU, lived through almost the entire development of the CV industry, and he also served as CTO of a Robotic Process Automation (RPA) company.

He started a business again in 2023. At a time when most of the industry's resources were going into foundation models, this "old hand" with rich entrepreneurial experience chose to go straight to applications: "I don't see any sign that the large-model war is ending, and I can't see when it will end. After all, the essence of doing business is to make a profit, not to burn money, right?

So from a strategic point of view, although building models looks very glamorous, the valuation can be very high, and it can attract a lot of money, in fact the 'hurricane' cannot be stopped."

This industry-wide "hurricane" has brought about a "100-model war", which means the application layer actually faces many choices. As large models move from a hot, eye-catching concept into business scenarios that need to make money, "mixing and matching models" has become the choice of many enterprises: different models are selected for different scenarios and combined into the "model solution" with the best cost-performance and efficiency.

Zhou Jian positions Lanma as a "model-neutral" vendor. Large-model vendors launch various model products under their own brands. A "model-neutral" vendor has more options, choosing not only among multiple models but also among multiple brands, but its situation is also more complicated.

This is an opportunity, but it also brings many practical challenges: "When a model is released, the model vendor specifies only the parameter count and does not define the model's specific features."

This means that when matching models to real scenarios, "model-neutral" vendors can find themselves in the position of the blind men touching the elephant, and must keep matching and debugging based on experience. Past industry experience, such as a domain knowledge base, is especially important here. At the level of product roles, two key teams are set up: business experts who break down business scenarios and workflows, and product experts who map those scenarios to real model implementations.

From narrow technologies such as CV to general-purpose technologies such as large language models, the commercialization path of ToB has in some ways changed and in other ways stayed the same. In ToB business the client has a strong voice, the industrial chain is long, customization is heavy, and delivery is heavy; even when a new technology explodes, these industry pain points remain. However, the versatility of large language models does give "model-neutral" vendors a good opportunity, and unlike model vendors they do not burn money, so they can experiment with less pressure.

While large-model companies have not yet formed clear industry standards or product definitions, the trial-and-error experience of "model-neutral" vendors is also urgently needed by the industry. However, application-layer companies have always faced one question: model companies claim not to touch applications, but if they eventually do, given that model technology is general-purpose and model companies have more data and greater strength, does the application layer still have a deep enough moat?

With these questions in mind, we had an in-depth conversation with Zhou Jian. His key views are as follows:

1. There is no sign that the arms-race-like "hurricane" of large models will end, and the middle layer may be a better strategic position for entrepreneurs.

2. Computer Vision (CV) is a narrow technology that cannot cover some industries and scenarios, while large models are a general-purpose technology that, if strong enough, can be applied to all scenarios.

3. Many enterprises now need to mix and match models, which is an opportunity for model-neutral vendors; integrating multiple model capabilities into existing workflows or code systems is a core capability.

4. Research scenarios for large models differ from business scenarios; for commercial deployment, the capability boundaries and categories of large models must be defined first.

5. I'm not worried about AI dominating humans; I'm more worried about systemic disasters caused by human carelessness or wrong instructions.

Here's a rundown of the conversation:

1. The industrial opportunities of large models far exceed those of CV

Tencent Technology: You started your business last year, and very few companies go straight to applications, since the capabilities of base models are not yet mature. Why didn't you build a model?

Zhou Jian: After OpenAI received $10 billion in financing in April 2023, from a strategic point of view it was basically no longer feasible to go straight into building models. As a serial entrepreneur, I had my own judgment about how the competitive landscape would unfold. It was relatively easy to foresee that ordinary startups would basically drop out of the competition once they reached the GPT-4 level, while the big players would certainly keep competing at the GPT-4.5 level.

It is like Texas Hold'em: startups definitely have fewer chips in front of them. And when you raise a large round, you are to some extent "coerced" by capital. In many cases you have to perform certain prescribed moves according to what investors and the market demand at the time.

In addition, capital will not bet on only one player. Once there is a first place, it will certainly also back a second and a third, and on such a fiercely competitive track it is hard to get a time window to build organizational capability. I don't see any sign of this war ending at the moment, and I don't see when it will end. After all, the essence of doing business is to make a profit, not to burn money, right?

So from a strategic point of view, although building a large language model looks very attractive, the valuation can be very high, it can attract a lot of "money", and a lot can be done with it, in fact the "hurricane" cannot be stopped.

The position we have chosen also has big challenges; for example, capital may not pay attention to it. However, from the perspective of ecosystem positioning, as competition among large-model vendors grows fiercer, they will have to open-source some of their models, and middle-layer companies like us may be able to use those commercially. From a business perspective, I think we are probably in a better position.

Tencent Technology: Model companies are also exploring applications. Aren't you worried that your main business lies on the model companies' natural path of expansion and will be "eaten" by them?

Zhou Jian: Do you mean the C-side?

Tencent Technology: Model companies are actually doing both the B-side and the C-side.

Zhou Jian: I don't think so. I don't think a winner-takes-all super app will appear anytime soon. First, on the B-side there has never been one in history. The B-side is mainly driven by cost, and you can't hire an "Einstein" for the whole business.

The C-side is an opportunity for the big companies. In the mobile Internet wave, players including some ride-hailing platforms were in fact drawn by the big companies into a capital war. Maybe large-model vendors can pull it off in the future, but if you build application-layer products for the C-side you can easily be "eaten", because the big companies have hundreds of times your resources.

In addition, recall that the mobile Internet took off in 2008, the iPhone 4 did not appear until almost 2012, and ByteDance was only founded in 2012. Early on there was a fruit-slicing game on mobile: the touch screen changed the way people interact, and slicing fruit was the most intuitive use of it. Today people rarely play it, because later products combined more innovative elements to deliver a better experience.

The same is true of large models now. It is too early: while infrastructure costs have not come down and interaction design is still being explored, a super app is unlikely to emerge.

In addition, today's startups still carry the path dependence of the past; companies founded in the next two or three years may be more likely to build super apps.

Tencent Technology: You have also experienced the last wave of CV (Computer Vision) entrepreneurship in the AI field, what is the difference between these two waves?

Zhou Jian: CV is actually a narrow technology that shows up at scattered points across industries. As the technology evolves, you unlock new scenarios, such as ID card comparison at first and face-recognition turnstiles later. In business competition, a technologically advanced company can trade its high gross margin for a sales network. But the problem is that a scenario that exists in the security industry may not exist in finance.

A large language model, by contrast, is a general-purpose technology. It is not tied so closely to any one sub-industry, and through natural language it lowers the technical threshold to a very low level. For example, when we processed resumes in the past, you had to extract the candidate's departure date, past experience, and so on, and the cost was extremely high because resumes vary so much. With large language models, it may be just a prompt-engineering project with a week's workload.

A technology that can be used in all walks of life opens up the imagination. RPA (Robotic Process Automation), by contrast, quickly hit its "ceiling", and there was no way to make it higher, faster, or stronger.

Tencent Technology: The "Four Little Dragons" that came out of the CV wave are actually facing bottlenecks in their current growth. Do they still have a competitive advantage in this wave?

Zhou Jian: It's a completely different technology. If their cognitive paradigm doesn't change, they will be very passive.

Tencent Technology: We see that some of them are actively deploying models.

Zhou Jian: They really are doing it. But a company that has grown to their current scale has annual revenue of about 3 billion. Facing a huge technological change, it still has to maintain growth, otherwise it is hard to answer to shareholders. In addition, technological change requires huge investment, and the current scale of revenue cannot support a transition to new technology that costs hundreds of millions, so it is still very difficult.

Doing new technology also means new risks. They had already reached PMF (Product Market Fit), and now a new technology suddenly appears: how do you adopt it without undermining the advantages you already have? Even Google is facing this problem. This is what we often call the innovator's dilemma, and it is very hard to change.

The advantage of a startup is that you have a lot of possibilities, and if you get it right, you can move forward quickly.

2. The opportunity for "model neutrality".

Tencent Technology: If you only work at the application layer, have you calculated the compute cost of inference?

Zhou Jian: Indeed, the compute for training is basically saved entirely; what we do is basically fine-tuning. The logic is to use GPT to verify that a customer scenario is feasible, and then, for private deployment, to try open-source small models, from 13B up to 70B, to confirm whether they are good enough.

However, as things stand, it is still unrealistic to expect cost reduction; today it may be more effective to prioritize revenue growth or compliance.

Take one of Lanma's insurance customers. In the past, insurance agents could only sell new insurance products through cold calls, and the success rate was very low. Now the company's agents can make personalized product recommendations to customers who have taken a physical examination, based on the examination results, so the recommended products match the customer's health condition, which can greatly improve the sales conversion rate.
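The validate-then-downsize workflow Zhou describes in this answer (confirm feasibility with GPT, then try open-source models from 13B up to 70B for private deployment) could be sketched roughly as follows. This is only an illustration under stated assumptions: the model labels, the 34B size, the scores, and the `tolerance` threshold are all hypothetical, and a real evaluation would run the customer's test cases against each deployed model rather than look up a number.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str            # hypothetical label, not a real model identifier
    params_billion: int  # parameter count in billions


def pick_smallest_passing(
    candidates: Sequence[Candidate],
    evaluate: Callable[[Candidate], float],
    baseline_score: float,
    tolerance: float = 0.05,
) -> Optional[Candidate]:
    """Return the smallest candidate scoring within `tolerance` of the baseline.

    `baseline_score` stands for the score of the strong hosted model used
    to prove the scenario is feasible in the first place.
    """
    for candidate in sorted(candidates, key=lambda c: c.params_billion):
        if evaluate(candidate) >= baseline_score - tolerance:
            return candidate
    return None  # nothing small enough is good enough yet


# Made-up scores, for illustration only.
fake_scores = {"open-13b": 0.78, "open-34b": 0.86, "open-70b": 0.90}
candidates = [
    Candidate("open-70b", 70),
    Candidate("open-13b", 13),
    Candidate("open-34b", 34),
]

chosen = pick_smallest_passing(
    candidates,
    evaluate=lambda c: fake_scores[c.name],
    baseline_score=0.90,
)
print(chosen.name if chosen else "no candidate passed")
```

Trying candidates in ascending size order reflects the cost concern raised in the interview: the smallest model that clears the bar is the cheapest one to deploy privately.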

Tencent Technology: ToB business in China has a long chain, heavy customization, and heavy service. Can building an AI agent platform on large-model technology fundamentally make ToB business lighter and better?

Zhou Jian: It is true that since large models arrived, their versatility has greatly reduced the cost of customization. We only need to define the necessary knowledge in the workflow and dynamically generate forms and code in the course of the conversation with the user, which is probably the part that needs the most customization.

In the past, it was largely a matter of humans adapting to machines, but now large models allow machines to adapt to people. When machines interact with people, they can actually become more and more intelligent, so the cost of personalized customization will be lower and lower.

Tencent Technology: Can you give an example, how do you do that?

Zhou Jian: We now position ourselves as a model-neutral vendor. The first core problem for an enterprise implementing AI applications is choosing a model.

Many enterprises even need to mix and match models, so that multiple model capabilities can be integrated into their existing workflows or code systems. I think this is our core capability at present.

At its core, the middle layer we position ourselves in should eventually become a development framework. Figuratively speaking, it is like a bank safe: each customer has their own special needs, and it becomes easier to build applications on our platform and on the knowledge and data we have accumulated.
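As a rough illustration of the "mix and match" idea above, the sketch below builds a routing table that assigns each business scenario the cheapest model meeting that scenario's quality bar. It is not Lanma's implementation. The scenario names are borrowed from examples in this interview, while the model labels, quality scores, and costs are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical vendor/model label
    quality: float        # measured quality on this scenario, 0..1 (made up)
    cost_per_call: float  # illustrative cost units


def choose_model(options: List[ModelOption], min_quality: float) -> ModelOption:
    """Pick the cheapest option that meets the scenario's quality bar."""
    eligible = [o for o in options if o.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality bar")
    return min(eligible, key=lambda o: o.cost_per_call)


def build_routing_table(scenarios: Dict[str, dict]) -> Dict[str, str]:
    """Map each scenario name to the name of its selected model."""
    return {
        scenario: choose_model(spec["options"], spec["min_quality"]).name
        for scenario, spec in scenarios.items()
    }


scenarios = {
    "resume_parsing": {
        "min_quality": 0.80,
        "options": [
            ModelOption("vendor-a-large", 0.95, 10.0),
            ModelOption("open-small", 0.82, 1.0),
        ],
    },
    "insurance_recommendation": {
        "min_quality": 0.90,
        "options": [
            ModelOption("vendor-a-large", 0.93, 10.0),
            ModelOption("vendor-b-medium", 0.91, 4.0),
            ModelOption("open-small", 0.70, 1.0),
        ],
    },
}

table = build_routing_table(scenarios)
print(table)
```

Because the selection happens per scenario, the same enterprise can end up with models from different brands side by side, which is the situation a model-neutral vendor has to manage.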

Tencent Technology: Will the core competitiveness in the future be the accumulation of knowledge base in various fields?

Zhou Jian: I think the current situation is that we have AI capabilities and customer relationships. After winning the benchmark case, the next hurdle must be domain data.

When we build digital employees now, we actually want them to be able to decide what the next action is. At present the main problem is the lack of data, and the difficulty lies in how to define the types of actions.

Tencent Technology: Too personalized?

Zhou Jian: Actually, it's not that bad. In a company, some positions such as finance and human resources (HR) may not require skills as complex as people think. In finance, for example, there is a limit to the kinds of tasks that finance employees handle. The establishment of financial shared-service centers, for instance, allows an employee based in Romania or Dalian to handle financial matters for a Fortune 500 company, including approving bills and conducting interviews. This shows that these tasks are not as difficult as one might imagine, and that they can be automated and performed as efficiently as by a robot.

China's management standardization may not have been done well enough in the past, resulting in lagging behind international standards in some areas. However, in the eighties and nineties, foreign countries experienced a wave of standardization of management processes, which made significant progress in the standardization of internal processes of enterprises.

The key to the future is to build an effective model of the world that lists all possible options, so that the decision-making process can be simplified and seemingly complex problems less difficult.

Tencent Technology: Will the competitiveness of "tool man" decline sharply?

Zhou Jian: In the games industry, the positions of draftsmen and junior to mid-level software engineers in software outsourcing companies have been disrupted by automation.

Although it has not yet been fully realized, there is a trend: when a company has a very fine division of labor, for example outsourced engineer or draftsman positions, and a position has more than 100 employees, you don't need any one person to be the best at everything. In fact, if one employee is better than the others by 30%, that is enough to replace the work of 30 other people.

This logic is gradually becoming a reality, although the technology is not yet fully mature, making this change not yet obvious. However, I have mixed feelings about the release of GPT-5, both expectant and somewhat in awe, because I think it will bring a huge change.

Tencent Technology: Do you think GPT-5 will be released this year?

Zhou Jian: Yes. OpenAI should be strongly motivated to release GPT-5, because competitors have caught up.

Tencent Technology: If a large model is used as an agent, it needs to work across multiple tasks. But on a mobile phone, for example, it is hard to connect the various apps. Will it face the same difficulty in the ToB field?

Zhou Jian: I think so. In the ToB market, office collaboration platforms such as DingTalk, WeChat Work, and Feishu are competing fiercely. Because these platforms operate independently, they face problems similar to those between phone makers and app developers. Phone makers want to be like Apple, achieving total control and providing a fully connected user experience. But at the same time, individual apps may not be willing to integrate fully into a phone maker's ecosystem.

Historically, the iPhone was first released in 2008, and it was in 2013 or 2014 that the battles over ride-hailing and group buying became fierce; that competition lasted about five or six years.

By analogy, given the release of ChatGPT at the end of 2022, we can foresee that in 2026 there may be some competition, and in 2027, these rivalries or conflicts may become more pronounced or normalized.

Tencent Technology: In the future, will the choice belong to the platform or to the user?

Zhou Jian: In fact, end users will still buy hardware. Your laptop, your mobile phone, or some new hardware will each have an assistant. As for a super assistant that works across devices and across apps, or who will have the most say, there may be no way to predict. I think it will take about 2-3 years before a strong competitor emerges.

3. Who defines the "product characteristics" of the large model

Tencent Technology: Don't you see any signs of it in the current AI phones and AI PCs?

Zhou Jian: It's too early, it's still in the conceptual stage.

Now the main thing is that the hardware needs to be ready, for example, recently Apple said that the M4 chip is made for AI.

Tencent Technology: The current hardware manufacturers have said that they can support 7B and 13B models to run on the device side, so is the current obstacle no longer in the hardware?

Zhou Jian: The problem now, whether for a 13B on-device phone model or for ToB agent vendors like us, is the same: when a model vendor releases a product, all it gives is the model's parameter count. How are we supposed to use that?

We are now at the first level of model product definition: once a model has been miniaturized, what is its feature list? How do we define whether this model product is suited to phones or to PCs? What are its basic capabilities?

Once that is settled, it will be time to work out what compute and which model we need on which phones.

Tencent Technology: I see manufacturers defining this now: the model can be used for object removal in images, file management, and so on.

Zhou Jian: This should not be done by mobile phone manufacturers.

Phone makers can't define it, and if they do define it and the model vendors can't deliver, what then? It really is a chicken-and-egg problem. Set phones aside and take models deployed in the cloud, 13B, 33B, 130B: what are their boundaries, and which should be used in which scenarios? None of this is available yet.

We, as ToB companies, may have proven out many scenarios, but once those scenarios are on the table, is there a model vendor that can provide a specific feature list? This is even harder on the device side, because getting an end-to-end application to run there is more difficult.

In ToB services, you can at least verify a scenario first with the most powerful service currently available. How do you verify it on the device side? I think vendors like us may need to align with the model vendors first.

Business is very different from academia. In business, the capability boundaries and categories of large models must be defined: what the major categories are and what the capability boundaries of each category are. These basic questions need to be answered by the model vendors first.

Tencent Technology: What are the categories of large models? For example, if a model is outstanding at one ability, such as long text, does that count?

Zhou Jian: The definition of features is decided collectively by the market. At present, the smallest models may need only a single consumer-grade 4090 graphics card, while larger models may need 100 A800 graphics cards for inference. With a 100-fold difference in compute, they must fall into different categories. By analogy, an ordinary T-shirt and a luxury item can cost anywhere from tens to tens of thousands, but you can give a reason for the price.

However, if the product definition system of the model does not appear, in fact, there is no way to iterate on the commercialization.

Tencent Technology: Aren't the parameters defined in the technical documentation of the large model of great value to you?

Zhou Jian: It doesn't matter much, we buy clothes, and we don't care about the production process.

Tencent Technology: Who will do this in the future?

Zhou Jian: It has to be the large-model vendors. At present things are still "chaotic", including the cost estimates now being published for deploying various application scenarios, and there is no consensus yet. If a consensus slowly forms, the entire ecosystem will start to turn.

Tencent Technology: In fact, everyone has not thought through the application of large models.

Zhou Jian: As for which scenarios large models should be deployed in, everyone is actually going crazy over that question right now.

Our own finding, as just mentioned, is that the low cost of customization and the strong language understanding of large models can unlock new scenarios that traditional approaches could not handle. For example, credit review for small and medium-sized enterprises could not previously be done with AI, or the results were very poor; the insurance agent case is another that could not be done in the past.

In effect, we use an AI agent plus a knowledge base to free up experts' time and turn it into new productivity. The supply of domain experts' time used to be a bottleneck; AI can now break through it, so things that could not be done, or could not be done at scale, now can be.

I think this differs from the common understanding: the benefits large models bring are not first of all cost reduction and efficiency gains, but revenue growth and compliance. It is easier to reduce costs and increase efficiency, but the priority of ToB enterprises is definitely revenue growth and compliance.

Tencent Technology: Can AI Agent be compared to an app in the mobile Internet era?

Zhou Jian: The most important thing about an agent is its interaction with the environment. There are two sides to this: on one side it senses the environment, and on the other it generates a plan and then takes actions to execute it.

It is not an app. What we mostly see today is AI capabilities being added to existing apps; for example, any application today can accept instructions through natural-language conversation. That is not an agent.

What matters for an agent is perceiving and interacting with the environment. What I just described is merely traditional software with AI added.
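The sense, plan, act cycle that Zhou uses to distinguish an agent from an app can be sketched as a small loop. The example below is a toy under explicit assumptions: the bill-approval task echoes the finance example earlier in the interview, the environment is a plain dictionary, and `plan` is a rule-based stand-in for what would in practice be a call to a large model. All names here are hypothetical.

```python
from typing import Dict, List, Tuple

Bill = Dict[str, object]
Environment = Dict[str, List[Bill]]


def perceive(env: Environment) -> List[Bill]:
    """Sense: read from the environment the bills still waiting for a decision."""
    return [bill for bill in env["bills"] if bill["status"] == "pending"]


def plan(pending: List[Bill], approval_limit: float) -> List[Tuple[str, str]]:
    """Plan: decide one action per pending bill (stand-in for a model call)."""
    steps = []
    for bill in pending:
        action = "approve" if bill["amount"] <= approval_limit else "escalate"
        steps.append((action, bill["id"]))
    return steps


def act(env: Environment, steps: List[Tuple[str, str]]) -> None:
    """Act: apply the planned actions, changing the environment."""
    by_id = {bill["id"]: bill for bill in env["bills"]}
    for action, bill_id in steps:
        by_id[bill_id]["status"] = "approved" if action == "approve" else "escalated"


def run_agent(env: Environment, approval_limit: float, max_rounds: int = 3) -> Environment:
    """Repeat sense -> plan -> act until nothing is left to do."""
    for _ in range(max_rounds):
        steps = plan(perceive(env), approval_limit)
        if not steps:
            break
        act(env, steps)
    return env


env = {
    "bills": [
        {"id": "B1", "amount": 120.0, "status": "pending"},
        {"id": "B2", "amount": 9800.0, "status": "pending"},
    ]
}
result = run_agent(env, approval_limit=1000.0)
statuses = {bill["id"]: bill["status"] for bill in result["bills"]}
print(statuses)
```

The point of the loop is that the agent re-reads the environment after acting, which is what separates it from an application that merely accepts a natural-language command and executes it once.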

Tencent Technology: Is there a huge opportunity for startups to tap into these AI needs?

Zhou Jian: The huge opportunity is definitely not here; traditional software plus AI is something traditional software vendors can do themselves. The huge opportunities are in AI-native applications.

In the agent ecosystem, existing software can serve as a component of the environment the agent perceives. In fact, we are not competing with traditional software but empowering it to become the entry point for the next generation.

For example, when we build a digital employee, we recognize the employee's intent, map it to a professional knowledge base, and break it down into operations on the existing applications. AI-native applications are in fact a new category, occupying a new layer in the ecosystem between people and systems. The application acts as an agent, breaks the plan into steps, and gets the job done, and that job can involve operating a variety of systems.

So in this sense, AI-native is a completely new opportunity, separate from traditional software and apps. This is a new continent, and exactly where it lies is something we still have to explore.

Tencent Technology: But it sounds like the future of AI Agent still needs traditional software?

Zhou Jian: Yes. We do not assume the absence of informatization. Informatization is the prerequisite for digitalization, and digitalization is the prerequisite for AI. Our premise is that the earlier stages have already been completed. Where they have not, as in the headhunting industry or law firms, which have neither informatization nor digitalization, it is very difficult.

Tencent Technology: In the past, it was difficult to promote the digitalization of enterprises, will this become an obstacle to the implementation of AI in TOB?

Zhou Jian: In fact, many large enterprises are quite determined and have prepared the infrastructure, but they have not found the landing scene. The demand for this part is still quite large.

Tencent Technology: Do your company's product managers still need to understand the workflow of each specific industry?

Zhou Jian: Yes, the product manager has to be a business expert himself; without reliable business expertise it can't be done. But there are two layers. One is the business product manager, for banking, insurance, brokerage, or energy, for example, who must understand the industry.

The other layer builds the platform. It does not need deep business knowledge, but it needs to build products from well-defined capability documents and data processes.

Tencent Technology: How will the workflow of enterprises change in the future?

Zhou Jian: In the division of labor and cooperation between people, there are many breakpoints in information, data, and knowledge, and these are caused by people's limited bandwidth.

For example, from a management point of view, one person's bandwidth allows attention to only 7-15 people at most. As large models continue to evolve, their memory will get better and better, which will help fill in the breakpoints in the process.

The most important thing for an enterprise is to amplify its competitive advantages, not to make up for its shortcomings. Competitive advantage must come from cooperation between the front line and the back office, "letting those who can hear the gunfire call in the artillery". But the bandwidth for transmitting information is limited, and the experts in the back office may never receive it.

If each employee has an agent, however, the front-line situation can be relayed to the rear without loss, and decisions made in the rear can be transmitted directly to every part of the front line.

Tencent Technology: So that is really about helping people increase their bandwidth. Will people's core capabilities be amplified in the future?

Zhou Jian: People mainly know what to do and give tasks. Today's AI actually has a big bottleneck, it doesn't have an internal world model and can't learn on its own.

Human decision-making today is largely based on intuition, which means we can automatically find correlations: from my past experience, which things are relevant, and what I did to solve the problem at the time. I don't see AI being able to replace this anytime soon.

The current scaling-law route is essentially brute force, and the world's energy supply will not be enough to reach AGI that way, so I think this road has a bottleneck. Looking at the laws of the past, human beings were not designed but slowly evolved, and evolution is the strongest force.

Therefore, we will not see AI replace human beings in the short term, but we must learn to embrace it and become a new kind of worker: know where the boundaries of its capabilities are, and secure a good position in the future workplace.

Tencent Technology: Are humanity's basic sciences still siloed, like chimneys, without integration or openness?

Zhou Jian: Yes. Historically, there have been quantitative investment firms that went bankrupt because of a single bug. I don't think there is any need to worry about AI dominating humans right now; the concern is that human carelessness or wrong instructions will bring about systemic disasters. This is probably the most worrying thing.
