Is the Agent formula from OpenAI's Lilian Weng necessarily correct?

Author: QbitAI

Hengyu, reporting from Aofei Temple

QbitAI | WeChat official account QbitAI

It's 2024. Who is actually using the AI Agents that so many hopes have been pinned on?

AI Agents are regarded as one of the most likely paths to AGI, and companies in China and abroad are studying them intensely. It may look as if everyone is simply placing a bet, but as far as we know, many toB fields are already interested in Agents.

QbitAI put the question to an entrepreneur whose AI agent company already serves many toB scenarios.

In his view, an AI Agent can be seen as a connector between managers and frontline employees, between experts and frontline employees, and among employees themselves. It can fill the gaps between people and systems when an enterprise undertakes digital transformation.

He also noted that, given the limits of current technology, AI Agents need to be combined with traditional technologies such as search, rule engines, and knowledge graphs.

More importantly, an agent must know what kind of environment and what kind of scenario it can work in.

As it happens, this entrepreneur, Zhou Jian, founder and CEO of Lanma Technology, has just officially released the team's self-developed AI Agent platform, AskXBOT, in Shanghai.

We decided to use this platform as an example to explore where Agents currently stand.

The Agent platform AskXBOT

AskXBOT describes itself as an AI agent platform.

It is a one-stop platform, built on large language models, for designing, developing, using, and managing agents and workflows, and for accumulating knowledge.

In terms of overall structure, AskXBOT's design is not complicated. It consists of four main parts: the designer, the user client, the management platform, and the knowledge center. Together they give enterprises basic capabilities such as document retrieval, AI calling, data query, and intelligent programming.

The designer is used to create and manage AI agents. Here, users design and configure the agent templates they need by drag and drop, so even users with no technical background who cannot write code can easily create agents that meet their requirements.

The user client provides the interface through which users interact with an agent: they talk to the agent, and the agent gathers the information it needs to carry out tasks.

As the name suggests, the management platform mainly provides supervision tools, including permission control, performance monitoring, and log analysis, to keep agents running safely and stably and to keep optimizing them based on user feedback.

The knowledge center, which accumulates and organizes knowledge assets, is what Lanma Technology regards as its core differentiating module. Experts can enter their knowledge and experience into the knowledge center, and agents learn from it and use it to provide more accurate services and responses.

In practice, an expert user's data analysis involves many steps and a great deal of judgment. In principle, an expert can save the analysis process and turn it into an SOP (Standard Operating Procedure).

Intermediate and junior business staff can then repeat that standard analysis process without having to memorize the expert's core knowledge by rote.

"Any expert can provide at most 24 hours of intelligence and labor a day, and that cannot be expanded indefinitely." Zhou Jian gave a very easy-to-understand example: now, with the capabilities of large models and Agents, the experience and knowledge of experts can be copied simply by adding computing power.

In this way, with 30 graphics cards, one expert's ability can provide 720 hours of uninterrupted service a day (30 × 24 hours).

This makes it easier to understand Zhou Jian's belief that "if no experts digitize their knowledge, implementing AI Agents will be very difficult".
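As a rough illustration of the SOP idea described above (not AskXBOT's actual implementation, which the article does not describe), an expert's analysis process can be thought of as an ordered list of named steps that anyone can replay on new data. The names `SOP`, `record`, and `run`, as well as the sample steps and figures, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class SOP:
    """A hypothetical Standard Operating Procedure: ordered, named analysis steps."""
    name: str
    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def record(self, description: str, step: Callable[[Any], Any]) -> "SOP":
        # The expert saves one step of the analysis process.
        self.steps.append((description, step))
        return self

    def run(self, data: Any) -> Any:
        # A junior colleague replays the saved steps, in order, on new data.
        for _description, step in self.steps:
            data = step(data)
        return data


# An expert records the procedure once (made-up example steps).
monthly_review = (
    SOP("monthly sales review")
    .record("drop missing values", lambda rows: [r for r in rows if r is not None])
    .record("keep sales above 100", lambda rows: [r for r in rows if r > 100])
    .record("average what is left", lambda rows: sum(rows) / len(rows))
)

# Anyone can now repeat it without memorizing the expert's reasoning.
result = monthly_review.run([120, None, 80, 200])
print(result)
```

The point of the sketch is only that the expert's knowledge lives in the recorded steps rather than in anyone's head, so running it more often is a matter of adding compute, not adding experts.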

To help agents be deployed more effectively across all walks of life, the Lanma team proposed a three-step approach to building AI agents:

The first step is the digitization of expert knowledge;

The second step is flexible interaction based on CUI (Conversational User Interface);

The third step is the continuous, cyclical accumulation of domain knowledge.

2023 is generally considered the first year of the large model, and 2024, which has just begun, is expected to kick off the development of Agents in earnest.

Zhou Jian said that Lanma's current design philosophy is to make the Agent itself a form of productivity.

AskXBOT currently has customers in education, human resources, banking, state-owned enterprises, and other sectors.

Take the human resources and headhunting industry as an example: Beijing Human Resources Huaming Technology Co., Ltd. and CGL are both AskXBOT users.

Working with CGL, Lanma built a Copilot into CGL's existing headhunting consultant system, so that consultants can not only screen and contact candidates efficiently but also do things that were previously not possible.

Guo Yanbing, senior vice president of CGL, said that in practice the role of experts far outweighs that of data, so it is necessary to work on expert knowledge and build agents that can simulate expert behavior.

Zhou Jian also hopes that in the next version, the Agent will have initial capabilities for selecting, developing, employing, and retaining talent.

"Agents are being deployed more slowly than expected"

"How far are we from Agents that can select, develop, employ, and retain talent?"

"Two years."

After pressing the question further, we finally learned how that "two years" was estimated.

Zhou Jian's yardstick comes first from OpenAI. According to him, when GPT-4 came out, OpenAI said that if another company could do AGI better than it within two years, it would concede and share everything it knew with the winner.

If OpenAI were suddenly to roll out GPT-5, imagine how good it would be to build AI agents on top of it.

To get back to the point: one year of OpenAI's two-year window has now passed. Zhou Jian's own judgment is that 2024 will be the first year in which large AI models are put into real use, whether in China or in Silicon Valley. In 2025, the year after, there will be real deployment cases: companies will see peers, or players in other fields, deploy AI and greatly increase their revenue by competing with it, so other vendors will certainly follow.

"With that kind of environment, plus GPT-5, I think it will definitely explode in 2025, and it may really become usable," Zhou Jian said with a smile.

Zhou Jian sounds quite confident about everything he expects from the future, but when the conversation turns from the next two years back to the past year, his attitude is this:

"Looking back at the end of the year, deployment has been slower than I imagined at the start of the year."

In March 2023, when Zhou Jian had only just started his business, OpenAI released ChatGPT Plugins. He was so worried that he did not sleep all night, his mind full of one thought: "What should I do if GPT does what I want to do?"

But by November, when GPTs were unveiled, Zhou Jian says his mindset had become calm and unhurried.

The reason traces back to March. After reading the sensational paper "Sparks of Artificial General Intelligence: Early experiments with GPT-4", Zhou Jian had judged that GPT-4 could generally write 50 to 100 lines of code. "This is actually very scary, because a programmer writes about 100 lines of code a day, so GPT-4 could replace a programmer's daily workload."

On that premise, Zhou Jian held the rather radical belief at the time that every piece of software could be empowered by AI immediately and in place, that the knockout round would begin in 2023, and that the opportunity for himself and his team would be swept away by large models at the very start.

However, as the team later worked with office software, Zhou Jian increasingly sensed that something was off:

"That rough estimate of 50 to 100 lines of code really threw me off. Later, looking more closely at the GPT-4 paper, it can in fact only write 2 to 3 lines of code, a tenfold difference.

"If GPT-4, the most capable model, could write 50 to 100 lines of code, humanity might not have much of a chance left.

"But if it can only write 3 to 5 lines of code and you still plan on top of it, the complexity of the task decisions it can make is very limited."

Having come to see the real gap between human programmers and GPT-4, and with a more accurate understanding of computing power, Lanma gradually reshaped its private deployment service into a collaboration between small and large models, with a Root as the entry point that breaks tasks down and then draws on different amounts of computing power and calls different models to solve each part.
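The division of labor described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Lanma's code: the function names, the word-count rule used to pick a model, and the stand-in "models" are all hypothetical.

```python
from typing import Callable, Dict, List


# Stand-ins for real model calls; both are placeholders for illustration only.
def small_model(subtask: str) -> str:
    return f"[small] {subtask}"


def large_model(subtask: str) -> str:
    return f"[large] {subtask}"


MODELS: Dict[str, Callable[[str], str]] = {"small": small_model, "large": large_model}


def decompose(task: str) -> List[str]:
    # Hypothetical decomposition: one subtask per semicolon-separated clause.
    return [part.strip() for part in task.split(";") if part.strip()]


def choose_model(subtask: str, max_small_words: int = 4) -> str:
    # Assumed rule: short subtasks go to the cheaper small model,
    # everything else goes to the large model.
    return "small" if len(subtask.split()) <= max_small_words else "large"


def root(task: str) -> List[str]:
    # The root entry point splits the task and routes each piece to a model.
    return [MODELS[choose_model(sub)](sub) for sub in decompose(task)]


answers = root("look up the order; explain why the quarterly revenue forecast changed")
for line in answers:
    print(line)
```

A real router would presumably use something richer than word count to estimate difficulty; the sketch only shows the shape of the idea, in which the expensive model is called only for the pieces that need it.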

Is Lilian Weng's Agent formula correct?

Almost everyone who follows Agents has seen the Agent formula given by Lilian Weng, a Chinese scientist at OpenAI:

Agent = Large Model + Memory + Active Planning + Tool Usage
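To make the four terms concrete, here is a toy sketch that maps each one onto a piece of code. It is an illustration only, not Lilian Weng's or Lanma's implementation: the `Agent` class, the "tool: argument" plan format, and the canned `fake_llm` reply are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    llm: Callable[[str], str]                      # large model
    tools: Dict[str, Callable[[str], str]]         # tool usage
    memory: List[str] = field(default_factory=list)  # memory

    def plan(self, goal: str) -> List[str]:
        # Planning: ask the model for steps, one "tool: argument" per line.
        reply = self.llm(f"Plan steps for: {goal}")
        return [line.strip() for line in reply.splitlines() if ":" in line]

    def run(self, goal: str) -> List[str]:
        results = []
        for step in self.plan(goal):
            tool_name, argument = (part.strip() for part in step.split(":", 1))
            output = self.tools[tool_name](argument)
            # Remember what was done so later steps could build on it.
            self.memory.append(f"{step} -> {output}")
            results.append(output)
        return results


def fake_llm(prompt: str) -> str:
    # Canned reply standing in for a real model call.
    return "search: agent platforms\nsummarize: search results"


agent = Agent(
    llm=fake_llm,
    tools={
        "search": lambda query: f"results for {query}",
        "summarize": lambda text: f"summary of {text}",
    },
)
outputs = agent.run("survey agent platforms")
print(outputs)
print(agent.memory)
```

Note that nothing in this sketch observes the surroundings: the agent only knows the tools it was handed at construction time.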

Zhou Jian, too, once treated this Agent formula as a guideline, but as Lanma's practice with AI agents deepened, his thinking began to diverge slightly.

"Embodied intelligence, AI agents, and the future AGI that everyone talks about now all have the ability to interact with the actual environment," Zhou Jian said, explaining his new perspective. He increasingly feels that an agent's most important ability is the ability to interact with the environment.

Of course, tool use is fine as far as it goes, and active planning and common-sense memory are still core points, but the biggest difference may lie in whether the agent can interact with the environment and whether the AI has the ability to explore it.

Zhou Jian put it this way: "If it has no perception of the environment, I don't think it can be called an agent." Conversely, it is important for an agent to be aware of what the environment is like and what tools are available, and to be able to discover and explore.

From this point of view, the formula given by Lilian Weng is, in Zhou Jian's mind, more like "an agent that compromises with the technical limitations of the current stage".

In the eyes of Zhou Jian, an AI agent practitioner, there needs to be a master agent that distributes tasks. The premise, of course, is that it can perceive the environment before any further exploration is carried out.

If we want to reach human-machine integration or human-machine symbiosis, the agent must be able to perceive the environment.

As the environment it handles grows, the agent can gradually expand its abilities, eventually arriving at the world model that Yann LeCun has described.

Unfortunately, these ideas are a new research focus that Zhou Jian and the Lanma team have only recently worked out, and they are not yet reflected in AskXBOT.

Still, 2024 has only just begun. In the year of the Agent, beyond the growing number of teams and Agents, might capabilities also emerge and break through?

— END —