Terminal AI grading standard arrives as the large-model "fire" on mobile phones spreads to agents

From Apple's "Apple Intelligence" to Honor's "AI Agent", and from vivo's "PhoneGPT" to OPPO's "AIOS", competition over AI agents is becoming a key measure of mobile phone manufacturers' technical capabilities.

Alongside the free-for-all among AI mobile phones, discussions about how to grade them are also underway. On October 18, the China Academy of Information and Communications Technology (CAICT), together with terminal and chip companies including Honor, vivo, and Huawei, released the world's first "Terminal Intelligence Classification Research Report", setting out a clear direction for the evolution of AI mobile phones.

"True and fake AI" mobile phones now have a standard for judging. Zhong Xiaolei, an analyst at Canalys, told the first financial reporter that phenomenal AI native applications still have yet to emerge, and if AI native applications can reshape the existing habits of mobile Internet content consumption, then the demand for end-side performance is expected to accelerate the elimination of old devices.

The grading standard arrives

While Internet companies are still debating who is the "Chinese version of ChatGPT", mobile phone manufacturers and players across the terminal industry chain are also looking for the new opportunities that large models bring to the industry, hoping to get a piece of the pie. But as the hype builds, the noise and conflicting claims have turned the "AI phone" into a broad and vague concept.

"A phone that can provide generative AI capabilities is not the same as an AI phone, or even a far cry from it." A person in charge of a leading domestic mobile phone manufacturer previously told reporters that making AI mobile phones is like fast food, but it will be counterproductive.

What is an AI phone? How should the capabilities of an AI phone be defined? Does being able to erase objects from a photo make a phone an AI phone? In response to these questions, the China Academy of Information and Communications Technology says in its latest report that terminal intelligence is currently divided into five levels, L1 through L5. Different levels correspond to different roles for the person and the terminal: the higher the level of intelligence, the more autonomously the terminal participates in completing a task, and the less the person has to.

In other words, a product at any of these levels can to some extent be called an AI-capable terminal, but capabilities vary greatly from one level to the next.

Take a travel scenario as an example. At L1, the user enters "help me book a ticket from Beijing to Shanghai", and the terminal can recognize and execute the command, opening the booking app and entering the corresponding content. The user still has to browse the search results, select a suitable flight, fill in the passenger information, and complete the payment. At L2, the terminal can open the booking app and automatically search for air tickets according to the user's preferences. At L3, it can provide a customized plan. At L4, it identifies potential travel needs from the user's daily browsing behavior and draws up a plan. At L5, it can infer the user's travel intentions from daily information and plan the destination and itinerary without the user issuing a booking request.

As this shows, L1 (intelligent response level) and L2 (intelligent assistance level) have a certain degree of intelligence and can complete a single type of task based on user preferences. At L3 (intelligent assistant level) and L4 (intelligent collaboration level), the terminal gradually moves from perceiving and recognizing complex intentions to recognizing latent ones. L5 (autonomous intelligence level), by contrast, has comprehensive intelligence and autonomously plans and completes all types of tasks across all scenarios.

However, moving from L1 to L5 is undoubtedly a complex systems engineering effort, one that requires all parties in the industry to cooperate and combine their strengths to raise the level of terminal intelligence. Judging from how mobile phone manufacturers are currently positioning themselves, each has its own focus.

In September this year, Honor officially released the "HONOR AI Agent" at the IFA exhibition, billing it as the world's first cross-application, open-ecosystem agent. Drawing on its understanding of user habits and the current usage scenario, it can understand user needs, respond quickly, and access and invoke the phone's resources and third-party services.

Zhao Ming, CEO of Honor, said that the Honor AI agent is aiming at "autonomous driving for mobile phones". The agent can call not only the services built into Honor's own system but is also open to cooperation with all third-party services, and Honor wants to be a "platform-level" AI terminal.

OPPO also demonstrated AI search capabilities at a recent developer conference. Tang Kai, President of OPPO's Software Engineering Division, said that AIOS will go through three stages: from AI in system applications, to AI at the system level, and finally to AI as the system. In other words, OPPO is seeking to integrate AI with the underlying system first, and its technical focus leans toward improving its own system.

vivo's direction is similar to OPPO's. The president of vivo AI Global Research Institute told reporters that this year's core focus is using AI to rebuild the system experience, which will not be limited to a single function or application, although the specific solution is still being explored.

Apple has also previously shown AI phone functions, such as integration with its voice assistant Siri. Judging from its cooperation with large-model companies such as OpenAI, however, Apple's AI direction also leans toward being a platform.

The "flames of war" reach the application layer

Whichever path is taken, the real competition over AI at the application level has already begun. With the groundwork on computing power and large-model foundations in place, how quickly a mobile phone manufacturer can move from L1 to L5 will determine its future position in the industry.

Specifically, Apple's AI functions are currently concentrated in mainstream applications. From the camera, photo albums, calendar, and memos to the browser and email, almost every native application in the system will be enhanced by AI. In addition, with Apple Intelligence, the performance of Apple's voice assistant Siri has been significantly improved, and users can ask the system to call ChatGPT for a response while using Siri and a range of apps.

Compared with Apple, domestic mobile phone manufacturers have begun to offer more specific application scenarios for AI phones.

Not long ago, Honor announced a series of AI functions such as "ordering takeout with one sentence" and "forwarding documents with one sentence". For example, when a user gives the phone a voice command, the Honor AI agent integrated into the phone's system understands the request and then follows the steps of "search, select, and share" to send the file, including jumping to the WeChat app and selecting the corresponding contact.

"The AI agent can not only understand your semantics, but the key is that it can understand the real-time feedback on the screen and imitate human understanding to act accordingly." Zhao Ming said that more sophisticated AI phones can imitate humans in third-party applications to recognize and understand the content of the application, and imitate people to perform corresponding operations.

OPPO launched the SenseNow framework at the developer conference, which supports continuous dialogue and spot recognition through the Xiaobu assistant. OPPO said that, in terms of scenarios, the goal is for AI to achieve intuitive multimodal interaction that can "hear clearly, understand, and act quickly", so that it can grasp users' complex intentions and complete cross-application operations.

vivo, for its part, is pushing more capabilities down to "small models", such as image models and sound models.

Rather than trying to do everything, vivo's Zhou Wei believes that development is only possible by layering AI capabilities onto existing scenarios. "For example, we have built 11 models this year. The first version covers more than 20 languages; next year it will be 40, and the year after that perhaps 60 or 80, with 80 languages handled by a single model. As more and more things like this accumulate, a gap in the mobile phone experience will open up," Zhou Wei said.

Asked how domestic manufacturers differ from Apple on AI at the application layer, Zhao Ming said that the operating systems of domestic mobile phone manufacturers are more capable. "In the past, manufacturers needed to follow Android's rhythm, but now Android is more like a kernel or framework. At present, all manufacturers are investing at the system level to explore concrete ways of bringing scenarios to life. With AI, the operating system is becoming personalized, which in effect means a different device for every user."

Archie, a smartphone industry analyst at Counterpoint Research, told reporters that the mobile phone was originally just a tool, but in the future it will become an external brain rather than existing merely as hardware.

Archie believes that, in exploring application scenarios, combining multimodal input and output capabilities can greatly strengthen the smartphone's role as a productivity tool. Such a phone can generate the charts, text, music, pictures, and even videos a user needs from various forms of input, and can also edit input pictures and videos. These technological changes can gradually stimulate new demand in the consumer market.

However, AI chip inference capability at the hardware level and interaction modes at the software level are also placing higher demands on the AI phone itself, and cost and technical solutions have reached a critical juncture. Beyond hardware, major mobile phone manufacturers also need to focus on how to use AI to provide users with personalized services.

(This article is from Yicai)
