Siri Falls While Apple Thrives

Source 丨 New Eyes, Author 丨 Liu Sixuan, Editor 丨 Sang Mingqiang

It is an indisputable fact that Apple is thriving while Siri has fallen.

Since its stunning debut on the iPhone 4S in 2011, Siri has become another symbol of Apple. After all, at a time when artificial intelligence had only just entered the deep learning era, few ordinary people had the chance to talk to an intelligent machine. More than a new feature, Siri was like a messenger from a future civilization, one that made the concept of AI concrete.

But as the novelty faded, problems surfaced one after another: irrelevant answers in open-ended conversation, a low tolerance for users' accents and intonation, the ability to launch only apps within the iOS ecosystem, and slow feature iteration for more than a decade. Siri's positioning spans three levels, intelligence, voice, and assistant, and whether those are taken separately or together, Siri has not accomplished its mission.

In Apple's product lineup, Siri is an outlier, almost an illegitimate child: it has the innovative streak but lacks the practical genes shared by the brand's other products. As a result, while Apple has grown year after year, people's enthusiasm for Siri has faded, to the point that many have quietly turned off "Hey Siri" and never woken it again. Is Siri fated to be nothing more than a novelty that satisfies curiosity?

The answer can be found in three kinds of points from mathematics.

01, Zero point

"A zero point is not a point."

This is one of the most commonly heard statements in mathematics. As the place where the graph of a function crosses the axis, a zero point describes a state of intersection more than a quantity to be added or subtracted. If Apple's business layout is viewed as the graph of a function, then Siri is its zero point: the intersection of consumer electronics and high-level machine intelligence, representing upward exploration but carrying little weight of its own.

Before being acquired by Apple, Siri had been developed independently for 2 years, growing out of government-backed research projects and appearing in Apple's App Store as a third-party app. Seeing the broad prospects for voice assistants, Jobs bought Siri for $200 million, and Apple had an AI of its own.

Jobs's appreciation for Siri was palpable. Unfortunately, Jobs died of illness the day after it was unveiled. In the frequent personnel changes that followed, no resolute leader emerged, and Siri began to lose its way. Insufficient investment, unclear positioning, and a closed system made Siri's decline inevitable.

Frankly, Siri was not a success when it was released. The launch was sloppy, and the original features were limited. At that time, Siri could only respond to simple operational commands such as setting alarms and opening apps; faced with more flexible voice commands such as sending text messages, making calls, and retrieving information, it showed obvious recognition failures.

Given how deep learning works, this problem is not hard to solve: more training gradually improves recognition. The problem is that Siri was not all of Apple. Too many projects ran at the same time, including apps such as Maps and iBooks and new product lines such as iPad Air, iPad Pro, and Apple Watch. With resources divided, Siri, which had few precedents to draw on, struggled, and its "evolution" was pushed back again and again. Coupled with the obstinacy of project leader Williamson, Siri, which should have been updated continuously, could only be updated once a year alongside iOS, squeezing its room for progress even further.

Besides insufficient investment, unclear positioning was also a major problem. As the Siri founders envisioned it, a voice assistant should be a "do engine," not simply a "search engine." This means Siri should be like a friend in daily life, able not only to respond to formulaic commands but also to handle open-ended conversation. The former corresponds to natural language processing (NLP), the latter to the more difficult natural language understanding (NLU).

However, the executives inside Apple who supported Siri left amid infighting, and the original technical team departed as well, so the "original dream" was tarnished. The search function was amplified instead. Apart from simple everyday phrases, most sentences are converted into search queries; even when the text contains clearly directed words such as "Apple," "Siri," or "you," Siri fails to recognize the user's request for conversation and still jumps to the web search interface. In addition, sensitive issues such as discrimination against particular groups and political bias, which surfaced under the malicious prompting of some users, pushed the technical team toward a "one-size-fits-all" decision that made simple search a shield.

As for the closed system, it is a well-known problem. Being confined within the walls of iOS, where external developers cannot step in, is a fatal weakness for an AI that relies on massive amounts of data to learn. Although SiriKit was later introduced to open Siri to third parties, it came too late. By then the market for intelligent voice assistants already had Amazon Alexa and Google Assistant, both rich in third-party features, and Apple had lost its first-mover advantage.

02, Singularity

In mathematics, a singularity is a point at which a function is not defined. Intelligent voice assistants like Siri are exactly such singularities.

As newcomers, they are not the core business of established technology companies and do not get their full attention. The many unknowns that remain in the technology also impose a ceiling on development at this stage. As for where intelligent voice assistants belong among a company's lines of business, the answer is still somewhat ambiguous.

Judging by Apple's current behavior, Siri is clearly treated as a minor functional module. Having exhausted the novelty dividend of its launch, it has been reduced to standard equipment: by convention, each new product ships with Siri, but with little improvement. It has struggled to become a selling point that lifts hardware sales, and it even flopped in the HomePod, indirectly contributing to that product being discontinued.

In fact, being a module is not the only option. Among singularities, a point where the function tends to infinity is called a pole. Similarly, voice assistants can grow into giant "poles."

Amazon Alexa, released in 2014, is a good example.

Missing the early release window does not mean the product itself is inferior. Arriving three years later instead gave Alexa time to fully evolve its algorithms. The arrival of the Echo smart speaker also showed that the product was conceived not as an auxiliary functional module but as a business segment with huge room for growth: around Alexa, many tentacles would extend to cover a wider range of application scenarios. In terms of features, in contrast to Apple's closed ecosystem, an open environment has given Alexa tens of thousands of functions, including but not limited to ordering takeout, answering everyday questions, and following sports team updates.

A stand-alone product form provides an "immersive" voice interaction experience, but that is not the key to success. More importantly, Alexa reached heights that Siri could not because its technology was grounded in a concrete scenario. The smart speaker form corresponds to the scenario of everyday family life. In turn, the family scenario requires the product to have specific attributes, such as suitability for different ages, entertainment, and companionship. Translated into functions, these mean strong language comprehension, rich command options, and natural semantic associations. The scenario improves the product, the product locks in a more precise market, and the market further polishes the technology. Once this closed loop forms, intelligent voice interaction can land smoothly.

The difference between Siri and Alexa also reflects how companies in the AI industry are currently faring. Treating AI only as an add-on to an existing product line, as Apple does, confines it to the application scenario of the product itself; the result is a mismatch between function and need, and the closed loop cannot form. Only by treating the scenario and AI as two independent endpoints, with the product as the link connecting them, can a company achieve an upward-spiraling virtuous circle.

03, Origin

"What kind of voice interaction do we really need?"

Perhaps this is the real origin of "Siri falls" and the question shared by every Siri-like assistant. The answer can be approached from two perspectives: the present and the future.

From today's point of view, voice interaction is not something most people truly need. In existing ways of working and living, individuals solve their own problems and groups solve theirs through communication. As long as information is unimpeded, there is no need for artificial intelligence as an intermediary.

However, the premise of "unimpeded information" also marks out the audience with real pain points: groups whose access to information is impeded.

Children, the elderly, and people with disabilities all face obstructed access to information and obstacles in dealing with problems. To overcome such obstacles, people usually think of one profession: the nanny. Intelligent voice interaction is the best alternative to a nanny. Combined with specific mechanical structures, AI's information-processing capabilities can provide daily-life assistance for users with heavy needs; in the form of a speaker alone, it can accompany users with lighter needs and provide services such as companionship, question answering, and remote control of hardware. In application scenarios involving these groups, the arrival of intelligent voice interaction is nothing short of a revolution.

Looking to the future, the picture may be hazy, but the great changes of the past offer a hint of the shock that intelligent interaction will bring. From the perspective of information flow, intelligent voice interaction means faster transmission. It is the same kind of leap as the one from one writing medium to the next, from horse to car, or from 2G to 5G.

The film Her depicts an era of highly mature voice interaction. The characters' office work is completely free of paper and pencil, and also of the keyboard and mouse we use today; they only need to sit in front of a computer screen and dictate their ideas. Unlike simple speech-to-text, the intelligent voice assistant judges whether a statement is content or a command from the user's tone, expression, wording, and so on. When you say, "Help me delete the previous sentence," the previous sentence is cleared; when you say, "Save a draft," the text goes into the drafts folder.
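The distinction between dictated content and spoken commands can be illustrated with a rough sketch. It is only a toy under stated assumptions: the class `DictationSession`, its method names, and the two regular-expression patterns are hypothetical, and it looks at wording alone, whereas the assistant imagined in the film would also weigh tone and expression.

```python
import re
from dataclasses import dataclass, field

# Hypothetical command patterns, matched on wording only.
# A real assistant would also use tone, expression, and context.
COMMAND_PATTERNS = {
    "delete_previous": re.compile(r"\bdelete the previous sentence\b", re.IGNORECASE),
    "save_draft": re.compile(r"\bsave (a|the) draft\b", re.IGNORECASE),
}


@dataclass
class DictationSession:
    sentences: list[str] = field(default_factory=list)  # text dictated so far
    drafts: list[str] = field(default_factory=list)     # saved snapshots

    def hear(self, utterance: str) -> str:
        """Route one utterance: execute it as a command or keep it as content."""
        for name, pattern in COMMAND_PATTERNS.items():
            if pattern.search(utterance):
                getattr(self, "_" + name)()
                return name
        self.sentences.append(utterance.strip())
        return "content"

    def _delete_previous(self) -> None:
        # Remove the most recently dictated sentence, if there is one.
        if self.sentences:
            self.sentences.pop()

    def _save_draft(self) -> None:
        # Store the current text as one draft entry.
        self.drafts.append(" ".join(self.sentences))
```

Dictating two sentences, then saying "Help me delete the previous sentence" and "Save a draft," would leave only the first sentence in both the text and the saved draft. Keyword matching like this breaks as soon as a user wants to write the phrase itself into the document, which is exactly why the richer signals described above matter.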

If intelligent voice interaction one day really develops to such a level, it is conceivable that work efficiency will increase more than a hundredfold. Even the concept of a workplace may disappear: as long as we are connected to a voice assistant through headphones or more advanced equipment, we can easily process files and write plans even while lying in bed.

Siri co-founder Norman Winarsky once argued that the three major elements that will change humanity's future are virtual assistants, artificial intelligence robot assistants, and augmented reality, which correspond to the information world, the physical world, and the interface between the two. Intelligent voice interaction clearly combines all three. Where it is invisible, the voice assistant processes information; where it is visible, it presents the results and collects feedback; and it is itself that interface.

Although Siri currently lags behind, it will not become an outcast as the times move on. Apple's aggressive acquisitions of AI companies in recent years also signal a shift in focus. It is not hard to imagine that in the coming decades, intelligent voice interaction will become a battleground for the Internet technology giants, and the positive feedback from that competition will give voice interaction real depth.
