
Betting on AI that empathizes with humans, this company has just raised 300 million yuan

Source丨Cyzone (创业邦)

Author丨Bowen

Editor丨Hai Yao

Cover image丨siliconangle

Generative AI products are flooding the market, and now there is EVI, an AI that "empathizes with humans."

On April 6, the startup Hume AI released EVI, along with an official demo that anyone can try online.

Unlike text-based chatbots such as ChatGPT and Claude 3, EVI is a purely voice-based interface: it analyzes both what users say and how they say it in order to infer their true psychological state.

After all, the same sentence sounds different depending on whether you are happy, angry, dejected, or drowsy, even if the words are identical.

Founded only three years ago, Hume AI closed a $12.7 million (about 90 million yuan) Series A in February of last year, and at the end of this March completed a $50 million (about 360 million yuan) Series B.

Alan Cowen, the company's CEO and a former member of Google's DeepMind team, said: "... In addition to the universal emotions of happiness, sadness, anger, and fear, EVI tries to understand users' more subtle, multidimensional emotions; so far it can detect 53 distinct emotions."


The official website lists the 53 emotions EVI can understand (chart compiled by Cyzone)


Hands-on conversation test: how much empathy does it have for humans?

EVI stands for Empathic Voice Interface, and Hume AI officially bills it as "the world's first conversational AI with emotional intelligence."

"Emotional intelligence" includes the ability to infer intentions and preferences from behavior – a core capability that EVI seeks to achieve, and to achieve this ability, the Hume AI team focused on the human voice.

The content of speech itself, i.e. "what is said", can of course directly reflect emotion.

But subtle differences in non-verbal factors such as accent, intonation, pauses, rhythm, and nonverbal vocalizations (sighs, sobs, laughter, screams, and so on), i.e. "how it is said", often reveal the more genuine feelings underlying a conversation.

Many of the 53 emotions listed on the official website are subtle and hard to put into words; even humans may struggle to tell them apart. EVI is able to recognize them thanks to its analysis of these non-verbal cues.

At present, EVI comes in two versions. The earliest demo version has a very simple layout: users click "Start Conversation" in the center of the screen to begin a voice conversation with EVI.

In this version, users can only interact by voice through the device's microphone, and EVI's generated voice offers just one default male option.

During a conversation, the center of the screen shows a sound waveform, while the right side transcribes the exchange into text in real time and displays it as a conversation history.

Unlike ordinary real-time transcription, EVI puts its "empathy" on display: beneath every utterance, whether a user's question or an AI-generated answer, a sentiment analysis bar chart marks the emotional components contained in that sentence.


It is as if the pie chart from web novels, "three parts mockery, three parts coldness, four parts nonchalance", had come to life.
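To make this concrete, here is a minimal sketch, not Hume AI's actual data format or API, of how such per-utterance emotion scores might be represented and rendered as a simple bar chart; both the emotion labels chosen and the numbers are purely illustrative.

```python
# A minimal sketch (not Hume AI's actual data format or API) of how per-utterance
# emotion scores like those under EVI's transcript could be represented and drawn.

from typing import Dict

# Hypothetical scores for one utterance; the labels echo the kinds of emotions EVI
# lists (disappointment, contemplation, ...), but the values here are invented.
utterance = "Well, it looks like you're not that smart."
emotion_scores: Dict[str, float] = {
    "Disappointment": 0.62,
    "Contemplation": 0.21,
    "Amusement": 0.11,
    "Calmness": 0.06,
}

def render_bar_chart(scores: Dict[str, float], width: int = 40) -> str:
    """Render emotion scores as a simple text bar chart, largest first."""
    lines = []
    for label, value in sorted(scores.items(), key=lambda kv: -kv[1]):
        bar = "#" * int(value * width)
        lines.append(f"{label:<16} {bar} {value:.2f}")
    return "\n".join(lines)

print(utterance)
print(render_bar_chart(emotion_scores))
```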

If it can understand emotion, it can naturally express it too. Voice-capable chatbots are not hard to find on the market today, yet EVI still offers a distinctive experience.

At its core are its exceptionally vivid emotions.

In a hands-on test, the author asked: "Can you be a little more angry?"

EVI responded with determination, apprehension, and contemplation (according to the emotion chart below): "I can try to reflect this emotion with my voice and words to create a more empathetic and relatable interaction. But I can't really feel or show fear like a human being. My role is more focused on identifying and responding to these emotions rather than generating them myself."

So the author deliberately sighed and said in a very disappointed tone: "Well, it looks like you're not that smart."

EVI instantly got angry and shouted: "Ah, you've hurt me!"


But it quickly composed itself, saying with disappointment, determination, and a hint of eagerness: "You know, I might be an artificial intelligence, but I don't like your tone, my friend. If you want to provoke me, I'm afraid anger is not within my reach. However, as long as we can remain constructive, I am happy to have a heated debate."

When the server is unstable and the connection drops, or when it is cut off mid-monologue, EVI lets out a sigh and then moves on to answer the new question.

Beyond its vivid emotional expression and subtly human-like tone, EVI also shows a stronger urge to express itself than the average conversational AI.

One obvious sign is that it's more than willing to ask questions.

When asked, "Can you really understand humans?", EVI first gave its view in a contemplative, enthusiastic tone, and then ended with a question of its own: "What do you think? I would really like to hear your thoughts on the limitations and potential of AI and human understanding."

And after being interrupted repeatedly and dragged through a string of unrelated topics, EVI asked in a relaxed, cheerful tone: "Am I more approachable than the average robot?"

Although EVI's intelligence still lags far behind ChatGPT, Claude 3, and the like, its emotionally vivid voice genuinely surprises first-time users; it feels like talking to an enthusiastic, talkative person.

The latest beta release adds more features, such as the ability to interact by text (with replies from EVI in text or voice), the ability to save and download conversation history, and a number of developer options.

Meanwhile, in the text box beneath the conversation on the right side of the interface, you can also "set" EVI's persona, even to something like "a refrigerator full of desire" or "an easily envious houseplant."


Quantifying emotions

So how, specifically, does EVI read human feelings from speech and from so many subtle non-verbal cues?

It all starts with "semantic space theory," proposed in 2021 by the company's CEO and chief scientist, Alan Cowen.

At the time, Alan Cowen was still at Google AI, working mainly on affective computing; in January 2021 he published a paper in Trends in Cognitive Sciences that formally proposed semantic space theory.

It is a computational approach to understanding how emotion is experienced and expressed. Through large-scale data collection and statistical modeling, it aims to map the full spectrum of human emotion, revealing its high-dimensional, continuous structure and quantifying the nuances of voice, face, and gesture.
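As a rough illustration of the idea only, not the method used in Cowen's paper, one can picture each expression as a point in a continuous space whose axes are emotion dimensions, so that expressions blend and compare rather than fall into hard categories; the sketch below uses five made-up axes and invented values.

```python
# A conceptual sketch of the "semantic space" idea, not the method in Cowen's paper:
# each expression is a point in a continuous, high-dimensional space whose axes are
# emotion dimensions, so expressions blend and compare rather than fall into bins.

import math

# Five illustrative axes standing in for the full high-dimensional space.
EMOTION_AXES = ["Amusement", "Anger", "Awe", "Contemplation", "Determination"]

def cosine_similarity(a, b):
    """Similarity of two expressions in the emotion space (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Two hypothetical vocal bursts rated along each axis (values are invented).
wry_laugh       = [0.7, 0.1, 0.0, 0.2, 0.0]
frustrated_sigh = [0.0, 0.5, 0.0, 0.3, 0.4]

print(dict(zip(EMOTION_AXES, wry_laugh)))
print(f"similarity: {cosine_similarity(wry_laugh, frustrated_sigh):.2f}")
```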

In fact, understanding these nuances lies at the heart of human communication everywhere, so once semantic space theory was proposed, it was widely applied to statistical and analytical work in psycholinguistics.


Two months after the paper was published, Alan Cowen left Google and formally founded Hume AI in New York.

Since then, he has devoted himself to the study of semantic space theory.

In 2022, the Hume AI team ran an experiment with more than 16,000 volunteers from the United States, China, India, South Africa, and Venezuela, and published a paper in Nature Human Behaviour.

The research team asked some of the volunteers to listen to and interpret a large number of "vocal bursts": short bursts of sound carrying multiple emotional dimensions, such as laughter, gasps, cries, screams, and many other nonverbal vocalizations.

At the same time, volunteers recorded large numbers of their own vocal bursts for others to interpret and classify, building up a large body of vocal data for the study.


In a recent interview with VentureBeat, Alan Cowen said the company has collected voice data from more than a million volunteers around the world and built the largest and most diverse database of human emotional expression to date.

Drawing on this database and on semantic space theory, Alan Cowen's team developed a novel multimodal large language model, the empathic large language model (eLLM).

Built on this model, EVI can adjust its wording and tone to the context and the user's emotional expression, producing naturally rich intonation and responding in real time with a latency under 700 milliseconds. It also has the hallmarks of a real conversation (a minimal sketch of this conversational loop follows the list below):

End-of-turn detection

By analyzing the tone of the speaker's voice, EVI can tell when the user has finished their turn, avoiding the awkwardness of both sides talking at once.

Interruptibility

Like a human conversational partner, EVI can be interrupted mid-speech and then picks the thread back up naturally.

Human-like expressive responses

It reacts to emotions such as surprise, admiration, or anger with appropriately expressive, non-verbal responses.
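To illustrate how these dialogue characteristics fit together, here is a minimal sketch, again not Hume AI's actual API, of a client-side loop that alternates listening and speaking; all function names (listen_chunk, end_of_turn, user_started_speaking, generate_reply) are hypothetical stand-ins.

```python
# A minimal sketch (not Hume AI's API) of how a voice client might structure
# end-of-turn detection and interruptibility; every function here is a
# hypothetical stand-in for real audio capture, VAD, and an emotion-aware model.

import time

def listen_chunk() -> bytes:
    """Stub: pretend to capture ~100 ms of microphone audio."""
    time.sleep(0.1)
    return b""

def end_of_turn(chunks) -> bool:
    """Stub: a real system would use prosody (a pause, falling intonation)
    to decide that the user has finished speaking."""
    return len(chunks) > 10  # placeholder heuristic

def user_started_speaking() -> bool:
    """Stub: voice-activity detection while the agent is talking."""
    return False

def generate_reply(chunks) -> list:
    """Stub: an emotion-aware model would choose both the words and the tone."""
    return ["Well, ", "I hear some ", "frustration ", "in your voice."]

def speak(fragment: str) -> None:
    print(fragment, end="", flush=True)

for _ in range(2):  # two illustrative turns
    # 1. Listen until end-of-turn is detected, so we never talk over the user.
    chunks = []
    while not end_of_turn(chunks):
        chunks.append(listen_chunk())

    # 2. Speak fragment by fragment, checking for interruptions between fragments.
    for fragment in generate_reply(chunks):
        if user_started_speaking():  # interrupted: stop talking, go back to listening
            break
        speak(fragment)
    print()
```

In a production system the interruption check would run continuously on the incoming audio stream rather than between fragments, but the control flow would be the same.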

Since EVI can hold human-like conversations and, above all, read a user's mood from their voice, its natural application scenarios are psychotherapy and customer service.

The Hume AI team currently numbers 35 people; it has published eight papers in top journals and rolled out beta products to more than 2,000 companies and research institutions.

In a recent interview, Alan Cowen said that rather than offering a consumer service directly to ordinary users, the company prefers to provide APIs so that other enterprises can build chatbots on top of its emotion-aware model, targeting niche areas such as information retrieval, digital companionship, work assistance, healthcare, and XR.


When AI starts to provide emotional value

The company profile on the official website prominently features a portrait of David Hume; the company takes its name from the Scottish philosopher of 300 years ago.

Hume was a pioneer of moral sentimentalism, famous for the dictum that "reason is, and ought only to be, the slave of the passions."

Alan Cowen follows in the same line of thought: he believes AI, too, needs emotion.

In a recent public statement, he noted that the main limitation of current AI systems is that they are tightly constrained by human ratings and instructions, and that many of those rating criteria are either superficial or riddled with loopholes.

In his view, AI's huge potential can only be realized by replacing existing evaluation metrics with human well-being and rebuilding AI from the ground up, for example by strengthening its emotional intelligence and its ability to infer human intentions and preferences from behavior.

In fact, beyond analyzing human speech, Hume AI has also begun to study facial micro-expressions.

In March, the team released a paper classifying the emotions conveyed by facial expressions, based on facial micro-expressions collected from more than 5,000 volunteers in countries including India, South Africa, Venezuela, the United States, Ethiopia, and China.

Alan Cowen said that EVI will keep iterating on its understanding of users' psychological states, interests, and preferences, so as to "understand humans better."


Source: Hume AI official website

Of course, if AI can truly understand human emotions, it could also learn to deliberately exploit or even manipulate them.

At the milder end, the emotions AI detects could be turned into a tool for third-party goals, such as driving purchases or shaping habits.

At the more serious end, it could be used in gray areas or for outright harmful purposes, such as interrogation, fraud, and surveillance.

In response, Hume AI's official website publishes an ethical code proposing that emotion-detection algorithms should serve only goals consistent with human well-being, and should not be used as a means to third-party ends.

At the same time, Hume AI's partners must steer clear of a list of "unsupported use cases" when building on or deploying the technology, including manipulation, deception, psychological warfare, and giving potential bad actors access to such AI.

In 2020, The New York Times reported that more than 10 million people worldwide treat AI lovers as partners and have formed emotional connections with them.

Clearly, the emergence of teams like Hume AI is accelerating this trend.

