The era of self-media globalization! Spotify launches an AI voice translation feature

Anything that can make Musk say "Wow" is definitely no small thing!

Just yesterday, Musk replied with a surprised "Wow" in the comments under a Lex Fridman post on X.

Everyone knows Musk, and the person he was interacting with is quite remarkable too.

Lex Fridman is a well-known podcaster and a research scientist at the Massachusetts Institute of Technology, specializing in artificial intelligence. Most people know him through his show, the "Lex Fridman Podcast."

On the show he has interviewed "technology madman" Musk, "father of ChatGPT" Sam Altman, American UFO incident witness David Fravor, and many other well-known figures. The interviews range across scientific research, academia, unexplained mysteries, society, and many other fields. Every episode is packed with information, and the show is much loved by its audience.

Besides being a good interviewer, he is no slouch when it comes to research. At MIT he has taught courses on deep learning, autonomous driving, reinforcement learning, and more. In the field of autonomous driving, he has also designed an image-recognition-based detection function to prevent driver distraction. On top of that, he holds a black belt in Brazilian Jiu-Jitsu, is a fitness enthusiast, and plays guitar and piano well. He seems to have maxed out every talent.

Yet this genius-level figure is also a Musk fan, and the two have been going back and forth on X for quite some time.

So what was it that caught the attention of these two big names in tech?

The answer is Spotify's upcoming "voice translation feature for podcasts."

01

Voice translation that "clones the host's voice"

Spotify is currently one of the world's largest licensed music streaming platforms, with more than 200 million paying subscribers, a market value of $24 billion, and nearly 500 million monthly active users, making it a leader in the music and podcast industry.

In recent years Spotify has attached great importance to its podcast business, acquiring more than ten podcast-related companies that cover content production, creation tools, advertising measurement, monetization, and data analysis.

In addition, in 2020 Spotify bought the exclusive rights to Joe Rogan's podcast "The Joe Rogan Experience" for $100 million. When it comes to podcasts, Spotify has spared no expense.

Now the voice translation feature is set to give a strong push to the globalization of Spotify's podcast business.

What is special about it? The key is voice cloning: it lets a host switch seamlessly between languages while still speaking "in their own voice".

Consider how things have worked until now. If we wanted to understand foreign short videos, movies, TV series, or broadcasts, we relied on subtitles, on dubbing, or on a combination of the two.

Where there was no professional subtitle team, enthusiastic and skilled netizens would form a "subtitle group", translating the dialogue themselves and adding subtitles. That let us follow foreign content even when we could not understand the language.

But watching a video while reading subtitles is tiring, so dubbing came along. Dubbing, however, brings an interesting side effect: the voice and the person do not match. Stephen Chow's movies are a good example. They are so popular that audiences take the dubbing actor's voice to be Stephen Chow's own, so hearing Chow's real voice, or hearing "Stephen Chow's voice" coming from another movie character, feels a little strange.

That will change when Spotify rolls out its new "Voice Translation." Speakers can be switched automatically into other languages while keeping "entirely their own voice", with even the rhythm and tone of their speech preserved.

Spotify says its AI voice translation feature, powered by OpenAI's automatic speech recognition model Whisper, can mimic the original speaker's style and sounds more natural than traditional dubbing. The feature may help podcasts expand their audience, while listeners get a seamless experience across languages.

The feature is currently in beta and is initially available only in Spanish, with French and German to follow in the coming days and weeks.

Still, the first beta of the new feature, "Spotify's AI Speech Translation Experiment, Let Your Favorite Podcasters Broadcast for You in Your Native Language," attracted many famous podcast hosts, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

Ziad Sultan, Spotify's VP of Personalization, said: "By matching the creator's own voice, Voice Translation gives listeners around the world the power to discover and be inspired by new podcasters in a more authentic way than ever before." The technology uses the power of audio to overcome barriers such as borders and distance. It bridges language differences and gives content greater impact and reach, with social benefits that go beyond those of ordinary text translation.

02

AI brings the era of "true globalization" for self-media

The voice translation feature has also drawn attention in China. A feature this powerful has many netizens looking forward to a Chinese version, and some have said that "AI will bring the era of self-media globalization".

This view deserves attention. Language is an important carrier of content, and for content to spread across regions or around the world, language conversion is a necessary step. That conversion, however, comes at a cost.

How well a media app is accepted in a country or region depends not only on its content but also on its ability to convert that content into the local language. If fewer resources are available for content conversion, the platform is less able to meet demand, which to some extent weakens the influence of a self-media app or media platform in that region and slows its entry into the regional market.

The "voice translation" feature preserves the original voice while increasing conversion capacity and lowering conversion costs. That helps content spread further and faster, which in turn extends the platform's global reach.

For now, though, Spotify is applying the feature only within a fairly limited scope, which is closely tied to safety considerations.

If the media problem of the past stemmed from loss and distortion in transmitting information, today's media problem stems from the medium presenting information that is "too real".

For example, not long ago "AI Stefanie Sun" covered a large number of well-known songs and caused a sensation online, and imitators followed in droves. But the flood of works that can pass for the real thing also makes it hard to tell what is genuine and what is synthetic. Moreover, this kind of "realism" or "faithful reproduction" rests on collecting, accumulating, and analyzing large amounts of data. The boundary between public and private data is itself blurred, and there remains a sizeable gray area in how such data is used.

In addition, how the feature is integrated into the platform and in what form it serves users both need systematic planning. What is certain is that AI is simplifying content distribution and making communication more efficient, and the globalization of media is accelerating.

Author: Watermelon Typesetting: Huang Leihui
