Will AI music give birth to the next "Douyin"?

author:Geek Park

Author | Lian Ran

Editor | Zheng Xuan

AI music has been hot lately.

First, in late March, American AI startup Suno released its V3 music generation model, which can turn a text prompt into two minutes of high-quality audio in a few seconds. The results were impressive enough to convince the industry that AI music had reached its own "ChatGPT moment".

Then, a month later, China's Kunlun Wanwei announced the launch of its "Tiangong SkyMusic" music generation model, which it says surpasses Suno V3 and is the new SOTA (state of the art, meaning the best performance currently achieved in a field) among AI music generation models.

This aroused Geek Park's curiosity. Objectively speaking, there is still a gap between China and the United States in the basic research and development of large AI models. Even among vertical models in niche fields, it is rare for a team to claim world leadership with such confidence.

In addition, music is an important part of the multimodal field and carries substantial industrial value. The global recorded music market has total annual revenue of nearly 30 billion US dollars, and related markets such as concerts, background music (BGM), KTV, online short videos, and karaoke platforms are worth hundreds of billions of dollars per year. The arrival of AI will inevitably bring sweeping change to both production and consumption, with a potential impact no smaller than that of the rise of digital music and streaming.

Therefore, after the official launch of "Tiangong SkyMusic" on April 17th, we downloaded and tried the app right away. Here is what we found, along with a few thoughts on the future of AI music.

01

"Tiangong SkyMusic":

From one line of lyrics, in seconds,

three songs in different styles

From a product design point of view, "SkyMusic" and "Suno V3" are similar in many aspects, but there are still significant differences in some key details.

To generate music with SkyMusic, you enter the song title and lyrics, and you can fine-tune the result by adding section labels such as "verse", "chorus", and "intro", much as you would in Suno.

The top is the Suno interface, and the bottom is the Tiangong SkyMusic interface

The biggest difference between the two is that Suno requires users to input a song style, such as pop, jazz, or rap, while SkyMusic lets you select a reference track, either from songs uploaded by other users or from one you upload yourself.

This feature is very useful. A reference track gives more precise direction than a general "style", which makes it easier to generate the music you want.

In practice, ordinary users without professional music theory training find it hard to describe a song's style accurately. Imagine what words you would use to describe "Chapter 7 of the Night". With the approach taken by "Tiangong SkyMusic", you can instead find a style that matches your lyrics while browsing and listening to existing tracks. This expresses what you want musically better than a natural-language description can, and it suits the way ordinary users create music.

Once you've entered the lyrics, song title, and reference track, you can generate music directly. "SkyMusic" generates three songs with slightly different styles and voices at once, which is a very practical design choice. Whether with "Suno V3" or "Tiangong SkyMusic", today's AI can produce music of a certain standard, but its stability still needs to improve: often only one of the three songs turns out well. If you want better results, you need to try repeatedly in addition to fine-tuning the lyrics and sections.

A song created with "AI Lyrics" from the song title "Summer Wind" and the first line "Summer Wind I will always remember" | Video source: Geek Park

Besides generating songs from lyrics you write yourself, "Tiangong SkyMusic" also supports AI lyric writing. The song "Summer Wind" above is one we created using "AI lyrics" followed by "generate song". The melody is not bad, but because the lyrics were not divided into sections, the song as a whole lacks some variation in tone.

I then tried adding section labels to Su Shi's "Shuidiao Getou" ("Water Tune Song Head"). The song generated this time has clear emotional shifts and is especially moving at the climax of the chorus, on lines such as "People have joys and sorrows, and the moon is cloudy and sunny".

Demo of "Shuidiao Getou" ("Water Tune Song Head").

If you are satisfied with a generated song, you can choose to "contribute" it to the "SkyMusic" platform, and sharing to other social media platforms is also supported. On the homepage, I listened to some popular generated tracks with many likes. Some were of such high quality that on first listen it was almost impossible to tell they were AI-generated, although if you listen closely you can still hear how they differ from professional works in some details.

In fact, many professional musicians have also praised "Tiangong SkyMusic". For example, the Bilibili uploader @Metalion, a professional musician, tried generating songs from his own old lyrics and listened to popular AI songs that others had posted on the homepage. He gave several of them fairly positive assessments, such as "not bad" or "like a complete song".

The video released by Bilibili uploader @Metalion | Source: Bilibili screenshot

On the whole, I feel that the music generation ability of "Tiangong SkyMusic" is at least above that of a beginner musician, while its generation speed is far beyond what any human can match. Among the large volume of generated works, many are relatively complete, and some are even impressive in parts. Of course, current AI music models cannot yet maintain consistent quality across an entire song, nor can they polish a work in the detail a real musician would.

Still, judged against the current level of the technology, "Tiangong SkyMusic" is an excellent product that brings real value to users. Through features such as "reference tracks", "three songs per generation", and "optional publishing", the model lowers the threshold for music creation, lets ordinary people enjoy creating and sharing music, and allows anyone to express their aspirations in song.

As large models continue to iterate and products and functions continue to be enriched, there will be more possibilities for AI music in the next year or two.

02

Where will AI music go?

Having covered the product experience, let's close with some observations on the future of the AI music industry based on this trial.

At the media briefing for "Tiangong 3.0" and "Tiangong SkyMusic", Fang Han, chairman and CEO of Kunlun Wanwei, told reporters that he believes AI music creation tools will diverge in two directions: PGC tools for professionals and UGC tools for novice users. UGC is essentially one-click generation, where convenience matters most, while PGC is more complex. He added that "Tiangong SkyMusic" will introduce more professional music tools, such as tune adjustment, in the future.

Products like SkyMusic have some value for both UGC and PGC today, and the technology will continue to iterate in two different directions for the foreseeable future.

For the average UGC user, AI music generation mainly provides entertainment value. Music creation is a form of emotional expression, and even non-professionals have a need to create. In the past, technical barriers such as songwriting and music theory were major obstacles, but the introduction of AI technology has provided them with a powerful "exoskeleton for music creation", allowing anyone to create the music they want and share it on social media.

Of course, current AI music platforms are still complicated to operate: users need to segment and fine-tune the lyrics manually, and the generated sound quality is not yet stable. But as the technology continues to iterate, these problems are likely to be solved soon. Just as Weibo and Douyin revolutionized the creation of articles and videos, a decentralized music creation platform could revolutionize the way people create and consume music.

For professional PGC users, while today's AI music tools are not yet a complete replacement for human creators, they have begun to become an extremely valuable aid.

The first is creative stimulation. AI music models are not yet stable in what they produce, but they are fast, which makes them a good source of inspiration for creators who are stuck. Fang Han also cited a case from the technical team's conversations with professional creators, who found one aspect of "Tiangong SkyMusic" especially valuable: inspiration from styles across regions. For example, Chinese musicians are often unfamiliar with African and Latin American music, but with Tiangong SkyMusic they can easily explore and blend the sounds of these cultures and gain new inspiration for their own work.

Another valuable scenario is the demo. In many commercial projects, especially when musicians work with non-musicians (for example, an arranger and an advertiser producing BGM for an ad), the main difficulty is agreeing on the direction of the work. With AI, it is easy to generate a reasonably accurate sample, which saves a great deal of communication time and reduces the risk of later rework.

From a practical point of view, many industries need music today, most typically short video, film and television, and game production, while traditional music creation is both costly and time-consuming.

In content creation, whoever adapts to technological change fastest is likely to reap the greatest benefits, as the short-video field has already shown. With the rise of mobile phone cameras, 5G, and video editing apps, the threshold for shooting and producing video kept falling. Then, with the global popularity of Douyin, the short-video sector took off across the board and raised the curtain on a new era.

And today, in AI music, and indeed across all AIGC platforms, history could repeat itself.

*Header image source: Tiangong AI
