
How much can generative AI make waves in the entertainment industry?

Author: Chinese Society of Artificial Intelligence

Transferred from The Power of Machines

The AI music generation software Suno V3 attracted a lot of attention after its release, and media reports suggested that OpenAI was looking to pitch Sora to Hollywood. These developments around a music generation tool and a video generation tool have sparked discussion in the industry about the opportunities for generative AI in entertainment.

Highlights of this issue

How popular have Sora and Suno been in the entertainment industry lately?

What happens when GenAI actually works?

What generative approaches came before Sora and Suno?

What other applications of GenAI are in the entertainment industry?

Is generative AI making waves again in the entertainment industry?

V3 of the Suno team's AI music generation software has been hailed within the industry as the "ChatGPT of the music industry" for its strong performance. This version can create compositions up to two minutes long with broadcast-grade sound quality. V3 also offers greater stylistic diversity and a better understanding of user input, which reduces errors during generation.

Prior to the release of V3, the Suno team's achievements in the field of music technology had already been recognized by the market. According to the GenAI Top100 report released by a16z, Suno has become one of the most watched GenAI apps in the past six months (as of January 4), and is the only music company on the list.

Before that, Bark, an open-source text-to-audio model released by the Suno team in April 2023, received more than 4,500 stars on GitHub in a short period of time.

  • In July, Suno added vocal music capabilities to the audio generation model.
  • In September, Suno provided users with access to audio generation models through a Discord channel.
  • In December, Suno launched a web app and announced a partnership with Microsoft to integrate Suno's model capabilities into Copilot.

In the field of video creation, Sora, a video generation model released by OpenAI on February 15, has attracted a lot of attention in the industry with its ability to generate one-minute high-quality videos. Besides the renewed buzz around Sora at home and abroad, the media also reported that OpenAI CEO Sam Altman recently appeared at an event during the Oscars and plans to meet with senior executives in the entertainment industry to explore cooperation opportunities. The works that artists created using Sora, which OpenAI shared last week, have also drawn considerable attention.

When GenAI starts to really work, what's next?

In the case of AI-generated music, the main difficulties are:

1. Music has a complex structure and rich emotions, and AI needs to master music theory and simulate human emotions.

2. High-quality music datasets are essential for AI training, but it is still difficult to obtain diverse and high-quality data.

3. Some patterns in music span long stretches of time, and current models still struggle to retain and use this information to generate coherent compositions.

4. Music involves the expression of style and emotion, and AI needs to capture specific styles, while understanding and simulating the emotional aspects of music.

5. Evaluating music quality is subjective and lacks objective standards, so the feedback mechanisms for improving music generation models remain immature.

Although Suno V3 has made some achievements in music creation, it still needs to be improved in terms of duration, language comprehension, and track processing. Despite this, its function as an auxiliary tool for music creators has been initially realized.

  • For ordinary users, AI tools can lower the threshold for creation, enabling more people to express themselves in a simple way.
  • For professional users, AI tools can serve as creative assistants to improve creative efficiency.

As AI-powered music creation tools mature, they are likely to have a multifaceted impact on the music industry. Advances in technology and companies' market positioning are likely to push GenAI in the music space toward both enterprise-oriented (toB) and consumer-oriented (toC) business models. In addition, GenAI-based content generation tools, combined with publishing channels and interactive features, have the potential to form a complete content ecosystem that monetizes content and distributes revenue to creators. At the same time, GenAI is expected to narrow the skill gap between amateur and professional creators, enabling more amateurs to produce high-quality work.

Despite concerns about the impact Suno could have on the music industry, there is a general consensus that AI technology is unlikely to completely replace professional musicians, though it could affect work with relatively low technical requirements. Lei Ming, CEO of iDream Technology, pointed out that Suno is unlikely to replace music forms that require teamwork and will not affect the career prospects of well-known musicians. He also noted, however, that companies and fields that do not pursue uniqueness and tend to produce in batches, such as advertising soundtracks, ambient music, and film and television dubbing, may be affected by AI technology.

In terms of video generation technology, an analysis by Factorial Funds points out that, despite the limitations of the Sora model, the quality of its generated video already meets the needs of specific scenarios and is expected to replace some stock video clips. Sora's success demonstrates the importance of scaling laws in video models, and continued model scaling will be a key factor in improving model capabilities. At the same time, other companies, such as Runway, Genmo, and Pika, are exploring how to build more intuitive user interfaces and workflows, which will have a significant impact on the ease of use and generalizability of video generation models. ....

How far had GenAI come by the time of Sora and Suno? What problems are still being solved as GenAI moves toward commercialization? What other entertainment applications will GenAI have in the near future? ... For a full explanation, please go to the "Heart of the Machine PRO" industry newsletter · #Week 13 of 2024

The full version of this newsletter includes 3 thematic interpretations + 31 important events on the AI & Robotics track

1. After LLMs, can "Next Token Prediction" also train robots?

Why is it said that "if you predict the next token, you can reach AGI"? What is NTP technology? How does Berkeley use NTP to train robots? Can "Next Action Prediction" work in the field of robotics? ...

2. Is generative AI making waves in entertainment again?

How popular have Sora and Suno been in the entertainment industry lately? What happens when GenAI actually works? What generative approaches came before Sora and Suno? What other applications does GenAI have in the entertainment industry? ...

3. Key takeaways from Richard Sutton's latest podcast

What are the key takeaways from Richard Sutton's podcast interviews? Where does the Alberta Plan stand? What are Sutton's new ideas about AGI? ...

[Disclaimer] Reprinted for non-commercial educational and scientific research purposes, solely to disseminate academic news. The copyright belongs to the original author. If there is any infringement, please contact us immediately and we will remove the content promptly.