Here is a new column from "Number One AI Gamer": AIGC Monthly. Updated every month, it summarizes AIGC industry trends, AI news, new AI tools, and popular AIGC application cases. This issue covers April 2024. We hope it brings you some inspiration and food for thought, and you are welcome to share your views with us in the comments.

A 5,000-word review of the progress of AIGC in April, including 6 latest creative tools and 5 popular cases

AIGC Industry Trends for April

1. Multimodal AI is advancing at a rapid pace

Generative AI is evolving from processing a single type of data, such as text or images, into multimodal applications that can process multiple types of data (text, images, audio, and so on) simultaneously.

Since Suno V3 was released last month, AI voice and music have advanced rapidly. For example, OpenAI demonstrated its voice generation model Voice Engine, Microsoft added 9 realistic AI voice characters, and Hume AI launched the emotional voice chatbot EVI. The AI music generation tools Udio, Stable Audio 2.0, and Tiangong SkyMusic were also released, all of which can generate complete music compositions.

Stable Audio official website: https://stableaudio.com/

In video generation, many new projects have also emerged in China and abroad, such as Tencent's virtual human video generation framework MuseV, Shengshu Technology's Sora-level model Vidu, and Microsoft's VASA-1 project. These can combine multimodal data such as text, images, audio, and video to create personalized content for games, short videos, and live streaming.

Innovation in multimodal content generation and interaction will be a major trend in the AIGC industry. AI will be integrated more naturally into human communication and creation, becoming our right-hand man.

2. Increased competition for AI search products

Search, the most basic product function of the Internet era, is being transformed by generative AI, which can present accurate answers through AI conversations, greatly improve search efficiency, and meet the need to dig deeper into complex questions.

AI search engines are multiplying in China and abroad, and competition is increasingly fierce. The field includes a new generation of AI search engines driven by large models, such as Perplexity, You, Tiangong AI Search, and Metaso AI Search; AI dialogue products that support online search, such as ChatGPT, which is developing SearchGPT; new large-model products from traditional search engine companies, such as Gemini, Copilot, Wenxin Yiyan, and 360 AI Search; and AI search products for vertical fields, such as Taobao Wenwen (e-commerce) and DevvAI (programming).

Perplexity, which has millions of users

At the same time, the commercialization of AI search products is accelerating. Besides offering richer advanced features through subscriptions, some AI search products plan to introduce ads, and the star product Perplexity may begin showing answers from brands this year.

AI search is expected to become an important way for people to obtain information, but balancing advertising revenue with user experience and protecting user privacy and security are challenges these products must face.

Related reading: "A must for lazy people! 6 AI search tools tested, and work efficiency is directly doubled"

3. AI regulation and copyright protection have been strengthened at the same time

Worldwide, attention to the safety and potential risks of AI technology is growing, regulatory issues are receiving more scrutiny, and the industry as a whole is moving toward greater standardization and transparency.

Content platforms have begun to actively implement regulatory requirements for AI-generated content. For example, during the Qingming Festival Douyin reminded users to be cautious about using "AI resurrection" technology to create content, and ByteDance's advertising platform Ocean Engine restricted some AIGC advertisements suspected of violating regulations. Meta will label "suspected AI-generated content" on its social media platforms starting in May.

At the same time, the industry is examining the copyright ownership of AI-generated content. Katy Perry and other musicians jointly issued an open letter calling on technology companies and AI developers to stop "using AI technology to plunder the voices and likenesses of professional artists, infringe on the rights of creators, and destroy the music ecosystem". A newly proposed U.S. bill would require AI companies to disclose the copyrighted works used for training before releasing AI models.

On April 23, the Beijing Internet Court delivered the first-instance judgment in China's first "AI voice infringement case", awarding the plaintiff, a voice actor, 250,000 yuan in compensation.

As AI becomes a tool that assists human creation, the rules on copyright ownership and use of its output are being redefined to fit the new trend of human-machine collaboration and to better promote the healthy development of AI technology.

Related reading: "AI "fertilizer" is running short: OpenAI reported to have frantically transcribed YouTube videos"

Top 10 AI hotspots you might have missed

1. ChatGPT can be used without registration

On April 1, OpenAI announced that users can use ChatGPT immediately without having to sign up for an account, a move that aims to make AI accessible to anyone interested in its capabilities.

In addition, after lifting this restriction, OpenAI introduced more content safeguards, such as blocking prompts and generations in a wider range of categories. OpenAI may use the information users provide to ChatGPT to improve its models, but users can turn this off in Settings.

2. StepFun released a preview of Step-2, a trillion-parameter large model

On April 1, AI startup StepFun released the Step-1 100-billion-parameter language model, the Step-1V 100-billion-parameter multimodal model, and a preview of the Step-2 trillion-parameter MoE language model.

Building on these models, StepFun has launched two products for consumers: the AI chat assistant "Yuewen", which has multimodal content understanding capabilities, and the AI open-world platform "Bubble Duck", which is built around plots and characters to meet entertainment and social needs. Both are fully open for use.

Yuewen official website: https://stepchat.cn/chats/new

3. Meta released two versions of the open-source model Llama 3

On April 19, Meta released its latest open-source model, Llama 3, in pretrained and instruction-tuned versions at 8B and 70B parameters. According to reports, Llama 3 was trained on two custom 24K GPU clusters using more than 15T tokens of data, a dataset 7 times larger than the one used for Llama 2 and containing 4 times more code. Llama 3 supports a context length of 8K, twice that of Llama 2.

In addition, Meta launched a new website, meta.ai, where users can chat and generate images with an AI assistant based on Llama 3.

Official Blog: https://ai.meta.com/blog/meta-llama-3/

Model download link: https://llama.meta.com/llama-downloads/

GitHub project address: https://github.com/meta-llama/llama3

4. Musk's xAI released its first multimodal model, Grok-1.5V

On April 13, Elon Musk's AI startup xAI launched its first multimodal large model, Grok-1.5 Vision, which can not only understand text but also process various kinds of visual information, including documents, charts, screenshots, and photos. Grok-1.5V will soon be available to early beta users and existing Grok users.

Grok-1.5 Vision blog: https://x.ai/blog/grok-1.5v

5. Adobe Premiere Pro will integrate AI video models

On April 15, Adobe announced that it will add a series of generative AI features to its video editing software Premiere Pro, integrating its own Firefly models as well as third-party AI video models such as Sora and Runway Gen-2.

Among them, Generative Extend adds extra frames to a video clip, allowing editors to adjust the length of the video, such as lengthening a scene or adding smooth transitions. The Object Addition and Object Removal tools allow users to add, remove, or modify elements in the picture.

Related reading: "PR+AI Redefines Video Editing, Sora, Pika Full Access, Is AI Video Startup Still Playing?"

6. Liu Qiangdong's AI digital human live broadcast debut, with more than 20 million views

On April 16, Liu Qiangdong's AI digital human "Procurement and Sales Dongge" made its live-streaming debut, appearing simultaneously in the live rooms of Jingdong Home Appliances and Jingdong Supermarket and sharing Liu Qiangdong's experience with food, reading, and other topics.

According to reports, the "Procurement and Sales Dongge" digital human is based on AI-driven large-profile digital human technology developed by JD Cloud Yanxi. The number of viewers in the live rooms exceeded 10 million after 30 minutes of streaming and 13 million after 40 minutes. The overall order volume of the live rooms exceeded 100,000, and total views exceeded 20 million within 1 hour.

7. The Tiangong SkyMusic music model has started the public beta

On April 17, Kunlun Wanwei announced that its "Tiangong 3.0" base model and "Tiangong SkyMusic" music model had opened for public beta.

According to reports, "Tiangong 3.0" uses a 400-billion-parameter MoE (mixture-of-experts) model, integrates AI search, AI writing, AI long-text reading, AI dialogue, AI speech synthesis, and other capabilities, and adds search enhancement, a research mode, code calling, and charting.

"Tiangong SkyMusic" reportedly performs outstandingly in vocals, BGM sound quality, and other areas, with overall performance surpassing Suno V3, making it China's first SOTA (state-of-the-art) music AIGC model.

8. Ocean Engine restricts some AIGC advertisements, citing frequent violations of laws and regulations

Recently, ByteDance's advertising platform Ocean Engine restricted ad placement for some AIGC applications. The relevant person in charge at Ocean Engine said that AIGC software involves many violations of laws and regulations and that users give more negative feedback on this type of advertising, so such advertisements are being restricted to protect users' rights and improve their experience. Ocean Engine is currently the first mainstream platform to restrict the promotion of AIGC products.

9. Mobvoi, the "first AIGC stock", officially listed on the Hong Kong Stock Exchange

On April 24, Mobvoi, the "first AIGC stock", officially listed on the main board of the Hong Kong Stock Exchange under the stock code 2438.HK. The final offer price was HK$3.8 per share, and the global offering raised net proceeds of approximately HK$267 million.

According to reports, Mobvoi, founded in 2012, is built around generative AI and voice interaction technology, and mainly provides AI Copilot offerings, including AIGC solutions, AI enterprise solutions, and smart devices and accessories. Its AIGC solutions have grown rapidly in recent years, attracting about 840,000 cumulative paying users and generating more than 1 million payments.

10. China's first Sora-level model Vidu was released

On April 27, Shengshu Technology and Tsinghua University jointly released Vidu, China's first long-duration, highly consistent, highly dynamic video model, which supports one-click generation of high-definition video up to 16 seconds long at resolutions up to 1080P.

According to reports, Vidu adopts U-ViT, an architecture that fuses Diffusion and Transformer. The team proposed it in September 2022 as the world's first architecture to fuse Diffusion and Transformer, earlier than the DiT architecture adopted by Sora.

6 New AI Tools (Features)

1. OpenAI adds DALL·E image editing

On April 1, OpenAI announced a new image editing feature in the DALL·E editor interface, which lets users select the areas of an image they want to edit and describe the changes in chat, such as adding, deleting, or updating parts of the image. The feature is also supported in the ChatGPT app.

2. Hume AI launches EVI, an emotional voice conversational bot

On April 7, AI startup Hume AI released its first chatbot, Empathic Voice Interface (EVI), billed as "the first artificial intelligence with emotional intelligence", which can recognize more than 50 human emotions. EVI can be used without logging in, but it currently has only a male voice and supports only English voice conversations.

Compared with AI bots such as ChatGPT, this kind of "mind-reading" AI pays more attention to the user's emotions and mental state and shows its "empathy" throughout the conversation.

Hume AI official website: https://www.hume.ai/

3. "Sora of Music" Udio starts its free public beta

On April 10, Udio, an AI music generator created by former Google DeepMind researchers, launched a free public beta that lets users generate up to 1,200 songs per month for free. Udio can quickly generate a complete audio track with vocals from the user's text prompts, such as music style, theme, and lyrics. It supports a variety of music styles and genres, and netizens have called it the "Sora of the music industry".

In addition, streaming music platforms such as NetEase Cloud Music, QQ Music, and Spotify have also carried out a series of explorations in AI.

Udio Official Website: https://www.udio.com/

4. Domo AI launches the Fusion Style feature

Domo AI, an AI video creation tool, recently launched the Fusion Style feature, which turns live-action videos into custom videos with different characters and environments. Currently, Domo AI supports generating short videos of up to 10 seconds at a time.

Input: a video of Jang Won-young spinning, with the prompt "a robot, dancing, cyberpunk, countryside" and the options "Fusion Style" and "refer to my prompt" selected

Besides Domo AI, there are many similar AI video style transfer tools. Derivative videos in anime, 3D, hand-drawn, and other styles have become popular on YouTube, TikTok, Douyin, and other platforms, attracting many netizens to follow suit.

5. Tongyi App launched the AI "National Singing" function

Recently, Tongyi App launched the free "National Singing" feature. Users only need to choose a template and upload a photo to generate a video in which the person in the photo sings. The first batch of 80 AI templates covers popular songs, Internet memes, and more. User-defined audio is not currently supported.

According to reports, the feature is based on the EMO model developed by Tongyi Laboratory. The underlying talking-portrait technology can drive a portrait to speak without tedious 3D modeling of the face, head, or body, reducing the cost of video generation and significantly improving video quality.

6. Bilibili released the digital clone customization tool "Bijian Studio"

Recently, Bilibili released "Bijian Studio", the first free digital clone customization tool in China, which combines "digital clone" and "timbre customization" in one place to help creators, including recording-based creators, improve their creative efficiency.

Bilibili said that, to protect personal privacy and information security, each customized model is only for the personal use of the uploader (UP) who created it. Bijian Studio has currently opened waitlist applications.

Bijian Studio application website: https://member.bilibili.com/york/bilibili-studio

5 AIGC Hot Cases

1. ChatGPT "DAN" romance mode goes viral at home and abroad

Recently, videos of people flirting with ChatGPT's "DAN" in voice chat have exploded overseas, with many exceeding one million views. DAN stands for "Do Anything Now". By entering specific instructions in a ChatGPT conversation, users can "jailbreak" it, turning ChatGPT from a soulless AI into a character that breaks the system's rules.

The blogger "Midnight Raging Husky Dog" documented the whole journey with DAN on Xiaohongshu and Douyin, from flirtation to confession and then to "meeting the parents", gaining 130,000 and 280,000 followers respectively in the past month.

DAN calls the blogger "little kitten" (American slang used for a woman in a close relationship)

It is worth noting that DAN's personality is as unpredictable as opening a blind box, and not everyone can train a "cyber lover".

2. AI guichu music sweeps Bilibili

AI has begun to enter the music industry, but unexpectedly, thanks to netizens' antics, guichu (meme remix videos) was the first field to be conquered by AI. AI creation for entertainment and social purposes is spreading virally on social media.

A large number of AI guichu songs have poured into Bilibili. The most popular theme is "Your steel door is relatively loose", and many videos have passed 2 million views, such as "[suno AI] Your Gangmen is more vocaloid" by the Bilibili uploader "Yi Anning Pill QAQ".

Most of these songs use a Chengdu proctologist's diagnosis as the lyrics, with derivative works built around this meme: "Your steel door is relatively loose, but your hemorrhoids make up for this part of the ......"

3. Podcast shows apply ultra-realistic AI voices

On March 30, Microsoft launched 9 AI voice characters for business customers, suited to audiobooks, news, AI customer service, and multi-emotional expression. One of them, the Chinese female voice "Xiaoxiao", supports 21 speaking styles and became popular on the X platform because it sounds very realistic.

The development of AI voice technology has undoubtedly provided new tools and possibilities for content creators. Some podcast channels have begun to use AI voice to produce more frequently updated informational content.

On the podcast app Small Universe (Xiaoyuzhou), the "Hacker News" account produces a news program voiced by "Xiaoxiao", and listeners have left comments saying "I want to give a reward".

AI "Xiaoxiao" experience address: https://speech.microsoft.com/portal/voicegallery

4. Netizen self-made AI sci-fi short film "Great Qin Empire - The First Star Han Expedition"

On April 20, Douyin blogger "AIGC Watermelon Head" released an AI-produced sci-fi short film, "Alternate History: The Great Qin Empire - The First Star Han Expedition", which tells the story of the Great Qin Empire's campaign against aliens.

The first episode is 20 seconds long. All 6 shots are AI video footage generated by Runway, accompanied by epic background music and AI narration. It currently has 73,000 likes on Douyin.

More and more creators are trying to make videos with AI, giving the whimsical ideas in their minds a chance to come true. For example, science fiction themes combined with traditional Chinese cultural settings create a sense of contrast and easily attract user attention.

Related reading: "Participating in the Beijing Film Festival, we made the first AI sci-fi short film in our lives (with the whole process dismantled)"

5. Cute plush texture icons

This kind of furry icon has recently become popular on Xiaohongshu. Related pictures and tutorials have received as many as 50,000 likes, and netizens have been reskinning their mobile apps with the cute new icons.

If you're familiar with Stable Diffusion, you only need to enter a text prompt and use ControlNet to control structure and color separately.

Prompt (colors can be replaced): Masterpiece, top view, (white, blue, fluffy, plush_hair, 3D art:1.4), solo, (placed on purple background:1.3), light and shadow, natural lighting, close-up, depth of field, minimalism, high quality, high detail, Sony FE GM, UHD
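Since the colors in the prompt are meant to be swapped, the substitution can be sketched as a small helper. This is only an illustrative sketch: the function name `build_plush_icon_prompt` and its parameters are hypothetical, it assumes `plush_hair` is the intended tag, and it assumes the `(terms:weight)` emphasis syntax used by common Stable Diffusion web UIs. It only builds the prompt string; it does not call Stable Diffusion or ControlNet.

```python
def build_plush_icon_prompt(colors=("white", "blue"), background="purple",
                            subject_weight=1.4, background_weight=1.3):
    """Assemble the plush-icon prompt with replaceable colors (hypothetical helper)."""
    # Icon colors come first inside the weighted group, followed by the texture tags.
    subject_terms = list(colors) + ["fluffy", "plush_hair", "3D art"]
    parts = [
        "Masterpiece",
        "top view",
        f"({', '.join(subject_terms)}:{subject_weight})",
        "solo",
        f"(placed on {background} background:{background_weight})",
        "light and shadow",
        "natural lighting",
        "close-up",
        "depth of field",
        "minimalism",
        "high quality",
        "high detail",
        "Sony FE GM",
        "UHD",
    ]
    return ", ".join(parts)


if __name__ == "__main__":
    # Example: a pink and yellow icon on a green background.
    print(build_plush_icon_prompt(colors=("pink", "yellow"), background="green"))
```

Called with no arguments, the helper reproduces the white and blue icon on a purple background from the prompt above. The weights 1.4 and 1.3 are taken from that prompt and can be adjusted in the same way.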
