AIGC leads the transformation of content production methods: debut is the peak, tomorrow is just around the corner

2022 is widely regarded as the first year of the AIGC era.

AIGC (AI-Generated Content) has matured to the point where deep neural networks (CLIP, GPT, diffusion models, etc.) can generate high-quality text, images, video, audio, and other forms of content.

Figure: AIGC leads the transformation of content production methods (Source: China Academy of Information and Communications Technology)

This shift from analytical AI to creative, generative AI has brought a digital revolution: content generated by AIGC is more realistic, refined, and diverse, and its generative power keeps astonishing people.

AIGC technology has matched or even surpassed human-level results on a number of public evaluation tasks, and it has also proven stable, efficient, and well received in practical applications.

AIGC can be expected to spread across industries, fields, and scenarios, giving users a better interactive experience and creating more value.

Source: Qubits

Text generation

It is applied in scenarios such as intelligent customer service, translation, automatic article writing, and social networking.

One example is the GPT (Generative Pre-trained Transformer) series: large-scale generative pre-trained language models developed by OpenAI that generate subsequent text based on the preceding context. The series includes ChatGPT, GPT-4, and others.

Transformer model

Audio generation

It is used in music creation, virtual streamers, voice assistants, and more.

Examples include DeepMind's WaveNet, OpenAI's Jukebox, and Meta AI's Voicebox.

WaveNet, for instance, trains a neural network on recordings of real speech and models the raw waveform directly, producing a human-like voice that sounds very realistic.

These models can generate speech or music in a target style directly from text, convert existing speech into a target style, and, combined with speech recognition, re-synthesize selected passages with different text.

Image generation

It is used in UI design, painting, animation, e-commerce scenes, game scenes, etc.

One example is DALL-E 2, an image generation model developed by OpenAI that combines a pre-trained CLIP model with diffusion models. Google has similar models, such as Palette, Imagen, and Muse.

Video generation

It is used for special-effects rendering in movies, games, and similar media.

One example is DeepFake, an artificial intelligence technique.

Using neural networks trained on large numbers of samples, it can replace one person's face (in a source picture or video) with another person's face (from a target picture or video), which is what is commonly called "AI face swapping".

Cross-modal generation

AI can generate content across different media types, such as text-to-image, text-to-video, and video-to-text.

By training deep learning models on multimodal data, AIGC can learn the associations and shared semantics between different media, enabling content generation across multiple media domains.

With all of that said,

Is AIGC harmless?

Although AIGC has made remarkable progress in text, audio, image, video, and multimodal content generation, it still faces many content security challenges and risks, particularly around content compliance and ethics.

Public information crisis

Some criminals use AI face swapping, AI voice changing, and similar functions to generate fake pictures, audio, and videos for illegal activities such as fraud and harassment.

Some users also use AIGC platforms to create vulgar or pornographic material (including veiled, suggestive, and borderline "soft" pornography), and even gory, violent, or politically sensitive images and audio, for dissemination. Such behavior not only harms the physical and mental health of adolescents but may also cause irreparable social harm.

Many artists have said they will protest and boycott AI painting models trained on large amounts of human artwork, arguing that the software has learned from, imitated, or copied their paintings without permission and that this constitutes infringement.

Privacy leakage

Some datasets and systems use users' photos and information for AI training without the users' permission, and this practice persists despite repeated bans.

Erroneous, low-quality, and harmful content is rampant

Text generation models such as GPT are trained on data from the Internet, and the quality of that training data cannot be guaranteed. At the same time, GPT is based on the Transformer model, which essentially predicts the next word. It therefore has no real understanding of the quality or meaning of the words themselves.

This leads to two serious problems:

First, model hallucination, in which the model may link completely unrelated events, people, and times, producing descriptions that do not match the facts at all;

Second, the AI alignment problem, in which generative AI produces vulgar, discriminatory, or reactionary content with no awareness of the harm that content may do to society.

For example, when you ask Bing, Microsoft's GPT-integrated search engine, for the name of the protagonist of The Shawshank Redemption, its answer is "Xiaoshuai". This is because the GPT training corpus contains a large number of low-quality texts that generically refer to movie protagonists as "Xiaoshuai".
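The effect of a skewed corpus on next-word prediction can be illustrated with a deliberately tiny sketch. This is a bigram frequency counter, not a Transformer, and the corpus, the function names train_bigram and predict_next, and the sentences are all hypothetical; it only shows how a predictor that follows frequencies repeats whatever its training text says most often.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the training text."""
    follow = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follow[prev][nxt] += 1
    return follow


def predict_next(follow, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follow.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus: one accurate sentence, three low-quality recaps.
corpus = [
    "the protagonist is andy",
    "the protagonist is xiaoshuai",
    "the protagonist is xiaoshuai",
    "the protagonist is xiaoshuai",
]
model = train_bigram(corpus)
prediction = predict_next(model, "is")
print(prediction)
```

Because the low-quality sentences outnumber the accurate one, the predictor picks the generic name. Real language models are far more sophisticated, but they are similarly shaped by what dominates their training data.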

Content security in the era of AIGC

As AIGC becomes more widely used, content generation will become more common, cheaper, and more convenient. AIGC-based gray-market activity will grow rapidly as a result, content review will become harder than ever, and regulators will tighten their supervision.

On April 11, 2023, the Cyberspace Administration of China (CAC) issued the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comments), which draws a bottom line for the industry on possible AIGC risks in terms of content and copyright.
On May 31, 2023, the State Council stated in its notice on the "2023 Annual Legislative Work Plan" that it would submit a draft AI law to the Standing Committee of the National People's Congress for deliberation.
On July 10, 2023, the Cyberspace Administration of China announced that the Interim Measures for the Administration of Generative Artificial Intelligence Services would take effect on August 15, 2023, to promote the healthy development and standardized application of generative AI.

What makes AIGC content security review difficult?

Generated content has exploded in volume, and modalities are merging

Content security has evolved from policing only specific words and images to continuously reviewing the many ways AIGC content can represent things, including a large number of people, organizations, and even symbols. This is a huge challenge in terms of review workload, recall, and accuracy.
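As a rough illustration of the recall and accuracy measures mentioned above, the sketch below scores a review system against human labels. The function name review_metrics and the sample data are hypothetical; precision is included as well because it is commonly reported alongside recall.

```python
def review_metrics(predicted, actual):
    """Score a reviewer.

    predicted: list of bools, True = the system flagged the item.
    actual:    list of bools, True = the item really violates policy.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)      # flagged but harmless
    fn = sum(1 for p, a in pairs if not p and a)      # missed violations
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly passed
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical sample: five items, three of which truly violate policy.
predicted = [True, True, False, False, True]
actual = [True, False, True, False, True]
metrics = review_metrics(predicted, actual)
print(metrics)
```

Low recall means violations slip through; low precision means harmless content is blocked. As content volume grows, both kinds of error become more costly.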

Vague review criteria

Existing review processes struggle to define content such as copyright infringement and discrimination in a standard way, which has left vague gray areas in parts of the AIGC industry.

Review models must be updated frequently

AIGC keeps iterating and evolving, its output is increasingly vivid and rich, and it is becoming harder for people to tell AI-generated content from human-created content. As AIGC grows more popular, such content also spreads faster and faster.

This requires content moderation technology to keep pace with, and even stay ahead of, rapidly developing AIGC technologies, so that it can identify fabricated or false information and inappropriate content and perform sensitive word detection up front, rather than relying only on after-the-fact defense.
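Sensitive word detection, in its simplest form, can be sketched as a normalized keyword match run before content is published. The blocklist entries and the function names normalize and find_sensitive_words below are hypothetical, and a production system would typically combine this kind of matching with trained classifiers rather than rely on keywords alone.

```python
import re
import unicodedata


def normalize(text):
    """Fold full-width and compatibility characters, lowercase, and drop
    separators that are often inserted to evade keyword filters."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"[\s\-_.*]+", "", text)


def find_sensitive_words(text, blocklist):
    """Return the blocklist entries found in `text`, sorted alphabetically."""
    cleaned = normalize(text)
    return sorted(word for word in blocklist if normalize(word) in cleaned)


# Hypothetical blocklist, for illustration only.
BLOCKLIST = {"fraud", "fake id"}

hits = find_sensitive_words("Buy a F-A-K-E  I.D today", BLOCKLIST)
print(hits)
```

Stripping separators catches simple obfuscation, at the cost of occasional false matches across word boundaries. That trade-off is one reason keyword matching alone is not enough.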

Facing AIGC content security risks,

how can content be monitored and protected?

On the one hand, enterprises and individuals should hold to sound values as a bottom line and not abuse, leak, or illegally use data. On the other hand, industry bodies, legal regulators, the market, and society need to strengthen their understanding, control, and review of AI-generated content.

In the review and monitoring of text, pictures, audio and video, documents, and other content, Jiaoshu Technology has distinctive strengths.

Jiaoshu Technology has worked in the content security field for many years and has carried out continuous research and development in artificial intelligence algorithms, its own AI platform, and intelligent software and hardware. Its products can be applied to sensitive word detection, interception of illegal content, LED screen protection, filtering of public display content, network content supervision, and more. They support calls to their recognition capabilities, adapt to a variety of online and offline application scenarios, and meet content review needs across industries.