Why can AI face-swap videos that anyone can make fool even technology professionals?

Reporter: Song Meilu    Editor: Zhang Haini

The legal representative of a technology company in Fuzhou was defrauded of 4.3 million yuan in 9 minutes by scammers using AI, and Mr. He of Anhui Province was defrauded of 2.45 million yuan in 9 seconds.

Meanwhile, livestream hosts wearing the faces of celebrities such as Yang Mi and Dilraba Dilmurat have appeared in livestream rooms selling goods, leaving viewers unable to tell real from fake.

A livestream host selling goods, suspected of using a face swap to resemble Yang Mi. Image source: video screenshot

Demonstration videos on social networking sites abound with clips of faces being swapped in seconds at the wave of a hand. Anyone can make a video with AI, and scammers are no exception.

AI music, AI painting, AI face swapping, AI voice changing... Without noticing it, each of us has stepped into a cyber world. Whether it is Douyin special effects such as #one-click unlock instantaneous universe# and #one-click makeup#, or the "consumption" of AI Sun Yanzi (an AI-cloned voice of the singer Stefanie Sun), all of it bears witness to AI becoming part of daily life.

While everyone enjoys the technology carnival, the other side of the technology is gradually coming into view: forgery technology and detection technology are locked in a "cat-and-mouse game". Which will seize the initiative in the end?

AI face swaps can be mass-produced: how low is the technical barrier?

AI face swapping is nothing new. As early as 2019, the face-swap app "ZAO" made the technique popular online. A large number of AI face-swap spoof videos appeared on online video platforms, and many celebrities were spoofed or became the subject of rumors.

Although "ZAO" was pulled from app stores shortly after launch over privacy violations and other issues, the technology's negative effects lingered, and this year's surge in the popularity of AI has set off a new wave of face swapping.

Xiao Zihao, co-founder and algorithm scientist of RealAI (Ruilai Zhihui), said that AI face swapping and AI voice changing mainly rely on deep synthesis technology. As deep synthesis techniques have been open-sourced and deep synthesis products and services have multiplied, the technical barrier to producing deep synthesis content has kept falling and the technology has gone "mainstream": ordinary people can now produce deep synthesis content from a small number of images, audio clips, and other samples using easy-to-use synthesis tools.

Xiao Zihao explained that there are currently two ways to swap faces in a video chat or livestream: one is to play a face-swap video prepared in advance, and the other is to feed in a face-swap video stream generated in real time. The first is already very cheap to produce, and publicly available, mature applications support it. No public application has been found for the second, but very mature technical means exist to support it. The time needed to make a video depends on factors such as equipment and computing power, and current technology can produce the swapped result as the video is being generated. With today's technical capabilities, a swap works best when the face in the original video is clear, unobstructed, and free of exaggerated movement.

AI practitioner Tang Hui also said that AI face swapping now has very low technical requirements. "A technical specialist can find an open-source model online and figure it out alone. If you are just collecting images of a person to generate a video, training is fast. You can make one in 20 minutes."

Searching for "AI face changer" in a mobile app store turns up many related apps. One of them, FacePlay, has been downloaded 240,000 times on iOS and offers templates based on film and television characters, photos, and comics. The app makes money by charging for templates, with weekly membership at 17 yuan and annual membership at 398 yuan. Other apps let users create a video simply by watching an advertisement.

Download page and payment screen of the face-swap app FacePlay. Image source: screenshot

Although keywords such as "face change" have been blocked on e-commerce platforms, relevant listings can still be found by searching for other related terms, most of them priced at a few dozen yuan.

Listing details for an AI face-swap product on an e-commerce platform. Image source: webpage screenshot

Short-video platforms such as Douyin have also launched simple AI face-swap templates, such as face-swap dances, outfit changes, and gender swaps. After importing a photo, the reporter was able to generate a video within seconds, but such videos are relatively imprecise, and flaws such as a poorly fitted face are occasionally visible when the head turns.

In addition, many large technology companies are developing related businesses. AI concept stock Wondershare Technology (SZ300624, share price 136.60 yuan, market value 18.809 billion yuan) has rolled out AI capabilities including AI face swapping, AI matting, AI noise reduction, and AI audio restructuring. According to media reports, on May 25 Wondershare Broadcast Explosion, a subsidiary of Wondershare Technology, fully launched an AI digital human customization service that supports custom digital human avatars, voice cloning, and custom video templates. Users only need to record a video of about 6 minutes and 20 sentences of usable audio to generate an exclusive digital human with a "real person's" appearance and a "real voice".

"At present, AI can already produce videos in large batches. It can make multiple videos that replace the same person, and it can also make videos that replace several people at once," Xiao Zihao said.

Labeling AI content: a "cat-and-mouse game"

"Technology is only going to get faster and faster. Maybe you can tell AI from a real person now, but what about in six months, or a year?" Tang Hui believes that AI technology is developing too quickly while too few people in everyday life understand it, which is why fraud has become so frequent.

Xiao Zihao also said that deep synthesis technology keeps evolving, the generated audio and video are becoming more and more realistic, and it is getting harder and harder for ordinary people to spot fakes with the naked eye.

According to media reports, Microsoft Chief Technology Officer (CTO) Kevin Scott said in an interview on the eve of the Build developer conference, which opened on May 23 local time, that Microsoft has spent three years researching a "media provenance system": encrypted watermarks are placed in AI-generated content, and software can decrypt them to obtain source information and help detect false information.
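To make the watermarking idea concrete, here is a minimal sketch in Python. It is not Microsoft's system, whose details the reports do not describe. It assumes a toy scheme in which a source string is hidden in the least significant bit of each 8-bit pixel value, and it leaves out encryption and robustness to compression entirely. The function names `embed_watermark` and `extract_watermark` and the sample source string are hypothetical.

```python
def embed_watermark(pixels, message):
    """Hide an ASCII message in the lowest bit of 8-bit pixel values (toy scheme)."""
    payload = message.encode("ascii")
    header = len(payload).to_bytes(2, "big")  # 16-bit length prefix
    # Flatten header + payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1
            for byte in header + payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | bit  # overwrite only the lowest bit
    return marked


def extract_watermark(pixels):
    """Recover the message written by embed_watermark."""
    def read_bytes(start_bit, count):
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length).decode("ascii")


# Synthetic 64x64 grayscale "image" flattened into a list (made-up data).
image = [(i * 37) % 256 for i in range(64 * 64)]
marked = embed_watermark(image, "generator=demo-model;id=0001")
print(extract_watermark(marked))
```

Each pixel changes by at most 1, so the mark is invisible to the eye, but a scheme this simple would not survive re-encoding or cropping. A production system would need a watermark that withstands such processing and a way to authenticate the recovered source information.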

From a technical standpoint, Xiao Zihao said his team has long been studying the automatic detection of deep synthesis content. Common methods include training detector models on datasets of forged content and identifying forgeries from inconsistencies between frames. Such methods can reach 99.9% accuracy on open-source datasets.

"The difficulty of prevention lies in the endless emergence of new forgery methods, the increasingly complex environment in which content spreads online, and the structural flaws of detection algorithms based on deep neural networks. Deepfake detection technology also faces 'strong adversarial pressure' and needs continuous updating and iterative optimization," Xiao Zihao said.

Xiao Zihao likened this to a "cat-and-mouse game": deep synthesis and detection each evolve through continual attack and defense, with each side learning to circumvent the other's previous generation of techniques. To seize the initiative in this contest, deepfake detection will need to integrate capabilities such as multimodal content forensics and digital-watermark-based traceability in order to identify fake content accurately and build a trusted content system.
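One of the detection cues mentioned above, inconsistency between frames, can be illustrated with a toy example. The sketch below is not RealAI's detector, and real systems use trained neural networks on face regions. It simply assumes grayscale frames given as nested lists and flags frame-to-frame jumps that are statistical outliers. The function names and the `z_threshold` default are hypothetical.

```python
from statistics import mean, pstdev


def frame_difference(prev, curr):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count


def flag_inconsistent_transitions(frames, z_threshold=2.0):
    """Return indices i where the change from frame i-1 to frame i is an outlier."""
    if len(frames) < 3:
        return []  # too few transitions to estimate what is "normal"
    diffs = [frame_difference(frames[i - 1], frames[i])
             for i in range(1, len(frames))]
    mu = mean(diffs)
    sigma = pstdev(diffs)
    if sigma == 0:
        return []  # perfectly uniform motion, nothing stands out
    return [i + 1 for i, d in enumerate(diffs) if (d - mu) / sigma > z_threshold]


# Synthetic clip: brightness drifts smoothly, except frame 10, which "glitches".
frames = [[[100 + t] * 4 for _ in range(4)] for t in range(30)]
frames[10] = [[200] * 4 for _ in range(4)]
print(flag_inconsistent_transitions(frames))
```

On this made-up clip the function flags the transitions into and out of the glitched frame. A simple threshold like this is easy for a forger to evade, which is part of why detection has to keep iterating.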

The "Provisions on the Administration of Deep Synthesis of Internet Information Services" explicitly require deep synthesis service providers to use technical measures to add labels, in a way that does not affect users' use of the content, to information generated or edited with their services. For services that can generate or significantly alter information content, providers must place a conspicuous label in a reasonable position or area of the generated or edited content to alert the public that it is synthetic, so that the public is not confused or misled.

The reporter noted that Bilibili has already added an AI-synthesis label to some videos, and that on May 9 Douyin released a platform specification and industry initiative on AI-generated content. The initiative calls on all providers of generative AI technology to label generated content prominently so that the public can judge for itself, and to adopt unified data or metadata standards for AI-generated content that other content platforms can easily recognize.

Bilibili's AI-synthesis label. Image source: screenshot

Xiao Zihao suggested that ordinary people who find a video call suspicious can deliberately ask the other party to perform certain actions, such as shaking their head vigorously or opening their mouth wide. If the scammer's technique is crude, flaws may show up around the edges of the face or the teeth and give away the AI face swap. He added, however, that this method is still unlikely to expose "high-level" fraudsters. Another option is to ask about several pieces of private information that only you and the person asking to borrow money would know, in order to verify the other party's identity.

AI face swapping is risky

Under the relevant provisions of the Copyright Law, if an AI face swap uses video footage of a performer, it may also infringe that performer's copyright. If the technology is used illegally, as in the Fuzhou case, where the perpetrator took over the WeChat account of Mr. Guo's friend and used an AI face swap, the act not only constitutes infringement but is also suspected of being a crime.

If an AI face-swap service is used in a livestream, the livestream merchant, the platform, and the technology provider all bear a degree of responsibility. In particular, under the Provisions on the Administration of Deep Synthesis of Internet Information Services, deep synthesis service providers and technical supporters that violate the provisions will be penalized by the relevant authorities, and where a crime is constituted, they will also bear the corresponding criminal liability.

In addition, the legal team of Wang Rongmei at Beijing Jingshi Law Firm pointed out that the ownership of copyright in AI-produced content remains in dispute.

First, there is disagreement over whether AI-generated content constitutes a work at all. Second, in practice, the legal nature of AI-generated output is also disputed. As a result, current law does not clearly address copyright in AI-generated content. However, Article 5 of the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comments) specifies who is liable when such content infringes: organizations and individuals that use generative AI products to provide services such as chat and text, image, or sound generation, including those that support others in generating text, images, sounds, and so on by providing programmable interfaces, bear the responsibility of the producer of the content that the product generates.

"AI is still an emerging field and is changing by the day. While it brings convenience to people's work and lives, criminals are also using the technology for illegal and criminal activities. At present, the provisions that apply to AI are scattered across the Civil Code, the Personal Information Protection Law, and the Cybersecurity Law, and they have not yet formed a systematic legal framework. We believe that as the technology is widely applied, laws and regulations will keep pace with the times, and the state will introduce more regulatory measures and gradually build a complete legal system," Wang Rongmei's legal team said.

Daily Economic News