The Deep Synthesis Application Trend Report is released, Expert: Everything could be faked in the future

Olympic AI sign-language anchors, virtual idols, face-swapped celebrities... How far has the deep synthesis technology behind these applications evolved? What is the relationship between the red-hot "metaverse" and deep synthesis? What technical and ethical challenges does the regulation of deep synthesis face?

On February 18, the "Ten Trends in Deep Synthesis Applications Report" (2022) was released in Beijing at "Safety Controllability and Ethical Governance of Artificial Intelligence", a sub-forum of the Second Great Wall Engineering Science and Technology Conference. A number of experts at the meeting discussed the ethical issues raised by deep synthesis.

1

Deep synthesis content is growing rapidly, and related research keeps advancing

Deep synthesis technology refers to technology that uses generative synthesis algorithms, represented by deep learning and virtual reality, to produce text, images, audio, video, virtual scenes and other information. In 2017, a user named "Deepfakes" shared a face-swapped pornographic video on the US website Reddit, bringing deep synthesis technology to public attention.

The Report shows that the volume of deep synthesis content produced and disseminated has risen rapidly in recent years. The number of newly released deep synthesis videos in 2021 was more than 10 times that of 2017. Attention to deep synthesis content has also grown exponentially: taking video likes as an example, newly released deep synthesis videos received more than 600 million likes in 2021.

Data description: 10 Chinese and English keywords such as "Deepfakes" were searched on 10 domestic and foreign platforms (iQiyi, Tencent Video, Youku, Bilibili, Douyin, Kuaishou, Weibo, YouTube, Twitter, TikTok), and the results were counted after deduplication by URL. Figure from the "Ten Trends in Deep Synthesis Applications Report" (2022)
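The counting method in the data description can be illustrated with a minimal sketch. The Report does not publish its code, so everything here is an assumption for illustration: the function names, the `(platform, keyword, url)` record layout, and the URL normalization rule (lowercase host, no fragment, no trailing slash) are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Assumed normalization: lowercase scheme and host, drop fragment and trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def count_unique_videos(results):
    """Count search hits per platform, keeping only the first hit for each URL.

    `results` is an iterable of (platform, keyword, url) tuples (hypothetical layout).
    The same video found under several keywords is counted once.
    """
    seen = set()
    counts = {}
    for platform, _keyword, url in results:
        key = normalize_url(url)
        if key in seen:
            continue  # duplicate URL, already counted
        seen.add(key)
        counts[platform] = counts.get(platform, 0) + 1
    return counts


# Made-up sample records; the URLs are placeholders, not real search results.
sample = [
    ("YouTube", "Deepfakes", "https://example.com/v/1"),
    ("YouTube", "deepfake", "https://EXAMPLE.com/v/1/"),
    ("Bilibili", "Deepfakes", "https://example.org/v/2#t=10"),
]
unique_counts = count_unique_videos(sample)
```

Deduplicating on the URL rather than on the title means a video that matches several keywords is counted only once, which is what the data description implies.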

Research results are the underlying driving force behind the spread of deep synthesis content. In 2014, researchers at the University of Montreal proposed the Generative Adversarial Network (GAN), which raised the fidelity of generated data to a new level and greatly lowered the barrier to deep synthesis. The Report shows that the number of papers in the field of deep synthesis continues to grow every year. These papers cover synthesis across different modalities such as images, speech, and text. Research on image generation accounts for the largest share at 64%, while audio and text account for 24% and 12%, respectively.

Beyond published papers, open source projects in the field of deep synthesis also show a continuous upward trend. Open source projects have driven the continuous upgrading and iteration of deep synthesis methods in terms of synthesis quality and production efficiency.

The Report also shows that application scenarios for deep synthesis keep expanding, including the restoration of historical photos, AI sign-language anchors, and virtual idols. In film and television production in particular, deep synthesis technology has in recent years become a way to salvage works jeopardized by the misconduct of a small number of performers.

In addition, more and more enterprises and institutions are beginning to use deep synthesis technology to provide products and services to the public. The Report shows that image and video products were the most common in the early stages of deep synthesis applications, but because of their uneven quality and tendency to infringe on user privacy, their number gradually declined once regulatory norms were in place. In audio, speech synthesis has become an important part of human-computer interaction and is widely used in smart hardware, intelligent customer service, voice navigation, audiobooks, robots, voice assistants, automated news broadcasts and other scenarios. In text, deep synthesis is increasingly applied in news reporting, poetry writing, chat and Q&A, where it has shown great creative efficiency and future potential.

The Report predicts that as technologies such as automatic data generation, whole-body synthesis, and 3D shaping gradually mature, a new setting for human life will develop with deep synthesis technology as its cornerstone. The metaverse is a future virtual digital space for humans built on deep synthesis technology, which "completes the multiple replication and extension of real space and time, jumps out of the limitations of traditional physical space, and provides a new world where virtual people, natural people and robots are close to reality and transcend reality."

2

The negative risks of deep synthesis are intensifying, and regulation is becoming a trend across countries

The Report argues that as deep synthesis technology penetrates all areas of social life, the negative risks of deep synthesis content keep growing and are causing substantial harm. With deep synthesis technology being open-sourced and deep synthesis products and services multiplying, the technical barrier to producing such content keeps falling, and the technology has become available to ordinary people. Illegal acts such as using fake video and audio made with deep synthesis to frame, defame, defraud, and extort people are not uncommon.

Deep synthesis technology will also have a more profound impact on the dissemination of information. The Report finds that deep synthesis is gradually moving human communication into an era of "deep post-truth". First, "deep forgery" profoundly affects how the news records the truth, and the difficulty of identifying false content undermines fact-checking. Second, at critical moments such as major social emergencies or political events, malicious use of deep synthesis can exploit social media to spread false information virally across the Internet. Third, in the release and follow-up of information on everyday events, deeply falsified information can cause repeated reversals of public opinion and intensify conflicts between different social groups. What calls for vigilance is that malicious fake content made with deep synthesis usually caters to the public's curiosity and has a strong ability to shape perceptions.

The Report also points out that identifying deep synthesis content faces technical challenges. New forgery methods emerge in an endless stream, and detection algorithms based on deep neural networks have structural defects, so deepfake detection technology faces "strong confrontation" and requires continuous updating and iterative optimization. This resembles a "cat and mouse game": synthesis and detection each evolve by continually learning from attack and defense, and each generation circumvents the adversarial techniques of the previous one. Academia and industry have invested heavily in research and development of identification and detection technology, and many research institutions and technology companies at home and abroad have launched detection products.

As the negative impact of deep synthesis grows, establishing regulatory mechanisms has become a trend in countries around the world. The European Union tends to regulate deep synthesis within its existing legal framework. In the United States, some states, such as California, Virginia and Texas, have passed formal laws regulating "deep forgery". Singapore has also introduced a dedicated bill to clarify the responsibilities of individuals and platforms. In China, the "Regulations on the Administration of Online Audio and Video Services" issued in January 2021 specifically prohibit the use of deep learning technology to produce and disseminate false news information. The Provisions on the Administration of Deep Synthesis of Internet Information Services (Draft for Solicitation of Comments) is a dedicated set of management provisions that is systematic, targeted and operable.

3

Expert: Everything of value in the future may be forged

In the face of current challenges, how can the application of deep synthesis technology be regulated and its negative impact mitigated? A number of experts gave their views from the perspective of ethics and governance.

Xue Hui, head of Alibaba's security perception and cognitive intelligence department, sees two main difficulties. First, because deep synthesis technology has great commercial value, it cannot simply be banned across the board. An "inclusive and prudent" attitude is needed, but determining the boundaries of supervision is a problem. Second, deep synthesis involves a continuous contest between attack and defense, in which the attacker often needs to find only a single point to break through while the defense lags behind.

Tao Jianhua, a researcher at the Institute of Automation of the Chinese Academy of Sciences, pointed out that the scope and boundaries of the concept of deep synthesis are currently unclear, which makes regulation difficult. "Does anything produced by deep learning methods count as deep synthesis? I think this is debatable."

In addition, he argues, the users of deep synthesis should be managed more effectively, rather than placing too many constraints on its developers. Many artificial intelligence technologies cut both ways; he compares technology to a knife, whose impact depends on how it is used. Many of the earliest researchers in deep synthesis were motivated by entertainment and by improving people's lives. For example, some wanted a machine to learn a mother's voice so that it could read aloud to her baby, which is a way of improving life. But the emergence of malicious attack tools cannot be ruled out. Therefore, he said, the regulation of technology should still be carried out in an open manner.

Ren Kui, dean of the School of Cyberspace Security at Zhejiang University, raised the issue of insufficient data sets. He explained that current deep synthesis technology mainly targets people, so training deep synthesis detection models requires a large amount of face data. But face data and audio data are highly sensitive personal information and are difficult to obtain. He suggested that highly credible non-profit institutions could curate the data and allow qualified research institutions to take part, so that the value of the data is maximized and put to positive use.

However, future deep synthesis scenarios may be more complex. In Ren Kui's view, deep synthesis will not stop at simple audio, images and video but will be used for many kinds of forgery. Nor will it be limited to forgery in digital space: forgery in physical space may be more deceptive and more destructive. "Take a critical scenario like autonomous driving: here I might fake a scene, which could be digital or could be merged with the physical world. If we think a little further, to concepts such as the metaverse, it is not necessarily information about people that gets falsified; everything of value may be forged, and here deep synthesis may leave a lot of room for imagination, for use and for attack."

Tian Tian, CEO of Beijing Ruilai Smart Technology Co., Ltd., believes that the essential problem of deep forgery is insufficient transparency. The technology has undermined the traditional notion that "seeing is believing", so it is particularly important to improve people's understanding of deep synthesis technology. "For the average audience, the bar for understanding the problem needs to be lowered, so that people recognize what deep synthesis is, or have simple tools to judge that something is synthetic. Only when the threshold is lowered to the point where all audiences can recognize, discuss, and understand this issue within a common framework can the technology develop in a relatively healthy and benign way, and its application be expanded on a larger scale," he said.

Written by: Nandu reporter Li Yaning
