
Full coverage of values and privacy protection CAC plans to "set rules" for generative AI

Author: Former Space Bomb

On April 11, the Cyberspace Administration of China (CAC) drafted and released the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comments), and launched a one-month consultation with the public.

The draft Measures contain 21 articles. They apply not only to entities providing generative AI services but also to organizations and individuals using those services, and they cover the value orientation of generative AI output, the training principles service providers must follow, and the protection of rights such as privacy and intellectual property.

Since the emergence of GPT-style generative natural-language models and products, the public has experienced the rapid progress of artificial intelligence but has also seen security risks exposed, including biased and discriminatory content, data leaks, privacy violations, and AI-enabled fraud. Globally, regulation of artificial intelligence is gradually becoming a trend.

In China, once the Measures for the Administration of Generative AI Services are promulgated, domestic providers of generative AI models and products will no longer be able to engage in disorderly, cutthroat competition, and adopters will have a clear set of norms to follow when using generative AI.

Setting "off-limits zones" for generated content

"These Measures apply to the research, development, and use of generative artificial intelligence products to provide services to the public within the territory of the People's Republic of China.

"'Generative artificial intelligence' as used in these Measures refers to technologies that generate text, pictures, sounds, videos, code, and other content based on algorithms, models, and rules."

In Article 2, the Measures released by the Cyberspace Administration of China (CAC) on April 11 clarify which entities the Measures apply to and how "generative AI" is defined.

Judging from this article, companies such as Baidu, Alibaba, Tencent, and Huawei, all of which have publicly stated that they have generative models and products, will fall within the scope of the Measures if they provide services to users in China, and users of the relevant products and services will also need to comply with the Measures' provisions.

The Measures also emphasize that the state supports independent innovation, promotion, application, and international cooperation in basic technologies such as AI algorithms and frameworks, and encourages the priority use of secure and trusted software, tools, and computing and data resources.

Under this premise, the Measures delineate "forbidden areas" for providers of generative AI products or services, including the content generated and the basic principles of research and development.

In terms of content, the Measures require that content generated by generative artificial intelligence reflect the core socialist values and not contain content that subverts state power, overthrows the socialist system, incites separatism, undermines national unity, promotes terrorism or extremism, promotes ethnic hatred or discrimination, or contains violence, obscenity or pornography, false information, or content that may disrupt economic and social order. Content generated with generative AI must be truthful and accurate, and measures must be taken to prevent the generation of false information.

In terms of research and development, the Measures require providers to take measures to prevent discrimination on the basis of race, ethnicity, belief, nationality, region, gender, age, occupation, and the like during algorithm design, training-data selection, model generation and optimization, and the provision of services.

Judging from these requirements, the Measures largely cover the safety and ethics issues that users of large natural-language-model products on the market have exposed, including discriminatory bias and false information.

Problematic content produced with generative AI has repeatedly appeared on the Internet.

For example, ChatGPT has given users step-by-step answers when asked "how to shoplift", albeit with a reminder that shoplifting is illegal; its "role-playing" ability has been induced by users to answer as DAN (Do Anything Now), producing replies laced with foul language; and fake news generated in ChatGPT tests has spread in China and had to be debunked.

Microsoft's chatbot integrated into the Bing search engine was reported by overseas journalists to have insulted users, and the AI image-generation application Midjourney has been used to create fake pictures such as "the Pope in a Balenciaga down jacket" and "Musk dating General Motors' CEO"; some people have even used it to fabricate scenes of non-existent earthquakes and solar-storm disasters.


Fake image of the Pope (left) and Musk

In response to false information and the identification of AI-generated content, the Measures require providers to "ensure the authenticity, accuracy, objectivity, and diversity of data" at the source. Generated pictures, videos, and other content must be labeled in accordance with the Provisions on the Administration of Deep Synthesis of Internet Information Services. Where manual annotation is used in developing generative AI products, the provider must formulate clear, specific, and operable annotation rules that meet the requirements of the Measures, conduct the necessary training for annotation personnel, and sample-check the correctness of annotated content.

By setting no-go zones for generative AI content, Chinese regulators are in effect requiring companies that build large models and products to keep their pre-training and data in check.

Emphasis on data sources and personal information protection

In addition to requiring that generated content be correctly value-oriented, conform to social morality, comply with the law, and avoid discrimination, the Measures also set requirements on pre-training, data sources, personal information protection, and other rights and interests related to generative AI.

For example, the Measures require providers to be responsible for the legality of the sources of pre-training and optimization-training data for generative AI products, and to ensure the data contains no content that infringes intellectual property rights. Where data contains personal information, the consent of the personal information subject must be obtained. Providers also bear an obligation to protect users' input information and usage records: they must not illegally retain input information from which a user's identity can be determined, must not profile users based on their input information and usage, and must not provide users' input information to others.

Data-infringement problems arising from generative AI do exist. For example, when users turn to conversational bots for work tasks, they inevitably upload company information, and carelessness can easily lead to the leakage of trade secrets. Earlier, South Korean electronics giant Samsung restricted employees' questioning of ChatGPT after internal data was leaked through employees' interactions with the application.

The Measures target not only providers of generative AI products and services; they also lay down principles for users of those products and services.

For example, users must not use generated content to damage others' image, reputation, or other legitimate rights and interests, and must not engage in commercial hype or improper marketing.

Since the Measures were formulated in accordance with laws and administrative regulations such as the Cybersecurity Law of the People's Republic of China, the Data Security Law of the People's Republic of China, and the Personal Information Protection Law of the People's Republic of China, these laws and regulations will apply to violations of the Measures, including infringement of intellectual property rights, infringement of personal information and other illegal acts.

Of the 21 articles in the Measures, 13 are explicitly aimed at "providers", that is, organizations and individuals that use generative AI products to provide services such as chat and text, image, and voice generation.


There are three ways for the public to give feedback

It can be seen that once the Measures are officially promulgated, domestic enterprises building generative models and products, as well as their adopters, will have to act within the rules. According to the CAC's official website, the public can submit feedback through three channels, and the deadline for feedback is May 10, 2023.