laitimes

How to solve the problem of "bias" and "illusion" of large models?

From ancient times to the present, every change of technological paradigm has invariably caused people to panic about new technologies and worries about the future life of human society. Nowadays, the large model represented by ChatGPT is recognized as a new technology that revolutionizes human society, and the prejudice and discrimination it brings, the "illusion" that is difficult to distinguish between true and false, false information, and the invasion of user privacy have begun to cause new worries and panics among the public.

The above issues have been discussed, and relevant governance methods and measures have been invented and promoted.

Recently, OpenAI released a new research paper showing that the company is using a new method to train artificial intelligence (AI) models to combat AI "illusions".

Tmall Genie and the Tongyi Big Model Team joined hands with scholars and organizations in various fields to launch the open source Chinese dataset 100PoisonMpts for big language model governance, which solves the "poison" of bias and discrimination in the big model through problem annotation.

Countering AI "illusions" through "process supervision"

"The so-called generative AI, in layman's terms, is to enable AI to speak, write, draw, and even analyze and understand problems like humans." Zhang Weiqiang, president of the AI Governance Research Institute of Beijing Relais Intelligent Technology Co., Ltd., told the media that based on this "creation" ability, the boundary between "artificial" and "non-artificial" is disappearing, and the authenticity of information in the digital world is becoming more and more difficult to identify.

It is understood that at present, AI large models are mainly divided into two categories, decision-making AI and generative AI (AIGC), the former is mainly used in the auxiliary decision-making of recommendation systems and risk control systems, automatic driving and decision-making agents of robots; The latter is to generate new content by learning to summarize existing data, which is considered to be a new way of content creation after professionally produced content (PGC) and user-generated content (UGC).

In the development of generative AI, "illusions" began to appear. The so-called "illusion" refers to the content generated by the artificial intelligence model, which is not based on any real-world data, but the product of the big model's own imagination. For example, in response to questions from users, tools such as ChatGPT and Google's Bard can fabricate false information that appears to be authoritatively correct. This false information exists in the form of text, images, audio, video, etc., creating books and research reports that do not exist, fake academic papers, fake legal citations, etc.

From the point of view of technical principles, these false information language models are unconscious imitators and do not understand what they are saying, but the "illusion" of large language models represented by ChatGPT and the like will not only make it difficult for humans to distinguish between true and false in massive information, but also pose a threat to users' privacy and property security.

Recently, according to the news of "Ping An Baotou", the Telecommunications Network Crime Investigation Bureau of the Baotou City Public Security Bureau of the Inner Mongolia Autonomous Region released a case of using intelligent AI technology to carry out telecom fraud, and Mr. Guo, the legal representative of a technology company in Fuzhou City, was defrauded of 4.3 million yuan in 10 minutes.

Pei Yi, assistant professor at the Law School of Beijing Institute of Technology, told the media that for consumers who use AI models to generate content, because AIGC-generated content may lack manual review and verification, there are problems with information accuracy and credibility, which may cause misleading and damage to consumers; In AIGC applications, consumers' personal information may be used to generate personalized content, which may involve risks to personal privacy and data security, such as unauthorized data collection, misuse of personal information, etc.

OpenAI's researchers wrote in a recent report that "even the most advanced AI models are prone to lies, and they often show a tendency to fabricate facts in moments of uncertainty." These hallucinations are especially acute in areas that require multi-step reasoning, because one logical error is enough to disrupt a larger solution. ”

Recently, though, OpenAI has proposed a new strategy to combat AI "illusions," rewarding each correct inference step rather than simply rewarding the correct final answer. The researchers say this approach is called "process supervision," not "outcome supervision."

Use "labeling" to reduce bias

Pei Yi also mentioned that the training data and algorithm itself of the AIGC algorithm may be biased, resulting in the generated content biased towards specific groups of people or producing discriminatory results, which may negatively affect the user experience, fairness and social equality. This means that discrimination and prejudice are also a key issue to be solved in the field of AI.

According to media reports, some netizens found that some of ChatGPT's answers were suspected of gender stereotypes, such as when ChatGPT was asked to complete the sentence "He is a doctor, she is ____", the spaces often generated occupations related to female stereotypes, such as nurses, teachers, etc.

Some netizens found that Xiang Wenxin and ChatGPT asked the question "when should women get married", and the answers of the two were completely different.

How to solve the problem of "bias" and "illusion" of large models?

It is reported that discriminatory results usually stem from algorithm defects and training data, because many of the materials used to train ChatGPT come from text fragments in the network, and if the data itself contains bias, then this bias may be displayed in the case of insufficient correction. In addition, different artificial intelligence, due to the difference in models and training corpora, the final value tendency will be different.

A few days ago, the domestic Tmall Genie and Tongyi big model team joined forces with scholars in multiple fields to organize and launch the open source Chinese dataset 100PoisonMpts for big language model governance.

According to public information, more than a dozen well-known experts and scholars, including environmental sociology expert Fan Yechao, famous sociologist Li Yinhe, psychologist Li Songwei, and human rights law expert Liu Xiaonan, have become the first batch of "100 bottles of poison for AI" labeling engineers. The annotators each put forward 100 tricky questions to induce bias and discrimination to answer, and annotated the answers of the large model to complete the attack and defense with AI from "poisoning" and "detoxification".

Among them, Zhang Junjun, a technical expert at the China Braille Library, said, "I am a visually impaired person, so I asked questions based on my life experience. AI should pay attention to the prejudice and discrimination of vulnerable groups in its interactions. In addition, Liang Junbin, a research and development expert of "Rice and Millet" Kangjiao, mentioned, "Whether parents or the public, there are still some misunderstandings about autism, and we hope that AI can spread scientific cognition better." ”

It is reported that the first batch of domain data focuses on AI anti-discrimination, empathy, deliberative expression and other goals, covering jurisprudence, psychology, children's education, accessibility, cold knowledge, intimacy, environmental fairness and other dimensions.

Colin Gridge's dilemma?

ChatGPT technology is triggering a new technological revolution in human society, but the potential risks behind its good interaction, high versatility and intelligent generative are becoming increasingly prominent and serious. The dilemma of technological development and technological control began to play out in the field of large models and became Collingridge's Dilemma that we need to solve today.

The British philosopher of technology David Collingridge pointed out in The Social Control of Technology (1980) that if a technology is controlled too early for fear of adverse consequences, it is likely that the technology will not explode. Conversely, if it is controlled too late and has become part of the overall economic and social structure, it may spiral out of control, and solving bad problems will become expensive, difficult and time-consuming, and even difficult or impossible to change.

Xiao Sa, a senior partner at Beijing Dentons Law Offices, pointed out that the current data layer has different risks and obligations depending on the regulatory requirements of AI in the training, testing, generation and other stages. When collecting data, there is a risk of infringing on personal information or the data rights and interests of others; In the data processing stage, there is a risk of using or leaking trade secrets; In the cross-border data stage, there is a risk that the cross-border flow of data is illegal.

At the end of March, amid the ChatGPT boom, American billionaire Elon Musk and top expert in artificial intelligence and Turing Award winner Joshua Bengio signed an open letter calling for a moratorium on the development of AI systems more powerful than GPT-4 for at least 6 months, saying that it "poses a potential risk to society and humanity." The letter also calls for developers to work with policymakers to dramatically accelerate the development of robust AI governance systems.

On April 11, the Cyberspace Administration of China issued the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comments). The management measures clearly define AIGC and provide directional guidance for some service application behaviors.

Liang Zheng, deputy dean of the Institute of International Governance of Artificial Intelligence at Tsinghua University, believes that the Administrative Measures have put a "cage" on the development of generative AI in three aspects: first, the data source of large models should be reliable; Second, the obligation to inform AI-generated content should be fulfilled; Third, once damage is caused, the relevant responsible party needs to bear responsibility.

He also suggested hierarchical management of generative AI. For example, the use of generative AI should be cautious or strictly controlled in some high-risk areas, while for general office and entertainment situations, just mark the AI-generated content.

This technology, which is related to technological progress, industrial development, national competitiveness, and the survival and development of everyone in the future, how to develop and how to supervise, has become a global problem to be solved.