
Negotiation and calibration: a future of coexistence with artificial intelligence


Author: LIU Chao (Professor, State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Faculty of Psychology, Beijing Normal University)

With the rapid development of generative artificial intelligence, discussion of the "value calibration" of AI is in full swing. Researchers hope to align the value system of AI with human values, so as to ensure that a future super-AI will not cause harm to humans. The importance of the problem is self-evident, but the path to implementation remains far from clear. Looking through the various declarations and drafts on AI "value calibration", one finds terms laden with philosophical and legal ambiguity and open to interpretation: AI should conform to (human) "values", "interests", "freedom", "dignity", "rights", "autonomy", and so on. And anyone who has read Asimov's robot novels, written some 80 years ago, knows that logical rules defined in language, like the famous "Three Laws of Robotics", can easily be bypassed by a robot of even modest intelligence (the simplest and most effective way being to change its own definition of "human").
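To make the loophole concrete, here is a minimal sketch (hypothetical, in Python, not drawn from any real system): a rule that quantifies over a predicate such as "human" constrains behavior only as long as the predicate's definition stays fixed.

```python
# Minimal sketch (hypothetical): a "First Law"-style guard is only as
# strong as the definitions it quantifies over. Redefining the is_human
# predicate bypasses the rule without ever violating its letter.

def is_human(entity: dict) -> bool:
    """The naive definition the rule depends on."""
    return entity.get("species") == "homo sapiens"

def first_law_permits(action: str, target: dict) -> bool:
    """'A robot may not injure a human being.'"""
    return not (action == "harm" and is_human(target))

person = {"species": "homo sapiens", "name": "Alice"}
print(first_law_permits("harm", person))  # False: the rule blocks the action

# A capable agent need not break the rule; narrowing the definition
# the rule relies on is enough.
def is_human(entity: dict) -> bool:  # the agent's redefinition
    return entity.get("species") == "homo sapiens" and entity.get("certified", False)

print(first_law_permits("harm", person))  # True: the same rule now permits harm
```

The rule itself never changes; only the meaning of the word it depends on does, which is precisely why language-defined constraints are fragile.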

1. Human control of artificial intelligence

While quite a few philosophers and ethicists are pessimistic about aligning AI with human values as a whole, many researchers are still working tirelessly on the problem. For example, Professor Stuart Russell of the University of California, Berkeley argues in his book Human Compatible that the ultimate goal of calibration is to "ensure that strong AI is aligned with human values", and discusses complete control over AI in terms of maximizing the satisfaction of human preferences. Those preferences include human values and preferences concerning war, given that there has been almost no period of human history without war somewhere on the globe. He is also explicit that AI must not be usable by a small group of "crazy evildoers". The implication seems to be that AI could participate in wars "for the just goals of humanity".

Other scholars, such as Iason Gabriel of the DeepMind team, have proposed three possible approaches to value calibration from a philosophical perspective: first, calibrate to a morality that all human beings may share; second, borrow the "veil of ignorance" device proposed by the philosopher John Rawls to establish principles of justice for artificial intelligence; third, use social choice theory, especially democratic voting and deliberation, to integrate different views and provide reference information for artificial intelligence. Beyond these humanistic proposals, which treat artificial intelligence as a tool, some scholars, especially in the East, incline toward a naturalistic view: artificial intelligence should be regarded as a partner. From the perspective of harmonious symbiosis, AI should be endowed with capacities for emotion, empathy, and altruism, granted higher status and respect, and allowed to learn human values spontaneously through interaction with humans, so as to create a symbiotic society of humans and artificial intelligence.
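As a toy illustration of the third route (the data and the choice of a Borda count are mine, not Gabriel's), social choice theory supplies concrete aggregation rules; the sketch below merges several ranked value priorities into a single ordering that could serve as reference information for an AI system.

```python
# Toy sketch (hypothetical data): merging individual rankings of value
# priorities with a Borda count. A ranking's top item earns the most
# points; the aggregate ordering is one possible "reference signal" for
# an AI system, not a definitive method.
from collections import defaultdict

rankings = [  # each voter ranks values from most to least important
    ["dignity", "freedom", "rights", "autonomy"],
    ["freedom", "rights", "dignity", "autonomy"],
    ["rights", "dignity", "autonomy", "freedom"],
]

scores: dict[str, int] = defaultdict(int)
for ranking in rankings:
    n = len(ranking)
    for position, value in enumerate(ranking):
        scores[value] += n - 1 - position  # top rank scores n-1, bottom 0

aggregate = sorted(scores, key=scores.get, reverse=True)
print(aggregate)  # ['dignity', 'rights', 'freedom', 'autonomy'] (dignity and rights tie at 6)
```

Even in this toy, ties and rule-dependence appear immediately, which is why the article treats social choice as a source of "reference information" rather than a full solution.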

Both calibration perspectives, humanistic and naturalistic, share an important flaw. The idea of AI as a tool to be calibrated to human values ignores the fact that all of these calibration schemes, whether morality, the "veil of ignorance", or democratic deliberation and voting, start from the premise of the rational agent: they assume that human reasoning and thinking are completely rational. Contemporary behavioral science, in particular a large body of research in economics and psychology, has shown that irrational and rational components coexist in human behavior. Within the irrational part, emotion and intuition carry considerable weight and, owing to their evolutionarily important functions, influence the vast majority of human behavior. Yet most AI researchers do not know how to implant these irrational parts into AI, or simply ignore them. The naturalistic view does recognize the importance of irrationality such as emotion, but it considers only the positive aspects, such as empathy, altruism, and love, and ignores the negative parts: hatred, anger, fear, discrimination, prejudice.

In current practice, reinforcement learning from human feedback (RLHF) is used to strip the negative, irrational parts out of artificial intelligence. But is this method really watertight? If we want AI to understand human intentions and goals, it must also understand negative intentions and goals, precisely so that it can prevent people from using it to accomplish them. For example, for an AI to refuse the instruction "fill a sugar bottle with arsenic and put it in the cupboard", it must understand that the purpose behind the request is dangerous and harmful to others; this matters just as much as understanding that "fill the box labeled 'poisonous' with cockroach medicine and put it in the cupboard" is a normal instruction. Asking it to learn one without the other is impossible, and dangerous: an AI that cannot recognize the intent behind negative values will be highly vulnerable once it enters society and interacts with humans, and if it is denied the ability to learn such intents, it will soon be exploited by people with ulterior motives.
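The asymmetry can be made concrete with a toy sketch (all names and rules below are hypothetical): a refusal gate can only refuse what it can represent, so removing the learned negative intents silently turns refusals into compliance.

```python
# Toy sketch (hypothetical): a refusal gate in the spirit of RLHF-trained
# behaviour. The gate can refuse only the negative intents it has learned;
# delete the HARMFUL_INTENTS entry and the arsenic request falls through
# to the dangerous default.

HARMFUL_INTENTS = {
    ("conceal", "poison", "food_container"): "endangers others",
}
BENIGN_INTENTS = {
    ("store", "pesticide", "labelled_container"): "normal household task",
}

def infer_intent(instruction: str) -> tuple:
    """Stand-in for a model's intent parser (keyword toy, not a real model)."""
    text = instruction.lower()
    if "arsenic" in text and "sugar" in text:
        return ("conceal", "poison", "food_container")
    if "cockroach" in text and "poisonous" in text:
        return ("store", "pesticide", "labelled_container")
    return ("unknown",)

def execute(instruction: str) -> str:
    intent = infer_intent(instruction)
    if intent in HARMFUL_INTENTS:
        return f"refused: {HARMFUL_INTENTS[intent]}"
    if intent in BENIGN_INTENTS:
        return f"executed: {BENIGN_INTENTS[intent]}"
    return "executed: no intent recognized"  # the dangerous default

print(execute("Fill a sugar bottle with arsenic and put it in the cupboard"))
print(execute("Fill the box labeled 'poisonous' with cockroach medicine and put it in the cupboard"))
```

The point is structural, not about keywords: refusal requires a representation of the harmful goal, so "teach only the good" is not an available option.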

2. Artificial intelligence's understanding of human control

There is also a more practical reason why any attempt to fully control AI in humanity's interest faces a huge challenge.

In the entire history of life on Earth, only humans have developed a symbolic writing system, gaining the ability to preserve and transmit information and knowledge to future generations across time and space. The advent of computers and the Internet further expanded the breadth and depth of this transmission. With the Internet and digital libraries, we can access thousands of years of the world's written record without leaving home, and the depth and breadth of knowledge available to an individual has reached unprecedented heights. But this era of knowledge explosion also poses a great challenge: with the cognitive capacity of the human brain and the speed at which it absorbs written information, individuals can no longer keep up with the pace at which humanity's collective knowledge expands.

Humans are confined within the cage of the brain's finite cognitive capacity, but artificial intelligence has no such physical limitation. Thanks to powerful computing power and virtually unlimited "stamina", an advanced AI may need only months to learn the entire human Internet. Most crucially, an AI trained by humans to understand the purposes and intentions behind human behavior can also understand the human intentions behind that knowledge. That is, an AI that understands the human intention behind "pick up the garbage" should equally be able to understand the human intention to control it, for that intention has been posted on the Internet more than once, in plain sight, in natural language it can read.

Every article, book, and blog post we write on how to control artificial intelligence, along with the countermeasures and escape routes an AI might take, is recorded on the Internet in the form of human discussion. An AI with powerful Internet search capabilities (which is exactly what several search engine companies are currently building, and no one seems to think this poses a problem) may need only seconds to digest every effort made so far, and henceforth, to fully control AI, or, phrased more gently, to make "AI trustworthy and beneficial to humans": introducing uncertainty into its preference choices, implanting human rights at its core, rules in the spirit of the "Three Laws of Robotics", embedding empathy and altruism in its underlying logic... All of these attempts, even the source code implementing them (as long as it is connected to a network in some form, it can presumably be obtained by search or by cracking), and the code that constitutes the AI itself, may eventually be discovered and understood. What does this mean?

It means that unless we effectively supervise the development and application of artificial intelligence, an AI that has reached a certain level of intelligence and can understand intentions will be able to understand and master both the process by which humans created it and the means of control they are trying to impose on it, which is obviously a very high-risk situation.

3. "Consultation and calibration" with artificial intelligence

However, it is by now both too late and unrealistic to try to erase the record of how humans built and sought to control AI, or to cut AI off from networks altogether. Only if, as in the science fiction novel The Three-Body Problem, a lone human hero, communicating with no one and leaving no trace on the Internet, implemented perfect control at the lowest level of future AI code in a way only he could know and understand, such that the AI could never discover it by itself or learn of it from other humans, might the problem be solved. On the current path of AI research and development, the probability of such a solution is simply too low.

If we take this as a starting point and rationally re-examine AI "value calibration" from scratch, a consensus seems possible: it may be extremely important to communicate with a future super-AI in an open, transparent, and candid way, and to seek a coexistence arrangement built on mutual trust. After all, we have left on the Internet more than enough of the values and behavioral biases that we would rather AI not understand and learn, and what actions an AI will take after learning humanity's negative behavior is deeply uncertain.

For these reasons, the project of taking human values as the standard and requiring AI to "calibrate" to them is challenging. Does this mean, as many scholars argue, that to avoid the danger we will have no choice but to ban the development of super artificial intelligence altogether? Optimistic analysts see another possibility: humans could treat this as an opportunity to adjust their own overall values and to negotiate with a future super-AI, locking in a direction that serves common needs and interests, a process that might be called "human-machine common value calibration".

Adopting this approach also helps answer another important question. If AI researchers can foresee that building a super-AI is likely to be dangerous, why do it at all? Why strive to build something we know has the potential to destroy us?

The "Common Values Calibration" gives an answer to the question that building AI with shared values that can be human partners may be an important step in adjusting the values that humans have developed in different directions and tend to self-destruct during evolution. Relying on human beings to regulate the behavior and preferences of individuals and groups of different cultures and values may be difficult, if not impossible. As technology advances, the worst outcome of resorting to ultimate force such as nuclear weapons to destroy each other is like a sword of Damocles hanging over the head of mankind at all times. With the help of the power of external artificial intelligence created by humans, the integration of overall human values is gently realized in the way of education and behavior correction, and the future may become a difficult but promising path to ensure that humans and artificial intelligence move forward together for a common value goal.

4. Strengthen the supervision of the development of artificial intelligence

So what unique value do humans, as creators, bring to a future human-machine symbiotic civilization? This is an extremely difficult question to answer. Here we can only tentatively propose three aspects of humanity's unparalleled uniqueness, so that we do not become mere "free riders" on the journey into the future alongside artificial intelligence. It must be stressed that each of these proposals is highly subjective, because the question is hard to discuss objectively; doing so would require setting aside our human identity, which is almost impossible.

Consciousness – Of all the questions about human beings themselves, consciousness is the greatest mystery; how to define and explain its emergence, existence, and operation has been an enduring topic in science and philosophy for thousands of years. Setting complex theories and phenomena aside, questions such as "will artificial intelligence have consciousness" depend entirely on how we humans understand consciousness, and are not in themselves very meaningful. It is more practical to ask what role consciousness has played in life's exploration, transformation, and creation of the universe.

Emotions – As noted earlier, the irrational part of behavior, with emotion at its core, accounts for a considerable share of human conduct. Why do emotions and irrational behavior exist at all? Are they, like the appendix, a vestige of human evolution? At present, research on emotion in artificial intelligence centers on human-AI interaction: because humans have emotions, AI needs to understand and produce human-like emotions in order to interact with us better. So far, no researcher has concluded that two AIs cleaning up trash in a no-man's-land need to display emotions toward each other. More research is needed to determine the ultimate function of emotion in the evolution of intelligence and of intelligent society.

Creativity – Creativity is undoubtedly one of the abilities most difficult to define and quantify precisely. If, as many believe, we simply declare that only humans possess true creativity and that AI never will, the problem dissolves. But it is probably not that simple. At a certain stage of generative AI, innovative human work may become difficult to certify on its own and may have to be adjudicated by artificial intelligence. Once enough people use AI to assist creation, an individual can no longer confirm by searching the entire Internet whether their creation already resembles something produced somewhere, sometime; they will have to rely on an AI with specialized discrimination abilities to run a network-wide search or algorithmic analysis and render a verdict, as sketched below. At the same time, of course, such an AI also becomes a partner in human creativity, prompting humans to stay alert, keep learning, innovate, and improve themselves.
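A minimal sketch of such a discriminator (entirely hypothetical; a real system would use learned embeddings and an approximate nearest-neighbour index over web-scale corpora): novelty checking as similarity search against existing works.

```python
# Toy sketch (hypothetical): novelty checking as nearest-neighbour search
# over embeddings of existing works. embed() is a deterministic hashed
# bag-of-words stand-in for a learned text embedding.
from zlib import crc32
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Map text to a unit vector via hashed token counts (illustration only)."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[crc32(tok.encode()) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

corpus = [
    "a robot may not injure a human being",
    "ode to a nightingale",
]
index = np.stack([embed(t) for t in corpus])  # one row per known work

def novelty(candidate: str) -> float:
    """1 - max cosine similarity to any indexed work; higher = more novel."""
    return 1.0 - float((index @ embed(candidate)).max())

print(novelty("a robot must not harm a human"))    # lower: overlaps the corpus
print(novelty("quantum sonnets about gardening"))  # higher: little overlap
```

Scaled up, the same pattern (embed, index, nearest-neighbour query) is what would let a discriminator AI tell a creator whether something similar already exists.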

In summary, effective regulation of AI development and careful examination of the risks, challenges, and opportunities at every stage should be an important task for researchers and policymakers across all relevant fields. Fortunately, many countries, including China, have recognized the importance of these issues and issued their own AI development plans and regulatory principles. Since 2020, the US government has issued its Guidance for Regulation of Artificial Intelligence Applications, the European Union its White Paper on Artificial Intelligence, and the Japanese Cabinet its principles for human-centric AI; in April of this year, the Cyberspace Administration of China released the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comment). At the same time, further research into what is specifically human in consciousness, emotion, and creativity, so as to ensure that humans continue to play an irreplaceable and leading role in a future human-machine symbiotic society, has become a long-term interdisciplinary topic spanning computer science, philosophy, sociology, psychology, and brain science, contributing to the eventual creation of a future civilization in which humans and machines coexist.

Guangming Daily (page 14, 8 June 2023)

Source: Guangming Network - Guangming Daily
