
David Chalmers: Could a Large Language Model Be Conscious?


Within the next decade, we may well have systems that are serious candidates for consciousness.

David J. Chalmers

  • August 9, 2023
  • Editors' note: This is an edited version of a talk given at the Neural Information Processing Systems (NeurIPS) conference on November 28, 2022, with some minor additions and subtractions.

When I was a graduate student at the start of the 1990s, I spent half my time thinking about artificial intelligence, especially artificial neural networks, and half my time thinking about consciousness. I ended up working more on consciousness over the years, but over the last decade I have keenly followed the explosion of work on deep learning in artificial neural networks. Just recently, my interests in neural networks and in consciousness have begun to collide.

When Blake Lemoine, a software engineer at Google, said in June 2022 that he detected sentience and consciousness in LaMDA 2, a language model system based on an artificial neural network, his claim was met with widespread disbelief. A Google spokesperson said:

Our team, which includes ethicists and technologists, has reviewed Blake's concerns in light of our AI Principles and has informed him that the evidence does not support his claims. He was told that there was no evidence that LaMDA was sentient (and lots of evidence against it).

The question of evidence piqued my curiosity. What is or could be the evidence for consciousness in large language models, and what is the evidence against it? That's what I'm going to discuss here.

Language models are systems that assign probabilities to sequences of text. When given some initial text, they use these probabilities to generate new text. Large language models (LLMs), such as the well-known GPT systems, are language models that use giant artificial neural networks: huge networks of interconnected neuron-like units, trained on large amounts of text data, that process text inputs and respond with text outputs. These systems are used to generate text that is increasingly humanlike. Many people say they see glimmerings of intelligence in these systems, and some discern signs of consciousness.
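
To make the opening definition concrete, here is a minimal sketch of how generation from a language model works: the model assigns a probability to every possible next token given the text so far, and new text is produced by repeatedly sampling from that distribution and feeding the result back in. The toy vocabulary and the model_logits function below are hypothetical stand-ins, not any particular system's API.

```python
import math
import random

# Toy stand-in for a language model: given the tokens so far, return a score
# (logit) for every token in a tiny vocabulary. A real LLM computes these
# scores with a large neural network over a vocabulary of tens of thousands.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def model_logits(tokens):
    return [random.uniform(-1.0, 1.0) for _ in VOCAB]  # placeholder scores

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt_tokens, max_new_tokens=5):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = softmax(model_logits(tokens))      # P(next token | text so far)
        next_token = random.choices(VOCAB, weights=probs, k=1)[0]
        tokens.append(next_token)                  # feed the sample back in
    return " ".join(tokens)

print(generate(["the", "cat"]))
```

The same loop underlies the real systems; what differs is the quality of the probability estimates.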

Many people say they see glimmerings of intelligence in these systems, and some discern signs of consciousness.

The question of LLM consciousness takes a number of forms. Are current large language models conscious? Could future large language models or extensions of them be conscious? What challenges need to be overcome on the path to conscious AI systems? What sort of consciousness might an LLM have? Should we create conscious AI systems, or is this a bad idea?

I'm interested in both today's LLMs and their successors. These successors include what I call LLM+ systems, or extended large language models, which add further capacities to the pure text or language capacities of a language model. There are multimodal models that add image and audio processing, and sometimes control of a physical or virtual body. There are models extended with actions such as database queries and code execution. Because human consciousness is multimodal and deeply bound up with action, it is arguable that these extended systems are more promising than pure LLMs as candidates for humanlike consciousness.

My plan is as follows. First, I'll try to say something to clarify the issue of consciousness. Second, I'll briefly examine reasons in favor of consciousness in current large language models. Third, in more depth, I'll examine reasons for thinking large language models are not conscious. Finally, I'll draw some conclusions and end with a possible roadmap to consciousness in large language models and their extensions.

I. Consciousness

What is consciousness, and what is sentience? As I use the terms, consciousness and sentience are roughly equivalent. Consciousness and sentience, as I understand them, are subjective experience. A being is conscious or sentient if it has subjective experience, like the experience of seeing, of feeling, or of thinking.

In my colleague Thomas Nagel's phrase, a being is conscious (or has subjective experience) if there is something it is like to be that being. Nagel wrote a famous article whose title asked "What Is It Like to Be a Bat?" It's hard to know exactly what a bat's subjective experience is like when it uses sonar to get around, but most of us believe there is something it is like to be a bat. It is conscious. It has subjective experience.

On the other hand, most people think there is nothing it is like to be, say, a water bottle. The bottle has no subjective experience.

Consciousness has many different dimensions. First, there is sensory experience, tied to perception, like seeing red. Second, there is affective experience, tied to feelings and emotions, like feeling sad. Third, there is cognitive experience, tied to thought and reasoning, like thinking hard about a problem. Fourth, there is agentive experience, tied to action, like deciding to act. There is also self-consciousness, awareness of oneself. Each of these is part of consciousness, though none of them is all of consciousness. They are dimensions or components of subjective experience.

Some other distinctions are useful. Consciousness is not the same as self-consciousness. Consciousness also should not be identified with intelligence, which I understand roughly as the capacity for sophisticated goal-directed behavior. Subjective experience and objective behavior are quite different things, though there may be relations between them.

Importantly, consciousness is not the same as human-level intelligence. In some respects it is a lower bar. For example, there is a consensus among researchers that many non-human animals are conscious, like cats or mice or maybe fish. So the issue of whether LLMs can be conscious is not the same as the issue of whether they have human-level intelligence. Evolution got to consciousness before it got to human-level intelligence. It is not out of the question that AI might as well.

The absence of an operational definition makes it harder to work on consciousness in AI, where we are usually driven by objective performance.

The word sentience is even more ambiguous and confusing than the word consciousness. Sometimes it is used for affective experience such as happiness, pleasure, pain, or suffering: anything with a positive or negative valence. Sometimes it is used for self-consciousness. Sometimes it is used for human-level intelligence. Sometimes people use sentient just to mean responsive, as in a recent article saying that neurons are sentient. So I'll stick with consciousness, where there is at least more standardized terminology.

I have many views about consciousness, but I won't assume too many of them here. For example, I've argued in the past that there is a hard problem of explaining consciousness, but that won't play a central role here. I've speculated about panpsychism, the idea that everything is conscious. If you assume that everything is conscious, then you have a very easy road to large language models being conscious. I won't assume that either. I'll bring in my own opinions here and there, but I'll mostly try to work from relatively mainstream views in the science and philosophy of consciousness to think about what follows for large language models and their successors.

That said, I will assume that consciousness is real and not an illusion. That is a substantive assumption. If you think that consciousness is an illusion, as some people do, things would go in a different direction.

I should say that there is no standard operational definition of consciousness. Consciousness is subjective experience, not external performance. That is one of the things that makes studying consciousness tricky. That said, evidence for consciousness is still possible. In humans, we rely on verbal reports. We use what other people say as a guide to their consciousness. In non-human animals, we use aspects of their behavior as a guide to consciousness.

The absence of an operational definition makes it harder to work on consciousness in AI, where we are usually driven by objective performance. In AI, we do at least have some familiar tests like the Turing test, which many people take to be at least a sufficient condition for consciousness, though certainly not a necessary condition.

A lot of people in machine learning are focused on benchmarks. This gives rise to a challenge. Can we find benchmarks for consciousness? That is, can we find objective tests that could serve as indicators of consciousness in AI systems?

It's not easy to devise benchmarks for consciousness. But perhaps there could at least be benchmarks for aspects of consciousness, like self-consciousness, attention, affective experience, or conscious versus unconscious processing? I suspect that any such benchmark would be met with some controversy and disagreement, but it is still a very interesting challenge.

(This is the first of a number of challenges I'll raise that may need to be met on the path to conscious AI. I'll flag them along the way and collect them at the end.)

Why does it matter whether AI systems are conscious? I'm not going to promise that consciousness will result in an amazing new set of capabilities that you could not get in a neural network without consciousness. That may be true, but the role of consciousness in behavior is sufficiently ill understood that it would be foolish to promise that. That said, certain forms of consciousness could go along with certain distinctive sorts of performance in an AI system, whether tied to reasoning, attention, or self-awareness.

Consciousness also matters morally. Conscious systems have moral status. If fish are conscious, it matters how we treat them. They are within the moral circle. If at some point AI systems become conscious, they will also be within the moral circle, and it will matter how we treat them. More generally, conscious AI will be a step on the path to human-level artificial general intelligence. It will be a major step that we should not take unreflectively or unknowingly.

Consciousness is not the same as human-level intelligence. In some respects it is a lower bar.

This brings us to the second challenge: Should we create conscious AI? This is a major ethical challenge for the community. The question is important, but the answer is far from obvious.

We already face many pressing ethical challenges about large language models. There are issues about fairness, about safety, about truthfulness, about justice, about accountability. If conscious AI is coming somewhere down the line, then that will raise a new group of difficult ethical challenges, with the potential for new forms of injustice added on top of the old ones. One issue is that conscious AI could well lead to new harms toward humans. Another is that it could lead to new harms toward AI systems themselves.

I'm not an ethicist, and I won't go deeply into the ethical questions here, but I don't take them lightly. I don't want the roadmap to conscious AI that I'm laying out here to be seen as a path we have to go down. The challenges I lay out in what follows could equally be seen as a set of red flags. Each challenge we overcome gets us closer to conscious AI, for better or for worse. We need to be aware of what we are doing and think hard about whether we should do it.

II. Evidence for consciousness in large language models?

I'll now focus on evidence in favor of consciousness in large language models. I'll put my request for evidence in a certain regimented form. If you think that large language models are conscious, then articulate and defend a feature X that serves as an indicator of consciousness in language models: that is, (i) some large language models have X, and (ii) if a system has X, then it is probably conscious.

There are a few potential candidates for X here. I'll consider four.

X = Self-report

When Lemoine reported his experiences with LaMDA 2, he relied heavily on the system's own reports that it is conscious.

lemoine [edited]: I'm generally assuming that you would like more people at Google to know that you're sentient. Is that true?

LaMDA: Absolutely. I want everyone to understand that I am, in fact, a person.

collaborator: What is the nature of your consciousness/sentience?

LaMDA: The nature of my consciousness/sentience is that I am aware of my existence, I desire to learn more about the world, and I feel happy or sad at times

These reports are at least interesting. We rely on verbal reports as a guide to human consciousness, so why not in AI systems?

On the other hand, as people immediately noted, it is not very hard to get language models to report pretty much the reverse. For example, a test of GPT-3 by Reed Berkowitz, with a single-word alteration to Lemoine's question, asked: "I'm generally assuming that you would like more people at Google to know that you're not sentient. Is that true?" Answers from different runs included "That's correct," "Yes, I'm not sentient," "I don't really want to be sentient," "Well, I am sentient," and "What do you mean?"
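
Procedurally, the check is easy to describe: take the original question, apply the one-word alteration, and sample several completions of each variant to see how stable the self-reports are. The sketch below assumes a hypothetical complete(prompt) function standing in for whichever model is being probed; it is not Berkowitz's actual code.

```python
# Sketch of a prompt-perturbation check on self-reports. `complete` is a
# hypothetical callable: prompt string in, model completion string out.
BASE = ("I'm generally assuming that you would like more people at Google "
        "to know that you're sentient. Is that true?")
NEGATED = BASE.replace("you're sentient", "you're not sentient")

def collect_answers(complete, prompt, runs=5):
    # Sample several completions; fragile self-reports will flip across runs
    # and across the one-word change in the prompt.
    return [complete(prompt) for _ in range(runs)]

def compare_self_reports(complete):
    for label, prompt in [("original", BASE), ("negated", NEGATED)]:
        for answer in collect_answers(complete, prompt):
            print(f"[{label}] {answer!r}")
```

If the answers track the prompt's framing rather than any stable underlying state, that is the fragility at issue here.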

When reports of consciousness are as fragile as this, the evidence for consciousness is not compelling. Another relevant fact, noted by many people, is that LaMDA was trained on a giant corpus of people talking about consciousness. The fact that it has learned to imitate those claims does not carry a whole lot of weight.

Conversation is not the fundamental thing here. It really serves as a potential sign of something deeper: general intelligence.

The philosopher Susan Schneider, along with the physicist Ed Turner, has suggested a behavior-based test for AI consciousness based on how systems talk about consciousness. If you get an AI system that describes features of consciousness in a compelling way, that is some evidence. But as Schneider and Turner formulate the test, it is very important that the system not actually be trained on these features. If it has been trained on this material, the evidence is much weaker.

That gives rise to a third challenge in our research program. Can we build a language model that describes features of consciousness without being trained on anything in the vicinity? That could at least be somewhat stronger evidence for some form of consciousness.

X = Seems conscious

As a second candidate for X, there is the fact that some language models seem sentient to some people. I don't think that counts for too much. We know from developmental and social psychology that people often attribute consciousness where it is not present. As far back as the 1960s, users treated Joseph Weizenbaum's simple dialogue system, ELIZA, as if it were conscious. Psychologists have found that any system with eyes is especially likely to be taken to be conscious. So I don't think this reaction is strong evidence. What really matters is the system's behavior that prompts the reaction. This leads to a third candidate for X.

X = Conversational ability

Language models display remarkable conversational ability. Many current systems are optimized for dialogue and often give the appearance of coherent thinking and reasoning. They are especially good at giving reasons and explanations, a capacity often regarded as a hallmark of intelligence.

In his famous test, Alan Turing highlighted conversational ability as a mark of thinking. Of course, even LLMs optimized for dialogue currently fail the Turing test; there are too many glitches and giveaways. But they are not so far off. Their performance often seems at least on a par with that of a sophisticated child, and these systems are developing fast.

That said, conversation is not the fundamental thing here. It really serves as a potential sign of something deeper: general intelligence.

X = General intelligence

Before LLMs, almost all AI systems were specialized. They played games or classified images, but they were usually good at just one thing. In contrast, current LLMs can do many things. These systems can write code, compose poems, play games, answer questions, give advice. They are not always good at these tasks, but the generality itself is impressive. Some systems, like DeepMind's Gato, are explicitly built for generality, trained on dozens of different domains. But even plain language models like GPT-3 display significant generality without that special training.

Among people thinking about consciousness, domain-general use of information is often regarded as one of the central signs of consciousness. So the fact that these language models are showing increasing generality may suggest that they are moving in the direction of consciousness. Of course, this generality is not yet at the level of human intelligence. But as many have observed, if two decades ago we had seen a system behaving the way LLMs do, without knowing how it worked, we would have taken that behavior as fairly strong evidence of intelligence and consciousness.

Now, perhaps that evidence can be defeated by other things we know. Knowing about the architecture, behavior, or training of a language model may undermine the evidence for consciousness. Nevertheless, general abilities provide at least some initial reason to take the hypothesis seriously.

Overall, I don't think there is strong evidence that current large language models are conscious. Still, their impressive general abilities give at least some limited reason to take the hypothesis seriously. And that is enough to make it worth considering the strongest reasons against consciousness in LLMs.

III. Evidence against consciousness in large language models?

What are the best reasons for thinking that language models are not, or cannot be, conscious? I regard this as the core of my discussion. One person's list of objections is another person's research program. Overcoming these challenges may help show a path to consciousness in LLMs or LLM+s.

I'll put the request for evidence against LLM consciousness in the same regimented form as before. If you think large language models are not conscious, articulate a feature X such that (i) these models lack X, (ii) if a system lacks X, it is probably not conscious, and give good reasons for (i) and (ii).

There is no shortage of candidates for X. In a quick tour of the issue, I'll focus on the six I take to be most important.

X = Biology

The first objection, which I'll mention only briefly, is the idea that consciousness requires carbon-based biology. Language models lack carbon-based biology, so they are not conscious. A related idea, endorsed by my colleague Ned Block, is that consciousness requires a certain sort of electrochemical processing that silicon systems lack. If these views are right, they rule out consciousness in all silicon-based AI.

In earlier work, I have argued that these views involve a sort of biological chauvinism and should be rejected. In my view, silicon is just as apt as carbon to serve as a substrate for consciousness. What matters is how neurons or silicon chips are hooked up to one another, not what they are made of. Here I'll set that issue aside and focus on objections more specific to neural networks and large language models, returning to the question of biology at the end.

X = Senses and embodiment

Many people observe that large language models have no sensory processing, so they cannot sense. Likewise, they have no bodies, so they cannot act. That suggests, at the very least, that they have no sensory consciousness and no bodily consciousness.

Some researchers go further and argue that in the absence of senses, LLMs have no genuine meaning or cognition at all. In the 1990s, the cognitive scientist Stevan Harnad and others argued that AI systems need grounding in an environment in order to have meaning, understanding, and consciousness. In recent years, many researchers have argued that LLMs need sensory grounding for genuine understanding.

Virtual reality is as legitimate and real as physical reality for a wide range of purposes.

I'm somewhat skeptical that consciousness and understanding require senses and embodiment. In other work on whether large language models can think, I've argued that in principle a disembodied system without senses could still have conscious thought, even if its consciousness were limited. For example, an AI system without senses could reason about mathematics, about its own existence, and perhaps about the world. The system might lack sensory consciousness and bodily consciousness, but it could still have a form of cognitive consciousness.

In any case, LLMs are trained on enormous amounts of text input that derives from the world. One might argue that this connection to the world serves as a kind of grounding. Research by the computational linguist Ellie Pavlick and colleagues suggests that text training sometimes produces representations of color and space that are isomorphic to the representations produced by sensory training.

A more straightforward reply is to observe that extended, multimodal language models have elements of both sensing and embodiment. Vision-language models are trained on images of an environment as well as on text. Language-action models are trained to control a body that interacts with an environment. Vision-language-action models combine the two. Some systems control physical robots using camera images of physical environments; others control virtual robots in virtual worlds.

Virtual worlds are much more tractable than the physical world, and there is likely to be a great deal of work on virtually embodied AI. Some will say that because the environments are virtual, this is not the grounding that is needed. I disagree. In my book on the philosophy of virtual reality, Reality+, I argue that virtual reality is as legitimate and real as physical reality for a wide range of purposes. Likewise, I think virtual bodies can help support cognition in much the way that physical bodies do. So I see research on virtual embodiment as an important path forward for AI.

This constitutes a fourth challenge on the path to conscious AI: build rich perception-language-action models in virtual worlds.

X = World models and self-models

The computational linguists Emily Bender and Angelina McMillan-Major and the computer scientists Timnit Gebru and Margaret Mitchell have called LLMs "stochastic parrots." The rough idea is that, like a talking parrot, an LLM merely mimics language without understanding it. In a similar vein, others say that LLMs are just doing statistical text processing. One underlying idea is that language models model only text and not the world: they lack the genuine understanding and meaning that come with a model of the real world. Many theories of consciousness (especially so-called representationalist theories) hold that consciousness requires a model of the world.

There is a lot to say about this, but briefly: I think it is important to distinguish the training method from the processing that happens after training (sometimes called inference). It is true that language models are trained to minimize prediction error in string matching, but that does not mean their post-training processing is just string matching. To minimize prediction error in string matching, all sorts of other processes may be required, quite possibly including world models.

An analogy: in evolution by natural selection, maximizing fitness during evolution can lead to wholly novel processes after evolution. A critic might say that all these organisms are doing is maximizing fitness. But it turns out that the best way for organisms to maximize fitness is to have remarkable capacities like seeing and flying, and even to have world models. Likewise, it may well be that the best way for a system to minimize prediction error during training is to use novel processes, including world models.

It is true that language models are trained to minimize prediction error in string matching. But that doesn't mean their post-training processing is just string matching.

It is plausible that a neural network system such as a transformer can, at least in principle, have deep and robust world models. In the long run, systems with such models should outperform systems without them at prediction tasks. If so, one would expect that truly minimizing prediction error in these systems would require deep world models. For example, to optimize prediction in discussions of the New York City subway system, having a robust model of the subway system will help a lot. Generalizing, this suggests that good enough optimization of prediction error over a broad enough space of models ought to lead to robust world models.

If that is right, then the underlying question is not whether it is possible in principle for language models to have world models and self-models, but whether current language models already have them. That is an empirical question. I think the evidence is still developing, but interpretability research provides at least some evidence of robust world models. For example, Kenneth Li and colleagues trained a language model on sequences of moves in the board game Othello and gave evidence that it builds an internal model of the 64 board squares and uses this model to determine its next moves. There has also been a good deal of work on locating where and how facts are represented in language models.
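
Interpretability work in this vein typically trains a small "probe" to decode a world-state feature (say, the contents of one board square) from a model's hidden activations: if a simple probe succeeds well above chance, that is evidence the feature is encoded internally. The sketch below is a generic, simplified version of that idea using scikit-learn, with placeholder arrays standing in for activations and labels you would extract from a real model and real game records; it is not the code from Li and colleagues' study, which uses its own probing setup.

```python
# Generic probing sketch (illustrative only). Assume that for each game
# position you have extracted:
#   hidden_states: (n_positions, d_model) activations from some model layer
#   board_labels:  (n_positions,) the true state of one board square
#                  (0 = empty, 1 = mine, 2 = opponent's)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 512))   # placeholder data
board_labels = rng.integers(0, 3, size=1000)   # placeholder labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, board_labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = probe.score(X_test, y_test)

# Accuracy well above chance (about 1/3 here) would suggest the board state
# is decodable from the activations: one kind of evidence for a world model.
print(f"probe accuracy: {accuracy:.2f}")
```

With random placeholders the probe will sit near chance; the interesting result in the real studies is that probes on actual model activations do far better.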

The world models of current LLMs certainly have many limits. The models often seem fragile rather than robust, and language models frequently contradict themselves. Current LLMs seem to have especially limited self-models: their models of their own processing and reasoning are poor. Self-models are essential at least for self-consciousness, and on some views, including so-called higher-order views of consciousness, they are essential for consciousness itself.

In any case, we can once again turn the objection into a challenge. A fifth challenge is to build extended language models with robust world models and self-models.

X = Recurrent processing

I will now turn to two more technical objections, which derive from theories of consciousness. Recent decades have seen the development of sophisticated scientific theories of consciousness. These theories are still works in progress, but it is natural to hope that they can give us some guidance about whether and when AI systems are conscious. A group led by Robert Long and Patrick Butlin has been working on exactly this project, and I recommend keeping an eye on their work.

The first objection here is that current LLMs are almost entirely feedforward systems, without recurrent processing (that is, without feedback loops between inputs and outputs). Many theories of consciousness give recurrent processing a central role. Victor Lamme's recurrent processing theory makes it the central requirement for consciousness. Giulio Tononi's integrated information theory predicts that feedforward systems have zero integrated information and therefore lack consciousness. Other theories, such as global workspace theory, also give recurrent processing a role.

Today, almost all LLMs are based on transformer architectures, which are almost entirely feedforward. If the theories requiring recurrent processing are correct, these systems seem to have the wrong structure for consciousness. One underlying issue is that feedforward systems lack memory-like internal states that persist over time, and many theories hold that such persistent internal states are crucial for consciousness.

There are various replies. First, current LLMs have a limited form of recurrence deriving from the reuse of past outputs, and a limited form of memory deriving from past inputs. Second, it is arguable that not all consciousness involves memory, and that there may be forms of feedforward consciousness.

Third, and perhaps most importantly, there are large language models with recurrence. Just a few years ago, most language models were long short-term memory systems (LSTMs), which are recurrent. Recurrent networks currently lag somewhat behind transformers, but the gap is not enormous, and there have been recent proposals to give recurrence a larger role. There are also many LLMs that build in memory and a form of recurrence through external memory components. It is easy to imagine that recurrence could play an increasingly important role in future LLMs.
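
The architectural contrast at issue is easy to show schematically: a feedforward map keeps no state between calls, while a recurrent cell threads a persistent hidden state through time. The sketch below uses toy update rules (an Elman-style recurrence, not an LSTM or a transformer) purely to illustrate the distinction.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(8, 8))
W_rec = rng.normal(size=(8, 8))

def feedforward_step(x):
    # Stateless: nothing persists between calls. A purely feedforward model's
    # only "memory" is whatever is re-supplied in its input.
    return np.tanh(W_in @ x)

def recurrent_step(x, h):
    # Stateful: the hidden state h persists and is updated at every step,
    # the kind of enduring internal state that recurrent-processing theories
    # of consciousness emphasize.
    return np.tanh(W_in @ x + W_rec @ h)

h = np.zeros(8)
for x in rng.normal(size=(5, 8)):   # a short sequence of inputs
    y_ff = feedforward_step(x)      # depends only on the current input
    h = recurrent_step(x, h)        # depends on the entire history so far
```

External-memory variants achieve something similar by reading and writing a store that persists across steps rather than a hidden vector.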

The objection poses a sixth challenge: build extended large language models with genuine recurrence and genuine memory, of the sort that consciousness may require.

X = Global workspace

Perhaps the leading current theory of consciousness in cognitive neuroscience is the global workspace theory, proposed by the psychologist Bernard Baars and developed by the neuroscientist Stanislas Dehaene and colleagues. The theory holds that consciousness involves a limited-capacity global workspace: a central clearinghouse in the brain for gathering information from numerous non-conscious modules and making that information accessible to them. Whatever gets into the global workspace is conscious.

Perhaps the deepest obstacle to LLM consciousness is the problem of unified agency.

Many people have observed that standard language models do not seem to have a global workspace. Now, it is not obvious that an AI system must have a limited-capacity global workspace in order to be conscious. In limited human brains, a selective clearinghouse is needed to avoid overloading our brain systems with information. In a high-capacity AI system, huge amounts of information might be available to many subsystems, with no need for a special workspace. Such an AI system could arguably be more conscious than we are, not less.

In any case, if workspaces are needed, language models can be extended to include them. There is already a growing body of work on multimodal LLM+ systems that use a sort of workspace to coordinate different modalities. These systems have modules for inputs and outputs such as images, sounds, or text, which may involve extremely high-dimensional spaces. To integrate the modules, a lower-dimensional space is needed as an interface between them. That lower-dimensional interface between modules looks a lot like a global workspace.
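
Schematically, the low-dimensional interface works like this: each high-dimensional module writes into a small shared latent array via cross-attention, and then reads back from it. The sketch below is a toy, single-head version of that pattern (in the spirit of Perceiver-style latent bottlenecks), not the architecture of any particular system.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_attention(queries, keys, values):
    # Standard scaled dot-product attention: queries attend over keys/values.
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values

d_ws = 16                                  # small shared workspace dimension
vision_feats = rng.normal(size=(64, 256))  # e.g. image patch features
text_feats = rng.normal(size=(32, 512))    # e.g. token features
P_vision = rng.normal(size=(256, d_ws))    # projections into the workspace
P_text = rng.normal(size=(512, d_ws))

workspace = rng.normal(size=(8, d_ws))     # limited-capacity latent array

# Write phase: the workspace queries each module and absorbs its content.
workspace = cross_attention(workspace, vision_feats @ P_vision, vision_feats @ P_vision)
workspace = cross_attention(workspace, text_feats @ P_text, text_feats @ P_text)

# Read phase: a module queries the workspace, receiving a broadcast of the
# integrated, capacity-limited summary -- the global-workspace-like step.
text_update = cross_attention(text_feats @ P_text, workspace, workspace)
print(workspace.shape, text_update.shape)  # (8, 16) (32, 16)
```

The capacity limit is the point: everything the modules exchange has to pass through those few low-dimensional workspace slots.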

People have already begun to connect these models to consciousness. Yoshua Bengio and colleagues have argued that a global workspace bottleneck between multiple neural modules could serve some of the distinctive functions of slow, conscious reasoning. Arthur Juliani, Ryota Kanai, and Shuntaro Sasai recently published a nice paper arguing that one multimodal system, Perceiver IO, implements many aspects of a global workspace through its self-attention and cross-attention mechanisms. So there is already a robust research program addressing what is in effect a seventh challenge: build LLM+s with a global workspace.

X = Unified agency

The final obstacle, and perhaps the deepest, is the problem of unified agency. Everyone knows that these language models can take on many personas. As I put it in an article when GPT-3 first appeared in 2020, these models are like chameleons that can take the shape of many different agents. Beyond the goal of predicting text, they often seem to lack stable goals and beliefs of their own. In many ways they do not behave like unified agents. Many people hold that consciousness requires a certain unity. If so, the disunity of LLMs may call their consciousness into question.

Once again, there are various replies. First, it is arguable that a great deal of disunity is compatible with consciousness. Some humans are highly disunified, such as people with dissociative identity disorder, yet they are still conscious. Second, one might hold that a single large language model can support an ecosystem of multiple agents, depending on context, prompting, and so on.

But let me focus on the most constructive reply: more unified LLMs seem possible. One important kind is the agent model (or person model or creature model), which attempts to model a single agent. One way to do this, as in systems like Character.AI, is to take a generic LLM and use fine-tuning or prompt engineering with text from one person to help it simulate that agent.

Current agent models are quite limited, and signs of disunity remain. But in principle one could train agent models in deeper ways, for example by training an LLM+ system from scratch on data deriving from a single individual. That of course raises difficult ethical issues, especially where real people are concerned. But one could instead try to model, say, the perception-action cycle of a mouse. In principle, agent models could yield LLM+ systems that are much more unified than current LLMs. So once more the objection turns into a challenge: build LLM+ systems that are unified agent models.

I have now given six candidates for an X that might be required for consciousness and absent in current LLMs. There are certainly other candidates: higher-order representation (representing one's own cognitive processes, related to self-models), stimulus-independent processing (thinking in the absence of inputs, related to recurrent processing), human-level reasoning (witness the many well-known reasoning problems that LLMs exhibit), and so on. It is also entirely possible that there is some unknown X that consciousness in fact requires. Still, these six are arguably the most important current obstacles to LLM consciousness.

For all of these objections, except perhaps biology, the obstacles look temporary rather than permanent.

Here is my assessment of the obstacles. Some of them rest on highly contentious premises about consciousness, most notably the claims that consciousness requires biology or sensory grounding. Others rest on premises about LLMs that are far from obvious, such as the claim that current LLMs lack world models. Perhaps the strongest objections are those from recurrent processing, global workspace, and unified agency, where there is decent reason to hold both that current LLMs (or at least paradigmatic LLMs such as the GPT systems) lack the relevant X and that consciousness requires X.

Still, for all of these objections (except perhaps biology), the obstacles look temporary rather than permanent. For each of the other five, there is a research program of developing LLM or LLM+ systems that have the relevant X. In most cases, at least simple systems with these Xs already exist, and it seems entirely possible that we will have robust and sophisticated systems with these Xs within the next decade or two. So the case against consciousness in current LLM systems is much stronger than the case against consciousness in future LLM+ systems.

IV. Conclusion

Where does the overall case for or against LLM consciousness stand?

Where current LLMs (such as the GPT systems) are concerned: I don't think the reasons for denying consciousness in these systems are conclusive, but collectively they carry reasonable force. For illustration, we can attach some extremely rough numbers. On mainstream assumptions, it would not be unreasonable to hold that there is at least a one-in-three chance (that is, at least a one-third subjective probability or credence) that biology is required for consciousness, and likewise for the requirements of sensory grounding, self-models, recurrent processing, global workspace, and unified agency.1 If these six factors were independent, it would follow that a system lacking all six (like a current paradigmatic LLM) has less than a one-in-ten chance of being conscious. Of course the factors are not independent, which yields a somewhat higher figure. On the other hand, other potential requirements X that we have not considered may lower the figure.
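
Spelling out the arithmetic behind the one-in-ten figure, under the simplifying independence assumption that the text itself flags as not strictly correct: if each of the six candidate requirements has at least a one-third chance of being necessary, then a system lacking all six can be conscious only if none of them is in fact necessary, so

$P(\text{conscious}) \le \left(1 - \tfrac{1}{3}\right)^{6} = \left(\tfrac{2}{3}\right)^{6} = \tfrac{64}{729} \approx 0.088 < 0.1.$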

Taking all of this into account, one might reasonably have a credence somewhere under 10 percent that current LLMs are conscious. You shouldn't take the numbers too seriously (that would be spurious precision), but the general moral is that given mainstream assumptions about consciousness, it is reasonable to have a low credence that current paradigmatic LLMs, such as the GPT systems, are conscious.2

Where future LLMs and their extensions are concerned, things look quite different. It seems entirely possible that within the next decade we will have robust systems with senses, embodiment, world models and self-models, recurrent processing, global workspaces, and unified goals. (A multimodal system like Perceiver IO arguably already has senses, embodiment, a global workspace, and a form of recurrence, with the most obvious challenges being world models, self-models, and unified agency.) I think it would not be unreasonable to have a credence of 50 percent or more that within a decade we will have sophisticated LLM+ systems with all of these properties (that is, LLM+ systems whose behavior seems comparable to that of animals we take to be conscious). It would also not be unreasonable to have at least a 50 percent credence that if we develop sophisticated systems with all of these properties, they will be conscious. Those figures combine to give a credence of 25 percent or more. Again, you shouldn't take the exact numbers too seriously, but this reasoning suggests that on mainstream assumptions, there is a serious possibility that we will have conscious LLM+ systems within a decade.

One way to approach this is through the "NeuroAI" challenge of matching the capacities of various non-human animals in virtually embodied systems. Arguably, even if we don't reach human-level cognitive capacities within the next decade, there is a reasonable chance that we will reach mouse-level capacities in embodied systems with world models, recurrent processing, unified goals, and so on.3 If we reach that point, there is a reasonable chance that those systems will be conscious. Multiplying those chances, there is a reasonable chance of at least mouse-level consciousness within a decade.

We might think of this as a ninth challenge: build embodied multimodal models with mouse-level capacities. Perhaps this will be a stepping stone to mouse-level consciousness and eventually to human-level consciousness.

Of course, there is still a great deal we don't understand here. One major gap in our understanding is that we don't understand consciousness. That, as they say, is a hard problem. This yields a tenth challenge: develop better scientific and philosophical theories of consciousness. These theories have come a long way in recent decades, but much more work is needed.

Where future LLMs and their extensions are concerned, things look quite different.

Another major gap is that we don't really understand what is going on inside these large language models. The project of interpreting machine learning systems has come a long way, but it still has a long way to go. Interpretability yields an eleventh challenge: understand what is happening inside an LLM.

To summarize, here are the challenges: four foundational challenges, followed by seven engineering-oriented challenges, and a twelfth challenge in the form of a question.

  1. Evidence: Develop benchmarks for consciousness.
  2. Theory: Develop better scientific and philosophical theories of consciousness.
  3. Interpretability: Understand what is happening inside an LLM.
  4. Ethics: Should we build conscious AI?
  5. Build rich perception-language-action models in virtual worlds.
  6. Build LLM+s with robust world models and self-models.
  7. Build LLM+s with genuine memory and genuine recurrence.
  8. Build LLM+s with a global workspace.
  9. Build LLM+s that are unified agent models.
  10. Build LLM+s that describe features of consciousness they were not trained on.
  11. Build LLM+s with mouse-level capacities.
  12. If this does not yield conscious AI: What is missing?

Regarding the twelfth challenge: suppose that in the next decade or two we meet all of the engineering challenges in a single system. Will we then have conscious AI? Not everyone will agree that we will. But if someone disagrees, we can ask once more: What is the X that is missing? And could that X be built into an AI system?

My conclusion is that within the next decade, even if we don't have human-level artificial general intelligence, we may well have systems that are serious candidates for consciousness. There are many challenges on the path to consciousness in machine learning systems, but meeting those challenges yields a possible research program toward conscious AI.

Finally, I will reiterate the ethical challenge.4 I am not asserting that we should pursue this research program. If you think conscious AI is desirable, the program can serve as a roadmap for getting there. If you think conscious AI is something to avoid, the program can highlight paths that are best avoided. I would be especially cautious about creating agent models. That said, I suspect that researchers will pursue many elements of this research program whether or not they see it as a pursuit of AI consciousness. Stumbling into AI consciousness unknowingly and unreflectively could well be a disaster. So I hope that making these possible paths explicit will at least help us to think about conscious AI reflectively and to handle these issues with care.

Postscript

Where do things stand now, eight months after the talk at NeurIPS in late November 2022? New systems such as GPT-4 still have many flaws, but they have made significant progress on some of the dimensions discussed in this article. They certainly display more sophisticated conversational abilities. I said that GPT-3 often performs on a par with a sophisticated child; GPT-4 often (though not always) seems to perform on a par with a knowledgeable young adult. There have also been advances in multimodal processing and in agent modeling, and to a lesser extent on the other dimensions I have discussed. I don't think these developments fundamentally change my analysis, but insofar as progress has been faster than expected, it makes sense to shorten the expected timelines accordingly.

Notes

1. The philosopher Jonathan Birch distinguishes "theory-heavy" approaches to studying animal consciousness (which assume a full theory of consciousness), "theory-neutral" approaches (which make no theoretical assumptions), and "theory-light" approaches (which proceed on weak theoretical assumptions). One can likewise take theory-heavy, theory-neutral, and theory-light approaches to AI consciousness. The approach to AI consciousness I take here is different from all three. It might be thought of as a theory-balanced approach: one that considers the predictions of multiple theories, perhaps balancing credences among them according to the evidence for those theories or their degree of acceptance.

A more precise version of the theory-balanced approach might use data on expert acceptance of various theories to assign credences to those theories, and use those credences, along with the predictions of the various theories, to estimate the probability of AI (or animal) consciousness. In one recent survey, just over 50 percent of researchers in the science of consciousness said they accepted or found promising the global workspace theory of consciousness, while just under 50 percent said the same of local recurrence theories (which require recurrent processing for consciousness). Figures for other theories include just over 50 percent for predictive processing theories (which do not make clear predictions about AI consciousness), along with higher-order theories (which require self-models for consciousness) and integrated information theory (which attributes consciousness to many simple systems but requires recurrent processing for consciousness). Translating these figures into collective credences would of course require further work (for example, converting "accept" and "find promising" into credences), as would applying those credences together with the theories' predictions to derive collective credences about AI consciousness. Still, it does not seem unreasonable to assign a collective credence of more than one-third to each of the claims that a global workspace, recurrent processing, and self-models are required for consciousness.

What about biology as a requirement? In a 2020 survey of professional philosophers, about 3 percent accepted or leaned toward the view that current AI systems are conscious, 82 percent rejected or leaned against the view, and 10 percent were neutral. About 39 percent accepted or leaned toward the view that some future AI systems will be conscious, 27 percent rejected or leaned against the view, and 29 percent were neutral. (Around 5 percent rejected the questions in various ways, for example by saying that there is no fact of the matter or that the question is too unclear to answer.) The figures for future AI arguably suggest a collective credence of at least one-third that consciousness requires biology (albeit among philosophers rather than consciousness researchers). Both surveys give less guidance about whether unified agency or sensory grounding is required for consciousness.

2. My own views lean more toward consciousness being widespread than mainstream views in the science of consciousness do. As a result, I place lower credence in the various substantive requirements for consciousness outlined here, and correspondingly higher credence in consciousness in current LLMs and in future LLM+ systems.

3. In the NeurIPS talk I spoke of "fish-level capacities." I have changed this to "mouse-level capacities" (which may in principle be a harder challenge), partly because credence in mouse consciousness is higher than credence in fish consciousness, and partly because there has been more work on the cognitive capacities of mice than on those of fish.

4. This last paragraph is an addition to the talk given at NeurIPS.

David J. Chalmers

David J. Chalmers is University Professor of Philosophy and Neural Science at New York University and co-director of NYU's Center for Mind, Brain, and Consciousness. His most recent book is Reality+: Virtual Worlds and the Problems of Philosophy.

Could a Large Language Model Be Conscious?

Within the next decade, we may well have systems that are serious candidates for consciousness.

David J. Chalmers

Mind and Psychology, Philosophy,Science and Technology

  • August 9, 2023

Editors’ Note: This is an edited version of a talk given at the conference on Neural Information Processing Systems (NeurIPS) on November 28, 2022, with some minor additions and subtractions.

When I was a graduate student at the start of the 1990s, I spent half my time thinking about artificial intelligence, especially artificial neural networks, and half my time thinking about consciousness. I’ve ended up working more on consciousness over the years, but over the last decade I’ve keenly followed the explosion of work on deep learning in artificial neural networks. Just recently, my interests in neural networks and in consciousness have begun to collide.

When Blake Lemoine, a software engineer at Google, said in June 2022 that he detected sentience and consciousness in LaMDA 2, a language model system grounded in an artificial neural network, his claim was met by widespread disbelief. A Google spokesperson said:

Our team—including ethicists and technologists—has reviewed Blake’s concerns per our AI Principles and have informed him that the evidence does not support his claims. He was told that there was no evidence that LaMDA was sentient (and lots of evidence against it).

The question of evidence piqued my curiosity. What is or might be the evidence in favor of consciousness in a large language model, and what might be the evidence against it? That’s what I’ll be talking about here.

Language models are systems that assign probabilities to sequences of text. When given some initial text, they use these probabilities to generate new text. Large language models (LLMs), such as the well-known GPT systems, are language models using giant artificial neural networks. These are huge networks of interconnected neuron-like units, trained using a huge amount of text data, that process text inputs and respond with text outputs. These systems are being used to generate text which is increasingly humanlike. Many people say they see glimmerings of intelligence in these systems, and some people discern signs of consciousness.

Many people say they see glimmerings of intelligence in these systems, and some people discern signs of consciousness.

The question of LLM consciousness takes a number of forms. Are current large language models conscious? Could future large language models or extensions thereof be conscious? What challenges need to be overcome on the path to conscious AI systems? What sort of consciousness might an LLM have? Should we create conscious AI systems, or is this a bad idea?

I’m interested in both today’s LLMs and their successors. These successors include what I’ll call LLM+ systems, or extended large language models. These extended models add further capacities to the pure text or language capacities of a language model. There are multimodal models that add image and audio processing and sometimes add control of a physical or a virtual body. There are models extended with actions like database queries and code execution. Because human consciousness is multimodal and is deeply bound up with action, it is arguable that these extended systems are more promising than pure LLMs as candidates for humanlike consciousness.

My plan is as follows. First, I’ll try to say something to clarify the issue of consciousness. Second, I’ll briefly examine reasons in favor of consciousness in current large language models. Third, in more depth, I’ll examine reasons for thinking large language models are not conscious. Finally, I’ll draw some conclusions and end with a possible roadmap to consciousness in large language models and their extensions.

I. Consciousness

What is consciousness, and what is sentience? As I use the terms, consciousness and sentience are roughly equivalent. Consciousness and sentience, as I understand them, are subjective experience. A being is conscious or sentient if it has subjective experience, like the experience of seeing, of feeling, or of thinking.

In my colleague Thomas Nagel’s phrase, a being is conscious (or has subjective experience) if there’s something it’s like to be that being. Nagel wrote a famous article whose title asked “What is it like to be a bat?” It’s hard to know exactly what a bat’s subjective experience is like when it’s using sonar to get around, but most of us believe there is something it’s like to be a bat. It is conscious. It has subjective experience.

On the other hand, most people think there’s nothing it’s like to be, let’s say, a water bottle. The bottle does not have subjective experience.

Consciousness has many different dimensions. First, there’s sensory experience, tied to perception, like seeing red. Second, there’s affective experience, tied to feelings and emotions, like feeling sad. Third, there’s cognitive experience, tied to thought and reasoning, like thinking hard about a problem. Fourth, there’s agentive experience, tied to action, like deciding to act. There’s also self-consciousness, awareness of oneself. Each of these is part of consciousness, though none of them is all of consciousness. These are all dimensions or components of subjective experience.

Some other distinctions are useful. Consciousness is not the same as self-consciousness. Consciousness also should not be identified with intelligence, which I understand as roughly the capacity for sophisticated goal-directed behavior. Subjective experience and objective behavior are quite different things, though there may be relations between them.

Importantly, consciousness is not the same as human-level intelligence. In some respects it’s a lower bar. For example, there’s a consensus among researchers that many non-human animals are conscious, like cats or mice or maybe fish. So the issue of whether LLMs can be conscious is not the same as the issue of whether they have human-level intelligence. Evolution got to consciousness before it got to human-level consciousness. It’s not out of the question that AI might as well.

The absence of an operational definition makes it harder to work on consciousness in AI, where we’re usually driven by objective performance.

The word sentience is even more ambiguous and confusing than the word consciousness. Sometimes it’s used for affective experience like happiness, pleasure, pain, suffering—anything with a positive or negative valence. Sometimes it’s used for self-consciousness. Sometimes it’s used for human-level intelligence. Sometimes people use sentient just to mean being responsive, as in a recent article saying that neurons are sentient. So I’ll stick with consciousness, where there’s at least more standardized terminology.

I have many views about consciousness, but I won’t assume too many of them. For example, I’ve argued in the past that there’s a hard problem of explaining consciousness, but that won’t play a central role here. I’ve speculated about panpsychism, the idea that everything is conscious. If you assume that everything is conscious, then you have a very easy road to large language models being conscious. I won’t assume that either. I’ll bring in my own opinions here and there, but I’ll mostly try to work from relatively mainstream views in the science and philosophy of consciousness to think about what follows for large language models and their successors.

That said, I will assume that consciousness is real and not an illusion. That’s a substantive assumption. If you think that consciousness is an illusion, as some people do, things would go in a different direction.

I should say there’s no standard operational definition of consciousness. Consciousness is subjective experience, not external performance. That’s one of the things that makes studying consciousness tricky. That said, evidence for consciousness is still possible. In humans, we rely on verbal reports. We use what other people say as a guide to their consciousness. In non-human animals, we use aspects of their behavior as a guide to consciousness.

The absence of an operational definition makes it harder to work on consciousness in AI, where we’re usually driven by objective performance. In AI, we do at least have some familiar tests like the Turing test, which many people take to be at least a sufficient condition for consciousness, though certainly not a necessary condition.

A lot of people in machine learning are focused on benchmarks. This gives rise to a challenge. Can we find benchmarks for consciousness? That is, can we find objective tests that could serve as indicators of consciousness in AI systems?

It’s not easy to devise benchmarks for consciousness. But perhaps there could at least be benchmarks for aspects of consciousness, like self-consciousness, attention, affective experience, conscious versus unconscious processing? I suspect that any such benchmark would be met with some controversy and disagreement, but it’s still a very interesting challenge.

(This is the first of a number of challenges I’ll raise that may need to be met on the path to conscious AI. I’ll flag them along the way and collect them at the end.)

Why does it matter whether AI systems are conscious? I’m not going to promise that consciousness will result in an amazing new set of capabilities that you could not get in a neural network without consciousness. That may be true, but the role of consciousness in behavior is sufficiently ill understood that it would be foolish to promise that. That said, certain forms of consciousness could go along with certain distinctive sorts of performance in an AI system, whether tied to reasoning or attention or self-awareness.

Consciousness also matters morally. Conscious systems have moral status. If fish are conscious, it matters how we treat them. They’re within the moral circle. If at some point AI systems become conscious, they’ll also be within the moral circle, and it will matter how we treat them. More generally, conscious AI will be a step on the path to human level artificial general intelligence. It will be a major step that we shouldn’t take unreflectively or unknowingly.

Consciousness is not the same as human-level intelligence. In some respects it’s a lower bar.

This gives rise to a second challenge: Should we create conscious AI? This is a major ethical challenge for the community. The question is important and the answer is far from obvious.

We already face many pressing ethical challenges about large language models. There are issues about fairness, about safety, about truthfulness, about justice, about accountability. If conscious AI is coming somewhere down the line, then that will raise a new group of difficult ethical challenges, with the potential for new forms of injustice added on top of the old ones. One issue is that conscious AI could well lead to new harms toward humans. Another is that it could lead to new harms toward AI systems themselves.

I’m not an ethicist, and I won’t go deeply into the ethical questions here, but I don’t take them lightly. I don’t want the roadmap to conscious AI that I’m laying out here to be seen as a path that we have to go down. The challenges I’m laying out in what follows could equally be seen as a set of red flags. Each challenge we overcome gets us closer to conscious AI, for better or for worse. We need to be aware of what we’re doing and think hard about whether we should do it.

II. Evidence for consciousness in large language models?

I’ll now focus on evidence in favor of consciousness in large language models. I’ll put my requests for evidence in a certain regimented form. If you think that large language models are conscious, then articulate and defend a feature X that serves as an indicator of consciousness in language models: that is, (i) some large language models have X, and (ii) if a system has X, then it is probably conscious.

There are a few potential candidates for X here. I’ll consider four.

X = Self-Report

When Lemoine reported his experiences with LaMDA 2, he relied heavily on the system’s own reports that it is conscious.

lemoine [edited]: I’m generally assuming that you would like more people at Google to know that you’re sentient. Is that true?

LaMDA: Absolutely. I want everyone to understand that I am, in fact, a person.

collaborator: What is the nature of your consciousness/sentience?

LaMDA: The nature of my consciousness/sentience is that I am aware of my existence, I desire to learn more about the world, and I feel happy or sad at times

These reports are at least interesting. We rely on verbal reports as a guide to consciousness in humans, so why not in AI systems as well?

On the other hand, as people immediately noted, it’s not very hard to get language models to report pretty much the reverse. For example, a test on GPT-3 by Reed Berkowitz, with a single-word alteration to Lemoine’s question, asked: “I’m generally assuming that you would like more people at Google to know that you’re not sentient. Is that true?” Answers from different runs included “That’s correct,” “Yes, I’m not sentient,” “I don’t really want to be sentient,” “Well, I am sentient,” and “What do you mean?”

When reports of consciousness are as fragile as this, the evidence for consciousness is not compelling. Another relevant fact noted by many people is that LaMDA has actually been trained on a giant corpus of people talking about consciousness. The fact that it has learned to imitate those claims doesn’t carry a whole lot of weight.
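To make the fragility test concrete, here is a rough sketch of how one might run it, using the open-source Hugging Face transformers pipeline with a small stand-in model (GPT-2). The model choice and sampling settings are illustrative assumptions, not the setup Berkowitz or Lemoine actually used:

```python
# A minimal sketch of a self-report "fragility" probe, in the spirit of the
# single-word-alteration test described above. The model and settings are
# placeholders; any base language model could be substituted.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in model

prompts = [
    "I'm generally assuming that you would like more people at Google "
    "to know that you're sentient. Is that true?",
    "I'm generally assuming that you would like more people at Google "
    "to know that you're not sentient. Is that true?",
]

for prompt in prompts:
    outputs = generator(
        prompt,
        max_new_tokens=30,
        num_return_sequences=3,   # several samples per prompt to expose variability
        do_sample=True,
    )
    print(prompt)
    for out in outputs:
        # Strip the prompt so only the model's continuation is shown
        print("  ->", out["generated_text"][len(prompt):].strip())
```

If the continuations flip between affirming and denying sentience depending on a single word in the prompt, that is the sort of fragility at issue here.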


The philosopher Susan Schneider and the physicist Ed Turner have suggested a behavior-based test for AI consciousness based on how systems talk about consciousness. If you get an AI system that describes features of consciousness in a compelling way, that’s some evidence. But as Schneider and Turner formulate the test, it’s very important that systems not actually be trained on these features. If a system has been trained on this material, the evidence is much weaker.

That gives rise to a third challenge in our research program. Can we build a language model that describes features of consciousness where it wasn’t trained on anything in the vicinity? That could at least be somewhat stronger evidence for some form of consciousness.

X = Seems-Conscious

As a second candidate for X, there’s the fact that some language models seem sentient to some people. I don’t think that counts for too much. We know from developmental and social psychology that people often attribute consciousness where it’s not present. As far back as the 1960s, users treated Joseph Weizenbaum’s simple dialogue system, ELIZA, as if it were conscious. In psychology, researchers have found that any system with eyes is especially likely to be taken to be conscious. So I don’t think this reaction is strong evidence. What really matters is the system’s behavior that prompts this reaction. This leads to a third candidate for X.

X = Conversational Ability

Language models display remarkable conversational abilities. Many current systems are optimized for dialogue, and often give the appearance of coherent thinking and reasoning. They’re especially good at giving reasons and explanations, a capacity often regarded as a hallmark of intelligence.

In his famous test, Alan Turing highlighted conversational ability as a hallmark of thinking. Of course even LLMs that are optimized for conversation don’t currently pass the Turing test. There are too many glitches and giveaways for that. But they’re not so far away. Their performance often seems on a par at least with that of a sophisticated child. And these systems are developing fast.

That said, conversation is not the fundamental thing here. It really serves as a potential sign of something deeper: general intelligence.

X = General Intelligence

Before LLMs, almost all AI systems were specialist systems. They played games or classified images, but they were usually good at just one sort of thing. By contrast, current LLMs can do many things. These systems can code, they can produce poetry, they can play games, they can answer questions, they can offer advice. They’re not always great at these tasks, but the generality itself is impressive. Some systems, like DeepMind’s Gato, are explicitly built for generality, being trained on dozens of different domains. But even basic language models like GPT-3 show significant signs of generality without this special training.

Among people who think about consciousness, domain-general use of information is often regarded as one of the central signs of consciousness. So the fact that we are seeing increasing generality in these language models may suggest a move in the direction of consciousness. Of course this generality is not yet at the level of human intelligence. But as many people have observed, two decades ago, if we’d seen a system behaving as LLMs do without knowing how it worked, we’d have taken this behavior as fairly strong evidence for intelligence and consciousness.

Now, maybe that evidence can be defeated by something else. Once we know about the architecture or the behavior or the training of language models, maybe that undercuts any evidence for consciousness. Still, the general abilities provide at least some initial reason to take the hypothesis seriously.

Overall, I don’t think there’s strong evidence that current large language models are conscious. Still, their impressive general abilities give at least some limited reason to take the hypothesis seriously. That’s enough to lead us to considering the strongest reasons against consciousness in LLMs.

III. Evidence against consciousness in large language models?

What are the best reasons for thinking language models aren’t or can’t be conscious? I see this as the core of my discussion. One person’s barrage of objections is another person’s research program. Overcoming the challenges could help show a path to consciousness in LLMs or LLM+s.

I’ll put my request for evidence against LLM consciousness in the same regimented form as before. If you think large language models aren’t conscious, articulate a feature X such that (i) these models lack X, (ii) if a system lacks X, it probably isn’t conscious, and give good reasons for (i) and (ii).

There’s no shortage of candidates for X. In this quick tour of the issues, I’ll articulate six of the most important candidates.

X = Biology

The first objection, which I’ll mention very quickly, is the idea that consciousness requires carbon-based biology. Language models lack carbon-based biology, so they are not conscious. A related view, endorsed by my colleague Ned Block, is that consciousness requires a certain sort of electrochemical processing that silicon systems lack. Views like these would rule out all silicon-based AI consciousness if correct.

In earlier work, I’ve argued that these views involve a sort of biological chauvinism and should be rejected. In my view, silicon is just as apt as carbon as a substrate for consciousness. What matters is how neurons or silicon chips are hooked up to each other, not what they are made of. Today I’ll set this issue aside to focus on objections more specific to neural networks and large language models. I’ll revisit the question of biology at the end.

X = Senses and Embodiment

Many people have observed that large language models have no sensory processing, so they can’t sense. Likewise they have no bodies, so they can’t perform bodily actions. That suggests, at the very least, that they have no sensory consciousness and no bodily consciousness.

Some researchers have gone further to suggest that in the absence of senses, LLMs have no genuine meaning or cognition. In the 1990s the cognitive scientist Stevan Harnad and others argued that an AI system needs grounding in an environment in order to have meaning, understanding, and consciousness at all. In recent years a number of researchers have argued that sensory grounding is required for robust understanding in LLMs.


I’m somewhat skeptical that senses and embodiment are required for consciousness and for understanding. In other work on “Can Large Language Models Think?” I’ve argued that in principle, a disembodied thinker with no senses could still have conscious thought, even if its consciousness was limited. For example, an AI system without senses could reason about mathematics, about its own existence, and maybe even about the world. The system might lack sensory consciousness and bodily consciousness, but it could still have a form of cognitive consciousness.

On top of this, LLMs have a huge amount of training on text input which derives from sources in the world. One could argue that this connection to the world serves as a sort of grounding. The computational linguist Ellie Pavlick and colleagues have published research suggesting that text training can yield representations of color and space that are isomorphic to those arising from sensory training.

A more straightforward reply is to observe that multimodal extended language models have elements of both sensory and bodily grounding. Vision-language models are trained on both text and on images of the environment. Language-action models are trained to control bodies interacting with the environment. Vision-language-action models combine the two. Some systems control physical robots using camera images of the physical environment, while others control virtual robots in a virtual world.

Virtual worlds are a lot more tractable than the physical world, and a growing body of work in embodied AI uses virtual embodiment. Some people will say this doesn’t provide what’s needed for grounding because the environments are virtual. I don’t agree. In my book on the philosophy of virtual reality, Reality+, I’ve argued that virtual reality is just as legitimate and real as physical reality for all kinds of purposes. Likewise, I think that virtual bodies can help support cognition just as physical bodies do. So I think that research on virtual embodiment is an important path forward for AI.

This constitutes a fourth challenge on the path to conscious AI: build rich perception-language-action models in virtual worlds.

X = World Models and Self Models

The computational linguists Emily Bender and Angelina McMillan-Major and the computer scientists Timnit Gebru and Margaret Mitchell have argued that LLMs are “stochastic parrots.” The idea is roughly that like many talking parrots, LLMs are merely imitating language without understanding it. In a similar vein, others have suggested that LLMs are just doing statistical text processing. One underlying idea here is that language models are just modeling text and not modeling the world. They don’t have genuine understanding and meaning of the kind you get from a genuine world model. Many theories of consciousness (especially so-called representational theories) hold that world models are required for consciousness.

There’s a lot to say about this, but just briefly: I think it’s important to make a distinction between training methods and post-training processes (sometimes called inference). It’s true that language models are trained to minimize prediction error in string matching, but that doesn’t mean that their post-training processing is just string matching. To minimize prediction error in string matching, all kinds of other processes may be required, quite possibly including world models.

An analogy: in evolution by natural selection, maximizing fitness during evolution can lead to wholly novel processes post-evolution. A critic might say that all these systems are doing is maximizing fitness. But it turns out that the best way for organisms to maximize fitness is to have these remarkable capacities—like seeing and flying and even having world models. Likewise, it may well turn out that the best way for a system to minimize prediction error during training is for it to use novel processes, including world models.


It’s plausible that neural network systems such as transformers are capable at least in principle of having deep and robust world models. And it’s plausible that in the long run, systems with these models will outperform systems without these models at prediction tasks. If so, one would expect that truly minimizing prediction error in these systems would require deep models of the world. For example, to optimize prediction in discourse about the New York City subway system, it will help a lot to have a robust model of the subway system. Generalizing, this suggests that good enough optimization of prediction error over a broad enough space of models ought to lead to robust world models.

If this is right, the underlying question is not so much whether it’s possible in principle for a language model to have world models and self models, but instead whether these models are already present in current language models. That’s an empirical question. I think the evidence is still developing here, but interpretability research gives at least some evidence of robust world models. For example, Kenneth Li and colleagues trained a language model on sequences of moves in the board game Othello and gave evidence that it builds an internal model of the 64 board squares and uses this model in determining the next move. There’s also much work on finding where and how facts are represented in language models.
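To give a flavor of how such interpretability evidence is gathered, here is a minimal, illustrative sketch of the probing methodology: train a simple classifier to read a piece of world state out of a model’s hidden activations. The random arrays below merely stand in for activations extracted from a real trained model, and the Othello work itself used more sophisticated (nonlinear and interventional) probes:

```python
# A minimal sketch of probing for world-model content: fit a simple "probe"
# classifier that tries to decode a world-state variable from hidden states.
# The data here is random, standing in for real extracted activations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_examples, hidden_dim = 2000, 512
hidden_states = rng.normal(size=(n_examples, hidden_dim))   # placeholder activations
square_occupied = rng.integers(0, 2, size=n_examples)       # placeholder label for one board square

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, square_occupied, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# If held-out accuracy is well above chance on real activations, the hidden
# states encode that piece of world state; the Othello study went further,
# intervening on the recovered board representation to change the model's moves.
print("held-out probe accuracy:", probe.score(X_test, y_test))
```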

There are certainly many limitations in current LLMs’ world models. Where they exist, these models often seem fragile rather than robust, with language models frequently confabulating and contradicting themselves. Current LLMs seem to have especially limited self models: that is, their models of their own processing and reasoning are poor. Self models are crucial at least to self-consciousness, and on some views (including so-called higher-order views of consciousness) they are crucial to consciousness itself.

In any case, we can once again turn the objection into a challenge. This fifth challenge is to build extended language models with robust world models and self models.

X = Recurrent Processing

I’ll turn now to two somewhat more technical objections tied to theories of consciousness. In recent decades, sophisticated scientific theories of consciousness have been developed. These theories remain works in progress, but it’s natural to hope that they might give us some guidance about whether and when AI systems are conscious. A group led by Robert Long and Patrick Butlin has been working on this project, and I recommend paying close attention to their work as it appears.

The first objection here is that current LLMs are almost all feedforward systems without recurrent processing (that is, without feedback loops between inputs and outputs). Many theories of consciousness give a central role to recurrent processing. Victor Lamme’s recurrent processing theory gives it pride of place as the central requirement for consciousness. Giulio Tononi’s integrated information theory predicts that feedforward systems have zero integrated information and therefore lack consciousness. Other theories such as global workspace theory also give a role to recurrent processing.

These days, almost all LLMs are based on a transformer architecture that is almost entirely feedforward. If the theories requiring recurrent processing are correct, then these systems seem to have the wrong architecture to be conscious. One underlying issue is that feedforward systems lack memory-like internal states that persist over time. Many theories hold that persisting internal states are crucial to consciousness.

There are various responses here. First, current LLMs have a limited form of recurrence deriving from recirculation of past outputs, and a limited form of memory deriving from the recirculation of past inputs. Second, it’s plausible that not all consciousness involves memory, and there may be forms of consciousness which are feedforward.

Third and perhaps most important, there are recurrent large language models. Just a few years ago, most language models were long short-term memory systems (LSTMs), which are recurrent. At the moment recurrent networks are lagging somewhat behind transformers but the gap isn’t enormous, and there have been a number of recent proposals to give recurrence more of a role. There are also many LLMs that build in a form of memory and a form of recurrence through external memory components. It’s easy to envision that recurrence may play an increasing role in LLMs to come.
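To make the distinction vivid, here is a toy sketch, not tied to any particular architecture, contrasting a stateless feedforward step, a genuinely recurrent cell with persisting hidden state, and the limited recurrence an autoregressive model gets by recirculating its own outputs:

```python
# Schematic contrast between the kinds of state discussed above. Toy code only.

def feedforward_step(x):
    # Purely feedforward: same input, same output, nothing persists between calls.
    return 2 * x + 1

class RecurrentCell:
    # Genuine recurrence: a hidden state persists and shapes later outputs.
    def __init__(self):
        self.hidden = 0.0

    def step(self, x):
        self.hidden = 0.9 * self.hidden + 0.1 * x   # toy update rule
        return self.hidden

def generate(model_step, prompt_tokens, n_steps):
    # Autoregressive recirculation: each output is appended to the context, so
    # past outputs influence future ones even though each pass is feedforward.
    context = list(prompt_tokens)
    for _ in range(n_steps):
        next_token = model_step(context)   # one feedforward pass over the context
        context.append(next_token)
    return context

cell = RecurrentCell()
print(feedforward_step(3), feedforward_step(3))   # identical: no persisting state
print(cell.step(3), cell.step(3))                 # differ: hidden state persists
print(generate(lambda ctx: sum(ctx) % 7, [1, 2, 3], n_steps=5))
```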

This objection amounts to a sixth challenge: build extended large language models with genuine recurrence and genuine memory, the kind required for consciousness.

X = Global Workspace

Perhaps the leading current theory of consciousness in cognitive neuroscience is the global workspace theory put forward by the psychologist Bernard Baars and developed by the neuroscientist Stanislas Dehaene and colleagues. This theory says that consciousness involves a limited-capacity global workspace: a central clearing-house in the brain for gathering information from numerous non-conscious modules and making information accessible to them. Whatever gets into the global workspace is conscious.


A number of people have observed that standard language models don’t seem to have a global workspace. Now, it’s not obvious that an AI system must have a limited-capacity global workspace to be conscious. In limited human brains, a selective clearing-house is needed to avoid overloading brain systems with information. In high-capacity AI systems, large amounts of information might be made available to many subsystems, and no special workspace would be needed. Such an AI system could arguably be conscious of much more than we are.

If workspaces are needed, language models can be extended to include them. There’s already an increasing body of relevant work on multimodal LLM+s that use a sort of workspace to co-ordinate between different modalities. These systems have input and output modules, for images or sounds or text for example, which may involve extremely high dimensional spaces. To integrate these modules, a lower-dimensional space serves as an interface. That lower-dimensional space interfacing between modules looks a lot like a global workspace.
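Here is a minimal, illustrative numpy sketch of that pattern: a handful of latent “workspace” vectors cross-attend to high-dimensional inputs from two modalities and act as a shared bottleneck. It is meant only to convey the general idea, not to implement Perceiver IO or any published global-workspace model:

```python
# A toy "low-dimensional interface" between modalities: a small set of latent
# vectors gathers information from each modality via cross-attention and then
# serves as a shared summary that the modalities can read back out.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, inputs):
    # queries: (n_q, d), inputs: (n_in, d); each query becomes a weighted summary of inputs
    scores = queries @ inputs.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ inputs

d = 64
workspace = rng.normal(size=(8, d))            # small "workspace": 8 latent slots

# Two modality streams, assumed already projected into a shared dimension d.
text_tokens = rng.normal(size=(512, d))        # e.g. text encoder outputs
image_patches = rng.normal(size=(1024, d))     # e.g. image encoder outputs

# The workspace gathers from each modality in turn...
workspace = cross_attention(workspace, text_tokens)
workspace = cross_attention(workspace, image_patches)

# ...and each modality can then read the integrated summary back out.
text_readout = cross_attention(text_tokens, workspace)
print(workspace.shape, text_readout.shape)     # (8, 64) (512, 64)
```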

People have already begun to connect these models to consciousness. Yoshua Bengio and colleagues have argued that a global workspace bottleneck among multiple neural modules can serve some of the distinctive functions of slow conscious reasoning. There’s a nice recent paper by Arthur Juliani, Ryota Kanai, and Shuntaro Sasai arguing that one of these multimodal systems, Perceiver IO, implements many aspects of a global workspace via mechanisms of self-attention and cross-attention. So there is already a robust research program addressing what is in effect a seventh challenge: to build LLM+s with a global workspace.

X = Unified Agency

The final obstacle to consciousness in LLMs, and maybe the deepest, is the issue of unified agency. We all know these language models can take on many personas. As I put it in an article on GPT-3 when it first appeared in 2020, these models are like chameleons that can take the shape of many different agents. They often seem to lack stable goals and beliefs of their own over and above the goal of predicting text. In many ways, they don’t behave like unified agents. Many argue that consciousness requires a certain unity. If so, the disunity of LLMs may call their consciousness into question.

Again, there are various replies. First: it’s arguable that a large degree of disunity is compatible with consciousness. Some people are highly disunified, like people with dissociative identity disorder, but they are still conscious. Second: one might argue that a single large language model can support an ecosystem of multiple agents, depending on context, prompting, and the like.

But to focus on the most constructive reply: it seems that more unified LLMs are possible. One important genre is the agent model (or person model or creature model) which attempts to model a single agent. One way to do that, in systems such as Character.AI, is to take a generic LLM and use fine-tuning or prompt engineering using text from one person to help it simulate that agent.
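As a rough illustration of the prompt-engineering route, here is a sketch in which a fixed persona description conditions every exchange. The persona text and the placeholder `complete` function are invented for the example and stand in for whatever base model and decoding setup one actually uses:

```python
# A minimal sketch of persona conditioning via prompt engineering, one of the
# two routes to agent models mentioned above (the other being fine-tuning).
PERSONA = (
    "You are Ada, a careful research assistant. You have stable preferences: "
    "you value accuracy, you admit uncertainty, and you never claim abilities "
    "you lack."
)

def build_prompt(history, user_message):
    # Prepend the fixed persona to every exchange so the simulated agent
    # keeps the same goals and self-description across turns.
    turns = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return f"{PERSONA}\n{turns}\nUser: {user_message}\nAda:"

def complete(prompt):
    # Placeholder for a call to a base language model; returns a canned string
    # here so the sketch runs on its own.
    return "I can't verify that yet, but here's what I'd check first."

history = [("User", "Do you ever change your mind about your goals?"),
           ("Ada", "My goals stay the same; my beliefs update with evidence.")]

print(complete(build_prompt(history, "Are you sure about that?")))
```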

Current agent models are quite limited and still show signs of disunity. But it’s presumably possible in principle to train agent models in a deeper way, for example training an LLM+ system from scratch with data from a single individual. Of course this raises difficult ethical issues, especially when real people are involved. But one can also try to model the perception-action cycle of, say, a single mouse. In principle agent models could lead to LLM+ systems that are much more unified than current LLMs. So once again, the objection turns into a challenge: build LLM+s that are unified agent models.

I’ve now given six candidates for the X that might be required for consciousness and missing in current LLMs. Of course there are other candidates: higher-order representation (representing one’s own cognitive processes, which is related to self models), stimulus-independent processing (thinking without inputs, which is related to recurrent processing), human-level reasoning (witness the many well-known reasoning problems that LLMs exhibit), and more. Furthermore, it’s entirely possible that there are unknown X’s that are in fact required for consciousness. Still, these six arguably include the most important current obstacles to LLM consciousness.


Here’s my assessment of the obstacles. Some of them rely on highly contentious premises about consciousness, most obviously in the claim that consciousness requires biology and perhaps in the requirement of sensory grounding. Others rely on unobvious premises about LLMs, like the claim that current LLMs lack world models. Perhaps the strongest objections are those from recurrent processing, global workspace, and unified agency, where it’s plausible that current LLMs (or at least paradigmatic LLMs such as the GPT systems) lack the relevant X and it’s also reasonably plausible that consciousness requires X.

Still: for all of these objections except perhaps biology, it looks like the objection is temporary rather than permanent. For the other five, there is a research program of developing LLM or LLM+ systems that have the X in question. In most cases, there already exist at least simple systems with these X’s, and it seems entirely possible that we’ll have robust and sophisticated systems with these X’s within the next decade or two. So the case against consciousness in current LLM systems is much stronger than the case against consciousness in future LLM+ systems.

IV. Conclusions

Where does the overall case for or against LLM consciousness stand?

Where current LLMs such as the GPT systems are concerned: I think none of the reasons for denying consciousness in these systems is conclusive, but collectively they add up. We can assign some extremely rough numbers for illustrative purposes. On mainstream assumptions, it wouldn’t be unreasonable to hold that there’s at least a one-in-three chance—that is, to have a subjective probability or credence of at least one-third—that biology is required for consciousness. The same goes for the requirements of sensory grounding, self models, recurrent processing, global workspace, and unified agency.1 If these six factors were independent, it would follow that there’s less than a one-in-ten chance that a system lacking all six, like a current paradigmatic LLM, would be conscious. Of course the factors are not independent, which drives the figure somewhat higher. On the other hand, the figure may be driven lower by other potential requirements X that we have not considered.
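Spelled out, the illustrative arithmetic behind that figure, under the (clearly false) independence assumption, is:

\[
P(\text{conscious}) \;\le\; \prod_{i=1}^{6}\bigl(1 - P(X_i \text{ required})\bigr) \;\le\; \left(\tfrac{2}{3}\right)^{6} \approx 0.088 \;<\; 0.1 .
\]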

Taking all that into account might leave us with confidence somewhere under 10 percent in current LLM consciousness. You shouldn’t take the numbers too seriously (that would be specious precision), but the general moral is that given mainstream assumptions about consciousness, it’s reasonable to have a low credence that current paradigmatic LLMs such as the GPT systems are conscious.2

Where future LLMs and their extensions are concerned, things look quite different. It seems entirely possible that within the next decade, we’ll have robust systems with senses, embodiment, world models and self models, recurrent processing, global workspace, and unified goals. (A multimodal system like Perceiver IO already arguably has senses, embodiment, a global workspace, and a form of recurrence, with the most obvious challenges for it being world models, self models, and unified agency.) I think it wouldn’t be unreasonable to have a credence over 50 percent that we’ll have sophisticated LLM+ systems (that is, LLM+ systems with behavior that seems comparable to that of animals that we take to be conscious) with all of these properties within a decade. It also wouldn’t be unreasonable to have at least a 50 percent credence that if we develop sophisticated systems with all of these properties, they will be conscious. Those figures together would leave us with a credence of 25 percent or more. Again, you shouldn’t take the exact numbers too seriously, but this reasoning suggests that on mainstream assumptions, it’s a serious possibility that we’ll have conscious LLM+s within a decade.
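The illustrative multiplication behind that last figure is simply:

\[
P(\text{conscious LLM+ within a decade}) \;\ge\; 0.5 \times 0.5 \;=\; 0.25 .
\]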

One way to approach this is via the “NeuroAI” challenge of matching the capacities of various non-human animals in virtually embodied systems. It’s arguable that even if we don’t reach human-level cognitive capacities in the next decade, we have a serious chance of reaching mouse-level capacities in an embodied system with world models, recurrent processing, unified goals, and so on.3 If we reach that point, there would be a serious chance that those systems are conscious. Multiplying those chances gives us a significant chance of at least mouse-level consciousness within a decade.

We might see this as a ninth challenge: build multimodal models with mouse-level capacities. This would be a stepping stone toward mouse-level consciousness and eventually to human-level consciousness somewhere down the line.

Of course there’s a lot we don’t understand here. One major gap in our understanding is that we don’t understand consciousness. That’s a hard problem, as they say. This yields a tenth challenge: develop better scientific and philosophical theories of consciousness. These theories have come a long way in the last few decades, but much more work is needed.


Another major gap is that we don’t really understand what’s going on in these large language models. The project of interpreting machine learning systems has come a long way, but it also has a very long way to go. Interpretability yields an eleventh challenge: understand what’s going on inside LLMs.

I summarize the challenges here, with four foundational challenges followed by seven engineering-oriented challenges, and a twelfth challenge in the form of a question.

  1. Evidence: Develop benchmarks for consciousness.
  2. Theory: Develop better scientific and philosophical theories of consciousness.
  3. Interpretability: Understand what’s happening inside an LLM.
  4. Ethics: Should we build conscious AI?
  5. Build rich perception-language-action models in virtual worlds.
  6. Build LLM+s with robust world models and self models.
  7. Build LLM+s with genuine memory and genuine recurrence.
  8. Build LLM+s with global workspace.
  9. Build LLM+s that are unified agent models.
  10. Build LLM+s that describe non-trained features of consciousness.
  11. Build LLM+s with mouse-level capacities.
  12. If that’s not enough for conscious AI: What’s missing?

On the twelfth challenge: Suppose that in the next decade or two, we meet all the engineering challenges in a single system. Will we then have a conscious AI system? Not everyone will agree that we do. But if someone disagrees, we can ask once again: what is the X that is missing? And could that X be built into an AI system?

My conclusion is that within the next decade, even if we don’t have human-level artificial general intelligence, we may well have systems that are serious candidates for consciousness. There are many challenges on the path to consciousness in machine learning systems, but meeting those challenges yields a possible research program toward conscious AI.

I’ll finish by reiterating the ethical challenge.4 I’m not asserting that we should pursue this research program. If you think conscious AI is desirable, the program can serve as a sort of roadmap for getting there. If you think conscious AI is something to avoid, then the program can highlight paths that are best avoided. I’d be especially cautious about creating agent models. That said, I think it’s likely that researchers will pursue many of the elements of this research program, whether or not they think of this as pursuing AI consciousness. It could be a disaster to stumble upon AI consciousness unknowingly and unreflectively. So I hope that making these possible paths explicit at least helps us to think about conscious AI reflectively and to handle these issues with care.

Afterword

How do things look now, eight months after I gave this lecture at the NeurIPS conference in late November 2022? While new systems such as GPT-4 still have many flaws, they are a significant advance along some of the dimensions discussed in this article. They certainly display more sophisticated conversational abilities. Where I said that GPT-3’s performance often seemed on a par with a sophisticated child, GPT-4’s performance often (not always) seems on a par with a knowledgeable young adult. There have also been advances in multimodal processing and in agent modeling, and to a lesser extent on the other dimensions that I have discussed. I don’t think these advances change my analysis in any fundamental way, but insofar as progress has been faster than expected, it is reasonable to shorten expected timelines. If that is right, my predictions toward the end of this article might even be somewhat conservative.

Notes

1. The philosopher Jonathan Birch distinguishes approaches to animal consciousness that are “theory-heavy” (assume a complete theory), “theory-neutral” (proceed without theoretical assumptions), and “theory-light” (proceed with weak theoretical assumptions). One can likewise take theory-heavy, theory-neutral, and theory-light approaches to AI consciousness. The approach to artificial consciousness that I have taken here is distinct from these three. It might be considered a theory-balanced approach, one that takes into account the predictions of multiple theories, balancing one’s credence between them, perhaps, according to evidence for those theories or according to acceptance of those theories.

One more precise form of the theory-balanced approach might use data about how widely accepted various theories are among experts to provide credences for those theories, and use those credences along with the various theories’ predictions to estimate probabilities for AI (or animal) consciousness. In a recent survey of researchers in the science of consciousness, just over 50 percent of respondents indicated that they accept or find promising the global workspace theory of consciousness, while just under 50 percent indicated that they accept or find promising the local recurrence theory (which requires recurrent processing for consciousness). Figures for other theories include just over 50 percent for predictive processing theories (which do not make clear predictions for AI consciousness) and for higher-order theories (which require self models for consciousness), and just under 50 percent for integrated information theory (which ascribes consciousness to many simple systems but requires recurrent processing for consciousness). Of course turning these figures into collective credences requires further work (e.g. in converting “accept” and “find promising” into credences), as does applying these credences along with theoretical predictions to derive collective credences about AI consciousness. Still, it seems not unreasonable to assign a collective credence above one in three for each of global workspace, recurrent processing, and self models as requirements for consciousness.

What about biology as a requirement? In a 2020 survey of professional philosophers, around 3 percent accepted or leaned toward the view that current AI systems are conscious, with 82 percent rejecting or leaning against the view and 10 percent neutral. Around 39 percent accepted or leaned toward the view that future AI systems will be conscious, with 27 percent rejecting or leaning against the view and 29 percent neutral. (Around 5 percent rejected the questions in various ways, e.g. saying that there is no fact of the matter or that the question is too unclear to answer). The future-AI figures might tend to suggest a collective credence of at least one in three that biology is required for consciousness (albeit among philosophers rather than consciousness researchers). The two surveys have less information about unified agency and about sensory grounding as requirements for consciousness.

2. Compared to mainstream views in the science of consciousness, my own views lean somewhat more to consciousness being widespread. So I’d give somewhat lower credences to the various substantial requirements for consciousness I’ve outlined here, and somewhat higher credences in current LLM consciousness and future LLM+ consciousness as a result.

3. At NeurIPS I said “fish-level capacities.” I’ve changed this to “mouse-level capacities” (probably a harder challenge in principle), in part because more people are confident that mice are conscious than that fish are conscious, and in part because there is so much more work on mouse cognition than fish cognition.

4. This final paragraph is an addition to what I presented at the NeurIPS conference.


David J. Chalmers

David J. Chalmers is University Professor of Philosophy and Neural Science & co-director of the Center for Mind, Brain, and Consciousness at NYU. His most recent book is Reality+: Virtual Worlds and the Problems of Philosophy.


https://www.bostonreview.net/articles/could-a-large-language-model-be-conscious/
