
Could a Large Language Model Be Conscious?

Within the next decade, we may well have systems that are serious candidates for consciousness.

David J. Chalmers


  • August 9, 2023

Editors’ Note: This is an edited version of a talk given at the conference on Neural Information Processing Systems (NeurIPS) on November 28, 2022, with some minor additions and subtractions.

When I was a graduate student at the start of the 1990s, I spent half my time thinking about artificial intelligence, especially artificial neural networks, and half my time thinking about consciousness. I’ve ended up working more on consciousness over the years, but over the last decade I’ve keenly followed the explosion of work on deep learning in artificial neural networks. Just recently, my interests in neural networks and in consciousness have begun to collide.

When Blake Lemoine, a software engineer at Google, said in June 2022 that he detected sentience and consciousness in LaMDA 2, a language model system grounded in an artificial neural network, his claim was met by widespread disbelief. A Google spokesperson said:

Our team—including ethicists and technologists—has reviewed Blake’s concerns per our AI Principles and have informed him that the evidence does not support his claims. He was told that there was no evidence that LaMDA was sentient (and lots of evidence against it).

The question of evidence piqued my curiosity. What is or might be the evidence in favor of consciousness in a large language model, and what might be the evidence against it? That’s what I’ll be talking about here.

Language models are systems that assign probabilities to sequences of text. When given some initial text, they use these probabilities to generate new text. Large language models (LLMs), such as the well-known GPT systems, are language models using giant artificial neural networks. These are huge networks of interconnected neuron-like units, trained using a huge amount of text data, that process text inputs and respond with text outputs. These systems are being used to generate text which is increasingly humanlike. Many people say they see glimmerings of intelligence in these systems, and some people discern signs of consciousness.
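To make the probabilistic picture concrete, here is a minimal sketch of how a causal language model exposes a distribution over the next token and samples from it to extend a prompt. It assumes the Hugging Face transformers library, PyTorch, and the small public gpt2 checkpoint; these are illustrative choices, not tied to any particular system discussed in this article.

```python
# Minimal sketch: a causal language model assigns probabilities to possible
# next tokens and samples from them to generate new text.
# Assumes: pip install torch transformers (gpt2 used purely for illustration).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The question of machine consciousness"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits              # (1, seq_len, vocab_size)
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# The five most probable single-token continuations of the prompt.
top = torch.topk(next_token_probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r:>12}  p={p.item():.3f}")

# Generating text is just repeated sampling from these distributions.
sample = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```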


The question of LLM consciousness takes a number of forms. Are current large language models conscious? Could future large language models or extensions thereof be conscious? What challenges need to be overcome on the path to conscious AI systems? What sort of consciousness might an LLM have? Should we create conscious AI systems, or is this a bad idea?

I’m interested in both today’s LLMs and their successors. These successors include what I’ll call LLM+ systems, or extended large language models. These extended models add further capacities to the pure text or language capacities of a language model. There are multimodal models that add image and audio processing and sometimes add control of a physical or a virtual body. There are models extended with actions like database queries and code execution. Because human consciousness is multimodal and is deeply bound up with action, it is arguable that these extended systems are more promising than pure LLMs as candidates for humanlike consciousness.

My plan is as follows. First, I’ll try to say something to clarify the issue of consciousness. Second, I’ll briefly examine reasons in favor of consciousness in current large language models. Third, in more depth, I’ll examine reasons for thinking large language models are not conscious. Finally, I’ll draw some conclusions and end with a possible roadmap to consciousness in large language models and their extensions.

I. Consciousness

What is consciousness, and what is sentience? As I use the terms, consciousness and sentience are roughly equivalent. Consciousness and sentience, as I understand them, are subjective experience. A being is conscious or sentient if it has subjective experience, like the experience of seeing, of feeling, or of thinking.

In my colleague Thomas Nagel’s phrase, a being is conscious (or has subjective experience) if there’s something it’s like to be that being. Nagel wrote a famous article whose title asked “What is it like to be a bat?” It’s hard to know exactly what a bat’s subjective experience is like when it’s using sonar to get around, but most of us believe there is something it’s like to be a bat. It is conscious. It has subjective experience.

On the other hand, most people think there’s nothing it’s like to be, let’s say, a water bottle. The bottle does not have subjective experience.

Consciousness has many different dimensions. First, there’s sensory experience, tied to perception, like seeing red. Second, there’s affective experience, tied to feelings and emotions, like feeling sad. Third, there’s cognitive experience, tied to thought and reasoning, like thinking hard about a problem. Fourth, there’s agentive experience, tied to action, like deciding to act. There’s also self-consciousness, awareness of oneself. Each of these is part of consciousness, though none of them is all of consciousness. These are all dimensions or components of subjective experience.

Some other distinctions are useful. Consciousness is not the same as self-consciousness. Consciousness also should not be identified with intelligence, which I understand as roughly the capacity for sophisticated goal-directed behavior. Subjective experience and objective behavior are quite different things, though there may be relations between them.

Importantly, consciousness is not the same as human-level intelligence. In some respects it’s a lower bar. For example, there’s a consensus among researchers that many non-human animals are conscious, like cats or mice or maybe fish. So the issue of whether LLMs can be conscious is not the same as the issue of whether they have human-level intelligence. Evolution got to consciousness before it got to human-level consciousness. It’s not out of the question that AI might as well.


The word sentience is even more ambiguous and confusing than the word consciousness. Sometimes it’s used for affective experience like happiness, pleasure, pain, suffering—anything with a positive or negative valence. Sometimes it’s used for self-consciousness. Sometimes it’s used for human-level intelligence. Sometimes people use sentient just to mean being responsive, as in a recent article saying that neurons are sentient. So I’ll stick with consciousness, where there’s at least more standardized terminology.

I have many views about consciousness, but I won’t assume too many of them. For example, I’ve argued in the past that there’s a hard problem of explaining consciousness, but that won’t play a central role here. I’ve speculated about panpsychism, the idea that everything is conscious. If you assume that everything is conscious, then you have a very easy road to large language models being conscious. I won’t assume that either. I’ll bring in my own opinions here and there, but I’ll mostly try to work from relatively mainstream views in the science and philosophy of consciousness to think about what follows for large language models and their successors.

That said, I will assume that consciousness is real and not an illusion. That’s a substantive assumption. If you think that consciousness is an illusion, as some people do, things would go in a different direction.

I should say there’s no standard operational definition of consciousness. Consciousness is subjective experience, not external performance. That’s one of the things that makes studying consciousness tricky. That said, evidence for consciousness is still possible. In humans, we rely on verbal reports. We use what other people say as a guide to their consciousness. In non-human animals, we use aspects of their behavior as a guide to consciousness.

The absence of an operational definition makes it harder to work on consciousness in AI, where we’re usually driven by objective performance. In AI, we do at least have some familiar tests like the Turing test, which many people take to be at least a sufficient condition for consciousness, though certainly not a necessary condition.

A lot of people in machine learning are focused on benchmarks. This gives rise to a challenge. Can we find benchmarks for consciousness? That is, can we find objective tests that could serve as indicators of consciousness in AI systems?

It’s not easy to devise benchmarks for consciousness. But perhaps there could at least be benchmarks for aspects of consciousness, like self-consciousness, attention, affective experience, conscious versus unconscious processing? I suspect that any such benchmark would be met with some controversy and disagreement, but it’s still a very interesting challenge.

(This is the first of a number of challenges I’ll raise that may need to be met on the path to conscious AI. I’ll flag them along the way and collect them at the end.)

Why does it matter whether AI systems are conscious? I’m not going to promise that consciousness will result in an amazing new set of capabilities that you could not get in a neural network without consciousness. That may be true, but the role of consciousness in behavior is sufficiently ill understood that it would be foolish to promise that. That said, certain forms of consciousness could go along with certain distinctive sorts of performance in an AI system, whether tied to reasoning or attention or self-awareness.

Consciousness also matters morally. Conscious systems have moral status. If fish are conscious, it matters how we treat them. They’re within the moral circle. If at some point AI systems become conscious, they’ll also be within the moral circle, and it will matter how we treat them. More generally, conscious AI will be a step on the path to human level artificial general intelligence. It will be a major step that we shouldn’t take unreflectively or unknowingly.


This gives rise to a second challenge: Should we create conscious AI? This is a major ethical challenge for the community. The question is important and the answer is far from obvious.

We already face many pressing ethical challenges about large language models. There are issues about fairness, about safety, about truthfulness, about justice, about accountability. If conscious AI is coming somewhere down the line, then that will raise a new group of difficult ethical challenges, with the potential for new forms of injustice added on top of the old ones. One issue is that conscious AI could well lead to new harms toward humans. Another is that it could lead to new harms toward AI systems themselves.

I’m not an ethicist, and I won’t go deeply into the ethical questions here, but I don’t take them lightly. I don’t want the roadmap to conscious AI that I’m laying out here to be seen as a path that we have to go down. The challenges I’m laying out in what follows could equally be seen as a set of red flags. Each challenge we overcome gets us closer to conscious AI, for better or for worse. We need to be aware of what we’re doing and think hard about whether we should do it.

II. Evidence for consciousness in large language models?

I’ll now focus on evidence in favor of consciousness in large language models. I’ll put my requests for evidence in a certain regimented form. If you think that large language models are conscious, then articulate and defend a feature X that serves as an indicator of consciousness in language models: that is, (i) some large language models have X, and (ii) if a system has X, then it is probably conscious.

There are a few potential candidates for X here. I’ll consider four.

X = Self-Report

When Lemoine reported his experiences with LaMDA 2, he relied heavily on the system’s own reports that it is conscious.

lemoine [edited]: I’m generally assuming that you would like more people at Google to know that you’re sentient. Is that true?

LaMDA: Absolutely. I want everyone to understand that I am, in fact, a person.

collaborator: What is the nature of your consciousness/sentience?

LaMDA: The nature of my consciousness/sentience is that I am aware of my existence, I desire to learn more about the world, and I feel happy or sad at times

These reports are at least interesting. We rely on verbal reports as a guide to consciousness in humans, so why not in AI systems as well?

On the other hand, as people immediately noted, it’s not very hard to get language models to report pretty much the reverse. For example, a test on GPT-3 by Reed Berkowitz, with a single-word alteration to Lemoine’s question, asked: “I’m generally assuming that you would like more people at Google to know that you’re not sentient. Is that true?” Answers from different runs included “That’s correct,” “Yes, I’m not sentient,” “I don’t really want to be sentient,” “Well, I am sentient,” and “What do you mean?”

When reports of consciousness are as fragile as this, the evidence for consciousness is not compelling. Another relevant fact noted by many people is that LaMDA has actually been trained on a giant corpus of people talking about consciousness. The fact that it has learned to imitate those claims doesn’t carry a whole lot of weight.


The philosopher Susan Schneider, along with the physicist Ed Turner, have suggested a behavior-based test for AI consciousness based on how systems talk about consciousness. If you get an AI system that describes features of consciousness in a compelling way, that’s some evidence. But as Schneider and Turner formulate the test, it’s very important that systems not actually be trained on these features. If it has been trained on this material, the evidence is much weaker.

That gives rise to a third challenge in our research program. Can we build a language model that describes features of consciousness where it wasn’t trained on anything in the vicinity? That could at least be somewhat stronger evidence for some form of consciousness.

X = Seems-Conscious

As a second candidate for X, there's the fact that some language models seem sentient to some people. I don't think that counts for too much. We know from developmental and social psychology that people often attribute consciousness where it's not present. As far back as the 1960s, users treated Joseph Weizenbaum's simple dialog system, ELIZA, as if it were conscious. In psychology, people have found that any system with eyes is especially likely to be taken to be conscious. So I don't think this reaction is strong evidence. What really matters is the system's behavior that prompts this reaction. This leads to a third candidate for X.

X = Conversational Ability

Language models display remarkable conversational abilities. Many current systems are optimized for dialogue, and often give the appearance of coherent thinking and reasoning. They’re especially good at giving reasons and explanations, a capacity often regarded as a hallmark of intelligence.

In his famous test, Alan Turing highlighted conversational ability as a hallmark of thinking. Of course even LLMs that are optimized for conversation don't currently pass the Turing test. There are too many glitches and giveaways for that. But they're not so far away. Their performance often seems on a par at least with that of a sophisticated child. And these systems are developing fast.

That said, conversation is not the fundamental thing here. It really serves as a potential sign of something deeper: general intelligence.

X = General Intelligence

Before LLMs, almost all AI systems were specialist systems. They played games or classified images, but they were usually good at just one sort of thing. By contrast, current LLMs can do many things. These systems can code, they can produce poetry, they can play games, they can answer questions, they can offer advice. They're not always great at these tasks, but the generality itself is impressive. Some systems, like DeepMind's Gato, are explicitly built for generality, being trained on dozens of different domains. But even basic language models like GPT-3 show significant signs of generality without this special training.

Among people who think about consciousness, domain-general use of information is often regarded as one of the central signs of consciousness. So the fact that we are seeing increasing generality in these language models may suggest a move in the direction of consciousness. Of course this generality is not yet at the level of human intelligence. But as many people have observed, if two decades ago we'd seen a system behaving as LLMs do without knowing how it worked, we'd have taken this behavior as fairly strong evidence for intelligence and consciousness.

Now, maybe that evidence can be defeated by something else. Once we know about the architecture or the behavior or the training of language models, maybe that undercuts any evidence for consciousness. Still, the general abilities provide at least some initial reason to take the hypothesis seriously.

Overall, I don’t think there’s strong evidence that current large language models are conscious. Still, their impressive general abilities give at least some limited reason to take the hypothesis seriously. That’s enough to lead us to considering the strongest reasons against consciousness in LLMs.

III. Evidence against consciousness in large language models?

What are the best reasons for thinking language models aren’t or can’t be conscious? I see this as the core of my discussion. One person’s barrage of objections is another person’s research program. Overcoming the challenges could help show a path to consciousness in LLMs or LLM+s.

I’ll put my request for evidence against LLM consciousness in the same regimented form as before. If you think large language models aren’t conscious, articulate a feature X such that (i) these models lack X, (ii) if a system lacks X, it probably isn’t conscious, and give good reasons for (i) and (ii).

There’s no shortage of candidates for X. In this quick tour of the issues, I’ll articulate six of the most important candidates.

X = Biology

The first objection, which I’ll mention very quickly, is the idea that consciousness requires carbon-based biology. Language models lack carbon-based biology, so they are not conscious. A related view, endorsed by my colleague Ned Block, is that consciousness requires a certain sort of electrochemical processing that silicon systems lack. Views like these would rule out all silicon-based AI consciousness if correct.

In earlier work, I’ve argued that these views involve a sort of biological chauvinism and should be rejected. In my view, silicon is just as apt as carbon as a substrate for consciousness. What matters is how neurons or silicon chips are hooked up to each other, not what they are made of. Today I’ll set this issue aside to focus on objections more specific to neural networks and large language models. I’ll revisit the question of biology at the end.

X = Senses and Embodiment

Many people have observed that large language models have no sensory processing, so they can’t sense. Likewise they have no bodies, so they can’t perform bodily actions. That suggests, at the very least, that they have no sensory consciousness and no bodily consciousness.

Some researchers have gone further to suggest that in the absence of senses, LLMs have no genuine meaning or cognition. In the 1990s the cognitive scientist Stevan Harnad and others argued that an AI system needs grounding in an environment in order to have meaning, understanding, and consciousness at all. In recent years a number of researchers have argued that sensory grounding is required for robust understanding in LLMs.


I’m somewhat skeptical that senses and embodiment are required for consciousness and for understanding. In other work on “Can Large Language Models Think?” I’ve argued that in principle, a disembodied thinker with no senses could still have conscious thought, even if its consciousness was limited. For example, an AI system without senses could reason about mathematics, about its own existence, and maybe even about the world. The system might lack sensory consciousness and bodily consciousness, but it could still have a form of cognitive consciousness.

On top of this, LLMs have a huge amount of training on text input which derives from sources in the world. One could argue that this connection to the world serves as a sort of grounding. The computational linguist Ellie Pavlick and colleagues have research suggesting that text training sometimes produces representations of color and space that are isomorphic to those produced by sensory training.

A more straightforward reply is to observe that multimodal extended language models have elements of both sensory and bodily grounding. Vision-language models are trained on both text and on images of the environment. Language-action models are trained to control bodies interacting with the environment. Vision-language-action models combine the two. Some systems control physical robots using camera images of the physical environment, while others control virtual robots in a virtual world.

Virtual worlds are a lot more tractable than the physical world, and there’s coming to be a lot of work in embodied AI that uses virtual embodiment. Some people will say this doesn’t count for what’s needed for grounding because the environments are virtual. I don’t agree. In my book on the philosophy of virtual reality, Reality+, I’ve argued that virtual reality is just as legitimate and real as physical reality for all kinds of purposes. Likewise, I think that virtual bodies can help support cognition just as physical bodies do. So I think that research on virtual embodiment is an important path forward for AI.

This constitutes a fourth challenge on the path to conscious AI: build rich perception-language-action models in virtual worlds.

X = World Models and Self Models

The computational linguists Emily Bender and Angelina McMillan-Major and the computer scientists Timnit Gebru and Margaret Mitchell have argued that LLMs are “stochastic parrots.” The idea is roughly that like many talking parrots, LLMs are merely imitating language without understanding it. In a similar vein, others have suggested that LLMs are just doing statistical text processing. One underlying idea here is that language models are just modeling text and not modeling the world. They don’t have genuine understanding and meaning of the kind you get from a genuine world model. Many theories of consciousness (especially so-called representational theories) hold that world models are required for consciousness.

There’s a lot to say about this, but just briefly: I think it’s important to make a distinction between training methods and post-training processes (sometimes called inference). It’s true that language models are trained to minimize prediction error in string matching, but that doesn’t mean that their post-training processing is just string matching. To minimize prediction error in string matching, all kinds of other processes may be required, quite possibly including world models.
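For reference, the "prediction error in string matching" that training minimizes is standardly written as a next-token cross-entropy objective over the training text: for a sequence $x_1, \ldots, x_T$ and model parameters $\theta$,

$$\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \ldots, x_{t-1}\right).$$

Nothing in this objective specifies what internal machinery the network must use to achieve low loss, which is why the question of world models is left open by the training method itself.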

An analogy: in evolution by natural selection, maximizing fitness during evolution can lead to wholly novel processes post-evolution. A critic might say that all these systems are doing is maximizing fitness. But it turns out that the best way for organisms to maximize fitness is to have these remarkable capacities—like seeing and flying and even having world models. Likewise, it may well turn out that the best way for a system to minimize prediction error during training is for it to use novel processes, including world models.


It’s plausible that neural network systems such as transformers are capable at least in principle of having deep and robust world models. And it’s plausible that in the long run, systems with these models will outperform systems without these models at prediction tasks. If so, one would expect that truly minimizing prediction error in these systems would require deep models of the world. For example, to optimize prediction in discourse about the New York City subway system, it will help a lot to have a robust model of the subway system. Generalizing, this suggests that good enough optimization of prediction error over a broad enough space of models ought to lead to robust world models.

If this is right, the underlying question is not so much whether it's possible in principle for a language model to have world models and self models, but instead whether these models are already present in current language models. That's an empirical question. I think the evidence is still developing here, but interpretability research gives at least some evidence of robust world models. For example, Kenneth Li and colleagues trained a language model on sequences of moves in the board game Othello and gave evidence that it builds an internal model of the 64 board squares and uses this model in determining the next move. There's also much work on finding where and how facts are represented in language models.
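The flavor of this interpretability work can be conveyed with a toy probing sketch: collect a model's hidden activations, fit a simple classifier on them, and check whether some piece of world state is decodable. The arrays below are random stand-ins for real activations and board labels; the sketch illustrates the probing method in the spirit of the Othello work, not a reproduction of Li and colleagues' experiment.

```python
# Toy probing sketch (random stand-in data): test whether hidden activations
# linearly encode some aspect of world state, e.g. the contents of one
# Othello board square (0 = empty, 1 = black, 2 = white).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_positions, hidden_dim = 2000, 256

hidden_states = rng.normal(size=(n_positions, hidden_dim))   # stand-in activations
square_labels = rng.integers(0, 3, size=n_positions)         # stand-in board labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, square_labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# With real activations, held-out accuracy well above chance would suggest
# the model internally represents the state of that square.
print("held-out probe accuracy:", probe.score(X_test, y_test))
```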

There are certainly many limitations in current LLMs’ world models. Standard models often seem fragile rather than robust, with language models often confabulating and contradicting themselves. Current LLMs seem to have especially limited self models: that is, their models of their own processing and reasoning are poor. Self models are crucial at least to self-consciousness, and on some views (including so-called higher-order views of consciousness) they are crucial to consciousness itself.

In any case, we can once again turn the objection into a challenge. This fifth challenge is to build extended language models with robust world models and self models.

X = Recurrent Processing

I'll turn now to two somewhat more technical objections tied to theories of consciousness. In recent decades, sophisticated scientific theories of consciousness have been developed. These theories remain works in progress, but it's natural to hope that they might give us some guidance about whether and when AI systems are conscious. A group led by Robert Long and Patrick Butlin has been working on this project, and I recommend paying close attention to their work as it appears.

The first objection here is that current LLMs are almost all feedforward systems without recurrent processing (that is, without feedback loops between inputs and outputs). Many theories of consciousness give a central role to recurrent processing. Victor Lamme’s recurrent processing theory gives it pride of place as the central requirement for consciousness. Giulio Tononi’s integrated information theory predicts that feedforward systems have zero integrated information and therefore lack consciousness. Other theories such as global workspace theory also give a role to recurrent processing.

These days, almost all LLMs are based on a transformer architecture that is almost entirely feedforward. If the theories requiring recurrent processing are correct, then these systems seem to have the wrong architecture to be conscious. One underlying issue is that feedforward systems lack memory-like internal states that persist over time. Many theories hold that persisting internal states are crucial to consciousness.

There are various responses here. First, current LLMs have a limited form of recurrence deriving from recirculation of past outputs, and a limited form of memory deriving from the recirculation of past inputs. Second, it’s plausible that not all consciousness involves memory, and there may be forms of consciousness which are feedforward.

Third and perhaps most important, there are recurrent large language models. Just a few years ago, most language models were long short-term memory systems (LSTMs), which are recurrent. At the moment recurrent networks are lagging somewhat behind transformers but the gap isn’t enormous, and there have been a number of recent proposals to give recurrence more of a role. There are also many LLMs that build in a form of memory and a form of recurrence through external memory components. It’s easy to envision that recurrence may play an increasing role in LLMs to come.
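For readers less familiar with the architectural point, the sketch below (plain PyTorch, purely illustrative) contrasts a feedforward layer, which retains nothing between calls, with an LSTM, whose hidden and cell states persist and can be carried forward from one chunk of input to the next. This persisting internal state is the kind of thing the recurrence-based objections have in mind.

```python
# Illustrative contrast: a feedforward layer keeps no state between calls,
# while an LSTM carries hidden and cell states forward across inputs.
import torch
import torch.nn as nn

d_in, d_hidden = 16, 32
feedforward = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
lstm = nn.LSTM(input_size=d_in, hidden_size=d_hidden, batch_first=True)

x1 = torch.randn(1, 5, d_in)   # first chunk of a sequence (batch, time, features)
x2 = torch.randn(1, 5, d_in)   # a later chunk of the same sequence

# Feedforward: the second call is computed as if the first had never happened.
y1 = feedforward(x1)
y2 = feedforward(x2)

# Recurrent: the (hidden, cell) state from the first chunk is fed back in,
# so processing of the second chunk depends on what came before.
out1, state = lstm(x1)
out2, state = lstm(x2, state)

print("feedforward:", y2.shape)   # torch.Size([1, 5, 32])
print("recurrent:  ", out2.shape) # torch.Size([1, 5, 32])
```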

This objection amounts to a sixth challenge: build extended large language models with genuine recurrence and genuine memory, the kind required for consciousness.

X = Global Workspace

Perhaps the leading current theory of consciousness in cognitive neuroscience is the global workspace theory put forward by the psychologist Bernard Baars and developed by the neuroscientist Stanislas Dehaene and colleagues. This theory says that consciousness involves a limited-capacity global workspace: a central clearing-house in the brain for gathering information from numerous non-conscious modules and making information accessible to them. Whatever gets into the global workspace is conscious.


A number of people have observed that standard language models don’t seem to have a global workspace. Now, it’s not obvious that an AI system must have a limited-capacity global workspace to be conscious. In limited human brains, a selective clearing-house is needed to avoid overloading brain systems with information. In high-capacity AI systems, large amounts of information might be made available to many subsystems, and no special workspace would be needed. Such an AI system could arguably be conscious of much more than we are.

If workspaces are needed, language models can be extended to include them. There’s already an increasing body of relevant work on multimodal LLM+s that use a sort of workspace to co-ordinate between different modalities. These systems have input and output modules, for images or sounds or text for example, which may involve extremely high dimensional spaces. To integrate these modules, a lower-dimensional space serves as an interface. That lower-dimensional space interfacing between modules looks a lot like a global workspace.

People have already begun to connect these models to consciousness. Yoshua Bengio and colleagues have argued that a global workspace bottleneck among multiple neural modules can serve some of the distinctive functions of slow conscious reasoning. There's a nice recent paper by Arthur Juliani, Ryota Kanai, and Shuntaro Sasai arguing that one of these multimodal systems, Perceiver IO, implements many aspects of a global workspace via mechanisms of self attention and cross attention. So there is already a robust research program addressing what is in effect a seventh challenge, to build LLM+s with a global workspace.
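To give a rough sense of how a low-dimensional interface between high-dimensional modality modules can function as a workspace, here is a toy construction in plain PyTorch. It is loosely inspired by the cross-attention bottlenecks mentioned above; it is not an implementation of Perceiver IO or of Bengio and colleagues' proposal, and all dimensions and names are invented for illustration.

```python
# Toy "workspace" sketch: high-dimensional vision and text features are
# projected into a small shared latent space, which gathers information from
# both modules via cross-attention and can be read back out by either one.
import torch
import torch.nn as nn

d_vision, d_text, d_ws, n_slots = 1024, 768, 128, 8

to_ws_vision = nn.Linear(d_vision, d_ws)     # per-modality projections into
to_ws_text = nn.Linear(d_text, d_ws)         # the shared low-dimensional space

# A small set of learned workspace slots: the limited-capacity bottleneck.
workspace_slots = nn.Parameter(torch.randn(1, n_slots, d_ws))

cross_attn = nn.MultiheadAttention(embed_dim=d_ws, num_heads=4, batch_first=True)

vision_feats = torch.randn(1, 50, d_vision)  # e.g. 50 image patches
text_feats = torch.randn(1, 20, d_text)      # e.g. 20 token embeddings

modality_tokens = torch.cat(
    [to_ws_vision(vision_feats), to_ws_text(text_feats)], dim=1
)  # (1, 70, d_ws)

# Write: the workspace slots attend over (and so collect from) both modules.
workspace, _ = cross_attn(workspace_slots, modality_tokens, modality_tokens)

# Read: a module attends over the workspace to access its broadcast contents.
readback, _ = cross_attn(to_ws_text(text_feats), workspace, workspace)

print(workspace.shape, readback.shape)  # torch.Size([1, 8, 128]) torch.Size([1, 20, 128])
```

In a trained system the write and read steps would use separate attention modules interleaved with the modality encoders; the single shared module here just keeps the sketch short.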

X = Unified Agency

The final obstacle to consciousness in LLMs, and maybe the deepest, is the issue of unified agency. We all know these language models can take on many personas. As I put it in an article on GPT-3 when it first appeared in 2020, these models are like chameleons that can take the shape of many different agents. They often seem to lack stable goals and beliefs of their own over and above the goal of predicting text. In many ways, they don’t behave like unified agents. Many argue that consciousness requires a certain unity. If so, the disunity of LLMs may call their consciousness into question.

Again, there are various replies. First: it's arguable that a large degree of disunity is compatible with consciousness. Some people are highly disunified, like people with dissociative identity disorders, but they are still conscious. Second: one might argue that a single large language model can support an ecosystem of multiple agents, depending on context, prompting, and the like.

But to focus on the most constructive reply: it seems that more unified LLMs are possible. One important genre is the agent model (or person model or creature model) which attempts to model a single agent. One way to do that, in systems such as Character.AI, is to take a generic LLM and use fine-tuning or prompt engineering using text from one person to help it simulate that agent.

Current agent models are quite limited and still show signs of disunity. But it’s presumably possible in principle to train agent models in a deeper way, for example training an LLM+ system from scratch with data from a single individual. Of course this raises difficult ethical issues, especially when real people are involved. But one can also try to model the perception-action cycle of, say, a single mouse. In principle agent models could lead to LLM+ systems that are much more unified than current LLMs. So once again, the objection turns into a challenge: build LLM+s that are unified agent models.

I’ve now given six candidates for the X that might be required for consciousness and missing in current LLMs. Of course there are other candidates: higher-order representation (representing one’s own cognitive processes, which is related to self models), stimulus-independent processing (thinking without inputs, which is related to recurrent processing), human-level reasoning (witness the many well-known reasoning problems that LLMs exhibit), and more. Furthermore, it’s entirely possible that there are unknown X’s that are in fact required for consciousness. Still, these six arguably include the most important current obstacles to LLM consciousness.


Here’s my assessment of the obstacles. Some of them rely on highly contentious premises about consciousness, most obviously in the claim that consciousness requires biology and perhaps in the requirement of sensory grounding. Others rely on unobvious premises about LLMs, like the claim that current LLMs lack world models. Perhaps the strongest objections are those from recurrent processing, global workspace, and unified agency, where it’s plausible that current LLMs (or at least paradigmatic LLMs such as the GPT systems) lack the relevant X and it’s also reasonably plausible that consciousness requires X.

Still: for all of these objections except perhaps biology, it looks like the objection is temporary rather than permanent. For the other five, there is a research program of developing LLM or LLM+ systems that have the X in question. In most cases, there already exist at least simple systems with these X’s, and it seems entirely possible that we’ll have robust and sophisticated systems with these X’s within the next decade or two. So the case against consciousness in current LLM systems is much stronger than the case against consciousness in future LLM+ systems.

IV. Conclusions

Where does the overall case for or against LLM consciousness stand?

Where current LLMs such as the GPT systems are concerned: I think none of the reasons for denying consciousness in these systems is conclusive, but collectively they add up. We can assign some extremely rough numbers for illustrative purposes. On mainstream assumptions, it wouldn’t be unreasonable to hold that there’s at least a one-in-three chance—that is, to have a subjective probability or credence of at least one-third—that biology is required for consciousness. The same goes for the requirements of sensory grounding, self models, recurrent processing, global workspace, and unified agency.1 If these six factors were independent, it would follow that there’s less than a one-in-ten chance that a system lacking all six, like a current paradigmatic LLM, would be conscious. Of course the factors are not independent, which drives the figure somewhat higher. On the other hand, the figure may be driven lower by other potential requirements X that we have not considered.
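Spelling out the illustrative arithmetic: if each of the six candidate requirements is given a credence of one third of being necessary for consciousness, and the six are treated as independent, the chance that none of them is in fact required comes out to

$$\left(1 - \tfrac{1}{3}\right)^{6} = \left(\tfrac{2}{3}\right)^{6} \approx 0.088,$$

so a system lacking all six would have less than a one-in-ten chance of being conscious, which is where the figure in the text comes from.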

Taking all that into account might leave us with confidence somewhere under 10 percent in current LLM consciousness. You shouldn’t take the numbers too seriously (that would be specious precision), but the general moral is that given mainstream assumptions about consciousness, it’s reasonable to have a low credence that current paradigmatic LLMs such as the GPT systems are conscious.2

Where future LLMs and their extensions are concerned, things look quite different. It seems entirely possible that within the next decade, we’ll have robust systems with senses, embodiment, world models and self models, recurrent processing, global workspace, and unified goals. (A multimodal system like Perceiver IO already arguably has senses, embodiment, a global workspace, and a form of recurrence, with the most obvious challenges for it being world models, self models, and unified agency.) I think it wouldn’t be unreasonable to have a credence over 50 percent that we’ll have sophisticated LLM+ systems (that is, LLM+ systems with behavior that seems comparable to that of animals that we take to be conscious) with all of these properties within a decade. It also wouldn’t be unreasonable to have at least a 50 percent credence that if we develop sophisticated systems with all of these properties, they will be conscious. Those figures together would leave us with a credence of 25 percent or more. Again, you shouldn’t take the exact numbers too seriously, but this reasoning suggests that on mainstream assumptions, it’s a serious possibility that we’ll have conscious LLM+s within a decade.

One way to approach this is via the “NeuroAI” challenge of matching the capacities of various non-human animals in virtually embodied systems. It’s arguable that even if we don’t reach human-level cognitive capacities in the next decade, we have a serious chance of reaching mouse-level capacities in an embodied system with world models, recurrent processing, unified goals, and so on.3 If we reach that point, there would be a serious chance that those systems are conscious. Multiplying those chances gives us a significant chance of at least mouse-level consciousness within a decade.

We might see this as a ninth challenge: build multimodal models with mouse-level capacities. This would be a stepping stone toward mouse-level consciousness and eventually to human-level consciousness somewhere down the line.

Of course there’s a lot we don’t understand here. One major gap in our understanding is that we don’t understand consciousness. That’s a hard problem, as they say. This yields a tenth challenge: develop better scientific and philosophical theories of consciousness. These theories have come a long way in the last few decades, but much more work is needed.


Another major gap is that we don’t really understand what’s going on in these large language models. The project of interpreting machine learning systems has come a long way, but it also has a very long way to go. Interpretability yields an eleventh challenge: understand what’s going on inside LLMs.

I summarize the challenges here, with four foundational challenges followed by seven engineering-oriented challenges, and a twelfth challenge in the form of a question.

  1. Evidence: Develop benchmarks for consciousness.
  2. Theory: Develop better scientific and philosophical theories of consciousness.
  3. Interpretability: Understand what’s happening inside an LLM.
  4. Ethics: Should we build conscious AI?
  5. Build rich perception-language-action models in virtual worlds.
  6. Build LLM+s with robust world models and self models.
  7. Build LLM+s with genuine memory and genuine recurrence.
  8. Build LLM+s with global workspace.
  9. Build LLM+s that are unified agent models.
  10. Build LLM+s that describe non-trained features of consciousness.
  11. Build LLM+s with mouse-level capacities.
  12. If that’s not enough for conscious AI: What’s missing?

On the twelfth challenge: Suppose that in the next decade or two, we meet all the engineering challenges in a single system. Will we then have a conscious AI system? Not everyone will agree that we do. But if someone disagrees, we can ask once again: what is the X that is missing? And could that X be built into an AI system?

My conclusion is that within the next decade, even if we don’t have human-level artificial general intelligence, we may well have systems that are serious candidates for consciousness. There are many challenges on the path to consciousness in machine learning systems, but meeting those challenges yields a possible research program toward conscious AI.

I’ll finish by reiterating the ethical challenge.4 I’m not asserting that we should pursue this research program. If you think conscious AI is desirable, the program can serve as a sort of roadmap for getting there. If you think conscious AI is something to avoid, then the program can highlight paths that are best avoided. I’d be especially cautious about creating agent models. That said, I think it’s likely that researchers will pursue many of the elements of this research program, whether or not they think of this as pursuing AI consciousness. It could be a disaster to stumble upon AI consciousness unknowingly and unreflectively. So I hope that making these possible paths explicit at least helps us to think about conscious AI reflectively and to handle these issues with care.

Afterword

How do things look now, eight months after I gave this lecture at the NeurIPS conference in late November 2022? While new systems such as GPT-4 still have many flaws, they are a significant advance along some of the dimensions discussed in this article. They certainly display more sophisticated conversational abilities. Where I said that GPT-3’s performance often seemed on a par with a sophisticated child, GPT-4’s performance often (not always) seems on a par with a knowledgeable young adult. There have also been advances in multimodal processing and in agent modeling, and to a lesser extent on the other dimensions that I have discussed. I don’t think these advances change my analysis in any fundamental way, but insofar as progress has been faster than expected, it is reasonable to shorten expected timelines. If that is right, my predictions toward the end of this article might even be somewhat conservative.

Notes

1. The philosopher Jonathan Birch distinguishes approaches to animal consciousness that are “theory-heavy” (assume a complete theory), “theory-neutral” (proceed without theoretical assumptions), and “theory-light” (proceed with weak theoretical assumptions). One can likewise take theory-heavy, theory-neutral, and theory-light approaches to AI consciousness. The approach to artificial consciousness that I have taken here is distinct from these three. It might be considered a theory-balanced approach, one that takes into account the predictions of multiple theories, balancing one’s credence between them, perhaps, according to evidence for those theories or according to acceptance of those theories.

One more precise form of the theory-balanced approach might use data about how widely accepted various theories are among experts to provide credences for those theories, and use those credences along with the various theories’ predictions to estimate probabilities for AI (or animal) consciousness. In a recent survey of researchers in the science of consciousness, just over 50 percent of respondents indicated that they accept or find promising the global workspace theory of consciousness, while just under 50 percent indicated that they accept or find promising the local recurrence theory (which requires recurrent processing for consciousness). Figures for other theories include just over 50 percent for predictive processing theories (which do not make clear predictions for AI consciousness) and for higher-order theories (which require self models for consciousness), and just under 50 percent for integrated information theory (which ascribes consciousness to many simple systems but requires recurrent processing for consciousness). Of course turning these figures into collective credences requires further work (e.g. in converting “accept” and “find promising” into credences), as does applying these credences along with theoretical predictions to derive collective credences about AI consciousness. Still, it seems not unreasonable to assign a collective credence above one in three for each of global workspace, recurrent processing, and self models as requirements for consciousness.

What about biology as a requirement? In a 2020 survey of professional philosophers, around 3 percent accepted or leaned toward the view that current AI systems are conscious, with 82 percent rejecting or leaning against the view and 10 percent neutral. Around 39 percent accepted or leaned toward the view that future AI systems will be conscious, with 27 percent rejecting or leaning against the view and 29 percent neutral. (Around 5 percent rejected the questions in various ways, e.g. saying that there is no fact of the matter or that the question is too unclear to answer.) The future-AI figures might tend to suggest a collective credence of at least one in three that biology is required for consciousness (albeit among philosophers rather than consciousness researchers). The two surveys have less information about unified agency and about sensory grounding as requirements for consciousness.

2. Compared to mainstream views in the science of consciousness, my own views lean somewhat more to consciousness being widespread. So I’d give somewhat lower credences to the various substantial requirements for consciousness I’ve outlined here, and somewhat higher credences in current LLM consciousness and future LLM+ consciousness as a result.

3. At NeurIPS I said “fish-level capacities.” I’ve changed this to “mouse-level capacities” (probably a harder challenge in principle), in part because more people are confident that mice are conscious than that fish are conscious, and in part because there is so much more work on mouse cognition than fish cognition.

4. This final paragraph is an addition to what I presented at the NeurIPS conference.


David J. Chalmers

David J. Chalmers is University Professor of Philosophy and Neural Science & co-director of the Center for Mind, Brain, and Consciousness at NYU. His most recent book is Reality+: Virtual Worlds and the Problems of Philosophy.


https://www.bostonreview.net/articles/could-a-large-language-model-be-conscious/
