
Game theory makes AI more correct and efficient, and LLMs compete with themselves

Author: ScienceAI

Editor | Green Luo

Imagine you have a friend who gives different answers to the same question, depending on how you ask it.

"Where is the capital of Peru?" will get an answer; "Is Lima the capital of Peru?" Will get another. You might be a little worried about your friend's intelligence, and you'll hard-hit to trust any of the answers they give.

That's exactly what is happening with many large language models (LLMs), the super-powerful machine-learning tools that power ChatGPT and other AI marvels. An open-ended, generative question yields one answer, while a discriminative question, one that involves choosing between options, often yields a different one. "There is a disconnect when the same question is phrased differently," said Athul Paul Jacob, a PhD student at the Massachusetts Institute of Technology.

To make a language model's answers more consistent, and the model more reliable overall, Jacob and his colleagues devised a game in which the model's two modes are driven to find an answer they can agree on. This simple procedure, known as the consensus game, pits an LLM against itself, using the tools of game theory to improve the model's accuracy and internal consistency.


Link to the paper: https://openreview.net/forum?id=n9xeGcI4Yg

"There has been very limited research exploring the internal consistency of these models," said Shayegan Omidshafiei, chief scientific officer at the robotics company Field AI. "This paper is one of the first to tackle the problem in a clever and systematic way, by creating a game for a language model to play with itself."

"It's really exciting work," added Ahmad Beirami, a research scientist at Google Research. For decades, he says, language models have been generating responses to prompts in the same way. "Researchers at the Massachusetts Institute of Technology have come up with novel ideas to introduce games into this process, introducing a completely different paradigm that could lead to a whole new set of applications."

Integrating games into research

The new work uses games to improve AI, a contrast with past approaches, which measured an AI program's success by its mastery of games.

In 1997, for example, IBM's Deep Blue computer defeated the chess grandmaster Garry Kasparov, a milestone for so-called thinking machines. Nineteen years later, a Google DeepMind program called AlphaGo won four of five matches against the former Go champion Lee Sedol, revealing another arena in which humans no longer reigned supreme. Machines have also surpassed humans at checkers, two-player poker, and other "zero-sum" games, in which one player's victory invariably spells another's defeat.


Athul Paul Jacob helped design the consensus game, which offers a way to improve the accuracy and reliability of large language models.

The game of Diplomacy, a favorite of politicians like John F. Kennedy and Henry Kissinger, posed an even greater challenge for AI researchers. Instead of just two opponents, the game features seven players whose motives can be hard to read. To win, a player must negotiate cooperative arrangements that anyone could breach at any time.

Diplomacy is so complex that a team from Meta was pleased when, in 2022, its AI program Cicero achieved "human-level play" over the course of 40 games. While it did not beat the world champion, Cicero did well enough against human participants to place in the top 10 percent.

During the project, Jacob, a member of the Meta team, was struck by the fact that Cicero relied on a language model to generate its dialogue with the other players. He sensed untapped potential. The team's goal, he said, was "to build the best language model we could for the purposes of playing this game." But what if instead they focused on building the best game they could to improve the performance of large language models?

"Consensual" interaction

In 2023, Jacob began to pursue that question at MIT, working with Yikang Shen, Gabriele Farina, and his advisor, Jacob Andreas, on what would become the consensus game. The core idea came from imagining a conversation between two people as a cooperative game, where success comes when a listener understands what a speaker is trying to convey. In particular, the consensus game is designed to align a language model's two systems: the generator, which handles generative questions, and the discriminator, which handles discriminative ones.

After a few months of stops and starts, the team built this principle into a full game. First, the generator receives a question. It can come from a human or from a pre-existing list, for example: "Where was Barack Obama born?" The generator then receives some candidate responses, say Honolulu, Chicago, and Nairobi. As with the question, these options can come from a human, from a list, or from a search carried out by the language model itself.

But before answering, the generator is also told whether it should answer the question correctly or incorrectly, depending on the outcome of a fair coin toss.

If the coin lands heads, the machine tries to answer correctly. The generator sends the original question, along with its chosen response, to the discriminator. If the discriminator determines that the generator deliberately sent the correct response, each of them gets one point, as a kind of incentive.

If the coin lands tails, the generator sends what it thinks is the wrong answer. If the discriminator decides it was deliberately given the wrong response, they both get a point again. The idea here is to incentivize agreement. "It's like teaching a dog a trick," Jacob explained. "You give them a treat when they do the right thing."
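To make the scoring concrete, here is a minimal Python sketch of a single round under those rules. The candidate list and the toy generator and discriminator are illustrative assumptions; in the real game, a language model's generative and discriminative modes take their place.

```python
import random

CANDIDATES = ["Honolulu", "Chicago", "Nairobi"]
CORRECT = "Honolulu"

def play_round():
    # A fair coin toss tells the generator whether to aim for a
    # correct or an incorrect answer this round.
    want_correct = random.random() < 0.5

    # Toy generator: complies with its assigned intent.
    if want_correct:
        answer = CORRECT
    else:
        answer = random.choice([c for c in CANDIDATES if c != CORRECT])

    # Toy discriminator: judges whether the answer looks correct.
    judged_correct = (answer == CORRECT)

    # Both players score only when the discriminator's verdict matches
    # the generator's assigned intent: the agreement incentive.
    reward = 1 if judged_correct == want_correct else 0
    return want_correct, answer, reward

for _ in range(3):
    print(play_round())
```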

The generator and the discriminator also begin with certain initial "beliefs." These take the form of probability distributions over the different options. For example, the generator may believe, based on information it has gleaned from the internet, that there is an 80 percent chance Obama was born in Honolulu, a 10 percent chance he was born in Chicago, a 5 percent chance he was born in Nairobi, and a 5 percent chance he was born somewhere else.

The discriminator may start with a different distribution. While the two "players" are still rewarded for reaching agreement, they are also docked points for straying too far from their original convictions. That arrangement encourages the players to incorporate their knowledge of the world, again drawn from the internet, into their responses, which should make the model more accurate. Without it, they might agree on a totally wrong answer like Delhi and still rack up points.
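The sketch below shows one plausible way to express such a penalty, assuming a KL-divergence term that measures how far a player's current distribution has drifted from its initial beliefs; the penalty weight and the specific numbers are invented for illustration, not taken from the paper.

```python
import math

# Illustrative prior over Obama's birthplace (from the example above).
prior = {"Honolulu": 0.80, "Chicago": 0.10, "Nairobi": 0.05, "other": 0.05}

def kl_divergence(p, q):
    """How far distribution p has drifted from distribution q."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p if p[k] > 0)

def penalized_score(agreement_reward, current, prior, weight=0.1):
    # Reward for agreeing, minus a penalty for straying from the prior.
    # The weight is an invented value, not taken from the paper.
    return agreement_reward - weight * kl_divergence(current, prior)

# Staying near the prior costs nothing; abandoning it is penalized,
# so agreeing on an answer the prior considers absurd no longer pays.
drifted = {"Honolulu": 0.01, "Chicago": 0.01, "Nairobi": 0.97, "other": 0.01}
print(penalized_score(1, prior, prior))    # 1.0 (no drift, no penalty)
print(penalized_score(1, drifted, prior))  # well below 1.0
```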


For each question, the two systems play roughly 1,000 games against each other. Over the course of those many iterations, each side learns about the other's beliefs and modifies its strategies accordingly.

Eventually, the generator and the discriminator begin to agree more often, as they settle into a state known as a Nash equilibrium. This is arguably the central concept in game theory. It represents a kind of balance in a game, a point at which no player can improve their personal outcome by switching strategy. In rock-paper-scissors, for example, players do best when they choose each of the three options exactly one-third of the time, and they will invariably do worse with any other strategy.
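A quick worked check of that equilibrium property (standard game theory, not code from the paper): against an opponent who plays uniformly at random, every pure strategy earns the same expected payoff, so no deviation from uniform play helps.

```python
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

# Against a uniformly random opponent, every pure strategy has the
# same expected payoff of zero, so no deviation improves on uniform
# play, which is the defining property of an equilibrium.
for move in MOVES:
    ev = sum(payoff(move, other) for other in MOVES) / 3
    print(f"{move}: expected payoff {ev:+.2f}")
```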

Within the consensus game, this can play out in many ways. The discriminator might observe that it scores a point whenever it says "correct" after the generator sends the word "Honolulu" for Obama's birthplace. Through repeated play, the generator and the discriminator learn that they will be rewarded for continuing to do this, and neither has any incentive to do anything else. That consensus represents one of many possible Nash equilibria for the question. The MIT team also relied on a modified form of Nash equilibrium, one that incorporates the players' prior beliefs, which helps keep their responses grounded in reality.
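Here is a toy version of that iterated play, assuming simple multiplicative-weights updates in a coordination game with an anchor toward each player's prior. The paper's actual equilibrium-search procedure is more sophisticated; this sketch only illustrates how repeated play can settle into agreement.

```python
import math

candidates = ["Honolulu", "Chicago", "Nairobi"]
gen_prior = [0.80, 0.15, 0.05]  # generator's initial beliefs (invented)
dis_prior = [0.60, 0.30, 0.10]  # discriminator starts elsewhere

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

gen, dis = list(gen_prior), list(dis_prior)
lr, anchor = 0.5, 0.1  # step size and pull toward the prior (invented)

for _ in range(1000):  # roughly the number of rounds played per question
    # Each option's payoff: the chance the other player agrees on it,
    # plus a small bonus for staying near one's own prior.
    gen_gain = [dis[i] + anchor * gen_prior[i] for i in range(3)]
    dis_gain = [gen[i] + anchor * dis_prior[i] for i in range(3)]
    gen = normalize([gen[i] * math.exp(lr * gen_gain[i]) for i in range(3)])
    dis = normalize([dis[i] * math.exp(lr * dis_gain[i]) for i in range(3)])

for name, g, d in zip(candidates, gen, dis):
    print(f"{name}: generator {g:.3f}, discriminator {d:.3f}")
# Both distributions concentrate on Honolulu: a consensus equilibrium.
```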

The net effect, the researchers observed, is to make the language model playing the game more accurate and more likely to give the same answer no matter how a question is asked. To test the effects of the consensus game, the team tried a set of standard questions on various moderately sized language models with 7 billion to 13 billion parameters. These models consistently got a higher percentage of correct responses than the same models that had not played the game, and they even outperformed far larger models with as many as 540 billion parameters. Playing the game also improved a model's internal consistency.

In principle, any LLM could benefit from playing the game against itself, and 1,000 rounds take only a few milliseconds on a standard laptop. "A nice benefit of the overall approach," Omidshafiei said, "is that it is very lightweight computationally, requiring no training or modification of the underlying language model."

Playing games with words

After that initial success, Jacob is now investigating other ways to bring game theory into LLM research. Preliminary results suggest that an already powerful LLM can improve further by playing a different game, tentatively called the ensemble game, against any number of smaller models. The primary LLM has at least one smaller model serving as an ally and at least one smaller model playing an adversarial role. If the primary LLM is asked to name the president of the United States, it gets a point whenever it chooses the same answer as its ally, and a point whenever it chooses a different answer from its adversary.
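A minimal sketch of that payoff rule follows; the function name and the use of exact string matching between answers are assumptions made for illustration.

```python
def ensemble_score(primary_answer, ally_answers, adversary_answers):
    """One point per ally agreed with, one per adversary disagreed with."""
    points = sum(1 for a in ally_answers if a == primary_answer)
    points += sum(1 for a in adversary_answers if a != primary_answer)
    return points

# Toy usage: agreeing with the ally while differing from the adversary
# earns the maximum score of two points.
print(ensemble_score("answer_a", ["answer_a"], ["answer_b"]))  # -> 2
print(ensemble_score("answer_b", ["answer_a"], ["answer_b"]))  # -> 0
```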

Tests suggest that these interactions with much smaller models can not only boost an LLM's performance but do so without any additional training or changes to its parameters.


Ian Gemp is bringing game theory to real-world scenarios, which could let large language models help in strategic situations.

And this is just the beginning. Because so many situations can be framed as games, the tools of game theory can be brought to bear in a variety of real-world settings, said Ian Gemp, a research scientist at Google DeepMind. In a February 2024 paper, he and his colleagues focused on negotiation scenarios, which require more elaborate exchanges than simple questions and answers. "The main goal of this project is to make language models more strategic," he said.


Paper link: https://arxiv.org/abs/2402.01704

One example he discussed at an academic conference is the review process for a journal or conference paper, particularly the exchange that follows after an initial submission receives a harsh review. Given that language models assign probabilities to different responses, researchers can construct game trees, similar to those designed for poker games, that map out the available choices and their possible consequences. "Once you have that, you can start computing Nash equilibria and then rank a bunch of rebuttals," Gemp said. The model essentially tells you: this is what we think you should reply.
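As a schematic stand-in for that idea, the sketch below ranks a few candidate rebuttals by expected value, using made-up probabilities for the reviewer's reaction; a real analysis would compute equilibria over a much deeper game tree.

```python
# All rebuttal options and probabilities here are invented for
# illustration; in practice they would come from the language model.
rebuttals = {
    "address the methodology concern":   {"accept": 0.55, "reject": 0.45},
    "add the requested ablation":        {"accept": 0.70, "reject": 0.30},
    "argue the concern is out of scope": {"accept": 0.25, "reject": 0.75},
}

def expected_value(outcomes, value={"accept": 1.0, "reject": 0.0}):
    """Probability-weighted value of a rebuttal's possible outcomes."""
    return sum(p * value[o] for o, p in outcomes.items())

# Rank rebuttals from most to least promising.
for r in sorted(rebuttals, key=lambda r: expected_value(rebuttals[r]),
                reverse=True):
    print(f"{expected_value(rebuttals[r]):.2f}  {r}")
```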

With the insights of game theory, language models will be able to handle interactions more complicated than question-and-answer problems. "The big payoff in the future has to do with longer conversations," Andreas said. "The next step is to have an AI interact with a human, not just with another language model."

Jacob sees the DeepMind work as complementary to the consensus and ensemble games. "At a high level, both of these methods combine language models and game theory," he said, even if the goals differ somewhat. While the Gemp group is casting everyday situations in a game format to help with strategic decision-making, Jacob said, "we're using what we know about game theory to improve language models at general tasks."

For now, Jacob said, these efforts represent "two branches of the same tree," two different ways to enhance the capabilities of language models. "My vision is that in a year or two, the two branches will converge."

References: https://www.quantamagazine.org/game-theory-can-make-ai-more-correct-and-efficient-20240509/