laitimes

WWW 2022 | Super double! Maximize game surprises

author:AI Tech Review
WWW 2022 | Super double! Maximize game surprises

Introduction

This article is the selected paper "BONUS! Interpretation of Maximizing Surprise. The paper was completed by Kong Yuqing's research group at the Frontier Computing Research Center of Peking University in collaboration with Associate Professor Liu Xiao of Tsinghua University and Associate Professor Grant Schoenebeck of the University of Michigan. The article examines how the (extra) score for the final round should be set in multiple rounds of two-player matches to maximize the surprise of the audience.

Graphic | Huang Zhihuan, Xu Shengwei, Kong Yuqing

Kong Yuqing Research Group of Peking University

WWW 2022 | Super double! Maximize game surprises

According to the practice of theoretical computing field, authors are sorted alphabetically by initials.

Thesis link: https://arxiv.org/abs/2107.08207

1 Introduction

"Who is the murderer?"

"What was the motive?"

"How can I de-loop?"

In the suspense TV series "Beginning", as the information is released to the audience like a cocoon, the belief in the answer in the audience's heart is constantly changing. However, for a story, under the constraint of logic, the total amount of information is limited, "XX turned out to be the murderer!" "Such an unexpected reversal with such a huge amount of information cannot happen multiple times. Therefore, how to design a strategy for information release, that is, information flow, is a question worth studying. The science fiction novel "Countdown" by Dr. Kong Yuqing, one of the authors of this paper, is also inspired by the flow of information.

This is the second paper in our information flow series, before us IJCAI-21's work SURPRISE! And How to Schedule It, through experiments, the effect of information flow in e-sports events on the quality of audience perception is studied. From a theoretical level, this paper further analyzes how to design the rules of the game to obtain the information stream with the highest expected surprise value to improve the audience's experience when watching the match.

A common attempt is to give the late-game confrontation a higher weight in determining the outcome of the game in order to generate more surprises and surprises. For example: in the classic game DOTA2 in the multiplayer online battle arena (MOBA), the player-controlled hero will have a longer resurrection time after the late death, which will have a greater impact on the situation; the game "League of Legends" will appear "Nash Baron" in 20 minutes, and the party who successfully kills "Nash Baron" will get a significant bonus, so it is often the focus of contention between the two sides; the WeChat mini game "Mind King" will be the focus of contention. The last question will reward the player with "double points"; in addition, in the "Quidditch" competition in Harry Potter, the party that wins the "Golden Snitch" will also score several times the usual score; even some sports events have tried to improve the score of the last race, such as the IAAF Diamond League, IndyCar, Formula One world, etc.

One of the purposes of these designs is to increase the level of surprise and surprise for the spectators: when people watch the game, their belief about who will win the final victory changes as the game progresses. Our previous work[1] and a range of others [2,3,4] have shown that the perceived quality of people watching matches stems in part from the surprising level of content. In this case, both theoretical and practical work will face an interesting problem of how to design a point scheme to maximize the degree of surprise in the course of the game, thereby increasing the entertainment utility of the program and increasing its revenue.

We mainly focus on multi-round matchups (such as The King of Minds), and as we described earlier, a common practice is to double or double the number of points a player can earn in the last round as an additional reward. However, there is no work to theoretically analyze how to design the final round of points, which is an important question for rule designers to consider. To solve this problem, this article theoretically analyzes how to choose the last round of integrals to maximize surprises.

2 Mode

Consider a match that lasts n rounds, with two participants, Alice and Bob; in each round, the winner gets points, and at the end of the n rounds, the higher scorer wins. In our setup, the first n-1 rounds are all scored at 1, and the score for the last round is x. In general, we need only consider the case where the points of the last round do not exceed the total number of rounds n. And, to avoid a draw, we specify that x is an integer identical to n parity.

What are the surprises?

In simple terms, we define surprises as the sum of the absolute values of the changes in beliefs that expect the audience to watch the game on one of the teams, such as Alice, to win. In the following figure, the red curve has a higher surprise value than the purple curve.

WWW 2022 | Super double! Maximize game surprises

The audience's belief depends on his a priori, and we introduce our a priori model.

What is a priori?

In reality, many times the audience is not sure about the strength of the two sides of the game in advance, but updates their estimates of the strength of the two players while watching the game. The audience's judgment of Alice's probability of winning is not the probability of Alice's actual victory, but based on their understanding of Alice's strength.

Therefore, we need to model the audience's a priori belief in the strength of the players. Consider two special cases first:

The first special case is that the audience's belief in the strength of the two participating parties does not change with the course of the game (determining the situation), for example, Zhuge Liang's seventh capture, or Sherlock Holmes and James Moriarty, who have already played many times.

The second special case is that the spectator has no prior knowledge of the strength of the two participants (uniform situation). For example, Guan Gong vs. Qin Qiong, or Sherlock Holmes and Hercule Poolo.

Beta distributions can be generalized to more general cases based on both cases. So we use the Beta distribution as a priori model.

How do I choose points for the last round?

Let's start with three insights derived from the results of our theory.

Insight 1: The greater the difference in strength between the two sides, the more reward points are required. Interestingly, we found that the optimal bonus points are around (2p-1)n, which is the score required for weaker players to flip, which we call the "rollover coefficient". So in a game where the audience thinks that the abilities of two players are very different, we should set higher reward points. Otherwise, the surprises that this game can bring will quickly diminish, resulting in a lot of "garbage time". Conversely, if two players are equally strong, additional bonus points should not be set.

Insight 2: When the audience's a priori is not biased toward one side, more uncertainty makes the optimal reward points higher. We found that in symmetrical cases, the greater the optimal reward points when the a priori was more uncertain. It is worth noting that this is different from the situation where two players are equally strong in the first case. The reason is that in the former case there is no renewal of the belief in the strength of both sides, and in the latter case the belief in the strength of both sides is renewed. In this case, as the game progresses, the audience will renew not only their belief in the final winner, but also their belief in the strength of both sides. From the perspective of the flow of information, more information will be released in the early stages of the game, so we need to set some reward points for the final round to balance the whole information release process.

Insight 3: More rounds will require higher Optimal Rewards Points. We found that as the number of rounds increased, the optimal reward points became larger. Intuitively, the "turnover factor" is positively correlated with the number of rounds, so when the number of rounds gets bigger, we need to increase the bonus points to expand the winning rate of those who fall behind, otherwise the game may soon have no suspense.

More detailed theoretical results:

WWW 2022 | Super double! Maximize game surprises

The table above shows the last round of optimal integrals in the case of Finite and Asymptotic, symmetric, Certain, and in general. where alpha, beta are the a priori parameters.

Symmetry: when the audience's transcendental beliefs are not biased toward any player, it is a clear closed formula;

Deterministic situation: When the audience is completely sure of the strength of the two participants before the game, it is the solution of a specific function, which is about equal to the "turnover coefficient", that is, the expected score required by the weaker to flip;

General case: it can be obtained by linear algorithms, when n tends to infinity, is the solution of a particular function.

3 Summary of proof

Main Challenge: The main challenge we face is that calculating the surprise value of each round is difficult. Even, in the case of asymmetry, it is not easy to calculate Alice's initial win rate. A simple idea is to derive the win rate of each state from the back to the front, and then calculate the surprise value from this. However, calculating the optimal reward x in this way requires the complexity of O(n3).

WWW 2022 | Super double! Maximize game surprises

To overcome this challenge, we need to use some properties of the Beta distribution. First, we use the main lemma to prove that we only need to analyze the belief values of the last two rounds, reducing the problem to a trade-off between the last round and the penultimate round; second, we study some important special cases (symmetry, definite, infinity) that can further simplify the analysis of the last two rounds; third, we do not actually calculate the actual expected surprise value, but only analyze how it changes with the last round of integrals. For more details of the proof, please refer to the paper.

4 Summary and Outlook

Our work solved how to design the optimal final round points in an n-round two-person match to maximize the overall surprise expected by the audience. We modeled the audience's prior beliefs about the abilities of the two players as a beta distribution and found that the optimal reward points depended heavily on the prior beliefs.

We observe that a priori with a higher skewness leads to a larger optimal reward score, and a priori with higher uncertainty in symmetry also leads to a higher optimal reward score. This is in line with our intuition, because the highly asymmetrical a priori requires a high "turnover coefficient", while the highly indeterminate a priori releases a lot of information in the first few rounds of the game, so it needs to be compensated by adding points at the end.

In future work, a feasible direction is to use our existing theory to analyze the rules of traditional sports events and find directions that can be improved. In addition, for the rules of the incomplete information game, we can introduce reinforcement learning algorithms to calculate the expected surprises that the game can bring from the player's point of view and the audience's point of view, and improve the rules based on this. Finally, as previous work proves, the timing of the release of surprises also affects the audience's experience [1], so we can introduce a time factor into the model.

bibliography

[1] Zhihuan Huang, Shengwei Xu, You Shan, Yuxuan Lu, Yuqing Kong, Tracy Liu, and Grant Schoenebeck. 2021. SURPRISE! and When to Schedule It. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 252-260.

[2] Jeffrey Ely, Alexander Frankel, and Emir Kamenica. 2015. Suspense and surprise. Journal of Political Economy123, 1 (2015), 215–260.

[3] Paolo Bizzozero, Raphael Flepp, and Egon Franck. 2016. The importance of suspense and surprise in entertainment demand: Evidence from Wimbledon.Journal of Economic Behavior & Organization130 (2016), 47–63.

[4] Babatunde Buraimo, David Forrest, Ian G McHale, and JD Tena. 2020. Unscripteddrama: soccer audience response to suspense, surprise, and shock. EconomicInquiry58, 2 (2020), 881–896.

Kong Yuqing's research group

Yuqing Kong's research group mainly studies the intersection of computer science and economics and social sciences, including peer prediction, mechanism design, information design, cognitive hierarchy, etc., especially interested in research topics that are closely related to daily life.
WWW 2022 | Super double! Maximize game surprises

Read on