Counting successes in three billion ordinal games

David Goforth, Mathematics and Computer Science, Laurentian University
David Robinson, Economics, Laurentian University

Abstract

Using a combination of mathematical analysis and exhaustive enumeration by computer modeling, we investigate social efficiency in the play of two-person, ordinal games. Our results might inform the debate on a question such as this: If players expect to participate in a series of two-person games of randomly varying payoff structure, what decision rule should they choose? Which strategies are robust over the range of possible games? We consider
- elimination of dominated strategies,
- maximax – the optimistic strategy of focusing on best outcome, and
- maximin – the conservative strategy of maximizing the worst possible outcome.

Most of the classic results have been obtained in 2x2 games with players assumed to maximize their own payoffs. Here we consider the performance of these strategies as the number of choices increases. We also incorporate player goals besides self-interest, such as fairness, relative payoff, and efficiency into decision making.

To measure the social efficiency of a particular encounter in game play, we use the simple concept of shortfall: the difference between the maximum total payoff possible, and the actual total of the outcome the players achieve. Strategies are then analyzed according to their cumulative shortfall over the range of games. A second measure based on Rawlsian justice confirms the results.

The most surprising result is that the elimination of dominated strategies does not perform as well as the naive maximizing approaches, especially as the number of choices grows. We show that the socially most effective policy might be to have players use a maximizing strategy with a goal of efficiency. In other words, the social product is highest if no-one cares who gets the credit.
Introduction

Designers of virtual marketplaces have considerable freedom in creating environments for transactions. A commercial auction marketplace like eBay is designed to provide a service that can be sold profitably, and research continues to provide even more advanced services of this nature [1]. Rosenschein and Zlotkin (1994) considered the possibility of designing marketplaces that encouraged specific kinds of behaviour in participants, behaviour that advanced particular social goals. For example, they proposed a marketplace for long-distance telephone service purchasing that tipped the balance of control on price from the large service providers to individual consumers.

[1] Wurman, Wellman and Walsh, 2002.

Both commercial designers and academics like Rosenschein and Zlotkin have assumed that the participants in their marketplaces will be acting in their own self interest, and a lot of research effort is now focused on automating the participant role [2]. More and more, there are on-line environments where this assumption may be inappropriate. For example, on a site for allocating public resources like land for development, does the (government) vendor necessarily have the goal of maximizing revenue for the land, or is the sale also an instrument of policy with broader social goals? In a corporate setting, different divisions of a company may interact. Should each be acting in self interest or should they be maximizing some combined utility?

[2] Sadeh et al, 2003.

To approach this question we have gone back to small ordinal games. The question we ask is: How much is lost when players have various and possibly different preferences over their own and others’ payoffs and differing strategy selection rules? We evaluate outcomes using two social criteria: rank efficiency and rank Rawlsian justice.

We limit our investigation to small ordinal games because the number of distinct games to tabulate grows quickly: there are 144 2 x 2 games and, in general, $(mn)!^2/(m!\,n!)$ games of size m x n, which already means over three billion 3 x 3 games. Our method is direct – a program generates all possible games of any size, assesses outcomes for each game under every combination of preferences and strategies, and counts the games at every level of efficiency or justice. We present quantitative summaries of this conceptually simple but computationally demanding exercise.

Characteristics of game players

The outcome of a two-person game is dependent on the moves chosen by each player. In the classic game format, the players have complete knowledge of the payoffs but must make their choices independently and concurrently, hence without knowledge of each other’s choice.

Figure 1 (at the end of the paper) shows an ordinal 2 x 2 game. In this game, the Nash equilibrium (large red) is the outcome found by the elimination of dominated strategies (EDS). Other possible strategies include the optimistic Maximax and the conservative Maximin originally described by von Neumann. All produce the same outcome in this case.

As a playing strategy, EDS has its problems, the most serious being that it fails to determine an outcome in 25% (36/144) of the games. Figure 2 (at the end of the paper) summarizes the performance of EDS in solving the 144 2 x 2 ordinal games [3].

[3] Many sources count 78 2 x 2 ordinal games; we have retained the games that are identical except for exchange of player roles because we wish to emphasize later that players in 2 x 2 games can find themselves in 144 distinct situations. The quantitative results are altered very little if the reflected games are removed.

The 101 satisfactory solutions are the games in which EDS produces a Pareto-optimal outcome. ‘36 dominant strategy equilibrium’ counts the games in which both players have a dominant strategy, and its descendant ‘1 Pareto-dominated’ is the Prisoner’s Dilemma. When the number of choices increases, the portion of games unsolved by EDS grows.
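The counting formula and the 2 x 2 breakdown above can be checked with a short enumeration. The following is only a minimal sketch, not the authors' program: the cell ordering, the canonical-form trick for merging relabelled games, and the function names `count_games`, `all_2x2_games` and `eds_2x2` are all assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def count_games(m, n):
    """Number of m x n ordinal games: (mn)!^2 / (m! n!)."""
    return factorial(m * n) ** 2 // (factorial(m) * factorial(n))


# Assumed cell order for a 2 x 2 game: (A,X), (A,Y), (B,X), (B,Y).
# Index maps for: identity, swap rows, swap columns, swap both.
RELABELLINGS = [(0, 1, 2, 3), (2, 3, 0, 1), (1, 0, 3, 2), (3, 2, 1, 0)]


def all_2x2_games():
    """Return one representative (row_payoffs, col_payoffs) per game."""
    games = set()
    for row in permutations(range(1, 5)):
        for col in permutations(range(1, 5)):
            variants = [
                (tuple(row[i] for i in idx), tuple(col[i] for i in idx))
                for idx in RELABELLINGS
            ]
            games.add(min(variants))  # canonical form of the relabelling orbit
    return sorted(games)


def eds_2x2(row, col):
    """Iterated elimination of strictly dominated strategies.

    Returns the surviving cell (r, c), or None when neither player
    has a dominant strategy and nothing can be eliminated.
    """
    r = c = None
    if row[0] > row[2] and row[1] > row[3]:
        r = 0
    elif row[2] > row[0] and row[3] > row[1]:
        r = 1
    if col[0] > col[1] and col[2] > col[3]:
        c = 0
    elif col[1] > col[0] and col[3] > col[2]:
        c = 1
    if r is None and c is None:
        return None
    if r is None:  # Row best-responds once Column's dominated move is gone
        r = 0 if row[c] > row[2 + c] else 1
    if c is None:  # Column best-responds once Row's dominated move is gone
        c = 0 if col[2 * r] > col[2 * r + 1] else 1
    return r, c


games = all_2x2_games()
unsolved = sum(1 for row, col in games if eds_2x2(row, col) is None)
print(len(games), unsolved)  # 144 36
print(count_games(2, 3), count_games(2, 4), count_games(3, 3))
```

With the Figure 1 game read as row payoffs (2, 4, 1, 3) and column payoffs (4, 1, 3, 2), `eds_2x2` returns cell (0, 0), that is, (A, X).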
In 3 x 3 games, over 47% remain unsolved, even when iterated dominance is the selection rule. The portion of solved games that end in Pareto-dominated outcomes also climbs. Seven of the 144 2 x 2 games, or 4.86%, have a Pareto-dominated solution; in 3 x 3 games, the proportion of Pareto-dominated outcomes is 7.25%. Figure 3 appears at the end of the paper.

In this project we have examined the performance of the three strategy selection rules in 2 x 2 and larger ordinal games from the point of view of social utility by evaluating the outcomes according to two measures: efficiency and Rawlsian justice (Rawls 1971). For each outcome, a shortfall is calculated. The efficiency shortfall is the difference between the maximum total payoff in the game and the total payoff of the outcome the players achieve. The justice shortfall is the difference between the best lower payoff available in the game and the lower payoff of the outcome the players achieve. By each measure, a shortfall of 0 is ideal [4]. Figure 4 appears at the end of the paper.

The final factor we have included in our analysis is player goals. Pure self-interest leads players to maximize their own payoff, but other motives have been proposed and analyzed. Here we consider six distinct player goals and operationalize them as player utility functions of both payoffs. In an m x n ordinal game, each player has payoffs 1, 2, ..., mn, and any particular m x n game can be displayed as mn points on an mn by mn grid with the constraint that each row and column must contain one point. The player utility function $U(r,c)$, with $r, c \in \{1, 2, 3, \ldots, mn\}$, orders the $m^2 n^2$ cells in the grid in a utility sequence, so the player’s preference between any two points in a game can be unambiguously determined.

Figure 5 appears at the end of the paper. In grid (a) self-interest, the blue (dark) shading shows Row’s standard goal of maximizing the row payoff. The outlined squares show the payoffs of a particular 3 x 3 game and the white square is Row’s preferred outcome.
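The two shortfall measures defined above can be illustrated on the ordinal Prisoner's Dilemma. This is a minimal sketch under assumed conventions: a game is a flat list of (row payoff, column payoff) cells, and the function names are hypothetical.

```python
def efficiency_shortfall(cells, outcome):
    """Maximum total payoff in the game minus the total of the outcome."""
    best_total = max(r + c for r, c in cells)
    return best_total - sum(outcome)


def justice_shortfall(cells, outcome):
    """Best available lower payoff minus the lower payoff of the outcome."""
    best_floor = max(min(r, c) for r, c in cells)
    return best_floor - min(outcome)


# Ordinal Prisoner's Dilemma: (cooperate, cooperate) = (3, 3),
# (defect, defect) = (2, 2), which is the dominant strategy equilibrium.
pd_cells = [(3, 3), (1, 4), (4, 1), (2, 2)]
eds_outcome = (2, 2)

print(efficiency_shortfall(pd_cells, eds_outcome))  # 6 - 4 = 2
print(justice_shortfall(pd_cells, eds_outcome))     # 3 - 2 = 1
```

The mutually cooperative cell (3, 3) would score a shortfall of 0 on both measures.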
The other grids show the outcomes Row prefers with a goal of:
(b) efficiency – maximizing total payoff,
(c) fairness – minimizing difference between payoffs,
(d) Rawlsian justice – maximizing minimum payoff,
(e) relative payoff – maximizing difference of row payoff over column, and
(f) altruism – maximizing column payoff.

[4] Our definition of 0 shortfall in total payoff efficiency is at least as strong a condition on an ordinal game outcome as Pareto-efficiency.

The three socially attractive goals, efficiency, fairness and Rawlsian justice, are all symmetric functions of individual payoffs. When individuals prefer efficiency, fairness or Rawlsian justice we say they are acting on local versions of social goals. We have not used fairness as a criterion for evaluating outcomes in this paper.

In general, a goal definition does not guarantee a unique choice. For example, the fairness solution is ambiguous, as outcomes of (7,7) and (1,1) are equally fair. Each of the goal definitions is made unambiguous by adopting a ‘tie breaking’ goal. In the fairness case, equally fair outcomes are ordered by efficiency [5].

The experiment

All games of a particular size (e.g., 2 x 2) were generated. The program determines and classifies outcomes for players with every combination of the six goals and three strategies, a total of $(6 \times 3)^2 = 324$ outcomes. In effect we have held a round robin tournament in which each player with a particular goal and strategy competes with all the others (including self) over a complete set of games. Outcomes were measured for shortfall in efficiency and justice and the results were accumulated into 324 summaries. For example, the results in Figure 4 are drawn from three of the summaries, all symmetric:
• EDS strategy, self interest goal vs EDS strategy, self interest goal
• Maximax, self interest goal vs Maximax, self interest goal
• Maximin, self interest goal vs Maximin, self interest goal

The process was repeated with 2 x 3 and 2 x 4 games.
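The six goals, their tie-breakers and the round robin structure described above can be sketched as follows. This is an illustrative reading, not the authors' code: each goal is expressed as a sort key whose later entries act as tie-breakers (following the prose of the appendix), and fairness is encoded as `-abs(r - c)` so that larger is fairer, which is a sign convention assumed for this sketch. The names `GOALS`, `STRATEGIES` and `preferred` are hypothetical.

```python
from itertools import product

# Row's view of a cell (r, c): larger keys are preferred.
# Later tuple entries break ties left by earlier ones.
GOALS = {
    "self interest":       lambda r, c: (r, r + c),
    "efficiency":          lambda r, c: (r + c, -abs(r - c), r),
    "fairness":            lambda r, c: (-abs(r - c), r + c, r),
    "Rawlsian justice":    lambda r, c: (min(r, c), r + c, r),
    "relative difference": lambda r, c: (r - c, r + c),
    "altruism":            lambda r, c: (c, r + c),
}
STRATEGIES = ["EDS", "Maximax", "Maximin"]

# Every (goal, strategy) player type meets every type, itself included.
player_types = list(product(GOALS, STRATEGIES))
pairings = list(product(player_types, repeat=2))
print(len(player_types), len(pairings))  # 18 324


def preferred(cells, goal):
    """Cell that Row ranks highest under the named goal."""
    return max(cells, key=lambda cell: GOALS[goal](*cell))


# (1,1) and (7,7) are equally fair; efficiency breaks the tie.
print(preferred([(1, 1), (7, 7)], "fairness"))  # (7, 7)
```

Under this representation, Column's ranking of a cell would be obtained by calling the same key with the payoffs swapped, `GOALS[goal](c, r)`.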
Processing of 3 x 3 games is not yet complete, but partial results without Rawlsian justice are included.

[5] Complete algorithmic and algebraic definitions of the goals and tie-breakers are provided in the appendix.

Results

Common goals and strategies: First we consider the results for homogeneous encounters in which participants are using the same strategies with the same goals. There are 18 results for games of each dimension: 2 x 2, 2 x 3, 2 x 4. The bars in each graph are grouped according to strategy, with EDS at the left followed by Maximax and Maximin. Within each strategy group, the results are in the same order as the goal definitions above, with ‘selfish’ first. Results for 3 x 3 games are still being processed; the reduced 3 x 3 analysis, with 15 results, does not include a Rawlsian justice goal. Figures 6a, b, c and d appear at the end of the paper.

The graphs show the shortfall statistic for total payoff measures. Looking first at EDS, the obvious problem is the number of unsolved games, especially with goals that are functions of both payoffs. As would be expected, when the goal of the players is efficiency, matching the measure of utility we are using, the games that are solved produce good results. However, Rawlsian justice as an individual goal produces equally good results by the efficiency measure.

Figures 7a, b and c (at the end of the paper) show that when individuals seek efficiency, they also attain social justice. It appears that justice and efficiency are compatible: targeting either one produces both.

Maximax with either of these goals selects efficient and just outcomes in 90% or more of the games. The Maximin strategy is not as effective as Maximax but does exhibit more consistency over the various player goals.

The socially destructive effect of playing to maximize relative payoff is evident. For the EDS and Maximax strategies, this goal produces the worst results by a wide margin using either measure in games of all dimensions tested.
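The contrast between goals under a maximizing rule can be seen in a single game. The sketch below assumes one plausible reading of how a goal combines with Maximax and Maximin: each player scores every cell with a goal key, then picks the strategy whose best cell (Maximax) or worst cell (Maximin) scores highest. The paper does not spell out this procedure, so the functions `choose` and `play` and the two goal keys are illustrative assumptions.

```python
def self_interest(own, other):
    # Own payoff first, total payoff as tie-breaker.
    return (own, own + other)


def efficiency(own, other):
    # Total payoff first, then smaller difference, then own payoff.
    return (own + other, -abs(own - other), own)


def choose(game, goal, rule, player):
    """Index of the strategy chosen by `player` ('row' or 'col')."""
    if player == "row":
        lines = [[goal(r, c) for r, c in row] for row in game]
    else:
        lines = [[goal(c, r) for r, c in col] for col in zip(*game)]
    pick = max if rule == "maximax" else min  # best or worst cell per strategy
    scores = [pick(line) for line in lines]
    return scores.index(max(scores))


def play(game, goal, rule):
    """Outcome cell when both players use the same goal and rule."""
    i = choose(game, goal, rule, "row")
    j = choose(game, goal, rule, "col")
    return game[i][j]


# Ordinal Prisoner's Dilemma, cells are (row payoff, column payoff).
pd_game = [[(3, 3), (1, 4)],
           [(4, 1), (2, 2)]]

print(play(pd_game, self_interest, "maximax"))  # (2, 2)
print(play(pd_game, efficiency, "maximax"))     # (3, 3)
print(play(pd_game, self_interest, "maximin"))  # (2, 2)
```

In this one game, switching the goal from self interest to efficiency moves Maximax play from the Pareto-dominated cell to the cell with zero shortfall on both measures.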
Competition against other goals and strategies: The success of the Maximax strategy with a goal of efficiency or justice extends to play against other goals and strategies. Figures 8a, b, c and d (at the end of the paper) display the accumulated success of each Row player against all eighteen opponents. An added column in the first position shows the cumulative average over all 324 encounters as a benchmark. As above, the final graph, for 3 x 3 games, does not include results for justice.

We show results for the 2 x 3 and the 4 x 2 games, demonstrating that the performance of the strategies is consistent for play with fewer or more strategies in the non-square games. Data for the complementary accumulations are available. The results of the efficiency and justice measures are also consistent, so we show only efficiency results. Justice data are available.

Again, the EDS players produce no solutions in many games and their influence is evident in results for all players. Efficiency results for EDS players are below the benchmark for all goals.

The Maximin players are consistent over the various goals, with the number of efficient solutions closely matching the benchmark.

Playing Maximax is always better than the benchmark except for the goal of relative payoff. Again, the goals of efficiency and justice produce the highest social utility, though the performance degrades with increasing choice.

Interpretation

Goals: Operationalizing the goals of game players has allowed us to investigate the cumulative social effect of players’ intentions. We have some indication of how important goals are. The results indicate that the common assumption that players play to obtain their own best response does influence results.
For our interest in the possibility of participating to promote social good, there is evidence that playing with local versions of the social goals – efficiency and justice – can positively affect the social utility even when the goals are not shared by the majority of players.

We have included two naive goals. Fairness is important to young children as they first come to awareness of social interactions. Appealing as it is, fairness does not fare well in promoting social utility. The only worse goal is relative difference. One of us (Robinson) observed that many beginning (commerce!) students in his Game Theory course explicitly play to maximize the difference between their payoffs and the opponents’. This would appear, not surprisingly, to reduce social utility.

To summarize, the experiments seem to say something obvious once observed: the closer individual players’ goals align with social goals, the greater will be the social utility of the play. We can roughly order the social value of player goals as:
1, 2  efficiency, Rawlsian justice
3, 4  self interest, altruism
5     fairness
6     relative difference

Strategy selection rules: We have tested the effectiveness of three strategy selection rules for achieving various goals in ordinal games.

In spite of its attractive connection to Nash equilibria, EDS fails to specify an outcome in a number of games. The number increases with the number of player choices and for goals that involve both payoffs.

Maximin, as a conservative philosophy of play, achieves results close to the average over each entire tournament. The attractive feature is its stability over varying goals and competition.

Maximax is a riskier strategy, achieving great success with social goals and performing poorly when a player’s goals are at odds with the opponent’s and with social welfare.
The 90+% success in homogeneous play with a goal of efficiency or justice is remarkable because it shows that non-collaborative game playing can be effective if the players approach the interaction with the right intentions and strategy selection rule. This could be the situation in intra-institutional encounters. Figure 9 appears at the end of the paper.

Even in play against a variety of opponents, the Maximax selection strategy with a social goal fares well. To the extent that this represents the situation of an institutional participant with social policy goals transacting in an open marketplace, it suggests that social utility can be promoted.

Conclusion

The investigations described here are preliminary. Extrapolation to the real world is of course risky. Nonetheless, we feel that systematic analysis of two-person games can contribute to the design of virtual marketplaces.

One result of the investigation is unequivocal. We have identified an approach to full information, non-collaborative game playing that produces Pareto-efficient outcomes in over 90% of ordinal games of order 2 x 2, 2 x 3, 2 x 4, and 3 x 3. When players are motivated by local versions of the social goals, and play Maximax, efficiency and Rawlsian justice are served.

References

Rawls, J. 1971. A Theory of Justice. Cambridge MA, Belknap Press, Harvard U. Press.
Rosenschein, J. S. and Zlotkin, G. 1994. Rules of Encounter: Designing Conventions for Automated Negotiation among Computers. Cambridge MA, MIT Press.
Sadeh, N., Arunachalam, R., Eriksson, J., Finne, N. and Janson, S. 2003. TAC-03: A Supply-Chain Trading Competition. AI Magazine 24(1), p.92-3.
Wurman, P., Wellman, M. and Walsh, W. 2002. AI Magazine 23(3), p.15-24.

Appendix – definitions of Row player goals

(Some of the algorithmic conditions are unnecessary for ordinal games but are included to define a complete ordering in the range $[1, mn+1]$ of the $m^2 n^2$ cells of the grid.)
$S_r = U_{\mathrm{selfish}}(r,c) = r$
$E = U_{\mathrm{efficient}}(r,c) = (r+c)/2$
$F = U_{\mathrm{fair}}(r,c) = 1 + |r-c|$
$R = U_{\mathrm{Rawlsian}}(r,c) = \min(r,c)$
$C_r = U_{\mathrm{relative}}(r,c) = (mn + 1 + r - c)/2$
$A_r = U_{\mathrm{altruistic}}(r,c) = c$
$\delta = 1/(2mn)$

(a) Self interest: $S_r + \delta E$
– select outcome with highest row payoff
– if several have equal highest row payoff, select among these for efficiency

(b) Efficiency: $E + \delta F + \delta^2 S_r$
– select outcome with highest row plus column total payoff
– if several have highest total payoff, select among these for fairness
– if several have equal fairness, select among these for self interest

(c) Fairness: $F + \delta E + \delta^2 S_r$
– select outcome with minimum absolute difference between row and column payoff
– if several have equal difference, select among these for efficiency
– if several have equal efficiency, select among these for self interest

(d) Rawlsian justice: $R + \delta E + \delta^2 S_r$
– select outcome with highest value of min(row payoff, column payoff)
– if several have equal highest minimum, select among these for efficiency
– if several have equal efficiency, select among these for self interest

(e) Relative difference: $C_r + \delta E$
– select outcome with greatest excess of row payoff over column payoff
– if several have equal excess, select among these for efficiency

(f) Altruism: $A_r + \delta E$
– select outcome with highest column payoff
– if several have equal highest column payoff, select among these for efficiency

Figure 1 (each cell: row payoff, column payoff)

Row \ Column    X       Y
A               2, 4    4, 1
B               1, 3    3, 2

Figure 2

144 games
    108 unique solution
        36 dominant strategy equilibrium
            35 undominated
            1 Pareto-dominated
        72 dominance solvable
            66 undominated
            6 Pareto-dominated
    36 no unique solution
        18 no equilibrium
        18 2 equilibria

101 satisfactory outcomes
43 unsatisfactory outcomes OR no outcome

Figure 3. Success of Eliminating Dominated Strategies to solve Ordinal Games (number of games in each category, by dimensions of game)

Category                        2x2       2x3       2x4           3x3
Total games                     144       43,200    33,868,800    3,657,830,400
Iterated unsolved               0         14,400    12,700,800    1,445,068,800
Unsolved                        36        0         0             180,633,600
Iterated Pareto-dominated       0         930       1,283,940     184,157,248
Solved Pareto-dominated         6         1,632     996,120       78,666,112
Equilibrium Pareto-dominated    1 (PD)    138       46,404        2,380,864
Iterated solved                 0         6,270     8,241,660     1,080,277,952
Solved                          66        16,368    9,587,880     643,868,288
Equilibrium                     35        3,462     1,011,996     42,777,536

Figure 4 (number of 2 x 2 games at each shortfall; "-" marks a cell with no entry)

Efficiency
Strategy \ Shortfall    0     1     2     3     4     5     none
Dominance               93    8     5     2     -     -     36
Maximax                 86    22    17    9     7     3     -
Maximin                 87    16    18    18    5     -     -

Rawlsian Justice
Strategy \ Shortfall    0     1     2     3     none
Dominance               91    17    -     -     36
Maximax                 76    36    32    -     -
Maximin                 83    46    15    -     -

Figure 5. Row’s preference grids: (a) Self interest, (b) Efficiency, (c) Fairness, (d) Rawlsian justice, (e) Relative difference, (f) Altruism

Figure 6a. Common goal and strategy, 2x2 games: total payoff efficiency. [Stacked bar chart: percentage of games at each shortfall level (up to 6) or with no solution, for goals (a) to (f) grouped under EDS, Maximax and Maximin.]

Figure 6b. Common goal and strategy, 2x3 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 10, or no solution; same grouping.]

Figure 6c. Common goal and strategy, 4x2 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 14, or no solution; same grouping.]

Figure 6d. Common goal and strategy, 3x3 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 16, or no solution, for goals (a), (b), (c), (e), (f) grouped under EDS, Maximax and Maximin.]

Figure 7a. Common goal and strategy, 2x2 games: justice. [Stacked bar chart: shortfall levels up to 3, or no solution, for goals (a) to (f) grouped under EDS, Maximax and Maximin.]

Figure 7b. Common goal and strategy, 2x3 games: justice. [Stacked bar chart: shortfall levels up to 5, or no solution; same grouping.]

Figure 7c. Common goal and strategy, 4x2 games: justice. [Stacked bar chart: shortfall levels up to 7, or no solution; same grouping.]

Figure 8a. Average goal and strategy, 2x2 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 6, or no solution; a Total benchmark bar followed by goals (a) to (f) grouped under EDS, Maximax and Maximin.]

Figure 8b. Average goal and strategy, 2x3 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 10, or no solution; same layout.]

Figure 8c. Average goal and strategy, 4x2 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 14, or no solution; same layout.]

Figure 8d. Average goal and strategy, 3x3 games: total payoff efficiency. [Stacked bar chart: shortfall levels up to 16, or no solution; a Total benchmark bar followed by goals (a), (b), (c), (e), (f) grouped under EDS, Maximax and Maximin.]

Figure 9. Pareto-efficiency of Maximax strategy with Efficiency goal

Game dimensions                   2x2      2x3       2x4           3x3
Total games                       144      43,200    33,868,800    3,657,830,400
Pareto-efficient (shortfall 0)    138      41,544    32,815,080    3,297,656,880
Portion of efficient solutions    95.8%    96.0%     97.0%         90.0%