Counting successes in three billion ordinal games

David Goforth, Mathematics and Computer Science, Laurentian University
David Robinson, Economics, Laurentian University
Abstract
Using a combination of mathematical analysis and exhaustive enumeration by computer modeling, we
investigate social efficiency in the play of two-person, ordinal games. Our results might inform the debate
on a question such as this: If players expect to participate in a series of two-person games of randomly
varying payoff structure, what decision rule should they choose? Which strategies are robust over the
range of possible games? We consider
-elimination of dominated strategies,
-maximax – the optimistic strategy of focusing on best outcome, and
-maximin – the conservative strategy of maximizing the worst possible outcome.
Most of the classic results have been obtained in 2x2 games with players assumed to maximize their own
payoffs. Here we consider the performance of these strategies as the number of choices increases. We also
incorporate player goals besides self-interest, such as fairness, relative payoff, and efficiency into decision
making.
To measure the social efficiency of a particular encounter in game play, we use the simple concept of
shortfall: the difference between the maximum total payoff possible, and the actual total of the outcome the
players achieve. Strategies are then analyzed according to their cumulative shortfall over the range of
games. A second measure based on Rawlsian justice confirms the results.
The most surprising result is that the elimination of dominated strategies does not perform as well as the
naive maximizing approaches, especially as the number of choices grows. We show that the socially most
effective policy might be to have players use a maximizing strategy with a goal of efficiency. In other
words, the social product is highest if no-one cares who gets the credit.
Introduction
Designers of virtual marketplaces have considerable freedom in creating environments
for transactions. A commercial auction marketplace like eBay is designed to provide a
service that can be sold profitably and research continues to provide even more advanced
services of this nature1. Rosenschein and Zlotkin (1994) considered the possibility of
designing marketplaces that encouraged specific kinds of behaviour in participants,
behaviour that advanced particular social goals. For example, they proposed a
marketplace for long-distance telephone service purchasing that tipped the balance of
control on price from the large service providers to individual consumers.
1 Wurman, Wellman and Walsh, 2002
Both commercial designers and academics like Rosenschein and Zlotkin have assumed
that the participants in their marketplaces will be acting in their own self interest and a lot
of research effort is now focused on automating the participant role2.
More and more, there are on-line environments where this assumption may be
inappropriate. For example, on a site for allocating public resources like land for
development, does the (government) vendor necessarily have the goal of maximizing
revenue for the land or is the sale also an instrument of policy with broader social goals?
In a corporate setting, different divisions of a company may interact. Should each be
acting in self interest or should they be maximizing some combined utility?
To approach this question we have gone back to small ordinal games. The question we
ask is: How much is lost when players have various and possibly different preferences
over their own and others’ payoffs and differing strategy selection rules? We evaluate
outcomes using two social criteria: rank efficiency and rank Rawlsian justice. We limit
our investigation to small ordinal games because the number of distinct games to tabulate
grows quickly: there are 144 2x2 games and, in general, ((mn)!)²/(m!n!) games of size mxn,
which means over three billion 3x3 games.
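As a quick check of the counting formula, the following minimal Python sketch evaluates ((mn)!)²/(m!n!) for the game sizes discussed in this paper. The function name is ours and purely illustrative.

```python
from math import factorial


def count_ordinal_games(m: int, n: int) -> int:
    """Number of distinct m x n ordinal games: ((mn)!)^2 / (m! n!)."""
    cells = m * n
    return factorial(cells) ** 2 // (factorial(m) * factorial(n))


for m, n in [(2, 2), (2, 3), (2, 4), (3, 3)]:
    print(f"{m}x{n}: {count_ordinal_games(m, n):,}")
# 2x2: 144
# 2x3: 43,200
# 2x4: 33,868,800
# 3x3: 3,657,830,400
```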
Our method is direct – a program generates all possible games of any size, assesses
outcomes for each game under every combination of preferences and strategies and
counts the games at every level of efficiency or justice. We present quantitative
summaries of this conceptually simple but computationally demanding exercise.
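The generation step might be sketched as below. This is not the program used for the study, only a brute-force illustration under the assumption that two games are the same when they differ only by reordering rows or columns. The helper name `enumerate_games` is hypothetical.

```python
from itertools import permutations


def enumerate_games(m, n):
    """Return one canonical representative per distinct m x n ordinal game."""
    ranks = range(1, m * n + 1)
    row_orders = list(permutations(range(m)))
    col_orders = list(permutations(range(n)))
    seen = set()
    for row_pay in permutations(ranks):
        for col_pay in permutations(ranks):
            # game[i][j] = (Row's payoff, Column's payoff) in cell (i, j)
            game = [[(row_pay[i * n + j], col_pay[i * n + j]) for j in range(n)]
                    for i in range(m)]
            # Canonical form: the smallest relabelling of rows and columns.
            canonical = min(
                tuple(tuple(game[i][j] for j in co) for i in ro)
                for ro in row_orders for co in col_orders
            )
            seen.add(canonical)
    return sorted(seen)


games_2x2 = enumerate_games(2, 2)
print(len(games_2x2))  # 144
```

This naive approach is only practical for the smallest sizes; a 3 x 3 enumeration would need a far more economical generator.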
Characteristics of game players
The outcome of a two-person game is dependent on the moves chosen by each player. In
the classic game format, the players have complete knowledge of the payoffs but must
make their choices independently and concurrently, hence without knowledge of each
other’s choice.
Figure 1 is on page 9.
In this ordinal 2 x 2 game, the Nash equilibrium (large red) is the outcome found by the
elimination of dominated strategies (EDS). Other possible strategies include the
optimistic Maximax and the conservative Maximin originally described by von
Neumann. All produce the same outcome in this case.
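The three rules can be illustrated on the game of Figure 1. The sketch below assumes self-interested players and reads each cell of Figure 1 as (Row's payoff, Column's payoff); that reading, the function names and the simplification of EDS to a search for a strictly dominant strategy are our assumptions.

```python
# Figure 1 as we read it: each cell is (Row's payoff, Column's payoff).
GAME = [[(2, 4), (4, 1)],   # row A against columns X, Y
        [(1, 3), (3, 2)]]   # row B against columns X, Y


def transpose(game):
    """Swap player roles so Column's choice can reuse Row's rules."""
    return [[(c, r) for (r, c) in col] for col in zip(*game)]


def maximax(game):
    """Row index containing the mover's best possible payoff."""
    return max(range(len(game)), key=lambda i: max(r for r, _ in game[i]))


def maximin(game):
    """Row index whose worst payoff for the mover is highest."""
    return max(range(len(game)), key=lambda i: min(r for r, _ in game[i]))


def dominant(game):
    """Index of a strictly dominant row for the mover, or None."""
    for i, row in enumerate(game):
        if all(all(a[0] > b[0] for a, b in zip(row, other))
               for k, other in enumerate(game) if k != i):
            return i
    return None


for rule in (dominant, maximax, maximin):
    print(rule.__name__, (rule(GAME), rule(transpose(GAME))))
# dominant (0, 0)
# maximax (0, 0)
# maximin (0, 0)
```

All three rules select row A and column X, the outcome with payoffs (2, 4).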
As a playing strategy, EDS has its problems, the most serious being that it fails to
determine an outcome in 25% (36/144) of the games. Here is a summary of the
performance of EDS in solving the 144 2 x 2 ordinal games.3
2 Sadeh et al., 2003
3 Many sources count 78 2 x 2 ordinal games; we have retained the games that are identical except for
exchange of player roles because we wish to emphasize later that players in 2 x 2 games can find
themselves in 144 distinct situations. The quantitative results are altered very little if the reflected games
are removed.
Figure 2 is on page 9.
The 101 satisfactory solutions are the games in which EDS produces a Pareto-optimal
outcome. ‘36 dominant strategy equilibrium’ counts the games in which both players
have a dominant strategy, and its descendant ‘1 Pareto-dominated’ is the Prisoner’s
Dilemma.
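The 36 games left unsolved by EDS can be recovered with a short count. The sketch relies on an observation that holds for strict ordinal 2 x 2 games: if either player has a dominant strategy the other has a unique best reply, so a game is unsolved exactly when neither player has a dominant strategy. It loops over all 576 labelled payoff assignments, each distinct game appearing four times under row and column relabelling. Names are illustrative.

```python
from itertools import permutations


def has_dominant_choice(pay):
    """pay[k] holds one player's payoffs for own choice k, per opponent choice."""
    a, b = pay
    return all(x > y for x, y in zip(a, b)) or all(x < y for x, y in zip(a, b))


unsolved = total = 0
for rp in permutations(range(1, 5)):        # Row's payoffs in cells AX, AY, BX, BY
    for cp in permutations(range(1, 5)):    # Column's payoffs in the same cells
        row_pay = [rp[0:2], rp[2:4]]                # rows A, B
        col_pay = [(cp[0], cp[2]), (cp[1], cp[3])]  # columns X, Y
        total += 1
        if not has_dominant_choice(row_pay) and not has_dominant_choice(col_pay):
            unsolved += 1

print(unsolved // 4, total // 4, unsolved / total)  # 36 144 0.25
```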
When the number of choices increases, the portion of games unsolved by EDS grows. In
3 x 3 games, over 47% remain unsolved, even when iterated dominance is the selection
rule. The portion of solved games that end in Pareto-dominated outcomes climbs also.
Seven of 144 2 x 2 games or 4.86% have a Pareto-dominated solution; in 3 x 3 games,
the proportion of Pareto-dominated outcomes is 7.25%.
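The two percentages can be reproduced from the counts tabulated in Figure 3. In this small sketch the 3 x 3 figure is taken to be the sum of the three Pareto-dominated categories listed there; the variable names are ours.

```python
# Counts copied from Figure 3 (3 x 3 column).
PARETO_DOMINATED_3X3 = 184_157_248 + 78_666_112 + 2_380_864
TOTAL_3X3 = 3_657_830_400

share_2x2 = 7 / 144
share_3x3 = PARETO_DOMINATED_3X3 / TOTAL_3X3
print(f"{share_2x2:.2%} {share_3x3:.2%}")  # 4.86% 7.25%
```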
Figure 3 is on page 10.
In this project we have examined the performance of the three strategy selection rules in
2 x 2 and larger ordinal games from the point of view of social utility by evaluating the
outcomes according to two measures: efficiency and Rawlsian justice (Rawls 1971). For
each outcome, a shortfall is calculated. The efficiency shortfall is the difference between
the maximum total payoff in the game and the total payoff of the outcome the players
achieve. The justice shortfall compares the best lower payoff with the actual lower
payoff of the game outcome. By each measure, a shortfall of 0 is ideal4.
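A minimal sketch of the two shortfall measures follows. The example game is the Prisoner's Dilemma in a standard ordinal encoding, chosen by us for illustration, and the function names are hypothetical.

```python
def efficiency_shortfall(game, outcome):
    """Maximum total payoff in the game minus the total payoff achieved."""
    i, j = outcome
    best_total = max(r + c for row in game for r, c in row)
    return best_total - sum(game[i][j])


def justice_shortfall(game, outcome):
    """Best attainable lower payoff minus the lower payoff achieved."""
    i, j = outcome
    best_floor = max(min(r, c) for row in game for r, c in row)
    return best_floor - min(game[i][j])


# Prisoner's Dilemma, cells are (Row's payoff, Column's payoff).
PD = [[(3, 3), (1, 4)],
      [(4, 1), (2, 2)]]

print(efficiency_shortfall(PD, (1, 1)), justice_shortfall(PD, (1, 1)))  # 2 1
print(efficiency_shortfall(PD, (0, 0)), justice_shortfall(PD, (0, 0)))  # 0 0
```

Mutual defection, the dominant strategy outcome, falls short by 2 on efficiency and by 1 on justice, while mutual cooperation has zero shortfall on both measures.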
Figure 4 is on page 11.
The final factor we have included in our analysis is player goals. Pure self-interest leads
players to maximize their own payoff but other motives have been proposed and
analyzed.
Here we consider six distinct player goals and operationalize them as player utility
functions of both payoffs. In an m x n ordinal game, each player has payoffs 1, 2, ..., mn
and any particular m x n game can be displayed as mn points on an mn by mn grid with
the constraint that each row and column must contain exactly one point. The player utility
function U(r,c), with r,c ∈ {1, 2, 3, …, mn}, orders the m²n² cells in the grid in a utility
sequence, so the player’s preference between any two points in a game can be
unambiguously determined.
Figure 5 is on page 11.
In grid (a) self-interest, the blue (dark) shading shows Row’s standard goal of
maximizing the row payoff. The outlined squares show the payoffs of a particular 3 x 3
game and the white square is Row’s preferred outcome. The other grids show the
outcomes Row prefers with a goal of:
(b) efficiency – maximizing total payoff,
(c) fairness – minimizing difference between payoffs,
(d) Rawlsian justice – maximizing minimum payoff,
(e) relative payoff – maximizing difference of row payoff over column, and
(f) altruism – maximizing column payoff.
4 Our definition of 0 shortfall in total payoff efficiency is at least as strong a condition on an ordinal game
outcome as Pareto-efficiency.
The three socially attractive goals, efficiency, fairness and Rawlsian justice, are all
symmetric functions of individual payoffs. When individuals prefer efficiency, fairness or
Rawlsian justice we say they are acting on local versions of social goals. We have not
used fairness as a criterion in this paper.
In general, a goal definition does not guarantee a unique choice. For example, the fairness
solution is ambiguous as outcomes of (7,7) and (1,1) are equally fair. Each of the goal
definitions is made unambiguous by adopting a ‘tie breaking’ goal. In the fairness case,
equally fair outcomes are ordered by efficiency.5
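One way to express goals with tie-breakers is as lexicographic sort keys, where each later component is consulted only when the earlier ones tie. The sketch below follows the verbal definitions given in the appendix for the Row player; the dictionary and function names are ours, and tuples stand in for the weighted sums used in the appendix.

```python
# Each key maps an outcome (r, c) to a tuple compared left to right:
# primary goal first, then the tie-breaking goals.
GOAL_KEYS = {
    "self_interest": lambda r, c: (r, r + c),
    "efficiency":    lambda r, c: (r + c, -abs(r - c), r),
    "fairness":      lambda r, c: (-abs(r - c), r + c, r),
    "rawlsian":      lambda r, c: (min(r, c), r + c, r),
    "relative":      lambda r, c: (r - c, r + c),
    "altruism":      lambda r, c: (c, r + c),
}


def preferred(outcomes, goal):
    """Outcome that Row ranks highest under the named goal."""
    return max(outcomes, key=lambda rc: GOAL_KEYS[goal](*rc))


# (7,7) and (1,1) are equally fair; efficiency breaks the tie.
print(preferred([(1, 1), (7, 7)], "fairness"))          # (7, 7)
print(preferred([(1, 1), (7, 7), (9, 2)], "relative"))  # (9, 2)
```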
The experiment
All games of a particular size (e.g., 2 x 2) were generated. The program determined and
classified outcomes for players with every combination of the six goals and three
strategies, a total of (6x3)² = 324 outcomes for each game. In effect we have held a round robin
tournament in which each player with a particular goal and strategy competes with all the
others (including self) over a complete set of games. Outcomes were measured for
shortfall in efficiency and justice and the results were accumulated into 324 summaries.
For example, results above (Figure 4) are drawn from three of the summaries, all
symmetric:
• EDS strategy, self interest goal vs EDS strategy, self interest goal
• Maximax, self interest goal vs Maximax, self interest goal
• Maximin, self interest goal vs Maximin, self interest goal
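The pairing structure of the tournament can be sketched in a few lines. The labels below are illustrative only; the point is that 18 player types (3 strategies by 6 goals) meeting every type, including their own, give 324 encounters, of which 18 are symmetric.

```python
from itertools import product

STRATEGIES = ["EDS", "Maximax", "Maximin"]
GOALS = ["self interest", "efficiency", "fairness",
         "Rawlsian justice", "relative payoff", "altruism"]

player_types = list(product(STRATEGIES, GOALS))        # 18 (strategy, goal) pairs
encounters = list(product(player_types, repeat=2))     # Row type vs Column type
symmetric = [(row, col) for row, col in encounters if row == col]

print(len(player_types), len(encounters), len(symmetric))  # 18 324 18
```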
The process was repeated with 2x3 and 2x4 games. Processing of 3x3 games is not yet
complete but partial results without Rawlsian justice are included.
Results
Common goals and strategies: First we consider the results for homogeneous encounters
in which participants are using the same strategies with the same goals. There are 18
results for games of each dimension: 2 x 2, 2 x 3, 2 x 4. The bars in each graph are
grouped according to strategy with EDS at the left followed by Maximax and Maximin.
Within each strategy group, the results are in the same order as the goal definitions above
with ‘selfish’ first. Results for 3 x 3 are now in process. A reduced 3 x 3 analysis with
15 results does not include a Rawlsian justice goal.
5 Complete algorithmic and algebraic definitions of the goals and tie-breakers are provided in the appendix.
Figures 6a, b, c, d are on pages 12, 13, 14, 15.
The graphs show the shortfall statistic for total payoff measures.
Looking first at EDS, the obvious problem is the number of unsolved games, especially
with goals that are functions of both payoffs. As would be expected, when the goal of the
players is efficiency - matching the measure of utility we are using - the games that are
solved produce good results. However, Rawlsian justice as an individual goal produces
equally good results by the efficiency measure.
The following graphs show that when individuals seek efficiency, they also attain social
justice. It appears that justice and efficiency are compatible: targeting either one
produces both.
Figures 7a, b, c are on pages 16, 17, 18.
Maximax with either of these goals selects efficient and just outcomes in 90% of the
games. Maximin strategy is not as effective as Maximax but does exhibit more
consistency over the various player goals. The socially destructive effect of playing to
maximize relative payoff is evident. For EDS and Maximax strategies, this goal
produces the worst results by a wide margin using either measure in games of all
dimensions tested.
Competition against other goals and strategies: The success of the Maximax strategy with
a goal of efficiency or justice extends to play against other goals and strategies.
The following graphs display accumulated success of each Row player against all
eighteen opponents. An added column in the first position shows the cumulative average
over all 324 encounters for a benchmark. As above, the final graph of 3 x 3 games does
not include results for justice.
Figures 8a, b, c, d are on pages 19, 20, 21, 22.
We show results for the 2 x 3 and the 4 x 2 games, demonstrating that the performance of
the strategies is consistent for play with fewer or more strategies in the non-square
games. Data for the complementary accumulations are available.
The results of the efficiency and justice measures are also consistent so we show only
efficiency results. Justice data are available.
Again, the EDS players produce no solutions in many games and their influence is
evident in results for all players. Efficiency results for EDS players are below the
benchmark for all goals.
The Maximin players are consistent over the various goals with the number of efficient
solutions closely matching the benchmark.
Playing Maximax is always better than the benchmark except for the goal of relative
payoff. Again, the goals of efficiency and justice produce the highest social utility
though the performance degrades with increasing choice.
Interpretation
Goals: Operationalizing the goals of game players has allowed us to investigate the
cumulative social effect of players’ intentions. The results give some indication of how
important goals are: the common assumption that players play to obtain
their own best response does influence results.
For our interest in the possibility of participating to promote social good, there is
evidence that playing with local versions of the social goals – efficiency and justice – can
positively affect the social utility even when the goals are not shared by the majority of
players.
We have included two naive goals. Fairness is important to young children as they first
come to awareness of social interactions. Appealing as it is, fairness does not fare well in
promoting social utility. The only worse goal is relative difference. One of us
(Robinson) observed that many beginning (commerce!) students in his Game Theory
course explicitly play to maximize the difference between their payoffs and the
opponents’. This would appear, not surprisingly, to reduce social utility.
To summarize, the experiments seem to say something obvious once observed: the closer
individual players’ goals align with social goals, the greater will be the social utility of
the play. We can roughly order the social value of player goals as:
1, 2  efficiency, Rawlsian justice
3, 4  self interest, altruism
5  fairness
6  relative difference
Strategy selection rules: We have tested the effectiveness of three strategy selection rules
for achieving various goals in ordinal games. In spite of its attractive connection to Nash
equilibria, EDS fails to specify an outcome in a number of games. The number increases
with the number of player choices and for goals that involve both payoffs.
Maximin, as a conservative philosophy of play, achieves results close to the average over
each entire tournament. The attractive feature is its stability over varying goals and
competition.
Maximax is a riskier strategy, achieving great success with social goals and performing
poorly when a player’s goals are at odds with the opponent’s and with social welfare.
The 90+% success in homogeneous play with a goal of efficiency or justice is remarkable
because it shows that non-collaborative game-playing can be effective if the players
approach the interaction with the right intentions and strategy selection rule. This could
be the situation in intra-institutional encounters.
Figure 9 is on page 23.
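The 2 x 2 entry of Figure 9 can be reproduced with a short sketch. It assumes that a Maximax player with a goal chooses the row (or column) containing the outcome ranked highest under that goal, and that the efficiency goal breaks ties by fairness and then by the player's own payoff, as in the appendix. These readings and all names are our assumptions, not the program used in the study. Each distinct game appears four times among the 576 labelled assignments, so counts are divided by four.

```python
from itertools import permutations


def efficiency_key(own, other):
    """Efficiency goal: total payoff, then fairness, then own payoff."""
    return (own + other, -abs(own - other), own)


def maximax_choice(cells_by_choice):
    """cells_by_choice[k] lists the (own, other) payoffs reachable by choice k."""
    return max(range(len(cells_by_choice)),
               key=lambda k: max(efficiency_key(*cell)
                                 for cell in cells_by_choice[k]))


efficient = total = 0
for rp in permutations(range(1, 5)):
    for cp in permutations(range(1, 5)):
        game = [[(rp[0], cp[0]), (rp[1], cp[1])],
                [(rp[2], cp[2]), (rp[3], cp[3])]]
        # Column sees the transposed game with payoffs swapped.
        col_view = [[(c, r) for (r, c) in col] for col in zip(*game)]
        i, j = maximax_choice(game), maximax_choice(col_view)
        best_total = max(r + c for row in game for r, c in row)
        total += 1
        efficient += sum(game[i][j]) == best_total

print(efficient // 4, total // 4)  # 138 144
```

Under these assumptions the count agrees with the 138 of 144 games reported for 2 x 2 play. The failures are games in which two outcomes tie on total and on fairness, so each player aims for a different cell.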
Even in play against a variety of opponents, the Maximax selection strategy with a social
goal fares well. To the extent that this represents the situation of an institutional
participant with social policy goals transacting in an open marketplace, it suggests that
social utility can be promoted.
Conclusion
The investigations described here are preliminary. Extrapolation to the real world is of
course risky. Nonetheless, we feel that systematic analysis of two person games can
contribute to the design of virtual marketplaces.
One result of the investigation is unequivocal. We have identified an approach to full
information, non-collaborative gameplaying that produces Pareto-efficient outcomes in
over 90% of ordinal games of order 2 x 2, 2 x 3, 2 x 4, and 3 x 3. When players are
motivated by local versions of the social goals, and play Maximax, efficiency and
Rawlsian justice are served.
References
Rawls, J. 1971. A Theory of Justice. Cambridge MA, Belknap Press, Harvard U. Press
Rosenschein, J. S. and Zlotkin, G. 1994. Rules of Encounter: Designing Conventions for Automated
Negotiation among Computers. Cambridge MA, MIT Press.
Sadeh, N., Arunachalam, R., Eriksson, J., Finne, N. and Janson, S. 2003. TAC-03: A Supply-Chain
Trading Competition. AI Magazine 24(1), p.92-3.
Wurman, P., Wellman, M. and Walsh, W. 2002. AI Magazine 23(3), p.15-24.
Appendix – definitions of Row player goals
(Some of the algorithmic conditions are unnecessary for ordinal games but are included
to define a complete ordering in the range [1,mn+1] of the m²n² cells of the grid.)
S_r = U_selfish(r,c) = r
E = U_efficient(r,c) = (r+c)/2
F = U_fair(r,c) = 1+|r-c|
R = U_Rawlsian(r,c) = min(r,c)
C_r = U_relative(r,c) = (mn+1+r-c)/2
A_r = U_altruistic(r,c) = c
δ = 1/(2mn)
(a) Self interest: S_r + δE
– select outcome with highest row payoff
– if several have equal highest row payoff, select among these for efficiency
(b) Efficiency: E + δF + δ²S_r
– select outcome with highest row plus column total payoff
– if several have highest total payoff, select among these for fairness
– if several have equal fairness, select among these for self interest
(c) Fairness: F + δE + δ²S_r
– select outcome with minimum absolute difference between row and column payoff
– if several have equal difference, select among these for efficiency
– if several have equal efficiency, select among these for self interest
(d) Rawlsian Justice: R + δE + δ²S_r
– select outcome with highest value of min(row payoff, column payoff)
– if several have equal highest minimum, select among these for efficiency
– if several have equal efficiency, select among these for self interest
(e) Relative difference: C_r + δE
– select outcome with greatest excess of row payoff over column payoff
– if several have equal excess, select among these for efficiency
(f) Altruism: A_r + δE
– select outcome with highest column payoff
– if several have equal highest column payoff, select among these for efficiency
Figure 1
Row \ Column     X       Y
A              2, 4    4, 1
B              1, 3    3, 2
(Each cell shows Row's payoff, then Column's payoff.)
Figure 2
144 games
  108 unique solution
    36 dominant strategy equilibrium
      35 undominated
      1 Pareto-dominated
    72 dominance solvable
      66 undominated
      6 Pareto-dominated
  36 no unique solution
    18 no equilibrium
    18 2 equilibria
101 satisfactory outcomes (35 + 66 undominated)
43 unsatisfactory outcomes OR no outcome (1 + 6 Pareto-dominated, 36 no unique solution)
Figure 3
Success of Eliminating Dominated Strategies to solve Ordinal Games
(number of games in each category, by dimensions of game)
Category                          2x2      2x3          2x4             3x3
Total games                       144   43,200   33,868,800   3,657,830,400
Iterated unsolved                   0   14,400   12,700,800   1,445,068,800
Unsolved                           36        0            0     180,633,600
Iterated Pareto-dominated           0      930    1,283,940     184,157,248
Solved Pareto-dominated             6    1,632      996,120      78,666,112
Equilibrium Pareto-dominated   1 (PD)      138       46,404       2,380,864
Iterated solved                     0    6,270    8,241,660   1,080,277,952
Solved                             66   16,368    9,587,880     643,868,288
Equilibrium                        35    3,462    1,011,996      42,777,536
Figure 4
Efficiency
Strategy \ Shortfall     0     1     2     3     4     5   none
Dominance               93     8     5     2     -     -     36
Maximax                 86    22    17     9     7     3      -
Maximin                 87    16    18    18     5     -      -
Rawlsian Justice
Strategy \ Shortfall     0     1     2     3   none
Dominance               91    17     -     -     36
Maximax                 76    36    32     -      -
Maximin                 83    46    15     -      -
Figure 5
[Six grids, one per goal, showing Row's preferred outcome in a 3 x 3 game.]
(a) Self interest
(b) Efficiency
(c) Fairness
(d) Rawlsian justice
(e) Relative difference
(f) Altruism
Figure 6a
Common goal and strategy 2x2 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 6, No solution.]
Figure 6b
Common goal and strategy 2x3 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 10, No solution.]
Figure 6c
Common goal and strategy 4x2 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 14, No solution.]
Figure 6d
Common goal and strategy 3x3 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN), with goals (a), (b), (c), (e), (f) in each group. Legend: Shortfall 0 to Shortfall 16, No solution.]
Figure 7a
Common goal and strategy 2x2 games: justice
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN), with goals (a) to (f) in each group. Legend: Shortfall 0 to Shortfall 3, No solution.]
Figure 7b
Common goal and strategy 2x3 games: justice
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 5, No solution.]
Figure 7c
Common goal and strategy 4x2 games: justice
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. Bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: shortfall levels up to Shortfall 7, No solution.]
Figure 8a
Average goal and strategy 2x2 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. First bar: Total (benchmark); remaining bars grouped by strategy (EDS, MaxiMAX, MaxiMIN), with goals (a) to (f) in each group. Legend: Shortfall 0 to Shortfall 6, No solution.]
Figure 8b
Average goal and strategy 2x3 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. First bar: Total (benchmark); remaining bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 10, No solution.]
Figure 8c
Average goal and strategy 4x2 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. First bar: Total (benchmark); remaining bars grouped by strategy (EDS, MaxiMAX, MaxiMIN) and goal. Legend: Shortfall 0 to Shortfall 14, No solution.]
Figure 8d
Average goal and strategy 3x3 games: total payoff efficiency
[Stacked bar chart. Vertical axis: percentage of games, 0% to 100%. First bar: Total (benchmark); remaining bars grouped by strategy (EDS, MaxiMAX, MaxiMIN), with goals (a), (b), (c), (e), (f) in each group. Legend: Shortfall 0 to Shortfall 16, No solution.]
Figure 9
Pareto-Efficiency of Maximax strategy with Efficiency goal Play
Game dimensions     Total games   Pareto-efficient (Shortfall 0)   Portion of efficient solutions
2x2                         144                              138                            95.8%
2x3                      43 200                           41 544                            96.0%
2x4                  33 868 800                       32 815 080                            97.0%
3x3               3 657 830 400                    3 297 656 880                            90.0%