Additional notes on game theory
SA305: Spring 2013

These notes assume you have read the other notes that have been posted and/or were in class on Friday, April 19.

1 Maximin versus Minimax

Here is an example where the max of a min is not the min of the max. Suppose there is a batter versus a pitcher. The pitcher is trying to throw a ball and the batter is trying to get a hit. The pitcher can throw either a fastball, denoted by $F$, or a curveball, denoted by $C$. The batter can either guess that a fastball is coming, also denoted by $F$, or guess that a curveball is coming, also denoted by $C$. Note that the pitcher and batter only play pure strategies. The payoff matrix is

\[
\begin{array}{cc|cc}
 & & \multicolumn{2}{c}{\text{pitcher}} \\
 & & F & C \\ \hline
\text{batter} & F & .300 & .200 \\
 & C & .100 & .400
\end{array}
\]

Here, each entry represents the on-base percentage of the batter given the strategies played, where on-base percentage is the likelihood that the batter has success. Thus, the batter would prefer a higher percentage and the pitcher a lower percentage.

We denote the batter's decisions by $y_F$ and $y_C$ and the pitcher's decisions by $x_F$ and $x_C$. The sets that describe their strategic possibilities are

\[
\mathcal{B} = \{(y_F, y_C) : y_F, y_C \in \{0,1\},\ y_F + y_C = 1\}
\quad\text{and}\quad
\mathcal{P} = \{(x_F, x_C) : x_F, x_C \in \{0,1\},\ x_F + x_C = 1\},
\]

which means that each player can play only one pure strategy. Note that for a given pair of strategies $(y_F, y_C) \in \mathcal{B}$ and $(x_F, x_C) \in \mathcal{P}$, the payoff is

\[
y_F x_F (.300) + y_F x_C (.200) + y_C x_F (.100) + y_C x_C (.400)
= \begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix}.
\]

Note that because only pure strategies are considered, exactly one entry from the payoff matrix is chosen. Then, from the batter's perspective, the worst-case approach is to consider the following:

\[
\max_{(y_F, y_C) \in \mathcal{B}} \; \min_{(x_F, x_C) \in \mathcal{P}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix}.
\]

Remember how this is evaluated: the batter selects a strategy and then the pitcher chooses a strategy. Thus, if the batter selects $F$, the pitcher would choose $C$, with a payoff of .200, and if the batter selects $C$, the pitcher would choose $F$, with a payoff of .100.
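
The worst-case evaluation described above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary layout `payoff[batter_guess][pitch]` and the function names `maximin` and `minimax` are illustrative assumptions, not notation from class. The second function swaps the roles so that the pitcher commits first.

```python
# Batter's on-base percentage, indexed as payoff[batter_guess][pitch].
# (Layout and names are illustrative assumptions, not from the notes.)
payoff = {
    "F": {"F": 0.300, "C": 0.200},
    "C": {"F": 0.100, "C": 0.400},
}
strategies = ["F", "C"]


def maximin(payoff, strategies):
    """Batter commits first; the pitcher then picks the pitch minimizing the payoff."""
    # Worst case for the batter after each possible guess.
    worst = {b: min(payoff[b][p] for p in strategies) for b in strategies}
    best_guess = max(worst, key=worst.get)
    return best_guess, worst[best_guess], worst


def minimax(payoff, strategies):
    """Pitcher commits first; the batter then picks the guess maximizing the payoff."""
    # Worst case for the pitcher after each possible pitch.
    best = {p: max(payoff[b][p] for b in strategies) for p in strategies}
    best_pitch = min(best, key=best.get)
    return best_pitch, best[best_pitch], best


if __name__ == "__main__":
    print("batter first:", maximin(payoff, strategies))
    print("pitcher first:", minimax(payoff, strategies))
```

The inner dictionaries `worst` and `best` hold the responding player's best reply to each committed strategy, which is exactly the step carried out by hand in the text.
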
Given that the batter chooses first,

\[
\max_{(y_F, y_C) \in \mathcal{B}} \; \min_{(x_F, x_C) \in \mathcal{P}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix} = .200.
\]

Now consider the pitcher's problem

\[
\min_{(x_F, x_C) \in \mathcal{P}} \; \max_{(y_F, y_C) \in \mathcal{B}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix}.
\]

Note that now the pitcher moves first. If the pitcher chooses $F$, then the batter would choose $F$ and the payoff is .300. If the pitcher chooses $C$, the batter would choose $C$ and the payoff is .400. Thus,

\[
\min_{(x_F, x_C) \in \mathcal{P}} \; \max_{(y_F, y_C) \in \mathcal{B}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix} = .300.
\]

So, in conclusion,

\[
\min_{(x_F, x_C) \in \mathcal{P}} \; \max_{(y_F, y_C) \in \mathcal{B}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix}
\;\neq\;
\max_{(y_F, y_C) \in \mathcal{B}} \; \min_{(x_F, x_C) \in \mathcal{P}}
\begin{pmatrix} y_F & y_C \end{pmatrix}
\begin{pmatrix} .300 & .200 \\ .100 & .400 \end{pmatrix}
\begin{pmatrix} x_F \\ x_C \end{pmatrix}.
\]

2 Rock-paper-scissors

Recall the rock-paper-scissors payoff matrix

\[
A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix},
\]

where the rows and columns are in rock-paper-scissors strategy order. Also, let $y_R, y_P, y_S$ represent the probabilities that the row player plays rock, paper, and scissors, respectively. Analogously, let $x_R, x_P, x_S$ represent the probabilities that the column player plays rock, paper, and scissors, respectively.

The row player uses an additional variable $v$ to denote the payoff she receives if she plays a particular strategy. Recall from class that the row player is using the linear program:

\[
\begin{array}{rl}
\max & v \\
\text{s.t.} & v \le y_P - y_S \\
 & v \le y_S - y_R \\
 & v \le y_R - y_P \\
 & y_R + y_P + y_S = 1 \\
 & y_R, y_P, y_S \ge 0.
\end{array}
\]

The column player uses an additional variable $w$ to denote the payoff he receives if he plays a particular strategy. Recall from class the column player's linear program:

\[
\begin{array}{rl}
\min & w \\
\text{s.t.} & w \ge x_S - x_P \\
 & w \ge x_R - x_S \\
 & w \ge x_P - x_R \\
 & x_R + x_P + x_S = 1 \\
 & x_R, x_P, x_S \ge 0.
\end{array}
\]

You were asked to take a dual of each of these linear programs. Recall that a good way to minimize mistakes in taking duals is to rewrite your linear programs so that all variables are on the left-hand side and all constants are on the right-hand side. Also, it helps to line up the variables consistently. So, for the row player, a good way to rewrite her linear program is as follows:

\[
\begin{array}{rrrrrl}
\max & v & & & & \\
\text{s.t.} & v & & - y_P & + y_S & \le 0 \\
 & v & + y_R & & - y_S & \le 0 \\
 & v & - y_R & + y_P & & \le 0 \\
 & & y_R & + y_P & + y_S & = 1 \\
 & & y_R, & y_P, & y_S & \ge 0,
\end{array}
\]

and for the column player, a good way to rewrite his linear program is

\[
\begin{array}{rrrrrl}
\min & w & & & & \\
\text{s.t.} & w & & + x_P & - x_S & \ge 0 \\
 & w & - x_R & & + x_S & \ge 0 \\
 & w & + x_R & - x_P & & \ge 0 \\
 & & x_R & + x_P & + x_S & = 1 \\
 & & x_R, & x_P, & x_S & \ge 0.
\end{array}
\]

3 The general model

In general, the model of the two-player zero-sum game can be summarized as follows. We have a row player and a col player. The row player uses strategies specified in the set $R$ and the col player uses strategies specified in the set $C$. The following data is also given:

\[
P_{ij} = \text{the payoff the row player receives if row plays strategy } i \in R \text{ and col plays } j \in C.
\]

The decision variables are

\[
\begin{array}{rl}
y_i & = \text{the probability row plays strategy } i \in R, \\
x_j & = \text{the probability col plays strategy } j \in C.
\end{array}
\]

From the perspective of row, the game strategies are determined by solving the minimax problem:

\[
r^* = \max_{y} \min_{x} \left\{ \sum_{i \in R} \sum_{j \in C} P_{ij} y_i x_j \;:\; \sum_{i \in R} y_i = 1 = \sum_{j \in C} x_j,\ y_i \ge 0\ \forall i \in R,\ x_j \ge 0\ \forall j \in C \right\}. \tag{MM-R}
\]

Note that in (MM-R), the row player "moves first" and selects the strategies $y$, and the col player "responds" by choosing col strategies $x$. In class we saw that by introducing a variable $v$ to represent the inner minimization, (MM-R) can be solved by solving the linear program:

\[
\begin{array}{rll}
\max & v & \\
\text{s.t.} & v \le \sum_{i \in R} P_{ij} y_i & \text{for } j \in C \\
 & \sum_{i \in R} y_i = 1 & \\
 & y_i \ge 0, & i \in R.
\end{array} \tag{LP-R}
\]

From the perspective of col, the game strategies are determined by solving the minimax problem¹:

\[
c^* = \min_{x} \max_{y} \left\{ \sum_{i \in R} \sum_{j \in C} P_{ij} y_i x_j \;:\; \sum_{i \in R} y_i = 1 = \sum_{j \in C} x_j,\ y_i \ge 0\ \forall i \in R,\ x_j \ge 0\ \forall j \in C \right\}. \tag{MM-C}
\]

As in (MM-R), by introducing a variable $w$ to represent the inner maximization, (MM-C) can be solved by solving the linear program:

\[
\begin{array}{rll}
\min & w & \\
\text{s.t.} & w \ge \sum_{j \in C} P_{ij} x_j & \text{for } i \in R \\
 & \sum_{j \in C} x_j = 1 & \\
 & x_j \ge 0, & j \in C.
\end{array} \tag{LP-C}
\]

In class we have seen that (LP-C) and (LP-R) are duals of one another. This provides a proof of von Neumann's famous theorem, which states that $r^* = c^*$. Moreover, the duality relationship between (LP-R) and (LP-C) indicates that we need only solve (LP-R) or (LP-C), but not both.
The associated dual solution to the optimal basic solution of either is the optimal solution to the other. So, for example, if one were to solve (LP-R) to obtain an optimal basic feasible solution $y^*$, then the associated dual solution $x^*$ would be optimal for (LP-C).

¹ Although it is a "maximin" problem, both are referred to as minimax, as maximin has a different meaning.
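
To see the relationship between (LP-R) and (LP-C) on a concrete instance, the following minimal sketch checks the rock-paper-scissors game from Section 2 with exact fractions. It does not solve either linear program. Instead, it assumes a candidate strategy for each player (the uniform one, an assumption introduced here and not stated in the notes) and computes the best $v$ and $w$ that each candidate allows. Because any feasible $v$ is at most any feasible $w$ by weak duality, finding $v = w$ certifies that both candidates are optimal. The function names `row_guarantee` and `col_guarantee` are illustrative.

```python
from fractions import Fraction

# Row player's payoffs P[i][j]; rows and columns in rock, paper, scissors order.
P = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def is_distribution(p):
    """True if p is a probability vector (nonnegative entries summing to 1)."""
    return all(q >= 0 for q in p) and sum(p) == 1


def row_guarantee(P, y):
    """Largest v feasible for (LP-R) with y fixed: min over j of sum_i P[i][j] * y[i]."""
    assert is_distribution(y)
    rows, cols = range(len(P)), range(len(P[0]))
    return min(sum(P[i][j] * y[i] for i in rows) for j in cols)


def col_guarantee(P, x):
    """Smallest w feasible for (LP-C) with x fixed: max over i of sum_j P[i][j] * x[j]."""
    assert is_distribution(x)
    rows, cols = range(len(P)), range(len(P[0]))
    return max(sum(P[i][j] * x[j] for j in cols) for i in rows)


# Candidate strategies (assumed, not derived): play each option with probability 1/3.
third = Fraction(1, 3)
y = [third, third, third]
x = [third, third, third]

v = row_guarantee(P, y)
w = col_guarantee(P, x)
print("v =", v, " w =", w)
```

Both guarantees equal 0 for the uniform strategies, so $r^* = c^* = 0$ for this game. Trying a different candidate, for example `row_guarantee(P, [Fraction(1, 2), Fraction(1, 2), 0])`, gives a strictly smaller value for the row player.
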