MT369, part 5: The Minimax Theorem
In this section I prove a restricted part of the Nash Equilibrium Theorem. In its full generality this is:
Theorem 5.1. Any finite non-cooperative game has at least one equilibrium point in mixed strategies.
That is, there is a set of mixed strategies, one for each player, such that no player can increase
his mean payoff by deviating unilaterally. Not proved; proof not examinable.
(This is Theorem 2.4 in Jones's book.) The version (partly) proved here is:
Theorem 5.2. Let M be a matrix game: that is, a 2-person zero-sum game. Then M has a solution
in mixed strategies.
This is called the minimax theorem for matrix games. Even this restricted result is difficult, and I will
quote theorems and so omit the most difficult part of the proof. First, a useful result:
Lemma 5.3. Let x, y be any two column vectors in Rm. Then xTy = yTx.
Proof. xTy = Σi xi yi = Σi yi xi = yTx. □
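(As an aside, not part of the notes: the lemma is just the commutativity of the scalar product, and a two-line Python check makes it concrete.)

import numpy as np

# Check Lemma 5.3 on two arbitrary vectors in R^5: x^T y = y^T x.
rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)
assert np.isclose(x @ y, y @ x)   # both equal the sum of the x_i y_i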
Notation. All vectors are column vectors unless transposed. u and v are a pair of column vectors
whose entries are all 1; the difference is that u has length m and v has length n. If x is any
vector, we say that x is non-negative and write x ≥ 0 as a shorthand way of saying that every
component of x is ≥ 0. Also we write x > 0, meaning that every component of x is strictly
positive, and x ≥ y, meaning that x − y ≥ 0. Note the distinction between the zero vector 0 and
the number 0.
Let M be the matrix of a game, say of size m × n. By Lemma 4.1, we can add a constant (λ, say)
to all elements of M and so assume that all entries of M are > 1.
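(For instance, in Python — the matrix and the choice of constant here are my own illustration, not from the notes:)

import numpy as np

M = np.array([[ 0, -2],
              [-1,  3]])   # a hypothetical payoff matrix
lam = 1 - M.min() + 1      # here lam = 4; any constant this large will do
M_shifted = M + lam        # [[4, 2], [3, 7]]: every entry is now > 1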
As before, call the row player Roy and the column player Clara.
Roy's problem is to maximise his payoff when Clara does her worst against him.
Roy will choose a combination of the rows, represented by a vector r. This must satisfy:
r ≥ 0,
rTu = 1.     (5.1)
If Clara chooses column i, the payoff (to Roy) will be the i-th component of the vector rTM. So
Clara will choose the column giving the smallest component of this vector. So Roy's problem is:
choose a vector r satisfying (5.1) so as to maximise the smallest component of rTM. Then the
payoff (α, say) is that component.
The simplest way to express the fact that α is the smallest component is to write rTM ≥ αvT.
Transpose this: MTr ≥ αv. Then Roy's problem is:
Maximise α (the payoff) by choosing the vector r (the row strategy) subject to the conditions:
r ≥ 0,
rTu = 1,
MTr ≥ αv.     (5.2)
Since every entry of M is > 1, we are guaranteed that α > 1, so we can divide r by α. Say x =
r/α. Then Roy's problem becomes (maximising α is the same as minimising 1/α, since α > 0):
Minimise 1/α by choosing x subject to the conditions:
x ≥ 0,
xTu = 1/α,
MTx ≥ v,
and this in turn is equivalent to the following (the constraint xTu = 1/α just records the value of
the objective, so minimising 1/α is the same as minimising xTu):
Given an m × n matrix M (with all entries > 0),
Minimise xTu by choosing x subject to the conditions:
x ≥ 0,
MTx ≥ v.     (5.3)
and this you should recognise as a standard problem in linear programming. Note that the transformation from (5.2) to (5.3) only works when we are guaranteed that α > 0. That is why we must start
by adding the constant λ to M. When we subtract λ to go back to the original M, the solution
will still be valid, but the payoff changes.
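(Problem (5.3) is exactly what a standard LP solver expects. As an illustration, here is a minimal sketch using scipy.optimize.linprog — my choice of tool, not anything prescribed by these notes. linprog minimises cTx subject to A_ub x ≤ b_ub with x ≥ 0 by default, so the constraint MTx ≥ v is rewritten as −MTx ≤ −v.)

import numpy as np
from scipy.optimize import linprog

def solve_primal(M):
    # Solve (5.3): minimise x^T u subject to M^T x >= v, x >= 0.
    # M must have all entries > 0 (e.g. after adding the constant lam).
    m, n = M.shape
    res = linprog(c=np.ones(m),        # objective: x^T u = sum of the x_i
                  A_ub=-M.T,           # -M^T x <= -v  is  M^T x >= v
                  b_ub=-np.ones(n),
                  bounds=(0, None))    # x >= 0 (the linprog default)
    return res.x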
Definition. Given a linear programming problem such as (5.3), a point x is called feasible if it
satisfies all the conditions.
Now consider Clara's problem. She wants to minimise the payoff when Roy does his worst against
her. She will choose a combination of columns represented by a vector c. This must satisfy:
c ≥ 0,
cTv = 1.     (5.4)
When she chooses the vector c, Roy will want to choose the largest component of Mc. So
Clara's problem is: choose a vector c satisfying (5.4) so as to minimise the worst possible payoff,
which is the largest component of Mc (β, say). As before, we express this fact by writing Mc ≤ βu.
Then Clara's problem is:
Minimise β (the payoff) by choosing the vector c (the column strategy) subject to the
conditions:
c ≥ 0,
cTv = 1,
Mc ≤ βu.     (5.5)
As before, we can divide c by β (say y = c/β) and so rewrite her problem as:
Maximise 1/β by choosing y subject to the conditions:
y ≥ 0,
yTv = 1/β,
My ≤ u,
and this in turn is equivalent to:
Maximise yTv by choosing y subject to the conditions:
y ≥ 0,
My ≤ u,     (5.6)
and now (5.6) is the dual linear problem to (5.3).
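(Problem (5.6) fits the same solver once "maximise yTv" is flipped to "minimise −yTv"; again a sketch with scipy.optimize.linprog, not part of the notes.)

import numpy as np
from scipy.optimize import linprog

def solve_dual(M):
    # Solve (5.6): maximise y^T v subject to M y <= u, y >= 0.
    m, n = M.shape
    res = linprog(c=-np.ones(n),       # minimising -y^T v maximises y^T v
                  A_ub=M,              # M y <= u
                  b_ub=np.ones(m),
                  bounds=(0, None))    # y >= 0
    return res.x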
Lemma 5.4. Given an m × n matrix M with all entries > 1, the linear problems (5.3) and (5.6)
both have feasible points.
Proof. Take x = u; then every component of MTx is a sum of m entries of M, each > 1, hence
MTx ≥ v and (5.3) has a feasible point. Also y = 0 is feasible for (5.6). □
Lemma 5.5. Let x and y be feasible for (5.3) and (5.6) respectively. Then yTv ≤ xTu.
Proof.
yTv ≤ yTMTx (since MTx ≥ v by (5.3) and y ≥ 0) = (My)Tx ≤ uTx (since My ≤ u by (5.6) and
x ≥ 0) = xTu by Lemma 5.3. □
The difference xTu − yTv is known as the duality gap. If we find a pair of points x, y which are
feasible for the primal (5.3) and the dual (5.6), then the gap gives us some idea of how far we are
from finding a pair of optimal points.
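(Lemmas 5.4 and 5.5 can be seen numerically. Using the hypothetical shifted matrix from earlier, the feasible pair x = u, y = 0 from Lemma 5.4 gives a deliberately crude gap:)

import numpy as np

M = np.array([[4, 2],
              [3, 7]])            # the shifted example matrix; all entries > 1
m, n = M.shape
x, y = np.ones(m), np.zeros(n)    # feasible for (5.3) and (5.6) by Lemma 5.4
assert np.all(M.T @ x >= 1) and np.all(M @ y <= 1)
print(x.sum() - y.sum())          # duality gap x^T u - y^T v = 2.0 here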
As stated before, we now use the Duality Theorem for linear programming:
Theorem 5.6. Let P and D be a dual pair of LP problems, as in (5.3) and (5.6). Suppose that
both P and D have some feasible points. Then both problems have solutions: that is, there is a
pair of points x* and y* where x* is a feasible minimum for P, y* is a feasible maximum for D,
and the duality gap is 0. Not proved; proof not examinable.
So now we apply this to a matrix game. Suppose we have solved the twin problems (5.3) and (5.6),
with solutions x* and y*. Since the gap = 0, x*Tu = y*Tv = γ, say. Divide by γ: we get two
vectors, r* = x*/γ and c* = y*/γ, and these are optimal strategies for the original game. The
common optimal payoff α = β = 1/γ is the value of the shifted game; subtracting λ gives the
value of the original game.
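(Assembling all of the above gives a complete solver. The sketch below, using scipy.optimize.linprog, is my own illustration — the function name and example matrix are hypothetical, not from the notes. Remember that 1/γ is the value of the shifted game, so λ must be subtracted off at the end.)

import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(M):
    # Return (r, c, value): optimal mixed strategies for Roy and Clara and
    # the value of the 2-person zero-sum game with payoff matrix M.
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    lam = 1 - M.min() + 1                 # shift so every entry of Ms is > 1
    Ms = M + lam
    # Primal (5.3): minimise x^T u subject to Ms^T x >= v, x >= 0.
    x = linprog(np.ones(m), A_ub=-Ms.T, b_ub=-np.ones(n), bounds=(0, None)).x
    # Dual (5.6): maximise y^T v subject to Ms y <= u, y >= 0.
    y = linprog(-np.ones(n), A_ub=Ms, b_ub=np.ones(m), bounds=(0, None)).x
    gamma = x.sum()                       # = x*^T u = y*^T v by Theorem 5.6
    r, c = x / gamma, y / gamma           # r* = x*/gamma, c* = y*/gamma
    value = 1 / gamma - lam               # undo the shift to get the true value
    return r, c, value

# Matching pennies: both players should mix (1/2, 1/2) and the value is 0.
r, c, v = solve_matrix_game([[1, -1], [-1, 1]])
print(r, c, v)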