AN ITERATIVE METHOD FOR FINDING
A SOLUTION TO A ZERO-SUM
TWO PERSON RECTANGULAR GAME

by
ROBERT JAMES JIRKA

A THESIS
submitted to
OREGON STATE COLLEGE
in partial fulfillment of
the requirements for the
degree of
MASTER OF SCIENCE

June 1959

APPROVED:

Associate Professor of Mathematics in Charge of Major

Chairman of Department of Mathematics

Chairman of School of Science Graduate Committee

Dean of Graduate School

Date thesis is presented: May 12, 1959
Typed by Jolan Eross
TABLE OF CONTENTS

CHAPTER I.    INTRODUCTION
CHAPTER II.   THE ITERATIVE PROCEDURE
CHAPTER III.  PROOF OF CONVERGENCE
CHAPTER IV.   OTHER METHODS FOR SOLVING GAMES
CHAPTER V.    NUMERICAL EXAMPLES
BIBLIOGRAPHY
AN ITERATIVE METHOD FOR FINDING A SOLUTION
TO A ZERO-SUM TWO PERSON RECTANGULAR GAME

CHAPTER I
INTRODUCTION
Game theory is a branch of mathematics which deals with problems where various persons with conflicting interests interact with each other and the outcome of their interaction is partially controlled by each of the various participants. The theory restricts itself to games of strategy, i.e., games where the various players can use their ingenuity to affect the outcome.

Games of strategy, as a subject for mathematical study, had their beginning in comparatively recent times. The first formalization of the subject was by Von Neumann (9, p. 295-320) in 1928. The theory did not stimulate much interest and very little was done until 1944, when Von Neumann and Morgenstern (10) came out with a book on the subject titled Theory of Games and Economic Behavior. This book was the stimulus that the theory needed. It made available to many fields, where conflict of interests was a fundamental assumption, a systematic approach to many of their problems. From 1944 up to the present time game theory has become a tool highly applicable to fields which were heretofore considered completely separated from mathematical analysis (business, sociology, etc.).
Before defining the special type of game with which this paper is concerned we shall define some terms which apply to the theory in general.

If a person has available to him a set of alternatives from which he picks out one, then the one chosen is defined to be a choice. The point of time at which he is given the set from which he makes a choice is called a move. A game is then defined to be a set of rules which defines the number of players and materials to be used; defines for each player a class of sets of alternatives, the order of the moves, and finally, after the last choice, if there is a last one, what will be the reward (payoff) to each player (money, prestige, death, etc.). A realization of the rules is defined to be a play of the game. A strategy is a preconceived plan which tells the player what choice to make at every move. If the sum of all the payoffs to all the players for every conceivable play of the game is zero, then the game is zero-sum.

This paper is concerned with a special type of game called a zero-sum two person rectangular game. The game is defined as follows. Two people, which we will denote by P1 and P2, play the game. There is given a matrix of real numbers (a_ij), i = 1, ..., m, j = 1, ..., n, called the payoff matrix. P1 is given the set of numbers (1, ..., m) and P2 the set (1, ..., n) from which they are to make their choices. P1 and P2 simultaneously each choose a number from his set. Then P2 pays P1 the amount a_ij from the payoff matrix, where i was the choice of P1 and j was the choice of P2. The game consists of just one move for each player. From now on the words "game" or "rectangular game" will be used to mean the zero-sum two person rectangular game.

A strategy for P1 is an m-vector X = ||x_1, ..., x_m|| such that x_i ≥ 0, i = 1, ..., m, and Σ_{i=1}^{m} x_i = 1; i.e., x_i is the probability that P1 will choose i. We will denote the set of all such vectors by S_m. In essence P1 is playing probabilistically, letting some chance device determine which number he will pick from the set (1, ..., m). In an analogous manner a strategy for P2 is an n-vector Y = ||y_1, ..., y_n|| such that y_j ≥ 0, j = 1, ..., n, and Σ_{j=1}^{n} y_j = 1. We will denote the set of all such n-vectors by S_n.
A strategy X for P1 such that one of the components is one and all the rest are zero is called a pure strategy, and all others are called mixed strategies. A similar definition holds for the strategies of P2.
If P1 uses the strategy X = ||x_1, ..., x_m||, P2 uses the strategy Y = ||y_1, ..., y_n||, and (a_ij), i = 1, ..., m, j = 1, ..., n, is the payoff matrix of the game, then the expectation to P1 is defined to be

    E(X, Y) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij x_i y_j.

A solution to the game consists of a strategy X* = ||x_1*, ..., x_m*|| for P1, a strategy Y* = ||y_1*, ..., y_n*|| for P2, and a real number v such that

    Σ_{i=1}^{m} a_ij x_i* ≥ v    for j = 1, ..., n,

and

    Σ_{j=1}^{n} a_ij y_j* ≤ v    for i = 1, ..., m.

X* is called an optimal strategy for P1 and Y* an optimal strategy for P2. The number v is called the value of the game.
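As a small numeric illustration (not part of the thesis), the expectation E(X, Y) and the two systems of inequalities above can be checked directly. The 2 × 2 matrix and the solution X* = Y* = (1/2, 1/2), v = 0 used below are assumptions chosen for the sketch.

```python
# Sketch: evaluate E(X, Y) and check the optimality inequalities for a
# 2 x 2 game with an assumed known solution.
a = [[1, -1], [-1, 1]]          # payoff matrix (a_ij): payments from P2 to P1
x_star = [0.5, 0.5]             # assumed optimal strategy for P1
y_star = [0.5, 0.5]             # assumed optimal strategy for P2
v = 0.0                         # assumed value of the game

def expectation(a, x, y):
    """E(X, Y) = sum_i sum_j a_ij x_i y_j."""
    return sum(a[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# sum_i a_ij x_i* >= v for every column j (P1's guarantee) ...
col_payoffs = [sum(a[i][j] * x_star[i] for i in range(2)) for j in range(2)]
# ... and sum_j a_ij y_j* <= v for every row i (P2's guarantee).
row_payoffs = [sum(a[i][j] * y_star[j] for j in range(2)) for i in range(2)]
```

For this matrix every column and row payoff comes out 0, so both inequality systems hold with equality and E(X*, Y*) equals the value.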
The existence of a solution for any rectangular game is guaranteed by the fundamental theorem of rectangular games, known as the minimax theorem. It states that for any payoff matrix (a_ij), i = 1, ..., m, j = 1, ..., n, we always have

    max_{X in S_m} min_{Y in S_n} E(X, Y)    and    min_{Y in S_n} max_{X in S_m} E(X, Y)

existing and equal. The minimax theorem can be shown to be equivalent to the existence of a solution to the game whose payoff matrix is (a_ij), i = 1, ..., m, j = 1, ..., n. Several excellent treatments of the equivalent definitions for a solution to a rectangular game and proofs of the minimax theorem are available in the literature. A good and complete treatment with historical remarks on the subject can be found in Games and Decisions (6).
Since we see that any rectangular game has at least one solution, the question arises as to how to find the solutions. It is quite easily shown that the set of optimal strategies for P1 is a convex set, and similarly for P2. The value of the game is of course unique.

The purpose of this paper is to present an iterative method which gives a sequence, finite or infinite, of points in m + n + 1 dimensional vector space. Each point determines a strategy for P1, a strategy for P2, and a real number approximation to the value of the game. If the sequence is infinite, it will be shown to converge to a point which gives an optimal strategy for P1, an optimal strategy for P2, and the value of the game. The sequence will be finite if any point determines a solution to the game. The method is fundamentally a relaxation method. Each point determined by the method is checked to see if it is a solution by forming a set of scalar products and checking this set to see if any of the products are negative. If one or more of the products are negative, then the new point is determined so as to make the next set of products have negative terms which are less negative, or all positive.
CHAPTER II
THE ITERATIVE PROCEDURE
In view of the definition of a solution, the problem of solving a game becomes: given the payoff matrix (a_ij), i = 1, ..., m, j = 1, ..., n, we wish to find a vector x = ||x_1, ..., x_m||, a vector y = ||y_1, ..., y_n||, and a real number v such that

    Σ_{i=1}^{m} a_ij x_i ≥ v    for j = 1, ..., n,

    Σ_{j=1}^{n} a_ij y_j ≤ v    for i = 1, ..., m,

    x_i ≥ 0  for i = 1, ..., m,    y_j ≥ 0  for j = 1, ..., n,

and

    Σ_{i=1}^{m} x_i = 1,    Σ_{j=1}^{n} y_j = 1.
Before continuing let us introduce some new notation which will be more convenient later on. Let

    A_j = ||a_1j, ..., a_mj, 0, ..., 0, -1||    (m components a_ij, then n zero components, then -1)

for j = 1, ..., n, and let

    A_i = ||0, ..., 0, -a_{i-n,1}, ..., -a_{i-n,n}, 1||    (m zero components, then n components -a_{i-n,j}, then 1)

for i = n + 1, ..., m + n. The problem of solving a game is then equivalent to finding an m + n + 1 - vector

    Z* = ||x_1*, ..., x_m*, y_1*, ..., y_n*, v*||

such that

(2.1)    A_i · Z* ≥ 0    for i = 1, ..., m + n,

(2.2)    x_p* ≥ 0    for p = 1, ..., m,

(2.3)    y_q* ≥ 0    for q = 1, ..., n,

(2.4)    Σ_{p=1}^{m} x_p* = 1,

(2.5)    Σ_{q=1}^{n} y_q* = 1.
The method begins with any m + n + 1 - vector

    Z^(1) = ||x_1^(1), ..., x_m^(1), y_1^(1), ..., y_n^(1), v^(1)||

which satisfies (2.2), (2.3), (2.4) and (2.5), and follows the procedure described below for getting the (k+1)-st iterant from the k-th iterant.

Assuming that we have arrived at Z^(k) satisfying (2.2), (2.3), (2.4) and (2.5), the scalar products A_i · Z^(k), i = 1, ..., m + n, are formed. The value j_k is determined from the set (1, ..., m + n) such that

    A_{j_k} · Z^(k) ≤ A_i · Z^(k)    for all i, i = 1, ..., m + n,

and if there is more than one such subscript, choose any one of them. If A_{j_k} · Z^(k) ≥ 0, then Z^(k) is a solution to the game and we are finished. If not, then A_{j_k} · Z^(k) must necessarily be negative. For convenience let us assume that the subscript j_k is from the set (1, ..., n). Knowing this value of j_k, a new vector

    Z̄^(k+1) = ||x̄_1^(k+1), ..., x̄_m^(k+1), ȳ_1^(k+1), ..., ȳ_n^(k+1), v̄^(k+1)||

is formed by the formula (2.6) below.
iterant.
(2.3),
(2.2),
satisfying
(k+i)
1(k)
a B.
3k
+
p
where
o
-1
= - z(1d.B.
i
[.
b0
=
-
+
L
c0s2 e.
-
a
J
BJ.B0
A.
i
b0
=
)Th
A
(A
B
(A
k =
Ojk
°k
)1/i
A.
3k
3k
A.
Ojk = (A
Ojk
A
Ojk
and
cos
8
=
k
A Ojk )1/2
A
k
(A0.A0)
1/2
1/2
(Aj
k
In (2.6), B_{j_k} is A_{j_k} normalized, and similarly B_{0j_k} is A_{0j_k} normalized. By definition cosθ_{j_k} is the cosine of the angle between the two vectors A_{j_k} and A_{0j_k} in m + n + 1 dimensional space, and is by definition

    (A_{0j_k} · A_{j_k}) / [(A_{0j_k} · A_{0j_k})^{1/2} (A_{j_k} · A_{j_k})^{1/2}].

Using Schwarz's inequality (7, p. 189) we have

    |A_{0j_k} · A_{j_k}| ≤ (A_{0j_k} · A_{0j_k})^{1/2} (A_{j_k} · A_{j_k})^{1/2},

and hence that cos²θ_{j_k} ≤ 1. The expression 1 / (1 - cos²θ_{j_k}) is a factor which is introduced to increase the rate of convergence; cos²θ_{j_k} < 1 since A_{j_k} is not a multiple of A_{0j_k}.
From (2.6) we have for the components of Z̄^(k+1) (writing B_{j_k,p} for the p-th component of B_{j_k})

(2.7)    x̄_p^(k+1) = x_p^(k) + (-B_{j_k} · Z^(k)) (B_{j_k,p} - cosθ_{j_k} B_{0j_k,p}) / (1 - cos²θ_{j_k})    for p = 1, ..., m,

(2.8)    v̄^(k+1) = v^(k) + (-B_{j_k} · Z^(k)) B_{j_k,m+n+1} / (1 - cos²θ_{j_k}),

(2.9)    ȳ_q^(k+1) = y_q^(k)    for q = 1, ..., n.

From (2.9) we see that the y-components are left unchanged. This is because we assumed j_k ≤ n, which gave us vectors B_{j_k} and B_{0j_k} whose components corresponding to the y-components of Z^(k) are zero. Therefore we have immediately that Z̄^(k+1) satisfies (2.3) and (2.5).
If we had had j_k > n, then the x-components would have been unchanged and Z̄^(k+1) would have immediately satisfied (2.2) and (2.4). From (2.7) we have

    Σ_{p=1}^{m} x̄_p^(k+1) = Σ_{p=1}^{m} x_p^(k) + [(-B_{j_k} · Z^(k)) / (1 - cos²θ_{j_k})] Σ_{p=1}^{m} (B_{j_k,p} - cosθ_{j_k} B_{0j_k,p}) = Σ_{p=1}^{m} x_p^(k) = 1,

since Σ_{p=1}^{m} B_{j_k,p} = √m cosθ_{j_k} and Σ_{p=1}^{m} B_{0j_k,p} = √m (each of the first m components of B_{0j_k} being 1/√m). Therefore Z̄^(k+1) satisfies (2.3), (2.4) and (2.5) since Z^(k) did.
Geometrically, we have taken the point Z^(k) and determined on which side of the linear surfaces A_i · Z = 0 the point lies. Thus we chose the surface A_{j_k} · Z = 0 such that Z^(k) was farthest from it on its negative side. If there was no such surface we knew that we had a solution. If Z^(k) was not a solution, we passed along the normal to the surface A_{j_k} · Z = 0 a distance determined by how far Z^(k) was from the surface and by the cosine of the angle between the surfaces A_{j_k} · Z = 0 and A_{0j_k} · Z = 1. Then we passed along the normal to the surface A_{0j_k} · Z = 1 until we arrived at the point Z̄^(k+1) on the surface A_{0j_k} · Z = 1.
Returning to the analytic discussion, one of two things can happen: either Z̄^(k+1) satisfies (2.2) or it does not. If Z̄^(k+1) satisfies (2.2), then let Z^(k+1) = Z̄^(k+1) and continue the iteration. If Z̄^(k+1) does not satisfy (2.2), we then apply a different procedure which will give us a Z^(k+1) satisfying (2.2), (2.3), (2.4) and (2.5).

With the vector Z̄^(k+1) obtained from the above procedure, which satisfies (2.3), (2.4) and (2.5) but not (2.2), we begin by placing the x-components of Z̄^(k+1) which are negative equal to zero. All of them cannot be negative since Σ_{p=1}^{m} x̄_p^(k+1) = 1. Suppose x̄_1^(k+1), ..., x̄_r^(k+1), r < m, were the negative components of Z̄^(k+1) (for convenience we take them to be the first r). Now determine all of the sums

    x̄_i^(k+1) + (Σ_{l=1}^{r} x̄_l^(k+1)) / (m - r)    for i = r + 1, ..., m,

where the x̄_l^(k+1), l = 1, ..., r, are the original negative values. For all i such that the sum is negative, let x̄_i^(k+1) = 0. If none of the sums are negative, we could now form the rest of the components of Z^(k+1). Let us assume that some of the sums were negative, and for convenience let us assume that they were for i = r + 1, ..., r + s. Again determine the sums

    x̄_i^(k+1) + (Σ_{l=1}^{r+s} x̄_l^(k+1)) / (m - (r + s))    for i = r + s + 1, ..., m.

If none of the sums are negative, we could now determine the rest of the components of Z^(k+1). If some are negative, we would keep on repeating the above procedure until none of the sums were negative, each time putting the corresponding x̄_i^(k+1) equal to zero when the sum was negative.
Let us assume that for i = 1, ..., t we have either x̄_i^(k+1) < 0 or that x̄_i^(k+1) was such that one of the sums defined above was negative, and that all the other terms were such that x̄_i^(k+1) > 0 for i = t + 1, ..., m. Then form Z^(k+1) as follows:

(2.10)    x_i^(k+1) = 0    for i = 1, ..., t,

(2.11)    x_i^(k+1) = x̄_i^(k+1) + (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t)    for i = t + 1, ..., m,

(2.12)    y_j^(k+1) = ȳ_j^(k+1)    for j = 1, ..., n,    and    v^(k+1) = v̄^(k+1).

From (2.10) and (2.11) we see that Z^(k+1) satisfies (2.2), and from (2.12) we have immediately that Z^(k+1) satisfies (2.3) and (2.5) since Z̄^(k+1) did.
From (2.10) and (2.11) we have

    Σ_{i=1}^{m} x_i^(k+1) = Σ_{i=t+1}^{m} x̄_i^(k+1) + (m - t) (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) = Σ_{i=1}^{m} x̄_i^(k+1) = 1.

Since Z̄^(k+1) satisfied (2.4), we have that Z^(k+1) evidently satisfies (2.4). Moreover we can conclude that x_i^(k+1) > 0 for some i, t < i ≤ m; for if it were not so, then some one of the x-components of Z^(k+1) would have been the last one to be removed by the sum process defined above, say x̄_m^(k+1), and it was made zero because

    x̄_m^(k+1) + (Σ_{l=1}^{m-1} x̄_l^(k+1)) / 1 < 0.

But this cannot be, since Z̄^(k+1) satisfies (2.4), i.e., Σ_{l=1}^{m} x̄_l^(k+1) = 1, so the left side above equals 1. Therefore Z^(k+1) satisfies (2.2), (2.3), (2.4) and (2.5). With this value of Z^(k+1) we go back to the beginning of the iterative procedure and start over again.
Geometrically, we cannot say much about this part of the method. We know that a solution to the game must be a point which lies in the positive orthant as far as the first m + n components of the point are concerned. The point Z̄^(k+1) does not satisfy this condition. The negative components (considering only the first m + n components) are made zero, which gives a point that does. But now this point does not lie on the surface Σ x_i = 1; hence we go along the normal until a point is determined on Σ x_i = 1. But this may take us out of the positive orthant (as far as the first m + n components are concerned). Essentially we repeat the above over and over until we come to a point on Σ x_i = 1 which is in the positive orthant.
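The procedure of this chapter can be condensed into code. The version below is a simplified relaxation sketch and not the thesis's exact update: it omits the 1/(1 - cos²θ_{j_k}) acceleration factor and the staged redistribution of negative mass, using a plain step to the most-violated surface (projected so the relevant simplex sum is preserved) followed by clip-and-renormalize. The step rule, test matrix, and reset-to-uniform guard are all assumptions of this sketch.

```python
# Simplified relaxation sketch of the iteration (NOT the thesis's exact
# formulas): seek x, y, v with all scalar products >= 0 while keeping
# x and y on their probability simplexes.
def relax_step(a, x, y, v):
    m, n = len(a), len(a[0])
    # scalar products: columns give sum_i a_ij x_i - v, rows give v - sum_j a_ij y_j
    col = [sum(a[i][j] * x[i] for i in range(m)) - v for j in range(n)]
    row = [v - sum(a[i][j] * y[j] for j in range(n)) for i in range(m)]
    worst = min(col + row)
    if worst >= 0:
        return x, y, v, True               # current point is a solution
    k = (col + row).index(worst)
    if k < n:                              # violated column constraint: move x and v
        d_x = [a[i][k] for i in range(m)]
        d_x = [di - sum(d_x) / m for di in d_x]   # keep sum of x unchanged
        d_v, d_y = -1.0, [0.0] * n
    else:                                  # violated row constraint: move y and v
        i0 = k - n
        d_y = [-a[i0][j] for j in range(n)]
        d_y = [dj - sum(d_y) / n for dj in d_y]   # keep sum of y unchanged
        d_v, d_x = 1.0, [0.0] * m
    norm2 = sum(t * t for t in d_x + d_y) + d_v * d_v
    t = -worst / norm2                     # step exactly onto the violated surface
    x = [xi + t * di for xi, di in zip(x, d_x)]
    y = [yi + t * dj for yi, dj in zip(y, d_y)]
    v += t * d_v
    # restore nonnegativity, then renormalize (uniform reset if degenerate)
    x = [max(0.0, xi) for xi in x]
    y = [max(0.0, yi) for yi in y]
    x = [xi / sum(x) for xi in x] if sum(x) > 0 else [1.0 / m] * m
    y = [yi / sum(y) for yi in y] if sum(y) > 0 else [1.0 / n] * n
    return x, y, v, False

a = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # assumed skew-symmetric test game
x, y, v = [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.0
for _ in range(500):
    x, y, v, done = relax_step(a, x, y, v)
    if done:
        break
```

By construction every iterate keeps x and y on their simplexes, which mirrors the role of conditions (2.2)-(2.5) in the thesis's procedure.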
CHAPTER III
PROOF OF CONVERGENCE
If for any iteration of the method of Chapter II we get a Z^(k) which is a solution, then we are through and nothing more needs to be said. If not, then we wish to show that the sequence {Z^(k)} generated by the method converges to a solution of the game, i.e., the limit of the sequence gives an optimal strategy for P1, an optimal strategy for P2, and the value of the game.

If Z = ||x_1, ..., x_m, y_1, ..., y_n, v|| and Z^(k) = ||x_1^(k), ..., x_m^(k), y_1^(k), ..., y_n^(k), v^(k)|| are any two vectors in m + n + 1 dimensional space, then the distance between Z and Z^(k) is defined to be the real number, denoted by |Z - Z^(k)|,

    |Z - Z^(k)| = [(x_1 - x_1^(k))² + ··· + (x_m - x_m^(k))² + (y_1 - y_1^(k))² + ··· + (y_n - y_n^(k))² + (v - v^(k))²]^{1/2}.

Let Z* = ||x_1*, ..., x_m*, y_1*, ..., y_n*, v*|| be any solution to the game, i.e., X* is an optimal strategy for P1, Y* is an optimal strategy for P2, and v* is the value of the game. We know that one always exists as a consequence of the minimax theorem. The method of proof of the convergence of {Z^(k)} will be to first show that for all k we have |Z* - Z^(k+1)| ≤ |Z* - Z^(k)|, and then to use this fact to show that lim_{k→∞} Z^(k) exists and is a solution to the game.
Now from (2.6) we have

(3.1)    |Z* - Z̄^(k+1)|² = | Z* - Z^(k) + [(B_{j_k} · Z^(k)) / (1 - cos²θ_{j_k})] (B_{j_k} - cosθ_{j_k} B_{0j_k}) |²,

which when the right side is expanded and collected becomes

(3.2)    |Z* - Z̄^(k+1)|² = |Z* - Z^(k)|² + (B_{j_k} · Z^(k))² / (1 - cos²θ_{j_k}) + 2 (B_{j_k} · Z^(k)) (B_{j_k} - cosθ_{j_k} B_{0j_k}) · (Z* - Z^(k)) / (1 - cos²θ_{j_k}),

where we have used the fact that (B_{j_k} - cosθ_{j_k} B_{0j_k}) · (B_{j_k} - cosθ_{j_k} B_{0j_k}) = 1 - cos²θ_{j_k}. Since both Z* and Z^(k) satisfy (2.4), we have B_{0j_k} · (Z* - Z^(k)) = 0, so that, combining terms, (3.2) becomes

(3.3)    |Z* - Z̄^(k+1)|² = |Z* - Z^(k)|² - (B_{j_k} · Z^(k))² / (1 - cos²θ_{j_k}) + 2 (B_{j_k} · Z^(k)) (B_{j_k} · Z*) / (1 - cos²θ_{j_k}).
Since cos²θ_{j_k} < 1, B_{j_k} · Z^(k) < 0, and B_{j_k} · Z* ≥ 0, we can conclude from (3.3) that

(3.4)    |Z* - Z̄^(k+1)| ≤ |Z* - Z^(k)|.
From (2.10), (2.11) and (2.12) we have

(3.5)    |Z* - Z^(k+1)|² = (x_1*)² + ··· + (x_t*)² + Σ_{i=t+1}^{m} [ x_i* - x̄_i^(k+1) - (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) ]² + Σ_{q=1}^{n} (y_q* - ȳ_q^(k+1))² + (v* - v̄^(k+1))²,

which, when (x̄_i^(k+1))² terms for i = 1, ..., t are added and subtracted on the right side and the squares are expanded, becomes

(3.6)    |Z* - Z^(k+1)|² = |Z* - Z̄^(k+1)|² + 2 Σ_{i=1}^{t} x_i* x̄_i^(k+1) - Σ_{i=1}^{t} (x̄_i^(k+1))² - 2 [ Σ_{i=t+1}^{m} (x_i* - x̄_i^(k+1)) ] (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) + (m - t) [ (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) ]².

Since Σ_{i=t+1}^{m} (x_i* - x̄_i^(k+1)) = Σ_{l=1}^{t} x̄_l^(k+1) - Σ_{i=1}^{t} x_i* (both Σ_{i=1}^{m} x_i* and Σ_{i=1}^{m} x̄_i^(k+1) being 1), we get, when we substitute this into (3.6) and collect terms, that

(3.7)    |Z* - Z^(k+1)|² = |Z* - Z̄^(k+1)|² - Σ_{i=1}^{t} (x̄_i^(k+1))² - (Σ_{l=1}^{t} x̄_l^(k+1))² / (m - t) + 2 Σ_{i=1}^{t} x_i* [ x̄_i^(k+1) + (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) ].
Recall that x_i^(k+1) = 0 for i = 1, ..., t, and that the x̄_i^(k+1) were made zero for one of two reasons: either x̄_i^(k+1) < 0, or x̄_i^(k+1) plus one of the partial redistribution sums defined above was negative. Let us assume that x̄_t^(k+1) was the last component to give a negative sum, and hence that it was the largest of the removed components; furthermore let us assume that the x̄_i^(k+1) were made zero in the same order as their subscripts. Then for some p, 0 < p ≤ t, we have

    x̄_t^(k+1) + (Σ_{l=1}^{t-p} x̄_l^(k+1)) / (m - t + p) < 0,

where x̄_{t-p+1}^(k+1), ..., x̄_t^(k+1) were the components removed at this last stage. Since x̄_i^(k+1) ≤ x̄_t^(k+1) for i = t - p + 1, ..., t, we have

    Σ_{i=t-p+1}^{t} x̄_i^(k+1) ≤ p x̄_t^(k+1) < - p (Σ_{l=1}^{t-p} x̄_l^(k+1)) / (m - t + p),

which gives

    Σ_{l=1}^{t} x̄_l^(k+1) = Σ_{l=1}^{t-p} x̄_l^(k+1) + Σ_{i=t-p+1}^{t} x̄_i^(k+1) < [(m - t) / (m - t + p)] Σ_{l=1}^{t-p} x̄_l^(k+1),

and therefore

    (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) < (Σ_{l=1}^{t-p} x̄_l^(k+1)) / (m - t + p).

Hence, since x̄_i^(k+1) ≤ x̄_t^(k+1) for every i = 1, ..., t, we have

    x̄_i^(k+1) + (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) ≤ x̄_t^(k+1) + (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) < x̄_t^(k+1) + (Σ_{l=1}^{t-p} x̄_l^(k+1)) / (m - t + p) < 0    for i = 1, ..., t.
Finally, since x_i* ≥ 0 for i = 1, ..., t, we have

(3.8)    Σ_{i=1}^{t} x_i* [ x̄_i^(k+1) + (Σ_{l=1}^{t} x̄_l^(k+1)) / (m - t) ] ≤ 0,

which is the last term of (3.7). Hence we can conclude from (3.7) and (3.8) that |Z* - Z^(k+1)| ≤ |Z* - Z̄^(k+1)|, and therefore from (3.4) we have

(3.9)    |Z* - Z^(k+1)| ≤ |Z* - Z^(k)|.

(If Z̄^(k+1) already satisfies (2.2), then Z^(k+1) = Z̄^(k+1) and (3.9) follows from (3.4) directly.) From (3.9) we have that {|Z* - Z^(k)|} is a monotonic decreasing sequence of non-negative real numbers. Hence the sequence {|Z* - Z^(k)|} converges as k → ∞. Then we obviously have the sequence {|Z* - Z̄^(k+1)|} converging, and its limit is the same as the limit of the sequence {|Z* - Z^(k)|}. Therefore, from (3.3) we can conclude

(3.10)    lim_{k→∞} [ - (B_{j_k} · Z^(k))² / (1 - cos²θ_{j_k}) + 2 (B_{j_k} · Z^(k)) (B_{j_k} · Z*) / (1 - cos²θ_{j_k}) ] = 0.
Since each term in the square brackets of (3.10) is non-positive (B_{j_k} · Z^(k) < 0 and B_{j_k} · Z* ≥ 0), each term must itself tend to zero; and since there are only finitely many vectors A_j, the factors 1 - cos²θ_{j_k} are bounded away from zero, so we must have

    lim_{k→∞} B_{j_k} · Z^(k) = 0.

From the definition of j_k one can see immediately that B_{j_k} · Z^(k) ≤ B_j · Z^(k) for every j of the set (1, ..., m + n). The set of all Z = ||X, Y, v|| such that

    x_p ≥ 0,  Σ_{p=1}^{m} x_p = 1,  y_q ≥ 0,  Σ_{q=1}^{n} y_q = 1,  |v| ≤ max_{i,j} |a_ij|,

is a closed and bounded subset of m + n + 1 dimensional vector space. Hence by the Weierstrass theorem we can extract from {Z^(k)} a convergent subsequence {Z^(k_i)}; let lim_{i→∞} Z^(k_i) = Z°. But for any p, B_p · Z^(k_i) ≥ B_{j_{k_i}} · Z^(k_i), and since lim_{k→∞} B_{j_k} · Z^(k) = 0, we must also have

    B_p · Z° = lim_{i→∞} B_p · Z^(k_i) ≥ 0    for p = 1, ..., m + n,

so that Z° satisfies (2.1). Obviously Z° satisfies (2.2) and (2.3), since each Z^(k_i) does, and since for each k_i we have Σ_{p=1}^{m} x_p^(k_i) = 1 and Σ_{q=1}^{n} y_q^(k_i) = 1, Z° satisfies (2.4) and (2.5) as well. Therefore Z° is a solution to the game.
Since lim_{i→∞} Z^(k_i) = Z°, for an arbitrary ε > 0 there exists an N such that |Z° - Z^(k_i)| < ε for all k_i > N. Then from (3.9), replacing Z* by Z° (which is a solution to the game), we have |Z° - Z^(k)| < ε for all k ≥ k_i > N. Therefore we can conclude that the entire sequence {Z^(k)} converges to the value Z°, which we have shown is a solution to the game.
CHAPTER IV
OTHER METHODS FOR SOLVING GAMES

There are several methods already available for solving rectangular games. Some are of a finite algorithm character which give all solutions to the game, while others are of the infinite iterative type. The following is an explanation of several of them. Brown's method and Bellman's method will be used for comparison purposes.
SIMPLEX METHOD.

The method most used today is the simplex method, a computational technique devised by Dantzig (3, p. 339-347) for solving linear programming problems. Since any rectangular game can be made into a linear programming problem (4, p. 330-338), the simplex method can be used to solve a game. The method gives all solutions to the game in a finite number of steps.
STATISTICAL METHOD OF BROWN.

The most simple of the methods is a statistical method by Brown (8, p. 296-301). It is based on the idea of making present decisions dependent on past history. Given a payoff matrix (a_ij), i = 1, ..., m, j = 1, ..., n, P1 plays the pure strategy X_{i_1}, where X_{i_1} denotes a strategy with the i_1-th component one and the rest zero and i_1 is a member of the set (1, ..., m). P2 now starts an accumulated sum vector

    A^(1) = ||a_{i_1 1}, ..., a_{i_1 n}||,

the i_1-th row of the matrix, and uses the pure strategy Y_{j_1}, where j_1 is from the set (1, ..., n) and is such that a_{i_1 j_1} is one of the minimum elements of A^(1). P1 now starts an accumulative sum vector

    B^(1) = ||a_{1 j_1}, ..., a_{m j_1}||,

the j_1-th column vector of the matrix, and uses the pure strategy X_{i_2}, where i_2 is one of the subscripts of the set (1, ..., m) such that the i_2-th component of B^(1) is one of the maximums of the components of B^(1). P2 now adds the i_2-th row of the matrix to his accumulative sum vector and makes his choice by the same criterion as before.
Hence at the k-th step we have the accumulative sum vectors A^(k) = ||a_1^(k), ..., a_n^(k)|| and B^(k) = ||b_1^(k), ..., b_m^(k)||. P1 now uses the pure strategy X_{i_{k+1}}, where i_{k+1} is a member of the set (1, ..., m) such that b_{i_{k+1}}^(k) is one of the maximums of the components of B^(k). A^(k+1) is formed by adding the i_{k+1}-th row vector of the payoff matrix to A^(k), component by component. P2 now uses the pure strategy Y_{j_{k+1}}, where j_{k+1} is a member of the set (1, ..., n) such that a_{j_{k+1}}^(k+1) is one of the minimums of the components of A^(k+1). The procedure then repeats.

It can be shown that the sequence {(1/k) Σ_{ν=1}^{k} X_{i_ν}} either converges to an optimal strategy for P1 or has a subsequence which does. Similarly the sequence {(1/k) Σ_{ν=1}^{k} Y_{j_ν}} either converges to an optimal strategy for P2 or has a subsequence which does. It can also be shown that the value of the game is the greatest lower bound of

    max_{i in (1, ..., m)} b_i^(k) / k,    k = 1, 2, ...,

and the least upper bound of

    min_{j in (1, ..., n)} a_j^(k) / k,    k = 1, 2, ... .
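Brown's procedure translates directly into code. The sketch below follows the accumulated-sum description above; the tie-breaking rule (first index), the initial choice, and the test matrix are assumptions of this sketch.

```python
# Sketch of Brown's statistical (fictitious play) method: each player
# repeatedly answers the opponent's accumulated history with a pure best reply.
def brown(a, steps):
    m, n = len(a), len(a[0])
    A = [0.0] * n          # P2's accumulated sum of the rows chosen by P1
    B = [0.0] * m          # P1's accumulated sum of the columns chosen by P2
    x_count = [0] * m
    y_count = [0] * n
    i = 0                  # P1's initial pure choice (assumed rule)
    for _ in range(steps):
        x_count[i] += 1
        A = [A[j] + a[i][j] for j in range(n)]   # add i-th row
        j = A.index(min(A))                      # P2 minimizes against history
        y_count[j] += 1
        B = [B[p] + a[p][j] for p in range(m)]   # add j-th column
        i = B.index(max(B))                      # P1 maximizes against history
    lower = min(A) / steps     # min_j a_j^(k)/k  <=  v
    upper = max(B) / steps     # v  <=  max_i b_i^(k)/k
    x_bar = [c / steps for c in x_count]         # empirical strategy for P1
    y_bar = [c / steps for c in y_count]         # empirical strategy for P2
    return x_bar, y_bar, lower, upper

a = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # assumed skew-symmetric game, value 0
x_bar, y_bar, lower, upper = brown(a, 2000)
```

The two quotients bracket the value of the game at every step, which is the source of the bounds quoted above.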
VON NEUMANN METHOD.

This method, as well as the one following, requires that the payoff matrix be a skew-symmetric matrix, in which case the game is called symmetric. It is quite easy to show that for a symmetric game the value of the game is zero and an optimal strategy for P1 is also an optimal strategy for P2. This apparent limitation presents no problem since every rectangular game can be enlarged into an equivalent symmetric game (5, p. 81-88). The Brown - Von Neumann technique (2, p. 73-79) sets up a set of differential equations whose solution is a solution of the game. Let (a_ij), i, j = 1, ..., n, be the payoff matrix of a symmetric game and let

    u_i = Σ_{j=1}^{n} a_ij x_j,    φ(u_i) = max[0, u_i]    for i = 1, ..., n.

Furthermore let

    φ(x) = Σ_{i=1}^{n} φ(u_i).

Then the system of differential equations to be used to yield a solution is

    dx_i/dt = φ(u_i) - φ(x) x_i,

with x_i(0) = c_i, c_i ≥ 0, Σ_{i=1}^{n} c_i = 1. It can then be shown that as t → ∞, x(t) has cluster points which furnish solutions to the game.
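A forward-Euler integration of the system above can serve as a sketch; the step size, step count, and test matrix are assumptions (the cited sources treat the continuous system, not this discretization).

```python
# Sketch: Euler integration of dx_i/dt = phi(u_i) - phi(x) x_i for a
# symmetric (skew-symmetric matrix) game; the sum of the x_i is invariant.
def von_neumann_flow(a, x, h, steps):
    n = len(a)
    for _ in range(steps):
        u = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        phi_u = [max(0.0, ui) for ui in u]        # phi(u_i) = max[0, u_i]
        phi_x = sum(phi_u)                        # phi(x) = sum_i phi(u_i)
        x = [xi + h * (pu - phi_x * xi) for xi, pu in zip(x, phi_u)]
    return x

a = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]          # assumed symmetric test game
x = von_neumann_flow(a, [0.6, 0.3, 0.1], h=0.01, steps=3000)
```

Note that the flow preserves Σ x_i = 1 exactly (the derivative of the sum is φ(x)(1 - Σ x_i)), and with h small enough no component goes negative.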
METHOD OF BELLMAN.

The method of Bellman (1) is a variant of the Brown - Von Neumann procedure devised to speed up convergence. The set of differential equations with which he ends up takes the form

    dx_i/dt = f(u_i) - ψ(x) x_i,

where

    f(u_i) = 1 if u_i > 0,    f(u_i) = 0 if u_i ≤ 0,

and

    ψ(x) = Σ_{i=1}^{n} f(u_i).

For computing purposes the differential equation is replaced by the difference equation

    x_i(n+1) = x_i(n) (1 - h ψ(x(n))) + h f(u_i(n)),

where h is chosen small enough to insure that no x_i goes negative. After one step of the iteration is performed, a linear interpolation process can be used to skip several iterations.
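The difference equation above can be sketched directly (without the linear interpolation shortcut); the step size and the test matrix are assumptions.

```python
# Sketch of Bellman's difference-equation iteration
#   x_i(n+1) = x_i(n)(1 - h*psi(x(n))) + h*f(u_i(n))
# for a symmetric game; f is the 0/1 indicator and psi counts positive u_i.
def bellman_iterate(a, x, h, steps):
    n = len(a)
    for _ in range(steps):
        u = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        f = [1.0 if ui > 0 else 0.0 for ui in u]
        psi = sum(f)
        x = [xi * (1 - h * psi) + h * fi for xi, fi in zip(x, f)]
    return x

a = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # assumed skew-symmetric test game
x = bellman_iterate(a, [0.6, 0.3, 0.1], h=0.01, steps=2000)
```

As with the Von Neumann flow, each step preserves Σ x_i = 1, and h ψ ≤ 1 keeps every component nonnegative, which is exactly the "h small enough" condition stated above.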
There are several other methods for solving games. For a good exposition of all the methods up to the present see (6, p. 424-446). In the next chapter the method described in this paper, for convenience called Method I, will be compared with Bellman's method and Brown's method.
CHAPTER V
NUMERICAL EXAMPLES

Example 1. Let us consider the skew-symmetric payoff matrix

     0  -1  -2   0   0   2   1
     1   0   0   4   0   0  -1
     2   0   0  -1   1   0  -1
     0  -4   1   0   0  -1   1
     0   0  -1   0   0  -3   1
    -2   0   0   1   3   0  -1
    -1   1   1  -1  -1   1   0

which, since the game is symmetric, has value v = 0. It can be shown to have the unique solution

    x_1 = .1111,  x_2 = .1111,  x_3 = .1667,  x_4 = .0556,  x_5 = .1667,  x_6 = .0556,  x_7 = .3333.

The following pages contain a comparison of the results obtained by the method of this paper, called Method I, Brown's statistical method, and Bellman's method.
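The stated solution can be checked numerically: for a skew-symmetric matrix the value is 0, and optimality of X requires Σ_i a_ij x_i ≥ 0 for every column j. The fractions used below (2/18, 2/18, 3/18, 1/18, 3/18, 1/18, 6/18) are assumed exact values of the rounded decimals quoted above.

```python
# Check that x* = (2,2,3,1,3,1,6)/18 is optimal for the skew-symmetric
# matrix of Example 1: every column payoff sum_i a_ij x_i must be >= 0.
from fractions import Fraction as F

M = [[ 0, -1, -2,  0,  0,  2,  1],
     [ 1,  0,  0,  4,  0,  0, -1],
     [ 2,  0,  0, -1,  1,  0, -1],
     [ 0, -4,  1,  0,  0, -1,  1],
     [ 0,  0, -1,  0,  0, -3,  1],
     [-2,  0,  0,  1,  3,  0, -1],
     [-1,  1,  1, -1, -1,  1,  0]]

x_star = [F(c, 18) for c in (2, 2, 3, 1, 3, 1, 6)]   # .1111, .1111, .1667, ...
col_sums = [sum(M[i][j] * x_star[i] for i in range(7)) for j in range(7)]
# In exact rational arithmetic every entry of col_sums is 0, i.e. the
# inequalities all hold with equality and the value of the game is 0.
```

The same computation confirms that M is skew-symmetric and that x* sums to one.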
[Table: iterates x_1, ..., x_7 for Method I, Bellman's method, and Brown's method at smaller values of k; the entries are not legible in the scanned source.]
             k = 90                         k = 125
       Method I  Bellman   Brown     Method I  Bellman   Brown
x_1     0.1105   0.0926   0.1555      0.1106   0.0721   0.1129
x_2     0.1195   0.1808   0.1111      0.1116   0.2367   0.1371
x_3     0.1608   0.1808   0.1889      0.1626   0.1503   0.1452
x_4     0.0556   0.0926   0.0444      0.0558   0.0702   0.0564
x_5     0.1663   0.0926   0.1222      0.1670   0.0702   0.1613
x_6     0.0555   0.1808   0.0778      0.0563   0.1589   0.0564
x_7     0.3318   0.1803   0.3000      0.3348   0.2416   0.3306
For k = 125 we have, for Method I, two place accuracy in all components, nearly two place accuracy for Brown's method, and very little accuracy in Bellman's method. However, as mentioned before, Bellman's method allows for a linear interpolation process by which many of the iterations can be jumped.

Example 2. Method I is dependent on the initial choice. If, for the same payoff matrix as Example 1, we had made the initial choice as indicated below, the process would have converged faster.
        k = 1     k = 50    k = 100   k = 125
x_1    0.0000    0.1067    0.1110    0.1111
x_2    0.0000    0.1096    0.1111    0.1111
x_3    0.0000    0.1668    0.1667    0.1667
x_4    1.0000    0.0588    0.0556    0.0556
x_5    0.0000    0.1662    0.1666    0.1667
x_6    0.0000    0.0553    0.0556    0.0556
x_7    0.0000    0.3366    0.3334    0.3333
Example 3. Consider the payoff matrix

    2  1  0
    0  1  2

It can be verified that P1 has the unique optimal strategy x_1 = x_2 = 1/2, and that P2 has the optimal strategies y_1 = a, y_2 = b, y_3 = a, where a ≥ 0, b ≥ 0 and 2a + b = 1. The value of the game is 1. The following computations illustrate several things about this method.

        k = 1     k = 2     k = 3     k = 4
x_1    0.5000    0.5000    0.5000    0.5000
x_2    0.5000    0.5000    0.5000    0.5000
y_1    0.0000    0.1102    0.1825    0.0000
y_2    1.0000    0.7583    0.8175    0.9494
y_3    0.0000    0.1102    0.0000    0.0505
v      0.0000    0.1000    0.2821    0.6928
        k = 10    k = 20    k = 30    k = 31
x_1    0.5000    0.5000    0.5000    0.5000
x_2    0.5000    0.5000    0.5000    0.5000
y_1    0.0914    0.0973    0.0957    0.0957
y_2    0.7900    0.8074    0.8086    0.8086
y_3    0.1185    0.0952    0.0957    0.0957
v      0.8846    0.9946    0.9999    0.9999
First of all, notice that although the method started with optimal strategies for P1 and P2, a non-optimal condition existed for P2 on the first iteration, this being due to the fact that the approximation to the value of the game differed from the actual value; in correcting this, the strategy for P2 was altered. Notice also that the optimal strategy finally obtained for P2 is different from the optimal strategy with which the method started. If we let Z* be the solution

    x_1 = 1/2,  x_2 = 1/2,  y_1 = 0,  y_2 = 1,  y_3 = 0,  v = 1,

we see that |Z* - Z^(1)| = 1, while |Z^(31) - Z^(1)| = 1.027. Therefore we can conclude that the method does not necessarily select the closest solution.
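Assuming the 2 × 3 matrix of Example 3 reads [[2, 1, 0], [0, 1, 2]] (a reconstruction, since the scanned matrix is incomplete), the claimed solution families can be verified directly: X = (1/2, 1/2) makes every column pay 1, and every Y = (a, 1 - 2a, a) makes every row pay 1.

```python
# Check the claimed solutions of Example 3 for the assumed matrix
# [[2, 1, 0], [0, 1, 2]] with value v = 1.
M = [[2, 1, 0], [0, 1, 2]]
x = [0.5, 0.5]

# P1's guarantee: sum_i a_ij x_i for each column j (each should equal 1)
cols = [sum(M[i][j] * x[i] for i in range(2)) for j in range(3)]

# P2's one-parameter family y = (a, 1 - 2a, a), 0 <= a <= 1/2:
# row payoffs sum_j a_ij y_j (each should equal 1 for every a)
def row_payoffs(a_param):
    y = [a_param, 1 - 2 * a_param, a_param]
    return [sum(M[i][j] * y[j] for j in range(3)) for i in range(2)]
```

This confirms both the endpoint y = (0, 1, 0) the method started from and the interior member near (0.0957, 0.8086, 0.0957) that it converged to are optimal for P2.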
BIBLIOGRAPHY

1. Bellman, Richard. On an iterative algorithm for finding the solution of games and linear programming problems. Santa Monica, Rand Corporation, 1953. 15 numb. leaves. (Research memorandum P-473).

2. Brown, G. W. and John Von Neumann. Solutions of games by differential equations. In: Kuhn and Tucker's Contributions to the Theory of Games, Vol. I. Princeton, Princeton University Press, 1950. p. 73-79. (Annals of Mathematics Studies, Study no. 24).

3. Dantzig, G. B. Maximization of a linear function of variables subject to linear inequalities. In: T. C. Koopmans' Activity Analysis of Production and Allocation. New York, Wiley, 1951. p. 339-347. (Cowles Commission Monograph 13).

4. Dantzig, G. B. A proof of the equivalence of the programming problem and the game problem. In: T. C. Koopmans' Activity Analysis of Production and Allocation. New York, Wiley, 1951. p. 330-338. (Cowles Commission Monograph 13).

5. Gale, David et al. On symmetric games. In: Kuhn and Tucker's Contributions to the Theory of Games, Vol. I. Princeton, Princeton University Press, 1950. p. 81-88. (Annals of Mathematics Studies, Study no. 24).

6. Luce, R. Duncan and Howard Raiffa. Games and decisions. New York, Wiley, 1957. 509 p.

7. Natanson, I. P. Theory of functions of a real variable. New York, Frederick Ungar, 1955. 277 p.

8. Robinson, Julia. An iterative method for solving a game. Annals of Mathematics 54:296-301. 1951.

9. Von Neumann, John. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100:295-320. 1928.

10. Von Neumann, John and Oskar Morgenstern. Theory of games and economic behavior. Princeton, Princeton University Press, 1944. 625 p.