Complexity and Mixed Strategy Equilibria∗
Tai-Wei Hu†
Penn State University
3/27/2008
Abstract
We propose a theory of mixed strategies in zero-sum two-person games. Given a finite
zero-sum two-person game g, we extend it to collective games g∞ and g∞,S , which are
infinite repetitions of the game g. Players in the collective games are restricted to use
computable strategies only, but each has a complex sequence that can be used in the
computation. We adopt Kolmogorov complexity to define complex sequences, which are
also called random sequences in the literature. The two random sequences are assumed to
be independent, and so they can be used to generate complex strategies in g∞ and g∞,S
with all possible rational relative frequencies that are unpredictable to their opponents.
These complex strategies are analogous to mixed strategies in g. In g∞ , however, there
are strategies that do not correspond to any mixed strategies. In g∞,S , players are allowed
to use only those strategies analogous to pure or mixed strategies in g. We show that the
collective games g∞ and g∞,S are solvable, and they both have the same value as that of
g. Moreover, we are able to show that any equilibrium strategy in g∞,S has relative
frequencies given by the probability values of an equilibrium mixed strategy in g.
Keywords: Kolmogorov complexity; objective probability; relative frequency theory; mixed
strategy; zero-sum game; effective randomness
∗ I am grateful to Professor Kalyan Chatterjee and Professor Neil Wallace for their guidance and
encouragement. I also received many useful comments from Professor Mamoru Kaneko. All remaining
errors, of course, are my own.
† E-mail: tuh129@psu.edu; Corresponding address: Department of Economics, Penn State University,
608 Kern Graduate Building, University Park, PA 16802-3306.
1 Introduction
We propose a new theory of mixed strategies for finite zero-sum two-person games, based
on complexity considerations. Mixed strategies are introduced in von Neumann and
Morgenstern [11] to analyze two-person zero-sum games. By using mixed strategies,
players may increase their security levels, i.e., minimum expected payoffs, in this class
of games. The original interpretation of mixed strategies (see, for example, Luce and
Raiffa [7]) is to use a random device and make decisions depending on the outcome of the
experiment. Players are assumed to make randomized decisions deliberately, instead of
making decisions randomly.
This interpretation has been challenged by many authors, and it is explicitly rejected
by Rubinstein [14]. It is doubtful that a rational player should make decisions at random
without a good reason. Moreover, mixed strategies can only increase the security levels
stochastically, but not deterministically. If the game is played only once, the worst
payoff that a player can receive does not change, whether or not the player has conducted a random experiment. Actually, the expected security level changes with the execution
of the random device — before the execution, it is the ex ante expected payoff, but it
becomes the worst payoff of a pure strategy after the execution. A more delicate example
which involves nature’s moves can be found in Aumann and Maschler [1].
Nonetheless, there is still a good reason to implement mixed strategies — to hide the
pattern of actions so that the opponent will not be able to predict them. If this is
the rationale for implementing mixed strategies, then repetition of the original game
is necessary: patterns only make sense in repetitive situations. Moreover, it is necessary
to introduce unpredictable strategies. Unpredictability is essential to the philosophy of
the maximin criterion, which requires a player to implement an optimal strategy against his
opponent's capability.
Our theory aims to formalize this rationale for mixed strategies. Thus, we extend a
finite game g into a collective game g∞ , which consists of infinite repetitions of the game
g. The game g is not necessarily repeated according to time order in g∞; all the
repetitions of the game may instead be played simultaneously. An example of the latter situation
can be found in Luce and Raiffa [7], which discusses an aerial strategist making decisions for
his pilots in numerous identical fights. We consider infinite repetitions as an idealization.
The game g∞ is indexed by N, and each t ∈ N represents a repetition of g. A strategy in
g∞ is then a sequence of actions in g, and its tth component is meant to be the action
taken by the player in the tth repetition.
We define a strategy in g∞ to be unpredictable if it is complex. We adopt Kolmogorov
complexity to measure how complicated a strategy is, and this notion is defined in terms
of computability. Roughly speaking, a finite sequence is complex if it does not have a
short description. We restrict ourselves to machines to describe objects — a description
is then a code consisting of 0 and 1’s, and it describes a sequence if the machine produces
the sequence with this code as input. For example, the sequence
0, 1, 0, 1, 0, 1, ..., 0, 1
which has 100 0’s and 1’s can be easily described, while the first 200 digits of the binary
expansion of the number π can only be described with a much longer sentence. Thus, we
may say that the second sequence is more complex than the first. An infinite sequence is
complex if its initial segments have asymptotically high complexities.
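The contrast drawn above can be illustrated with compression, a rough computable stand-in for description length. This is my own illustration, not part of the paper: Kolmogorov complexity itself is uncomputable, and the seeded pseudo-random 0-1 string below merely stands in for the digits of π.

```python
# Compressed length as a computable proxy for "length of a shortest
# description".  Illustration only: zlib is not the universal machine of
# the text, and the pseudo-random string stands in for the pi digits.
import random
import zlib

periodic = ("01" * 100).encode()  # the highly regular sequence 0,1,0,1,...

random.seed(0)
irregular = "".join(random.choice("01") for _ in range(200)).encode()

# The regular string admits a much shorter description than the irregular one.
print(len(zlib.compress(periodic)), len(zlib.compress(irregular)))
```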
In the original theory, the mixed strategies of the two players are assumed to be
independent, and hence each player’s mixed strategies are unpredictable to the other
player. In our formulation, this is formalized by considering oracle computations. A
sequence ξ is computable relative to another sequence ζ (as an oracle) if ξ can be computed
with the aid of ζ, i.e., there is a machine, connected to a black box (called an oracle
inquiry) that returns ζt with input t, that computes ξ. We can then define Kolmogorov
complexity relative to a sequence. Two sequences are independent if each sequence is
complex relative to the other.
In the game g∞ , each player has a sequence that is complex, and the two sequences
are independent. The strategy set of each player consists of computable strategies relative
to the sequence the player has. We adopt the long-run average criterion to specify the
payoffs in g∞ , that is, the payoff of a play is the limit of the average payoffs in its first T
repetitions of g as T goes to infinity. In a companion paper, this criterion is characterized
by a system of axioms on preference relations over infinite sequences with well-defined
limit relative frequencies. Moreover, the utility functions are determined uniquely up to
linear positive transformations. In g∞ , we have to extend the criterion since not every
play induces an infinite sequence of outcomes with well-defined relative frequencies. We
adopt the limit inferior for one player and the limit superior for the other so that g∞ is
still a zero-sum game.
Our formulation of mixed strategies is based on the relative frequency theory of probability. According to von Mises [8], this theory defines the probability value of an outcome
as the limit of the relative frequency with which it appears in a random sequence. In the
literature, a complex sequence is shown to be a random sequence with respect to the uniform distribution (see, for example, Downey et al. [3]). Nonetheless, we have to address
relative frequencies, and this feature is essential in the reasoning of the maximin principle. To achieve the maximum security level, a player not only has to play an unpredictable
strategy, but also has to assign the correct frequencies to different actions. We will show that a
complex sequence is sufficient to generate random sequences with any probability values.
We show that the game g∞ is solvable — that is, there is an equilibrium strategy profile
in g∞ , which can be found by maximizing the security levels of both players. Also, all the
equilibrium strategies in g correspond to equilibrium strategies in g∞ . We also show that
the value of g∞ is the same as that of g. In g∞ , players can increase their security levels
by implementing random sequences of actions, and the security levels are average payoffs
instead of expected payoffs. Hence, this increase is deterministic. We formulate another
game g∞,S which allows players to use only strategies in g∞ analogous to pure or mixed
strategies in g. We are able to characterize the set of equilibrium strategies in g∞,S:
any equilibrium strategy in g∞,S has an equilibrium strategy in g as its representative.
The rest of the paper is organized as follows: section 2 provides some preliminary
information about random sequences; the collective games are formulated and their solutions
are discussed in section 3; some discussions of the results and concluding remarks appear
in section 4.
2 Complexity
In this section we give the definition of Kolmogorov complexity. To do so, we first give
some preliminaries on recursion theory.
2.1 Recursion Theory
We will use partial recursive functions to define computability. This is the class of functions that can be computed by a machine, for any of the standard models of machine computation. A partial function is a function f : N^k → N, but f may not be defined
at every point in N^k. If f is defined at x, we write f(x) ↓; otherwise, we write f(x) ↑.
The set of partial functions on N^k is denoted by F^k, and the set of all partial functions is
denoted by F = ∪_{k=0}^∞ F^k. Define dom(f) = {x ∈ N^k : f(x) ↓}.
We shall introduce a couple of partial functions and some operations over F:
1. 0k : 0k (x) = 0 for all x ∈ Nk .
2. Projkj (j = 1, ..., k): Projkj (x1 , ..., xk ) = xj for all x ∈ Nk .
3. 1+ : 1+ (x) = x + 1 for all x ∈ N.
4. Composition: for any f1 , ..., fm ∈ F k and g ∈ F m , define
g ◦ (f1 , ..., fm )(x) = g(f1 (x), ..., fm (x))
for all x ∈ Nk .1
5. Primitive recursion: for any g ∈ F^{k+2} and any f ∈ F^k, define pr(f, g) as the function
h such that
h(0, x1, ..., xk) = f(x1, ..., xk),
and, for y ≥ 0,2
h(y + 1, x1, ..., xk) = g(y, h(y, x1, ..., xk), x1, ..., xk).
1
g ◦ (f1, ..., fm) is defined at x if and only if fi(x) is defined for all i = 1, ..., m and g(f1(x), ..., fm(x)) is
defined.
2
For y > 0, pr(f, g) is defined at (y, x) if and only if for all y′ < y, pr(f, g) is defined at (y′, x), and g
is defined at (y − 1, pr(f, g)(y − 1, x), x). For y = 0, pr(f, g) is defined at (0, x) if and only if f is defined
at x.
6. Least number operator: for any f ∈ F^{k+1}, define (µy)(f(x1, ..., xk, y) = 0) to be the
least z such that f(x1, ..., xk, z) = 0 and, for all 0 ≤ z′ < z, f(x, z′) is defined and f(x, z′) ≠ 0;
if no such z exists, (µy)(f(x1, ..., xk, y) = 0) is undefined.
For any set A ⊂ Nk , we identify A with its characteristic function χA . A is also called
a predicate.
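Two of the operations above, primitive recursion and the least number operator, can be sketched in ordinary code. This is my own illustration, not part of the paper; since genuine partial functions may diverge, the µ-search is capped by an explicit bound here.

```python
# A sketch of primitive recursion pr(f, g) and the least number operator
# (mu y), two of the operations used to build the partial recursive
# functions.  The mu-search is capped by a bound instead of diverging.

def pr(f, g):
    """Return h with h(0, *x) = f(*x) and h(y+1, *x) = g(y, h(y, *x), *x)."""
    def h(y, *x):
        acc = f(*x)
        for i in range(y):
            acc = g(i, acc, *x)
        return acc
    return h

def mu(f, *x, bound=10_000):
    """Least y such that f(*x, y) == 0, all earlier values being nonzero."""
    for y in range(bound):
        if f(*x, y) == 0:
            return y
    raise RuntimeError("undefined within the search bound")

# Addition built by primitive recursion from the initial functions:
# add(0, x) = x and add(y + 1, x) = add(y, x) + 1.
add = pr(lambda x: x, lambda y, acc, x: acc + 1)
print(add(3, 4))                                     # 7
print(mu(lambda x, y: 0 if y * y >= x else 1, 10))   # 4: least y with y*y >= 10
```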
Definition 2.1. The set of partial recursive functions PR is the smallest subset of F
that satisfies the following conditions:
(a) 1+ ∈ PR;
(b) for all k > 0 and for all j = 1, ..., k, Projkj ∈ PR;
(c) for all k ∈ N, 0k ∈ PR;
(d) if f, g ∈ PR and if f ∈ F k , g ∈ F k+2 , then pr(f, g) ∈ PR;
(e) if f1 , ..., fm , g ∈ PR and if, for i = 1, ..., m, fi ∈ F k , g ∈ F m , then g◦(f1 , ..., fm ) ∈ PR;
(f) if f ∈ F k+1 and if f ∈ PR, then (µy)(f (x, y) = 0) ∈ PR.
In this definition, the functions in requirements (a-c) are called initial functions. The
set PR is defined to be the smallest set including these initial functions that is closed
under the operations listed in requirements (d-f). We shall say that f is computable if
f ∈ PR. Any f ∈ F^1 that is total is also called a Turing oracle. Independence between
random sequences will play a crucial role in our formulation of collective games, and it is
defined in terms of relative computation. Given an oracle f, we may add to our machine
an oracle inquiry that returns values of f for given inputs. It is known that this is
equivalent to adding f to the list of initial functions in Definition 2.1.
Definition 2.2. Let f be a Turing oracle. The set of partial recursive functions relative to
f, denoted by PR^f, is the smallest subset of F that satisfies the conditions in Definition
2.1 and the condition that f ∈ PR^f.
A function is called f-computable if it belongs to PR^f. The cornerstone of computation theory is the fact that, for all k, we can effectively enumerate all the k-ary
f-computable functions. For each k, let {ϕ_e^{(k),f}}_{e=0}^∞ be such an enumeration.
The following theorem is called the enumeration theorem; its proof can be found
in Odifreddi [12].
Theorem 2.1. Let k ∈ N. There is an enumeration {ϕ_e^{(k),f}}_{e=0}^∞ of all functions in
PR^f ∩ F^k such that the function (x1, ..., xk, e) ↦ ϕ_e^{(k),f}(x1, ..., xk) is f-computable.
Given a finite set X = {x1, ..., xn}, we use X^{<N} to denote the set of all finite strings
over X, typical elements of which will be denoted by σ or τ. We use ε to denote the empty
string. The set of all infinite sequences over X is denoted by X^N, with typical elements
denoted by ξ, ζ, etc. For any T ∈ N, we use ξ[T] to denote the initial segment of ξ with
length T. We use σ ⊂ τ or σ ⊂ ξ to denote the fact that σ is an initial segment of τ or
ξ. We use στ to denote the concatenation of the two strings σ and τ. For any finite X,
the set X^{<N} can be effectively enumerated, i.e., there is a bijection from X^{<N} to N. Any
function from X^{<N} to N or to X^{<N} can thus be identified with a Turing oracle. The set Q
can also be enumerated, and so a function from N to Q can be regarded as a Turing oracle
as well.
2.2 Kolmogorov complexity
In this section we define Kolmogorov complexity for finite sequences over {0, 1}. The
set of all such sequences is denoted by {0, 1}<N . It is well-known that the set {0, 1}<N
can be effectively enumerated, and so we may identify this set with N. Thus, a function
f : {0, 1}^{<N} → {0, 1}^{<N} can be viewed as a function in F^1. We say that such a function
is prefix-free if the domain of f is prefix-free, i.e., for any σ ≠ τ ∈ {0, 1}^{<N}, if f(τ) ↓ and
if σ ⊂ τ or τ ⊂ σ, then f(σ) ↑.
For a given finite sequence σ, its complexity (with respect to a given function f) is
defined to be the minimum length T such that there is a finite sequence τ with length T
and f(τ) = σ. Formally, we define

K_f(σ) = min{|τ| : τ ∈ {0, 1}^{<N}, f(τ) = σ},

and K_f(σ) = ∞ if there is no τ such that f(τ) = σ. This measure is machine dependent, i.e.,
it depends on the function considered. However, there exist machines that use the input
most efficiently, and these are called universal machines. Formally, we say that a function
f is universal (among all prefix-free functions) if for all prefix-free functions g, there is a
string ρ such that g(σ) ≃ f(ρσ) (i.e., g(σ) is defined if and only if f(ρσ) is, and they have
the same value when they are defined) for all σ ∈ {0, 1}^{<N}. For universal machines,
the measure is asymptotically absolute. The proof of the following theorem can be found
in Downey et al. [3].
Theorem 2.2. There is a universal machine f. Moreover, for any two universal machines
f and g, there are constants C1 and C2 such that K_f(σ) ≤ K_g(σ) + C1 and K_g(σ) ≤
K_f(σ) + C2 for all σ ∈ {0, 1}^{<N}.
In the following, we fix a universal machine f and define K(σ) to be K_f(σ) for
all σ ∈ {0, 1}^{<N}. Notice that, by Theorem 2.2, the values of K differ by at most a
constant across different choices of universal machine. There is an asymptotic upper bound on
the Kolmogorov complexity thus defined.
We can also define Kolmogorov complexity relative to a function h, which may not be
computable. This is done by considering a universal machine relative to h, i.e., a prefix-free function f satisfying the following: for any h-computable prefix-free function g, there
is a ρ such that g(σ) ≃ f(ρσ) for all σ ∈ {0, 1}^{<N}. We then define K^h(σ) = K_f(σ) for
any universal machine f relative to h. We remark that the corresponding modification
of Theorem 2.2 holds for Kolmogorov complexity relative to a function h. There is an
upper bound for the complexity measure: there is some constant C such that K(σ) ≤
|σ| + 2 log2 |σ| + C for all σ ∈ {0, 1}^{<N} (see, for example, Downey et al. [3]).
2.3 Effective randomness and independence
Now we shall define randomness in terms of complexity, following Chaitin [2] and Levin [6].
Roughly speaking, a sequence in {0, 1}^N is random if it has high complexity. The Kolmogorov
complexity measure applies only to finite sequences, and so the complexity of an infinite sequence is
defined in terms of the complexities of its initial segments. We have seen that there is
an upper bound for the complexity of a finite sequence. Randomness, however, does not
require such high complexity.
Definition 2.3. A sequence ξ ∈ {0, 1}^N is random if there is a constant C such that
K(ξ[T]) ≥ T − C for all T ∈ N. Given a Turing oracle f, a sequence ξ is random relative
to f if there is a constant C such that K^f(ξ[T]) ≥ T − C for all T ∈ N.
A random sequence can be thought of as the realization of a repetitive experiment with
two equally probable outcomes. It is well-known that such sequences exist, and, actually,
there are uncountably many of them (see Downey et al. [3]). We can show that any
random sequence effectively satisfies intuitive properties of such realizations. For example,
for any random sequence ξ,

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} ξt = 1/2.
Independence is a crucial concept for game theory. We are able to formulate independence of random sequences, which formalizes the idea that each sequence is unpredictable
with respect to the other. We use relative randomness to define independence. Roughly
speaking, two random sequences are independent if either sequence provides no information about the other.
Definition 2.4. Two random sequences ξ and ζ are independent if ξ is random relative
to ζ.
We can show that if ξ and ζ are independent, then ζ is random relative to ξ as
well. Thus, independence is a symmetric relation. There is a relationship between
independence defined here and independence in the axiomatic probability theory, which
will be discussed later. We conclude this section with the remark that independent random
sequences exist. Actually, for each random sequence ξ, there are uncountably many
random sequences that are independent of ξ.
3 Zero-sum two-person games
A general zero-sum two-person game is a triple G = ⟨S1, S2, ≾⟩, where S1 and S2 are the
strategy sets for players 1 and 2, respectively, and ≾ is player 1's preference relation over
S = S1 × S2. Player 2's preference relation is the exact opposite of ≾. A strategy profile
(s*1, s*2) ∈ S is an equilibrium if (s1, s*2) ≾ (s*1, s*2) for all s1 ∈ S1 and (s*1, s*2) ≾ (s*1, s2)
for all s2 ∈ S2. We assume that the preference ≾ can be represented by a bounded
real-valued function u : S → R. Hereafter, we shall identify ≾ with its representation
u. For any s1 ∈ S1, define V_G(s1) = inf_{s2∈S2} u(s1, s2), and, for any s2 ∈ S2, define
W_G(s2) = sup_{s1∈S1} u(s1, s2); these are the security levels for players 1 and 2, respectively.
We say that s1 ∈ S1 is an equilibrium strategy for player 1 if there is some s2 ∈ S2
such that (s1 , s2 ) is an equilibrium. Equilibrium strategies for player 2 are defined in the
same manner. In this class of games, if equilibrium strategies exist, they can be found by
maximization of the security levels. The following theorem is well-known.
Theorem 3.1. Suppose that there exists an equilibrium (s*1, s*2) in G. Then the following
two conditions are equivalent.
(a) (s′1, s′2) is an equilibrium.
(b) max_{s1∈S1} V_G(s1) = V_G(s′1) and min_{s2∈S2} W_G(s2) = W_G(s′2).
Theorem 3.1 (b) implies that if G has an equilibrium, then the equilibrium strategies
are exactly those that maximize the security levels. Moreover, this implies that G satisfies exchangeability, i.e., for any equilibrium strategies s1 ∈ S1 and s2 ∈ S2 , (s1 , s2 ) is an
equilibrium. If G has an equilibrium, then we say that G is solvable. In solvable games,
equilibrium strategies may be called rational in the sense that a minimum payoff is guaranteed by implementing them, and a higher payoff may be obtained if the opponent does
not follow the criterion; moreover, such a strategy is optimal if the opponent also follows
the same criterion.
Not every finite zero-sum two-person game g = ⟨X, Y, ≾⟩, however, is solvable. Consider the matching pennies game given in the following matrix, denoted by g^mp = ⟨{x1, x2}, {y1, y2}, h⟩:
          y1         y2
x1     (1, −1)    (−1, 1)
x2     (−1, 1)    (1, −1)
It is easy to check that V_{g^mp}(x1) = V_{g^mp}(x2) = −1 < 1 = W_{g^mp}(y1) = W_{g^mp}(y2), and
hence, by Theorem 3.1, there is no equilibrium in g^mp. Player 1, nonetheless, may still
increase his minimum expected payoff by tossing a fair coin, playing x1 if heads occurs
and playing x2 otherwise: no matter what player 2 plays, the expected payoff for player 1
is 0. In the same way, player 2 can increase his security level by introducing randomized
strategies, and the game becomes solvable. In general, any finite zero-sum two-person
game is solvable in mixed strategies, and, in some games like matching pennies,
players have to deliberately use mixed strategies to increase their security levels.
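These computations can be checked mechanically. The following is my own sketch, not part of the paper, of the pure-strategy security levels and the coin-toss payoff in g^mp:

```python
# Security levels in matching pennies, and the effect of the fair-coin mix.
# Payoffs are player 1's, taken from the matrix in the text.
h = {("x1", "y1"): 1, ("x1", "y2"): -1,
     ("x2", "y1"): -1, ("x2", "y2"): 1}
X, Y = ["x1", "x2"], ["y1", "y2"]

V = {x: min(h[x, y] for y in Y) for x in X}   # player 1's security levels
W = {y: max(h[x, y] for x in X) for y in Y}   # player 2's security levels
print(V)  # {'x1': -1, 'x2': -1}
print(W)  # {'y1': 1, 'y2': 1}

# Mixing x1 and x2 with probability 1/2 each guarantees expected payoff 0
# against either pure strategy of player 2.
expected = {y: sum(0.5 * h[x, y] for x in X) for y in Y}
print(expected)  # {'y1': 0.0, 'y2': 0.0}
```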
Mixed strategy equilibria, however, are not invariant with respect to different representations of the preference relation ≾ over the outcomes of the game. For example, if
we put h(x1, y1) = 2 instead of 1 in g^mp, playing x1 with probability 1/2 is no longer
an equilibrium strategy. Von Neumann and Morgenstern [11] develop a theory of
expected utility which extends the domain of the preference relation to the set of probability distributions over the outcomes, and propose a system of axioms under which the utility
functions are determined uniquely up to positive linear transformations. With this extension, any
zero-sum two-person game becomes solvable, and the solutions are invariant with respect
to different representations of the preference relation over probability distributions.
This theory was originally developed for static games, i.e., for games that are played
only once. Players, however, cannot deterministically increase their minimum
payoffs by implementing mixed strategies in one-shot games. In g^mp, for example, the
minimum payoff is −1 for both players, and tossing a coin does not change this fact. We
propose a new theory of mixed strategies: we extend a finite zero-sum two-person
game g into a collective game g∞, which consists of infinitely many repetitions of g.
In g∞, each player has a complex sequence with which to generate unpredictable strategies,
which are analogous to mixed strategies.
In our theory, a foundation of preference representation is also necessary so that the
solution will not change with different representations. In a companion paper, we develop
a version of expected utility theory from the frequentist perspective. The main result there
states that a preference relation over infinite sequences with well-defined limit relative
frequencies is represented by the long-run average criterion if certain axioms are satisfied,
and utility functions are determined uniquely up to positive linear transformations. For
our purposes here, we extend the long-run average criterion to limit inferior and limit
superior of the average utilities.
3.1 Collective games
Let g = ⟨X, Y, h⟩ be a zero-sum two-person game, where X = {x1, ..., xm} is player
1's strategy set, Y = {y1, ..., yn} is player 2's strategy set, and h : X × Y → Q
is the von Neumann-Morgenstern utility function. Now we define the collective game
g∞ = ⟨X, Y, uh⟩ formally. Our definition will depend on the choice of random sequences
available to the players. Let ξ and ζ be two independent random sequences. The strategy
sets for players 1 and 2 are
X = {a : N → X : a is a ξ-computable total function}

and

Y = {b : N → Y : b is a ζ-computable total function},

respectively. In other words, a strategy is a total computable function relative to the
random sequence accessible to the player. Given a strategy profile (a, b) ∈ X × Y, the
play resulting from the profile is a ⊗ b ∈ (X × Y)^N.
The payoff of a play a ⊗ b ∈ (X × Y)^N to player 1 is

uh(a ⊗ b) = lim inf_{T→∞} (1/T) Σ_{t=0}^{T−1} h(at, bt),   (1)

and the payoff to player 2 is

−uh(a ⊗ b) = lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} (−h(at, bt)).   (2)
We use different extensions of the long-run average criterion for the two players so that
the game g∞ is a zero-sum game.
In the game g∞ , we introduce the computability constraints formally, so that the
players may not be able to predict some of the strategies of their opponents. For example,
in the matching pennies, player 2 cannot predict the following strategy of player 1: at = x1
if ξt = 0 and at = x2 if ξt = 1. Even though player 2 is aware of player 1’s choice of a, player
2 is not able to produce a strategy such as bt = y2 if at = x1 and bt = y1 if at = x2 . Thus,
the random sequences ξ and ζ allow the players to use unpredictable strategies relative
to each other’s computability, and it is then possible to formalize a decision criterion that
guarantees a minimum payoff against the capability of the opponent with certainty.
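A finite-horizon simulation illustrates this averaging. Since a genuine random sequence ξ is not computable, a seeded pseudo-random generator stands in for it in this sketch of my own; it only illustrates the long-run average, not genuine randomness.

```python
# A finite-horizon sketch of the collective matching pennies game.  A seeded
# PRNG stands in for the random sequence xi; player 2 follows a simple
# computable rule, and player 1's average payoff approaches 0, the value.
import random

h = {("x1", "y1"): 1, ("x1", "y2"): -1,
     ("x2", "y1"): -1, ("x2", "y2"): 1}

rng = random.Random(42)
xi = [rng.randrange(2) for _ in range(100_000)]            # stand-in for xi

a = ["x1" if bit == 0 else "x2" for bit in xi]             # a_t from xi_t
b = ["y1" if t % 2 == 0 else "y2" for t in range(len(a))]  # a computable opponent

T = len(a)
average = sum(h[a_t, b_t] for a_t, b_t in zip(a, b)) / T
print(average)  # close to 0, the value of the game
```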
The game g∞ is solvable, and it has the same value as g.
Theorem 3.2. Consider any finite zero-sum two-person game g = ⟨X, Y, h⟩ such that
h : X × Y → Q. There is an equilibrium (a*, b*) in g∞ = ⟨X, Y, uh⟩ such that uh(a*, b*)
is the value of g. Moreover, the following two conditions are equivalent:
(a) (a*, b*) is an equilibrium.
(b) max_{a∈X} V_{g∞}(a) = V_{g∞}(a*) and min_{b∈Y} W_{g∞}(b) = W_{g∞}(b*).
This theorem is proved by constructing a strategy that is highly complex and has the
relative frequencies of actions given by an equilibrium mixed strategy. We have seen
that the random sequences ξ and ζ both have relative frequency (1/2, 1/2) on {0, 1}. In
the next section, we shall show that these sequences can generate complex sequences with
any relative frequencies.
3.2 General random sequences
In this section we shall define general random sequences over X. We use

∆(X) = {p ∈ Q_+^{|X|} : Σ_{x∈X} p_x = 1}
to denote the set of all probability distributions (with rational probability values) over X.
We adopt the definition from Muchnik [9], which defines random sequences in terms of
martingales and, we hope, will be more accessible to economists. We will show that this
definition is equivalent to the complexity approach.
A martingale is a betting strategy against infinite sequences over X, but we shall
formulate it in terms of the capital at hand, contingent on the outcomes in the sequence
under consideration. Formally, a function M : X^{<N} → R+ ∪ {∞} is a p-martingale for
some p ∈ ∆(X) if

M(σ) = Σ_{x∈X} p_x M(σ⟨x⟩) for all σ ∈ X^{<N}.   (3)
Let f be a Turing oracle. A p-martingale M is f-effective if there is a sequence of
p-martingales {M_t}_{t=0}^∞ that satisfies the following properties:
(a) M_t(σ) ∈ Q+ for all t ∈ N and for all σ ∈ X^{<N};
(b) M_t is f-computable for each t ∈ N;
(c) lim_{t→∞} M_t(σ) ↑ M(σ) for all σ ∈ X^{<N}.
In this case, we say that the sequence {M_t}_{t=0}^∞ supports M. We say that a martingale M
succeeds over a sequence ξ ∈ X^N if lim sup_{T→∞} M(ξ[T]) = ∞.
Definition 3.1. Let f be a Turing oracle. A sequence ξ ∈ X^N is p-random relative to f
if there is no f-effective p-martingale that succeeds over ξ.
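The fairness condition (3) and the notion of success can be illustrated with a concrete betting strategy. This is my own sketch, not from the paper: a (1/2, 1/2)-martingale that bets all capital on the next symbol being 0.

```python
# A (1/2, 1/2)-martingale on binary strings that bets everything on the next
# symbol being 0: capital doubles on each 0 and is wiped out by any 1.
# Strings are represented as tuples of 0/1.

def M(sigma):
    capital = 1.0
    for bit in sigma:
        capital = 2.0 * capital if bit == 0 else 0.0
    return capital

# Fairness condition (3): M(sigma) = (1/2) M(sigma 0) + (1/2) M(sigma 1).
for sigma in [(), (0,), (0, 1), (1, 0, 0)]:
    assert M(sigma) == 0.5 * M(sigma + (0,)) + 0.5 * M(sigma + (1,))

# M succeeds over the all-zeros sequence: M(0^T) = 2^T grows without bound,
# so that sequence is not (1/2, 1/2)-random.
print(M((0,) * 20))  # 2**20 = 1048576.0
```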
A p-random sequence can be thought of as the realization of a repetitive experiment
in which the probability values p give the relative frequencies of the outcomes. Moreover, any
p-random sequence is stochastic, meaning that any of its subsequences selected by a computable
rule has the same relative frequencies. To formalize this, we now introduce selection
functions.
A selection function is a function r : X^{<N} → {0, 1}. We apply r to a sequence ξ to
obtain a subsequence of ξ by selecting ξ_T into the subsequence if r(ξ[T]) = 1. Formally, for
any selection function r and any sequence ξ ∈ X^N, we define a partial function θ^{r,ξ} : N → N
inductively as follows:
(a) θ^{r,ξ}_0 is the least T such that r(ξ[T]) = 1;
(b) θ^{r,ξ}_{t+1} is the least T such that T > θ^{r,ξ}_t and r(ξ[T]) = 1.
The function θ^{r,ξ} records the places selected by r when it is applied to ξ. The subsequence selected
by r, denoted by ξ^r, is defined by ξ^r_t = ξ_{θ^{r,ξ}(t)} for all t ∈ N. Notice that ξ^r
may be partial. The theorem below formalizes our claim; its proof can be found
in the appendix.
Theorem 3.3. Let h : X → R be an arbitrary function. Suppose that ξ is p-random
relative to f with p_x > 0 for all x ∈ X, and suppose that r is an f-computable selection
function. If ξ^r is total, then

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} h(ξ^r_t) = Σ_{x∈X} p_x h(x).
If r is identically 1, then Theorem 3.3 implies that for any p-random sequence, p gives
the relative frequencies of the outcomes. We have seen that a random sequence in {0, 1}^N has
relative frequency (1/2, 1/2), and this is not accidental. Actually, the set of random sequences
is the same as the set of (1/2, 1/2)-random sequences, as stated in the following theorem.
Its proof can be found in Downey et al. [3].
Theorem 3.4. An infinite sequence in {0, 1}^N is a (1/2, 1/2)-random sequence if and only
if it is a random sequence.
We have defined independence for random sequences. For general random sequences,
this definition also applies.
Definition 3.2. Consider two finite sets X and Y . Let ξ ∈ X N be p-random for some
p ∈ ∆(X) and let ζ ∈ Y N be q-random for some q ∈ ∆(Y ). We say that ξ and ζ are
independent if ξ is random relative to ζ.
Here we shall show that our definition of independence is an effective version of the
same concept in axiomatic probability theory. For any (p, q) ∈ ∆(X) × ∆(Y), we define
p ⊗ q to be their product measure over X × Y. For any two sequences ξ ∈ X^N and
ζ ∈ Y^N, we define ξ ⊗ ζ by (ξ ⊗ ζ)_t = (ξ_t, ζ_t) for all t ∈ N.
Independence of random variables is defined in terms of product measures in
axiomatic probability theory: a random variable on X and a random variable on Y are
independent in the standard theory if their joint distribution is a product distribution over
X × Y. Here, we can also define p ⊗ q-randomness in (X × Y)^N. The following theorem,
essentially due to van Lambalgen [5], characterizes independence in terms of randomness
with respect to product measures, which establishes a connection between our definition
of independence and the measure-theoretic definition. Its proof can be found in Hu [4].
Theorem 3.5. Consider two finite alphabets X and Y. Suppose ξ ∈ X^N and ζ ∈ Y^N,
and suppose p ∈ ∆(X) and q ∈ ∆(Y).
(a) If ξ ⊗ ζ is p ⊗ q-random, then ξ is p-random relative to ζ.
(b) If ξ is p-random relative to ζ and ζ is q-random, then ξ ⊗ ζ is p ⊗ q-random.
We conclude this section with a theorem which states that any p-random sequence
can be generated from a random sequence via a computable mapping.3 The mapping is
very intuitive. For example, to produce a (1/6, 1/3, 1/2)-random sequence in {x1, x2, x3}^N
from a random sequence ξ, first regroup the components of ξ into blocks of three and obtain
ξ′ such that ξ′_t = 1 if (ξ_{3t}, ξ_{3t+1}, ξ_{3t+2}) = (0, 0, 0), ξ′_t = 2 if (ξ_{3t}, ξ_{3t+1}, ξ_{3t+2}) = (1, 0, 0),
ξ′_t = 3 if (ξ_{3t}, ξ_{3t+1}, ξ_{3t+2}) = (0, 1, 0), ..., ξ′_t = 8 if (ξ_{3t}, ξ_{3t+1}, ξ_{3t+2}) = (1, 1, 1);
then obtain the subsequence ξ″ by deleting all the 7's and 8's in ξ′; finally, consider the
mapping ω : {1, ..., 6} → X such that ω(1) = x1, ω(2) = ω(3) = x2, and ω(4) = ω(5) = ω(6) = x3,
and let ζ be such that ζ_t = ω(ξ″_t). Intuitively, we expect that ξ′ is a (1/8, ..., 1/8)-random
sequence, ξ″ is a (1/6, ..., 1/6)-random sequence, and ζ is a (1/6, 1/3, 1/2)-random sequence.
All of these are true. Moreover, this method easily generalizes to any p ∈ ∆(X).
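The regroup-reject-relabel construction can be sketched directly. This is my own illustration; once again a seeded PRNG stands in for the random sequence ξ.

```python
# A sketch of the construction in the text: produce a sequence over
# {x1, x2, x3} with frequencies (1/6, 1/3, 1/2) from a fair binary source.
import random

rng = random.Random(3)
xi = [rng.randrange(2) for _ in range(300_000)]

# Step 1: regroup into blocks of three bits, read as a value in {1, ..., 8}
# ((0,0,0) -> 1, (1,0,0) -> 2, (0,1,0) -> 3, ..., (1,1,1) -> 8).
xi1 = [1 + xi[3*t] + 2*xi[3*t+1] + 4*xi[3*t+2] for t in range(len(xi) // 3)]

# Step 2: reject the blocks valued 7 or 8.
xi2 = [v for v in xi1 if v <= 6]

# Step 3: relabel with omega(1)=x1, omega(2)=omega(3)=x2, omega(4..6)=x3.
omega = {1: "x1", 2: "x2", 3: "x2", 4: "x3", 5: "x3", 6: "x3"}
zeta = [omega[v] for v in xi2]

n = len(zeta)
for x, target in [("x1", 1/6), ("x2", 1/3), ("x3", 1/2)]:
    print(x, zeta.count(x) / n)  # each close to its target frequency
```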
Lemma 3.1. Let ξ be a (1/2, 1/2)-random binary sequence. For each p ∈ ∆(X), there is a
ξ-computable function ξ^p : N → X which is p-random.
³ I am grateful to Andrei Karavaev for pointing this out.
Proof. From the results in Hu [4], there is a computable mapping Λ : ∆(X) × {0, 1}^N → X^N
such that for any p ∈ ∆(X) and any (1/2, 1/2)-random ξ, Λ(p, ξ) is p-random. Define
ξ^p_t = Λ(p, ξ)_t. Since Λ is computable, ξ^p is ξ-computable. □
In the following, we use ξ^p to denote a specific p-random sequence that is computable
relative to ξ for each p ∈ ∆(X), and ζ^q to denote a specific q-random sequence that is
computable relative to ζ for each q ∈ ∆(Y ).
3.3 Solution of the collective game
In this section we discuss the structure of the solutions of g∞. First, we give a
sketch of the proof of Theorem 3.2; a full proof can be found in the appendix.
Consider a finite zero-sum two-person game g = ⟨X, Y, h⟩, where h has range in Q. It
is well known that the game g is solvable in mixed strategies with rational probability
values. Let (p*, q*) ∈ ∆(X) × ∆(Y ) be an equilibrium mixed strategy profile in g. By
Lemma 3.1, ξ^{p*} ∈ X and ζ^{q*} ∈ Y. We claim that the strategy profile (ξ^{p*}, ζ^{q*}) is an
equilibrium in g∞.
First we prove that

(∀a ∈ X)(lim inf_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a_t, ζ^{q*}_t) ≤ h(p*, q*)).  (4)

Since ξ and ζ are independent, ζ^{q*} is q*-random relative to a for any a ∈ X. Moreover, for
any x ∈ X, h(x, q*) ≤ h(p*, q*). Inequality (4) then follows from Theorem 3.3. Similarly,
we can show that

(∀b ∈ Y)(lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} (−h(ξ^{p*}_t, b_t)) ≤ −h(p*, q*)).  (5)
Finally, we show that

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} h(ξ^{p*}_t, ζ^{q*}_t) = h(p*, q*).  (6)

This follows from the fact that ξ^{p*} and ζ^{q*} are independent, by applying Theorem 3.3.
This independence is an implication of the fact that ξ and ζ are independent. Clearly,
inequalities (4) and (5) and equality (6) imply that (ξ^{p*}, ζ^{q*}) is an equilibrium profile.
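Equality (6) can be illustrated numerically. The sketch below is ours: it takes g to be matching pennies (value 0 at p* = q* = (1/2, 1/2)) and uses two independently seeded pseudo-random streams as stand-ins for the random sequences ξ^{p*} and ζ^{q*}, so it demonstrates the long-run average payoff only, not algorithmic randomness.

```python
import random

# Matching pennies as g: h(x, y) = 1 if the choices match and -1 otherwise.
# Its value is 0, attained at p* = q* = (1/2, 1/2).
def h(x, y):
    return 1 if x == y else -1

# Two independently seeded streams stand in for the independent random sequences.
rng1, rng2 = random.Random(1), random.Random(2)
T = 200_000
avg = sum(h(rng1.randint(0, 1), rng2.randint(0, 1)) for _ in range(T)) / T
# avg is close to h(p*, q*) = 0, as in equality (6)
```

The long-run average here is deterministic once the two sequences are fixed, which is the sense in which the paper's security-level result is deterministic rather than in expectation.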
Theorem 3.2 shows that the game g∞ is solvable, and all the solutions of the game
g correspond to solutions in g∞ . The sets of equilibrium strategies in g for both players
are closed convex sets, and their extreme points can be identified constructively by linear
programming. However, we are not able to explicitly describe the structure of equilibrium
strategies in g∞ .
We shall now simplify the game g∞ , and allow only strategies that have representatives
in the set of pure or mixed strategies in g. These are called simple strategies, and they
belong to either one of two categories: the first class, XP , is analogous to the pure strategy
set in g, and the second class, XM , is analogous to the mixed strategy set. Formally:
XP = {a : N → X : a is total and computable},
and
XM = {a : N → X : a = ξ p for some p ∈ ∆(X)}.
Let XS = XP ∪ XM . The definitions of YP , YM , and YS are formulated in exactly the
same manner. We denote the game hXS , YS , uh i by g∞,S . For the game g∞,S , we can
characterize the structure of its equilibrium strategies, and this is stated in the following
theorem. Its proof can be found in the appendix.
Theorem 3.6. Let g = hX, Y, hi be a zero-sum two-person game with h : X × Y → Q.
There is an equilibrium (a∗ , b∗ ) in g∞,S = hXS , YS , uh i, and uh (a∗ , b∗ ) is the value v ∗ of
the game g. Moreover,
(a) a ∈ XP is an equilibrium strategy if and only if

lim inf_{T→∞} |{t = 0, ..., T − 1 : a_t is not an equilibrium pure strategy in g}| / T = 0.
(b) a ∈ XM is an equilibrium strategy if and only if there is an equilibrium strategy
p ∈ ∆(X) in g such that a = ξ p .
(c) if (a*, b*) ∈ XM × YM is an equilibrium, then for any a ∈ X,

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a*_t, b*_t) ≥ lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a_t, b*_t).
Since the value of g∞,S is the same as the value of g∞, and XS ⊂ X and YS ⊂ Y,
any equilibrium strategy in g∞,S is also an equilibrium strategy in g∞. Parts (a) and (b)
characterize the equilibrium strategies in g∞,S in terms of equilibrium strategies in g. Notice
that the limit inferior in part (a) works only for player 1; the result applies to player 2's
equilibrium strategies in YP if the limit inferior is replaced by the limit superior.
Moreover, part (a) also implies that any strategy in XP or YP that uses only equilibrium
pure strategies in g is also an equilibrium strategy in g∞ and g∞,S . Finally, part (c)
implies that any equilibrium strategy in XM or YM is also an equilibrium strategy for
modifications of g∞,S or g∞ which adopt limit superior or limit inferior or other criteria
in between for either player’s payoff specification.
4 Discussion and Conclusion

This section discusses the implications of Theorems 3.2 and 3.6 for the literature,
and also makes some comments on our formulation and future development.
4.1 Mixed strategies and complexity
Our theory formulates mixed strategies as random sequences. In zero-sum two-person
games, we show that players may increase their security levels by using these random
sequences in collective games. The security level is defined in terms of long-run average
payoffs instead of expected payoffs, and hence our theory gives a deterministic result.
We capture randomness by complexity. Strategies in XM are deterministic strategies (in
the sense that all their actions are specified), but they are random sequences, which can
be viewed as complex sequences. Hence, our theorems imply that players may guarantee
a minimum payoff by using strategies that are highly complex relative to their opponents'
computability.
The ability to predict is formalized by computability constraints. The
key assumption is the independence of the random sequences that the players can use
in their computations. In the literature, many papers discussing complexity in repeated
games use finite automata to model computability constraints (see, for example, Osborne
and Rubinstein [13]). In that approach, complexity is usually measured by the number
of states. However, that measure does not seem capable of defining independence in the
way our theory captures it.
4.2 Long-run average
We adopt the long-run average criterion for payoff specification in our collective games.
This criterion is characterized by a representation theorem in a companion paper, where a
system of axioms is provided so that measurement of the utility function is possible.
For our purpose, we extend the criterion to limit inferior and limit superior for players 1
and 2, respectively. Part (c) of Theorem 3.6 states that equilibrium strategies in XM and
YM are robust to these specifications. This criterion may not be satisfactory in many
applications. Nonetheless, in situations like professional sports, the number of wins is
usually what matters, and there the criterion seems to have some relevance.
4.3 Information structure in collective games
In our formulation of g∞, no time structure is mentioned, and players make choices for
each stage game simultaneously. If the time structure is modeled explicitly and the index
set N is interpreted as time order, then we may also model the players' memories of past
plays. Formally, we may formulate our strategy sets as follows:
X = {a : Y <N → X : a is total and computable relative to ξ},
and
Y = {b : X <N → Y : b is total and computable relative to ζ}.
We remark that this modification does not change the value of g∞, and all equilibrium strategies in the original game remain equilibrium strategies in the modified game.
However, some new equilibrium strategies appear in the modified game.
4.4 Infinite sequences vs. finite sequences
Our collective game g∞ is formulated as an infinite repetition of g. Repetition is
necessary for mixed strategies to be useful in increasing (deterministic) security levels, but
infinite repetition can only be an idealization. Mixed strategies are formulated as random
sequences, which have two features: randomness and well-defined relative frequencies.
The first feature, as we have seen, is defined in terms of complexity, but it is the second
feature that gives a probability value in ∆(X) for each random sequence. To include all
mixed strategies with different probability values in ∆(X), it is then necessary to introduce
infinite sequences, since, for any finite T, there is some p ∈ ∆(X) such that no sequence
in X^T has relative frequencies equal to p.
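The point that no single finite horizon realizes every rational frequency vector can be checked directly. The following sketch is ours (the helper name `representable` is not from the paper): a frequency vector p is exactly realizable by a length-T sequence if and only if each p_i T is an integer.

```python
from fractions import Fraction as F

def representable(p, T):
    """True iff the frequency vector p is exactly realizable
    by some sequence of length T, i.e. each p_i * T is an integer."""
    return all((pi * T).denominator == 1 for pi in p)

p = (F(1, 3), F(2, 3))
horizons = [T for T in range(1, 13) if representable(p, T)]
# horizons == [3, 6, 9, 12]: only multiples of 3 realize p exactly
```

So for p = (1/3, 2/3) every horizon T not divisible by 3 fails, which is why the theory needs infinite sequences to cover all of ∆(X) at once.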
We hope that our approach may help in the development of a theory of mixed strategies
in finite situations. The basic idea, nonetheless, does not depend on the infinite structure:
players can increase their security levels by playing strategies so complicated that their
opponents are unable to predict them. However, there are still some conceptual and
technical difficulties. Although Kolmogorov complexity can be defined for finite strings,
it is an absolute measure only asymptotically. For finite strings, the measure depends on
the machine making the computations. Moreover, independence is also defined only for
infinite random sequences.
5 Appendix

5.1 Omitted proofs
Proof of Theorem 3.3. Let L_{r,ξ}(T) = |{0 < t < T + 1 : r(ξ[t − 1]) = 1}| be the number
of elements selected by r in ξ[T]. Then θ_{r,ξ} is total if and only if L_{r,ξ}(T) is unbounded.
We first show that for each x ∈ X,

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} χ_x(ξ^r_t) h(ξ^r_t) = p_x h(x),

where χ_x(x′) = 1 if x = x′, and χ_x(x′) = 0 otherwise. Without loss of generality, we
assume that h(x) = 1.
Suppose that there exist some ε > 0 and a sequence {T_k}_{k=0}^∞ such that for all k ∈ N,
(1/T_k) Σ_{t=0}^{T_k−1} χ_x(ξ^r_t) ≥ p_x + ε. We shall define a martingale M as follows:

(a) M(⟨⟩) = 1;
(b) M(σ⟨x⟩) = (1 + κ(1 − p_x))M(σ), and M(σ⟨x′⟩) = (1 − κ p_x)M(σ) for all x′ ≠ x, if
r(σ) = 1;
(c) M(σ⟨x′⟩) = M(σ) for all x′ ∈ X if r(σ) = 0.
To check that M is a martingale, note that if r(σ) = 1, then

Σ_{x′∈X} p_{x′} M(σ⟨x′⟩) = p_x(1 + κ(1 − p_x))M(σ) + Σ_{x′≠x} p_{x′}(1 − κ p_x)M(σ)
  = M(σ) + κ M(σ)(p_x(1 − p_x) − (1 − p_x)p_x) = M(σ);

if r(σ) = 0, then

Σ_{x′∈X} p_{x′} M(σ⟨x′⟩) = Σ_{x′∈X} p_{x′} M(σ) = M(σ).
M is f-computable since r is. Define D_T = Σ_{t=0}^{L_{r,ξ}(T)−1} χ_x(ξ^r_t). Then,

M(ξ[T]) = (1 + κ(1 − p_x))^{D_T} (1 − κ p_x)^{L_{r,ξ}(T) − D_T}.

Let L_k = (L_{r,ξ})^{−1}(T_k). Since ξ^r is total, L_k is well defined for all k ∈ N. Since for
each k, D_{L_k} ≥ T_k p_x + T_k ε,

M(ξ[L_k]) ≥ ((1 + κ(1 − p_x))^{p_x+ε} (1 − κ p_x)^{1−p_x−ε})^{T_k}.
Take F(κ) = (1 + κ(1 − p_x))^{p_x+ε} (1 − κ p_x)^{1−p_x−ε}. We have F(0) = 1 and (ln F)′(0) =
(p_x + ε)(1 − p_x) − (1 − p_x − ε)p_x = ε > 0. Thus, for κ small enough, F(κ) > 1, and so

lim sup_{T→∞} M(ξ[T]) = ∞,

a contradiction to ξ being p-random relative to f. Therefore, lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} χ_x(ξ^r_t) ≤
p_x. Similarly, we can show that lim inf_{T→∞} (1/T) Σ_{t=0}^{T−1} χ_x(ξ^r_t) ≥ p_x. Thus,
lim_{T→∞} (1/T) Σ_{t=0}^{T−1} χ_x(ξ^r_t) = p_x. □
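The betting scheme (b)-(c) above can be sketched numerically. The code below is our own illustration, not part of the proof: for simplicity it assumes the selection rule r selects every position, and it shows that capital grows exponentially on a sequence whose frequency of x exceeds p_x, but not at the correct frequency.

```python
# Capital is multiplied by 1 + kappa*(1 - p_x) when x appears,
# and by 1 - kappa*p_x otherwise, as in clauses (b)-(c) of the proof
# (with the selection rule r assumed to select every position).
def capital(seq, x, p_x, kappa):
    M = 1.0
    for s in seq:
        M *= (1 + kappa * (1 - p_x)) if s == x else (1 - kappa * p_x)
    return M

# x appears with frequency 0.6 while the martingale prices it at p_x = 0.5:
M_biased = capital(["x"] * 60 + ["y"] * 40, "x", 0.5, 0.1)
# at the correct frequency 0.5 the capital does not grow:
M_fair = capital(["x"] * 50 + ["y"] * 50, "x", 0.5, 0.1)
# M_biased is about 2.4, while M_fair is below 1
```

Iterating over longer and longer prefixes of a frequency-biased sequence drives the capital to infinity, which is exactly the unbounded lim sup that contradicts p-randomness in the proof.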
Proof of Theorem 3.2.
Let (p∗ , q ∗ ) ∈ ∆(X) × ∆(Y ) be an equilibrium point in g. By Theorem 5.2, such an
equilibrium exists. Since (p∗ , q ∗ ) is an equilibrium, it follows that h(x, q ∗ ) ≤ h(p∗ , q ∗ ) for
all x ∈ X and h(p∗ , y) ≥ h(p∗ , q ∗ ) for all y ∈ Y .
Let a* = ξ^{p*} and let b* = ζ^{q*}. By Lemma 3.1, a* is p*-random and ξ-computable,
and b* is q*-random and ζ-computable. Thus, ζ is random relative to a*, and so, by
Theorem 3.5, a* is random relative to ζ, and hence random relative to b*. Thus, a* and
b* are independent.
First we show that, instead of (4),

(∀a ∈ X)(lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a_t, b*_t) ≤ h(p*, q*)).  (7)
Notice that (4) is a direct implication of (7). Since b* is q*-random relative to ξ, for any
a ∈ X, b* is random relative to a. Let a ∈ X. For each x ∈ X, let r_x : Y^{<N} → {0, 1}
be the selection function such that r_x(σ) = 1 if a_{|σ|} = x, and r_x(σ) = 0 otherwise. r_x
is ξ-computable since a is. We use b^x to denote (b*)^{r_x}, which is the subsequence of b*
selected by the function r_x.
Define L^x(T) = |{t ∈ N : t ≤ T − 1, a_t = x}|. Let

E¹ = {x ∈ X : lim_{T→∞} L^x(T) = ∞}, and E² = {x ∈ X : lim_{T→∞} L^x(T) < ∞}.

For each x ∈ E², let B^x = lim_{T→∞} L^x(T) and let C^x = Σ_{t=0}^{B^x−1} h(x, b^x_t). By
Theorem 3.3, for any x ∈ E¹,

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} h(x, b^x_t) = h(x, q*) ≤ h(p*, q*).  (8)
We claim that for any ε > 0, there is some T′ such that T > T′ implies that

(1/T) Σ_{t=0}^{T−1} h(a_t, b*_t) ≤ h(p*, q*) + ε.  (9)
Fix some ε > 0. Let T₁ be so large that T > T₁ implies that, for all x ∈ E¹ (recall that
m = |X|),

(1/T) Σ_{t=0}^{T−1} h(x, b^x_t) ≤ h(p*, q*) + ε/m,  (10)

and, for all x ∈ E²,

C^x / T < ε/m.  (11)
Let T′ be so large that L^x(T) > T₁ for all x ∈ E¹ whenever T > T′. If T > T′, then

(1/T) Σ_{t=0}^{T−1} h(a_t, b*_t)
  = Σ_{x∈E¹} (L^x(T)/T) · (1/L^x(T)) Σ_{t=0}^{L^x(T)−1} h(x, b^x_t) + Σ_{x∈E²} (1/T) Σ_{t=0}^{L^x(T)−1} h(x, b^x_t)
  ≤ Σ_{x∈E¹} (L^x(T)/T) (h(p*, q*) + ε/m) + Σ_{x∈E²} ε/m
  ≤ h(p*, q*) + ε.  (12)
This proves inequality (9), and implies that, for any ε > 0, there is some T′ such
that

sup_{T>T′} (1/T) Σ_{t=0}^{T−1} h(a_t, b*_t) ≤ h(p*, q*) + ε.

Therefore,

lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a_t, b*_t) ≤ h(p*, q*).

This proves (7).
Since h(p*, y) ≥ h(p*, q*) for all y ∈ Y, similar arguments prove (5), which implies

(∀b ∈ Y)(lim inf_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a*_t, b_t) ≥ h(p*, q*)).  (13)

We have seen that a* and b* are independent. By Theorem 3.5, a* ⊗ b* is p* ⊗ q*-random.
Then, (6) comes directly from Theorem 3.3.
By (13) and (6), we have that Vg∞(a*) = h(p*, q*). By (4), we have that Vg∞(a) ≤
h(p*, q*) for all a ∈ X. Thus,

max_{a∈X} Vg∞(a) = Vg∞(a*) = h(p*, q*).  (14)

By (4) and (6), we have that Wg∞(b*) = h(p*, q*). By (13), we have that Wg∞(b) ≥
h(p*, q*) for all b ∈ Y. Thus,

min_{b∈Y} Wg∞(b) = Wg∞(b*) = h(p*, q*).  (15)

The second part of the theorem comes directly from the existence result and Theorem 3.1. □
Proof of Theorem 3.6. Given an equilibrium point (p*, q*) in g, consider ξ^{p*} ∈ XM and
ζ^{q*} ∈ YM. By (4), (5), and (6), we conclude that (ξ^{p*}, ζ^{q*}) is an equilibrium in g∞,S
(notice that XS ⊂ X and YS ⊂ Y).
(a) First we show that if there is no pure equilibrium strategy of player 1 in g, then no
strategy in XP can be an equilibrium strategy in g∞,S. Suppose that there is no pure
equilibrium strategy for player 1 in g, and let v* be the value of g. Then min_{y∈Y} h(x, y) < v*
for all x ∈ X. Let v₀ = max_{x∈X} min_{y∈Y} h(x, y) < v*. Consider an arbitrary strategy a ∈ XP.
Let b ∈ YP be such that b_t = arg min{h(a_t, y) : y ∈ Y}. Notice that b is computable since
a is. For each T ∈ N, (1/T) Σ_{t=0}^{T−1} h(a_t, b_t) ≤ v₀ < v*, and so

lim inf_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a_t, b_t) < v*.

Hence, u_h(a, b) < v*, and so Vg∞,S(a) < v*. Then, a cannot be an equilibrium strategy.
Suppose now that there exists some pure equilibrium strategy in g, and let v* be the value
of the game. Let v₁ = min{h(x, y) : x ∈ X, y ∈ Y}. Then, if x ∈ X is an equilibrium
strategy, min_{y∈Y} h(x, y) ≥ v*. Suppose that a′ ∈ XP and lim_{T→∞} C_{a′}(T) = 0, where

C_{a′}(T) = |{t = 0, ..., T − 1 : a_t is not an equilibrium pure strategy in g}| / T.

Let b ∈ Y. Then,

(1/T) Σ_{t=0}^{T−1} h(a_t, b_t) ≥ C_{a′}(T) v₁ + (1 − C_{a′}(T)) v*.

Thus,

u_h(a′, b) ≥ (lim inf_{T→∞} C_{a′}(T)) v₁ + (1 − lim inf_{T→∞} C_{a′}(T)) v* = v*.

It follows from (a) that a′ is an equilibrium strategy.
Conversely, suppose that lim inf_{T→∞} C_{a′}(T) = ε > 0. Consider the strategy
b : N → Y such that b_t = arg min{h(a_t, y) : y ∈ Y}. Define

v₂ = max{min_{y∈Y} h(x, y) : x is not an equilibrium pure strategy in g}.

By Theorem 5.2, v₂ < v*. Then, for any T ∈ N,

(1/T) Σ_{t=0}^{T−1} h(a_t, b_t) ≤ C_{a′}(T) v₂ + (1 − C_{a′}(T)) v*,

and hence

u_h(a′, b) ≤ (lim inf_{T→∞} C_{a′}(T)) v₂ + (1 − lim inf_{T→∞} C_{a′}(T)) v* = ε v₂ + (1 − ε) v* < v*.

Thus, by (a), a′ is not an equilibrium strategy.
(b) Suppose that a* ∈ XM is an equilibrium strategy, with a* = ξ^{p*} for some p* ∈ ∆(X).
For each y ∈ Y, define b^y ∈ YP to be the strategy such that b^y_t = y for all t ∈ N. Then,
by Theorem 3.3,

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} h(a*_t, b^y_t) = h(p*, y).

It then follows, since a* maximizes Vg∞,S and the value of g∞,S is the value v* of g, that

min_{y∈Y} h(p*, y) ≥ Vg∞,S(a*) = v*.

This implies that p* is an equilibrium strategy in g.

(c) This is a direct result of (7). □
5.2 Some known results on zero-sum two-person games

In this section we present some known results on zero-sum two-person games to keep the
paper self-contained. The first concerns such games with arbitrary strategy sets, and the
second concerns solvability in mixed strategies with rational probability values.
Theorem 5.1. Suppose that there exists an equilibrium (s∗1 , s∗2 ) in G. Then the following
two conditions are equivalent.
(a) (s01 , s02 ) is an equilibrium.
(b) maxs1 ∈S1 VG (s1 ) = VG (s01 ) and mins2 ∈S2 WG (s2 ) = WG (s02 ).
Proof. Since (s∗1 , s∗2 ) is an equilibrium, it follows that
(∀s1 ∈ S1 )(u(s∗1 , s∗2 ) ≥ u(s1 , s∗2 )),
(16)
(∀s2 ∈ S2 )(u(s∗1 , s∗2 ) ≤ u(s∗1 , s2 )).
(17)
and
Since, for all s₁, V(s₁) ≤ u(s₁, s₂*) by definition, (16) implies that sup_{s₁∈S₁} V(s₁) ≤
u(s₁*, s₂*). But by (17), V(s₁*) = u(s₁*, s₂*), and thus sup_{s₁∈S₁} V(s₁) = u(s₁*, s₂*). Actually,
we can say that max_{s₁∈S₁} V(s₁) = u(s₁*, s₂*). Similarly, since, for all s₂, W(s₂) ≥ u(s₁*, s₂)
by definition, (17) implies that inf_{s₂∈S₂} W(s₂) ≥ u(s₁*, s₂*). But by (16), W(s₂*) = u(s₁*, s₂*),
and thus inf_{s₂∈S₂} W(s₂) = u(s₁*, s₂*). Again, we can say that min_{s₂∈S₂} W(s₂) = u(s₁*, s₂*).
This also shows that (a) ⇒ (b). Let v = u(s∗1 , s∗2 ) be the value of the game.
Conversely, suppose that V (s01 ) = v = W (s02 ). Let s1 ∈ S1 . If u(s1 , s02 ) > u(s01 , s02 ),
then V (s01 ) = v = W (s02 ) ≥ u(s1 , s02 ) > u(s01 , s02 ), which is a contradiction since V (s01 ) ≤
u(s01 , s02 ). Similarly, let s2 ∈ S2 . If u(s01 , s2 ) < u(s01 , s02 ), then W (s02 ) = v = V (s01 ) ≤
u(s01 , s2 ) < u(s01 , s02 ), which is a contradiction since W (s02 ) ≥ u(s01 , s02 ).
Theorem 5.2. Consider any finite two-person zero-sum game g = hX, Y, hi such that
h : X × Y → Q. Then there is an equilibrium pair (p∗ , q ∗ ) ∈ ∆(X) × ∆(Y ).
Proof. We can show that an equilibrium exists in mixed strategies with real-valued
probabilities; this is a result from Nash [10]. Let ∆₀(X) = {p ∈ [0, 1]^{|X|} : Σ_{x∈X} p_x = 1}
and let ∆₀(Y) = {q ∈ [0, 1]^{|Y|} : Σ_{y∈Y} q_y = 1}. The security levels for players 1 and 2 are
defined as

v̂(p) = min_{q∈∆₀(Y)} h(p, q) and ŵ(q) = max_{p∈∆₀(X)} h(p, q),

respectively. Suppose that (p′, q′) ∈ ∆₀(X) × ∆₀(Y) is an equilibrium. By Theorem 3.1,

h(p′, q′) = v̂(p′) = max_{p∈∆₀(X)} v̂(p) = min_{q∈∆₀(Y)} ŵ(q) = ŵ(q′).

Moreover, a strategy profile (p′, q′) is an equilibrium if and only if

v̂(p′) = max_{p∈∆₀(X)} v̂(p) and ŵ(q′) = min_{q∈∆₀(Y)} ŵ(q).
We shall now show that the problem max_{p∈∆₀(X)} v̂(p) has a rational solution. The
problem is equivalent to the following problem, which we call LP1(g):

min_{c∈R^{|X|}} Σ_{x∈X} c_x

subject to

(∀x ∈ X) c_x ≥ 0 and (∀y ∈ Y) Σ_{x∈X} c_x h(x, y) ≥ 1.
We may assume that h(x, y) > 0 for all x ∈ X and all y ∈ Y. If c solves the above
problem, take v = 1 / Σ_{x∈X} c_x and take p_x = v c_x. Then, for all y ∈ Y, Σ_{x∈X} p_x h(x, y) ≥ v,
and so v̂(p) ≥ v. Moreover, for any p′ ∈ ∆₀(X), take c′_x = p′_x / v̂(p′). Then, for all y ∈ Y,
Σ_{x∈X} c′_x h(x, y) ≥ 1, so c′ is feasible, and hence Σ_{x∈X} c′_x = 1/v̂(p′) ≥ Σ_{x∈X} c_x = 1/v.
Thus v̂(p) ≥ v ≥ v̂(p′) for every p′ ∈ ∆₀(X), and so p attains max_{p′∈∆₀(X)} v̂(p′).
Conversely, if v̂(p′) = max_{p∈∆₀(X)} v̂(p), then take c′_x = p′_x / v̂(p′). Clearly c′ is feasible.
Moreover, for any other feasible c, take v_c = 1 / Σ_{x∈X} c_x. Since c is feasible, for all y ∈ Y,
Σ_{x∈X} p_x h(x, y) ≥ v_c for p defined by p_x = v_c c_x, and so v̂(p) ≥ v_c. Since v̂(p′) ≥ v̂(p) ≥ v_c,
it follows that Σ_{x∈X} c′_x ≤ Σ_{x∈X} c_x.
It is easy to check that, since h has range in Q, LP1(g) has all its basic feasible solutions
in Q^{|X|} as well. The set of solutions to LP1(g) is convex and closed, and its extreme
points are basic feasible solutions, so they lie in Q^{|X|}. Thus, there is at least one rational
solution. □
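The rationality of the value can be seen concretely in the 2×2 case, where the optimal mixed strategy has a familiar closed form. The sketch below is our own illustration, not from the paper; it assumes the game has no saddle point (so the closed form applies) and uses exact rational arithmetic, so rational payoffs visibly yield a rational value and rational probabilities.

```python
from fractions import Fraction as F

def solve_2x2(h):
    """Value and player 1's optimal mixed strategy for a 2x2 zero-sum game
    with no saddle point; h[i][j] is player 1's payoff for row i vs column j."""
    (a, b), (c, d) = h
    denom = a - b - c + d            # nonzero when there is no saddle point
    p1 = (d - c) / denom             # probability of the first row
    value = (a * d - b * c) / denom
    return (p1, 1 - p1), value

# A game with rational payoffs and no saddle point:
h = [[F(3), F(-1)], [F(-2), F(4)]]
(p1, p2), v = solve_2x2(h)
# p* = (3/5, 2/5) and the value is 1, both rational
```

Playing (3/5, 2/5) yields exactly 1 against either column, which is the equalization property that the linear program LP1(g) encodes in general.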
References

[1] Aumann, R. J. and M. Maschler. (1972). "Some thoughts on the minimax principle."
Management Science, vol. 18, 54-63.

[2] Chaitin, G. J. (1975). "A theory of program size formally identical to information
theory." Journal of the ACM, vol. 22, 329-340.

[3] Downey, R., D. Hirschfeldt, A. Nies, and S. Terwijn. (2006). "Calibrating randomness."
Bulletin of Symbolic Logic, vol. 12, no. 3, 411-491.

[4] Hu, Tai-Wei. (2008). "Uniform distribution representation of Martin-Löf randomness."
Working paper.

[5] van Lambalgen, M. (1990). "The axiomatization of randomness." Journal of Symbolic
Logic, vol. 55, no. 3, 1143-1167.

[6] Levin, L. A. (1973). "On the notion of a random sequence." Soviet Mathematics
Doklady, vol. 14, 1413-1416.

[7] Luce, R. Duncan and Howard Raiffa. (1957). Games and Decisions: Introduction and
Critical Survey. Dover.

[8] von Mises, Richard. (1939). Probability, Statistics, and Truth.

[9] Muchnik, An. A., A. Semenov, and V. Uspensky. (1998). "Mathematical metaphysics
of randomness." Theoretical Computer Science, vol. 207, no. 2, 263-317.

[10] Nash, John. (1950). "Equilibrium points in n-person games." Proceedings of the
National Academy of Sciences, U.S.A., vol. 36, 48-49.

[11] von Neumann, J. and O. Morgenstern. (1944). Theory of Games and Economic
Behavior. Princeton University Press.

[12] Odifreddi, P. G. (1989). Classical Recursion Theory, Vol. I. North-Holland Publishing
Co., Amsterdam.

[13] Osborne, Martin J. and Ariel Rubinstein. (1994). A Course in Game Theory. MIT
Press.

[14] Rubinstein, A. (1991). "Comments on the interpretation of game theory." Econometrica,
vol. 59, no. 4, 909-924.