Representing von Neumann-Morgenstern Games in the Situation Calculus

From: AAAI Technical Report WS-02-06. Compilation copyright © 2002, AAAI (www.aaai.org). All rights reserved.
Oliver Schulte and James Delgrande
School of Computing Science
Simon Fraser University
Burnaby, B.C., Canada
{oschulte,jim}@cs.sfu.ca
Abstract
Sequential von Neumann-Morgenstern (VM) games are a very general formalism for representing multi-agent interactions and planning problems in a variety of types of environments. We show that sequential VM games with countably many actions and continuous utility functions have a sound and complete axiomatization in the situation calculus. We discuss the application of various concepts from VM game theory to the theory of planning and multi-agent interactions, such as representing concurrent actions and using the Baire topology to define continuous payoff functions.

Keywords: decision and game theory, multiagent systems, knowledge representation, reasoning about actions and change
Introduction
Much recent research has shown that systems of interacting software agents are a useful basis for computational intelligence. Our best general theory of interactions between agents is the mathematical theory of games, developed to a high level since the seminal work of von Neumann and Morgenstern (von Neumann & Morgenstern 1944). Von Neumann and Morgenstern (VM for short) designed game theory as a very general tool for modelling agent interactions; experience has confirmed the generality of the formalism: computer scientists, economists, political scientists and others have used game theory to model hundreds of scenarios that arise in economic, social and political spheres.
Our aim is to represent the concepts of VM game theory in a logical formalism so as to enable computational agents to employ game-theoretical models for the interactions that they are engaged in. It turns out that the epistemic extension of the situation calculus (Levesque, Pirri, & Reiter 1998, Sec. 7) is adequate for this task. We consider VM games in which the agents can take at most countably many actions and in which the agents' payoff functions are continuous. We show that every such VM game G has an axiomatization Axioms(G) in the situation calculus that represents the game, in the following strong sense: The game G itself is a model of Axioms(G), and all models of Axioms(G) are isomorphic (that is, Axioms(G) is a categorical axiomatization of G). It follows that the axiomatization is correct, in the sense that Axioms(G) entails only true assertions about the game G, and complete in the sense that Axioms(G) entails all true assertions about the game G.
This result makes a number of contributions to Artificial Intelligence. First, it shows how to construct a sound and complete set of situation calculus axioms for a given application domain. Bart et al. give an extended construction of such a set of axioms for a version of the board game "Clue" (Bart, Delgrande, & Schulte 2001), which we outline in the penultimate section. Second, the result establishes that the situation calculus is a very general language for representing multi-agent interactions. Third, it provides an agent designer with a general recipe for utilizing a game-theoretic model of a given type of interaction (e.g., Prisoner's Dilemma, Battle of the Sexes, Cournot Duopoly (Osborne & Rubinstein 1994)) and describing the model in a logical formalism. Fourth, it opens up the potential for applying solution and search algorithms from games research to multi-agent interactions (Koller & Pfeffer 1997; van den Herik & Iida 2001; Bart, Delgrande, & Schulte 2001). Finally, game theory is a major mathematical development of the 20th century, and we expect that many of the general concepts of game theory will prove fruitful for research into intelligent agents. For example, we introduce a standard topological structure associated with games known as the Baire topology. The Baire topology determines the large class of continuous payoff functions, which can be defined in the situation calculus in a natural way.
An attractive feature of VM game theory is that it provides a single formalism for representing both multi-agent interactions and single-agent planning problems. This is because a single-agent interaction with an environment can be modelled as a 2-player game in which one of the players, the environment, is indifferent about the outcome of the interaction. Thus our representation result includes planning problems as a special case.
The paper is organized as follows. We begin with the definition of a sequential VM game. The next section introduces the situation calculus. Then we specify the set of axioms Axioms(G) for a given game G, in two stages. First, we axiomatize the structure of the agents' interaction: roughly, what agents can do and what they know when. Second, we show how to define continuous utility functions in the situation calculus. Finally, for illustration we outline an extended application from previous work in which we used the situation calculus to represent a variant of the board game "Clue". (Some readers may want to look over this discussion in the next-to-last section, before the more abstract results in the main sections.)
Sequential Games
Definitions Sequential games are also known as extensive form games or game trees. The following definition is due to von Neumann, Morgenstern, and Kuhn; we adopt the formulation of (Osborne & Rubinstein 1994, Ch. 11.1). We begin with some notation for sequences. An infinite sequence is a function from the positive natural numbers to some set; we denote infinite sequences by the letter h and variants such as h′ or h_i. A finite sequence of length n is a function with domain {1, ..., n}, denoted by the letter s and variants. Almost all the sequences we consider in this paper are sequences of actions. We denote actions throughout by the letter a and variants. We write s = a_1, ..., a_n to indicate the finite sequence whose i-th member is a_i, and write h = a_1, ..., a_n, ... for infinite sequences. If s = a_1, ..., a_n is a finite sequence of n actions, the concatenation s * a = a_1, ..., a_n, a yields a finite sequence of length n + 1. We follow the practice of set theory and write s′ ⊆ s to indicate that sequence s′ is a prefix of s (s′ ⊂ s for proper prefixes); likewise, we write s ⊂ h to indicate that s is a finite initial segment of the infinite sequence h. The term "sequence" without qualification applies both to finite and infinite sequences, in which case we use the letter σ and variants.
Now we are ready to define a sequential game.
Definition 0.1 (von Neumann, Morgenstern, Kuhn) A sequential game G is a tuple (N, H, player, f_c, {I_i}, {u_i}) whose components are as follows.
1. A finite set N (the set of players).
2. A set H of sequences satisfying the following three properties.
(a) The empty sequence ∅ is a member of H.
(b) If σ is in H, then every initial segment of σ is in H.
(c) If h is an infinite sequence such that every finite initial segment of h is in H, then h is in H.
Each member of H is a history; each element of a history is an action taken by a player. A history σ is terminal if there is no history s ∈ H such that σ ⊂ s. (Thus all infinite histories are terminal.) The set of terminal histories is denoted by Z. The set of actions available at a finite history s is denoted by A(s) = {a : s * a ∈ H}.
3. A function player that assigns to each nonterminal history a member of N ∪ {c}. The function player determines which player takes an action after the history s. If player(s) = c, then it is "nature's" turn to make a chance move.
4. A function f_c that assigns to every history s for which player(s) = c a probability measure f_c(·|s) on A(s). Each probability measure is independent of every other such measure. (Thus f_c(a|s) is the probability that "nature" chooses action a after the history s.)
[Figure 1: game tree omitted in this copy. Server 1 receives payoff x, Server 2 receives payoff y; nodes c and d are in the same information set.]

Figure 1: A game-theoretic model of the interaction between two internet servers, S1 and S2, with two users.
5. For each player i ∈ N an information partition 𝓘_i defined on {s ∈ H : player(s) = i}. An element I_i of 𝓘_i is called an information set of player i. We require that if s, s′ are members of the same information set I_i, then A(s) = A(s′).
6. For each player i ∈ N a payoff function u_i : Z → R that assigns a real number to each terminal history.
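To make Definition 0.1 concrete, the following is a minimal sketch (ours, not code from the paper) of a finite game tree represented as a prefix-closed set of histories; the action names follow the server example of Figure 1 and are illustrative assumptions.

```python
# A finite sequential game (Definition 0.1): histories as tuples of actions.
# Names (Show, Poli, 2S2, U2) follow the server example; illustrative only.

H = {
    (),                                       # empty history: environment moves
    ("Show",), ("Poli",),                     # S1's turn
    ("Show", "2S2"), ("Poli", "2S2"),         # S2's turn (nodes c, d)
    ("Show", "2S2", "U2"), ("Poli", "2S2", "U2"),  # terminal histories
}

def prefixes(s):
    return [s[:i] for i in range(len(s) + 1)]

# Clause 2b: H must be closed under initial segments.
assert all(p in H for s in H for p in prefixes(s))

def A(s):
    """Actions available at history s: A(s) = {a : s*a in H}."""
    return {t[-1] for t in H if len(t) == len(s) + 1 and t[:-1] == s}

def terminal(s):
    """A history is terminal iff no history in H properly extends it."""
    return not A(s)

Z = {s for s in H if terminal(s)}             # the set of terminal histories

def player(s):
    """Who moves after nonterminal history s (the environment moves first)."""
    return {0: "env", 1: "S1", 2: "S2"}[len(s)]
```

Clause 2c is vacuous for a finite H; it only constrains trees with infinite histories.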
An Example To illustrate how interactions between agents may be represented as game trees, we adopt an abridged version of a scenario from (Bicchieri, Ephrati, & Antonelli 1996). We return to this example throughout the paper. Consider servers on the internet. Each server is connected to several sources of information and several users, as well as other servers. There is a cost to receiving and transmitting messages for the servers, which they recover by charging their users. We have two servers S1 and S2, and two users (journalists) U1 and U2. Both servers are connected to each other and to user U1; server S2 also serves user U2. There are two types of news items that interest the users: politics and showbiz. The various costs and charges for transmissions add up to payoffs for the servers, depending on what message gets sent where. For example, it costs S1 4 cents to send a message to S2, and it costs S2 2 cents to send a message to U2. If U2 receives a showbiz message from S2 via S1, he pays S1 and S2 each 6 cents. So in that case the overall payoff to S1 is -4 + 6 = 2, and the overall payoff to S2 is -2 + 6 = 4. Bicchieri et al. describe the charges in detail; we summarize them in Figure 1.
Figure 1 represents the structure of the interaction possibilities between the servers and the users. We begin with the environment delivering a type of item to the first server. If the chance p of each type of item arising is known among the players, the root node r corresponding to the empty history ∅ would be a chance node (i.e., player(∅) = c) with associated probability p. If the probability p is unknown, we would assign the root node to an "environment" player (i.e., player(∅) = env) who is indifferent among all outcomes (and whose payoff function hence need not be listed). Thus game theory offers two ways of representing uncertainty about the environment, depending on whether the probabilities governing the environment are common knowledge among the players or not.

Every node in the tree represents a history in the game. The nodes belonging to S1 are {a, b}, and its information partition is 𝓘_1 = {{a}, {b}}. Bicchieri et al. assume that the messages sent from S1 to S2 are encoded so that S2 does not know which type of message it receives. Therefore the information partition of S2 is 𝓘_2 = {{c, d}}. If S2 knew what type of news item it receives, its information partition would be 𝓘_2 = {{c}, {d}}. The payoff functions are as illustrated, with server S1's payoffs shown on top. Thus u1(∅ * Showbiz * send S2 * send U2) = 2, and u2(∅ * Showbiz * send S2 * send U2) = 4.
The game tree of Figure 1 admits different interpretations from the one discussed so far. For example, it also represents a situation in which the messages are not encoded, but in which server S2 has to make a decision independent of what news item S1 sends it. In the latter case it would be more natural to reorder the game tree so that S2 chooses first, then the environment, finally S1. But the insight from game theory here is that for rational choice we do not need to represent the "absolute" time at which the players move, but what each player knows when she makes a move. From this point of view, there is no important difference between a setting in which player 1 chooses first, and then player 2 chooses without knowing 1's choice, as in Figure 1, and the setting in which both players choose simultaneously (cf. (Osborne & Rubinstein 1994, p.202, p.102)). Thus we can model simultaneous action as equivalent to a situation in which one player moves first, and the other player does not know that choice.
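The payoff bookkeeping of the server example can be sketched as follows (our illustration; the charges are simplified to the three figures given in the text: S1 pays 4 cents to forward to S2, S2 pays 2 cents to deliver to U2, and U2 pays each server 6 cents for a delivered item).

```python
# Payoff computation for the server example (a sketch; cost figures are
# the ones quoted in the text, other charges from Bicchieri et al. omitted).

SEND_COST = {"S1->S2": 4, "S2->U2": 2}
USER_FEE = 6

def payoffs(history):
    """Return (u1, u2) for a history such as ('Show', '2S2', 'U2')."""
    u1 = u2 = 0
    if "2S2" in history:              # S1 forwarded the item to S2
        u1 -= SEND_COST["S1->S2"]
    if "U2" in history:               # S2 delivered the item to U2
        u2 -= SEND_COST["S2->U2"]
        u1 += USER_FEE                # U2 pays each server 6 cents
        u2 += USER_FEE
    return u1, u2

# The encoded-message assumption puts histories c and d into a single
# information set of S2: indistinguishable, with the same available actions.
info_partition_S2 = [{("Show", "2S2"), ("Poli", "2S2")}]
```

This reproduces the arithmetic in the text: the showbiz item delivered to U2 yields -4 + 6 = 2 for S1 and -2 + 6 = 4 for S2.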
The Situation Calculus With Infinite Histories
Our presentation of the situation calculus follows (Levesque, Pirri, & Reiter 1998). To the foundational situation calculus axioms we add elements for referring to infinite sequences of actions. Our axiomatization result in the next section applies to the epistemic extension of the situation calculus. We use mnemonic symbols for parts of the situation calculus according to their intended meanings. It is important to keep in mind that these are merely symbols in a formal language; it is the task of our axiomatization to ensure that the symbols carry the intended meaning. To emphasize the distinction between syntactic elements and their interpretations, we use boldface type, which will always indicate a symbol in a formal language (although conversely, we do not use boldface for all elements of the situation calculus).
The situation calculus with infinite histories is a many-sorted language that contains, in addition to the usual connectives and quantifiers, at least the following elements.
1. A sort action for actions, with variables a, a′ and constants a_i.
2. A sort situation for situations, with variables s, s′.
3. A sort inhist for infinite sequences of actions, with variables h, h′ etc.
4. A sort objects for everything else.
5. A function do : action × situation → situation.
6. A distinguished constant S0 ∈ situation.
7. A relation ⊑ : situation × (situation ∪ inhist). We use s ⊏ s′ as a shorthand for s ⊑ s′ ∧ ¬(s = s′), and similarly for s ⊏ h. The intended interpretation is that ⊑ denotes the relation "extends" between sequences, that is, ⊑ denotes ⊆ as applied to sequences viewed as sets of ordered pairs.
8. A predicate poss(a, s), intended to indicate that action a is possible in situation s.
9. Predicates possible(s) and possible(h).
We adopt the standard axioms for situations (see (Levesque, Pirri, & Reiter 1998)). Here and elsewhere in the paper, all free variables are understood to be universally quantified.

¬(s ⊏ S0)    (1)
s ⊏ do(a, s′) ≡ s ⊑ s′    (2)
do(a, s) = do(a′, s′) → (a = a′) ∧ (s = s′)    (3)

Axiom 3 ensures that every situation has a unique name. We adopt the second-order induction axiom on situations.

∀P.[P(S0) ∧ ∀a, s.(P(s) → P(do(a, s)))] → ∀s.P(s)    (4)

A consequence of Axiom 4 is that every situation corresponds to a finite sequence of actions (cf. (Ternovskaia 1997, Sec. 3)).
Next we specify axioms that characterize infinite histories.

S0 ⊏ h    (5)
s ⊏ h → ∃s′. s ⊏ s′ ∧ s′ ⊏ h    (6)
(s′ ⊑ s ∧ s ⊏ h) → s′ ⊏ h    (7)
h = h′ ≡ (∀s. s ⊏ h ≡ s ⊏ h′)    (8)
possible(h) ≡ ∀s ⊏ h. possible(s)    (9)

The final set of axioms says that the possible predicate defines which action sequences are possible: an action sequence is possible if no impossible action is ever taken along it.

possible(S0)    (10)
possible(do(a, s)) ≡ possible(s) ∧ poss(a, s)    (11)
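One concrete reading of these axioms treats a situation as the finite sequence of actions performed since S0. The following sketch (ours, not code from the paper; the no-immediate-repeat precondition is an illustrative stand-in for a domain-specific poss) mirrors Axioms 3 and 10-11.

```python
# Situations as finite action sequences reachable from S0 by do.

S0 = ()

def do(a, s):
    """Axiom 3 (unique names) holds automatically for tuples: two
    situations are equal iff their last actions and predecessors are."""
    return s + (a,)

def precedes(s1, s2):
    """s1 is an initial segment of s2 (the relation written s1 ⊑ s2)."""
    return s2[:len(s1)] == s1

def poss(a, s):
    """Illustrative precondition: an action may not repeat immediately.
    (Replace with the application's real precondition axioms.)"""
    return not s or s[-1] != a

def possible(s):
    """Axioms 10-11: possible(S0) holds, and possible(do(a, s)) iff
    possible(s) and poss(a, s); i.e. no impossible action was ever taken."""
    for i, a in enumerate(s):
        if not poss(a, s[:i]):
            return False
    return True
```

Unfolding the recursion of Axiom 11 into the loop above is exactly why possible(s) holds iff every action along s was possible when it was taken.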
Representing Game Forms in the Situation Calculus
The situation calculus is a natural language for describing games because its central notion is the same as that of VM sequential games: a sequence of actions. We establish a precise sense in which the situation calculus is apt for formalizing VM games: every VM game G with countably many actions and continuous payoff functions has a categorical axiomatization Axioms(G) in the situation calculus, thus an axiomatization that describes all and only features of the game in question.

Let (N, H, player, f_c, {I_i}, {u_i}) be a sequential game G. The tuple (N, H, player, f_c, {I_i}) is the game form of G (Osborne & Rubinstein 1994, p.201), which we denote as F(G). The game form specifies what actions are possible at various stages of a game, but it does not tell us how the players evaluate the outcomes. In this section, we construct a categorical axiomatization of a game form with countably many actions A(F). We consider axiomatizations of payoff functions in a later section. Our construction proceeds as follows. We introduce constants for players, actions and situations, and assign each action constant a_i a denotation [a_i], which defines a denotation for situation constants. Then we specify a set of axioms for the other aspects of the game in terms of the denotations of the action and situation constants.
Completeness. Clause 2c of Definition 0.1 requires that in a game tree, if every finite initial segment s of an infinite history h is one of the finite histories in the game tree, then h is an infinite history in the game tree. This clause rules out, for example, a situation in which for some action a and every n, the sequence a^n is part of the game tree, but the infinite sequence a^∞ is not. In such a situation we might think of the infinite sequence a^∞ as "missing" from the game tree, and view clause 2c as ruling out this kind of incompleteness. (In topological terms, Clause 2c requires that the Baire topology renders a game tree a complete metric space; see the section on Defining Continuous Payoff Functions and references there.) This notion of completeness is topological and different from the concept of completeness of a logical system. From a logical point of view, topological completeness is complex because it requires quantification over sequences of situations, which we represent as certain kinds of properties of situations as follows. Let basis(P) stand for the formula P(S0), let inf(P) stand for ∀s.P(s) → ∃s′. s ⊏ s′ ∧ P(s′), let closed(P) stand for (s′ ⊑ s ∧ P(s)) → P(s′), and finally let Seq(P) stand for basis(P) ∧ inf(P) ∧ closed(P). The completeness axiom corresponding to Clause 2c is then the following:

∀P.Seq(P) → (∃h.∀s.[P(s) ≡ s ⊏ h])    (12)
Axiom 12 is independent of any particular game structure. Nonetheless we list it as an axiom for game trees rather than as a general axiom characterizing infinite sequences because the completeness requirement is part of the notion of a game tree, rather than part of the notion of an infinite sequence. As we saw in the section on the introduction to the situation calculus, the properties of infinite sequences can be defined in first-order logic, whereas the completeness requirement in the definition of a game tree requires second-order quantification.
Players. We introduce a set of constants to denote players, such as {c, 1, 2, ..., n}, and one variable p ranging over a new sort players. In the server game with chance moves, we add constants c, S1, S2. We introduce unique names axioms of the general form i ≠ j, i ≠ c for i ≠ j. For the server game, we add c ≠ S1, c ≠ S2, S1 ≠ S2. We also add a domain closure axiom ∀p. p = c ∨ p = 1 ∨ ... ∨ p = n. In the server game, the domain closure axiom is ∀p. p = c ∨ p = S1 ∨ p = S2.
Actions. We add a set of constants for actions. If there are finitely many actions a_1, ..., a_n, we proceed as with the players, adding finitely many constants, unique names axioms and a domain closure axiom. For example, in the server game, we may introduce names U1, U2, 2S2, Show, Poli for each action and then include axioms such as U1 ≠ U2, Show ≠ Poli, etc. However, with infinitely many actions we cannot formulate a finite domain closure axiom. We propose a different solution for this case. Just as the induction axiom 4 for situations serves as a domain closure axiom for situations, we use a second-order induction axiom for actions to ensure that there are at most countably many actions in any model of our axioms. The formal axioms are as follows.

We introduce a successor operation + : action → action that takes an action a to the "next" action a⁺. We write a^(n) as a shorthand for the term obtained by applying the successor operation n times to a distinguished constant a_0 (thus a^(0) = a_0). The constants a^(n) may serve as names for actions. As in the case of finitely many actions, we introduce unique names axioms of the form a^(i) ≠ a^(j) where i ≠ j. We adopt the induction axiom

∀P.[P(a_0) ∧ ∀a.(P(a) → P(a⁺))] → ∀a.P(a).    (13)
We assume familiarity with the notion of an interpretation or model (see for example (Boolos & Jeffrey 1974, Ch. 9)).

Lemma 1 Let M = (actions, +, a_0) be a model of Axiom 13 and the unique names axioms. Then for every a ∈ actions, there is one and only one constant a^(n) such that [a^(n)] = a.

Proof (Outline). The unique names axioms ensure that for every action a, there are no two constants a^(n), a^(n′) such that [a^(n)] = [a^(n′)] = a. To establish that there is one such constant, use an inductive proof on the property of being named by a constant; more precisely, use the induction axiom to show that for every action a ∈ actions, there is some constant a^(n) such that [a^(n)] = a. □
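The naming scheme behind Lemma 1 can be mimicked computationally: a small sketch (ours, purely illustrative) in which the constant a^(n) denotes the result of applying the successor operation n times to a_0.

```python
# Countable action naming via an iterated successor operation.

def successor(a):
    """Stand-in denotation of +: maps the n-th action to the (n+1)-th."""
    return a + 1

def denote(n, a0=0):
    """[a^(n)]: apply the successor operation n times to the constant a0."""
    a = a0
    for _ in range(n):
        a = successor(a)
    return a

# Unique names: distinct constants a^(i), a^(j) denote distinct actions,
# and in this model every action has exactly one name (as in Lemma 1).
names = [denote(n) for n in range(5)]
assert len(set(names)) == len(names)
```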
Now choose a function [·] that is a 1-1 and total assignment of actions in the game form A(F) to action constants. For example, we may choose [Show] = Showbiz, [2S2] = send S2, [U1] = send U1, etc.
Situations. Let do(a1, ..., an) denote the result of repeatedly applying the do constructor to the actions a1, ..., an from the initial situation. Thus do(a1, ..., an) is shorthand for do(an, do(a(n-1), ..., do(a1, S0) ...)). We stipulate that do() = S0, so that do(a1, ..., an) = S0 if n = 0. The constant expressions of the form do(a1, ..., an) serve as our names for situations. We extend the denotation [a_i] for actions to situations in the obvious way: [do(a1, ..., an)] = [a1] * ... * [an]. For example, [do(Show, 2S2)] = Showbiz * send S2. Note that [S0] = ∅.

Knowledge. To represent knowledge, we follow (Levesque, Pirri, & Reiter 1998, Sec. 7) and employ a binary relation K, where K(s, s′) is read as "situation s is an epistemic alternative to situation s′". The relation K is intended to hold between two situation constants s, s′ just in case s and s′ denote situations in the same information set. For example, in the server game we have that K(do(Show, 2S2), do(Poli, 2S2)) holds.

Triviality Axioms. We add axioms to ensure that all functions of situations take on a "don't care" value ⊥ for impossible situations. For example, player(do(Show, U1, U2)) = ⊥ holds.

Table 1 specifies the set of axioms Axioms(F) for a VM game form F. For simplicity, we assume that there are no chance moves in F (see below). We write I(s) to denote the information set to which a finite game history s belongs (in F). To reduce notational clutter, the table denotes situation constants by symbols such as s_i, with the understanding that s_i stands for some constant term of the form do(a1, ..., an). Let F = (N, H, player, {I_i}) be a game form whose set of actions is A. In set-theoretic notation, the set of all infinite sequences of actions is A^ω, and we write A^{<ω} for the set of all finite action sequences. We say that the tree-like structure for F is the tuple I(F) = (N, A, A^{<ω}, A^ω, ⊥, poss, possible, ⊏, *, player_I, K), where each object interprets the obvious symbol (e.g., A^ω is the sort for h, and * interprets do), and

1. possible(s) ⟺ s ∈ H,
2. K(s, s′) ⟺ I(s) = I(s′),
3. player_I(s) = ⊥ if ¬possible(s), and player_I(s) = player(s) otherwise.

The intended interpretation for the set of axioms Axioms(F), defined via the denotation function [·] as described above, is the pair I = (I(F), [·]). We note that if two interpretations are isomorphic, then the same sentences are true in both interpretations (Boolos & Jeffrey 1974, p.191).

Axiom Set                                    Constraints
Players: Unique Names  i ≠ j                 i ≠ j
Players: Domain Closure  ∀p.p = 1 ∨ ... ∨ p = n
Actions: Unique Names                        see text
Actions: Domain Closure                      see text
poss(a_i, s_j)                               [a_i] ∈ A([s_j])
¬poss(a_i, s_j)                              [a_i] ∉ A([s_j])
player(s_j) = i                              player([s_j]) = [i]
player(s_j) = ⊥                              [s_j] ∉ H
K(s_i, s_j)                                  I([s_i]) = I([s_j])
¬K(s_i, s_j)                                 I([s_i]) ≠ I([s_j])

Table 1: The set Axioms(F) associated with a game form F = (N, H, player, {I_i}) and a denotation function [·]. Terms such as a_i and s_j stand for action and situation constants, respectively.

Theorem 0.1 Let F = (N, H, player, {I_i}) be a game form. Suppose that [·] gives a 1-1 and onto mapping between action constants and Actions(F).

1. The intended interpretation of Axioms(F), defined via the denotation function [·], is a model of Axioms(F) and Axioms 1-13.
2. All models of Axioms(F) and Axioms 1-13 are isomorphic.
Proof. (Outline) Part 1: It is fairly straightforward to verify that the intended interpretation for F (basically, the game tree) satisfies the general axioms and Axioms(F). For Part 2, the argument follows our construction. Let M = (M, [·]_M) be a model of Axioms(F) and Axioms 1-13. We can show that M is isomorphic to I = (I(F), [·]). (1) The domain closure and unique names axioms for the sorts player and action guarantee that there is a 1-1 mapping between N and A and the respective sorts player and action in M. More precisely, let name_M(p) be the constant i of sort player such that [i]_M = p, and define similarly name_M(a) for action a. Then the function f defined by f(p) = [name_M(p)] takes an object of sort player in M to an object of sort player in I, and f is a 1-1 onto mapping. Similarly for action objects. (2) Given that every action in M has a unique action constant naming it, it follows from Axioms 1-4 that every situation in M has a unique situation constant naming it (as before, we may write name_M(s)), which implies that there is a 1-1 onto mapping between the situations in M and A^{<ω}. (3) Axioms 5-8 ensure that every object in the sort inhist in M corresponds to an infinite sequence of (nested) situations; Axiom 12 ensures that for every such nested infinite sequence of situations, there is some object h in inhist such that h extends each situation in the sequence. Given the result of step (2), it follows that there is a 1-1 onto mapping between the sort inhist in M and A^ω. (4) Since poss_M(a, s) holds in M iff poss(name_M(a), name_M(s)) is in Axioms(F), which is true just in case poss_I([name_M(a)], [name_M(s)]) holds in I, this map constitutes a 1-1 onto mapping between the extensions of poss_M and poss_I. Together with Axioms 9-11, this ensures a 1-1 mapping between the extensions of the possible predicate in each model. (5) We also have that player_M(s) = i iff player(name_M(s)) = name_M(i) is in Axioms(F), which holds just in case player_I([name_M(s)]) = [name_M(i)], so the graphs of the player function are isomorphic in each model. (6) By the same argument, the extensions of the K predicate are isomorphic in each model. So any model M of the axioms Axioms(F) and Axioms 1-13 is isomorphic to the intended model I, which shows that all models of these axioms are isomorphic. □
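The construction of Axioms(F) from a finite game form can be sketched mechanically. The following is our own illustration (not the paper's procedure verbatim): it emits the positive atoms of Table 1 from the histories, the player function, and the information sets; the matching negative atoms are the complements.

```python
# Generating the atomic axioms of Table 1 from a finite game form.

def sit_const(s):
    """Write the history s as a situation constant do(a1, ..., an)."""
    return "do({})".format(", ".join(s)) if s else "S0"

def axioms_of(H, player, info):
    """H: prefix-closed set of histories (tuples of action-name strings);
    player, info: maps defined on nonterminal histories.
    Returns poss/player/K atoms as strings (positive atoms only)."""
    A = lambda s: {t[-1] for t in H if len(t) == len(s) + 1 and t[:-1] == s}
    atoms = []
    for s in sorted(H, key=len):
        for a in sorted(A(s)):
            atoms.append("poss({}, {})".format(a, sit_const(s)))
        if A(s):  # nonterminal: a player moves here
            atoms.append("player({}) = {}".format(sit_const(s), player[s]))
    for s in sorted(H, key=len):
        for t in sorted(H, key=len):
            if A(s) and A(t) and info[s] == info[t]:
                atoms.append("K({}, {})".format(sit_const(s), sit_const(t)))
    return atoms
```

Supplied with the encoded-message information sets of the server game, this would emit atoms such as K(do(Show, 2S2), do(Poli, 2S2)).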
It follows from Theorem 0.1 that for each player i, the relation K from our axiomatization is an equivalence relation on the situations at which i moves, because K represents the information partition of player i.
In order to represent chance moves, we need axioms characterizing real numbers; we do not want to go into axiomatic real number theory here since that topic is well known and the issues are similar to those that we take up in the section Defining Continuous Payoff Functions. A brief outline: Start with (second-order) axioms characterizing the real numbers. There are at most countably many pairs (a, s) such that "nature" chooses action a in situation s with some probability f_c(a|s). Introduce a constant for each of these probabilities, and axioms to ensure that the constant denotes the correct real number. Add axioms of the form f_c(a, s) = p whenever f_c([a] | [s]) = [p], as well as appropriate triviality axioms for f_c.
Discussion and Related Work
We may evaluate a representation formalism such as the situation calculus with respect to two dimensions: expressive power (how many domains can the formalism model?) and tractability (how difficult is it to carry out reasoning in the formal language?). Typically, there is a trade-off between these two desiderata. Our results so far indicate how great the expressive power of the situation calculus is. Because situation calculus axioms can represent so many and such varied models of agent interactions, we cannot guarantee in general that the axiomatization for a given game tree is tractable. An important research topic is what classes of games have tractable representations in the situation calculus (see the conclusions section).

To illustrate this point, we note that in extant applications of the situation calculus, the poss predicate typically has an inductive definition in terms of what actions are possible in a situation given what fluents hold in the situation, and then what fluents obtain as a consequence of the actions taken. (We will give examples in the next section.) By contrast, game theory deals with a set of histories, without any assumptions about how that set may be defined. The absence of state successor axioms for fluents in our axiomatization reflects the generality of VM games for representing many different types of environment. For example, we may distinguish between static and dynamic environments (Russell & Norvig 1994, Ch. 2.4). An environment is dynamic for an agent "if the environment can change while an agent is deliberating". In a dynamic environment, a frame axiom that says that a fluent will remain the same unless an agent acts to change it may well be inappropriate. For example, if an agent senses that a traffic light is green at time t, it should not assume that the light will be green at time t + 1, even if it takes no actions affecting the light. Hence a VM model of this and other dynamic environments cannot employ a simple frame axiom (cf. (Poole 1998)).
We have followed (Levesque, Pirri, & Reiter 1998) in introducing a binary relation K(s, s′) between situations. One difference between our treatment and that of Levesque et al. is that they allow uncertainty about the initial situation; they introduce a predicate K0(s) whose intended interpretation is that the situation s may be the initial one for all the agent knows. It is possible to extend game-theoretic models to contain a set of game trees, often called a "game forest", rather than a single tree. A game forest contains several different possible initial situations, one for each game tree, just as the approach of (Levesque, Pirri, & Reiter 1998) allows different possible initial situations. Game theorists use game forests to model games of incomplete information in which there is some uncertainty among the players as to the structure of their interaction (Osborne & Rubinstein 1994).
Another difference is that Levesque et al. use the binary relation K to represent the knowledge of a single agent. Mathematically the single relation K suffices for our axiomatization, even in the multi-agent setting, because the situations are partitioned according to which player moves at them. In effect, the relation K summarizes a number of relations K_i defined on the situations at which player i moves. To see this, define K_i(s, s′) ⟺ (player(s) = player(s′) = i) ∧ K(s, s′). Then K_i represents a partition of player i's nodes. From the point of view of the epistemic situation calculus, the difficulty with this is that player i's knowledge is defined by the K_i relation (and hence by the K relation) only at situations at which it is player i's turn to move. Thus if player(s) = i, what another player j knows at s is undefined and so we cannot directly represent, for example, what i knows about j's knowledge at s. To represent at each situation what each player knows at that situation we would require a relation K_i for all players that is a partition of all situations. In terms of the game tree, we would require that the information partition 𝓘_i of player i constitutes a partition of all nodes in the game tree, rather than just those at which player i moves. The general point is that VM game trees leave implicit the representation of what a player j knows when it is not her turn to move. This may not pose a difficulty for humans using the apparatus of VM game theory, but if we want to explicitly represent the players' knowledge at any situation, we will require additional structure, such as information partitions comprising all nodes in the game tree.
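The derived per-player relations K_i described above can be computed directly from the single K relation; a minimal sketch (ours) of the definition K_i(s, s′) ⟺ player(s) = player(s′) = i ∧ K(s, s′):

```python
# Restricting the global epistemic relation K to a single player i.

def make_Ki(K, player, i):
    """K: set of situation pairs; player: situation -> mover.
    Returns K_i, the restriction of K to situations where i moves."""
    return {(s, t) for (s, t) in K if player(s) == i and player(t) == i}

# Example: with histories c, d in one information set of player 2,
# K_2 relates them while K_1 is empty on those situations.
K = {("c", "c"), ("c", "d"), ("d", "c"), ("d", "d")}
player = {"c": 2, "d": 2}.get
```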
Other logical formalisms have been developed to represent classes of games. For example, Parikh's game logic uses dynamic logic to represent zero-sum games of perfect information (Parikh 1983). Poole's independent choice logic is a formalism for simultaneous move, or matrix, games (Poole 1997, Sec. 3.3), and the dynamic independent choice logic has resources to describe the components of a game tree (Poole 1997, Sec. 5.5). The construction in this paper is different because it aims for more generality with respect to the class of representable game trees, and because it focuses on the notion of a sequence of actions central to both game theory and the situation calculus.
Defining Continuous Payoff Functions in the Situation Calculus
It remains to define the payoff functions u_i : Z → R that assign a payoff for player i to each terminal history. To simplify, we assume in this section that all terminal histories are infinite. This is no loss of generality because we can define the payoff from a finite history s as the payoff assigned to all infinite histories extending s, whenever all histories extending s share the same payoff. There are in general uncountably many infinite histories, so we cannot introduce a name for each of them in a countable language. Moreover, payoff functions in infinite games can be of arbitrary complexity (Moschovakis 1980), so we cannot expect all such functions to have a definition in the situation calculus. In economic applications, continuous payoff functions typically suffice to represent agents' preferences. Our next step is to show that all continuous payoff functions have a definition in the situation calculus provided that we extend the situation calculus with constants for rational numbers. Before we formally define continuous functions, let us consider two examples to motivate the definition.
Example 1. In a common setting, familiar from Markov decision processes, an agent receives a "reward" at each situation (Sutton & Barto 1998, Ch. 3). Assuming that the value of the reward can be measured as a real number, an ongoing interaction between an agent and its environment gives rise to an infinite sequence of rewards r1, ..., rn, .... A payoff function u may now measure the value of the sequence of rewards by a real number. A common measure is the discounted sum: we use a discount factor δ with 0 < δ < 1, and define the payoff to the agent from an infinite sequence of rewards by u(r1, ..., rn, ...) = Σ_{i=1}^∞ δ^i r_i. For example, suppose that the agent's task is to ensure the truth of a certain fluent, for example that ison(light, s) holds. We may then define the reward fluent function by reward(s) = 1 if ison(light, s) and reward(s) = 0 if ¬ison(light, s), so that the agent receives a reward of 1 if the light is on, and a reward of 0 otherwise. Then for an infinite history h we have an infinite sequence of rewards r1, ..., rn, ... consisting of 0s and 1s. If the agent manages to keep the light on all the time, reward(s) = 1 will be true in all situations, and its total discounted payoff is Σ_{i=1}^∞ δ^i × 1 = δ/(1 − δ).
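As an illustration (ours, not the paper's; the function name is our own), the discounted sum above can be approximated on a finite prefix of the reward sequence, and for the all-ones reward stream the approximation converges to δ/(1 − δ):

```python
# Sketch: approximating the discounted-sum payoff
# u(r1, r2, ...) = sum_{i>=1} delta^i * r_i on a finite reward prefix.
def discounted_sum(rewards, delta):
    """Discounted sum of a finite reward prefix with discount factor delta."""
    return sum(delta ** i * r for i, r in enumerate(rewards, start=1))

# If the light stays on forever (reward 1 in every situation), the payoff
# converges to delta / (1 - delta); a long prefix approximates it closely.
delta = 0.9
approx = discounted_sum([1] * 1000, delta)
exact = delta / (1 - delta)  # about 9 for delta = 0.9
assert abs(approx - exact) < 1e-9
```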
Example 2. Suppose that the agent's job is to ensure that a certain condition never occurs. Imagine that there is an action DropContainer, and the agent's goal is to ensure that it never takes this action. We may represent this task specification with the following payoff function: u(h) = 0 if there is a situation s ⊂ h such that do(DropContainer, s) ⊂ h, and u(h) = 1 otherwise. It is impossible to define this payoff function as a discounted sum of bounded rewards. The special character of the 0-1 payoff function u stems from the fact that if DropContainer is taken at situation s, the payoff function "jumps" from a possible value of 1 to a determinate value of 0, no matter how many times beforehand the agent managed to avoid the action. Intuitively, the payoff to the agent is not built up incrementally from situation to situation. Using standard concepts from topology, we can formalize this intuitive difference with the notion of a continuous payoff function. To that end, we shall introduce a number of standard topological notions. Our use of topology will be self-contained but terse; a text that covers the definitions we use is (Royden 1988).
Let A, B be two topological spaces. A mapping f : A → B is continuous if for every open set Y ⊆ B, its preimage f⁻¹(Y) is an open subset of A. Thus to define the set of continuous payoff functions, we need to introduce a system of open sets on Z, the set of infinite histories, and on R, the set of real numbers. We shall employ the standard topology on R, denoted by ℛ.
The Baire topology is a standard topology for a space that consists of infinite sequences, such as the set of infinite game histories. It is important in analysis, game theory (Moschovakis 1980), other areas of mathematics, and increasingly in computer science (Finkel 2001), (Duparc, Finkel, & Ressayre 2001). Let A be a set of actions. Let [s] = {h : s ⊂ h} be the set of infinite action sequences that extend s. The basic open sets are the sets of the form [s] for each finite sequence (situation) s. An open set is a union of basic open sets, including again the empty set ∅. We denote the resulting topological space by B(A).
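Concretely (our illustration; the function name is our own), membership of a history in a basic open set [s] is a prefix test: every history extending a prefix p lies in [s] exactly when s is a prefix of p.

```python
# Sketch: a basic open set [s] of the Baire space B(A) is the set of
# infinite histories extending the finite situation s.  Certifying that
# every history extending a finite prefix lies in [s] is a prefix check.
def in_basic_open(situation, history_prefix):
    """True iff every infinite history extending history_prefix lies in
    [situation], i.e. situation is a prefix of history_prefix."""
    n = len(situation)
    return history_prefix[:n] == situation

assert in_basic_open(["a"], ["a", "b", "c"])
assert not in_basic_open(["b"], ["a", "b", "c"])
```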
For each rational q and natural number n > 0, define an open interval O(q, n) centered at q by O(q, n) = {z : |z − q| < 1/n}. Let E_u(q, n) = u⁻¹(O(q, n)). Since u is continuous, each set E_u(q, n) is an open set in the Baire space and hence a union of basic open sets. So the function u induces a characteristic relation interval_u(s, q, n) that holds just in case [s] ⊆ E_u(q, n). In other words, interval_u(s, q, n) holds iff for all histories h extending s the utility u(h) is within distance 1/n of the rational q. In the example with the discontinuous payoff function above, interval_u(s, 1, n) does not hold for any situation s for n > 1. For given any situation s at which the disaster has not yet occurred, there is a history h ⊃ s in which the disaster occurs, so that u(h) = 0 and |u(h) − 1| = 1 ≥ 1/n. Intuitively, with a continuous payoff function u an initial situation s determines the value u(h) for any history h extending s up to a certain "confidence interval" that bounds the possible values of histories extending s. As we move further into the history h, with longer initial segments s of h, the "confidence intervals" associated with s become centered around the value u(h). The following theorem exploits the fact that the collection of these intervals associated with situations uniquely determines a payoff function.
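The shrinking "confidence intervals" can be made concrete for the discounted-sum payoff of Example 1 (our illustration; names and the choice δ = 1/2 are our own). With rewards in {0, 1}, a finite situation s bounds u(h) for every history h extending s, and the bound tightens as s grows:

```python
# Sketch: for the discounted-sum payoff with rewards in {0, 1}, a finite
# prefix pins u(h) into an interval; this is the set of values compatible
# with s, which interval_u(s, q, n) compares against the ball O(q, n).
def payoff_bounds(prefix_rewards, delta=0.5):
    """Tightest [lo, hi] containing u(h) for every infinite h extending
    the situation with reward prefix prefix_rewards."""
    lo = sum(delta ** i * r for i, r in enumerate(prefix_rewards, start=1))
    # Largest possible contribution of the tail: reward 1 forever after.
    tail = delta ** len(prefix_rewards) * delta / (1 - delta)
    return lo, lo + tail

lo, hi = payoff_bounds([1, 0, 1])
# The interval width is the undiscounted tail mass sum_{i>=4} (1/2)^i = 1/8,
# so longer prefixes shrink the interval toward the true payoff u(h).
assert hi - lo == 0.5 ** 3
```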
Assume that constants for situations, natural numbers and rationals have been defined, along with a relation symbol interval_u such that interval_u(s, q, n) is true just in case interval_u([s], [q], [n]) holds.

Theorem 0.2. Let A be a set of actions, and let u : B(A) → ℛ be a continuous function. Let payoff be the axiom

∀h, n. ∃s ⊂ h. interval_u(s, u(h), n).

Then:
1. u satisfies payoff, and
2. if u′ : B(A) → ℛ satisfies payoff, then u′ = u.
Proof. For part 1, let a history h and a number n be given. Clearly u(h) ∈ O(u(h), n), so h ∈ E_u(u(h), n) = u⁻¹(O(u(h), n)). Since E_u(u(h), n) is an open set, there is a situation s ⊂ h such that [s] ⊆ E_u(u(h), n). Hence interval_u(s, u(h), n) holds, and s witnesses the claim.

For part 2, suppose that u′ satisfies the axiom payoff in a model, so that for all histories h and numbers n there is a situation s ⊂ h satisfying interval_u(s, u′(h), n). We show that for all histories h and every n, it is the case that |u(h) − u′(h)| < 1/n, which implies that u(h) = u′(h). Let h, n be given, and choose a situation s ⊂ h satisfying interval_u(s, u′(h), n). By the definition of the relation interval_u, it follows that h ∈ E_u(u′(h), n) = u⁻¹(O(u′(h), n)). So u(h) ∈ O(u′(h), n), which by the definition of O(u′(h), n) implies that |u(h) − u′(h)| < 1/n. Since this holds for any history h and number n, the claim that u(h) = u′(h) follows. □

We remark that if a VM game G has only a finite set of histories, any utility function ui is continuous, so Theorem 0.2 guarantees that payoff functions for finite games are definable in the situation calculus.

An Application: Axioms for a Variant of "Clue"

Our results so far provide a general mathematical foundation for the claim that the situation calculus is a strong formalism for representing game-theoretic structures; they show that for a large class of games, the situation calculus can supply axioms characterizing the game structure exactly (categorically). This section outlines the representation of a fairly complex parlour game to illustrate what such axiomatizations are like, and to give a sense of the usefulness of the situation calculus in describing game structures. The discussion indicates how the situation calculus can take advantage of invariances and localities present in an environment to give a compact representation of the environmental dynamics. We show how a successor state axiom can define memory assumptions like the game-theoretic notion of perfect recall in a direct and elegant manner.

In previous work, we have studied a variant of the well-known board game "Clue", which we call MYST. For our present purposes, a brief informal outline of MYST suffices; for more details we refer the reader to previous work (Bart, Delgrande, & Schulte 2001), (Bart 2000). In particular, we will not go into the details here of defining formally the terms of our language for describing MYST. MYST begins with a set of cards c1, ..., cn. These cards are distributed among a number of players 1, 2, ..., k, and a "mystery pile". Each player can see her own cards, but not those of the others, nor those in the mystery pile. The players' goal is to guess the contents of the mystery pile; the first person to make a correct guess wins. The players gain information by taking turns querying each other. The queries are of the form "do you hold one of the cards from C?", where C is a set of cards. If the queried player holds none of the cards in C, he answers "no". Otherwise he shows one of the cards from C to the player posing the query. In what follows, we discuss some aspects of the axiomatization of MYST; a fuller discussion is (Bart, Delgrande, & Schulte 2001), and (Bart 2000) offers a full set of axioms.

Table 2 shows the fluents we discuss below together with their intended meaning.

Fluent        | Meaning
H(i, cx, s)   | player i holds card cx
AskPhase(s)   | a query may be posed
Know(i, φ, s) | player i knows that φ

Table 2: Three fluents used to axiomatize MYST

A central fluent in our axiomatization is the card holding fluent H(i, cx, s), read as "player i holds card cx in situation s". We write H(0, cx, s) to denote that card cx is in the mystery pile in situation s. In MYST, as in Clue, cards do not move around. This is a static, invariant aspect of the game environment that the situation calculus can capture concisely in the following successor state axiom:

H(i, cx, do(a, s)) ≡ H(i, cx, s)
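Because the successor state axiom makes H independent of the action performed, any query about H in a later situation reduces to a lookup in the initial situation. A minimal executable sketch (our encoding, not the paper's):

```python
# Sketch: since H(i, cx, do(a, s)) ≡ H(i, cx, s), card holdings fixed in
# the initial situation persist through any sequence of actions, so
# evaluating H in a later situation "regresses" to the initial one.
def holds(holder, card, actions, initial_holdings):
    """Evaluate H(holder, card, s) where s = do(actions, S0); the action
    list is irrelevant because the fluent is static."""
    return initial_holdings.get(card) == holder

s0 = {"c1": 1, "c2": 0}  # holder 0 is the mystery pile
assert holds(1, "c1", ["asks", "shows"], s0)  # invariant under actions
assert holds(0, "c2", [], s0)
```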
Another fact that remains invariant across situations, and that is crucial to reasoning about MYST, is that every card is held by exactly one player or the mystery pile. We may express this with two separate axioms.

Exclusiveness: H(i, cx, s) → ∀j ≠ i. ¬H(j, cx, s).
If player i holds card cx, then no other player j (or the mystery pile) holds cx. If the mystery pile holds card cx, then cx is not held by any player.

Exhaustiveness: ∨_{i=0}^{k} H(i, cx, s).
Every card is held by at least one player (or the mystery pile).
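Together, Exclusiveness and Exhaustiveness say that the holdings form a total function from cards to holders 0..k. A small check of this reading (our illustration; the data layout and names are our own):

```python
# Sketch: with holdings represented as a map card -> holder (holder 0 is
# the mystery pile, 1..k the players), Exclusiveness is built into the map
# (one holder per card) and Exhaustiveness says the map is total.
def satisfies_axioms(holdings, cards, k):
    """True iff every card is held by exactly one holder in 0..k."""
    return all(c in holdings and 0 <= holdings[c] <= k for c in cards)

cards = ["c1", "c2", "c3"]
assert satisfies_axioms({"c1": 1, "c2": 2, "c3": 0}, cards, k=2)
assert not satisfies_axioms({"c1": 1, "c2": 2}, cards, k=2)  # c3 unheld
```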
It is straightforward to specify preconditions for actions in MYST. For example, consider asking queries, one of the main actions in this game. We write asks(i, q) to denote that player i takes the action of asking query q. The fluent AskPhase(s) expresses that the situation s is in an "asking phase", i.e., that no query has been posed yet. Then the precondition for asking a query is given by

Poss(asks(i, q), s) ≡ AskPhase(s) ∧ player(s) = i

where player(s) indicates whose turn it is in situation s.
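The precondition axiom can be read directly as an executable test (our sketch; the fluent names follow the paper, the situation encoding is our assumption):

```python
# Sketch: Poss(asks(i, q), s) ≡ AskPhase(s) ∧ player(s) = i,
# with a situation encoded as a record of its fluent values.
def poss_ask(i, situation):
    """True iff player i may pose a query in this situation."""
    return situation["ask_phase"] and situation["player"] == i

s = {"ask_phase": True, "player": 1}
assert poss_ask(1, s)       # it is player 1's turn and no query is pending
assert not poss_ask(2, s)   # player 2 may not ask out of turn
```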
We use an epistemic fluent Know(i, φ, s) to denote that player i knows φ in situation s, where φ is a sentence. Players gain information by observing the results of queries. For example, if player 1 shows card 3 to player 2, then player 2 comes to know that player 1 holds card 3. Using an action shows(j, i, cx), by which player j shows card cx to player i, we may express this effect by the axiom Know(2, H(1, c3, s), do(shows(1, 2, c3), s)). In general, we have the following knowledge axiom that is invariant across situations:

Know(i, H(j, cx, s), do(shows(j, i, cx), s))

Indeed, the fact that player 1 holds card 3 becomes common knowledge among the two players after player 1 shows card 3 to player 2. We use a common knowledge fluent indexed by a set of players to express this principle. The same fluent allows us to capture the fact that in MYST, after player 1 shows card 3 to player 2 in response to query q, it becomes common knowledge among all players that player 1 has some card matching the query (for more details, see (Bart, Delgrande, & Schulte 2001, Sec. 3), (Bart 2000)). This illustrates that the situation calculus together with epistemic logic provides resources to capture some quite subtle differential effects of actions on the knowledge and common knowledge of agents.
In our analysis of strategic reasoning in MYST, we assume that agents do not forget facts once they come to know them. This is part of the game-theoretic notion of perfect recall.¹ We may render this assumption about the memory of the players by the following axiom for the knowledge fluent:

Know(i, φ, s) → Know(i, φ, do(a, s))

for a sentence φ.
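The perfect-recall axiom says an agent's stock of known facts can only grow along a history. A minimal sketch of this monotonicity, with knowledge modeled as a set of sentences per agent (our encoding, not the paper's):

```python
# Sketch: Know(i, phi, s) → Know(i, phi, do(a, s)) means the successor
# knowledge set contains everything known before, plus any new observations.
def do_action(knowledge, observations):
    """Successor knowledge set: prior knowledge is never forgotten."""
    return knowledge | observations

k0 = {"H(1,c3)"}
k1 = do_action(k0, {"AskPhase"})
assert k0 <= k1  # nothing known at s is lost at do(a, s)
```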
Conclusion
Von Neumann-Morgenstern game theory is a very general formalism for representing multi-agent interactions that encompasses single-agent decision processes as a special case. Game-theoretic models of many multi-agent interactions from economics and social settings are available, and the general mathematical theory of VM games is one of the major developments of the 20th century. The situation calculus is a natural language for describing VM games; we established a precise sense in which the situation calculus is well-suited to representing VM games: a large class of VM games has a sound and complete (categorical) set of axioms in the situation calculus. This result underscores the expressive power of the situation calculus. The connection between the situation calculus and game theory suggests fruitful applications of game-theoretic ideas to planning and multi-agent interactions. We considered in particular the use of the Baire topology for defining continuous utility functions, and the representation of concurrent actions. Conversely, the logical resources of the situation calculus allow us to exploit invariants in the dynamics of some environments to provide a more compact representation of their dynamics than a game tree would.
We see two main avenues for further research. First, it is important to know what types of VM games permit compact and tractable axiomatizations in the situation calculus. Without restrictions on the possible game forms F, there is no bound on the computational complexity of an axiomatization Axioms(F). There are a number of plausible subclasses of games that might have computationally feasible representations in the situation calculus. An obvious restriction would be to consider finite game trees, which have first-order categorical axiomatizations in the situation calculus. A weaker finitude requirement would be that each agent should have only finitely many information sets. An information set corresponds to what in many AI formalisms is called a "state", because an information set defines an agent's complete knowledge at the point of making a decision. Games in which the agent has only finitely many information sets closely correspond to Markov Decision Processes, in which the possible states of an agent are typically assumed to be finite. Several authors have noted the utility of the situation calculus for defining Markov Decision Processes ((Poole 1998), (Boutilier et al. 2000)), and the research into compact representations of Markov Decision Processes (cf. (Boutilier, Dean, & Hanks 1999)) may apply to compact representations of game trees as well.

¹For a formal definition of perfect recall in game theory, see (Osborne & Rubinstein 1994). The game-theoretic definition requires that an agent should not only remember facts, but also her actions. This can be represented in the epistemic extension ℒe of the situation calculus by requiring that Ki(s, s′) does not hold whenever situations s, s′ differ with respect to an action by agent i.
Another approach would be to begin with a decidable or tractable fragment of the situation calculus, and then examine which classes of game trees can be axiomatized in a given fragment. There are results about computable fragments of the situation calculus, even with the second-order induction axiom, such as that presented in (Ternovskaia 1999). The main restrictions in this result are the following: (1) there are only finitely many fluents F1, ..., Fn; (2) each fluent F is unary, i.e. of the form F(s), like our AskPhase fluent; (3) the truth of fluents in situation s determines the truth of fluents in a successor situation do(a, s) by successor state axioms of a certain restricted form (such as our axiom from the previous section for the card holding fluent; see (Levesque, Pirri, & Reiter 1998) for more on the proof-theoretic and computational leverage from successor state axioms). Our experience with MYST, and more generally the literature on Markov Decision Processes, suggests that assumptions (1) and (3) hold in many sequential decision situations; we plan to explore in future research the extent to which restriction (2) can be relaxed, and how epistemic components such as the relation K(s, s′) affect computational complexity.
Second, the main purpose of providing an agent with a model of its planning problem or interaction with other agents is to help the agent make decisions. Game theorists have introduced various models of rational decision-making in VM games, some of which have been applied by computer scientists (see (Koller & Pfeffer 1997), (Bicchieri, Ephrati, & Antonelli 1996)). It should be possible to axiomatize methods for solving games in the situation calculus, and to formalize the assumptions, such as common knowledge of rationality, that game theorists use to derive predictions for how agents will act in a given game.

The combination of logical techniques, such as the situation calculus, and the advanced decision-theoretic ideas that we find in game theory provides a promising foundation for analyzing planning problems and multi-agent interactions.
Acknowledgments
Previous versions of this paper were presented at JAIST (Japan Advanced Institute of Science and Technology) and the 2001 Game Theory and Epistemic Logic Workshop at the University of Tsukuba. We are indebted to Eugenia Ternovskaia, Hiroakira Ono, Mamoru Kaneko, Jeff Pelletier and the anonymous AAAI referees for helpful comments. This research was supported by the Natural Sciences and Engineering Research Council of Canada.
References
Bart, B.; Delgrande, J.; and Schulte, O. 2001. Knowledge and planning in an action-based multi-agent framework: A case study. In Advances in Artificial Intelligence, Lecture Notes in Artificial Intelligence, 121-130. Berlin: Springer.
Bart, B. 2000. Representations of and strategies for static information, noncooperative games with imperfect information. Master's thesis, Simon Fraser University, School of Computing Science.
Bicchieri, C.; Ephrati, E.; and Antonelli, A. 1996. Games servers play: A procedural approach. In Intelligent Agents II, 127-142. Berlin: Springer-Verlag.
Boolos, G., and Jeffrey, R. 1974. Computability and Logic. Cambridge: Cambridge University Press.
Boutilier, C.; Reiter, R.; Soutchanski, M.; and Thrun, S. 2000. Decision-theoretic, high-level agent programming in the situation calculus. In AAAI-2000, Seventeenth National Conference on Artificial Intelligence. Austin, Texas: AAAI Press.
Boutilier, C.; Dean, T.; and Hanks, S. 1999. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research 11:1-94.
Duparc, J.; Finkel, O.; and Ressayre, J.-P. 2001. Computer science and the fine structure of Borel sets. Theoretical Computer Science 257:85-105.
Finkel, O. 2001. Topological properties of omega context-free languages. Theoretical Computer Science 262:669-697.
Koller, D., and Pfeffer, A. 1997. Representations and solutions for game-theoretic problems. Artificial Intelligence 94(1):167-215.
Levesque, H.; Pirri, F.; and Reiter, R. 1998. Foundations for the situation calculus. Linköping Electronic Articles in Computer and Information Science 3(18).
Moschovakis, Y. 1980. Descriptive Set Theory. New York: Elsevier-North Holland.
Osborne, M. J., and Rubinstein, A. 1994. A Course in Game Theory. Cambridge, Mass.: MIT Press.
Parikh, R. 1983. Propositional game logic. In IEEE Symposium on Foundations of Computer Science, 195-200.
Poole, D. 1997. The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence 94(1-2).
Poole, D. 1998. Decision theory, the situation calculus, and conditional plans. Linköping Electronic Articles in Computer and Information Science 3(8).
Royden, H. L. 1988. Real Analysis. New York: Macmillan, 3rd edition.
Russell, S. J., and Norvig, P. 1994. Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall.
Sutton, R. S., and Barto, A. G. 1998. Reinforcement Learning: An Introduction. Cambridge, Mass.: MIT Press.
Ternovskaia, E. 1997. Inductive definability and the situation calculus. In Transactions and Change in Logic Databases, volume 1472 of LNCS.
Ternovskaia, E. 1999. Automata theory for reasoning about actions. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 153-158. Stockholm, Sweden: IJCAI.
van den Herik, H., and Iida, H. 2001. Foreword for special issue on games research. Theoretical Computer Science 252(1-2):1-3.
von Neumann, J., and Morgenstern, O. 1944. Theory of Games and Economic Behavior. Princeton, NJ: Princeton University Press.