Reasoning about Large Taxonomies of Actions Yilan Gu Mikhail Soutchanski

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)
Reasoning about Large Taxonomies of Actions
Yilan Gu
Mikhail Soutchanski
Dept. of Computer Science
University of Toronto
10 King’s College Road
Toronto, ON, M5S 3G4, Canada
Email: yilan@cs.toronto.edu
Department of Computer Science
Ryerson University
245 Church Street, ENG281
Toronto, ON, M5B 2K3, Canada
Email: mes@cs.ryerson.ca
these theories are “flat” and do not provide representation for
hierarchies of actions. This can lead to potential difficulties
if one intends to use BATs for the purpose of large scale formalization of reasoning about actions on the commonsense
level, when potentially arbitrary actions and objects have to
be represented. Intuitively, many events and actions have
different degrees of generality: the action of driving a car
from home to an office is a specialization of the action of
transportation using a vehicle, that is in its turn a specialization of the action of moving an object from one location to
another. We represent hierarchies of actions explicitly and
use them in our new modular SSAs. However, we show that
our new modular SSAs can be translated into “flat” Reiter’s
SSAs and, consequently, we inherit all useful properties of
his BATs: formulas entailed from Reiter’s BAT remain entailments from a modular BAT; consequently, the projection
problem can be solved.
Below, we first review the SC. Then we propose a new
representation that helps to design modular BATs and prove
that it has the same desirable logical properties as Reiter’s
BATs. We also discuss the significant computational advantages of using modular BATs in comparison to Reiter’s
“flat” BAT. Finally, we propose an approach to designing
taxonomies of actions and discuss related work.
Abstract
We design a representation based on the situation calculus to
facilitate development, maintenance and elaboration of very
large taxonomies of actions. This representation leads to
more compact and modular basic action theories (BATs) for
reasoning about actions than currently possible. We compare
our representation with Reiter’s BATs and prove that our representation inherits all useful properties of his BATs. Moreover, we show that our axioms can be more succinct, but extended Reiter’s regression can still be used to solve the projection problem (this is the problem of whether a given logical
expression will hold after executing a sequence of actions).
We also show that our representation has significant computational advantages. For taxonomies of actions that can be
represented as finitely branching trees, the regression operator can work exponentially faster with our theories than it
works with Reiter’s BATs. Finally, we propose general guidelines on how a taxonomy of actions can be constructed from
the given set of effect axioms in a domain.
Introduction
A long-standing and important problem in AI is the problem of how to represent and reason about effects of actions
grouped in a realistically large taxonomy, where some actions can be more generic (or more specialized) than others.
While the problem of representing large semantic networks
of (static) concepts has been addressed in AI research from
the 1970s and served as motivation for research on description logics, a related problem of representing and reasoning about large taxonomies of actions received surprisingly
little attention. We would like to address this problem using the situation calculus. The situation calculus (SC) is
a well known and popular predicate logical theory for reasoning about events and actions. There are several different
formulations of the SC. In this paper we would like to concentrate on basic action theories (BATs) introduced in (Reiter 2001), in particular, on successor state axioms (SSAs)
proposed by Reiter to solve (sometimes) the frame problem
(SSAs are part of a BAT). Recall that BATs are more expressive than STRIPS theories: actions specified using BATs can
have context-dependent effects. We propose a representation
that allows writing more compact and modular BATs than is
currently possible. BATs are logical theories of a certain
syntactic form that have several desirable theoretical properties. However, BATs have not been designed to support
taxonomic reasoning about objects and actions. Essentially,
The Situation Calculus
All dialects of the SC Lsc include three disjoint sorts (actions, situations and objects). Actions are first-order (FO)
terms consisting of an action function symbol and its arguments. Actions change the world. Situations are FO
terms which denote possible world histories. A distinguished constant S0 is used to denote the initial situation,
and function do(a, s) denotes the situation that results from
performing action a in situation s. Every situation corresponds uniquely to a sequence of actions. Moreover, the
notation s′ ⊑ s means that either situation s′ is a subsequence of situation s or s = s′ . Objects are FO terms
other than actions and situations that depend on the domain of an application. Fluents are relations or functions
whose values may vary from one situation to the next. Normally, a fluent is denoted by a predicate or function symbol
whose last argument has the sort situation. For example,
F (~x, do([α1 , · · · , αn ], S0 )) represents a relational fluent in
the situation do(αn , do(· · · , do(α1 , S0 ) · · · )) resulting from
execution of actions α1 , · · · , αn in S0 . For simplicity, we
omit all details related to functional fluents below. All free
variables are always ∀-quantified at the front.
The SC includes the distinguished predicate P oss(a, s)
c 2008, American Association for Artificial IntelliCopyright gence (www.aaai.org). All rights reserved.
931
to characterize actions a that are possible to execute in s.
For any first order SC formula φ and a term s of sort situation, we say φ is a formula uniform in s iff it mentions only
fluents (does not mention P oss or ⊑), it does not quantify
over variables of sort situation, it does not use equality on
situations, and whenever it mentions a term of sort situation
in a fluent, then that term is a variable s (see (Reiter 2001)).
A basic action theory (BAT) D in the SC is a set of axioms written in Lsc with the following five classes of axioms
to model actions and their effects (Reiter 2001). Action
precondition axioms Dap : For each action function A(~x),
there is one axiom of the form P oss(A(~x), s) ≡ ΠA (~x, s).
ΠA (~x, s) is a formula uniform in s with free variables
among ~x and s, which characterizes the preconditions of action A. Successor state axioms Dss : For each relational
fluent F (~x, s), there is one axiom of the form
_
_
F (~
x, do(a, s))≡ ψi+ (~
x, a, s) ∨ F (~
x, s) ∧ ¬( ψj− (~
x, a, s)).
i
ψi+ (~x, a, s)
the evaluation of a regressable formula W to a FO theorem
proving task in the initial theory together with unique name
axioms for actions:
D |= W iff DS0 ∪ Duna |= R[W ].
This fact is the key result in the SC. It demonstrates that an
executability or a projection task can be reduced to a theorem proving task that does not use precondition, successor
state, and foundational axioms. This is one of the reasons
why the SC provides a natural and easy way to represent
and reason about dynamic systems.
Action Hierarchies
In practice, it is not easy to specify and reason with a logical theory D if an application domain includes a very large
number of actions. To deal with this problem, we propose to
represent events and actions using a hierarchy.
(1)
j
(ψj− (~x, a, s),
Here, each formula
respectively)
is uniform in s and specifies a positive effect (negative effect, respectively) with certain conditions on fluent F . Each
ψi+ (~x, a, s) or ψi− (~x, a, s) in Eq. (1) has the syntactic form
∃~
z .a = A(~
y)∧γ(~
x, ~
z , s),
Definition 1 We use the predicate sp(a1 , a2 ) to represent
that action a1 is a direct specialization of action a2 (action
a2 is a direct generalization of a1 ). An action diagram is
defined by a finite set H of axioms of the syntactic form
sp(A1 (~
x), A2 (~
y )) ≡ φA1 ,A2 (~
x, ~
y)
(3)
for two action functions A1 (~x), A2 (~y ), where φA1 ,A2 (~x, ~y )
is a satisfiable (i.e., not equivalent to ⊥) situation-free FO
formula with free variables at most among ~x, ~y. Also, H
must be such that the following condition hold:
(2)
where ~z = ~y − ~x and γ(~x, ~z, s) is a context where the action A(~y ) has the effect. The successor state axiom (SSA)
for each fluent F completely characterizes the truth value of
F in the next situation do(a, s) in terms of values that fluents
have in the current situation s. Notice that, unlike STRIPS,
in general these SSA axioms are context-dependent. Initial
theory DS0 : A set of FO formulas whose only situation term
is S0 . It specifies the values of all fluents in the initial state.
It also describes all the facts that are not changeable by any
actions in the domain. Unique name axioms for actions
Duna : Includes axioms specifying that two actions are different if their action names are different, and that identical
actions have identical arguments. Foundational axioms for
situations Σ: The axioms for situations which characterize
the basic properties of situations. These axioms are domain
independent. They are included in the axiomatization of any
dynamic system in the SC (see (Reiter 2001) for details).
Suppose that D = Dap ∪ Dss ∪ DS0 ∪ Σ ∪ Duna is
a BAT, α1 , · · · , αn is a sequence of ground action terms,
and G(s) is a uniform formula with one free variable s.
One of the most important reasoning tasks in the SC is
the projection problem, that is, to determine whether D |=
G(do([α1 , · · · , αn ], S0 )). Another basic reasoning task is the
executability problem. Planning and high-level program execution are two important settings where the executability
and projection problems arise naturally. Regression is a central computational mechanism that forms the basis for automated reasoning in the SC (Reiter 2001). A recursive definition of the regression operator R on any regressable formula
φ is given in (Reiter 2001). We use notation R[φ] to denote
the formula that results from eliminating P oss atoms in favor of their definitions as given by action precondition axioms and replacing fluent atoms about do(α, s) by logically
equivalent expressions about s as given by SSAs repeatedly
until it cannot make such replacements any further. The regression theorem (Reiter 2001) shows that one can reduce
H ∪ D |= sp(a1 , a2 ) ⊃ (P oss(a1 , s) ⊃ P oss(a2, s)).
Given any action diagram H, we say that a directed graph
G = hV, Ei is a digraph of H when V = {A1 , · · · , An },
where all Ai ’s are distinct action function symbols in D and
a directed edge Aj →Ak belongs to the edge set E iff there
is an axiom of the form sp(Aj (~x), Ak (~y )) ≡ φAj ,Ak (~x, ~y )
in H. From Duna follows that the graph G cannot have multiple edges from one node to another. When the digraph G
of H is acyclic, i.e., there is no directed loop in G, we call
H an acyclic action diagram. Below, we will only consider
acyclic action diagrams. Note that if each action in the digraph of an acyclic action diagram has only one parent (single inheritance case), then the digraph is actually a forest,
but generally, there can be actions that have several parents
(multiple inheritance case), as shown in Examples 1 and 2.
Example 1 Consider actions performed in a kitchen, actions such as washing, cooking, frying, etc. Some can be
considered as specializations of others. To simplify the example we assume that water and electricity are always available, ignore some other kitchen activities (such as chopping,
mixing, etc) and consider the (simplified) action digraph
shown in Fig. 1. Each edge corresponds to one sp axiom
in the set H, for example,
sp(wash(x), kitchenAct), sp(prepF ood(x), kitchenAct),
sp(cook(f ood, vessel), prepF ood(f ood)),
sp(oilyCook(f ood, vessel), cook(f ood, vessel)),
sp(oilyCook(f ood, vessel), reheat(f ood)),
sp(microwave(f ood), reheat(f ood)), · · · , etc.
Example 2 We show additional examples where actions in
H can have different numbers of arguments. Consider an
action travel(p, o, d): a person p travels from origin o to
destination d. It can be regarded as a direct specialization
932
parboil
steam
stew
boil
grill
prepHotDr
broil
roast
pressureCook
ovenCook
lowOilCook
prepColdDr
handWash
prepDrink
wash
bake
fry
oilyCook
cook
deepFry
reheat
One can easily prove that under Duna , H∗ is acyclic according to Def. 3 iff the digraph of the action diagram H is
acyclic. Note that the above condition in Def. 3 is more general than the antisymmetry of sp∗ (because antisymmetry is
not strong enough to assure the acyclicity of H).
The following theorem states that the action hierarchies
entail the same intuitively clear taxonomic properties as the
predicate sp.
stir
microwave
makeSalad
prepFood
kitchenAct
Figure 1: A (Simplified) Action Digraph for Kitchen Activities
Theorem 1 Let H be an acyclic action diagram, whose corresponding action hierarchy is H⋆ . Then,
of move – person p moves from location o to location d:
H⋆ ∪ D |= sp∗ (a1 , a2 ) ⊃ (P oss(a1 , s) ⊃ P oss(a2, s)).
sp( travel(p, o, d), move(p, o, d) ).
Consider an action drive(p, v, o, d), representing that a
person p drives a vehicle v from origin o to destination
d. It can be considered as a direct specialization of action travel – person p travels from location o to location
d: sp( drive(p, v, o, d), travel(p, o, d) ). It is also a direct specialization of action move – vehicle v moves from location
o to location d: sp( drive(p, v, o, d), move(v, o, d) ).
Consider an action passDr(p, dr): a person p passes
through a door dr. It is considered as a direct specialization of move(p, o, d) iff the origin o is the outside of dr and
the destination d is the inside of dr, or vice versa:
Proof: It follows from Def. 1, Def. 2 and Def. 3 using induction, but details are omitted because of lack of space. Moreover, the following lemma will be convenient later.
Lemma 1 Consider any acyclic action diagram H, whose
corresponding action hierarchy is H⋆ . For any action functions A1 (~x) and A2 (~y ), A1 (~x) is a (distant) specialization of
A2 (~y ) iff φA1 ,A2 (~x, ~y), for some situation-free FO formula
φA1 ,A2 (including ⊤ and ⊥) whose free object variables are
at most among ~x and ~y. That is,
sp( passDr(p, dr), move(p, o, d) ) ≡
outside(o, dr) ∧ inside(d, dr) ∨ outside(d, dr) ∧ inside(o, dr),
where predicate outside(o, dr) (inside(d, dr), respectively) is true iff o (d, respectively) is the location that is
outside (inside, respectively) of dr.
sp∗ (A1 (~
x), A2 (~
y )) ≡ φA1 ,A2 (~
x, ~
y ).
|=
And, φA1 ,A2 can be found from H in finitely many steps.
Proof: Let G = hV, Ei be the digraph of the given H,
and let max(A′ , A) be the maximum of the lengths of
all the distinct paths from A′ to A in G. We prove
the following property P (n) for any natural number
n: “For any action function symbol A′ , A such that
max(A′, A) = n, n ≤ |V |, and for any distinct free variables
~x, ~
y , sp∗ (A′ (~
x), A(~
y)) ≡ φA′ ,A (~
x, ~
y ) for some FO formula
φA′ ,A (including ⊤ and ⊥) with object variables at most
among ~x and ~y ”.
Base case: P (0), max(A′, A)=0, two sub-cases.
Case 1: A=A′ , since sp∗ is reflexive
In this paper, we will only consider action diagrams with
monotonic inheritance of effects:
D ∪ H |= (∀F ).sp(a1 , a2 ) ∧ F (s) 6≡ F (do(a2 , s))
⊃ F (do(a1 , s)) ≡ F (do(a2 , s)).
Since there are only finitely many (say, m) fluents in D,
the above second-order (SO) formula can be replaced by the
finite conjunction (over j = 1..m) of FO formulas (where
Fj (x~j , s) is jth fluent with object arguments x~j ):
sp∗ (A′ (~
x), A(~
y)) ≡ |~
x| = |~
y| ∧
sp(a1 , a2 ) ∧ Fj (x~j , s) 6≡ Fj (x~j , do(a2 , s))
⊃ Fj (x~j , do(a1 , s)) ≡ Fj (x~j , do(a2 , s)).
V|~x|
i=1
xi = yi (by UNA).
Case 2: A6=A′ , and since max(A′ , A)=0, which means there
is no sp path between A and A′ , then sp∗ (A′ (~x), A(~y)) ≡ ⊥.
Inductive step: Assume that P (j) is true for all j < n, we
prove P (n), where n > 0. Consider any action function
symbols A′ , A such that max(A′ , A) = n, where n ≤ |V |.
Since G is acyclic, hence each path from A to A′ has no
repetitions of the action nodes. Since n > 0, collect all
direct generalizations of A′ in G, say {A1 , · · · , At }, which
are (distant) specializations of A. Then,
Because in general we need to reason about a direct specialization of another direct specialization of an action, we
define (distant) specializations using the predicate sp∗ .
Definition 2 The predicate sp∗ (a1 , a2 ) represents that action a1 is a (distant) specialization of action a2 and is defined as a reflexive-transitive closure of sp:
sp∗ (a1 , a2 ) ≡ (∀P ).{(∀v)[P (v, v)]∧
(∀v, v ′ , v ′′ )[sp(v, v ′ ) ∧ P (v ′ , v ′′ ) ⊃ P (v, v ′′ )] ∧
(∀v, v ′ )[sp(v, v ′ ) ⊃ P (v, v ′ )]} ⊃ P (a1 , a2 )
H⋆ ∪ Duna
′
sp∗ (AW
(~
x), A(~
y)) ≡
t
~i )[sp(A′ (~
x), Ai (x~i )) ∧ sp∗ (Ai (x~i ), A(~
y))].
i=1 (∃x
(4)
For each i, max(Ai, A) ≤ n−1. By the induction hypothesis, we have sp∗ (Ai (x~i ), A(~y)) ≡ φAi ,A (x~i , ~y ) for some
situation-free FO formula φAi ,A whose free variables
are at most among x~i and ~y. In H, for each i we have
Axiom (4) requires SO logic, but we will show in Theorem 3
that we can still reduce reasoning about regressable formulas to theorem proving in FOL only. We denote the set of
axioms including Axiom (4) and all axioms in an action diagram H as H⋆ and call it the action hierarchy (of H).
sp(A′ (~
x), Ai (x~i )) ≡ φA′ ,Ai (~
x, x~i ),
where φA′ ,Ai is a situation-free FO formula. Let
Definition 3 An action hierarchy H∗ is acyclic iff it entails the following conditions: H∗ |= sp∗ (A1 (~x1 ), A2 (~y1 )) ∧
sp∗ (A2 (~
y2 ), A1 (~
x2 )) ⊃ A1 (~
x3 ) = A2 (~
y3 ) for all action functions A1 , A2 .
φA′ ,A (~
x, ~y) =
Wt
~i )[φA′ ,Ai (~
x, x~i )
i=1 (∃x
∧ φAi ,A (x~i , ~
y )],
then P (n) is proved. Notice that n ≤ |V |; hence such FO
formula can always be obtained in finitely many steps. 933
(when h > 0), axiomatizers gain flexibility of writing SSAs
that can deliver more computational advantages. Details can
be found in the next section (see Example 6).
Other classes of axioms such as the initial theory DS0 , the
precondition axioms Dap , the foundational axioms Σ and
unique name axioms for actions Duna have the same formats
as in (Reiter 2001). It is easy to see that a modular BAT
DH differs from Reiter’s BAT D in the following aspects:
DSH0 includes the action hierarchy H⋆ and can use sp∗ to
specify SSAs for classes of actions, while Dss enumerates
each action individually. However, according to Lemma 2
(with a constructive proof), theories DH and D are related.
Example 3 We continue with Example 2.
Most
of the FO formulas φA1 ,A2 (~x, ~y ) equivalent to
sp∗ (A1 (~x), A2 (~y )) are straightforward (either ⊤,
⊥, or the same as the axioms of sp), except for
sp∗ (drive(p, v, o, d), move(obj, orig, dest))
for
any
free variable p, v, o, d, obj, orig, dest. By using Def. 2 and
the axioms given in Example 2, we have
sp∗ (drive(p, v, o, d), move(obj, orig, dest))
≡ sp(drive(p, v, o, d), move(obj, orig, dest)) ∨
sp(drive(p, v, o, d), travel(p, o, d)) ∧
sp(travel(p, o, d), move(obj, orig, dest))
≡ v = obj ∧ o = orig ∧ d = dest ∨
p = obj ∧ o = orig ∧ d = dest,
which can be simplified as: for any variables p, v, o, d, obj,
H
Lemma 2 For a Dss
, there exists an equivalent class Dss
including SSAs of the syntactic form given in Reiter’s BAT
(Reiter 2001): for each relational fluent F
DH |= F (~x, do(a, s)) ≡ φF (~x, a, s)
in which φF may have occurrences of the predicate sp∗ ,
there exists a uniform formula φ′F (~x, a, s) that does not mention sp∗ and such that D |= F (~x, do(a, s)) ≡ φ′F (~x, a, s).
Proof: Assume that a given BAT D includes k action functions in total, say A1 (~v1 ), · · · , Ak (~vk ). For each relational
fluent F (~x, s), assume that its SSA in DH is of the form (1)
whose positive and negative effect conditions have the syntactic form (5), then we substitute a with each action function, say Ai (~vi ) (without loss of generality, we assume that
variables in ~vi are all new variables never used in the SSA
of F ), and in the RHS obtained by this substitution from the
SSA of F (~x, do(Ai (~vi ), s)), replace every occurrence of sp∗
(that has two action functions as arguments) with its equivalent FO formula (that exists according to Lemma 1). This
replacement results in an axiom of the following form
sp∗ (drive(p, v, o, d), move(obj, o, d)) ≡ p = obj ∨v = obj.
Modular BATs
Our goal is to provide a more compact specification of a BAT
based on a given hierarchy of actions. We will call such a
modified BAT a modular BAT and denote it as DH , where
H
DH = Dap ∪ Dss
∪ DSH0 ∪ Σ ∪ Duna .
Here, DSH0 = DS0 ∪H⋆ , in which H⋆ is the action hierarchy
and DS0 describes the usual initial state, the same as Reiter’s
H
initial theory, and Dss
is the new class of SSAs specified
⋆
based on H . In the sequel, let sp∗= (a, a′ ) be an abbreviation
for either sp∗ (a, a′ ) or a = a′ in Formula (5) below.
H
The new syntactic form of SSAs in Dss
can be different
from Reiter’s format in Dss . Intuitively, instead of repeating
tediously each individual action in the right-hand side (RHS)
of a SSA for a fluent, say F (~x, do(a, s)), one can take advantage of the action hierarchies and describe the effect of
the whole class of action functions at once. One can say that
all those actions which are (distant) specializations of some
generic action A(~y ) (actions from the branch going out of
A(~y )), except those (distant) specializations of some other
generic actions, say A(~
yl ) for 1 ≤ l ≤ h (i.e., excluding
actions from some branches), can cause the same (positive
or negative) effects on F under certain conditions. By doing
so, we can represent the effects of actions more compactly.
We will see later that this new form of SSAs leads to significant computational advantages as well.
It is convenient to use the following notation related with
a fluent F (~x, s): for any variable vector ~
yl (l ≥ 0), let
~zl = ~yl − ~x (i.e., ~zl are the new variables mentioned in ~yl
but not in ~x). Note that, in the RHS of the SSA of F (~x, s),
those new variables ~zl need to be existentially quantified.
Formally speaking, the modified SSA of a relational fluent F (~x, s) has the format (1), where each ψi+ (~x, a, s) or
ψi− (~x, a, s) has either the syntactic form (2) or the following syntactic form:
(∃ ~
z0 )[sp⋆ (a, A(~
y0 ))∧γ(~
x, ~
z0 , s) ∧
h
^
F (~
x, do(Ai (~vi ), s)) ≡ ψi+ (~
x, ~vi , s) ∨ F (~
x, s) ∧ ¬ψi− (~
x, ~vi , s).
Whenever ψi+ (~x, ~vi , s) (ψi− (~x, ~vi , s), respectively) are consistent conditions (SC formulas uniform in s), Ai (~vi ) has a
positive effect (a negative effect) on F under such condition.
Hence, the following yields the logically equivalent SSA of
F in the usual BAT of (Reiter 2001):
W
F (~
x, do(a, s)) ≡ [ ki=1 (∃~vi )(a = Ai(~vi ) ∧ ψi+ (~
x, ~vi , s))]∨
Wk
F (~
x, s) ∧ ¬[ j=1 (∃~vj )(a = Aj (~vj ) ∧ ψj− (~
x, ~vj , s))].
Notice that the above axiom can be simplified by removing
inconsistent clauses. Hence the lemma is proved.
We then have the following important property:
Theorem 2 For each DH , there exists an equivalent D of
the format given in (Reiter 2001), where equivalence means
that for any FO regressable sentence W that has no occurrences of the predicate sp, DH |= W iff D |= W.
Proof: Use Lemma 2.
Here we provide some examples of the new way of representing SSAs, and compare them with Reiter’s format.
¬(∃ ~zl )sp⋆= (a, Al (~
yl ))]. (5)
l=1
In (5), γ is a formula uniform in s that has ~x, ~z0 , s at most as
its free variables. Notice that whenever ~zl (l ≥ 0) is empty,
then there is no existential quantifier over ~zl . In addition, in
(5), when index h = 0, the conjunction over l does not exist.
One can prove that axiomatizers can always write modified
SSAs in DH with h = 0 in (5). However, with negation
Example 4 We continue with Example 1 (recall Figure 1).
Consider a fluent f Cooked(x, s) (food x is cooked in the
situation s), the modular BAT version of its SSA could be:
f Cooked(x, do(a, s)) ≡ (∃y)sp∗(a, cook(x, y))∨f Cooked(x, s).
Another example is a SSA for the fluent dirtyV es(x, s) (it
will be false after washing a vessel x in some manner, or it
will be true when x is used to prepare food or drink):
934
dirtyV es(y, do(a, s)) ≡ dirtyV es(y, s) ∧ ¬sp∗ (a, wash(y))
∨ (∃x)sp∗(a, cook(x, y)) ∨ (∃x)a = makeSalad(x, y)
∨ (∃x)sp∗(a, prepDrink(x, y)).
acyclic graph (DAG). Computing the FO formula equivalent
to sp∗ (A1 (~x), A2 (~y )) for any pair of action functions A1 (~x)
and A2 (~y) is similar to finding all paths from A1 to A2 in the
corresponding digraph. The latter problem has the computational complexity of Θ(p) where p is the number of all the
distinct edges in the digraph on any path from A1 to A2 , and
therefore has a computational complexity of O(e) where e
is the number of all edges in the digraph (i.e., e is the number of sp axioms in H). As a consequence, this yields the
following encouraging and important result.
We start with the case h = 0 in (5). Let φ(a, ~x, s) denote (∃~z0 )[sp∗ (a, A(~y0 )) ∧ γ(~x, ~z0 , s)]. In general, to provide an equivalent SSA of F (~x, s) in Reiter’s representation,
φ(a, ~
x, s) has to be replaced by an uniform formula ψ(a, ~
x, s)
of the form (∃~z0 )[ψsp (a, ~y0 ) ∧ γ(~x, ~z0 , s)]. Here, ψsp (a, ~y0 )
W
might have the form (a = A(~y0 ) ∨ ni=A1 −1 (∃~zi )(a = Ai (~yi ) ∧
ψi (~
y0 , ~
yi ))), where each Ai (~
yi ) (1≤ i ≤ nA − 1) is a specialization of A(~y0 ) under the condition ψi (~y0 , ~yi ), nA is the total number of specializations of A. The formula ψsp (a, ~y0 )
is a logically equivalent replacement of sp∗ (a, A(~y0 )) in
φ (see Lemma 2). Let the action diagram H in DH be
acyclic and the corresponding action digraph rooted at A
have a tree structure (the most general actions are considered
as roots and the most specialized actions are considered as
leaves). Then, the computational time of extended regression E [R[sp∗ (α, A(~t ′ , ~z0 ))]] in the clause φ(α, ~t, S), for any
object terms ~t and any situation term do(α, S), is no worse
than and (sometimes) can be exponentially faster than computational time of Reiter’s regression on ψsp in ψ(α, ~t, S).
Theorem 4 If the sub-tree rooted at A in the digraph of H
is a complete k-ary tree (k ≥ 2) with nA action functions
as its nodes, the computational complexity of extended regression E[R[sp∗ (α, A(~t ′ , ~z0 ))]] is Θ(logk nA ), while the
computational complexity of Reiter’s regression R[ψsp ] on
an equivalent replacement is Θ(nA ).
Proof: When a DAG of an action hierarchy has a tree
structure or a forest structure, there is at most one path
between any two action functions. In particular, assume
that the digraph of the action hierarchy is a complete kary (k ≥ 2) tree structure and consider the regression of
F (~t, do(α, S)) for any action term α and situation term S.
To specify the equivalent SSA in Reiter’s format, φ(α, ~t, S)
needs to be replaced by ψ(α, ~t, S). It is easy to see that
one-step regression of the above clause in Reiter’s format
takes Θ(nA ) steps (subsequently, additional time is required
to regress recursively R[γ(~t, ~z0 , S)]). To perform onestep regression using the modular BAT format, it is sufficient to regress sp∗ (α, A(~t ′ , ~z0 )) ∧ γ(~t, ~z0 , S) (which takes
Θ(1) time, excluding again the time that subsequently required to compute recursively R[γ(~t, ~z0 , S)]) and then replace sp∗ (α, A(~t ′ , ~z0 )) with the equivalent FO formula. Because this last replacement step can be considered as finding
the path from α to A0 with the computational complexity of
Θ(logk nA ). Finally, regression makes the same number of
recursive calls in both cases: the formula γ is the same. The Reiter’s SSA for fluent f Cooked(x, s) (with a bigger
taxonomy of actions, it will be much longer):
f Cooked(x, do(a, s)) ≡
(∃y)[ a = cook(x, y)∨a = lowOilCook(x, y)∨a = steam(x, y)
∨a = boil(x, y)∨a = stew(x, y)∨a = broil(x, y)
∨a = bake(x, y)∨a = roast(x, y)∨a = ovenCook(x, y)
∨a = pressureCook(x, y)∨a = oilyCook(x, y)
∨a = f ry(x, y) ∨ a = deepF ry(x, y) ∨ a = stir(x, y)
∨a = parboil(x, y) ∨ a = grill(x, y)] ∨ f Cooked(x, s).
We can also get a similar longer Reiter’s SSA for fluent
dirtyV es(x, s) (details are omitted).
The definitions of the regression operator and the regressable sentences in DH are all the same as in (Reiter 2001).
Similar to the regression theorem (Reiter 2001), we have
DH |= W iff DSH0 ∪ Duna |= R[W ]
for any regressable sentence W . Let E R[W ] (called the
extended regression of W ) be the operator that eliminates
all occurrences (if any) of the sp∗ (A′ , A) predicate in R[W ]
in favor of the corresponding FO formulas φA′ ,A that exists
according to Lemma 1. Then, we have:
Theorem 3 For each DH and for any FO regressable
sentence W , DH |= W iff DS0 ∪ H ∪ Duna |= E R[W ] .
This theorem is important because DSH0 ∪Duna (and hence
D ) include the SO definition of the predicate sp∗ . However, all occurrences of sp∗ in sentence R[W ] can be replaced by FO sentences in finitely many steps according to
the Lemma 1. Consequently, one can use regression in our
modular BATs to reduce projection and executability problems to theorem proving in FOL only.
H
Advantages of Modular BATs
Using action hierarchies and specifying BATs modularly not
only provides a compact way of representing effects of actions, but sometimes leads to a more computationally efficient (than Reiter’s) solution of the projection problem.
Example 5 Continuing with Example 4, consider a ground
action α = deepF ry(Egg1, F ryingP an1 ), and the regression of f Cooked(Egg1 , do(α, S0 )). Using Reiter’s SSA for
this fluent, regression involves checking 16 equality clauses
between actions when regressing on the positive conditions
in the SSA of f Cooked (see the axiom above). Using the
modular BAT, extended regression of the positive conditions
involves only 1 step of regression for predicate sp∗ , and finally the replacement of sp∗ (α, cook(Egg1 , y)) with the corresponding FO formula, i.e., the operator E, takes at most 4
steps of recursive computation (see Figure 1).
Apart from specific examples, let us discuss in general the
following problems: when we can actually gain computational advantages using action hierarchies and how much we
can gain, whether there is any possible computational disadvantage in using action hierarchies alone, and if so, whether
it can be avoided.
According to the definition in the previous section, the
digraph of an acyclic action diagram is in fact a directed
However, if we do not allow the usage of (in)equality beH
tween action terms (e.g., a = Ai ) in Dss
, we may (some-
935
axiom for any relational fluent F (~y , s) has the syntactic form
times) lose computational advantages when the effects of
actions for some fluents only involve very few actions in
a large taxonomy or when the structure of a DAG is not a
forest, especially if it is close to a complete DAG. Since a
dense DAG of n nodes has at most the order of n2 edges
(a complete DAG of n nodes has n(n − 1)/2 edges in total), computing the FO formula equivalent to the predicate
sp∗ has complexity O(n2 ). To avoid such computational
disadvantages, we can easily use both (in)equality between
action terms and predicate sp∗ in modular BATs. Whenever
the (sub)digraph rooted at some action function symbol A
has a tree structure (even mostly a tree structure with a few
extra edges) and most of its specializations have the same
effects under certain common conditions on some fluent F ,
one can use the sp∗ predicate for A to gain both the computational advantage and the advantage of compact representation when writing the SSA for F . Otherwise, one can use the
(in)equality format to avoid computational disadvantages.
Now, we illustrate the advantage of using the negated
component in clause (5) (i.e., allowing h > 0).
+
ψA,F
(~
x, ~
y , s) ⊃ F (~
y, do(A(~
x), s),
and its negative effect
axiom for F (~y , s) is of the form
−
ψA,F (~
x, ~y, s) ⊃ ¬F (~
y, do(A(~
x), s).
(8)
(9)
Definition 4 For an action function A(~x) and a fluent
F (~y, s), which has effect axioms of the form (8,9) we say that
an action A has effect on a relational fluent F (or F could
+
be affected by A) iff either 6|= ψA,F
(~
x, ~
y , s) ≡ F (~
y, s) or
−
6|= ψA,F (~
x, ~
y , s) ≡ ¬F (~
y, s). For any action function A(~
x), a
special meta-function Ne (A) is used to represent the number
of fluents that can be affected by A.
For any two action functions A1 (~x1 ) and A2 (~x2 ), we say
that A1 causes no less effects than A2 iff there exists no fluent such that A1 has no effect on it but A2 has. We say that
A1 causes more effects than A2 iff A1 has no less effects
than A2 and there exists at least one fluent such that A1 has
an effect on it but A2 does not.
Note that if A1 causes more effects than A2 , then Ne (A1 ) >
Ne (A2 ); however, it is not necessarily true the other way
around: actions might affect different sets of fluents. Given
effect axioms, for any pair of actions A1 , A2 , a straightforward linear time O(m) procedure can check whether A1
causes more effects than A2 .
We would like to provide general guidelines on how an
axiomatizer can construct an action diagram H for D. Under the assumption of monotonic inheritance, if A1 is a specialization of A2 , then it causes no less effects than A2 and
Ne (A1 ) ≥ Ne (A2 ). Thus, to return a set H that represents
an action diagram H, it is enough to start with generic actions A that have the smallest value of Ne (A) and proceed
towards more specialized actions checking on each iteration
if the next action we consider is a specialization of one of
the previously considered actions.
1. Sort the action functions and get the sequence A1 (~x1 ),
· · · , An (~
xn ) such that Ne (Ai1 ) ≤ Ne (Ai2 ) for i1 < i2 .
2. Initially, let i = 2 (index 1 ≤ i ≤ n) and H = ∅.
3. If i > n, then return H and terminate; else assign j = i
and continue: look for Aj ’s that are generalizations of Ai .
4. Decrement j = j − 1. If j = 0 (i.e., all candidates Aj have
been already considered), then increment i = i + 1 and
go to step 3 (i.e., take a next action Ai+1 from the sorted
sequence we obtained at step 1); else if Ne (Ai ) = Ne (Aj ),
go to step 4; else continue.
5. For any pair of indices i, j such that 1 ≤ j ≤ i − 1, if
there is a path in H from i to j, then we already know that
Ai is a specialization of Aj and because specialization is
a transitive relation there is no need to add a new directed
edge from Ai to Aj and we can go to step 4; else continue.
6. If Ai (~xi ) is a specialization of Aj (~xj ) under FO condition φi,j , then update H = H ∪ {(Ai (~xi ), Aj (~xj ), φi,j )}
and go to step 4; else do not change H and go to step 4.
To implement the last step for any two action functions
Ai (~xi ) and Aj (~xj ), provided that axiomatizers are able to
write effect axioms of the form (8,9), we formulate the following principles to determine whether or not Ai (~xi ) is a
specialization of Aj (~xj ) under some condition φ.
Example 6 We continue with Example 5. Consider a fluent
nonBBQ(x, s): x is cooked without grilling. Its SSA in
DH can be written without negation in (5), i.e. when h = 0:
nonBBQ(x, do(a, s)) ≡ (∃y)[sp∗ (a, oilyCook(x, y)) ∨
sp∗ (a, ovenCook(x, y)) ∨ a = steam(x, y) ∨ a = stew(x, y) ∨
sp∗ (a, boil(x, y))] ∨ nonBBQ(x, s)∧¬(∃y)a 6= grill(x, y) (6)
Alternatively, it can be written with negation (i.e., h > 0) as:
nonBBQ(x, do(a, s)) ≡ (∃y)[sp∗(a, oilyCook(x, y)) ∨
sp∗ (a, lowOilCook(x, y)) ∧ (∀z)(a 6= lowOilCook(x, z) ∧
a 6= grill(x, z))] ∨ nonBBQ(x, s)∧¬(∃y)a 6= grill(x, y) (7)
Consider E[R[nonBBQ(Egg1 , do(α, S0 ))]], extended regression where α is the same as in Example 5. It takes 1
step less using Formula (7) than using Formula (6) during
regression (regardless the quantifiers). Clearly, the more
branches lowOilCook(x, y) has that have positive effects
on nonBBQ(x, s) without extra context conditions, the
more computational advantage we can obtain by allowing
h ≥ 0 and using Formula (7) during regession.
How to Construct a Taxonomy of Actions
As we see, hierarchies of actions can lead to important computational advantages. An important practical question remains how an axiomatizer should approach the problem of
constructing a hierarchy of actions given only a set of effect
axioms which specify for each fluent what actions have a
(positive or negative) effect on the fluent. In a somewhat
similar vein, (Reiter 2001) starts from effect axioms and
demonstrates that under the causal completeness assumption, SSAs can be constructed from effect axioms. We continue to consider only action diagrams H with monotonic
inheritance of effects. In this subsection, we assume that all
the variables used below in action functions and fluents are
distinct from each other. Consider a BAT D which includes
a set of n action functions, say {Ai (~xi ) | i = 1..n}, and
m fluents, say {Fj (~yj ) | j = 1..m}, that might be affected
by any of the above actions. For any action function A(~x),
without loss of generality, we assume that its positive effect
936
a. If Ai causes more effects than Aj , “guess” a FO formula
φ, whose free variables include at most ~xj and ~xi , such
that for any relational fluent F (~y , s) that could be affected
by both Ai and Aj ,
sitional dynamic logic). All actions in CLASP are represented in the style of STRIPS, which is less expressive than
general Reiter’s BATs and our modular BATs. Our formalism is very different from all the papers mentioned above.
We use a specialization relation between primitive action
functions, and provide a formal axiomatization of the dynamic aspects of actions using full predicate logic (hence,
our theory is quite expressive). Also, we gain both representational and computational advantages by using the action
hierarchies. The extensive research on Hierarchical Task
Networks (HTN), that can be traced to the pioneering work
of Sacerdoti on ABSTRIPS, considers a completely different recursive decomposition of complex actions (i.e., plans,
or nonprimitve tasks) into constituents, but does not explore
large taxonomies of primitive actions and whether these taxonomies can provide any computational advantages when
solving the projection problem. Our work is motivated in
part by the well-known hierarchies of verbs (full troponym)
in WordNet (Fellbaum 1998). Exploring connections with
other frameworks (e.g., FrameNet, Levin’s taxonomy, VerbNet, etc) in computational linguistics and natural language
processing is a possible direction for our future research.
(Amir 2000) proposes and studies an object oriented version
of the SC with the purpose of developing decomposed theories of actions, but he investigates a representation that is significantly different from our approach and considers neither
taxonomies of actions nor BATs. In the future, we will explore how to combine Amir’s decomposed SC theories with
our action hierarchies. Moreover, currently we consider only
hierarchies of primitive actions. In the future, we may also
consider hierarchies of complex actions (plans), and explore
what criteria should be followed for constructing such hierarchies, whether we can construct them automatically from
the existing hierarchies of primitive actions and our BATs.
D |= φ ⊃ (P oss(Ai (~
xi ), s) ⊃ P oss(Aj (~
xj ), s)),
+
+
D |= φ ⊃ (ψA
(~
xi , ~
y, s) ≡ ψA
(~
xj , ~y, s)),
i ,F
j ,F
−
−
D |= φ ⊃ (ψA
(~
xi , ~
y, s) ≡ ψA
(~
xj , ~y, s)).
i ,F
j ,F
If one can find such φ, then Ai is a specialization of Aj
under the condition φ.
b. Otherwise, Ai is not a specialization of Aj .
Note that for any action functions A1 (~x), A2 (~y ) and FO formula φ, each (A1 (~x), A2 (~y), φ) in the returned set H corresponds to an axiom sp(A1 (~x), A2 (~y)) ≡ φ, and the collection
of all these axioms results in an action diagram H.
In general, to determine whether one action is a specialization of another under certain condition is undecidable.
Hence, the axiomatizers have to observe the preconditions
and effects of actions, guess formula φ (using their intuition), and construct action diagrams manually. In the future,
we would like to consider whether it is possible to generate
action diagrams automatically in some special cases.
Discussion and Future Work
There are a few papers related to our work that we would like
to mention. (Lifschitz & Ren 2006) consider modular theories in the propositional action representation language C+,
and address the problem of the development of libraries of
reusable, general-purpose knowledge components. In comparison to them, we explore how to manage a large number of actions in the predicate logic using a hierarchical representation for actions in SC. We propose a representation,
which not only facilitates writing axioms succinctly, but for
realistic taxonomies can also gain computational advantages
in solving the projection problem. (Kautz & Allen 1986)
and subsequent papers of H.Kautz develop frameworks for
plan recognition using hierarchies of plans, in which primitive action and plan instances belong to certain event types
represented as unary predicates, and a hierarchy of plans is
a collection of restricted-form axioms specifying relationships between various event types. However, their axiomatizations of actions (preconditions and effects of the actions)
are limited and they do not address the projection problem.
(Kaneiwa & Tojo 2005) give an ontological framework to
represent actions/events and their hierarchical relationships
in information systems using an order-sorted SO logic. In
this framework, events (or actions) are represented as predicates rather than terms, and the authors consider taxonomical reasoning about relationships between events rather than
reasoning about effects of actions. The authors do not provide axiomatizations of the dynamic aspects of actions and
do not explore computational properties of their framework.
(Devanbu & Litman 1996) propose a plan-based knowledge
representation and reasoning system, called CLASP (CLAssification of Scenarios and Plans). CLASP extends the notions of subsumption from terminological languages to plans
by allowing the construction of plans from concepts corresponding to actions and using plan description forming operators for choice, sequencing and looping (similar to propo-
Acknowledgments
Thanks to the Natural Sciences and Engineering Research
Council of Canada (NSERC) for partial financial support of
this research. Thanks to Fahiem Bacchus, Hector Levesque,
and Sheila McIlraith for comments on preliminary versions.
References
Amir, E. 2000. (De)Composition of Situation Calculus Theories. In Proceedings of the Seventienth National Conference on
Artificial Intelligence (AAAI-00). AAAI.
Devanbu, P. T., and Litman, D. J. 1996. Taxonomic plan reasoning. Artif. Intell. 84(1-2):1–35.
Fellbaum, C. 1998. English verbs as a semantic net. In Fellbaum,
C., ed., WordNet: An Electronic Lexical Database, with a preface
by George Miller, Chapter 3. The MIT Press. 69–104.
Kaneiwa, K., and Tojo, S. 2005. Logical aspects of events: Quantification, sorts,composition and disjointness. In Proceedings of
Australasian Ontology Workshop (AOW 2005).
Kautz, H. A., and Allen, J. F. 1986. Generalized plan recognition. In Proceedings of the fifth National Conference on Artificial
Intelligence (AAAI-86), 32–37. AAAI Press.
Lifschitz, V., and Ren, W. 2006. A modular action description
language. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06). AAAI.
Reiter, R. 2001. Knowledge in Action: Logical Foundations for
Describing and Implementing Dynamical Systems. MIT Press.
937