Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008) Reasoning about Large Taxonomies of Actions Yilan Gu Mikhail Soutchanski Dept. of Computer Science University of Toronto 10 King’s College Road Toronto, ON, M5S 3G4, Canada Email: yilan@cs.toronto.edu Department of Computer Science Ryerson University 245 Church Street, ENG281 Toronto, ON, M5B 2K3, Canada Email: mes@cs.ryerson.ca these theories are “flat” and do not provide representation for hierarchies of actions. This can lead to potential difficulties if one intends to use BATs for the purpose of large scale formalization of reasoning about actions on the commonsense level, when potentially arbitrary actions and objects have to be represented. Intuitively, many events and actions have different degrees of generality: the action of driving a car from home to an office is a specialization of the action of transportation using a vehicle, that is in its turn a specialization of the action of moving an object from one location to another. We represent hierarchies of actions explicitly and use them in our new modular SSAs. However, we show that our new modular SSAs can be translated into “flat” Reiter’s SSAs and, consequently, we inherit all useful properties of his BATs: formulas entailed from Reiter’s BAT remain entailments from a modular BAT; consequently, the projection problem can be solved. Below, we first review the SC. Then we propose a new representation that helps to design modular BATs and prove that it has the same desirable logical properties as Reiter’s BATs. We also discuss the significant computational advantages of using modular BATs in comparison to Reiter’s “flat” BAT. Finally, we propose an approach to designing taxonomies of actions and discuss related work. Abstract We design a representation based on the situation calculus to facilitate development, maintenance and elaboration of very large taxonomies of actions. This representation leads to more compact and modular basic action theories (BATs) for reasoning about actions than currently possible. We compare our representation with Reiter’s BATs and prove that our representation inherits all useful properties of his BATs. Moreover, we show that our axioms can be more succinct, but extended Reiter’s regression can still be used to solve the projection problem (this is the problem of whether a given logical expression will hold after executing a sequence of actions). We also show that our representation has significant computational advantages. For taxonomies of actions that can be represented as finitely branching trees, the regression operator can work exponentially faster with our theories than it works with Reiter’s BATs. Finally, we propose general guidelines on how a taxonomy of actions can be constructed from the given set of effect axioms in a domain. Introduction A long-standing and important problem in AI is the problem of how to represent and reason about effects of actions grouped in a realistically large taxonomy, where some actions can be more generic (or more specialized) than others. While the problem of representing large semantic networks of (static) concepts has been addressed in AI research from the 1970s and served as motivation for research on description logics, a related problem of representing and reasoning about large taxonomies of actions received surprisingly little attention. We would like to address this problem using the situation calculus. The situation calculus (SC) is a well known and popular predicate logical theory for reasoning about events and actions. There are several different formulations of the SC. In this paper we would like to concentrate on basic action theories (BATs) introduced in (Reiter 2001), in particular, on successor state axioms (SSAs) proposed by Reiter to solve (sometimes) the frame problem (SSAs are part of a BAT). Recall that BATs are more expressive than STRIPS theories: actions specified using BATs can have context-dependent effects. We propose a representation that allows writing more compact and modular BATs than is currently possible. BATs are logical theories of a certain syntactic form that have several desirable theoretical properties. However, BATs have not been designed to support taxonomic reasoning about objects and actions. Essentially, The Situation Calculus All dialects of the SC Lsc include three disjoint sorts (actions, situations and objects). Actions are first-order (FO) terms consisting of an action function symbol and its arguments. Actions change the world. Situations are FO terms which denote possible world histories. A distinguished constant S0 is used to denote the initial situation, and function do(a, s) denotes the situation that results from performing action a in situation s. Every situation corresponds uniquely to a sequence of actions. Moreover, the notation s′ ⊑ s means that either situation s′ is a subsequence of situation s or s = s′ . Objects are FO terms other than actions and situations that depend on the domain of an application. Fluents are relations or functions whose values may vary from one situation to the next. Normally, a fluent is denoted by a predicate or function symbol whose last argument has the sort situation. For example, F (~x, do([α1 , · · · , αn ], S0 )) represents a relational fluent in the situation do(αn , do(· · · , do(α1 , S0 ) · · · )) resulting from execution of actions α1 , · · · , αn in S0 . For simplicity, we omit all details related to functional fluents below. All free variables are always ∀-quantified at the front. The SC includes the distinguished predicate P oss(a, s) c 2008, American Association for Artificial IntelliCopyright gence (www.aaai.org). All rights reserved. 931 to characterize actions a that are possible to execute in s. For any first order SC formula φ and a term s of sort situation, we say φ is a formula uniform in s iff it mentions only fluents (does not mention P oss or ⊑), it does not quantify over variables of sort situation, it does not use equality on situations, and whenever it mentions a term of sort situation in a fluent, then that term is a variable s (see (Reiter 2001)). A basic action theory (BAT) D in the SC is a set of axioms written in Lsc with the following five classes of axioms to model actions and their effects (Reiter 2001). Action precondition axioms Dap : For each action function A(~x), there is one axiom of the form P oss(A(~x), s) ≡ ΠA (~x, s). ΠA (~x, s) is a formula uniform in s with free variables among ~x and s, which characterizes the preconditions of action A. Successor state axioms Dss : For each relational fluent F (~x, s), there is one axiom of the form _ _ F (~ x, do(a, s))≡ ψi+ (~ x, a, s) ∨ F (~ x, s) ∧ ¬( ψj− (~ x, a, s)). i ψi+ (~x, a, s) the evaluation of a regressable formula W to a FO theorem proving task in the initial theory together with unique name axioms for actions: D |= W iff DS0 ∪ Duna |= R[W ]. This fact is the key result in the SC. It demonstrates that an executability or a projection task can be reduced to a theorem proving task that does not use precondition, successor state, and foundational axioms. This is one of the reasons why the SC provides a natural and easy way to represent and reason about dynamic systems. Action Hierarchies In practice, it is not easy to specify and reason with a logical theory D if an application domain includes a very large number of actions. To deal with this problem, we propose to represent events and actions using a hierarchy. (1) j (ψj− (~x, a, s), Here, each formula respectively) is uniform in s and specifies a positive effect (negative effect, respectively) with certain conditions on fluent F . Each ψi+ (~x, a, s) or ψi− (~x, a, s) in Eq. (1) has the syntactic form ∃~ z .a = A(~ y)∧γ(~ x, ~ z , s), Definition 1 We use the predicate sp(a1 , a2 ) to represent that action a1 is a direct specialization of action a2 (action a2 is a direct generalization of a1 ). An action diagram is defined by a finite set H of axioms of the syntactic form sp(A1 (~ x), A2 (~ y )) ≡ φA1 ,A2 (~ x, ~ y) (3) for two action functions A1 (~x), A2 (~y ), where φA1 ,A2 (~x, ~y ) is a satisfiable (i.e., not equivalent to ⊥) situation-free FO formula with free variables at most among ~x, ~y. Also, H must be such that the following condition hold: (2) where ~z = ~y − ~x and γ(~x, ~z, s) is a context where the action A(~y ) has the effect. The successor state axiom (SSA) for each fluent F completely characterizes the truth value of F in the next situation do(a, s) in terms of values that fluents have in the current situation s. Notice that, unlike STRIPS, in general these SSA axioms are context-dependent. Initial theory DS0 : A set of FO formulas whose only situation term is S0 . It specifies the values of all fluents in the initial state. It also describes all the facts that are not changeable by any actions in the domain. Unique name axioms for actions Duna : Includes axioms specifying that two actions are different if their action names are different, and that identical actions have identical arguments. Foundational axioms for situations Σ: The axioms for situations which characterize the basic properties of situations. These axioms are domain independent. They are included in the axiomatization of any dynamic system in the SC (see (Reiter 2001) for details). Suppose that D = Dap ∪ Dss ∪ DS0 ∪ Σ ∪ Duna is a BAT, α1 , · · · , αn is a sequence of ground action terms, and G(s) is a uniform formula with one free variable s. One of the most important reasoning tasks in the SC is the projection problem, that is, to determine whether D |= G(do([α1 , · · · , αn ], S0 )). Another basic reasoning task is the executability problem. Planning and high-level program execution are two important settings where the executability and projection problems arise naturally. Regression is a central computational mechanism that forms the basis for automated reasoning in the SC (Reiter 2001). A recursive definition of the regression operator R on any regressable formula φ is given in (Reiter 2001). We use notation R[φ] to denote the formula that results from eliminating P oss atoms in favor of their definitions as given by action precondition axioms and replacing fluent atoms about do(α, s) by logically equivalent expressions about s as given by SSAs repeatedly until it cannot make such replacements any further. The regression theorem (Reiter 2001) shows that one can reduce H ∪ D |= sp(a1 , a2 ) ⊃ (P oss(a1 , s) ⊃ P oss(a2, s)). Given any action diagram H, we say that a directed graph G = hV, Ei is a digraph of H when V = {A1 , · · · , An }, where all Ai ’s are distinct action function symbols in D and a directed edge Aj →Ak belongs to the edge set E iff there is an axiom of the form sp(Aj (~x), Ak (~y )) ≡ φAj ,Ak (~x, ~y ) in H. From Duna follows that the graph G cannot have multiple edges from one node to another. When the digraph G of H is acyclic, i.e., there is no directed loop in G, we call H an acyclic action diagram. Below, we will only consider acyclic action diagrams. Note that if each action in the digraph of an acyclic action diagram has only one parent (single inheritance case), then the digraph is actually a forest, but generally, there can be actions that have several parents (multiple inheritance case), as shown in Examples 1 and 2. Example 1 Consider actions performed in a kitchen, actions such as washing, cooking, frying, etc. Some can be considered as specializations of others. To simplify the example we assume that water and electricity are always available, ignore some other kitchen activities (such as chopping, mixing, etc) and consider the (simplified) action digraph shown in Fig. 1. Each edge corresponds to one sp axiom in the set H, for example, sp(wash(x), kitchenAct), sp(prepF ood(x), kitchenAct), sp(cook(f ood, vessel), prepF ood(f ood)), sp(oilyCook(f ood, vessel), cook(f ood, vessel)), sp(oilyCook(f ood, vessel), reheat(f ood)), sp(microwave(f ood), reheat(f ood)), · · · , etc. Example 2 We show additional examples where actions in H can have different numbers of arguments. Consider an action travel(p, o, d): a person p travels from origin o to destination d. It can be regarded as a direct specialization 932 parboil steam stew boil grill prepHotDr broil roast pressureCook ovenCook lowOilCook prepColdDr handWash prepDrink wash bake fry oilyCook cook deepFry reheat One can easily prove that under Duna , H∗ is acyclic according to Def. 3 iff the digraph of the action diagram H is acyclic. Note that the above condition in Def. 3 is more general than the antisymmetry of sp∗ (because antisymmetry is not strong enough to assure the acyclicity of H). The following theorem states that the action hierarchies entail the same intuitively clear taxonomic properties as the predicate sp. stir microwave makeSalad prepFood kitchenAct Figure 1: A (Simplified) Action Digraph for Kitchen Activities Theorem 1 Let H be an acyclic action diagram, whose corresponding action hierarchy is H⋆ . Then, of move – person p moves from location o to location d: H⋆ ∪ D |= sp∗ (a1 , a2 ) ⊃ (P oss(a1 , s) ⊃ P oss(a2, s)). sp( travel(p, o, d), move(p, o, d) ). Consider an action drive(p, v, o, d), representing that a person p drives a vehicle v from origin o to destination d. It can be considered as a direct specialization of action travel – person p travels from location o to location d: sp( drive(p, v, o, d), travel(p, o, d) ). It is also a direct specialization of action move – vehicle v moves from location o to location d: sp( drive(p, v, o, d), move(v, o, d) ). Consider an action passDr(p, dr): a person p passes through a door dr. It is considered as a direct specialization of move(p, o, d) iff the origin o is the outside of dr and the destination d is the inside of dr, or vice versa: Proof: It follows from Def. 1, Def. 2 and Def. 3 using induction, but details are omitted because of lack of space. Moreover, the following lemma will be convenient later. Lemma 1 Consider any acyclic action diagram H, whose corresponding action hierarchy is H⋆ . For any action functions A1 (~x) and A2 (~y ), A1 (~x) is a (distant) specialization of A2 (~y ) iff φA1 ,A2 (~x, ~y), for some situation-free FO formula φA1 ,A2 (including ⊤ and ⊥) whose free object variables are at most among ~x and ~y. That is, sp( passDr(p, dr), move(p, o, d) ) ≡ outside(o, dr) ∧ inside(d, dr) ∨ outside(d, dr) ∧ inside(o, dr), where predicate outside(o, dr) (inside(d, dr), respectively) is true iff o (d, respectively) is the location that is outside (inside, respectively) of dr. sp∗ (A1 (~ x), A2 (~ y )) ≡ φA1 ,A2 (~ x, ~ y ). |= And, φA1 ,A2 can be found from H in finitely many steps. Proof: Let G = hV, Ei be the digraph of the given H, and let max(A′ , A) be the maximum of the lengths of all the distinct paths from A′ to A in G. We prove the following property P (n) for any natural number n: “For any action function symbol A′ , A such that max(A′, A) = n, n ≤ |V |, and for any distinct free variables ~x, ~ y , sp∗ (A′ (~ x), A(~ y)) ≡ φA′ ,A (~ x, ~ y ) for some FO formula φA′ ,A (including ⊤ and ⊥) with object variables at most among ~x and ~y ”. Base case: P (0), max(A′, A)=0, two sub-cases. Case 1: A=A′ , since sp∗ is reflexive In this paper, we will only consider action diagrams with monotonic inheritance of effects: D ∪ H |= (∀F ).sp(a1 , a2 ) ∧ F (s) 6≡ F (do(a2 , s)) ⊃ F (do(a1 , s)) ≡ F (do(a2 , s)). Since there are only finitely many (say, m) fluents in D, the above second-order (SO) formula can be replaced by the finite conjunction (over j = 1..m) of FO formulas (where Fj (x~j , s) is jth fluent with object arguments x~j ): sp∗ (A′ (~ x), A(~ y)) ≡ |~ x| = |~ y| ∧ sp(a1 , a2 ) ∧ Fj (x~j , s) 6≡ Fj (x~j , do(a2 , s)) ⊃ Fj (x~j , do(a1 , s)) ≡ Fj (x~j , do(a2 , s)). V|~x| i=1 xi = yi (by UNA). Case 2: A6=A′ , and since max(A′ , A)=0, which means there is no sp path between A and A′ , then sp∗ (A′ (~x), A(~y)) ≡ ⊥. Inductive step: Assume that P (j) is true for all j < n, we prove P (n), where n > 0. Consider any action function symbols A′ , A such that max(A′ , A) = n, where n ≤ |V |. Since G is acyclic, hence each path from A to A′ has no repetitions of the action nodes. Since n > 0, collect all direct generalizations of A′ in G, say {A1 , · · · , At }, which are (distant) specializations of A. Then, Because in general we need to reason about a direct specialization of another direct specialization of an action, we define (distant) specializations using the predicate sp∗ . Definition 2 The predicate sp∗ (a1 , a2 ) represents that action a1 is a (distant) specialization of action a2 and is defined as a reflexive-transitive closure of sp: sp∗ (a1 , a2 ) ≡ (∀P ).{(∀v)[P (v, v)]∧ (∀v, v ′ , v ′′ )[sp(v, v ′ ) ∧ P (v ′ , v ′′ ) ⊃ P (v, v ′′ )] ∧ (∀v, v ′ )[sp(v, v ′ ) ⊃ P (v, v ′ )]} ⊃ P (a1 , a2 ) H⋆ ∪ Duna ′ sp∗ (AW (~ x), A(~ y)) ≡ t ~i )[sp(A′ (~ x), Ai (x~i )) ∧ sp∗ (Ai (x~i ), A(~ y))]. i=1 (∃x (4) For each i, max(Ai, A) ≤ n−1. By the induction hypothesis, we have sp∗ (Ai (x~i ), A(~y)) ≡ φAi ,A (x~i , ~y ) for some situation-free FO formula φAi ,A whose free variables are at most among x~i and ~y. In H, for each i we have Axiom (4) requires SO logic, but we will show in Theorem 3 that we can still reduce reasoning about regressable formulas to theorem proving in FOL only. We denote the set of axioms including Axiom (4) and all axioms in an action diagram H as H⋆ and call it the action hierarchy (of H). sp(A′ (~ x), Ai (x~i )) ≡ φA′ ,Ai (~ x, x~i ), where φA′ ,Ai is a situation-free FO formula. Let Definition 3 An action hierarchy H∗ is acyclic iff it entails the following conditions: H∗ |= sp∗ (A1 (~x1 ), A2 (~y1 )) ∧ sp∗ (A2 (~ y2 ), A1 (~ x2 )) ⊃ A1 (~ x3 ) = A2 (~ y3 ) for all action functions A1 , A2 . φA′ ,A (~ x, ~y) = Wt ~i )[φA′ ,Ai (~ x, x~i ) i=1 (∃x ∧ φAi ,A (x~i , ~ y )], then P (n) is proved. Notice that n ≤ |V |; hence such FO formula can always be obtained in finitely many steps. 933 (when h > 0), axiomatizers gain flexibility of writing SSAs that can deliver more computational advantages. Details can be found in the next section (see Example 6). Other classes of axioms such as the initial theory DS0 , the precondition axioms Dap , the foundational axioms Σ and unique name axioms for actions Duna have the same formats as in (Reiter 2001). It is easy to see that a modular BAT DH differs from Reiter’s BAT D in the following aspects: DSH0 includes the action hierarchy H⋆ and can use sp∗ to specify SSAs for classes of actions, while Dss enumerates each action individually. However, according to Lemma 2 (with a constructive proof), theories DH and D are related. Example 3 We continue with Example 2. Most of the FO formulas φA1 ,A2 (~x, ~y ) equivalent to sp∗ (A1 (~x), A2 (~y )) are straightforward (either ⊤, ⊥, or the same as the axioms of sp), except for sp∗ (drive(p, v, o, d), move(obj, orig, dest)) for any free variable p, v, o, d, obj, orig, dest. By using Def. 2 and the axioms given in Example 2, we have sp∗ (drive(p, v, o, d), move(obj, orig, dest)) ≡ sp(drive(p, v, o, d), move(obj, orig, dest)) ∨ sp(drive(p, v, o, d), travel(p, o, d)) ∧ sp(travel(p, o, d), move(obj, orig, dest)) ≡ v = obj ∧ o = orig ∧ d = dest ∨ p = obj ∧ o = orig ∧ d = dest, which can be simplified as: for any variables p, v, o, d, obj, H Lemma 2 For a Dss , there exists an equivalent class Dss including SSAs of the syntactic form given in Reiter’s BAT (Reiter 2001): for each relational fluent F DH |= F (~x, do(a, s)) ≡ φF (~x, a, s) in which φF may have occurrences of the predicate sp∗ , there exists a uniform formula φ′F (~x, a, s) that does not mention sp∗ and such that D |= F (~x, do(a, s)) ≡ φ′F (~x, a, s). Proof: Assume that a given BAT D includes k action functions in total, say A1 (~v1 ), · · · , Ak (~vk ). For each relational fluent F (~x, s), assume that its SSA in DH is of the form (1) whose positive and negative effect conditions have the syntactic form (5), then we substitute a with each action function, say Ai (~vi ) (without loss of generality, we assume that variables in ~vi are all new variables never used in the SSA of F ), and in the RHS obtained by this substitution from the SSA of F (~x, do(Ai (~vi ), s)), replace every occurrence of sp∗ (that has two action functions as arguments) with its equivalent FO formula (that exists according to Lemma 1). This replacement results in an axiom of the following form sp∗ (drive(p, v, o, d), move(obj, o, d)) ≡ p = obj ∨v = obj. Modular BATs Our goal is to provide a more compact specification of a BAT based on a given hierarchy of actions. We will call such a modified BAT a modular BAT and denote it as DH , where H DH = Dap ∪ Dss ∪ DSH0 ∪ Σ ∪ Duna . Here, DSH0 = DS0 ∪H⋆ , in which H⋆ is the action hierarchy and DS0 describes the usual initial state, the same as Reiter’s H initial theory, and Dss is the new class of SSAs specified ⋆ based on H . In the sequel, let sp∗= (a, a′ ) be an abbreviation for either sp∗ (a, a′ ) or a = a′ in Formula (5) below. H The new syntactic form of SSAs in Dss can be different from Reiter’s format in Dss . Intuitively, instead of repeating tediously each individual action in the right-hand side (RHS) of a SSA for a fluent, say F (~x, do(a, s)), one can take advantage of the action hierarchies and describe the effect of the whole class of action functions at once. One can say that all those actions which are (distant) specializations of some generic action A(~y ) (actions from the branch going out of A(~y )), except those (distant) specializations of some other generic actions, say A(~ yl ) for 1 ≤ l ≤ h (i.e., excluding actions from some branches), can cause the same (positive or negative) effects on F under certain conditions. By doing so, we can represent the effects of actions more compactly. We will see later that this new form of SSAs leads to significant computational advantages as well. It is convenient to use the following notation related with a fluent F (~x, s): for any variable vector ~ yl (l ≥ 0), let ~zl = ~yl − ~x (i.e., ~zl are the new variables mentioned in ~yl but not in ~x). Note that, in the RHS of the SSA of F (~x, s), those new variables ~zl need to be existentially quantified. Formally speaking, the modified SSA of a relational fluent F (~x, s) has the format (1), where each ψi+ (~x, a, s) or ψi− (~x, a, s) has either the syntactic form (2) or the following syntactic form: (∃ ~ z0 )[sp⋆ (a, A(~ y0 ))∧γ(~ x, ~ z0 , s) ∧ h ^ F (~ x, do(Ai (~vi ), s)) ≡ ψi+ (~ x, ~vi , s) ∨ F (~ x, s) ∧ ¬ψi− (~ x, ~vi , s). Whenever ψi+ (~x, ~vi , s) (ψi− (~x, ~vi , s), respectively) are consistent conditions (SC formulas uniform in s), Ai (~vi ) has a positive effect (a negative effect) on F under such condition. Hence, the following yields the logically equivalent SSA of F in the usual BAT of (Reiter 2001): W F (~ x, do(a, s)) ≡ [ ki=1 (∃~vi )(a = Ai(~vi ) ∧ ψi+ (~ x, ~vi , s))]∨ Wk F (~ x, s) ∧ ¬[ j=1 (∃~vj )(a = Aj (~vj ) ∧ ψj− (~ x, ~vj , s))]. Notice that the above axiom can be simplified by removing inconsistent clauses. Hence the lemma is proved. We then have the following important property: Theorem 2 For each DH , there exists an equivalent D of the format given in (Reiter 2001), where equivalence means that for any FO regressable sentence W that has no occurrences of the predicate sp, DH |= W iff D |= W. Proof: Use Lemma 2. Here we provide some examples of the new way of representing SSAs, and compare them with Reiter’s format. ¬(∃ ~zl )sp⋆= (a, Al (~ yl ))]. (5) l=1 In (5), γ is a formula uniform in s that has ~x, ~z0 , s at most as its free variables. Notice that whenever ~zl (l ≥ 0) is empty, then there is no existential quantifier over ~zl . In addition, in (5), when index h = 0, the conjunction over l does not exist. One can prove that axiomatizers can always write modified SSAs in DH with h = 0 in (5). However, with negation Example 4 We continue with Example 1 (recall Figure 1). Consider a fluent f Cooked(x, s) (food x is cooked in the situation s), the modular BAT version of its SSA could be: f Cooked(x, do(a, s)) ≡ (∃y)sp∗(a, cook(x, y))∨f Cooked(x, s). Another example is a SSA for the fluent dirtyV es(x, s) (it will be false after washing a vessel x in some manner, or it will be true when x is used to prepare food or drink): 934 dirtyV es(y, do(a, s)) ≡ dirtyV es(y, s) ∧ ¬sp∗ (a, wash(y)) ∨ (∃x)sp∗(a, cook(x, y)) ∨ (∃x)a = makeSalad(x, y) ∨ (∃x)sp∗(a, prepDrink(x, y)). acyclic graph (DAG). Computing the FO formula equivalent to sp∗ (A1 (~x), A2 (~y )) for any pair of action functions A1 (~x) and A2 (~y) is similar to finding all paths from A1 to A2 in the corresponding digraph. The latter problem has the computational complexity of Θ(p) where p is the number of all the distinct edges in the digraph on any path from A1 to A2 , and therefore has a computational complexity of O(e) where e is the number of all edges in the digraph (i.e., e is the number of sp axioms in H). As a consequence, this yields the following encouraging and important result. We start with the case h = 0 in (5). Let φ(a, ~x, s) denote (∃~z0 )[sp∗ (a, A(~y0 )) ∧ γ(~x, ~z0 , s)]. In general, to provide an equivalent SSA of F (~x, s) in Reiter’s representation, φ(a, ~ x, s) has to be replaced by an uniform formula ψ(a, ~ x, s) of the form (∃~z0 )[ψsp (a, ~y0 ) ∧ γ(~x, ~z0 , s)]. Here, ψsp (a, ~y0 ) W might have the form (a = A(~y0 ) ∨ ni=A1 −1 (∃~zi )(a = Ai (~yi ) ∧ ψi (~ y0 , ~ yi ))), where each Ai (~ yi ) (1≤ i ≤ nA − 1) is a specialization of A(~y0 ) under the condition ψi (~y0 , ~yi ), nA is the total number of specializations of A. The formula ψsp (a, ~y0 ) is a logically equivalent replacement of sp∗ (a, A(~y0 )) in φ (see Lemma 2). Let the action diagram H in DH be acyclic and the corresponding action digraph rooted at A have a tree structure (the most general actions are considered as roots and the most specialized actions are considered as leaves). Then, the computational time of extended regression E [R[sp∗ (α, A(~t ′ , ~z0 ))]] in the clause φ(α, ~t, S), for any object terms ~t and any situation term do(α, S), is no worse than and (sometimes) can be exponentially faster than computational time of Reiter’s regression on ψsp in ψ(α, ~t, S). Theorem 4 If the sub-tree rooted at A in the digraph of H is a complete k-ary tree (k ≥ 2) with nA action functions as its nodes, the computational complexity of extended regression E[R[sp∗ (α, A(~t ′ , ~z0 ))]] is Θ(logk nA ), while the computational complexity of Reiter’s regression R[ψsp ] on an equivalent replacement is Θ(nA ). Proof: When a DAG of an action hierarchy has a tree structure or a forest structure, there is at most one path between any two action functions. In particular, assume that the digraph of the action hierarchy is a complete kary (k ≥ 2) tree structure and consider the regression of F (~t, do(α, S)) for any action term α and situation term S. To specify the equivalent SSA in Reiter’s format, φ(α, ~t, S) needs to be replaced by ψ(α, ~t, S). It is easy to see that one-step regression of the above clause in Reiter’s format takes Θ(nA ) steps (subsequently, additional time is required to regress recursively R[γ(~t, ~z0 , S)]). To perform onestep regression using the modular BAT format, it is sufficient to regress sp∗ (α, A(~t ′ , ~z0 )) ∧ γ(~t, ~z0 , S) (which takes Θ(1) time, excluding again the time that subsequently required to compute recursively R[γ(~t, ~z0 , S)]) and then replace sp∗ (α, A(~t ′ , ~z0 )) with the equivalent FO formula. Because this last replacement step can be considered as finding the path from α to A0 with the computational complexity of Θ(logk nA ). Finally, regression makes the same number of recursive calls in both cases: the formula γ is the same. The Reiter’s SSA for fluent f Cooked(x, s) (with a bigger taxonomy of actions, it will be much longer): f Cooked(x, do(a, s)) ≡ (∃y)[ a = cook(x, y)∨a = lowOilCook(x, y)∨a = steam(x, y) ∨a = boil(x, y)∨a = stew(x, y)∨a = broil(x, y) ∨a = bake(x, y)∨a = roast(x, y)∨a = ovenCook(x, y) ∨a = pressureCook(x, y)∨a = oilyCook(x, y) ∨a = f ry(x, y) ∨ a = deepF ry(x, y) ∨ a = stir(x, y) ∨a = parboil(x, y) ∨ a = grill(x, y)] ∨ f Cooked(x, s). We can also get a similar longer Reiter’s SSA for fluent dirtyV es(x, s) (details are omitted). The definitions of the regression operator and the regressable sentences in DH are all the same as in (Reiter 2001). Similar to the regression theorem (Reiter 2001), we have DH |= W iff DSH0 ∪ Duna |= R[W ] for any regressable sentence W . Let E R[W ] (called the extended regression of W ) be the operator that eliminates all occurrences (if any) of the sp∗ (A′ , A) predicate in R[W ] in favor of the corresponding FO formulas φA′ ,A that exists according to Lemma 1. Then, we have: Theorem 3 For each DH and for any FO regressable sentence W , DH |= W iff DS0 ∪ H ∪ Duna |= E R[W ] . This theorem is important because DSH0 ∪Duna (and hence D ) include the SO definition of the predicate sp∗ . However, all occurrences of sp∗ in sentence R[W ] can be replaced by FO sentences in finitely many steps according to the Lemma 1. Consequently, one can use regression in our modular BATs to reduce projection and executability problems to theorem proving in FOL only. H Advantages of Modular BATs Using action hierarchies and specifying BATs modularly not only provides a compact way of representing effects of actions, but sometimes leads to a more computationally efficient (than Reiter’s) solution of the projection problem. Example 5 Continuing with Example 4, consider a ground action α = deepF ry(Egg1, F ryingP an1 ), and the regression of f Cooked(Egg1 , do(α, S0 )). Using Reiter’s SSA for this fluent, regression involves checking 16 equality clauses between actions when regressing on the positive conditions in the SSA of f Cooked (see the axiom above). Using the modular BAT, extended regression of the positive conditions involves only 1 step of regression for predicate sp∗ , and finally the replacement of sp∗ (α, cook(Egg1 , y)) with the corresponding FO formula, i.e., the operator E, takes at most 4 steps of recursive computation (see Figure 1). Apart from specific examples, let us discuss in general the following problems: when we can actually gain computational advantages using action hierarchies and how much we can gain, whether there is any possible computational disadvantage in using action hierarchies alone, and if so, whether it can be avoided. According to the definition in the previous section, the digraph of an acyclic action diagram is in fact a directed However, if we do not allow the usage of (in)equality beH tween action terms (e.g., a = Ai ) in Dss , we may (some- 935 axiom for any relational fluent F (~y , s) has the syntactic form times) lose computational advantages when the effects of actions for some fluents only involve very few actions in a large taxonomy or when the structure of a DAG is not a forest, especially if it is close to a complete DAG. Since a dense DAG of n nodes has at most the order of n2 edges (a complete DAG of n nodes has n(n − 1)/2 edges in total), computing the FO formula equivalent to the predicate sp∗ has complexity O(n2 ). To avoid such computational disadvantages, we can easily use both (in)equality between action terms and predicate sp∗ in modular BATs. Whenever the (sub)digraph rooted at some action function symbol A has a tree structure (even mostly a tree structure with a few extra edges) and most of its specializations have the same effects under certain common conditions on some fluent F , one can use the sp∗ predicate for A to gain both the computational advantage and the advantage of compact representation when writing the SSA for F . Otherwise, one can use the (in)equality format to avoid computational disadvantages. Now, we illustrate the advantage of using the negated component in clause (5) (i.e., allowing h > 0). + ψA,F (~ x, ~ y , s) ⊃ F (~ y, do(A(~ x), s), and its negative effect axiom for F (~y , s) is of the form − ψA,F (~ x, ~y, s) ⊃ ¬F (~ y, do(A(~ x), s). (8) (9) Definition 4 For an action function A(~x) and a fluent F (~y, s), which has effect axioms of the form (8,9) we say that an action A has effect on a relational fluent F (or F could + be affected by A) iff either 6|= ψA,F (~ x, ~ y , s) ≡ F (~ y, s) or − 6|= ψA,F (~ x, ~ y , s) ≡ ¬F (~ y, s). For any action function A(~ x), a special meta-function Ne (A) is used to represent the number of fluents that can be affected by A. For any two action functions A1 (~x1 ) and A2 (~x2 ), we say that A1 causes no less effects than A2 iff there exists no fluent such that A1 has no effect on it but A2 has. We say that A1 causes more effects than A2 iff A1 has no less effects than A2 and there exists at least one fluent such that A1 has an effect on it but A2 does not. Note that if A1 causes more effects than A2 , then Ne (A1 ) > Ne (A2 ); however, it is not necessarily true the other way around: actions might affect different sets of fluents. Given effect axioms, for any pair of actions A1 , A2 , a straightforward linear time O(m) procedure can check whether A1 causes more effects than A2 . We would like to provide general guidelines on how an axiomatizer can construct an action diagram H for D. Under the assumption of monotonic inheritance, if A1 is a specialization of A2 , then it causes no less effects than A2 and Ne (A1 ) ≥ Ne (A2 ). Thus, to return a set H that represents an action diagram H, it is enough to start with generic actions A that have the smallest value of Ne (A) and proceed towards more specialized actions checking on each iteration if the next action we consider is a specialization of one of the previously considered actions. 1. Sort the action functions and get the sequence A1 (~x1 ), · · · , An (~ xn ) such that Ne (Ai1 ) ≤ Ne (Ai2 ) for i1 < i2 . 2. Initially, let i = 2 (index 1 ≤ i ≤ n) and H = ∅. 3. If i > n, then return H and terminate; else assign j = i and continue: look for Aj ’s that are generalizations of Ai . 4. Decrement j = j − 1. If j = 0 (i.e., all candidates Aj have been already considered), then increment i = i + 1 and go to step 3 (i.e., take a next action Ai+1 from the sorted sequence we obtained at step 1); else if Ne (Ai ) = Ne (Aj ), go to step 4; else continue. 5. For any pair of indices i, j such that 1 ≤ j ≤ i − 1, if there is a path in H from i to j, then we already know that Ai is a specialization of Aj and because specialization is a transitive relation there is no need to add a new directed edge from Ai to Aj and we can go to step 4; else continue. 6. If Ai (~xi ) is a specialization of Aj (~xj ) under FO condition φi,j , then update H = H ∪ {(Ai (~xi ), Aj (~xj ), φi,j )} and go to step 4; else do not change H and go to step 4. To implement the last step for any two action functions Ai (~xi ) and Aj (~xj ), provided that axiomatizers are able to write effect axioms of the form (8,9), we formulate the following principles to determine whether or not Ai (~xi ) is a specialization of Aj (~xj ) under some condition φ. Example 6 We continue with Example 5. Consider a fluent nonBBQ(x, s): x is cooked without grilling. Its SSA in DH can be written without negation in (5), i.e. when h = 0: nonBBQ(x, do(a, s)) ≡ (∃y)[sp∗ (a, oilyCook(x, y)) ∨ sp∗ (a, ovenCook(x, y)) ∨ a = steam(x, y) ∨ a = stew(x, y) ∨ sp∗ (a, boil(x, y))] ∨ nonBBQ(x, s)∧¬(∃y)a 6= grill(x, y) (6) Alternatively, it can be written with negation (i.e., h > 0) as: nonBBQ(x, do(a, s)) ≡ (∃y)[sp∗(a, oilyCook(x, y)) ∨ sp∗ (a, lowOilCook(x, y)) ∧ (∀z)(a 6= lowOilCook(x, z) ∧ a 6= grill(x, z))] ∨ nonBBQ(x, s)∧¬(∃y)a 6= grill(x, y) (7) Consider E[R[nonBBQ(Egg1 , do(α, S0 ))]], extended regression where α is the same as in Example 5. It takes 1 step less using Formula (7) than using Formula (6) during regression (regardless the quantifiers). Clearly, the more branches lowOilCook(x, y) has that have positive effects on nonBBQ(x, s) without extra context conditions, the more computational advantage we can obtain by allowing h ≥ 0 and using Formula (7) during regession. How to Construct a Taxonomy of Actions As we see, hierarchies of actions can lead to important computational advantages. An important practical question remains how an axiomatizer should approach the problem of constructing a hierarchy of actions given only a set of effect axioms which specify for each fluent what actions have a (positive or negative) effect on the fluent. In a somewhat similar vein, (Reiter 2001) starts from effect axioms and demonstrates that under the causal completeness assumption, SSAs can be constructed from effect axioms. We continue to consider only action diagrams H with monotonic inheritance of effects. In this subsection, we assume that all the variables used below in action functions and fluents are distinct from each other. Consider a BAT D which includes a set of n action functions, say {Ai (~xi ) | i = 1..n}, and m fluents, say {Fj (~yj ) | j = 1..m}, that might be affected by any of the above actions. For any action function A(~x), without loss of generality, we assume that its positive effect 936 a. If Ai causes more effects than Aj , “guess” a FO formula φ, whose free variables include at most ~xj and ~xi , such that for any relational fluent F (~y , s) that could be affected by both Ai and Aj , sitional dynamic logic). All actions in CLASP are represented in the style of STRIPS, which is less expressive than general Reiter’s BATs and our modular BATs. Our formalism is very different from all the papers mentioned above. We use a specialization relation between primitive action functions, and provide a formal axiomatization of the dynamic aspects of actions using full predicate logic (hence, our theory is quite expressive). Also, we gain both representational and computational advantages by using the action hierarchies. The extensive research on Hierarchical Task Networks (HTN), that can be traced to the pioneering work of Sacerdoti on ABSTRIPS, considers a completely different recursive decomposition of complex actions (i.e., plans, or nonprimitve tasks) into constituents, but does not explore large taxonomies of primitive actions and whether these taxonomies can provide any computational advantages when solving the projection problem. Our work is motivated in part by the well-known hierarchies of verbs (full troponym) in WordNet (Fellbaum 1998). Exploring connections with other frameworks (e.g., FrameNet, Levin’s taxonomy, VerbNet, etc) in computational linguistics and natural language processing is a possible direction for our future research. (Amir 2000) proposes and studies an object oriented version of the SC with the purpose of developing decomposed theories of actions, but he investigates a representation that is significantly different from our approach and considers neither taxonomies of actions nor BATs. In the future, we will explore how to combine Amir’s decomposed SC theories with our action hierarchies. Moreover, currently we consider only hierarchies of primitive actions. In the future, we may also consider hierarchies of complex actions (plans), and explore what criteria should be followed for constructing such hierarchies, whether we can construct them automatically from the existing hierarchies of primitive actions and our BATs. D |= φ ⊃ (P oss(Ai (~ xi ), s) ⊃ P oss(Aj (~ xj ), s)), + + D |= φ ⊃ (ψA (~ xi , ~ y, s) ≡ ψA (~ xj , ~y, s)), i ,F j ,F − − D |= φ ⊃ (ψA (~ xi , ~ y, s) ≡ ψA (~ xj , ~y, s)). i ,F j ,F If one can find such φ, then Ai is a specialization of Aj under the condition φ. b. Otherwise, Ai is not a specialization of Aj . Note that for any action functions A1 (~x), A2 (~y ) and FO formula φ, each (A1 (~x), A2 (~y), φ) in the returned set H corresponds to an axiom sp(A1 (~x), A2 (~y)) ≡ φ, and the collection of all these axioms results in an action diagram H. In general, to determine whether one action is a specialization of another under certain condition is undecidable. Hence, the axiomatizers have to observe the preconditions and effects of actions, guess formula φ (using their intuition), and construct action diagrams manually. In the future, we would like to consider whether it is possible to generate action diagrams automatically in some special cases. Discussion and Future Work There are a few papers related to our work that we would like to mention. (Lifschitz & Ren 2006) consider modular theories in the propositional action representation language C+, and address the problem of the development of libraries of reusable, general-purpose knowledge components. In comparison to them, we explore how to manage a large number of actions in the predicate logic using a hierarchical representation for actions in SC. We propose a representation, which not only facilitates writing axioms succinctly, but for realistic taxonomies can also gain computational advantages in solving the projection problem. (Kautz & Allen 1986) and subsequent papers of H.Kautz develop frameworks for plan recognition using hierarchies of plans, in which primitive action and plan instances belong to certain event types represented as unary predicates, and a hierarchy of plans is a collection of restricted-form axioms specifying relationships between various event types. However, their axiomatizations of actions (preconditions and effects of the actions) are limited and they do not address the projection problem. (Kaneiwa & Tojo 2005) give an ontological framework to represent actions/events and their hierarchical relationships in information systems using an order-sorted SO logic. In this framework, events (or actions) are represented as predicates rather than terms, and the authors consider taxonomical reasoning about relationships between events rather than reasoning about effects of actions. The authors do not provide axiomatizations of the dynamic aspects of actions and do not explore computational properties of their framework. (Devanbu & Litman 1996) propose a plan-based knowledge representation and reasoning system, called CLASP (CLAssification of Scenarios and Plans). CLASP extends the notions of subsumption from terminological languages to plans by allowing the construction of plans from concepts corresponding to actions and using plan description forming operators for choice, sequencing and looping (similar to propo- Acknowledgments Thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC) for partial financial support of this research. Thanks to Fahiem Bacchus, Hector Levesque, and Sheila McIlraith for comments on preliminary versions. References Amir, E. 2000. (De)Composition of Situation Calculus Theories. In Proceedings of the Seventienth National Conference on Artificial Intelligence (AAAI-00). AAAI. Devanbu, P. T., and Litman, D. J. 1996. Taxonomic plan reasoning. Artif. Intell. 84(1-2):1–35. Fellbaum, C. 1998. English verbs as a semantic net. In Fellbaum, C., ed., WordNet: An Electronic Lexical Database, with a preface by George Miller, Chapter 3. The MIT Press. 69–104. Kaneiwa, K., and Tojo, S. 2005. Logical aspects of events: Quantification, sorts,composition and disjointness. In Proceedings of Australasian Ontology Workshop (AOW 2005). Kautz, H. A., and Allen, J. F. 1986. Generalized plan recognition. In Proceedings of the fifth National Conference on Artificial Intelligence (AAAI-86), 32–37. AAAI Press. Lifschitz, V., and Ren, W. 2006. A modular action description language. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06). AAAI. Reiter, R. 2001. Knowledge in Action: Logical Foundations for Describing and Implementing Dynamical Systems. MIT Press. 937