Knowledge Representation and Reasoning Representação do Conhecimento e Raciocínio José Júlio Alferes
1 Part 1: Introduction
2 What is it?
• What data does an intelligent “agent” deal with? – Not just facts or tuples.
• How does an “agent” know what surrounds it? What are the rules of the game? – One must represent that “knowledge”.
• And what to do afterwards with that knowledge? How to draw conclusions from it? How to reason?
• Knowledge Representation and Reasoning is to AI what Algorithms and Data Structures are to Computation
3 What is it good for?
• Fundamental topic in Artificial Intelligence – Planning – Legal Knowledge – Model-Based Diagnosis
• Expert Systems
• Semantic Web (http://www.w3.org) – Reasoning on the Web (http://www.rewerse.com)
• Ontologies and data-modeling
4 What is this course about?
• Logic approaches to knowledge representation
• Issues in knowledge representation – semantics, expressivity, complexity
• Representation formalisms
• Forms of reasoning
• Methodologies
• Applications
5 Bibliography
• Will be pointed out as we go along (articles, surveys) in the summaries at the web page
• For the first part of the syllabus: – Reasoning with Logic Programming, J. J. Alferes and L. M. Pereira, Springer LNAI, 1996 – Nonmonotonic Reasoning, G. Antoniou, MIT Press, 1996
6 What prior knowledge?
• Computational Logic
• Introduction to Artificial Intelligence
• Logic Programming
7 Logic for KRR
• Logic is a language conceived for representing knowledge
• It was developed for representing mathematical knowledge
• What is appropriate for mathematical knowledge might not be so for representing common sense
• What is appropriate for mathematical knowledge might be too complex for modeling data
8 Mathematical knowledge vs common sense
• Complete vs incomplete knowledge – ∀x: x ∈ ℕ → x ∈ ℝ – go_Work → use_car
• Solid inferences vs default ones – In the face of incomplete knowledge – In emergency situations – In taxonomies – In legal reasoning – ...
9 Monotonicity of Logic
• Classical Logic is monotonic: if T ⊨ F then T ∪ T' ⊨ F
• This is a basic property which makes sense for mathematical knowledge
• But it is not desirable for knowledge representation in general!
10 Non-monotonic logics
• Do not obey that property
• Appropriate for Common Sense Knowledge
• Default Logic – Introduces default rules
• Autoepistemic Logic – Introduces (modal) operators which speak about knowledge and beliefs
• Logic Programming
11 Logics for Modeling
• Mathematical 1st order logics can be used for modeling data and concepts. E.g. – Define ontologies – Define (ER) models for databases
• Here monotonicity is not a problem – Knowledge is (assumed) complete
• But undecidability, complexity, and even notation might be a problem
12 Description Logics
• Can be seen as subsets of 1st order logic – Less expressive – Enough (and tailored) for describing concepts/ontologies – Decidable inference procedures – (arguably) more convenient notation
• Quite useful in data modeling
• New applications to the Semantic Web – Languages for the Semantic Web are in fact Description Logics!
13 In this course (revisited)
• Non-Monotonic Logics – Languages – Tools – Methodologies – Applications
• Description Logics – Idem…
14 Part 2: Default and Autoepistemic Logics
15 Default Logic
• Proposed by Ray Reiter (1980)
• The rule go_Work → use_car does not admit exceptions!
• Default rules do: go_Work : use_car / use_car
16 More examples
anniversary(X) ∧ friend(X) : give_gift(X) / give_gift(X)
friend(X,Y) ∧ friend(Y,Z) : friend(X,Z) / friend(X,Z)
accused(X) : innocent(X) / innocent(X)
17 Default Logic Syntax
• A theory is a pair (W,D), where: – W is a set of 1st order formulas – D is a set of default rules of the form: φ : ψ1, …, ψn / γ – φ (pre-requisite), the ψi (justifications) and γ (conclusion) are 1st order formulas
18 The issue of semantics
• If φ is true (where?) and all ψi are consistent (with what?) then γ becomes true (becomes? Wasn't it before?)
• Conclusions must: – be a closed set – contain W – apply the rules of D maximally, without becoming unsupported
19 Default extensions
• Γ(S) is the smallest set such that: – W ⊆ Γ(S) – Th(Γ(S)) = Γ(S) – if φ:ψ1,…,ψn/γ ∈ D, φ ∈ Γ(S) and ¬ψi ∉ S (for every i), then γ ∈ Γ(S)
• E is an extension of (W,D) iff E = Γ(E)
20 Quasi-inductive definition
• E is an extension iff E = ∪i Ei for: – E0 = W – Ei+1 = Th(Ei) ∪ {γ : φ:ψ1,…,ψn/γ ∈ D, φ ∈ Ei, ¬ψj ∉ E} (note: justifications are checked against the final E, not Ei)
21 Some properties
• (W,D) has an inconsistent extension iff W is inconsistent – If an inconsistent extension exists, it is unique
• If W ∪ Just(D) ∪ Conc(D) is consistent, then there is only a single extension
• If E is an extension of (W,D), then it is also an extension of (W ∪ E', D) for any E' ⊆ E
22 Operational semantics
• The computation of an extension can be reduced to finding a rule application order (without repetitions)
• P = (d1,d2,...) and P[k] is the initial segment of P with k elements
• In(P) = Th(W ∪ {conc(d) | d ∈ P}) – The conclusions after the rules in P are applied
• Out(P) = {¬ψ | ψ ∈ just(d), d ∈ P} – The formulas which may not become true after the application of the rules in P
23 Operational semantics (cont'd)
• d is applicable in P iff pre(d) ∈ In(P) and ¬ψ ∉ In(P) for every justification ψ of d
• P is a process iff ∀ dk ∈ P, dk is applicable in P[k-1]
• A process P is: – successful iff In(P) ∩ Out(P) = {}; otherwise it is failed – closed iff every d ∈ D applicable in P is in P
• Theorem: E is an extension iff there exists a successful and closed process P such that In(P) = E
24 Computing extensions (Antoniou, page 39)
extension(W,D,E) :- process(D,[],W,[],_,E,_).
process(D,PCur,InCur,OutCur,P,In,Out) :-
    getNewDefault(default(A,B,C),D,PCur),
    prove(InCur,[A]),
    not prove(InCur,[~B]),
    process(D,[default(A,B,C)|PCur],[C|InCur],[~B|OutCur],P,In,Out).
process(D,P,In,Out,P,In,Out) :-
    closed(D,P,In),
    successful(In,Out).
closed(D,P,In) :-
    not ( getNewDefault(default(A,B,C),D,P),
          prove(In,[A]),
          not prove(In,[~B]) ).
successful(In,Out) :- not ( member(B,Out), member(B,In) ).
getNewDefault(Def,D,P) :- member(Def,D), not member(Def,P).
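As an illustration (not part of the original slides), the guess-and-check reading of extensions on slides 19-21 can be sketched in Python for the special case where W and all default components are literals, so that Th reduces to plain set closure and inconsistency never triggers logical explosion. The function names and the string encoding `'-p'` for ¬p are my own assumptions.

```python
from itertools import chain, combinations

def neg(lit):
    # '-p' encodes the classical negation of atom 'p' (an assumed encoding)
    return lit[1:] if lit.startswith('-') else '-' + lit

def gamma(W, D, S):
    """Reiter's Gamma(S), restricted to theories whose formulas are literals.
    D contains triples (prerequisite, justifications, conclusion); a default
    fires when its prerequisite is derived and no justification is
    contradicted by S."""
    E = set(W)
    changed = True
    while changed:
        changed = False
        for pre, justs, concl in D:
            if pre in E and concl not in E and all(neg(j) not in S for j in justs):
                E.add(concl)
                changed = True
    return E

def extensions(W, D):
    """E is an extension iff E = Gamma(E); in this literal-only setting the
    candidates are W plus any subset of the defaults' conclusions."""
    concls = [c for _, _, c in D]
    exts = []
    for sub in chain.from_iterable(combinations(concls, k)
                                   for k in range(len(concls) + 1)):
        E = frozenset(W) | frozenset(sub)
        if gamma(W, D, E) == E and E not in exts:
            exts.append(E)
    return exts

# The Nixon diamond: conflicting defaults give two extensions,
# one with 'pacifist' and one with '-pacifist'.
W = ['quaker', 'republican']
D = [('quaker', ['pacifist'], 'pacifist'),
     ('republican', ['-pacifist'], '-pacifist')]
```

On this theory `extensions(W, D)` yields exactly the two expected extensions, mirroring the fixpoint check E = Γ(E) of slide 19.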
25 Normal theories
• Every rule has its justification identical to its conclusion
• Normal theories always have extensions
• If D grows, then the extensions grow (semi-monotonicity)
• They are not good for everything: – John is a recent graduate – Normally recent graduates are adult – Normally adults, not recently graduated, have a job (this cannot be coded with a normal rule!)
26 Problems
• No guarantee of extension existence
• Deficiencies in reasoning by cases – W = {italian ∨ french} – D = {italian:wine/wine, french:wine/wine}
• No guarantee of consistency among justifications – W = {broken(right) ∨ broken(left)} – D = {:usable(X), ¬broken(X) / usable(X)}
• Non-cumulativity – D = {:p/p, p∨q:¬p/¬p} – derives p (and hence p ∨ q), but after adding p ∨ q as a fact, p is no longer derived in every extension
27 Auto-Epistemic Logic
• Proposed by Moore (1985)
• Contemplates reflection on one's own knowledge (auto-epistemic)
• Allows for representing knowledge not just about the external world, but also about the knowledge I have of it
28 Syntax of AEL
• 1st Order Logic, plus the operator L (applied to formulas)
• L φ means “I know φ”
• Examples: MScOnSW → L MScOnSW (or L MScOnSW → MScOnSW) young(X) ∧ ¬L ¬studies(X) → studies(X)
29 Meaning of AEL
• What do I know? – What I can derive (in all models)
• And what do I not know? – What I cannot derive
• But what can be derived depends on what I know – Add knowledge, then test
30 Semantics of AEL
• T* is an expansion of theory T iff T* = Th(T ∪ {Lφ : T* ⊨ φ} ∪ {¬Lφ : T* ⊭ φ})
• Assuming the inference rule φ/Lφ: T* = CnAEL(T ∪ {¬Lφ : T* ⊭ φ})
• An AEL theory is always two-valued in L, that is, for every expansion: ∀φ, either Lφ ∈ T* or ¬Lφ ∈ T*
31 Knowledge vs.
Belief
• Belief is a weaker concept – For every formula, I know it or know it not – There may be formulas I neither believe in, nor believe their contrary
• The Auto-Epistemic Logic of knowledge and belief (AELB) also introduces the operator B φ – I believe in φ
32 AELB Example
• I rent a film if I believe I'm neither going to baseball nor football games: ¬B baseball ∧ ¬B football → rent_film
• I don't buy tickets if I neither know I'm going to baseball nor know I'm going to football: ¬L baseball ∧ ¬L football → ¬buy_tickets
• I'm going to football or baseball: baseball ∨ football
• I should not conclude that I rent a film, but I do conclude that I should not buy tickets
33 Axioms about beliefs
• Consistency Axiom: ¬B⊥
• Normality Axiom: B(F → G) → (B F → B G)
• Necessitation rule: F / B F
34 Minimal models
• In what do I believe? – In that which belongs to all preferred models
• Which are the preferred models? – Those that, for one same set of beliefs, have a minimal number of true things
• A model M is minimal iff there does not exist a smaller model N coincident with M on the Bφ and Lφ atoms
• When φ is true in all minimal models of T, we write T ⊨min φ
35 AELB expansions
• T* is a static expansion of T iff T* = CnAELB(T ∪ {¬Lφ : T* ⊭ φ} ∪ {Bφ : T* ⊨min φ}) where CnAELB denotes closure under the axioms of AELB plus necessitation for L
36 The special case of AEB
• Because of its properties, the case of theories without the knowledge operator is especially interesting
• Then the definition of expansion becomes: T* = ΨT(T*) where ΨT(T*) = CnAEB(T ∪ {Bφ : T* ⊨min φ}) and CnAEB denotes closure under the axioms of AEB
37 Least expansion
• Theorem: the operator Ψ is monotonic, i.e. T ⊆ T1 ⊆ T2 → ΨT(T1) ⊆ ΨT(T2)
• Hence, there always exists a least expansion of T, obtainable by transfinite induction: – T0 = CnAEB(T) – Ti+1 = ΨT(Ti) – Tβ = ∪α<β Tα (for limit ordinals β)
38 Consequences
• Every AEB theory has at least one expansion
• If a theory is affirmative (i.e.
all clauses have at least one positive literal) then it has at least one consistent expansion
• There is a procedure to compute the semantics
39 Part 3: Logic Programming for Knowledge Representation 3.1 Semantics of Normal Logic Programs
40 LP for Knowledge Representation
• Due to its declarative nature, LP has become a prime candidate for Knowledge Representation and Reasoning
• This has been more noticeable since its relations to other NMR formalisms were established
• For this usage of LP, a precise declarative semantics was in order
41 Language
• A Normal Logic Program P is a set of rules: H ← A1, …, An, not B1, …, not Bm (n,m ≥ 0) where H, the Ai and the Bj are atoms
• Literals not Bj are called default literals
• When no rule in P has default literals, P is called definite
• The Herbrand base HP is the set of all instantiated atoms from program P. We will consider programs as possibly infinite sets of instantiated rules.
42 Declarative Programming
• A logic program can be an executable specification of a problem: member(X,[X|Y]). member(X,[Y|L]) ← member(X,L).
• Easier to program, compact code
• Adequate for building prototypes
• Given efficient implementations, why not use it to “program” directly?
43 LP and Deductive Databases
• In a database, tables are viewed as sets of facts: flight(lisbon, adam). flight(lisbon, london). …
• Other relations are represented with rules: connection(A,B) ← flight(A,B). connection(A,B) ← flight(A,C), connection(C,B). chooseAnother(A,B) ← not connection(A,B).
44 LP and Deductive DBs (cont)
• LP allows storing, besides relations, rules for deducing other relations
• Note that default negation cannot be classical negation in: connection(A,B) ← flight(A,B). connection(A,B) ← flight(A,C), connection(C,B). chooseAnother(A,B) ← not connection(A,B).
• A form of Closed World Assumption (CWA) is needed for inferring non-availability of connections
45 Default Rules
• The representation of default rules, such as “All birds fly”, can be done via the non-monotonic operator not: flies(A) ← bird(A), not abnormal(A). bird(P) ← penguin(P). abnormal(P) ← penguin(P). bird(a). penguin(p).
46 The need for a semantics
• In all the previous examples, classical logic is not an appropriate semantics – In the 1st, it does not derive not member(3,[1,2]) – In the 2nd, it never concludes choosing another company – In the 3rd, all abnormalities must be expressed
• The precise definition of a declarative semantics for LPs is recognized as an important issue for its use in KRR.
47 2-valued Interpretations
• A 2-valued interpretation I of P is a subset of HP – A is true in I (i.e. I(A) = 1) iff A ∈ I – Otherwise, A is false in I (i.e. I(A) = 0)
• Interpretations can be viewed as representing possible states of knowledge.
• If knowledge is incomplete, there might be in some states atoms that are neither true nor false
48 3-valued Interpretations
• A 3-valued interpretation I of P is a set I = T ∪ not F, where T and F are disjoint subsets of HP – A is true in I iff A ∈ T – A is false in I iff A ∈ F – Otherwise, A is undefined (I(A) = 1/2)
• 2-valued interpretations are the special case where: HP = T ∪ F
49 Models
• Models can be defined via an evaluation function Î: – For an atom A, Î(A) = I(A) – For a formula F, Î(not F) = 1 - Î(F) – For formulas F and G: Î((F,G)) = min(Î(F), Î(G)), and Î(F ← G) = 1 if Î(G) ≤ Î(F), and = 0 otherwise
• I is a model of P iff, for all rules H ← B of P: Î(H ← B) = 1
50 Minimal Models Semantics
• The idea of this semantics is to minimize positive information. What is implied as true by the program is true; everything else is false.
ableMathematician(X) ← physicist(X). physicist(einstein). president(cavaco).
• {pr(c), pr(e), ph(c), ph(e), aM(c), aM(e)} is a model
• Lack of information that cavaco is a physicist should indicate that he isn't
• The minimal model is: {pr(c), ph(e), aM(e)}
51 Minimal Models Semantics
D [Truth ordering] For interpretations I and J, I ≤ J iff for every atom A, I(A) ≤ J(A), i.e. TI ⊆ TJ and FJ ⊆ FI
T Every definite logic program has a least (truth ordering) model.
D [minimal models semantics] An atom A is true in (definite) P iff A belongs to its least model. Otherwise, A is false in P.
52 TP operator
• The least model of a definite P can be computed (bottom-up) via the operator TP
D [TP] Let I be an interpretation of definite P. TP(I) = {H : (H ← Body) ∈ P and Body ⊆ I}
T If P is definite, TP is monotone and continuous. Its least fixpoint can be built by: I0 = {} and In = TP(In-1)
T The least model of definite P is TP↑ω({})
53 On Minimal Models
• SLD can be used as a proof procedure for the minimal models semantics: – If there is an SLD-derivation for A, then A is true – Otherwise, A is false
• The semantics does not apply to normal programs: – p ← not q has two minimal models: {p} and {q}. There is no least model!
54 The idea of completion
• In LP one uses “if” but means “iff” [Clark78]: naturalN(0). naturalN(s(N)) ← naturalN(N).
• This doesn't imply that -1 is not a natural number!
• With this program we mean: nN(X) ↔ (X = 0 ∨ (∃Y: X = s(Y) ∧ nN(Y)))
• This is the idea of Clark's completion: – Syntactically transform the if's into iff's – Use classical logic in the transformed theory to provide the semantics of the program
55 Program completion
• The completion of P is the theory comp(P) obtained by: – Replace p(t) ← φ by p(X) ← X = t, φ – Replace p(X) ← φ by p(X) ← ∃Y φ, where Y are the original variables of the rule – Merge all rules with the same head into a single one: p(X) ← φ1 ∨ … ∨ φn – For every q(X) without rules, add ¬q(X) – Replace p(X) ← φ by ∀X (p(X) ↔ φ)
56 Completion Semantics
D Let comp(P) be the completion of P, where not is interpreted as classical negation: – A is true in P iff comp(P) ⊨ A – A is false in P iff comp(P) ⊨ not A
• Though the completion's definition is not that simple, the idea behind it is quite simple
• Also, it defines a non-classical semantics by means of classical inference on a transformed theory
57 SLDNF proof procedure
• By adopting completion, procedurally we have: not is “negation as finite failure”
• In SLDNF proceed as in SLD. To prove not A: – If there is a finite derivation for A, fail not A – If, after any finite number of steps, all derivations for A fail, remove not A from the resolvent (i.e. succeed not A)
• SLDNF can be efficiently implemented (cf. Prolog)
58 SLDNF example
• Program: p ← p. q ← not p. a ← not b. b ← not c.
• [In the derivation trees: not c succeeds (c has no rules), so b succeeds and not b fails, hence a fails; for p there is no success nor finite failure (p ← p loops), so q ← not p is also undecided]
• According to completion: – comp(P) ⊨ {not a, b, not c} – comp(P) ⊭ p, comp(P) ⊭ not p – comp(P) ⊭ q, comp(P) ⊭ not q
59 Problems with completion
• Some consistent programs may become inconsistent: p ← not p becomes p ↔ ¬p
• It does not correctly deal with deductive closures: edge(a,b). edge(c,d). edge(d,c). reachable(a). reachable(A) ← edge(A,B), reachable(B).
• Completion doesn't conclude not reachable(c), due to the circularity caused by edge(c,d) and edge(d,c) – Circularity is a procedural concept, not a declarative one
60 Completion Problems (cont)
• Difficulty in representing equivalencies: bird(tweety). fly(B) ← bird(B), not abnormal(B). abnormal(B) ← irregular(B). irregular(B) ← abnormal(B).
• Completion doesn't conclude fly(tweety)! – Without the two equivalence rules, fly(tweety) is true – An explanation for this would be: “those rules cause a loop”. Again, looping is a procedural concept, not a declarative one
• When defining declarative semantics, procedural concepts should be rejected
61 Program stratification
• Minimal models don't have “loop” problems
• But they are only applicable to definite programs
• Generalize Minimal Models to Normal LPs: – Divide the program into strata – The 1st is a definite program. Compute its minimal model – Eliminate all nots whose truth value was thus obtained – The 2nd becomes definite. Compute its MM – …
62 Stratification example
P1: p ← p. a ← b. b.
P2: c ← not p. d ← c, not a.
P3: e ← a, not d. f ← not c.
• Least(P1) = {a, b, not p}
• Processing this, P2 becomes: c ← true. d ← c, false.
• Its minimal model, together with P1's, is: {a, b, c, not d, not p}
• Processing this, P3 becomes: e ← a, true. f ← false.
• The (desired) semantics for P is then: {a, b, c, not d, e, not f, not p}
63 Stratification
D Let S1,…,Sn be such that S1 ∪ … ∪ Sn = HP, all the Si are disjoint, and for every rule A ← B1,…,Bm, not C1,…, not Ck of P, if A ∈ Si then: • {B1,…,Bm} ⊆ ∪j≤i Sj • {C1,…,Ck} ⊆ ∪j≤i-1 Sj. Let Pi contain all rules of P whose head belongs to Si. P1,…,Pn is a stratification of P
64 Stratification (cont)
• A program may have several stratifications, e.g. for {a. b ← a. c ← not a.}: P1 = {a}, P2 = {b ← a}, P3 = {c ← not a}, or P1 = {a. b ← a}, P2 = {c ← not a}
• Or may have no stratification: b ← not a. a ← not b.
D A Normal Logic Program is stratified iff it admits (at least) one stratification.
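The stratum-by-stratum evaluation described above can be sketched in Python (my own illustration, not from the slides), on the example of slide 62. The rule encoding as (head, positive body, negated body) triples over ground atoms is an assumption of this sketch.

```python
def stratified_model(strata):
    """Standard model of a stratified program.  `strata` is a list of rule
    lists, one per stratum; each rule is (head, positive_body, negated_body),
    and every negated atom is defined in a strictly lower stratum, so its
    truth value is already fixed when the rule is evaluated."""
    M = set()
    for stratum in strata:
        changed = True
        while changed:  # least-model (fixpoint) computation of this stratum
            changed = False
            for head, pos, neg in stratum:
                if head not in M and all(p in M for p in pos) \
                                 and all(n not in M for n in neg):
                    M.add(head)
                    changed = True
    return M

# The example of slide 62:
#   P1: p <- p.  a <- b.  b.
#   P2: c <- not p.  d <- c, not a.
#   P3: e <- a, not d.  f <- not c.
strata = [
    [('p', ['p'], []), ('a', ['b'], []), ('b', [], [])],
    [('c', [], ['p']), ('d', ['c'], ['a'])],
    [('e', ['a'], ['d']), ('f', [], ['c'])],
]
```

Here `stratified_model(strata)` returns `{'a', 'b', 'c', 'e'}`, i.e. the model {a, b, c, not d, e, not f, not p} computed on the slide.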
65 Semantics of stratified LPs
D Let I|R be the restriction of interpretation I to the atoms in R, and P1,…,Pn be a stratification of P. Define the sequence: • M1 = least(P1) • Mi+1 is the minimal model of Pi+1 such that: Mi+1|(∪j≤i Sj) = Mi
Mn is the standard model of P
• A is true in P iff A ∈ Mn
• Otherwise, A is false
66 Properties of the Standard Model
Let MP be the standard model of stratified P:
T MP is unique (it does not depend on the stratification)
T MP is a minimal model of P
T MP is supported
D A model M of program P is supported iff: A ∈ M ⟹ ∃ (A ← Body) ∈ P : Body ⊆ M (true atoms must have a rule in P with true body)
67 Perfect models
• The original definition of stratification (Apt et al.) was made on predicate names rather than atoms.
• By abandoning the restriction to a finite number of strata, the definitions of Local Stratification and Perfect Models (Przymusinski) are obtained. This enlarges the scope of application: for even(0). even(s(X)) ← not even(X). we have P1 = {even(0)}, P2 = {even(1) ← not even(0)}, ...
• The program isn't stratified (even/1 depends negatively on itself) but is locally stratified.
• Its perfect model is: {even(0), not even(1), even(2), …}
68 Problems with stratification
• Perfect models are adequate for stratified LPs – Newer semantics are generalizations of it
• But there are (useful) non-stratified LPs: even(X) ← zero(X). zero(0). even(Y) ← suc(X,Y), not even(X). suc(X,s(X)). It is not stratified because (even(0) ← suc(0,0), not even(0)) ∈ P
• No stratification is possible if P has: pacifist(X) ← not hawk(X). hawk(X) ← not pacifist(X).
• This is useful in KR: “X is pacifist if it cannot be assumed that X is hawk, and vice-versa. If nothing else is said, it is undefined whether X is pacifist or hawk”
69 SLS procedure
• In perfect models not includes infinite failure
• SLS is a (theoretical) procedure for perfect models based on possibly infinite failure
• No complete implementation is possible (how to detect infinite failure?)
• Sound approximations exist: – based on loop checking (with ancestors) – based on tabulation techniques (cf. the XSB-Prolog implementation)
70 Stable Models Idea
• The construction of perfect models can be done without stratifying the program. Simply guess the model, process it into P and see if its least model coincides with the guess.
• If the program is stratified, the results coincide: – A correct guess must coincide on the 1st stratum – and on the 2nd (given the 1st), and on the 3rd, …
• But this can also be applied to non-stratified programs…
71 Stable Models Idea (cont)
• “Guessing a model” corresponds to “assuming default negations not”. This type of reasoning is usual in NMR: – Assume some default literals – Check in P the consequences of such assumptions – If the consequences completely corroborate the assumptions, they form a stable model
• The stable models semantics is defined as the intersection of all the stable models (i.e. what follows, no matter what stable assumptions)
72 SMs: preliminary example
a ← not b. b ← not a. c ← a. c ← b. p ← not q. q ← not r. r.
• Assume, e.g., not r and not p as true, and all other default literals as false. By processing this into P:
a ← false. b ← false. c ← a. c ← b. p ← false. q ← true. r.
• Its least model is {not a, not b, not c, not p, q, r}
• So it isn't a stable model: – By assuming not r, r becomes true – not a is not assumed and a becomes false
73 SMs example (cont)
a ← not b. b ← not a. c ← a. c ← b. p ← not q. q ← not r. r.
• Now assume, e.g., not b and not q as true, and all other default literals as false. By processing this into P:
a ← true. b ← false. c ← a. c ← b. p ← true. q ← false. r.
• Its least model is {a, not b, c, p, not q, r}
• It is a stable model
• The other one is {not a, b, c, p, not q, r}
• According to Stable Model Semantics: – c, r and p are true and q is false – a and b are undefined
74 Stable Models definition
D Let I be a (2-valued) interpretation of P.
The definite program P/I is obtained from P by:
• deleting all rules whose body contains a default literal not A with A ∈ I
• deleting from the bodies of the remaining rules all default literals
ΓP(I) = least(P/I)
D M is a stable model of P iff M = ΓP(M).
• A is true in P iff A belongs to all SMs of P
• A is false in P iff A doesn't belong to any SM of P (i.e. not A “belongs” to all SMs of P).
75 Properties of SMs
T Stable models are minimal models
T Stable models are supported
T If P is locally stratified then its single stable model is the perfect model
T The stable models semantics assigns meaning to (some) non-stratified programs – E.g. the one in the example before
76 Importance of Stable Models
Stable Models are an important contribution:
– They introduce the notion of default negation (versus negation as failure)
– They allow important connections to NMR. They started the area of LP&NMR
– They allow for a better understanding of the use of LPs in Knowledge Representation
– They introduce a new paradigm (and accompanying implementations) of LP
It is considered THE semantics of LPs by a significant part of the community. But...
77 Cumulativity
D A semantics Sem is cumulative iff for every P: if A ∈ Sem(P) and B ∈ Sem(P) then B ∈ Sem(P ∪ {A}) (i.e. all derived atoms can be added as facts, without changing the program's meaning)
• This property is important for implementations: – without cumulativity, tabling methods cannot be used
78 Relevance
D A directly depends on B if B occurs in the body of some rule with head A. A depends on B if A directly depends on B or there is a C such that A directly depends on C and C depends on B.
D A semantics Sem is relevant iff for every P: A ∈ Sem(P) iff A ∈ Sem(RelA(P)), where RelA(P) contains all rules of P whose head is A or some B on which A depends.
• Only this property allows for the usual top-down execution of logic programs.
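The reduct-based definition of stable models (slide 74) can be sketched in Python for ground programs (my own illustration; the (head, positive_body, negated_body) encoding and the brute-force enumeration of candidate interpretations are assumptions of the sketch, not a practical solver).

```python
from itertools import chain, combinations

def reduct(program, I):
    """Gelfond-Lifschitz reduct P/I: delete rules containing a default
    literal not A with A in I, then drop all remaining default literals."""
    return [(h, pos) for h, pos, neg in program if not any(a in I for a in neg)]

def least(definite):
    """Least model of a definite program, by naive TP iteration."""
    M = set()
    changed = True
    while changed:
        changed = False
        for h, pos in definite:
            if h not in M and all(p in M for p in pos):
                M.add(h)
                changed = True
    return M

def stable_models(program):
    """M is stable iff M = least(P/M); check every subset of the atoms."""
    atoms = sorted({h for h, _, _ in program})
    cands = chain.from_iterable(combinations(atoms, k)
                                for k in range(len(atoms) + 1))
    return [set(I) for I in cands if least(reduct(program, set(I))) == set(I)]

# The program of slides 72-73:
#   a <- not b.  b <- not a.  c <- a.  c <- b.  p <- not q.  q <- not r.  r.
P = [('a', [], ['b']), ('b', [], ['a']), ('c', ['a'], []), ('c', ['b'], []),
     ('p', [], ['q']), ('q', [], ['r']), ('r', [], [])]
```

On this program `stable_models(P)` finds exactly the two stable models of the slides, {a, c, p, r} and {b, c, p, r}, so c, p and r are true, q is false, and a and b are undefined.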
79 Problems with SMs
• They don't provide a meaning to every program: – P = {a ← not a} has no stable models
• They are non-cumulative and non-relevant:
a ← not b. b ← not a. c ← not a. c ← not c.
The only SM is {not a, b, c}
– However b is not true in P ∪ {c} (non-cumulativity) • P ∪ {c} has 2 SMs: {not a, b, c} and {a, not b, c}
– b is not true in Relb(P) (non-relevance) • The rules in Relb(P) are the first two above • Relb(P) has 2 SMs: {not a, b} and {a, not b}
80 Problems with SMs (cont)
• Deciding whether a program has a stable model is NP-complete
• The intersection of SMs is non-supported:
a ← not b. b ← not a. c ← a. c ← b.
c is true but neither a nor b is true.
• Note that the perfect model semantics: – is cumulative – is relevant – is supported – its computation is polynomial
81 Part 3: Logic Programming for Knowledge Representation 3.2 Answer-Set Programming
82 Programming with SMs
• A new paradigm of problem representation with Logic Programming (Answer-Set Programming – ASP) – A problem is represented as (part of) a logic program (intensional database) – An instance of a problem is represented as a set of facts (extensional database) – Solutions of the problem are the (stable) models of the complete program
• In Prolog – A problem is represented by a program – Instances are given as queries – Solutions are substitutions
83 Finding subsets
• In Prolog:
subSet([],_).
subSet([E|Ss],[_|S]) :- subSet([E|Ss],S).
subSet([E|Ss],[E|S]) :- subSet(Ss,S).
?- subSet(X,[1,2,3]).
• In ASP: – Program: in_sub(X) :- element(X), not out_sub(X). out_sub(X) :- element(X), not in_sub(X). – Facts: element(1). element(2). element(3). – Each stable model represents one subset.
• Which one do you find more declarative?
84 Generation of Stable Models
• A pair of rules a :- not b. b :- not a. generates two stable models: one with a and another with b.
• Rules: a(X) :- elem(X), not b(X). b(X) :- elem(X), not a(X).
with elem(X) having N solutions generate 2^N stable models
85 Small subsets
• From the previous program, eliminate stable models with more than one member – I.e. eliminate all stable models where in_sub(X), in_sub(Y), X ≠ Y
• Just add the rule:
foo :- element(X), in_sub(X), in_sub(Y), not eq(X,Y), not foo.   % eq(X,X).
• Since there is no notion of query, it is very important to guarantee that it is possible to ground programs. – All variables appearing in a rule must appear in a predicate that defines the domains and makes it possible to ground the rule (in this case, the element(X) predicate).
86 Restricting Stable Models
• A rule a :- cond, not a. eliminates all stable models where cond is true.
• In most ASP solvers, this is simply written as an integrity constraint :- cond.
• An ASP program usually has: – A part defining the domain (and the specific instance of the problem) – A part generating models – A part eliminating models
87 N-Queens
• Place N queens on an NxN chess board so that no queen attacks another.
% Generating models
hasQueen(X,Y) :- row(X), column(Y), not noQueen(X,Y).
noQueen(X,Y) :- row(X), column(Y), not hasQueen(X,Y).
% Eliminating models
% No 2 queens in the same row or column or diagonal
:- row(X), column(Y), row(XX), hasQueen(X,Y), hasQueen(XX,Y), not eq(X,XX).
:- row(X), column(Y), column(YY), hasQueen(X,Y), hasQueen(X,YY), not eq(Y,YY).
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
% All rows must have at least one queen
:- row(X), not hasQueen(X).
hasQueen(X) :- row(X), column(Y), hasQueen(X,Y).
88 The facts (in smodels)
• Define the domain of predicates and the specific program
• Possible to write in abbreviated form and by resorting to constants:
const size=8.
column(1..size).
row(1..size).
hide.
show hasQueen(X,Y).
• Solutions by: > lparse -c size=4 | smodels 0
89 N-Queens version 2
• Generate less, such that no two queens appear in the same row or column.
% Generating models
hasQueen(X,Y) :- row(X), column(Y), not noQueen(X,Y).
noQueen(X,Y) :- row(X), column(Y), column(YY), not eq(Y,YY), hasQueen(X,YY).
noQueen(X,Y) :- row(X), column(Y), row(XX), not eq(X,XX), hasQueen(XX,Y).
• This already guarantees that all rows have a queen. Elimination of models is only needed for diagonals:
% Eliminating models
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
90 Back to subsets
in_sub(X) :- element(X), not out_sub(X).
out_sub(X) :- element(X), not in_sub(X).
• Generate subsets with at most 2 elements:
:- element(X), element(Y), element(Z), not eq(X,Y), not eq(Y,Z), not eq(X,Z), in_sub(X), in_sub(Y), in_sub(Z).
• Generate subsets with at least 2 elements:
hasTwo :- element(X), element(Y), not eq(X,Y), in_sub(X), in_sub(Y).
:- not hasTwo.
• It could be done for any maximum and minimum
• Smodels has a simplified notation for that: 2 {in_sub(X) : element(X)} 2.
91 Simplified notation in Smodels
• Generate models with between N and M elements of P(X) that satisfy Q(X), given R: N {P(X):Q(X)} M :- R.
• Example:
% Exactly one hasQueen(X,Y) per model for each column(Y)
1 {hasQueen(X,Y):row(X)} 1 :- column(Y).
% Same for rows
1 {hasQueen(X,Y):column(Y)} 1 :- row(X).
% Elimination in diagonal
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
92 Graph colouring
• Problem: find all colourings of a map of countries using not more than 3 colours, such that neighbouring countries are not given the same colour.
• The predicate arc connects two countries.
• Use ASP rules to generate colourings, and integrity constraints to eliminate unwanted solutions
93 Graph colouring
arc(minnesota, wisconsin). arc(illinois, michigan). arc(illinois, indiana). arc(michigan, indiana). arc(michigan, wisconsin). arc(wisconsin, iowa). arc(illinois, iowa). arc(illinois, wisconsin). arc(indiana, ohio). arc(michigan, ohio). arc(minnesota, iowa).
arc(minnesota, michigan).
col(Country, Colour)?
94 Graph colouring
% auxiliary
con(X,Y) :- arc(X,Y).
con(X,Y) :- arc(Y,X).
node(N) :- con(N,C).
% generate
col(C,red) :- node(C), not col(C,blue), not col(C,green).
col(C,blue) :- node(C), not col(C,red), not col(C,green).
col(C,green) :- node(C), not col(C,blue), not col(C,red).
% eliminate
:- colour(C), con(C1,C2), col(C1,C), col(C2,C).
95 One colouring solution
Answer: 1
Stable Model: col(minnesota,blue) col(wisconsin,green) col(michigan,red) col(indiana,green) col(illinois,blue) col(iowa,red) col(ohio,blue)
96 Hamiltonian paths
• Given a graph, find all Hamiltonian paths
arc(a,b). arc(a,d). arc(b,a). arc(b,c). arc(d,b). arc(d,c).
97 Hamiltonian paths
% Subsets of arcs
in_arc(X,Y) :- arc(X,Y), not out_arc(X,Y).
out_arc(X,Y) :- arc(X,Y), not in_arc(X,Y).
% Nodes
node(N) :- arc(N,_).
node(N) :- arc(_,N).
% Notion of reachable
reachable(X) :- initial(X).
reachable(X) :- in_arc(Y,X), reachable(Y).
98 Hamiltonian paths
% initial is one (and only one) of the nodes
initial(N) :- node(N), not non_initial(N).
non_initial(N) :- node(N), not initial(N).
:- initial(N1), initial(N2), not eq(N1,N2).
% In Hamiltonian paths all nodes are reachable
:- node(N), not reachable(N).
% Paths must be connected subsets of arcs
% I.e. an arc from X to Y can only belong to the path if X is reachable
:- arc(X,Y), in_arc(X,Y), not reachable(X).
% No node can be visited more than once
:- node(X), node(Y), node(Z), in_arc(X,Y), in_arc(X,Z), not eq(Y,Z).
99 Hamiltonian paths (solutions)
{in_arc(b,a), in_arc(a,d), in_arc(d,c)}
{in_arc(a,d), in_arc(d,b), in_arc(b,c)}
100 ASP vs. Prolog-like programming
• ASP is adequate for: – NP-complete problems – situations where the whole program is relevant for the problem at hand
• If the problem is polynomial, why use such a complex system? If only part of the program is relevant for the desired query, why compute the whole model?
101 ASP vs.
Prolog
• For such problems top-down, goal-driven mechanisms seem more adequate
• This type of mechanism is used by Prolog – Solutions come as variable substitutions rather than complete models – The system is activated by queries – No global analysis is made: only the relevant part of the program is visited
102 Problems with Prolog
• Prolog's declarative semantics is the completion – All the problems of completion are inherited by Prolog
• Under SLDNF, termination is not guaranteed, even for Datalog programs (i.e. programs with a finite ground version)
• A proper semantics is still needed
103 Part 3: Logic Programming for Knowledge Representation 3.3 The Well Founded Semantics
104 Well Founded Semantics
• Defined in [GRS90], generalizes SMs to 3-valued models.
• Note that: – there are programs with no fixpoints of Γ – but all have fixpoints of Γ²
For P = {a ← not a}: Γ({a}) = {} and Γ({}) = {a}, so there are no stable models; but Γ²({}) = {} and Γ²({a}) = {a}
105 Partial Stable Models
D A 3-valued interpretation (T ∪ not F) is a PSM of P iff: • T = ΓP²(T) • T ⊆ ΓP(T) • F = HP - ΓP(T)
The 2nd condition guarantees that no atom is both true and false: T ∩ F = {}
• P = {a ← not a} has a single PSM: {}
• a ← not b. b ← not a. c ← not a. c ← not c. This program has 3 PSMs: {}, {a, not b} and {b, c, not a}. The 3rd corresponds to its single SM
106 WFS definition
T [WF Model] Every P has a knowledge-ordering least PSM, obtainable by the transfinite sequence: – T0 = {} – Ti+1 = Γ²(Ti) – Tδ = ∪α<δ Tα, for limit ordinals δ
Let T be the least fixpoint so obtained. MP = T ∪ not (HP - Γ(T)) is the well founded model of P.
107 Well Founded Semantics
• Let M be the well founded model of P: – A is true in P iff A ∈ M – A is false in P iff not A ∈ M – Otherwise (i.e. A ∉ M and not A ∉ M), A is undefined in P
108 WFS Properties
• Every program is assigned a meaning
• Every SM is a PSM and extends the WFM – If the WFM is total, it coincides with the single SM
• It is sound wrt the SM semantics – If P has stable models and A is true (resp.
false) in the WFM, it is also true (resp. false) in the intersection of SMs • The WFM coincides with the perfect model in locally stratified programs (and with the least model in definite programs) 109 More WFS Properties • The WFM is supported • WFS is cumulative and relevant • Its computation is polynomial (on the number of instantiated rules of P) • There are top-down proof-procedures, and sound implementations – these are mentioned in the sequel 110 Part 3: Logic Programming for Knowledge representation 3.4 Comparison to other Non-Monotonic Formalisms 111 LP and Default Theories D Let DP be the default theory obtained by transforming every rule H ← B1,…,Bn, not C1,…, not Cm into: B1,…,Bn : ¬C1,…, ¬Cm / H T There is a one-to-one correspondence between the SMs of P and the default extensions of DP T If L ∈ WFM(P) then L belongs to every extension of DP 112 LPs as defaults • LPs can be viewed as sets of default rules • Default literals are the justifications: – they can be assumed if it is consistent to do so – they are withdrawn if inconsistent • In this reading of LPs, ← is not viewed as implication. Instead, LP rules are viewed as inference rules.
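The G-based constructions on the preceding slides (G, G², and the least fixpoint giving the WFM) can be sketched operationally. The encoding below is my own, not course code: a ground program is a dict mapping each atom to a list of bodies, and a body is a list of literals with default negation written as the string "not x".

```python
# Sketch: computing the well-founded model of a ground normal program
# by iterating the Gelfond-Lifschitz operator G, as on slides 104-107.

def gl_operator(rules, s):
    """G(S): least model of the reduct of the program by S."""
    reduct = []
    for head, bodies in rules.items():
        for body in bodies:
            neg = {l[4:] for l in body if l.startswith("not ")}
            if neg.isdisjoint(s):  # keep rule, delete its default literals
                reduct.append((head, [l for l in body if not l.startswith("not ")]))
    model, changed = set(), True
    while changed:  # forward chaining on the definite reduct
        changed = False
        for head, pos in reduct:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def well_founded_model(rules):
    """T = least fixpoint of G²; F = HB - G(T)."""
    herbrand = set(rules) | {l[4:] if l.startswith("not ") else l
                             for bs in rules.values() for b in bs for l in b}
    t = set()
    while True:
        t2 = gl_operator(rules, gl_operator(rules, t))
        if t2 == t:
            break
        t = t2
    return t, herbrand - gl_operator(rules, t)

# P = {a <- not a}: no stable model, and a is undefined in the WFM.
print(well_founded_model({"a": [["not a"]]}))
```

On P = {a ← not a} both T and F come out empty, i.e. a is undefined, matching the single PSM {} of slide 105.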
113 LP and Auto-Epistemic Logic D Let TP be the AEL theory obtained by transforming every rule H ← B1,…,Bn, not C1,…, not Cm into: B1 ∧ … ∧ Bn ∧ ¬L C1 ∧ … ∧ ¬L Cm → H T There is a one-to-one correspondence between the SMs of P and the (Moore) expansions of TP T If L ∈ WFM(P) then L belongs to every expansion of TP 114 LPs as AEL theories • LPs can be viewed as theories that refer to their own knowledge • Default negation not A is interpreted as “A is not known” • The LP rule symbol ← is here viewed as material implication 115 LP and AEB D Let TP be the AEB theory obtained by transforming every rule H ← B1,…,Bn, not C1,…, not Cm into: B1 ∧ … ∧ Bn ∧ B¬C1 ∧ … ∧ B¬Cm → H T There is a one-to-one correspondence between the PSMs of P and the AEB expansions of TP T A ∈ WFM(P) iff A is in every expansion of TP not A ∈ WFM(P) iff B¬A is in all expansions of TP 116 LPs as AEB theories • LPs can be viewed as theories that refer to their own beliefs • Default negation not A is interpreted as “It is believed that A is false” • The LP rule symbol ← is also viewed as material implication 117 SM problems revisited • The mentioned problems of SMs are not necessarily problems: – Relevance is not desired when analyzing global problems – If the SMs are equated with the solutions of a problem, then some problems simply have no solution – Some problems are NP. So using an NP language is not a problem. – In the case of NP problems, the efficiency gains from cumulativity are not really an issue.
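The "solutions are models" reading of stable models can be made concrete with a naive guess-and-check enumerator: S is stable iff G(S) = S. This is an illustrative sketch of mine (exponential, as expected for an NP formalism), using the same dict encoding of ground programs as before.

```python
# Sketch: enumerate stable models of a ground normal program by testing
# G(S) = S over all candidate atom sets.
from itertools import chain, combinations

def gl_operator(rules, s):
    """G(S): least model of the Gelfond-Lifschitz reduct by S."""
    reduct = []
    for head, bodies in rules.items():
        for body in bodies:
            neg = {l[4:] for l in body if l.startswith("not ")}
            if neg.isdisjoint(s):
                reduct.append((head, [l for l in body if not l.startswith("not ")]))
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in reduct:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def stable_models(rules):
    atoms = sorted(set(rules) | {l[4:] if l.startswith("not ") else l
                                 for bs in rules.values() for b in bs for l in b})
    subsets = chain.from_iterable(combinations(atoms, r) for r in range(len(atoms) + 1))
    return [set(s) for s in subsets if gl_operator(rules, set(s)) == set(s)]

# a <- not b, b <- not a: two stable models, {a} and {b}.
print(stable_models({"a": [["not b"]], "b": [["not a"]]}))
# a <- not a: no stable model at all.
print(stable_models({"a": [["not a"]]}))
```

The second query returns the empty list, showing a program to which stable model semantics assigns no meaning, while the WFM still leaves a undefined.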
118 SM versus WFM • They yield different forms of programming and of representing knowledge, for usage with different purposes • Usage of the WFM: – Closer to that of Prolog – When local reasoning (and relevance) are important – When efficiency is an issue, even at the cost of expressivity • Usage of SMs: – For representing NP-complete problems – Global reasoning – A different form of programming, not close to that of Prolog • Solutions are models, rather than answers/substitutions 119 Part 3: Logic Programming for Knowledge representation 3.5 Extended Logic Programs 120 Extended LPs • In Normal LPs all the negative information is implicit. Though that’s desired in some cases (e.g. the database with flight connections), sometimes an explicit form of negation is needed for Knowledge Representation • “Penguins don’t fly” could be: noFly(X) ← penguin(X) • This does not relate fly(X) and noFly(X) in: fly(X) ← bird(X) noFly(X) ← penguin(X) For establishing such relations, and for representing negative information, a new form of negation is needed in LP: Explicit negation - ¬ 121 Extended LP: motivation • ¬ is also needed in bodies: “Someone is guilty if he is not innocent” – cannot be represented by: guilty(X) ← not innocent(X) – This would imply guilty in the absence of information about innocent – Instead, guilty(X) ← ¬innocent(X) only implies guilty(X) if X is proven not to be innocent • The difference between not p and ¬p is essential whenever the information about p cannot be assumed to be complete 122 ELP motivation (cont) • ¬ allows for greater expressivity: “If you’re not sure that someone is not innocent, then further investigation is needed” – can be represented by: investigate(X) ← not ¬innocent(X) • ¬ extends the relation of LP to other NMR formalisms. E.g.: – it can represent default rules with negative conclusions and pre-requisites, and positive justifications – it can represent normal default rules 123 Explicit versus Classical ¬ • Classical ¬ complies with the “excluded middle” principle (i.e.
F v ¬F is tautological) – This makes sense in mathematics – What about in common sense knowledge? • ¬A is the opposite of A. • The “excluded middle” leaves no room for undefinedness. The “excluded middle” implies that hire(X) ← qualified(X) reject(X) ← ¬qualified(X) makes every X either hired or rejected. It leaves no room for those about whom further information is needed to determine if they are qualified 124 ELP Language • An Extended Logic Program P is a set of rules: L0 ← L1, …, Lm, not Lm+1, …, not Ln (n,m ≥ 0) where the Li are objective literals • An objective literal is an atom A or its explicit negation ¬A • Literals not Lj are called default literals • The Extended Herbrand base HP is the set of all instantiated objective literals from program P • We will consider programs as possibly infinite sets of instantiated rules. 125 ELP Interpretations • An interpretation I of P is a set I = T U not F where T and F are disjoint subsets of HP and ¬L ∈ T ⇒ L ∈ F (Coherence Principle) i.e. if L is explicitly false, it must be assumed false by default • I is total iff HP = T U F • I is consistent iff there is no L such that {L, ¬L} ⊆ T – In total consistent interpretations the Coherence Principle is trivially satisfied 126 Answer sets • It was the 1st semantics for ELPs [Gelfond&Lifschitz90] • Generalizes stable models to ELPs D Let M- be a stable model of the normal program P- obtained by replacing in the ELP P every ¬A by a new atom A-. An answer-set M of P is obtained by replacing A- by ¬A in M- • A is true in an answer set M iff A ∈ M • A is false iff ¬A ∈ M • Otherwise, A is unknown • Some programs have no consistent answer sets: – e.g.
P = {a ¬a } 127 Answer sets and Defaults D Let DP be the default theory obtained by transforming: L0 L1,…,Lm, not Lm+1,…, not Ln into: L1,…,Lm : ¬Lm+1,…, ¬Ln L0 where ¬¬A is (always) replaced by A T There is a one-to-one correspondence between the answer-sets of P and the default extensions of DP 128 Answer-sets and AEL D Let TP be the AEL theory obtained by transforming: L0 L1,…,Lm, not Lm+1,…, not Ln into: L1 L L1 … Lm L Lm ¬ L Lm+1 … ¬ L Lm (L0 L L0) T There is a one-to-one correspondence between the answer-sets of P and the expansions of TP 129 The coherence principle • Generalizing WFS in the same way yields unintuitive results: pacifist(X) not hawk(X) hawk(X) not pacifist(X) ¬pacifist(a) – Using the same method the WFS is: {¬pacifist(a)} – Though it is explicitly stated that a is non-pacifist, not pacifist(a) is not assumed, and so hawk(a) cannot be concluded. • Coherence is not satisfied... Coherence must be imposed 130 Imposing Coherence • Coherence is: ¬L T L F, for objective L • According to the WFS definition, everything is false that doesn’t belong to G(T) • To impose coherence, when applying G(T) simply delete all rules for the objective complement of literals in T “If L is explicitly true then when computing undefined literals forget all rules with head ¬L” 131 WFSX definition D The semi-normal version of P, Ps, is obtained by adding not ¬L to every rule of P with head L D An interpretation (T U not F) is a PSM of ELP P iff: • T = GPGPs(T) • T GPs(T) • F = HP - GPs(T) T The WFSX semantics is determined by the knowledge ordering least PSM (wrt ) 132 WFSX example Ps: pacifist(X) not hawk(X), not ¬pacifist(X) hawk(X) not pacifist(X ), not ¬hawk(X) ¬pacifist(a) not pacifist(a) P: pacifist(X) not hawk(X) hawk(X) not pacifist(X) ¬pacifist(a) T0 = {} Gs(T0) = {¬p(a),p(a),h(a),p(b),h(b)} T1 = {¬p(a)} Gs(T1) = {¬p(a),h(a),p(b),h(b)} T2 = {¬p(a),h(a)} T3 = T2 The WFM is: {¬p(a),h(a), not p(a), not ¬h(a), not ¬p(b), not ¬h(b)} 133 Properties of WFSX • Complies 
with the coherence principle • Coincides with WFS in normal programs • If WFSX is total it coincides with the only answer-set • It is sound wrt answer-sets • It is supported, cumulative, and relevant • Its computation is polynomial • It has sound implementations (cf. below) 134 Inconsistent programs • Some ELPs have no WFM. E.g. { a ¬a } • What to do in these cases? Explosive approach: everything follows from contradiction • taken by answer-sets • gives no information in the presence of contradiction Belief revision approach: remove contradiction by revising P • computationally expensive Paraconsistent approach: isolate contradiction • efficient • allows to reason about the non-contradictory part 135 WFSXp definition • The paraconsistent version of WFSx is obtained by dropping the requirement that T and F are disjoint, i.e. dropping T GPs(T) D An interpretation, T U not F, is a PSMp P iff: • T = GPGPs(T) • F = HP - GPs(T) T The WFSXp semantics is determined by the knowledge ordering least PSM (wrt ) 136 WFSXp example Ps: c not b, not ¬c b a, not ¬b d not e , not ¬d a not ¬a ¬a not a P: c not b ba d not e a ¬a T0 = {} Gs(T0) = {¬a,a,b,c,d} T1 = {¬a,a,b,d} Gs(T1) = {d} T2 = {¬a,a,b,c,d} T3 = T2 The WFM is: {¬a,a,b,c,d, not a, not ¬a, not b, not ¬b not c, not ¬c, not ¬d, not e} 137 Surgery situation • A patient arrives with: sudden epigastric pain; abdominal tenderness; signs of peritoneal irritation • The rules for diagnosing are: – if he has sudden epigastric pain abdominal tenderness, and signs of peritoneal irritation, then he has perforation of a peptic ulcer or an acute pancreatitis – the former requires surgery, the latter therapeutic treatment – if he has high amylase levels, then a perforation of a peptic ulcer can be exonerated – if he has Jobert’s manifestation, then pancreatitis can be exonerated – In both situations, the pacient should not be nourished, but should take H2 antagonists 138 LP representation perforation pain, abd-tender, per-irrit, not 
high-amylase pancreat pain, abd-tender, per-irrit, not jobert ¬nourish perforation h2-ant perforation ¬nourish pancreat h2-ant pancreat surgery perforation anesthesia surgery ¬surgery pancreat pain. abd-tender. per-irrit. ¬high-amylase. ¬jobert. The WFM is: {pain, abd-tender, per-irrit, ¬high-am, ¬jobert , not ¬pain, not ¬abd-tender, not ¬per-irrit, not high-am, not jobert, ¬nourish, h2-ant, not nourish, not ¬h2-ant, surgery, ¬surgery, not surgery, not ¬surgery, anesthesia, not anesthesia, not ¬anesthesia } 139 Results interpretation The WFM is: {pain, abd-tender, per-irrit, ¬high-am, ¬jobert , …, ¬nourish, h2-ant, not nourish, not ¬h2-ant, surgery, ¬surgery, not surgery, not ¬surgery,anesthesia, not anesthesia, not ¬anesthesia } • • • • The symptoms are derived and non-contradictory Both perforation and pancreatitis are concluded He should not be fed (¬nourish), but take H2 antagonists The information about surgery is contradictory • Anesthesia though not explicitly contradictory (¬anesthesia doesn’t belong to WFM) relies on contradiction (both anesthesia and not anesthesia belong to WFM) 140 Part 3: Logic Programming for Knowledge representation 3.6 Proof procedures 141 WFSX programming • Prolog programming style, but with the WFSX semantics • Requires: – A new proof procedure (different from SLDNF), complying with WFS, and with explicit negation – The corresponding Prolog-like implementation: XSB-Prolog 142 SLX: Proof procedure for WFSX • SLX (SL with eXplicit negation) is a top-down procedure for WFSX • Is similar to SLDNF – Nodes are either successful or failed – Resolution with program rules and resolution of default literals by failure of their complements are as in SLDNF • In SLX, failure doesn’t mean falsity. It simply means non-verity (i.e. 
false or undefined) 143 Success and failure • A finite tree is successful if its root is successful, and failed if its root is failed • The status of a node is determined by: – A leaf labeled with an objective literal is failed – A leaf with true is successful – An intermediate node is successful if all its children are successful, and failed otherwise (i.e. at least one of its children is failed) 144 Negation as Failure? • As in SLS, to solve infinite positive recursion, infinite trees are (by definition) failed • Can a NAF rule be used? YES True of not A succeeds if true-or-undefined of A fails True-or-undefined of not A succeeds if true of A fails • This is the basis of SLX. It defines: – T-Trees for proving truth – TU-Trees for proving truth or undefinedness 145 T and TU-trees • They differ in that literals involved in recursion through negation, and so undefined in WFSXp, are failed in T-Trees and successful in TU-Trees a not b b not a T aX not b X TU b T not a aX not b X TU b not a … 146 Explicit negation in SLX • ¬-literals are treated as atoms • To impose coherence, the semi-normal program is used in TU-trees a not b b not a ¬a b aX ¬a not a not b notX¬a true b not a X a ¬a X not b not ¬a X true … 147 Explicit negation in SLX (2) • In TU-trees: L also fails if ¬L succeeds true • I.e. 
if not ¬L fail as true-or-undefined a not a ¬a X X b not ¬a c not c b not c ¬b ab X not c not ¬b X ¬b true cX c c X c cX not c not c not c not c not c X X … X 148 T and TU-trees definition D T-Trees (resp TU-trees) are AND-trees labeled by literals, constructed top-down from the root by expanding nodes with the rules • Nodes labeled with objective literal A • If there are no rules for A, the node is a leaf • Otherwise, non-deterministically select a rule for A A L1,…,Lm, not Lm+1,…, not Ln • In a T-tree the children of A are L1,…,Lm, not Lm+1,…, not Ln • In a TU-tree A has, additionally, the child not ¬A • Nodes labeled with default literals are leafs 149 Success and Failure D All infinite trees are failed. A finite tree is successful if its root is successful and failed if its root is failed. The status of nodes is determined by: • A leaf node labeled with true is successful • A leaf node labeled with an objective literal is failed • A leaf node of a T-tree (resp. TU) labeled with not A is successful if all TUtrees (resp. T) with root A (subsidiary tree) are failed; and failed otherwise • An intermediate node is successful if all its children are successful; and failed otherwise After applying these rules, some nodes may remain undetermined (recursion through not). Undetermined nodes in T-trees (resp.TU) are by definition failed (resp. successful) 150 Properties of SLX • SLX is sound and (theoretically) complete wrt WFSX. • If there is no explicit negation, SLX is sound and (theoretically) complete wrt WFS. 
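The T/TU duality of SLX can be sketched for ground programs without explicit negation. This is a much-simplified illustration of mine, not the full procedure: the dict encoding of rules is as assumed earlier, positive loops within a tree fail, and recursion through negation is resolved by global-ancestor checks, so that undetermined nodes fail in T-trees and succeed in TU-trees, as stated above.

```python
# Sketch: SLX-style evaluation with T-mode (proves truth) and TU-mode
# (proves truth-or-undefinedness). "not A" in one mode succeeds when A
# fails in the dual mode.

def solve(atom, rules, mode="T", local=frozenset(), glob=frozenset()):
    if atom in local:          # positive loop within the tree: failed node
        return False
    if atom in glob:           # recursion through negation:
        return mode == "TU"    # failed in T-trees, successful in TU-trees
    for body in rules.get(atom, []):
        ok = True
        for lit in body:
            if lit.startswith("not "):
                # Subsidiary tree of the dual kind; ancestors become global.
                dual = "TU" if mode == "T" else "T"
                if solve(lit[4:], rules, dual, frozenset(), glob | local | {atom}):
                    ok = False
                    break
            elif not solve(lit, rules, mode, local | {atom}, glob):
                ok = False
                break
        if ok:
            return True
    return False

def truth_value(atom, rules):
    if solve(atom, rules, "T"):
        return "true"
    return "undefined" if solve(atom, rules, "TU") else "false"

# a <- not b, b <- not a: recursion through negation, a is undefined.
rules = {"a": [["not b"]], "b": [["not a"]]}
print(truth_value("a", rules))   # -> undefined
```

For p ← p the TU-tree is a positive loop and fails, so p is false; for s ← not p with p ← not s, q and no rule for q, s comes out true, agreeing with the WFM.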
• See [AP96] for the definition of a refutation procedure based on the AND-trees characterization, and for all proofs and details 151 Infinite trees example s not p, not q, not r p not s, q, not r q r, not p r p, not q s pX WFM is {s, not p, not q, not r} r X not p not q not r p not q q not s not r r not p q not s not r r not p q p not q r X not p p not q q not s not r 152 Negative recursion example s true q not p(0), not s p(N) not p(s(N)) s q not q X not p(0) not s X WFM = {s, not q} p(0) X X p(0) p(1) p(2) … X not p(1) not p(0) p(1) true p(2) not p(2) not p(3) X … X X not p(1) not p(2) X not p(3) 153 Guaranteeing termination • The method is not effective, because of loops • To guarantee termination in ground programs: Local ancestors of node n are literals in the path from n to the root, exclusive of n Global ancestors are assigned to trees: • the root tree has no global ancestors • the global ancestors of T, a subsidiary tree of leaf n of T’, are the global ancestors of T’ plus the local ancestors of n • global ancestors are divided into those occurring in T-trees and those occurring in TU-trees 154 Pruning rules • For cyclic positive recursion: Rule 1 If the label of a node belongs to its local ancestors, then the node is marked failed, and its children are ignored • For recursion through negation: Rule 2 If a literal L in a T-tree occurs in its global T-ancestors then it is marked failed and its children are ignored 155 Pruning rules (2) Rule 1 Rule 2 L L L … L 156 Other sound rules Rule 3 If a literal L in a T-tree occurs in its global TU-ancestors then it is marked failed, and its children are ignored Rule 4 If a literal L in a TU-tree occurs in its global T-ancestors then it is marked successful, and its children are ignored Rule 5 If a literal L in a TU-tree occurs in its global TU-ancestors then it is marked successful, and its children are ignored 157 Pruning examples a not b b not a ¬a b aX not a not b not ¬a b X c not c b not c ¬b ab not a cX aX ¬a X 
(failed and successful trees illustrating Rules 2 and 3) 158 Non-ground case • The characterization and pruning rules apply to allowed non-ground programs, with ground queries • It is well known that pruning rules do not generalize to general programs with variables: p(X) ← p(Y) p(a) and the tree p(X), p(Y), p(Z), … What to do? • If we “fail”, the answers are incomplete • If we “proceed”, then we loop 159 Tabling • To guarantee termination in non-ground programs, instead of ancestors and pruning rules, tabulation mechanisms are required – when there is a possible loop, suspend the literal and try alternative solutions – when a solution is found, store it in a table – resume suspended nodes with new solutions in the table – apply an algorithm to determine completion of the process, i.e. when no more solutions exist, and fail the corresponding suspended nodes 160 Tabling example p(X) ← p(Y) p(a) Query p(X): 1) suspend p(Y) 2) resume with Y=a Table for p(X): X=a • SLX is also implemented with tabulation mechanisms • It uses the XSB-Prolog tabling implementation • SLX with tabling is available with XSB-Prolog from Version 2.0 onwards • Try it at: http://xsb.sourceforge.net/ 161 Tabling (cont.) • If a solution is already stored in a table, and the predicate is called again, then: – there is no need to compute the solution again – simply pick it from the table! • This increases efficiency. Sometimes by one order of magnitude. 162 Fibonacci example fib(1,1). fib(2,1). fib(X,F) ← X1 is X-1, X2 is X-2, fib(X1,F1), fib(X2,F2), F is F1 + F2. Query fib(6,X) gives X=8, with table for fib: fib(1,1), fib(2,1), fib(3,2), fib(4,3), fib(5,5), fib(6,8) — each distinct call is computed only once • Linear rather than exponential 163 XSB-Prolog • Can be used to compute under WFS • Prolog + tabling – To use tabling on, e.g., predicate p with 3 arguments: :- table p/3. • Tables are used from call to call until: abolish_all_tables abolish_table_pred(P/A) 164 XSB Prolog (cont.)
• WF negation can be used via tnot(Pred) • Explicit negation via -Pred • The answer to query Q is yes if Q is either true or undefined in the WFM • The answer is no if Q is false in the WFM of the program 165 Distinguishing T from U • After providing all answers, tables store suspended literals due to recursion through Residual Program negation • If the residual is empty then True • If it is not empty then Undefined • The residual can be inspected with: get_residual(Pred,Residual) 166 Residual program example :- table a/0. :- table b/0. :- table c/0. :- table d/0. a c b d ::::- b, tnot(c). tnot(a). tnot(d). d. | ?no | ?RA = no | ?RB = no | ?RC = no | ?no | ?- a,b,c,d,fail. get_residual(a,RA). [tnot(c)]; get_residual(b,RB). []; get_residual(c,RC). [tnot(a)]; get_residual(d,RD). 167 Transitive closure :- auto_table. edge(a,b). edge(c,d). edge(d,c). reach(a). reach(A) :edge(A,B),reach(B). |?- reach(X). X = a; no. |?- reach(c). no. |?-tnot(reach(c)). yes. • Due to circularity completion cannot conclude not reach(c) • SLDNF (and Prolog) loops on that query • XSB-Prolog works fine 168 Transitive closure (cont) :- auto_table. edge(a,b). edge(c,d). edge(d,c). reach(a). reach(A) :edge(A,B),reach(B). • • • • :- auto_table. edge(a,b). edge(c,d). edge(d,c). reach(a). reach(A) :reach(B), edge(A,B). Instead one could have written Declarative semantics closer to operational Left recursion is handled properly The version on the right is usually more efficient 169 Grammars • Prolog provides “for free” a right-recursive descent parser • With tabling left-recursion can be handled • It also eliminates redundancy (gaining on efficiency), and handle grammars that loop under Prolog. 170 Grammars example :- table expr/2, term/2. expr expr term term prim prim --> --> --> --> --> --> expr, [+], term. term. term, [*], prim. prim. [‘(‘], expr, [‘)’]. [Int], {integer(Int)}. 
• This grammar loops in Prolog • XSB handles it correctly, properly associating * and + to the left 171 Grammars example :- table expr/3, term/3. expr(V) --> expr(E), [+], term(T), {V is E + T}. expr(V) --> term(V). term(V) --> term(T), [*], prim(P), {V is T * P}. term(V) --> prim(V). prim(V) --> [‘(‘], expr(V), [‘)’]. prim(Int) --> [Int], {integer(Int)}. • With XSB one gets “for free” a parser based on a variant of Earley’s algorithm, or an active chart recognition algorithm • Its time complexity is better! 172 Finite State Machines • Tabling is well suited for Automata Theory implementations :- table rec/2. rec(St) :- initial(I), rec(St,I). rec([],S) :- is_final(S). rec([C|R],S) :- d(S,C,S2), rec(R,S2). q0 a a q1 a initial(q0). d(q0,a,q1). d(q1,a,q2). d(q2,b,q1). d(q1,a,q3). is_final(q3). q3 b q2 173 Dynamic Programming • Strategy for evaluating subproblems only once. – Problems amenable for DP, might also be for XSB. • The Knap-Sack Problem: – Given n items, each with a weight Ki (1 i n), determine whether there is a subset of the items that sums to K 174 The Knap-Sack Problem Given n items, each with a weight Ki (1 i n), determine whether there is a subset of the items that sums to K. :- table ks/2. ks(0,0). ks(I,K) :- I > 0, I1 is I-1, ks(I1,K). ks(I,K) :- I > 0, item(I,Ki), K1 is K-Ki, I1 is I-1, ks(I1,K1). • There is an exponential number of subsets. Computing this with Prolog is exponential. • There are only I2 possible distinct calls. Computing this with tabling is polynomial. 175 Combined WFM and ASP at work • XSB-Prolog XASP package combines XSB with Smodels – Makes it possible to combine WFM computation with Answer-sets – Use (top-down) WFM computation to determine the relevant part of the program – Compute the stable models of the residual – Possibly manipulate the results back in Prolog 176 XNMR mode • Extends the level of the Prolog shell with querying stable models of the residual: :- table a/0, b/0, c/0. a b c c ::::- tnot(b). tnot(a). b. a. 
C:\> xsb xnmr. […] nmr| ?- [example]. yes the residuals of the query nmr| ?- c. DELAY LIST = [a] DELAY LIST = [b]? s {c;a} ; {c;b}; SMs of the residual no nmr| ?- a. DELAY LIST = [tnot(b)]s {a}; SMs of residual where {b}; query is true no nmr| ?- a. DELAY LIST = [tnot(b)]t {a}; no 177 XNMR mode and relevance • Stable models given a query • First computes the relevant part of the program given the query • This step already allows for: – Processing away literal in the WFM – Grounding of the program, given the query. • This is a different grounding mechanism, in contrast to lparse or to that of DLV • It is query dependant and doesn’t require that much domain predicates in rule bodies… 178 XASP libraries • Allow for calling smodels from within XSBPrograms • Detailed control and processing of Stable Models • Two libraries are provided – sm_int which includes a quite low level (external) control of smodels – xnmr_int which allows for a combination of SMs and prolog in the same program 179 sm_int library • Assumes a store with (smodels) rules • Provides predicates for – Initializing the store (smcInit/0 and smcReInit/0) – Adding and retracting rules (smcAddRule/2 and smcRetractRule/2) – Calling smodels on the rules of the store (smcCommitProgram/0 and smcComputeModel/0) – Examine the computed SMs (smcExamineModel/2) – smcEnd/0 for reclaiming resources in the end 180 xnmr_int library • Allows for control, within Prolog of the interface provided by xnmr. – Predicates that call goals, compute residual, and compute SMs of the residual • pstable_model(+Query,-Model,0) – Computes one SM of the residual of the Query – Upon backtracking, computes other SMs • pstable_model(+Query,-Model,1) – As above but only SMs where Query is true • Allow for pre and pos-processing of the models – E.g. 
for finding models that are minimal or preferred in some sense – For pretty input and output, etc • You must: :- import pstable_model/3 from xnmr_int 181 Exercise • Write an XSB-XASP program that – Reads from the input the dimension N of the board – Computes the solutions for the N-queens problem of that dimension – Shows the solutions “nicely” on the screen – Shows what is common to all solutions • E.g. (1,1) never has a queen, in any solution • Write an XSB-XASP program that computes minimal diagnoses of digital circuits 182 Part 3: Logic Programming for Knowledge representation 3.7 Application to representing taxonomies 183 A methodology for KR • WFSXp provides mechanisms for representing usual KR problems: – logic language – non-monotonic mechanisms for defaults – forms of explicitly representing negation – paraconsistency handling – ways of dealing with undefinedness • In what follows, we propose a methodology for representing (incomplete) knowledge of taxonomies with default rules using WFSXp 184 Representation method (1) Definite rules If A then B: – B ← A • penguins are birds: bird(X) ← penguin(X) Default rules Normally if A then B: – B ← A, rule_name, not ¬B rule_name ← not ¬rule_name • birds normally fly: fly(X) ← bird(X), bf(X), not ¬fly(X) bf(X) ← not ¬bf(X) 185 Representation method (2) Exceptions to default rules Under conditions COND do not apply rule RULE: – ¬RULE ← COND • Penguins are an exception to the birds-fly rule: ¬bf(X) ← penguin(X) Preference rules Under conditions COND prefer rule RULE+ to RULE-: – ¬RULE- ← COND, RULE+ • for penguins, prefer the penguins-don’t-fly to the birds-fly rule: ¬bf(X) ← penguin(X), pdf(X) 186 Representation method (3) Hypothetical rules “If A then B” may or may not apply: – B ← A, rule_name, not ¬B rule_name ← not ¬rule_name ¬rule_name ← not rule_name • quakers might be pacifists: pacifist(X) ← quaker(X), qp(X), not ¬pacifist(X) qp(X) ← not ¬qp(X) ¬qp(X) ← not qp(X) For a quaker, there is a PSM with pacifist, another with not pacifist.
In the WFM pacifist is undefined 187 Taxonomy example The taxonomy: • • • • • Mammal are animal Bats are mammals Birds are animal Penguins are birds Dead animals are animals • • • • • Normally animals don’t fly Normally bats fly Normally birds fly Normally penguins don’t fly Normally dead animals don’t fly The elements: • Pluto is a mammal • Joe is a penguin • Tweety is a bird • Dracula is a dead bat The preferences: • • • • Dead bats don’t fly though bats do Dead birds don’t fly though birds do Dracula is an exception to the above In general, more specific information is preferred 188 The taxonomy Definite rules Default rules Negated default rules flies animal bird mammal penguin joe dead animal bat tweety pluto dracula 189 Taxonomy representation Taxonomy animal(X) mammal(X) mammal(X) bat(X) animal(X) bird(X) bird(X) penguin(X) deadAn(X) dead(X) Facts mammal(pluto). bird(tweety). deadAn(dracula). penguin(joe). bat(dracula). Default rules ¬flies(X) animal(X), adf(X), not flies(X) adf(X) not ¬adf(X) flies(X) bat(X), btf(X), not ¬flies(X) btf(X) not ¬btf(X) flies(X) bird(X), bf(X), not ¬flies(X) bf(X) not ¬bf(X) ¬flies(X) penguin(X), pdf(X), not flies(X) pdf(X) not ¬pdf(X) ¬flies(X) deadAn(X), ddf(X), not flies(X) ddf(X) not ¬ddf(X) Explicit preferences ¬btf(X) deadAn(X), bat(X), r1(X) ¬btf(X) deadAn(X), bird(X), r2(X) ¬r1(dracula) Implicit preferences ¬adf(X) bat(X), btf(X) ¬bf(X) penguin(X), pdf(X) r1(X) not ¬r1(X) r2(X) not ¬r2(X) ¬r2(dracula) ¬adf(X) bird(X), bf(X) 190 Taxonomy results deadAn bat penguin mammal bird animal adf btf bf pdf ddf r1 r2 flies Joe dracula pluto tweety not not not not not not not not not not not not not ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ 191 Part 4: Knowledge Evolution 192 LP and Non-Monotonicity • LP includes a non-monotonic form of default negation not L is true if L cannot (now) be proven • This feature is used for representing incomplete knowledge: With incomplete knowledge, assume hypotheses, and jump to conclusions. 
If (later) the conclusions are proven false, withdraw some hypotheses to regain consistency. 193 Typical example • All birds fly. Penguins are an exception: flies(X) ← bird(X), not ab(X). bird(a). ab(X) ← penguin(X). This program concludes flies(a), by assuming not ab(a). • If later we learn penguin(a): – Add: penguin(a). – The program goes back on the assumption not ab(a). – It no longer concludes flies(a). 194 LP representing a static world • The work on LP allows the (non-monotonic) addition of new knowledge. • But: – What we have seen so far does not consider this evolution of knowledge • LPs represent static knowledge of a given world in a given situation. • The issue of how to add new information to a logic program hasn’t yet been addressed. 195 Knowledge Evolution • Up to now we have not considered evolution of the knowledge • In real situations knowledge evolves by: – completing it with new information – changing it according to the changes in the world itself • Simply adding the new knowledge possibly leads to contradiction • In many cases a process for restoring consistency is desired 196 Revision and Updates • In real situations knowledge evolves by: – completing it with new information (Revision) – changing it according to the changes in the world itself (Updates) • These forms of evolution require a differentiated treatment. Example: – I know that I have a flight booked for London (either for Heathrow or for Gatwick). Revision: I learn that it is not for Heathrow • I conclude my flight is for Gatwick Update: I learn that flights for Heathrow were canceled • Either I have a flight for Gatwick or no flight at all 197 Part 4: Knowledge Evolution 4.1 Belief Revision and Logic Programming 198 AGM Postulates for Revision For revising a logical theory T with a formula F, first modify T so that it does not derive ¬F, and then add F. The contraction of T by a formula F, T-(F), should obey:
1. T-(F) has the same language as T 2. Th(T-(F)) ⊆ Th(T) 3. If T ⊭ F then T-(F) = T 4. If ⊭ F then T-(F) ⊬ F 5. Th(T) ⊆ Th(T-(F) U {F}) 6. If ⊨ F ↔ G then Th(T-(F)) = Th(T-(G)) 7. T-(F) ∩ T-(G) ⊆ T-(F ∧ G) 8. If T-(F ∧ G) ⊬ F then T-(F ∧ G) ⊆ T-(F) 199 Epistemic Entrenchment • The question in general theory revision is: how to change a theory so that it obeys the postulates? • What formulas to remove and what formulas to keep? • In general this is done by defining preferences among formulas: some can and some cannot be removed. • Epistemic Entrenchment: some formulas are “more believed” than others. • This is quite complex in general theories. • In LP, there is a natural notion of “more believed” 200 Logic Programs Revision • The problem: – A LP represents consistent incomplete knowledge; – New factual information comes. – How to incorporate the new information? • The solution: – Add the new facts to the program – If the union is consistent this is the result – Otherwise restore consistency to the union • The new problem: – How to restore consistency to an inconsistent program? 201 Simple revision example (1) P: flies(X) ← bird(X), not ab(X). bird(a). ab(X) ← penguin(X). • We learn penguin(a). P U {penguin(a)} is consistent. Nothing more to be done. • We learn instead ¬flies(a). P U {¬flies(a)} is inconsistent. What to do? Since the inconsistency rests on the assumption not ab(a), remove that assumption (e.g. by adding the fact ab(a), or forcing it undefined with ab(a) ← u), obtaining a new program P’. If an assumption supports contradiction, then go back on that assumption. 202 Simple revision example (2) P: flies(X) ← bird(X), not ab(X). bird(a). ab(X) ← penguin(X). If later we also learn flies(a) (besides the previous ¬flies(a)), P’ U {flies(a)} is inconsistent. The contradiction does not depend on assumptions. We cannot remove the contradiction! Some programs are non-revisable. 203 What to remove? • Which assumptions should be removed? normalWheel ← not flatTyre, not brokenSpokes. flatTyre ← leakyValve.
¬normalWheel ← wobblyWheel. flatTyre ← puncturedTube. wobblyWheel. – Contradiction can be removed by either dropping not flatTyre or not brokenSpokes – We’d like to delve deeper in the model and (instead of not flatTyre) either drop not leakyValve or not puncturedTube. 204 Revisables normalWheel ← not flatTyre, not brokenSpokes. flatTyre ← leakyValve. ¬normalWheel ← wobblyWheel. flatTyre ← puncturedTube. wobblyWheel. • Solution: – Define a set of revisables: Revisables = not {leakyValve, puncturedTube, brokenSpokes} Revisions in this case are {not lv}, {not pt}, and {not bs} 205 Integrity Constraints • For convenience, instead of: ¬normalWheel ← wobblyWheel we may use the denial: ← normalWheel, wobblyWheel • ICs can be further generalized into: L1 ∨ … ∨ Ln ← Ln+1, …, Lm where the Li are literals (possibly of the form not L). 206 ICs and Contradiction • In an ELP with ICs, add for every atom A: ← A, ¬A • A program P is contradictory iff P ⊢ ⊥, where ⊢ is the paraconsistent derivation of SLX 207 Algorithm for 3-valued revision • Find all derivations for ⊥, collecting for each one the set of revisables supporting it. Each is a support set. • Compute the minimal hitting sets of the support sets. Each is a removal set. • A revision of P is obtained by adding {A ← u : A ∈ R} where R is a removal set of P. 208 (Minimal Hitting Sets) • H is a hitting set of S = {S1,…,Sn} iff – H ∩ S1 ≠ {} and … and H ∩ Sn ≠ {} • H is a minimal hitting set of S iff it is a hitting set of S and there is no other hitting set of S, H’, such that H’ ⊂ H. • Example: – Let S = {{a,b},{b,c}} – Hitting sets are {a,b},{a,c},{b},{b,c},{a,b,c} – Minimal hitting sets are {b} and {a,c}. 209 Example Rev = not {a,b,c} ← p, q. p ← not a. q ← not b, r. r ← not b. r ← not c. Support sets are: {not a, not b} and {not a, not b, not c}. Removal sets are: {not a} and {not b}. 210 Simple diagnosis example inv(G,I,0) ← node(I,1), not ab(G). inv(G,I,1) ← node(I,0), not ab(G). node(b,V) ← inv(g1,a,V). node(a,1). ¬node(b,0).
%Fault model inv(G,I,0) ← node(I,0), ab(G). inv(G,I,1) ← node(I,1), ab(G). (Circuit figure: input a=1 into inverter g1, output b.) The only revision is: P ∪ {ab(g1) ← u} It does not conclude node(b,1). • In diagnosis applications (when fault models are considered) 3-valued revision is not enough. 211 2-valued Revision • In diagnosis one often wants the IC: ab(X) ∨ not ab(X) – With these ICs (that are not denials), 3-valued revision is not enough. • A 2-valued revision is obtained by adding facts for revisables, in order to remove contradiction. • For 2-valued revision the algorithm no longer works… 212 Example ← p. ← a. ← b, not c. p ← not a, not b. The only support is {not a, not b}. Removals are {not a} and {not b}. But: • P ∪ {a} is contradictory (and unrevisable). • P ∪ {b} is contradictory (though revisable). • In 2-valued revision: – some removals must be deleted; – the process must be iterated. 213 Algorithm for 2-valued revision 1 Let Revs = {{}} 2 For every element R of Revs: – Add it to the program and compute removal sets. – Remove R from Revs – For each removal set RS: • Add R ∪ not RS to Revs 3 Remove non-minimal sets from Revs 4 Repeat 2 and 3 until reaching a fixed point of Revs. The revisions are the elements of the final Revs. 214 Example of 2-valued revision ← p. ← a. ← b, not c. p ← not a, not b. Rev0 = {{}} Rev1 = {{a}, {b}} Rev2 = {{b}} Rev3 = {{b,c}} = Rev4 • Choose {}. Removal sets of P ∪ {} are {not a} and {not b}. Add {a} and {b} to Revs. • Choose {a}. P ∪ {a} has no removal sets. • Choose {b}. The removal set of P ∪ {b} is {not c}. Add {b, c} to Revs. • Choose {b,c}. The removal set of P ∪ {b,c} is {}. Add {b, c} to Revs. • The fixed point has been reached. P ∪ {b,c} is the only revision. 215 Part 4: Knowledge Evolution 4.2 Application to Diagnosis 216 Revision and Diagnosis • In model based diagnosis one has: – a program P with the model of a system (the correct and, possibly, incorrect behaviors) – a set of observations O inconsistent with P (or not explained by P).
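The support-set/removal-set machinery used by both revision algorithms reduces to computing minimal hitting sets. A minimal Python sketch, assuming support sets are given as plain Python sets (the function name `minimal_hitting_sets` is ours, not from the course):

```python
from itertools import combinations

def minimal_hitting_sets(sets):
    """All minimal hitting sets of a collection of sets, by brute force:
    try candidates in increasing size; keep those that intersect every set
    and do not contain an already-found (smaller) hitting set."""
    universe = sorted(set().union(*sets))
    minimal = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(universe, size):
            c = set(cand)
            if all(c & s for s in sets) and not any(h <= c for h in minimal):
                minimal.append(c)
    return minimal
```

On the slide-208 example, `minimal_hitting_sets([{"a","b"},{"b","c"}])` yields the minimal hitting sets {b} and {a,c}; applied to the support sets {not a, not b} and {not a, not b, not c} of the slide-209 example it yields the removal sets {not a} and {not b}.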
• The diagnoses of the system are the revisions of P ∪ O. • This allows mixing consistency-based and explanation- (abduction-) based diagnosis. 217 Diagnosis Example (Circuit figure: nand gates g10, g11, g16, g19, g22, g23 connected to inputs c1, c2, c3, c6, c7, with the observed values shown.) 218 Diagnosis Program Observables obs(out(inpt0, c1), 0). obs(out(inpt0, c2), 0). obs(out(inpt0, c3), 0). obs(out(inpt0, c6), 0). obs(out(inpt0, c7), 0). obs(out(nand, g22), 0). obs(out(nand, g23), 1). Connections conn(in(nand, g10, 1), out(inpt0, c1)). conn(in(nand, g10, 2), out(inpt0, c3)). … conn(in(nand, g23, 1), out(nand, g16)). conn(in(nand, g23, 2), out(nand, g19)). Predicted and observed values cannot be different ← obs(out(G, N), V1), val(out(G, N), V2), V1 ≠ V2. Value propagation val( in(T,N,Nr), V ) ← conn( in(T,N,Nr), out(T2,N2) ), val( out(T2,N2), V ). val( out(inpt0, N), V ) ← obs( out(inpt0, N), V ). Normal behavior val( out(nand,N), V ) ← not ab(N), val( in(nand,N,1), W1), val( in(nand,N,2), W2), nand_table(W1,W2,V). Abnormal behavior val( out(nand,N), V ) ← ab(N), val( in(nand,N,1), W1), val( in(nand,N,2), W2), and_table(W1,W2,V). 219 Diagnosis Example (Circuit figure repeated, now annotated with the values predicted by the program.) Revisions are: {ab(g23)}, {ab(g19)}, and {ab(g16), ab(g22)} 220 Revision and Debugging • Declarative debugging can be seen as diagnosis of a program. • The components are: – rule instances (that may be incorrect). – predicate instances (that may be uncovered) • The (partial) intended meaning can be added as ICs. • If the program with ICs is contradictory, revisions are the possible bugs. 221 Debugging Transformation • Add to the body of each possibly incorrect rule r(X) the literal not incorrect(r(X)). • For each possibly uncovered predicate p(X) add the rule: p(X) ← uncovered(p(X)). • For each goal G that you don’t want to prove add: ← G. • For each goal G that you want to prove add: ← not G. 222 Debugging example a ← not b b ← not c WFM = {not a, b, not c} b should be false BUT a should be false!
a ← not b, not incorrect(a ← not b) b ← not c, not incorrect(b ← not c) a ← uncovered(a) b ← uncovered(b) c ← uncovered(c) ← b Revisables are incorrect/1 and uncovered/1 Add ← a The only revision is: {incorrect(b ← not c)} Revisions now are: {inc(b ← not c), inc(a ← not b)} and {unc(c), inc(a ← not b)} BUT c should be true! Add ← not c The only revision now is: {unc(c), inc(a ← not b)} 223 Part 4: Knowledge Evolution 4.3 Abductive Reasoning and Belief Revision 224 Deduction, Abduction and Induction • In deductive reasoning one derives conclusions based on rules and facts – From the fact that Socrates is a man and the rule that all men are mortal, conclude that Socrates is mortal • In abductive reasoning, given an observation and a set of rules, one assumes (or abduces) a justification explaining the observation – From the rule that all men are mortal and the observation that Socrates is mortal, assume that Socrates being a man is a possible justification • In inductive reasoning, given facts and observations, induce rules that may synthesize the observations – From the fact that Socrates (and many others) are men, and the observation that all those are mortal, induce that all men are mortal. 225 Deduction, Abduction and Induction • Deduction: an analytic process based on the application of general rules to particular cases, with inference of a result • Induction: synthetic reasoning which infers the rule from the case and the result • Abduction: synthetic reasoning which infers the (most likely) case given the rule and the result 226 Abduction in logic • Given a theory T associated with a set of assumptions Ab (abducibles), and an observation G (abductive query), Δ is an abductive explanation (or solution) for G iff: 1. Δ ⊆ Ab 2. T ∪ Δ |= G 3. T ∪ Δ is consistent • Usually minimal abductive solutions are of special interest • For the notion of consistency, in general integrity constraints are also used (as in revision) 227 Abduction example wobblyWheel ← flatTyre. wobblyWheel ← brokenSpokes.
flatTyre ← leakyValve. flatTyre ← puncturedTube. • It has been observed that wobblyWheel. • What are the abductive solutions for that, assuming that the abducibles are brokenSpokes, leakyValve and puncturedTube? 228 Applications • In diagnosis: – Find explanations for the observed behaviour – Abducibles are the normality (or abnormality) of components, and also fault modes • In view updates – Find extensional data changes that justify the intentional data change in the view – This can be further generalized for knowledge assimilation 229 Abduction as Nonmonotonic reasoning • If abductive explanations are understood as conclusions, the process of abduction is nonmonotonic • In fact, abduction may be used to encode various other forms of nonmonotonic logics • Vice-versa, other nonmonotonic logics may be used to perform abductive reasoning 230 Negation by Default as Abduction • Replace all not A by a new atom A* • Add for every A the integrity constraints: A ∨ A* and ← A, A* • A query L is true in a Stable Model iff there is an abductive solution for the query L • Negation by default is viewed as hypotheses that can be assumed consistently 231 Defaults as abduction • For each default rule d: A : B / C add the rule C ← d(B), A and the ICs ¬d(B) ← ¬B and ¬d(B) ← ¬C • Make all d(B) abducible 232 Abduction and Stable Models • Abduction can be “simulated” with Stable Models • For each abducible A, add to the program: A ← not ¬A ¬A ← not A • For getting abductive solutions for G just collect the abducibles that belong to stable models with G • I.e.
compute stable models after also adding ← not G and then collect all abducibles from each stable model 233 Abduction and Stable Models (cont) • The method suggested lacks means for capturing the relevance of the abductions made for really proving the query • Literals in the abductive solution may be there because they “help” in proving the abductive query, or simply because they are needed for consistency, independently of the query • Using a combination of WFS and Stable Models may help in this matter. 234 Abduction as Revision • For abductive queries: – Declare as revisable all the abducibles – If the abductive query is Q, add the IC: ← not Q – The revisions of the program are the abductive solutions of Q. 235 Part 4: Knowledge Evolution 4.4 Methodologies for modeling updates 236 Reasoning about changes • Dealing with changes in the world, rather than in the beliefs (Updates rather than Revision), requires: – Methodologies for representing knowledge about the changes, actions, etc., using existing languages or – New languages and semantics for dealing with a changing world • Possibly with translation to the existing languages 237 Situation calculus • Initially developed for representing knowledge that changes using 1st order logics [McCarthy and Hayes 1969] – Several problems of the approach triggered research in nonmonotonic logics • Main ingredients – Fluent predicates: predicates that may change their truth value – Situations: in which the fluents are true or false • A special initial situation • Other situations are characterized by the actions that were performed from the initial situation up to the situation 238 Situation Calculus - Basis • (Meta)-predicate holds/2 for describing which fluents hold in which situations • Situations are represented by: – constant s0, representing the initial situation – terms of the form result(Action,Situation), representing the situation that results from performing the Action in the previous situation 239 Yale shooting • There is a
turkey, initially alive: holds(alive(turkey),s0). • Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded ¬holds(alive(turkey),result(shoot,S)) ← holds(loaded,S). ¬holds(loaded,result(shoot,S)). • Loading a gun results in a loaded gun holds(loaded,result(load,S)). • What happens to the turkey if I load the gun, and then shoot at the turkey? – holds(alive(turkey), result(shoot, result(load,s0)))? 240 Frame Problem • In general the axioms describing what changes are, alone, not enough • Knowledge is also needed about what doesn’t change. • Suppose that there is an extra action of waiting: – holds(alive(turkey), result(shoot, result(wait,result(load,s0)))) is not true. • By default, fluents should remain with the truth value they had before, unless there is evidence for their change (commonsense law of inertia) – In 1st order logic it is difficult to express this – With a nonmonotonic logic this should be easy 241 Frame Axioms in Logic Programming • The truth value of fluents in two consecutive situations is, by default, the same: holds(F,result(A,S)) :- holds(F,S), not ¬holds(F,result(A,S)), not nonInertial(F,A,S). ¬holds(F,result(A,S)) :- ¬holds(F,S), not holds(F,result(A,S)), not nonInertial(F,A,S). • This allows for establishing the law of inertia. 242 Representing Knowledge with the situation calculus • Write rules for predicate holds/2 describing the effects of actions. • Write rules (partially) describing the initial situation, and possibly also some other states • Add the frame axioms • Care must be taken, especially in the case of Stable Models, because models are infinite • Look at the models of the program (be it SM or WF) to get the consequences 243 Yale shooting results • The WFM of the program contains, e.g.
holds(alive,result(load,s0)) ¬holds(alive,result(shoot,result(wait,result(load,s0)))) ¬holds(loaded,result(shoot,result(wait,result(load,s0)))) • Queries of the form ?- holds(X,<situation>) return what holds in the given situation. • Queries of the form ?- holds(<property>,X) return linear plans for obtaining the property from the initial situation. 244 More on the rules of inertia • The rules allow for, given information about the past, reasoning about possible futures. • Reasoning about the past given information about the future is also possible, but requires additional axioms: holds(F,S) :- holds(F,result(A,S)), not ¬holds(F,S), not nonInertial(F,A,S). ¬holds(F,S) :- ¬holds(F,result(A,S)), not holds(F,S), not nonInertial(F,A,S). • Care must be taken when using these rules, since they may create infinite chains of derivation • On the other hand, it is difficult with this representation to deal with simultaneous actions 245 Fluent Calculus • Extends the situation calculus by introducing a notion of state [Thielscher 1998] • Situations are representations of states • State(S) denotes the state of the world in situation S • Operator ∘ is used for composing fluents that are true in the same state. • Example: – State(result(shoot, S ∘ alive(turkey) ∘ loaded)) = S – State(result(load,S)) = S ∘ loaded • Axioms are needed for guaranteeing that ∘ is commutative and associative, and for equality • This allows inferring non-effects of actions without the need for extra frame axioms 246 Event Calculus • It is another methodology developed for representing knowledge that changes over time [Kowalski and Sergot 1986] • Solves the frame problem in a different (simpler) way, also without frame axioms.
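The situation-calculus representation above (effect axioms plus the law of inertia) can be prototyped by a recursive interpreter that falls back on the previous situation when no effect axiom applies. A minimal Python sketch, not the course's Prolog encoding; the names (`holds`, `INITIALLY`) are ours:

```python
# Situations are nested terms: "s0" or (action, previous_situation),
# e.g. ("shoot", ("wait", ("load", "s0"))).
S0 = "s0"
INITIALLY = {"alive": True, "loaded": False}

def holds(fluent, sit):
    """Truth of a fluent in a situation: effect axioms first,
    otherwise the commonsense law of inertia (value in the previous situation)."""
    if sit == S0:
        return INITIALLY[fluent]
    action, prev = sit
    # Effect axioms of the Yale shooting domain:
    if action == "shoot" and holds("loaded", prev):
        if fluent in ("alive", "loaded"):
            return False   # a shot from a loaded gun kills the turkey and unloads the gun
    if action == "load" and fluent == "loaded":
        return True        # loading results in a loaded gun
    # Frame "axiom" (inertia): unaffected fluents keep their previous value.
    return holds(fluent, prev)
```

With this encoding, `holds("alive", ("shoot", ("wait", ("load", S0))))` is false, matching the WFM result above, while waiting alone leaves the turkey alive.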
• It is adequate for determining what holds after a series of actions being performed • It does not directly help for planning and for general reasoning about the knowledge that is changing 247 Event Calculus - Basis • Fluents are represented as terms, as in the situation calculus • Instead of situations, there is a notion of discrete time: – constants for representing time points – predicate </2 for representing the (partial) order among points – predicate </2 should contain axioms for transitive closure, as usual. • A predicate holds_at/2 defines which fluents hold at which time points • There are events, represented as constants. • Predicate occurs/2 defines which events happen at which time points. 248 Event Calculus – Basis (cont) • Events initiate (the truth of) some fluents and terminate (the truth of) other fluents. • This is represented using predicates initiates/3 and terminates/3 • Effects of actions are described by the properties initiated and terminated by the event associated with the action occurrence. • There is a special event that initiates all the fluents true at the beginning 249 Yale shooting again • There is a turkey, initially alive: initiates(alive(turkey),start,T). occurs(start,t0). • Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded terminates(alive(turkey),shoot,T) ← holds_at(loaded,T). terminates(loaded,shoot,T). • Loading a gun results in a loaded gun initiates(loaded,load,T). • The gun was loaded at time t10, and shot at time t20: occurs(load,t10). occurs(shoot,t20). • Is the turkey alive at time t21? – holds_at(alive(turkey), t21)? 250 General axioms for event calculus • Rules are needed to describe what holds, based on the events that occurred: holds_at(P,S) :- occurs(E,S1), initiates(P,E,S1), S1 < S, not clipped(P,S1,S). clipped(P,S1,S2) :- occurs(E,S), S1 ≤ S < S2, terminates(P,E,S). • There is no need for frame axioms.
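These holds_at/clipped axioms translate almost literally into a functional sketch. A minimal Python version, assuming integer time points (so the < ordering needs no extra axioms); the function names mirror the predicates, but the encoding is ours:

```python
# Event occurrences of the Yale shooting scenario: start at 0, load at 10, shoot at 20.
occurs = [("start", 0), ("load", 10), ("shoot", 20)]

def initiates(fluent, event, t):
    if event == "start":
        return fluent == "alive"     # initially there is a live turkey
    if event == "load":
        return fluent == "loaded"    # loading yields a loaded gun
    return False

def terminates(fluent, event, t):
    if event == "shoot":
        if fluent == "alive":
            return holds_at("loaded", t)   # shooting kills only if the gun is loaded
        if fluent == "loaded":
            return True
    return False

def clipped(fluent, t1, t2):
    """fluent was terminated by some event occurring in the interval [t1, t2)."""
    return any(t1 <= t < t2 and terminates(fluent, e, t) for e, t in occurs)

def holds_at(fluent, t):
    """fluent holds at t if some earlier event initiated it and it was not clipped since."""
    return any(t1 < t and initiates(fluent, e, t1) and not clipped(fluent, t1, t)
               for e, t1 in occurs)
```

Querying `holds_at("alive", 21)` gives false (the shot at 20 clips it), while `holds_at("alive", 15)` is still true — no frame axioms were written.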
By default things will remain true until terminated 251 Event calculus application • Appropriate when it is known which events occurred, and the reasoning task is to know what holds at each moment. E.g. – reasoning about changing databases – reasoning about legislation knowledge bases, for determining what applies after a series of events – reasoning with parallel actions. • Not directly applicable when one wants to know which actions lead to an effect (planning), or for reasoning about possible alternative courses of actions – No way of inferring occurrences of actions – No way of representing various courses of actions – No way of reasoning from the future to the past 252 Event calculus and abduction • With abduction it is possible to perform planning using the event calculus methodology – Declare the occurrences of events as abducibles – Declare also as abducible the order among the times the events occurred – Abductive solutions for holds_at(<fluent>,<time>) give plans to achieve the fluent before the given (deadline) time. 253 Representing Knowledge with the event calculus • Write rules for predicates initiates/3 and terminates/3 describing the effects of actions. • Describe the initial situation as the result of a special event, e.g. start, and state that start occurred at the least time point. • Add the axioms defining holds_at/2 • Add rules describing the partial order of time – These are not needed if e.g. integers are used for representing time • Add occurrences of the events • Query the program at time points 254 Part 4: Knowledge Evolution 4.5 Action Languages 255 Action Languages • Instead of – using existing formalisms, such as 1st order logics, logic programming, etc, – and developing methodologies • Design new languages specifically tailored for representing knowledge in a changing world – With a tailored syntax for action programs providing ways of describing how an environment evolves given a set of external actions – Common expressions are static and dynamic rules. • Static rules describe the rules of the domain • Dynamic rules describe effects of actions. 256 Action Languages (cont) • Usually, the semantics of an action program is defined in terms of a transition system. – Intuitively, given the current state of the world s and a set of actions K, a transition system specifies which are the possible resulting states after performing, simultaneously, all the actions in K. • The semantics can also be given as a translation into an existing formalism – E.g. translating action programs into logic programs (possibly with extra arguments on predicates, with extra rules, e.g. for frame axioms) assuring that the semantics of the transformed program has a one-to-one correspondence with the semantics of the action program 257 The A language • First proposal by [Gelfond & Lifschitz, 1993]. • Action programs are sets of rules of the form: – initially <Fluent> – <Fluent> after <Action1>; … ; <ActionN> – <Action> causes <Fluent> [if <Condition>] • A semantics was first defined in terms of a transition system (a labeled graph where the nodes are states – sets of fluents true in them – and where the arcs are labeled with actions) • Allows for – non-deterministic effects of actions – conditional effects of actions 258 The Yale shooting in A initially alive. shoot causes ¬alive if loaded. shoot causes ¬loaded. load causes loaded. • It is possible to make statements about other states, e.g.
¬alive after shoot; wait. • and to make queries about states: ¬alive after shoot; wait; load ? 259 Translation of A into logic programs • An alternative definition of the semantics is obtained by translating A-programs into logic programs. Roughly: – Add the frame axioms just as in the situation calculus – For each rule • initially f add holds(f,s0). • f after a1;…;an add holds(f, result(a1, … result(an, s0) …)). • a causes f if cond add holds(f,result(a,S)) :- holds(cond,S). • Theorem: holds(f, result(a1, … result(an, s0) …)) belongs to a stable model of the program iff there is a state resulting from the initial state after applying a1, …, an where f is true. 260 The B Language • The B language [Gelfond & Lifschitz, 1997] extends A by adding static rules. • Dynamic rules, as in A, allow for describing effects of actions, and “cause” a change in the state. • Static rules allow for describing rules of the domain, and are “imposed” at any given state • They allow for having indirect effects of actions • Static rules in B are of the form: <Fluent> if <Condition> • Example: dead if ¬alive. 261 Causality and the C language • Unlike both A and B, where inertia is assumed for all fluents, in C one can decide which fluents are subject to inertia and which aren’t: – Some fluents, such as one-time events, should not be assumed to keep their value by inertia. E.g.
action names, incoming messages, etc • Based on notions of causality: – It allows for asserting that F is caused by Action, stronger than asserting that F holds • As in B, it comprises static and dynamic rules 262 Rules in C • Static Rules: caused <Fluent> if <Condition> – Intuitively tells that Condition causes the truth of Fluent • Dynamic Rules: caused <Fluents> if <Condition> after <Formula> – The <Formula> can be built with fluents as well as with action names – Intuitively this rule states that after <Formula> is true, the rule caused <Fluents> if <Condition> is in place 263 Causal Theories and semantics of C • The semantics of C is defined in terms of causal theories (sets of static rules) – Something is true iff it is caused by something else • Let T be a causal theory, M be a set of fluents and TM = {F | (caused F if G) ∈ T and M |= G} M is a causal model of T iff M is the unique model of TM. • The transition system of C is defined by: – In any state s (set of fluents) consider the causal theory TK formed by the static rules and the dynamic rules true at that state ∪ K, where K is any set of actions – There is an arc from s to s’ labeled with K iff s’ is a causal model of TK. – Note that this way inertia is not obtained! 264 Yale shooting in C caused ¬alive if True after shoot ∧ loaded caused ¬loaded if True after shoot caused loaded if True after load • We still need to say that alive and loaded are inertial: caused alive if alive after alive caused loaded if loaded after loaded 265 Macros in C • Macro expressions have been defined for easing the representation of knowledge with C: – A causes F if G • standing for caused F if True after G ∧ A – inertial F • standing for caused F if F after F – always F • standing for caused ⊥ if ¬F – nonexecutable A if F • standing for caused ⊥ if F ∧ A – … 266 Extensions of C • Several extensions exist. E.g.
– C++ allowing for multi-valued fluents, and to encode resources – K allowing for reasoning with incomplete states – P and Q that extend C with rich query languages, allowing for querying various states, planning queries, etc 267 Part 4: Knowledge Evolution 4.6 Logic Programs Updates 268 Rule Updates • These languages and methodologies are basically concerned with facts that change – There is a set of fluents (facts) – There are static rules describing the domain, which are not subject to change – There are dynamic rules describing how the facts may change due to actions • What if the rules themselves, be they static or dynamic, are subject to change? – The rules of a given domain may change in time – Even the rules that describe the effects of actions may change (e.g. rules describing the effects of actions in physical devices that degrade with time) • What we have seen up to now does not help! 269 Languages for rule updates • Languages dealing with highly dynamic environments where, besides facts, also the static and dynamic rules of an agent may change, need: – Means of integrating knowledge updates from external sources (be it from user changes in the rules describing agent behavior, or simply from environment events) – Means for describing rules about the transition between states – Means for describing self-updates, and self-evolution of a program, and combining self-updates with external ones • We will study this in the setting of Logic Programming – First define what it means to update a (running) program by another (externally given) program – Then extend the language of Logic Programs to describe transitions between states (i.e.
some sort of dynamic rules) – Make sure that this deals with both self-updates (coming from the dynamic rules) and updates that come directly from external sources 270 Updates of LPs by LPs • Dynamic Logic Programming (DLP) [ALPPP98] was introduced to address the first of these concerns – It gives meaning to sequences of LPs • Intuitively a sequence of LPs is the result of updating P1 with the rules in P2, … – But different programs may also come from different hierarchical instances, different viewpoints (with preferences), etc. • Inertia is applied to rules rather than to literals – Older rules conflicting with newer applicable rules are rejected 271 Updating Models isn’t enough • When updating LPs, doing it model by model is not desired. It loses the directional information of the LP arrow. P: sleep ← not tv_on. watch ← tv_on. tv_on. M = {tv,w} U: not tv_on ← p_failure. p_failure. Mu = {pf,w} U2: not p_failure. Mu2 = {w} (rule inertia instead gives {pf,s} and then {tv,w}) • Inertia should be applied to rule instances rather than to their previous consequences. 272 Logic Programs Updates Example • One should not have to worry about how to incorporate new knowledge; the semantics should take care of it. Another example: Open-Day(X) ← Week-end(X). Week-end(23). Week-end(24). Sunday(24). Initial knowledge: The restaurant is open in the week-end not Open-Day(X) ← Sunday(X). New knowledge: On Sunday the restaurant is closed • Instead of rewriting the program we simply update it with the new rules. The semantics should consider the last update, plus all rule instances of the previous updates that do not conflict. 273 Generalized LPs • Programs with default negation in the head are meant to encode that something should no longer be true. – The generalization of the semantics is not difficult • A generalized logic program P is a set of propositional Horn clauses L ← L1, …, Ln where L and the Li are atoms from LK, i.e. of the form A or ´not A´. • Program P is normal if no head of a clause in P has the form not A.
274 Generalized LP semantics • A set M is an interpretation of LK if for every atom A in K exactly one of A and not A is in M. • Definition: An interpretation M of LK is a stable model of a generalized logic program P if M is the least model of the Horn theory P ∪ {not A : not A ∈ M}. 275 Generalized LPs example Example: K = {a,b,c,d,e} P: a ← not b. c ← b. e ← not d. not d ← a, not c. d ← not e. This program has exactly one stable model: M = least(P ∪ {not b, not c, not d}) = {a, e, not b, not c, not d}. N = {not a, not e, b, c, d} is not a stable model since N ≠ least(P ∪ {not a, not e}). 276 Dynamic Logic Programming • A Dynamic Logic Program P is a sequence of GLPs P1 ⊕ P2 ⊕ … ⊕ Pn • An interpretation M is a stable model of P iff: M = least([∪i Pi – Reject(M)] ∪ Default(M)) – From the union of the programs remove the rules that are in conflict with newer ones (rejected rules) – Then, if some atom has no rules, add (in Default) its negation – Compute the least model, and check stability 277 Rejection and Defaults • By default assume the negation of atoms that have no rule for them with true body: Default(M) = {not A | ∄r: head(r) = A and M |= body(r)} • Reject all rules with head A that belong to a former program, if there is a later rule with complementary head and a body true in M: Reject(M) = {r ∈ Pi | ∃r’ ∈ Pj, i ≤ j, head(r) = not head(r’) and M |= body(r’)} 278 Example P1: sleep ← not tv_on. watch ← tv_on. tv_on. P2: not tv_on ← p_failure. p_failure. P3: not p_failure. • {pf, sl, not tv, not wt} is the only SM of P1 ⊕ P2 – Rej = {tv ←} – Def = {not wt} – least([P1 ∪ P2 – {tv ←}] ∪ {not wt}) = M • {tv, wt, not sl, not pf} is the only SM of P1 ⊕ P2 ⊕ P3 – Rej = {pf ←} – Def = {not sl} 279 Another example P1: not fly(X) ← animal(X) P2: fly(X) ← bird(X) P3: not fly(X) ← penguin(X) P4: animal(X) ← bird(X) bird(X) ← penguin(X) animal(pluto) bird(duffy) penguin(tweety) Program P1 ⊕ P2 ⊕ P3 ⊕ P4 has a unique stable model in which fly(duffy) is true and both fly(pluto) and fly(tweety) are false.
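The stable-model definition for generalized programs can be checked by brute force over interpretations. A minimal Python sketch of the slide-275 example, encoding 'not a' as a separate atom (the encoding and names are ours):

```python
from itertools import product

K = ["a", "b", "c", "d", "e"]
# Slide program: a <- not b.  c <- b.  e <- not d.  not d <- a, not c.  d <- not e.
P = [("a", ["not b"]), ("c", ["b"]), ("e", ["not d"]),
     ("not d", ["a", "not c"]), ("d", ["not e"])]

def least(rules):
    """Least model of Horn rules whose heads/bodies are (possibly 'not') literals."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

def stable_models(K, P):
    """M is stable iff M = least(P + {not A : not A in M}); try all interpretations."""
    models = []
    for bits in product([True, False], repeat=len(K)):
        M = {a if t else "not " + a for a, t in zip(K, bits)}
        facts = [(lit, []) for lit in M if lit.startswith("not ")]
        if least(P + facts) == M:
            models.append({a for a in M if not a.startswith("not ")})
    return models
```

Running it confirms the slide: the program has exactly one stable model, with true atoms {a, e}.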
280 Some properties • If M is a stable model of the union P ∪ U of the programs P and U, then it is a stable model of the update program P ⊕ U. – Thus, the semantics of the update program P ⊕ U is always weaker than or equal to the semantics of P ∪ U. • If either P or U is empty, or if both P and U are normal programs, then the semantics of P ∪ U and P ⊕ U coincide. – DLP extends the semantics of stable models 281 What is still missing • DLP gives meaning to sequences of LPs • But how to come up with those sequences? – Changes may be additions or retractions – Updates may be conditional on a present state – Some rules may represent (persistent) laws • Since LP can be used to describe knowledge states and also sequences of updating states, it’s only fitting that LP is used too to describe transitions, and thus come up with such sequences 282 LP Update Languages • Define languages that extend LP with features that allow defining dynamic (state transition) rules – Put, on top of it, a language with sets of meaningful commands that generate DLPs (LUPS, EPI, KABUL) or – Extend the basic LP language minimally in order to allow for this generation of DLPs (EVOLP) 283 What do we need to make LPs evolve? • Programs must be allowed to evolve – Meaning of programs should be sequences of sets of literals, representing evolutions – Needed: a construct to assert new information – nots in the heads, to allow newer rules to supervene older ones • Program evolution may be influenced by the outside – Allow external events – … written in the language of programs 284 EVOLP Syntax • EVOLP rules are Generalized LP rules (possibly with nots in heads) plus the special predicate assert/1 • The argument of assert is an EVOLP rule (i.e.
arbitrary nesting of assert is allowed) • Examples: assert( a ← not b) ← d, not e not a ← not assert( assert(a ← b) ← not b), c • EVOLP programs are sets of EVOLP rules 285 Meaning of Self-evolving LPs • Determined by sequences of sets of literals • Each sequence represents a possible evolution • The nth set in a sequence represents what is true/false after n steps in that evolution • The first set in the sequences is a SM of the LP, where assert/1 literals are viewed as normal ones • If assert(Rule) belongs to the nth set, then the (n+1)th sets must consider the addition of Rule 286 Intuitive example a ← assert(b ←) assert(c ←) ← b • At the beginning a is true, and so is assert(b ←) • Therefore, rule b ← is asserted • At the 2nd step, b becomes true, and so does assert(c ←) • Therefore, rule c ← is asserted • At the 3rd step, c becomes true. < {a, assert(b ←)}, {a, b, assert(b ←), assert(c ←)}, {a, b, c, assert(b ←), assert(c ←)} > 287 Self-evolution definitions • An evolution interpretation of P over L is a sequence <I1,…,In> of sets of atoms from Las • The evolution trace of <I1,…,In> is <P1,…,Pn>: P1 = P and Pi = {R | assert(R) ∈ Ii-1} (2 ≤ i ≤ n) • Evolution traces contain the programs imposed by interpretations • We have now to check whether each nth set complies with the programs up to n-1 288 Evolution Stable Models • <I1,…,In>, with trace <P1,…,Pn>, is an evolution stable model of P, iff ∀ 1 ≤ i ≤ n, Ii is a SM of the DLP: P1 ⊕ … ⊕ Pi • Recall that I is a stable model of P1 ⊕ … ⊕ Pn iff I = least( (∪Pi – Rej(I)) ∪ Def(I) ) where: – Def(I) = {not A | ∄(A ← Body) ∈ ∪Pi, Body ⊆ I} – Rej(I) = {L0 ← Bd ∈ Pi | ∃(not L0 ← Bd’) ∈ Pj, i ≤ j ≤ n, and Bd’ ⊆ I} 289 Simple example • <{a, assert(b ← a)}, {a, b, c, assert(not a ←)}, {assert(b ← a)}> is an evolution SM of P: a ← assert(b ← a) ← not c assert(not a ←) ← b c ← assert(not a ←) • The trace is <P, {b ← a}, {not a ←}> 290 Example with various evolutions • No matter what, assert c; if a is not going
to be asserted, then assert b; if c is true, and b is not going to be asserted, then assert a. assert(b) ← not assert(a). assert(a) ← not assert(b), c. assert(c) ←. • Paths in the graph below are evolution SMs [graph: states b,c,ast(a) — b,c,ast(b) — a,b,c,ast(a) — a,b,c,ast(b), each also with ast(c)]
291 Event-aware programs • Self-evolving programs are closed to the outside world! • Events may come from the outside: – Observations of facts or rules – Assertion orders • Both can be written in the EVOLP language • Influence from outside should not persist by inertia
292 Event-aware programs • Events may come from the outside: – Observations of facts or rules – Assertion orders • Both can be written in the EVOLP language • An event sequence is a sequence of sets of EVOLP rules. • <I1,…,In>, with trace <P1,…,Pn>, is an evolution SM of P given <E1,…,Ek>, iff ∀ 1 ≤ i ≤ n, Ii is a SM of the DLP: P1 ⊕ P2 ⊕ … ⊕ (Pi ∪ Ei)
293 Simple example • The program says that: whenever c, assert a ← b • The events were: 1st, c was perceived; 2nd, an order to assert b; 3rd, an order to assert not a P: assert(a ← b) ← c Events: <{c ←}, {assert(b ←)}, {assert(not a ←)}> [diagram: the resulting step-by-step evolution under these events]
294 Yale shooting with EVOLP • There is a turkey, initially alive: alive(turkey) • Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded assert(not alive(turkey)) ← loaded, shoot. assert(not loaded) ← shoot. • Loading a gun results in a loaded gun assert(loaded) ← load. • Events of shoot, load, wait, etc. make the program evolve • After some time, the shooter becomes older, has sight problems, and no longer hits the turkey when without glasses.
Add event: assert( not assert(not alive(turkey)) ← not glasses )
295 LUPS, EPI and KABUL languages • Sequences of commands build sequences of LPs • There are several types of commands: assert, assert event, retract, always, … always (not a ← b, not c) when d, not e • EPI extends LUPS to allow for: – commands whose execution depends on other commands – external events to condition the KB evolution • KABUL extends LUPS and EPI with nesting
296 LUPS Syntax • Statements (commands) are of the form: assert [event] RULE when COND – asserts RULE if COND is true at that moment. The RULE is non-inertial if with the keyword event. retract [event] RULE when COND – the same for rule retraction always [event] RULE when COND – from then onwards, whenever COND holds, assert RULE (as an event if with the keyword event) cancel RULE when COND – cancels an always command
297 LUPS as EVOLP • The behavior of all LUPS commands can be constructed in EVOLP. E.g.: • always (not a ← b, not c) when d, not e coded as the event: assert( assert(not a ← b, not c) ← d, not e ) • always event (a ← b) when c coded as the events: assert( assert(a ← b, ev(a ← b)) ← c ) assert( assert(ev(a ← b)) ← c ) plus: assert( not ev(R) ) ← ev(R), not assert(ev(R))
298 EVOLP features • All LUPS and EPI features are EVOLP features: – Rule updates; persistent updates; simultaneous updates; events; commands dependent on other commands; … • Many extra features (some of them in KABUL) can be programmed: – Commands that span over time – Events with incomplete knowledge – Updates of persistent laws – Assignments – …
299 More features • EVOLP extends the syntax and semantics of logic programs – If no events are given, and no asserts are used, the semantics coincides with the stable models – A variant of EVOLP (and DLP) has been defined also extending WFS – An implementation of the latter is available • EVOLP was shown to properly embed the action languages A, B, and C.
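The stable-model-based evolution above is easiest to see in the restricted case of definite (negation-free) programs, where each step has a unique (least) model. The Python sketch below is illustrative only — not the actual EVOLP implementation — and computes an evolution trace in that restricted setting, with event rules joined non-inertially at their own step only:

```python
# Hedged sketch of EVOLP evolution, restricted to definite
# (negation-free) programs so that each step has a unique stable model:
# the least model of the union of all trace programs plus that step's
# events. Rules are (head, body) pairs; assert atoms are ("assert", rule).

def least_model(rules):
    """Least Herbrand model of a definite program (iterate T_P to fixpoint)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

def evolve(program, events):
    """One model per step; asserted rules persist in the trace,
    while event rules hold only at their own step (no inertia)."""
    trace, models = [list(program)], []
    for step_events in events:
        combined = [r for p in trace for r in p] + list(step_events)
        m = least_model(combined)
        models.append(m)
        # every assert(Rule) atom in the model adds Rule at the next step
        trace.append([a[1] for a in m
                      if isinstance(a, tuple) and a[0] == "assert"])
    return models

# The intuitive example of slide 286:  a.   assert(b <-).   assert(c <-) <- b.
P = [("a", ()),
     (("assert", ("b", ())), ()),
     (("assert", ("c", ())), ("b",))]
models = evolve(P, [[], [], []])  # three steps, no external events
```

Passing non-empty sets in `events` mimics, in this restricted setting, the event-aware semantics of slide 292: the event rules contribute to one step's model but are not kept in the trace.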
300 EVOLP possible applications • Legal reasoning • Evolving systems, with external control • Reasoning about actions • Active Data (and Knowledge) Bases • Static program analysis of agents’ behavior • …
301 … and also • EVOLP is a concise, simple and quite powerful language to reason about KB evolution – Powerful: it can do everything other update and action languages can, and much more – Simple and concise: much better for proving properties of KB evolution • EVOLP: a firm formal basis in which to express, implement, and reason about dynamic KBs • Sometimes it may be regarded as too low level. – Macros with the most used constructs can help, e.g. as in the translation of LUPS’ always event command
302 Suitcase example • A suitcase has two latches, and is opened whenever both are up: open ← up(l1), up(l2) • There is an action of toggling applicable to each latch: assert(up(X)) ← not up(X), toggle(X) assert(not up(X)) ← up(X), toggle(X)
303 Abortion Example • Once Republicans take over both Congress and the Presidency they establish the law stating that abortions are punishable by jail assert(jail(X) ← abortion(X)) ← repCongress, repPresident • Once Democrats take over both Congress and the Presidency they abolish such a law assert(not jail(X) ← abortion(X)) ← not repCongress, not repPresident • Performing an abortion is an event, i.e., a non-inertial update. – I.e. we will have events of the form abortion(mary)… • The change of congress is inertial – I.e. the recent change in the congress can be modeled by the event assert(not repCongress)
304 Twice fined example • A car driver loses his license after a second fine. He can regain the license if he undergoes a refresher course at the drivers’ school. assert(not license ← fined, probation) ← fined assert(probation) ← fined assert(license) ← attend_school assert(not probation) ← attend_school
305 Bank example • An account accepts deposits and withdrawals.
The latter are only possible when there is enough balance: assert(balance(Ac,B+C)) ← changeB(Ac,C), balance(Ac,B) assert(not balance(Ac,B)) ← changeB(Ac,C), balance(Ac,B) changeB(Ac,D) ← deposit(Ac,D) changeB(Ac,-W) ← withdraw(Ac,W), balance(Ac,B), B > W. • Deposits and withdrawals are added as events. E.g. – {deposit(1012,10), withdraw(1111,5)}
306 Bank example (cont) • The bank now changes its policy, and no longer accepts deposits under 50 €. Event: assert( not changeB(Ac,D) ← deposit(Ac,D), D < 50 ) • Next, VIP accounts are allowed a negative balance up to an account-specified limit: assert( changeB(Ac,-W) ← vip(Ac,L), withdraw(Ac,W), balance(Ac,B), B+L > W ).
307 Email agent example • Personal assistant agent for e-mail management, able to: – Perform basic actions of sending, receiving, deleting messages – Store and move messages between folders – Filter spam messages – Send automatic replies and forward messages – Notify the user of special situations • All of this may depend on user-specified criteria • The specification may change dynamically
308 EVOLP for e-mail Assistant • If the user specifies, once and for all, a consistent set of policies triggering actions, then any existing (commercial) assistant would do the job. • But if we allow the user to update his policies, and to specify both positive (e.g. “…must be deleted”) and negative (e.g. “…must not be deleted”) instances, soon the union of all policies becomes inconsistent • We cannot expect the user to debug the set of policy rules so as to invalidate all the old rules (instances) contravened by newer ones. • Some automatic way to resolve inconsistencies due to updates is needed.
309 EVOLP for e-mail Assistant (cont) • EVOLP provides an automatic way of removing inconsistencies due to updates: – With EVOLP the user simply states whatever new is to be done, and lets the agent automatically determine which old rules may persist and which not.
– We are not presupposing the user is contradictory, but just that he keeps updating his profile • EVOLP further allows: – Postponed addition of rules, depending on user-specified criteria – Dynamic changes in policies, triggered by internal and/or external conditions – Commands that span over various states – …
310 An EVOLP e-mail Assistant • In the following we show some policy rules of the EVOLP e-mail assistant. – A more complete set of rules, and the results given by EVOLP, can be found in the corresponding paper • Basic predicates: – New messages come as events of the form: newmsg(Identifier, From, Subject, Body) – Messages are stored via predicates: msg(Identifier, From, Subj, Body, TimeStamp) and in(Identifier, FolderName)
311 Simple e-mail EVOLP rules • By default messages are stored in the inbox: assert(msg(M,F,S,B,T)) ← newmsg(M,F,S,B), time(T), not delete(M). assert(in(M,inbox)) ← newmsg(M,F,S,B), not delete(M). assert(not in(M,F)) ← delete(M), in(M,F). • Spam messages are to be deleted: delete(M) ← newmsg(M,F,S,B), spam(F,S,B). • The definition of spam can be done by LP rules: spam(F,S,B) ← contains(S,credit). • This definition can later be updated: not spam(F,S,B) ← contains(F,my_accountant).
312 More e-mail EVOLP rules • Messages can be automatically moved to other folders. When that happens (not shown here) the user wants to be notified: notify(M) ← newmsg(M,F,S,B), assert(in(M,F)), assert(not in(M,inbox)). • When a message is marked both for deletion and for automatic move to another folder, the deletion should prevail: not assert(in(M,F)) ← move(M,F), delete(M). • The user is organizing a conference, assigning papers to referees. After receipt of a referee’s acceptance, a new rule is to be added, which forwards to the referee any messages about assigned papers: assert(send(R,S,B1) ← newmsg(M1,F,S,B1), contains(S,Id), assign(Id,R)) ← newmsg(M2,R,Id,B2), contains(B2,accept).
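The update behaviour behind the spam rules above — a later `not spam` rule overriding an older `spam` rule — can be mimicked by a toy "newest applicable rule wins" policy. The helper names and message format below are hypothetical, purely to illustrate the rejection idea, not EVOLP's actual machinery:

```python
# Toy sketch of update-based conflict resolution for the spam example:
# rules are tagged with the state at which they were added, and a newer
# applicable rule rejects any older conflicting one. Hypothetical
# helper names and message format, for illustration only.

def classify(msg, rule_history):
    """rule_history: list of (state, verdict, condition) entries, where
    verdict is 'spam' or 'not_spam'. The newest applicable rule wins,
    mirroring rejection of older rules by newer updates."""
    applicable = [(state, verdict)
                  for state, verdict, cond in rule_history if cond(msg)]
    if not applicable:
        return "not_spam"          # default: nothing marks it as spam
    return max(applicable)[1]      # the latest state prevails

history = [
    # state 1: spam(F,S,B) <- contains(S, credit)
    (1, "spam",     lambda m: "credit" in m["subject"]),
    # state 2: not spam(F,S,B) <- contains(F, my_accountant)
    (2, "not_spam", lambda m: "my_accountant" in m["from"]),
]

msg1 = {"from": "unknown@x", "subject": "cheap credit now"}
msg2 = {"from": "my_accountant@y", "subject": "your credit report"}
```

Here `msg2` matches the old spam rule, but the newer rule from state 2 overrides it, just as the EVOLP update rejects the older rule instance.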
313 Part 5: Ontologies
314 Logic and Ontologies • Up to now we have studied logic languages for Knowledge Representation and Reasoning: – in both static and dynamic domains – with possibly incomplete knowledge and nonmonotonic reasoning – interacting with the environment and completing the knowledge, possibly contracting previous assumptions • All of this is parametric with a set of predicates and a set of objects • The meaning of a theory depends on, and is built on top of, the meaning of the predicates and objects
315 Choice of predicates • We want to represent that trailer trucks have 18 wheels. In 1st order logic: – ∀x trailerTruck(x) → hasEighteenWheels(x) or – ∀x trailerTruck(x) → numberOfWheels(x,18) or – ∀x ((truck(x) ∧ ∃y (trailer(y) ∧ part(x,y))) → ∃s (set(s) ∧ count(s,18) ∧ ∀w (member(w,s) → (wheel(w) ∧ part(x,w))))) • The choice depends on which predicates are available • For understanding (and sharing) the represented knowledge it is crucial that the meaning of predicates (and also of objects) is formally established
316 Ontologies • Ontologies establish a formal specification of the concepts used in representing knowledge • Ontology: originates from philosophy as a branch of metaphysics – Ontologia studies the nature of existence – Defines what exists and the relation between existing concepts (in a given domain) – Sought universal categories for classifying everything that exists
317 An Ontology • An ontology is a catalog of the types of things that are assumed to exist in a domain. • The types in an ontology represent the predicates, word senses, or concept and relation types of the language when used to discuss topics in the domain. • Logic says nothing about anything, but the combination of logic with an ontology provides a language that can express relationships about the entities in the domain of interest.
• Up to now we have implicitly assumed the ontology – I assumed that you understand the meaning of the predicates and objects involved in the examples
318 Aristotle’s Ontology [diagram: hierarchy of categories — Being, Substance, Accident, Inherence, Property, Relation, Directedness, Containment, Movement, Quality, Quantity, Activity, Passivity, Intermediacy, Having, Situated, Spatial, Temporal]
319 The Ontology • Effort to define and categorize everything that exists • Agreeing on the ontology makes it possible to understand the concepts • Efforts to define a big ontology, defining all concepts, still exist today: – The Cyc (from Encyclopedia) ontology (over 100,000 concept types and over 1M axioms) – Electronic Dictionary Research: 400,000 concept types – WordNet: 166,000 English word senses
320 Cyc Ontology
321 Cyc Ontology [diagram: upper levels of the Cyc hierarchy — Thing, Object, Event, Intangible, Represented Thing, Stuff, Collection, Intangible Object, Process, Occurrence, Relationship, Intangible Stuff, Attribute value, Internal machine thing, Slot, Attribute]
322 Small Ontologies • Designed for specific applications • How to make these coexist with big ontologies?
323 Domain-Specific Ontologies • Medical domain: – Cancer ontology from the National Cancer Institute in the United States • Cultural domain: – Art and Architecture Thesaurus (AAT), with 125,000 terms in the cultural domain – Union List of Artist Names (ULAN), with 220,000 entries on artists – Iconclass vocabulary of 28,000 terms for describing cultural images • Geographical domain: – Getty Thesaurus of Geographic Names (TGN), containing over 1 million entries
324 Ontologies and the Web • In the Web, ontologies provide shared understanding of a domain – It is crucial to deal with differences in terminology • To understand data in the web it is crucial that an ontology exists • To be able to automatically understand the data, and use it in a distributed environment, it is crucial that the ontology is: – Explicitly defined – Available on the Web • The Semantic Web initiative provides (web) languages for defining ontologies (RDF, RDF Schema, OWL)
325 Defining an Ontology • How to define a catalog of the types of things that are assumed to exist in a domain? – I.e. how to define an ontology for a given domain? • What makes an ontology? – Entities in a taxonomy – Attributes – Properties and relations – Facets – Instances • Similar to ER models in databases
326 Main Stages in Ontology Development 1. Determine scope 2. Consider reuse 3. Enumerate terms 4. Define taxonomy 5. Define properties 6. Define facets 7. Define instances 8. Check for anomalies Not a linear process!
327 Determine Scope • There is no single correct ontology of a specific domain – An ontology is an abstraction of a particular domain, and there are always viable alternatives • What is included in this abstraction should be determined by – the use to which the ontology will be put – future extensions that are already anticipated
328 Determine Scope (cont) • Basic questions to be answered at this stage are: – What is the domain that the ontology will cover? – What are we going to use the ontology for?
– For what types of questions should the ontology provide answers? – Who will use and maintain the ontology?
329 Consider Reuse • One rarely has to start from scratch when defining an ontology – In these web days, there is almost always an ontology available that provides at least a useful starting point for our own ontology • With the Semantic Web, ontologies will become even more widely available
330 Enumerate Terms • Write down in an unstructured list all the relevant terms that are expected to appear in the ontology – Nouns form the basis for class names – Verbs form the basis for property/predicate names • Traditional knowledge engineering tools (e.g. laddering and grid analysis) can be used to obtain – the set of terms – an initial structure for these terms
331 Define the Taxonomy • Relevant terms must be organized in a taxonomic is_a hierarchy – Opinions differ on whether it is more efficient/reliable to do this in a top-down or a bottom-up fashion • Ensure that the hierarchy is indeed a taxonomy: – If A is a subclass of B, then every object of type A must also be an object of type B
332 Define Properties • Often interleaved with the previous step • Attach properties to the highest class in the hierarchy to which they apply: – Inheritance applies to properties • While attaching properties to classes, it makes sense to immediately provide statements about the domain and range of these properties – Immediately define the domain of properties
333 Define Facets • Define extra conditions over properties – Cardinality restrictions – Required values – Relational characteristics • symmetry, transitivity, inverse properties, functional values
334 Define Instances • Filling the ontology with instances is a separate step • Number of instances >> number of classes • Thus populating an ontology with instances is typically not done manually – Retrieved from legacy data sources (DBs) – Extracted automatically from a text corpus
335 Check for Anomalies • Test whether the ontology
is consistent – For this, one must have a notion of consistency in the language • Examples of common inconsistencies – incompatible domain and range definitions for transitive, symmetric, or inverse properties – cardinality restrictions – requirements on property values can conflict with domain and range restrictions
336 Protégé • Java-based ontology editor • It supports Protégé-Frames and OWL as modeling languages – Frames is based on the Open Knowledge Base Connectivity protocol (OKBC) • It exports into various formats, including (Semantic) Web formats • Let’s try it
337 The newspaper example (part) [class hierarchy: :Thing with subclasses Author, Person, News Service, Editor, Content, Employee, Reporter, Salesperson, Article, Advertisement, Manager] • Properties (slots) – Persons have names, which are strings, phone numbers, etc. – Employees (further) have salaries that are positive numbers – Editors are responsible for other employees – Articles have an author, which is an instance of Author, and possibly various keywords • Constraints – Each article must have at least two keywords – The salary of an editor should be greater than the salary of any employee the editor is responsible for
338 Part 6: Description Logics
339 Languages for Ontologies • In the early days of Artificial Intelligence, ontologies were represented resorting to non-logic-based formalisms – Frame systems and semantic networks • Graphical representation – arguably easy to design – but difficult to manage with complex pictures – a formal semantics, allowing for reasoning, was missing
340 Semantic Networks • Nodes representing concepts (i.e.
sets of classes of individual objects) • Links representing relationships – IS_A relationship – More complex relationships may have nodes [diagram: semantic network with nodes Person, Female, Woman, Mother, Parent, and a hasChild (1,NIL) link]
341 Logics for Semantic Networks • Logic was used to describe the semantics of the core features of these networks – Relying on unary predicates for describing sets of individuals and binary predicates for relationships between individuals • Typical reasoning used in structure-based representation does not require the full power of 1st order theorem provers – Specialized reasoning techniques can be applied
342 From Frames to Description Logics • Specialized logical languages for describing ontologies • The name changed over time – Terminological systems, emphasizing that the language is used to define a terminology – Concept languages, emphasizing the concept-forming constructs of the languages – Description Logics, moving attention to the properties of the languages, including decidability, complexity, and expressivity
343 Description Logic ALC • ALC is the smallest propositionally closed Description Logic. Syntax: – Atomic types: • Concept names, which are unary predicates • Role names, which are binary predicates – Constructs • ¬C (negation) • C1 ⊓ C2 (conjunction) • C1 ⊔ C2 (disjunction) • ∃R.C (existential restriction) • ∀R.C (universal restriction)
344 Semantics of ALC • Semantics is based on interpretations (ΔI, ·I) where ·I maps: – Each concept name A to AI ⊆ ΔI • I.e. a concept denotes a set of individuals from the domain (unary predicates) – Each role name R to RI ⊆ ΔI × ΔI • I.e. a role denotes pairs of (binary relationships among) individuals • An interpretation is a model for concept C iff CI ≠ {} • Semantics can also be given by translating to 1st order logic
345 Negation, conjunction, disjunction • ¬C denotes the set of all individuals in the domain that do not belong to C. Formally – (¬C)I = ΔI – CI, i.e. {x: ¬C(x)} • C1 ⊔ C2 (resp.
C1 ⊓ C2) is the set of all individuals that belong to C1 or (resp. and) to C2 – (C1 ⊔ C2)I = C1I ∪ C2I, i.e. {x: C1(x) ∨ C2(x)} (resp. (C1 ⊓ C2)I = C1I ∩ C2I, i.e. {x: C1(x) ∧ C2(x)}) • Persons that are not female – Person ⊓ ¬Female • Male or Female individuals – Male ⊔ Female
346 Quantified role restrictions • Quantifiers are meant to characterize relationships between concepts • ∃R.C denotes the set of all individuals which relate via R to at least one individual in concept C – (∃R.C)I = {d ∈ ΔI | ∃e. (d,e) ∈ RI and e ∈ CI} – {x | ∃y (R(x,y) ∧ C(y))} • Persons that have a female child – Person ⊓ ∃hasChild.Female
347 Quantified role restrictions (cont) • ∀R.C denotes the set of all individuals for which all individuals to which they relate via R belong to concept C – (∀R.C)I = {d ∈ ΔI | (d,e) ∈ RI implies e ∈ CI} – {x | ∀y (R(x,y) → C(y))} • Persons all of whose children are female – Person ⊓ ∀hasChild.Female • The link in the network above – Parents have at least one child that is a person, and there is no upper limit on children – ∃hasChild.Person ⊓ ∀hasChild.Person
348 Elephant example • Elephants are grey mammals which have a trunk – Mammal ⊓ ∃bodyPart.Trunk ⊓ ∀color.Grey • Elephants are heavy mammals, except for Dumbo elephants, which are light – Mammal ⊓ (∀weight.Heavy ⊔ (Dumbo ⊓ ∀weight.Light))
349 Reasoning tasks in DL • What can we do with an ontology? What more does the logical formalism bring? • Reasoning tasks – Concept satisfiability (is there any model for C?) – Concept subsumption (does C1I ⊆ C2I hold for all I?) C1 ⊑ C2 • Subsumption is important because from it one can compute a concept hierarchy • Specialized (decidable and efficient) proof techniques exist for ALC that do not require the full power of 1st order theorem proving – Based on tableau algorithms
350 Representing Knowledge with DL • A DL knowledge base is made of – A TBox: terminological (background) knowledge • Defines concepts. • E.g.
Elephant ≐ Mammal ⊓ ∃bodyPart.Trunk – An ABox: knowledge about individuals, be it concepts or roles • E.g. dumbo:Elephant or (lisa,dumbo):hasChild • Similar to e.g. databases, where there exist a schema and an instance of a database.
351 General TBoxes • T is a finite set of equations of the form C1 ≐ C2 • I is a model of T if for all C1 ≐ C2 ∈ T, C1I = C2I • Reasoning: – Satisfiability: given C and T, find whether there is a model both of C and of T – Subsumption (C1 ⊑T C2): does C1I ⊆ C2I hold for all models of T?
352 Acyclic TBoxes • For decidability, TBoxes are often restricted to equations A ≐ C where A is a concept name (rather than an expression) • Moreover, concept A does not appear in the expression C, nor in the definition of any of the concepts there (i.e. the definitions are acyclic)
353 ABoxes • Define a set of individuals, as instances of concepts and roles • An ABox is a finite set of expressions of the form: – a:C – (a,b):R where both a and b are names of individuals, C is a concept and R a role • I is a model of an ABox if it satisfies all its expressions. It satisfies – a:C iff aI ∈ CI – (a,b):R iff (aI,bI) ∈ RI
354 Reasoning with TBoxes and ABoxes • Given a TBox T (defining concepts) and an ABox A (defining individuals) – Find whether there is a common model (i.e. find out about consistency) – Find whether a concept is subsumed by another concept (C1 ⊑T C2) – Find whether an individual belongs to a concept (A,T |= a:C), i.e.
whether aI ∈ CI for all models of A and T
355 Inference under ALC • Since the semantics of ALC can be defined in terms of 1st order logic, clearly 1st order theorem provers can be used for inference • However, ALC only uses a small subset of 1st order logic – Only unary and binary predicates, with a very limited use of quantifiers and connectives • Inference and algorithms can be much simpler – Tableau algorithms are used for ALC and most other description logics • ALC is also decidable, unlike 1st order logic
356 More expressive DLs • The limited use of 1st order logic has its advantages, but some obvious drawbacks: expressivity is also limited • Some concepts are not possible to define in ALC. E.g. – An elephant has exactly 4 legs • (expressing qualified number restrictions) – Every mother has (at least) a child, and every son is the child of a mother • (inverse role definition) – Elephants are animals • (defining concepts without giving necessary and sufficient conditions)
357 Extensions of ALC • ALCN extends ALC with unqualified number restrictions ≤n R, ≥n R and =n R – Denotes the individuals which relate via R to at least (resp. at most, exactly) n individuals – E.g. Person ⊓ (≥2 hasChild) • Persons with at least two children • The precise meaning is defined by (similarly for ≥ and =) – (≤n R)I = {d ∈ ΔI | #{e | (d,e) ∈ RI} ≤ n} • It is possible to define the meaning in terms of 1st order logic, with recourse to equality. E.g. – ≥2 R is {x: ∃y∃z (y ≠ z ∧ R(x,y) ∧ R(x,z))} – ≤2 R is {x: ∀y,z,w ((R(x,y) ∧ R(x,z) ∧ R(x,w)) → (y=z ∨ y=w ∨ z=w))}
358 Qualified number restrictions • ALCN can be further extended to include the more expressive qualified number restrictions (≤n R C), (≥n R C) and (=n R C) – Denotes the individuals which relate via R to at least (resp. at most, exactly) n individuals of concept C – E.g. Person ⊓ (≥2 hasChild Female) • Persons with at least two female children – E.g.
Mammal ⊓ (=4 bodyPart Leg) • Mammals with 4 legs • The precise meaning is defined by (similarly for ≥ and =) – (≤n R C)I = {d ∈ ΔI | #{e | (d,e) ∈ RI and e ∈ CI} ≤ n} • Again, it is possible to define the meaning in terms of 1st order logic, with recourse to equality. E.g. – (≥2 R C) is {x: ∃y∃z (y ≠ z ∧ C(y) ∧ C(z) ∧ R(x,y) ∧ R(x,z))}
359 Further extensions • Inverse relations – R⁻ denotes the inverse of R: R⁻(x,y) = R(y,x) • One-of constructs (nominals) – {a1, …, an}, where the ai are individuals, denotes one of a1, …, an • Statements of subsumption in TBoxes (rather than only definitions) • Role transitivity – Trans(R) denotes the transitive closure of R • SHOIN is the DL resulting from extending ALC with all the above described extensions – It is the underlying logic for the Semantic Web language OWL-DL – The less expressive language SHIF, without nominals, is the basis for OWL-Lite
360 Example • From the w3c wine ontology – Wine ⊑ PotableLiquid ⊓ (=1 hasMaker) ⊓ ∀hasMaker.Winery • Wine is a potable liquid with exactly one maker, and the maker must be a winery – ∃hasColor⁻.Wine ⊑ {“white”, “rose”, “red”} • Wines can be either white, rose or red. – WhiteWine ≐ Wine ⊓ ∃hasColor.{“white”} • White wines are exactly the wines with color white.
361 Part 7: Rules and Ontologies
362 Combining rules and ontologies • We now know how to represent (possibly incomplete, evolving, etc.) knowledge using rules, but assuming that the ontology is known. • We also learned how to represent ontologies. • To close the circle, we need to combine both. • The goal is to represent knowledge with rules that make use of an ontology for defining the objects and individuals – This is still a (hot) research topic!
– Crucial for using knowledge represented by rules in the context of the Web, where the ontology must be made explicit
363 Full integration of rules/ontologies • Amounts to: – Combining DL formulas with rules having no restrictions – The vocabularies are the same – Predicates can be defined either using rules or using DL • This approach encounters several problems – The base assumptions of DL and of non-monotonic rules are quite different, and so mixing them so tightly is not easy
364 Problems with integration • Rule languages (e.g. Logic Programming) use some form of closed world assumption (CWA) – Assume negation by default – This is crucial for reasoning with incomplete knowledge • DL, being a subset of 1st order logic, has no closed world assumption – The world is kept open in 1st order logic (OWA) – This is reasonable when defining concepts – Mostly, the ontology is desirably monotonic • What if a predicate is “defined” using both DL and LP rules? – Should its negation be assumed by default? – Or should it be kept open? – How exactly can one define what is CWA or OWA in this context?
365 CWA vs OWA • Consider the program P wine(X) ← whiteWine(X) nonWhiteWine(X) ← not whiteWine(X) wine(esporão_tinto) and the “corresponding” DL theory WhiteWine ⊑ Wine ¬WhiteWine ⊑ nonWhiteWine esporão_tinto:Wine • P derives nonWhiteWine(esporão_tinto) whilst the DL theory does not.
366 Modeling exceptions • In the following TBox the concept Penguin is unsatisfiable (necessarily empty): Bird ⊑ Flies Penguin ⊑ Bird ⊓ ¬Flies • The first assertion should be seen as allowing exceptions • This is easily dealt with by nonmonotonic rule languages, e.g.
logic programming, as we have seen
367 Problems with integration (cont) • DL uses classical negation while LP uses either default or explicit negation – Default negation is nonmonotonic – Like classical negation, explicit negation also does not assume a complete world and is monotonic – But classical negation and explicit negation are different – With classical negation it is not possible to deal with paraconsistency!
368 Classical vs Explicit Negation • Consider the program P wine(X) ← whiteWine(X) ¬wine(coca_cola) • and the DL theory WhiteWine ⊑ Wine coca_cola:¬Wine • The DL theory derives ¬WhiteWine(coca_cola) whilst P does not. – In logic programs with explicit negation, contraposition of implications is not possible/desired – Note, in this case, that contraposition would amount to assuming that no inconsistency is ever possible!
369 Problems with integration (cont) • Decidability is dealt with differently: – DL achieves decidability by enforcing restrictions on the form of formulas and predicates of 1st order logic, while still allowing for quantifiers and function symbols • E.g. it is still possible to talk about an individual without knowing who it is: ∃hasMaker.{esporão} ⊑ GoodWine – LP achieves decidability by restricting the domain and disallowing function symbols, but being more liberal in the format of formulas and predicates • E.g. it is still possible to express conjunctive formulas (e.g. those corresponding to joins in relational algebra): isBrother(X,Y) ← hasChild(Z,X), hasChild(Z,Y), X≠Y
370 Recent approaches to full integration • Several recent (and in progress) approaches attack the problem of full integration of DL and (nonmonotonic) rules: – Hybrid MKNF [Motik and Rosati 2007, to appear] • Based on interpreting rules as auto-epistemic formulas (cf. previous comparison of LP and AEL) • The DL part is added as a 1st order theory, together with the rules – Equilibrium Logics [Pearce et al. 2006] – Open Answer Sets [Heymans et al.
2004]
371 Interaction without full integration • Other approaches combine (DL) ontologies with (nonmonotonic) rules without fully integrating them: – Tight semantic integration • Separate rule and ontology predicates • Adapt existing semantics for rules in the ontology layer • Adopted e.g. in DL+log [Rosati 2006] and the Semantic Web proposal SWRL [w3c proposal 2005] – Semantic separation • Deal with the ontology as an external oracle • Adopted e.g. in dl-Programs [Eiter et al. 2005] (to be studied next)
372 Nonmonotonic dl-Programs • Extend logic programs, under the answer-set semantics, with queries to DL knowledge bases • There is a clean separation between the DL knowledge base and the rules – Makes it possible to use DL engines on the ontology and an ASP solver on the rules, with adaptations for the interface • Prototype implementations exist (see dlv-Hex) • The definition of the semantics is close to that of answer sets • It also allows changing the ABox of the DL knowledge base when querying – This permits a limited form of flow of information from the LP part into the DL part
373 dl-Programs • dl-Programs include a set of (logic program) rules and a DL knowledge base (a TBox and an ABox) • The semantics of the DL part is independent of the rules – Just use the semantics of the DL language, completely ignoring the rules • The semantics of the dl-Program comes from the rules – It is an adaptation of the answer-set semantics of the program, now taking into consideration the DL (as a kind of oracle)
374 dl-atoms to query the DL part • Besides the usual atoms (that are to be “interpreted” in the rules), the logic program may have dl-atoms that are “interpreted” in the DL part • Simple example: DL[Bird](“tweety”) – It is true in the program if in the DL ontology the concept Bird includes the element “tweety” • Usage in a rule: flies(X) ← DL[Bird](X), not ab(X) – The query Bird(X) is made in the DL ontology and used in the rule
375 More on dl-atoms • To allow flow of
information from the rules to the ontology, dl-atoms allow adding elements to the ABox before querying: DL[Penguin ⊎ my_penguin;Bird](X) – First add to the ABox p:Penguin for each individual p such that my_penguin(p) (in the rule part), and then query for Bird(X) • Additions can also be made for roles (with binary rule predicates) and for negative concepts and roles. E.g.: DL[Penguin ⊌ nonpenguin;Bird](X) – In this case p:¬Penguin is added for each nonpenguin(p) 376

The syntax of dl-Programs • A dl-Program is a pair (L,P) where – L is a description logic knowledge base – P is a set of dl-rules • A dl-rule is: H ← A1, …, An, not B1, …, not Bm (n, m ≥ 0) where H is an atom and the Ais and Bis are atoms or dl-atoms • A dl-atom is: DL[S1 op1 p1, …, Sn opn pn;Q](t) (n ≥ 0) where Si is a concept (resp. role), opi is either ⊎ or ⊌, pi is a unary (resp. binary) predicate and Q(t) is a DL-query. 377

DL-queries • Besides querying for concepts, as in the examples, dl-atoms also allow querying for roles, and for concept subsumption.
• A DL-query is either – C(t) for a concept C and term t – R(t1,t2) for a role R and terms t1 and t2 – C1 ⊑ C2 for concepts C1 and C2 378

Interpretations in dl-Programs • Recall that the Herbrand base HP of a logic program is the set of all instantiated atoms from the program, with the existing constants • In dl-Programs the constants are both those in the rules and the individuals in the ABox of the ontology • As usual, a 2-valued interpretation is a subset of HP 379

Satisfaction of atoms wrt L • Satisfaction wrt a DL knowledge base L – For (rule) atoms: I |=L A iff A ∈ I I |=L not A iff A ∉ I – For dl-atoms: I |=L DL[S1 op1 p1, …, Sn opn pn;Q](t) iff L ∪ A1(I) ∪ … ∪ An(I) |= Q(t) where – Ai(I) = {Si(c) | pi(c) ∈ I} if opi is ⊎ – Ai(I) = {¬Si(c) | pi(c) ∈ I} if opi is ⊌ 380

Models of a Program • Models can be defined for other formulas by extending |= with: – I |=L not A iff I |≠L A – I |=L F, G iff I |=L F and I |=L G – I |=L H ← G iff I |=L H or I |≠L G for atom H, atom or dl-atom A, and formulas F and G • I is a model of a program (L,P) iff for every rule H ← G ∈ P, I |=L H ← G • I is a minimal model of (L,P) iff there is no other I’ ⊂ I that is a model of (L,P) • I is the least model of (L,P) if it is the only minimal model of (L,P) • It can be proven that every positive dl-program (without default negation) has a least model 381

Alternative definition of Models • Models can also be defined similarly to what has been done above for normal programs, via an evaluation function ÎL: – For an atom A, ÎL(A) = 1 if I |=L A, and = 0 otherwise – For a formula F, ÎL(not F) = 1 - ÎL(F) – For formulas F and G: • ÎL((F,G)) = min(ÎL(F), ÎL(G)) • ÎL(F ← G) = 1 if ÎL(F) ≥ ÎL(G), and = 0 otherwise • I is a model of (L,P) iff, for every rule H ← B of P: ÎL(H ← B) = 1 • This definition easily allows for extensions to 3-valued interpretations and models (not yet explored!)
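The satisfaction of dl-atoms wrt L and the least-model construction above can be sketched in Python. This is a minimal illustration only, for a ground program: each dl-atom is modelled as an oracle function over interpretations, the Gelfond-Lifschitz reduct (known from answer sets of normal programs, here treating dl-atoms as regular atoms) gives the answer-set check, and all names (dl_bird, gl_reduct, the tweety program, etc.) are illustrative assumptions, not part of any actual dl-Program system such as dlv-Hex.

```python
# Sketch of dl-program evaluation for a ground (propositional) program.
# Rules are (head, positive body, negative body); body elements are either
# regular atoms (strings, checked in I) or dl-atoms (oracle callables).

def dl_bird(I):
    """Toy oracle for the dl-atom DL[Penguin ⊎ my_penguin; Bird](tweety).

    It stands for: extend the ABox with p:Penguin for every p such that
    my_penguin(p) is in I, then ask whether the DL base (whose TBox is
    assumed to contain Penguin ⊑ Bird) entails Bird(tweety).
    """
    return "my_penguin(tweety)" in I

program = [
    ("my_penguin(tweety)", [], []),
    ("flies(tweety)", [dl_bird], ["ab(tweety)"]),
]

def holds(b, I):
    """Satisfaction wrt L: dl-atoms query the oracle, atoms are checked in I."""
    return b(I) if callable(b) else b in I

def gl_reduct(program, I):
    """Gelfond-Lifschitz reduct P/I, treating dl-atoms as regular atoms:
    drop rules with a default literal not A such that A holds in I, then
    drop the remaining default literals."""
    return [(h, pos, []) for (h, pos, neg) in program
            if not any(holds(b, I) for b in neg)]

def least_model(positive):
    """Least model of a positive dl-program, by fixpoint iteration from {}.
    Terminates because ⊎/⊌ oracles are monotone in I."""
    I = set()
    while True:
        new = {h for (h, pos, _) in positive if all(holds(b, I) for b in pos)}
        if new == I:
            return I
        I = new

def is_answer_set(program, I):
    """I is an answer set of (L,P) iff I = least(L, P/I)."""
    return least_model(gl_reduct(program, I)) == I
```

With this toy oracle, {my_penguin(tweety), flies(tweety)} is an answer set: my_penguin(tweety) pushes tweety into the ABox as a Penguin, the oracle then derives Bird(tweety), and nothing makes tweety abnormal, so flies(tweety) follows.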
382

Reduct of dl-Programs • Let (L,P) be a dl-Program • Define the Gelfond-Lifschitz reduct P/I as for normal programs, treating dl-atoms as regular atoms • P/I is obtained from P by – Deleting all rules whose body contains not A and I |=L A (where A is either a regular atom or a dl-atom) – Deleting all the remaining default literals 383

Answer-sets of dl-Programs • Let least(L,P) be the least model of P wrt L, where P is a positive program (i.e. without negation by default) • I is an answer-set of (L,P) iff I = least(L,P/I) • Explicit negation can be used in P, and is treated just like in answer-sets of extended logic programs 384

Some properties • An answer-set of a dl-Program (L,P) is a minimal model of (L,P) • Programs without default or explicit negation always have an answer-set • If the program is stratified then it has a single answer-set • If P has no dl-atoms then the semantics coincides with the answer-set semantics of normal and extended programs 385

An example (from [Eiter et al. 2006]) • Assume the w3c wine ontology, defining concepts about wines, and with an ABox with several wines • Besides the ontology, there is a set of facts in an LP defining some persons and their preferences regarding wines • Find a set of wines for dinner that makes everybody happy (regarding their preferences) 386

Wine Preferences Example
%Get wines from the ontology
wine(X) ← DL[“Wine”](X).
%Persons and preferences in the program
person(axel). preferredWine(axel,whiteWine).
person(gibbi). preferredWine(gibbi,redWine).
person(roman). preferredWine(roman,dryWine).
%Available bottles a person likes
likes(P,W) ← preferredWine(P,sweetWine), wine(W), DL[“SweetWine”](W).
likes(P,W) ← preferredWine(P,dryWine), wine(W), DL[“DryWine”](W).
likes(P,W) ← preferredWine(P,whiteWine), wine(W), DL[“WhiteWine”](W).
likes(P,W) ← preferredWine(P,redWine), wine(W), DL[“RedWine”](W).
%Available bottles a person dislikes
dislikes(P,W) ← person(P), wine(W), not likes(P,W).
%Generation of the various possibilities of choosing wines
bottleChosen(W) ← wine(W), person(P), likes(P,W), not nonChosen(W).
nonChosen(W) ← wine(W), person(P), likes(P,W), not bottleChosen(W).
%Each person must have a bottle of his preference
happy(P) ← bottleChosen(W), likes(P,W).
false ← person(P), not happy(P), not false. 387

Wine example continued • Suppose that later we learn about some wines not in the ontology • One may add facts in the program for such new wines. E.g.: white(joão_pires). ¬dry(joão_pires). • To allow for integrating this knowledge with that of the ontology, the 1st rule must be changed: wine(X) ← DL[“WhiteWine”⊎white,“DryWine”⊌¬dry;“Wine”](X) • In general more should be added in this rule (to allow, e.g., for adding red wines, non-red wines, etc.) • Try more examples in dlv-Hex! 388

About other approaches • This is just one of the current proposals for mixing rules and ontologies • Is this the approach? – There is currently debate on this issue • Is it enough to have just a loose coupling of rules and ontologies? – It certainly helps for implementations, as it allows for re-using existing implementations of DL alone and of LP alone. – But is it expressive enough in practice? 389

Extensions • A Well-Founded-based semantics for dl-Programs [Eiter et al. 2005] exists – But such a semantics does not yet exist for the other approaches • What about paraconsistency? – Mostly it is yet to be studied! • What about belief revision with rules and ontologies? – Mostly it is yet to be studied! • What about abductive reasoning over rules and ontologies? – Mostly it is yet to be studied! • What about rule updates when there is an underlying ontology? – Mostly it is yet to be studied! • What about updates of both rules and ontologies? – Mostly it is yet to be studied! • What about … regarding combination of rules and ontologies? – Mostly it is yet to be studied!
• Plenty of room for PhD theses! – Currently it is a hot research topic with many applications, crying out for results! 390

Part 8: Wrap up 391

What we have studied (in a nutshell) • Logic rule-based languages for representing common-sense knowledge – and reasoning with those languages • Methodologies and languages for dealing with evolution of knowledge – Including reasoning about actions • Languages for defining ontologies • Briefly, on the recent topic of combining rules and ontologies 392

What we have studied (1) • Logic rule-based languages for representing common-sense knowledge – Started by pointing out the need for non-monotonicity to reason in the presence of incomplete knowledge – Then seminal nonmonotonic languages • Default Logics • Auto-epistemic logics – Focused on Logic Programming as a nonmonotonic language for representing knowledge 393

What we have studied (2) • Logic Programming for Knowledge Representation – Thorough study of semantics • of normal logic programs • of extended (paraconsistent) logic programs • including state-of-the-art semantics and corresponding systems – Corresponding proof procedures allowing for reasoning with Logic Programs – Programming under these semantics • Answer-Set Programming • Programming with tabling – Example methodology for representing taxonomies 394

What we have studied (3) • Knowledge evolution – Methods and semantics for dealing with the inclusion of new information (still in a static world) • Introduction to belief revision of theories • Belief revision in the context of logic programming • Abductive reasoning in the context of belief revision • Application to model-based diagnosis and debugging – Methods and languages for knowledge updates 395

What we have studied (4) • Methods and languages for knowledge updates – Methodologies for reasoning about changes • Situation calculus • Event calculus – Languages for describing knowledge that changes • Action languages • Logic programming update languages – Dynamic
LP and EVOLP with corresponding implementations 396

What we have studied (5) • Ontologies for defining objects, concepts, and roles, and their structure – Basic notions of ontologies – Ontology design (exemplified with Protégé) • Languages for defining ontologies – Basic notions of description logics for representing ontologies • Representing knowledge with rules and ontologies – To close the circle – Still a hot research topic 397

What type of issues • A mixture of: – Theoretical study of classical issues, well established for several years • E.g. default and autoepistemic logics, situation and event calculus, … – Theoretical study of state-of-the-art languages and corresponding systems • E.g. answer-sets, well-founded semantics, Dynamic LPs, Action languages, EVOLP, Description Logics, … – Practical usage of state-of-the-art systems • E.g. programming with ASP-solvers, with XSB-Prolog, XASP, … – Current research issues with still lots of open topics • E.g. combining rules and ontologies 398

What next in UNL? For MCL only, sorry • Semantic Web – Where knowledge representation is applied to the domain of the web, with a big emphasis on languages for representing ontologies on the web • Agents – Where knowledge representation is applied to multi-agent systems, with a focus on knowledge changes and actions • Integrated Logic Systems – Where you learn how logic programming systems are implemented • Project – A lot can be done in this area. – Just contact the professors of these courses! 399

What next in partner Universities?
Even more for MCL, this time 1st year only • In FUB – Module on Semantic Web, including a course on Description Logics • In TUD – Advanced course in KRR with seminars on various topics (this year FLogic, abduction and induction, …) – General game playing, in which KRR is used for developing general game-playing systems – Advanced course in Description Logics • In TUW – Courses on data- and knowledge-based systems, with much on answer-set programming • In UPM – Course on intelligent agents and multi-agent systems – Course on ontologies and the semantic web 400

The End From now onwards it is up to you! Study for the exam and do the project. I’ll always be available to help! 401