Natural deduction: the KM* system

In contrast to forward/backward reasoning systems, natural deduction systems use many inference rules, and at each step one of them must be selected for execution. However, data need not be converted to Horn formulas. This is advantageous when reviewing a proof in order to generate an understandable explanation of it.

Consider the language of PL expanded as follows:

  <line number> <statement> <justification>

where:
- <line number> is the number of the proof line on which <statement> appears,
- <statement> is a proposition, and
- <justification> explains how the statement was established.

Example:
  453  (implies A B)   asn
  454  A               premise
  455  B               (CE 453 454)

The justification (CE 453 454) says that statement B is inferred by the conditional elimination (CE) rule (the rule is called the informant of the justification) from the statements on lines 453 and 454 of the proof (they are the antecedents of the justification). If a statement is not derived by a rule (unlike the one on line 455), then it is either a premise or an assumption; the justifications premise (line 454) and asn (line 453) are examples.

Proof in the KM* system

A proof in the KM* system is a set of numbered lines. To prove something, we use a special meta-linguistic predicate, show, which explicitly states our intent to prove a statement. When the proof for that statement is found, we say that the show predicate is cancelled.

There are two types of inference rules in the KM* system:
- Elimination rules, which eliminate a logical connective. These are very similar to the simplification rules in Bundy's system.
- Introduction rules, which introduce a logical connective. These require more control knowledge and cannot be applied blindly.

Inference rules in the KM* system

1. Rule of indirect proof (IP): to prove P, assume (not P); if a contradiction is derived, P must be true.

  h  show P     (IP i j k)
  i  (not P)    asn
     ...
  j  Q
     ...
  k  (not Q)
2. Elimination rules:

Not Elimination (NE)
  i  (not (not P))
     P              (NE i)

Note on assumption boxes (as in the IP schema above): the lines in the box depend on the assumption (not P). Assumption (not P) is discharged when a contradiction is found. For a proof to be valid, all assumptions must be discharged. Lines within the box cannot be used elsewhere; they define the logical context of the assumption.

Elimination rules (contd.)

And Elimination (AE)
  i  (and P Q)
     P    (AE i)
     Q    (AE i)

Or Elimination (OE)
  i  (or P Q)
  j  (implies P R)
  k  (implies Q R)
     R    (OE i j k)

Conditional Elimination / modus ponens (CE)
  i  (implies P Q)
  j  P
     Q    (CE i j)

Bi-conditional Elimination / logical equivalence (BE)
  i  (iff P Q)
     (implies P Q)    (BE i)
     (implies Q P)    (BE i)

Introduction rules

Not Introduction (NI)
  h  show (not P)    (NI i j k)
  i  P               asn
     ...
  j  Q
     ...
  k  (not Q)

And Introduction (AI)
  i  P
  j  Q
     (and P Q)    (AI i j)
     (and Q P)    (AI j i)

Or Introduction (OI)
  i  P
     (or P Q)    (OI i)
     (or Q P)    (OI i)

Introduction rules (contd.)

Conditional Introduction (CI)
  h  show (implies P Q)    (CI i j)
  i  P                     asn
  j  show Q

Bi-conditional Introduction (BI)
  i  (implies P Q)
  j  (implies Q P)
     (iff P Q)    (BI i j)

Example: Assume that the following two premises are given:
  premise 1: There is no one in the office.
  premise 2: If the light is ON, then there must be someone in the office.
  show: The light is OFF.

Example (contd.)

Formal representation in the KM* framework:

  1  No-one-in-the-office                            premise
  2  (implies Light-on (not No-one-in-the-office))   premise
  3  (show (not Light-on))                           (NI 4 1 5)

The proof:

  4  Light-on                      asn
  5  (not No-one-in-the-office)    (CE 2 4)

For another example, see textbook page 98.

The Tiny Rule Engine (TRE): a partial implementation of the KM* system

To build a PS that implements the KM* system, we must first address the knowledge representation problem. The representation of declarative knowledge assumes the following: statements about the domain (assertions) are list structures, which have no meaning to the system.
We assume that all assertions stored in the KB are true (believed), and that everything the system knows is in the KB. New assertions are inserted into the KB by the assert! procedure. Assertions are retrieved from the KB by the fetch! procedure. We do not have a procedure for deleting assertions from the KB, because for now we assume that no information about links between assertions exists (i.e. we have no easy way to remove the consequences of a deleted assertion).

To search for an assertion in the KB, we need a pattern-matching procedure with unification to match the pattern against the expression. We assume that both the pattern and the expression may contain pattern variables, which start with ?.

Example:
  (caused-by ?agent event12)
  (?relation ?robot ?event)

Given the binding list {?relation = caused-by, ?agent = ?robot, ?event = event12}, these patterns are the same.

Representation of procedural knowledge in TRE

Procedural knowledge is represented by rules of the form

  (rule <trigger> <body>)

where:
- the trigger is a pattern that is matched against the assertions in the KB; if a match succeeds, a list of bindings is produced;
- the body is Lisp code which is evaluated in the environment resulting from the unification of the trigger and the matching assertion.

Example: Consider the rule "Everybody who eats ice-cream likes ice-cream." Its representation in the TRE framework is

  (rule (eats (?x ice-cream))
    (assert! `(likes ,?x ice-cream)))

If the KB contains the assertion (eats (Mary ice-cream)), then under the substitution (renaming) ?x = Mary the rule trigger and this assertion match; therefore the rule fires, and the body is evaluated in the binding list (or environment) {?x = Mary}.

Representation of procedural knowledge in TRE (cont.)

If a rule trigger contains a combination of assertions, we view this rule as a frame for several simpler rules, which are lexically scoped within that frame.
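The pattern matching with unification described above, where both the pattern and the expression may contain ?-variables, can be sketched as follows. This is an illustrative Python sketch (the names is_var and unify are mine, not TRE's; the engine's real matcher is the Lisp code in unify.lsp), using tuples to stand for Lisp lists and omitting an occurs check:

```python
def is_var(t):
    """Pattern variables are symbols (strings) starting with '?'."""
    return isinstance(t, str) and t.startswith("?")

def unify(a, b, env=None):
    """Return a binding list (dict) if a and b unify, else None (TRE's :FAIL)."""
    env = dict(env or {})

    def walk(t):                       # follow bindings already in the environment
        while is_var(t) and t in env:
            t = env[t]
        return t

    def go(a, b):
        a, b = walk(a), walk(b)
        if a == b:
            return True
        if is_var(a):
            env[a] = b
            return True
        if is_var(b):
            env[b] = a
            return True
        if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
            return all(go(x, y) for x, y in zip(a, b))
        return False

    return env if go(a, b) else None

# The example from the text:
env = unify(("caused-by", "?agent", "event12"),
            ("?relation", "?robot", "?event"))
# → {'?relation': 'caused-by', '?agent': '?robot', '?event': 'event12'}
```

Under this binding list the two patterns are the same, exactly as claimed in the example above.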
Example: Consider the following rule:

  (rule (transitive ?rel)
    (rule (?rel ?x ?y)
      (rule (?rel ?y ?z)
        (assert! `(,?rel ,?x ,?z)))))

Each inner rule is a rule in its own turn; it is added to the existing set of rules when the enclosing trigger matches an existing datum. This rule is equivalent to the following one:

  If (transitive ?rel) and (?rel ?x ?y) and (?rel ?y ?z)
  then (assert! `(,?rel ,?x ,?z))

To evaluate a form, we use the environment accumulated so far; this is called an alignment strategy.

Example: Consider a domain of blocks ?x, ?y, ?z stacked on a table, and let the KB contain the following assertions:

  (on D table)
  (on E D)
  (on F E)

Then the following rule will recognize the existence of such a tower of 3 blocks (the comments show the environment current in the bodies of the outer, middle and innermost rules, respectively):

  (rule (on ?x table)       ; current environment {?x = D}
    (rule (on ?y ?x)        ; current environment {?x = D, ?y = E}
      (rule (on ?z ?y)      ; current environment {?x = D, ?y = E, ?z = F}
        (assert! `(3-tower ,?x ,?y ,?z)))))

Example (contd.)

While this rule is being evaluated, the KB changes as follows:

Step 1: For ?x = D, the outer rule fires, and its body
  (rule (on ?y D)
    (rule (on ?z ?y)
      (assert! `(3-tower D ,?y ,?z))))
is entered in the KB.

Step 2: For ?y = E, the newly entered rule fires and its body
  (rule (on ?z E)
    (assert! `(3-tower D E ,?z)))
is entered in the KB.

Step 3: For ?z = F, the newly entered rule fires and its body
  (assert! `(3-tower D E F))
inserts (3-tower D E F) in the KB.

Note: Once a rule is entered in the KB, it remains there forever. Once a new assertion is entered in the KB, all rules matching that assertion fire. This makes the TRE model order-independent, but it also makes it inefficient, because the whole KB must be searched. To improve efficiency, the knowledge base (rules + assertions) is partitioned into dbclasses, such that the elements of these classes (assertions and rules) are likely to match.

Dbclasses

Each dbclass contains assertions and rules (trigger patterns) whose leftmost constant symbol is the same.
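The dbclass partitioning can be sketched as follows. This is an illustrative Python sketch (TRE's actual implementation is the Lisp code in data.lsp; the names dbclass_of, kb and assert_ are mine): assertions are stored in buckets keyed by their leftmost constant symbol, and matching is attempted only within a bucket.

```python
def dbclass_of(form, env=None):
    """Leftmost constant symbol of an assertion or trigger pattern.
    Tuples stand for Lisp lists; strings starting with '?' are pattern
    variables. A leading pattern variable must be bound in env."""
    env = env or {}
    t = form
    while isinstance(t, tuple):        # descend into nesting: ((a b) c) -> a
        t = t[0]
    if isinstance(t, str) and t.startswith("?"):
        if t not in env:               # unbound leading variable is an error
            raise ValueError("unbound variable %s cannot index a dbclass" % t)
        t = env[t]
    return t

kb = {}                                # dbclass symbol -> list of assertions

def assert_(form):
    """Insert an assertion into the bucket for its dbclass."""
    kb.setdefault(dbclass_of(form), []).append(form)

assert_(("implies", "A", "B"))         # dbclass: implies
assert_("data123")                     # dbclass: data123
assert_((("a", "b"), "c"))             # dbclass: a
```

Because rules and assertions from different dbclasses cannot match, a new pattern is tested only against the single bucket returned by dbclass_of.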
Example:
  dbclass ((implies A B)) = implies
  dbclass (data123)       = data123
  dbclass (((a b) c))     = a

Rules and assertions from different dbclasses cannot match. Therefore, adding new components to the KB requires testing them only against assertions/rules from the same dbclass.

Note: It is required that the leftmost symbol in a pattern is a constant or a bound variable. This imposes restrictions on the way knowledge is represented. For example, to represent a relationship between A and B, we must say (A <predicate> B) rather than using the more widely used representation (<predicate> A B).

Given an assertion or a trigger pattern, we can find the class this assertion/pattern belongs to by extracting its first constant symbol (if this symbol is a pattern variable, it must be bound; otherwise an error occurs).

The Tiny Rule Engine (TRE) architecture

[Diagram: new assertions and new rules, added incrementally by the user, enter a knowledge base of assertions (1..N) and rules (1..K); the inference engine derives new assertions, instantiates rules, and places rule bodies on a queue for execution.]

The Inference Engine must ensure that:
1. After a new assertion is entered, all rules are tested for firing.
2. After a new rule is added, all assertions are screened to test whether this rule can fire.

If a match succeeds, the body of that rule plus the binding list resulting from the successful unification are stored on the Queue for execution.

The design of the unifier

The unifier takes an assertion, a pattern, and the current environment (a list of bindings, implemented as an association list, where the keys are pattern variables and the values are the bindings of those variables). There are three possible outcomes:
  :FAIL             ==> failure of a match
  NIL               ==> the assertion and the pattern match exactly, and the initial environment is NIL
  a new environment ==> the match succeeds, extending the current environment with new bindings

The design of the queue

Rule-body / environment pairs resulting from a successful match between entities in the KB can be added anywhere on the queue, but we assume a LIFO strategy here.

The implementation of TRE

The following files are needed to run the TRE:
  tinter.lsp  Provides the organizing data structure and interface procedures
  data.lsp    Defines the dbclass data structure and provides the database procedures
  rules.lsp   Defines rules and the queue
  unify.lsp   Defines variables and pattern matching

To test TRE, load treex1new.lsp. This example utilizes a subset of KM* rules (those that do not require assumptions). TRE assertions are used to code sentences, and TRE rules can be used to code KM* inference rules. Some control knowledge is wired into the TRE rules for more efficient execution, as discussed next.

TRE coding of KM* rules (that do not require assumptions)

1. Elimination rules that do not require assumptions, but may require some control knowledge for more efficient execution:

Not Elimination
  (rule (not (not ?p))
    (assert! ?p))

And Elimination
  (rule (and . ?conjuncts)
    (dolist (con ?conjuncts)
      (assert! con)))

Or Elimination
  (rule (show ?r)
    (rule (or ?p ?q)
      (assert! `(show (implies ,?p ,?r)))
      (assert! `(show (implies ,?q ,?r)))
      (rule (implies ?p ?r)
        (rule (implies ?q ?r)
          (assert! ?r)))))

This "control" rule suggests that the OE rule may be useful. However, there may be problems with the execution of this rule.

The TRE coding of KM* rules (that do not require assumptions)

Conditional Elimination
  (rule (implies ?ante ?cons)
    (rule ?ante
      (assert! ?cons)))

  (rule (show ?q)
    (unless (fetch ?q)
      (rule (implies ?p ?q)
        (assert! `(show ,?p)))))

This second "control" rule suggests that if the system wants to derive q, it should first check that q is not already in the DB; if that is the case and it knows that p implies q, it makes sense to try to derive p. Note that this is in fact a backward-chaining control rule.

Bi-conditional Elimination
  (rule (iff ?arg1 ?arg2)
    (assert! `(implies ,?arg1 ,?arg2))
    (assert! `(implies ,?arg2 ,?arg1)))

The TRE coding of KM* rules (that do not require assumptions)

2. Introduction rules that do not require assumptions are the following:

And Introduction
  (rule (show (and ?a ?b))
    (assert! `(show ,?a))
    (assert! `(show ,?b))
    (rule ?a
      (rule ?b
        (assert! `(and ,?a ,?b)))))

Or Introduction
  (rule (show (or ?a ?b))
    (assert! `(show ,?a))
    (assert! `(show ,?b))
    (rule ?a (assert! `(or ,?a ,?b)))
    (rule ?b (assert! `(or ,?a ,?b))))

Bi-conditional Introduction
  (rule (show (iff ?a ?b))
    (assert! `(show (implies ,?a ,?b)))
    (assert! `(show (implies ,?b ,?a)))
    (rule (implies ?a ?b)
      (rule (implies ?b ?a)
        (assert! `(iff ,?a ,?b)))))

tinter.lsp defines the interface to the TRE machinery:

  (defstruct (tre (:PRINT-FUNCTION tre-printer))
    title                 ; String for printing
    (dbclass-table nil)   ; symbols --> dbclasses
    (debugging nil)       ; prints extra info if non-nil
    (queue nil)           ; LIFO
    (rule-counter 0)      ; Unique id for rules
    (rules-run 0))        ; Statistics

  (defun create-tre (title &key debugging)
    "Create a new Tiny Rule Engine."
    (make-tre :TITLE title
              :DBCLASS-TABLE (make-hash-table :test #'eq)
              :DEBUGGING debugging))

treex1new.lsp defines a KM* example:

  (defun ex1 (&optional (debugging nil))
    "Demonstration of negation elimination."
    (in-tre (create-tre "Ex1" :DEBUGGING debugging))
    (run-forms *TRE*
               '(;; A simple version of Modus Ponens
                 (rule (implies ?ante ?conse)
                   (rule ?ante (assert! ?conse)))
                 ;; A simple version of negation elimination
                 (rule (not (not ?x)) (assert! ?x))
                 (assert! '(implies (human Turing) (mortal Turing)))
                 (assert! '(not (not (human Turing))))))
    (show-data))

The assumption problem

Consider the following set of rules, where M is a special meta-linguistic predicate that means "may assume", or "there is no reason to believe the opposite":

  C & M(A) -> D
  B -> C
  F -> (not D)

Given B, we can infer C and then D (since C holds and A may be assumed). That is, the set of beliefs, Bel, now becomes Bel = {B, MA, C, D}. Assume next that F and (not A) are added to Bel, i.e. Bel = {B, MA, C, D, F, not A}: F yields (not D), a contradiction with D. To resolve this contradiction, we must be able to withdraw the assumption MA and revise everything that follows from it.

Extending TRE to handle assumptions: the FTRE system

There are two reasons why we want to be able to introduce assumptions:
1. To use indirect proof (to prove P, assume (not P));
2. The PS may not have all the information needed to solve a problem.

This requires the following two problems to be addressed:
1. How to introduce and handle assumptions?
2. How to retract assumptions if they later prove wrong?

An easy way to address the first question is to use a stack, where a "local context" is defined once an assumption is pushed on the stack (this is called a "stack-oriented context mechanism"). When this context is no longer needed (the theorem that required the assumption has been proved), it is popped, and the stack again contains only true facts. The stack-oriented context mechanism, however, does not provide a solution to the second problem. A more sophisticated way to handle assumptions is provided by Truth Maintenance Systems, which will be discussed later.

Stack-oriented context mechanism

Recall that the DB contains everything believed to be true. That is, it can be viewed as a "global environment" where all inferences take place. If an assumption is to be introduced, a "context" (or local environment) is created, which extends the current environment:

  Data Base (extended logical environment)
    Context 1 (temporary environment 1): everything inferred after the first assumption is entered is added here
      Context 2 (temporary environment 2): everything inferred after the second assumption is entered is added here
      ...

Retracting an assumption means retracting the entire context defined by this assumption.

Stack-oriented context mechanism (cont.)
It is easy to see that such logical environments can be implemented as a stack. Introducing an assumption into the DB pushes a new stack frame, which represents a new context. When an assumption is retracted, the corresponding stack frame is popped. This is called depth-first exploration of the assumption set. If an assertion is derived in the presence of several active contexts, then this assertion is stored in the most recently created context. One important requirement (as we shall see next) is that each assertion must be derived in the smallest possible context (i.e. under the minimum number of assumptions).

To implement this idea in FTRE, rules are divided into two groups:
- Rules that do not make assumptions. The triggered rules from this group are placed on the normal-queue.
- Rules that make assumptions (called A-rules). The triggered rules from this group are placed on the asn-queue.

Two procedures in finter.lsp that maintain contexts are seek-in-context and try-in-context (see finter.lsp).

Example (BPS, page 120)

Consider the following assertion set, and assume that the assertions are "pushed" as shown:

  1  (show P)               (IP 2 5 6)
  2  (not Q)                premise
  3  (implies (not P) Q)    premise
  4  (implies (not Q) R)    premise

To prove P, we can use indirect proof, i.e. assume (not P) and try to derive a contradiction:

  5  (not P)    asn
  6  Q          (CE 3 5)

Lines 2 and 6 form a contradiction. Once the contradiction is derived, the logical environment defined by the assumption (not P) is retracted.

Constraints on the assumption set

Assume that premise 4 was pushed before premise 3. Then R would be derived within the logical context defined by the assumption (not P) and retracted along with it (although R does not depend on (not P)).
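The stack-oriented context mechanism just described can be sketched as follows. This is an illustrative Python sketch (FTRE's actual machinery is the Lisp code in finter.lsp; the class and method names here are mine): assumptions push frames, every new assertion lands in the newest frame, and retracting pops the frame together with everything derived under it.

```python
class ContextDB:
    def __init__(self, premises=()):
        self.facts = set(premises)     # the global environment
        self.frames = []               # stack of (assumption, derived facts)

    def assume(self, assumption):
        """Push a new context frame for an assumption."""
        self.frames.append((assumption, {assumption}))

    def assert_(self, fact):
        """Every assertion is made in the latest (most recent) context."""
        if self.frames:
            self.frames[-1][1].add(fact)
        else:
            self.facts.add(fact)

    def retract(self):
        """Pop the newest frame: the assumption and all of its consequences go."""
        self.frames.pop()

    def believed(self, fact):
        return fact in self.facts or any(fact in d for _, d in self.frames)

# The indirect proof from the example above, in miniature:
db = ContextDB({("not", "Q"), ("implies", ("not", "P"), "Q")})
db.assume(("not", "P"))                        # line 5
db.assert_("Q")                                # line 6, (CE 3 5)
contradiction = db.believed("Q") and db.believed(("not", "Q"))
db.retract()                                   # discharge (not P); Q goes with it
db.assert_("P")                                # (IP ...): P now holds globally
```

Note what this sketch cannot do: it can only pop whole frames in LIFO order, which is exactly why it fails on the assumption problem above, where MA alone must be withdrawn.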
To achieve maximum efficiency of the inference procedure, the following constraints on the assumption set must be imposed:
- To guarantee that everything derived in a given context is retracted after the assumption(s) defining that context is/are retracted, every assertion must be made in the latest context.
- To minimize retractions, assertions must be derived in the simplest possible context. To ensure this, all rules that do not make assumptions must be fired first, and only then are A-rules fired.
- Every operation that requires an assumption must push a new stack frame.

For examples utilizing assumptions, see textbook pages 97 and 99.

Enhancing rule representation in FTRE

The rule representation used in TRE can be improved by creating a special-purpose rule representation language. Consider the following rule:

  (rule (show ?Q)
    (rule (implies ?P ?Q)
      (assert! `(show ,?P))))

A more natural representation for this rule is the following one:

  (rule ((show ?Q) (implies ?P ?Q))
    (assert! `(show ,?P)))

Note that in the body, (assert! `(show ,?P)), ?P is a pattern variable, but it is treated as a Lisp variable. The difference between pattern variables and Lisp variables is that pattern variables take their values via substitution, while Lisp variables take their values via evaluation. To emphasize the fact that ?P is a pattern variable, we can replace the form `(show ,?P) with (show ?P), assuming that all pattern variables are evaluated. This assumption is reasonable, because whenever something is asserted in the DB, pattern variables are substituted with their values.

Enhancing rule representation in FTRE (cont.)

To implement the proposed representation change, we use a different procedure, rassert!. This procedure ensures that all pattern variables are evaluated before the assert operation takes place. Now the rule becomes:

  (rule ((show ?Q) (implies ?P ?Q))
    (rassert! (show ?P)))

The next change in the representation is the following.
Assume that a pattern variable is bound within an environment enclosing other rules. We can introduce a new construct, rlet, to enforce the value of that pattern variable within each rule of that environment. rlet is a macro, analogous to let.

Example:
  (rule ((project ?project)
         (needed (cost-estimate-for ?project)))
    (rlet ((?cost (expensive-estim-method ?project)))
      (rassert! (cost-of ?project ?cost :1st-cut))
      (rule ((improve-on (cost-of ?project ?cost :1st-cut)))
        (rlet ((?new-cost (even-more-expensive-method ?project)))
          (rassert! (cost-of ?project ?new-cost :2nd-cut))))))

Note that let would bind ?cost as a Lisp variable. If ?cost must be used somewhere else as a pattern variable, the PS would not know that it has already been bound.

Enhancing rule representation in FTRE (cont.)

The next enhancement of the rule syntax allows rule triggers to contain the following keywords:

:var — the variable that follows gets as its value the entire preceding trigger pattern. This allows us to use this trigger pattern inside the body of the rule without retyping it.

:test — the next element of the trigger is a Lisp expression which must be evaluated; if it returns NIL, the whole match fails. This ensures that no effort is wasted trying to prove something that is already known. Although we can do this by using when or unless in the body of the rule, doing it via :test expresses our intent more clearly, because these tests are often part of the pattern-matching process.

Example:
  (rule ((show ?q) :test (not (fetch! ?q))
         (implies ?p ?q) :var ?imp)
    (debug-nd "~% Looking for ~A to use ~A" ?p ?imp)
    (rassert! (show ?p)))

Improving the efficiency of rule retrieval

Recall that in TRE, all assertions and rules are divided into classes according to the leftmost symbol in the assertion or rule trigger. Each rule and assertion is checked only against the members of its class.
We can further divide the "top-level" classes into subclasses, where each subclass contains assertions and rule triggers with the same second symbol; further divide the subclasses into sub-subclasses with the third symbol being the same; and so on.

[Diagram: multi-level indexing of the Data Base. Level 1 discriminates on the CAR (DB classes A, B, C, ..., N); level 2 discriminates on the CADR (DB classes A1, ...); level 3 discriminates on the CADDR (DB classes A11, A12, A13, ...).]

Discrimination trees: an outline

That is, the DB is viewed as a tree, known as a discrimination tree, whose leaves are statements (assertions or rule triggers). Each of the remaining nodes discriminates on a particular element of the structure. For example, the first node discriminates on the first element of the statements (all children of that node have the same CAR parts), the second node discriminates on the second element, etc. Adding a new rule or assertion requires traversing the tree to find the appropriate leaf at which to put it, and it will be unified only against the leaves from the same branch. That is, by analyzing the structure of the pattern, we can reduce the set of candidates for unification. In the extreme case, where all of the pattern elements are indexed, the DB becomes exactly a discrimination tree.

[Diagram: a fully indexed DB as a discrimination tree: DB branches into classes A, B, C, ..., j; class A into subclasses A1, A2, ..., An; class A1 into A11, A12, ..., A1k.]

Improving the efficiency of rule retrieval in FTRE: an alternative to discrimination trees

The FTRE DB is a one-level discrimination tree, just like the TRE DB. However, to keep very large dbclasses efficient, they are reorganized by introducing redundant classes. As an example, consider a spatial reasoning system which wants to find information about Corner-base32 (pictured between church12 and Car-wash18). Possible representations:

1. Relation first: (left-of ?A ?B), (right-of ?B ?A)
2. Object first:   (?A left-of ?B), (?B right-of ?A)

Both representations can be useful, which is why we want to have them both in the DB. The following rule asserts the alternative representation:

  (rule ((left-of ?A ?B))
    (rassert! (?A left-of ?B))
    (rassert! (?B right-of ?A)))

Improving the efficiency of the pattern-matching process: open-coding unification

Another change that allows FTRE to improve the efficiency of the pattern-matching process is the so-called open-coding of unification. It relies on the fact that the structure of the rule trigger is known; therefore, for each pattern a special-purpose match procedure can be created that performs only those tests that are relevant for that particular pattern.

Example: Let (foo ?A ?B (bar ?B)) be the rule trigger. Then the following tests on the assertion P assure that it matches the trigger:

  (consp P)
  (equal 'foo (car P))
  (consp (cdr P))
  (equal ?A (cadr P))
  (consp (cddr P))
  (consp (cdddr P))
  (consp (fourth P))
  (equal 'bar (car (fourth P)))
  (consp (cdr (fourth P)))
  (null (cddr (fourth P)))
  (null (cddddr P))
  (equal (cadr (fourth P)) (third P))

Special-purpose pattern matcher used in FTRE

[Diagram: trigger patterns and the current environment feed a test-set generator, which produces a set of tests (different for each trigger) and a match procedure; the special-purpose pattern matcher outputs a flag indicating match success or failure, and the new environment created during the match process.]

Improving the efficiency of rule execution

Recall that the body of an FTRE rule consists of rassert! statements or other rule statements, which are Lisp forms. To execute such Lisp forms more efficiently, we can define them as separate procedures. This has two advantages:
1. Such procedures can be called with the current values of pattern variables as arguments.
2. They can be compiled, thus allowing for more efficient execution.
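The open-coding idea above can be illustrated with a hand-specialized matcher for the trigger (foo ?A ?B (bar ?B)). This is an illustrative Python sketch (FTRE generates the corresponding Lisp tests automatically; match_foo is my name): tuples stand for Lisp lists, and only the tests relevant to this one pattern are performed, instead of calling a general unifier.

```python
def match_foo(p, env):
    """Specialized matcher for the trigger (foo ?A ?B (bar ?B)).
    Returns an extended environment on success, or None on failure."""
    if not (isinstance(p, tuple) and len(p) == 4):   # structure and arity tests
        return None
    if p[0] != "foo":                                # leading-constant test
        return None
    sub = p[3]
    if not (isinstance(sub, tuple) and len(sub) == 2 and sub[0] == "bar"):
        return None
    if sub[1] != p[2]:        # (bar ?B) must repeat the ?B binding
        return None
    if env.get("?A", p[1]) != p[1]:   # ?A may already be bound in env
        return None
    return {**env, "?A": p[1], "?B": p[2]}
```

For example, match_foo(("foo", 1, 2, ("bar", 2)), {}) succeeds with {"?A": 1, "?B": 2}, while match_foo(("foo", 1, 2, ("bar", 3)), {}) fails on the repeated-?B test.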
The KM* system: implementation of rules requiring assumptions

The KM* rules that make assumptions are Indirect Proof (IP), Not Introduction (NI) and Conditional Introduction (CI). They are similar in that:
- They make an assumption.
- They look for a specific outcome (a contradiction in the IP and NI cases, or the consequent in the CI case).

The following rule allows us to state an intention to derive a contradiction and to signal when the indirect proof succeeds:

  (rule ((show contradiction) (not ?P) ?P)
    (rassert! contradiction))

Notice that trying to prove (not ?P) first is better, because there are many more positive than negative statements in the DB, and it does not make sense to try to find the negation of each positive assertion.

The FTRE control rule for IP is the following:

  (A-rule ((show ?P))
    (unless (or (fetch! ?P)
                (eq ?P 'contradiction)
                (not (simple-proposition? ?P)))
      (when (seek-in-context `(not ,?P) 'contradiction)
        (rassert! ?P))))

The FTRE control rule for NI is the following:

  (A-rule ((show (not ?P)))
    (unless (or (fetch! `(not ,?P))
                (eq ?P 'contradiction))
      (when (seek-in-context ?P 'contradiction)
        (rassert! (not ?P)))))

The FTRE control rule for CI is the following:

  (A-rule ((show (implies ?P ?Q)))
    (unless (fetch! `(implies ,?P ,?Q))
      (when (seek-in-context ?P `(or ,?Q contradiction))
        (rassert! (implies ?P ?Q)))))

The N-Queens example (see book p. 135)

The most important question that must be addressed with respect to this problem is how to find consistent column placements for each queen. The solution in the book is based on the idea of "choice sets". A choice set is a set of alternative placements.
Consider, for example, the following configuration for N = 4, with one choice set per column:

  choice set 1 = {(0,0), (1,0), (2,0), (3,0)}
  choice set 2 = {(0,1), (1,1), (2,1), (3,1)}
  choice set 3 = {(0,2), (1,2), (2,2), (3,2)}
  choice set 4 = {(0,3), (1,3), (2,3), (3,3)}

Notice that within each choice set, the choices are mutually exclusive and exhaustive. Each solution (legal placement of queens) is a consistent combination of choices, one from each set. To find a solution, we must:
1. Identify the choice sets.
2. Search through the set of choice sets to find a consistent combination of choices (one or all).

A possible search strategy, utilizing chronological backtracking, extends the current combination with a choice from the next set and, whenever an inconsistent combination of choices arises, backs up to the most recent choice to look for an alternative continuation (only a partial search graph fits on a slide).

A generic procedure for searching through choice sets utilizing chronological backtracking

The following is a generic procedure that searches through choice sets. When an inconsistent choice is detected, it backtracks to the most recent choice, looking for an alternative continuation. This strategy is called chronological backtracking.

  (defun Chrono (choice-sets)
    (if (null choice-sets)
        (record-solution)
        (dolist (choice (first choice-sets))
          (while-assuming choice
            (if (consistent?)
                (Chrono (rest choice-sets)))))))

Notice that when an inconsistent choice is encountered, the algorithm backtracks to the previous choice it made. As the results on book pages 138 and 140 suggest, this algorithm is not efficient because (1) it is exponential, and (2) it re-invents contradictions. As we shall see, dependency-directed backtracking handles this type of search problem in a more efficient way.
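The Chrono procedure above can be exercised on the 4-queens choice sets. This is an illustrative Python sketch (the Lisp version relies on FTRE's while-assuming context machinery and a consistent? predicate; here recursion plays the role of the context stack, and the consistency check is written out explicitly):

```python
def consistent(placed):
    """The newest placement must not share a row or a diagonal with any
    earlier one (columns differ by construction: one choice set per column)."""
    r, c = placed[-1]
    return all(r != r2 and abs(r - r2) != abs(c - c2)
               for r2, c2 in placed[:-1])

def chrono(choice_sets, placed=(), solutions=None):
    """Chronological backtracking: on an inconsistent choice, fall back to the
    most recent choice point and try its next alternative."""
    if solutions is None:
        solutions = []
    if not choice_sets:               # one consistent choice made from each set
        solutions.append(placed)
        return solutions
    for choice in choice_sets[0]:
        trial = placed + (choice,)
        if consistent(trial):
            chrono(choice_sets[1:], trial, solutions)
    return solutions

n = 4
# choice set for column c = all (row, c) cells, as in the 4x4 example above
choice_sets = [[(row, col) for row in range(n)] for col in range(n)]
solutions = chrono(choice_sets)       # 4-queens has exactly two solutions
```

Even on this toy size the exponential behaviour is visible: the search repeatedly rediscovers the same row and diagonal conflicts in different branches, which is exactly the inefficiency that dependency-directed backtracking removes.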