Converting formulas into a normal form Consider the following FOL formula stating that a brick is an object which is on another object which is not a pyramid, and there is nothing that a brick is on, and at the same time that object is on the brick, and there is nothing that is not a brick and also is the same thing as the brick. x [brick(x) => ( y [on(x, y) & ¬pyramid(y) ] & & ¬ y [on(x, y) & on(y, x) ] & & y [¬brick(y) => ¬equal(x, y) ] ) ] To convert this formula into a normal form, we must go through the following steps: Step 1: Eliminate implications, i.e. substitute any formula A => B with ¬A v B. x [¬ brick(x) v ( y [on(x, y) & ¬pyramid(y) ] & & ¬ y [on(x, y) & on(y, x) ] & & y [¬ ( ¬brick(y)) v ¬equal(x, y) ] ) ] Converting formulas into a normal form (cont.) Step 2: Move negations down to atomic formulas. This step requires the following transformations: ¬ (A v B) ¬A & ¬B ¬ (A & B) ¬A v ¬B ¬ ¬A A ¬ x A(x) x (¬ A(x)) ¬ x A(x) x (¬ A(x)) x [¬ brick(x) v ( y [on(x, y) & ¬pyramid(y) ] & & y [¬ on(x, y) v ¬ on(y, x) ] & & y [brick(y) v ¬equal(x, y) ] ) ] Converting formulas into a normal form (cont.) Step 3: Eliminate existential quantifiers. Here, the only existential quantifier belongs to the sub-formula y [on(x, y) & ¬pyramid(y) ]. This formula says, that given some x, we can find a function that takes x as an input, and returns y. Let us call this function support(x). Functions that eliminate the need for existential quantifiers are called Skolem functions. Here support(x) is a Skolem function. Note that the universal quantifiers determine the arguments of Skolem functions: there must be one argument for each universally quantified variable whose scope contains the Skolem function. x [¬ brick(x) v ( on(x, support(x)) & ¬pyramid(support(x)) ) & & y [¬ on(x, y) v ¬ on(y, x) ] & & y [brick(y) v ¬equal(x, y) ] ) ] Converting formulas into a normal form (cont.) Step 4: Rename variables, if necessary, so that no two variables are the same. x [¬ brick(x) v ( on(x, support(x)) & ¬pyramid(support(x)) ) & & y [¬ on(x, y) v ¬ on(y, x) ] & & z [brick(z) v ¬equal(x, z) ] ) ] Step 5: Move the universal quantifiers to the left. x y z [¬ brick(x) v ( on(x, support(x)) & ¬pyramid(support(x)) ) & & [¬ on(x, y) v ¬ on(y, x) ] & & [brick(z) v ¬equal(x, z) ] ) ] Converting formulas into a normal form (cont.) Step 6: Move disjunctions down to literals. This requires the following transformation: A v (B & C & D) (A v B) & (A v C) & (A v D) x y z [ ( ¬ brick(x) v on(x, support(x))) & & ( ¬ brick(x) v ¬pyramid(support(x)) ) & & [¬ brick(x) v ¬ on(x, y) v ¬ on(y, x) ] & & [¬ brick(x) v brick(z) v ¬equal(x, z) ] ] Step 7: Eliminate the conjunctions, i.e. each part of the conjunction becomes a separate axiom. x ( ¬ brick(x) v on(x, support(x))) x ( ¬ brick(x) v ¬pyramid(support(x)) ) x y [¬ brick(x) v ¬ on(x, y) v ¬ on(y, x) ] x z [¬ brick(x) v brick(z) v ¬equal(x, z) ] Converting formulas into a normal form (cont.) Step 8: Rename all the variables so that no two variables are the same. x ( ¬ brick(x) v on(x, support(x))) w ( ¬ brick(w) v ¬pyramid(support(w)) ) u y [¬ brick(u) v ¬ on(u, y) v ¬ on(y, u) ] v z [¬ brick(v) v brick(z) v ¬equal(v, z) ] ) ] Step 9: Eliminate the universal quantifiers . Note that this is possible because all variables now are universally quantified, therefore the quantifiers can be ignored. ¬ brick(x) v on(x, support(x)) ¬ brick(w) v ¬pyramid(support(w)) ¬ brick(u) v ¬ on(u, y) v ¬ on(y, u) ¬ brick(v) v brick(z) v ¬equal(v, z) Converting formulas into a normal form (cont.) Step 10 (optional): Convert disjunctions back to implications if you use the “implication” form of the resolution rule. This requires the following transformation: (¬A v ¬B v C v D) ((A & B) => (C v D)) brick(x) => on(x, support(x)) brick(w) & pyramid(support(w)) => False brick(u) & on(u, y) & on(y, u) => False brick(v) & equal(v, z) => brick(z) Converting formulas into a normal form: summary The procedure for translating FOL sentences into a normal form is carried out as follows: 1 2 3 4 5 6 7 8 9 0 Eliminate all of the implications. Move the negation down to the atomic formulas. Eliminate all existential quantifiers. Rename variables Move the universal quantifiers to the left. Move the disjunctions down to literals Eliminate the conjunctions Rename the variables Eliminate the universal quantifiers. (Optional) Convert disjunctions back to implications. Logical reasoning systems Recall that the two most important characteristics of AI agents are: 1 Clear separation between the agent’s knowledge and inference engine. 2 High degree of modularity of the agent’s knowledge. We have already seen how these features are utilized in forward and backward chaining programs (these are referred to as production systems). Next, we discuss three different implementations of AI agents based on the same ideas, namely: – AI agents utilizing logic programming (typically implemented in a PROLOG-like language). – AI agents utilizing frame representation languages or semantic networks. – AI agents utilizing description (or terminological) logics. Logic programming and Prolog. Consider a knowledge base containing only Horn formulas, and a backward chaining program where all of the inferences are performed along a given path until a dead end is encountered (i.e. the underlying control strategy is depthfirst search); when a dead end is encountered, the program backs up to the most recent step which has an alternative continuation. Example: Let the KB contain the following statements represented as PROLOG clauses (next to each clause, a LISP-based implementation is given). mammal(bozo). mammal(Animal) :- hair(Animal). (remember-assertion ‘(Bozo is a mammal)) (remember-rule ‘(identify1 ((? animal) has hair) ((? animal) is a mammal))) ?- mammal(deedee). ?- mammal(X). (backward-chain ‘(Deedee is a mammal)) (backward-chain ‘((? X) is a mammal)) Note that PROLOG uses a prefix notation, variables start with uppercase letters, and constants are in a lowercase. Each statement ends with a period. PROLOG syntax If a rule has more than one premise, a comma is used to separate the premises, i.e. A & B & C & D => H is written in PROLOG as H :- A, B, C, D. head body Rule of the form A v B v C v D => H can be presented as follows: H :- A; B; C; D. Note that this is equivalent to H :- A. H :- B. H :- C. H :- D. To represent negated antecedents, PROLOG uses the nagation as failure operator, i.e. A & ¬B => H is represented in PROLOG as H :- A, not(B). The negation as failure operator Note that not is not a logical negation. This operator works as follows: to satisfy not(B), PROLOG tries to prove B; if it fails to prove B, not(B) is considered proved. The negation as failure operator is based on the so-called closed-world assumption. This states that if a theorem were true, then an axiom would exist stating it as being true. If such an axiom does not exist, we can assume that the theorem is false. The closed-world assumption is very dangerous, because it may introduce inconsistencies in the KB. Example: Assume not(B) is found true. Next assume that B is entered as an axiom. Now, the KB contains both, B and not(B). Managing assumptions and retractions Consider the following PROLOG program: D :- not(A), B. D :- not(C). Rules F :- E, H. N :- D, F. E. Premises H. ?- N. Query To prove N, Prolog searches for a rule, whose head is N. Here N :- D, F. is such a rule. For this rule to fire, D and F must be proven in turn: To prove D, consider rule D :- not(A), B. fails, because of B. To prove D, consider rule D :- not(C). succeeds. F can be easily proved, because E and H are declared as premises Therefore, N is proved (based on the fact that E and H are true, and assuming that C is false). Example (cont.) Assume now that the system learns C. What happens to D and N, respectively? Obviously, D and N must be retracted. The BIG question is how to do this. Note that there are other reasons for which we may want to retract a sentence, such as: To make the system “forget”. To update the current model of the world. PROLOG does not have means for implementing retractions, that is it can only handle incomplete knowledge, but not inconsistent knowledge. Note the difference between retracting a sentence and adding the negation of that sentence. In our example, if not(C) is retracted, the system will not be able to infer either C, or ¬C. Whereas, if ¬C KB and we add C, then the system can infer both C and ¬C. The process of keeping track of which additional statements must be retracted when we retract not(C) (D and N in our example) is called truth maintenance. Although PROLOG cannot do retractions, it is a non-monotonic system, because it allows inferences to be made from incomplete information (thanks to the negation as failure operator). Generation of explanations The ability to explain its reasoning is one of the most important features of KBS. There are different ways to do that: To record the reasoning process, and keep track of the data upon which the conclusion depends. To keep track of the sources of each data item, for example “provided by the user”, “inferred by rule xx”, etc. To keep a special note as part of the rule that contains an explanation. Example: Given the following rules H & B => Y, C => B, ¬C => ¬B, A => X & Y, C => D, ¬A => C. Given that H is true, prove Y. Explanation1 Explanation 2 Y is a conclusion of rule H & B => Y, and premises B and H (how B was proved is not relevant for the explanation of Y). Y because A is unknown (meaning that by the negation as failure rule we can assume ¬A) and H is true. Or, Y because of H while ¬A. Reasoning with incomplete knowledge: default reasoning (AIMA, page 354) Consider the conditions under which Y is true in the above example: for Y to be true, H must be true and it must be reasonable to assume ¬A. This can be represented by means of the following rule: H : ¬A Rules of this type are called default rules. Y The general form of default rules is: A : B1, B2, … , Bn C where: A, B1, B2, … , Bn, C are FOL sentences; A is called the prerequisite of the default rule; B1, B2, … , Bn are called justifications of the default rule; C is called the consequent of the default rule. Example Let Tweety be a bird, and because we know that birds fly we want to be able to conclude that Tweety flies. Bird(Tweety) Bird(X): Flies(X) Flies(X) This rule says “If X is a bird, and it is consistent to believe that X flies, then X flies”. Given only this information about Tweety, we can infer Flies(Tweety). Assume that we learn Penguin(Tweety). Because penguins do not fly, we must have the following rule in the KB to handle penquines: Penguin(X): ¬Flies(X) ¬Flies(X) We can infer now Flies(Tweety) (according to the first rule), and ¬Flies(Tweety) according to the second rule. To resolve this contradiction, we may want to always prefer the “more specific rule”, which in this case will first derive ¬Flies(Tweety) making the first rule inapplicable. Dependency networks The following dependency network presents exceptions in a more descriptive graphical form. Rather than enumerating exceptions, we may “group” them under the property “abnormal”. Flies(X) Bird(X) Penguin(X) Abnormal(X) Ostrich(X) Dead(X) Stuffed(X) Semi-normal default rules allow us to capture exceptions Defaults where the justification and the conclusion of the rule are the same are called normal defaults; otherwise, the default is called semi-normal. Although semi-normal defaults allow “exceptions” to be explicitly enumerated (as part of the rule), they cannot guarantee the correctness of derived conclusions. Example: Consider the following set of sentences Bird(Tom) Penguin(Tom) v Ostrich(Tom) Bird(X) : Flies(X) & ¬Penguin(X) Flies(X) Bird(X) : Flies(X) & ¬ Ostrich(X) Flies(X) We can infer Flies(Tom), which is semantically incorrect, because neither penguins nor ostriches fly. Truth maintenance systems The problem with the example above is that in default reasoning systems once inferred the conclusion is no longer related to its justification. Truth maintenance systems fix this problem by explicitly recording and keeping track of the dependencies between sentences. More good things about TMSs are: They have a mechanism for retracting sentences from the KB as a result of a retraction of another sentence. They are capable of performing default reasoning. They are capable of providing explanations of derived conclusions. There are different types of TMSs: Justification-based TMSs (can be monotonic or non-monotonic). Assumption-based TMSs (these are monotonic systems) Contradiction-tolerant TMSs (these are non-monotonic systems). Assumption-based truth maintenance systems The primary tasks of any TMS are: To record dependencies among beliefs. To answer queries about whether a given belief is true with respect to a given set of beliefs. To provide explanations as to why a given belief is true w.r.t. a given set of beliefs. To accomplish these tasks, each TMS must support two data structures: Nodes representing beliefs. Although beliefs can be expressed in any language, once a belief is bound to a TMS node it becomes nothing more than a proposition. Justifications representing reasons for beliefs. These are equivalent to rules in production systems, where reasons are the premises and the belief itself is the conclusion.