CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 UNIT 1 Introduction to Artificial Intelligence CS 362_ Intelligent Systems Text Book Slide 2 Text Book: Artificial intelligence: structures and strategies for complex problem solving. Author: George F. Luger. Publisher: Pearson Education, Inc CS 362_ Intelligent Systems Course Assessment Slide 3 ITEM DATE MARK Varies 5 Homework Weeks 6 and 13 15 First Exam Week 7 15 Second Exam Week 11 15 Project Week 14 10 Set by the college 40 Quizzes Final Exam Total 100 CS 362_ Intelligent Systems Course Objectives Slide 4 Develop an appreciation of the role of intelligent systems in the contemporary context. Develop a deep understanding of fundamental theoretical and practical concepts about intelligent systems. Develop several applications employing different intelligent system paradigms CS 362_ Intelligent Systems Course Learning Outcomes (CLOs) Slide 5 1.1 Recognize ideas and techniques underlying the design of intelligent systems. 1.2 Describe mathematical representations and logic for Solving AI problems. 2.1 Analyze various search and Knowledge Representation techniques in the design of intelligent systems. 2.2 Design intelligent systems to solve various AI problems. 3.1 Appraise team work utilizing effective group techniques to design, implement and demonstrate Intelligent Systems.( mini project). 4.1 Demonstrate intelligent systems using search techniques, Planning and Robotics. CS 362_ Intelligent Systems What’s Artificial Intelligence (AI) Slide 6 Artificial intelligence (AI) is the branch of computer science concerned with making computers behave like humans (Computers with the ability to mimic or duplicate the functions of the human brain). The term was coined in 1956 by John McCarthy at the Massachusetts Institute of Technology (MIT). CS 362_ Intelligent Systems The Turing Test Slide 7 Alan Turing in 1950 wrote a paper entitled “Computing machinery and intelligence” This paper is one of the earliest papers to address the question of machine intelligence specifically in relation to the modern digital computer The Turing test is a test of machine’s ability to exhibit an intelligent behaviour. The Turing test (called the imitation game) measures the performance of an intelligent machine against that of a human being. The test places the machine and a human counterpart in rooms apart from a second human being, referred to as the interrogator. CS 362_ Intelligent Systems The Turing Test (ii) Slide 8 The interrogator is asked to distinguish the computer from the human being solely on the basis of their answers to questions asked over this device. If the interrogator cannot distinguish the machine from the human, then, Turing argues, the machine may be assumed to be intelligent. CS 362_ Intelligent Systems Artificial Intelligence Systems Slide 9 Artificial intelligence systems are the people, procedures, hardware, software, data, and knowledge needed to develop computer systems and machines that demonstrate the characteristics of intelligence CS 362_ Intelligent Systems Artificial Intelligence Behavior Slide 10 Intelligent behavior: Learn from experience Apply knowledge acquired from experience Handle complex situations Solve problems when important information is missing Determine what is important React quickly and correctly to a new situation Understand visual images Process and manipulate symbols Be creative and imaginative Use heuristics CS 362_ Intelligent Systems Important Features of AI Slide 11 The use of computers to do reasoning, pattern recognition, learning, or some other form of inference. A focus on problems that do not respond to algorithmic solutions. This underlies the reliance on heuristic search (as rules for choosing those branches in a state space that are most likely to lead to an acceptable problem solution) as an AI problem-solving technique. A concern with problem-solving using inexact, missing, or poorly defined information and the use of representational formalisms that enable the programmer to compensate for these problems. Reasoning about the significant qualitative features of a situation. An attempt to deal with issues of semantic meaning as well as syntactic form. Answers that are neither exact nor optimal, but are in some sense “sufficient”. This is a result of the essential reliance on heuristic problem-solving methods in situations where optimal or exact results are either too expensive or not possible. The use of large amounts of domain-specific knowledge in solving problems. This is the basis of expert systems. CS 362_ Intelligent Systems Techniques used in AI Slide 12 Problem Representation (Knowledge Representation): is a science of translating actual knowledge into a format that can be used by the computer Search: a technique to choose the optimal solution from many possible solutions. Automated Reasoning: a process of achieving a specific goal based on the given knowledge. Planning: the ability to decide on a good sequence of actions to achieve our goals. CS 362_ Intelligent Systems Overview of AI applications Slide 13 The two most fundamental concerns of AI researchers are: Knowledge representation: it addresses the problem of capturing in a language (predicate calculus), suitable for computer manipulation Search: is a problem-solving technique that systematically explores a space of problem state (successive and alternative stages in the problem-solving process) for solution space. Like most sciences, AI is decomposed into a number of sub disciplines that, while sharing an essential approach to problem solving, have concerned themselves with different applications. CS 362_ Intelligent Systems Important Research and Application Areas of AI Slide 14 Game playing Automated reasoning and theorem proving Expert systems Natural language understanding and semantic modelling Modelling human performance Planning and robotics Languages and environments for AI Machine learning Alternative representations: neural nets and genetic algorithms Areas of Artificial Intelligence Areas of Artificial intelligence Vision systems Learning systems Robotics Expert systems Neural networks Natural language processing CS 362_ Intelligent Systems (i) Game playing Slide 17 Game playing involves programming computers to play games such as chess and checkers . Most games are played using a well-defined set of rules: this makes it easy to generate the search space and frees the researcher from many of the ambiguities and complexities inherent in less structured problems. As games are easily played, testing a game-playing program presents no financial or ethical burden. CS 362_ Intelligent Systems (ii): Automated reasoning & theorem proving Slide 18 Automated theorem-proving research was responsible for much of the early work in formalizing search algorithms and developing formal representation languages such as the predicate calculus and the logic programming language Prolog. CS 362_ Intelligent Systems (iii): Expert Systems Slide 19 Expert systems involves programming computers to make decisions in real-life situations (for example, some expert systems help doctors diagnose diseases based on symptoms) . Expert systems are constructed by obtaining this knowledge from a domain expert (a person who has deep knowledge (of both facts and rules) and strong practical experience in a particular domain) and coding it into a form that a computer may apply to similar problems. Expert knowledge is a combination of a theoretical understanding of the problem and a collection of heuristic problem-solving rules that experience has shown to be effective in the domain. CS 362_ Intelligent Systems (iv): Natural Language Understanding Slide 20 Natural language understanding involves programming computers to understand natural human languages like Arabic. Understanding natural language involves much more than parsing sentences into their individual parts of speech and looking those words up in a dictionary. Real understanding depends on extensive background knowledge about the domain of discourse and the idioms used in that domain as well as an ability to apply general contextual knowledge to resolve the omissions and ambiguities that are a normal part of human speech. CS 362_ Intelligent Systems (v): Modeling Human Performance Slide 21 Many AI programs are engineered to solve some useful problems without regard for their similarities to human mental architecture. In fact, programs that take nonhuman approaches to solving problems (e.g. chess) are often more successful than their human counterparts. Still, the design of systems that explicitly model aspects of human performance is a fertile area of research in both AI and psychology. Human performance modeling, in addition to providing AI with much of its basic methodology, has proved to be a powerful tool for formulating and testing theories of human cognition. CS 362_ Intelligent Systems (vi): Planning and Robotics Slide 22 Planning is the ability to decide on a good sequence of actions to achieve our goals. Research in planning began as an effort to design robots that could perform their tasks with some degree of flexibility and responsiveness to the outside world. Briefly, planning assumes a robot that is capable of performing certain atomic actions. It attempts to find a sequence of those actions that will accomplish some higher-level task, such as moving across an obstacle-filled room. Planning is a difficult problem for a number of reasons, not the least of which is the size of the space of possible sequences of moves. Even an extremely simple robot is capable of generating a vast number of potential move sequences. CS 362_ Intelligent Systems (vii): Languages and Environments of AI Slide 23 Some of the most important by-products of artificial intelligence research have been advances in programming languages and software development environments. Programming environments include knowledge-structuring techniques such as object-oriented programming. High-level languages, such as Lisp and Prolog, help manage program size and complexity. Trace packages allow a programmer to reconstruct the execution of a complex algorithm and make it possible to unravel the complexities of heuristic search. Without such tools and techniques, it is doubtful that many significant AI systems could have been built. CS 362_ Intelligent Systems (viii): Machine Learning Slide 24 Learning is one of the most important components of intelligent behavior. An expert system may perform extensive and costly computations to solve a problem. However, unlike a human being, if it is given the same or a similar problem a second time, it usually does not remember the solution. It performs the same sequence of computations again. CS 362_ Intelligent Systems (ix): Neural Nets and Genetic Algorithms Slide 25 Computer system that can act like or simulate the functioning of the human brain Most of the techniques in AI (and in this course) use knowledge and search algorithms to implement intelligence. An alternative approach seeks to build intelligent programs using models that parallel the structure of neurons in the human brain or the evolving patterns found in genetic algorithms and artificial life. CS 362_ Intelligent Systems Major Branches of AI Slide 26 Perceptive system: A system that approximates the way a human sees, hears, and feels objects Vision system: Captures, stores, and manipulates visual images and pictures Robotics: Mechanical and computer devices that perform tedious tasks with high precision Expert system: Stores knowledge and makes inferences Learning system: Computer changes how it functions or reacts to situations based on feedback Natural language processing: Computers understand and react to statements and commands made in a “natural” languages, such as English Neural network: Computer system that can act like or simulate the functioning of the human brain CS 362_ Intelligent Systems Artificial intelligence = Software that acts intelligently Slide 27 AI centers on methods using Booleans, conditionals, and logical reasoning, with numbers used as needed. AI software need not work like people do, but people can provide clues as to methods. For instance, aircraft don't fly by imitating birds. AI means deep (not superficial) understanding of how to do something (e.g. language understanding versus table lookup). Example: Query "picture of west wing of white house" for Google. AI will become increasingly common in the future, as computers do increasingly complex tasks. Many developments in AI have been "exported" to other areas of computer science (e.g. object-oriented programming and data mining). AI programs that are too slow today may get used eventually as computers increase in speed (e.g. speech understanding). CS 362_ Intelligent Systems What is AI good for? Slide 28 AI is not precisely defined, but generally it’s for: Problems needing "common sense", like recognizing building types in aerial photos Problems requiring many different kinds of knowledge, like automatic translation of English text Problems only a few experts can solve, like treating rare diseases Hard problems without any good known algorithms, like playing chess CS 362_ Intelligent Systems Some history of AI Slide 29 1950s: The first programs First speculation about AI (Turing test) Dartmouth Conference: organized by John McCarthy and colleagues for starting a new area in studying computation and intelligence. John McCarthy introduced the term “artificial intelligence” in the conference. Game-playing Heuristic search methods 1960s: Major progress Lisp programming language by McCarthy Development of symbolic reasoning methods using logical constraints The first natural language and vision programs CS 362_ Intelligent Systems Slide 30 1970s: Many successes Appearance of expert systems Prolog programming language and various other AI software Symbolic learning is popular 1980s: Faddishness Suddenly AI is faddish and gets much media coverage Lots of AI startup companies, most fail Lots of standalone AI applications, lots of expert systems Neural networks become popular CS 362_ Intelligent Systems Slide 31 1990s: Maturity of AI AI no longer a fad, but used more than ever (e.g. the Web) AI is embedded in larger systems (like on the Web and in simulations) Genetic algorithms and artificial life are popular Statistical language processing is popular, including speech understanding 2000s: AI is back in fashion Data mining is popular Simulations and games using AI are popular CS 362_ Intelligent Systems Artificial intelligence today Slide 32 Programs that use artificial-intelligence techniques are usually just pieces of a larger system (like Java classes). "Artificial-intelligence techniques" emphasize conditional statements and logical constructs (like "and", "or", and "not"). Artificial-intelligence code often contains many small pieces, to provide flexible reasoning. Journals with current AI trends: IEEE Intelligent Systems and AI Magazine. More technical are IEEE Transactions on Knowledge and Data Engineering and Artificial Intelligence. CS 362_ Intelligent Systems Options for implementing AI Slide 33 Use an AI programming language Use an AI software package Do AI directly in your favorite programming language CS 362_ Intelligent Systems Options for implementing AI: (i) Use an AI programming language Slide 34 Prolog and Lisp are two programming languages; both have standards. Both emphasize programming in small pieces. Both use emphasize linked lists and recursion. Lisp uses functions, Prolog predicate calculus. Prolog has automatic backtracking and flexible variable binding. CS 362_ Intelligent Systems Options for implementing AI: (ii) Use an AI software package Slide 35 Many software packages are available: CLIPS is a popular standalone system JESS is a popular Java package. Packages have better support facilities (e.g. graphics) than languages Languages have standards, shells don't. Packages may be too rigid for your application. Some packages don't support variables (e.g. neural nets). CS 362_ Intelligent Systems Options for implementing AI: (iii) Do AI directly in your favorite programming language Slide 36 This is the most popular way today. Libraries and predefined classes for AI methods are available. It's more work to do it yourself, but many AI ideas aren't hard to implement. Programming in Weka, Webots, Unity 3D, etc. seem to be popular for simulation and rapid development. CS 362_ Intelligent Systems Key difficulties in doing AI Slide 37 Successes get exported, and people forget the ideas came from AI. Thorough testing is necessary to show an AI system works. Don't trust quick demos. Methods that cannot be tested easily (e.g. genetic algorithms and fuzzy sets) tend to be overvalued because people can’t see when they're wrong. Methods that can show obvious errors (e.g. logical inferences) tend to be undervalued. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 UNIT 2 THE PREDICATE CALCULUS CS 362_ Intelligent Systems Knowledge Slide 2 Knowledge is a theoretical or practical understanding of a subject or a domain. Knowledge is also the sum of what is currently known, and apparently knowledge is power. Those who possess knowledge are called experts. Knowledge representation is a science of translating actual knowledge into a format that can be used by the computer. CS 362_ Intelligent Systems THE PROPOSITIONAL CALCULUS Slide 3 The propositional calculus and the predicate calculus are representation languages for artificial intelligence. Using their words, phrases, and sentences, we can represent and reason about properties and relationships in the world. The first step in describing a language is to introduce the pieces that make it up: its set of symbols. CS 362_ Intelligent Systems Syntax vs. Semantic Slide 4 Syntax: is the study of the principles and rules for constructing phrases and sentences in natural languages The syntax of the logic system defined by a set of rules for producing legal sentences. Semantics: is the study of meaning of legal sentences. CS 362_ Intelligent Systems PROPOSITIONAL CALCULUS SYMBOLS Slide 5 The symbols of propositional calculus are: The propositional symbols (that denote propositions): P, Q, R, S, … The truth symbols: True and False The connectives: ∧ : the conjunction (and) ∨ : the disjunction (or) ¬ : the negation → : the implication (conditional statement) ≡ : the equivalence CS 362_ Intelligent Systems Propositional Calculus sentences (i) Slide 6 Sentences in the propositional calculus are formed from these atomic symbols according to the following rules: Every propositional symbol and truth symbol is a sentence. For example: true, P, Q, and R are sentences. The negation of a sentence is a sentence: For example: ¬ P and ¬ false are sentences. The conjunction (and) of two sentences is a sentence. For example: P ∧ Q is a sentence (P and Q are called the conjuncts). The disjunction (or) of two sentences is a sentence. For example: P ∨ Q is a sentence (P and Q are referred to as disjuncts). The implication of one sentence from another is a sentence. For example: P → Q is a sentence (P is called the premise or antecedent and Q is called the conclusion or consequent.) The equivalence of two sentences is a sentence. For example: P ∨ Q ≡ R is a sentence. CS 362_ Intelligent Systems Propositional Calculus sentences (ii) Slide 7 Legal sentences are also called well-formed formulas or WFFs. In the sentences of propositional calculus, the symbols ( ) and [ ] are used to group symbols into sub expressions and so to control their order of evaluation and meaning. For example, (P ∨ Q) ≡ R is quite different from P ∨ (Q ≡R) An expression is considered a sentence (or well-formed formula) if and only if it can be formed of legal symbols through some sequence of the above rules. CS 362_ Intelligent Systems Example Slide 8 Is the following expression a well-formed sentence in the propositional calculus? ((P ∧ Q) → R) ≡ ¬ P ∨ ¬Q ∨ R Answer Yes, it is a well-formed sentence in the propositional calculus because it has been constructed (through a series of applications of legal rules and is therefore “well formed) as follows: P, Q, and R are propositions and thus sentences. P ∧ Q, the conjunction of two sentences, is a sentence. (P ∧ Q) → R, the implication of a sentence for another, is a sentence. ¬ P and ¬ Q, the negations of sentences, are sentences. ¬ P ∨ ¬ Q, the disjunction of two sentences, is a sentence. ¬ P ∨ ¬ Q ∨ R, the disjunction of three sentences, is a sentence. ((P ∧ Q) → R) ≡ ¬ P ∨ ¬ Q ∨ R, the equivalence of two sentences, is a sentence. CS 362_ Intelligent Systems The Semantics of the Propositional Calculus Slide 9 Interpretation is the truth value assignment to propositional sentences. Formally, an interpretation is a mapping from the propositional symbols into the set {T, F}. Because AI programs must reason with their representational structures, it is important to demonstrate that the truth of their conclusions depends only on the truth of their initial knowledge or premises. As mentioned earlier, the symbols true and false are part of the set of wellformed sentences of the propositional calculus; i.e., they are distinct from the truth value assigned to a sentence. To enforce this distinction, the symbols T and F are used for truth value assignment. CS 362_ Intelligent Systems The interpretation for sentences in the Propositional Calculus Slide 10 The truth assignment of negation (¬ P), where P is any propositional symbol, is F if the assignment to P is T, and T if the assignment to P is F. The truth assignment of conjunction (∧) is T only when both conjuncts have truth value T; otherwise it is F. The truth assignment of disjunction (∨) is F only when both disjuncts have truth value F; otherwise it is T. The truth assignment of implication (→) is F only when the premise (symbol before the implication) is T and the truth value of the consequent (symbol after the implication) is F; otherwise it is T. The truth assignment of equivalence (≡) is T only when both expressions have the same truth assignment for all possible interpretations; otherwise it is F. CS 362_ Intelligent Systems Propositional Calculus Laws Slide 11 P, Q, and R are propositional expressions The double negation law ¬(¬P) ≡ P The definition of implication (P → Q) ≡ (¬P ∨ Q) The contrapositive law (P → Q) ≡ (¬Q → ¬P) De Morgan’s laws ¬(P ∨ Q) ≡ (¬P ∧ ¬Q) ¬(P ∧ Q) ≡ (¬P ∨¬ Q) The commutative laws (P ∧ Q) ≡ (Q ∧ P) (P ∨ Q) ≡ (Q ∨ P) The associative laws ((P ∧ Q) ∧ R) ≡ (P ∧ (Q ∧ R)) ((P ∨ Q) ∨ R) ≡ (P ∨ (Q ∨ R)) The distributive laws P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R) P ∧ (Q ∨ R) ≡ (P ∧ Q) ∨ (P ∧ R) CS 362_ Intelligent Systems Truth Tables Slide 12 Two expressions in the propositional calculus are equivalent if they have the same value under all truth value assignments. This equivalence may be demonstrated using truth tables. CS 362_ Intelligent Systems Example Slide 13 Using a truth table, show that the expressions (P → Q) and (¬P ∨ Q) are equivalent (P → Q ≡ ¬P ∨ Q) CS 362_ Intelligent Systems The Predicate Calculus: introduction (i) Slide 14 In propositional calculus, each atomic symbol (P, Q, etc.) denotes a single proposition. Therefore, there is no way to access the components of an individual assertion. For instance, P is a proposition such that P: “it rained on Tuesday” On the contrary, predicate calculus provides this ability. For instance, we can create a predicate weather that describes a relationship between a day and the weather: weather (Tuesday, rain) Through inference rules we can manipulate predicate calculus expressions, accessing their individual components and inferring new sentences. CS 362_ Intelligent Systems The Predicate Calculus: introduction (ii) Slide 15 Predicate calculus also allows expressions to contain variables. Variables let us create general assertions about classes of entities. For example, we could state that: for all values of X, where X is a day of the week, the statement weather(X, rain) is true; i.e., it rains every day. CS 362_ Intelligent Systems Predicate Calculus Symbols (i) Slide 16 The alphabet that makes up the symbols of the predicate calculus consists of: The set of both upper and lowercase letters of the English alphabet The set of decimal digits (0, 1, 2, … , 9) The underscore ( _ ) Symbols in the predicate calculus begin with a letter and are followed by any sequence of the above legal characters. CS 362_ Intelligent Systems Example Slide 17 Is the following a legitimate character in the alphabet of predicate calculus symbols? A Yes d Yes # No / No 6 Yes _ Yes & No CS 362_ Intelligent Systems Example Slide 18 Is the following a legitimate predicate calculus symbol? George YES Fire3 YES science!!! No ahmd_and_salem YES 3jack No bill YES XXXX YES Cheating not allowed No friends_of YES ab%cd No CS 362_ Intelligent Systems Predicate Calculus Symbols (ii) Slide 19 Parentheses “( )”, commas “,”, and periods “.” are used solely to construct well-formed expressions and do not denote objects or relations in the world. These are called improper symbols. Predicate calculus symbols may represent either variables, constants, functions, or predicates. CS 362_ Intelligent Systems Predicate Calculus Symbols (iii) Slide 20 Predicate calculus symbols include: Truth symbols: true and false (these are reserved symbols). Constant symbols: symbol expressions having the first character lowercase. Variable symbols: symbol expressions beginning with an uppercase character. Function symbols: symbol expressions having the first character lowercase. Every function symbol has an associated arity, indicating the number of elements in the domain mapped onto each element of the range. A function expression is a function symbol followed by its arguments. i.e., it consists of a function constant of arity n, followed by n terms (t1, t2 ,…, tn) enclosed in parentheses and separated by commas. CS 362_ Intelligent Systems Predicate Calculus Terms Slide 21 A predicate calculus term is either a constant, variable, or function expression. It may be used to denote objects and properties in a problem domain. Examples of terms are: cat times(2,3) X Blue mother(sarah) kate CS 362_ Intelligent Systems Predicates Slide 22 Symbols in predicate calculus may also represent predicates. Predicate symbols, like constants and function names, begin with a lowercase letter. A predicate names a relationship between zero or more objects in the world. The number of objects so related is the arity of the predicate. Predicates with the same name but different arities are considered distinct. Examples of predicates are: likes equals on near part_of CS 362_ Intelligent Systems Atomic Sentences Slide 23 Atomic sentence Is the most primitive unit of the predicate calculus language. It is a predicate of arity n followed by n terms enclosed in parentheses and separated by commas. The truth values, true and false, are also atomic sentences Atomic sentences are also called atomic expressions, atoms, or propositions We may combine atomic sentences using logical operators to form sentences in the predicate calculus. These are the same logical connectives used in propositional calculus (∧, ∨, ¬, →, and ≡) CS 362_ Intelligent Systems Examples of Atomic Sentences Slide 24 Examples of atomic sentences are: likes(george,kate) likes(george,susie) likes(X,george) likes(X,X) likes(george,sarah,tuesday) friends(bill,richard) friends(bill,george) friends(father_of(david),father_of(andrew)) helps(bill,george) helps(richard,bill) CS 362_ Intelligent Systems Universal and Existential Quantifiers Slide 25 The universal quantifier,∀, indicates that the sentence is true for all values of the variable. In the example, ∀X likes(X, ice_cream) is true for all values in the domain of the definition of X. The existential quantifier,∃, indicates that the sentence is true for at least one value in the domain. For example ∃ Y friends(Y, peter) is true if there is at least one object, indicated by Y that is a friend of peter. CS 362_ Intelligent Systems Syntax of First Order Logic: Quantifiers Slide 26 For the universal quantifier ⇒ is the main connective with (∀). E.g.: ∀X in(X,KSA) ⇒ Smart(X) means: “everyone in the KSA is smart” For the existential quantifier ∧ is the main connective with (∃) E.g. ∃X in(X,KSA) ∧ Smart(X) means “there is at least one in the KSA who is smart” CS 362_ Intelligent Systems PREDICATE CALCULUS SENTENCES (WFFs) Slide 27 Every atomic sentence is a sentence. If s is a sentence, then so is its negation, ¬ s. If s1 and s2 are sentences, then so is their conjunction, s1 ∧ s2. If s1 and s2 are sentences, then so is their disjunction, s1 ∨ s2. If s1 and s2 are sentences, then so is their implication, s1 → s2. If s1 and s2 are sentences, then so is their equivalence, s1 ≡ s2. If X is a variable and s a sentence, then ∀ X s is a sentence. If X is a variable and s a sentence, then ∃ X s is a sentence. CS 362_ Intelligent Systems Examples of well-formed sentences Slide 28 Let times and plus be function symbols of arity 2 and let equal and foo be predicate symbols with arity 2 and 3, respectively. plus(two,three) is a function and thus not an atomic sentence. equal(plus(two,three), five) is an atomic sentence. equal(plus(2, 3), seven) is an atomic sentence. Note that this sentence, given the standard interpretation of plus and equal, is false. Well-formedness and truth value are independent issues. ∃X foo(X,two,plus(two,three))∧equal(plus(two,three),five) is a sentence since both conjuncts are sentences. (foo(two,two,plus(two,three)))→(equal(plus(three,two),five)≡true) is a sentence because all its components are sentences, appropriately connected by logical operators. CS 362_ Intelligent Systems Family Relationships Slide 29 An Example of the use of predicate calculus to describe a simple world. The domain of discourse is a set of family relationships in a biblical genealogy: mother(eve, abel) mother(eve , cain) father(adam , abel) father(adam , cain) ∀ X ∀ Y father(X, Y) ∨ mother(X, Y) → parent(X, Y) ∀ X ∀ Y ∀ Z parent(X, Y) ∧ parent(X, Z) → sibling (Y, Z) CS 362_ Intelligent Systems A Semantics for the Predicate Calculus Slide 30 Given an interpretation, the meaning of an expression is a truth value assignment over the interpretation. The value of an atomic sentence is either T or F as determined by the interpretation I CS 362_ Intelligent Systems FIRST-ORDER PREDICATE CALCULUS (i) Slide 31 First-order predicate calculus allows quantified variables to refer to objects in the domain of discourse and not to predicates or functions. For example, ∀(Likes) Likes(george,kate) is not a well-formed expression in the first-order predicate calculus. There are higher-order predicate calculi where such expressions are meaningful. CS 362_ Intelligent Systems FIRST-ORDER PREDICATE CALCULUS (ii) Slide 32 Examples of English sentences represented in predicate calculus are: If it doesn’t rain on Monday, Tom will go to the mountains. ¬ weather(rain, monday)→go(tom, mountains) Emma is a Doberman pinscher and a good dog. gooddog(emma) ∧ isa(emma, doberman) All basketball players are tall. ∀ X (basketball_player(X)→tall(X)) Some people like anchovies. ∃ X (person(X) ∧ likes(X, anchovies)) If wishes were horses, beggars would ride. equal(wishes, horses)→ride(beggars) Nobody likes taxes. ¬ ∃ X likes(X, taxes) CS 362_ Intelligent Systems Example (i) Slide 33 Everybody likes some food. X F food(F) likes (X,F) There is a food that everyone likes. F X food(F) likes (X,F) Whoever eats some spicy dish, they’re happy. X F food(F) spicy(F) eats (X,F) happy(X) John’s meals are spicy. X meal-of(John,X) spicy(X) CS 362_ Intelligent Systems Example (ii) Slide 34 From the following axioms and facts, can you deduce who is the prize winner and if he is happy or not? ∀x Won_Prize(x) ⇒Happy(x) ∀y Play_Game(y) ^ Lucky(y) ⇒ Won_Prize(y) Play_Game(Ali) Play_Game(Mona) Lucky(Ali) Happy(Mona) CS 362_ Intelligent Systems Example (iii) Slide 35 Consider the following axioms: 1. All hounds howl at night. 2. Anyone who has any cats will not have any mice. 3. Light sleepers do not have anything which howls at night. 4. Ali has either a cat or a hound. Using FOL translate each of the above axioms into a well formed formula (WFF). Given the fact that Ali is a light sleeper, Does Ali have mice? Explain. CS 362_ Intelligent Systems Fundamental concepts of logical representation: the inference rules Slide 36 The semantics of the predicate calculus provides a basis for a formal theory of logical inference. The ability to infer new correct expressions from a set of true assertions is an important feature of the predicate calculus. These new expressions are correct in that they are consistent with all previous interpretations of the original set of expressions. CS 362_ Intelligent Systems Fundamental concepts of logical representation: the proof procedure Slide 37 Proof procedure is a combination of an inference rule and an algorithm for applying that rule to a set of logical expressions to generate new sentences. CS 362_ Intelligent Systems Rules of inference Slide 38 Modus ponens p pq q Modus tollens q pq p Elimination pq p q Introduction p q pq Universal instantiation ∀ X p(X) p(a), where a is from the domain of x CS 362_ Intelligent Systems Unification Slide 39 Unification is an algorithm for determining the substitutions needed to make two predicate calculus expressions match. Unification specifies conditions under which two (or more) predicate calculus expressions may be said to be equivalent. Because p(X) and p(Y) are equivalent, Y may be substituted for X to make the sentences match. Unification and inference rules such as modus ponens allow us to make inferences on a set of logical assertions. Unification is complicated by the fact that a variable may be replaced by any term, including other variables and function expressions of arbitrary complexity. These expressions may themselves contain variables. CS 362_ Intelligent Systems Examples Slide 40 Unify p(a,X) and p(a,b) answer: b/X p(a,b) Unify p(a,X) and p(Y,b) answer: a/Y, b/X p(a,b) Unify p(a,X) and p(Y, f(Y)) answer: a/Y, f(a)/X p(a,f(a)) Unify p(a,X) and p(X,b) Failure Unify p(a,X) and p(Y,b) answer: a/Y, b/X Unify p(a,b) and p(X,X) Failure Unify p(X, f(Y), b) and P(X, f(b), b) answer: b/Y p(a,b) Taibah University_CS Dept. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 UNIT 3 STRUCTURES & STRATEGIES FOR STATE SPACE SEARCH CS 362_ Intelligent Systems Search and AI Slide 2 Search is an important aspect of AI. Search can be defined as a problem-solving technique that enumerates a problem space from an initial position in search of a goal position (or solution). Search algorithm or strategy is the manner in which the problem space is searched. Intelligent agents are supposed to maximize their performance measure. Achieving this is sometimes simplified if the agent can adopt a goal and aim at satisfying it. CS 362_ Intelligent Systems Search and AI Slide 3 The process of looking for a sequence of actions that reaches the goal is called search A search algorithm takes a problem as input and returns a solution in the form of an action sequence. Once a solution is found, the actions it recommends can be carried out. A solution to a problem is an action sequence that leads from the initial state to a goal state. Solution quality is measured by the path cost function, An optimal solution has the lowest path cost among all solutions. CS 362_ Intelligent Systems Search and AI Slide 4 Why Search? To achieve goals or to maximize our utility: We need to predict what the result of our actions in the future will be. There are many sequences of actions, each with their own utility: We want to find, or search for, the best one. CS 362_ Intelligent Systems WELL-DEFINED PROBLEM Slide 5 A problem can be defined formally by five components: Initial state: that the agent starts in. Actions: a description of the possible actions available to the agent. Transition model: a description of what each action does Goal test: determines whether a given state is a goal state. Path cost: function that assigns a numeric cost to each path. CS 362_ Intelligent Systems GENERAL STATE SPACE SEARCH Slide 6 What is the meaning of search space? Search space is the possible states in environment of the problem that can be searched to find solution. The solution of any problem can be thought as sequence of actions that we can take, and the new state of the environment as we perform those actions The problem of search is to find a sequence of operators that transition from the start to goal state. That sequence of operators is the solution. CS 362_ Intelligent Systems GENERAL STATE SPACE SEARCH Slide 7 In search space: Some paths lead to dead-ends Others lead to solutions. There may be multiple solutions, some better than others. How we avoid dead-ends and then select the best solution available is a product of our particular search strategy. The theory of state space search is our primary tool for finding the solution by: Representing the problem as a state space graph Using graph theory to analyze the structure and complexity of both the problem and the search procedures that we employ to solve it. CS 362_ Intelligent Systems Graph terminology (i) Slide 8 The graph consists of: A set of nodes N1, N2, N3, ..., Nn, ..., which need not be finite. A set of arcs that connect pairs of nodes. Arcs are ordered pairs of nodes; i.e., the arc (N3, N4) connects node N3 to node N4. This indicates a direct connection from node N3 to N4 but not from N4 to N3, unless (N4, N3) is also an arc, and then the arc joining N3 and N4 is undirected. A labeled graph has one or more descriptors (labels) attached to each node that distinguish that node from any other node in the graph. A graph is directed if arcs have an associated directionality. The arcs in a directed graph are usually drawn as arrows or have an arrow attached to indicate direction (E.g.: in the shown graph arc (a, b) may only be crossed from node a to node b, but arc (b, c) is crossable in either direction.) CS 362_ Intelligent Systems Graph terminology (ii) Slide 9 If a directed arc connects Nj and Nk, then: Nj is called the parent of Nk and Nk, the child of Nj. If the graph also contains an arc (Nj, Nl), then Nk and Nl are called siblings. A rooted graph has a unique node NS from which all paths in the graph originate. That is, the root has no parent in the graph. A tip or leaf node is a node that has no children. An ordered sequence of nodes [N1, N2, N3, ..., Nn], where each pair Ni, Ni+1 in the sequence represents an arc, i.e., (Ni, Ni+1), is called a path of length n - 1. On a path in a rooted graph, a node is said to be an ancestor of all nodes positioned after it (to its right) as well as a descendant of all nodes before it. A path that contains any node more than once (some Nj in the definition of path above is repeated) is said to contain a cycle or loop. CS 362_ Intelligent Systems Graph terminology (iii) Slide 10 A tree is a graph in which there is a unique path between every pair of nodes. The paths in a tree, therefore, contain no cycles. The arcs in a rooted tree are directed away from the root. Each node in a rooted tree has a unique parent. Two nodes are said to be connected if a path exists that includes them both. CS 362_ Intelligent Systems Example Slide 11 In the shown tree find: The root a The parent of c is b The children of g are: h, i, and j The siblings of h are: i and j The ancestors of e are: c, b, and a. The descendants of b are: c, d, and e. The tips (leaves) are: d, e, f , i, k, l, and m. CS 362_ Intelligent Systems Use of graph theory in state space search Slide 12 In the state space model of problem solving: The nodes of a graph are taken to represent discrete states in a problem-solving process. The arcs of the graph represent transitions between states. Graph theory is our best tool for reasoning about the structure of objects and relations. The Swiss mathematician Leonhard Euler invented graph theory to solve the “bridges of Königsberg problem.” CS 362_ Intelligent Systems The Problem of Königsberg Bridges (i) Slide 13 The city of Königsberg occupied both banks and two islands of a river. The islands and the riverbanks were connected by seven bridges. The bridges of Königsberg problem asks if there is a walk around the city that crosses each bridge exactly once. CS 362_ Intelligent Systems The Problem of Königsberg Bridges (ii) Slide 14 The residents had failed to find such a walk and doubted that it was possible, no one had proved its impossibility. Devising a form of graph theory, Euler created an alternative representation for the map. The riverbanks are (rb1and rb2) , islands are (i1and i2) and the bridges are (b1, b2,…., b7). CS 362_ Intelligent Systems The Problem of Königsberg Bridges (iii) Slide 15 The riverbanks (rb1and rb2) and islands (i1and i2) are described by the nodes of a graph ; the bridges are represented by labeled arcs between nodes (b1, b2,… , b7). The graph representation preserves the essential structure of the bridge system, while ignoring extraneous features such as bridge lengths, distances, and order of bridges in the walk. CS 362_ Intelligent Systems The Problem of Königsberg Bridges (iv) Slide 16 We may represent the Königsberg bridge system using predicate calculus. The connect predicate corresponds to an arc of the graph, asserting that two land masses are connected by a particular bridge. Each bridge requires two connect predicates, one for each direction in which the bridge may be crossed. A predicate expression, connect(X, Y, Z) = connect (Y, X, Z), indicating that any bridge can be crossed in either direction, would allow removal of half the following connect facts: CS 362_ Intelligent Systems The Problem of Königsberg Bridges (v) Slide 17 Euler focused on the degree of the nodes of the graph With the exception of its beginning and ending nodes, the desired walk would have to leave each node exactly as often as it entered it. Nodes of odd degree could be used only as the beginning or ending of the walk, because such nodes could be crossed only a certain number of times before they proved to be a dead end. The traveler could not exit the node without using a previously traveled arc. Euler noted that unless a graph contained either exactly zero or two nodes of odd degree, the walk was impossible. If there were two odd-degree nodes, the walk could start at the first and end at the second. If there were no nodes of odd degree, the walk could begin and end at the same node. Taibah University_CS Dept. CS 362_ Intelligent Systems The Finite State Machine (FSM) Slide 18 A finite state machine (FSM) is a finite, directed, connected graph, having a: Finite set of states (S) Finite set of input values (I), and State transition function(F) that describes the effect that the elements of the input stream have on the states of the graph. Thus ∀ i ∈ I, Fi : (S → S). If the machine is in state sj and input i occurs, the next state of the machine will be Fi (sj). The stream of input values produces a path within the graph of the states of this finite machine. The primary use for such a machine is to recognize components of a formal language. These components are often strings of characters (“words” made from characters of an “alphabet”). Taibah University_CS Dept. CS 362_ Intelligent Systems FSM Representation Slide 19 FSM may be represented in two equivalent ways using: A transition matrix, where: input values are listed along the top row the states are in the leftmost column the output (next state) for an input applied to a state is at the intersection point. A finite graph that has directed arcs, labeled with the input values. Taibah University_CS Dept. CS 362_ Intelligent Systems Example Slide 20 For a finite state machine that has S = {s0, s1}, I = {0,1}, f0(s0) = s0, f0(s1) = (s1), f1(s0) = s1, and f1(s1) = s0, draw: 1. The finite state graph 2. The transition matrix Taibah University_CS Dept. CS 362_ Intelligent Systems The Finite State Acceptor (MOORE Machine) Slide 21 The finite state acceptor is a finite state machine (S, I, F), where: ∃ s0 ∈ S: the starting state such that the input stream starts at s0 We use the convention of placing an arrow from no state that terminates in the starting state of the Moore machine ∃ sn ∈ S: called the accepting state. The input stream is accepted if it terminates in that state. In fact, there may be a set of accept states. The accepting state (or states) is represented using a doubled circle The finite state acceptor is represented as (S, s0, {sn}, I, F) Taibah University_CS Dept. CS 362_ Intelligent Systems Example of the Finite State Acceptor (MOORE Machine): string recognition Slide 22 With two assumptions, this machine could be seen as a recognizer of all strings of characters from the alphabet {a, b, c, d} that contain the exact sequence “a b c” The two assumptions are: State s0 has a special role as the starting state State s3 is the accepting state. Thus, the input stream will present its first element to state s0. If the stream later terminates with the machine in state s3, it will have recognized that there is the sequence “a b c” within that input stream. Taibah University_CS Dept. CS 362_ Intelligent Systems Example Slide 23 For the shown FSM which of these inputs are valid (will be accepted by the shown FSM): aaacdb CORRECT ababacdaaac CORRECT abcdb (ERROR; there is no input that accepts b then c) acda (ERROR S1 is not a accepting state) acdbdb CORRECT Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems (i) Slide 24 As mentioned earlier, the theory of state space search is our primary tool for finding the solution by representing the problem as a state space graph in which: The nodes of a graph correspond to partial problem solution states The arcs correspond to steps in a problem-solving (solution) process. One or more start states (corresponding to the given information in a problem instance) form the root of the graph. The goal state(s) of the problem. These states are described using either: A measurable property of the states encountered in the search. A measurable property of the path developed in the search, for example, the sum of the transition costs for the arcs of the path. Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems (ii) Slide 25 Paths through the space represent solutions in various stages of completion. Paths are searched, beginning at the start state and continuing through the graph, until either the goal description is satisfied or they are abandoned. The actual generation of new states along the path is done by applying operators, such as “legal moves” in a game or inference rules in a logic problem or expert system, to existing states on a path. A solution path is a path through the graph from a start nodes to goal nodes A path cost function: assigns a numeric cost to each path = performance measure, denoted by g, to distinguish the best path from others. Usually the path cost is the sum of the step costs of the individual actions (in the action list). Optimal solution : the solution with lowest path cost among all solutions. The task of a search algorithm is to find a solution path through such a problem space. Search algorithms must keep track of the paths from a start to a goal node, because these paths contain the series of operations that lead to the problem solution. CS 362_ Intelligent Systems State Space Representation of Problems Example (i): the 8-puzzle Slide 26 Eight tiles can be moved around in nine spaces. Although physical puzzle moves are made by moving tiles, it’s simpler to think in terms of “moving the blank space”. This simplifies the definition of move rules because there are eight tiles but only a single blank. States: state description specifies the location of each of the eight tiles and the blank in one of the nine squares. Starting state: any state can be designated as the starting state Goal description of the state space is a particular state or board configuration. When this state is found on a path, the search terminates. The path from start to goal is the desired series of moves. Taibah University_CS Dept. CS 362_ Intelligent Systems State space of the 8-puzzle: generated by “move blank” operations Slide 27 Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems Example (ii): the travelling salesperson_1 Slide 28 A salesperson has five cities to visit (cities A, B, C, D and E) and then must return home. The goal of the problem is to find the shortest path for the salesperson to travel, visiting each city (starting from city A, where he lives), and then returning to the starting city (city A). Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems Example (ii): the travelling salesperson_2 Slide 29 The nodes of the graph represent cities Each arc is labeled with a weight indicating the cost of traveling that arc. This cost might be a representation of the miles necessary in car travel or cost of an air flight between the two cities. The goal description requires a complete circuit with minimum cost (the goal description here is a property of the entire path, rather than of a single state) Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems Example (ii): the travelling salesperson_3 Slide 30 The figure shows one way in which possible solution paths may be generated and compared. Beginning with node A, possible next states are added until all cities are included and the path returns home. The goal is the lowestcost path. Taibah University_CS Dept. CS 362_ Intelligent Systems State Space Representation of Problems Example (ii): the travelling salesperson_4 Slide 31 An instance of the travelling salesperson problem with the nearest neighbor path in bold is shown in the graph below. This path (A, E, D, B, C, A), at a cost of 550, is not the shortest path. The comparatively high cost of arc (C, A) defeated the heuristic. However, this method is highly efficient, as there is only one path to be tried. The nearest neighbor heuristic is fallible, as graphs exist for which it does not find the shortest path, but it is a possible compromise when the time required makes exhaustive search impractical. Taibah University_CS Dept. CS 362_ Intelligent Systems Strategies for Space State Search Slide 32 A state space may be searched in two directions: From the given data of a problem instance toward a goal using data driven search “forward chaining” From a goal back to the data using goal driven search “backward chaining”. Taibah University_CS Dept. CS 362_ Intelligent Systems Data Driven search “Forward chaining” (i) Slide 33 In data-driven search, sometimes called forward chaining, the problem solver begins with the given facts of the problem and a set of legal moves or rules for changing state. Search proceeds by applying rules to facts to produce new facts, which are in turn used by the rules to generate more new facts. This process continues until (hopefully !) it generates a path that satisfies the goal condition. To summarize: data-driven reasoning takes the facts of the problem and applies the rules or legal moves to produce new facts that lead to a goal. CS 362_ Intelligent Systems Data Driven search “Forward chaining” (ii) Slide 34 The figure shows the state space in which data-directed search prunes irrelevant data and their consequents and determines one of a number of possible goals. Taibah University_CS Dept. CS 362_ Intelligent Systems When to use the data driven search Slide 35 Data-driven search is appropriate for problems in which: All or most of the data are given in the initial problem statement. Interpretation problems often fit this mold by presenting a collection of data and asking the system to provide a high-level interpretation. There are a large number of potential goals, but there are only a few ways to use the facts and given information of a particular problem instance. It is difficult to form a goal or hypothesis. CS 362_ Intelligent Systems Goal Driven search “Backward chaining” (i) Slide 36 In goal driven search we: Take the goal to solve. See what rules or legal moves could be used to generate this goal and determine what conditions must be true to use them. These conditions become the new goals, or subgoals, for the search. Search continues, working backward through successive subgoals until (hopefully !) it works back to the facts of the problem. To summarize: goal-driven reasoning focuses on the goal, finds the rules that could produce the goal, and chains backward through successive rules and subgoals to the given facts of the problem. CS 362_ Intelligent Systems Goal Driven search “Backward chaining” (ii) Slide 37 State space in which goal-directed search effectively prunes extraneous search paths. CS 362_ Intelligent Systems When to use the goal driven search Slide 38 Goal-driven search is suggested if: A goal or hypothesis is given in the problem statement or can easily be formulated. There are a large number of rules that match the facts of the problem and thus produce an increasing number of conclusions or goals. Problem data are not given but must be acquired by the problem solver. In this case, goal-driven search can help guide data acquisition. Goal-driven search thus uses knowledge of the desired goal to guide the search through relevant rules and eliminate branches of the space. Taibah University_CS Dept. CS 362_ Intelligent Systems Which one to choose? Data or goal-driven search Slide 39 Both data-driven and goal-driven problem solvers search the same state space graph; however, the order and actual number of states searched can differ. The strategy is determined by the properties of the problem itself. These include: The complexity of the rules The “shape” of the state space The nature and availability of the problem data. The decision to choose between data- and goal-driven search is based on the structure of the problem to be solved. All of these vary for different problems. Taibah University_CS Dept. CS 362_ Intelligent Systems Strategies for Space State Search Slide 40 A strategy is defined by picking the order of node expansion. Strategies are evaluated along the following dimensions: Completeness: does it always find a solution if one exists? Optimality: does it always find a least-cost solution? Time complexity: how many steps does it take to solve a problem? Space complexity: how much memory does it take to solve a problem? Time and space complexities are measured in terms of: b: maximum branching factor (the number of children for each node) of the search tree. d: depth of the least-cost solution. m: maximum depth of the state space (may be infinite). CS 362_ Intelligent Systems CLASSES OF SEARCH Slide 41 There are four classes of search: Uninformed search √ Informed search √ Constraint satisfaction: It is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. Adversarial search: Here we examine the problem which arises when we try to plan ahead of the world and other agents are planning against us Taibah University_CS Dept. CS 362_ Intelligent Systems Search Strategies (i) Slide 42 Uninformed search No information about the number of steps No information on the path cost from the current state to the goal Search the state space blindly Informed search, or heuristic search A cleverer strategy that searches toward the goal, based on the information from the current state so far Uninformed (blind) search strategies use only the information available in the problem definition. Informed search techniques might have additional information (e.g. a compass) in solving the problem. Taibah University_CS Dept. CS 362_ Intelligent Systems Search Strategies (ii) Slide 43 Uninformed search Easy Very inefficient in most cases Have huge search tree Informed search Uses problem-specific information Reduces the search tree into a small one Resolves time and memory complexities Taibah University_CS Dept. CS 362_ Intelligent Systems Implementing Graph Search Slide 44 In solving a problem using either goal- or data-driven search, the problem solver must find a path from a start state to a goal through the state space graph. The sequence of arcs in this path corresponds to the ordered steps of the solution Various blind strategies: • • • • • • Breadth-first search √ Bidirectional search Uniform-cost search Depth-first search √ Depth-limited search Iterative deepening search Taibah University_CS Dept. CS 362_ Intelligent Systems Backtracking Slide 45 Backtracking is a technique for systematically trying all paths through a state space. We will present a simpler version of the backtrack algorithm with depth-first search (uninformed search) Backtracking search begins at the start state and pursues a path until it reaches either a goal or a “dead end.” If it finds a goal, it quits and returns the solution path. If it reaches a dead end, it “backtracks” to the most recent node on the path having unexamined siblings and continues down one of these branches Taibah University_CS Dept. CS 362_ Intelligent Systems Backtracking Algorithm (i) Slide 46 A backtracking algorithm can be defined using three lists to keep track of nodes in the state space: State list (SL): it lists the states in the current path being tried. If a goal is found, SL contains the ordered list of states on the solution path. New state list (NSL): it contains nodes awaiting evaluation, i.e., nodes whose children have not yet been generated and searched. Dead ends list (DE): it lists states whose children have failed to contain a goal. If these states are encountered again, they will be detected as elements of DE and eliminated from consideration immediately. CS 362_ Intelligent Systems Backtracking Algorithm (ii) Slide 47 In backtrack, the state currently under consideration is called current state (CS). CS is always equal to the state most recently added to SL and represents the “frontier” of the solution path currently being explored. The result is an ordered set of new states, the children of CS. The first of these children is made the new current state and the rest are placed in order on NSL for future examination. The new current state is added to SL and search continues. If CS has no children, it is removed from SL (this is where the algorithm “backtracks”) and any remaining children of its predecessor on SL are examined. Taibah University_CS Dept. CS 362_ Intelligent Systems Backtracking Algorithm (iii) Slide 48 CS 362_ Intelligent Systems Backtracking Algorithm (iii) Slide 49 Taibah University_CS Dept. CS 362_ Intelligent Systems Depth-First and Breadth-First Search Slide 50 In addition to specifying a search direction (data-driven or goaldriven), a search algorithm must determine the order in which states are examined in the tree or the graph. We here considers two possibilities for the order in which the nodes of the graph are considered: Depth-first search Breadth-first search. Taibah University_CS Dept. CS 362_ Intelligent Systems Breadth-First Search (BFS) (i) Slide 51 Breadth-first search, explores the space in a level-by-level fashion. Only when there are no more states to be explored at a given level, the algorithm moves onto the next deeper level. E.g.: a breadth-first search of the shown graph considers the states in the order A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U. Taibah University_CS Dept. CS 362_ Intelligent Systems Breadth-First Search (ii) Slide 52 The root node is expanded first (FIFO) All the nodes generated by the root node are then expanded And then their successors and so on Taibah University_CS Dept. CS 362_ Intelligent Systems Implementing Breadth-First Search Slide 53 Breadth-first search is implemented using two lists to keep track of progress through the state space. Open (like NSL in backtrack) lists states that have been generated but whose children have not been examined. The order in which states are removed from open determines the order of the search. Closed records states already examined. Closed is the union of the DE and SL lists of the backtrack algorithm. Child states are generated by inference rules, legal moves of a game, or other state transition operators. Each iteration produces all children of the state X an adds them to open. Note that open is maintained as a queue, or first-in-first out (FIFO) data structure. States are added to the right of the list and removed from the left. Taibah University_CS Dept. CS 362_ Intelligent Systems Breadth-First Search Algorithm Slide 54 Taibah University_CS Dept. CS 362_ Intelligent Systems A trace of Breadth-first search Slide 55 Each successive number, 2,3,4, . . . , represents an iteration of the “while” loop. U is the goal state. EXAMPLE (Breadth-First Search - BFS) OPEN A B E C F K L S T Initial state D G M N H O Goal state P U I J Q R A BCD CDEF DEFGH EFGHIJ FGHIJKL GHIJKLM HIJKLMN IJKLMNOP JKLMNOPQ KLMNOPQR LMNOPQRS MNOPQRST NOPQRST OPQRST PQRST QRSTU RSTU STU TU U U CLOSE A BA CBA DCBA EDCBA FEDCBA GFEDCBA HGFEDCBA IHGFEDCBA JIHGFEDCBA KJIHGFEDCBA LKJIHGFEDCBA MLKJIHGFEDCBA NMLKJIHGFEDCBA ONMLKJIHGFEDCBA PONMLKJIHGFEDCBA QPONMLKJIHGFEDCBA RQPONMLKJIHGFEDCBA SRQPONMLKJIHGFEDCBA TSRQPONMLKJIHGFEDCBA hold Taibah University_CS Dept. CS 362_ Intelligent Systems Breadth-first search of the 8-puzzle Slide 57 Breadth-first search of the 8-puzzle, showing order in which states were removed from open Taibah University_CS Dept. CS 362_ Intelligent Systems Depth-First Search (DFS) Slide 58 In depth-first search, when a state is examined, all of its children and their descendants are examined before any of its siblings. Depth-first search goes deeper into the search space whenever this is possible. Only when no further descendants of a state can be found, its siblings are considered. E.g.: depth-first search examines the states in the shown graph in the order A, B, E, K,S, L, T, F, M, C, G, N, H, O, P, U, D, I, Q, J, R. CS 362_ Intelligent Systems Implementing Depth-First Search Slide 59 In a depth-first search algorithm, the descendant states are added and removed from the left end of open Open is maintained as a stack, or last-in first-out(LIFO) structure. The organization of open as a stack directs search toward the most recently generated states, producing a depth-first search order. Taibah University_CS Dept. CS 362_ Intelligent Systems Depth_first_search algorithm Slide 60 Taibah University_CS Dept. CS 362_ Intelligent Systems A trace of depth-first search Slide 61 Each successive iteration of the “while” loop is indicated by a single line (2, 3, 4, . . .). The initial states of open and closed are given on line 1. U is the goal state. EXAMPLE (Depth-First Search - DFS) Start B OPEN A A C D BCD A EFCD BA KLFCD EBA SLFCD KEBA LFCD SKEBA TFCD E F G H I J FCD MCD CD K L M N O P Q R Goal U TLSKEBA FTLSKEBA MFTLSKEBA CMFTLSKEBA NHD GCMFTLSKEBA OPD T LSKEBA GHD HD S CLOSED NGCMFTLSKEBA HNGCMFTLSKEBA PD OHNGCMFTLSKEBA UD POHNGCMFTLSKEBA Taibah University_CS Dept. CS 362_ Intelligent Systems Depth-first search of the 8-puzzle with a depth bound of 5 Slide 63 Blind Search Strategies (cont.) Complexity Criterion Breadth-First Depth-First Time bd bm Space bd bm Optimal? Yes No if we guarantee that deeper solutions are less optimal, e.g. step-cost=1 It may find a nonoptimal goal first Yes No Complete? it always reaches goal (if b is finite) b: branching factor Fails in infinitedepth spaces d: solution depth m: maximum depth Taibah University_CS Dept. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 UNIT 4 HEURISTIC SEARCH Taibah University_CS Dept. CS 362_ Intelligent Systems Outline Slide 2 4.0 Introduction 4.1 An Algorithm for Heuristic Search 4.2 Admissibility, Monotonicity, and Informedness 4.3 Using Heuristics I n Games 4.4 Complexity Issues Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic definition Slide 3 George Polya (1945) defines heuristic as : “the study of the methods and rules of discovery and invention” The verb eurisco, which means “I discover.” When Archimedes emerged from his famous bath clutching the golden crown, he shouted “Eureka!” meaning “I have found it!”. In state space search, Heuristics are formalized as rules for choosing those branches in a state space that are most likely to lead to an acceptable problem solution. Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic and AI problem solving Slide 4 AI problem solvers employ heuristics in two basic situations: 1) A problem may not have an exact solution because of inherent ambiguities in the problem statement or available data. • Medical diagnosis is an example of this. A given set of symptoms may have several possible causes; doctors use heuristics to choose the most likely diagnosis and formulate a plan of treatment. • Vision is example of an inexact problem. Vision systems often use heuristics to select the most likely of several possible interpretations of as scene. 2) A problem may have an exact solution, but the computational cost of finding it may be prohibitive The problem has an exact solution but is too complex to allow for a brute force solution. In many problems (such as chess), Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic and AI problem solving Slide 5 Heuristic or informed search: Exploits additional knowledge about the problem that helps direct search to more promising paths. problem-solving strategies which in many cases find a solution faster than uninformed search. However , this is not guaranteed. Could require a lot more time and can even result in the solution not being found. Fallible because they rely on limited information, they may lead to a suboptimal solution or to a dead end. Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic and AI problem solving Slide 6 Heuristic is: an AI search technique that employs heuristic for its moves. a rule of thumb that probably leads to a solution. Heuristics play a major role in search strategies because of exponential nature of the most problems. help to reduce the number of alternatives from an exponential number to a polynomial number. In Artificial Intelligence, heuristic search has a general meaning, and a more specialized technical meaning. In a general sense, the term heuristic is used for any advice that is often effective, but is not guaranteed to work in every case. Within the heuristic search architecture, however, the term heuristic usually refers to the special case of a heuristic evaluation function. Taibah University_CS Dept. CS 362_ Intelligent Systems Informed or Heuristic search Strategies Slide 7 Types of informed search strategies: • Best-First Search - Greedy search - A* search - Best-First Search Analysis - Heuristic Functions • Memory Bounded Search - Iterative Deepening A* Search - Recursive best-first search - Simplified Memory Bounded A* • Iterative Improvement Algorithms - Hill-Climbing Search - Simulated Annealing - Tabu Search - and many more. Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic Information Slide 8 In order to solve larger problems, domain-specific knowledge must be added to improve search efficiency. Information about the problem include: the nature of states, cost of transforming from one state to another, and characteristics of the goals. This information can often be expressed in the form of heuristic evaluation function, say f(n , g), a function of the nodes n and/or the goals g. Taibah University_CS Dept. CS 362_ Intelligent Systems Example (1): The game of tic-tac-toe Slide 9 Even if we use symmetry to reduce the search space of redundant moves, The number of possible paths through the search space is something like 12 x 7! = 60480. That is a measure of the amount of work that would have to be done by a brute-force search. Fig 4.1 First three levels of the tic-tactoe state space reduced by symmetry Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristic Example (1) Slide 10 Fig 4.2 The “most wins” heuristic applied to the first children in tic-tactoe. Fig 4.3 Heuristically reduced state space for tic-tac-toe. In reality, the number of states is smaller, as the board fills and reduces options. This crude bound of 25 states is an improvement of four orders of magnitude over 9!. Taibah University_CS Dept. CS 362_ Intelligent Systems Example (2) The traveling salesman problem Slide 11 The traveling salesman problem is too complex to be solved via exhaustive search for large values of N. The nearest neighbor heuristic work well most of the time, but with some arrangements of cities it does not find the shortest path. Consider the following two graphs. In the first, nearest neighbor will find the shortest path. In the second it does not: Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Slide 12 The simplest way to implement heuristic search is through a procedure called hill-climbing (Pearl 1984). Hill climbing (local search) is an iterative improvement algorithm that starts with an initial state to a problem, then attempts to find the best node for the current node to follow. And so on until goal node is found. Hill-climbing strategies expand the current state of the search and evaluate its children. The “best” child is selected for further expansion; neither its siblings nor its parent are retained. In hill climbing the basic idea is to always head towards a state which is better than the current one. So, if you are at town A and you can get to town B and town C (and your target is town D) then you should make a move IF town B or C appear nearer to town D than town A does. Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Slide 13 Note that: Backtracking is not permitted. At each step in the search, a single node is chosen to follow. The criterion for the node to follow is that it’s the best state for the current state. Hill climbing do not care about path from initial state to goal In steepest ascent hill climbing you will always make your next state the best successor of your current state, and will only make a move if that successor is better than your current state. This can be described as follows: 1. Start with current-state = initial-state. 2. Until current-state = goal-state OR there is no change in current-state do: A. Get the successors of the current state and use the evaluation function to assign a score to each successor. B. 2. If one of the successors has a better score than the current-state then set the new currentstate to be the successor with the best score. Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Slide 14 Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Slide 15 Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Example Slide 16 Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Slide 17 Terminates when a peak is reached. Does not look ahead of the immediate neighbors of the current state. Chooses randomly among the set of best successors, if there is more than one. Doesn’t backtrack, since it doesn’t remember where it’s been The problem with hill climbing is that the best node to enumerate locally may not be the best node globally. For this reason, hill climbing can lead to local optimums, but not necessarily the global optimum (the best solution available). Hill-climbing can get stuck without finding optimum Terminates when it reaches a peak (where no neighboring state has a higher value) Does not look beyond immediate neighbors of current state Like climbing Everest in thick fog with amnesia Taibah University_CS Dept. CS 362_ Intelligent Systems Hill Climbing Example Slide 18 Taibah University_CS Dept. CS 362_ Intelligent Systems Problem of HILL-CLIMBING SEARCH Slide 19 Taibah University_CS Dept. CS 362_ Intelligent Systems The Best-First Search Algorithm (BEST-FS) Slide 20 Best First Search (or Greedy Best First Search ) is a combination of depth first and breadth first searches. • • Depth first is good because a solution can be found without computing all nodes and Breadth first is good because it does not get trapped in dead ends. The best first search allows us to switch between paths thus gaining the benefit of both approaches. Best First Search described as follows: • • • At each step the most promising node is chosen. If one of the nodes chosen generates nodes that are less promising it is possible to choose another at the same level and in effect the search changes from depth to breadth. If on analysis these are no better then this previously unexpanded node and branch is not forgotten and the search method reverts to the descendants of the first choice and proceeds, backtracking as it were. Greedy best-first search 21 Greedy Best-first search 22 Greedy Best-first search 23 Taibah University_CS Dept. CS 362_ Intelligent Systems The Best-First Search Algorithm(BEST-FS) Slide 24 Best First Search Algorithm: 1- Start with OPEN holding the initial state 2- Pick the best node on OPEN 3- Generate its successors 4- For each successor Do If it has not been generated before evaluate it add it to OPEN and record its parent If it has been generated before change the parent if this new path is better and in that case update the cost of getting to any successor nodes 5- If a goal is found or no more nodes left in OPEN, quit, else return to 2. Taibah University_CS Dept. CS 362_ Intelligent Systems The Best-First Search Algorithm(BEST-FS) Slide 25 The best first search algorithm will involve an OR graph which avoids the problem of node duplication and assumes that each node has a parent link to give the best node from which it came and a link to all its successors. In this way if a better node is found this path can be propagated down to the successors. This method of using an OR graph requires 2 lists of nodes : • • OPEN is a priority queue of nodes that have been evaluated by the heuristic function but which have not yet been expanded into successors. The most promising nodes are at the front. CLOSED are nodes that have already been generated and these nodes must be stored because a graph is being used in preference to a tree. Taibah University_CS Dept. CS 362_ Intelligent Systems BEST-FS – Example 1 Slide 26 If the root is A and the goal is P A trace of the execution of best_first_search for Figure 4.10 1.open = [A5]; closed = [ ] 2.evaluate A5; open = [B4,C4,D6]; closed = [A5] 3.evaluate B4; open = [C4,E5,F5,D6]; closed = [B4,A5] 4.evaluate C4; open = [H3,G4,E5,F5,D6]; closed = [C4,B4,A5] 5.evaluate H3; open = [O2,P3,G4,E5,F5,D6]; closed = [H3,C4,B4,A5] 6.evaluate O2; open = [P3,G4,E5,F5,D6]; closed = [O2,H3,C4,B4,A5] 7.evaluate P3; the solution is found! Fig 4.10 Heuristic search of a hypothetical state space. Taibah University_CS Dept. CS 362_ Intelligent Systems BEST-FS – Example 1 Slide 27 open and closed states at level 3 Fig 4.11 Heuristic search of a hypothetical state space with open and closed states highlighted. Taibah University_CS Dept. CS 362_ Intelligent Systems BEST-FS – Example 2 Slide 28 The start state, first moves, and goal state for an example-8 puzzle. The simplest heuristic counts the tiles out of place in each state when compared with the goal. This heuristic does not use all of the information available in a board configuration, because it does not take into account the distance the tiles must be moved. A “better” heuristic would sum all the distances by which the tiles are out of place, one for each square a tile must be moved to reach its position in the goal state. Taibah University_CS Dept. CS 362_ Intelligent Systems BEST-FS – Example 2 Slide 29 Three heuristics applied to states in the 8-puzzle. tiles 2,8,1,6,7 tiles 2,8,1 tiles 2,8,1,6,5 Sum=1+2+1+1+1 Sum=1+2+1 Taibah University_CS Dept. CS 362_ Intelligent Systems The A* Algorithm Slide 30 If two states have the same or nearly the same heuristic evaluations, it is generally preferable to examine the state that is nearest to the root state of the graph. This state will have a greater probability of being on the shortest path to the goal. The distance from the starting state to its descendants can be measured by maintaining a depth count for each state. This count is 0 for the beginning state and is incremented by 1 for each level of the search. This makes our evaluation function, f , the sum of two components: f (n) = g(n) + h(n) Where • • g(n) measures the actual length of the path from any state n to the start state and, h(n) is a heuristic estimate of the distance from state n to a goal. Taibah University_CS Dept. CS 362_ Intelligent Systems A* Algorithm – Example Slide 31 The heuristic f This is A* example applied to states in the 8-puzzle. Taibah University_CS Dept. CS 362_ Intelligent Systems A* Algorithm – Example Slide 32 The full best-first search of the 8-puzzle graph. Each state is labeled with a letter and its heuristic weight, f(n) = g(n) + h(n) The successive stages of open and closed that generate this graph are: Taibah University_CS Dept. CS 362_ Intelligent Systems A* Algorithm – Example Slide 33 Open and closed as they appear after the 3rd iteration of heuristic search Taibah University_CS Dept. CS 362_ Intelligent Systems The A* Search Algorithm Summarization Slide 34 To summarize: Operations on states generate children of the state currently under examination. Each new state is checked to see whether it has occurred before (is on either open or closed), there by preventing loops. Each state n is given an f value equal to the sum of its depth in the search space g(n) and a heuristic estimate of its distance to a goal h(n). The h value guides search toward heuristically promising states while the g value can prevent search from persisting indefinitely on a fruitless path. States on open are sorted by their f values. By keeping all states on open until they are examined or a goal is found, the algorithm recovers from dead ends. As an implementation point, the algorithm’s efficiency can be improved through maintenance of the open and closed lists, perhaps as heaps or leftist trees. Taibah University_CS Dept. CS 362_ Intelligent Systems BEST-FS Properties Slide 35 How does best-first search rate according to the four criteria of performance? Complete: It is complete It always reaches goal. Optimal: It is not optimal A solution can be found in a longer path (higher g(n) with a lower h(n) value) Time: It generates O(b Space: m) nodes. Space complexity is O(bm) Keeps every node in memory. [here b is maximum branching factor; m is maximum path length] Taibah University_CS Dept. CS 362_ Intelligent Systems Admissibility, Monotonicity , and Informedness Slide 36 We may evaluate the behavior of heuristics along a number of dimensions. We may desire a solution and also require the algorithm to find the shortest path to the goal. For instance, this could be important when an application might have an excessive cost for extra solution steps, such as planning a path for an autonomous robot through a dangerous environment. Heuristics that find the shortest path to a goal whenever it exists are said to be admissible. Taibah University_CS Dept. CS 362_ Intelligent Systems Admissibility, Monotonicity , and Informedness Slide 37 Conditions for optimality: Admissibility The heuristic is defined as admissible if it accurately estimates the path cost to the goal, or underestimates it (remains optimistic). This means that: g(n) (path cost from the initial node to the current node) monotonically increases, While h(n) (path cost from the current node to the goal node) monotonically decreases. Taibah University_CS Dept. CS 362_ Intelligent Systems Admissibility Measures Slide 38 A search algorithm is Admissible if it is guaranteed to find a minimal path to a solution whenever such a path exists. • • Breadth-first search is an admissible search strategy. Because it looks at every state at level n of the graph before considering any state at the level n+1, any goal nodes are found along the shortest possible path. Unfortunately, breadth-first search is often too inefficient for practical use. Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm Slide 39 Best-First Search Algorithm A is called as A* algorithm Using the evaluation function f(n) = g(n) + h(n) If n is a node in the state space graph, • g(n) measures the depth at which that state has been found in the graph, and • h(n) is the heuristic estimate of the distance from n to a goal. In this sense f(n) estimates the total cost of the path from the start state through n to the goal state. A* search 40 root g(n): cost of path n h*(n): true minimum cost to goal h(n): Heuristic (expected) minimum cost to goal. (estimation) Goal Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm Slide 41 A* algorithm: 1- Start with OPEN holding the initial state 2- Pick the best node on OPEN 3- Generate its successors 4- For each successor Do If it has not been generated before evaluate it add it to OPEN and record its parent If it has been generated before change the parent if this new path is better and in that case update the cost of getting to any successor nodes 5- If a goal is found or no more nodes left in OPEN, quit, else return to 2. Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm Slide 42 We may characterize a class of Admissible Heuristic search strategies. we define an evaluation function f*: f*(n) = g*(n) + h*(n) Where • • g*(n) is the cost of the shortest path from the start to node n and, h*(n) returns the actual cost of the shortest path from n to the goal. It follows that f*(n) is the actual cost of the optimal path from a start node to a goal node that passes through node n. Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm Slide 43 In algorithm A, g(n), the cost of the current path to state n , is a reasonable estimate of g*, but they may not be equal: g(n)≥g*(n). These are equal only if the graph search has discovered the optimal path to state n. Similarly, we replace h*(n) with h(n), a heuristic estimate of the minimal cost to a goal state. Although we usually may not compute h*, it is often possible to determine whether or not the heuristic estimate, h(n), is bounded from above by h*(n), i.e., is always less than or equal to the actual cost of a minimal path. If algorithm A uses an evaluation function f in which called algorithm A*. h(n) ≤ h*(n), it is Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 44 Find goal “Bucharest” starting at “Arad”. Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm – Romanian Cities Example Slide 45 Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 46 f(Arad) = c(??,Arad)+h(Arad)=0+366=366 Expand Arad and determine f(n) for each node • f(Sibiu)=c(Arad,Sibiu)+h(Sibiu)=140+253=393 • f(Timisoara)=c(Arad,Timisoara)+h(Timisoara)=118+329=447 • f(Zerind)=c(Arad,Zerind)+h(Zerind)=75+374=449 Best choice is Sibiu Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 47 Expand Sibiu and determine f(n) for each node • f(Arad)=c(Sibiu,Arad)+h(Arad)=280+366=646 • f(Fagaras)=c(Sibiu,Fagaras)+h(Fagaras)=239+179=415 • f(Oradea)=c(Sibiu,Oradea)+h(Oradea)=291+380=671 • f(Rimnicu Vilcea)=c(Sibiu,Rimnicu Vilcea)+h(Rimnicu Vilcea)=220+192=413 Best choice is Rimnicu Vilcea Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 48 Expand Rimnicu Vilcea and determine f(n) for each node • f(Craiova)=c(Rimnicu Vilcea, Craiova)+h(Craiova)=360+160=526 • f(Pitesti)=c(Rimnicu Vilcea, Pitesti)+h(Pitesti)=317+100=417 • f(Sibiu)=c(Rimnicu Vilcea,Sibiu)+h(Sibiu)=300+253=553 Best choice is Fagaras Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 49 Expand Fagaras and determine f(n) for each node • f(Sibiu)=c(Fagaras, Sibiu)+h(Sibiu)=338+253=591 • f(Bucharest)=c(Fagaras,Bucharest)+h(Bucharest)=450+0=450 Best choice is Pitesti !!! Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 50 Expand Pitesti and determine f(n) for each node • f(Bucharest)=c(Pitesti,Bucharest)+h(Bucharest)=418+0=418 Best choice is Bucharest !!! • Optimal solution (only if h(n) is admissable) • Note values along optimal path !! A* search example 51 A* search example 52 A* search example 53 A* search example 54 A* search example 55 A* search example 56 Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm - Romania Example Slide 57 Taibah University_CS Dept. CS 362_ Intelligent Systems A* algorithm Properties Slide 58 How does A* search rate according to the four criteria of performance? Complete: It is complete As long as the memory supports the depth and branching factor of the tree. Optimal: It is optimal Depends on the use of an admissible heuristic • No algorithm with the same heuristic is guaranteed to expand Time: nodes O(bd) Must keep track of the nodes evaluated so far Space: O(bd) Must keep track of the discovered nodes to be evaluated fewer SUMMARY of SEARCH TECHNIQUES Search techniques Blind Depth first Search ( DFS ) Breadth first Search ( BFS ) Heuristic Hill climbing Search Best-First Search A* search Greedy Search Taibah University_CS Dept. CS 362_ Intelligent Systems Comparison of different Search Algorithms Slide 60 Taibah University_CS Dept. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 Unit 5 STOCHASTIC METHODS Taibah University_CS Dept. CS 362_ Intelligent Systems Outline Slide 2 5.0 5.1 5.2 5.3 5.4 Introduction The Elements of Counting Elements of Probability Theory Bayes’ Theorem Applications of the Stochastic Methodology Taibah University_CS Dept. CS 362_ Intelligent Systems What is Uncertainty? Slide 3 Uncertainty is essentially lack of information to formulate a decision. Uncertainty may result in making poor or bad decisions. As living creatures, we are accustomed to dealing with uncertainty – that’s how we survive. Dealing with uncertainty requires reasoning under uncertainty along with possessing a lot of common sense. Taibah University_CS Dept. CS 362_ Intelligent Systems Theories to Deal with Uncertainty Slide 4 Bayesian Probability Hartley Theory (1928) Shannon Theory (1948) Dempster-Shafer Theory (1976) Markov Models Zadeh’s Fuzzy Theory(1965) Taibah University_CS Dept. CS 362_ Intelligent Systems Dealing with Uncertainty Slide 5 Deductive reasoning – deals with exact facts and exact conclusions Inductive reasoning – not as strong as deductive – premises support the conclusion but do not guarantee it. There are a number of methods to pick the best solution in light of uncertainty. When dealing with uncertainty, we may have to settle for just a good solution. Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction Slide 6 One important application domain for the use of the stochastic methodology is diagnostic reasoning where cause/effect relationships are not always captured in a purely deterministic fashion, as is often possible in the knowledge-based approaches to problem solving that we saw in Chapters 2, 3, 4, and will see again in Chapter 8. A diagnostic situation usually presents evidence, such as fever or headache, without further causative justification. In fact, the evidence can often be indicative of several different causes, e.g., fever can be caused by either flu or an infection. In these situations probabilistic information can often indicate and prioritize possible explanations for the evidence. Another interesting application for the stochastic methodology is gambling, where supposedly random events such as the roll of dice, the dealing of shuffled cards, or the spin of a roulette wheel produce possible player payoff. Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction Slide 7 In fact, in the 18th century, the attempt to provide a mathematical foundation for gambling was an important motivation for Pascal (and later Laplace) to develop a probabilistic calculus. We next describe several problem areas, among many, where the stochastic methodology is often used in the computational implementation of intelligence; these areas will be major topics in later chapters. Taibah University_CS Dept. CS 362_ Intelligent Systems Some Applications for Stochastic Methods Slide 8 We describe several problem areas, where the stochastic methodology is often used in the computational implementation of intelligence. Diagnostic Reasoning. In medical diagnosis, for example, there is not always an obvious cause/effect relationship between the set of symptoms presented by the patient and the causes of these symptoms. In fact, the same sets of symptoms often suggest multiple possible causes. Natural language understanding. If a computer is to understand and use a human language, that computer must be able to characterize how humans themselves use that language. Words, expressions, and metaphors are learned, but also change and evolve as they are used over time. Taibah University_CS Dept. CS 362_ Intelligent Systems Some Applications for Stochastic Methods Slide 9 Planning and scheduling. When an agent forms a plan, for example, a vacation trip by automobile, it is often the case that no deterministic sequence of operations is guaranteed to succeed. What happens if the car breaks down, if the car ferry is cancelled on a specific day, if a hotel is fully booked, even though a reservation was made? Learning. The three previous areas mentioned for stochastic technology can also be seen as domains for automated learning. An important component of many stochastic systems is that they have the ability to sample situations and learn over time. The stochastic methodology has its foundation in the properties of counting. Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Slide 10 The foundation for the stochastic methodology is the ability to count the elements of an application domain. The basis for collecting and counting elements is, set theory, in which we must be able to determine whether an element is or is not a member of a set of elements. Once this is determined, there are methodologies for counting elements of sets, of the complement of a set, and the union and intersection of multiple sets. Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Slide 11 If we have a set A, the number of the elements in set A is denoted by |A|, called the cardinality of A. Of course, A may be Empty (the number of elements is zero), Finite , Countably infinite , or uncountably infinite. Each set is defined in terms of a domain of interest or universe , U, of elements that might be in that set Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Example Slide 12 Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting The Addition and Multiplication Rules Slide 13 The addition rule for combining two sets. For any two sets A and C, the number of elements in the union of these sets is: A similar addition rule holds for three sets A,B, and C, This Addition rule may be generalized to any finite number of sets Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting The Addition and Multiplication Rules Slide 14 The Cartesian Product of two sets A and B, is the set of all ordered pairs (a, b) where a is an element of set A and b is an element of set B; or more formally: and by the multiplication principle of counting for two sets The Cartesian product can, of course, be defined across any number of sets. Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations and Combinations Slide 15 A permutation of a set of elements is an arranged sequence of the elements of that set. The permutations of a set of n elements taken r at a time The combinations of a set of n elements taken r at a time Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations and Combinations Slide 16 A permutation of a set of elements is an arranged sequence of the elements of that set. The permutations of a set of n elements taken r at a time (0 ≤ r ≤n) we can represent this equation as: equivalently, the number of permutations of n elements taken r at a time, which is symbolized as nPr , is: Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations and Combinations Slide 17 The combination of a set of n elements is any subset of these elements that can be formed. The combinations of a set of n elements taken r at a time (0 ≤ r ≤ n ) Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations and Combinations Slide 18 Permutations: there are basically two types of permutation: (remember the order does matter now): Repetition is Allowed: such as the lock example. It could be "333". No Repetition: for example the first three people in a running race. You can't be first and second. Combinations: there are also two types of combinations (remember the order does not matter now): Repetition is Allowed: such as coins in your pocket (5,5,5,10,10) No Repetition: such as lottery numbers (2,14,15,27,30,33) Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations without Repetition Example Slide 19 For example, what order could 16 pool balls be in? After choosing, say, number "14" we can't choose it again. So, our first choice would have 16 possibilities, and our next choice would then have 15 possibilities, then 14, 13, etc. And the total permutations would be: 16 × 15 × 14 × 13 × ... = 20,922,789,888,000 But maybe we don't want to choose them all, just 3 of them, so that would be only: 16 × 15 × 14 = 3,360 In other words, there are 3,360 different ways that 3 pool balls could be selected out of 16 balls. Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations without Repetition Example Slide 20 So, if we wanted to select all of the billiard balls the permutations would be: 16! = 20,922,789,888,000 But if we wanted to select just 3, then we have to stop the multiplying after 14. How do we do that? There is a neat trick ... we divide by 13! ... 16 × 15 × 14 × 13 × 12 ... 13 × 12 ... = 16 × 15 × 14 = 3,360 Do you see? 16! / 13! = 16 × 15 × 14 where n is the number of things to choose from, and The formula is written: we choose r of them (No repetition, order matters) Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations without Repetition Example Slide 21 Examples: Our "order of 3 out of 16 pool balls example" would be: 16! (16-3)! = 16! 13! 20,922,789,888,000 = 6,227,020,800 = 3,360 (which is just the same as: 16 × 15 × 14 = 3,360) How many ways can first and second place be awarded to 10 people? 10! (10-2)! = 10! 8! = 3,628,800 40,320 = 90 (which is just the same as: 10 × 9 = 90) Notation Instead of writing the whole formula, people use different notations such as these: Example: P(10,2) = 90 Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Permutations with Repetition Example Slide 22 Example: in the lock below, there are 10 numbers to choose from (0,1,...9) and we choose 3 of them: 10 × 10 × ... (3 times) = 103 = 1,000 permutations So, the formula is simply: nr where n is the number of things to choose from, and we choose r of them (Repetition allowed, order matters) Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Combinations without Repetition Example Slide 23 Going back to our pool ball example, let's say we just want to know which 3 pool balls were chosen, not the order. We already know that 3 out of 16 gave us 3,360 permutations. But many of those will be the same to us now, because we don't care what order! For example, let us say balls 1, 2 and 3 were chosen. These are the possibilities: Order does matter Order doesn't matter 123 132 213 231 312 321 123 So, the permutations will have 6 times as many possibilities. In fact there is an easy way to work out how many ways "1 2 3" could be placed in order, and we have already talked about it. The answer is: 3! = 3 × 2 × 1 = 6 Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Combinations without Repetition Example Slide 24 So we adjust our permutations formula to reduce it by how many ways the objects could be in order (because we aren't interested in their order any more): That formula is so important it is often just written in big parentheses like this: where n is the number of things to choose from, and we choose r of them (No repetition, order doesn't matter) It is often called "n choose r" (such as "16 choose 3") and is also known as the "Binomial Coefficient“ Notation As well as the "big parentheses", people also use these notations: Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Combinations without Repetition Example Slide 25 So, our pool ball example (now without order) is: 16! 3!(16-3)! = 16! 3!×13! 20,922,789,888,000 = = 560 6×6,227,020,800 Or we could do it this way: 16×15×14 3×2×1 = 3360 = 560 6 So remember, do the permutation, then reduce by a further "r!" In other words choosing 3 balls out of 16, or choosing 13 balls out of 16 have the same number of combinations. 16! 3!(16-3)! = 16! 13!(16-13)! = 16! 3!×13! = 560 Taibah University_CS Dept. CS 362_ Intelligent Systems The Elements of Counting Combinations with Repetition Example Slide 26 Actually, these are the hardest to explain. We would write it like this: where n is the number of things to choose from, and we choose r of them (Repetition allowed, order doesn't matter) Taibah University_CS Dept. CS 362_ Intelligent Systems Elements of Probability Theory Slide 27 The following definitions, the foundation for a theory for probability, were first formalized by the French mathematician Laplace (1816) in the early nineteenth century. D E F I N I T I O N S: The Sample Space, Probabilities, and Independence Taibah University_CS Dept. CS 362_ Intelligent Systems Elements of Probability Theory The Sample Space, Probabilities, and Independence Slide 28 DEFINITION ELEMENTARY EVENT An elementary or atomic event is a happening or occurrence that cannot be made up of other events. EVENT, E An event is a set of elementary events. SAMPLE SPACE, S The set of all possible outcomes of an event E is the sample space S or universe for that event. PROBABILITY, p The probability of an event E in a sample space S is the ratio of the number of elements in E to the total number of possible outcomes of the sample space S of E. Thus, p(E) = |E| / |S|. Taibah University_CS Dept. CS 362_ Intelligent Systems Example Slide 29 For example, what is the probability that a 7 or an 11 are the result of the roll of two fair dice? We first determine the sample space for this situation. Using the multiplication principle of counting, each die has 6 outcomes, so the total set of outcomes of the two dice is 36. The number of combinations of the two dice that can give a 7 is 1,6; 2,5; 3,4; 4,3; 5,2; and 6,1 - 6 altogether. The probability of rolling a 7 is thus 6/36 = 1/6. The number of combinations of the two dice that can give an 11 is 5,6; 6,5 - or 2, and the probability of rolling an 11 is 2/36 = 1/18. Using the additive property of distinct outcomes, there is 1/6 + 1/18 or 2/9 probability of rolling either a 7 or 11 with two fair dice. Taibah University_CS Dept. CS 362_ Intelligent Systems Explanation of Example Slide 30 In this 7/11 example, the two events are getting a 7 or getting an 11. The elementary events are the distinct results of rolling the two dice. Thus the event of a 7 is made up of the six atomic events (1,6), (2,5), (3,4), (4,3), (5,2), and (6,1). The full sample space is the union of all thirty-six possible atomic events, the set of all pairs that result from rolling the dice. As we see soon, because the events of getting a 7 and getting an 11 have no atomic events in common, they are independent, and the probability of their sum (union) is just the sum of their individual probabilities. Taibah University_CS Dept. CS 362_ Intelligent Systems Elements of Probability Theory The Sample Space, Probabilities, and Independence Slide 31 The probability of any event E from the sample space S is: The sum of the probabilities of all possible outcomes is 1 The probability of the compliment of an event is The probability of the contradictory or false outcome of an event Taibah University_CS Dept. CS 362_ Intelligent Systems Elements of Probability Theory The Sample Space, Probabilities, and Independence Slide 32 A final important relationship, the probability of the union of two sets of events. Namely that for any two sets A and B: From this relationship we can determine the probability of the union of any two sets taken from the sample space S: Taibah University_CS Dept. CS 362_ Intelligent Systems Elements of Probability Theory The Sample Space, Probabilities, and Independence Slide 33 Taibah University_CS Dept. CS 362_ Intelligent Systems Example Slide 34 Consider the situation where bit strings of length four are randomly generated. We want to know whether the event of the bit string containing an even number of 1s is independent of the event where the bit string ends with a 0. Using the multiplication principle, each bit having 2 values, there are a total of 24= 16 bit strings of length 4. Taibah University_CS Dept. CS 362_ Intelligent Systems Example Slide 35 There are 8 bit strings of length four that end with a 0: {1110, 1100, 1010, 1000,0010, 0100, 0110, 0000}. There are also 8 bit strings that have an even number of 1s: {1111, 1100, 1010, 1001, 0110, 0101, 0011, 0000}. The number of bit strings that have both an even number of 1s and end with a 0 is 4: {1100, 1010, 0110, 0000}. Now these two events are independent since Taibah University_CS Dept. CS 362_ Intelligent Systems The three Kolmogorov Axioms Slide 36 The three Kolmogorov Axioms: We note that other axiom systems supporting the foundations of probability theory are possible, for instance, as an extension to the propositional calculus . As an example of the set-based approach we have taken, the Russian mathematician Kolmogorov (1950) proposed a variant of the following axioms, equivalent to our definitions. From these three axioms Kolmogorov systematically constructed all of probability theory. Taibah University_CS Dept. CS 362_ Intelligent Systems Probabilistic Inference Example Slide 37 For this example we assume we have three true or false parameters (we will define this type parameter as a boolean random variable). First, there is whether or not the traffic - and you - are slowing down. This situation will be labeled S, with assignment of t or f. Second, there is the probability of whether or not there is an accident, A, with assignments t or f. Finally, the probability of whether or not there is road construction at the time, C; again either t or f. We can express these relationships for the interstate highway traffic, thanks to our car-based data download system, in Table 5.1. A Venn diagram representation of the probability distributions of Table 5.1; S is traffic slowdown, A is accident, C is construction. Taibah University_CS Dept. CS 362_ Intelligent Systems Probabilistic Inference Example Slide 38 Table 5.1 are interpreted, of course, just like the truth tables, except that the right hand column gives the probability of the situation on the left hand side happening. Thus, the third row of the table gives the probability of the traffic slowing down and there being an accident but with no construction as 0.16: Taibah University_CS Dept. CS 362_ Intelligent Systems Probabilistic Inference Example Slide 39 We can also work out the probability of any simple or complex set of events. For example, we can calculate the probability that there is traffic slowdown S. The value for slow traffic is 0.32, the sum of the first four lines of Table 5.1; that is, all the situations where S = t. This is sometimes called the unconditional or marginal probability of slow traffic, S. This process is called marginalization because all the probabilities other than slow traffic are summed out. That is, the distribution of a variable can be obtained by summing out all the other variables from the joint distribution containing that variable. Taibah University_CS Dept. CS 362_ Intelligent Systems Probabilistic Inference Example Slide 40 Taibah University_CS Dept. CS 362_ Intelligent Systems Random Variables Slide 41 Taibah University_CS Dept. CS 362_ Intelligent Systems Random Variables Example Slide 42 An example using a discrete random variable on the domain of Season, where the atomic events of Season are{spring, summer, fall, winter}, assigns 0.75, say, to the domain element Season = spring. In this situation we say p(Season = spring) = 0.75. An example of a boolean random variable in the same domain would be the mapping p(Season = spring) = true. Taibah University_CS Dept. CS 362_ Intelligent Systems Random Variables - EXPECTATION OF AN EVENT Slide 43 Example : Suppose that a fair roulette wheel has integers 0 through 36 equally spaced on the slots of the wheel. In the game each player places $1 on any number she chooses: if the wheel stops on the number chosen, she wins $35; otherwise she loses the dollar. The reward of a win is $35; the cost of a loss, $1. Since the probability of winning is 1/37, of losing, 36/37, the expected value for this event, ex(E), is: ex(E) = 35 (1/37) + (-1) (36/37)≈ -0.027 Thus the player loses, on average, about $0.03 per play! Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Slide 44 Prior probability and (Conditional or Posterior) probability Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Slide 45 Based on the previous definitions, The posterior probability of a person having disease d, from a set of diseases D, with symptom or evidence, s, from a set of symptoms S, is: A Venn diagram illustrating the calculations of P(d | s) as a function of p(s | d). Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability The chain rule Slide 46 Chain rule, an important technique used across many domains of stochastic reasoning, especially in natural language processing. We have just developed the equations for any two sets, A1 and A2: and now, the generalization to multiple sets Ai, called the chain rule: We make an inductive argument to prove the chain rule, consider the nth case: Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability The chain rule for two sets Slide 47 We apply the intersection of two sets of rules to get: And then reduce again, considering that: Until is reached, the base case, which we have already demonstrated. Taibah University_CS Dept. CS 362_ Intelligent Systems CONDITIONALLY INDEPENDENT EVENTS Slide 48 We redefine independent events (see Section 5.2.1) in the context of conditional probabilities, and then we define conditionally independent events, or the notion of how events can be independent of each other, given some third event. Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Bayes’ theorem Slide 49 The Reverend Thomas Bayes was a mathematician and a minister. His famous theorem was published in 1763, four years after his death. Bayes’ theorem relates cause and effect in such a way that by understanding the effect we can learn the probability of its causes. As a result Bayes’ theorem is important both for determining the causes of diseases, such as cancer, as well as useful for determining the effects of some particular medication on that disease. Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Bayes’ theorem Slide 50 We revisit one of the results of Section 5.2.4, Bayes’ equation for one disease and one symptom. To help keep our diagnostic relationships in context, we rename the variables used previously to indicate individual hypotheses, hi, from a set of hypotheses, H, and a set of evidence, E. Furthermore, we will now consider the set of individual hypotheses hi as disjoint, and having the union of all hi to be equal to H. This equation may be read, “The probability of an hypothesis hi given a set of evidence E is . . .” Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Bayes’ theorem Slide 51 The general form of Bayes’ theorem where we assume the set of hypotheses H partition the evidence set E: Naïve Bayes, or the Bayes classifier, that uses the partition assumption, even when it is not justified: Taibah University_CS Dept. CS 362_ Intelligent Systems Conditional Probability Bayes’ theorem Example Slide 52 simple numerical example demonstrating Bayes’ theorem Suppose that you go out to purchase an automobile. The probability that you will go to dealer 1, d1, is 0.2. The probability of going to dealer 2, d2, is 0.4. There are only three dealers you are considering and the probability that you go to the third, d3, is also 0.4. At d1 the probability of purchasing a particular automobile, a1, is 0.2; at dealer d2 the probability of purchasing a1 is 0.4. Finally, at dealer d3, the probability of purchasing a1 is 0.3. Suppose you purchase automobile a1. What is the probability that you purchased it at dealer d2? First, we want to know, given that you purchased automobile a1, that you bought it from dealer d2, i.e., to determine p(d2|a1). We present Bayes’ theorem in variable form for determining p(d2|a1) and then with variables bound to the situation in the example. Taibah University_CS Dept. CS 362_ Intelligent Systems Applications of the Stochastic Methodology PROBABILISTIC FINITE STATE MACHINE Slide 53 We present two examples that use probability measures to reason about the interpretation of ambiguous information. First, we define an important modeling tool based on the finite state machine, the probabilistic finite state machine. Taibah University_CS Dept. CS 362_ Intelligent Systems Applications of the Stochastic Methodology PROBABILISTIC FINITE STATE MACHINE Slide 54 It can be seen that these two definitions are simple extensions to the finite state. The addition for non-determinism is that the next state function is no longer a function in the strict sense. That is, there is no unique range state for each input value and each state of the domain. Rather, for any state, the next state function is a probability distribution over all possible next states. Taibah University_CS Dept. CS 362_ Intelligent Systems Applications of the Stochastic Methodology PROBABILISTIC FINITE STATE MACHINE Example Slide 55 Example : How Is “Tomato” Pronounced? Figure 5.3 presents a probabilistic finite state acceptor that represents different pronunciations of the word “tomato”. A particular acceptable pronunciation for the word tomato is characterized by a path from the start state to the accept state. The decimal values that label the arcs of the graph represent the probability that the speaker will make that particular transition in the state machine. For example, 60% of all speakers in this data set go directly from the t phone to the m without producing any vowel phone between. You say [ to ow m ey tow ] and I say [ t ow m aa t ow ]… Figure 5.3 A probabilistic finite state acceptor for the pronunciation of tomato , adapted from Jurafsky and Martin (2009). Taibah University_CS Dept. CS 362_ Intelligent Systems Applications of the Stochastic Methodology PROBABILISTIC FINITE STATE MACHINE Example Slide 56 Besides characterizing the various ways people in the pronunciation database speak the word “tomato”, this model can be used to help interpret ambiguous collections of phonemes. This is done by seeing how well the phonemes match possible paths through the state machines of related words. Further, given a partially formed word, the machine can try to determine the probabilistic path that best completes that word. Taibah University_CS Dept. CS 362_ Intelligent Systems Extending the Road/Traffic Example Slide 57 Figure 5.4 presents a Bayesian account of what we have just seen. road construction is correlated with orange barrels and bad traffic. Similarly, accident correlates with flashing lights and bad traffic. We examine Figure 5.4 and build a joint probability distribution for the road construction and bad traffic relationship. Fig 5.4 The Bayesian representation of the traffic problem with potential explanations. Taibah University_CS Dept. CS 362_ Intelligent Systems Extending the Road/Traffic Example Slide 58 We simplify both of these variables to be either true (t) or false (f) and represent the probability distribution in Table 5.4. Note that if construction is f there is not likely to be bad traffic and if it is t then bad traffic is likely. Note also that the probability of road construction on the interstate, C = true, is .5 and the probability of having bad traffic, T = true, is .4 (this is New Mexico!). Table 5.4 The joint probability distribution for the traffic and construction variables of Fig 5.3. Taibah University_CS Dept. CS 362_ Intelligent Systems Extending the Road/Traffic Example Slide 59 We next consider the change in the probability of road construction given the fact that we have bad traffic, or p(C|T) or p(C = t | T = t). p(C|T) = p(C = t , T = t) / (p(C = t , T = t) + p(C = f , T = t)) = .3 / (.3 + .1) = .75 So now, with the probability of road construction being .5, given that there actually is bad traffic, the probability for road construction goes up to .75. This probability will increase even further with the presence of orange barrels, explaining away the hypothesis of accident. Taibah University_CS Dept. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 Unit 6 Knowledge Representation Taibah University_CS Dept. CS 362_ Intelligent Systems Knowledge Representation Slide 2 Issues in Knowledge Representation AI Representational Systems Associationist Theories of Meaning Semantic Networks Standardization of Network Relationships Frames Conceptual Graphs: a Network Language Introduction to Conceptual Graphs Types, Individuals, and Names Generalization and Specialization Propositional Nodes Conceptual Graphs and Logic Taibah University_CS Dept. CS 362_ Intelligent Systems Review(i) The Pyramid of Knowledge Slide 3 The Pyramid of Knowledge Taibah University_CS Dept. CS 362_ Intelligent Systems Review(ii) Knowledge vs. Expert Systems Slide 4 Knowledge is of primary importance in expert systems. In fact, an analogy to classic expression Algorithms + Data Structures = Programs Knowledge + Inference = Expert Systems Knowledge representation is key to the success of expert systems. Expert systems are designed for knowledge representation based on rules of logic called inferences. Knowledge affects the development, efficiency, speed, and maintenance of the system. Taibah University_CS Dept. CS 362_ Intelligent Systems Review(iii) How is Knowledge Used? Slide 5 Knowledge has many meanings – data, facts, information. How do we use knowledge to reach conclusions or solve problems? Heuristics refers to using experience to solve problems – using precedents. Expert systems may have hundreds / thousands of microprecedents to refer to. Taibah University_CS Dept. CS 362_ Intelligent Systems Review(iv) Knowledge in Rule-Based Systems Slide 6 Knowledge: • Knowledge is part of a hierarchy. • Knowledge refers to rules that are activated by facts or other rules. • Activated rules produce new facts or conclusions. • Conclusions are the end-product of inferences when done according to formal rules. Metaknowledge: • Metaknowledge is knowledge about knowledge and expertise. • Most successful expert systems are restricted to as small a domain as possible. • In an expert system, an ontology is the metaknowledge that describes everything known about the problem domain. Wisdom : • Wisdom is the metaknowledge of determining the best goals of life and how to obtain them. Taibah University_CS Dept. CS 362_ Intelligent Systems Review(v) Representation and Intelligence Slide 7 The question of representation, or how to best capture critical aspects of intelligent activity for use on a computer. Three predominant approaches to representation taken by the AI research community over this time period. The first theme( weak method problem-solving) Articulated in the 1950s and 1960s by Newell and Simon in their work with the Logic Theorist , is known as weak method problem-solving. The second (Strong method problem solving ) Common throughout the 1970s and 1980s, and espoused by the early expert system designers, is strong method problem solving . In more recent years, (Agent-Based) Especially in the domains of robotics and the internet (Brooks 1987, 1989; Clark 1997), the emphasis is on distributed and embodied or agent-based approaches to intelligence. Taibah University_CS Dept. CS 362_ Intelligent Systems Issues in Knowledge Representation (i) Slide 8 The representation of information for use in intelligent problem solving offers important and difficult challenges that lie at the core of AI. Representation Schemes Scheme – data/knowledge structure Semantic network • Frames Conceptual graphs Stochastic methods Connectionist (neural networks) Implementation media Medium – implementation languages Prolog, Lisp, Scheme, even C and Java Taibah University_CS Dept. CS 362_ Intelligent Systems Issues in Knowledge Representation (ii) Slide 9 knowledge base is described as: (Bobrow 1975) • A mapping between the objects and relations in a problem domain • The computational objects and relations of a program The results of inferences in the knowledge base are assumed to correspond to the results of actions or observations in the world. The computational objects, relations, and inferences available to programmers are mediated by the knowledge representation language. Taibah University_CS Dept. CS 362_ Intelligent Systems Issues in Knowledge Representation (iii) Slide 10 There are general principles of knowledge organization that apply across a variety of domains and can be directly supported by a representation language. Example: class hierarchies are found in both scientific and commonsense classification systems. 1. How may we provide a general mechanism for representing them? 2. How may we represent definitions? 3. Exceptions? 4. When should an intelligent system make default assumptions about missing information and how can it adjust its reasoning should these assumptions prove wrong? 5. How may we best represent time? Causality? Uncertainty? Progress in building intelligent systems depends on discovering the principles of knowledge organization and supporting them through higher-level representational tools. Taibah University_CS Dept. CS 362_ Intelligent Systems Associationist Theories of Meaning(ii) Slide 11 There are many problems that arise in mapping commonsense reasoning into formal logic. For example, it is common to think of the operators ∨ and → as corresponding to the English “or” and “if ... then ...”. However, these operators in logic are concerned solely with truth values and ignore the fact that the English “if ... then ...” suggests specific relationship (often more coorelational than causal) between its premises and its conclusion. Taibah University_CS Dept. CS 362_ Intelligent Systems Associationist Theories of Meaning(iii) Slide 12 For example, the sentence “If a bird is a cardinal then it is red” (associating the bird cardinal with the color red) can be written in predicate calculus: ∀ X (cardinal(X) → red(X)). This may be changed, through a series of truth-preserving operations, Chapter 2, into the logically equivalent expression ∀ X (¬ red(X)→ ¬cardinal(X)). These two expressions have the same truth value; that is, the second is true if and only if the first is true. Taibah University_CS Dept. CS 362_ Intelligent Systems Associationist Theories of Meaning(vi) Slide 13 Associationist theories: 1. It defines the meaning of an object in terms of a network of associations with other objects. 2. For the associationist, when humans perceive an object, that perception is first mapped into a concept. 3. This concept is part of our entire knowledge of the world and is connected through appropriate relationships to other concepts. 4. These relationships form an understanding of the properties and behavior of objects such as snow. Taibah University_CS Dept. CS 362_ Intelligent Systems Semantics of Calculus Slide 14 Predicate calculus representation • Formal representation languages • Sound and complete inference rules • Truth-preserving operations Meaning(line of reasoning) – Semantics Logical implication is a relationship between truth values: pq Associationist theory Attach semantics to logical symbols and operators Taibah University_CS Dept. CS 362_ Intelligent Systems Semantics of Calculus(2) Slide 15 For Example: “if a bird is cardinal then it is red” (associating the bird cardinal with the color red) can be written in predicate calculus: ∀X((cardinal(X) →red(X)) ∀X(┐red(X) → ┐cardinal(X)) These two expressions have the same truth value; that is, the second is true if and only if the first is true. 15 Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks(i) Slide 16 Using Graphs, by providing a means of explicitly representing relations using arcs and nodes, have proved formalizing associationist theories of knowledge. A semantic network represents knowledge as a graph, with the nodes corresponding to facts or concepts and the arcs to relations or associations between concepts. Both nodes and links are generally labeled. The term “semantic network” encompasses a family of graphbased representations. These differ chiefly in the names that are allowed for nodes and links and the inferences that may be performed. However, a common set of assumptions and concerns is shared by all network representation languages Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks(ii) Slide 17 Semantic network developed by Collins and Quillianin, their research on human information storage and response times (Harmon and King 1985). • classic representation technique for propositional information • Propositions – a form of declarative knowledge, stating facts (true/false) • Propositions are called “atoms” – cannot be further subdivided. Semantic nets consist of nodes (objects, concepts, situations) and arcs (relationships between them). Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks(iii) Slide 18 Definition • • • • Represent knowledge as a graph Nodes correspond to facts or concepts Arcs correspond to relations or associations between concepts Nodes and arcs are labeled Properties • • • • Labeled arcs and links Inference is to find a path between nodes Implement inheritance Variations – conceptual graphs Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks Example 1(i) Slide 19 Figure 7.1 Semantic network developed by Collins and Quillian in their research on human information storage and response times (Harmon and King 1985). Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks Example 1(ii) Slide 20 Different inferences with given questions. The structure of this hierarchy was derived from laboratory testing of human subjects. The subjects were asked questions about different properties of birds, such as, “Is a canary a bird?” or “Can a canary sing?” or “Can a canary fly?”. Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks Example 1(iii) Slide 21 reaction-time studies indicated that it took longer for subjects to answer “Can a canary fly?” than it did to answer “Can a canary sing?” Collins and Quillian explain this difference in response time by arguing that people store information at its most abstract level. Instead of trying to recall that canaries fly, and robins fly, and swallows fly, all stored with the individual bird, humans remember that canaries are birds and that birds have (usually) the property of flying. Even more general properties such as eating, breathing, and moving are stored at the “animal” level, and so trying to recall whether a canary can breathe should take longer than recalling whether a canary can fly. This is, of course, because the human must travel further up the hierarchy of memory structures to get the answer. Taibah University_CS Dept. CS 362_ Intelligent Systems Semantic Networks Example 1 (iv) Slide 22 The fastest recall was for the traits specific to the bird, say, that it can sing or is yellow. Exception handling also seemed to be done at the most specific level. When subjects were asked whether an ostrich could fly, the answer was produced faster than when they were asked whether an ostrich could breathe. Thus the hierarchy Ostrich → bird → animal seems not to be traversed to get the exception information: it is stored directly with ostrich. This knowledge organization has been formalized in inheritance systems. Taibah University_CS Dept. CS 362_ Intelligent Systems Inheritance systems Slide 23 Inheritance systems allow us to store information at the highest level of abstraction, which reduces the size of knowledge bases and helps prevent update inconsistencies. Example: • If we are building a knowledge base about birds, we can define the traits common to all birds, such as flying or having feathers, for the general class bird. • Allow a particular species of bird to inherit these properties. • This reduces the size of the knowledge base by requiring us to define these essential traits only once, rather than requiring their assertion for every individual. Inheritance also helps us to maintain the consistency of the knowledge base when adding new classes and individuals. Example: • Assume that we are adding the species robin to an existing knowledge base. • When we assert that robin is a subclass of songbird; robin inherits all of the common properties of both songbirds and birds. • It is not up to the programmer to remember (or forget!) to add this information. Taibah University_CS Dept. CS 362_ Intelligent Systems Inferences in Semantic Networks Slide 24 The program used this knowledge base to find relationships between pairs of English words. Given two words, it would search the graphs outward from each word in a breadth-first fashion, Searching for a common concept or intersection node. The paths to this node represented a relationship between the word concepts. Taibah University_CS Dept. CS 362_ Intelligent Systems Inferences in Semantic Networks Example Slide 25 Find the relationship (intersection path) between “cry” and “comfort” Using this intersection path, the program was able to conclude: cry 2 is among other things to make a sad sound. To comfort 3 can be to make 2 something less sad Taibah University_CS Dept. CS 362_ Intelligent Systems Standardization of Network Relationships(i) Slide 26 Much of the work in network representations that followed Quillian’s focused on defining a richer set of link labels (relationships) that would more fully model the semantics of natural language. By implementing the fundamental semantic relationships of natural language as part of the formalism, rather than as part of the domain knowledge added by the system builder, knowledge bases require less handcrafting and achieve greater generality and consistency. Simmons (1973) addressed this need for standard relationships by focusing on the case structure of English verbs. In this verb-oriented approach, based on earlier work by Fillmore (1968), links define the roles played by nouns and noun phrases in the action of the sentence. Case relationships include agent, object, instrument, location, and time. Taibah University_CS Dept. CS 362_ Intelligent Systems Standardization of Network Relationships(ii) Slide 27 Sentence is represented as a verb node, with various case links to nodes representing other participants in the action. This structure is called a case frame. In parsing a sentence, the program finds the verb and retrieves the case frame for that verb from its knowledge base. It then binds the values of the agent, object, etc., to the appropriate nodes in the case frame. Example: Using this approach, the sentence “Sarah fixed the chair with glue” might be represented by the network in Figure 7.5. Taibah University_CS Dept. CS 362_ Intelligent Systems Standardization of Network Relationships(iii) Slide 28 Taibah University_CS Dept. CS 362_ Intelligent Systems Frames Slide 29 Frames are another representation technique that evolved from semantic networks (frames can be thought of as an implementation of semantic networks). But compared to semantic networks, frames are structured and follow a more object-oriented abstraction with greater structure Taibah University_CS Dept. CS 362_ Intelligent Systems Frames Slide 30 Similar to classes in Object-oriented Quotes from “A Framework for Representing Knowledge” [Minsky81] Proposed by Minsky in 1975 Here is the essence of the frame theory: When one encounters a new situation (or makes a substantial change in one’s view of a problem) one selects from memory a structure called a “frame”. This is a remembered framework to be adapted to fit reality by changing details as necessary. Taibah University_CS Dept. CS 362_ Intelligent Systems Frames – What and Why? Slide 31 [Minsky, 1981]: A Frame is a collection of questions to be asked about a hypothetical situation: it specifies issues to be raised and methods to be used in dealing with them. To understand a situation, questions like: What caused it (agent)? What was the purpose (intention)? What are the consequences (effects)? Whom does it affect (recipient)? How is it done (instruments) Taibah University_CS Dept. CS 362_ Intelligent Systems Frame Example (Hotel Room) (i) Slide 32 The hotel room and its components can be described by a number of individual frames. In addition to the bed, A frame could represent a chair: expected height is 20 to 40 cm, number of legs is 4, a default value, is designed for sitting. A further frame represents the hotel telephone: this is a specialization of a regular phone except that billing is through the room, there is a special hotel operator (default), and a person is able to use the hotel phone to get meals served in the room, make outside calls, and to receive other services. Taibah University_CS Dept. CS 362_ Intelligent Systems Frame Example (Hotel Room) (ii) Slide 33 Each individual frame may be seen as a data structure, similar in many respects to the traditional “record”, that contains information relevant to stereotyped entities. Taibah University_CS Dept. CS 362_ Intelligent Systems Frame Slots(i) Slide 34 A frame is a set of slots (similar to a set of fields in a class in OO) The slots in the frame contain information such as: 1. Frame identification information. 2. Relationship of this frame to other frames. The “hotel phone” might be a special instance of “phone,” which might be an instance of a “communication device.” 3. Descriptors of requirements for a frame. A chair, for instance, has its seat between 20 and 40 cm from the floor, its back higher than 60 cm, etc. These requirements may be used to determine when new objects fit the stereotype defined by the frame. Taibah University_CS Dept. CS 362_ Intelligent Systems Frame Slots(ii) Slide 35 4. Procedural information on use of the structure described. An important feature of frames is the ability to attach procedural instructions to a slot. 5. Frame default information. These are slot values that are taken to be true when no evidence to the contrary has been found. For instance, chairs have four legs, telephones are pushbutton, or hotel beds are made by the staff. 6. New instance information. Many frame slots may be left unspecified until given a value for a particular instance or when they are needed for some aspect of problem solving. For example, the color of the bedspread may be left unspecified. Taibah University_CS Dept. CS 362_ Intelligent Systems Frames -example Slide 36 In the following ‘Frame’ example, what piece of knowledge can be inferred? A.All birds with wings fly B.Some birds have two wings C.Some birds with wings do not fly D.All birds are brown or dark in color Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs: a Network Language Slide 37 Graphs Introduction to Conceptual Graphs Types, Individuals, and Names The Type Hierarchy Generalization and Specialization Propositional Nodes Conceptual Graphs and Logic Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs: a Network Language Slide 38 Following on the early research work in AI that developed representational schemes a number of network languages were created to model the semantics of natural language and other domains. In this section, we examine a particular formalism in detail, to show how, in this situation, the problems of representing meaning were addressed. John Sowa’s conceptual graphs (Sowa 1984) is an example of a network representation language. We define the rules for forming and manipulating conceptual graphs and the conventions for representing classes, individuals, and relationships. Taibah University_CS Dept. CS 362_ Intelligent Systems Graphs Review Slide 39 Graphs are sometimes called a network or net. A graph can have zero or more links between nodes – there is no distinction between parent and child. Sometimes links have weights – weighted graph; or, arrows – directed graph. Simple graphs have no loops – links that come back onto the node itself. A circuit (cycle) is a path through the graph beginning and ending with the same node. Acyclic graphs have no cycles. Connected graphs have links to all the nodes. Digraphs are graphs with directed links. Lattice is a directed acyclic graph. Taibah University_CS Dept. CS 362_ Intelligent Systems Simple Graphs Slide 40 Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction to Conceptual Graphs(i) Slide 41 Conceptual graph A finite, connected, bipartite graph No arc labels , (why ? ). Conceptual graphs do not use labeled arcs; instead the conceptual relation nodes represent relations between concepts. Because conceptual graphs are bipartite, concepts only have arcs to relations, and vice versa. Nodes 1. concept nodes – box nodes Concrete concepts: cat, telephone, classroom, restaurant Abstract objects: love, beauty, loyalty 2. conceptual relation nodes – ellipse nodes Relations involving one or more concepts Arity – number of box nodes linked to Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction to Conceptual Graphs(ii) Slide 42 Nodes 1. Concept nodes – box nodes Concrete concepts: cat, telephone, classroom, restaurant (are characterized by our ability to form an image of them in our minds) Abstract objects: love, beauty, loyalty (that do not correspond to images in our minds.) 2. Conceptual relation nodes – ellipse nodes Relations involving one or more concepts Relation Arity are the number of box nodes linked to Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction to Conceptual Graphs(iii) Slide 43 Conceptual relation nodes indicate a relation involving one or more concepts. One advantage of formulating conceptual graphs as bipartite graphs rather than using labeled arcs is that it simplifies the representation of relations of any arity. A relation of arity n is represented by a conceptual relation node having n arcs. (as shown in Figure 7.14.). Each conceptual graph represents a single proposition. A typical knowledge base will contain a number of such graphs. Graphs may be arbitrarily complex but must be finite. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs Example(i) Slide 44 For example, one graph in Figure 7.14 represents the proposition “A dog has a color of brown”. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs Example(ii) Slide 45 For example, one graph in Figure 7.15 is a graph of somewhat greater complexity that represents the sentence “Mary gave John the book”. This graph uses conceptual relations to represent the cases of the verb “to give” and indicates the way in which conceptual graphs are used to model the semantics of natural language. Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names(i) Slide 46 In conceptual graphs, every concept is a unique individual of a particular type. Each concept box is labeled with a type label, which indicates the class or type of individual represented by that node. For example: A node labeled dog represents some individual of that type. Types are organized into a hierarchy. The type dog is a subtype of carnivore, which is a subtype of mammal, etc. Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names(ii) Slide 47 Boxes with the same type label represent concepts of the same type; however, these boxes may or may not represent the same individual concept Each concept box is labeled with the names of the type and the individual. The type and individual labels are separated by a colon, “:”. Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names(iii) Slide 48 Type A class, a concept Types are organized into hierarchy Individual -- Concrete entity Name – Identifier of type and individual Conceptual Graph Concept box with type label indicating the class or type of individual represented by a node Label consists of ( type, :, and individual) boy: Ali Unnamed individual labeled as marker: #<number> Marker can separate an individual from name boy: #12354 Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names Example 1 Slide 49 Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names Example 2 Slide 50 For example: “Her name was McGill, and she called herself Lil, but everyone knew her as Nancy” Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names(iv) Slide 51 As an alternative to indicating an individual by its marker or name, we can also use the generic marker * to indicate an unspecified individual. This is often omitted from concept labels; a node given just a type label, dog, is equivalent to a node labeled dog:*. In addition to the generic marker, conceptual graphs allow the use of named variables. These are represented by an asterisk followed by the variable name (e.g.,*X or *foo). This is useful if two separate nodes are to indicate the same, but unspecified, individual. Taibah University_CS Dept. CS 362_ Intelligent Systems Types, Individuals, and Names Example 3 Slide 52 For example: “The dog scratches its ear with its paw”. Although we do not know which dog is scratching its ear, the variable *X indicates that the paw and the ear belong to the same dog that is doing the scratching. Taibah University_CS Dept. CS 362_ Intelligent Systems Generalization and Specialization(i) Slide 53 The theory of conceptual graphs includes a number of operations that create new graphs from existing graphs. These allow for the generation of a new graph by either specializing or generalizing an existing graph, operations that are important for representing the semantics of natural language. The four operations are : 1. 2. 3. 4. Copy Restrict Join Simplify. Taibah University_CS Dept. CS 362_ Intelligent Systems Generalization and Specialization(ii) Slide 54 1. The copy rule allows us to form a new graph, g, that is the exact copy of g1. 2. Restrict allows concept nodes in a graph to be replaced by a node representing their specialization. There are two cases: 1. If a concept is labeled with a generic marker, the generic marker may be replaced by an individual marker. 2. A type label may be replaced by one of its subtypes, if this is consistent with the referent of the concept. In Figure 7.22 we can replace animal with dog. Taibah University_CS Dept. CS 362_ Intelligent Systems Restriction Example Slide 55 For example we can replace animal with dog. Taibah University_CS Dept. CS 362_ Intelligent Systems Generalization and Specialization(iii) Slide 56 The join rule lets us combine two graphs into a single graph. If there is a concept node c1 in the graph s1 that is identical to a concept node c2 in s2, then we can form a new graph by deleting c2 and linking all of the relations incident on c2 to c1. Join is a specialization rule, because the resulting graph is less general than either of its components. Taibah University_CS Dept. CS 362_ Intelligent Systems Join Example Slide 57 The Join of g1 and g3 Taibah University_CS Dept. CS 362_ Intelligent Systems Generalization and Specialization(iv) Slide 58 The simplify rule. If a graph contains two duplicate relations, then one of them may be deleted, along with all its arcs. This is the simplify rule. Duplicate relations often occur as the result of a join operation, as in graph g4. Taibah University_CS Dept. CS 362_ Intelligent Systems Propositional Nodes Slide 59 In addition to using graphs to define relations between objects in the world, we may also want to define relations between propositions. Consider, for example, the statement “Tom believes that Jane likes pizza”. “Believes” is a relation that takes a proposition as its argument. Conceptual graphs include a concept type, proposition, that takes a set of conceptual graphs as its referent and allows us to define relations involving propositions. Propositional concepts are indicated as a box that contains another conceptual graph. These proposition concepts may be used with appropriate relations to represent knowledge about propositions. Figure 7.24 shows the conceptual graph for the above assertion about Jane, Tom, and pizza. Taibah University_CS Dept. CS 362_ Intelligent Systems Propositional Nodes Example(i) Slide 60 Taibah University_CS Dept. CS 362_ Intelligent Systems Propositional Nodes Example(ii) Slide 61 The experiencer relation is loosely analogous to the agent relation in that it links a subject and a verb. The experiencer link is used with belief states based on the notion that they are something one experiences rather than does. Figure 7.24 shows how conceptual graphs with propositional nodes may be used to express the modal concepts of knowledge and belief. Modal logics are concerned with the various ways propositions are entertained: believed, asserted as possible, probably or necessarily true, intended as a result of an action, or counterfactual (Turner 1984). Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs and Logic Slide 62 Using conceptual graphs, we can easily represent conjunctive concepts such as “The dog is big and hungry”, but we have not established any way of representing negation or disjunction. Nor have we addressed the issue of variable quantification. We may implement negation using propositional concepts and a unary operation called neg. neg takes as argument a proposition concept and asserts that concept as false. The conceptual graph of Figure 7.25 uses neg to represent the statement “There are no pink dogs”. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Graphs and Logic Example Slide 63 Conceptual graph of Figure 7.25 uses neg to represent the statement “There are no pink dogs”. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual graphs and predicate calculus(i) Slide 64 Conceptual graphs are equivalent to predicate calculus in their expressive power. As these examples suggest, there is a straightforward mapping from conceptual graphs into predicate calculus notation. The algorithm, taken from Sowa (1984), for changing a conceptual graph, g, into a predicate calculus expression is: 1. Assign a unique variable, x1, x2, . . . , xn, to each of the n generic concepts in g. 2. Assign a unique constant to each individual concept in g. This constant may simply be the name or marker used to indicate the referent of the concept. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual graphs and predicate calculus(ii) Slide 65 3. Represent each concept node by a unary predicate with the same name as the type of that node and whose argument is the variable or constant given that node. 4. Represent each n-ary conceptual relation in g as an n-ary predicate whose name is the same as the relation. Let each argument of the predicate be the variable or constant assigned to the corresponding concept node linked to that relation. 5. Take the conjunction of all atomic sentences formed under 3 and 4. This is the body of the predicate calculus expressions. All the variables in the expression are existentially quantified. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual graphs and predicate calculus Example Slide 66 For example, the graph of Figure 7.16 below may be transformed into the predicate calculus expression ∃ X1 (dog(emma) ∧ color(emma,X1) ∧ brown(X1)) Taibah University_CS Dept. CS 362_ Intelligent Systems CS 362: Intelligent Systems Slide 1 Unit 7 STRONG METHOD PROBLEM SOLVING Taibah University_CS Dept. CS 362_ Intelligent Systems Chapter 7: STRONG METHOD PROBLEM SOLVING Slide 2 8.0 Introduction 8.1 Overview of Expert System Technology 8.1.1 The Design of Rule-Based Expert Systems 8.1.2 Selecting a Problem and the Knowledge Engineering Process 8.1.3 Conceptual Models and Their Role in Knowledge Acquisition Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction (1) Slide 3 The Important component of AI is knowledgeintensive or strong method problem solving. An expert system uses knowledge specific to a problem domain to provide “expert quality” performance in that application area. Expert systems are built to solve a wide range of problems in domains such as medicine, mathematics, engineering, chemistry, geology, computer science, business, law, defense, and education. Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction(2) Slide 4 Expert system emulates the human methodology and performance. As with skilled humans, expert systems: - Tend to be specialists. - Focusing on a narrow set of problems. - Are both theoretical and practical expert’s Taibah University_CS Dept. CS 362_ Intelligent Systems Introduction(3) Slide 5 Because of their heuristic, knowledge-intensive nature, expert systems generally: 1. Support inspection of their reasoning processes, both in presenting intermediate steps and in answering questions about the solution process. 2. Allow easy modification in adding and deleting skills from the knowledge base. 3. Reason heuristically, using (often imperfect) knowledge to get useful solutions Taibah University_CS Dept. CS 362_ Intelligent Systems Expert System Problem Categories (1) Waterman (1986) Slide 6 The following list from Waterman (1986), is a useful summary of general expert system problem categories: Interpretation: forming high-level conclusions from collections of raw data. Prediction: projecting probable consequences of given situations. Diagnosis: determining the cause of malfunctions in complex situations based on observable symptoms. Design: finding a configuration of system components that meets performance goals while satisfying a set of design constraints. Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(2) Waterman (1986) Slide 7 Planning: devising a sequence of actions that will achieve a set of goals given certain starting conditions and run-time constraints. Monitoring: comparing a system’s observed behavior to its expected behavior. Instruction: assisting in the education process in technical domains. Control: governing the behavior of a complex environment. Taibah University_CS Dept. CS 362_ Intelligent Systems Examples Slide 8 Diagnostic applications, servicing people. Play chess Make financial planning decisions Configure computers Monitor real time systems Underwrite insurance policies Perform many other services which previously required human expertise Example Rules for a Credit Application Slide 9 Mortgage application for a loan for $100,000 to $200,000 If there are no previous credits problems, and If month net income is greater than 4x monthly loan payment, and If down payment is 15% of total value of property, and If net income of borrower is > $25,000, and If employment is > 3 years at same company Then accept the applications Else check other credit rules Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(1) Slide 10 There are three major ES Components: Expert system interfaces employ a variety of user styles, including question-and-answer, menu-driven, or graphical interfaces. knowledge base Is the heart of the expert system, which contains the knowledge of a particular application domain. In a rule-based expert system this knowledge is most often represented in the form of if... then... rules, The knowledge base contains both general knowledge as well as case- specific information. Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(2) Slide 11 Inference engine -Applies the knowledge to the solution of actual problems. -It is essentially an interpreter for the knowledge base. -Performs the recognize-act control cycle. The procedures that implement the control cycle are separate from the production rules themselves The Design of Rule-Based Expert Systems(3) Slide 12 User Interface Inference Engine Knowledge Base Three Major ES Components The Design of Rule-Based Expert Systems(4) Slide 13 Architecture of a typical expert system for a particular problem domain. 13 Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(5) Slide 14 Case-specific data -The expert system must keep track of case-specific data: the facts, conclusions, and other information relevant to the case under consideration. -This includes the data given in a problem instance, partial conclusions, confidence measures of conclusions, and dead ends in the search process. -This information is separate from the general knowledge base. Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(6) Slide 15 Explanation subsystem -The explanation subsystem allows the program to explain its reasoning to the user. These explanations include: 1. Justifications for the system’s conclusions, In response to how queries, 2. Explanations of why the system needs a particular piece of data, 3. Why queries, and, where useful, 4. Tutorial explanations or deeper theoretical justifications of the program’s actions. Taibah University_CS Dept. CS 362_ Intelligent Systems The Design of Rule-Based Expert Systems(7) Slide 16 knowledge-base editor -Knowledge-base editors help the programmer locate and correct bugs in the program’s performance, often accessing the information provided by the explanation subsystem. -They also may assist in the addition of new knowledge, help maintain correct rule syntax, and perform consistency checks on any updated knowledge base. Taibah University_CS Dept. CS 362_ Intelligent Systems Selecting a Problem and the Knowledge Engineering Process (1) Slide 17 Guidelines to determine whether a problem is appropriate for expert system solution or not: 1. The need for the solution justifies the cost and effort of building an expert system. 2. Human expertise is not available in all situations where it is needed. 3. The problem may be solved using symbolic reasoning. 4. The problem domain is well structured and does not require commonsense reasoning. 5. The problem may not be solved using traditional computing methods. 6. Cooperative and articulate experts exist. 7. The problem is of proper size and scope. Taibah University_CS Dept. CS 362_ Intelligent Systems Selecting a Problem and the Knowledge Engineering Process (2) Slide 18 Exploratory development cycle Taibah University_CS Dept. CS 362_ Intelligent Systems Selecting a Problem and the Knowledge Engineering Process (3) Slide 19 Expert systems are built by progressive approximations, with the program’s mistakes leading to corrections or additions to the knowledge base. The process of trying and correcting candidate designs is common to expert systems development, and contrasts with such neatly hierarchical processes as top-down design. In a sense, the knowledge base is “grown” rather than constructed. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (1) Slide 20 A simplified model of the knowledge acquisition process that will serve as a useful “first approximation” for understanding the problems involved in acquiring and formalizing human expert performance. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (2) Slide 21 Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (3) Slide 22 The Human Element in Expert Systems -Expert -Knowledge Engineer -User The human expert, working in an application area, operates in a domain of knowledge, skill, and practice. This knowledge is often vague, imprecise, and only partially verbalized. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (4) Slide 23 The knowledge engineer must translate this informal expertise into a formal language suited to a computational system. knowledge engineering is difficult and should be viewed as spanning the life cycle of any expert system. Knowledge engineers should document and make public an assumptions about the domain through common software engineering methodologies. To simplify knowledge engineering task, it is useful to consider, a conceptual model If the knowledge engineer uses a predicate calculus model, begin as a number of simple networks representing states of reasoning through typical problem-solving situations. Only after further refinement does this network become explicit if... then... rules. Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (5) Slide 24 Taibah University_CS Dept. CS 362_ Intelligent Systems Conceptual Models and Their Role in Knowledge Acquisition (6) Slide 25 A conceptual model that lies between human expertise and the implemented program. By a conceptual model, we mean the knowledge engineer’s evolving conception of the domain knowledge. This model that actually determines the construction of the formal knowledge base. The conceptual model is not formal or directly executable on a computer. It is an intermediate design construct, a template to begin to constrain and codify human skill. Taibah University_CS Dept. CS 362_ Intelligent Systems Chapter 8: STRONG METHOD PROBLEM SOLVING Slide 26 8.2 Rule-Based Expert Systems 8.2.1 The Production System and Goal-Driven Problem Solving 8.2.2 Explanation and Transparency in Goal-Driven Reasoning 8.2.3 Using the Production System for Data-Driven Reasoning 8.2.4 Heuristics and Control in Expert Systems Taibah University_CS Dept. CS 362_ Intelligent Systems Rule-Based Expert Systems Slide 27 Rule-based expert systems represent problem-solving knowledge as if... then... rules. • This approach is one of the oldest techniques for representing domain knowledge in an expert system. • It is also one of the most natural, and remains widely used in practical and experimental expert systems. Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(1) Slide 28 • The production system was the intellectual precursor of modern expert system architectures, where application of production rules leads to refinements of understanding of a particular problem situation. • When Newell and Simon developed the production system, their goal was to model human performance in problem solving. Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(2) Slide 29 • If we regard the expert system architecture in the above Figure as a production system, the domain-specific knowledge base is the set of production rules. Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(3) • In Slide 30 a rule-based system, these condition action pairs are represented as if... then...rules, with the premises of the rules, the if portion, corresponding to the condition, and the conclusion, the then portion, corresponding to the action: when the condition is satisfied, the expert system takes the action of asserting the conclusion as true. • Case-specific data can be kept in the working memory. The inference engine implements the recognize-act cycle of the production system; this control may be either data-driven or goal-driven. Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(4) Slide 31 Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(5) Slide 32 • In a goal-driven expert system, the goal expression is initially placed in working memory. • The system matches rule conclusions with the goal, selecting one rule and placing its premises in the working memory. • This corresponds to a decomposition of the problem’s goal into simpler subgoals. • The process continues in the next iteration of the production system, with these premises becoming the new goals to match against rule conclusions. Taibah University_CS Dept. CS 362_ Intelligent Systems The Production System and Goal-Driven Problem Solving(6) Slide 33 • • • • • The system thus works back from the original goal until all the subgoals in working memory are known to be true, indicating that the hypothesis has been verified. Thus , backward search in an expert system corresponds roughly to the process of hypothesis testing in human problem solving In an expert system, subgoals can be solved by asking the user for information. Some expert systems allow the system designer to specify which subgoals may be solved by asking the user. Others simply ask the user about any subgoals that fail to match rules in the knowledge base; i.e., if the program cannot infer the truth of a subgoal, it asks the user. Taibah University_CS Dept. CS 362_ Intelligent Systems Example: A small expert system for analysis of automotive problems(1) Slide 34 Rule 1: if the engine is getting gas, and the engine will turn over, then the problem is spark plugs. Rule 3: if the engine does not turn over, and the lights do come on then the problem is the starter motor. Rule 2: if the engine does not turn over, and the lights do not come on then the problem is battery or cables. Rule 4: if there is gas in the fuel tank, and there is gas in the carburetor then the engine is getting gas. Taibah University_CS Dept. CS 362_ Intelligent Systems Example: A small expert system for analysis of automotive problems(2) Slide 35 The and/or graph searched in the car diagnosis example, with the conclusion of Rule 4 matching the first premise of Rule 1. Taibah University_CS Dept. CS 362_ Intelligent Systems Example: A small expert system for analysis of automotive problems(3) Slide 36 The production system at the start of a consultation in the car diagnostic example. The production system at the start of a consultation (it will be DFS) Taibah University_CS Dept. CS 362_ Intelligent Systems Example: A small expert system for analysis of automotive problems(4) Slide 37 The production system after Rule 1 has fired. The system after Rule 4 has fired. Note the stack-based approach to goal reduction. Taibah University_CS Dept. CS 362_ Intelligent Systems Explanation and Transparency in Goal-Driven Reasoning(1) Slide 38 Explanation: Answering “why” questions ES - Gas in fuel tank? U-yes ES-Gas in carburetor? U-yes ES-Engine will turn over? U-yes ES-This follows from rule 1 ES-It has been established that 1. The engine is getting gas, therefore if 2. The engine will turn over, then the problem is spark plugs. The user is asking “why is this information important?” CS 362_ Intelligent Systems Taibah University_CS Dept. Explanation and Transparency in Goal-Driven Reasoning(2) Slide 39 U-how the engine is getting gas ES-This follows from rule 4: if gas in fuel tank, and there is gas in the carburetor then the engine is getting gas gas in fuel tank was given by the user gas in the carburetor was given by the user The user is asking “How did the system come up with this conclusion?” Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning(3) Slide 40 • Breadth-first search is even more common in data-driven reasoning. • The algorithm is: 1- compare the contents of working memory with the conditions of each rule in the rule base according to the order of the rules in the rule base. 2- If the data in working Memory supports a rule’s firing the result is placed in working memory and then control moves on to the next rule. Once all rules have been considered, search starts again at the beginning of the rule set. Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning ( Data-driven reasoning in Ess)(1) Slide 41 Use breadth-first search Algorithm: Do the next step until the working memory does not change anymore For each rule: Compare the contents of the working memory with the conditions of each rule in the rule base using the ordering of the rule base. If the data in working memory supports a rule’s firing place the result in working memory Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning ( Data-driven reasoning in Ess)(2) Slide 42 At the start of a consultation for data-driven reasoning Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning ( Data-driven reasoning in Ess)(3) Slide 43 After evaluating the first premise of Rule 2, which then fails Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning ( Data-driven reasoning in Ess)(4) Slide 44 After considering Rule 4, beginning its second pass through the rules Taibah University_CS Dept. CS 362_ Intelligent Systems Using the Production System for Data-Driven Reasoning ( Data-driven reasoning in Ess)(5) Slide 45 The search graph as described by the contents of WM data-driven Taibah University_CS Dept. CS 362_ Intelligent Systems Heuristics and Control in Expert Systems (2) Slide 46 Example: • Rule 1 in the automotive example tries to determine whether • • • the engine is getting gas before it asks if the engine turns over. This is inefficient, in that trying to determine whether the engine is getting gas invokes another rule and eventually asks the user two questions. By reversing the order of the premises, a negative response to the query “engine will turn over?” Eliminates this rule from consideration before the more involved condition is examined, thus making the system more efficient. Taibah University_CS Dept. CS 362_ Intelligent Systems Chapter 8: STRONG METHOD PROBLEM SOLVING Slide 47 8.4 Planning 8.4.1 Introduction to Planning: Robotics 8.4.2 Using Planning Macros: STRIPS Taibah University_CS Dept. CS 362_ Intelligent Systems Planning (1) Slide 48 • Planning is a process of organizing pieces of knowledge into a consistent sequence of actions that will accomplish a goal. • The task of a planner is to find a sequence of actions that allow a problem solver, such as a control system, to accomplish some specific task. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning (2) Slide 49 • Traditional planning is very much knowledge-intensive, since plan creation requires the organization of pieces of knowledge and partial plans into a solution procedure. • Planning plays a role in expert systems in reasoning about events occurring over time. • Planning has many applications in manufacturing, such as process control. It is also important for designing a robot controller. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning (3) Slide 50 Planning Examples: • Identify the actions and the task or goal n • Sequence of actions to go get block a from room b 1. Put down whatever is now held 2. Go to room b 3. Pick up block a 4. Leave room b 5. Return to original location Taibah University_CS Dept. CS 362_ Intelligent Systems Planning - Atomic Actions Slide 51 The steps of a traditional robot are composed of the robot’s Atomic actions can be at the micro-level or higher level Example: “turn the sixth stepper motor one revolution” atomic actions What type of actions were the ones listed in the previous example? Taibah University_CS Dept. CS 362_ Intelligent Systems Planning -Searching (1) Slide 52 • Plans are created by searching through a space of possible actions until the sequence necessary to accomplish the task is discovered. • This space represents states of the world that are changed by applying each of the actions. • The search terminates when the goal state (the appropriate description of the world) is produced. Thus, many of the issues of heuristic search, including finding A* algorithms, are also appropriate in planning. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning - Searching (2) Slide 53 • Traditional planning relies on search techniques. • Planning may be seen as a state space search • The techniques of graph search may be applied to find a path from the start state to the goal state. The operations on this path constitute a plan. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(1) Slide 54 Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(2) Slide 55 We have five blocks, labeled a, b, c, d, e, sitting on the top of a table. The blocks are all cubes of the same size, and stacks of blocks, as in the figure, have blocks directly on top of each other. The robot arm has a gripper that can grasp any clear block (one with no block on top of it) and move it to any location on the tabletop or place it on top of any other block (with no block on top of it). Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(3) Slide 56 The robot arm can perform the following tasks (U, V, W, X, Y, and Z are variables): Meaning Task Go to location described by coordinates X, Y, and Z. This location might be goto(X,Y,Z) implicit in the command pickup (W) where block W has location X, Y, Z. Pick up block W from its current location. It is assumed that block W is clear on top, the gripper is empty at the time, and the computer knows the location of block W. pickup(W) Place block W down at some location on the table and record the new putdown(W) location for W. W must be held by the gripper at the time. stack(U,V) Place block U on top of block V. The gripper must be holding U and the top of V must be clear of other blocks. stack(U,V) Remove block U from the top of V. U must be clear of other blocks, V must have block U on top of it, and the gripper hand must be empty before this command can be executed. unstack(U,V) Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(4) Slide 57 1. 2. 3. 4. 5. 6. The state of the world is described by a set of predicates and predicate relationships: location(W,X,Y,Z) Block W is at coordinates X, Y, Z. on(X,Y) Block X is immediately on top of block Y. clear(X) Block X has nothing on top of it. gripping(X) The robot arm is holding block X. gripping( ) The robot gripper is empty. ontable(W) Block W is on the table. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(5) Slide 58 This is the collection of predicates STATE 1 for our continuing example: STATE 1 ontable(a). ontable(c). ontable(d). on(b, a). on(e, d). gripping( ). clear(b). clear(c). clear(e). The full state description is the conjunction ( ) of all these predicates. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(6) Slide 59 A number of truth relations or rules for performance are created for the clear (X), ontable (X), and gripping (). 1. ( X) (clear(X) ←¬ ( Y) (on(Y,X))) 2. ( Y) ( X) ¬ on(Y,X) ← ontable(Y) 3. ( Y) gripping( ) ↔¬ (gripping(Y)) 4. ( X) (pickup(X) → (gripping(X) ← (gripping( ) clear(X) ontable(X)))). 5. ( X) (putdown(X) → ((gripping( ) ontable(X) clear(X)) ← gripping(X))). 6. ( X) ( Y) (stack(X,Y) → ((on(X,Y) gripping( ) clear(X)) ← (clear(Y) gripping(X)))). 7. ( X)( Y) (unstack(X,Y) → ((clear(Y) gripping(X)) ← (on(X,Y) clear(X) gripping( ))). 8. ( X) ( Y) ( Z) (unstack(Y,Z) → (ontable(X) ← ontable(X))). 9. ( X) ( Y) ( Z) (stack(Y,Z) → (ontable(X) ← ontable(X))). Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(7) Slide 60 A number of truth relations or rules for performance are created for the clear (X), ontable (X), and gripping (). Explanation of Rule 4: For all blocks X, pickup(X) means gripping(X) if the hand is empty and X is clear and X is on table. Rule A → (B ← C) says that operator A produces new predicate(s) B when condition(s) C is(are) true. Operator A can be used to create a new state described by predicates B when predicates C are true. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(8) Slide 61 A number of other predicates also true for STATE 1 will remain true in STATE 2. These states are preserved for STATE 2 by the frame axioms. We now produce the nine predicates describing STATE 2 by applying the unstack operator and the frame axioms to the nine predicates of STATE 1, where the net result is to unstack block e: Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(9) Slide 62 This is the collection of predicates STATE 2 for our continuing example: STATE 2 ontable(a). ontable(c). ontable(d). on(b, a). clear(c). gripping(e). clear(b). clear(d). clear(e). Taibah University_CS Dept. CS 362_ Intelligent Systems Planning Example (robot’s world to a set of blocks on a tabletop )(10) Slide 63 Fig 8.19 Portion of the state space for a portion of the blocks world. STATE 1 STATE 2 63 Taibah University_CS Dept. CS 362_ Intelligent Systems Using Planning Macros: STRIPS (1) Slide 64 STRIPS, developed at what is now SRI International, stands for STanford Research Institute Planning System. This controller was used to drive the SHAKEY robot of the early 1970s. The book presents a version of STRIPS-style planning and triangle tables (data structure) used to organize and store macro operations. Using blocks example, the four operators pickup, putdown, stack, and unstack are represented as triples of descriptions. Taibah University_CS Dept. CS 362_ Intelligent Systems Using Planning Macros: STRIPS(2) Slide 65 The first element of the triple is the set of preconditions (P), or conditions the world must meet for an operator to be applied. The second element of the triple is the add list (A), or the additions to the state description that are a result of applying the operator. Finally, there is the delete list (D), or the items that are removed from a state description to create the new state when the operator is applied. Taibah University_CS Dept. CS 362_ Intelligent Systems Using Planning Macros: STRIPS(3) Slide 66 1- pickup(X) P: gripping( ) clear(X) ontable(X) A: gripping(X) D: ontable(X) gripping( ) 2- putdown(X) P: gripping(X) A: ontable(X) gripping( ) clear(X) D: gripping(X) 3- stack(X,Y) P: clear(Y) gripping(X) A: on(X,Y) gripping( ) clear(X) D: clear(Y) gripping(X) 4- unstack(X,Y) P: clear(X) gripping( ) on(X,Y) A: gripping(X) clear(Y) D: gripping( ) on(X,Y) Using blocks example, the four operators pickup, putdown, stack, and unstack are represented as triples of descriptions. Taibah University_CS Dept. CS 362_ Intelligent Systems Problems with Using Planning Macros(1) Slide 67 • Example of an incompatible subgoal using the start state STATE 1 of Figure 8.18. • Suppose the goal of the plan is STATE G as in Figure 8.20, with on(b,a) on(a,c) and blocks d and e remaining as in STATE 1. • It may be noted that one of the parts of the conjunctive goal on(b,a) on(a,c) is true in STATE 1, namely on(b,a). • This already satisfied part of the goal must be undone (unfinished) before the second subgoal, on(a,c), can be accomplished. Taibah University_CS Dept. CS 362_ Intelligent Systems Problems with Using Planning Macros(2) Slide 68 Fig: 8.18 Fig: 8.20 Taibah University_CS Dept. CS 362_ Intelligent Systems Triangle Table (1) Slide 69 • The Triangle Table representation (Fikes and Nilsson 1971, Nilsson 1980) is aimed at alleviating some of these anomalies, presented in slides 21 and 22. • A triangle table is a data structure for organizing sequences of actions, including potentially incompatible subgoals, within a plan. Taibah University_CS Dept. CS 362_ Intelligent Systems Triangle Table (2) Slide 70 Figure 8.21 presents a triangle table for the macro action stack(X,Y) stack(Y,Z). This macro action can be applied to states where on(X,Y) clear(X) clear(Z) is true. This triangle table is appropriate for starting STATE 1 with X = b, Y = a, and Z = c. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning - A Triangle Table Slide 71 The atomic actions of the plan, i.e., pickup, putdown, stack, and unstack are recorded along the diagonal. Taibah University_CS Dept. CS 362_ Intelligent Systems Planning - A Triangle Table - Explanation Slide 72 The set of preconditions of each of these actions are in the row before that action, and the postconditions (add and delete) of each action are in the column below it. Advantage of triangle tables is the assistance they can offer in attempting to recover from problems, such as a block being slightly out of place, or accidents, such as dropping a block.