Knowledge-based systems
Rozália Lakner
University of Veszprém, Department of Computer Science
Engineering Application of AI - PhD Course

An overview
- Knowledge-based systems, expert systems
  - structure, characteristics
  - main components
  - advantages, disadvantages
- Base techniques of knowledge-based systems
  - rule-based techniques
  - inductive techniques
  - hybrid techniques
  - symbol-manipulation techniques
  - case-based techniques
  - (qualitative techniques, model-based techniques, temporal reasoning techniques, neural networks)

Knowledge-based systems

Structure and characteristics 1
- KBSs are computer systems that contain stored knowledge and solve problems like humans would
- KBSs are AI programs with a new type of program structure:
  - knowledge base (rules, facts, meta-knowledge)
  - inference engine (reasoning and search strategy for the solution, other services)
- characteristics of KBSs:
  - intelligent information processing systems
  - representation of the domain of interest: symbolic representation
  - problem solving by symbol manipulation: symbolic programs

Structure and characteristics 2
[Figure: structure of a KBS. Components: User, User interface, Explanation subsystem, Case-specific database, Inference engine, Knowledge base, Knowledge acquisition subsystem, Developer's interface, Knowledge engineer.]

Main components 1
- knowledge base (KB)
  - knowledge about the field of interest (in a natural-language-like formalism)
  - symbolically described system specification
  - KNOWLEDGE-REPRESENTATION METHOD!
- inference engine
  - "engine" of problem solving (general problem-solving knowledge)
  - supports the operation of the other components
  - PROBLEM-SOLVING METHOD!
- case-specific database
  - auxiliary component
  - specific information (information from outside, initial data of the concrete problem)
  - information obtained during reasoning

Main components 2
- explanation subsystem
  - explains the system's actions at the user's request
- typical explanation facilities
  - explanation during problem solving:
    - WHY ... (explanatory reasoning, intelligent help, tracing information about the current reasoning steps)
    - WHAT IF ... (hypothetical reasoning, conditional assignment and its consequences, can be withdrawn)
    - WHAT IS ... (browsing the knowledge base and the case-specific database)
  - explanation after problem solving:
    - HOW ... (explanatory reasoning, information about the way the result has been found)
    - WHY NOT ... (explanatory reasoning, finding counter-examples)
    - WHAT IS ... (browsing the knowledge base and the case-specific database)

Main components 3
- knowledge acquisition subsystem, main tasks:
  - checking the syntax of knowledge elements
  - checking the consistency of the KB (verification, validation)
  - knowledge extraction, building the KB
  - automatic logging and book-keeping of the changes of the KB
  - tracing facilities (handling breakpoints, automatic monitoring and reporting of the values of knowledge elements)
- user interface (towards the user)
  - dialogue in natural language (consultation/suggestion)
- special interfaces
  - database and other connections
- developer interface (towards the knowledge engineer and the human expert)

Main components 4
- the main tasks of the knowledge engineer:
  - knowledge acquisition and design of the KBS: determination, classification, refinement and formalization of methods, rules of thumb and procedures
  - selection of the knowledge representation method and the reasoning strategy
  - implementation of the knowledge-based system
  - verification and validation of the KB
  - KB maintenance

Expert Systems

Structure and characteristics 1
- expert systems are knowledge-based systems that
  - employ expert knowledge
  - are applied in a narrow, specific field
  - solve difficult problems (there must be a demand for special knowledge)
- specialized human experts are needed
- the experts must agree on the fundamental questions of the professional field
- learning examples and raw data are needed
- expectations from an ES (like from a human expert):
  - make intelligent decisions: offer intelligent advice and explanations
  - question/answer dialogue ("treated as an equal conversation partner")
  - explanation of questions
  - acceptable advice even in uncertain situations

Structure and characteristics 2
[Figure: nested sets. AI programs contain Knowledge-based systems, which contain Expert systems.]
- AI programs: intelligent problem-solving tools
- KBSs: AI programs with a special program structure (separated knowledge base)
- ESs: KBSs applied in a specific narrow field

Expert system shells 1
- "empty" ESs that contain all the active elements of an ES
- empty KB, powerful knowledge acquisition subsystem
- contain services for the construction and operation of an ES, independently of the field of interest
- support rapid prototyping and incremental construction
- examples: CLIPS, GoldWorks, G2, Level5

Expert system shells 2
[Figure: structure of an expert system shell. Components: User, User interface, Explanation subsystem, Case-specific database, Inference engine, Knowledge base, Knowledge acquisition subsystem, Developer's interface, Knowledge engineer.]

Advantages of KBSs and ESs
- make up for the shortage of experts and spread expert knowledge at an affordable price (TROPICAID)
- changes in the field of interest are tracked well (R1)
- increase the expert's ability and efficiency
- preserve know-how
- systems that cannot be realized with traditional technology can be developed (Buck Rogers)
- self-consistent in giving advice, steady in performance
- permanently available
- able to work even with partial, incomplete data
- able to give explanations

Disadvantages of KBSs and ESs
- their knowledge comes from a narrow field, and they do not know its limits
- the answers are not always correct (the advice has to be analysed!)
- they have no common sense (the greatest restriction): every self-evident check has to be defined explicitly (many exceptions increase the size of the KB and the running time)

Base techniques of KBSs

Techniques of KBSs
- classified by the knowledge-representation methods and reasoning strategies applied in the implementation:
  - rule-based techniques
  - inductive techniques
  - hybrid techniques
  - symbol-manipulation techniques
  - case-based techniques
  - (qualitative techniques, model-based techniques, temporal reasoning techniques, neural networks)

Rule-based techniques (a short review)

Reasoning with rules 1
- knowledge-representation form: rule
- according to the structure of the KB, the rule base can be
  - simple/unstructured
  - structured (contexts)
- reasoning strategies according to the control direction:
  - data-driven (forward chaining)
  - goal-driven (backward chaining)

Reasoning with rules 2
- aim: proving a goal statement or achieving a goal state
- the reasoning algorithm:
  - pattern matching: finding the applicable rules (matching the condition/conclusion part of the rules); fireable rules -> conflict set
  - conflict resolution: selecting the most appropriate rule from the conflict set (conflict resolution strategies)
  - firing: executing the selected rule -> new knowledge (new facts or new subgoals to be proved)
  - checking the termination conditions, restart of the cycle

Inductive techniques

Inductive reasoning
- a type of machine learning technique
- infers general information from individual cases
- given a collection of training examples (x, f(x)), return a function h that approximates f
- h is called a hypothesis
- aim: finding the hypothesis that fits the training examples well
- h is used for predicting the values of unseen examples
[Figure: f(x) plotted against x, with candidate hypotheses.]

Decision tree 1
- one of the best-known methods of inductive learning: learning decision trees
- decision tree: a simple representation for classifying examples
- elements of the decision tree:
  - non-leaf (internal) nodes are labelled with attributes (A)
  - arcs out of a node are labelled with the possible values of attribute A
  - leaf nodes are labelled with classifications (Boolean values, yes/no, in the simplest case)

Decision tree 2

| No. | Country | Age  | Engine | Colour | Easy to sell |
|-----|---------|------|--------|--------|--------------|
| 1   | Germany | 3-6  | diesel | white  | yes          |
| 2   | Japan   | 6-10 | diesel | red    | yes          |
| 3   | Japan   | 3-6  | diesel | blue   | no           |

We want to classify new examples on the property Easy to sell, based on the examples' Country, Age, Engine and Colour.

[Figure: decision tree. Root: Country. Germany -> yes. Japan -> Colour: red -> yes, blue -> no.]

Decision tree 3
- a decision tree under construction contains:
  - nodes labelled with attributes
  - nodes labelled with classifications (yes/no values)
  - unlabelled nodes
  - arcs labelled with attribute values, leaving only from nodes labelled with attributes
- every unlabelled node possesses:
  - a subset of the training examples
  - the eligible attributes

Decision tree 4
- some questions about decision trees:
  - Given some data (a set of training examples and attributes), which decision tree should be generated? A decision tree can represent any discrete function of the inputs.
  - Which trees are the best predictors of unseen data? You need a bias (a preference for one hypothesis over another). For example, prefer the smallest tree. Least depth? Fewest nodes?
  - How should you go about building a decision tree? The space of decision trees is too big for a systematic search for the smallest decision tree.

Learning decision trees 1
- learning a decision tree, ID3 algorithm:
  1. initially the decision tree contains one unlabelled node with all of the training examples and attributes
  2. select an unlabelled node (n) with a non-empty set of training examples (T) and a non-empty set of attributes (A)
     - if T is a homogeneous class: n becomes a leaf node, labelled with the classification
     - otherwise:
       - choose the "best" attribute (B) from A
       - extend the tree with all of the possible attribute values of B (divide into subclasses)
       - distribute T among the child nodes according to the attribute values (assign the elements of T to the subclasses)
  3. continue with step 2
- the tree is built top-down

Learning decision trees 2
- how to choose the "best" attribute?
  - the attribute that divides the examples into homogeneous classes
  - otherwise, the attribute that makes the most progress towards this
- hill-climbing search on the space of decision trees, searching for the smallest tree
- heuristics: maximum information gain
  - the information gain of an attribute test measures the difference between the original information requirement and the new requirement (after the attribute test)
  - information gain (G) is based on information content (entropy, E)

$$G(S, A) = E(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|} E(S_i)$$

where S is the set of classified examples, A is the attribute, and S_1, ..., S_n are the subsets of S according to A. The entropy E is

$$E(S) = -\frac{|S^+|}{|S|} \log_2 \frac{|S^+|}{|S|} - \frac{|S^-|}{|S|} \log_2 \frac{|S^-|}{|S|}$$

where S^+ and S^- denote the positively and negatively classified examples in S.

Learning decision trees 3

| No. | Author  | Thread | Length | Reads |
|-----|---------|--------|--------|-------|
| 1   | known   | new    | short  | true  |
| 2   | unknown | new    | long   | true  |
| 3   | unknown | old    | short  | false |
| 4   | known   | old    | short  | true  |
| 5   | known   | new    | long   | true  |
| 6   | known   | old    | long   | true  |
| 7   | unknown | old    | long   | false |
| 8   | unknown | new    | long   | true  |
| 9   | known   | new    | short  | true  |
| 10  | unknown | old    | long   | false |
| 11  | known   | new    | short  | true  |
| 12  | known   | old    | long   | true  |
| 13  | known   | new    | long   | true  |

Using decision trees 1
- major problem with using decision trees: overfitting
  - occurs when a distinction in the tree appears in the training examples but does not appear in the unseen examples
- handling overfitting:
  - restricting the splitting, so that you split only when the split is useful
  - allowing unrestricted splitting and pruning the resulting tree where it makes unwarranted distinctions:
    - the examples are divided into two sets: a training set and a test set
    - a decision tree is constructed with the training set
    - every node is examined with the test set: can the subtree under the node be replaced with a leaf node?

Using decision trees 2
- supporting knowledge acquisition and fast prototyping (rule-based/hybrid systems with inductive services)
- each row in the matrix of training examples is a rule

| No. | Author  | Thread | Length | Reads |
|-----|---------|--------|--------|-------|
| 1   | known   | new    | short  | true  |
| 2   | unknown | new    | long   | true  |
| ... |         |        |        |       |

  IF (Author = known) and (Thread = new) and (Length = short) THEN (Reads = true)
  IF (Author = unknown) and (Thread = new) and (Length = long) THEN (Reads = true)
  ...

- better: each path (root -> leaf) of the decision tree is a rule

  IF (Author = known) THEN (Reads = true)
  IF (Author = unknown) and (Thread = new) THEN (Reads = true)
  IF (Author = unknown) and (Thread = old) THEN (Reads = false)

Main components of inductive systems
- Knowledge representation: the matrix of training examples (attributes, values)
- Reasoning and control: an algorithm that constructs a decision tree from the matrix of training examples and operates the generated system
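The ID3 steps and the information-gain heuristic from the decision tree slides can be tried out on the Author/Thread/Length/Reads examples. The following is a minimal Python sketch, not the course's own implementation: the names `entropy`, `information_gain`, `id3` and `classify` are illustrative, the tree is stored as nested tuples and dictionaries, and the Author value of row 11 (missing in the table) is assumed to be "known". Unlike the slide version, the sketch branches only on attribute values that actually occur in the examples.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ("Author", "Thread", "Length")

# Training examples from the Author/Thread/Length/Reads table.
# Assumption: row 11 has no Author value in the table; "known" is used here.
ROWS = [
    ("known", "new", "short", True), ("unknown", "new", "long", True),
    ("unknown", "old", "short", False), ("known", "old", "short", True),
    ("known", "new", "long", True), ("known", "old", "long", True),
    ("unknown", "old", "long", False), ("unknown", "new", "long", True),
    ("known", "new", "short", True), ("unknown", "old", "long", False),
    ("known", "new", "short", True), ("known", "old", "long", True),
    ("known", "new", "long", True),
]
EXAMPLES = [dict(zip(ATTRIBUTES + ("Reads",), row)) for row in ROWS]


def entropy(examples, target="Reads"):
    """E(S): information content of the class distribution in S."""
    counts = Counter(e[target] for e in examples)
    total = len(examples)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(examples, attribute, target="Reads"):
    """G(S, A) = E(S) - sum over subsets S_i of |S_i|/|S| * E(S_i)."""
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder


def id3(examples, attributes, target="Reads"):
    """Build the tree top-down; a leaf is a class label, an inner node is
    a pair (attribute, {attribute value: subtree})."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:      # homogeneous class: leaf node
        return labels[0]
    if not attributes:             # no attribute left: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({e[best] for e in examples}):
        subset = [e for e in examples if e[best] == value]
        branches[value] = id3(subset, rest, target)
    return (best, branches)


def classify(tree, example):
    """Traverse the tree from the root to a leaf for one example."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree


TREE = id3(EXAMPLES, list(ATTRIBUTES))
print(TREE)
```

With these examples the sketch selects Author at the root and Thread under Author = unknown, which corresponds to the three path rules listed on the "Using decision trees 2" slide. Pruning against a separate test set, as described for overfitting, is not included.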
Main steps of inductive systems
- problem definition (knowledge representation):
  - attributes (head of the matrix, generate columns, define object classes)
  - training examples (fill the rows of the matrix, define instances)
- reasoning (generating a hypothesis):
  - checking that the training examples are free of contradictions
  - learning the optimal decision tree (DT) -> knowledge base
- control (operating the system):
  - classification of the user's (unknown) examples (traversing the DT)
  - analysis of the user's examples (with the help of the DT)

Hybrid techniques

Characteristics of hybrid systems
- support various programming techniques:
  - frame-based techniques
  - rule-based techniques
    - data-driven reasoning
    - goal-driven reasoning
  - inductive techniques
- realization: using object-oriented tools

Frames
- knowledge-representation unit developed on epistemological foundations
- formal tool used for describing structured objects, events or notions
- characteristics of frames:
  - a frame contains:
    - the name of the object/event
    - its important properties (attributes), stored in slots (slot identifier, type, value; the value can be another frame)
  - classes, subclasses, instances
  - hierarchical structure (is_a, instance_of relations)
  - inheritance (classes - subclasses, classes - instances)
  - procedures controlled by events: daemons

Formalization of frames 1
- directed graph
[Figure: directed graph of frames. Nodes: Person, Teacher, Student, Subject, Rozália, Peter, ES. Edges: is_a and instance_of. Slots shown: f_name, l_name, status, subjects, name, preconditions. Values shown: Peter, Kis, ES, Expert_systems, AI.]

Formalization of frames 2
- description in a frame-based environment

  frame person is_a class
    f_name:
    l_name:
  end

  frame student is_a person
    subjects: collection_of subject
  end

  frame subject is_a class
    name:
    precond: collection_of subject
  end

  frame Peter instance_of student
    f_name: Peter
    l_name: Kis
    subjects: ES
  end

  frame ES instance_of subject
    name: Expert_systems
    precond: AI
  end

Formalization of frames 3
- object-attribute-value triplets

  <Peter, f_name, Peter>
  <Peter, l_name, Kis>
  <Peter, subjects, [ES]>
  <ES, name, Expert_systems>
  <ES, preconditions, [AI]>

Daemons 1
- active elements of a frame system
- standard built-in procedures assigned to the attributes of the classes and instances
- automatically invoked on a predefined change in the value of the slot
- the usual daemons are as follows:
  - when-needed: describes the steps to be performed when the value of the slot is read
  - when-changed: is invoked when the value of the slot is changed
  - when-added: contains the actions to be performed when the slot gets its first value
  - when-deleted: is executed when the value of the slot is deleted

Daemons 2
- the executable part of a daemon is determined by the user, and it may even be empty
- execution is controlled by events
- daemons can invoke (call) each other by changing slot values, so the changes can propagate further and further
- the operation of a frame system is described in an indirect way (embedded in the daemons)
- daemons can be used for restricted data-driven reasoning

Daemons versus rules

| Daemons | Rules |
|---------|-------|
| Faster and more independent than rules. "Reason/action" is connected to the changes in values and the system's responses. They act in an autonomous way. | A rule is invoked by another rule or by the presence of certain data. The execution depends on the situation and cannot be foreseen. |
| Less readable than rules (daemons are defined in the implementation language of the given tool). | Easy to read (symbolic, natural-language-like formalism). |
| They handle the predefined changes of the given attribute values. | The built-in knowledge of the rules flows freely to all of the rules. |
| The range of a daemon is bounded statically in advance (more or less flexible). | The range of a rule is determined dynamically at run time (flexible, creative problem solving). |

Hybrid techniques
- rules: used for the description of heuristic knowledge
- frames: contain both descriptive and procedural knowledge of the given objects/events/notions (all in one place! easy to read and modify, the effects of modifications can be handled easily)
- the inference engine of hybrid techniques
  - can contain:
    - mechanisms ensuring inheritance and the handling of daemons
    - mechanisms ensuring message exchange (object-oriented)
    - data-driven and/or goal-driven reasoning mechanisms
  - can support the organization of rules and/or frames into hierarchical modules
  - can support creating and using meta-rules

Symbol-manipulation techniques

Programming languages of AI
- high-level symbol-manipulation languages are used to support the implementation of AI methods
- LISP (LISt Processing)
  - based on the notion of lists and operations on lists
  - every problem can be described in the form of function calls
- PROLOG (PROgramming in LOGic)
  - high-level declarative language
  - defines relationships between various entities with the help of logic
  - special type of clause (A ← B1 ∧ … ∧ Bn): fact, rule, question
  - reasoning environment with a built-in inference engine: answers a question with the help of logical reasoning
  - goal-driven (backward) reasoning

Comparison of symbol-manipulation and traditional techniques

| Traditional programming languages | LISP | PROLOG |
|-----------------------------------|------|--------|
| numeric calculus | symbol manipulation | symbol manipulation |
| Neumann-principle languages: consist of a sequence of commands executed in a predefined order | functional approach: sequence of evaluations of function expressions (λ-calculus) | relational approach based on mathematical logic (predicate calculus) |
| main elements: commands | main elements: functions (procedures) | main elements: predicates (relations among objects) |
| procedural (executing in a predefined order) | procedural | declarative (defining only the description of the problem) |
| the executing mechanism has to be defined by the programmer | the executing mechanism has to be defined by the programmer | built-in executing mechanism (goal-driven reasoning with a backtracking search strategy) |
| the structure of program and data is different | the structure of program and data is the same (programs can produce and execute other programs and can modify themselves) | the structure of program and data is the same (programs can produce and execute other programs and can modify themselves) |
| readability: LISP-like | hard to read | easy to read |

Case-based techniques

Case-based reasoning (CBR) 1
- basic assumption: the future will be like the past
- real-world experience is hard to describe with classical rules
  - it consists of interconnected relationships between more or less generalized events
- idea: solving problems based on the solutions of similar problems solved in the past
- requires storing, retrieving and adapting past solutions to similar problems

Case-based reasoning 2
- solve a new problem by making an analogy to an old one and adapting its solution to the current situation
- retrieving a case starts with a problem description and ends when a best-matching case has been found
- all case-based reasoning methods have the following process in common:
  - identify a set of relevant problem descriptors
  - retrieve the most similar case (or cases) by comparing the current case to the library of past cases
  - reuse the retrieved case to try to solve the current problem
  - revise and adapt the proposed solution if necessary
  - retain the final solution as part of a new case

Case
- a case represents specific knowledge in a particular context
- there are three major parts in any case:
  - a description of the problem/situation: the state of the world when the case occurs
  - solution: the chain of operators that were used to solve the problem (solving path)
  - outcome/consequence: the state of the world after the case has occurred (description of the effect on the world)
- in addition to specific cases, one also has to consider the organisation of the case memory

Case - indexing
- the most important problem in CBR: how do we remember when to retrieve what?
- essentially, the indexing problem requires assigning labels to cases to designate the situations in which they are likely to be useful
- indexing of cases - issues:
  - indexing should anticipate the vocabulary a retriever might use
  - indexing has to be by concepts normally used to describe the items being indexed
  - indexing has to anticipate the circumstances in which a retriever is likely to want to retrieve something

Main components of case-based systems 1
- case base (library of cases)
- tools for determining the key elements of the current case and for retrieving the most similar cases
  - indexing: for speeding up data retrieval
  - pattern matching, similarity estimation: for finding suitable cases
- tools for adapting the solution to the specifics of the new case
  - finding the deviations, implementing alterations in the suggested solution (e.g. null adaptation, parameter adjustment)
- supervision (is the solution suitable after the adaptation or not)
- learning (finding the reason for failure, or adding the case to the case base)

Main components of case-based systems 2
[Figure: CBR cycle. new problem -> indexing -> retrieving (case matching) from the case base -> similar cases -> selecting -> proposed solution -> adapting -> checking -> solution -> learning (back to the case base).]

Advantages and disadvantages
- advantages:
  - the case base is more objective and formal than the expert's interpretation (the expert's knowledge)
  - knowledge is represented in an explicit way
  - cases can be defined for incomplete or badly defined notions
  - CBR is suitable for domains for which proper theoretical foundations do not exist
  - CBR is applicable in the absence of an algorithmic method
  - easy knowledge acquisition (improves during usage)
- disadvantages:
  - CBR solves only the problems covered by cases
  - CBR might use a past case blindly, without validating it in the new situation
  - finding a solution is time-demanding (even with proper indexing)

Rule-based systems versus case-based systems

| Rule-based systems | Case-based systems |
|--------------------|--------------------|
| Rule: symbolic pattern | Case: collection of data, constants |
| Rule: individual unit, independent of the other rules, consistent piece of the field of interest | Case: depends on the other cases (cases often overlap each other), individual unit of the field of interest |
| Retrieving a rule: exact matching | Retrieving a case: partial matching |
| Using rules: general iterative cycle | Using cases: several steps (approximate retrieval, adaptation, refinement) |
| The model of the problem has to be developed (sometimes this is hard or impossible) | The model of the problem need not be developed |
| Knowledge acquisition in the field of interest is hard and time-demanding | Knowledge acquisition in the field of interest is limited to collecting and analysing the past cases |
| Development time is long | Development time is short |
| Slow, handling large amounts of data is difficult | Large amounts of data can be handled with database-handling techniques |
| Enlargement is hard (the validation has to be repeated after enlargement) | Enlargement and development are easy |
| Learning is not supported | Able to learn (preserving new cases) |

Summary
- Knowledge-based systems, expert systems
- Base techniques of knowledge-based systems
  - rule-based techniques
  - inductive techniques
  - hybrid techniques
  - symbol-manipulation techniques
  - case-based techniques

References
- K. M. Hangos, R. Lakner and M. Gerzson: Intelligent Control Systems. An Introduction with Examples. Kluwer Academic Publishers, 2001. Chapter 5.
- D. Poole, A. Mackworth, R. Goebel: Computational Intelligence. A Logical Approach. Oxford University Press, 1998. Chapter 6.
- I. Futó (Ed.): Mesterséges intelligencia. Aula Kiadó, 1999. Chapter 12. (in Hungarian)
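
As a supplement to the case-based reasoning slides, the retrieve / reuse / retain steps with partial matching can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `CaseBase`, the function `similarity`, the descriptor names and the two stored cases are all hypothetical, similarity is simply the fraction of matching problem descriptors, and no indexing is used.

```python
def similarity(problem, stored_problem):
    """Fraction of the new problem's descriptors that the stored case matches."""
    if not problem:
        return 0.0
    matched = sum(1 for key, value in problem.items()
                  if stored_problem.get(key) == value)
    return matched / len(problem)


class CaseBase:
    """Library of past cases; each case pairs a problem description with a solution."""

    def __init__(self):
        self.cases = []

    def retain(self, problem, solution):
        # Learning: store the (checked) solution as part of a new case.
        self.cases.append({"problem": dict(problem), "solution": solution})

    def retrieve(self, problem):
        # Partial matching: return the most similar stored case, if any.
        if not self.cases:
            return None
        return max(self.cases,
                   key=lambda case: similarity(problem, case["problem"]))

    def solve(self, problem, adapt=None):
        # Reuse the retrieved solution; without an adapt function this is
        # null adaptation (the old solution is proposed unchanged).
        case = self.retrieve(problem)
        if case is None:
            return None
        if adapt is None:
            return case["solution"]
        return adapt(case["solution"], problem)


# Hypothetical cases, invented for illustration only.
library = CaseBase()
library.retain({"noise": "high", "flow": "low"}, "replace bearing")
library.retain({"noise": "low", "flow": "low"}, "clean filter")

new_problem = {"noise": "high", "flow": "low", "temperature": "normal"}
proposal = library.solve(new_problem)
print(proposal)
```

The proposed solution still has to be checked before `retain` is called for the new case; the sketch deliberately leaves that supervision step to the caller, since using a past case blindly is listed among the disadvantages of CBR.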