Case-based reasoning

What is case-based reasoning?

Case-based reasoning (CBR) is an approach to building knowledge-based systems (KBS) that is radically different from the rule-based and other knowledge-representation approaches we have seen so far. The principle is to find a solution that has been shown to solve problems like the current one in the past, and to adapt it so that it solves the current problem.

This has a certain psychological plausibility as a model of what the expert decision-maker actually does when solving a problem. The approach is based on research by Riesbeck & Schank (1989). A good, comprehensive description is to be found in Kolodner (1993).

Three quotes from Roger Schank:

"Humans use cases because they don't know what they know - they don't know their own rules - they do things nonreflectively."
"The key process in intelligence is the reminding process."
"People don't ever reason from first principles. They always choose a matching case. It may be a bad match, but in that case they need more experience."

How a CBR system works: the knowledge base

The knowledge base contains a collection of representative cases (of faults, say, if the system is concerned with fault diagnosis), with their symptoms, causes, and treatments.

How a CBR system works: the process

The user is asked to provide the relevant features of the current case. The similarity between this set of features and the features characteristic of each stored case is calculated, and the best match is chosen.
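The matching step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Case` record, the `similarity` and `best_match` functions, and the three fault cases are all invented here, and similarity is taken to be the simple fraction of matching feature values, which is one possible measure among many.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One stored case: its features plus the recorded cause and treatment."""
    name: str
    features: dict[str, str]
    cause: str
    treatment: str


def similarity(current: dict[str, str], stored: dict[str, str]) -> float:
    """Fraction of the current case's features that the stored case matches.

    Assumption: plain feature overlap, chosen only to keep the sketch short.
    """
    if not current:
        return 0.0
    matches = sum(1 for key, value in current.items() if stored.get(key) == value)
    return matches / len(current)


def best_match(current: dict[str, str], case_base: list[Case]) -> Case:
    """Return the stored case with the highest similarity to the current case."""
    return max(case_base, key=lambda case: similarity(current, case.features))


# Hypothetical fault-diagnosis cases, invented purely for illustration.
CASE_BASE = [
    Case("case 1", {"power_light": "off", "fan": "silent", "screen": "blank"},
         cause="failed power supply", treatment="replace power supply"),
    Case("case 2", {"power_light": "on", "fan": "running", "screen": "blank"},
         cause="loose video cable", treatment="reseat video cable"),
    Case("case 3", {"power_light": "on", "fan": "noisy", "screen": "normal"},
         cause="worn fan bearing", treatment="replace fan"),
]

current_case = {"power_light": "on", "fan": "running", "screen": "blank"}
match = best_match(current_case, CASE_BASE)
print(match.name, "-", match.cause, "->", match.treatment)
```

Returning the whole stored case, rather than just its treatment, keeps the matched case available for the user to inspect.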
[Figure: the current case (284742) is compared in turn with stored Cases 1 to 5 (134752, 135753, 134744, 284743, 144702).]

The features that have been identified as important in the stored cases, and which the user is asked about, are known as “indices”. Each has a value. In the example above, each was represented by a number.

If necessary, the best-matching case is adapted so that it is a better match for the current circumstances. The case is then presented as the solution, with the opportunity to examine the 'precedent' case.

How a CBR system works: the sequence of operations

For a simple CBR system:
1) assign indices
2) retrieve a similar case

[Flow chart for a simple CBR system. Steps: Input, 1. Assign indices, 2. Retrieve, Output. Knowledge sources: indexing rules, case memory, similarity metrics.]

For a “full-blown” CBR system:
1) assign indices
2) retrieve a similar case
3) modify the past case
4) test the case
5a) assign indices to this new case, and store it as a working solution
OR
5b) explain the failure, repair the solution, and test again.

[Flow chart for a full-blown CBR system. Steps: Input, 1. Assign indices, 2. Retrieve, 3. Modify, 4. Test; then, for a working solution, 5a. Assign indices and 5b. Store; or, for a failed solution, 6a. Explain and 6b. Repair. Knowledge sources: indexing rules, case memory, similarity metrics, modification rules, repair rules.]

Available techniques for case memory organisation

Memory organisation by:
- linear ("flat") case memory
- case hierarchy
- nested cases
- decision-tree orientated memory
- knowledge-guided indexing

Available techniques for case retrieval

Retrieval by:
- nearest neighbour case matching
- weighted nearest neighbour case matching
- decision tree methods
- knowledge-guided retrieval

The last four memory organisation approaches, and the last two retrieval approaches, might be thought of as hybrid systems.

“Nearest neighbour” algorithm: an example

Suppose that we have a sick soyabean plant, and we wish to discover which of a number of known specimens of sick soyabean plants it is most like.

Choose (let's say) three characteristics of the leaves that can be represented as numbers:
- the amount of the leaf that is covered by the discolouration
- the lightness of the discoloured parts of the leaf
- the lightness of the remaining parts of the leaf
Suppose that the first two cases to be matched are:

case 1: coverage = 8, lightness-1 = 4, lightness-2 = 6
case 2: coverage = 10, lightness-1 = 7, lightness-2 = 6

These can be treated as two points in three-dimensional space:

x, y, z coordinates of case 1: (8, 4, 6)
x, y, z coordinates of case 2: (10, 7, 6)

[Figure sequence: a system of 3-dimensional co-ordinates with x, y and z axes running from 0 to 10, showing in turn the 1st case as a point, the 2nd case as a point, the two cases together, the distance between the two cases, and a third case added at (2, 3, 9).]

There is a simple formula that tells you the distance between two points in 3-dimensional space. To find out whether case 1 is more similar to case 2 or to case 3, you simply calculate the two distances, and pick the smaller of the two.

To find out which of a whole series of cases case 1 is most similar to, calculate the distance from case 1 to each of them, and pick the smallest figure.

Suppose there were 4 features, or 7, or 100? Would you have to draw 4-dimensional or 7-dimensional or 100-dimensional graphs?
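The comparison of case 1 with cases 2 and 3 can be worked through numerically. The following is a minimal sketch, assuming that the "simple formula" is the ordinary Euclidean distance (the square root of the sum of squared differences); the function names `distance` and `nearest` are illustrative only.

```python
import math

# Case coordinates from the example: (coverage, lightness-1, lightness-2).
case_1 = (8, 4, 6)
case_2 = (10, 7, 6)
case_3 = (2, 3, 9)


def distance(a, b):
    """Euclidean distance between two cases with the same number of features."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(target, candidates):
    """Return the (name, features) pair that lies closest to the target case."""
    return min(candidates.items(), key=lambda item: distance(target, item[1]))


candidates = {"case 2": case_2, "case 3": case_3}
print(round(distance(case_1, case_2), 2))  # 3.61
print(round(distance(case_1, case_3), 2))  # 6.78
print(nearest(case_1, candidates)[0])      # case 2
```

On these figures case 1 is closer to case 2 than to case 3. The `distance` function sums over however many features each case has, which bears directly on the question of whether higher-dimensional graphs would have to be drawn.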
No, it is only necessary to have a formula for calculating distances in 4-, or 7-, or 100-dimensional space, and such formulae are readily available.

Case adaptation

"Fixing" inconsistencies between diagnosis and symptoms. Techniques:
- the end user does it
- knowledge-based (qualitative reasoning, etc.)
- a fixed procedure

Note that there is a problem about updating the case-base with adapted cases. Since the new case isn't exactly like any of the cases in the case-base, it can't really be said to have been solved by the expert judgement that was used to build the case-base in the first place. There is a real chance that the conclusion the system came to is wrong in this case. If wrongly concluded cases are added to the case-base, it becomes progressively degraded.

Typically, the procedure is to put fresh cases into a special file, and have the Domain Expert pass judgement on them before they are added to the case-base.

Appropriate domains

CBR is suitable:
- when the domain is broad but shallow
- when experience rather than theory is the primary source of information
- when the requirement is for the best available solution, rather than a guaranteed exact solution
- when solutions are reusable, rather than unique to each situation

Example of a successful system

CBR is particularly used for help-desk applications, for instance the COMPAQ SMART system.

The problem was that:
- Thousands of customers were calling Compaq directly every day, requesting support.
- Many of the staff were new; there was a major training problem.
- There was a need for consistent and accurate answers and responses.
- There was a need for retention of corporate knowledge.

The COMPAQ SMART system, once developed and installed, succeeded in solving 85-95% of calls. Typical time to solve a problem was less than 2 minutes.

Advantages of CBR

Case-based reasoning:
- tends to focus on the problem's essential features
- can solve problems in domains that are only partially understood
- can provide solutions when no algorithmic method is available
- can interpret open-ended and ill-defined concepts

Steps in building a case-based reasoning system

1. Obtain data for cases.
2. Design cases based on data.
3. Determine the case memory structure.
4. Decide the case retrieval method.
5. Decide whether a case adaptation procedure is appropriate (and, if so, implement it).
6. Develop the rest of the system (e.g. the user interface).

Some currently-available CBR tools (with vendors)

- Esteem (Esteem Software)
- CBR Express & CBR v.2.0 (Inference)
- ReMind (Intelligent Applications, Cognitive Systems)
- ReCall (ISoft)
- KATE-CBR (Acknosoft)

Some of these are UK products, some American, some French.

Example of a large CBR project: the Cassipoée system

The system used a combination of inductive and CBR techniques. It was written using KATE-CBR, by AcknoSoft of Paris, on behalf of an engineering firm owned by General Electric and SNECMA. It is a diagnostic system for aircraft engines: CFM 56-3 engines in Boeing 737s and Airbus A340s.

The cases came from a legacy database of 23000 engine maintenance reports, built up over 8 years. Experienced engineers worked over the cases, eliminating duplicates and items where there was no diagnosis or a mis-diagnosis. This left 16000 cases, each with up to 100 features.

Case selection was by a decision tree, generated from the cases. The tree directed the questioning of the user, so as to build up the set of symptoms used to select cases.

Extra features:
- Integrated with an Illustrated Part Catalogue
- Generates reports of reliability and maintainability using EXCEL
- Uses e-mail to collect maintenance reports world-wide
Success of the Cassipoée system:
- Very fast diagnosis: diagnosis time was reduced by 50%
- Won 1st prize for innovative software applications at the European XPS show, Germany, March 1995

A note on knowledge acquisition

In rule-based reasoning, knowledge is extracted from experts and encoded in rules. This is often difficult to do. In case-based reasoning, most (but not all) knowledge is in the form of cases.

Case-based reasoners also need the same semantic knowledge that rule-based reasoners need. In addition, case-based reasoners need adaptation rules and similarity metrics: more types of knowledge, but perhaps knowledge that is easier to acquire.

Several recent studies point to the relative ease with which case-based reasoners can be built as compared to building the same rule-based systems (Kolodner, 1993, p.94).

In one study, the Digital Equipment Corporation commissioned two systems (for customer technical support), with equivalent functionality. One, called CANASTA, was rule-based; the other, called CASCADE, was case-based.

CANASTA took 960 days of development time; CASCADE required 105 days. However, the personnel required for the CANASTA development were more valuable than those required for CASCADE; if one takes account of this, the development of CANASTA took the equivalent of 1600 days, and CASCADE the equivalent of 193 days.

The accuracy and efficiency of the two systems were reckoned to be equivalent.
The continuing maintenance costs of CANASTA were high, while those of CASCADE were negligible (Simoudis, 1991 & 1992).

A comparison between rule-based & case-based reasoning

Knowledge unit
  Rule-based reasoning: Rule
  Case-based reasoning: Case

Granularity
  Rule-based reasoning: Fine
  Case-based reasoning: Coarse

Knowledge acquisition
  Rule-based reasoning: Obtaining rules & hierarchies
  Case-based reasoning: Obtaining cases & hierarchies

Explanation mechanism
  Rule-based reasoning: Backtrace of rules fired
  Case-based reasoning: Precedent cases

Characteristic output
  Rule-based reasoning: Answer + confidence measure
  Case-based reasoning: Answer + precedent cases

Knowledge transfer
  Rule-based reasoning: Potentially high
  Case-based reasoning: Low

Domain requirements
  Rule-based reasoning: Domain vocabulary, good set of inference rules, rules which hold throughout domain
  Case-based reasoning: Domain vocabulary, case-base of example cases, stability: modified cases still hold

Advantages
  Rule-based reasoning: Flexible use of knowledge, potentially optimal answers
  Case-based reasoning: Rapid knowledge acquisition, explanation by example

Disadvantages
  Rule-based reasoning: Computationally expensive, long development time, impenetrable explanations
  Case-based reasoning: Suboptimal solutions, redundancy in knowledge base