Rule Based Systems

Rule-based systems / knowledge-based systems / expert systems have played, and still play, an important role in the AI industry.

A 1993 report by John Durkin covers over 2500 developed expert systems.

Application areas: Agriculture, Business, Chemistry, Communications, Computer Systems, Education, Electronics, Engineering, Environment, Geology, Image processing, Information Management, Law, Manufacturing, Mathematics, Medicine, Meteorology, Military, Mining, Power Systems, Science, Space Technology, Transportation

Types of systems: Rule Based, Frame Based, Fuzzy Logic, Case Based, Neural Network

Architecture of a typical expert system
[Figure: architecture of a typical expert system, with the following components]
• User
• User interface: question-and-answer, menu driven, natural language, graphic interface
• Expert system shell
• Knowledge-base editor
• Inference engine
• Explanation subsystem
• Knowledge base: general knowledge base, case-specific data

AI in Medicine (USA 1970)
• Stanford MYCIN - blood infections
• Rutgers CASNET - causal reasoning
• MIT PIP - renal disease
• Stanford
• Pittsburgh Internist - internal medicine
”The primary goal of this field is to develop computer programs that perform efficiently and are able to explain their reasoning and conclusions to their users.”

MYCIN’s knowledge base
• About 400 diagnostic rules
• About 5 therapy rules

Why MYCIN?
• Diagnose likely infecting organisms in blood and meningitis infections
• Use test results and information about the patient supplied by the doctor
• Prescribe an effective antibiotic treatment
• Do this early in the course of the disease, before all possible information is available
• To counteract:
  - overuse of antibiotics
  - irrational use of antibiotics
  - maldistribution of expertise

MYCIN system for diagnosis of meningitis and bacteremia (bacterial infections)
IF   the site of the culture is blood, and
     the identity of the organism is not known with certainty, and
     the stain of the organism is gramneg, and
     the morphology of the organism is rod, and
     the patient has been seriously burned
THEN there is weakly suggestive evidence (0.4) that
     the identity of the organism is pseudomonas

MYCIN diagnosis rule (2)
IF   the site of the culture is blood, and
     the stain of the organism is gramneg, and
     the morphology of the organism is rod, and
     the patient is a compromised host
THEN there is suggestive evidence (0.6) that
     the identity of the organism is pseudomonas-aeruginosa

MYCIN diagnosis rule (3)
Rule 3
IF   (1) stain of organism is gram-positive and
     (2) morphology of organism is coccus and
     (3) growth-conformation of the organism is clumps
THEN there is suggestive evidence (0.7) that
     identity of organism is staphylococcus.

MYCIN diagnosis rule (3) (CEFAX notation)
rule 3
if   stain of organism is gram_positive and
     morphology of organism is coccus and
     growth_conformation of organism is clumps
then 0.7 certainty identity of organism is staphylococcus.
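The CEFAX-style rule above can be pictured as a small data structure. The following Python sketch is only an illustration and not how MYCIN or CEFAX store rules: the names Rule, RULE_3 and apply_rule are hypothetical, and premises are matched as exact attribute/value pairs with no uncertainty on the facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one 'if ... then CF certainty ...' rule."""
    number: int
    premises: tuple      # (attribute, value) pairs that must all hold
    conclusion: tuple    # (attribute, value) pair that is concluded
    certainty: float     # rule parameter, e.g. 0.7


# Rule 3 from the slide, written as data.
RULE_3 = Rule(
    number=3,
    premises=(
        ("stain", "gram_positive"),
        ("morphology", "coccus"),
        ("growth_conformation", "clumps"),
    ),
    conclusion=("identity", "staphylococcus"),
    certainty=0.7,
)


def apply_rule(rule, facts):
    """Return (attribute, value, certainty) if every premise matches, else None.

    Assumption: facts are known for certain; certainty handling of the
    premises themselves is left out of this sketch.
    """
    if all(facts.get(attr) == value for attr, value in rule.premises):
        attr, value = rule.conclusion
        return (attr, value, rule.certainty)
    return None
```

With the facts stain = gram_positive, morphology = coccus and growth_conformation = clumps, apply_rule(RULE_3, facts) yields the conclusion about staphylococcus together with the rule parameter 0.7; if any premise fails, it yields None.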
MYCIN: Therapy Selection Rule
IF   you are considering giving chloramphenicol, and
     the patient is less than 1 week old
THEN it is definite (1.0) that chloramphenicol is contraindicated for this patient
[Justification: Newborn infants may develop vasomotor collapse due to an immaturity of the liver and kidney functions resulting in decreased metabolism of chloramphenicol]

How does MYCIN create confidence in the user?
• Answering ”Why?” (Why did you ask that?)
• Answering ”How?” (How did you arrive at that conclusion?)
• Answering ”Why not X?” (Why did you not consider X?)
MYCIN’s simple rule format and friendly explanations in ”English” are the key.

MYCIN Explanation
User: Why didn’t you consider Streptococcus as a possibility for Organism-1?
MYCIN: The following rule could have been used to determine that the identity of Organism-1 was streptococcus: Rule 33.
But Clause 2 (”the morphology of the organism is Coccus”) was already known to be false for Organism-1, so the rule was never tried.

How MYCIN looks to the user: Therapy recommendation
[REC-1] My preferred therapy recommendation is as follows:
In order to cover for items <1 2 3 4 5>:
Give the following in combination:
1: Kanamycin
   Dose: 750 mg (7.5 mg/kg) q12h IM (or IV) for 28 days
   Comments: Modify dose in renal failure
2: Penicillin
   Dose: 2,500,000 units (2500 units/kg) q4h IV for 28 days

EMYCIN and expert system shell
MYCIN was later developed further and separated into two parts:
• An expert system shell, EMYCIN (empty MYCIN)
• A knowledge base
The expert system shell EMYCIN is ”the mother of all expert system shells”.
One simplified version, called CEFAX, is implemented at NTNU in Prolog.

Rule based system as a reasoning system
If we disregard the uncertainties in MYCIN, the system can be regarded as a logical inference system, where the explanation is the proof tree of the reasoning.

A bank clerk shall approve loans for customers.
He collects the basic information about the customer, which is represented as a set of variables:

Basic variables
APP    (The appraisal on the collateral is greater than the loan amount)
RATING (The applicant has a good credit rating)
INC    (The applicant’s income exceeds his expenses)
BAL    (The applicant has an excellent balance sheet)

Derived variables
OK     (The loan should be approved)
COLLAT (The collateral for the loan is satisfactory)
PYMT   (The applicant is able to make the loan payments)
REP    (The applicant has a good financial reputation)

The loan approval rule base
He follows the bank’s guidelines, which can be stated as follows:
(1) COLLAT ∧ PYMT ∧ REP → OK
(2) APP → COLLAT
(3) RATING → REP
(4) INC → PYMT
(5) BAL ∧ REP → OK

A loan application case
Mr John Smith has applied for a loan. His case is:
APP is true, INC is true, RATING is true, BAL is true
That his loan can be approved follows from the following proof trees (rule numbers in parentheses):

OK (1)
   COLLAT (2)
      APP
   PYMT (4)
      INC
   REP (3)
      RATING

OK (5)
   BAL
   REP (3)
      RATING

The loan approval as a rule based system
rule 1
if   collat is true and
     pymt is true and
     rep is true
then 1.0 certainty ok is true.

rule 2
if   app is true
then 1.0 certainty collat is true.

rule 3
if   rating is true
then 1.0 certainty rep is true.

rule 4
if   inc is true
then 1.0 certainty pymt is true.

rule 5
if   bal is true and
     rep is true
then 1.0 certainty ok is true.

A loan approval “real” dialog
User: Why do you believe that the loan should be approved?
System: Because the collateral for the loan is satisfactory, and the applicant is able to make the loan payments, and the applicant has a good financial reputation.
User: Why do you believe that the applicant has a good financial reputation?
System: Because the credit-rating service said that the applicant has a good credit rating.
User: Why did you ask me about whether or not the applicant’s income exceeds his expenses?
System: Because I was trying to establish whether or not the applicant is able to make the loan payments.
User: Why are you trying to establish whether or not the applicant is able to make loan payments?
System: Because that is one of the criteria for whether or not the loan should be approved.
User: Why are you trying to establish whether or not the loan should be approved?
System: Because you asked me to.

An expert system loan approval dialog
?- eval ok.
What is the certainty of app is true --> 1.0.
What is the certainty of inc is true --> why.
I ask because
rule 4
if   inc is true
then 1.00 certainty pymt is true .
I ask because
rule 1
if   collat is true and
     pymt is true and
     rep is true
then 1.00 certainty ok is true .
The last conclusion was your original problem .
What is the certainty of inc is true --> 1.0.
What is the certainty of rating is true --> 1.0.
What is the certainty of bal is true --> 1.0.
ok is true IS PROVED ( 1.00)
(help./ how./ ok./ quit./ ) ?

Loan approval proof
Dialog and proof of loan approval are shown here: PRO/loanapproval.txt PRO/loanproof.txt

ok is true IS PROVED ( 1.00)
(help./ how./ ok./ quit./ ) ?
|: how.
BECAUSE
contributions
by rule 1 :
ok is true IS PROVED ( 1.00)
BECAUSE
   collat is true IS PROVED ( 1.00)
   BECAUSE
   by rule 2 :
   collat is true IS PROVED ( 1.00)
   BECAUSE
      app is true IS PROVED ( 1.00)
      BECAUSE
      app is true is given
AND
   pymt is true IS PROVED ( 1.00)
   BECAUSE
   by rule 4 :
   pymt is true IS PROVED ( 1.00)
   BECAUSE
      inc is true IS PROVED ( 1.00)
      BECAUSE
      inc is true is given
AND
   rep is true IS PROVED ( 1.00)
   BECAUSE
   by rule 3 :
   rep is true IS PROVED ( 1.00)
   BECAUSE
      rating is true IS PROVED ( 1.00)
      BECAUSE
      rating is true is given
by rule 5 :
ok is true IS PROVED ( 1.00)
BECAUSE
   bal is true IS PROVED ( 1.00)
   BECAUSE
   bal is true is given
AND
   rep is true IS PROVED ( 1.00)
   BECAUSE
   by rule 3 :
   rep is true IS PROVED ( 1.00)
   BECAUSE
      rating is true IS PROVED ( 1.00)
      BECAUSE
      rating is true is given

The Certainty Factor model for uncertainty handling
Rule 3
IF   (1) stain of organism is gram-positive and
     (2) morphology of organism is coccus and
     (3) growth_conformation of the organism is clumps
THEN there is suggestive evidence (0.7) that
     identity of organism is staphylococcus.
The uncertainty model is based on certainties, which are numbers between -1 and +1.
The example value 0.7 is a rule parameter that modifies the certainty of the conclusion.

Uncertainty vs Ignorance
[Figure: the interval from 0 to 1 divided into MB, MI and MD]
The origin is based on belief intervals.
Measure of Belief        MB in [0.0, 1.0]
Measure of Disbelief     MD in [0.0, 1.0]
Certainty Factor         CF = MB - MD, in [-1.0, +1.0]
Measure of Ignorance     MI = 1.0 - (MB + MD)
Measure of inconsistency MB + MD - 1.0 (= -MI)
Uncertainty: Is it raining in Trondheim tomorrow?
Ignorance: Is it raining in Kuala Lumpur tomorrow?
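The difference between uncertainty and ignorance can be made concrete with the definitions above. The following Python sketch is a minimal illustration: the function name belief_summary and the MB/MD numbers chosen for the two weather questions are assumptions made for the example, not values from the slides.

```python
def belief_summary(mb, md):
    """Compute CF, ignorance (MI) and inconsistency from MB and MD.

    Uses the definitions from the slide:
      CF = MB - MD,  MI = 1 - (MB + MD),  inconsistency = MB + MD - 1 (= -MI).
    """
    if not (0.0 <= mb <= 1.0 and 0.0 <= md <= 1.0):
        raise ValueError("MB and MD must both lie in [0, 1]")
    return {
        "CF": mb - md,
        "MI": 1.0 - (mb + md),
        "inconsistency": mb + md - 1.0,
    }


# Assumed numbers: we know Trondheim weather well, but the outcome is open.
trondheim = belief_summary(mb=0.5, md=0.5)

# Assumed numbers: we have no evidence at all about Kuala Lumpur.
kuala_lumpur = belief_summary(mb=0.0, md=0.0)

print(trondheim)     # CF is 0, MI is 0: uncertain, but not ignorant
print(kuala_lumpur)  # CF is 0, MI is 1: complete ignorance
```

Both cases give CF = 0, so a single CF value cannot tell them apart; only MI separates an open question from a question nobody has evidence about.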
Statistical interpretations

Characteristics                          Values
Ranges                                   0 <= MB <= 1,  0 <= MD <= 1,  -1 <= CF <= 1
Certain true hypothesis, P(H|E) = 1      MB = 1,  MD = 0,  CF = 1
Certain false hypothesis, P(-H|E) = 1    MB = 0,  MD = 1,  CF = -1
Lack of evidence, P(H|E) = P(H)          MB = 0,  MD = 0,  CF = 0
Contradictory evidence                   MB = 1,  MD = 1,  CF = 0

Manipulation of CF-values
Usually, we use only one CF value, so we don’t distinguish between ignorance and inconsistency.

CF rule principle
if P then CF certainty Q
CF(P)  computed
CF     defined (rule parameter)
CF(Q)  computed:
CF(Q) = CF(P) * CF   if CF(P) > 0
CF(Q) = 0            otherwise

Antecedent Combination Rule
The CF values of the premises are combined:
If A and B then CF C        (CF = 0.6)
CF(A) = 0.5, CF(B) = 0.7
CF(A and B) = min(CF(A), CF(B)) = 0.5
CF(C) = CF * CF(A and B) = 0.3
Similarly
CF(A or B) = max(CF(A), CF(B))
CF(not B) = -CF(B)

Serial Combination Rule
The CF values are chained together with the rule applications:
CF(AA) = 0.5
IF AA THEN CF1 B   (CF1 = 0.7)   => CF(B) = 0.35
IF B  THEN CF2 C   (CF2 = 0.3)   => CF(C) = 0.105

Parallel combination rule
Accumulation of CF values, contributions from several rules:
(1) IF AA1 then xx R   (CF(R1) = P)
(2) IF AA2 then yy R   (CF(R2) = Q)
CF(R) = P + Q - P*Q    (in the simple case)

Motivation for Parallel rule
Suppose B1 and B2 are two independent stochastic variables, and that
B = B1 or B2
Then
P(B) = P(B1 or B2) = P(B1) + P(B2) - P(B1 and B2) = P(B1) + P(B2) - P(B1)*P(B2)
which corresponds to the rule
CF(R) = CF(R1) + CF(R2) - CF(R1)*CF(R2)

The complete parallel rule
CFparallel(CF1, CF2) =
   CF1 + CF2 - CF1*CF2                       (CF1, CF2 > 0)
   CF1 + CF2 + CF1*CF2                       (CF1, CF2 < 0)
   (CF1 + CF2) / (1 - min(|CF1|, |CF2|))     (CF1*CF2 < 0)

Motivation for complete parallel rule
Historically, the parallel rule for CF values of opposite sign was just
CF = CF1 + CF2
e.g.
CF1 = 0.999 (damn sure), CF2 = -0.799 => CF = 0.2, which is unreasonably low.
The revised rule gives CF = 0.995 (almost damn sure).
BUT the old rule also had the defect that it was not associative and not commutative.

Mathematical properties of the revised parallel rule
The CF parallel combination rule has some very nice (and obviously required) mathematical properties:
- it is associative, i.e. evidence may be grouped arbitrarily
- it is commutative, i.e. the sequence of evidence is irrelevant
- it has a zero element (CF = 0) that has no effect
- it is symmetric, i.e. equal but opposite evidence cancels out
However, the CF parallel combination rule is not idempotent:
C + C - C*C > C   (if 0 < C < 1)
(If you repeat the same weakly supported postulate sufficiently often, it will be regarded as certain after a while. :-)
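The combination rules above are small enough to try out directly. The following Python sketch is one possible reading of them, not the CEFAX implementation: the function names are hypothetical, the case where one CF is exactly 0 is folded into the same-sign branches, and combining CF = +1 with CF = -1 is treated as an error because the revised rule would divide by zero.

```python
def cf_and(*cfs):
    """Antecedent combination: a conjunction is as certain as its weakest part."""
    return min(cfs)


def cf_or(*cfs):
    """A disjunction is as certain as its strongest part."""
    return max(cfs)


def cf_not(cf):
    """Negation flips the sign of the certainty."""
    return -cf


def cf_serial(cf_premise, cf_rule):
    """Serial combination: the rule parameter scales a positively believed premise."""
    return cf_premise * cf_rule if cf_premise > 0 else 0.0


def cf_parallel(cf1, cf2):
    """Revised parallel combination of two contributions to the same conclusion."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 - cf1 * cf2
    if cf1 <= 0 and cf2 <= 0:
        return cf1 + cf2 + cf1 * cf2
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        # Assumption of this sketch: +1 against -1 has no defined result.
        raise ValueError("cannot combine CF = +1 with CF = -1")
    return (cf1 + cf2) / denominator


# Examples from the slides.
print(round(cf_serial(cf_and(0.5, 0.7), 0.6), 3))        # 0.3
print(round(cf_serial(cf_serial(0.5, 0.7), 0.3), 3))     # 0.105
print(round(cf_parallel(0.999, -0.799), 3))              # 0.995

# Not idempotent: repeating the same weak evidence keeps raising the CF.
cf = 0.3
for _ in range(10):
    cf = cf_parallel(cf, 0.3)
print(cf > 0.9)                                          # True
```

The loop at the end shows the effect noted in the last remark: ten repetitions of evidence with CF = 0.3 push the combined certainty above 0.9, even though no new information was added.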