Knowledge Systems
• Knowledge Systems use formal representations of knowledge to answer unanticipated questions with coherent explanations
• Knowledge System = KB + Q/A + Explanation Generator + Knowledge Acquisition tools

Advances over Expert Systems
• Coverage of domain, not domain task
• Various modes of reasoning, well integrated
• Domain level explanation
• Rapid construction
Just how advanced are they?

Project Halo*
• Long term: build a Knowledge System encompassing much of the world's scientific knowledge
• Short term: assess current technologies
• Use a portion of the Advanced Placement (AP) chemistry exam as a metric
* Full support for Project Halo was provided by Vulcan Inc, Seattle, WA

Challenges
• Systems must be robust in the face of widely varying, unanticipated questions.
• Explanations are as important as correctness.
• Hard-ball evaluation, aimed to expose weaknesses.
• New domain and short development time require using off-the-shelf KR&R methods and systems.
It was not clear at the outset that these challenges could be met.

Example Questions
• The spectator ions in the reaction of barium nitrate with sodium sulfate are what? (choices)
• Although nitric acid and phosphoric acid have very different properties as pure substances, their aqueous solutions possess many common properties. List some general properties of these solutions and explain their common behavior in terms of the species present.
• Explain why a solution of HClO4 and NaClO4 cannot act as a buffer solution.
• Sodium azide is used in air bags to rapidly produce gas to inflate the bag. The products of the decomposition reaction are what? (choices)
Questions were manually encoded in our formal language

Because Questions Vary Widely…
… we cannot anticipate the questions, or even the types of questions, so a retrieval method won't do. A custom inference method won't do either. The system must be capable of using its knowledge in unanticipated ways.

An Example Explanation
• What are the products of the given decomposition reaction?
• By definition, oxidation-reduction reactions occur when electrons are transferred from the atom that is oxidized to the atom that is reduced. We need to look for changes in the oxidation states of the elements in the reaction.
• In the reactants, the oxidation state(s) of the element Na is/are (1). In the product, the oxidation state(s) is/are (0)
• Therefore, the reaction causes a change in oxidation state.
• Therefore, this is an oxidation reduction reaction.
• By definition, a Binary Ionic-Compound Decomposition Reaction occurs when a binary ionic compound is heated.
• Therefore, this reaction is a Binary-Ionic Compound Decomposition reaction.
• In general, a Binary Ionic-Compound Decomposition Reaction converts a binary ionic-compound into basic elements.
• In this reaction, NaN3 reacts to produce Na and N2.
• The products of the decomposition reaction are: (d) Sodium and nitrogen-g

Our KR&R System*
• KM: KRL-like frame system with FOL semantics.
• …able to represent:
– classes, instances, prototypes
– defaults, fluents, constraints
– (hypothetical) situations
– actions (pre-, post-, and during- conditions)
• …and reason about:
– inheritance with exceptions
– constraints
– automatic classification (given a partial description of an instance, determine the classes to which it belongs)
– temporal projection ("my car is where I left it")
– effects of actions
• KM answers questions by interleaving two types of inference:
– Automatic classification
– Backward chaining
* Details: AAAI'97

Structure of the Knowledge Base*
Two principal types of chemistry knowledge:
– terms, e.g. "binary ionic compound"
– laws, e.g. problem-solving method for computing products of reactions of binary-ionic compounds
Terms are encoded as definitions to enable automatic classification. Laws are encoded as rules to enable backward chaining.
* Details: KR'04 (Barker et al.)
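
The split between terms and laws can be illustrated with a small sketch. This is Python written for illustration only, not KM syntax and not the project's actual knowledge base: the names `Instance`, `Law`, `DEFINITIONS`, `LAWS`, `classify`, and `solve`, and the tiny element tables, are all hypothetical. Terms are modeled as definitions that a classifier tests against a partial description, laws as rules that a backward chainer selects by goal, and the two are interleaved by reclassifying before each rule lookup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional, Set

# Hypothetical, heavily simplified chemistry data (assumption, not from KM).
METALS = {"Na", "K", "Ba"}
NONMETALS = {"N", "Cl", "O", "S", "C"}
ELEMENTAL_FORM = {"Na": "Na", "N": "N2", "Cl": "Cl2", "O": "O2"}


@dataclass
class Instance:
    name: str
    elements: FrozenSet[str]
    classes: Set[str] = field(default_factory=set)


# Terms: each definition is a sufficient condition for class membership.
DEFINITIONS: Dict[str, Callable[[Instance], bool]] = {
    "BinaryCompound": lambda x: len(x.elements) == 2,
    "BinaryIonicCompound": lambda x: (
        "BinaryCompound" in x.classes
        and bool(x.elements & METALS)
        and bool(x.elements & NONMETALS)
    ),
}


def classify(x: Instance) -> Set[str]:
    """Automatic classification: add classes until nothing new applies."""
    changed = True
    while changed:
        changed = False
        for cls, test in DEFINITIONS.items():
            if cls not in x.classes and test(x):
                x.classes.add(cls)
                changed = True
    return x.classes


@dataclass
class Law:
    name: str
    goal: str                # what the law computes
    applies_to: str          # class the instance must belong to
    needs: List[str]         # subgoals that must be solved first
    method: Callable[[Instance, dict], list]


# Laws: rules indexed by the goal they can establish.
LAWS = [
    Law("Binary ionic compound decomposition", "products",
        "BinaryIonicCompound", ["basic-elements"],
        lambda x, vals: vals["basic-elements"]),
    Law("Elements in basic form", "basic-elements",
        "BinaryCompound", [],
        lambda x, vals: sorted(ELEMENTAL_FORM[e] for e in x.elements)),
]


def solve(goal: str, x: Instance, trace: List[str]) -> Optional[list]:
    """Backward chaining from a goal, reclassifying x before rule lookup."""
    classify(x)
    for law in LAWS:
        if law.goal != goal or law.applies_to not in x.classes:
            continue
        vals = {}
        for sub in law.needs:
            value = solve(sub, x, trace)
            if value is None:
                break  # a subgoal failed; try the next law
            vals[sub] = value
        else:
            trace.append(law.name)  # record the step for an explanation
            return law.method(x, vals)
    return None


nan3 = Instance("NaN3", frozenset({"Na", "N"}))
steps: List[str] = []
print(solve("products", nan3, steps), steps)
```

The `steps` list records which laws fired and in what order, which is the kind of raw material an explanation generator would need; turning it into the prose shown on the example-explanation slide is outside what this sketch attempts.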

The Content of a Chemistry Law
Concentration of Solute Law
• Context (the conditions under which the law applies):
  a mixture M such that:
    volume(M) = V liters
    has-part(M) includes Chemical C such that:
      quantity(C) = Q moles
      concentration(C) = Conc molar
• Input (the subset of variables that must be bound): V, Q
• Output (the subset of variables that will be bound): Conc
• Method (the axioms used to compute values for output variables): Conc ← Q/V

Knowledge Engineering Methodology
Knowledge base built in 4 months:
– Ontological engineering (4 person-months): designed representations, including structure of terms, laws, reactions, solutions, etc.
– Knowledge capture (6 person-months): consolidated 70 textbook pages into 35 pages of terms and laws
– Knowledge encoding (15 person-months): coded in KM 500 types and relations, 150 chemistry laws and 65 terms. Compiled a large test suite which was run daily
– Explanation engineering (3 person-months): augmented the representations of terms and laws with templates

Results of Project Halo*
• After a 4-month development effort, the knowledge systems were sequestered and given a test:
– 165 novel questions: 50 multiple choice; 115 free form response
– Questions translated from English to formal language by each team, then assessed for fidelity by Vulcan and team representatives
• www.projecthalo.com : systems, Q/A, and analyses
* Details: AI Magazine (Winter 2004)

Correctness
• Our system's correctness score corresponds to an AP score of 3, high enough for credit at UCSD, UIUC, and many other universities.
• We've predicted scoring 85% after a 3-month follow-on project.

Explanation Quality

Error Analysis*
We analyzed every point lost. Most deductions were due to errors in domain modeling, that is, mistakes that domain experts would not make. (More later) Some errors were caused by technology problems.
* Details: KR'04 (Friedland et al.)

Problems Due to KR&R Technology
• Explanations too verbose: e.g. passages repeated multiple times with only small variations, where graders expected a general statement that covered them all. Requires explanation planning
• Questions that require reasoning about our representations:
– Calculate the pH of a particular substance. Explain why the result is unreasonable.
– Explain the difference between the subscript "3" and the coefficient "3" in 3HNO3.
– Explain when and why it's OK for a particular chemistry method to use an equation that only approximates the true answer.

Reasoning about Relevance
Hydrofluoric acid is a weak acid, Ka = 6.8 x 10^-4, and yet it is considered to be a very reactive compound. For example, HF dissolves glass. The major reason it is considered highly reactive is:
(a) It is an acid.
(b) It forms H3O+.
(c) It dissociates.
(d) It readily forms very stable fluoride compounds.
(e) It is a weak electrolyte.
All five statements are true. The question requires that the system reason about which of the multiple true statements is most relevant to the claim.

Bottom Line
• Halo I was a rigorous evaluation of current Knowledge System technology.
• In general, the systems were more capable than Vulcan expected.
• The major hurdles to building a Knowledge System for science are errors (in domain modeling) and cost ($10K/page).