Building Knowledge Bases from Reusable Components
Peter Clark
Boeing Applied Research and Technology

Fragment of a Knowledge-Base
[Figure: semantic network centered on Bioremediation. Nodes: Person, Biotechnologist, Soil, Microbes, Oil, Fertilizer, Rate, Amount, Script, Get, Apply, Absorb, Break Down. Relations: isa, environment, agent, remediator, patient, pollutant, contains, rate, amount, product, absorbed, script, se, then, Q+, Q-, I+, I-.]
Example queries:
• “What steps are involved in bioremediation?”
• “How does pollutant volume affect rate?”
• “What equipment is needed?”
• “What do the microbes do?”
• ...

Potentials…
The growing demand for knowledge processing:
• Growth of on-line, structured information, e.g.
  – XML
  – On-line databases (e.g., commercial, geographic)
  – eCommerce
  – NLP-generated structures
• Requirements for more than fact retrieval, e.g.
  – Search / Information Access (“best search engine wins...”)
  – Knowledge Management
  – NL understanding / MLT / Speech

The Botany KB Experience
• 10 yr effort, 20k concepts, 100k facts
• Supports sophisticated question-answering
  – description
  – prediction
• But:
  – KB still highly incomplete
  – laborious to build/maintain
  – difficult to achieve reuse
→ want a better approach!

Fundamental Problem
• Reliance on manual construction of many specific representations:
  – impractical, unmaintainable
  – can’t anticipate them all
• But:
  – representations contain repeated abstractions
    • production occurs in photosynthesis, mitosis, growth
    • germination includes conversion, production, expansion
• Goal: Capture abstractions in a recomposable way

Representation of Bioremediation
[Figure: the Bioremediation semantic network from the “Fragment of a Knowledge-Base” slide.]

An underlying abstraction...
[Figure: the same network with the Conversion abstraction highlighted. Nodes: Conversion, Rate, Substance (twice), Amount (twice). Relations: rate, rawmaterials, product, amount, Q+, Q-, I+, I-.]

Another abstraction...
[Figure: the same network with the Digest abstraction highlighted. Nodes: Digest, Agent, Substance, Script, Absorb, Break Down. Relations: eater, food, script, agent, patient, absorbed, se, then.]

Another abstraction...
[Figure: the same network with the Treatment abstraction highlighted. Nodes: Treatment, Person, Thing (twice), Script, Get, Apply. Relations: script, agent, patient, applied, se, then.]

A Component-Based Approach
• Represent component abstractions explicitly
• Define concepts as compositions
• Construct representations on demand to answer questions

KB Architecture
• Component theories = abstract, reusable models
• Definitions = specifications of compositions
• Inference = construct compositions as needed to answer questions

Lessons from a Dictionary...
Move: to Go
Go: to Move
Transport: to Move from one Place to another
Vehicle: a Means for Transporting something
Car: a Vehicle for Passengers
• Most abstract concepts appeal to core, foundational theories
• Specific concepts are defined as compositions of abstract concepts

1. Component Theories
• A coherent, encapsulated system of concepts and relations
• Contains:
  – ontology (vocabulary of concepts and relations)
  – axioms (rules) relating these
• Provides semantics for these concepts in the KB
• Can layer these theories (define one using others)

Example: Distribution Network
[Figure: a network of Producer, Intermediary and Consumer nodes.]
Ontology: Producer, Intermediary, Consumer, Material, connects, supplied, state
Rules (axioms):
• PRODUCERS produce MATERIAL.
• CONSUMERS can consume MATERIAL.
• A network element may be BLOCKED or UNBLOCKED.
• If an element connects with an UNBLOCKED element, then it has ACCESS to that element.
• A CONSUMER is SUPPLIED if it has ACCESS to a PRODUCER.
• ….

Axiom Representation (example)
“If an element connects to an UNBLOCKED element, then it has ACCESS to that element.”
(i) Semantic network:
  [Figure: e1:Element –connected-element→ e2:Element –state→ Unblocked, implying e1:Element –access-to→ e2:Element.]
(ii) Logic:
  ∀ e1,e2:Element  connected-element(e1,e2) ∧ state(e2,Unblocked) → access-to(e1,e2)
(iii) Implementation (KM):
  (every Element has
    (access-to ((allof (the connected-element of Self)
                 where ((the state of It) = Unblocked)))))

Other component theories...
• Supply-and-demand
• Containment
• Machines
• Production network
• Two-state object
• Transportation
• ...

2. Definitions and Composition
• Definition = specification of a composition
  – a Fuel-Cell is a Producer of Electricity
  – a Bulb is an Electrical Resistor producing Light
  – a Camera is an Image Recording Device
  – a Wire is a Conduit of Electricity
• Automated composition:
  – Elaboration: component supplies info to answer query
  – Classification: recognize concepts in the composition

2. Composition (example)
Composition: Camera = an Image Recording Device
Query: Failure modes of a camera?
[Figure: a Device whose behavior is a Recording whose input is an Image.]
  (Camera has (superclasses (Device)))
  (every Camera has
    (behavior ((a Recording with (input (Image))))))

Component Theory: Devices
[Figure: a Device is a Physobj whose behavior is an Activity with Physobj participants; the Device and its participants each have failure modes (FailureMode).]
  (Device has (superclasses (Physobj)))
  (every Device has
    (behavior ((a Activity)))
    (failure-modes ((the failure-modes of
                      (the participants of (the behavior of Self))))))

Query: Failure modes of a camera?
Sub-query: Participants in its behavior?

Component Theory: Recording
[Figure: a Recording with an input Signal and an output; participants Receptor and Memory-Unit; subevents Receiving and Writing. Relations: input, output, participant, subevents, agent, patient.]
  (Recording has (superclasses (Activity)))
  (every Recording has
    (input ((a Signal)))
    (participants ((a Receptor with (input ((the input of Self))) ...

Run-Time Classification:
Aperture = an Image Receptor
  Aperture
  - inputs an image
  - outputs an image
  - might be blocked (failure mode: Blockage)
  - ...
Film = an Image Memory-Unit
  Film
  - includes a sheet coated with image-sensitive chemical
  - might age (failure mode: Aging)
  - ...
[Figure: the camera network after classification. The Receptor becomes an Aperture (failure mode Blockage); the Memory-Unit becomes a Film (failure mode Aging) with parts Sheet, covering Chemical, sensitive-to Image.]

Query: Failure modes of a camera? Blockage, Aging
Sub-query: Participants in its behavior? Aperture, Film

Demo...
  KM> (a Device with (behavior ((a Recording with (input (Image))))))
  _Device01
  KM> (the behavior of _Device01)
  _Recording01
  KM> (the failure-modes of _Device01)
    failure modes of its participants?                    (from Device)
    what are the participants?
    a Receptor and a Memory-Unit.                         (from Recording)
    [Trace: _Receptor31 classified as a Aperture]         (classification)
    [Trace: _Memory-Unit32 classified as a Film]
    an Aperture and a Film.
    failure modes of an Aperture and a Film?
    Blocking, Aging.                                      (from Aperture and Film)
  (Blocking Aging)
  KM> (the subevents of (the behavior of _Device01))
  [Trace: _Writing45 classified as a Exposing]            (classification)
  (_Receiving45 _Exposing46)
  KM>

Other Compositions...
• Sound Recording Device (tape recorder)
  [Figure: a Device whose behavior is a Recording whose input is Sound.]
• Sound Producing Device (stereo)
• Vibration Recording Device (seismology)
• Idea Recording Device (palmtop)
• etc.

Compound Concepts are Ubiquitous
– Botany:
  • photosynthesis
  • plant material distribution
  • ...
– Aerospace:
  • turbine gearbox assembly
  • case drain fluid
  • …(43k acronyms!)…
– Sentences also:
  • “The aircraft overshot the runway.”
  • “The air-conditioning unit had no power.”
  • ...

Overall Architecture
1. Ontology (conceptual vocabulary)
   Thing, Activity, Physobj, Move, ...
2. Component theories (computational clockwork)
   Distn. Network (Producer, Consumer), Circuit, Movement (Physobj, Move, Location), ...
3. Definitions and Descriptions (describe concepts in terms of others)
   Bulb = Light-producing Electrical Consumer
4. Databases of basic facts (instances)

Prototype KBS: PHaSE Trainer
[Figure: the PHaSE trainer, with labels “Laser source” and “Computer + Screen”.]

PHaSE KB Architecture
1. Ontology
   Thing, ...
2. Component theories
   DAG, Blockable DAG, Discrete events, Process Network, Optical Circuits, Distrn Network, Elec. Circuits, 2-state Object, Machine, ...
3. Definitions and Descriptions
   Carousel = a Revolving Case for Storage
   ...
4. Basic Facts about PHaSE
   PH. Science Checklists, PH. Circuit, PHaSE physical structure

Example Queries
“The parts of the PHaSE control panel?”
  PHaSE-cover-screw, main power switch, PHaSE MOTOR PWR light, …
“The tool for removing the cover screw?”
  4.5mm screwdriver
“The possible malfunctions of the PHaSE MOTOR PWR light?”
  “The motor pwr light has no electricity.” (_Absence23)
  “The filament of the motor pwr light is burned out.” (_Burned-Out24)
  “The carousel motor is tripped.” (_Tripped25)
“The corrective actions for a tripped motor?”
  “Toggle the power switch of the carousel motor.”

PHaSE User Interface
[Figure: screenshots of the PHaSE user interface.]

Application: Product Description/eCommerce
“M8 titanium alloy bolt”
  <product-description>
    <merchant>
      <name>Abalt Ltd</name>
      <location>
        <city>London</city>
        <country>UK</country>
      </location>
    </merchant>
    <product>
      <type>bolt</type>
      <size>M8</size>
      <material>
        <base>titanium</base>
        <alloy>3Al-2.5V</alloy>
      </material>
    </product>
    <cost>
      <amount>0.03</amount>
      <currency>GBP</currency>
    </cost>
  </product-description>
[Figure: the XML description is loaded into the KB, which is then queried:]
• “Low-priced fastener?”
• “no import restrictions?”
• “heat-resistant to 600F?”
• “nearby supplier?”
• ...
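As a rough illustration of how an XML product description might feed a query such as “Low-priced fastener?”, here is a minimal Python sketch. It is not the KM-based system described in the talk. The `SUPERCLASSES` table, the `LOW_PRICE_LIMIT_GBP` threshold and the function names are all hypothetical stand-ins for knowledge that the KB's ontology and component theories would supply.

```python
import xml.etree.ElementTree as ET

PRODUCT_XML = """
<product-description>
  <merchant>
    <name>Abalt Ltd</name>
    <location><city>London</city><country>UK</country></location>
  </merchant>
  <product>
    <type>bolt</type>
    <size>M8</size>
    <material><base>titanium</base><alloy>3Al-2.5V</alloy></material>
  </product>
  <cost><amount>0.03</amount><currency>GBP</currency></cost>
</product-description>
""".strip()

# Assumption: a tiny stand-in for the ontology (concept -> superclass).
SUPERCLASSES = {"bolt": "fastener", "screw": "fastener", "fastener": "hardware"}

# Assumption: an arbitrary threshold for what counts as "low-priced".
LOW_PRICE_LIMIT_GBP = 0.10


def isa(concept, target):
    """Walk up the superclass chain to test whether concept is a kind of target."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUPERCLASSES.get(concept)
    return False


def load_product(xml_text):
    """Extract the basic facts from one product description."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.findtext("product/type"),
        "size": root.findtext("product/size"),
        "base_material": root.findtext("product/material/base"),
        "cost": float(root.findtext("cost/amount")),
        "currency": root.findtext("cost/currency"),
        "city": root.findtext("merchant/location/city"),
    }


def is_low_priced_fastener(product):
    """Answer the query by combining a stored fact (cost) with an inferred one (isa)."""
    return (
        isa(product["type"], "fastener")
        and product["currency"] == "GBP"
        and product["cost"] <= LOW_PRICE_LIMIT_GBP
    )


product = load_product(PRODUCT_XML)
print(product["type"], is_low_priced_fastener(product))
```

The point of the sketch is that the XML only states “bolt”; recognizing it as a fastener requires background knowledge outside the description itself.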
Knowledge Requirements for this Application
Component Theories
  Transportation (transport, location, vehicle, …)
  Commerce (buyer, goods, …)
  Finance (money, account, exchange-rate, …)
  Material physics (temperature, density, …)
  ...
Definitions
  Purchase = an exchange of goods for money
  Delivery = the transport of goods from a seller to a buyer
  ...
Fact databases
  Geography
  Vendors
  Materials
  Part-lists
  ...

Application: Incident DB Search
FAA Flight Incident Database
(1) 950708025099G THE AIRPLANE OVERSHOT THE RUNWAY. STOPPED 40 FEET FROM END.
(2) 961003038219C NUMBER 1 ENGINE FAILED DURING TAKEOFF. RETURNED.
(3) 961211044319C A PASSENGER CUSSED OUT THE FLIGHT ATTENDANT. PASSENGER REMOVED.
(4) 961203043609C BLEW TIRE DURING LANDING.
Example Search Questions:
• “Which events affected the propulsion?” (2)
• “Which events might have damaged the undercarriage?” (1,4)
• “Which events required a mechanic?” (1,2,4)
• ...

Representation (“Specification”) + Composition → Answers
  (a Incident with
    (aircraft ((a Piper-PA-32)))
    (destination (OHare-Airport))
    (event ((a Overshooting with
              (agent ((the aircraft of Self)))
              (target ((the runway of (the destination of Self))))))))

Related Work
• Component-based approaches
  – Compositional Modeling (CML, Xerox)
  – Description Logics (composition)
  – problem-solving methods (KADS)
  – contexts (Cyc)
  – s/w engineering (many! Patterns, Comp. Arch)
• Large-scale KBs
  – Cyc, BKB, TOVE, HPKB
  – WordNet, Pangloss

Summary
• Demand and potential of knowledge processing
• Component-based architecture
  – ontology
  – core theories
  – definitions (specifications of compositions)
  – basic fact libraries
• Staged, evaluable development possible
  – simple, inferred fact delivery…
  – …to a large-scale knowledge resource
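The component-based architecture summarized above can be made concrete with the camera example from the talk. The following is a minimal Python sketch rather than KM, and it is not the author's implementation. The `Instance` class, the `DEFINITIONS` and `FAILURE_MODES` tables and the function names are hypothetical. It also assumes that the Memory-Unit receives the same input as the Recording, which the truncated KM frame in the slides does not show.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """An anonymous individual with a concept name and slot values."""
    concept: str
    slots: dict = field(default_factory=dict)


# Definitions: a specific concept = (general concept, required input).
# Taken from the slides: Aperture = an Image Receptor, Film = an Image Memory-Unit.
DEFINITIONS = {
    "Aperture": ("Receptor", "Image"),
    "Film": ("Memory-Unit", "Image"),
}

# Basic facts attached to the specific concepts (slide names; the demo
# transcript prints "Blocking" instead of "Blockage").
FAILURE_MODES = {"Aperture": ["Blockage"], "Film": ["Aging"]}


def classify(instance):
    """Run-time classification: specialize an instance that matches a definition."""
    for name, (general, required_input) in DEFINITIONS.items():
        if instance.concept == general and instance.slots.get("input") == required_input:
            instance.concept = name
            break
    return instance


def participants(recording):
    """Recording theory: a Receptor and a Memory-Unit take the recording's input.

    Assumption: the Memory-Unit shares the input; the source frame is truncated.
    """
    signal = recording.slots["input"]
    parts = [
        Instance("Receptor", {"input": signal}),
        Instance("Memory-Unit", {"input": signal}),
    ]
    return [classify(part) for part in parts]


def failure_modes(device):
    """Device theory: failure modes are those of the behavior's participants."""
    modes = []
    for part in participants(device.slots["behavior"]):
        modes.extend(FAILURE_MODES.get(part.concept, []))
    return modes


# Composition: Camera = a Device whose behavior is a Recording of an Image.
camera = Instance("Device", {"behavior": Instance("Recording", {"input": "Image"})})
print(failure_modes(camera))
```

Nothing about cameras is stored directly: the answer is assembled on demand from the Device theory, the Recording theory and the two definitions. Changing the input to a different signal would leave the participants unclassified and the failure-mode list empty, because this sketch has no matching definitions for them.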