Building Knowledge Bases from Reusable Components Peter Clark Boeing Applied Research and Technology

Fragment of a Knowledge-Base
[Figure: semantic network for Bioremediation. Concepts include Soil, Person, Biotechnologist, Microbes, Oil, Fertilizer, Rate, and Amount; the script steps are Get, Apply, Break Down, and Absorb; relations include isa, environment, agent, patient, remediator, pollutant, script, then, rate, amount, product, absorbed, and contains, with qualitative influences Q+, Q-, I+, I-.]
Example queries:
• “What steps are involved in bioremediation?”
• “How does pollutant volume affect rate?”
• “What equipment is needed?”
• “What do the microbes do?”
• ...
Potentials…
The growing demand for knowledge processing:
• Growth of on-line, structured information, e.g.
– XML
– On-line databases (e.g. commercial, geographic)
– eCommerce
– NLP-generated structures
• Requirements for more than fact retrieval, e.g.
– Search / Information Access (“best search engine wins...”)
– Knowledge Management
– NL understanding / MLT / Speech
The Botany KB Experience
• 10 yr effort, 20k concepts, 100k facts
• Supports sophisticated question-answering
– description
– prediction
• But:
– KB still highly incomplete
– laborious to build/maintain
– difficult to achieve reuse
⇒ want a better approach!
Fundamental Problem
• Reliance on manual construction of many specific
representations:
– impractical, unmaintainable
– can’t anticipate them all
• But:
– representations contain repeated abstractions
• production occurs in photosynthesis, mitosis, growth
• germination includes conversion, production, expansion
• Goal: Capture abstractions in a recomposable way
Representation of Bioremediation
[Figure: semantic network for Bioremediation, relating Soil, Biotechnologist, Microbes, Oil, Fertilizer, Rate, and Amount to the script steps Get, Apply, Break Down, and Absorb via environment, agent, patient, remediator, pollutant, script, then, rate, amount, product, absorbed, and contains, with qualitative influences Q+, Q-, I+, I-.]
An underlying abstraction...
[Figure: the Bioremediation network with one embedded abstraction picked out: Conversion, which relates rawmaterials and product Substances, their Amounts, and a Rate through qualitative influences (Q+, Q-, I+, I-).]
Another abstraction...
[Figure: the Bioremediation network with a second embedded abstraction picked out: Digest, involving an eater (Agent), a food (Substance), and a script of Break Down then Absorb.]
Another abstraction...
[Figure: the Bioremediation network with a third embedded abstraction picked out: Treatment, involving a Person as agent, a Thing as patient, and a script of Get then Apply.]
A Component-Based Approach
• Represent component abstractions explicitly
• Define concepts as compositions
• Construct representations on demand to answer questions
KB Architecture
• Component theories = abstract, reusable models
• Definitions = specifications of compositions
• Inference = construct compositions as needed to
answer questions.
Lessons from a Dictionary...
Move: to Go
Go: to Move

Transport: to Move from one Place to another
Vehicle: a Means for Transporting something
Car: a Vehicle for Passengers

Most abstract concepts appeal to core, foundational theories
Specific concepts defined as compositions of abstract concepts
1. Component Theories
• A coherent, encapsulated system of concepts & relations
• Contains:
– ontology (vocabulary of concepts and relations)
– axioms (rules) relating these
• Provides semantics for these concepts in the KB
• Can layer these theories (define one using others)
Example: Distribution Network
[Figure: a network in which Producers are linked through Intermediaries to a Consumer.]
Ontology: Producer, Intermediary, Consumer, Material, connects, supplied, state
Rules (axioms):
• PRODUCERS produce MATERIAL.
• CONSUMERS can consume MATERIAL.
• A network element may be BLOCKED or UNBLOCKED.
• If an element connects with an UNBLOCKED element, then it has ACCESS to that element.
• A CONSUMER is SUPPLIED if it has ACCESS to a PRODUCER.
• ….
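The rules above can be sketched in Python. This is a minimal illustration, not the KM encoding used in the slides: the Element class, the function names, and the choice to follow ACCESS links transitively through intermediaries are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """Hypothetical stand-in for a network element frame in the KB."""
    name: str
    kind: str = "Intermediary"   # "Producer", "Intermediary", or "Consumer"
    state: str = "Unblocked"     # "Blocked" or "Unblocked"
    connects: list = field(default_factory=list)

def access_to(element):
    """Rule: an element has ACCESS to each UNBLOCKED element it connects with."""
    return [e for e in element.connects if e.state == "Unblocked"]

def reachable(element, seen=None):
    """Follow ACCESS links transitively (an assumption; the slide leaves this open)."""
    seen = set() if seen is None else seen
    for e in access_to(element):
        if e.name not in seen:
            seen.add(e.name)
            yield e
            yield from reachable(e, seen)

def supplied(consumer):
    """Rule: a CONSUMER is SUPPLIED if it has ACCESS to a PRODUCER."""
    return any(e.kind == "Producer" for e in reachable(consumer))

# Usage: a consumer connected to a producer through one intermediary.
plant = Element("plant", kind="Producer")
pipe = Element("pipe", connects=[plant])
home = Element("home", kind="Consumer", connects=[pipe])
print(supplied(home))   # True
pipe.state = "Blocked"
print(supplied(home))   # False
```

Blocking the intermediary removes the consumer's access path, so the SUPPLIED conclusion is withdrawn without any rule mentioning this particular network.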
Axiom Representation (example)
“If an element connects to an UNBLOCKED element, then it has ACCESS to that element.”
(i) Semantic network:
    e1:Element --connected-element--> e2:Element --state--> Unblocked
    ⇒ e1:Element --access-to--> e2:Element
(ii) Logic:
    ∀ e1,e2:Element  connected-element(e1,e2) ∧ state(e2,Unblocked) → access-to(e1,e2)
(iii) Implementation (KM):
    (every Element has
      (access-to (
        (allof (the connected-element of Self)
         where ((the state of It) = Unblocked)))))
Other component theories...
• Supply-and-demand
• Containment
• Machines
• Production network
• Two-state object
• Transportation
• ...
2. Definitions and Composition
• Definition = specification of a composition
– a Fuel-Cell is a Producer of Electricity
– a Bulb is an Electrical Resistor producing Light
– a Camera is an Image Recording Device
– a Wire is a Conduit of Electricity
• Automated composition:
– Elaboration: component supplies info to answer query
– Classification: recognize concepts in the composition
2. Composition (example)
Composition: Camera = an Image Recording Device
Query: Failure modes of a camera?
[Figure: Device --behavior--> Recording --input--> Image]
(Camera has (superclasses (Device)))
(every Camera has
  (behavior ((a Recording with
               (input (Image))))))
Component Theory: Devices
[Figure: a Device (a Physobj) has a behavior, which is an Activity; the Activity has Physobj participants, each with FailureModes; the Device's failure modes are those of the participants. Applied to the camera: Device --behavior--> Recording --input--> Image, with two as yet unidentified Physobj participants.]
(Device has (superclasses (Physobj)))
(every Device has
  (behavior ((a Activity)))
  (failure-modes (
    (the failure-modes of
      (the participants of
        (the behavior of Self))))))
Query: Failure modes of a camera?
Sub-query: Participants in its behavior?
[Figure: the camera graph; the Recording's two Physobj participants and their FailureModes are still to be determined.]
Component Theory: Recording
[Figure: semantic network for Recording, relating Signal, Receptor, Memory-Unit, Receiving, and Writing via input, output, participant, subevents, agent, and patient.]
(Recording has
  (superclasses (Activity)))
(every Recording has
  (input ((a Signal)))
  (participants (
    (a Receptor with
      (input ((the input of Self)))
...
[Figure: the camera graph elaborated with the Recording theory: the participants are now a Receptor and a Memory-Unit, and the subevents are a Receiving and a Writing.]
Run-Time Classification:
Aperture = an Image Receptor
Aperture
- inputs an image
- outputs an image
- might be blocked (failure mode: Blockage)
- ...
[Figure: the camera graph with the Receptor classified as an Aperture.]
Run-Time Classification:
Aperture = an Image Receptor
Film = an Image Memory-Unit
Film
- includes a sheet coated with image-sensitive chemical
- might age (failure mode: Aging)
- ...
[Figure: the camera graph with the Receptor classified as an Aperture and the Memory-Unit classified as a Film, whose parts include a Sheet covered with a Chemical that is sensitive to the Image.]
Query: Failure modes of a camera? Blockage, Aging
Sub-query: Participants in its behavior? Aperture, Film
[Figure: the completed camera graph, with failure modes Blockage (from Aperture) and Aging (from Film).]
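The camera walk-through can be approximated in a few lines of Python. This is only a sketch of the idea under stated assumptions: the slides use KM, and the dictionary layout, the CLASSIFICATION and FAILURE_MODES tables, and the function names below are hypothetical stand-ins for the Device and Recording theories and for run-time classification.

```python
def recording(input_type):
    """Component theory 'Recording': the participants are a Receptor and a
    Memory-Unit that share the recording's input."""
    return {
        "class": "Recording",
        "input": input_type,
        "participants": [
            {"class": "Receptor", "input": input_type},
            {"class": "Memory-Unit", "input": input_type},
        ],
    }

# Definitions used for classification: (general class, input) -> specific class.
CLASSIFICATION = {
    ("Receptor", "Image"): "Aperture",
    ("Memory-Unit", "Image"): "Film",
}

# Failure modes known for the specific concepts.
FAILURE_MODES = {"Aperture": ["Blockage"], "Film": ["Aging"]}

def classify(instance):
    """Refine an instance's class when it matches a definition."""
    key = (instance["class"], instance["input"])
    instance["class"] = CLASSIFICATION.get(key, instance["class"])
    return instance

def failure_modes(device):
    """Component theory 'Device': collect the failure modes of the
    participants of the device's behavior."""
    modes = []
    for participant in device["behavior"]["participants"]:
        modes.extend(FAILURE_MODES.get(classify(participant)["class"], []))
    return modes

# Camera = a Device whose behavior is a Recording with input Image.
camera = {"class": "Device", "behavior": recording("Image")}
print(failure_modes(camera))  # ['Blockage', 'Aging']
```

Nothing in the sketch mentions cameras directly: the answer falls out of composing the two general theories with the two definitions, which is the point of the example.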
Demo...
KM> (a Device with
      (behavior ((a Recording with
                   (input (Image))))))
_Device01
KM> (the behavior of _Device01)
_Recording01
KM> (the failure-modes of _Device01)
→ failure modes of its participants? (from Device)
→ what are the participants?
→ a Receptor and a Memory-Unit. (from Recording)
[Trace: _Receptor31 classified as a Aperture]    (classification)
[Trace: _Memory-Unit32 classified as a Film]
→ an Aperture and a Film.
→ failure modes of an Aperture and a Film?
→ Blocking, Aging. (from Aperture and Film)
(Blocking Aging)
KM> (the subevents of (the behavior of _Device01))
[Trace: _Writing45 classified as a Exposing]    (classification)
(_Receiving45 _Exposing46)
KM>
Other Compositions...
• Sound Recording Device (tape recorder): Device --behavior--> Recording --input--> Sound
• Sound Producing Device (stereo)
• Vibration Recording Device (seismology)
• Idea Recording Device (palmtop)
• etc.
Compound Concepts are Ubiquitous
– Botany:
• photosynthesis
• plant material distribution
• ...
– Aerospace:
• turbine gearbox assembly
• case drain fluid
• …(43k acronyms!)…
– Sentences also:
• “The aircraft overshot the runway.”
• “The air-conditioning unit had no power.”
• ...
Overall Architecture
1. Ontology (conceptual vocabulary)
   Thing, Activity, Physobj, Producer, Consumer, Circuit, Move, Location, ...
2. Component theories (computational clockwork)
   Distn. Network: Producer, Consumer, Circuit, ...
   Movement: Physobj, Move, Location, ...
   ...
3. Definitions and Descriptions (describe concepts in terms of others)
   Bulb = Light-producing Electrical Consumer
4. Databases of basic facts (instances)
Prototype KBS: PHaSE Trainer
[Figure: the PHaSE trainer, showing a laser source and a computer with screen.]
PHaSE KB Architecture
1. Ontology
   Thing, ...
2. Component theories
   DAG, Blockable DAG, Discrete events, Process Network, Optical Circuits, Distribution Network, Electrical Circuits, 2-state Object, Machine, ...
3. Definitions and Descriptions
   Carousel = a Revolving Case for Storage
   ...
4. Basic Facts about PHaSE
   PH. Science Checklists, PH. Circuit, PHaSE physical structure
Example Queries
“The parts of the PHaSE control panel?”
PHaSE-cover-screw, main power switch, PHaSE MOTOR PWR light, …
“The tool for removing the cover screw?”
4.5mm screwdriver
“The possible malfunctions of the PHaSE MOTOR PWR light?”
“The motor pwr light has no electricity. (_Absence23)
The filament of the motor pwr light is burned out. (_Burned-Out24)
The carousel motor is tripped.” (_Tripped25)
“The corrective actions for a tripped motor?”
“Toggle the power switch of the carousel motor.”
PHaSE User Interface
Application: Product Description/eCommerce
“M8 titanium alloy bolt”
<product-description>
  <merchant>
    <name>Abalt Ltd</name>
    <location>
      <city>London</city>
      <country>UK</country>
    </location>
  </merchant>
  <product>
    <type>bolt</type>
    <size>M8</size>
    <material>
      <base>titanium</base>
      <alloy>3Al-2.5V</alloy>
    </material>
  </product>
  <cost>
    <amount>0.03</amount>
    <currency>GBP</currency>
  </cost>
</product-description>
(XML)
... ... ...
KB
• “Low-priced fastener?”
• “no import restrictions?”
• “heat-resistant to 600F?”
• “nearby supplier?”
• ...
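As a rough illustration of why a KB is needed on top of the XML, the sketch below answers “Low-priced fastener?” against the product description above using Python's standard xml.etree.ElementTree. The SUPERCLASSES table (bolt is a kind of fastener) and the LOW_PRICE_GBP threshold are hypothetical background knowledge invented for the example; neither appears in the slides.

```python
import xml.etree.ElementTree as ET

PRODUCT_XML = """
<product-description>
  <merchant>
    <name>Abalt Ltd</name>
    <location><city>London</city><country>UK</country></location>
  </merchant>
  <product>
    <type>bolt</type>
    <size>M8</size>
    <material><base>titanium</base><alloy>3Al-2.5V</alloy></material>
  </product>
  <cost><amount>0.03</amount><currency>GBP</currency></cost>
</product-description>
"""

# Hypothetical background knowledge (not in the XML, not in the slides).
SUPERCLASSES = {"bolt": "fastener", "screw": "fastener"}
LOW_PRICE_GBP = 0.10

def is_low_priced_fastener(xml_text):
    """The XML only says 'bolt' and '0.03 GBP'; the taxonomy and the price
    threshold must come from the knowledge base."""
    root = ET.fromstring(xml_text)
    product_type = root.findtext("product/type")
    amount = float(root.findtext("cost/amount"))
    currency = root.findtext("cost/currency")
    return (SUPERCLASSES.get(product_type) == "fastener"
            and currency == "GBP"
            and amount <= LOW_PRICE_GBP)

print(is_low_priced_fastener(PRODUCT_XML))  # True
```

The word “fastener” never occurs in the product description, so plain text or tag matching would miss it; the answer depends on the added taxonomy.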
Knowledge Requirements for this
Component Theories
Transportation (transport, location, vehicle, …)
Commerce (buyer, goods, …)
Finance (money, account, exchange-rate, …)
Material physics (temperature, density, …)
...
Definitions
Purchase = an exchange of goods for money
Delivery = the transport of goods from a
seller to a buyer
...
Fact databases
Geography
Vendors
Materials
Part-lists
Application: Incident DB Search
FAA Flight Incident Database
(1)
950708025099G
THE AIRPLANE OVERSHOT THE RUNWAY. STOPPED 40 FEET FROM END.
(2)
961003038219C
NUMBER 1 ENGINE FAILED DURING TAKEOFF. RETURNED.
(3)
961211044319C
A PASSENGER CUSSED OUT THE FLIGHT ATTENDANT. PASSENGER
REMOVED.
(4)
961203043609C
BLEW TIRE DURING LANDING.
Example Search Questions:
• “Which events affected the propulsion?” (2)
• “Which events might have damaged the undercarriage?” (1,4)
• “Which events required a mechanic?” (1,2,4)
• ...
Representation (“Specification”) + Composition → Answers
(a Incident with
  (aircraft ((a Piper-PA-32)))
  (destination (OHare-Airport))
  (event ((a Overshooting with
            (agent ((the aircraft of Self)))
            (target ((the runway of (the destination of Self))))))))
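A toy Python version of the search questions shows the shape of the inference. Everything here is an assumption made for illustration: the event-type labels assigned to the four reports and the MAY_AFFECT table are hand-written stand-ins for what the component theories and definitions would supply, chosen so the results match the answers given on the slide.

```python
# Hand-coded event types for the four incident reports (numbered as on the slide).
INCIDENTS = {
    1: "Overshooting",           # "THE AIRPLANE OVERSHOT THE RUNWAY..."
    2: "Engine-Failure",         # "NUMBER 1 ENGINE FAILED DURING TAKEOFF..."
    3: "Passenger-Disturbance",  # "A PASSENGER CUSSED OUT THE FLIGHT ATTENDANT..."
    4: "Tire-Blowout",           # "BLEW TIRE DURING LANDING."
}

# Hypothetical background knowledge: systems each event type may affect.
MAY_AFFECT = {
    "Overshooting": {"Undercarriage"},
    "Engine-Failure": {"Propulsion"},
    "Passenger-Disturbance": set(),
    "Tire-Blowout": {"Undercarriage"},
}

def incidents_affecting(system):
    """Return the incident numbers whose event type may affect the system."""
    return sorted(n for n, event in INCIDENTS.items()
                  if system in MAY_AFFECT[event])

print(incidents_affecting("Propulsion"))     # [2]
print(incidents_affecting("Undercarriage"))  # [1, 4]
```

None of the report texts contains the words “propulsion” or “undercarriage”, so the answers come from the represented event plus background knowledge rather than from keyword matching.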
Related Work
• Component-based approaches
– Compositional Modeling (CML, Xerox)
– Description Logics (composition)
– problem-solving methods (KADS)
– contexts (Cyc)
– s/w engineering (many! Patterns, Comp. Arch)
• Large-scale KBs
– Cyc, BKB, TOVE, HPKB
– WordNet, Pangloss
Summary
• Demand and potential of knowledge processing
• Component-based architecture
– ontology
– core theories
– definitions (specifications of compositions)
– basic fact libraries
• Staged, evaluable development possible
– simple, inferred fact delivery…
– …to a large-scale knowledge resource