Knowledge-Based Question-Answering Bruce Porter, Peter Clark, and John Thompson

Terminology and Key Points
• Knowledge base - a formal representation of
knowledge with associated inference methods
• Explanation Generation = select, organize, and present
information to answer a question. Inference plays a
supporting role in each step.
• Key Points:
– Ways of evaluating explanation systems
– Architectural principles for explanation systems
– Composition: a way to reduce the up-front
“knowledge requirement”
Botany Knowledge Base
• 10-year effort by a full-time domain expert and
support staff
• contains 20,000 concepts and 100,000 facts
• much more information available via inheritance
and rules
• performance goal: robust, expert-level ability to
answer questions with good explanations
An Example of Explanation Generation:
The KNIGHT System
(J. Lester and B. Porter, Developing and Empirically Evaluating Robust Explanation Generators,
Computational Linguistics Journal, 23(1), 1997.)
Q: What happens during embryo sac formation?
A: Embryo sac formation is a kind of female gametophyte
formation. During embryo sac formation, the embryo sac is
formed from the megaspore mother cell. Embryo sac formation
occurs in the ovule.
Embryo sac formation is a step of angiosperm sexual
reproduction. It consists of megasporogenesis and embryo sac
generation. During megasporogenesis, the megaspore mother
cell divides in the nucellus to form 4 megaspores. During
embryo sac generation, the embryo sac is generated from the
megaspore.
KNIGHT approach:
1. Representation:
of the situation the user is asking about
2. Elaboration:
of that representation, guided by an answer
schema (EDP)
3. Assembly:
of results into natural language text
4. Presentation
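KNIGHT System Architecture
[Figure: the user sends requests to KNIGHT and receives explanations. KNIGHT obtains views from the view retriever, which draws base facts from the BKB. The view retriever thus provides the information of a (virtual) KB.]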
... worked well to provide an “arms length relationship”
between application programs and the KB
View Retriever
(L. Acker and B. Porter, Extracting Viewpoints from Knowledge Bases, AAAI-94)
• given a specification of desired information
• return a subgraph of the knowledge base
representing a coherent, comprehensive set of
facts pertinent to the specification
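The input/output contract above can be illustrated with a minimal sketch. This is not the Acker and Porter algorithm: it assumes the KB is a flat list of (concept, relation, value) triples and that a specification names one concept plus the relations of interest. The toy facts and the name `retrieve_view` are hypothetical.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    concept: str
    relation: str
    value: str


# Toy facts for illustration only (not taken from the Botany KB).
KB = [
    Fact("Photosynthesis", "raw-materials", "Water"),
    Fact("Photosynthesis", "raw-materials", "Carbon-Dioxide"),
    Fact("Photosynthesis", "product", "Glucose"),
    Fact("Photosynthesis", "product", "Oxygen"),
    Fact("Photosynthesis", "location", "Chloroplast"),
    Fact("Chloroplast", "part-of", "Photosynthetic Cell"),
]


def retrieve_view(kb, concept, relations):
    """Return the facts about `concept` whose relation is in `relations`.

    Hypothetical helper: a real view retriever would also follow links
    and use inheritance; this only filters stored triples.
    """
    wanted = set(relations)
    return [f for f in kb if f.concept == concept and f.relation in wanted]


# Specification: "Photosynthesis, seen as a production process".
production_view = retrieve_view(
    KB, "Photosynthesis", ["raw-materials", "product", "location"]
)
for fact in production_view:
    print(f"{fact.concept} --{fact.relation}--> {fact.value}")
```

The returned list is a small subgraph: facts outside the requested viewpoint (here, the part-of fact) are left out.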
The Viewpoint of
Photosynthesis as Production
(L. Acker and B. Porter, Extracting Viewpoints from Knowledge Bases, AAAI-94)
[Figure: two parallel semantic networks. The generic Production frame has slots product, location, energy source, raw materials, and producer, with generic fillers Substance, Place, and Thing. The Photosynthesis frame has the same slots, with fillers drawn from Oxygen, Glucose, Water, Carbon-Dioxide, ATP, Chloroplast, and Photosynthetic Cell.]
A Combination Viewpoint:
Flower Structure vis-à-vis Plant Reproduction
[Figure: semantic network linking Angiosperm Sexual Reproduction and its subevents (Pollen Grain Formation, Pollen Grain Transfer, Embryo Sac Formation, Pollen Grain Germination, Double Fertilization) to the Flower and its parts (Androecium, Gynoecium) through location, source, and destination relations. The Androecium surrounds the Gynoecium.]
Explanation Design Plan for
Processes
[Figure: EDP tree rooted at Explain Process, with topic nodes Process Overview, Fates of patients, Temporal information, and Process details. Content nodes: As-kind-of viewpoint, Black-box viewpoint, Location description, Temporal step-of viewpoint, Temporal steps viewpoints, "For each patient: change viewpoint", and "For each subevent: Black-box viewpoint".]
Nodes contain programs with iteration and conditionals
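To make "nodes contain programs with iteration and conditionals" concrete, here is a minimal sketch, not KNIGHT's actual EDP language. It assumes a plan is an ordered list of topic nodes, each holding small content functions that query a dictionary-based toy KB. The KB layout and all function names are hypothetical; the facts are taken from the embryo sac example.

```python
# Hypothetical mini-KB holding facts from the embryo sac example.
KB = {
    "Embryo sac formation": {
        "kind-of": "female gametophyte formation",
        "location": "the ovule",
        "subevents": ["megasporogenesis", "embryo sac generation"],
    },
    "megasporogenesis": {
        "summary": "the megaspore mother cell divides in the nucellus "
                   "to form 4 megaspores",
    },
    "embryo sac generation": {
        "summary": "the embryo sac is generated from the megaspore",
    },
}


def as_kind_of(kb, topic):
    parent = kb[topic].get("kind-of")
    # Conditional: say nothing if the KB has no parent concept.
    return [f"{topic} is a kind of {parent}."] if parent else []


def location(kb, topic):
    place = kb[topic].get("location")
    return [f"{topic} occurs in {place}."] if place else []


def subevent_details(kb, topic):
    sentences = []
    # Iteration: one black-box summary per subevent.
    for sub in kb[topic].get("subevents", []):
        summary = kb.get(sub, {}).get("summary")
        if summary:
            sentences.append(f"During {sub}, {summary}.")
    return sentences


# The plan: topic nodes in presentation order, each with its content programs.
EXPLAIN_PROCESS = [
    ("Process overview", [as_kind_of, location]),
    ("Process details", [subevent_details]),
]


def run_plan(plan, kb, topic):
    """Run every content program of every topic node, in order."""
    sentences = []
    for _node_name, programs in plan:
        for program in programs:
            sentences.extend(program(kb, topic))
    return sentences


explanation = run_plan(EXPLAIN_PROCESS, KB, "Embryo sac formation")
print(" ".join(explanation))
```

Because each node degrades gracefully when the KB lacks a fact, the same plan can be applied to any process concept, which is one plausible reading of why a single plan is built per explanation type.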
KNIGHT Evaluation
[Figure: evaluation design. 60 questions were posed. KNIGHT wrote explanations for all 60, and four biologists wrote explanations for 15 each. A panel of judges (8 biologists) evaluated the explanations.]
Results of the Evaluation
Author   Overall     Content     Organization  Writing     Correctness
KNIGHT   2.37±0.13   2.65±0.13   2.45±0.16     2.40±0.13   3.07±0.15
Human    2.85±0.15   2.95±0.16   3.07±0.16     2.93±0.16   3.16±0.15

               Overall  Content  Organization  Writing  Correctness
Difference     0.48     0.30     0.62          0.53     0.09
T statistic    -2.36    -1.47    -2.73         -2.54    -0.42
Significance   0.02     0.14     0.07          0.01     0.67
Significant?   yes      no       no            yes      no
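As a quick arithmetic check on the reported figures, the sketch below recomputes the Difference row from the mean ratings and derives the "Significant?" row from the reported significance values. The 0.05 threshold is an assumption; the slides do not state the level used.

```python
# Mean ratings as reported on the slide: (KNIGHT, Human).
ratings = {
    "Overall":      (2.37, 2.85),
    "Content":      (2.65, 2.95),
    "Organization": (2.45, 3.07),
    "Writing":      (2.40, 2.93),
    "Correctness":  (3.07, 3.16),
}

# Significance values as reported on the slide.
reported_p = {
    "Overall": 0.02,
    "Content": 0.14,
    "Organization": 0.07,
    "Writing": 0.01,
    "Correctness": 0.67,
}

ALPHA = 0.05  # assumed threshold, not stated in the source

differences = {dim: round(human - knight, 2)
               for dim, (knight, human) in ratings.items()}
significant = {dim: p < ALPHA for dim, p in reported_p.items()}

for dim in ratings:
    flag = "yes" if significant[dim] else "no"
    print(f"{dim:<13}{differences[dim]:.2f}  {flag}")
```

Under that assumed threshold, the recomputed values agree with the Difference and "Significant?" rows of the table.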
Another example (DCE Application)
Question (user):
Describe a binding event, between
- the client Payday running on Slowbox
- the server Oracle running on Speedy
Answer (KB-generated):
• First, Payday queries the cell directory server for
the network-id of Oracle.
• Then Payday queries the endpoint mapper of
Speedy for Oracle’s endpoint.
• Finally, Payday assembles a binding from the
network-id and the endpoint.
1. Representation of situation in question
[Figure: Binding-Event01 with client Payday (host Slowbox) and server Oracle (host Speedy).]
Describe a binding event, between
- the client Payday running on Slowbox
- the server Oracle running on Speedy
2. Elaboration (guided by answer schema)
[Figure: the representation partway through elaboration. Binding-Event01 gains subevents Query01, then Query02, then Assemble01. New nodes CDS01 (cds) and Network01 (network) appear. Slot values not yet computed are marked "?".]
Schema/EDP (paraphrased):
“For each subevent, present summary, and pointers
to sub-subevents.”
2. Elaboration (guided by answer schema)
[Figure: the fully elaborated graph. Binding-Event01 has subevents Query01, then Query02, then Assemble01. Query01 has agent Payday, queried CDS01, and request NetId01 (the id of Oracle on Network01). Query02 has queried Endpoint Mapper01 (the epm of Speedy) and request Endpoint01 (the endpoint of Oracle). Assemble01 has components NetId01 and Endpoint01.]
Schema/EDP (paraphrased):
“For each subevent, present summary, and pointers
to sub-subevents.”
3. Assembly of text answer
[Figure: the elaborated graph from step 2, annotated to show how the first sentence is read off Query01.]
• “First, Payday queries the cell directory server for the network-id of Oracle.”
– “First”
– “Payday” (the agent of Query01)
– “queries”
– “the cell directory server” (the queried of Query01)
– “for the network-id of Oracle” (the request of Query01)
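The slot-to-phrase mapping above amounts to filling a sentence template from an event frame. A minimal sketch follows, assuming each query event is a dictionary whose slot values are already rendered as noun phrases. The dictionary layout and the name `realize_query` are hypothetical, and the real assembly step presumably does more (for example, choosing referring expressions).

```python
# Event frames from the elaborated graph, with slot values already
# rendered as noun phrases (an assumption made for this sketch).
QUERY01 = {
    "agent": "Payday",
    "queried": "the cell directory server",
    "request": "the network-id of Oracle",
}
QUERY02 = {
    "agent": "Payday",
    "queried": "the endpoint mapper of Speedy",
    "request": "Oracle's endpoint",
}


def realize_query(connective, event):
    """Fill the template '<connective> <agent> queries <queried> for <request>.'"""
    return (f"{connective} {event['agent']} queries "
            f"{event['queried']} for {event['request']}.")


sentences = [
    realize_query("First,", QUERY01),
    realize_query("Then", QUERY02),
]
for sentence in sentences:
    print("•", sentence)
```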
4. Presentation
The Application Environment
(Hyperlinked text)
(run-time generated pages)
Critique
• Approach used in Botany KB & three
smaller applications
• Benefits:
– Customized answers
– Controllable level of detail
– Flexibility (in theory)
• Well received, but:
– KBs still highly incomplete
– laborious to build
– difficult to achieve reuse
→ want a more modular approach
A Component-Based Approach to
Knowledge-Base Construction
Observation:
Concept representations contain numerous abstractions
Approach:
1. Component theories = abstract, reusable models
2. More specific concepts: specified as compositions
3. Inference = construct compositions as needed to
answer questions.
Lessons from a Dictionary...
Move: to Go
Go: to Move

Transport: to Move from one Place to another
Vehicle: a Means for Transporting something
Car: a Vehicle for Passengers

Most abstract
concepts appeal to
core, foundational
theories
Specific concepts
defined as compositions
of abstract concepts
1. Component Theories
• A coherent, encapsulated system of concepts and
relations
• Contains:
– ontology (vocabulary of concepts and relations)
– axioms (rules) relating these
• Provides semantics for these concepts in the KB
• Can define specific theories using general ones
Example: Electrical Circuits
[Figure: an Electrical Circuit containing Fuel Cells, Switches, a Light, and a Motor.]
Electrical Circuit:
• Carries electricity
• If closed circuit from Fuel
Cell to Device, then
Device is powered
• Switches can open/close
the circuit
Circuits as Distribution Networks
[Figure: an Electrical Circuit (Fuel Cells, Switches, Light, Motor) drawn alongside a Distribution-Network whose nodes are labeled P, I, and C (Producer, Intermediary, Consumer).]
Electrical Circuit:
• Carries electricity
• If closed circuit from Fuel
Cell to Device, then
Device is powered
• Switches can open/close
the circuit
Distribution-Network:
• Carries transport-element
• If unblocked path from
Producer to Consumer, then
Consumer is supplied.
• connects is transitive
• ….
Distribution Networks as DAGs
[Figure: a Distribution-Network with nodes labeled P, I, and C, drawn alongside a Blockable-DAG with nodes N1 to N6.]
Distribution-Network
Imports: Blockable-DAG
And:
• Producers, Intermediaries,
and Consumers are Nodes
• If unblocked path from
Producer to Consumer,
then Consumer is
supplied.
• ...
Blockable-DAG:
• Nodes can connect with
other nodes.
• X reaches Y if X connects
with Y.
• X reaches Z if X connects
with Y and Y reaches Z
• ….
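The layering above (a general graph theory imported by a more specific network theory) can be sketched in a few lines. This is an illustration under stated assumptions, not the KB's axioms: the graph is a dictionary of `connects` edges, a blocked node cannot be passed through, and the example circuit topology is hypothetical.

```python
def reaches(connects, x, z, blocked=frozenset()):
    """Blockable-DAG rule: X reaches Z if X connects with Z, or X connects
    with some Y and Y reaches Z. Blocked nodes are never traversed."""
    stack, seen = [x], {x}
    while stack:
        node = stack.pop()
        for nxt in connects.get(node, ()):
            if nxt in blocked or nxt in seen:
                continue
            if nxt == z:
                return True
            seen.add(nxt)
            stack.append(nxt)
    return False


def is_supplied(connects, producers, consumer, blocked=frozenset()):
    """Distribution-Network rule: a Consumer is supplied if there is an
    unblocked path to it from some Producer."""
    return any(reaches(connects, p, consumer, blocked) for p in producers)


# Hypothetical circuit: the fuel cell feeds a light and a motor,
# each through its own switch.
circuit = {
    "FuelCell": ["Switch1", "Switch2"],
    "Switch1": ["Light"],
    "Switch2": ["Motor"],
}

# An open switch is modeled as a blocked node.
print(is_supplied(circuit, ["FuelCell"], "Light"))                       # all switches closed
print(is_supplied(circuit, ["FuelCell"], "Light", blocked={"Switch1"}))  # Switch1 open
print(is_supplied(circuit, ["FuelCell"], "Motor", blocked={"Switch1"}))  # Motor unaffected
```

Neither function mentions electricity: the circuit-specific reading ("powered", "switch open") comes only from how the domain objects are mapped onto producers, consumers, and blocked nodes.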
Component theories in KB-PHaSE
[Figure: the component theories (DAG, Blockable DAG, Processing Network, Optical Circuits, Discrete Event Model, Distribution Network, Two-state Object, Electrical Circuits, Machines, Spatial Relns) feeding into the PHaSE KB.]
PHaSE KB: ontology, compositions, and basic facts about the domain
2. Composition
• Describe domain-specific concepts as compositions:
– a Bulb is a Resistor to Electricity producing Light
– a Camera is a Device for the Recording of Images
– a Battery is a Producer of Electricity
– a Wire is a Conduit of Electricity
• Inference: compute properties of the compound concept
– using axioms from each component
– on demand, in response to questions
2. Composition (example)
Composition: Camera = a Device for the Recording of Images
Query: Failure modes of a camera?
[Figure: a Device whose behavior is a Recording whose input is an Image.]
(Camera has (superclasses (Device)))
(every Camera has
  (behavior ((a Recording with
    (input (Image))))))
Component Theory: Devices
[Figure: the Devices theory. A Device (a Physobj) has a behavior (an Activity) whose participants are Physobjs, and each object has a failuremode slot whose value is a FailureMode. A second panel applies the theory to the camera: Device, behavior Recording, input Image.]
(Device has (superclasses (Physobj)))
(every Device has
  (behavior ((a Activity)))
  (failure-modes (
    (the failure-modes of
      (the participants of
        (the behavior of Self))))))
Component Theory: Recording
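[Figure: the Recording theory. A Recording has an input Signal and two participants, a Receptor and a Memory-Unit. Its subevents are Receiving and Writing, connected to the participants and signals through agent, patient, input, and output slots.]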
(Recording has
(superclasses (Activity)))
(every Recording has
(input ((a Signal)))
(participants (
(a Receptor with
(input ((the input of Self)))
...
[Figure: the Devices and Recording theories composed for Camera. The Device has behavior Recording with input Image. The Recording's participants are a Receptor and a Memory-Unit, its subevents are Receiving and Writing, and its input and output are Signals. Each object has a failuremode slot whose value is a FailureMode.]
Run-Time Classification:
Aperture = a Receptor of Images
[Figure: an Aperture with input Image, output Image, and failuremode Blockage.]
Aperture
- inputs an image
- outputs an image
- might be blocked
- ...
[Figure: the composed Camera graph after classification. The Receptor is recognized as an Aperture, its Signal as an Image, and its FailureMode as Blockage.]
Run-Time Classification:
Aperture = a Receptor of Images
Query: Failure modes of a camera? Blockage, ...
Sub-query: Participants in its behavior? Aperture, ...
[Figure: the final Camera graph. The Recording's participants are the Aperture (failuremode Blockage) and the Memory-Unit. Additional nodes Aging, Chemical, and Sheet appear, along with the slots sensitive-to and covering.]
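The camera query can be mimicked with a small object model. This is a sketch of the composition idea only, not the KM inference engine: it assumes the Devices axiom ("the failure-modes of the participants of the behavior of Self") is evaluated on demand by a method call, and it hard-codes the result of run-time classification (the Receptor is an Aperture with failure mode Blockage). All class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Physobj:
    name: str
    own_failure_modes: List[str] = field(default_factory=list)

    def failure_modes(self) -> List[str]:
        return list(self.own_failure_modes)


@dataclass
class Activity:
    name: str
    participants: List[Physobj] = field(default_factory=list)


@dataclass
class Device(Physobj):
    behavior: Optional[Activity] = None

    def failure_modes(self) -> List[str]:
        # Devices axiom: the failure modes of the participants
        # of the behavior of Self, computed only when asked.
        modes: List[str] = []
        if self.behavior is None:
            return modes
        for participant in self.behavior.participants:
            for mode in participant.failure_modes():
                if mode not in modes:
                    modes.append(mode)
        return modes


# Recording theory: a Recording has a Receptor and a Memory-Unit.
# Run-time classification (assumed already done): the Receptor of a
# Camera's Recording is an Aperture, which might be blocked.
aperture = Physobj("Aperture", ["Blockage"])
memory_unit = Physobj("Memory-Unit")
recording = Activity("Recording", [aperture, memory_unit])

# Composition: Camera = a Device for the Recording of Images.
camera = Device("Camera", behavior=recording)

print("Participants:", [p.name for p in camera.behavior.participants])
print("Failure modes:", camera.failure_modes())
```

Nothing about cameras is stored in the `Device` class itself: the answer is assembled at query time from the generic device rule and the facts attached to the recording's participants.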
Compound Concepts are Ubiquitous
– Botany:
• photosynthesis
• plant material distribution
• ...
– Aerospace:
• turbine gearbox assembly
• case drain fluid
• …(43k acronyms!)…
– Sentences also:
• “The aircraft overshot the runway.”
• “The air-conditioning unit had no power.”
• ...
Overall Architecture
1. Ontology
– Thing
– ...
2. Component theories
– DAG, Blockable DAG
– Discrete events
– Process Network, Optical Circuits
– Distribution Network, Electrical Circuits
– 2-state Object
– Machine
– ...
3. Definitions and Descriptions
– Camera = a Device for the Recording of Images
– ...
4. Basic facts about domain
– PH. Science Checklists, PH. Circuit, PHaSE physical structure
Summary
• Explanation Generators select, organize, and
present information in response to questions.
• Inference plays a supporting role in each step.
• Explanation Design Plans are built for each type
of explanation.
• Composition at run-time reduces the up-front
“knowledge requirement”
Discussion
• Technical: The component approach is still a
work in progress. In particular, although we can
isolate the general theories, the “basic facts” can
still be highly interdependent.
• Philosophical: We need a library of reusable
components. Will the idiosyncrasies of real-world
concepts overwhelm the generality of patterns?