Knowledge Systems and Project Halo

In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Knowledge Systems
• Knowledge Systems are formal representations
of knowledge capable of answering
unanticipated questions with coherent
explanations
• Knowledge System = KB + Q/A + Explanation Generator +
Knowledge Acquisition tools
Project Halo
• Funded and administered by Vulcan, Inc – a
Paul Allen company
• Objective: to assess the state of the art of
knowledge systems – computer programs
that know a lot and answer tough questions
with coherent explanations
• Method: administer an AP Chemistry exam
to knowledge systems built by 4 teams of
researchers
A Significant Advance over
Expert Systems
• Coverage
• Reasoning
• Explanation
• Rapid construction
KM: A Logic Programming
Language
• …able to represent:
– classes, instances, prototypes
– defaults, fluents, constraints
– (hypothetical) situations
– actions (pre-, post-, and during- conditions)
• …and reason about:
– inheritance with exceptions
– deductive and abductive inference (with constraints)
– automatic classification (given a partial description of an instance,
determine the classes to which it belongs)
– temporal projection (“my car is where I left it”)
– effects of actions
A Simple Example
• When 70 ml of 3.0-Molar Na2CO3 is
added to 30 ml of 1.0-Molar NaHCO3 the
resulting concentration of Na+ is:
a) 2.0 M
b) 2.4 M
c) 4.0 M
d) 4.5 M
e) 7.0 M
Question Representation
[Diagram: Question 26 as a graph (labels: context, output). A Mix has two raw materials, both Aqueous Solutions: base Na2CO3, conc. 3.0 M, volume 0.07 lit; and base NaHCO3, conc. 1.0 M, volume 0.03 lit. Its result is a Mixture with has-part Na+, whose conc. is the unknown (??).]
Background Knowledge
Chemistry laws:
1. Concentration of a solute
2. Composition of strong electrolyte solutions
3. Conservation of mass
4. Conservation of volume
etc.
Law 1: Concentration of a Solute
Compute-Concentration Method
[Diagram (labels: context, input, output): a Mixture with volume Volume (*liters) and has-part Chemical, which has quantity Quantity (*moles) and conc. Concentration (*molar).]
Note: when this law is applied, using Novak’s code, the quantities are
automatically converted to the units-of-measurement specified here
Explanation Template
The concentration of a chemical in a mixture is the quantity of the chemical divided by the
volume of the mixture.
Divide the quantity by the volume:
<Quantity> / <Volume> = X *molar
Therefore, the concentration of <Chemical> in <Mixture> = X *molar
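Law 1 and its note on unit conversion can be illustrated with a minimal Python sketch. This is not the KM encoding or Novak's conversion code; the unit tables and the function names `compute_concentration` and `explain` are hypothetical, introduced only to show the idea of converting inputs to the units the law names before dividing.

```python
# Hypothetical unit tables. The slide only says that quantities are
# converted automatically to the units named in the law.
TO_LITERS = {"liter": 1.0, "milliliter": 1e-3}
TO_MOLES = {"mole": 1.0, "millimole": 1e-3}


def compute_concentration(quantity, quantity_unit, volume, volume_unit):
    """Law 1: concentration (molar) = quantity (moles) / volume (liters)."""
    moles = quantity * TO_MOLES[quantity_unit]
    liters = volume * TO_LITERS[volume_unit]
    if liters <= 0:
        raise ValueError("volume must be positive")
    return moles / liters


def explain(chemical, mixture, moles, liters):
    """Fill in the slide's explanation template for values already in moles and liters."""
    x = compute_concentration(moles, "mole", liters, "liter")
    return (
        f"Divide the quantity by the volume: "
        f"{moles} / {liters} = {x:.2f} *molar. "
        f"Therefore, the concentration of {chemical} in {mixture} = {x:.2f} *molar"
    )
```

Keeping the conversion inside the law means callers can pass milliliters or liters without changing the formula itself.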
Law 2: Composition of Strong
Electrolytes
Compute-Ions-in-Strong-Electrolyte
[Diagram (labels: context, input, output): a Strong Electrolyte with quantity Quantity (*moles) and has-part Anion and Cation, each with quantity Quantity (*moles).]
Law 3: Conservation of Mass
Conservation of Mass
[Diagram (labels: input, output, context): a Mix with raw-material Chemical1 … Chemicaln and a result. The quantity (*moles) of a Chemical that is part-of each raw material is given; the quantity of that Chemical in the result is the unknown (?? *moles).]
Explanation Template
By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of
the quantities of that chemical in the parts of the mix.
The quantity of <Chemical> in <Chemical1> is X1 *moles
…
The quantity of <Chemical> in <Chemicaln> is Xn *moles
Therefore, the quantity of <Chemical> = X *moles
Law 4: Conservation of Volume
Conservation of Volume
[Diagram (labels: input, output, context): a Mix with raw-material Chemical1 … Chemicaln, whose volumes are given in <uom1> … <uomn>, and a result Mixture whose volume is the unknown (?? *liter).]
Explanation Template
By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of
the parts mixed.
The sum of X1 <uom1>, … and Xn <uomn> = X *liter
Therefore, the volume of <Mixture> = X *liter
Step 1: Reclassify Terms
[Diagram: the question graph (a Mix of two Aqueous Solutions: Na2CO3, 3.0 M, 0.07 lit and NaHCO3, 1.0 M, 0.03 lit; result Mixture with has-part Na+), annotated with a superclass link to Strong Electrolyte Solution.]
Step 2: Use Law 1 to Compute Concentration
[Diagram: Law 1 matched against the result Mixture and its has-part Na+. Unknowns: the Mixture volume (?? *liters), the Na+ quantity (?? *moles), and the Na+ conc. (?? *molar).]
The Search is non-deterministic
• Multiple laws might be used to compute a
value for any property. For example,
here’s another way to compute
concentration:
pH = - log [H+], where [H+] is the
concentration of H+
• Since this applies only to H+, this search
path ends quickly
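The choice among candidate laws can be sketched as a loop that tries each law and abandons any that does not apply. This is an illustrative Python sketch, not the KM search procedure: the names `law_ph`, `law_1`, `CONCENTRATION_LAWS`, and the flat `facts` dictionary are assumptions, and the quantity and volume are taken as already known rather than derived through subgoals.

```python
# Toy fact base: (subject, property) -> value, using Question 26's values.
FACTS = {
    ("Mixture", "volume"): 0.1,    # liters
    ("Na+", "quantity"): 0.45,     # moles
}


def law_ph(chemical, facts):
    """pH = -log[H+], so [H+] = 10 ** -pH. Applies only to H+."""
    if chemical != "H+" or ("Mixture", "pH") not in facts:
        return None                # this search path ends here
    return 10 ** -facts[("Mixture", "pH")]


def law_1(chemical, facts):
    """Law 1: concentration = quantity / volume."""
    quantity = facts.get((chemical, "quantity"))
    volume = facts.get(("Mixture", "volume"))
    if quantity is None or volume is None:
        return None
    return quantity / volume


CONCENTRATION_LAWS = [law_ph, law_1]   # candidate laws, tried in order


def concentration(chemical, facts):
    """Return (law name, value) for the first candidate law that succeeds."""
    for law in CONCENTRATION_LAWS:
        value = law(chemical, facts)
        if value is not None:
            return law.__name__, value
    raise LookupError(f"no law yields the concentration of {chemical}")
```

For Na+, the pH law returns nothing and the search falls through to Law 1, which mirrors the dead-end path described on the slide.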
Step 3: Use Law 4 to Compute Volume
[Diagram: Law 4 applied to the Mix. The raw-material volumes, 0.07 lit and 0.03 lit, give a Mixture volume of .1 *liters. Still unknown: the Na+ quantity (?? *moles) and the Na+ conc. (?? *molar).]
Step 4: Use Law 3 to Compute Quantity
[Diagram: Law 3 matched against the Mix. The quantity of Na+ in the Mixture (?? *moles) depends on the quantities of Na+ in the two raw-material solutions (each ?? *moles). The Mixture volume is .1 *liters.]
Step 5: Use Law 2 to Compute Quantity
of Ionic Parts
[Diagram: Law 2 matched against a raw-material solution as a Strong Electrolyte with has-part Anion and Cation. The quantity of the electrolyte (?? *moles) is needed to compute the quantity of its Na+ part (?? *moles).]
Step 6: Use Law 1’ to Compute Quantity
[Diagram: Law 1’ applied to the Na2CO3 solution: conc. 3.0 M and volume 0.07 liters give a quantity of .21 *moles. The other quantities remain unknown (?? *moles).]
Step 7: Wind out of Law 2 from step 5
[Diagram: Law 2 completes for the Na2CO3 solution: a quantity of .21 *moles gives .42 *moles of Na+. The quantity of Na+ in the NaHCO3 solution and in the Mixture remain unknown (?? *moles).]
Step 8-10: Similar to steps 5-7
[Diagram: the same three steps applied to the NaHCO3 solution (1.0 M, 0.03 liters) give .03 *moles of Na+. Already known: .21 *moles of Na2CO3, .42 *moles of Na+ from it, and a Mixture volume of .1 *liters. Still unknown: the Na+ quantity and conc. in the Mixture.]
Step 11: Wind out of Law 3 from Step 4
[Diagram: Law 3 completes: .42 *moles and .03 *moles of Na+ in the raw materials sum to .45 *moles of Na+ in the Mixture. The Na+ conc. is still unknown (?? *molar).]
Step 12: Wind out of Law 1 from Step 2
[Diagram: Law 1 completes: .45 *moles of Na+ in a Mixture volume of .1 *liters gives a conc. of 4.5 *molar.]
Question 26 Answer
When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, what is the resulting concentration of Na+?
The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.
By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in
the parts of the mix.
In the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution :
In the na-plus :
Multiply the concentration and the volume:
3 molar * 70 milliliter = 0.21 mole.
The quantity of na-plus in the na-plus is 0.42 mole.
In the co3-2 :
The quantity of na-plus in the co3-2 is 0 mole.
Multiply the concentration and the volume:
1 molar * 30 milliliter = 0.03 mole.
In the na-plus :
The quantity of na-plus in the na-plus is 0.03 mole.
In the hco3- :
The quantity of na-plus in the hco3- is 0 mole.
The quantity of na-plus in the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution is 0.45 mole.
Therefore, the quantity of na-plus = 0.45 mole.
By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of the parts mixed.
The sum of 70 milliliter and 30 milliliter = 0.10 liter.
Therefore, the volume of the strong-electrolyte-solution strong-electrolyte-solution mixture = 0.10 liter.
Divide the quantity by the volume:
0.45 mole / 0.10 liter = 4.50 molar.
Therefore, the concentration of na-plus in the strong-electrolyte-solution strong-electrolyte-solution mixture = 4.50 molar.
When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, the resulting concentration of Na+ is 4.50 molar
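The arithmetic in this explanation can be checked with a short Python sketch that applies the four laws in sequence. It is only a numeric check, not the system's reasoning: the dictionary layout and the name `na_concentration` are hypothetical, and the count of Na+ ions per formula unit comes from standard chemistry, not from the slides.

```python
# Standard stoichiometry (an assumption supplied here, not in the slides):
# Na2CO3 -> 2 Na+ + CO3(2-);  NaHCO3 -> Na+ + HCO3(-)
NA_PER_FORMULA_UNIT = {"Na2CO3": 2, "NaHCO3": 1}

solutions = [
    {"base": "Na2CO3", "conc_molar": 3.0, "volume_liters": 0.07},
    {"base": "NaHCO3", "conc_molar": 1.0, "volume_liters": 0.03},
]


def na_concentration(parts):
    # Law 4: the volume of the mixture is the sum of the volumes of the parts.
    total_volume = sum(p["volume_liters"] for p in parts)
    total_na = 0.0
    for p in parts:
        # Law 1 inverted: quantity = concentration * volume.
        moles = p["conc_molar"] * p["volume_liters"]
        # Law 2: a strong electrolyte contributes its ions in full.
        # Law 3: the quantities from the parts are summed.
        total_na += NA_PER_FORMULA_UNIT[p["base"]] * moles
    # Law 1: concentration = quantity / volume.
    return total_na / total_volume


result = na_concentration(solutions)
print(f"{result:.2f} molar")
```

The intermediate values match the trace: 0.21 mole of Na2CO3 yields 0.42 mole of Na+, the NaHCO3 adds 0.03 mole, and 0.45 mole in 0.10 liter corresponds to answer (d).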
Results of Project Halo
• After a 4-month development effort, the
knowledge systems were sequestered and
given a test:
– 165 novel questions: 50 multiple choice; 115
free form response
– Questions translated from English to formal
language by each team, then assessed for
fidelity by an independent committee
• High likelihood of a long-term follow-on
Correctness
• The SRI team’s correctness score corresponds to an AP score of 3 –
high enough for credit at UCSD, UIUC, and many other universities.
• We’ve predicted scoring 85% after a 3-month follow-on project.
Explanation Quality
Our Long Term Goal
• to enable distributed communities of
domain experts to build knowledge systems
in their area of expertise …
– without direct help from knowledge engineers
– working with familiar concepts and without
writing axioms
– with little more effort than writing technical
papers
Our Current Focus
• Insight: even domain-specific representations
contain common abstractions
• Approach: we build a library consisting of
– a small hierarchy of reusable, composable, domain-independent knowledge units (“components”)
– a small vocabulary of relations to connect them
then domain experts build representations by
instantiating and composing these components
Building a Representation Compositionally
[Diagram: a Bioremediation representation. Participants: Biotechnologist (agent), Microbes (remediator), Soil (environment), Oil (pollutant), Fertilizer. Script: Get, then Apply, then Break Down, then Absorb. Other labels: patient, product, absorbed, contains, Rate, Amount, and links marked Q+, Q-, and I-.]
An underlying abstraction...
[Diagram: the same Bioremediation graph with the abstraction Conversion picked out: a Rate and the Amounts of the rawmaterials and product Substances, connected by links marked Q+, Q-, and I-.]
Another abstraction...
[Diagram: the same Bioremediation graph with the abstraction Digest picked out: an Agent (eater) and a Substance (food), with a script of Break Down then Absorb.]
Another abstraction...
[Diagram: the same Bioremediation graph with the abstraction Treatment picked out: an Agent, a substance, and a patient, with a script of Get then Apply.]
Examples of Concepts Described
Compositionally
• a Fuel-Cell is a Producer of Electricity
• a Bulb is an Electrical Resistor that Produces Light
• a Camera is an Image Recording Device
• a Wire is a Conduit of Electricity
Library Contents
• actions: things that happen, change states
– Enter, Copy, Replace, Transfer, etc.
• states: relatively temporally stable events
– Be-Closed, Be-Attached-To, Be-Confined, etc.
• entities: things that are
– Substance, Place, Object, etc.
• roles: things that are, but only in the context
of things that happen
– Container, Catalyst, Barrier, Vehicle, etc.
Library Contents
• relations between events, entities, roles
– agent, donor, object, recipient, result, etc.
– content, part, material, possession, etc.
– causes, defeats, enables, prevents, etc.
– purpose, plays, etc.
• properties between events/entities and values
– rate, frequency, intensity, direction, etc.
– size, color, integrity, shape, etc.
Computational Semantics
• Knowledge about Enter:
– instances of Enter inherit axioms from Move, such as:
the action changes the location of the object of the Move
– before the Enter, the object is outside some enclosure
– after the Enter, the object is inside that enclosure and
contained by it
– during the Enter, the object passes through a portal of the
enclosure
– if the portal has a covering, it must be open; and unless it
is known to be closed, assume that it’s open
– etc.
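The Enter axioms above can be mirrored in a small Python sketch. It is an object-oriented analogy, not KM syntax or the library's actual encoding: the classes `Portal`, `Enclosure`, and `Thing`, and their fields, are hypothetical names chosen to show inheritance from Move, the pre-, during-, and post-conditions, and the default that a covering is open unless known to be closed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Portal:
    has_covering: bool = False
    covering_open: Optional[bool] = None   # None means "not known"

    def passable(self) -> bool:
        # Default rule: unless the covering is known to be closed,
        # assume that it is open.
        return not (self.has_covering and self.covering_open is False)


@dataclass
class Enclosure:
    name: str
    portal: Portal = field(default_factory=Portal)


@dataclass
class Thing:
    name: str
    location: str = "outside"
    contained_by: Optional[str] = None


class Move:
    """A Move changes the location of its object."""

    def __init__(self, obj: Thing, destination: str):
        self.obj = obj
        self.destination = destination

    def execute(self) -> None:
        self.obj.location = self.destination


class Enter(Move):
    """Enter inherits Move's effect and adds its own conditions."""

    def __init__(self, obj: Thing, enclosure: Enclosure):
        super().__init__(obj, destination=enclosure.name)
        self.enclosure = enclosure

    def execute(self) -> None:
        # Pre-condition: the object is outside the enclosure.
        if self.obj.location == self.enclosure.name:
            raise ValueError(f"{self.obj.name} is already inside")
        # During-condition: the object passes through an open portal.
        if not self.enclosure.portal.passable():
            raise ValueError("the portal covering is closed")
        super().execute()   # inherited effect: the location changes
        # Post-condition: the object is contained by the enclosure.
        self.obj.contained_by = self.enclosure.name
```

A portal with a covering of unknown state lets the Enter proceed, while one known to be closed blocks it, which is the default-with-exception behavior the slide describes.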
Searching the Library
• browsing the hierarchy top-down
• WordNet-based search
– all components have hooks to WordNet
– climb the WordNet hypernym tree with search terms
– assemble: Attach, Come-Together
mend: Repair
infiltrate: Enter, Traverse, Penetrate, Move-Into
gum-up: Block, Obstruct
busted: Be-Broken, Be-Ruined
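The hypernym climb can be sketched in Python with a toy dictionary standing in for WordNet. Everything here is hypothetical: the `HYPERNYM` links are illustrative and not copied from WordNet, and `COMPONENT_HOOKS` is an assumed form for the hooks between WordNet terms and library components.

```python
# Toy stand-in for WordNet: word -> hypernym (one parent per word).
# These links are illustrative and are not copied from WordNet.
HYPERNYM = {
    "infiltrate": "penetrate",
    "penetrate": "enter",
    "enter": "move",
    "mend": "repair",
    "repair": "improve",
}

# Hypothetical hooks from terms to library components.
COMPONENT_HOOKS = {
    "penetrate": "Penetrate",
    "enter": "Enter",
    "repair": "Repair",
}


def find_components(term):
    """Climb the hypernym chain from `term`, collecting hooked components."""
    found, seen = [], set()
    while term is not None and term not in seen:
        seen.add(term)                      # guard against cycles
        if term in COMPONENT_HOOKS:
            found.append(COMPONENT_HOOKS[term])
        term = HYPERNYM.get(term)           # step up one level, or stop
    return found
```

Components found closer to the search term come first in the result, so more specific matches are listed ahead of more general ones.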
First Challenge Problem
• To enable biologists to encode college-level textbook knowledge about cells
• A small example: mRNA-Transport
• “mRNA is transported out of the cell nucleus
into the cytoplasm”
• Transport: Move-Out-Of
[Diagram labels: unify, location]
Evaluation
• Can Domain Experts learn to use the library
to encode domain knowledge?
• Can sophisticated knowledge be captured
through composition of components?
Methodology
• train biologists (4 graduate students) for six days
• have them encode knowledge from a college
textbook, Essential Cell Biology by Bruce Alberts
• supply end-of-the-chapter-style Biology questions
• have the biologists pose the questions to their
knowledge bases and record the answers
• have another biologist evaluate the answers on a
scale of 0-3
• qualitatively evaluate their KBs
Some Example Questions
• What nucleotide base pairs with adenine in RNA?
• How is uracil in RNA like thymine in DNA?
• What is the relationship between thymine and
uracil?
• For a given bacterial gene, how are bacterial RNA
and DNA molecules different?
• Describe RNA as a kind of polymer.
• What are the four bases/nucleotides of RNA?
• What is the relationship between a DNA gene and
its RNA transcription product?
Evaluation — Productivity
[Chart: number of axioms (× 1000, axis from 0.0 to 2.5) by date, from 6/25 to 7/30. Series: Structural, Implication, Total.]
Evaluation — Question Answering
[Chart: answer quality. right: 54%, pretty good: 15%, poor: 15%, wrong: 16%]
Summary
• Knowledge Systems offer significant benefits
compared with expert systems
• Multi-functional knowledge bases can be built
• … by domain experts, almost
• … and they will be, with or without sound
principles of ontological engineering
• … and ontologists can significantly improve the
results
Discussion
• Will the idiosyncrasies of specific domains
overshadow the commonalities coded in the
component library?
• How can NLP be used to pull information
from text to build knowledge systems?
• How can knowledge acquisition systems
use machine learning?