Document 15772682

Knowledge Systems
• Knowledge Systems use formal
representations of knowledge to answer
unanticipated questions with coherent
explanations
• Knowledge System = KB + Q/A +
Explanation Generator +
Knowledge Acq. tools
Advances over Expert Systems
• Coverage of domain, not domain task
• Various modes of reasoning, well integrated
• Domain level explanation
• Rapid construction
Just how advanced are they?
Project Halo*
• Long term: build a Knowledge System
encompassing much of the world’s scientific
knowledge
• Short term: assess current technologies
• Use a portion of the Advanced Placement (AP)
chemistry exam as a metric
* Full support for Project Halo was provided by Vulcan Inc, Seattle, WA
Challenges
Systems must be robust in the face of widely
varying, unanticipated questions.
Explanations are as important as correctness.
Hard-ball evaluation, aimed to expose weaknesses.
New domain and short development time require
using off-the-shelf KR&R methods and systems.
It was not clear at the outset that
these challenges could be met.
Example Questions
• The spectator ions in the reaction of barium nitrate
with sodium sulfate are what? (choices)
• Although nitric acid and phosphoric acid have very
different properties as pure substances, their aqueous
solutions possess many common properties. List some
general properties of these solutions and explain their
common behavior in terms of the species present.
• Explain why a solution of HClO4 and NaClO4 cannot act
as a buffer solution.
• Sodium azide is used in air bags to rapidly produce gas
to inflate the bag. The products of the decomposition
reaction are what? (choices)
Questions were manually encoded in our formal language
Because Questions Vary Widely…
… we cannot anticipate the questions, or
even the type of questions, so a retrieval
method won’t do.
A custom inference method won’t do.
The system must be capable of using its
knowledge in unanticipated ways.
An Example Explanation
• What are the products of the given decomposition reaction?
• By definition, oxidation-reduction reactions occur when electrons are
transferred from the atom that is oxidized to the atom that is reduced. We
need to look for changes in the oxidation states of the elements in the
reaction.
• In the reactants, the oxidation state(s) of the element Na is/are (1). In
the product, the oxidation state(s) is/are (0)
• Therefore, the reaction causes a change in oxidation state.
• Therefore, this is an oxidation reduction reaction.
• By definition, a Binary Ionic-Compound Decomposition Reaction occurs
when a binary ionic compound is heated.
• Therefore, this reaction is a Binary-Ionic Compound Decomposition reaction.
• In general, a Binary Ionic-Compound Decomposition Reaction converts a
binary ionic-compound into basic elements.
• In this reaction, NaN3 reacts to produce Na and N2.
• The products of the decomposition reaction are: (d) Sodium and nitrogen-g
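The oxidation-state step of the explanation above can be sketched in a few lines of Python. This is only an illustration, not the KM encoding: the function names are hypothetical, and the oxidation states are assigned by hand (the value for nitrogen in the azide ion is the average over its three atoms, which the slide does not state).

```python
from fractions import Fraction

# Oxidation states assigned by hand for this illustration (an assumption,
# not output of the Halo system). In NaN3 the azide ion N3^- carries a
# charge of -1 spread over three nitrogen atoms, hence -1/3 on average.
reactant_states = {"Na": Fraction(1), "N": Fraction(-1, 3)}
product_states = {"Na": Fraction(0), "N": Fraction(0)}


def changed_elements(before, after):
    """Return the elements whose oxidation state differs between the two sides."""
    return sorted(e for e in before if e in after and before[e] != after[e])


def is_redox(before, after):
    """A reaction is oxidation-reduction if any oxidation state changes."""
    return bool(changed_elements(before, after))


def explain(before, after):
    """Build a step-by-step explanation in the style of the slide."""
    lines = [
        f"The oxidation state of {e} changes from {before[e]} to {after[e]}."
        for e in changed_elements(before, after)
    ]
    if lines:
        lines.append("Therefore, this is an oxidation-reduction reaction.")
    else:
        lines.append("No oxidation state changes, so this is not an oxidation-reduction reaction.")
    return lines


for line in explain(reactant_states, product_states):
    print(line)
```

Keeping the check (`is_redox`) separate from the text generation (`explain`) mirrors the idea that explanations are produced from the same reasoning steps that yield the answer.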
Our KR&R System *
• KM: KRL-like frame system with FOL semantics.
• …able to represent:
– classes, instances, prototypes
– defaults, fluents, constraints
– (hypothetical) situations
– actions (pre-, post-, and during- conditions)
• …and reason about:
– inheritance with exceptions
– constraints
– automatic classification (given a partial description of an instance,
determine the classes to which it belongs)
– temporal projection (“my car is where I left it”)
– effects of actions
• KM answers questions by interleaving two types of
inference:
– Automatic classification
– Backward chaining
* Details: AAAI’97
Structure of the Knowledge Base*
Two principal types of chemistry knowledge:
– terms, e.g. “binary ionic compound”
– laws, e.g. problem-solving method for computing
products of reactions of binary-ionic compounds
Terms are encoded as definitions to enable
automatic classification.
Laws are encoded as rules to enable backward
chaining.
* Details: KR’04 (Barker et al.)
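The split between terms and laws can be illustrated with a toy Python sketch. It is not KM syntax, and every name in it (`DEFINITIONS`, `RULES`, `classify`, `solve`, the slot names) is hypothetical. Terms are written as definitions that classify an instance from a partial description, and laws are written as rules that are chained backward from the slot being asked about.

```python
# Terms: each class name maps to a definition, i.e. a test that a partial
# instance description must satisfy. (Hypothetical, simplified definition.)
DEFINITIONS = {
    "binary-ionic-compound": lambda x: (
        len(x.get("elements", [])) == 2 and x.get("bonding") == "ionic"
    ),
}

# Laws: each rule names the class it applies to, the slot it concludes,
# the slots it needs, and a method that computes the conclusion.
RULES = [
    {
        "applies_to": "binary-ionic-compound",
        "output": "decomposition-products",
        "inputs": ["elements"],
        # A binary ionic compound decomposes into its basic elements.
        "method": lambda elements: list(elements),
    },
]


def classify(instance):
    """Automatic classification: collect every class whose definition holds."""
    return {name for name, test in DEFINITIONS.items() if test(instance)}


def solve(instance, slot):
    """Backward chaining: find a rule that concludes `slot`, then solve its inputs."""
    if slot in instance:
        return instance[slot]
    for rule in RULES:
        # Classification is interleaved here: a rule fires only if the
        # instance has been classified under the rule's class.
        if rule["output"] == slot and rule["applies_to"] in classify(instance):
            args = [solve(instance, s) for s in rule["inputs"]]
            if all(a is not None for a in args):
                return rule["method"](*args)
    return None


sodium_azide = {"elements": ["Na", "N"], "bonding": "ionic"}
print(classify(sodium_azide))
print(solve(sodium_azide, "decomposition-products"))
```

The sketch has no guard against cyclic rules; a real system would need one.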
The Content of a Chemistry Law
Concentration of Solute Law
Context: the conditions under which the law applies
  a mixture M such that:
    volume(M) = V liters
    has-part(M) includes
      Chemical C such that:
        quantity(C) = Q moles
        concentration(C) = Conc molar
Input: the subset of variables that must be bound
  V, Q
Output: the subset of variables that will be bound
  Conc
Method: the axioms used to compute values for output variables
  Conc ← Q/V
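The four parts of a law (context, input, output, method) map naturally onto a small record type. The Python sketch below is one possible rendering, not the actual KM representation: the `Law` class and its `apply` method are hypothetical, and the structural context (a mixture with a chemical part) is reduced to a simple check that the volume is positive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Bindings = Dict[str, float]


@dataclass(frozen=True)
class Law:
    name: str
    context: Callable[[Bindings], bool]      # conditions under which the law applies
    inputs: Tuple[str, ...]                  # variables that must be bound
    outputs: Tuple[str, ...]                 # variables that will be bound
    method: Callable[[Bindings], Bindings]   # computes values for the outputs

    def apply(self, bindings: Bindings) -> Optional[Bindings]:
        """Return the bindings extended with the outputs, or None if the law does not apply."""
        if not all(v in bindings for v in self.inputs):
            return None
        if not self.context(bindings):
            return None
        return {**bindings, **self.method(bindings)}


# Concentration of Solute Law: Conc <- Q / V
# (V in liters, Q in moles, Conc in molar). The context is simplified
# to "V is positive" for this sketch.
concentration_of_solute = Law(
    name="Concentration of Solute Law",
    context=lambda b: b["V"] > 0,
    inputs=("V", "Q"),
    outputs=("Conc",),
    method=lambda b: {"Conc": b["Q"] / b["V"]},
)

result = concentration_of_solute.apply({"V": 2.0, "Q": 0.5})
print(result)
```

Declaring inputs and outputs explicitly is what lets a backward chainer select a law by the variable it needs and then recurse on the variables the law requires.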
Knowledge Engineering Methodology
Knowledge base built in 4 months:
– Ontological engineering (4 person-months): designed
representations, including structure of terms, laws,
reactions, solutions, etc.
– Knowledge capture (6 person-months): consolidated
70 textbook pages into 35 pages of terms and laws
– Knowledge encoding (15 person-months): coded in KM
500 types and relations, 150 chemistry laws and 65
terms. Compiled a large test suite which was run daily
– Explanation engineering (3 person-months):
augmented the representations of terms and laws with
templates
Results of Project Halo*
• After a 4-month development effort, the
knowledge systems were sequestered and
given a test:
– 165 novel questions: 50 multiple choice; 115
free form response
– Questions translated from English to formal
language by each team, then assessed for
fidelity by Vulcan and team representatives
* Details: AI Magazine (Winter 2004)
• www.projecthalo.com: systems, Q/A, and analyses
Correctness
• Our system’s correctness score corresponds to an AP score of 3 – high enough for credit at
UCSD, UIUC, and many other universities.
• We predict a score of 85% after a 3-month follow-on project.
Explanation Quality
Error Analysis *
We analyzed every point lost. Most
deductions were due to errors in domain
modeling — mistakes that domain experts
would not make. (More later)
Some errors were caused by technology
problems.
* Details: KR’04 (Friedland et al.)
Problems Due to KR&R Technology
• Explanations too verbose: e.g. passages repeated
multiple times with only small variations – graders
expected a general statement that covered them all.
Requires explanation planning
• Questions that require reasoning about our
representations:
– Calculate the pH of a particular substance. Explain why the
result is unreasonable.
– Explain the difference between the subscript “3” and the
coefficient “3” in 3HNO3.
– Explain when and why it’s OK for a particular chemistry
method to use an equation that only approximates the true
answer.
Reasoning about Relevance
Hydrofluoric acid is a weak acid, Ka = 6.8 × 10⁻⁴, and yet it is considered to be a very
reactive compound. For example, HF dissolves glass. The major reason it is considered
highly reactive is:
(a) It is an acid.
(b) It forms H3O+.
(c) It dissociates.
(d) It readily forms very stable fluoride compounds.
(e) It is a weak electrolyte.
All five statements are true. The question requires that the system reason about which of
the multiple true statements is most relevant to the claim.
Bottom Line
• Halo I was a rigorous evaluation of current
Knowledge System technology.
• In general, the systems were more capable
than Vulcan expected.
• The major hurdles to building a Knowledge
System for science are errors (in domain
modeling) and cost ($10K/page).