slide presentation - Department of Computer Science

Capturing and Answering Questions
Posed to a Knowledge-Based System
Peter Clark, John Thompson, William R. Murray, Phil Harrison (Boeing)
Jason Chaw, Bruce Porter, Ken Barker, Peter Yeh, James Fan (UT Austin)
Vinay Chaudhri, Aaron Spaulding (SRI)
Bonnie John (CMU)
Overview
 The context and problem
 Question-Answering
 Controlled Language for Asking Questions
 Reasoning for Answering Questions
 Evaluation and how it worked out
 Reformulation attempts
 Advice
 Common sense
 Future
Context: Project Halo (Vulcan Inc)
 “The Digital Aristotle”
 access to massive amounts of knowledge
in computationally usable form
 First step
 Restricted to science domains
 Advanced high-school (AP) physics, chemistry,
biology
 Knowledge acquisition
 Can domain experts directly enter their
knowledge?
 Question-Answering
 Can non-computer-scientists pose
questions?
 Can the system reason and provide good
answers back?
Some Example AP Questions
Example question (physics)
An alien measures the height of a cliff by
dropping a boulder from rest and measuring the
time it takes to hit the ground below. The boulder
fell for 23 seconds on a planet with an
acceleration of gravity of 7.9 m/s^2. Assuming
constant acceleration and ignoring air
resistance, how high was the cliff?
Example question (chemistry)
Solutions of nickel nitrate and sodium hydroxide are
mixed together. Which of the following statements is true?
a. A precipitate will not form.
b. A precipitate of sodium nitrate will be produced.
c. Nickel hydroxide and sodium nitrate will be produced.
d. Nickel hydroxide will precipitate.
e. Hydrogen gas is produced from the sodium hydroxide.
Question-Asking: Approaches
 Posing complex questions is challenging!
 Templates: Too restricted
 English: Too difficult for
computer to understand
 Formal language: Too
difficult for user to learn
 Controlled language?
The “Controlled Language” Claim
There lies a “sweet spot” between logic and full NL which
is both human-usable and machine-understandable
 Formal language (a quantified logic formula over B(x), R(x,y) and C(y)): too hard for the user
 CPL: “A boulder is dropped”
 Unrestricted natural language (“Consider the following possible situation in which a boulder first…”): too hard for the computer to understand
Example of a CPL encoding of a question
An alien measures the height of a cliff by
dropping a boulder from rest and measuring the
time it takes to hit the ground below. The boulder
fell for 23 seconds on a planet with an
acceleration of gravity of 7.9 m/s^2. Assuming
constant acceleration and ignoring air
resistance, how high was the cliff?
A boulder is dropped.
The initial speed of the boulder is 0 m/s.
The duration of the drop is 23 seconds.
The acceleration of the drop is 7.9 m/s^2.
What is the distance of the drop?
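The CPL encoding above states exactly the quantities needed by the constant-acceleration relation d = v0*t + 0.5*a*t^2. A minimal Python sketch of that arithmetic follows; the function name `drop_distance` is illustrative only and is not part of CPL or AURA.

```python
def drop_distance(initial_speed, acceleration, duration):
    """Distance covered under constant acceleration: d = v0*t + 0.5*a*t**2."""
    return initial_speed * duration + 0.5 * acceleration * duration ** 2

# Values taken from the CPL encoding of the cliff question.
distance = drop_distance(initial_speed=0.0, acceleration=7.9, duration=23.0)
print(f"distance of the drop: {distance:.2f} m")
```

With the stated values this gives a cliff height of roughly 2090 m.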
The Interface (Posing Questions)
Question-Answering: The Interface
Controlled Language for Question-Asking…
 Controlled Language: Not a panacea!
 Not just a matter of grammatical simplification
 Only certain linguistic forms are understood
 Many concepts, many ways of expressing each one
 Huge effort to encode these in the interpreter
 User has to learn acceptable forms
 User needs to make common sense explicit
 Man pulls rope, rope attached to sled → force on sled
 4 wheels support a car → ¼ weight on each wheel
The Question Answering Cycle
Stages shown in the cycle: original text, CPL (controlled English), logic, graph & paraphrase of system’s understanding, question answering; rewriting advice is returned to the user.
CPL (controlled English):
A boulder is dropped.
The initial speed of the boulder is 0 m/s.
The duration of the drop is 23 seconds.
The acceleration of the drop is 7.9 m/s^2.
What is the distance of the drop?
Graph & paraphrase of system’s understanding:
A boulder is the object of a dropping.
The dropping has a duration of 23 seconds.
The dropping has an initial speed of 0 m/s.
The dropping has an acceleration of 7.9 m/s^2.
The dropping has a distance of unknown.
What is the distance?
General Guidelines for CPL
 Write in very simple sentences.
 Avoid “and” & “or” phrases and negatives.
 Avoid “flowery” language.
 Avoid multiple states; instead, describe a single
event with initial and final values.
 Include common-sense facts if needed.
 Ask for a single value in a question, or ask “Is it
true that ...?”
For example: Write in Simple Sentences
 INSTEAD OF:
 A 2 kg block, starting from rest, slides 20 m down a frictionless
inclined plane from X to Y, dropping a vertical distance of 10 m.
 WRITE:
 The mass of a block is 2 kg.
 The initial velocity of the block is 0 m/s.
 The block slides down an inclined plane from X to Y.
 The coefficient of friction of the plane is 0 units.
 X is a point on the plane. Y is a point on the plane.
 The distance between X and Y is 20 m.
 The vertical distance between X and Y is 10 m.
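The simple sentences above follow a regular shape ("The <property> of <object> is <value> <unit>."), which is what makes them machine-readable. The sketch below illustrates the idea with a single regular expression; it is a toy, not the CPL interpreter, and the pattern, the function name `parse_fact` and the tuple layout are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for sentences shaped like
# "The <property> of <a|an|the> <object> is <number> <unit>."
PATTERN = re.compile(
    r"^The (?P<prop>[a-z ]+?) of (?:a|an|the) (?P<obj>[a-z ]+?) is "
    r"(?P<value>-?\d+(?:\.\d+)?) (?P<unit>\S+)\.$"
)

def parse_fact(sentence):
    """Return (object, property, value, unit), or None if the form is not recognized."""
    m = PATTERN.match(sentence.strip())
    if m is None:
        return None
    return (m["obj"], m["prop"], float(m["value"]), m["unit"])

sentences = [
    "The mass of a block is 2 kg.",
    "The initial velocity of the block is 0 m/s.",
    "The coefficient of friction of the plane is 0 units.",
    "A 2 kg block, starting from rest, slides 20 m down a frictionless inclined plane.",
]
facts = [parse_fact(s) for s in sentences]
for sentence, fact in zip(sentences, facts):
    print(fact if fact is not None else f"not understood: {sentence}")
```

The three rewritten sentences parse into facts, while the original compound sentence is rejected, which mirrors why the guidelines ask for one fact per sentence.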
Question-Answering
 Find a “model” (set of equations + assumptions) that
 matches the question scenario
 can provide an answer
 May require making additional assumptions
"An object moves.
The mass of the object is 80 kg.
The initial speed of the object is 17 m/s.
The final speed of the object is 0 m/s.
The distance of the move is 10 m.
What is the force on the object?"
 Unstated assumption:
 Acceleration is constant
 Without this assumption:
 Can’t answer the question
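Once constant acceleration is assumed, the scenario above becomes solvable with v_final^2 = v_initial^2 + 2*a*d followed by F = m*a. The following sketch works through that arithmetic; the function name is hypothetical and the code is not taken from the system.

```python
def force_with_constant_acceleration(mass, v_initial, v_final, distance):
    # Assumption (unstated in the question): acceleration is constant,
    # so v_final**2 = v_initial**2 + 2 * a * distance.
    acceleration = (v_final ** 2 - v_initial ** 2) / (2 * distance)
    # Newton's second law: F = m * a.
    return mass * acceleration

force = force_with_constant_acceleration(
    mass=80.0, v_initial=17.0, v_final=0.0, distance=10.0
)
print(f"force on the object: {force:.1f} N")
```

The result is negative (about -1156 N) because the force opposes the direction of motion.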
Question-Answering
 Basic Problem Solver (BPS)
 Searches space of possible models
 Find model which answers qn under assumptions
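The search described above can be pictured as a loop over candidate models, each of which declares the inputs it needs, the quantity it outputs and the assumptions it makes. The sketch below is a toy illustration only, not the actual BPS: the `Model` class, the two entries in `MODELS` and the `answer` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Model:
    name: str
    inputs: Tuple[str, ...]       # quantities the model needs
    output: str                   # quantity the model can compute
    assumptions: Tuple[str, ...]  # assumptions the model relies on
    solve: Callable[[Dict[str, float]], float]

# Two hypothetical models, both assuming constant acceleration.
MODELS = [
    Model("constant-acceleration-motion",
          ("initial_speed", "acceleration", "duration"), "distance",
          ("acceleration is constant",),
          lambda k: k["initial_speed"] * k["duration"]
                    + 0.5 * k["acceleration"] * k["duration"] ** 2),
    Model("constant-acceleration-force",
          ("mass", "initial_speed", "final_speed", "distance"), "force",
          ("acceleration is constant",),
          lambda k: k["mass"] * (k["final_speed"] ** 2 - k["initial_speed"] ** 2)
                    / (2 * k["distance"])),
]

def answer(known, wanted, models=MODELS):
    """Return (model name, value, assumptions) for the first model that fits, else None."""
    for model in models:
        if model.output == wanted and all(name in known for name in model.inputs):
            return model.name, model.solve(known), model.assumptions
    return None

scenario = {"mass": 80.0, "initial_speed": 17.0, "final_speed": 0.0, "distance": 10.0}
result = answer(scenario, "force")
print(result)
```

Returning the assumptions alongside the value keeps it visible that the answer only holds under them.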
Results
 Tested on topics in AP science in June 2006
 Overall results (knowledge formulation & QA)
 38% (biology), 37.5% (chemistry), 19% (physics)
 Huge achievement!
 But what about the 60%-70% incorrect?
 No single weak point
 Breakdown (chart): 40% correct, 20% missing knowledge, 20% bad interpretation, 20% bad qn formulation
 Also: users needed several attempts to ask qns
Three investigations…
 Reformulation Attempts
 Advice System
 Factoring out common sense
1. Users would often have several attempts…
 Mean # of reformulations ~ 6
 Physics: 6.3 tries/qn (between 1 and 18)
 Chemistry: 6.6 tries/qn (between 1 and 19)
 Biology: 1.5 tries/qn (between 1 and 5)
 Most attempts were spent trying to find a wording that worked
In general, only certain wordings translate to logical
forms that trigger the right solution process, and in
many cases the users appeared to be performing
trial-and-error guessing until they hit a wording that
worked, or they gave up.
1. Users would often have several attempts…
 The SME trying to find a wording which works…
E6. Which of the following ionic compounds are insoluble in water?
[a] BaCO3
[b] BaNO3
[c] Al(OH)3
[d] NaOH
Is it true that BaCO3 is soluble in H2O?
Is it true that BaCO3 dissolves in water?
Is it true that BaCO3 dissolves?
There is a reaction. BaCO3 is the raw material of the reaction.
Is it true that the result of the reaction is an aqueous solution?
What is the solubility of BaCO3?
What is the solubility of BaCO3 in water?
2. CPL’s advice to the User
 “Always specify a unit for numbers (e.g., “10 m”, not just “10”)”: 40%
 “Failed to understand the input. Please rephrase”: 60%
 Not mapping a word to a concept (37%)
 Ungrammatical sentence (5%)
 Not legal CPL (12%)
 Bad chemical formulas (21% in chem)
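Targeted advice such as the unit message above can be produced by a simple check on the input sentence. The sketch below shows one way this might look; it is not CPL's advice library, and `KNOWN_UNITS`, the regular expression and `unit_advice` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical unit vocabulary; a real system would need a far larger one.
KNOWN_UNITS = {"m", "s", "kg", "m/s", "m/s^2", "seconds", "units"}

# A standalone number (not part of a word such as "BaCO3"),
# optionally followed by the next whitespace-separated token.
NUMBER_THEN_TOKEN = re.compile(r"(?<![\w.])(-?\d+(?:\.\d+)?)(?:\s+([^\s?]+))?")

def unit_advice(sentence):
    """Return one advice message per number that lacks a recognized unit."""
    advice = []
    for number, token in NUMBER_THEN_TOKEN.findall(sentence):
        unit = token.rstrip(".,")
        if unit not in KNOWN_UNITS:
            advice.append(
                f'Always specify a unit for numbers (e.g., "{number} m", not just "{number}")'
            )
    return advice

print(unit_advice("The distance of the move is 10."))
print(unit_advice("The distance of the move is 10 m."))
```

The first call yields one advice message and the second yields none; digits inside chemical formulas are skipped because they are preceded by a letter.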
2. CPL’s advice to the User
 1171 CPL advice messages given
 About half of the advice library used at some point
 Chart categories: specific, targeted advice; unable to map word to a concept; not legal CPL; ungrammatical sentence; bad chemical formula notation
3. Making common sense explicit
phys3-#24a: An aeroplane moves exactly in
horizontal direction with a constant velocity
of 50km/h. A parachutist leaves the
aeroplane....
 AURA needs to know v(parachutist) = v(airplane)
phys3-#18: A policeman chases a jewel
thief across city rooftops. Both come to a
gap between two buildings that is 5 m wide
with a horizontal velocity of 5 m/s. What is
the minimum drop of the second building in
comparison to the first one to clear the gap
(most nearly)?
 Need to know difference in height = distance of fall
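With the two common-sense facts made explicit (horizontal velocity is unchanged during the jump, and the height difference equals the distance fallen), the rooftop question reduces to projectile arithmetic. The sketch below assumes g = 9.8 m/s^2, which the question text does not state, and uses a hypothetical function name.

```python
G = 9.8  # m/s^2; Earth's surface gravity, an assumption not given in the question

def drop_while_crossing(gap_width, horizontal_speed, gravity=G):
    # Common sense 1: horizontal speed stays constant, so the time in the air
    # is the gap width divided by the horizontal speed.
    time_of_flight = gap_width / horizontal_speed
    # Common sense 2: the height difference equals the distance fallen from rest
    # (vertically) in that time: 0.5 * g * t**2.
    return 0.5 * gravity * time_of_flight ** 2

drop = drop_while_crossing(gap_width=5.0, horizontal_speed=5.0)
print(f"height difference: {drop:.1f} m")
```

Under that assumption the fall lasts 1 s and the height difference comes out near 4.9 m.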
Summary
 Question-Answering: Challenging!
 Controlled language for question-asking
 A “sweet spot” between logic and language
 But not a panacea:
 Users need to learn how to use and control it
 Expensive to build and maintain
 Our system CPL
 Was able to adequately support QA in AURA
 But:
 users needed several attempts to formulate qns
 CPL’s advice was often too general to be useful
 Challenging to factor out common sense knowledge
 Question answering
 The Basic Problem Solver (BPS)
 Searches for “best” model to answer a qn
 May include heuristic assumptions about the scenario