
Open-Domain
Question Answering
Eric Nyberg
Associate Professor
ehn@cs.cmu.edu
Carnegie Mellon
School of Computer Science
15-381 Lecture, Spring 2003
Copyright © 2003, Carnegie Mellon. All Rights Reserved.
Outline
• What is question answering?
• Typical QA pipeline
• Unsolved problems
• The JAVELIN QA architecture
• Related research areas
These slides and links to other background material can
be found here: http://www.cs.cmu.edu/~ehn/15-381
Question Answering
• Inputs: a question in English; a
set of text and database resources
• Output: a set of possible answers
drawn from the resources
“When is the next train to Glasgow?”  →  [ QA SYSTEM ]  →  “8:35, Track 9.”
(Resources: text corpora & RDBMS)
Ancestors of Modern QA
• Information Retrieval
– Retrieve relevant documents from a
set of keywords; search engines
• Information Extraction
– Template filling from text (e.g. event
detection); e.g. TIPSTER, MUC
• Relational QA
– Translate question to relational DB
query; e.g. LUNAR, FRED
http://trec.nist.gov
Typical TREC QA Pipeline
Input: “A simple factoid question”

Question → Extract Keywords → Query → Search Engine (over the Corpus) → Docs →
Passage Extractor → Answers → Answer Selector → Answer

Output: “A 50-byte passage likely to contain the desired answer” (TREC QA track)
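As a rough illustration of this pipeline, here is a minimal Python sketch. Everything in it is a stand-in (the toy corpus, the stopword-based keyword extractor, the regex answer selector); a real TREC system would use a full search engine and NLP components.

```python
import re

STOPWORDS = {"a", "an", "the", "is", "was", "when", "who", "what", "did", "of", "to"}

def extract_keywords(question):
    """Turn a factoid question into a bag of query keywords (toy version)."""
    return [w for w in re.findall(r"\w+", question.lower()) if w not in STOPWORDS]

def search(corpus, keywords, top_n=3):
    """Rank documents by naive keyword overlap (stand-in for a real search engine)."""
    scored = [(sum(k in doc.lower() for k in keywords), doc) for doc in corpus]
    return [doc for score, doc in sorted(scored, reverse=True)[:top_n] if score > 0]

def extract_passages(docs, keywords, width=50):
    """Pull ~50-character windows around keyword hits (the '50-byte passage')."""
    passages = []
    for doc in docs:
        low = doc.lower()
        for k in keywords:
            i = low.find(k)
            if i >= 0:
                passages.append(doc[max(0, i - width // 2): i + width // 2])
    return passages

def select_answer(passages):
    """Pick the first passage containing a 4-digit year (toy answer selector)."""
    for p in passages:
        match = re.search(r"\b(1\d{3}|20\d{2})\b", p)
        if match:
            return match.group(0), p
    return None, None

corpus = ["R. David Thomas founded Wendy's in 1969 in Columbus, Ohio."]
keywords = extract_keywords("When was Wendy's founded?")
answer, passage = select_answer(extract_passages(search(corpus, keywords), keywords))
print(answer, "|", passage)
```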
Sample Results
Mean Reciprocal Rank (MRR): find the rank of the first correct answer in your
output (1st answer, 2nd answer, etc.), take its reciprocal (1/rank), and
average over the entire test suite.
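A small sketch of the metric as just described (not TREC's official scorer); questions whose correct answer never appears in the output contribute 0.

```python
def mean_reciprocal_rank(results, gold):
    """results: {question: ranked answer list}; gold: {question: correct answer}."""
    total = 0.0
    for q, ranked in results.items():
        for rank, ans in enumerate(ranked, start=1):
            if ans == gold[q]:
                total += 1.0 / rank
                break                 # only the first correct answer counts
    return total / len(results)

# Correct answer at rank 1 and rank 2 -> MRR = (1 + 0.5) / 2 = 0.75
print(mean_reciprocal_rank(
    {"q1": ["1912", "1913"], "q2": ["8:40", "8:35"]},
    {"q1": "1912", "q2": "8:35"}))
```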
Functional Evolution
• Traditional QA Systems (TREC)
– Question treated like keyword query
– Single answers, no understanding
Q: Who is prime minister of India?
<find a person name close to prime,
minister, India (within 50 bytes)>
A: John Smith is not prime minister
of India
Functional Evolution [2]
• Future QA Systems
– System understands questions
– System understands answers and
interprets which are most useful
– System produces sophisticated
answers (list, summarize, evaluate)
What other airports are near Niletown?
Where can helicopters land close to the
embassy?
Major Research Challenges
• Acquiring high-quality, high-coverage
lexical resources
• Improving document retrieval
• Improving document understanding
• Expanding to multi-lingual corpora
• Flexible control structure
– “beyond the pipeline”
• Answer Justification
– Why should the user trust the answer?
– Is there a better answer out there?
Why NLP is Required
• Question:
“When was Wendy’s founded?”
• Passage candidate:
– “The renowned Murano glassmaking industry, on an
island in the Venetian lagoon, has gone through
several reincarnations since it was founded in 1291.
Three exhibitions of 20th-century Murano glass are
coming up in New York. By Wendy Moonan.”
• Answer:
20th Century
Predicate-argument
structure
• Q336: When was Microsoft established?
• Difficult because Microsoft tends to establish lots of
things…
Microsoft plans to establish manufacturing partnerships in
Brazil and Mexico in May.
• Need to be able to detect sentences in which `Microsoft’
is object of `establish’ or close synonym.
• Matching sentence:
Microsoft Corp was founded in the US in 1975, incorporated in
1981, and established in the UK in 1982.
Why Planning is Required
• Question: What is the occupation
of Bill Clinton’s wife?
– No documents contain these
keywords plus the answer
• Strategy: decompose into two
questions:
– Who is Bill Clinton’s wife? = X
– What is the occupation of X?
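A minimal sketch of this decomposition strategy. The answer() helper and its toy lookup table are stand-ins for a full QA pipeline; the point is only the control flow: answer the inner question first, then substitute its result into the outer question.

```python
def answer(question, kb):
    """Stand-in for a full QA pipeline: look the question up in a toy 'corpus'."""
    return kb.get(question)

def answer_composed(inner_q, outer_template, kb):
    """Answer a question no single document answers by chaining two questions."""
    x = answer(inner_q, kb)                       # Who is Bill Clinton's wife? -> X
    if x is None:
        return None
    return answer(outer_template.format(x), kb)   # What is the occupation of X?

kb = {
    "Who is Bill Clinton's wife?": "Hillary Clinton",
    "What is the occupation of Hillary Clinton?": "U.S. Senator",
}
print(answer_composed("Who is Bill Clinton's wife?",
                      "What is the occupation of {}?", kb))
```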
JAVELIN: Justification-based Answer
Valuation through Language Interpretation
Carnegie Mellon Univ. (Language Technologies Institute)
OBJECTIVES
• QA as planning, by developing a glass-box planning infrastructure
• Universal auditability, by developing a detailed set of labeled dependencies that form a traceable network of reasoning steps
• Utility-based information fusion

[Architecture diagram: the JAVELIN GUI and the Planner (with its Domain Model and operator/action models) drive the Execution Manager, which invokes the Question Analyzer, Retrieval Strategist, Request Filler, Answer Generator, and other modules over search engines & document collections; process history and results are stored in the Data Repository.]

PLAN: address the full Q/A task
• Question analysis: question typing, interpretation, refinement, clarification
• Information seeking: document retrieval, entity and relation extraction
• Multi-source information fusion: multi-faceted answers, redundancy and contradiction detection
JAVELIN Objectives
• QA as Planning
– Create a general QA planning system
– How should a QA system represent
its chain of reasoning?
• QA and Auditability
– How can we improve a QA system’s
ability to justify its steps?
– How can we make QA systems open
to machine learning?
JAVELIN Objectives [2]
• Utility-Based Information Fusion
– Perceived utility is a function of many
different factors
– Create and tune utility metrics, e.g.:
U = Argmax_k [ F( Rel(I,Q,T),        -- relevance
                  Nov(I,T,A),        -- novelty
                  Ver(S,Sup(I,S)),   -- veracity, support
                  Div(S),            -- diversity
                  Cmp(I,A),          -- comprehensibility
                  Cst(I,A) ) ]       -- cost
I: Info item, Q: Question, S: Source, T: Task context, A: Analyst
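One way to read the formula: score each candidate information item by combining the six factors and keep the best-scoring one. The sketch below assumes a weighted linear combination for F and uses toy factor functions; the real factor models and their combination would be learned and tuned.

```python
WEIGHTS = {"rel": 0.35, "nov": 0.15, "ver": 0.2, "div": 0.1, "cmp": 0.1, "cst": -0.1}

def utility(item, question, source, task, analyst, factors, weights=WEIGHTS):
    """F as a weighted sum of factor scores (cost enters with a negative weight)."""
    return (weights["rel"] * factors["rel"](item, question, task)
            + weights["nov"] * factors["nov"](item, task, analyst)
            + weights["ver"] * factors["ver"](source, item)
            + weights["div"] * factors["div"](source)
            + weights["cmp"] * factors["cmp"](item, analyst)
            + weights["cst"] * factors["cst"](item, analyst))

def best_item(items, question, source, task, analyst, factors):
    """Argmax_k over the candidate information items."""
    return max(items, key=lambda i: utility(i, question, source, task, analyst, factors))

# Toy factor functions (real ones would be learned/tuned models)
factors = {
    "rel": lambda i, q, t: 1.0 if "1912" in i else 0.2,
    "nov": lambda i, t, a: 0.5,
    "ver": lambda s, i: 0.8,
    "div": lambda s: 0.3,
    "cmp": lambda i, a: 0.9,
    "cst": lambda i, a: 0.1,
}
print(best_item(["The Titanic sank in 1912.", "Ships sink sometimes."],
                "When did the Titanic sink?", "AP", "report", "analyst1", factors))
```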
Control Flow
[Flow diagram: strategic decision points in the control flow]
Repository
ERD
(Entity Relationship Diagram)
JAVELIN User
Interface
Javelin Architecture
[Architecture diagram: the JAVELIN GUI, the Planner (with its Domain Model and operator/action models), and the Execution Manager coordinate the Question Analyzer, Retrieval Strategist, Information Extractor, Answer Generator, and other modules over search engines & document collections; process history and results are stored in the Data Repository.]
• Integrated with XML
• Modules can run on different servers
Module Integration
• Via XML DTDs for each object type
• Modules use a simple XML object-passing protocol built on TCP/IP
• Execution Manager takes care of
checking objects in/out of
Repository
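A minimal sketch of what such an XML object-passing protocol might look like; the RequestObject fields and the length-prefix framing here are illustrative assumptions, not JAVELIN's actual DTDs or wire format.

```python
import socket
import struct
import xml.etree.ElementTree as ET

def build_request_object(question, qtype, atype, keywords):
    """Illustrative RequestObject; field names are assumptions, not JAVELIN's DTD."""
    ro = ET.Element("RequestObject")
    ET.SubElement(ro, "Question").text = question
    ET.SubElement(ro, "QType").text = qtype
    ET.SubElement(ro, "AType").text = atype
    kws = ET.SubElement(ro, "Keywords")
    for k in keywords:
        ET.SubElement(kws, "Keyword").text = k
    return ro

def send_xml(sock, element):
    """Frame the XML payload with a 4-byte length prefix and send it."""
    payload = ET.tostring(element, encoding="utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_xml(sock):
    """Read one length-prefixed XML message and parse it."""
    size = struct.unpack("!I", sock.recv(4))[0]
    data = b""
    while len(data) < size:
        data += sock.recv(size - len(data))
    return ET.fromstring(data)

# Loopback demo using a local socket pair (no network needed)
a, b = socket.socketpair()
send_xml(a, build_request_object("When did the Titanic sink?",
                                 "event-completion", "temporal",
                                 ["Titanic", "sink"]))
msg = recv_xml(b)
print(msg.find("QType").text, [k.text for k in msg.find("Keywords")])
a.close(); b.close()
```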
Sample Log
File Excerpt
Components communicate via
XML object representations
Question Analyzer
• Taxonomy of question-answer types and type-specific constraints
• Knowledge integration
• Pattern matching approach for this year’s evaluation

[Flowchart: question input (XML format) → Tokenizer (Brill Tagger, BBN Identifier, KANTOO lexifier) → token information extraction (WordNet, KANTOO lexicon) → Parser (KANTOO grammars). If an F-structure (FR) is obtained, an event/entity template filler feeds the Request object builder; otherwise pattern matching feeds the Request object builder. The QA taxonomy and type-specific constraints inform both paths. Output: Request object + system result (XML format).]
Question Taxonomies
• Q-Types
  – Express relationships between events, entities and attributes
  – Influence Planner strategy
• A-Types
  – Express semantic type of valid answers

Example                          Q-Type              A-Type
“When did the Titanic sink?”     event-completion    timepoint
“Who was Darth Vader's son?”     concept-completion  person-name
“What is thalassemia?”           definition          definition

We expect to add more A-types and refine granularity.
Sample of Q-Type Hierarchy
Sample of A-Type Hierarchy
Request Object
Question: “Who was the first U.S. president to appear on TV?”
• Question type: event-completion
• Answer type: person-name
• Computation element: order 1
• Keyword set: first, U.S. president, appear, TV
• F-structure:
    (event (subject (person-name ?)
                    (occupation “U.S. president”))
           (act appear)
           (order 1)
           (theme TV))
How the Retrieval
Strategist Works
• Inputs:
– Keywords and keyphrases
– Type of answer desired
– Resource constraints
• Min/Max documents, time, etc.
• Outputs:
– Ranked set of documents
– Location of keyword matches
How the Retrieval
Strategist Works
• Constructs sequences of queries
based on a Request Object
– Start with very constrained queries
• High quality matches, low probability of
success
– Progressively relax queries until
search constraints are met
• Lower quality matches, high probability
of success
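A sketch of the progressive-relaxation loop, using the Inquery-style operators shown on the next slide. run_query is a placeholder for the actual search-engine call, and the stopping condition (a minimum document count) is an assumed stand-in for the real resource constraints.

```python
def build_query(keywords, operator):
    """Compose an Inquery-style query string, e.g. #UW20(Titanic #syn(sink sank))."""
    return f"{operator}({' '.join(keywords)})"

def retrieve_with_relaxation(keywords, run_query, min_docs=30):
    """Start with tight proximity operators; relax until enough documents come back."""
    # Ordered from most constrained (high precision) to least constrained (high recall);
    # a real strategy would include more intermediate operators.
    operators = ["#3", "#UW20", "#PASSAGE250", "#SUM"]
    docs = []
    for op in operators:
        docs = run_query(build_query(keywords, op))
        if len(docs) >= min_docs:
            return docs, op
    return docs, operators[-1]      # give back whatever the loosest query found

# Usage sketch with a fake engine that only satisfies the loosest operators
fake_engine = lambda q: ["doc"] * (40 if q.startswith(("#PASSAGE250", "#SUM")) else 5)
docs, used = retrieve_with_relaxation(["Titanic", "#syn(sink sank)", "*date"], fake_engine)
print(used, len(docs))
```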
Sample Search Strategy
Inquery Operator   Type?   Query
#3                 Yes     #3(Titanic #syn(sink sank) *date)
#UW20              Yes     #UW20(Titanic #syn(sink sank) *date)
:                  :       :
#PASSAGE250        Yes     #PASSAGE250(Titanic #syn(sink sank) *date)
#SUM               Yes     #SUM(Titanic #syn(sink sank) *date)
Retrieval Strategist (RS):
TREC Results Analysis
• Success: % of questions where at least
1 answer document was found
• TREC 2002:
Success rate @ 30 docs: ~80%
@ 60 docs: ~85%
@ 120 docs: ~86%
• Reasonable performance for a simple
method, but room for improvement
RS: Ongoing Improvements
• Improved incremental relaxation
– Searching for all keywords too restrictive
• Use subsets prioritized by discriminative ability
– Remove duplicate documents from results
• Don’t waste valuable list space
– 15% fewer failures (229 test questions)
• Overall success rate: @ 30 docs 83% (was 80%)
@ 60 docs 87% (was 85%)
• Larger improvements unlikely without additional
techniques, such as constrained query expansion
• Investigate constrained query expansion
– WordNet, Statistical methods
What Does the Request Filler Do?
• Input:
– Request Object (from QA module)
– Document Set (from RS module)
• Output:
– Set of extracted answers which match the
desired type (Request Fill objects)
– Confidence scores
• Role in JAVELIN: Extract possible
answers & passages from documents
Request Filler Steps
• Filter passages
– Match answer type?
– Contain sufficient keywords?
• Create variations on passages
– POS tagging (Brill)
– Cleansing (punctuation, tags, etc.)
– Expand contractions
– Reduce surface forms to lexemes
• Calculate feature values
• A classifier scores the passages, which
are output with confidence scores
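A compact sketch of the filter-and-normalize steps above; the answer-type check, contraction table, and keyword threshold are toy stand-ins (real POS tagging and lemmatization are omitted).

```python
import re

def matches_answer_type(passage, atype):
    """Very rough answer-type check: temporal passages must contain a year."""
    if atype == "temporal":
        return re.search(r"\b1\d{3}\b|\b20\d{2}\b", passage) is not None
    return True   # other types would have their own checks (names, locations, ...)

CONTRACTIONS = {"don't": "do not", "it's": "it is", "wasn't": "was not"}

def normalize_passage(passage):
    """Create a cleaned variation: strip markup, expand contractions, tidy whitespace."""
    text = re.sub(r"<[^>]+>", " ", passage)
    for short, full in CONTRACTIONS.items():
        text = re.sub(short, full, text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()

def filter_passages(passages, keywords, atype, min_keywords=2):
    kept = []
    for p in passages:
        hits = sum(k.lower() in p.lower() for k in keywords)
        if hits >= min_keywords and matches_answer_type(p, atype):
            kept.append(normalize_passage(p))
    return kept

print(filter_passages(
    ["<p>The Titanic sank in 1912.</p>", "It's a big ship."],
    ["Titanic", "sank"], "temporal"))
```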
Features
• Features are self-contained algorithms that
score passages in different ways
• Example: Simple Features
– # Keywords present
– Normalized window size
– Average <Answer,Keywords> distance
• Example: Pattern Features
– cN [..] cV [..] in/on [date]
– [date], iN [..] cV [..]
• Any procedure that returns a numeric value
is a valid feature!
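For illustration, here is how the “simple” features above might be computed; the exact definitions used in JAVELIN may differ.

```python
def keyword_count(passage, keywords):
    """Simple feature: how many query keywords appear in the passage."""
    return sum(k.lower() in passage.lower() for k in keywords)

def normalized_window_size(passage, keywords):
    """Span covering the keyword hits, scaled by passage length (smaller is tighter)."""
    low = passage.lower()
    positions = [low.find(k.lower()) for k in keywords if k.lower() in low]
    if len(positions) < 2:
        return 1.0
    return (max(positions) - min(positions)) / max(len(passage), 1)

def avg_answer_keyword_distance(passage, keywords, answer):
    """Average character distance between the candidate answer and the keywords."""
    low = passage.lower()
    a = low.find(answer.lower())
    dists = [abs(low.find(k.lower()) - a) for k in keywords if k.lower() in low]
    return sum(dists) / len(dists) if dists and a >= 0 else float("inf")

# Any procedure that returns a numeric value can be plugged in as a feature:
FEATURES = [keyword_count, normalized_window_size]
passage = "The Titanic sank on April 14th, 1912, after striking an iceberg."
print([f(passage, ["Titanic", "sank"]) for f in FEATURES])
```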
Learning An Answer
Confidence Function
• Supervised learning
– Answer type-specific model
– Aggregate model across answer types
• Decision Tree – C4.5
– Variable feature dependence
– Fast enough to re-learn from each
new instance
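The slides name C4.5; as a comparable sketch, the snippet below trains scikit-learn's decision tree (an assumption, not the actual JAVELIN code) on labeled feature vectors and reads an answer confidence from the leaf's class probability.

```python
from sklearn.tree import DecisionTreeClassifier

# Each row: [% keywords present, avg <answer, keyword> distance, max scaled window size]
X_train = [
    [1.0, 12.0, 0.9], [0.8, 20.0, 0.7], [0.9, 15.0, 0.8],   # passages containing the answer
    [0.2, 80.0, 0.1], [0.1, 95.0, 0.2], [0.3, 70.0, 0.1],   # passages not containing it
]
y_train = [1, 1, 1, 0, 0, 0]

clf = DecisionTreeClassifier(max_depth=3)     # stand-in for the C4.5 model in the slides
clf.fit(X_train, y_train)

candidate_features = [0.85, 18.0, 0.75]
confidence = clf.predict_proba([candidate_features])[0][1]   # P(correct) from the leaf
print(round(confidence, 3))
```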
A When Q-Type Decision Tree
[Decision tree figure: the root and internal nodes test “% keywords present in the passage” (thresholds around 0.2 and 0.75), “average <date, keywords> distance” (threshold around 60), and “maximum scaled keyword window size”; leaves carry coverage/error counts such as 876.0/91.8, 62.0/11.6, 33.0/10.3, and 5.0/1.]
Semantic Analysis Would Help
• Lemmatization maps surface forms to a common representation:
  – “The company said it believes the expenses of the restructuring will be recovered by the end of 1992”
  – The/DT company/NN say/VBD it/PRP believe/VBZ the/DT expense/NNS of/IN the/DT restructuring/NN will/MD be/VB recover/VBN by/IN the/DT end/NN of/IN 1992/CD
• Similar constructions with different fillers:
  – “…the artist expressed” / “…the performer expressed”
  – “The company said it believes …” / “Microsoft said it believes …”
• Context can reverse the meaning:
  – “It is a misconception the Titanic sank on April the 15th, 1912 …” / “The Titanic sank on April the 15th, 1912 …”
Information Extractor (IX):
TREC Analysis
If the answer is in the doc set returned by the Retrieval
Strategist, does the IX module identify it as an answer
candidate with a high confidence score?
                    TREC 8   TREC 9   TREC 10
Inputs              200      693      500
Answer in top 5     71       218      119
Answer in docset    189      424      313
IX: Current & Future Work
• Enrich feature space beyond surface
patterns & surface statistics
• Perform AType-specific learning
• Perform adaptive semantic expansion
• Enhance training data quantity/quality
• Tune objective function
NLP for Information
Extraction
• Simple statistical classifiers are not
sufficient on their own
• Need to supplement statistical
approach with natural language
processing to handle more
complex queries
Example question
• Question: “When was Wendy’s founded?”
• Question Analyzer extended output:
– { temporal(?x), found(*, Wendy’s) }
• Passage discovered by retrieval module:
– “R. David Thomas founded Wendy’s in 1969, …”
• Conversion to predicate form by Passage Analyzer:
– { founded(R. David Thomas, Wendy’s), DATE(1969), … }
• Unification of QA literals against PA literals:
– Equiv(found(*,Wendy’s),
founded(R. David Thomas, Wendy’s))
– Equiv(temporal(?x),
DATE(1969))
– ?x := 1969
• Answer: 1969
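A toy version of the literal-matching step above, with a small hand-written synonym table (found ≈ founded, temporal ≈ DATE) and “?”/“*” treated as variable and wildcard; real unification over parser output is considerably richer.

```python
# Predicate synonym sets are assumptions for this sketch (found ~ founded, temporal ~ DATE)
SYNONYMS = {"found": {"founded", "establish", "established"}, "temporal": {"DATE"}}

def pred_equiv(p, q):
    """Two predicate names match if they are equal or listed as synonyms."""
    return p == q or q in SYNONYMS.get(p, set()) or p in SYNONYMS.get(q, set())

def unify(question_literal, passage_literal, bindings):
    """Match one question literal against one passage literal, extending the bindings."""
    q_pred, q_args = question_literal[0], question_literal[1:]
    p_pred, p_args = passage_literal[0], passage_literal[1:]
    if not pred_equiv(q_pred, p_pred) or len(q_args) != len(p_args):
        return None
    new = dict(bindings)
    for qa, pa in zip(q_args, p_args):
        if qa.startswith("?"):            # variable: bind it to the passage value
            new[qa] = pa
        elif qa != "*" and qa != pa:      # '*' matches anything; constants must agree
            return None
    return new

question_literals = [("temporal", "?x"), ("found", "*", "Wendy's")]
passage_literals = [("founded", "R. David Thomas", "Wendy's"), ("DATE", "1969")]

bindings = {}
for ql in question_literals:
    for pl in passage_literals:
        extended = unify(ql, pl, bindings)
        if extended is not None:
            bindings = extended
            break

print(bindings)   # {'?x': '1969'}
```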
Answer Generator
• Currently the last module in the pipeline.
• Main tasks:
– Combination of different sorts of evidence for
answer verification.
– Detection and combination of similar answer
candidates to address answer granularity.
– Initiation of processing loops to gather more
evidence.
– Generation of answers in required format.
Answer Generator input
• Analyzed question (RequestObject):
– Question/Answer type (qtype/atype)
– Number of expected answers;
– Syntactic parse and keywords.
• Passages (RequestFills):
– Marked candidates of right semantic type (right NE
type);
– Confidences computed using set of text-based
(surface) features such as keyword placement.
Answer Generator output
• Answer string from document (for now).
• Set of text passages (RequestFills) Answer Generator
decided were supportive of answer.
• Or, requests for more information (exceptions)
passed on to Planner:
– “Not enough answer candidates”
– “Can’t distinguish answer candidates”
Types of evidence
• Currently implemented: Redundancy, frequency
counts.
– Preference given to more often occurring, normalized
answer candidates.
• Next step: Structural information from parser.
– Matching question and answer predicate-argument
structure.
– Detecting hypotheticals, negation, etc.
• Research level: Combining collection-wide statistics
with ‘symbolic’ QA.
– Ballpark estimates of temporal boundaries of
events/states.
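A minimal sketch of the currently implemented evidence type, redundancy/frequency counting over normalized candidates; normalize() here is a trivial stand-in for the normalization described on the next slides.

```python
from collections import Counter

def normalize(candidate):
    """Trivial stand-in for real normalization (dates to ISO 8601, name variants, ...)."""
    return candidate.strip().lower()

def rank_by_redundancy(candidates):
    """Prefer answer candidates that occur more often across retrieved passages."""
    counts = Counter(normalize(c) for c in candidates)
    return counts.most_common()

print(rank_by_redundancy(["1912", "April 14th, 1912", "1912 ", "1911"]))
# [('1912', 2), ('april 14th, 1912', 1), ('1911', 1)]
```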
Example
• Q: What year did the Titanic sink?
A: 1912
Supporting evidence:
It was the worst peacetime disaster involving a British ship
since the Titanic sank on the 14th of April, 1912.
The Titanic sank after striking an iceberg in the North Atlantic
on April 14th, 1912.
The Herald of Free Enterprise capsized off the Belgian port of
Zeebrugge on March 6, 1987, in the worst peacetime disaster
involving a British ship since the Titanic sank in 1912.
What happened?
• Different formats for answer candidates detected,
normalized and combined:
– `April 14th, 1912’
– `14th of April, 1912’
• Supporting evidence detected and combined:
– `1912’ supports `April 14th, 1912’
• Structure of date expressions understood and correct
piece output:
– `1912’ rather than `April 14th, 1912’
• Most frequent answer candidate found and output:
– `April 14th, 1912’ rather than something else.
Answer Normalization
• Request Filler/Answer Generator aware of NE types:
dates, times, people names, company names, locations,
currency expressions.
• `April 14th, 1912’, `14th of April 1912’, `14 April 1912’
instances of same date, but different strings.
• For date expressions, normalization performed to ISO
8601 (YYYY-MM-DD) in Answer Generator.
• ‘summer’, ‘last year’, etc. remain as strings.
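A sketch of the ISO 8601 normalization step using Python's strptime with a handful of surface patterns; the real module presumably covers many more formats, and vague expressions like ‘summer’ fall through unchanged, as on the slide.

```python
from datetime import datetime
import re

# A few of the many surface patterns a real normalizer would need
PATTERNS = ["%B %d, %Y", "%d %B %Y", "%d %B, %Y", "%B %d %Y", "%Y"]

def normalize_date(text):
    """Map date strings like 'April 14th, 1912' or '14 April 1912' to ISO 8601."""
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", text)      # 14th -> 14
    cleaned = re.sub(r"\bof\b", "", cleaned)                  # 14 of April -> 14 April
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" ,")
    for pattern in PATTERNS:
        try:
            dt = datetime.strptime(cleaned, pattern)
            return dt.strftime("%Y") if pattern == "%Y" else dt.strftime("%Y-%m-%d")
        except ValueError:
            continue
    return text     # 'summer', 'last year', ... stay as plain strings

for s in ["April 14th, 1912", "14th of April, 1912", "14 April 1912", "summer"]:
    print(s, "->", normalize_date(s))
```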
Answer Normalization
• Normalization enables comparison and detection of
redundant or complementary answers.
• Define supporting evidence as piece of text expressing
same or less specific information.
• E.g., `1912’ supports `April 12th, 1912’.
• Complementary evidence: ‘1912’ complements ‘April 12th’.
• Normalization and supporting extend to other NE types:
– `Clinton’ supports `Bill Clinton’;
– `William Clinton’ and `Bill Clinton’ are normalized to same.
– For locations, `Pennsylvania’ supports `Pittsburgh’.
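A sketch of the “supports” test over normalized values: a less specific expression supports a more specific one when the fields it specifies agree. The name check is a crude token-containment stand-in; real matching (aliases, gazetteers for locations) would be richer.

```python
def date_supports(less_specific, more_specific):
    """A normalized date supports another if it expresses the same or less specific info."""
    return more_specific.startswith(less_specific)     # '1912' supports '1912-04-14'

def name_supports(short, full):
    """'Clinton' supports 'Bill Clinton' (token containment; real matching is richer)."""
    return short.lower() in full.lower().split()

print(date_supports("1912", "1912-04-14"))        # True: counts as supporting evidence
print(date_supports("1912-04-14", "1912"))        # False
print(name_supports("Clinton", "Bill Clinton"))   # True
```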
Other forms of evidence
• Q: Name all the bills that were passed during the Bush
administration.
• Not likely to find passages mentioning `bill’, `pass’,
`Bush administration’.
• When was the Bush administration?
• `Symbolic’ QA: look for explicit answer in collection,
might not be present.
• `Statistical’ QA: look at distribution of documents
mentioning Bush administration.
• Combining evidence of different sorts!
Other forms of evidence
• Can we figure out if Bush administration was around
when document was written?
• Look at tense/aspect/wording.
• Forward time references
– Bush administration will do something
• Backward time references
– Bush administration has done something
• Hypothesis:
– Backward time references provide information about
onset of event;
– Forward time references provide information about end of
event.
Other forms of evidence
• Bush administration forward references
[Plot: number of documents per day with forward references to the Bush administration, plotted against document time stamps; the administration change is marked, indicating the event end.]
Other forms of evidence
• Bush administration backward references
[Plot: number of documents per day with backward references to the Bush administration, plotted against document time stamps; the administration change is marked, indicating the event onset.]
Planning in JAVELIN
• Enable generation of new question-answering strategies at run-time
• Improve ability to recover from bad
decisions as information is
collected
• Gain insight into when different QA
components are most useful
Planner Integration
[Diagram: the JAVELIN GUI sends the question to the Planner and receives the answer or a dialog response; the Planner, guided by the Domain Model (JAVELIN operator/action models), issues execution requests (exe A, exe E, exe F, ...) through the Execution Manager, which runs modules A, E, F, ..., returns their results, and stores process history and data in the Repository.]
Current Domain Operators
• QuestionAnalyzer module called as a precursor to planning
• Demonstrates generation of multiple search paths, feedback loops

RETRIEVE_DOCUMENTS
  pre: (and (request ?q ?ro)
            (> (extracted_terms ?ro) 0)
            (> request_quality 0))

EXTRACT_DT_CANDIDATE_FILLS
  pre: (and (retrieved_docs ?docs ?ro)
            (== (expected_atype ?ro) location_t)
            (> docset_quality 0.3))

EXTRACT_KNN_CANDIDATE_FILLS
  pre: (and (retrieved_docs ?docs ?ro)
            (!= (expected_atype ?ro) location_t)
            (> docset_quality 0.3))

RANK_CANDIDATES
  pre: (and (candidate_fills ?fills ?ro ?docs)
            (> fillset_quality 0))

RESPOND_TO_USER
  pre: (and (interactive_session)
            (request ?q ?ro)
            (ranked_answers ?ans ?ro ?fills)
            (> (max_ans_score ?ans) 0.1)
            (> answer_quality 0))

ASK_USER_FOR_ANSWER_TYPE
  pre: (and (interactive_session)
            (request ?q ?ro)
            (or (and (ranked_answers ?ans ?ro ?fills)
                     (< (max_ans_score ?ans) 0.1))
                (no_docs_found ?ro)
                (no_fills_found ?ro ?docs)))

ASK_USER_FOR_MORE_KEYWORDS
  pre: (and (interactive_session)
            (request ?q ?ro)
            (or (and (ranked_answers ?ans ?ro ?fills)
                     (< (max_ans_score ?ans) 0.1))
                (no_docs_found ?ro)
                (no_fills_found ?ro ?docs)))
Current Domain Operators
more detailed operator view...
RETRIEVE_DOCUMENTS (?q - question ?ro - qtype)
  pre:  (and (request ?q ?ro)
             (> (extracted_terms ?ro) 0)
             (> request_quality 0))
  dbind: ?docs  (genDocsetID)
         ?dur   (estTimeRS (expected_atype ?ro))
         ?pnone (probNoDocs ?ro)
         ?pgood (probDocsHaveAns ?ro)
         ?dqual (estDocsetQual ?ro)
  execute: (RetrievalStrategist ?docs ?ro 10 15 300)
  effects:
    ?pnone                ((no_docs_found ?ro)
                           (scale-down request_quality 2)
                           (assign docset_quality 0)
                           (increase system_time ?dur))
    ?pgood                ((retrieved_docs ?docs ?ro)
                           (assign docset_quality ?dqual)
                           (increase system_time ?dur))
    (1 - ?pgood - ?pnone) ((retrieved_docs ?docs ?ro)
                           (scale-down request_quality 2)
                           (assign docset_quality 0)
                           (increase system_time ?dur))
Illustrative Examples
Where is bile produced?
• Overcomes current limitations of system “location” knowledge
• Uses answer candidate confidence to trigger feedback loop
Top 3 answers found during the initial pass (with “location” answer type):
  1: Moscow (Conf: 0.01825)
  2: China (Conf: 0.01817)
  3: Guangdong Province (Conf: 0.01817)

Top 3 answers displayed (with user-specified “object” answer type; ‘liver’ ranked 6th):
  1: gallbladder (Conf: 0.58728)
  2: dollars (Conf: 0.58235)
  3: stores (Conf: 0.58147)

1st iteration:
  <RETRIEVE_DOCUMENTS RetrievalStrategist DS2216 RO2262 10 15 300>
  <EXTRACT_DT_CANDIDATE_FILLS DTRequestFiller FS2216 RO2262 DS2216 900>
  <RANK_CANDIDATES AnswerGenerator AL2196 RO2262 FS2216 180>
  <ASK_USER_FOR_ANSWER_TYPE AskUserForAtype Q74050 RO2262>
  <ASK_USER_FOR_MORE_KEYWORDS AskUserForKeywords Q74050 RO2262>
2nd iteration:
  <RETRIEVE_DOCUMENTS RetrievalStrategist DS2217 RO2263 10 15 300>
  <EXTRACT_KNN_CANDIDATE_FILLS KNNRequestFiller FS2217 RO2263 DS2217 900>
  <RANK_CANDIDATES AnswerGenerator AL2197 RO2263 FS2217 180>
  <RESPOND_TO_USER RespondToUser A2204 AL2197 Q74050 RANKED>
Illustrative Examples
Who invented the road traffic cone?
• Overcomes current inability to relax phrases during document
retrieval
• Uses answer candidate confidence scores to trigger feedback loop
Top 3 answers found during the initial pass (using terms ‘invented’ and ‘road traffic cone’):
  1: Colvin (Conf: 0.0176)
  2: Vladimir Zworykin (Conf: 0.0162)
  3: Angela Alioto (Conf: 0.01483)

Top 3 answers displayed (with additional user-specified term ‘traffic cone’; correct answer is ‘David Morgan’):
  1: Morgan (Conf: 0.4203)
  2: Colvin (Conf: 0.0176)
  3: Angela Alioto (Conf: 0.01483)

1st iteration:
  <RETRIEVE_DOCUMENTS RetrievalStrategist DS2221 RO2268 10 15 300>
  <EXTRACT_KNN_CANDIDATE_FILLS KNNRequestFiller FS2221 RO2268 DS2221 900>
  <RANK_CANDIDATES AnswerGenerator AL2201 RO2268 FS2221 180>
  <ASK_USER_FOR_ANSWER_TYPE AskUserForAtype Q74053 RO2268>
  <ASK_USER_FOR_MORE_KEYWORDS AskUserForKeywords Q74053 RO2268>
2nd iteration:
  <RETRIEVE_DOCUMENTS RetrievalStrategist DS2222 RO2269 10 15 300>
  <EXTRACT_KNN_CANDIDATE_FILLS KNNRequestFiller FS2222 RO2269 DS2222 900>
  <RANK_CANDIDATES AnswerGenerator AL2202 RO2269 FS2222 180>
  <RESPOND_TO_USER RespondToUser A2207 AL2202 Q74053 RANKED>
Multilingual Question
Answering
• Goals
– English questions
– Multilingual information sources (Jpn/Chi)
– English/Multilingual Answers
• Extensions to existing JAVELIN modules
– Question Analyzer
– Retrieval Strategist
– Information Extractor
– Answer Generator
Multilingual Architecture
[Architecture diagram: English questions go to the Question Analyzer, supported by machine translation and a Bilingual Dictionary Module; the Retrieval Strategist (RS) searches separate indexes over English, Japanese, Chinese, and other-language corpora; language-specific Information Extractors (English, Japanese, Chinese, other) feed the Answer Generator, which produces the answers; an Encoding Converter handles character encodings.]
15-381 Project Topics
• Create more, better RF/IX modules
– More intelligent feature extractors
– Smarter classifiers
– Train on different answer types
– Plug in and evaluate your work in the context of the larger system
• End-to-end QA system
– Focus on a particular question type
– Utilize existing RS module for document
retrieval
– Evaluate on TREC test suites (subsets)
Questions?