NLQA

Natural Language Question Answering
Gary Geunbae Lee
NLP Lab.
Professor@POSTECH
Chief Scientist@DiQuest
KISS Tutorial
Question Answering
One of the oldest NLP tasks (punched card systems in 1961)
Simmons, Klein, McConlogue. 1964. Indexing and Dependency Logic for Answering English Questions. American Documentation 15:30, 196-204.
Question Answering: IBM’s Watson
• Won Jeopardy on February 16, 2011!
• Example clue: WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDAVIA” INSPIRED THIS AUTHOR’S MOST FAMOUS NOVEL
• Answer: Bram Stoker
Apple’s Siri
Wolfram Alpha
Task Definition
• Open-domain QA
  - Find answers to open-domain natural language questions from large collections of documents
  - Questions are usually factoid questions
  - Documents include texts, web pages, databases, multimedia, maps, etc.
• Canned QA
  - Map new questions onto predefined questions for which answers already exist
  - Example: FAQ Finder
Example (1)
• Q: What is the fastest car in the world?
• Correct answer: "…, the Jaguar XJ220 is the dearest (Pounds 415,000), fastest (217 mph / 350 kmh) and most sought-after car in the world."
• Wrong answer: "… will stretch Volkswagen’s lead in the world’s fastest-growing vehicle market. Demand is expected to soar in the next few years as more Chinese are able to afford their own cars."
Example (2)
• Q: How did Socrates die?
• Answer: "… Socrates drank poisoned wine."
• Needs world knowledge and reasoning: anyone drinking or eating something that is poisoned is likely to die.
A Taxonomy of QA Systems (Moldovan et al., ACL 2002)
• Class 1: Factual
  Q: What is the largest city in Germany?
  A: … Berlin, the largest city in Germany, …
• Class 2: Simple reasoning
  Q: How did Socrates die?
  A: … Socrates poisoned himself …
• Class 3: Fusion list
  Q: What are the arguments for and against prayer in school?
  The answer must be fused across several texts.
• Class 4: Interactive context
  Requires clarification questions.
• Class 5: Speculative
  Q: Should the Fed raise interest rates at their next meeting?
  The answer provides analogies to past actions.
Enabling Technologies and Applications
• Enabling technologies
  - POS Tagger
  - Parser
  - WSD
  - Named Entity Tagger
  - Information Retrieval
  - Information Extraction
  - Inference engine
  - Ontology (WordNet)
  - Knowledge Acquisition
  - Knowledge Classification
  - Language generation
• Applications
  - Smart agents
  - Situation Management
  - E-Commerce
  - Summarization
  - Tutoring / Learning
  - Personal Assistant in business
  - On-line Documentation
  - On-line Troubleshooting
  - Semantic Web
Question Taxonomies
• Wendy Lehnert’s taxonomy
  - 13 question categories
  - Based on Schank’s Conceptual Dependency
• Arthur Graesser’s taxonomy
  - 18 question categories
  - Expands Lehnert’s categories
  - Based on speech act theory
• Question types in LASSO [Moldovan et al., TREC-8]
  - Combines question stems, question focus and phrasal heads
  - Unifies the question class with the answer type via the question focus
Earlier QA Systems
• The QUALM system (Wendy Lehnert, 1978)
  - Reads stories and answers questions about what was read
• The LUNAR system (W. Woods, 1977)
  - One of the first user evaluations of question answering systems
• The STUDENT system (Terry Winograd, 1977)
  - Solved high school algebra problems
• The MURAX system (Julian Kupiec, 1993)
  - Uses an online encyclopedia for closed-class questions
• The START system (Boris Katz, 1997)
  - Uses annotations to process questions from the Web
• FAQ Finder (Robin Burke, 1997)
  - Uses Frequently Asked Questions from Usenet newsgroups
Recent QA Conferences
(values given as TREC-8 / TREC-9 / TREC-10 / TREC-11 / NTCIR-3)
• No. of questions: 200 / 693 / 500 / 500 / 200
• Collection size: 2GB / 3GB / 3GB / 3GB (AQUAINT) / 280MB
• Question source: from text / query logs / query logs / query logs / from text
• Participating systems: 41 / 75 / 92 / 76 / 36
• Performance: 64.5% / 76% / 69% / 85.6% / 60.8%
• Answer format: 50 & 250 bytes / 50 & 250 bytes / 50 bytes / exact / exact
• Evaluation: Top 5, MRR / Top 5, MRR / Top 5, MRR / Top 1, CW / Top 5, MRR
• Tasks: main task / main task / main, list, context, null answer / main, list, null answer / Task 1, Task 2, Task 3
Types of Questions in Modern Systems
• Factoid questions
  - Who wrote “The Universal Declaration of Human Rights”?
  - How many calories are there in two slices of apple pie?
  - What is the average age of the onset of autism?
  - Where is Apple Computer based?
• Complex (narrative) questions
  - In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
  - What do scholars think about Jefferson’s position on dealing with pirates?
Commercial systems: mainly factoid questions
• Where is the Louvre Museum located? → In Paris, France
• What’s the abbreviation for limited partnership? → L.P.
• What are the names of Odin’s ravens? → Huginn and Muninn
• What currency is used in China? → The yuan
• What kind of nuts are used in marzipan? → Almonds
• What instrument does Max Roach play? → Drums
Paradigms for QA
• IR-based approaches
• TREC; IBM Watson; Google
• Knowledge-based and Hybrid approaches
• IBM Watson; Apple Siri; Wolfram Alpha; True Knowledge (Evi)
Many questions can already be answered by web search
IR-based Question Answering
IR-based Factoid QA
[Pipeline diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval over an indexed document collection → relevant documents → Passage Retrieval → passages → Answer Processing → Answer]
IR-based Factoid QA
• QUESTION PROCESSING
• Detect question type, answer type, focus, relations
• Formulate queries to send to a search engine
• PASSAGE RETRIEVAL
• Retrieve ranked documents
• Break into suitable passages and rerank
• ANSWER PROCESSING
• Extract candidate answers
• Rank candidates
• using evidence from the text and external sources
Knowledge-based approaches (Siri)
• Build a semantic representation of the query
• Times, dates, locations, entities, numeric quantities
• Map from this semantics to query structured data or resources
  • Geospatial databases
  • Ontologies (Wikipedia infoboxes, DBpedia, WordNet, Yago)
  • Restaurant review sources and reservation services
  • Scientific databases
Hybrid approaches (IBM Watson)
• Build a shallow semantic representation of the query
• Generate answer candidates using IR methods
• Augmented with ontologies and semi-structured data
• Score each candidate using richer knowledge sources
• Geospatial databases
• Temporal reasoning
• Taxonomical classification
Question Processing
Things to extract from the question
• Answer Type Detection
• Decide the named entity type (person, place) of the answer
• Query Formulation
• Choose query keywords for the IR system
• Question Type classification
• Is this a definition question, a math question, a list question?
• Focus Detection
• Find the question words that are replaced by the answer
• Relation Extraction
• Find relations between entities in the question
Question Processing
Example: “They’re the two states you could be reentering if you’re crossing Florida’s northern border”
• Answer Type: US state
• Query: two states, border, Florida, north
• Focus: the two states
• Relations: borders(Florida, ?x, north)
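Because later stages consume all of these outputs together, it can help to see them as one record. A minimal sketch in Python; the class and field names are illustrative, not from any particular system:

from dataclasses import dataclass, field

@dataclass
class AnalyzedQuestion:
    """Container for question-processing outputs (illustrative field names)."""
    text: str
    answer_type: str          # e.g. a named-entity class such as "US_STATE"
    query_keywords: list      # keywords handed to the IR engine
    focus: str                # the question words that the answer replaces
    relations: list = field(default_factory=list)   # e.g. ("borders", "Florida", "?x", "north")

q = AnalyzedQuestion(
    text="They're the two states you could be reentering if you're crossing Florida's northern border",
    answer_type="US_STATE",
    query_keywords=["two states", "border", "Florida", "north"],
    focus="the two states",
    relations=[("borders", "Florida", "?x", "north")],
)
print(q.answer_type, q.query_keywords)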
Answer Type Detection: Named Entities
• Who founded Virgin Airlines?
• PERSON
• What Canadian city has the largest population?
• CITY.
Answer Type Taxonomy
Xin Li, Dan Roth. 2002. Learning Question Classifiers. COLING'02
• 6 coarse classes
• ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC
• 50 finer classes
• LOCATION: city, country, mountain…
• HUMAN: group, individual, title, description
• ENTITY: animal, body, color, currency…
Part of Li & Roth’s Answer Type Taxonomy
[Taxonomy excerpt:]
• ABBREVIATION: abbreviation, expression
• DESCRIPTION: definition, reason
• ENTITY: animal, currency, food
• HUMAN: group, individual, title
• LOCATION: city, country, state
• NUMERIC: date, distance, money, percent, size
Answer types in Jeopardy
Ferrucci et al. 2010. Building Watson: An Overview of the DeepQA Project. AI Magazine. Fall 2010. 59-79.
• 2500 answer types in 20,000 Jeopardy question sample
• The most frequent 200 answer types cover < 50% of data
• The 40 most frequent Jeopardy answer types
he, country, city, man, film, state, she, author, group, here, company,
president, capital, star, novel, character, woman, river, island, king,
song, part, series, sport, singer, actor, play, team, show,
actress, animal, presidential, composer, musical, nation,
book, title, leader, game
Answer Type Detection
• Hand-written rules
• Machine Learning
• Hybrids
Answer Type Detection
• Regular expression-based rules can get some cases:
• Who {is|was|are|were} PERSON
• PERSON (YEAR – YEAR)
• Other rules use the question headword:
(the headword of the first noun phrase after the wh-word)
• Which city in China has the largest number of
foreign financial companies?
• What is the state flower of California?
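A minimal sketch of such rule-based answer type detection; the regular expressions and type labels below are illustrative, not a complete rule set:

import re

# Each rule: (pattern over the question string, answer type it predicts).
RULES = [
    (re.compile(r"^who\s+(is|was|are|were)\b", re.I), "PERSON"),
    (re.compile(r"^where\b", re.I), "LOCATION"),
    (re.compile(r"^when\b", re.I), "DATE"),
    (re.compile(r"^how\s+(many|much)\b", re.I), "QUANTITY"),
    (re.compile(r"^(which|what)\s+city\b", re.I), "CITY"),
]

def detect_answer_type(question: str) -> str:
    for pattern, answer_type in RULES:
        if pattern.search(question):
            return answer_type
    return "UNKNOWN"   # fall back to a default or an ML classifier

print(detect_answer_type("Who was the 23rd president of the United States?"))  # PERSON
print(detect_answer_type("Which city in China has the largest number of foreign financial companies?"))  # CITY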
Answer Type Detection
• Most often, we treat the problem as machine learning
classification
• Define a taxonomy of question types
• Annotate training data for each question type
• Train classifiers for each question class
using a rich set of features.
• features include those hand-written rules!
Features for Answer Type Detection
• Question words and phrases
• Part-of-speech tags
• Parse features (headwords)
• Named entities
• Semantically related words
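A minimal sketch of the classification approach using scikit-learn, with word n-grams standing in for the richer feature set listed above; the toy training pairs are illustrative (real systems train on thousands of labeled questions, e.g. the Li & Roth data):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: (question, coarse answer type).
train = [
    ("Who founded Virgin Airlines?", "HUMAN"),
    ("Who wrote The Universal Declaration of Human Rights?", "HUMAN"),
    ("What Canadian city has the largest population?", "LOCATION"),
    ("Where is Apple Computer based?", "LOCATION"),
    ("How many calories are there in two slices of apple pie?", "NUMERIC"),
    ("When did the Jurassic period end?", "NUMERIC"),
    ("What is water spinach?", "DESCRIPTION"),
    ("What does LOL stand for?", "ABBREVIATION"),
]
questions, labels = zip(*train)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(questions, labels)

print(clf.predict(["What city hosted the 1988 Summer Olympics?"]))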
Keyword Selection Algorithm
Dan Moldovan, Sanda Harabagiu, Marius Pasca, Rada Mihalcea, Richard Goodrum, Roxana Girju and Vasile Rus. 1999. Proceedings of TREC-8.
1. Select all non-stop words in quotations
2. Select all NNP words in recognized named entities
3. Select all complex nominals with their adjectival modifiers
4. Select all other complex nominals
5. Select all nouns with their adjectival modifiers
6. Select all other nouns
7. Select all verbs
8. Select all adverbs
9. Select the QFW word (skipped in all previous steps)
10. Select all other words
Choosing keywords from the query
Slide from Mihai Surdeanu
Who coined the term “cyberspace” in his novel “Neuromancer”?
cyberspace/1  Neuromancer/1  term/4  novel/4  coined/7
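A rough sketch of the keyword selection above. The POS tags are assumed to come from an upstream tagger (hand-supplied here), and the ten steps are folded into four coarse priority tiers:

def select_keywords(tokens, quoted):
    """tokens: (word, POS) pairs from an upstream tagger; quoted: words that
    appeared inside quotation marks. Returns (keyword, priority) pairs; lower
    priority values are added to the query first."""
    stop = {"who", "the", "in", "his", "a", "an", "of"}
    selected = []
    for word, pos in tokens:
        if word.lower() in stop:
            continue
        if word in quoted:
            priority = 1            # step 1: non-stop words in quotations
        elif pos == "NNP":
            priority = 2            # step 2: NNP words in named entities
        elif pos.startswith("NN"):
            priority = 4            # steps 3-6: complex nominals and nouns
        elif pos.startswith("VB"):
            priority = 7            # step 7: verbs
        else:
            priority = 10           # steps 8-10: everything else
        selected.append((word, priority))
    return sorted(selected, key=lambda pair: pair[1])

tokens = [("Who", "WP"), ("coined", "VBD"), ("the", "DT"), ("term", "NN"),
          ("cyberspace", "NN"), ("in", "IN"), ("his", "PRP$"),
          ("novel", "NN"), ("Neuromancer", "NNP")]
print(select_keywords(tokens, quoted={"cyberspace", "Neuromancer"}))
# [('cyberspace', 1), ('Neuromancer', 1), ('term', 4), ('novel', 4), ('coined', 7)]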
Passage Retrieval
• Step 1: IR engine retrieves documents using query terms
• Step 2: Segment the documents into shorter units
• something like paragraphs
• Step 3: Passage ranking
• Use answer type to help rerank passages
Features for Passage Ranking
Either in rule-based classifiers or with supervised machine learning
• Number of named entities of the right type in the passage
• Number of query words in the passage
• Number of question N-grams also in the passage
• Proximity of query keywords to each other in the passage
• Longest sequence of question words in the passage
• Rank of the document containing the passage
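A minimal sketch computing two of these features and combining them with hand-set weights; a supervised ranker would learn the weights, and the function and feature names are illustrative:

def passage_features(passage, query_terms):
    """Two of the features above: query-word overlap and the longest run of
    question words. The named-entity feature is omitted; it needs a tagger."""
    tokens = [t.strip('.,;!?"').lower() for t in passage.split()]
    query_hits = len(set(tokens) & query_terms)
    longest = run = 0
    for t in tokens:
        run = run + 1 if t in query_terms else 0
        longest = max(longest, run)
    return {"query_hits": query_hits, "longest_run": longest}

def passage_score(features, doc_rank):
    # Hand-set weights for illustration only.
    return 1.0 * features["query_hits"] + 1.5 * features["longest_run"] - 0.1 * doc_rank

passage = "The official height of Mount Everest is 29035 feet"
feats = passage_features(passage, {"height", "mount", "everest"})
print(feats, passage_score(feats, doc_rank=3))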
Answer Extraction
• Run an answer-type named-entity tagger on the passages
• Each answer type requires a named-entity tagger that detects it
• If answer type is CITY, tagger has to tag CITY
• Can be full NER, simple regular expressions, or hybrid
• Return the string with the right type:
• Who is the prime minister of India? (PERSON)
  "Manmohan Singh, Prime Minister of India, had told left leaders that the deal would not be renegotiated."
• How tall is Mt. Everest? (LENGTH)
  "The official height of Mount Everest is 29035 feet."
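A minimal sketch of type-driven candidate extraction with regular expressions; the pattern table is illustrative, and types like PERSON or CITY would instead call a full NER tagger:

import re

# Illustrative extractors: expected answer type -> pattern that finds strings of that type.
EXTRACTORS = {
    "LENGTH": re.compile(r"\b\d[\d,.]*\s*(?:feet|foot|ft|meters?|metres?|km|miles?)\b", re.I),
    "DATE":   re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),
    "MONEY":  re.compile(r"[$£€]\s?\d[\d,.]*\b"),
}

def extract_candidates(answer_type, passages):
    pattern = EXTRACTORS.get(answer_type)
    if pattern is None:
        return []          # types like PERSON/CITY would call an NER tagger instead
    candidates = []
    for passage in passages:
        candidates.extend(pattern.findall(passage))
    return candidates

passages = ["The official height of Mount Everest is 29035 feet",
            "Everest was first summited in 1953."]
print(extract_candidates("LENGTH", passages))   # ['29035 feet']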
Ranking Candidate Answers
• But what if there are multiple candidate answers!
Q: Who was Queen Victoria’s second son?
• Answer Type: Person
• Passage:
The Marie biscuit is named after Marie Alexandrovna,
the daughter of Czar Alexander II of Russia and wife of
Alfred, the second son of Queen Victoria and Prince
Albert
Use machine learning:
Features for ranking candidate answers
Answer type match: Candidate contains a phrase with the correct answer type.
Pattern match: Regular expression pattern matches the candidate.
Question keywords: # of question keywords in the candidate.
Keyword distance: Distance in words between the candidate and query keywords
Novelty factor: A word in the candidate is not in the query.
Apposition features: The candidate is an appositive to question terms
Punctuation location: The candidate is immediately followed by a
comma, period, quotation marks, semicolon, or exclamation mark.
Sequences of question terms: The length of the longest sequence
of question terms that occurs in the candidate answer.
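A minimal sketch using two of these features (answer type match and keyword distance) with hand-set weights; real systems learn the weights over many more features, and the helper names are illustrative:

def keyword_distance(tokens, cand_tokens, keywords):
    """Minimum token distance between any occurrence of the candidate's words
    and any question keyword in the passage."""
    cand_pos = [i for i, t in enumerate(tokens) if t in cand_tokens]
    kw_pos = [i for i, t in enumerate(tokens) if t in keywords]
    if not cand_pos or not kw_pos:
        return len(tokens)
    return min(abs(c - k) for c in cand_pos for k in kw_pos)

def score_candidate(candidate, cand_type, tokens, keywords, expected_type):
    type_match = 1.0 if cand_type == expected_type else 0.0
    distance = keyword_distance(tokens, set(candidate.lower().split()), keywords)
    # Hand-set weights for illustration only.
    return 3.0 * type_match - 0.2 * distance

passage = ("The Marie biscuit is named after Marie Alexandrovna, the daughter of "
           "Czar Alexander II of Russia and wife of Alfred, the second son of "
           "Queen Victoria and Prince Albert")
tokens = [t.strip(".,").lower() for t in passage.split()]
keywords = {"queen", "victoria", "second", "son"}
candidates = [("Marie Alexandrovna", "PERSON"), ("Alexander II", "PERSON"), ("Alfred", "PERSON")]
ranked = sorted(candidates, reverse=True,
                key=lambda c: score_candidate(c[0], c[1], tokens, keywords, "PERSON"))
print(ranked)   # "Alfred" ranks first because it sits closest to the question keywords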
Candidate Answer scoring in IBM Watson
• Each candidate answer gets scores from >50 components
• (from unstructured text, semi-structured text, triple stores)
• logical form (parse) match between question and candidate
• passage source reliability
• geospatial location
• California is ”southwest of Montana”
• temporal relationships
• taxonomic classification
Common Evaluation Metrics
1. Accuracy (does answer match gold-labeled answer?)
2. Mean Reciprocal Rank
• For each query return a ranked list of M candidate answers.
• Its score is 1/Rank of the first right answer.
• Take the mean over all N queries
MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i
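A minimal sketch of computing MRR over a batch of queries; the candidate lists and gold answers below are made up for illustration:

def mrr(ranked_answer_lists, gold_answers):
    """ranked_answer_lists[i] is the system's ranked candidates for query i;
    gold_answers[i] is the set of acceptable answers. Queries whose gold answer
    never appears contribute 0 (a common convention)."""
    total = 0.0
    for candidates, gold in zip(ranked_answer_lists, gold_answers):
        for rank, answer in enumerate(candidates, start=1):
            if answer in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_answer_lists)

runs = [["Paris", "Lyon"], ["yen", "yuan", "won"], ["Edison", "Tesla"]]
gold = [{"Paris"}, {"yuan"}, {"Marconi"}]
print(mrr(runs, gold))   # (1/1 + 1/2 + 0) / 3 = 0.5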
A Generic QA Architecture
[Diagram: Question → Question Processing → QA query → Document Retrieval (over the document collection) → Documents → Answer Extraction & Formulation → Answers; supported by resources (dictionary, ontology, …)]
IBM: Statistical QA (Ittycheriah et al., TREC-9, TREC-10)
[Diagram: Question → Answer Type Prediction → answer type; Question → Query & Focus Expansion → IR → Docs → Named Entity Marking → Answer Selection → Answers]
• 31 answer categories
• Named entity tagging
• Focus expansion using WordNet
• Query expansion using an encyclopedia
• Dependency relation matching
• A maximum entropy model is used in both Answer Type Prediction and Answer Selection
USC/ISI: Webclopedia (Hovy et al., TREC-9, TREC-10)
[Diagram: Question → Parsing → Create Query → IR → Docs → Parsing → Select & rank sentences → Matching (QA typology, constraint patterns) → Inference → Ranking → Answers]
• Parses questions and top segments with CONTEX
• Query expansion using WordNet
• Scores each sentence using word scores
• Matches constraint patterns against parse trees
• Matches semantic types against parse tree elements
LCC: PowerAnswer (Moldovan et al., ACL-02)
[Diagram: Question → M1: keyword pre-processing (split/bind/spell) → M2: construction of the question representation → M3: derivation of the expected answer type → M4: keyword selection → M5: keyword expansion → M6: retrieval of documents and passages → M7: passage post-filtering → M8: identification of candidate answers → M8': logic proving → M9: answer ranking → answer formulation → Answers; with three feedback loops (Loop 1 to Loop 3, described below)]
LCC: PowerAnswer (cont'd)
• Keyword alternations
  - Morphological alternations: invent → inventor
  - Lexical alternations: How far → distance
  - Semantic alternations: like → prefer
• 3 feedback loops
  - Loop 1: adjust Boolean queries
  - Loop 2: lexical alternations
  - Loop 3: semantic alternations
• Logic proving
  - Unification between the question and answer logic forms
Relation Extraction
• Answers: Databases of Relations
• born-in(“Emma Goldman”, “June 27 1869”)
• author-of(“Cao Xue Qin”, “Dream of the Red Chamber”)
• Draw from Wikipedia infoboxes, DBpedia, FreeBase, etc.
• Questions: Extracting Relations in Questions
Whose granddaughter starred in E.T.?
(acted-in ?x “E.T.”)
(granddaughter-of ?x ?y)
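A minimal sketch of answering such a question by joining two relations over a toy triple store; the triples are hard-coded here, whereas real systems populate them from infoboxes, DBpedia, and similar sources:

# Toy triple store.
TRIPLES = [
    ("Drew Barrymore", "acted-in", "E.T."),
    ("Drew Barrymore", "granddaughter-of", "John Barrymore"),
    ("Henry Thomas", "acted-in", "E.T."),
]

def query(subject=None, relation=None, obj=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [(s, r, o) for (s, r, o) in TRIPLES
            if (subject is None or s == subject)
            and (relation is None or r == relation)
            and (obj is None or o == obj)]

# "Whose granddaughter starred in E.T.?"
# Join (acted-in ?x "E.T.") with (granddaughter-of ?x ?y) and return ?y.
answers = {y for (x, _, _) in query(relation="acted-in", obj="E.T.")
             for (_, _, y) in query(subject=x, relation="granddaughter-of")}
print(answers)   # {'John Barrymore'}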
Temporal Reasoning
• Relation databases
• (and obituaries, biographical dictionaries, etc.)
• IBM Watson
”In 1594 he took a job as a tax collector in Andalusia”
Candidates:
• Thoreau is a bad answer (born in 1817)
• Cervantes is possible (was alive in 1594)
Geospatial knowledge
(containment, directionality, borders)
• Beijing is a good answer for ”Asian city”
• California is ”southwest of Montana”
• geonames.org:
Context and Conversation
in Virtual Assistants like Siri
• Coreference helps resolve ambiguities
U: “Book a table at Il Fornaio at 7:00 with my mom”
U: “Also send her an email reminder”
• Clarification questions:
U: “Chicago pizza”
S: “Did you mean pizza restaurants in Chicago
or Chicago-style pizza?”
Answering harder questions
Q: What is water spinach?
A: Water spinach (ipomoea aquatica) is a semi-aquatic leafy green plant with long
hollow stems and spear- or heart-shaped leaves, widely grown throughout Asia as a
leaf vegetable. The leaves and stems are often eaten stir-fried flavored with salt or in
soups. Other common names include morning glory vegetable, kangkong (Malay),
rau muong (Viet.), ong choi (Cant.), and kong xin cai (Mand.). It is not related to
spinach, but is closely related to sweet potato and convolvulus.
Answering a harder question
Q: In children with an acute febrile illness, what is the efficacy of
single medication therapy with acetaminophen or ibuprofen in
reducing fever?
A: Ibuprofen provided greater temperature decrement and longer
duration of antipyresis than acetaminophen when the two drugs
were administered in approximately equal doses. (PubMedID:
1621668, Evidence Strength: A)
Answering harder questions via query-focused summarization
• The (bottom-up) snippet method
• Find a set of relevant documents
• Extract informative sentences from the documents (using tf-idf, MMR)
• Order and modify the sentences into an answer
• The (top-down) information extraction method
• build specific answerers for different question types:
• definition questions,
• biography questions,
• certain medical questions
The Information Extraction method
• a good biography of a person contains:
• a person’s birth/death, fame factor, education, nationality and so on
• a good definition contains:
• genus or hypernym
• The Hajj is a type of ritual
• a medical answer about a drug’s use contains:
• the problem (the medical condition),
• the intervention (the drug or procedure), and
• the outcome (the result of the study).
Architecture for complex question answering: definition questions
S. Blair-Goldensohn, K. McKeown and A. Schlaikjer. 2004. Answering Definition Questions: A Hybrid Approach.
"What is the Hajj?" (Ndocs=20, Len=8)
• Document Retrieval → 11 Web documents, 1127 total sentences:
  "The Hajj, or pilgrimage to Makkah [Mecca], is the central duty of Islam. More than two million Muslims are expected to take the Hajj this year. Muslims must perform the hajj at least once in their lifetime if physically and financially able. The Hajj is a milestone event in a Muslim's life. The annual hajj begins in the twelfth month of the Islamic year (which is lunar, not solar, so that hajj and Ramadan fall sometimes in summer, sometimes in winter). The Hajj is a week-long pilgrimage that begins in the 12th month of the Islamic lunar calendar. Another ceremony, which was not connected with the rites of the Ka'ba before the rise of Islam, is the Hajj, the annual pilgrimage to 'Arafat, about two miles east of Mecca, toward Mina…"
• Predicate Identification → 9 genus-species sentences and 383 non-specific definitional sentences:
  "The Hajj, or pilgrimage to Makkah (Mecca), is the central duty of Islam."
  "The Hajj is a milestone event in a Muslim's life."
  "The hajj is one of five pillars that make up the foundation of Islam."
  ...
• Definition Creation (sentence clusters, importance ordering, data-driven analysis)
POSTECH QA (SiteQ): Answer Types
• Top 18 categories, 84 in total
• Top level: DATE, TIME, PROPERTY, LANGUAGE_UNIT, LANGUAGE, SYMBOLIC_REP, ACTION, ACTIVITY, LIFE_FORM, QUANTITY, NATURAL_OBJECT, LOCATION, SUBSTANCE, ARTIFACT, GROUP, PHENOMENON, STATUS, BODY_PART, …
• Example subtypes: AGE, DISTANCE, DURATION, LENGTH, MONETARY_VALUE, NUMBER, PERCENTAGE, POWER, RATE, SIZE, SPEED, TEMPERATURE, VOLUME, WEIGHT, …
Layout of SiteQ/J
[Diagram: Question → Part-Of-Speech Tagging (POSTAG/J) → Document Retrieval (POSNIR/J, index DB) → Chunking & Category Processing (semantic category dictionary; NP/VP chunking, syntactic/semantic category check) → Passage Selection (divide documents into dynamic passages, score passages); Identification of Answer Type (LSP grammar for questions: locate question word, LSP matching, date constraint check) → Detection of Answer Candidates (LSP grammar for answers: LSP matching) → Answer Filtering & Scoring & Ranking → Answer]
Question Processing (SiteQ)
[Diagram: question → Tagger → NP Chunker → Query Formatter → query; also → Answer Type Determiner (Normalizer + RE Matcher, using the Qnorm dictionary and LSPs) → answer type]
Lexico-Semantic Patterns
• An LSP is a pattern expressed in terms of lexical entries, POS tags, syntactic categories and semantic categories; it is more flexible than a surface pattern
• Used for
  - Identifying the answer type of a question
  - Detecting answer candidates in selected passages
Answer Type Determination (1)
Who was the 23rd president of the United States?
• Tagger: Who/WP was/VBD the/DT 23rd/JJ president/NN of/IN the/DT United/NP States/NPS ?/SENT
• NP Chunker: Who/WP was/VBD [[ the/DT 23rd/JJ president/NN ] of/IN [ the/DT United/NP States/NPS ]] ?/SENT
• Normalizer: %who%be@person
• RE Matcher: (%who)(%be)(@person) → PERSON
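A minimal sketch of the normalize-then-match idea: words are mapped to class tokens and the normalized string is matched against LSP-style regular expressions. The dictionary and patterns are illustrative, not SiteQ's actual Qnorm dictionary or LSP grammar:

import re

# Illustrative normalization dictionary: surface words -> class tokens.
QNORM = {
    "who": "%who", "whom": "%who",
    "is": "%be", "was": "%be", "are": "%be", "were": "%be",
    "president": "@person", "author": "@person", "inventor": "@person",
    "city": "@location", "country": "@location",
}

# Illustrative LSP-style patterns over the normalized string -> answer type.
LSP_RULES = [
    (re.compile(r"(%who)(%be)(@person)"), "PERSON"),
    (re.compile(r"(%who)(%be)"), "PERSON"),
    (re.compile(r"(%be)?(@location)"), "LOCATION"),
]

def normalize(question):
    tokens = re.findall(r"[A-Za-z]+", question.lower())
    return "".join(QNORM[t] for t in tokens if t in QNORM)

def answer_type(question):
    norm = normalize(question)
    for pattern, atype in LSP_RULES:
        if pattern.search(norm):
            return atype
    return "UNKNOWN"

q = "Who was the 23rd president of the United States?"
print(normalize(q), "->", answer_type(q))   # %who%be@person -> PERSON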
Dynamic Passage Selection (1)
• Analysis of answer passages for TREC-10 questions
  - About 80% of query terms occurred within a 50-word (or 3-sentence) window centered on the answer string
• Definition of passage (see the sketch below)
  - Prefer a sentence window to a word window
  - Prefer dynamic passages to static passages
  - A passage is defined as any three consecutive sentences
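A minimal sketch of generating such dynamic passages, assuming sentence splitting has already been done upstream:

def dynamic_passages(sentences, window=3):
    """Yield every window of `window` consecutive sentences (overlapping),
    rather than fixed, non-overlapping blocks."""
    if len(sentences) <= window:
        yield tuple(sentences)
        return
    for i in range(len(sentences) - window + 1):
        yield tuple(sentences[i:i + window])

doc = ["S1.", "S2.", "S3.", "S4.", "S5."]
for p in dynamic_passages(doc):
    print(p)
# ('S1.', 'S2.', 'S3.'), ('S2.', 'S3.', 'S4.'), ('S3.', 'S4.', 'S5.')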
Answer Extraction (SiteQ)
[Diagram: query-term detection → query-term filtering (stop-word / term list) → answer detection core (noun-phrase chunking, pivot creation, answer detection, stemming with Porter's stemmer, AAD processing) → answer filtering (WordNet) → answer scoring]
Answer Detection (1)
… The/DT velocity/NN of/IN light/NN is/VBZ 300,000/CD km/NN per/IN second/JJ in/IN the/DT physics/NN textbooks/NNS …
• Matched LSP: cd@unit_length%per@unit_time
• Detected candidate: speed|1|4|4
Answer Scoring
• Based on
  - Passage score
  - Number of unique terms matched in the passage
  - Distances from each matched term

AScore = λ · PScore · (ptuc / qtuc) + (1 - λ) · LSPwgt · (ptuc / (ptuc + avgdist))

LSPwgt: the weight of the LSP grammar
avgdist: the average distance between the answer candidate and each matched term
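A sketch of the scoring formula as code, under the assumption that ptuc is the number of unique matched terms in the passage, qtuc the number of unique query terms, and λ an interpolation weight; these readings are inferred from the variable names and are not stated in the original:

def answer_score(pscore, ptuc, qtuc, avgdist, lsp_wgt, lam=0.5):
    """Combine passage evidence and LSP evidence as in the formula above.
    pscore:  score of the passage containing the candidate
    ptuc:    unique query terms matched in the passage (assumed reading)
    qtuc:    unique query terms in the question (assumed reading)
    avgdist: average distance between the candidate and each matched term
    lsp_wgt: weight of the LSP grammar rule that detected the candidate
    lam:     interpolation weight lambda (illustrative default)."""
    return (lam * pscore * ptuc / qtuc
            + (1 - lam) * lsp_wgt * ptuc / (ptuc + avgdist))

print(answer_score(pscore=2.4, ptuc=3, qtuc=4, avgdist=5.0, lsp_wgt=1.0))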
Performance in TREC-10

Results by rank of the first correct answer (SiteQ vs. the average of 67 runs):
• Rank 1: 121 questions (strict), 124 (lenient); avg. of 67 runs: 88.58
• Rank 2: 45 (strict), 49 (lenient); avg.: 28.24
• Rank 3: 24 (strict), 29 (lenient); avg.: 20.46
• Rank 4: 15 (strict), 16 (lenient); avg.: 12.57
• Rank 5: 11 (strict), 14 (lenient); avg.: 12.46
• No answer in top 5: 276 (strict), 260 (lenient); avg.: 329.7
• MRR: 0.320 (strict), 0.335 (lenient); avg.: 0.234

MRR by question type (frequency / strict / lenient):
• how + adj/adv: 31 / 0.316 / 0.332
• how do: 2 / 0.250 / 0.250
• what do: 24 / 0.050 / 0.050
• what is: 242 / 0.308 / 0.320
• what/which noun: 88 / 0.289 / 0.331
• when: 26 / 0.362 / 0.362
• where: 27 / 0.515 / 0.515
• who: 46 / 0.464 / 0.471
• why: 4 / 0.125 / 0.125
• name a: 2 / 0.500 / 0.500
• Total: 492 questions
Performance in QAC-1
• Performance of question answering
  - Returned 994 answer candidates, among which 183 responses were judged correct
  - MRR is 0.608, the best performance (average is 0.303)

• Questions: 200; gold answers: 288; output: 994; correct: 183
• Recall: 0.635; Precision: 0.184; F-measure: 0.285; MRR: 0.608
• Rank of first correct answer: rank 1: 98 questions, rank 2: 27, rank 3: 14, rank 4: 6, rank 5: 4 (total 149)
Contents
• Introduction: QA as a high-precision search
• Architecture of QA systems
• Main components for QA
• A QA system: SiteQ
• Issues for advanced QA
Question Interpretation
• Use pragmatic knowledge
  "Will Prime Minister Mori survive the political crisis?"
  - The analyst does not intend the literal meaning ("Will Prime Minister Mori still be alive when the political crisis is over?")
  - but rather expresses the belief that the current political crisis might cost the Japanese Prime Minister his job.
• Decompose complex questions into series of simpler questions
• Expand the question with contextual information
Semantic Parsing
• Use the explicit representation of semantic roles as encoded in FrameNet
  - FrameNet provides a database of words with associated semantic roles
• Enables an efficient, computable semantic representation of questions and texts
• Filling in semantic roles
  - Prominent role for named entities
  - Generalization across domains
  - FrameNet coverage and fallback techniques
Information Fusion
• Combining answer fragments into a concise response
• Technologies
  - Learning paraphrases, syntactic and lexical
  - Detecting important differences
  - Merging descriptions
  - Learning and formulating content plans
Answer Generation
• Compare request fills from several answer extractions to derive the answer
  - Merge / remove duplicates
  - Detect ambiguity
  - Reconcile answers (secondary search)
  - Identify possible contradictions
  - Apply utility measures
And more …
• Real-time QA
• Multi-lingual QA
• Interactive QA
• Advanced reasoning for QA
• User profiling for QA
• Collaborative QA
• Spoken language QA