Learning from the Past: Answering New Questions with Past Answers

Date: 2012/11/22
Authors: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor
Source: WWW ’12
Advisor: Dr. Jia-Ling Koh
Speaker: Yi-Hsuan Yeh
OUTLINE
• Introduction
• Description of approach
  • Stage one: top candidate selection
  • Stage two: top candidate validation
• Experiment
  • Offline
  • Online
• Conclusion
INTRODUCTION

• Users struggle to express their information needs as short queries.
• Community-based Question Answering (CQA) sites, such as Yahoo! Answers or Baidu Zhidao
  • A question consists of a title and a body.
  • 15% of the questions remain unanswered.
• Goal: answer new questions with the answers to past resolved questions.
A TWO STAGE APPROACH
• Stage one: find the most similar past question.
• Stage two: decide whether or not to serve its answer.
STAGE ONE: TOP CANDIDATE SELECTION

• Vector-space unigram model with TF-IDF weights (title)

            w1    w2    w3    ...   wn
  Qnew      0.1   0.2   0.12  ...   0.8
  Qpast 1   0.3   0.5   0.2   ...   0.1
  Qpast 2   0.2   0     0.1   ...   0.6
  ...
  Qpast n   0.9   0.3   0.5   ...   0.1

• Cosine similarity => keep the past questions that pass threshold α
• Ranking: Cos(Qpast title+body, Qnew title+body)
  => the top candidate past question and its answer A
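The two steps of stage one (filter on title similarity, then rank on title+body similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, computing IDF over the tiny example set, and the names tfidf_vectors, cosine, and select_top_candidate are all assumptions made for the sketch.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (term -> weight dicts) for a list of texts."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF (an assumption): terms found in every text keep a non-zero weight.
    idf = {term: math.log((1 + n) / (1 + c)) + 1.0 for term, c in df.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_top_candidate(new_q, past_qs, alpha):
    """new_q and each past question are (title, body) pairs.

    Returns (index of the top candidate, its title+body score), or None
    when no past question passes the title threshold alpha.
    """
    titles = tfidf_vectors([new_q[0]] + [title for title, _ in past_qs])
    fulls = tfidf_vectors([" ".join(new_q)] + [" ".join(q) for q in past_qs])
    # Step 1: keep past questions whose title similarity passes alpha.
    kept = [i for i in range(len(past_qs))
            if cosine(titles[0], titles[i + 1]) > alpha]
    if not kept:
        return None
    # Step 2: rank the survivors by title+body similarity and take the best.
    best = max(kept, key=lambda i: cosine(fulls[0], fulls[i + 1]))
    return best, cosine(fulls[0], fulls[best + 1])
```

A call such as select_top_candidate(("why doesn't my dog eat", "he is two years old"), past_questions, 0.3) returns the index of the best past question; its stored answer A is then passed to stage two.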
STAGE TWO: TOP CANDIDATE VALIDATION

• Train a classifier that validates whether A can be served as an answer to Qnew.
SURFACE-LEVEL FEATURES

• Surface-level statistics
  • Text length, number of question marks, stop word count, maximal IDF within all terms in the text, minimal IDF, average IDF, IDF standard deviation, http link count, number of figures.
• Surface-level similarity
  • TF-IDF weighted word unigram vector space model, cosine similarity
    • Qnew title - Qpast title
    • Qnew body - Qpast body
    • Qnew title+body - Qpast title+body
    • Qnew title+body - Answer
    • Qpast title+body - Answer
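The surface-level statistics listed above could be computed per text roughly as below. This is a sketch under stated assumptions: the tokenizer, the tiny STOP_WORDS set, the default IDF for unseen terms, and the reading of "number of figures" as the count of digit sequences are all illustrative choices, and surface_stats is a hypothetical helper name.

```python
import re
import statistics

# Illustrative stop word list only; a real system would use a full list.
STOP_WORDS = {"the", "a", "an", "is", "my", "to", "of", "and", "in", "it"}


def surface_stats(text, idf, default_idf=1.0):
    """Return the surface statistics of one text as a feature dict.

    idf maps a term to its IDF value; unseen terms get default_idf
    (an assumption made for this sketch).
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    idfs = [idf.get(t, default_idf) for t in tokens] or [0.0]
    return {
        "text_length": len(tokens),
        "question_marks": text.count("?"),
        "stop_words": sum(t in STOP_WORDS for t in tokens),
        "max_idf": max(idfs),
        "min_idf": min(idfs),
        "avg_idf": statistics.fmean(idfs),
        "idf_std": statistics.pstdev(idfs),
        "http_links": len(re.findall(r"https?://", text)),
        # "Figures" is interpreted here as runs of digits (assumption).
        "figures": len(re.findall(r"\d+", text)),
    }
```

The same function would be applied to Qnew, Qpast, and A separately, so each text contributes its own group of features.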
LINGUISTIC ANALYSIS

• Latent topics: LDA (Latent Dirichlet Allocation)

            Qnew    Qpast   A
  Topic 1   0.3     0.1     0.25
  Topic 2   0.03    0.1     0.02
  Topic 3   0.15    0.08    0.12
  ...       ...     ...     ...
  Topic n   0.06    0.13    0.05

• Entropy
• Most probable topic
• JS divergence
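Given topic distributions for Qnew, Qpast, and A, the three topic features named above can be computed as in this sketch. Base-2 logarithms and the function names are assumptions; the distributions in the usage note are made up for illustration and are not the values from the slide.

```python
import math


def entropy(p):
    """Shannon entropy (bits) of a topic distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def most_probable_topic(p):
    """Index of the topic with the highest probability."""
    return max(range(len(p)), key=p.__getitem__)


def kl_divergence(p, q):
    """KL(p || q) in bits; q must be non-zero wherever p is non-zero."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 bit."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

For example, js_divergence([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]) is small because the two texts share a dominant topic, while two distributions with disjoint support reach the maximum of 1 bit.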
• Lexico-syntactic analysis
  • Stanford dependency parser: main verb, subject, object, the main noun and adjective
  • Ex:
    Q1: Why doesn’t my dog eat?
      Main predicate: eat
      Main predicate argument: dog
    Q2: Why doesn’t my cat eat?
      Main predicate: eat
      Main predicate argument: cat
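RESULT LIST ANALYSIS

• Query clarity: language model & KL divergence
• Example term distributions over terms A to D (slide labels: Qnew, Qpast1, Qpast2, Qpast3, Qpastall); each column is one language model and sums to 1:

  A   0.5   0     0.1   0.5
  B   0     0.5   0     0
  C   0.3   0.1   0     0.3
  D   0.2   0.4   0.9   0.2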
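Query clarity is commonly computed as the KL divergence between a language model of the result list and a language model of the whole collection; a focused result list diverges strongly from the collection. The sketch below assumes unsmoothed maximum-likelihood unigram models and that every result text is also part of the collection (so no zero probabilities arise). The names language_model and clarity are hypothetical.

```python
import math
from collections import Counter


def language_model(texts):
    """Unsmoothed maximum-likelihood unigram model over a list of texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def clarity(result_texts, collection_texts):
    """KL(result-list model || collection model), in bits.

    Assumes result_texts is a subset of collection_texts, so every
    result term has non-zero collection probability.
    """
    p_res = language_model(result_texts)
    p_col = language_model(collection_texts)
    return sum(p * math.log2(p / p_col[w]) for w, p in p_res.items())
```

A higher value suggests that the retrieved past questions concentrate on a few terms rather than mirroring the collection as a whole.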
• Query feedback
  • Informational similarity between two queries can be effectively estimated by the similarity between their ranked document lists.
• Result list length
  • The number of questions that pass the threshold α
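Both features can be illustrated with a few lines. The sketch below measures list similarity as the overlap of the top-k items, which is one simple choice and an assumption here (the slide does not name the measure); overlap_at_k and result_list_length are hypothetical helper names.

```python
def overlap_at_k(ranked_a, ranked_b, k):
    """Fraction of shared items among the top k of two ranked lists (k > 0)."""
    top_a, top_b = set(ranked_a[:k]), set(ranked_b[:k])
    return len(top_a & top_b) / k


def result_list_length(title_scores, alpha):
    """Number of past questions whose title similarity passes alpha."""
    return sum(score > alpha for score in title_scores)
```

For instance, overlap_at_k(results_for_new_question, results_for_candidate, 10) close to 1 indicates that the two queries retrieve nearly the same past questions.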
CLASSIFIER MODEL
• Random forest classifier
  • Each tree is trained on n randomly chosen features and n randomly chosen past questions
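The randomization behind a random forest can be sketched without any learning library: each tree receives a bootstrap sample of the training examples and a random subset of the features, and the forest predicts by majority vote. This is a simplified illustration; standard random forests usually redraw the feature subset at every split rather than once per tree, and the names draw_tree_inputs and majority_vote are hypothetical.

```python
import random
from collections import Counter


def draw_tree_inputs(n_examples, n_features, n_sub_features, rng):
    """Pick the training rows and feature columns for one tree."""
    # Bootstrap sample: draw n_examples row indices with replacement.
    rows = [rng.randrange(n_examples) for _ in range(n_examples)]
    # Feature subset: draw n_sub_features distinct column indices.
    cols = sorted(rng.sample(range(n_features), n_sub_features))
    return rows, cols


def majority_vote(tree_predictions):
    """Final forest decision: the label predicted by most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

In the validation stage, each tree would vote on whether A should be served for Qnew, and majority_vote([True, False, True]) yields the served/not-served decision.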
OFFLINE

• Dataset
  • Yahoo! Answers categories: Beauty & Style, Health, and Pets
  • Included only best answers that were chosen by the asker and received at least three stars
  • Between Feb and Dec 2010
• MTurk (Amazon Mechanical Turk) annotation
  • Inter-annotator agreement: Fleiss' kappa
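Fleiss' kappa measures agreement among a fixed number of raters beyond what chance would produce. A minimal sketch of the standard formula follows; the input layout (one row per item, one count per category) and the function name are assumptions of this example, and the ratings in the usage note are invented.

```python
def fleiss_kappa(ratings):
    """ratings[i][j] = number of raters who put item i in category j.

    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])
    # Observed agreement per item, then averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall category proportions.
    p_cats = [sum(row[j] for row in ratings) / (n_items * n_raters)
              for j in range(n_cats)]
    p_e = sum(p * p for p in p_cats)
    return (p_bar - p_e) / (1 - p_e)
```

With three workers per item and two labels (answer is satisfying or not), fleiss_kappa([[3, 0], [0, 3], [3, 0], [0, 3]]) equals 1, meaning perfect agreement; values near 0 mean agreement no better than chance.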
ONLINE
CONCLUSIONS
• Short questions might suffer from vocabulary mismatch problems and sparsity.
• Long, cumbersome descriptions introduce many irrelevant aspects that can hardly be separated from the essential question details (even by a human reader).
• Terms that are repeated in the past question and in its best answer should usually be emphasized more, as they relate to the expressed need.
• A general informative answer can satisfy a number of topically connected but different questions.
• A general social answer may often satisfy a certain type of question.
• In future work, we would like to better understand time-sensitive questions, such as those common in the Sports category.