Extracting Query Facets From Search Results
Date : 2013/08/20

Source : SIGIR’13
Authors : Weize Kong and James Allan
Advisor : Dr. Jia-ling, Koh
Speaker : Wei, Chang
OUTLINE
• Introduction
• Approach
• Experiment
• Conclusion
What is a query facet ?
Definition : query facet
a set of coordinate terms (terms that share a semantic relationship by being grouped under a more general “is a” relationship)
Example : a query facet (Mars rovers), e.g. {Curiosity, Opportunity, Spirit}
WHAT CAN WE DO WITH QUERY FACETS ?
• Flight type
  • Domestic
  • International
• Travel Class
  • First
  • Business
  • Economy
GOAL

Extract query facets from the top-k web search results $D = \{D_1, D_2, \dots, D_k\}$
OUTLINE
• Introduction
• Approach
  • Step 1 : Extracting candidate lists
  • Step 2 : Finding query facets from candidate lists
• Experiment
• Conclusion
PATTERN-BASED SEMANTIC CLASS EXTRACTION
Reference from : Z. Dou, S. Hu, Y. Luo, R. Song, and J.-R. Wen. Finding dimensions for queries.
For example :
• Lexical pattern : There are many Mars rovers, such as Curiosity, Opportunity, and Spirit.
• HTML pattern :
<ul> <li>first class</li>
<li>business class</li>
<li>economy class</li> </ul>
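The two example patterns above can be sketched in code. This is a minimal illustration, not the extractor used in the paper: the function names `extract_lexical_list` and `extract_html_lists`, the class `ListExtractor`, and the exact regular expressions are assumptions made for this sketch, and only the "such as" pattern and flat `<ul>` lists are handled.

```python
import re
from html.parser import HTMLParser


def extract_lexical_list(sentence):
    """Return the items of a '..., such as A, B, and C' enumeration, or [] if absent."""
    match = re.search(r"such as\s+(.+?)[.;]?$", sentence.strip())
    if not match:
        return []
    # Split on commas and on an 'and' / 'or' that may follow a comma.
    parts = re.split(r",\s*(?:and|or)\s+|,\s*|\s+(?:and|or)\s+", match.group(1))
    return [p.strip() for p in parts if p.strip()]


class ListExtractor(HTMLParser):
    """Collect <li> texts, one candidate list per <ul> (nested lists are not handled)."""

    def __init__(self):
        super().__init__()
        self.lists = []
        self._current = None   # items of the <ul> currently being read
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self._current = []
        elif tag == "li" and self._current is not None:
            self._in_item = True
            self._current.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False
        elif tag == "ul" and self._current is not None:
            self.lists.append([item.strip() for item in self._current])
            self._current = None
            self._in_item = False

    def handle_data(self, data):
        if self._in_item and self._current:
            self._current[-1] += data


def extract_html_lists(html):
    """Return every <ul> in the HTML string as a list of item strings."""
    parser = ListExtractor()
    parser.feed(html)
    parser.close()
    return parser.lists
```

Each returned list is one raw candidate list; items are still unnormalized at this point.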
CANDIDATE LISTS
• All list items are normalized by converting text to lowercase and removing non-alphanumeric characters.
• Then, we remove stopwords and duplicate items in each list.
• Finally, we discard all lists that contain fewer than two items or more than 200 items.
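The three cleaning steps can be sketched as follows. The stopword set here is a tiny placeholder (the slides do not give the actual list), and two details are assumptions of this sketch: non-alphanumeric characters are replaced by spaces so that words stay separated, and stopwords are removed as tokens inside each item. The helper names are hypothetical.

```python
import re

# Placeholder stopword set; the real list is not specified in the slides.
STOPWORDS = {"a", "an", "and", "of", "or", "the"}


def normalize_item(item):
    """Lowercase, drop non-alphanumeric (ASCII) characters, remove stopword tokens."""
    text = re.sub(r"[^a-z0-9 ]+", " ", item.lower())
    words = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(words)


def clean_list(items):
    """Normalize every item and drop empty or duplicate items, keeping first-seen order."""
    seen, cleaned = set(), []
    for item in items:
        norm = normalize_item(item)
        if norm and norm not in seen:
            seen.add(norm)
            cleaned.append(norm)
    return cleaned


def build_candidate_lists(raw_lists, min_items=2, max_items=200):
    """Keep only cleaned lists whose size is between min_items and max_items."""
    candidates = []
    for raw in raw_lists:
        cleaned = clean_list(raw)
        if min_items <= len(cleaned) <= max_items:
            candidates.append(cleaned)
    return candidates
```

The size bounds default to the values stated above (two and 200) and are applied after duplicates are removed.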


The candidate lists are usually noisy, and could be
non-relevant to the issued query.
To address this problem, we use a supervised
method.
NOTE : WHAT IS A SUPERVISED METHOD
EXAMPLE :
LA-100
          Quiz 1   Quiz 2   Quiz 3   Final Exam
David     A-       B+       A-       ?
James     B        A        A        ?
LA-99 (Training Data)
          Quiz 1   Quiz 2   Quiz 3   Final Exam
John      A        B+       B-       B
Eric      A+       A        A+       A
Peter     B+       A-       A+       A+
Steve     A+       A+       B-       B+
Mark      C        A+       B+       B
Larry     B+       B+       B+       A
NOTE : WHAT IS SUPERVISED LEARNING
Training data (with features) → Training → Model
New Data → Model → Prediction
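The training → model → prediction flow can be illustrated with the quiz grades from the LA-99 example. This sketch uses a 1-nearest-neighbour rule purely for illustration; the grade-point scale, the choice of learner, and the function names are assumptions, and the result is not an answer given in the slides.

```python
# Grade-point scale assumed for this sketch.
GRADE_POINTS = {"A+": 4.3, "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7, "C": 2.0}

# LA-99 training data: three quiz grades (features) and the final exam grade (label).
TRAINING = [
    (("A", "B+", "B-"), "B"),     # John
    (("A+", "A", "A+"), "A"),     # Eric
    (("B+", "A-", "A+"), "A+"),   # Peter
    (("A+", "A+", "B-"), "B+"),   # Steve
    (("C", "A+", "B+"), "B"),     # Mark
    (("B+", "B+", "B+"), "A"),    # Larry
]


def to_vector(grades):
    """Encode letter grades as numbers so that distances can be computed."""
    return [GRADE_POINTS[g] for g in grades]


def train(examples):
    """For 1-nearest-neighbour, the 'model' is simply the encoded training examples."""
    return [(to_vector(features), label) for features, label in examples]


def predict(model, grades):
    """Return the label of the training example closest to the given quiz grades."""
    query = to_vector(grades)

    def squared_distance(entry):
        vector, _label = entry
        return sum((a - b) ** 2 for a, b in zip(vector, query))

    return min(model, key=squared_distance)[1]


model = train(TRAINING)
prediction = predict(model, ("A-", "B+", "A-"))  # David's quiz grades from LA-100
```

A different learner or grade scale could give a different prediction; the point is only that the model is built from labeled data and then applied to unlabeled data.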
PROBLEM DEFINITION
• Whether a list item is a facet term
• Whether a pair of list items is in one query facet
FEATURES
GRAPH
LOGISTIC-BASED CONDITIONAL PROBABILITY DISTRIBUTIONS
PARAMETER ESTIMATION
The parameters are estimated by maximizing the log-likelihood with a gradient method (equivalently, gradient descent on the negative log-likelihood).
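As a rough illustration of this kind of estimation, the sketch below fits a plain logistic regression by gradient ascent on the log-likelihood. It is not the joint graphical model from the paper: the single feature, the toy data, the learning rate, and the function names are all assumptions made for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_likelihood(weights, examples):
    """Sum of log P(label | features) under the logistic model."""
    total = 0.0
    for features, label in examples:
        p = sigmoid(sum(w * f for w, f in zip(weights, features)))
        total += math.log(p) if label == 1 else math.log(1.0 - p)
    return total


def fit(examples, learning_rate=0.1, iterations=500):
    """Gradient ascent: repeatedly move the weights along the log-likelihood gradient."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(iterations):
        gradient = [0.0] * len(weights)
        for features, label in examples:
            p = sigmoid(sum(w * f for w, f in zip(weights, features)))
            for k, f in enumerate(features):
                gradient[k] += (label - p) * f
        weights = [w + learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Toy data: [bias, hypothetical feature value] -> 1 if the item is a facet term.
EXAMPLES = [
    ([1.0, 0.1], 0), ([1.0, 0.2], 0), ([1.0, 0.3], 1), ([1.0, 0.4], 0),
    ([1.0, 0.6], 1), ([1.0, 0.8], 1), ([1.0, 0.9], 1),
]
weights = fit(EXAMPLES)
```

After fitting, `sigmoid` applied to the weighted feature sum gives the estimated probability that an item is a facet term.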
INFERENCE
• The training is finished.
• The graphical model does not enforce the labeling to produce a strict partitioning of facet terms. For example, when $Z_{1,2} = 1$ and $Z_{2,3} = 1$, we may still have $Z_{1,3} = 0$.
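The inconsistency in this example can be detected mechanically: a set of pair labels describes a strict partitioning only if no triple of terms has exactly two of its three pairs linked. The sketch below checks this; the dictionary representation of the pair labels and the function names are assumptions made for illustration.

```python
from itertools import combinations


def pair_label(z, i, j):
    """Look up the pair label for terms i and j; missing pairs count as 0."""
    return z.get((min(i, j), max(i, j)), 0)


def find_inconsistent_triples(terms, z):
    """Return triples where exactly two of the three pairs are labeled 1."""
    violations = []
    for a, b, c in combinations(sorted(terms), 3):
        linked = pair_label(z, a, b) + pair_label(z, b, c) + pair_label(z, a, c)
        if linked == 2:
            violations.append((a, b, c))
    return violations


# The example from the slide: terms 1-2 and 2-3 are paired, but 1-3 is not.
z_example = {(1, 2): 1, (2, 3): 1, (1, 3): 0}
violations = find_inconsistent_triples([1, 2, 3], z_example)
```

An empty result means the pair labels already group the terms into disjoint facets.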

REPHRASE THE OPTIMIZATION PROBLEM
The optimization target is restated over the set of all possible query facet sets that can be generated from L under the strict partitioning constraint (the formula was not recovered).
This optimization problem is NP-hard, which can be proved by a reduction from the Multiway Cut problem. Therefore, we propose two algorithms, QF-I and QF-J, to approximate the results.
QF-I
1. Select list items $t_i$ with $P(t_i) > w_{min}$ as facet terms.
2.
QF-J
RANKING QUERY FACETS

score for a query facet :

score for a facet term :
OUTLINE
• Introduction
• Approach
  • Step 1 : Extracting candidate lists
  • Step 2 : Finding query facets from candidate lists
• Experiment
  • Evaluation
  • Experiment Result
• Conclusion
DATA
Using Top 10 query facets generated by different models.
EVALUATION METRICS

The symbol “∗” distinguishes system-generated results from human-labeled results; the human-labeled results serve as ground truth.
CLUSTERING QUALITY
OVERALL QUALITY
fp-nDCG is weighted by
rp-nDCG is weighted by
FACET TERMS
CLUSTERING FACET TERMS
OVERALL
CONCLUSION
• We developed a supervised method based on a graphical model to recognize query facets from the noisy facet candidate lists extracted from the top-ranked search results.
• We proposed two algorithms for approximate inference on the graphical model.
• We designed a new evaluation metric for this task that combines recall and precision of facet terms with grouping quality.
• Experimental results showed that the supervised method significantly outperforms unsupervised methods, suggesting that query facet extraction can be effectively learned.