Davide Mottin
Alice Marascu, Senjuti Basu Roy, Gautam Das, Themis Palpanas, Yannis Velegrakis
Davide Mottin, Senjuti Basu Roy, Alice Marascu, Yannis Velegrakis, Themis Palpanas, Gautam Das
Davide Mottin
A Probabilistic Optimization Framework for the Empty-Answer Problem
[Figure: the query (Alarm, DSL, Manual) on the car database returns no answer: {}]
Ranking results based on user preferences
IR [Baeza11] and database solutions [Chaudhuri04]
Query relaxation
Modify some of the query conditions [Mishra09]
(-) Suggests all the modifications together
(-) Does not take user feedback into account
Suggests one relaxation at a time
Takes user feedback into account
Models user preferences
Optimization-centric relaxation suggestions
User-centric (effort, relevance)
System-centric (profit)
Exponential number of relaxations
Modeling user preferences
System encoding of different objective functions
A probabilistic optimization framework
• Based on probability that user says yes to relaxation Q’ of
query Q
Probability of accepting relaxation Q’ of Q
belief of user that an answer will be found in the
database: Prior
likelihood the user will like the answers of relaxed
query: Pref
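The slides name the two ingredients (Prior and Pref) but do not show here how they are combined. As a rough illustration only, the sketch below assumes a noisy-OR combination: the user rejects a relaxation only if every tuple it returns fails to appeal. The function name `acceptance_probability` and the noisy-OR form are assumptions made for this example, not taken from the slides.

```python
from typing import Iterable


def acceptance_probability(prior: float, prefs: Iterable[float]) -> float:
    """Probability that the user says yes to a relaxation Q' of Q.

    ASSUMPTION: noisy-OR combination, chosen for illustration.
    prior: user's belief that an answer will be found in the database.
    prefs: per-tuple likelihood that the user likes a tuple returned by Q'.
    """
    p_reject = 1.0
    for pref in prefs:
        # A tuple "convinces" the user with probability prior * pref.
        p_reject *= 1.0 - prior * pref
    return 1.0 - p_reject


if __name__ == "__main__":
    # Hypothetical numbers: two tuples in the relaxed answer.
    print(acceptance_probability(prior=0.5, prefs=[0.4, 0.2]))
```

Under this assumption an empty relaxed answer gives an acceptance probability of 0, and adding tuples can only raise it.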
Probability to reject a relaxation
Cost for a relaxation
Maximize profit
Pref: favors solutions with highest values of individual tuples
a static function
Maximize answer relevance
Pref: favors solutions with most relevant tuples to original
query
Semi-dynamic function (computed only once, with the user query)
Minimize user effort
Pref: favors solutions with least number of user interactions
fully dynamic function (changes at every relaxation)
Minimum Effort Objective
[Figure: relaxation tree for the query (Alarm, DSL, Manual), showing relaxation nodes and choice nodes]
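A tree of relaxation nodes and choice nodes can be evaluated bottom-up. The sketch below is a minimal illustration for the minimum-effort objective, assuming that each user interaction costs 1, that a relaxation node takes the cheapest of its choice nodes, and that a choice node averages its yes and no subtrees by their probabilities. The class names, the unit cost, and the probabilities in the usage example are assumptions for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Choice:
    """Choice node: the user answers yes or no to one proposed relaxation."""
    p_yes: float
    yes: Optional["Relaxation"] = None  # subtree after "yes" (None = stop)
    no: Optional["Relaxation"] = None   # subtree after "no" (None = stop)


@dataclass
class Relaxation:
    """Relaxation node: the system decides which relaxation to propose."""
    choices: List[Choice] = field(default_factory=list)


def expected_effort(node: Optional[Relaxation]) -> float:
    """Expected number of interactions when the best proposal is always made."""
    if node is None or not node.choices:
        return 0.0
    return min(choice_effort(c) for c in node.choices)


def choice_effort(c: Choice) -> float:
    """One interaction (assumed cost 1) plus the expected remaining effort."""
    return 1.0 + c.p_yes * expected_effort(c.yes) \
               + (1.0 - c.p_yes) * expected_effort(c.no)


if __name__ == "__main__":
    # Hypothetical tree: after a "no", one more proposal is always accepted.
    follow_up = Relaxation(choices=[Choice(p_yes=1.0)])
    root = Relaxation(choices=[
        Choice(p_yes=0.3, no=follow_up),
        Choice(p_yes=0.7, no=follow_up),
    ])
    print(expected_effort(root))
```

Evaluating the full tree this way visits every node, which is what makes the exponential number of relaxations a problem for larger queries.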
Exact algorithm (FastOpt):
Upper and lower bound for each node
Pruning can be enabled for this algorithm
Approximate algorithm (CDR):
Node cost approximated by a probability distribution
Relaxation nodes: min/max distribution of Cost
Choice nodes: sum distribution of Cost
Approximated by computing the convolution cost
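To make the distribution arithmetic concrete, here is a small sketch of the two operations named above on discrete cost distributions: the distribution of a sum (a convolution) and the distribution of a minimum. It assumes the costs being combined are independent and represents each distribution as a plain dictionary from cost to probability; both choices are assumptions for illustration, not details taken from the slides.

```python
from typing import Dict

Dist = Dict[float, float]  # cost value -> probability


def sum_distribution(a: Dist, b: Dist) -> Dist:
    """Distribution of X + Y for independent X ~ a and Y ~ b (convolution)."""
    out: Dist = {}
    for x, px in a.items():
        for y, py in b.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out


def min_distribution(a: Dist, b: Dist) -> Dist:
    """Distribution of min(X, Y) for independent X ~ a and Y ~ b."""
    out: Dist = {}
    for x, px in a.items():
        for y, py in b.items():
            m = min(x, y)
            out[m] = out.get(m, 0.0) + px * py
    return out


if __name__ == "__main__":
    # Hypothetical cost distributions of two subtrees.
    left = {0: 0.5, 1: 0.5}
    right = {1: 0.5, 2: 0.5}
    print(sum_distribution(left, right))
    print(min_distribution(left, right))
```

Because the number of distinct values grows with every combination, a practical version would re-bucket the result into a fixed-size histogram after each step.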
Idea: prune non-optimal relaxations in advance
• Upper and lower bound of cost function
• Prune branches using upper/lower bounds
reasoning
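The bound-based reasoning can be sketched as follows for a cost that is being minimized: a candidate whose lower bound already exceeds the smallest upper bound among its siblings cannot be optimal, so its branch need not be expanded. The function name and the bounds in the usage example are hypothetical.

```python
from typing import Dict, Tuple

Bounds = Tuple[float, float]  # (lower bound, upper bound) on a node's cost


def prune_by_bounds(candidates: Dict[str, Bounds]) -> Dict[str, Bounds]:
    """Keep only the candidates that may still be optimal (minimization)."""
    if not candidates:
        return {}
    # Some candidate is guaranteed to cost at most this much.
    best_upper = min(upper for _, upper in candidates.values())
    return {
        name: (lower, upper)
        for name, (lower, upper) in candidates.items()
        if lower <= best_upper  # otherwise it is provably worse
    }


if __name__ == "__main__":
    # Hypothetical bounds for three sibling relaxations.
    siblings = {"a": (1.0, 1.9), "b": (2.0, 2.8), "c": (1.0, 3.0)}
    print(sorted(prune_by_bounds(siblings)))
```

Tighter bounds prune more branches, so the benefit depends on how cheaply good bounds can be computed for each node.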
[Figure: relaxation tree rooted at the query (1,1,1); each node carries [lower, upper] cost bounds, each choice has yes/no probabilities, and one branch is marked "Prune"]
Datasets:
US Home dataset: 38k tuples, 18 attributes
Car dataset: 100k tuples, 31 attributes
Synthetic datasets: 20k to 500k tuples
Baseline algorithms:
Previous works: top-k, query-refinement, ranking
Random relaxation
Greedy: choose the first non-empty relaxation, otherwise a
random one
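The greedy baseline can be sketched as follows, assuming a query is a set of attribute/value conditions, a relaxation drops exactly one condition, and the database is a list of rows held in memory. The function names and the toy data are hypothetical.

```python
import random
from typing import Dict, List

Row = Dict[str, int]


def has_answer(conditions: Row, rows: List[Row]) -> bool:
    """True if at least one row satisfies every condition."""
    return any(all(row.get(a) == v for a, v in conditions.items())
               for row in rows)


def greedy_relaxation(query: Row, rows: List[Row], rng=random) -> str:
    """Return the attribute to relax: the first one whose removal yields a
    non-empty answer, otherwise a randomly chosen attribute."""
    attributes = list(query)
    for attr in attributes:
        relaxed = {a: v for a, v in query.items() if a != attr}
        if has_answer(relaxed, rows):
            return attr
    return rng.choice(attributes)


if __name__ == "__main__":
    query = {"Alarm": 1, "DSL": 1, "Manual": 1}
    cars = [
        {"Alarm": 1, "DSL": 0, "Manual": 1},
        {"Alarm": 0, "DSL": 0, "Manual": 1},
    ]
    print(greedy_relaxation(query, cars))
```

This baseline looks only one step ahead and ignores how likely the user is to accept the proposed relaxation.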
1. Interactive vs non-interactive
• Measure user satisfaction with our interactive
approach vs relax-all-at-once approaches
• 100 Amazon Mechanical Turk users, 5 queries each
2. Objective functions effectiveness
• Compare proposed relaxations with objective
function goals (max profit, min effort, max user
relevance)
• Three tasks
• 100 users, 5 queries
Scalability results:
FastOpt (Exact): timely exact answers for small
queries
CDR (Approximate)
real-time answers for queries of size 10
results close to optimal
User study results
Interactive methods preferred over non-interactive
Objective functions correctly achieve their
optimization goals
[Figure: cost vs. query size (3 to 7) for FullTree, FastOpt, CDR, Greedy, and Random]
• CDR close to optimal
• Random and Greedy produce 1.5 more relaxations
[Figure: query time (sec, log scale) vs. query size (3 to 10) for FullTree, CDR, and FastOpt]
• Exponential behaviour
• Efficient for small queries
• 1.4 sec for query size 10
[Figure: user preference (0% to 100%) for Interactive, Multi-Relaxations, top-k, and Why-Not, compared on Favored, Answers Quality, and Usability]
Users prefer interactive systems to relaxations
all at once
Better quality answers
Introduce a novel, principled, user-centric, and interactive
approach to the empty-answer problem
Propose exact and approximate algorithms
Demonstrate scalability of proposed techniques with
database and query size
Show effectiveness of the different objective functions
Verify quality of the answers and superior usability of
our interactive approach
[Figure: three plots of number of steps, profit, and answer quality vs. query size (3 to 7) for the Dynamic, Semi-Dynamic, and Static objective functions]
Objective functions achieve their goals
Dynamic and Semi-Dynamic very similar in
performance
Idea: use cost distribution instead of actual cost.
1. A histogram of size b in each node
2. Construct the first L levels of the tree
3. Expand the branch with the highest probability of
being optimal
1. Compute the probability that a node's cost is smaller than
its siblings' costs
2. Choose the child with the highest probability
Example: Pr(n1 < n2) = 0.6 and Pr(n2 < n1) = 0.4, so n1 is expanded
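The comparison between siblings can be sketched on discrete cost histograms. The example below assumes sibling costs are independent and represents each histogram as a dictionary from cost to probability; the function names and the two histograms are hypothetical, chosen so that the probabilities come out as 0.6 and 0.4.

```python
from typing import Dict, List

Hist = Dict[float, float]  # cost value -> probability


def prob_smallest(target: Hist, siblings: List[Hist]) -> float:
    """Probability that target's cost is strictly smaller than the cost of
    every sibling, assuming independent costs."""
    total = 0.0
    for x, px in target.items():
        p_all_larger = 1.0
        for sib in siblings:
            p_all_larger *= sum(p for y, p in sib.items() if y > x)
        total += px * p_all_larger
    return total


def child_to_expand(children: Dict[str, Hist]) -> str:
    """Name of the child most likely to have the smallest cost."""
    def score(name: str) -> float:
        others = [h for n, h in children.items() if n != name]
        return prob_smallest(children[name], others)
    return max(children, key=score)


if __name__ == "__main__":
    # Hypothetical histograms for two sibling nodes.
    nodes = {"n1": {1: 0.6, 3: 0.4}, "n2": {2: 1.0}}
    print(child_to_expand(nodes))
```

Ties between equal costs are counted for neither node in this sketch, so the probabilities of all siblings may sum to less than 1 when histograms share values.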
[Mishra09] C. Mishra and N. Koudas, “Interactive query refinement,” in EDBT, 2009.
[Roy08] S. Basu Roy, H. Wang, G. Das, U. Nambiar, and M. Mohania, “Minimum-effort driven dynamic faceted search in structured databases,” in CIKM, 2008.
[Chaudhuri04] S. Chaudhuri, G. Das, V. Hristidis, and G. Weikum, “Probabilistic ranking of database query results,” in VLDB, 2004.
[Baeza11] R. A. Baeza-Yates and B. A. Ribeiro-Neto, Modern Information Retrieval, 2011.