Foundations of Preference Queries Jan Chomicki University at Buffalo Jan Chomicki ()

advertisement
Foundations of Preference Queries
Jan Chomicki
University at Buffalo
Jan Chomicki ()
Preference Queries
1 / 68
Plan of the course
1
Preference relations
2
Preference queries
3
Preference management
4
Advanced topics
Jan Chomicki ()
Preference Queries
2 / 68
Part I
Preference relations
Jan Chomicki ()
Preference Queries
3 / 68
Outline of Part I
1
Preference relations
Preference
Equivalence
Preference specification
Combining preferences
Skylines
Jan Chomicki ()
Preference Queries
4 / 68
Preference relations
Universe of objects
constants: uninterpreted, numbers,...
individuals (entities)
tuples
sets
Jan Chomicki ()
Preference Queries
5 / 68
Preference relations
Universe of objects
constants: uninterpreted, numbers,...
individuals (entities)
tuples
sets
Preference relation binary relation between objects
x y ≡ x is better than y ≡ x dominates y
an abstract, uniform way of talking about desirability, worth, cost,
timeliness,..., and their combinations
preference relations used in queries
Jan Chomicki ()
Preference Queries
5 / 68
Buying a car
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman: What kind of car do you prefer?
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman: What kind of car do you prefer?
Customer: The newer the better, if it is the same make. And cheap, too.
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman: What kind of car do you prefer?
Customer: The newer the better, if it is the same make. And cheap, too.
Salesman: Which is more important for you: the age or the price?
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman:
Customer:
Salesman:
Customer:
What kind of car do you prefer?
The newer the better, if it is the same make. And cheap, too.
Which is more important for you: the age or the price?
The age, definitely.
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman: What kind of car do you prefer?
Customer: The newer the better, if it is the same make. And cheap, too.
Salesman: Which is more important for you: the age or the price?
Customer: The age, definitely.
Salesman: Those are the best cars, according to your preferences, that we
have in stock.
Jan Chomicki ()
Preference Queries
6 / 68
Buying a car
Salesman: What kind of car do you prefer?
Customer: The newer the better, if it is the same make. And cheap, too.
Salesman: Which is more important for you: the age or the price?
Customer: The age, definitely.
Salesman: Those are the best cars, according to your preferences, that we
have in stock.
Customer: Wait...it better be a BMW.
Jan Chomicki ()
Preference Queries
6 / 68
Applications of preferences and preference queries
1
decision making
2
e-commerce
3
digital libraries
4
personalization
Jan Chomicki ()
Preference Queries
7 / 68
Properties of preference relations
Jan Chomicki ()
Preference Queries
8 / 68
Properties of preference relations
Properties of irreflexivity: ∀x. x 6 x
asymmetry: ∀x, y . x y ⇒ y 6 x
transitivity: ∀x, y , z. (x y ∧ y z) ⇒ x z
negative transitivity: ∀x, y , z. (x 6 y ∧ y 6 z) ⇒ x 6 z
connectivity: ∀x, y . x y ∨ y x ∨ x = y
Jan Chomicki ()
Preference Queries
8 / 68
Properties of preference relations
Properties of irreflexivity: ∀x. x 6 x
asymmetry: ∀x, y . x y ⇒ y 6 x
transitivity: ∀x, y , z. (x y ∧ y z) ⇒ x z
negative transitivity: ∀x, y , z. (x 6 y ∧ y 6 z) ⇒ x 6 z
connectivity: ∀x, y . x y ∨ y x ∨ x = y
Orders
strict partial order (SPO): irreflexive and transitive
weak order (WO): negatively transitive SPO
total order: connected SPO
Jan Chomicki ()
Preference Queries
8 / 68
Weak and total orders
Total order
Weak order
a
c
a
b
d
e
d
f
f
Jan Chomicki ()
Preference Queries
9 / 68
Order properties of preference relations
Jan Chomicki ()
Preference Queries
10 / 68
Order properties of preference relations
Irreflexivity, asymmetry: uncontroversial.
Jan Chomicki ()
Preference Queries
10 / 68
Order properties of preference relations
Irreflexivity, asymmetry: uncontroversial.
Transitivity:
captures rationality of preference
not always guaranteed: voting paradoxes
helps with preference querying
Jan Chomicki ()
Preference Queries
10 / 68
Order properties of preference relations
Irreflexivity, asymmetry: uncontroversial.
Transitivity:
captures rationality of preference
not always guaranteed: voting paradoxes
helps with preference querying
Negative transitivity:
scoring functions represent weak orders
Jan Chomicki ()
Preference Queries
10 / 68
Order properties of preference relations
Irreflexivity, asymmetry: uncontroversial.
Transitivity:
captures rationality of preference
not always guaranteed: voting paradoxes
helps with preference querying
Negative transitivity:
scoring functions represent weak orders
We assume that preference relations are SPOs.
Jan Chomicki ()
Preference Queries
10 / 68
When are two objects equivalent?
Jan Chomicki ()
Preference Queries
11 / 68
When are two objects equivalent?
Relation ∼
binary relation between objects
x ∼ y ≡ x 00 is equivalent to 00 y
Jan Chomicki ()
Preference Queries
11 / 68
When are two objects equivalent?
Relation ∼
binary relation between objects
x ∼ y ≡ x 00 is equivalent to 00 y
Several notions of equivalence
equality: x ∼eq y ≡ x = y
indifference: x ∼i y ≡ x y ∧ y x
restricted indifference:
x ∼r y ≡ ∀z. (x ≺ z ⇔ y ≺ z) ∧ (z ≺ y ⇔ z ≺ x)
Jan Chomicki ()
Preference Queries
11 / 68
When are two objects equivalent?
Relation ∼
binary relation between objects
x ∼ y ≡ x 00 is equivalent to 00 y
Several notions of equivalence
equality: x ∼eq y ≡ x = y
indifference: x ∼i y ≡ x y ∧ y x
restricted indifference:
x ∼r y ≡ ∀z. (x ≺ z ⇔ y ≺ z) ∧ (z ≺ y ⇔ z ≺ x)
Properties of equivalence
equivalence relation: reflexive, symmetric, transitive
equality and restricted indifference (if is an SPO) are equivalence
relations
indifference is reflexive and symmetric; transitive for WO
Jan Chomicki ()
Preference Queries
11 / 68
Example
Jan Chomicki ()
Preference Queries
12 / 68
Example
bmw
ford
vw mazda
kia
Jan Chomicki ()
Preference Queries
12 / 68
Example
bmw
ford
vw mazda
Preference:
bmw ford, bmw vw
bmw mazda, bmw kia
mazda kia
Indifference:
kia
ford ∼i vw, vw ∼i ford,
ford ∼i mazda, mazda ∼i
ford,
vw ∼i mazda, mazda ∼i
vw,
ford ∼i kia, kia ∼i ford,
vw ∼i kia, kia ∼i vw
Restricted indifference:
ford ∼r vw, vw ∼r ford
Jan Chomicki ()
Preference Queries
12 / 68
Example
bmw
ford
vw mazda
Preference:
bmw ford, bmw vw
bmw mazda, bmw kia
mazda kia
Indifference:
kia
This is a strict partial
order which is not a
weak order.
ford ∼i vw, vw ∼i ford,
ford ∼i mazda, mazda ∼i
ford,
vw ∼i mazda, mazda ∼i
vw,
ford ∼i kia, kia ∼i ford,
vw ∼i kia, kia ∼i vw
Restricted indifference:
ford ∼r vw, vw ∼r ford
Jan Chomicki ()
Preference Queries
12 / 68
Not every SPO is a WO
Canonical example
mazda kia, mazda ∼i vw, kia ∼i vw
Violation of negative transitivity
mazda vw, vw kia, mazda kia
Jan Chomicki ()
Preference Queries
13 / 68
Preference specification
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
(m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 )
for relation Car (Make, Year , Price).
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
(m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 )
for relation Car (Make, Year , Price).
defined using preference constructors (Preference SQL)
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
(m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 )
for relation Car (Make, Year , Price).
defined using preference constructors (Preference SQL)
defined using real-valued scoring functions:
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
(m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 )
for relation Car (Make, Year , Price).
defined using preference constructors (Preference SQL)
defined using real-valued scoring functions: F (m, y , p) = α · y + β · p
Jan Chomicki ()
Preference Queries
14 / 68
Preference specification
Explicit preference relations
Finite sets of pairs: bmw mazda, mazda kia,...
Implicit preference relations
can be infinite but finitely representable
defined using logic formulas in some constraint theory:
(m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 )
for relation Car (Make, Year , Price).
defined using preference constructors (Preference SQL)
defined using real-valued scoring functions: F (m, y , p) = α · y + β · p
(m1 , y1 , p1 ) 2 (m2 , y2 , p2 ) ≡ F (m1 , y1 , p1 ) > F (m2 , y2 , p2 )
Jan Chomicki ()
Preference Queries
14 / 68
Logic formulas
Jan Chomicki ()
Preference Queries
15 / 68
Logic formulas
The language of logic formulas
constants
object (tuple) attributes
comparison operators: =, 6=, <, >, . . .
arithmetic operators: +, ·, . . .
Boolean connectives: ¬, ∧, ∨
quantifiers:
∀, ∃
usually can be eliminated (quantifier elimination)
Jan Chomicki ()
Preference Queries
15 / 68
Representability
Jan Chomicki ()
Preference Queries
16 / 68
Representability
Definition
A scoring function f represents a preference relation if for all x, y
x y ≡ f (x) > f (y ).
Jan Chomicki ()
Preference Queries
16 / 68
Representability
Definition
A scoring function f represents a preference relation if for all x, y
x y ≡ f (x) > f (y ).
Necessary condition for representability
The preference relation is a weak order.
Jan Chomicki ()
Preference Queries
16 / 68
Representability
Definition
A scoring function f represents a preference relation if for all x, y
x y ≡ f (x) > f (y ).
Necessary condition for representability
The preference relation is a weak order.
Sufficient condition for representability
is a weak order
the domain is countable or some continuity conditions are satisfied
(studied in decision theory)
Jan Chomicki ()
Preference Queries
16 / 68
Not every WO can be represented using a scoring function
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
2
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
For every x0 , (x0 , 1) lo (x0 , 0).
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
2
For every x0 , (x0 , 1) lo (x0 , 0).
3
Thus f (x0 , 1) > f (x0 , 0).
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
2
For every x0 , (x0 , 1) lo (x0 , 0).
3
Thus f (x0 , 1) > f (x0 , 0).
4
Consider now x1 > x0 .
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
2
For every x0 , (x0 , 1) lo (x0 , 0).
3
Thus f (x0 , 1) > f (x0 , 0).
4
Consider now x1 > x0 .
5
Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0).
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
2
For every x0 , (x0 , 1) lo (x0 , 0).
3
Thus f (x0 , 1) > f (x0 , 0).
4
Consider now x1 > x0 .
5
Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0).
6
So there are uncountably many nonempty disjoint intervals in R.
Jan Chomicki ()
Preference Queries
17 / 68
Not every WO can be represented using a scoring function
Lexicographic order in R × R
(x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 )
Proof
1
Assume there is a real-valued function f such that
x lo y ≡ f (x) > f (y ).
2
For every x0 , (x0 , 1) lo (x0 , 0).
3
Thus f (x0 , 1) > f (x0 , 0).
4
Consider now x1 > x0 .
5
Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0).
6
So there are uncountably many nonempty disjoint intervals in R.
7
Each such interval contains a rational number: contradiction with the
countability of the set of rational numbers.
Jan Chomicki ()
Preference Queries
17 / 68
Preference constructors [Kie02, KK02]
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
Prefer v ∈ S1 over v 6∈ S1 .
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
Prefer v 6∈ S1 over v ∈ S1 .
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
Jan Chomicki ()
EXP(Make,{(bmw,ford),...,
(mazda,kia)})
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
EXP(Make,{(bmw,ford),...,
(mazda,kia)})
Value comparison
Prefer larger/smaller values.
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
Value comparison
Prefer larger/smaller values.
Jan Chomicki ()
EXP(Make,{(bmw,ford),...,
(mazda,kia)})
HIGHEST(Year)
LOWEST(Price)
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
Value comparison
Prefer larger/smaller values.
EXP(Make,{(bmw,ford),...,
(mazda,kia)})
HIGHEST(Year)
LOWEST(Price)
Distance
Prefer values closer to v0 .
Jan Chomicki ()
Preference Queries
18 / 68
Preference constructors [Kie02, KK02]
Good values
POS(Make,{mazda,vw})
Prefer v ∈ S1 over v 6∈ S1 .
Bad values
NEG(Make,{yugo})
Prefer v 6∈ S1 over v ∈ S1 .
Explicit preference
Preference encoded by a finite
directed graph.
Value comparison
EXP(Make,{(bmw,ford),...,
(mazda,kia)})
Prefer larger/smaller values.
HIGHEST(Year)
LOWEST(Price)
Distance
AROUND(Price,12K)
Prefer values closer to v0 .
Jan Chomicki ()
Preference Queries
18 / 68
Combining preferences
Jan Chomicki ()
Preference Queries
19 / 68
Combining preferences
Preference composition
combining preferences about objects of the same kind
dimensionality is not increased
representing preference aggregation, revision, ...
Jan Chomicki ()
Preference Queries
19 / 68
Combining preferences
Preference composition
combining preferences about objects of the same kind
dimensionality is not increased
representing preference aggregation, revision, ...
Preference accumulation
defining preferences over objects in terms of preferences over simpler
objects
dimensionality is increased (preferences over Cartesian product).
Jan Chomicki ()
Preference Queries
19 / 68
Combining preferences: composition
Jan Chomicki ()
Preference Queries
20 / 68
Combining preferences: composition
Boolean composition
x ∪ y ≡ x 1 y ∨ x 2 y
and similarly for ∩.
Jan Chomicki ()
Preference Queries
20 / 68
Combining preferences: composition
Boolean composition
x ∪ y ≡ x 1 y ∨ x 2 y
and similarly for ∩.
Prioritized composition
x lex y ≡ x 1 y ∨ (y 1 x ∧ x 2 y ).
Jan Chomicki ()
Preference Queries
20 / 68
Combining preferences: composition
Boolean composition
x ∪ y ≡ x 1 y ∨ x 2 y
and similarly for ∩.
Prioritized composition
x lex y ≡ x 1 y ∨ (y 1 x ∧ x 2 y ).
Pareto composition
x Par y ≡ (x 1 y ∧ y 2 x) ∨ (x 2 y ∧ y 1 x).
Jan Chomicki ()
Preference Queries
20 / 68
Preference composition
Jan Chomicki ()
Preference Queries
21 / 68
Preference composition
Preference relation 1
bmw
ford
mazda
kia
Jan Chomicki ()
Preference Queries
21 / 68
Preference composition
Preference relation 1
Preference relation 2
bmw
ford
ford
mazda
mazda
kia
Jan Chomicki ()
kia
bmw
Preference Queries
21 / 68
Preference composition
Preference relation 1
Preference relation 2
bmw
ford
ford
mazda
kia
mazda
kia
bmw
Prioritized composition
bmw
ford
mazda
kia
Jan Chomicki ()
Preference Queries
21 / 68
Preference composition
Preference relation 1
Preference relation 2
bmw
ford
ford
mazda
mazda
kia
Prioritized composition
kia
bmw
Pareto composition
ford
bmw
ford
kia
bmw
mazda
mazda
kia
Jan Chomicki ()
Preference Queries
21 / 68
Combining preferences: accumulation [Kie02]
Jan Chomicki ()
Preference Queries
22 / 68
Combining preferences: accumulation [Kie02]
Prioritized accumulation: pr = (1 & 2 )
(x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ).
Jan Chomicki ()
Preference Queries
22 / 68
Combining preferences: accumulation [Kie02]
Prioritized accumulation: pr = (1 & 2 )
(x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ).
Pareto accumulation: pa = (1 ⊗ 2 )
(x1 , x2 ) pa (y1 , y2 ) ≡ (x1 1 y1 ∧ x2 2 y2 ) ∨ (x1 1 y1 ∧ x2 2 y2 ).
Jan Chomicki ()
Preference Queries
22 / 68
Combining preferences: accumulation [Kie02]
Prioritized accumulation: pr = (1 & 2 )
(x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ).
Pareto accumulation: pa = (1 ⊗ 2 )
(x1 , x2 ) pa (y1 , y2 ) ≡ (x1 1 y1 ∧ x2 2 y2 ) ∨ (x1 1 y1 ∧ x2 2 y2 ).
Properties
closure
associativity
commutativity of Pareto accumulation
Jan Chomicki ()
Preference Queries
22 / 68
Skylines
Jan Chomicki ()
Preference Queries
23 / 68
Skylines
Skyline
Given single-attribute total preference relations A1 , . . . , An for a
relational schema R(A1 , . . . , An ), the skyline preference relation sky is
defined as
sky =A1 ⊗ A2 ⊗ · · · ⊗ An .
Unfolding the definition
(x1 , . . . , xn ) sky (y1 , . . . , yn ) ≡
^
xi Ai yi ∧
i
Jan Chomicki ()
Preference Queries
_
xi Ai yi .
i
23 / 68
Skyline in Euclidean space
Jan Chomicki ()
Preference Queries
24 / 68
Skyline in Euclidean space
Two-dimensional Euclidean space
(x1 , x2 ) sky (y1 , y2 ) ≡ x1 ≥ y1 ∧ x2 > y2 ∨ x1 > y1 ∧ x2 ≥ y2
Skyline consists of sky -maximal vectors
Jan Chomicki ()
Preference Queries
24 / 68
Skyline in Euclidean space
Two-dimensional Euclidean space
(x1 , x2 ) sky (y1 , y2 ) ≡ x1 ≥ y1 ∧ x2 > y2 ∨ x1 > y1 ∧ x2 ≥ y2
Skyline consists of sky -maximal vectors
Jan Chomicki ()
Preference Queries
24 / 68
Skyline properties
Jan Chomicki ()
Preference Queries
25 / 68
Skyline properties
Invariance
A skyline preference relation is unaffacted by scaling or shifting in any
dimension.
Jan Chomicki ()
Preference Queries
25 / 68
Skyline properties
Invariance
A skyline preference relation is unaffacted by scaling or shifting in any
dimension.
Maxima
A skyline consists of the maxima of monotonic scoring functions.
Jan Chomicki ()
Preference Queries
25 / 68
Skyline properties
Invariance
A skyline preference relation is unaffacted by scaling or shifting in any
dimension.
Maxima
A skyline consists of the maxima of monotonic scoring functions.
Skyline is not a weak order
(2, 0) sky (0, 2), (0, 2) sky (1, 0), (2, 0) sky (1, 0)
Jan Chomicki ()
Preference Queries
25 / 68
Skyline in SQL
Jan Chomicki ()
Preference Queries
26 / 68
Skyline in SQL
Grouping
Designating attributes not used in comparisons (DIFF).
Example
SELECT * FROM Car
SKYLINE Price MIN,
Year MAX,
Make DIFF
Jan Chomicki ()
Preference Queries
26 / 68
Skyline in SQL
Grouping
Designating attributes not used in comparisons (DIFF).
Example
SELECT * FROM Car
SKYLINE Price MIN,
Year MAX,
Make DIFF
Dynamic skylines
dimensions defined using dimension functions g1 , . . . , gn
variable query point.
Jan Chomicki ()
Preference Queries
26 / 68
Dynamic skylines
Jan Chomicki ()
Preference Queries
27 / 68
Dynamic skylines
Relation Hotel(XCoord, YCoord, Price)
tuple p = (px , py , pz ), query point (ux , uy )
dimension functions based on 2D Euclidean distance:
q
g1 (px , py ) = (px − ux )2 + (py − uy )2
g2 (pz ) = pz
Jan Chomicki ()
Preference Queries
27 / 68
Dynamic skylines
Relation Hotel(XCoord, YCoord, Price)
tuple p = (px , py , pz ), query point (ux , uy )
dimension functions based on 2D Euclidean distance:
q
g1 (px , py ) = (px − ux )2 + (py − uy )2
g2 (pz ) = pz
XCoord
YCoord
Price
0
5
80
2
6
100
5
3
120
Jan Chomicki ()
Query point: (3,4).
Preference Queries
27 / 68
Dynamic skylines
Relation Hotel(XCoord, YCoord, Price)
tuple p = (px , py , pz ), query point (ux , uy )
dimension functions based on 2D Euclidean distance:
q
g1 (px , py ) = (px − ux )2 + (py − uy )2
g2 (pz ) = pz
XCoord
YCoord
Price
0
5
80
2
6
100
5
3
120
Jan Chomicki ()
Query point: (3,4).
Preference Queries
27 / 68
Combining scoring functions
Jan Chomicki ()
Preference Queries
28 / 68
Combining scoring functions
Scoring functions can be combined using numerical operators.
Jan Chomicki ()
Preference Queries
28 / 68
Combining scoring functions
Scoring functions can be combined using numerical operators.
Common scenario
scoring functions f1 , . . . , fn
aggregate scoring function: F (t) = E (f1 (t), . . . , fn (t))
linear scoring function: Σni=1 αi fi
Jan Chomicki ()
Preference Queries
28 / 68
Combining scoring functions
Scoring functions can be combined using numerical operators.
Common scenario
scoring functions f1 , . . . , fn
aggregate scoring function: F (t) = E (f1 (t), . . . , fn (t))
linear scoring function: Σni=1 αi fi
Numerical vs. logical combination
logical combination cannot be defined numerically
numerical combination cannot be defined logically (unless arithmetic
operators are available)
Jan Chomicki ()
Preference Queries
28 / 68
Part II
Preference Queries
Jan Chomicki ()
Preference Queries
29 / 68
Outline of Part II
2
Preference queries
Retrieving non-dominated elements
Rewriting queries with winnow
Retrieving Top-K elements
Optimizing Top-K queries
Jan Chomicki ()
Preference Queries
30 / 68
Winnow[Cho03]
Jan Chomicki ()
Preference Queries
31 / 68
Winnow[Cho03]
Winnow
new relational algebra operator ω (other names: Best, BMO [Kie02])
retrieves the non-dominated (best) elements in a database relation
can be expressed in terms of other operators
Jan Chomicki ()
Preference Queries
31 / 68
Winnow[Cho03]
Winnow
new relational algebra operator ω (other names: Best, BMO [Kie02])
retrieves the non-dominated (best) elements in a database relation
can be expressed in terms of other operators
Definition
Given a preference relation and a database relation r :
ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}.
Jan Chomicki ()
Preference Queries
31 / 68
Winnow[Cho03]
Winnow
new relational algebra operator ω (other names: Best, BMO [Kie02])
retrieves the non-dominated (best) elements in a database relation
can be expressed in terms of other operators
Definition
Given a preference relation and a database relation r :
ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}.
Notation: If a preference relation C is defined using a formula C , then
we write ωC (r ), instead of ωC (r ).
Jan Chomicki ()
Preference Queries
31 / 68
Winnow[Cho03]
Winnow
new relational algebra operator ω (other names: Best, BMO [Kie02])
retrieves the non-dominated (best) elements in a database relation
can be expressed in terms of other operators
Definition
Given a preference relation and a database relation r :
ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}.
Notation: If a preference relation C is defined using a formula C , then
we write ωC (r ), instead of ωC (r ).
Skyline query
ωsky (r ) computes the set of maximal vectors in r (the skyline set).
Jan Chomicki ()
Preference Queries
31 / 68
Example of winnow
Jan Chomicki ()
Preference Queries
32 / 68
Example of winnow
Relation Car (Make, Year , Price)
Preference relation:
(m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ).
Jan Chomicki ()
Preference Queries
32 / 68
Example of winnow
Relation Car (Make, Year , Price)
Preference relation:
(m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ).
Jan Chomicki ()
Make
Year
Price
mazda
2009
20K
ford
2009
15K
ford
2007
12K
Preference Queries
32 / 68
Example of winnow
Relation Car (Make, Year , Price)
Preference relation:
(m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ).
Jan Chomicki ()
Make
Year
Price
mazda
2009
20K
ford
2009
15K
ford
2007
12K
Preference Queries
32 / 68
Computing winnow using BNL [BKS01]
Require: SPO , database relation r
1: initialize window W and temporary file F to empty
2: repeat
3:
for every tuple t in the input do
4:
if t is dominated by a tuple in W then
5:
ignore t
6:
else if t dominates some tuples in W then
7:
eliminate them and insert t into W
8:
else if there is room in W then
9:
insert t into W
10:
else
11:
add t to F
12:
end if
13:
end for
14:
output tuples from W that were added when F was empty
15:
make F the input, clear F
16: until empty input
Jan Chomicki ()
Preference Queries
33 / 68
BNL in action
Preference relation: a c, a d, b e.
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
c,e,d,a,b
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
c
e,d,a,b
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
c
d,a,b
e
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
d
Window
Input
c
a,b
e
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
d
Window
Input
a
b
e
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
d
Window
Input
a
b
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
d
b
Jan Chomicki ()
Preference Queries
34 / 68
BNL in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
b
Jan Chomicki ()
Preference Queries
34 / 68
Computing winnow with presorting
Jan Chomicki ()
Preference Queries
35 / 68
Computing winnow with presorting
SFS: adding presorting step to BNL [CGGL03]
topologically sort the input:
if x dominates y , then x precedes y in the sorted input
window contains only winnow points and can be output after every pass
for skylines:
Qk sort the input using a monotonic scoring function, for
example i=1 xi .
Jan Chomicki ()
Preference Queries
35 / 68
Computing winnow with presorting
SFS: adding presorting step to BNL [CGGL03]
topologically sort the input:
if x dominates y , then x precedes y in the sorted input
window contains only winnow points and can be output after every pass
for skylines:
Qk sort the input using a monotonic scoring function, for
example i=1 xi .
LESS: integrating different techniques [GSG07]
adding an elimination filter to the first external sort pass
combining the last external sort pass with the first SFS pass
average running time: O(kn)
Jan Chomicki ()
Preference Queries
35 / 68
SFS in action
Preference relation: a c, a d, b e.
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a,b,c,d,e
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
b,c,d,e
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
c,d,e
b
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
d,e
b
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
e
b
Jan Chomicki ()
Preference Queries
36 / 68
SFS in action
Preference relation: a c, a d, b e.
Temporary file
Window
Input
a
b
Jan Chomicki ()
Preference Queries
36 / 68
Generalizations of winnow
Jan Chomicki ()
Preference Queries
37 / 68
Generalizations of winnow
Iterating winnow
0 (r ) = ω (r )
ω
n+1
ω
(r ) = ω (r −
Jan Chomicki ()
S
1≤i≤n
i (r ))
ω
Preference Queries
37 / 68
Generalizations of winnow
Iterating winnow
0 (r ) = ω (r )
ω
n+1
ω
(r ) = ω (r −
S
1≤i≤n
i (r ))
ω
Ranking
Rank tuples by their minimum distance from a winnow tuple:
η (r ) = {(t, i) | t ∈ ωCi (r )}.
Jan Chomicki ()
Preference Queries
37 / 68
Generalizations of winnow
Iterating winnow
0 (r ) = ω (r )
ω
n+1
ω
(r ) = ω (r −
S
1≤i≤n
i (r ))
ω
Ranking
Rank tuples by their minimum distance from a winnow tuple:
η (r ) = {(t, i) | t ∈ ωCi (r )}.
k-band
Return the tuples dominated by at most k tuples:
ω (r ) = {t ∈ r | #{t 0 ∈ r | t 0 t} ≤ k}.
Jan Chomicki ()
Preference Queries
37 / 68
Preference SQL
Jan Chomicki ()
Preference Queries
38 / 68
Preference SQL
The language
basic preference constructors
Pareto/prioritized accumulation
new SQL clause PREFERRING
groupwise preferences
implementation: translation to SQL
Jan Chomicki ()
Preference Queries
38 / 68
Preference SQL
The language
basic preference constructors
Pareto/prioritized accumulation
new SQL clause PREFERRING
groupwise preferences
implementation: translation to SQL
Winnow in Preference SQL
SELECT * FROM Car
PREFERRING HIGHEST(Year)
CASCADE LOWEST(Price)
Jan Chomicki ()
Preference Queries
38 / 68
Algebraic laws [Cho03]
Jan Chomicki ()
Preference Queries
39 / 68
Algebraic laws [Cho03]
Commutativity of winnow with selection
If the formula
∀t1 , t2 .[α(t2 ) ∧ γ(t1 , t2 )] ⇒ α(t1 )
is valid, then for every r
σα (ωγ (r )) = ωγ (σα (r )).
Jan Chomicki ()
Preference Queries
39 / 68
Algebraic laws [Cho03]
Commutativity of winnow with selection
If the formula
∀t1 , t2 .[α(t2 ) ∧ γ(t1 , t2 )] ⇒ α(t1 )
is valid, then for every r
σα (ωγ (r )) = ωγ (σα (r )).
Under the preference relation
(m, y , p) C1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∧ p ≤ p 0 ∨ y ≥ y 0 ∧ p < p 0
the selection σPrice<20K commutes with ωC1 but σPrice>20K does not.
Jan Chomicki ()
Preference Queries
39 / 68
Other algebraic laws
Jan Chomicki ()
Preference Queries
40 / 68
Other algebraic laws
Distributivity of winnow over Cartesian product
For every r1 and r2
ωC (r1 × r2 ) = ωC (r1 ) × r2
if C refers only to the attributes of r1 .
Commutativity of winnow
If ∀t1 , t2 .[C1 (t1 , t2 ) ⇒ C2 (t1 , t2 )] is valid and C1 and C2 are SPOs, then
for all finite instances r :
ωC1 (ωC2 (r )) = ωC2 (ωC1 (r )) = ωC2 (r ).
Jan Chomicki ()
Preference Queries
40 / 68
Semantic query optimization [Cho07b]
Jan Chomicki ()
Preference Queries
41 / 68
Semantic query optimization [Cho07b]
Using information about integrity constraints to:
eliminate redundant occurrences of winnow.
make more efficient computation of winnow possible.
Eliminating redundancy
Given a set of integrity constraints F , ωC is redundant w.r.t. F iff F
implies the formula
∀t1 , t2 . R(t1 ) ∧ R(t2 ) ⇒ t1 ∼C t2 .
Jan Chomicki ()
Preference Queries
41 / 68
Integrity constraints
Jan Chomicki ()
Preference Queries
42 / 68
Integrity constraints
Constraint-generating dependencies (CGD) [BCW99, ZO97]
∀t1 . . . . ∀tn . [R(t1 ) ∧ · · · ∧ R(tn ) ∧ γ(t1 , . . . tn )] ⇒ γ 0 (t1 , . . . tn ).
Jan Chomicki ()
Preference Queries
42 / 68
Integrity constraints
Constraint-generating dependencies (CGD) [BCW99, ZO97]
∀t1 . . . . ∀tn . [R(t1 ) ∧ · · · ∧ R(tn ) ∧ γ(t1 , . . . tn )] ⇒ γ 0 (t1 , . . . tn ).
CGD entailment
Decidable by reduction to the validity of ∀-formulas in the constraint
theory (assuming the theory is decidable).
Jan Chomicki ()
Preference Queries
42 / 68
Top-K queries
Jan Chomicki ()
Preference Queries
43 / 68
Top-K queries
Scoring functions
each tuple t in a relation has numeric scores f1 (t), . . . , fm (t) assigned
by numeric component scoring functions f1 , . . . , fm
the aggregate score of t is F (t) = E (f1 (t), . . . , fm (t)) where E is a
numeric-valued expression
F is monotone if E (x1 , . . . , xm ) ≤ E (y1 , . . . , ym ) whenever xi ≤ yi for
all i
Jan Chomicki ()
Preference Queries
43 / 68
Top-K queries
Scoring functions
each tuple t in a relation has numeric scores f1 (t), . . . , fm (t) assigned
by numeric component scoring functions f1 , . . . , fm
the aggregate score of t is F (t) = E (f1 (t), . . . , fm (t)) where E is a
numeric-valued expression
F is monotone if E (x1 , . . . , xm ) ≤ E (y1 , . . . , ym ) whenever xi ≤ yi for
all i
Top-K queries
return K elements having top F -values in a database relation R
query expressed in an extension of SQL:
SELECT *
FROM R
ORDER BY F DESC
LIMIT K
Jan Chomicki ()
Preference Queries
43 / 68
Top-K sets
Jan Chomicki ()
Preference Queries
44 / 68
Top-K sets
Definition
Given a scoring function F and a database relation r , s is a Top-K set if:
s⊆r
|s| = min(K , |r |)
∀t ∈ s. ∀t 0 ∈ r − s. F (t) ≥ F (t 0 )
There may be more than one Top-K set: one is selected
non-deterministically.
Jan Chomicki ()
Preference Queries
44 / 68
Example of Top-2
Jan Chomicki ()
Preference Queries
45 / 68
Example of Top-2
Relation Car (Make, Year , Price)
component scoring functions:
f1 (m, y , p) = (y − 2005)
f2 (m, y , p) = (20000 − p)
aggregate scoring function:
F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p)
Jan Chomicki ()
Preference Queries
45 / 68
Example of Top-2
Relation Car (Make, Year , Price)
component scoring functions:
f1 (m, y , p) = (y − 2005)
f2 (m, y , p) = (20000 − p)
aggregate scoring function:
F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p)
Make
Year
Price
mazda
2009
20000
4000
ford
2009
15000
9000
ford
2007
12000
10000
Jan Chomicki ()
Aggregate score
Preference Queries
45 / 68
Example of Top-2
Relation Car (Make, Year , Price)
component scoring functions:
f1 (m, y , p) = (y − 2005)
f2 (m, y , p) = (20000 − p)
aggregate scoring function:
F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p)
Make
Year
Price
mazda
2009
20000
4000
ford
2009
15000
9000
ford
2007
12000
10000
Jan Chomicki ()
Aggregate score
Preference Queries
45 / 68
Computing Top-K
Jan Chomicki ()
Preference Queries
46 / 68
Computing Top-K
Naive approaches
sort, output the first K -tuples
scan the input maintaining a priority queue of size K
...
Jan Chomicki ()
Preference Queries
46 / 68
Computing Top-K
Naive approaches
sort, output the first K -tuples
scan the input maintaining a priority queue of size K
...
Better approaches
Jan Chomicki ()
Preference Queries
46 / 68
Computing Top-K
Naive approaches
sort, output the first K -tuples
scan the input maintaining a priority queue of size K
...
Better approaches
the entire input does not need to be scanned...
Jan Chomicki ()
Preference Queries
46 / 68
Computing Top-K
Naive approaches
sort, output the first K -tuples
scan the input maintaining a priority queue of size K
...
Better approaches
the entire input does not need to be scanned...
... provided additional data structures are available
Jan Chomicki ()
Preference Queries
46 / 68
Computing Top-K
Naive approaches
sort, output the first K -tuples
scan the input maintaining a priority queue of size K
...
Better approaches
the entire input does not need to be scanned...
... provided additional data structures are available
variants of the threshold algorithm
Jan Chomicki ()
Preference Queries
46 / 68
Threshold algorithm (TA)[FLN03]
Jan Chomicki ()
Preference Queries
47 / 68
Threshold algorithm (TA)[FLN03]
Inputs
a monotone scoring function F (t) = E (f1 (t), . . . , fm (t))
lists Si , i = 1, . . . , m, each sorted on fi (descending) and representing a
different ranking of the same set of objects
1
For each list Si in parallel, retrieve the current object w in sorted order:
(random access) for every j 6= i, retrieve vj = fj (w ) from the list Sj
if d = E (v1 , . . . , vm ) is among the highest K scores seen so far,
remember w and d (ties broken arbitrarily)
2
Thresholding:
for each i, wi is the last object seen under sorted access in Si
if there are already K top-K objects with score at least equal to the
threshold T = E (f1 (w1 ), . . . , fm (wm )), return collected objects sorted
by F and terminate
otherwise, go to step 1.
Jan Chomicki ()
Preference Queries
47 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
5
50
3
50
1
35
2
40
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
5
50
3
50
1
35
2
40
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
T=100
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
5
50
3
50
1
35
2
40
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
5:60
T=100
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
3:80
5
50
3
50
5:60
1
35
2
40
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
T=100
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
3:80
5
50
3
50
5:60
1
35
2
40
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
T=75
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
3:80
5
50
3
50
1:65
1
35
2
40
5:60
3
30
1
30
2
20
4
20
4
10
5
10
Jan Chomicki ()
T=75
Preference Queries
48 / 68
TA in action
Aggregate score
F (t) = P1 (t) + P2 (t)
Priority queue
OID
P1
OID
P2
3:80
5
50
3
50
1:65
1
35
2
40
5:60
3
30
1
30
2:60
2
20
4
20
4
10
5
10
Jan Chomicki ()
T=75
Preference Queries
48 / 68
TA in databases
Jan Chomicki ()
Preference Queries
49 / 68
TA in databases
objects: tuples of a single relation r
single-attribute component scoring functions
sorted list access implemented through indexes
random access to all lists implemented by primary index access to r
that retrieves entire tuples
Jan Chomicki ()
Preference Queries
49 / 68
Optimizing Top-K queries [LCIS05]
Jan Chomicki ()
Preference Queries
50 / 68
Optimizing Top-K queries [LCIS05]
Goals
integrating Top-K with relational query evaluation and optimization
replacing blocking by pipelining
Jan Chomicki ()
Preference Queries
50 / 68
Optimizing Top-K queries [LCIS05]
Goals
integrating Top-K with relational query evaluation and optimization
replacing blocking by pipelining
Example
SELECT *
FROM Hotel h, Restaurant r , Museum m
WHERE c1 AND c2 AND c3
ORDER BY f1 + f2 + f3
LIMIT K
Jan Chomicki ()
Preference Queries
50 / 68
Optimizing Top-K queries [LCIS05]
Goals
integrating Top-K with relational query evaluation and optimization
replacing blocking by pipelining
Example
SELECT *
FROM Hotel h, Restaurant r , Museum m
WHERE c1 AND c2 AND c3
ORDER BY f1 + f2 + f3
LIMIT K
Is there a better evaluation plan than materialize-then-sort?
Jan Chomicki ()
Preference Queries
50 / 68
Partial ranking of tuples
Jan Chomicki ()
Preference Queries
51 / 68
Partial ranking of tuples
Model
set of component scoring functions P = {f1 , . . . , fm } such that
fi (t) ≤ 1 for all t
aggregate scoring function F (t) = E (f1 (t), . . . , fm (t))
how to rank intermediate tuples?
Jan Chomicki ()
Preference Queries
51 / 68
Partial ranking of tuples
Model
set of component scoring functions P = {f1 , . . . , fm } such that
fi (t) ≤ 1 for all t
aggregate scoring function F (t) = E (f1 (t), . . . , fm (t))
how to rank intermediate tuples?
Ranking principle
Given P0 ⊆ P,
F̄P0 (t) = E (g1 (t), . . . , gm (t))
where

 fi (t) if fi ∈ P0
gi (t) =
 1
otherwise
Jan Chomicki ()
Preference Queries
51 / 68
Relations with rank
Jan Chomicki ()
Preference Queries
52 / 68
Relations with rank
Rank-relation RP0
relation R
monotone aggregate scoring function F (the same for all relations)
set of component scoring functions P0 ⊆ P
order:
t1 >RP0 t2 ≡ F̄P0 (t1 ) > F̄P0 (t2 )
Jan Chomicki ()
Preference Queries
52 / 68
Ranking intermediate results
Jan Chomicki ()
Preference Queries
53 / 68
Ranking intermediate results
Operators
rank operator µf : ranks tuples according to an additional component
scoring function f
standard relational algebra operators suitably extended to work on
rank-relations
Jan Chomicki ()
Preference Queries
53 / 68
Ranking intermediate results
Operators
rank operator µf : ranks tuples according to an additional component
scoring function f
standard relational algebra operators suitably extended to work on
rank-relations
Operator
Order
t1 >µf (RP
RP1 ∩ SP2
t1 >RP1 ∩SP2 t2 ≡ F̄P1 ∪P2 (t1 ) > F̄P1 ∪P2 (t2 )
Jan Chomicki ()
0
) t2
≡ F̄P0 ∪{f } (t1 ) > F̄P0 ∪{f } (t2 )
µf (RP0 )
Preference Queries
53 / 68
Example
Jan Chomicki ()
Preference Queries
54 / 68
Example
Query
SELECT *
FROM S
ORDER BY f1 + f2 + f3
LIMIT 1
Jan Chomicki ()
Preference Queries
54 / 68
Example
Query
SELECT *
FROM S
ORDER BY f1 + f2 + f3
LIMIT 1
Unranked relation S
A
f1
f2
f3
1
0.7
0.8
0.9
2
0.9
0.85
0.8
3
0.5
0.45
0.75
Jan Chomicki ()
Preference Queries
54 / 68
Example
Query
SELECT *
FROM S
ORDER BY f1 + f2 + f3
LIMIT 1
Rank-relation S{f1 }
Unranked relation S
A
f1
f2
f3
A
F̄{f1 }
1
0.7
0.8
0.9
2
2.9
2
0.9
0.85
0.8
1
2.7
3
0.5
0.45
0.75
3
2.5
Jan Chomicki ()
Preference Queries
54 / 68
Pipelined execution
Pipelined execution
A
f1
f2
f3
F̄{f1 }
2
0.9
0.85
0.8
2.9
1
0.7
0.8
0.9
2.7
3
0.5
0.45
0.75
2.5
IndexScanf1
Pipelined execution
A
f1
f2
f3
F̄{f1 }
A
F̄{f1 ,f2 }
2
0.9
0.85
0.8
2.9
2
2.75
1
0.7
0.8
0.9
2.7
1
2.5
3
0.5
0.45
0.75
2.5
3
1.95
IndexScanf1
µ f2
Pipelined execution
A
f1
f2
f3
F̄{f1 }
A
F̄{f1 ,f2 }
2
0.9
0.85
0.8
2.9
2
2.75
1
0.7
0.8
0.9
2.7
1
2.5
3
0.5
0.45
0.75
2.5
3
1.95
µ f2
µf3
IndexScanf1
Jan Chomicki ()
Preference Queries
A
F̄{f1 ,f2 ,f3 }
2
2.55
1
2.4
55 / 68
Algebraic laws for rank-relation operators
Jan Chomicki ()
Preference Queries
56 / 68
Algebraic laws for rank-relation operators
Splitting for µ
R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .))
Jan Chomicki ()
Preference Queries
56 / 68
Algebraic laws for rank-relation operators
Splitting for µ
R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .))
Commutativity of µ
µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 ))
Jan Chomicki ()
Preference Queries
56 / 68
Algebraic laws for rank-relation operators
Splitting for µ
R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .))
Commutativity of µ
µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 ))
Commutativity of µ with selection
σC (µf (RP0 )) ≡ µf (σC (RP0 ))
Jan Chomicki ()
Preference Queries
56 / 68
Algebraic laws for rank-relation operators
Splitting for µ
R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .))
Commutativity of µ
µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 ))
Commutativity of µ with selection
σC (µf (RP0 )) ≡ µf (σC (RP0 ))
Distributivity of µ over Cartesian product
µf (RP1 × SP2 ) ≡ µf (RP1 ) × SP2 if f refers only to the attributes of R.
Jan Chomicki ()
Preference Queries
56 / 68
Part III
Preference management
Jan Chomicki ()
Preference Queries
57 / 68
Outline of Part III
3
Preference management
Preference modification
Jan Chomicki ()
Preference Queries
58 / 68
Preference modification
Jan Chomicki ()
Preference Queries
59 / 68
Preference modification
Goal
Given a preference relation and additional preference or indifference
information I, construct a new preference relation 0 whose contents
depend on and I.
Jan Chomicki ()
Preference Queries
59 / 68
Preference modification
Goal
Given a preference relation and additional preference or indifference
information I, construct a new preference relation 0 whose contents
depend on and I.
General postulates
fulfillment: the new information I should be completely incorporated
into 0
minimal change: should be changed as little as possible
closure:
order-theoretic properties of should be preserved in 0 (SPO, WO)
finiteness or finite representability of should also be preserved in 0
Jan Chomicki ()
Preference Queries
59 / 68
Preference revision [Cho07a]
Jan Chomicki ()
Preference Queries
60 / 68
Preference revision [Cho07a]
Setting
new information: revising preference relation 0
composition operator θ: union, prioritized or Pareto composition
composition eliminates (some) preference conflicts
additional assumptions: interval orders
0 = TC (0 θ ) to guarantee SPO
Jan Chomicki ()
Preference Queries
60 / 68
Preference revision [Cho07a]
Setting
new information: revising preference relation 0
composition operator θ: union, prioritized or Pareto composition
composition eliminates (some) preference conflicts
additional assumptions: interval orders
0 = TC (0 θ ) to guarantee SPO
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
60 / 68
Preference revision [Cho07a]
Setting
new information: revising preference relation 0
composition operator θ: union, prioritized or Pareto composition
composition eliminates (some) preference conflicts
additional assumptions: interval orders
0 = TC (0 θ ) to guarantee SPO
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
60 / 68
Preference revision [Cho07a]
Setting
new information: revising preference relation 0
composition operator θ: union, prioritized or Pareto composition
composition eliminates (some) preference conflicts
additional assumptions: interval orders
0 = TC (0 θ ) to guarantee SPO
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
60 / 68
Preference contraction [MC08]
Jan Chomicki ()
Preference Queries
61 / 68
Preference contraction [MC08]
Setting
new information: contractor relation CON
0 : maximal subset of disjoint with CON
Jan Chomicki ()
Preference Queries
61 / 68
Preference contraction [MC08]
Setting
new information: contractor relation CON
0 : maximal subset of disjoint with CON
VW , 2009
VW , 2008
VW , 2007
VW , 2006
VW , 2005
Jan Chomicki ()
Preference Queries
61 / 68
Preference contraction [MC08]
Setting
new information: contractor relation CON
0 : maximal subset of disjoint with CON
VW , 2009
VW , 2008
VW , 2007
VW , 2006
VW , 2005
Jan Chomicki ()
Preference Queries
61 / 68
Preference contraction [MC08]
Setting
new information: contractor relation CON
0 : maximal subset of disjoint with CON
VW , 2009
VW , 2008
VW , 2007
VW , 2006
VW , 2005
Jan Chomicki ()
Preference Queries
61 / 68
Substitutability [BGS06]
Jan Chomicki ()
Preference Queries
62 / 68
Substitutability [BGS06]
Setting
new information: set of indifference pairs
additional preferences are added to convert indifference to restricted
indifference
achieving object substitutability
Jan Chomicki ()
Preference Queries
62 / 68
Substitutability [BGS06]
Setting
new information: set of indifference pairs
additional preferences are added to convert indifference to restricted
indifference
achieving object substitutability
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
62 / 68
Substitutability [BGS06]
Setting
new information: set of indifference pairs
additional preferences are added to convert indifference to restricted
indifference
achieving object substitutability
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
62 / 68
Substitutability [BGS06]
Setting
new information: set of indifference pairs
additional preferences are added to convert indifference to restricted
indifference
achieving object substitutability
Jan Chomicki ()
VW , 2009
Kia, 2009
VW , 2008
Kia, 2008
VW , 2007
Kia, 2007
Preference Queries
62 / 68
Part IV
Advanced topics
Jan Chomicki ()
Preference Queries
63 / 68
Outline of Part IV
Jan Chomicki ()
Preference Queries
64 / 68
Prospective research topics
Jan Chomicki ()
Preference Queries
65 / 68
Prospective research topics
Definability
Given a preference relation C , how to construct a definition of a scoring
function F representing C , if such a function exists?
Jan Chomicki ()
Preference Queries
65 / 68
Prospective research topics
Definability
Given a preference relation C , how to construct a definition of a scoring
function F representing C , if such a function exists?
Extrinsic preference relations
Preference relations that are not fully defined by tuple contents:
x y ≡ ∃n1 , n2 . Dissatisfied(x, n1 ) ∧ Dissatisfied(y , n2 ) ∧ n1 < n2 .
Jan Chomicki ()
Preference Queries
65 / 68
Prospective research topics
Definability
Given a preference relation C , how to construct a definition of a scoring
function F representing C , if such a function exists?
Extrinsic preference relations
Preference relations that are not fully defined by tuple contents:
x y ≡ ∃n1 , n2 . Dissatisfied(x, n1 ) ∧ Dissatisfied(y , n2 ) ∧ n1 < n2 .
Incomplete preferences
tuple scores and probabilities [SIC08, ZC08]
uncertain tuple scores
disjunctive preferences: a b ∨ a c
Jan Chomicki ()
Preference Queries
65 / 68
Jan Chomicki ()
Preference Queries
66 / 68
Preference modification
beyond revision and contraction: merging, arbitration,...
general parametric framework?
conflict resolution
Jan Chomicki ()
Preference Queries
66 / 68
Preference modification
beyond revision and contraction: merging, arbitration,...
general parametric framework?
conflict resolution
Variations
preference and similarity: “find the objects similar to one of the best
objects”
Jan Chomicki ()
Preference Queries
66 / 68
Preference modification
beyond revision and contraction: merging, arbitration,...
general parametric framework?
conflict resolution
Variations
preference and similarity: “find the objects similar to one of the best
objects”
Applications
preference queries as decision components: workflows, event systems
personalization of query results
preference negotiation: applying contraction
Jan Chomicki ()
Preference Queries
66 / 68
Acknowledgments
Denis Mindolin
Slawek Staworko
Xi Zhang
Jan Chomicki ()
Preference Queries
67 / 68
Jan Chomicki ()
Preference Queries
68 / 68
M. Baudinet, J. Chomicki, and P. Wolper.
Constraint-Generating Dependencies.
Journal of Computer and System Sciences, 59:94–115, 1999.
Preliminary version in ICDT’95.
W-T. Balke, U. Güntzer, and W. Siberski.
Exploiting Indifference for Customization of Partial Order Skylines.
In International Database Engineering and Applications Symposium
(IDEAS), pages 80–88, 2006.
S. Börzsönyi, D. Kossmann, and K. Stocker.
The Skyline Operator.
In IEEE International Conference on Data Engineering (ICDE), pages
421–430, 2001.
J. Chomicki, P. Godfrey, J. Gryz, and D. Liang.
Skyline with Presorting.
In IEEE International Conference on Data Engineering (ICDE), 2003.
Poster.
J. Chomicki.
Jan Chomicki ()
Preference Queries
68 / 68
Preference Formulas in Relational Queries.
ACM Transactions on Database Systems, 28(4):427–466, December
2003.
J. Chomicki.
Database Querying under Changing Preferences.
Annals of Mathematics and Artificial Intelligence, 50(1-2):79–109,
2007.
J. Chomicki.
Semantic optimization techniques for preference queries.
Information Systems, 32(5):660–674, 2007.
R. Fagin, A. Lotem, and M. Naor.
Optimal Aggregation Algorithms for Middleware.
Journal of Computer and System Sciences, 66(4):614–656, 2003.
P. Godfrey, R. Shipley, and J. Gryz.
Algorithms and Analyses for Maximal Vector Computation.
VLDB Journal, 16:5–28, 2007.
W. Kießling.
Jan Chomicki ()
Preference Queries
68 / 68
Foundations of Preferences in Database Systems.
In International Conference on Very Large Data Bases (VLDB), pages
311–322, 2002.
W. Kießling and G. Köstler.
Preference SQL - Design, Implementation, Experience.
In International Conference on Very Large Data Bases (VLDB), pages
990–1001, 2002.
C. Li, K. C-C. Chang, I.F. Ilyas, and S. Song.
RankSQL: Query Algebra and Optimization for Relational Top-k
Queries.
In ACM SIGMOD International Conference on Management of Data,
pages 131–142, 2005.
D. Mindolin and J. Chomicki.
Maximal Contraction of Preference Relations.
In National Conference on Artificial Intelligence, pages 492–497, 2008.
M.A. Soliman, I.F. Ilyas, and K. C-C. Chang.
Probabilistic Top-k and Ranking-Aggregate Queries.
Jan Chomicki ()
Preference Queries
68 / 68
ACM Transactions on Database Systems, 33(3):13:1–13:54, 2008.
X. Zhang and J. Chomicki.
On the Semantics and Evaluation of Top-k Queries in Probabilistic
Databases.
In International Workshop on Database Ranking (DBRank). IEEE
Computer Society (ICDE Workshops), 2008.
X. Zhang and Z. M. Ozsoyoglu.
Implication and Referential Constraints: A New Formal Reasoning.
IEEE Transactions on Knowledge and Data Engineering,
9(6):894–910, 1997.
Jan Chomicki ()
Preference Queries
68 / 68
Download