Foundations of Preference Queries Jan Chomicki University at Buffalo Jan Chomicki () Preference Queries 1 / 68 Plan of the course 1 Preference relations 2 Preference queries 3 Preference management 4 Advanced topics Jan Chomicki () Preference Queries 2 / 68 Part I Preference relations Jan Chomicki () Preference Queries 3 / 68 Outline of Part I 1 Preference relations Preference Equivalence Preference specification Combining preferences Skylines Jan Chomicki () Preference Queries 4 / 68 Preference relations Universe of objects constants: uninterpreted, numbers,... individuals (entities) tuples sets Jan Chomicki () Preference Queries 5 / 68 Preference relations Universe of objects constants: uninterpreted, numbers,... individuals (entities) tuples sets Preference relation binary relation between objects x y ≡ x is better than y ≡ x dominates y an abstract, uniform way of talking about desirability, worth, cost, timeliness,..., and their combinations preference relations used in queries Jan Chomicki () Preference Queries 5 / 68 Buying a car Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: What kind of car do you prefer? Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: What kind of car do you prefer? Customer: The newer the better, if it is the same make. And cheap, too. Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: What kind of car do you prefer? Customer: The newer the better, if it is the same make. And cheap, too. Salesman: Which is more important for you: the age or the price? Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: Customer: Salesman: Customer: What kind of car do you prefer? The newer the better, if it is the same make. And cheap, too. Which is more important for you: the age or the price? The age, definitely. Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: What kind of car do you prefer? Customer: The newer the better, if it is the same make. And cheap, too. Salesman: Which is more important for you: the age or the price? Customer: The age, definitely. Salesman: Those are the best cars, according to your preferences, that we have in stock. Jan Chomicki () Preference Queries 6 / 68 Buying a car Salesman: What kind of car do you prefer? Customer: The newer the better, if it is the same make. And cheap, too. Salesman: Which is more important for you: the age or the price? Customer: The age, definitely. Salesman: Those are the best cars, according to your preferences, that we have in stock. Customer: Wait...it better be a BMW. Jan Chomicki () Preference Queries 6 / 68 Applications of preferences and preference queries 1 decision making 2 e-commerce 3 digital libraries 4 personalization Jan Chomicki () Preference Queries 7 / 68 Properties of preference relations Jan Chomicki () Preference Queries 8 / 68 Properties of preference relations Properties of irreflexivity: ∀x. x 6 x asymmetry: ∀x, y . x y ⇒ y 6 x transitivity: ∀x, y , z. (x y ∧ y z) ⇒ x z negative transitivity: ∀x, y , z. (x 6 y ∧ y 6 z) ⇒ x 6 z connectivity: ∀x, y . x y ∨ y x ∨ x = y Jan Chomicki () Preference Queries 8 / 68 Properties of preference relations Properties of irreflexivity: ∀x. x 6 x asymmetry: ∀x, y . x y ⇒ y 6 x transitivity: ∀x, y , z. (x y ∧ y z) ⇒ x z negative transitivity: ∀x, y , z. (x 6 y ∧ y 6 z) ⇒ x 6 z connectivity: ∀x, y . x y ∨ y x ∨ x = y Orders strict partial order (SPO): irreflexive and transitive weak order (WO): negatively transitive SPO total order: connected SPO Jan Chomicki () Preference Queries 8 / 68 Weak and total orders Total order Weak order a c a b d e d f f Jan Chomicki () Preference Queries 9 / 68 Order properties of preference relations Jan Chomicki () Preference Queries 10 / 68 Order properties of preference relations Irreflexivity, asymmetry: uncontroversial. Jan Chomicki () Preference Queries 10 / 68 Order properties of preference relations Irreflexivity, asymmetry: uncontroversial. Transitivity: captures rationality of preference not always guaranteed: voting paradoxes helps with preference querying Jan Chomicki () Preference Queries 10 / 68 Order properties of preference relations Irreflexivity, asymmetry: uncontroversial. Transitivity: captures rationality of preference not always guaranteed: voting paradoxes helps with preference querying Negative transitivity: scoring functions represent weak orders Jan Chomicki () Preference Queries 10 / 68 Order properties of preference relations Irreflexivity, asymmetry: uncontroversial. Transitivity: captures rationality of preference not always guaranteed: voting paradoxes helps with preference querying Negative transitivity: scoring functions represent weak orders We assume that preference relations are SPOs. Jan Chomicki () Preference Queries 10 / 68 When are two objects equivalent? Jan Chomicki () Preference Queries 11 / 68 When are two objects equivalent? Relation ∼ binary relation between objects x ∼ y ≡ x 00 is equivalent to 00 y Jan Chomicki () Preference Queries 11 / 68 When are two objects equivalent? Relation ∼ binary relation between objects x ∼ y ≡ x 00 is equivalent to 00 y Several notions of equivalence equality: x ∼eq y ≡ x = y indifference: x ∼i y ≡ x y ∧ y x restricted indifference: x ∼r y ≡ ∀z. (x ≺ z ⇔ y ≺ z) ∧ (z ≺ y ⇔ z ≺ x) Jan Chomicki () Preference Queries 11 / 68 When are two objects equivalent? Relation ∼ binary relation between objects x ∼ y ≡ x 00 is equivalent to 00 y Several notions of equivalence equality: x ∼eq y ≡ x = y indifference: x ∼i y ≡ x y ∧ y x restricted indifference: x ∼r y ≡ ∀z. (x ≺ z ⇔ y ≺ z) ∧ (z ≺ y ⇔ z ≺ x) Properties of equivalence equivalence relation: reflexive, symmetric, transitive equality and restricted indifference (if is an SPO) are equivalence relations indifference is reflexive and symmetric; transitive for WO Jan Chomicki () Preference Queries 11 / 68 Example Jan Chomicki () Preference Queries 12 / 68 Example bmw ford vw mazda kia Jan Chomicki () Preference Queries 12 / 68 Example bmw ford vw mazda Preference: bmw ford, bmw vw bmw mazda, bmw kia mazda kia Indifference: kia ford ∼i vw, vw ∼i ford, ford ∼i mazda, mazda ∼i ford, vw ∼i mazda, mazda ∼i vw, ford ∼i kia, kia ∼i ford, vw ∼i kia, kia ∼i vw Restricted indifference: ford ∼r vw, vw ∼r ford Jan Chomicki () Preference Queries 12 / 68 Example bmw ford vw mazda Preference: bmw ford, bmw vw bmw mazda, bmw kia mazda kia Indifference: kia This is a strict partial order which is not a weak order. ford ∼i vw, vw ∼i ford, ford ∼i mazda, mazda ∼i ford, vw ∼i mazda, mazda ∼i vw, ford ∼i kia, kia ∼i ford, vw ∼i kia, kia ∼i vw Restricted indifference: ford ∼r vw, vw ∼r ford Jan Chomicki () Preference Queries 12 / 68 Not every SPO is a WO Canonical example mazda kia, mazda ∼i vw, kia ∼i vw Violation of negative transitivity mazda vw, vw kia, mazda kia Jan Chomicki () Preference Queries 13 / 68 Preference specification Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: (m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 ) for relation Car (Make, Year , Price). Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: (m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 ) for relation Car (Make, Year , Price). defined using preference constructors (Preference SQL) Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: (m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 ) for relation Car (Make, Year , Price). defined using preference constructors (Preference SQL) defined using real-valued scoring functions: Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: (m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 ) for relation Car (Make, Year , Price). defined using preference constructors (Preference SQL) defined using real-valued scoring functions: F (m, y , p) = α · y + β · p Jan Chomicki () Preference Queries 14 / 68 Preference specification Explicit preference relations Finite sets of pairs: bmw mazda, mazda kia,... Implicit preference relations can be infinite but finitely representable defined using logic formulas in some constraint theory: (m1 , y1 , p1 ) 1 (m2 , y2 , p2 ) ≡ y1 > y2 ∨ (y1 = y2 ∧ p1 < p2 ) for relation Car (Make, Year , Price). defined using preference constructors (Preference SQL) defined using real-valued scoring functions: F (m, y , p) = α · y + β · p (m1 , y1 , p1 ) 2 (m2 , y2 , p2 ) ≡ F (m1 , y1 , p1 ) > F (m2 , y2 , p2 ) Jan Chomicki () Preference Queries 14 / 68 Logic formulas Jan Chomicki () Preference Queries 15 / 68 Logic formulas The language of logic formulas constants object (tuple) attributes comparison operators: =, 6=, <, >, . . . arithmetic operators: +, ·, . . . Boolean connectives: ¬, ∧, ∨ quantifiers: ∀, ∃ usually can be eliminated (quantifier elimination) Jan Chomicki () Preference Queries 15 / 68 Representability Jan Chomicki () Preference Queries 16 / 68 Representability Definition A scoring function f represents a preference relation if for all x, y x y ≡ f (x) > f (y ). Jan Chomicki () Preference Queries 16 / 68 Representability Definition A scoring function f represents a preference relation if for all x, y x y ≡ f (x) > f (y ). Necessary condition for representability The preference relation is a weak order. Jan Chomicki () Preference Queries 16 / 68 Representability Definition A scoring function f represents a preference relation if for all x, y x y ≡ f (x) > f (y ). Necessary condition for representability The preference relation is a weak order. Sufficient condition for representability is a weak order the domain is countable or some continuity conditions are satisfied (studied in decision theory) Jan Chomicki () Preference Queries 16 / 68 Not every WO can be represented using a scoring function Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 2 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). For every x0 , (x0 , 1) lo (x0 , 0). Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). 2 For every x0 , (x0 , 1) lo (x0 , 0). 3 Thus f (x0 , 1) > f (x0 , 0). Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). 2 For every x0 , (x0 , 1) lo (x0 , 0). 3 Thus f (x0 , 1) > f (x0 , 0). 4 Consider now x1 > x0 . Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). 2 For every x0 , (x0 , 1) lo (x0 , 0). 3 Thus f (x0 , 1) > f (x0 , 0). 4 Consider now x1 > x0 . 5 Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0). Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). 2 For every x0 , (x0 , 1) lo (x0 , 0). 3 Thus f (x0 , 1) > f (x0 , 0). 4 Consider now x1 > x0 . 5 Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0). 6 So there are uncountably many nonempty disjoint intervals in R. Jan Chomicki () Preference Queries 17 / 68 Not every WO can be represented using a scoring function Lexicographic order in R × R (x1 , y1 ) lo (x2 , y2 ) ≡ x1 > x2 ∨ (x1 = x2 ∧ y1 > y2 ) Proof 1 Assume there is a real-valued function f such that x lo y ≡ f (x) > f (y ). 2 For every x0 , (x0 , 1) lo (x0 , 0). 3 Thus f (x0 , 1) > f (x0 , 0). 4 Consider now x1 > x0 . 5 Clearly f (x1 , 1) > f (x1 , 0) > f (x0 , 1) > f (x0 , 0). 6 So there are uncountably many nonempty disjoint intervals in R. 7 Each such interval contains a rational number: contradiction with the countability of the set of rational numbers. Jan Chomicki () Preference Queries 17 / 68 Preference constructors [Kie02, KK02] Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values Prefer v ∈ S1 over v 6∈ S1 . Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values Prefer v 6∈ S1 over v ∈ S1 . Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. Jan Chomicki () EXP(Make,{(bmw,ford),..., (mazda,kia)}) Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. EXP(Make,{(bmw,ford),..., (mazda,kia)}) Value comparison Prefer larger/smaller values. Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. Value comparison Prefer larger/smaller values. Jan Chomicki () EXP(Make,{(bmw,ford),..., (mazda,kia)}) HIGHEST(Year) LOWEST(Price) Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. Value comparison Prefer larger/smaller values. EXP(Make,{(bmw,ford),..., (mazda,kia)}) HIGHEST(Year) LOWEST(Price) Distance Prefer values closer to v0 . Jan Chomicki () Preference Queries 18 / 68 Preference constructors [Kie02, KK02] Good values POS(Make,{mazda,vw}) Prefer v ∈ S1 over v 6∈ S1 . Bad values NEG(Make,{yugo}) Prefer v 6∈ S1 over v ∈ S1 . Explicit preference Preference encoded by a finite directed graph. Value comparison EXP(Make,{(bmw,ford),..., (mazda,kia)}) Prefer larger/smaller values. HIGHEST(Year) LOWEST(Price) Distance AROUND(Price,12K) Prefer values closer to v0 . Jan Chomicki () Preference Queries 18 / 68 Combining preferences Jan Chomicki () Preference Queries 19 / 68 Combining preferences Preference composition combining preferences about objects of the same kind dimensionality is not increased representing preference aggregation, revision, ... Jan Chomicki () Preference Queries 19 / 68 Combining preferences Preference composition combining preferences about objects of the same kind dimensionality is not increased representing preference aggregation, revision, ... Preference accumulation defining preferences over objects in terms of preferences over simpler objects dimensionality is increased (preferences over Cartesian product). Jan Chomicki () Preference Queries 19 / 68 Combining preferences: composition Jan Chomicki () Preference Queries 20 / 68 Combining preferences: composition Boolean composition x ∪ y ≡ x 1 y ∨ x 2 y and similarly for ∩. Jan Chomicki () Preference Queries 20 / 68 Combining preferences: composition Boolean composition x ∪ y ≡ x 1 y ∨ x 2 y and similarly for ∩. Prioritized composition x lex y ≡ x 1 y ∨ (y 1 x ∧ x 2 y ). Jan Chomicki () Preference Queries 20 / 68 Combining preferences: composition Boolean composition x ∪ y ≡ x 1 y ∨ x 2 y and similarly for ∩. Prioritized composition x lex y ≡ x 1 y ∨ (y 1 x ∧ x 2 y ). Pareto composition x Par y ≡ (x 1 y ∧ y 2 x) ∨ (x 2 y ∧ y 1 x). Jan Chomicki () Preference Queries 20 / 68 Preference composition Jan Chomicki () Preference Queries 21 / 68 Preference composition Preference relation 1 bmw ford mazda kia Jan Chomicki () Preference Queries 21 / 68 Preference composition Preference relation 1 Preference relation 2 bmw ford ford mazda mazda kia Jan Chomicki () kia bmw Preference Queries 21 / 68 Preference composition Preference relation 1 Preference relation 2 bmw ford ford mazda kia mazda kia bmw Prioritized composition bmw ford mazda kia Jan Chomicki () Preference Queries 21 / 68 Preference composition Preference relation 1 Preference relation 2 bmw ford ford mazda mazda kia Prioritized composition kia bmw Pareto composition ford bmw ford kia bmw mazda mazda kia Jan Chomicki () Preference Queries 21 / 68 Combining preferences: accumulation [Kie02] Jan Chomicki () Preference Queries 22 / 68 Combining preferences: accumulation [Kie02] Prioritized accumulation: pr = (1 & 2 ) (x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ). Jan Chomicki () Preference Queries 22 / 68 Combining preferences: accumulation [Kie02] Prioritized accumulation: pr = (1 & 2 ) (x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ). Pareto accumulation: pa = (1 ⊗ 2 ) (x1 , x2 ) pa (y1 , y2 ) ≡ (x1 1 y1 ∧ x2 2 y2 ) ∨ (x1 1 y1 ∧ x2 2 y2 ). Jan Chomicki () Preference Queries 22 / 68 Combining preferences: accumulation [Kie02] Prioritized accumulation: pr = (1 & 2 ) (x1 , x2 ) pr (y1 , y2 ) ≡ x1 1 y1 ∨ (x1 = y1 ∧ x2 2 y2 ). Pareto accumulation: pa = (1 ⊗ 2 ) (x1 , x2 ) pa (y1 , y2 ) ≡ (x1 1 y1 ∧ x2 2 y2 ) ∨ (x1 1 y1 ∧ x2 2 y2 ). Properties closure associativity commutativity of Pareto accumulation Jan Chomicki () Preference Queries 22 / 68 Skylines Jan Chomicki () Preference Queries 23 / 68 Skylines Skyline Given single-attribute total preference relations A1 , . . . , An for a relational schema R(A1 , . . . , An ), the skyline preference relation sky is defined as sky =A1 ⊗ A2 ⊗ · · · ⊗ An . Unfolding the definition (x1 , . . . , xn ) sky (y1 , . . . , yn ) ≡ ^ xi Ai yi ∧ i Jan Chomicki () Preference Queries _ xi Ai yi . i 23 / 68 Skyline in Euclidean space Jan Chomicki () Preference Queries 24 / 68 Skyline in Euclidean space Two-dimensional Euclidean space (x1 , x2 ) sky (y1 , y2 ) ≡ x1 ≥ y1 ∧ x2 > y2 ∨ x1 > y1 ∧ x2 ≥ y2 Skyline consists of sky -maximal vectors Jan Chomicki () Preference Queries 24 / 68 Skyline in Euclidean space Two-dimensional Euclidean space (x1 , x2 ) sky (y1 , y2 ) ≡ x1 ≥ y1 ∧ x2 > y2 ∨ x1 > y1 ∧ x2 ≥ y2 Skyline consists of sky -maximal vectors Jan Chomicki () Preference Queries 24 / 68 Skyline properties Jan Chomicki () Preference Queries 25 / 68 Skyline properties Invariance A skyline preference relation is unaffacted by scaling or shifting in any dimension. Jan Chomicki () Preference Queries 25 / 68 Skyline properties Invariance A skyline preference relation is unaffacted by scaling or shifting in any dimension. Maxima A skyline consists of the maxima of monotonic scoring functions. Jan Chomicki () Preference Queries 25 / 68 Skyline properties Invariance A skyline preference relation is unaffacted by scaling or shifting in any dimension. Maxima A skyline consists of the maxima of monotonic scoring functions. Skyline is not a weak order (2, 0) sky (0, 2), (0, 2) sky (1, 0), (2, 0) sky (1, 0) Jan Chomicki () Preference Queries 25 / 68 Skyline in SQL Jan Chomicki () Preference Queries 26 / 68 Skyline in SQL Grouping Designating attributes not used in comparisons (DIFF). Example SELECT * FROM Car SKYLINE Price MIN, Year MAX, Make DIFF Jan Chomicki () Preference Queries 26 / 68 Skyline in SQL Grouping Designating attributes not used in comparisons (DIFF). Example SELECT * FROM Car SKYLINE Price MIN, Year MAX, Make DIFF Dynamic skylines dimensions defined using dimension functions g1 , . . . , gn variable query point. Jan Chomicki () Preference Queries 26 / 68 Dynamic skylines Jan Chomicki () Preference Queries 27 / 68 Dynamic skylines Relation Hotel(XCoord, YCoord, Price) tuple p = (px , py , pz ), query point (ux , uy ) dimension functions based on 2D Euclidean distance: q g1 (px , py ) = (px − ux )2 + (py − uy )2 g2 (pz ) = pz Jan Chomicki () Preference Queries 27 / 68 Dynamic skylines Relation Hotel(XCoord, YCoord, Price) tuple p = (px , py , pz ), query point (ux , uy ) dimension functions based on 2D Euclidean distance: q g1 (px , py ) = (px − ux )2 + (py − uy )2 g2 (pz ) = pz XCoord YCoord Price 0 5 80 2 6 100 5 3 120 Jan Chomicki () Query point: (3,4). Preference Queries 27 / 68 Dynamic skylines Relation Hotel(XCoord, YCoord, Price) tuple p = (px , py , pz ), query point (ux , uy ) dimension functions based on 2D Euclidean distance: q g1 (px , py ) = (px − ux )2 + (py − uy )2 g2 (pz ) = pz XCoord YCoord Price 0 5 80 2 6 100 5 3 120 Jan Chomicki () Query point: (3,4). Preference Queries 27 / 68 Combining scoring functions Jan Chomicki () Preference Queries 28 / 68 Combining scoring functions Scoring functions can be combined using numerical operators. Jan Chomicki () Preference Queries 28 / 68 Combining scoring functions Scoring functions can be combined using numerical operators. Common scenario scoring functions f1 , . . . , fn aggregate scoring function: F (t) = E (f1 (t), . . . , fn (t)) linear scoring function: Σni=1 αi fi Jan Chomicki () Preference Queries 28 / 68 Combining scoring functions Scoring functions can be combined using numerical operators. Common scenario scoring functions f1 , . . . , fn aggregate scoring function: F (t) = E (f1 (t), . . . , fn (t)) linear scoring function: Σni=1 αi fi Numerical vs. logical combination logical combination cannot be defined numerically numerical combination cannot be defined logically (unless arithmetic operators are available) Jan Chomicki () Preference Queries 28 / 68 Part II Preference Queries Jan Chomicki () Preference Queries 29 / 68 Outline of Part II 2 Preference queries Retrieving non-dominated elements Rewriting queries with winnow Retrieving Top-K elements Optimizing Top-K queries Jan Chomicki () Preference Queries 30 / 68 Winnow[Cho03] Jan Chomicki () Preference Queries 31 / 68 Winnow[Cho03] Winnow new relational algebra operator ω (other names: Best, BMO [Kie02]) retrieves the non-dominated (best) elements in a database relation can be expressed in terms of other operators Jan Chomicki () Preference Queries 31 / 68 Winnow[Cho03] Winnow new relational algebra operator ω (other names: Best, BMO [Kie02]) retrieves the non-dominated (best) elements in a database relation can be expressed in terms of other operators Definition Given a preference relation and a database relation r : ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}. Jan Chomicki () Preference Queries 31 / 68 Winnow[Cho03] Winnow new relational algebra operator ω (other names: Best, BMO [Kie02]) retrieves the non-dominated (best) elements in a database relation can be expressed in terms of other operators Definition Given a preference relation and a database relation r : ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}. Notation: If a preference relation C is defined using a formula C , then we write ωC (r ), instead of ωC (r ). Jan Chomicki () Preference Queries 31 / 68 Winnow[Cho03] Winnow new relational algebra operator ω (other names: Best, BMO [Kie02]) retrieves the non-dominated (best) elements in a database relation can be expressed in terms of other operators Definition Given a preference relation and a database relation r : ω (r ) = {t ∈ r | ¬∃t 0 ∈ r . t 0 t}. Notation: If a preference relation C is defined using a formula C , then we write ωC (r ), instead of ωC (r ). Skyline query ωsky (r ) computes the set of maximal vectors in r (the skyline set). Jan Chomicki () Preference Queries 31 / 68 Example of winnow Jan Chomicki () Preference Queries 32 / 68 Example of winnow Relation Car (Make, Year , Price) Preference relation: (m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ). Jan Chomicki () Preference Queries 32 / 68 Example of winnow Relation Car (Make, Year , Price) Preference relation: (m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ). Jan Chomicki () Make Year Price mazda 2009 20K ford 2009 15K ford 2007 12K Preference Queries 32 / 68 Example of winnow Relation Car (Make, Year , Price) Preference relation: (m, y , p) 1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∨ (y = y 0 ∧ p < p 0 ). Jan Chomicki () Make Year Price mazda 2009 20K ford 2009 15K ford 2007 12K Preference Queries 32 / 68 Computing winnow using BNL [BKS01] Require: SPO , database relation r 1: initialize window W and temporary file F to empty 2: repeat 3: for every tuple t in the input do 4: if t is dominated by a tuple in W then 5: ignore t 6: else if t dominates some tuples in W then 7: eliminate them and insert t into W 8: else if there is room in W then 9: insert t into W 10: else 11: add t to F 12: end if 13: end for 14: output tuples from W that were added when F was empty 15: make F the input, clear F 16: until empty input Jan Chomicki () Preference Queries 33 / 68 BNL in action Preference relation: a c, a d, b e. Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file Window Input c,e,d,a,b Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file Window Input c e,d,a,b Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file Window Input c d,a,b e Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file d Window Input c a,b e Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file d Window Input a b e Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file d Window Input a b Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file Window Input a d b Jan Chomicki () Preference Queries 34 / 68 BNL in action Preference relation: a c, a d, b e. Temporary file Window Input a b Jan Chomicki () Preference Queries 34 / 68 Computing winnow with presorting Jan Chomicki () Preference Queries 35 / 68 Computing winnow with presorting SFS: adding presorting step to BNL [CGGL03] topologically sort the input: if x dominates y , then x precedes y in the sorted input window contains only winnow points and can be output after every pass for skylines: Qk sort the input using a monotonic scoring function, for example i=1 xi . Jan Chomicki () Preference Queries 35 / 68 Computing winnow with presorting SFS: adding presorting step to BNL [CGGL03] topologically sort the input: if x dominates y , then x precedes y in the sorted input window contains only winnow points and can be output after every pass for skylines: Qk sort the input using a monotonic scoring function, for example i=1 xi . LESS: integrating different techniques [GSG07] adding an elimination filter to the first external sort pass combining the last external sort pass with the first SFS pass average running time: O(kn) Jan Chomicki () Preference Queries 35 / 68 SFS in action Preference relation: a c, a d, b e. Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a,b,c,d,e Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a b,c,d,e Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a c,d,e b Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a d,e b Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a e b Jan Chomicki () Preference Queries 36 / 68 SFS in action Preference relation: a c, a d, b e. Temporary file Window Input a b Jan Chomicki () Preference Queries 36 / 68 Generalizations of winnow Jan Chomicki () Preference Queries 37 / 68 Generalizations of winnow Iterating winnow 0 (r ) = ω (r ) ω n+1 ω (r ) = ω (r − Jan Chomicki () S 1≤i≤n i (r )) ω Preference Queries 37 / 68 Generalizations of winnow Iterating winnow 0 (r ) = ω (r ) ω n+1 ω (r ) = ω (r − S 1≤i≤n i (r )) ω Ranking Rank tuples by their minimum distance from a winnow tuple: η (r ) = {(t, i) | t ∈ ωCi (r )}. Jan Chomicki () Preference Queries 37 / 68 Generalizations of winnow Iterating winnow 0 (r ) = ω (r ) ω n+1 ω (r ) = ω (r − S 1≤i≤n i (r )) ω Ranking Rank tuples by their minimum distance from a winnow tuple: η (r ) = {(t, i) | t ∈ ωCi (r )}. k-band Return the tuples dominated by at most k tuples: ω (r ) = {t ∈ r | #{t 0 ∈ r | t 0 t} ≤ k}. Jan Chomicki () Preference Queries 37 / 68 Preference SQL Jan Chomicki () Preference Queries 38 / 68 Preference SQL The language basic preference constructors Pareto/prioritized accumulation new SQL clause PREFERRING groupwise preferences implementation: translation to SQL Jan Chomicki () Preference Queries 38 / 68 Preference SQL The language basic preference constructors Pareto/prioritized accumulation new SQL clause PREFERRING groupwise preferences implementation: translation to SQL Winnow in Preference SQL SELECT * FROM Car PREFERRING HIGHEST(Year) CASCADE LOWEST(Price) Jan Chomicki () Preference Queries 38 / 68 Algebraic laws [Cho03] Jan Chomicki () Preference Queries 39 / 68 Algebraic laws [Cho03] Commutativity of winnow with selection If the formula ∀t1 , t2 .[α(t2 ) ∧ γ(t1 , t2 )] ⇒ α(t1 ) is valid, then for every r σα (ωγ (r )) = ωγ (σα (r )). Jan Chomicki () Preference Queries 39 / 68 Algebraic laws [Cho03] Commutativity of winnow with selection If the formula ∀t1 , t2 .[α(t2 ) ∧ γ(t1 , t2 )] ⇒ α(t1 ) is valid, then for every r σα (ωγ (r )) = ωγ (σα (r )). Under the preference relation (m, y , p) C1 (m0 , y 0 , p 0 ) ≡ y > y 0 ∧ p ≤ p 0 ∨ y ≥ y 0 ∧ p < p 0 the selection σPrice<20K commutes with ωC1 but σPrice>20K does not. Jan Chomicki () Preference Queries 39 / 68 Other algebraic laws Jan Chomicki () Preference Queries 40 / 68 Other algebraic laws Distributivity of winnow over Cartesian product For every r1 and r2 ωC (r1 × r2 ) = ωC (r1 ) × r2 if C refers only to the attributes of r1 . Commutativity of winnow If ∀t1 , t2 .[C1 (t1 , t2 ) ⇒ C2 (t1 , t2 )] is valid and C1 and C2 are SPOs, then for all finite instances r : ωC1 (ωC2 (r )) = ωC2 (ωC1 (r )) = ωC2 (r ). Jan Chomicki () Preference Queries 40 / 68 Semantic query optimization [Cho07b] Jan Chomicki () Preference Queries 41 / 68 Semantic query optimization [Cho07b] Using information about integrity constraints to: eliminate redundant occurrences of winnow. make more efficient computation of winnow possible. Eliminating redundancy Given a set of integrity constraints F , ωC is redundant w.r.t. F iff F implies the formula ∀t1 , t2 . R(t1 ) ∧ R(t2 ) ⇒ t1 ∼C t2 . Jan Chomicki () Preference Queries 41 / 68 Integrity constraints Jan Chomicki () Preference Queries 42 / 68 Integrity constraints Constraint-generating dependencies (CGD) [BCW99, ZO97] ∀t1 . . . . ∀tn . [R(t1 ) ∧ · · · ∧ R(tn ) ∧ γ(t1 , . . . tn )] ⇒ γ 0 (t1 , . . . tn ). Jan Chomicki () Preference Queries 42 / 68 Integrity constraints Constraint-generating dependencies (CGD) [BCW99, ZO97] ∀t1 . . . . ∀tn . [R(t1 ) ∧ · · · ∧ R(tn ) ∧ γ(t1 , . . . tn )] ⇒ γ 0 (t1 , . . . tn ). CGD entailment Decidable by reduction to the validity of ∀-formulas in the constraint theory (assuming the theory is decidable). Jan Chomicki () Preference Queries 42 / 68 Top-K queries Jan Chomicki () Preference Queries 43 / 68 Top-K queries Scoring functions each tuple t in a relation has numeric scores f1 (t), . . . , fm (t) assigned by numeric component scoring functions f1 , . . . , fm the aggregate score of t is F (t) = E (f1 (t), . . . , fm (t)) where E is a numeric-valued expression F is monotone if E (x1 , . . . , xm ) ≤ E (y1 , . . . , ym ) whenever xi ≤ yi for all i Jan Chomicki () Preference Queries 43 / 68 Top-K queries Scoring functions each tuple t in a relation has numeric scores f1 (t), . . . , fm (t) assigned by numeric component scoring functions f1 , . . . , fm the aggregate score of t is F (t) = E (f1 (t), . . . , fm (t)) where E is a numeric-valued expression F is monotone if E (x1 , . . . , xm ) ≤ E (y1 , . . . , ym ) whenever xi ≤ yi for all i Top-K queries return K elements having top F -values in a database relation R query expressed in an extension of SQL: SELECT * FROM R ORDER BY F DESC LIMIT K Jan Chomicki () Preference Queries 43 / 68 Top-K sets Jan Chomicki () Preference Queries 44 / 68 Top-K sets Definition Given a scoring function F and a database relation r , s is a Top-K set if: s⊆r |s| = min(K , |r |) ∀t ∈ s. ∀t 0 ∈ r − s. F (t) ≥ F (t 0 ) There may be more than one Top-K set: one is selected non-deterministically. Jan Chomicki () Preference Queries 44 / 68 Example of Top-2 Jan Chomicki () Preference Queries 45 / 68 Example of Top-2 Relation Car (Make, Year , Price) component scoring functions: f1 (m, y , p) = (y − 2005) f2 (m, y , p) = (20000 − p) aggregate scoring function: F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p) Jan Chomicki () Preference Queries 45 / 68 Example of Top-2 Relation Car (Make, Year , Price) component scoring functions: f1 (m, y , p) = (y − 2005) f2 (m, y , p) = (20000 − p) aggregate scoring function: F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p) Make Year Price mazda 2009 20000 4000 ford 2009 15000 9000 ford 2007 12000 10000 Jan Chomicki () Aggregate score Preference Queries 45 / 68 Example of Top-2 Relation Car (Make, Year , Price) component scoring functions: f1 (m, y , p) = (y − 2005) f2 (m, y , p) = (20000 − p) aggregate scoring function: F (m, y , p) = 1000 · f1 (m, y , p) + f2 (m, y , p) Make Year Price mazda 2009 20000 4000 ford 2009 15000 9000 ford 2007 12000 10000 Jan Chomicki () Aggregate score Preference Queries 45 / 68 Computing Top-K Jan Chomicki () Preference Queries 46 / 68 Computing Top-K Naive approaches sort, output the first K -tuples scan the input maintaining a priority queue of size K ... Jan Chomicki () Preference Queries 46 / 68 Computing Top-K Naive approaches sort, output the first K -tuples scan the input maintaining a priority queue of size K ... Better approaches Jan Chomicki () Preference Queries 46 / 68 Computing Top-K Naive approaches sort, output the first K -tuples scan the input maintaining a priority queue of size K ... Better approaches the entire input does not need to be scanned... Jan Chomicki () Preference Queries 46 / 68 Computing Top-K Naive approaches sort, output the first K -tuples scan the input maintaining a priority queue of size K ... Better approaches the entire input does not need to be scanned... ... provided additional data structures are available Jan Chomicki () Preference Queries 46 / 68 Computing Top-K Naive approaches sort, output the first K -tuples scan the input maintaining a priority queue of size K ... Better approaches the entire input does not need to be scanned... ... provided additional data structures are available variants of the threshold algorithm Jan Chomicki () Preference Queries 46 / 68 Threshold algorithm (TA)[FLN03] Jan Chomicki () Preference Queries 47 / 68 Threshold algorithm (TA)[FLN03] Inputs a monotone scoring function F (t) = E (f1 (t), . . . , fm (t)) lists Si , i = 1, . . . , m, each sorted on fi (descending) and representing a different ranking of the same set of objects 1 For each list Si in parallel, retrieve the current object w in sorted order: (random access) for every j 6= i, retrieve vj = fj (w ) from the list Sj if d = E (v1 , . . . , vm ) is among the highest K scores seen so far, remember w and d (ties broken arbitrarily) 2 Thresholding: for each i, wi is the last object seen under sorted access in Si if there are already K top-K objects with score at least equal to the threshold T = E (f1 (w1 ), . . . , fm (wm )), return collected objects sorted by F and terminate otherwise, go to step 1. Jan Chomicki () Preference Queries 47 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 5 50 3 50 1 35 2 40 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 5 50 3 50 1 35 2 40 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () T=100 Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 5 50 3 50 1 35 2 40 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () 5:60 T=100 Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 3:80 5 50 3 50 5:60 1 35 2 40 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () T=100 Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 3:80 5 50 3 50 5:60 1 35 2 40 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () T=75 Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 3:80 5 50 3 50 1:65 1 35 2 40 5:60 3 30 1 30 2 20 4 20 4 10 5 10 Jan Chomicki () T=75 Preference Queries 48 / 68 TA in action Aggregate score F (t) = P1 (t) + P2 (t) Priority queue OID P1 OID P2 3:80 5 50 3 50 1:65 1 35 2 40 5:60 3 30 1 30 2:60 2 20 4 20 4 10 5 10 Jan Chomicki () T=75 Preference Queries 48 / 68 TA in databases Jan Chomicki () Preference Queries 49 / 68 TA in databases objects: tuples of a single relation r single-attribute component scoring functions sorted list access implemented through indexes random access to all lists implemented by primary index access to r that retrieves entire tuples Jan Chomicki () Preference Queries 49 / 68 Optimizing Top-K queries [LCIS05] Jan Chomicki () Preference Queries 50 / 68 Optimizing Top-K queries [LCIS05] Goals integrating Top-K with relational query evaluation and optimization replacing blocking by pipelining Jan Chomicki () Preference Queries 50 / 68 Optimizing Top-K queries [LCIS05] Goals integrating Top-K with relational query evaluation and optimization replacing blocking by pipelining Example SELECT * FROM Hotel h, Restaurant r , Museum m WHERE c1 AND c2 AND c3 ORDER BY f1 + f2 + f3 LIMIT K Jan Chomicki () Preference Queries 50 / 68 Optimizing Top-K queries [LCIS05] Goals integrating Top-K with relational query evaluation and optimization replacing blocking by pipelining Example SELECT * FROM Hotel h, Restaurant r , Museum m WHERE c1 AND c2 AND c3 ORDER BY f1 + f2 + f3 LIMIT K Is there a better evaluation plan than materialize-then-sort? Jan Chomicki () Preference Queries 50 / 68 Partial ranking of tuples Jan Chomicki () Preference Queries 51 / 68 Partial ranking of tuples Model set of component scoring functions P = {f1 , . . . , fm } such that fi (t) ≤ 1 for all t aggregate scoring function F (t) = E (f1 (t), . . . , fm (t)) how to rank intermediate tuples? Jan Chomicki () Preference Queries 51 / 68 Partial ranking of tuples Model set of component scoring functions P = {f1 , . . . , fm } such that fi (t) ≤ 1 for all t aggregate scoring function F (t) = E (f1 (t), . . . , fm (t)) how to rank intermediate tuples? Ranking principle Given P0 ⊆ P, F̄P0 (t) = E (g1 (t), . . . , gm (t)) where fi (t) if fi ∈ P0 gi (t) = 1 otherwise Jan Chomicki () Preference Queries 51 / 68 Relations with rank Jan Chomicki () Preference Queries 52 / 68 Relations with rank Rank-relation RP0 relation R monotone aggregate scoring function F (the same for all relations) set of component scoring functions P0 ⊆ P order: t1 >RP0 t2 ≡ F̄P0 (t1 ) > F̄P0 (t2 ) Jan Chomicki () Preference Queries 52 / 68 Ranking intermediate results Jan Chomicki () Preference Queries 53 / 68 Ranking intermediate results Operators rank operator µf : ranks tuples according to an additional component scoring function f standard relational algebra operators suitably extended to work on rank-relations Jan Chomicki () Preference Queries 53 / 68 Ranking intermediate results Operators rank operator µf : ranks tuples according to an additional component scoring function f standard relational algebra operators suitably extended to work on rank-relations Operator Order t1 >µf (RP RP1 ∩ SP2 t1 >RP1 ∩SP2 t2 ≡ F̄P1 ∪P2 (t1 ) > F̄P1 ∪P2 (t2 ) Jan Chomicki () 0 ) t2 ≡ F̄P0 ∪{f } (t1 ) > F̄P0 ∪{f } (t2 ) µf (RP0 ) Preference Queries 53 / 68 Example Jan Chomicki () Preference Queries 54 / 68 Example Query SELECT * FROM S ORDER BY f1 + f2 + f3 LIMIT 1 Jan Chomicki () Preference Queries 54 / 68 Example Query SELECT * FROM S ORDER BY f1 + f2 + f3 LIMIT 1 Unranked relation S A f1 f2 f3 1 0.7 0.8 0.9 2 0.9 0.85 0.8 3 0.5 0.45 0.75 Jan Chomicki () Preference Queries 54 / 68 Example Query SELECT * FROM S ORDER BY f1 + f2 + f3 LIMIT 1 Rank-relation S{f1 } Unranked relation S A f1 f2 f3 A F̄{f1 } 1 0.7 0.8 0.9 2 2.9 2 0.9 0.85 0.8 1 2.7 3 0.5 0.45 0.75 3 2.5 Jan Chomicki () Preference Queries 54 / 68 Pipelined execution Pipelined execution A f1 f2 f3 F̄{f1 } 2 0.9 0.85 0.8 2.9 1 0.7 0.8 0.9 2.7 3 0.5 0.45 0.75 2.5 IndexScanf1 Pipelined execution A f1 f2 f3 F̄{f1 } A F̄{f1 ,f2 } 2 0.9 0.85 0.8 2.9 2 2.75 1 0.7 0.8 0.9 2.7 1 2.5 3 0.5 0.45 0.75 2.5 3 1.95 IndexScanf1 µ f2 Pipelined execution A f1 f2 f3 F̄{f1 } A F̄{f1 ,f2 } 2 0.9 0.85 0.8 2.9 2 2.75 1 0.7 0.8 0.9 2.7 1 2.5 3 0.5 0.45 0.75 2.5 3 1.95 µ f2 µf3 IndexScanf1 Jan Chomicki () Preference Queries A F̄{f1 ,f2 ,f3 } 2 2.55 1 2.4 55 / 68 Algebraic laws for rank-relation operators Jan Chomicki () Preference Queries 56 / 68 Algebraic laws for rank-relation operators Splitting for µ R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .)) Jan Chomicki () Preference Queries 56 / 68 Algebraic laws for rank-relation operators Splitting for µ R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .)) Commutativity of µ µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 )) Jan Chomicki () Preference Queries 56 / 68 Algebraic laws for rank-relation operators Splitting for µ R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .)) Commutativity of µ µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 )) Commutativity of µ with selection σC (µf (RP0 )) ≡ µf (σC (RP0 )) Jan Chomicki () Preference Queries 56 / 68 Algebraic laws for rank-relation operators Splitting for µ R{f1 ,f2 ,...,fm } ≡ µf1 (µf2 (. . . (µfm (R)) . . .)) Commutativity of µ µf1 (µf2 (RP0 )) ≡ µf2 (µf1 (RP0 )) Commutativity of µ with selection σC (µf (RP0 )) ≡ µf (σC (RP0 )) Distributivity of µ over Cartesian product µf (RP1 × SP2 ) ≡ µf (RP1 ) × SP2 if f refers only to the attributes of R. Jan Chomicki () Preference Queries 56 / 68 Part III Preference management Jan Chomicki () Preference Queries 57 / 68 Outline of Part III 3 Preference management Preference modification Jan Chomicki () Preference Queries 58 / 68 Preference modification Jan Chomicki () Preference Queries 59 / 68 Preference modification Goal Given a preference relation and additional preference or indifference information I, construct a new preference relation 0 whose contents depend on and I. Jan Chomicki () Preference Queries 59 / 68 Preference modification Goal Given a preference relation and additional preference or indifference information I, construct a new preference relation 0 whose contents depend on and I. General postulates fulfillment: the new information I should be completely incorporated into 0 minimal change: should be changed as little as possible closure: order-theoretic properties of should be preserved in 0 (SPO, WO) finiteness or finite representability of should also be preserved in 0 Jan Chomicki () Preference Queries 59 / 68 Preference revision [Cho07a] Jan Chomicki () Preference Queries 60 / 68 Preference revision [Cho07a] Setting new information: revising preference relation 0 composition operator θ: union, prioritized or Pareto composition composition eliminates (some) preference conflicts additional assumptions: interval orders 0 = TC (0 θ ) to guarantee SPO Jan Chomicki () Preference Queries 60 / 68 Preference revision [Cho07a] Setting new information: revising preference relation 0 composition operator θ: union, prioritized or Pareto composition composition eliminates (some) preference conflicts additional assumptions: interval orders 0 = TC (0 θ ) to guarantee SPO Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 60 / 68 Preference revision [Cho07a] Setting new information: revising preference relation 0 composition operator θ: union, prioritized or Pareto composition composition eliminates (some) preference conflicts additional assumptions: interval orders 0 = TC (0 θ ) to guarantee SPO Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 60 / 68 Preference revision [Cho07a] Setting new information: revising preference relation 0 composition operator θ: union, prioritized or Pareto composition composition eliminates (some) preference conflicts additional assumptions: interval orders 0 = TC (0 θ ) to guarantee SPO Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 60 / 68 Preference contraction [MC08] Jan Chomicki () Preference Queries 61 / 68 Preference contraction [MC08] Setting new information: contractor relation CON 0 : maximal subset of disjoint with CON Jan Chomicki () Preference Queries 61 / 68 Preference contraction [MC08] Setting new information: contractor relation CON 0 : maximal subset of disjoint with CON VW , 2009 VW , 2008 VW , 2007 VW , 2006 VW , 2005 Jan Chomicki () Preference Queries 61 / 68 Preference contraction [MC08] Setting new information: contractor relation CON 0 : maximal subset of disjoint with CON VW , 2009 VW , 2008 VW , 2007 VW , 2006 VW , 2005 Jan Chomicki () Preference Queries 61 / 68 Preference contraction [MC08] Setting new information: contractor relation CON 0 : maximal subset of disjoint with CON VW , 2009 VW , 2008 VW , 2007 VW , 2006 VW , 2005 Jan Chomicki () Preference Queries 61 / 68 Substitutability [BGS06] Jan Chomicki () Preference Queries 62 / 68 Substitutability [BGS06] Setting new information: set of indifference pairs additional preferences are added to convert indifference to restricted indifference achieving object substitutability Jan Chomicki () Preference Queries 62 / 68 Substitutability [BGS06] Setting new information: set of indifference pairs additional preferences are added to convert indifference to restricted indifference achieving object substitutability Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 62 / 68 Substitutability [BGS06] Setting new information: set of indifference pairs additional preferences are added to convert indifference to restricted indifference achieving object substitutability Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 62 / 68 Substitutability [BGS06] Setting new information: set of indifference pairs additional preferences are added to convert indifference to restricted indifference achieving object substitutability Jan Chomicki () VW , 2009 Kia, 2009 VW , 2008 Kia, 2008 VW , 2007 Kia, 2007 Preference Queries 62 / 68 Part IV Advanced topics Jan Chomicki () Preference Queries 63 / 68 Outline of Part IV Jan Chomicki () Preference Queries 64 / 68 Prospective research topics Jan Chomicki () Preference Queries 65 / 68 Prospective research topics Definability Given a preference relation C , how to construct a definition of a scoring function F representing C , if such a function exists? Jan Chomicki () Preference Queries 65 / 68 Prospective research topics Definability Given a preference relation C , how to construct a definition of a scoring function F representing C , if such a function exists? Extrinsic preference relations Preference relations that are not fully defined by tuple contents: x y ≡ ∃n1 , n2 . Dissatisfied(x, n1 ) ∧ Dissatisfied(y , n2 ) ∧ n1 < n2 . Jan Chomicki () Preference Queries 65 / 68 Prospective research topics Definability Given a preference relation C , how to construct a definition of a scoring function F representing C , if such a function exists? Extrinsic preference relations Preference relations that are not fully defined by tuple contents: x y ≡ ∃n1 , n2 . Dissatisfied(x, n1 ) ∧ Dissatisfied(y , n2 ) ∧ n1 < n2 . Incomplete preferences tuple scores and probabilities [SIC08, ZC08] uncertain tuple scores disjunctive preferences: a b ∨ a c Jan Chomicki () Preference Queries 65 / 68 Jan Chomicki () Preference Queries 66 / 68 Preference modification beyond revision and contraction: merging, arbitration,... general parametric framework? conflict resolution Jan Chomicki () Preference Queries 66 / 68 Preference modification beyond revision and contraction: merging, arbitration,... general parametric framework? conflict resolution Variations preference and similarity: “find the objects similar to one of the best objects” Jan Chomicki () Preference Queries 66 / 68 Preference modification beyond revision and contraction: merging, arbitration,... general parametric framework? conflict resolution Variations preference and similarity: “find the objects similar to one of the best objects” Applications preference queries as decision components: workflows, event systems personalization of query results preference negotiation: applying contraction Jan Chomicki () Preference Queries 66 / 68 Acknowledgments Denis Mindolin Slawek Staworko Xi Zhang Jan Chomicki () Preference Queries 67 / 68 Jan Chomicki () Preference Queries 68 / 68 M. Baudinet, J. Chomicki, and P. Wolper. Constraint-Generating Dependencies. Journal of Computer and System Sciences, 59:94–115, 1999. Preliminary version in ICDT’95. W-T. Balke, U. Güntzer, and W. Siberski. Exploiting Indifference for Customization of Partial Order Skylines. In International Database Engineering and Applications Symposium (IDEAS), pages 80–88, 2006. S. Börzsönyi, D. Kossmann, and K. Stocker. The Skyline Operator. In IEEE International Conference on Data Engineering (ICDE), pages 421–430, 2001. J. Chomicki, P. Godfrey, J. Gryz, and D. Liang. Skyline with Presorting. In IEEE International Conference on Data Engineering (ICDE), 2003. Poster. J. Chomicki. Jan Chomicki () Preference Queries 68 / 68 Preference Formulas in Relational Queries. ACM Transactions on Database Systems, 28(4):427–466, December 2003. J. Chomicki. Database Querying under Changing Preferences. Annals of Mathematics and Artificial Intelligence, 50(1-2):79–109, 2007. J. Chomicki. Semantic optimization techniques for preference queries. Information Systems, 32(5):660–674, 2007. R. Fagin, A. Lotem, and M. Naor. Optimal Aggregation Algorithms for Middleware. Journal of Computer and System Sciences, 66(4):614–656, 2003. P. Godfrey, R. Shipley, and J. Gryz. Algorithms and Analyses for Maximal Vector Computation. VLDB Journal, 16:5–28, 2007. W. Kießling. Jan Chomicki () Preference Queries 68 / 68 Foundations of Preferences in Database Systems. In International Conference on Very Large Data Bases (VLDB), pages 311–322, 2002. W. Kießling and G. Köstler. Preference SQL - Design, Implementation, Experience. In International Conference on Very Large Data Bases (VLDB), pages 990–1001, 2002. C. Li, K. C-C. Chang, I.F. Ilyas, and S. Song. RankSQL: Query Algebra and Optimization for Relational Top-k Queries. In ACM SIGMOD International Conference on Management of Data, pages 131–142, 2005. D. Mindolin and J. Chomicki. Maximal Contraction of Preference Relations. In National Conference on Artificial Intelligence, pages 492–497, 2008. M.A. Soliman, I.F. Ilyas, and K. C-C. Chang. Probabilistic Top-k and Ranking-Aggregate Queries. Jan Chomicki () Preference Queries 68 / 68 ACM Transactions on Database Systems, 33(3):13:1–13:54, 2008. X. Zhang and J. Chomicki. On the Semantics and Evaluation of Top-k Queries in Probabilistic Databases. In International Workshop on Database Ranking (DBRank). IEEE Computer Society (ICDE Workshops), 2008. X. Zhang and Z. M. Ozsoyoglu. Implication and Referential Constraints: A New Formal Reasoning. IEEE Transactions on Knowledge and Data Engineering, 9(6):894–910, 1997. Jan Chomicki () Preference Queries 68 / 68