Preference Analysis
Joachim Giesen and Eva Schuberth
May 24, 2006

Outline
• Motivation
• Approximate sorting
  – Lower bound
  – Upper bound
• Aggregation
  – Algorithm
  – Experimental results
• Conclusion

Motivation
• Find the preference structure of a consumer w.r.t. a set of products.
• Common approach: assign a value function to the products; the value function determines a ranking of the products.
• Elicitation: pairwise comparisons.
• Problem: deriving a metric value function from non-metric information.
• We restrict ourselves to finding a ranking.

Motivation
• Find a ranking for every respondent individually.
• Efficiency measure: number of comparisons.
• Lower bound for comparison-based sorting: $n \log n$ comparisons.
• As the set of products can be large, this is too much.

Motivation
Possible solutions:
• Approximation
• Aggregation
• Modeling and distribution assumptions

Approximation
(joint work with J. Giesen and M. Stojaković)
1. Lower bound (proof)
2. Algorithm

Approximation
• The consumer's true ranking of the n products corresponds to the identity permutation id on {1, …, n}.
• Wanted: an approximation of the ranking, i.e. a permutation $\pi \in S_n$ such that $\mathrm{dist}(\pi, \mathrm{id})$ is small.

Metric on $S_n$
Needed: a metric on $S_n$ that is meaningful in the market research context. We use Spearman's footrule metric D:
$$D(\pi) := D(\pi, \mathrm{id}) = \sum_{i=1}^{n} |\pi(i) - i|.$$
Note: $D(\pi) \le n^2$.

We show: to approximate the ranking within expected distance $n^2/\nu(n)$, for a function $\nu(n) > 1$,
• at least $n(\min\{\log \nu(n), \log n\} - 6)$ comparisons are necessary,
• $6n \log \nu(n)$ comparisons are always sufficient.

Lower bound
Let A be a randomized approximate sorting algorithm with input $\pi \in S_n$ and random bits $\rho$, and let $r = n^2/\nu(n)$. If for every input permutation $\pi$ the expected distance of the output $A(\pi, \rho)$ to id is at most r, then A performs at least $n(\min\{\log \nu(n), \log n\} - 6)$ comparisons in the worst case.

Lower bound: proof
Follows Yao's minimax principle (see the appendix).
• Assume fewer than $n(\min\{\log \nu(n), \log n\} - 6)$ comparisons for every input, and fix a deterministic algorithm.
• Then for at least $\frac{1}{2} n!$ input permutations the output is at distance more than 2r from id (shown below).
• Hence, for a uniformly random input, the expected distance of the output to id is larger than r.
• So there is a $\pi_0 \in S_n$ such that the expected distance of $A(\pi_0, \rho)$ to id is larger than r. Contradiction.

Lower bound: lemma
For r > 0 let $B_D(\mathrm{id}, r)$ be the ball centered at id with radius r.
Lemma:
$$|B_D(\mathrm{id}, r)| \le \left(\frac{2e(r+n)}{n}\right)^{n}.$$

Lower bound: proof of lemma
• If $\pi \in B_D(\mathrm{id}, r)$ then $\sum_{i=1}^{n} |\pi(i) - i| \le r$.
• $\pi$ is uniquely determined by the sequence $\{\pi(i) - i\}_i$.
• For a fixed sequence of non-negative integers $d_i$, at most $2^n$ permutations satisfy $|\pi(i) - i| = d_i$.
• The number of sequences of n non-negative integers whose sum is at most r is $\binom{n+r}{n}$.
Hence
$$|B_D(\mathrm{id}, r)| \le 2^{n} \binom{n+r}{n} \le \left(\frac{2e(r+n)}{n}\right)^{n}.$$

Lower bound: deterministic case
It remains to show: for a fixed deterministic algorithm $\tilde{A}$, the number of input permutations whose output is at distance more than 2r from id is more than $\frac{1}{2} n!$.
• k comparisons partition the inputs into at most $2^k$ classes with the same outcomes.
• For $\pi$, $\sigma$ in the same class the algorithm performs the same rearrangement, so within a class the output determines the input: at most $2^k$ input permutations have the same output.
• Therefore at most $|B_D(\mathrm{id}, 2r)| \cdot 2^k$ input permutations have their output in $B_D(\mathrm{id}, 2r)$.
• By the lemma and the assumed bound on k, at least $n! - |B_D(\mathrm{id}, 2r)| \cdot 2^k \ge \frac{1}{2} n!$ input permutations have their output outside $B_D(\mathrm{id}, 2r)$.

Upper bound
An algorithm (suggested by Chazelle) approximates any ranking within distance $n^2/\nu(n)$ with fewer than $6n \log \nu(n)$ comparisons.

Algorithm
• Partition the elements into equal-sized bins such that every element in a bin is smaller than any element in a subsequent bin.
• No ordering of the elements within a bin.
• Output: a permutation consistent with the sequence of bins.
[Figure: rounds 0, 1, 2 — in each round every bin is split at its median into two halves.]

Analysis of the algorithm
m rounds yield $2^m$ bins. Output: any ranking consistent with the ordering of the bins.

Running time: median search and partitioning of n elements take fewer than 6n comparisons (algorithm by Blum et al.), so m rounds take fewer than 6nm comparisons.

Distance: every element stays within its bin of size $n/2^m$, so
$$D(\pi, \mathrm{id}) = \sum_{i=1}^{n} |\pi(i) - i| \le \sum_{i=1}^{n} \frac{n}{2^m} = \frac{n^2}{2^m}.$$
Set $m = \log \nu(n)$.

Algorithm: theorem
Any ranking consistent with the bins computed in $\log \nu(n)$ rounds, i.e. with fewer than $6n \log \nu(n)$ comparisons, has distance at most $n^2/\nu(n)$ to the true ranking.
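The following is a minimal Python sketch of this binning procedure, not the authors' implementation: for brevity each bin is split via sorting instead of the linear-time median selection of Blum et al., so it illustrates the bin structure and the distance bound, but not the 6nm comparison count.

```python
import random

def approx_sort(elements, rounds):
    """Chazelle-style approximate sorting: repeatedly split every bin at
    its median; elements in a bin are smaller than those in later bins."""
    bins = [list(elements)]
    for _ in range(rounds):
        next_bins = []
        for b in bins:
            b = sorted(b)            # stand-in for a <6|b|-comparison median split
            mid = len(b) // 2
            next_bins += [b[:mid], b[mid:]]
        bins = next_bins
    out = []
    for b in bins:                   # any order within a bin is allowed
        random.shuffle(b)
        out += b
    return out

def footrule(pi):
    """Spearman's footrule distance D(pi, id) = sum_i |pi(i) - i| (0-indexed)."""
    return sum(abs(v - i) for i, v in enumerate(pi))

n, m = 1024, 5                       # m = log2(nu(n)) rounds
pi = approx_sort(random.sample(range(n), n), m)
assert footrule(pi) <= n * n / 2 ** m
```

With $m = \log_2 \nu(n)$ rounds the asserted bound is exactly the theorem's $n^2/\nu(n)$.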
Approximation: summary
For a sufficiently large error, fewer comparisons than for exact sorting suffice:
• error $\Theta(n^{2-\varepsilon})$ for a constant $\varepsilon > 0$: $\Theta(n \log n)$ comparisons,
• error $n^{2-o(1)}$: $o(n \log n)$ comparisons.
For real applications this is still too much: individual elicitation of a value function is not possible. Second approach: aggregation.

Aggregation
(joint work with J. Giesen and D. Mitsche)
Motivation: we think that the population splits into preference/customer types and that respondents answer according to their type (but deviations are possible). Instead of
• individual preference analysis or
• aggregation over the whole population,
aggregate within customer types.

Aggregation: idea
• Ask only a constant number of questions (pairwise comparisons).
• Ask many respondents.
• Cluster the respondents according to their answers into types.
• Aggregate the information within a cluster to get the type rankings.
Philosophy: first segment, then aggregate.

Algorithm
The algorithm works in three phases:
(1) Estimate the number k of customer types.
(2) Segment the respondents into the k customer types.
(3) Compute a ranking for each customer type.

Algorithm
Every respondent performs pairwise comparisons. Basic data structure: matrix $A = [a_{ij}]$, where entry $a_{ij} \in \{-1, 1, 0\}$ refers to respondent i and the j-th product pair (x, y):
$$a_{ij} = \begin{cases} 1 & \text{if respondent } i \text{ prefers } x \text{ over } y,\\ -1 & \text{if respondent } i \text{ prefers } y \text{ over } x,\\ 0 & \text{if respondent } i \text{ has not compared } x \text{ and } y. \end{cases}$$

Algorithm
Define $B = AA^T$. Then $B_{ij}$ = number of product pairs on which respondents i and j agree minus the number of pairs on which they disagree (not counting 0's).
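A small numpy sketch of these data structures; the answer dictionaries and their format are made up for illustration, only A and $B = AA^T$ are from the slides.

```python
import numpy as np

# Hypothetical answers: respondent -> {pair index j: +1 (x over y) or -1 (y over x)};
# pairs a respondent did not compare stay 0.
answers = [{0: +1, 1: -1, 2: +1},   # respondent 0
           {0: +1, 1: -1, 3: -1},   # respondent 1
           {0: -1, 2: -1, 3: +1}]   # respondent 2
num_pairs = 4

A = np.zeros((len(answers), num_pairs), dtype=int)
for i, resp in enumerate(answers):
    for j, a_ij in resp.items():
        A[i, j] = a_ij

B = A @ A.T   # B[i, j] = #pairs i and j agree on - #pairs they disagree on
print(B)      # diagonal entry B[i, i] = number of pairs respondent i answered
```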
Algorithm: phase 1
Estimate the number k of customer types:
• Use the matrix B and analyze its spectrum.
• We expect the k largest eigenvalues of B to be substantially larger than the remaining eigenvalues.
• Search for a gap in the eigenvalues.

Algorithm: phase 2
Cluster the respondents into customer types:
• Again use the matrix B.
• Compute the projector P onto the space spanned by the eigenvectors corresponding to the k largest eigenvalues of B.
• Every respondent corresponds to a column of P.
• Cluster the columns of P.

Algorithm: phase 2
Intuition for using the projector, an example on graphs:
[Figure: the adjacency matrix $A_d$ of a graph with two loosely connected clusters; the projector P onto the span of the eigenvectors of its largest eigenvalues; and P rounded to a 0/1 matrix P′, whose blocks expose the two clusters.]
[Figure: embedding of the columns of P.]

Algorithm: phase 3
Compute the ranking for each type. For each type t compute its characteristic vector $c_t$:
$$(c_t)_i = \begin{cases} 1 & \text{if respondent } i \text{ belongs to type } t,\\ 0 & \text{otherwise.} \end{cases}$$
For each type t compute $A^T c_t$. For the entry corresponding to product pair (x, y):
• positive: x is preferred over y by type t,
• negative: y is preferred over x by type t,
• zero: type t is indifferent.
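A numpy sketch of the three phases on the answer matrix A. The eigen-gap rule and the k-means clustering are assumptions filled in for illustration; the slides only say "search for a gap in the eigenvalues" and "cluster the columns of P".

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Tiny k-means on the rows of X (sufficient for this sketch)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def segment_and_rank(A):
    """Phases 1-3 on the answer matrix A (respondents x product pairs)."""
    B = A @ A.T
    eigvals, eigvecs = np.linalg.eigh(B.astype(float))   # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending

    # Phase 1: estimate k at the largest gap between consecutive eigenvalues
    # (a crude stand-in for inspecting the spectrum by eye)
    k = int(np.argmax(eigvals[:-1] - eigvals[1:])) + 1

    # Phase 2: projector onto the top-k eigenspace; respondents = columns of P
    V = eigvecs[:, :k]
    P = V @ V.T
    labels = kmeans(P.T, k)       # cluster the columns of P

    # Phase 3: aggregate within each type: scores = A^T c_t
    rankings = {}
    for t in range(k):
        c_t = (labels == t).astype(float)
        rankings[t] = A.T @ c_t   # >0: x over y, <0: y over x, 0: indifferent
    return k, labels, rankings

# e.g. with A from the previous sketch:
# k, labels, rankings = segment_and_rank(A)
```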
Experimental study
On real-world data: 21 conjoint data sets from Sawtooth Software, Inc. Questions:
• Do real populations decompose into different customer types?
• How does our algorithm compare to Sawtooth's algorithm?

Conjoint structures
• Attributes: sets $A_1, \ldots, A_n$ with $|A_i| = m_i$.
• An element of $A_i$ is called a level of the i-th attribute.
• A product is an element of $A_1 \times \cdots \times A_n$.
Example (car):
• number of seats = {5, 7}
• cargo area = {small, medium, large}
• horsepower = {240hp, 185hp}
• price = {$29000, $33000, $37000}
• …
In practical conjoint studies: $2 \le m_i \le 8$ and $3 \le n \le 15$.

Quality measures
Difficulty: we do not know the real type rankings, so we cannot directly measure the quality of the result. Quality measures used instead:
• Number of inverted pairs: $\mathrm{inv}_{ij}$ = average number of inversions in the partial rankings of respondents in type i with respect to the j-th type ranking (for a good segmentation $\mathrm{inv}_{ii}$ is small).
• Deviation probability 1 − p.
• Hit rate (leave-one-out experiments).

Study 1
# respondents = 270; size of study: 8 × 3 × 4 = 96; # questions = 20.
[Figure: largest eigenvalues of the matrix B.]
Two types; size of clusters: 179 – 91.
Number of inversions and deviation probability:

        | Ranking for type 1 | Ranking for type 2 | 1 − p
Type 1  | 0.19               | 3.33               | 0.95%
Type 2  | 2.28               | 0.75               | 3.75%

Hit rates: Sawtooth: ?; our algorithm: 69%.

Study 2
# respondents = 539; size of study: 4 × 3 × 3 × 5 = 180; # questions = 30.
[Figure: largest eigenvalues of the matrix B.]
Four types; size of clusters: 81 – 119 – 130 – 209.
Number of inversions and deviation probability:

        | Ranking for type 1 | Ranking for type 2 | Ranking for type 3 | Ranking for type 4 | 1 − p
Type 1  | 0.44               | 6.77               | 5.11               | 6.53               | 1.5%
Type 2  | 5.58               | 0.92               | 6.92               | 7.98               | 3.1%
Type 3  | 3.56               | 6.1                | 0.84               | 5.67               | 2.8%
Type 4  | 3.56               | 5.08               | 4.25               | 1.16               | 3.9%

Hit rates: Sawtooth: 87%; our algorithm: 65%.

Study 3
# respondents = 1184; size of study: 9 × 6 × 5 = 270; # questions = 48.
[Figure: largest eigenvalues of the matrix B.]
Size of clusters: 6 – 1175 – 3 (resp. 3 – 1164 – 6 – 8 – 3); 1 − p = 12%.
Hit rates: Sawtooth: 78%; our algorithm: 62%.

Study 4
# respondents = 300; size of study: 6 × 4 × 6 × 3 × 2 = 3456; # questions = 40.
[Figure: largest eigenvalues of the matrix B.]
Hit rates: Sawtooth: 85%; our algorithm: 51%.

Aggregation: conclusion
• Segmentation seems to work well in practice.
• The hit rates are not good. Reason: the information is too sparse.
• Additional assumptions are necessary:
  – exploit the conjoint structure,
  – make distribution assumptions.

Thank you!

Appendix: Yao's minimax principle
• I: finite set of input instances.
• A: finite set of deterministic algorithms.
• C(i, a): cost of algorithm a on input i, where $i \in I$ and $a \in A$.
For all distributions p over I and q over A:
$$\min_{a \in A} \mathbf{E}\,[C(i_p, a)] \;\le\; \max_{i \in I} \mathbf{E}\,[C(i, a_q)].$$
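The inequality can be checked numerically on a toy instance; the cost matrix and the distributions below are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.integers(0, 10, size=(5, 4))   # C[i, a]: cost of algorithm a on input i
p = rng.dirichlet(np.ones(5))          # distribution over inputs
q = rng.dirichlet(np.ones(4))          # distribution over algorithms

lhs = (p @ C).min()   # min_a E_{i ~ p} C(i, a): best deterministic algorithm vs. p
rhs = (C @ q).max()   # max_i E_{a ~ q} C(i, a): randomized algorithm q on worst input
assert lhs <= rhs     # Yao's inequality holds for every choice of p and q
```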