Preference Analysis
Joachim Giesen and Eva Schuberth
May 24, 2006
Outline

• Motivation
• Approximate sorting
  • Lower bound
  • Upper bound
• Aggregation
  • Algorithm
  • Experimental results
• Conclusion
Motivation

• Find the preference structure of a consumer w.r.t. a set of products
• Common approach: assign a value function to the products
• The value function determines a ranking of the products
• Elicitation: pairwise comparisons
• Problem: deriving a metric value function from non-metric information
→ We restrict ourselves to finding a ranking
Motivation

• Find a ranking for every respondent individually
• Efficiency measure: number of comparisons
• Comparison-based sorting has a lower bound of Ω(n log n) comparisons
• As the set of products can be large, this is too much
Possible solutions:
• Approximation
• Aggregation
• Modeling and distribution assumptions
Approximation
(joint work with J. Giesen and M. Stojaković)

1. Lower bound (proof)
2. Algorithm
Approximation

• The consumer's true ranking of n products corresponds to the identity (increasing) permutation id on {1, .., n}
• Wanted: an approximation of the ranking, i.e. a permutation π ∈ S_n s.t. dist(π, id) is small
Metric on S_n

• Needed: a metric on S_n that is meaningful in the market-research context
• Spearman's footrule metric D:

    D(π, id) = D(π) = ∑_{i=1}^n |π(i) − i|

• Note: D(π) ≤ n²
We show:

• To approximate a ranking within expected distance n²/ν(n), at least

    n(min{log ν(n), log n} − 6)

  comparisons are necessary.
• 6n log(ν(n)) comparisons are always sufficient.
Lower bound

Let A be a randomized approximate sorting algorithm with input permutation σ ∈ S_n and random string ρ, and let r = n²/ν(n).

Theorem: If for every input permutation σ the expected distance of the output A(σ, ρ) to id is at most r, then A performs at least

    n(min{log ν(n), log n} − 6)

comparisons in the worst case.
Lower bound: Proof

Follows Yao's Minimax Principle:
• Assume A makes fewer than n(min{log ν(n), log n} − 6) comparisons for every input.
• Fix the random string ρ̃ → A(·, ρ̃) is a deterministic algorithm.
• Then for at least (1/2)·n! input permutations the output is at distance more than 2r from id (shown below).
• → For an input σ drawn uniformly at random, the expected distance of A(σ, ρ̃) to id is larger than r.
• → There is a σ₀ ∈ S_n s.t. the expected distance of A(σ₀, ρ) to id, taken over the random string ρ, is larger than r. Contradiction.
Lower bound: Lemma

For r > 0, let B_D(id, r) be the ball centered at id with radius r w.r.t. the metric D.

Lemma:

    |B_D(id, r)| ≤ (2e(r + n)/n)^n
Lower bound: Proof of Lemma

• If σ ∈ B_D(id, r), then ∑_{i=1}^n |σ(i) − i| ≤ r.
• σ is uniquely determined by the sequence {σ(i) − i}_i.
• For a fixed sequence of non-negative integers d_i, at most 2^n permutations satisfy |σ(i) − i| = d_i.
• The number of sequences of n non-negative integers whose sum is at most r is (n + r choose n).

Hence

    |B_D(id, r)| ≤ 2^n (r + n choose n) ≤ (2e(r + n)/n)^n
Lower bound: deterministic case

Now to show: for a fixed random string ρ̃, the number of input permutations whose output is at distance more than 2r from id is more than (1/2)·n!.
Lower bound: deterministic case

• k comparisons → 2^k classes of inputs with the same comparison outcomes
• For σ, τ in the same class: A(σ, ρ̃) = A(τ, ρ̃), so there are at most 2^k distinct outputs
• At most |B_D(id, 2r)| · 2^k input permutations have their output in B_D(id, 2r)
• So at least n! − |B_D(id, 2r)| · 2^k ≥ (1/2)·n! input permutations have their output outside B_D(id, 2r)
Upper Bound

An algorithm (suggested by Chazelle) approximates any ranking within distance n²/ν(n) with fewer than 6n log(ν(n)) comparisons.
Algorithm

• Partition the elements into equal-sized bins
• Elements within a bin are smaller than any element in subsequent bins
• No ordering of the elements within a bin
• Output: a permutation consistent with the sequence of bins (a code sketch follows below)
Algorithm

[Figure: bins after rounds 0, 1, 2]
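The rounds can be sketched as follows; a minimal Python illustration, assuming distinct items, where statistics.median_low stands in for the linear-time median selection of Blum et al. used on the slides.

```python
import statistics

def approx_sort(items, rounds):
    """Approximate sorting by repeated median partitioning.

    After m rounds there are 2^m bins; every element in a bin is smaller
    than every element in later bins, but bins are internally unordered.
    Returned is one permutation consistent with the bin sequence.
    """
    bins = [list(items)]
    for _ in range(rounds):
        nxt = []
        for b in bins:
            if len(b) <= 1:
                nxt.append(b)
                continue
            med = statistics.median_low(b)  # stand-in for Blum et al.'s
                                            # linear-time median search
            nxt.append([x for x in b if x <= med])  # lower half
            nxt.append([x for x in b if x > med])   # upper half
        bins = nxt
    return [x for b in bins for x in b]  # any order inside a bin is fine
```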
Analysis of algorithm

• m rounds → 2^m bins
• Output: a ranking consistent with the ordering of the bins

Running time:
• Median search and partitioning of n elements: fewer than 6n comparisons (algorithm by Blum et al.)
• m rounds → fewer than 6nm comparisons

Distance:

    D(π, id) = ∑_{i=1}^n |π(i) − i| ≤ ∑_{i=1}^n n/2^m = n²/2^m

Set m = log(ν(n)).
Algorithm: Theorem

Any ranking consistent with the bins computed in log(ν(n)) rounds, i.e. with fewer than 6n log(ν(n)) comparisons, has distance at most n²/ν(n).
Approximation: Summary

• For sufficiently large error, fewer comparisons than for exact sorting suffice:
  • error n²/ν(n) with ν(n) = n^ε, ε const: Θ(n log n) comparisons
  • error n²/ν(n) = n^(2−o(1)): o(n log n) comparisons
• For real applications this is still too much: individual elicitation of a value function is not possible
→ Second approach: Aggregation
Aggregation
(joint work with J. Giesen and D. Mitsche)

Motivation:
• We think that the population splits into preference/customer types
• Respondents answer according to their type (but deviations are possible)
• Instead of
  • individual preference analysis or
  • aggregation over the whole population
→ aggregate within customer types
Aggregation

Idea:
• Ask only a constant number of questions (pairwise comparisons)
• Ask many respondents
• Cluster the respondents according to their answers into types
• Aggregate the information within a cluster to get type rankings

Philosophy: first segment, then aggregate
Algorithm

The algorithm works in 3 phases:
(1) Estimate the number k of customer types
(2) Segment the respondents into the k customer types
(3) Compute a ranking for each customer type
Algorithm

Every respondent performs pairwise comparisons.
Basic data structure: matrix A = [a_ij]. Entry a_ij ∈ {−1, 1, 0} refers to respondent i and the j-th product pair (x, y):

    a_ij = −1  if respondent i prefers y over x
    a_ij =  1  if respondent i prefers x over y
    a_ij =  0  if respondent i has not compared x and y
Algorithm

Define B = AAᵀ. Then B_ij = (number of product pairs on which respondents i and j agree) − (number of pairs on which they disagree), not counting 0's.
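As a small sanity check (toy data, not from the studies), the agreement matrix falls out of one matrix product, since the ±1/0 coding makes the dot product of two answer rows count agreements minus disagreements:

```python
import numpy as np

# toy answer matrix: 3 respondents x 4 product pairs, entries in {-1, 0, 1}
A = np.array([[ 1, -1,  1,  0],
              [ 1, -1,  0,  1],
              [-1,  1,  0, -1]])

B = A @ A.T   # B[i, j] = #agreements - #disagreements of respondents i, j
print(B)      # diagonal = number of pairs each respondent answered
```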
Algorithm: phase 1

Phase 1: Estimate the number k of customer types
• Use the matrix B
• Analyze the spectrum of B
• We expect the k largest eigenvalues of B to be substantially larger than the remaining eigenvalues
→ Search for a gap in the eigenvalues
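A minimal sketch of the gap search; taking the largest gap in the sorted spectrum as the estimate of k is one possible reading, since the slides only say to look for a gap:

```python
import numpy as np

def estimate_k(B):
    """Estimate the number of customer types from the spectrum of B."""
    ev = np.sort(np.linalg.eigvalsh(B))[::-1]   # eigenvalues, descending
    gaps = ev[:-1] - ev[1:]                     # gaps between neighbours
    return int(np.argmax(gaps)) + 1             # k eigenvalues before the gap
```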
Algorithm: phase 2

Phase 2: Cluster the respondents into customer types
• Again use the matrix B
• Compute the projector P onto the space spanned by the eigenvectors of the k largest eigenvalues of B
• Every respondent corresponds to a column of P
• Cluster the columns of P
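A sketch of phase 2 under these definitions; using k-means on the columns of P is an assumption for illustration, as the slides do not fix a clustering method:

```python
import numpy as np
from sklearn.cluster import KMeans  # any column-clustering method would do

def segment(B, k):
    """Cluster respondents via the spectral projector of B."""
    ev, U = np.linalg.eigh(B)        # eigenvalues in ascending order
    Uk = U[:, -k:]                   # eigenvectors of the k largest eigenvalues
    P = Uk @ Uk.T                    # projector onto their span
    # every respondent corresponds to a column of P; cluster those columns
    return KMeans(n_clusters=k, n_init=10).fit_predict(P.T)
```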
Algorithm: phase 2

Intuition for using the projector (example on graphs):
Algorithm: phase 2

Adjacency matrix of the example graph:

Ad =
0 1 1 0 0 0 0 0
1 0 0 1 0 0 0 0
1 0 0 1 0 0 0 0
0 1 1 0 0 0 1 0
0 0 0 0 0 1 0 0
0 0 0 1 0 0 1 0
0 0 1 0 0 0 1 0
0 0 0 1 1 1 0 1
Algorithm: phase 2

P =
 2.0   2.1   2.1   2.4  -0.6  -0.6   0.9  -0.3
 2.1   2.2   2.2   2.5  -0.4  -0.4   1.0  -0.1
 2.1   2.2   2.2   2.5  -0.4  -0.4   1.0  -0.1
 2.4   2.5   2.5   3.0   0     0     1.5   0.5
-0.6  -0.4  -0.4   0     2.7   2.7   1.4   3.1
-0.6  -0.4  -0.4   0     2.7   2.7   1.4   3.1
 0.9   1.0   1.0   1.5   1.4   1.4   1.4   1.8
-0.3  -0.1  -0.1   0.5   3.1   3.1   1.8   3.7
Algorithm: phase 2

P' =
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
1 1 1 1 0 0 1 0
0 0 0 0 1 1 0 0
0 0 0 1 1 1 1 0
0 0 1 1 1 1 1 0
0 0 0 1 1 1 1 1
Algorithm: phase 2

[Figure: embedding of the columns of P]
Algorithm: phase 3

Phase 3: Compute the ranking for each type
• For each type t compute the characteristic vector c_t:

    (c_t)_i = 1  if respondent i belongs to that type
    (c_t)_i = 0  otherwise

• For each type t compute Aᵀc_t. If the entry for product pair (x, y) is
  • positive: x is preferred over y by type t
  • negative: y is preferred over x by type t
  • zero: type t is indifferent
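Phase 3 is a single matrix-vector product per type; a small sketch reusing the answer matrix A and cluster labels from above (the helper name is ours):

```python
import numpy as np

def type_preference_signs(A, labels, t):
    """Aggregate the answers of type t over all product pairs."""
    c = (labels == t).astype(int)   # characteristic vector c_t
    s = A.T @ c                     # one entry per product pair (x, y)
    # sign +1: x preferred over y by type t; -1: y over x; 0: indifferent
    return np.sign(s)
```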
Experimental study

On real-world data:
• 21 data sets from Sawtooth Software, Inc. (conjoint data sets)

Questions:
• Do real populations decompose into different customer types?
• How does our algorithm compare to Sawtooth's algorithm?
Conjoint structures

• Attributes: sets A_1, .., A_n with |A_i| = m_i
• An element of A_i is called a level of the i-th attribute
• A product is an element of A_1 × … × A_n

Example: Car
• Number of seats = {5, 7}
• Cargo area = {small, medium, large}
• Horsepower = {240hp, 185hp}
• Price = {$29000, $33000, $37000}
• …

In practical conjoint studies: 2 ≤ m_i ≤ 8 and 3 ≤ n ≤ 15
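The product space of the car example can be enumerated directly; a small illustration using only the attributes listed on the slide:

```python
from itertools import product

attributes = {
    "number of seats": [5, 7],
    "cargo area":      ["small", "medium", "large"],
    "horsepower":      ["240hp", "185hp"],
    "price":           ["$29000", "$33000", "$37000"],
}
# a product is one element of A1 x ... x An
products = list(product(*attributes.values()))
print(len(products))   # 2 * 3 * 2 * 3 = 36 combinations
```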
Quality measures

Difficulty: we do not know the real type rankings
→ We cannot directly measure the quality of the result
→ Other quality measures:
• Number of inverted pairs inv_ij: average number of inversions in the partial rankings of respondents in type i with respect to the j-th type ranking
• Deviation probability: 1 − p = inv_ii / l, where l is the number of questions
• Hit rate (leave-one-out experiments)
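A sketch of how the inversion counts behind inv_ij can be computed; the dict-based encoding of a respondent's partial ranking is an assumption for illustration:

```python
def inversions(answers, ranking_pos):
    """Count answered pairs on which a respondent contradicts a type ranking.

    answers:     {(x, y): +1 if x preferred over y, -1 otherwise}
    ranking_pos: {product: position in the type ranking, 0 = best}
    """
    return sum(a != (1 if ranking_pos[x] < ranking_pos[y] else -1)
               for (x, y), a in answers.items())

# averaging over the respondents of type i against type ranking j gives inv_ij;
# the deviation probability of type i is then 1 - p = inv_ii / l
```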
Study 1

# respondents = 270
Size of study: 8 x 3 x 4 = 96
# questions = 20

[Figure: largest eigenvalues of matrix B]
Study 1

→ two types
• Size of clusters: 179 – 91

Number of inversions inv_ij (rows: types, columns: type rankings) and deviation probability:

         Ranking 1   Ranking 2   1 − p
Type 1   0.19        3.33        0.95%
Type 2   2.28        0.75        3.75%
Study 1

Hit rates:
• Sawtooth: ?
• Our algorithm: 69%
Study 2

# respondents = 539
Size of study: 4 x 3 x 3 x 5 = 180
# questions = 30

[Figure: largest eigenvalues of matrix B]
Study 2

→ four types
• Size of clusters: 81 – 119 – 130 – 209

Number of inversions inv_ij (rows: types, columns: type rankings) and deviation probability:

         Ranking 1   Ranking 2   Ranking 3   Ranking 4   1 − p
Type 1   0.44        6.77        5.11        6.53        1.5%
Type 2   5.58        0.92        6.92        7.98        3.1%
Type 3   3.56        6.1         0.84        5.67        2.8%
Type 4   3.56        5.08        4.25        1.16        3.9%
Study 2

Hit rates:
• Sawtooth: 87%
• Our algorithm: 65%
Study 3

# respondents = 1184
Size of study: 9 x 6 x 5 = 270
# questions = 48

[Figure: largest eigenvalues of matrix B]

• Size of clusters: 6 – 1175 – 3 resp. 3 – 1164 – 6 – 8 – 3
• 1 − p = 12%
Study 3

Hit rates:
• Sawtooth: 78%
• Our algorithm: 62%
Study 4

# respondents = 300
Size of study: 6 x 4 x 6 x 3 x 2 = 3456
# questions = 40

[Figure: largest eigenvalues of matrix B]

Hit rates:
• Sawtooth: 85%
• Our algorithm: 51%
Aggregation - Conclusion

• Segmentation seems to work well in practice.
• Hit rates are not good. Reason: the information is too sparse.
→ Additional assumptions are necessary:
  • Exploit the conjoint structure
  • Make distribution assumptions
Thank you!
Yao's Minimax Principle

I: finite set of input instances
A: finite set of deterministic algorithms
C(i, a): cost of algorithm a on input i, where i ∈ I and a ∈ A

For all distributions p over I and q over A:

    min_{a ∈ A} E[C(i_p, a)] ≤ max_{i ∈ I} E[C(i, a_q)]