Department of Statistics Papers, Department of Statistics, UCLA

Title: Multidimensional Unfolding
Author: Jan de Leeuw
Publication Date: 08-22-2011
Series: Department of Statistics Papers
Permalink: http://escholarship.org/uc/item/05m7v36t
MULTIDIMENSIONAL UNFOLDING

JAN DE LEEUW

ABSTRACT. This is an entry for The Encyclopedia of Statistics in Behavioral Science, to be published by Wiley in 2005.
The unfolding model is a geometric model for preference and choice. It locates individuals and alternatives as points in a joint space, and it says that an individual will pick the alternative in the choice set closest to its ideal point. Unfolding originated in the work of Coombs [4] and his students. It is perhaps the dominant model in both scaling of preferential choice and attitude scaling.

The multidimensional unfolding technique computes solutions to the equations of the unfolding model. It can be defined as multidimensional scaling of off-diagonal matrices. This means the data are dissimilarities between $n$ row objects and $m$ column objects, collected in an $n \times m$ matrix $\Delta$. An important example is preference data, where $\delta_{ij}$ indicates, for instance, how much individual $i$ dislikes object $j$. In unfolding we have many of the same distinctions as in general multidimensional scaling: there is unidimensional and multidimensional unfolding, metric and nonmetric unfolding, and there are many possible choices of loss functions that can be minimized.

Date: April 3, 2004.
Key words and phrases: fitting distances, multidimensional scaling, unfolding, choice models.
First we will look at (metric) unfolding as defining the system of equations $\delta_{ij} = d_{ij}(X, Y)$, where $X$ is the $n \times p$ configuration matrix of row points, $Y$ is the $m \times p$ configuration matrix of column points, and
\[
d_{ij}(X, Y) = \sqrt{\sum_{s=1}^{p} (x_{is} - y_{js})^2}.
\]
Clearly an equivalent system of algebraic equations is $\delta_{ij}^2 = d_{ij}^2(X, Y)$, and this system expands to
\[
\delta_{ij}^2 = \sum_{s=1}^{p} x_{is}^2 + \sum_{s=1}^{p} y_{js}^2 - 2 \sum_{s=1}^{p} x_{is} y_{js}.
\]
We can rewrite this in matrix form as $\Delta^{(2)} = a e_m' + e_n b' - 2 X Y'$, where $a$ and $b$ contain the row and column sums of squares, and where $e$ is used for a vector with all elements equal to one. If we define the centering operators $J_n = I_n - e_n e_n'/n$ and $J_m = I_m - e_m e_m'/m$, then we see that doubly centering the matrix of squared dissimilarities gives the basic result
\[
H = -\tfrac{1}{2} J_n \Delta^{(2)} J_m = \tilde{X} \tilde{Y}',
\]
where $\tilde{X} = J_n X$ and $\tilde{Y} = J_m Y$ are centered versions of $X$ and $Y$. For our system of equations to be solvable, it is necessary that $\operatorname{rank}(H) \leq p$. Solving the system, or finding an approximate solution by using the singular value decomposition, already gives us an idea about $X$ and $Y$, except that we do not know the relative location and orientation of the two point clouds. More precisely, if $H = P Q'$ is a full rank decomposition of $H$, then the solutions $X$ and $Y$ of our system of equations $\delta_{ij}^2 = d_{ij}^2(X, Y)$ can be written in the form
\[
X = (P + e_n \alpha') T, \qquad Y = (Q + e_m \beta')(T')^{-1},
\]
which leaves us with only the $p(p+2)$ unknowns in $\alpha$, $\beta$, and $T$ still to be determined. By using the fact that the solution is invariant under translation and rotation we can actually reduce this to $\tfrac{1}{2} p(p+3)$ parameters. One way to find these additional parameters is given in [10].
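The double-centering-plus-SVD route just described is easy to sketch in code. A minimal version in NumPy (the function name is ours), which stops at the centered configurations $\tilde{X}$ and $\tilde{Y}$ and leaves $\alpha$, $\beta$, and $T$ undetermined, as in the text:

```python
import numpy as np

def unfolding_svd(delta, p=2):
    """Double-center the squared dissimilarities and factor the result
    H = -1/2 Jn Delta^(2) Jm by a rank-p SVD, giving centered row and
    column configurations whose product reproduces H."""
    n, m = delta.shape
    Jn = np.eye(n) - np.ones((n, n)) / n   # row centering operator
    Jm = np.eye(m) - np.ones((m, m)) / m   # column centering operator
    H = -0.5 * Jn @ (delta ** 2) @ Jm      # H = X~ Y~'
    U, s, Vt = np.linalg.svd(H)
    # split the singular values symmetrically between the two sets
    X_tilde = U[:, :p] * np.sqrt(s[:p])
    Y_tilde = Vt[:p].T * np.sqrt(s[:p])
    return X_tilde, Y_tilde
```

For error-free dissimilarities H has rank at most $p$, so the rank-$p$ product $\tilde{X}\tilde{Y}'$ reproduces H exactly; for noisy data the SVD gives the least-squares approximation.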
Instead of trying to find an exact solution, if one actually exists, by algebraic means, we can also define a multidimensional unfolding loss function and minimize it. In the most basic and classical form, we have the Stress loss function
\[
\sigma(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} \left( \delta_{ij} - d_{ij}(X, Y) \right)^2.
\]
This is identical to an ordinary multidimensional scaling problem in which the diagonal (row-row and column-column) weights are zero. Or, to put it differently, in unfolding the dissimilarities between different row objects and between different column objects are missing. Thus any multidimensional scaling program that can handle weights and missing data can be used to minimize this loss function. Details are in [7] or [1, Part III]. One can also consider measuring loss using SStress, the sum of squared differences between the squared dissimilarities and squared distances. This has been considered in [11, 6].
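As a toy illustration of minimizing Stress with the within-set terms missing, the following sketch uses plain gradient descent with all off-diagonal weights equal to one. This is not the algorithm of [7] or [1], and the function name and step size are ours; it is just the simplest procedure that fits the loss above:

```python
import numpy as np

def unfolding_stress(delta, p=2, steps=1000, lr=0.01, seed=0):
    """Minimize Stress = sum_ij (delta_ij - d_ij(X, Y))^2 over row
    points X and column points Y by plain gradient descent; only the
    n x m row-column dissimilarities enter the loss."""
    rng = np.random.default_rng(seed)
    n, m = delta.shape
    X = rng.normal(size=(n, p))
    Y = rng.normal(size=(m, p))
    for _ in range(steps):
        diff = X[:, None, :] - Y[None, :, :]       # (n, m, p) differences
        d = np.sqrt((diff ** 2).sum(-1)) + 1e-12   # distances d_ij
        g = ((d - delta) / d)[:, :, None] * diff   # per-pair gradient term
        X -= lr * g.sum(axis=1)                    # descend in X
        Y += lr * g.sum(axis=0)                    # descend in Y (opposite sign)
    return X, Y
```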
Area                                                  Plot Code
Social Psychology                                     SOC
Educational and Developmental Psychology              EDU
Clinical Psychology                                   CLI
Mathematical Psychology and Psychological Statistics  MAT
Experimental Psychology                               EXP
Cultural Psychology and Psychology of Religion        CUL
Industrial Psychology                                 IND
Test Construction and Validation                      TST
Physiological and Animal Psychology                   PHY

TABLE 1. Nine Psychology Areas
We use an example from Roskam [9, p. 152]. The Department of Psychology at the University of Nijmegen has, or had, 9 different areas of
research and teaching. Each of the 39 psychologists working in the department ranked all 9 areas in order of relevance for their work. The areas
are given in Table 1. We apply metric unfolding, in two dimensions, and
find the solution in Figure 1.

[Figure 1 here: a two-dimensional configuration plotting the 39 psychologists as numbered points and the nine area codes as labels, with axes dimension1 and dimension2.]

FIGURE 1. Metric Unfolding Roskam Data
In this analysis we used the rank orders, more precisely the numbers 0 to
8. Thus, for good fit, first choices should coincide with ideal points. The
grouping of the 9 areas in the solution is quite natural.
In this case, and in many other cases, the problems we are analyzing suggest
that we really are interested in nonmetric unfolding. It is difficult to think of
actual applications of metric unfolding, except perhaps in the life and physical sciences. This does not mean that metric unfolding is uninteresting. Most nonmetric unfolding algorithms solve metric unfolding subproblems, and one can often make a case for metric unfolding as a robust way to solve nonmetric unfolding problems.
The original techniques proposed by Coombs [4] were purely nonmetric and did not even lead to metric representations. In preference analysis, the prototypical area of application, we often only have ranking information. Each individual ranks a number of candidates, or food samples, or investment opportunities. The ranking information is row-conditional, which means we cannot compare the ranks given by individual $i$ to the ranks given by individual $k$. The order is defined only within rows. Metric data are generally unconditional, because we can compare numbers both within and between rows. Because of the paucity of information (only rank order, only row-conditional, only off-diagonal) the usual Kruskal approach to nonmetric unfolding often leads to degenerate solutions, even after clever renormalization and partitioning of the loss function [8]. In Figure 2 we give the solution minimizing
\[
\sigma(X, Y, \Delta) = \sum_{i=1}^{n} \frac{\sum_{j=1}^{m} w_{ij} \left( \delta_{ij} - d_{ij}(X, Y) \right)^2}{\sum_{j=1}^{m} w_{ij} \left( \delta_{ij} - \delta_{i\star} \right)^2}
\]
over $X$ and $Y$ and over those $\Delta$ whose rows are monotone with the ranks given by the psychologists. Thus there is a separate monotone regression
computed for each of the 39 rows.

[Figure 2 here: the nonmetric unfolding configuration of the 39 psychologists and the nine area codes, with axes dimension1 and dimension2.]

FIGURE 2. Nonmetric Unfolding Roskam Data
The solution is roughly the same as the metric one, but there is more clustering and clumping in the plot, and this makes the visual representation much less clear. It is quite possible that continuing to iterate to higher precision will lead to even more degeneracy. More recently Busing et al. [2] have adapted the Kruskal approach to nonmetric unfolding by penalizing for the flatness of the monotone regression function.
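The separate monotone regressions mentioned above are typically computed with the pool-adjacent-violators algorithm. A minimal sketch (function name ours, assuming the dissimilarities of a row are presented in the order given by that individual's ranking):

```python
import numpy as np

def monotone_regression(y):
    """Least-squares monotone (nondecreasing) fit by pooling adjacent
    violators: merge neighboring blocks until block means are ordered."""
    blocks = []  # each block is [mean, weight]
    for v in np.asarray(y, dtype=float):
        blocks.append([v, 1.0])
        # pool while the last two block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks.pop()
            blocks.append([(w1 * v1 + w2 * v2) / (w1 + w2), w1 + w2])
    out = []
    for v, w in blocks:
        out.extend([v] * int(round(w)))  # expand blocks back to full length
    return np.array(out)
```

In a Kruskal-type unfolding algorithm this projection would be applied to each of the 39 rows in turn, alternating with a metric update of the configuration.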
One would expect even more problems when the data are not even rank orders but just binary choices. Suppose $n$ individuals have to choose one alternative from a set of $m$ alternatives. The data can be coded as an indicator matrix, which is an $n \times m$ binary matrix with exactly one unit element in each row. The unfolding model says there are $n$ points $x_i$ and $m$ points $y_j$ in $\mathbb{R}^p$ such that, if individual $i$ picks alternative $j$, then $\|x_i - y_j\| \leq \|x_i - y_\ell\|$ for all $\ell = 1, \ldots, m$. More concisely, we use the $m$ points $y_j$ to draw a
Voronoi diagram. This is illustrated in Figure 3 for six points in the plane.

[Figure 3 here: a Voronoi diagram for six points in the plane.]

FIGURE 3. A Voronoi Diagram
There is one Voronoi cell for each of the $y_j$, and the cell (which can be bounded or unbounded) contains exactly those points which are closer to $y_j$ than to any of the other $y_\ell$. The unfolding model says that individuals are in the Voronoi cells of the objects they pick. This clearly leaves room for a lot of indeterminacy in the actual placement of the points.
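Given candidate configurations, the single-choice unfolding condition is straightforward to verify. A small sketch (function name ours): each individual's point must lie in the Voronoi cell of the picked alternative, i.e. the picked column point must be nearest.

```python
import numpy as np

def choices_consistent(X, Y, picked):
    """For each individual i, check the unfolding condition: the point
    of alternative picked[i] is at least as close to x_i as every other
    column point, i.e. x_i lies in that alternative's Voronoi cell."""
    d = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1))  # n x m distances
    return d[np.arange(len(X)), picked] <= d.min(axis=1) + 1e-12
```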
The situation becomes more favorable if we have more than one indicator matrix, that is, if each individual makes more than one choice. There is a Voronoi diagram for each choice, and individuals must be in the Voronoi cells of the objects they choose for each of the diagrams. Superimposing the diagrams creates smaller and smaller regions that each individual must be in, and the unfolding model requires the intersection of the Voronoi cells determined by the choices of any individual to be nonempty.

It is perhaps simplest to apply this idea to binary choices. The Voronoi cells in this case are half spaces defined by hyperplanes dividing $\mathbb{R}^p$ in two parts. All individuals choosing the first of the two alternatives must be on one side of the hyperplane, all others must be on the other side. There is a hyperplane for each choice.
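For two alternatives the separating hyperplane is the perpendicular bisector of the two column points: squaring and expanding $\|x - y_1\| \leq \|x - y_2\|$ turns the distance comparison into one linear inequality. A sketch (function name ours):

```python
import numpy as np

def bisector_halfspace(y1, y2):
    """Half-space form of a two-alternative Voronoi cell.  Expanding
    ||x - y1||^2 <= ||x - y2||^2 gives the linear inequality a'x <= b
    with a = 2(y2 - y1) and b = y2'y2 - y1'y1."""
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    a = 2.0 * (y2 - y1)
    b = y2 @ y2 - y1 @ y1
    return a, b
```

Every individual choosing the first alternative must satisfy $a'x \leq b$; those choosing the second must satisfy the reverse inequality.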
[Figure 4 here: patterns of three binary choices (000, 001, 011, 100, 101, 110, 111) placed in the regions cut out of the plane by three separating hyperplanes labeled 1+/1−, 2+/2−, and 3+/3−.]

FIGURE 4. Unfolding Binary Data
This is the nonmetric factor analysis model studied first by Coombs and Kao [5]. It is illustrated in Figure 4.

The prototype here is roll call data [3]. If 100 US senators vote on 20 issues, then the unfolding model says that (for a representation in the plane) there are 100 points and 20 lines, such that each issue-line separates the "aye" and the "nay" voters for that issue. Unfolding in this case can be done by correspondence analysis, or by maximum likelihood logit or probit techniques. We give an example, using 20 issues selected by Americans for Democratic Action, and the 2000 US Senate.
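As a rough illustration of the logit approach, suppose the senators' ideal points were already known; one issue's separating line could then be fit by logistic regression of the votes on the coordinates. The function name and the plain gradient-ascent loop below are ours, not from the article, and a full analysis would estimate the ideal points jointly as well:

```python
import numpy as np

def issue_line(X, votes, steps=500, lr=0.5):
    """Fit one issue's separating line by logistic regression of 0/1
    votes on known ideal points X (n x p), via gradient ascent on the
    log-likelihood.  Decision rule: vote "aye" iff w'x + c > 0."""
    w = np.zeros(X.shape[1])
    c = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + c)))  # modeled P("aye")
        g = votes - p                           # residuals drive the update
        w += lr * (X.T @ g) / len(X)
        c += lr * g.mean()
    return w, c
```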
[Figure 5 here: roll call plot for the Senate, with the senators plotted as numbered points against dimension 1 and dimension 2.]

FIGURE 5. The 2000 US Senate

REFERENCES
[1] I. Borg and P.J.F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer, New York, 1997.
[2] F.M.T.A. Busing, P.J.F. Groenen, and W.J. Heiser. Avoiding Degeneracy in Multidimensional Unfolding by Penalizing on the Coefficient of Variation. Psychometrika, 2004.
[3] J. Clinton, S. Jackman, and D. Rivers. The Statistical Analysis of Roll Call Data. American Political Science Review, (in press).
[4] C.H. Coombs. A Theory of Data. Wiley, 1964.
[5] C.H. Coombs and R.C. Kao. Nonmetric Factor Analysis. Engineering Research Bulletin 38, Engineering Research Institute, University of Michigan, Ann Arbor, 1955.
[6] M.J. Greenacre and M.W. Browne. An Efficient Alternating Least-Squares Algorithm to Perform Multidimensional Unfolding. Psychometrika, 51:241–250, 1986.
[7] W.J. Heiser. Unfolding Analysis of Proximity Data. PhD thesis, University of Leiden, 1981.
[8] J.B. Kruskal and J.D. Carroll. Geometrical Models and Badness of Fit Functions. In P.R. Krishnaiah, editor, Multivariate Analysis, Volume II, pages 639–671. North Holland Publishing Company, 1969.
[9] E.E.Ch.I. Roskam. Metric Analysis of Ordinal Data in Psychology. PhD thesis, University of Leiden, 1968.
[10] P.H. Schönemann. On Metric Multidimensional Unfolding. Psychometrika, 35:349–366, 1970.
[11] Y. Takane, F.W. Young, and J. De Leeuw. Nonmetric Individual Differences in Multidimensional Scaling: An Alternating Least Squares Method with Optimal Scaling Features. Psychometrika, 42:7–67, 1977.
DEPARTMENT OF STATISTICS, UNIVERSITY OF CALIFORNIA, LOS ANGELES, CA 90095-1554
E-mail address, Jan de Leeuw: deleeuw@stat.ucla.edu
URL, Jan de Leeuw: http://gifi.stat.ucla.edu