Department of Statistics Papers, Department of Statistics, UCLA
Title: Multidimensional Unfolding
Author: Jan de Leeuw
Publication Date: 08-22-2011
Permalink: http://escholarship.org/uc/item/05m7v36t

MULTIDIMENSIONAL UNFOLDING

JAN DE LEEUW

ABSTRACT. This is an entry for The Encyclopedia of Statistics in Behavioral Science, to be published by Wiley in 2005.

The unfolding model is a geometric model for preference and choice. It locates individuals and alternatives as points in a joint space, and it says that an individual will pick the alternative in the choice set closest to its ideal point. Unfolding originated in the work of Coombs [4] and his students. It is perhaps the dominant model in both scaling of preferential choice and attitude scaling.

The multidimensional unfolding technique computes solutions to the equations of the unfolding model. It can be defined as multidimensional scaling of off-diagonal matrices. This means the data are dissimilarities between $n$ row objects and $m$ column objects, collected in an $n \times m$ matrix $\Delta$. An important example is preference data, where $\delta_{ij}$ indicates, for instance, how much individual $i$ dislikes object $j$. In unfolding we have many of the same distinctions as in general multidimensional scaling: there is unidimensional and multidimensional unfolding, metric and nonmetric unfolding, and there are many possible choices of loss functions that can be minimized.

Date: April 3, 2004.
Key words and phrases: fitting distances, multidimensional scaling, unfolding, choice models.
First we will look at (metric) unfolding as defining the system of equations $\delta_{ij} = d_{ij}(X, Y)$, where $X$ is the $n \times p$ configuration matrix of row points, $Y$ is the $m \times p$ configuration matrix of column points, and
\[
d_{ij}(X, Y) = \sqrt{\sum_{s=1}^{p} (x_{is} - y_{js})^2}.
\]
Clearly an equivalent system of algebraic equations is $\delta^2_{ij} = d^2_{ij}(X, Y)$, and this system expands to
\[
\delta^2_{ij} = \sum_{s=1}^{p} x^2_{is} + \sum_{s=1}^{p} y^2_{js} - 2 \sum_{s=1}^{p} x_{is} y_{js}.
\]
We can rewrite this in matrix form as $\Delta^{(2)} = a e_m' + e_n b' - 2XY'$, where $a$ and $b$ contain the row and column sums of squares, and where $e$ is used for a vector with all elements equal to one. If we define the centering operators $J_n = I_n - e_n e_n'/n$ and $J_m = I_m - e_m e_m'/m$, then we see that doubly centering the matrix of squared dissimilarities gives the basic result
\[
H = -\tfrac{1}{2} J_n \Delta^{(2)} J_m = \tilde{X} \tilde{Y}',
\]
where $\tilde{X} = J_n X$ and $\tilde{Y} = J_m Y$ are centered versions of $X$ and $Y$. For our system of equations to be solvable, it is necessary that $\operatorname{rank}(H) \leq p$. Solving the system, or finding an approximate solution by using the singular value decomposition, already gives us an idea about $X$ and $Y$, except that we do not know the relative location and orientation of the two point clouds. More precisely, if $H = PQ'$ is a full rank decomposition of $H$, then the solutions $X$ and $Y$ of our system of equations $\delta^2_{ij} = d^2_{ij}(X, Y)$ can be written in the form
\[
X = (P + e_n \alpha')T, \qquad Y = (Q + e_m \beta')(T')^{-1},
\]
which leaves us with only the $p(p+2)$ unknowns in $\alpha$, $\beta$, and $T$ still to be determined. By using the fact that the solution is invariant under translation and rotation we can actually reduce this to $\frac{1}{2}p(p+3)$ parameters. One way to find these additional parameters is given in [10].

Instead of trying to find an exact solution, if one actually exists, by algebraic means, we can also define a multidimensional unfolding loss function and minimize it.
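The double-centering result above is easy to verify numerically. The following sketch (my own illustration, with randomly generated configurations, not code from the paper) builds a configuration that satisfies the model exactly, doubly centers the squared distances, and confirms that $H = \tilde{X}\tilde{Y}'$ with rank at most $p$:

```python
# Numerical check of the basic unfolding identity H = -1/2 Jn D^(2) Jm = Xc Yc'.
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 8, 5, 2

# A configuration that satisfies the model exactly.
X = rng.normal(size=(n, p))          # row points
Y = rng.normal(size=(m, p))          # column points
D2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)  # squared distances

# Centering operators J_n and J_m, and the doubly centered matrix H.
Jn = np.eye(n) - np.ones((n, n)) / n
Jm = np.eye(m) - np.ones((m, m)) / m
H = -0.5 * Jn @ D2 @ Jm

# H equals the product of the centered configurations ...
Xc, Yc = Jn @ X, Jm @ Y
assert np.allclose(H, Xc @ Yc.T)

# ... so its rank is at most p, as the solvability condition requires.
rank = np.linalg.matrix_rank(H)
print(rank)   # → 2
```

The double centering kills the row and column sums-of-squares terms ($J_m e_m = 0$ and $J_n e_n = 0$), leaving only the cross-product term, which is why the rank condition follows directly.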
In the most basic and classical form, we have the Stress loss function
\[
\sigma(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} (\delta_{ij} - d_{ij}(X, Y))^2.
\]
This is identical to an ordinary multidimensional scaling problem in which the diagonal (row-row and column-column) weights are zero. Or, to put it differently, in unfolding the dissimilarities between different row objects and different column objects are missing. Thus any multidimensional scaling program that can handle weights and missing data can be used to minimize this loss function. Details are in [7] or [1, Part III]. One can also consider measuring loss using SStress, the sum of squared differences between the squared dissimilarities and squared distances. This has been considered in [11, 6].

We use an example from Roskam [9, p. 152]. The Department of Psychology at the University of Nijmegen has, or had, 9 different areas of research and teaching. Each of the 39 psychologists working in the department ranked all 9 areas in order of relevance for their work. The areas are given in Table 1.

Area                                                 | Plot Code
---------------------------------------------------- | ---------
Social Psychology                                    | SOC
Educational and Developmental Psychology             | EDU
Clinical Psychology                                  | CLI
Mathematical Psychology and Psychological Statistics | MAT
Experimental Psychology                              | EXP
Cultural Psychology and Psychology of Religion       | CUL
Industrial Psychology                                | IND
Test Construction and Validation                     | TST
Physiological and Animal Psychology                  | PHY

TABLE 1. Nine Psychology Areas

We apply metric unfolding, in two dimensions, and find the solution in Figure 1.

FIGURE 1. Metric Unfolding Roskam Data

In this analysis we used the rank orders, more precisely the numbers 0 to 8. Thus, for good fit, first choices should coincide with ideal points.
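Because unfolding is just weighted MDS with the within-set weights missing, the Stress loss can be minimized with any general-purpose optimizer. The sketch below (my own minimal illustration with synthetic dissimilarities, not the program used for the Roskam analysis) parameterizes the two configurations as one vector and hands the loss to a quasi-Newton routine:

```python
# Minimizing unfolding Stress: only the n*m row-column dissimilarities enter
# the loss; the row-row and column-column weights are implicitly zero.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, m, p = 6, 4, 2
delta = rng.uniform(1, 3, size=(n, m))   # synthetic off-diagonal dissimilarities
w = np.ones((n, m))                      # weights w_ij

def stress(z):
    X = z[:n * p].reshape(n, p)          # row (individual) points
    Y = z[n * p:].reshape(m, p)          # column (object) points
    d = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2))
    return (w * (delta - d) ** 2).sum()

z0 = rng.normal(size=(n + m) * p)        # random starting configuration
res = minimize(stress, z0, method="L-BFGS-B")
print(res.fun)                           # final Stress value
```

Dedicated MDS programs use majorization (SMACOF) rather than a generic optimizer, but the zero-weight structure of the loss is the same.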
The grouping of the 9 areas in the solution is quite natural.

In this case, and in many other cases, the problems we are analyzing suggest that we really are interested in nonmetric unfolding. It is difficult to think of actual applications of metric unfolding, except perhaps in the life and physical sciences. This does not mean that metric unfolding is uninteresting. Most nonmetric unfolding algorithms solve metric unfolding subproblems, and one can often make a case for metric unfolding as a robust way to solve nonmetric unfolding problems.

The original techniques proposed by Coombs [4] were purely nonmetric and did not even lead to metric representations. In preference analysis, the prototypical area of application, we often only have ranking information. Each individual ranks a number of candidates, or food samples, or investment opportunities. The ranking information is row-conditional, which means we cannot compare the ranks given by individual $i$ to the ranks given by individual $k$. The order is defined only within rows. Metric data are generally unconditional, because we can compare numbers both within and between rows. Because of the paucity of information (only rank order, only row-conditional, only off-diagonal) the usual Kruskal approach to nonmetric unfolding often leads to degenerate solutions, even after clever renormalization and partitioning of the loss function [8].

In Figure 2 we give the solution minimizing
\[
\sigma(X, Y, \Delta) = \sum_{i=1}^{n} \frac{\sum_{j=1}^{m} w_{ij} (\delta_{ij} - d_{ij}(X, Y))^2}{\sum_{j=1}^{m} w_{ij} (\delta_{ij} - \bar{\delta}_{i\star})^2}
\]
over $X$ and $Y$ and over those $\Delta$ whose rows are monotone with the ranks given by the psychologists. Thus there is a separate monotone regression computed for each of the 39 rows.
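The per-row monotone regression step can be carried out with the classic pool-adjacent-violators algorithm (PAVA). The sketch below is my own illustration with made-up numbers for a single row, not the paper's implementation: given one individual's current distances and rank order, it finds the least-squares disparities that respect that order.

```python
# Pool-adjacent-violators: least-squares monotone regression for one row.
import numpy as np

def monotone_regression(d, order):
    """Least-squares fit to d[order] that is nondecreasing in the rank order."""
    blocks = [[v, 1] for v in d[order]]   # (sum, count) blocks
    i = 0
    while i < len(blocks) - 1:
        # Merge adjacent blocks whose means violate monotonicity.
        if blocks[i][0] / blocks[i][1] > blocks[i + 1][0] / blocks[i + 1][1]:
            blocks[i][0] += blocks[i + 1][0]
            blocks[i][1] += blocks[i + 1][1]
            del blocks[i + 1]
            i = max(i - 1, 0)             # a merge can create a new violation
        else:
            i += 1
    fit = np.concatenate([[s / c] * c for s, c in blocks])
    out = np.empty_like(fit)
    out[order] = fit                      # disparities back in original positions
    return out

d = np.array([2.0, 1.0, 3.0, 2.5])        # current distances for one row
ranks = [2, 0, 3, 1]                      # this individual's ranks (hypothetical)
order = np.argsort(ranks)                 # objects sorted by preference
print(monotone_regression(d, order).tolist())   # → [2.25, 1.0, 3.0, 2.25]
```

In the normalized loss above, one such regression is run for every one of the 39 rows at each iteration, because the rank information is row-conditional.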
FIGURE 2. Nonmetric Unfolding Roskam Data

The solution is roughly the same as the metric one, but there is more clustering and clumping in the plot, which makes the visual representation much less clear. It is quite possible that continuing to iterate to higher precision would lead to even more degeneracy. More recently, Busing et al. [2] have adapted the Kruskal approach to nonmetric unfolding by penalizing the flatness of the monotone regression function.

One would expect even more problems when the data are not even rank orders but just binary choices. Suppose $n$ individuals each have to choose one alternative from a set of $m$ alternatives. The data can be coded as an indicator matrix, which is an $n \times m$ binary matrix with exactly one unit element in each row. The unfolding model says there are $n$ points $x_i$ and $m$ points $y_j$ in $\mathbb{R}^p$ such that, if individual $i$ picks alternative $j$, then $\|x_i - y_j\| \leq \|x_i - y_\ell\|$ for all $\ell = 1, \ldots, m$. More concisely, we use the $m$ points $y_j$ to draw a Voronoi diagram. This is illustrated in Figure 3 for six points in the plane.

FIGURE 3. A Voronoi Diagram

There is one Voronoi cell for each of the $y_j$, and the cell (which can be bounded or unbounded) contains exactly those points which are closer to $y_j$ than to any of the other $y_\ell$. The unfolding model says that individuals are in the Voronoi cells of the objects they pick. This clearly leaves room for a lot of indeterminacy in the actual placement of the points. The situation becomes more favorable if we have more than one indicator matrix, that is, if each individual makes more than one choice. There is a Voronoi diagram for each choice, and individuals must be in the Voronoi cells of the objects they choose for each of the diagrams.
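Checking whether a configuration satisfies this model is straightforward, because the Voronoi cell an ideal point falls in is simply the nearest column point. The sketch below (assumed toy data of my own, not from the paper) verifies, for one individual making two choices, that the ideal point lies in the chosen cell of every diagram:

```python
# Voronoi-cell membership check for the choice version of the unfolding model.
import numpy as np

# Two choice sets (two Voronoi diagrams), three alternatives each.
diagrams = [
    np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]]),
    np.array([[0.5, 0.5], [2.5, 0.5], [1.5, -1.0]]),
]
x = np.array([1.8, 0.3])     # one individual's ideal point
choices = [1, 1]             # alternative picked in each choice set

# The model holds for this individual iff, in every diagram, the chosen
# alternative is the nearest one (i.e., x lies in its Voronoi cell).
in_cell = [int(np.argmin(np.linalg.norm(Y - x, axis=1))) == c
           for Y, c in zip(diagrams, choices)]
print(all(in_cell))   # → True
```

With many choice sets, each additional diagram intersects the feasible region further, which is exactly the superimposition argument made below.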
Superimposing the diagrams creates smaller and smaller regions that each individual must be in, and the unfolding model requires the intersection of the Voronoi cells determined by the choices of any individual to be nonempty.

It is perhaps simplest to apply this idea to binary choices. The Voronoi cells in this case are half-spaces defined by hyperplanes dividing $\mathbb{R}^p$ in two parts. All individuals choosing the first of the two alternatives must be on one side of the hyperplane, all others must be on the other side. There is a hyperplane for each choice.

FIGURE 4. Unfolding Binary Data

This is the nonmetric factor analysis model studied first by Coombs and Kao [5]. It is illustrated in Figure 4.

The prototype here is roll call data [3]. If 100 US senators vote on 20 issues, then the unfolding model says that (for a representation in the plane) there are 100 points and 20 lines, such that each issue-line separates the "aye" and the "nay" voters for that issue. Unfolding in this case can be done by correspondence analysis, or by maximum likelihood logit or probit techniques. We give an example, using 20 issues selected by Americans for Democratic Action, and the 2000 US Senate.

FIGURE 5. The 2000 US Senate

REFERENCES

[1] I. Borg and P.J.F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer, New York, 1997.
[2] F.M.T.A. Busing, P.J.F. Groenen, and W.J. Heiser. Avoiding Degeneracy in Multidimensional Unfolding by Penalizing on the Coefficient of Variation. Psychometrika, 2004.
[3] J. Clinton, S. Jackman, and D. Rivers. The Statistical Analysis of Roll Call Data. American Political Science Review, (in press).
[4] C.H. Coombs. A Theory of Data. Wiley, 1964.
[5] C.H. Coombs and R.C. Kao. Nonmetric Factor Analysis. Engineering Research Bulletin 38, Engineering Research Institute, University of Michigan, Ann Arbor, 1955.
[6] M.J. Greenacre and M.W. Browne. An Efficient Alternating Least-Squares Algorithm to Perform Multidimensional Unfolding. Psychometrika, 51:241–250, 1986.
[7] W.J. Heiser. Unfolding Analysis of Proximity Data. PhD thesis, University of Leiden, 1981.
[8] J.B. Kruskal and J.D. Carroll. Geometrical Models and Badness-of-Fit Functions. In P.R. Krishnaiah, editor, Multivariate Analysis, Volume II, pages 639–671. North Holland Publishing Company, 1969.
[9] E.E.Ch.I. Roskam. Metric Analysis of Ordinal Data in Psychology. PhD thesis, University of Leiden, 1968.
[10] P.H. Schönemann. On Metric Multidimensional Unfolding. Psychometrika, 35:349–366, 1970.
[11] Y. Takane, F.W. Young, and J. de Leeuw. Nonmetric Individual Differences in Multidimensional Scaling: An Alternating Least Squares Method with Optimal Scaling Features. Psychometrika, 42:7–67, 1977.

DEPARTMENT OF STATISTICS, UNIVERSITY OF CALIFORNIA, LOS ANGELES, CA 90095-1554
E-mail address: deleeuw@stat.ucla.edu
URL: http://gifi.stat.ucla.edu