Simplicity transformations and typical rank of three-way arrays, with applications to Tucker-3 analysis, CANDECOMP/PARAFAC, and INDSCAL

Jos M.F. ten Berge
University of Groningen, The Netherlands
Heijmans Institute, University of Groningen, Grote Kruisstraat 2/1, 9712TS Groningen,
Netherlands, tberge@ppsw.rug.nl
It is well-known that a matrix X, with SVD X=PDQ', can be transformed to a diagonal matrix
D=P'XQ. The transformation brings about the maximum simplicity (in terms of number of zero
elements) that can be obtained by nonsingular transformations. As a bonus, it also reveals the
rank of X as the number of nonzero elements in D. For three-way arrays, consisting of K slices
X1,...,XK, similar simplicity transformations are possible, but they are more complicated. In addition,
such transformations can be helpful in finding the rank of the three-way array. The topics of
simplicity and rank of three-way arrays have implications for the analysis of three-way arrays
by Tucker three-way PCA (3PCA), for CANDECOMP/PARAFAC (CP), and, notably, for
hybrid models in between the two. The present paper gives a summary of results and
demonstrates their application in a number of examples.
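The two-way result described above is easy to verify numerically. The sketch below (Python with NumPy; the test matrix and its rank are arbitrary choices for illustration) forms D = P'XQ from the SVD and reads off the rank as the number of nonzero diagonal elements:

```python
import numpy as np

# Build a 5x4 matrix of rank 3 (almost surely) as a product of random factors.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))

# SVD X = P D Q'; then P'XQ recovers the diagonal matrix of singular values.
P, d, Qt = np.linalg.svd(X, full_matrices=False)
D = P.T @ X @ Qt.T

assert np.allclose(D, np.diag(d))   # maximal simplicity: only diagonal nonzeros
rank = int(np.sum(d > 1e-10))       # rank = number of nonzero elements of D
assert rank == 3
```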
The organization of this paper is as follows. First, the main features of CP, 3PCA, and
hybrid models in between will be reviewed. Next, maximal simplicity results and typical rank
results for three-way arrays will be discussed. Finally, some applications, revolving around
issues of non-triviality and uniqueness of hybrid models, will be discussed.
CANDECOMP/PARAFAC (CP)
Carroll & Chang (1970) and Harshman (1970) independently proposed CANDECOMP and
PARAFAC, respectively. For any given number of components R, CP yields component
matrices A, B, and C, with R columns such that, for the decomposition
Xk=ACkB+Ek,
where Ck=diag(C(k,:)), k=1,…,K, the residual sum of squares tr(Ek'Ek) is a minimum.
(1)
CP is said to have the property of essential uniqueness. That is, rescaling the columns of A (or B or C) by a nonsingular diagonal matrix L is allowed, provided that the inverse of L is accounted for somewhere else. For instance, ACkB' = ALL⁻¹CkB' = (AL)Ck(BL⁻¹)', because diagonal matrices commute. Similarly, simultaneous permutations of the columns of A and B and of the diagonal elements of Ck, k=1,...,K, are allowed. But apart from rescaling and permuting columns of A, B, and C, there is no transformational freedom in CP.
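The scaling indeterminacy can be checked numerically. A minimal sketch (Python with NumPy; the dimensions, the weights, and the diagonal matrix L are arbitrary) verifies that rescaling A by L and B by the inverse of L leaves the slice ACkB' unchanged:

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, R = 4, 5, 3
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
Ck = np.diag(rng.standard_normal(R))   # diagonal weights of slice k
L = np.diag([2.0, -0.5, 3.0])          # arbitrary nonsingular diagonal rescaling

# A Ck B' = (A L) Ck (B L^-1)' because diagonal matrices commute.
Xk = A @ Ck @ B.T
Xk_rescaled = (A @ L) @ Ck @ (B @ np.linalg.inv(L)).T

assert np.allclose(Xk, Xk_rescaled)
```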
Although CP uniqueness is not the topic of this paper, a brief digression may be in
order. The first uniqueness proofs of CP date back to Harshman (1972) and Kruskal (1977),
who showed that CP has essential uniqueness, as explained above, under mild sufficient
conditions. After decades of silence, the topic of CP uniqueness has recently been revived.
Sidiropoulos & Bro (2000) have given a short-cut proof for Kruskal's sufficient condition, and generalized it to n-way arrays (n>3). Ten Berge & Sidiropoulos (2002) have shown that Kruskal's sufficient conditions are necessary for R=2 or 3, but not for R>3.
CP has acquired great popularity from its application to arrays with symmetric slices.
The INDSCAL method of Carroll & Chang (1970) relies on CP, hoping that it will yield A and
B equal. In practice, this always seems to work. Then each (symmetric) slice is decomposed as
Xk=ACkA+Ek,
(2)
Ck diagonal, k=1,…,K, in the least squares sense. Again, INDSCAL has essential uniqueness.
Simultaneous permutations and column-wise rescaling are allowed.
Tucker three-way PCA
Tucker three-way PCA (3PCA) was proposed by Tucker (1966). Kroonenberg and De Leeuw (1980) have offered an alternating least squares algorithm. It yields matrices A, B, and C, with P, Q, and R columns, and a P×Q×R core array G, such that the sum of squares of elements of E is minimized in the decomposition
X = Σp=1..P Σq=1..Q Σr=1..R gpqr (ap ∘ bq ∘ cr) + E.    (3)
It is important to note that CP is a constrained version of 3PCA, where P=Q=R and the core
array is unit superdiagonal: gpqr=0 unless p=q=r, in which case gpqr=1. Hence, the i-th slice of G is eiei', where ei is column i of I. In 3PCA, there are no uniqueness properties like those we have for CP. This can be seen from the matrix form of 3PCA, where X=[X1|…|XK] is decomposed
as
X=[X1|…|XK]=AG(CB)+[E1|...|EK],
(4)
with G=[G1|...|GR], the matricized core array. If S, T, and U are nonsingular matrices, we may
equivalently write
X=AG(CB)=A(S)-1SG(UT)(U-1T-1)(CB).
(5)
Clearly, we may switch to new component matrices A(S)-1, B(T)-1, and C(U)-1, associated
with a new core array SG(UT). This invariance property implies that there is no uniqueness
in 3PCA. Without loss of fit, we may premultiply all slices by S, postmultiply them by T, and .
mix them by U. These transformations are called Tucker transformations (Rocci & Ten Berge,
2002).
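The invariance property in (5) can be verified numerically. In the sketch below (Python with NumPy; all sizes and the transformation matrices S, T, U are arbitrary), the counter-transformed component matrices and core reproduce X exactly:

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 4, 3, 2          # data dimensions
P, Q, R = 3, 2, 2          # numbers of components
A = rng.standard_normal((I, P))
B = rng.standard_normal((J, Q))
C = rng.standard_normal((K, R))
G = rng.standard_normal((P, Q * R))   # matricized core [G1|...|GR]

S = rng.standard_normal((P, P))       # nonsingular almost surely
T = rng.standard_normal((Q, Q))
U = rng.standard_normal((R, R))

X = A @ G @ np.kron(C, B).T           # X = A G (C kron B)'

# Tucker transformation: counter-transform the components and the core.
A2 = A @ np.linalg.inv(S)
B2 = B @ np.linalg.inv(T).T
C2 = C @ np.linalg.inv(U).T
G2 = S @ G @ np.kron(U, T)

assert np.allclose(X, A2 @ G2 @ np.kron(C2, B2).T)   # fit is unchanged
```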
Hybrid models in between 3PCA and CP.
To introduce the concept of hybrid models, an example will be instructive. Consider a 3×3×3
core array G of 3PCA in matrix form: All of its elements, denoted by ?, are unconstrained:
? ? ? ? ? ? ? ? ?
G= ? ? ? ? ? ? ? ? ?
? ? ? ? ? ? ? ? ?
(6)
In CP, the i-th slice of the core is eiei', which gives the fully determined core
1 0 0 0 0 0 0 0 0
G= 0 0 0 0 1 0 0 0 0 .
0 0 0 0 0 0 0 0 1
(7)
Hybrid models have partly constrained cores. For instance, we have 6 free parameters in
1 0 0 0 0 ? 0 ? 0
G= 0 0 ? 0 1 0 ? 0 0 .
0 ? 0 ? 0 0 0 0 1
(8)
When dealing with such partly constrained models, in between CP and 3PCA, two questions are of interest. First, there is the question of (non-)triviality: do Tucker transformations of an unconstrained solution exist which satisfy the constraints? If this is the case, we can first run 3PCA, and instill the constraints afterwards, using the freedom of transformation. Secondly, there is the question of uniqueness: are Tucker transformations possible that preserve the pattern of zeros? If only trivial transformations (rescalings and permutations) preserve it, no transformational freedom is left. Obviously, both non-triviality and uniqueness are desirable characteristics of a constrained core. In the particular hybrid core of (8), the model seems to be trivial with positive probability, and it is essentially unique, just like CP (Kiers, Ten Berge & Rocci, 1997). A discussion of this type of hybrid core in the context of loglinear modeling can be found in Wong (2001).
An even better understood case of constrained cores is that of order 3×3×2. When the two slices can be diagonalized simultaneously, it is possible to arrive at the form

? 0 0   0 0 0
0 ? 0   0 ? 0
0 0 0   0 0 ?
(9)

But it is always possible to transform the array to the form

1 0 0   0 0 0
0 1 0   0 0 x
0 0 1   0 y 0
(10)

with x² = y². Hence, such cores are trivial. They are partly unique (Rocci & Ten Berge, 2002).
A third example arose in Chemometrics: Gurden, Westerhuis, Bijlsma & Smilde (2001)
used a hybrid core of the form
1
0

G= 0

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0

0
1
(11)
It will be shown below that this core is non-trivial.
Although answers to questions of non-triviality and uniqueness can sometimes be given
on an ad hoc basis, general rules to decide about triviality and uniqueness are to be preferred.
Some rules can be obtained from maximal simplicity results and some from typical rank results.
The next sections will be devoted to such rules.
Tucker transformations to maximal simplicity.
The search for methods to simplify three-way core arrays started with numerical approaches.
For instance, Kiers (1998) derived three-way Simplimax, an iterative rotation method aimed at
finding a core with a specified (small) number of nonzero elements. However, iterative
orthonormalization (Ten Berge, Kiers, Murakami & Van der Heijden, 2000) has also been
successful in a variety of cases.
The results from these iterative procedures led to a search for direct algebraic solutions
for simplicity transformations. In some cases, direct solutions are straightforward. For instance,
consider a core of order 5×2×2. It is trivial to find S such that

        1 0                 0 0
        0 1                 0 0
SX1 =   0 0   and   SX2 =   0 0 .
        0 0                 1 0
        0 0                 0 1
(12)
(Construct a 5×5 matrix by putting an extra random column in between X1 and X2, and take S as the inverse of that matrix.) In this case, there is no need to invoke T or U. However, cases
like this (P>QR) do not arise in 3PCA, because they would represent a case of overfactoring.
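The parenthetical construction above can be sketched in a few lines (Python with NumPy; random slices stand in for an arbitrary 5×2×2 core):

```python
import numpy as np

rng = np.random.default_rng(3)
X1 = rng.standard_normal((5, 2))
X2 = rng.standard_normal((5, 2))
x = rng.standard_normal((5, 1))       # extra random column

M = np.hstack([X1, x, X2])            # 5x5, nonsingular almost surely
S = np.linalg.inv(M)                  # S M = I5

I5 = np.eye(5)
assert np.allclose(S @ X1, I5[:, :2])   # columns e1 and e2 of I5
assert np.allclose(S @ X2, I5[:, 3:])   # columns e4 and e5 of I5
```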
A more complicated class of cases arises when the array has two slices, with 2Q>P>Q.
Ten Berge and Kiers (1999) have shown how to find transformation matrices S and T such that
        IQ                  O
SX1T =        and   SX2T =
        O                   IQ
(13)

where IQ denotes the identity matrix of order Q and the zero blocks O are of order (P−Q)×Q.
Again, the solution is non-unique, because U has not been invoked. Further explicit simplicity results have been given in Rocci & Ten Berge (2002). More importantly, they have considered the question of what constitutes maximal simplicity. For instance, they proved that P×Q×2 arrays with 2Q>P>Q cannot have fewer than 2Q nonzero elements, as in (13). This is a powerful result, because all hybrid cores that impose more simplicity than is maximally attainable are non-trivial. Unfortunately, except for two-slice arrays, very little has been achieved in the way of maximal simplicity results. There is an alternative: we may also use typical rank results.
Typical rank of three-way arrays
Each pair of vectors defines a rank one matrix, and vice versa. A rank-one matrix has
proportional rows and proportional columns. The rank of a matrix X is the smallest number of
rank-one matrices generating X as their sum. These matrix concepts immediately transfer to
three-way arrays. Specifically, each triple of vectors defines a three-way array of rank one, and
vice versa. A rank-one array has proportional slices in each direction. Parallel to the definition
of matrix rank, the rank of three-way array X is defined as the smallest number of rank-one
arrays generating X as their sum (Kruskal, 1977, 1983, 1989). For example, the 3×3×2 array X with
2
[X1|X2]=  3

 4
3
1
1
5
7
1
1
3
5
 1
7  1
11  1
3
(14)
has rank 2 because it can be written as the sum of two rank-one arrays

1 2 0   2  4 0       1 1 1   -1 -1 -1
2 4 0   4  8 0   +   1 1 1   -1 -1 -1 .
3 6 0   6 12 0       1 1 1   -1 -1 -1
(15)
The rank of a three-way array has a direct link to CP: it is the smallest number of components that suffices for a full (zero-residual) CP decomposition.
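As a numerical check, the rank-2 example above can be rebuilt from its two rank-one arrays, each given by a triple of vectors (a minimal Python/NumPy sketch; the vectors are read off from the decomposition):

```python
import numpy as np

# Rank-one array a∘b∘c: slice k equals c[k] * a b'.
a1, b1, c1 = np.array([1, 2, 3]), np.array([1, 2, 0]), np.array([1, 2])
a2, b2, c2 = np.array([1, 1, 1]), np.array([1, 1, 1]), np.array([1, -1])

slices = [c1[k] * np.outer(a1, b1) + c2[k] * np.outer(a2, b2) for k in range(2)]
X = np.stack(slices)                 # X[k] is the k-th 3x3 slice

assert np.allclose(X[0], [[2, 3, 1], [3, 5, 1], [4, 7, 1]])
assert np.allclose(X[1], [[1, 3, -1], [3, 7, -1], [5, 11, -1]])
```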
Usually, matrices have the maximal rank they can have, given their order. For instance, a randomly generated 7×5 matrix has rank 5 almost surely. This does not hold for three-way arrays. There is a gap between typical rank (the rank an array format has with positive probability) and maximal rank (Kruskal, 1983). For instance, a 4×4×2 array has maximal rank 6, and typical rank 5. Kruskal (1989) has given a general expression for the maximal rank of P×Q×2 arrays. In the present context, we focus on typical rank, which is more relevant from a practical point of view.
The first results on typical rank go back to Kruskal (1983). Recently, a few general
principles have been obtained. Because Tucker transformations leave the rank of an array
unaffected, transformations to simplicity have proven very helpful in the study of typical rank.
The tables below demonstrate some of the results. Table 1 is based on Ten Berge & Kiers (1999), who solved the typical rank issue for all P×Q×2 arrays. Table 2 gives some values for P×Q×3 arrays (Ten Berge, 2000).
Table 1. Typical rank of P×Q×2 arrays

       Q=2     Q=3     Q=4
P=2   {2,3}     3       4
P=3     3     {3,4}     4
P=4     4       4     {4,5}
P=5     4       5       5
P=6     4       6       6
P=7     4       6       7
P=8     4       6       8
P=9     4       6       8
Table 2. Typical rank of P×Q×3 arrays

        Q=3     Q=4     Q=5
P=5      ?       ?       7
P=6      6       ?       ?
P=7      7       ?       ?
P=8      8     {8,9}     ?
P=9      9       9       ?
P=10     9      10      10
P=11     9      11      11
P=12     9      12      12
The entry 7 for the typical rank of 5×5×3 arrays is based on unpublished work. It has been inserted because it is relevant for the constrained array (11). That array is the sum of 5 linearly independent rank-one arrays, hence its rank is five. However, arrays of format 5×5×3 almost surely have rank 7 or higher. We may thus infer that the core array of Gurden et al. is non-trivial (Ten Berge & Smilde, 2001). This demonstrates that typical rank results can answer questions of non-triviality of hybrid core arrays in between 3PCA and CP.
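As an illustration of the rank-five claim (a sketch under the assumption, reconstructed here, that the five unit elements of core (11) sit as one diagonal unit in slice 1, one in slice 2, and three in slice 3), the constrained core is a sum of five rank-one arrays of the form ei∘ei∘ck:

```python
import numpy as np

e = np.eye(5)                    # columns e1,...,e5
c = np.eye(3)                    # unit slice-weight vectors
which_slice = [0, 1, 2, 2, 2]    # assumed slice of each unit element

# Sum of five rank-one arrays e_i ∘ e_i ∘ c_k; G[k] is the k-th 5x5 slice.
G = np.zeros((3, 5, 5))
for i, k in enumerate(which_slice):
    G += c[k][:, None, None] * np.outer(e[i], e[i])

# Slice 1 = e1 e1', slice 2 = e2 e2', slice 3 = e3 e3' + e4 e4' + e5 e5'.
assert G.sum() == 5 and G[0, 0, 0] == 1 and G[2, 4, 4] == 1
```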
Discussion
The results discussed above do not apply to arrays of symmetric slices, as arise in INDSCAL.
Typical rank results for arrays with symmetric slices differ from those of their non-symmetric
counterparts. In general, it seems that symmetry entails a lower typical rank, which means that
we need fewer components for a full INDSCAL decomposition than for CP. However, in some
cases the typical rank of symmetric arrays is as high as that of their non-symmetric
counterparts. Further details can be found in Ten Berge, Sidiropoulos and Rocci (2001).
Carroll, J.D. & Chang, J.J. (1970). Analysis of individual differences in multidimensional
scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika, 35,
283-319.
Gurden, S.P., Westerhuis, J.A., Bijlsma, S. & Smilde, A. (2001). Modeling of spectroscopic
batch process data using grey models to incorporate external information. Journal of
Chemometrics, 15, 101-121.
Harshman, R.A. (1970). Foundations of the PARAFAC procedure: Models and conditions for
an "explanatory" multi-mode factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.
Harshman, R.A. (1972). Determination and proof of minimum uniqueness conditions for
PARAFAC1. UCLA Working Papers in Phonetics, 22, 111-117.
Kiers, H.A.L. (1998). Three-way SIMPLIMAX for oblique rotation of the three-mode factor
analysis core to simple structure. Computational Statistics & Data Analysis, 28, 307-324.
Kiers, H.A.L., Ten Berge, J.M.F. & Rocci, R. (1997). Uniqueness of three-mode factor models
with sparse cores: The 3×3×3 case. Psychometrika, 62, 349-374.
Kroonenberg, P.M. & De Leeuw, J. (1980). Principal component analysis of three-mode data by
means of alternating least-squares. Psychometrika, 45, 69-97.
Kruskal, J.B. (1977). Three-way arrays: Rank and uniqueness of trilinear decompositions with
applications to arithmetic complexity and statistics. Lin. Alg. & Appl., 18, 95-138.
Kruskal, J.B. (1983). Statement of some current results about three-way arrays. (Unpublished).
Kruskal, J.B. (1989). Rank, decomposition, and uniqueness for 3-way and N-way arrays. In R.
Coppi and S. Bolasco (Eds.), Multiway data analysis (pp. 7-18). Amsterdam: North-Holland.
Rocci, R. & Ten Berge, J.M.F. (2002). Transforming three-way arrays to maximal simplicity.
Psychometrika, 67, in press.
Sidiropoulos, N.D. & Bro, R. (2000). On the uniqueness of multilinear decomposition of N-way
arrays. Journal of Chemometrics, 14, 229-239.
Ten Berge, J.M.F. (2000). The typical rank of tall three-way arrays. Psychometrika, 65, 525-532.
Ten Berge, J.M.F. & Kiers, H.A.L. (1999). Simplicity of core arrays in three-way principal
component analysis and the typical rank of P×Q×2 arrays. Lin. Alg. & Appl., 294, 169-179.
Ten Berge, J.M.F., Kiers, H.A.L., Murakami, T. & Van der Heijden, R. (2000). Transforming
three-way arrays to multiple orthonormality. Journal of Chemometrics 14, 275-284.
Ten Berge, J.M.F. & Sidiropoulos, N. D. (2002). On uniqueness in CANDECOMP/PARAFAC.
Psychometrika, 67, in press.
Ten Berge, J.M.F., Sidiropoulos, N.D. & Rocci, R. (2001). Typical rank and INDSCAL dimensionality for symmetric three-way arrays of order I×2×2 or I×3×3 (submitted for publication).
Ten Berge, J.M.F. & Smilde, A.K. (2001). Non-triviality and identification of a constrained
Tucker-3 analysis. (submitted for publication).
Tucker, L. R., (1966). Some mathematical notes on three-mode factor analysis. Psychometrika,
31, 279-311.
Wong, R. S-K. (2001). Multidimensional association models. A multilinear approach.
Sociological Methods & Research, 30, 197-240.