APPENDIX E COMPARISON OF PRINCIPAL COMPONENTS

Assume two random vectors, $X \in \mathbb{R}^m$ and $Y \in \mathbb{R}^m$, have covariance matrices $\Sigma_X$ and $\Sigma_Y$, respectively. Suppose the coefficients for the first $p$ principal components (PCs) of $X$ and $Y$ are $\{\alpha_1, \dots, \alpha_p\}$ and $\{\beta_1, \dots, \beta_p\}$, respectively; that is, $\{\alpha_1, \dots, \alpha_p\}$ are the first $p$ orthonormal eigenvectors of $\Sigma_X$, and $\{\beta_1, \dots, \beta_p\}$ are the first $p$ orthonormal eigenvectors of $\Sigma_Y$. To compare these two sets of PCs, it is necessary to compare the two subspaces spanned by $\{\alpha_1, \dots, \alpha_p\}$ and $\{\beta_1, \dots, \beta_p\}$, respectively. The following two propositions from [Krzanowski, 1979] provide a rigorous way to do so.

Proposition E.1 [Krzanowski, 1979]: Denote $L = [\alpha_1, \dots, \alpha_p]$ and $M = [\beta_1, \dots, \beta_p]$. The minimum angle between an arbitrary vector in the subspace $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and another arbitrary vector in the subspace $\mathrm{span}\{\beta_1, \dots, \beta_p\}$ is given by $\cos^{-1}(\sqrt{\lambda_1})$, where $\lambda_1$ is the largest eigenvalue of $K = L^T M M^T L$.

Proof: Arbitrarily select one vector from $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and represent it as $w_1 = L v_1$; then the projection of $w_1$ onto $\mathrm{span}\{\beta_1, \dots, \beta_p\}$ is given by $M M^T w_1$. By the geometric property of projection, finding the minimum angle between two arbitrary vectors in $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \dots, \beta_p\}$ amounts to finding $w_1$ such that the angle between $w_1$ and $M M^T w_1$, say $\theta_1$, is minimal. From

\[ \cos(\theta_1) = \frac{w_1^T M M^T w_1}{\|w_1\| \, \|M M^T w_1\|} \quad \text{and} \quad L^T L = M^T M = I, \]

it follows that

\[ \cos^2(\theta_1) = \frac{v_1^T L^T M M^T L v_1}{v_1^T v_1}. \]

So minimizing $\theta_1$ is equivalent to maximizing $v_1^T L^T M M^T L v_1$, subject to $v_1^T v_1 = 1$. By the well-known result on maximizing a quadratic form under a unit-norm constraint, the maximum is attained when $v_1$ is the first eigenvector of $K = L^T M M^T L$, in which case $\cos^2(\theta_1) = \lambda_1$, the largest eigenvalue of $K$.

One point to note is that all eigenvalues $\lambda_i$ of $K$ satisfy $0 \le \lambda_i \le 1$, which is verified as follows:

\[ L^T M M^T L x = \lambda x \;\Rightarrow\; \lambda = \frac{x^T L^T M M^T L x}{x^T x} = \frac{y^T M M^T y}{x^T x}, \]

where $y = Lx$ and $x$ is an eigenvector of $K$ with eigenvalue $\lambda$.
$M^T y$ is the vector of projection coefficients of $y$ onto $\mathrm{span}\{\beta_1, \dots, \beta_p\}$, with $\{\beta_1, \dots, \beta_p\}$ being an orthonormal set, so $\|M^T y\| \le \|y\|$ and

\[ 0 \le \lambda = \frac{y^T M M^T y}{x^T x} = \frac{\|M^T y\|^2}{x^T x} \le \frac{\|y\|^2}{x^T x} = \frac{x^T L^T L x}{x^T x} = 1. \]

■

Proposition E.2 [Krzanowski, 1979]: Let $L$, $M$, and $K$ be defined as in Proposition E.1, and let $\lambda_i$, $v_i$ be the $i$-th largest eigenvalue and corresponding eigenvector of $K$, respectively. Take $w_i = L v_i$; then $\{w_1, \dots, w_p\}$ and $\{M M^T w_1, \dots, M M^T w_p\}$ form sets of orthogonal vectors in $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \dots, \beta_p\}$, respectively. The angle between the $i$-th pair $w_i$, $M M^T w_i$ is given by $\cos^{-1}(\sqrt{\lambda_i})$.

Proposition E.1 shows that $w_1$ and $M M^T w_1$ give the two closest vectors when one is constrained to be in the subspace $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and the other in $\mathrm{span}\{\beta_1, \dots, \beta_p\}$. It follows that $w_2$ and $M M^T w_2$ give directions, orthogonal to the previous ones, between which lies the next smallest angle between the subspaces.

Proof: Arbitrarily select one vector from $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ that is orthogonal to $w_1$, and represent it as $w_2 = L v_2$; then the projection of $w_2$ onto $\mathrm{span}\{\beta_1, \dots, \beta_p\}$ is given by $M M^T w_2$. The angle between $w_2$ and $M M^T w_2$, say $\theta_2$, satisfies

\[ \cos^2(\theta_2) = \frac{v_2^T L^T M M^T L v_2}{v_2^T v_2}. \]

So we need to maximize $v_2^T L^T M M^T L v_2$, subject to $v_2^T v_2 = 1$ and $v_2^T v_1 = 0$. By the same Lagrange multiplier method used for the PCA construction in [Appendix B], the optimal solution is $\cos^2(\theta_2) = \lambda_2$, attained when $v_2$ is the second eigenvector of $K = L^T M M^T L$. It is also true that $w_1 \perp w_2$, because $w_1^T w_2 = v_1^T L^T L v_2 = v_1^T v_2 = 0$, and that $M M^T w_1 \perp M M^T w_2$, because

\[ w_2^T M M^T M M^T w_1 = w_2^T M M^T w_1 = v_2^T L^T M M^T L v_1 = \lambda_1 v_2^T v_1 = 0. \]

Continuing in this way, the conclusion of the proposition is reached. ■

Let $\theta_{ij}$ be the angle between $\alpha_i$ and $\beta_j$, i.e., $\cos(\theta_{ij}) = \alpha_i^T \beta_j$. Then $L^T M = \{\cos(\theta_{ij})\}$, the $p \times p$ matrix whose $(i, j)$ entry is $\cos(\theta_{ij})$, so we have

\[ \sum_{i=1}^{p} \lambda_i = \mathrm{trace}(L^T M M^T L) = \sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\theta_{ij}). \]

Thus the sum of the eigenvalues of $K = L^T M M^T L$ equals the sum of squares of the cosines of the angles between each basis element of $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and each basis element of $\mathrm{span}\{\beta_1, \dots, \beta_p\}$.
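This sum is invariant with respect to the choice of orthonormal basis for $\mathrm{span}\{\alpha_1, \dots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \dots, \beta_p\}$. In more detail, let

\[ \tilde{L} = [\tilde{\alpha}_1, \dots, \tilde{\alpha}_p] = [\alpha_1, \dots, \alpha_p] P = LP, \qquad \tilde{M} = [\tilde{\beta}_1, \dots, \tilde{\beta}_p] = [\beta_1, \dots, \beta_p] Q = MQ, \]

where $P$, $Q$ are $p \times p$ orthogonal matrices, i.e., $P^T P = P P^T = I$ and $Q^T Q = Q Q^T = I$. If $\tilde{\theta}_{ij}$ is the angle between $\tilde{\alpha}_i$ and $\tilde{\beta}_j$, then

\[
\begin{aligned}
\sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\tilde{\theta}_{ij})
&= \mathrm{trace}(\tilde{L}^T \tilde{M} \tilde{M}^T \tilde{L})
 = \mathrm{trace}(P^T L^T M Q Q^T M^T L P) \\
&= \mathrm{trace}(P^T L^T M M^T L P)
 = \mathrm{trace}(M^T L P P^T L^T M) \\
&= \mathrm{trace}(M^T L L^T M)
 = \mathrm{trace}(L^T M M^T L)
 = \sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\theta_{ij}).
\end{aligned}
\]

So, this sum can be used as a measure of total similarity between the two subspaces. It can be checked that if $\mathrm{span}\{\alpha_1, \dots, \alpha_p\} = \mathrm{span}\{\beta_1, \dots, \beta_p\}$, then $\sum_{i=1}^{p} \lambda_i = p$, and if $\mathrm{span}\{\alpha_1, \dots, \alpha_p\} \perp \mathrm{span}\{\beta_1, \dots, \beta_p\}$, then $\sum_{i=1}^{p} \lambda_i = 0$.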
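The quantities in this appendix can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from the original work: the helper names `leading_eigenvectors` and `compare_subspaces` are hypothetical, and the two covariance matrices are synthetic random matrices chosen only for illustration. It forms $K = L^T M M^T L$, reads off the angles $\cos^{-1}(\sqrt{\lambda_i})$, and uses the trace of $K$ as the total similarity.

```python
import numpy as np


def leading_eigenvectors(cov, p):
    """Return an m x p matrix whose columns are the orthonormal eigenvectors
    of the symmetric matrix `cov` with the p largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]    # indices of the p largest
    return eigvecs[:, order]


def compare_subspaces(L, M):
    """Compare span(L) and span(M) for m x p matrices with orthonormal columns.

    Returns the eigenvalues of K = L^T M M^T L in descending order, the
    corresponding angles arccos(sqrt(lambda_i)) in radians, and trace(K).
    """
    K = L.T @ M @ M.T @ L
    lam = np.linalg.eigvalsh(K)[::-1]        # descending order
    # Clip only inside arccos to guard against tiny round-off outside [0, 1].
    angles = np.arccos(np.sqrt(np.clip(lam, 0.0, 1.0)))
    return lam, angles, float(np.trace(K))


# Synthetic example (assumption): two random positive semidefinite matrices.
rng = np.random.default_rng(0)
m, p = 6, 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))
sigma_x = A @ A.T
sigma_y = B @ B.T

L = leading_eigenvectors(sigma_x, p)
M = leading_eigenvectors(sigma_y, p)
lam, angles, total = compare_subspaces(L, M)

print("eigenvalues of K:", lam)
print("angles (degrees):", np.degrees(angles))
print("total similarity:", total)
```

Rotating either basis by an orthogonal matrix, for example replacing `L` with `L @ P` where `P` comes from a QR factorization of a random $p \times p$ matrix, should leave `total` unchanged up to round-off, in line with the invariance argument above.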