APPENDIX E COMPARISON OF PRINCIPAL COMPONENTS

Assume two random vectors, $X \in \mathbb{R}^m$ and $Y \in \mathbb{R}^m$, have covariance matrices $\Sigma_X$
and $\Sigma_Y$, respectively. Suppose the coefficients for the first $p$ principal components (PCs)
of $X$ and $Y$ are $\{\alpha_1, \ldots, \alpha_p\}$ and $\{\beta_1, \ldots, \beta_p\}$, respectively; that is,
$\{\alpha_1, \ldots, \alpha_p\}$ are the first $p$ orthonormal eigenvectors of $\Sigma_X$, and
$\{\beta_1, \ldots, \beta_p\}$ are the first $p$ orthonormal eigenvectors of $\Sigma_Y$. To compare these two
sets of PCs, it is necessary to compare the two subspaces spanned by $\{\alpha_1, \ldots, \alpha_p\}$ and
$\{\beta_1, \ldots, \beta_p\}$, respectively. The following two propositions from [Krzanowski, 1979]
give one rigorous way to do so.
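As an illustration of the setup above, the following minimal sketch extracts the first $p$ orthonormal eigenvectors of two covariance matrices with NumPy. The helper name `leading_eigenvectors` and the two small covariance matrices are hypothetical, chosen only to make the example concrete; they are not taken from the source.

```python
import numpy as np

def leading_eigenvectors(cov, p):
    """Return an m x p matrix whose columns are orthonormal eigenvectors of the
    symmetric matrix `cov` belonging to its p largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues come back in ascending order
    order = np.argsort(eigvals)[::-1][:p]     # indices of the p largest eigenvalues
    return eigvecs[:, order]

# Hypothetical covariance matrices (illustrative values only).
sigma_x = np.diag([3.0, 2.0, 1.0])
sigma_y = np.array([[3.0, 0.0, 0.0],
                    [0.0, 1.5, 0.5],
                    [0.0, 0.5, 1.5]])

p = 2
L = leading_eigenvectors(sigma_x, p)   # columns play the role of alpha_1, ..., alpha_p
M = leading_eigenvectors(sigma_y, p)   # columns play the role of beta_1, ..., beta_p

# Both coefficient matrices have orthonormal columns: L^T L = M^T M = I.
print(np.allclose(L.T @ L, np.eye(p)), np.allclose(M.T @ M, np.eye(p)))
```

Eigenvectors are only determined up to sign, so the columns of `L` and `M` may differ in sign between runs or platforms; the subspaces they span do not.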
Proposition E.1 [Krzanowski, 1979]: Denote $L = [\alpha_1, \ldots, \alpha_p]$ and $M = [\beta_1, \ldots, \beta_p]$. The
minimum angle between an arbitrary vector in the subspace $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and another
arbitrary vector in the subspace $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$ is given by $\cos^{-1}(\sqrt{\lambda_1})$, where
$\lambda_1$ is the largest eigenvalue of $K = L^T M M^T L$.
Proof:
Arbitrarily select one vector from $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and represent it as $w_1 = L v_1$; then
the projection of $w_1$ onto $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$ is given by $M M^T w_1$. By the geometry of
orthogonal projection, finding the minimum angle between two arbitrary vectors in
$\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$ amounts to finding $w_1$ such that the angle
between $w_1$ and $M M^T w_1$, say $\theta_1$, is minimal.
From
$$\cos(\theta_1) = \frac{w_1^T M M^T w_1}{\|w_1\| \, \|M M^T w_1\|}$$
and $L^T L = M^T M = I$, it follows that
$$\cos^2(\theta_1) = \frac{v_1^T L^T M M^T L v_1}{v_1^T v_1}.$$
So minimizing $\theta_1$ is equivalent to maximizing $v_1^T L^T M M^T L v_1$ subject to $v_1^T v_1 = 1$. By the
well-known linear algebra result on maximizing a quadratic form over unit vectors, the
closed-form solution is: when $v_1$ is the eigenvector of $K = L^T M M^T L$ belonging to its largest
eigenvalue, $\cos^2(\theta_1) = \lambda_1$, which is the largest eigenvalue of $K$.
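Note that all eigenvalues $\lambda_i$ of $K$ satisfy $0 \le \lambda_i \le 1$. To verify this, let $x$ be an
eigenvector of $K$ with eigenvalue $\lambda$:
$$L^T M M^T L x = \lambda x \;\Rightarrow\; x^T L^T M M^T L x = \lambda x^T x \;\Rightarrow\; y^T M M^T y = \lambda x^T x, \quad \text{where } y = Lx.$$
$M^T y$ contains the projection coefficients of $y$ onto $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$, with
$\{\beta_1, \ldots, \beta_p\}$ being an orthonormal set, so
$$\|M^T y\| \le \|y\| \;\Rightarrow\; \lambda x^T x = y^T M M^T y = \|M^T y\|^2 \le \|y\|^2 = x^T L^T L x = x^T x \;\Rightarrow\; \lambda \le 1.$$
The lower bound $\lambda \ge 0$ holds because $\lambda x^T x = \|M^T y\|^2 \ge 0$.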
■
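The result of Proposition E.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `minimum_angle` and the two example subspaces of $\mathbb{R}^4$ (tilted by 30 and 60 degrees) are hypothetical choices, not part of the source.

```python
import numpy as np

def minimum_angle(L, M):
    """Smallest angle (in radians) between span(L) and span(M), where L and M
    are m x p matrices with orthonormal columns. Also returns the eigenvalues
    of K = L^T M M^T L in ascending order."""
    K = L.T @ M @ M.T @ L
    eigvals = np.linalg.eigvalsh(K)                 # K is symmetric
    lam1 = float(np.clip(eigvals[-1], 0.0, 1.0))    # guard against round-off outside [0, 1]
    return float(np.arccos(np.sqrt(lam1))), eigvals

# Hypothetical example: span(L) = span{e1, e2}; span(M) tilts e1 by 30 degrees
# toward e3 and e2 by 60 degrees toward e4.
a, b = np.radians(30.0), np.radians(60.0)
L = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])
M = np.array([[np.cos(a), 0.0],
              [0.0, np.cos(b)],
              [np.sin(a), 0.0],
              [0.0, np.sin(b)]])

theta1, eigvals = minimum_angle(L, M)
print(np.degrees(theta1))   # about 30 degrees
print(eigvals)              # about [0.25, 0.75], both inside [0, 1]
```

Here $K$ is diagonal with entries $\cos^2(30^\circ) = 0.75$ and $\cos^2(60^\circ) = 0.25$, so the minimum angle is the smaller tilt.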
Proposition E.2 [Krzanowski, 1979]: Let $L$, $M$, and $K$ be defined as in Proposition E.1, and let
$\lambda_i$, $v_i$ be the $i$-th largest eigenvalue and corresponding eigenvector of $K$, respectively.
Take $w_i = L v_i$; then $\{w_1, \ldots, w_p\}$ and $\{M M^T w_1, \ldots, M M^T w_p\}$ form sets of mutually
orthogonal vectors in $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$, respectively. The angle
between the $i$-th pair $w_i$, $M M^T w_i$ is given by $\cos^{-1}(\sqrt{\lambda_i})$. Proposition E.1 shows that $w_1$
and $M M^T w_1$ give the two closest vectors when one is constrained to be in the subspace
$\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and the other in $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$. It follows that $w_2$ and $M M^T w_2$
give directions, orthogonal to the previous ones, between which lies the next smallest
angle between the subspaces.
Proof:
Arbitrarily select one vector from $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ that is orthogonal to $w_1$, and
represent it as $w_2 = L v_2$; then the projection of $w_2$ onto $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$ is given by
$M M^T w_2$. The angle between $w_2$ and $M M^T w_2$, say $\theta_2$, satisfies
$$\cos^2(\theta_2) = \frac{v_2^T L^T M M^T L v_2}{v_2^T v_2}.$$
So we need to maximize $v_2^T L^T M M^T L v_2$ subject to $v_2^T v_2 = 1$ and $v_2^T v_1 = 0$. By the same
Lagrange multiplier method used for the PCA construction in [Appendix B], the optimal
solution is
$$\cos^2(\theta_2) = \lambda_2,$$
attained when $v_2$ is the second eigenvector of $K = L^T M M^T L$.
It is also true that $w_1 \perp w_2$, because $w_1^T w_2 = v_1^T L^T L v_2 = v_1^T v_2 = 0$, and that
$M M^T w_1 \perp M M^T w_2$, because
$w_2^T M M^T M M^T w_1 = w_2^T M M^T w_1 = v_2^T L^T M M^T L v_1 = \lambda_1 v_2^T v_1 = 0$.
Continuing in this way, the conclusion of the proposition is reached.
■
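The construction in Proposition E.2 can be sketched in a few lines of NumPy. This is an illustrative check under assumptions, not the source's implementation: the function name `principal_pairs` and the example bases are hypothetical, and the basis of the first subspace is deliberately rotated so that $K$ is not diagonal.

```python
import numpy as np

def principal_pairs(L, M):
    """For m x p matrices L, M with orthonormal columns, return the angles
    arccos(sqrt(lambda_i)) in increasing order, together with the vectors
    w_i = L v_i (columns of W) and their projections M M^T w_i (columns of PW)."""
    K = L.T @ M @ M.T @ L
    eigvals, V = np.linalg.eigh(K)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder so that lambda_1 is the largest
    eigvals, V = eigvals[order], V[:, order]
    W = L @ V
    PW = M @ (M.T @ W)
    angles = np.arccos(np.sqrt(np.clip(eigvals, 0.0, 1.0)))
    return angles, W, PW

# Hypothetical example in R^4: span(L) = span{e1, e2} with a rotated basis,
# span(M) tilted by 30 and 60 degrees out of that plane.
r = 1.0 / np.sqrt(2.0)
a, b = np.radians(30.0), np.radians(60.0)
L = np.array([[r, r],
              [r, -r],
              [0.0, 0.0],
              [0.0, 0.0]])
M = np.array([[np.cos(a), 0.0],
              [0.0, np.cos(b)],
              [np.sin(a), 0.0],
              [0.0, np.sin(b)]])

angles, W, PW = principal_pairs(L, M)
print(np.degrees(angles))                     # about [30, 60]

# The w_i are orthonormal, and the projections M M^T w_i are mutually orthogonal.
gram_w = W.T @ W
gram_pw = PW.T @ PW
print(np.allclose(gram_w, np.eye(2)), np.allclose(gram_pw, np.diag(np.diag(gram_pw))))
```

The Gram matrix of the projected vectors equals $V^T K V$, which is diagonal with the $\lambda_i$ on its diagonal; this is the orthogonality argument of the proof in matrix form.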
Let $\theta_{ij}$ be the angle between $\alpha_i$ and $\beta_j$, i.e., $\cos(\theta_{ij}) = \alpha_i^T \beta_j$; then $L^T M = \{\cos(\theta_{ij})\}$,
so we have
$$\sum_{i=1}^{p} \lambda_i = \mathrm{trace}(L^T M M^T L) = \sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\theta_{ij}).$$
Thus the sum of the eigenvalues of $K = L^T M M^T L$ equals the sum of squares of
the cosines of the angles between each basis element of $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and each basis
element of $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$. This sum is invariant to the choice of orthonormal basis
for $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\}$ and $\mathrm{span}\{\beta_1, \ldots, \beta_p\}$. In more detail, let
$$\tilde{L} = [\tilde{\alpha}_1, \ldots, \tilde{\alpha}_p] = [\alpha_1, \ldots, \alpha_p] P = LP, \quad \text{and} \quad \tilde{M} = [\tilde{\beta}_1, \ldots, \tilde{\beta}_p] = [\beta_1, \ldots, \beta_p] Q = MQ,$$
where $P$, $Q$ are $p \times p$ orthogonal matrices, i.e., $P^T P = P P^T = I$ and $Q^T Q = Q Q^T = I$.
If $\tilde{\theta}_{ij}$ is the angle between $\tilde{\alpha}_i$ and $\tilde{\beta}_j$, then
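$$\begin{aligned}
\sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\tilde{\theta}_{ij}) &= \mathrm{trace}(\tilde{L}^T \tilde{M} \tilde{M}^T \tilde{L}) = \mathrm{trace}(P^T L^T M Q Q^T M^T L P) \\
&= \mathrm{trace}(P^T L^T M M^T L P) = \mathrm{trace}(M^T L P P^T L^T M) \\
&= \mathrm{trace}(M^T L L^T M) = \mathrm{trace}(L^T M M^T L) = \sum_{j=1}^{p} \sum_{i=1}^{p} \cos^2(\theta_{ij}).
\end{aligned}$$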
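The trace identity and its invariance under a change of orthonormal basis can be verified numerically. The following is a minimal sketch under assumptions: the helper names `similarity` and `random_orthogonal`, the dimensions, and the random seed are hypothetical and serve only as an illustration.

```python
import numpy as np

def similarity(L, M):
    """trace(L^T M M^T L), i.e. the sum of the eigenvalues of K."""
    return float(np.trace(L.T @ M @ M.T @ L))

def random_orthogonal(p, rng):
    """A p x p orthogonal matrix obtained from the QR factorization of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
    return Q

rng = np.random.default_rng(0)
m, p = 6, 3

# Random m x p matrices with orthonormal columns (reduced QR factorization).
L, _ = np.linalg.qr(rng.standard_normal((m, p)))
M, _ = np.linalg.qr(rng.standard_normal((m, p)))

# Change of basis inside each subspace: L~ = L P, M~ = M Q.
P = random_orthogonal(p, rng)
Q = random_orthogonal(p, rng)

s_original = similarity(L, M)
s_rotated = similarity(L @ P, M @ Q)
sum_sq_cos = float(np.sum((L.T @ M) ** 2))                 # sum of cos^2(theta_ij)
sum_eigs = float(np.sum(np.linalg.eigvalsh(L.T @ M @ M.T @ L)))

print(np.isclose(s_original, s_rotated))    # invariance under P and Q
print(np.isclose(s_original, sum_sq_cos))   # trace equals the sum of squared cosines
print(np.isclose(s_original, sum_eigs))     # trace equals the sum of eigenvalues
```

The individual entries of $L^T M$ do change under $P$ and $Q$; only their sum of squares is preserved.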
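So, this sum can be used as a measure of total similarity between the two subspaces. It
can be checked that if $\mathrm{span}\{\alpha_1, \ldots, \alpha_p\} = \mathrm{span}\{\beta_1, \ldots, \beta_p\}$, then $\sum_{i=1}^{p} \lambda_i = p$, and if
$\mathrm{span}\{\alpha_1, \ldots, \alpha_p\} \perp \mathrm{span}\{\beta_1, \ldots, \beta_p\}$, then $\sum_{i=1}^{p} \lambda_i = 0$.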
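The two extreme cases can be checked on a small example. The sketch below assumes hypothetical subspaces of $\mathbb{R}^4$ with $p = 2$; the helper name `similarity` and the rotation angle are illustrative choices only.

```python
import numpy as np

def similarity(L, M):
    """Sum of the eigenvalues of K = L^T M M^T L, computed as its trace."""
    return float(np.trace(L.T @ M @ M.T @ L))

p = 2
E = np.eye(4)
L = E[:, :2]                       # span{e1, e2}

# Same subspace, different orthonormal basis (rotation by an arbitrary angle).
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s],
              [s, c]])
M_same = L @ R

# Orthogonal subspace span{e3, e4}.
M_orth = E[:, 2:]

sim_same = similarity(L, M_same)
sim_orth = similarity(L, M_orth)
print(sim_same)   # p = 2, up to floating-point round-off
print(sim_orth)   # 0.0
```

Values strictly between 0 and $p$ indicate partial overlap of the two subspaces.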