Structure from Motion

Eigen Decomposition and Singular Value Decomposition
Mani Thomas
CISC 489/689
Introduction

- Eigenvalue decomposition
  - Physical interpretation of eigenvalues/eigenvectors
- Singular Value Decomposition
- Importance of SVD
  - Spectral decomposition theorem
  - Matrix inversion
  - Solution to a linear system of equations
  - Solution to a homogeneous system of equations
- SVD applications
What are eigenvalues?

- Given a matrix A, x is an eigenvector and λ is the corresponding eigenvalue if Ax = λx
  - A must be square, and for a non-trivial solution the determinant of (A − λI) must equal zero:

    Ax − λx = 0  ⇒  (A − λI)x = 0

  - The trivial solution is x = 0
  - The non-trivial solution occurs when det(A − λI) = 0
- Are eigenvectors unique?
  - If x is an eigenvector, then βx is also an eigenvector with the same eigenvalue λ:

    A(βx) = β(Ax) = β(λx) = λ(βx)
Calculating the Eigenvectors/values

- Expand det(A − λI) = 0 for a 2 × 2 matrix:

  \det(A - \lambda I) = \det\left( \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)
                      = \det \begin{bmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{bmatrix}
                      = (a_{11} - \lambda)(a_{22} - \lambda) - a_{12} a_{21} = 0

  \lambda^2 - \lambda (a_{11} + a_{22}) + (a_{11} a_{22} - a_{12} a_{21}) = 0

- For a 2 × 2 matrix, this is a simple quadratic equation with two solutions (maybe complex):

  \lambda = \frac{(a_{11} + a_{22}) \pm \sqrt{(a_{11} + a_{22})^2 - 4 (a_{11} a_{22} - a_{12} a_{21})}}{2}

- This “characteristic equation” is solved for the eigenvalues λ
Eigenvalue example

- Consider

  A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}

  \lambda^2 - \lambda (a_{11} + a_{22}) + (a_{11} a_{22} - a_{12} a_{21}) = \lambda^2 - (1 + 4)\lambda + (1 \cdot 4 - 2 \cdot 2) = \lambda^2 - 5\lambda = 0 \;\Rightarrow\; \lambda = 0, \; \lambda = 5

- The corresponding eigenvectors can be computed as

  \lambda = 0: \quad \left( \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1x + 2y \\ 2x + 4y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

  \lambda = 5: \quad \left( \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} - \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} \right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -4x + 2y \\ 2x - 1y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

- For λ = 0, one possible solution is x = (2, −1)
- For λ = 5, one possible solution is x = (1, 2)
For more information: Demos in Linear algebra by G. Strang, http://web.mit.edu/18.06/www/
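
As a quick numerical sanity check, here is a minimal sketch of the same example, assuming numpy (the deck itself does not prescribe a library):

```python
import numpy as np

# The 2x2 matrix from the example above
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# Eigenvalues and eigenvectors (columns of eigvecs)
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)   # approximately [0., 5.]
print(eigvecs)   # columns are unit eigenvectors, parallel to (2, -1) and (1, 2)

# Verify A x = lambda x for each eigenvalue/eigenvector pair
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)
```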
Physical interpretation

- Consider a correlation matrix, A

  A = \begin{bmatrix} 1 & 0.75 \\ 0.75 & 1 \end{bmatrix} \;\Rightarrow\; \lambda_1 = 1.75, \; \lambda_2 = 0.25

- Error ellipse with the major axis along the eigenvector of the larger eigenvalue and the minor axis along the eigenvector of the smaller eigenvalue
Physical interpretation

[Figure: data points plotted against Original Variable A and Original Variable B, with the orthogonal principal component directions PC 1 and PC 2 overlaid]


- Orthogonal directions of greatest variance in data
- Projections along PC 1 (the first Principal Component) discriminate the data most along any one axis
Physical interpretation


- The first principal component is the direction of greatest variability (covariance) in the data
- The second is the next orthogonal (uncorrelated) direction of greatest variability
  - So first remove all the variability along the first component, and then find the next direction of greatest variability
  - And so on …
- Thus the eigenvectors provide the directions of data variance, ordered by decreasing eigenvalue
For more information: See Gram-Schmidt Orthogonalization in G. Strang’s lectures
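
A minimal sketch of this idea, assuming numpy: eigendecompose the correlation matrix from the earlier slide and sort the eigenpairs so that PC 1 comes first.

```python
import numpy as np

# Correlation matrix from the earlier slide
A = np.array([[1.0, 0.75],
              [0.75, 1.0]])

# eigh handles symmetric matrices; eigenvalues come back in ascending order
eigvals, eigvecs = np.linalg.eigh(A)

# Reorder into decreasing eigenvalue order so PC 1 is first
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals)        # approximately [1.75, 0.25]
print(eigvecs[:, 0])  # PC 1 direction, parallel to (1, 1)/sqrt(2)
print(eigvecs[:, 1])  # PC 2 direction, orthogonal to PC 1
```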
Spectral Decomposition theorem

- If A is a symmetric and positive definite k × k matrix (x^T A x > 0) with λ_i (λ_i > 0) and e_i, i = 1 … k, being its k eigenvalue and eigenvector pairs, then

  A_{(k \times k)} = \lambda_1 e_1 e_1^T + \lambda_2 e_2 e_2^T + \dots + \lambda_k e_k e_k^T = \sum_{i=1}^{k} \lambda_i e_i e_i^T = P \Lambda P^T

  where each e_i is k × 1 (so each e_i e_i^T is k × k), and

  P_{(k \times k)} = \begin{bmatrix} e_1 & e_2 & \dots & e_k \end{bmatrix}, \qquad
  \Lambda_{(k \times k)} = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_k \end{bmatrix}

- This is also called the eigen decomposition theorem
- Any symmetric matrix can be reconstructed using its eigenvalues and eigenvectors
- Any similarity to what has been discussed before?
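
A minimal numerical sketch of the theorem, assuming numpy; the matrix below is an arbitrary symmetric positive definite example, not one from the slides:

```python
import numpy as np

# An arbitrary symmetric, positive definite matrix
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

lam, P = np.linalg.eigh(A)   # columns of P are orthonormal eigenvectors
Lambda = np.diag(lam)

# A = P Lambda P^T
assert np.allclose(A, P @ Lambda @ P.T)

# Equivalently, A = sum_i lambda_i e_i e_i^T
A_sum = sum(l * np.outer(e, e) for l, e in zip(lam, P.T))
assert np.allclose(A, A_sum)
```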
Example for spectral decomposition

- Let A be a symmetric, positive definite matrix

  A = \begin{bmatrix} 2.2 & 0.4 \\ 0.4 & 2.8 \end{bmatrix}, \qquad \det(A - \lambda I) = 0

  \lambda^2 - 5\lambda + (6.16 - 0.16) = (\lambda - 3)(\lambda - 2) = 0 \;\Rightarrow\; \lambda_1 = 3, \; \lambda_2 = 2

- The eigenvectors for the corresponding eigenvalues are

  e_1 = \left( \tfrac{1}{\sqrt{5}}, \tfrac{2}{\sqrt{5}} \right)^T, \qquad e_2 = \left( \tfrac{2}{\sqrt{5}}, \tfrac{-1}{\sqrt{5}} \right)^T

- Consequently,

  A = \begin{bmatrix} 2.2 & 0.4 \\ 0.4 & 2.8 \end{bmatrix}
    = 3 \begin{bmatrix} \tfrac{1}{\sqrt{5}} \\ \tfrac{2}{\sqrt{5}} \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{5}} & \tfrac{2}{\sqrt{5}} \end{bmatrix}
    + 2 \begin{bmatrix} \tfrac{2}{\sqrt{5}} \\ \tfrac{-1}{\sqrt{5}} \end{bmatrix} \begin{bmatrix} \tfrac{2}{\sqrt{5}} & \tfrac{-1}{\sqrt{5}} \end{bmatrix}
    = \begin{bmatrix} 0.6 & 1.2 \\ 1.2 & 2.4 \end{bmatrix} + \begin{bmatrix} 1.6 & -0.8 \\ -0.8 & 0.4 \end{bmatrix}
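
A quick check of this worked example, assuming numpy:

```python
import numpy as np

A = np.array([[2.2, 0.4],
              [0.4, 2.8]])

e1 = np.array([1.0, 2.0]) / np.sqrt(5)
e2 = np.array([2.0, -1.0]) / np.sqrt(5)

# Reconstruct A from its eigenvalue/eigenvector pairs (3, e1) and (2, e2)
A_rebuilt = 3 * np.outer(e1, e1) + 2 * np.outer(e2, e2)
print(A_rebuilt)                 # [[2.2, 0.4], [0.4, 2.8]]
assert np.allclose(A, A_rebuilt)
```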
Singular Value Decomposition

- If A is a rectangular m × k matrix of real numbers, then there exist an m × m orthogonal matrix U and a k × k orthogonal matrix V such that

  A_{(m \times k)} = U_{(m \times m)} \, \Sigma_{(m \times k)} \, V^T_{(k \times k)}, \qquad U U^T = V V^T = I

- Σ is an m × k matrix whose (i, i)-th entries are σ_i ≥ 0, i = 1 … min(m, k), with all other entries zero
  - The positive constants σ_i are the singular values of A
- If A has rank r, then there exist r positive constants σ_1, σ_2, …, σ_r, r orthogonal m × 1 unit vectors u_1, u_2, …, u_r and r orthogonal k × 1 unit vectors v_1, v_2, …, v_r such that

  A = \sum_{i=1}^{r} \sigma_i u_i v_i^T

- Similar to the spectral decomposition theorem
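
A minimal sketch of the decomposition in numpy (an assumption; the matrix A below is random and used purely for illustration):

```python
import numpy as np

# A rectangular m x k matrix
m, k = 4, 3
A = np.random.randn(m, k)

# full_matrices=True gives U (m x m) and V^T (k x k); s holds the singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x k Sigma with the singular values on its diagonal
Sigma = np.zeros((m, k))
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(A, U @ Sigma @ Vt)      # A = U Sigma V^T
assert np.allclose(U @ U.T, np.eye(m))     # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(k))   # V is orthogonal
```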
Singular Value Decomposition (contd.)

- If A is symmetric and positive definite, then
  - SVD = eigen decomposition
  - EIG(λ_i) = SVD(σ_i^2)
- Here AA^T has the eigenvalue-eigenvector pairs (σ_i^2, u_i):

  A A^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma V^T V \Sigma U^T = U \Sigma^2 U^T

- Alternatively, the v_i are the eigenvectors of A^T A with the same nonzero eigenvalues σ_i^2:

  A^T A = V \Sigma^2 V^T
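
A short numerical check of this relationship, assuming numpy and a random illustrative A:

```python
import numpy as np

A = np.random.randn(3, 5)

# Singular values of A
_, s, _ = np.linalg.svd(A)

# Eigenvalues of A A^T (and the nonzero ones of A^T A) equal the squared singular values
eig_AAt = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
eig_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

assert np.allclose(eig_AAt, s**2)
assert np.allclose(eig_AtA[:len(s)], s**2)   # remaining eigenvalues of A^T A are ~0
```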
Example for SVD

- Let A be the rectangular 2 × 3 matrix

  A = \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix}

- U can be computed as

  A A^T = \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 11 & 1 \\ 1 & 11 \end{bmatrix}

  \det(A A^T - \lambda I) = 0 \;\Rightarrow\; \lambda_1 = 12, \; \lambda_2 = 10, \qquad
  u_1^T = \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right), \quad u_2^T = \left( \tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}} \right)

- V can be computed as

  A^T A = \begin{bmatrix} 3 & -1 \\ 1 & 3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 10 & 0 & 2 \\ 0 & 10 & 4 \\ 2 & 4 & 2 \end{bmatrix}

  \det(A^T A - \lambda I) = 0 \;\Rightarrow\; \lambda_1 = 12, \; \lambda_2 = 10, \; \lambda_3 = 0

  v_1^T = \left( \tfrac{1}{\sqrt{6}}, \tfrac{2}{\sqrt{6}}, \tfrac{1}{\sqrt{6}} \right), \quad
  v_2^T = \left( \tfrac{2}{\sqrt{5}}, \tfrac{-1}{\sqrt{5}}, 0 \right), \quad
  v_3^T = \left( \tfrac{1}{\sqrt{30}}, \tfrac{2}{\sqrt{30}}, \tfrac{-5}{\sqrt{30}} \right)



Example for SVD

- Taking σ_1^2 = 12 and σ_2^2 = 10, the singular value decomposition of A is

  A = \begin{bmatrix} 3 & 1 & 1 \\ -1 & 3 & 1 \end{bmatrix}
    = \sqrt{12} \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{6}} & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \end{bmatrix}
    + \sqrt{10} \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{-1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \tfrac{2}{\sqrt{5}} & \tfrac{-1}{\sqrt{5}} & 0 \end{bmatrix}

- Thus U, V and Σ are computed by performing an eigen decomposition of AA^T and A^T A
- Any matrix has a singular value decomposition, but only square symmetric (e.g., positive definite) matrices have an eigen decomposition of the form above
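
The same example can be checked directly with numpy's built-in SVD (an assumption; the signs of the singular vectors may differ from the hand computation):

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
print(s)        # approximately [sqrt(12), sqrt(10)] = [3.464..., 3.162...]
print(U)        # columns parallel to (1, 1)/sqrt(2) and (1, -1)/sqrt(2), up to sign
print(Vt[:2])   # first two rows parallel to v1 and v2 above, up to sign

# Rank-2 reconstruction from the two nonzero singular values
A_rebuilt = s[0] * np.outer(U[:, 0], Vt[0]) + s[1] * np.outer(U[:, 1], Vt[1])
assert np.allclose(A, A_rebuilt)
```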
Applications of SVD in Linear Algebra

- Inverse of an n × n square matrix A
  - If A is non-singular, then A^{-1} = (U Σ V^T)^{-1} = V Σ^{-1} U^T, where Σ^{-1} = diag(1/σ_1, 1/σ_2, …, 1/σ_n)
  - If A is singular, then A^{-1} = (U Σ V^T)^{-1} ≈ V Σ_0^{-1} U^T, where Σ_0^{-1} = diag(1/σ_1, 1/σ_2, …, 1/σ_i, 0, 0, …, 0)
- Least squares solutions of an m × n system
  - Ax = b (A is m × n, m ≥ n) ⇒ (A^T A)x = A^T b ⇒ x = (A^T A)^{-1} A^T b = A^+ b
  - If A^T A is singular, x = A^+ b ≈ (V Σ_0^{-1} U^T) b, where Σ_0^{-1} = diag(1/σ_1, 1/σ_2, …, 1/σ_i, 0, 0, …, 0)
- Condition of a matrix
  - The condition number measures the degree of singularity of A
  - The larger the value of σ_1/σ_n, the closer A is to being singular
http://www.cse.unr.edu/~bebis/MathMethods/SVD/lecture.pdf
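
A minimal sketch of these uses, assuming numpy; the system below is random and purely illustrative:

```python
import numpy as np

# Overdetermined system A x = b (m >= n)
A = np.random.randn(6, 3)
b = np.random.randn(6)

# Pseudo-inverse via SVD: tiny singular values are treated as zero (rcond)
x_pinv = np.linalg.pinv(A, rcond=1e-10) @ b

# Same least-squares solution from the normal equations (only safe if A^T A is well conditioned)
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
assert np.allclose(x_pinv, x_normal)

# Condition number = sigma_1 / sigma_n; large values mean A is close to singular
print(np.linalg.cond(A))
```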
Applications of SVD in Linear Algebra

- Homogeneous equations, Ax = 0
  - The minimum-norm solution is x = 0 (trivial solution)
  - Impose a constraint ||x|| = 1
  - “Constrained” optimization problem: min_{||x||=1} ||Ax||
- Special case
  - If rank(A) = n − 1 (m ≥ n − 1, σ_n = 0), then x = α v_n (α is a constant)
- General case
  - If rank(A) = n − k (m ≥ n − k, σ_{n−k+1} = … = σ_n = 0), then x = α_1 v_{n−k+1} + … + α_k v_n with α_1^2 + … + α_k^2 = 1
- Has appeared before
  - Homogeneous solution of a linear system of equations
  - Computation of Homography using DLT
  - Estimation of the Fundamental matrix
For proof: Johnson and Wichern, “Applied Multivariate Statistical Analysis”, pg 79
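
A minimal sketch of the SVD-based solution of Ax = 0, assuming numpy; the rank-deficient A below is constructed artificially for illustration:

```python
import numpy as np

# Build a rank-deficient A (rank n-1) so that A x = 0 has a nontrivial solution
n = 4
B = np.random.randn(6, n - 1)
A = B @ np.random.randn(n - 1, n)   # 6 x 4 matrix of rank 3

# The minimizer of ||A x|| subject to ||x|| = 1 is the right singular vector
# associated with the smallest singular value (the last row of V^T)
_, s, Vt = np.linalg.svd(A)
x = Vt[-1]

print(s[-1])                    # smallest singular value, ~0 here
print(np.linalg.norm(A @ x))    # ~0: x is a unit-norm null vector of A
```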
What is the use of SVD?


- SVD can be used to compute optimal low-rank approximations of arbitrary matrices
- Face recognition
  - Represent the face images as eigenfaces and compute the distance between the query face image and the database faces in the principal component space
- Data mining
  - Latent Semantic Indexing for document extraction
- Image compression
  - The Karhunen-Loeve (KL) transform performs the best image compression
  - In MPEG, the Discrete Cosine Transform (DCT) is the closest approximation to the KL transform in terms of PSNR
Image Compression using SVD

- The image is stored as a 256 × 264 matrix M with entries between 0 and 1
  - The matrix M has rank 256
- Select r ≪ 256 as an approximation to the original M
  - As r is increased from 1 all the way to 256, the reconstruction of M improves, i.e., the approximation error is reduced
- Advantage
  - To send the matrix M, one needs to send 256 × 264 = 67584 numbers
  - To send an r = 36 approximation of M, one needs to send only 36 + 36·256 + 36·264 = 18756 numbers
    - 36 singular values
    - 36 left vectors, each having 256 entries
    - 36 right vectors, each having 264 entries
Courtesy: http://www.uwlax.edu/faculty/will/svd/compression/index.html
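
A minimal sketch of the rank-r approximation, assuming numpy; a random matrix stands in for the 256 × 264 image (a real image would compress far better, but the storage count is the same):

```python
import numpy as np

# Stand-in for the 256 x 264 image matrix M with entries in [0, 1]
M = np.random.rand(256, 264)

r = 36
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Keep only the r largest singular values/vectors: the best rank-r approximation
M_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Storage: r singular values + r left vectors (256 each) + r right vectors (264 each)
print(r + r * 256 + r * 264)                         # 18756 numbers vs. 256 * 264 = 67584
print(np.linalg.norm(M - M_r) / np.linalg.norm(M))   # relative approximation error
```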