Eigenvalues, Eigenvectors, and Matrix Decompositions c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 48 A ⇒ q(λ) = |A − λI| is a polynomial of degree n. n×n For example, a − λ a12 11 A ⇒ |A − λI| = 2×2 a21 a22 − λ = (a11 − λ)(a22 − λ) − a12 a21 = λ2 + (−a11 − a22 )λ + a11 a22 − a12 a21 . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 2 / 48 The definition of |A − λI| implies that the coefficient on λn is (−1)n . a12 a13 a11 − λ a a22 − λ a23 21 a31 a32 a33 − λ .. . .. .. . . an1 an2 an3 c Copyright 2012 Dan Nettleton (Iowa State University) ··· ··· ··· .. . ··· a1n a2n a3n .. . ann − λ Statistics 611 3 / 48 The n (not necessarily distinct) roots of q(λ) (i.e., solutions to q(λ) = 0) are called the eigenvalues of A. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 4 / 48 The Fundamental Theorem of Algebra guarantees n roots, and the Polynomial Factorization Theorem implies that q(λ) = |A − λI| = n Y (λi − λ) i=1 for the eigenvalues λ1 , . . . , λn . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 5 / 48 Example: " Suppose A = 2 1 1 2 # , then 2 − λ 1 |A − λI| = 1 2 − λ = (2 − λ)(2 − λ) − 1 = λ2 − 4λ + 3 = (3 − λ)(1 − λ). Eigenvalues are λ1 = 3, λ2 = 1 ∵ (3 − λ)(1 − λ) = 0 ⇐⇒ λ = 3 or λ = 1. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 6 / 48 Example: " Suppose A = 3 0 0 3 # , then 3 − λ 0 |A − λI| = 0 3 − λ = (3 − λ)(3 − λ) = λ2 − 6λ + 9. Eigenvalues are λ1 = 3, λ2 = 3 ∵ (3 − λ)(3 − λ) = 0 ⇐⇒ λ = 3. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 7 / 48 In the previous example, the eigenvalue 3 is said to have algebraic multiplicity 2. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 8 / 48 The algebraic multiplicity of eigenvalue λ∗ is n X c Copyright 2012 Dan Nettleton (Iowa State University) 11[λi = λ∗ ]. i=1 Statistics 611 9 / 48 Note that the roots of q(λ) are not necessarily real numbers. For example, " A= 0 1 # −1 0 c Copyright 2012 Dan Nettleton (Iowa State University) " ⇒ A − λI = −λ 1 # −1 −λ ⇒ |A − λI| = λ2 + 1 √ √ = ( −1 − λ)(− −1 − λ) √ √ ∴ λ1 = −1, λ2 = − −1. Statistics 611 10 / 48 Ifn×n A is symmetric, then all roots of q(λ) = |A − λI| are real numbers. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 11 / 48 λ is an eigenvalue of A ⇐⇒ |A − λI| = 0 ⇐⇒ A − λI is singular ⇐⇒ rank(A − λI) < n ⇐⇒ ∃ x 6= 0 3 (A − λI)x = 0 ⇐⇒ ∃ x 6= 0 3 Ax − λx = 0 ⇐⇒ ∃ x 6= 0 3 Ax = λx. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 12 / 48 Such a vector x 6= 0 satisfying Ax = λx is called an eigenvector ofn×n A corresponding to eigenvalue λ. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 13 / 48 Note that if x is an eigenvector corresponding to eigenvalue λ, then cx is also an eigenvector ∀ c ∈ R. ∵ Ax = λx ⇒ cAx = cλx ⇒ A(cx) = λ(cx). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 14 / 48 It is customary to take eigenvalues to be unit norm, i.e., we take c= 1 kxk so that the eigenvector cx satisfies kcxk = 1. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 15 / 48 If λ is an eigenvalue ofn×n A, the set of vectors satisfying Ax = λx is called the eigenspace of the eigenvalue λ. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 16 / 48 Note that the eigenspace {x : Ax = λx} = {x : (A − λI)x = 0} = N (A − λI). The dimension of the eigenspace of λ is known as the geometric multiplicity of λ. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 17 / 48 For symmetric matrices, it turns out that algebraic multiplicity of an eigenvalue is the same as the geometric multiplicity. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 18 / 48 Show that eigenvectors corresponding to distinct eigenvalues of a symmetric matrixn×n A are orthogonal. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 19 / 48 Proof: Suppose Ax1 = λ1 x1 , Ax2 = λ2 x2 and λ1 6= λ2 . Then λ1 x01 x2 = (λ1 x1 )0 x2 = (Ax1 )0 x2 = x01 A0 x2 = x01 Ax2 = x01 (Ax2 ) = x01 (λ2 x2 ) = λ2 x01 x2 . Now λ1 x01 x2 = λ2 x01 x2 ⇒ (λ1 − λ2 )x01 x2 = 0 ⇒ x01 x2 = 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 20 / 48 Fact V7: If S ⊆ Rn is a vector space of dimension k > 0, then there exist a set of mutually orthonormal vectors that forms a basis for S. Proof: Homework problem. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 21 / 48 It follows that for each distinct eigenvalue λ of a matrixn×n A satisfying A = A0 , there exists a set of orthonormal eigenvectors with dim(N (A − λI)) elements. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 22 / 48 These orthonormal eigenvectors form a basis for the eigenspace of λ. Furthermore, the collection of all such eigenvectors corresponding to the eigenvalues of A is a mutually orthonormal set of n vectors. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 23 / 48 Let q1 , . . . , qn denote these n orthonormal eigenvectors corresponding to eigenvalues λ1 , . . . , λn respectively. Let Q = [q1 , . . . , qn ]. Let Λ = diag(λ1 , . . . , λn ). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 24 / 48 ∵ q1 , . . . , qn are mutually orthonormal, Q0 Q = I. Moreover, because Q is n × n, it follows that Q0 is Q−1 . Thus, QQ0 = Q0 Q = I. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 25 / 48 Note that Aqi = λi qi ∀ i = 1, . . . , n ⇐⇒ AQ = QΛ ⇐⇒ AQQ0 = QΛQ0 0 ⇐⇒ A = QΛQ = c Copyright 2012 Dan Nettleton (Iowa State University) n X λi qi q0i . i=1 Statistics 611 26 / 48 This result is known as the Spectral Decomposition Theorem: Ifn×n A is a symmetric matrix, then A = QΛQ0 , where Q is an orthogonal matrix and Λ is a diagonal matrix. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 27 / 48 Result A.19: Letn×n A be a symmetric matrix. Then rank(A) is equal to the number of nonzero eigenvalues of A. Proof: Homework problem. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 28 / 48 Supposen×n A has eigenvalues λ1 , . . . , λn . Then P (a) trace(A) = ni=1 λi , and (b) |A| = Qn i=1 λi . Proof: Homework problem. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 29 / 48 Quadratic Forms Supposen×n A = [aij ] and x = [x1 , . . . , xn ]0 . x0 Ax = Pn i=1 Pn j=1 xi xj aij is known as a quadratic form. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 30 / 48 Nonnegative Definite and Positive Definite Matrices A symmetric matrixn×n A is nonnegative definite (NND) if and only if x0 Ax ≥ 0 ∀ x ∈ Rn . A symmetric matrixn×n A is positive definite (PD) if and only if x0 Ax > 0 ∀ x ∈ Rn \ {0}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 31 / 48 Supposen×n A is a symmetric matrix. Prove that A is NND if and only if all eigenvalues of A are nonnegative. Prove that A is PD if and only if all eigenvalues of A are positive. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 32 / 48 Proof: (=⇒) Suppose A is a symmetric NND matrix. Then A = QΛQ0 and q0i Aqi ≥ 0 ∀ i = 1, . . . , n. Now note that Q0 qi = ei (the ith column ofn×n I ). Thus, λi = e0i Λei = q0i QΛQ0 qi = q0i Aqi ≥ 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 33 / 48 (⇐=) Suppose all eigenvalues of A, λ1 , . . . , λn , are nonnegative. Let x ∈ Rn be arbitrary. Let y = Qx. Then x0 Ax = x0 QΛQ0 x = y0 Λy = n X λi y2i ≥ 0. i=1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 34 / 48 Supposen×n A is a symmetric NND matrix. Show there exists a symmetric matrix B such that BB = A. Such a matrix is called the symmetric square root of A and is denoted by A1/2 . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 35 / 48 Proof: Supposen×n A is a symmetric NND matrix. By the Spectral Decomposition Theorem, A = QΛQ0 where Q is n×n orthogonal and Λ is diagonal with eigenvalues of A on the diagonal. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 36 / 48 Becausen×n A is NND, eigenvalues ofn×n A are nonnegative. Define 1/2 Λ1/2 = diag(λ1 , . . . , λ1/2 n ), where Λ = diag(λ1 , . . . , λn ). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 37 / 48 Now define B = QΛ1/2 Q0 . Note B0 = (QΛ1/2 Q0 )0 = (Q0 )0 (Λ1/2 )0 Q0 = QΛ1/2 Q0 = B. and BB = QΛ1/2 Q0 QΛ1/2 Q0 = QΛ1/2 Λ1/2 Q0 = QΛQ0 = A. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 38 / 48 Note that if A is PD, A1/2 = QΛ1/2 Q0 is nonsingular h p i p ∵ A−1/2 = Q diag(1/ λ1 , . . . , 1/ λ1 ) Q0 = QΛ−1/2 Q0 is its inverse: A1/2 A−1/2 = QΛ1/2 Q0 QΛ−1/2 Q0 = QΛ1/2 Λ−1/2 Q0 = QQ0 = I. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 39 / 48 Result A.20 (Cholesky Decomposition): Supposen×n A is a symmetric matrix. n×n A is PD iff there exists a nonsingular lower triangular matrix L such that c Copyright 2012 Dan Nettleton (Iowa State University) A = LL0 . Statistics 611 40 / 48 Proof of Result A.20: (⇐=) ∀ x 6= 0, L0 x 6= 0 ∵ L0 is nonsingular and thus full column rank. Thus, ∀ x 6= 0, x0 Ax = x0 LL0 x = (L0 x)0 (L0 x) > 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 41 / 48 (=⇒) If n = 1, A = [a]. Moreover, 1×1 A is PD ⇒ x0 ax > 0 ∀ x 6= 0 1×1 ⇒ ax2 > 0 ∀ x 6= 0 ⇒ a > 0. √ If we take L = [ a], then LL0 = A, L is a nonsingular lower triangular matrix, and the result holds. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 42 / 48 Now suppose the result holds for n = 1, . . . , k for some integer k ≥ 1. Suppose A is any (k + 1) × (k + 1) matrix. Partition A as A= c Copyright 2012 Dan Nettleton (Iowa State University) " # Ak ak a0k a Statistics 611 43 / 48 A is PD ⇒ Ak is PD ∵ x0 Ax > 0 ∀ x 6= 0 ⇒ x0 Ax > 0 ∀ x 6= 0 with the last component 0 " # y 0 > 0 ∀ y 6= 0 ⇒ [y 0]A 0 ⇒ y0 Ak y > 0 ∀ y 6= 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 44 / 48 ∴ ∃ Lk nonsingular and lower triangular 3 Ak = Lk L0k . Now let lk = L−1 k ak . ∵ A is PD, h l0k L−1 −1 k #" # " i A a L0−1 k k k lk c Copyright 2012 Dan Nettleton (Iowa State University) a0k a −1 >0⇒ Statistics 611 45 / 48 0 −1 ⇒ l0k Lk−1 Ak L0−1 k lk − 2lk Lk ak + a > 0 0 ⇒ l0k Lk−1 Lk L0k L0−1 k lk − 2lk lk + a > 0 ⇒ −l0k lk + a > 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 46 / 48 Now let l = p 0 −lk lk + a. Then L= " # Lk 0 l0k l is lower triangular and nonsingular (∵ l > 0 ⇒ L is full rank) and c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 47 / 48 0 LL = = #" # " Lk 0 L0k lk l0k = 0 l # Lk l k l0k L0k l0k lk + l2 " = l " Lk L0k Ak Lk L−1 k ak # 0 0 0 a0k L0−1 k Lk l k l k + a − l k l k # " A k ak a0k a = A. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 48 / 48