Lecture Notes 1: Matrix Algebra
Part D: Matrix Diagonalization
Peter J. Hammond
Autumn 2014, revised 2015
University of Warwick, EC9A0 Maths for Economists
Outline
Eigenvalues and Eigenvectors
Real Case
The Complex Case
Eigenvectors are Linearly Independent
Diagonalizing a General Matrix
Diagonalizing a Symmetric Matrix
A Symmetric Matrix has only Real Eigenvalues
Orthogonal Projections and Complements
A Trick Function for Generating Eigenvalues
The Spectral Theorem
Definitions in the Real Case
Consider any n × n matrix A.
The scalar λ ∈ R is an eigenvalue
just in case the equation Ax = λx has a non-zero solution.
In this case the solution x ∈ Rⁿ \ {0} is an eigenvector,
and the pair (λ, x) is an eigenpair.
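As a concrete illustration (a NumPy sketch added to these notes, not part of the original slides), one can compute eigenpairs and check the defining equation Ax = λx directly; the matrix A below is just an arbitrary example:

```python
import numpy as np

# An arbitrary symmetric example matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix
# whose columns are the associated eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Check the defining equation Ax = lambda * x for each eigenpair.
for k in range(A.shape[0]):
    lam, x = eigenvalues[k], eigenvectors[:, k]
    assert np.allclose(A @ x, lam * x)

print(eigenvalues)
```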
The Eigenspace
Given any eigenvalue λ, let E_λ := {x ∈ Rⁿ \ {0} | Ax = λx}
denote the associated set of eigenvectors.
Given any two eigenvectors x, y ∈ E_λ
and any two scalars α, β ∈ R, note that
A(αx + βy) = αAx + βAy = αλx + βλy = λ(αx + βy)
Hence the linear combination αx + βy,
unless it is 0, is also an eigenvector in E_λ.
It follows that the set E_λ ∪ {0} is a linear subspace of Rⁿ
which we call the eigenspace associated with the eigenvalue λ.
Characteristic Equation
The equation Ax = λx holds for x ≠ 0
if and only if x ≠ 0 solves (A − λI)x = 0.
This holds iff the matrix A − λI is singular,
which holds iff λ is a root
of the characteristic equation |A − λI| = 0,
or equivalently, a zero of the polynomial |A − λI| of degree n.
Suppose that the equation |A − λI| = 0 has k distinct roots λ1, λ2, . . . , λk
whose multiplicities are respectively m1, m2, . . . , mk.
This means that

|A − λI| = (−1)ⁿ (λ − λ1)^{m1} · (λ − λ2)^{m2} · · · (λ − λk)^{mk}

The polynomial has degree m1 + m2 + · · · + mk, which equals n.
This implies that k ≤ n,
so there can be at most n distinct eigenvalues.
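The link between the characteristic polynomial and the eigenvalues can be checked numerically; the following NumPy sketch is an added illustration (not from the original notes), with np.poly returning the coefficients of |λI − A| for an arbitrary example matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Coefficients of |lambda*I - A|, highest power first:
# here lambda^2 - 7*lambda + 10.
coeffs = np.poly(A)

# Its roots coincide with the eigenvalues of A.
assert np.allclose(np.sort(np.roots(coeffs)),
                   np.sort(np.linalg.eigvals(A)))
print(coeffs)   # [ 1. -7. 10.]
```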
The Case of a Diagonal Matrix, I
For the diagonal matrix D = diag(d1, d2, . . . , dn),
the characteristic equation |D − λI| = 0
takes the degenerate form ∏_{k=1}^{n} (dk − λ) = 0,
so the eigenvalues are the diagonal elements.
The ith component of the vector equation Dx = dk x
takes the form di xi = dk xi,
which has a non-trivial solution if and only if di = dk.
The kth vector ek = (δik)_{i=1}^{n}
of the canonical orthonormal basis of Rⁿ
always solves the equation Dx = dk x,
and so is an eigenvector associated with the eigenvalue dk.
The Case of a Diagonal Matrix, II
Apart from non-zero multiples of ek ,
there are other eigenvectors associated with dk
only if a different element di of the diagonal also equals dk .
In fact, the eigenspace spanned by the eigenvectors
associated with each eigenvalue dk
equals the space spanned by the set {ei | di = dk }
of canonical basis vectors.
Example with No Real Eigenvalues, I
Recall that a 2-dimensional rotation matrix takes the form

Rθ := ( cos θ   −sin θ )
      ( sin θ    cos θ )

for θ ∈ R, which is the angle of rotation measured in radians.
The rotation Rθ transforms any vector x = (x1, x2) ∈ R² to

Rθ x = ( cos θ   −sin θ ) ( x1 ) = ( x1 cos θ − x2 sin θ )
       ( sin θ    cos θ ) ( x2 )   ( x1 sin θ + x2 cos θ )

Introduce polar coordinates (r, η),
where x = (x1, x2) = r(cos η, sin η). Then

Rθ x = r ( cos η cos θ − sin η sin θ ) = r ( cos(η + θ) )
         ( cos η sin θ + sin η cos θ )     ( sin(η + θ) )

This makes it easy to verify that Rθ+2kπ = Rθ for all θ ∈ R
and k ∈ Z, and that Rθ Rη = Rη Rθ = Rθ+η for all θ, η ∈ R.
Example with No Real Eigenvalues, II
The characteristic equation |Rθ − λI| = 0 takes the form

0 = | cos θ − λ   −sin θ    | = (cos θ − λ)² + sin²θ = 1 − 2λ cos θ + λ²
    | sin θ        cos θ − λ |

The two roots are λ = cos θ ± i sin θ = e^{±iθ}.
These are non-real complex conjugates except in the degenerate case
when sin θ = 0 because θ = kπ for some k ∈ Z;
then Rθ reduces to ±I2 (to the identity matrix I2 when θ = 2kπ).
So, except when θ = kπ for some k ∈ Z,
the real matrix Rθ has no real eigenvalues.
Instead, the two roots λ = cos θ ± i sin θ = e^{±iθ}
of the characteristic equation are complex eigenvalues,
with associated complex eigenvectors.
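A NumPy check of this example (an illustration added to these notes): for θ = π/3, the computed eigenvalues are exactly the conjugate pair e^{±iθ}:

```python
import numpy as np

theta = np.pi / 3          # any angle that is not a multiple of pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues = np.linalg.eigvals(R)

# The roots are the complex conjugate pair e^{+i*theta}, e^{-i*theta}.
expected = np.array([np.exp(1j * theta), np.exp(-1j * theta)])
assert np.allclose(np.sort_complex(eigenvalues),
                   np.sort_complex(expected))
print(eigenvalues)         # approximately [0.5+0.866j 0.5-0.866j]
```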
Complex Eigenvalues
To consider complex eigenvalues properly,
we need to leave Rⁿ and consider instead the linear space Cⁿ
whose elements are n-vectors with complex coordinates.
That is, we consider a linear space whose field of scalars
is the plane C of complex numbers,
rather than the line R of real numbers.
Suppose A is any n × n matrix
whose elements may be real or complex.
The complex scalar λ ∈ C is an eigenvalue
just in case the equation Ax = λx has a non-zero solution,
in which case that solution x ∈ Cⁿ \ {0} is an eigenvector.
Fundamental Theorem of Algebra
Theorem
Let P(λ) = λⁿ + ∑_{k=0}^{n−1} pk λ^k
be a polynomial function of λ of degree n in the complex plane C.
Then there exists at least one root λ̂ ∈ C such that P(λ̂) = 0.
Corollary
The polynomial P(λ) can be factorized
as the product Pn(λ) ≡ ∏_{r=1}^{n} (λ − λr) of exactly n linear terms.
Proof.
The proof will be by induction on n.
When n = 1 one has P1 (λ) = λ + p0 , whose only root is λ = −p0 .
Suppose the result is true when n = m − 1.
By the fundamental theorem of algebra,
there exists λ̂ ∈ C such that Pm (λ̂) = 0.
Polynomial division gives Pm (λ) ≡ Pm−1 (λ)(λ − λ̂), etc.
Characteristic Roots as Eigenvalues
Theorem
Every n × n matrix A ∈ C^{n×n} with complex elements
has exactly n eigenvalues (real or complex)
corresponding to the roots, counting multiple roots,
of the characteristic equation |A − λI| = 0.
Proof.
The characteristic equation can be written in the form Pn (λ) = 0
where Pn (λ) ≡ |λI − A| is a polynomial of degree n.
By the fundamental theorem of algebra, together with its corollary,
the polynomial |λI − A| equals the product ∏_{r=1}^{n} (λ − λr)
of n linear terms.
For any of these roots λr the matrix A − λr I is singular,
so there exists x ≠ 0 such that (A − λr I)x = 0 or Ax = λr x,
implying that λr is an eigenvalue.
Eigenvectors are Linearly Independent
Proof by Induction, I
Theorem
Let {λk}_{k=1}^{m} = {λ1, λ2, . . . , λm}
be any collection of m ≤ n distinct eigenvalues.
Then any set {xk}_{k=1}^{m} of associated eigenvectors
must be linearly independent.
The proof will be by induction on m.
Because x1 ≠ 0, the set {x1} is linearly independent.
So the result is evidently true when m = 1.
As the induction hypothesis, suppose the result holds for m − 1.
Suppose that one solution of the equation Ax = λm x,
which may be zero, is the linear combination xm = ∑_{k=1}^{m−1} αk xk
of the preceding m − 1 eigenvectors. Hence

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk
Proof by Induction, II
Next, the hypothesis that {(λk, xk)}_{k=1}^{m−1}
is a collection of eigenpairs implies that

Ax = ∑_{k=1}^{m−1} αk Axk = ∑_{k=1}^{m−1} αk λk xk

Subtracting this equation from the prior equation

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk

gives

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk
Proof by Induction, III
So we have

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk

The induction hypothesis is that {xk}_{k=1}^{m−1} is linearly independent.
It follows that αk (λm − λk) = 0 for k = 1, . . . , m − 1.
Hence
either λm = λk for some k = 1, . . . , m − 1,
in which case the m eigenvalues are not distinct;
or else αk = 0 for k = 1, . . . , m − 1.
But then the hypothesis that xm = ∑_{k=1}^{m−1} αk xk
implies that xm = 0, so xm is not an eigenvector.
This completes the proof by induction.
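As an added numerical illustration of the theorem (not in the original notes), one can verify that eigenvectors associated with distinct eigenvalues form a full-rank, hence linearly independent, set; the triangular matrix below is an arbitrary example:

```python
import numpy as np

# An arbitrary upper-triangular example with distinct eigenvalues 1, 2, 3.
A = np.diag([1.0, 2.0, 3.0]) + np.triu(np.ones((3, 3)), k=1)

eigenvalues, V = np.linalg.eig(A)    # columns of V are eigenvectors

# The eigenvalues are distinct ...
assert len(set(np.round(eigenvalues, 8))) == 3
# ... so, by the theorem, the eigenvectors are linearly independent.
assert np.linalg.matrix_rank(V) == 3
```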
Diagonalization
Definition
An n × n matrix A is said to be diagonalizable
just in case there exists an invertible diagonalizing matrix S
and a diagonal matrix Λ = diag(λ1 , λ2 , . . . , λn )
such that each of the following equivalent statements holds:
A = SΛS⁻¹ ⟺ AS = SΛ ⟺ S⁻¹AS = Λ
Theorem
Given any n × n matrix A:
1. The columns of any matrix S that diagonalizes A
must consist of n linearly independent eigenvectors of A.
2. The matrix A is diagonalizable if and only if
it has a set of n linearly independent eigenvectors.
3. The matrix A and its diagonalization Λ = S⁻¹AS
have the same set of eigenvalues.
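A NumPy sketch of the definition (an illustration added to these notes): build S from the eigenvectors of an arbitrary example matrix and check the three equivalent statements:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])    # an arbitrary diagonalizable example

eigenvalues, S = np.linalg.eig(A)    # columns of S are eigenvectors
Lam = np.diag(eigenvalues)

# The three equivalent statements in the definition:
assert np.allclose(A, S @ Lam @ np.linalg.inv(S))    # A = S Lam S^-1
assert np.allclose(A @ S, S @ Lam)                   # AS = S Lam
assert np.allclose(np.linalg.inv(S) @ A @ S, Lam)    # S^-1 A S = Lam
```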
Proof of Part 1
Suppose that AS = SΛ where A = (aij )n×n , S = (sij )n×n ,
and Λ = diag(λ1 , λ2 , . . . , λn ).
Then for each i, k ∈ {1, 2, . . . , n},
equating the elements in row i and column k
of the equal matrices AS and SΛ implies that
∑_{j=1}^{n} aij sjk = ∑_{j=1}^{n} sij δjk λk = sik λk

It follows that A sk = λk sk
where sk = (sik)_{i=1}^{n} denotes the kth column of the matrix S.
Because S must be invertible:
▶ each column sk must be non-zero, so an eigenvector of A;
▶ the set of all these n columns must be linearly independent.
Proof of Part 2
By part 1, if the diagonalizing matrix S exists,
its columns must form a set of n linearly independent eigenvectors
for the matrix A.
Conversely, suppose that A does have a set {x1 , x2 , . . . , xn }
of n linearly independent eigenvectors,
with Axk = λk xk for k = 1, 2, . . . , n.
Now define S as the n × n matrix whose kth column
is the eigenvector xk , for each k = 1, 2, . . . , n.
Then it is easy to check that AS = SΛ
where Λ = diag(λ1 , λ2 , . . . , λn ).
Proof of Part 3
Suppose that λ is an eigenvalue of A
with corresponding eigenvector x 6= 0.
Then x solves Ax = λx = SΛS⁻¹x.
Premultiplying each side of the equation SΛS⁻¹x = λx by S⁻¹,
it follows that y := S⁻¹x solves Λy = λy.
Moreover, because S⁻¹ has the inverse S,
the equation S⁻¹x = 0 has only the trivial solution.
Hence x ≠ 0 implies that y ≠ 0,
implying that (λ, y) is an eigenpair of Λ.
Essentially the same argument shows that
if (λ, y) is an eigenpair of Λ, then (λ, Sy) is an eigenpair of A.
A Non-Diagonalizable 2 × 2 Matrix
Example
The non-symmetric matrix A = ( 0  1 ) cannot be diagonalized.
                             ( 0  0 )

Its characteristic equation is

0 = |A − λI| = | −λ   1 | = λ²
               |  0  −λ |

It follows that λ = 0 is the unique eigenvalue.
The eigenvalue equation is

( 0 ) = ( 0  1 ) ( x1 ) = ( x2 )
( 0 )   ( 0  0 ) ( x2 )   ( 0  )

or x2 = 0, whose only solutions take the form x1 (1, 0)ᵀ.
Thus, every eigenvector is a non-zero multiple
of the column vector (1, 0)ᵀ.
This makes it impossible to find
any set of two linearly independent eigenvectors.
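Numerically (an added NumPy sketch, not from the original slides), the failure shows up as a singular matrix of computed eigenvectors, which therefore cannot serve as a diagonalizing matrix S:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

eigenvalues, S = np.linalg.eig(A)
print(eigenvalues)    # [0. 0.] -- the unique eigenvalue, repeated

# Numerically, both columns of S are multiples of (1, 0)^T,
# so S is singular and cannot serve as a diagonalizing matrix.
assert np.linalg.matrix_rank(S) < 2
```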
A Non-Diagonalizable n × n Matrix: Specification
The following n × n matrix also has a unique eigenvalue,
whose eigenspace is of dimension 1.
Example
Consider the non-symmetric n × n matrix A
whose elements in the first n − 1 rows satisfy aij = δi,j−1
for i = 1, 2, . . . , n − 1, but whose last row is 0ᵀ.
Such a matrix is upper triangular, and takes the special form

    ( 0 1 0 ⋯ 0 )
    ( 0 0 1 ⋯ 0 )
A = ( ⋮ ⋮ ⋮ ⋱ ⋮ )  =  ( 0   I_{n−1} )
    ( 0 0 0 ⋯ 1 )     ( 0   0ᵀ      )
    ( 0 0 0 ⋯ 0 )

in which the elements in the first n − 1 rows and last n − 1 columns
make up the identity matrix.
A Non-Diagonalizable n × n Matrix: Analysis
Because A − λI is also upper triangular,
its characteristic equation is 0 = |A − λI| = (−λ)ⁿ.
This has λ = 0 as an n-fold repeated root.
So λ = 0 is the unique eigenvalue.
The eigenvalue equation Ax = λx with λ = 0
takes the form Ax = 0 or
0 = ∑_{j=1}^{n} δi,j−1 xj = xi+1   (i = 1, 2, . . . , n − 1)

with an extra nth equation of the form 0 = 0.
The only solutions take the form xj = 0 for j = 2, . . . , n,
with x1 arbitrary.
So all the eigenvectors of A
are non-zero multiples of e1 = (1, 0, . . . , 0)ᵀ,
implying that there is just one eigenspace, which has dimension 1.
Diagonalizing a Symmetric Matrix
Complex Conjugates and Adjoint Matrices
Recall that any complex number c ∈ C can be expressed as a + ib
with a ∈ R as the real part and b ∈ R as the imaginary part.
The complex conjugate of c is c̄ = a − ib.
Note that cc̄ = c̄c = (a + ib)(a − ib) = a² + b² = |c|²,
where |c| is the modulus of c.
Any m × n complex matrix C = (cij)m×n ∈ C^{m×n}
can be written as A + iB, where A and B are real m × n matrices.
Given the m × n complex matrix C = A + iB,
its adjoint is the n × m complex matrix C* := Aᵀ − iBᵀ,
which is the transpose of the matrix A − iB whose elements are
the complex conjugates c̄ij of the corresponding elements of C.
In the case of a real matrix A whose imaginary part is 0,
its adjoint is simply the transpose Aᵀ.
Self-Adjoint and Symmetric Matrices
An n × n complex matrix C = A + iB is self-adjoint
just in case C* = C, which holds if and only if Aᵀ − iBᵀ = A + iB,
and so if and only if:
▶ the real part A is symmetric;
▶ the imaginary part B is anti-symmetric in the sense that Bᵀ = −B.
Theorem
Any eigenvalue of a self-adjoint complex matrix is a real scalar.
Proof that Eigenvalues are Real
Suppose that the scalar λ ∈ C and vector x ∈ Cⁿ together satisfy
the eigenvalue equation Ax = λx for any A ∈ C^{n×n}.
Taking adjoints of both sides, one has x*A* = λ̄x*.
By the associative law of complex matrix multiplication,
one has x*Ax = x*(Ax) = x*(λx) = λ(x*x)
as well as x*A*x = (x*A*)x = (λ̄x*)x = λ̄(x*x).
In case A is self-adjoint and so A* = A,
subtracting the second equation from the first
gives x*Ax − x*A*x = x*(A − A*)x = 0 = (λ − λ̄)(x*x).
But in case x is an eigenvector, one has x = (xi)_{i=1}^{n} ∈ Cⁿ \ {0}
and so x*x = ∑_{i=1}^{n} |xi|² > 0.
Because 0 = (λ − λ̄)(x*x), it follows that the eigenvalue λ
satisfies λ − λ̄ = 0, implying that λ is real.
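As an added numerical check (not part of the original notes), NumPy confirms that a self-adjoint matrix has real eigenvalues; the routine eigvalsh is designed for exactly this case, and the matrix C below is an arbitrary example:

```python
import numpy as np

# An arbitrary self-adjoint example: real part symmetric,
# imaginary part anti-symmetric.
C = np.array([[2.0 + 0j, 1.0 - 1j],
              [1.0 + 1j, 3.0 + 0j]])
assert np.allclose(C, C.conj().T)          # C* = C

# Treating C as a general matrix, the imaginary parts
# of its eigenvalues vanish, as the theorem asserts.
assert np.allclose(np.linalg.eigvals(C).imag, 0.0)

# eigvalsh assumes self-adjointness and returns real numbers directly.
print(np.linalg.eigvalsh(C))               # approximately [1. 4.]
```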
Orthogonal Projections
Definition
An n × n matrix P is an orthogonal projection if P² = P
and uᵀv = 0 whenever Pv = 0 and u = Px for some x ∈ Rⁿ.
Theorem
Suppose that the n × m matrix X has full rank m < n.
Let L ⊂ Rⁿ be the linear subspace spanned
by the m linearly independent columns of X.
Define the n × n matrix P := X(XᵀX)⁻¹Xᵀ. Then:
1. The matrix P is a symmetric orthogonal projection onto L.
2. The matrix I − P is a symmetric orthogonal projection
onto the orthogonal complement L⊥ of L.
3. For each vector y ∈ Rⁿ, its orthogonal projection onto L
is the unique vector v = Py
that minimizes the distance ‖y − v‖ between y and L.
Proof of Part 1
Because of the rules for the transposes of products and inverses,
the definition P := X(XᵀX)⁻¹Xᵀ implies that Pᵀ = P and also

P² = X(XᵀX)⁻¹XᵀX(XᵀX)⁻¹Xᵀ = X(XᵀX)⁻¹Xᵀ = P

Moreover, if Pv = 0 and u = Px for some x ∈ Rⁿ, then

uᵀv = xᵀPᵀv = xᵀPv = 0

Finally, for every y ∈ Rⁿ, the vector Py equals Xb,
where b = (XᵀX)⁻¹Xᵀy.
Hence Py ∈ L.
Proof of Part 2
Evidently (I − P)ᵀ = I − Pᵀ = I − P, and

(I − P)² = I − 2P + P² = I − 2P + P = I − P

Hence I − P is a projection.
This projection is also orthogonal because if (I − P)v = 0
and u = (I − P)x for some x ∈ Rⁿ, then

uᵀv = xᵀ(I − P)ᵀv = xᵀ(I − P)v = 0

Next, suppose that v = Xb ∈ L and that y = (I − P)x
belongs to the range of I − P. Then, because PXb = Xb,

yᵀv = xᵀ(I − P)ᵀXb = xᵀXb − xᵀXb = 0

so y ∈ L⊥.
Proof of Part 3
For any vector v = Xb ∈ L and y ∈ Rⁿ, one has

‖y − v‖² = (y − Xb)ᵀ(y − Xb) = yᵀy − 2yᵀXb + bᵀXᵀXb

Now define b̂ := (XᵀX)⁻¹Xᵀy (which is the OLS estimator of b
in the linear regression equation y = Xb + e).
Also, define v̂ := Xb̂ = Py. Because PᵀP = Pᵀ = P = P²,

‖y − v‖² = yᵀy − 2yᵀXb + bᵀXᵀXb
         = (b − b̂)ᵀXᵀX(b − b̂) + yᵀy − b̂ᵀXᵀXb̂
         = ‖v − v̂‖² + yᵀy − yᵀPᵀPy = ‖v − v̂‖² + yᵀy − yᵀPy

On the other hand, given that v̂ = Py, one also has

‖y − v̂‖² = yᵀy − 2yᵀv̂ + v̂ᵀv̂
         = yᵀy − 2yᵀPy + yᵀPᵀPy = yᵀy − yᵀPy

So ‖y − v‖² − ‖y − v̂‖² = ‖v − v̂‖² ≥ 0, with equality iff v = v̂.
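All three parts of the theorem can be checked numerically. The NumPy sketch below is an added illustration, with randomly generated X and y standing in for arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))    # n = 5, m = 2; full rank with probability 1
y = rng.normal(size=5)

P = X @ np.linalg.inv(X.T @ X) @ X.T

# Part 1: P is a symmetric projection matrix.
assert np.allclose(P, P.T) and np.allclose(P @ P, P)

# Part 2: I - P maps into the orthogonal complement of L = col(X).
assert np.allclose(X.T @ (np.eye(5) - P), 0.0)

# Part 3: Py is the point of L nearest to y; it agrees with least squares.
b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(P @ y, X @ b_hat)
```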
A Trick Function for Generating Eigenvalues
For all x ≠ 0, define the function, homogeneous of degree zero,

Rⁿ \ {0} ∋ x ↦ f(x) := xᵀAx / xᵀx = ( ∑_{i=1}^{n} ∑_{j=1}^{n} xi aij xj ) / ( ∑_{i=1}^{n} xi² )

which is left undefined at x = 0.
For symmetric A, the partial derivative w.r.t. any component xh of the vector x is

∂f/∂xh = [2 / (xᵀx)²] [ (∑_{j=1}^{n} ahj xj)(xᵀx) − (xᵀAx) xh ]

At any stationary point x̂ ≠ 0 where ∂f/∂xh = 0 for all h,
one therefore has (x̂ᵀx̂)Ax̂ = (x̂ᵀAx̂)x̂
and so Ax̂ = λx̂ where λ = f(x̂).
That is, a stationary point x̂ ≠ 0 must be an eigenvector, with the
corresponding function value f(x̂) as the associated eigenvalue.
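The following NumPy sketch (added to these notes, with an arbitrary example matrix) evaluates this trick function f and checks that it returns eigenvalues at eigenvectors, and that on any other point it stays between the smallest and largest eigenvalues:

```python
import numpy as np

def f(A, x):
    """The trick function: the Rayleigh quotient x'Ax / x'x."""
    return (x @ A @ x) / (x @ x)

# An arbitrary symmetric example matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigenvalues, V = np.linalg.eigh(A)   # ascending eigenvalues

# At each eigenvector, f returns the associated eigenvalue.
for k in range(3):
    assert np.isclose(f(A, V[:, k]), eigenvalues[k])

# Elsewhere, f stays between the minimum and maximum eigenvalues.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.normal(size=3)
    assert eigenvalues[0] - 1e-9 <= f(A, x) <= eigenvalues[-1] + 1e-9
```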
More Properties of the Trick Function
Lemma
Every n × n symmetric matrix A:
1. has a maximum eigenvalue λ* at an eigenvector x* where f
attains its maximum;
2. has a minimum eigenvalue λ_* at an eigenvector x_* where f
attains its minimum;
3. satisfies A = λI if and only if λ* = λ_* = λ.
Proof of Parts 1 and 2
The unit sphere Sⁿ⁻¹ is a compact subset of Rⁿ,
and the function f restricted to Sⁿ⁻¹ is continuous.
By the extreme value theorem, f restricted to Sⁿ⁻¹ must have:
▶ a maximum value λ* attained at some point x*;
▶ a minimum value λ_* attained at some point x_*.
Because f is homogeneous of degree zero,
these are the maximum and minimum values of f
over the whole domain Rⁿ \ {0}.
In particular, f must be stationary at any maximum point x*,
as well as at any minimum point x_*.
But stationary points must be eigenvectors.
This proves parts 1 and 2 of the lemma.
The Spectral Theorem
A Useful Lemma
Lemma
Let A be a symmetric n × n matrix.
Suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors
and are the columns of an n × m matrix U.
Then there is at least one more eigenvector x
that satisfies Uᵀx = 0,
i.e., one that is orthogonal to each of the m eigenvectors uk.
Constructive Proof I
For each eigenvector uk, let λk be the associated eigenvalue,
so that Auk = λk uk for k = 1, 2, . . . , m.
Then AU = UΛ where Λ := diag(λk)_{k=1}^{m}.
Also, because the eigenvectors {uk}_{k=1}^{m} form an orthonormal set,
one has UᵀU = I_m. Hence UᵀAU = UᵀUΛ = Λ.
Also, transposing AU = UΛ gives UᵀA = ΛUᵀ.
Consider now the n × n matrix Â := (I − UUᵀ)A(I − UUᵀ), which
is symmetric because both A and UUᵀ are symmetric. Note that

Â = A − UUᵀA − AUUᵀ + UUᵀAUUᵀ
  = A − UΛUᵀ − UΛUᵀ + UΛUᵀ = A − UΛUᵀ

This matrix Â has at least one eigenvalue λ, which must be real,
and an associated eigenvector x ≠ 0, which together satisfy

Âx = (I − UUᵀ)A(I − UUᵀ)x = (A − UΛUᵀ)x = λx

Pre-multiplying each side of the last equation by Uᵀ shows that

λUᵀx = UᵀAx − UᵀUΛUᵀx = ΛUᵀx − ΛUᵀx = 0
Constructive Proof II
This leaves two possible cases:
1. In the exceptional case when the only eigenvalue
of the symmetric matrix Â is λ = 0,
one has Â = 0 and so A = UΛUᵀ.
Then any vector x ≠ 0 satisfying Uᵀx = 0,
which exists because these are only m < n equations in n unknowns,
must satisfy Ax = UΛUᵀx = 0,
implying that x is an eigenvector of A.
2. Otherwise, in the generic case
when Â has at least one eigenvalue λ ≠ 0,
there is a corresponding eigenvector x ≠ 0 of Â
that satisfies Uᵀx = 0_m. But then this implies that

Ax = (Â + UΛUᵀ)x = Âx = λx

Hence x is an eigenvector of A.
In both cases there is an eigenvector x of A satisfying Uᵀx = 0_m.
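The constructive proof suggests a deflation procedure. The function below is a sketch of that idea added to these notes, not the author's code, and next_orthogonal_eigenvector is a hypothetical name:

```python
import numpy as np

def next_orthogonal_eigenvector(A, U):
    """Given a symmetric n x n matrix A and an n x m matrix U (m < n)
    whose columns are orthonormal eigenvectors of A, return a unit
    eigenvector of A orthogonal to every column of U (a sketch of
    the lemma's constructive proof)."""
    n = A.shape[0]
    Q = np.eye(n) - U @ U.T     # projector onto the complement of span(U)
    A_hat = Q @ A @ Q           # the deflated matrix A-hat = A - U Lam U'
    lam, V = np.linalg.eigh(A_hat)
    k = np.argmax(np.abs(lam))
    if abs(lam[k]) > 1e-10:     # generic case: A-hat has an eigenvalue != 0
        x = V[:, k]
    else:                       # exceptional case: A-hat = 0, so any unit
        x = np.linalg.svd(Q)[0][:, 0]   # vector orthogonal to span(U) works
    return x / np.linalg.norm(x)

# Example: start from one eigenvector of a 2 x 2 symmetric matrix.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
u1 = np.array([[1.0], [1.0]]) / np.sqrt(2)    # eigenvector for eigenvalue 3
x = next_orthogonal_eigenvector(A, u1)
assert np.allclose(u1.T @ x, 0.0)             # orthogonal to u1
assert np.allclose(A @ x, (x @ A @ x) * x)    # an eigenpair of A
```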
Spectral Theorem
Theorem
Given any symmetric n × n matrix A:
1. its eigenvectors span the whole of Rⁿ;
2. there exists an orthogonal matrix P that diagonalizes A.
Proof of Spectral Theorem
The matrix A has at least one eigenvalue, which must be real.
The associated eigenvector x, normalized to satisfy xᵀx = 1,
forms an orthonormal set {u1}.
As the induction hypothesis,
suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors.
We have just proved that this holds for m = 1.
The lemma shows that, if it holds for any m = 1, 2, . . . , n − 1,
then it holds for m + 1. The result follows by induction.
In particular, when m = n, there exists an orthonormal set
of n eigenvectors, which must then span the whole of Rⁿ.
Also, by the previous result, taking P as an orthogonal matrix
whose columns are an orthonormal set of n eigenvectors
implies that PᵀAP = Λ, where the elements
of the diagonal matrix Λ constitute its set of eigenvalues.
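Numerically (an added sketch, not part of the original notes), np.linalg.eigh delivers exactly such an orthogonal diagonalization of a symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                    # an arbitrary symmetric matrix

eigenvalues, P = np.linalg.eigh(A)   # columns of P: orthonormal eigenvectors

assert np.allclose(P.T @ P, np.eye(4))                  # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))   # P'AP = Lambda
```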