Lecture Notes 1: Matrix Algebra
Part D: Matrix Diagonalization
Peter J. Hammond
Autumn 2014, revised 2015
University of Warwick, EC9A0 Maths for Economists
Outline
Eigenvalues and Eigenvectors
Real Case
The Complex Case
Eigenvectors are Linearly Independent
Diagonalizing a General Matrix
Diagonalizing a Symmetric Matrix
A Symmetric Matrix has only Real Eigenvalues
Orthogonal Projections and Complements
A Trick Function for Generating Eigenvalues
The Spectral Theorem
Definitions in the Real Case
Consider any n × n matrix A.
The scalar λ ∈ R is an eigenvalue
just in case the equation Ax = λx has a non-zero solution.
In this case the solution x ∈ Rⁿ \ {0} is an eigenvector,
and the pair (λ, x) is an eigenpair.
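As a concrete illustration (a NumPy sketch added to these notes, not part of the original slides), one can compute eigenpairs and check the defining equation Ax = λx directly; the matrix A below is just an arbitrary example:

```python
import numpy as np

# An arbitrary symmetric example matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix
# whose columns are the associated eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Check the defining equation Ax = lambda * x for each eigenpair.
for k in range(A.shape[0]):
    lam, x = eigenvalues[k], eigenvectors[:, k]
    assert np.allclose(A @ x, lam * x)

print(eigenvalues)
```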
The Eigenspace
Given any eigenvalue λ, let E_λ := {x ∈ Rⁿ \ {0} | Ax = λx}
denote the associated set of eigenvectors.
Given any two eigenvectors x, y ∈ E_λ
and any two scalars α, β ∈ R, note that
A(αx + βy) = αAx + βAy = αλx + βλy = λ(αx + βy)
Hence the linear combination αx + βy,
unless it is 0, is also an eigenvector in E_λ.
It follows that the set E_λ ∪ {0} is a linear subspace of Rⁿ
which we call the eigenspace associated with the eigenvalue λ.
Characteristic Equation
The equation Ax = λx holds for x ≠ 0
if and only if x ≠ 0 solves (A − λI)x = 0.
This holds iff the matrix A − λI is singular,
which holds iff λ is a root
of the characteristic equation |A − λI| = 0,
or equivalently, a zero of the polynomial |A − λI| of degree n.
Suppose that the equation |A − λI| = 0 has k distinct roots λ1, λ2, . . . , λk
whose multiplicities are respectively m1, m2, . . . , mk.
This means that

|A − λI| = (−1)ⁿ (λ − λ1)^{m1} · (λ − λ2)^{m2} · · · (λ − λk)^{mk}

The polynomial has degree m1 + m2 + · · · + mk, which equals n.
This implies that k ≤ n,
so there can be at most n distinct eigenvalues.
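The link between the characteristic polynomial and the eigenvalues can be checked numerically; the following NumPy sketch is an added illustration (not from the original notes), with np.poly returning the coefficients of |λI − A| for an arbitrary example matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Coefficients of |lambda*I - A|, highest power first:
# here lambda^2 - 7*lambda + 10.
coeffs = np.poly(A)

# Its roots coincide with the eigenvalues of A.
assert np.allclose(np.sort(np.roots(coeffs)),
                   np.sort(np.linalg.eigvals(A)))
print(coeffs)   # [ 1. -7. 10.]
```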
The Case of a Diagonal Matrix, I
For the diagonal matrix D = diag(d1, d2, . . . , dn),
the characteristic equation |D − λI| = 0
takes the degenerate form ∏_{k=1}^{n} (dk − λ) = 0,
so the eigenvalues are the diagonal elements.
The ith component of the vector equation Dx = dk x
takes the form di xi = dk xi,
which has a non-trivial solution if and only if di = dk.
The kth vector ek = (δik)_{i=1}^{n}
of the canonical orthonormal basis of Rⁿ
always solves the equation Dx = dk x,
and so is an eigenvector associated with the eigenvalue dk.
The Case of a Diagonal Matrix, II
Apart from non-zero multiples of ek ,
there are other eigenvectors associated with dk
only if a different element di of the diagonal also equals dk .
In fact, the eigenspace spanned by the eigenvectors
associated with each eigenvalue dk
equals the space spanned by the set {ei | di = dk }
of canonical basis vectors.
Example with No Real Eigenvalues, I
Recall that a 2-dimensional rotation matrix takes the form

Rθ := ( cos θ   −sin θ )
      ( sin θ    cos θ )

for θ ∈ R, which is the angle of rotation measured in radians.
The rotation Rθ transforms any vector x = (x1, x2) ∈ R² to

Rθ x = ( cos θ   −sin θ ) ( x1 ) = ( x1 cos θ − x2 sin θ )
       ( sin θ    cos θ ) ( x2 )   ( x1 sin θ + x2 cos θ )

Introduce polar coordinates (r, η),
where x = (x1, x2) = r(cos η, sin η). Then

Rθ x = r ( cos η cos θ − sin η sin θ ) = r ( cos(η + θ) )
         ( cos η sin θ + sin η cos θ )     ( sin(η + θ) )

This makes it easy to verify that Rθ+2kπ = Rθ for all θ ∈ R
and k ∈ Z, and that Rθ Rη = Rη Rθ = Rθ+η for all θ, η ∈ R.
Example with No Real Eigenvalues, II
The characteristic equation |Rθ − λI| = 0 takes the form

0 = | cos θ − λ   −sin θ    | = (cos θ − λ)² + sin²θ = 1 − 2λ cos θ + λ²
    | sin θ        cos θ − λ |

The two roots are λ = cos θ ± i sin θ = e^{±iθ}.
These are non-real complex conjugates except in the degenerate case
when sin θ = 0 because θ = kπ for some k ∈ Z;
then Rθ reduces to ±I2 (to the identity matrix I2 when θ = 2kπ).
So, except when θ = kπ for some k ∈ Z,
the real matrix Rθ has no real eigenvalues.
Instead, the two roots λ = cos θ ± i sin θ = e^{±iθ}
of the characteristic equation are complex eigenvalues,
with associated complex eigenvectors.
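A NumPy check of this example (an illustration added to these notes): for θ = π/3, the computed eigenvalues are exactly the conjugate pair e^{±iθ}:

```python
import numpy as np

theta = np.pi / 3          # any angle that is not a multiple of pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues = np.linalg.eigvals(R)

# The roots are the complex conjugate pair e^{+i*theta}, e^{-i*theta}.
expected = np.array([np.exp(1j * theta), np.exp(-1j * theta)])
assert np.allclose(np.sort_complex(eigenvalues),
                   np.sort_complex(expected))
print(eigenvalues)         # approximately [0.5+0.866j 0.5-0.866j]
```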
Complex Eigenvalues
To consider complex eigenvalues properly,
we need to leave Rⁿ and consider instead the linear space Cⁿ
whose elements are n-vectors with complex coordinates.
That is, we consider a linear space whose field of scalars
is the plane C of complex numbers,
rather than the line R of real numbers.
Suppose A is any n × n matrix
whose elements may be real or complex.
The complex scalar λ ∈ C is an eigenvalue
just in case the equation Ax = λx has a non-zero solution,
in which case that solution x ∈ Cⁿ \ {0} is an eigenvector.
Fundamental Theorem of Algebra
Theorem
Let P(λ) = λⁿ + ∑_{k=0}^{n−1} pk λ^k
be a polynomial function of λ of degree n in the complex plane C.
Then there exists at least one root λ̂ ∈ C such that P(λ̂) = 0.
Corollary
The polynomial P(λ) can be factorized
as the product Pn(λ) ≡ ∏_{r=1}^{n} (λ − λr) of exactly n linear terms.
Proof.
The proof will be by induction on n.
When n = 1 one has P1 (λ) = λ + p0 , whose only root is λ = −p0 .
Suppose the result is true when n = m − 1.
By the fundamental theorem of algebra,
there exists λ̂ ∈ C such that Pm (λ̂) = 0.
Polynomial division gives Pm (λ) ≡ Pm−1 (λ)(λ − λ̂), etc.
Characteristic Roots as Eigenvalues
Theorem
Every n × n matrix A ∈ C^{n×n} with complex elements
has exactly n eigenvalues (real or complex)
corresponding to the roots, counting multiple roots,
of the characteristic equation |A − λI| = 0.
Proof.
The characteristic equation can be written in the form Pn (λ) = 0
where Pn (λ) ≡ |λI − A| is a polynomial of degree n.
By the fundamental theorem of algebra, together with its corollary,
the polynomial |λI − A| equals the product ∏_{r=1}^{n} (λ − λr)
of n linear terms.
For any of these roots λr the matrix A − λr I is singular,
so there exists x ≠ 0 such that (A − λr I)x = 0 or Ax = λr x,
implying that λr is an eigenvalue.
Eigenvectors are Linearly Independent
Proof by Induction, I
Theorem
Let {λk}_{k=1}^{m} = {λ1, λ2, . . . , λm}
be any collection of m ≤ n distinct eigenvalues.
Then any set {xk}_{k=1}^{m} of associated eigenvectors
must be linearly independent.
The proof will be by induction on m.
Because x1 ≠ 0, the set {x1} is linearly independent.
So the result is evidently true when m = 1.
As the induction hypothesis, suppose the result holds for m − 1.
Suppose that one solution of the equation Ax = λm x,
which may be zero, is the linear combination xm = ∑_{k=1}^{m−1} αk xk
of the preceding m − 1 eigenvectors. Hence

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk
Proof by Induction, II
Next, the hypothesis that {(λk, xk)}_{k=1}^{m−1}
is a collection of eigenpairs implies that

Ax = ∑_{k=1}^{m−1} αk Axk = ∑_{k=1}^{m−1} αk λk xk

Subtracting this equation from the prior equation

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk

gives

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk
Proof by Induction, III
So we have

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk

The induction hypothesis is that {xk}_{k=1}^{m−1} is linearly independent.
It follows that αk (λm − λk) = 0 for k = 1, . . . , m − 1.
Hence
either λm = λk for some k = 1, . . . , m − 1,
in which case the m eigenvalues are not distinct;
or else αk = 0 for k = 1, . . . , m − 1.
But then the hypothesis that xm = ∑_{k=1}^{m−1} αk xk
implies that xm = 0, so xm is not an eigenvector.
This completes the proof by induction.
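As an added numerical illustration of the theorem (not in the original notes), one can verify that eigenvectors associated with distinct eigenvalues form a full-rank, hence linearly independent, set; the triangular matrix below is an arbitrary example:

```python
import numpy as np

# An arbitrary upper-triangular example with distinct eigenvalues 1, 2, 3.
A = np.diag([1.0, 2.0, 3.0]) + np.triu(np.ones((3, 3)), k=1)

eigenvalues, V = np.linalg.eig(A)    # columns of V are eigenvectors

# The eigenvalues are distinct ...
assert len(set(np.round(eigenvalues, 8))) == 3
# ... so, by the theorem, the eigenvectors are linearly independent.
assert np.linalg.matrix_rank(V) == 3
```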
Diagonalization
Definition
An n × n matrix A is said to be diagonalizable
just in case there exists an invertible diagonalizing matrix S
and a diagonal matrix Λ = diag(λ1 , λ2 , . . . , λn )
such that each of the following equivalent statements holds:
A = SΛS⁻¹ ⟺ AS = SΛ ⟺ S⁻¹AS = Λ
Theorem
Given any n × n matrix A:
1. The columns of any matrix S that diagonalizes A
must consist of n linearly independent eigenvectors of A.
2. The matrix A is diagonalizable if and only if
it has a set of n linearly independent eigenvectors.
3. The matrix A and its diagonalization Λ = S⁻¹AS
have the same set of eigenvalues.
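A NumPy sketch of the definition (an illustration added to these notes): build S from the eigenvectors of an arbitrary example matrix and check the three equivalent statements:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])    # an arbitrary diagonalizable example

eigenvalues, S = np.linalg.eig(A)    # columns of S are eigenvectors
Lam = np.diag(eigenvalues)

# The three equivalent statements in the definition:
assert np.allclose(A, S @ Lam @ np.linalg.inv(S))    # A = S Lam S^-1
assert np.allclose(A @ S, S @ Lam)                   # AS = S Lam
assert np.allclose(np.linalg.inv(S) @ A @ S, Lam)    # S^-1 A S = Lam
```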
Proof of Part 1
Suppose that AS = SΛ where A = (aij )n×n , S = (sij )n×n ,
and Λ = diag(λ1 , λ2 , . . . , λn ).
Then for each i, k ∈ {1, 2, . . . , n},
equating the elements in row i and column k
of the equal matrices AS and SΛ implies that
∑_{j=1}^{n} aij sjk = ∑_{j=1}^{n} sij δjk λk = sik λk

It follows that A sk = λk sk
where sk = (sik)_{i=1}^{n} denotes the kth column of the matrix S.
Because S must be invertible:
▶ each column sk must be non-zero, so an eigenvector of A;
▶ the set of all these n columns must be linearly independent.
Proof of Part 2
By part 1, if the diagonalizing matrix S exists,
its columns must form a set of n linearly independent eigenvectors
for the matrix A.
Conversely, suppose that A does have a set {x1 , x2 , . . . , xn }
of n linearly independent eigenvectors,
with Axk = λk xk for k = 1, 2, . . . , n.
Now define S as the n × n matrix whose kth column
is the eigenvector xk , for each k = 1, 2, . . . , n.
Then it is easy to check that AS = SΛ
where Λ = diag(λ1 , λ2 , . . . , λn ).
Proof of Part 3
Suppose that λ is an eigenvalue of A
with corresponding eigenvector x 6= 0.
Then x solves Ax = λx = SΛS⁻¹x.
Premultiplying each side of the equation SΛS⁻¹x = λx by S⁻¹,
it follows that y := S⁻¹x solves Λy = λy.
Moreover, because S⁻¹ has the inverse S,
the equation S⁻¹x = 0 has only the trivial solution.
Hence x ≠ 0 implies that y ≠ 0,
implying that (λ, y) is an eigenpair of Λ.
Essentially the same argument shows that
if (λ, y) is an eigenpair of Λ, then (λ, Sy) is an eigenpair of A.
A Non-Diagonalizable 2 × 2 Matrix
Example
The non-symmetric matrix A = ( 0  1 ) cannot be diagonalized.
                             ( 0  0 )

Its characteristic equation is

0 = |A − λI| = | −λ   1 | = λ²
               |  0  −λ |

It follows that λ = 0 is the unique eigenvalue.
The eigenvalue equation is

( 0 ) = ( 0  1 ) ( x1 ) = ( x2 )
( 0 )   ( 0  0 ) ( x2 )   ( 0  )

or x2 = 0, whose only solutions take the form x1 (1, 0)ᵀ.
Thus, every eigenvector is a non-zero multiple
of the column vector (1, 0)ᵀ.
This makes it impossible to find
any set of two linearly independent eigenvectors.
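Numerically (an added NumPy sketch, not from the original slides), the failure shows up as a singular matrix of computed eigenvectors, which therefore cannot serve as a diagonalizing matrix S:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

eigenvalues, S = np.linalg.eig(A)
print(eigenvalues)    # [0. 0.] -- the unique eigenvalue, repeated

# Numerically, both columns of S are multiples of (1, 0)^T,
# so S is singular and cannot serve as a diagonalizing matrix.
assert np.linalg.matrix_rank(S) < 2
```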
A Non-Diagonalizable n × n Matrix: Specification
The following n × n matrix also has a unique eigenvalue,
whose eigenspace is of dimension 1.
Example
Consider the non-symmetric n × n matrix A
whose elements in the first n − 1 rows satisfy aij = δi,j−1
for i = 1, 2, . . . , n − 1, but whose last row is 0ᵀ.
Such a matrix is upper triangular, and takes the special form

    ( 0 1 0 ⋯ 0 )
    ( 0 0 1 ⋯ 0 )
A = ( ⋮ ⋮ ⋮ ⋱ ⋮ )  =  ( 0   I_{n−1} )
    ( 0 0 0 ⋯ 1 )     ( 0   0ᵀ      )
    ( 0 0 0 ⋯ 0 )

in which the elements in the first n − 1 rows and last n − 1 columns
make up the identity matrix.
A Non-Diagonalizable n × n Matrix: Analysis
Because A − λI is also upper triangular,
its characteristic equation is 0 = |A − λI| = (−λ)ⁿ.
This has λ = 0 as an n-fold repeated root.
So λ = 0 is the unique eigenvalue.
The eigenvalue equation Ax = λx with λ = 0
takes the form Ax = 0 or
0 = ∑_{j=1}^{n} δi,j−1 xj = xi+1   (i = 1, 2, . . . , n − 1)

with an extra nth equation of the form 0 = 0.
The only solutions take the form xj = 0 for j = 2, . . . , n,
with x1 arbitrary.
So all the eigenvectors of A
are non-zero multiples of e1 = (1, 0, . . . , 0)ᵀ,
implying that there is just one eigenspace, which has dimension 1.
Diagonalizing a Symmetric Matrix
Complex Conjugates and Adjoint Matrices
Recall that any complex number c ∈ C can be expressed as a + ib
with a ∈ R as the real part and b ∈ R as the imaginary part.
The complex conjugate of c is c̄ = a − ib.
Note that cc̄ = c̄c = (a + ib)(a − ib) = a² + b² = |c|²,
where |c| is the modulus of c.
Any m × n complex matrix C = (cij)m×n ∈ C^{m×n}
can be written as A + iB, where A and B are real m × n matrices.
Given the m × n complex matrix C = A + iB,
its adjoint is the n × m complex matrix C* := Aᵀ − iBᵀ,
which is the transpose of the matrix A − iB whose elements are
the complex conjugates c̄ij of the corresponding elements of C.
In the case of a real matrix A whose imaginary part is 0,
its adjoint is simply the transpose Aᵀ.
Self-Adjoint and Symmetric Matrices
An n × n complex matrix C = A + iB is self-adjoint
just in case C* = C, which holds if and only if Aᵀ − iBᵀ = A + iB,
and so if and only if:
▶ the real part A is symmetric;
▶ the imaginary part B is anti-symmetric in the sense that Bᵀ = −B.
Theorem
Any eigenvalue of a self-adjoint complex matrix is a real scalar.
Proof that Eigenvalues are Real
Suppose that the scalar λ ∈ C and vector x ∈ Cⁿ together satisfy
the eigenvalue equation Ax = λx for any A ∈ C^{n×n}.
Taking adjoints of both sides, one has x*A* = λ̄x*.
By the associative law of complex matrix multiplication,
one has x*Ax = x*(Ax) = x*(λx) = λ(x*x)
as well as x*A*x = (x*A*)x = (λ̄x*)x = λ̄(x*x).
In case A is self-adjoint and so A* = A,
subtracting the second equation from the first
gives x*Ax − x*A*x = x*(A − A*)x = 0 = (λ − λ̄)(x*x).
But in case x is an eigenvector, one has x = (xi)_{i=1}^{n} ∈ Cⁿ \ {0}
and so x*x = ∑_{i=1}^{n} |xi|² > 0.
Because 0 = (λ − λ̄)(x*x), it follows that the eigenvalue λ
satisfies λ − λ̄ = 0, implying that λ is real.
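As an added numerical check (not part of the original notes), NumPy confirms that a self-adjoint matrix has real eigenvalues; the routine eigvalsh is designed for exactly this case, and the matrix C below is an arbitrary example:

```python
import numpy as np

# An arbitrary self-adjoint example: real part symmetric,
# imaginary part anti-symmetric.
C = np.array([[2.0 + 0j, 1.0 - 1j],
              [1.0 + 1j, 3.0 + 0j]])
assert np.allclose(C, C.conj().T)          # C* = C

# Treating C as a general matrix, the imaginary parts
# of its eigenvalues vanish, as the theorem asserts.
assert np.allclose(np.linalg.eigvals(C).imag, 0.0)

# eigvalsh assumes self-adjointness and returns real numbers directly.
print(np.linalg.eigvalsh(C))               # approximately [1. 4.]
```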
Orthogonal Projections
Definition
An n × n matrix P is an orthogonal projection if P² = P
and uᵀv = 0 whenever Pv = 0 and u = Px for some x ∈ Rⁿ.
Theorem
Suppose that the n × m matrix X has full rank m < n.
Let L ⊂ Rⁿ be the linear subspace spanned
by the m linearly independent columns of X.
Define the n × n matrix P := X(XᵀX)⁻¹Xᵀ. Then:
1. The matrix P is a symmetric orthogonal projection onto L.
2. The matrix I − P is a symmetric orthogonal projection
onto the orthogonal complement L⊥ of L.
3. For each vector y ∈ Rⁿ, its orthogonal projection onto L
is the unique vector v = Py
that minimizes the distance ‖y − v‖ between y and L.
Proof of Part 1
Because of the rules for the transposes of products and inverses,
the definition P := X(XᵀX)⁻¹Xᵀ implies that Pᵀ = P and also

P² = X(XᵀX)⁻¹XᵀX(XᵀX)⁻¹Xᵀ = X(XᵀX)⁻¹Xᵀ = P

Moreover, if Pv = 0 and u = Px for some x ∈ Rⁿ, then

uᵀv = xᵀPᵀv = xᵀPv = 0

Finally, for every y ∈ Rⁿ, the vector Py equals Xb,
where b = (XᵀX)⁻¹Xᵀy.
Hence Py ∈ L.
Proof of Part 2
Evidently (I − P)ᵀ = I − Pᵀ = I − P, and

(I − P)² = I − 2P + P² = I − 2P + P = I − P

Hence I − P is a projection.
This projection is also orthogonal because if (I − P)v = 0
and u = (I − P)x for some x ∈ Rⁿ, then

uᵀv = xᵀ(I − P)ᵀv = xᵀ(I − P)v = 0

Next, suppose that v = Xb ∈ L and that y = (I − P)x
belongs to the range of I − P. Then, because PXb = Xb,

yᵀv = xᵀ(I − P)ᵀXb = xᵀXb − xᵀXb = 0

so y ∈ L⊥.
Proof of Part 3
For any vector v = Xb ∈ L and y ∈ Rⁿ, one has

‖y − v‖² = (y − Xb)ᵀ(y − Xb) = yᵀy − 2yᵀXb + bᵀXᵀXb

Now define b̂ := (XᵀX)⁻¹Xᵀy (which is the OLS estimator of b
in the linear regression equation y = Xb + e).
Also, define v̂ := Xb̂ = Py. Because PᵀP = Pᵀ = P = P²,

‖y − v‖² = yᵀy − 2yᵀXb + bᵀXᵀXb
         = (b − b̂)ᵀXᵀX(b − b̂) + yᵀy − b̂ᵀXᵀXb̂
         = ‖v − v̂‖² + yᵀy − yᵀPᵀPy = ‖v − v̂‖² + yᵀy − yᵀPy

On the other hand, given that v̂ = Py, one also has

‖y − v̂‖² = yᵀy − 2yᵀv̂ + v̂ᵀv̂
         = yᵀy − 2yᵀPy + yᵀPᵀPy = yᵀy − yᵀPy

So ‖y − v‖² − ‖y − v̂‖² = ‖v − v̂‖² ≥ 0, with equality iff v = v̂.
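All three parts of the theorem can be checked numerically. The NumPy sketch below is an added illustration, with randomly generated X and y standing in for arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))    # n = 5, m = 2; full rank with probability 1
y = rng.normal(size=5)

P = X @ np.linalg.inv(X.T @ X) @ X.T

# Part 1: P is a symmetric projection matrix.
assert np.allclose(P, P.T) and np.allclose(P @ P, P)

# Part 2: I - P maps into the orthogonal complement of L = col(X).
assert np.allclose(X.T @ (np.eye(5) - P), 0.0)

# Part 3: Py is the point of L nearest to y; it agrees with least squares.
b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(P @ y, X @ b_hat)
```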
A Trick Function for Generating Eigenvalues
For all x ≠ 0, define the function, homogeneous of degree zero,

Rⁿ \ {0} ∋ x ↦ f(x) := xᵀAx / xᵀx = ( ∑_{i=1}^{n} ∑_{j=1}^{n} xi aij xj ) / ( ∑_{i=1}^{n} xi² )

which is left undefined at x = 0.
For symmetric A, the partial derivative w.r.t. any component xh of the vector x is

∂f/∂xh = [2 / (xᵀx)²] [ (∑_{j=1}^{n} ahj xj)(xᵀx) − (xᵀAx) xh ]

At any stationary point x̂ ≠ 0 where ∂f/∂xh = 0 for all h,
one therefore has (x̂ᵀx̂)Ax̂ = (x̂ᵀAx̂)x̂
and so Ax̂ = λx̂ where λ = f(x̂).
That is, a stationary point x̂ ≠ 0 must be an eigenvector, with the
corresponding function value f(x̂) as the associated eigenvalue.
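The following NumPy sketch (added to these notes, with an arbitrary example matrix) evaluates this trick function f and checks that it returns eigenvalues at eigenvectors, and that on any other point it stays between the smallest and largest eigenvalues:

```python
import numpy as np

def f(A, x):
    """The trick function: the Rayleigh quotient x'Ax / x'x."""
    return (x @ A @ x) / (x @ x)

# An arbitrary symmetric example matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigenvalues, V = np.linalg.eigh(A)   # ascending eigenvalues

# At each eigenvector, f returns the associated eigenvalue.
for k in range(3):
    assert np.isclose(f(A, V[:, k]), eigenvalues[k])

# Elsewhere, f stays between the minimum and maximum eigenvalues.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.normal(size=3)
    assert eigenvalues[0] - 1e-9 <= f(A, x) <= eigenvalues[-1] + 1e-9
```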
More Properties of the Trick Function
Lemma
Every n × n symmetric matrix A:
1. has a maximum eigenvalue λ* at an eigenvector x* where f
attains its maximum;
2. has a minimum eigenvalue λ_* at an eigenvector x_* where f
attains its minimum;
3. satisfies A = λI if and only if λ* = λ_* = λ.
Proof of Parts 1 and 2
The unit sphere Sⁿ⁻¹ is a compact subset of Rⁿ,
and the function f restricted to Sⁿ⁻¹ is continuous.
By the extreme value theorem, f restricted to Sⁿ⁻¹ must have:
▶ a maximum value λ* attained at some point x*;
▶ a minimum value λ_* attained at some point x_*.
Because f is homogeneous of degree zero,
these are the maximum and minimum values of f
over the whole domain Rⁿ \ {0}.
In particular, f must be stationary at any maximum point x*,
as well as at any minimum point x_*.
But stationary points must be eigenvectors.
This proves parts 1 and 2 of the lemma.
The Spectral Theorem
A Useful Lemma
Lemma
Let A be a symmetric n × n matrix.
Suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors
and are the columns of an n × m matrix U.
Then there is at least one more eigenvector x
that satisfies Uᵀx = 0,
i.e., one that is orthogonal to each of the m eigenvectors uk.
Constructive Proof I
For each eigenvector uk, let λk be the associated eigenvalue,
so that Auk = λk uk for k = 1, 2, . . . , m.
Then AU = UΛ where Λ := diag(λk)_{k=1}^{m}.
Also, because the eigenvectors {uk}_{k=1}^{m} form an orthonormal set,
one has UᵀU = I_m. Hence UᵀAU = UᵀUΛ = Λ.
Also, transposing AU = UΛ gives UᵀA = ΛUᵀ.
Consider now the n × n matrix Â := (I − UUᵀ)A(I − UUᵀ), which
is symmetric because both A and UUᵀ are symmetric. Note that

Â = A − UUᵀA − AUUᵀ + UUᵀAUUᵀ
  = A − UΛUᵀ − UΛUᵀ + UΛUᵀ = A − UΛUᵀ

This matrix Â has at least one eigenvalue λ, which must be real,
and an associated eigenvector x ≠ 0, which together satisfy

Âx = (I − UUᵀ)A(I − UUᵀ)x = (A − UΛUᵀ)x = λx

Pre-multiplying each side of the last equation by Uᵀ shows that

λUᵀx = UᵀAx − UᵀUΛUᵀx = ΛUᵀx − ΛUᵀx = 0
Constructive Proof II
This leaves two possible cases:
1. In the exceptional case when the only eigenvalue
of the symmetric matrix Â is λ = 0,
one has Â = 0 and so A = UΛUᵀ.
Then any vector x ≠ 0 satisfying Uᵀx = 0,
which exists because these are only m < n equations in n unknowns,
must satisfy Ax = UΛUᵀx = 0,
implying that x is an eigenvector of A.
2. Otherwise, in the generic case
when Â has at least one eigenvalue λ ≠ 0,
there is a corresponding eigenvector x ≠ 0 of Â
that satisfies Uᵀx = 0_m. But then this implies that

Ax = (Â + UΛUᵀ)x = Âx = λx

Hence x is an eigenvector of A.
In both cases there is an eigenvector x of A satisfying Uᵀx = 0_m.
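The constructive proof suggests a deflation procedure. The function below is a sketch of that idea added to these notes, not the author's code, and next_orthogonal_eigenvector is a hypothetical name:

```python
import numpy as np

def next_orthogonal_eigenvector(A, U):
    """Given a symmetric n x n matrix A and an n x m matrix U (m < n)
    whose columns are orthonormal eigenvectors of A, return a unit
    eigenvector of A orthogonal to every column of U (a sketch of
    the lemma's constructive proof)."""
    n = A.shape[0]
    Q = np.eye(n) - U @ U.T     # projector onto the complement of span(U)
    A_hat = Q @ A @ Q           # the deflated matrix A-hat = A - U Lam U'
    lam, V = np.linalg.eigh(A_hat)
    k = np.argmax(np.abs(lam))
    if abs(lam[k]) > 1e-10:     # generic case: A-hat has an eigenvalue != 0
        x = V[:, k]
    else:                       # exceptional case: A-hat = 0, so any unit
        x = np.linalg.svd(Q)[0][:, 0]   # vector orthogonal to span(U) works
    return x / np.linalg.norm(x)

# Example: start from one eigenvector of a 2 x 2 symmetric matrix.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
u1 = np.array([[1.0], [1.0]]) / np.sqrt(2)    # eigenvector for eigenvalue 3
x = next_orthogonal_eigenvector(A, u1)
assert np.allclose(u1.T @ x, 0.0)             # orthogonal to u1
assert np.allclose(A @ x, (x @ A @ x) * x)    # an eigenpair of A
```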
Spectral Theorem
Theorem
Given any symmetric n × n matrix A:
1. its eigenvectors span the whole of Rⁿ;
2. there exists an orthogonal matrix P that diagonalizes A.
Proof of Spectral Theorem
The matrix A has at least one eigenvalue, which must be real.
The associated eigenvector x, normalized to satisfy xᵀx = 1,
forms an orthonormal set {u1}.
As the induction hypothesis,
suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors.
We have just proved that this holds for m = 1.
The lemma shows that, if it holds for any m = 1, 2, . . . , n − 1,
then it holds for m + 1. The result follows by induction.
In particular, when m = n, there exists an orthonormal set
of n eigenvectors, which must then span the whole of Rⁿ.
Also, by the previous result, taking P as an orthogonal matrix
whose columns are an orthonormal set of n eigenvectors
implies that PᵀAP = Λ, where the elements
of the diagonal matrix Λ constitute its set of eigenvalues.
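Numerically (an added sketch, not part of the original notes), np.linalg.eigh delivers exactly such an orthogonal diagonalization of a symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                    # an arbitrary symmetric matrix

eigenvalues, P = np.linalg.eigh(A)   # columns of P: orthonormal eigenvectors

assert np.allclose(P.T @ P, np.eye(4))                  # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))   # P'AP = Lambda
```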