MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout 0: Course Information
Lecturer:
Martyn Quick, Room 326.
Prerequisite:
MT2001 Mathematics
Lectures:
Mon (even weeks), Tue & Thu, 12 noon, Purdie Theatre B.
Tutorials:
These begin in Week 2:
Times: Mon 2pm (Maths Theatre B), Mon 3pm (Maths Theatre B),
Thu 2pm (Maths Theatre B), Fri 3pm (Maths Theatre B).
Assessment: 10% Continuous assessment, 90% exam
Continuous assessment questions will be distributed later in the semester. (Full
details t.b.a.)
Recommended Texts:
• T. S. Blyth & E. F. Robertson, Basic Linear Algebra, Second Edition, Springer
Undergraduate Mathematics Series (Springer-Verlag 2002)
• T. S. Blyth & E. F. Robertson, Further Linear Algebra, Springer Undergraduate
Mathematics Series (Springer-Verlag 2002)
• R. Kaye & R. Wilson, Linear Algebra, Oxford Science Publications (OUP 1998)
Webpage: All my handouts, slides, problem sheets and solutions will be available
in PDF format from MMS and in due course on my webpage. Solutions will be posted
only in MMS and only after all relevant tutorials have happened. The versions on my
webpage will be as single PDFs so will be posted later in the semester (the current
versions are from last year, though there won’t be a huge number of changes).
The lecture notes and handouts will contain a number (currently a small number!)
of extra examples. I don’t have time to cover these in the lectures themselves, but
they are intended to be useful supplementary material.
Course Content:
• Vector spaces: subspaces, spanning sets, linear independence, bases
• Linear transformations: rank, nullity, matrix of a linear transformation,
change of basis
• Direct sums: projection maps
• Diagonalisation: eigenvectors & eigenvalues, characteristic polynomial, minimum polynomial
• Jordan normal form
• Inner product spaces: orthogonality, orthonormal bases, Gram–Schmidt process
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout I: Vector spaces
1 Vector spaces
Definition 1.1 A field is a set F together with two binary operations
F × F → F, (α, β) ↦ α + β    and    F × F → F, (α, β) ↦ αβ,
called addition and multiplication, respectively, such that
(i) α + β = β + α for all α, β ∈ F ;
(ii) (α + β) + γ = α + (β + γ) for all α, β, γ ∈ F ;
(iii) there exists an element 0 in F such that α + 0 = α for all α ∈ F ;
(iv) for each α ∈ F , there exists an element −α in F such that α + (−α) = 0;
(v) αβ = βα for all α, β ∈ F ;
(vi) (αβ)γ = α(βγ) for all α, β, γ ∈ F ;
(vii) α(β + γ) = αβ + αγ for all α, β, γ ∈ F ;
(viii) there exists an element 1 in F such that 1 ≠ 0 and 1α = α for all α ∈ F ;
(ix) for each α ∈ F with α ≠ 0, there exists an element α⁻¹ (or 1/α) in F such that αα⁻¹ = 1.
Example 1.2 The following are examples of fields: (i) Q; (ii) R; (iii) C, all with the
usual addition and multiplication;
(iv) Z/pZ = {0, 1, . . . , p − 1} (where p is a prime number) with addition and
multiplication being performed modulo p.
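Comment: the arithmetic of ℤ/pℤ is easy to experiment with on a computer. A minimal sketch in Python (not part of the handout; the helper name `inverse` is mine), illustrating axiom (ix) for p = 7:

```python
# Arithmetic in the field Z/pZ: add and multiply integers, then reduce mod p.
p = 7  # p must be prime for Z/pZ to be a field

def inverse(a, p):
    # Fermat's little theorem: a^(p-1) = 1 (mod p), so a^(p-2) is a^(-1) mod p.
    assert a % p != 0, "0 has no multiplicative inverse"
    return pow(a, p - 2, p)

for a in range(1, p):
    assert (a * inverse(a, p)) % p == 1   # axiom (ix) holds
print([inverse(a, p) for a in range(1, p)])   # [1, 4, 5, 2, 3, 6]
```

Primality of p is essential here: in ℤ/6ℤ, for instance, 2 has no multiplicative inverse, so ℤ/6ℤ is not a field.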
Definition 1.3 Let F be a field. A vector space over F is a set V together with the
following operations
V × V → V, (u, v) ↦ u + v    and    F × V → V, (α, v) ↦ αv,
called addition and scalar multiplication, respectively, such that
(i) u + v = v + u for all u, v ∈ V ;
(ii) (u + v) + w = u + (v + w) for all u, v, w ∈ V ;
(iii) there exists a vector 0 in V such that v + 0 = v for all v ∈ V ;
(iv) for each v ∈ V , there exists a vector −v in V such that v + (−v) = 0;
(v) α(u + v) = αu + αv for all u, v ∈ V and α ∈ F ;
(vi) (α + β)v = αv + βv for all v ∈ V and α, β ∈ F ;
(vii) (αβ)v = α(βv) for all v ∈ V and α, β ∈ F ;
(viii) 1v = v for all v ∈ V .
We shall use the term real vector space to refer to a vector space over the field R
and complex vector space to refer to one over the field C.
Example 1.4
(i) Let n be a positive integer and let

Fⁿ = { (x₁, x₂, …, xₙ)ᵀ | x₁, x₂, …, xₙ ∈ F },

the set of all column vectors of length n with entries from F. This is a vector space over F with addition given by

(x₁, x₂, …, xₙ)ᵀ + (y₁, y₂, …, yₙ)ᵀ = (x₁ + y₁, x₂ + y₂, …, xₙ + yₙ)ᵀ,

scalar multiplication by

α(x₁, x₂, …, xₙ)ᵀ = (αx₁, αx₂, …, αxₙ)ᵀ,

zero vector 0 = (0, 0, …, 0)ᵀ, and negatives given by

−(x₁, x₂, …, xₙ)ᵀ = (−x₁, −x₂, …, −xₙ)ᵀ.
(ii) C forms a vector space over R.
(iii) A polynomial over a field F is an expression of the form
f(x) = a₀ + a₁x + a₂x² + ⋯ + aₘxᵐ

for some m ≥ 0, where a₀, a₁, …, aₘ ∈ F. The set F[x] of all polynomials over F
forms a vector space over F . Addition is given by
∑ᵢ aᵢxⁱ + ∑ᵢ bᵢxⁱ = ∑ᵢ (aᵢ + bᵢ)xⁱ

and scalar multiplication by

α ∑ᵢ aᵢxⁱ = ∑ᵢ (αaᵢ)xⁱ.
(iv) Let FR denote the set of all functions f : R → R. Define
(f + g)(x) = f (x) + g(x)
(for x ∈ R)
and
(αf )(x) = α · f (x)
(for x ∈ R).
Then FR forms a vector space over R with respect to this addition and scalar
multiplication. In this vector space negatives are given by
(−f )(x) = −f (x)
and the zero vector is 0 : x ↦ 0 for all x ∈ R.
Basic properties of vector spaces
Proposition 1.5 Let V be a vector space over a field F . Let v ∈ V and α ∈ F . Then
(i) α0 = 0;
(ii) 0v = 0;
(iii) if αv = 0, then either α = 0 or v = 0;
(iv) (−α)v = −(αv) = α(−v).
Subspaces
Definition 1.6 Let V be a vector space over a field F . A subspace W of V is a
non-empty subset such that
(i) if u, v ∈ W , then u + v ∈ W , and
(ii) if v ∈ W and α ∈ F , then αv ∈ W .
Lemma 1.7 Let V be a vector space and let W be a subspace of V . Then
(i) 0 ∈ W ;
(ii) if v ∈ W , then −v ∈ W .
Definition 1.9 Let V be a vector space and let U and W be subspaces of V .
(i) The intersection of U and W is
U ∩ W = { v | v ∈ U and v ∈ W }.
(ii) The sum of U and W is
U + W = { u + w | u ∈ U, w ∈ W }.
Proposition 1.10 Let V be a vector space and let U and W be subspaces of V . Then
(i) U ∩ W is a subspace of V ;
(ii) U + W is a subspace of V .
Corollary 1.11 Let V be a vector space and let U1 , U2 , . . . , Uk be subspaces of V .
Then
U1 + U2 + · · · + Uk = { u1 + u2 + · · · + uk | ui ∈ Ui for each i }
is a subspace of V .
Spanning sets
Definition 1.12 Let V be a vector space over a field F and suppose that A =
{v1 , v2 , . . . , vk } is a set of vectors in V . A linear combination of these vectors is a
vector of the form
α1 v1 + α2 v2 + · · · + αk vk
for some α1 , α2 , . . . , αk ∈ F . The set of all such linear combinations is called the span
of the vectors v1 , v2 , . . . , vk and is denoted by Span(v1 , v2 , . . . , vk ) or by Span(A ).
Proposition 1.13 Let A be a set of vectors in the vector space V . Then Span(A ) is
a subspace of V .
Definition 1.14 A spanning set for a subspace W is a set A of vectors such that
Span(A ) = W .
Linear independent elements and bases
Definition 1.16 Let V be a vector space over a field F . A set A = {v1 , v2 , . . . , vk }
is called linearly independent if the only solution to the equation
∑ᵢ₌₁ᵏ αᵢvᵢ = 0

(with αᵢ ∈ F) is α₁ = α₂ = ⋯ = αₖ = 0.
If A is not linearly independent, we shall call it linearly dependent.
Example 1A Determine whether the set {x + x², 1 − 2x², 3 + 6x} is linearly independent in the vector space P of all real polynomials.
Solution: We solve
α(x + x²) + β(1 − 2x²) + γ(3 + 6x) = 0;
that is,
(β + 3γ) + (α + 6γ)x + (α − 2β)x² = 0.        (1)
Equating coefficients yields the system of equations
β + 3γ = 0
α     + 6γ = 0
α − 2β     = 0;

that is,

[ 0  1  3 ] [α]   [0]
[ 1  0  6 ] [β] = [0]
[ 1 −2  0 ] [γ]   [0]

A sequence of row operations (check!) converts this to

[ 1 −2  0 ] [α]   [0]
[ 0  1  3 ] [β] = [0]
[ 0  0  0 ] [γ]   [0]
Hence the original equation (1) is equivalent to
α − 2β = 0
β + 3γ = 0.
Since there are fewer equations remaining than variables, we have enough freedom to produce a non-zero solution. For example, if we set γ = 1, then β = −3γ = −3 and α = 2β = −6. Hence the set {x + x², 1 − 2x², 3 + 6x} is linearly dependent.

Lemma 1.17 Let A be a set of vectors in the vector space V . Then A is linearly independent if and only if no vector in A can be expressed as a linear combination of the others.
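Comment: calculations like the one in Example 1A are easily checked with a computer algebra system. A minimal sketch using sympy (assumed available; the matrix below is the coefficient matrix from Example 1A):

```python
from sympy import Matrix

# Coefficient matrix of the system in Example 1A:
#   beta + 3*gamma = 0,  alpha + 6*gamma = 0,  alpha - 2*beta = 0
M = Matrix([[0, 1, 3],
            [1, 0, 6],
            [1, -2, 0]])

print(M.rref()[0])     # reduced row echelon form: only two non-zero rows
print(M.nullspace())   # [Matrix([[-6], [-3], [1]])]: the solution found above
```

A non-zero kernel means there is a non-trivial linear combination equal to 0, i.e. the set is linearly dependent.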
Lemma 1.18 Let A be a linearly dependent set of vectors belonging to a vector
space V . Then there exists some vector v in A such that A \ {v} spans the same
subspace as A .
If we start with a finite set and repeatedly remove vectors which are linear combinations of the others in the set, then we must eventually stop and produce a linearly
independent set. This establishes:
Theorem 1.19 Let V be a vector space (over some field). If A is a finite subset of V
and W = Span(A ), then there exists a linearly independent subset B with B ⊆ A
and Span(B) = W .
Definition 1.20 Let V be a vector space over the field F . A basis for V is a linearly
independent spanning set. We say that V is finite-dimensional if it possesses a finite
spanning set; that is, if V possesses a finite basis. The dimension of V is the size of
any basis for V and is denoted by dim V .
Lemma 1.22 Let V be a vector space of dimension n (over some field) and suppose
B = {v1 , v2 , . . . , vn } is a basis for V . Then every vector in V can be expressed as a
linear combination of the vectors in B in a unique way.
Theorem 1.23 Let V be a finite-dimensional vector space. Suppose {v1 , v2 , . . . , vm }
is a linearly independent set of vectors and {w1 , w2 , . . . , wn } is a spanning set for V .
Then
m ≤ n.
Corollary 1.24 Let V be a finite-dimensional vector space. Then any two bases for V
have the same size and consequently dim V is uniquely determined.
Proposition 1.26 Let V be a finite-dimensional vector space. Then every linearly
independent set of vectors in V can be extended to a basis for V by adjoining a finite
number of vectors.
Corollary 1.27 Let V be a vector space of finite dimension n. If A is a linearly
independent set containing n vectors, then A is a basis for V .
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout II: Linear transformations
2 Linear transformations
Definition 2.1 Let V and W be vector spaces over the same field F . A linear mapping
(also called a linear transformation) from V to W is a function T : V → W such that
(i) T (u + v) = T (u) + T (v) for all u, v ∈ V , and
(ii) T (αv) = α T (v) for all v ∈ V and α ∈ F .
Comment: Sometimes we shall write T v for the image of the vector v under the
linear transformation T (instead of T (v)).
Lemma 2.2 Let T : V → W be a linear mapping between two vector spaces over the
field F . Then
(i) T (0) = 0;
(ii) T (−v) = −T (v) for all v ∈ V ;
(iii) if v1 , v2 , . . . , vk ∈ V and α1 , α2 , . . . , αk ∈ F , then
T( ∑ᵢ₌₁ᵏ αᵢvᵢ ) = ∑ᵢ₌₁ᵏ αᵢ T(vᵢ).
Definition 2.3 Let T : V → W be a linear transformation between vector spaces over
a field F .
(i) The image of T is
T (V ) = im T = { T (v) | v ∈ V }.
(ii) The kernel or null space of T is
ker T = { v ∈ V | T(v) = 0_W }.
Proposition 2.4 Let T : V → W be a linear transformation between vector spaces
V and W over the field F . The image and kernel of T are subspaces of W and V ,
respectively.
Definition 2.5 Let T : V → W be a linear transformation between vector spaces over
the field F .
(i) The rank of T , which we shall denote by rank T , is the dimension of the image
of T .
(ii) The nullity of T , which we shall denote by null T , is the dimension of the kernel
of T .
Theorem 2.6 (Rank-Nullity Theorem) Let V and W be vector spaces over the
field F with V finite-dimensional and let T : V → W be a linear transformation. Then
rank T + null T = dim V.
Comment: [For those who have done MT2002.] This can be viewed as an analogue of the First Isomorphism Theorem for groups within the world of vector spaces.
Rearranging gives
dim V − dim ker T = dim im T
and since (as we shall see on Problem Sheet II) dimension essentially determines vector
spaces we conclude
V / ker T ≅ im T.
Constructing linear transformations
Proposition 2.7 Let V be a finite-dimensional vector space over the field F with
basis {v1 , v2 , . . . , vn } and let W be any vector space over F . If y1 , y2 , . . . , yn are
arbitrary vectors in W , there is a unique linear transformation T : V → W such that
T (vi ) = yi
for i = 1, 2, . . . , n.
This transformation T is given by
T( ∑ᵢ₌₁ⁿ αᵢvᵢ ) = ∑ᵢ₌₁ⁿ αᵢyᵢ

for an arbitrary linear combination ∑ᵢ₌₁ⁿ αᵢvᵢ in V .
Proposition 2.9 Let V be a finite-dimensional vector space over the field F with
basis {v1 , v2 , . . . , vn } and let W be a vector space over F . Fix vectors y1 , y2 , . . . , yn
in W and let T : V → W be the unique linear transformation given by T (vi ) = yi
for i = 1, 2, . . . , n. Then
(i) im T = Span(y1 , y2 , . . . , yn ).
(ii) ker T = {0} if and only if {y1 , y2 , . . . , yn } is a linearly independent set.
Example 2A Define a linear transformation T : R³ → R³ in terms of the standard basis B = {e₁, e₂, e₃} by

T(e₁) = y₁ = (2, 1, −1)ᵀ,   T(e₂) = y₂ = (−1, 0, 2)ᵀ,   T(e₃) = y₃ = (0, −1, 4)ᵀ.

Show that ker T = {0} and im T = R³.
Solution: We check whether {y 1 , y 2 , y 3 } is linearly independent. Solve
αy₁ + βy₂ + γy₃ = 0;

that is,

2α − β           = 0
α           − γ  = 0
−α + 2β + 4γ     = 0.
The second equation tells us that γ = α while the first says β = 2α. Substituting for
β and γ in the third equation gives
−α + 4α + 4α = 7α = 0.
Hence α = 0 and consequently β = γ = 0.
This shows {y₁, y₂, y₃} is linearly independent. Consequently, ker T = {0} by Proposition 2.9. The Rank-Nullity Theorem now says

dim im T = dim R³ − dim ker T = 3 − 0 = 3.

Therefore im T = R³ as it has the same dimension.

[Alternatively, since dim R³ = 3 and {y₁, y₂, y₃} is linearly independent, this set must be a basis for R³ (see Corollary 1.27). Therefore, by Proposition 2.9(i),

im T = Span(y₁, y₂, y₃) = R³,
once again.]
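Comment: this kind of verification is quickly done numerically. A sketch assuming numpy (not part of the handout):

```python
import numpy as np

# Columns are T(e1), T(e2), T(e3) from Example 2A.
A = np.array([[ 2, -1,  0],
              [ 1,  0, -1],
              [-1,  2,  4]])

rank = np.linalg.matrix_rank(A)   # rank T = dim im T
print(rank, 3 - rank)             # 3 0: im T = R^3 and ker T = {0}
```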
The matrix of a linear transformation
Definition 2.11 Let V and W be finite-dimensional vector spaces over the field F and
let B = {v1 , v2 , . . . , vn } and C = {w1 , w2 , . . . , wm } be bases for V and W , respectively.
If T : V → W is a linear transformation, let
T(vⱼ) = ∑ᵢ₌₁ᵐ αᵢⱼ wᵢ
express the image of the vector vj under T as a linear combination of the basis C
(for j = 1, 2, . . . , n). The m × n matrix [αij ] is called the matrix of T with respect to
the bases B and C . We shall denote this by Mat(T ) or, when we wish to be explicit
about the dependence upon the bases B and C , by MatB,C (T ).
Note that the entries of the jth column of the matrix of T are α₁ⱼ, α₂ⱼ, …, αₘⱼ; i.e., the jth column specifies the image T(vⱼ) by listing the coefficients when it is expressed as a linear combination of the vectors in C .
Informal description of what the matrix of a linear transformation does: Suppose that V and W are finite-dimensional vector spaces of dimension n and m, respectively. Firstly, V and W “look like” Fⁿ and Fᵐ. Moreover, from this viewpoint, a linear transformation T : V → W then “looks like” multiplication by the matrix Mat(T).
Change of basis
The matrix of a linear transformation depends heavily on the choice of bases for the
two vector spaces involved. The following theorem describes how these matrices for
a linear transformation T : V → V are linked when calculated with respect to two
different bases for V . Similar results apply when using different bases for vector spaces V and W for a linear transformation T : V → W .
Theorem 2.13 Let V be a vector space of dimension n over a field F and let T : V →
V be a linear transformation. Let B = {v1 , v2 , . . . , vn } and C = {w1 , w2 , . . . , wn } be
bases for V and let A and B be the matrices of T with respect to B and C , respectively.
Then there is an invertible matrix P such that
B = P⁻¹AP.
Specifically, the (i, j)th entry of P is the coefficient of vi when wj is expressed as a
linear combination of the basis vectors in B.
Example 2B Let

B = { (0, 1, −1)ᵀ, (1, 0, −1)ᵀ, (2, −1, 0)ᵀ }.
(i) Show that B is a basis for R3 .
(ii) Write down the change of basis matrix from the standard basis E = {e1 , e2 , e3 }
to B.
(iii) Let

A = [ −2 −2 −3 ]
    [  1  1  2 ]
    [ −1 −2 −2 ]

and view A as a linear transformation R³ → R³. Find the matrix of A with respect to the basis B.
Solution: (i) We first establish that B is linearly independent. Solve

α(0, 1, −1)ᵀ + β(1, 0, −1)ᵀ + γ(2, −1, 0)ᵀ = (0, 0, 0)ᵀ;

that is,

β + 2γ = 0
α    − γ = 0
−α − β   = 0.
Thus γ = α and the first equation yields 2α + β = 0. Adding the third equation now
gives α = 0 and hence β = γ = 0. This shows B is linearly independent and it is therefore a basis for R³ since dim R³ = 3 = |B|.
(ii) We write each vector in B in terms of the standard basis:

(0, 1, −1)ᵀ = e₂ − e₃
(1, 0, −1)ᵀ = e₁ − e₃
(2, −1, 0)ᵀ = 2e₁ − e₂

and write the coefficients appearing down the columns of the change of basis matrix:

P = [  0  1  2 ]
    [  1  0 −1 ]
    [ −1 −1  0 ]
(iii) Theorem 2.13 says Mat_B,B(A) = P⁻¹AP (as the matrix of A with respect to the standard basis is A itself). We first calculate the inverse of P via the usual row operation method:

[  0  1  2 | 1 0 0 ]      [ 0  1  2 | 1 0 0 ]
[  1  0 −1 | 0 1 0 ]  →   [ 1  0 −1 | 0 1 0 ]    r₃ ↦ r₃ + r₂
[ −1 −1  0 | 0 0 1 ]      [ 0 −1 −1 | 0 1 1 ]

                          [ 1  0 −1 | 0 1 0 ]
                      →   [ 0  1  2 | 1 0 0 ]    r₁ ↔ r₂
                          [ 0 −1 −1 | 0 1 1 ]

                          [ 1  0 −1 | 0 1 0 ]
                      →   [ 0  1  2 | 1 0 0 ]    r₃ ↦ r₃ + r₂
                          [ 0  0  1 | 1 1 1 ]

                          [ 1  0  0 |  1  2  1 ]    r₁ ↦ r₁ + r₃
                      →   [ 0  1  0 | −1 −2 −2 ]    r₂ ↦ r₂ − 2r₃
                          [ 0  0  1 |  1  1  1 ]

Hence

P⁻¹ = [  1  2  1 ]
      [ −1 −2 −2 ]
      [  1  1  1 ]

and so

Mat_B,B(A) = P⁻¹AP = [  1  2  1 ] [ −2 −2 −3 ] [  0  1  2 ]
                     [ −1 −2 −2 ] [  1  1  2 ] [  1  0 −1 ]
                     [  1  1  1 ] [ −1 −2 −2 ] [ −1 −1  0 ]

                   = [ −1 −2 −1 ] [  0  1  2 ]
                     [  2  4  3 ] [  1  0 −1 ]
                     [ −2 −3 −3 ] [ −1 −1  0 ]

                   = [ −1  0  0 ]
                     [  1 −1  0 ]
                     [  0  1 −1 ].
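Comment: the arithmetic in part (iii) is easy to verify numerically; a sketch assuming numpy:

```python
import numpy as np

P = np.array([[ 0,  1,  2],
              [ 1,  0, -1],
              [-1, -1,  0]])
A = np.array([[-2, -2, -3],
              [ 1,  1,  2],
              [-1, -2, -2]])

# Theorem 2.13: the matrix of A with respect to B is P^(-1) A P.
B = np.linalg.inv(P) @ A @ P
print(np.round(B).astype(int))
# [[-1  0  0]
#  [ 1 -1  0]
#  [ 0  1 -1]]
```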
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout III: Direct sums
3 Direct sums
Definition 3.1 Let V be a vector space over a field F . We say that V is the direct
sum of two subspaces U1 and U2 , written V = U1 ⊕ U2 , if every vector v in V can be
expressed uniquely in the form v = u1 + u2 where u1 ∈ U1 and u2 ∈ U2 .
Proposition 3.2 Let V be a vector space and U1 and U2 be subspaces of V . Then
V = U1 ⊕ U2 if and only if the following conditions hold:
(i) V = U1 + U2 ,
(ii) U1 ∩ U2 = {0}.
Comment: Many authors use the two conditions to define what is meant by a direct
sum and then show it is equivalent to our “unique expression” definition.
Proposition 3.4 Let V = U1 ⊕ U2 be a finite-dimensional vector space expressed as
a direct sum of two subspaces. If B1 and B2 are bases for U1 and U2 , respectively,
then B1 ∪ B2 is a basis for V .
Corollary 3.5 If V = U1 ⊕ U2 is a finite-dimensional vector space expressed as a
direct sum of two subspaces, then
dim V = dim U1 + dim U2 .
Projection maps
Definition 3.6 Let V = U1 ⊕ U2 be a vector space expressed as a direct sum of two
subspaces. The two projection maps P1 : V → V and P2 : V → V onto U1 and U2 ,
respectively, corresponding to this decomposition are defined as follows:
if v ∈ V , express v uniquely as v = u1 + u2 where u1 ∈ U1 and u2 ∈ U2 ,
then
P1 (v) = u1
and
P2 (v) = u2 .
Lemma 3.7 Let V = U1 ⊕ U2 be a direct sum of subspaces with projection maps
P1 : V → V and P2 : V → V . Then
(i) P1 and P2 are linear transformations;
(ii) P1 (u) = u for all u ∈ U1 and P1 (w) = 0 for all w ∈ U2 ;
(iii) P2 (u) = 0 for all u ∈ U1 and P2 (w) = w for all w ∈ U2 ;
(iv) ker P1 = U2 and im P1 = U1 ;
(v) ker P2 = U1 and im P2 = U2 .
Proposition 3.8 Let P : V → V be a projection corresponding to some direct sum
decomposition of the vector space V . Then
(i) P² = P ;
(ii) V = ker P ⊕ im P ;
(iii) I − P is also a projection;
(iv) V = ker P ⊕ ker(I − P ).
Here I : V → V denotes the identity transformation I : v 7→ v for v ∈ V .
Example 3A Let V = R³ and U = Span(v₁), where

v₁ = (3, −1, 2)ᵀ.
(i) Find a subspace W such that V = U ⊕ W .
(ii) Let P : V → V be the associated projection onto W . Calculate P(u), where

u = (4, 4, 4)ᵀ.
Solution: (i) We first extend {v₁} to a basis for R³. We claim that

B = { (3, −1, 2)ᵀ, (1, 0, 0)ᵀ, (0, 1, 0)ᵀ }

is a basis for R³. We solve

α(3, −1, 2)ᵀ + β(1, 0, 0)ᵀ + γ(0, 1, 0)ᵀ = (0, 0, 0)ᵀ;

that is,

3α + β = −α + γ = 2α = 0.

Hence α = 0, so β = −3α = 0 and γ = α = 0. Thus B is linearly independent. Since dim V = 3 and |B| = 3, we conclude that B is a basis for R³.
Let W = Span(v₂, v₃), where

v₂ = (1, 0, 0)ᵀ   and   v₃ = (0, 1, 0)ᵀ.
Since B = {v₁, v₂, v₃} is a basis for V , if v ∈ V , then there exist α₁, α₂, α₃ ∈ R such that

v = (α₁v₁) + (α₂v₂ + α₃v₃) ∈ U + W.

Hence V = U + W .

If v ∈ U ∩ W , then there exist α, β₁, β₂ ∈ R such that

v = αv₁ = β₁v₂ + β₂v₃.

Therefore

αv₁ + (−β₁)v₂ + (−β₂)v₃ = 0.

Since B is linearly independent, we conclude α = −β₁ = −β₂ = 0, so v = αv₁ = 0. Thus U ∩ W = {0} and so

V = U ⊕ W.
(ii) We write u as a linear combination of the basis B. Inspection shows

u = (4, 4, 4)ᵀ = 2(3, −1, 2)ᵀ − 2(1, 0, 0)ᵀ + 6(0, 1, 0)ᵀ
  = (6, −2, 4)ᵀ + (−2, 6, 0)ᵀ,

where the first term in the last line belongs to U and the second to W . Hence

P(u) = (−2, 6, 0)ᵀ

(since this is the W -component of u).
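Comment: numerically, P(u) can be computed by solving one linear system for the coordinates of u with respect to B and keeping the W -part. A sketch assuming numpy (names are mine):

```python
import numpy as np

# Columns of M are the basis vectors v1, v2, v3 of B from Example 3A.
M = np.array([[ 3, 1, 0],
              [-1, 0, 1],
              [ 2, 0, 0]], dtype=float)
u = np.array([4, 4, 4], dtype=float)

coeffs = np.linalg.solve(M, u)                        # [ 2. -2.  6.]
proj_W = coeffs[1] * M[:, 1] + coeffs[2] * M[:, 2]    # drop the U-component
print(proj_W)                                         # [-2.  6.  0.]
```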
Direct sums of more summands
Definition 3.10 Let V be a vector space. We say that V is the direct sum of subspaces
U1 , U2 , . . . , Uk , written V = U1 ⊕ U2 ⊕ · · · ⊕ Uk , if every vector in V can be uniquely
expressed in the form u1 + u2 + · · · + uk where ui ∈ Ui for each i.
The following analogues of earlier results then hold:
Proposition 3.11 Let V be a vector space with subspaces U1 , U2 , . . . , Uk . Then
V = U1 ⊕ U2 ⊕ · · · ⊕ Uk if and only if the following conditions hold:
(i) V = U1 + U2 + · · · + Uk ;
(ii) Ui ∩ (U1 + · · · + Ui−1 + Ui+1 + · · · + Uk ) = {0} for each i.
Proposition 3.12 Let V = U1 ⊕ U2 ⊕ · · · ⊕ Uk be a direct sum of subspaces. If Bi is
a basis for Ui for i = 1, 2, . . . , k, then B1 ∪ B2 ∪ · · · ∪ Bk is a basis for V .
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout IV: Diagonalisation of linear transformations
4 Diagonalisation of linear transformations
Throughout fix a vector space V over a field F and a linear transformation T : V → V .
Eigenvectors and eigenvalues
Definition 4.1 Let V be a vector space over a field F and let T : V → V be a linear
transformation. A non-zero vector v is an eigenvector for T with eigenvalue λ (where
λ ∈ F ) if
T (v) = λv.
Definition 4.2 Let V be a vector space over a field F , let T : V → V be a linear
transformation, and λ ∈ F . The eigenspace corresponding to the eigenvalue λ is the
subspace
Eλ = ker(T − λI) = { v ∈ V | T (v) − λv = 0 }
= { v ∈ V | T (v) = λv }.
Recall I denotes the identity transformation given by v 7→ v.
From now on we strengthen our standing assumption to V being a finite-dimensional vector space over F .
Definition 4.3 Let T : V → V be a linear transformation of the finite-dimensional
vector space V (over F ) and let A be the matrix of T with respect to some basis. The
characteristic polynomial of T is
cT (x) = det(xI − A)
where x is an indeterminate variable.
Lemma 4.4 Suppose that T : V → V is a linear transformation of the finite-dimensional
vector space V over F . Then λ is an eigenvalue of T if and only if λ is a root of the
characteristic polynomial of T .
Lemma 4.5 Let V be a finite-dimensional vector space over F and T : V → V
be a linear transformation. The characteristic polynomial cT (x) is independent of the
choice of basis for V .
Diagonalisability
Definition 4.6 (i) Let T : V → V be a linear transformation of a finite-dimensional
vector space V . We say that T is diagonalisable if there is a basis with respect
to which T is represented by a diagonal matrix.
(ii) A square matrix A is diagonalisable if there is an invertible matrix P such that
P⁻¹AP is diagonal.
Proposition 4.7 Let V be a finite-dimensional vector space and T : V → V be a
linear transformation. Then T is diagonalisable if and only if there is a basis for V
consisting of eigenvectors for T .
Algebraic and geometric multiplicities
Lemma 4.10 If the linear transformation T : V → V is diagonalisable, then the characteristic polynomial of T is a product of linear factors.
Definition 4.12 Let V be a finite-dimensional vector space over the field F and let
T : V → V be a linear transformation of V . Let λ ∈ F .
(i) The algebraic multiplicity of λ (as an eigenvalue of T ) is the largest power k such that (x − λ)ᵏ is a factor of the characteristic polynomial cT(x).
(ii) The geometric multiplicity of λ is the dimension of the eigenspace Eλ corresponding to λ.
Proposition 4.13 Let T : V → V be a linear transformation of a vector space V . A
set of eigenvectors of T corresponding to distinct eigenvalues is linearly independent.
Theorem 4.14 Let V be a finite-dimensional vector space over the field F and let
T : V → V be a linear transformation of V .
(i) If the characteristic polynomial cT (x) is a product of linear factors (as always
happens, for example, if F = C), then the sum of the algebraic multiplicities
equals dim V .
(ii) Let λ ∈ F and let rλ be the algebraic multiplicity and nλ be the geometric
multiplicity of λ. Then
nλ ≤ rλ.
(iii) The linear transformation T is diagonalisable if and only if cT (x) is a product of
linear factors and nλ = rλ for all eigenvalues λ.
Example 4A Let

A = [ −1  2 −1 ]
    [ −4  5 −2 ]
    [ −4  3  0 ]
Show that A is not diagonalisable.
Solution: The characteristic polynomial of A is
cA(x) = det(xI − A)

      = det [ x+1  −2    1 ]
            [  4   x−5   2 ]
            [  4   −3    x ]

      = (x + 1) det[x−5, 2; −3, x] + 2 det[4, 2; 4, x] + det[4, x−5; 4, −3]

      = (x + 1)(x(x − 5) + 6) + 2(4x − 8) + (−12 − 4x + 20)
      = (x + 1)(x² − 5x + 6) + 8(x − 2) − 4x + 8
      = (x + 1)(x − 2)(x − 3) + 8(x − 2) − 4(x − 2)
      = (x − 2)((x + 1)(x − 3) + 8 − 4)
      = (x − 2)(x² − 2x − 3 + 4)
      = (x − 2)(x² − 2x + 1)
      = (x − 2)(x − 1)².
In particular, the algebraic multiplicity of the eigenvalue 1 is 2.
We now determine the eigenspace for eigenvalue 1. We solve (A − I)v = 0; that is,

[ −2  2 −1 ] [x]   [0]
[ −4  4 −2 ] [y] = [0]        (2)
[ −4  3 −1 ] [z]   [0]
We solve this by applying row operations:

[ −2  2 −1 | 0 ]      [ −2  2 −1 | 0 ]    r₂ ↦ r₂ − 2r₁
[ −4  4 −2 | 0 ]  →   [  0  0  0 | 0 ]    r₃ ↦ r₃ − 2r₁
[ −4  3 −1 | 0 ]      [  0 −1  1 | 0 ]

                      [ −2  0  1 | 0 ]
                  →   [  0  0  0 | 0 ]    r₁ ↦ r₁ + 2r₃
                      [  0 −1  1 | 0 ]

So Equation (2) is equivalent to

−2x + z = 0 = −y + z.
Hence z = 2x and y = z = 2x. Therefore the eigenspace is

E₁ = { (x, 2x, 2x)ᵀ | x ∈ R } = Span( (1, 2, 2)ᵀ )
and we conclude dim E1 = 1. Thus the geometric multiplicity of 1 is not equal to the
algebraic multiplicity, so A is not diagonalisable.
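Comment: sympy reproduces both multiplicities in a few lines (a sketch, not part of the handout):

```python
from sympy import Matrix, eye, symbols, factor

x = symbols('x')
A = Matrix([[-1, 2, -1],
            [-4, 5, -2],
            [-4, 3,  0]])

print(factor(A.charpoly(x).as_expr()))  # (x - 2)*(x - 1)**2
print((A - eye(3)).nullspace())         # a single vector, so dim E_1 = 1
```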
Minimum polynomial
Note: The minimum polynomial is often also called the “minimal polynomial”.
Definition 4.15 Let T : V → V be a linear transformation of a finite-dimensional
vector space over the field F . The minimum polynomial mT (x) is the monic polynomial
over F of smallest degree such that
mT (T ) = 0.
To say that a polynomial is monic is to say that its leading coefficient is 1. Our
definition of the characteristic polynomial ensures that cT (x) is also always a monic
polynomial.
The definition of the minimum polynomial (in particular, the assumption that it is
monic) ensures that it is unique.
Theorem 4.16 (Cayley–Hamilton Theorem) Let T : V → V be a linear transformation of a finite-dimensional vector space V . If cT (x) is the characteristic polynomial
of T , then
cT (T ) = 0.
Facts about polynomials: Let F be a field and recall F [x] denotes the set of
polynomials with coefficients from F :
f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀        (where aᵢ ∈ F ).
Then F [x] is an example of what is known as a Euclidean domain (see MT4517 Rings
and Fields for full details). A summary of its main properties is:
• We can add, multiply and subtract polynomials;
• Euclidean Algorithm: if we attempt to divide f (x) by g(x) (where g(x) ≠ 0), we
obtain
f (x) = g(x)q(x) + r(x)
where either r(x) = 0 or the degree of r(x) satisfies deg r(x) < deg g(x) (i.e., we
can perform long-division with polynomials).
• When the remainder is 0, that is, when f (x) = g(x)q(x) for some polynomial q(x),
we say that g(x) divides f (x).
• If f (x) and g(x) are non-zero polynomials, their greatest common divisor is the
polynomial d(x) of largest degree dividing them both. It is uniquely determined
up to multiplying by a scalar and can be expressed as
d(x) = a(x)f (x) + b(x)g(x)
for some polynomials a(x), b(x).
Those familiar with divisibility in the integers Z (particularly those who have attended MT1003) will recognise these facts as being standard properties of Z (which is
also a standard example of a Euclidean domain).
Proposition 4.18 Let V be a finite-dimensional vector space over a field F and let
T : V → V be a linear transformation. If f (x) is any polynomial (over F ) such that
f (T ) = 0, then the minimum polynomial mT (x) divides f (x).
Corollary 4.19 Suppose that T : V → V is a linear transformation of a finite-dimensional
vector space V . Then the minimum polynomial mT (x) divides the characteristic polynomial cT (x).
Theorem 4.20 Let V be a finite-dimensional vector space over a field F and let
T : V → V be a linear transformation of V . Then the roots of the minimum polynomial mT (x) and the roots of the characteristic polynomial cT (x) coincide.
Theorem 4.21 Let V be a finite-dimensional vector space over the field F and let
T : V → V be a linear transformation. Then T is diagonalisable if and only if the
minimum polynomial mT (x) is a product of distinct linear factors.
Lemma 4.22 Let T : V → V be a linear transformation of a vector space over the
field F and let f (x) and g(x) be coprime polynomials over F . Then
ker f (T )g(T ) = ker f (T ) ⊕ ker g(T ).
Example 4B Let

A = [  0 −2 −1 ]
    [  1  5  3 ]
    [ −1 −2  0 ]
Calculate the characteristic polynomial and the minimum polynomial of A. Hence
determine whether A is diagonalisable.
Solution:

cA(x) = det(xI − A)

      = det [  x    2    1 ]
            [ −1   x−5  −3 ]
            [  1    2    x ]

      = x det[x−5, −3; 2, x] − 2 det[−1, −3; 1, x] + det[−1, x−5; 1, 2]

      = x(x(x − 5) + 6) − 2(−x + 3) + (−2 − x + 5)
      = x(x² − 5x + 6) + 2(x − 3) − x + 3
      = x(x − 3)(x − 2) + 2(x − 3) − (x − 3)
      = (x − 3)(x(x − 2) + 2 − 1)
      = (x − 3)(x² − 2x + 1)
      = (x − 3)(x − 1)².
Since the minimum polynomial divides cA (x) and has the same roots, we deduce
mA(x) = (x − 3)(x − 1)     or     mA(x) = (x − 3)(x − 1)².
We calculate

(A − 3I)(A − I) = [ −3 −2 −1 ] [ −1 −2 −1 ]
                  [  1  2  3 ] [  1  4  3 ]
                  [ −1 −2 −3 ] [ −1 −2 −1 ]

                = [  2  0 −2 ]
                  [ −2  0  2 ]  ≠ 0.
                  [  2  0 −2 ]
Hence mA(x) ≠ (x − 3)(x − 1). We conclude

mA(x) = (x − 3)(x − 1)².
This is not a product of distinct linear factors, so A is not diagonalisable.
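The same computation in sympy (a sketch, assuming the library is available):

```python
from sympy import Matrix, eye, symbols, factor

x = symbols('x')
A = Matrix([[ 0, -2, -1],
            [ 1,  5,  3],
            [-1, -2,  0]])

print(factor(A.charpoly(x).as_expr()))   # (x - 3)*(x - 1)**2
print((A - 3*eye(3)) * (A - eye(3)))     # non-zero: m_A(x) != (x-3)(x-1)
print((A - 3*eye(3)) * (A - eye(3))**2)  # zero matrix, as Cayley-Hamilton predicts
```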
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout V: Jordan normal form
5 Jordan normal form
Definition 5.1 A Jordan block is an n × n matrix of the form

Jₙ(λ) = [ λ  1           ]
        [    λ  1        ]
        [      ⋱  ⋱      ]
        [          λ  1  ]
        [             λ  ]

(with λ in each diagonal entry, 1 in each entry immediately above the diagonal, and 0 in every other entry) for some positive integer n and some scalar λ.
A linear transformation T : V → V (of a vector space V ) has Jordan normal form A if there exists a basis for V with respect to which the matrix of T is

Mat(T) = A = [ Jₙ₁(λ₁)                    ]
             [         Jₙ₂(λ₂)            ]
             [                 ⋱          ]
             [                    Jₙₖ(λₖ)  ]

for some positive integers n₁, n₂, …, nₖ and scalars λ₁, λ₂, …, λₖ. (The entries outside the Jordan blocks are zero matrices of appropriate sizes.)
Theorem 5.2 Let V be a finite-dimensional vector space and let T : V → V be a linear transformation of V whose characteristic polynomial cT(x) is a product of linear factors, with eigenvalues λ₁, λ₂, …, λₙ. Then there exists a basis for V with respect to which Mat(T) is in Jordan normal form, where each Jordan block has the form Jₘ(λᵢ) for some m and some i.
Corollary 5.3 Let A be a square matrix over C. Then there exists an invertible matrix P (over C) such that P⁻¹AP is in Jordan normal form.
Basic properties of Jordan normal form
Proposition 5.4 Let J = Jn (λ) be an n × n Jordan block. Then
(i) cJ(x) = (x − λ)ⁿ;
(ii) mJ(x) = (x − λ)ⁿ;
(iii) the eigenspace Eλ of J has dimension 1.
Finding the Jordan normal form
Problem: Let V be a finite-dimensional vector space and let T : V → V be a linear
transformation. If the characteristic polynomial cT (x) is a product of linear factors,
find a basis B with respect to which T is in Jordan normal form and determine what
this Jordan normal form is.
Observation 5.5 The algebraic multiplicity of λ as an eigenvalue of T equals the sum
of the sizes of the Jordan blocks Jn (λ) (associated to λ) occurring in the Jordan normal
form for T .
Observation 5.6 If λ is an eigenvalue of T , then the power of (x − λ) occurring in the minimum polynomial mT(x) is (x − λ)ᵐ, where m is the largest size of a Jordan block associated to λ occurring in the Jordan normal form for T .
Observation 5.9 The geometric multiplicity of λ as an eigenvalue of T equals the
number of Jordan blocks Jn (λ) occurring in the Jordan normal form for T .
These observations are enough to determine the Jordan normal form for many
small examples. We need to generalise Observation 5.9 in larger cases and consider
more general spaces than the eigenspace Eλ . Determining
dim ker(T − λI)²,   dim ker(T − λI)³,   …

would prove to be sufficient.
Finding the basis B such that Mat_B,B(T) is in Jordan normal form boils down to finding the Jordan normal form and then solving appropriate systems of linear equations. (It is essentially a generalisation of finding the eigenvectors for a diagonalisable linear transformation.)
Example 5A Omitted from the handout due to length. A document containing this
supplementary example can be found in MMS and on my webpage.
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout VI: Inner product spaces
6 Inner product spaces
Throughout this section, our base field F will be either R or C. Recall that if z =
x + iy ∈ C, the complex conjugate of z is given by
z̄ = x − iy.
To save space and time, we shall use the complex conjugate even when F = R. Thus,
when F = R and ᾱ appears, we mean ᾱ = α for a scalar α ∈ R.
Definition 6.1 Let F = R or C. An inner product space is a vector space V over F
together with an inner product
V × V → F,  (v, w) ↦ ⟨v, w⟩
such that
(i) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ for all u, v, w ∈ V ,

(ii) ⟨αv, w⟩ = α⟨v, w⟩ for all v, w ∈ V and α ∈ F ,

(iii) ⟨v, w⟩ = \overline{⟨w, v⟩} for all v, w ∈ V ,

(iv) ⟨v, v⟩ is a real number satisfying ⟨v, v⟩ ≥ 0 for all v ∈ V ,

(v) ⟨v, v⟩ = 0 if and only if v = 0.
In the case when F = R, our inner product is symmetric in the sense that Condition (iii) then becomes
⟨v, w⟩ = ⟨w, v⟩
for all v, w ∈ V .
Example 6.2 (i) The vector space Rⁿ is an inner product space with respect to the usual dot product:

⟨(x₁, x₂, …, xₙ)ᵀ, (y₁, y₂, …, yₙ)ᵀ⟩ = (x₁, x₂, …, xₙ)ᵀ · (y₁, y₂, …, yₙ)ᵀ = ∑ᵢ₌₁ⁿ xᵢyᵢ.

(ii) We can endow Cⁿ with an inner product by

⟨(z₁, z₂, …, zₙ)ᵀ, (w₁, w₂, …, wₙ)ᵀ⟩ = ∑ᵢ₌₁ⁿ zᵢw̄ᵢ.

(iii) The set C[a, b] of continuous functions f : [a, b] → R is an inner product space when we define

⟨f, g⟩ = ∫_a^b f(x)g(x) dx.

(iv) The space of real polynomials of degree at most n inherits the above inner product from the space of continuous functions.

The space of complex polynomials

f(x) = αₙxⁿ + αₙ₋₁xⁿ⁻¹ + ⋯ + α₁x + α₀        (with αᵢ ∈ C)

becomes an inner product space when we define

⟨f, g⟩ = ∫_0^1 f(x)\overline{g(x)} dx,

where

\overline{f}(x) = ᾱₙxⁿ + ᾱₙ₋₁xⁿ⁻¹ + ⋯ + ᾱ₁x + ᾱ₀.
Definition 6.3 Let V be an inner product space with inner product ⟨·, ·⟩. The norm is the function ‖·‖ : V → R defined by

‖v‖ = √⟨v, v⟩.

(This makes sense since ⟨v, v⟩ ≥ 0 for all v ∈ V .)
Lemma 6.4 Let V be an inner product space with inner product ⟨·, ·⟩. Then

(i) ⟨v, αw⟩ = ᾱ⟨v, w⟩ for all v, w ∈ V and α ∈ F ;

(ii) ‖αv‖ = |α| · ‖v‖ for all v ∈ V and α ∈ F ;

(iii) ‖v‖ > 0 whenever v ≠ 0.
Theorem 6.5 (Cauchy–Schwarz Inequality) Let V be an inner product space with inner product ⟨·, ·⟩. Then

|⟨u, v⟩| ≤ ‖u‖ · ‖v‖

for all u, v ∈ V .
Corollary 6.6 (Triangle Inequality) Let V be an inner product space. Then
‖u + v‖ ≤ ‖u‖ + ‖v‖
for all u, v ∈ V .
Orthogonality and orthonormal bases
Definition 6.7 Let V be an inner product space.
(i) Two vectors v and w are said to be orthogonal if ⟨v, w⟩ = 0.
(ii) A set A of vectors is orthogonal if every pair of vectors within it are orthogonal.
(iii) A set A of vectors is orthonormal if it is orthogonal and every vector in A has
unit norm.
Thus the set A = {v₁, v₂, …, vₖ} is orthonormal if

⟨vᵢ, vⱼ⟩ = δᵢⱼ,   where δᵢⱼ = 1 if i = j and δᵢⱼ = 0 if i ≠ j.
An orthonormal basis for an inner product space V is a basis which is itself an
orthonormal set.
Theorem 6.9 An orthogonal set of non-zero vectors is linearly independent.
Problem: Given a (finite-dimensional) inner product space V , how do we find an
orthonormal basis?
Theorem 6.10 (Gram–Schmidt Process) Suppose that V is a finite-dimensional
inner product space with basis {v1 , v2 , . . . , vn }. The following procedure constructs an
orthonormal basis {e1 , e2 , . . . , en } for V .
Step 1: Define e₁ = (1/‖v₁‖) v₁.

Step k: Suppose that e₁, e₂, …, eₖ₋₁ have been constructed. Define

wₖ = vₖ − ∑ᵢ₌₁ᵏ⁻¹ ⟨vₖ, eᵢ⟩eᵢ    and    eₖ = (1/‖wₖ‖) wₖ.
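Comment: the process translates directly into code. A minimal numpy sketch for the dot product on Rⁿ (the function name is mine):

```python
import numpy as np

def gram_schmidt(vectors):
    # Orthonormalise linearly independent vectors, following Theorem 6.10.
    es = []
    for v in vectors:
        w = v - sum(np.dot(v, e) * e for e in es)  # subtract the projections
        es.append(w / np.linalg.norm(w))           # normalise to unit length
    return es

basis = [np.array([1., 1., 0.]),
         np.array([1., 0., 1.]),
         np.array([0., 1., 1.])]
E = gram_schmidt(basis)
print(np.round([[np.dot(a, b) for b in E] for a in E], 8))
# the 3x3 identity, i.e. <e_i, e_j> = delta_ij
```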
Example 6.12 (Laguerre polynomials) We can define an inner product on the
space P of real polynomials f (x) by
⟨f, g⟩ = ∫_0^∞ f(x)g(x)e^{−x} dx.
If we apply the Gram–Schmidt process to the standard basis of monomials
{1, x, x², x³, …},
then we produce an orthonormal basis for P consisting of the Laguerre polynomials
{ Ln (x) | n = 0, 1, 2, . . . }.
Example 6.13 Define an inner product on the space P of real polynomials by
⟨f, g⟩ = ∫_{−1}^1 f(x)g(x) dx.
Applying the Gram–Schmidt process to the monomials {1, x, x², x³, …} produces an
orthonormal basis (with respect to this inner product). The polynomials produced are
scalar multiples of the Legendre polynomials:
P₀(x) = 1,   P₁(x) = x,   P₂(x) = ½(3x² − 1),   …
The set { Pn (x) | n = 0, 1, 2, . . . } of Legendre polynomials is orthogonal, but not
orthonormal. This is the reason why the Gram–Schmidt process only produces a
scalar multiple of them: they do not have unit norm.
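The first few Legendre multiples can be reproduced by running the Gram–Schmidt process symbolically; a sketch with sympy:

```python
from sympy import symbols, integrate, sqrt, simplify

x = symbols('x')
inner = lambda f, g: integrate(f * g, (x, -1, 1))   # the inner product above

es = []
for v in (1, x, x**2):
    w = v - sum(inner(v, e) * e for e in es)   # the Gram-Schmidt step
    es.append(simplify(w / sqrt(inner(w, w))))

print(es)   # scalar multiples of P0 = 1, P1 = x, P2 = (3x^2 - 1)/2
```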
The Hermite polynomials form an orthogonal set in the space P when we endow it
with the following inner product:

⟨f, g⟩ = ∫_{−∞}^∞ f(x)g(x)e^{−x²/2} dx.
The orthonormal basis produced by applying the Gram–Schmidt process to the monomials consists of scalar multiples of the Hermite polynomials.
Orthogonal complements
Definition 6.14 Let V be an inner product space. If U is a subspace of V , the
orthogonal complement to U is
U⊥ = { v ∈ V | ⟨v, u⟩ = 0 for all u ∈ U }.
Lemma 6.15 Let V be an inner product space and U be a subspace of V . Then
(i) U ⊥ is a subspace of V , and
(ii) U ∩ U ⊥ = {0}.
Theorem 6.16 Let V be a finite-dimensional inner product space and U be a subspace
of V . Then
V = U ⊕ U ⊥.
We can consider the projection map PU : V → V onto U associated to the decomposition V = U ⊕ U ⊥ . This is given by
PU (v) = u
where v = u + w is the unique decomposition of v with u ∈ U and w ∈ U ⊥ .
Theorem 6.17 Let V be a finite-dimensional inner product space and U be a subspace
of V . Let PU : V → V be the projection map onto U associated to the direct sum
decomposition V = U ⊕ U ⊥ . If v ∈ V , then PU (v) is the vector in U closest to v.
Example 6A Omitted from the handout due to length. A document containing this
supplementary example can be found in MMS and on my webpage.
MRQ 2014
School of Mathematics and Statistics
MT3501 Linear Mathematics
Handout VII: The adjoint of a transformation and self-adjoint
transformations
7 The adjoint of a transformation and self-adjoint transformations
Throughout this section, V is a finite-dimensional inner product space over a field F (where, as before, F = R or C) with inner product ⟨·, ·⟩.
Definition 7.1 Let T : V → V be a linear transformation. The adjoint of T is a map
T ∗ : V → V such that
⟨T(v), w⟩ = ⟨v, T∗(w)⟩
for all v, w ∈ V .
Remark: More generally, if T : V → W is a linear map between inner product
spaces, the adjoint T ∗ : W → V is a map satisfying the above equation for all v ∈ V
and w ∈ W . Appropriate parts of what we describe here can be done in this more
general setting.
Lemma 7.2 Let V be a finite-dimensional inner product space and let T : V → V be
a linear transformation. Then there is a unique adjoint T ∗ for T and, moreover, T ∗ is
a linear transformation.
We also record what was observed in the course of this proof:
If A = MatB,B (T ) is the matrix of T with respect to an orthonormal basis,
then
Mat_B,B(T∗) = Āᵀ
(the conjugate transpose of A).
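A numerical sketch of this fact, for the inner product on Cⁿ from Handout VI (numpy assumed; the names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
v = rng.normal(size=3) + 1j * rng.normal(size=3)
w = rng.normal(size=3) + 1j * rng.normal(size=3)

inner = lambda u, z: np.sum(u * np.conj(z))   # <u, z> = sum u_i * conj(z_i)
A_star = A.conj().T                           # conjugate transpose of A

print(np.isclose(inner(A @ v, w), inner(v, A_star @ w)))   # True
```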
Definition 7.3 A linear transformation T : V → V is self-adjoint if T ∗ = T .
Lemma 7.4 (i) A real matrix A defines a self-adjoint transformation if and only if
it is symmetric: Aᵀ = A.
(ii) A complex matrix A defines a self-adjoint transformation if and only if it is
Hermitian: Āᵀ = A.
Theorem 7.5 A self-adjoint transformation of a finite-dimensional inner product space
is diagonalisable.
Corollary 7.6
(i) A real symmetric matrix is diagonalisable.
(ii) A Hermitian matrix is diagonalisable.
Lemma 7.7 Let V be a finite-dimensional inner product space and T : V → V be a
self-adjoint transformation. Then the characteristic polynomial is a product of linear
factors and every eigenvalue of T is real.
Lemma 7.8 Let V be an inner product space and T : V → V be a linear map. If U is
a subspace of V such that T (U ) ⊆ U (i.e., U is T -invariant), then T ∗ (U ⊥ ) ⊆ U ⊥
(i.e., U ⊥ is T ∗ -invariant).