MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout 0: Course Information

Lecturer: Martyn Quick, Room 326.
Prerequisite: MT2001 Mathematics.
Lectures: Mon (even), Tue & Thu 12 noon, Purdie Theatre B.
Tutorials: These begin in Week 2. Times: Mon 2pm (Maths Theatre B), Mon 3pm (Maths Theatre B), Thu 2pm (Maths Theatre B), Fri 3pm (Maths Theatre B).
Assessment: 10% continuous assessment, 90% exam. Continuous assessment questions will be distributed later in the semester. (Full details t.b.a.)

Recommended Texts:
• T. S. Blyth & E. F. Robertson, Basic Linear Algebra, Second Edition, Springer Undergraduate Mathematics Series (Springer-Verlag 2002)
• T. S. Blyth & E. F. Robertson, Further Linear Algebra, Springer Undergraduate Mathematics Series (Springer-Verlag 2002)
• R. Kaye & R. Wilson, Linear Algebra, Oxford Science Publications (OUP 1998)

Webpage: All my handouts, slides, problem sheets and solutions will be available in PDF format from MMS and in due course on my webpage. Solutions will be posted only in MMS and only after all relevant tutorials have happened. The versions on my webpage will be single PDFs, so they will be posted later in the semester (the current versions are from last year, though there won't be a huge number of changes).

The lecture notes and handouts will contain a number (currently a small number!) of extra examples. I don't have time to cover these in the lectures themselves, but they are intended to be useful supplementary material.

Course Content:
• Vector spaces: subspaces, spanning sets, linear independence, bases
• Linear transformations: rank, nullity, matrix of a linear transformation, change of basis
• Direct sums: projection maps
• Diagonalisation: eigenvectors & eigenvalues, characteristic polynomial, minimum polynomial
• Jordan normal form
• Inner product spaces: orthogonality, orthonormal bases, Gram–Schmidt process

MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout I: Vector spaces

1 Vector spaces

Definition 1.1 A field is a set F together with two binary operations

    F × F → F              F × F → F
    (α, β) ↦ α + β        (α, β) ↦ αβ

called addition and multiplication, respectively, such that
(i) α + β = β + α for all α, β ∈ F;
(ii) (α + β) + γ = α + (β + γ) for all α, β, γ ∈ F;
(iii) there exists an element 0 in F such that α + 0 = α for all α ∈ F;
(iv) for each α ∈ F, there exists an element −α in F such that α + (−α) = 0;
(v) αβ = βα for all α, β ∈ F;
(vi) (αβ)γ = α(βγ) for all α, β, γ ∈ F;
(vii) α(β + γ) = αβ + αγ for all α, β, γ ∈ F;
(viii) there exists an element 1 in F such that 1 ≠ 0 and 1α = α for all α ∈ F;
(ix) for each α ∈ F with α ≠ 0, there exists an element α⁻¹ (or 1/α) in F such that αα⁻¹ = 1.

Example 1.2 The following are examples of fields:
(i) Q;
(ii) R;
(iii) C, all with the usual addition and multiplication;
(iv) Z/pZ = {0, 1, . . . , p − 1} (where p is a prime number) with addition and multiplication performed modulo p.

Definition 1.3 Let F be a field.
A vector space over F is a set V together with the following operations

    V × V → V              F × V → V
    (u, v) ↦ u + v        (α, v) ↦ αv,

called addition and scalar multiplication, respectively, such that
(i) u + v = v + u for all u, v ∈ V;
(ii) (u + v) + w = u + (v + w) for all u, v, w ∈ V;
(iii) there exists a vector 0 in V such that v + 0 = v for all v ∈ V;
(iv) for each v ∈ V, there exists a vector −v in V such that v + (−v) = 0;
(v) α(u + v) = αu + αv for all u, v ∈ V and α ∈ F;
(vi) (α + β)v = αv + βv for all v ∈ V and α, β ∈ F;
(vii) (αβ)v = α(βv) for all v ∈ V and α, β ∈ F;
(viii) 1v = v for all v ∈ V.

We shall use the term real vector space to refer to a vector space over the field R and complex vector space to refer to one over the field C.

Example 1.4 (i) Let n be a positive integer and let
$$F^n = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \,\middle|\, x_1, x_2, \ldots, x_n \in F \right\}.$$
This is a vector space over F with addition given by
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix},$$
scalar multiplication by
$$\alpha \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{pmatrix},$$
zero vector
$$0 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix},$$
and negatives given by
$$-\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} -x_1 \\ -x_2 \\ \vdots \\ -x_n \end{pmatrix}.$$

(ii) C forms a vector space over R.

(iii) A polynomial over a field F is an expression of the form
$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$$
for some m ≥ 0, where a_0, a_1, . . . , a_m ∈ F. The set F[x] of all polynomials over F forms a vector space over F. Addition is given by
$$\sum_i a_i x^i + \sum_i b_i x^i = \sum_i (a_i + b_i) x^i$$
and scalar multiplication by
$$\alpha \sum_i a_i x^i = \sum_i (\alpha a_i) x^i.$$

(iv) Let F_R denote the set of all functions f : R → R. Define
    (f + g)(x) = f(x) + g(x)    (for x ∈ R)
and
    (αf)(x) = α · f(x)    (for x ∈ R).
Then F_R forms a vector space over R with respect to this addition and scalar multiplication. In this vector space negatives are given by (−f)(x) = −f(x) and the zero vector is 0 : x ↦ 0 for all x ∈ R.

Basic properties of vector spaces

Proposition 1.5 Let V be a vector space over a field F. Let v ∈ V and α ∈ F. Then
(i) α0 = 0;
(ii) 0v = 0;
(iii) if αv = 0, then either α = 0 or v = 0;
(iv) (−α)v = −αv = α(−v).

Subspaces

Definition 1.6 Let V be a vector space over a field F. A subspace W of V is a non-empty subset such that
(i) if u, v ∈ W, then u + v ∈ W, and
(ii) if v ∈ W and α ∈ F, then αv ∈ W.

Lemma 1.7 Let V be a vector space and let W be a subspace of V. Then
(i) 0 ∈ W;
(ii) if v ∈ W, then −v ∈ W.

Definition 1.9 Let V be a vector space and let U and W be subspaces of V.
(i) The intersection of U and W is U ∩ W = { v | v ∈ U and v ∈ W }.
(ii) The sum of U and W is U + W = { u + w | u ∈ U, w ∈ W }.

Proposition 1.10 Let V be a vector space and let U and W be subspaces of V. Then
(i) U ∩ W is a subspace of V;
(ii) U + W is a subspace of V.

Corollary 1.11 Let V be a vector space and let U1, U2, . . . , Uk be subspaces of V. Then
    U1 + U2 + · · · + Uk = { u1 + u2 + · · · + uk | ui ∈ Ui for each i }
is a subspace of V.

Spanning sets

Definition 1.12 Let V be a vector space over a field F and suppose that A = {v1, v2, . . . , vk} is a set of vectors in V. A linear combination of these vectors is a vector of the form
    α1v1 + α2v2 + · · · + αkvk
for some α1, α2, . . . , αk ∈ F. The set of all such linear combinations is called the span of the vectors v1, v2, . . . , vk and is denoted by Span(v1, v2, . . . , vk) or by Span(A).

Proposition 1.13 Let A be a set of vectors in the vector space V. Then Span(A) is a subspace of V.

Definition 1.14 A spanning set for a subspace W is a set A of vectors such that Span(A) = W.
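As a small supplementary illustration (not part of the handout itself), the following Python sketch tests whether a vector of R^n lies in the span of a given list of vectors by comparing matrix ranks. The helper name in_span and the example vectors are my own choices, and numpy is assumed to be available.

```python
import numpy as np

def in_span(vectors, w):
    """Return True if w is a linear combination of the given vectors."""
    A = np.column_stack(vectors)            # columns span the subspace
    B = np.column_stack(vectors + [w])      # the same columns with w adjoined
    # w lies in Span(vectors) exactly when adjoining w does not increase the rank.
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])

print(in_span([v1, v2], np.array([2.0, 3.0, 5.0])))   # True: equals 2*v1 + 3*v2
print(in_span([v1, v2], np.array([0.0, 0.0, 1.0])))   # False: not in the span
```

The test works because adjoining w to the spanning vectors leaves the rank unchanged exactly when w is already a linear combination of them.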
Linearly independent elements and bases

Definition 1.16 Let V be a vector space over a field F. A set A = {v1, v2, . . . , vk} is called linearly independent if the only solution to the equation
$$\sum_{i=1}^{k} \alpha_i v_i = 0$$
(with αi ∈ F) is α1 = α2 = · · · = αk = 0. If A is not linearly independent, we shall call it linearly dependent.

Example 1A Determine whether the set {x + x², 1 − 2x², 3 + 6x} is linearly independent in the vector space P of all real polynomials.

Solution: We solve
    α(x + x²) + β(1 − 2x²) + γ(3 + 6x) = 0;
that is,
    (β + 3γ) + (α + 6γ)x + (α − 2β)x² = 0.     (1)
Equating coefficients yields the system of equations
    β + 3γ = 0
    α + 6γ = 0
    α − 2β = 0;
that is,
$$\begin{pmatrix} 0 & 1 & 3 \\ 1 & 0 & 6 \\ 1 & -2 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
A sequence of row operations (Check!) converts this to
$$\begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Hence the original equation (1) is equivalent to
    α − 2β = 0
    β + 3γ = 0.
Since there are fewer equations remaining than the number of variables, we have enough freedom to produce a non-zero solution. For example, if we set γ = 1, then β = −3γ = −3 and α = 2β = −6. Hence the set {x + x², 1 − 2x², 3 + 6x} is linearly dependent.

Lemma 1.17 Let A be a set of vectors in the vector space V. Then A is linearly independent if and only if no vector in A can be expressed as a linear combination of the others.

Lemma 1.18 Let A be a linearly dependent set of vectors belonging to a vector space V. Then there exists some vector v in A such that A \ {v} spans the same subspace as A.

If we start with a finite set and repeatedly remove vectors which are linear combinations of the others in the set, then we must eventually stop and produce a linearly independent set. This establishes:

Theorem 1.19 Let V be a vector space (over some field). If A is a finite subset of V and W = Span(A), then there exists a linearly independent subset B with B ⊆ A and Span(B) = W.

Definition 1.20 Let V be a vector space over the field F. A basis for V is a linearly independent spanning set. We say that V is finite-dimensional if it possesses a finite spanning set; that is, if V possesses a finite basis. The dimension of V is the size of any basis for V and is denoted by dim V.

Lemma 1.22 Let V be a vector space of dimension n (over some field) and suppose B = {v1, v2, . . . , vn} is a basis for V. Then every vector in V can be expressed as a linear combination of the vectors in B in a unique way.

Theorem 1.23 Let V be a finite-dimensional vector space. Suppose {v1, v2, . . . , vm} is a linearly independent set of vectors and {w1, w2, . . . , wn} is a spanning set for V. Then m ≤ n.

Corollary 1.24 Let V be a finite-dimensional vector space. Then any two bases for V have the same size and consequently dim V is uniquely determined.

Proposition 1.26 Let V be a finite-dimensional vector space. Then every linearly independent set of vectors in V can be extended to a basis for V by adjoining a finite number of vectors.

Corollary 1.27 Let V be a vector space of finite dimension n. If A is a linearly independent set containing n vectors, then A is a basis for V.
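As a supplementary check of Example 1A (not part of the handout), the following sketch redoes the row reduction with exact arithmetic; the use of sympy and the variable names are my own illustrative choices. The columns of M are the coefficients of α, β, γ from equation (1).

```python
from sympy import Matrix

M = Matrix([[0, 1, 3],
            [1, 0, 6],
            [1, -2, 0]])

print(M.rank())        # 2 < 3, so a non-trivial solution exists: linearly dependent
print(M.nullspace())   # one basis vector, proportional to (-6, -3, 1)
```

The rank being 2 confirms there is a non-zero solution, and the nullspace vector recovers (α, β, γ) = (−6, −3, 1) up to scaling, exactly as found above.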
MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout II: Linear transformations

2 Linear transformations

Definition 2.1 Let V and W be vector spaces over the same field F. A linear mapping (also called a linear transformation) from V to W is a function T : V → W such that
(i) T(u + v) = T(u) + T(v) for all u, v ∈ V, and
(ii) T(αv) = α T(v) for all v ∈ V and α ∈ F.

Comment: Sometimes we shall write Tv for the image of the vector v under the linear transformation T (instead of T(v)).

Lemma 2.2 Let T : V → W be a linear mapping between two vector spaces over the field F. Then
(i) T(0) = 0;
(ii) T(−v) = −T(v) for all v ∈ V;
(iii) if v1, v2, . . . , vk ∈ V and α1, α2, . . . , αk ∈ F, then
$$T\left( \sum_{i=1}^{k} \alpha_i v_i \right) = \sum_{i=1}^{k} \alpha_i T(v_i).$$

Definition 2.3 Let T : V → W be a linear transformation between vector spaces over a field F.
(i) The image of T is T(V) = im T = { T(v) | v ∈ V }.
(ii) The kernel or null space of T is ker T = { v ∈ V | T(v) = 0_W }.

Proposition 2.4 Let T : V → W be a linear transformation between vector spaces V and W over the field F. The image and kernel of T are subspaces of W and V, respectively.

Definition 2.5 Let T : V → W be a linear transformation between vector spaces over the field F.
(i) The rank of T, which we shall denote by rank T, is the dimension of the image of T.
(ii) The nullity of T, which we shall denote by null T, is the dimension of the kernel of T.

Theorem 2.6 (Rank-Nullity Theorem) Let V and W be vector spaces over the field F with V finite-dimensional and let T : V → W be a linear transformation. Then
    rank T + null T = dim V.

Comment: [For those who have done MT2002.] This can be viewed as an analogue of the First Isomorphism Theorem for groups within the world of vector spaces. Rearranging gives
    dim V − dim ker T = dim im T
and since (as we shall see on Problem Sheet II) dimension essentially determines vector spaces we conclude
    V / ker T ≅ im T.

Constructing linear transformations

Proposition 2.7 Let V be a finite-dimensional vector space over the field F with basis {v1, v2, . . . , vn} and let W be any vector space over F. If y1, y2, . . . , yn are arbitrary vectors in W, there is a unique linear transformation T : V → W such that T(vi) = yi for i = 1, 2, . . . , n. This transformation T is given by
$$T\left( \sum_{i=1}^{n} \alpha_i v_i \right) = \sum_{i=1}^{n} \alpha_i y_i$$
for an arbitrary linear combination ∑_{i=1}^{n} αi vi in V.

Proposition 2.9 Let V be a finite-dimensional vector space over the field F with basis {v1, v2, . . . , vn} and let W be a vector space over F. Fix vectors y1, y2, . . . , yn in W and let T : V → W be the unique linear transformation given by T(vi) = yi for i = 1, 2, . . . , n. Then
(i) im T = Span(y1, y2, . . . , yn).
(ii) ker T = {0} if and only if {y1, y2, . . . , yn} is a linearly independent set.

Example 2A Define a linear transformation T : R³ → R³ in terms of the standard basis B = {e1, e2, e3} by
$$T(e_1) = y_1 = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}, \qquad T(e_2) = y_2 = \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix}, \qquad T(e_3) = y_3 = \begin{pmatrix} 0 \\ -1 \\ 4 \end{pmatrix}.$$
Show that ker T = {0} and im T = R³.

Solution: We check whether {y1, y2, y3} is linearly independent. Solve αy1 + βy2 + γy3 = 0; that is,
    2α − β = 0
    α − γ = 0
    −α + 2β + 4γ = 0.
The second equation tells us that γ = α, while the first says β = 2α. Substituting for β and γ in the third equation gives −α + 4α + 4α = 7α = 0. Hence α = 0 and consequently β = γ = 0. This shows {y1, y2, y3} is linearly independent. Consequently, ker T = {0} by Proposition 2.9.

The Rank-Nullity Theorem now says
    dim im T = dim R³ − dim ker T = 3 − 0 = 3.
Therefore im T = R³ as it has the same dimension.

[Alternatively, since dim R³ = 3 and {y1, y2, y3} is linearly independent, this set must be a basis for R³ (see Corollary 1.27). Therefore, by Proposition 2.9(i), im T = Span(y1, y2, y3) = R³, once again.]
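A quick numerical cross-check of Example 2A (my own supplementary illustration, assuming numpy is available): the matrix of T with respect to the standard basis has y1, y2, y3 as its columns, and its rank and nullity illustrate the Rank-Nullity Theorem.

```python
import numpy as np

# Columns are y1, y2, y3, so this is the matrix of T with respect to the standard basis.
T = np.array([[ 2, -1,  0],
              [ 1,  0, -1],
              [-1,  2,  4]], dtype=float)

rank = np.linalg.matrix_rank(T)
print(rank)               # 3, so im T = R^3
print(T.shape[1] - rank)  # nullity = dim R^3 - rank T = 0, so ker T = {0}
```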
The matrix of a linear transformation

Definition 2.11 Let V and W be finite-dimensional vector spaces over the field F and let B = {v1, v2, . . . , vn} and C = {w1, w2, . . . , wm} be bases for V and W, respectively. If T : V → W is a linear transformation, let
$$T(v_j) = \sum_{i=1}^{m} \alpha_{ij} w_i$$
express the image of the vector vj under T as a linear combination of the basis C (for j = 1, 2, . . . , n). The m × n matrix [αij] is called the matrix of T with respect to the bases B and C. We shall denote this by Mat(T) or, when we wish to be explicit about the dependence upon the bases B and C, by MatB,C(T).

Note that the entries of the jth column of the matrix of T are
$$\begin{pmatrix} \alpha_{1j} \\ \alpha_{2j} \\ \vdots \\ \alpha_{mj} \end{pmatrix};$$
i.e., the jth column specifies the image T(vj) by listing the coefficients when it is expressed as a linear combination of the vectors in C.

Informal description of what the matrix of a linear transformation does: Suppose that V and W are finite-dimensional vector spaces of dimension n and m, respectively. Firstly V and W “look like” F^n and F^m. Moreover, from this viewpoint, a linear transformation T : V → W then “looks like” multiplication by the matrix Mat(T).

Change of basis

The matrix of a linear transformation depends heavily on the choice of bases for the two vector spaces involved. The following theorem describes how these matrices for a linear transformation T : V → V are linked when calculated with respect to two different bases for V. Similar results apply when using different bases for vector spaces V and W for a linear transformation T : V → W.

Theorem 2.13 Let V be a vector space of dimension n over a field F and let T : V → V be a linear transformation. Let B = {v1, v2, . . . , vn} and C = {w1, w2, . . . , wn} be bases for V and let A and B be the matrices of T with respect to B and C, respectively. Then there is an invertible matrix P such that
    B = P⁻¹AP.
Specifically, the (i, j)th entry of P is the coefficient of vi when wj is expressed as a linear combination of the basis vectors in B.

Example 2B Let
$$B = \left\{ \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} \right\}.$$
(i) Show that B is a basis for R³.
(ii) Write down the change of basis matrix from the standard basis E = {e1, e2, e3} to B.
(iii) Let
$$A = \begin{pmatrix} -2 & -2 & -3 \\ 1 & 1 & 2 \\ -1 & -2 & -2 \end{pmatrix}$$
and view A as a linear transformation R³ → R³. Find the matrix of A with respect to the basis B.

Solution: (i) We first establish that B is linearly independent. Solve
$$\alpha \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + \gamma \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix};$$
that is,
    β + 2γ = 0
    α − γ = 0
    −α − β = 0.
Thus γ = α and the first equation yields 2α + β = 0. Adding the third equation now gives α = 0 and hence β = γ = 0. This shows B is linearly independent and it is therefore a basis for R³ since dim R³ = 3 = |B|.

(ii) We write each vector in B in terms of the standard basis:
$$\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} = e_2 - e_3, \qquad \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = e_1 - e_3, \qquad \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} = 2e_1 - e_2,$$
and write the coefficients appearing down the columns of the change of basis matrix:
$$P = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{pmatrix}.$$

(iii) Theorem 2.13 says MatB,B(A) = P⁻¹AP (as the matrix of A with respect to the standard basis is A itself). We first calculate the inverse of P via the usual row operation method, reducing the augmented matrix (P | I):
$$\left(\begin{array}{ccc|ccc} 0 & 1 & 2 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 & 1 & 0 \\ -1 & -1 & 0 & 0 & 0 & 1 \end{array}\right).$$
Applying r3 ↦ r3 + r2, then r1 ↔ r2, then r3 ↦ r3 + r2, and finally r1 ↦ r1 + r3 and r2 ↦ r2 − 2r3 (Check!) gives
$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 1 \\ 0 & 1 & 0 & -1 & -2 & -2 \\ 0 & 0 & 1 & 1 & 1 & 1 \end{array}\right).$$
Hence
$$P^{-1} = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -2 & -2 \\ 1 & 1 & 1 \end{pmatrix}$$
and so
$$\mathrm{Mat}_{B,B}(A) = P^{-1}AP = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -2 & -2 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -2 & -2 & -3 \\ 1 & 1 & 2 \\ -1 & -2 & -2 \end{pmatrix} \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & -2 & -1 \\ 2 & 4 & 3 \\ -2 & -3 & -3 \end{pmatrix} \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}.$$
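The arithmetic of Example 2B can be verified with a short sympy computation (a supplementary sketch of my own, not part of the handout): P has the vectors of B as its columns, and Theorem 2.13 gives the matrix of A with respect to B as P⁻¹AP.

```python
from sympy import Matrix

A = Matrix([[-2, -2, -3],
            [ 1,  1,  2],
            [-1, -2, -2]])
P = Matrix([[ 0,  1,  2],
            [ 1,  0, -1],
            [-1, -1,  0]])

print(P.inv())          # [[1, 2, 1], [-1, -2, -2], [1, 1, 1]]
print(P.inv() * A * P)  # [[-1, 0, 0], [1, -1, 0], [0, 1, -1]]
```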
MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout III: Direct sums

3 Direct sums

Definition 3.1 Let V be a vector space over a field F. We say that V is the direct sum of two subspaces U1 and U2, written V = U1 ⊕ U2, if every vector v in V can be expressed uniquely in the form v = u1 + u2 where u1 ∈ U1 and u2 ∈ U2.

Proposition 3.2 Let V be a vector space and U1 and U2 be subspaces of V. Then V = U1 ⊕ U2 if and only if the following conditions hold:
(i) V = U1 + U2,
(ii) U1 ∩ U2 = {0}.

Comment: Many authors use the two conditions to define what is meant by a direct sum and then show it is equivalent to our “unique expression” definition.

Proposition 3.4 Let V = U1 ⊕ U2 be a finite-dimensional vector space expressed as a direct sum of two subspaces. If B1 and B2 are bases for U1 and U2, respectively, then B1 ∪ B2 is a basis for V.

Corollary 3.5 If V = U1 ⊕ U2 is a finite-dimensional vector space expressed as a direct sum of two subspaces, then dim V = dim U1 + dim U2.

Projection maps

Definition 3.6 Let V = U1 ⊕ U2 be a vector space expressed as a direct sum of two subspaces. The two projection maps P1 : V → V and P2 : V → V onto U1 and U2, respectively, corresponding to this decomposition are defined as follows: if v ∈ V, express v uniquely as v = u1 + u2 where u1 ∈ U1 and u2 ∈ U2, then P1(v) = u1 and P2(v) = u2.

Lemma 3.7 Let V = U1 ⊕ U2 be a direct sum of subspaces with projection maps P1 : V → V and P2 : V → V. Then
(i) P1 and P2 are linear transformations;
(ii) P1(u) = u for all u ∈ U1 and P1(w) = 0 for all w ∈ U2;
(iii) P2(u) = 0 for all u ∈ U1 and P2(w) = w for all w ∈ U2;
(iv) ker P1 = U2 and im P1 = U1;
(v) ker P2 = U1 and im P2 = U2.

Proposition 3.8 Let P : V → V be a projection corresponding to some direct sum decomposition of the vector space V. Then
(i) P² = P;
(ii) V = ker P ⊕ im P;
(iii) I − P is also a projection;
(iv) V = ker P ⊕ ker(I − P).
Here I : V → V denotes the identity transformation I : v ↦ v for v ∈ V.

Example 3A Let V = R³ and U = Span(v1), where
$$v_1 = \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix}.$$
(i) Find a subspace W such that V = U ⊕ W.
(ii) Let P : V → V be the associated projection onto W. Calculate P(u) where
$$u = \begin{pmatrix} 4 \\ 4 \\ 4 \end{pmatrix}.$$

Solution: (i) We first extend {v1} to a basis for R³. We claim that
$$B = \left\{ \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}$$
is a basis for R³. We solve
$$\alpha \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \gamma \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix};$$
that is,
    3α + β = −α + γ = 2α = 0.
Hence α = 0, so β = −3α = 0 and γ = α = 0. Thus B is linearly independent. Since dim V = 3 and |B| = 3, we conclude that B is a basis for R³.

Let W = Span(v2, v3) where
$$v_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad v_3 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}.$$
Since B = {v1, v2, v3} is a basis for V, if v ∈ V, then there exist α1, α2, α3 ∈ R such that
    v = (α1v1) + (α2v2 + α3v3) ∈ U + W.
Hence V = U + W. If v ∈ U ∩ W, then there exist α, β1, β2 ∈ R such that
    v = αv1 = β1v2 + β2v3.
Therefore αv1 + (−β1)v2 + (−β2)v3 = 0. Since B is linearly independent, we conclude α = −β1 = −β2 = 0, so v = αv1 = 0.
Thus U ∩ W = {0} and so V = U ⊕ W.

(ii) We write u as a linear combination of the basis B. Inspection shows
$$u = \begin{pmatrix} 4 \\ 4 \\ 4 \end{pmatrix} = 2\begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix} - 2\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + 6\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 6 \\ -2 \\ 4 \end{pmatrix} + \begin{pmatrix} -2 \\ 6 \\ 0 \end{pmatrix},$$
where the first term in the last expression belongs to U and the second to W. Hence
$$P(u) = \begin{pmatrix} -2 \\ 6 \\ 0 \end{pmatrix}$$
(since this is the W-component of u).

Direct sums of more summands

Definition 3.10 Let V be a vector space. We say that V is the direct sum of subspaces U1, U2, . . . , Uk, written V = U1 ⊕ U2 ⊕ · · · ⊕ Uk, if every vector in V can be uniquely expressed in the form u1 + u2 + · · · + uk where ui ∈ Ui for each i.

The following analogues of earlier results then hold:

Proposition 3.11 Let V be a vector space with subspaces U1, U2, . . . , Uk. Then V = U1 ⊕ U2 ⊕ · · · ⊕ Uk if and only if the following conditions hold:
(i) V = U1 + U2 + · · · + Uk;
(ii) Ui ∩ (U1 + · · · + Ui−1 + Ui+1 + · · · + Uk) = {0} for each i.

Proposition 3.12 Let V = U1 ⊕ U2 ⊕ · · · ⊕ Uk be a direct sum of subspaces. If Bi is a basis for Ui for i = 1, 2, . . . , k, then B1 ∪ B2 ∪ · · · ∪ Bk is a basis for V.

MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout IV: Diagonalisation of linear transformations

4 Diagonalisation of linear transformations

Throughout fix a vector space V over a field F and a linear transformation T : V → V.

Eigenvectors and eigenvalues

Definition 4.1 Let V be a vector space over a field F and let T : V → V be a linear transformation. A non-zero vector v is an eigenvector for T with eigenvalue λ (where λ ∈ F) if T(v) = λv.

Definition 4.2 Let V be a vector space over a field F, let T : V → V be a linear transformation, and λ ∈ F. The eigenspace corresponding to the eigenvalue λ is the subspace
    Eλ = ker(T − λI) = { v ∈ V | T(v) − λv = 0 } = { v ∈ V | T(v) = λv }.
Recall I denotes the identity transformation given by v ↦ v.

From now on we strengthen our standing assumption to V being a finite-dimensional vector space over F.

Definition 4.3 Let T : V → V be a linear transformation of the finite-dimensional vector space V (over F) and let A be the matrix of T with respect to some basis. The characteristic polynomial of T is
    cT(x) = det(xI − A)
where x is an indeterminate variable.

Lemma 4.4 Suppose that T : V → V is a linear transformation of the finite-dimensional vector space V over F. Then λ is an eigenvalue of T if and only if λ is a root of the characteristic polynomial of T.

Lemma 4.5 Let V be a finite-dimensional vector space over F and T : V → V be a linear transformation. The characteristic polynomial cT(x) is independent of the choice of basis for V.

Diagonalisability

Definition 4.6 (i) Let T : V → V be a linear transformation of a finite-dimensional vector space V. We say that T is diagonalisable if there is a basis with respect to which T is represented by a diagonal matrix.
(ii) A square matrix A is diagonalisable if there is an invertible matrix P such that P⁻¹AP is diagonal.

Proposition 4.7 Let V be a finite-dimensional vector space and T : V → V be a linear transformation. Then T is diagonalisable if and only if there is a basis for V consisting of eigenvectors for T.
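As a supplementary contrast with Example 4A below, the following sketch (my own illustration, assuming sympy) exhibits Proposition 4.7 for a small diagonalisable matrix: the eigenvectors assemble into the columns of an invertible matrix P with P⁻¹AP diagonal.

```python
from sympy import Matrix

A = Matrix([[2, 1],
            [1, 2]])

print(A.eigenvects())        # eigenvalues 1 and 3, each with one eigenvector
P, D = A.diagonalize()       # columns of P are eigenvectors, D is diagonal
print(D)                     # diag(1, 3)
print(P.inv() * A * P == D)  # True
```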
Algebraic and geometric multiplicities

Lemma 4.10 If the linear transformation T : V → V is diagonalisable, then the characteristic polynomial of T is a product of linear factors.

Definition 4.12 Let V be a finite-dimensional vector space over the field F and let T : V → V be a linear transformation of V. Let λ ∈ F.
(i) The algebraic multiplicity of λ (as an eigenvalue of T) is the largest power k such that (x − λ)^k is a factor of the characteristic polynomial cT(x).
(ii) The geometric multiplicity of λ is the dimension of the eigenspace Eλ corresponding to λ.

Proposition 4.13 Let T : V → V be a linear transformation of a vector space V. A set of eigenvectors of T corresponding to distinct eigenvalues is linearly independent.

Theorem 4.14 Let V be a finite-dimensional vector space over the field F and let T : V → V be a linear transformation of V.
(i) If the characteristic polynomial cT(x) is a product of linear factors (as always happens, for example, if F = C), then the sum of the algebraic multiplicities equals dim V.
(ii) Let λ ∈ F and let rλ be the algebraic multiplicity and nλ be the geometric multiplicity of λ. Then nλ ≤ rλ.
(iii) The linear transformation T is diagonalisable if and only if cT(x) is a product of linear factors and nλ = rλ for all eigenvalues λ.

Example 4A Let
$$A = \begin{pmatrix} -1 & 2 & -1 \\ -4 & 5 & -2 \\ -4 & 3 & 0 \end{pmatrix}.$$
Show that A is not diagonalisable.

Solution: The characteristic polynomial of A is
$$\begin{aligned}
c_A(x) &= \det(xI - A) = \det\begin{pmatrix} x+1 & -2 & 1 \\ 4 & x-5 & 2 \\ 4 & -3 & x \end{pmatrix} \\
&= (x+1)\det\begin{pmatrix} x-5 & 2 \\ -3 & x \end{pmatrix} + 2\det\begin{pmatrix} 4 & 2 \\ 4 & x \end{pmatrix} + \det\begin{pmatrix} 4 & x-5 \\ 4 & -3 \end{pmatrix} \\
&= (x+1)\bigl(x(x-5) + 6\bigr) + 2(4x - 8) + (-12 - 4x + 20) \\
&= (x+1)(x^2 - 5x + 6) + 8(x-2) - 4x + 8 \\
&= (x+1)(x-2)(x-3) + 8(x-2) - 4(x-2) \\
&= (x-2)\bigl((x+1)(x-3) + 8 - 4\bigr) \\
&= (x-2)(x^2 - 2x - 3 + 4) = (x-2)(x^2 - 2x + 1) = (x-2)(x-1)^2.
\end{aligned}$$
In particular, the algebraic multiplicity of the eigenvalue 1 is 2. We now determine the eigenspace for eigenvalue 1. We solve (A − I)v = 0; that is,
$$\begin{pmatrix} -2 & 2 & -1 \\ -4 & 4 & -2 \\ -4 & 3 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (2)$$
We solve this by applying row operations (the right-hand sides remain zero):
$$\begin{pmatrix} -2 & 2 & -1 \\ -4 & 4 & -2 \\ -4 & 3 & -1 \end{pmatrix} \longrightarrow \begin{pmatrix} -2 & 2 & -1 \\ 0 & 0 & 0 \\ 0 & -1 & 1 \end{pmatrix} \quad \begin{matrix} r_2 \mapsto r_2 - 2r_1 \\ r_3 \mapsto r_3 - 2r_1 \end{matrix} \qquad \longrightarrow \begin{pmatrix} -2 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & -1 & 1 \end{pmatrix} \quad r_1 \mapsto r_1 + 2r_3.$$
So Equation (2) is equivalent to
    −2x + z = 0 = −y + z.
Hence z = 2x and y = z = 2x. Therefore the eigenspace is
$$E_1 = \left\{ \begin{pmatrix} x \\ 2x \\ 2x \end{pmatrix} \,\middle|\, x \in \mathbb{R} \right\} = \mathrm{Span}\!\begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}$$
and we conclude dim E1 = 1. Thus the geometric multiplicity of 1 is not equal to the algebraic multiplicity, so A is not diagonalisable.

Minimum polynomial

Note: The minimum polynomial is often also called the “minimal polynomial”.

Definition 4.15 Let T : V → V be a linear transformation of a finite-dimensional vector space over the field F. The minimum polynomial mT(x) is the monic polynomial over F of smallest degree such that mT(T) = 0.

To say that a polynomial is monic is to say that its leading coefficient is 1. Our definition of the characteristic polynomial ensures that cT(x) is also always a monic polynomial. The definition of the minimum polynomial (in particular, the assumption that it is monic) ensures that it is unique.

Theorem 4.16 (Cayley–Hamilton Theorem) Let T : V → V be a linear transformation of a finite-dimensional vector space V. If cT(x) is the characteristic polynomial of T, then cT(T) = 0.

Facts about polynomials: Let F be a field and recall F[x] denotes the set of polynomials with coefficients from F:
    f(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0    (where ai ∈ F).
Then F[x] is an example of what is known as a Euclidean domain (see MT4517 Rings and Fields for full details). A summary of its main properties is:
• We can add, multiply and subtract polynomials.
• Euclidean Algorithm: if we attempt to divide f(x) by g(x) (where g(x) ≠ 0), we obtain f(x) = g(x)q(x) + r(x) where either r(x) = 0 or the degree of r(x) satisfies deg r(x) < deg g(x) (i.e., we can perform long division with polynomials).
• When the remainder is 0, that is, when f(x) = g(x)q(x) for some polynomial q(x), we say that g(x) divides f(x).
• If f(x) and g(x) are non-zero polynomials, their greatest common divisor is the polynomial d(x) of largest degree dividing them both. It is uniquely determined up to multiplying by a scalar and can be expressed as d(x) = a(x)f(x) + b(x)g(x) for some polynomials a(x), b(x).
Those familiar with divisibility in the integers Z (particularly those who have attended MT1003) will recognise these facts as being standard properties of Z (which is also a standard example of a Euclidean domain).

Proposition 4.18 Let V be a finite-dimensional vector space over a field F and let T : V → V be a linear transformation. If f(x) is any polynomial (over F) such that f(T) = 0, then the minimum polynomial mT(x) divides f(x).

Corollary 4.19 Suppose that T : V → V is a linear transformation of a finite-dimensional vector space V. Then the minimum polynomial mT(x) divides the characteristic polynomial cT(x).

Theorem 4.20 Let V be a finite-dimensional vector space over a field F and let T : V → V be a linear transformation of V. Then the roots of the minimum polynomial mT(x) and the roots of the characteristic polynomial cT(x) coincide.

Theorem 4.21 Let V be a finite-dimensional vector space over the field F and let T : V → V be a linear transformation. Then T is diagonalisable if and only if the minimum polynomial mT(x) is a product of distinct linear factors.

Lemma 4.22 Let T : V → V be a linear transformation of a vector space over the field F and let f(x) and g(x) be coprime polynomials over F. Then
    ker f(T)g(T) = ker f(T) ⊕ ker g(T).

Example 4B Let
$$A = \begin{pmatrix} 0 & -2 & -1 \\ 1 & 5 & 3 \\ -1 & -2 & 0 \end{pmatrix}.$$
Calculate the characteristic polynomial and the minimum polynomial of A. Hence determine whether A is diagonalisable.

Solution:
$$\begin{aligned}
c_A(x) &= \det(xI - A) = \det\begin{pmatrix} x & 2 & 1 \\ -1 & x-5 & -3 \\ 1 & 2 & x \end{pmatrix} \\
&= x \det\begin{pmatrix} x-5 & -3 \\ 2 & x \end{pmatrix} - 2\det\begin{pmatrix} -1 & -3 \\ 1 & x \end{pmatrix} + \det\begin{pmatrix} -1 & x-5 \\ 1 & 2 \end{pmatrix} \\
&= x\bigl(x(x-5) + 6\bigr) - 2(-x + 3) + (-2 - x + 5) \\
&= x(x^2 - 5x + 6) + 2(x - 3) - x + 3 \\
&= x(x-3)(x-2) + 2(x-3) - (x-3) \\
&= (x-3)\bigl(x(x-2) + 2 - 1\bigr) \\
&= (x-3)(x^2 - 2x + 1) = (x-3)(x-1)^2.
\end{aligned}$$
Since the minimum polynomial divides cA(x) and has the same roots, we deduce
    mA(x) = (x − 3)(x − 1)   or   mA(x) = (x − 3)(x − 1)².
We calculate
$$(A - 3I)(A - I) = \begin{pmatrix} -3 & -2 & -1 \\ 1 & 2 & 3 \\ -1 & -2 & -3 \end{pmatrix} \begin{pmatrix} -1 & -2 & -1 \\ 1 & 4 & 3 \\ -1 & -2 & -1 \end{pmatrix} = \begin{pmatrix} 2 & 0 & -2 \\ -2 & 0 & 2 \\ 2 & 0 & -2 \end{pmatrix} \neq 0.$$
Hence mA(x) ≠ (x − 3)(x − 1). We conclude mA(x) = (x − 3)(x − 1)². This is not a product of distinct linear factors, so A is not diagonalisable.
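The computation in Example 4B can be double-checked with a few lines of sympy (a supplementary sketch of my own, not part of the handout): factor the characteristic polynomial and then test the two candidates for the minimum polynomial directly.

```python
from sympy import Matrix, symbols, eye, factor

x = symbols('x')
A = Matrix([[ 0, -2, -1],
            [ 1,  5,  3],
            [-1, -2,  0]])
I = eye(3)

print(factor(A.charpoly(x).as_expr()))  # (x - 3)*(x - 1)**2
print((A - 3*I) * (A - I))              # non-zero, so m_A(x) != (x - 3)(x - 1)
print((A - 3*I) * (A - I)**2)           # zero matrix, so m_A(x) = (x - 3)(x - 1)^2
```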
MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout V: Jordan normal form

5 Jordan normal form

Definition 5.1 A Jordan block is an n × n matrix of the form
$$J_n(\lambda) = \begin{pmatrix} \lambda & 1 & & & \\ & \lambda & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & 1 \\ & & & & \lambda \end{pmatrix}$$
(with all other entries 0) for some positive integer n and some scalar λ. A linear transformation T : V → V (of a vector space V) has Jordan normal form A if there exists a basis for V with respect to which the matrix of T is
$$\mathrm{Mat}(T) = A = \begin{pmatrix} J_{n_1}(\lambda_1) & 0 & \cdots & 0 \\ 0 & J_{n_2}(\lambda_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & J_{n_k}(\lambda_k) \end{pmatrix}$$
for some positive integers n1, n2, . . . , nk and scalars λ1, λ2, . . . , λk. (The occurrences of 0 here indicate zero matrices of appropriate sizes.)

Theorem 5.2 Let V be a finite-dimensional vector space and T : V → V be a linear transformation of V such that the characteristic polynomial cT(x) is a product of linear factors, with eigenvalues λ1, λ2, . . . , λn. Then there exists a basis for V with respect to which Mat(T) is in Jordan normal form, where each Jordan block has the form Jm(λi) for some m and some i.

Corollary 5.3 Let A be a square matrix over C. Then there exists an invertible matrix P (over C) such that P⁻¹AP is in Jordan normal form.

Basic properties of Jordan normal form

Proposition 5.4 Let J = Jn(λ) be an n × n Jordan block. Then
(i) cJ(x) = (x − λ)^n;
(ii) mJ(x) = (x − λ)^n;
(iii) the eigenspace Eλ of J has dimension 1.

Finding the Jordan normal form

Problem: Let V be a finite-dimensional vector space and let T : V → V be a linear transformation. If the characteristic polynomial cT(x) is a product of linear factors, find a basis B with respect to which T is in Jordan normal form and determine what this Jordan normal form is.

Observation 5.5 The algebraic multiplicity of λ as an eigenvalue of T equals the sum of the sizes of the Jordan blocks Jn(λ) (associated to λ) occurring in the Jordan normal form for T.

Observation 5.6 If λ is an eigenvalue of T, then the power of (x − λ) occurring in the minimum polynomial mT(x) is (x − λ)^m where m is the largest size of a Jordan block associated to λ occurring in the Jordan normal form for T.

Observation 5.9 The geometric multiplicity of λ as an eigenvalue of T equals the number of Jordan blocks Jn(λ) occurring in the Jordan normal form for T.

These observations are enough to determine the Jordan normal form for many small examples. In larger cases we need to generalise Observation 5.9 and consider more general spaces than the eigenspace Eλ: determining dim ker(T − λI)², dim ker(T − λI)³, . . . would prove to be sufficient.

Finding the basis B such that MatB,B(T) is in Jordan normal form boils down to finding the Jordan normal form and then solving appropriate systems of linear equations. (It is essentially a generalisation of finding the eigenvectors for a diagonalisable linear transformation.)

Example 5A Omitted from the handout due to length. A document containing this supplementary example can be found in MMS and on my webpage.

MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout VI: Inner product spaces

6 Inner product spaces

Throughout this section, our base field F will be either R or C. Recall that if z = x + iy ∈ C, the complex conjugate of z is given by z̄ = x − iy. To save space and time, we shall use the complex conjugate even when F = R. Thus, when F = R and ᾱ appears, we mean ᾱ = α for a scalar α ∈ R.

Definition 6.1 Let F = R or C. An inner product space is a vector space V over F together with an inner product
    V × V → F,   (v, w) ↦ ⟨v, w⟩,
such that
(i) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ for all u, v, w ∈ V,
(ii) ⟨αv, w⟩ = α⟨v, w⟩ for all v, w ∈ V and α ∈ F,
(iii) ⟨v, w⟩ equals the complex conjugate of ⟨w, v⟩ for all v, w ∈ V,
(iv) ⟨v, v⟩ is a real number satisfying ⟨v, v⟩ ≥ 0 for all v ∈ V,
(v) ⟨v, v⟩ = 0 if and only if v = 0.

In the case when F = R, our inner product is symmetric in the sense that Condition (iii) then becomes ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V.

Example 6.2 (i) The vector space R^n is an inner product space with respect to the usual dot product:
$$\left\langle \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \right\rangle = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \cdot \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \sum_{i=1}^{n} x_i y_i.$$

(ii) We can endow C^n with an inner product by
$$\left\langle \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}, \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} \right\rangle = \sum_{i=1}^{n} z_i \bar{w}_i.$$

(iii) The set C[a, b] of continuous functions f : [a, b] → R is an inner product space when we define
$$\langle f, g \rangle = \int_a^b f(x) g(x) \, dx.$$
(iv) The space of real polynomials of degree at most n inherits the above inner product from the space of continuous functions. The space of complex polynomials
    f(x) = αn x^n + αn−1 x^{n−1} + · · · + α1 x + α0    (with αi ∈ C)
becomes an inner product space when we define
$$\langle f, g \rangle = \int_0^1 f(x)\, \bar{g}(x) \, dx,$$
where f̄(x) = ᾱn x^n + ᾱn−1 x^{n−1} + · · · + ᾱ1 x + ᾱ0.

Definition 6.3 Let V be an inner product space with inner product ⟨·, ·⟩. The norm is the function ‖·‖ : V → R defined by
$$\|v\| = \sqrt{\langle v, v \rangle}.$$
(This makes sense since ⟨v, v⟩ ≥ 0 for all v ∈ V.)

Lemma 6.4 Let V be an inner product space with inner product ⟨·, ·⟩. Then
(i) ⟨v, αw⟩ = ᾱ⟨v, w⟩ for all v, w ∈ V and α ∈ F;
(ii) ‖αv‖ = |α| · ‖v‖ for all v ∈ V and α ∈ F;
(iii) ‖v‖ > 0 whenever v ≠ 0.

Theorem 6.5 (Cauchy–Schwarz Inequality) Let V be an inner product space with inner product ⟨·, ·⟩. Then
    |⟨u, v⟩| ≤ ‖u‖ · ‖v‖
for all u, v ∈ V.

Corollary 6.6 (Triangle Inequality) Let V be an inner product space. Then
    ‖u + v‖ ≤ ‖u‖ + ‖v‖
for all u, v ∈ V.

Orthogonality and orthonormal bases

Definition 6.7 Let V be an inner product space.
(i) Two vectors v and w are said to be orthogonal if ⟨v, w⟩ = 0.
(ii) A set A of vectors is orthogonal if every pair of vectors within it are orthogonal.
(iii) A set A of vectors is orthonormal if it is orthogonal and every vector in A has unit norm.

Thus the set A = {v1, v2, . . . , vk} is orthonormal if
$$\langle v_i, v_j \rangle = \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$
An orthonormal basis for an inner product space V is a basis which is itself an orthonormal set.

Theorem 6.9 An orthogonal set of non-zero vectors is linearly independent.

Problem: Given a (finite-dimensional) inner product space V, how do we find an orthonormal basis?

Theorem 6.10 (Gram–Schmidt Process) Suppose that V is a finite-dimensional inner product space with basis {v1, v2, . . . , vn}. The following procedure constructs an orthonormal basis {e1, e2, . . . , en} for V.

Step 1: Define
$$e_1 = \frac{1}{\|v_1\|} v_1.$$
Step k: Suppose {e1, e2, . . . , ek−1} have been constructed. Define
$$w_k = v_k - \sum_{i=1}^{k-1} \langle v_k, e_i \rangle e_i \qquad \text{and} \qquad e_k = \frac{1}{\|w_k\|} w_k.$$

Example 6.12 (Laguerre polynomials) We can define an inner product on the space P of real polynomials f(x) by
$$\langle f, g \rangle = \int_0^{\infty} f(x) g(x) e^{-x} \, dx.$$
If we apply the Gram–Schmidt process to the standard basis of monomials {1, x, x², x³, . . . }, then we produce an orthonormal basis for P consisting of the Laguerre polynomials { Ln(x) | n = 0, 1, 2, . . . }.

Example 6.13 Define an inner product on the space P of real polynomials by
$$\langle f, g \rangle = \int_{-1}^{1} f(x) g(x) \, dx.$$
Applying the Gram–Schmidt process to the monomials {1, x, x², x³, . . . } produces an orthonormal basis (with respect to this inner product). The polynomials produced are scalar multiples of the Legendre polynomials:
    P0(x) = 1,   P1(x) = x,   P2(x) = ½(3x² − 1),   . . .
The set { Pn(x) | n = 0, 1, 2, . . . } of Legendre polynomials is orthogonal, but not orthonormal. This is the reason why the Gram–Schmidt process only produces a scalar multiple of them: they do not have unit norm.

The Hermite polynomials form an orthogonal set in the space P when we endow it with the following inner product:
$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x) g(x) e^{-x^2/2} \, dx.$$
The orthonormal basis produced by applying the Gram–Schmidt process to the monomials consists of scalar multiples of the Hermite polynomials.
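The Gram–Schmidt process of Theorem 6.10 is straightforward to carry out numerically. The following sketch (my own supplementary illustration for the dot product on R^n, assuming numpy; the function name gram_schmidt and the example vectors are illustrative choices) implements Step 1 and Step k directly.

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a linearly independent list of vectors into an orthonormal list."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the vectors already constructed (Step k).
        w = v - sum(np.dot(v, e) * e for e in basis)
        basis.append(w / np.linalg.norm(w))           # normalise
    return basis

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0])]
e1, e2 = gram_schmidt(vs)
print(np.dot(e1, e2))                          # approximately 0: orthogonal
print(np.linalg.norm(e1), np.linalg.norm(e2))  # 1.0 1.0: unit norm
```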
Orthogonal complements

Definition 6.14 Let V be an inner product space. If U is a subspace of V, the orthogonal complement to U is
    U⊥ = { v ∈ V | ⟨v, u⟩ = 0 for all u ∈ U }.

Lemma 6.15 Let V be an inner product space and U be a subspace of V. Then
(i) U⊥ is a subspace of V, and
(ii) U ∩ U⊥ = {0}.

Theorem 6.16 Let V be a finite-dimensional inner product space and U be a subspace of V. Then
    V = U ⊕ U⊥.

We can consider the projection map PU : V → V onto U associated to the decomposition V = U ⊕ U⊥. This is given by PU(v) = u where v = u + w is the unique decomposition of v with u ∈ U and w ∈ U⊥.

Theorem 6.17 Let V be a finite-dimensional inner product space and U be a subspace of V. Let PU : V → V be the projection map onto U associated to the direct sum decomposition V = U ⊕ U⊥. If v ∈ V, then PU(v) is the vector in U closest to v.

Example 6A Omitted from the handout due to length. A document containing this supplementary example can be found in MMS and on my webpage.

MRQ 2014

School of Mathematics and Statistics
MT3501 Linear Mathematics

Handout VII: The adjoint of a transformation and self-adjoint transformations

7 The adjoint of a transformation and self-adjoint transformations

Throughout this section, V is a finite-dimensional inner product space over a field F (where, as before, F = R or C) with inner product ⟨·, ·⟩.

Definition 7.1 Let T : V → V be a linear transformation. The adjoint of T is a map T* : V → V such that
    ⟨T(v), w⟩ = ⟨v, T*(w)⟩    for all v, w ∈ V.

Remark: More generally, if T : V → W is a linear map between inner product spaces, the adjoint T* : W → V is a map satisfying the above equation for all v ∈ V and w ∈ W. Appropriate parts of what we describe here can be done in this more general setting.

Lemma 7.2 Let V be a finite-dimensional inner product space and let T : V → V be a linear transformation. Then there is a unique adjoint T* for T and, moreover, T* is a linear transformation.

We also record what was observed in the course of this proof: if A = MatB,B(T) is the matrix of T with respect to an orthonormal basis, then MatB,B(T*) = Āᵀ (the conjugate transpose of A).

Definition 7.3 A linear transformation T : V → V is self-adjoint if T* = T.

Lemma 7.4 (i) A real matrix A defines a self-adjoint transformation if and only if it is symmetric: Aᵀ = A.
(ii) A complex matrix A defines a self-adjoint transformation if and only if it is Hermitian: Āᵀ = A.

Theorem 7.5 A self-adjoint transformation of a finite-dimensional inner product space is diagonalisable.

Corollary 7.6 (i) A real symmetric matrix is diagonalisable.
(ii) A Hermitian matrix is diagonalisable.

Lemma 7.7 Let V be a finite-dimensional inner product space and T : V → V be a self-adjoint transformation. Then the characteristic polynomial is a product of linear factors and every eigenvalue of T is real.

Lemma 7.8 Let V be an inner product space and T : V → V be a linear map. If U is a subspace of V such that T(U) ⊆ U (i.e., U is T-invariant), then T*(U⊥) ⊆ U⊥ (i.e., U⊥ is T*-invariant).
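As a closing supplementary illustration of Corollary 7.6(i) and Lemma 7.7 (not part of the handout; the matrix is my own example and numpy is assumed), a real symmetric matrix has real eigenvalues and is diagonalised by an orthogonal matrix whose columns are orthonormal eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)          # A is symmetric, hence self-adjoint

eigenvalues, Q = np.linalg.eigh(A)  # eigh is designed for symmetric/Hermitian matrices
print(eigenvalues)                                      # all real
print(np.allclose(Q.T @ A @ Q, np.diag(eigenvalues)))   # True: Q^T A Q is diagonal
```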