18.700 FALL 2010, OPTIONAL EXERCISES ON MATERIAL AFTER HOMEWORK 10
SELECTED SOLUTIONS
LAST REVISION OF THIS DOCUMENT: SATURDAY, DEC 11, 3:20 PM
TRAVIS SCHEDLER

(1) (a) Suppose that $F$ is an arbitrary field. Suppose that $A \in \mathrm{Mat}(F, n, n)$ is a square matrix and that the characteristic polynomial, defined by $\chi_A = \det(xI - A)$, has no roots over $F$. Show that $A$ has no eigenvalues, i.e., that $A - \lambda I$ is invertible for all $\lambda \in F$.

(b) Now specialize to the case $F = \mathbb{R}$. Conclude that, whenever $a, b \in \mathbb{R}$ and $b$ is nonzero, and for all $\lambda \in \mathbb{R}$, the matrix
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} - \lambda I \]
is invertible.

a) We claim that all eigenvalues of $A$ are roots of $\chi_A(x)$. By the Cayley-Hamilton theorem, $\chi_A(A) = 0$. More generally, suppose that $f(x)$ is a polynomial and $f(A) = 0$. If $\lambda$ were an eigenvalue of $A$ with nonzero eigenvector $v$, then $f(A)v = f(\lambda)v$. Since $f(A) = 0$ and $v \neq 0$, this implies that $f(\lambda) = 0$. So, if $f$ has no roots over $F$, then $A$ has no eigenvalues. Now we plug back in $f = \chi_A(x)$ and deduce the result.

b) The characteristic polynomial of this matrix is $x^2 - 2ax + (a^2 + b^2)$, which has no real roots if $b \neq 0$ (the complex roots are $a \pm bi$). So part (a) implies it has no real eigenvalues, i.e.,
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} - \lambda I \]
is invertible for all $\lambda \in \mathbb{R}$.

(2) Suppose again that $F$ is an arbitrary field and $A \in \mathrm{Mat}(F, n, n)$ is a square matrix. Suppose that $\chi_A = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ is an irreducible polynomial, i.e., it has no polynomial factors other than scalar multiples of itself and scalars. Then, for every nonzero vector $v \in \mathrm{Mat}(F, n, 1)$:

(a) Show that $(v, Av, A^2 v, \ldots, A^{n-1} v)$ is a basis of $\mathrm{Mat}(F, n, 1)$.

(b) Let $T$ be the operator $T(v) = Av$. In the basis of (a), show that (cf. Chapter 8, Exercise 28):
\[ M(T) = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{pmatrix}. \]
Here, all the entries other than the rightmost column and the subdiagonal are zero. The subdiagonal consists of ones.

(c) Conclude that $A$ is conjugate to the preceding matrix, over $F$.
Conclude further that all matrices with the same irreducible characteristic polynomial are conjugate.

(d) Now specialize to the case $F = \mathbb{R}$. Use this to give another proof of the fact that if $A$ is a two-by-two real matrix with no real eigenvalues, then it is conjugate to a matrix of the form $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$, for $a, b \in \mathbb{R}$, and that $a \pm bi$ are necessarily the complex eigenvalues of $A$ (in particular, $a$ and $b$ are unique up to the sign of $b$).

a) The list $(v, Av, \ldots, A^{n-1} v)$ has length equal to the dimension, so it is a basis if and only if it spans. Note that its span is an invariant subspace, since $A(A^i v) = A^{i+1} v$ is in the span for $0 \le i \le n - 2$, and $A(A^{n-1} v) = A^n v = (-a_{n-1} A^{n-1} - \cdots - a_0 I)v$ (by the Cayley-Hamilton theorem) is also in the span. So if the span is not all of $\mathrm{Mat}(F, n, 1)$, we have a nonzero invariant subspace which is not the whole space. Call it $U = \mathrm{Span}(v, Av, \ldots, A^{n-1} v)$, and let $m = \dim U$, so that $0 < m < n$. Now if we take a basis of $U$ and complete it to a basis of $V = \mathrm{Mat}(F, n, 1)$, the matrix of $v \mapsto Av$ in this basis is block upper-triangular with diagonal blocks of sizes $\dim U = m$ and $\dim V - \dim U = n - m$. If $\chi_1, \chi_2$ are the characteristic polynomials of the two diagonal blocks, we deduce from problem (5) below that $\chi_A(x) = \chi_1(x)\chi_2(x)$, where $\deg \chi_1(x) = m$ and $\deg \chi_2(x) = n - m$. This contradicts irreducibility of $\chi_A(x)$, so that is impossible.

b) This is immediate: $T(A^i v) = A^{i+1} v$ for $0 \le i \le n - 2$, and $T(A^{n-1} v) = A^n v = -a_0 v - a_1 Av - \cdots - a_{n-1} A^{n-1} v$. So we get the matrix described.

c) Since $A$ is one matrix of $T$ and the given matrix is another matrix of $T$, the two matrices are conjugate by the change-of-basis formula. Since $A$ was an arbitrary matrix with this irreducible characteristic polynomial, we deduce that all matrices with the same irreducible characteristic polynomial are conjugate.

d) We know that the real roots of the characteristic polynomial are exactly the real eigenvalues (in fact, we proved that all eigenvalues are roots in (1); the other direction is a consequence of the fact that $\det(\lambda I - A)$ is zero iff $\lambda I - A$ is noninvertible, iff $\lambda$ is an eigenvalue).
So a two-by-two matrix $A$ with no real eigenvalues must have an irreducible quadratic characteristic polynomial (a proper factor would be linear and would give a real root). If its complex roots are $a \pm bi$ (for $b \neq 0$), then the characteristic polynomial must be $x^2 - 2ax + (a^2 + b^2)$, which is the same as the characteristic polynomial of the matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. Applying part (c), $A$ must be conjugate to $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.

(3) Assume that $F = \mathbb{C}$. Let $T \in \mathcal{L}(V)$, where $V$ is finite-dimensional. Prove that the following are equivalent:
(i) The minimal polynomial of $T$ equals the characteristic polynomial;
(ii) The Jordan canonical form matrix of $T$ has only one Jordan block for each eigenvalue of $T$.

By problem (5), the minimal polynomial is the least common multiple of the minimal polynomials of all the Jordan blocks, whereas the characteristic polynomial is the product of the characteristic polynomials of all Jordan blocks. As we saw in class, the minimal polynomial of a Jordan block $J$ with eigenvalue $\lambda$ and size $k$ is $(x - \lambda)^k$: the matrix $(J - \lambda I)^j$ for $1 \le j < k$ is the upper-triangular matrix with all zero entries except for the entries that are exactly $j$ above the (main) diagonal, which are all ones. So $k \ge 1$ is the minimum positive integer such that $(J - \lambda I)^k = 0$. Also, since the block $J$ is upper-triangular with only $\lambda$ on the diagonal, and it has size $k$, we deduce also that $(x - \lambda)^k = \chi_J$.

Applying problem (5), the characteristic polynomial (as we saw in class) equals $\prod_\lambda (x - \lambda)^{m_\lambda}$, where $m_\lambda$ is the sum of the sizes of all Jordan blocks of eigenvalue $\lambda$, whereas the minimal polynomial is $\prod_\lambda (x - \lambda)^{k_\lambda}$, where $k_\lambda$ is the size of the largest Jordan block of eigenvalue $\lambda$. Since $k_\lambda \le m_\lambda$, with equality iff there is exactly one Jordan block of eigenvalue $\lambda$, these two are equal iff there is exactly one Jordan block for each eigenvalue.

(4) Let $A$ be a matrix with real coefficients. Suppose that, considered as a complex matrix, $A = SJS^{-1}$ where $S$ is an invertible matrix and $J$ is a Jordan canonical form matrix of $A$; both $S$ and $J$ may have complex coefficients.
Prove that the number and size of Jordan blocks for each complex eigenvalue $\lambda$ of $A$ equals that of the complex conjugate $\bar\lambda$.

If we take complex conjugates, we get $A = \bar A = \bar S \bar J\, \overline{S^{-1}}$. Since $\bar S\, \overline{S^{-1}} = I = \overline{S^{-1}}\, \bar S$, the matrix $\overline{S^{-1}}$ is the inverse of $\bar S$, and this means that $A$ is conjugate also to $\bar J$. Now, $\bar J$ is also a Jordan form matrix, with the same numbers and sizes of Jordan blocks, but with the eigenvalues all replaced by their complex conjugates. As we proved in class, though, there is a unique Jordan form matrix conjugate to $A$ up to rearranging the order of the blocks. Hence $J$ and $\bar J$ must be obtainable from each other by rearranging the order of the blocks. This proves the statement.

(5) (cf. Chapter 10, # 22): Let $A$ be a block upper-triangular matrix with diagonal blocks $A_1, \ldots, A_m$. Prove that the characteristic polynomial of $A$ is the product of the characteristic polynomials of $A_1, \ldots, A_m$. Next, if $A$ is block diagonal, show that the minimal polynomial of $A$ is the least common multiple of the minimal polynomials of $A_1, \ldots, A_m$.

(Note: for the definition of characteristic polynomial of $A$, you can either take $\det(xI - A)$, which works for arbitrary $F$, or you can take one of the definitions from class for the complex case, $(x - \lambda_1) \cdots (x - \lambda_n)$ where $\lambda_1, \ldots, \lambda_n$ are the diagonal entries occurring in an upper-triangular matrix conjugate to $A$, or $\prod_\lambda (x - \lambda)^{\dim V(\lambda)}$ where $V(\lambda)$ is the generalized eigenspace of $\lambda$. Perhaps you should try using both definitions!)

First, the statement about minimal polynomials is easy: if $A$ is a block diagonal matrix with diagonal blocks $A_1, \ldots, A_m$, and $f(x)$ is a polynomial, then $f(A) = 0$ iff $f(A_1), \ldots, f(A_m)$ are all zero. The latter are all zero iff $f$ is a multiple of each of the minimal polynomials of $A_1, \ldots, A_m$ (by Theorem 8.34 in Axler). So $f(A) = 0$ iff $f$ is a multiple of the least common multiple of all the minimal polynomials of $A_1, \ldots, A_m$. Hence the minimal polynomial of $A$, being the monic polynomial of least degree that vanishes on $A$, is this least common multiple.

We now prove the statement about characteristic polynomials.
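Before the proof, the claim of problem (5) about characteristic polynomials can be sanity-checked on a small example. The sketch below is only an illustration and not part of the original solution: the matrices `A1`, `A2`, `A` and the helper names `det`, `x_minus`, and `char_polys_agree` are all hypothetical choices. It evaluates $\det(xI - A)$ by the sum formula over permutations and compares it with $\det(xI - A_1)\det(xI - A_2)$ at enough integer points to force equality of the two polynomials.

```python
from itertools import permutations


def det(M):
    """Determinant via the sum formula:
    sum over permutations p of sign(p) * prod_j M[p(j)][j]."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        # The sign of a permutation is (-1)^(number of inversions).
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for col in range(n):
            term *= M[perm[col]][col]
        total += term
    return total


def x_minus(M, x):
    """The matrix xI - M for an integer x."""
    n = len(M)
    return [[(x if i == j else 0) - M[i][j] for j in range(n)]
            for i in range(n)]


def char_polys_agree(A, A1, A2):
    """Check det(xI - A) == det(xI - A1) * det(xI - A2) at x = 0, ..., n.

    Both sides are monic of degree n, so their difference has degree < n;
    agreement at n + 1 points therefore means the polynomials are equal.
    """
    n = len(A)
    return all(det(x_minus(A, x)) == det(x_minus(A1, x)) * det(x_minus(A2, x))
               for x in range(n + 1))


# Hypothetical example: block upper-triangular A with diagonal blocks A1, A2.
A1 = [[1, -2], [2, 1]]
A2 = [[0, 1], [-3, 4]]
A = [
    [1, -2, 5, 7],
    [2, 1, -1, 3],
    [0, 0, 0, 1],
    [0, 0, -3, 4],
]

print(char_polys_agree(A, A1, A2))
```

The entries in the upper-right block of `A` are arbitrary; changing them should not affect the result, which is exactly what the lemma on block upper-triangular determinants asserts.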
In the complex case, this is easy: write $V = U_1 \oplus \cdots \oplus U_m$ where $U_1$ is the span of the first $\dim U_1$ standard basis vectors, $U_2$ is the span of the next $\dim U_2$ standard basis vectors, etc. (with $\dim U_i$ equal to the size of $A_i$). Define the transformation $T$ such that $T(v) = Av$. Then the $i$-th diagonal block $A_i$ is the matrix, in these standard basis vectors, of the operator $T_i$ on $U_i$ obtained by applying $T$ and then projecting onto $U_i$ along the other summands (when $A$ is block diagonal, $T_i$ is simply $T|_{U_i}$). We can change the basis of each $U_i$ to one in which $T_i$ is actually upper-triangular. In the resulting basis of $V$, the matrix of $T$ is still block upper-triangular and its diagonal blocks are now upper-triangular, so the whole matrix is upper-triangular. Then the characteristic polynomial is the product of $(x - \lambda)$ where $\lambda$ ranges over all the diagonal entries of all the $T_i$ in the new basis. This is the same as the product of the characteristic polynomials of each of the $T_i$, since these are the product of $(x - \lambda)$ where $\lambda$ ranges over the diagonal entries appearing in the upper-triangular matrix we found for $T_i$. So we deduce that $\chi_T = \prod_{i=1}^m \chi_{T_i}$.

In the case of general $F$, we have to show that $\det(xI - A) = \det(xI - A_1) \cdots \det(xI - A_m)$. By replacing $A$ with $xI - A$, it is enough to show, using the sum formula for determinant, that $\det(A) = \det A_1 \cdots \det A_m$, where we allow $A$ to have entries which are polynomials in $x$ rather than merely numbers. Let the sizes of the matrices $A_1, \ldots, A_m$ be $d_1, \ldots, d_m$, so that $d_1 + \cdots + d_m = n$. We will prove the statement for $m = 2$. We can then deduce it for general $m$ by induction: assuming the result for $m - 1$, it is enough to show $\det(A) = \det(B)\det(A_m)$ where $B$ is the block upper-triangular matrix obtained by taking the first $n - d_m$ rows and columns of $A$. But this is just the $m = 2$ case. So the statement about characteristic polynomials reduces to the following lemma:

Lemma 0.1. Let $A$ be a block upper-triangular matrix with diagonal blocks $A_1$ and $A_2$ (where $A$ is allowed to have polynomial coefficients). Then $\det(A) = \det(A_1)\det(A_2)$.

Proof. Using the sum formula for determinant,
\[ \det(A) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n}. \]
On the other hand,
\[ \det(A_1)\det(A_2) = \sum_{\sigma \in S_{d_1},\, \tau \in S_{d_2}} \mathrm{sign}(\sigma)\,\mathrm{sign}(\tau)\,\bigl(a_{\sigma(1),1} \cdots a_{\sigma(d_1),d_1}\bigr)\bigl(a_{d_1+\tau(1),d_1+1} \cdots a_{d_1+\tau(d_2),d_1+d_2}\bigr). \]
Now given $\sigma \in S_{d_1}$ and $\tau \in S_{d_2}$, let $\sigma \times \tau \in S_{d_1+d_2} = S_n$ be the permutation such that
\[ (\sigma \times \tau)(i) = \begin{cases} \sigma(i), & 1 \le i \le d_1, \\ d_1 + \tau(i - d_1), & d_1 + 1 \le i \le d_1 + d_2 = n. \end{cases} \]
Then it is immediate that $o(\sigma \times \tau) = o(\sigma)o(\tau)$ so that $\mathrm{sign}(\sigma \times \tau) = \mathrm{sign}(\sigma)\,\mathrm{sign}(\tau)$ (alternatively, $\sigma \times \tau$ is obtainable from the identity by a number of swaps equal to the sum of the numbers of swaps needed to obtain $\sigma$ and $\tau$ alone). We conclude that
\[ \det(A_1)\det(A_2) = \sum_{\sigma \in S_{d_1},\, \tau \in S_{d_2}} \mathrm{sign}(\sigma \times \tau)\, a_{(\sigma\times\tau)(1),1} \cdots a_{(\sigma\times\tau)(n),n}. \]
The summands on the RHS form a subset of the summands for $\det(A)$ itself. Namely, let
\[ S_{d_1} \times S_{d_2} := \{\sigma \times \tau : \sigma \in S_{d_1},\ \tau \in S_{d_2}\} \subseteq S_{d_1+d_2}. \]
Let $S_n \setminus (S_{d_1} \times S_{d_2})$ be the set of all permutations in $S_n$ that are not in $S_{d_1} \times S_{d_2}$. Then,
\[ \det(A) - \det(A_1)\det(A_2) = \sum_{\beta \in S_n \setminus (S_{d_1} \times S_{d_2})} \mathrm{sign}(\beta)\, a_{\beta(1),1} \cdots a_{\beta(n),n}. \]
It remains to show that the RHS is zero. We show that every summand on the RHS is zero. In fact, we show that if $\beta \in S_n \setminus (S_{d_1} \times S_{d_2})$, then there is some $i \le d_1$ such that $\beta(i) > d_1$, i.e., the factor $a_{\beta(i),i}$ (row $\beta(i)$, column $i$) is below the block diagonal of $A$ and hence zero. Suppose, for sake of contradiction, that no entry $a_{\beta(i),i}$ is below the block diagonal. Then, for all $1 \le i \le d_1$, we must have $\beta(i) \le d_1$. So $\beta$ restricts to a map $\{1, \ldots, d_1\} \to \{1, \ldots, d_1\}$, which must be bijective since $\beta$ is injective. This implies that $\beta$ must also restrict to a bijection $\{d_1 + 1, \ldots, d_1 + d_2\} \to \{d_1 + 1, \ldots, d_1 + d_2\}$. But this means that $\beta \in S_{d_1} \times S_{d_2}$, by definition. This is a contradiction.

Book problems (don't forget $F = \mathbb{R}$ or $\mathbb{C}$ and $V$ is finite-dimensional):
(6) Chapter 8, # 4.
(7) Chapter 8, # 8.
(8) Chapter 8, # 11.
(9) Chapter 8, # 15.
(10) Chapter 8, # 16.
(11) Chapter 8, # 19. (Note: in fact, $m$-th roots exist for all $m \ge 1$! The proof is similar; you can try it.)
(12) Chapter 8, # 20.
(13) Chapter 8, # 30.
(14) Chapter 10, # 4.
(15) Chapter 10, # 6 and 7.
(16) Chapter 10, # 10.
(17) Chapter 10, # 12.
(18) Chapter 10, # 15.
(19) Chapter 10, # 16 and 17.
(20) Chapter 10, # 19. (Note: this strengthens Proposition 7.6.)
(21) Chapter 10, # 20.
(22) Chapter 10, # 23.
(23) Chapter 10, # 24.
(24) Chapter 10, # 25.

Additional problems:
(25) If $ST = TS$, prove that $S$ preserves $\mathrm{null}\, T^k$ for all $k \ge 0$, i.e., $\mathrm{null}\, T^k$ is $S$-invariant. Deduce the following consequences:

(a) For $F = \mathbb{C}$, $S$ commutes with $T$ iff $S$ preserves all of the generalized eigenspaces $V_T(\lambda)$ of $T$ and $S|_{V_T(\lambda)}$ commutes with $T|_{V_T(\lambda)}$ for all $\lambda$.

(b) Now let us restrict to $V = V_T(\lambda)$. If $S$ commutes with $T$, show that $S$ preserves each $\mathrm{null}(T - \lambda I)^p$ for $p \ge 1$.

(c) Furthermore, show that $S$ commutes with $T$ if and only if it commutes with the nilpotent transformation $N = T - \lambda I$.

(d) Digression: let $N$ be a nilpotent transformation and
\[ (v_1, N v_1, \ldots, N^{m(v_1)} v_1, \ldots, v_k, N v_k, \ldots, N^{m(v_k)} v_k) \]
be a Jordan basis of $V$ in terms of $N$. Show that
\[ \mathrm{null}(N^p) = \mathrm{Span}\bigl(N^{m(v_i)-j} v_i : 1 \le i \le k,\ 0 \le j \le \min(p - 1, m(v_i))\bigr). \]

(e) Returning to the problem, let $N := T - \lambda I$. We need to find all $S$ commuting with $N$; we showed that these preserve $\mathrm{null}(N^p)$ for all $p \ge 1$. Show first that all $S \in \mathcal{L}(\mathrm{null}\, N)$ commute with $N|_{\mathrm{null}(N)}$.

(f) Inductively, suppose that $S \in \mathcal{L}(\mathrm{null}(N^p))$ commutes with $N|_{\mathrm{null}(N^p)}$. Show that $S$ extends to an operator on $\mathrm{null}(N^{p+1})$ which commutes with $N$ iff, for all $i$ such that $m(v_i) \ge p$, $S(N^{m(v_i)-(p-1)} v_i) \in \mathrm{range}\, N$, i.e., it lies in the span of the vectors $N^j v_{i'}$ for $j \ge 1$. In this case, let $u_i$ be such that $S(N^{m(v_i)-(p-1)} v_i) = N u_i$. Show that all extensions of $S$ to operators on $\mathrm{null}(N^{p+1})$ which commute with $T$ are given by $S(N^{m(v_i)-p} v_i) = u_i + w_i$ for arbitrary $w_i \in \mathrm{null}(N)$, for all $i$ such that $m(v_i) \ge p$.

(g) Hence, when inductively constructing an $S$ to commute with $T$, show that we will be able to extend to an operator on all of $V$ iff we always choose $S(N^j v_i)$ to lie in $\mathrm{range}(N^j)$, i.e., the span of the basis vectors $N^{j'} v_{i'}$ for $j' \ge j$.

(h) In other words, when defining $S(N^{m(v_i)-p} v_i)$, the choices of $w_i$ which will lead to an operator $S$ commuting with $T$ on all of $V$ are $w_i \in \mathrm{Span}(N^{m(v_j)} v_j : m(v_j) \ge m(v_i) - p)$.
This gives an inductive description of all $S$ commuting with $T$.

(26) Last class warm-up exercise and extension of Chapter 10, # 29: Show that the minimal polynomial of $T$ is $\prod_\lambda (x - \lambda)^{k_\lambda}$ where $k_\lambda$ = the size of the largest Jordan block of eigenvalue $\lambda$. Similarly, show that the characteristic polynomial is $\prod_\lambda (x - \lambda)^{m_\lambda}$ where $m_\lambda$ = the sum of the sizes of all Jordan blocks of eigenvalue $\lambda$.

(27) Last class warm-up exercise: Given that the following matrix has integer eigenvalues, compute them. Then, without using the knowledge that the matrix has integer eigenvalues, prove that they are indeed the eigenvalues:
\[ A = \begin{pmatrix} -43 & 8 & 50 \\ 32 & -5 & -36 \\ -44 & 8 & 51 \end{pmatrix} \]
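One possible way to check an answer to (27) by machine is sketched below. It is an illustration rather than the intended pencil-and-paper solution, and the helper names `det`, `char_poly_at`, and `integer_eigenvalues` are hypothetical. The idea it assumes is the standard one: $\det(xI - A)$ is monic with integer coefficients and constant term $(-1)^n \det(A)$, so any integer eigenvalue must divide $\det(A)$, and each candidate $\lambda$ can then be confirmed directly by checking $\det(\lambda I - A) = 0$, which does not rely on knowing in advance that the eigenvalues are integers.

```python
from itertools import permutations


def det(M):
    """Determinant by the sum formula over permutations (fine for small M)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for col in range(n):
            term *= M[perm[col]][col]
        total += term
    return total


def char_poly_at(M, x):
    """Value of det(xI - M) at the integer x."""
    n = len(M)
    return det([[(x if i == j else 0) - M[i][j] for j in range(n)]
                for i in range(n)])


def integer_eigenvalues(M):
    """Integer roots of det(xI - M), assuming det(M) is nonzero.

    Every integer root of a monic integer polynomial divides its constant
    term, which here is (-1)^n det(M).
    """
    c = abs(det(M))
    if c == 0:
        raise ValueError("det(M) = 0: this divisor search does not apply")
    divisors = [d for d in range(1, c + 1) if c % d == 0]
    candidates = sorted({s * d for d in divisors for s in (1, -1)})
    return [x for x in candidates if char_poly_at(M, x) == 0]


# The matrix from problem (27).
A = [[-43, 8, 50], [32, -5, -36], [-44, 8, 51]]

roots = integer_eigenvalues(A)
print(roots)
# If a 3x3 matrix has three distinct roots here, they are all its eigenvalues.
print(len(roots) == len(A))
```

Since each reported root $\lambda$ satisfies $\det(\lambda I - A) = 0$ exactly in integer arithmetic, the second half of the exercise (proving these are the eigenvalues) amounts to carrying out the same determinant evaluations by hand.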