NOTES ON JORDAN CANONICAL FORM
MATH 5316, FALL 2012
LANCE D. DRAGER
1. Polynomials and Linear Transformations
Fix a vector space 𝑉 over the complex numbers C. We’ll denote the dimension of
𝑉 by 𝑑. Let 𝑇 : 𝑉 → 𝑉 be a linear transformation. We’re safe in assuming 𝑇 ΜΈ= 0.
If we have a polynomial 𝑝(𝑧) ∈ C[𝑧], say
𝑝(𝑧) = π‘Žπ‘› 𝑧 𝑛 + π‘Žπ‘›−1 𝑧 𝑛−1 + · · · + π‘Ž1 𝑧 + π‘Ž0 ,
we can plug 𝑇 into the polynomial in place of 𝑧 to get a linear operator 𝑝(𝑇 ). We
interpret the constant term in the polynomial as π‘Ž0 𝑧 0 , so when we plug in 𝑇 we get
π‘Ž0 𝑇 0 = π‘Ž0 𝐼, where 𝐼 is the identity operator. Thus,
𝑝(𝑇 ) = π‘Žπ‘› 𝑇 𝑛 + π‘Žπ‘›−1 𝑇 𝑛−1 + · · · + π‘Ž1 𝑇 + π‘Ž0 𝐼.
We will omit the 𝐼 in this expression if no confusion will result. Thus we write
𝑇 − 3 for 𝑇 − 3𝐼.
Studying the operators 𝑝(𝑇 ) will allow us to analyze the structure of 𝑇 . We
begin by showing there is a polynomial so that 𝑝(𝑇 ) = 0.
Lemma 1.1. If 𝑇 : 𝑉 → 𝑉 is a linear operator, there is a non-zero polynomial
𝑝(𝑧) ∈ C[𝑧] so that 𝑝(𝑇 ) = 0.
Proof. The operator 𝑇 is in the vector space 𝐿(𝑉, 𝑉 ) of linear operators on 𝑉 . We
know that 𝐿(𝑉, 𝑉 ) is isomorphic to the space of 𝑑 × π‘‘ complex matrices, which has
dimension 𝑑2 , so the dimension of 𝐿(𝑉, 𝑉 ) is 𝑑2 . Consider the following vectors in
𝐿(𝑉, 𝑉 ),
𝐼, 𝑇, 𝑇 2 , . . . , 𝑇 𝑑² .
This is a list of 𝑑2 + 1 vectors in a 𝑑2 dimensional space, so these vectors must be
dependent. Thus, there are complex numbers¹ π‘Žπ‘– (not all zero) so that
π‘Ž0 𝐼 + π‘Ž1 𝑇 + π‘Ž2 𝑇 2 + · · · + π‘Žπ‘‘² 𝑇 𝑑² = 0.
If we let 𝑝(𝑧) be the polynomial
𝑝(𝑧) = π‘Ž0 + π‘Ž1 𝑧 + π‘Ž2 𝑧 2 + · · · + π‘Žπ‘‘² 𝑧 𝑑² ,
we get a nonzero polynomial so that 𝑝(𝑇 ) = 0.
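The proof is constructive enough to run. A sketch of the same idea (numpy assumed; the function name is my choice): flatten the powers 𝐼, 𝑇, . . . , 𝑇 𝑑² into column vectors and read an annihilating polynomial off a null vector of the resulting matrix.

```python
import numpy as np

def annihilating_poly(T):
    """Coefficients a_0, ..., a_{d^2} (not all zero) with sum_k a_k T^k = 0,
    found from a null vector of the matrix whose columns are the flattened powers."""
    d = T.shape[0]
    P = np.eye(d, dtype=complex)
    cols = []
    for _ in range(d * d + 1):
        cols.append(P.ravel())
        P = P @ T
    M = np.array(cols).T     # d^2 rows, d^2 + 1 columns, so a null vector must exist
    _, _, Vh = np.linalg.svd(M)
    return Vh[-1].conj()     # right singular vector for the smallest singular value

T = np.array([[0.0, 1.0], [0.0, 0.0]])   # a nilpotent example
a = annihilating_poly(T)
p_of_T = sum(c * np.linalg.matrix_power(T.astype(complex), k) for k, c in enumerate(a))
print(np.allclose(p_of_T, 0))   # True
```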
Version Time-stamp: "2012-11-06 15:03:07 drager".
¹We’ll denote √−1 by i.
We define
ℐ𝑇 = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 ) = 0}.
This is (pretty obviously) an ideal in C[𝑧]; see Exercise 1.3.
Since all ideals in C[𝑧] are principal, we can find the monic generator πœ‡(𝑧) of ℐ𝑇 .
This is the monic polynomial of least degree that annihilates 𝑇 . The polynomial
πœ‡(𝑧) is called the minimal polynomial of 𝑇 .
Similarly, if 𝑣 ∈ 𝑉 , we can look at
π’₯𝑣 = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )𝑣 = 0}.
This is again an ideal, and it’s nonzero because ℐ𝑇 ⊆ π’₯𝑣 , i.e., a polynomial that
annihilates everything annihilates 𝑣. The monic generator of this ideal is called
the minimal polynomial of 𝑣, and will be denoted by πœ‡π‘£ (𝑧). Since the minimal
polynomial πœ‡(𝑧) of 𝑇 is contained in π’₯𝑣 , πœ‡π‘£ (𝑧) must divide πœ‡(𝑧). This fact is so
handy, we’ll display it.
Proposition 1.2. The minimal polynomial πœ‡π‘£ (𝑧) of a vector 𝑣 ∈ 𝑉 divides the
minimal polynomial πœ‡(𝑧) of 𝑇 .
Exercise 1.3. Show that ℐ𝑇 and π’₯𝑣 are ideals in C[𝑧].
Exercise 1.4. Let π‘Š be a subspace of 𝑉 and define
π’¦π‘Š = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )π‘Š = 0} = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )𝑀 = 0 for all 𝑀 ∈ π‘Š }
(1) Show that π’¦π‘Š is a nonzero ideal. Denote the monic generator by πœˆπ‘Š (𝑧).
(2) Show that πœˆπ‘Š (𝑧) divides πœ‡(𝑧).
(3) Show that if 𝑀 ∈ π‘Š , πœ‡π‘€ (𝑧) divides πœˆπ‘Š (𝑧).
It will be useful to know how big the degree of πœ‡π‘£ (𝑧) can be.
Proposition 1.5. If 𝑣 ∈ 𝑉 , there is a polynomial 𝑝(𝑧) that annihilates 𝑣 and has
deg(𝑝(𝑧)) ≤ 𝑑. Consequently, the degree of πœ‡π‘£ (𝑧) must be less than or equal to 𝑑.
Proof. We’ve already used the basic idea. Consider the vectors
𝑣, 𝑇 𝑣, 𝑇 2 𝑣, . . . , 𝑇 𝑑 𝑣.
This list has 𝑑 + 1 vectors in it, and they are in the 𝑑-dimensional space 𝑉 , so they
are linearly dependent. Thus, there are constants 𝑐𝑖 , not all zero, so that
𝑐0 𝑣 + 𝑐1 𝑇 𝑣 + 𝑐2 𝑇 2 𝑣 + · · · + 𝑐𝑑 𝑇 𝑑 𝑣 = 0.
If we define
𝑝(𝑧) = 𝑐0 + 𝑐1 𝑧 + 𝑐2 𝑧 2 + · · · + 𝑐𝑑 𝑧 𝑑 ,
we get a polynomial of degree ≤ 𝑑 so that 𝑝(𝑇 )𝑣 = 0.
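The same dependence argument actually computes πœ‡π‘£ (𝑧): keep applying 𝑇 to 𝑣 until the vectors become dependent. A sketch (numpy assumed; names and tolerance are my choices):

```python
import numpy as np

def vector_minpoly(T, v, tol=1e-9):
    """Monic coefficients (low degree first) of mu_v: find the smallest m for which
    T^m v is a combination of v, T v, ..., T^{m-1} v."""
    vecs = [v.astype(complex)]
    for m in range(1, T.shape[0] + 1):
        vecs.append(T @ vecs[-1])
        A = np.array(vecs[:-1]).T                    # columns v, ..., T^{m-1} v
        c, *_ = np.linalg.lstsq(A, vecs[-1], rcond=None)
        if np.linalg.norm(A @ c - vecs[-1]) < tol:
            return np.append(-c, 1.0)                # mu_v(z) = z^m - sum_k c_k z^k
    raise AssertionError("unreachable: deg mu_v <= d by Proposition 1.5")

T = np.array([[2.0, 1.0], [0.0, 2.0]])               # mu(z) = (z - 2)^2
print(vector_minpoly(T, np.array([1.0, 0.0])))       # mu_v(z) = z - 2 for the eigenvector e1
print(vector_minpoly(T, np.array([0.0, 1.0])))       # mu_v(z) = (z - 2)^2 for e2
```

In both cases πœ‡π‘£ (𝑧) divides πœ‡(𝑧) = (𝑧 − 2)², as Proposition 1.2 requires.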
Since πœ‡(𝑧) is a monic (non-constant) polynomial, we can factor it into linear
factors as
(1.1)
πœ‡(𝑧) = (𝑧 − π‘Ÿ1 )π‘š1 (𝑧 − π‘Ÿ2 )π‘š2 · · · (𝑧 − π‘Ÿβ„“ )π‘šβ„“ ,
where π‘Ÿ1 , π‘Ÿ2 , . . . , π‘Ÿβ„“ are the distinct roots of πœ‡(𝑧). We next want to determine what
these roots are.
Theorem 1.6. The roots of πœ‡(𝑧) are exactly the eigenvalues of 𝑇 .
Proof. First, suppose that πœ† is an eigenvalue of 𝑇 . Then there is a nonzero vector
𝑣 so that (𝑇 − πœ†)𝑣 = 0. Thus, the minimal polynomial of 𝑣 must be πœ‡π‘£ (𝑧) = 𝑧 − πœ†
(why?). Since πœ‡π‘£ (𝑧) divides πœ‡(𝑧), πœ† is a root of πœ‡(𝑧). Thus every eigenvalue is a
root of πœ‡(𝑧).
Next, we need to show that every root of πœ‡(𝑧) is an eigenvalue. Write πœ‡(𝑧) as
in (1.1). We want to consider one of the roots. There is nothing special about how
we labeled the roots, so we may well call our root π‘Ÿ1 . Consider the polynomial
π‘ž(𝑧) = (𝑧 − π‘Ÿ1 )π‘š1 −1 (𝑧 − π‘Ÿ2 )π‘š2 . . . (𝑧 − π‘Ÿβ„“ )π‘šβ„“ ,
i.e., we’ve pulled out one factor of (𝑧 − π‘Ÿ1 ), so πœ‡(𝑧) = (𝑧 − π‘Ÿ1 )π‘ž(𝑧). Since π‘ž(𝑧) has
degree less than the degree of πœ‡(𝑧), it is not divisible by πœ‡(𝑧). Thus, π‘ž(𝑇 ) ΜΈ= 0.
Saying this operator is not zero means that there is a vector 𝑣 ΜΈ= 0 so that π‘ž(𝑇 )𝑣 ΜΈ= 0.
But then
(𝑇 − π‘Ÿ1 )[π‘ž(𝑇 )𝑣] = [(𝑇 − π‘Ÿ1 )π‘ž(𝑇 )]𝑣 = πœ‡(𝑇 )𝑣 = 0𝑣 = 0.
Thus, π‘Ÿ1 is an eigenvalue of 𝑇 , with eigenvector π‘ž(𝑇 )𝑣.
We can now rewrite πœ‡(𝑧) as
(1.2)
πœ‡(𝑧) = (𝑧 − πœ†1 )β„Ž1 (𝑧 − πœ†2 )β„Ž2 . . . (𝑧 − πœ†β„“ )β„Žβ„“
where πœ†1 , πœ†2 , . . . , πœ†β„“ are the distinct eigenvalues of 𝑇 .
Remark 1.7. Remember the notation for the exponents in (1.2) since we will be
referring to them.
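Theorem 1.6 is easy to check on a small example. A sketch (sympy assumed; the matrix is my choice, with minimal polynomial (𝑧 − 2)²(𝑧 − 3)):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])
I3 = sp.eye(3)

# (z - 2)(z - 3) does not annihilate A, but (z - 2)^2 (z - 3) does,
# so mu(z) = (z - 2)^2 (z - 3); its roots 2 and 3 are exactly the eigenvalues.
print((A - 2*I3) * (A - 3*I3) == sp.zeros(3, 3))      # False
print((A - 2*I3)**2 * (A - 3*I3) == sp.zeros(3, 3))   # True
print(A.eigenvals())                                  # {2: 2, 3: 1}
```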
2. Some Tools
The following Proposition is standard linear algebra. The proof is included for
completeness.
Proposition 2.1. Let π‘Š be a vector space. Suppose that we have linear operators
𝑃1 , . . . , 𝑃𝑛 that satisfy the following conditions.
(1) Each 𝑃𝑖 is a projection operator, i.e., 𝑃𝑖2 = 𝑃𝑖 .
(2) 𝑃1 + 𝑃2 + · · · + 𝑃𝑛 = 𝐼.
(3) If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝑃𝑗 = 0.
Let π‘Šπ‘– be the image of 𝑃𝑖 . Then,
π‘Š = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Š 𝑛
and 𝑃𝑖 coincides with the projection of π‘Š onto π‘Šπ‘– defined by the direct sum decomposition.
Proof. We first want to show that any 𝑀 can be written as a sum of elements in
the π‘Šπ‘– ’s. This is easy, by Condition (2) we have
𝑀 = 𝑃1 𝑀 + 𝑃2 𝑀 + · · · + 𝑃𝑛 𝑀,
and 𝑃𝑖 𝑀 ∈ π‘Šπ‘– by definition.
Next we need to show that if
(2.1)
0 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑛 ,
𝑀𝑖 ∈ π‘Šπ‘– ,
then each of the components is zero.
Since 𝑀𝑖 ∈ π‘Šπ‘– = im(𝑃𝑖 ), we can find a vector 𝑒𝑖 so that 𝑀𝑖 = 𝑃𝑖 𝑒𝑖 . Thus, we
have
(2.2)
0 = 𝑃1 𝑒1 + 𝑃2 𝑒2 + · · · + 𝑃𝑛 𝑒𝑛 .
Fix an index 𝑗 and apply 𝑃𝑗 on the left of both sides of (2.2). We get
0 = 𝑃𝑗 𝑃1 𝑒1 + 𝑃𝑗 𝑃2 𝑒2 + · · · + 𝑃𝑗 𝑃𝑗−1 𝑒𝑗−1 + 𝑃𝑗2 𝑒𝑗 + 𝑃𝑗 𝑃𝑗+1 𝑒𝑗+1 + · · · + 𝑃𝑗 𝑃𝑛 𝑒𝑛 .
By Condition (3), the terms where the indices are different are zero, so we get
0 = 𝑃𝑗2 𝑒𝑗 . But 𝑃𝑗2 𝑒𝑗 = 𝑃𝑗 𝑒𝑗 = 𝑀𝑗 , by Condition (1), so 𝑀𝑗 = 0. Since 𝑗 was
arbitrary, we conclude that all the components in (2.1) are zero.
Finally, note that the projection onto π‘Šπ‘— defined by the direct sum decomposition is to take
(2.3)
𝑀 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑛 ,
𝑀𝑖 ∈ π‘Šπ‘– ,
to 𝑀𝑗 . By the same computation as above, if we apply 𝑃𝑗 to 𝑀, the result is 𝑀𝑗 .
Thus, the projections coincide.
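Note that Proposition 2.1 does not require the projections to be orthogonal. A small sketch (numpy assumed; the two matrices are my choices) with a pair of oblique projections on C²:

```python
import numpy as np

# Two oblique (non-orthogonal) projections satisfying conditions (1)-(3):
# P1 projects onto span{e1}, P2 onto span{(-1, 1)}, each along the other's image.
P1 = np.array([[1.0, 1.0], [0.0, 0.0]])
P2 = np.array([[0.0, -1.0], [0.0, 1.0]])

assert np.allclose(P1 @ P1, P1) and np.allclose(P2 @ P2, P2)   # (1) idempotent
assert np.allclose(P1 + P2, np.eye(2))                         # (2) sum to the identity
assert np.allclose(P1 @ P2, 0) and np.allclose(P2 @ P1, 0)     # (3) products vanish

w = np.array([3.0, 4.0])
print(P1 @ w, P2 @ w)   # the unique components of w in W1 and W2
```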
For our next utility, we need a little algebra. Recall that if 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)
are polynomials, the ideal (𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) they generate is the set of all
combinations
π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧),
π‘Ÿ1 (𝑧), . . . , π‘Ÿπ‘› (𝑧) ∈ C[𝑧].
This ideal is also equal to (𝑔(𝑧)) for some polynomial 𝑔(𝑧). If the polynomials are
not all zero, 𝑔(𝑧) can’t be zero and we can make a choice for 𝑔(𝑧) by choosing
the monic one. We call 𝑔(𝑧) the greatest common divisor of the polynomials
𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧), written as gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) . Since the gcd is in
the ideal (𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)), we have
gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) = π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧),
for some π‘Ÿ1 (𝑧), . . . , π‘Ÿπ‘› (𝑧) ∈ C[𝑧] If gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) = 1 the polynomials
are called relatively prime.
Proposition 2.2. The common roots of 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧) are exactly the roots
of 𝑔(𝑧) = gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)). Consequently, the polynomials are relatively
prime if and only if they have no common roots.
Proof. Suppose that πœ† is a root of 𝑔(𝑧). Since 𝑔(𝑧) divides everything in the ideal
(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)), we have 𝑝𝑖 (𝑧) = π‘žπ‘– (𝑧)𝑔(𝑧), for some polynomial π‘žπ‘– (𝑧). But
then 𝑝𝑖 (πœ†) = π‘žπ‘– (πœ†)𝑔(πœ†) = π‘žπ‘– (πœ†)0 = 0. Thus, πœ† is a root of each 𝑝𝑖 (𝑧).
For the converse, suppose πœ† is a root of all of the 𝑝𝑖 (𝑧)’s. We have
𝑔(𝑧) = π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧)
for some π‘Ÿπ‘– (𝑧)’s, so
𝑔(πœ†) = π‘Ÿ1 (πœ†)𝑝1 (πœ†) + · · · + π‘Ÿπ‘› (πœ†)𝑝𝑛 (πœ†) = π‘Ÿ1 (𝑧)0 + · · · + π‘Ÿπ‘› (πœ†)0 = 0.
If the polynomials 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧) have no common roots, then 𝑔(𝑧) has
no roots—so it must be a nonzero constant. The monic version is 1. Conversely,
if our polynomials are relatively prime, the gcd is 1, which has no roots, so the
polynomials have no common roots.
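The combination expressing the gcd can be computed explicitly with the extended Euclidean algorithm. A sketch for two polynomials (sympy assumed; the example polynomials are my choices):

```python
from sympy import symbols, gcdex, expand

z = symbols('z')
p1 = (z - 1)**2 * (z - 2)
p2 = (z - 1) * (z - 3)

# gcdex returns r1, r2, g with r1*p1 + r2*p2 = g, the monic gcd.
r1, r2, g = gcdex(p1, p2, z)
print(g)                               # z - 1: the common root 1 shows up as a root of g
print(expand(r1 * p1 + r2 * p2 - g))   # 0, verifying the identity
```

For more than two polynomials one can iterate, since gcd(𝑝1 , 𝑝2 , 𝑝3 ) = gcd(gcd(𝑝1 , 𝑝2 ), 𝑝3 ).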
3. Generalized Eigenspaces
In this section we will define the generalized eigenspaces² and show that the
whole space 𝑉 is the direct sum of the generalized eigenspaces.
To begin, let πœ†π‘– be an eigenvalue of 𝑇 . We say a vector 𝑣 ΜΈ= 0 is a generalized
eigenvector belonging to eigenvalue πœ†π‘– if
(𝑇 − πœ†π‘– )𝑝 𝑣 = 0
for some positive integer 𝑝. Obviously, for any positive integer π‘˜, (𝑇 − πœ†π‘– )𝑝+π‘˜ 𝑣 =
(𝑇 − πœ†π‘– )π‘˜ [(𝑇 − πœ†π‘– )𝑝 𝑣] = (𝑇 − πœ†π‘– )π‘˜ 0 = 0, so (𝑇 − πœ†π‘– )π‘ž 𝑣 = 0 for π‘ž ≥ 𝑝. For the moment
let π‘š be the smallest positive integer so that (𝑇 − πœ†π‘– )π‘š 𝑣 = 0. Then the minimal
polynomial of 𝑣 is πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š (why?). By Proposition 1.5, we must have
π‘š ≤ 𝑑. Thus, if (𝑇 − πœ†π‘– )𝑝 𝑣 = 0 for any power 𝑝, we must have (𝑇 − πœ†π‘– )𝑑 𝑣 = 0.
With this in mind, for each eigenvalue πœ†π‘– , we define
𝐺(πœ†π‘– ) = {𝑣 ∈ 𝑉 | (𝑇 − πœ†π‘– )𝑑 𝑣 = 0} = ker((𝑇 − πœ†π‘– )𝑑 ).
The subspace 𝐺(πœ†π‘– ) is called the generalized eigenspace belonging to the
eigenvalue πœ†π‘– . Recall that
𝐸(πœ†π‘– ) = ker((𝑇 − πœ†π‘– ))
is the eigenspace belonging to πœ†π‘– . Clearly
𝐸(πœ†π‘– ) ⊆ 𝐺(πœ†π‘– ),
but in general they are not equal.
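For a quick contrast, here is a sketch (sympy assumed; the 2 × 2 shift matrix is my example) where the eigenspace is a line but the generalized eigenspace is everything:

```python
import sympy as sp

# 2x2 Jordan block with eigenvalue 0: E(0) is 1-dimensional, G(0) is all of C^2.
A = sp.Matrix([[0, 1], [0, 0]])
E0 = A.nullspace()          # basis of ker(A - 0*I)
G0 = (A**2).nullspace()     # basis of ker((A - 0*I)^d), here d = 2
print(len(E0), len(G0))     # 1 2
```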
Exercise 3.1. Consider the linear transformation C3 → C3 given by multiplication
by the matrix
    ⎑0 1 0⎀
𝐴 = ⎒0 0 1βŽ₯ .
    ⎣0 0 0⎦
The only eigenvalue is 0. Find 𝐸(0) and 𝐺(0).
Exercise 3.2. Show that there are no “generalized eigenvalues”, i.e., if πœ‰ ∈ C and
there is a nonzero vector 𝑣 and a positive integer 𝑝 so that (𝑇 − πœ‰)𝑝 𝑣 = 0, then πœ‰ is
an eigenvalue of 𝑇 .
The following observation is useful.
Proposition 3.3. If πœ†π‘– is an eigenvalue,
𝐺(πœ†π‘– ) = ker((𝑇 − πœ†π‘– )β„Žπ‘– ).
Recall that β„Žπ‘– is the exponent of (𝑧 − πœ†π‘– ) in the minimal polynomial πœ‡(𝑧); see (1.2).
Proof. First suppose 𝑣 ∈ 𝐺(πœ†π‘– ). If 𝑣 = 0, there is nothing to prove. If 𝑣 ΜΈ= 0 then
(𝑇 − πœ†π‘– )𝑑 𝑣 = 0, so the minimal polynomial of 𝑣 must be πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š for some
positive integer π‘š ≤ 𝑑. But πœ‡π‘£ (𝑧) divides πœ‡(𝑧), so π‘š ≤ β„Žπ‘– . Thus (𝑇 − πœ†π‘– )β„Žπ‘– 𝑣 = 0.
The other direction is trivial. If 𝑣 ∈ ker((𝑇 − πœ†π‘– )β„Žπ‘– ), then (𝑇 − πœ†π‘– )𝑑 𝑣 = 0, since
𝑑 ≥ β„Žπ‘– . Thus, 𝑣 ∈ 𝐺(πœ†π‘– ).
Another useful observation is the following Proposition.
²Bad terminology, but we’re stuck with it.
Proposition 3.4. Let 𝑖 and 𝑗 be distinct indices, so πœ†π‘– ΜΈ= πœ†π‘— . Then
𝐺(πœ†π‘– ) ∩ 𝐺(πœ†π‘— ) = {0}.
Proof. Suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). Then (𝑇 − πœ†π‘– )β„Žπ‘– 𝑣 = 0. Thus, the minimal
polynomial πœ‡π‘£ (𝑧) of 𝑣 must divide (𝑧 − πœ†π‘– )β„Žπ‘– . This means that πœ‡π‘£ (𝑧) must be
πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š , for some integer 0 ≤ π‘š ≤ β„Žπ‘– (the zero vector would have
minimal polynomial (𝑧 − πœ†π‘– )0 = 1).
On the other hand, 𝑣 ∈ 𝐺(πœ†π‘— ), so (𝑇 − πœ†π‘— )β„Žπ‘— 𝑣 = 0. Thus, the polynomial
(𝑧 − πœ†π‘— )β„Žπ‘— must be divisible by πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š . Since πœ†π‘– ΜΈ= πœ†π‘— , the only way this
is possible is to have π‘š = 0, i.e., πœ‡π‘£ (𝑧) = 1. But then 0 = πœ‡π‘£ (𝑇 )𝑣 = 1𝑣 = 𝑣, so
𝑣 = 0.
Next, we develop a little machinery about commuting operators. If 𝐿 and 𝑆 are
linear maps 𝑉 → 𝑉 , we say they commute if 𝐿𝑆 = 𝑆𝐿. The following simple
observations are left to the reader.
Proposition 3.5. Let 𝑇 , 𝑆, and 𝑅 be linear operators 𝑉 → 𝑉 . Then, the following
properties hold.
(1) 𝑇 commutes with itself.
(2) If 𝑇 commutes with 𝑆 and 𝑅, then 𝑇 commutes with the products 𝑆𝑅 and
𝑅𝑆.
(3) If 𝑆 and 𝑇 commute, 𝑇 𝑝 commutes with 𝑆 π‘ž for any powers 𝑝 and π‘ž (which
can be negative if the operator is invertible).
(4) If 𝑇 commutes with 𝑆 and 𝑅, then 𝑇 commutes with 𝛼𝑆 + 𝛽𝑅 for any
scalars 𝛼 and 𝛽.
(5) If 𝑆 commutes with 𝑇 , then 𝑆 commutes with any polynomial 𝑝(𝑇 ) in 𝑇 .
We can use these facts to prove the following useful Proposition.
Proposition 3.6. Let 𝑆 be a linear operator 𝑉 → 𝑉 that commutes with 𝑇 and
let πœ†π‘– be an eigenvalue of 𝑇 . Then
𝑆𝐺(πœ†π‘– ) ⊆ 𝐺(πœ†π‘– )
𝑆𝐸(πœ†π‘– ) ⊆ 𝐸(πœ†π‘– )
We say that the subspaces 𝐺(πœ†π‘– ) and 𝐸(πœ†π‘– ) are invariant under 𝑆.
Proof. Suppose that 𝑣 ∈ 𝐺(πœ†π‘– ), which means that (𝑇 − πœ†π‘– )𝑑 𝑣 = 0. To test if 𝑆𝑣 is
in 𝐺(πœ†π‘– ), we need to see if (𝑇 − πœ†π‘– )𝑑 [𝑆𝑣] = 0. But 𝑆 and (𝑇 − πœ†π‘– )𝑑 commute, so
(𝑇 − πœ†π‘– )𝑑 [𝑆𝑣] = [(𝑇 − πœ†π‘– )𝑑 𝑆]𝑣 = 𝑆[(𝑇 − πœ†π‘– )𝑑 𝑣] = 𝑆0 = 0.
Thus, 𝑆𝑣 ∈ 𝐺(πœ†π‘– ). The corresponding result for the eigenspaces is left to the reader.
Corollary 3.7. The eigenspace 𝐸(πœ†π‘– ) and the generalized eigenspace 𝐺(πœ†π‘– ) are
invariant under any polynomial 𝑝(𝑇 ) in 𝑇 .
We now state the Big Theorem.
Theorem 3.8 (Big Theorem). Let πœ†1 , . . . , πœ†β„“ be the distinct eigenvalues of 𝑇 . Then
𝑉 = 𝐺(πœ†1 ) ⊕ 𝐺(πœ†2 ) ⊕ · · · ⊕ 𝐺(πœ†β„“ ),
in words, 𝑉 is the direct sum of the generalized eigenspaces.
Let’s discuss the proof, stating some important facts as Lemmas.
Consider the polynomials
(3.1)
π‘žπ‘– (𝑧) = ∏︁𝑗̸=𝑖 (𝑧 − πœ†π‘— )β„Žπ‘— ,
in other words, we take the minimal polynomial and remove the factor (𝑧 − πœ†π‘– )β„Žπ‘–
corresponding to the eigenvalue πœ†π‘– .
Lemma 3.9. The polynomials π‘ž1 (𝑧), π‘ž2 (𝑧), . . . , π‘žβ„“ (𝑧) are relatively prime.
Proof of Lemma. It will suffice to show our polynomials have no common roots.
The only possible roots are the eigenvalues πœ†1 , πœ†2 , . . . , πœ†β„“ . But πœ†1 is not a common
root, because it is not a root of π‘ž1 (𝑧), πœ†2 is not a root of π‘ž2 (𝑧), and so forth.
Since the π‘žπ‘– (𝑧)’s are relatively prime, we have
(3.2)
1 = π‘Ÿ1 (𝑧)π‘ž1 (𝑧) + π‘Ÿ2 (𝑧)π‘ž2 (𝑧) + · · · + π‘Ÿβ„“ (𝑧),
for some polynomials π‘Ÿ1 (𝑧), . . . , π‘Ÿβ„“ (𝑧). We’ll use the following notation
𝑝𝑖 (𝑧) = π‘Ÿπ‘– (𝑧)π‘žπ‘– (𝑧)
𝑃𝑖 = 𝑝𝑖 (𝑇 ) = π‘Ÿπ‘– (𝑇 )π‘žπ‘– (𝑇 ).
Plugging 𝑇 in for 𝑧 in (3.2) we have
(3.3)
𝐼 = 𝑝1 (𝑇 ) + 𝑝2 (𝑇 ) + · · · + 𝑝ℓ (𝑇 ) = 𝑃1 + 𝑃2 + · · · + 𝑃ℓ .
Lemma 3.10. For each 𝑖,
im(𝑃𝑖 ) ⊆ 𝐺(πœ†π‘– ).
Proof of Lemma. Let 𝑣 be a vector in 𝑉 . We want to show that 𝑃𝑖 𝑣 ∈ 𝐺(πœ†π‘– ). By
Proposition 3.3, it will suffice to show that
(𝑇 − πœ†π‘– )β„Žπ‘– 𝑃𝑖 𝑣 = 0.
(3.4)
But (𝑧 − πœ†π‘– )β„Žπ‘– is exactly the factor we removed from πœ‡(𝑧) to get π‘žπ‘– (𝑧). Thus,
(𝑧 − πœ†π‘– )β„Žπ‘– 𝑝𝑖 (𝑧) = π‘Ÿπ‘– (𝑧)(𝑧 − πœ†π‘– )β„Žπ‘– π‘žπ‘– (𝑧) = π‘Ÿπ‘– (𝑧)πœ‡(𝑧)
and then
(𝑇 − πœ†π‘– )β„Žπ‘– 𝑃𝑖 = π‘Ÿπ‘– (𝑇 )πœ‡(𝑇 ) = 0,
so (3.4) is certainly true.
Lemma 3.11. If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝐺(πœ†π‘— ) = 0.
Proof of Lemma. The factor (𝑧 − πœ†π‘— )β„Žπ‘— appears in π‘žπ‘– (𝑧). Thus, 𝑝𝑖 (𝑧) = 𝑔(𝑧)(𝑧 −
πœ†π‘— )β„Žπ‘— for some polynomial 𝑔(𝑧). Thus,
𝑃𝑖 𝐺(πœ†π‘— ) = 𝑝𝑖 (𝑇 )𝐺(πœ†π‘— ) = 𝑔(𝑇 )(𝑇 − πœ†π‘— )β„Žπ‘— 𝐺(πœ†π‘— ) = 0,
since (𝑇 − πœ†π‘— )β„Žπ‘— kills 𝐺(πœ†π‘— ).
Lemma 3.12. If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝑃𝑗 = 0.
Proof of Lemma. This follows from Lemma 3.10 and Lemma 3.11.
Lemma 3.13. Each 𝑃𝑖 is a projection operator, i.e., 𝑃𝑖2 = 𝑃𝑖 .
Proof of Lemma. We have
𝐼 = 𝑃1 + 𝑃2 + · · · + 𝑃ℓ .
Multiply this by 𝑃𝑗 on the left. This gives
𝑃𝑗 = 𝑃𝑗 𝑃1 + 𝑃𝑗 𝑃2 + · · · + 𝑃𝑗 𝑃𝑗−1 + 𝑃𝑗2 + 𝑃𝑗 𝑃𝑗+1 + · · · + 𝑃𝑗 𝑃ℓ .
All the terms where the indices are not equal are zero, so we wind up with 𝑃𝑗 =
𝑃𝑗2 .
We’ve now shown that the 𝑃𝑖 ’s satisfy all the requirements of Proposition 2.1.
If we let π‘Šπ‘– = im(𝑃𝑖 ) ⊆ 𝐺(πœ†π‘– ), we have
(3.5)
𝑉 = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Šβ„“
We will be done if we show that π‘Šπ‘– = 𝐺(πœ†π‘– ).
To do this, suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). We have, of course,
𝑣 = 𝑃1 𝑣 + 𝑃2 𝑣 + · · · + 𝑃ℓ 𝑣.
Consider 𝑃𝑗 𝑣 for 𝑗 ΜΈ= 𝑖. On the one hand, 𝑃𝑗 𝑣 ∈ π‘Šπ‘— ⊆ 𝐺(πœ†π‘— ). On the other hand,
𝑃𝑗 = 𝑝𝑗 (𝑇 ) is a polynomial in 𝑇 . By Corollary 3.7, 𝐺(πœ†π‘– ) is invariant under 𝑃𝑗 , so
𝑃𝑗 𝑣 ∈ 𝐺(πœ†π‘– ). But then 𝑃𝑗 𝑣 ∈ 𝐺(πœ†π‘– ) ∩ 𝐺(πœ†π‘— ) = 0, using Proposition 3.4.
Since 𝑃𝑗 𝑣 = 0 for 𝑗 ΜΈ= 𝑖, we have 𝑣 = 𝑃𝑖 𝑣 ∈ π‘Šπ‘– , which completes the proof that
π‘Šπ‘– = 𝐺(πœ†π‘– ).
This completes our proof of the Big Theorem, Theorem 3.8.
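The whole construction can be carried out symbolically on a small example. A sketch (sympy assumed; the matrix, whose minimal polynomial is (𝑧 − 2)²(𝑧 − 3), and the helper name are my choices): build the π‘žπ‘– , find the π‘Ÿπ‘– from the identity (3.2), and check that the resulting 𝑃𝑖 behave as the Lemmas promise.

```python
import sympy as sp

z = sp.symbols('z')

def at_matrix(p, A):
    """Evaluate the polynomial p(z) at the matrix A by Horner's rule."""
    R = sp.zeros(*A.shape)
    for c in sp.Poly(p, z).all_coeffs():   # highest degree first
        R = R * A + c * sp.eye(A.shape[0])
    return R

A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])
mu = (z - 2)**2 * (z - 3)                  # minimal polynomial of A

q1 = sp.quo(mu, (z - 2)**2, z)             # remove (z - 2)^2: q1 = z - 3
q2 = sp.quo(mu, (z - 3), z)                # remove (z - 3):   q2 = (z - 2)^2
r1, r2, one = sp.gcdex(q1, q2, z)          # r1*q1 + r2*q2 = 1, as in (3.2)

P1 = at_matrix(sp.expand(r1 * q1), A)      # projection onto G(2)
P2 = at_matrix(sp.expand(r2 * q2), A)      # projection onto G(3)

print(P1 + P2 == sp.eye(3))                            # True
print(P1 * P1 == P1, P1 * P2 == sp.zeros(3, 3))        # True True
```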
4. The Jordan Decomposition
We begin with a discussion of nilpotent matrices. A linear transformation
𝑁 : 𝑉 → 𝑉 is nilpotent if 𝑁 𝑝 = 0 for some positive integer 𝑝. Of course, if
𝑁 𝑝 = 0 then 𝑁 π‘ž = 0 for any π‘ž > 𝑝.
We call the smallest positive integer 𝑛 such that 𝑁 𝑛 = 0 the degree of nilpotency of 𝑁 . Another way to characterize 𝑛 is 𝑁 𝑛 = 0 but 𝑁 𝑛−1 ΜΈ= 0.
We want to show that if the dimension of 𝑉 is 𝑑 and 𝑁 is nilpotent then 𝑁 𝑑 = 0,
i.e., 𝑛 ≤ 𝑑.
One way to see this is the following Proposition, which is useful in its own right.
Proposition 4.1. Let 𝑣 be a vector in 𝑉 and let 𝑆 : 𝑉 → 𝑉 be a linear transformation. Suppose there is a positive integer π‘š such that 𝑆 π‘š 𝑣 = 0, but 𝑆 π‘š−1 𝑣 ΜΈ= 0.
Then the π‘š vectors
𝑣, 𝑆𝑣, 𝑆 2 𝑣, . . . , 𝑆 π‘š−1 𝑣
are linearly independent.
Proof. Suppose that we have a relation
(4.1)
𝑐0 𝑣 + 𝑐1 𝑆𝑣 + 𝑐2 𝑆 2 𝑣 + · · · + π‘π‘š−1 𝑆 π‘š−1 𝑣 = 0.
We need to show that all of the coefficients are zero.
To do this, first multiply (4.1) on the left by 𝑆 π‘š−1 . This gives
(4.2)
𝑐0 𝑆 π‘š−1 𝑣 + 𝑐1 𝑆 π‘š 𝑣 + · · · + π‘π‘š−1 𝑆 2π‘š−2 𝑣 = 0.
Since 𝑆 𝑝 𝑣 = 0 for 𝑝 ≥ π‘š, this reduces to just 𝑐0 𝑆 π‘š−1 𝑣 = 0. Since 𝑆 π‘š−1 𝑣 ΜΈ= 0, we
conclude that 𝑐0 = 0.
Equation (4.1) now reduces to
𝑐1 𝑆𝑣 + 𝑐2 𝑆 2 𝑣 + · · · + π‘π‘š−1 𝑆 π‘š−1 𝑣 = 0.
We now multiply this on the left by 𝑆 π‘š−2 , which gives us
𝑐1 𝑆 π‘š−1 𝑣 + 𝑐2 𝑆 π‘š 𝑣 + · · · + π‘π‘š−1 𝑆 2π‘š−3 𝑣 = 0.
Again, all the terms but the first are zero, so 𝑐1 𝑆 π‘š−1 𝑣 = 0, from which we can
conclude that 𝑐1 = 0.
Continuing in this way, we conclude that all the coefficients are zero.
Proposition 4.2. If 𝑁 : 𝑉 → 𝑉 is nilpotent and the dimension of 𝑉 is 𝑑, then
𝑁 𝑑 = 0, i.e., the degree of nilpotency of 𝑁 is less than or equal to 𝑑.
Proof I. Suppose that 𝑁 𝑛−1 ΜΈ= 0 but 𝑁 𝑛 = 0. Since 𝑁 𝑛−1 ΜΈ= 0, there is a vector 𝑣
such that 𝑁 𝑛−1 𝑣 ΜΈ= 0. But then the 𝑛 vectors
𝑣, 𝑁 𝑣, 𝑁 2 𝑣, . . . , 𝑁 𝑛−1 𝑣
are linearly independent, so 𝑛 ≤ dim(𝑉 ) = 𝑑.
Proof II. Suppose that 𝑁 𝑛 = 0 but 𝑁 𝑛−1 ΜΈ= 0. Then the polynomial 𝑧 𝑛 annihilates
𝑁 , but 𝑧 𝑛−1 does not. Thus, the minimal polynomial of 𝑁 is 𝑧 𝑛 . We know the
degree of the minimal polynomial must be ≤ 𝑑.
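Both proofs are easy to watch in action. A sketch (numpy assumed; the 3 × 3 shift matrix is my example), where the degree of nilpotency is exactly 𝑛 = 𝑑 = 3:

```python
import numpy as np

N = np.diag([1.0, 1.0], k=1)          # 3x3 shift: ones on the superdiagonal
N2 = np.linalg.matrix_power(N, 2)
N3 = np.linalg.matrix_power(N, 3)
print(np.any(N2), np.allclose(N3, 0))   # N^2 != 0 but N^3 = 0

# As in Proposition 4.1, v, Nv, N^2 v are independent when N^2 v != 0:
v = np.array([0.0, 0.0, 1.0])
print(np.linalg.matrix_rank(np.column_stack([v, N @ v, N2 @ v])))   # 3
```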
We continue to investigate our fixed linear transformation 𝑇 : 𝑉 → 𝑉 , where 𝑉
has dimension 𝑑.
The goal of this section is to prove the following Theorem, which often suffices
to solve a problem without going to the full Jordan Form.
Theorem 4.3 (Jordan Decomposition). If 𝑇 : 𝑉 → 𝑉 is a linear transformation,
there are unique linear transformations 𝑆 and 𝑁 from 𝑉 to 𝑉 so that the following
conditions hold.
(JD1) 𝑇 = 𝑆 + 𝑁 .
(JD2) 𝑆𝑁 = 𝑁 𝑆, i.e., 𝑆 and 𝑁 commute.
(JD3) 𝑆 is diagonalizable.
(JD4) 𝑁 is nilpotent.
We’ll divide the rest of this section into the proof of existence and the proof of
uniqueness.
4.1. Proof of Existence. As usual, let πœ†1 , πœ†2 , . . . , πœ†β„“ be the distinct eigenvalues
of 𝑇 . From our Big Theorem, we have
(4.3)
𝑉 = 𝐺(πœ†1 ) ⊕ 𝐺(πœ†2 ) ⊕ · · · ⊕ 𝐺(πœ†β„“ ).
Let 𝑆 : 𝑉 → 𝑉 be the linear transformation that is given on 𝐺(πœ†π‘– ) by multiplication by πœ†π‘– . Thus, if 𝑣 ∈ 𝑉 is decomposed as
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣ℓ ,
with respect to the direct sum decomposition (4.3), we have
𝑆𝑣 = πœ†1 𝑣1 + πœ†2 𝑣2 + · · · + πœ†β„“ 𝑣ℓ .
Another way to say it is that
(4.4)
𝑆 = πœ†1 𝑃1 + πœ†2 𝑃2 + · · · + πœ†β„“ 𝑃ℓ .
Since the 𝑃𝑖 ’s are polynomials in 𝑇 , we see that 𝑆 is a linear combination of polynomials in 𝑇 , and so is a polynomial in 𝑇 . Thus, 𝑆 commutes with 𝑇 , which is also
easy to check from the definition of 𝑆.
Exercise 4.4. Use the fact that the generalized eigenspaces are invariant under 𝑇
to show that 𝑆𝑇 = 𝑇 𝑆.
We now define 𝑁 = 𝑇 − 𝑆. It’s clear that 𝑁 commutes with both 𝑆 and 𝑇 .
Indeed, 𝑁 is a polynomial in 𝑇 . We need to show that 𝑁 is nilpotent.
To do this, suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). We then have
(𝑇 − 𝑆)𝑣 = 𝑇 𝑣 − 𝑆𝑣 = 𝑇 𝑣 − πœ†π‘– 𝑣 = (𝑇 − πœ†π‘– )𝑣.
Since (𝑇 − 𝑆) commutes with (𝑇 − πœ†π‘– ), we have
(𝑇 −𝑆)2 𝑣 = (𝑇 −𝑆)[(𝑇 −𝑆)𝑣] = (𝑇 −𝑆)(𝑇 −πœ†π‘– )𝑣 = (𝑇 −πœ†π‘– )[(𝑇 −𝑆)𝑣] = (𝑇 −πœ†π‘– )2 𝑣.
Continuing in this way, we get
(𝑇 − 𝑆)𝑝 𝑣 = (𝑇 − πœ†π‘– )𝑝 𝑣.
Thus,
𝑁 𝑑 𝑣 = (𝑇 − 𝑆)𝑑 𝑣 = (𝑇 − πœ†π‘– )𝑑 𝑣 = 0,
since 𝑣 ∈ 𝐺(πœ†π‘– ). For an arbitrary 𝑣 ∈ 𝑉 , we have
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣ℓ ,
𝑣𝑖 ∈ 𝐺(πœ†π‘– ),
and so
𝑁 𝑑 𝑣 = 𝑁 𝑑 𝑣1 + 𝑁 𝑑 𝑣2 + · · · + 𝑁 𝑑 𝑣ℓ = 0 + 0 + · · · + 0 = 0.
We’ve now constructed 𝑆 and 𝑁 satisfying the required four properties, so the
proof of existence is complete.
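For a concrete instance of the construction (numpy assumed; the matrix is my choice, with 𝐺(2) the first two coordinates and 𝐺(5) the third):

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
S = np.diag([2.0, 2.0, 5.0])   # multiplication by lambda_i on each G(lambda_i)
N = T - S                      # here N has a single 1 in the (1, 2) entry

assert np.allclose(T, S + N)                          # (JD1)
assert np.allclose(S @ N, N @ S)                      # (JD2) S and N commute
assert np.allclose(np.linalg.matrix_power(N, 2), 0)   # (JD4) nilpotent; S is diagonal, (JD3)
print(N)
```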
4.2. Proof of Uniqueness. Denote the Jordan Decomposition we have constructed
in the last subsection as 𝑇 = 𝑆old +𝑁old . Suppose that we have two transformations
𝑆 and 𝑁 so that 𝑇 = 𝑆 + 𝑁 and (JD1) through (JD4) hold. We want to prove
that 𝑆 = 𝑆old and 𝑁 = 𝑁old . Notice that 𝑇 , 𝑆 and 𝑁 must all commute with each
other.
First, let’s determine the eigenvalues of 𝑆. Suppose that πœ‰ is an eigenvalue of 𝑆
and let 𝑣 be an eigenvector for πœ‰. Then
(𝑇 − 𝑆)𝑣 = 𝑇 𝑣 − 𝑆𝑣 = 𝑇 𝑣 − πœ‰π‘£ = (𝑇 − πœ‰)𝑣.
Since (𝑇 − 𝑆) and 𝑇 − πœ‰ commute, we have
(𝑇 − 𝑆)2 𝑣 = (𝑇 − 𝑆)[(𝑇 − 𝑆)𝑣] = (𝑇 − 𝑆)[(𝑇 − πœ‰)𝑣] = (𝑇 − πœ‰)[(𝑇 − 𝑆)𝑣] = (𝑇 − πœ‰)2 𝑣.
Continuing this argument, we have (𝑇 − 𝑆)𝑝 𝑣 = (𝑇 − πœ‰)𝑝 𝑣 for any power 𝑝. But
then
(𝑇 − πœ‰)𝑑 𝑣 = (𝑇 − 𝑆)𝑑 𝑣 = 𝑁 𝑑 𝑣 = 0,
since 𝑁 is nilpotent. From this we conclude, as in Exercise 3.2, that πœ‰ is an
eigenvalue of 𝑇 . We can also conclude that 𝑣, which started out as an eigenvector
of 𝑆, is in the generalized eigenspace of 𝑇 belonging to πœ‰.
This gives us the following Lemma.
Lemma 4.5. Let πœ‰ be an eigenvalue of 𝑆. Then πœ‰ is an eigenvalue of 𝑇 and
𝐸𝑆 (πœ‰) ⊆ 𝐺𝑇 (πœ‰),
i.e., the eigenspace of 𝑆 is contained in the generalized eigenspace of 𝑇 .
Next, let πœ† be an eigenvalue of 𝑇 with eigenvector 𝑣. Then
(𝑆 − 𝑇 )𝑣 = 𝑆𝑣 − 𝑇 𝑣 = 𝑆𝑣 − πœ†π‘£ = (𝑆 − πœ†)𝑣.
Since (𝑆 − 𝑇 ) and (𝑆 − πœ†) commute, we can use the same procedure as above to
show that
(𝑆 − 𝑇 )𝑝 𝑣 = (𝑆 − πœ†)𝑝 𝑣
for any exponent 𝑝. But 𝑆 − 𝑇 = −𝑁 , so
(𝑆 − πœ†)𝑑 𝑣 = (𝑆 − 𝑇 )𝑑 𝑣 = (−𝑁 )𝑑 𝑣 = (−1)𝑑 𝑁 𝑑 𝑣 = 0,
since 𝑁 𝑑 = 0. As before, this implies that πœ† is an eigenvalue of 𝑆.
We’ve now shown that the eigenvalues of 𝑆 and 𝑇 are exactly the same, and for
each eigenvalue πœ†π‘– we have
𝐸𝑆 (πœ†π‘– ) ⊆ 𝐺𝑇 (πœ†π‘– ).
Of course we have
𝑉 = 𝐺𝑇 (πœ†1 ) ⊕ 𝐺𝑇 (πœ†2 ) ⊕ · · · ⊕ 𝐺𝑇 (πœ†β„“ ).
Since 𝑆 is diagonalizable, we have
𝑉 = 𝐸𝑆 (πœ†1 ) ⊕ 𝐸𝑆 (πœ†2 ) ⊕ · · · ⊕ 𝐸𝑆 (πœ†β„“ ).
We can then cite the following Lemma.
Lemma 4.6. Let 𝑉 be a vector space and suppose that 𝑉1 , 𝑉2 , . . . , 𝑉𝑛 , π‘Š1 , π‘Š2 , . . . , π‘Šπ‘›
are subspaces of 𝑉 so that
(1) π‘Šπ‘– ⊆ 𝑉𝑖 for all 𝑖 = 1, 2, . . . , 𝑛.
(2) 𝑉 = 𝑉1 ⊕ 𝑉2 ⊕ · · · ⊕ 𝑉𝑛 .
(3) 𝑉 = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Šπ‘› .
Then
π‘Šπ‘– = 𝑉𝑖 ,
𝑖 = 1, 2, . . . , 𝑛.
Proof. Of course, any vector 𝑣 can be written uniquely as
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣𝑛 ,
where 𝑣𝑖 ∈ 𝑉𝑖 .
We want to prove π‘Šπ‘– = 𝑉𝑖 for all 𝑖. Suppose that 𝑣 ∈ 𝑉𝑗 . Then its decomposition
as above is
(4.5)
𝑣 = 0 + 0 + · · · + 0 + 𝑣 + 0 + · · · + 0,
i.e., all the components are zero except for the one in 𝑉𝑗 , which is 𝑣. But we can
also write 𝑣 with respect to the direct sum of the π‘Šπ‘— ’s, so
(4.6)
𝑣 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑗−1 + 𝑀𝑗 + 𝑀𝑗+1 + · · · + 𝑀𝑛 ,
where 𝑀𝑖 ∈ π‘Šπ‘– . But π‘Šπ‘– ⊆ 𝑉𝑖 , so each 𝑀𝑖 ∈ 𝑉𝑖 . Thus, in (4.6), 𝑣 is written as a
sum of components where the 𝑖th component is in 𝑉𝑖 . There is only one way to do
this, namely (4.5). Thus, we have 𝑀𝑖 = 0 for 𝑖 ΜΈ= 𝑗 and 𝑣 = 𝑀𝑗 ∈ π‘Šπ‘— .
This shows 𝑉𝑗 ⊆ π‘Šπ‘— , so the proof is complete.
Applying this to the case at hand, we conclude that
𝐸𝑆 (πœ†π‘– ) = 𝐺𝑇 (πœ†π‘– ),
in other words, 𝑆 is given by multiplication by πœ†π‘– on 𝐺𝑇 (πœ†π‘– ). This is exactly the
definition of 𝑆old in our construction, so we conclude that 𝑆 = 𝑆old . Then, of
course, 𝑁 = 𝑇 − 𝑆 = 𝑇 − 𝑆old = 𝑁old , so the proof of uniqueness is complete.
Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX
79409-1042
E-mail address: lance.drager@ttu.edu