NOTES ON JORDAN CANONICAL FORM
MATH 5316, FALL 2012
LANCE D. DRAGER
1. Polynomials and Linear Transformations
Fix a vector space 𝑉 over the complex numbers C. We’ll denote the dimension of
𝑉 by 𝑑. Let 𝑇 : 𝑉 → 𝑉 be a linear transformation. We’re safe in assuming 𝑇 ΜΈ= 0.
If we have a polynomial 𝑝(𝑧) ∈ C[𝑧], say
𝑝(𝑧) = π‘Žπ‘› 𝑧 𝑛 + π‘Žπ‘›−1 𝑧 𝑛−1 + · · · + π‘Ž1 𝑧 + π‘Ž0 ,
we can plug 𝑇 into the polynomial in place of 𝑧 to get a linear operator 𝑝(𝑇 ). We
interpret the constant term in the polynomial as π‘Ž0 𝑧 0 , so when we plug in 𝑇 we get
π‘Ž0 𝑇 0 = π‘Ž0 𝐼, where 𝐼 is the identity operator. Thus,
𝑝(𝑇 ) = π‘Žπ‘› 𝑇 𝑛 + π‘Žπ‘›−1 𝑇 𝑛−1 + · · · + π‘Ž1 𝑇 + π‘Ž0 𝐼.
We will omit the 𝐼 in this expression if no confusion will result. Thus we write
𝑇 − 3 for 𝑇 − 3𝐼.
Studying the operators 𝑝(𝑇 ) will allow us to analyze the structure of 𝑇 . We
begin by showing there is a polynomial so that 𝑝(𝑇 ) = 0.
Lemma 1.1. If 𝑇 : 𝑉 → 𝑉 is a linear operator, there is a non-zero polynomial
𝑝(𝑧) ∈ C[𝑧] so that 𝑝(𝑇 ) = 0.
Proof. The operator 𝑇 is in the vector space 𝐿(𝑉, 𝑉 ) of linear operators on 𝑉 . We
know that 𝐿(𝑉, 𝑉 ) is isomorphic to the space of 𝑑 × π‘‘ complex matrices, which has
dimension 𝑑2 , so the dimension of 𝐿(𝑉, 𝑉 ) is 𝑑2 . Consider the following vectors in
𝐿(𝑉, 𝑉 ),
𝐼, 𝑇, 𝑇 2 , . . . , 𝑇 𝑑² .
This is a list of 𝑑2 + 1 vectors in a 𝑑2 dimensional space, so these vectors must be
dependent. Thus, there are complex numbers¹ π‘Žπ‘– (not all zero) so that
π‘Ž0 𝐼 + π‘Ž1 𝑇 + π‘Ž2 𝑇 2 + · · · + π‘Žπ‘‘² 𝑇 𝑑² = 0.
If we let 𝑝(𝑧) be the polynomial
𝑝(𝑧) = π‘Ž0 + π‘Ž1 𝑧 + π‘Ž2 𝑧 2 + · · · + π‘Žπ‘‘² 𝑧 𝑑² ,
we get a nonzero polynomial so that 𝑝(𝑇 ) = 0.
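The proof is constructive enough to run. A sketch of the same idea (numpy assumed; the function name is my choice): flatten the powers 𝐼, 𝑇, . . . , 𝑇 𝑑² into column vectors and read an annihilating polynomial off a null vector of the resulting matrix.

```python
import numpy as np

def annihilating_poly(T):
    """Coefficients a_0, ..., a_{d^2} (not all zero) with sum_k a_k T^k = 0,
    found from a null vector of the matrix whose columns are the flattened powers."""
    d = T.shape[0]
    P = np.eye(d, dtype=complex)
    cols = []
    for _ in range(d * d + 1):
        cols.append(P.ravel())
        P = P @ T
    M = np.array(cols).T     # d^2 rows, d^2 + 1 columns, so a null vector must exist
    _, _, Vh = np.linalg.svd(M)
    return Vh[-1].conj()     # right singular vector for the smallest singular value

T = np.array([[0.0, 1.0], [0.0, 0.0]])   # a nilpotent example
a = annihilating_poly(T)
p_of_T = sum(c * np.linalg.matrix_power(T.astype(complex), k) for k, c in enumerate(a))
print(np.allclose(p_of_T, 0))   # True
```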
Version Time-stamp: "2012-11-06 15:03:07 drager".
¹We’ll denote √−1 by i.
We define
ℐ𝑇 = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 ) = 0}.
This is (pretty obviously) an ideal in C[𝑧]; see Exercise 1.3.
Since all ideals in C[𝑧] are principal, we can find the monic generator πœ‡(𝑧) of ℐ𝑇 .
This is the monic polynomial of least degree that annihilates 𝑇 . The polynomial
πœ‡(𝑧) is called the minimal polynomial of 𝑇 .
Similarly, if 𝑣 ∈ 𝑉 , we can look at
π’₯𝑣 = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )𝑣 = 0}.
This is again an ideal, and it’s nonzero because ℐ𝑇 ⊆ π’₯𝑣 , i.e., a polynomial that
annihilates everything annihilates 𝑣. The monic generator of this ideal is called
the minimal polynomial of 𝑣, and will be denoted by πœ‡π‘£ (𝑧). Since the minimal
polynomial πœ‡(𝑧) of 𝑇 is contained in π’₯𝑣 , πœ‡π‘£ (𝑧) must divide πœ‡(𝑧). This fact is so
handy, we’ll display it.
Proposition 1.2. The minimal polynomial πœ‡π‘£ (𝑧) of a vector 𝑣 ∈ 𝑉 divides the
minimal polynomial πœ‡(𝑧) of 𝑇 .
Exercise 1.3. Show that ℐ𝑇 and π’₯𝑣 are ideals in C[𝑧].
Exercise 1.4. Let π‘Š be a subspace of 𝑉 and define
π’¦π‘Š = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )π‘Š = 0} = {𝑝(𝑧) ∈ C[𝑧] | 𝑝(𝑇 )𝑀 = 0 for all 𝑀 ∈ π‘Š }
(1) Show that π’¦π‘Š is a nonzero ideal. Denote the monic generator by πœˆπ‘Š (𝑧).
(2) Show that πœˆπ‘Š (𝑧) divides πœ‡(𝑧).
(3) Show that if 𝑀 ∈ π‘Š , πœ‡π‘€ (𝑧) divides πœˆπ‘Š (𝑧).
It will be useful to know how big the degree of πœ‡π‘£ (𝑧) can be.
Proposition 1.5. If 𝑣 ∈ 𝑉 , there is a polynomial 𝑝(𝑧) that annihilates 𝑣 and has
deg(𝑝(𝑧)) ≤ 𝑑. Consequently, the degree of πœ‡π‘£ (𝑧) must be less than or equal to 𝑑.
Proof. We’ve already used the basic idea. Consider the vectors
𝑣, 𝑇 𝑣, 𝑇 2 𝑣, . . . , 𝑇 𝑑 𝑣.
This list has 𝑑 + 1 vectors in it, and they are in the 𝑑-dimensional space 𝑉 , so they
are linearly dependent. Thus, there are constants 𝑐𝑖 , not all zero, so that
𝑐0 𝑣 + 𝑐1 𝑇 𝑣 + 𝑐2 𝑇 2 𝑣 + · · · + 𝑐𝑑 𝑇 𝑑 𝑣 = 0.
If we define
𝑝(𝑧) = 𝑐0 + 𝑐1 𝑧 + 𝑐2 𝑧 2 + · · · + 𝑐𝑑 𝑧 𝑑 ,
we get a polynomial of degree ≤ 𝑑 so that 𝑝(𝑇 )𝑣 = 0.
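The same dependence argument actually computes πœ‡π‘£ (𝑧): keep applying 𝑇 to 𝑣 until the vectors become dependent. A sketch (numpy assumed; names and tolerance are my choices):

```python
import numpy as np

def vector_minpoly(T, v, tol=1e-9):
    """Monic coefficients (low degree first) of mu_v: find the smallest m for which
    T^m v is a combination of v, T v, ..., T^{m-1} v."""
    vecs = [v.astype(complex)]
    for m in range(1, T.shape[0] + 1):
        vecs.append(T @ vecs[-1])
        A = np.array(vecs[:-1]).T                    # columns v, ..., T^{m-1} v
        c, *_ = np.linalg.lstsq(A, vecs[-1], rcond=None)
        if np.linalg.norm(A @ c - vecs[-1]) < tol:
            return np.append(-c, 1.0)                # mu_v(z) = z^m - sum_k c_k z^k
    raise AssertionError("unreachable: deg mu_v <= d by Proposition 1.5")

T = np.array([[2.0, 1.0], [0.0, 2.0]])               # mu(z) = (z - 2)^2
print(vector_minpoly(T, np.array([1.0, 0.0])))       # mu_v(z) = z - 2 for the eigenvector e1
print(vector_minpoly(T, np.array([0.0, 1.0])))       # mu_v(z) = (z - 2)^2 for e2
```

In both cases πœ‡π‘£ (𝑧) divides πœ‡(𝑧) = (𝑧 − 2)², as Proposition 1.2 requires.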
Since πœ‡(𝑧) is a monic (non-constant) polynomial, we can factor it into linear
factors as
(1.1)
πœ‡(𝑧) = (𝑧 − π‘Ÿ1 )π‘š1 (𝑧 − π‘Ÿ2 )π‘š2 · · · (𝑧 − π‘Ÿβ„“ )π‘šβ„“ ,
where π‘Ÿ1 , π‘Ÿ2 , . . . , π‘Ÿβ„“ are the distinct roots of πœ‡(𝑧). We next want to determine what
these roots are.
Theorem 1.6. The roots of πœ‡(𝑧) are exactly the eigenvalues of 𝑇 .
Proof. First, suppose that πœ† is an eigenvalue of 𝑇 . Then there is a nonzero vector
𝑣 so that (𝑇 − πœ†)𝑣 = 0. Thus, the minimal polynomial of 𝑣 must be πœ‡π‘£ (𝑧) = 𝑧 − πœ†
(why?). Since πœ‡π‘£ (𝑧) divides πœ‡(𝑧), πœ† is a root of πœ‡(𝑧). Thus every eigenvalue is a
root of πœ‡(𝑧).
Next, we need to show that every root of πœ‡(𝑧) is an eigenvalue. Write πœ‡(𝑧) as
in (1.1). We want to consider one of the roots. There is nothing special about how
we labeled the roots, so we may well call our root π‘Ÿ1 . Consider the polynomial
π‘ž(𝑧) = (𝑧 − π‘Ÿ1 )π‘š1 −1 (𝑧 − π‘Ÿ2 )π‘š2 . . . (𝑧 − π‘Ÿβ„“ )π‘šβ„“ ,
i.e., we’ve pulled out one factor of (𝑧 − π‘Ÿ1 ), so πœ‡(𝑧) = (𝑧 − π‘Ÿ1 )π‘ž(𝑧). Since π‘ž(𝑧) has
degree less than the degree of πœ‡(𝑧), it is not divisible by πœ‡(𝑧). Thus, π‘ž(𝑇 ) ΜΈ= 0.
Saying this operator is not zero means that there is a vector 𝑣 ΜΈ= 0 so that π‘ž(𝑇 )𝑣 ΜΈ= 0.
But then
(𝑇 − π‘Ÿ1 )[π‘ž(𝑇 )𝑣] = [(𝑇 − π‘Ÿ1 )π‘ž(𝑇 )]𝑣 = πœ‡(𝑇 )𝑣 = 0𝑣 = 0.
Thus, π‘Ÿ1 is an eigenvalue of 𝑇 , with eigenvector π‘ž(𝑇 )𝑣.
We can now rewrite πœ‡(𝑧) as
(1.2)
πœ‡(𝑧) = (𝑧 − πœ†1 )β„Ž1 (𝑧 − πœ†2 )β„Ž2 . . . (𝑧 − πœ†β„“ )β„Žβ„“
where πœ†1 , πœ†2 , . . . , πœ†β„“ are the distinct eigenvalues of 𝑇 .
Remark 1.7. Remember the notation for the exponents in (1.2) since we will be
referring to them.
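Theorem 1.6 is easy to check on a small example. A sketch (sympy assumed; the matrix is my choice, with minimal polynomial (𝑧 − 2)²(𝑧 − 3)):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])
I3 = sp.eye(3)

# (z - 2)(z - 3) does not annihilate A, but (z - 2)^2 (z - 3) does,
# so mu(z) = (z - 2)^2 (z - 3); its roots 2 and 3 are exactly the eigenvalues.
print((A - 2*I3) * (A - 3*I3) == sp.zeros(3, 3))      # False
print((A - 2*I3)**2 * (A - 3*I3) == sp.zeros(3, 3))   # True
print(A.eigenvals())                                  # {2: 2, 3: 1}
```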
2. Some Tools
The following Proposition is standard linear algebra. The proof is included for
completeness.
Proposition 2.1. Let π‘Š be a vector space. Suppose that we have linear operators
𝑃1 , . . . , 𝑃𝑛 that satisfy the following conditions.
(1) Each 𝑃𝑖 is a projection operator, i.e., 𝑃𝑖2 = 𝑃𝑖 .
(2) 𝑃1 + 𝑃2 + · · · + 𝑃𝑛 = 𝐼.
(3) If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝑃𝑗 = 0.
Let π‘Šπ‘– be the image of 𝑃𝑖 . Then,
π‘Š = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Š 𝑛
and 𝑃𝑖 coincides with the projection of π‘Š onto π‘Šπ‘– defined by the direct sum decomposition.
Proof. We first want to show that any 𝑀 can be written as a sum of elements in
the π‘Šπ‘– ’s. This is easy, by Condition (2) we have
𝑀 = 𝑃1 𝑀 + 𝑃2 𝑀 + · · · + 𝑃𝑛 𝑀,
and 𝑃𝑖 𝑀 ∈ π‘Šπ‘– by definition.
Next we need to show that if
(2.1)
0 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑛 ,
𝑀𝑖 ∈ π‘Šπ‘– ,
then each of the components is zero.
Since 𝑀𝑖 ∈ π‘Šπ‘– = im(𝑃𝑖 ), we can find a vector 𝑒𝑖 so that 𝑀𝑖 = 𝑃𝑖 𝑒𝑖 . Thus, we
have
(2.2)
0 = 𝑃1 𝑒1 + 𝑃2 𝑒2 + · · · + 𝑃𝑛 𝑒𝑛 .
Fix an index 𝑗 and apply 𝑃𝑗 on the left of both sides of (2.2). We get
0 = 𝑃𝑗 𝑃1 𝑒1 + 𝑃𝑗 𝑃2 𝑒2 + · · · + 𝑃𝑗 𝑃𝑗−1 𝑒𝑗−1 + 𝑃𝑗2 𝑒𝑗 + 𝑃𝑗 𝑃𝑗+1 𝑒𝑗+1 + · · · + 𝑃𝑗 𝑃𝑛 𝑒𝑛 .
By Condition (3), the terms where the indices are different are zero, so we get
0 = 𝑃𝑗2 𝑒𝑗 . But 𝑃𝑗2 𝑒𝑗 = 𝑃𝑗 𝑒𝑗 = 𝑀𝑗 , by Condition (1), so 𝑀𝑗 = 0. Since 𝑗 was
arbitrary, we conclude that all the components in (2.1) are zero.
Finally, note that the projection onto π‘Šπ‘— defined by the direct sum decomposition is to take
(2.3)
𝑀 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑛 ,
𝑀𝑖 ∈ π‘Šπ‘– ,
to 𝑀𝑗 . By the same computation as above, if we apply 𝑃𝑗 to 𝑀, the result is 𝑀𝑗 .
Thus, the projections coincide.
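Note that Proposition 2.1 does not require the projections to be orthogonal. A small sketch (numpy assumed; the two matrices are my choices) with a pair of oblique projections on C²:

```python
import numpy as np

# Two oblique (non-orthogonal) projections satisfying conditions (1)-(3):
# P1 projects onto span{e1}, P2 onto span{(-1, 1)}, each along the other's image.
P1 = np.array([[1.0, 1.0], [0.0, 0.0]])
P2 = np.array([[0.0, -1.0], [0.0, 1.0]])

assert np.allclose(P1 @ P1, P1) and np.allclose(P2 @ P2, P2)   # (1) idempotent
assert np.allclose(P1 + P2, np.eye(2))                         # (2) sum to the identity
assert np.allclose(P1 @ P2, 0) and np.allclose(P2 @ P1, 0)     # (3) products vanish

w = np.array([3.0, 4.0])
print(P1 @ w, P2 @ w)   # the unique components of w in W1 and W2
```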
For our next utility, we need a little algebra. Recall that if 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)
are polynomials, the ideal (𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) they generate is the set of all
combinations
π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧),
π‘Ÿ1 (𝑧), . . . , π‘Ÿπ‘› (𝑧) ∈ C[𝑧].
This ideal is also equal to (𝑔(𝑧)) for some polynomial 𝑔(𝑧). If the polynomials are
not all zero, 𝑔(𝑧) can’t be zero and we can make a choice for 𝑔(𝑧) by choosing
the monic one. We call 𝑔(𝑧) the greatest common divisor of the polynomials
𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧), written as gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) . Since the gcd is in
the ideal (𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)), we have
gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) = π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧),
for some π‘Ÿ1 (𝑧), . . . , π‘Ÿπ‘› (𝑧) ∈ C[𝑧] If gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)) = 1 the polynomials
are called relatively prime.
Proposition 2.2. The common roots of 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧) are exactly the roots
of 𝑔(𝑧) = gcd(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)). Consequently, the polynomials are relatively
prime if and only if they have no common roots.
Proof. Suppose that πœ† is a root of 𝑔(𝑧). Since 𝑔(𝑧) divides everything in the ideal
(𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧)), we have 𝑝𝑖 (𝑧) = π‘žπ‘– (𝑧)𝑔(𝑧), for some polynomial π‘žπ‘– (𝑧). But
then 𝑝𝑖 (πœ†) = π‘žπ‘– (πœ†)𝑔(πœ†) = π‘žπ‘– (πœ†)0 = 0. Thus, πœ† is a root of each 𝑝𝑖 (𝑧).
For the converse, suppose πœ† is a root of all of the 𝑝𝑖 (𝑧)’s. We have
𝑔(𝑧) = π‘Ÿ1 (𝑧)𝑝1 (𝑧) + π‘Ÿ2 (𝑧)𝑝2 (𝑧) + · · · + π‘Ÿπ‘› (𝑧)𝑝𝑛 (𝑧)
for some π‘Ÿπ‘– (𝑧)’s, so
𝑔(πœ†) = π‘Ÿ1 (πœ†)𝑝1 (πœ†) + · · · + π‘Ÿπ‘› (πœ†)𝑝𝑛 (πœ†) = π‘Ÿ1 (𝑧)0 + · · · + π‘Ÿπ‘› (πœ†)0 = 0.
If the polynomials 𝑝1 (𝑧), 𝑝2 (𝑧), . . . , 𝑝𝑛 (𝑧) have no common roots, then 𝑔(𝑧) has
no roots—so it must be a nonzero constant. The monic version is 1. Conversely,
if our polynomials are relatively prime, the gcd is 1, which has no roots, so the
polynomials have no common roots.
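The combination expressing the gcd can be computed explicitly with the extended Euclidean algorithm. A sketch for two polynomials (sympy assumed; the example polynomials are my choices):

```python
from sympy import symbols, gcdex, expand

z = symbols('z')
p1 = (z - 1)**2 * (z - 2)
p2 = (z - 1) * (z - 3)

# gcdex returns r1, r2, g with r1*p1 + r2*p2 = g, the monic gcd.
r1, r2, g = gcdex(p1, p2, z)
print(g)                               # z - 1: the common root 1 shows up as a root of g
print(expand(r1 * p1 + r2 * p2 - g))   # 0, verifying the identity
```

For more than two polynomials one can iterate, since gcd(𝑝1 , 𝑝2 , 𝑝3 ) = gcd(gcd(𝑝1 , 𝑝2 ), 𝑝3 ).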
3. Generalized Eigenspaces
In this section we will define the generalized eigenspaces² and show that the
whole space 𝑉 is the direct sum of the generalized eigenspaces.
To begin, let πœ†π‘– be an eigenvalue of 𝑇 . We say a vector 𝑣 ΜΈ= 0 is a generalized
eigenvector belonging to eigenvalue πœ†π‘– if
(𝑇 − πœ†π‘– )𝑝 𝑣 = 0
for some positive integer 𝑝. Obviously, for any positive integer π‘˜, (𝑇 − πœ†π‘– )𝑝+π‘˜ 𝑣 =
(𝑇 − πœ†π‘– )π‘˜ [(𝑇 − πœ†π‘– )𝑝 𝑣] = (𝑇 − πœ†π‘– )π‘˜ 0 = 0, so (𝑇 − πœ†π‘– )π‘ž 𝑣 = 0 for π‘ž ≥ 𝑝. For the moment
let π‘š be the smallest positive integer so that (𝑇 − πœ†π‘– )π‘š 𝑣 = 0. Then the minimal
polynomial of 𝑣 is πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š (why?). By Proposition 1.5, we must have
π‘š ≤ 𝑑. Thus, if (𝑇 − πœ†π‘– )𝑝 𝑣 = 0 for any power 𝑝, we must have (𝑇 − πœ†π‘– )𝑑 𝑣 = 0.
With this in mind, for each eigenvalue πœ†π‘– , we define
𝐺(πœ†π‘– ) = {𝑣 ∈ 𝑉 | (𝑇 − πœ†π‘– )𝑑 𝑣 = 0} = ker((𝑇 − πœ†π‘– )𝑑 ).
The subspace 𝐺(πœ†π‘– ) is called the generalized eigenspace belonging to the
eigenvalue πœ†π‘– . Recall that
𝐸(πœ†π‘– ) = ker((𝑇 − πœ†π‘– ))
is the eigenspace belonging to πœ†π‘– . Clearly
𝐸(πœ†π‘– ) ⊆ 𝐺(πœ†π‘– ),
but in general they are not equal.
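For a quick contrast, here is a sketch (sympy assumed; the 2 × 2 shift matrix is my example) where the eigenspace is a line but the generalized eigenspace is everything:

```python
import sympy as sp

# 2x2 Jordan block with eigenvalue 0: E(0) is 1-dimensional, G(0) is all of C^2.
A = sp.Matrix([[0, 1], [0, 0]])
E0 = A.nullspace()          # basis of ker(A - 0*I)
G0 = (A**2).nullspace()     # basis of ker((A - 0*I)^d), here d = 2
print(len(E0), len(G0))     # 1 2
```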
Exercise 3.1. Consider the linear transformation C3 → C3 given by multiplication
by the matrix
    ⎑0 1 0⎀
𝐴 = ⎒0 0 1βŽ₯ .
    ⎣0 0 0⎦
The only eigenvalue is 0. Find 𝐸(0) and 𝐺(0).
Exercise 3.2. Show that there are no “generalized eigenvalues”, i.e., if πœ‰ ∈ C and
there is a nonzero vector 𝑣 and a positive integer 𝑝 so that (𝑇 − πœ‰)𝑝 𝑣 = 0, then πœ‰ is
an eigenvalue of 𝑇 .
The following observation is useful.
Proposition 3.3. If πœ†π‘– is an eigenvalue,
𝐺(πœ†π‘– ) = ker((𝑇 − πœ†π‘– )β„Žπ‘– ).
Recall that β„Žπ‘– is the exponent of (𝑧 − πœ†π‘– ) in the minimal polynomial πœ‡(𝑧); see (1.2).
Proof. First suppose 𝑣 ∈ 𝐺(πœ†π‘– ). If 𝑣 = 0, there is nothing to prove. If 𝑣 ΜΈ= 0 then
(𝑇 − πœ†π‘– )𝑑 𝑣 = 0, so the minimal polynomial of 𝑣 must be πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š for some
positive integer π‘š ≤ 𝑑. But πœ‡π‘£ (𝑧) divides πœ‡(𝑧), so π‘š ≤ β„Žπ‘– . Thus (𝑇 − πœ†π‘– )β„Žπ‘– 𝑣 = 0.
The other direction is trivial. If 𝑣 ∈ ker((𝑇 − πœ†π‘– )β„Žπ‘– ), then (𝑇 − πœ†π‘– )𝑑 𝑣 = 0, since
𝑑 ≥ β„Žπ‘– . Thus, 𝑣 ∈ 𝐺(πœ†π‘– ).
Another useful observation is the following Proposition.
²Bad terminology, but we’re stuck with it.
Proposition 3.4. Let 𝑖 and 𝑗 be distinct indices, so πœ†π‘– ΜΈ= πœ†π‘— . Then
𝐺(πœ†π‘– ) ∩ 𝐺(πœ†π‘— ) = {0}.
Proof. Suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). Then (𝑇 − πœ†π‘– )β„Žπ‘– 𝑣 = 0. Thus, the minimal
polynomial πœ‡π‘£ (𝑧) of 𝑣 must divide (𝑧 − πœ†π‘– )β„Žπ‘– . This means that πœ‡π‘£ (𝑧) must be
πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š , for some integer 0 ≤ π‘š ≤ β„Žπ‘– (the zero vector would have
minimal polynomial (𝑧 − πœ†π‘– )0 = 1).
On the other hand, 𝑣 ∈ 𝐺(πœ†π‘— ), so (𝑇 − πœ†π‘— )β„Žπ‘— 𝑣 = 0. Thus, the polynomial
(𝑧 − πœ†π‘— )β„Žπ‘— must be divisible by πœ‡π‘£ (𝑧) = (𝑧 − πœ†π‘– )π‘š . Since πœ†π‘– ΜΈ= πœ†π‘— , the only way this
is possible is to have π‘š = 0, i.e., πœ‡π‘£ (𝑧) = 1. But then 0 = πœ‡π‘£ (𝑇 )𝑣 = 1𝑣 = 𝑣, so
𝑣 = 0.
Next, we develop a little machinery about commuting operators. If 𝐿 and 𝑆 are
linear maps 𝑉 → 𝑉 , we say they commute if 𝐿𝑆 = 𝑆𝐿. The following simple
observations are left to the reader.
Proposition 3.5. Let 𝑇 , 𝑆, and 𝑅 be linear operators 𝑉 → 𝑉 . Then, the following
properties hold.
(1) 𝑇 commutes with itself.
(2) If 𝑇 commutes with 𝑆 and 𝑅, then 𝑇 commutes with the products 𝑆𝑅 and
𝑅𝑆.
(3) If 𝑆 and 𝑇 commute, 𝑇 𝑝 commutes with 𝑆 π‘ž for any powers 𝑝 and π‘ž (which
can be negative if the operator is invertible).
(4) If 𝑇 commutes with 𝑆 and 𝑅, then 𝑇 commutes with 𝛼𝑆 + 𝛽𝑅 for any
scalars 𝛼 and 𝛽.
(5) If 𝑆 commutes with 𝑇 , then 𝑆 commutes with any polynomial 𝑝(𝑇 ) in 𝑇 .
We can use these facts to prove the following useful Proposition.
Proposition 3.6. Let 𝑆 be a linear operator 𝑉 → 𝑉 that commutes with 𝑇 and
let πœ†π‘– be an eigenvalue of 𝑇 . Then
𝑆𝐺(πœ†π‘– ) ⊆ 𝐺(πœ†π‘– )
𝑆𝐸(πœ†π‘– ) ⊆ 𝐸(πœ†π‘– )
We say that the subspaces 𝐺(πœ†π‘– ) and 𝐸(πœ†π‘– ) are invariant under 𝑆.
Proof. Suppose that 𝑣 ∈ 𝐺(πœ†π‘– ), which means that (𝑇 − πœ†π‘– )𝑑 𝑣 = 0. To test if 𝑆𝑣 is
in 𝐺(πœ†π‘– ), we need to see if (𝑇 − πœ†π‘– )𝑑 [𝑆𝑣] = 0. But 𝑆 and (𝑇 − πœ†π‘– )𝑑 commute, so
(𝑇 − πœ†π‘– )𝑑 [𝑆𝑣] = [(𝑇 − πœ†π‘– )𝑑 𝑆]𝑣 = 𝑆[(𝑇 − πœ†π‘– )𝑑 𝑣] = 𝑆0 = 0.
Thus, 𝑆𝑣 ∈ 𝐺(πœ†π‘– ). The corresponding result for the eigenspaces is left to the reader.
Corollary 3.7. The eigenspace 𝐸(πœ†π‘– ) and the generalized eigenspace 𝐺(πœ†π‘– ) are
invariant under any polynomial 𝑝(𝑇 ) in 𝑇 .
We now state the Big Theorem.
Theorem 3.8 (Big Theorem). Let πœ†1 , . . . , πœ†β„“ be the distinct eigenvalues of 𝑇 . Then
𝑉 = 𝐺(πœ†1 ) ⊕ 𝐺(πœ†2 ) ⊕ · · · ⊕ 𝐺(πœ†β„“ ),
in words, 𝑉 is the direct sum of the generalized eigenspaces.
Let’s discuss the proof, stating some important facts as Lemmas.
Consider the polynomials
(3.1)
π‘žπ‘– (𝑧) = ∏︁𝑗̸=𝑖 (𝑧 − πœ†π‘— )β„Žπ‘— ,
in other words, we take the minimal polynomial and remove the factor (𝑧 − πœ†π‘– )β„Žπ‘–
corresponding to the eigenvalue πœ†π‘– .
Lemma 3.9. The polynomials π‘ž1 (𝑧), π‘ž2 (𝑧), . . . , π‘žβ„“ (𝑧) are relatively prime.
Proof of Lemma. It will suffice to show our polynomials have no common roots.
The only possible roots are the eigenvalues πœ†1 , πœ†2 , . . . , πœ†β„“ . But πœ†1 is not a common
root, because it is not a root of π‘ž1 (𝑧), πœ†2 is not a root of π‘ž2 (𝑧), and so forth.
Since the π‘žπ‘– (𝑧)’s are relatively prime, we have
(3.2)
1 = π‘Ÿ1 (𝑧)π‘ž1 (𝑧) + π‘Ÿ2 (𝑧)π‘ž2 (𝑧) + · · · + π‘Ÿβ„“ (𝑧),
for some polynomials π‘Ÿ1 (𝑧), . . . , π‘Ÿβ„“ (𝑧). We’ll use the following notation
𝑝𝑖 (𝑧) = π‘Ÿπ‘– (𝑧)π‘žπ‘– (𝑧)
𝑃𝑖 = 𝑝𝑖 (𝑇 ) = π‘Ÿπ‘– (𝑇 )π‘žπ‘– (𝑇 ).
Plugging 𝑇 in for 𝑧 in (3.2) we have
(3.3)
𝐼 = 𝑝1 (𝑇 ) + 𝑝2 (𝑇 ) + · · · + 𝑝ℓ (𝑇 ) = 𝑃1 + 𝑃2 + · · · + 𝑃ℓ .
Lemma 3.10. For each 𝑖,
im(𝑃𝑖 ) ⊆ 𝐺(πœ†π‘– ).
Proof of Lemma. Let 𝑣 be a vector in 𝑉 . We want to show that 𝑃𝑖 𝑣 ∈ 𝐺(πœ†π‘– ). By
Proposition 3.3, it will suffice to show that
(𝑇 − πœ†π‘– )β„Žπ‘– 𝑃𝑖 𝑣 = 0.
(3.4)
But (𝑧 − πœ†π‘– )β„Žπ‘– is exactly the factor we removed from πœ‡(𝑧) to get π‘žπ‘– (𝑧). Thus,
(𝑧 − πœ†π‘– )β„Žπ‘– 𝑝𝑖 (𝑧) = π‘Ÿπ‘– (𝑧)(𝑧 − πœ†π‘– )β„Žπ‘– π‘žπ‘– (𝑧) = π‘Ÿπ‘– (𝑧)πœ‡(𝑧)
and then
(𝑇 − πœ†π‘– )β„Žπ‘– 𝑃𝑖 = π‘Ÿπ‘– (𝑇 )πœ‡(𝑇 ) = 0,
so (3.4) is certainly true.
Lemma 3.11. If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝐺(πœ†π‘— ) = 0.
Proof of Lemma. The factor (𝑧 − πœ†π‘— )β„Žπ‘— appears in π‘žπ‘– (𝑧). Thus, 𝑝𝑖 (𝑧) = 𝑔(𝑧)(𝑧 −
πœ†π‘— )β„Žπ‘— for some polynomial 𝑔(𝑧). Thus,
𝑃𝑖 𝐺(πœ†π‘— ) = 𝑝𝑖 (𝑇 )𝐺(πœ†π‘— ) = 𝑔(𝑇 )(𝑇 − πœ†π‘— )β„Žπ‘— 𝐺(πœ†π‘— ) = 0,
since (𝑇 − πœ†π‘— )β„Žπ‘— kills 𝐺(πœ†π‘— ).
Lemma 3.12. If 𝑖 ΜΈ= 𝑗, 𝑃𝑖 𝑃𝑗 = 0.
Proof of Lemma. This follows from Lemma 3.10 and Lemma 3.11.
Lemma 3.13. Each 𝑃𝑖 is a projection operator, i.e., 𝑃𝑖2 = 𝑃𝑖 .
Proof of Lemma. We have
𝐼 = 𝑃1 + 𝑃2 + · · · + 𝑃ℓ .
Multiply this by 𝑃𝑗 on the left. This gives
𝑃𝑗 = 𝑃𝑗 𝑃1 + 𝑃𝑗 𝑃2 + · · · + 𝑃𝑗 𝑃𝑗−1 + 𝑃𝑗2 + 𝑃𝑗 𝑃𝑗+1 + · · · + 𝑃𝑗 𝑃ℓ .
All the terms where the indices are not equal are zero, so we wind up with 𝑃𝑗 =
𝑃𝑗2 .
We’ve now shown that the 𝑃𝑖 ’s satisfy all the requirements of Proposition 2.1.
If we let π‘Šπ‘– = im(𝑃𝑖 ) ⊆ 𝐺(πœ†π‘– ), we have
(3.5)
𝑉 = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Šβ„“
We will be done if we show that π‘Šπ‘– = 𝐺(πœ†π‘– ).
To do this, suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). We have, of course,
𝑣 = 𝑃1 𝑣 + 𝑃2 𝑣 + · · · + 𝑃ℓ 𝑣.
Consider 𝑃𝑗 𝑣 for 𝑗 ΜΈ= 𝑖. On the one hand, 𝑃𝑗 𝑣 ∈ π‘Šπ‘— ⊆ 𝐺(πœ†π‘— ). On the other hand,
𝑃𝑗 = 𝑝𝑗 (𝑇 ) is a polynomial in 𝑇 . By Corollary 3.7, 𝐺(πœ†π‘– ) is invariant under 𝑃𝑗 , so
𝑃𝑗 𝑣 ∈ 𝐺(πœ†π‘– ). But then 𝑃𝑗 𝑣 ∈ 𝐺(πœ†π‘– ) ∩ 𝐺(πœ†π‘— ) = 0, using Proposition 3.4.
Since 𝑃𝑗 𝑣 = 0 for 𝑗 ΜΈ= 𝑖, we have 𝑣 = 𝑃𝑖 𝑣 ∈ π‘Šπ‘– , which completes the proof that
π‘Šπ‘– = 𝐺(πœ†π‘– ).
This completes our proof of the Big Theorem, Theorem 3.8.
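The whole construction can be carried out symbolically on a small example. A sketch (sympy assumed; the matrix, whose minimal polynomial is (𝑧 − 2)²(𝑧 − 3), and the helper name are my choices): build the π‘žπ‘– , find the π‘Ÿπ‘– from the identity (3.2), and check that the resulting 𝑃𝑖 behave as the Lemmas promise.

```python
import sympy as sp

z = sp.symbols('z')

def at_matrix(p, A):
    """Evaluate the polynomial p(z) at the matrix A by Horner's rule."""
    R = sp.zeros(*A.shape)
    for c in sp.Poly(p, z).all_coeffs():   # highest degree first
        R = R * A + c * sp.eye(A.shape[0])
    return R

A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])
mu = (z - 2)**2 * (z - 3)                  # minimal polynomial of A

q1 = sp.quo(mu, (z - 2)**2, z)             # remove (z - 2)^2: q1 = z - 3
q2 = sp.quo(mu, (z - 3), z)                # remove (z - 3):   q2 = (z - 2)^2
r1, r2, one = sp.gcdex(q1, q2, z)          # r1*q1 + r2*q2 = 1, as in (3.2)

P1 = at_matrix(sp.expand(r1 * q1), A)      # projection onto G(2)
P2 = at_matrix(sp.expand(r2 * q2), A)      # projection onto G(3)

print(P1 + P2 == sp.eye(3))                            # True
print(P1 * P1 == P1, P1 * P2 == sp.zeros(3, 3))        # True True
```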
4. The Jordan Decomposition
We begin with a discussion of nilpotent matrices. A linear transformation
𝑁 : 𝑉 → 𝑉 is nilpotent if 𝑁 𝑝 = 0 for some positive integer 𝑝. Of course, if
𝑁 𝑝 = 0 then 𝑁 π‘ž = 0 for any π‘ž > 𝑝.
We call the smallest positive integer 𝑛 such that 𝑁 𝑛 = 0 the degree of nilpotency of 𝑁 . Another way to characterize 𝑛 is 𝑁 𝑛 = 0 but 𝑁 𝑛−1 ΜΈ= 0.
We want to show that if the dimension of 𝑉 is 𝑑 and 𝑁 is nilpotent then 𝑁 𝑑 = 0,
i.e., 𝑛 ≤ 𝑑.
One way to see this is the following Proposition, which is useful in its own right.
Proposition 4.1. Let 𝑣 be a vector in 𝑉 and let 𝑆 : 𝑉 → 𝑉 be a linear transformation. Suppose there is a positive integer π‘š such that 𝑆 π‘š 𝑣 = 0, but 𝑆 π‘š−1 𝑣 ΜΈ= 0.
Then the π‘š vectors
𝑣, 𝑆𝑣, 𝑆 2 𝑣, . . . , 𝑆 π‘š−1 𝑣
are linearly independent.
Proof. Suppose that we have a relation
(4.1)
𝑐0 𝑣 + 𝑐1 𝑆𝑣 + 𝑐2 𝑆 2 𝑣 + · · · + π‘π‘š−1 𝑆 π‘š−1 𝑣 = 0.
We need to show that all of the coefficients are zero.
To do this, first multiply (4.1) on the left by 𝑆 π‘š−1 . This gives
(4.2)
𝑐0 𝑆 π‘š−1 𝑣 + 𝑐1 𝑆 π‘š 𝑣 + · · · + π‘π‘š−1 𝑆 2π‘š−2 𝑣 = 0.
Since 𝑆 𝑝 𝑣 = 0 for 𝑝 ≥ π‘š, this reduces to just 𝑐0 𝑆 π‘š−1 𝑣 = 0. Since 𝑆 π‘š−1 𝑣 ΜΈ= 0, we
conclude that 𝑐0 = 0.
Equation (4.1) now reduces to
𝑐1 𝑆𝑣 + 𝑐2 𝑆 2 𝑣 + · · · + π‘π‘š−1 𝑆 π‘š−1 𝑣 = 0.
We now multiply this on the left by 𝑆 π‘š−2 , which gives us
𝑐1 𝑆 π‘š−1 𝑣 + 𝑐2 𝑆 π‘š 𝑣 + · · · + π‘π‘š−1 𝑆 2π‘š−3 𝑣 = 0.
Again, all the terms but the first are zero, so 𝑐1 𝑆 π‘š−1 𝑣 = 0, from which we can
conclude that 𝑐1 = 0.
Continuing in this way, we conclude that all the coefficients are zero.
Proposition 4.2. If 𝑁 : 𝑉 → 𝑉 is nilpotent and the dimension of 𝑉 is 𝑑, then
𝑁 𝑑 = 0, i.e., the degree of nilpotency of 𝑁 is less than or equal to 𝑑.
Proof I. Suppose that 𝑁 𝑛−1 ΜΈ= 0 but 𝑁 𝑛 = 0. Since 𝑁 𝑛−1 ΜΈ= 0, there is a vector 𝑣
such that 𝑁 𝑛−1 𝑣 ΜΈ= 0. But then the 𝑛 vectors
𝑣, 𝑁 𝑣, 𝑁 2 𝑣, . . . , 𝑁 𝑛−1 𝑣
are linearly independent, so 𝑛 ≤ dim(𝑉 ) = 𝑑.
Proof II. Suppose that 𝑁 𝑛 = 0 but 𝑁 𝑛−1 ΜΈ= 0. Then the polynomial 𝑧 𝑛 annihilates
𝑁 , but 𝑧 𝑛−1 does not. Thus, the minimal polynomial of 𝑁 is 𝑧 𝑛 . We know the
degree of the minimal polynomial must be ≤ 𝑑.
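Both proofs are easy to watch in action. A sketch (numpy assumed; the 3 × 3 shift matrix is my example), where the degree of nilpotency is exactly 𝑛 = 𝑑 = 3:

```python
import numpy as np

N = np.diag([1.0, 1.0], k=1)          # 3x3 shift: ones on the superdiagonal
N2 = np.linalg.matrix_power(N, 2)
N3 = np.linalg.matrix_power(N, 3)
print(np.any(N2), np.allclose(N3, 0))   # N^2 != 0 but N^3 = 0

# As in Proposition 4.1, v, Nv, N^2 v are independent when N^2 v != 0:
v = np.array([0.0, 0.0, 1.0])
print(np.linalg.matrix_rank(np.column_stack([v, N @ v, N2 @ v])))   # 3
```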
We continue to investigate our fixed linear transformation 𝑇 : 𝑉 → 𝑉 , where 𝑉
has dimension 𝑑.
The goal of this section is to prove the following Theorem, which often suffices
to solve a problem without going to the full Jordan Form.
Theorem 4.3 (Jordan Decomposition). If 𝑇 : 𝑉 → 𝑉 is a linear transformation,
there are unique linear transformations 𝑆 and 𝑁 from 𝑉 to 𝑉 so that the following
conditions hold.
(JD1) 𝑇 = 𝑆 + 𝑁 .
(JD2) 𝑆𝑁 = 𝑁 𝑆, i.e., 𝑆 and 𝑁 commute.
(JD3) 𝑆 is diagonalizable.
(JD4) 𝑁 is nilpotent.
We’ll divide the rest of this section into the proof of existence and the proof of
uniqueness.
4.1. Proof of Existence. As usual, let πœ†1 , πœ†2 , . . . , πœ†β„“ be the distinct eigenvalues
of 𝑇 . From our Big Theorem, we have
(4.3)
𝑉 = 𝐺(πœ†1 ) ⊕ 𝐺(πœ†2 ) ⊕ · · · ⊕ 𝐺(πœ†β„“ ).
Let 𝑆 : 𝑉 → 𝑉 be the linear transformation that is given on 𝐺(πœ†π‘– ) by multiplication by πœ†π‘– . Thus, if 𝑣 ∈ 𝑉 is decomposed as
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣ℓ ,
with respect to the direct sum decomposition (4.3), we have
𝑆𝑣 = πœ†1 𝑣1 + πœ†2 𝑣2 + · · · + πœ†β„“ 𝑣ℓ .
Another way to say it is that
(4.4)
𝑆 = πœ†1 𝑃1 + πœ†2 𝑃2 + · · · + πœ†β„“ 𝑃ℓ .
Since the 𝑃𝑖 ’s are polynomials in 𝑇 , we see that 𝑆 is a linear combination of polynomials in 𝑇 , and so is a polynomial in 𝑇 . Thus, 𝑆 commutes with 𝑇 , which is also
easy to check from the definition of 𝑆.
Exercise 4.4. Use the fact that the generalized eigenspaces are invariant under 𝑇
to show that 𝑆𝑇 = 𝑇 𝑆.
We now define 𝑁 = 𝑇 − 𝑆. It’s clear that 𝑁 commutes with both 𝑆 and 𝑇 .
Indeed, 𝑁 is a polynomial in 𝑇 . We need to show that 𝑁 is nilpotent.
To do this, suppose that 𝑣 ∈ 𝐺(πœ†π‘– ). We then have
(𝑇 − 𝑆)𝑣 = 𝑇 𝑣 − 𝑆𝑣 = 𝑇 𝑣 − πœ†π‘– 𝑣 = (𝑇 − πœ†π‘– )𝑣.
Since (𝑇 − 𝑆) commutes with (𝑇 − πœ†π‘– ), we have
(𝑇 −𝑆)2 𝑣 = (𝑇 −𝑆)[(𝑇 −𝑆)𝑣] = (𝑇 −𝑆)(𝑇 −πœ†π‘– )𝑣 = (𝑇 −πœ†π‘– )[(𝑇 −𝑆)𝑣] = (𝑇 −πœ†π‘– )2 𝑣.
Continuing in this way, we get
(𝑇 − 𝑆)𝑝 𝑣 = (𝑇 − πœ†π‘– )𝑝 𝑣.
Thus,
𝑁 𝑑 𝑣 = (𝑇 − 𝑆)𝑑 𝑣 = (𝑇 − πœ†π‘– )𝑑 𝑣 = 0,
since 𝑣 ∈ 𝐺(πœ†π‘– ). For an arbitrary 𝑣 ∈ 𝑉 , we have
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣ℓ ,
𝑣𝑖 ∈ 𝐺(πœ†π‘– ),
and so
𝑁 𝑑 𝑣 = 𝑁 𝑑 𝑣1 + 𝑁 𝑑 𝑣2 + · · · + 𝑁 𝑑 𝑣ℓ = 0 + 0 + · · · + 0 = 0.
We’ve now constructed 𝑆 and 𝑁 satisfying the required four properties, so the
proof of existence is complete.
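For a concrete instance of the construction (numpy assumed; the matrix is my choice, with 𝐺(2) the first two coordinates and 𝐺(5) the third):

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
S = np.diag([2.0, 2.0, 5.0])   # multiplication by lambda_i on each G(lambda_i)
N = T - S                      # here N has a single 1 in the (1, 2) entry

assert np.allclose(T, S + N)                          # (JD1)
assert np.allclose(S @ N, N @ S)                      # (JD2) S and N commute
assert np.allclose(np.linalg.matrix_power(N, 2), 0)   # (JD4) nilpotent; S is diagonal, (JD3)
print(N)
```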
4.2. Proof of Uniqueness. Denote the Jordan Decomposition we have constructed
in the last subsection as 𝑇 = 𝑆old +𝑁old . Suppose that we have two transformations
𝑆 and 𝑁 so that 𝑇 = 𝑆 + 𝑁 and (JD1) through (JD4) hold. We want to prove
that 𝑆 = 𝑆old and 𝑁 = 𝑁old . Notice that 𝑇 , 𝑆 and 𝑁 must all commute with each
other.
First, let’s determine the eigenvalues of 𝑆. Suppose that πœ‰ is an eigenvalue of 𝑆
and let 𝑣 be an eigenvector for πœ‰. Then
(𝑇 − 𝑆)𝑣 = 𝑇 𝑣 − 𝑆𝑣 = 𝑇 𝑣 − πœ‰π‘£ = (𝑇 − πœ‰)𝑣.
Since (𝑇 − 𝑆) and 𝑇 − πœ‰ commute, we have
(𝑇 − 𝑆)2 𝑣 = (𝑇 − 𝑆)[(𝑇 − 𝑆)𝑣] = (𝑇 − 𝑆)[(𝑇 − πœ‰)𝑣] = (𝑇 − πœ‰)[(𝑇 − 𝑆)𝑣] = (𝑇 − πœ‰)2 𝑣.
Continuing this argument, we have (𝑇 − 𝑆)𝑝 𝑣 = (𝑇 − πœ‰)𝑝 𝑣 for any power 𝑝. But
then
(𝑇 − πœ‰)𝑑 𝑣 = (𝑇 − 𝑆)𝑑 𝑣 = 𝑁 𝑑 𝑣 = 0,
since 𝑁 is nilpotent. From this we conclude, as in Exercise 3.2, that πœ‰ is an
eigenvalue of 𝑇 . We can also conclude that 𝑣, which started out as an eigenvector
of 𝑆, is in the generalized eigenspace of 𝑇 belonging to πœ‰.
This gives us the following Lemma.
Lemma 4.5. Let πœ‰ be an eigenvalue of 𝑆. Then πœ‰ is an eigenvalue of 𝑇 and
𝐸𝑆 (πœ‰) ⊆ 𝐺𝑇 (πœ‰),
i.e., the eigenspace of 𝑆 is contained in the generalized eigenspace of 𝑇 .
Next, let πœ† be an eigenvalue of 𝑇 with eigenvector 𝑣. Then
(𝑆 − 𝑇 )𝑣 = 𝑆𝑣 − 𝑇 𝑣 = 𝑆𝑣 − πœ†π‘£ = (𝑆 − πœ†)𝑣.
Since (𝑆 − 𝑇 ) and (𝑆 − πœ†) commute, we can use the same procedure as above to
show that
(𝑆 − 𝑇 )𝑝 𝑣 = (𝑆 − πœ†)𝑝 𝑣
for any exponent 𝑝. But 𝑆 − 𝑇 = −𝑁 , so
(𝑆 − πœ†)𝑑 𝑣 = (𝑆 − 𝑇 )𝑑 𝑣 = (−𝑁 )𝑑 𝑣 = (−1)𝑑 𝑁 𝑑 𝑣 = 0,
since 𝑁 𝑑 = 0. As before, this implies that πœ† is an eigenvalue of 𝑆.
We’ve now shown that the eigenvalues of 𝑆 and 𝑇 are exactly the same, and for
each eigenvalue πœ†π‘– we have
𝐸𝑆 (πœ†π‘– ) ⊆ 𝐺𝑇 (πœ†π‘– ).
Of course we have
𝑉 = 𝐺𝑇 (πœ†1 ) ⊕ 𝐺𝑇 (πœ†2 ) ⊕ · · · ⊕ 𝐺𝑇 (πœ†β„“ ).
Since 𝑆 is diagonalizable, we have
𝑉 = 𝐸𝑆 (πœ†1 ) ⊕ 𝐸𝑆 (πœ†2 ) ⊕ · · · ⊕ 𝐸𝑆 (πœ†β„“ ).
We can then cite the following Lemma.
Lemma 4.6. Let 𝑉 be a vector space and suppose that 𝑉1 , 𝑉2 , . . . , 𝑉𝑛 , π‘Š1 , π‘Š2 , . . . , π‘Šπ‘›
are subspaces of 𝑉 so that
(1) π‘Šπ‘– ⊆ 𝑉𝑖 for all 𝑖 = 1, 2, . . . , 𝑛.
(2) 𝑉 = 𝑉1 ⊕ 𝑉2 ⊕ · · · ⊕ 𝑉𝑛 .
(3) 𝑉 = π‘Š1 ⊕ π‘Š2 ⊕ · · · ⊕ π‘Šπ‘› .
Then
π‘Šπ‘– = 𝑉𝑖 ,
𝑖 = 1, 2, . . . , 𝑛.
Proof. Of course, any vector 𝑣 can be written uniquely as
𝑣 = 𝑣1 + 𝑣2 + · · · + 𝑣𝑛 ,
where 𝑣𝑖 ∈ 𝑉𝑖 .
We want to prove π‘Šπ‘– = 𝑉𝑖 for all 𝑖. Suppose that 𝑣 ∈ 𝑉𝑗 . Then its decomposition
as above is
(4.5)
𝑣 = 0 + 0 + · · · + 0 + 𝑣 + 0 + · · · + 0,
i.e., all the components are zero except for the one in 𝑉𝑗 , which is 𝑣. But we can
also write 𝑣 with respect to the direct sum of the π‘Šπ‘— ’s, so
(4.6)
𝑣 = 𝑀1 + 𝑀2 + · · · + 𝑀𝑗−1 + 𝑀𝑗 + 𝑀𝑗+1 + · · · + 𝑀𝑛 ,
where 𝑀𝑖 ∈ π‘Šπ‘– . But π‘Šπ‘– ⊆ 𝑉𝑖 , so each 𝑀𝑖 ∈ 𝑉𝑖 . Thus, in (4.6), 𝑣 is written as a
sum of components where the 𝑖th component is in 𝑉𝑖 . There is only one way to do
this, namely (4.5). Thus, we have 𝑀𝑖 = 0 for 𝑖 ΜΈ= 𝑗 and 𝑣 = 𝑀𝑗 ∈ π‘Šπ‘— .
This shows 𝑉𝑗 ⊆ π‘Šπ‘— , so the proof is complete.
Applying this to the case at hand, we conclude that
𝐸𝑆 (πœ†π‘– ) = 𝐺𝑇 (πœ†π‘– ),
in other words, 𝑆 is given by multiplication by πœ†π‘– on 𝐺𝑇 (πœ†π‘– ). This is exactly the
definition of 𝑆old in our construction, so we conclude that 𝑆 = 𝑆old . Then, of
course, 𝑁 = 𝑇 − 𝑆 = 𝑇 − 𝑆old = 𝑁old , so the proof of uniqueness is complete.
Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX
79409-1042
E-mail address: lance.drager@ttu.edu