Jordan Canonical Forms of Linear Operators

Undergraduate Research Opportunity Programme in Science
(UROPS)
Jordan Canonical Forms of Linear Operators
Submitted by
Teo Koon Soon
Supervised by
Dr. Victor Tan
Department of Mathematics
National University of Singapore
Academic Year 2001/2002 Semester 2
Table of Contents
Introduction
Chapter 0: Preliminaries
Chapter 1: Fundamentals of Jordan Canonical Form
Chapter 2: Relationship between Minimum Polynomial and Jordan Canonical Form
Chapter 3: Finding the Jordan Canonical Form and Basis
Conclusion
References
Introduction
Any linear transformation can be expressed by its matrix representation. In an
ideal world, all linear operators would be diagonalisable; the advantage lies in the
simplicity of the description, as such an operator has a diagonal matrix representation.
Unfortunately, there are linear operators that are not diagonalisable. However, a
'near-diagonal' matrix, called a canonical form, may represent a non-diagonalisable
linear operator.
This report focuses on the underlying principles used in constructing the Jordan
canonical forms of linear operators and in determining their associated Jordan canonical
bases. We deal only with one of the most common canonical forms, the Jordan canonical
form, which can be used to represent a linear operator whenever its characteristic
polynomial splits. In particular, every linear operator on a finite-dimensional complex
vector space has a Jordan canonical form.
0. Preliminaries
Definition 0.1: A polynomial f(x) splits if there are scalars c, a1, … , an (not necessarily
distinct) such that f(x) = c(x-a1)(x-a2)…(x-an).
Remark: When a polynomial splits over a particular field, the scalars c, a1, … , an are
elements of that field.
Definition 0.2: Let T : V → V be a linear operator on a vector space V. A subspace W is
T-invariant if T(v) ∈ W for all v ∈ W, i.e., T(W) ⊆ W.
Definition 0.3: Let T : V → V be a linear operator on a vector space V, and let W be a
T-invariant subspace of V. The restriction of T to W is defined to be the function
Tw : W → W where Tw(v) = T(v) for all v ∈ W.
Proposition 0.4: Let T : V → V be a linear operator. If f(T) and g(T) are any
polynomials in T, then f(T)g(T) = g(T)f(T).
Proof: Let f(T) = akT^k + ak-1T^(k-1) + … + a0I and g(T) = bnT^n + bn-1T^(n-1) + … + b0I.

Suppose first that g'(T) consists of a single term, say g'(T) = bmT^m. Then

f(T)g'(T) = (akT^k + ak-1T^(k-1) + … + a0I)(bmT^m)
= akbmT^(k+m) + ak-1bmT^(k+m-1) + … + a0bmT^m
= (bmT^m)(akT^k + ak-1T^(k-1) + … + a0I) = g'(T)f(T).

Using the above result, we can prove the proposition for any polynomial g(T):

f(T)g(T) = f(T)(bnT^n + bn-1T^(n-1) + … + b0I)
= f(T)(bnT^n) + f(T)(bn-1T^(n-1)) + … + f(T)(b0I)
= (bnT^n)f(T) + (bn-1T^(n-1))f(T) + … + (b0I)f(T)
= (bnT^n + bn-1T^(n-1) + … + b0I)f(T)
= g(T)f(T). ∎
Proposition 0.5: Let V be of finite dimension and let T : V → V be a linear operator.
Let W be a T-invariant subspace of V. Then the characteristic polynomial of Tw divides
the characteristic polynomial of T.
Proof: Let β = { v1, v2, … , vk } be a basis for W, and extend it to a basis
S = { v1, v2, … , vk, vk+1, … , vn } for V. Let A = [T]S and B1 = [Tw]β.

For each i = 1, 2, … , k we have T(vi) ∈ V, so we may write
T(vi) = a1v1 + … + akvk + ak+1vk+1 + … + anvn for some scalars a1, … , an.

But T(vi) ∈ W and β is a basis for W, so T(vi) can be expressed as a linear
combination of elements of β. Hence ak+1vk+1 + ak+2vk+2 + … + anvn = 0. But
{ vk+1, … , vn } is linearly independent, therefore ak+1 = ak+2 = … = an = 0.

Recall that [T(vi)]S is the i-th column of [T]S. Hence

A = [ B1 B2 ]
    [ O  B3 ]

where O is the (n-k) × k zero matrix, and B2 and B3 are matrices of suitable sizes.

Let f(t) be the characteristic polynomial of T and g(t) be that of Tw. Then

f(t) = | A - tIn | = | B1 - tIk   B2        |
                     | O          B3 - tIn-k |
     = | B1 - tIk | · | B3 - tIn-k | = g(t) · | B3 - tIn-k |.

Hence g(t) divides f(t). ∎
1. Fundamentals of Jordan Canonical Form
For the purpose of this report, we will assume that the characteristic polynomial
of a transformation T always splits, i.e., we assume that the transformation is performed
over the complex field.
Recall that T is diagonalisable if and only if T has n linearly independent
eigenvectors, where n = dim(V). However, if T has fewer than n linearly independent
eigenvectors, we can still construct the Jordan canonical form of T by extending the
eigenspaces to generalised eigenspaces, from which we select ordered bases whose union
is a basis for V.
In this chapter, we will prove that T has a Jordan canonical form. We will also
demonstrate how to select the Jordan canonical basis from the generalised eigenspaces,
and why they form a basis.
But before we can go into the important theorems, we first need to define a few
basic concepts. In particular, we need to know what a Jordan canonical form looks
like, and thus understand why it is ‘almost-diagonal’.
Definition 1.1: A square matrix is called a Jordan block corresponding to a scalar λ if it
has λ along the diagonal, 1 along the superdiagonal, and 0 everywhere else.
[ λ 1 0 … 0 0 ]
[ 0 λ 1 … 0 0 ]
[ . . .    . . ]
[ 0 0 0 … λ 1 ]
[ 0 0 0 … 0 λ ]
Definition 1.2: A matrix is in Jordan canonical form if it is a block diagonal matrix with
Jordan blocks along the diagonal.
[ 2 1 0 0 0 ]
[ 0 2 1 0 0 ]
[ 0 0 2 0 0 ]
[ 0 0 0 4 1 ]
[ 0 0 0 0 4 ]
As an example, the above matrix is in Jordan canonical form, consisting of two
Jordan blocks, one (3 by 3) corresponding to the eigenvalue 2 and the other (2 by 2)
corresponding to the eigenvalue 4.
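This block-diagonal structure is straightforward to generate by machine. The following sketch (assuming Python with NumPy and SciPy; jordan_block is our own helper, not a library routine) assembles the 5 × 5 example above from its two blocks.

```python
# Build the 5 x 5 example above out of its two Jordan blocks.
import numpy as np
from scipy.linalg import block_diag

def jordan_block(eigenvalue, size):
    """Jordan block (Definition 1.1): eigenvalue on the diagonal,
    1 on the superdiagonal, 0 everywhere else."""
    return eigenvalue * np.eye(size) + np.eye(size, k=1)

J = block_diag(jordan_block(2, 3), jordan_block(4, 2))
print(J.astype(int))
# [[2 1 0 0 0]
#  [0 2 1 0 0]
#  [0 0 2 0 0]
#  [0 0 0 4 1]
#  [0 0 0 0 4]]
```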
We will next introduce the concept of generalised eigenvectors, which are
contained in generalised eigenspaces and will eventually be used to form the required
Jordan canonical basis.
Definition 1.3: Let T : V → V be a linear operator. A non-zero vector x ∈ V is a
generalised eigenvector of T corresponding to the scalar λ if (T - λI)^p(x) = 0 for some
positive integer p.

Remark: If p is the smallest integer satisfying the above equation, then
v = (T - λI)^(p-1)(x) is an eigenvector of T corresponding to λ, since (T - λI)(v) = 0.
Therefore λ is an eigenvalue of T.
Definition 1.4: Let T : V → V be a linear operator, and let λ be an eigenvalue of T. The
generalised eigenspace of T corresponding to λ, denoted by Kλ(T), is the subset of V
given by Kλ(T) = { x ∈ V | (T - λI)^p(x) = 0 for some positive integer p }.
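Computationally, Kλ(T) can be found as an ordinary null space: a power equal to dim(V) always suffices in the definition (Proposition 1.6 below sharpens this to the multiplicity of λ). A sketch, assuming Python with SymPy and a matrix A representing T (generalised_eigenspace is our own helper):

```python
# Basis for the generalised eigenspace K_lambda, computed as the null
# space of (A - lam*I)^n with n = dim(V): any vector killed by some
# power of (A - lam*I) is killed by the n-th power.
from sympy import Matrix, eye

def generalised_eigenspace(A, lam):
    n = A.shape[0]
    return ((A - lam * eye(n)) ** n).nullspace()

A = Matrix([[2, 1], [0, 2]])
print(generalised_eigenspace(A, 2))   # two vectors: K_2 is the whole space
print(A.eigenvects()[0][2])           # while the eigenspace E_2 is 1-dimensional
```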
We will first prove that a vector space V is a direct sum of the generalised
eigenspaces corresponding to each eigenvalue of T. Then we can proceed to prove that
the union of the bases that we select from these generalised eigenspaces forms a basis for
V.
Prior to that, the following theorem and proposition (Theorem 1.5 and Proposition
1.6) have to be established. They will be needed in the crucial proofs that follow.
Theorem 1.5: Let T : V → V be a linear operator, and let λ be an eigenvalue of T. Then
(a) Kλ(T) is a T-invariant subspace of V containing Eλ (the eigenspace of T
corresponding to λ);
(b) for any scalar μ ≠ λ, the restriction of T - μI to Kλ(T) is one-to-one.

Proof: (a) We will prove this by showing that (i) Kλ(T) is a subspace, (ii) Kλ(T) is
T-invariant, and (iii) Kλ(T) contains Eλ.

(i) First of all, (T - λI)^p(0) = 0 for any positive integer p. So 0 ∈ Kλ(T) and hence it is
non-empty.

Now let u, v ∈ Kλ(T) and let k be a non-zero scalar. So (T - λI)^p(u) = (T - λI)^q(v) = 0
for some positive integers p and q.

Then (T - λI)^(p+q)(u+v) = (T - λI)^q[(T - λI)^p(u)] + (T - λI)^p[(T - λI)^q(v)] = 0.
So u+v ∈ Kλ(T) and Kλ(T) is closed under addition.

Also, (T - λI)^p(ku) = k(T - λI)^p(u) = 0. So ku ∈ Kλ(T) and Kλ(T) is closed under
scalar multiplication.

Hence Kλ(T) is a subspace.

(ii) Let v ∈ Kλ(T). Then (T - λI)^p(v) = 0 for some positive integer p.

By Proposition 0.4, (T - λI)^p[T(v)] = T[(T - λI)^p(v)] = T(0) = 0.

Hence T(v) ∈ Kλ(T) and Kλ(T) is T-invariant.

(iii) Let v ∈ Eλ. Then (T - λI)(v) = 0, and so v ∈ Kλ(T). Hence Eλ ⊆ Kλ(T).

(b) We need to show that ker(T - μI) = {0}, where the domain of T - μI is Kλ(T). We
shall prove this by way of contradiction: suppose v is a non-zero vector belonging to
Kλ(T) such that (T - μI)(v) = 0.

Let p be the smallest positive integer for which (T - λI)^p(v) = 0, and let
u = (T - λI)^(p-1)(v). So (T - λI)(u) = (T - λI)^p(v) = 0, and therefore T(u) = λu.

Also, (T - μI)(u) = (T - μI)(T - λI)^(p-1)(v) = (T - λI)^(p-1)(T - μI)(v) = (T - λI)^(p-1)(0) = 0,
so T(u) = μu.

Hence λu = T(u) = μu, which implies that u = 0 since λ ≠ μ.

We now have (T - λI)^(p-1)(v) = 0, which contradicts the fact that p is the smallest
integer for which (T - λI)^p(v) = 0.

Hence v = 0 and the restriction of T - μI to Kλ(T) is one-to-one. ∎
Proposition 1.6: Let V be of finite dimension and let T : V → V be a linear operator.
Suppose that λ is an eigenvalue of T with multiplicity m. Then
(a) dim(Kλ(T)) ≤ m;
(b) Kλ(T) = Ker((T - λI)^m).
-9-
Proof: (a) Let W = Kλ(T) and let h(t) be the characteristic polynomial of Tw. (W is
T-invariant by Theorem 1.5(a), so Tw is defined.)

Claim: λ is the only eigenvalue of Tw.

Proof of claim: Since λ is an eigenvalue of T, (T - λI)(v) = 0 for some non-zero v ∈ V.
Then v ∈ W, so λ is an eigenvalue of Tw.

Now suppose μ ≠ λ is also an eigenvalue of Tw, so that (T - μI)(w) = 0 for some
non-zero w ∈ W. By Theorem 1.5(b), the restriction of T - μI to W is one-to-one, so
w = 0 is the only solution. We have arrived at a contradiction, and hence we have
proven the claim.

Since λ is the only eigenvalue of Tw, we have h(t) = (t - λ)^d, where d = dim(W).

By Proposition 0.5, h(t) divides the characteristic polynomial of T, in which (t - λ)
appears with exponent m. Hence we conclude that d ≤ m.

(b) We will show that Ker((T - λI)^m) ⊆ Kλ(T) and Kλ(T) ⊆ Ker((T - λI)^m).

Let v ∈ Ker((T - λI)^m); then (T - λI)^m(v) = 0. Therefore v ∈ Kλ(T), and hence
Ker((T - λI)^m) ⊆ Kλ(T).

Now let v ∈ Kλ(T). Recall that W = Kλ(T) and h(t) is the characteristic polynomial
of Tw, so h(Tw) = 0 by the Cayley-Hamilton theorem.

From (a), h(Tw) = (Tw - λI)^d = 0, so (T - λI)^d(v) = 0. Since d ≤ m, we have
(T - λI)^m(v) = 0. Hence v ∈ Ker((T - λI)^m), and therefore Kλ(T) ⊆ Ker((T - λI)^m).

Hence we have proven that Kλ(T) = Ker((T - λI)^m). ∎
Now we can go into detail to prove that V is a direct sum of the generalised
eigenspaces corresponding to each of the eigenvalues of T.
Theorem 1.7: Let T : V → V be a linear operator on a finite-dimensional vector space V,
and let λ1, … , λk be the distinct eigenvalues of T. Then
(i) V = Kλ1(T) + … + Kλk(T), and
(ii) Kλi(T) ∩ Kλj(T) = {0} if i ≠ j.
Proof: (i) The proof is by mathematical induction on k. Let f(t) be the characteristic
polynomial of T.

When k = 1, f(t) = (t - λ1)^m, where m is the multiplicity of λ1. By the Cayley-Hamilton
theorem, f(T) = (T - λ1I)^m = 0. So for all v ∈ V, (T - λ1I)^m(v) = 0. Therefore
v ∈ Kλ1(T), and hence V = Kλ1(T).

Suppose the result is true whenever T has k-1 distinct eigenvalues.

Now suppose T has k distinct eigenvalues, and let m be the multiplicity of λk. Then
f(t) = (t - λk)^m g(t) for some g(t) not divisible by (t - λk). Let W = R((T - λkI)^m), the
range of (T - λkI)^m.

Claim 1: W is T-invariant.

Suppose v ∈ W. So v = (T - λkI)^m(x) for some x ∈ V. Then T(v) = T(T - λkI)^m(x) =
(T - λkI)^m(T(x)). Therefore T(v) ∈ W and hence W is T-invariant.

Claim 2: (T - λkI)^m maps Kλi(T) onto itself for i ≠ k.

Let x be any vector belonging to Kλi(T). So (T - λiI)^p(x) = 0 for some positive
integer p. Then (T - λiI)^p(T - λkI)^m(x) = (T - λkI)^m(T - λiI)^p(x) = (T - λkI)^m(0) = 0.
Hence (T - λkI)^m(x) ∈ Kλi(T), and (T - λkI)^m maps Kλi(T) into itself.

Since λk ≠ λi, by Theorem 1.5(b) the restriction of T - λkI to Kλi(T) is one-to-one,
and hence so is the restriction of (T - λkI)^m. A one-to-one linear map of the
finite-dimensional space Kλi(T) into itself is also onto, so (T - λkI)^m maps Kλi(T)
onto itself for i ≠ k, and we have proven the claim.

By Claim 2, Kλi(T) ⊆ W for i ≠ k. Now let x be a non-zero vector in Kλi(T), i ≠ k, and
let p be the smallest positive integer with (T - λiI)^p(x) = 0. Since W is T-invariant,
(T - λiI)^(p-1)(x) ∈ W; it is non-zero and satisfies (T - λiI)[(T - λiI)^(p-1)(x)] = 0.
Therefore λi is an eigenvalue of Tw for each i ≠ k, so Tw has at least k-1 distinct
eigenvalues.

Now we want to show that λk is not an eigenvalue of Tw.

Suppose λk is an eigenvalue of Tw. Then (T - λkI)(v) = 0 for some non-zero v ∈ W.
Now v = (T - λkI)^m(y) for some y ∈ V, and (T - λkI)^(m+1)(y) = (T - λkI)(v) = 0, so
y ∈ Kλk(T). Since Kλk(T) = Ker((T - λkI)^m) by Proposition 1.6(b), we have
v = (T - λkI)^m(y) = 0. This is a contradiction, and hence Tw has exactly the k-1
distinct eigenvalues λ1, … , λk-1.

Let x ∈ V; then (T - λkI)^m(x) ∈ W. Since Tw has exactly k-1 distinct eigenvalues, the
induction hypothesis applies: there exist wi ∈ Kλi(Tw), i = 1, … , k-1, such that
(T - λkI)^m(x) = w1 + … + wk-1.

Since Kλi(Tw) ⊆ Kλi(T), Claim 2 provides vi ∈ Kλi(T) such that (T - λkI)^m(vi) = wi
for i = 1, 2, … , k-1. Therefore

(T - λkI)^m(x) = (T - λkI)^m(v1) + … + (T - λkI)^m(vk-1),

so (T - λkI)^m(x - v1 - … - vk-1) = 0.

Therefore x - v1 - … - vk-1 ∈ Kλk(T), and hence there exists vk ∈ Kλk(T) such that
x = v1 + v2 + … + vk. We can now conclude that V = Kλ1(T) + … + Kλk(T).

(ii) Let x ∈ Kλi(T) ∩ Kλj(T), where i ≠ j. Since x ∈ Kλi(T), (T - λiI)^p(x) = 0 for some
positive integer p. By Theorem 1.5(b), T - λiI is one-to-one on Kλj(T), and hence so is
(T - λiI)^p. Since x ∈ Kλj(T) and (T - λiI)^p(x) = 0 = (T - λiI)^p(0), we conclude that
x = 0. ∎
Having established any vector space V as a sum of the generalised eigenspaces of
a linear transformation T, we have the following theorem, which allows us to select a
basis for V from the bases for the Kλi(T).
Theorem 1.8: Let T : V → V be a linear operator. Let λ1, … , λk be the distinct
eigenvalues of T, with multiplicities m1, … , mk respectively. Let Bi be an ordered basis
for Kλi(T), where i = 1, 2, … , k. Then B = B1 ∪ B2 ∪ … ∪ Bk is an ordered basis for V.
Proof: Let di = dim(Kλi(T)) and let q be the number of elements in B.

By Theorem 1.7(i), V = Kλ1(T) + … + Kλk(T), and each Kλi(T) is spanned by Bi. So B
spans V, and therefore q ≥ dim(V).

Since Kλi(T) ∩ Kλj(T) = {0} for i ≠ j (Theorem 1.7(ii)), the sets Bi are pairwise
disjoint, so q = d1 + … + dk.

By Proposition 1.6(a), d1 + … + dk ≤ m1 + … + mk. Therefore q ≤ dim(V), since
dim(V) = m1 + … + mk (the characteristic polynomial splits).

Hence q = dim(V), and a spanning set of dim(V) vectors is a basis, so B is a basis
for V. ∎
Corollary 1.9: Let T : V → V be a linear operator on a vector space V of finite
dimension. Then
(a) dim(Kλi(T)) = mi for all i;
(b) T is diagonalisable if and only if Eλ = Kλ(T) for every eigenvalue λ of T.

Proof:
(a) From the proof of Theorem 1.8, d1 + … + dk = q = dim(V) = m1 + … + mk. From
Proposition 1.6(a), di ≤ mi for all i. Hence di = mi for all i.

(b) Clearly, Eλ ⊆ Kλ(T). Also, T is diagonalisable if and only if dim(Eλ) equals the
multiplicity of λ for every eigenvalue λ. From (a), dim(Kλ(T)) equals the multiplicity
of λ. Hence T is diagonalisable if and only if dim(Eλ) = dim(Kλ(T)), which, since
Eλ ⊆ Kλ(T), holds if and only if Eλ = Kλ(T). ∎
Theorem 1.8 permits us to use the union of bases for the generalised eigenspaces
as a basis for V. However, not all bases for the generalised eigenspaces will give us a
Jordan canonical basis. We will now examine how we should select these bases so that
we can obtain a Jordan canonical basis for any vector space V. The required basis is, in
fact, obtained from cycles of generalised eigenvectors, defined below.
Definition 1.10: Let T : V → V be a linear operator. Let x be a generalised eigenvector
of T corresponding to the eigenvalue λ, and let p be the smallest positive integer such
that (T - λI)^p(x) = 0. Then the cycle of generalised eigenvectors of T corresponding to λ
generated by x is the ordered set { (T - λI)^(p-1)(x), (T - λI)^(p-2)(x), … , (T - λI)(x), x }.
The initial vector and the end vector of the cycle are the first and last elements of the set
respectively. The length of the cycle is p.

Remark: The initial vector of the cycle is the only eigenvector of T in the cycle.
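Definition 1.10 is directly algorithmic: starting from the end vector x, apply (T - λI) repeatedly until the result is 0, then list the vectors in reverse order. A sketch, assuming Python with SymPy (the helper name is ours, and x is assumed to be a genuine generalised eigenvector so that the loop terminates):

```python
# Build the cycle of generalised eigenvectors generated by x
# (Definition 1.10), listed from initial vector to end vector.
from sympy import Matrix, eye

def cycle_from_end_vector(A, lam, x):
    N = A - lam * eye(A.shape[0])
    chain = [x]
    while not (N * chain[-1]).is_zero_matrix:   # stop once the next image is 0
        chain.append(N * chain[-1])
    return list(reversed(chain))                # initial vector first

A = Matrix([[2, 1, 0], [0, 2, 1], [0, 0, 2]])
cycle = cycle_from_end_vector(A, 2, Matrix([0, 0, 1]))
# cycle has length 3; its first element, (1, 0, 0)^T, is an eigenvector of A.
```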
In order to use these cycles as a basis for the vector space V, we will need to show
that a union of disjoint cycles of generalised eigenvectors forms a basis for the
generalised eigenspace corresponding to λ. Theorem 1.12 proves this directly, but we
will first need Theorem 1.11 so that we can be certain that the union of cycles we select
is linearly independent.
Theorem 1.11: Let T : V → V be a linear operator and let λ be an eigenvalue of T.
Suppose that γ1, … , γq are cycles of generalised eigenvectors of T corresponding to λ
such that the initial vectors of γ1, … , γq are distinct and linearly independent. Then
(a) the cycles γ1, … , γq are disjoint, and
(b) γ = γ1 ∪ γ2 ∪ … ∪ γq is linearly independent.
Proof: (a) Let γi and γj, i ≠ j, be cycles of generalised eigenvectors of lengths pi and pj
respectively, and suppose v ∈ γi ∩ γj.

Let r be the smallest positive integer such that (T - λI)^r(v) = 0. Since v ∈ γi, the vector
(T - λI)^(r-1)(v) is the initial vector of γi; since v ∈ γj, it is likewise the initial vector of
γj. So γi and γj have the same initial vector, contradicting the assumption that the initial
vectors are distinct. Hence γi ∩ γj = ∅.

(b) We prove this by induction on n, the number of vectors in γ = { v1, … , vn }.

If n = 1, the result is trivial. Now assume that the result holds whenever γ consists of
fewer than n vectors, n > 1.

Suppose γ has exactly n vectors. Let W = span(γ). Then W is (T - λI)-invariant, since
T - λI maps each initial vector to 0 and every other vector of a cycle to its predecessor in
that cycle. Let U denote the restriction of T - λI to W.

For each i = 1, 2, … , q, let γi' denote the set obtained from γi by deleting the end
vector. When γi is of length one, γi' is the empty set.

We will first show that γ' = γ1' ∪ γ2' ∪ … ∪ γq' is a basis for R(U).

Observe that γ' generates R(U), since each vector of γi' is the image under U of a vector
in γi, and every non-zero image under U of a vector of γi is contained in γi'. (Recall that
W is generated by γ.)

Next, γ' consists of n - q vectors, and the initial vectors of the γi' are linearly
independent, since they are also the initial vectors of the γi. Thus γ' is linearly
independent by the induction hypothesis.

Hence γ' is a basis for R(U) and dim(R(U)) = n - q.

Since the q initial vectors of γ lie in Ker(U) and form a linearly independent set,
dim(Ker(U)) ≥ q.

By the dimension theorem, dim(W) = dim(R(U)) + dim(Ker(U)) ≥ (n - q) + q = n. But
W is generated by γ, so dim(W) ≤ n. Therefore dim(W) = n.

Since γ generates W and consists of n vectors, γ is linearly independent. ∎
Remark:
Every cycle of generalised eigenvectors of a linear operator is linearly
independent.
Theorem 1.12: Let T : V → V be a linear operator and let λ be an eigenvalue of T. Then
Kλ(T) has an ordered basis consisting of a union of disjoint cycles of generalised
eigenvectors corresponding to λ.
Proof: Let n = dim(Kλ(T)).

Suppose first that Kλ(T) = Eλ. Let { v1, … , vn } be a basis for Eλ. Then v1, … , vn are
eigenvectors, and hence also generalised eigenvectors, corresponding to λ. So
{v1}, … , {vn} are disjoint cycles of generalised eigenvectors corresponding to λ, and
their union forms a basis for Kλ(T).

Now suppose Kλ(T) ≠ Eλ. We will prove the result by induction on n. When n = 1 the
result is trivial. Assume that the result holds whenever dim(Kλ(T)) < n, where n > 1,
and suppose dim(Kλ(T)) = n.

Kλ(T) is (T - λI)-invariant, since if x ∈ Kλ(T) satisfies (T - λI)^p(x) = 0 for some
positive integer p, then (T - λI)^(p-1)[(T - λI)(x)] = 0 and hence (T - λI)(x) ∈ Kλ(T).

Let U denote the restriction of T - λI to Kλ(T). Then R(U) ⊆ Kλ(T), since Kλ(T) is the
codomain of U. Therefore dim(R(U)) ≤ dim(Kλ(T)).

Now we want to show that dim(R(U)) < dim(Kλ(T)). Since λ is an eigenvalue of T,
there exists a non-zero vector x such that (T - λI)(x) = 0. Hence U(x) = 0 and
Ker(U) ≠ {0}. Therefore, by the dimension theorem, dim(R(U)) < dim(Kλ(T)).

Since dim(R(U)) < n, we can apply the induction hypothesis: R(U) has an ordered basis
γ = γ1 ∪ γ2 ∪ … ∪ γq, where the γi are disjoint cycles of generalised eigenvectors
corresponding to λ for the restriction of T to R(U), and hence for T itself.

For each i, the end vector of γi lies in R(U), so it is the image under U of some vector
vi ∈ Kλ(T).

Let γi' = γi ∪ {vi}; this is again a cycle, and we let wi be its initial vector. Since wi is
also the initial vector of γi, the set {w1, … , wq} is a linearly independent subset of Eλ.
We can extend this subset to a basis {w1, … , wq, u1, … , us} for Eλ.

Let γ' = γ1' ∪ … ∪ γq' ∪ {u1} ∪ … ∪ {us}. We want to show that γ' is a basis for
Kλ(T). We will first show that γ' is linearly independent.

The initial vectors of the cycles γ1', … , γq', {u1}, … , {us} form the set
{w1, … , wq, u1, … , us}, which is linearly independent since it is a basis for Eλ. Since
γ1', … , γq', {u1}, … , {us} are also disjoint cycles of generalised eigenvectors of T
corresponding to λ, γ' is linearly independent by Theorem 1.11.

Next we will show that γ' contains as many elements as dim(Kλ(T)); this will show that
γ' is a basis for Kλ(T), since γ' is linearly independent.

Suppose γ consists of r elements; then γ' consists of r + q + s elements. Ker(U) = Eλ, so
Nullity(U) = q + s, since {w1, … , wq, u1, … , us} is a basis for Eλ. Rank(U) = r, since γ
is a basis for R(U).

By the dimension theorem, dim(Kλ(T)) = Rank(U) + Nullity(U) = r + q + s.

Hence we conclude that γ' is a basis for Kλ(T). ∎
We have now proven that a disjoint union of cycles of generalised eigenvectors
of T forms a basis for V. What remains is to show that this basis is indeed the required
Jordan canonical basis.
Theorem 1.13: Let T : V → V be a linear operator. Suppose B is a basis for V such that
B is a disjoint union of cycles of generalised eigenvectors of T. Let γ be any cycle of
generalised eigenvectors contained in B. Then W = span(γ) is T-invariant and [Tw]γ is a
Jordan block. Moreover, B is a Jordan canonical basis for V.
Proof: Let γ = { v1, v2, … , vp } be a cycle of generalised eigenvectors of T
corresponding to λ, generated by x. So vi = (T - λI)^(p-i)(x) for i = 1, 2, … , p.

We will prove that W is T-invariant by showing that T(vi) ∈ W for i = 1, 2, … , p.

For i > 1, (T - λI)(vi) = (T - λI)^(p-(i-1))(x) = vi-1, and so T(vi) = λvi + vi-1.
For i = 1, (T - λI)(v1) = (T - λI)^p(x) = 0, and so T(v1) = λv1.

So T(vi) ∈ W for i = 1, 2, … , p, and hence W is T-invariant.

Next we will show that [Tw]γ is a Jordan block. In coordinates relative to γ,

[Tw(v1)]γ = λ[v1]γ = (λ, 0, 0, … , 0)^T, and
[Tw(vi)]γ = [vi-1]γ + λ[vi]γ for i > 1, i.e. (1, λ, 0, … , 0)^T for i = 2,
(0, 1, λ, 0, … , 0)^T for i = 3, and so on.

We construct [Tw]γ by arranging these coordinate vectors as its ordered columns, and we
obtain a Jordan block corresponding to λ.

By repeating the above argument for each cycle in B in order to obtain [T]B, we find
that [T]B is in Jordan canonical form. Hence B is a Jordan canonical basis for V. ∎
Theorem 1.14: Let T : V → V be a linear operator such that the characteristic
polynomial of T splits. Then T has a Jordan canonical form.

Proof: Let dim(V) = n.

Since the characteristic polynomial of T splits, T has n eigenvalues, counting
multiplicity. Let the distinct eigenvalues of T be λ1, λ2, … , λk.

By Theorem 1.12, for each i there is an ordered basis Bi consisting of a union of
disjoint cycles of generalised eigenvectors corresponding to λi.

Let B = B1 ∪ … ∪ Bk. By Theorem 1.8, B is an ordered basis for V. And by Theorem
1.13, B is a Jordan canonical basis for V; hence T has a Jordan canonical form. ∎
Having proven these fundamental theorems relating to Jordan canonical forms,
we are now ready to move on to Chapters Two and Three, where we will use these
theorems extensively to determine Jordan canonical forms and Jordan canonical bases.
2. Relationship Between Minimum Polynomial & Jordan Canonical Form
We have proven rigorously several important results in the first chapter. However,
we still need a procedure to determine the exact Jordan canonical form of a linear
operator T. In this chapter, we will introduce a new concept, the minimum polynomial,
which will help to deduce possible Jordan canonical forms of a linear operator.
The main theorem in this chapter is Theorem 2.6, which shows that the minimum
polynomial imposes restrictions on the sizes of the Jordan blocks belonging to the Jordan
canonical form. These restrictions are used to discard candidate Jordan canonical forms
that do not comply with them.
Prior to that, we have several new definitions and theorems that will be needed in
the later part of the chapter.
Definition 2.0: Given a polynomial p(x) = anx^n + an-1x^(n-1) + … + a1x + a0,
(a) p(x) is monic if its leading coefficient an = 1;
(b) p(x) is irreducible if its only factors are scalars and scalar multiples of p(x).
Every polynomial q(x) is a product of irreducible polynomials q(x) = f1(x)f2(x)…fk(x);
these fi(x) are called the irreducible factors of q(x).
Remark: Since q(x) always splits over the complex field, each irreducible factor of q(x)
is linear.
Definition 2.1: The minimum polynomial, m(x), of a linear operator T is defined to be
the non-zero monic polynomial of smallest degree such that m(T) is the zero map.
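Anticipating Theorems 2.2 and 2.5 below (the minimum polynomial divides the characteristic polynomial and shares its roots), m(x) can be found by brute force once the characteristic polynomial has been factored: test exponent patterns in increasing total degree. A sketch, assuming Python with SymPy (minimum_polynomial is our own helper):

```python
# Find the minimum polynomial of a matrix by testing, in increasing total
# degree, the candidates (x - lam_1)^b_1 * ... * (x - lam_k)^b_k with
# 1 <= b_i <= a_i, where the a_i are the multiplicities in the
# characteristic polynomial.
from itertools import product
from sympy import Matrix, eye, symbols, roots

def minimum_polynomial(A):
    x = symbols('x')
    n = A.shape[0]
    rts = roots(A.charpoly(x).as_expr(), x)      # {eigenvalue: multiplicity}
    lams, mults = list(rts.keys()), list(rts.values())
    for bs in sorted(product(*[range(1, a + 1) for a in mults]), key=sum):
        M = eye(n)
        for lam, b in zip(lams, bs):
            M = M * (A - lam * eye(n)) ** b
        if M.is_zero_matrix:                     # first annihilator found is m(x)
            m = 1
            for lam, b in zip(lams, bs):
                m = m * (x - lam) ** b
            return m.expand()

print(minimum_polynomial(Matrix([[2, 1], [0, 2]])))   # x**2 - 4*x + 4 = (x - 2)**2
```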
Now that we have defined the minimum polynomial, we shall look at some
relationships between the minimum polynomial and the characteristic polynomial.
Particularly important is Theorem 2.5, which tells us that the two polynomials have
exactly the same roots (with possibly different multiplicities). We will need this theorem
for our final result, Theorem 2.6.
Theorem 2.2: Let m(x) be the minimum polynomial of a linear operator T. For any
polynomial f(x), if f(T) = 0, then m(x) divides f(x). In particular, m(x) divides the
characteristic polynomial of T.
Proof: By the division algorithm, there exist polynomials q(x) and r(x) such that
f(x) = m(x)q(x) + r(x), where the degree of r(x) is less than the degree of m(x).

So f(T) = m(T)q(T) + r(T). But f(T) = 0 and m(T) = 0 by Definition 2.1. Hence
r(T) = 0.

If r(x) ≠ 0, then m(x) would not be of smallest degree such that m(T) is the zero map,
since the degree of r(x) is less than the degree of m(x).

Hence r(x) = 0 and m(x) divides f(x). ∎
Theorem 2.3: The minimum polynomial of a linear operator T is unique.

Proof: Suppose m1(x) and m2(x) are both minimum polynomials of T. Then m1(x)
divides m2(x) by Theorem 2.2.

Since m1(x) and m2(x) are both minimum polynomials, they have the same degree. So
m1(x) = km2(x), where k is a non-zero scalar.

But m1(x) and m2(x) are both monic, so k = 1. Hence m1(x) = m2(x). ∎
Lemma 2.4: Let T be a linear operator on an n-dimensional vector space V. Let m(x)
and p(x) be the minimum polynomial and the characteristic polynomial of T respectively.
Then p(x) divides [m(x)]^n.

Proof: Let the minimum polynomial of T be m(x) = x^k + m1x^(k-1) + … + mk-1x + mk.

Define linear transformations S0, S1, … , Sk-1 recursively by S0 = I and
Sj = TSj-1 + mjI for j = 1, … , k-1, so that Sj - TSj-1 = mjI.

We also define M(x) = x^(k-1)S0 + x^(k-2)S1 + … + xSk-2 + Sk-1. We shall show that
p(x)|M(x)| = [m(x)]^n, and hence conclude that p(x) always divides [m(x)]^n.

Now
(xI - T)M(x) = xM(x) - TM(x)
= (x^kS0 + x^(k-1)S1 + … + x^2Sk-2 + xSk-1) - (x^(k-1)TS0 + x^(k-2)TS1 + … + xTSk-2 + TSk-1)
= x^kS0 + x^(k-1)(S1 - TS0) + … + x(Sk-1 - TSk-2) - TSk-1
= x^kI + m1x^(k-1)I + … + mk-1xI - TSk-1.

But
TSk-1 = mk-1T + T^2Sk-2
= mk-1T + T^2(mk-2I + TSk-3)
= … = T^k + m1T^(k-1) + … + mk-1T
= m(T) - mkI
= -mkI.

Hence (xI - T)M(x) = x^kI + m1x^(k-1)I + … + mk-1xI + mkI = m(x)I.

Taking determinants, p(x)|M(x)| = |xI - T||M(x)| = |m(x)I| = [m(x)]^n. Since |M(x)| is a
polynomial in x, p(x) divides [m(x)]^n. ∎
Theorem 2.5: The characteristic polynomial and minimum polynomial of a linear
operator T have the same irreducible factors.

Proof: Let p(x) and m(x) be the characteristic polynomial and minimum polynomial
respectively. Let p(x) = f1(x)^a1 f2(x)^a2 … fk(x)^ak, where the fi(x) are the distinct
irreducible factors of p(x), so that each ai ≥ 1.

By Theorem 2.2, m(x) divides p(x), so m(x) = f1(x)^b1 f2(x)^b2 … fk(x)^bk, where
0 ≤ bi ≤ ai for i = 1, … , k.

By Lemma 2.4, p(x) divides [m(x)]^n = f1(x)^(nb1) f2(x)^(nb2) … fk(x)^(nbk). Hence
nbi ≥ ai > 0 for all i. Therefore bi > 0, and so the fi(x) are also the irreducible factors
of m(x), for all i = 1, … , k. ∎
Theorem 2.6: If the minimum polynomial of a linear transformation T : V → V is
m(x) = (x - λ1)^m1 … (x - λk)^mk, then its Jordan canonical form has the following
properties for each i = 1, … , k:
(i) all Jordan blocks belonging to λi are of size less than or equal to mi, and
(ii) at least one Jordan block belonging to λi is of size mi.
Proof: (i) Let the minimum polynomial of T be m(x) = (x - λ1)^m1 … (x - λk)^mk,
where λ1, … , λk are the distinct eigenvalues of T. Note that
V = Kλ1(T) ⊕ … ⊕ Kλk(T) by Theorem 1.7.

From Theorem 1.13, [Twi]γi is a Jordan block, where γi is any cycle of generalised
eigenvectors corresponding to λi in the Jordan canonical basis and Wi = span(γi).
Therefore the size of such a Jordan block equals dim(Wi), which in turn equals the
length of γi.

Hence we only need to prove that the length of γi is less than or equal to mi. We first
consider cycles corresponding to λk.

Since m(x) is the minimum polynomial, m(T) = (T - λ1I)^m1 … (T - λkI)^mk = 0, and
therefore (T - λ1I)^m1 … (T - λk-1I)^mk-1 (T - λkI)^mk (v) = 0 for every v ∈ Kλk(T).

We want to show that (T - λkI)^mk (v) = 0.

By Theorem 1.5(b), the restriction of T - λjI to Kλk(T) is one-to-one for j ≠ k. Hence
the restriction of (T - λ1I)^m1 … (T - λk-1I)^mk-1 to Kλk(T) is one-to-one, since it is a
composition of one-to-one linear transformations.

Since v ∈ Kλk(T) and Kλk(T) is (T - λkI)-invariant, we have (T - λkI)^mk (v) ∈ Kλk(T).
Therefore (T - λkI)^mk (v) = 0. In particular, if x generates a cycle of length p in
Kλk(T), then p is the smallest integer with (T - λkI)^p(x) = 0, so p ≤ mk.

Repeating the above argument for the other generalised eigenspaces, we conclude that
the length of γi is less than or equal to mi for i = 1, … , k. Hence all Jordan blocks
belonging to λi are of size less than or equal to mi.

(ii) We shall prove this by contradiction. Suppose that for some i, every cycle of
generalised eigenvectors of T corresponding to λi has length strictly less than mi;
relabelling the eigenvalues if necessary, we may take i = k.

Then (T - λkI)^(mk - 1)(v) = 0 for all v ∈ Kλk(T). Consider
g(T) = (T - λ1I)^m1 … (T - λk-1I)^mk-1 (T - λkI)^(mk - 1).

Case one: if v ∈ Kλk(T), then
g(T)(v) = (T - λ1I)^m1 … (T - λk-1I)^mk-1 [(T - λkI)^(mk - 1)(v)] = 0.

Case two: if v ∈ Kλi(T), where i ≠ k, then (T - λiI)^mi (v) = 0 from (i). Since the
factors of g(T) commute (Proposition 0.4) and (T - λiI)^mi is one of them, g(T)(v) = 0.

Case three: for an arbitrary v ∈ V, write v = u1 + … + uk, where ui ∈ Kλi(T); this is
possible since V = Kλ1(T) ⊕ … ⊕ Kλk(T) by Theorem 1.7. Therefore
g(T)(v) = g(T)(u1) + … + g(T)(uk) = 0, by cases one and two.

Hence g(T)(v) = 0 for all v ∈ V, so g(T) is the zero map. But then, by Theorem 2.2,
m(x) would divide g(x) = (x - λ1)^m1 … (x - λk-1)^mk-1 (x - λk)^(mk - 1), which is
impossible since deg g(x) < deg m(x). This contradicts the fact that m(x) is the
minimum polynomial. Hence there exists at least one Jordan block of size mi belonging
to λi, for each i = 1, … , k. ∎
Example 2.7: Determine the possible Jordan canonical forms of the linear
transformation T represented by the following matrix A:

A = [  2 -4  2  4 ]
    [ -2  0  1  4 ]
    [  0  4  0 -4 ]
    [ -2 -6  3 10 ]
Solution: Let p(x) and m(x) be the characteristic polynomial and minimum polynomial
of T respectively.

The characteristic polynomial of T is computed to be p(x) = (x - 2)^2(x - 4)^2.

By Theorem 2.5, the irreducible factors of m(x) are also (x - 2) and (x - 4). So
m(x) = (x - 2)^a(x - 4)^b with a, b ≥ 1.

By Theorem 2.2, m(x) divides p(x), so m(x) must be one of the following:
m1(x) = (x - 2)(x - 4)
m2(x) = (x - 2)(x - 4)^2
m3(x) = (x - 2)^2(x - 4)
m4(x) = (x - 2)^2(x - 4)^2

By definition, we want m(x) to be of smallest degree such that m(A) = 0. Since
m1(A) ≠ 0 and m2(A) = 0, we have m(x) = m2(x) = (x - 2)(x - 4)^2.

By Theorem 2.6, all Jordan blocks belonging to λ1 = 2 are of size 1, and there is at
least one Jordan block of size 2 belonging to λ2 = 4.
Hence there is only one possible Jordan canonical form of T, namely

J = [ 2 0 0 0 ]
    [ 0 2 0 0 ]
    [ 0 0 4 1 ]
    [ 0 0 0 4 ]  ∎
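The computation above is easy to check with a computer algebra system. A sketch, assuming Python with SymPy (jordan_form() returns a pair (P, J) with A = P J P^(-1), and the ordering of its blocks may differ from the J displayed above):

```python
# Check Example 2.7: factor the characteristic polynomial, test the
# candidate minimum polynomials, and compare with SymPy's Jordan form.
from sympy import Matrix, eye, symbols, factor

x = symbols('x')
A = Matrix([[ 2, -4, 2,  4],
            [-2,  0, 1,  4],
            [ 0,  4, 0, -4],
            [-2, -6, 3, 10]])

print(factor(A.charpoly(x).as_expr()))       # (x - 2)**2 * (x - 4)**2
m1 = (A - 2*eye(4)) * (A - 4*eye(4))
m2 = (A - 2*eye(4)) * (A - 4*eye(4))**2
print(m1.is_zero_matrix, m2.is_zero_matrix)  # False True, so m(x) = m2(x)
P, J = A.jordan_form()
print(J)                                     # two 1 x 1 blocks for 2, one 2 x 2 block for 4
```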
Theorem 2.6 is useful because it reduces the number of possible Jordan canonical
forms of T. In this example, if we only had the results of Chapter One to depend on, the
matrices given below would all be possible Jordan canonical forms of T:
[ 2 0 0 0 ]   [ 2 1 0 0 ]   [ 2 0 0 0 ]   [ 2 1 0 0 ]
[ 0 2 0 0 ]   [ 0 2 0 0 ]   [ 0 2 0 0 ]   [ 0 2 0 0 ]
[ 0 0 4 0 ]   [ 0 0 4 0 ]   [ 0 0 4 1 ]   [ 0 0 4 1 ]
[ 0 0 0 4 ]   [ 0 0 0 4 ]   [ 0 0 0 4 ]   [ 0 0 0 4 ]
The number of possibilities increases further as the dimension of V increases. Hence we
see the importance and usefulness of this theorem.
We were fortunate in Example 2.7 that there was only one possible Jordan
canonical form. However, this theorem does not always fully determine the Jordan
canonical form of T. In the next example, we shall see that two matrices having the same
minimum polynomial and characteristic polynomial do not necessarily have the same
Jordan canonical form.
Example 2.8: Determine the minimum polynomials, mA(x) and mB(x), of A and B
respectively, given

A = [ 2 1 0 0 ]        B = [ 2 1 0 0 ]
    [ 0 2 0 0 ]            [ 0 2 0 0 ]
    [ 0 0 2 0 ]            [ 0 0 2 1 ]
    [ 0 0 0 2 ]            [ 0 0 0 2 ]
Clearly, λ = 2 is the only eigenvalue of A and of B. Hence mA(x) and mB(x) are
of the form (x - 2)^k, where 1 ≤ k ≤ 4. We check that

(A - 2I)^1 = [ 0 1 0 0 ]        (A - 2I)^2 = [ 0 0 0 0 ]
             [ 0 0 0 0 ]                     [ 0 0 0 0 ]
             [ 0 0 0 0 ]                     [ 0 0 0 0 ]
             [ 0 0 0 0 ]                     [ 0 0 0 0 ]

Hence mA(x) = (x - 2)^2.

Similarly, we check that (B - 2I)^1 ≠ 0 and (B - 2I)^2 = 0. Hence mB(x) = (x - 2)^2.
The characteristic polynomials of A and B are both (x - 2)^4, since A and B are
4 × 4 matrices with 2 as their only eigenvalue.

So A and B have the same minimum polynomial and the same characteristic polynomial.
Clearly, A and B are Jordan canonical forms of themselves, but A ≠ B. Hence two
matrices having the same minimum polynomial and characteristic polynomial need not
have the same Jordan canonical form. ∎
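A quick machine check of this example, assuming Python with SymPy:

```python
# A and B share their characteristic and minimum polynomials, yet their
# Jordan canonical forms (the matrices themselves) differ.
from sympy import Matrix, eye, symbols

x = symbols('x')
A = Matrix([[2, 1, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]])
B = Matrix([[2, 1, 0, 0], [0, 2, 0, 0], [0, 0, 2, 1], [0, 0, 0, 2]])

print(A.charpoly(x) == B.charpoly(x))            # True: both are (x - 2)**4
print(((A - 2*eye(4))**2).is_zero_matrix,
      ((B - 2*eye(4))**2).is_zero_matrix)        # True True: both have m(x) = (x - 2)**2
print(A.jordan_form()[1] == B.jordan_form()[1])  # False: different block structures
```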
Therefore, to determine the Jordan canonical form, it is not enough to look at the
minimum polynomial alone. Furthermore, the minimum polynomial does not allow us to
determine the Jordan canonical basis. This gives rise to the need for a more in-depth and
powerful method to determine the Jordan canonical form and basis of any linear
operator, which shall be discussed in Chapter Three.
3. Finding the Jordan Canonical Form and Basis
From Chapter One, we have proven that a disjoint union of cycles of generalised
eigenvectors of T forms a basis for V. With this knowledge, we will now introduce a
method for computing each Jordan block and the required union of linearly independent
disjoint cycles for each generalised eigenspace, and hence for computing the Jordan
canonical form and the Jordan canonical basis for V.

To facilitate discussion, we shall assume the following throughout this entire
chapter: T is a linear operator on an n-dimensional vector space V, having λ1, λ2, … , λk
as its distinct eigenvalues. Also, we let Bi be a basis for Kλi(T) such that Bi is a union of
disjoint cycles of generalised eigenvectors corresponding to λi. Finally, we define Ti to
be the restriction of T to Kλi(T) and Ai to be the matrix [Ti]Bi. So Ai is the Jordan
canonical form for Ti and hence, from Theorem 1.13,
J = [T]B = [ A1 0  …  0  ]
           [ 0  A2 …  0  ]
           [ .  .  …  .  ]
           [ 0  0  …  Ak ]

is a Jordan canonical form for T.
By convention, we order Bi so that its ni cycles are of decreasing length. So if
γ1, … , γni are the disjoint cycles of Bi having lengths p1, … , pni respectively, then
p1 ≥ p2 ≥ … ≥ pni. This limits the possible variations in Ai, and hence in J. As we
progress, we will see that each Ai is then unique, and therefore J is unique up to the
ordering of the eigenvalues of T.
We shall now introduce the method for computing the Jordan canonical form Ai.
This is achieved by using an array of dots called the dot diagram for Ti. Each dot in the
dot diagram for Ti represents one unique vector of Bi, and the dots are arranged into
columns, one column for each cycle of Bi. Hence the j-th column consists of the pj dots
that correspond to the elements of γj, with the first dot representing the initial vector,
continuing down to the end vector.

Remark: The dot diagram consists of ni columns and p1 rows, and each row is no
longer than the row above it.
Example: Suppose that T has an eigenvalue λi = 3, and Bi is a basis for Kλi(T) such
that Bi is a union of three disjoint cycles with lengths 3, 2 and 2 respectively. Then

Ai = [ 3 1 0 0 0 0 0 ]
     [ 0 3 1 0 0 0 0 ]
     [ 0 0 3 0 0 0 0 ]
     [ 0 0 0 3 1 0 0 ]
     [ 0 0 0 0 3 0 0 ]
     [ 0 0 0 0 0 3 1 ]
     [ 0 0 0 0 0 0 3 ]

and the dot diagram of Ti is

• • •
• • •
•
The following three theorems give us the method required for constructing the
dot diagram and finding the basis, given the matrix representation of T. The dot diagram,
on its own, fully determines the Jordan canonical form of the linear operator.

Recall that Bi is a basis for Kλi(T) which is a union of disjoint cycles γ1, … , γni of
generalised eigenvectors corresponding to λi, the ni cycles being of decreasing lengths
p1 ≥ … ≥ pni.
Theorem 3.1: The vectors in Bi that are represented by the dots in the first r rows of the
dot diagram of Ti form a basis for Ker((T - λiI)^r). Hence, Nullity((T - λiI)^r) is equal to
the number of dots in the first r rows of the dot diagram.
Proof: Kλi(T) is invariant under (T - λiI)^r. For suppose that x ∈ Kλi(T) with
(T - λiI)^s(x) = 0 for some positive integer s. Then (T - λiI)^s(T - λiI)^r(x) =
(T - λiI)^r(T - λiI)^s(x) = 0, hence (T - λiI)^r(x) ∈ Kλi(T).

Therefore, let U denote the restriction of (T - λiI)^r to Kλi(T).

Clearly, Ker((T - λiI)^r) ⊆ Kλi(T). So Ker((T - λiI)^r) = Ker(U), and therefore it is
sufficient to establish the result for U.

Let p be the number of dots in the first r rows, q the number of the remaining dots (in
the (r+1)-th row onwards), and mi = dim(Kλi(T)). Note that mi = p + q.

For x ∈ Bi, (T - λiI)^r(x) = 0 if and only if x is one of the first r vectors of its cycle.
Hence, for any x ∈ Bi, U(x) = 0 if and only if x is represented by a dot in the first r
rows of the dot diagram. In particular, the p vectors represented by the dots in the first r
rows lie in Ker(U).

Let S = { U(x) | x ∈ Bi, U(x) ≠ 0 }. We will show that S is a basis for R(U). Firstly, S
spans R(U). This can be seen as follows: suppose u ∈ R(U); then u = U(v) for some
v ∈ Kλi(T). Since Bi is a basis for Kλi(T), v is a linear combination of the vectors in Bi,
and so u is the same linear combination of the images of these basis vectors under U;
the non-zero images are precisely the elements of S. Hence S spans R(U).

If x ∈ Bi is represented by a dot in the (r+1)-th row or below, then U maps x to the
vector of Bi represented by the dot exactly r positions above the dot representing x. So
U maps the vectors from the (r+1)-th row onwards one-to-one onto q distinct vectors of
Bi. Therefore S is a linearly independent set of q vectors, so S is a basis for R(U) and
Rank(U) = q.

By the dimension theorem, Nullity(U) = mi - q = p. Since the p vectors represented by
the dots in the first r rows are linearly independent vectors lying in Ker(U), these
vectors form a basis for Ker(U), and hence for Ker((T - λiI)^r). ∎
Theorem 3.2: Let rj be the number of dots in the j-th row of the dot diagram of Ti. Then
(a) r1 = dim(V) - rank(T - λiI);
(b) rj = rank((T - λiI)^(j-1)) - rank((T - λiI)^j) for j > 1.

Proof: By Theorem 3.1, r1 + r2 + … + rj = Nullity((T - λiI)^j) = dim(V) - rank((T - λiI)^j).

Therefore, when j = 1, we have r1 = dim(V) - rank(T - λiI).

When j > 1,
rj = (r1 + r2 + … + rj-1 + rj) - (r1 + r2 + … + rj-1)
   = [dim(V) - rank((T - λiI)^j)] - [dim(V) - rank((T - λiI)^(j-1))]
   = rank((T - λiI)^(j-1)) - rank((T - λiI)^j). ∎
Remark: This theorem shows that the dot diagram is determined by T and λi alone.
Hence the dot diagram for Ti is unique, subject to the convention that the cycles of
generalised eigenvectors corresponding to λi are ordered in decreasing length. Hence the
Jordan canonical form of a linear operator is unique up to the ordering of the eigenvalues.
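Theorem 3.2 translates directly into an algorithm: the whole dot diagram, and hence each Ai, can be read off from the ranks of powers of (A - λiI). A sketch, assuming Python with SymPy (dot_diagram_rows is our own helper):

```python
# Compute [r_1, r_2, ...], the row lengths of the dot diagram of T_i,
# from the ranks of powers of N = A - lam*I (Theorem 3.2).
from sympy import Matrix, eye

def dot_diagram_rows(A, lam):
    n = A.shape[0]
    N = A - lam * eye(n)
    ranks = [n]                        # rank(N^0) = n
    while True:
        ranks.append((N ** len(ranks)).rank())
        if ranks[-1] == ranks[-2]:     # ranks have stabilised: no more rows
            ranks.pop()
            break
    return [ranks[j - 1] - ranks[j] for j in range(1, len(ranks))]

# The 7 x 7 example above: three cycles of lengths 3, 2, 2 for lambda = 3.
A = Matrix([[3, 1, 0, 0, 0, 0, 0],
            [0, 3, 1, 0, 0, 0, 0],
            [0, 0, 3, 0, 0, 0, 0],
            [0, 0, 0, 3, 1, 0, 0],
            [0, 0, 0, 0, 3, 0, 0],
            [0, 0, 0, 0, 0, 3, 1],
            [0, 0, 0, 0, 0, 0, 3]])
print(dot_diagram_rows(A, 3))          # [3, 3, 1]
```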
Theorem 3.3: Let A and B be n × n matrices. Then A and B are similar if and only if
they have the same Jordan canonical form, assuming the same ordering of their
eigenvalues.
Proof: If A and B have the same Jordan canonical form J, then A and B are both similar
to J, and hence A is similar to B by the transitive property of similarity.

Now suppose A and B are similar. Then A and B have the same set of eigenvalues. Let
JA and JB be the Jordan canonical forms of A and B respectively, assuming the same
ordering of their eigenvalues. Then A is similar to both JA and JB by the transitive
property of similarity. So JA and JB are both matrix representations of the linear
transformation T represented by the matrix A. Therefore JA and JB are Jordan canonical
forms of T, and hence JA = JB, since the Jordan canonical form of a linear operator is
unique once the ordering of the eigenvalues is fixed. ∎
With these theorems, we will demonstrate how to construct the Jordan canonical
form and its associated basis in the following example. The definition of cycles of
generalised eigenvectors, Definition 1.10, will also be needed to determine the Jordan
canonical basis.
Example:
Using the dot diagram, determine the Jordan canonical form and Jordan
canonical basis for A. Hence, determine whether A is similar to B.
(i) A = [  6  4  0  2  0 ]
        [ -4 -2  0  1  0 ]
        [  0  0  2 -2  1 ]
        [  0  0  0  6  4 ]
        [  0  0  0 -4 -2 ]
(ii) B = [ -2 -4  0  1  1 ]
         [  4  6  0  2  2 ]
         [  0  0  2  2  5 ]
         [  0  0  0 10 16 ]
         [  0  0  0 -4 -6 ]
Solution:
(i) Let T be the transformation represented by the matrix A.

The characteristic polynomial of A is computed to be |xI - A| = (x - 2)^5. Thus A has
only one distinct eigenvalue, λ1 = 2, with multiplicity 5. Let T1 be the restriction of T
to Kλ1(T).

[Note: In this example Kλ1(T) = V, so it may seem pointless to consider the restriction
T1. However, it is not always the case that A has only one distinct eigenvalue. A more
general way of solving this problem is to consider the restriction Ti of T to Kλi(T) for
each eigenvalue λi.]

We will compute the dot diagram for T1 by calculating the number of dots in each row
of the diagram. Let r1 be the number of dots in the first row.
By Theorem 3.2, r1 = dim(V) - rank(T - 2I) = 5 - 3 = 2. So the Jordan canonical
basis for T consists of two cycles of generalised eigenvectors; we shall call them γ1
and γ2.

By the same theorem, r2 = rank(T - 2I) - rank((T - 2I)^2) = 3 - 2 = 1. So the second
row of the dot diagram consists of only one dot, and therefore γ2 has length one.

By Corollary 1.9, the dot diagram for T1 consists of five dots in total, since
dim(Kλ1(T)) = 5. So the first column consists of four dots, as follows:
• •
•
•
•
Therefore B1, the basis for Kλ1(T), is a union of two cycles of lengths 4 and 1. Hence

A1 = [T1]B1 = [ 2 1 0 0 0 ]
              [ 0 2 1 0 0 ]
              [ 0 0 2 1 0 ]
              [ 0 0 0 2 0 ]
              [ 0 0 0 0 2 ]
When T has more than one eigenvalue, we compute Ai for each eigenvalue λi and
construct JA, the Jordan canonical form for A, as the block diagonal matrix described at
the beginning of this chapter. It is a special feature of this example that JA = A1.

We shall now determine B, the Jordan canonical basis for A. If T has more than one
distinct eigenvalue, we need to compute Bi, the basis for Kλi(T), for every i, and then
obtain B as the union of the Bi. In this example, B = B1.
By Theorem 3.1, the two initial vectors of γ1 and γ2 in B1 form a basis for Ker(T - 2I).
It is computed that Ker(T - 2I) = { s(0, 0, 1, 0, 0)^T + t(1, -1, 0, 0, 0)^T | s, t ∈ C }.
We need to choose the two initial vectors of γ1 and γ2 from Ker(T - 2I) such that
they are linearly independent. We hence use the basis vectors of the eigenspace of T1 as
the initial vectors of γ1 and γ2, since they belong to Ker(T - 2I) and are linearly
independent. Theorem 1.11 guarantees that if the initial vectors of the cycles are linearly
independent, then all the vectors belonging to all the cycles are linearly independent of
one another. Together with Theorem 1.13, these cycles will give us the required Jordan
canonical basis. We hence obtain the two vectors corresponding to the two dots in the
first row as { (0, 0, 1, 0, 0)^T, (1, -1, 0, 0, 0)^T }.
Next, we determine the rest of the vectors in B1 using Definition 1.10. Let v1 and
v2 be the end vectors of the cycles γ1 and γ2 respectively. Since γ1 and γ2 are of lengths
four and one respectively, we have (T - 2I)^3 v1 = u1 and v2 = u2, where u1 and u2 are
the two vectors in the set { (0, 0, 1, 0, 0)^T, (1, -1, 0, 0, 0)^T }, since this set consists of
the initial vectors of the two cycles. However, it is not yet known which vector is u1 and
which is u2.

It is known, however, that only one of u1 and u2 can satisfy the linear system
(T - 2I)^3 v1 = ui. For suppose both did; then the dot diagram for A would consist of
eight dots, which is a contradiction.

We check that (T - 2I)^3 v1 = (0, 0, 1, 0, 0)^T has no solution for v1. Hence
(1, -1, 0, 0, 0)^T is the initial vector of γ1, and (0, 0, 1, 0, 0)^T belongs to γ2.
We will now compute the rest of the vectors in γ1. Since v1 is the end vector of γ1,

γ1 = { (1, -1, 0, 0, 0)^T, (T - 2I)^2 v1, (T - 2I)v1, v1 }.

We first compute v1. Solving the linear system (T - 2I)^3 v1 = (1, -1, 0, 0, 0)^T, we
obtain the solution set { (a, b, c, d, 1/48 - d)^T | a, b, c, d ∈ C }. Choose
v1 = (0, 0, 0, 0, 1/48)^T. Hence (T - 2I)v1 = (0, 0, 1/48, 1/12, -1/12)^T and
(T - 2I)^2 v1 = (1/6, 1/12, -1/4, 0, 0)^T.

Therefore B = B1 = γ1 ∪ γ2 = { (1, -1, 0, 0, 0)^T, (1/6, 1/12, -1/4, 0, 0)^T,
(0, 0, 1/48, 1/12, -1/12)^T, (0, 0, 0, 0, 1/48)^T, (0, 0, 1, 0, 0)^T }.
We need to note that the ordering of the vectors in B is important if we want to
determine P such that JA = P^(-1)AP. If A1 is the first Jordan block of JA, followed by
A2, … , Ak, then the basis vectors in B1 should be listed first, followed by those of B2,
and so on up to Bk. The basis vectors in each Bi should further be ordered following the
same pattern as a cycle of generalised eigenvectors (initial vector first).

With such an ordering, we can obtain P easily by constructing the n × n matrix whose
columns are the vectors of B listed in the same order. Hence, given the matrix A as
above,
P = [  1   1/6    0     0    0 ]
    [ -1   1/12   0     0    0 ]
    [  0  -1/4    1/48  0    1 ]
    [  0   0      1/12  0    0 ]
    [  0   0     -1/12  1/48 0 ]
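As a sanity check (a sketch assuming Python with SymPy; Rational entries keep the arithmetic exact), one can verify that this P satisfies JA = P^(-1)AP:

```python
# Verify J_A = P**-1 * A * P for the basis constructed above.
from sympy import Matrix, Rational as R

A = Matrix([[ 6,  4, 0,  2,  0],
            [-4, -2, 0,  1,  0],
            [ 0,  0, 2, -2,  1],
            [ 0,  0, 0,  6,  4],
            [ 0,  0, 0, -4, -2]])
P = Matrix([[ 1, R(1, 6),   0,         0,        0],
            [-1, R(1, 12),  0,         0,        0],
            [ 0, R(-1, 4),  R(1, 48),  0,        1],
            [ 0, 0,         R(1, 12),  0,        0],
            [ 0, 0,         R(-1, 12), R(1, 48), 0]])
print(P.inv() * A * P)   # prints A1, the Jordan canonical form computed above
```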
(ii) We shall now compute the Jordan canonical form for B.
The steps are exactly the same as in (i), hence we shall briefly outline them and
leave out the detailed computations.

We first compute the characteristic polynomial of B, and obtain the single eigenvalue
λ1 = 2 with multiplicity 5.
Using Theorem 3.2, we can construct the dot diagram for each Ti; the dot diagram for
T1 is obtained as:

• •
•
•
•
With this dot diagram, we can determine A1, taking into account the lengths of the
various cycles that constitute B1.

Arranging all the Ai as blocks along the diagonal, we obtain JB, the required Jordan
canonical form. In this example there is only one Ai, namely A1:
JB = [ 2 1 0 0 0 ]
     [ 0 2 1 0 0 ]
     [ 0 0 2 1 0 ]
     [ 0 0 0 2 0 ]
     [ 0 0 0 0 2 ]
Since JA = JB, by Theorem 3.3, A is similar to B. ∎
Using the dot diagram, we will always be able to determine the exact Jordan
canonical form of a linear operator and its associated basis. Hence, for any linear
operator over the complex field, the dot diagram fully determines both the Jordan
canonical form and a Jordan canonical basis.
Conclusion
Through the understanding of Jordan canonical forms and the proofs of the
fundamental theorems, we can obtain an elegant matrix representation for any linear
transformation T over the complex field, together with its associated basis. The
advantage and beauty of the Jordan canonical form lie in the simplicity of its description,
which is of great value in many real-life applications.
References
1. Stephen H. Friedberg, Arnold J. Insel, Lawrence E. Spence. Linear Algebra, 3rd
edition. Prentice Hall.
2. Seymour Lipschutz, Marc Lars Lipson. Linear Algebra, 3rd edition. McGraw-Hill.
3. http://www.ma.iup.edu/projects/CalcDEMma/JCF/jcf0.html
4. http://ece.gmu.edu/ececourses/ece521/lecturenote/chap1/node3.html
5. http://www.dpmms.cam.ac.uk/~leinster/linear.html