NOTES ON SOLVING LINEAR SYSTEMS OF DIFFERENTIAL
EQUATIONS
LANCE D. DRAGER
1. Introduction
A problem that comes up in many different fields of mathematics and engineering is solving a system of linear constant-coefficient differential equations. Such a system looks like

(1.1)
    x_1'(t) = a_{11}x_1(t) + a_{12}x_2(t) + \cdots + a_{1n}x_n(t)
    x_2'(t) = a_{21}x_1(t) + a_{22}x_2(t) + \cdots + a_{2n}x_n(t)
        \vdots
    x_n'(t) = a_{n1}x_1(t) + a_{n2}x_2(t) + \cdots + a_{nn}x_n(t),

where the a_{ij} are constants. This is a system of n differential equations for the n unknown functions x_1(t), ..., x_n(t). It's important to note that the equations are coupled, meaning that the expression for the derivative x_i'(t) contains not only x_i(t) but (possibly) all the rest of the unknown functions. It's unclear how to proceed using the methods we've learned for scalar differential equations. Of course, to find a specific solution of (1.1), we need to specify initial conditions for the unknown functions at some value t_0 of t,

(1.2)
    x_1(t_0) = c_1
    x_2(t_0) = c_2
        \vdots
    x_n(t_0) = c_n,

where c_1, c_2, ..., c_n are constants.
It’s pretty clear that linear algebra is going to help here. We can put our unknown
functions into a vector-valued function
    x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix},
and our constant coefficients into an n × n matrix A = [a_{ij}]. Recall that to differentiate a vector-valued function, we differentiate each component, so

    x'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix},

so we can rewrite (1.1) more compactly in vector-matrix form as

(1.3)    x'(t) = Ax(t).
If we put our initial values into a vector

    c = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix},

we can rewrite the initial conditions (1.2) as

(1.4)    x(t_0) = c.

Thus, the matrix form of our problem is

(1.5)    x'(t) = Ax(t),    x(t_0) = c.
A problem of this form is called an initial value problem (IVP).
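Before developing the closed-form solution, it may help to see the IVP (1.5) in computational terms. The short sketch below is an addition to these notes, not part of their development: it integrates such a system numerically with SciPy, and the matrix A, the initial vector c, and the time interval are arbitrary choices made here for illustration.

```python
# A minimal sketch (an illustration, not the method of these notes):
# integrate x'(t) = A x(t), x(t0) = c numerically with SciPy.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])     # an arbitrary 2 x 2 example (assumption)
c = np.array([1.0, 0.0])         # initial condition x(t0) = c
t0, t1 = 0.0, 5.0

sol = solve_ivp(lambda t, x: A @ x, (t0, t1), c,
                t_eval=np.linspace(t0, t1, 6))
print(sol.t)    # sample times
print(sol.y)    # each column is x(t) at the corresponding time
```

The rest of the notes develop the exact solution of this problem in terms of the matrix exponential.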
For information on the matrix manipulations used in these notes, see the Appendix.
Eigenvalues and eigenvectors are going to be important to our solution methods. Of course, even real matrices can have complex, nonreal eigenvalues. To take care of this problem, we'll work with complex matrices and complex solutions to the differential equation from the start. In many (but not all) applications, one is only interested in real solutions, so we'll indicate as we go along what happens when our matrix A and our initial condition c are real.
The equation

    x'(t) = Ax(t)

is a homogeneous equation. An inhomogeneous equation would be one of the form

    x'(t) = Ax(t) + f(t),

where f(t) is a given vector-valued function. We'll discuss homogeneous systems to begin with, and show how to solve inhomogeneous systems at the end of these notes.
1.1. Notation. The symbol R will denote the real numbers and C will denote the complex numbers.
We will denote by Mat_{m×n}(C) the space of m × n matrices with entries in C, and Mat_{m×n}(R) will denote the space of m × n matrices with entries in R.
We use R^n as a synonym for Mat_{n×1}(R), the space of column vectors with n entries. Similarly, C^n is a synonym for Mat_{n×1}(C).
2. The Matrix Exponential
In this section, we’ll first consider the existence and uniqueness question for our
system of equations. We’ll then consider the fundamental matrix for our system
and show how this solves the problem. In the last subsection, we’ll view this
fundamental matrix as a matrix exponential function.
2.1. Initial Value Problem and Existence and Uniqueness. The main problem we're interested in is the initial value problem

(2.1)    x'(t) = Ax(t),    x(t_0) = c,

where A ∈ Mat_{n×n}(C), c ∈ C^n, and we're solving for a function x(t) with values in C^n, defined on some interval in R containing t_0.
We state the following existence and uniqueness theorem without proof.
Theorem 2.1 (Existence and Uniqueness Theorem). Let A be an n × n complex matrix, let c ∈ C^n and let t_0 ∈ R.
(1) There is a differentiable function x : R → C^n : t ↦ x(t) such that

    x'(t) = Ax(t),    for all t ∈ R,
    x(t_0) = c.

(2) If J ⊆ R is an open interval in R that contains t_0 and y : J → C^n is a differentiable function such that

    y'(t) = Ay(t),    for all t ∈ J,
    y(t_0) = c,

then y(t) = x(t) for all t ∈ J.
In view of this Theorem, we may as well consider solutions defined on all of R.
For brevity, we’ll summarize by saying that solutions of the initial value problem
are unique.
It will turn out to be useful to consider initial value problems for matrix-valued functions. To distinguish the cases, we'll usually write X(t) for our unknown function with values in Mat_{n×n}(C).
Theorem 2.2. Suppose that A ∈ Mat_{n×n}(C) and that t_0 ∈ R. Let C be a fixed n × n complex matrix. Then there is a function X : R → Mat_{n×n}(C) : t ↦ X(t) such that

(2.2)    X'(t) = AX(t),    t ∈ R,
         X(t_0) = C.

This solution is unique in the sense of Theorem 2.1.
Proof. If we write X(t) in terms of its columns as

    X(t) = [x_1(t) | x_2(t) | \cdots | x_n(t)],

so each x_i(t) is a vector-valued function, then

    X'(t) = [x_1'(t) | x_2'(t) | \cdots | x_n'(t)],
    AX(t) = [Ax_1(t) | Ax_2(t) | \cdots | Ax_n(t)].
Thus, the matrix differential equation X'(t) = AX(t) is equivalent to the n vector differential equations

    x_1'(t) = Ax_1(t)
    x_2'(t) = Ax_2(t)
        \vdots
    x_n'(t) = Ax_n(t).

If we write the initial matrix C in terms of its columns as

    C = [c_1 | c_2 | \cdots | c_n],

then the initial condition X(t_0) = C is equivalent to the vector equations

    x_1(t_0) = c_1
    x_2(t_0) = c_2
        \vdots
    x_n(t_0) = c_n.

Since each of the initial value problems

    x_j'(t) = Ax_j(t),    x_j(t_0) = c_j,    j = 1, 2, ..., n

has a unique solution, we conclude that the matrix initial value problem (2.2) has a unique solution.
2.2. The Fundamental Matrix and Its Properties. It turns out we only need to solve one matrix initial value problem in order to solve them all.

Definition 2.3. Let A be a complex n × n matrix. The unique n × n matrix-valued function X(t) that solves the matrix initial value problem

(2.3)    X'(t) = AX(t),    X(0) = I

will be denoted by Φ_A(t), in order to indicate the dependence on A.

In other words, Φ_A(t) is the unique function such that

(2.4)    Φ_A'(t) = AΦ_A(t),    Φ_A(0) = I.

The function Φ_A(t) is called the Fundamental Matrix of (2.3).

We'll have a much more intuitive notation for Φ_A(t) in a bit, but we need to do some work first.
First, let's show that Φ_A(t) solves the initial value problems we've discussed so far.
Theorem 2.4. Let A be an n × n complex matrix.
(1) Let c ∈ C^n. The solution of the initial value problem

(2.5)    x'(t) = Ax(t),    x(t_0) = c,

is

(2.6)    x(t) = Φ_A(t - t_0)c.

(2) Let C ∈ Mat_{n×n}(C). The solution X(t) of the matrix initial value problem

(2.7)    X'(t) = AX(t),    X(t_0) = C

is

(2.8)    X(t) = Φ_A(t - t_0)C.
Proof. Consider the matrix-valued function

    Ψ(t) = Φ_A(t - t_0).

We then have

    Ψ'(t) = (d/dt) Φ_A(t - t_0)
          = Φ_A'(t - t_0) · (d/dt)(t - t_0)
          = AΦ_A(t - t_0)
          = AΨ(t).

We also have Ψ(t_0) = Φ_A(t_0 - t_0) = Φ_A(0) = I.
For the first part of the proof, suppose c is a constant vector, and let y(t) = Ψ(t)c. Then

    y'(t) = Ψ'(t)c = AΨ(t)c = Ay(t),

and y(t_0) = Ψ(t_0)c = Ic = c. Thus, y(t) is the unique solution of the initial value problem (2.5).
The proof of the second part is very similar.
Exercise 2.5. Show that

    Φ_0(t) = I,    for all t,

where 0 is the n × n zero matrix.
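Anticipating Section 2.3, where Φ_A(t) is identified with the matrix exponential e^{tA}, Theorem 2.4 can be tried out numerically. The sketch below is an illustrative addition to the notes: it assumes Φ_A(t) = expm(tA) with scipy.linalg.expm, and the matrix A, the initial value c, and t_0 are arbitrary choices.

```python
# Sketch of Theorem 2.4, assuming Phi_A(t) = expm(t*A) (see Section 2.3).
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # arbitrary example matrix (assumption)
c = np.array([1.0, 0.0])       # initial value x(t0) = c
t0 = 1.0

def x(t):
    """Solution of x'(t) = A x(t), x(t0) = c, via the fundamental matrix."""
    return expm((t - t0) * A) @ c

print(x(t0))   # returns c, since Phi_A(0) = I
print(x(2.5))  # the solution at t = 2.5
```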
In the rest of this subsection, we're going to derive some properties of Φ_A(t). The pattern of proof is the same in each case: we show two functions satisfy the same initial value problem, so they must be the same.
Here's a simple example to start.

Theorem 2.6. Let A be an n × n real matrix. Then Φ_A(t) is real. The solutions of the initial value problems (2.5) and (2.7) are real if the initial data, c or C, is real.
Recall that the conjugate of a complex number z is denoted by \bar{z}. Of course, z is real if and only if z = \bar{z}. If x(t) is a complex-valued function, we define \bar{x}(t) by \bar{x}(t) = \overline{x(t)}. Note that

    \bar{x}'(t) = (d/dt)\overline{x(t)} = \overline{x'(t)}.
Proof of Theorem. Let A be a complex matrix. The fundamental solution Φ_A(t) solves the IVP

(2.9)    X'(t) = AX(t),    X(0) = I.

Suppose X(t) is the solution of this IVP. Taking conjugates, we get

(2.10)    \bar{X}'(t) = \bar{A}\bar{X}(t),    \bar{X}(0) = I.

In other words, (2.10) shows that \bar{X}(t) is the solution of the IVP

(2.11)    Y'(t) = \bar{A}Y(t),    Y(0) = I,

so we must have Y(t) = \bar{X}(t). But (2.11) is the IVP that defines Φ_{\bar{A}}(t). In other words, for a complex matrix A, we have

    Φ_{\bar{A}}(t) = \overline{Φ_A(t)}.

Suppose now that A is real. Then \bar{A} = A. The last equation becomes

    Φ_A(t) = \overline{Φ_A(t)}.

Since Φ_A(t) is equal to its conjugate, it must be real.
Of course, if Φ_A(t) is real and c is a real vector, the solution x(t) = Φ_A(t - t_0)c of (2.5) is real. A similar argument takes care of (2.7).
The next property is the basic property of Φ_A(t) that is most often used.
Theorem 2.7. Let A be an n × n complex matrix. For any t, s ∈ R,

(2.12)    Φ_A(t + s) = Φ_A(t)Φ_A(s).

Proof. Let s be fixed but arbitrary, and think of both sides of (2.12) as functions of t. Consider the matrix initial value problem

(2.13)    X'(t) = AX(t),    X(0) = Φ_A(s).

According to Theorem 2.4, the solution of this initial value problem is X(t) = Φ_A(t)Φ_A(s).
On the other hand, consider the function Ψ(t) = Φ_A(t + s). We have

    Ψ'(t) = (d/dt) Φ_A(t + s)
          = Φ_A'(t + s) · (d/dt)(t + s)
          = AΦ_A(t + s)
          = AΨ(t),

and Ψ(0) = Φ_A(0 + s) = Φ_A(s). Thus, Ψ(t) is also a solution to the initial value problem (2.13). Our two solutions must be the same, which proves the theorem.

Since t + s = s + t, we readily obtain the following corollary.
Corollary 2.8. For any real numbers t and s,

    Φ_A(t)Φ_A(s) = Φ_A(s)Φ_A(t),

i.e., the matrices Φ_A(t) and Φ_A(s) commute.

Almost as easily, we obtain a second corollary.

Corollary 2.9. For any t ∈ R, the matrix Φ_A(t) is invertible and the inverse is Φ_A(-t).

Proof.

    Φ_A(t)Φ_A(-t) = Φ_A(-t)Φ_A(t) = Φ_A(t + (-t)) = Φ_A(0) = I.
The next theorem has a similar proof. We’ll use this later.
Theorem 2.10. Let A be an n × n complex matrix and let s be a real number. Then

    Φ_{sA}(t) = Φ_A(st),    t ∈ R.

Proof. Let Ψ(t) = Φ_A(st). Then

    Ψ'(t) = (d/dt) Φ_A(st)
          = Φ_A'(st) · (d/dt)(st)
          = sΦ_A'(st)
          = sAΦ_A(st)
          = (sA)Ψ(t),

and Ψ(0) = I. Thus, Ψ(t) satisfies the same initial value problem that characterizes Φ_{sA}(t), so the two functions must be equal.
The next theorem describes the effect of similarity transformations. This will be the key to computing Φ_A(t) when A is diagonalizable.

Theorem 2.11. Let A be an n × n complex matrix, and let P be an invertible n × n complex matrix. Then

    Φ_{P^{-1}AP}(t) = P^{-1}Φ_A(t)P.

Proof. Let Ψ(t) = P^{-1}Φ_A(t)P. Then we have

    Ψ'(t) = P^{-1}Φ_A'(t)P
          = P^{-1}AΦ_A(t)P
          = P^{-1}AP P^{-1}Φ_A(t)P
          = (P^{-1}AP)Ψ(t).

We also have

    Ψ(0) = P^{-1}Φ_A(0)P = P^{-1}IP = I.

Thus, Ψ(t) is a solution of the same initial value problem that characterizes Φ_{P^{-1}AP}(t), so the two functions are equal.
The following proposition is preparation for the next theorem. As we’ll see, what
commutes with what is important in this subject.
Proposition 2.12. Let A be an n × n complex matrix. Let B be an n × n complex matrix that commutes with A, i.e., AB = BA. Then B commutes with Φ_A(t), i.e.,

    BΦ_A(t) = Φ_A(t)B,    t ∈ R.

In particular, since A commutes with itself, we have

    AΦ_A(t) = Φ_A(t)A,    t ∈ R.
Proof. Consider the matrix initial value problem

(2.14)    X'(t) = AX(t),    X(0) = B.

According to Theorem 2.4, the solution to this initial value problem is X(t) = Φ_A(t)B.
Let Ψ(t) = BΦ_A(t). Then we have

    Ψ'(t) = BΦ_A'(t)
          = BAΦ_A(t)
          = ABΦ_A(t)       (since AB = BA)
          = AΨ(t).

We also have Ψ(0) = BΦ_A(0) = BI = B. Thus, Ψ(t) solves the initial value problem (2.14), so our two solutions must be equal.
Theorem 2.13. If A and B are n × n square matrices that commute, i.e., AB = BA, then

    Φ_{A+B}(t) = Φ_A(t)Φ_B(t).

Of course, since A + B = B + A, we would have to have

    Φ_{A+B}(t) = Φ_B(t)Φ_A(t),

so the matrices Φ_A(t) and Φ_B(t) commute.

Proof. You know the drill by now. Let Ψ(t) = Φ_A(t)Φ_B(t). Then we have

    Ψ'(t) = Φ_A'(t)Φ_B(t) + Φ_A(t)Φ_B'(t)       (by the Product Rule)
          = AΦ_A(t)Φ_B(t) + Φ_A(t)BΦ_B(t)
          = AΦ_A(t)Φ_B(t) + BΦ_A(t)Φ_B(t)       (by Proposition 2.12)
          = (A + B)Φ_A(t)Φ_B(t)
          = (A + B)Ψ(t).

In addition, we have Ψ(0) = Φ_A(0)Φ_B(0) = II = I. Thus, Ψ(t) solves the same initial value problem as Φ_{A+B}(t), so the two functions are equal.
Exercise 2.14. Under the hypotheses of the last theorem, show that

    Φ_A(s)Φ_B(t) = Φ_B(t)Φ_A(s)

for all s, t ∈ R, not just s = t as stated in the theorem.
2.3. Matrix Exponentials. If a is a number, consider the initial value problem

    x'(t) = ax(t),    x(0) = 1.

The solution is x(t) = e^{at}. Of course, this depends on knowing the exponential function exp(x) = e^x. But this function can also be recovered from the solution of the IVP as e^a = x(1).
We can follow a similar course to define e^A = exp(A) when A is a matrix.

Definition 2.15. If A is a complex n × n matrix, we define

    e^A = Φ_A(1).

Now use Theorem 2.10 (reversing the roles of s and t). For any s, we have

    Φ_{tA}(s) = Φ_A(ts).

Setting s = 1 in this equation gives us

    e^{tA} = Φ_{tA}(1) = Φ_A(t).

For the rest of the notes, we'll write e^{tA} or exp(tA) instead of Φ_A(t).
Let’s summarize the properties we’ve developed so far in the new notation.
Proposition 2.16. If A is a complex n × n matrix, the matrix exponential function e^{tA} satisfies the following properties.
(1) (d/dt) e^{tA} = Ae^{tA}, and e^{0A} = I.
(2) If A is real, e^{tA} is real.
(3) e^{(t_1 + t_2)A} = e^{t_1 A} e^{t_2 A}.
(4) For each fixed t, e^{tA} is invertible, and the inverse is e^{-tA}.
(5) e^{t(sA)} = e^{(st)A}.
(6) For any invertible n × n matrix P, P^{-1} e^{tA} P = e^{tP^{-1}AP}.
(7) If B commutes with A, then Be^{tA} = e^{tA}B.
(8) If A and B commute, then e^{t(A+B)} = e^{tA} e^{tB} = e^{tB} e^{tA}.
Remark 1. Of course, it would be nice to know what e^{t(A+B)} is when A and B don't commute. The answer is given by the Baker-Campbell-Hausdorff formula, which is fairly complex. This is beyond the scope of these notes.
There is another approach to developing the matrix exponential function, which
we should mention. Recall that, for a real variable x, the exponential function e^x is given by a power series, namely

    e^x = \sum_{k=0}^{\infty} \frac{1}{k!} x^k = 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3 + \frac{1}{24}x^4 + \cdots.

If A is a square matrix, then formally substituting x = tA in the series above would suggest that we define e^{tA} by

(2.15)    e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k = I + tA + \frac{1}{2}t^2A^2 + \frac{1}{6}t^3A^3 + \cdots.

Of course, this is an infinite sum of matrices. One can interpret convergence of this series to mean that the scalar infinite series one gets in each slot converges. Of course, to make this proposed definition valid, we would have to show that the series (2.15) converges. To make the definition useful, we would have to develop enough properties of such infinite series to prove that e^{tA} has the properties in Proposition 2.16. We've decided to take a different approach. Still, we'll see an echo of the power series (2.15) below.
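Although these notes do not develop the series (2.15), it is easy to observe numerically that its partial sums approach the matrix exponential. The sketch below is only an illustrative addition: the test matrix, the value of t, and the number of terms are arbitrary choices, and scipy.linalg.expm is used as the reference value.

```python
# Partial sums of (2.15) versus scipy.linalg.expm (illustrative check only).
import numpy as np
from scipy.linalg import expm

def expm_series(M, terms=30):
    """Partial sum I + M + M^2/2! + ... with the given number of terms."""
    total = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k        # term is now M^k / k!
        total = total + term
    return total

A = np.array([[4.0, -1.0],
              [5.0,  0.0]])        # arbitrary test matrix (assumption)
t = 0.7
print(np.max(np.abs(expm_series(t * A) - expm(t * A))))   # very small
```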
3. Computing Matrix Exponentials
We will first consider how to compute e^{tA} when A is diagonalizable, and then consider the nondiagonalizable case.
3.1. The Diagonalizable Case. Recall that an n × n matrix A is diagonalizable if there is an invertible matrix P so that P^{-1}AP is a diagonal matrix D. This means that D is of the form

    D = \begin{bmatrix} λ_1 & & & \\ & λ_2 & & \\ & & \ddots & \\ & & & λ_n \end{bmatrix},

where all the off-diagonal entries are zero.
The first thing to do is to find e^{tD} when D is diagonal.

Proposition 3.1. If D is the diagonal matrix

    D = \begin{bmatrix} λ_1 & & & \\ & λ_2 & & \\ & & \ddots & \\ & & & λ_n \end{bmatrix},

then e^{tD} is the diagonal matrix

    e^{tD} = \begin{bmatrix} e^{λ_1 t} & & & \\ & e^{λ_2 t} & & \\ & & \ddots & \\ & & & e^{λ_n t} \end{bmatrix}.
Proof. We just check that our proposed solution satisfies the right initial value problem. So, define Ψ(t) by

    Ψ(t) = \begin{bmatrix} e^{λ_1 t} & & & \\ & e^{λ_2 t} & & \\ & & \ddots & \\ & & & e^{λ_n t} \end{bmatrix}.

Then

    Ψ(0) = \begin{bmatrix} e^{λ_1 0} & & & \\ & e^{λ_2 0} & & \\ & & \ddots & \\ & & & e^{λ_n 0} \end{bmatrix} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix} = I.

By differentiating each component, we get

    Ψ'(t) = \begin{bmatrix} λ_1 e^{λ_1 t} & & & \\ & λ_2 e^{λ_2 t} & & \\ & & \ddots & \\ & & & λ_n e^{λ_n t} \end{bmatrix}.

On the other hand,

    DΨ(t) = \begin{bmatrix} λ_1 & & & \\ & λ_2 & & \\ & & \ddots & \\ & & & λ_n \end{bmatrix} \begin{bmatrix} e^{λ_1 t} & & & \\ & e^{λ_2 t} & & \\ & & \ddots & \\ & & & e^{λ_n t} \end{bmatrix} = \begin{bmatrix} λ_1 e^{λ_1 t} & & & \\ & λ_2 e^{λ_2 t} & & \\ & & \ddots & \\ & & & λ_n e^{λ_n t} \end{bmatrix}.

Thus, Ψ'(t) = DΨ(t).
We conclude that Ψ(t) solves the initial value problem that defines e^{tD}, so we must have e^{tD} = Ψ(t).
We can now easily compute e^{tA} if A is diagonalizable. In this case, we have P^{-1}AP = D for some diagonal matrix D. We can also write this equation as

    A = PDP^{-1}.

But then, we know

    e^{tA} = e^{tPDP^{-1}} = Pe^{tD}P^{-1}.

So, to compute e^{tA} when A is diagonalizable, we use the equation

(3.1)    e^{tA} = Pe^{tD}P^{-1}.
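As a quick numerical sanity check of (3.1) (an addition to the notes, not part of their development), one can let a linear algebra library produce P and D and compare Pe^{tD}P^{-1} with a library matrix exponential. The sketch below assumes A really is diagonalizable; numpy.linalg.eig does not warn you when it is not.

```python
# Sketch of formula (3.1): e^{tA} = P e^{tD} P^{-1} for a diagonalizable A.
import numpy as np
from scipy.linalg import expm

def expm_diagonalizable(A, t):
    eigvals, P = np.linalg.eig(A)          # columns of P are eigenvectors
    etD = np.diag(np.exp(t * eigvals))     # e^{tD} is diagonal
    return P @ etD @ np.linalg.inv(P)

A = np.array([[8.0, -4.0, -3.0],
              [6.0, -2.0, -3.0],
              [6.0, -4.0, -1.0]])          # the matrix of Example 3.2 below
print(np.max(np.abs(expm_diagonalizable(A, 1.0) - expm(A))))   # essentially zero
```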
Let’s do a couple of computations.
Example 3.2. Consider the matrix

    A = \begin{bmatrix} 8 & -4 & -3 \\ 6 & -2 & -3 \\ 6 & -4 & -1 \end{bmatrix}.

The characteristic polynomial of A is p(λ) = λ^3 - 5λ^2 + 8λ - 4, which factors as p(λ) = (λ - 1)(λ - 2)^2. Thus the eigenvalues are 1 and 2.
First, let's find a basis for E(1), the eigenspace for eigenvalue 1. Of course, E(1) is the nullspace of A - (1)I. We calculate that

    A - I = \begin{bmatrix} 7 & -4 & -3 \\ 6 & -3 & -3 \\ 6 & -4 & -2 \end{bmatrix}.

The reduced row echelon form of A - I is

    R = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}.

By the usual method, we find the nullspace of R, which is the same as the nullspace of A - I. The conclusion is that E(1) is one dimensional with basis

    v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
Next, let’s look for a basis of 𝐸(2). We
⎑
6
𝐴 − 2𝐼 = ⎣ 6
6
calculate
−4
−4
−4
⎀
−3
−3 ⎦ .
−3
The reduced row echelon form of 𝐴 − 2𝐼 is
⎑
⎀
1 −2/3 −1/2
0
0 ⎦.
𝑅=⎣ 0
0
0
0
By the usual method, we find that E(2) is two dimensional with basis

    u_2 = \begin{bmatrix} 1/2 \\ 0 \\ 1 \end{bmatrix},    u_3 = \begin{bmatrix} 2/3 \\ 1 \\ 0 \end{bmatrix}.

It's not necessary, but we can make things look nicer by getting rid of the fractions. If we multiply u_2 by 2 and u_3 by 3, we'll still have a basis. This gives us a new basis

    v_2 = 2u_2 = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix},    v_3 = 3u_3 = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}.

Since we have 3 independent eigenvectors, the matrix A is diagonalizable. We put the basis vectors into a matrix as columns, so we get

    P = [v_1 | v_2 | v_3] = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix}.

The corresponding diagonal matrix is

    D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}.
The reader is invited to check that P^{-1}AP = D. More importantly at the moment, we have A = PDP^{-1}. Thus, e^{tA} = Pe^{tD}P^{-1}. We easily calculate e^{tD} (the software I'm using prefers exp(t) to e^t) as

    e^{tD} = \begin{bmatrix} exp(t) & 0 & 0 \\ 0 & exp(2t) & 0 \\ 0 & 0 & exp(2t) \end{bmatrix}.

Using a machine, we compute

    e^{tA} = Pe^{tD}P^{-1}
           = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} exp(t) & 0 & 0 \\ 0 & exp(2t) & 0 \\ 0 & 0 & exp(2t) \end{bmatrix} \begin{bmatrix} -6 & 4 & 3 \\ 3 & -2 & -1 \\ 2 & -1 & -1 \end{bmatrix}
           = \begin{bmatrix} -6exp(t) + 7exp(2t) & 4exp(t) - 4exp(2t) & 3exp(t) - 3exp(2t) \\ -6exp(t) + 6exp(2t) & 4exp(t) - 3exp(2t) & 3exp(t) - 3exp(2t) \\ -6exp(t) + 6exp(2t) & 4exp(t) - 4exp(2t) & 3exp(t) - 2exp(2t) \end{bmatrix}.
Consider the initial value problem

    x'(t) = Ax(t),    x(0) = c = \begin{bmatrix} 1 \\ 2 \\ -3 \end{bmatrix}.

The solution is

    x(t) = e^{tA}c = \begin{bmatrix} -7exp(t) + 8exp(2t) \\ -7exp(t) + 9exp(2t) \\ -7exp(t) + 4exp(2t) \end{bmatrix}.
Example 3.3. Here is a simple example with complex eigenvalues. Let

    A = \begin{bmatrix} 4 & -1 \\ 5 & 0 \end{bmatrix}.

The characteristic polynomial is

    p(λ) = det(A - λI) = λ^2 - 4λ + 5.

The roots of the characteristic polynomial are 2 + i and 2 - i. We calculate

    A - (2 + i)I = \begin{bmatrix} 2 - i & -1 \\ 5 & -2 - i \end{bmatrix}.

The reduced row echelon form of A - (2 + i)I is

    R = \begin{bmatrix} 1 & -2/5 - (1/5)i \\ 0 & 0 \end{bmatrix}.

By the usual method we find the nullspace of R, which is the same as the nullspace of A - (2 + i)I. The result is that the nullspace is one dimensional with basis

    v_1 = \begin{bmatrix} 2/5 + (1/5)i \\ 1 \end{bmatrix}.

Thus, v_1 is a basis of the eigenspace of A for eigenvalue 2 + i. The other eigenvalue, 2 - i, is the conjugate of the first eigenvalue, so we can just conjugate the basis we found for 2 + i. This gives us

    v_2 = \begin{bmatrix} 2/5 - (1/5)i \\ 1 \end{bmatrix},

a basis for the eigenspace for eigenvalue 2 - i.
Since we have two independent eigenvectors, the matrix A is diagonalizable. If we put v_1 and v_2 into a matrix, we get

    P = \begin{bmatrix} 2/5 + (1/5)i & 2/5 - (1/5)i \\ 1 & 1 \end{bmatrix}.

The corresponding diagonal matrix is

    D = \begin{bmatrix} 2 + i & 0 \\ 0 & 2 - i \end{bmatrix}.

Note that each column of P is an eigenvector for the eigenvalue occurring in the corresponding column of D. You can now check that P^{-1}AP = D, or A = PDP^{-1}.
Since A is real, we know that e^{tA} must be real, even though P and D are not real.
We compute e^{tD} as

    e^{tD} = \begin{bmatrix} e^{(2+i)t} & 0 \\ 0 & e^{(2-i)t} \end{bmatrix}.

We then have e^{tA} = Pe^{tD}P^{-1}. Putting this into the TI-89 gives

(3.2)    e^{tA} = \begin{bmatrix} e^{2t}(cos(t) + 2sin(t)) & -e^{2t}sin(t) \\ 5e^{2t}sin(t) & e^{2t}(cos(t) - 2sin(t)) \end{bmatrix}.
Of course, the calculator has used Euler's Formula

    e^{(α+iβ)t} = e^{αt}cos(βt) + ie^{αt}sin(βt).

Sometimes you have to put this in by hand to persuade recalcitrant software to give you results that are clearly real.
If we are given an initial condition

    x(0) = c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix},

the solution of the initial value problem

    x'(t) = Ax(t),    x(0) = c

is

    x(t) = e^{tA}c = \begin{bmatrix} (e^{2t}cos(t) + 2e^{2t}sin(t))c_1 - e^{2t}sin(t)c_2 \\ 5e^{2t}sin(t)c_1 + (e^{2t}cos(t) - 2e^{2t}sin(t))c_2 \end{bmatrix}.
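A small numerical illustration of this point (an addition to the notes, using the data of Example 3.3): P and D are complex, but Pe^{tD}P^{-1} comes out real to machine precision and agrees with (3.2). The choice t = 0.4 is arbitrary.

```python
# Complex P and D from Example 3.3 still give a real matrix exponential.
import numpy as np

t = 0.4
P = np.array([[0.4 + 0.2j, 0.4 - 0.2j],
              [1.0 + 0.0j, 1.0 + 0.0j]])   # eigenvectors v1, v2 as columns
lam = np.array([2.0 + 1.0j, 2.0 - 1.0j])   # eigenvalues 2 + i and 2 - i
etA = P @ np.diag(np.exp(t * lam)) @ np.linalg.inv(P)

print(np.max(np.abs(etA.imag)))            # ~1e-16: the imaginary part vanishes
expected = np.exp(2 * t) * np.array(
    [[np.cos(t) + 2 * np.sin(t), -np.sin(t)],
     [5 * np.sin(t),              np.cos(t) - 2 * np.sin(t)]])
print(np.max(np.abs(etA.real - expected))) # matches formula (3.2)
```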
3.2. Nilpotent Matrices. In this subsection we discuss nilpotent matrices, which will be used in the next subsection to compute the matrix exponential of a general nondiagonalizable matrix.

Definition 3.4. A square matrix N is said to be nilpotent if N^p = 0 for some integer p ≥ 1.
Example 3.5. Consider the matrix

    J_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.

Of course, J_2 ≠ 0, but the reader can easily check that J_2^2 = 0.
If we go one dimension higher, we have

    J_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.

The reader can easily check that J_3^2 ≠ 0 but J_3^3 = 0.
You can see a pattern here, but these are just some simple examples of nilpotent matrices. We'll see later that it's possible to have a nilpotent matrix where all the entries are nonzero.
Of course, if N^p = 0, then N^q = 0 for all q > p. One might wonder how high a power p is necessary to make N^p = 0. This is answered in the next theorem.

Theorem 3.6. Let N be a nilpotent n × n matrix. Then N^n = 0. To put it more precisely, if N^{p-1} ≠ 0 and N^p = 0, then p ≤ n.
The proof of this theorem will be given in the appendices.
Our main concern here is to compute e^{tN} when N is nilpotent. This can be done by a simple matrix calculation.

Theorem 3.7. Let N be an n × n nilpotent matrix. Then

(3.3)    e^{tN} = I + tN + \frac{1}{2!}t^2N^2 + \frac{1}{3!}t^3N^3 + \cdots + \frac{1}{(n-1)!}t^{n-1}N^{n-1} = \sum_{k=0}^{n-1} \frac{1}{k!} t^k N^k.
Note that it may well happen that N^p = 0 for some p < n - 1, so some of the terms above will vanish.
You may recognize this formula as what you would get if you plug N into the power series (2.15) and notice that all the terms with exponent higher than n - 1 will be zero. However, we can prove the theorem without resorting to the power series; we just use our usual method.
Proof of Theorem. Consider the function

    Ψ(t) = I + tN + \frac{1}{2!}t^2N^2 + \frac{1}{3!}t^3N^3 + \cdots + \frac{1}{(n-2)!}t^{n-2}N^{n-2} + \frac{1}{(n-1)!}t^{n-1}N^{n-1}.

Clearly Ψ(0) = I.
It's easy to differentiate Ψ(t) if you recall that p/p! = 1/(p - 1)!. The result is

(3.4)    Ψ'(t) = N + tN^2 + \frac{1}{2}t^2N^3 + \cdots + \frac{1}{(n-3)!}t^{n-3}N^{n-2} + \frac{1}{(n-2)!}t^{n-2}N^{n-1}.

On the other hand, we can compute NΨ(t) by multiplying N through the formula for Ψ(t). The result is

(3.5)    NΨ(t) = N + tN^2 + \frac{1}{2!}t^2N^3 + \frac{1}{3!}t^3N^4 + \cdots + \frac{1}{(n-2)!}t^{n-2}N^{n-1} + \frac{1}{(n-1)!}t^{n-1}N^n.

But the last term is zero, because N^n = 0. Thus, we get exactly the same formula in (3.4) and (3.5), so we have Ψ'(t) = NΨ(t). Thus, Ψ(t) satisfies the same initial value problem that defines e^{tN}. Thus, we must have e^{tN} = Ψ(t).
With this formula, we can compute e^{tN} for (reasonably small) nilpotent matrices.

Example 3.8. Consider the matrix

    J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.

We have

    J^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
    J^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
    J^4 = 0.
Thus, we can calculate

    e^{tJ} = I + tJ + \frac{1}{2}t^2J^2 + \frac{1}{3!}t^3J^3
           = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} + t\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \frac{t^2}{2}\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \frac{t^3}{6}\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
           = \begin{bmatrix} 1 & t & t^2/2 & t^3/6 \\ 0 & 1 & t & t^2/2 \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{bmatrix}.
Example 3.9. Consider the matrix

    N = \begin{bmatrix} 3 & -9 \\ 1 & -3 \end{bmatrix}.

The reader is invited to check that N^2 = 0. Thus,

    e^{tN} = I + tN
           = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + t\begin{bmatrix} 3 & -9 \\ 1 & -3 \end{bmatrix}
           = \begin{bmatrix} 1 + 3t & -9t \\ t & 1 - 3t \end{bmatrix}.

In general, it's clear from the formula (3.3) that the entries of e^{tN} are polynomials in t.
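Because the sum in (3.3) is finite, it is easy to turn into code. The sketch below is an illustrative addition to the notes: it evaluates the finite sum for the matrix of Example 3.9 and compares the result with scipy.linalg.expm.

```python
# Formula (3.3): e^{tN} as a finite sum for a nilpotent matrix N.
import numpy as np
from math import factorial
from scipy.linalg import expm

def expm_nilpotent(N, t):
    """Sum_{k=0}^{n-1} t^k N^k / k! for an n x n (assumed nilpotent) matrix N."""
    n = N.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)                       # N^0
    for k in range(n):
        total += (t ** k / factorial(k)) * power
        power = power @ N                   # next power of N
    return total

N = np.array([[3.0, -9.0],
              [1.0, -3.0]])                 # Example 3.9: N^2 = 0
t = 2.0
print(expm_nilpotent(N, t))                 # [[1 + 3t, -9t], [t, 1 - 3t]]
print(np.max(np.abs(expm_nilpotent(N, t) - expm(t * N))))   # essentially zero
```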
3.3. Computing Exponentials with the Jordan Decomposition. In this section we will see a “general” method for computing the matrix exponential of any matrix. I put “general” in quotes because we're making our usual assumption that you can find the eigenvalues.
Which matrices commute with each other is going to be important, so we start with an easy proposition about commuting matrices. The proof is left to the reader.
Theorem 3.10. Consider n × n matrices. We say A commutes with B if AB = BA.
(1) A commutes with itself and with the identity I.
(2) If A commutes with B, A commutes with any power of B.
(3) If A commutes with B and with C, then A commutes with αB + βC for any scalars α, β ∈ C.
If a matrix does not have enough independent eigenvectors, it's not diagonalizable. We need to look for more vectors to add to the list. It turns out that the important thing to look for is generalized eigenvectors.
Definition 3.11. Let A be an n × n matrix and let λ be an eigenvalue of A. The generalized eigenspace of A for eigenvalue λ is denoted by G(λ) and is defined as

    G(λ) = nullspace((A - λI)^n).

Note that if (A - λI)^p v = 0 then (A - λI)^q v = 0 for all q > p. In particular, the eigenspace of A for eigenvalue λ is defined by E(λ) = nullspace(A - λI), so E(λ) ⊆ G(λ). In particular, G(λ) ≠ {0}.
The following Big Theorem will justify our constructions in this section. It is stated without proof.

Theorem 3.12 (Generalized Eigenspaces Theorem). Let A be an n × n matrix and let λ_1, ..., λ_k be the distinct eigenvalues of A.
If you find a basis for each of the generalized eigenspaces G(λ_j) and concatenate these lists into one long list of vectors, you get a basis for C^n.
To make this notationally specific, suppose that you take a basis v_1^j, v_2^j, ..., v_{n_j}^j of G(λ_j), where n_j is the dimension of G(λ_j). Then the list of vectors

(3.6)    v_1^1, v_2^1, ..., v_{n_1}^1, v_1^2, v_2^2, ..., v_{n_2}^2, ..., v_1^k, v_2^k, ..., v_{n_k}^k

is a basis of C^n. In particular, the dimensions of the generalized eigenspaces add up to n.
Corollary 3.13 (Generalized Eigenspaces Decomposition). Let A be an n × n matrix and let λ_1, ..., λ_k be the distinct eigenvalues of A. Then every vector v ∈ C^n can be written uniquely as

(3.7)    v = v_1 + v_2 + \cdots + v_k,    v_j ∈ G(λ_j).

In particular, if

(3.8)    0 = v_1 + v_2 + \cdots + v_k,    v_j ∈ G(λ_j),

then each v_j is zero.

Proof. According to our big theorem, we can find a basis of C^n of the form (3.6). If v is any vector, it can be written as a linear combination of the basis vectors. In particular, we can write

(3.9)    v = \sum_{j=1}^{k} \sum_{p=1}^{n_j} c_p^j v_p^j,

for some scalars c_p^j. But the term

    \sum_{p=1}^{n_j} c_p^j v_p^j

is in G(λ_j), so (3.9) gives us the expression (3.7). Since the coefficients c_p^j are unique, so is the decomposition (3.7).
The second statement follows by uniqueness.
In preparation for our construction of the Jordan Decomposition, we have the following fact.

Proposition 3.14. Let A be an n × n matrix. Then the generalized eigenspaces of A are invariant under A. In other words, if λ is an eigenvalue of A,

    v ∈ G(λ) =⇒ Av ∈ G(λ).

Proof. Suppose that v ∈ G(λ). Then, by definition, (A - λI)^n v = 0. To check that Av is in G(λ), we have to check that (A - λI)^n (Av) is zero. But A and (A - λI)^n commute. So we have

    (A - λI)^n (Av) = ((A - λI)^n A)v
                    = (A(A - λI)^n)v
                    = A((A - λI)^n v)
                    = A0 = 0.

Thus, Av ∈ G(λ).
We’re now ready to discuss the Jordan Decomposition Theorem and how to use
it to compute matrix exponentials.
Theorem 3.15 (Jordan Decomposition Theorem). Let A be an n × n matrix. Then there are unique matrices S and N that satisfy the following conditions:
(1) A = S + N.
(2) SN = NS, i.e., N and S commute (it follows from conditions (1) and (2) that S and N commute with A).
(3) S is diagonalizable.
(4) N is nilpotent.
The expression A = S + N is the Jordan Decomposition of A.
We’ll leave the uniqueness part of this Theorem to the appendices. We’ll show
how to construct 𝑆 and 𝑁 , which will prove the existence part of the Theorem.
This Theorem will enable us to compute 𝑒𝑑𝐴 . Since 𝑆 and 𝑁 commute,
𝑒𝑑𝐴 = 𝑒𝑑𝑆 𝑒𝑑𝑁 = 𝑒𝑑𝑁 𝑒𝑑𝑆 .
We know how to compute 𝑒𝑑𝑆 , since 𝑆 is diagonalizable, and we know how to
compute 𝑒𝑑𝑁 , since 𝑁 is nilpotent.
We’ll give one application of the uniqueness part of the Theorem.
Corollary 3.16. If 𝐴 is a real matrix with Jordan Decomposition 𝐴 = 𝑆 + 𝑁 , then
𝑆 and 𝑁 are real.
Proof. In general, if A = S + N, then \bar{A} = \bar{S} + \bar{N}. Taking conjugates of both sides of SN = NS gives \bar{S}\bar{N} = \bar{N}\bar{S}. Taking conjugates of both sides of N^n = 0 shows \bar{N}^n = 0, so \bar{N} is nilpotent.
We claim that \bar{S} is diagonalizable. First observe that if P is invertible, we have PP^{-1} = P^{-1}P = I. Taking conjugates in this shows that

    \bar{P}\,\overline{P^{-1}} = \overline{P^{-1}}\,\bar{P} = \bar{I} = I.

This shows that

    \bar{P}^{-1} = \overline{P^{-1}}.

Now, since S is diagonalizable, there is an invertible matrix P and a diagonal matrix D so that P^{-1}SP = D. Taking conjugates on both sides of this equation and using our previous remark, we have

    \bar{P}^{-1}\bar{S}\bar{P} = \bar{D}.

The matrix \bar{D} is certainly diagonal, so \bar{S} is diagonalizable.
Thus, \bar{S} and \bar{N} satisfy all the required conditions to form a Jordan Decomposition of \bar{A}. Thus, in general,

    \bar{A} = \bar{S} + \bar{N}

is the Jordan Decomposition of \bar{A}.
But if A is real, then \bar{A} = A, so

    A = \bar{S} + \bar{N},

where \bar{S} and \bar{N} satisfy all the right conditions. By the uniqueness part of the Jordan Decomposition Theorem, we must have S = \bar{S} and N = \bar{N}, so both S and N are real.
We’ll now see how to construct the Jordan Decomposition, and then how to
compute 𝑒𝑑𝐴 .
To construct the Jordan Decomposition of 𝐴, we first find the eigenvalues of 𝐴,
call them πœ†1 , . . . , πœ†π‘˜ .
Next, we find a basis of each of the generalized eigenspaces 𝐺(πœ†π‘— ). We know how
to do this because
𝐺(πœ†π‘— ) = nullspace((𝐴 − πœ†πΌ)𝑛 ).
So, we compute the matrix (𝐴 − πœ†πΌ)𝑛 and use our usual algorithm to find the null
space. So, we first find the RREF of (𝐴 − πœ†πΌ)𝑛 and we can then read off a basis
of the nullspace. Obviously this will all be much pleasanter if we have machine
assistance, like the TI-89 or Maple.
Once we have found a basis of each generalized eigenspace, putting all the lists
together gives a basis of C𝑛 . Thus, if we insert all of these vector as the columns
of a matrix 𝑃 , the matrix 𝑃 in 𝑛 × π‘› and is invertible.
Now construct a diagonal matrix

(3.10)    D = \begin{bmatrix} μ_1 & & & \\ & μ_2 & & \\ & & \ddots & \\ & & & μ_n \end{bmatrix}    (off-diagonal entries are 0),

where μ_ℓ is the eigenvalue that goes with column ℓ of P, i.e., μ_ℓ = λ_j if column ℓ of P is one of the basis vectors of G(λ_j). So, all of the μ_ℓ's are eigenvalues of A, but the same eigenvalue could be repeated in several columns.
We then construct a matrix S by

(3.11)    S = PDP^{-1}.

Certainly S is diagonalizable!
Claim. If v ∈ G(λ_ℓ), then Sv = λ_ℓ v.

Proof of Claim. This is just the way we constructed it. Describe P by its columns as

    P = [p_1 | p_2 | \cdots | p_n].

Since SP = PD, we have

    Sp_j = μ_j p_j.

By the construction, a subset of the columns of P form a basis of G(λ_ℓ). Thus, we can find some columns

    p_{j_1}, p_{j_2}, ..., p_{j_m},
that form a basis, where m is the dimension of G(λ_ℓ). Since these vectors are in G(λ_ℓ), we have

    μ_{j_1} = μ_{j_2} = \cdots = μ_{j_m} = λ_ℓ.

If v ∈ G(λ_ℓ), we can write v in terms of our basis as

    v = c_1 p_{j_1} + c_2 p_{j_2} + \cdots + c_m p_{j_m}

for some scalars c_1, c_2, ..., c_m. Then we have

    Sv = c_1 Sp_{j_1} + c_2 Sp_{j_2} + \cdots + c_m Sp_{j_m}
       = c_1 λ_ℓ p_{j_1} + c_2 λ_ℓ p_{j_2} + \cdots + c_m λ_ℓ p_{j_m}
       = λ_ℓ (c_1 p_{j_1} + c_2 p_{j_2} + \cdots + c_m p_{j_m})
       = λ_ℓ v.

This completes the proof of the claim.

If v is any vector, we can write it in the form

(3.12)    v = v_1 + \cdots + v_k,    v_j ∈ G(λ_j).

Applying S gives us

(3.13)    Sv = λ_1 v_1 + λ_2 v_2 + \cdots + λ_k v_k,    λ_j v_j ∈ G(λ_j).
Claim. S commutes with A.

Proof of Claim. Recall that if v ∈ G(λ_ℓ), then Av ∈ G(λ_ℓ). If v is any vector, we can write it in the form (3.12). Multiplying by A gives us

    Av = Av_1 + Av_2 + \cdots + Av_k.

Since each Av_j is in G(λ_j), we can apply S to get

    SAv = λ_1 Av_1 + λ_2 Av_2 + \cdots + λ_k Av_k.

On the other hand,

    Sv = λ_1 v_1 + λ_2 v_2 + \cdots + λ_k v_k,

and multiplying by A gives

    ASv = A(λ_1 v_1 + λ_2 v_2 + \cdots + λ_k v_k) = λ_1 Av_1 + λ_2 Av_2 + \cdots + λ_k Av_k.

Thus, SA = AS.
Now, we define N by N = A - S. Obviously A = S + N. Since S commutes with A and with itself, S commutes with A - S = N. Thus, SN = NS. It remains to show that N is nilpotent.

Claim. If v ∈ G(λ_ℓ), then

(3.14)    (A - S)^n v = (A - λ_ℓ I)^n v = 0.

Proof of Claim. Since A and S commute with each other and with the identity, A - S and A - λ_ℓ I commute. If v ∈ G(λ_ℓ), then Sv = λ_ℓ v. Thus,

    (A - S)v = Av - Sv = Av - λ_ℓ v = (A - λ_ℓ I)v.
For the next power we have

    (A - S)^2 v = (A - S)[(A - S)v]
                = (A - S)[(A - λ_ℓ I)v]
                = {(A - S)(A - λ_ℓ I)} v
                = {(A - λ_ℓ I)(A - S)} v
                = (A - λ_ℓ I)[(A - S)v]
                = (A - λ_ℓ I)[(A - λ_ℓ I)v]
                = (A - λ_ℓ I)^2 v.

We can continue inductively to show (A - S)^p v = (A - λ_ℓ I)^p v for any power p. If we set p = n, we get (3.14).
We can now prove that N is nilpotent. If v is any vector, we write it in the form

(3.15)    v = v_1 + \cdots + v_k,    v_j ∈ G(λ_j).

Applying N^n to both sides of (3.15) gives

    N^n v = (A - S)^n v
          = (A - S)^n v_1 + (A - S)^n v_2 + \cdots + (A - S)^n v_k
          = (A - λ_1 I)^n v_1 + (A - λ_2 I)^n v_2 + \cdots + (A - λ_k I)^n v_k
          = 0 + 0 + \cdots + 0 = 0.

Thus, N^n v = 0 for all v, so N^n = 0.
We've now constructed matrices S and N so that A = S + N, SN = NS, S is diagonalizable and N is nilpotent. So, we've found the Jordan Decomposition of A.
We can use the machinery we developed to compute e^{tA}. Our matrix D in (3.10) is diagonal, so

    e^{tD} = \begin{bmatrix} e^{μ_1 t} & & & \\ & e^{μ_2 t} & & \\ & & \ddots & \\ & & & e^{μ_n t} \end{bmatrix}    (off-diagonal entries are 0).

Since S = PDP^{-1}, we can compute e^{tS} by

    e^{tS} = Pe^{tD}P^{-1}.

Since N is nilpotent, we know how to compute e^{tN}. We can then compute e^{tA} as

    e^{tA} = e^{tS}e^{tN} = e^{tN}e^{tS}.
Let’s summarize the steps in this process.
Theorem 3.17 (Computing e^{tA}). Let A be an n × n matrix. To compute e^{tA}, follow these steps.
(1) Find the characteristic polynomial of A and the eigenvalues of A. Call the eigenvalues λ_1, ..., λ_k.
(2) For each eigenvalue λ_ℓ, find a basis of G(λ_ℓ) by computing the matrix B = (A - λ_ℓ I)^n and then finding a basis for the nullspace of B.
(3) Put the basis vectors you've constructed in as the columns of a matrix P.
(4) Construct the diagonal matrix

    D = \begin{bmatrix} μ_1 & & & \\ & μ_2 & & \\ & & \ddots & \\ & & & μ_n \end{bmatrix}    (off-diagonal entries are 0),

where μ_j is the eigenvalue such that the jth column of P is in G(μ_j).
(5) Construct the matrices S = PDP^{-1} and N = A - S. Then A = S + N is the Jordan Decomposition of A. If A is diagonalizable, you'll wind up with N = 0.
(6) Compute e^{tD} by

    e^{tD} = \begin{bmatrix} e^{μ_1 t} & & & \\ & e^{μ_2 t} & & \\ & & \ddots & \\ & & & e^{μ_n t} \end{bmatrix}.

(7) Compute e^{tS} by e^{tS} = Pe^{tD}P^{-1}.
(8) Compute e^{tN} by the method for nilpotent matrices discussed above.
(9) Compute e^{tA} by e^{tA} = e^{tS}e^{tN}.
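Here is a rough numerical sketch of the steps above, added to these notes as an illustration rather than a robust implementation. In the spirit of the notes' standing assumption, the distinct eigenvalues are supplied by hand; scipy.linalg.null_space finds bases of the generalized eigenspaces, and the test matrix at the end is a small made-up example with a repeated, defective eigenvalue.

```python
# Numerical sketch of Theorem 3.17: e^{tA} via the Jordan Decomposition.
import numpy as np
from math import factorial
from scipy.linalg import expm, null_space

def expm_via_jordan(A, t, eigenvalues):
    """Follow steps (2)-(9); `eigenvalues` lists the distinct eigenvalues of A."""
    n = A.shape[0]
    cols, mus = [], []
    for lam in eigenvalues:                                   # step (2)
        B = np.linalg.matrix_power(A - lam * np.eye(n), n)
        V = null_space(B)                                     # basis of G(lam)
        cols.append(V)
        mus.extend([lam] * V.shape[1])
    P = np.hstack(cols)                                       # step (3)
    Pinv = np.linalg.inv(P)
    D = np.diag(mus)                                          # step (4)
    S = P @ D @ Pinv                                          # step (5)
    N = A - S                                                 # nilpotent part
    etS = P @ np.diag(np.exp(t * np.array(mus))) @ Pinv       # steps (6)-(7)
    etN = sum((t ** k / factorial(k)) * np.linalg.matrix_power(N, k)
              for k in range(n))                              # step (8)
    return etS @ etN                                          # step (9)

A = np.array([[2.0, 1.0, 0.0],      # made-up test matrix (assumption):
              [0.0, 2.0, 0.0],      # eigenvalue 2 is repeated and defective
              [0.0, 0.0, 1.0]])
print(np.max(np.abs(expm_via_jordan(A, 1.0, [2.0, 1.0]) - expm(A))))  # tiny
```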
Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX
79409-1042
E-mail address: lance.drager@ttu.edu