NOTES ON SOLVING LINEAR SYSTEMS OF DIFFERENTIAL EQUATIONS

LANCE D. DRAGER

1. Introduction

A problem that comes up in many different fields of mathematics and engineering is solving a system of linear constant coefficient differential equations. Such a system looks like

(1.1)
$$\begin{aligned}
x_1'(t) &= a_{11} x_1(t) + a_{12} x_2(t) + \cdots + a_{1n} x_n(t) \\
x_2'(t) &= a_{21} x_1(t) + a_{22} x_2(t) + \cdots + a_{2n} x_n(t) \\
&\;\;\vdots \\
x_n'(t) &= a_{n1} x_1(t) + a_{n2} x_2(t) + \cdots + a_{nn} x_n(t),
\end{aligned}$$

where the $a_{ij}$'s are constants. This is a system of $n$ differential equations for the $n$ unknown functions $x_1(t), \dots, x_n(t)$. It's important to note that the equations are coupled, meaning that the expression for the derivative $x_j'(t)$ contains not only $x_j(t)$ but (possibly) all the rest of the unknown functions, so it's unclear how to proceed using the methods we've learned for scalar differential equations.

Of course, to find a specific solution of (1.1), we need to specify initial conditions for the unknown functions at some value $t_0$ of $t$,

(1.2)
$$x_1(t_0) = b_1, \quad x_2(t_0) = b_2, \quad \dots, \quad x_n(t_0) = b_n,$$

where $b_1, b_2, \dots, b_n$ are constants.

It's pretty clear that linear algebra is going to help here. We can put our unknown functions into a vector-valued function
$$x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix},$$
and our constant coefficients into an $n \times n$ matrix $A = [a_{ij}]$. Recall that to differentiate a vector-valued function, we differentiate each component, so
$$x'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix},$$
and we can rewrite (1.1) more compactly in vector-matrix form as

(1.3) $x'(t) = A x(t)$.

If we put our initial values into a vector
$$b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},$$
we can rewrite the initial conditions (1.2) as

(1.4) $x(t_0) = b$.

Thus, the matrix form of our problem is

(1.5) $x'(t) = A x(t), \qquad x(t_0) = b$.

A problem of this form is called an initial value problem (IVP). For information on the matrix manipulations used in these notes, see the Appendix.

Eigenvalues and eigenvectors are going to be important to our solution methods. Of course, even real matrices can have complex, nonreal eigenvalues. To take care of this problem, we'll work with complex matrices and complex solutions to the differential equation from the start. In many (but not all) applications, one is only interested in real solutions, so we'll indicate as we go along what happens when our matrix $A$ and our initial conditions $b$ are real.

The equation $x'(t) = Ax(t)$ is a homogeneous equation. An inhomogeneous equation would be one of the form $x'(t) = Ax(t) + f(t)$, where $f(t)$ is a given vector-valued function. We'll discuss homogeneous systems to begin with, and show how to solve inhomogeneous systems at the end of these notes.

1.1. Notation. The symbol $\mathbb{R}$ will denote the real numbers and $\mathbb{C}$ will denote the complex numbers. We will denote by $\mathrm{Mat}_{m \times n}(\mathbb{C})$ the space of $m \times n$ matrices with entries in $\mathbb{C}$, and $\mathrm{Mat}_{m \times n}(\mathbb{R})$ will denote the space of $m \times n$ matrices with entries in $\mathbb{R}$. We use $\mathbb{R}^n$ as a synonym for $\mathrm{Mat}_{n \times 1}(\mathbb{R})$, the space of column vectors with $n$ entries. Similarly, $\mathbb{C}^n$ is a synonym for $\mathrm{Mat}_{n \times 1}(\mathbb{C})$.
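It may help to keep a computational picture of the problem in mind. Here is a minimal numerical sketch of the IVP (1.5) using SciPy's general-purpose ODE solver; the matrix $A$ and the initial vector $b$ are made-up values chosen only for illustration.

```python
# Numerically solve x'(t) = A x(t), x(0) = b for a hypothetical 2x2 system.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # made-up coefficient matrix
b = np.array([1.0, 0.0])       # made-up initial condition at t0 = 0

sol = solve_ivp(lambda t, x: A @ x, (0.0, 5.0), b, rtol=1e-9, atol=1e-12)
print(sol.y[:, -1])            # approximation to x(5)
```

The rest of these notes develop the exact, closed-form solution that such a solver only approximates.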
2. The Matrix Exponential

In this section, we'll first consider the existence and uniqueness question for our system of equations. We'll then consider the fundamental matrix for our system and show how it solves the problem. In the last subsection, we'll see that this fundamental matrix is a matrix exponential function.

2.1. The Initial Value Problem and Existence and Uniqueness. The main problem we're interested in is the initial value problem

(2.1) $x'(t) = A x(t), \qquad x(t_0) = b$,

where $A \in \mathrm{Mat}_{n \times n}(\mathbb{C})$, $b \in \mathbb{C}^n$, and we're solving for a function $x(t)$ with values in $\mathbb{C}^n$, defined on some interval in $\mathbb{R}$ containing $t_0$. We state the following existence and uniqueness theorem without proof.

Theorem 2.1 (Existence and Uniqueness Theorem). Let $A$ be an $n \times n$ complex matrix, let $b \in \mathbb{C}^n$, and let $t_0 \in \mathbb{R}$.

(1) There is a differentiable function $x \colon \mathbb{R} \to \mathbb{C}^n : t \mapsto x(t)$ such that
$$x'(t) = A x(t) \text{ for all } t \in \mathbb{R}, \qquad x(t_0) = b.$$
(2) If $J \subseteq \mathbb{R}$ is an open interval that contains $t_0$ and $y \colon J \to \mathbb{C}^n$ is a differentiable function such that
$$y'(t) = A y(t) \text{ for all } t \in J, \qquad y(t_0) = b,$$
then $y(t) = x(t)$ for all $t \in J$.

In view of this theorem, we may as well consider solutions defined on all of $\mathbb{R}$. For brevity, we'll summarize by saying that solutions of the initial value problem are unique.

It will turn out to be useful to consider initial value problems for matrix-valued functions. To distinguish the cases, we'll usually write $X(t)$ for our unknown function with values in $\mathrm{Mat}_{n \times n}(\mathbb{C})$.

Theorem 2.2. Suppose that $A \in \mathrm{Mat}_{n \times n}(\mathbb{C})$ and that $t_0 \in \mathbb{R}$. Let $C$ be a fixed $n \times n$ complex matrix. Then there is a function
$$X \colon \mathbb{R} \to \mathrm{Mat}_{n \times n}(\mathbb{C}) : t \mapsto X(t)$$
such that

(2.2) $X'(t) = A X(t)$ for $t \in \mathbb{R}$, $\qquad X(t_0) = C$.

This solution is unique in the sense of Theorem 2.1.

Proof. If we write $X(t)$ in terms of its columns as
$$X(t) = [\,x_1(t) \mid x_2(t) \mid \cdots \mid x_n(t)\,],$$
so each $x_j(t)$ is a vector-valued function, then
$$X'(t) = [\,x_1'(t) \mid x_2'(t) \mid \cdots \mid x_n'(t)\,], \qquad A X(t) = [\,A x_1(t) \mid A x_2(t) \mid \cdots \mid A x_n(t)\,].$$
Thus, the matrix differential equation $X'(t) = A X(t)$ is equivalent to the $n$ vector differential equations
$$x_1'(t) = A x_1(t), \quad x_2'(t) = A x_2(t), \quad \dots, \quad x_n'(t) = A x_n(t).$$
If we write the initial matrix $C$ in terms of its columns as $C = [\,c_1 \mid c_2 \mid \cdots \mid c_n\,]$, then the initial condition $X(t_0) = C$ is equivalent to the vector equations
$$x_1(t_0) = c_1, \quad x_2(t_0) = c_2, \quad \dots, \quad x_n(t_0) = c_n.$$
Since each of the initial value problems
$$x_j'(t) = A x_j(t), \quad x_j(t_0) = c_j, \qquad j = 1, 2, \dots, n,$$
has a unique solution, we conclude that the matrix initial value problem (2.2) has a unique solution. □

2.2. The Fundamental Matrix and Its Properties. It turns out we only need to solve one matrix initial value problem in order to solve them all.

Definition 2.3. Let $A$ be a complex $n \times n$ matrix. The unique $n \times n$ matrix-valued function $X(t)$ that solves the matrix initial value problem

(2.3) $X'(t) = A X(t), \qquad X(0) = I$

will be denoted by $\Phi_A(t)$, in order to indicate the dependence on $A$. In other words, $\Phi_A(t)$ is the unique function so that

(2.4) $\Phi_A'(t) = A \Phi_A(t), \qquad \Phi_A(0) = I$.

The function $\Phi_A(t)$ is called the Fundamental Matrix of (2.3).

We'll have a much more intuitive notation for $\Phi_A(t)$ in a bit, but we need to do some work first.
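Definition 2.3 already suggests a computation: integrate the matrix IVP column by column, exactly as in the proof of Theorem 2.2. The sketch below does this for a made-up $A$ and checks the answer against scipy.linalg.expm, which (as the notation introduced in Section 2.3 will explain) computes precisely $\Phi_A(t)$.

```python
# Approximate Phi_A(t1) by solving X'(t) = A X(t), X(0) = I, one column at a time.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # made-up matrix
n, t1 = A.shape[0], 2.0

cols = []
for j in range(n):
    ej = np.eye(n)[:, j]       # j-th column of the identity
    sol = solve_ivp(lambda t, x: A @ x, (0.0, t1), ej, rtol=1e-10, atol=1e-12)
    cols.append(sol.y[:, -1])
Phi_t1 = np.column_stack(cols)

print(np.allclose(Phi_t1, expm(t1 * A), atol=1e-6))   # True
```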
First, let's show that $\Phi_A(t)$ solves the initial value problems we've discussed so far.

Theorem 2.4. Let $A$ be an $n \times n$ complex matrix.

(1) Let $b \in \mathbb{C}^n$. The solution of the initial value problem

(2.5) $x'(t) = A x(t), \qquad x(t_0) = b$,

is

(2.6) $x(t) = \Phi_A(t - t_0)\, b$.

(2) Let $C \in \mathrm{Mat}_{n \times n}(\mathbb{C})$. The solution $X(t)$ of the matrix initial value problem

(2.7) $X'(t) = A X(t), \qquad X(t_0) = C$,

is

(2.8) $X(t) = \Phi_A(t - t_0)\, C$.

Proof. Consider the matrix-valued function $\Psi(t) = \Phi_A(t - t_0)$. We then have
$$\Psi'(t) = \frac{d}{dt}\Phi_A(t - t_0) = \Phi_A'(t - t_0)\,\frac{d}{dt}(t - t_0) = A\Phi_A(t - t_0) = A\Psi(t).$$
We also have $\Psi(t_0) = \Phi_A(t_0 - t_0) = \Phi_A(0) = I$.

For the first part of the proof, suppose $b$ is a constant vector, and let $y(t) = \Psi(t)b$. Then
$$y'(t) = \Psi'(t)b = A\Psi(t)b = Ay(t),$$
and $y(t_0) = \Psi(t_0)b = Ib = b$. Thus, $y(t)$ is the unique solution of the initial value problem (2.5). The proof of the second part is very similar. □

Exercise 2.5. Show that $\Phi_0(t) = I$ for all $t$, where $0$ is the $n \times n$ zero matrix.

In the rest of this subsection, we're going to derive some properties of $\Phi_A(t)$. The pattern of proof is the same in each case: we show that two functions satisfy the same initial value problem, so they must be the same. Here's a simple example to start.

Theorem 2.6. Let $A$ be an $n \times n$ real matrix. Then $\Phi_A(t)$ is real. The solutions of the initial value problems (2.5) and (2.7) are real if the initial data, $b$ or $C$, is real.

Recall that the conjugate of a complex number $z$ is denoted by $\bar{z}$. Of course, $z$ is real if and only if $z = \bar{z}$. If $x(t)$ is a complex-valued function, we define $\bar{x}(t)$ by $\bar{x}(t) = \overline{x(t)}$. Note that
$$\bar{x}'(t) = \overline{\frac{d}{dt}x(t)} = \overline{x'(t)}.$$

Proof of Theorem. Let $A$ be a complex matrix. The fundamental matrix $\Phi_A(t)$ solves the IVP

(2.9) $X'(t) = A X(t), \qquad X(0) = I$.

Suppose $X(t)$ is the solution of this IVP. Taking conjugates, we get

(2.10) $\bar{X}'(t) = \bar{A}\bar{X}(t), \qquad \bar{X}(0) = I$.

In other words, (2.10) shows that $\bar{X}(t)$ is the solution of the IVP

(2.11) $Y'(t) = \bar{A}\,Y(t), \qquad Y(0) = I$,

so we must have $Y(t) = \bar{X}(t)$. But (2.11) is the IVP that defines $\Phi_{\bar{A}}(t)$. In other words, for a complex matrix $A$, we have
$$\Phi_{\bar{A}}(t) = \overline{\Phi_A(t)}.$$
Suppose now that $A$ is real. Then $\bar{A} = A$, and the last equation becomes
$$\Phi_A(t) = \overline{\Phi_A(t)}.$$
Since $\Phi_A(t)$ is equal to its conjugate, it must be real. Of course, if $\Phi_A(t)$ is real and $b$ is a real vector, the solution $x(t) = \Phi_A(t - t_0)b$ of (2.5) is real. A similar argument takes care of (2.7). □

The next theorem gives the basic property of $\Phi_A(t)$ that is most often used.

Theorem 2.7. Let $A$ be an $n \times n$ complex matrix. For any $t, s \in \mathbb{R}$,

(2.12) $\Phi_A(t + s) = \Phi_A(t)\Phi_A(s)$.

Proof. Let $s$ be fixed but arbitrary, and think of both sides of (2.12) as functions of $t$. Consider the matrix initial value problem

(2.13) $X'(t) = A X(t), \qquad X(0) = \Phi_A(s)$.

According to Theorem 2.4, the solution of this initial value problem is $X(t) = \Phi_A(t)\Phi_A(s)$. On the other hand, consider the function $\Psi(t) = \Phi_A(t + s)$. We have
$$\Psi'(t) = \frac{d}{dt}\Phi_A(t+s) = \Phi_A'(t+s)\,\frac{d}{dt}(t+s) = A\Phi_A(t+s) = A\Psi(t),$$
and $\Psi(0) = \Phi_A(0 + s) = \Phi_A(s)$. Thus, $\Psi(t)$ is also a solution of the initial value problem (2.13). Our two solutions must be the same, which proves the theorem. □

Since $t + s = s + t$, we readily obtain the following corollary.

Corollary 2.8. For any real numbers $t$ and $s$,
$$\Phi_A(t)\Phi_A(s) = \Phi_A(s)\Phi_A(t),$$
i.e., the matrices $\Phi_A(t)$ and $\Phi_A(s)$ commute.

Almost as easily, we obtain a second corollary.

Corollary 2.9. For any $t \in \mathbb{R}$, the matrix $\Phi_A(t)$ is invertible and the inverse is $\Phi_A(-t)$.

Proof.
$$\Phi_A(t)\Phi_A(-t) = \Phi_A(-t)\Phi_A(t) = \Phi_A(t + (-t)) = \Phi_A(0) = I. \qquad \square$$
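Properties like Theorem 2.7 and Corollary 2.9 are easy to sanity-check numerically. A sketch, again writing $\Phi_A(t)$ as expm(t * A) (a notation justified in Section 2.3) and using a made-up random matrix:

```python
# Check Phi_A(t+s) = Phi_A(t) Phi_A(s) and Phi_A(t)^(-1) = Phi_A(-t) numerically.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))    # made-up matrix
t, s = 0.7, -1.3

print(np.allclose(expm((t + s) * A), expm(t * A) @ expm(s * A)))   # True
print(np.allclose(np.linalg.inv(expm(t * A)), expm(-t * A)))       # True
```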
The next theorem has a similar proof; we'll use it later.

Theorem 2.10. Let $A$ be an $n \times n$ complex matrix and let $s$ be a real number. Then
$$\Phi_{sA}(t) = \Phi_A(st), \qquad t \in \mathbb{R}.$$

Proof. Let $\Psi(t) = \Phi_A(st)$. Then
$$\Psi'(t) = \frac{d}{dt}\Phi_A(st) = \Phi_A'(st)\,\frac{d}{dt}(st) = s\,\Phi_A'(st) = s A \Phi_A(st) = (sA)\Psi(t),$$
and $\Psi(0) = I$. Thus, $\Psi(t)$ satisfies the same initial value problem that characterizes $\Phi_{sA}(t)$, so the two functions must be equal. □

The next theorem describes the effect of similarity transformations. This will be the key to computing $e^{tA}$ when $A$ is diagonalizable.

Theorem 2.11. Let $A$ be an $n \times n$ complex matrix, and let $Q$ be an invertible $n \times n$ complex matrix. Then
$$\Phi_{Q^{-1}AQ}(t) = Q^{-1}\Phi_A(t)\,Q.$$

Proof. Let $\Psi(t) = Q^{-1}\Phi_A(t)Q$. Then we have
$$\Psi'(t) = Q^{-1}\Phi_A'(t)Q = Q^{-1}A\Phi_A(t)Q = Q^{-1}AQ\,Q^{-1}\Phi_A(t)Q = (Q^{-1}AQ)\Psi(t).$$
We also have $\Psi(0) = Q^{-1}\Phi_A(0)Q = Q^{-1}IQ = I$. Thus, $\Psi(t)$ is a solution of the same initial value problem that characterizes $\Phi_{Q^{-1}AQ}(t)$, so the two functions are equal. □

The following proposition is preparation for the next theorem. As we'll see, what commutes with what is important in this subject.

Proposition 2.12. Let $A$ be an $n \times n$ complex matrix. Let $B$ be an $n \times n$ complex matrix that commutes with $A$, i.e., $AB = BA$. Then $B$ commutes with $\Phi_A(t)$, i.e.,
$$B\Phi_A(t) = \Phi_A(t)B, \qquad t \in \mathbb{R}.$$
In particular, since $A$ commutes with itself, we have $A\Phi_A(t) = \Phi_A(t)A$ for all $t \in \mathbb{R}$.

Proof. Consider the matrix initial value problem

(2.14) $X'(t) = A X(t), \qquad X(0) = B$.

According to Theorem 2.4, the solution to this initial value problem is $X(t) = \Phi_A(t)B$. Let $\Psi(t) = B\Phi_A(t)$. Then we have
$$\Psi'(t) = B\Phi_A'(t) = BA\Phi_A(t) = AB\Phi_A(t) = A\Psi(t),$$
where we used $AB = BA$ in the third equality. We also have $\Psi(0) = B\Phi_A(0) = BI = B$. Thus, $\Psi(t)$ solves the initial value problem (2.14), so our two solutions must be equal. □

Theorem 2.13. If $A$ and $B$ are $n \times n$ matrices that commute, i.e., $AB = BA$, then
$$\Phi_{A+B}(t) = \Phi_A(t)\Phi_B(t).$$
Of course, since $A + B = B + A$, we would also have $\Phi_{A+B}(t) = \Phi_B(t)\Phi_A(t)$, so the matrices $\Phi_A(t)$ and $\Phi_B(t)$ commute.

Proof. You know the drill by now. Let $\Psi(t) = \Phi_A(t)\Phi_B(t)$. Then we have
$$\begin{aligned}
\Psi'(t) &= \Phi_A'(t)\Phi_B(t) + \Phi_A(t)\Phi_B'(t) && \text{by the Product Rule} \\
&= A\Phi_A(t)\Phi_B(t) + \Phi_A(t)B\Phi_B(t) \\
&= A\Phi_A(t)\Phi_B(t) + B\Phi_A(t)\Phi_B(t) && \text{by Proposition 2.12} \\
&= (A+B)\Phi_A(t)\Phi_B(t) = (A+B)\Psi(t).
\end{aligned}$$
In addition, we have $\Psi(0) = \Phi_A(0)\Phi_B(0) = II = I$. Thus, $\Psi(t)$ solves the same initial value problem as $\Phi_{A+B}(t)$, so the two functions are equal. □

Exercise 2.14. Under the hypotheses of the last theorem, show that $\Phi_A(s)\Phi_B(t) = \Phi_B(t)\Phi_A(s)$ for all $s, t \in \mathbb{R}$, not just $s = t$ as stated in the theorem.

2.3. Matrix Exponentials. If $a$ is a number, consider the initial value problem
$$x'(t) = ax, \qquad x(0) = 1.$$
The solution is $x(t) = e^{at}$. Of course, this depends on knowing the exponential function $\exp(x) = e^x$, but this function can also be recovered from the solution of the IVP as $e^a = x(1)$. We can follow a similar course to define $e^A = \exp(A)$ when $A$ is a matrix.

Definition 2.15. If $A$ is a complex $n \times n$ matrix, we define $e^A = \Phi_A(1)$.

Now use Theorem 2.10 (reversing the roles of $s$ and $t$): for any $s$, we have $\Phi_{tA}(s) = \Phi_A(ts)$. Setting $s = 1$ in this equation gives us
$$e^{tA} = \Phi_{tA}(1) = \Phi_A(t).$$
For the rest of the notes, we'll write $e^{tA}$ or $\exp(tA)$ instead of $\Phi_A(t)$.
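Theorem 2.13's hypothesis that $A$ and $B$ commute is essential. A numeric sketch with made-up matrices: the first pair commutes (a matrix and its own square), the second pair does not.

```python
# e^{t(A+B)} = e^{tA} e^{tB} when AB = BA; it generally fails otherwise.
import numpy as np
from scipy.linalg import expm

t = 0.5
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])     # made-up matrix
B = A @ A                      # commutes with A (powers of A commute with A)
C = np.array([[0.0, 1.0],
              [0.0, 0.0]])     # does not commute with A

print(np.allclose(expm(t * (A + B)), expm(t * A) @ expm(t * B)))   # True
print(np.allclose(expm(t * (A + C)), expm(t * A) @ expm(t * C)))   # False
```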
Let's summarize the properties we've developed so far in the new notation.

Proposition 2.16. If $A$ is a complex $n \times n$ matrix, the matrix exponential function $e^{tA}$ satisfies the following properties.

(1) $\dfrac{d}{dt}e^{tA} = Ae^{tA}$, and $e^{0A} = I$.
(2) If $A$ is real, $e^{tA}$ is real.
(3) $e^{(t_1+t_2)A} = e^{t_1 A}\,e^{t_2 A}$.
(4) For each fixed $t$, $e^{tA}$ is invertible, and the inverse is $e^{-tA}$.
(5) $e^{t(sA)} = e^{(st)A}$.
(6) For any invertible $n \times n$ matrix $Q$, $Q^{-1}e^{tA}Q = e^{tQ^{-1}AQ}$.
(7) If $B$ commutes with $A$, then $Be^{tA} = e^{tA}B$.
(8) If $A$ and $B$ commute, then $e^{t(A+B)} = e^{tA}e^{tB} = e^{tB}e^{tA}$.

Remark 1. Of course, it would be nice to know what $e^{t(A+B)}$ is when $A$ and $B$ don't commute. The answer is given by the Baker-Campbell-Hausdorff formula, which is fairly complicated. This is beyond the scope of these notes.

There is another approach to developing the matrix exponential function, which we should mention. Recall that, for a real variable $x$, the exponential function $e^x$ is given by a power series, namely
$$e^x = \sum_{k=0}^{\infty} \frac{1}{k!}x^k = 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3 + \frac{1}{24}x^4 + \cdots.$$
If $A$ is a square matrix, then formally substituting $x = tA$ in the series above would suggest that we define $e^{tA}$ by

(2.15)
$$e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!}A^k = I + tA + \frac{1}{2}t^2A^2 + \frac{1}{6}t^3A^3 + \cdots.$$

Of course, this is an infinite sum of matrices. One can interpret convergence of this series to mean that the scalar infinite series one gets in each slot converges. To make this proposed definition valid, we would have to show that the series (2.15) converges. To make the definition useful, we would have to develop enough properties of such infinite series to prove that $e^{tA}$ has the properties in Proposition 2.16. We've decided to take a different approach. Still, we'll see an echo of the power series (2.15) below.

3. Computing Matrix Exponentials

We will first consider how to compute $e^{tA}$ when $A$ is diagonalizable, and then consider the nondiagonalizable case.

3.1. The Diagonalizable Case. Recall that an $n \times n$ matrix $A$ is diagonalizable if there is an invertible matrix $P$ so that $P^{-1}AP$ is a diagonal matrix $D$. This means that $D$ is of the form
$$D = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix},$$
where all the off-diagonal entries are zero. The first thing to do is to find $e^{tD}$ when $D$ is diagonal.

Proposition 3.1. If $D$ is the diagonal matrix
$$D = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n),$$
then $e^{tD}$ is the diagonal matrix
$$e^{tD} = \operatorname{diag}\big(e^{\lambda_1 t}, e^{\lambda_2 t}, \dots, e^{\lambda_n t}\big).$$

Proof. We just check that our proposed solution satisfies the right initial value problem. So, define $\Psi(t)$ by
$$\Psi(t) = \operatorname{diag}\big(e^{\lambda_1 t}, \dots, e^{\lambda_n t}\big).$$
Then $\Psi(0) = \operatorname{diag}(e^0, \dots, e^0) = \operatorname{diag}(1, \dots, 1) = I$. By differentiating each component, we get
$$\Psi'(t) = \operatorname{diag}\big(\lambda_1 e^{\lambda_1 t}, \dots, \lambda_n e^{\lambda_n t}\big).$$
On the other hand,
$$D\Psi(t) = \operatorname{diag}(\lambda_1, \dots, \lambda_n)\operatorname{diag}\big(e^{\lambda_1 t}, \dots, e^{\lambda_n t}\big) = \operatorname{diag}\big(\lambda_1 e^{\lambda_1 t}, \dots, \lambda_n e^{\lambda_n t}\big).$$
Thus, $\Psi'(t) = D\Psi(t)$. We conclude that $\Psi(t)$ solves the initial value problem that defines $e^{tD}$, so we must have $e^{tD} = \Psi(t)$. □

We can now easily compute $e^{tA}$ if $A$ is diagonalizable. In this case, we have $P^{-1}AP = D$ for some diagonal matrix $D$. We can also write this equation as $A = PDP^{-1}$. But then, we know
$$e^{tA} = e^{tPDP^{-1}} = Pe^{tD}P^{-1}.$$
So, to compute $e^{tA}$ when $A$ is diagonalizable, we use the equation

(3.1) $e^{tA} = Pe^{tD}P^{-1}$.
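Equation (3.1) translates directly into code. Here is a hedged numeric sketch that assumes $A$ is diagonalizable and lets numpy.linalg.eig supply $P$ and the eigenvalues; scipy.linalg.expm serves as an independent check.

```python
# Compute e^{tA} = P e^{tD} P^{-1} for a diagonalizable A, per equation (3.1).
import numpy as np
from scipy.linalg import expm

def expm_diagonalizable(A, t):
    evals, P = np.linalg.eig(A)        # columns of P are eigenvectors
    etD = np.diag(np.exp(t * evals))   # e^{tD} is diagonal (Proposition 3.1)
    return P @ etD @ np.linalg.inv(P)

A = np.array([[8.0, -4.0, -3.0],
              [6.0, -2.0, -3.0],
              [6.0, -4.0, -1.0]])      # the matrix of Example 3.2 below
print(np.allclose(expm_diagonalizable(A, 1.0), expm(A)))   # True
```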
Let's do a couple of computations.

Example 3.2. Consider the matrix
$$A = \begin{bmatrix} 8 & -4 & -3 \\ 6 & -2 & -3 \\ 6 & -4 & -1 \end{bmatrix}.$$
The characteristic polynomial of $A$ is
$$p(\lambda) = \lambda^3 - 5\lambda^2 + 8\lambda - 4,$$
which factors as $p(\lambda) = (\lambda - 1)(\lambda - 2)^2$. Thus the eigenvalues are $1$ and $2$.

First, let's find a basis for $E(1)$, the eigenspace for eigenvalue $1$. Of course, $E(1)$ is the nullspace of $A - (1)I$. We calculate that
$$A - I = \begin{bmatrix} 7 & -4 & -3 \\ 6 & -3 & -3 \\ 6 & -4 & -2 \end{bmatrix}.$$
The reduced row echelon form of $A - I$ is
$$R = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}.$$
By the usual method, we find the nullspace of $R$, which is the same as the nullspace of $A - I$. The conclusion is that $E(1)$ is one dimensional with basis
$$v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$

Next, let's look for a basis of $E(2)$. We calculate
$$A - 2I = \begin{bmatrix} 6 & -4 & -3 \\ 6 & -4 & -3 \\ 6 & -4 & -3 \end{bmatrix}.$$
The reduced row echelon form of $A - 2I$ is
$$R = \begin{bmatrix} 1 & -2/3 & -1/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
By the usual method, we find that $E(2)$ is two dimensional with basis
$$u_2 = \begin{bmatrix} 1/2 \\ 0 \\ 1 \end{bmatrix}, \qquad u_3 = \begin{bmatrix} 2/3 \\ 1 \\ 0 \end{bmatrix}.$$
It's not necessary, but we can make things look nicer by getting rid of the fractions. If we multiply $u_2$ by $2$ and $u_3$ by $3$, we'll still have a basis. This gives us a new basis
$$v_2 = 2u_2 = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \qquad v_3 = 3u_3 = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}.$$

Since we have $3$ independent eigenvectors, the matrix $A$ is diagonalizable. We put the basis vectors into a matrix as columns, so we get
$$P = [\,v_1 \mid v_2 \mid v_3\,] = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix}.$$
The corresponding diagonal matrix is
$$D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
The reader is invited to check that $P^{-1}AP = D$. More importantly at the moment, we have $A = PDP^{-1}$. Thus, $e^{tA} = Pe^{tD}P^{-1}$. We easily calculate
$$e^{tD} = \begin{bmatrix} e^t & 0 & 0 \\ 0 & e^{2t} & 0 \\ 0 & 0 & e^{2t} \end{bmatrix}.$$
Using a machine (the software I'm using prefers $\exp(t)$ to $e^t$), we compute
$$e^{tA} = Pe^{tD}P^{-1}
= \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} e^t & 0 & 0 \\ 0 & e^{2t} & 0 \\ 0 & 0 & e^{2t} \end{bmatrix}
\begin{bmatrix} -6 & 4 & 3 \\ 3 & -2 & -1 \\ 2 & -1 & -1 \end{bmatrix}
= \begin{bmatrix} -6e^t + 7e^{2t} & 4e^t - 4e^{2t} & 3e^t - 3e^{2t} \\ -6e^t + 6e^{2t} & 4e^t - 3e^{2t} & 3e^t - 3e^{2t} \\ -6e^t + 6e^{2t} & 4e^t - 4e^{2t} & 3e^t - 2e^{2t} \end{bmatrix}.$$

Consider the initial value problem
$$x'(t) = Ax(t), \qquad x(0) = b = \begin{bmatrix} 1 \\ 2 \\ -3 \end{bmatrix}.$$
The solution is
$$x(t) = e^{tA}b = \begin{bmatrix} -7e^t + 8e^{2t} \\ -7e^t + 9e^{2t} \\ -7e^t + 4e^{2t} \end{bmatrix}.$$
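Example 3.2 leaned on machine computation, and any CAS will do. A sketch of the same check in SymPy (assuming only its standard symbolic matrix operations):

```python
# Verify Example 3.2: P^{-1} A P = D, and x(t) = P e^{tD} P^{-1} b symbolically.
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[8, -4, -3], [6, -2, -3], [6, -4, -1]])
P = sp.Matrix([[1, 1, 2], [1, 0, 3], [1, 2, 0]])
D = sp.diag(1, 2, 2)

assert P.inv() * A * P == D            # the diagonalization checks out

etA = P * sp.diag(sp.exp(t), sp.exp(2*t), sp.exp(2*t)) * P.inv()
b = sp.Matrix([1, 2, -3])
print(sp.simplify(etA * b))            # [-7*exp(t) + 8*exp(2*t), ...]
```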
Example 3.3. Here is a simple example with complex eigenvalues. Let
$$A = \begin{bmatrix} 4 & -1 \\ 5 & 0 \end{bmatrix}.$$
The characteristic polynomial is
$$p(\lambda) = \det(A - \lambda I) = \lambda^2 - 4\lambda + 5.$$
The roots of the characteristic polynomial are $2 + i$ and $2 - i$. We calculate
$$A - (2+i)I = \begin{bmatrix} 2-i & -1 \\ 5 & -2-i \end{bmatrix}.$$
The reduced row echelon form of $A - (2+i)I$ is
$$R = \begin{bmatrix} 1 & -2/5 - (1/5)i \\ 0 & 0 \end{bmatrix}.$$
By the usual method we find the nullspace of $R$, which is the same as the nullspace of $A - (2+i)I$. The result is that the nullspace is one dimensional with basis
$$v_1 = \begin{bmatrix} 2/5 + (1/5)i \\ 1 \end{bmatrix}.$$
Thus, $v_1$ is a basis of the eigenspace of $A$ for eigenvalue $2 + i$. The other eigenvalue, $2 - i$, is the conjugate of the first eigenvalue, so we can just conjugate the basis we found for $2 + i$. This gives us
$$v_2 = \begin{bmatrix} 2/5 - (1/5)i \\ 1 \end{bmatrix},$$
a basis for the eigenspace for eigenvalue $2 - i$. Since we have two independent eigenvectors, the matrix $A$ is diagonalizable. If we put $v_1$ and $v_2$ into a matrix we get
$$P = \begin{bmatrix} 2/5 + (1/5)i & 2/5 - (1/5)i \\ 1 & 1 \end{bmatrix}.$$
The corresponding diagonal matrix is
$$D = \begin{bmatrix} 2+i & 0 \\ 0 & 2-i \end{bmatrix}.$$
Note that each column of $P$ is an eigenvector for the eigenvalue occurring in the corresponding column of $D$. You can now check that $P^{-1}AP = D$, or $A = PDP^{-1}$.

Since $A$ is real, we know that $e^{tA}$ must be real, even though $P$ and $D$ are not real. We compute $e^{tD}$ as
$$e^{tD} = \begin{bmatrix} e^{(2+i)t} & 0 \\ 0 & e^{(2-i)t} \end{bmatrix}.$$
We then have $e^{tA} = Pe^{tD}P^{-1}$. Putting this into the TI-89 gives

(3.2)
$$e^{tA} = \begin{bmatrix} e^{2t}(\cos(t) + 2\sin(t)) & -e^{2t}\sin(t) \\ 5e^{2t}\sin(t) & e^{2t}(\cos(t) - 2\sin(t)) \end{bmatrix}.$$

Of course, the calculator has used Euler's Formula
$$e^{(\alpha + i\beta)t} = e^{\alpha t}\cos(\beta t) + i\,e^{\alpha t}\sin(\beta t).$$
Sometimes you have to put this in by hand to persuade recalcitrant software to give you results that are clearly real.

If we are given an initial condition
$$x(0) = b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix},$$
the solution of the initial value problem $x'(t) = Ax(t)$, $x(0) = b$ is
$$x(t) = e^{tA}b = \begin{bmatrix} \big(e^{2t}\cos(t) + 2e^{2t}\sin(t)\big)b_1 - e^{2t}\sin(t)\,b_2 \\ 5e^{2t}\sin(t)\,b_1 + \big(e^{2t}\cos(t) - 2e^{2t}\sin(t)\big)b_2 \end{bmatrix}.$$

3.2. Nilpotent Matrices. In this subsection we discuss nilpotent matrices, which will be used in the next subsection to compute the matrix exponential of a general nondiagonalizable matrix.

Definition 3.4. A square matrix $N$ is said to be nilpotent if $N^k = 0$ for some integer $k \geq 1$.

Example 3.5. Consider the matrix
$$J_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$
Of course, $J_2 \neq 0$, but the reader can easily check that $J_2^2 = 0$. If we go one dimension higher, we have
$$J_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.$$
The reader can easily check that $J_3^2 \neq 0$ but $J_3^3 = 0$.

You can see a pattern here, but these are just some simple examples of nilpotent matrices. We'll see later that it's possible to have a nilpotent matrix where all the entries are nonzero.

Of course, if $N^k = 0$, then $N^p = 0$ for all $p > k$. One might wonder how high a power is necessary to make $N^k = 0$. This is answered in the next theorem.

Theorem 3.6. Let $N$ be a nilpotent $n \times n$ matrix. Then $N^n = 0$. To put it more precisely, if $N^{p-1} \neq 0$ and $N^p = 0$, then $p \leq n$.

The proof of this theorem will be given in the appendices. Our main concern here is to compute $e^{tN}$ when $N$ is nilpotent. This can be done by a simple matrix calculation.

Theorem 3.7. Let $N$ be a nilpotent $n \times n$ matrix. Then

(3.3)
$$e^{tN} = I + tN + \frac{1}{2!}t^2N^2 + \frac{1}{3!}t^3N^3 + \cdots + \frac{1}{(n-1)!}t^{n-1}N^{n-1} = \sum_{k=0}^{n-1} \frac{t^k}{k!}N^k.$$

Note that it may well happen that $N^k = 0$ for some $k < n - 1$, so some of the terms above will vanish. You may recognize this formula as what you would get if you plug $N$ into the power series (2.15) and notice that all the terms with exponent higher than $n - 1$ will be zero. However, we can prove the theorem without resorting to the power series; we just use our usual method.

Proof of Theorem. Consider the function
$$\Psi(t) = I + tN + \frac{1}{2!}t^2N^2 + \frac{1}{3!}t^3N^3 + \cdots + \frac{1}{(n-2)!}t^{n-2}N^{n-2} + \frac{1}{(n-1)!}t^{n-1}N^{n-1}.$$
Clearly $\Psi(0) = I$. It's easy to differentiate $\Psi(t)$ if you recall that $k/k! = 1/(k-1)!$. The result is

(3.4)
$$\Psi'(t) = N + tN^2 + \frac{1}{2}t^2N^3 + \cdots + \frac{1}{(n-3)!}t^{n-3}N^{n-2} + \frac{1}{(n-2)!}t^{n-2}N^{n-1}.$$

On the other hand, we can compute $N\Psi(t)$ by multiplying $N$ through the formula for $\Psi(t)$. The result is

(3.5)
$$N\Psi(t) = N + tN^2 + \frac{1}{2!}t^2N^3 + \cdots + \frac{1}{(n-2)!}t^{n-2}N^{n-1} + \frac{1}{(n-1)!}t^{n-1}N^n.$$

But the last term is zero, because $N^n = 0$. Thus, we get exactly the same formula in (3.4) and (3.5), so we have $\Psi'(t) = N\Psi(t)$. Thus, $\Psi(t)$ satisfies the same initial value problem that defines $e^{tN}$, so we must have $e^{tN} = \Psi(t)$. □
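Since (3.3) is a finite sum, it is easy to implement directly. A symbolic sketch, so the polynomial entries are visible:

```python
# e^{tN} for nilpotent N via the finite sum (3.3): sum of t^k N^k / k!.
import sympy as sp

def expm_nilpotent(N, t):
    n = N.shape[0]
    result = sp.zeros(n, n)
    term = sp.eye(n)                   # the k = 0 term, t^0 N^0 / 0!
    for k in range(n):
        result += term
        term = term * N * t / (k + 1)  # next term: t^{k+1} N^{k+1} / (k+1)!
    return result

t = sp.symbols('t')
N = sp.Matrix([[3, -9], [1, -3]])      # N^2 = 0; see Example 3.9 below
print(expm_nilpotent(N, t))            # Matrix([[3*t + 1, -9*t], [t, 1 - 3*t]])
```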
With this formula, we can compute $e^{tN}$ for (reasonably small) nilpotent matrices.

Example 3.8. Consider the matrix
$$J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
We have
$$J^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
J^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
J^4 = 0.$$
Thus, we can calculate
$$e^{tJ} = I + tJ + \frac{1}{2}t^2J^2 + \frac{1}{3!}t^3J^3
= \begin{bmatrix} 1 & t & t^2/2 & t^3/6 \\ 0 & 1 & t & t^2/2 \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

Example 3.9. Consider the matrix
$$N = \begin{bmatrix} 3 & -9 \\ 1 & -3 \end{bmatrix}.$$
The reader is invited to check that $N^2 = 0$. Thus,
$$e^{tN} = I + tN = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + t\begin{bmatrix} 3 & -9 \\ 1 & -3 \end{bmatrix} = \begin{bmatrix} 1+3t & -9t \\ t & 1-3t \end{bmatrix}.$$

In general, it's clear from the formula (3.3) that the entries of $e^{tN}$ are polynomials in $t$.

3.3. Computing Exponentials with the Jordan Decomposition. In this section we'll see a "general" method for computing the matrix exponential of any matrix. I put "general" in quotes because we're making our usual assumption that you can find the eigenvalues.

Which matrices commute with each other is going to be important, so we start with an easy proposition about commuting matrices. The proof is left to the reader.

Theorem 3.10. Consider $n \times n$ matrices. We say $A$ commutes with $B$ if $AB = BA$.

(1) $A$ commutes with itself and with the identity $I$.
(2) If $A$ commutes with $B$, then $A$ commutes with any power of $B$.
(3) If $A$ commutes with $B$ and with $C$, then $A$ commutes with $\alpha B + \beta C$ for any scalars $\alpha, \beta \in \mathbb{C}$.

If a matrix does not have enough independent eigenvectors, it's not diagonalizable, and we need to look for more vectors to add to the list. It turns out that the important thing to look for is generalized eigenvectors.

Definition 3.11. Let $A$ be an $n \times n$ matrix and let $\lambda$ be an eigenvalue of $A$. The generalized eigenspace of $A$ for eigenvalue $\lambda$ is denoted by $G(\lambda)$ and is defined as
$$G(\lambda) = \operatorname{nullspace}\big((A - \lambda I)^n\big).$$

Note that if $(A - \lambda I)^k v = 0$, then $(A - \lambda I)^p v = 0$ for all $p > k$. In particular, the eigenspace of $A$ for eigenvalue $\lambda$ is defined by $E(\lambda) = \operatorname{nullspace}(A - \lambda I)$, so $E(\lambda) \subseteq G(\lambda)$. In particular, $G(\lambda) \neq \{0\}$.

The following Big Theorem will justify our constructions in this section. It is stated without proof.

Theorem 3.12 (Generalized Eigenspaces Theorem). Let $A$ be an $n \times n$ matrix and let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $A$. If you find a basis for each of the generalized eigenspaces $G(\lambda_k)$ and concatenate these lists into one long list of vectors, you get a basis for $\mathbb{C}^n$.

To make this notationally specific, suppose that you take a basis
$$v^k_1, v^k_2, \dots, v^k_{d_k}$$
of $G(\lambda_k)$, where $d_k$ is the dimension of $G(\lambda_k)$. Then the list of vectors

(3.6)
$$v^1_1, v^1_2, \dots, v^1_{d_1},\; v^2_1, v^2_2, \dots, v^2_{d_2},\; \dots,\; v^m_1, v^m_2, \dots, v^m_{d_m}$$

is a basis of $\mathbb{C}^n$. In particular, the dimensions of the generalized eigenspaces add up to $n$.
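A basis of $G(\lambda)$ is found by the same nullspace algorithm we've been using, applied to $(A - \lambda I)^n$. A small sketch on a made-up defective matrix, where the generalized eigenspace is strictly bigger than the eigenspace:

```python
# For a nondiagonalizable matrix, dim G(lambda) exceeds dim E(lambda).
import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 2]])                          # made-up defective matrix
n, lam = A.shape[0], 2

E = (A - lam * sp.eye(n)).nullspace()            # eigenspace basis
G = ((A - lam * sp.eye(n)) ** n).nullspace()     # generalized eigenspace basis
print(len(E), len(G))                            # 1 2
```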
Corollary 3.13 (Generalized Eigenspaces Decomposition). Let $A$ be an $n \times n$ matrix and let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $A$. Then every vector $v \in \mathbb{C}^n$ can be written uniquely as

(3.7) $v = v_1 + v_2 + \cdots + v_m, \qquad v_k \in G(\lambda_k)$.

In particular, if

(3.8) $0 = v_1 + v_2 + \cdots + v_m, \qquad v_k \in G(\lambda_k)$,

then each $v_k$ is zero.

Proof. According to our Big Theorem, we can find a basis of $\mathbb{C}^n$ of the form (3.6). If $v$ is any vector, it can be written as a linear combination of the basis vectors. In particular, we can write

(3.9)
$$v = \sum_{k=1}^{m} \sum_{j=1}^{d_k} c^k_j v^k_j$$

for some scalars $c^k_j$. But the term
$$v_k = \sum_{j=1}^{d_k} c^k_j v^k_j$$
is in $G(\lambda_k)$, so (3.9) gives us the expression (3.7). Since the coefficients $c^k_j$ are unique, so is the decomposition (3.7). The second statement follows by uniqueness. □

In preparation for our construction of the Jordan Decomposition, we have the following fact.

Proposition 3.14. Let $A$ be an $n \times n$ matrix. The generalized eigenspaces of $A$ are invariant under $A$. In other words, if $\lambda$ is an eigenvalue of $A$,
$$v \in G(\lambda) \implies Av \in G(\lambda).$$

Proof. Suppose that $v \in G(\lambda)$. Then, by definition, $(A - \lambda I)^n v = 0$. To check that $Av$ is in $G(\lambda)$, we have to check that $(A - \lambda I)^n (Av)$ is zero. But $A$ and $(A - \lambda I)^n$ commute, so we have
$$(A - \lambda I)^n (Av) = \big((A - \lambda I)^n A\big)v = \big(A(A - \lambda I)^n\big)v = A\big((A - \lambda I)^n v\big) = A0 = 0.$$
Thus, $Av \in G(\lambda)$. □

We're now ready to discuss the Jordan Decomposition Theorem and how to use it to compute matrix exponentials.

Theorem 3.15 (Jordan Decomposition Theorem). Let $A$ be an $n \times n$ matrix. Then there are unique matrices $S$ and $N$ that satisfy the following conditions:

(1) $A = S + N$;
(2) $SN = NS$, i.e., $S$ and $N$ commute (it follows from the previous condition that $S$ and $N$ commute with $A$);
(3) $S$ is diagonalizable;
(4) $N$ is nilpotent.

The expression $A = S + N$ is the Jordan Decomposition of $A$.

We'll leave the uniqueness part of this theorem to the appendices. We'll show how to construct $S$ and $N$, which will prove the existence part of the theorem.

This theorem will enable us to compute $e^{tA}$. Since $S$ and $N$ commute,
$$e^{tA} = e^{t(S+N)} = e^{tS}e^{tN} = e^{tN}e^{tS}.$$
We know how to compute $e^{tS}$, since $S$ is diagonalizable, and we know how to compute $e^{tN}$, since $N$ is nilpotent.

We'll give one application of the uniqueness part of the theorem.

Corollary 3.16. If $A$ is a real matrix with Jordan Decomposition $A = S + N$, then $S$ and $N$ are real.

Proof. In general, if $A = S + N$, then $\bar{A} = \bar{S} + \bar{N}$. Taking the conjugates of both sides in $SN = NS$ gives $\bar{S}\bar{N} = \bar{N}\bar{S}$. Taking the conjugates of both sides in $N^n = 0$ shows $\bar{N}^n = 0$, so $\bar{N}$ is nilpotent.

We claim that $\bar{S}$ is diagonalizable. First observe that if $Q$ is invertible, we have $QQ^{-1} = Q^{-1}Q = I$. Taking conjugates shows that $\bar{Q}\,\overline{Q^{-1}} = \overline{Q^{-1}}\,\bar{Q} = \bar{I} = I$. This shows that $\bar{Q}^{-1} = \overline{Q^{-1}}$. Now, since $S$ is diagonalizable, there is an invertible matrix $Q$ and a diagonal matrix $D$ so that $Q^{-1}SQ = D$. Taking conjugates on both sides of this equation and using our previous remark, we have
$$\bar{Q}^{-1}\bar{S}\bar{Q} = \bar{D}.$$
The matrix $\bar{D}$ is certainly diagonal, so $\bar{S}$ is diagonalizable.

Thus, $\bar{S}$ and $\bar{N}$ satisfy all the required conditions to form a Jordan Decomposition of $\bar{A}$. Thus, in general,
$$\bar{A} = \bar{S} + \bar{N}$$
is the Jordan Decomposition of $\bar{A}$. But if $A$ is real, then $\bar{A} = A$, so
$$A = \bar{S} + \bar{N},$$
where $\bar{S}$ and $\bar{N}$ satisfy all the right conditions. By the uniqueness part of the Jordan Decomposition Theorem, we must have $S = \bar{S}$ and $N = \bar{N}$, so both $S$ and $N$ are real. □

We'll now see how to construct the Jordan Decomposition, and then how to compute $e^{tA}$.
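Before the hand construction, note that a CAS can produce the decomposition directly from the Jordan form $A = PJP^{-1}$: the diagonal of $J$ carries the eigenvalues and gives the diagonalizable part. A sketch using SymPy's jordan_form; the hand construction below shows why this recipe works.

```python
# Extract A = S + N from the Jordan form: S = P diag(J) P^{-1}, N = A - S.
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])             # made-up nondiagonalizable matrix
P, J = A.jordan_form()                 # A = P * J * P^{-1}
S = P * sp.diag(*[J[i, i] for i in range(J.shape[0])]) * P.inv()
N = A - S

assert S * N == N * S                  # S and N commute
assert N ** A.shape[0] == sp.zeros(*A.shape)   # N is nilpotent
print(S, N)
```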
To construct the Jordan Decomposition of $A$, we first find the eigenvalues of $A$, call them $\lambda_1, \dots, \lambda_m$. Next, we find a basis of each of the generalized eigenspaces $G(\lambda_k)$. We know how to do this because $G(\lambda_k) = \operatorname{nullspace}\big((A - \lambda_k I)^n\big)$. So, we compute the matrix $(A - \lambda_k I)^n$ and use our usual algorithm to find the nullspace: we first find the RREF of $(A - \lambda_k I)^n$ and can then read off a basis of the nullspace. Obviously this will all be much pleasanter if we have machine assistance, like the TI-89 or Maple.

Once we have found a basis of each generalized eigenspace, putting all the lists together gives a basis of $\mathbb{C}^n$. Thus, if we insert all of these vectors as the columns of a matrix $P$, the matrix $P$ is $n \times n$ and is invertible. Now construct a diagonal matrix

(3.10)
$$D = \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n) \qquad \text{(off-diagonal entries are 0)},$$

where $\mu_\ell$ is the eigenvalue that goes with column $\ell$ of $P$, i.e., $\mu_\ell = \lambda_k$ if column $\ell$ of $P$ is one of the basis vectors of $G(\lambda_k)$. So, all of the $\mu_\ell$'s are eigenvalues of $A$, but the same eigenvalue could be repeated in several columns. We then construct a matrix $S$ by

(3.11) $S = PDP^{-1}$.

Certainly $S$ is diagonalizable!

Claim. If $v \in G(\lambda_\ell)$, then $Sv = \lambda_\ell v$.

Proof of Claim. This is just the way we constructed it. Describe $P$ by its columns as
$$P = [\,p_1 \mid p_2 \mid \cdots \mid p_n\,].$$
Since $SP = PD$, we have $Sp_j = \mu_j p_j$. By the construction, a subset of the columns of $P$ forms a basis of $G(\lambda_\ell)$. Thus, we can find some columns
$$p_{j_1}, p_{j_2}, \dots, p_{j_d}$$
that form a basis, where $d$ is the dimension of $G(\lambda_\ell)$. Since these vectors are in $G(\lambda_\ell)$, we have $\mu_{j_1} = \mu_{j_2} = \cdots = \mu_{j_d} = \lambda_\ell$. If $v \in G(\lambda_\ell)$, we can write $v$ in terms of our basis as
$$v = c_1 p_{j_1} + c_2 p_{j_2} + \cdots + c_d p_{j_d}$$
for some scalars $c_1, c_2, \dots, c_d$. Then we have
$$Sv = c_1 Sp_{j_1} + c_2 Sp_{j_2} + \cdots + c_d Sp_{j_d}
= c_1 \lambda_\ell p_{j_1} + c_2 \lambda_\ell p_{j_2} + \cdots + c_d \lambda_\ell p_{j_d}
= \lambda_\ell (c_1 p_{j_1} + c_2 p_{j_2} + \cdots + c_d p_{j_d}) = \lambda_\ell v.$$
This completes the proof of the claim.

If $v$ is any vector, we can write it in the form

(3.12) $v = v_1 + \cdots + v_m, \qquad v_k \in G(\lambda_k)$.

Applying $S$ gives us

(3.13) $Sv = \lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_m v_m, \qquad \lambda_k v_k \in G(\lambda_k)$.

Claim. $S$ commutes with $A$.

Proof of Claim. Recall that if $v \in G(\lambda_\ell)$, then $Av \in G(\lambda_\ell)$. If $v$ is any vector, we can write it in the form (3.12). Multiplying by $A$ gives us
$$Av = Av_1 + Av_2 + \cdots + Av_m.$$
Since each $Av_k$ is in $G(\lambda_k)$, we can apply $S$ to get
$$SAv = \lambda_1 Av_1 + \lambda_2 Av_2 + \cdots + \lambda_m Av_m.$$
On the other hand, $Sv = \lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_m v_m$, and multiplying by $A$ gives
$$ASv = A(\lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_m v_m) = \lambda_1 Av_1 + \lambda_2 Av_2 + \cdots + \lambda_m Av_m.$$
Thus, $SA = AS$.

Now, we define $N$ by $N = A - S$. Obviously $A = S + N$. Since $S$ commutes with $A$ and itself, $S$ commutes with $A - S = N$. Thus, $SN = NS$. It remains to show that $N$ is nilpotent.

Claim. If $v \in G(\lambda_\ell)$, then

(3.14) $(A - S)^n v = (A - \lambda_\ell I)^n v = 0$.

Proof of Claim. Since $A$ and $S$ commute with each other and with the identity, $A - S$ and $A - \lambda_\ell I$ commute. If $v \in G(\lambda_\ell)$, then $Sv = \lambda_\ell v$. Thus,
$$(A - S)v = Av - Sv = Av - \lambda_\ell v = (A - \lambda_\ell I)v.$$
For the next power we have
$$\begin{aligned}
(A - S)^2 v &= (A - S)[(A - S)v] = (A - S)[(A - \lambda_\ell I)v] \\
&= \{(A - S)(A - \lambda_\ell I)\}v = \{(A - \lambda_\ell I)(A - S)\}v \\
&= (A - \lambda_\ell I)[(A - S)v] = (A - \lambda_\ell I)[(A - \lambda_\ell I)v] = (A - \lambda_\ell I)^2 v.
\end{aligned}$$
We can continue inductively to show $(A - S)^k v = (A - \lambda_\ell I)^k v$ for any power $k$. If we set $k = n$, we get (3.14). □

We can now prove that $N$ is nilpotent. If $v$ is any vector, we write it in the form

(3.15) $v = v_1 + \cdots + v_m, \qquad v_k \in G(\lambda_k)$.

Applying $N^n$ to both sides of (3.15) gives
$$\begin{aligned}
N^n v = (A - S)^n v &= (A - S)^n v_1 + (A - S)^n v_2 + \cdots + (A - S)^n v_m \\
&= (A - \lambda_1 I)^n v_1 + (A - \lambda_2 I)^n v_2 + \cdots + (A - \lambda_m I)^n v_m \\
&= 0 + 0 + \cdots + 0 = 0.
\end{aligned}$$
Thus, $N^n v = 0$ for all $v$, so $N^n = 0$.

We've now constructed matrices $S$ and $N$ so that $A = S + N$, $SN = NS$, $S$ is diagonalizable and $N$ is nilpotent. So, we've found the Jordan Decomposition of $A$.

We can use the machinery we developed to compute $e^{tA}$. Our matrix $D$ in (3.10) is diagonal, so
$$e^{tD} = \operatorname{diag}\big(e^{\mu_1 t}, e^{\mu_2 t}, \dots, e^{\mu_n t}\big) \qquad \text{(off-diagonal entries are 0)}.$$
Since $S = PDP^{-1}$, we can compute $e^{tS}$ by $e^{tS} = Pe^{tD}P^{-1}$. Since $N$ is nilpotent, we know how to compute $e^{tN}$. We can then compute $e^{tA}$ as
$$e^{tA} = e^{tS}e^{tN} = e^{tN}e^{tS}.$$
Let's summarize the steps in this process.

Theorem 3.17 (Computing $e^{tA}$). Let $A$ be an $n \times n$ matrix. To compute $e^{tA}$, follow these steps (implemented in the sketch after this list).

(1) Find the characteristic polynomial of $A$ and the eigenvalues of $A$. Call the eigenvalues $\lambda_1, \dots, \lambda_m$.
(2) For each eigenvalue $\lambda_\ell$, find a basis of $G(\lambda_\ell)$ by computing the matrix $B = (A - \lambda_\ell I)^n$ and then finding a basis for the nullspace of $B$.
(3) Put the basis vectors you've constructed in as the columns of a matrix $P$.
(4) Construct the diagonal matrix
$$D = \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n) \qquad \text{(off-diagonal entries are 0)},$$
where $\mu_j$ is the eigenvalue such that the $j$th column of $P$ is in $G(\mu_j)$.
(5) Construct the matrices $S = PDP^{-1}$ and $N = A - S$. Then $A = S + N$ is the Jordan Decomposition of $A$. If $A$ is diagonalizable, you'll wind up with $N = 0$.
(6) Compute $e^{tD} = \operatorname{diag}\big(e^{\mu_1 t}, \dots, e^{\mu_n t}\big)$.
(7) Compute $e^{tS}$ by $e^{tS} = Pe^{tD}P^{-1}$.
(8) Compute $e^{tN}$ by the method for nilpotent matrices discussed above.
(9) Compute $e^{tA}$ by $e^{tA} = e^{tS}e^{tN}$.
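The whole procedure fits in a short program. A hedged symbolic sketch, assuming (as these notes do throughout) that exact eigenvalues are available to the CAS:

```python
# Compute e^{tA} by the steps of Theorem 3.17.
import sympy as sp

def expm_via_jordan(A, t):
    n = A.shape[0]
    cols, mus = [], []
    for lam in A.eigenvals():                          # steps (1)-(2)
        for v in ((A - lam * sp.eye(n)) ** n).nullspace():
            cols.append(v)
            mus.append(lam)
    P = sp.Matrix.hstack(*cols)                        # step (3)
    D = sp.diag(*mus)                                  # step (4)
    S = P * D * P.inv()                                # step (5)
    N = A - S
    etS = P * sp.diag(*[sp.exp(mu * t) for mu in mus]) * P.inv()   # steps (6)-(7)
    etN = sum(((t**k / sp.factorial(k)) * N**k for k in range(n)),
              sp.zeros(n, n))                          # step (8), formula (3.3)
    return sp.simplify(etS * etN)                      # step (9)

t = sp.symbols('t')
A = sp.Matrix([[2, 1],
               [0, 2]])                                # made-up defective matrix
print(expm_via_jordan(A, t))   # Matrix([[exp(2*t), t*exp(2*t)], [0, exp(2*t)]])
```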
Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409-1042

E-mail address: lance.drager@ttu.edu