Contents

1 Complex numbers
  1.1 Introduction
    1.1.1 Imaginary numbers
    1.1.2 Complex numbers
  1.2 Manipulation of complex numbers
    1.2.1 Basic operations
    1.2.2 Square root of a complex number
  1.3 Representation of a complex number
    1.3.1 Algebraic representation
    1.3.2 Trigonometric representation
    1.3.3 de Moivre's theorem
    1.3.4 Complex logarithms
    1.3.5 Trigonometric and hyperbolic functions

2 Ordinary differential equations
  2.1 Introduction
    2.1.1 A simple example
    2.1.2 The direction field
    2.1.3 Basic definitions
  2.2 First order, first degree differential equations
    2.2.1 Separable equations
    2.2.2 Linear equations
    2.2.3 Exact equations
    2.2.4 Integrating factors
  2.3 Higher degree first order differential equations
    2.3.1 Equations solvable for p
    2.3.2 Equations solvable for y
    2.3.3 Equations solvable for x
    2.3.4 Special differential equations
    2.3.5 Singular solutions and envelopes
  2.4 Second order differential equations
    2.4.1 Second order homogeneous ODEs with constant coefficients
    2.4.2 The Wronskian determinant
    2.4.3 Fundamental set of solutions of homogeneous ODEs with constant coefficients
    2.4.4 Second order nonhomogeneous ODEs with constant coefficients
  2.5 Higher order linear differential equations
    2.5.1 Homogeneous n-th order ODEs
    2.5.2 Nonhomogeneous n-th order ODEs
    2.5.3 The D-operator
    2.5.4 The Euler linear equations
    2.5.5 Series solutions of linear equations

3 Complex analysis
  3.1 Complex functions
    3.1.1 Differentiable functions
    3.1.2 The Cauchy-Riemann conditions
  3.2 Complex integration
    3.2.1 Line integrals in the complex plane
    3.2.2 Cauchy's integral theorem
    3.2.3 Cauchy's integral formula
    3.2.4 Cauchy's integral formula for higher derivatives
    3.2.5 Taylor and Laurent series
    3.2.6 Residue theorem
    3.2.7 Real integrals using contour integration

4 Integral transforms
  4.1 Laplace transform
    4.1.1 Basic definition and properties
    4.1.2 Solution of initial value problems by means of Laplace transforms
    4.1.3 The Bromwich integral
  4.2 Fourier transforms
    4.2.1 Fourier series
    4.2.2 From Fourier series to Fourier transform

5 Systems of differential equations
  5.1 Review of matrices and systems of algebraic equations
    5.1.1 Matrices
    5.1.2 Systems of linear algebraic equations
  5.2 Systems of first order linear ODEs
    5.2.1 General properties
    5.2.2 Homogeneous linear systems with constant coefficients
    5.2.3 Nonhomogeneous linear systems with constant coefficients
  5.3 Systems of second order linear ODEs

6 Modeling physical systems with ODEs
  6.1 Constructing mathematical models
  6.2 Mechanical and electrical vibrations
    6.2.1 The spring-mass system
    6.2.2 Electric circuits
  6.3 Other physical processes
    6.3.1 Wave propagation
    6.3.2 Heat flow

7 Vector and tensor analysis
  7.1 Review of vector algebra and vector spaces
    7.1.1 Vector algebra
    7.1.2 Vector spaces
    7.1.3 Linear operators
  7.2 Vector calculus
    7.2.1 Differentiation of vectors
    7.2.2 Scalar and vector fields
    7.2.3 Vector operators
  7.3 Transformation of coordinates
    7.3.1 Rotation of the coordinate axes
    7.3.2 General curvilinear coordinates
  7.4 Tensors
    7.4.1 Basic definitions
    7.4.2 Einstein summation convention
    7.4.3 Direct product and contraction
    7.4.4 Kronecker delta and Levi-Civita symbol
Chapter 1 Complex numbers

"The imaginary number is a fine and wonderful resource of the human spirit, almost an amphibian between being and not being." – Gottfried Wilhelm Leibniz

1.1 Introduction

1.1.1 Imaginary numbers

As is well known, whether you square a positive or a negative real number, the result is a positive real number. This means that, in the real domain, the square root of a negative number is not defined. It also means that the equation $x^2 + 1 = 0$ has no real solution. However, for a wide class of problems it is useful to define a new kind of number whose square is a negative real number. These numbers are called imaginary numbers. We therefore define the basic imaginary number (the imaginary unit) $i$ as the square root of $-1$, namely the number such that

\[ i^2 = -1. \tag{1.1} \]

Consequently, given a positive real number $\lambda$, one has $\sqrt{-\lambda} = i\sqrt{\lambda}$. It is also worth stressing that, according to the fundamental theorem of algebra, the equation $x^2 + 1 = 0$, being a second degree equation, has two roots, the second being $-i$. Although $-i$ is distinct from $i$, it shares the property of having $-1$ as its square.

1.1.2 Complex numbers

A complex number comprises a real number and an imaginary number. It is conventionally written as $z$ and is the sum of a real part $x$ (often indicated as $\mathrm{Re}(z)$) and $i$ times an imaginary part $y$ (or $\mathrm{Im}(z)$), namely

\[ z = x + iy. \tag{1.2} \]

It is therefore equivalent to an ordered pair of two real numbers, and it is sometimes indicated with the compact notation $z = (x, y)$ (with this notation, $i = (0, 1)$). It is also important to define the complex conjugate of a complex number, which is indicated by $z^*$ (sometimes also by $\bar{z}$) and is obtained simply by changing the sign of the imaginary part of $z$. Therefore, if $z = x + iy$, then $z^* = x - iy$.

1.2 Manipulation of complex numbers

1.2.1 Basic operations

The addition of two complex numbers $z_1$ and $z_2$ gives as a result another complex number; the real and imaginary parts are added separately:

\[ z_1 + z_2 = (x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2). \tag{1.3} \]

Of course, given the commutativity and associativity of addition between real numbers, complex addition is also commutative and associative. The product of two complex numbers can simply be found by multiplying them out in full (in the same manner as with polynomials) and remembering that $i^2 = -1$, namely:

\[ z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1 x_2 + i(x_1 y_2 + y_1 x_2) + i^2 y_1 y_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2). \tag{1.4} \]

It is easy to verify that $z_1 z_2 = z_2 z_1$ (multiplication is commutative), $z_1(z_2 z_3) = (z_1 z_2) z_3$ (multiplication is associative) and $z_1(z_2 + z_3) = z_1 z_2 + z_1 z_3$ (multiplication is distributive over addition).

Example 1.2.1 Given two generic complex numbers $z_1$ and $z_2$, show that $z_1 z_2^* + z_1^* z_2$ is a real number given by $2\,\mathrm{Re}(z_1 z_2^*)$.

Assuming that $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, we can use Eq. 1.4 to obtain:

\[ z_1 z_2^* = (x_1 + iy_1)(x_2 - iy_2) = (x_1 x_2 + y_1 y_2) + i(y_1 x_2 - x_1 y_2), \]
\[ z_1^* z_2 = (x_1 - iy_1)(x_2 + iy_2) = (x_1 x_2 + y_1 y_2) - i(y_1 x_2 - x_1 y_2). \]
Therefore, $z_1 z_2^*$ and $z_1^* z_2$ are two complex numbers having the same real part but opposite imaginary parts (as expected, the second is the complex conjugate of the first); their sum is therefore a purely real number given by

\[ z_1 z_2^* + z_1^* z_2 = 2(x_1 x_2 + y_1 y_2), \]

which is twice the real part of $z_1 z_2^*$ (and also of $z_1^* z_2$, of course). We have thus demonstrated the equality $z_1 z_2^* + z_1^* z_2 = 2\,\mathrm{Re}(z_1 z_2^*)$.

The division of two complex numbers can also be obtained in a straightforward manner. Given $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, $\frac{z_1}{z_2}$ can simply be obtained by multiplying both numerator and denominator by the complex conjugate of $z_2$, namely:

\[ \frac{z_1}{z_2} = \frac{(x_1 + iy_1)(x_2 - iy_2)}{(x_2 + iy_2)(x_2 - iy_2)} = \frac{x_1 x_2 + i(y_1 x_2 - x_1 y_2) + y_1 y_2}{x_2^2 + y_2^2} = \frac{x_1 x_2 + y_1 y_2}{x_2^2 + y_2^2} + i\,\frac{y_1 x_2 - x_1 y_2}{x_2^2 + y_2^2}. \tag{1.5} \]

Example 1.2.2 Calculate the division of $z_1 = 9 + 2i$ by $z_2 = 1 + 4i$.

We have $x_1 = 9$, $y_1 = 2$, $x_2 = 1$, $y_2 = 4$. Using Eq. 1.5 we obtain:

\[ \frac{z_1}{z_2} = \frac{9 + 8}{1 + 16} + i\,\frac{2 - 36}{1 + 16} = 1 - 2i. \]

As a simple application of Eq. 1.5 we can find the inverse $1/z$ of a complex number $z \neq 0$, which turns out to be:

\[ \frac{1}{z} = \frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}. \tag{1.6} \]

The numerator of the right hand side of Eq. 1.6 is $z^*$, therefore we can notice that $z z^* = x^2 + y^2$, an expression which could also be obtained by direct multiplication of $z$ and $z^*$. The quantity $\sqrt{z z^*} = \sqrt{x^2 + y^2}$ is also called the modulus of the complex number $z$ and is indicated by $r$ or $|z|$. Since $x$ and $y$ are real numbers, $r \geq 0$. From the equation $z z^* = x^2 + y^2 = r^2$ it can easily be derived that $\frac{1}{z} = \frac{z^*}{r^2}$, which is equivalent to Eq. 1.6.

We can show that the triangle inequality $|z + w| \leq |z| + |w|$, which holds for ordinary vectors, also holds if $z$ and $w$ are generic complex numbers. In order to demonstrate it, we recall the equality $z w^* + z^* w = 2\,\mathrm{Re}(z w^*)$ seen in Example 1.2.1 and introduce the (quite easy to demonstrate) relations $\mathrm{Re}\,z \leq |z|$, $(z + w)^* = z^* + w^*$, $|zw| = |z||w|$ and $|z^*| = |z|$. We have therefore:

\[ |z+w|^2 = (z+w)(z+w)^* = (z+w)(z^* + w^*) = z z^* + (z w^* + z^* w) + w w^* = |z|^2 + 2\,\mathrm{Re}(z w^*) + |w|^2 \leq |z|^2 + 2|z w^*| + |w|^2 = |z|^2 + 2|z||w| + |w|^2 = (|z| + |w|)^2. \tag{1.7} \]

Since both $|z + w|$ and $|z| + |w|$ are non-negative, taking the square root of Eq. 1.7 we obtain:

\[ |z + w| \leq |z| + |w|. \tag{1.8} \]

1.2.2 Square root of a complex number

We have seen in Sect. 1.1.1 that the square root of a negative number $-\lambda$ is given by the two complex conjugate numbers $i\sqrt{\lambda}$ and $-i\sqrt{\lambda}$. What about the square root of a generic complex number $w = a + ib$? We have to find the complex number $z = x + iy$ such that $z^2 = w$, namely:

\[ (x + iy)(x + iy) = x^2 - y^2 + 2ixy = a + ib. \]

By equating the real and the imaginary parts we obtain:

\[ x^2 - y^2 = a, \qquad 2xy = b \;\Rightarrow\; y = \frac{b}{2x}, \qquad x^2 - \frac{b^2}{4x^2} = a. \]

The last equation is a quadratic equation in $x^2$ whose roots are $\frac{a \pm r}{2}$, where $r = \sqrt{a^2 + b^2}$ is the modulus of $w$. Since $x$ must be a real number, $x^2$ cannot be negative; as $r \geq |a|$, the only acceptable solution is $x^2 = \frac{r+a}{2}$, which gives $x = \pm\sqrt{\frac{r+a}{2}}$. By direct substitution we obtain:

\[ y = \frac{b}{2x} = \frac{b}{\pm 2\sqrt{\frac{r+a}{2}}} = \pm\sqrt{\frac{r-a}{2}}. \]

Therefore, the square root of the complex number $w = a + ib$ is:

\[ z_1 = \sqrt{\frac{r+a}{2}} + i\sqrt{\frac{r-a}{2}}, \qquad z_2 = -\sqrt{\frac{r+a}{2}} - i\sqrt{\frac{r-a}{2}}, \tag{1.9} \]

where the signs are written for $b \geq 0$; for $b < 0$ the sign of the imaginary part must be reversed so that $2xy = b$ is satisfied. Namely, the two solutions of the equation $\sqrt{w} = z$ are two opposite complex numbers $z_1$ and $z_2$. As we shall see, there is a much simpler way to calculate roots of complex numbers.
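As a quick sanity check of Eq. 1.9, the formula can be compared with Python's standard cmath library; the following is a minimal illustrative sketch (the helper name sqrt_pair is ours, not part of the notes):

```python
import cmath
import math

def sqrt_pair(w):
    """Square roots of w = a + ib via Eq. 1.9 (sign of Im chosen so 2xy = b)."""
    a, b = w.real, w.imag
    r = abs(w)                                    # modulus r = sqrt(a^2 + b^2)
    x = math.sqrt((r + a) / 2)
    y = math.copysign(math.sqrt((r - a) / 2), b)  # match the sign of b
    z1 = complex(x, y)
    return z1, -z1                                # the two opposite roots

w = 3 + 4j
z1, z2 = sqrt_pair(w)
print(z1, z2)                 # (2+1j) (-2-1j)
print(z1**2, cmath.sqrt(w))   # both reproduce w = (3+4j)
```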
1.3 Representation of a complex number

1.3.1 Algebraic representation

The representation of the complex numbers we have encountered so far ($z = x + iy$) is also called the algebraic or cartesian representation. In fact, since a complex number can be expressed by a pair of real numbers, it is natural to place them in a plane: we can regard the real part of a complex number $z$ as the $x$-coordinate (abscissa) of a cartesian coordinate system and the imaginary part as the $y$-coordinate (ordinate). Such a visualization of $z$ is called the Argand diagram or Gaussian plane and is shown in Fig. 1.1.

Figure 1.1: The Argand diagram.

1.3.2 Trigonometric representation

To introduce the trigonometric representation of a complex number, it is convenient to start with a complex number with modulus $r$. As we know from trigonometry (and as we can see in Fig. 1.2), the real part of $z$ coincides with $r$ times the cosine of $\theta$ (the angle that $z$ forms with the $x$-axis, also called the argument of the complex number $z$) and the imaginary part with $r$ times the sine of $\theta$, therefore we have

\[ z = r(\cos\theta + i\sin\theta). \tag{1.10} \]

Figure 1.2: Representation in the Argand diagram of a complex number with modulus $r$.

Since we have used trigonometric functions to express $z$, this representation of the complex numbers is called the trigonometric representation or polar form. Indeed we can think of the equality $x + iy = r(\cos\theta + i\sin\theta)$ as analogous to the transformation from cartesian coordinates $(x, y)$ to polar ones $(r, \theta)$. The following relations hold:

\[ x = r\cos\theta, \qquad y = r\sin\theta, \tag{1.11} \]
\[ r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x}. \tag{1.12} \]

It is worth stressing that the function $\tan x$ has a periodicity of $\pi$; therefore, if $\theta = \arctan(y/x)$, then $\theta + \pi$ satisfies the same equation. For this reason it is always important to check the quadrant in which the complex number lies. For instance, the argument of the complex number $-1 - i$ is $5\pi/4$ and not $\pi/4$: if it were $\pi/4$, then $z$ would lie in the first quadrant and would have positive real and imaginary parts.

Recalling the Taylor expansions of $\sin\theta$ and $\cos\theta$ we obtain:

\[ z = r(\cos\theta + i\sin\theta) = r\left[\left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \dots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \dots\right)\right] = r\left[1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} + \dots\right] = r\left[1 + (i\theta) + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \dots\right]. \]

The expression inside the square brackets is the Taylor expansion of an exponential, therefore we can define:

\[ e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} = \cos\theta + i\sin\theta. \tag{1.13} \]

The equivalence $e^{i\theta} = \cos\theta + i\sin\theta$ is called Euler's formula. Analogously to Eq. 1.13 we can define, for each complex number $z$, its exponential as:

\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \tag{1.14} \]

We can easily notice that the quantity $r(\cos\theta + i\sin\theta)$, with $r \geq 0$ and $0 \leq \theta < 2\pi$, covers the whole domain of complex numbers, therefore we can represent every complex number $z = x + iy$ with the expression

\[ z = r e^{i\theta}, \tag{1.15} \]

where $r$ is the modulus of $z$ and $\theta$ is its argument. Eq. 1.15 is the exponential representation of a complex number.
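The quadrant bookkeeping discussed above is exactly what the two-argument arctangent does; a short sketch using nothing beyond Python's standard library:

```python
import cmath
import math

z = -1 - 1j                        # the example discussed above
r, theta = cmath.polar(z)          # polar form; theta = atan2(y, x) in (-pi, pi]
print(r, theta)                    # 1.414..., -2.356... (= -3*pi/4)

# Map the argument into the interval [0, 2*pi) used in these notes:
theta_principal = theta % (2 * math.pi)
print(theta_principal, 5 * math.pi / 4)   # both 3.926... = 5*pi/4

# A naive arctan(y/x) lands in the wrong quadrant:
print(math.atan(z.imag / z.real))         # 0.785... = pi/4, off by pi

# And back to cartesian form:
print(cmath.rect(r, theta))               # (-1-1j) up to rounding
```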
Multiplication and division of complex numbers in polar form are particularly simple. In fact, given two complex numbers $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$ we have:

\[ z_1 z_2 = r_1 e^{i\theta_1}\, r_2 e^{i\theta_2} = r_1 r_2\, e^{i(\theta_1 + \theta_2)}, \tag{1.16} \]
\[ \frac{z_1}{z_2} = \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2}\, e^{i(\theta_1 - \theta_2)}, \tag{1.17} \]

namely, the product of two complex numbers is a complex number having as modulus the product of the moduli and as argument the sum of the arguments; the quotient of two complex numbers is a complex number having as modulus the quotient of the moduli and as argument the difference of the two arguments. Although the product and the division of two complex numbers are straightforward in polar form, the sum and the difference are not, and it is convenient to convert the numbers we want to add into the algebraic representation.

It is also worth noticing that the product of a complex number $z$ by a number $w = e^{i\alpha}$ with unit modulus corresponds to a rotation of $z$ in the Argand diagram by an angle $\alpha$. In fact, assuming that $z = re^{i\theta}$, then $zw = re^{i(\theta + \alpha)}$.

Concerning the trigonometric representation of a complex number, it is important to notice that the quantity $\cos\theta + i\sin\theta$ has a periodicity of $2\pi$, namely $re^{i\theta} \equiv re^{i(\theta + 2n\pi)}$ for every integer $n$. Therefore, in order not to have a multiple-valued representation of $z$, it is conventionally assumed that $\theta$ lies in the interval $[0, 2\pi)$.

1.3.3 de Moivre's theorem

From the equality $(e^{i\theta})^n = e^{in\theta}$ we derive the very important relation:

\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \tag{1.18} \]

In fact, $(e^{i\theta})^n = (\cos\theta + i\sin\theta)^n$ follows directly from the trigonometric representation of complex numbers, whereas the identity $e^{in\theta} = \cos n\theta + i\sin n\theta$ follows from the series expansion of $e^{in\theta}$ (see Sect. 1.3.2). Another way of demonstrating de Moivre's theorem is through the properties of the product in the polar representation. It is in fact easy to see that $z^n = r^n e^{in\theta}$, therefore:

\[ [r(\cos\theta + i\sin\theta)]^n = z^n = r^n e^{in\theta} = r^n(\cos n\theta + i\sin n\theta), \]

from which de Moivre's theorem is easily deduced. It is worth stressing that $n$ can be an integer, a real, or even a complex number. De Moivre's theorem helps in finding the roots of a generic complex number, in solving polynomial equations and in recovering trigonometric identities, as the next four examples show.

Example 1.3.1 Use de Moivre's theorem to find the square roots of a generic complex number $z$.

As we have seen, we can express $z$ as $r(\cos\theta + i\sin\theta)$. From de Moivre's theorem we have:

\[ \sqrt{z} = z^{1/2} = \sqrt{r}\left(\cos\frac{\theta}{2} + i\sin\frac{\theta}{2}\right) = w_1. \]

However, we must not forget that $\cos\theta + i\sin\theta$ has a periodicity of $2\pi$, therefore also

\[ w_2 = \sqrt{r}\left[\cos\left(\frac{\theta}{2} + \pi\right) + i\sin\left(\frac{\theta}{2} + \pi\right)\right] \]

is a solution of the equation $\sqrt{z} = w$. From elementary properties of the trigonometric functions, we can see that $w_1$ and $w_2$ are opposite complex numbers (as we have seen in Sect. 1.2.2). It is also easy to show that the values of $w_1$ and $w_2$ found here coincide with the values found in Eq. 1.9.

Example 1.3.2 Find the solutions of the equation $z^7 = 1$.

We can express 1 as $e^{i(0 + 2n\pi)} = e^{2n\pi i}$. Therefore, we have:

\[ z^7 = e^{2n\pi i} \;\Rightarrow\; z = e^{\frac{2n\pi}{7}i}. \]

It is clear that $n$ can assume only the values $0 \dots 6$. In fact, for $n = 7$ we obtain the solution $e^{2\pi i}$, which coincides with the solution obtained with $n = 0$. Therefore, we have 7 distinct solutions, namely $1, e^{\frac{2}{7}\pi i}, e^{\frac{4}{7}\pi i}, e^{\frac{6}{7}\pi i}, e^{\frac{8}{7}\pi i}, e^{\frac{10}{7}\pi i}, e^{\frac{12}{7}\pi i}$. The solutions are drawn in Fig. 1.3.

Figure 1.3: The solutions of the equation $z^7 = 1$.

We can notice that all the solutions (as expected) lie on a circle of radius 1 and that they divide the circle into 7 circular sectors with the same angle. It is simple to generalize the procedure shown in Example 1.3.2 to find the $n$-th roots $z_1 \dots z_n$ of a generic complex number $w$. From $z^n = w = re^{i\theta}$ we obtain:

\[ z = \sqrt[n]{r}\; e^{i\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)}, \qquad k = 0 \dots n-1. \tag{1.19} \]
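Eq. 1.19 translates almost verbatim into code; the following is a minimal sketch (the function name nth_roots is our own, not part of the notes):

```python
import cmath
import math

def nth_roots(w, n):
    """All n-th roots of w, following Eq. 1.19."""
    r, theta = cmath.polar(w)          # w = r * exp(i*theta)
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * math.pi * k) / n)
            for k in range(n)]

# The seven solutions of z^7 = 1 (Example 1.3.2):
for z in nth_roots(1, 7):
    print(z, abs(z))                   # all moduli equal 1

print(nth_roots(-4, 2))                # [2j, -2j] up to rounding
```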
Example 1.3.3 Find the solutions of the equation $z^5 + z^3 - 3z^2 = 3$.

Given the obvious symmetry of this equation, we can factorize it into $(z^3 - 3)(z^2 + 1) = 0$. The second factor has the solutions $\pm i$; for the first factor we can proceed as in Example 1.3.2, obtaining the three solutions $\sqrt[3]{3}$, $\sqrt[3]{3}\,e^{2\pi i/3}$, $\sqrt[3]{3}\,e^{4\pi i/3}$.

We can notice from Examples 1.3.2 and 1.3.3 (and from Fig. 1.3) that all the solutions occur in conjugate pairs. Of course, 1 (Example 1.3.2) and $\sqrt[3]{3}$ (Example 1.3.3) are an exception because they are real numbers. This is a general result for the roots of a polynomial with real coefficients. In fact, let us suppose that $z$ is a root of a polynomial of degree $n$, namely $z$ is a complex number such that:

\[ \sum_{j=0}^{n} a_j z^j = 0. \]

Taking the complex conjugate of this equation we obtain:

\[ \sum_{j=0}^{n} a_j^* (z^*)^j = 0. \]

But the $a_j$ are real numbers, therefore $a_j = a_j^*$, and we obtain:

\[ \sum_{j=0}^{n} a_j (z^*)^j = 0, \]

namely, $z^*$ is also a root of the polynomial.

Example 1.3.4 Recover the double- and triple-angle formulae of trigonometry with the aid of de Moivre's theorem.

From de Moivre's theorem we have:

\[ \cos(2\theta) + i\sin(2\theta) = (\cos\theta + i\sin\theta)^2 = \cos^2\theta - \sin^2\theta + 2i\sin\theta\cos\theta. \]

By equating the real and imaginary parts separately, we obtain:

\[ \cos(2\theta) = \cos^2\theta - \sin^2\theta, \qquad \sin(2\theta) = 2\sin\theta\cos\theta. \]

These are the well-known double-angle formulae of trigonometry. Analogously we can proceed for the triple-angle formulae:

\[ \cos(3\theta) + i\sin(3\theta) = (\cos\theta + i\sin\theta)^3 = \cos^3\theta + 3i\cos^2\theta\sin\theta - 3\cos\theta\sin^2\theta - i\sin^3\theta \]
\[ \Rightarrow\; \cos(3\theta) = \cos^3\theta - 3\cos\theta\sin^2\theta = 4\cos^3\theta - 3\cos\theta, \]
\[ \phantom{\Rightarrow\;} \sin(3\theta) = 3\sin\theta\cos^2\theta - \sin^3\theta = 3\sin\theta - 4\sin^3\theta. \]

1.3.4 Complex logarithms

To define the logarithm of a complex number, we can proceed as for the real numbers, namely we define $\mathrm{Ln}(z)$ as that complex number $w$ such that $e^w = z$. If we use the exponential representation of $z$, this number can easily be found. In fact:

\[ e^w = z = re^{i\theta} = e^{i\theta + \ln r} \;\Rightarrow\; w = i\theta + \ln r. \tag{1.20} \]

However, we must not forget that $e^{i\theta} = e^{i(\theta + 2n\pi)}$ (they represent the same point in the Argand diagram). Therefore, we obtain:

\[ \mathrm{Ln}(z) = i(\theta + 2n\pi) + \ln r, \tag{1.21} \]

namely, the logarithm of a complex number is multiple-valued (the numbers $i(\theta + 2n\pi) + \ln r$ represent different points in the Argand diagram). The value of $\mathrm{Ln}(z)$ obtained by restricting the argument of $z$ to lie in the interval $0 \leq \theta < 2\pi$ is called the principal value of $\mathrm{Ln}(z)$ and is usually indicated with $\ln z$.

Example 1.3.5 Express in polar form the number $z = (-i)^{-\frac{i}{3}}$.

The logarithm of $z$ is given by $\mathrm{Ln}(z) = -\frac{i}{3}\,\mathrm{Ln}(-i)$. To calculate the logarithm of $-i$ we have to express it in exponential form, namely:

\[ \mathrm{Ln}(-i) = \mathrm{Ln}\left(e^{i\left(\frac{3}{2}\pi + 2n\pi\right)}\right) = i\left(\frac{3}{2}\pi + 2n\pi\right). \]

We obtain therefore:

\[ z = e^{\mathrm{Ln}(z)} = e^{-\frac{i}{3}\, i\left(\frac{3}{2}\pi + 2n\pi\right)} = e^{\frac{\pi}{2} + \frac{2}{3}n\pi}, \]

which is a real quantity and not a complex one.
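The multi-valuedness of Eq. 1.21 can be explored with the standard cmath module. Note that Python's principal branch places the imaginary part of the logarithm in $(-\pi, \pi]$ rather than the $[0, 2\pi)$ convention of these notes; a hedged sketch:

```python
import cmath
import math

z = -1j
w = cmath.log(z)                  # principal branch: Im(w) in (-pi, pi]
print(w)                          # -1.5707963...j, i.e. -i*pi/2

# Shift into the [0, 2*pi) convention used in these notes:
w_notes = complex(w.real, w.imag % (2 * math.pi))
print(w_notes)                    # 4.712...j = i*(3/2)*pi

# All branches i*(3*pi/2 + 2*n*pi) exponentiate back to -i:
for n in range(-2, 3):
    print(cmath.exp(w_notes + 2j * math.pi * n))     # (-0-1j) up to rounding

# The branch choice matters for Example 1.3.5: Python's ** uses its own
# principal branch, while the notes' n = 0 branch gives e^(pi/2).
print((-1j) ** (-1j / 3))                 # 0.592... = e^(-pi/6)
print(cmath.exp((-1j / 3) * w_notes))     # 4.810... = e^(pi/2)
```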
1.3.5 Trigonometric and hyperbolic functions

Given a real number $x$, we can also define $\sin x$ and $\cos x$ with the help of the exponential form of the complex numbers. It is enough to notice that $e^{ix} + e^{-ix} = 2\cos x$ and that $e^{ix} - e^{-ix} = 2i\sin x$. We can generalize these equalities and define the cosine and the sine of a generic complex number $z$ in this way:

\[ \cos z = \frac{e^{iz} + e^{-iz}}{2}, \tag{1.22} \]
\[ \sin z = \frac{e^{iz} - e^{-iz}}{2i}. \tag{1.23} \]

The case in which $z$ is a pure imaginary number is very interesting. In this case $z = ix$ with $x$ real, and we obtain:

\[ \cos(ix) = \frac{e^x + e^{-x}}{2} \equiv \cosh x, \tag{1.24} \]
\[ \sin(ix) = i\,\frac{e^x - e^{-x}}{2} \equiv i\sinh x. \tag{1.25} \]

In this way we have defined the hyperbolic functions $\sinh x = \frac{1}{2}(e^x - e^{-x})$ and $\cosh x = \frac{1}{2}(e^x + e^{-x})$. In an analogous way we can also define $\cosh z$ and $\sinh z$ for a generic complex number $z$. We have seen from Eqs. 1.24 and 1.25 that $\cos(ix) = \cosh x$ and $\sin(ix) = i\sinh x$. Analogously, we can see that $\cos x = \cosh(ix)$ and $i\sin x = \sinh(ix)$. These relations make the relationship between trigonometric and hyperbolic functions transparent. For instance, from the well-known relation $\cos^2 x + \sin^2 x = 1$ it is trivial to find the analogous relation for hyperbolic functions, $\cosh^2 x - \sinh^2 x = 1$. By analogy with the trigonometric functions, the remaining hyperbolic functions can also be defined:

\[ \tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \tag{1.26} \]
\[ \mathrm{sech}\,x = \frac{1}{\cosh x} = \frac{2}{e^x + e^{-x}}, \tag{1.27} \]
\[ \mathrm{cosech}\,x = \frac{1}{\sinh x} = \frac{2}{e^x - e^{-x}}, \tag{1.28} \]
\[ \coth x = \frac{1}{\tanh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}. \tag{1.29} \]

The inverse hyperbolic functions $\mathrm{arcsinh}\,x$, $\mathrm{arccosh}\,x$ and $\mathrm{arctanh}\,x$ can also be defined, and it is possible, by inverting the equations defining the hyperbolic functions, to find closed-form expressions for them, namely:

\[ \mathrm{arcsinh}\,x = \ln\left(\sqrt{x^2 + 1} + x\right), \tag{1.31} \]
\[ \mathrm{arccosh}\,x = \ln\left(x \pm \sqrt{x^2 - 1}\right), \tag{1.32} \]
\[ \mathrm{arctanh}\,x = \ln\sqrt{\frac{1+x}{1-x}}. \tag{1.33} \]

Example 1.3.6 Demonstrate Eq. 1.33.

By the definition of the hyperbolic tangent we have $\tanh y = \frac{e^y - e^{-y}}{e^y + e^{-y}} = x$. We want to invert this relation and express $y = \mathrm{arctanh}\,x$ as a function of $x$. It turns out that:

\[ e^y - e^{-y} = x(e^y + e^{-y}) \;\Rightarrow\; e^y(1 - x) = e^{-y}(1 + x) \;\Rightarrow\; e^{2y} = \frac{1+x}{1-x} \;\Rightarrow\; \mathrm{arctanh}\,x = \ln\sqrt{\frac{1+x}{1-x}}. \]

Example 1.3.7 Find the solutions of the equation $4\cosh x - 3e^{-x} = -1$.

From the definition of $\cosh x$, this equation is equivalent to $(2e^x + 2e^{-x}) - 3e^{-x} + 1 = 0$. Multiplying by $e^x$ we obtain:

\[ 2e^{2x} + e^x - 1 = 0, \]

which is a quadratic equation in $e^x$, whose solutions are $e^x = \frac{1}{2}$ and $e^x = -1$. The first has the obvious solution $x = -\ln 2$, whereas for the second we have to express $-1$ in exponential form, namely $-1 = e^{i(\pi + 2n\pi)}$, therefore $x = i(\pi + 2n\pi)$, with principal value $x = i\pi$.
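These identities are easy to spot-check numerically with the standard math and cmath modules; a brief sketch that also verifies both roots of Example 1.3.7:

```python
import cmath
import math

x = 0.7
print(cmath.cos(1j * x), math.cosh(x))        # cos(ix) = cosh(x)
print(cmath.sin(1j * x), 1j * math.sinh(x))   # sin(ix) = i*sinh(x)
print(math.cosh(x)**2 - math.sinh(x)**2)      # 1.0, the hyperbolic identity

# Example 1.3.7: x = -ln(2) solves 4*cosh(x) - 3*exp(-x) = -1 ...
x = -math.log(2)
print(4 * math.cosh(x) - 3 * math.exp(-x))    # -1.0
# ... and so does the complex root x = i*pi:
print(4 * cmath.cosh(1j * cmath.pi) - 3 * cmath.exp(-1j * cmath.pi))  # (-1+0j)
```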
Chapter 2 Ordinary differential equations

"Science is a differential equation. Religion is a boundary condition." – Alan Turing

2.1 Introduction

Almost every physical problem has to do with rates of change of some quantities. For instance, velocity is the rate at which the position of a body changes with time. Expressed in mathematical terms, rates are derivatives. Therefore, in order to describe most physical problems, we encounter equations containing derivatives of some unknown function. Equations of this kind are called differential equations (DEs). The DE (or the system of DEs) that describes a specific physical problem is also called the mathematical model of the process.

2.1.1 A simple example

We know that if we let a body with mass $m$ fall near sea level, the only force acting on it is gravity, which has only a vertical component and is quantified by $mg$, with $g = 9.8\ \mathrm{m\,s^{-2}}$. From Newton's second law $F = ma$ we have the obvious relation $ma = mg$. We also know that $a$ is the rate of change of the velocity as a function of time, therefore we can express Newton's second law with a very simple DE, namely:

\[ \frac{dv}{dt} = g, \tag{2.1} \]

where $v$ is the velocity in the vertical direction. We know from mechanics the solution of this DE, namely $v(t) = v_0 + gt$. If we now take into account the drag force of the atmosphere, things get (slightly) more complicated. Usually the drag is taken proportional to the velocity of the body and acts in the direction opposite to the motion of the body. In this way, the force acting on the body is given by $F = mg - \gamma v$, where $\gamma$ is a constant. From Newton's second law we therefore get:

\[ m\frac{dv}{dt} = mg - \gamma v. \tag{2.2} \]

As we can see, the unknown function $v(t)$ appears on both sides of the equation: on the right hand side the function $v$ itself appears, on the left hand side its derivative with respect to $t$.

2.1.2 The direction field

Although Eq. 2.2 is easy to solve (we will do it at the end of this subsection), there is a way to get useful information about its behavior without solving it. In fact, we know from geometry that, if we plot $v(t)$ in a cartesian coordinate system, then for each value of $t$ the quantity $\frac{dv}{dt}$ represents the slope of the curve $v(t)$ at the point with coordinates $(t, v)$. Therefore, if we assign the velocity a specific value $v$, the quantity $\frac{dv}{dt} = g - \frac{\gamma}{m}v$ represents, for each time $t$, the slope of the function $v(t)$ passing through the point $(t, v)$. To make the example more quantitative, we can assume $m = 10\ \mathrm{kg}$ and $\gamma = 2\ \mathrm{kg\,s^{-1}}$. In this way we have $\frac{dv}{dt} = 9.8 - 0.2\,v$. If $v = 0$, then, for each $t$, $\frac{dv}{dt} = 9.8$; if $v = 10\ \mathrm{m\,s^{-1}}$, then $\frac{dv}{dt} = 7.8$; if $v = 20\ \mathrm{m\,s^{-1}}$, then $\frac{dv}{dt} = 5.8$; and so on. We can display this information by drawing in the $tv$-plane short line segments with slope $\frac{dv}{dt}$ at different values of $t$ and $v$. This has been done in Fig. 2.1.

Figure 2.1: The direction field of the equation $\frac{dv}{dt} = 9.8 - 0.2\,v$.

We can see from this plot that, in the lower part of the diagram, the slopes are always positive. In physical terms this means that the velocity increases with time. This happens only if the velocity at the time $t = 0$ (the initial velocity) is lower than some threshold value. On the other hand, if we start with a high initial velocity, the slope is always negative, namely the velocity decreases with time. A velocity somewhere around $50\ \mathrm{m\,s^{-1}}$ seems to be peculiar, because the slopes of the segments flatten considerably. Indeed, this peculiar behavior occurs when $\frac{dv}{dt} = 0$, therefore when $v = \frac{gm}{\gamma}$ ($v = 49\ \mathrm{m\,s^{-1}}$ in our example). In fact, if we start at $t = 0$ with this velocity, the gravity and the drag force balance perfectly and the velocity of the body does not change with time. This solution is called the equilibrium solution. The velocity $v = 49\ \mathrm{m\,s^{-1}}$ is also called the terminal velocity since, no matter what the initial velocity is, at sufficiently large times the velocity tends asymptotically to this value.
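A direction field like the one in Fig. 2.1 can be reproduced in a few lines; the following is a minimal sketch assuming numpy and matplotlib are available (the styling details are incidental):

```python
import numpy as np
import matplotlib.pyplot as plt

g, gamma, m = 9.8, 2.0, 10.0

# Grid of (t, v) points and the slope dv/dt = g - (gamma/m)*v at each one
t, v = np.meshgrid(np.linspace(0, 10, 20), np.linspace(30, 70, 20))
slope = g - (gamma / m) * v

# Draw short segments in the direction (1, slope), normalized to unit length
norm = np.sqrt(1 + slope**2)
plt.quiver(t, v, 1 / norm, slope / norm, pivot='mid')
plt.xlabel('t [s]')
plt.ylabel('v [m/s]')
plt.title('Direction field of dv/dt = 9.8 - 0.2 v')
plt.show()
```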
Although we have not yet started to treat differential equations, we can nevertheless solve the easy Eq. 2.2 by treating the infinitesimals $dv$ and $dt$ as ordinary unknowns in an equation. Therefore we have:

\[ \frac{dv}{g - \frac{\gamma}{m}v} = dt. \]

Integrating the right hand side between 0 and $t$ and the left hand side between $v_0$ (the velocity at the time $t = 0$) and $v$, we obtain:

\[ t = \int_{v_0}^{v} \frac{dw}{g - \frac{\gamma}{m}w} = -\frac{m}{\gamma}\,\ln\frac{mg - \gamma v}{mg - \gamma v_0}. \]

From this equation we get:

\[ mg - \gamma v = (mg - \gamma v_0)\, e^{-\frac{\gamma}{m}t}. \]

Now we can recover $v(t)$, namely:

\[ v = \frac{mg}{\gamma} - \left(\frac{mg}{\gamma} - v_0\right) e^{-\frac{\gamma}{m}t}. \tag{2.3} \]

From this family of solutions (which depend on the initial velocity $v_0$) we can check the validity of the properties we have already noticed with the help of the direction field. In particular, we can notice that, if $v_0 < \frac{mg}{\gamma}$, then $v(t)$ always increases with time; the opposite happens if $v_0 > \frac{mg}{\gamma}$. If $v_0 = \frac{mg}{\gamma}$, then $v$ does not change with time. We have drawn some of these solutions in Fig. 2.2 for the following initial velocities: $v_0 = 40, 45, 49$ (thick line), $55$ and $60\ \mathrm{m\,s^{-1}}$. As expected, all the solutions tend asymptotically to the equilibrium value, and the tangents to all the curves are well reproduced by the direction field we have drawn previously.

Figure 2.2: The direction field of the equation $\frac{dv}{dt} = 9.8 - 0.2\,v$ together with the solutions (red continuous lines) for the initial conditions $v_0 = 40, 45, 49$ (thick line), $55$ and $60\ \mathrm{m\,s^{-1}}$.

Eq. 2.3 is also called the general solution of the DE 2.2, since it represents the solution for all possible initial velocities of the body. The curves drawn in Fig. 2.2 are a subsample of the family of infinitely many curves representing the general solution of the DE, also called integral curves. Although the utility of the direction field has been made clear by this example, its use is limited to DEs of the form $\frac{dy}{dx} = f(x, y)$.
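Eq. 2.3 can also be cross-checked against a direct numerical integration of Eq. 2.2. The sketch below, not part of the notes' material, uses the simplest possible scheme, the forward Euler method:

```python
import math

g, gamma, m = 9.8, 2.0, 10.0
v0, dt, T = 40.0, 0.01, 20.0

# Forward Euler: v_{k+1} = v_k + dt * (g - (gamma/m) * v_k)
v = v0
for _ in range(int(T / dt)):
    v += dt * (g - (gamma / m) * v)

# Closed-form solution, Eq. 2.3
v_exact = m * g / gamma - (m * g / gamma - v0) * math.exp(-gamma / m * T)
print(v, v_exact)   # both close to the terminal velocity 49 m/s
```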
2.1.3 Basic definitions

In the example we analyzed in the previous subsections, we expressed the velocity of the body as a function of time: the variable $t$ was left free to vary and we checked how the variable $v$ varied as a function of $t$. We therefore call $t$ the independent variable and $v$ the dependent variable. In our example we have given the variables names that recall their physical meanings ($t$ for time and $v$ for velocity). If the equation is a purely mathematical abstraction, it is customary to indicate the independent variable with $x$ and the dependent one with $y$. If there is only one independent variable and, as a consequence, only total derivatives like $\frac{dy}{dx}$, then the equation is called an ordinary differential equation or ODE. If instead the function we seek depends on several independent variables and the equation contains partial derivatives with respect to them, then the equation is called a partial differential equation (or PDE). An equation like

\[ \left(\frac{d^2 y}{dx^2}\right)^3 + \cos\frac{dy}{dx} = \ln(yx) \]

is an ODE in which we seek the unknown function $y(x)$ which satisfies it. Instead, an equation like

\[ \frac{\partial^3 f}{\partial x^2\,\partial y} + e^x\,\frac{\partial f}{\partial x} + \ln y = 0 \]

is an example of a PDE, and in this case the function to seek is $f(x, y)$, depending on the two independent variables $x$ and $y$. In this chapter (and in the following ones) we will not treat PDEs (see however Sects. 6.3.1 and 6.3.2), concentrating our attention on ODEs.

The order of an ODE is the order of the highest derivative that appears in the equation. The equation

\[ F\left[x, y, \frac{dy}{dx}, \frac{d^2 y}{dx^2}, \dots, \frac{d^n y}{dx^n}\right] = 0 \tag{2.4} \]

is the general expression of an $n$-th order ODE. It is customary to indicate with $y'(x), y''(x), \dots, y^{(n)}(x)$ the first, second, ..., $n$-th derivative of $y$ as a function of $x$. Sometimes, in order to simplify the notation, the dependence on the independent variable $x$ is omitted. Therefore Eq. 2.4 can also be written as:

\[ F[x, y, y', y'', \dots, y^{(n)}] = 0. \tag{2.5} \]

The degree of an ODE is instead the power of the highest derivative term. An ODE like

\[ [y''']^4 - 2[y'']^5 - y = x \]

is of the third order and fourth degree. Another crucial classification of differential equations is between linear and nonlinear ones. An ODE is said to be linear if the function $F$ in Eq. 2.5 is a linear function of $y, y', y'', \dots, y^{(n)}$; nonlinear otherwise. Therefore, a linear ODE of $n$-th order can be expressed as:

\[ a_n(x)\,y^{(n)}(x) + a_{n-1}(x)\,y^{(n-1)}(x) + \dots + a_1(x)\,y'(x) + a_0(x)\,y(x) = f(x). \tag{2.6} \]

Referring to the example in Sect. 2.1.1, we have seen that we can recover a single function $v(t)$ as a solution of the given ODE only if we specify a constant, namely the velocity of the body at some specific time (in our case the initial time $t = 0$). If we do not specify it, we obtain a family of solutions. An ODE associated with an initial condition of the kind $y(x_0) = y_0$ is called an initial value problem (or Cauchy problem). The solution of an initial value problem (namely ODE plus initial condition) is called a particular solution of the ODE. Given an ODE of $n$-th order, an initial value problem consists in specifying the value of the zeroth, first, ..., $(n-1)$-th derivative of the dependent variable $y$ at some fixed point $x_0$; otherwise the solution is not uniquely defined. We can clarify this with the trivial example of the fall of a body without drag (Eq. 2.1). We have seen that this leads to the general solution $v(t) = v_0 + gt$. To obtain the height of the body $y(t)$ as a function of time, we have to integrate once more over time, obtaining:

\[ y(t) = y_0 + v_0 t + \frac{1}{2}g t^2, \tag{2.7} \]

where $y_0 = y(t = 0)$ and $v_0 = y'(t = 0)$. Namely, in order to obtain the particular solution we have specified the value of the function $y(t)$ and of its first derivative at the time $t_0 = 0$. An alternative is to assign the value of $y(t)$ at two different values of time, $y_1 = y(t_1)$ and $y_2 = y(t_2)$. Assuming for simplicity that $t_2 = 0$, namely that $y_2 = y(t_2) = y(0) = y_0$, the condition $y_1 = y(t_1)$ translates into:

\[ y_1 = y_0 + v_0 t_1 + \frac{1}{2}g t_1^2 \;\Rightarrow\; v_0 = \frac{y_1 - y_0}{t_1} - \frac{1}{2}g t_1. \]

Therefore, the particular solution of the ODE is:

\[ y(t) = y_0 + \left(\frac{y_1 - y_0}{t_1} - \frac{1}{2}g t_1\right) t + \frac{1}{2}g t^2. \tag{2.8} \]

An ODE which has, as additional constraints, the value of the dependent variable for different values of the independent variable(s) is called a boundary value problem.

2.2 First order, first degree differential equations

The most general expression of a first order, first degree ODE is:

\[ Q(x, y)\,y'(x) + P(x, y) = 0. \tag{2.9} \]

We will be busy in the next sections determining whether a solution of this ODE exists and, if so, developing methods for finding it. However, it is worth remarking that, for arbitrary functions $P$ and $Q$, a solution might not exist, or it might not be possible to express it in terms of elementary functions. In such cases, numerical methods are required. We will concentrate instead on methods which can be applied to particular subclasses of first order ODEs, namely separable equations, linear equations and exact equations.

2.2.1 Separable equations

Taking into account the general first order ODE Eq. 2.9, if it happens that $Q(x, y) = q(y)$ depends only on $y$ and $P(x, y) = p(x)$ depends only on $x$, then the ODE is particularly easy to treat. In fact, we can immediately obtain $q(y)\,dy = -p(x)\,dx$, and at this point we can integrate both sides of this equation; namely, the solution of the ODE $q(y)\,y'(x) + p(x) = 0$ is:

\[ \int q(\tilde{y})\,d\tilde{y} = -\int p(\tilde{x})\,d\tilde{x} + K, \tag{2.10} \]

where $K$ is a suitable constant.
It is also very simple to solve an initial value problem given by a separable ODE and the initial condition $y(x_0) = y_0$. The solution is in fact:

\[ \int_{y_0}^{y} q(\tilde{y})\,d\tilde{y} = -\int_{x_0}^{x} p(\tilde{x})\,d\tilde{x}. \tag{2.11} \]

Example 2.2.1 Find the solution of the initial value problem:

\[ y'(x) = xy\sin x, \qquad y(0) = 2. \]

This is clearly a separable ODE, which leads directly to the solution:

\[ \int_{2}^{y} \frac{d\tilde{y}}{\tilde{y}} = \int_{0}^{x} \tilde{x}\sin\tilde{x}\,d\tilde{x}. \]

The left hand side has the obvious solution $\ln\frac{y}{2}$. The right hand side has to be integrated by parts: making use of the relation $\sin x\,dx = d(-\cos x)$ we obtain $\int x\sin x\,dx = -x\cos x + \int\cos x\,dx = \sin x - x\cos x$. We obtain therefore the solution:

\[ y(x) = 2\,e^{\sin x - x\cos x}. \]

Example 2.2.2 Find the solution of the initial value problem:

\[ x\,y'(x) = x - 1 - x^2 + 2xy - y^2, \qquad y(1) = 0. \]

This is not a manifestly separable ODE. However, we can rewrite it as $x\,y'(x) - x + 1 = -x^2 + 2xy - y^2$ and notice that the right hand side of this equation is $-(x - y)^2$. It is therefore convenient to define $v = x - y$, so that $y = x - v$ and $y'(x) = 1 - v'(x)$. We obtain:

\[ x(1 - v'(x)) - x + 1 = -v^2 \;\Rightarrow\; x\,v'(x) = v^2 + 1. \]

This ODE is clearly separable and we can easily find its solution. In fact, the initial condition $y(1) = 0$ translates into $v(1) = 1$, and we can apply Eq. 2.11 to obtain:

\[ \int_{1}^{v} \frac{d\tilde{v}}{\tilde{v}^2 + 1} = \int_{1}^{x} \frac{d\tilde{x}}{\tilde{x}} \;\Rightarrow\; \arctan v - \frac{\pi}{4} = \ln x. \]

This brings us directly to the solution:

\[ v = \tan\left(\ln x + \frac{\pi}{4}\right), \]

which, recalling the substitution $v = x - y$, translates into:

\[ y = x - \tan\left(\ln x + \frac{\pi}{4}\right). \]

This example shows us that, as in the case of integrals, we can simplify ODEs and reduce them to tractable ones by means of clever substitutions.

2.2.2 Linear equations

Recalling the general form of a linear ODE (Eq. 2.6), a first order linear ODE can be expressed as:

\[ a_1(x)\,y'(x) + a_0(x)\,y(x) = f(x). \]

To simplify the notation we can define $r(x) = \frac{a_0(x)}{a_1(x)}$ and $s(x) = \frac{f(x)}{a_1(x)}$ (provided that $a_1(x) \neq 0$), so that the general first order linear ODE can be expressed as:

\[ \frac{dy}{dx} + r(x)\,y(x) = s(x). \tag{2.12} \]

To find the solution of this ODE we can proceed as follows. Let us introduce a (yet unknown) function $g(x)$, by which we multiply both sides of Eq. 2.12. We obtain:

\[ g(x)\frac{dy}{dx} + r(x)\,g(x)\,y(x) = s(x)\,g(x). \tag{2.13} \]

Now, we know that $\frac{d}{dx}[gy] = g(x)\frac{dy}{dx} + y(x)\frac{dg}{dx}$. The right hand side of this identity is equal to the left hand side of Eq. 2.13 if and only if the condition

\[ \frac{dg}{dx} = r(x)\,g(x) \]

holds. This equation is easy to solve and yields:

\[ g(x) = e^{\int r(\tilde{x})\,d\tilde{x}}. \tag{2.14} \]

Coming back to Eq. 2.13, we now have:

\[ \frac{d}{dx}[gy] = s(x)\,g(x). \tag{2.15} \]

Taking $gy$ as the dependent variable of this ODE, we obtain:

\[ gy = \int s(x)\,g(x)\,dx + K. \]

Since we have already found the function $g(x)$, the general solution is:

\[ y = \frac{1}{g}\left[\int s(x)\,g(x)\,dx + K\right] = e^{-\int r(\tilde{x})\,d\tilde{x}}\left[\int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx + K\right]. \tag{2.16} \]

The function $g(x)$ is also called the integrating factor.

Example 2.2.3 Find the solution of the ODE

\[ (1 + x^2)^{3/2}\,y'(x) + 2xy\sqrt{1 + x^2} = 1 \]

and draw some of the integral curves.

We can rearrange this ODE, obtaining:

\[ y'(x) + \frac{2xy}{1 + x^2} = \frac{1}{(1 + x^2)^{3/2}}. \]

This is clearly a linear ODE with $r(x) = 2x(x^2 + 1)^{-1}$ and $s(x) = (x^2 + 1)^{-3/2}$. It is easy to see that $\int r(x)\,dx = \ln(1 + x^2)$, therefore the integrating factor $g(x)$ is:

\[ g(x) = e^{\int r(x)\,dx} = e^{\ln(1 + x^2)} = 1 + x^2. \]

It is worth stressing here that $\int r(x)\,dx = \ln(1 + x^2) + K$, therefore every function of the kind $e^{\ln(1 + x^2) + K}$ is an integrating factor as well. However, we do not need a specific integrating factor (any function satisfying the condition $\frac{dg}{dx} = r(x)g(x)$ is sufficient), therefore we take the simplest possible $g(x)$. From Eq. 2.16 we obtain:

\[ y(x) = \frac{1}{g}\left[\int s(x)\,g(x)\,dx + K\right] = \frac{1}{1 + x^2}\left[\int \frac{dx}{\sqrt{1 + x^2}} + K\right] = \frac{\mathrm{arcsinh}\,x + K}{1 + x^2}. \]

Some of these solutions (for $K = -3 \dots 3$) are plotted in Fig. 2.3.

Figure 2.3: Solutions of the ODE $(1 + x^2)^{3/2}\,y'(x) + 2xy\sqrt{1 + x^2} = 1$ (Example 2.2.3) with integration constant ranging from $K = -3$ (lowermost curve) to $K = 3$ (uppermost curve).
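The solution family of Example 2.2.3 can be verified by inserting it back into the ODE; a small sketch assuming numpy is available, with the derivative approximated by a centered finite difference:

```python
import numpy as np

def y(x, K):
    """Solution family of Example 2.2.3: y = (arcsinh x + K) / (1 + x^2)."""
    return (np.arcsinh(x) + K) / (1 + x**2)

x = np.linspace(-3, 3, 7)
h = 1e-6
for K in (-3, 0, 3):
    yp = (y(x + h, K) - y(x - h, K)) / (2 * h)   # centered finite difference
    residual = (1 + x**2)**1.5 * yp + 2 * x * y(x, K) * np.sqrt(1 + x**2) - 1
    print(K, np.max(np.abs(residual)))           # ~1e-8 or smaller
```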
Variation of parameters

There is an alternative method to solve linear ODEs which is worth introducing here because we will use it more extensively to solve higher order ODEs. This method is called variation of parameters (or variation of constants) and was developed by the mathematician Joseph Louis Lagrange. Given a linear ODE as in Eq. 2.12, this method consists in finding, as a first step, the solution of the associated homogeneous equation:

\[ y'(x) + r(x)\,y(x) = 0, \tag{2.17} \]

namely the equation obtained by setting $s(x) = 0$. This equation is separable, therefore it is straightforward to find the solution:

\[ y(x) = e^{-\int r(\tilde{x})\,d\tilde{x} + K} = A\,e^{-\int r(\tilde{x})\,d\tilde{x}}. \]

We can now assume that the solution of the full ODE is given by an expression like:

\[ y(x) = A(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}}, \tag{2.18} \]

namely, instead of keeping $A$ constant, we let it vary with $x$, and we check for which function $A(x)$ Eq. 2.12 is satisfied. In order to do that, it is enough to substitute Eq. 2.18 into Eq. 2.12. We obtain:

\[ A'(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} - A(x)\,r(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} + A(x)\,r(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} = s(x) \;\Rightarrow\; A'(x) = s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}} \;\Rightarrow\; A(x) = \int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx. \tag{2.19} \]

That is, the final solution $y(x)$ of the given ODE is:

\[ y(x) = e^{-\int r(\tilde{x})\,d\tilde{x}} \int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx, \]

which, of course, coincides with Eq. 2.16 already found with the help of the integrating factor (the constant $K$ is here included in the indefinite integral).

2.2.3 Exact equations

Let us go back to the general form of a first order ODE, $Q(x, y)\,y'(x) + P(x, y) = 0$ (Eq. 2.9). Let us then suppose that we can find a function $f(x, y)$ such that:

\[ \frac{\partial f(x, y)}{\partial x} = P(x, y), \tag{2.20} \]
\[ \frac{\partial f(x, y)}{\partial y} = Q(x, y). \tag{2.21} \]

Then we have

\[ df(x, y) = \frac{\partial f(x, y)}{\partial x}\,dx + \frac{\partial f(x, y)}{\partial y}\,dy = P(x, y)\,dx + Q(x, y)\,dy = 0. \]

This means that $f(x, y)$ is constant or, in other words, that the function $f(x, y) = K$, with $K$ an arbitrary constant, is the general solution of the ODE Eq. 2.9 in implicit form. If such a function $f(x, y)$ can be found, the ODE is called an exact equation. It can be demonstrated that an ODE is exact if and only if

\[ \frac{\partial P(x, y)}{\partial y} = \frac{\partial Q(x, y)}{\partial x}. \tag{2.22} \]

In fact, if the ODE is exact, Eqs. 2.20 and 2.21 hold. If we differentiate Eq. 2.20 with respect to $y$ and Eq. 2.21 with respect to $x$, we obtain:

\[ \frac{\partial P(x, y)}{\partial y} = \frac{\partial}{\partial y}\frac{\partial f(x, y)}{\partial x} = \frac{\partial^2 f(x, y)}{\partial y\,\partial x}, \qquad \frac{\partial Q(x, y)}{\partial x} = \frac{\partial}{\partial x}\frac{\partial f(x, y)}{\partial y} = \frac{\partial^2 f(x, y)}{\partial x\,\partial y}. \]

If the function $f(x, y)$ is sufficiently regular (i.e. twice differentiable), then $\frac{\partial^2 f(x, y)}{\partial x\,\partial y} = \frac{\partial^2 f(x, y)}{\partial y\,\partial x}$. Unless otherwise stated, we will always deal with sufficiently regular (i.e. continuous and differentiable) functions, therefore we have demonstrated that, if the ODE is exact, then the condition expressed in Eq. 2.22 is satisfied. The rigorous demonstration that the condition $\frac{\partial P(x, y)}{\partial y} = \frac{\partial Q(x, y)}{\partial x}$ implies that the ODE is exact is beyond the scope of these notes (and of this course). A less rigorous demonstration can be performed in the following way.
We seek a function $f(x, y)$ such that $f(x, y) = K$ is a solution of the ODE $Q(x, y)\,y' + P(x, y) = 0$. We can always assume that Eq. 2.20 holds, and we try to find out whether Eq. 2.21 holds too. From Eq. 2.20 we get:

\[ f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + r(y), \tag{2.23} \]

where $r(y)$ is an unknown function depending only on $y$. We can now differentiate Eq. 2.23 with respect to $y$ and require that $\frac{\partial f}{\partial y} = Q$. We have therefore:

\[ \frac{\partial f}{\partial y} = Q(x, y) = \frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x} + r'(y). \tag{2.24} \]

Now, the quantity $r'(y) = Q(x, y) - \frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x}$ must depend only on $y$, namely its derivative with respect to $x$ must vanish. We have:

\[ \frac{\partial r'(y)}{\partial x} = \frac{\partial Q(x, y)}{\partial x} - \frac{\partial}{\partial x}\frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x} = \frac{\partial Q(x, y)}{\partial x} - \frac{\partial}{\partial y}\frac{\partial}{\partial x}\int P(\tilde{x}, y)\,d\tilde{x} = \frac{\partial Q(x, y)}{\partial x} - \frac{\partial P(x, y)}{\partial y}, \tag{2.25} \]

where we have made use of the relation $\frac{\partial}{\partial y}\frac{\partial}{\partial x} = \frac{\partial}{\partial x}\frac{\partial}{\partial y}$. Now, the right hand side of Eq. 2.25 is zero on account of Eq. 2.22, therefore $r'(y)$ indeed does not depend on $x$. If we find $r(y)$ by integrating Eq. 2.24, then substituting this solution into Eq. 2.23 we find the required function $f(x, y)$.

Example 2.2.4 Find the solution of the ODE:

\[ (2x^3 y + \cos x)\,y' + 3x^2 y^2 = y\sin x. \]

We have $P(x, y) = 3x^2 y^2 - y\sin x$ and $Q(x, y) = 2x^3 y + \cos x$, therefore:

\[ \frac{\partial P}{\partial y} = 6x^2 y - \sin x, \qquad \frac{\partial Q}{\partial x} = 6x^2 y - \sin x. \]

The ODE is exact and the solution is given by:

\[ f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + r(y) = \int (3\tilde{x}^2 y^2 - y\sin\tilde{x})\,d\tilde{x} + r(y) = x^3 y^2 + y\cos x + C + r(y). \tag{2.26} \]

Since another integration constant is introduced by the equation $f(x, y) = K$, we can safely assume $C = 0$. By differentiating Eq. 2.26 with respect to $y$ we obtain:

\[ r'(y) = \frac{\partial f(x, y)}{\partial y} - 2x^3 y - \cos x = Q(x, y) - 2x^3 y - \cos x = 0. \]

We have thus obtained that $r(y)$ is a constant, which can be absorbed into $K$; therefore the general solution of the proposed ODE is:

\[ x^3 y^2 + y\cos x = K. \]

There is a method alternative to the one seen above to find the solution of an exact ODE. It is less safe but in most cases more practical. It can be shown that the solution of the exact ODE $P(x, y) + Q(x, y)\,y' = 0$ is:

\[ f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + \int Q(x_1, \tilde{y})\,d\tilde{y} = K, \quad\text{or} \tag{2.27} \]
\[ f(x, y) = \int Q(x, \tilde{y})\,d\tilde{y} + \int P(\tilde{x}, y_1)\,d\tilde{x} = K, \tag{2.28} \]

where $x_1$ (or $y_1$) is a point where it is particularly easy to calculate the given integrals (in most cases $x_1 = 0$). In fact, it is very easy to see that, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ (namely if the equation is exact), then $df = 0$. The following example clarifies this method.

Example 2.2.5 Solve the ODE $(2x^3 y + \cos x)\,y' + 3x^2 y^2 = y\sin x$ by means of this new method.

We have already seen how to solve this ODE in Example 2.2.4. With the help of Eq. 2.28 the solution is more straightforward. In fact, taking $y_1 = 0$ we obtain:

\[ f(x, y) = K = \int Q(x, \tilde{y})\,d\tilde{y} + \int P(\tilde{x}, y_1)\,d\tilde{x} = \int (2x^3\tilde{y} + \cos x)\,d\tilde{y} + \int 0\,d\tilde{x} = x^3 y^2 + y\cos x. \]
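The exactness test of Eq. 2.22 and the construction of Eq. 2.28 can be automated symbolically; a minimal sketch for Example 2.2.4/2.2.5, assuming the sympy package is available:

```python
import sympy as sp

x, y = sp.symbols('x y')
P = 3 * x**2 * y**2 - y * sp.sin(x)
Q = 2 * x**3 * y + sp.cos(x)

# Exactness condition, Eq. 2.22:
print(sp.simplify(sp.diff(P, y) - sp.diff(Q, x)))    # 0

# Build f as in Eq. 2.28 with y1 = 0 (here P(x, 0) contributes nothing):
f = sp.integrate(Q, (y, 0, y)) + sp.integrate(P.subs(y, 0), (x, 0, x))
print(sp.expand(f))                                  # x**3*y**2 + y*cos(x)
```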
2.2.4 Integrating factors

It is clear that separable equations, having $P(x, y) = p(x)$ and $Q(x, y) = q(y)$, are exact equations ($\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y} = 0$), whereas linear ODEs, with $Q(x, y) = 1$ and $P(x, y) = r(x)y - s(x)$ (see Eq. 2.12), in general are not exact. However, we have seen that we can find a suitable integrating factor $g(x)$ that turns the ODE into the form of Eq. 2.15, which is an exact and easy to solve ODE. Is it always possible to find an integrating factor and solve the general first order ODE $Q(x, y)\,y' + P(x, y) = 0$? We can proceed as we did in Sect. 2.2.2 and multiply both members of the ODE by an unknown function $g$, which will in general be a function of both $x$ and $y$. We have therefore:

\[ g(x, y)\,Q(x, y)\,y' + g(x, y)\,P(x, y) = 0. \tag{2.29} \]

This equation is exact if and only if:

\[ \frac{\partial}{\partial x}[g(x, y)\,Q(x, y)] = \frac{\partial}{\partial y}[g(x, y)\,P(x, y)] \]
\[ \Rightarrow\; g(x, y)\frac{\partial Q}{\partial x} + Q(x, y)\frac{\partial g}{\partial x} = g(x, y)\frac{\partial P}{\partial y} + P(x, y)\frac{\partial g}{\partial y} \]
\[ \Rightarrow\; g(x, y)\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) = P(x, y)\frac{\partial g}{\partial y} - Q(x, y)\frac{\partial g}{\partial x}. \tag{2.30} \]

Once the integrating factor $g(x, y)$ is known, the (exact) Eq. 2.29 can be solved with the same technique we have seen in Sect. 2.2.3. Unfortunately, the solution of Eq. 2.30 can be more complicated than the starting ODE (it is a PDE). This method can be effective, though, in some specific cases.

1. $g = g(x)$

In this case $\frac{\partial g}{\partial y} = 0$ and Eq. 2.30 reduces to:

\[ \frac{dg}{dx} = -\frac{g(x)}{Q(x, y)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right). \]

Namely, the integrating factor $g$ depends only on $x$ if and only if the quantity

\[ \frac{1}{Q(x, y)}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) \tag{2.31} \]

depends on $x$ only. In this case, $g(x)$ can be found by direct integration, namely:

\[ g(x) = e^{\int \frac{1}{Q(\tilde{x}, y)}\left(\frac{\partial P(\tilde{x}, y)}{\partial y} - \frac{\partial Q(\tilde{x}, y)}{\partial\tilde{x}}\right) d\tilde{x}}. \tag{2.32} \]

It is clear that for linear equations, in which $Q(x, y) = 1$ and $P(x, y) = r(x)y - s(x)$, $g(x)$ reduces to Eq. 2.14.

Example 2.2.6 Find the solution of the ODE:

\[ (\ln x + y)\,y' + \frac{y}{x} + x(2y\ln x + y^2) = 0. \]

We have $P(x, y) = \frac{y}{x} + x(2y\ln x + y^2)$ and $Q(x, y) = \ln x + y$, therefore:

\[ \frac{\partial P}{\partial y} = \frac{1}{x} + 2x(\ln x + y), \qquad \frac{\partial Q}{\partial x} = \frac{1}{x}. \]

The ODE is therefore not exact and we have to find an integrating factor $g(x, y)$. We can notice that:

\[ \frac{1}{Q(x, y)}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) = \frac{1}{\ln x + y}\left[\frac{1}{x} + 2x(\ln x + y) - \frac{1}{x}\right] = 2x \]

indeed depends only on $x$, as the condition Eq. 2.31 requires. We can therefore easily find that

\[ g(x) = e^{\int 2x\,dx} = e^{x^2}, \]

and we recover the exact ODE:

\[ e^{x^2}(\ln x + y)\,y' + e^{x^2}\frac{y}{x} + x\,e^{x^2}(2y\ln x + y^2) = 0. \]

This equation is of course much uglier than the starting ODE, and if we looked at it without knowing its origin, we would instinctively cancel out the term $e^{x^2}$. Unfortunately, we would thereby eliminate the factor that makes the ODE tractable. To solve this exact ODE we can start from the condition Eq. 2.21 and find:

\[ f(x, y) = \int Q(x, \tilde{y})\,d\tilde{y} + s(x) = \int e^{x^2}(\ln x + \tilde{y})\,d\tilde{y} + s(x) = e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) + s(x), \]

where $s(x)$ is a function of the independent variable $x$ only. We can now differentiate this expression with respect to $x$ and use the condition expressed in Eq. 2.20 to find:

\[ s'(x) = e^{x^2}\frac{y}{x} + x\,e^{x^2}(2y\ln x + y^2) - 2x\,e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) - e^{x^2}\frac{y}{x} = 0. \]

Namely, we have $s(x) = K$, therefore the general solution of the given ODE is:

\[ e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) = K. \]

2. $g = g(y)$

By the same line of reasoning as in the previous case, we can find that the integrating factor is $g = g(y)$ if and only if the quantity

\[ \frac{1}{P(x, y)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \tag{2.33} \]

depends on $y$ only. In this case, $g(y)$ is given by:

\[ g(y) = e^{\int \frac{1}{P(x, \tilde{y})}\left(\frac{\partial Q(x, \tilde{y})}{\partial x} - \frac{\partial P(x, \tilde{y})}{\partial\tilde{y}}\right) d\tilde{y}}. \tag{2.34} \]
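The integrating factor found in Example 2.2.6 is easy to double-check symbolically; a sketch with sympy (assuming it is available):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
P = y / x + x * (2 * y * sp.log(x) + y**2)
Q = sp.log(x) + y

# Not exact as it stands:
print(sp.simplify(sp.diff(P, y) - sp.diff(Q, x)))          # 2*x*(log(x) + y)

# The quantity of Eq. 2.31 depends on x alone...
print(sp.simplify((sp.diff(P, y) - sp.diff(Q, x)) / Q))    # 2*x

# ...and after multiplying by g = exp(x**2) the equation is exact:
g = sp.exp(x**2)
print(sp.simplify(sp.diff(g * P, y) - sp.diff(g * Q, x)))  # 0
```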
3. $g = g(x \cdot y)$

In this special case we can introduce the new variable $u = xy$ and check under which conditions $g$ can be a function of $u$ alone. From the chain rule of differentiation we know that:

\[ \frac{\partial g}{\partial y} = \frac{dg}{du}\frac{\partial u}{\partial y} = x\frac{dg}{du}, \qquad \frac{\partial g}{\partial x} = \frac{dg}{du}\frac{\partial u}{\partial x} = y\frac{dg}{du}. \]

Eq. 2.30 therefore translates into:

\[ g(u)\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) = xP\frac{dg}{du} - yQ\frac{dg}{du}. \]

We can thus see that the integrating factor depends on $u = x \cdot y$ if and only if the function

\[ H(u) = \frac{1}{yQ - xP}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) \tag{2.35} \]

depends on $u$ alone, and in this case we can easily find $g(u)$ as:

\[ g(u) = e^{\int H(\tilde{u})\,d\tilde{u}}. \tag{2.36} \]

4. $g = g\left(\frac{x}{y}\right)$

Now we can make the substitution $u = \frac{x}{y}$, and the chain rule of differentiation leads us to:

\[ \frac{\partial g}{\partial y} = -\frac{x}{y^2}\frac{dg}{du}, \qquad \frac{\partial g}{\partial x} = \frac{1}{y}\frac{dg}{du}. \]

Eq. 2.30 therefore translates into:

\[ g(u)\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) = -\frac{xP}{y^2}\frac{dg}{du} - \frac{Q}{y}\frac{dg}{du}. \]

Therefore the integrating factor depends on $u = \frac{x}{y}$ if and only if the function

\[ K(u) = \frac{y^2}{yQ + xP}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) \tag{2.37} \]

depends on $u$ alone, and also in this case $g(u)$ can be expressed as:

\[ g(u) = e^{\int K(\tilde{u})\,d\tilde{u}}. \tag{2.38} \]

As a general rule, it is therefore always useful to look at the quantity:

\[ \frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}. \]

If this is zero, then the ODE is exact and "easy" to solve. If it is not zero, it might nevertheless hide a hint of the form the integrating factor $g(x, y)$ might have. Of course, it is worth recalling that the ODE might not have a solution! In that case, the integrating factor does not exist.

Example 2.2.7 Find the solution of the ODE:

\[ 1 + 6\frac{x^2}{y} + \frac{x}{y}(\ln x - 2)\,y' = 0. \]

We have $P(x, y) = 1 + 6\frac{x^2}{y}$ and $Q(x, y) = \frac{x}{y}(\ln x - 2)$, therefore:

\[ \frac{\partial P}{\partial y} = -6\frac{x^2}{y^2}, \qquad \frac{\partial Q}{\partial x} = \frac{1}{y}(\ln x - 1). \]

The ODE is therefore not exact. $\frac{\partial Q}{\partial x}$ suggests nothing, but $\frac{\partial P}{\partial y}$ seems to suggest some possible dependence on $\frac{x}{y}$. We therefore try to calculate the quantity of Eq. 2.37, which in our case turns out to be:

\[ K(u) = \frac{y^2\left(-6\frac{x^2}{y^2} - \frac{1}{y}(\ln x - 1)\right)}{x(\ln x - 1) + 6\frac{x^3}{y}} = -\frac{y}{x} = -\frac{1}{u}. \]

We have been lucky: the integrating factor depends indeed on $\frac{x}{y}$, and it is easy to calculate. In fact, since $\int -\frac{1}{u}\,du = -\ln u$, we obtain $g(u) = e^{-\ln u} = \frac{1}{u}$, namely $g(x, y) = \frac{y}{x}$. The exact ODE is therefore:

\[ \frac{y}{x} + 6x + (\ln x - 2)\,y' = 0. \]

To solve it we can proceed as in Example 2.2.4, namely integrating Eq. 2.21, which yields:

\[ f(x, y) = \int (\ln x - 2)\,d\tilde{y} + s(x) = y(\ln x - 2) + s(x). \]

We differentiate this equation with respect to $x$ and use the condition Eq. 2.20 to find:

\[ s'(x) = \frac{y}{x} + 6x - \frac{y}{x} = 6x, \]

from which we get $s(x) = 3x^2$; therefore the solution is:

\[ y(\ln x - 2) + 3x^2 = K \;\Rightarrow\; y = \frac{K - 3x^2}{\ln x - 2}. \]

2.3 Higher degree first order differential equations

So far we have dealt only with ODEs of the type $P(x, y) + Q(x, y)\,y' = 0$, namely we have excluded the possibility that the derivative of the function $y(x)$ carries an exponent different from 1. In the presence of terms of the type $[y']^n$, the solution of the ODE is much more complicated. In this case it is always a good idea to make the substitution $y'(x) = p$, namely to use the derivative $y'$ as a parameter, and to try to solve the corresponding equation $f(p, x, y) = 0$. An example can help to clarify this procedure.

Example 2.3.1 Solve the differential equation $3[y'(x)]^2 - 5xy\,y' + 2x^2 y^2 = 0$.

With the substitution $y'(x) = p$ we obtain the equation:

\[ 3p^2 - 5xyp + 2x^2 y^2 = 0 \;\Rightarrow\; 3\left(p - \frac{2}{3}xy\right)(p - xy) = 0. \]

Now we are left with solving two first degree ODEs, namely:

\[ y' = \frac{2}{3}xy \;\Rightarrow\; 3\,d\ln y = dx^2 \;\Rightarrow\; y = K e^{\frac{x^2}{3}}, \]
\[ y' = xy \;\Rightarrow\; d\ln y = \frac{1}{2}\,dx^2 \;\Rightarrow\; y = K e^{\frac{x^2}{2}}. \]

Note that, since there is only one constant required for the solution of a first order ODE, we can take the same integration constant $K$ for both solutions. The final general solution of the given ODE is given by:

\[ \left(y - K e^{\frac{x^2}{3}}\right)\left(y - K e^{\frac{x^2}{2}}\right) = 0. \]

The general form of a first order, higher degree ODE is:

\[ a_n(x, y)\,p^n + a_{n-1}(x, y)\,p^{n-1} + \dots + a_1(x, y)\,p + a_0(x, y) = 0, \tag{2.39} \]

where we have used the notation $p = y'(x)$. The solution of this equation can be obtained (either explicitly or in parametric form) if the equation can be solved for $p$, $y$ or $x$.

2.3.1 Equations solvable for p

If the left hand side of Eq. 2.39 can be factorized into the form
\[ [p - F_1(x, y)][p - F_2(x, y)]\dots[p - F_n(x, y)], \tag{2.40} \]

then we can solve separately each of the first order, first degree ODEs $p - F_j(x, y) = 0$ and express the solutions in the form $G_j(x, y, K) = 0$. At this point, the general solution of the given ODE will be:

\[ G_1(x, y, K)\,G_2(x, y, K)\dots G_n(x, y, K) = 0. \tag{2.41} \]

Example 2.3.1 was solved exactly with this method.

2.3.2 Equations solvable for y

If Eq. 2.39 can be solved for $y$, this means that we can write it in the form $y = F(x, p)$. Now, differentiating both members with respect to $x$ we obtain:

\[ y'(x) = p = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial p}\frac{dp}{dx}. \tag{2.42} \]

Namely, we have a function $G(p, p'(x), x) = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial p}\frac{dp}{dx} - p = 0$ that does not depend on $y$. If this equation can be solved to give $p = p(x)$, then we can substitute this function into the original Eq. 2.39 and obtain the final solution $f(x, y) = 0$. Using this method, most of the time we find at the end of the computation some ancillary solutions that cannot be obtained from the general solution. These solutions are called singular solutions. An example can illustrate this kind of solution.

Example 2.3.2 Solve the differential equation

\[ 3[y'(x)]^2 - 2y'(x) + \frac{y}{x} = 0. \]

As usual we set $y'(x) = p$. We can multiply both sides of the equation by $x$ and see that the equation is easily solvable for $y$; namely, we obtain:

\[ y = -3p^2 x + 2px. \]

Now we can differentiate both members of this equation with respect to $x$ and, remembering that $y' = p$, we obtain:

\[ p = -6pxp' + 2xp' - 3p^2 + 2p \;\Rightarrow\; 2xp'(1 - 3p) + p(1 - 3p) = 0 \;\Rightarrow\; (p + 2xp')(1 - 3p) = 0. \tag{2.43} \]

We have managed to factorize the equation and isolate one term containing $p'$ and another term containing only $p$. Now we solve the simple ODE $p + 2xp' = 0$, obtaining:

\[ \frac{p'}{p} = -\frac{1}{2x} \;\Rightarrow\; \ln p = -\frac{1}{2}\ln x + C \;\Rightarrow\; p = \frac{K}{\sqrt{x}}. \]

Substituting this value of $p$ into the original ODE, we obtain:

\[ 3\frac{K^2}{x} - 2\frac{K}{\sqrt{x}} + \frac{y}{x} = 0 \;\Rightarrow\; y = 2K\sqrt{x} - 3K^2. \]

This is the general solution of the given ODE. However, we must not forget that Eq. 2.43 was factorized into two factors, and from the second factor we can easily find the solution $p = \frac{1}{3}$. If we substitute it into the original ODE, we get the solution:

\[ y = \frac{x}{3}. \]

It is very easy to verify that this is indeed a solution of the given ODE. It can also be noticed that it cannot be obtained from the general solution by any choice of the integration constant $K$. That is the reason why it is called a singular solution.
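Both the general and the singular solution of Example 2.3.2 can be checked by direct substitution; a small numpy sketch (the helper name residual is our own):

```python
import numpy as np

x = np.linspace(0.5, 4.0, 8)

def residual(y, yp):
    """Left hand side of 3*(y')**2 - 2*y' + y/x = 0 (Example 2.3.2)."""
    return 3 * yp**2 - 2 * yp + y / x

# General solution y = 2*K*sqrt(x) - 3*K**2, with y' = K/sqrt(x):
for K in (0.5, 1.0, 2.0):
    print(np.max(np.abs(residual(2 * K * np.sqrt(x) - 3 * K**2,
                                 K / np.sqrt(x)))))   # 0 up to rounding

# Singular solution y = x/3, with y' = 1/3:
print(np.max(np.abs(residual(x / 3, 1 / 3 + 0 * x))))  # 0 as well
```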
2.3.3 Equations solvable for x

If Eq. 2.39 can be solved for $x$, we can proceed in a similar way as in the previous subsection. Namely, we write the equation in the form $x=F(y,p)$ and differentiate both members with respect to $y$, obtaining:

$$x'(y)=\frac{1}{p}=\frac{\partial F}{\partial y}+\frac{\partial F}{\partial p}\frac{dp}{dy}. \qquad (2.44)$$

We have now found a function $G(p,p'(y),y)=\frac{\partial F}{\partial y}+\frac{\partial F}{\partial p}\frac{dp}{dy}-\frac{1}{p}=0$ that does not depend on $x$. We can use it together with the original ODE to eliminate $p$ and obtain the general solution. As in the previous case, if the function $G(p,p'(y),y)$ can be factorized, the term containing $p'(y)$ should be used to find $p(y)$, which is then used to eliminate $y'(x)=p$ from the original ODE and obtain the general solution; the remaining term of the factorized function $G(p,p'(y),y)$ will often lead to singular solutions. Note also that in this case (but also in the case of an ODE solvable for $y$) it is not required that the ODE have the form of Eq. 2.39: it is enough that we can find a function $G(p,p'(y),y)=0$ that does not depend on $x$, solve it, and substitute $p$ into the original ODE. An example can help to clarify this.

Example 2.3.3 Solve the ODE $xy'=y\ln y'$.

Provided that $y'>0$ (as required, since otherwise the logarithm of $y'$ would not be defined), this is clearly an ODE solvable for $x$ (indeed it is also solvable for $y$). We make as usual the substitution $y'=p$ and obtain:

$$x=y\,\frac{\ln p}{p}.$$

We differentiate with respect to $y$ and obtain:

$$x'=\frac{1}{p}=\frac{\ln p}{p}+y\,\frac{p'-p'\ln p}{p^{2}} \;\Rightarrow\; p=p\ln p+y(1-\ln p)p' \;\Rightarrow\; (p-yp')(1-\ln p)=0.$$

Assuming $y\neq0$, from the factor containing $p'$ we obtain:

$$\frac{dp}{p}=\frac{dy}{y} \;\Rightarrow\; p=Ky.$$

Substituting it into $x=y\frac{\ln p}{p}$ we obtain the general solution:

$$x=\frac{\ln Ky}{K}.$$

As usual, we should not forget the term not containing $p'$ in the factorized ODE, namely $(1-\ln p)$. Equating it to 0 we obtain $p=e$, and hence the singular solution $y=ex$, which cannot be obtained from the general solution for any value of the integration constant $K$.

2.3.4 Special differential equations

Bernoulli's equation

Bernoulli's equation has the form:

$$y'(x)=f(x)y(x)+g(x)y^{p}(x). \qquad (2.45)$$

For $p=0$ or $p=1$ we already know how to solve this ODE. For $p\neq0,1$ this is a non-linear first order, first degree ODE. However, it can be made linear by the simple substitution $z=y^{1-p}$. In fact, we obtain:

$$y=z^{\frac{1}{1-p}} \;\Rightarrow\; y'=\frac{1}{1-p}\,z^{\frac{p}{1-p}}z'.$$

In this way we have:

$$\frac{1}{1-p}\,z^{\frac{p}{1-p}}z'=f(x)z^{\frac{1}{1-p}}+g(x)z^{\frac{p}{1-p}} \;\Rightarrow\; z'=(1-p)f(x)z+(1-p)g(x). \qquad (2.46)$$

This is a linear ODE in $z$, and therefore we can promptly find the solution applying Eq. 2.16, namely:

$$z(x)=e^{(1-p)\int f(\tilde x)d\tilde x}\left[\int(1-p)g(x)e^{(p-1)\int f(\tilde x)d\tilde x}\,dx+K\right].$$

Recalling the substitution $z=y^{1-p}$, we therefore obtain:

$$y(x)=\left\{e^{(1-p)\int f(\tilde x)d\tilde x}\left[\int(1-p)g(x)e^{(p-1)\int f(\tilde x)d\tilde x}\,dx+K\right]\right\}^{\frac{1}{1-p}},$$

which can be simplified into:

$$y(x)=e^{\int f(\tilde x)d\tilde x}\left[(1-p)\int g(x)e^{(p-1)\int f(\tilde x)d\tilde x}\,dx+K\right]^{\frac{1}{1-p}}. \qquad (2.47)$$

Example 2.3.4 Solve the ODE $x^{2}y-x^{3}y'=y^{4}\sin x$.

We can first notice that a trivial solution of this ODE is $y=0$. Dividing both members by $x^{3}$ we obtain:

$$y'=\frac{y}{x}-\frac{y^{4}}{x^{3}}\sin x.$$

This is therefore a Bernoulli ODE with $p=4$, $f(x)=\frac{1}{x}$, $g(x)=-\frac{\sin x}{x^{3}}$. We can directly apply Eq. 2.47 and obtain:

$$y(x)=e^{\int\frac{d\tilde x}{\tilde x}}\left[-3\int\left(-\frac{\sin x}{x^{3}}\right)e^{3\int\frac{d\tilde x}{\tilde x}}\,dx+K\right]^{-\frac{1}{3}}
=e^{\ln x}\left[3\int\frac{\sin x}{x^{3}}\,e^{3\ln x}\,dx+K\right]^{-\frac{1}{3}}
=x\left[3\int\sin x\,dx+K\right]^{-\frac{1}{3}}
=\frac{x}{\sqrt[3]{K-3\cos x}}.$$
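For readers who want to cross-check Example 2.3.4 by machine, sympy's ODE solver recognizes Bernoulli equations; the sketch below (our own construction, not from the notes) should return a solution equivalent to $y=x/\sqrt[3]{K-3\cos x}$ up to the naming of the constant.

\begin{verbatim}
# Sketch: let sympy solve the Bernoulli ODE of Example 2.3.4 directly.
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')
ode = sp.Eq(x**2*y(x) - x**3*y(x).diff(x), y(x)**4*sp.sin(x))
print(sp.dsolve(ode))
# expected: a form equivalent to y**3 = x**3/(C1 - 3*cos(x))
\end{verbatim}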
Riccati's equation

The Riccati equation has the form:

$$y'(x)=f(x)+g(x)y(x)+h(x)y^{2}(x); \qquad (2.48)$$

namely, it is a Bernoulli equation with $p=2$ but with an additional term $f(x)$. To solve this equation we can make the substitution:

$$u=e^{-\int y(\tilde x)h(\tilde x)\,d\tilde x} \;\Rightarrow\; u'=-uyh.$$

In this way we obtain:

$$y=-\frac{u'}{uh} \;\Rightarrow\; y'=-\frac{u''(uh)-(u'h+uh')u'}{u^{2}h^{2}}.$$

Substituting $y$ and $y'$ into the original ODE Eq. 2.48 we obtain:

$$-\frac{u''(uh)-(u'h+uh')u'}{u^{2}h^{2}}=f-g\,\frac{u'}{uh}+h\,\frac{(u')^{2}}{u^{2}h^{2}},$$
$$\Rightarrow\;(u')^{2}h+uu'h'-u''(uh)=fu^{2}h^{2}-gu'uh+h(u')^{2},$$
$$\Rightarrow\;u''uh=u'(uh'+ghu)-fh^{2}u^{2},$$
$$\Rightarrow\;u''=u'\left(\frac{h'}{h}+g\right)-fhu. \qquad (2.49)$$

The resulting ODE is therefore linear, but unfortunately of second order, and we have not (yet) learned how to solve second order ODEs. The only way to solve the Riccati ODE without invoking second order ODEs is by quadrature, once a particular solution of the ODE is known; if a particular solution is not known, we cannot solve the Riccati ODE with the methods we have learned so far.

If we know that $y_{1}(x)$ is a particular (not the general) solution, then of course from Eq. 2.48 we have $y_{1}'=f+gy_{1}+hy_{1}^{2}$. We can now make the substitution $u=y-y_{1}$ and obtain:

$$u'=g(y-y_{1})+h(y^{2}-y_{1}^{2}).$$

We have $y^{2}-y_{1}^{2}=(y-y_{1})(y+y_{1})=(y-y_{1})(y-y_{1}+2y_{1})=u(u+2y_{1})$, therefore the above equation becomes:

$$u'=gu+hu(u+2y_{1})=(g+2hy_{1})u+hu^{2}. \qquad (2.50)$$

This is a Bernoulli equation with $p=2$, which can be solved directly by means of Eq. 2.47, to obtain:

$$u(x)=e^{\int[g(\tilde x)+2h(\tilde x)y_{1}(\tilde x)]d\tilde x}\left[-\int h(x)e^{\int[g(\tilde x)+2h(\tilde x)y_{1}(\tilde x)]d\tilde x}\,dx+K\right]^{-1},$$
$$\Rightarrow\; y(x)=y_{1}(x)-\frac{e^{\int[g(\tilde x)+2h(\tilde x)y_{1}(\tilde x)]d\tilde x}}{\int h(x)e^{\int[g(\tilde x)+2h(\tilde x)y_{1}(\tilde x)]d\tilde x}\,dx+K}. \qquad (2.51)$$

Example 2.3.5 Solve the ODE $4xy'-2xy=e^{-x}y^{2}+xe^{x}(4+x)$.

We can rewrite the ODE in the form:

$$y'=\frac{1}{2}y+\frac{1}{4x}e^{-x}y^{2}+\frac{1}{4}e^{x}(4+x).$$

Now we can recognize it as a Riccati ODE with $f(x)=\frac{1}{4}e^{x}(4+x)$, $g(x)=\frac{1}{2}$, $h(x)=\frac{1}{4x}e^{-x}$. Because of the term $e^{-x}y^{2}$ we can suppose that a function of the kind $x^{k}e^{x}$ could be a solution of the ODE: the terms $e^{x}$ would then cancel out. Let us check whether a value of $k$ exists such that $y_{1}(x)=x^{k}e^{x}$ is a particular solution of the given ODE. It is $y_{1}'=kx^{k-1}e^{x}+x^{k}e^{x}=x^{k-1}e^{x}(k+x)$, therefore we have:

$$x^{k-1}e^{x}(k+x)=\frac{1}{2}x^{k}e^{x}+\frac{1}{4x}x^{2k}e^{x}+\frac{1}{4}e^{x}(4+x).$$

As expected, we can cancel $e^{x}$ and obtain:

$$kx^{k-1}+x^{k}=\frac{x^{k}}{2}+\frac{x^{2k}}{4x}+1+\frac{x}{4}.$$

All the exponents of $x$ must match, and it is easy to see that this condition is fulfilled only for $k=1$; it is equally easy to see that for $k=1$ the previous equation is indeed an identity. Therefore $y_{1}(x)=xe^{x}$ is a particular solution of the given ODE. Now we can apply Eq. 2.51. We have:

$$e^{\int(g+2hy_{1})\,d\tilde x}=e^{\int\left[\frac{1}{2}+\frac{1}{2\tilde x}e^{-\tilde x}\cdot\tilde xe^{\tilde x}\right]d\tilde x}=e^{\int1\,d\tilde x}=e^{x},$$
$$\int h(x)\,e^{x}\,dx=\int\frac{e^{-x}}{4x}\,e^{x}\,dx=\int\frac{dx}{4x}=\frac{\ln x}{4},$$
$$\Rightarrow\; y(x)=y_{1}(x)-\frac{e^{x}}{\frac{\ln x}{4}+K}.$$

The general solution of the given ODE is thus:

$$y(x)=xe^{x}-\frac{4e^{x}}{\ln x+K}.$$

In a sense, the particular solution $y_{1}$ can be obtained from the general solution for $K$ very large, namely $\lim_{K\to\infty}y(x)=y_{1}(x)$.

Clairaut's equation

Clairaut's equation has the form:

$$y=xy'+g(y'). \qquad (2.52)$$

Namely, it is just a particular case of the equations solvable for $y$ we have encountered in Sect. 2.3.2, and we could solve it with the method learned in that subsection; but for Clairaut's equation the form of the general solution is particularly simple. In fact, with the usual substitution $y'=p$, we have $y=xp+g(p)$; differentiating with respect to $x$ we get:

$$y'=p=xp'+p+\frac{dg}{dp}\,p' \;\Rightarrow\; \left(x+\frac{dg}{dp}\right)p'=0.$$

The factor containing $p'$ is elementary to solve: it gives $p=K$, and therefore we have already found the general solution of the ODE:

$$y(x)=Kx+g(K). \qquad (2.53)$$

The equations

$$x=-\frac{dg}{dp},\qquad y=xp+g(p) \qquad (2.54)$$

represent the (parametric) singular solution.

Example 2.3.6 Solve the ODE $y=xy'+e^{y'}$.

This is clearly a Clairaut ODE with $g(p)=e^{p}$, therefore the general solution is:

$$y=Kx+e^{K}.$$

To find the singular solution we have to solve the system of equations:

$$x=-e^{p},\qquad y=xp+e^{p}.$$

We have $p=\ln(-x)$, therefore the singular solution of the given ODE is:

$$y=x\ln(-x)-x=x[\ln(-x)-1].$$
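Example 2.3.6 lends itself to a short symbolic check. The sketch below (an illustration of ours, not part of the notes) builds the general solution by replacing $y'$ with a constant and recovers the singular solution by eliminating $p$ from the parametric pair of Eq. 2.54.

\begin{verbatim}
# Sketch of Example 2.3.6: general solution and singular solution (envelope).
import sympy as sp

x, K, p = sp.symbols('x K p')
g = sp.exp(p)

y_general = K*x + g.subs(p, K)                      # y = K*x + exp(K)
p_of_x = sp.solve(sp.Eq(x, -sp.diff(g, p)), p)[0]   # x = -e^p  =>  p = log(-x)
y_singular = sp.expand(x*p_of_x + g.subs(p, p_of_x))
print(y_general)                                    # K*x + exp(K)
print(y_singular)                                   # x*log(-x) - x
\end{verbatim}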
D'Alembert's equation

The D'Alembert (or D'Alembert-Lagrange) equation can be written as follows:

$$y=xf(y')+g(y'). \qquad (2.55)$$

It is therefore again a particular case of the equations solvable for $y$ we have encountered in Sect. 2.3.2. Note also that Clairaut's equation is the particular form of D'Alembert's equation with $f(y')=y'$. Also in this case there is an easier method of finding the general solution of this ODE than the one described in Sect. 2.3.2. Let as usual $y'$ be $p$, so that $y=xf(p)+g(p)$. We differentiate with respect to $x$ and obtain:

$$y'=p=f(p)+xf'(p)p'+g'(p)p' \;\Rightarrow\; p-f(p)=\left[xf'(p)+g'(p)\right]p'.$$

Now we can write $p'=\frac{dp}{dx}$ as $1/\frac{dx}{dp}=1/x'(p)$ and therefore obtain:

$$\left[p-f(p)\right]x'(p)=xf'(p)+g'(p) \;\Rightarrow\; x'(p)=x\,\frac{f'(p)}{p-f(p)}+\frac{g'(p)}{p-f(p)}. \qquad (2.56)$$

We have therefore obtained a linear ODE, which can be solved with the known methods to give $x=x(p)$. This equation, together with $y=xf(p)+g(p)$, is already the parametric general solution of the D'Alembert ODE. Unfortunately, with this method we cannot recover singular solutions.

Example 2.3.7 Solve the ODE $y=x(y')^{2}+2y'$.

This is clearly a D'Alembert ODE with $f(y')=(y')^{2}$ and $g(y')=2y'$. We can therefore apply Eq. 2.56 to obtain:

$$x'(p)=x\,\frac{2p}{p-p^{2}}+\frac{2}{p-p^{2}} \;\Rightarrow\; x'(p)-\frac{2}{1-p}\,x=\frac{2}{p-p^{2}}.$$

This is a linear equation for $x(p)$ (with $p$ as the independent variable), with $r(p)=-\frac{2}{1-p}$ and $s(p)=\frac{2}{p-p^{2}}$. We know how to solve this ODE, namely:

$$x(p)=e^{-2\ln(1-p)}\left[\int e^{2\ln(1-p)}\,\frac{2}{p(1-p)}\,dp+K\right]
=\frac{1}{(1-p)^{2}}\left[\int\frac{2(1-p)}{p}\,dp+K\right]
=\frac{1}{(1-p)^{2}}\left[2\int\left(\frac{1}{p}-1\right)dp+K\right]
=\frac{2}{(1-p)^{2}}\left(\ln p-p+K\right).$$

We can express the general solution of the given ODE in the parametric form:

$$x(p)=\frac{2}{(1-p)^{2}}(\ln p-p+K),\qquad y(p)=xp^{2}+2p.$$
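The linear equation for $x(p)$ in Example 2.3.7 can also be handed to sympy; the sketch below is our own, and the solver's output may be arranged differently from, though equivalent to, $x=\frac{2}{(1-p)^{2}}(\ln p-p+K)$.

\begin{verbatim}
# Sketch: solve the linear x(p) equation of Eq. 2.56 for Example 2.3.7.
import sympy as sp

p = sp.Symbol('p')
X = sp.Function('X')
ode = sp.Eq(X(p).diff(p) - 2/(1 - p)*X(p), 2/(p - p**2))
print(sp.dsolve(ode))
# expected: a form equivalent to X = 2*(log(p) - p + C1)/(1 - p)**2
\end{verbatim}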
2.3.5 Singular solutions and envelopes

We have seen that, in solving higher degree ODEs, a singular solution usually emerges that cannot be obtained from the general solution for any choice of the integration constant $K$. Does the singular solution really have nothing to do with the general solution? Let us take the ODE we have studied in Example 2.3.2, namely $3(y')^{2}-2y'+\frac{y}{x}=0$. We have seen that it has the general solution $y=2K\sqrt{x}-3K^{2}$ and the singular solution $y=\frac{x}{3}$. We have plotted the general solution for $K$ ranging from 1 to 5 (black lines), together with the singular solution (red line), in Fig. 2.4. As is clear from this figure, the singular solution is tangent at some point to each member of the family of integral curves representing the general solution of the given ODE: it is their envelope.

Has this just happened by chance? Of course not. Recall that we have transformed the given ODE into the form $y=-3p^{2}x+2px$. This can be seen as an equation in $p$, namely $3p^{2}x-2px+y=0$, whose discriminant is $\Delta=x^{2}-3xy$ (up to a factor 4). Because of the term $\frac{y}{x}$ in the original ODE we have to assume $x\neq0$. If we assume $x>0$, the discriminant is larger than 0 if and only if

$$x-3y>0 \;\Leftrightarrow\; y<\frac{x}{3}.$$

The curve $y=\frac{x}{3}$ thus represents the limiting curve below which two distinct roots for $p=y'$ can be found; on the curve itself, only a root of multiplicity larger than one exists. That also means that, if we set up an initial value problem by means of the initial condition $y(x_{0})=y_{0}$, we will obtain two solutions if $(x_{0},y_{0})$ lies below the curve delimiting the singular solution, one solution if $(x_{0},y_{0})$ is on the curve, and no solutions otherwise. For instance, given the initial condition $y(16)=5$, it is easy to see from the general solution of the ODE that there are two possible values of $K$ satisfying it, namely $K=1$ and $K=\frac{5}{3}$.

We shall remark here that having two distinct solutions (with two distinct values of $K$) of an initial value problem is something that happens only for ODEs of degree higher than 1. For first-degree ODEs we have so far taken for granted that there exists a single and unique solution of the initial value problem:

$$y'=f(x,y),\qquad y(x_{0})=y_{0}. \qquad (2.57)$$

Indeed, the so-called existence and uniqueness theorem guarantees that, if the function $f$ is well behaved, there is an interval of values of $x$ where the solution of Eq. 2.57 exists and is unique. We will not prove this theorem, but it is important to notice that such a result holds only for first-degree ODEs.

[Figure 2.4: family of curves $y=2K\sqrt{x}-3K^{2}$ with $K$ ranging from 1 (flattest curve) to 5 (steepest curve); the red line is the curve $y=\frac{x}{3}$, the singular solution of the ODE $3(y')^{2}-2y'+\frac{y}{x}=0$.]

Going back to the ODE $3(y')^{2}-2y'+\frac{y}{x}=0$, we can also notice that the derivative of the general solution $y=2K\sqrt{x}-3K^{2}$ with respect to $K$ yields $2\sqrt{x}-6K$. If we equate it to 0, we obtain $K=\frac{\sqrt{x}}{3}$; substituting this into the general solution we obtain $y=\frac{2x}{3}-3\frac{x}{9}=\frac{x}{3}$, namely again the singular solution. In other terms, if $f(x,y,K)=0$ is the general solution of an ODE, the two equations

$$f(x,y,K)=0,\qquad \frac{\partial f(x,y,K)}{\partial K}=0 \qquad (2.58)$$

represent the (parametric, with parameter $K$) form of the singular solution of the given ODE.

Yet another, even simpler, method can be found to recover the singular solution. If we differentiate the original ODE with respect to $y'$ we obtain $6y'-2$; equating it to 0 we obtain $y'=\frac{1}{3}$, and substituting this value into the ODE we obtain once again the singular solution $y=\frac{x}{3}$. Namely, given a differential equation $\psi(x,y,y')=0$, the singular solution can be obtained by solving the system of equations

$$\psi(x,y,y')=0,\qquad \frac{\partial\psi(x,y,y')}{\partial y'}=0, \qquad (2.59)$$

without finding the general solution. Also for the ODE $xy'=y\ln y'$ (Example 2.3.3) it is easy to see that both Eq. 2.58 and Eq. 2.59 lead us to the singular solution.

As far as the D'Alembert equation $y=x(y')^{2}+2y'$ (Example 2.3.7) is concerned, we have seen that the standard method of solution does not allow us to find the singular solution, and the application of Eq. 2.58 is quite complicated given the parametric form of the general solution. Applying Eq. 2.59 instead, we obtain immediately

$$\frac{\partial\psi(x,y,y')}{\partial y'}=0 \;\Rightarrow\; 2xy'+2=0 \;\Rightarrow\; y'=-\frac{1}{x},$$

and substituting this value of $y'$ into the original ODE we obtain immediately the singular solution $y=-\frac{1}{x}$.

However, this second method is not applicable to all differential equations. For instance, for an ODE of the kind $y'=\psi(x,y)$, differentiating both members with respect to $y'$ gives $1=0$. The first method, described by Eq. 2.58, works also in this case (provided that it is possible to find the general solution of the ODE).
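The system of Eq. 2.59 is mechanical enough to automate; the following sketch (our own, for illustration) recovers the singular solution of Example 2.3.2 without ever computing the general solution.

\begin{verbatim}
# Sketch of Eq. 2.59 applied to psi = 3*(y')**2 - 2*y' + y/x = 0;
# q stands for y'.
import sympy as sp

x, y, q = sp.symbols('x y q')
psi = 3*q**2 - 2*q + y/x
q_star = sp.solve(sp.diff(psi, q), q)[0]     # 6*q - 2 = 0  =>  q = 1/3
print(sp.solve(psi.subs(q, q_star), y)[0])   # x/3, the singular solution
\end{verbatim}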
2.4 Second order differential equations

Second order differential equations play a particularly important role in physics, because many relevant physical processes (Newtonian dynamics, oscillations, electric circuits and many more) can be described by means of equations involving second order derivatives. We have already seen a very simple example of a second order differential equation in the fall of a body which, neglecting air resistance, can be described by the ODE $\frac{d^{2}y}{dt^{2}}=g$. Since this ODE does not depend explicitly on $t$, $y$ or $\frac{dy}{dt}$, it was particularly easy to solve without knowing anything about the theory of second order ODEs. In general, however, we have to deal with equations of this kind:

$$y''(x)=f[x,y(x),y'(x)];$$

namely, we consider only second order, first degree ODEs. Actually, we will concentrate almost exclusively on linear ODEs, namely ODEs that can be written as:

$$a_{2}(x)y''(x)+a_{1}(x)y'(x)+a_{0}(x)y(x)=f(x),$$

or, to simplify the notation:

$$y''(x)+p(x)y'(x)+q(x)y(x)=g(x), \qquad (2.60)$$

where of course $p(x)=\frac{a_{1}(x)}{a_{2}(x)}$, $q(x)=\frac{a_{0}(x)}{a_{2}(x)}$ and $g(x)=\frac{f(x)}{a_{2}(x)}$ (provided that $a_{2}(x)\neq0$). As usual, we will start from the simplest possible cases and then deal with progressively more complicated ODEs.

2.4.1 Second order homogeneous ODEs with constant coefficients

The simplest possible second order ODE has the form:

$$a_{2}y''(x)+a_{1}y'(x)+a_{0}y(x)=0. \qquad (2.61)$$

This ODE is called homogeneous because $f(x)=0$ $\forall x$, and with constant coefficients for the obvious reason that the coefficients of each term of the ODE are real numbers and not functions of the independent variable $x$. Our mathematical intuition can help us solve this ODE. In fact, we know that the exponential function has the property that each of its derivatives, irrespective of the order, remains proportional to the original function. We can therefore expect that a function of the kind $y(x)=e^{\lambda x}$ could be a solution of the given ODE. We have of course $y'=\lambda e^{\lambda x}$ and $y''=\lambda^{2}e^{\lambda x}$, therefore:

$$\left(\lambda^{2}a_{2}+\lambda a_{1}+a_{0}\right)e^{\lambda x}=0 \;\Rightarrow\; \lambda=\frac{-a_{1}\pm\sqrt{a_{1}^{2}-4a_{0}a_{2}}}{2a_{2}}. \qquad (2.62)$$

The solution of the ODE reduces therefore to the solution of the simple algebraic equation $\lambda^{2}a_{2}+\lambda a_{1}+a_{0}=0$, which is called the characteristic equation, and which gives two possible values of $\lambda$.

Indeed, we have more than that. Let us call $\lambda_{1}$ and $\lambda_{2}$ the solutions of Eq. 2.62. Of course, we have $\lambda_{1}^{2}a_{2}+\lambda_{1}a_{1}+a_{0}=0$ and $\lambda_{2}^{2}a_{2}+\lambda_{2}a_{1}+a_{0}=0$. If we take a function $f(x)=c_{1}e^{\lambda_{1}x}+c_{2}e^{\lambda_{2}x}$, namely a linear combination of the two solutions $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$, we can easily see that $f(x)$ is also a solution of the given ODE. In fact:

$$a_{2}f''(x)+a_{1}f'(x)+a_{0}f(x)
=c_{1}e^{\lambda_{1}x}(a_{2}\lambda_{1}^{2}+a_{1}\lambda_{1}+a_{0})+c_{2}e^{\lambda_{2}x}(a_{2}\lambda_{2}^{2}+a_{1}\lambda_{2}+a_{0})=0,$$

since we have already seen that the terms in brackets are 0; therefore $f(x)$ is indeed a solution of the given ODE.
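Numerically, Eq. 2.62 is just a root-finding problem; a minimal sketch with numpy (our own choice of coefficients, purely illustrative):

\begin{verbatim}
# Sketch: roots of the characteristic equation of a2*y'' + a1*y' + a0*y = 0.
import numpy as np

a2, a1, a0 = 1.0, 3.0, 2.0          # y'' + 3y' + 2y = 0
lam1, lam2 = np.roots([a2, a1, a0])
print(lam1, lam2)                   # -2.0 and -1.0: y = c1*e^{-x} + c2*e^{-2x}
\end{verbatim}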
This result is a particular case of a more general theorem, called the principle of superposition, which states that, given a linear homogeneous differential equation

$$y''+p(x)y'+q(x)y=0, \qquad (2.63)$$

and given two solutions $y_{1}$ and $y_{2}$ of this ODE, the linear combination $c_{1}y_{1}+c_{2}y_{2}$ is also a solution of the given ODE for any value of the constants $c_{1}$ and $c_{2}$. To demonstrate this theorem it is enough to recall the linearity of the derivative operator, namely that, given a function $c_{1}y_{1}(x)+c_{2}y_{2}(x)$, one has:

$$[c_{1}y_{1}(x)+c_{2}y_{2}(x)]'=c_{1}y_{1}'(x)+c_{2}y_{2}'(x). \qquad (2.64)$$

Of course we will also have $(c_{1}y_{1}+c_{2}y_{2})''=c_{1}y_{1}''+c_{2}y_{2}''$. Now we can test whether the function $g=c_{1}y_{1}+c_{2}y_{2}$ satisfies the ODE $g''+p(x)g'+q(x)g=0$. We have:

$$g''+pg'+qg=(c_{1}y_{1}''+c_{2}y_{2}'')+p(c_{1}y_{1}'+c_{2}y_{2}')+q(c_{1}y_{1}+c_{2}y_{2})
=c_{1}(y_{1}''+py_{1}'+qy_{1})+c_{2}(y_{2}''+py_{2}'+qy_{2})=0.$$

The last step is justified by the fact that the two functions $y_{1}$ and $y_{2}$ are solutions of the given ODE. We have therefore shown that the function $c_{1}y_{1}+c_{2}y_{2}$ is also a solution of the ODE. Namely, starting from two particular solutions of the linear homogeneous ODE we can construct an infinite family of solutions by means of linear combinations of the two initial solutions.

2.4.2 The Wronskian determinant

Given a generic solution $y=c_{1}y_{1}+c_{2}y_{2}$, how should $c_{1}$ and $c_{2}$ be chosen in order to satisfy the initial conditions $y(x_{0})=y_{0}$ and $y'(x_{0})=y_{0}'$? Of course it must be:

$$\begin{cases}c_{1}y_{1}(x_{0})+c_{2}y_{2}(x_{0})=y_{0}\\ c_{1}y_{1}'(x_{0})+c_{2}y_{2}'(x_{0})=y_{0}'.\end{cases} \qquad (2.65)$$

We recall from the theory of systems of linear equations how to find $c_{1}$ and $c_{2}$ (see also Sect. 5.1.2). By means of Cramer's rule, we obtain:

$$c_{1}=\frac{\begin{vmatrix}y_{0}&y_{2}(x_{0})\\ y_{0}'&y_{2}'(x_{0})\end{vmatrix}}{\begin{vmatrix}y_{1}(x_{0})&y_{2}(x_{0})\\ y_{1}'(x_{0})&y_{2}'(x_{0})\end{vmatrix}},\qquad
c_{2}=\frac{\begin{vmatrix}y_{1}(x_{0})&y_{0}\\ y_{1}'(x_{0})&y_{0}'\end{vmatrix}}{\begin{vmatrix}y_{1}(x_{0})&y_{2}(x_{0})\\ y_{1}'(x_{0})&y_{2}'(x_{0})\end{vmatrix}}. \qquad (2.66)$$

The solution of this system is therefore possible only if the denominators of these quantities are different from zero, namely only if:

$$W=\begin{vmatrix}y_{1}(x_{0})&y_{2}(x_{0})\\ y_{1}'(x_{0})&y_{2}'(x_{0})\end{vmatrix}=y_{1}(x_{0})y_{2}'(x_{0})-y_{1}'(x_{0})y_{2}(x_{0})\neq0. \qquad (2.67)$$

The determinant $W$ is called the Wronskian determinant, or simply Wronskian, and it plays a fundamental role in the study of differential equations of order higher than 1. The Wronskian of two functions $y_{1}$ and $y_{2}$ at a point $x_{0}$ is also indicated with the notation $W(y_{1},y_{2})(x_{0})$.

The condition $W\neq0$ implies, in the end, that the functions $y_{1}$ and $y_{2}$ are linearly independent. In fact, if the two functions are linearly dependent, then it is always possible to find a constant $k$ such that $y_{2}=ky_{1}$; the Wronskian is then:

$$W=y_{1}\cdot ky_{1}'-y_{1}'\cdot ky_{1}=0.$$

It can be demonstrated that, given an ODE $y''+p(x)y'+q(x)y=0$ that admits two solutions $y_{1}$ and $y_{2}$, if there is a point $x_{0}$ where the Wronskian is nonzero, then the family of solutions

$$y(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x), \qquad (2.68)$$

with arbitrary coefficients $c_{1}$ and $c_{2}$, includes every solution of the given ODE. For this reason we will call Eq. 2.68 the general solution of the given ODE. The solutions $y_{1}$ and $y_{2}$ are said to form a fundamental set of solutions of the ODE. In fact, one can see that the family of solutions of the ODE forms a vector space, of which $y_{1}$ and $y_{2}$ are a basis (often called generators).
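Computing a Wronskian symbolically takes two lines; here is a small sketch (ours, not from the notes) for the fundamental set of $y''+3y'+2y=0$ used earlier.

\begin{verbatim}
# Sketch: Wronskian of Eq. 2.67 for y1 = e^{-x}, y2 = e^{-2x}.
import sympy as sp

x = sp.Symbol('x')
y1, y2 = sp.exp(-x), sp.exp(-2*x)
W = sp.simplify(y1*sp.diff(y2, x) - sp.diff(y1, x)*y2)
print(W)    # -exp(-3*x), which never vanishes: a fundamental set
\end{verbatim}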
If the functions $y_{1}(x)$ and $y_{2}(x)$ satisfy the linear homogeneous ODE Eq. 2.63, we have:

$$y_{1}''+p(x)y_{1}'+q(x)y_{1}=0,\qquad y_{2}''+p(x)y_{2}'+q(x)y_{2}=0.$$

We multiply now the first equation by $-y_{2}$ and the second by $y_{1}$, obtaining:

$$-y_{2}y_{1}''-p(x)y_{2}y_{1}'-q(x)y_{2}y_{1}=0,\qquad y_{1}y_{2}''+p(x)y_{1}y_{2}'+q(x)y_{1}y_{2}=0.$$

If we now add these two equations together we obtain:

$$y_{1}y_{2}''-y_{2}y_{1}''+p(x)(y_{1}y_{2}'-y_{2}y_{1}')=0. \qquad (2.69)$$

If we treat the Wronskian $W(y_{1},y_{2})(x)=y_{1}(x)y_{2}'(x)-y_{2}(x)y_{1}'(x)$ as a function of $x$, we can differentiate it with respect to $x$ and obtain:

$$W'(x)=y_{1}'y_{2}'+y_{1}y_{2}''-y_{1}'y_{2}'-y_{1}''y_{2}=y_{1}y_{2}''-y_{1}''y_{2}.$$

Recalling now Eq. 2.69 we can rewrite it as:

$$W'(x)+p(x)W(x)=0.$$

This is a simple separable differential equation, whose solution is:

$$W(x)=Ce^{-\int p(\tilde x)\,d\tilde x}, \qquad (2.70)$$

where the constant $C$ depends only on the two functions $y_{1}$ and $y_{2}$. Furthermore, if $C=0$ the Wronskian is always zero, whereas if $C\neq0$ the Wronskian is different from 0 for any choice of $x$. Eq. 2.70 is also called Abel's theorem. Since

$$W=y_{1}y_{2}'-y_{1}'y_{2}=y_{1}^{2}\left(\frac{y_{2}}{y_{1}}\right)',$$

we can also recover the useful relation:

$$Ce^{-\int p(\tilde x)\,d\tilde x}=y_{1}^{2}\left(\frac{y_{2}}{y_{1}}\right)' \;\Rightarrow\; y_{2}=Cy_{1}\int\frac{e^{-\int p(\tilde x)\,d\tilde x}}{y_{1}^{2}(x)}\,dx, \qquad (2.71)$$

which is a direct formula to find the second solution of an ODE once one solution is already known.

2.4.3 Fundamental set of solutions of homogeneous ODEs with constant coefficients

Going back to the second order homogeneous ODE with constant coefficients we have studied at the beginning of this section, we have seen that the solution is given by a linear combination of the two functions $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$, where $\lambda_{1}$ and $\lambda_{2}$ are calculated by means of Eq. 2.62, namely they are the solutions of the characteristic equation. At a generic point $x_{0}$ the Wronskian of this set of solutions is given by:

$$W=\begin{vmatrix}e^{\lambda_{1}x_{0}}&e^{\lambda_{2}x_{0}}\\ \lambda_{1}e^{\lambda_{1}x_{0}}&\lambda_{2}e^{\lambda_{2}x_{0}}\end{vmatrix}=(\lambda_{2}-\lambda_{1})e^{(\lambda_{1}+\lambda_{2})x_{0}}. \qquad (2.72)$$

Provided that $\lambda_{1}\neq\lambda_{2}$, this Wronskian is different from 0 for any possible choice of $x_{0}$. Therefore, the functions $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$ constitute a fundamental set of solutions of the homogeneous ODE with constant coefficients, provided that the roots of the characteristic equation do not coincide. We can distinguish therefore 3 possible cases, depending on the nature of the roots of the characteristic equation.

Distinct real roots

In this case $\lambda_{1,2}\in\mathbb{R}$ and $\lambda_{1}\neq\lambda_{2}$. This happens when the discriminant of the characteristic equation $\Delta=a_{1}^{2}-4a_{0}a_{2}>0$. The functions $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$ are real-valued and can be used to express the general solution of any physical problem.

Example 2.4.1 Find the general solution of the ODE $y''-5y'+6y=0$.

The characteristic equation of this ODE is:

$$\lambda^{2}-5\lambda+6=0 \;\Rightarrow\; \lambda=\frac{5\pm\sqrt{25-24}}{2} \;\Rightarrow\; \lambda_{1}=2,\;\lambda_{2}=3.$$

The general solution of the given ODE is thus:

$$y(x)=c_{1}e^{2x}+c_{2}e^{3x}.$$
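Example 2.4.1 can be checked in one line with sympy; this sketch is ours, and the solver's output may group the constants differently from the form above.

\begin{verbatim}
# Sketch: sympy check of Example 2.4.1.
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')
print(sp.dsolve(y(x).diff(x, 2) - 5*y(x).diff(x) + 6*y(x)))
# expected: y(x) = (C1 + C2*exp(x))*exp(2*x), i.e. c1*e^{2x} + c2*e^{3x}
\end{verbatim}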
Distinct complex roots

We already know (see Sect. 1.3.3) that, when the discriminant of the characteristic equation $\Delta=a_{1}^{2}-4a_{0}a_{2}<0$, the solutions are the complex numbers:

$$\lambda_{1,2}=\frac{-a_{1}\pm i\sqrt{-\Delta}}{2a_{2}};$$

namely, the two solutions are the two complex conjugate numbers $\lambda_{1}=-\frac{a_{1}}{2a_{2}}-i\frac{\sqrt{-\Delta}}{2a_{2}}=\mu-i\alpha$ and $\lambda_{2}=-\frac{a_{1}}{2a_{2}}+i\frac{\sqrt{-\Delta}}{2a_{2}}=\mu+i\alpha$. It is still true that the function $c_{1}e^{\lambda_{1}x}+c_{2}e^{\lambda_{2}x}$ is the general solution of the given ODE, but in general we might need complex coefficients $c_{1}$ and $c_{2}$, and this is to be avoided when we treat physical problems (whose solutions are supposed to be real-valued). There is a way to avoid it and obtain real-valued solutions. In fact, we have seen that any linear combination of the fundamental set of solutions of a given ODE is still a solution. By using Euler's formula (Eq. 1.13) we can rewrite the solutions as:

$$y_{1}=e^{\lambda_{1}x}=e^{\mu x}[\cos(-\alpha x)+i\sin(-\alpha x)]=e^{\mu x}[\cos(\alpha x)-i\sin(\alpha x)],$$
$$y_{2}=e^{\lambda_{2}x}=e^{\mu x}[\cos(\alpha x)+i\sin(\alpha x)].$$

If we take the sum and the difference of the two solutions $y_{1}$ and $y_{2}$ we obtain:

$$y_{1}+y_{2}=2e^{\mu x}\cos(\alpha x),\qquad y_{1}-y_{2}=-2ie^{\mu x}\sin(\alpha x).$$

Now, if we take

$$u(x)=\frac{y_{1}(x)+y_{2}(x)}{2}=e^{\mu x}\cos(\alpha x),\qquad
v(x)=-\frac{y_{1}(x)-y_{2}(x)}{2i}=e^{\mu x}\sin(\alpha x),$$

the functions $u(x)$ and $v(x)$ are real-valued and are obtained as linear combinations of two solutions, therefore they are themselves solutions of the given ODE. Do they constitute a fundamental set of solutions? We have:

$$W=\begin{vmatrix}e^{\mu x}\cos(\alpha x)&e^{\mu x}\sin(\alpha x)\\ \mu e^{\mu x}\cos(\alpha x)-\alpha e^{\mu x}\sin(\alpha x)&\mu e^{\mu x}\sin(\alpha x)+\alpha e^{\mu x}\cos(\alpha x)\end{vmatrix}=\alpha e^{2\mu x},$$

since the terms $\mu\sin(\alpha x)\cos(\alpha x)$ cancel and $\alpha[\cos^{2}(\alpha x)+\sin^{2}(\alpha x)]=\alpha$. If we had $\alpha=0$ we would have $\Delta=0$, whereas we have assumed $\Delta<0$. Therefore the Wronskian of the functions $u(x)$ and $v(x)$ is $\neq0$ for every value of $x$, and $u$ and $v$ form a fundamental set of solutions. The general solution of the ODE can thus be expressed as:

$$y(x)=e^{\mu x}[c_{1}\cos(\alpha x)+c_{2}\sin(\alpha x)]. \qquad (2.73)$$

Example 2.4.2 Solve the initial value problem:

$$y''-8y'+17y=0,\qquad y(0)=1,\qquad y'(0)=1.$$

The characteristic equation of the given ODE is:

$$\lambda^{2}-8\lambda+17=0 \;\Rightarrow\; \lambda=4\pm\sqrt{16-17}=4\pm i.$$

We have therefore $\mu=4$ and $\alpha=1$. The general solution can be written as:

$$y(x)=e^{4x}(c_{1}\cos x+c_{2}\sin x).$$

From the initial condition $y(0)=1$ we obtain immediately $c_{1}=1$. We have then:

$$y'(x)=4e^{4x}(c_{1}\cos x+c_{2}\sin x)+e^{4x}(-c_{1}\sin x+c_{2}\cos x).$$

From the initial condition $y'(0)=1$ we obtain $4c_{1}+c_{2}=1$, namely $c_{2}=-3$. The required solution is:

$$y(x)=e^{4x}(\cos x-3\sin x).$$
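Initial value problems can also be handed to sympy with the conditions attached; a sketch of ours for Example 2.4.2:

\begin{verbatim}
# Sketch: sympy check of the initial value problem of Example 2.4.2.
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')
ode = sp.Eq(y(x).diff(x, 2) - 8*y(x).diff(x) + 17*y(x), 0)
sol = sp.dsolve(ode, ics={y(0): 1, y(x).diff(x).subs(x, 0): 1})
print(sol)    # expected: y(x) = (cos(x) - 3*sin(x))*exp(4*x)
\end{verbatim}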
Repeated roots

Repeated roots of the characteristic equation occur when the discriminant is zero. We have seen in Eq. 2.72 that this is the only case in which $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$ do not form a fundamental set of solutions. This of course makes sense, because in this case $\lambda_{1}=\lambda_{2}$, so that $e^{\lambda_{1}x}$ and $e^{\lambda_{2}x}$ are the same function. From Eq. 2.62 we know that in this case $\lambda=-\frac{a_{1}}{2a_{2}}$, and therefore one solution of an ODE whose characteristic equation has zero discriminant is surely $e^{-\frac{a_{1}}{2a_{2}}x}$.

There are different ways to find a second solution, linearly independent from this one. The most widely used is due to D'Alembert and consists in finding a function $f(x)$ such that $f(x)e^{-\frac{a_{1}}{2a_{2}}x}$ is also a solution of the ODE $a_{2}y''+a_{1}y'+a_{0}y=0$. We have:

$$y'=-\frac{a_{1}}{2a_{2}}\,fe^{-\frac{a_{1}}{2a_{2}}x}+f'e^{-\frac{a_{1}}{2a_{2}}x},\qquad
y''=\frac{a_{1}^{2}}{4a_{2}^{2}}\,fe^{-\frac{a_{1}}{2a_{2}}x}-\frac{a_{1}}{a_{2}}\,f'e^{-\frac{a_{1}}{2a_{2}}x}+f''e^{-\frac{a_{1}}{2a_{2}}x}.$$

We obtain an ODE for $f$ in which the term $e^{-\frac{a_{1}}{2a_{2}}x}$ cancels out. Without writing it in full, we obtain:

$$a_{2}f''+\left(a_{0}-\frac{a_{1}^{2}}{4a_{2}}\right)f=0,$$

since the terms in $f'$ cancel exactly. The term in brackets is zero because we should not forget that $\Delta=a_{1}^{2}-4a_{0}a_{2}=0$; therefore the ODE takes the particularly simple form $f''=0$. We can integrate it twice with respect to $x$ and obtain $f(x)=c_{1}+c_{2}x$. Namely, the solution can be expressed as $y=c_{1}y_{1}+c_{2}y_{2}$ with:

$$y_{1}=e^{-\frac{a_{1}}{2a_{2}}x},\qquad y_{2}=xe^{-\frac{a_{1}}{2a_{2}}x}. \qquad (2.74)$$

Are these two functions linearly independent? To check it we have to calculate the Wronskian:

$$W=\begin{vmatrix}e^{-\frac{a_{1}}{2a_{2}}x}&xe^{-\frac{a_{1}}{2a_{2}}x}\\ -\frac{a_{1}}{2a_{2}}e^{-\frac{a_{1}}{2a_{2}}x}&\left(1-\frac{a_{1}}{2a_{2}}x\right)e^{-\frac{a_{1}}{2a_{2}}x}\end{vmatrix}
=e^{-\frac{a_{1}}{a_{2}}x}\left(1-\frac{a_{1}}{2a_{2}}x+\frac{a_{1}}{2a_{2}}x\right)=e^{-\frac{a_{1}}{a_{2}}x}\neq0.$$

This demonstrates that the functions $y_{1}$ and $y_{2}$ indeed form a fundamental set of solutions of the given ODE. A very trivial example of this case is the motion of a body without any force acting on it, namely the solution of the ODE $\frac{d^{2}x}{dt^{2}}=0$. The characteristic equation associated with it is $\lambda^{2}=0$, which has the repeated root $\lambda_{1,2}=0$. Therefore the solution is given by:

$$x(t)=c_{1}e^{0\cdot t}+c_{2}te^{0\cdot t}=c_{1}+c_{2}t.$$

Of course, this solution could have been found very easily by a double integration, without any need of the characteristic equation.

This procedure of finding a second solution of an ODE once a first one is known can also be applied to the more general ODE $y''+p(x)y'+q(x)y=0$. Suppose that we know a solution $y_{1}(x)$, namely we know that $y_{1}''+p(x)y_{1}'+q(x)y_{1}=0$; then it is easy to find a second solution of the form $f(x)y_{1}(x)$. In fact we have:

$$f''y_{1}+2f'y_{1}'+fy_{1}''+pf'y_{1}+pfy_{1}'+qfy_{1}=0.$$

Collecting terms we obtain:

$$f''y_{1}+f'(2y_{1}'+py_{1})+f\left[y_{1}''+p(x)y_{1}'+q(x)y_{1}\right]=0.$$

But the last addend is zero, because we know that $y_{1}$ is a solution of the given ODE. We are therefore left with the ODE:

$$f''y_{1}+f'(2y_{1}'+py_{1})=0, \qquad (2.75)$$

which, in spite of its appearance, is a very simple first order ODE in $f'$, from which we can recover $f(x)$ and, from it, the second member $f(x)y_{1}(x)$ of the fundamental set of solutions. This method of finding the fundamental set of solutions given one particular solution is called reduction of order. In many cases, however, it is more convenient to use the direct formula Eq. 2.71.
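The direct formula Eq. 2.71 is easy to exercise by machine. The sketch below (ours; the choice $y''-2y'+y=0$, with double root 1, is an assumption made for illustration) recovers the second solution $xe^{x}$ from the known solution $e^{x}$.

\begin{verbatim}
# Sketch of Eq. 2.71: second solution from a known one (take C = 1).
import sympy as sp

x = sp.Symbol('x')
p = -2                      # y'' - 2y' + y = 0, so p(x) = -2
y1 = sp.exp(x)
y2 = sp.simplify(y1*sp.integrate(sp.exp(-sp.integrate(p, x))/y1**2, x))
print(y2)                   # x*exp(x), as predicted by Eq. 2.74
\end{verbatim}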
2.4.4 Second order nonhomogeneous ODEs with constant coefficients

In this section we deal with ODEs of the type:

$$a_{2}y''(x)+a_{1}y'(x)+a_{0}y(x)=g(x). \qquad (2.76)$$

We start with some theoretical results about the generic second order nonhomogeneous ODE:

$$y''(x)+p(x)y'(x)+q(x)y(x)=g(x). \qquad (2.77)$$

We will call corresponding homogeneous equation the ODE:

$$y''(x)+p(x)y'(x)+q(x)y(x)=0. \qquad (2.78)$$

In order to describe the structure of the solutions of this equation, we need to demonstrate two important results:

• If $Y_{1}$ and $Y_{2}$ are solutions of the nonhomogeneous ODE Eq. 2.77, then $Y_{1}-Y_{2}$ is a solution of the corresponding homogeneous equation Eq. 2.78. Moreover, if $y_{1}$ and $y_{2}$ are a fundamental set of solutions of Eq. 2.78, then it is always possible to find two numbers $c_{1}$ and $c_{2}$ such that:

$$Y_{1}(x)-Y_{2}(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x). \qquad (2.79)$$

To demonstrate this result it is enough to note that $Y_{1}$ and $Y_{2}$ are solutions of Eq. 2.77, therefore we have:

$$Y_{1}''+pY_{1}'+qY_{1}=g,\qquad Y_{2}''+pY_{2}'+qY_{2}=g,$$
$$\Rightarrow\;(Y_{1}''-Y_{2}'')+p(Y_{1}'-Y_{2}')+q(Y_{1}-Y_{2})=0. \qquad (2.80)$$

We already know that, since the differential operator is linear, one has $(Y_{1}-Y_{2})''=Y_{1}''-Y_{2}''$ and $(Y_{1}-Y_{2})'=Y_{1}'-Y_{2}'$, therefore Eq. 2.80 already demonstrates that $Y_{1}-Y_{2}$ is a solution of the ODE 2.78. Moreover, we have shown that every solution of a homogeneous ODE like Eq. 2.78 can be expressed as a linear combination of the fundamental set of solutions, and this demonstrates the second part of this theorem: there must exist two numbers $c_{1}$ and $c_{2}$ such that $Y_{1}(x)-Y_{2}(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x)$.

• The general solution of the nonhomogeneous equation Eq. 2.77 can be expressed as:

$$y(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x)+y_{p}(x), \qquad (2.81)$$

where $y_{1}$ and $y_{2}$ are a fundamental set of solutions of the corresponding homogeneous equation 2.78 and $y_{p}(x)$ is a particular solution of the nonhomogeneous equation. This result follows directly from the previous theorem: it is enough to call $Y_{1}(x)=y(x)$ (the general solution) and $Y_{2}(x)=y_{p}(x)$ (some particular solution of the nonhomogeneous ODE), and from Eq. 2.79 we can directly recover Eq. 2.81.

In order to find the solution of the nonhomogeneous ODE with constant coefficients, Eq. 2.76, we therefore have to perform the following 3 steps:

• Find the fundamental set of solutions $y_{1}$ and $y_{2}$ of the corresponding homogeneous equation $a_{2}y''(x)+a_{1}y'(x)+a_{0}y(x)=0$, with the methods learned in Sect. 2.4.3. The general solution of this ODE, $c_{1}y_{1}(x)+c_{2}y_{2}(x)$, is often called the complementary solution.

• Find a specific solution $y_{p}(x)$ of the nonhomogeneous ODE. This solution is called the particular solution.

• Sum up the results found in the two preceding steps.

Of course, the difficulty here resides in step two, namely in finding the particular solution. The two most widely used methods are called the method of variation of parameters and the method of undetermined coefficients.
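Before developing the two methods, a small sketch (our own, illustrative) shows the complementary-plus-particular structure of the three steps just listed on a simple example.

\begin{verbatim}
# Sketch: y'' + y = x has complementary c1*cos(x) + c2*sin(x) and
# particular solution x; sympy returns both together.
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')
print(sp.dsolve(sp.Eq(y(x).diff(x, 2) + y(x), x)))
# y(x) = C1*sin(x) + C2*cos(x) + x
\end{verbatim}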
Method of variation of parameters

This method, attributed to Lagrange, is very powerful because it can be applied to any ODE, regardless of the form of the function $g(x)$ in Eq. 2.76, but it might be quite laborious. It is the generalization of the method we have already applied to first-order linear ODEs (Sect. 2.2.2).

We start again from Eq. 2.77 and suppose that we know the fundamental set of solutions $y_{1}$ and $y_{2}$ of the corresponding homogeneous equation Eq. 2.78, so that the complementary solution is given by $y_{c}(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x)$. The method of variation of parameters consists in substituting the constants $c_{1}$ and $c_{2}$ with two unknown functions $C_{1}(x)$ and $C_{2}(x)$, and determining the two functions $C_{1}$ and $C_{2}$ such that $y(x)=C_{1}(x)y_{1}(x)+C_{2}(x)y_{2}(x)$ is a solution of the given ODE. If we differentiate $y$ once with respect to $x$ we obtain:

$$y'=C_{1}'y_{1}+C_{1}y_{1}'+C_{2}'y_{2}+C_{2}y_{2}'.$$

Now, let us require that the sum of the terms containing the derivatives of $C_{1}$ and $C_{2}$ be zero, namely that:

$$C_{1}'y_{1}+C_{2}'y_{2}=0. \qquad (2.82)$$

Therefore $y'$ is simply given by:

$$y'=C_{1}y_{1}'+C_{2}y_{2}'. \qquad (2.83)$$

If we differentiate once more with respect to $x$ we obtain:

$$y''=C_{1}'y_{1}'+C_{1}y_{1}''+C_{2}'y_{2}'+C_{2}y_{2}''. \qquad (2.84)$$

Now, given the function $y(x)=C_{1}(x)y_{1}(x)+C_{2}(x)y_{2}(x)$, we have $y'$ (Eq. 2.83) and $y''$ (Eq. 2.84), and we can therefore check under what conditions $y$ satisfies the given ODE. We have:

$$C_{1}'y_{1}'+C_{1}y_{1}''+C_{2}'y_{2}'+C_{2}y_{2}''+p(C_{1}y_{1}'+C_{2}y_{2}')+q(C_{1}y_{1}+C_{2}y_{2})=g$$
$$\Rightarrow\;C_{1}(y_{1}''+py_{1}'+qy_{1})+C_{2}(y_{2}''+py_{2}'+qy_{2})+C_{1}'y_{1}'+C_{2}'y_{2}'=g.$$

Since $y_{1}$ and $y_{2}$ are solutions of the corresponding homogeneous ODE, the two terms in brackets are zero. The condition $C_{1}'y_{1}'+C_{2}'y_{2}'=g$ remains which, together with Eq. 2.82, forms a system of two linear equations in $C_{1}'$ and $C_{2}'$, namely:

$$\begin{cases}C_{1}'y_{1}+C_{2}'y_{2}=0\\ C_{1}'y_{1}'+C_{2}'y_{2}'=g.\end{cases} \qquad (2.85)$$

We can solve this system treating $C_{1}'$ and $C_{2}'$ as numbers. By means of Cramer's rule, we obtain:

$$C_{1}'=\frac{\begin{vmatrix}0&y_{2}\\ g&y_{2}'\end{vmatrix}}{\begin{vmatrix}y_{1}&y_{2}\\ y_{1}'&y_{2}'\end{vmatrix}}=-\frac{gy_{2}}{W(y_{1},y_{2})},\qquad
C_{2}'=\frac{\begin{vmatrix}y_{1}&0\\ y_{1}'&g\end{vmatrix}}{\begin{vmatrix}y_{1}&y_{2}\\ y_{1}'&y_{2}'\end{vmatrix}}=\frac{gy_{1}}{W(y_{1},y_{2})},$$
$$\Rightarrow\;C_{1}(x)=-\int\frac{g(x)y_{2}(x)}{W(y_{1},y_{2})(x)}\,dx+c_{1},\qquad C_{2}(x)=\int\frac{g(x)y_{1}(x)}{W(y_{1},y_{2})(x)}\,dx+c_{2}. \qquad (2.86)$$

In the end, we can write the solution of the ODE as:

$$y(x)=c_{1}y_{1}(x)+c_{2}y_{2}(x)-y_{1}(x)\int_{x_{0}}^{x}\frac{g(s)y_{2}(s)}{W(y_{1},y_{2})(s)}\,ds+y_{2}(x)\int_{x_{0}}^{x}\frac{g(s)y_{1}(s)}{W(y_{1},y_{2})(s)}\,ds, \qquad (2.87)$$

where $x_{0}$ is a conveniently chosen point.

Example 2.4.3 Calculate the general solution of the ODE $y''-3y'+2y=e^{x}$.

The characteristic equation of the corresponding homogeneous ODE is:

$$\lambda^{2}-3\lambda+2=0 \;\Rightarrow\; \lambda=\frac{3\pm\sqrt{9-8}}{2} \;\Rightarrow\; \lambda_{1}=1,\;\lambda_{2}=2.$$

The complementary solution is thus given by $y_{c}(x)=c_{1}e^{x}+c_{2}e^{2x}$, and the Wronskian of the functions $y_{1}$ and $y_{2}$ is:

$$W=\begin{vmatrix}e^{x}&e^{2x}\\ e^{x}&2e^{2x}\end{vmatrix}=e^{3x}.$$

Applying Eq. 2.86 with $g(x)=e^{x}$, $W(x)=e^{3x}$, $y_{1}(x)=e^{x}$, $y_{2}(x)=e^{2x}$ we obtain:

$$C_{1}(x)=-\int\frac{e^{x}\cdot e^{2x}}{e^{3x}}\,dx=-x+c_{1},\qquad
C_{2}(x)=\int\frac{e^{x}\cdot e^{x}}{e^{3x}}\,dx=\int e^{-x}\,dx=-e^{-x}+c_{2}.$$

The general solution of the given ODE is thus:

$$y(x)=c_{1}e^{x}+c_{2}e^{2x}-xe^{x}-e^{x}=(c_{1}-x-1)e^{x}+c_{2}e^{2x}.$$

Note that the term $-e^{x}$ of the particular solution is proportional to $y_{1}$, therefore we can incorporate this function into $c_{1}e^{x}$ and write the solution in the form:

$$y(x)=(c_{1}-x)e^{x}+c_{2}e^{2x}.$$

Example 2.4.4 Solve the ODE:

$$y''+4y=\frac{1}{\sin(2x)}.$$

The characteristic equation of the homogeneous ODE is $\lambda^{2}+4=0\Rightarrow\lambda=\pm2i$, so the complementary solution is $y_{c}(x)=c_{1}\cos(2x)+c_{2}\sin(2x)$. The Wronskian is:

$$W=\begin{vmatrix}\cos(2x)&\sin(2x)\\ -2\sin(2x)&2\cos(2x)\end{vmatrix}=2\cos^{2}(2x)+2\sin^{2}(2x)=2.$$

From Eq. 2.86 we have:

$$C_{1}(x)=-\int\frac{1}{\sin(2x)}\,\frac{\sin(2x)}{2}\,dx=-\frac{x}{2}+c_{1},\qquad
C_{2}(x)=\int\frac{\cos(2x)}{2\sin(2x)}\,dx=\frac{1}{4}\ln[\sin(2x)]+c_{2}.$$

The solution we have been looking for is:

$$y(x)=c_{1}\cos(2x)+c_{2}\sin(2x)-\frac{x}{2}\cos(2x)+\frac{1}{4}\sin(2x)\ln[\sin(2x)].$$
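Eq. 2.86 transcribes almost verbatim into sympy; the following sketch (ours) reproduces the two quadratures of Example 2.4.4.

\begin{verbatim}
# Sketch: the variation-of-parameters quadratures for y'' + 4y = 1/sin(2x).
import sympy as sp

x = sp.Symbol('x')
y1, y2 = sp.cos(2*x), sp.sin(2*x)
g = 1/sp.sin(2*x)
W = sp.simplify(y1*sp.diff(y2, x) - sp.diff(y1, x)*y2)   # 2
C1 = sp.integrate(-g*y2/W, x)                            # -x/2
C2 = sp.integrate(g*y1/W, x)                             # log(sin(2x))/4
print(W, C1, C2)
\end{verbatim}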
Method of undetermined coefficients

The method of undetermined coefficients requires that we make an assumption about the type of particular solution $y_{p}(x)$ we are seeking, with unspecified coefficients. It is therefore a trial and error method: if we have guessed right, we will be able to determine the coefficients of our trial particular solution; if we have not guessed right, we will not be able to find the coefficients, which means that there is no solution of the form we have guessed, and we can guess another form and try again. This method is particularly useful for simple forms of $g(x)$ (in particular exponentials, sines, cosines and polynomials), as the following examples show.

Example 2.4.5 Solve the ODE $y''-7y'+12y=6e^{2x}$.

The characteristic equation is:

$$\lambda^{2}-7\lambda+12=0 \;\Rightarrow\; \lambda=\frac{7\pm\sqrt{49-48}}{2} \;\Rightarrow\; \lambda_{1}=3,\;\lambda_{2}=4.$$

Since the exponential function reproduces itself under differentiation, we can try a function of the type $Ae^{2x}$ as particular solution of the given ODE. If we try this function we obtain:

$$4Ae^{2x}-14Ae^{2x}+12Ae^{2x}=6e^{2x} \;\Rightarrow\; 2A=6 \;\Rightarrow\; A=3.$$

We have therefore already found the particular solution $y_{p}(x)=3e^{2x}$, and the general solution is:

$$y(x)=c_{1}e^{3x}+c_{2}e^{4x}+3e^{2x}.$$

Example 2.4.6 Find a particular solution of the ODE $y''-4y'-5y=12e^{-x}$.

Following the previous example, the most natural choice seems to be $y_{p}(x)=Ae^{-x}$. If we try this function we get:

$$Ae^{-x}+4Ae^{-x}-5Ae^{-x}=12e^{-x},$$

which cannot be solved. Indeed, the solutions of the characteristic equation are:

$$\lambda=2\pm\sqrt{4+5} \;\Rightarrow\; \lambda_{1}=-1,\;\lambda_{2}=5.$$

This means that $e^{-x}$ is already part of the complementary solution; a particular solution of the form $Ae^{-x}$ would therefore simply be incorporated into the complementary solution and cannot be a solution of the nonhomogeneous equation. Looking at Example 2.4.3 we might guess that in this case a function of the form $y_{p}(x)=Axe^{-x}$ is the right one. We have:

$$y_{p}'=(-Ax+A)e^{-x}=A(1-x)e^{-x},\qquad y_{p}''=(Ax-A-A)e^{-x}=A(x-2)e^{-x}.$$

We have thus:

$$A(x-2)-4A(1-x)-5Ax=12 \;\Rightarrow\; -2A-4A=12 \;\Rightarrow\; A=-2.$$

The particular solution is therefore $y_{p}(x)=-2xe^{-x}$.

From these examples we have seen that, if we guess the form of the particular solution correctly, the method of undetermined coefficients is straightforward and very fast, but it might hide complications if the most natural particular solution to guess is already a solution of the corresponding homogeneous ODE. Moreover, in cases like Example 2.4.4, with $g(x)=\frac{1}{\sin(2x)}$, it is very difficult to guess what the form of the particular solution might be, and the method of variation of parameters might be the only way to find it out. In the case in which the function $g(x)$ has the form $g(x)=e^{ax}$, the particular solution $y_{p}(x)$ has the form:

• $Ae^{ax}$ if $a$ is not a root of the characteristic equation;

• $Axe^{ax}$ if $a$ is a (simple) root of the characteristic equation.

In what other cases can we easily find a particular solution with the method of undetermined coefficients? We can check with the help of some examples.

Example 2.4.7 Find a particular solution of the ODE $y''-2y'-3y=-5\cos x$.

If we try a function $y_{p}(x)=A\cos x$, we find that the second derivative is proportional to $y_{p}$, but the first derivative is not! A better guess is a function of the form $y_{p}(x)=A\sin x+B\cos x$. In this case we have:

$$y_{p}'=A\cos x-B\sin x,\qquad y_{p}''=-A\sin x-B\cos x.$$

Substituting it into the original ODE we get:

$$-A\sin x-B\cos x-2A\cos x+2B\sin x-3A\sin x-3B\cos x=-5\cos x.$$

We collect now the terms containing $\cos x$ and $\sin x$ and obtain the system of equations:

$$\begin{cases}-2A-4B=-5\\ -4A+2B=0.\end{cases}$$

From the second we get $B=2A$, therefore we have:

$$A=\frac{1}{2},\;B=1 \;\Rightarrow\; y_{p}(x)=\frac{1}{2}\sin x+\cos x.$$
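The coefficient bookkeeping of Example 2.4.7 can be delegated to sympy. In this sketch (ours) the forcing term is moved to the left, so that solving means setting the $\sin x$ and $\cos x$ coefficients of the residual to zero.

\begin{verbatim}
# Sketch: undetermined coefficients for y'' - 2y' - 3y = -5*cos(x).
import sympy as sp

x, A, B = sp.symbols('x A B')
yp = A*sp.sin(x) + B*sp.cos(x)
residual = sp.diff(yp, x, 2) - 2*sp.diff(yp, x) - 3*yp + 5*sp.cos(x)
eqs = [residual.coeff(sp.sin(x)), residual.coeff(sp.cos(x))]
print(sp.solve(eqs, [A, B]))    # {A: 1/2, B: 1}
\end{verbatim}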
Example 2.4.8 Find a particular solution of the ODE $y''-5y'=3-15x^{2}$.

Since $g(x)$ is a polynomial of second degree, we can expect a particular solution which is a polynomial of second degree, too. We can try therefore the function $y_{p}(x)=Ax^{2}+Bx+C$. We have $y_{p}'=2Ax+B$, $y_{p}''=2A$. In order for $y_{p}$ to be a solution of the given ODE we should have:

$$2A-10Ax-5B=3-15x^{2}.$$

It is evident from this equation that we have no information on the value of the constant $C$ (and no $x^{2}$ term on the left hand side). The reason for this is the same as in the example with $e^{-x}$ we have already encountered: the characteristic equation $\lambda^{2}-5\lambda=0$ has roots 0 and 5, and the function $y_{p}(x)=Ax^{2}+Bx+C$ can be seen as the sum of three functions, one of which, $C$, is proportional to one of the two functions that form the fundamental set of solutions of the corresponding homogeneous equation ($e^{0\cdot x}=1$). That cannot work, and therefore we should change our assumption about the form of the particular solution. We test now the function $y_{p}(x)=x(Ax^{2}+Bx+C)$. In this case we have:

$$y_{p}'=3Ax^{2}+2Bx+C,\qquad y_{p}''=6Ax+2B,$$

and therefore:

$$6Ax+2B-15Ax^{2}-10Bx-5C=3-15x^{2}.$$

Now we can equate the coefficients of like powers of $x$ and obtain the system of equations:

$$\begin{cases}-15A=-15\\ 6A-10B=0\\ 2B-5C=3.\end{cases}$$

This is a system of 3 equations in 3 unknowns; it can be easily solved, yielding $A=1$, $B=\frac{3}{5}$, $C=-\frac{9}{25}$, therefore the particular solution we were looking for is:

$$y_{p}(x)=x\left(x^{2}+\frac{3}{5}x-\frac{9}{25}\right).$$

Example 2.4.9 Find a particular solution of the ODE $y''-3y'=(2x^{2}-4x+1)e^{x}$.

The first guess would be to look for a function of the form $(Ax^{2}+Bx+C)e^{x}$. After Example 2.4.8 we might fear that this function could not be the right guess, because $\lambda=0$ is a root of the characteristic equation. This fear is unjustified because of the factor $e^{x}$, which always multiplies the terms of the polynomial. The only risk would therefore be if 1 were a solution of the characteristic equation, because in that case $e^{1\cdot x}$ would already be part of the solution of the corresponding homogeneous equation. We have therefore:

$$y_{p}(x)=(Ax^{2}+Bx+C)e^{x},$$
$$y_{p}'=(Ax^{2}+Bx+C+2Ax+B)e^{x}=[Ax^{2}+(B+2A)x+B+C]e^{x},$$
$$y_{p}''=[Ax^{2}+(B+2A)x+B+C+2Ax+B+2A]e^{x}=[Ax^{2}+(B+4A)x+2A+2B+C]e^{x}.$$

Cancelling as usual the term $e^{x}$ we obtain:

$$Ax^{2}+(B+4A)x+2A+2B+C-3Ax^{2}-3(B+2A)x-3B-3C=2x^{2}-4x+1.$$

Now we can collect coefficients of terms with the same power of $x$ and obtain the system of equations:

$$\begin{cases}-2A=2\\ -2B-2A=-4\\ 2A-B-2C=1,\end{cases}$$

which has solution $A=-1$, $B=3$, $C=-3$. The particular solution is thus $(-x^{2}+3x-3)e^{x}$.

In the end, the method of undetermined coefficients can be effective only if the term $g(x)$ is a function involving exponentials, sines, cosines, polynomials and products of such functions. We can summarize the forms of particular solution we should look for in the following items.

• $g(x)=P_{n}(x)$, where $P_{n}(x)$ is a polynomial of degree $n$. In this case, the particular solution is:

$$y_{p}(x)=x^{m}[A_{n}x^{n}+A_{n-1}x^{n-1}+\dots+A_{1}x+A_{0}]. \qquad (2.88)$$

Here $A_{0},\dots,A_{n}$ are the coefficients to determine, and $m$ is the multiplicity of 0 as a root of the characteristic equation of the homogeneous ODE. That is, if the ODE is of the form $y''+ay'=g(x)$, then 0 is a simple root of the characteristic equation and $m=1$; if instead the ODE can be written as $y''=g(x)$, then 0 is a double root of the characteristic equation and $m=2$.

• $g(x)=P_{n}(x)e^{\mu x}$. The solution we should look for is:

$$y_{p}(x)=x^{m}[A_{n}x^{n}+A_{n-1}x^{n-1}+\dots+A_{1}x+A_{0}]e^{\mu x}, \qquad (2.89)$$

where $m$ is the multiplicity of $\mu$ as a root of the characteristic equation.

• $g(x)=P_{n}(x)e^{\mu x}\cos(\alpha x)$ or $g(x)=P_{n}(x)e^{\mu x}\sin(\alpha x)$. This case can be seen as an extension of the previous one to the complex field, since $e^{\mu x}\cos(\alpha x)$ and $e^{\mu x}\sin(\alpha x)$ are linear combinations of $e^{(\mu\pm i\alpha)x}$. The solution we look for in this case is thus:

$$y_{p}(x)=x^{m}[(A_{n}x^{n}+\dots+A_{1}x+A_{0})\cos(\alpha x)+(B_{n}x^{n}+\dots+B_{1}x+B_{0})\sin(\alpha x)]e^{\mu x}, \qquad (2.90)$$

where $m$ is the multiplicity of $\mu+i\alpha$ as a (complex) root of the characteristic equation.

• $g(x)=P_{n}(x)\cos(\alpha x)$ or $g(x)=P_{n}(x)\sin(\alpha x)$. This case is analogous to the previous one with $\mu=0$. The solution to look for is:

$$y_{p}(x)=x^{m}[(A_{n}x^{n}+\dots+A_{1}x+A_{0})\cos(\alpha x)+(B_{n}x^{n}+\dots+B_{1}x+B_{0})\sin(\alpha x)], \qquad (2.91)$$

where $m$ is the multiplicity of $i\alpha$ as an (imaginary) root of the characteristic equation.

• $g(x)=g_{1}(x)+g_{2}(x)+\dots+g_{n}(x)$, where each of $g_{1},\dots,g_{n}$ belongs to one of the previous items. In this case we calculate separately the partial particular solutions ${}^{1}y_{p},\dots,{}^{n}y_{p}$, and the particular solution is given by their sum, namely:

$$y_{p}(x)={}^{1}y_{p}(x)+\dots+{}^{n}y_{p}(x). \qquad (2.92)$$

The bottom line here is that the particular solution we look for must always be linearly independent from the solutions of the corresponding homogeneous equation.
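Going back to Example 2.4.8, the corrected guess is quickly confirmed by direct substitution; a short sketch of ours:

\begin{verbatim}
# Sketch: check that yp = x*(x**2 + 3x/5 - 9/25) solves y'' - 5y' = 3 - 15x^2.
import sympy as sp

x = sp.Symbol('x')
yp = x*(x**2 + sp.Rational(3, 5)*x - sp.Rational(9, 25))
print(sp.expand(sp.diff(yp, x, 2) - 5*sp.diff(yp, x)))   # -15*x**2 + 3
\end{verbatim}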
2.5 Higher order linear differential equations

In this section we deal with equations involving the $n$-th derivative (with $n>2$) of some unknown function to be determined. We will only consider linear equations, therefore the generic $n$-th order ODE we will consider can be written as:

$$y^{(n)}(x)+p_{1}(x)y^{(n-1)}(x)+\dots+p_{n-1}(x)y'+p_{n}(x)y=g(x). \qquad (2.93)$$

All the results we will present in this section are just a generalization of what we have learned in the previous section about second-order differential equations.

2.5.1 Homogeneous n-th order ODEs

The homogeneous $n$-th order ODE can be written as:

$$y^{(n)}(x)+p_{1}(x)y^{(n-1)}(x)+\dots+p_{n-1}(x)y'+p_{n}(x)y=0. \qquad (2.94)$$

Also in this case, it is easy to demonstrate that, if $y_{1},\dots,y_{n}$ are solutions of the above equation, then each linear combination of these functions is still a solution of it. If we set up an initial value problem, we have to specify the value of the unknown function $y$ and of its derivatives up to the $(n-1)$-th derivative at some point $x_{0}$, namely we have to impose the initial conditions:

$$y(x_{0})=y_{0},\quad y'(x_{0})=y_{0}',\quad\dots,\quad y^{(n-1)}(x_{0})=y_{0}^{(n-1)}. \qquad (2.95)$$

If we know that a set of functions $y_{1},\dots,y_{n}$ solves the given ODE, we wish to determine whether a set of constants $c_{1},\dots,c_{n}$ exists such that $y=c_{1}y_{1}+\dots+c_{n}y_{n}$ is the solution of the initial value problem. To find $c_{1},\dots,c_{n}$ we have to solve the system of equations:

$$\begin{cases}c_{1}y_{1}(x_{0})+c_{2}y_{2}(x_{0})+\dots+c_{n}y_{n}(x_{0})=y_{0}\\ c_{1}y_{1}'(x_{0})+c_{2}y_{2}'(x_{0})+\dots+c_{n}y_{n}'(x_{0})=y_{0}'\\ \quad\vdots\\ c_{1}y_{1}^{(n-1)}(x_{0})+c_{2}y_{2}^{(n-1)}(x_{0})+\dots+c_{n}y_{n}^{(n-1)}(x_{0})=y_{0}^{(n-1)}.\end{cases}$$

In order for this system of equations to have a solution, the determinant of the coefficients must be different from zero, namely we must have:

$$W(y_{1},\dots,y_{n})(x_{0})=\begin{vmatrix}y_{1}(x_{0})&y_{2}(x_{0})&\dots&y_{n}(x_{0})\\ y_{1}'(x_{0})&y_{2}'(x_{0})&\dots&y_{n}'(x_{0})\\ \vdots&\vdots&&\vdots\\ y_{1}^{(n-1)}(x_{0})&y_{2}^{(n-1)}(x_{0})&\dots&y_{n}^{(n-1)}(x_{0})\end{vmatrix}\neq0. \qquad (2.96)$$

This is again the Wronskian of the functions $y_{1},\dots,y_{n}$. If a point $x_{0}$ exists where the Wronskian is different from zero, then the functions $y_{1},\dots,y_{n}$ are linearly independent, form a fundamental set of solutions of the given homogeneous ODE, and each solution can be obtained as a linear combination of $y_{1},\dots,y_{n}$.

A homogeneous $n$-th order ODE with constant coefficients can be written as:

$$a_{n}y^{(n)}+a_{n-1}y^{(n-1)}+\dots+a_{1}y'+a_{0}y=0. \qquad (2.97)$$

As in the case of second-order ODEs, we look for solutions of the type $e^{\lambda x}$. Since $\frac{d^{k}}{dx^{k}}e^{\lambda x}=\lambda^{k}e^{\lambda x}$, we obtain again a characteristic equation:

$$a_{n}\lambda^{n}+a_{n-1}\lambda^{n-1}+\dots+a_{1}\lambda+a_{0}=0. \qquad (2.98)$$

The fundamental set of solutions of the ODE 2.97 is given by the functions $e^{\lambda_{1}x},\dots,e^{\lambda_{n}x}$, where $\lambda_{1},\dots,\lambda_{n}$ are the solutions of the characteristic equation.
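For any order, Eq. 2.98 reduces the whole problem to polynomial root finding, which numpy handles numerically; a tiny sketch of ours (the quartic below is an arbitrary illustration):

\begin{verbatim}
# Sketch: numerical roots of an n-th order characteristic equation.
import numpy as np

coeffs = [1, 0, 0, 0, -1]     # lambda**4 - 1 = 0, i.e. y'''' - y = 0
print(np.roots(coeffs))       # 1, -1, i, -i (a real pair and a conjugate pair)
\end{verbatim}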
These roots can be real or complex. Complex roots, as we have seen, always come in conjugate pairs $\lambda=\mu_{i}\pm i\alpha_{i}$, and we already know that we can transform the two functions $e^{(\mu_{i}\pm i\alpha_{i})x}$ into the real-valued solutions $e^{\mu_{i}x}\cos(\alpha_{i}x)$ and $e^{\mu_{i}x}\sin(\alpha_{i}x)$. If $\lambda$ is a repeated root, we have seen that we have to multiply the function $e^{\lambda x}$ by $x$. With ODEs of order larger than 2 it can happen that the multiplicity of $\lambda$ as a solution of the characteristic equation is larger than 2; in this case, if $m$ is the multiplicity of $\lambda$, we have to multiply $e^{\lambda x}$ by $x,x^{2},\dots,x^{m-1}$ in order to have $m$ linearly independent solutions of the given ODE. If $n\geq4$, it can also happen that $\lambda$ and $\lambda^{*}$ are repeated complex roots. In this case, if $m$ is the multiplicity of $\mu\pm i\alpha$ as roots of the characteristic equation, then linearly independent solutions are given by the functions $e^{\mu x}\cos(\alpha x),xe^{\mu x}\cos(\alpha x),\dots,x^{m-1}e^{\mu x}\cos(\alpha x)$ and $e^{\mu x}\sin(\alpha x),xe^{\mu x}\sin(\alpha x),\dots,x^{m-1}e^{\mu x}\sin(\alpha x)$.

Example 2.5.1 Find the solution of the ODE $y^{(6)}+3y^{(4)}+3y''+y=0$.

The characteristic equation is:

$$\lambda^{6}+3\lambda^{4}+3\lambda^{2}+1=0.$$

This is clearly the third power of $\lambda^{2}+1$. The only roots are therefore $\pm i$ (the solutions of $\lambda^{2}+1=0$), and their multiplicity is 3. By what we have said above, the fundamental set of solutions is given by the functions:

$$\cos x,\;x\cos x,\;x^{2}\cos x,\;\sin x,\;x\sin x,\;x^{2}\sin x.$$

We can write the general solution as:

$$y(x)=(c_{1}+c_{2}x+c_{3}x^{2})\cos x+(c_{4}+c_{5}x+c_{6}x^{2})\sin x.$$

Example 2.5.2 Solve the ODE $y'''+3y=0$.

The characteristic equation is $\lambda^{3}+3=0\Rightarrow\lambda^{3}=-3$. We transform $-3$ into $3e^{i(\pi+2n\pi)}$, therefore the characteristic equation has solutions:

$$\lambda=\sqrt[3]{3}\,e^{i\left(\frac{\pi}{3}+\frac{2n\pi}{3}\right)}.$$

$n$ can assume the values 0, 1, 2, therefore the roots of the characteristic equation are:

$$\lambda_{1}=\sqrt[3]{3}\,e^{i\frac{\pi}{3}},\qquad \lambda_{2}=-\sqrt[3]{3},\qquad \lambda_{3}=\sqrt[3]{3}\,e^{i\frac{5\pi}{3}}.$$

$\lambda_{1}$ and $\lambda_{3}$ are complex conjugate, therefore we can take only $\lambda_{1}$ and use the cosine and sine of its imaginary part to build linearly independent real solutions. We can write $\lambda_{1}$ as:

$$\lambda_{1}=\sqrt[3]{3}\left(\cos\frac{\pi}{3}+i\sin\frac{\pi}{3}\right)=\sqrt[3]{3}\left(\frac{1}{2}+i\frac{\sqrt{3}}{2}\right)=\frac{\sqrt[3]{3}}{2}+i\,\frac{\sqrt[6]{3^{5}}}{2}.$$

We can thus write the solution as:

$$y(x)=c_{1}e^{-\sqrt[3]{3}\,x}+e^{\frac{\sqrt[3]{3}}{2}x}\left[c_{2}\cos\left(\frac{\sqrt[6]{3^{5}}}{2}\,x\right)+c_{3}\sin\left(\frac{\sqrt[6]{3^{5}}}{2}\,x\right)\right].$$
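The multiplicity bookkeeping of Example 2.5.1 can be verified symbolically; sympy's roots function returns each root together with its multiplicity (sketch of ours):

\begin{verbatim}
# Sketch: roots and multiplicities for Example 2.5.1.
import sympy as sp

lam = sp.Symbol('lam')
print(sp.roots(lam**6 + 3*lam**4 + 3*lam**2 + 1, lam))   # {I: 3, -I: 3}
\end{verbatim}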
2.5.2 Nonhomogeneous n-th order ODEs

Nonhomogeneous $n$-th order ODEs can be expressed by means of the generic Eq. 2.93. As in the case of second-order ODEs, to find the general solution of this equation we first have to solve the corresponding homogeneous equation (namely, find the complementary solution), then find a particular solution of the nonhomogeneous equation, and then sum up the complementary and the particular solution. As in the case of second-order ODEs, the particular solution can be found either with the method of variation of parameters or with the method of undetermined coefficients.

The method of variation of parameters consists in finding the general solution of the corresponding homogeneous equation $y(x)=c_{1}y_{1}(x)+\dots+c_{n}y_{n}(x)$ and then replacing the constants $c_{1},\dots,c_{n}$ with unknown functions $C_{1}(x),\dots,C_{n}(x)$. We now want to find out how to choose $C_{1}(x),\dots,C_{n}(x)$ so that $y=C_{1}(x)y_{1}(x)+\dots+C_{n}(x)y_{n}(x)$ is a solution of the nonhomogeneous ODE. It is:

$$y'=(C_{1}'y_{1}+\dots+C_{n}'y_{n})+(C_{1}y_{1}'+\dots+C_{n}y_{n}'). \qquad (2.99)$$

We assume now that the first term in brackets, $C_{1}'y_{1}+\dots+C_{n}'y_{n}$, is zero, therefore we have:

$$y'=C_{1}y_{1}'+\dots+C_{n}y_{n}'. \qquad (2.100)$$

We now keep on differentiating $y$ and always assume that the sum of all the terms containing $C_{i}'$ is zero. Namely, after $m$ derivatives we assume

$$C_{1}'y_{1}^{(m-1)}+\dots+C_{n}'y_{n}^{(m-1)}=0, \qquad (2.101)$$

and, consequently, we will get:

$$y^{(m)}=C_{1}y_{1}^{(m)}+\dots+C_{n}y_{n}^{(m)}. \qquad (2.102)$$

This procedure goes on until we find:

$$y^{(n)}=(C_{1}y_{1}^{(n)}+\dots+C_{n}y_{n}^{(n)})+(C_{1}'y_{1}^{(n-1)}+\dots+C_{n}'y_{n}^{(n-1)}). \qquad (2.103)$$

Now, if we form the sum $y^{(n)}+p_{1}y^{(n-1)}+\dots+p_{n-1}y'+p_{n}y$ using Eq. 2.102 and collect all the terms containing $C_{i}$, we obtain terms of the form:

$$C_{i}\left(y_{i}^{(n)}+p_{1}y_{i}^{(n-1)}+\dots+p_{n-1}y_{i}'+p_{n}y_{i}\right). \qquad (2.104)$$

But all these terms are zero, because each $y_{i}$ is a solution of the corresponding homogeneous ODE. The only term remaining is $C_{1}'y_{1}^{(n-1)}+\dots+C_{n}'y_{n}^{(n-1)}$, which is therefore equal to $g(x)$. Because of the equations 2.101 we can now write the system of equations:

$$\begin{cases}C_{1}'y_{1}+\dots+C_{n}'y_{n}=0\\ C_{1}'y_{1}'+\dots+C_{n}'y_{n}'=0\\ \quad\vdots\\ C_{1}'y_{1}^{(n-1)}+\dots+C_{n}'y_{n}^{(n-1)}=g.\end{cases} \qquad (2.105)$$

This is a linear system of equations for the unknowns $C_{1}',\dots,C_{n}'$, which can be solved with Cramer's rule, yielding:

$$C_{i}'=\frac{g(x)W_{i}(x)}{W(x)}, \qquad (2.106)$$

where $W(x)$ is the Wronskian of the functions $y_{1},\dots,y_{n}$ and $W_{i}(x)$ is the determinant of the matrix obtained from the Wronskian matrix by replacing the $i$-th column with the column $(0,\dots,0,1)$. In this way we can express the general solution of the nonhomogeneous ODE as:

$$y(x)=\sum_{i=1}^{n}y_{i}(x)\left[\int\frac{g(x)W_{i}(x)}{W(x)}\,dx+c_{i}\right]. \qquad (2.107)$$

Although the procedure to obtain this result is straightforward, the algebra involved can be terribly messy, in particular for $n>3$.

The method of undetermined coefficients can be applied also to ODEs of order higher than 2, and it is in most cases efficient and fast, but it applies, as usual, only to polynomials, sines, cosines, exponentials and combinations of these elementary functions. After the discussion in Sect. 2.4.4, we can limit ourselves to considering the case $g(x)=P_{n}(x)e^{\mu x}\cos(\alpha x)$ or $g(x)=P_{n}(x)e^{\mu x}\sin(\alpha x)$. We have seen that, in this case, the particular solution we should look for has the form:

$$y_{p}(x)=x^{m}[(A_{n}x^{n}+\dots+A_{1}x+A_{0})\cos(\alpha x)+(B_{n}x^{n}+\dots+B_{1}x+B_{0})\sin(\alpha x)]e^{\mu x}, \qquad (2.108)$$

where $m$ is the multiplicity of $\mu+i\alpha$ (or $\mu-i\alpha$) as a root of the characteristic equation. If the ODE is of second or third order, then $m$ can only be 0 or 1, but if the ODE is of higher order, then $m$ can also be larger than 1. All the other cases we can consider are just particular cases of this one (a small sketch of the multiplicity bookkeeping follows this list), namely:

• $g(x)=P_{n}(x)e^{\mu x}$. The particular solution can be obtained from Eq. 2.108 taking $\alpha=0$, namely $y_{p}(x)=x^{m}(A_{n}x^{n}+A_{n-1}x^{n-1}+\dots+A_{1}x+A_{0})e^{\mu x}$, where $m$ is the multiplicity of $\mu$ as a root of the characteristic equation.

• $g(x)=e^{\mu x}\cos(\alpha x)$. The particular solution can be obtained from Eq. 2.108 assuming that the polynomial $P_{n}(x)$ has degree 0, namely $y_{p}(x)=x^{m}[A\cos(\alpha x)+B\sin(\alpha x)]e^{\mu x}$.

• $g(x)=P_{n}(x)\cos(\alpha x)$. The particular solution can be obtained from Eq. 2.108 taking $\mu=0$, namely $y_{p}(x)=x^{m}[(A_{n}x^{n}+\dots+A_{0})\cos(\alpha x)+(B_{n}x^{n}+\dots+B_{0})\sin(\alpha x)]$, where $m$ is the multiplicity of $i\alpha$ as a root of the characteristic equation.

• $g(x)=P_{n}(x)$. Again we can recover the particular solution from Eq. 2.108 taking $\mu=0$ and $\alpha=0$, namely $y_{p}(x)=x^{m}(A_{n}x^{n}+A_{n-1}x^{n-1}+\dots+A_{1}x+A_{0})$, where $m$ is the multiplicity of 0 as a root of the characteristic equation.
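As announced, here is the promised sketch (ours) of the $x^{m}$ bookkeeping: given the characteristic polynomial, look up the multiplicity of the forcing exponent among its roots. The quartic chosen below previews Example 2.5.4.

\begin{verbatim}
# Sketch: multiplicity m in Eq. 2.108 for a forcing term like P(x)*e^{mu*x}.
import sympy as sp

lam = sp.Symbol('lam')
char_poly = lam**4 - 2*lam**3 + lam**2     # (lam - 1)**2 * lam**2
mu = 1                                     # forcing proportional to e^x
m = sp.roots(sp.Poly(char_poly, lam)).get(mu, 0)
print(m)                                   # 2: try yp = x**2 * (...) * e^x
\end{verbatim}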
Example 2.5.3 Find a particular solution of the ODE:

$$y''-4y'+4y=e^{2x}\cos\frac{x}{2}.$$

By what was explained above, the particular solution $y_{p}(x)$ must have the form $y_{p}(x)=Ae^{2x}\cos\frac{x}{2}+Be^{2x}\sin\frac{x}{2}$. In fact, only if the roots of the characteristic equation were $2\pm\frac{i}{2}$ would one of the two fundamental solutions of the corresponding homogeneous ODE be $e^{2x}\cos\frac{x}{2}$, and only then would we have to take $y_{p}(x)=x^{m}[Ae^{2x}\cos\frac{x}{2}+Be^{2x}\sin\frac{x}{2}]$ with $m>0$ as particular solution. Writing $c$ instead of $\cos\frac{x}{2}$ and $s$ instead of $\sin\frac{x}{2}$ for compactness of notation, we have:

$$y_{p}(x)=Ae^{2x}c+Be^{2x}s,$$
$$y_{p}'(x)=2Ae^{2x}c-\frac{A}{2}e^{2x}s+2Be^{2x}s+\frac{B}{2}e^{2x}c=\left(2A+\frac{B}{2}\right)e^{2x}c+\left(2B-\frac{A}{2}\right)e^{2x}s,$$
$$y_{p}''(x)=(4A+B)e^{2x}c-\left(A+\frac{B}{4}\right)e^{2x}s+(4B-A)e^{2x}s+\left(B-\frac{A}{4}\right)e^{2x}c
=\left(\frac{15}{4}A+2B\right)e^{2x}c+\left(\frac{15}{4}B-2A\right)e^{2x}s.$$

Substituting these values into the original ODE and cancelling $e^{2x}$ (we know already that we will be able to cancel out this term), we obtain:

$$\left(\frac{15}{4}A+2B\right)c+\left(\frac{15}{4}B-2A\right)s-4\left(2A+\frac{B}{2}\right)c-4\left(2B-\frac{A}{2}\right)s+4Ac+4Bs=c.$$

Collecting terms with $c$ and $s$ we obtain the system of equations:

$$\begin{cases}\frac{15}{4}A+2B-8A-2B+4A=1\\ \frac{15}{4}B-2A-8B+2A+4B=0,\end{cases}$$

namely $-\frac{A}{4}=1$ and $-\frac{B}{4}=0$. It is easy to find the solution of this system, namely $B=0$ and $A=-4$; therefore the particular solution of the given ODE is:

$$y_{p}(x)=-4e^{2x}\cos\frac{x}{2}.$$

Example 2.5.4 Find a particular solution of the ODE $y^{(4)}-2y'''+y''=-2e^{x}$.

The characteristic equation is:

$$\lambda^{4}-2\lambda^{3}+\lambda^{2}=0 \;\Rightarrow\; (\lambda-1)^{2}\lambda^{2}=0.$$

Therefore the roots are 0 and 1, both double. By what we have learned, the particular solution to look for is $y_{p}(x)=Ax^{2}e^{x}$. We have:

$$y_{p}(x)=Ax^{2}e^{x},$$
$$y_{p}'(x)=A(x^{2}+2x)e^{x},$$
$$y_{p}''(x)=A(x^{2}+2x+2x+2)e^{x}=A(x^{2}+4x+2)e^{x},$$
$$y_{p}'''(x)=A(x^{2}+4x+2+2x+4)e^{x}=A(x^{2}+6x+6)e^{x},$$
$$y_{p}^{(4)}(x)=A(x^{2}+6x+6+2x+6)e^{x}=A(x^{2}+8x+12)e^{x}.$$

Substituting these functions into the original ODE (and cancelling out $e^{x}$) we obtain:

$$A(x^{2}+8x+12)-2A(x^{2}+6x+6)+A(x^{2}+4x+2)=-2.$$

The terms with $x$ and $x^{2}$ cancel out; what remains is $2A=-2$, namely $A=-1$. The particular solution is thus:

$$y_{p}(x)=-x^{2}e^{x}.$$
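A two-line substitution check of Example 2.5.4 (our own sketch):

\begin{verbatim}
# Sketch: verify that yp = -x**2 * e^x solves y'''' - 2y''' + y'' = -2e^x.
import sympy as sp

x = sp.Symbol('x')
yp = -x**2*sp.exp(x)
lhs = sp.diff(yp, x, 4) - 2*sp.diff(yp, x, 3) + sp.diff(yp, x, 2)
print(sp.simplify(lhs))    # -2*exp(x)
\end{verbatim}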
If we are able to invert the operator $P(D)$, then a particular solution of the ODE $P(D)y = g$ can be expressed simply as:

$$ y_p = [P(D)]^{-1} g = \frac{1}{P(D)}\, g. $$ (2.113)

The operator $P(D)$ can be inverted only for some particular functions $g$, which we analyze in the following items.

• $g(x) = e^{\mu x}$. In this case we have:

$$ P(D)\, e^{\mu x} = P(\mu)\, e^{\mu x}, $$ (2.114)
$$ \frac{1}{P(D)}\, e^{\mu x} = \frac{1}{P(\mu)}\, e^{\mu x}. $$ (2.115)

The first relation is clear because of the relation $D^m e^{\mu x} = \mu^m e^{\mu x}$ (see also how we recover the characteristic equation from a given homogeneous ODE with constant coefficients, cf. Eqs. 2.97 and 2.98). The second relation is also clear, because we have managed to transform the operator $P(D)$ into a number, and the inverse of $P(D)$ is then simply the inverse of that number. It is important to note that if $P(\mu) = 0$ this result cannot be applied. In fact, we know already from the method of undetermined coefficients that, if $\mu$ is a root of the characteristic equation (which is equivalent to saying that $P(\mu) = 0$), then the particular solution is not proportional to $e^{\mu x}$ but to $x^m e^{\mu x}$.

• $g(x) = f(x)\, e^{\mu x}$. It turns out that (see also the numerical check after this list):

$$ P(D)\, f(x) e^{\mu x} = e^{\mu x}\, P(D+\mu) f(x), $$ (2.116)
$$ \frac{1}{P(D)}\, f(x) e^{\mu x} = e^{\mu x}\, \frac{1}{P(D+\mu)} f(x). $$ (2.117)

The first relation can be demonstrated as follows: given two functions $f$ and $g$, we know already that the $n$-th derivative of $fg$ can be obtained with the help of Pascal's triangle, namely:

$$ \frac{d^n}{dx^n}[f(x)g(x)] = \sum_{k=0}^{n} \frac{n!}{k!(n-k)!}\, \frac{d^k f}{dx^k} \cdot \frac{d^{n-k} g}{dx^{n-k}} \;\Rightarrow\; D^n(fg) = \sum_{k=0}^{n} \frac{n!}{k!(n-k)!}\, D^k f \cdot D^{n-k} g. $$ (2.118)

If $g(x) = e^{\mu x}$, then we know already that $D^j g = \mu^j e^{\mu x}$, therefore:

$$ D^n (f e^{\mu x}) = e^{\mu x} \sum_{k=0}^{n} \frac{n!}{k!(n-k)!}\, D^k f \cdot \mu^{n-k} = e^{\mu x} (D+\mu)^n f. $$ (2.119)

If we put together a polynomial in $D$, namely many terms like the one above with different exponents $n$ and some multiplicative coefficients, we will obtain $e^{\mu x}$ times the same polynomial in $D+\mu$, which demonstrates Eq. 2.116. To demonstrate Eq. 2.117 we can set $h(x) = P(D+\mu)f(x)$, namely $f(x) = \frac{1}{P(D+\mu)} h(x)$. From Eq. 2.116 we have:

$$ P(D)\, e^{\mu x} \frac{1}{P(D+\mu)} h(x) = e^{\mu x} h(x). $$

If we now apply the operator $\frac{1}{P(D)}$ to both members of this equation we obtain:

$$ e^{\mu x} \frac{1}{P(D+\mu)} h(x) = \frac{1}{P(D)}\, e^{\mu x} h(x), $$

which is exactly Eq. 2.117 with $f$ replaced by $h$.

• $g(x) = \cos(\alpha x)$. It turns out that:

$$ P(D^2)\cos(\alpha x) = P(-\alpha^2)\cos(\alpha x), $$ (2.120)
$$ \frac{1}{P(D^2)}\cos(\alpha x) = \frac{1}{P(-\alpha^2)}\cos(\alpha x). $$ (2.121)

The first relation can be demonstrated as follows:

$$ D^0 \cos(\alpha x) = \cos(\alpha x), \quad D^2 \cos(\alpha x) = -\alpha^2 \cos(\alpha x), \quad D^4 \cos(\alpha x) = D^2[D^2\cos(\alpha x)] = (-\alpha^2)^2 \cos(\alpha x), \quad\dots,\quad D^{2n}\cos(\alpha x) = (-\alpha^2)^n \cos(\alpha x). $$

Therefore a polynomial in $D^2$ applied to the function $\cos(\alpha x)$ is equivalent to $\cos(\alpha x)$ times the same polynomial evaluated at $-\alpha^2$; this demonstrates Eq. 2.120. To demonstrate Eq. 2.121 it is enough to notice that we have transformed $P(D^2)$ into a number (once $\alpha$ is assigned), which we can invert as we invert numbers. If $P(-\alpha^2) = 0$ (namely, if $i\alpha$ or $-i\alpha$ are roots of the characteristic equation), this method cannot be applied.

• $g(x) = \sin(\alpha x)$. This case is analogous to the previous one, therefore we have:

$$ P(D^2)\sin(\alpha x) = P(-\alpha^2)\sin(\alpha x), $$ (2.122)
$$ \frac{1}{P(D^2)}\sin(\alpha x) = \frac{1}{P(-\alpha^2)}\sin(\alpha x). $$ (2.123)

• $g(x) = x^n$. In this case it is always possible to find some coefficients $b_0, \dots, b_n$ such that:

$$ \frac{1}{P(D)}\, x^n = (b_0 + b_1 D + \cdots + b_n D^n)\, x^n. $$ (2.124)

In fact, if we divide the number 1 by the polynomial $P(D)$ (in increasing powers of $D$), we obtain a series in $D$ with infinitely many terms; however, since $D^m x^n = 0$ for any $m > n$, we can truncate the series at the $n$-th degree.
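The shift rule Eq. 2.116 can be verified directly, for one concrete choice of $P$, $f$ and $\mu$ (our choices below), with a short sympy sketch:

    import sympy as sp

    x = sp.symbols('x')
    D = lambda g, k=1: sp.diff(g, x, k)
    f = x**3          # with P(D) = D^2 - D + 1 and mu = 1

    lhs = D(f*sp.exp(x), 2) - D(f*sp.exp(x)) + f*sp.exp(x)
    # P(D+1) = (D+1)^2 - (D+1) + 1 = D^2 + D + 1
    rhs = sp.exp(x) * (D(f, 2) + D(f) + f)
    print(sp.simplify(lhs - rhs))   # -> 0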
The application of the D-operator to the solution of ODEs becomes clearer with some examples.

Example 2.5.5 Apply the method of the D-operator to find a particular solution of the ODE:

$$ y'' - y' + y = x^3 e^x. $$

We try to invert the operator $P(D) = D^2 - D + 1$ and find the function:

$$ y_p = \frac{1}{D^2 - D + 1}\{x^3 e^x\}. $$

With the help of Eq. 2.117 this transforms into:

$$ y_p = e^x \frac{1}{(D+1)^2 - (D+1) + 1}\, x^3 = e^x \frac{1}{D^2 + D + 1}\, x^3. $$

We now divide the number 1 by the polynomial $1 + D + D^2$ (long division in increasing powers of $D$) and obtain $1 - D + D^3 - \dots$; the terms of degree higher than 3 can be discarded, because they annihilate $x^3$. The solution is therefore:

$$ y_p(x) = e^x (1 - D + D^3)\, x^3 = e^x (x^3 - 3x^2 + 6). $$

Example 2.5.6 Use the method of the D-operator to find a particular solution of the ODE:

$$ y''' + y'' + y' + y = 15\cos(2x). $$

We have:

$$ y_p = 15\,\frac{1}{D^3 + D^2 + D + 1}\cos(2x) = 15\,\frac{1}{(D+1)(D^2+1)}\cos(2x) $$
$$ = 15\,\frac{1}{D+1}\cdot\frac{1}{-2^2+1}\cos(2x) \quad\text{(because of Eq. 2.121)} $$
$$ = -5\,\frac{D-1}{D^2-1}\cos(2x) = -5(D-1)\,\frac{1}{-2^2-1}\cos(2x) \quad\text{(again Eq. 2.121)} $$
$$ = (D-1)\cos(2x) = -2\sin(2x) - \cos(2x). $$

Example 2.5.7 With the method of the D-operator find a particular solution of the ODE:

$$ y'' - 4y' + 3y = e^{2x}(\cos x + 1). $$

We can rewrite this ODE as $(D^2 - 4D + 3)y = e^{2x}(\cos x + 1)$. We have:

$$ y = \frac{1}{D^2-4D+3}\{e^{2x}(\cos x + 1)\} = \frac{1}{D^2-4D+3}\, e^{2x} + \frac{1}{D^2-4D+3}\{e^{2x}\cos x\} $$
$$ = \frac{1}{2^2 - 4\cdot 2 + 3}\, e^{2x} + e^{2x}\,\frac{1}{(D+2)^2 - 4(D+2) + 3}\cos x $$
$$ = -e^{2x} + e^{2x}\,\frac{1}{D^2-1}\cos x = -e^{2x} + e^{2x}\,\frac{1}{-1-1}\cos x = -e^{2x}\left(1 + \frac{\cos x}{2}\right). $$

As is perhaps evident from these examples, the method of the D-operator is quite powerful and versatile, but it requires some skill and a fair amount of practice to become familiar with its use. In any case, for functions that cannot be expressed as combinations of sines, cosines, polynomials and exponentials, the only viable method remains the messy method of the variation of parameters.

We have seen, however, that this method cannot be applied when $P(\mu) = 0$ (in the calculation of $\frac{1}{P(D)} e^{\mu x}$) or when $P(-\alpha^2) = 0$ (in the calculation of $\frac{1}{P(D^2)}\sin(\alpha x)$ or $\frac{1}{P(D^2)}\cos(\alpha x)$). What can we do in these cases? If we want to calculate $\frac{1}{P(D)} e^{\mu x}$ with $P(\mu) = 0$, we can write $P(D) = (D-\mu)^m \Delta(D)$, where $m$ is the multiplicity of $\mu$ as a root of the polynomial $P(D)$ and $\Delta(D)$ is a polynomial in $D$ which does not have $\mu$ as a root. Now we can apply the rule $\frac{1}{P(D)}\{e^{\mu x} f(x)\} = e^{\mu x}\frac{1}{P(\mu+D)} f(x)$ and obtain:

$$ \frac{1}{P(D)}\, e^{\mu x} = \frac{1}{(D-\mu)^m \Delta(D)}\, e^{\mu x} = e^{\mu x}\,\frac{1}{D^m\,\Delta(\mu+D)}\cdot e^{0\cdot x} = e^{\mu x}\,\frac{1}{D^m}\,\frac{1}{\Delta(\mu)}\cdot 1 = \frac{e^{\mu x}}{\Delta(\mu)}\,\frac{x^m}{m!}. $$ (2.125)

The last step is justified by the fact that the only operator left is $\frac{1}{D^m}$; it operates on 1 and produces its $m$-th integral, which is $\frac{x^m}{m!}$.

Example 2.5.8 Find a particular solution of the ODE:

$$ y^{(4)} + 3y''' + 3y'' + y' = e^{-x}. $$

With the help of the D-operator we can rewrite this ODE as:

$$ (D^4 + 3D^3 + 3D^2 + D)\,y = e^{-x} \;\Rightarrow\; D(D+1)^3 y = e^{-x}. $$

Thus $-1$ is a root of the polynomial $P(D)$ with multiplicity 3, and $\Delta(D) = D$. Applying the method just learned we can write:

$$ y = \frac{1}{D(D+1)^3}\, e^{-x} = e^{-x}\,\frac{1}{(D-1)D^3}\cdot 1 = e^{-x}\,\frac{1}{D^3}\,(-1). $$

Integrating $-1$ three times gives $-\frac{x^3}{6}$, therefore the particular solution is:

$$ y_p(x) = -e^{-x}\,\frac{x^3}{6}. $$
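As a sanity check (a sympy sketch, not part of the classical derivation), we can substitute the result of Example 2.5.8 back into the ODE:

    import sympy as sp

    x = sp.symbols('x')
    yp = -sp.exp(-x) * x**3 / 6
    # left-hand side y' + 3y'' + 3y''' + y'''' of the given ODE
    lhs = sum(c * sp.diff(yp, x, k) for k, c in [(1, 1), (2, 3), (3, 3), (4, 1)])
    print(sp.simplify(lhs - sp.exp(-x)))   # -> 0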
To apply the operator $\frac{1}{P(D^2)}$ to $\cos(\alpha x)$ (or $\sin(\alpha x)$) in the case $P(-\alpha^2) = 0$, it is more convenient to replace $\alpha$ by $\alpha + \varepsilon$ (with $\varepsilon$ small) and use the Taylor series expansion of $\cos[(\alpha+\varepsilon)x]$. We obtain:

$$ \frac{1}{P(D^2)}\cos(\alpha x) = \lim_{\varepsilon\to 0}\frac{1}{P(D^2)}\cos[(\alpha+\varepsilon)x] = \lim_{\varepsilon\to 0}\frac{1}{P[-(\alpha+\varepsilon)^2]}\cos[(\alpha+\varepsilon)x] = \lim_{\varepsilon\to 0}\frac{1}{P[-(\alpha+\varepsilon)^2]}\sum_{n=0}^{\infty}\frac{(\varepsilon x)^n}{n!}\cos^{(n)}(\alpha x). $$ (2.126)

In the limit $\varepsilon \to 0$ we obtain our result.

Example 2.5.9 With the help of the method of the D-operator find a particular solution of the ODE:

$$ y'' + y = \cos x. $$

We have $P(D) = D^2 + 1$ and clearly $-1^2 + 1 = 0$. We therefore increment the frequency by a tiny amount $\varepsilon$ and obtain:

$$ y = \frac{1}{D^2+1}\cos x = \frac{1}{-(1+\varepsilon)^2+1}\cos[(1+\varepsilon)x] = \frac{1}{-\varepsilon(\varepsilon+2)}\sum_{n=0}^{\infty}\frac{(\varepsilon x)^n}{n!}\cos^{(n)} x $$
$$ = \frac{1}{-\varepsilon(\varepsilon+2)}\left[\cos x - (\varepsilon x)\sin x - \frac{(\varepsilon x)^2}{2}\cos x - \dots\right] = -\frac{\cos x}{\varepsilon(\varepsilon+2)} + \frac{x\sin x}{\varepsilon+2} + \frac{\varepsilon x^2\cos x}{2(\varepsilon+2)} + \dots $$ (2.127)

The first term of this sum does not concern us, because a term proportional to $\cos x$ is already part of the complementary solution of the given ODE. The third term (and all the following terms) of the right-hand side tends to 0 for $\varepsilon \to 0$, therefore the only term left is:

$$ \frac{x\sin x}{\varepsilon+2} \;\to\; \frac{x\sin x}{2}, $$

which is therefore the particular solution we have been looking for.

2.5.4 The Euler linear equations

In this subsection and in the following one we deal with linear differential equations with non-constant coefficients. In contrast to ODEs with constant coefficients, there is no general theory for finding the solutions, and in most cases the solutions cannot be expressed in terms of simple elementary functions. In many cases it is possible to obtain a solution of the given ODE by a clever substitution, but there is no general rule on how to find the right substitution.

The Euler equation is one of the simplest cases in which an ODE with variable coefficients can be solved. It has the form:

$$ a_n x^n D^n y + a_{n-1} x^{n-1} D^{n-1} y + \cdots + a_1 x D y + a_0 y = g(x), $$ (2.128)

namely the derivative of $m$-th order is multiplied by $x^m$ and by a constant. This ODE can also be written in the compact notation $P(xD)y = g$. We can solve this equation by means of the substitution $x = e^s$, namely $s = \ln x$ and $s'(x) = \frac{1}{x}$. We have:

$$ Dy = \frac{dy}{dx} = \frac{dy}{ds}\frac{ds}{dx} = \frac{1}{x}\frac{dy}{ds} \;\Rightarrow\; xDy = y'(s). $$

If we now denote by $\delta$ the operator of differentiation with respect to $s$, from the last relation we also have the correspondence between the two operators $D$ and $\delta$:

$$ D = \frac{1}{x}\,\delta. $$

Higher derivatives are given by:

$$ D^2 y = D\left(\frac{1}{x}\,\delta y\right) = -\frac{1}{x^2}\,\delta y + \frac{1}{x^2}\,\delta(\delta y) = \frac{1}{x^2}(\delta^2 y - \delta y) \;\Rightarrow\; x^2 D^2 y = \delta(\delta-1)\,y, $$
$$ x^3 D^3 y = \delta(\delta-1)(\delta-2)\,y, \quad\dots,\quad x^n D^n y = \delta(\delta-1)(\delta-2)\cdots(\delta-n+1)\,y. $$ (2.129)

In this way we can transform the original ODE into an ODE with constant coefficients in which the independent variable is $s$, namely:

$$ b_n \delta^n y + b_{n-1}\delta^{n-1} y + \cdots + b_1 \delta y + b_0 y = g(e^s). $$ (2.130)

Example 2.5.10 Solve the ODE:

$$ x^2 y'' + x y' + y = \sin\ln x^2. $$

With the substitution $x = e^s$ and using the relations $xDy = \delta y$, $x^2 D^2 y = \delta(\delta-1)y$ we obtain:

$$ [\delta(\delta-1) + \delta + 1]\,y = \sin(2s) \;\Rightarrow\; (\delta^2 + 1)\,y = \sin(2s). $$

The complementary solution of this ODE is obtained from the roots of the characteristic equation $\lambda^2 + 1 = 0$, namely:

$$ y_c(s) = c_1 \cos s + c_2 \sin s. $$

The particular solution can be obtained with the method of the D-operator, namely:

$$ y_p(s) = \frac{1}{\delta^2+1}\sin(2s) = \frac{1}{-2^2+1}\sin(2s) = -\frac{1}{3}\sin(2s). $$

The solution of the given ODE is thus:

$$ y(s) = c_1 \cos s + c_2 \sin s - \frac{1}{3}\sin(2s) \;\Rightarrow\; y(x) = c_1 \cos(\ln x) + c_2 \sin(\ln x) - \frac{1}{3}\sin(\ln x^2). $$
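Again as a quick cross-check (a minimal sympy sketch, using $\sin\ln x^2 = \sin(2\ln x)$ for $x > 0$), we can substitute the particular part of the solution of Example 2.5.10 back into the Euler equation:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    y = -sp.sin(2*sp.log(x)) / 3
    lhs = x**2*sp.diff(y, x, 2) + x*sp.diff(y, x) + y
    print(sp.simplify(lhs - sp.sin(2*sp.log(x))))   # -> 0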
2.5.5 Series solutions of linear equations

To deal with the large class of ODEs with variable coefficients we have to extend our search for solutions beyond the familiar elementary functions. The basic idea of the series solution is similar to that of the method of undetermined coefficients: we assume that the solutions of a given ODE have power series expansions, and then we attempt to determine the coefficients of the series so as to satisfy the ODE.

Example 2.5.11 Find a series solution of the equation:

$$ y'' + y = 0. $$

We know already that this ODE has the solution $c_1\cos x + c_2\sin x$, but the example illustrates well the use of power series in the solution of ODEs. We look for solutions in the form of power series about $x_0 = 0$, namely solutions of the kind:

$$ y(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n=0}^{\infty} a_n x^n. $$

Differentiating term by term we obtain:

$$ y' = a_1 + 2a_2 x + 3a_3 x^2 + \cdots = \sum_{n=1}^{\infty} n a_n x^{n-1}, \qquad y'' = 2a_2 + 6a_3 x + \cdots = \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}. $$

Now we should substitute $y$ and $y''$ into the original ODE. Before doing that, we have to rewrite one of the two series so that both series display the same generic term. We do so by replacing $n$ by $n+2$ in the series expressing $y''$, obtaining:

$$ y'' = \sum_{n=0}^{\infty} (n+2)(n+1)\, a_{n+2}\, x^n. $$

In this way the original ODE transforms into:

$$ \sum_{n=0}^{\infty} \left[ (n+2)(n+1)\, a_{n+2} + a_n \right] x^n = 0. $$

For this equation to be satisfied, the coefficient of each power of $x$ must be zero, namely we must have:

$$ (n+2)(n+1)\, a_{n+2} + a_n = 0, \quad \forall n. $$ (2.131)

This relation is called the recurrence relation. It is evident from it that we cannot obtain any information on $a_0$ and $a_1$; but once these first two coefficients are given, all the even coefficients can be obtained recursively from $a_0$ and all the odd coefficients from $a_1$, namely:

$$ a_2 = -\frac{a_0}{2\cdot 1} = -\frac{a_0}{2!}, \quad a_4 = -\frac{a_2}{4\cdot 3} = \frac{a_0}{4!}, \quad a_6 = -\frac{a_4}{6\cdot 5} = -\frac{a_0}{6!}, \;\dots $$
$$ a_3 = -\frac{a_1}{3\cdot 2} = -\frac{a_1}{3!}, \quad a_5 = -\frac{a_3}{5\cdot 4} = \frac{a_1}{5!}, \quad a_7 = -\frac{a_5}{7\cdot 6} = -\frac{a_1}{7!}, \;\dots $$

The solution is therefore:

$$ y(x) = a_0\left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots\right) + a_1\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots\right). $$

It is not difficult to recognize the Taylor series expansions of $\cos x$ and $\sin x$ inside the two brackets, therefore the solution of the ODE (as we already knew) is:

$$ y(x) = a_0 \cos x + a_1 \sin x. $$
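The recurrence relation Eq. 2.131 is also trivial to iterate in code; the following short Python sketch (exact rational arithmetic, initial data of our choosing) reproduces the coefficients above:

    from fractions import Fraction

    a = [Fraction(1), Fraction(1)]             # a0 = a1 = 1, our choice
    for n in range(10):
        a.append(-a[n] / ((n + 2) * (n + 1)))  # a_{n+2} = -a_n/((n+2)(n+1))
    print(a[:8])
    # -> [1, 1, -1/2, -1/6, 1/24, 1/120, -1/720, -1/5040]:
    # even entries are the Taylor coefficients of cos x, odd ones of sin x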
This example shows that, in order to find the solution of an ODE in power series, the main point is to find the recurrence relation (Eq. 2.131), namely to obtain recursively all the coefficients of the series as functions of the first ones.

From now on we shall concentrate only on second-order equations. In fact, all the important ODEs for which a power series solution has been found in the history of mathematics (Legendre's equation, Hermite's equation, Bessel functions etc.) are second-order ODEs. With little work all the following results can be generalized to $n$-th order ODEs. Given a generic second-order linear homogeneous ODE:

$$ y''(x) + p(x)\,y' + q(x)\,y = 0, $$

the theorem of Frobenius and Fuchs describes the nature of the solutions according to the properties of the functions $p$ and $q$.

• If the functions $p(x)$ and $q(x)$ admit a convergent Taylor series expansion about a point $x_0$ (namely, if they are analytic there), then it is always possible to find a solution of the kind

$$ y(x) = \sum_{n=0}^{\infty} a_n (x-x_0)^n. $$ (2.132)

In fact, it is enough to proceed as in Example 2.5.11: find the recurrence relation of the given ODE (taking into account the Taylor series expansions of the functions $p$ and $q$) and solve it. Eq. 2.132 leads to two linearly independent functions $y_1(x)$ and $y_2(x)$. A point $x_0$ with such properties (namely, such that $p$ and $q$ are analytic at it) is said to be ordinary; otherwise it is said to be singular.

• If the point $x_0$ is singular, but the functions $g(x) = (x-x_0)\,p(x)$ and $h(x) = (x-x_0)^2\,q(x)$ are analytic at $x_0$, then the point $x_0$ is a regular singular point. In this case the ODE admits at least one solution of the form:

$$ y(x) = (x-x_0)^p \sum_{n=0}^{\infty} a_n (x-x_0)^n = \sum_{n=0}^{\infty} a_n (x-x_0)^{n+p}. $$ (2.133)

If we substitute this expression into the original ODE, we obtain a quadratic equation for $p$ called the indicial equation.

• If $x_0$ is not a regular singular point, then it is said to be irregular, and a solution of the above forms may not exist.

Given an ODE of the form $y'' + p(x)y' + q(x)y = 0$, if the point $x = 0$ is a regular singular point, then the functions $g(x) = x\,p(x)$ and $h(x) = x^2 q(x)$ are analytic at $x = 0$, namely they admit Taylor series expansions:

$$ x\,p(x) = g(x) = \sum_{n=0}^{\infty} g_n x^n, \qquad x^2 q(x) = h(x) = \sum_{n=0}^{\infty} h_n x^n. $$ (2.134)

The original ODE can thus be written as:

$$ y'' + \frac{g(x)}{x}\,y' + \frac{h(x)}{x^2}\,y = 0. $$ (2.135)

From Eq. 2.133 we know that we must look for at least one solution of the form $y = \sum_{n=0}^{\infty} a_n x^{n+p}$. Differentiating with respect to $x$ we obtain:

$$ y' = \sum_{n=0}^{\infty} (n+p)\,a_n x^{n+p-1}, $$ (2.136)
$$ y'' = \sum_{n=0}^{\infty} (n+p)(n+p-1)\,a_n x^{n+p-2}. $$ (2.137)

Substituting Eqs. 2.136 and 2.137 into the original ODE we obtain:

$$ \sum_{n=0}^{\infty} (n+p)(n+p-1)\,a_n x^{n+p-2} + g(x)\sum_{n=0}^{\infty} (n+p)\,a_n x^{n+p-2} + h(x)\sum_{n=0}^{\infty} a_n x^{n+p-2} = 0. $$

Dividing this equation by $x^{p-2}$ we obtain:

$$ \sum_{n=0}^{\infty} \left[ (n+p)(n+p-1) + g(x)(n+p) + h(x) \right] a_n x^n = 0. $$ (2.138)

For this equation to be satisfied, the coefficient of each power of $x$ must be zero, including that of $x^0$. The terms with $x^0$ are obtained for $n = 0$ in the above equation, taking the first terms $g_0$ and $h_0$ of the series expansions of $g(x)$ and $h(x)$. These are, by the way, also the values of the functions $g(x)$ and $h(x)$ at the point $x = 0$, namely $g_0 = g(0)$ and $h_0 = h(0)$. Equating the coefficient of $x^0$ to zero we obtain:

$$ p(p-1)\,a_0 + g_0\,p\,a_0 + h_0\,a_0 = 0. $$

Assuming that $a_0 \neq 0$ we can cancel it out and obtain the indicial equation:

$$ p^2 + (g_0 - 1)\,p + h_0 = 0. $$ (2.139)

This is a second-degree equation, which therefore has two roots $p_1$ and $p_2$. According to the nature of these roots we can have the following cases (which we mention without demonstration):

• Real distinct roots not differing by an integer. In this case the two series solutions of the given ODE are both of the form of Eq. 2.133, namely:

$$ y_1 = \sum_{n=0}^{\infty} a_n x^{n+p_1}, \qquad y_2 = \sum_{n=0}^{\infty} b_n x^{n+p_2}. $$

• Double root. In this case we still have one solution of the form of Eq. 2.133,

$$ y_1 = \sum_{n=0}^{\infty} a_n x^{n+p}, $$

whereas the second solution is given by:

$$ y_2 = y_1 \ln x + \sum_{n=0}^{\infty} b_n x^{n+p}. $$ (2.140)

• Roots differing by an integer. Once again (as shown in the theorem of Frobenius and Fuchs) we have one solution of the form of Eq. 2.133, namely:

$$ y_1 = \sum_{n=0}^{\infty} a_n x^{n+p_1}. $$
The second solution is given by:

$$ y_2 = k\,y_1 \ln x + \sum_{n=0}^{\infty} b_n x^{n+p_2}, $$ (2.141)

namely, it differs from the previous case only by a constant $k$, which may also turn out to be zero.

Once the nature of the solutions has been established, we must determine the coefficients $a_j$ and $b_j$ by means of the appropriate recurrence relations; if we are lucky, the series $\sum_{n=0}^{\infty} a_n x^n$, $\sum_{n=0}^{\infty} b_n x^n$ will be the Taylor series expansions of some known elementary functions; otherwise the series solutions remain defined by their recurrence relations.

Example 2.5.12 Find the power series solutions of the ODE:

$$ x y'' - 2y' + 9x^5 y = 0, $$

and verify that the two solutions are linearly independent.

We can rewrite this ODE in the form:

$$ y'' - \frac{2}{x}\,y' + 9x^4 y = 0. $$

The point $x = 0$ is therefore a regular singular point, because the functions $g(x) = x\,p(x) = -2$ and $h(x) = x^2 q(x) = 9x^6$ are analytic at $x = 0$. We have $g(0) = -2$ and $h(0) = 0$, therefore the indicial equation is:

$$ p^2 + [g(0)-1]\,p + h(0) = 0 \;\Rightarrow\; p^2 - 3p = 0. $$

This equation has roots $p_1 = 0$ and $p_2 = 3$. At least one of these indices must lead to a solution of the form $y(x) = \sum_{n=0}^{\infty} a_n x^{n+p}$. If we take $p = 3$, by means of Eq. 2.138 we obtain:

$$ \sum_{n=0}^{\infty} \left[ (n+3)(n+2) - 2(n+3) + 9x^6 \right] a_n x^n = 0 \;\Rightarrow\; \sum_{n=0}^{\infty} \left[ n(n+3)\,a_n x^n + 9 a_n x^{n+6} \right] = 0. $$

The second term of this sum has an exponent different from the first one. In order to compare terms with like powers of $x$ we have to lower the index of $a_n$ in the term $9a_n x^{n+6}$ by six units. In this way we obtain the recurrence relation:

$$ n(n+3)\,a_n + 9 a_{n-6} = 0 \;\Rightarrow\; a_n = -\frac{9 a_{n-6}}{n(n+3)}. $$

Namely, given an initial value $a_0$ we can determine the coefficients $a_6$, $a_{12}$ and so on; nothing can be said about the intermediate coefficients $a_1, a_2, \dots$ (which can be set to 0). Let us therefore put $n = 6k$ and rewrite the recurrence relation as:

$$ a_{6k} = -\frac{9 a_{6k-6}}{6k(6k+3)} = -\frac{a_{6k-6}}{2k(2k+1)} = \frac{a_{6k-12}}{(2k+1)(2k)(2k-1)(2k-2)} = \dots $$

It is now more evident what the recurrence relation produces, namely:

$$ a_{6k} = \frac{(-1)^k a_0}{(2k+1)!}. $$

Therefore the first solution of the given ODE is:

$$ y_1(x) = a_0 \sum_{n=0}^{\infty} a_{6n} x^{6n+3} = a_0 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, x^{3(2n+1)}. $$

If we substitute $z = x^3$, it is not difficult to recognize in this series the Taylor expansion of $\sin z$, therefore the first solution is:

$$ y_1(x) = a_0 \sin(x^3). $$

As we have said, the second solution might or might not be of the form $\sum_{n=0}^{\infty} b_n x^{n+p}$. If we assume it to be of this form and proceed as in the previous case (but with $p = 0$ this time), we obtain:

$$ n(n-1)\,b_n - 2n\,b_n + 9 b_{n-6} = 0 \;\Rightarrow\; b_n = -\frac{9 b_{n-6}}{n(n-3)}. $$

Again we write $6k$ instead of $n$ and obtain:

$$ b_{6k} = -\frac{9 b_{6k-6}}{6k(6k-3)} = -\frac{b_{6k-6}}{2k(2k-1)} = \frac{(-1)^k b_0}{(2k)!}. $$

In this way, the second solution can be written as:

$$ y_2(x) = b_0 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, x^{6n} = b_0 \cos(x^3). $$

The Wronskian of the two functions is:

$$ W(y_1, y_2) = y_1 y_2' - y_2 y_1' = a_0\sin(x^3)\,(-3x^2)\,b_0\sin(x^3) - b_0\cos(x^3)\,(3x^2)\,a_0\cos(x^3) = -3 a_0 b_0 x^2 \neq 0. $$

We could also have calculated the Wronskian from Abel's theorem, $W = C e^{-\int p(\tilde x)\,d\tilde x} = C e^{2\int \frac{d\tilde x}{\tilde x}} = C x^2$, obtaining the same function.
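Both closed-form solutions and the Wronskian of Example 2.5.12 can be confirmed with a short sympy sketch (taking $a_0 = b_0 = 1$):

    import sympy as sp

    x = sp.symbols('x')
    y1, y2 = sp.sin(x**3), sp.cos(x**3)
    for y in (y1, y2):
        print(sp.simplify(x*sp.diff(y, x, 2) - 2*sp.diff(y, x) + 9*x**5*y))
        # -> 0 for both solutions
    print(sp.simplify(y1*sp.diff(y2, x) - y2*sp.diff(y1, x)))   # -> -3*x**2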
Chapter 3

Complex analysis

Complex analysis extends to the complex plane the familiar concepts of real analysis (limits, continuity, derivatives, integrals). Many of the properties of complex functions are analogous to those of real functions, but we will see that some operations (like integration) behave quite unlike their familiar real counterparts.

3.1 Complex functions

We have already seen in Chapter 1 how to form complex numbers $z$ from ordered pairs of real numbers $(x, y)$. Of course, if $x$ and $y$ are allowed to vary (namely, if they are variables), then $z$ is a complex variable. The quantity $f(z)$ is said to be a complex function of $z$ if to every value of $z$ in a certain domain $R$ (a region of the Argand diagram) there correspond one or more complex numbers $w = f(z)$. If only one value of $w$ corresponds to each value of $z$, the function $f$ is single-valued; if instead to each value of $z$ we can assign more values of $w$, the function is multiple-valued.

Example 3.1.1 Is the function $f(z) = \sqrt{z}$ single-valued or multiple-valued?

We can write $z = r e^{i\theta}$, therefore $f(z) = \sqrt{r}\, e^{i(\theta/2 + n\pi)}$; namely, to any complex number $z$ there correspond two complex numbers representing its square root: $w_1 = \sqrt{r}\, e^{i\theta/2}$ and $w_2 = \sqrt{r}\, e^{i(\theta/2 + \pi)}$. The function is therefore multiple-valued.

Concerning Example 3.1.1 we can notice that, if we move around a closed path that does not enclose the origin of the axes and we evaluate $f(z) = \sqrt{z}$ at each point of the path, after one complete circuit the value of $\theta$ is the same as its original value, and therefore so is the value of $f(z)$. If instead we move on a closed path that encloses the origin, after one circuit we have $\theta + 2\pi$ instead of $\theta$, and therefore the value of $f(z)$ changes from $\sqrt{r}\, e^{i\theta/2}$ to $\sqrt{r}\, e^{i(\theta/2+\pi)}$. Referring to Fig. 3.1, along the path (a) the function $f(z)$ does not change after one circuit, whereas it does change if we move along the path drawn in (b).

Figure 3.1: (a) a closed contour not enclosing the origin; (b) a closed contour enclosing the origin.

A point like the origin, with the property that the function $f(z)$ changes after one circuit enclosing it, is called a branch point.

In order for a multiple-valued function to be treated as a single-valued one, we must somehow prevent a circuit around the branch point from being completed. This is achieved through the so-called branch cut. The branch cut is a line or a curve in the complex plane that can be regarded as an artificial barrier which a closed path must not cross. A possible branch cut for the function $f(z) = \sqrt{z}$ is shown in Fig. 3.2, namely we take the cut along the positive real axis (but any line extending from the origin out to $|z| = \infty$ would serve as a branch cut as well). As shown in that figure, if we prevent closed paths from crossing the branch cut, there is no way to enclose the origin, and therefore the function $f(z)$ may be regarded as single-valued. We can also say that the function $f(z) = \sqrt{z}$ is single-valued in the domain $R = \mathbb{C} \setminus \mathbb{R}_{\geq 0}$.

Figure 3.2: A possible branch cut for the function $f(z) = \sqrt{z}$ and a closed contour.
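The sign flip at the branch point is easy to see numerically. The following small Python sketch follows the argument $\theta$ continuously once around the unit circle and compares the result with the principal branch computed by the standard library:

    import cmath, math

    for theta in (0.0, math.pi, 2*math.pi):
        w_cont = cmath.exp(1j*theta/2)          # continuous branch of sqrt(z), r = 1
        z = cmath.exp(1j*theta)
        print(theta, w_cont, cmath.sqrt(z))
    # After a full circuit (theta = 2*pi) the continuous branch gives -1,
    # whereas the principal value has jumped back to +1 at the cut.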
3.1.1 Differentiable functions

Analogously to the definition of the derivative in the real case, we can define the derivative of a single-valued function $f$ at a point $z$ as the limit:

$$ f'(z) = \lim_{\Delta z \to 0} \frac{f(z+\Delta z) - f(z)}{\Delta z}. $$ (3.1)

A function $f$ that is single-valued in some domain $R$ of the Argand diagram is said to be differentiable at $z$ if the derivative Eq. 3.1 exists and is unique, namely if it does not depend on the direction in the Argand diagram from which $\Delta z$ tends to zero.

Example 3.1.2 Show that the function $f(z) = \mathrm{Re}(z)\cdot\mathrm{Im}(z) = xy$ is not differentiable anywhere in the complex plane.

Given a generic $\Delta z = \Delta x + i\Delta y$ we have:

$$ \lim_{\Delta z \to 0} \frac{f(z+\Delta z) - f(z)}{\Delta z} = \lim_{\Delta x, \Delta y \to 0} \frac{(x+\Delta x)(y+\Delta y) - xy}{\Delta x + i\Delta y}. $$

If we now suppose that $\Delta z \to 0$ along a straight line with slope $m$, we have $\Delta y = m\Delta x$, therefore:

$$ \lim_{\Delta z \to 0} \frac{f(z+\Delta z) - f(z)}{\Delta z} = \lim_{\Delta x \to 0} \frac{\Delta x\,(y + xm + m\Delta x)}{\Delta x\,(1 + im)} = \frac{y + xm}{1 + im}. $$

This limit clearly depends on the direction from which $\Delta z$ tends to zero (namely, on $m$), therefore the given function is not differentiable.

A function that is single-valued and differentiable at all points of a domain $R$ is said to be analytic there. If a function is analytic in a domain $R$ with the exception of some points of this domain, then these points are called singularities of $f(z)$. We have already encountered one kind of singularity (the branch point). Another very important singularity is the pole. A point $z_0$ is said to be a pole of order $n$ of the function $f(z)$ if:

$$ \lim_{z\to z_0} \left[ (z-z_0)^n f(z) \right] = a, $$ (3.2)

where $a$ is a finite complex number different from zero. Equivalently, we can say that the function $f(z)$ has a pole of order $n$ at $z_0$ if we can find a function $g(z)$, analytic in a neighborhood of $z_0$ and with $g(z_0) \neq 0$, such that:

$$ f(z) = \frac{g(z)}{(z-z_0)^n}. $$ (3.3)

A pole is an isolated singularity, in the sense that $f(z)$ has a pole at $z_0$ but is analytic in some punctured neighborhood of $z_0$.

3.1.2 The Cauchy-Riemann conditions

The Cauchy-Riemann conditions establish what properties a function $f(z)$ must have in order to be analytic in some domain $R$. We can split every function $f(z)$ into two real functions $u(x,y)$ and $v(x,y)$ such that:

$$ f(z) = u(x,y) + i\,v(x,y). $$ (3.4)

Given a point $z \in R$, $f'(z)$ is given by:

$$ f'(z) = \lim_{\Delta x, \Delta y \to 0} \frac{[u(x+\Delta x, y+\Delta y) - u(x,y)] + i[v(x+\Delta x, y+\Delta y) - v(x,y)]}{\Delta x + i\Delta y}. $$

As we have said, this limit will in general depend on the way $\Delta z = \Delta x + i\Delta y$ approaches zero. The easiest paths we can think of are along the $x$-axis and along the $y$-axis: in the first case we set $\Delta y = 0$ and let $\Delta x$ go to 0, whereas in the second case $\Delta x = 0$ and $\Delta y$ tends to zero. For $\Delta y = 0$ we obtain:

$$ \lim_{\Delta x\to 0} \left\{ \frac{u(x+\Delta x, y) - u(x,y)}{\Delta x} + i\,\frac{v(x+\Delta x, y) - v(x,y)}{\Delta x} \right\} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}. $$ (3.5)

Assuming instead $\Delta x = 0$ we obtain:

$$ \lim_{\Delta y\to 0} \left\{ \frac{u(x, y+\Delta y) - u(x,y)}{i\Delta y} + i\,\frac{v(x, y+\Delta y) - v(x,y)}{i\Delta y} \right\} = \frac{1}{i}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}. $$ (3.6)

It is clear that a necessary condition for $f(z)$ to be differentiable at $z$ is that these two expressions for $f'(z)$ coincide. Equating the real and imaginary parts we obtain:

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. $$ (3.7)

These equations are called the Cauchy-Riemann conditions. It can be shown that a necessary and sufficient condition for $f(z)$ to be analytic in a domain $R$ is that the Cauchy-Riemann conditions are satisfied at all points of $R$ and that the four partial derivatives $\frac{\partial u}{\partial x}$, $\frac{\partial u}{\partial y}$, $\frac{\partial v}{\partial x}$, $\frac{\partial v}{\partial y}$ are continuous there.
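The conditions Eq. 3.7 are mechanical enough that a tiny sympy checker (the function name check_CR is ours) can test candidate functions; applied to Example 3.1.2 it shows the failure immediately, while $f(z) = z^2$ passes:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)

    def check_CR(u, v):
        # returns (u_x - v_y, v_x + u_y); both must vanish identically
        return (sp.simplify(sp.diff(u, x) - sp.diff(v, y)),
                sp.simplify(sp.diff(v, x) + sp.diff(u, y)))

    print(check_CR(x*y, sp.Integer(0)))       # (y, x): xy is not differentiable
    print(check_CR(x**2 - y**2, 2*x*y))       # (0, 0): z^2 is analytic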
Example 3.1.3 Show that the function $\sin z$ is analytic in the whole complex plane.

We have seen in Chapter 1 that we can define $\sin z = \frac{e^{iz}-e^{-iz}}{2i}$ for any complex number $z$. Since $z = x + iy$ we obtain:

$$ \sin z = \frac{e^{iz} - e^{-iz}}{2i} = \frac{e^{-y+ix} - e^{y-ix}}{2i} = \frac{e^{-y}(\cos x + i\sin x) - e^{y}(\cos x - i\sin x)}{2i} $$
$$ = \frac{e^{-y} - e^{y}}{2i}\cos x + \frac{e^{-y} + e^{y}}{2}\sin x = \sin x \cosh y + i\cos x \sinh y. $$

The real and imaginary parts of $\sin z$ are thus $u(x,y) = \sin x\cosh y$ and $v(x,y) = \cos x\sinh y$, respectively. Taking the partial derivatives of these two functions we obtain:

$$ \frac{\partial u}{\partial x} = \cos x\cosh y, \quad \frac{\partial u}{\partial y} = \sin x\sinh y, \quad \frac{\partial v}{\partial x} = -\sin x\sinh y, \quad \frac{\partial v}{\partial y} = \cos x\cosh y. $$

All the partial derivatives are continuous and the Cauchy-Riemann conditions are satisfied for any choice of $z$, therefore the sine function is analytic everywhere in $\mathbb{C}$.

Example 3.1.4 Show that the function $\ln z$ (the principal value of $\mathrm{Ln}(z)$) is analytic wherever it is defined, and find its derivative.

As we have seen in Chapter 1, $\ln z$ is given by:

$$ \ln z = \ln r + i\theta = \ln\sqrt{x^2+y^2} + i\arctan\frac{y}{x}, $$

namely we have:

$$ u(x,y) = \ln\sqrt{x^2+y^2}, \qquad v(x,y) = \arctan\frac{y}{x}. $$

Taking the partial derivatives we obtain:

$$ \frac{\partial u}{\partial x} = \frac{1}{\sqrt{x^2+y^2}}\cdot\frac{2x}{2\sqrt{x^2+y^2}} = \frac{x}{x^2+y^2}, \qquad \frac{\partial u}{\partial y} = \frac{y}{x^2+y^2}, $$
$$ \frac{\partial v}{\partial x} = \frac{1}{1+\frac{y^2}{x^2}}\cdot\left(-\frac{y}{x^2}\right) = -\frac{y}{x^2+y^2}, \qquad \frac{\partial v}{\partial y} = \frac{1}{1+\frac{y^2}{x^2}}\cdot\frac{1}{x} = \frac{x}{x^2+y^2}. $$

All the partial derivatives are continuous and satisfy the Cauchy-Riemann conditions away from the branch cut, therefore the function $f(z) = \ln z$ is analytic on its domain of definition. It is worth noting that, had we taken the complex logarithm $\mathrm{Ln}(z) = \ln r + i(\theta + 2n\pi)$ instead of its principal value, nothing would have changed in the partial derivatives, because the additive term $2n\pi$ in $v(x,y)$ disappears after differentiation. However, the function $f(z) = \mathrm{Ln}(z)$ is multiple-valued (as we have seen in Chapter 1) and because of that it cannot be analytic. To make it single-valued we have to restrict the range of the argument to $\theta \in [0, 2\pi[$, namely we have to take the principal value $\ln z$ of the complex logarithm.

If we apply the definition of the derivative to the function $f(z) = \ln z$ we obtain:

$$ f'(z) = \lim_{\Delta z\to 0} \frac{f(z+\Delta z) - f(z)}{\Delta z} = \lim_{\Delta z\to 0} \frac{\ln\left(1 + \frac{\Delta z}{z}\right)}{\Delta z}. $$

It can be shown that $\ln(1+z)$ has the same Taylor expansion as the analogous real function $\ln(1+x)$, namely $\ln(1+z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \dots$, therefore we obtain:

$$ f'(z) = \lim_{\Delta z\to 0} \frac{\frac{\Delta z}{z} - \frac{1}{2}\left(\frac{\Delta z}{z}\right)^2 + \dots}{\Delta z} = \frac{1}{z}, $$

analogous to the result for the real function $f(x) = \ln x$. Indeed, it is possible to show that all the elementary functions (sine, cosine, hyperbolic sine and cosine, exponential function, polynomials etc.) are analytic and that their derivatives have the same expressions as those of the corresponding real functions.
If we express the complex number in polar form ($z = r e^{i\theta}$), then we can find two functions $u$ and $v$ such that $f(z) = u(r,\theta) + i\,v(r,\theta)$. In this way $f'(z)$ can be written as:

$$ f'(z) = \lim_{\Delta r, \Delta\theta \to 0} \frac{u(r+\Delta r, \theta+\Delta\theta) - u(r,\theta) + i[v(r+\Delta r, \theta+\Delta\theta) - v(r,\theta)]}{(r+\Delta r)\,e^{i(\theta+\Delta\theta)} - r e^{i\theta}}. $$

Once again we first take $\Delta\theta = 0$ and send $\Delta r$ to zero, and then we proceed the other way around. For $\Delta\theta = 0$ we obtain:

$$ \lim_{\Delta r\to 0} \frac{u(r+\Delta r, \theta) - u(r,\theta) + i[v(r+\Delta r, \theta) - v(r,\theta)]}{\Delta r\, e^{i\theta}} = e^{-i\theta}\left( \frac{\partial u}{\partial r} + i\frac{\partial v}{\partial r} \right). $$

If we take instead $\Delta r = 0$ we have:

$$ \lim_{\Delta\theta\to 0} \frac{u(r, \theta+\Delta\theta) - u(r,\theta) + i[v(r, \theta+\Delta\theta) - v(r,\theta)]}{r\left(e^{i(\theta+\Delta\theta)} - e^{i\theta}\right)} = \frac{e^{-i\theta}}{r} \lim_{\Delta\theta\to 0} \frac{u(r, \theta+\Delta\theta) - u(r,\theta) + i[v(r, \theta+\Delta\theta) - v(r,\theta)]}{e^{i\Delta\theta} - 1}. $$

We know already (Chapter 1) that the exponential function admits the same Taylor series expansion as the corresponding real function, namely $e^x = 1 + x + \frac{x^2}{2} + \dots$, therefore the expression at the denominator is $e^{i\Delta\theta} - 1 \simeq i\Delta\theta$. In the limit $\Delta\theta \to 0$ the previous expression transforms into:

$$ \frac{e^{-i\theta}}{r}\left( \frac{1}{i}\frac{\partial u}{\partial\theta} + \frac{\partial v}{\partial\theta} \right) = \frac{e^{-i\theta}}{r}\left( \frac{\partial v}{\partial\theta} - i\frac{\partial u}{\partial\theta} \right). $$

If we now compare the real and imaginary parts of the two limits we have found, we obtain:

$$ r\,\frac{\partial u}{\partial r} = \frac{\partial v}{\partial\theta}, $$ (3.8)
$$ r\,\frac{\partial v}{\partial r} = -\frac{\partial u}{\partial\theta}, $$ (3.9)

which are the Cauchy-Riemann conditions in polar form. By means of these equations it is much easier to see that the function $\ln z$ (Example 3.1.4) is analytic. In fact, we have $u(r,\theta) = \ln r$ and $v(r,\theta) = \theta$; it is therefore clear that $\frac{\partial u}{\partial\theta} = 0 = \frac{\partial v}{\partial r}$, satisfying Eq. 3.9. We have then $\frac{\partial u}{\partial r} = \frac{1}{r}$ and $\frac{\partial v}{\partial\theta} = 1$, so Eq. 3.8 is satisfied as well. As usual, some operations are easier to perform in the polar representation, some others in the algebraic representation.

3.2 Complex integration

3.2.1 Line integrals in the complex plane

We can define integrals of complex functions exactly as we do with real functions, namely we define the indefinite integral $\int f(z)\,dz$ as any function whose derivative is $f(z)$. When we try to calculate definite integrals, things are more complicated. In fact, given a real function $f(x)$ and two real numbers $x_1$ and $x_2$, the number $\int_{x_1}^{x_2} f(x)\,dx$ is unambiguously defined. Given instead a complex function $f(z)$ and two points $z_1, z_2 \in \mathbb{C}$, the expression $\int_{z_1}^{z_2} f(z)\,dz$ is ambiguous, because there are infinitely many curves joining $z_1$ to $z_2$, and we might obtain different results according to the curve we choose. Here we can see the analogy between complex integrals and line integrals of scalar (or vector) fields. If we have a scalar field $f$ and we want to evaluate the integral of this field along a curve $\gamma$, we know that we have to find a parameterization $\mathbf{r}: [a,b] \to \gamma$ such that $\mathbf{r}(a)$ and $\mathbf{r}(b)$ are the endpoints of the curve $\gamma$. We can then express the line integral through real integrals in this way:

$$ \int_\gamma f\,ds = \int_a^b f[\mathbf{r}(t)]\,|\mathbf{r}'(t)|\,dt, $$

where $ds$ is the infinitesimal line element. For complex integrals we proceed in the same way, namely we describe the path $\gamma$ with a continuous (real) parameter $t$ ranging from $a$ to $b$ and giving the successive positions on $\gamma$ through the relations:

$$ x = x(t), \qquad y = y(t). $$ (3.10)

Assuming as usual that $u(x,y)$ is the real part of $f(z)$ and $v(x,y)$ its imaginary part, the integral of the function $f(z)$ along the curve $\gamma$ can be given as a sum of real integrals as follows:

$$ \int_\gamma f(z)\,dz = \int_\gamma (u+iv)(dx+i\,dy) = \int_\gamma (u\,dx - v\,dy) + i\int_\gamma (u\,dy + v\,dx) $$
$$ = \int_a^b \left( u\frac{dx}{dt} - v\frac{dy}{dt} \right) dt + i\int_a^b \left( u\frac{dy}{dt} + v\frac{dx}{dt} \right) dt. $$ (3.11)

Because of this reduction of complex integrals to sums of real integrals, it is easy to see that the following two identities hold:

$$ \int_{-\gamma} f(z)\,dz = -\int_{\gamma} f(z)\,dz, $$ (3.12)
$$ \int_{\gamma} f(z)\,dz = \int_{\gamma_1} f(z)\,dz + \int_{\gamma_2} f(z)\,dz. $$ (3.13)

In the first identity the curve $-\gamma$ is the same as the curve $\gamma$ but traversed in the opposite direction, so the parameterization $x(t)$, $y(t)$ is the same but the endpoints $a$ and $b$ are exchanged; in this sense Eq. 3.12 is the analogue of the relation $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$ among real integrals. In the second identity $\gamma_1$ and $\gamma_2$ are two curves which, joined together, give the curve $\gamma$; because of the reduction of complex to real integrals in Eq. 3.11, this result is again analogous to the result for real integrals $\int_a^b f(x)\,dx = \int_a^m f(x)\,dx + \int_m^b f(x)\,dx$.
Example 3.2.1 Evaluate the integrals of the functions $f_1(z) = \frac{1}{z}$ and $f_2(z) = \mathrm{Im}(z)$ along the paths $\gamma_1$ and $\gamma_2$ indicated in Fig. 3.3.

Figure 3.3: Different paths to evaluate the integrals of the functions $f_1(z) = 1/z$ and $f_2(z) = \mathrm{Im}(z)$ (see Example 3.2.1).

The curve $\gamma_1$ is a circular arc with radius $R\sqrt{2}$ and its parameterization is clearly:

$$ x(t) = R\sqrt{2}\cos t, \qquad y(t) = R\sqrt{2}\sin t, \qquad t \in \left[\frac{\pi}{4}, \frac{3\pi}{4}\right]. $$

The curve $\gamma_2$ is instead a straight line on which $y$ is constant, and a parameterization of it is:

$$ x(t) = Rt, \qquad y(t) = R, \qquad t \in [1, -1]. $$

The function $f_1(z) = \frac{1}{z}$ can also be written as:

$$ f_1(z) = \frac{1}{x+iy} = \frac{x-iy}{x^2+y^2} \;\Rightarrow\; u(x,y) = \frac{x}{x^2+y^2}, \quad v(x,y) = -\frac{y}{x^2+y^2}. $$

Using the parameterization adopted for the curve $\gamma_1$ this becomes:

$$ u = \frac{\cos t}{R\sqrt{2}}, \qquad v = -\frac{\sin t}{R\sqrt{2}}, $$

whereas with the parameterization chosen for the curve $\gamma_2$ we obtain:

$$ u = \frac{t}{R(1+t^2)}, \qquad v = -\frac{1}{R(1+t^2)}. $$

For the function $f_2(z) = \mathrm{Im}(z)$ the real and imaginary parts are evident, namely $u(x,y) = y$ and $v(x,y) = 0$. Now we can calculate the four requested integrals:

$$ \int_{\gamma_1} f_1(z)\,dz = \int_{\pi/4}^{3\pi/4} \left[ \frac{\cos t}{R\sqrt{2}}(-R\sqrt{2}\sin t) + \frac{\sin t}{R\sqrt{2}}(R\sqrt{2}\cos t) \right] dt + i\int_{\pi/4}^{3\pi/4} \left[ \frac{\cos t}{R\sqrt{2}}(R\sqrt{2}\cos t) + \frac{\sin t}{R\sqrt{2}}(R\sqrt{2}\sin t) \right] dt = i\int_{\pi/4}^{3\pi/4} 1\,dt = i\frac{\pi}{2}. $$

$$ \int_{\gamma_2} f_1(z)\,dz = \int_1^{-1} \frac{t}{1+t^2}\,dt - i\int_1^{-1} \frac{dt}{1+t^2} = \left[\frac{1}{2}\ln(1+t^2)\right]_1^{-1} - i\left[\arctan t\right]_1^{-1} = -i\left(-\frac{\pi}{4} - \frac{\pi}{4}\right) = i\frac{\pi}{2}. $$

$$ \int_{\gamma_1} f_2(z)\,dz = \int_{\pi/4}^{3\pi/4} R\sqrt{2}\sin t\,(-R\sqrt{2}\sin t)\,dt + i\int_{\pi/4}^{3\pi/4} R\sqrt{2}\sin t\,(R\sqrt{2}\cos t)\,dt = -R^2\int_{\pi/4}^{3\pi/4} 2\sin^2 t\,dt + iR^2\int_{\pi/4}^{3\pi/4} 2\sin t\cos t\,dt. $$

We recall now that $2\sin t\cos t = \sin(2t)$ and $-2\sin^2 t = \cos(2t) - 1$, therefore we have:

$$ \int_{\gamma_1} f_2(z)\,dz = R^2\int_{\pi/4}^{3\pi/4} [\cos(2t) - 1]\,dt + iR^2\int_{\pi/4}^{3\pi/4} \sin(2t)\,dt = R^2\left[\frac{\sin(2t)}{2} - t\right]_{\pi/4}^{3\pi/4} - i\frac{R^2}{2}\Bigl[\cos(2t)\Bigr]_{\pi/4}^{3\pi/4} = -R^2\left(1 + \frac{\pi}{2}\right). $$

Finally:

$$ \int_{\gamma_2} f_2(z)\,dz = \int_1^{-1} R\cdot R\,dt = -2R^2. $$

There are several things to notice in this example. We have seen that in one case (the function $f_1(z)$) the integral seems not to depend on the path we have chosen but only on the endpoints, whereas in the other case (the function $f_2(z)$) the result does depend on the path. Moreover, if we take the closed curve $\gamma = \gamma_1 - \gamma_2$, the integral of the function $f_1(z)$ along it is zero. It is not difficult to imagine why the functions $f_1(z)$ and $f_2(z)$ behave differently: the first function is analytic in the domain we are considering, whereas the function $f_2(z)$ is not.

Example 3.2.2 Evaluate the integral

$$ \int_C \frac{dz}{(z-z_0)^n}, $$

where $C$ is a circle of radius $R$ centered at $z_0$.

It is convenient to express $z - z_0$ in exponential form. In fact, along a circle of radius $R$ centered at $z_0$ the modulus of $z - z_0$ is always constant and equal to $R$, whereas the argument ranges from 0 to $2\pi$. Therefore $z - z_0 = R e^{i\theta}$ and consequently $dz = iR e^{i\theta}\,d\theta$. The integral to evaluate thus transforms into:

$$ \int_C \frac{dz}{(z-z_0)^n} = \int_0^{2\pi} \frac{iR e^{i\theta}}{R^n e^{in\theta}}\,d\theta = iR^{1-n}\int_0^{2\pi} e^{i(1-n)\theta}\,d\theta = 0 \quad \forall n \neq 1. $$

The result is due to the fact that, as we have learned, the function $e^{i\theta}$ has a period of $2\pi$, namely $e^{i\theta} = e^{i(\theta+2k\pi)}$. If $n = 1$ we obtain instead:

$$ \int_C \frac{dz}{z-z_0} = i\int_0^{2\pi} d\theta = 2\pi i, $$

which could also be obtained as in Example 3.2.1, only with a parameterization extending over the interval $t \in [0, 2\pi]$.

We have seen in this example that the considered integral is zero in all cases except the one in which the center of the curve $C$ is a pole of order 1 (also called a simple pole) of the integrand. In this case the integral is given by the value $2\pi i$, which is independent of the radius of the considered circle.
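All four path integrals of Example 3.2.1, and the circle integral of Example 3.2.2, can be reproduced numerically straight from Eq. 3.11; the following numpy sketch (the helper path_integral is our own, a plain trapezoidal rule) takes $R = 2$:

    import numpy as np

    def path_integral(f, z, t0, t1, n=20001):
        t = np.linspace(t0, t1, n)
        zt = z(t)
        vals = f(zt) * np.gradient(zt, t)          # f(z(t)) z'(t)
        return np.sum(0.5*(vals[1:] + vals[:-1]) * np.diff(t))

    R = 2.0
    arc  = lambda t: R*np.sqrt(2)*np.exp(1j*t)     # gamma_1
    line = lambda t: R*t + 1j*R                    # gamma_2
    print(path_integral(lambda z: 1/z, arc, np.pi/4, 3*np.pi/4))   # ~ i*pi/2
    print(path_integral(lambda z: 1/z, line, 1, -1))               # ~ i*pi/2
    print(path_integral(np.imag, line, 1, -1))                     # ~ -2*R**2
    circle = lambda t: 0.5*np.exp(1j*t)            # circle around z0 = 0
    print(path_integral(lambda z: 1/z, circle, 0, 2*np.pi))        # ~ 2*pi*i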
This leads us to one of the most important results of complex analysis, namely Cauchy's integral theorem.

3.2.2 Cauchy's integral theorem

Before discussing Cauchy's integral theorem we have to define a simply connected domain. Referring to Fig. 3.4, we can assume that some function $f(z)$ is not defined, or is not differentiable, at the points $P_1$ and $P_2$ and in the region $R_1$, all concentrated in the right part of the Argand diagram (the region of positive real parts). If we take a generic closed curve in this region (for instance $\Gamma_1$), we cannot shrink it indefinitely while avoiding the singularities and holes. In the left side of the diagram instead (the region of negative real parts) we can take any generic closed curve (for instance the curve $\Gamma_2$) and shrink it to a point without leaving the region. If a region of the complex plane has the property that any closed curve lying in it can be shrunk to a point without leaving it, then the region is called simply connected. If a region is not simply connected but has a number of holes in it, then it is said to be multiply connected.

Figure 3.4: Simply connected and multiply connected regions.

Cauchy's integral theorem states that, if a function $f(z)$ is analytic in a simply connected region and on its boundary $C$, then

$$ \oint_C f(z)\,dz = 0. $$ (3.14)

Here and in the following we will denote by $\oint_C$ an integral around a closed contour $C$.

To demonstrate Cauchy's integral theorem we first have to show a result known as Green's theorem in the plane. It states that, given two functions $P(x,y)$ and $Q(x,y)$, continuous and with continuous partial derivatives inside a simply connected region $R$ of the $xy$-plane and on its boundary $C$, then:

$$ \oint_C (P\,dx + Q\,dy) = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy. $$ (3.15)

It relates therefore the surface integral over $R$ to the line integral along the contour $C$ enclosing $R$. To demonstrate this result we refer to Fig. 3.5. If $y_1(x)$ and $y_2(x)$ are the equations of the curves $STU$ and $SVU$ respectively, then for each value of $x_0$ between $a$ and $b$ the segment joining $y_1(x_0)$ with $y_2(x_0)$ represents a cut through the region $R$, and it is always $y_2(x_0) > y_1(x_0)$; therefore we can write:

$$ \iint_R \frac{\partial P}{\partial y}\,dx\,dy = \int_a^b dx \int_{y_1(x)}^{y_2(x)} \frac{\partial P}{\partial y}\,dy = \int_a^b \Bigl[ P(x,y) \Bigr]_{y=y_1(x)}^{y=y_2(x)}\,dx = \int_a^b P[x, y_2(x)]\,dx - \int_a^b P[x, y_1(x)]\,dx = -\oint_C P\,dx. $$

Figure 3.5: A simply connected region R bounded by the curve C.

The minus sign is due to the fact that we have gone through the curve $C$ clockwise, whereas it is conventionally assumed that the counterclockwise direction is the positive one (see also Example 3.2.2). We proceed in the same way by defining $x_1(y)$ and $x_2(y)$ as the equations of the curves $TSV$ and $TUV$, respectively. By construction it is always $x_2(y) > x_1(y)$. We have then:

$$ \iint_R \frac{\partial Q}{\partial x}\,dx\,dy = \int_c^d dy \int_{x_1(y)}^{x_2(y)} \frac{\partial Q}{\partial x}\,dx = \int_c^d \Bigl[ Q(x,y) \Bigr]_{x=x_1(y)}^{x=x_2(y)}\,dy = \int_c^d Q[x_2(y), y]\,dy - \int_c^d Q[x_1(y), y]\,dy = \oint_C Q\,dy. $$

If we now take $\iint_R \frac{\partial Q}{\partial x}\,dx\,dy - \iint_R \frac{\partial P}{\partial y}\,dx\,dy$, we obtain Green's theorem in the plane.

Cauchy's integral theorem is a simple application of Green's theorem in the plane. In fact we have:

$$ \oint_C f(z)\,dz = \oint_C (u\,dx - v\,dy) + i\oint_C (u\,dy + v\,dx) = \iint_R \left( \frac{\partial(-v)}{\partial x} - \frac{\partial u}{\partial y} \right) dx\,dy + i\iint_R \left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) dx\,dy. $$ (3.16)

For an analytic function we know that the Cauchy-Riemann conditions $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$ and $\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$ are satisfied, therefore both integrands in Eq. 3.16 are zero, which demonstrates Cauchy's integral theorem.
In the light of Cauchy's theorem, the result that emerged from Example 3.2.1, namely that the integral of the function $f(z) = \frac{1}{z}$ along the closed curve $\gamma = \gamma_1 - \gamma_2$ is zero, becomes obvious. In fact, the curve does not enclose the origin (which is a singularity), the function $f(z)$ is analytic in this region of the Argand diagram, and therefore the integral of $f(z)$ along $\gamma$ must be zero.

3.2.3 Cauchy's integral formula

A very important application of Cauchy's integral theorem is that, given a function $f(z)$ analytic everywhere except at an isolated singularity $z_0$, whatever curve we choose surrounding $z_0$, the integral of $f(z)$ along it will be equivalent to the integral along a circle $\gamma$ centered on $z_0$ and of arbitrarily small radius $\varepsilon$. To demonstrate this, we refer to Fig. 3.6. We want to demonstrate that the integral of the function $f$ along $C$ is equal to $\oint_\gamma f(z)\,dz$, where $\gamma$ is a small circle with radius $\varepsilon$ surrounding the singularity $z_0$. To do that we consider the curve $\Gamma$ in Fig. 3.6 (b): we cut the curve $C$ and we connect it through two parallel lines $r_1$ and $r_2$ to the small circle $\gamma$ enclosing $z_0$. If the separation $\delta$ between the two parallel straight lines $r_1$ and $r_2$ becomes infinitesimally small, then the curve $\Gamma$ can be obtained as $\Gamma = C + r_1 - \gamma + r_2$. Note the term $-\gamma$, due to the fact that the curve $\Gamma$ goes clockwise around $z_0$. But the curve $\Gamma$ does not contain the singularity $z_0$, therefore according to Cauchy's integral theorem:

$$ \oint_\Gamma f(z)\,dz = 0 \;\Rightarrow\; \oint_C f(z)\,dz + \int_{r_1} f(z)\,dz - \oint_\gamma f(z)\,dz + \int_{r_2} f(z)\,dz = 0. $$

Figure 3.6: Collapsing a contour C around a singularity z0.

But if the separation $\delta$ between $r_1$ and $r_2$ becomes infinitesimally small, the two lines will eventually lie on top of each other while being traversed in opposite directions. According to Eq. 3.12, this means that the contributions of $\int f(z)\,dz$ along these two lines cancel out. What remains is therefore:

$$ \oint_C f(z)\,dz = \oint_\gamma f(z)\,dz. $$ (3.17)

As an extension of this result, if a curve $C$ encloses $n$ holes or singularities $z_1, \dots, z_n$ of a given complex function $f(z)$, then:

$$ \oint_C f(z)\,dz = \sum_{i=1}^{n} \oint_{\gamma_i} f(z)\,dz, $$ (3.18)

where $\gamma_i$ is a circle of arbitrarily small radius $\varepsilon$ enclosing the singularity $z_i$. This can be seen in Fig. 3.7, where we show that we can transform the curve $C$ into a curve $\Gamma$ that does not contain the singularities $z_1, z_2, z_3$ but encircles them through the small circles $\gamma_1, \gamma_2, \gamma_3$ traversed clockwise; therefore $0 = \oint_\Gamma f(z)\,dz = \oint_C f(z)\,dz - \sum_{i=1}^{3}\oint_{\gamma_i} f(z)\,dz$, from which we obtain the result Eq. 3.18.

Figure 3.7: Collapsing a contour C around three singularities z1, z2, z3.

Cauchy's integral formula states that, if $f(z)$ is analytic within and on a closed contour $C$ and $z_0$ is a point within $C$, then:

$$ f(z_0) = \frac{1}{2\pi i}\oint_C \frac{f(z)}{z-z_0}\,dz. $$ (3.19)

The Cauchy integral formula relates therefore the value of a function $f(z)$ at a point $z_0$ to the complex integral of $f$ along a contour surrounding $z_0$. Because of Eq. 3.17, to demonstrate this result it is enough to demonstrate that the relation holds for a small circle $\gamma$ of radius $\varepsilon$ surrounding $z_0$. But any point $z$ on $\gamma$ is given by $z = z_0 + \varepsilon e^{i\theta}$, with $\theta \in [0, 2\pi]$. Moreover, $dz = i\varepsilon e^{i\theta}\,d\theta$, therefore we have:

$$ \oint_C \frac{f(z)}{z-z_0}\,dz = \oint_\gamma \frac{f(z)}{z-z_0}\,dz = \int_0^{2\pi} \frac{f\bigl(z_0 + \varepsilon e^{i\theta}\bigr)}{\varepsilon e^{i\theta}}\, i\varepsilon e^{i\theta}\,d\theta. $$

For $\varepsilon \to 0$ the integrand reduces to $i f(z_0)$, therefore we have:

$$ \oint_C \frac{f(z)}{z-z_0}\,dz = i\int_0^{2\pi} f(z_0)\,d\theta = 2\pi i\,f(z_0), $$

which demonstrates the Cauchy integral formula. If $f(z) = 1$, we recover the result $\oint \frac{dz}{z-z_0} = 2\pi i$ that we have already seen in Example 3.2.2. The Cauchy integral formula is very useful for evaluating complex integrals, as the next example demonstrates.
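Before the example, here is a numerical illustration of Eq. 3.19 (a numpy sketch with our own choice $f(z) = e^z$, $z_0 = 0.3 + 0.2i$, unit circle around $z_0$):

    import numpy as np

    z0 = 0.3 + 0.2j
    t = np.linspace(0.0, 2*np.pi, 20001)
    z = z0 + np.exp(1j*t)                          # circle of radius 1 about z0
    vals = np.exp(z)/(z - z0) * 1j*np.exp(1j*t)    # f(z)/(z-z0) * dz/dt
    integral = np.sum(0.5*(vals[1:] + vals[:-1]) * np.diff(t))
    print(integral/(2j*np.pi))    # ~ exp(z0), as the formula predicts
    print(np.exp(z0))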
Example 3.2.3 Find the integral of the function

$$ \frac{e^{z^2+1}}{z^2+1} $$

along the curves $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ shown in Fig. 3.8, namely:

• a generic closed curve containing the point $z_1 = i$ but not the point $z_2 = -i$ ($\Gamma_1$),

• a generic closed curve containing neither $z_1 = i$ nor $z_2 = -i$ ($\Gamma_2$),

• a generic closed curve containing the point $z_2 = -i$ but not the point $z_1 = i$ ($\Gamma_3$).

Figure 3.8: Curves analyzed in Example 3.2.3.

We have to evaluate the integral

$$ \oint_{\Gamma_i} \frac{e^{z^2+1}}{z^2+1}\,dz = \oint_{\Gamma_i} \frac{e^{z^2+1}}{(z+i)(z-i)}\,dz. $$

This function has simple poles at $z_1 = i$ and $z_2 = -i$. The curve $\Gamma_2$ can be shrunk to a point while avoiding the singularities, therefore:

$$ \oint_{\Gamma_2} \frac{e^{z^2+1}}{z^2+1}\,dz = 0. $$

Considering the curve $\Gamma_1$, it is clear that the function $g_1(z) = \frac{e^{z^2+1}}{z+i}$ has no singularities on and inside it, therefore we can apply the Cauchy integral formula and obtain:

$$ \oint_{\Gamma_1} \frac{e^{z^2+1}}{z^2+1}\,dz = \oint_{\Gamma_1} \frac{g_1(z)}{z-i}\,dz = 2\pi i\,g_1(i) = 2\pi i\,\frac{e^0}{2i} = \pi. $$

Analogously, on and inside the curve $\Gamma_3$ the function $g_2(z) = \frac{e^{z^2+1}}{z-i}$ has no singularities, therefore:

$$ \oint_{\Gamma_3} \frac{e^{z^2+1}}{z^2+1}\,dz = \oint_{\Gamma_3} \frac{g_2(z)}{z+i}\,dz = 2\pi i\,g_2(-i) = 2\pi i\,\frac{e^0}{-2i} = -\pi. $$

3.2.4 Cauchy's integral formula for higher derivatives

Recalling the definition of the derivative of a complex (differentiable) function at a point $z_0$:

$$ f'(z_0) = \lim_{\Delta z\to 0} \frac{f(z_0+\Delta z) - f(z_0)}{\Delta z}, $$

we can evaluate $f(z_0+\Delta z)$ and $f(z_0)$ by means of the Cauchy integral formula, using as contour any closed curve $C$ containing $z_0$ but enclosing no other singularities (we know that we can collapse this curve around $z_0$). We obtain:

$$ f'(z_0) = \lim_{\Delta z\to 0} \frac{1}{2\pi i\,\Delta z}\oint_C \left[ \frac{f(z)}{z-(z_0+\Delta z)} - \frac{f(z)}{z-z_0} \right] dz $$
$$ = \lim_{\Delta z\to 0} \frac{1}{2\pi i\,\Delta z}\oint_C \frac{(z-z_0)f(z) - (z-z_0)f(z) + \Delta z\,f(z)}{(z-z_0-\Delta z)(z-z_0)}\,dz $$
$$ = \lim_{\Delta z\to 0} \frac{1}{2\pi i}\oint_C \frac{f(z)}{(z-z_0-\Delta z)(z-z_0)}\,dz = \frac{1}{2\pi i}\oint_C \frac{f(z)}{(z-z_0)^2}\,dz. $$ (3.20)

We can go on with higher order derivatives and prove by induction that, given a closed curve $C$ containing $z_0$ and given a function $f(z)$ analytic on and inside $C$, then:

$$ f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_C \frac{f(z)}{(z-z_0)^{n+1}}\,dz. $$ (3.21)

This formula is known as the Cauchy integral formula for higher derivatives. This formula, too, is useful for evaluating complex integrals.

Example 3.2.4 Evaluate the integral

$$ \int_C \frac{e^{2z}}{\left(z - i\frac{\pi}{4}\right)^4}\,dz, $$

where $C: |z| = 1$ is the circle of radius 1 centered at the origin.

The only singularity of the integrand is a pole of order 4 at $z_0 = i\frac{\pi}{4}$. The path $C$ encloses it, and the function $g(z) = e^{2z}$ is analytic on and inside $C$, therefore we can apply the Cauchy integral formula for higher derivatives and obtain:

$$ \int_C \frac{e^{2z}}{\left(z - i\frac{\pi}{4}\right)^4}\,dz = g^{(3)}\!\left(i\frac{\pi}{4}\right)\frac{2\pi i}{3!} = 8 e^{i\pi/2}\cdot\frac{2\pi i}{3!} = -\frac{8\pi}{3}. $$
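The same result can be checked numerically by brute-force integration over the unit circle (a numpy sketch using a trapezoidal rule; the small residual imaginary part is discretization noise):

    import numpy as np

    t = np.linspace(0.0, 2*np.pi, 200001)
    z = np.exp(1j*t)
    vals = np.exp(2*z)/(z - 1j*np.pi/4)**4 * 1j*z   # dz/dt = i e^{it} = i z
    integral = np.sum(0.5*(vals[1:] + vals[:-1]) * np.diff(t))
    print(integral)          # ~ -8.3776 (+ tiny imaginary noise)
    print(-8*np.pi/3)        # -8.37758...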
3.2.5 Taylor and Laurent series

We are now in a position to demonstrate that analytic functions admit Taylor series expansions equivalent to the ones we know for real functions, namely:

$$ f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(z_0)}{n!}\,(z-z_0)^n. $$ (3.22)

In order to obtain this result we need to recall that, given a number $q$ with $|q| < 1$, $\frac{1}{1-q}$ can be obtained through the geometric series $1 + q + q^2 + \dots$, namely:

$$ \sum_{n=0}^{\infty} q^n = \frac{1}{1-q}. $$ (3.23)

We suppose now that a function $f(z)$ is analytic in a circle $R$ centered on $z_0$ (whose circumference we denote by $C$) and we want to evaluate $\sum_{n=0}^{\infty} f^{(n)}(z_0)\frac{(z-z_0)^n}{n!}$. The quantity $f^{(n)}(z_0)$ can be obtained by means of the Cauchy integral formula for higher derivatives (Eq. 3.21), therefore we have:

$$ \sum_{n=0}^{\infty} f^{(n)}(z_0)\,\frac{(z-z_0)^n}{n!} = \sum_{n=0}^{\infty} \frac{(z-z_0)^n}{n!}\,\frac{n!}{2\pi i}\oint_C \frac{f(\xi)}{(\xi-z_0)^{n+1}}\,d\xi. $$

We must notice here that $\xi$ lies on the circumference $C$, whereas $z$ is inside the circle, therefore $|z-z_0| < |\xi-z_0|$. We can now bring the sum inside the integral (and $\frac{1}{2\pi i}$ outside it) and obtain:

$$ \sum_{n=0}^{\infty} f^{(n)}(z_0)\,\frac{(z-z_0)^n}{n!} = \frac{1}{2\pi i}\oint_C \frac{f(\xi)}{\xi-z_0}\sum_{n=0}^{\infty}\left(\frac{z-z_0}{\xi-z_0}\right)^n d\xi. $$

Since $\frac{|z-z_0|}{|\xi-z_0|} < 1$ we can apply Eq. 3.23 to the quantity $q = \frac{z-z_0}{\xi-z_0}$ and obtain:

$$ \sum_{n=0}^{\infty} f^{(n)}(z_0)\,\frac{(z-z_0)^n}{n!} = \frac{1}{2\pi i}\oint_C \frac{1}{1-\frac{z-z_0}{\xi-z_0}}\,\frac{f(\xi)}{\xi-z_0}\,d\xi = \frac{1}{2\pi i}\oint_C \frac{f(\xi)}{\xi-z}\,d\xi = f(z). $$

The last step is due again to the Cauchy integral formula (Eq. 3.19), and this concludes the demonstration of Eq. 3.22. We see now why we have used so far the term analytic function to indicate functions differentiable in some domain $R$: exactly as for real functions, a complex function is analytic if it admits a convergent Taylor series expansion at each point of some domain $R$. In fact, given a point $z_0$ and a function $f(z)$ whose singularity closest to $z_0$ is $z_1$, the Taylor series expansion of $f(z)$ converges for all $z$ in the circle centered at $z_0$ with radius $|z_1 - z_0|$.

If a function $f(z)$ is not analytic in some domain $R$, we may still be able to expand it in a series. Let us consider first a function $f(z)$ having a pole of order $m$ at a point $z_0$. By the definition Eq. 3.3 of a pole, we know that an analytic function $g$ exists such that $f(z) = \frac{g(z)}{(z-z_0)^m}$. Since $g(z)$ is analytic, it admits a Taylor series expansion around $z_0$, therefore we have:

$$ f(z) = \frac{1}{(z-z_0)^m}\sum_{n=0}^{\infty} \frac{g^{(n)}(z_0)}{n!}\,(z-z_0)^n = \sum_{n=0}^{\infty} \frac{g^{(n)}(z_0)}{n!}\,(z-z_0)^{n-m} = \sum_{n=-m}^{\infty} \frac{g^{(n+m)}(z_0)}{(n+m)!}\,(z-z_0)^n = \sum_{n=-m}^{\infty} a_n (z-z_0)^n. $$ (3.24)

Such a series, which extends the Taylor series to negative indices, is called a Laurent series. To find the Laurent expansion of a function $f(z)$ about a point $z_0$ it is therefore enough to find the Taylor expansion of $g(z)$ about $z_0$ and divide it by $(z-z_0)^m$.

Example 3.2.5 Find the Laurent series of

$$ f(z) = \frac{1}{z(z-2)^3} $$

about the singularities $z = 0$ and $z = 2$ and find the corresponding residues.

We recall here that the function $(1+x)^\alpha$ can be expanded as:

$$ (1+x)^\alpha = \sum_{n=0}^{\infty} \frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-n+1)}{n!}\,x^n. $$

This relation holds also for $x, \alpha \in \mathbb{C}$. To get the Laurent series about $z = 0$ we notice that the function $g(z) = \frac{1}{(z-2)^3}$ is analytic in a neighborhood of $z = 0$, so we can expand it about $z = 0$. We rewrite the given function as:

$$ \frac{1}{z(z-2)^3} = -\frac{1}{8z}\,\frac{1}{\left(1-\frac{z}{2}\right)^3}. $$

Now we expand $\left(1-\frac{z}{2}\right)^{-3}$ and obtain:

$$ \left(1-\frac{z}{2}\right)^{-3} = 1 - (-3)\frac{z}{2} + \frac{(-3)(-4)}{2!}\,\frac{z^2}{4} - \frac{(-3)(-4)(-5)}{3!}\,\frac{z^3}{8} + \dots = 1 + \frac{3}{2}z + \frac{3}{2}z^2 + \frac{5}{4}z^3 + \dots $$

We have therefore:

$$ \frac{1}{z(z-2)^3} = -\frac{1}{8z}\left(1 + \frac{3}{2}z + \frac{3}{2}z^2 + \frac{5}{4}z^3 + \dots\right) = -\frac{1}{8z} - \frac{3}{16} - \frac{3}{16}z - \frac{5}{32}z^2 - \dots $$

The residue at $z = 0$ is therefore $-\frac{1}{8}$. To find the Laurent series about $z = 2$ we proceed in the same way, expanding the function $g(z) = \frac{1}{z}$ about $z = 2$. It is more convenient to write $\xi = z - 2$; in this way we have:

$$ \frac{1}{z(z-2)^3} = \frac{1}{(\xi+2)\xi^3} = \frac{1}{2\xi^3}\,\frac{1}{1+\frac{\xi}{2}}. $$

Now we can expand the function $\left(1+\frac{\xi}{2}\right)^{-1}$ with the geometric series, namely:

$$ \frac{1}{2\xi^3}\,\frac{1}{1+\frac{\xi}{2}} = \frac{1}{2\xi^3}\sum_{n=0}^{\infty}\left(-\frac{\xi}{2}\right)^n = \frac{1}{2\xi^3}\left(1 - \frac{\xi}{2} + \frac{\xi^2}{4} - \frac{\xi^3}{8} + \frac{\xi^4}{16} - \dots\right) = \frac{1}{2\xi^3} - \frac{1}{4\xi^2} + \frac{1}{8\xi} - \frac{1}{16} + \frac{\xi}{32} - \dots $$

Recalling that $\xi = z - 2$ we obtain the Laurent expansion about $z = 2$:

$$ \frac{1}{z(z-2)^3} = \frac{1}{2(z-2)^3} - \frac{1}{4(z-2)^2} + \frac{1}{8(z-2)} - \frac{1}{16} + \frac{z-2}{32} - \dots $$

The residue at $z = 2$ is therefore $\frac{1}{8}$.
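Both expansions and residues of Example 3.2.5 can be cross-checked with sympy, which handles Laurent series and residues directly:

    import sympy as sp

    z = sp.symbols('z')
    f = 1/(z*(z - 2)**3)
    print(sp.series(f, z, 0, 3))   # -1/(8*z) - 3/16 - 3*z/16 - 5*z**2/32 + O(z**3)
    print(sp.residue(f, z, 0))     # -1/8
    print(sp.residue(f, z, 2))     # 1/8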
Thanks to the Cauchy integral formula for higher derivatives we are also able to express the coefficients $a_n$ for each $n$. We have in fact:

$$ a_n = \frac{g^{(n+m)}(z_0)}{(n+m)!} = \frac{1}{2\pi i}\oint \frac{g(z)}{(z-z_0)^{n+m+1}}\,dz, $$

where the integral is taken along any contour enclosing $z_0$ but not enclosing any other singularity of $f(z)$ or $g(z)$. Recalling that $f(z) = \frac{g(z)}{(z-z_0)^m}$ we obtain:

$$ a_n = \frac{1}{2\pi i}\oint \frac{f(z)}{(z-z_0)^{n+1}}\,dz. $$ (3.25)

The part of the Laurent series with indices $n \geq 0$ is called the analytic part, whereas the remainder of the series, consisting of the negative indices, is called the principal part. Depending on the nature of the singularity $z_0$, the principal part may contain an infinite number of terms, so that:

$$ f(z) = \sum_{n=-\infty}^{\infty} a_n (z-z_0)^n. $$ (3.26)

We can use this equation to classify the nature of the point $z_0$:

• if, for some positive $m$, all the coefficients $a_i$ of the Laurent series Eq. 3.26 vanish for $i < m$ and $a_m \neq 0$, then $z_0$ is a zero of order $m$ of the function $f$;

• if $a_0 \neq 0$ but $a_i = 0$ for every negative $i$, then $f$ is analytic at $z_0$;

• if a positive integer $m$ exists such that $a_{-m} \neq 0$ but all the coefficients $a_{-i}$ of the Laurent series with $i > m$ are zero, then $z_0$ is a pole of order $m$ of the function $f$;

• if it is not possible to find such a lowest value $m$, then $f(z)$ is said to have an essential singularity at $z_0$.

3.2.6 Residue theorem

If $z_0$ is a pole of order $m$ of the function $f(z)$, then the coefficient $a_{-1}$ (not $a_{-m}$) is called the residue of the function $f(z)$ at the pole $z_0$, and it is usually indicated by $a_{-1}(z_0)$ or $R(z_0)$.

Example 3.2.6 Verify that $z = 0$ is a pole of the function $f(z) = \frac{\cos z}{z^3}$ and calculate its residue.

The cosine function has the Taylor series expansion:

$$ \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots $$

We have therefore:

$$ f(z) = \frac{1}{z^3}\left(1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots\right) = \frac{1}{z^3} - \frac{1}{2!\,z} + \frac{z}{4!} - \dots $$

We can see from this formula that $z = 0$ is indeed a pole (of order 3) of the given function and that the residue at $z = 0$ is $a_{-1} = -\frac{1}{2}$.

In many cases it is convenient not to reconstruct the whole Laurent series in order to find the residue. In fact, if a function $f(z)$ has a pole of order $m$ at a point $z_0$, we have:

$$ f(z) = \frac{a_{-m}}{(z-z_0)^m} + \cdots + \frac{a_{-1}}{z-z_0} + a_0 + a_1(z-z_0) + \dots $$

If we multiply both sides of this equation by $(z-z_0)^m$ we obtain:

$$ (z-z_0)^m f(z) = a_{-m} + \cdots + a_{-1}(z-z_0)^{m-1} + a_0(z-z_0)^m + \dots $$

Now we can differentiate both sides $m-1$ times and obtain:

$$ \frac{d^{m-1}}{dz^{m-1}}\left[ (z-z_0)^m f(z) \right] = (m-1)!\,a_{-1} + \sum_{n=1}^{\infty} b_n (z-z_0)^n, $$

for some coefficients $b_n$. If we now take the limit of both sides for $z \to z_0$, the terms in the sum disappear, so what remains is:

$$ \lim_{z\to z_0} \frac{d^{m-1}}{dz^{m-1}}\left[ (z-z_0)^m f(z) \right] = (m-1)!\,a_{-1} \;\Rightarrow\; a_{-1} = \frac{1}{(m-1)!}\lim_{z\to z_0}\frac{d^{m-1}}{dz^{m-1}}\left[ (z-z_0)^m f(z) \right]. $$ (3.27)

Considering Example 3.2.6, since $z = 0$ is a pole of order 3 we can calculate the residue in this way:

$$ a_{-1} = \frac{1}{2!}\lim_{z\to 0}\frac{d^2}{dz^2}\left[ z^3\,\frac{\cos z}{z^3} \right] = -\frac{1}{2}. $$

A special case of this equation occurs when $z_0$ is a simple pole. Then the residue $a_{-1}$ is given by:

$$ a_{-1} = \lim_{z\to z_0}\left[ (z-z_0)\,f(z) \right]. $$ (3.28)
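The formula Eq. 3.27 translates directly into sympy; applied to Example 3.2.6 (pole of order $m = 3$, so two differentiations and a division by $2!$) it reproduces the residue:

    import sympy as sp

    z = sp.symbols('z')
    f, z0, m = sp.cos(z)/z**3, 0, 3
    a_m1 = sp.limit(sp.diff((z - z0)**m * f, z, m - 1), z, z0) / sp.factorial(m - 1)
    print(a_m1)                    # -1/2
    print(sp.residue(f, z, 0))     # -1/2, direct cross-check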
Why are the residues so important? We have seen that, if a function $f(z)$ has a pole of order $m$ at a point $z_0$, then it can be written as a Laurent series about $z_0$: $f(z) = \sum_{n=-m}^{\infty} a_n (z-z_0)^n$. If we want to integrate $f(z)$ around a closed curve $C$ that encloses $z_0$ but no other singular points, we know (see Eq. 3.17 and Fig. 3.6) that this is equivalent to integrating $f$ around a small circle $\gamma$ about $z_0$, therefore:

$$ \oint_C f(z)\,dz = \oint_\gamma \sum_{n=-m}^{\infty} a_n (z-z_0)^n\,dz = \sum_{n=-m}^{\infty} a_n \oint_\gamma (z-z_0)^n\,dz. $$

But we have already seen (Example 3.2.2) that $\oint (z-z_0)^n\,dz$ along a circle around $z_0$ is zero for all exponents except $n = -1$, in which case it equals $2\pi i$. We have therefore:

$$ \oint_C f(z)\,dz = a_{-1}\oint_\gamma \frac{dz}{z-z_0} = 2\pi i\,a_{-1}. $$ (3.29)

We can extend the above argument to a contour $C$ enclosing $n$ poles of a function $f$ (see Eq. 3.18 and Fig. 3.7), obtaining:

$$ \oint_C f(z)\,dz = 2\pi i\sum_{i=1}^{n} R_i, $$ (3.30)

where $\sum_{i=1}^{n} R_i$ is the sum of the residues of $f(z)$ at its poles within $C$. This fundamental result of complex analysis is known as the residue theorem.

3.2.7 Real integrals using contour integration

A definite integral of the form

$$ \int_{-\infty}^{\infty} f(x)\,dx $$

can be evaluated by means of the residue theorem, provided that the function $f(z)$ (the complex function corresponding to the real function $f(x)$) has the following properties:

• $f(z)$ is analytic in the upper half-plane of the Argand diagram, except for a finite number of poles, none of which lies on the real axis;

• $z f(z) \to 0$ as $|z| \to \infty$.

If these two conditions are fulfilled, then we can evaluate the integral of the function $f(z)$ along the path indicated in Fig. 3.9, namely along a semicircle $\Gamma$ of radius $R$ (large enough to enclose all the poles) and along the $x$-axis, between the points $-R$ and $R$. The residue theorem ensures that this integral is given by $2\pi i$ times the sum of the residues of $f$ at the poles in the upper half-plane. To evaluate the integral along $\Gamma$, we notice that the modulus of each $z$ on $\Gamma$ is $R$. Since $z f(z) \to 0$ as $|z| = R \to \infty$, we also have that $|z f(z)| = R|f(z)|$ tends to 0 for $R \to \infty$. For this reason (since $z = R e^{i\theta}$, $\theta \in [0, \pi]$ on $\Gamma$):

$$ \int_\Gamma f(z)\,dz = \int_0^{\pi} f(z)\,iR e^{i\theta}\,d\theta \;\Rightarrow\; \lim_{R\to\infty}\left|\int_\Gamma f(z)\,dz\right| \leq \lim_{R\to\infty}\int_0^{\pi} R|f(z)|\,d\theta = 0. $$

Figure 3.9: Contour used in Example 3.2.7.

Therefore, if we take the contour shown in Fig. 3.9 and let $R \to \infty$, the integral along $\Gamma$ vanishes, $\int_{-R}^{R} f(z)\,dz$ becomes $\int_{-\infty}^{\infty} f(x)\,dx$ (we are on the $x$-axis, where $y = 0$ and therefore $z = x$), and what remains is:

$$ \int_{-\infty}^{\infty} f(x)\,dx = 2\pi i \sum_{\mathrm{Im}(z_j)>0} R_j, $$ (3.31)

namely we take the residues $R_j$ only at the poles $z_j$ that lie above the $x$-axis (with positive imaginary part). Of course there is no specific reason for choosing the upper half-plane; we might as well have chosen the lower half-plane, and the result would not have changed (but with the lower half-plane we have to go around $\Gamma$ clockwise, otherwise we would not recover $\int_{-R}^{R} f(x)\,dx$ on the $x$-axis).

Example 3.2.7 Evaluate the integral

$$ \int_0^{\infty} \frac{x^2}{(x^2+1)^3}\,dx. $$

We notice first that the integrand is even, therefore $\int_0^{\infty} \frac{x^2}{(x^2+1)^3}\,dx = \frac{1}{2}\int_{-\infty}^{\infty} \frac{x^2}{(x^2+1)^3}\,dx$. The function $f(z) = \frac{z^2}{(z^2+1)^3}$ has two poles (of order 3) at $z = i$ and $z = -i$ (neither lies on the $x$-axis), and clearly $z f(z) \to 0$ as $|z| \to \infty$; therefore we can evaluate the integral along the path of Fig. 3.9, the integral $\int_\Gamma f(z)\,dz$ tends to zero for $R \to \infty$, and we have:

$$ \int_{-\infty}^{\infty} \frac{x^2}{(x^2+1)^3}\,dx = 2\pi i\,R(i). $$

Since the pole at $z = i$ is of order 3, the residue is given by:

$$ R(i) = \frac{1}{2}\lim_{z\to i}\frac{d^2}{dz^2}\left[ (z-i)^3\,\frac{z^2}{(z^2+1)^3} \right] = \frac{1}{2}\lim_{z\to i}\frac{d^2}{dz^2}\,\frac{z^2}{(z+i)^3}. $$

We have:

$$ \frac{d}{dz}\,\frac{z^2}{(z+i)^3} = \frac{2z(z+i)^3 - 3(z+i)^2 z^2}{(z+i)^6} = \frac{2iz - z^2}{(z+i)^4}, $$
$$ \frac{d^2}{dz^2}\,\frac{z^2}{(z+i)^3} = \frac{2(i-z)(z+i)^4 - 4(z+i)^3(2iz-z^2)}{(z+i)^8} = \frac{-2(z^2+1) + 4(z^2-2iz)}{(z+i)^5}. $$

Substituting this value of the second derivative into the formula for the residue, we get:

$$ R(i) = \frac{1}{2}\,\frac{4(-1+2)}{(2i)^5} = -\frac{i}{16}. $$

In this way we obtain:

$$ \int_{-\infty}^{\infty} \frac{x^2}{(x^2+1)^3}\,dx = \frac{\pi}{8} \;\Rightarrow\; \int_0^{\infty} \frac{x^2}{(x^2+1)^3}\,dx = \frac{\pi}{16}. $$
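A direct numerical quadrature (scipy) confirms the value obtained by contour integration:

    from scipy.integrate import quad
    import numpy as np

    val, err = quad(lambda x: x**2/(x**2 + 1)**3, 0, np.inf)
    print(val, np.pi/16)    # ~ 0.19635 in both cases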
\[
R(i) = \lim_{z\to i}\frac{1}{2}\frac{d^2}{dz^2}\left[(z-i)^3\frac{z^2}{(z^2+1)^3}\right] = \frac{1}{2}\lim_{z\to i}\frac{d^2}{dz^2}\frac{z^2}{(z+i)^3}.
\]
We have:
\[
\frac{d}{dz}\frac{z^2}{(z+i)^3} = \frac{2z(z+i)^3 - 3(z+i)^2 z^2}{(z+i)^6} = \frac{2iz - z^2}{(z+i)^4}
\]
\[
\frac{d^2}{dz^2}\frac{z^2}{(z+i)^3} = \frac{2(i-z)(z+i)^4 - 4(z+i)^3(2iz-z^2)}{(z+i)^8} = \frac{-2(z^2+1) + 4(z^2-2iz)}{(z+i)^5}.
\]
Substituting this second derivative into the formula for the residue, we get:
\[
R(i) = \frac{1}{2}\cdot\frac{4(-1+2)}{(2i)^5} = -\frac{i}{16}.
\]
In this way we obtain:
\[
\int_{-\infty}^{\infty}\frac{x^2}{(x^2+1)^3}\,dx = \frac{\pi}{8}
\;\Rightarrow\;
\int_0^{\infty}\frac{x^2}{(x^2+1)^3}\,dx = \frac{\pi}{16}.
\]
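The result of Example 3.2.7 can be cross-checked symbolically. A minimal sketch (not part of the original text), assuming sympy is available:

```python
# Cross-checking Example 3.2.7 with sympy (assumed available).
import sympy as sp

z, x = sp.symbols('z x')
f = z**2 / (z**2 + 1)**3

# Residue at the order-3 pole z = i in the upper half-plane.
R_i = sp.residue(f, z, sp.I)
print(R_i)                                   # -I/16

# The contour argument gives 2*pi*I times this residue for the full real line.
print(sp.simplify(2 * sp.pi * sp.I * R_i))   # pi/8

# Direct evaluation of the half-line integral for comparison.
print(sp.integrate(x**2 / (x**2 + 1)**3, (x, 0, sp.oo)))   # pi/16
```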
In the case in which there are some simple poles on the $x$-axis, we can integrate the function along the curve in Fig. 3.10 (solid line). Recalling that in this case $f(z) = \frac{a_{-1}}{z-a} + P_n(z-a)$, the integral along the curve $\gamma$ can be evaluated taking $z = a + \varepsilon e^{i\theta}$ (and therefore $dz = i\varepsilon e^{i\theta}d\theta$), with $\theta \in [\pi, 0]$ (the curve rotates clockwise around $a$) and taking the limit for $\varepsilon \to 0$. In this way we have:
\[
\int_\gamma f(z)\,dz = \lim_{\varepsilon\to 0}\left[a_{-1}\int_\pi^0 \frac{i\varepsilon e^{i\theta}}{\varepsilon e^{i\theta}}\,d\theta + \int_\pi^0 P_n(\varepsilon e^{i\theta})\,i\varepsilon e^{i\theta}\,d\theta\right] = -i\pi a_{-1}. \tag{3.32}
\]
In fact, the integral of the polynomial vanishes for $\varepsilon \to 0$. Provided that the integral over the big half-circle $\Gamma$ vanishes for $R \to \infty$ (namely, provided that $|zf(z)| \to 0$ as $|z| \to \infty$), we have:
\[
\lim_{R\to\infty,\,\varepsilon\to 0}\left[\int_\Gamma f(z)\,dz + \int_\gamma f(z)\,dz + \int_{-R}^{R} f(x)\,dx\right] = 2\pi i\sum_{\mathrm{Im}(z_j)>0} R_j
\]
\[
\Rightarrow\;\int_{-\infty}^{\infty} f(x)\,dx = 2\pi i\sum_{\mathrm{Im}(z_j)>0} R_j + \pi i\sum_{\mathrm{Im}(z_k)=0} R_k, \tag{3.33}
\]
namely, besides the residues $R_j$ with positive imaginary parts, we have to add $\pi i$ times the residues $R_k$ along the $x$-axis.

[Figure 3.10: Contours to use in the case in which there are poles on the x-axis.]

What happens if we take the dashed contour of Fig. 3.10 instead (namely if we rotate counterclockwise around the singularity at $a$)? In this case, the integral over $\gamma_1$ of the function $f(z)$ is given by:
\[
\int_{\gamma_1} f(z)\,dz = \lim_{\varepsilon\to 0}\left[a_{-1}\int_\pi^{2\pi} \frac{i\varepsilon e^{i\theta}}{\varepsilon e^{i\theta}}\,d\theta + \int_\pi^{2\pi} P_n(\varepsilon e^{i\theta})\,i\varepsilon e^{i\theta}\,d\theta\right] = i\pi a_{-1},
\]
namely it has the opposite sign compared to Eq. 3.32 because of the different direction. However, in this case $a$ lies within the big contour we use for the integration of the function $f(z)$, therefore the residue theorem tells us that the integral is given by $2\pi i$ times the residues of all the singularities on the $x$-axis and above it, namely:
\[
\lim_{R\to\infty,\,\varepsilon\to 0}\left[\int_\Gamma f(z)\,dz + \int_{\gamma_1} f(z)\,dz + \int_{-R}^{R} f(x)\,dx\right] = 2\pi i\sum_{\mathrm{Im}(z_j)\ge 0} R_j
\]
\[
\Rightarrow\;\int_{-\infty}^{\infty} f(x)\,dx = 2\pi i\sum_{\mathrm{Im}(z_j)\ge 0} R_j - \pi i\sum_{\mathrm{Im}(z_k)=0} R_k.
\]
Of course, the two methods produce the same result. In the end, a simple pole on the $x$-axis is counted as one-half of what it would be if it were above the axis.

In the case we have to evaluate integrals of the form $\int_{-\infty}^{\infty}\frac{\sin(mx)}{f(x)}\,dx$ or $\int_{-\infty}^{\infty}\frac{\cos(mx)}{f(x)}\,dx$, Jordan's lemma is often useful. It states that, if a function $f(z)$ is analytic in the upper half-plane of the Argand diagram ($\mathrm{Im}(z) > 0$) except for a finite number of poles and if $|f(z)| \to 0$ for $|z| \to \infty$, then for each $m > 0$ we have:
\[
\lim_{R\to\infty}\int_\Gamma e^{imz} f(z)\,dz = 0, \tag{3.34}
\]
where $\Gamma$ is as usual the semicircular contour of radius $R$ in the upper half-plane, centered on the origin.

Example 3.2.8 Evaluate the integral
\[
\int_{-\infty}^{\infty}\frac{\cos mx}{x-a}\,dx, \qquad a \in \mathbb{R},\; a > 0.
\]
To evaluate this integral, we will evaluate the integral of the function $f(z) = \frac{e^{imz}}{z-a}$ along the curve indicated in Fig. 3.10 (solid line). For $\varepsilon$ (the radius of the semicircular contour $\gamma$ centered on $a$) $\to 0$ this integral tends to:
\[
\int_C f(z)\,dz = \int_\Gamma f(z)\,dz + \int_{-R}^{R} f(z)\,dz + \int_\gamma f(z)\,dz.
\]
The function $f(z)$ has a simple pole at $z = a$, therefore, according to Eq. 3.32:
\[
\int_\gamma f(z)\,dz = -i\pi a_{-1},
\]
where the minus sign is due to the fact that the chosen curve rotates clockwise around $a$. The residue $a_{-1}$ at $z = a$ is given by:
\[
a_{-1} = \lim_{z\to a}(z-a)\frac{e^{imz}}{z-a} = e^{ima},
\]
therefore:
\[
\int_\gamma f(z)\,dz = -i\pi e^{ima}.
\]
Since $|(z-a)^{-1}| \to 0$ for $|z| \to \infty$, Jordan's lemma ensures that $\int_\Gamma f(z)\,dz \to 0$ for $R \to \infty$. What remains is therefore:
\[
\int_{-\infty}^{\infty}\frac{e^{imx}}{x-a}\,dx = i\pi e^{ima}.
\]
Note that this contour lies on the $x$-axis, therefore $z = x$. If we now take the real parts of both sides of this equation we obtain:
\[
\int_{-\infty}^{\infty}\frac{\cos mx}{x-a}\,dx = -\pi\sin ma.
\]

[Figure 3.11: Contour to use in the case in which there is a branch point at the origin.]

We have seen at the beginning of this Chapter that some functions (like $\sqrt{z}$ or $\ln z$) are multiple-valued and we need a branch cut to make them single-valued. The branch cut can be used to evaluate integrals involving this kind of functions. Usually the contour to use in this case is the one in Fig. 3.11, namely we evaluate the integral on the big circle $\Gamma$ (letting the radius $R$ go to infinity), on the small circle $\gamma$ (letting the radius $\varepsilon$ go to zero) and on the two straight lines $r_1$ and $r_2$ (noticing that, since we are dealing with multiple-valued functions, these two integrals do not cancel out).

Example 3.2.9 Evaluate the integral
\[
\int_0^\infty \frac{\sqrt{x}}{(x+1)^2}\,dx.
\]
The function $f(z) = \frac{\sqrt{z}}{(z+1)^2}$ has a branch point at $z = 0$ and a pole (of order 2) at $z = -1$. We can evaluate the integral of $f(z)$ along the curve $C = \Gamma - r_2 - \gamma + r_1$ indicated in Fig. 3.11. The integral will be $2\pi i$ times the residue at the pole $z = -1$, namely:
\[
\oint_C f(z)\,dz = 2\pi i\lim_{z\to -1}\frac{d}{dz}\left[(z+1)^2\frac{\sqrt{z}}{(z+1)^2}\right] = 2\pi i\,\frac{1}{2\sqrt{-1}} = \pi.
\]
To evaluate the integrals along $\Gamma$ and $\gamma$ we set $z = Re^{i\theta}$ for $\Gamma$ and $z = \varepsilon e^{i\theta}$ for $\gamma$ and we let $R \to \infty$ and $\varepsilon \to 0$. We obtain:
\[
\oint_\Gamma f(z)\,dz = \int_0^{2\pi}\frac{\sqrt{R}\,e^{i\theta/2}}{(Re^{i\theta}+1)^2}\,iRe^{i\theta}\,d\theta \xrightarrow{R\to\infty} 0,
\qquad
\oint_\gamma f(z)\,dz = \int_0^{2\pi}\frac{\sqrt{\varepsilon}\,e^{i\theta/2}}{(\varepsilon e^{i\theta}+1)^2}\,i\varepsilon e^{i\theta}\,d\theta \xrightarrow{\varepsilon\to 0} 0.
\]
To evaluate the integrals along the lines $r_1$ and $r_2$ the easiest thing to do is to substitute $z = xe^{0i}$ along $r_1$ and $z = xe^{2\pi i}$ along $r_2$. In fact, the two curves will eventually overlap with the $x$-axis and therefore $|z| \to x$ in both cases, but the argument is different because the function is multiple-valued. We have therefore:
\[
\int_0^\infty \frac{\sqrt{x}\,e^{0i}}{(xe^{0i}+1)^2}\,dx + \int_\infty^0 \frac{\sqrt{x}\,e^{\pi i}}{(xe^{2\pi i}+1)^2}\,dx = \pi.
\]
The first integral is equal to $\int_0^\infty \frac{\sqrt{x}}{(x+1)^2}\,dx$; in the second integral the denominator is again $(x+1)^2$, but the numerator is $\sqrt{x}\,e^{i\pi} = -\sqrt{x}$, and the reversed orientation cancels this sign. We have therefore:
\[
\int_0^\infty\frac{\sqrt{x}}{(x+1)^2}\,dx + \int_0^\infty\frac{\sqrt{x}}{(x+1)^2}\,dx = \pi
\;\Rightarrow\;
\int_0^\infty\frac{\sqrt{x}}{(x+1)^2}\,dx = \frac{\pi}{2}.
\]

Another important application of the residue theorem is to evaluate integrals of the form
\[
\int_0^{2\pi} F(\sin\theta, \cos\theta)\,d\theta.
\]
In this case we can transform this integral into a complex integral around the unit circle $C$ with $|z| = 1$. In fact, along this curve $z = e^{i\theta}$, therefore $dz = ie^{i\theta}d\theta$, namely:
\[
d\theta = \frac{dz}{iz}.
\]
Recalling then the definitions of sine and cosine, the given integral can be transformed into:
\[
\int_0^{2\pi} F(\sin\theta, \cos\theta)\,d\theta = \oint_C F\!\left(\frac{z-z^{-1}}{2i}, \frac{z+z^{-1}}{2}\right)\frac{dz}{iz}, \tag{3.35}
\]
which can be evaluated by checking how many singularities of the integrand lie inside $C$ and calculating their residues.

Example 3.2.10 Calculate the integral
\[
\int_0^{2\pi}\frac{d\theta}{\sin\theta}.
\]
We can transform this integral into an integral along the circle of radius 1, obtaining:
\[
\int_0^{2\pi}\frac{d\theta}{\sin\theta} = \oint_C \frac{2i}{z-z^{-1}}\,\frac{dz}{iz} = 2\oint_C\frac{dz}{z^2-1}.
\]
The resulting function has (simple) poles at $z = 1$ and $z = -1$. The residues at these points are:
\[
R(-1) = \lim_{z\to -1}(z+1)\frac{1}{z^2-1} = -\frac{1}{2},
\qquad
R(1) = \lim_{z\to 1}(z-1)\frac{1}{z^2-1} = \frac{1}{2}.
\]
We have therefore:
\[
\int_0^{2\pi}\frac{d\theta}{\sin\theta} = 4\pi i\left(-\frac{1}{2}+\frac{1}{2}\right) = 0.
\]
(Since these poles lie on the contour itself, the result is to be understood as a Cauchy principal value: the real integrand $1/\sin\theta$ is singular at $\theta = 0$ and $\theta = \pi$.)
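As an illustration of Eq. 3.35 in code, the following minimal sympy sketch (not from the original text) evaluates $\int_0^{2\pi}\frac{d\theta}{2+\cos\theta}$; this integrand is chosen here because, unlike the one of Example 3.2.10, it has no poles on the unit circle itself.

```python
# Evaluating a trigonometric integral via the unit-circle substitution of
# Eq. 3.35, using sympy (assumed available). Integrand: 1/(2 + cos(theta)).
import sympy as sp

z, theta = sp.symbols('z theta')

# cos(theta) -> (z + 1/z)/2 and d(theta) -> dz/(i z) turn the integral into
# a contour integral of -2*I/(z**2 + 4*z + 1) around |z| = 1.
g = sp.simplify(1 / (2 + (z + 1/z)/2) / (sp.I * z))

# Poles at z = -2 +/- sqrt(3); only z = -2 + sqrt(3) lies inside the circle.
pole_in = -2 + sp.sqrt(3)
integral = 2 * sp.pi * sp.I * sp.residue(g, z, pole_in)
print(sp.simplify(integral))     # 2*sqrt(3)*pi/3, i.e. 2*pi/sqrt(3)

# Direct check against the real integral.
print(sp.integrate(1/(2 + sp.cos(theta)), (theta, 0, 2*sp.pi)))
```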
Chapter 4

Integral transforms

In mathematics, an integral transform is any transform $T$ of a given function $f$ of the following form:
\[
Tf(s) = \int_{x_1}^{x_2} K(x,s)f(x)\,dx. \tag{4.1}
\]
The input is a function $f(x)$ and the output is another function $Tf(s)$. There are different integral transforms, depending on the kernel function $K(x,s)$. The transforms we consider in this chapter are the Laplace transform and the Fourier transform.

4.1 Laplace transform

4.1.1 Basic definition and properties

To obtain the Laplace transform of a given function $f(x)$ we use the kernel $K(x,s) = e^{-sx}$, namely:
\[
\mathcal{L}\{f\} = F(s) = \int_0^\infty f(x)e^{-sx}\,dx. \tag{4.2}
\]
Here $s$ can also be a complex variable, namely the Laplace transform maps a real function to a complex one. For our purposes it is enough to consider $s$ real for the moment. We can easily verify that $\mathcal{L}$ is a linear operator. In fact:
\[
\mathcal{L}\{af+bg\} = \int_0^\infty [af(x)+bg(x)]e^{-sx}\,dx = a\int_0^\infty f(x)e^{-sx}\,dx + b\int_0^\infty g(x)e^{-sx}\,dx
\]
\[
\Rightarrow\;\mathcal{L}\{af+bg\} = a\mathcal{L}\{f\} + b\mathcal{L}\{g\}. \tag{4.3}
\]

Example 4.1.1 Find the Laplace transform of the function $f(x) = 1$.

It is
\[
\mathcal{L}\{1\} = \int_0^\infty e^{-sx}\,dx = \left[-\frac{1}{s}e^{-sx}\right]_0^\infty = \frac{1}{s}.
\]

Example 4.1.2 Find the Laplace transform of $f(x) = x^n$, with $n$ a positive integer.

We integrate by parts and obtain:
\[
\mathcal{L}\{x^n\} = \int_0^\infty x^n e^{-sx}\,dx = \left[-\frac{1}{s}x^n e^{-sx}\right]_0^\infty + \frac{n}{s}\int_0^\infty x^{n-1}e^{-sx}\,dx = \frac{n}{s}\mathcal{L}\{x^{n-1}\}.
\]
To obtain $\mathcal{L}\{x^{n-1}\}$ we proceed in the same way and obtain $\mathcal{L}\{x^{n-1}\} = \frac{n-1}{s}\mathcal{L}\{x^{n-2}\}$. We iterate $n$ times and obtain:
\[
\mathcal{L}\{x^n\} = \frac{n(n-1)(n-2)\cdots}{s^n}\mathcal{L}\{1\} = \frac{n!}{s^{n+1}}.
\]

Example 4.1.3 Find the Laplace transform of $f(x) = \sin(mx)$.

It is $\mathcal{L}\{f(x)\} = \int_0^\infty e^{-sx}\sin(mx)\,dx$. By using the relation $\sin(mx) = \frac{e^{imx}-e^{-imx}}{2i}$ we obtain:
\[
\mathcal{L}\{f(x)\} = \frac{1}{2i}\left[\int_0^\infty e^{(im-s)x}\,dx - \int_0^\infty e^{-(im+s)x}\,dx\right]
= \frac{1}{2i}\left\{\left[\frac{e^{(im-s)x}}{im-s}\right]_0^\infty - \left[\frac{e^{-(im+s)x}}{-im-s}\right]_0^\infty\right\}
= \frac{1}{2i}\left(\frac{1}{s-im} - \frac{1}{s+im}\right) = \frac{m}{s^2+m^2}.
\]
In these three simple cases, it was clear that the integral 4.2 converges for any positive value of $s$; in fact $\lim_{x\to\infty} x^n e^{-sx} = 0$ for all $s > 0$ and all $n$. This is not always the case, as the two following examples show.

Example 4.1.4 Find the Laplace transform of $f(x) = e^{ax}$.
\[
\mathcal{L}\{e^{ax}\} = \int_0^\infty e^{ax}e^{-sx}\,dx = \lim_{A\to\infty}\int_0^A e^{(a-s)x}\,dx = \lim_{A\to\infty}\left[\frac{e^{(a-s)x}}{a-s}\right]_0^A.
\]
It is clear that this limit exists and is finite only if $a < s$ ($a < \mathrm{Re}(s)$ if $s \in \mathbb{C}$), namely we can define the Laplace transform of the function $f(x) = e^{ax}$ only if $\mathrm{Re}(s) > a$. In this case it is:
\[
\mathcal{L}\{e^{ax}\} = \frac{1}{s-a}.
\]

Example 4.1.5 Find the Laplace transform of the function $f(x) = \cosh(mx)$.

It is $\mathcal{L}\{f(x)\} = \int_0^\infty e^{-sx}\cosh(mx)\,dx$. By using the relation $\cosh(mx) = \frac{e^{mx}+e^{-mx}}{2}$ we obtain:
\[
\mathcal{L}\{f(x)\} = \frac{1}{2}\left[\int_0^\infty e^{(m-s)x}\,dx + \int_0^\infty e^{-(m+s)x}\,dx\right]
= \frac{1}{2}\left\{\left[\frac{e^{(m-s)x}}{m-s}\right]_0^\infty - \left[\frac{e^{-(m+s)x}}{m+s}\right]_0^\infty\right\}
= \frac{1}{2}\left(\frac{1}{s-m} + \frac{1}{s+m}\right) = \frac{s}{s^2-m^2}.
\]
This result holds as long as $e^{(m-s)x}$ and $e^{-(m+s)x}$ tend to zero for $x \to \infty$, namely it must be $s > |m|$.
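The transforms of Examples 4.1.1–4.1.5 can be reproduced with sympy's laplace_transform; the following is a minimal sketch (not part of the original text), assuming sympy is available.

```python
# Minimal sketch of Examples 4.1.1-4.1.5 via sympy (assumed available).
import sympy as sp

x, s, m, a = sp.symbols('x s m a', positive=True)

print(sp.laplace_transform(sp.S(1), x, s, noconds=True))       # 1/s
print(sp.laplace_transform(x**3, x, s, noconds=True))          # 6/s**4
print(sp.laplace_transform(sp.sin(m*x), x, s, noconds=True))   # m/(m**2 + s**2)
print(sp.laplace_transform(sp.exp(a*x), x, s, noconds=True))   # 1/(s - a), valid for s > a
print(sp.laplace_transform(sp.cosh(m*x), x, s, noconds=True))  # s/(s**2 - m**2), valid for s > m
```

Without `noconds=True`, laplace_transform also returns the convergence abscissa, matching the conditions derived above.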
There are a few properties of the Laplace transform that help us find the transform of more complex functions. If we know that $F(s)$ is the Laplace transform of $f(x)$, namely that $\mathcal{L}\{f(x)\} = F(s)$, then:

• \[
\mathcal{L}\{e^{cx}f(x)\} = F(s-c). \tag{4.4}
\]
This property comes directly from the definition of the Laplace transform; in fact:
\[
\mathcal{L}\{e^{cx}f(x)\} = \int_0^\infty f(x)e^{cx}e^{-sx}\,dx = \int_0^\infty f(x)e^{-(s-c)x}\,dx = F(s-c).
\]

• \[
\mathcal{L}\{f(cx)\} = \frac{1}{c}F\!\left(\frac{s}{c}\right), \qquad (c > 0). \tag{4.5}
\]
To show this it is enough to substitute $cx$ with $t$. In this way $x = \frac{t}{c}$, $dx = \frac{dt}{c}$ and therefore:
\[
\mathcal{L}\{f(cx)\} = \int_0^\infty e^{-sx}f(cx)\,dx = \frac{1}{c}\int_0^\infty e^{-\frac{s}{c}t}f(t)\,dt = \frac{1}{c}F\!\left(\frac{s}{c}\right).
\]

• \[
\mathcal{L}\{u_c(x)f(x-c)\} = e^{-sc}F(s). \tag{4.6}
\]
Here $u_c(x)$ is the Heaviside or step function, namely:
\[
u_c(x) = \begin{cases} 0 & x < c \\ 1 & x \ge c \end{cases} \tag{4.7}
\]
The function $u_c(x)f(x-c)$ is thus given by:
\[
u_c(x)f(x-c) = \begin{cases} 0 & x < c \\ f(x-c) & x \ge c \end{cases}
\]
We have thus:
\[
\mathcal{L}\{u_c(x)f(x-c)\} = \int_c^\infty e^{-sx}f(x-c)\,dx.
\]
With the substitution $t = x - c$ we obtain:
\[
\mathcal{L}\{u_c(x)f(x-c)\} = \int_0^\infty e^{-s(c+t)}f(t)\,dt = e^{-sc}F(s).
\]

• \[
\mathcal{L}\{x^n f(x)\} = (-1)^n F^{(n)}(s). \tag{4.8}
\]
It is enough to differentiate $F(s)$ with respect to $s$ to obtain:
\[
F'(s) = \frac{d}{ds}\int_0^\infty e^{-sx}f(x)\,dx = -\int_0^\infty xe^{-sx}f(x)\,dx = -\mathcal{L}\{xf(x)\}.
\]
If we now differentiate $F(s)$ $n$ times with respect to $s$ we obtain:
\[
F^{(n)}(s) = (-1)^n\mathcal{L}\{x^n f(x)\},
\]
from which Eq. 4.8 is readily obtained.

• \[
\mathcal{L}\{f'(x)\} = -f(0) + sF(s). \tag{4.9}
\]
This property can be obtained by integrating $e^{-sx}f'(x)$ by parts, namely:
\[
\mathcal{L}\{f'(x)\} = \int_0^\infty e^{-sx}f'(x)\,dx = \left[f(x)e^{-sx}\right]_0^\infty + s\int_0^\infty e^{-sx}f(x)\,dx = -f(0) + sF(s).
\]

Example 4.1.6 Find the Laplace transform of $\cos(mx)$.

We could calculate this transform directly, but it is easier to use the Laplace transform of $\sin(mx)$ that we have calculated in Example 4.1.3 ($\mathcal{L}\{\sin(mx)\} = \frac{m}{s^2+m^2}$). From Eq. 4.9 (and remembering that $\mathcal{L}$ is a linear operator) we have:
\[
\mathcal{L}\left\{\frac{d}{dx}\sin(mx)\right\} = m\mathcal{L}\{\cos(mx)\} = -\sin(0) + s\cdot\frac{m}{s^2+m^2}
\;\Rightarrow\;
\mathcal{L}\{\cos(mx)\} = \frac{s}{s^2+m^2}.
\]

Example 4.1.7 Find the Laplace transform of $x\cosh(mx)$.

We recall from Example 4.1.5 that $\mathcal{L}\{\cosh(mx)\} = F(s) = \frac{s}{s^2-m^2}$ ($s > |m|$). Eq. 4.8 tells us that $F'(s)$ is the Laplace transform of $-x\cosh(mx)$. We have therefore:
\[
\mathcal{L}\{x\cosh(mx)\} = -F'(s) = -\frac{s^2 - m^2 - 2s^2}{(s^2-m^2)^2} = \frac{s^2+m^2}{(s^2-m^2)^2}.
\]

Example 4.1.8 Find the Laplace transform of the function $f(x)$ defined in this way:
\[
f(x) = \begin{cases} x & x < \pi \\ x - \cos(x-\pi) & x \ge \pi \end{cases}
\]
By means of the step function (Eq. 4.7) we can rewrite $f(x)$ as $f(x) = x - u_\pi(x)\cos(x-\pi)$. The Laplace transform of this function can be found by means of Eq. 4.6 and of the known results $\mathcal{L}\{x^n\} = \frac{n!}{s^{n+1}}$ (Example 4.1.2) and $\mathcal{L}\{\cos(mx)\} = \frac{s}{s^2+m^2}$ (Example 4.1.6):
\[
\mathcal{L}\{f(x)\} = \mathcal{L}\{x\} - \mathcal{L}\{u_\pi(x)\cos(x-\pi)\} = \frac{1}{s^2} - e^{-\pi s}\mathcal{L}\{\cos x\} = \frac{1}{s^2} - \frac{se^{-\pi s}}{s^2+1}.
\]
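The shift and step-function properties can be checked symbolically as well; a minimal sketch (not part of the original text), assuming a sympy version whose laplace_transform handles Heaviside factors:

```python
# Minimal sketch of the shift property (Eq. 4.4) and the step-function
# property (Eq. 4.6) via sympy (assumed available).
import sympy as sp

x, s = sp.symbols('x s', positive=True)

# Eq. 4.4 with f(x) = sin(x), c = 2: expect F(s - 2) = 1/((s - 2)**2 + 1),
# possibly printed in expanded form.
print(sp.laplace_transform(sp.exp(2*x)*sp.sin(x), x, s, noconds=True))

# Eq. 4.6 with f(x) = sin(x), c = 2: expect exp(-2*s)/(s**2 + 1).
print(sp.laplace_transform(sp.Heaviside(x - 2)*sp.sin(x - 2), x, s, noconds=True))
```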
4.1.2 Solution of initial value problems by means of Laplace transforms

We have seen (Eq. 4.9) that the Laplace transform of the derivative of a function is given by $\mathcal{L}\{f'(x)\} = -f(0) + sF(s)$, where $F(s) = \mathcal{L}\{f(x)\}$. If we consider the Laplace transforms of higher order derivatives we obtain (always integrating by parts):
\[
\mathcal{L}\{f''(x)\} = \int_0^\infty e^{-sx}f''(x)\,dx = \left[e^{-sx}f'(x)\right]_0^\infty + s\int_0^\infty e^{-sx}f'(x)\,dx = -f'(0) - sf(0) + s^2F(s)
\]
\[
\mathcal{L}\{f'''(x)\} = \int_0^\infty e^{-sx}f'''(x)\,dx = \left[e^{-sx}f''(x)\right]_0^\infty + s\int_0^\infty e^{-sx}f''(x)\,dx = -f''(0) - sf'(0) - s^2f(0) + s^3F(s)
\]
\[
\vdots
\]
\[
\mathcal{L}\{f^{(n)}(x)\} = s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - \dots - sf^{(n-2)}(0) - f^{(n-1)}(0). \tag{4.10}
\]
This result allows us to simplify linear ODEs considerably. Let us take for instance an initial value problem consisting of a second-order inhomogeneous ODE with constant coefficients (but the method can be applied also to more complex ODEs):
\[
a_2y''(x) + a_1y'(x) + a_0y(x) = f(x), \qquad y(0) = y_0, \quad y'(0) = y_0'.
\]
If we now take the Laplace transform of both members of this equation (calling $Y(s)$ the Laplace transform of $y(x)$ and $F(s)$ the Laplace transform of $f(x)$), we obtain:
\[
a_2\left[s^2Y(s) - sy_0 - y_0'\right] + a_1\left[sY(s) - y_0\right] + a_0Y(s) = F(s)
\]
\[
\Rightarrow\; Y(s)\left(a_2s^2 + a_1s + a_0\right) = F(s) + a_1y_0 + a_2(sy_0 + y_0')
\;\Rightarrow\;
Y(s) = \frac{F(s) + a_1y_0 + a_2(sy_0 + y_0')}{a_2s^2 + a_1s + a_0}. \tag{4.11}
\]
Namely, we have transformed a differential equation into an algebraic one, which is of course easier to solve. Moreover, the particular solution (satisfying the given initial conditions) is automatically found, without the need to find the general solution first and then look for the coefficients that satisfy the initial conditions. Further, homogeneous and inhomogeneous ODEs are handled in exactly the same way; it is not necessary to solve the corresponding homogeneous ODE first. The price to pay for these advantages is that Eq. 4.11 is not yet the solution of the given ODE; we must invert this relation and find the function $y(x)$ whose Laplace transform is $Y(s)$. This function is called the inverse Laplace transform of $Y(s)$ and is indicated with $\mathcal{L}^{-1}\{Y(s)\}$.

Since the operator $\mathcal{L}$ is linear, it is easy to show that the inverse operator $\mathcal{L}^{-1}$ is also linear. In fact, given two functions $f_1(x)$ and $f_2(x)$ whose Laplace transforms are $F_1(s)$ and $F_2(s)$, respectively, the linearity of the operator $\mathcal{L}$ ensures that:
\[
\mathcal{L}\{c_1f_1(x) + c_2f_2(x)\} = c_1F_1(s) + c_2F_2(s).
\]
If we apply the operator $\mathcal{L}^{-1}$ to both members of this equation we obtain:
\[
c_1f_1(x) + c_2f_2(x) = \mathcal{L}^{-1}\{c_1F_1(s) + c_2F_2(s)\} = c_1\mathcal{L}^{-1}\{F_1(s)\} + c_2\mathcal{L}^{-1}\{F_2(s)\}.
\]
To invert a function $F(s)$ it is therefore enough to split it into many (possibly simple) addends and find the inverse Laplace transform of each of them. Based on the examples in Sect. 4.1.1 (and others that we do not have time to calculate, but that can be found in the mathematical literature) it is possible to construct a "dictionary" of basic functions/expressions and corresponding Laplace transforms, as in Table 4.1. Any time we face a particular $F(s)$, we can look at the dictionary and check whether it is possible to recover the function $f(x)$ whose Laplace transform is $F(s)$.

Table 4.1: Summary of elementary Laplace transforms

  f(x) = L^{-1}{F(s)}        F(s) = L{f(x)}                                   Convergence
  1                          1/s                                              s > 0
  e^{mx}                     1/(s-m)                                          s > m
  x^n                        n!/s^{n+1}                                       s > 0
  sin(mx)                    m/(s^2+m^2)                                      s > 0
  cos(mx)                    s/(s^2+m^2)                                      s > 0
  sinh(mx)                   m/(s^2-m^2)                                      s > m
  cosh(mx)                   s/(s^2-m^2)                                      s > m
  e^{mx} sin(px)             p/((s-m)^2+p^2)                                  s > m
  e^{mx} cos(px)             (s-m)/((s-m)^2+p^2)                              s > m
  x^n e^{mx}                 n!/(s-m)^{n+1}                                   s > m
  x^{-1/2}                   sqrt(pi/s)                                       s > 0
  sqrt(x)                    (1/2) sqrt(pi/s^3)                               s > 0
  delta(x-c)                 e^{-cs}                                          c > 0
  u_c(x)                     e^{-cs}/s                                        s > 0
  u_c(x) f(x-c)              e^{-cs} F(s)
  e^{cx} f(x)                F(s-c)
  f(cx)                      (1/c) F(s/c)                                     c > 0
  int_0^x f(x~) dx~          F(s)/s
  int_0^x f(x-xi) g(xi) dxi  F(s) G(s)
  (-1)^n x^n f(x)            F^{(n)}(s)
  f^{(n)}(x)                 s^n F(s) - s^{n-1} f(0) - ... - f^{(n-1)}(0)

Example 4.1.9 Find the inverse Laplace transform of the function
\[
F(s) = \frac{s^2+5}{s^3-9s}.
\]
We can write the given function as:
\[
F(s) = \frac{s^2+5}{s(s^2-9)} = \frac{s^2+5}{s(s-3)(s+3)}.
\]
To invert this function we have to apply the method of partial fractions, namely:
\[
\frac{s^2+5}{s(s-3)(s+3)} = \frac{A}{s} + \frac{B}{s-3} + \frac{C}{s+3} = \frac{As^2 - 9A + Bs^2 + 3Bs + Cs^2 - 3Cs}{s(s-3)(s+3)}.
\]
Now we can compare terms with like powers of $s$, obtaining the following system of equations:
\[
\begin{cases} A + B + C = 1 \\ 3B - 3C = 0 \\ -9A = 5 \end{cases}
\]
From the second we obtain $B = C$, from the last $A = -\frac{5}{9}$. From the first equation:
\[
2B = \frac{14}{9} \;\Rightarrow\; B = C = \frac{7}{9}.
\]
Now we can invert all the terms of the given function and obtain:
\[
f(x) = \mathcal{L}^{-1}\left\{\frac{s^2+5}{s^3-9s}\right\} = \mathcal{L}^{-1}\left\{-\frac{5}{9s} + \frac{7}{9}\left(\frac{1}{s-3} + \frac{1}{s+3}\right)\right\} = -\frac{5}{9} + \frac{14}{9}\mathcal{L}^{-1}\left\{\frac{s}{s^2-9}\right\} = -\frac{5}{9} + \frac{14}{9}\cosh(3x).
\]
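The partial-fraction inversion of Example 4.1.9 can be cross-checked with sympy's apart and inverse_laplace_transform; a minimal sketch, assuming sympy is available:

```python
# Cross-checking Example 4.1.9 with sympy (assumed available).
import sympy as sp

x = sp.symbols('x', positive=True)
s = sp.symbols('s')
F = (s**2 + 5) / (s**3 - 9*s)

print(sp.apart(F, s))   # -5/(9*s) + 7/(9*(s - 3)) + 7/(9*(s + 3))
print(sp.inverse_laplace_transform(F, s, x))
# expected: -5/9 + 14*cosh(3*x)/9 (sympy may print it with exponentials
# and a Heaviside(x) factor)
```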
Example 4.1.10 Solve the initial value problem
\[
y''(x) + 4y(x) = e^x, \qquad y(0) = 0, \quad y'(0) = -1.
\]
We have to apply the operator $\mathcal{L}$ to both members of the given ODE. Since this is a second-order ODE with constant coefficients, we can apply Eq. 4.11 directly to obtain:
\[
Y(s) = \frac{F(s) - 1}{s^2+4} = \frac{\frac{1}{s-1} - 1}{s^2+4} = \frac{2-s}{(s-1)(s^2+4)}.
\]
We now apply the method of partial fractions to decompose this function:
\[
\frac{2-s}{(s-1)(s^2+4)} = \frac{A}{s-1} + \frac{Bs+C}{s^2+4}
\;\Rightarrow\;
As^2 + 4A + Bs^2 + Cs - Bs - C = 2 - s.
\]
By equating the terms with like powers of $s$ we obtain the system of equations:
\[
\begin{cases} A + B = 0 \\ C - B = -1 \\ 4A - C = 2 \end{cases}
\;\Rightarrow\;
\begin{cases} A = -B \\ C = B - 1 \\ -4B - (B-1) = 2 \end{cases}
\;\Rightarrow\;
\begin{cases} B = -\frac{1}{5} \\ A = \frac{1}{5} \\ C = -\frac{6}{5} \end{cases}
\]
The decomposed $Y(s)$ is thus given by:
\[
Y(s) = \frac{1}{5}\,\frac{1}{s-1} - \frac{1}{5}\,\frac{s}{s^2+4} - \frac{6}{5}\,\frac{1}{s^2+4}.
\]
With the help of Table 4.1 we can easily identify the inverse Laplace transforms of these addends, obtaining therefore:
\[
y(x) = \frac{e^x}{5} - \frac{\cos(2x)}{5} - \frac{3\sin(2x)}{5}.
\]
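As a cross-check of the Laplace-transform route, the initial value problem of Example 4.1.10 can be solved directly with sympy's dsolve; a minimal sketch, assuming sympy is available:

```python
# Solving the IVP of Example 4.1.10 with sympy's dsolve (assumed available).
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x, 2) + 4*y(x), sp.exp(x))
sol = sp.dsolve(ode, y(x), ics={y(0): 0, y(x).diff(x).subs(x, 0): -1})
print(sp.expand(sol.rhs))
# expected: exp(x)/5 - cos(2*x)/5 - 3*sin(2*x)/5
```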
The method of the Laplace transform is sometimes more convenient, sometimes less convenient compared to traditional methods of ODE resolution. It proves however to be always more convenient when the inhomogeneous function is a step function. In fact, in this case the only available traditional method is the laborious variation of constants, whereas the Laplace transform of the step function can be readily found.

Example 4.1.11 Find the solution of the initial value problem:
\[
y''(x) + y(x) = g(x), \qquad y(0) = 0, \quad y'(0) = 0,
\]
where $g(x)$ is given by:
\[
g(x) = \begin{cases} 0 & 0 \le x < 1 \\ x-1 & 1 \le x < 2 \\ 1 & x \ge 2 \end{cases}
\]
(also known as ramp loading). The function $g(x)$ can be written as:
\[
g(x) = u_1(x)(x-1) - u_2(x)(x-2),
\]
where $u_c(x)$ is the Heaviside function (Eq. 4.7). In fact, for $x < 1$ both $u_1$ and $u_2$ are zero. For $x$ between 1 and 2, $u_1 = 1$ but $u_2$ is still zero. For $x \ge 2$ both functions are 1 and therefore $u_1(x)(x-1) - u_2(x)(x-2) = x - 1 - x + 2 = 1$. If we take the Laplace transform of both members of the given ODE we obtain:
\[
s^2Y(s) + Y(s) = \frac{e^{-s}}{s^2} - \frac{e^{-2s}}{s^2}
\]
\[
\Rightarrow\; Y(s) = \frac{e^{-s} - e^{-2s}}{s^2(s^2+1)} = \left(e^{-s} - e^{-2s}\right)\frac{1+s^2-s^2}{s^2(s^2+1)} = \left(e^{-s} - e^{-2s}\right)\left(\frac{1}{s^2} - \frac{1}{s^2+1}\right).
\]
To invert this function $Y(s)$ we use again the relation $\mathcal{L}\{u_c(x)f(x-c)\} = e^{-cs}F(s)$ to obtain:
\[
y(x) = u_1(x)(x-1) - u_2(x)(x-2) - u_1(x)\sin(x-1) + u_2(x)\sin(x-2).
\]

Among the results presented in Table 4.1, very significant is the one concerning the Dirac delta function $\delta(x-c)$. We recall here briefly what the Dirac delta function is and what its properties are. Given a function $g(x)$ defined in the following way:
\[
g(x) = d_\xi(x) = \begin{cases} \frac{1}{2\xi} & -\xi < x < \xi \\ 0 & x \le -\xi \text{ or } x \ge \xi \end{cases} \tag{4.12}
\]
it is clear that the integral of this function is 1 for any possible choice of $\xi$; in fact:
\[
\int_{-\infty}^\infty g(x)\,dx = \int_{-\xi}^{\xi}\frac{1}{2\xi}\,dx = 1.
\]
It is also clear that if $\xi$ tends to zero, the interval of values of $x$ in which $g(x)$ is different from zero becomes narrower and narrower until it disappears. Analogously, the function $g(x-c) = d_\xi(x-c)$ is non-null only in a narrow interval of $x$ centered on $c$ that disappears for $\xi$ tending to zero. The limit of the function $g(x) = d_\xi(x)$ for $\xi \to 0$ is called the Dirac delta function and is indicated with $\delta(x)$. It is therefore characterized by the properties:
\[
\delta(x-c) = 0 \quad \forall x \neq c \tag{4.13}
\]
\[
\int_{-\infty}^\infty \delta(x)\,dx = 1. \tag{4.14}
\]
Given a generic function $f(x)$, if we integrate $f(x)\delta(x-c)$ between $-\infty$ and $\infty$ we obtain:
\[
\int_{-\infty}^\infty f(x)\delta(x-c)\,dx = \lim_{\xi\to 0}\frac{1}{2\xi}\int_{c-\xi}^{c+\xi}f(x)\,dx = \lim_{\xi\to 0}\frac{1}{2\xi}\left[2\xi f(\tilde{x})\right], \qquad \tilde{x}\in[c-\xi, c+\xi].
\]
The last step is justified by the mean value theorem for integrals. But the interval in which $\tilde{x}$ must be taken collapses to the point $c$ for $\xi \to 0$, therefore we obtain the important property of the Dirac delta function:
\[
\int_{-\infty}^\infty f(x)\delta(x-c)\,dx = f(c). \tag{4.15}
\]
To calculate the Laplace transform of $\delta(x-c)$ (with $c \ge 0$) it is convenient to calculate first the Laplace transform of the function $d_\xi(x-c)$ and then take the limit $\xi \to 0$, namely:
\[
\mathcal{L}\{\delta(x-c)\} = \lim_{\xi\to 0}\int_0^\infty e^{-sx}d_\xi(x-c)\,dx = \lim_{\xi\to 0}\int_{c-\xi}^{c+\xi}\frac{e^{-sx}}{2\xi}\,dx = \lim_{\xi\to 0}\frac{e^{-s(c-\xi)} - e^{-s(c+\xi)}}{2s\xi} = e^{-sc}\lim_{\xi\to 0}\frac{e^{s\xi} - e^{-s\xi}}{2s\xi} = e^{-sc}.
\]
The last step is justified by de l'Hôpital's rule for limits (differentiating numerator and denominator with respect to $\xi$ gives $\frac{s(e^{s\xi}+e^{-s\xi})}{2s} \to 1$). In this way we have found the result reported in Table 4.1 about the Laplace transform of $\delta(x-c)$. In the case $c = 0$ we have $\mathcal{L}\{\delta(x)\} = 1$.

4.1.3 The Bromwich integral

Although for most practical purposes the inverse Laplace transform of a given function $F(s)$ can be found by means of the "dictionary" provided by Tab. 4.1 (or of more extended tables that can be found in the literature), a general formula for the inversion of $F(s)$ can be found by treating $F(s)$ as a complex function; it is given by the so-called Bromwich integral:
\[
f(x) = \mathcal{L}^{-1}\{F(s)\} = \frac{1}{2\pi i}\int_{\lambda-i\infty}^{\lambda+i\infty} e^{sx}F(s)\,ds, \tag{4.16}
\]
where $\lambda$ is a real positive number larger than the real parts of all the singularities of $e^{sx}F(s)$. In practice, the integral must be performed along the infinite line $L$, parallel to the imaginary axis, indicated in Fig. 4.1. At this point, a curve must be chosen in order to close the contour $C$. Possible completion paths are for instance the curves $\Gamma_1$ or $\Gamma_2$ indicated in Fig. 4.2, namely the half-circles on the left and on the right of $L$, respectively. For $R \to \infty$ these curves make a closed contour with $L$. The Bromwich integral can be evaluated by means of the residue theorem provided that the integral of the function $e^{sx}F(s)$ along the chosen half-circle tends to zero as its radius $R$ tends to infinity. If we choose the completion path $\Gamma_1$, then the residue theorem ensures that:

[Figure 4.1: The infinite line L along which the Bromwich integral must be performed.]

\[
f(x) = \frac{1}{2\pi i}\cdot 2\pi i\sum_C R_j = \sum_C R_j, \tag{4.17}
\]
where the sum is extended to all the residues of the function $e^{sx}F(s)$ in the complex plane. In fact, by construction $L$ lies on the right of each singularity of $e^{sx}F(s)$, and in the limit $R \to \infty$ the closed curve $C = L + \Gamma_1$ will enclose them all (including for instance the singularity $z_1$ that in Fig. 4.2 is not yet enclosed in $C$). If we instead have to choose the completion path $\Gamma_2$, then the closed curve $L + \Gamma_2$ will enclose no singularities and therefore $f(x)$ will be zero.

[Figure 4.2: Possible contour completions for the integration path L to use in the Bromwich integral.]
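The Bromwich integral can also be carried out numerically. The following minimal sketch (not part of the original text) assumes mpmath is available; its invertlaplace routine deforms the Bromwich line into a contour suitable for numerical quadrature (Talbot's method). Here we invert $F(s) = \frac{1}{s-1}$, whose inverse transform is $e^x$:

```python
# Numerical Bromwich inversion with mpmath (assumed available, version >= 1.0).
import mpmath as mp

F = lambda s: 1 / (s - 1)
for x in [0.5, 1.0, 2.0]:
    approx = mp.invertlaplace(F, x, method='talbot')
    print(x, approx, mp.e**x)    # the two values should agree to many digits
```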
Example 4.1.12 Find the inverse Laplace transform of the function
\[
F(s) = \frac{2e^{-2s}}{s^2+4}.
\]
From the relation $\mathcal{L}\{u_c(x)f(x-c)\} = e^{-cs}F(s)$ we can already derive the inverse Laplace transform of the given function, namely $u_2(x)\sin[2(x-2)]$. Let us check whether we can obtain the same result by means of the Bromwich integral. We have to evaluate the integral
\[
\frac{1}{2\pi i}\int_{\lambda-i\infty}^{\lambda+i\infty}\frac{2e^{s(x-2)}}{s^2+4}\,ds.
\]
We notice first that the given function has two simple poles at $s = 2i$ and $s = -2i$ (in fact $s^2+4 = (s+2i)(s-2i)$), both of which have real part zero. We can therefore take an arbitrarily small (but positive) value of $\lambda$. We can distinguish two cases: i) $x < 2$ and ii) $x > 2$.

For $x < 2$ the exponent $s(x-2)$ has negative real part if $\mathrm{Re}(s) > 0$. We notice here that $e^{s(x-2)} = e^{(x-2)\mathrm{Re}(s)}e^{i(x-2)\mathrm{Im}(s)}$, therefore what determines the behavior of this function at infinity is $e^{(x-2)\mathrm{Re}(s)}$ ($e^{i(x-2)\mathrm{Im}(s)}$ has modulus 1 and does not create problems). That means that, for $\mathrm{Re}(s) \to +\infty$, the function $e^{s(x-2)}$ tends to zero; the diverging denominator $s^2+4$ only helps the integrand vanish faster. Therefore the integral of the function $F(s)e^{sx}$ tends to zero along the curve $\Gamma_2$ of Fig. 4.2 (for $R \to \infty$) and we can calculate the Bromwich integral by means of the contour $C = L + \Gamma_2$. From what we have learned, since this closed contour does not enclose the poles, the function $f(x)$ is zero.

For $x > 2$, the function $e^{s(x-2)}$ tends to zero for $\mathrm{Re}(s) \to -\infty$. That means that the integral of the function $\frac{e^{s(x-2)}}{s^2+4}$ tends to zero (for $R \to \infty$) along the curve $\Gamma_1$ of Fig. 4.2, and we therefore take $\Gamma_1$ as a completion of $L$ to calculate the Bromwich integral. By the residue theorem, this integral is given by the sum of the residues of the function $e^{sx}F(s)$ at all the poles, namely:
\[
f(x) = \mathrm{Res}(2i) + \mathrm{Res}(-2i).
\]
We have:
\[
\mathrm{Res}(2i) = \lim_{s\to 2i}(s-2i)\frac{2e^{s(x-2)}}{s^2+4} = \lim_{s\to 2i}\frac{2e^{s(x-2)}}{s+2i} = \frac{e^{2i(x-2)}}{2i}
\]
\[
\mathrm{Res}(-2i) = \lim_{s\to -2i}(s+2i)\frac{2e^{s(x-2)}}{s^2+4} = \lim_{s\to -2i}\frac{2e^{s(x-2)}}{s-2i} = \frac{e^{-2i(x-2)}}{-2i}.
\]
By summing up these two residues we obtain:
\[
f(x) = \frac{1}{2i}\left[e^{2i(x-2)} - e^{-2i(x-2)}\right] = \sin[2(x-2)].
\]
This is what we obtain if $x > 2$ whereas, as we have seen, if $x$ is smaller than 2 the function is zero. Recalling the definition of the Heaviside function $u_c(x)$, we can conclude that the inverse Laplace transform of the given function is:
\[
f(x) = \mathcal{L}^{-1}\left\{\frac{2e^{-2s}}{s^2+4}\right\} = u_2(x)\sin[2(x-2)].
\]

Example 4.1.13 Find the inverse Laplace transform of the function:
\[
F(s) = \sqrt{s-a}, \qquad a \in \mathbb{R}.
\]
The function $e^{sx}\sqrt{s-a}$ has no poles, but the function $\sqrt{z}$ is multiple-valued in the complex plane; therefore, as we have seen, a branch point is present at the point $z = 0$, namely at $s = a$. This is the only singularity of our $F(s)e^{sx}$ and therefore, in order to evaluate the Bromwich integral, we have to take $\lambda$ larger than $a$. The integral to calculate will be:
\[
\mathcal{L}^{-1}\{\sqrt{s-a}\} = \frac{1}{2\pi i}\int_{\lambda-i\infty}^{\lambda+i\infty}\sqrt{s-a}\,e^{sx}\,ds.
\]
By means of the substitution $z = s-a$ we obtain:
\[
\mathcal{L}^{-1}\{\sqrt{s-a}\} = \frac{1}{2\pi i}\int_{\lambda-i\infty}^{\lambda+i\infty}\sqrt{z}\,e^{(z+a)x}\,dz = \frac{e^{ax}}{2\pi i}\int_{\lambda-i\infty}^{\lambda+i\infty}\sqrt{z}\,e^{zx}\,dz.
\]
In this case, the branch point is at zero, therefore $\lambda$ can be arbitrarily small (but always larger than zero). Since $z = 0$ is a branch point of the function to integrate, we have to introduce a branch cut to evaluate the integral. Although we have taken so far the positive real axis as a branch cut, we have also said that this choice is arbitrary: to make the function $\sqrt{z}$ single-valued it is enough that closed curves are not allowed to enclose the origin. We can therefore take the negative real axis as the branch cut. In Fig. 4.3 we indicate the contour we must use to integrate the given function.

[Figure 4.3: Contour to use in Example 4.1.13.]

Since the closed contour $C = L + \Gamma_1 + r_1 + \gamma + r_2 + \Gamma_2$ does not enclose singularities, its integral is zero. To evaluate the Bromwich integral (namely the integral along $L$) we have to calculate the integrals along the arcs $\Gamma_1$ and $\Gamma_2$, along the straight lines $r_1$ and $r_2$ and along the circle $\gamma$.

Since the function $\sqrt{z}\,e^{zx}$ tends to zero for $\mathrm{Re}(z) \to -\infty$ (the factor $\sqrt{z}$ cannot offset the exponential decay of $e^{zx}$), the integral along the arcs $\Gamma_1$ and $\Gamma_2$ disappears.
To evaluate the integral along $\gamma$ we take as usual $z = \varepsilon e^{i\theta}$ and take the limit for $\varepsilon \to 0$. The interval of values of $\theta$ is $[\pi, -\pi]$: in fact, as we arrive at $\gamma$ the argument is $\pi$; then we rotate clockwise around the origin and after a whole circuit the argument is $-\pi$. Since $dz = i\varepsilon e^{i\theta}d\theta$ we have:
\[
\oint_\gamma \sqrt{z}\,e^{zx}\,dz = \int_\pi^{-\pi}\sqrt{\varepsilon}\,e^{i\theta/2}e^{x\varepsilon e^{i\theta}}\,i\varepsilon e^{i\theta}\,d\theta.
\]
The integrand clearly tends to zero for $\varepsilon \to 0$, therefore there is no contribution from the integral over $\gamma$.

Along the straight lines $r_1$ and $r_2$ we can assume that the arguments of the complex numbers lying on them are $\pi$ (along $r_1$) and $-\pi$ (along $r_2$) and that their imaginary parts tend to zero, therefore we have $z = re^{i\pi}$ ($r_1$) and $z = re^{-i\pi}$ ($r_2$). Notice here that, although we are on the negative real axis, $r$ is positive; in fact $e^{i\pi} = e^{-i\pi} = -1$. The parameter $r$ runs from $+\infty$ to 0 ($r_1$) and from 0 to $+\infty$ ($r_2$). The integral of the given function along $r_1$ turns out to be:
\[
\int_{r_1}\sqrt{z}\,e^{zx}\,dz = \int_\infty^0 \sqrt{r}\,e^{i\pi/2}e^{xre^{i\pi}}\cdot e^{i\pi}\,dr = \int_\infty^0 \sqrt{r}\cdot i\cdot e^{-xr}\cdot(-1)\,dr = i\int_0^\infty \sqrt{r}\,e^{-xr}\,dr.
\]
Along $r_2$ we have:
\[
\int_{r_2}\sqrt{z}\,e^{zx}\,dz = \int_0^\infty \sqrt{r}\,e^{-i\pi/2}e^{xre^{-i\pi}}\cdot e^{-i\pi}\,dr = \int_0^\infty \sqrt{r}\cdot(-i)\cdot e^{-xr}\cdot(-1)\,dr = i\int_0^\infty \sqrt{r}\,e^{-xr}\,dr.
\]
In the end we have:
\[
f(x) = \mathcal{L}^{-1}\{\sqrt{s-a}\} = -\frac{e^{ax}}{2\pi i}\int_{r_1+r_2}\sqrt{z}\,e^{zx}\,dz = -\frac{e^{ax}}{\pi}\int_0^\infty \sqrt{r}\,e^{-xr}\,dr.
\]
The minus sign is due to the fact that, as we have said, the integral along the whole closed curve $C$ is zero, therefore $\int_L F(s)e^{sx}\,ds = -\int_{r_1+r_2}F(s)e^{sx}\,ds$. To evaluate $\int_0^\infty \sqrt{r}\,e^{-xr}\,dr$ we make the substitution $xr = t^2$, therefore $r = \frac{t^2}{x}$ and $dr = \frac{2t\,dt}{x}$. We obtain:
\[
\int_0^\infty \sqrt{r}\,e^{-xr}\,dr = \frac{1}{x^{3/2}}\int_0^\infty t\,e^{-t^2}\cdot 2t\,dt.
\]
Since $-2te^{-t^2}$ is the differential of $e^{-t^2}$, we can integrate by parts and obtain:
\[
\int_0^\infty \sqrt{r}\,e^{-xr}\,dr = -\frac{1}{x^{3/2}}\left\{\left[te^{-t^2}\right]_0^\infty - \int_0^\infty e^{-t^2}\,dt\right\}.
\]
The term in square brackets is zero. By using the known result $\int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}$ we obtain:
\[
\int_0^\infty \sqrt{r}\,e^{-xr}\,dr = \frac{\sqrt{\pi}}{2x^{3/2}}.
\]
This result completes our inversion of the function $F(s) = \sqrt{s-a}$, namely we have:
\[
f(x) = \mathcal{L}^{-1}\{\sqrt{s-a}\} = -\frac{e^{ax}}{2\sqrt{\pi x^3}}.
\]

4.2 Fourier transforms

Fourier transforms are widely used in physics and astronomy because they allow one to express a function (not necessarily periodic) as a superposition of sinusoidal functions; we therefore devote this section to them. Since Fourier transforms are used mostly to represent time-varying functions, we shall use $t$ as the independent variable instead of $x$. On the other hand, the transformed variable represents for most applications a frequency and will be indicated with $\omega$ instead of $s$.

4.2.1 Fourier series

For some physical applications, we might need to expand in series functions that are not continuous or not differentiable and that therefore do not admit a Taylor series. Fourier series allow us to represent periodic functions, for which a Taylor expansion does not exist, as superpositions of sine and cosine functions. Given a periodic function $f(t)$ with period $T$ such that the integral of $|f(t)|$ over one period converges, $f(t)$ can be expressed in this way:
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty\left[a_n\cos\frac{2\pi nt}{T} + b_n\sin\frac{2\pi nt}{T}\right],
\]
where the constant coefficients $a_n$, $b_n$ are called Fourier coefficients. Defining the angular frequency $\omega = \frac{2\pi}{T}$ we simplify this expression into:
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty\left[a_n\cos(\omega nt) + b_n\sin(\omega nt)\right], \tag{4.18}
\]
namely the function $f(t)$ can be expressed as a superposition of an infinite number of sinusoidal functions having periods $T_n = \frac{2\pi}{\omega n}$.
It can be shown that these coefficients are given by:
\[
a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos(\omega nt)\,dt \tag{4.19}
\]
\[
b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin(\omega nt)\,dt \tag{4.20}
\]

Example 4.2.1 Find the Fourier series expansion of the function
\[
f(t) = \begin{cases} -1 & -\frac{T}{2} + kT \le t < kT \\ 1 & kT \le t < \frac{T}{2} + kT \end{cases}
\]
This is a square wave: a series of positive impulses followed periodically by negative impulses of the same intensity. We can notice immediately that the function $f(t)$ is odd ($f(t) = -f(-t)$). Since the function $\cos(\omega nt)$ is even, the whole function $f(t)\cos(\omega nt)$ is odd and its integral between $-T/2$ and $T/2$ is zero. That means that the coefficients $a_n$ are zero. To find the coefficients $b_n$ we apply Eq. 4.20, obtaining:
\[
b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin(\omega nt)\,dt = \frac{2}{T}\left[-\int_{-T/2}^0 \sin(\omega nt)\,dt + \int_0^{T/2}\sin(\omega nt)\,dt\right]
= \frac{4}{T}\int_0^{T/2}\sin(\omega nt)\,dt = -\frac{2}{n\pi}\left[\cos(\omega nt)\right]_0^{T/2} = \frac{2}{n\pi}\left[1 - \cos(n\pi)\right].
\]
Here we have used the relation $\omega T = 2\pi$. We can notice that $\cos(n\pi)$ is 1 if $n$ is even and $-1$ if $n$ is odd, namely $\cos(n\pi) = (-1)^n$ (the same result can be found by means of de Moivre's theorem applied to the complex number $z = e^{i\pi}$). The coefficients $b_n$ are therefore equal to zero if $n$ is even and to $\frac{4}{n\pi}$ if $n$ is odd. The Fourier expansion we looked for is therefore:
\[
f(t) = \frac{4}{\pi}\left[\sin(\omega t) + \frac{\sin(3\omega t)}{3} + \frac{\sin(5\omega t)}{5} + \dots\right].
\]
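The convergence of the square-wave series of Example 4.2.1 can be inspected numerically by truncating the sum. A minimal numpy sketch (not from the original text), taking $T = 2\pi$ so that $\omega = 1$:

```python
# Partial sums of the square-wave Fourier series of Example 4.2.1,
# using numpy (assumed available).
import numpy as np

T = 2 * np.pi
omega = 2 * np.pi / T
t = np.linspace(-T / 2, T / 2, 9)

def partial_sum(t, n_terms):
    n = np.arange(1, 2 * n_terms, 2)          # odd harmonics 1, 3, 5, ...
    return (4 / np.pi) * np.sum(np.sin(omega * np.outer(t, n)) / n, axis=1)

# As n_terms grows, the sum approaches -1 for t < 0 and +1 for t > 0
# (and 0 exactly at the jump points t = 0, +/- T/2).
print(np.round(partial_sum(t, 200), 3))
```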
By using the identities $\cos z = (e^{iz}+e^{-iz})/2$ and $\sin z = (e^{iz}-e^{-iz})/2i$, the Fourier expansion of a function $f(t)$ can also be written as:
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty\left[a_n\frac{e^{i\omega nt}+e^{-i\omega nt}}{2} + b_n\frac{e^{i\omega nt}-e^{-i\omega nt}}{2i}\right]
= \frac{a_0e^{i\omega 0t}}{2} + \frac{1}{2}\sum_{n=1}^\infty\left[(a_n - ib_n)e^{i\omega nt} + (a_n + ib_n)e^{-i\omega nt}\right].
\]
In this way we can see that the function $f(t)$ can be expressed as a sum, extending from $-\infty$ to $+\infty$, of terms of the form $e^{i\omega_n t}$, where $\omega_n = \omega\cdot n$, namely we have:
\[
f(t) = \sum_{n=-\infty}^{\infty} c_n e^{i\omega_n t}; \qquad
c_n = \begin{cases} \frac{1}{2}(a_n - ib_n) & n \ge 0 \\ \frac{1}{2}(a_{-n} + ib_{-n}) & n < 0 \end{cases} \tag{4.21}
\]
This compact representation of the periodic function $f(t)$ is called the complex Fourier series. If we combine the coefficients $a_n$ and $b_n$ as indicated in Eq. 4.21 we find that, irrespective of the sign of $n$, we have:
\[
c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-i\omega_n t}\,dt. \tag{4.22}
\]

4.2.2 From Fourier series to Fourier transform

We have seen that Fourier series allow us to describe periodic functions as superpositions of sinusoidal functions characterized by angular frequencies $\omega_n$. To represent non-periodic functions, what we can do is to extend the period $T$ to infinity (every function can be considered periodic if the period is large enough). That corresponds to considering a vanishingly small "frequency quantum" $\Delta\omega = \frac{\omega_n}{n} = \frac{2\pi}{T}$ and therefore a continuous spectrum of angular frequencies $\omega_n$. Given a function $f(t) = \sum_{n=-\infty}^{\infty} c_n e^{i\omega_n t}$, with $c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-i\omega_n t}\,dt$, we want to see what happens in the limit $T \to \infty$ (or, analogously, $\Delta\omega = \frac{2\pi}{T} \to 0$). We have:
\[
f(t) = \sum_{n=-\infty}^{\infty}\left[\frac{1}{T}\int_{-T/2}^{T/2} f(u)e^{-i\omega_n u}\,du\right]e^{i\omega_n t} = \sum_{n=-\infty}^{\infty}\frac{\Delta\omega}{2\pi}\left[\int_{-T/2}^{T/2} f(u)e^{-i\omega_n u}\,du\right]e^{i\omega_n t}.
\]
In the limit $T \to \infty$ and $\Delta\omega \to 0$ the limits of the integration extend to infinity, the sum becomes an integral and the discrete values $\omega_n$ become a continuous variable $\omega$ (with $\Delta\omega \to d\omega$). We have thus:
\[
f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty d\omega\,e^{i\omega t}\int_{-\infty}^\infty du\,f(u)e^{-i\omega u}. \tag{4.23}
\]
From this relation we can define the Fourier transform of a function $f(t)$ as:
\[
\tilde{f}(\omega) = \mathcal{F}\{f(t)\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty f(t)e^{-i\omega t}\,dt. \tag{4.24}
\]
Here we require, in order for this integration to be possible, that $\int_{-\infty}^\infty |f(t)|\,dt$ is finite. Unlike the Laplace transform, the Fourier transform is very easy to invert. In fact, we can see directly from Eq. 4.23 that:
\[
f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty \tilde{f}(\omega)e^{i\omega t}\,d\omega. \tag{4.25}
\]

Example 4.2.2 Find the Fourier transform of the normalized Gaussian distribution
\[
f(t) = \frac{1}{\tau\sqrt{2\pi}}e^{-\frac{t^2}{2\tau^2}}.
\]
By definition of Fourier transform we have:
\[
\tilde{f}(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty f(t)e^{-i\omega t}\,dt = \frac{1}{2\pi\tau}\int_{-\infty}^\infty e^{-i\omega t - \frac{t^2}{2\tau^2}}\,dt.
\]
We can modify the exponent of $e$ in the integral as follows:
\[
-i\omega t - \frac{t^2}{2\tau^2} = -\frac{1}{2\tau^2}\left[t^2 + 2i\omega t\tau^2 + (i\omega\tau^2)^2 - (i\omega\tau^2)^2\right].
\]
The first 3 addends inside the square brackets are the square of $t + i\omega\tau^2$, namely we obtain:
\[
-i\omega t - \frac{t^2}{2\tau^2} = -\frac{(t+i\omega\tau^2)^2}{2\tau^2} + \frac{(i\omega\tau^2)^2}{2\tau^2} = -\left(\frac{t+i\omega\tau^2}{\sqrt{2}\tau}\right)^2 - \frac{\omega^2\tau^2}{2}.
\]
Since the term $e^{-\frac{1}{2}\omega^2\tau^2}$ does not depend on $t$ we obtain:
\[
\tilde{f}(\omega) = \frac{e^{-\frac{1}{2}\omega^2\tau^2}}{2\pi\tau}\int_{-\infty}^\infty e^{-\left(\frac{t+i\omega\tau^2}{\sqrt{2}\tau}\right)^2}\,dt.
\]
This is the integral of a complex function, therefore we should use the methods of complex integration we have learned so far. However, the integration simplifies significantly by means of the substitution:
\[
\frac{t+i\omega\tau^2}{\sqrt{2}\tau} = s, \qquad dt = \sqrt{2}\tau\,ds.
\]
In this way we obtain:
\[
\tilde{f}(\omega) = \frac{e^{-\frac{1}{2}\omega^2\tau^2}}{2\pi\tau}\cdot\sqrt{2}\tau\int_{-\infty}^\infty e^{-s^2}\,ds = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\omega^2\tau^2},
\]
where we have made use of the known result $\int_{-\infty}^\infty e^{-s^2}\,ds = \sqrt{\pi}$. It is important to note that the Fourier transform of a Gaussian function is another Gaussian function.

The Fourier transform allows us to express the Dirac delta function in an elegant and useful way. We recall Eq. 4.23:
\[
f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty d\omega\,e^{i\omega t}\int_{-\infty}^\infty du\,f(u)e^{-i\omega u}.
\]
By exchanging the order of integration we obtain:
\[
f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty d\omega\int_{-\infty}^\infty du\,f(u)e^{i\omega(t-u)} = \int_{-\infty}^\infty du\,f(u)\left[\frac{1}{2\pi}\int_{-\infty}^\infty e^{i\omega(t-u)}\,d\omega\right],
\]
where the exchange of the order of integration has been made possible by Fubini's theorem. Recalling Eq. 4.15 we can immediately recognize that:
\[
\delta(t-u) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{i\omega(t-u)}\,d\omega. \tag{4.26}
\]
Analogously to the Laplace transform, it is easy to calculate the Fourier transform of the derivative of a function. It is:
\[
\mathcal{F}\{f'(t)\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty f'(t)e^{-i\omega t}\,dt = \frac{1}{\sqrt{2\pi}}\left[f(t)e^{-i\omega t}\right]_{-\infty}^\infty - \frac{(-i\omega)}{\sqrt{2\pi}}\int_{-\infty}^\infty f(t)e^{-i\omega t}\,dt = i\omega\mathcal{F}\{f(t)\}. \tag{4.27}
\]
Here we have assumed that the function $f(t)$ tends to zero for $t \to \pm\infty$ (as it should, since $\int_{-\infty}^\infty |f(t)|\,dt$ is finite). It is easy to iterate this procedure and show that:
\[
\mathcal{F}\{f^{(n)}(t)\} = (i\omega)^n\mathcal{F}\{f(t)\}. \tag{4.28}
\]
This relation can be used in some cases to solve ODEs analogously to what is done by means of Laplace transforms, namely we transform both members of an ODE, solve the obtained algebraic equation as a function of $\mathcal{F}\{y(x)\}$ (the Fourier transform of the solution $y(x)$ we seek) and then invert the function we have obtained. However, for most practical cases, it is more convenient to use Laplace transformation methods to solve ODEs. Fourier transformation methods can instead be extremely useful to solve partial differential equations (see Sect. 6.3.1).
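Example 4.2.2 can be cross-checked directly from the definition Eq. 4.24. A minimal sympy sketch (not part of the original text); we integrate from the definition because library Fourier-transform routines often adopt a normalization convention different from the $1/\sqrt{2\pi}$ used here:

```python
# Checking the Fourier transform of the Gaussian (Example 4.2.2) with sympy
# (assumed available), directly from the definition Eq. 4.24.
import sympy as sp

t, omega = sp.symbols('t omega', real=True)
tau = sp.symbols('tau', positive=True)

f = sp.exp(-t**2 / (2 * tau**2)) / (tau * sp.sqrt(2 * sp.pi))
F = sp.integrate(f * sp.exp(-sp.I * omega * t), (t, -sp.oo, sp.oo)) / sp.sqrt(2 * sp.pi)
print(sp.simplify(F))
# expected: exp(-omega**2*tau**2/2)/sqrt(2*pi), possibly written differently
```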
Chapter 5

Systems of differential equations

In many physical applications it is necessary to solve simultaneously $n$ ODEs involving $n$ unknown functions $y_1(x), y_2(x), \dots, y_n(x)$ of the same independent variable $x$. Such a system of differential equations shares many analogies with a system of ordinary algebraic equations and can be solved with the matrix formalism (taking into account all the properties of ODEs we have encountered so far). We will consider only systems of linear ODEs. At the end of this Chapter we will show that systems of linear ODEs of order larger than one can always be transformed into systems of first order ODEs; therefore we will devote most of our attention to this case.

5.1 Review of matrices and systems of algebraic equations

Because of the utility of the results of matrix theory in solving systems of ODEs, we recall here (without proof) some of the most important and useful properties of matrices and of systems of linear algebraic equations.

5.1.1 Matrices

A matrix (usually indicated with a boldface capital letter like $\mathbf{A}$) is a rectangular array of elements arranged in $m$ rows and $n$ columns like the following one:
\[
\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.
\]
In this case we refer to $\mathbf{A}$ as an $m \times n$ matrix. The transpose $\mathbf{A}^T$ of a matrix is the matrix obtained by exchanging rows with columns. If, besides exchanging rows with columns, we also take the complex conjugate of each element of the matrix, then we obtain the adjoint $\mathbf{A}^*$. Of course, for real matrices the transpose and the adjoint coincide.

The multiplication of two matrices is possible only if the number of columns of the first matrix equals the number of rows of the second matrix. In this case, if we multiply an $m \times r$ matrix $\mathbf{A}$ by an $r \times n$ matrix $\mathbf{B}$, we obtain an $m \times n$ matrix $\mathbf{C}$ whose element in the $i$-th row and $j$-th column is obtained by multiplying each element of the $i$-th row of $\mathbf{A}$ by the corresponding element of the $j$-th column of $\mathbf{B}$ and summing, namely:
\[
c_{ij} = \sum_{k=1}^r a_{ik}b_{kj}. \tag{5.1}
\]
If $\mathbf{A}$ and $\mathbf{B}$ are square matrices (namely if the numbers of rows and columns are the same) it is possible to define both $\mathbf{AB}$ and $\mathbf{BA}$ but, from the definition of the matrix product, it follows that in general $\mathbf{AB} \neq \mathbf{BA}$, namely matrix multiplication is not commutative.

A matrix with a single column is also called a vector, namely a vector $\mathbf{x}$ looks like this:
\[
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.
\]
Of course, the transpose $\mathbf{x}^T$ of a vector $\mathbf{x}$ is composed of a single row $(x_1, x_2, \dots, x_n)$. If we multiply the transpose of a vector $\mathbf{x}$ by the complex conjugate of a second vector $\mathbf{y}$, we obtain the scalar product (or inner product) $\mathbf{x}\cdot\mathbf{y}$ (sometimes indicated with $(\mathbf{x}, \mathbf{y})$), which is the usual way to multiply vectors. In symbols:
\[
\mathbf{x}\cdot\mathbf{y} = \sum_{i=1}^n x_iy_i^*. \tag{5.2}
\]
If $(\mathbf{x}, \mathbf{y}) = 0$, then the two vectors $\mathbf{x}$ and $\mathbf{y}$ are said to be orthogonal.

The identity matrix $\mathbf{I}$ is the square matrix whose diagonal terms are 1 and whose off-diagonal terms are all 0, namely:
\[
\mathbf{I} = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}.
\]
Given a square matrix $\mathbf{A}$, from the definition of matrix multiplication we have $\mathbf{AI} = \mathbf{IA} = \mathbf{A}$. If the determinant of $\mathbf{A}$ is different from zero, then it is always possible to find the inverse $\mathbf{A}^{-1}$ of the matrix $\mathbf{A}$, namely the matrix such that $\mathbf{AA}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$. The general formula for the element $b_{ij}$ of the matrix $\mathbf{A}^{-1}$ is:
\[
b_{ij} = \frac{(-1)^{i+j}M_{ji}}{\det\mathbf{A}}
\]
(note the inversion of the indices in $M_{ji}$), where $M_{ij}$ is the minor of the matrix $\mathbf{A}$ associated with the element $a_{ij}$, namely the determinant of the matrix obtained by deleting the $i$-th row and the $j$-th column. This is quite an inefficient way to find the inverse of a matrix. Much more effective is the method of Gauss elimination. It consists in manipulating the rows of a matrix until one obtains the identity matrix. The allowed row operations are:

• interchange of two rows;

• multiplication of a row by a non-zero scalar;

• addition of any multiple of one row to another row.
An example can clarify how the Gauss elimination method works.

Example 5.1.1 Find the inverse of the matrix
\[
\mathbf{A} = \begin{pmatrix} 3 & -1 & 1 \\ 1 & 1 & -1 \\ 2 & -1 & 0 \end{pmatrix}.
\]
We begin by forming the augmented matrix $\mathbf{A}|\mathbf{I}$, namely:
\[
\mathbf{A}|\mathbf{I} = \left(\begin{array}{ccc|ccc} 3 & -1 & 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & 0 & 1 & 0 \\ 2 & -1 & 0 & 0 & 0 & 1 \end{array}\right).
\]
We now perform a series of row operations on $\mathbf{A}$ in order to transform it into $\mathbf{I}$. At the same time, $\mathbf{I}$ will be transformed into $\mathbf{A}^{-1}$. To shorten the notation we indicate with $R_{mn}$ the swap of the $m$-th with the $n$-th row, with $R_m + R_n(\alpha)$ the sum of the $m$-th row with $\alpha$ times the $n$-th row and with $R_n(\beta)$ the product of the $n$-th row by $\beta$. We have:
\[
\mathbf{A}|\mathbf{I} \xrightarrow{R_{12}}
\left(\begin{array}{ccc|ccc} 1 & 1 & -1 & 0 & 1 & 0 \\ 3 & -1 & 1 & 1 & 0 & 0 \\ 2 & -1 & 0 & 0 & 0 & 1 \end{array}\right)
\xrightarrow{R_2+R_1(-3),\; R_3+R_1(-2)}
\left(\begin{array}{ccc|ccc} 1 & 1 & -1 & 0 & 1 & 0 \\ 0 & -4 & 4 & 1 & -3 & 0 \\ 0 & -3 & 2 & 0 & -2 & 1 \end{array}\right)
\]
\[
\xrightarrow{R_2(-\frac{1}{4})}
\left(\begin{array}{ccc|ccc} 1 & 1 & -1 & 0 & 1 & 0 \\ 0 & 1 & -1 & -\frac{1}{4} & \frac{3}{4} & 0 \\ 0 & -3 & 2 & 0 & -2 & 1 \end{array}\right)
\xrightarrow{R_1+R_2(-1),\; R_3+R_2(3)}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac{1}{4} & \frac{1}{4} & 0 \\ 0 & 1 & -1 & -\frac{1}{4} & \frac{3}{4} & 0 \\ 0 & 0 & -1 & -\frac{3}{4} & \frac{1}{4} & 1 \end{array}\right)
\]
\[
\xrightarrow{R_3(-1)}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac{1}{4} & \frac{1}{4} & 0 \\ 0 & 1 & -1 & -\frac{1}{4} & \frac{3}{4} & 0 \\ 0 & 0 & 1 & \frac{3}{4} & -\frac{1}{4} & -1 \end{array}\right)
\xrightarrow{R_2+R_3(1)}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac{1}{4} & \frac{1}{4} & 0 \\ 0 & 1 & 0 & \frac{1}{2} & \frac{1}{2} & -1 \\ 0 & 0 & 1 & \frac{3}{4} & -\frac{1}{4} & -1 \end{array}\right) = \mathbf{I}|\mathbf{A}^{-1}.
\]
In the end, the inverse matrix of $\mathbf{A}$ is:
\[
\mathbf{A}^{-1} = \frac{1}{4}\begin{pmatrix} 1 & 1 & 0 \\ 2 & 2 & -4 \\ 3 & -1 & -4 \end{pmatrix}.
\]
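The result of Example 5.1.1 is easy to cross-check numerically; a minimal sketch (not part of the original text), assuming numpy is available:

```python
# Checking the Gauss-elimination result of Example 5.1.1 with numpy.
import numpy as np

A = np.array([[3., -1., 1.],
              [1.,  1., -1.],
              [2., -1., 0.]])

A_inv = np.linalg.inv(A)
print(A_inv * 4)      # [[1, 1, 0], [2, 2, -4], [3, -1, -4]]
print(A @ A_inv)      # the identity matrix, up to rounding
```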
If the elements $a_{ij}$ of a matrix $\mathbf{A}$ are functions of an independent variable $x$, then we can define the derivative and the integral of the matrix $\mathbf{A}$ as the matrices whose elements are the derivatives and the integrals of the elements $a_{ij}(x)$, respectively. Given two matrices $\mathbf{A}(x)$ and $\mathbf{B}(x)$ and a constant matrix $\mathbf{K}$ we have:
\[
\frac{d}{dx}(\mathbf{KA}) = \mathbf{K}\frac{d\mathbf{A}}{dx} \tag{5.3}
\]
\[
\frac{d}{dx}(\mathbf{A}+\mathbf{B}) = \frac{d\mathbf{A}}{dx} + \frac{d\mathbf{B}}{dx} \tag{5.4}
\]
\[
\frac{d}{dx}(\mathbf{AB}) = \mathbf{A}\frac{d\mathbf{B}}{dx} + \frac{d\mathbf{A}}{dx}\mathbf{B}. \tag{5.5}
\]
Since, as we have said, matrix multiplication is not commutative, care must be taken to respect the order of the multiplications.

5.1.2 Systems of linear algebraic equations

A system of $n$ linear algebraic equations in $n$ variables can be written as:
\[
\begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n = b_2 \\ \quad\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n = b_n \end{cases} \tag{5.6}
\]
By using the matrix formalism, we can write this system in the compact form:
\[
\mathbf{Ax} = \mathbf{b}, \tag{5.7}
\]
where
\[
\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}, \qquad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad
\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}.
\]
If $\mathbf{b} = \mathbf{0}$ then the system is said to be homogeneous, otherwise it is nonhomogeneous. If the determinant of $\mathbf{A}$ is different from zero, then we can calculate the inverse matrix $\mathbf{A}^{-1}$. By multiplying both sides of the equation $\mathbf{Ax} = \mathbf{b}$ by $\mathbf{A}^{-1}$ we obtain the solution of the system, namely:
\[
\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}. \tag{5.8}
\]
This solution is therefore unique. For homogeneous systems we can only have the trivial solution $\mathbf{x} = \mathbf{0}$, whereas if $\mathbf{b} \neq \mathbf{0}$ the solution can be found either by Gauss elimination or by means of Cramer's rule. Cramer's rule states that the $i$-th component of the vector $\mathbf{x}$, solution of the given system of linear algebraic equations, is given by:
\[
x_i = \frac{\det\mathbf{A}_i}{\det\mathbf{A}}, \tag{5.9}
\]
where $\mathbf{A}_i$ is the matrix obtained by replacing the $i$-th column of $\mathbf{A}$ by the column vector $\mathbf{b}$.

If $\det\mathbf{A} = 0$, then the homogeneous system $\mathbf{Ax} = \mathbf{0}$ has an infinite number of solutions. The nonhomogeneous system $\mathbf{Ax} = \mathbf{b}$ has instead, in general, no solutions if the determinant of $\mathbf{A}$ is zero. It has however an infinite number of solutions if the vectors $\mathbf{y}$ such that $\mathbf{A}^*\mathbf{y} = \mathbf{0}$ are orthogonal to $\mathbf{b}$, namely if:
\[
(\mathbf{b}, \mathbf{y}) = 0.
\]
In the end, this condition implies that one or more equations of the system Eq. 5.6 can be obtained as a linear combination of the others.

Example 5.1.2 Find, as a function of $a$, the solutions of the system of algebraic equations
\[
\mathbf{Ax} = \begin{pmatrix} a & 1 \\ 1 & a \end{pmatrix}\mathbf{x} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.
\]
The determinant of the given system is $a^2 - 1$, therefore we have a unique solution for $|a| \neq 1$. In this case, the solution is given by $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$. It is easy to see that the inverse of the matrix $\mathbf{A}$ is given by:
\[
\mathbf{A}^{-1} = \frac{1}{a^2-1}\begin{pmatrix} a & -1 \\ -1 & a \end{pmatrix},
\]
therefore the solution is:
\[
\mathbf{x} = \frac{1}{a^2-1}\begin{pmatrix} a & -1 \\ -1 & a \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \frac{1}{a^2-1}\begin{pmatrix} ab_1 - b_2 \\ -b_1 + ab_2 \end{pmatrix}.
\]
If $a = 1$, the system becomes:
\[
\begin{cases} x_1 + x_2 = b_1 \\ x_1 + x_2 = b_2 \end{cases}
\]
It is quite clear that, since the left-hand sides of the equations are equal, the right-hand sides must also be equal, namely solutions are possible only if $b_1 = b_2$. In this case, the equations are proportional to each other, with proportionality constant 1. We just need to solve one of them, for instance $x_1 + x_2 = b_1$. If $x_1$ takes the value $K$, then $x_2 = b_1 - K$, therefore the (infinite) solutions in this case are represented by the vector:
\[
\mathbf{x} = \begin{pmatrix} K \\ b_1 - K \end{pmatrix}.
\]
If $a = -1$, the system becomes:
\[
\begin{cases} -x_1 + x_2 = b_1 \\ x_1 - x_2 = b_2 \end{cases}
\]
Here, solutions are possible only if $b_1 = -b_2$. In this case, the equations are proportional to each other, with proportionality constant $-1$. If we solve $-x_1 + x_2 = b_1$, the (infinite) solutions in this case are represented by the vector:
\[
\mathbf{x} = \begin{pmatrix} K \\ b_1 + K \end{pmatrix}.
\]

A set of $n$ vectors $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(n)}$ is said to be linearly dependent if there exists a set of $n$ numbers $c_1, \dots, c_n$, at least one of which is different from zero, such that:
\[
c_1\mathbf{x}^{(1)} + \dots + c_n\mathbf{x}^{(n)} = \mathbf{0}. \tag{5.10}
\]
If this relation is satisfied only by the set of values $c_1 = c_2 = \dots = c_n = 0$, then the vectors $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(n)}$ are said to be linearly independent. To check the linear independence of a set of vectors, the best way is to construct the matrix $\mathbf{X}$ whose columns are the vectors $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(n)}$. Eq. 5.10 can thus be written as:
\[
\begin{pmatrix} x_1^{(1)}c_1 + \dots + x_1^{(n)}c_n \\ \vdots \\ x_n^{(1)}c_1 + \dots + x_n^{(n)}c_n \end{pmatrix} = \mathbf{Xc} = \mathbf{0}.
\]
If $\det\mathbf{X} \neq 0$, then the only solution of this equation is $\mathbf{c} = \mathbf{0}$ (therefore the vectors are linearly independent). If instead $\det\mathbf{X} = 0$, then the system $\mathbf{Xc} = \mathbf{0}$ admits a solution $\mathbf{c} \neq \mathbf{0}$ and the vectors are linearly dependent.

For many physical and mathematical applications it is necessary to find under what conditions a vector $\mathbf{x}$ can be transformed into a multiple of itself through the transformation $\mathbf{Ax}$, namely under what conditions $\mathbf{Ax} = \lambda\mathbf{x}$. Recalling the identity matrix $\mathbf{I}$, this equation can be written as:
\[
(\mathbf{A} - \lambda\mathbf{I})\mathbf{x} = \mathbf{0}. \tag{5.11}
\]
This is a homogeneous system of equations, therefore non-zero solutions are possible only if
\[
\det(\mathbf{A} - \lambda\mathbf{I}) = 0. \tag{5.12}
\]
The values of $\lambda$ that satisfy this equation are called eigenvalues of the matrix $\mathbf{A}$. The non-zero solutions of Eq. 5.11 that are obtained by using such values of $\lambda$ are called eigenvectors. If all the roots of Eq. 5.12 have multiplicity 1, then it can be shown that all the corresponding eigenvectors are linearly independent. If instead one root $\lambda_i$ is repeated, with algebraic multiplicity $m$, then it can happen that only $q < m$ linearly independent eigenvectors are associated with it. In this case, $q$ is said to be the geometric multiplicity of the eigenvalue $\lambda_i$. It is always $1 \le q \le m$.
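Eigenvalues and eigenvectors (Eqs. 5.11–5.12) are routinely computed numerically. A minimal numpy sketch (not from the original text), using the matrix that will reappear in Example 5.2.1:

```python
# Eigenvalues and eigenvectors with numpy (assumed available).
import numpy as np

A = np.array([[-2., 1.],
              [1., -2.]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)    # -3 and -1 (the order is not guaranteed)
print(eigvecs)    # columns proportional to (1, -1) and (1, 1)

# Check A p = lambda p for the first pair.
p = eigvecs[:, 0]
print(np.allclose(A @ p, eigvals[0] * p))   # True
```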
5.2 Systems of first order linear ODEs

5.2.1 General properties

A generic system of $n$ first order linear ODEs can be written as:
\[
\begin{cases} y_1'(x) = p_{11}(x)y_1(x) + p_{12}(x)y_2(x) + \dots + p_{1n}(x)y_n(x) + g_1(x) \\ y_2'(x) = p_{21}(x)y_1(x) + p_{22}(x)y_2(x) + \dots + p_{2n}(x)y_n(x) + g_2(x) \\ \quad\vdots \\ y_n'(x) = p_{n1}(x)y_1(x) + p_{n2}(x)y_2(x) + \dots + p_{nn}(x)y_n(x) + g_n(x) \end{cases} \tag{5.13}
\]
This system can be written in a more compact way in matrix notation, namely:
\[
\mathbf{y}' = \mathbf{P}(x)\mathbf{y} + \mathbf{g}, \tag{5.14}
\]
where
\[
\mathbf{y}' = \begin{pmatrix} y_1'(x) \\ \vdots \\ y_n'(x) \end{pmatrix}, \qquad
\mathbf{y} = \begin{pmatrix} y_1(x) \\ \vdots \\ y_n(x) \end{pmatrix}, \qquad
\mathbf{g} = \begin{pmatrix} g_1(x) \\ \vdots \\ g_n(x) \end{pmatrix},
\]
and $\mathbf{P}$ is the $n \times n$ matrix:
\[
\mathbf{P} = \begin{pmatrix} p_{11}(x) & \dots & p_{1n}(x) \\ \vdots & & \vdots \\ p_{n1}(x) & \dots & p_{nn}(x) \end{pmatrix}.
\]
By means of this formalism, we can extend practically all the definitions and properties we have already encountered in studying linear ODEs. In the case of the homogeneous system:
\[
\mathbf{y}' = \mathbf{P}(x)\mathbf{y}, \tag{5.15}
\]
in compliance with what we have learned about ODEs, we expect Eq. 5.15 to have in general $n$ solutions $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$. Are these solutions linearly independent? To check it, it is enough to consider the $n \times n$ matrix $\mathbf{Y}(x)$ whose columns are the vectors $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$, namely:
\[
\mathbf{Y}(x) = \begin{pmatrix} y_{11}(x) & \dots & y_{1n}(x) \\ \vdots & & \vdots \\ y_{n1}(x) & \dots & y_{nn}(x) \end{pmatrix}. \tag{5.16}
\]
As we have recalled in Sect. 5.1.2, the vectors $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$ are linearly independent provided that the determinant of $\mathbf{Y}(x)$:
\[
W[\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)] = \det\mathbf{Y}(x) \tag{5.17}
\]
is different from zero. This determinant $W$ is called the Wronskian of the $n$ solutions $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$. If $W \neq 0$, then we can express any solution $\mathbf{y}(x)$ of the homogeneous system Eq. 5.15 as a linear combination of the solutions $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$, namely:
\[
\mathbf{y}(x) = c_1\mathbf{y}^{(1)}(x) + \dots + c_n\mathbf{y}^{(n)}(x). \tag{5.18}
\]
In this case, the $n$ vectors $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$ form a fundamental set of solutions of the given system of ODEs.

5.2.2 Homogeneous linear systems with constant coefficients

A system of homogeneous linear ODEs with constant coefficients can be written as:
\[
\mathbf{y}' = \mathbf{Ay}, \tag{5.19}
\]
where $\mathbf{A}$ is a constant $n \times n$ matrix. Analogously to what we have seen with ordinary ODEs (which can be interpreted as systems of ODEs with $n = 1$), we expect an exponential solution of the form $\mathbf{y} = \mathbf{p}e^{\lambda x}$, where the exponent $\lambda$ and the constant vector $\mathbf{p}$ must be determined. It can easily be shown that $\frac{d}{dx}\mathbf{p}e^{\lambda x} = \lambda\mathbf{p}e^{\lambda x}$, therefore substituting $\mathbf{y} = \mathbf{p}e^{\lambda x}$ into the system Eq. 5.19 we obtain:
\[
\lambda\mathbf{p}e^{\lambda x} = \mathbf{Ap}e^{\lambda x}.
\]
We can cancel out $e^{\lambda x}$ from this equation. Moreover, by using the identity matrix $\mathbf{I}$ we can write $\mathbf{p} = \mathbf{Ip}$, obtaining:
\[
(\mathbf{A} - \lambda\mathbf{I})\mathbf{p} = \mathbf{0}. \tag{5.20}
\]
In the end, the solution of the system of ODEs Eq. 5.19 reduces to the system of algebraic equations Eq. 5.20, which is precisely the one that determines the eigenvalues and eigenvectors of the matrix $\mathbf{A}$.

Example 5.2.1 Solve the system of ODEs
\[
\mathbf{y}'(x) = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}\mathbf{y}.
\]
According to Eq. 5.20 we have to solve the system of algebraic equations:
\[
\begin{pmatrix} -2-\lambda & 1 \\ 1 & -2-\lambda \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{5.21}
\]
This system admits a non-trivial solution $\mathbf{p} \neq \mathbf{0}$ only if the determinant of the matrix is zero (namely if the two rows are linearly dependent). This occurs when:
\[
(-2-\lambda)(-2-\lambda) - 1 = 0 \;\Rightarrow\; \lambda^2 + 4\lambda + 3 = 0 \;\Rightarrow\; \lambda_{1,2} = -2 \pm\sqrt{4-3},
\]
namely the eigenvalues are $\lambda_1 = -3$ and $\lambda_2 = -1$. By substituting these values into Eq. 5.21 we can obtain the eigenvectors $\mathbf{p}^{(1)}$ and $\mathbf{p}^{(2)}$. We start with $\lambda_1 = -3$ and obtain:
\[
\begin{pmatrix} -2+3 & 1 \\ 1 & -2+3 \end{pmatrix}\begin{pmatrix} p_1^{(1)} \\ p_2^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\;\Rightarrow\; p_1^{(1)} + p_2^{(1)} = 0.
\]
Of course one equation linking $p_1^{(1)}$ and $p_2^{(1)}$ is sufficient, because the rows of the matrix are now linearly dependent. The solution of this equation is $p_1^{(1)} = -p_2^{(1)}$ and, taking $p_1^{(1)} = 1$, we obtain the eigenvector $\mathbf{p}^{(1)} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$. The choice of $\mathbf{p}^{(1)}$ is arbitrary, because all the remaining infinitely many eigenvectors are linearly dependent on $\mathbf{p}^{(1)}$. If we now substitute $\lambda_2 = -1$ into Eq. 5.21 we obtain the equation
\[
-p_1^{(2)} + p_2^{(2)} = 0,
\]
namely $p_1^{(2)} = p_2^{(2)}$, and therefore the second eigenvector is $\mathbf{p}^{(2)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. The general solution of the given system of ODEs is thus:
\[
\mathbf{y} = c_1\begin{pmatrix} 1 \\ -1 \end{pmatrix}e^{-3x} + c_2\begin{pmatrix} 1 \\ 1 \end{pmatrix}e^{-x}.
\]
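The general solution of Example 5.2.1 can be cross-checked with sympy's dsolve, which handles systems of linear first order ODEs; a minimal sketch, assuming sympy is available:

```python
# Verifying Example 5.2.1 with sympy's dsolve (assumed available).
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.symbols('y1 y2', cls=sp.Function)

eqs = [sp.Eq(y1(x).diff(x), -2*y1(x) + y2(x)),
       sp.Eq(y2(x).diff(x), y1(x) - 2*y2(x))]

for s in sp.dsolve(eqs):
    print(s)
# Up to relabelling of the arbitrary constants, this reproduces
# y = c1*(1, -1)*exp(-3*x) + c2*(1, 1)*exp(-x).
```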
In the simple case in which the system is composed of only 2 equations, we can write Eq. 5.20 as:
\[
\begin{vmatrix} a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{vmatrix} = 0
\;\Rightarrow\;
\lambda^2 - (a_{11}+a_{22})\lambda + a_{11}a_{22} - a_{12}a_{21} = 0. \tag{5.22}
\]
Let us now write explicitly the system $\mathbf{y}'(x) = \mathbf{Ay}$ in this case. We have:
\[
\begin{cases} y_1' = a_{11}y_1 + a_{12}y_2 \\ y_2' = a_{21}y_1 + a_{22}y_2 \end{cases}
\]
We can recover $y_2$ from the first equation, obtaining:
\[
y_2 = \frac{1}{a_{12}}(y_1' - a_{11}y_1).
\]
Substituting this function into the second equation of the system we obtain:
\[
\frac{1}{a_{12}}(y_1'' - a_{11}y_1') = a_{21}y_1 + \frac{a_{22}}{a_{12}}(y_1' - a_{11}y_1)
\;\Rightarrow\;
y_1'' - (a_{11}+a_{22})y_1' + (a_{11}a_{22} - a_{12}a_{21})y_1 = 0. \tag{5.23}
\]
This is a second order homogeneous ODE with constant coefficients, whose characteristic equation is exactly Eq. 5.22. For this reason we can call the equation $\det(\mathbf{A}-\lambda\mathbf{I}) = 0$ the characteristic equation of the given system of ODEs. Eq. 5.23 also suggests how to solve the system $\mathbf{y}' = \mathbf{Ay}$: from it we can recover $y_1(x)$ and, substituting it into the equation $y_2 = \frac{1}{a_{12}}(y_1' - a_{11}y_1)$, we can find $y_2(x)$ as well. This method is also called the elimination method and might be faster than the methods involving the matrix formalism in the case of simple systems.

As we have seen in Example 5.2.1, given a system of linear ODEs $\mathbf{y}' = \mathbf{Ay}$, to find the solution we have to find the eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mathbf{p}^{(i)}$ of the matrix $\mathbf{A}$. The eigenvalues are the roots of Eq. 5.12, therefore, if $\mathbf{A}$ is a real matrix, 3 possibilities may arise:

• all eigenvalues are real and different from each other;

• some eigenvalues occur in complex conjugate pairs;

• some eigenvalues are repeated.

Example 5.2.1 belongs to the first category, and we have seen that, once we have found the eigenvalues $\lambda_1, \dots, \lambda_n$ and the corresponding eigenvectors $\mathbf{p}^{(1)}, \dots, \mathbf{p}^{(n)}$, the general solution is given by:
\[
\mathbf{y} = c_1\mathbf{p}^{(1)}e^{\lambda_1 x} + \dots + c_n\mathbf{p}^{(n)}e^{\lambda_n x}. \tag{5.24}
\]
It is easy to show that the vectors $\mathbf{y}^{(1)}, \dots, \mathbf{y}^{(n)} = \mathbf{p}^{(1)}e^{\lambda_1 x}, \dots, \mathbf{p}^{(n)}e^{\lambda_n x}$ are linearly independent; in fact:
\[
W[\mathbf{y}^{(1)}, \dots, \mathbf{y}^{(n)}](x) =
\begin{vmatrix} p_1^{(1)}e^{\lambda_1 x} & \dots & p_1^{(n)}e^{\lambda_n x} \\ \vdots & & \vdots \\ p_n^{(1)}e^{\lambda_1 x} & \dots & p_n^{(n)}e^{\lambda_n x} \end{vmatrix}
= e^{(\lambda_1+\dots+\lambda_n)x}\begin{vmatrix} p_1^{(1)} & \dots & p_1^{(n)} \\ \vdots & & \vdots \\ p_n^{(1)} & \dots & p_n^{(n)} \end{vmatrix},
\]
and this quantity is different from zero because the vectors $\mathbf{p}^{(1)}, \dots, \mathbf{p}^{(n)}$ are linearly independent.

If some of the eigenvalues of the matrix $\mathbf{A}$ are complex, we know that, if $\mathbf{A}$ is real, they must appear in complex conjugate pairs $\lambda_{1,2} = \mu \pm i\alpha$. In this case, the eigenvectors $\mathbf{p}^{(1)}$ and $\mathbf{p}^{(2)}$ corresponding to the eigenvalues $\lambda_{1,2}$ will be complex conjugates as well. If we take the solution $\mathbf{y}^{(1)} = \mathbf{p}^{(1)}e^{\lambda x}$, with $\mathbf{p}^{(1)} = \mathbf{a} + i\mathbf{b}$, then we have:
\[
\mathbf{y}^{(1)} = (\mathbf{a}+i\mathbf{b})e^{(\mu+i\alpha)x} = e^{\mu x}[\mathbf{a}\cos(\alpha x) - \mathbf{b}\sin(\alpha x)] + ie^{\mu x}[\mathbf{a}\sin(\alpha x) + \mathbf{b}\cos(\alpha x)].
\]
If we write $\mathbf{y}^{(1)} = \mathbf{u} + i\mathbf{v}$, then the vectors:
\[
\mathbf{u}(x) = e^{\mu x}[\mathbf{a}\cos(\alpha x) - \mathbf{b}\sin(\alpha x)] \tag{5.25}
\]
\[
\mathbf{v}(x) = e^{\mu x}[\mathbf{a}\sin(\alpha x) + \mathbf{b}\cos(\alpha x)] \tag{5.26}
\]
are linearly independent, real-valued solutions and can be made part of the fundamental set of solutions of the system of ODEs $\mathbf{y}' = \mathbf{Ay}$.

Example 5.2.2 Find the solution of the system of ODEs
\[
\mathbf{y}' = \begin{pmatrix} -1 & -4 \\ 1 & -1 \end{pmatrix}\mathbf{y}.
\]
We have to find the eigenvalues of the matrix $\mathbf{A}$, namely:
\[
\begin{vmatrix} -1-\lambda & -4 \\ 1 & -1-\lambda \end{vmatrix} = 0
\;\Rightarrow\; (1+\lambda)^2 + 4 = 0
\;\Rightarrow\; \lambda_{1,2} = -1 \pm\sqrt{1-5}.
\]
The eigenvalues are therefore $\lambda_{1,2} = -1 \pm 2i$. As we have learned, we need only one of these eigenvalues, because the real and imaginary parts of the complex vector $\mathbf{p}e^{\lambda x}$ are linearly independent. The eigenvector $\mathbf{p}$ can be obtained by substituting the chosen eigenvalue into the equation $(\mathbf{A}-\lambda\mathbf{I})\mathbf{p} = \mathbf{0}$. If we take the value $\lambda = -1 - 2i$ we obtain:
\[
\begin{pmatrix} 2i & -4 \\ 1 & 2i \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
We must take only one of these equations (they are linearly dependent), for instance $p_1 + 2ip_2 = 0$. If $p_1$ takes the value $2i$, $p_2$ must take the value $-1$, therefore all the solutions of the equation $p_1 + 2ip_2 = 0$ are multiples of the eigenvector
\[
\mathbf{p} = \begin{pmatrix} 2i \\ -1 \end{pmatrix}.
\]
We now have to separate the vector $\mathbf{p}e^{\lambda x}$ into its real and imaginary components $\mathbf{u}$ and $\mathbf{v}$. We obtain:
\[
\mathbf{p}e^{\lambda x} = e^{-x}\begin{pmatrix} 2i \\ -1 \end{pmatrix}[\cos(2x) - i\sin(2x)] = e^{-x}\begin{pmatrix} 2i\cos(2x) + 2\sin(2x) \\ -\cos(2x) + i\sin(2x) \end{pmatrix}
\]
\[
\Rightarrow\; \mathbf{u} = e^{-x}\begin{pmatrix} 2\sin(2x) \\ -\cos(2x) \end{pmatrix}, \qquad \mathbf{v} = e^{-x}\begin{pmatrix} 2\cos(2x) \\ \sin(2x) \end{pmatrix}.
\]
The general solution is thus given by:
\[
\mathbf{y} = e^{-x}\left[c_1\begin{pmatrix} 2\sin(2x) \\ -\cos(2x) \end{pmatrix} + c_2\begin{pmatrix} 2\cos(2x) \\ \sin(2x) \end{pmatrix}\right].
\]
If some eigenvalues of the matrix $\mathbf{A}$ are repeated, namely if one (or more) root $\lambda$ of the equation $\det(\mathbf{A}-\lambda\mathbf{I}) = 0$ has algebraic multiplicity $m$ larger than one, as we have said two possibilities may arise:

• the geometric multiplicity $q$ of $\lambda$ is equal to $m$;

• $q$ is smaller than $m$.

In the first case, there are still $m$ linearly independent eigenvectors $\mathbf{p}^{(1)}, \dots, \mathbf{p}^{(m)}$ corresponding to the eigenvalue $\lambda$, and therefore the vectors $\mathbf{p}^{(1)}e^{\lambda x}, \dots, \mathbf{p}^{(m)}e^{\lambda x}$ are linearly independent. The general solution is still of the form of Eq. 5.24, namely:
\[
\mathbf{y} = c_1\mathbf{p}^{(1)}e^{\lambda_1 x} + \dots + c_n\mathbf{p}^{(n)}e^{\lambda_n x},
\]
although some values of $\lambda$ are repeated. If instead there are fewer than $m$ linearly independent eigenvectors corresponding to an eigenvalue with algebraic multiplicity $m$, then not all the vectors forming the fundamental set of solutions have the form $\mathbf{p}^{(i)}e^{\lambda_i x}$. By analogy with the results for linear ODEs of order $n$, we might expect additional solutions involving products of polynomials with exponential functions. If the root $\lambda$ of the equation $\det(\mathbf{A}-\lambda\mathbf{I}) = 0$ is double, then one solution has the standard form $\mathbf{p}e^{\lambda x}$, whereas the second solution must have the form:
\[
\mathbf{u}xe^{\lambda x} + \mathbf{v}e^{\lambda x}. \tag{5.27}
\]
We can see here that, at variance with higher-order linear ODEs, we have a term proportional to $xe^{\lambda x}$ and a term proportional to $e^{\lambda x}$. We cannot drop the latter term, because the vector $\mathbf{v}$ is not a multiple of the vector $\mathbf{p}$.

Example 5.2.3 Find the solution of the system of ODEs
\[
\mathbf{y}' = \begin{pmatrix} 1 & -1 \\ 1 & 3 \end{pmatrix}\mathbf{y}.
\]
We have to find the eigenvalues of the matrix $\mathbf{A}$, namely:
\[
\begin{vmatrix} 1-\lambda & -1 \\ 1 & 3-\lambda \end{vmatrix} = 0
\;\Rightarrow\; \lambda^2 - 3\lambda - \lambda + 3 + 1 = 0
\;\Rightarrow\; \lambda_{1,2} = 2.
\]
The eigenvalue $\lambda = 2$ is thus a double eigenvalue (namely, its algebraic multiplicity $m$ is 2). The eigenvector $\mathbf{p}$ can be obtained by substituting it into the equation $(\mathbf{A}-\lambda\mathbf{I})\mathbf{p} = \mathbf{0}$. We obtain:
\[
\begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
We must take only one of these equations (they are linearly dependent), for instance
\[
p_1 + p_2 = 0,
\]
obtaining, as eigenvector:
\[
\mathbf{p} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
Example 5.2.3 Find the solution of the system of ODEs
\[
\mathbf{y}' = \begin{pmatrix} 1 & -1 \\ 1 & 3 \end{pmatrix}\mathbf{y}.
\]
We have to find the zeros of $\det(A - \lambda I)$, namely:
\[
\begin{vmatrix} 1-\lambda & -1 \\ 1 & 3-\lambda \end{vmatrix} = 0
\;\Rightarrow\; \lambda^2 - 4\lambda + 4 = 0
\;\Rightarrow\; \lambda_{1,2} = 2.
\]
The eigenvalue $\lambda = 2$ is thus a double eigenvalue (namely, its algebraic multiplicity $m$ is 2). The eigenvector $\mathbf{p}$ can be obtained by substituting it into the equation $(A - \lambda I)\mathbf{p} = 0$. We obtain:
\[
\begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
We must take only one of these equations (they are linearly dependent), for instance
\[
p_1 + p_2 = 0,
\]
obtaining, as eigenvector:
\[
\mathbf{p} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
There is no way to obtain another linearly independent eigenvector corresponding to the eigenvalue 2 (this means that the geometric multiplicity of $\lambda = 2$ is 1). One solution of the given system of ODEs thus has the form $\begin{pmatrix} 1 \\ -1 \end{pmatrix}e^{2x}$, whereas, in order to obtain the second solution, we have to apply Eq. 5.27. Upon substitution of this expression into the original system of ODEs, we obtain:
\[
[2(\mathbf{u}x + \mathbf{v}) + \mathbf{u}]e^{2x} = A(\mathbf{u}x + \mathbf{v})e^{2x}.
\]
We can now equate the terms with the same power of $x$, obtaining:
\[
2\mathbf{u} = A\mathbf{u}, \qquad 2\mathbf{v} + \mathbf{u} = A\mathbf{v}.
\]
We can alternatively write these two equations as:
\[
(A - 2I)\mathbf{u} = 0 \quad (5.28)
\]
\[
(A - 2I)\mathbf{v} = \mathbf{u}. \quad (5.29)
\]
The first equation simply implies that $\mathbf{u}$ must be an eigenvector corresponding to the eigenvalue $\lambda = 2$, namely $\mathbf{u} = \mathbf{p} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$. From the second equation we obtain instead:
\[
\begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
These are two linearly dependent equations. We solve one of them, for instance:
\[
v_1 + v_2 = -1.
\]
If $v_1$ takes the value $C$, then $v_2$ must be $-1 - C$, namely we have:
\[
\mathbf{v} = \begin{pmatrix} C \\ -1-C \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix} + C\begin{pmatrix} 1 \\ -1 \end{pmatrix},
\]
and the second solution of the given system of ODEs has, according to Eq. 5.27, the form:
\[
\mathbf{y}_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}xe^{2x} + \begin{pmatrix} 0 \\ -1 \end{pmatrix}e^{2x} + C\begin{pmatrix} 1 \\ -1 \end{pmatrix}e^{2x}.
\]
The last summand on the right hand side of this equation is proportional to $\mathbf{p}e^{2x}$ and may be ignored. The general solution is thus given by:
\[
\mathbf{y} = e^{2x}\left\{ c_1\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2\left[\begin{pmatrix} 1 \\ -1 \end{pmatrix}x + \begin{pmatrix} 0 \\ -1 \end{pmatrix}\right] \right\}.
\]

It is not difficult to demonstrate that, if an eigenvalue $\lambda$ has algebraic multiplicity $m = 2$ and geometric multiplicity $q = 1$, then the vectors $\mathbf{u}$ and $\mathbf{v}$ forming the second solution (according to Eq. 5.27) can always be determined by equations of the form of Eqs. 5.28 and 5.29, namely:
\[
(A - \lambda I)\mathbf{u} = 0 \quad (5.30)
\]
\[
(A - \lambda I)\mathbf{v} = \mathbf{u}. \quad (5.31)
\]
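Eqs. 5.30 and 5.31 can be solved numerically as well. A minimal sketch for Example 5.2.3 follows; since $A - 2I$ is singular but the system (5.29) is consistent, a least-squares solver returns one particular $\mathbf{v}$ (any $\mathbf{v} + C\mathbf{u}$ is equally valid, exactly as found above).

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0, 3.0]])            # matrix of Example 5.2.3
lam = 2.0
M = A - lam * np.eye(2)

u = np.array([1.0, -1.0])             # eigenvector of the double eigenvalue
# (A - 2I) v = u is singular but consistent; lstsq picks one solution
v, *_ = np.linalg.lstsq(M, u, rcond=None)

def y2(x):
    """Second solution u*x*exp(2x) + v*exp(2x) (Eq. 5.27)."""
    return (u * x + v) * np.exp(lam * x)

x, h = 0.2, 1e-6
d = (y2(x + h) - y2(x - h)) / (2 * h)
print(np.allclose(d, A @ y2(x)))      # -> True
```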
5.2.3 Nonhomogeneous linear systems with constant coefficients

Analogously to the solution of an $n$-th order ODE, a system of $n$ first-order nonhomogeneous linear ODEs $\mathbf{y}' = A\mathbf{y} + \mathbf{g}$ has, as general solution, the vector
\[
\mathbf{y}(x) = c_1\mathbf{y}^{(1)}(x) + \dots + c_n\mathbf{y}^{(n)}(x) + \mathbf{y}_p(x), \quad (5.32)
\]
where $c_1\mathbf{y}^{(1)}(x) + \dots + c_n\mathbf{y}^{(n)}(x)$ is the general solution of the corresponding homogeneous system $\mathbf{y}' = A\mathbf{y}$, whereas the vector $\mathbf{y}_p(x)$ is a particular solution of the nonhomogeneous system. In order to find the particular solution of a system of ODEs, the same methods employed in the solution of linear $n$-th order nonhomogeneous ODEs can be used, namely:

• D-operator;
• undetermined coefficients;
• Laplace transforms;
• variation of parameters.

Method of the D-operator

This method is an extension of the elimination method we have seen in Sect. 5.2.2. Taking for simplicity a system of two linear ODEs, we can write them as:
\[
\begin{cases} Dy_1 = a_{11}y_1 + a_{12}y_2 + g_1(x) \\ Dy_2 = a_{21}y_1 + a_{22}y_2 + g_2(x) \end{cases}.
\]
From the first we obtain:
\[
y_2 = \frac{1}{a_{12}}[Dy_1 - a_{11}y_1 - g_1(x)].
\]
Substituting it into the second ODE we obtain:
\[
\frac{1}{a_{12}}(D^2y_1 - a_{11}Dy_1 - Dg_1) = a_{21}y_1 + \frac{a_{22}}{a_{12}}[Dy_1 - a_{11}y_1 - g_1(x)] + g_2(x)
\]
\[
\Rightarrow\; D^2y_1 - (a_{11}+a_{22})Dy_1 + (a_{11}a_{22}-a_{12}a_{21})y_1 = Dg_1 - a_{22}g_1(x) + a_{12}g_2(x). \quad (5.33)
\]
This is a second-order nonhomogeneous ODE whose complementary solution can be found by means of the standard methods and whose particular solution can be obtained using the properties of the D-operator.

Example 5.2.4 Find the solution of the system of ODEs:
\[
\begin{cases} Dy_1 = y_1 + y_2 \\ Dy_2 = 4y_1 + y_2 + e^x \end{cases}.
\]
From the first equation we obtain $y_2 = Dy_1 - y_1$. We substitute it into the second ODE, obtaining:
\[
D^2y_1 - Dy_1 = 4y_1 + Dy_1 - y_1 + e^x \;\Rightarrow\; (D^2 - 2D - 3)y_1 = e^x.
\]
The characteristic equation of the corresponding homogeneous ODE is:
\[
\lambda^2 - 2\lambda - 3 = 0 \;\Rightarrow\; \lambda = 1 \pm \sqrt{1+3},
\]
whose roots are therefore $\lambda_1 = -1$ and $\lambda_2 = 3$. The particular solution is given by:
\[
y_{1,p} = \frac{1}{D^2 - 2D - 3}e^x = \frac{1}{1 - 2 - 3}e^x = -\frac{1}{4}e^x.
\]
We have thus:
\[
y_1(x) = c_1e^{-x} + c_2e^{3x} - \frac{1}{4}e^x.
\]
We can find the second solution $y_2$ by means of the relation $y_2 = Dy_1 - y_1$, namely:
\[
y_2(x) = -c_1e^{-x} + 3c_2e^{3x} - \frac{1}{4}e^x - c_1e^{-x} - c_2e^{3x} + \frac{1}{4}e^x = -2c_1e^{-x} + 2c_2e^{3x}.
\]
We can express the solution in vectorial form:
\[
\mathbf{y}(x) = c_1\begin{pmatrix} 1 \\ -2 \end{pmatrix}e^{-x} + c_2\begin{pmatrix} 1 \\ 2 \end{pmatrix}e^{3x} - \frac{1}{4}\begin{pmatrix} 1 \\ 0 \end{pmatrix}e^x.
\]

Method of the undetermined coefficients

This method consists in guessing the correct form of the particular solution (guided by the form of the vector $\mathbf{g}(x)$), leaving the coefficients undetermined, and determining the coefficients by direct substitution into the given system of ODEs. This method is analogous to the method we have already seen in the section about $n$-th order linear ODEs, the only difference being that, if we have a nonhomogeneous term of the form $\mathbf{g} = \mathbf{h}e^{\lambda x}$ and if $\lambda$ is a simple root of the characteristic equation, then the solution to seek has the form $\mathbf{a}xe^{\lambda x} + \mathbf{b}e^{\lambda x}$ and not simply $\mathbf{a}xe^{\lambda x}$.

Example 5.2.5 Find the solution of the system of ODEs
\[
\begin{cases} y_1' = 2y_1 - y_2 + e^x \\ y_2' = 3y_1 - 2y_2 \end{cases}.
\]
With the matrix formalism this system can be written as:
\[
\mathbf{y}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{y} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}e^x.
\]
To find the general solution of the corresponding homogeneous system, we have to solve the equation:
\[
\begin{vmatrix} 2-\lambda & -1 \\ 3 & -2-\lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2 - 4 + 3 = 0.
\]
The eigenvalues are thus $\lambda_1 = 1$ and $\lambda_2 = -1$. Corresponding to the eigenvalue $\lambda_1$ is the eigenvector $\mathbf{p}^{(1)}$ given by:
\[
\begin{pmatrix} 1 & -1 \\ 3 & -3 \end{pmatrix}\begin{pmatrix} p_1^{(1)} \\ p_2^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\;\Rightarrow\; p_1^{(1)} = p_2^{(1)}
\;\Rightarrow\; \mathbf{p}^{(1)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
\]
The eigenvector $\mathbf{p}^{(2)}$ corresponding to the eigenvalue $\lambda_2$ is instead given by:
\[
\begin{pmatrix} 3 & -1 \\ 3 & -1 \end{pmatrix}\begin{pmatrix} p_1^{(2)} \\ p_2^{(2)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\;\Rightarrow\; 3p_1^{(2)} = p_2^{(2)}
\;\Rightarrow\; \mathbf{p}^{(2)} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}.
\]
The complementary solution is thus given by:
\[
\mathbf{y}_c = c_1\begin{pmatrix} 1 \\ 1 \end{pmatrix}e^x + c_2\begin{pmatrix} 1 \\ 3 \end{pmatrix}e^{-x}.
\]
Since the vector $\mathbf{p}^{(1)}e^x$ is part of the complementary solution, to find $\mathbf{y}_p$ we have to assume a solution of the form $\mathbf{y}_p = \mathbf{a}xe^x + \mathbf{b}e^x$. Substituting it into the original system of ODEs we obtain:
\[
\mathbf{a}e^x + \mathbf{a}xe^x + \mathbf{b}e^x = A(\mathbf{a}xe^x + \mathbf{b}e^x) + \begin{pmatrix} 1 \\ 0 \end{pmatrix}e^x.
\]
We can cancel out $e^x$ from this system of equations. Moreover, we can compare the coefficients containing $x$ and the coefficients not containing $x$ in both members of the system, obtaining the two vector equations:
\[
\mathbf{a} = A\mathbf{a}, \qquad \mathbf{a} + \mathbf{b} = A\mathbf{b} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]
From the first equation we know that $\mathbf{a}$ must be proportional to the eigenvector corresponding to the eigenvalue $\lambda = 1$, namely
\[
\mathbf{a} = C\mathbf{p}^{(1)} = C\begin{pmatrix} 1 \\ 1 \end{pmatrix}.
\]
Substituting this value of $\mathbf{a}$ into the second equation we obtain:
\[
(A - I)\mathbf{b} = C\begin{pmatrix} 1 \\ 1 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\;\Rightarrow\;
\begin{cases} b_1 - b_2 = C - 1 \\ 3b_1 - 3b_2 = C \end{cases}.
\]
This system of equations admits (infinitely many) solutions only if the second equation is proportional to the first one, namely only if
\[
3(C - 1) = C \;\Rightarrow\; C = \frac{3}{2}.
\]
This leads to the equation
\[
b_1 - b_2 = \frac{1}{2}.
\]
If we give $b_1$ a value $K$, then $b_2$ must be $K - \frac{1}{2}$, therefore the vector $\mathbf{b}$ is of the form:
\[
\mathbf{b} = \begin{pmatrix} K \\ K - \frac{1}{2} \end{pmatrix}.
\]
In the end, the particular solution of the given system of ODEs is given by:
\[
\mathbf{y}_p(x) = \frac{3}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}xe^x + \begin{pmatrix} K \\ K - \frac{1}{2} \end{pmatrix}e^x.
\]
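Systems like the one in Example 5.2.4 can also be handed to a computer algebra system. The following is a sketch using SymPy's ODE-system solver; the integration constants may be grouped differently than in the hand calculation, but the solution spaces should agree.

```python
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.Function('y1'), sp.Function('y2')

# System of Example 5.2.4: Dy1 = y1 + y2, Dy2 = 4*y1 + y2 + exp(x)
eqs = [sp.Eq(y1(x).diff(x), y1(x) + y2(x)),
       sp.Eq(y2(x).diff(x), 4*y1(x) + y2(x) + sp.exp(x))]

sol = sp.dsolve(eqs)
for s in sol:
    print(sp.simplify(s.rhs))
# Up to renaming of the constants this should reproduce
# y1 = c1*exp(-x) + c2*exp(3x) - exp(x)/4 and y2 = -2*c1*exp(-x) + 2*c2*exp(3x).
```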
Method of the Laplace transforms

The Laplace transform of a vector of functions $\mathbf{f}(x)$ is simply the vector whose components are the Laplace transforms of the components of $\mathbf{f}(x)$. If we apply the Laplace transformation to the vector $\mathbf{y}'(x)$, by simple extension of the result $\mathcal{L}\{f'(x)\} = sF(s) - f(0)$, we find:
\[
\mathcal{L}\{\mathbf{y}'(x)\} = s\mathbf{Y}(s) - \mathbf{y}(0).
\]
Given the nonhomogeneous system of ODEs with constant coefficients $\mathbf{y}'(x) = A\mathbf{y}(x) + \mathbf{g}(x)$, with the initial condition $\mathbf{y}(0) = \mathbf{y}_0$, we can apply the Laplace transformation to both members of this equation. If we call $\mathbf{Y}(s)$ the Laplace transform of the vector $\mathbf{y}(x)$ and $\mathbf{G}(s)$ the Laplace transform of $\mathbf{g}(x)$, we obtain:
\[
s\mathbf{Y}(s) - \mathbf{y}_0 = A\mathbf{Y}(s) + \mathbf{G}(s)
\;\Rightarrow\;
(sI - A)\mathbf{Y}(s) = \mathbf{y}_0 + \mathbf{G}(s). \quad (5.34)
\]
In fact, since $A$ is a constant matrix, $\mathcal{L}\{A\mathbf{y}\} = A\mathcal{L}\{\mathbf{y}\}$. Eq. 5.34 can be solved (for instance by means of Cramer's rule or by inverting the matrix $sI - A$) and, once we have obtained $\mathbf{Y}(s)$, we can invert it element by element and obtain the vector solution $\mathbf{y}(x) = \mathcal{L}^{-1}\{\mathbf{Y}(s)\}$.

Example 5.2.6 Solve the initial value problem:
\[
\mathbf{y}' = \begin{pmatrix} 1 & 3 \\ 2 & 0 \end{pmatrix}\mathbf{y} + \begin{pmatrix} 0 \\ \sin x \end{pmatrix}, \qquad \mathbf{y}(0) = \mathbf{0}.
\]
We use the property $\mathcal{L}\{\mathbf{y}'\} = s\mathbf{Y}(s) - \mathbf{y}_0$, where $\mathbf{Y}(s)$ is the Laplace transform of the vector $\mathbf{y}(x)$, and apply the Laplace transformation to both members of the given system of ODEs, obtaining:
\[
s\mathbf{Y}(s) = \begin{pmatrix} 1 & 3 \\ 2 & 0 \end{pmatrix}\mathbf{Y}(s) + \begin{pmatrix} 0 \\ \frac{1}{s^2+1} \end{pmatrix}
\;\Rightarrow\;
\begin{pmatrix} s-1 & -3 \\ -2 & s \end{pmatrix}\mathbf{Y}(s) = \begin{pmatrix} 0 \\ \frac{1}{s^2+1} \end{pmatrix}.
\]
This system of algebraic equations can be solved, for instance, by means of Cramer's rule. We calculate first the component $Y_1(s)$ of the vector $\mathbf{Y}(s)$, obtaining:
\[
Y_1(s) = \frac{\begin{vmatrix} 0 & -3 \\ \frac{1}{s^2+1} & s \end{vmatrix}}{\begin{vmatrix} s-1 & -3 \\ -2 & s \end{vmatrix}}
= \frac{\frac{3}{s^2+1}}{s^2 - s - 6}.
\]
The roots of the denominator $s^2 - s - 6$ are:
\[
s = \frac{1 \pm \sqrt{1 + 24}}{2} \;\Rightarrow\; s_{1,2} = 3, -2.
\]
We can thus express $Y_1(s)$ as:
\[
Y_1(s) = \frac{3}{(s^2+1)(s-3)(s+2)}.
\]
We now apply the method of partial fractions, namely we seek the coefficients $A$, $B$, $C$, $D$ such that:
\[
\frac{As + B}{s^2+1} + \frac{C}{s-3} + \frac{D}{s+2} = \frac{3}{(s^2+1)(s-3)(s+2)},
\]
namely:
\[
As^3 - As^2 - 6As + Bs^2 - Bs - 6B + Cs^3 + 2Cs^2 + Cs + 2C + Ds^3 - 3Ds^2 + Ds - 3D = 3.
\]
This leads to the system:
\[
\begin{cases} A + C + D = 0 \\ -A + B + 2C - 3D = 0 \\ -6A - B + C + D = 0 \\ -6B + 2C - 3D = 3 \end{cases}
\;\Rightarrow\;
\begin{cases} A = -C - D \\ B = -3C + 2D \\ 10C + 5D = 0 \\ 20C - 15D = 3 \end{cases}.
\]
If we multiply the third equation by 3 and sum it with the fourth equation we obtain $50C = 3$. From it, we find the solution for all the other coefficients, namely:
\[
A = \frac{3}{50}, \quad B = -\frac{21}{50}, \quad C = \frac{3}{50}, \quad D = -\frac{3}{25}.
\]
The function $Y_1(s)$ is therefore:
\[
Y_1(s) = \frac{3}{50}\frac{s}{s^2+1} - \frac{21}{50}\frac{1}{s^2+1} + \frac{3}{50}\frac{1}{s-3} - \frac{3}{25}\frac{1}{s+2}.
\]
It is easy to invert this function, obtaining:
\[
y_1(x) = \mathcal{L}^{-1}\{Y_1(s)\} = \frac{3}{50}\cos x - \frac{21}{50}\sin x + \frac{3}{50}e^{3x} - \frac{3}{25}e^{-2x}.
\]
The second component of the vector $\mathbf{Y}(s)$ is given by:
\[
Y_2(s) = \frac{\begin{vmatrix} s-1 & 0 \\ -2 & \frac{1}{s^2+1} \end{vmatrix}}{(s-3)(s+2)} = \frac{s-1}{(s^2+1)(s-3)(s+2)}.
\]
The method of partial fractions yields the same system of equations we have already encountered solving for $Y_1(s)$; only the right-hand sides of the equations are different, namely:
\[
\begin{cases} A + C + D = 0 \\ -A + B + 2C - 3D = 0 \\ -6A - B + C + D = 1 \\ -6B + 2C - 3D = -1 \end{cases}
\;\Rightarrow\;
\begin{cases} A = -C - D \\ B = -3C + 2D \\ 10C + 5D = 1 \\ 20C - 15D = -1 \end{cases}
\;\Rightarrow\;
A = -\frac{4}{25}, \quad B = \frac{3}{25}, \quad C = \frac{1}{25}, \quad D = \frac{3}{25}.
\]
We obtain therefore:
\[
Y_2(s) = -\frac{4}{25}\frac{s}{s^2+1} + \frac{3}{25}\frac{1}{s^2+1} + \frac{1}{25}\frac{1}{s-3} + \frac{3}{25}\frac{1}{s+2}
\]
\[
\Rightarrow\; y_2(x) = -\frac{4}{25}\cos x + \frac{3}{25}\sin x + \frac{1}{25}e^{3x} + \frac{3}{25}e^{-2x}.
\]
The functions $y_1(x)$ and $y_2(x)$ we have found in this way are the components of the vector solution $\mathbf{y}(x)$.
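The whole Laplace-domain procedure of Eq. 5.34 can be automated. Below is a sketch in SymPy for Example 5.2.6; note that SymPy may attach Heaviside(x) factors to the inverse transforms, which are valid for $x > 0$.

```python
import sympy as sp

s, x = sp.symbols('s x', positive=True)

A = sp.Matrix([[1, 3],
               [2, 0]])                      # matrix of Example 5.2.6
G = sp.Matrix([0, 1/(s**2 + 1)])             # Laplace transform of (0, sin x)
y0 = sp.Matrix([0, 0])                       # initial condition y(0) = 0

# Eq. 5.34: (sI - A) Y(s) = y0 + G(s)
Y = (s*sp.eye(2) - A).solve(y0 + G)

# Invert element by element (partial fractions first, then L^-1):
y = [sp.inverse_laplace_transform(sp.apart(Yi, s), s, x) for Yi in Y]
print([sp.simplify(yi) for yi in y])
# Should match y1 = 3/50 cos x - 21/50 sin x + 3/50 e^{3x} - 3/25 e^{-2x}, etc.
```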
Method of variation of parameters

The methods we have encountered so far can be applied to the system of nonhomogeneous ODEs $\mathbf{y}' = A\mathbf{y} + \mathbf{g}$ only if the components of the vector $\mathbf{g}$ are simple functions (sinusoidal functions, exponentials, polynomials). For more complex functions, the only viable method is the variation of parameters. We have seen that, given the homogeneous system of ODEs $\mathbf{y}' = A\mathbf{y}$, the solution can be expressed as:
\[
\mathbf{y} = c_1\mathbf{y}^{(1)}(x) + \dots + c_n\mathbf{y}^{(n)}(x),
\]
where the vectors $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$ form a fundamental set of solutions and are usually of the form $\mathbf{y}^{(i)} = \mathbf{p}^{(i)}e^{\lambda_i x}$. This solution can also be written with the compact notation $\mathbf{y} = Y(x)\mathbf{c}$, where $Y$ is the matrix whose columns are the vectors $\mathbf{y}^{(1)}(x), \dots, \mathbf{y}^{(n)}(x)$ and $\mathbf{c}$ is the (constant) column vector formed by the coefficients $c_1, \dots, c_n$. If we now allow the coefficients to vary, the vector $\mathbf{c}$ becomes dependent on $x$. If we substitute the vector $\mathbf{y} = Y(x)\mathbf{c}(x)$ into the nonhomogeneous system $\mathbf{y}' = A\mathbf{y} + \mathbf{g}$ we obtain:
\[
Y'(x)\mathbf{c}(x) + Y(x)\mathbf{c}'(x) = AY(x)\mathbf{c}(x) + \mathbf{g}(x).
\]
Since $Y$ is made of solutions of the corresponding homogeneous system of ODEs, it is $Y'(x) = AY(x)$ and, consequently, $Y'(x)\mathbf{c}(x) = AY(x)\mathbf{c}(x)$. We can cancel out these two terms from the previous equation and what remains is:
\[
Y(x)\mathbf{c}'(x) = \mathbf{g}(x) \;\Rightarrow\; \mathbf{c}'(x) = Y^{-1}(x)\mathbf{g}(x).
\]
In fact, the matrix $Y$ is made of the (linearly independent) vectors that form the fundamental set of solutions, therefore $\det Y \neq 0$ and we can always find the inverse $Y^{-1}$. From this equation, we obtain:
\[
\mathbf{c}(x) = \int Y^{-1}(x)\mathbf{g}(x)\,dx + \mathbf{K}
\;\Rightarrow\;
\mathbf{y}(x) = Y(x)\left[\int Y^{-1}(x)\mathbf{g}(x)\,dx + \mathbf{K}\right], \quad (5.35)
\]
where $\mathbf{K}$ is a constant vector.

Example 5.2.7 Find the particular solution of the system of ODEs
\[
\mathbf{y}' = \begin{pmatrix} 2 & -5 \\ 1 & -2 \end{pmatrix}\mathbf{y} + \begin{pmatrix} \frac{1}{\sin x} \\ \frac{1}{\cos x} \end{pmatrix}.
\]
We first have to find the matrix $Y$ whose columns are solutions of the homogeneous system:
\[
\mathbf{y}' = \begin{pmatrix} 2 & -5 \\ 1 & -2 \end{pmatrix}\mathbf{y}.
\]
The characteristic equation is:
\[
(2-\lambda)(-2-\lambda) + 5 = 0 \;\Rightarrow\; \lambda^2 + 1 = 0.
\]
The eigenvalues are therefore the (complex conjugate) values $\lambda_{1,2} = \pm i$. We can take $\lambda = i$ and calculate the corresponding eigenvector:
\[
\begin{pmatrix} 2-i & -5 \\ 1 & -2-i \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
The second equation of this system is:
\[
p_1 - (2+i)p_2 = 0 \;\Rightarrow\; \mathbf{p} = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}.
\]
We have obtained a complex eigenvalue $\lambda = i$ and a complex eigenvector $\mathbf{p} = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}$. As we have learned, we have to find the real and the imaginary parts of $\mathbf{p}e^{\lambda x}$, namely:
\[
\mathbf{y} = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}e^{ix} = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}(\cos x + i\sin x)
= \begin{pmatrix} (2\cos x - \sin x) + i(\cos x + 2\sin x) \\ \cos x + i\sin x \end{pmatrix}
\]
\[
\Rightarrow\;
\mathbf{u} = \begin{pmatrix} 2\cos x - \sin x \\ \cos x \end{pmatrix}, \qquad
\mathbf{v} = \begin{pmatrix} \cos x + 2\sin x \\ \sin x \end{pmatrix}.
\]
The matrix $Y$ is thus given by:
\[
Y = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}.
\]
Given a $2 \times 2$ matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, it is easy to show that the inverse is given by:
\[
\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
\]
In the case of the matrix $Y$ it is:
\[
ad - bc = \det Y = 2\sin x\cos x - \sin^2 x - \cos^2 x - 2\sin x\cos x = -1,
\]
therefore $Y^{-1}$ is given by:
\[
Y^{-1} = \begin{pmatrix} -\sin x & \cos x + 2\sin x \\ \cos x & \sin x - 2\cos x \end{pmatrix}.
\]
We can now apply Eq. 5.35 and obtain:
\[
\mathbf{y}(x) = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}
\int \begin{pmatrix} -\sin x & \cos x + 2\sin x \\ \cos x & \sin x - 2\cos x \end{pmatrix}\begin{pmatrix} \frac{1}{\sin x} \\ \frac{1}{\cos x} \end{pmatrix} dx
\]
\[
= \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}
\int \begin{pmatrix} -1 + 1 + 2\tan x \\ \cot x + \tan x - 2 \end{pmatrix} dx
\]
\[
= \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}
\left[\begin{pmatrix} -2\ln(\cos x) \\ \ln(\sin x) - \ln(\cos x) - 2x \end{pmatrix} + \mathbf{K}\right]
\]
\[
= \begin{pmatrix} -2(2\cos x - \sin x)\ln(\cos x) + (\cos x + 2\sin x)[\ln(\sin x) - \ln(\cos x) - 2x] \\ -2\cos x\ln(\cos x) + \sin x[\ln(\sin x) - \ln(\cos x) - 2x] \end{pmatrix}
+ \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}\mathbf{K}.
\]
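Since the bookkeeping in Eq. 5.35 is error-prone by hand, a symbolic check is worthwhile. The following sketch reproduces the key steps of Example 5.2.7 in SymPy.

```python
import sympy as sp

x = sp.symbols('x')

# Fundamental matrix of the homogeneous system of Example 5.2.7
Y = sp.Matrix([[2*sp.cos(x) - sp.sin(x), sp.cos(x) + 2*sp.sin(x)],
               [sp.cos(x), sp.sin(x)]])
g = sp.Matrix([1/sp.sin(x), 1/sp.cos(x)])

# Eq. 5.35: y_p = Y(x) * Integral( Y^{-1}(x) g(x) dx )
integrand = sp.simplify(Y.inv() * g)
print(integrand)   # should reduce to (2*tan(x), cot(x) + tan(x) - 2)

c = integrand.applyfunc(lambda f: sp.integrate(f, x))
yp = sp.simplify(Y * c)
print(yp)          # the particular solution found in the text
```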
5.3 Systems of second order linear ODEs

A generic system of second order linear ODEs with constant coefficients can be written as:
\[
\begin{cases}
y_1'' = a_{11}^{(1)}y_1 + a_{11}^{(2)}y_1' + \dots + a_{1n}^{(1)}y_n + a_{1n}^{(2)}y_n' + g_1(x) \\
\quad\vdots \\
y_n'' = a_{n1}^{(1)}y_1 + a_{n1}^{(2)}y_1' + \dots + a_{nn}^{(1)}y_n + a_{nn}^{(2)}y_n' + g_n(x)
\end{cases}. \quad (5.36)
\]
This system of $n$ second order ODEs can be transformed into a system of $2n$ first order ODEs by the substitutions:
\[
y_1' = u_1, \;\dots,\; y_n' = u_n.
\]
The resulting system of ODEs is:
\[
\begin{cases}
u_1' = a_{11}^{(1)}y_1 + a_{11}^{(2)}u_1 + \dots + a_{1n}^{(1)}y_n + a_{1n}^{(2)}u_n + g_1(x) \\
\quad\vdots \\
u_n' = a_{n1}^{(1)}y_1 + a_{n1}^{(2)}u_1 + \dots + a_{nn}^{(1)}y_n + a_{nn}^{(2)}u_n + g_n(x) \\
y_1' = u_1 \\
\quad\vdots \\
y_n' = u_n
\end{cases},
\]
and it can be solved with the methods described so far (a numerical version of this reduction is sketched below). On the other hand, a system of second order ODEs can also be solved by means of elimination methods. For instance, in the system:
\[
\begin{cases} y_1'' = ay_1 + by_2 + g_1(x) \\ y_2'' = cy_1 + dy_2 + ey_1' + fy_2' + g_2(x) \end{cases},
\]
it is possible to recover $y_2$ from the first equation and, substituting it into the second equation, obtain a fourth order nonhomogeneous ODE in $y_1$ that can be solved with the known methods; after substituting the result back into the first ODE, $y_2$ can be found as well.
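The reduction to first order is exactly what numerical ODE integrators expect. Here is a minimal sketch with SciPy; the particular second order system used is a hypothetical example invented for illustration, not one from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical system: y1'' = -4*y1 + y2, y2'' = y1 - 4*y2,
# reduced to first order with u_i = y_i' as in the substitutions above.
def rhs(x, Y):
    y1, y2, u1, u2 = Y
    return [u1, u2, -4*y1 + y2, y1 - 4*y2]

# State vector (y1, y2, u1, u2); initial condition chosen arbitrarily.
sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0, 0.0, 0.0], max_step=0.01)
print(sol.y[0, -1])   # y1 at x = 10
```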
Chapter 6

Modeling physical systems with ODEs

In this chapter we will give some indications on how to treat physical problems by means of differential equations (or systems of differential equations). We will start by providing some guidelines on how to construct the mathematical model (namely the underlying differential equations) of a given physical problem. We will then describe some famous (and some less famous) physical problems and show how the mathematical tools learned so far can help us find their solutions.

6.1 Constructing mathematical models

Differential equations are useful in practically all physical problems of some relevance. However, some work is needed to translate a physical problem into the mathematical language and to formulate the appropriate differential equation(s) describing the problem being investigated. We have already seen in Chapter 2 some very easy physical problems (like the fall of a body with and without air resistance); in this section we will take into account (slightly) more complicated problems, trying to learn general methods useful to treat any physical system. It is however worth remarking that "magical recipes" on how to find the mathematical model of a physical process do not exist, and a lot of experience and hard work are always the best way to find the solutions we seek. It might be useful to start this section with two examples.

Example 6.1.1 A baseball player (with some knowledge of physics) is interested in knowing the "optimum angle" (which he calls $\alpha_o$) at which he should hit the ball, the one that guarantees the maximum range of the ball. Moreover, he knows that he can give the ball an initial speed of $\sim 35$ m/s and he wishes to know how much he can deviate from the optimum angle and still make a home run (namely have a range of $\sim 120$ m).

Figure 6.1: Reference frame to use in Example 6.1.1.

The baseball player needs to know the range of the ball, therefore a spatial coordinate (which he will call $x$ and measure in meters). However, he needs to calculate the ball's orbit, therefore the vertical coordinate ($y$) must be taken into account as well. He must therefore start by deciding the best reference frame, so that he knows the zero point of the horizontal and vertical coordinates. The best possible reference frame is indicated in Fig. 6.1. In fact, the height of the bat at the moment in which he hits the ball is comparable to the height of the external wall, therefore the natural choice is to take this height as the zero point of the $y$-axis. He will then call $\alpha$ the angle of elevation of the ball at the moment he hits it. To obtain the orbit of the ball, the baseball player needs to know how the spatial coordinates $x$ and $y$ vary with time, therefore the time (which he will measure in seconds) is the independent variable of the problem. He now needs to know the forces acting on the ball. He can argue that, since the ball is small and the whole flight should last only a few seconds, the air resistance can be neglected and the only force in action is gravity, which acts along the $y$ direction, towards the surface of the Earth. From Newton's second law $\mathbf{F} = m\mathbf{a}$, the differential equations he needs to solve are:
\[
\begin{cases} ma_x = 0 \\ ma_y = -mg \end{cases},
\]
where $m$ is the mass of the ball (which can be cancelled out), $a_x = x''(t)$ and $a_y = y''(t)$ are the accelerations along the $x$ and $y$ directions, respectively, and $g = 9.8$ m s$^{-2}$ is the gravitational acceleration at sea level. All he needs now are the initial conditions. Taking $t = 0$ at the moment he hits the ball, of course it is $x(0) = y(0) = 0$. The initial velocities $x'(0)$ and $y'(0)$ are the components of the initial speed $v_0$ along the $x$ and $y$ directions, respectively, therefore:
\[
\begin{cases} x'(0) = v_0\cos\alpha \\ y'(0) = v_0\sin\alpha \end{cases}.
\]
By integrating twice with respect to $t$ the ODEs $x''(t) = 0$ and $y''(t) = -g$ one obtains:
\[
\begin{cases} x(t) = v_0 t\cos\alpha \\ y(t) = v_0 t\sin\alpha - \frac{1}{2}gt^2 \end{cases}.
\]
Since he needs to know the orbit of the ball, he can recover $t$ from the first equation, namely
\[
t = \frac{x}{v_0\cos\alpha}.
\]
By substituting it into the second equation, one obtains:
\[
y = x\tan\alpha - \frac{1}{2}g\frac{x^2}{v_0^2\cos^2\alpha}.
\]
The range of the ball is the value of $x$ for which $y = 0$. The equation $y = 0$ is satisfied for $x = 0$ (the initial condition) and for
\[
x = \frac{2v_0^2\cos^2\alpha\tan\alpha}{g} = \frac{2v_0^2\sin\alpha\cos\alpha}{g} = \frac{v_0^2\sin(2\alpha)}{g}.
\]
The baseball player needs to know when the function $x(\alpha)$ has a maximum, therefore he calculates the derivative $\frac{dx}{d\alpha}$ and sets it equal to zero, obtaining:
\[
\frac{dx}{d\alpha} = \frac{2v_0^2\cos(2\alpha)}{g} = 0 \;\Rightarrow\; \cos(2\alpha) = 0 \;\Rightarrow\; 2\alpha = \frac{\pi}{2}.
\]
In the end, the optimum angle $\alpha_o$ is $\frac{\pi}{4}$. To answer the second question he makes use of the relation $x = \frac{v_0^2\sin(2\alpha)}{g}$ he has just found. It is required to know for which values of $\alpha$ the range $x$ is larger than 120 m. This occurs when:
\[
x = \frac{v_0^2\sin(2\alpha)}{g} > 120 \;\Rightarrow\; \sin(2\alpha) > \frac{120 \times 9.8}{v_0^2}.
\]
For $v_0 = 35$ m s$^{-1}$ he obtains $\sin(2\alpha) > 0.96$, which is verified in the interval $\alpha \in [37°, 53°]$.
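The two numerical answers of Example 6.1.1 are easy to reproduce by brute force, which is a useful sanity check on the calculus. A minimal sketch:

```python
import numpy as np

g, v0, L = 9.8, 35.0, 120.0           # gravity, initial speed, target range

alpha = np.linspace(0.0, np.pi/2, 100001)
rng = v0**2 * np.sin(2*alpha) / g     # range as a function of the angle

print(np.degrees(alpha[np.argmax(rng)]))       # -> ~45.0 (optimum angle)

ok = alpha[rng > L]                   # angles that still clear 120 m
print(np.degrees(ok[0]), np.degrees(ok[-1]))   # -> ~36.9 and ~53.1 degrees
```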
Example 6.1.2 A team of astronomers has observed a Supernova Remnant (the remnant left behind by the explosion of a star) but they do not know when the explosion took place. From the spectrum of the remnant they are able to infer that it contains a large amount of iron, a small amount of cobalt and a negligible amount of nickel. They are also able to determine that the iron is $\sim 10$ times as abundant as the cobalt. They know, too, that a typical Supernova explosion releases into the interstellar medium some amount $x_0$ of $^{56}$Ni and negligible amounts of $^{56}$Co and $^{56}$Fe. However, $^{56}$Ni is an unstable isotope and decays into $^{56}$Co with a decay time of $\sim 6$ d. $^{56}$Co is also unstable and decays into $^{56}$Fe in $\sim 77$ d. On the other hand, $^{56}$Fe is a stable isotope, therefore in the long run we expect all the nickel and cobalt to be turned into iron. Is this enough to estimate how many days have elapsed since the explosion?

The rate at which the number of atoms of a radioactive isotope decreases with time is proportional to the number of atoms of that species present at the time $t$ (the larger the number of atoms of the radioactive isotope, the larger the number of transitions to the daughter isotope), namely the number $N(t)$ of atoms of the radioactive isotope obeys the differential equation:
\[
\frac{dN}{dt} = -\lambda N,
\]
where $\lambda$ is the inverse of the decay time (in fact, it must have the dimension of the inverse of a time). In our example, we can call $\lambda_{Ni}$ the decay rate of nickel to cobalt (namely $\lambda_{Ni} = \frac{1}{6\,\mathrm{d}}$) and $\lambda_{Co}$ the decay rate of cobalt to iron ($\lambda_{Co} = \frac{1}{77\,\mathrm{d}}$). The equation that gives the number of atoms of Ni as a function of time is simply
\[
\frac{dNi}{dt} = -\lambda_{Ni}\,Ni.
\]
In fact, the Ni population only shrinks, because it decays to cobalt and there are no processes replacing it. On the other hand, the number of atoms of cobalt increases because of the decay of Ni but decreases because Co decays to Fe, namely the number of Co atoms as a function of time obeys the following ODE:
\[
\frac{dCo}{dt} = \lambda_{Ni}\,Ni - \lambda_{Co}\,Co.
\]
The number of iron atoms can only increase with time and its rate of change is dictated by the decay of cobalt, namely we have:
\[
\frac{dFe}{dt} = \lambda_{Co}\,Co.
\]
In the end, if we solve the system of ODEs
\[
\begin{cases}
\frac{dNi}{dt} = -\lambda_{Ni}\,Ni \\
\frac{dCo}{dt} = \lambda_{Ni}\,Ni - \lambda_{Co}\,Co \\
\frac{dFe}{dt} = \lambda_{Co}\,Co
\end{cases},
\]
we will obtain the functions $Ni(t)$, $Co(t)$ and $Fe(t)$ that give us the number of atoms of the three species as a function of time. It is worth noticing that $\frac{dNi}{dt} + \frac{dCo}{dt} + \frac{dFe}{dt} = 0$, as it should be since, after the Supernova explosion, these elements are neither created nor destroyed, but only transformed into one another.

This system of ODEs can be solved in many ways. We will choose here a method that makes use of the Laplace transform. If we apply the Laplace transform to the first ODE, we obtain:
\[
s\mathcal{L}\{Ni\} - x_0 = -\lambda_{Ni}\mathcal{L}\{Ni\}
\;\Rightarrow\;
\mathcal{L}\{Ni\} = \frac{x_0}{s + \lambda_{Ni}},
\]
where $x_0$ is the (unknown) amount of Ni produced by the Supernova. We now apply the Laplace transform to the second ODE and obtain:
\[
s\mathcal{L}\{Co\} = \lambda_{Ni}\mathcal{L}\{Ni\} - \lambda_{Co}\mathcal{L}\{Co\} = \frac{\lambda_{Ni}x_0}{s + \lambda_{Ni}} - \lambda_{Co}\mathcal{L}\{Co\}.
\]
Here, we have considered that the amount of cobalt produced by the Supernova is negligible, namely $Co(0) = 0$. From it, we obtain:
\[
\mathcal{L}\{Co\} = \frac{\lambda_{Ni}x_0}{(s + \lambda_{Co})(s + \lambda_{Ni})}.
\]
Inverting this relation, we can already calculate $Co(t)$. We apply the method of partial fractions and obtain:
\[
\frac{A}{s + \lambda_{Co}} + \frac{B}{s + \lambda_{Ni}} = \frac{\lambda_{Ni}x_0}{(s + \lambda_{Co})(s + \lambda_{Ni})}
\;\Rightarrow\; As + A\lambda_{Ni} + Bs + B\lambda_{Co} = \lambda_{Ni}x_0.
\]
From it, we obtain the system of equations:
\[
\begin{cases} A + B = 0 \\ A\lambda_{Ni} + B\lambda_{Co} = \lambda_{Ni}x_0 \end{cases}
\;\Rightarrow\;
A = \frac{\lambda_{Ni}x_0}{\lambda_{Ni} - \lambda_{Co}}, \quad B = -\frac{\lambda_{Ni}x_0}{\lambda_{Ni} - \lambda_{Co}}.
\]
Recalling that $\mathcal{L}^{-1}\{1/(s-a)\} = e^{ax}$, we obtain:
\[
Co(t) = \frac{x_0\lambda_{Ni}}{\lambda_{Ni} - \lambda_{Co}}\left(e^{-\lambda_{Co}t} - e^{-\lambda_{Ni}t}\right).
\]
We can notice from this equation that $Co(0) = 0$ and that $Co(t)$ tends to zero for $t \to \infty$, as we expect since all the cobalt will eventually be turned into iron. From the equation $\frac{dFe}{dt} = \lambda_{Co}\,Co$, since $Fe(0) = 0$ we obtain:
\[
s\mathcal{L}\{Fe\} = \lambda_{Co}\mathcal{L}\{Co\}
\;\Rightarrow\;
\mathcal{L}\{Fe\} = \frac{x_0\lambda_{Co}\lambda_{Ni}}{s(s + \lambda_{Co})(s + \lambda_{Ni})}.
\]
We apply again the method of partial fractions, looking for coefficients $A$, $B$, $C$ such that
\[
\frac{A}{s} + \frac{B}{s + \lambda_{Co}} + \frac{C}{s + \lambda_{Ni}} = \frac{x_0\lambda_{Co}\lambda_{Ni}}{s(s + \lambda_{Co})(s + \lambda_{Ni})}.
\]
We obtain:
\[
As^2 + As(\lambda_{Co} + \lambda_{Ni}) + A\lambda_{Co}\lambda_{Ni} + Bs^2 + Bs\lambda_{Ni} + Cs^2 + Cs\lambda_{Co} = x_0\lambda_{Co}\lambda_{Ni}
\]
\[
\Rightarrow\;
\begin{cases} A + B + C = 0 \\ A(\lambda_{Co} + \lambda_{Ni}) + B\lambda_{Ni} + C\lambda_{Co} = 0 \\ A\lambda_{Co}\lambda_{Ni} = x_0\lambda_{Co}\lambda_{Ni} \end{cases}
\;\Rightarrow\;
A = x_0, \quad B = \frac{x_0\lambda_{Ni}}{\lambda_{Co} - \lambda_{Ni}}, \quad C = \frac{x_0\lambda_{Co}}{\lambda_{Ni} - \lambda_{Co}}.
\]
Inverting this function we obtain:
\[
Fe(t) = x_0\mathcal{L}^{-1}\left\{\frac{1}{s} + \frac{1}{\lambda_{Co} - \lambda_{Ni}}\left(\frac{\lambda_{Ni}}{s + \lambda_{Co}} - \frac{\lambda_{Co}}{s + \lambda_{Ni}}\right)\right\}
= x_0\left[1 + \frac{1}{\lambda_{Co} - \lambda_{Ni}}\left(\lambda_{Ni}e^{-\lambda_{Co}t} - \lambda_{Co}e^{-\lambda_{Ni}t}\right)\right].
\]
Also in this case we can see that $Fe(0) = 0$, whereas $Fe(t)$ tends, for $t \to \infty$, to $x_0$ (as it should, since at sufficiently large times all the initial amount of nickel will be turned into cobalt and all the cobalt into iron). The ratio between $Fe(t)$ and $Co(t)$ is given by:
\[
\frac{Fe(t)}{Co(t)} = \frac{1 + \frac{\lambda_{Ni}e^{-\lambda_{Co}t} - \lambda_{Co}e^{-\lambda_{Ni}t}}{\lambda_{Co} - \lambda_{Ni}}}{\frac{\lambda_{Ni}}{\lambda_{Ni} - \lambda_{Co}}\left(e^{-\lambda_{Co}t} - e^{-\lambda_{Ni}t}\right)}.
\]
We know that this ratio must be $\sim 10$, therefore from the above equation we can recover the time at which this happens. We can simplify this equation by noticing that $\lambda_{Co} \ll \lambda_{Ni}$, therefore $\lambda_{Co} - \lambda_{Ni} \simeq -\lambda_{Ni}$, and that $e^{-\lambda_{Ni}t} - e^{-\lambda_{Co}t} \simeq -e^{-\lambda_{Co}t}$. We therefore obtain:
\[
\frac{Fe(t)}{Co(t)} \simeq \frac{1 - e^{-\lambda_{Co}t} + \frac{\lambda_{Co}}{\lambda_{Ni}}e^{-\lambda_{Ni}t}}{e^{-\lambda_{Co}t}}
= e^{\lambda_{Co}t} + \frac{\lambda_{Co}}{\lambda_{Ni}}e^{-(\lambda_{Ni} - \lambda_{Co})t} - 1
\simeq e^{\lambda_{Co}t} - 1.
\]
By equating this ratio to 10, we obtain:
\[
e^{\lambda_{Co}t} \sim 11 \;\Rightarrow\; t \sim \frac{1}{\lambda_{Co}}\ln 11 \simeq 185\ \mathrm{d}.
\]
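The decay chain is also a good test case for a numerical integrator, and the numerical solution can be compared with the analytic $Co(t)$ derived above. A minimal sketch:

```python
import numpy as np
from scipy.integrate import solve_ivp

lam_ni, lam_co = 1/6.0, 1/77.0        # decay rates in 1/day
x0 = 1.0                               # initial amount of 56Ni (arbitrary units)

def rhs(t, y):
    ni, co, fe = y
    return [-lam_ni*ni, lam_ni*ni - lam_co*co, lam_co*co]

sol = solve_ivp(rhs, (0, 400), [x0, 0, 0], dense_output=True,
                rtol=1e-10, atol=1e-12)

t = 185.0
ni, co, fe = sol.sol(t)
co_exact = x0*lam_ni/(lam_ni - lam_co)*(np.exp(-lam_co*t) - np.exp(-lam_ni*t))
print(np.isclose(co, co_exact))        # -> True
print(fe/co)                           # -> roughly 10, as observed in the remnant
```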
As we have seen from the previous two physical problems, the construction and the solution of a mathematical model is a process that can be loosely described by the following steps:

• Identify the independent and dependent variables and assign letters to them. In our case, the independent variable was always the time, and the dependent variables were the spatial coordinates $x$ and $y$ (Example 6.1.1) and the number of atoms of Fe, Co, Ni (Example 6.1.2).
• Choose the units of measurement of each variable. This choice is arbitrary, but the calculations are easier if the measured quantities are close to unity (for this reason we have chosen to measure the time in seconds in the first example and in days in the second).
• Articulate the basic principle underlying the problem you are investigating. This can be a widely recognized physical law (such as Newton's second law or the law of decay of a radioactive isotope), or it may be a more speculative assumption based on experience or observations. Since in the real world many forces and effects act at the same time, be sure that you have isolated the main driver of the physical process, and neglect all the forces that you may reasonably consider negligible.
• Express this principle or law in terms of the variables you chose in the first step. That may require the introduction of physical constants (like $g$) or parameters (like $\lambda_{Ni}$, $\lambda_{Co}$).
• Make sure that each term in your equations has the same physical units.
• Formulate the initial conditions of the problem. For problems involving orbits and trajectories of bodies, this implies the choice of a reference frame.
• If the obtained equations are still too complicated and intractable, further approximations may be required. For example, at one stage in Example 6.1.2 we have neglected $\lambda_{Co}$ with respect to $\lambda_{Ni}$.
• Solve the obtained ODEs. Try to make all the possible checks to verify that the solution is correct. In particular, check whether the initial conditions are satisfied and whether the behavior at large times appears physically reasonable (as we have done in Example 6.1.2, where we have verified that at large times all the cobalt and nickel will be turned into iron). If possible, one should also calculate the values of the solutions at selected points and compare them with observations, or examine the solutions corresponding to certain special values of the parameters of the problem.
6.2 Mechanical and electrical vibrations

6.2.1 The spring-mass system

We will start here by considering some of the most famous and important physical problems, to understand how the methods learned so far help us solve them. The first process we analyze is the motion of a mass on a spring, because the principles involved are common to many systems. Referring to Fig. 6.2, we consider a mass hanging on the end of a spring of original length $y_0$. The mass causes an elongation $y_e$ of the spring in the downward direction, which we will take as the positive one. There are at this point two forces acting on the mass $m$: the gravitational force $w = mg$, and a force $F_s$ due to the spring, which acts upward and opposes gravity. Already in the seventeenth century Hooke discovered that, for small elongations $y$, the force is proportional to the elongation, namely
\[
F_s = -ky, \quad (6.1)
\]
where the constant of proportionality $k$ is called the spring constant. This law is called Hooke's law.

Figure 6.2: A spring-mass system and the forces acting on the mass $m$.

Since $y_e$ is the elongation at which gravity and the elastic force balance, it must be:
\[
mg = ky_e. \quad (6.2)
\]
This means that if we attach a body of mass $m$ to a spring whose spring constant is unknown, it is enough to measure the elongation $y_e$ and calculate $k$ by means of the relation
\[
k = \frac{mg}{y_e}.
\]
We are interested in studying the motion of the body out of its equilibrium position, namely how the spring-mass system reacts after we displace the mass $m$ by an amount $u = y - y_e$. The elastic force is at this point $F_s = -ky = -k(y_e + u)$. From Newton's second law $F = ma$ we have:
\[
mu''(t) = F_s + mg = -k[y_e + u(t)] + mg = -ku(t), \quad (6.3)
\]
where we have made use of the relation Eq. 6.2. The equation of motion of the spring is thus given by the solution of the ODE $mu''(t) + ku(t) = 0$. Since the constants $m$ and $k$ are always positive, the solutions of the characteristic equation $m\lambda^2 + k = 0$ are always the imaginary numbers:
\[
\lambda_{1,2} = \pm i\sqrt{\frac{k}{m}} = \pm i\omega_0,
\]
where $\omega_0$ is called the natural frequency of the vibration. The solution $u(t)$ of the motion of the spring is thus given by:
\[
u(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t).
\]
We can also express this solution as:
\[
u(t) = R\cos(\omega_0 t - \delta). \quad (6.4)
\]
In fact, $R\cos(\omega_0 t - \delta) = R\cos\delta\cos(\omega_0 t) + R\sin\delta\sin(\omega_0 t)$, therefore the two solutions coincide provided that $A = R\cos\delta$, $B = R\sin\delta$. Thus:
\[
R = \sqrt{A^2 + B^2}, \qquad \delta = \arctan\frac{B}{A}. \quad (6.5)
\]
The motion of the spring subject only to gravity and to the elastic force is thus a vibration (also called undamped free vibration) about the equilibrium position $y_e$, with period
\[
T = \frac{2\pi}{\omega_0} = 2\pi\sqrt{\frac{m}{k}},
\]
amplitude $R$ and phase $\delta$. The amplitude and phase are dictated by the initial conditions of the problem. The idealized configuration of the system in which only gravity and the elastic force act is never attainable in practice. In fact, it would predict a vibration, always with the same amplitude $R$, lasting forever.
Resistance from the medium in which the mass moves, internal energy dissipation and other dissipative phenomena are expected to damp the motion of the mass $m$. We can take these damping effects into account by introducing a resistive force $F_d$ that opposes the motion of the mass (therefore it has a negative sign) and can be assumed proportional to the velocity, namely:
\[
F_d(t) = -\gamma u'(t).
\]
In this way, the motion of the spring must obey the ODE
\[
mu''(t) + \gamma u'(t) + ku(t) = 0.
\]
Also this ODE has constant coefficients, therefore the solution is simply given by:
\[
u(t) = Ae^{\lambda_1 t} + Be^{\lambda_2 t},
\]
where $\lambda_{1,2}$ are the solutions of the characteristic equation $m\lambda^2 + \gamma\lambda + k = 0$. These solutions are:
\[
\lambda_{1,2} = \frac{-\gamma \pm \sqrt{\gamma^2 - 4km}}{2m} = -\frac{\gamma}{2m} \pm \frac{1}{2m}\sqrt{\gamma^2 - 4km}. \quad (6.6)
\]
We can now distinguish two cases:

• $\gamma < 2\sqrt{km}$. In this case, $\gamma^2 - 4km < 0$ and the solutions of the characteristic equation are the complex conjugate numbers:
\[
\lambda_{1,2} = -\frac{\gamma}{2m} \pm \frac{i}{2m}\sqrt{4km - \gamma^2}.
\]
The law of motion of the mass $m$ is thus given by:
\[
u(t) = e^{-\frac{\gamma}{2m}t}[A\cos(\mu t) + B\sin(\mu t)] = Re^{-\frac{\gamma}{2m}t}\cos(\mu t - \delta), \quad (6.7)
\]
where
\[
\mu = \frac{\sqrt{4km - \gamma^2}}{2m} > 0, \quad (6.8)
\]
and $R$ and $\delta$ are given again by Eq. 6.5. This is again a vibration (called damped free vibration) but with amplitude $Re^{-\frac{\gamma}{2m}t}$ decreasing with time (in fact $\gamma, m > 0$) and tending to zero for $t \to \infty$.

• $\gamma \geq 2\sqrt{km}$. In this case, $\gamma^2 - 4km \geq 0$ and therefore we have two real solutions (either distinct or coincident) of the characteristic equation. However, $\gamma^2 - 4km$ is always less than $\gamma^2$ ($k$ and $m$ are positive constants), therefore both $\lambda_{1,2}$ given by Eq. 6.6 are negative. This means that the law of motion of the spring is given by the sum of two exponentially decaying functions, $u(t) = c_1e^{\lambda_1 t} + c_2e^{\lambda_2 t}$ ($u(t) = (c_1 + c_2t)e^{\lambda t}$ in the case in which $\gamma = 2\sqrt{km}$), and therefore the mass tends to return to the equilibrium position at large times without oscillating about it. This corresponds to a very strong damping that prevails over the elastic force. The threshold value $\gamma = 2\sqrt{km}$ is also called the critical damping.

In some cases, an external force $F_e(t)$ might be applied to the mass $m$; the law of motion is then given by the solution of the following ODE:
\[
mu''(t) + \gamma u'(t) + ku(t) = F_e(t). \quad (6.9)
\]
The solution of this ODE is the sum of the complementary solution $u_c(t)$ (given by Eq. 6.7 in the case of small damping), plus a particular solution $u_p(t)$ related to the nature of the external force $F_e(t)$.
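The classification into the two damping regimes depends only on the sign of $\gamma^2 - 4km$, which is easy to explore numerically. A minimal sketch:

```python
import numpy as np

def classify(m, gamma, k):
    """Roots of m*lam^2 + gamma*lam + k = 0 and the damping regime."""
    roots = np.roots([m, gamma, k])
    crit = 2*np.sqrt(k*m)                  # critical damping 2*sqrt(km)
    if gamma < crit:
        regime = 'underdamped (oscillatory decay, Eq. 6.7)'
    elif np.isclose(gamma, crit):
        regime = 'critically damped'
    else:
        regime = 'overdamped (two negative real roots)'
    return roots, regime

for gamma in (0.5, 2.0, 5.0):              # with m = k = 1 the threshold is 2
    print(gamma, *classify(1.0, gamma, 1.0))
```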
Example 6.2.1 A mass $m = 1$ kg is attached to a spring with spring constant $k = 4$ N/m and to a damping device with variable but small damping coefficient $\gamma$. Analyze the motion of $m$ as a function of $\gamma$ under the influence of the external forces
\[
F_e(t) = \begin{cases} F_1\cos t \\ F_2\cos(2t) \end{cases}.
\]
To simplify the notation we can write $\gamma = 2\beta$. The damping coefficient is assumed to be small, therefore we can assume $\beta < 1$. The law of motion must obey the following ODE:
\[
u''(t) + 2\beta u'(t) + 4u(t) = F_e(t).
\]
According to Eq. 6.7 the complementary solution, namely the solution of the corresponding homogeneous ODE, is given by:
\[
u_c(t) = Re^{-\beta t}\cos(\mu t - \delta),
\]
where $R$ and $\delta$ can be obtained from the initial conditions and
\[
\mu = \frac{\sqrt{16 - 4\beta^2}}{2} = \sqrt{4 - \beta^2}.
\]
Let us take first the external force $F_e(t) = F_1\cos t$. To find the particular solution, we can apply the method of the undetermined coefficients. Provided that $\mu \neq 1$, we know that the particular solution has the form:
\[
u_p(t) = A_1\cos t + B_1\sin t.
\]
The fact that $\beta < 1$ ensures that $\mu > 1$, therefore this is indeed the form of the particular solution. The first and second derivatives of $u_p(t)$ are given by:
\[
u_p'(t) = -A_1\sin t + B_1\cos t, \qquad u_p''(t) = -A_1\cos t - B_1\sin t.
\]
By substituting them into the given ODE and comparing the coefficients of $\sin t$ and $\cos t$ we obtain the system of equations:
\[
\begin{cases} 3A_1 + 2\beta B_1 = F_1 \\ 3B_1 - 2\beta A_1 = 0 \end{cases}
\;\Rightarrow\;
A_1 = \frac{3F_1}{9 + 4\beta^2}, \quad B_1 = \frac{2\beta F_1}{9 + 4\beta^2}.
\]
The law of motion of the mass $m$ is thus given by:
\[
u(t) = Re^{-\beta t}\cos\left(\sqrt{4 - \beta^2}\,t - \delta\right) + \frac{F_1}{9 + 4\beta^2}(3\cos t + 2\beta\sin t),
\]
namely it is the superposition of two oscillations with angular frequencies $\sqrt{4 - \beta^2}$ and 1, respectively. In the case in which $F_e(t) = F_2\cos(2t)$, provided that $\mu \neq 2$ (namely that $\beta \neq 0$) we can assume a particular solution of the form:
\[
u_p(t) = A_2\cos(2t) + B_2\sin(2t),
\]
\[
u_p'(t) = -2A_2\sin(2t) + 2B_2\cos(2t), \qquad u_p''(t) = -4A_2\cos(2t) - 4B_2\sin(2t).
\]
Since in this case $u_p''(t) + 4u_p(t) = 0$, we can immediately see that $A_2 = 0$ and that $B_2 = \frac{F_2}{4\beta}$. The law of motion is thus:
\[
u(t) = Re^{-\beta t}\cos\left(\sqrt{4 - \beta^2}\,t - \delta\right) + \frac{F_2}{4\beta}\cos(2t).
\]
Also in this case the motion is characterized by the superposition of two oscillations, but the amplitude $\frac{F_2}{4\beta}$ of the second oscillation becomes unbounded for $\beta \to 0$. This happens any time the frequency $\omega$ of the applied force is equal (or very close) to the natural frequency $\omega_0 = \sqrt{\frac{k}{m}}$ (2 in our example) of the considered system. This phenomenon is called resonance.

6.2.2 Electric circuits

A simple electric circuit (also called an RLC circuit) is composed of a resistor, a capacitor and an inductor connected in series, as shown in Fig. 6.3. The current $I$, measured in amperes, is a function of time $t$. The resistance $R$ (ohms), the capacitance $C$ (farads) and the inductance $L$ (henrys) are all positive constants. The impressed voltage $V$ (in volts) is a given function of time. Another physical quantity that enters the problem is the total charge $Q$ (measured in coulombs), which is linked to $I$ by the relation $I(t) = Q'(t)$.

Figure 6.3: A simple RLC electric circuit.

The flow of current in the circuit is governed by Kirchhoff's second law, which states that in a closed circuit the impressed voltage is equal to the sum of the voltage drops in the rest of the circuit. According to the elementary laws of electricity, it is known that:

• the voltage drop across the resistor is $IR$;
• the voltage drop across the capacitor is $Q/C$;
• the voltage drop across the inductor is $LI'(t)$.

Kirchhoff's second law thus translates into:
\[
L\frac{dI}{dt} + RI + \frac{Q}{C} = V(t).
\]
By means of the relation $I(t) = Q'(t)$ we obtain:
\[
LQ''(t) + RQ'(t) + \frac{1}{C}Q(t) = V(t). \quad (6.10)
\]
This is a nonhomogeneous second-order ODE with constant coefficients that very closely resembles Eq. 6.9, which describes the dynamics of a spring-mass system in the presence of an external force; the solutions can therefore be found exactly as in Sect. 6.2.1. It happens often in physics that similar differential equations describe quite different physical systems.
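The resonant growth of the forced amplitude can be made quantitative. For the oscillator $u'' + 2\beta u' + \omega_0^2 u = F\cos(\omega t)$ (mechanical or, by the analogy just noted, electrical), the standard steady-state amplitude formula is $F/\sqrt{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}$; this is consistent with Example 6.2.1, where it reduces to $F_1/\sqrt{9 + 4\beta^2}$ at $\omega = 1$ and to $F_2/(4\beta)$ at $\omega = 2$. A short sketch:

```python
import numpy as np

def amplitude(omega, beta, omega0=2.0, F=1.0):
    """Steady-state amplitude of u'' + 2*beta*u' + omega0^2 u = F cos(omega t).

    Standard formula; at omega = omega0 it reduces to F/(2*beta*omega0),
    i.e. F/(4*beta) for the parameters of Example 6.2.1."""
    return F / np.sqrt((omega0**2 - omega**2)**2 + 4*beta**2*omega**2)

for beta in (0.5, 0.1, 0.02):
    print(beta, amplitude(2.0, beta))   # grows like F/(4*beta) as beta -> 0
```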
6.3 Other physical processes

6.3.1 Wave propagation

It is known that a small perturbation expands in a medium according to the wave equation
\[
\frac{\partial^2 y}{\partial t^2} = v^2\nabla^2 y,
\]
where $v$ is the propagation speed of the wave and $\nabla^2$ is the Laplace operator (see Sect. 7.2.3). For instance, a sound wave in the air propagates with the sound speed 343 m/s. A wave in one dimension (for example a vibrating string) can be described by the one dimensional wave equation:
\[
\frac{\partial^2 y}{\partial t^2} = v^2\frac{\partial^2 y}{\partial x^2}. \quad (6.11)
\]
Many textbooks report the solution of this equation ($y(x,t) = f(x \pm vt)$) without explaining how it is obtained. The reason is that the wave equation is not an ODE but a partial differential equation, and most textbooks do not treat PDEs. We will not treat PDEs either, but with a method based on the Fourier transform it is possible to convert the wave equation into an ODE and solve it. The solution of the wave equation is a function $y(x,t)$. We shall assume that the function $y(x,t)$ at the time $t = 0$ is given by the function $f(x)$, namely:
\[
y(x,0) = f(x).
\]
This can be considered the initial condition of the problem. We now define $Y(\alpha,t)$ as the Fourier transform of $y(x,t)$ with respect to $x$, namely
\[
Y(\alpha,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} y(x,t)e^{-i\alpha x}\,dx. \quad (6.12)
\]
Consequently, $Y(\alpha,0)$ is the Fourier transform of $f(x)$, namely:
\[
Y(\alpha,0) = F(\alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-i\alpha x}\,dx. \quad (6.13)
\]
We now apply the Fourier transform (with respect to $x$) to both members of the wave equation, obtaining:
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial t^2}e^{-i\alpha x}\,dx
= v^2\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial x^2}e^{-i\alpha x}\,dx
= v^2(i\alpha)^2Y(\alpha,t) = -v^2\alpha^2Y(\alpha,t), \quad (6.14)
\]
where we have made use of the relation $\mathcal{F}\{f^{(n)}(t)\} = (i\omega)^n\mathcal{F}\{f(t)\}$ that we have encountered in Section 4.2. In the left hand side of Eq. 6.14, the derivative with respect to time can be taken out of the integral, therefore we obtain:
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial t^2}e^{-i\alpha x}\,dx = \frac{\partial^2}{\partial t^2}Y(\alpha,t).
\]
Therefore, Eq. 6.14 transforms to:
\[
\frac{\partial^2}{\partial t^2}Y(\alpha,t) = -v^2\alpha^2Y(\alpha,t). \quad (6.15)
\]
Since no derivatives with respect to $x$ enter the above equation, this is an ODE and not a PDE anymore. This is already a major achievement. It is easy to solve Eq. 6.15 (it is the equation of an undamped free vibration). The characteristic equation is $\lambda^2 + v^2\alpha^2 = 0$, with roots $\lambda_{1,2} = \pm iv\alpha$, therefore the solution is:
\[
Y(\alpha,t) = K(\alpha)e^{\pm iv\alpha t}.
\]
For $t = 0$ we obtain $Y(\alpha,0) = K(\alpha)$, therefore $K(\alpha)$ is the Fourier transform of $f(x)$, namely:
\[
Y(\alpha,t) = F(\alpha)e^{\pm iv\alpha t}.
\]
The solution of the wave equation $y(x,t)$ is the inverse Fourier transform of the function $Y(\alpha,t)$, namely:
\[
y(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\alpha)e^{\pm iv\alpha t}e^{i\alpha x}\,d\alpha
= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\alpha)e^{i\alpha(x \pm vt)}\,d\alpha. \quad (6.16)
\]
Since $f(x)$ is the inverse Fourier transform of $F(\alpha)$, we immediately obtain from this equation the solution we seek, namely:
\[
y(x,t) = f(x \pm vt), \quad (6.17)
\]
corresponding to two waves propagating with velocity $v$ in the $+x$ and $-x$ directions, respectively.

6.3.2 Heat flow

The one dimensional heat flow is described by the PDE
\[
\frac{\partial\psi}{\partial t} = a^2\frac{\partial^2\psi}{\partial x^2}, \quad (6.18)
\]
where the solution $\psi(x,t)$ gives the temperature at each coordinate $x$ as a function of time. We proceed as for the wave equation and define $\Psi(\alpha,t)$ as the Fourier transform of $\psi(x,t)$ with respect to $x$, namely:
\[
\Psi(\alpha,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \psi(x,t)e^{-i\alpha x}\,dx.
\]
If we apply the Fourier transform to both sides of Eq. 6.18, this yields an ODE in the time variable $t$ for the function $\Psi(\alpha,t)$:
\[
\frac{\partial\Psi(\alpha,t)}{\partial t} = -a^2\alpha^2\Psi(\alpha,t),
\]
which can be easily solved, yielding:
\[
\ln\Psi(\alpha,t) = -a^2\alpha^2t + K(\alpha),
\]
or:
\[
\Psi(\alpha,t) = C(\alpha)e^{-a^2\alpha^2t},
\]
where $C(\alpha) = \Psi(\alpha,0)$ is dictated by the initial conditions of the problem. By inverting this expression, we obtain the solution:
\[
\psi(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} C(\alpha)e^{i\alpha x - a^2\alpha^2t}\,d\alpha. \quad (6.19)
\]
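The mode-by-mode decay of Eq. 6.19 translates almost line by line into a spectral solver. The sketch below uses the discrete FFT on a periodic interval, which approximates the continuous transform used in the text; grid size, domain and the Gaussian initial profile are arbitrary choices for illustration.

```python
import numpy as np

# Spectral solution of psi_t = a^2 psi_xx on a periodic interval:
# each Fourier mode decays as exp(-a^2 alpha^2 t) (cf. Eq. 6.19).
a, L, n = 1.0, 2*np.pi, 256
x = np.linspace(0, L, n, endpoint=False)
alpha = 2*np.pi * np.fft.fftfreq(n, d=L/n)     # discrete wavenumbers

psi0 = np.exp(-10*(x - np.pi)**2)              # initial temperature profile
C = np.fft.fft(psi0)                            # C(alpha) = Psi(alpha, 0)

def psi(t):
    return np.real(np.fft.ifft(C * np.exp(-a**2 * alpha**2 * t)))

print(psi(0.0).max(), psi(0.1).max())   # the peak spreads and decays with time
```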
Chapter 7

Vector and tensor analysis

In this chapter we discuss the calculus (i.e. differentiation and integration) of vectors and we introduce the concept of tensors. Particular emphasis will be given to the most widely used application of vectors, namely the vectors describing the position of a body in a 3-dimensional space, because this is the perfect framework to treat dynamics. We start with a review of space vectors and their properties.

7.1 Review of vector algebra and vector spaces

7.1.1 Vector algebra

We have already encountered vectors as matrices composed of a single column (see Section 5.1.1). However, when we think of a vector we mentally associate it with an arrow in space. Indeed, a vector is a geometrical object that has a magnitude (or length) and a direction. The association between vectors as geometrical objects and columns of real numbers arises when we consider a vector $\mathbf{a}$ in a 3-dimensional space and decompose it along 3 vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ not lying in a plane. In this case it is always possible to find 3 real numbers $a_1$, $a_2$ and $a_3$ such that:
\[
\mathbf{a} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3. \quad (7.1)
\]
At this point we can identify the vector $\mathbf{a}$ with the 3 scalars $a_1$, $a_2$ and $a_3$, namely:
\[
\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}.
\]
The three vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ are said to form a basis for the 3-dimensional space, and the numbers $a_1$, $a_2$ and $a_3$ are the components of the vector $\mathbf{a}$ with respect to this basis. If the vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ lie in the same plane, then they are linearly dependent (one can be obtained as a linear combination of the other two); therefore they form a basis if and only if they are linearly independent, namely if and only if the only solution of the equation $c_1\mathbf{e}_1 + c_2\mathbf{e}_2 + c_3\mathbf{e}_3 = 0$ is $c_1 = c_2 = c_3 = 0$. As we have seen, this corresponds to the condition $\det E \neq 0$, where $E$ is the matrix whose columns are the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.

If we wish to label points in space using a Cartesian coordinate system $(x, y, z)$, we can introduce the unit vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$. At this point the vector $\mathbf{a}$ can be written as a sum of three vectors, each parallel to a different coordinate axis:
\[
\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k} \quad (7.2)
\]
(see Fig. 7.1). Therefore, each vector $\mathbf{a}$ can be associated with the three numbers $a_x$, $a_y$, $a_z$, namely:
\[
\mathbf{a} = \begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix}.
\]

Figure 7.1: A Cartesian basis set and the components of the vector $\mathbf{a}$.

Clearly, the vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ can be represented as:
\[
\mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad
\mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad
\mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]
The magnitude of a vector is a measure of its length. It is indicated with $|\mathbf{a}|$ or $a$ and it is given by:
\[
a = |\mathbf{a}| = \sqrt{a_x^2 + a_y^2 + a_z^2}. \quad (7.3)
\]
A vector whose magnitude equals unity is called a unit vector. The unit vector in the direction of $\mathbf{a}$ is indicated as $\hat{\mathbf{a}}$ and is evaluated as:
\[
\hat{\mathbf{a}} = \frac{\mathbf{a}}{a}. \quad (7.4)
\]
The relation $\mathbf{a} = a\hat{\mathbf{a}}$ is useful because, given the vector $\mathbf{a}$, it clearly separates its magnitude from its direction. Sums and differences between vectors are simple applications of sums and differences between matrices (each component must be added or subtracted separately). As for the product of two vectors, we have already encountered the scalar product (Eq. 5.2), namely:
\[
\mathbf{a}\cdot\mathbf{b} = a_xb_x + a_yb_y + a_zb_z. \quad (7.5)
\]
In fact, we have seen that $\mathbf{a}\cdot\mathbf{b} = \mathbf{a}^T\mathbf{b}^*$, but vectors in the three-dimensional space are real vectors and therefore $\mathbf{b}^* = \mathbf{b}$. It can be shown that the scalar product between two vectors is also given by:
\[
\mathbf{a}\cdot\mathbf{b} = ab\cos\theta, \quad (7.6)
\]
where $\theta$ is the angle between the two vectors.
From it, we immediately recover that:
\[
a = \sqrt{\mathbf{a}\cdot\mathbf{a}}.
\]
The simplest application of the scalar product in physics is the work, which is given by $W = \mathbf{F}\cdot\mathbf{r}$: we have to take into account the component of the force $\mathbf{F}$ along a displacement $\mathbf{r}$.

There is another kind of multiplication between vectors that yields a vector (instead of yielding a scalar as the scalar product does). It is called the vector product (or cross product). The vector product is indicated with $\mathbf{a}\times\mathbf{b}$ and it is a vector whose magnitude is:
\[
|\mathbf{a}\times\mathbf{b}| = ab\sin\theta,
\]
where $\theta$ is again the angle between the two vectors. The vector $\mathbf{a}\times\mathbf{b}$ is perpendicular to the plane identified by $\mathbf{a}$ and $\mathbf{b}$, and its direction is that in which a right-handed screw moves forward rotating from $\mathbf{a}$ to $\mathbf{b}$ (see Fig. 7.2). Therefore, $\mathbf{a}\times\mathbf{b}$ is opposite to $\mathbf{b}\times\mathbf{a}$, namely the vector product is anticommutative. It is also clear that $\mathbf{a}\times\mathbf{a} = \mathbf{0}$.

Figure 7.2: Vector product between two vectors $\mathbf{a}$ and $\mathbf{b}$.

In a Cartesian coordinate system, the vector product is expressed as:
\[
\mathbf{a}\times\mathbf{b} = (a_yb_z - a_zb_y)\mathbf{i} + (a_zb_x - a_xb_z)\mathbf{j} + (a_xb_y - a_yb_x)\mathbf{k}
= \begin{pmatrix} a_yb_z - a_zb_y \\ a_zb_x - a_xb_z \\ a_xb_y - a_yb_x \end{pmatrix}. \quad (7.7)
\]
This can also be written as:
\[
\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}. \quad (7.8)
\]
We can extend our discussion to the product of three vectors and distinguish a scalar triple product:
\[
[\mathbf{a}, \mathbf{b}, \mathbf{c}] \equiv \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix},
\]
and a vector triple product:
\[
\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c},
\]
\[
(\mathbf{a}\times\mathbf{b})\times\mathbf{c} = (\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{b}\cdot\mathbf{c})\mathbf{a}.
\]
7.1.2 Vector spaces

A vector space is nothing other than a collection of vectors. A vector space $V$ over a field $F$ (for instance the field of the real numbers) is said to be linear if, for any $\mathbf{a}, \mathbf{b}, \mathbf{c} \in V$ and for any $\lambda, \mu \in F$, the following five conditions are satisfied:

• The addition in $V$ is commutative and associative, namely:
\[
\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a} \quad (7.9)
\]
\[
\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c} \quad (7.10)
\]
• There exists a null vector $\mathbf{0}$ such that
\[
\mathbf{a} + \mathbf{0} = \mathbf{a} \quad \forall\,\mathbf{a} \in V. \quad (7.11)
\]
• All vectors have a corresponding negative vector $-\mathbf{a}$ such that
\[
\mathbf{a} + (-\mathbf{a}) = \mathbf{0}. \quad (7.12)
\]
• Multiplication by scalars (elements of the field $F$) is associative and distributive with respect to vector and field addition, namely:
\[
\lambda(\mu\mathbf{a}) = (\lambda\mu)\mathbf{a} \quad (7.13)
\]
\[
\lambda(\mathbf{a} + \mathbf{b}) = \lambda\mathbf{a} + \lambda\mathbf{b} \quad (7.14)
\]
\[
(\lambda + \mu)\mathbf{a} = \lambda\mathbf{a} + \mu\mathbf{a} \quad (7.15)
\]
• Multiplication by unity always leaves the vector $\mathbf{a}$ unchanged, namely
\[
1(\mathbf{a}) = \mathbf{a}. \quad (7.16)
\]

It can be easily shown that the vectors in the three-dimensional Euclidean space form a linear vector space over the field of the real numbers. This vector space is often indicated with $\mathbb{R}^3$.

7.1.3 Linear operators

An operator in a vector space is any kind of vector manipulation $\mathcal{A}$ that transforms a vector $\mathbf{x}$ into another vector $\mathbf{y}$, namely: $\mathbf{y} = \mathcal{A}\mathbf{x}$. If the operator $\mathcal{A}$ has the property that, for any scalars $\lambda$, $\mu$,
\[
\mathcal{A}(\lambda\mathbf{a} + \mu\mathbf{b}) = \lambda\mathcal{A}\mathbf{a} + \mu\mathcal{A}\mathbf{b}, \quad (7.17)
\]
then the operator $\mathcal{A}$ is said to be linear. If $\mathbf{x}$ is a vector and $\mathcal{A}$ and $\mathcal{B}$ are two linear operators, then it follows that:
\[
(\mathcal{A} + \mathcal{B})\mathbf{x} = \mathcal{A}\mathbf{x} + \mathcal{B}\mathbf{x} \quad (7.18)
\]
\[
(\lambda\mathcal{A})\mathbf{x} = \lambda(\mathcal{A}\mathbf{x}) \quad (7.19)
\]
\[
(\mathcal{A}\mathcal{B})\mathbf{x} = \mathcal{A}(\mathcal{B}\mathbf{x}). \quad (7.20)
\]
In general, the multiplication between linear operators is not commutative, namely $\mathcal{A}\mathcal{B}\mathbf{x} \neq \mathcal{B}\mathcal{A}\mathbf{x}$. It is always possible to define a null operator $\mathcal{O}$ and an identity operator $\mathcal{I}$ such that $\mathcal{O}\mathbf{x} = \mathbf{0}$, $\mathcal{I}\mathbf{x} = \mathbf{x}$. Finally, it is often (but not always) possible to find the inverse operator $\mathcal{A}^{-1}$ of $\mathcal{A}$, namely the operator such that $\mathcal{A}\mathcal{A}^{-1} = \mathcal{A}^{-1}\mathcal{A} = \mathcal{I}$. If an operator $\mathcal{A}$ does not possess an inverse, it is called singular.

We have already seen (Eq. 7.1) that, given a basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ ($\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N$ in a generic $N$-dimensional vector space), it is always possible to express any vector $\mathbf{a}$ as:
\[
\mathbf{a} = \sum_{i=1}^{N} a_i\mathbf{e}_i.
\]
A linear operator $\mathcal{A}$ will transform the vector $\mathbf{a}$ into the vector $\mathbf{b} = \mathcal{A}\mathbf{a}$. It will transform the basis vectors $\mathbf{e}_j$ too, and it will always be possible to express the transformed vector $\mathcal{A}\mathbf{e}_j$ as a linear combination of the basis vectors $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N$, namely it will always be possible to find $N$ numbers $A_{1j}, A_{2j}, \dots, A_{Nj}$ such that:
\[
\mathcal{A}\mathbf{e}_j = \sum_{i=1}^{N} A_{ij}\mathbf{e}_i. \quad (7.21)
\]
Since $\mathbf{a}$ is a linear combination of the basis vectors $\mathbf{e}_j$, the vector $\mathcal{A}\mathbf{a}$ can be expressed as:
\[
\mathcal{A}\mathbf{a} = \mathcal{A}\left(\sum_{j=1}^{N} a_j\mathbf{e}_j\right) = \sum_{j=1}^{N} a_j\mathcal{A}\mathbf{e}_j = \sum_{j=1}^{N}\sum_{i=1}^{N} a_jA_{ij}\mathbf{e}_i = \sum_{i=1}^{N}\left(\sum_{j=1}^{N} A_{ij}a_j\right)\mathbf{e}_i.
\]
Here, we have made use of the linearity of the operator $\mathcal{A}$ (Eq. 7.17) and of Eq. 7.21. Since the vector $\mathbf{b} = \mathcal{A}\mathbf{a}$ can also be expressed as a linear combination of the basis vectors $\mathbf{e}_i$, from the comparison with the previous equation it follows that:
\[
\mathbf{b} = \sum_{i=1}^{N} b_i\mathbf{e}_i \;\Rightarrow\; b_i = \sum_{j=1}^{N} A_{ij}a_j. \quad (7.22)
\]
This relation suggests an association between linear operators and matrices: once we have fixed a basis $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N$ of our vector space, the equation $\mathbf{b} = \mathcal{A}\mathbf{a}$ is equivalent to the equation $\mathbf{b} = A\mathbf{a}$, where $A$ is the matrix whose elements $A_{ij}$ are defined by the relation Eq. 7.21. It is not difficult to see that matrices are indeed linear operators.
7.2 Vector calculus

7.2.1 Differentiation of vectors

Let us suppose that the vector $\mathbf{a}$ is a function of the scalar variable $t$ (the best example is a vector representing the position of a moving body, which therefore depends on the time). Then, to each value of $t$ we can associate a different vector $\mathbf{a}(t)$. In Cartesian coordinates we have:
\[
\mathbf{a}(t) = a_x(t)\mathbf{i} + a_y(t)\mathbf{j} + a_z(t)\mathbf{k}.
\]
The derivative of a vector function $\mathbf{a}(t)$ can be defined analogously to the derivative of an ordinary function, namely:
\[
\frac{d\mathbf{a}}{dt} \equiv \dot{\mathbf{a}} = \lim_{\Delta t \to 0}\frac{\mathbf{a}(t + \Delta t) - \mathbf{a}(t)}{\Delta t}. \quad (7.23)
\]
In Cartesian coordinates, this can be written as:
\[
\dot{\mathbf{a}} = \dot{a}_x\mathbf{i} + \dot{a}_y\mathbf{j} + \dot{a}_z\mathbf{k}.
\]
In fact, the basis vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are constant in magnitude and direction and must not be differentiated. If the vector $\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}$ represents the position vector of a particle with respect to the origin in a Cartesian coordinate system, then $\dot{\mathbf{r}}$ and $\ddot{\mathbf{r}}$ are the velocity and acceleration of the particle, respectively, namely:
\[
\mathbf{v} = \dot{\mathbf{r}} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} \quad (7.24)
\]
\[
\mathbf{a} = \ddot{\mathbf{r}} = \frac{d^2x}{dt^2}\mathbf{i} + \frac{d^2y}{dt^2}\mathbf{j} + \frac{d^2z}{dt^2}\mathbf{k} \quad (7.25)
\]

Figure 7.3: A small change in a vector $\mathbf{r}(t)$ resulting from a small change in $t$.

From Fig. 7.3 it is clear that, for $\Delta t \to 0$, the vector $\mathbf{v}(t) = \dot{\mathbf{r}}(t)$ tends to be tangent to the curve $C$ that describes the motion of the particle in space. Given a scalar function $\psi(t)$ and two vector functions $\mathbf{a}(t)$ and $\mathbf{b}(t)$, it can be shown that:
\[
\frac{d}{dt}(\psi\mathbf{a}) = \psi\dot{\mathbf{a}} + \dot{\psi}\mathbf{a} \quad (7.26)
\]
\[
\frac{d}{dt}(\mathbf{a}\cdot\mathbf{b}) = \mathbf{a}\cdot\dot{\mathbf{b}} + \dot{\mathbf{a}}\cdot\mathbf{b} \quad (7.27)
\]
\[
\frac{d}{dt}(\mathbf{a}\times\mathbf{b}) = \mathbf{a}\times\dot{\mathbf{b}} + \dot{\mathbf{a}}\times\mathbf{b}. \quad (7.28)
\]

Example 7.2.1 Demonstrate Eq. 7.28.
We use Eq. 7.7 and calculate the derivatives of the three components $(\mathbf{a}\times\mathbf{b})_i$, $(\mathbf{a}\times\mathbf{b})_j$ and $(\mathbf{a}\times\mathbf{b})_k$ separately, obtaining:
\[
\frac{d}{dt}(\mathbf{a}\times\mathbf{b})_i = \frac{d}{dt}(a_yb_z - a_zb_y) = \dot{a}_yb_z - \dot{a}_zb_y + a_y\dot{b}_z - a_z\dot{b}_y = (\dot{\mathbf{a}}\times\mathbf{b})_i + (\mathbf{a}\times\dot{\mathbf{b}})_i
\]
\[
\frac{d}{dt}(\mathbf{a}\times\mathbf{b})_j = \frac{d}{dt}(a_zb_x - a_xb_z) = \dot{a}_zb_x - \dot{a}_xb_z + a_z\dot{b}_x - a_x\dot{b}_z = (\dot{\mathbf{a}}\times\mathbf{b})_j + (\mathbf{a}\times\dot{\mathbf{b}})_j
\]
\[
\frac{d}{dt}(\mathbf{a}\times\mathbf{b})_k = \frac{d}{dt}(a_xb_y - a_yb_x) = \dot{a}_xb_y - \dot{a}_yb_x + a_x\dot{b}_y - a_y\dot{b}_x = (\dot{\mathbf{a}}\times\mathbf{b})_k + (\mathbf{a}\times\dot{\mathbf{b}})_k.
\]
By summing up these three components, we obtain Eq. 7.28.

Example 7.2.2 Given a vector $\mathbf{a}$ with constant magnitude, demonstrate that $\dot{\mathbf{a}}$ is perpendicular to $\mathbf{a}$.
By using Eq. 7.27 we obtain:
\[
\frac{d}{dt}(\mathbf{a}\cdot\mathbf{a}) = \mathbf{a}\cdot\dot{\mathbf{a}} + \dot{\mathbf{a}}\cdot\mathbf{a} = 2\mathbf{a}\cdot\dot{\mathbf{a}}.
\]
The left hand side of this equation is:
\[
\frac{d}{dt}(\mathbf{a}\cdot\mathbf{a}) = \frac{da^2}{dt} = 0,
\]
because the magnitude of $\mathbf{a}$ is constant. Therefore $\mathbf{a}\cdot\dot{\mathbf{a}} = 0$, and this implies that these two vectors are perpendicular.

If a vector function $\mathbf{a}$ depends not only on a parameter $t$ but on several parameters, it is possible to define the partial derivatives of $\mathbf{a}$. An example is some vector quantity $\mathbf{a}(\mathbf{r}) = \mathbf{a}(x, y, z)$ that depends on the position vector $\mathbf{r} = (x, y, z)$ in a Cartesian coordinate system. In this case, the partial derivatives of $\mathbf{a}$ are defined as:
\[
\frac{\partial\mathbf{a}}{\partial x} = \lim_{\Delta x \to 0}\frac{\mathbf{a}(x + \Delta x, y, z) - \mathbf{a}(x, y, z)}{\Delta x} \quad (7.29)
\]
\[
\frac{\partial\mathbf{a}}{\partial y} = \lim_{\Delta y \to 0}\frac{\mathbf{a}(x, y + \Delta y, z) - \mathbf{a}(x, y, z)}{\Delta y} \quad (7.30)
\]
\[
\frac{\partial\mathbf{a}}{\partial z} = \lim_{\Delta z \to 0}\frac{\mathbf{a}(x, y, z + \Delta z) - \mathbf{a}(x, y, z)}{\Delta z} \quad (7.31)
\]

Given a curve $C$ in space that describes the motion $\mathbf{r}(t)$ of a particle, we can define $s$ as the arc length along the curve, measured from some fixed point. It is possible to find $s$ as a function of $t$ in this way: the infinitesimal displacement $d\mathbf{r}$ of the particle along the curve $C$ is given by:
\[
d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}.
\]
For an infinitesimal displacement, the curve can be approximated by a straight line and therefore its length $ds$ is given by:
\[
ds = |d\mathbf{r}| = \sqrt{(dx)^2 + (dy)^2 + (dz)^2} = \sqrt{d\mathbf{r}\cdot d\mathbf{r}}. \quad (7.32)
\]
If we now divide both sides of this equation by $dt$, we obtain:
\[
\frac{ds}{dt} = \sqrt{\frac{d\mathbf{r}}{dt}\cdot\frac{d\mathbf{r}}{dt}} = \sqrt{\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}},
\]
therefore:
\[
s(t) = \int\sqrt{\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}}\,dt. \quad (7.33)
\]
If we wish to know the length of the curve $C$ between the points $\mathbf{r}(t_1)$ and $\mathbf{r}(t_2)$, this is given by:
\[
s = \int_{t_1}^{t_2}\sqrt{\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}}\,dt. \quad (7.34)
\]
Indeed, it is not necessary that the parameter describing the curve $C$ be the time $t$. The curve $C$ can be parameterized through a parameter $u$ such that, in a Cartesian coordinate system, $\mathbf{r}$ can be described in parametric form as $x(u)\mathbf{i} + y(u)\mathbf{j} + z(u)\mathbf{k}$. In this case we obtain:
\[
s(u) = \int\sqrt{\frac{d\mathbf{r}(u)}{du}\cdot\frac{d\mathbf{r}(u)}{du}}\,du. \quad (7.35)
\]

Example 7.2.3 Given a curve $C$ in the $xy$-plane described by the equation $y = y(x)$, demonstrate that the arc length of the curve between $x = a$ and $x = b$ is given by:
\[
s = \int_a^b\sqrt{1 + [y'(x)]^2}\,dx.
\]
We can assume that the curve $C$ is described by the parameter $u = x$. In this way, the parametric equation of the curve $C$ is:
\[
C:\ \mathbf{r}(u) = u\mathbf{i} + y(u)\mathbf{j}.
\]
The derivative of $\mathbf{r}$ with respect to $u$ is thus given by:
\[
\frac{d\mathbf{r}}{du} = \mathbf{i} + \frac{dy(u)}{du}\mathbf{j} = \mathbf{i} + y'(x)\mathbf{j}.
\]
The scalar product $\frac{d\mathbf{r}}{du}\cdot\frac{d\mathbf{r}}{du}$ is thus given by:
\[
\frac{d\mathbf{r}}{du}\cdot\frac{d\mathbf{r}}{du} = 1 + [y'(x)]^2
\]
(see Eq. 7.5). We can now apply Eq. 7.35, obtaining:
\[
s = \int_a^b\sqrt{1 + [y'(x)]^2}\,dx.
\]

We have seen that the vector $\mathbf{r}(t + \Delta t) - \mathbf{r}(t)$ tends to be tangent to the curve $C$ for $\Delta t \to 0$ (see Fig. 7.3). This continues to be true no matter how the curve $C$ is parameterized. For instance, we can use as a parameter the arc length $s$, and still $\mathbf{r}(s + \Delta s) - \mathbf{r}(s)$ will tend to be tangent to $C$ for $\Delta s \to 0$ (see Fig. 7.4). Moreover, the magnitude of the vector $\Delta\mathbf{r} = \mathbf{r}(s + \Delta s) - \mathbf{r}(s)$ will tend to $\Delta s$ (see again Fig. 7.4 and see Eq. 7.32). This means that the vector
\[
\frac{d\mathbf{r}}{ds} = \lim_{\Delta s \to 0}\frac{\mathbf{r}(s + \Delta s) - \mathbf{r}(s)}{\Delta s}
\]
is a unit vector, tangent to the curve $C$. We will call it the unit tangent and denote it with $\hat{\mathbf{t}}$.

Figure 7.4: The unit tangent $\hat{\mathbf{t}}$.

Taking into account Eq. 7.32 (namely that $|d\mathbf{r}| = ds$) we can write:
\[
\hat{\mathbf{t}} = \frac{d\mathbf{r}}{ds} = \frac{\frac{d\mathbf{r}}{dt}}{\frac{ds}{dt}} = \frac{\frac{d\mathbf{r}}{dt}}{\left|\frac{d\mathbf{r}}{dt}\right|}. \quad (7.36)
\]
Since $\hat{\mathbf{t}}$ is a vector with constant (unit) magnitude, it follows from Example 7.2.2 that it must be perpendicular to $d\hat{\mathbf{t}}/ds$. We can therefore write:
\[
\frac{d\hat{\mathbf{t}}}{ds} = \kappa\hat{\mathbf{n}}, \quad (7.37)
\]
where $\hat{\mathbf{n}}$ is called the principal normal and it is a unit vector perpendicular to $\hat{\mathbf{t}}$ (and therefore normal to the curve $C$).
The quantity $\kappa$ is called the curvature of the curve $C$ and its inverse $\rho = 1/\kappa$ is called the radius of curvature. The unit vector perpendicular to both $\hat{\mathbf{t}}$ and $\hat{\mathbf{n}}$ (namely the vector $\hat{\mathbf{b}} = \hat{\mathbf{t}}\times\hat{\mathbf{n}}$) is called the binormal to $C$. Its derivative with respect to $s$ is perpendicular to $\hat{\mathbf{b}}$ (see again Example 7.2.2). It can be shown that it is also perpendicular to $\hat{\mathbf{t}}$; it must therefore be parallel to $\hat{\mathbf{n}}$. One obtains:
\[
\frac{d\hat{\mathbf{b}}}{ds} = -\tau\hat{\mathbf{n}},
\]
where $\tau$ is called the torsion of the curve. At any given point on $C$, the three vectors $\hat{\mathbf{t}}$, $\hat{\mathbf{n}}$ and $\hat{\mathbf{b}}$ form a right-handed rectangular coordinate system.

Example 7.2.4 Given a particle moving along a trajectory $\mathbf{r}(t)$, calculate the components of the acceleration $\mathbf{a}(t)$ with respect to the basis $\hat{\mathbf{t}}$, $\hat{\mathbf{n}}$, $\hat{\mathbf{b}}$.
The velocity of the particle is given by:
\[
\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} = \frac{ds}{dt}\hat{\mathbf{t}} = v\hat{\mathbf{t}},
\]
where $v = ds/dt$ is the speed of the particle. To obtain the acceleration we have to differentiate this expression once more, obtaining:
\[
\mathbf{a}(t) = \frac{d\mathbf{v}(t)}{dt} = \frac{dv}{dt}\hat{\mathbf{t}} + v\frac{d\hat{\mathbf{t}}}{dt}.
\]
With the help of Eq. 7.37 we can express $d\hat{\mathbf{t}}/dt$ as:
\[
\frac{d\hat{\mathbf{t}}}{dt} = \frac{d\hat{\mathbf{t}}}{ds}\frac{ds}{dt} = v\kappa\hat{\mathbf{n}} = \frac{v}{\rho}\hat{\mathbf{n}}.
\]
Therefore, we have:
\[
\mathbf{a}(t) = \frac{dv}{dt}\hat{\mathbf{t}} + \frac{v^2}{\rho}\hat{\mathbf{n}},
\]
namely the acceleration has a component along the tangent of the particle's trajectory and a component (the centripetal acceleration) in the direction of the principal normal.

Example 7.2.5 The logarithmic spiral can be described by means of the parametric equations:
\[
x = ae^{b\theta}\cos\theta, \qquad y = ae^{b\theta}\sin\theta.
\]
Find the arc length $s$ as a function of $\theta$, the vectors $\hat{\mathbf{t}}$ and $\hat{\mathbf{n}}$ and the curvature $\kappa$ at each point of this curve.
The vector $\mathbf{r}$ is parameterized through the parameter $\theta$, namely we have:
\[
\mathbf{r} = \mathbf{r}(\theta) = \begin{pmatrix} ae^{b\theta}\cos\theta \\ ae^{b\theta}\sin\theta \end{pmatrix}.
\]
Since the curve lies in the $xy$-plane, we do not need the $z$-coordinate (which can be taken equal to zero). The arc length $s$ is given by:
\[
s = \int\sqrt{\frac{d\mathbf{r}}{d\theta}\cdot\frac{d\mathbf{r}}{d\theta}}\,d\theta
= \int ae^{b\theta}\sqrt{(b\cos\theta - \sin\theta)^2 + (b\sin\theta + \cos\theta)^2}\,d\theta
= a\int e^{b\theta}\sqrt{b^2 + 1}\,d\theta
= \frac{a\sqrt{b^2+1}}{b}e^{b\theta}.
\]
The unit tangent vector $\hat{\mathbf{t}}$ is given by:
\[
\hat{\mathbf{t}} = \frac{d\mathbf{r}}{ds} = \frac{\frac{d\mathbf{r}}{d\theta}}{\frac{ds}{d\theta}}
= \frac{ae^{b\theta}\begin{pmatrix} b\cos\theta - \sin\theta \\ b\sin\theta + \cos\theta \end{pmatrix}}{a\sqrt{b^2+1}\,e^{b\theta}}
= \frac{1}{\sqrt{b^2+1}}\begin{pmatrix} b\cos\theta - \sin\theta \\ b\sin\theta + \cos\theta \end{pmatrix}.
\]
The vector $\kappa\hat{\mathbf{n}} = d\hat{\mathbf{t}}/ds$ is given by:
\[
\frac{d\hat{\mathbf{t}}}{ds} = \frac{\frac{d\hat{\mathbf{t}}}{d\theta}}{\frac{ds}{d\theta}}
= \frac{\frac{1}{\sqrt{b^2+1}}\begin{pmatrix} -b\sin\theta - \cos\theta \\ b\cos\theta - \sin\theta \end{pmatrix}}{a\sqrt{b^2+1}\,e^{b\theta}}
= \frac{e^{-b\theta}}{a(b^2+1)}\begin{pmatrix} -b\sin\theta - \cos\theta \\ b\cos\theta - \sin\theta \end{pmatrix}.
\]
The magnitude of this vector is:
\[
\left|\frac{d\hat{\mathbf{t}}}{ds}\right| = \frac{e^{-b\theta}}{a(b^2+1)}\sqrt{(b\sin\theta + \cos\theta)^2 + (b\cos\theta - \sin\theta)^2} = \frac{e^{-b\theta}}{a\sqrt{b^2+1}} = \kappa.
\]
Consequently, the vector $\hat{\mathbf{n}}$ is given by:
\[
\hat{\mathbf{n}} = \frac{1}{\sqrt{b^2+1}}\begin{pmatrix} -b\sin\theta - \cos\theta \\ b\cos\theta - \sin\theta \end{pmatrix}.
\]
It is worth noticing that $\kappa$ is exactly inversely proportional to the arc length $s$, namely we have the relation:
\[
\kappa = \frac{1}{bs}.
\]
The relation between the curvature and the arc length of a curve takes the name of the Cesàro equation. In the case of the logarithmic spiral, the Cesàro equation takes the very simple form $\kappa = C/s$, where $C$ is a constant ($C = 1/b$). We have not calculated the binormal $\hat{\mathbf{b}}$ and the torsion $\tau$ of the curve. This is not necessary, since the given curve lies in the $xy$-plane. For this reason, $\hat{\mathbf{b}}$ (which is perpendicular to both $\hat{\mathbf{t}}$ and $\hat{\mathbf{n}}$) must necessarily coincide with $\mathbf{k}$. Since $\mathbf{k}$ is a constant vector, its derivative with respect to $s$ is zero. From the relation
\[
\frac{d\hat{\mathbf{b}}}{ds} = -\tau\hat{\mathbf{n}},
\]
it is clear that $\tau$ must be zero.
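The closed forms of Example 7.2.5 can be checked against a purely numerical estimate of the curvature, using the standard plane-curve formula $\kappa = |x'y'' - y'x''|/(x'^2 + y'^2)^{3/2}$ with finite differences; the parameter values below are arbitrary choices for illustration.

```python
import numpy as np

a, b = 1.0, 0.2

def r(theta):
    return np.array([a*np.exp(b*theta)*np.cos(theta),
                     a*np.exp(b*theta)*np.sin(theta)])

theta, h = 1.5, 1e-4
d1 = (r(theta + h) - r(theta - h)) / (2*h)          # dr/dtheta
d2 = (r(theta + h) - 2*r(theta) + r(theta - h)) / h**2

kappa_num = abs(d1[0]*d2[1] - d1[1]*d2[0]) / np.linalg.norm(d1)**3
kappa_exact = np.exp(-b*theta) / (a*np.sqrt(b**2 + 1))
s = a*np.sqrt(b**2 + 1)/b * np.exp(b*theta)

print(np.isclose(kappa_num, kappa_exact))   # -> True
print(np.isclose(kappa_exact, 1/(b*s)))     # -> True: the Cesaro form kappa = 1/(b s)
```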
7.2.2 Scalar and vector fields

We have seen that, once we have defined a basis of a linear vector space, we can express any vector of this space as a linear combination of the basis vectors. For instance, in the ordinary R³ space with a Cartesian coordinate system, the position vector r can be decomposed into three components x, y, z, parallel to the basis vectors i, j and k, respectively. Given a linear vector space R, a scalar field φ(r) = φ(x, y, z) associates a scalar to each point r = (x, y, z) of R, and a vector field a(r) = a(x, y, z) associates a vector to each point. Typical examples of scalar fields are the temperature in a room (we can associate each point of the room with a certain value of the temperature), or the pressure at each point in a fluid. A typical vector field is instead the velocity field in a fluid (each element of the fluid has a different velocity, characterized by a different magnitude and direction).

7.2.3 Vector operators

For scalar and vector fields, too, it is possible to define differential operators. The most important are the gradient of a scalar field and the divergence and the curl of a vector field. Central to all these differential operators is the vector operator ∇ (also called nabla), defined in this way:

∇ = i ∂/∂x + j ∂/∂y + k ∂/∂z.        (7.38)

Gradient of a scalar field

The gradient of a scalar field φ(x, y, z) (indicated with grad φ or ∇φ) is the vector whose components along x, y and z are the partial derivatives of φ(x, y, z) with respect to x, y, z, respectively, namely:

grad φ = ∇φ = i ∂φ/∂x + j ∂φ/∂y + k ∂φ/∂z.        (7.39)

Given a constant K and two scalar fields φ and ψ, it is easy to show that:

∇(Kφ) = K ∇φ        (7.40)

∇(φ + ψ) = ∇φ + ∇ψ        (7.41)

∇(φψ) = ψ ∇φ + φ ∇ψ.        (7.42)

Example 7.2.6 Find the gradient of φ(r) = r (the magnitude of the position vector r).

It is of course r = √(x² + y² + z²). We have:

∇r = i ∂r/∂x + j ∂r/∂y + k ∂r/∂z = (i 2x + j 2y + k 2z)/[2√(x² + y² + z²)] = r/r.

The vector r/r is the unit vector in the direction of r and is usually indicated with êᵣ. We could have obtained the same result by formally differentiating the scalar field r with respect to the vector r, namely:

∇r = dr/dr = d√(r · r)/dr = [1/(2√(r · r))] d(r · r)/dr = 2r/(2|r|) = r/r,

where we have made use of Eq. 7.42.

The gradient of a scalar field has a simple physical interpretation: it gives the rate of change of φ in any particular direction. In fact, the infinitesimal change dφ in going from r to r + dr (= (x + dx)i + (y + dy)j + (z + dz)k) is given by:

dφ = (∂φ/∂x) dx + (∂φ/∂y) dy + (∂φ/∂z) dz
   = (i ∂φ/∂x + j ∂φ/∂y + k ∂φ/∂z) · (i dx + j dy + k dz) = ∇φ · dr.

If we now divide both sides of this equation by the infinitesimal arc length ds, we obtain:

dφ/ds = ∇φ · dr/ds = ∇φ · t̂,        (7.43)

where t̂ is the unit tangent to the curve parameterized by r(s). In general, the rate of change of φ with respect to the distance s in a particular direction a is given by:

dφ/ds = ∇φ · â,        (7.44)

where â is the unit vector in the direction of a. This is also called the directional derivative. Recalling that a · b = ab cos θ, it is clear that:

dφ/ds = |∇φ| cos θ,

where θ is the angle between â and ∇φ. This equation also tells us that ∇φ points in the direction of the fastest increase in φ, and that |∇φ| is the largest possible value of dφ/ds.
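The directional derivative of Eq. 7.44 is easy to test numerically: a central difference of φ along â must agree with ∇φ · â. A small sketch (ours, not from the text; the field and the point are arbitrary choices):

import numpy as np

# Compare the finite-difference directional derivative of
# phi(x, y, z) = x**2 * y + z with grad(phi) . a_hat, where the
# analytic gradient is (2xy, x**2, 1).
def phi(p):
    x, y, z = p
    return x**2 * y + z

p = np.array([1.0, 2.0, 3.0])
a_hat = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)

h = 1e-6
numeric = (phi(p + h * a_hat) - phi(p - h * a_hat)) / (2.0 * h)
grad = np.array([2.0 * p[0] * p[1], p[0]**2, 1.0])
print(numeric, grad @ a_hat)   # both approximately 3.4641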
Example 7.2.7 Find the directional derivative of the scalar field φ = 2x√y + yz + z² along the direction of the vector a = i + j + k.

The gradient of φ is given by:

∇φ = i 2√y + j (z + x/√y) + k (y + 2z) = (2√y, z + x/√y, y + 2z)ᵀ.

The unit vector in the direction of a is given by:

â = (1/√3) (1, 1, 1)ᵀ.

The directional derivative of the scalar field φ along a is thus given by:

∇φ · â = (1/√3) (2√y + x/√y + y + 3z).

Example 7.2.8 Find the directional derivative of the scalar field:

φ(x, y, z) = cos²(xy) + sin²z,

at the point P = (π/4, 1/2, π/8) in the direction a = 2i + 2j + k.

The gradient of φ at the point P is:

∇φ|_P = (−2y sin(xy) cos(xy), −2x cos(xy) sin(xy), 2 sin z cos z)ᵀ|_P
      = (−y sin(2xy), −x sin(2xy), sin(2z))ᵀ|_P
      = (−√2/4, −π√2/8, √2/2)ᵀ.

The vector a has magnitude a = √(4 + 4 + 1) = 3, therefore the requested directional derivative is:

dφ/ds = (1/3) [2(−√2/4) + 2(−π√2/8) + √2/2] = −π√2/12.

There is also a simple geometrical interpretation of the gradient of a scalar field. The points in space at which the scalar field is equal to a constant value c designate a surface, since they must obey the equation φ(x, y, z) = c. Since φ is constant on this surface, dφ/ds must be zero if we move along it. From Eq. 7.43 we can see that, in this case, ∇φ · t̂ = 0. In other words, ∇φ is a vector normal to the surface φ(x, y, z) = c at every point.

Divergence of a vector field

The divergence of a vector field a(x, y, z) = i aₓ + j a_y + k a_z is defined as:

div a = ∇ · a = ∂aₓ/∂x + ∂a_y/∂y + ∂a_z/∂z.        (7.45)

Clearly, ∇ · a is a scalar field.

Example 7.2.9 A central force field is a vector field directed along the position vector r, whose magnitude depends only on the distance r of the object from the origin. Calculate the divergence of such a field.

The central force field F(r) can be expressed as:

F(r) = r f(r),

for some specific function f(r). The divergence of F(r) is thus given by:

∇ · F(r) = ∂[x f(r)]/∂x + ∂[y f(r)]/∂y + ∂[z f(r)]/∂z
         = 3f(r) + x ∂f(r)/∂x + y ∂f(r)/∂y + z ∂f(r)/∂z
         = 3f(r) + x (∂f/∂r)(∂r/∂x) + y (∂f/∂r)(∂r/∂y) + z (∂f/∂r)(∂r/∂z)
         = 3f(r) + [x²/√(x² + y² + z²)] f′(r) + [y²/√(x² + y² + z²)] f′(r) + [z²/√(x² + y² + z²)] f′(r)
         = 3f(r) + r f′(r).

Given two vector fields a(x, y, z) and b(x, y, z) and a scalar field φ(x, y, z), it is:

∇ · (a + b) = ∇ · a + ∇ · b        (7.46)

∇ · (φa) = ∇φ · a + φ ∇ · a.        (7.47)
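The result of Example 7.2.9 can be spot-checked numerically. The sketch below (ours, not from the text) takes f(r) = r, for which 3f(r) + rf′(r) = 4r, and estimates the divergence of Eq. 7.45 by central differences:

import numpy as np

# Divergence of the central field F = r * f(r) with f(r) = r.
def F(p):
    r = np.linalg.norm(p)
    return p * r

p = np.array([1.0, -2.0, 0.5])
h = 1e-6
div = sum((F(p + h * e)[i] - F(p - h * e)[i]) / (2.0 * h)
          for i, e in enumerate(np.eye(3)))
print(div, 4.0 * np.linalg.norm(p))   # both approximately 9.1652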
A physical interpretation of the divergence is offered by hydrodynamics. Let us consider a fluid characterized by a density field ρ(x, y, z) and a velocity field v(x, y, z), and a small volume dx dy dz at the origin (see Fig. 7.5).

Figure 7.5: Differential rectangular parallelepiped (in the first octant).

The fluid entering this volume per unit time in the (positive) x-direction (namely the fluid crossing the face EFGH per unit time) is given by:

(rate in)_EFGH = [ρvₓ]_{x=0} dy dz.

Only the component vₓ of the velocity must be taken into account, because the other components v_y and v_z do not contribute to the flow through this face. The rate of flow out through the face ABCD is:

(rate out)_ABCD = [ρvₓ]_{x=dx} dy dz.

To evaluate this rate, we can make a Taylor series expansion of ρvₓ centered at x = 0, namely:

(rate out)_ABCD = [ρvₓ]_{x=dx} dy dz = [ρvₓ + (∂(ρvₓ)/∂x) dx]_{x=0} dy dz.

The net flow out of the parallelepiped in the x-direction is given by the flow out through the face ABCD minus the flow in through the face EFGH, namely:

[net rate of flow out]ₓ = [ρvₓ + (∂(ρvₓ)/∂x) dx]_{x=0} dy dz − [ρvₓ]_{x=0} dy dz = (∂(ρvₓ)/∂x) dx dy dz.

With the same reasoning we can find the net flow out per unit time along the directions y and z as well. The total net flow out is thus given by:

net flow out = [∂(ρvₓ)/∂x + ∂(ρv_y)/∂y + ∂(ρv_z)/∂z] dx dy dz = ∇ · (ρv) dx dy dz.        (7.48)

Namely, ∇ · (ρv) is the net outflow of fluid mass per unit time and per unit volume due to the fluid flow. The vector fields whose divergence vanishes are called solenoidal.

Since the gradient of a scalar field is a vector field, it is possible to calculate its divergence, obtaining once again a scalar field. The divergence of the gradient of a scalar field φ, ∇ · ∇φ, is indicated with ∇²φ. The operator ∇² is the scalar differential operator:

∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z²,        (7.49)

and it is called the Laplace operator. Analogously, ∇²φ is called the Laplacian of φ.

Curl of a vector field

The curl of a vector field a(x, y, z) = i aₓ + j a_y + k a_z is given by:

curl a = ∇ × a = det [ i  j  k ; ∂/∂x  ∂/∂y  ∂/∂z ; aₓ  a_y  a_z ]
       = (∂a_z/∂y − ∂a_y/∂z) i + (∂aₓ/∂z − ∂a_z/∂x) j + (∂a_y/∂x − ∂aₓ/∂y) k.        (7.50)

Clearly, the curl of a vector field is itself a vector field.

Example 7.2.10 Find the curl of the vector field:

a = y²z² i + x²z² j + x²y² k = (y²z², x²z², x²y²)ᵀ.

We simply have to apply Eq. 7.50, obtaining:

∇ × a = (2x²(y − z), 2y²(z − x), 2z²(x − y))ᵀ.

Example 7.2.11 Find the curl of the vector field:

F = r e^{−r²} = (x e^{−(x²+y²+z²)}, y e^{−(x²+y²+z²)}, z e^{−(x²+y²+z²)})ᵀ.

Also in this case we simply have to apply Eq. 7.50, obtaining:

∇ × F = e^{−(x²+y²+z²)} (−2yz + 2yz, −2xz + 2xz, −2xy + 2xy)ᵀ = 0.

Indeed, analogously to what we have done in Example 7.2.9, we can demonstrate that any central force field (as r e^{−r²} is) has zero curl. The vector fields whose curl is zero are called irrotational.

Given two vector fields a and b and a scalar field φ, it is possible to show that:

∇ × (a + b) = ∇ × a + ∇ × b        (7.51)

∇ × ∇φ = 0        (7.52)

∇ · (a × b) = b · (∇ × a) − a · (∇ × b)        (7.53)

∇ × (φa) = ∇φ × a + φ ∇ × a.        (7.54)

Figure 7.6: Curl of a fluid flow.

The physical significance of the curl of a vector field is not quite as transparent as that of the divergence. If we consider again a fluid characterized by a velocity field v(x, y, z), we can see from Fig. 7.6 that, if the component v_y of the velocity increases with z, the fluid tends to curl clockwise (namely in the negative sense) about the x-axis. How much does it curl? That depends on how much the component v_y increases with z, namely on ∂v_y/∂z. On the other hand, if the component v_z of the velocity field increases with y, the fluid again tends to curl about the x-axis, but this time counterclockwise (in the positive sense; see again Fig. 7.6). The curl depends in this case on ∂v_z/∂y. The curl of v about the x-axis is thus given by the sum of these two contributions, namely:

[curl v]₁ = ∂v_z/∂y − ∂v_y/∂z,

which is exactly the first component of Eq. 7.50. The other two components can be found analogously.
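Eq. 7.50 can likewise be evaluated numerically from the Jacobian of the field. The following sketch (ours, not from the text) verifies the conclusion of Example 7.2.11 that a central field is irrotational:

import numpy as np

# Curl of the central field F = r * exp(-r.r) should vanish everywhere.
def F(p):
    return p * np.exp(-p @ p)

def curl(F, p, h=1e-6):
    # Central-difference Jacobian J[i, j] = dF_i/dx_j.
    J = np.zeros((3, 3))
    for j, e in enumerate(np.eye(3)):
        J[:, j] = (F(p + h * e) - F(p - h * e)) / (2.0 * h)
    return np.array([J[2, 1] - J[1, 2],
                     J[0, 2] - J[2, 0],
                     J[1, 0] - J[0, 1]])

print(curl(F, np.array([0.3, -1.1, 0.7])))   # approximately (0, 0, 0)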
7.3 Transformation of coordinates

As we have already said, a vector is a mathematical object characterized by a direction and a magnitude; it can be associated with a column of N real numbers (3 real numbers in the case of the familiar R³ vector space) only once we have established a basis of our vector space. So far we have always worked with the familiar Cartesian coordinate system x, y, z, but for many problems it is convenient to use other coordinate systems.

7.3.1 Rotation of the coordinate axes

We start here with one of the simplest possible coordinate transformations, namely the rotation of the coordinate axes.

Figure 7.7: Rotation of Cartesian coordinate axes about the z-axis.

Referring to Fig. 7.7, we may wish to express the position vector r (and any other vector v) as a function of the new coordinates x′ and y′ that are obtained by rotating the x- and the y-axis by the same angle θ (we leave the z-axis unaltered). Looking at the figure it is easy to see that:

x = r cos ϕ₁        y = r sin ϕ₁
x′ = r cos ϕ₂       y′ = r sin ϕ₂,

where r is the magnitude of the position vector r. But ϕ₁ = θ + ϕ₂ (and therefore ϕ₂ = ϕ₁ − θ), hence we have:

x′ = r cos(ϕ₁ − θ) = r cos ϕ₁ cos θ + r sin ϕ₁ sin θ = x cos θ + y sin θ        (7.55)

y′ = r sin(ϕ₁ − θ) = r sin ϕ₁ cos θ − r cos ϕ₁ sin θ = −x sin θ + y cos θ        (7.56)

x = r cos(ϕ₂ + θ) = r cos ϕ₂ cos θ − r sin ϕ₂ sin θ = x′ cos θ − y′ sin θ        (7.57)

y = r sin(ϕ₂ + θ) = r sin ϕ₂ cos θ + r cos ϕ₂ sin θ = x′ sin θ + y′ cos θ.        (7.58)

Indeed, we can apply this argument to any vector, because any vector can be represented by an arrow starting at the origin of the axes and ending at some point R (translations of the axes are always possible if the vector does not start at the origin). Therefore, given a vector v represented by (vₓ, v_y)ᵀ in the xy-coordinate system, rotation of the axes by an angle θ will transform its components in this way:

vₓ′ = vₓ cos θ + v_y sin θ
v_y′ = −vₓ sin θ + v_y cos θ.        (7.59)

We can even use this formula as the definition of a vector, namely we call a vector any geometrical object whose components under rotation of the coordinate system satisfy Eq. 7.59. This condition is also called covariance. We can write Eq. 7.59 in the matrix formalism in this way:

v′ = (vₓ′, v_y′)ᵀ = Av = (A₁₁ A₁₂ ; A₂₁ A₂₂) (vₓ, v_y)ᵀ,        (7.60)

namely, recalling Sect. 7.1.3, the rotation of axes is a linear operator that can be expressed with the matrix A. In the case of the rotation of axes by an angle θ, the coefficients A_ij are given by:

A₁₁ = cos θ,   A₁₂ = sin θ,   A₂₁ = −sin θ,   A₂₂ = cos θ.        (7.61)
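These relations are conveniently checked with a short numerical sketch (ours, not from the text; numpy assumed). The rotation matrix A of Eq. 7.61 transforms components while leaving the magnitude unchanged, and A.T @ A = I, a fact that anticipates the inverse transformation discussed below:

import numpy as np

# Rotation of the axes by theta (Eq. 7.60) applied to a sample vector.
theta = 0.3
A = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

v = np.array([2.0, -1.0])
v_prime = A @ v
print(v_prime)                                        # transformed components
print(np.linalg.norm(v), np.linalg.norm(v_prime))     # equal magnitudes
print(np.allclose(A.T @ A, np.eye(2)))                # True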
By looking at Eqs. 7.55 and 7.56 we can even notice that:

∂x′/∂x = cos θ = A₁₁
∂x′/∂y = sin θ = A₁₂
∂y′/∂x = −sin θ = A₂₁
∂y′/∂y = cos θ = A₂₂.

We can generalize these four equations with the relation:

∂xᵢ′/∂xⱼ = A_ij,        (7.62)

where 1 (i = 1 or j = 1) is the index relative to the x-components (x or x′) and 2 (i = 2 or j = 2) refers to the y-components (y or y′). We can thus rewrite Eq. 7.60 as:

vᵢ′ = Σⱼ₌₁² (∂xᵢ′/∂xⱼ) vⱼ.

We can now generalize these formulae to an N-dimensional vector space. For any kind of axes rotation in an N-dimensional vector space it is possible to find the coefficients A_ij such that the equation:

v′ = Av,

or, in terms of components,

vᵢ′ = Σⱼ₌₁ᴺ A_ij vⱼ,   i = 1, . . . , N,        (7.63)

represents the transformation of the components of v between the two coordinate systems. The position vector r = (x₁, . . . , x_N)ᵀ must transform in the same way, namely:

xᵢ′ = Σₖ₌₁ᴺ A_ik xₖ.

If we now differentiate both members of this equation with respect to xⱼ, we obtain:

∂xᵢ′/∂xⱼ = ∂/∂xⱼ [Σₖ₌₁ᴺ A_ik xₖ] = Σₖ₌₁ᴺ A_ik (∂xₖ/∂xⱼ) = A_ij,

since the various coordinates xⱼ are supposed to be independent, and therefore the only value of k for which the derivative of xₖ with respect to xⱼ is different from zero is k = j. We have therefore obtained that:

A_ij = ∂xᵢ′/∂xⱼ,        (7.64)

thus Eq. 7.63 translates into:

vᵢ′ = Σⱼ₌₁ᴺ (∂xᵢ′/∂xⱼ) vⱼ.        (7.65)

On the other hand, in our 2-dimensional example we have also seen the inverse vector transformations, namely the transformations from the coordinates x′, y′ to the coordinates x, y (Eqs. 7.57 and 7.58). For a generic vector v these relations can be expressed as:

vₓ = vₓ′ cos θ − v_y′ sin θ
v_y = vₓ′ sin θ + v_y′ cos θ,

or:

v = A′v′,        (7.66)

where:

A′ = (cos θ  −sin θ ; sin θ  cos θ),

namely the transformation matrix A′ differs from the transformation matrix A of Eq. 7.60 only in that the off-diagonal elements have been interchanged. That means that A′ = Aᵀ. By looking again at Eqs. 7.57 and 7.58 we can notice that:

∂x/∂x′ = cos θ = A′₁₁ = A₁₁
∂x/∂y′ = −sin θ = A′₁₂ = A₂₁
∂y/∂x′ = sin θ = A′₂₁ = A₁₂
∂y/∂y′ = cos θ = A′₂₂ = A₂₂.

We can generalize these four equations with the relation:

∂xⱼ/∂xᵢ′ = A_ij        (7.67)

(note the inversion of i and j with respect to Eq. 7.62, due to the fact that A′ = Aᵀ). Combining Eq. 7.67 with Eq. 7.63, we can thus write the same transformation of the components in a second way:

vᵢ′ = Σⱼ₌₁² (∂xⱼ/∂xᵢ′) vⱼ,

or, in the general case of an N-dimensional vector space:

vᵢ′ = Σⱼ₌₁ᴺ (∂xⱼ/∂xᵢ′) vⱼ.        (7.68)

7.3.2 General curvilinear coordinates

Figure 7.8: General curvilinear coordinates.

In general, the position of a point P in space (having Cartesian coordinates x, y, z) can be expressed in terms of three curvilinear coordinates u₁, u₂, u₃, provided that there is a one-to-one correspondence between (x, y, z) and (u₁, u₂, u₃). The point P lies at the intersection of three surfaces of the kind u₁ = c₁, u₂ = c₂ and u₃ = c₃, where c₁, c₂ and c₃ are constants. These three surfaces intersect pairwise in three coordinate curves u₁, u₂ and u₃ (see Fig. 7.8). We can take as basis vectors of this new coordinate system the vectors ê₁, ê₂ and ê₃, tangent to the curves u₁, u₂ and u₃ at the point P. We will concentrate here on orthogonal coordinate systems (the most useful), namely systems for which ê₁, ê₂ and ê₃ are mutually perpendicular. In this case we have êᵢ · êⱼ = 0 if i ≠ j and ê₁ = ê₂ × ê₃.

If r(u₁, u₂, u₃) is the position vector of the point P, then ∂r/∂u₁ is a vector tangent to the u₁-curve at P (u₂ and u₃ remaining constant), therefore it has the direction of ê₁. We denote its magnitude with h₁. We can similarly define h₂ and h₃ as the magnitudes of the vectors ∂r/∂u₂ and ∂r/∂u₃, having the direction of ê₂ and ê₃, respectively. We have thus:

ê₁ = (1/h₁) ∂r/∂u₁,   ê₂ = (1/h₂) ∂r/∂u₂,   ê₃ = (1/h₃) ∂r/∂u₃.        (7.69)

The quantities h₁, h₂ and h₃ are called the scale factors of the curvilinear coordinate system. It is worth remarking that the quantities u₁, u₂ and u₃ need not be lengths. From this equation we notice, however, that hⱼ duⱼ must have the dimension of a length. In a Cartesian coordinate system it is clear that ∂r/∂x = i (and analogously for the other two components), therefore h₁ = h₂ = h₃ = 1. The infinitesimal vector displacement dr is expressed as:

dr = (∂r/∂u₁) du₁ + (∂r/∂u₂) du₂ + (∂r/∂u₃) du₃ = h₁ du₁ ê₁ + h₂ du₂ ê₂ + h₃ du₃ ê₃.        (7.70)

The arc length is obtained by the formula (ds)² = dr · dr.
In the case of orthogonal curvilinear coordinates, as we have said, it is êᵢ · êⱼ = 0 if i ≠ j, therefore the arc length is given by:

(ds)² = h₁² (du₁)² + h₂² (du₂)² + h₃² (du₃)².        (7.71)

From Eq. 7.70 it is clear that, if the coordinate system is orthogonal, the element of volume dV is a rectangular parallelepiped whose sides are h₁ du₁, h₂ du₂, h₃ du₃, namely:

dV = h₁h₂h₃ du₁ du₂ du₃.

Given a scalar field φ, the expression dφ = ∇φ · dr we have encountered in Sect. 7.2.3 remains valid. In fact, the interpretation of the gradient as the rate of change of φ along a particular direction holds irrespective of the chosen coordinate system: for any coordinate system, the gradient of a scalar field must always be the vector having the magnitude and direction of the maximum spatial rate of change. Since:

dφ = (∂φ/∂u₁) du₁ + (∂φ/∂u₂) du₂ + (∂φ/∂u₃) du₃,

upon substituting into dφ = ∇φ · dr the expression for dr found in Eq. 7.70, we find:

∇φ = (1/h₁)(∂φ/∂u₁) ê₁ + (1/h₂)(∂φ/∂u₂) ê₂ + (1/h₃)(∂φ/∂u₃) ê₃.        (7.72)

Consequently, the nabla operator can be expressed as:

∇ = ê₁ (1/h₁) ∂/∂u₁ + ê₂ (1/h₂) ∂/∂u₂ + ê₃ (1/h₃) ∂/∂u₃.        (7.73)

If we calculate the gradient of uⱼ by means of this formula we obtain êⱼ/hⱼ, namely:

êⱼ = hⱼ ∇uⱼ.

This formula can be used in combination with Eqs. 7.46, 7.47, 7.52, 7.53 and 7.54 to express the divergence and the curl of a vector field a and the Laplacian of a scalar field φ in a curvilinear coordinate system, namely:

∇ · a = [1/(h₁h₂h₃)] [∂(h₂h₃a₁)/∂u₁ + ∂(h₃h₁a₂)/∂u₂ + ∂(h₁h₂a₃)/∂u₃]        (7.74)

∇ × a = [1/(h₁h₂h₃)] det [ h₁ê₁  h₂ê₂  h₃ê₃ ; ∂/∂u₁  ∂/∂u₂  ∂/∂u₃ ; h₁a₁  h₂a₂  h₃a₃ ]        (7.75)

∇²φ = [1/(h₁h₂h₃)] [∂/∂u₁ ((h₂h₃/h₁) ∂φ/∂u₁) + ∂/∂u₂ ((h₃h₁/h₂) ∂φ/∂u₂) + ∂/∂u₃ ((h₁h₂/h₃) ∂φ/∂u₃)].        (7.76)

We can for instance demonstrate Eq. 7.74, which is a bit more complicated than the other two equations. Let us consider the sub-expression ∇ · (a₁ê₁). Since ê₁ = ê₂ × ê₃ we have:

∇ · (a₁ê₁) = ∇ · (a₁h₂h₃ ∇u₂ × ∇u₃).

In fact, we have seen that for an orthogonal coordinate system it is êⱼ = hⱼ∇uⱼ. By using Eq. 7.47 we obtain:

∇ · (a₁ê₁) = ∇(a₁h₂h₃) · (∇u₂ × ∇u₃) + a₁h₂h₃ ∇ · (∇u₂ × ∇u₃).

Applying Eq. 7.53 we obtain:

∇ · (∇u₂ × ∇u₃) = ∇u₃ · (∇ × ∇u₂) − ∇u₂ · (∇ × ∇u₃),

which is equal to zero because of Eq. 7.52. Therefore we are left with:

∇ · (a₁ê₁) = ∇(a₁h₂h₃) · (∇u₂ × ∇u₃) = ∇(a₁h₂h₃) · [(ê₂/h₂) × (ê₃/h₃)] = ∇(a₁h₂h₃) · ê₁/(h₂h₃).

By applying Eq. 7.72 to the scalar field a₁h₂h₃ we obtain:

∇ · (a₁ê₁) = [1/(h₁h₂h₃)] ∂(a₁h₂h₃)/∂u₁,

which is the first term of Eq. 7.74. Analogously we can proceed to obtain the other two terms.
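Eq. 7.69 gives a practical recipe for the scale factors: differentiate r with respect to each coordinate and take the magnitude of the result. The sketch below (ours, not from the text) does this numerically for spherical coordinates (r, θ, φ), a case not worked out here, for which h_r = 1, h_θ = r, h_φ = r sin θ:

import numpy as np

# Scale factors of spherical coordinates via |dr/du_j| (Eq. 7.69).
def pos(u):
    r, th, ph = u
    return np.array([r * np.sin(th) * np.cos(ph),
                     r * np.sin(th) * np.sin(ph),
                     r * np.cos(th)])

u = np.array([2.0, 0.8, 1.3])        # sample point (r, theta, phi)
h = 1e-7
for e in np.eye(3):
    tangent = (pos(u + h * e) - pos(u - h * e)) / (2.0 * h)   # dr/du_j
    print(np.linalg.norm(tangent))   # 1, r = 2, r*sin(theta) ~ 1.4347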
Example 7.3.1 The cylindrical coordinate system (ρ, θ, z) is characterized by the following transformation equations:

x = ρ cos θ
y = ρ sin θ
z = z.

Find the expressions of the gradient and Laplacian of a scalar field φ, and of the divergence and curl of a vector field a, as functions of ρ, θ, z.

All we need to know are the scale factors of this coordinate system. We can start by considering that the infinitesimal arc length ds in the Cartesian coordinate system is given by:

(ds)² = (dx)² + (dy)² + (dz)².

We have:

dx = dρ cos θ − ρ sin θ dθ
dy = dρ sin θ + ρ cos θ dθ
dz = dz,

therefore:

(ds)² = (dρ cos θ − ρ sin θ dθ)² + (dρ sin θ + ρ cos θ dθ)² + (dz)²
      = (dρ)² (cos²θ + sin²θ) + ρ² (dθ)² (sin²θ + cos²θ) + (dz)²
      = (dρ)² + ρ² (dθ)² + (dz)².

By comparing this with Eq. 7.71 we can immediately recognize that:

h₁ = h_ρ = 1,   h₂ = h_θ = ρ,   h₃ = h_z = 1.

Now we can apply Eqs. 7.72, 7.74, 7.75 and 7.76, obtaining:

∇φ = ρ̂ ∂φ/∂ρ + θ̂ (1/ρ) ∂φ/∂θ + ẑ ∂φ/∂z

∇ · a = (1/ρ) ∂(ρa_ρ)/∂ρ + (1/ρ) ∂a_θ/∂θ + ∂a_z/∂z

∇ × a = (1/ρ) det [ ρ̂  ρθ̂  ẑ ; ∂/∂ρ  ∂/∂θ  ∂/∂z ; a_ρ  ρa_θ  a_z ]

∇²φ = (1/ρ) ∂/∂ρ (ρ ∂φ/∂ρ) + (1/ρ²) ∂²φ/∂θ² + ∂²φ/∂z²,

where ρ̂, θ̂ and ẑ are the unit vectors of the cylindrical coordinate system.
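As a spot check of the cylindrical Laplacian just obtained, the sketch below (ours, not from the text) compares the Cartesian Laplacian of φ = ρ² cos θ = x√(x² + y²), computed by finite differences, with the closed cylindrical result ∇²φ = 3 cos θ:

import numpy as np

# Cylindrical formula: (1/rho) d/drho(rho * 2*rho*cos(theta)) gives
# 4*cos(theta), and (1/rho**2) * (-rho**2 * cos(theta)) gives -cos(theta),
# so the Laplacian of phi = rho**2 * cos(theta) is 3*cos(theta) = 3*x/rho.
def phi(p):
    x, y, z = p
    return x * np.hypot(x, y)

p = np.array([1.2, -0.5, 0.3])
h = 1e-4
lap = sum((phi(p + h * e) - 2.0 * phi(p) + phi(p - h * e)) / h**2
          for e in np.eye(3))

rho = np.hypot(p[0], p[1])
print(lap, 3.0 * p[0] / rho)   # both approximately 2.7692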
7.4 Tensors

The considerations and formulae we have found in Section 7.3.1 might appear excessive in light of the simplicity of the problem (a rotation about one coordinate axis). They represent, however, the simplest possible introduction to the concept of tensors.

7.4.1 Basic definitions

We have seen that we can use Eq. 7.59 as the definition of vectors, namely vectors are mathematical entities whose components, under rotation of the coordinate axes, transform according to this equation. This guarantees that the properties of the vector (its magnitude and direction) do not vary if we change the reference system. We have seen, too, that we can express this transformation by means of the partial derivatives of the coordinates of the two reference systems. However, we have found an ambiguity between Eq. 7.65 and Eq. 7.68, namely the components of a transformed vector v′ can be expressed either as:

v′^i = Σ_{j=1}^{N} (∂x′^i/∂x^j) v^j,        (7.77)

or as:

v′_i = Σ_{j=1}^{N} (∂x^j/∂x′^i) v_j.        (7.78)

In Cartesian coordinates these two formulations are equivalent, but in general curvilinear coordinates (like the ones introduced in Sect. 7.3.2) some vectors transform according to Eq. 7.77 and some others according to Eq. 7.78. The former are called contravariant vectors, while the latter are named covariant vectors. Contravariant vectors are conventionally denoted with a superscript (as in Eq. 7.77) to distinguish them from the covariant ones. The prototype of a contravariant vector is the position vector (x, y, z) in a Cartesian coordinate system, which can thus be written as (x¹, x², x³). Of course, with this notation we must pay attention not to confuse the components x² and x³ with the square and the cube of the number x. The prototype of the covariant vector is instead the gradient. We have seen in fact that, in Cartesian coordinates, the gradient can be written as:

∇φ = i ∂φ/∂x¹ + j ∂φ/∂x² + k ∂φ/∂x³.

If we now rotate the coordinate axes, the components of the transformed gradient can be found by means of the chain rule of differentiation, namely:

∂φ′/∂x′^i = (∂φ/∂x¹)(∂x¹/∂x′^i) + (∂φ/∂x²)(∂x²/∂x′^i) + (∂φ/∂x³)(∂x³/∂x′^i) = Σ_{j=1}^{3} (∂φ/∂x^j)(∂x^j/∂x′^i),

which, since the ∂φ/∂x^j are the components of the original gradient, has the same form as Eq. 7.78.

We will define vectors as tensors of rank 1. In a 3-dimensional space, a tensor of rank n is a mathematical object that transforms in a definite way when the coordinate system changes.¹ The way it transforms guarantees that the properties of this mathematical object do not vary if we change the coordinate system. The simplest possible tensor is the tensor of rank 0, which has 3⁰ = 1 component and therefore identifies with a scalar. We have seen the definitions of contravariant and covariant tensors of rank 1 (Eqs. 7.77 and 7.78). We can now go on to define contravariant, mixed and covariant tensors of rank 2 by the following transformation equations for their components under coordinate transformations:

A′^{ij} = Σ_{k=1}^{3} Σ_{l=1}^{3} (∂x′^i/∂x^k)(∂x′^j/∂x^l) A^{kl}        (7.79)

B′^i_j = Σ_{k=1}^{3} Σ_{l=1}^{3} (∂x′^i/∂x^k)(∂x^l/∂x′^j) B^k_l        (7.80)

C′_{ij} = Σ_{k=1}^{3} Σ_{l=1}^{3} (∂x^k/∂x′^i)(∂x^l/∂x′^j) C_{kl}.        (7.81)

¹In an N-dimensional space a tensor of rank n has N^n components.

Clearly, the rank equals the number of partial-derivative factors appearing in the transformation law: zero for a scalar, one for a vector, two for a second-rank tensor and so on. Each index (subscript or superscript) ranges over the number of dimensions of the space. We see from the above formulae that A^{kl} is contravariant with respect to both indices, C_{kl} is covariant with respect to both indices, and B^k_l transforms contravariantly with respect to the first index k but covariantly with respect to the second index l. Once again, if we use the Cartesian coordinate system, all these definitions coincide. The second-rank tensor A^{kl} is often indicated with a boldface capital letter (A) and its components can be arranged in a 3 × 3 square array:

A = (A¹¹ A¹² A¹³ ; A²¹ A²² A²³ ; A³¹ A³² A³³),

but it shall not be confused with a matrix: a matrix is a second-rank tensor only if its components transform according to one of the Eqs. 7.79–7.81.

7.4.2 Einstein summation convention

Tensor analysis can be quite messy because of the large number of indices involved. In order to simplify the algebra and make the notation more compact, Einstein introduced the following convention: when an index variable appears twice in a single term, once in an upper position (superscript) and once in a lower position (subscript), a summation over all the possible values of this index variable is implied. With this convention, Eq. 7.1 can be rewritten as:

a = a^i e_i.

The only exception to this rule concerns Cartesian coordinates, for which repeated indices are summed even when both occurrences appear in the same (upper or lower) position. Therefore, Eqs. 7.79–7.81 can be rewritten as:

A′^{ij} = (∂x′^i/∂x^k)(∂x′^j/∂x^l) A^{kl}        (7.82)

B′^i_j = (∂x′^i/∂x^k)(∂x^l/∂x′^j) B^k_l        (7.83)

C′_{ij} = (∂x^k/∂x′^i)(∂x^l/∂x′^j) C_{kl},        (7.84)

in spite of the fact that the indices k and l sometimes appear twice as superscripts.
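The Einstein convention maps directly onto numpy.einsum, which makes these transformation laws concrete. A small sketch (ours, not from the text) applies Eq. 7.82 with A_ik = ∂x′^i/∂x^k taken from a rotation about the z-axis, as in Sect. 7.3.1:

import numpy as np

# Transform a rank-2 contravariant tensor according to Eq. 7.82.
theta = 0.4
A = np.array([[np.cos(theta), np.sin(theta), 0.0],
              [-np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])          # A[i, k] = dx'_i/dx_k

T = np.arange(9.0).reshape(3, 3)          # components T^{kl}
T_prime = np.einsum('ik,jl,kl->ij', A, A, T)

# In Cartesian coordinates indices may be raised and lowered freely, so
# the fully contracted scalar T^{kl} T_{kl} must be rotation invariant.
print(np.einsum('kl,kl->', T, T), np.einsum('ij,ij->', T_prime, T_prime))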
7.4.3 Direct product and contraction

Given a covariant vector a_i and a contravariant vector b^j in an N-dimensional space, the most general product between them is a mathematical entity c_i^j containing all the possible products between the N components of a_i and the N components of b^j, namely an N × N array whose elements are the products a_i b^j. Is it a second-rank tensor? We can apply Eq. 7.78 and Eq. 7.77 to transform a_i and b^j, respectively, under rotation of the coordinate axes, obtaining:

a′_i b′^j = (∂x^k/∂x′^i) a_k (∂x′^j/∂x^l) b^l = (∂x^k/∂x′^i)(∂x′^j/∂x^l) a_k b^l,

where we have made use of the Einstein summation convention. The tensor c_i^j = a_i b^j obeys Eq. 7.80 when the coordinate axes rotate, and it is thus a mixed second-rank tensor. This kind of product between tensors is called the direct product, and it always produces another tensor with rank equal to the sum of the ranks of the tensors we have multiplied (respecting also the number of covariant and contravariant indices of the two original tensors). For instance, we can multiply the second-rank mixed tensor A_i^j by the mixed tensor B_k^l, obtaining the fourth-rank tensor:

C_{ik}^{jl} = A_i^j B_k^l.        (7.85)

The direct product is thus a way to obtain tensors of progressively larger rank. Is it possible to reduce the rank of a tensor? This is possible by means of the operation of contraction, in which we set a covariant index equal to a contravariant index. For instance, in the fourth-rank tensor C_{ik}^{jl} defined by Eq. 7.85 we could set k = j and obtain (always using the Einstein convention):

C_{ij}^{jl} = A_i^j B_j^l.

By comparing this expression with Eq. 5.1 of Sect. 5.1.1 we can recognize this operation as the usual multiplication of two matrices, which produces (as we know) another matrix. The fourth-rank tensor C_{ik}^{jl} has therefore been contracted to the second-rank tensor C_i^l.

What happens if we contract C_i^l further? This is done by setting l equal to i, namely by considering the quantity C_i^i. Because of the Einstein convention, this corresponds to summing up all the elements of C_i^l in which the column index is equal to the row index, namely to summing the diagonal terms. This sum is nothing other than the trace, namely a scalar (a tensor of rank zero). It is easy to see that the trace does not change if we change the reference system. It is enough to apply Eq. 7.83 to C′_i^i, obtaining:

C′_i^i = (∂x′^i/∂x^k)(∂x^l/∂x′^i) C_k^l = (∂x^l/∂x^k) C_k^l.

But ∂x^l/∂x^k is always zero for l ≠ k (we always assume that the coordinates are independent of each other) and of course ∂x^k/∂x^k = 1 (no summation implied here), therefore:

C′_i^i = C_k^k.

In spite of the fact that we use two different dummy indices (i and k), these two expressions are both telling us that the sum of the diagonal terms of C remains the same even if we change the coordinate system. This is the strength of tensors: they allow us to describe a physical system independently of the coordinate system, and they guarantee that the physical quantities involved do not depend on the reference frame.

We have seen, therefore, that the operation of contraction always reduces the rank of a tensor by 2. Remember that contraction is possible only between a covariant and a contravariant index. The analogy with vector multiplication helps us to understand why: if we multiply the transpose of a vector by another vector (or by the complex conjugate of another vector, if we deal with complex vectors), we obtain their scalar product, namely a scalar (see Sect. 5.1.1). If we instead perform the opposite operation (multiplying a vector by the transpose of another vector) we obtain a matrix, namely a tensor of rank two.
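numpy.einsum also expresses the direct product and the contraction directly. The sketch below (ours, not from the text; the component values are arbitrary) reproduces the chain described above: the direct product of Eq. 7.85, one contraction giving the matrix product, and a second contraction giving the trace:

import numpy as np

A = np.arange(9.0).reshape(3, 3)
B = np.arange(9.0, 18.0).reshape(3, 3)

C4 = np.einsum('ij,kl->ikjl', A, B)    # fourth-rank direct product C_ik^jl
C2 = np.einsum('ijjl->il', C4)         # contraction k = j
print(np.allclose(C2, A @ B))          # True: it is the matrix product
print(np.einsum('ii->', C2), np.trace(A @ B))   # second contraction: trace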
7.4.4 Kronecker delta and Levi-Civita symbol

We will analyze in this section two famous tensors, useful in many branches of physics: the Kronecker delta and the Levi-Civita symbol. The Kronecker delta is a function of two integer variables i and j; it is one if the two variables are equal and zero otherwise. It is indicated with δ and, from its definition, we have:

δ_ij = 1 if i = j;  0 if i ≠ j.        (7.86)

Given a sum Σ_{i=−∞}^{∞} a_i, we have the property:

Σ_{i=−∞}^{∞} a_i δ_ij = a_j,

which is analogous to the property ∫_{−∞}^{∞} f(x) δ(x − c) dx = f(c) we have seen for the Dirac delta function (Sect. 4.1.2, Eq. 4.15).

If we consider the Kronecker delta as a tensor, it turns out to be a mixed tensor. To demonstrate that, we recall that in a Cartesian coordinate system the various coordinates are independent of each other. This means that the derivative of the coordinate x^i with respect to the coordinate x^j is always zero except in the case i = j, in which of course ∂x^i/∂x^i = 1. This is exactly what the Kronecker delta expresses, namely we have:

∂x^i/∂x^j = δ_j^i.        (7.87)

This condition holds also if we rotate the coordinate axes, namely the transformed Kronecker delta obeys the relation:

∂x′^i/∂x′^j = δ′_j^i.

By means of the chain rule of partial differentiation we have:

δ′_j^i = ∂x′^i/∂x′^j = (∂x′^i/∂x^k)(∂x^k/∂x^l)(∂x^l/∂x′^j).

However, because of Eq. 7.87 the middle factor of this product is δ_l^k, therefore we have:

δ′_j^i = (∂x′^i/∂x^k)(∂x^l/∂x′^j) δ_l^k,

which is analogous to Eq. 7.80, telling us therefore that the Kronecker delta is a mixed tensor.

The Levi-Civita symbol is indicated with ε_ijk. It acts on three integer variables and produces 0 if one (or more) of the indices is repeated, +1 if (i, j, k) is an even permutation of (1, 2, 3), and −1 if (i, j, k) is an odd permutation of (1, 2, 3), namely we have:

ε_ijk = +1 if (i, j, k) = (1, 2, 3), (2, 3, 1) or (3, 1, 2);
        −1 if (i, j, k) = (1, 3, 2), (3, 2, 1) or (2, 1, 3);
        0 otherwise (i = j or i = k or j = k).        (7.88)

This can also be expressed by means of the formula:

ε_ijk = (j − i)(k − i)(k − j)/2.

The Levi-Civita symbol is therefore a tensor of rank 3. It is useful for simplifying considerably some formulae. For instance, given a vector c = a × b, its i-th component is given by:

c_i = ε_ijk a_j b_k

(compare with the formula of Eq. 7.8).
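A compact way to test both the closed formula for ε_ijk and the relation c_i = ε_ijk a_j b_k is to build the symbol explicitly and contract it with two vectors. A sketch of ours (numpy assumed; the vectors are arbitrary test values):

import numpy as np

# Build eps_ijk from eps_ijk = (j - i)*(k - i)*(k - j)/2 with indices 1..3,
# then recover the cross product via c_i = eps_ijk a_j b_k.
eps = np.zeros((3, 3, 3))
for i in range(1, 4):
    for j in range(1, 4):
        for k in range(1, 4):
            eps[i - 1, j - 1, k - 1] = (j - i) * (k - i) * (k - j) / 2

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])
c = np.einsum('ijk,j,k->i', eps, a, b)
print(c, np.cross(a, b))   # identical: (6.5, -7.0, 2.5)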