Contents

1 Complex numbers
  1.1 Introduction
    1.1.1 Imaginary numbers
    1.1.2 Complex numbers
  1.2 Manipulation of complex numbers
    1.2.1 Basic operations
    1.2.2 Square root of a complex number
  1.3 Representation of a complex number
    1.3.1 Algebraic representation
    1.3.2 Trigonometric representation
    1.3.3 de Moivre's theorem
    1.3.4 Complex logarithms
    1.3.5 Trigonometric and hyperbolic functions

2 Ordinary differential equations
  2.1 Introduction
    2.1.1 A simple example
    2.1.2 The direction field
    2.1.3 Basic definitions
  2.2 First order, first degree differential equations
    2.2.1 Separable equations
    2.2.2 Linear equations
    2.2.3 Exact equations
    2.2.4 Integrating factors
  2.3 Higher degree first order differential equations
    2.3.1 Equations solvable for p
    2.3.2 Equations solvable for y
    2.3.3 Equations solvable for x
    2.3.4 Special differential equations
    2.3.5 Singular solutions and envelopes
  2.4 Second order differential equations
    2.4.1 Second order homogeneous ODEs with constant coefficients
    2.4.2 The Wronskian determinant
    2.4.3 Fundamental set of solutions of homogeneous ODEs with constant coefficients
    2.4.4 Second order nonhomogeneous ODEs with constant coefficients
  2.5 Higher order linear differential equations
    2.5.1 Homogeneous n-th order ODEs
    2.5.2 Nonhomogeneous n-th order ODEs
    2.5.3 The D-operator
    2.5.4 The Euler linear equations
    2.5.5 Series solutions of linear equations

3 Complex analysis
  3.1 Complex functions
    3.1.1 Differentiable functions
    3.1.2 The Cauchy-Riemann conditions
  3.2 Complex integration
    3.2.1 Line integrals in the complex plane
    3.2.2 Cauchy's integral theorem
    3.2.3 Cauchy's integral formula
    3.2.4 Cauchy's integral formula for higher derivatives
    3.2.5 Taylor and Laurent series
    3.2.6 Residue theorem
    3.2.7 Real integrals using contour integration

4 Integral transforms
  4.1 Laplace transform
    4.1.1 Basic definition and properties
    4.1.2 Solution of initial value problems by means of Laplace transforms
    4.1.3 The Bromwich integral
  4.2 Fourier transforms
    4.2.1 Fourier series
    4.2.2 From Fourier series to Fourier transform

5 Systems of differential equations
  5.1 Review of matrices and systems of algebraic equations
    5.1.1 Matrices
    5.1.2 Systems of linear algebraic equations
  5.2 Systems of first order linear ODEs
    5.2.1 General properties
    5.2.2 Homogeneous linear systems with constant coefficients
    5.2.3 Nonhomogeneous linear systems with constant coefficients
  5.3 Systems of second order linear ODEs

6 Modeling physical systems with ODEs
  6.1 Constructing mathematical models
  6.2 Mechanical and electrical vibrations
    6.2.1 The spring-mass system
    6.2.2 Electric circuits
  6.3 Other physical processes
    6.3.1 Wave propagation
    6.3.2 Heat flow

7 Vector and tensor analysis
  7.1 Review of vector algebra and vector spaces
    7.1.1 Vector algebra
    7.1.2 Vector spaces
    7.1.3 Linear operators
  7.2 Vector calculus
    7.2.1 Differentiation of vectors
    7.2.2 Scalar and vector fields
    7.2.3 Vector operators
  7.3 Transformation of coordinates
    7.3.1 Rotation of the coordinate axes
    7.3.2 General curvilinear coordinates
  7.4 Tensors
    7.4.1 Basic definitions
    7.4.2 Einstein summation convention
    7.4.3 Direct product and contraction
    7.4.4 Kronecker delta and Levi-Civita symbol
Chapter 1
Complex numbers
The imaginary number is a fine and wonderful resource of
the human spirit, almost an amphibian between being and not being
– Gottfried Wilhelm Leibniz –
1.1 Introduction

1.1.1 Imaginary numbers

As is well known, whether you square a positive or a negative real number, the result is a positive real number. That means that, in the real domain, the square root of a negative number is not defined. It also means that the equation $x^2 + 1 = 0$ has no real solution. However, for a wide class of problems it is useful to define a new kind of number whose square is a negative real number. These numbers are called imaginary numbers.
We therefore define the basic imaginary number (the imaginary unit) $i$ as the square root of $-1$, namely the number such that
\[
i^2 = -1. \tag{1.1}
\]
Consequently, given a real positive number $\lambda$ one has $\sqrt{-\lambda} = i\sqrt{\lambda}$. It is also worth stressing that, according to the fundamental theorem of algebra, the equation $x^2 + 1 = 0$, being a second degree equation, has two roots, the second being $-i$. Although $-i$ is distinct from $i$, it shares the property of having $-1$ as its square.
1.1.2 Complex numbers

A complex number comprises a real and an imaginary part. It is conventionally written as $z$ and is the sum of a real part $x$ (often indicated as $\mathrm{Re}(z)$) and $i$ times an imaginary part $y$ (or $\mathrm{Im}(z)$), namely
\[
z = x + iy. \tag{1.2}
\]
A complex number is therefore equivalent to an ordered pair of real numbers and is sometimes indicated with the compact notation $z = (x, y)$ (with this notation $i = (0, 1)$). It is also important to define the complex conjugate of a complex number, which is indicated by $z^*$ (sometimes also by $\bar{z}$) and is simply obtained by changing the sign of the imaginary part of $z$. Therefore, if $z = x + iy$, then $z^* = x - iy$.
1.2 Manipulation of complex numbers

1.2.1 Basic operations

The addition of two complex numbers $z_1$ and $z_2$ gives as a result another complex number, whose real and imaginary parts are obtained by adding the real and imaginary parts separately:
\[
z_1 + z_2 = (x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2). \tag{1.3}
\]
Of course, since addition of real numbers is commutative and associative, complex addition is commutative and associative as well.

The product of two complex numbers can simply be found by multiplying them out in full (in the same manner as polynomials) and remembering that $i^2 = -1$, namely:
\[
z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1 x_2 + i(x_1 y_2 + y_1 x_2) + i^2 y_1 y_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2). \tag{1.4}
\]
It is easy to verify that $z_1 z_2 = z_2 z_1$ (multiplication is commutative), $z_1(z_2 z_3) = (z_1 z_2) z_3$ (multiplication is associative), and $z_1(z_2 + z_3) = z_1 z_2 + z_1 z_3$ (multiplication is distributive over addition).
Example 1.2.1 Given two generic complex numbers $z_1$ and $z_2$, show that $z_1 z_2^* + z_1^* z_2$ is a real number given by $2\,\mathrm{Re}(z_1 z_2^*)$.

Assuming that $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ we can use Eq. 1.4 to obtain:
\[
z_1 z_2^* = (x_1 + iy_1)(x_2 - iy_2) = (x_1 x_2 + y_1 y_2) + i(y_1 x_2 - x_1 y_2),
\]
\[
z_1^* z_2 = (x_1 - iy_1)(x_2 + iy_2) = (x_1 x_2 + y_1 y_2) - i(y_1 x_2 - x_1 y_2).
\]
Therefore, $z_1 z_2^*$ and $z_1^* z_2$ are two complex numbers having the same real part but opposite imaginary parts (as expected, the second is the complex conjugate of the first), therefore their sum is a purely real number given by $z_1 z_2^* + z_1^* z_2 = 2(x_1 x_2 + y_1 y_2)$, which is twice the real part of $z_1 z_2^*$ (and also of $z_1^* z_2$, of course). We have thus demonstrated the equality $z_1 z_2^* + z_1^* z_2 = 2\,\mathrm{Re}(z_1 z_2^*)$.
The division of two complex numbers can also be obtained in a straightforward manner. Given $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ (with $z_2 \neq 0$), $\frac{z_1}{z_2}$ can simply be obtained by multiplying both numerator and denominator by the complex conjugate of $z_2$, namely:
\[
\frac{z_1}{z_2} = \frac{(x_1 + iy_1)(x_2 - iy_2)}{(x_2 + iy_2)(x_2 - iy_2)} = \frac{x_1 x_2 + y_1 y_2 + i(y_1 x_2 - x_1 y_2)}{x_2^2 + y_2^2} = \frac{x_1 x_2 + y_1 y_2}{x_2^2 + y_2^2} + i\,\frac{y_1 x_2 - x_1 y_2}{x_2^2 + y_2^2}. \tag{1.5}
\]
Example 1.2.2 Calculate the quotient of $z_1 = 9 + 2i$ and $z_2 = 1 + 4i$.

We have $x_1 = 9$, $y_1 = 2$, $x_2 = 1$, $y_2 = 4$. Using Eq. 1.5 we obtain:
\[
\frac{z_1}{z_2} = \frac{9 + 8}{1 + 16} + i\,\frac{2 - 36}{1 + 16} = 1 - 2i.
\]
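As a quick numerical cross-check of Eqs. 1.3–1.5 and of Examples 1.2.1 and 1.2.2, the following minimal Python sketch uses only the built-in complex type (the variable names are arbitrary):

```python
# Minimal numerical check of Eqs. 1.3-1.5 with Python's built-in complex type.
z1 = 9 + 2j
z2 = 1 + 4j

# Product computed componentwise as in Eq. 1.4 vs. the built-in multiplication.
x1, y1, x2, y2 = z1.real, z1.imag, z2.real, z2.imag
prod_eq_1_4 = complex(x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)
assert abs(prod_eq_1_4 - z1 * z2) < 1e-12

# Quotient of Example 1.2.2: (9 + 2i)/(1 + 4i) = 1 - 2i.
assert abs(z1 / z2 - (1 - 2j)) < 1e-12

# Example 1.2.1: z1*conj(z2) + conj(z1)*z2 is real and equals 2*Re(z1*conj(z2)).
s = z1 * z2.conjugate() + z1.conjugate() * z2
assert abs(s.imag) < 1e-12
assert abs(s.real - 2 * (z1 * z2.conjugate()).real) < 1e-12

print(prod_eq_1_4, z1 / z2, s)
```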
As a simple application of Eq. 1.5 we can find the inverse $1/z$ of a complex number $z \neq 0$, which turns out to be:
\[
\frac{1}{z} = \frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}. \tag{1.6}
\]
The numerator of the right hand side of Eq. 1.6 is $z^*$, therefore we can notice that $zz^* = x^2 + y^2$, an expression which could also be obtained by direct multiplication of $z$ and $z^*$. The quantity $\sqrt{zz^*} = \sqrt{x^2 + y^2}$ is also called the modulus of the complex number $z$ and is indicated as $r$ or $|z|$. Since $x$ and $y$ are real numbers, $r \geq 0$. From the equation $zz^* = x^2 + y^2 = r^2$ it can easily be derived that $\frac{1}{z} = \frac{z^*}{r^2}$, which is equivalent to Eq. 1.6.
We can show that the triangle inequality $|z + w| \leq |z| + |w|$, which holds for ordinary vectors, also holds if $z$ and $w$ are generic complex numbers. In order to demonstrate that, we have to recall the equality $zw^* + z^* w = 2\,\mathrm{Re}(zw^*)$ we have seen in Example 1.2.1 and we have to introduce the (quite obvious to demonstrate) relations $\mathrm{Re}\,z \leq |z|$, $(z + w)^* = z^* + w^*$, $|zw| = |z||w|$ and $|z^*| = |z|$. We have therefore:
\[
\begin{aligned}
|z + w|^2 &= (z + w)(z + w)^* \\
&= (z + w)(z^* + w^*) \\
&= zz^* + (zw^* + z^* w) + ww^* \\
&= |z|^2 + 2\,\mathrm{Re}(zw^*) + |w|^2 \\
&\leq |z|^2 + 2|zw^*| + |w|^2 \\
&= |z|^2 + 2|z||w| + |w|^2 \\
&= (|z| + |w|)^2. \tag{1.7}
\end{aligned}
\]
Since both $|z + w|$ and $|z| + |w|$ are non-negative, taking the square root of Eq. 1.7 we obtain:
\[
|z + w| \leq |z| + |w|. \tag{1.8}
\]

1.2.2 Square root of a complex number
We have seen in Sect. 1.1.1 that the square root of a negative number $-\lambda$ is given by the two complex conjugate numbers $i\sqrt{\lambda}$ and $-i\sqrt{\lambda}$. What about the square root of a generic complex number $w = a + ib$? We have to find the complex number $z = x + iy$ such that $z^2 = w$, namely:
\[
(x + iy)(x + iy) = x^2 - y^2 + 2ixy = a + ib.
\]
By equating the real and the imaginary parts we obtain:
\[
x^2 - y^2 = a \ , \quad 2xy = b \;\Rightarrow\; y = \frac{b}{2x} \ , \quad x^2 - \frac{b^2}{4x^2} = a.
\]
The last equation is a quadratic equation in $x^2$ which has roots $\frac{a \pm r}{2}$, where $r = |w| = \sqrt{a^2 + b^2}$. Since $x$ must be a real number, $x^2$ cannot be negative, therefore the only acceptable solution is $x^2 = \frac{r + a}{2}$, which has roots $x = \pm\sqrt{\frac{r+a}{2}}$. By direct substitution (and recalling that $b^2 = r^2 - a^2$), we obtain:
\[
y = \frac{b}{2x} = \frac{b}{\pm 2\sqrt{\frac{r+a}{2}}} = \pm\sqrt{\frac{r - a}{2}},
\]
where the relative sign of $x$ and $y$ is fixed by the sign of $b$. Therefore, the square root of the complex number $w = a + ib$ (written here for $b \geq 0$; for $b < 0$ the sign of the imaginary part is reversed) is:
\[
z_1 = \sqrt{\frac{r+a}{2}} + i\sqrt{\frac{r-a}{2}} \ , \quad z_2 = -\sqrt{\frac{r+a}{2}} - i\sqrt{\frac{r-a}{2}}. \tag{1.9}
\]
Namely, the two solutions of the equation $z^2 = w$ are two opposite complex numbers $z_1$ and $z_2$. As we shall see, there is a much simpler way to calculate roots of complex numbers.
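Eq. 1.9 can be checked numerically. The following is a minimal sketch, assuming (as in the derivation above) $b \geq 0$, which compares the formula with the square root returned by Python's cmath module:

```python
import cmath
import math

def sqrt_from_eq_1_9(w):
    """Square root of w = a + ib via Eq. 1.9 (written for b >= 0)."""
    a, b = w.real, w.imag
    r = abs(w)                      # r = |w| = sqrt(a^2 + b^2)
    x = math.sqrt((r + a) / 2.0)
    y = math.sqrt((r - a) / 2.0)
    return complex(x, y)

w = 3 + 4j
z1 = sqrt_from_eq_1_9(w)
assert abs(z1 * z1 - w) < 1e-12          # z1 squared gives back w
assert abs(z1 - cmath.sqrt(w)) < 1e-12   # agrees with the library square root
print(z1, -z1)                           # the two opposite roots z1 and z2
```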
Figure 1.1: The Argand diagram.
1.3 Representation of a complex number

1.3.1 Algebraic representation

The representation of complex numbers we have encountered so far ($z = x + iy$) is also called the algebraic or cartesian representation. In fact, since a complex number can be expressed by a pair of real numbers, it is natural to place them in a plane. Therefore, we can think of the real part of a complex number $z$ as the $x$-coordinate (abscissa) of a cartesian coordinate system and of its imaginary part as the $y$-coordinate (ordinate). Such a visualization of $z$ is called the Argand diagram or Gaussian plane and is shown in Fig. 1.1.
1.3.2 Trigonometric representation

To introduce the trigonometric representation of a complex number, it is convenient to start with a complex number of modulus $r$. As we know from trigonometry (and as we can see in Fig. 1.2), the real part of $z$ coincides with $r$ times the cosine of $\theta$ (the angle that $z$ forms with the $x$-axis, also called the argument of the complex number $z$) and the imaginary part with $r$ times the sine of $\theta$, therefore we have
\[
z = r(\cos\theta + i\sin\theta). \tag{1.10}
\]
Since we have used trigonometric functions to express $z$, this representation of complex numbers is called the trigonometric representation or polar form. Indeed we can think of the equality $x + iy = r(\cos\theta + i\sin\theta)$ as analogous to the transformation from cartesian coordinates $(x, y)$ to polar ones $(r, \theta)$. The following relations hold:
\[
x = r\cos\theta \ ; \quad y = r\sin\theta, \tag{1.11}
\]
\[
r = \sqrt{x^2 + y^2} \ ; \quad \theta = \arctan\frac{y}{x}. \tag{1.12}
\]
It is worth stressing that the function $\tan x$ has period $\pi$; therefore, if $\tan\theta = y/x$, then also $\tan(\theta + \pi) = y/x$, and Eq. 1.12 alone does not determine the argument uniquely. For this reason, it is always important to check the quadrant in which the complex number lies. For instance, the argument of the complex number $-i - 1$ is $5\pi/4$ and not $\pi/4$, because if it were $\pi/4$ then $z$ would lie in the first quadrant and would have positive real and imaginary parts.
Recalling the Taylor expansions of $\sin\theta$ and $\cos\theta$ we obtain:
\[
\begin{aligned}
z &= r(\cos\theta + i\sin\theta) \\
&= r\left[\left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \dots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \dots\right)\right] \\
&= r\left[1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} - \dots\right] \\
&= r\left[1 + (i\theta) + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \dots\right].
\end{aligned}
\]
The expression inside the square brackets is analogous to the Taylor expansion of an exponential, therefore we can define:
\[
e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} = \cos\theta + i\sin\theta. \tag{1.13}
\]
The equivalence $e^{i\theta} = \cos\theta + i\sin\theta$ is called Euler's formula. Analogously to Eq. 1.13 we can define, for each complex number $z$, its exponential as:
\[
e^{z} = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \tag{1.14}
\]
We can easily notice that the quantity $r(\cos\theta + i\sin\theta)$, with $r \geq 0$ and $0 \leq \theta < 2\pi$, covers the whole set of complex numbers, therefore we can represent every complex number $z = x + iy$ with the expression:
\[
z = re^{i\theta}, \tag{1.15}
\]
where $r$ is the modulus of $z$ and $\theta$ is its argument. Eq. 1.15 is the exponential representation of a complex number.
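As an illustrative sketch of Eqs. 1.11, 1.12 and 1.15, Python's cmath module can convert between the algebraic and the exponential form; note that cmath.phase returns the argument in $(-\pi, \pi]$, so negative values are shifted by $2\pi$ below to match the convention $\theta \in [0, 2\pi)$ used here:

```python
import cmath
import math

z = -1 - 1j                       # the number -i - 1 discussed above

r, theta = cmath.polar(z)         # r = |z|, theta in (-pi, pi]
if theta < 0:
    theta += 2 * math.pi          # move the argument into [0, 2*pi) as in the text

print(r, theta)                   # r = sqrt(2), theta = 5*pi/4 (and not pi/4)
assert abs(theta - 5 * math.pi / 4) < 1e-12

# Reconstruct z = r*e^{i*theta} (Eq. 1.15) and compare with the original number.
assert abs(r * cmath.exp(1j * theta) - z) < 1e-12
```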
Figure 1.2: Representation in the Argand diagram of a complex number with modulus r.
Multiplication and division of complex numbers in polar form are particularly simple. In fact, given two complex numbers $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$ we have:
\[
z_1 z_2 = r_1 e^{i\theta_1}\, r_2 e^{i\theta_2} = r_1 r_2\, e^{i(\theta_1 + \theta_2)}, \tag{1.16}
\]
\[
\frac{z_1}{z_2} = \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2}\, e^{i(\theta_1 - \theta_2)}, \tag{1.17}
\]
namely, the product of two complex numbers is a complex number having as modulus the product of the moduli and as argument the sum of the arguments; the quotient of two complex numbers is a complex number having as modulus the quotient of the moduli and as argument the difference of the arguments.
Although the product and quotient of two complex numbers are straightforward in polar form, the sum and difference are not, and it is convenient to convert the numbers we want to add into the algebraic representation. It is also worth noticing that the product of a complex number $z$ by a number $w = e^{i\alpha}$ with unit modulus corresponds to a rotation of $z$ in the Argand diagram by an angle $\alpha$. In fact, assuming that $z = re^{i\theta}$, then $zw = re^{i(\theta + \alpha)}$.

Concerning the trigonometric representation of a complex number, it is important to notice that the quantity $\cos\theta + i\sin\theta$ has a periodicity of $2\pi$, namely $re^{i\theta} \equiv re^{i(\theta + 2n\pi)}$ for every integer $n$. Therefore, in order not to have a multiple-valued definition of the argument, it is conventionally assumed that $\theta$ lies in the interval $[0, 2\pi)$.
1.3.3 de Moivre's theorem

From the equality $(e^{i\theta})^n = e^{in\theta}$ we derive the very important relation:
\[
(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \tag{1.18}
\]
In fact, $(e^{i\theta})^n = (\cos\theta + i\sin\theta)^n$ follows directly from the trigonometric representation of complex numbers, whereas the identity $e^{in\theta} = \cos n\theta + i\sin n\theta$ follows from the series expansion of $e^{in\theta}$ (see Sect. 1.3.2). Another way of demonstrating de Moivre's theorem is through the properties of the product in the polar representation. It is in fact easy to see that $z^n = r^n e^{in\theta}$, therefore:
\[
[r(\cos\theta + i\sin\theta)]^n = z^n = r^n e^{in\theta} = r^n(\cos n\theta + i\sin n\theta),
\]
from which de Moivre's theorem is easily deduced. It is worth stressing that $n$ can be an integer, a real, or even a complex number (for non-integer $n$, however, both sides of Eq. 1.18 become multiple-valued).

De Moivre's theorem helps in finding the roots of a generic complex number, in solving polynomial equations, and in recovering trigonometric identities, as the next four examples show.
Example 1.3.1 Use de Moivre's theorem to find the square roots of a generic complex number $z$.

As we have seen, we can express $z$ as $r(\cos\theta + i\sin\theta)$. From de Moivre's theorem we have:
\[
\sqrt{z} = z^{1/2} = \sqrt{r}\left(\cos\frac{\theta}{2} + i\sin\frac{\theta}{2}\right) = w_1.
\]
However, we must not forget that $\cos\theta + i\sin\theta$ has a periodicity of $2\pi$, therefore also
\[
w_2 = \sqrt{r}\left[\cos\left(\frac{\theta}{2} + \pi\right) + i\sin\left(\frac{\theta}{2} + \pi\right)\right]
\]
is a solution of the equation $w^2 = z$. From elementary properties of the trigonometric functions, we can see that $w_1$ and $w_2$ are opposite complex numbers (as we have seen in Sect. 1.2.2). It is also easy to show that the values of $w_1$ and $w_2$ found here coincide with the values found in Eq. 1.9.
Example 1.3.2 Find the solutions of the equation $z^7 = 1$.

We can express $1$ as $e^{i(0 + 2n\pi)} = e^{2ni\pi}$. Therefore, we have:
\[
z^7 = e^{2ni\pi} \;\Rightarrow\; z = e^{i\frac{2n\pi}{7}}.
\]

Figure 1.3: The solutions of the equation $z^7 = 1$.

It is clear that $n$ can assume only the values $0 \dots 6$. In fact, for $n = 7$ we obtain the solution $e^{2i\pi}$, which coincides with the solution obtained with $n = 0$. Therefore, we have 7 distinct solutions, namely $1$, $e^{\frac{2}{7}i\pi}$, $e^{\frac{4}{7}i\pi}$, $e^{\frac{6}{7}i\pi}$, $e^{\frac{8}{7}i\pi}$, $e^{\frac{10}{7}i\pi}$, $e^{\frac{12}{7}i\pi}$. The solutions have been drawn in Fig. 1.3. We can notice that all the solutions (as expected) lie on a circle of radius 1 and that they divide the circle into 7 circular sectors with the same angle.

It is simple to generalize the procedure shown in Example 1.3.2 to find the $n$-th roots $z_1 \dots z_n$ of a generic complex number $w$. From $z^n = w = re^{i\theta}$ we obtain:
\[
z = \sqrt[n]{r}\, e^{i\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)} \qquad k = 0 \dots n - 1. \tag{1.19}
\]
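A minimal Python sketch of Eq. 1.19 (the function name is arbitrary): it returns the $n$ distinct $n$-th roots of a complex number and checks them for the case of Example 1.3.2:

```python
import cmath

def nth_roots(w, n):
    """All n distinct n-th roots of w, following Eq. 1.19."""
    r, theta = cmath.polar(w)
    return [r ** (1.0 / n) * cmath.exp(1j * (theta / n + 2 * cmath.pi * k / n))
            for k in range(n)]

roots = nth_roots(1, 7)                     # the seven solutions of z^7 = 1
for z in roots:
    assert abs(z ** 7 - 1) < 1e-12          # each root indeed satisfies z^7 = 1
    assert abs(abs(z) - 1) < 1e-12          # and lies on the unit circle
print(roots)
```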
Example 1.3.3 Find the solutions of the equation $z^5 + z^3 - 3z^2 = 3$.

Given the obvious symmetry of this equation, we can factorize it into $(z^3 - 3)(z^2 + 1) = 0$. The second factor has the solutions $\pm i$; for the first factor we can proceed as in Example 1.3.2, obtaining the 3 solutions $\sqrt[3]{3}$, $\sqrt[3]{3}\,e^{2i\pi/3}$, $\sqrt[3]{3}\,e^{4i\pi/3}$.

We can notice from Examples 1.3.2 and 1.3.3 (and from Fig. 1.3) that all the non-real solutions occur in conjugate pairs. Of course, $1$ (Example 1.3.2) and $\sqrt[3]{3}$ (Example 1.3.3) are exceptions because they are real numbers. This is a general result for the roots of a polynomial with real coefficients. In fact, let us suppose that $z$ is a root of a polynomial of degree $n$, namely $z$ is a complex number such that:
\[
\sum_{j=0}^{n} a_j z^j = 0.
\]
Taking the complex conjugate of this equation we obtain:
\[
\sum_{j=0}^{n} a_j^* (z^*)^j = 0.
\]
But the $a_j$ are real numbers, therefore $a_j = a_j^*$, and we obtain:
\[
\sum_{j=0}^{n} a_j (z^*)^j = 0,
\]
namely, $z^*$ is also a root of the polynomial.
Example 1.3.4 Recover the double- and triple-angle formulae of trigonometry with the aid of de Moivre's theorem.

From de Moivre's theorem we have:
\[
\cos(2\theta) + i\sin(2\theta) = (\cos\theta + i\sin\theta)^2 = \cos^2\theta - \sin^2\theta + 2i\sin\theta\cos\theta.
\]
By equating the real and imaginary parts separately, we obtain:
\[
\cos(2\theta) = \cos^2\theta - \sin^2\theta \ ; \quad \sin(2\theta) = 2\sin\theta\cos\theta.
\]
These are the well-known double-angle formulae of trigonometry. Analogously we can proceed for the triple-angle formulae:
\[
\begin{aligned}
\cos(3\theta) + i\sin(3\theta) &= (\cos\theta + i\sin\theta)^3 \\
&= \cos^3\theta + 3i\cos^2\theta\sin\theta - 3\cos\theta\sin^2\theta - i\sin^3\theta \\
\Rightarrow\ \cos(3\theta) &= \cos^3\theta - 3\cos\theta\sin^2\theta = 4\cos^3\theta - 3\cos\theta \\
\sin(3\theta) &= 3\sin\theta\cos^2\theta - \sin^3\theta = 3\sin\theta - 4\sin^3\theta.
\end{aligned}
\]
1.3.4 Complex logarithms

To define the logarithm of a complex number, we can proceed as for real numbers, namely we define $\mathrm{Ln}(z)$ as that complex number $w$ such that $e^w = z$. If we use the exponential representation of $z$, this number can easily be found. In fact:
\[
e^w = z = re^{i\theta} = e^{i\theta + \ln r} \;\Rightarrow\; w = i\theta + \ln r. \tag{1.20}
\]
However, we must not forget that $e^{i\theta} = e^{i(\theta + 2n\pi)}$ (they represent the same point in the Argand diagram). Therefore, we obtain:
\[
\mathrm{Ln}(z) = i(\theta + 2n\pi) + \ln r, \tag{1.21}
\]
namely, the logarithm of a complex number is multiple-valued (the numbers $i(\theta + 2n\pi) + \ln r$ represent different points in the Argand diagram). The value of $\mathrm{Ln}(z)$ obtained by restricting the argument of $z$ to lie in the interval $0 \leq \theta < 2\pi$ is called the principal value of $\mathrm{Ln}(z)$ and is usually indicated with $\ln z$.
Example 1.3.5 Express in polar form the number $z = (-i)^{-\frac{i}{3}}$.

The logarithm of $z$ is given by $\mathrm{Ln}(z) = -\frac{i}{3}\,\mathrm{Ln}(-i)$. To calculate the logarithm of $-i$ we have to express it in exponential form, namely:
\[
\mathrm{Ln}(-i) = \mathrm{Ln}\left(e^{i\left(\frac{3}{2}\pi + 2n\pi\right)}\right) = i\left(\frac{3}{2}\pi + 2n\pi\right).
\]
We obtain therefore:
\[
z = e^{\mathrm{Ln}(z)} = e^{-\frac{i}{3}\, i\left(\frac{3}{2}\pi + 2n\pi\right)} = e^{\frac{\pi}{2} + \frac{2}{3}n\pi},
\]
which is a real quantity and not a complex one.
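The result of Example 1.3.5 can be cross-checked numerically. The sketch below uses Python's cmath conventions, in which the argument of $-i$ is $-\pi/2$ rather than $3\pi/2$, so the principal value computed by the library corresponds to $n = -1$ in the formula above:

```python
import math

# Principal value of z = (-i)^(-i/3) as computed by the library.
z = complex(0, -1) ** (-1j / 3)
print(z)                                   # a real, positive number

# Multi-valued result of Example 1.3.5: e^(pi/2 + 2*n*pi/3).
# The library branch Ln(-i) = -i*pi/2 corresponds to n = -1 in the text.
assert abs(z - math.exp(math.pi / 2 - 2 * math.pi / 3)) < 1e-12
assert abs(z.imag) < 1e-12                 # the result is indeed real
```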
1.3.5 Trigonometric and hyperbolic functions
Given a real number $x$, we can also define $\sin x$ and $\cos x$ with the help of the exponential form of complex numbers. It is enough to notice that $e^{ix} + e^{-ix} = 2\cos x$ and that $e^{ix} - e^{-ix} = 2i\sin x$. We can generalize these equalities and define the cosine and the sine of a generic complex number $z$ in this way:
\[
\cos z = \frac{e^{iz} + e^{-iz}}{2}, \tag{1.22}
\]
\[
\sin z = \frac{e^{iz} - e^{-iz}}{2i}. \tag{1.23}
\]
The case in which $z$ is a pure imaginary number is very interesting. In fact, in this case $z = ix$ with $x$ real. We obtain therefore:
\[
\cos(ix) = \frac{e^{x} + e^{-x}}{2} \equiv \cosh x, \tag{1.24}
\]
\[
\sin(ix) = i\,\frac{e^{x} - e^{-x}}{2} \equiv i\sinh x. \tag{1.25}
\]
In this way we have defined the hyperbolic functions $\sinh x = \frac{1}{2}(e^x - e^{-x})$ and $\cosh x = \frac{1}{2}(e^x + e^{-x})$. In an analogous way we can also define $\cosh z$ and $\sinh z$ for a generic complex number $z$. We have seen from Eqs. 1.24 and 1.25 that $\cos(ix) = \cosh x$ and $\sin(ix) = i\sinh x$. Analogously, we can see that $\cos x = \cosh(ix)$ and $i\sin x = \sinh(ix)$. These relations make the relationship between trigonometric and hyperbolic functions transparent. For instance, from the well-known relation $\cos^2 x + \sin^2 x = 1$ it is trivial to find the analogous relation for hyperbolic functions, $\cosh^2 x - \sinh^2 x = 1$.
By analogy with the trigonometric functions, the remaining hyperbolic functions
\[
\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \tag{1.26}
\]
\[
\mathrm{sech}\, x = \frac{1}{\cosh x} = \frac{2}{e^x + e^{-x}}, \tag{1.27}
\]
\[
\mathrm{cosech}\, x = \frac{1}{\sinh x} = \frac{2}{e^x - e^{-x}}, \tag{1.28}
\]
\[
\coth x = \frac{1}{\tanh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}, \tag{1.29}
\]
can be defined. The inverse hyperbolic functions $\mathrm{arcsinh}\, x$, $\mathrm{arccosh}\, x$ and $\mathrm{arctanh}\, x$ can also be defined, and it is possible, by inverting the equations defining the hyperbolic functions, to find closed-form expressions for them, namely:
\[
\mathrm{arcsinh}\, x = \ln\left(\sqrt{x^2 + 1} + x\right), \tag{1.31}
\]
\[
\mathrm{arccosh}\, x = \ln\left(x \pm \sqrt{x^2 - 1}\right), \tag{1.32}
\]
\[
\mathrm{arctanh}\, x = \ln\sqrt{\frac{1+x}{1-x}}. \tag{1.33}
\]
Example 1.3.6 Demonstrate Eq. 1.33.

By the definition of the hyperbolic tangent we have $\tanh y = \frac{e^y - e^{-y}}{e^y + e^{-y}} = x$. We want to invert this relation and express $y = \mathrm{arctanh}\, x$ as a function of $x$. It turns out that:
\[
\begin{aligned}
e^y - e^{-y} &= x(e^y + e^{-y}) \\
\Rightarrow\ e^y(1 - x) &= e^{-y}(1 + x) \\
\Rightarrow\ e^{2y} &= \frac{1+x}{1-x} \\
\Rightarrow\ \mathrm{arctanh}\, x &= \ln\sqrt{\frac{1+x}{1-x}}.
\end{aligned}
\]
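A quick numerical check of Eq. 1.33 (a minimal sketch using the standard math module, valid for $|x| < 1$):

```python
import math

def arctanh_closed_form(x):
    """Eq. 1.33: arctanh(x) = ln(sqrt((1 + x)/(1 - x))), valid for |x| < 1."""
    return math.log(math.sqrt((1 + x) / (1 - x)))

for x in (-0.9, -0.5, 0.0, 0.3, 0.75):
    assert abs(arctanh_closed_form(x) - math.atanh(x)) < 1e-12
    # Consistency check: tanh(arctanh(x)) must give back x.
    assert abs(math.tanh(arctanh_closed_form(x)) - x) < 1e-12
print("Eq. 1.33 verified on the sample points")
```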
Example 1.3.7 Find the solutions of the equation $4\cosh x - 3e^{-x} = -1$.

From the definition of $\cosh x$, this equation is equivalent to $(2e^x + 2e^{-x}) - 3e^{-x} + 1 = 0$. Multiplying by $e^x$ we obtain:
\[
2e^{2x} + e^x - 1 = 0,
\]
which is a quadratic expression in $e^x$, whose solutions are $e^x = \frac{1}{2}$ and $e^x = -1$. The first has the obvious solution $x = -\ln 2$, whereas for the second we have to express $-1$ in exponential form, namely $-1 = e^{i(\pi + 2n\pi)}$, therefore $x = i(\pi + 2n\pi)$, with principal value $x = i\pi$.
Chapter 2
Ordinary differential equations
Science is a differential equation. Religion is a boundary condition.
– Alan Turing –
2.1 Introduction

Almost every physical problem has to do with rates of change of some quantities. For instance, the velocity is the rate at which the position of a body changes with time. Expressed in mathematical terms, rates are derivatives. Therefore, in order to describe most physical problems, we encounter equations containing derivatives of some unknown function. Such equations are called differential equations (DEs). The DE (or the system of DEs) that describes some specific physical problem is also called the mathematical model of the process.
2.1.1 A simple example

We know that if we let a body of mass $m$ fall near sea level, the only force acting on it (neglecting air resistance) is gravity, which has only a vertical component and is quantified by $mg$, with $g = 9.8\ \mathrm{m\,s^{-2}}$. From Newton's second law $F = ma$ we have the obvious relation $ma = mg$. We also know that $a$ is the rate of change of the velocity as a function of time, therefore we can express Newton's second law with a very simple DE, namely:
\[
\frac{dv}{dt} = g, \tag{2.1}
\]
where $v$ is the velocity in the vertical direction. We know from mechanics the solution of this DE, namely $v(t) = v_0 + gt$. If we now take into account the drag force of the atmosphere, things get (slightly) more complicated. Usually the drag is taken proportional to the velocity of the body and acts in the direction opposite to the
motion of the body. In this way, the force acting on the body is given by $F = mg - \gamma v$, where $\gamma$ is a constant. From Newton's second law we therefore get:
\[
m\,\frac{dv}{dt} = mg - \gamma v. \tag{2.2}
\]
As we can see, the unknown function $v(t)$ appears on both sides of the equation: on the right hand side the function $v$ itself appears, while on the left hand side its derivative with respect to $t$ appears.
2.1.2 The direction field

Although Eq. 2.2 is easy to solve (we will do it at the end of this subsection), there is a way to get useful information about its behavior without solving it. In fact, we know from geometry that, if we plot $v(t)$ in a cartesian coordinate system, then for each value of $t$ the quantity $\frac{dv}{dt}$ represents the slope of the curve $v(t)$ at the point with coordinates $(t, v)$. Therefore, if we assign the velocity a specific value $v$, for each time $t$ the quantity $\frac{dv}{dt} = g - \frac{\gamma}{m} v$ represents the slope of the function $v(t)$ passing through the point $(t, v)$. To make the example more quantitative, we can assume $m = 10$ kg, $\gamma = 2$ kg s$^{-1}$. In this way we have $\frac{dv}{dt} = 9.8 - 0.2\,v$. If $v = 0$, then, for each $t$, $\frac{dv}{dt} = 9.8$; if $v = 10$ m s$^{-1}$, then $\frac{dv}{dt} = 7.8$; if $v = 20$ m s$^{-1}$, then $\frac{dv}{dt} = 5.8$, and so on. We can display this information by drawing in the $tv$-plane short line segments with slope $\frac{dv}{dt}$ at different points $(t, v)$. This has been done in Fig. 2.1. We can see from this plot that, in the lower part of the diagram, the slopes are always positive. In physical terms this means that the velocity increases with time. That happens only if the velocity at time $t = 0$ (the initial velocity) is lower than some threshold value. On the other hand, if we start with a high initial velocity, the slope is always negative, namely the velocity decreases with time. A value of the velocity somewhere around 50 m s$^{-1}$ seems to be peculiar, because the slopes of the segments flatten considerably. Indeed this peculiar behavior occurs when $\frac{dv}{dt} = 0$ (therefore when $v = \frac{gm}{\gamma}$; $v = 49$ m s$^{-1}$ in our example). In fact, if we start at $t = 0$ with this velocity, then gravity and the drag force balance perfectly and the velocity of the body does not change with time. This solution is called the equilibrium solution. The velocity $v = 49$ m s$^{-1}$ is also called the terminal velocity since, no matter what the initial velocity is, at sufficiently large times the velocity tends asymptotically to this value.
Although we have not yet started to treat differential equations, we can nevertheless solve the easy Eq. 2.2 by treating the infinitesimals $dv$ and $dt$ as ordinary quantities in an equation. Therefore we have:
\[
\frac{dv}{g - \frac{\gamma}{m} v} = dt.
\]
Figure 2.1: The direction field of the equation $\frac{dv}{dt} = 9.8 - 0.2\,v$.
Integrating the right hand side between $0$ and $t$ and the left hand side between $v_0$ (the velocity at time $t = 0$) and $v$, we obtain:
\[
t = \int_{v_0}^{v} \frac{dw}{g - \frac{\gamma}{m} w} = -\frac{m}{\gamma}\,\ln\frac{mg - \gamma v}{mg - \gamma v_0}.
\]
From this equation we get:
\[
mg - \gamma v = (mg - \gamma v_0)\,e^{-\frac{\gamma}{m} t}.
\]
Now we can recover $v(t)$, namely:
\[
v = \frac{mg}{\gamma} - \left(\frac{mg}{\gamma} - v_0\right) e^{-\frac{\gamma}{m} t}. \tag{2.3}
\]
From this family of solutions (which depends on the initial velocity $v_0$) we can check the validity of the properties we have already noticed with the help of the direction field. In particular we can notice that, if $v_0 < \frac{mg}{\gamma}$, then $v(t)$ always increases with time; the opposite happens for $v_0 > \frac{mg}{\gamma}$. If $v_0 = \frac{mg}{\gamma}$, then $v$ does not change with time. We have drawn some of these solutions in Fig. 2.2 for the following initial velocities: $v_0 = 40, 45, 49$ (thick line), $55$ and $60$ m s$^{-1}$. As expected, all the solutions tend asymptotically to the equilibrium value and the tangents to all the curves are well reproduced by the direction field we have drawn previously. Eq. 2.3 is also called the general solution of the DE 2.2 since it represents the solution for all possible initial velocities of the body.

Figure 2.2: The direction field of the equation $\frac{dv}{dt} = 9.8 - 0.2\,v$ together with the solutions (red continuous lines) under the initial conditions $v_0 = 40, 45, 49$ (thick line), $55$ and $60$ m s$^{-1}$.

The curves drawn in Fig. 2.2 are a subsample of the family of infinitely many curves representing the general solution of the DE, also called integral curves.

Although the utility of the direction field has been made clear by this example, its use is limited to DEs of the form $\frac{dy}{dx} = f(x, y)$.
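A minimal sketch of how the direction field of Fig. 2.1 and the integral curves of Fig. 2.2 can be reproduced, assuming the numpy and matplotlib packages are available; the parameters are those of the worked example ($g = 9.8$ m s$^{-2}$, $\gamma/m = 0.2$ s$^{-1}$):

```python
import numpy as np
import matplotlib.pyplot as plt

g, gamma_over_m = 9.8, 0.2            # dv/dt = 9.8 - 0.2*v, as in the example

# Direction field: short segments of slope dv/dt at a grid of (t, v) points.
t_grid, v_grid = np.meshgrid(np.linspace(0, 10, 20), np.linspace(0, 100, 20))
slope = g - gamma_over_m * v_grid
plt.quiver(t_grid, v_grid, np.ones_like(slope), slope, angles="xy")

# Analytic solutions, Eq. 2.3, for the initial velocities used in Fig. 2.2.
t = np.linspace(0, 10, 200)
v_eq = g / gamma_over_m                              # terminal velocity, 49 m/s
for v0 in (40, 45, 49, 55, 60):
    plt.plot(t, v_eq - (v_eq - v0) * np.exp(-gamma_over_m * t))

plt.xlabel("t [s]")
plt.ylabel("v [m/s]")
plt.show()
```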
2.1.3 Basic definitions

In the example we have analyzed in the previous subsections, we have expressed the velocity of the body as a function of time. Namely, the variable $t$ has been left free to vary and we have checked how the variable $v$ varies as a function of $t$. We therefore call $t$ the independent variable and $v$ the dependent variable. In our example we have given the variables names that recall their physical meanings ($t$ for time and $v$ for velocity). If the equation is a purely mathematical abstraction, it is customary to indicate the independent variable with $x$ and the dependent one with $y$.

If there is only one independent variable and, as a consequence, only total derivatives like $\frac{dy}{dx}$, then the equation is called an ordinary differential equation, or ODE. If instead the function we seek depends on several independent variables and the equation contains partial derivatives with respect to them, then the equation is called a partial differential equation (or PDE). An equation like:
\[
\left(\frac{d^2 y}{dx^2}\right)^3 + \cos\frac{dy}{dx} = \ln y^x,
\]
is an ODE in which we seek the unknown function $y(x)$ which satisfies it. Instead, an equation like:
\[
\frac{\partial^3 f}{\partial x^2\,\partial y} + \frac{\partial f}{\partial x}\, e^x + \ln y = 0,
\]
is an example of a PDE, and in this case the function we seek is $f(x, y)$, depending on the two independent variables $x$ and $y$. In this chapter (and in the following ones) we will not treat PDEs (see however Sects. 6.3.1 and 6.3.2), concentrating our attention on ODEs.
The order of an ODE is the order of the highest derivative that appears in the equation. The equation
\[
F\!\left(x, y, \frac{dy}{dx}, \frac{d^2 y}{dx^2}, \dots, \frac{d^n y}{dx^n}\right) = 0 \tag{2.4}
\]
is the general expression of an $n$-th order ODE. It is customary to indicate with $y'(x), y''(x), \dots, y^{(n)}(x)$ the first, second, ..., $n$-th derivative of $y$ as a function of $x$. Sometimes, in order to simplify the notation, the dependence on the independent variable $x$ is omitted. Therefore Eq. 2.4 can also be written as:
\[
F[x, y, y', y'', \dots, y^{(n)}] = 0. \tag{2.5}
\]
The degree of an ODE is instead the power to which the highest order derivative is raised. An ODE like:
\[
\left(y'''\right)^4 - 2\left(y''\right)^5 - y = x,
\]
is of third order and fourth degree.
Another crucial classification of differential equations is between linear and nonlinear ones. An ODE is said to be linear if the function $F$ in Eq. 2.5 is a linear function of $y, y', y'', \dots, y^{(n)}$; nonlinear otherwise. Therefore, a linear ODE of $n$-th order can be expressed as:
\[
a_n(x)\,y^{(n)}(x) + a_{n-1}(x)\,y^{(n-1)}(x) + \dots + a_1(x)\,y'(x) + a_0(x)\,y(x) = f(x). \tag{2.6}
\]
Referring to the example in Sect. 2.1.1, we have seen that we can recover a single function $v(t)$ as a solution of the given ODE only if we specify a constant, which is the velocity of the body at some specific time (in our case the initial time $t = 0$). If we do not specify it, then we obtain a family of solutions. An ODE associated with an initial condition of the kind $y(x_0) = y_0$ is called an initial value problem (or Cauchy problem). The solution of an initial value problem (namely ODE plus initial condition) is called a particular solution of the ODE. Given an ODE of $n$-th order, an initial value problem consists in specifying the value of the zeroth, first, ..., $(n-1)$-th derivative of the dependent variable $y$ at some fixed point $x_0$; otherwise the solution is not uniquely defined. We can clarify this with the trivial example of the fall of a body without drag (Eq. 2.1). We have seen that this leads to the general solution $v(t) = v_0 + gt$. To obtain the height of the body $y(t)$ as a function of time, we have to integrate once more over time, obtaining:
\[
y(t) = y_0 + v_0 t + \frac{1}{2} g t^2, \tag{2.7}
\]
where $y_0 = y(t = 0)$ and $v_0 = y'(t = 0)$. Namely, in order to obtain the particular solution we have specified the value of the function $y(t)$ and of its first derivative at the time $t_0 = 0$. An alternative is to assign the value of $y(t)$ at two different values of time, $y_1 = y(t_1)$ and $y_2 = y(t_2)$. Assuming for simplicity that $t_2 = 0$, namely that $y_2 = y(t_2) = y(0) = y_0$, the condition $y_1 = y(t_1)$ translates into:
\[
y_1 = y_0 + v_0 t_1 + \frac{1}{2} g t_1^2 \;\Rightarrow\; v_0 = \frac{y_1 - y_0}{t_1} - \frac{1}{2} g t_1.
\]
Therefore, the particular solution of the ODE is:
\[
y(t) = y_0 + \left(\frac{y_1 - y_0}{t_1} - \frac{1}{2} g t_1\right) t + \frac{1}{2} g t^2. \tag{2.8}
\]
An ODE which has, as additional constraints, the values of the dependent variable at different values of the independent variable is called a boundary value problem.
2.2 First order, first degree differential equations

The most general expression of a first order, first degree ODE is:
\[
Q(x, y)\, y'(x) + P(x, y) = 0. \tag{2.9}
\]
In the next sections we will be concerned with determining whether a solution of this ODE exists and, if so, with developing methods for finding it. However, it is worth remarking that for arbitrary functions $P$ and $Q$ a solution might not exist, or it might not be possible to express it in terms of elementary functions. In such cases, numerical methods are required. We will concentrate instead on methods which can be applied to particular subclasses of first order ODEs, namely separable equations, linear equations and exact equations.
2.2.1 Separable equations

Taking into account the general first order ODE Eq. 2.9, if it happens that $Q(x, y) = q(y)$ depends only on $y$ and $P(x, y) = p(x)$ depends only on $x$, then the ODE is particularly easy to treat. In fact, we can immediately obtain $q(y)\,dy = -p(x)\,dx$, and at this point we can integrate both sides of this equation; namely, the solution of the ODE $q(y)\,y'(x) + p(x) = 0$ is:
\[
\int q(\tilde{y})\,d\tilde{y} = -\int p(\tilde{x})\,d\tilde{x} + K, \tag{2.10}
\]
where $K$ is a suitable constant. It is also very simple to solve an initial value problem given by a separable ODE and the initial condition $y(x_0) = y_0$. The solution is in fact:
\[
\int_{y_0}^{y} q(\tilde{y})\,d\tilde{y} = -\int_{x_0}^{x} p(\tilde{x})\,d\tilde{x}. \tag{2.11}
\]
Example 2.2.1 Find the solution of the initial value problem:
\[
\begin{cases}
y'(x) = xy\sin x \\
y(0) = 2
\end{cases}
\]
This is clearly a separable ODE, which leads directly to the solution:
\[
\int_{2}^{y} \frac{1}{\tilde{y}}\,d\tilde{y} = \int_{0}^{x} \tilde{x}\sin\tilde{x}\,d\tilde{x}.
\]
The left hand side has the obvious solution $\ln\frac{y}{2}$. We have to integrate the right hand side by parts. Making use of the relation $\sin x\,dx = d(-\cos x)$ we obtain $\int x\sin x\,dx = -x\cos x + \int\cos x\,dx = \sin x - x\cos x$. We therefore obtain the solution:
\[
y(x) = 2\,e^{\sin x - x\cos x}.
\]
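Assuming the SymPy library is available, the initial value problem of Example 2.2.1 can be cross-checked symbolically (this is only a verification of the result obtained by hand):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# y'(x) = x*y*sin(x), y(0) = 2  (Example 2.2.1)
ode = sp.Eq(y(x).diff(x), x * y(x) * sp.sin(x))
sol = sp.dsolve(ode, y(x), ics={y(0): 2})

expected = 2 * sp.exp(sp.sin(x) - x * sp.cos(x))
assert sp.simplify(sol.rhs - expected) == 0
print(sol)
```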
Example 2.2.2 Find the solution of the initial value problem:
\[
\begin{cases}
x y'(x) = x - 1 - x^2 + 2xy - y^2 \\
y(1) = 0
\end{cases}
\]
This is not an obviously separable ODE. However, we can rewrite it as $x y'(x) - x + 1 = -x^2 + 2xy - y^2$ and we can notice that the right hand side of this equation is $-(x - y)^2$. It is therefore convenient to define $v = x - y$. Then $y = x - v$ and $y'(x) = 1 - v'(x)$. We obtain therefore:
\[
x(1 - v'(x)) - 1 + x = -v^2 \;\Rightarrow\; x v'(x) = v^2 + 1.
\]
This ODE is clearly separable and we can easily find a solution. In fact, the initial condition $y(1) = 0$ translates into $v(1) = 1$ and we can apply Eq. 2.11 to obtain:
\[
\int_{1}^{v} \frac{d\tilde{v}}{\tilde{v}^2 + 1} = \int_{1}^{x} \frac{d\tilde{x}}{\tilde{x}} \;\Rightarrow\; \arctan v - \frac{\pi}{4} = \ln x.
\]
This brings us directly to the solution:
\[
v = \tan\left(\ln x + \frac{\pi}{4}\right),
\]
which, recalling the substitution $v = x - y$, translates into:
\[
y = x - \tan\left(\ln x + \frac{\pi}{4}\right).
\]
This example shows us that, as in the case of integrals, we can simplify ODEs and reduce them to tractable ones by means of clever substitutions.
2.2.2 Linear equations

Recalling the general form of a linear ODE (Eq. 2.6), a first order linear ODE can be expressed as:
\[
a_1(x)\,y'(x) + a_0(x)\,y(x) = f(x).
\]
To simplify the notation we can define $r(x) = \frac{a_0(x)}{a_1(x)}$ and $s(x) = \frac{f(x)}{a_1(x)}$ (provided that $a_1(x) \neq 0$), therefore the general first order linear ODE can be expressed as:
\[
\frac{dy}{dx} + r(x)\,y(x) = s(x). \tag{2.12}
\]
To find the solution of this ODE we can proceed as follows. Let us introduce a (yet unknown) function $g(x)$, by which we multiply both sides of Eq. 2.12. We obtain therefore:
\[
g(x)\,\frac{dy}{dx} + r(x)\,g(x)\,y(x) = s(x)\,g(x). \tag{2.13}
\]
Now, we know that $\frac{d}{dx}[gy] = g(x)\frac{dy}{dx} + y(x)\frac{dg}{dx}$. The right hand side of this relation is equal to the left hand side of Eq. 2.13 if and only if the condition
\[
\frac{dg}{dx} = r(x)\,g(x)
\]
holds. This equation is easy to solve and it yields:
\[
g(x) = e^{\int r(\tilde{x})\,d\tilde{x}}. \tag{2.14}
\]
Coming back to Eq. 2.13 we now have:
\[
\frac{d}{dx}[gy] = s(x)\,g(x). \tag{2.15}
\]
Treating $gy$ as the dependent variable of this ODE, we obtain:
\[
gy = \int s(x)\,g(x)\,dx + K.
\]
Since we have already found the function $g(x)$, the general solution is:
\[
y = \frac{1}{g}\left[\int s(x)\,g(x)\,dx + K\right] = e^{-\int r(\tilde{x})\,d\tilde{x}}\left[\int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx + K\right]. \tag{2.16}
\]
The function $g(x)$ is also called the integrating factor.
Example 2.2.3 Find the solution of the ODE:
\[
(1 + x^2)^{3/2}\, y'(x) + 2xy\sqrt{1 + x^2} = 1
\]
and draw some of the integral curves.

We can rearrange this ODE obtaining:
\[
y'(x) + \frac{2xy}{1 + x^2} = \frac{1}{(1 + x^2)^{3/2}}.
\]
This is clearly a linear ODE with $r(x) = 2x(x^2 + 1)^{-1}$ and $s(x) = (x^2 + 1)^{-3/2}$. It is easy to see that $\int r(x)\,dx = \ln(1 + x^2)$, therefore the integrating factor $g(x)$ is:
\[
g(x) = e^{\int r(x)\,dx} = e^{\ln(1 + x^2)} = 1 + x^2.
\]
It is worth stressing here that $\int r(x)\,dx = \ln(1 + x^2) + K$, therefore every function of the kind $e^{\ln(1 + x^2) + K}$ is an integrating factor as well. However, we do not need a specific integrating factor (any function satisfying Eq. 2.15 is sufficient), therefore we take the simplest possible $g(x)$. From Eq. 2.16 we obtain:
\[
y(x) = \frac{1}{g}\left[\int s(x)\,g(x)\,dx + K\right] = \frac{1}{1 + x^2}\left[\int \frac{dx}{\sqrt{1 + x^2}} + K\right] = \frac{\mathrm{arcsinh}\, x + K}{1 + x^2}.
\]
Some of these solutions (for $K = -3 \dots 3$) are plotted in Fig. 2.3.

Figure 2.3: Solutions of the ODE $(1 + x^2)^{3/2}\, y'(x) + 2xy\sqrt{1 + x^2} = 1$ (Example 2.2.3) with integration constant ranging from $K = -3$ (lowermost curve) to $K = 3$ (uppermost curve).
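Assuming SymPy is available, the general solution found in Example 2.2.3 can be cross-checked: dsolve should return an equivalent expression, and the hand-derived solution can be substituted back into the ODE:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq((1 + x**2)**sp.Rational(3, 2) * y(x).diff(x)
            + 2 * x * y(x) * sp.sqrt(1 + x**2), 1)
sol = sp.dsolve(ode, y(x))
print(sp.simplify(sol.rhs))   # should be equivalent to (asinh(x) + C1)/(x**2 + 1)

# Independent check: substitute the hand-derived solution back into the ODE.
K = sp.symbols('K')
y_hand = (sp.asinh(x) + K) / (1 + x**2)
residual = (1 + x**2)**sp.Rational(3, 2) * sp.diff(y_hand, x) \
           + 2 * x * y_hand * sp.sqrt(1 + x**2) - 1
assert sp.simplify(residual) == 0
```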
Variation of parameters
There is an alternative method to solve linear ODEs which is worth introducing here
because we will use it more extensively to solve higher order ODEs. This method
is called variation of parameters (or variation of constants) and it was developed by
the mathematician Joseph Louis Lagrange. Given a linear ODE as in Eq. 2.12, this
method consists in finding, as a first step, the solution of the associated homogeneous
equation:
\[
y'(x) + r(x)\,y(x) = 0, \tag{2.17}
\]
namely the equation obtained assuming $s(x) = 0$. This equation is separable, therefore it is straightforward to find the solution:
\[
y(x) = e^{-\int r(\tilde{x})\,d\tilde{x} + K} = A\,e^{-\int r(\tilde{x})\,d\tilde{x}}.
\]
We can now assume that the solution of the ODE is given by an expression like:
\[
y(x) = A(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}}, \tag{2.18}
\]
namely, instead of keeping $A$ constant, we let it vary with $x$ and we check for which function $A(x)$ Eq. 2.12 is satisfied. In order to do that, it is enough to substitute Eq. 2.18 into Eq. 2.12. We obtain:
\[
\begin{aligned}
A'(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} - A(x) r(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} + A(x) r(x)\,e^{-\int r(\tilde{x})\,d\tilde{x}} &= s(x) \\
\Rightarrow\; A'(x) &= s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}} \\
\Rightarrow\; A(x) &= \int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx. \tag{2.19}
\end{aligned}
\]
That is, the final solution $y(x)$ of the given ODE is:
\[
y(x) = e^{-\int r(\tilde{x})\,d\tilde{x}} \int s(x)\,e^{\int r(\tilde{x})\,d\tilde{x}}\,dx,
\]
which, of course, coincides with Eq. 2.16 already found with the help of the integrating factor.
2.2.3 Exact equations

Let us go back to the general form of a first order ODE, $Q(x, y)\,y'(x) + P(x, y) = 0$ (Eq. 2.9). Let us then suppose that we can find a function $f(x, y)$ such that:
\[
\frac{\partial f(x, y)}{\partial x} = P(x, y), \tag{2.20}
\]
\[
\frac{\partial f(x, y)}{\partial y} = Q(x, y). \tag{2.21}
\]
Then we have
\[
df(x, y) = \frac{\partial f(x, y)}{\partial x}\,dx + \frac{\partial f(x, y)}{\partial y}\,dy = P(x, y)\,dx + Q(x, y)\,dy = 0.
\]
That means that $f(x, y)$ is constant or, in other words, that the relation $f(x, y) = K$, with $K$ an arbitrary constant, is the general solution of the ODE Eq. 2.9 in implicit form. If such a function $f(x, y)$ can be found, the ODE takes the name of exact equation.
It can be demonstrated that an ODE is exact if and only if
\[
\frac{\partial P(x, y)}{\partial y} = \frac{\partial Q(x, y)}{\partial x}. \tag{2.22}
\]
In fact, if the ODE is exact, Eqs. 2.20 and 2.21 hold. If we differentiate Eq. 2.20 with respect to $y$ and Eq. 2.21 with respect to $x$, then we obtain:
\[
\frac{\partial P(x, y)}{\partial y} = \frac{\partial}{\partial y}\frac{\partial f(x, y)}{\partial x} = \frac{\partial^2 f(x, y)}{\partial y\,\partial x},
\]
\[
\frac{\partial Q(x, y)}{\partial x} = \frac{\partial}{\partial x}\frac{\partial f(x, y)}{\partial y} = \frac{\partial^2 f(x, y)}{\partial x\,\partial y}.
\]
If the function $f(x, y)$ is sufficiently regular (i.e. twice continuously differentiable), then $\frac{\partial^2 f(x,y)}{\partial x\,\partial y} = \frac{\partial^2 f(x,y)}{\partial y\,\partial x}$. Unless otherwise stated, we will always deal with sufficiently regular (i.e. continuous and differentiable) functions, therefore we have demonstrated that, if the ODE is exact, then the condition expressed in Eq. 2.22 is satisfied.
The rigorous demonstration that the condition $\frac{\partial P(x,y)}{\partial y} = \frac{\partial Q(x,y)}{\partial x}$ implies that the ODE is exact is beyond the scope of these notes (and of this course). A less rigorous demonstration can be performed in the following way. We seek a function $f(x, y)$ such that $f(x, y) = K$ is a solution of the ODE $Q(x, y)\,y' + P(x, y) = 0$. We can always assume that Eq. 2.20 holds. We try to find out if Eq. 2.21 holds too. From Eq. 2.20 we get:
\[
f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + r(y), \tag{2.23}
\]
where $r(y)$ is an unknown function depending only on $y$. We can now differentiate Eq. 2.23 with respect to $y$ and assume that $\frac{\partial f}{\partial y} = Q$. We have therefore that:
\[
\frac{\partial f}{\partial y} = Q(x, y) = \frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x} + r'(y). \tag{2.24}
\]
Now, the quantity $r'(y) = Q(x, y) - \frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x}$ must depend only on $y$, namely its derivative with respect to $x$ must vanish. We have:
\[
\begin{aligned}
\frac{\partial r'(y)}{\partial x} &= \frac{\partial Q(x, y)}{\partial x} - \frac{\partial}{\partial x}\frac{\partial}{\partial y}\int P(\tilde{x}, y)\,d\tilde{x} \\
&= \frac{\partial Q(x, y)}{\partial x} - \frac{\partial}{\partial y}\frac{\partial}{\partial x}\int P(\tilde{x}, y)\,d\tilde{x} \\
&= \frac{\partial Q(x, y)}{\partial x} - \frac{\partial P(x, y)}{\partial y}, \tag{2.25}
\end{aligned}
\]
where we have made use of the relation $\frac{\partial}{\partial y}\frac{\partial}{\partial x} = \frac{\partial}{\partial x}\frac{\partial}{\partial y}$. Now, the right hand side of Eq. 2.25 is zero on account of Eq. 2.22, therefore $r'(y)$ indeed does not depend on $x$. If we find $r(y)$ by integrating Eq. 2.24, then substituting it into Eq. 2.23 we find the required function $f(x, y)$.
Example 2.2.4 Find the solution of the ODE:
\[
(2x^3 y + \cos x)\,y' + 3x^2 y^2 = y\sin x.
\]
We have that $P(x, y) = 3x^2 y^2 - y\sin x$ and $Q(x, y) = 2x^3 y + \cos x$, therefore:
\[
\frac{\partial P}{\partial y} = 6x^2 y - \sin x \ , \quad \frac{\partial Q}{\partial x} = 6x^2 y - \sin x.
\]
The ODE is exact and the solution is given by:
\[
f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + r(y) = \int (3\tilde{x}^2 y^2 - y\sin\tilde{x})\,d\tilde{x} + r(y) = x^3 y^2 + y\cos x + C + r(y). \tag{2.26}
\]
Since another integration constant is introduced by the equation $f(x, y) = K$, we can safely assume that $C = 0$. By differentiating Eq. 2.26 with respect to $y$ we obtain:
\[
r'(y) = \frac{\partial f(x, y)}{\partial y} - 2x^3 y - \cos x = Q(x, y) - 2x^3 y - \cos x = 0.
\]
We have thus obtained that $r(y)$ is a constant, therefore the general solution of the proposed ODE is:
\[
x^3 y^2 + y\cos x = K.
\]
There is an alternative method to the one seen above to find the solution of an exact ODE. It is less foolproof but in most cases more practical. It can be shown that the solution of the exact ODE $P(x, y) + Q(x, y)\,y' = 0$ is:
\[
f(x, y) = \int P(\tilde{x}, y)\,d\tilde{x} + \int Q(x_1, \tilde{y})\,d\tilde{y} = K \quad\text{or} \tag{2.27}
\]
\[
f(x, y) = \int Q(x, \tilde{y})\,d\tilde{y} + \int P(\tilde{x}, y_1)\,d\tilde{x} = K, \tag{2.28}
\]
where $x_1$ (or $y_1$) is a point where it is particularly easy to calculate the given integrals (in most cases $x_1 = 0$). In fact, it is very easy to see that, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ (namely if the equation is exact), then $df = 0$. The following example clarifies this method.
Example 2.2.5 Solve the ODE
\[
(2x^3 y + \cos x)\,y' + 3x^2 y^2 = y\sin x
\]
by means of this new method.

We have already seen how to solve this ODE in Example 2.2.4. With the help of Eq. 2.28 the solution is more straightforward. In fact, taking $y_1 = 0$ we obtain:
\[
f(x, y) = K = \int Q(x, \tilde{y})\,d\tilde{y} + \int P(\tilde{x}, y_1)\,d\tilde{x} = \int (2x^3\tilde{y} + \cos x)\,d\tilde{y} + \int 0\,d\tilde{x} = x^3 y^2 + y\cos x.
\]
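Assuming SymPy is available, the exactness condition Eq. 2.22 and the implicit solution of Examples 2.2.4 and 2.2.5 can be verified as follows:

```python
import sympy as sp

x, y = sp.symbols('x y')

P = 3 * x**2 * y**2 - y * sp.sin(x)       # P(x, y)
Q = 2 * x**3 * y + sp.cos(x)              # Q(x, y)

# Exactness condition, Eq. 2.22: dP/dy == dQ/dx.
assert sp.simplify(sp.diff(P, y) - sp.diff(Q, x)) == 0

# Candidate solution f(x, y) = x^3 y^2 + y cos(x); check df = P dx + Q dy.
f = x**3 * y**2 + y * sp.cos(x)
assert sp.simplify(sp.diff(f, x) - P) == 0
assert sp.simplify(sp.diff(f, y) - Q) == 0
print("f(x, y) =", f, "= K is the general (implicit) solution")
```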
2.2.4 Integrating factors

It is clear that separable equations, having $P(x, y) = p(x)$ and $Q(x, y) = q(y)$, are exact equations ($\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y} = 0$), whereas linear ODEs, with $Q(x, y) = 1$ and $P(x, y) = r(x)y - s(x)$ (see Eq. 2.12), in general are not exact. However, we have seen that we can find a suitable integrating factor $g(x)$ that transforms the ODE into the form of Eq. 2.15, which is an exact and easy to solve ODE. Is it always possible to find an integrating factor and solve the general first order ODE $Q(x, y)\,y' + P(x, y) = 0$? We can proceed as we have done in Sect. 2.2.2 and multiply both sides of the ODE by an unknown function $g$, which will in general be a function of both $x$ and $y$. We have therefore:
\[
g(x, y)\, Q(x, y)\,y' + g(x, y)\, P(x, y) = 0. \tag{2.29}
\]
This equation is exact if and only if:
\[
\begin{aligned}
\frac{\partial}{\partial x}\left[g(x, y)\, Q(x, y)\right] &= \frac{\partial}{\partial y}\left[g(x, y)\, P(x, y)\right] \\
\Rightarrow\; g(x, y)\frac{\partial Q}{\partial x} + Q(x, y)\frac{\partial g}{\partial x} &= g(x, y)\frac{\partial P}{\partial y} + P(x, y)\frac{\partial g}{\partial y} \\
\Rightarrow\; g(x, y)\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) &= P(x, y)\frac{\partial g}{\partial y} - Q(x, y)\frac{\partial g}{\partial x}. \tag{2.30}
\end{aligned}
\]
Once the integrating factor $g(x, y)$ is known, the (exact) Eq. 2.29 can be solved with the same technique we have seen in Sect. 2.2.3. Unfortunately, the solution of Eq. 2.30 can be more complicated than the starting ODE (it is a PDE). This method can nevertheless be effective in some specific cases.
1. $g = g(x)$

In this case, $\frac{\partial g}{\partial y} = 0$ and Eq. 2.30 reduces to:
\[
\frac{dg}{dx} = -\frac{g(x)}{Q(x, y)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right).
\]
Namely, the integrating factor $g$ depends only on $x$ if and only if the quantity
\[
\frac{1}{Q(x, y)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \tag{2.31}
\]
depends on $x$ only. In this case, $g(x)$ can be found by direct integration, namely:
\[
g(x) = e^{\int \frac{1}{Q(\tilde{x}, y)}\left(\frac{\partial P(\tilde{x}, y)}{\partial y} - \frac{\partial Q(\tilde{x}, y)}{\partial \tilde{x}}\right)\, d\tilde{x}}. \tag{2.32}
\]
It is clear that for linear equations, in which $Q(x, y) = 1$ and $P(x, y) = r(x)y - s(x)$, $g(x)$ reduces to Eq. 2.14.
Example 2.2.6 Find the solution of the ODE:
\[
(\ln x + y)\,y' + \frac{y}{x} + x(2y\ln x + y^2) = 0.
\]
We have that $P(x, y) = \frac{y}{x} + x(2y\ln x + y^2)$ and $Q(x, y) = \ln x + y$, therefore:
\[
\frac{\partial P}{\partial y} = \frac{1}{x} + 2x(\ln x + y) \ , \quad \frac{\partial Q}{\partial x} = \frac{1}{x}.
\]
The ODE is therefore not exact and we have to find an integrating factor $g(x, y)$. We can notice that:
\[
\frac{1}{Q(x, y)}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) = \frac{1}{\ln x + y}\left[\frac{1}{x} + 2x(\ln x + y) - \frac{1}{x}\right] = 2x
\]
indeed depends only on $x$, as the condition 2.31 requires. We can therefore easily find that
\[
g(x) = e^{\int 2x\,dx} = e^{x^2},
\]
and we therefore recover the exact ODE:
\[
e^{x^2}(\ln x + y)\,y' + e^{x^2}\frac{y}{x} + x e^{x^2}(2y\ln x + y^2) = 0.
\]
This equation is of course much uglier than the starting ODE and, if we looked at it, we would instinctively cancel out the factor $e^{x^2}$. Unfortunately, we would then eliminate the very factor that makes the ODE tractable. To solve this exact ODE we can start from the condition Eq. 2.21 and find:
\[
f(x, y) = \int Q(x, \tilde{y})\,d\tilde{y} + s(x) = \int e^{x^2}(\ln x + \tilde{y})\,d\tilde{y} + s(x) = e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) + s(x),
\]
where $s(x)$ is a function of the independent variable $x$ only. We can now differentiate this expression with respect to $x$ and use the condition expressed in Eq. 2.20 to find:
\[
s'(x) = e^{x^2}\frac{y}{x} + x e^{x^2}(2y\ln x + y^2) - 2x e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) - e^{x^2}\frac{y}{x} = 0.
\]
Namely, we have $s(x) = K$, therefore the general solution of the given ODE is:
\[
e^{x^2}\left(y\ln x + \frac{y^2}{2}\right) = K.
\]
2. $g = g(y)$

By the same line of reasoning as in the previous case, we can find that the integrating factor is of the form $g = g(y)$ if and only if the quantity
\[
\frac{1}{P(x, y)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \tag{2.33}
\]
depends on $y$ only. In this case, $g(y)$ is given by:
\[
g(y) = e^{\int \frac{1}{P(x, \tilde{y})}\left(\frac{\partial Q(x, \tilde{y})}{\partial x} - \frac{\partial P(x, \tilde{y})}{\partial \tilde{y}}\right)\, d\tilde{y}}. \tag{2.34}
\]
3. $g = g(x \cdot y)$

In this special case we can introduce the new variable $u = xy$ and we want to check under which conditions $g$ can be a function of $u$ alone. From the chain rule of differentiation we know that:
\[
\frac{\partial g}{\partial y} = \frac{\partial g}{\partial u}\frac{\partial u}{\partial y} = x\,\frac{dg}{du} \ , \quad
\frac{\partial g}{\partial x} = \frac{\partial g}{\partial u}\frac{\partial u}{\partial x} = y\,\frac{dg}{du}.
\]
Eq. 2.30 therefore translates into:
\[
g(u)\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) = Q y\,\frac{dg}{du} - P x\,\frac{dg}{du}.
\]
We can thus see that the integrating factor depends on $u = x \cdot y$ if and only if the function
\[
H(u) = \frac{1}{yQ - xP}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) \tag{2.35}
\]
depends on $u$ alone, and in this case we can easily find $g(u)$ by:
\[
g(u) = e^{\int H(\tilde{u})\,d\tilde{u}}. \tag{2.36}
\]
4. $g = g(x/y)$

Now we can make the substitution $u = \frac{x}{y}$, and the chain rule of differentiation leads us to:
\[
\frac{\partial g}{\partial y} = -\frac{x}{y^2}\,\frac{dg}{du} \ , \quad
\frac{\partial g}{\partial x} = \frac{1}{y}\,\frac{dg}{du}.
\]
Eq. 2.30 therefore translates into:
\[
g(u)\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) = \frac{Q}{y}\,\frac{dg}{du} + \frac{xP}{y^2}\,\frac{dg}{du}.
\]
Therefore the integrating factor depends on $u = \frac{x}{y}$ if and only if the function
\[
K(u) = \frac{y^2}{yQ + xP}\left(\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}\right) \tag{2.37}
\]
depends on $u$ alone, and also in this case $g(u)$ can be expressed by:
\[
g(u) = e^{\int K(\tilde{u})\,d\tilde{u}}. \tag{2.38}
\]
As a general rule it is therefore always useful to look at the quantity:
\[
\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x}.
\]
If this is zero, then the ODE is exact and “easy” to solve. If it is not zero, it might nevertheless hide a hint about the form the integrating factor $g(x, y)$ might have. Of course, it is worth recalling that the ODE might not have a solution! In that case, the integrating factor does not exist.
Example 2.2.7 Find the solution of the ODE:
\[
1 + 6\frac{x^2}{y} + \frac{x}{y}(\ln x - 2)\,y' = 0.
\]
We have that $P(x, y) = 1 + 6\frac{x^2}{y}$ and $Q(x, y) = \frac{x}{y}(\ln x - 2)$, therefore:
\[
\frac{\partial P}{\partial y} = -6\frac{x^2}{y^2} \ , \quad \frac{\partial Q}{\partial x} = \frac{1}{y}(\ln x - 1).
\]
The ODE is therefore not exact. $\frac{\partial Q}{\partial x}$ suggests nothing, but $\frac{\partial P}{\partial y}$ seems to suggest some possible dependence on $\frac{x}{y}$. We therefore try to calculate the quantity Eq. 2.37, which turns out to be in our case:
\[
K(u) = \frac{y^2}{x(\ln x - 1) + 6\frac{x^3}{y}}\left[-6\frac{x^2}{y^2} - \frac{1}{y}(\ln x - 1)\right] = -\frac{y}{x} = -\frac{1}{u}.
\]
We have been lucky: the integrating factor indeed depends on $\frac{x}{y}$. It is easy to calculate it. In fact, since $\int -\frac{1}{u}\,du = -\ln u$, we obtain $g(x, y) = \frac{y}{x}$. The exact ODE is therefore:
\[
\frac{y}{x} + 6x + (\ln x - 2)\,y' = 0.
\]
To solve it we can proceed as in Example 2.2.4, namely integrating Eq. 2.21, yielding:
\[
f(x, y) = \int (\ln x - 2)\,d\tilde{y} + s(x) = y(\ln x - 2) + s(x).
\]
We differentiate this equation with respect to $x$ and use the condition Eq. 2.20 to find:
\[
s'(x) = \frac{y}{x} + 6x - \frac{y}{x} = 6x,
\]
from which we get $s(x) = 3x^2$, therefore the solution is:
\[
y(\ln x - 2) + 3x^2 = K \;\Rightarrow\; y = \frac{K - 3x^2}{\ln x - 2}.
\]
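Assuming SymPy is available, the explicit solution found in Example 2.2.7 can be substituted back into the original ODE as a cross-check ($K$ below is the integration constant):

```python
import sympy as sp

x = sp.symbols('x', positive=True)
K = sp.symbols('K')

# Explicit solution found in Example 2.2.7: y = (K - 3*x**2)/(ln(x) - 2).
y_sol = (K - 3 * x**2) / (sp.log(x) - 2)

# Substitute it into the original ODE:
#   1 + 6*x**2/y + (x/y)*(ln(x) - 2)*y' = 0.
residual = 1 + 6 * x**2 / y_sol \
           + (x / y_sol) * (sp.log(x) - 2) * sp.diff(y_sol, x)
assert sp.simplify(residual) == 0
print("the solution of Example 2.2.7 satisfies the ODE")
```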
2.3 Higher degree first order differential equations
So far we have dealt only with ODEs of the type $P(x, y) + Q(x, y)\,y' = 0$, namely we have excluded the possibility that the derivative of the function $y(x)$ appears with an exponent different from 1. In the presence of terms of the type $[y']^n$, the solution of the ODE is much more complicated. In this case it is always a good idea to make the substitution $y'(x) = p$, namely to use the derivative $y'$ as a parameter and try to solve the corresponding equation $f(p, x, y) = 0$. An example helps to clarify this procedure.
Example 2.3.1 Solve the differential equation
\[
3[y'(x)]^2 - 5xy\,y' + 2x^2 y^2 = 0.
\]
With the substitution $y'(x) = p$ we obtain the equation:
\[
3p^2 - 5xyp + 2x^2 y^2 = 0 \;\Rightarrow\; 3\left(p - xy\right)\left(p - \frac{2}{3}xy\right) = 0.
\]
Now we are left with solving two first degree ODEs, namely:
\[
y' = xy \;\Rightarrow\; d\ln y = \frac{1}{2}\,dx^2 \;\Rightarrow\; y = Ke^{\frac{x^2}{2}},
\]
\[
y' = \frac{2}{3}xy \;\Rightarrow\; 3\,d\ln y = dx^2 \;\Rightarrow\; y = Ke^{\frac{x^2}{3}}.
\]
Note that, since only one constant is required for the solution of a first order ODE, we can take the same integration constant $K$ for both solutions. The final general solution of the given ODE is given by:
\[
\left(y - Ke^{\frac{x^2}{3}}\right)\left(y - Ke^{\frac{x^2}{2}}\right) = 0.
\]
The general form of a first order, higher degree ODE is:
\[
a_n(x, y)\,p^n + a_{n-1}(x, y)\,p^{n-1} + \dots + a_1(x, y)\,p + a_0(x, y) = 0, \tag{2.39}
\]
where we have used the notation $p = y'(x)$. The solution of this equation can be obtained (either explicitly or in parametric form) if the equation can be solved for $p$, for $y$, or for $x$.

2.3.1 Equations solvable for p

If the left hand side of Eq. 2.39 can be factorized into the form
\[
[p - F_1(x, y)][p - F_2(x, y)] \dots [p - F_n(x, y)], \tag{2.40}
\]
then we can solve separately each of the first order, first degree ODEs $p - F_j(x, y) = 0$ and express the solution in the form $G_j(x, y, K) = 0$. At this point, the general solution of the given ODE will be:
\[
G_1(x, y, K)\,G_2(x, y, K) \dots G_n(x, y, K) = 0. \tag{2.41}
\]
Example 2.3.1 has been solved exactly with this method.
2.3.2 Equations solvable for y

If Eq. 2.39 can be solved for $y$, it means that we can write it in the form $y = F(x, p)$. Now, differentiating both sides with respect to $x$ we obtain:
\[
y'(x) = p = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial p}\frac{dp}{dx}. \tag{2.42}
\]
Namely, we have a function $G(p, p'(x), x) = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial p}\frac{dp}{dx} - p = 0$ that does not depend on $y$. If this equation can be solved to give $p = p(x)$, then we can substitute this function into the original Eq. 2.39 and obtain the final solution $f(x, y) = 0$. Using this method, most of the time we find at the end of the computation some ancillary solutions that cannot be obtained from the general solution. These solutions are called singular solutions. An example can illustrate this kind of solution.
Example 2.3.2 Solve the differential equation
3[y ′(x)]2 − 2y ′(x) +
.
y
=0
x
2.3. HIGHER DEGREE FIRST ORDER DIFFERENTIAL EQUATIONS
37
As usual we assume y ′(x) = p. We can multiply both sides of the equations by x and
we can see that the equation is easily solvable for y. Namely, we obtain:
y = −3p2 x + 2px.
Now we can differentiate both members of this equation with respect to x and, reminding that y ′ = p we obtain:
p = −6pxp′ + 2xp′ − 3p2 + 2p,
⇒ 2xp′ (−3p + 1) + p(−3p + 1) = 0,
⇒ (p + 2xp′ )(1 − 3p) = 0.
(2.43)
We managed to factorize the equation and isolate one term containing p′ and another
term containing only p. Now we solve the simple ODE p + 2xp′ = 0, obtaining:

p′/p = −1/(2x) ⇒ ln p = −(1/2) ln x + K ⇒ p = K/√x.
Substituting this value of p into the original ODE, we obtain:

3K²/x − 2K/√x + y/x = 0 ⇒ y = 2K√x − 3K².
This is the general solution of the given ODE. However, we shall not forget that Eq. 2.43 was factorized into two factors and from the second factor we can easily find the solution p = 1/3. If we substitute it into the original ODE, we get the solution:

y = x/3.

It is very easy to see that this is indeed a solution of the given ODE. It can also be noticed that this solution cannot be obtained from the general solution for any choice of the integration constant K. That is the reason why it is called a singular solution.
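Both solutions found above can be verified by direct substitution. The following sketch (sympy assumed; residual is an ad hoc helper name) substitutes each candidate into 3(y′)² − 2y′ + y/x and checks that the result simplifies to zero.

    import sympy as sp

    x, K = sp.symbols('x K', positive=True)

    def residual(expr):
        # left-hand side of 3*(y')**2 - 2*y' + y/x evaluated on a candidate solution
        d = sp.diff(expr, x)
        return sp.simplify(3*d**2 - 2*d + expr/x)

    print(residual(2*K*sp.sqrt(x) - 3*K**2))   # 0: the general solution
    print(residual(x/3))                       # 0: the singular solution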
2.3.3 Equations solvable for x
If Eq. 2.39 can be solved for x, we can proceed in a similar way as we have done in
the previous subsection. Namely we can write the equation in the form x = F (y, p)
and differentiate both members with respect to y, obtaining:
x′(y) = 1/p = ∂F/∂y + (∂F/∂p)(dp/dy).    (2.44)
We have now found a function G(p, p′(y), y) = ∂F/∂y + (∂F/∂p)(dp/dy) − 1/p = 0 that does not
depend on x. We can use it together with the original ODE to eliminate p and give
the general solution. As in the previous case, if the function G(p, p′ (y), y) can be
factorized, then the term containing p′ (y) should be used to find p(y), which will
be used to eliminate y ′(x) = p from the original ODE and get the general solution.
Using the remaining term in the factorized function G(p, p′ (y), y) will often lead to
singular solutions. Note also that in this case (as in the case of an ODE solvable for y) it is not required that the ODE has the form of Eq. 2.39. It is enough that we can find a function G(p, p′(y), y) = 0 that does not depend on x, solve it and substitute p into the original ODE. An example can help to clarify this.
Example 2.3.3 Solve the ODE

xy′ = y ln y′.
Provided that y ′ > 0 (as required since otherwise the logarithm of y ′ would not be
defined), this is clearly an ODE solvable for x (indeed it is also solvable for y). We
make as usual the substitution y′ = p and obtain:

x = y (ln p)/p.
We differentiate it with respect to y and we obtain:
x′ = 1/p = (ln p)/p + y (p′ − p′ ln p)/p²  ⇒  p = p ln p + y(1 − ln p)p′  ⇒  (p − yp′)(1 − ln p) = 0.
Assuming that y ≠ 0, from the factor containing p′ we obtain:

dp/p = dy/y ⇒ p = Ky.
Substituting it into the equation x = y (ln p)/p we obtain the general solution:

x = ln(Ky)/K.

As usual, we shall not forget the term not containing p′ in the factorized ODE, namely the term (1 − ln p). By equating it to 0 we obtain p = e, namely we have obtained the singular solution y = e·x, which cannot be obtained from the general solution for any value of the integration constant K.
2.3.4 Special differential equations
Bernoulli’s equation
Bernoulli’s equation has the form:
y ′(x) = f (x)y(x) + g(x)y p (x).
(2.45)
For p = 0 or p = 1 we know already how to solve this ODE. For p ≠ 0, 1 this is a
non-linear first order, first degree ODE. However, it can be made linear by the simple
substitution z = y^{1−p}. In fact, we obtain:

y = z^{1/(1−p)} ⇒ y′ = [1/(1 − p)] z^{p/(1−p)} z′.

In this way we have:

[1/(1 − p)] z^{p/(1−p)} z′ = f(x) z^{1/(1−p)} + g(x) z^{p/(1−p)},
⇒ z′ = (1 − p)f(x)z + (1 − p)g(x).    (2.46)
This is a linear first order ODE in z and therefore we can promptly find the solution by applying Eq. 2.16, namely:

z(x) = e^{(1−p)∫f(x̃)dx̃} [ ∫ (1 − p) g(x) e^{(p−1)∫f(x̃)dx̃} dx + K ].

Recalling the substitution z = y^{1−p} we obtain therefore:

y(x) = { e^{(1−p)∫f(x̃)dx̃} [ ∫ (1 − p) g(x) e^{(p−1)∫f(x̃)dx̃} dx + K ] }^{1/(1−p)},

which can be simplified into:

y(x) = e^{∫f(x̃)dx̃} [ (1 − p) ∫ g(x) e^{(p−1)∫f(x̃)dx̃} dx + K ]^{1/(1−p)}.    (2.47)
Example 2.3.4 Solve the ODE
x²y − x³y′ = y⁴ sin x.

We can first notice that a trivial solution to this ODE is y = 0. If we divide both members of this ODE by x³ we obtain:
y′ = y/x − (y⁴/x³) sin x.

This is therefore a Bernoulli ODE, with p = 4, f(x) = 1/x, g(x) = −(sin x)/x³. We can therefore directly apply Eq. 2.47 and obtain:
y(x) = e^{∫dx̃/x̃} [ 3 ∫ (sin x/x³) e^{3∫dx̃/x̃} dx + K ]^{−1/3}
     = e^{ln x} [ 3 ∫ (sin x/x³) e^{3 ln x} dx + K ]^{−1/3}
     = x [ 3 ∫ sin x dx + K ]^{−1/3}
     = x / (K − 3 cos x)^{1/3}.
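As a cross-check of Eq. 2.47, one can also let a computer algebra system integrate the same Bernoulli equation directly. A minimal sympy sketch (the printed branches may differ from the hand computation by the choice of cube root and by how the integration constant is written):

    import sympy as sp

    x = sp.Symbol('x', positive=True)
    y = sp.Function('y')

    # Example 2.3.4: x**2*y - x**3*y' = y**4*sin(x), a Bernoulli equation with p = 4
    ode = sp.Eq(x**2*y(x) - x**3*y(x).diff(x), y(x)**4*sp.sin(x))
    sol = sp.dsolve(ode, y(x))
    print(sol)   # expected to be equivalent to y = x/(K - 3*cos(x))**(1/3), possibly as several branches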
Riccati’s equation
The Riccati’s equation has the form:
y ′ (x) = f (x) + g(x)y(x) + h(x)y 2 (x).
(2.48)
namely, it is a Bernoulli’s equation with p = 2 but with an additional term f (x). To
solve this equation we have to make the substitution:
u = e^{−∫y(x̃)h(x̃)dx̃} ⇒ u′ = −uyh.
In this way we obtain:

y = −u′/(uh) ⇒ y′ = −[u″(uh) − (u′h + uh′)u′]/(u²h²).

Substituting y and y′ into the original ODE Eq. 2.48 we obtain:

[(u′h + uh′)u′ − u″(uh)]/(u²h²) = f − g u′/(uh) + h (u′)²/(u²h²),
⇒ (u′)²h + uu′h′ − u″(uh) = f u²h² − g u′uh + h(u′)²,
⇒ u″uh = u′(uh′ + ghu) − f h²u²,
⇒ u″ = u′(h′/h + g) − f h u.    (2.49)
The resulting ODE is therefore linear, but unfortunately of second order! And we
have not learned (yet) how to solve second order ODE. The only possibility to solve
the Riccati’s ODE without invoking second order ODE is by means of quadrature,
once a particular solution of the ODE is known. If a particular solution is not known,
we cannot solve the Riccati’s ODE with the methods we have learned so far. If we
know that y1 (x) is a particular (not the general) solution, then of course from Eq.
2.48 we have y1 ′ = f + gy1 + hy1 2 . We can now make the substitution u = y − y1
and obtain:
u′ = g(y − y1 ) + h(y 2 − y1 2 ).
We have y 2 − y1 2 = (y − y1 )(y + y1 ) = (y − y1 )(y − y1 + 2y1 ) = u(u + 2y1 ), therefore
the above equation translates into:
u′ = gu + hu(u + 2y1 ) = (g + 2hy1 )u + hu2 .
(2.50)
This is a Bernoulli’s equation with p = 2, which can be directly solved by means of
Eq. 2.47, to obtain:
R
u(x) = e
[g(x̃)+2h(x̃)y1 (x̃)]dx̃
R
(−1)
Z
R
(1) [g(x̃)+2h(x̃)y1 (x̃)]dx̃
h(x)e
e [g(x̃)+2h(x̃)y1 (x̃)]dx̃
R
.
⇒ y(x) = y1 (x) − R
h(x)e [g(x̃)+2h(x̃)y1 (x̃)]dx̃ dx + K
dx + K
−1
,
(2.51)
Example 2.3.5 Solve the ODE
4xy′ − 2xy = e^{−x}y² + xe^x(4 + x).
We can rewrite the ODE into the form:

y′ = (1/2)y + (1/(4x))e^{−x}y² + (1/4)e^x(4 + x).

Now we can recognize it as a Riccati's ODE with f(x) = (1/4)e^x(4 + x), g(x) = 1/2, h(x) = (1/(4x))e^{−x}. Because of the term e^{−x}y² we can suppose that a function of the kind x^k e^x could be a solution of the ODE. In fact, we can see that the terms e^x would cancel out. Let us try if a value of k exists such that y1(x) = x^k e^x is a particular solution of the given ODE. It is y1′ = kx^{k−1}e^x + x^k e^x = x^{k−1}e^x(k + x), therefore we have:

x^{k−1}e^x(k + x) = (1/2)x^k e^x + (1/(4x))x^{2k}e^x + (1/4)e^x(4 + x).
As expected, we can cancel out ex and we obtain:
kx^{k−1} + x^k = (1/2)x^k + (1/(4x))x^{2k} + 1 + x/4.

It is clear that all the exponents of x should be the same and this condition is fulfilled only if k = 1; it is equally easy to see that for k = 1 the previous equation is indeed an identity, therefore y1(x) = xe^x is a particular solution of the given ODE. Now we can apply Eq. 2.51. We have:
e^{∫(g+2hy1)dx̃} = e^{∫[1/2 + (1/(2x̃))e^{−x̃}·x̃e^{x̃}]dx̃} = e^{∫1 dx̃} = e^x,

⇒ y(x) = y1(x) − e^x / [ ∫ (1/(4x))e^{−x}·e^x dx + K ] = y1(x) − e^x / [ (1/4)∫(1/x)dx + K ].
The general solution of the given ODE is thus:

y(x) = xe^x − 4e^x/(ln x + K).

In a sense, the particular solution y1 can be obtained from the general solution for K very large, namely lim_{K→∞} y(x) = y1(x).
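Both the guessed particular solution y1(x) = xe^x and the general solution just obtained can be checked by substitution into the Riccati equation. A minimal sympy sketch (riccati_residual is an ad hoc helper name):

    import sympy as sp

    x, K = sp.symbols('x K', positive=True)

    f = sp.exp(x)*(4 + x)/4
    g = sp.Rational(1, 2)
    h = sp.exp(-x)/(4*x)

    def riccati_residual(expr):
        # y' - (f + g*y + h*y**2) evaluated on a candidate solution
        return sp.simplify(sp.diff(expr, x) - (f + g*expr + h*expr**2))

    print(riccati_residual(x*sp.exp(x)))                                 # 0: particular solution
    print(riccati_residual(x*sp.exp(x) - 4*sp.exp(x)/(sp.log(x) + K)))   # 0: general solution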
Clairaut’s equation
Clairaut’s equation has the form:
y = xy ′ + g(y ′).
(2.52)
Namely, it is just a particular case of the equations solvable for y we have encountered
in Sect 2.3.2 and we can solve it with the method we have learned in that subsection,
but for the Clairaut’s equation the form of the general solution is particularly simple.
In fact, given as usual the substitution y ′ = p, we have y = xp + g(p), therefore,
differentiating with respect to x we get:
y′ = p = xp′ + p + (dg/dp)p′,
⇒ [x + dg/dp] p′ = 0.
The factor containing p′ is thus elementary to solve. It gives p = K and therefore we have already found the general solution of the ODE, that is:
y(x) = xK + g(K).
(2.53)
The equations

x = −dg/dp,
y = xp + g(p),    (2.54)

represent the (parametric) singular solution.
Example 2.3.6 Solve the ODE

y = xy′ + e^{y′}.

This is clearly a Clairaut's ODE with g(p) = e^p, therefore the general solution is given by:

y = Kx + e^K.

To find the singular solution we have to solve the system of equations:

x = −e^p,
y = xp + e^p.

We have p = ln(−x), therefore the singular solution of the given ODE is:

y = x ln(−x) − x = x[ln(−x) − 1].
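The two branches of this Clairaut example can again be verified by substitution. In the sketch below (sympy assumed; clairaut_residual is an ad hoc helper) x is declared negative so that ln(−x) is defined:

    import sympy as sp

    x = sp.Symbol('x', negative=True)
    K = sp.Symbol('K')

    def clairaut_residual(expr):
        # y - x*y' - exp(y') evaluated on a candidate solution
        d = sp.diff(expr, x)
        return sp.simplify(expr - x*d - sp.exp(d))

    print(clairaut_residual(K*x + sp.exp(K)))    # 0: general solution (family of straight lines)
    print(clairaut_residual(x*sp.log(-x) - x))   # 0: singular solution (envelope of the lines)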
D’Alembert’s equation
The D’Alembert’s (or D’Alembert-Lagrange) equation can be written as follows:
y = xf (y ′ ) + g(y ′).
(2.55)
It is therefore again a particular case of the equations solvable for y we have encountered in Sect. 2.3.2. Note also that the Clairaut's equation is a particular form of the D'Alembert's equation with f(y′) = y′. Also in this case there is an easier method of finding the general solution of this ODE than the one described in Sect. 2.3.2. Let as usual y′ = p, therefore y = xf(p) + g(p). We differentiate it with respect to x and obtain:
y′ = p = f(p) + xf′(p)p′ + g′(p)p′
⇒ p − f(p) = [xf′(p) + g′(p)] p′.
Now we can write p′ = dp/dx as 1/(dx/dp) = 1/x′(p) and obtain therefore:

x′(p) [p − f(p)] = xf′(p) + g′(p),
⇒ x′(p) = x f′(p)/[p − f(p)] + g′(p)/[p − f(p)].    (2.56)
Therefore, we have obtained a linear ODE which can be solved with the known methods to give x = x(p). This equation, together with the equation y = xf(p) + g(p),
is already the parametric general solution of the D’Alembert’s ODE. Unfortunately,
with this method we cannot recover singular solutions.
Example 2.3.7 Solve the ODE
y = x(y ′ )2 + 2y ′
This is clearly a D’Alembert’s ODE with f (y ′) = (y ′)2 and g(y ′) = 2y ′ . We can apply
therefore Eq. 2.56 to obtain:

x′(p) = x · 2p/(p − p²) + 2/(p − p²)
⇒ x′(p) − 2x/(1 − p) = 2/(p − p²).

This is a linear equation in x (with p as the independent variable), with r(p) = −2/(1 − p) and s(p) = 2/(p − p²). We know how to solve this ODE, namely:
x(p) = e^{∫2/(1−p̃)dp̃} [ ∫ 2/(p − p²) · e^{−∫2/(1−p̃)dp̃} dp + K ]
     = e^{−2 ln(1−p)} [ ∫ 2/(p(1 − p)) · e^{2 ln(1−p)} dp + K ]
     = 1/(1 − p)² [ ∫ (2/p)(1 − p) dp + K ]
     = 1/(1 − p)² [ ∫ (2/p − 2) dp + K ]
     = 2/(1 − p)² (ln p − p + K).
We can express the general solution of the given ODE in the parametric form:

x(p) = 2/(1 − p)² (ln p − p + K),
y(p) = xp² + 2p.
2.3.5 Singular solutions and envelopes
We have seen that, in solving higher degree ODEs, usually a singular solution emerges that cannot be obtained from the general solution for any choice of the integration constant K. Has the singular solution really nothing to do with the general solution? Let us take the ODE we have studied in Example 2.3.2, namely 3(y′)² − 2y′ + y/x = 0. We have seen that it has the general solution y = 2K√x − 3K² and a singular solution y = x/3. We have plotted the general solution for K ranging from 1 to 5 (black lines) together with the singular solution (red line) in Fig. 2.4. As is clear from this figure, the singular solution is tangent at some point to each member of the integral curves representing the general solution of the given ODE, namely it represents its envelope. Has this just happened by chance? Of course not. We recall that we have transformed the given ODE into the form y = −3p²x + 2px. This can be seen as an equation in p, namely 3p²x − 2px + y = 0. The discriminant of this equation is △ = x² − 3xy. Because of the term y/x in the original ODE we have to assume x ≠ 0. If we assume x > 0, then the discriminant is larger than 0 if and only if

x − 3y > 0 ⇒ y < x/3.

The curve y = x/3 thus represents the limiting curve below which two distinct roots for p = y′ can be found. On the curve itself, only solutions with multiplicity larger than one can be found. That means also that, if we set up an initial value problem by means of the initial condition y(x0) = y0, we will obtain two solutions if (x0, y0) lies below the curve delimiting the singular solution, one solution if (x0, y0) is on the curve, and no solutions otherwise. For instance, given the initial condition y(16) = 5, it is easy to see from the general solution of the ODE that there are two possible values of K satisfying it, namely K = 1 and K = 5/3.
We shall remark here that having two distinct solutions (with two distinct values of K) for an initial value problem is something that happens only for ODEs of degree higher than 1. For first-degree ODEs we have so far taken for granted that there exists a single and unique function solving an initial value problem:

y′ = f(x, y),
y(x0) = y0.    (2.57)
Actually the so-called existence and uniqueness theorem guarantees that, if the function f is well-behaved, then there is an interval of values of x where the solution of
Eq. 2.57 exists and is unique. We will not demonstrate this theorem, but it was
important to notice that this result is possible only for first-degree ODEs.
Figure 2.4: Family of curves y = 2K√x − 3K² with K ranging from 1 (flatter curve) to 5 (steeper curve). Red line: curve of equation y = x/3 (singular solution of the ODE 3(y′)² − 2y′ + y/x = 0).

Going back to the ODE 3(y′)² − 2y′ + y/x = 0, we can also notice that the derivative of the general solution y = 2K√x − 3K² with respect to K yields 2√x − 6K. If we equate it to 0, we obtain K = √x/3. If we substitute it into the general solution we obtain y = 2(√x/3)√x − 3(x/9) = x/3, namely again the singular solution. In other terms, if f(x, y, K) = 0 is the general solution of an ODE, the two equations:

f(x, y, K) = 0,
∂f(x, y, K)/∂K = 0,    (2.58)
represent the parametric form of the singular solution of the given ODE.
Yet another and even simpler method can be found to recover the singular solution. If we differentiate the original ODE with respect to y′ we obtain 6y′ − 2. Equating it to 0 we obtain y′ = 1/3. If we substitute this value into the ODE, we obtain once again the singular solution y = x/3. Namely, given a differential equation ψ(x, y, y′) = 0, the singular solution can be obtained by solving the system of equations:

ψ(x, y, y′) = 0,
∂ψ(x, y, y′)/∂y′ = 0,    (2.59)
without finding the general solution. Also in the ODE xy ′ = y ln y ′ (Example 2.3.3)
it is easy to see that both Eq. 2.58 and Eq. 2.59 lead us to the singular solution.
For what concerns the D’Alembert’s equation y = x(y ′ )2 + 2y ′ (Example 2.3.7) we
have seen that the standard method of solution does not allow us to find the singular
solution. Also the application of Eq. 2.58 is quite complicated given the parametric
form of the general solution. Applying Eq. 2.59 instead we obtain immediately that
1
∂ψ(x, y, y ′)
′
′
=
0
⇒
2xy
+
2
=
0
⇒
y
=
−
.
∂y ′
x
Substituting this value of y ′ into the original ODE we obtain immediately the singular
solution y = − x1 .
However, this second method is not applicable to all differential equations. For
instance in an ODE of the kind y ′ = ψ(x, y) if we differentiate both members with
respect to y ′ we obtain 1 = 0. The first method described by Eq. 2.58 works also in
this case (provided that it is possible to find the general solution of the ODE).
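The system Eq. 2.59 is easy to automate. The following sketch (sympy assumed; q is an ad hoc symbol standing for y′) recovers the singular solution of Example 2.3.2 without computing the general solution first:

    import sympy as sp

    x, y, q = sp.symbols('x y q')   # q stands for y'

    # psi(x, y, y') for the ODE of Example 2.3.2: 3*(y')**2 - 2*y' + y/x = 0
    psi = 3*q**2 - 2*q + y/x

    # Solve psi = 0 together with d(psi)/d(y') = 0 (Eq. 2.59)
    print(sp.solve([psi, sp.diff(psi, q)], [y, q], dict=True))   # [{q: 1/3, y: x/3}]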
2.4 Second order differential equations
Second order differential equations play a particularly important role in physics because many relevant physical processes (Newtonian dynamics, oscillations, electric
circuits and many more) can be described by means of equations involving second
order derivatives. We have already seen a very simple example of a second order differential equation in the fall of a body that, neglecting the air resistance, can be described by the ODE d²y/dt² = g. Since this ODE does not depend explicitly on t, y, dy/dt, it was particularly easy to solve it without knowing anything about the theory of second order ODEs. In general, however, we have to deal with equations of this kind:
y ′′ (x) = f [x, y(x), y ′(x)],
namely, we consider only second order, first degree ODEs. Actually, we will concentrate almost exclusively on linear ODEs, namely ODEs that can be written as:
a2 (x)y ′′ (x) + a1 (x)y ′ (x) + a0 (x)y(x) = f (x),
or, to simplify the notation:
y″(x) + p(x)y′(x) + q(x)y(x) = g(x),    (2.60)

where of course p(x) = a1(x)/a2(x), q(x) = a0(x)/a2(x) and g(x) = f(x)/a2(x) (provided that a2(x) ≠ 0).
As usual, we will start from the simplest possible cases, then we will deal progressively with more and more complicated ODEs.
2.4.1 Second order homogeneous ODEs with constant coefficients
The simplest possible second order ODE has the form:
a2 y ′′ (x) + a1 y ′ (x) + a0 y(x) = 0.
(2.61)
This ODE is called homogeneous because f (x) = 0 ∀x and is called with constant
coefficients for the obvious reason that the coefficients of each term of the ODE are
real numbers and not function of the independent variable x. Our mathematical
intuition can help us solve this ODE. In fact, we know that the exponential function
has the property that each derivative, irrespective of the order, remains proportional
to the initial function. We can therefore expect that a function of the kind y(x) = eλx
could be the solution of the given ODE. We have of course y ′ = λeλx and y ′′ = λ2 eλx ,
therefore:

λ²a2 e^{λx} + λa1 e^{λx} + a0 e^{λx} = 0,
⇒ λ = [−a1 ± √(a1² − 4a0a2)] / (2a2).    (2.62)
The solution of the ODE reduces therefore to the solution of the simple algebraic
equation λ2 a2 + λa1 + a0 = 0 that is called characteristic equation. We have therefore
two possible values of λ satisfying this equation. Indeed, we have more than that.
Let us call λ1 and λ2 the solutions of the equation 2.62. Of course, we have λ1 2 a2 +
λ1 a1 + a0 = 0 and λ2 2 a2 + λ2 a1 + a0 = 0. If we take a function f (x) = c1 eλ1 x + c2 eλ2 x ,
namely a linear combination of the two solutions e^{λ1x} and e^{λ2x}, we can easily see that f(x) is also a solution of the given ODE. In fact, substituting f(x) into the given ODE we obtain:
a2 f ′′ (x) + a1 f ′ (x) + a0 f (x) = 0,
⇒ a2 (c1 λ1 2 eλ1 x + c2 λ2 2 eλ2 x ) + a1 (c1 λ1 eλ1 x + c2 λ2 eλ2 x ) + a0 (c1 eλ1 x + c2 eλ2 x ) = 0,
⇒ c1 eλ1 x (a2 λ1 2 + a1 λ1 + a0 ) + c2 eλ2 x (a2 λ2 2 + a1 λ2 + a0 ) = 0.
We have already seen that the terms under brackets are 0, therefore this identity is
fulfilled and f (x) is indeed a solution of the given ODE.
This result is a particular case of a more general theorem, called principle of
superposition that states that, given a linear homogeneous differential equation:
y ′′ + p(x)y ′ + q(x)y = 0,
(2.63)
and given two solutions y1 and y2 of this ODE, then the linear combination c1 y1 +
c2 y2 is also a solution of the given ODE for any value of the constants c1 and c2 .
To demonstrate this theorem it is enough to recall the linearity of the derivative operator, namely that, given a function c1y1(x) + c2y2(x), one has:
[c1 y1 (x) + c2 y2 (x)]′ = c1 y1 ′ (x) + c2 y2 ′ (x).
(2.64)
Of course we will have also (c1 y1 + c2 y2 )′′ = c1 y1 ′′ + c2 y2 ′′ . Now we can test if the
function g = c1 y1 + c2 y2 satisfies the ODE g ′′ + p(x)g ′ + q(x)g = 0. We have:
g ′′ + pg ′ + qg = (c1 y1 ′′ + c2 y2 ′′ ) + p(c1 y1 ′ + c2 y2 ′ ) + q(c1 y1 + c2 y2 ),
= c1 (y1 ′′ + py1 ′ + qy1 ) + c2 (y2 ′′ + py2′ + qy2 ) = 0.
The last step is justified by the fact that the two functions y1 and y2 are solutions
of the given ODE. We have therefore shown that the function c1 y1 + c2 y2 is also
a solution of the ODE. Namely, starting from two particular solutions of the linear
homogeneous ODE we can construct an infinite family of solutions by means of linear
combinations of the two initial solutions.
2.4.2 The Wronskian determinant
Given a generic solution y = c1y1 + c2y2, how should c1 and c2 be chosen in order to satisfy the initial conditions y(x0) = y0 and y′(x0) = y0′? Of course it must be:

c1y1(x0) + c2y2(x0) = y0,
c1y1′(x0) + c2y2′(x0) = y0′.    (2.65)
We recall from the theory of systems of linear equations how to find c1 and c2 (see also Sect. 5.1.2). By means of Cramer's rule, we obtain (expanding the 2×2 determinants):

c1 = [y0 y2′(x0) − y0′ y2(x0)] / [y1(x0)y2′(x0) − y1′(x0)y2(x0)],
c2 = [y1(x0) y0′ − y1′(x0) y0] / [y1(x0)y2′(x0) − y1′(x0)y2(x0)].    (2.66)
Namely, the solution of this system is possible only if the denominators of these quantities are different from zero, namely only if:

W = det( y1(x0)  y2(x0) ; y1′(x0)  y2′(x0) ) = y1(x0)y2′(x0) − y1′(x0)y2(x0) ≠ 0.    (2.67)
The determinant W is called the Wronskian determinant or simply Wronskian and
it plays a fundamental role in the study of differential equations of order higher than
1. The Wronskian of two functions y1 and y2 in a point x0 is also indicated with the
notation W (y1 , y2)(x0 ).
The condition W ≠ 0 implies in the end that the functions y1 and y2 are linearly
independent. In fact, if the two functions are linearly dependent, then it is always
possible to find a constant k such that y2 = ky1 . The Wronskian is thus:
W = y1 · ky1 ′ − y1 ′ · ky1 = 0.
It can be demonstrated that, given an ODE y ′′ + p(x)y ′ + q(x)y = 0 that admits
two solutions y1 and y2 , if there is a point x0 where the Wronskian is nonzero, then
the family of solutions:
y(x) = c1 y1 (x) + c2 y2 (x),
(2.68)
with arbitrary coefficients c1 and c2 includes every solution of the given ODE. For
this reason we will call Eq. 2.68 the general solution of the given ODE. The solutions
y1 and y2 are said to form a fundamental set of solutions of the ODE. In fact, one can see that the family of solutions of the ODE forms a vector space and y1 and y2 form a basis (its elements are often called generators) of it.
If the functions y1 (x) and y2 (x) satisfy the linear homogeneous ODE Eq. 2.63,
we have:
y1 ′′ + p(x)y1 ′ + q(x)y1 = 0,
y2 ′′ + p(x)y2 ′ + q(x)y2 = 0.
We multiply now the first equation by −y2 and the second by y1 , obtaining:
− y2 y1 ′′ − p(x)y2 y1 ′ − q(x)y2 y1 = 0,
y1 y2 ′′ + p(x)y1 y2 ′ + q(x)y1 y2 = 0.
If we now add these two equations together we obtain:
y1 y2 ′′ − y2 y1 ′′ + p(x)(y1 y2 ′ − y2 y1 ′ ) = 0.
(2.69)
If we treat the Wronskian W (y1, y2 )(x) = y1 (x)y2 ′ (x) − y2 (x)y1 ′ (x) as a function
of x, we can differentiate it with respect to x and obtain:

W′(x) = y1′y2′ + y1y2″ − y1′y2′ − y1″y2 = y1y2″ − y1″y2.
Recalling now Eq. 2.69 we can rewrite it as:
W ′ (x) + p(x)W (x) = 0.
This is a simple separable differential equation, whose solution is:
W(x) = Ce^{−∫p(x̃)dx̃},    (2.70)
where the constant C depends only on the two functions y1 and y2 . Furthermore, if
C = 0, then the Wronskian is always zero, whereas if C ≠ 0 then the Wronskian is
always different from 0, for any choice of x. Eq. 2.70 is also called Abel’s theorem.
Since

W = y1y2′ − y1′y2 = y1² (y2/y1)′,

we can also recover the useful relation:

(y2/y1)′ = Ce^{−∫p(x̃)dx̃} / y1²  ⇒  y2 = Cy1 ∫ e^{−∫p(x̃)dx̃} / y1²(x) dx,    (2.71)
which is a direct formula to find the second solution of an ODE once a solution is
already known.
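Eq. 2.71 translates almost literally into a small routine. The sketch below (sympy assumed; second_solution is an ad hoc name, and the multiplicative constant C is dropped) is tried on the test equation y″ − 2y′ + y = 0, whose known solution is y1 = e^x:

    import sympy as sp

    x = sp.Symbol('x')

    def second_solution(p, y1):
        # Eq. 2.71 with C = 1: y2 = y1 * Integral( exp(-Integral(p)) / y1**2 )
        integrand = sp.exp(-sp.integrate(p, x)) / y1**2
        return sp.simplify(y1 * sp.integrate(integrand, x))

    # y'' - 2y' + y = 0 has p(x) = -2 and the known solution y1 = exp(x)
    print(second_solution(-2, sp.exp(x)))   # x*exp(x), the expected second solution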
2.4.3 Fundamental set of solutions of homogeneous ODEs with constant coefficients
Going back to the second order homogeneous ODE with constant coefficients we have
studied at the beginning of this section, we have seen that the solution is given by a
linear combination of the two functions eλ1 x and eλ2 x , where λ1 and λ2 are calculated
by means of Eq. 2.62, namely they are the solutions of the characteristic equation.
In a generic point x0 the Wronskian of this set of solutions is given by:

W = det( e^{λ1x0}  e^{λ2x0} ; λ1e^{λ1x0}  λ2e^{λ2x0} ) = (λ2 − λ1)e^{(λ1+λ2)x0}.    (2.72)
Provided that λ1 ≠ λ2, this Wronskian is always different from 0, for any possible
choice of x0 . Therefore, the functions eλ1 x and eλ2 x constitute a fundamental set
of solutions of the homogeneous ODE with constant coefficients, provided that the
roots of the characteristic equations do not coincide.
We can distinguish therefore 3 possible cases, depending on the nature of the
roots of the characteristic equations.
Distinct real roots
In this case, λ1,2 ∈ R and λ1 ≠ λ2. This happens when the discriminant of the characteristic equation △ = a1² − 4a0a2 > 0. The functions e^{λ1x} and e^{λ2x} are real-valued and can be used to express the general solution of each physical problem.
Example 2.4.1 Find the general solution of the ODE
y ′′ − 5y ′ + 6y = 0
The characteristic equation of this ODE is:

λ² − 5λ + 6 = 0,
⇒ λ = [5 ± √(25 − 24)]/2,
⇒ λ1 = 2, λ2 = 3.
The general solution of the given ODE is thus:
y(x) = c1 e2x + c2 e3x .
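The two steps of the computation (roots of the characteristic equation, then the exponential combination) can be mirrored symbolically. A minimal sympy sketch for Example 2.4.1 (lam is an ad hoc name for λ):

    import sympy as sp

    x, lam = sp.symbols('x lambda')
    y = sp.Function('y')

    # Characteristic equation of y'' - 5y' + 6y = 0
    print(sp.solve(lam**2 - 5*lam + 6, lam))   # [2, 3]

    # dsolve builds the same general solution from these roots
    print(sp.dsolve(y(x).diff(x, 2) - 5*y(x).diff(x) + 6*y(x), y(x)))
    # y(x) = C1*exp(2*x) + C2*exp(3*x)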
Distinct complex roots
We already know (see Sect. 1.3.3) that, when the discriminant of the characteristic equation △ = a1² − 4a0a2 < 0, then the solutions are the complex numbers:

λ1,2 = [−a1 ± i√(−△)]/(2a2),

namely, the two solutions are the two complex conjugate numbers λ1 = −a1/(2a2) − i√(−△)/(2a2) = µ − iα and λ2 = −a1/(2a2) + i√(−△)/(2a2) = µ + iα.
It is still true that the function c1e^{λ1x} + c2e^{λ2x} is the general solution of the given ODE, but in general we might need complex coefficients c1 and c2, and this is to be avoided when we treat physical problems (whose solutions are supposed to be real-valued).
There is a way to avoid it and obtain real-valued solutions. In fact, we have seen
that any linear combination of the fundamental set of solutions of a given ODE is
still a solution. By using the Euler’s formula (Eq. 1.13) we can rewrite the solutions
as:
y1 = eλ1 x = eµx [cos(−αx) + i sin(−αx)] = eµx [cos(αx) − i sin(αx)],
y2 = eλ2 x = eµx [cos(αx) + i sin(αx)].
If we make the sum and the difference of the two solutions y1 and y2 we obtain:

y1 + y2 = e^{µx}[cos(αx) − i sin(αx) + cos(αx) + i sin(αx)] = 2e^{µx} cos(αx),
y1 − y2 = e^{µx}[cos(αx) − i sin(αx) − cos(αx) − i sin(αx)] = −2ie^{µx} sin(αx).

Now, if we take:

u(x) = [y1(x) + y2(x)]/2 = e^{µx} cos(αx),
v(x) = −[y1(x) − y2(x)]/(2i) = e^{µx} sin(αx),
the functions u(x) and v(x) are real-valued and are obtained as linear combinations of
two solutions, therefore are solutions themselves of the given ODE. Do they constitute
a fundamental set of solutions? We have:

W = det( e^{µx}cos(αx)  e^{µx}sin(αx) ; µe^{µx}cos(αx) − αe^{µx}sin(αx)  µe^{µx}sin(αx) + αe^{µx}cos(αx) )
  = e^{2µx}[µ sin(αx)cos(αx) + α cos²(αx) − µ sin(αx)cos(αx) + α sin²(αx)]
  = αe^{2µx}.
If we have α = 0, then we have △ = 0, whereas we have assumed △ < 0. Therefore,
the Wronskian of the functions u(x) and v(x) is always ≠ 0 for any value of x and u
and v form a fundamental set of solutions. The general solution of the ODE can be
thus expressed as:
y(x) = eµx [c1 cos(αx) + c2 sin(αx)].
Example 2.4.2 Solve the initial value problem:

y″ − 8y′ + 17y = 0,
y(0) = 1,
y′(0) = 1.
The characteristic equation of the given ODE is:
λ² − 8λ + 17 = 0,
⇒ λ = 4 ± √(16 − 17) = 4 ± i.    (2.73)
We have therefore µ = 4 and α = 1. The general solution can be written as:
y(x) = e4x (c1 cos x + c2 sin x).
From the initial condition y(0) = 1 we obtain immediately c1 = 1. We have then:
y ′(x) = 4e4x (c1 cos x + c2 sin x) + e4x (−c1 sin x + c2 cos x).
From the initial condition y ′(0) = 1 we obtain 4c1 + c2 = 1, namely c2 = −3. The
required solution is:
y(x) = e4x (cos x − 3 sin x).
Repeated roots
Repeated roots of the characteristic equations occur when the discriminant is zero.
We have seen in Eq. 2.72 that this is the only case for which eλ1 x and eλ2 x do not
form a fundamental set of solutions. This of course makes sense because we have in
this case λ1 = λ2, therefore the functions e^{λ1x} and e^{λ2x} are the same function.

From Eq. 2.62 we know that in this case λ = −a1/(2a2) and therefore a solution of an ODE whose characteristic equation has null discriminant is surely e^{−a1x/(2a2)}. There are different ways to find a second solution, linearly independent from this one. The most widely used is due to D'Alembert and consists in finding a function f(x) so that f(x)e^{−a1x/(2a2)} is also a solution of the ODE a2y″ + a1y′ + a0y = 0. We have:

y′ = −[a1/(2a2)] f e^{−a1x/(2a2)} + f′ e^{−a1x/(2a2)},
y″ = [a1²/(4a2²)] f e^{−a1x/(2a2)} − (a1/a2) f′ e^{−a1x/(2a2)} + f″ e^{−a1x/(2a2)}.

We obtain an ODE in f in which the term e^{−a1x/(2a2)} will cancel out. Without writing it, we obtain:

[a1²/(4a2)] f − a1 f′ + a2 f″ + a1 f′ − [a1²/(2a2)] f + a0 f = 0,
⇒ a2 f″ + [a0 − a1²/(4a2)] f = 0.
The term within brackets is zero because we shall not forget that △ = a1 2 −4a0 a2 = 0,
therefore the ODE has the particularly simple form f ′′ = 0. We can integrate it twice
with respect to x and we obtain that f(x) = c1 + c2x. Namely, the solution can be expressed as y = c1y1 + c2y2 with:

y1 = e^{−a1x/(2a2)},  y2 = xe^{−a1x/(2a2)}.    (2.74)
Are these two functions linearly independent? To check it we have to calculate the Wronskian:

W = det( e^{−a1x/(2a2)}  xe^{−a1x/(2a2)} ; −[a1/(2a2)]e^{−a1x/(2a2)}  [1 − a1x/(2a2)]e^{−a1x/(2a2)} )
  = e^{−a1x/a2} [1 − a1x/(2a2) + a1x/(2a2)] = e^{−a1x/a2} ≠ 0.
This demonstrates that the functions y1 and y2 form indeed a fundamental set of
solutions of the given ODE. A very trivial example of this case is the motion of a body without any force acting on it, namely the solution of the ODE d²x/dt² = 0.
The characteristic equation associated to it is λ2 = 0, which has the repeated roots
λ1,2 = 0. Therefore the solution is given by:
x(t) = c1 e0·t + c2 te0·t = c1 + c2 t.
Of course, this solution could have been found very easily by a double integration,
without need of the characteristic equation.
This procedure of finding a second solution of an ODE once a first one is known
can be applied also to the more general ODE y ′′ + p(x)y ′ + q(x)y = 0. Suppose that
we know a solution y1 (x), namely we know that y1 ′′ + p(x)y1 ′ + q(x)y1 = 0, then it
is easy to find a second solution given by f (x)y1 (x). In fact we have:
f ′′ y1 + 2f ′ y1 ′ + f y1′′ + pf ′ y1 + pf y1 ′ + qf y1 = 0.
Collecting terms we obtain:
f ′′ y1 + f ′ (2y1 ′ + py1 ) + f [y1 ′′ + p(x)y1 ′ + q(x)y1 ] = 0.
But the last addend is zero because we know that y1 is a solution of the given ODE.
We are therefore left with the ODE:
f ′′ y1 + f ′ (2y1 ′ + py1 ) = 0,
(2.75)
which, in spite of the appearance, is a very simple first order ODE in f ′ , from which
we can recover f (x) and from it the second of the fundamental set of solutions y1 (x)
and f (x)y1 (x). This method of finding the fundamental set of solutions given one
particular solution is called reduction of order. In many cases however it is more
convenient to use the direct formula Eq. 2.71.
2.4.4 Second order nonhomogeneous ODEs with constant coefficients
In this section we deal with ODEs of the type:
a2 y ′′ (x) + a1 y ′(x) + a0 y(x) = g(x).
(2.76)
We start with some theoretical results about the generic second order nonhomogeneous ODE:
y ′′ (x) + p(x)y ′ (x) + q(x)y(x) = g(x).
(2.77)
We will call corresponding homogeneous equation the ODE:
y ′′(x) + p(x)y ′(x) + q(x)y(x) = 0.
(2.78)
In order to describe the structure of the solutions of this equation, we need to
demonstrate two important results:
• If Y1 and Y2 are solutions of the nonhomogeneous ODE Eq. 2.77, then Y1 − Y2
is a solution of the corresponding homogeneous equation Eq. 2.78. Moreover,
if y1 and y2 are a fundamental set of solutions of Eq. 2.78, then it is always
possible to find two numbers c1 and c2 such that:
Y1 (x) − Y2 (x) = c1 y1 (x) + c2 y2 (x).
(2.79)
To demonstrate this result it is enough to note that Y1 and Y2 are solutions of
Eq. 2.77, therefore we have:
Y1 ′′ + pY1 ′ + qY1 = g,
Y2 ′′ + pY2 ′ + qY2 = g,
⇒ Y1 ′′ − Y2 ′′ + p(Y1 ′ − Y2 ′ ) + q(Y1 − Y2 ) = 0.
(2.80)
We already know that, since the differential operator is linear, then one has
(Y1 − Y2 )′′ = Y1 ′′ − Y2 ′′ and (Y1 − Y2 )′ = Y1 ′ − Y2 ′ , therefore Eq. 2.80 already
demonstrates that Y1 − Y2 is a solution of the ODE 2.78.
Moreover, we have shown that every solution of an homogeneous ODE like Eq.
2.78 can be expressed as linear combination of the fundamental set of solutions
and this demonstrates the second part of this theorem, namely there must exist
two numbers c1 and c2 such that Y1 (x) − Y2 (x) = c1 y1 (x) + c2 y2 (x).
• The general solution of the nonhomogeneous equation Eq. 2.77 can be expressed
as:
y(x) = c1 y1 (x) + c2 y2 (x) + yp (x),
(2.81)
where y1 and y2 are the fundamental set of solutions of the corresponding homogeneous equation 2.78 and yp (x) is a particular solution of the nonhomogeneous
equation.
This result follows directly from the previous theorem. It is enough to call
Y1 (x) = y(x) (the general solution) and Y2 (x) = yp (x) (some particular solution
of the nonhomogeneous ODE) and from Eq. 2.79 we can directly recover Eq.
2.81.
In order to find the solution of the nonhomogeneous ODE with constant coefficients Eq. 2.76 we have to perform the following 3 steps:
• Find the fundamental set of solutions y1 and y2 of the corresponding homogeneous equation:
a2 y ′′ (x) + a1 y ′(x) + a0 y(x) = 0,
with the methods learned in Sect. 2.4.3. The general solution of this ODE
c1 y1 (x) + c2 y2 (x) is often called complementary solution.
• Find a specific solution yp (x) of the nonhomogeneous ODE. This solution is
called particular solution.
• Sum up the results found in the two preceding steps.
Of course, the difficulty here resides in step two, namely in finding the particular
solutions. The two most widely used methods are called method of variation of
parameters and method of undetermined coefficients.
Method of variation of parameters
This method, attributed to Lagrange, is very powerful because it can be applied to
any ODE, regardless of the form of the function g(x) in Eq. 2.76 but it might be
quite laborious. It is the generalization of the method we have already applied to
first-order linear ODEs (Section 2.2.2).
We start again from Eq. 2.77 and suppose that we know the fundamental set of
solutions y1 and y2 of the corresponding homogeneous equation Eq. 2.78. We have
therefore that the complementary solution is given by:
yc (x) = c1 y1 (x) + c2 y2 (x).
The method of variation of parameters consists in substituting the constants c1 and
c2 with two unknown functions C1 (x) and C2 (x). We try now to determine the two
functions C1 and C2 such that y(x) = C1 (x)y1 (x) + C2 (x)y2 (x) is a solution of the
given ODE. If we differentiate once y with respect to x we obtain:
y ′ = C1 ′ y 1 + C1 y 1 ′ + C2 ′ y 2 + C2 y 2 ′ .
Now, let us assume that the sum of the terms containing the derivatives of C1 and
C2 is zero, namely that:
C1 ′ y1 + C2 ′ y2 = 0.
(2.82)
y ′ = C1 y 1 ′ + C2 y 2 ′ .
(2.83)
Therefore y ′ is simply given by:
If we differentiate once more with respect to x we obtain:
y ′′ = C1 ′ y1 ′ + C1 y1 ′′ + C2 ′ y2 ′ + C2 y2 ′′ .
(2.84)
Now, given the function y(x) = C1 (x)y1 (x) + C2 (x)y2 (x) we have y ′ (Eq. 2.83) and
y ′′ (Eq. 2.84) and we can therefore check under what conditions can y satisfy the
given ODE. We have:
C1 ′ y1 ′ + C1 y1 ′′ + C2 ′ y2 ′ + C2 y2 ′′ + p(C1 y1 ′ + C2 y2 ′ ) + q(C1 y1 + C2 y2 ) = g,
⇒ C1 (y1 ′′ + py1 ′ + qy1 ) + C2 (y2 ′′ + py2 ′ + qy2 ) + C1 ′ y1 ′ + C2 ′ y2 ′ = g.
Since y1 and y2 are solutions of the corresponding homogeneous ODE, the two terms
under brackets are zero. The condition C1 ′ y1 ′ + C2 ′ y2 ′ = g remains, which, together
with Eq. 2.82 forms a system of two linear equations in C1′ and C2′, namely:

C1′y1 + C2′y2 = 0,
C1′y1′ + C2′y2′ = g.    (2.85)
We can solve this system treating C1 ′ and C2 ′ as real numbers. By means of the
Cramer’s rule, we obtain:
C1′ = det( 0  y2 ; g  y2′ ) / det( y1  y2 ; y1′  y2′ ),   C2′ = det( y1  0 ; y1′  g ) / det( y1  y2 ; y1′  y2′ ),
⇒ C1′ = −g y2 / W(y1, y2),   C2′ = g y1 / W(y1, y2),
⇒ C1(x) = −∫ g(x)y2(x) / W(y1, y2)(x) dx + c1,   C2(x) = ∫ g(x)y1(x) / W(y1, y2)(x) dx + c2.    (2.86)
In the end, we can write the solution of the ODE as:
y(x) = c1y1(x) + c2y2(x) − y1(x) ∫_{x0}^{x} g(s)y2(s)/W(y1, y2)(s) ds + y2(x) ∫_{x0}^{x} g(s)y1(s)/W(y1, y2)(s) ds,    (2.87)
where x0 is a conveniently chosen point.
Example 2.4.3 Calculate the general solution of the ODE:
y ′′ − 3y ′ + 2y = ex
The characteristic equation of the corresponding homogeneous ODE is:

λ² − 3λ + 2 = 0 ⇒ λ = [3 ± √(9 − 8)]/2 ⇒ λ1 = 1, λ2 = 2.

The complementary solution is thus given by:

yc(x) = c1e^x + c2e^{2x}.
The Wronskian of the functions y1 and y2 is given by:

W = det( e^x  e^{2x} ; e^x  2e^{2x} ) = e^{3x}.
Applying Eq. 2.86 with g(x) = e^x, W(x) = e^{3x}, y1(x) = e^x, y2(x) = e^{2x} we obtain:

C1(x) = −∫ (e^x · e^{2x}/e^{3x}) dx = −x + c1,
C2(x) = ∫ (e^x · e^x/e^{3x}) dx = ∫ e^{−x} dx = −e^{−x} + c2.
The general solution of the given ODE is thus given by:
y(x) = c1 ex + c2 e2x − xex − ex = (c1 − x − 1)ex + c2 e2x .
Note that the term −e^x in the particular solution is proportional to y1, therefore we can incorporate this function into c1e^x and write the solution in the form:
y(x) = (c1 − x)ex + c2 e2x .
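Eq. 2.86 can be applied mechanically. The sketch below (sympy assumed) recomputes C1, C2 and the resulting particular solution for Example 2.4.3; the integration constants c1, c2 are omitted because they only reproduce the complementary solution.

    import sympy as sp

    x = sp.Symbol('x')

    # Example 2.4.3: y'' - 3y' + 2y = exp(x), with y1 = exp(x), y2 = exp(2*x)
    y1, y2, g = sp.exp(x), sp.exp(2*x), sp.exp(x)
    W = sp.simplify(y1*sp.diff(y2, x) - sp.diff(y1, x)*y2)   # Wronskian: exp(3*x)

    C1 = -sp.integrate(g*y2/W, x)   # -x
    C2 = sp.integrate(g*y1/W, x)    # -exp(-x)
    print(W, C1, C2, sp.simplify(C1*y1 + C2*y2))   # particular solution -(x + 1)*exp(x)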
Example 2.4.4 Solve the ODE:
y″ + 4y = 1/sin(2x).
The characteristic equation of the homogeneous ODE is:
λ2 + 4 = 0 ⇒ λ = ±2i.
The complementary solution is thus:
yc (x) = c1 cos(2x) + c2 sin(2x).
The Wronskian is:

W = det( cos(2x)  sin(2x) ; −2 sin(2x)  2 cos(2x) ) = 2 cos²(2x) + 2 sin²(2x) = 2.

From Eq. 2.86 we have:
C1(x) = −(1/2) ∫ [sin(2x)/sin(2x)] dx = −(1/2)x + c1,
C2(x) = (1/2) ∫ [cos(2x)/sin(2x)] dx = (1/4) ln[sin(2x)] + c2.
The solution we have been looking for is:

y(x) = c1 cos(2x) + c2 sin(2x) − (1/2)x cos(2x) + (1/4) sin(2x) ln[sin(2x)].
Method of the undetermined coefficients
The method of undetermined coefficients requires that we make an assumption about
the type of particular solution yp (x) we are seeking, but with unspecified coefficients.
It is therefore a trial and error method in which, if we have guessed right, we will
be able to determine the coefficients of our trial particular solution; if we have not
guessed right, we will not be able to find the coefficients and that would mean that
there is no solution of the form that we have guessed. We can guess another form of
solution and try again. This method is particularly useful for simple forms of g(x) (in particular exponentials, sines, cosines and polynomials), as the following examples show.
Example 2.4.5 Solve the ODE.
y ′′ − 7y ′ + 12y = 6e2x .
The characteristic equation is:

λ² − 7λ + 12 = 0 ⇒ λ = [7 ± √(49 − 48)]/2 ⇒ λ1 = 3, λ2 = 4.

Since the exponential function reproduces itself through differentiation, we can try a function of the type Ae^{2x} as particular solution of the given ODE. If we try this function we obtain:
4Ae2x − 14Ae2x + 12Ae2x = 6e2x ⇒ 2A = 6 ⇒ A = 3.
We have therefore already found the particular solution yp (x) = 3e2x and the general
solution is:
y(x) = c1 e3x + c2 e4x + 3e2x .
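The undetermined coefficient can also be found by letting the computer compare the two sides. A minimal sympy sketch for Example 2.4.5 (A is the unknown coefficient of the trial solution):

    import sympy as sp

    x, A = sp.symbols('x A')

    # Trial particular solution yp = A*exp(2*x) for y'' - 7y' + 12y = 6*exp(2*x)
    yp = A*sp.exp(2*x)
    eq = sp.Eq(sp.diff(yp, x, 2) - 7*sp.diff(yp, x) + 12*yp, 6*sp.exp(2*x))
    print(sp.solve(eq, A))   # [3], hence yp = 3*exp(2*x)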
Example 2.4.6 Find a particular solution of the ODE:
y ′′ − 4y ′ − 5y = 12e−x .
Following the previous example, the most natural choice seems to be yp (x) = Ae−x .
If we try this function we get:
Ae−x + 4Ae−x − 5Ae−x = 12e−x ,
which cannot be solved. Indeed, the solution of the characteristic equation is:

λ = 2 ± √(4 + 5) ⇒ λ1 = −1, λ2 = 5.
This means that e^{−x} is already part of the complementary solution, therefore a particular solution with the form Ae^{−x} would simply be incorporated in the complementary solution and cannot be a solution of the nonhomogeneous equation. By looking at Example 2.4.3 we might guess that in this case a function with the form yp(x) = Axe^{−x}
can be the right one. We have:
yp ′ = (−Ax + A)e−x = A(1 − x)e−x ,
yp ′′ = (Ax − A − A)e−x = A(x − 2)e−x .
We have thus:
A(x − 2) − 4A(1 − x) − 5Ax = 12 ⇒ −2A − 4A = 12 ⇒ A = −2.
The particular solution is then
From these examples we have seen that if we guess the form of the particular
solution correctly, then the method of undetermined coefficients is straightforward
and very fast, but it might hide complications if the most natural particular solution
to guess is already a solution of the corresponding homogeneous ODE. Moreover, in cases like Example 2.4.4, with g(x) = 1/sin(2x), it is very difficult to guess what the
form of the particular solution might be and the method of the variation of constants
might be the only way to find it out.
In the case in which the function g(x) has the form g(x) = eax then the particular
solution yp (x) has the form
• Aeax if a is not a root of the corresponding homogeneous equation.
• Axeax if a is a root of the corresponding homogeneous equation.
In what other cases can we easily find a particular solution with the method of
the undetermined coefficients? We can check it with the help of some examples.
Example 2.4.7 Find a particular solution of the ODE
y ′′ − 2y ′ − 3y = −5 cos x.
If we try with a function yp (x) = A cos x we will have that the second derivative is
proportional to yp but the first derivative is not! A better guess might be a function
with the form yp (x) = A sin x + B cos x. In this case we have:
yp ′ = A cos x − B sin x,
yp ′′ = −A sin x − B cos x.
Substituting it into the original ODE we get:
−A sin x − B cos x − 2A cos x + 2B sin x − 3A sin x − 3B cos x = −5 cos x.
We collect now the terms containing cos x and sin x and obtain the system of equations:

−4B − 2A = −5,
−4A + 2B = 0.
From the second we get B = 2A, therefore we have:

A = 1/2, B = 1 ⇒ yp(x) = (1/2) sin x + cos x.
Example 2.4.8 Find the particular solution of the ODE:
y ′′ − 5y ′ = 3 − 15x2 .
Since g(x) is a polynomial of second degree, then we can expect to have a particular
solution which is a polynomial of second degree, too. We can try therefore the function
yp (x) = Ax2 + Bx + C. We have yp ′ = 2Ax + B, yp ′′ = 2A. In order yp (x) =
Ax2 + Bx + C to be a solution of the given ODE we should have:
2A − 10Ax − 5B = 3 − 15x2 .
It is evident from this equation that we have no information on the value of the
constant C. The reason for that is the same as in the example with e−x we have
already encountered. Namely, the characteristic equation λ2 − 5λ = 0 has roots 0
and 5. The function yp(x) = Ax² + Bx + C can be seen as the sum of 3 functions. One of them, the constant term C, is proportional to one of the two functions that form the fundamental set of solutions of the corresponding homogeneous equation. That cannot work, and therefore we should change our assumption about the form of the particular
solution. We test now the function yp (x) = x(Ax2 + Bx + C). In this case we have:
yp ′ = 3Ax2 + 2Bx + C,
yp ′′ = 6Ax + 2B.
We have therefore:
6Ax + 2B − 15Ax2 − 10Bx − 5C = 3 − 15x2 .
Now we can equate the coefficients with the same power of x and obtain the system of equations:

−15A = −15,
6A − 10B = 0,
2B − 5C = 3.
This is a system of 3 equations in 3 unknowns; it can be easily solved yielding A = 1, B = 3/5, C = −9/25, therefore the particular solution we were looking for is:

yp(x) = x(x² + (3/5)x − 9/25).
Example 2.4.9 Find the particular solution of the ODE:
y ′′ − 3y ′ = (2x2 − 4x + 1)ex .
The first guess would be to look for a function with the form (Ax2 + Bx + C)ex . After
Example 2.4.8 we might fear that this function could not be the right guess because
λ = 0 is a root of the characteristic equation. This fear is unjustified because of the
factor ex which always multiplies the terms of the polynomial. The only risk would
be therefore if 1 were a solution of the characteristic equation because in this case
e1·x would be already part of the solution of the corresponding homogeneous equation.
We have therefore:
yp (x) = (Ax2 + Bx + C)ex ,
yp ′ = (Ax2 + Bx + C + 2Ax + B)ex = [Ax2 + (B + 2A)x + B + C]ex ,
yp ′′ = [Ax2 + (B + 2A)x + B + C + 2Ax + B + 2A]ex
= [Ax2 + (B + 4A)x + 2A + 2B + C]ex .
Neglecting as usual the term ex we obtain:
Ax2 + (B + 4A)x + 2A + 2B + C − 3Ax2 − 3(B + 2A)x − 3B − 3C = 2x2 − 4x + 1.
Now we can collect coefficients of terms with the same power of x and obtain the system of equations:

−2A = 2,
−2B − 2A = −4,
2A − B − 2C = 1,

which has solution A = −1, B = 3, C = −3. The particular solution is thus:

yp(x) = (−x² + 3x − 3)e^x.
In the end the method of undetermined coefficients can be effective only if the
term g(x) is a function involving exponentials, sines, cosines, polynomials and products of such functions. We can summarize the form of particular solution we should
look at in the following items.
• g(x) = Pn (x), where Pn (x) is a polynomial with degree n.
In this case, the particular solution is:
yp (x) = xm [An xn + An−1 xn−1 + · · · + A1 x + A0 ].
(2.88)
Here A0, . . . , An are the coefficients to determine and m is the multiplicity of 0
as root of the characteristic equation of the homogeneous ODE. That is, if the
ODE is of the form y ′′ + ay ′ = g(x) then 0 is a single root of the characteristic
equation and m = 1. If instead the ODE can be written as y ′′ = g(x) then 0 is
a double root of the characteristic equation and m = 2.
• g(x) = Pn (x)eµx .
The solution we should look at is:
yp (x) = xm [An xn + An−1 xn−1 + · · · + A1 x + A0 ]eµx .
(2.89)
In this case, m depends on the multiplicity of µ as a root of the characteristic
equation.
• g(x) = Pn (x)eµx cos(αx) or g(x) = Pn (x)eµx sin(αx).
This case can be seen as an extension in the complex field of the previous example, since e^{µx}cos(αx) and e^{µx}sin(αx) are linear combinations of e^{(µ+iα)x} and e^{(µ−iα)x}.
The solution we look at in this case is thus:
yp (x) =xm [(An xn + An−1 xn−1 + · · · + A1 x + A0 ) cos(αx)+
+ (Bn xn + Bn−1 xn−1 + · · · + B1 x + B0 ) sin(αx)]eµx ,
(2.90)
where m is the multiplicity of µ + iα as (complex) root of the characteristic
equation.
• g(x) = Pn (x) cos(αx) or g(x) = Pn (x) sin(αx).
This case can be seen as analogous to the previous case, provided that µ = 0.
The solution to look at is:
yp (x) =xm [(An xn + An−1 xn−1 + · · · + A1 x + A0 ) cos(αx)+
+ (Bn xn + Bn−1 xn−1 + · · · + B1 x + B0 ) sin(αx)],
(2.91)
where m is the multiplicity of iα as (imaginary) root of the characteristic
equation.
• g(x) = g1 (x) + g2 (x) + · · · + gn (x), where each of these g1 , . . . gn belongs to
one of the previous items.
In this case we calculate separately the particular functions 1 yp . . . n yp and the
particular function is given by the sum of these partial particular functions,
namely:
yp (x) =1 yp (x) + · · · +n yp (x).
(2.92)
The bottom line here is that the particular solution we look at must always be linearly independent from the solutions of the corresponding homogeneous equation.
2.5 Higher order linear differential equations
In this section we deal with equations involving the n-th derivative (with n > 2) of
some unknown function to be determined. We will only consider linear equations,
therefore the generic n-th order ODE we will consider can be written as:
y (n) (x) + p1 (x)y (n−1) (x) + · · · + pn−1 (x)y ′ + pn (x)y = g(x).
(2.93)
All the results we will present in this section are just a generalization of what
we have learned in the previous section about second-order differential equations.
2.5.1 Homogeneous n-th order ODEs.
The homogeneous n-th order ODE can be written as:
y (n) (x) + p1 (x)y (n−1) (x) + · · · + pn−1 (x)y ′ + pn (x)y = 0.
(2.94)
Also in this case, it is easy to demonstrate that, if y1 , . . . , yn are solutions of the
above equation, then each linear combination of these functions is still a solution of
it.
If we set up an initial value problem, we have to specify the value of the unknown
function y and of its derivatives up to the (n − 1)-th derivative in some point x0 .
Namely, we have to define the following initial conditions:

y(x0) = y0,
y′(x0) = y0′,
...
y^{(n−1)}(x0) = y0^{(n−1)}.    (2.95)
If we know that a set of functions y1 , . . . , yn is solution of the given ODE, we
wish to determine if a set of constants c1 , . . . , cn exists so that y = c1 y1 + · · · + cn yn
is the solution of the initial value problem. To find c1 , . . . , cn we have to solve the
system of equations:

c1y1(x0) + c2y2(x0) + · · · + cnyn(x0) = y0,
c1y1′(x0) + c2y2′(x0) + · · · + cnyn′(x0) = y0′,
...
c1y1^{(n−1)}(x0) + c2y2^{(n−1)}(x0) + · · · + cnyn^{(n−1)}(x0) = y0^{(n−1)}.
In order for this system of equations to have a solution, the determinant of the coefficients
must be different from zero, namely we must have:
W(y1, . . . , yn)(x0) = det( y1(x0)  y2(x0)  . . .  yn(x0) ; y1′(x0)  y2′(x0)  . . .  yn′(x0) ; . . . ; y1^{(n−1)}(x0)  y2^{(n−1)}(x0)  . . .  yn^{(n−1)}(x0) ) ≠ 0.    (2.96)
This is again the Wronskian of the functions y1 , . . . , yn . If a point x0 exists, where
the Wronskian is different from zero, then the functions y1, . . . , yn are linearly independent, form a fundamental set of solutions of the given homogeneous ODE and
each solution can be obtained by a linear combination of y1 , . . . , yn .
A homogeneous n-th order ODE with constant coefficients can be written as:
an y (n) + an−1 y (n−1) + · · · + a1 y ′ + a0 y = 0.
(2.97)
As in the case of second-order ODEs, we look for solutions of the type e^{λx}. Since (d^k/dx^k) e^{λx} = λ^k e^{λx}, we obtain again the characteristic equation:
an λn + an−1 λn−1 + · · · + a1 λ + a0 = 0.
(2.98)
The fundamental set of solutions of the ODE 2.97 is given by the functions eλ1 x , . . . , eλn x ,
where λ1 , . . . , λn are the solutions of the characteristic equation.
These roots can be real or complex. For complex roots we have seen that they
come always in conjugate pairs λ = µi ± iαi and we have already seen that we can
transform the two functions e(µi ±iαi )x into the real-valued solutions eµi x cos(αi x) and
eµi x sin(αi x).
If λ is a repeated root, we have seen that we have to multiply by x the function eλx . With ODEs with order larger than 2 it can happen that the multiplicity
of λ as a solution of the characteristic equation is larger than 2. In this case, if
m is the multiplicity of λ, we have to multiply eλx by x, x2 , . . . , xm−1 in order to
have m linearly independent solutions of the given ODE. If n ≥ 4 then it could
also happen that λ and λ∗ are repeated complex roots. In this case, if m is the multiplicity of µ ± iα as roots of the characteristic equation, then linearly independent solutions are given by the functions e^{µx}cos(αx), xe^{µx}cos(αx), . . . , x^{m−1}e^{µx}cos(αx) and e^{µx}sin(αx), xe^{µx}sin(αx), . . . , x^{m−1}e^{µx}sin(αx).
Example 2.5.1 Find the solution of the ODE:
y (6) + 3y (4) + 3y ′′ + y = 0.
The characteristic equation is:
λ6 + 3λ4 + 3λ2 + 1 = 0.
This is clearly the third power of λ2 + 1. The only roots are therefore ±i (the solutions of λ2 + 1 = 0) and their multiplicity is 3. For what we have said above, the
fundamental set of solutions is given by the functions:
cos x, x cos x, x2 cos x, sin x, x sin x, x2 sin x.
We can write the general solution as:
y(x) = (c1 + c2 x + c3 x2 ) cos x + (c4 + c5 x + c6 x2 ) sin x.
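For higher order constant-coefficient ODEs the same two steps (roots with their multiplicities, then the corresponding products with powers of x) can be checked symbolically. A sketch for Example 2.5.1 (sympy assumed; the integration constants may be labelled differently in the output):

    import sympy as sp

    x, lam = sp.symbols('x lambda')
    y = sp.Function('y')

    # Roots of the characteristic polynomial, with their multiplicities
    print(sp.roots(lam**6 + 3*lam**4 + 3*lam**2 + 1, lam))   # {I: 3, -I: 3}

    ode = y(x).diff(x, 6) + 3*y(x).diff(x, 4) + 3*y(x).diff(x, 2) + y(x)
    print(sp.dsolve(ode, y(x)))
    # y(x) = (C1 + C2*x + C3*x**2)*cos(x) + (C4 + C5*x + C6*x**2)*sin(x)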
Example 2.5.2 Solve the ODE:
y ′′′ + 3y = 0.
The characteristic equation is:

λ³ + 3 = 0 ⇒ λ³ = −3.

We transform −3 into 3e^{i(π+2nπ)}, therefore the characteristic equation has solutions:

λ = ∛3 e^{i(π/3 + 2nπ/3)}.
n can assume the values 0, 1, 2, therefore the roots of the characteristic equation are:
λ1 = ∛3 e^{iπ/3},  λ2 = −∛3,  λ3 = ∛3 e^{i5π/3}.
λ1 and λ3 are complex conjugates, therefore we can take only λ1 and, as before, build the two real linearly independent solutions from its real part µ and imaginary part α. We can write λ1 as:
λ1 = ∛3 e^{iπ/3} = ∛3 [cos(π/3) + i sin(π/3)] = ∛3 (1/2 + i√3/2) = ∛3/2 + i·3^{5/6}/2.

We can thus write the solution as:

y(x) = c1 e^{(∛3/2)x} cos(3^{5/6}x/2) + c2 e^{(∛3/2)x} sin(3^{5/6}x/2) + c3 e^{−∛3·x}.
2.5.2 Nonhomogeneous n-th order ODEs.
Nonhomogeneous n-th order ODEs can be expressed by means of the generic Eq.
2.93. As in the case of second-order ODEs, to find the general solution of this equation we have to first solve the corresponding homogeneous equation (namely we have
to find the complementary solution), then find a particular solution of the nonhomogeneous equation, then sum up the complementary and the particular solution. As
in the case of second-order ODEs the particular solution can be found either with the
method of variation of parameters or with the method of undetermined coefficients.
The method of variation of parameters consists in finding the general solution
of the corresponding homogeneous equation y(x) = c1 y1 (x) + · · · + cn yn (x) and then
replace the constants c1 , . . . , cn with unknown functions C1 (x), . . . , Cn (x). Now we
want to find out how shall we choose C1 (x), . . . , Cn (x) so that y = C1 (x)y1 (x) + · · · +
Cn (x)yn (x) is a solution of the nonhomogeneous ODE. It is:
y ′ = (C1 ′ y1 + · · · + Cn ′ yn ) + (C1 y1 ′ + · · · + Cn yn ′ ).
(2.99)
We assume now that the first term under bracket (C1 ′ y1 +· · ·+Cn ′ yn ) is zero, therefore
we have:
y ′ = C1 y 1 ′ + · · · + Cn y n ′ .
(2.100)
We now keep on differentiating y and assume always that the sum of all the terms
containing Ci ′ is zero. Namely, after m derivatives we assume:
C1 ′ y1 (m−1) + · · · + Cn ′ yn (m−1) = 0,
(2.101)
and, consequently, we will get:
y (m) = C1 y1 (m) + · · · + Cn yn (m) .
(2.102)
This procedure goes on until we find:
y (n) = (C1 y1 (n) + · · · + Cn yn (n) ) + (C1 ′ y1 (n−1) + · · · + Cn ′ yn (n−1) ).
(2.103)
Now, if we make the sum of y (n) + p1 y (n−1) + · · · + pn−1 y ′ + pn y using Eq. 2.102 and
if we collect all the terms containing Ci we obtain:
Ci (yi(n) + p1 yi (n−1) + · · · + pn−1 yi ′ + pn yi ).
(2.104)
But all these terms are zero because yi is a solution of the corresponding homogeneous
ODE. The only term remaining is (C1 ′ y1 (n−1) + · · · + Cn ′ yn (n−1) ) which is therefore
equal to g(x). Because of the equations 2.101 we can now write the system of equations:

C1′y1 + · · · + Cn′yn = 0,
C1′y1′ + · · · + Cn′yn′ = 0,
...
C1′y1^{(n−1)} + · · · + Cn′yn^{(n−1)} = g.    (2.105)
This is a linear system of equations for the unknowns C1 ′ , . . . , Cn ′ , which can be
solved with the Cramer’s rule yielding:
Ci′ = g(x)Wi(x)/W(x),    (2.106)
where W(x) is the Wronskian of the functions y1, . . . , yn and Wi(x) is the determinant of the matrix obtained by replacing the i-th column by the column (0, . . . , 0, 1). In
this way we can express the general solution of the nonhomogeneous ODE as:
y(x) = Σ_{i=1}^{n} yi(x) [ ∫ g(x)Wi(x)/W(x) dx + ci ].    (2.107)
Although the procedure to obtain this result is straightforward, the algebra involved
can be terribly messy, in particular for n > 3.
The method of undetermined coefficients can be applied also to ODEs of order
higher than 2 and it is in most of the cases efficient and fast but it applies as
usual only to polynomials, sines, cosines, exponentials and combinations of these
elementary functions. After the discussion in Sect. 2.4.4, we can limit ourselves to
consider the case:
g(x) = Pn (x)eµx cos(αx) or g(x) = Pn (x)eµx sin(αx).
We have seen that, in this case, the particular solution we should look at has the
form:
yp (x) =xm [(An xn + An−1 xn−1 + · · · + A1 x + A0 ) cos(αx)+
+ (Bn xn + Bn−1 xn−1 + · · · + B1 x + B0 ) sin(αx)]eµx ,
(2.108)
where m is the multiplicity of µ + iα (or µ − iα) as root of the corresponding homogeneous ODE. If the ODE is of the second or third order, then m could be only 0
or 1, but if the ODE is of higher order, then m could be also larger than 1. All the
other cases we can consider are just particular cases of this one, namely:
• g(x) = Pn(x) e^{µx}.
  The particular solution can be obtained from Eq. 2.108 taking α = 0, namely
  yp(x) = x^m (An x^n + A_{n−1} x^{n−1} + ··· + A1 x + A0) e^{µx}; m is the multiplicity of µ
  as a root of the characteristic equation.

• g(x) = e^{µx} cos(αx).
  The particular solution can be obtained from Eq. 2.108, assuming that the
  polynomial Pn(x) has degree 0, namely yp(x) = x^m [A cos(αx) + B sin(αx)] e^{µx}.

• g(x) = Pn(x) cos(αx).
  The particular solution can be obtained from Eq. 2.108 taking µ = 0, namely
  yp(x) = x^m [(An x^n + ··· + A0) cos(αx) + (Bn x^n + ··· + B0) sin(αx)]; m is the
  multiplicity of iα as a root of the characteristic equation.

• g(x) = Pn(x).
  Again we can recover the particular solution from Eq. 2.108 taking µ = 0 and
  α = 0, namely yp(x) = x^m (An x^n + A_{n−1} x^{n−1} + ··· + A1 x + A0); m is the
  multiplicity of 0 as a root of the characteristic equation.
Example 2.5.3 Find a particular solution of the ODE:

  y″ − 4y′ + 4y = e^{2x} cos(x/2).

From what was explained above, the particular solution must have the form
yp(x) = x^m [A e^{2x} cos(x/2) + B e^{2x} sin(x/2)]. Only if the roots of the characteristic
equation were 2 ± i/2 would e^{2x} cos(x/2) be one of the two fundamental solutions of
the corresponding homogeneous ODE; here the characteristic equation λ² − 4λ + 4 = 0
has the double root 2, so we can take m = 0, namely yp(x) = A e^{2x} cos(x/2) +
B e^{2x} sin(x/2). We have (writing c instead of cos(x/2) and s instead of sin(x/2) for
compactness of notation):

  yp(x) = A e^{2x} c + B e^{2x} s,

  yp′(x) = 2A e^{2x} c − (A/2) e^{2x} s + 2B e^{2x} s + (B/2) e^{2x} c
         = (2A + B/2) e^{2x} c + (2B − A/2) e^{2x} s,

  yp″(x) = (4A + B) e^{2x} c − (A + B/4) e^{2x} s + (4B − A) e^{2x} s + (B − A/4) e^{2x} c
         = (2B + 15A/4) e^{2x} c + (−2A + 15B/4) e^{2x} s.
Substituting these expressions into the original ODE and dropping the common factor
e^{2x} (we know already that we will be able to cancel out this term) we obtain:

  (2B + 15A/4) c + (−2A + 15B/4) s − 4(2A + B/2) c − 4(2B − A/2) s + 4A c + 4B s = c.

Collecting the terms with c and s we obtain the system of equations:

  15A/4 + 2B − 8A − 2B + 4A = 1
  −2A + 15B/4 + 2A − 8B + 4B = 0

It is easy to find the solutions of this system, namely B = 0 and A = −4, therefore
the particular solution of the given ODE is:

  yp(x) = −4 e^{2x} cos(x/2).
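As a quick check (not part of the text), the following sympy sketch verifies that the solution just found does satisfy the ODE of Example 2.5.3.

```python
# Verify that yp = -4*e^{2x}*cos(x/2) solves y'' - 4y' + 4y = e^{2x}*cos(x/2).
import sympy as sp

x = sp.symbols('x')
yp = -4*sp.exp(2*x)*sp.cos(x/2)
lhs = yp.diff(x, 2) - 4*yp.diff(x) + 4*yp
print(sp.simplify(lhs - sp.exp(2*x)*sp.cos(x/2)))  # -> 0
```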
Example 2.5.4 Find a particular solution of the ODE:

  y^{(4)} − 2y‴ + y″ = −2e^x.

The characteristic equation is:

  λ⁴ − 2λ³ + λ² = 0 ⇒ (λ − 1)² λ² = 0.

Therefore, the roots are 0 and 1, both double. From what we have learned, the particular
solution to look for is yp(x) = A x² e^x. We have:

  yp(x) = A x² e^x
  yp′(x) = A(x² + 2x) e^x
  yp″(x) = A(x² + 2x + 2x + 2) e^x = A(x² + 4x + 2) e^x
  yp‴(x) = A(x² + 4x + 2 + 2x + 4) e^x = A(x² + 6x + 6) e^x
  yp^{(4)}(x) = A(x² + 6x + 6 + 2x + 6) e^x = A(x² + 8x + 12) e^x.

Substituting these functions into the original ODE (and cancelling out e^x) we obtain:

  A(x² + 8x + 12) − 2A(x² + 6x + 6) + A(x² + 4x + 2) = −2.

The terms with x and x² cancel out; what remains is 2A = −2, namely A = −1.
The particular solution is thus:

  yp(x) = −x² e^x.
2.5.3 The D-operator

We now introduce yet another method for finding particular solutions of ODEs, which
makes use of the properties of the derivative as an operator. We define

  D ≡ d/dx,   (2.109)

namely D always operates on a function f and produces its derivative. For this
reason Df ≠ fD, because Df is a function (f′) whereas fD is an operator. From
the basic properties of the derivative we know that the operator D is linear, namely:

  D(c1 f1 + ··· + cn fn) = c1 Df1 + ··· + cn Dfn,   (2.110)

where c1, ..., cn are constants and f1, ..., fn functions. It is even possible to define
the inverse D^{−1} of the operator D; we only require that:

  D^{−1}(Df) = f.   (2.111)

It is therefore easy to recognize that D^{−1} indicates the operation of integration (the
inverse of differentiation).

At the same time we can define D^n as the n-th order derivative and, consequently,
D^{−n} as the operator that integrates a given function n times. It is also easy to give
a meaning to the operator P(D), where P is a generic polynomial. We have in fact:

  P(D)y = (an D^n + ··· + a1 D + a0) y = an y^{(n)} + ··· + a1 y′ + a0 y,   (2.112)

namely P(D)y is a linear combination of derivatives of y, and therefore the equation
P(D)y = g is a linear nonhomogeneous ODE. If we are able to invert the operator
P(D), then the particular solution of the ODE P(D)y = g can simply be expressed as:

  yp = [P(D)]^{−1} g = (1/P(D)) g.   (2.113)

The operator P(D) can be inverted only for some particular functions g, which
we analyze in the following items:

• g(x) = e^{µx}
  In this case we have:

    P(D) e^{µx} = P(µ) e^{µx}   (2.114)
    (1/P(D)) e^{µx} = (1/P(µ)) e^{µx}   (2.115)

  The first relation is clear because of the relation D^m e^{µx} = µ^m e^{µx} (see also
  how we recover the characteristic equation from a given homogeneous ODE
  with constant coefficients, cf. Eqs. 2.97 and 2.98). The second relation is also
  clear because we have transformed the operator P(D) into the number P(µ), and
  therefore the inverse of P(D) is simply the inverse of this number.

  It is important to note that, if P(µ) = 0, then this result cannot be applied. In
  fact, we know already from the method of undetermined coefficients that, if µ is
  a solution of the characteristic equation (which is equivalent to saying that
  P(µ) = 0), then the particular solution is not proportional to e^{µx} but to x^m e^{µx}.
• g(x) = f(x) e^{µx}
  It turns out that:

    P(D)[f(x) e^{µx}] = e^{µx} P(D + µ) f(x)   (2.116)
    (1/P(D))[f(x) e^{µx}] = e^{µx} (1/P(D + µ)) f(x)   (2.117)

  The first relation can be demonstrated as follows: given two functions f and g
  we know already that the n-th derivative of fg can be obtained with the help
  of Pascal's triangle, namely:

    d^n/dx^n [f(x)g(x)] = Σ_{k=0}^{n} [n!/(k!(n−k)!)] (d^k f(x)/dx^k) · (d^{n−k} g(x)/dx^{n−k})
    ⇒ D^n(fg) = Σ_{k=0}^{n} [n!/(k!(n−k)!)] D^k f · D^{n−k} g.   (2.118)

  If g(x) = e^{µx} then we know already that D^j g = µ^j e^{µx}, therefore:

    D^n(f e^{µx}) = e^{µx} Σ_{k=0}^{n} [n!/(k!(n−k)!)] D^k f · µ^{n−k} = e^{µx} (D + µ)^n f.   (2.119)
If we put together a polynomial in D, namely many terms like the one above
with different exponents n and some multiplicative coefficients, we will obtain
e^{µx} times a polynomial in D + µ, demonstrating therefore Eq. 2.116. To
demonstrate Eq. 2.117 we can call h(x) = P(D + µ) f(x), namely f(x) =
(1/P(D + µ)) h(x). From Eq. 2.116 we have:

  P(D) [ e^{µx} (1/P(D + µ)) h(x) ] = e^{µx} h(x).

If we now apply the operator 1/P(D) to both members of this equation we obtain:

  e^{µx} (1/P(D + µ)) h(x) = (1/P(D)) [ e^{µx} h(x) ],

which is exactly Eq. 2.117 with f replaced by h.
• g(x) = cos(αx)
  It turns out that:

    P(D²) cos(αx) = P(−α²) cos(αx)   (2.120)
    (1/P(D²)) cos(αx) = (1/P(−α²)) cos(αx)   (2.121)

  The first relation can be demonstrated as follows:

    D⁰ cos(αx) = cos(αx)
    D² cos(αx) = −α² cos(αx)
    D⁴ cos(αx) = D²[D² cos(αx)] = (−α²)² cos(αx)
    ⋮
    D^{2n} cos(αx) = (−α²)^n cos(αx).

  Therefore, a polynomial in D² applied to the function cos(αx) is equivalent to
  cos(αx) times the polynomial evaluated at −α². With this, Eq. 2.120 is demonstrated.
  To demonstrate Eq. 2.121 it is enough to notice that we have transformed P(D²)
  into a number (once α is assigned) and we can simply invert it as we invert numbers.

  If P(−α²) = 0 (namely if iα or −iα are roots of the characteristic equation),
  this method cannot be applied.

• g(x) = sin(αx)
  This case is analogous to the previous one, therefore we have:

    P(D²) sin(αx) = P(−α²) sin(αx)   (2.122)
    (1/P(D²)) sin(αx) = (1/P(−α²)) sin(αx)   (2.123)

• g(x) = x^n
  In this case it is always possible to find some coefficients b0, ..., bn such that:

    (1/P(D)) x^n = (b0 + b1 D + ··· + bn D^n) x^n.   (2.124)

  In fact, if we divide the number 1 by the polynomial P(D) we obtain a power
  series in D with infinitely many terms. However, since D^m x^n = 0 for
  any m > n, we can truncate the series at the n-th degree.
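A small sympy sketch (not from the text) of the rule for g(x) = x^n in Eq. 2.124 follows; the helper invert_P_on_poly is a hypothetical name, and the sketch assumes P(0) ≠ 0 so that 1/P(D) has a Taylor expansion in D.

```python
# Apply 1/P(D) to a polynomial by truncating the power series of 1/P at its degree.
import sympy as sp

x, D = sp.symbols('x D')

def invert_P_on_poly(P, poly, x):
    n = sp.degree(poly, x)
    series = sp.series(1/P, D, 0, n + 1).removeO()      # b0 + b1*D + ... + bn*D^n
    coeffs = sp.Poly(series, D).all_coeffs()[::-1]       # [b0, b1, ...]
    return sum(b*sp.diff(poly, x, k) for k, b in enumerate(coeffs))

# Example: a particular solution of y'' - y' + y = x^3 is (1/(D^2 - D + 1)) x^3
P = D**2 - D + 1
print(sp.expand(invert_P_on_poly(P, x**3, x)))   # -> x**3 + 3*x**2 - 6
```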
The application of the D-operator to the solution of ODEs will become clearer after
some examples.
Example 2.5.5 Apply the method of the D-operator to find a particular solution of
the ODE:

  y″ − y′ + y = x³ e^x.

We try to invert the operator P(D) = D² − D + 1 and find the function:

  yp = (1/(D² − D + 1)) {x³ e^x}.

With the help of Eq. 2.117 this transforms into:

  yp = e^x (1/((D + 1)² − (D + 1) + 1)) x³ = e^x (1/(D² + D + 1)) x³.

We now divide the number 1 by the polynomial 1 + D + D² and obtain:

  1/(1 + D + D²) = 1 − D + D³ − ...,

where the terms of degree higher than 3 can be discarded because they annihilate x³.
The solution is therefore:

  yp(x) = e^x (1 − D + D³) x³ = e^x (x³ − 3x² + 6).
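The result of Example 2.5.5 can be checked with a short sympy sketch (not part of the text):

```python
# Verify that yp = e^x*(x^3 - 3x^2 + 6) satisfies y'' - y' + y = x^3*e^x.
import sympy as sp

x = sp.symbols('x')
yp = sp.exp(x)*(x**3 - 3*x**2 + 6)
print(sp.simplify(yp.diff(x, 2) - yp.diff(x) + yp - x**3*sp.exp(x)))  # -> 0
```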
Example 2.5.6 Use the method of the D-operator to find the particular solution of
the ODE:

  y‴ + y″ + y′ + y = 15 cos(2x).

We have:

  yp = 15 (1/(D³ + D² + D + 1)) cos(2x)
     = 15 (1/((D + 1)(D² + 1))) cos(2x)
     = 15 (1/(D + 1)) · (1/(−2² + 1)) cos(2x)   (because of Eq. 2.121)
     = −5 ((D − 1)/(D² − 1)) cos(2x)
     = −5 (D − 1) (1/(−2² − 1)) cos(2x)   (again Eq. 2.121)
     = (D − 1) cos(2x) = −2 sin(2x) − cos(2x).
Example 2.5.7 With the method of the D-operator find the solution of the ODE:

  y″ − 4y′ + 3y = e^{2x}(cos x + 1).

We can rewrite this ODE as:

  (D² − 4D + 3) y = e^{2x}(cos x + 1).

We have:

  y = (1/(D² − 4D + 3)) {e^{2x}(cos x + 1)}
    = (1/(D² − 4D + 3)) e^{2x} + (1/(D² − 4D + 3)) {e^{2x} cos x}
    = (1/(2² − 4·2 + 3)) e^{2x} + e^{2x} (1/((D + 2)² − 4(D + 2) + 3)) cos x
    = −e^{2x} + e^{2x} (1/(D² − 1)) cos x
    = −e^{2x} + e^{2x} (1/(−1² − 1)) cos x
    = −e^{2x} (1 + (1/2) cos x).
As it is perhaps evident from these examples, the method of the D-operator is
quite powerful and versatile, but it requires some ability and a fair amount of practice
to get familiar with its use. In any case, for functions that cannot be expressed as
combinations of sines, cosines, polynomials and exponentials, the only viable method
remains the messy method of the variation of parameters.
We have seen, however, that this method cannot be applied in the cases in which
P(µ) = 0 (when calculating (1/P(D)) e^{µx}) or P(−α²) = 0 (when calculating
(1/P(D²)) sin(αx) or cos(αx)). What can we do in these cases? In the case in which
we want to calculate (1/P(D)) e^{µx} with P(µ) = 0 we can write P(D) = (D − µ)^m ∆(D),
where m is the multiplicity of µ as root of the polynomial P(D) and ∆(D) is a
polynomial in D which does not have µ as a root. Now we can apply the rule
(1/P(D)){e^{µx} f(x)} = e^{µx} (1/P(µ + D)) f(x) and obtain:

  (1/P(D)) e^{µx} = (1/((D − µ)^m ∆(D))) e^{µx}
                  = e^{µx} (1/(D^m ∆(µ + D))) · 1 = e^{µx} (1/(D^m ∆(µ + D))) · e^{0·x}
                  = e^{µx} (1/D^m) (1/∆(0 + µ))
                  = (e^{µx}/∆(µ)) · (x^m/m!).   (2.125)

The last step is justified by the fact that the only operator left is 1/D^m; it operates
on 1 and produces its m-th integral, which is x^m/m!.
Example 2.5.8 Find the particular solution of the ODE:

  y^{(4)} + 3y‴ + 3y″ + y′ = e^{−x}.

With the help of the D-operator we can rewrite this ODE as:

  (D⁴ + 3D³ + 3D² + D) y = e^{−x} ⇒ D(D + 1)³ y = e^{−x}.

−1 is therefore a root of the polynomial P(D) with multiplicity 3 and ∆(D) = D.
Applying the method just learned we can write:

  y = (1/(D(D + 1)³)) e^{−x} = e^{−x} (1/(D³(D − 1))) · 1 = e^{−x} (1/D³)(−1).

If we integrate −1 three times we obtain −x³/6, therefore the particular solution is:

  yp(x) = −e^{−x} x³/6.
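A sympy sketch (not from the text) confirming Example 2.5.8, the degenerate case with P(µ) = 0:

```python
# Verify that yp = -e^{-x}*x^3/6 solves y'''' + 3y''' + 3y'' + y' = e^{-x}.
import sympy as sp

x = sp.symbols('x')
yp = -sp.exp(-x)*x**3/6
lhs = yp.diff(x, 4) + 3*yp.diff(x, 3) + 3*yp.diff(x, 2) + yp.diff(x)
print(sp.simplify(lhs - sp.exp(-x)))  # -> 0
```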
To apply the operator 1/P(D²) to cos(αx) (or sin(αx)) in the case in which
P(−α²) = 0 it is more convenient to substitute α with α + ε (with ε small) and make
the Taylor series expansion of cos[(α + ε)x]. We obtain:

  (1/P(D²)) cos(αx) = (1/P(D²)) cos[(α + ε)x]
                    = (1/P[−(α + ε)²]) cos[(α + ε)x]
                    = (1/P[−(α + ε)²]) Σ_{n=0}^{∞} cos^{(n)}(αx) (εx)^n/n!.   (2.126)

In the limit ε → 0 we obtain our result.
Example 2.5.9 With the help of the method of the D-operator find the particular
solution of the ODE:

  y″ + y = cos x.

We have P(D) = D² + 1 and clearly −1² + 1 = 0. We therefore increment 1 by a
tiny amount ε and obtain:

  y = (1/(D² + 1)) cos x = (1/(−(1 + ε)² + 1)) cos[(1 + ε)x]
    = (1/(−ε(ε + 2))) Σ_{n=0}^{∞} cos^{(n)}(x) (εx)^n/n!
    = (1/(−ε(ε + 2))) [cos x − (εx) sin x − ((εx)²/2) cos x + ...]
    = −cos x/(ε(ε + 2)) + x sin x/(ε + 2) + εx² cos x/(2(ε + 2)) + ...   (2.127)

The first term of this sum does not concern us because a term proportional to cos x
is already part of the complementary solution of the given ODE. The third term (and
all the following terms) of the right hand side tend to 0 for ε → 0, therefore the only
term left is:

  x sin x/(ε + 2) → x sin x/2,

which is therefore the particular solution we have been looking for.
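The resonant result of Example 2.5.9 can be verified directly with sympy (not part of the text):

```python
# Verify that yp = x*sin(x)/2 satisfies y'' + y = cos(x), the case P(-alpha^2) = 0.
import sympy as sp

x = sp.symbols('x')
yp = x*sp.sin(x)/2
print(sp.simplify(yp.diff(x, 2) + yp - sp.cos(x)))  # -> 0
```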
2.5.4 The Euler linear equations
In this subsection and in the following we deal with linear differential equations with
non-constant coefficients. At variance with the ODEs with constant coefficients,
there is no general theory to find the solutions and in most of the cases the solutions
cannot be expressed in terms of simple elementary functions. In many cases it is
possible to obtain a solution of the given ODEs by clever substitutions, but there is
no general rule on how to find the right substitution.
The Euler equation is one of the simplest cases in which an ODE with variable
coefficients can be solved. It has the form:
  an x^n D^n y + a_{n−1} x^{n−1} D^{n−1} y + ··· + a1 x Dy + a0 y = g(x),   (2.128)

namely the derivative of the m-th order is multiplied by x^m and by a constant. This
ODE can also be written in compact notation as P(xD)y = g. We can solve this
equation by means of the substitution x = e^s, namely s = ln x and s′(x) = 1/x. We
have:
  Dy = dy/dx = (dy/ds)(ds/dx) = (1/x)(dy/ds)
  ⇒ x Dy = y′(s).

If we now denote by δ the operator of differentiation with respect to s, from the last
relation we also have the correspondence between the two operators D and δ:

  D = (1/x) δ.

Higher derivatives are given by:

  D²y = D[(1/x) δy] = −(1/x²) δy + (1/x) Dδy = (1/x²)(δ²y − δy) = (1/x²) δ(δ − 1) y
  ⇒ x² D² y = δ(δ − 1) y

  D³y = D[(1/x²) δ(δ − 1) y] = −(2/x³) δ(δ − 1) y + (1/x³) δ[δ(δ − 1)] y
  ⇒ x³ D³ y = δ(δ − 1)(δ − 2) y
  ⋮
  x^n D^n y = δ(δ − 1)(δ − 2) ··· (δ − n + 1) y.   (2.129)

In this way we can transform the original ODE into an ODE with constant coefficients
in which the independent variable is s, namely:

  bn δ^n y + b_{n−1} δ^{n−1} y + ··· + b1 δy + b0 y = g(e^s).   (2.130)
Example 2.5.10 Solve the ODE:

  x² y″ + x y′ + y = sin(ln x²).

With the substitution x = e^s and using the relations xDy = δy, x²D²y = δ(δ − 1)y
we obtain:

  [δ(δ − 1) + δ + 1] y = sin(2s) ⇒ (δ² + 1) y = sin(2s).

The complementary solution of this ODE can be obtained from the roots of the
characteristic equation λ² + 1 = 0, namely:

  yc(s) = c1 cos s + c2 sin s.

The particular solution can be obtained with the method of the D-operator, namely:

  yp(s) = (1/(δ² + 1)) sin(2s) = (1/(−2² + 1)) sin(2s) = −(1/3) sin(2s).

The solution of the given ODE is thus:

  y(s) = c1 cos s + c2 sin s − (1/3) sin(2s)
  ⇒ y(x) = c1 cos(ln x) + c2 sin(ln x) − (1/3) sin(ln x²).
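The general solution of Example 2.5.10 can be checked with sympy (not from the text); the sketch uses sin(ln x²) = sin(2 ln x) and assumes x > 0.

```python
# Verify the claimed general solution of x^2*y'' + x*y' + y = sin(ln(x^2)).
import sympy as sp

x = sp.symbols('x', positive=True)
c1, c2 = sp.symbols('c1 c2')
y = c1*sp.cos(sp.log(x)) + c2*sp.sin(sp.log(x)) - sp.sin(2*sp.log(x))/3
lhs = x**2*y.diff(x, 2) + x*y.diff(x) + y
print(sp.simplify(lhs - sp.sin(2*sp.log(x))))  # -> 0
```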
2.5.5 Series solutions of linear equations
To deal with the large class of ODEs with variable coefficients we have to extend our
search for solutions beyond the familiar elementary functions. The basic idea of the
series solution is similar to the method of the undetermined coefficients: we assume
that the solutions of a given ODE have power series expansions and then we attempt
to determine the coefficients of the series so as to satisfy the ODE.
Example 2.5.11 Find a series solution of the equation:
y ′′ + y = 0.
We know already that this ODE has solution c1 cos x + c2 sin x, but this example
illustrates well the use of power series for the solution of ODEs. We look for solutions
in the form of power series about x0 = 0, namely solutions of this kind:
  y(x) = a0 + a1 x + a2 x² + a3 x³ + ··· = Σ_{n=0}^{∞} an x^n.

Differentiating it term by term we obtain:

  y′ = a1 + 2a2 x + 3a3 x² + ··· = Σ_{n=1}^{∞} n an x^{n−1}
  y″ = 2a2 + 6a3 x + ··· = Σ_{n=2}^{∞} n(n − 1) an x^{n−2}.

Now we should substitute y and y″ into the original ODE. Before doing that, we have
to rewrite one of the two series so that both series display the same generic term. We
do that by replacing n by n + 2 in the series expressing y″ and therefore we obtain:

  y″ = Σ_{n=0}^{∞} (n + 2)(n + 1) a_{n+2} x^n.

In this way, the original ODE can be transformed into:

  Σ_{n=0}^{∞} [(n + 2)(n + 1) a_{n+2} + an] x^n = 0.

For this equation to be satisfied, the coefficients of each power of x must be zero,
namely we must have:

  (n + 2)(n + 1) a_{n+2} + an = 0,  ∀n.   (2.131)

This relation is called the recurrence relation. It is evident from this relation that we
cannot have any information on a0 and a1, but if we give the first two coefficients a
value, then all the even coefficients can be obtained recurrently from a0 and all the
odd coefficients can be obtained recurrently from a1, namely we have:
  a2 = −a0/(2·1) = −a0/2!,  a4 = −a2/(4·3) = a0/4!,  a6 = −a4/(6·5) = −a0/6!
  a3 = −a1/(3·2) = −a1/3!,  a5 = −a3/(5·4) = a1/5!,  a7 = −a5/(7·6) = −a1/7!

The solution is therefore:

  y(x) = a0 (1 − x²/2! + x⁴/4! − x⁶/6! + ...) + a1 (x − x³/3! + x⁵/5! − x⁷/7! + ...).
It is not difficult to recognize the Taylor series expansions of cos x and sin x in the
two sums inside the brackets, therefore the solution of the ODE (as we already knew)
is:
y(x) = a0 cos x + a1 sin x.
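A tiny Python sketch (not from the text) shows how the recurrence relation (2.131) reproduces these coefficients; the choice a0 = 1, a1 = 0 is a hypothetical example that yields the cos x series.

```python
# Iterate a_{n+2} = -a_n / ((n+2)(n+1)) starting from a0 = 1, a1 = 0.
from fractions import Fraction

a = [Fraction(1), Fraction(0)]
for n in range(10):
    a.append(-a[n] / ((n + 2)*(n + 1)))

print(a[:8])   # -> [1, 0, -1/2, 0, 1/24, 0, -1/720, 0], the Taylor coefficients of cos(x)
```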
This example shows us that, in order to find the solution of an ODE in power
series, the main point is to find the recurrence relation (Eq. 2.131), namely to
determine recursively all the coefficients of the series as functions of the first ones.
We shall concentrate from now on only on second-order equations. In fact, all
the important ODEs for which in the history of mathematics a power series solution
has been found (Legendre’s equation, Hermite’s equation, Bessel functions etc.) are
second-order ODEs. With little work all the following results can be generalized to
n-th order ODEs. Given a generic second-order linear homogeneous ODE:
y ′′(x) + p(x)y ′ + q(x)y = 0,
the Frobenius and Fuchs theorem describes the nature of the solutions according to
the properties of the functions p and q.
• If the functions p(x) and q(x) admit a convergent Taylor series expansion about
  a point x0 (namely, if they are analytic there), then it is always possible to find a
  solution of the kind

    y(x) = Σ_{n=0}^{∞} an (x − x0)^n.   (2.132)

  In fact, it is enough to proceed as in Example 2.5.11, find the recurrence relation
  of the given ODE (taking into account the Taylor series expansions of the
  functions p and q) and solve it. Eq. 2.132 will unavoidably lead to two (linearly
  independent) functions y1(x) and y2(x). A point x0 with such properties
  (namely that p and q are analytic at it) is said to be ordinary; otherwise it is
  said to be singular.
• If the point x0 is singular, but the functions

    g(x) = (x − x0) p(x),  h(x) = (x − x0)² q(x)

  are analytic at x0, then the point x0 is a regular singular point. In this case,
  the ODE admits at least one solution of the form:

    y(x) = (x − x0)^p Σ_{n=0}^{∞} an (x − x0)^n = Σ_{n=0}^{∞} an (x − x0)^{n+p}.   (2.133)

  If we substitute this equation into the original ODE, we obtain a quadratic
  equation in p called the indicial equation.

• If x0 is not a regular singular point, then it is said to be irregular and a solution of
  the given ODE may not exist.
Given an ODE of the form:

  y″ + p(x) y′ + q(x) y = 0,

if the point x = 0 is a regular singular point, then the functions g(x) = x p(x) and
h(x) = x² q(x) are analytic at x = 0, namely they admit Taylor series expansions:

  x p(x) = g(x) = Σ_{n=0}^{∞} gn x^n,  x² q(x) = h(x) = Σ_{n=0}^{∞} hn x^n.   (2.134)

The original ODE can thus be written as:

  y″ + (g(x)/x) y′ + (h(x)/x²) y = 0.   (2.135)

From Eq. 2.133 we know that we must look for at least one solution of the form:

  y = Σ_{n=0}^{∞} an x^{n+p}.

Differentiating it with respect to x we obtain:

  y′ = Σ_{n=0}^{∞} (n + p) an x^{n+p−1}   (2.136)
  y″ = Σ_{n=0}^{∞} (n + p)(n + p − 1) an x^{n+p−2}.   (2.137)

Substituting Eqs. 2.136 and 2.137 into the original ODE we obtain:

  Σ_{n=0}^{∞} (n + p)(n + p − 1) an x^{n+p−2} + g(x) Σ_{n=0}^{∞} (n + p) an x^{n+p−2} + h(x) Σ_{n=0}^{∞} an x^{n+p−2} = 0.

Dividing this equation by x^{p−2} we obtain:

  Σ_{n=0}^{∞} [(n + p)(n + p − 1) + g(x)(n + p) + h(x)] an x^n = 0.   (2.138)
For this equation to be satisfied, the coefficients of each power of x must be zero,
including x⁰. The terms with x⁰ are obtained when n = 0 in the above equation
and when the first terms g0 and h0 of the series expansions of g(x) and h(x) are
taken. These are, by the way, also the values of the functions g(x) and h(x) at the
point x = 0, namely g0 = g(0) and h0 = h(0). Equating these coefficients to zero we
obtain:

  p(p − 1) a0 + g0 p a0 + h0 a0 = 0.

Assuming that a0 ≠ 0 we can cancel it out and obtain the indicial equation:

  p² + (g0 − 1) p + h0 = 0.   (2.139)
(2.139)
This is a second order equation, which has therefore two roots p1 and p2 . According
to the nature of these roots we can have the following cases (which we mention
without demonstrating):
• Real distinct roots not differing by an integer
  In this case the two series solutions of the given ODE are both of the form of
  Eq. 2.133, namely:

    y1 = Σ_{n=0}^{∞} an x^{n+p1},  y2 = Σ_{n=0}^{∞} bn x^{n+p2}.

• Double roots
  In this case we still have a solution of the form of Eq. 2.133,

    y1 = Σ_{n=0}^{∞} an x^{n+p},

  whereas the second solution is given by:

    y2 = y1 ln x + Σ_{n=0}^{∞} bn x^{n+p}.   (2.140)

• Roots differing by an integer
  Once again (as shown in the theorem of Frobenius and Fuchs) we have a solution
  of the form of Eq. 2.133, namely:

    y1 = Σ_{n=0}^{∞} an x^{n+p1}.

  The second solution is given by:

    y2 = k y1 ln x + Σ_{n=0}^{∞} bn x^{n+p2},   (2.141)

  namely, it differs from the previous case only by a constant k, which can also
  turn out to be zero.
Once the nature of the solutions has been established, we must determine the
coefficients aj and bj by means of the appropriate recurrence relations; if we are
lucky, the series Σ_{n=0}^{∞} an x^n and Σ_{n=0}^{∞} bn x^n will be the Taylor series expansions
of some known elementary functions; otherwise the series solution will remain defined
by its recurrence relation.
Example 2.5.12 Find the power series solutions of the ODE:

  x y″ − 2y′ + 9x⁵ y = 0,

and verify that the two solutions are linearly independent.

We can rewrite this ODE in the form:

  y″ − (2/x) y′ + 9x⁴ y = 0.

x = 0 is therefore a regular singular point because the functions g(x) = x p(x) = −2
and h(x) = x² q(x) = 9x⁶ are analytic at x = 0. We have g(0) = −2 and h(0) = 0,
therefore the indicial equation is:

  p² + [g(0) − 1] p + h(0) = 0 ⇒ p² − 3p = 0.

This equation has roots p1 = 0 and p2 = 3. At least one of these indices must lead to
a solution of the form y(x) = Σ_{n=0}^{∞} an x^{n+p}. If we take p = 3, by means of Eq. 2.138
we obtain:

  Σ_{n=0}^{∞} [(n + 3)(n + 2) − 2(n + 3) + 9x⁶] an x^n = 0
  ⇒ Σ_{n=0}^{∞} [n(n + 3) an x^n + 9 an x^{n+6}] = 0.

The second term of this sum has an exponent different from the first one. In order
to compare terms with like powers of x we have to lower the index of an in the
term 9 an x^{n+6} by six units. In this way we obtain the recurrence relation:
  n(n + 3) an + 9 a_{n−6} = 0 ⇒ an = −9 a_{n−6}/(n(n + 3)).

Namely, given an initial value a0 we can determine the coefficients a6, a12 and so
on; nothing can be said about the intermediate coefficients a1, a2, ... (which can be
put to 0). Let us therefore put n = 6k and rewrite the recurrence relation as:

  a_{6k} = −9 a_{6k−6}/(6k(6k + 3)) = −a_{6k−6}/(2k(2k + 1)) = a_{6k−12}/((2k + 1)2k(2k − 1)) = ...

It is now more evident what the recurrence relation looks like, namely:

  a_{6k} = ((−1)^k/(2k + 1)!) a0.

Therefore, the first solution of the given ODE is:

  y1(x) = Σ_{n=0}^{∞} a_{6n} x^{6n+3} = a0 Σ_{n=0}^{∞} ((−1)^n/(2n + 1)!) x^{3(2n+1)}.

If we substitute now z = x³ it is not difficult to recognize in this series the Taylor
expansion of sin z, therefore the first solution is:

  y1(x) = a0 sin(x³).
As we have said, the second solution might or might not be of the form Σ_{n=0}^{∞} bn x^{n+p}.
If we assume it to be of this form and proceed as in the previous case (but with p = 0
this time), we obtain:

  n(n − 1) bn − 2n bn + 9 b_{n−6} = 0 ⇒ bn = −9 b_{n−6}/(n(n − 3)).

Again we write 6k instead of n and obtain:

  b_{6k} = −9 b_{6k−6}/(6k(6k − 3)) = −b_{6k−6}/(2k(2k − 1)) = ((−1)^k/(2k)!) b0.

In this way, the second solution can be written as:

  y2(x) = b0 Σ_{n=0}^{∞} ((−1)^n/(2n)!) x^{6n} = b0 cos(x³).
The Wronskian of the two functions is:

  W(y1, y2) = y1 y2′ − y2 y1′ = a0 sin(x³)(−3x²) b0 sin(x³) − b0 cos(x³)(3x²) a0 cos(x³)
            = −3 a0 b0 x² ≠ 0.

We could have also calculated the Wronskian from Abel's theorem,

  W = C e^{−∫ p(x̃) dx̃} = C e^{2∫ (1/x̃) dx̃} = C x²,

obtaining therefore the same function.
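The closed-form solutions and the Wronskian of Example 2.5.12 can be confirmed with a sympy sketch (not part of the text; a0 = b0 = 1 is assumed for simplicity):

```python
# Check that sin(x^3) and cos(x^3) solve x*y'' - 2*y' + 9*x^5*y = 0,
# and that their Wronskian is -3*x^2.
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.sin(x**3), sp.cos(x**3)
for y in (y1, y2):
    print(sp.simplify(x*y.diff(x, 2) - 2*y.diff(x) + 9*x**5*y))   # -> 0, 0

print(sp.simplify(y1*y2.diff(x) - y2*y1.diff(x)))                  # -> -3*x**2
```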
Chapter 3
Complex analysis
Complex analysis consists in extending to the complex plane the familiar concepts of
real analysis (limits, continuity, derivatives, integrals). Many of the properties of
complex functions are analogous to the ones in the real case, but we will see that some
other operations (like integration) are totally unlike the familiar operations on real
functions.
3.1 Complex functions
We have already seen in Chapter 1 how to form complex numbers z given ordered
couples of real numbers (x, y). Of course, if x and y are allowed to vary (namely
if they are variables), then we call z a complex variable. The quantity f(z)
is said to be a complex function of z if to every value of z in a certain domain R
(a region in the Argand diagram) there correspond one or more complex numbers
wi = f(z). If only one value of w corresponds to each value of z, then the function
f is single-valued; if instead to each value of z we can assign several values of w, then
the function is multiple-valued.
Example 3.1.1 Is the function f(z) = √z single-valued or multiple-valued?

We can write z = r e^{iθ}, therefore it is

  f(z) = √r e^{i(θ/2 + nπ)},

namely, given whatever complex number z we have two complex numbers corresponding
to its square root: w1 = √r e^{iθ/2} and w2 = √r e^{i(θ/2 + π)}. The function is therefore
multiple-valued.
Concerning Example 3.1.1 we can notice that, if we move around a closed path
that does not enclose the origin of the axes and we evaluate f(z) = √z at each
Figure 3.1: (a) a closed contour not enclosing the origin; (b) a closed contour enclosing the origin.
point of the path, after one complete circuit the value of θ will be the same as its
original value, therefore also the value of f(z) will be the same. If instead we move
on a closed path that encloses the origin, then after one circuit we will have θ + 2π
instead of θ and therefore the value of f(z) will change from √r e^{iθ/2} to √r e^{i(θ/2 + π)}.
Referring to Fig. 3.1, along the path (a) the function f(z) does not change after
one circuit, whereas it does change if we move on the path drawn in (b). A point like
the origin, which has the property that the function f(z) changes after one circuit
enclosing it, is called a branch point.
For a multiple-valued function to be treated as a single-valued one, we must
somehow prevent a circuit around the origin from being completed. This result
can be obtained through the so-called branch cut. The branch cut is a line or a curve
in the complex plane and can be regarded as an artificial barrier that a closed path
must not cross. A possible branch cut for the function f(z) = √z is shown in Fig.
3.2, namely we take the cut along the positive real axis (but any line extending from
the origin out to |z| = ∞ would be a branch cut as well). As shown in this figure,
if we prevent the closed path from crossing the branch cut, there is no way to enclose
the origin and therefore the function f(z) may be regarded as single-valued. We can
also say that the function f(z) = √z is single-valued in the domain of the complex
plane R = C − R≥0.
Figure 3.2: A possible branch cut for the function f(z) = √z and a closed contour.

3.1.1 Differentiable functions
Analogously to the definition of the derivative in the real case, we can define the
derivative of a single-valued function f at a point z as the limit:

  f′(z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z.   (3.1)

A function f that is single-valued in some domain R of the Argand diagram is said
to be differentiable if the derivative Eq. 3.1 exists and is unique, namely it does not
depend on the direction in the Argand diagram from which ∆z tends to zero.
Example 3.1.2 Show that the function f(z) = Re(z)·Im(z) = xy is not differentiable
anywhere in the complex plane.

Given a generic ∆z = ∆x + i∆y we have:

  lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z = lim_{∆x,∆y→0} [(x + ∆x)(y + ∆y) − xy]/(∆x + i∆y).

If we suppose now that ∆z → 0 along a straight line with slope m, we have ∆y = m∆x,
therefore we have:

  lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z = lim_{∆x→0} [y∆x + xm∆x + m∆x²]/(∆x(1 + im)) = (y + xm)/(1 + im).
This limit therefore clearly depends on the direction from which ∆z tends to zero
(namely on m), therefore the given function is not differentiable.

A function that is single-valued and differentiable at all points of a domain R is
said to be analytic there. If a function is analytic in a domain R with the exception
of some points of this domain, then these points are called singularities of f(z).

We have already encountered a singularity (the branch point). Another very
important singularity is the pole. A point z0 is said to be a pole of order n of the
function f(z) if:

  lim_{z→z0} [(z − z0)^n f(z)] = a,   (3.2)

where a is a finite complex number different from zero. Analogously, we can say that
the function f(z) has a pole of order n at z0 if we can find a function g(z), analytic
in a neighborhood of z0 and with g(z0) ≠ 0, such that:

  f(z) = g(z)/(z − z0)^n.   (3.3)

A pole is an isolated singularity, in the sense that f(z) has a pole at z0 but is analytic
in some neighborhood around z0.
3.1.2 The Cauchy-Riemann conditions

The Cauchy-Riemann conditions establish what properties a function f(z) must have
in order to be analytic in some domain R. We can split every function f(z) into two
functions u(x, y) and v(x, y) such that:

  f(z) = u(x, y) + i v(x, y).   (3.4)

Given a point z ∈ R, f′(z) is given by:

  f′(z) = lim_{∆x,∆y→0} {[u(x + ∆x, y + ∆y) − u(x, y)] + i[v(x + ∆x, y + ∆y) − v(x, y)]}/(∆x + i∆y).
As we have said, this limit will in general depend on the way ∆z = ∆x + i∆y
approaches zero. The easiest possible paths we can think of are along the x-axis and
along the y-axis, namely in the first case we have ∆y = 0 and we let ∆x go to 0,
whereas in the second case ∆x = 0 and ∆y tends to zero. For ∆y = 0 we obtain:

  lim_{∆x→0} {[u(x + ∆x, y) − u(x, y)]/∆x + i[v(x + ∆x, y) − v(x, y)]/∆x} = ∂u/∂x + i ∂v/∂x.   (3.5)
Assuming ∆x = 0 we obtain instead:

  lim_{∆y→0} {[u(x, y + ∆y) − u(x, y)]/(i∆y) + i[v(x, y + ∆y) − v(x, y)]/(i∆y)} = (1/i) ∂u/∂y + ∂v/∂y.   (3.6)
It is clear that a necessary condition for f(z) to be differentiable at z is that
we obtain the same f′(z) in the two above-mentioned cases. Equating the real and
imaginary parts we obtain:

  ∂u/∂x = ∂v/∂y,  ∂v/∂x = −∂u/∂y.   (3.7)

These equations are called the Cauchy-Riemann conditions. It can be shown that
a necessary and sufficient condition for f(z) to be analytic in a domain R is
that the Cauchy-Riemann conditions are satisfied at all points of R and that all the
partial derivatives ∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y are continuous.
Example 3.1.3 Show that the function sin z is analytic in the whole complex plane.

We have seen in Chapter 1 that we can define sin z = (e^{iz} − e^{−iz})/(2i) for any
complex number z. Since z = x + iy we obtain:

  sin z = (e^{iz} − e^{−iz})/(2i) = (e^{i(x+iy)} − e^{−i(x+iy)})/(2i) = (e^{−y+ix} − e^{y−ix})/(2i)
        = [e^{−y}(cos x + i sin x) − e^{y}(cos x − i sin x)]/(2i)
        = cos x (e^{−y} − e^{y})/(2i) + i sin x (e^{−y} + e^{y})/(2i)
        = −i cos x (e^{−y} − e^{y})/2 + sin x (e^{−y} + e^{y})/2
        = i cos x sinh y + sin x cosh y.

The real and imaginary parts of sin z are thus u(x, y) = sin x cosh y and v(x, y) =
cos x sinh y, respectively. If we now take the partial derivatives of these two functions
we obtain:

  ∂u/∂x = cos x cosh y
  ∂u/∂y = sin x sinh y
  ∂v/∂x = −sin x sinh y
  ∂v/∂y = cos x cosh y.
It is clear that all the partial derivatives are continuous and the Cauchy-Riemann
conditions are satisfied for any choice of z, therefore the sine function is analytic
everywhere in C.
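The Cauchy-Riemann check of Example 3.1.3 can also be carried out mechanically with sympy (a sketch, not part of the text):

```python
# Check Eq. 3.7 for f(z) = sin(z), with u = sin(x)*cosh(y), v = cos(x)*sinh(y).
import sympy as sp

x, y = sp.symbols('x y', real=True)
u = sp.sin(x)*sp.cosh(y)
v = sp.cos(x)*sp.sinh(y)

print(sp.simplify(u.diff(x) - v.diff(y)))   # -> 0   (u_x = v_y)
print(sp.simplify(v.diff(x) + u.diff(y)))   # -> 0   (v_x = -u_y)
```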
Example 3.1.4 Show that the function ln z (the principal value of Ln(z)) is analytic
in the whole complex plane and find its derivative.

As we have seen in Chapter 1, ln z is given by:

  ln z = ln r + iθ = ln √(x² + y²) + i arctan(y/x),

namely we have:

  u(x, y) = ln √(x² + y²),  v(x, y) = arctan(y/x).

If we take the partial derivatives we obtain:

  ∂u/∂x = (1/√(x² + y²)) · (1/(2√(x² + y²))) · 2x = x/(x² + y²)
  ∂u/∂y = y/(x² + y²)
  ∂v/∂x = (1/(1 + y²/x²)) · (−y/x²) = −y/(x² + y²)
  ∂v/∂y = (1/(1 + y²/x²)) · (1/x) = x/(x² + y²).

It can be seen now that all the partial derivatives are continuous and satisfy the
Cauchy-Riemann conditions, therefore the function f(z) = ln z is analytic in the
whole complex plane.
It is worth noting that, had we taken the complex logarithm Ln(z) = i(θ +
2nπ) + ln r instead of its principal value, nothing would have changed in the partial
derivatives because the additive term 2nπ in v(x, y) would have disappeared after
differentiation. However, the function f (z) = Ln (z) is multiple-valued (as we have
seen in Chapter 1) and because of that it cannot be analytic. To make it single-valued
we have to restrict the range of arguments to θ ∈ [0, 2π[, namely we have to take the
principal value ln z of the complex logarithm.
If we apply the definition of derivative to the function f(z) = ln z we obtain:

  f′(z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z = lim_{∆z→0} ln(1 + ∆z/z)/∆z.
It can be shown that ln(1 + z) has the same Taylor expansion as the analogous real
function ln(1 + x), namely ln(1 + z) = z − z²/2 + z³/3 − ..., therefore we obtain:

  f′(z) = lim_{∆z→0} ln(1 + ∆z/z)/∆z = lim_{∆z→0} [∆z/z − (1/2)(∆z/z)² + ...]/∆z = 1/z,

analogous to the result for the real function f(x) = ln x.
Indeed it is possible to show that all the elementary functions (sine, cosine,
hyperbolic sine and cosine, exponential function, polynomials etc.) are analytic and
their derivatives have the same expression as the corresponding real functions.
If we express the complex number in polar form (z = reiθ ), then we can find
two functions u and v such that:
f (z) = u(r, θ) + iv(r, θ).
In this way f′(z) can be written as:

  f′(z) = lim_{∆r,∆θ→0} {u(r + ∆r, θ + ∆θ) − u(r, θ) + i[v(r + ∆r, θ + ∆θ) − v(r, θ)]}/[(r + ∆r)e^{i(θ+∆θ)} − r e^{iθ}].

Once again we first take ∆θ = 0 and send ∆r to zero, then we proceed the other
way around. For ∆θ = 0 we obtain:

  lim_{∆r→0} {u(r + ∆r, θ) − u(r, θ) + i[v(r + ∆r, θ) − v(r, θ)]}/(∆r e^{iθ}) = e^{−iθ} (∂u/∂r + i ∂v/∂r).

If we take now ∆r = 0 we have:

  lim_{∆θ→0} {u(r, θ + ∆θ) − u(r, θ) + i[v(r, θ + ∆θ) − v(r, θ)]}/[r(e^{i(θ+∆θ)} − e^{iθ})]
  = (e^{−iθ}/r) lim_{∆θ→0} {u(r, θ + ∆θ) − u(r, θ) + i[v(r, θ + ∆θ) − v(r, θ)]}/(e^{i∆θ} − 1).

We know already (Chapter 1) that the exponential function admits the same Taylor
series expansion as the corresponding real function, namely e^x = 1 + x + x²/2 + ...,
therefore the expression at the denominator is e^{i∆θ} − 1 ≃ i∆θ. In the limit
∆θ → 0 the previous expression transforms into:

  (e^{−iθ}/r) (1/i) (∂u/∂θ + i ∂v/∂θ) = (e^{−iθ}/r) (∂v/∂θ − i ∂u/∂θ).

If we compare now the real and imaginary parts of the two limits we have found, we
obtain:
  r ∂u/∂r = ∂v/∂θ,   (3.8)
  r ∂v/∂r = −∂u/∂θ,   (3.9)

which are the Cauchy-Riemann conditions in polar form. By means of these equations
it is much easier to see that the function ln z (Example 3.1.4) is analytic. In fact, we
have u(r, θ) = ln r and v(r, θ) = θ. It is therefore clear that ∂u/∂θ = 0 = ∂v/∂r,
satisfying Eq. 3.9. We have then ∂u/∂r = 1/r and ∂v/∂θ = 1, therefore also Eq. 3.8 is
satisfied. As usual, some operations are easier to perform in polar representation,
some others in algebraic representation.
3.2 Complex integration

3.2.1 Line integrals in the complex plane
We can define integrals of complex functions exactly as we do with real functions,
namely we define the indefinite integral ∫ f(z) dz as any possible function whose
derivative is f(z). When we try to calculate definite integrals, things are more
complicated. In fact, given a real function f(x) and two real numbers x1 and x2,
the number ∫_{x1}^{x2} f(x) dx is unambiguously defined. Given instead a complex
function f(z) and two points z1, z2 ∈ C, ∫_{z1}^{z2} f(z) dz is ambiguously defined because
there is an infinite number of curves joining z1 to z2 and we might have different
results according to the curve we choose.

In this way we can see the analogy between complex integrals and line integrals
of scalar (or vector) fields. If we have a scalar field f and we want to evaluate the
integral of this field along a curve γ, we know that we have to find a parameterization
r : [a, b] → γ such that r(a) and r(b) are the endpoints of the curve γ. We can then
express the line integral with real integrals in this way:

  ∫_γ f ds = ∫_a^b f[r(t)] |r′(t)| dt,

where ds is the infinitesimal line element. For complex integrals we proceed in the
same way, namely we describe the path γ with a continuous (real) parameter t
ranging from a to b and giving the successive positions on γ through the relations:

  x = x(t),  y = y(t).   (3.10)
Assuming as usual that u(x, y) is the real part of f (z) and v(x, y) its imaginary part,
the integral of the function f (z) along the curve γ can be given as a sum of real
integrals as follows:
  ∫_γ f(z) dz = ∫_γ (u + iv)(dx + i dy)
              = ∫_γ (u dx − v dy) + i ∫_γ (u dy + v dx)
              = ∫_a^b (u dx/dt − v dy/dt) dt + i ∫_a^b (u dy/dt + v dx/dt) dt.   (3.11)
Because of this transformation of complex integrals into sums of real integrals, it
is easy to see that these two identities are verified:

  ∫_{−γ} f(z) dz = −∫_γ f(z) dz   (3.12)
  ∫_γ f(z) dz = ∫_{γ1} f(z) dz + ∫_{γ2} f(z) dz.   (3.13)

In the first identity the curve −γ is the same as the curve γ but taken in the opposite
direction, therefore the parameterization x(t) and y(t) is the same but we have to
invert the endpoints a and b. In this sense the result of Eq. 3.12 is equivalent to the
relation among real integrals ∫_a^b f(x) dx = −∫_b^a f(x) dx. In the second identity
γ1 and γ2 are two curves which, joined together, give the curve γ. Because of the
transformation between complex and real integrals of Eq. 3.11 this result is again
analogous to the result for real integrals ∫_a^b f(x) dx = ∫_a^m f(x) dx + ∫_m^b f(x) dx.
Example 3.2.1 Evaluate the integrals of the functions f1(z) = 1/z and f2(z) = Im(z)
along the paths γ1 and γ2 indicated in Fig. 3.3.
The curve γ1 is a circular arc with radius R√2 and its parameterization is clearly:

  x(t) = R√2 cos t,  y(t) = R√2 sin t,  t ∈ [π/4, 3π/4].

The curve γ2 is instead a straight line on which y is constant, and a parameterization
of it is:

  x(t) = Rt,  y(t) = R,  t ∈ [1, −1].

The function f1(z) = 1/z can also be written as:

  f1(z) = 1/(x + iy) = (x − iy)/(x² + y²)
  ⇒ u(x, y) = x/(x² + y²),  v(x, y) = −y/(x² + y²).

By using the parameterization adopted for the curve γ1 this becomes:
  u(x, y) = cos t/(R√2),  v(x, y) = −sin t/(R√2),

whereas with the parameterization we have chosen for the curve γ2 we obtain:

  u(x, y) = t/(R(1 + t²)),  v(x, y) = −1/(R(1 + t²)).

For the function f2(z) = Im(z) the real and imaginary parts are evident, namely:

  u(x, y) = y,  v(x, y) = 0.

Figure 3.3: Different paths to evaluate the integrals of the functions f1(z) = 1/z and
f2(z) = Im(z) (see Example 3.2.1).

Now we can start calculating the four requested integrals:
  ∫_{γ1} f1(z) dz = ∫_{π/4}^{3π/4} [ (cos t/(R√2))(−R√2 sin t) − (−sin t/(R√2))(R√2 cos t) ] dt
                  + i ∫_{π/4}^{3π/4} [ (cos t/(R√2))(R√2 cos t) + (−sin t/(R√2))(−R√2 sin t) ] dt
                = i ∫_{π/4}^{3π/4} 1 dt = i π/2.
  ∫_{γ2} f1(z) dz = ∫_1^{−1} [t/(R(1 + t²))] R dt + i ∫_1^{−1} [−1/(R(1 + t²))] R dt
                  = [½ ln(1 + t²)]_1^{−1} − i [arctan t]_1^{−1}
                  = −i(−π/4 − π/4) = i π/2.

  ∫_{γ1} f2(z) dz = ∫_{π/4}^{3π/4} R√2 sin t (−R√2 sin t) dt + i ∫_{π/4}^{3π/4} R√2 sin t (R√2 cos t) dt
                  = −R² ∫_{π/4}^{3π/4} 2 sin² t dt + i R² ∫_{π/4}^{3π/4} 2 sin t cos t dt.
We remind now that 2 sin t cos t = sin(2t) and −2 sin² t = cos(2t) − 1, therefore we
have:

  ∫_{γ1} f2(z) dz = R² ∫_{π/4}^{3π/4} [cos(2t) − 1] dt + i R² ∫_{π/4}^{3π/4} sin(2t) dt
                  = R² [½ sin(2t) − t]_{π/4}^{3π/4} − i (R²/2) [cos(2t)]_{π/4}^{3π/4}
                  = −R² (1 + π/2).

Finally:

  ∫_{γ2} f2(z) dz = ∫_1^{−1} R · R dt = −2R².
There are many things to notice from the previous example. We have seen that,
in one case (function f1 (z)) the integral seems not to depend on the path we have
chosen but only on the end points, whereas in the other case (function f2 (z)) the
result does depend on the path. If we take the closed curve γ = γ1 − γ2 , the integral
of the function f1 (z) along it is zero. It is not difficult to imagine why the functions
f1 (z) and f2 (z) behave differently: the first function is analytic in the domain we are
considering, whereas the function f2 (z) is not.
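These four results can be reproduced numerically; the following numpy sketch (not from the text) uses R = 1 as a hypothetical choice and a simple trapezoidal rule along the sampled paths.

```python
# Numerical line integrals along gamma_1 (arc) and gamma_2 (segment) for R = 1.
import numpy as np

R = 1.0
t1 = np.linspace(np.pi/4, 3*np.pi/4, 20001)
z1 = R*np.sqrt(2)*np.exp(1j*t1)            # gamma_1
t2 = np.linspace(1, -1, 20001)
z2 = R*t2 + 1j*R                            # gamma_2

def contour_integral(f, z):
    fz = f(z)
    return np.sum((fz[:-1] + fz[1:])/2 * np.diff(z))   # trapezoidal rule

print(contour_integral(lambda z: 1/z, z1))      # ~ 1.5708j  (i*pi/2)
print(contour_integral(lambda z: 1/z, z2))      # ~ 1.5708j  (path independent)
print(contour_integral(lambda z: z.imag, z1))   # ~ -2.5708  (-(1 + pi/2))
print(contour_integral(lambda z: z.imag, z2))   # ~ -2.0     (path dependent)
```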
Example 3.2.2 Evaluate the integral

  ∫_C dz/(z − z0)^n,

where C is a circle of radius R centered at z0.
It is convenient to express z − z0 in exponential form. In fact, along a circle of
radius R centered at z0 the modulus of z − z0 is always constant and equal to R,
whereas the argument ranges from 0 to 2π. Therefore z − z0 = R e^{iθ} and consequently
dz = iR e^{iθ} dθ. The integral to evaluate transforms thus into:

  ∫_C dz/(z − z0)^n = ∫_0^{2π} [iR e^{iθ}/(R^n e^{inθ})] dθ = i R^{1−n} ∫_0^{2π} e^{i(1−n)θ} dθ = 0  ∀n ≠ 1.

The result is due to the fact that, as we have learned, the function e^{iθ} has a
periodicity of 2π, namely e^{iθ} = e^{i(θ+2nπ)}. If n = 1 we obtain:

  ∫_C dz/(z − z0) = i ∫_0^{2π} dθ = 2πi,

which could also be obtained as we have done in Example 3.2.1, only with a
parameterization extending over the interval t ∈ [0, 2π].
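A numerical sketch (not from the text) of Example 3.2.2; the centre z0 = 1 + 2i and radius 0.7 are hypothetical choices.

```python
# Integrate 1/(z - z0)^n around a circle: zero for n != 1, 2*pi*i for n = 1.
import numpy as np

z0, R = 1 + 2j, 0.7
theta = np.linspace(0, 2*np.pi, 40001)
z = z0 + R*np.exp(1j*theta)

for n in (0, 1, 2, 3):
    fz = 1/(z - z0)**n
    integral = np.sum((fz[:-1] + fz[1:])/2 * np.diff(z))
    print(n, np.round(integral, 6))        # only n = 1 gives ~ 6.283185j = 2*pi*i
```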
We have seen in this example that the integral considered is zero in all cases except
the one in which the point z0 (the center of the curve C) is a pole of order 1 (also
called a simple pole) of the given function. In this case, the integral is given by the
value 2πi, which is independent of the radius of the considered circle. This leads us
to one of the most important results of complex analysis, namely Cauchy's integral
theorem.
3.2.2 Cauchy's integral theorem
Before discussing the Cauchy’s integral theorem we have to define a simply-connected
domain. Referring to Fig. 3.4 we can assume that some function f (z) is not defined
or is not differentiable in the points P1 and P2 and in the region R1 , all concentrated
in the right part of the Argand diagram (region of positive real parts). If we take a
generic closed curve (for instance Γ1 ) in this region, we cannot shrink it indefinitely
without avoiding the singularities and holes. In the left side of the diagram instead
(light blue region; region of negative real parts) we can take any generic closed curve
(for instance the curve Γ2 ) and shrink it to a point without leaving it. If a region of
the complex plane has the property that any closed curve lying in it can be shrunk
to a point without leaving it, then the region is called simply connected. If a region
is not simply connected but has a number of holes in it, then it is said to be multiply
connected.
The Cauchy’s integral theorem states that, if a function f (z) is analytic in a
simply connected region and on its boundary C, then
I
C
f (z)dz = 0.
(3.14)
Figure 3.4: Simply connected and multiply connected regions.
Here and in the following we will denote an integral around a closed contour C with
∮_C.

To demonstrate Cauchy's integral theorem we first have to show a result known
as Green's theorem in the plane, which states that, given two functions P(x, y) and
Q(x, y), continuous and with continuous partial derivatives inside a simply connected
region R of the xy-plane and on its boundary C, then:

  ∮_C (P dx + Q dy) = ∬_R (∂Q/∂x − ∂P/∂y) dx dy.   (3.15)
It relates therefore the surface integral on R to the line integral on the contour line
C enclosing R. To demonstrate this result we refer to Fig. 3.5. If y1(x) and y2(x)
are the equations of the curves STU and SVU respectively, then for each value of
x0 between a and b the segment joining y2(x0) with y1(x0) represents a cut through
the region R and it is always y2(x0) > y1(x0), therefore we can write:

  ∬_R (∂P/∂y) dx dy = ∫_a^b dx ∫_{y1(x)}^{y2(x)} dy (∂P/∂y) = ∫_a^b dx [P(x, y)]_{y=y1(x)}^{y=y2(x)}
                    = ∫_a^b P[x, y2(x)] dx − ∫_a^b P[x, y1(x)] dx = −∮_C P dx.
Figure 3.5: A simply connected region R bounded by the curve C.
The minus sign is due to the fact that we have gone through the curve C clockwise,
whereas it is conventionally assumed that the counterclockwise direction is the
positive one (see also Example 3.2.2). We proceed in the same way by defining x1(y)
and x2(y) as the equations of the curves TSV and TUV, respectively. By construction
it is always x2(y) > x1(y). We have then:

  ∬_R (∂Q/∂x) dx dy = ∫_c^d dy ∫_{x1(y)}^{x2(y)} dx (∂Q/∂x) = ∫_c^d dy [Q(x, y)]_{x=x1(y)}^{x=x2(y)}
                    = ∫_c^d Q[x2(y), y] dy − ∫_c^d Q[x1(y), y] dy = ∮_C Q dy.

If we now take ∬_R (∂Q/∂x) dx dy − ∬_R (∂P/∂y) dx dy, we obtain Green's theorem in
the plane.
The Cauchy’s integral theorem is a simple application of the Green’s theorem
in a plane. In fact we have:
I
C
f (z)dz =
I
I
(udx − vdy) + i (udy + vdx)
C
ZZ ZZ ∂u ∂v
∂(−v) ∂u
dxdy + i
dxdy.
−
−
=
∂x
∂y
∂y
R ∂x
R
C
(3.16)
For an analytic function we know that the Cauchy-Riemann conditions ∂u/∂x = ∂v/∂y
and ∂u/∂y = −∂v/∂x are satisfied, therefore both integrands in Eq. 3.16 are zero,
demonstrating thus Cauchy's integral theorem.
On the light of the Cauchy’s theorem the result emerged after Example 3.2.1
that the integral of the function f (z) = z1 along the closed curve γ = γ1 − γ2 is
zero becomes obvious. In fact, the curve does not enclose the origin (which is a
singularity), the function f (z) is analytic in this region of the Argand diagram and
therefore the integral of f (z) along γ must be zero.
3.2.3 Cauchy's integral formula
A very important application of Cauchy's integral theorem is that, given a function
f(z) analytic everywhere except at an isolated singularity z0, whatever curve we
choose surrounding z0, the integral of f(z) along it will be equivalent to the integral
along a circle γ centered on z0 and of arbitrarily small radius ε. To demonstrate
this, we refer to Fig. 3.6. We want to demonstrate that the integral of the function
f along C is equal to ∮_γ f(z) dz, where γ is a small circle with radius ε surrounding
the singularity z0. To do that we consider the curve Γ in Fig. 3.6(b), namely we cut
the curve C and we connect it through two parallel lines r1 and r2 to the small circle
γ enclosing z0. If the separation δ between the two parallel straight lines r1 and r2
becomes infinitesimally small, then the curve Γ can be obtained as Γ = C + r1 − γ + r2.
Note the term −γ, due to the fact that the curve Γ rotates clockwise around z0. But
the curve Γ does not contain the singularity z0, therefore according to Cauchy's
integral theorem:

  ∮_Γ f(z) dz = 0
  ⇒ ∮_C f(z) dz + ∮_{r1} f(z) dz − ∮_γ f(z) dz + ∮_{r2} f(z) dz = 0.

But if the separation δ between r1 and r2 becomes infinitesimally small, the two
curves will eventually lie on top of each other but will be traversed in opposite
directions. According to Eq. 3.12 that means that the contributions of ∮ f(z) dz
along these two lines cancel out. It remains therefore:

  ∮_C f(z) dz = ∮_γ f(z) dz.   (3.17)
Figure 3.6: Collapsing a contour C around a singularity z0.

As an extension of this result, if a curve C encloses n holes or singularities z1, ..., zn
of a given complex function f(z), then:

  ∮_C f(z) dz = Σ_{i=1}^{n} ∮_{γi} f(z) dz,   (3.18)

where γi is a small circle, of arbitrarily small radius ε, enclosing the singularity zi.
This can be seen in Fig. 3.7, where we show that we can transform the curve C into
the curve Γ that does not contain the singularities z1, z2, z3 but encloses them through
the small circles γ1, γ2, γ3 traversed clockwise, therefore we have
0 = ∮_Γ f(z) dz = ∮_C f(z) dz − Σ_{i=1}^{3} ∮_{γi} f(z) dz, from which we obtain the result
Eq. 3.18.
The Cauchy’s integral formula states that, if f (z) is analytic within and on a
closed contour C and z0 is a point within C, then:
1
f (z0 ) =
2πi
I
C
f (z)
dz.
z − z0
(3.19)
The Cauchy’s integral formula relates therefore the value of a function f (z) in a
point z0 with the complex integral of f in a contour surrounding z0 . Because of Eq.
3.17, to demonstrate this result it is enough to demonstrate that this relation holds
for a small circle γ with radius ε surrounding z0 . But any point z on γ is given by
z = z0 + εeiθ , with θ ∈ [0, 2π]. Moreover, dz = iεeiθ dθ, therefore we have:
Figure 3.7: Collapsing a contour C around three singularities z1 , z2 , z3 .
  ∮_C f(z)/(z − z0) dz = ∮_γ f(z)/(z − z0) dz = ∫_0^{2π} [f(z0 + ε e^{iθ})/(ε e^{iθ})] iε e^{iθ} dθ.

For ε → 0 the function to integrate reduces to i f(z0), therefore we have:

  ∮_C f(z)/(z − z0) dz = i ∫_0^{2π} f(z0) dθ = 2πi f(z0),

which demonstrates Cauchy's integral formula. If f(z) = 1, we obtain the result
∮ dz/(z − z0) = 2πi we have already seen in Example 3.2.2. Cauchy's integral formula
is very useful to evaluate complex integrals, as the next example demonstrates.
Example 3.2.3 Find the integral of the function

  e^{z²+1}/(z² + 1)

along the curves Γ1, Γ2, Γ3 shown in Fig. 3.8, namely:

• a generic closed curve containing the point z1 = i but not the point z2 = −i (Γ1),

• a generic closed curve containing neither z1 = i nor z2 = −i (Γ2),

• a generic closed curve containing the point z2 = −i but not the point z1 = i (Γ3).

Figure 3.8: Curves analyzed in Example 3.2.3.
We have to evaluate the integral

  ∮_{Γi} e^{z²+1}/(z² + 1) dz = ∮_{Γi} e^{z²+1}/((z + i)(z − i)) dz.

This function has simple poles at z1 = i and z2 = −i. The curve Γ2 can be shrunk
to a point avoiding the singularities, therefore:

  ∮_{Γ2} e^{z²+1}/(z² + 1) dz = 0.

Considering the curve Γ1, it is clear that the function g1(z) = e^{z²+1}/(z + i) does not
have singularities in and along it, therefore we can apply Cauchy's integral formula
and obtain:

  ∮_{Γ1} e^{z²+1}/(z² + 1) dz = ∮_{Γ1} g1(z)/(z − i) dz = 2πi g1(i) = 2πi e^0/(2i) = π.
Analogously, in and along the curve Γ3 the function g2(z) = e^{z²+1}/(z − i) does not
have singularities, therefore:

  ∮_{Γ3} e^{z²+1}/(z² + 1) dz = ∮_{Γ3} g2(z)/(z + i) dz = 2πi g2(−i) = 2πi e^0/(−2i) = −π.

3.2.4 Cauchy's integral formula for higher derivatives
Recalling the definition of the derivative of a complex (differentiable) function at a
point z0:

  f′(z0) = lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z,

we can evaluate f(z0 + ∆z) and f(z0) by means of Cauchy's integral formula, using
as contour any closed curve C containing z0 but not enclosing other singularities
(we know that we can collapse this curve around z0). We obtain:

  f′(z0) = lim_{∆z→0} (1/(2πi)) ∮_C (1/∆z) [f(z)/(z − z0 − ∆z) − f(z)/(z − z0)] dz
         = lim_{∆z→0} (1/(2πi)) ∮_C (1/∆z) [(z − z0)f(z) − (z − z0)f(z) + ∆z f(z)]/[(z − z0 − ∆z)(z − z0)] dz
         = lim_{∆z→0} (1/(2πi)) ∮_C f(z)/[(z − z0 − ∆z)(z − z0)] dz
         = (1/(2πi)) ∮_C f(z)/(z − z0)² dz.   (3.20)

We can go on with higher order derivatives and prove by induction that, given a
closed curve C containing z0 and given a function f(z) analytic on and inside C,
then:

  f^{(n)}(z0) = (n!/(2πi)) ∮_C f(z)/(z − z0)^{n+1} dz.   (3.21)

This formula is known as Cauchy's integral formula for higher derivatives. Also this
formula is useful to evaluate complex integrals.
Example 3.2.4 Evaluate the integral

  ∫_C e^{2z}/(z − iπ/4)⁴ dz,

where C : |z| = 1 is the circumference with radius 1.
The only singularity of the function to integrate is a pole of order 4 at z0 = iπ/4. The
path C encloses it and the function g(z) = e^{2z} is analytic on and inside C, therefore
we can apply the Cauchy's integral formula for higher derivatives and obtain:

  ∫_C e^{2z}/(z − iπ/4)⁴ dz = ∫_C g(z)/(z − iπ/4)⁴ dz = g^{(3)}(iπ/4) · (2πi/3!) = 8 e^{iπ/2} · (2πi/3!) = −8π/3.
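A sympy sketch (not from the text) reproducing the value of Example 3.2.4 directly from Eq. 3.21:

```python
# Integral = 2*pi*i/3! times the third derivative of g(z) = e^{2z} at z0 = i*pi/4.
import sympy as sp

z = sp.symbols('z')
g = sp.exp(2*z)
z0 = sp.I*sp.pi/4
value = 2*sp.pi*sp.I/sp.factorial(3) * g.diff(z, 3).subs(z, z0)
print(sp.simplify(value))   # -> -8*pi/3
```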
3.2.5 Taylor and Laurent series
We are now in the condition to demonstrate that analytic functions admit Taylor
series expansions equivalent to the ones we know for real functions, namely:

  f(z) = Σ_{n=0}^{∞} f^{(n)}(z0) (z − z0)^n/n!.   (3.22)

In order to obtain this result we need to remember that, given a number q with
|q| < 1, then 1/(1 − q) can be obtained through the geometric series 1 + q + q² + ...,
namely:

  Σ_{n=0}^{∞} q^n = 1/(1 − q).   (3.23)
We suppose now that a function f(z) is analytic in a circle R centered on z0 (whose
circumference we denote with C) and we want to evaluate Σ_{n=0}^{∞} f^{(n)}(z0) (z − z0)^n/n!.
The quantity f^{(n)}(z0) can be obtained by means of the Cauchy's integral formula for
higher derivatives (Eq. 3.21), therefore we have:

  Σ_{n=0}^{∞} f^{(n)}(z0) (z − z0)^n/n! = Σ_{n=0}^{∞} [(z − z0)^n/n!] (n!/(2πi)) ∮_C f(ξ)/(ξ − z0)^{n+1} dξ.

We must notice here that ξ is on the circumference C, whereas z is inside the circle,
therefore |z − z0| < |ξ − z0|. We can now bring the sum inside the integral (and 1/(2πi)
outside it) and obtain:

  Σ_{n=0}^{∞} f^{(n)}(z0) (z − z0)^n/n! = (1/(2πi)) ∮_C Σ_{n=0}^{∞} [(z − z0)^n/(ξ − z0)^n] f(ξ)/(ξ − z0) dξ.

Since |z − z0|/|ξ − z0| < 1 we can apply Eq. 3.23 to the quantity q = (z − z0)/(ξ − z0)
and obtain:

  Σ_{n=0}^{∞} f^{(n)}(z0) (z − z0)^n/n! = (1/(2πi)) ∮_C [1/(1 − (z − z0)/(ξ − z0))] f(ξ)/(ξ − z0) dξ
                                      = (1/(2πi)) ∮_C f(ξ)/(ξ − z) dξ = f(z).
The last step is due again to the Cauchy’s integral formula (Eq. 3.19) and this
concludes the demonstration of the Eq. 3.22. We see now why we have used so
far the term analytic function to indicate differentiable functions in some domain
R: analogously to the real function, a complex function is analytic if it admits a
convergent Taylor series expansion in each point of some domain R. In fact, given
a point z0 and a function f (z) whose closest singularity to z0 is z1 , then for all z in
a circle centered in z0 and with radius |z1 − z0 | the Taylor series expansion of f (z)
converges.
If a function f(z) is not analytic in some domain R, we may still be able to
expand it in a series. Let us consider first a function f(z) having a pole of order
m at a point z0. By using the definition Eq. 3.3 of a pole, we know that an analytic
function g exists such that f(z) = g(z)/(z − z0)^m. Since g(z) is analytic, it admits a
Taylor series expansion around z0, therefore we have:

  f(z) = (1/(z − z0)^m) Σ_{n=0}^{∞} g^{(n)}(z0) (z − z0)^n/n!
       = Σ_{n=0}^{∞} [g^{(n)}(z0)/n!] (z − z0)^{n−m}
       = Σ_{n=−m}^{∞} [g^{(n+m)}(z0)/(n + m)!] (z − z0)^n = Σ_{n=−m}^{∞} an (z − z0)^n.   (3.24)
Such a series, which is an extension of the Taylor series to negative indices is called
Laurent series. To find the Laurent expansion of a function f (z) about a point z0
it is therefore enough to find the Taylor expansion of g(z) about z0 and divide it by
(z − z0 )m .
Example 3.2.5 Find the Laurent series of

  f(z) = 1/(z(z − 2)³)

about the singularities z = 0 and z = 2 and find the corresponding residues.
We remind here that the function (1 + x)^α can be expressed as:

  (1 + x)^α = Σ_{n=0}^{∞} [α(α − 1)(α − 2)···(α − n + 1)/n!] x^n.

This relation holds also if x, α ∈ C. To get the Laurent series about z = 0 we notice
that the function g(z) = 1/(z − 2)³ is analytic in a neighborhood of z = 0. We can
thus expand it about z = 0. We rewrite the given function as:

  1/(z(z − 2)³) = −(1/(8z)) · 1/(1 − z/2)³.
Now we develop (1 − z/2)^{−3} and obtain:

  (1 − z/2)^{−3} = 1 − (−3)(z/2) + [(−3)(−4)/2!](z²/4) − [(−3)(−4)(−5)/3!](z³/8) + ....

We have therefore:

  1/(z(z − 2)³) = −(1/(8z)) [1 + (3/2)z + (3/2)z² + (5/4)z³ + ...]
               = −1/(8z) − 3/16 − (3/16)z − (5/32)z² − ...
The residue at z = 0 is therefore −1/8. To find the Laurent series about z = 2 we
proceed the same way, expanding the function g(z) = 1/z about z = 2. It is more
convenient to write ξ = z − 2. In this way we have:

  1/(z(z − 2)³) = 1/((ξ + 2)ξ³) = (1/(2ξ³)) · 1/(1 + ξ/2).

Now we can expand the function (1 + ξ/2)^{−1}. This can be expressed with the
geometric series, namely:

  (1/(2ξ³)) (1 + ξ/2)^{−1} = (1/(2ξ³)) Σ_{n=0}^{∞} (−ξ/2)^n = (1/(2ξ³)) [1 − ξ/2 + ξ²/4 − ξ³/8 + ξ⁴/16 − ...]
                           = 1/(2ξ³) − 1/(4ξ²) + 1/(8ξ) − 1/16 + ξ/32 − ...

Recalling that ξ = z − 2 we obtain the Laurent expansion about z = 2:

  1/(z(z − 2)³) = 1/(2(z − 2)³) − 1/(4(z − 2)²) + 1/(8(z − 2)) − 1/16 + (z − 2)/32 − ...

The residue at z = 2 is therefore 1/8.
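Both residues of Example 3.2.5 can be checked with sympy's built-in residue function (a sketch, not part of the text):

```python
# Residues of f(z) = 1/(z*(z - 2)^3) at its two poles.
import sympy as sp

z = sp.symbols('z')
f = 1/(z*(z - 2)**3)
print(sp.residue(f, z, 0))   # -> -1/8
print(sp.residue(f, z, 2))   # ->  1/8
```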
Because of the Cauchy’s integral formula for higher derivatives we are also able
to calculate the coefficients an for each n. We have in fact:
1
g (n+m) (z0 )
=
an =
(n + m)!
2πi
I
g(z)
dz,
(z − z0 )n+m+1
where the integral is taken on any contour enclosing z0 but not enclosing any other
g(z)
singularity of f (z) or g(z). Recalling that f (z) = (z−z
m we obtain:
0)
113
3.2. COMPLEX INTEGRATION
I
f (z)
1
dz.
(3.25)
an =
2πi
(z − z0 )n+1
The part of the Laurent series with indices n ≥ 0 is called the analytic part, whereas
the remainder of the series, consisting of the negative indices, is called the principal
part. Depending on the nature of the singularity z0, the principal part may contain
an infinite number of terms, so that:

  f(z) = Σ_{n=−∞}^{∞} an (z − z0)^n.   (3.26)
We can use this equation as a definition of the nature of the point z0 :
• if, given a positive m all the coefficients of the Laurent series Eq. 3.26 ai = 0
for any i < m, then z0 is a zero of order m of the function f .
• if a0 6= 0 but ai = 0 for any negative i, then f is analytic at z0 .
• if a positive integer number m exists such that a−m 6= 0 but all the coefficients
a−i of the Laurent series with i > m are zero, then z0 is a pole of order m of
the function f .
• if it is not possible to find such a lowest value m, then f (z) is said to have an
essential singularity at z0 .
3.2.6
Residue theorem
If z0 is a pole of order m of the function f (z), then the coefficient a−1 (not a−m ) is
called residue of the function f (z) at the pole z0 and is usually indicated as a−1 (z0 )
or R(z0 ).
Example 3.2.6 Verify that z = 0 is a pole of the function f (z) =
its residue.
cos z
z3
and calculate
The function cosine has the Taylor series expansion:
cos z =
∞
X
n=0
(−1)n
z2 z4
z 2n
=1−
+
− ...
(2n)!
2!
4!
We have therefore:
z2 z4
1
1
z
1
+
− ... = 3 −
+ − ...
f (z) = 3 1 −
z
2!
4!
z
2!z 4!
we can see therefore from this formula that z = 0 is indeed a pole (of order 3) of the
given function and that the residue at z = 0 is a−1 = − 12 .
114
CHAPTER 3. COMPLEX ANALYSIS
In many cases it is convenient not to recover the whole Laurent series to find
the residue. In fact, if a function f (z) has a pole of order m on a point z0 , we have:
f (z) =
a−1
a−m
+···+
+ a0 + a1 (z − z0 ) + . . .
m
(z − z0 )
z − z0
If we multiply now both sides of this equation by (z − z0 )m we obtain:
(z − z0 )m f (z) = a−m + · · · + a−1 (z − z0 )m−1 + . . .
Now we can differentiate both sides m − 1 times and obtain:
∞
X
dm−1 m
(z − z0 ) f (z) = (m − 1)!a−1 +
bn (z − z0 )n ,
m−1
dz
n=1
for some coefficients bn . Now, if we take the limit of both sides for z → z0 the terms
in the sum will disappear, therefore what remains is
dm−1 m
(z
−
z
)
f
(z)
= (m − 1)!a−1
0
z→z0 dz m−1
1
dm−1 ⇒ a−1 =
lim m−1 (z − z0 )m f (z) .
(m − 1)! z→z0 dz
lim
(3.27)
Considering Example 3.2.6, since z = 0 was a pole of order 3 we can calculate the
residue in this way:
d2 cos z
1
1
lim 2 z 3 3 = − ,
2! z→0 dz
z
2
A special case of this equation occurs when z0 is a simple pole. Then, the residue
a−1 is given by:
a−1 =
a−1 = lim [(z − z0 )f (z)].
(3.28)
z→z0
Why are the residues so important? We have seen that, if a function f (z) has a pole
of order m in a point z0 , then it can be written as a Laurent series about z0 :
f (z) =
∞
X
n=−m
an (z − z0 )n .
If we want to integrate f (z) around a closed curve C that encloses z0 but no other
singular points, we know (see Eq. 3.17 and Fig. 3.6) that this is equivalent to
integrate f around a small circumference γ around z0 , therefore:
I
C
f (z)dz =
I X
∞
γ n=−m
n
an (z − z0 ) dz =
∞
X
n=−m
an
I
γ
(z − z0 )n dz.
115
3.2. COMPLEX INTEGRATION
H
But we have already seen (Example 3.2.2) that (z − z0 )n dz in a circle around z0 is
equal to zero for all the exponents but for n = −1 and, in this case, is equal to 2πi.
We have therefore:
I
C
f (z)dz = a−1
I
γ
dz
= 2πia−1 .
z − z0
(3.29)
We can extend the above argument to a contour C enclosing n poles of a function f
(see Eq. 3.18 and Fig. 3.7) and we obtain:
I
f (z)dz = 2πi
C
n
X
Ri ,
(3.30)
i=1
Pn
where i=1 Ri is the sum of the residues of f (z) at its poles within C. This fundamental result of the complex analysis is known as residue theorem.
3.2.7
Real integrals using contour integration
Given a definite integral of the form:
Z
∞
f (x)dx,
−∞
this can be evaluated by means of the residue theorem provided that the function
f (z) (the complex function analogous to the real function f (x)) has the following
properties:
• f (z) is analytic in the upper half-plane of the Argand diagram, except for a
finite number of poles, none of which is in the real axis.
• zf (z) → 0 as |z| → ∞.
If these two conditions are fulfilled, then we can evaluate the integral of the function
f (z) along the path indicated in Fig. 3.9, namely along a semicircle Γ of radius
R (large enough to enclose all the poles) and along the x-axis, between the points
−R and R. The residue theorem ensures us that this integral is given by 2πi times
the sum of the residues of f at the poles on the upper half-plane. To evaluate the
integral along Γ, we notice that the modulus of each z on Γ is R. Since zf (z) → 0
as |z| = R → ∞ we have also that |zf (z)| = R|f (z)| tends to 0 for R → ∞. For this
reason (since z = Reiθ , θ ∈ [0, π] on Γ):
Z
f (z)dz =
Γ
Z
0
π
iθ
f (z) · iRe dθ, ⇒
lim |
R→∞
Z
Γ
f (z)dz| ≤ lim
R→∞
Z
0
π
R|f (z)|dθ = 0.
116
CHAPTER 3. COMPLEX ANALYSIS
Figure 3.9: Contour used in Example 3.2.7.
Therefore, if we take the contour shown in Fig. 3.9 and we take the limit for R → ∞,
RR
R∞
the integral along Γ vanishes, −R f (z)dz transforms into −∞ f (x)dx (we are on a
curve along the x-axis, where y = 0 and therefore z = x) and what remains is:
Z
∞
f (x)dx = 2πi
−∞
X
Rj ,
(3.31)
Im(z)>0
namely we take the residues Rj only on the poles zj that lie above the x-axis (with
positive imaginary part). Of course there is no specific reason why we have chosen the
upper half-plane; we might have chosen the lower half-plane as well and the result
would not have changed (but if we chose the lower half-plane we have to rotate
RR
clockwise around Γ, otherwise we could not have −R f (x)dx on the x-axis).
Example 3.2.7 Evaluate the integral
Z ∞
0
x2
dx.
(x2 + 1)3
R∞
2
We notice first that the function to integrate is even, therefore 0 (x2x+1)3 dx =
R
2
x2
1 ∞
dx. The function f (z) = (z 2z+1)3 has two poles (of order 3) at z = i
2 −∞ (x2 +1)3
and z = −i (none of them lie in the x-axis) and it is clearly zf (z) → 0 as |z| → ∞,
117
3.2. COMPLEX INTEGRATION
therefore we can evaluate the integral along the path of Fig. 3.9, the integral
tends to zero for R → ∞ and therefore we have:
Z
∞
−∞
R
Γ
f (z)dz
x2
dx = 2πiR(i).
(x2 + 1)3
Since the pole at z = i is of order 3, the residue is given by:
We have:
d2
z2
1
1 d2
z2
3
(z
−
i)
=
lim
.
R(i) = lim
z→i 2 dz 2
(z 2 + 1)3
2 z→i dz 2 (z + i)3
2z(z + i)3 − 3(z + i)2 z 2
2iz − z 2
z2
d
=
=
dz (z + i)3
(z + i)6
(z + i)4
2(i − z)(z + i)4 − 4(z + i)3 (2iz − z 2 )
−2(z 2 + 1) + 4(z 2 − 2iz)
z2
d2
=
=
.
dz 2 (z + i)3
(z + i)8
(z + i)5
Substituting this value of
d2
f (z)
dz 2
into the formula to obtain the residue, we get:
1 4(−1 + 2)
i
=
−
.
2 (2i)5
16
R(i) =
In this way we obtain:
Z
∞
−∞
x2
π
dx = , ⇒
2
3
(x + 1)
8
Z
∞
0
x2
π
dx = .
2
3
(x + 1)
16
In the case in which there are some simple poles on the x-axis, we can integrate
the function along the curve in Fig. 3.10 (solid line). Reminding that in this case
a−1
is f (z) = z−a
+ Pn (z − a), the integral along the curve γ can be evaluated taking
iθ
z = a + εe (and therefore dz = iεeiθ dθ), with θ ∈ [π, 0] (the curve rotates clockwise
around a) and taking the limit for ε → 0. In this way we have:
Z
γ
Z
f (z)dz = lim a−1
ε→0
π
0
1
iεeiθ dθ +
εeiθ
Z
0
iθ
Pn εe
π
iθ
iεe dθ = −iπa−1 .
(3.32)
In fact, the integral of the polynomial venishes for ε → 0. Provided that the integral
over the big half-circle Γ vanishes for R → ∞ (namely, provided that |zf (z)| → 0 as
|z| → ∞), we have:
lim
R→∞,ε→0
⇒
Z
∞
−∞
Z
f (z)dz +
Γ
f (x)dx = 2πi
Z
f (z)dz +
γ
X
Im(z)>0
Z
R
−R
Rj + πi
X
X
f (x)dx = 2πi
Rj
Im(z)=0
Im(z)>0
Rk ,
(3.33)
118
CHAPTER 3. COMPLEX ANALYSIS
Figure 3.10: Contours to use in the case in which there are poles on the x-axis.
namely, besides the residues Rj with positive imaginary parts, we have to add πi
times the residues Rk along the x-axis. What happens if we take the dashed contour
of Fig. 3.10 instead (namely if we rotate counterclockwise around the singularity at
a)? In this case, the integral over γ1 of the function f (z) is given by:
Z
γ1
Z
f (z)dz = lim a−1
ε→0
2π
π
1
iεeiθ dθ +
εeiθ
Z
2π
iθ
Pn εe
π
iθ
iεe dθ = iπa−1 ,
namely it has the opposite sign compared to Eq. 3.32 because of the different direction. However, in this case is a within the big contour we use for the integration of
the function f (z), therefore the residue theorem tells us that the integral is given by
2πi times the residues of all the singularities on the x-axis and above it, namely:
lim
R→∞,ε→0
⇒
Z
∞
−∞
Z
f (z)dz +
Γ
f (x)dx = 2πi
Z
X
f (z)dz +
γ1
Im(z)≥0
Rj − πi
Z
R
−R
X
X
f (x)dx = 2πi
Rj
Im(z)≥0
Rk .
Im(z)=0
Of course, the two methods produce the same result. In the end, a simple pole on
the x-axis is counted as one-half of what it would be if it were above the axis.
R∞
R∞
dx or −∞ cos(mx)
dx
In the case we have to evaluate integrals of the form −∞ sin(mx)
f (x)
f (x)
it is often useful the Jordan’s lemma that states that, if a function f (z) is analytic in
119
3.2. COMPLEX INTEGRATION
the upper half-plane of the Argand diagram (Im(z) > 0) except for a finite number
of poles and if |f (z)| → 0 for |z| → ∞, then for each m > 0 we have:
lim
R→∞
Z
eimz f (z)dz = 0,
(3.34)
Γ
where Γ is as usual the semicircular contour with radius R in the upper half-plane,
centered on the origin.
Example 3.2.8 Evaluate the integral
Z ∞
cos mx
, a ∈ R, a > 0.
−∞ x − a
imz
To evaluate this integral, we will evaluate the integral of the function f (z) = ez−a
along the curve indicated in Fig. 3.10 (solid line). For ε (radius of the semicircular
contour γ centered on a) → 0 this integral tends to:
Z
f (z)dz =
C
Z
f (z)dz +
Γ
Z
R
f (z)dz +
−R
Z
f (z)dz.
γ
The function f (z) has a simple pole at z = a, therefore, according to Eq. 3.32:
Z
f (z)dz = −iπa−1 ,
γ
where the sign minus is due to the fact that the chosen curve rotates clockwise around
a. The residue a−1 at z = a is given by:
eimz
= eima ,
= lim (z − a)
z→a
z−a
a−1
therefore:
Z
γ
f (z)dz = −iπeima .
Since |(z −a)−1 | → 0 for |z| → ∞, the Jordan’s lemma ensures us that
for R → ∞. What remains is therefore:
Z
∞
∞
cos mx
dx = −π sin ma.
x−a
−∞
eimz
dz = iπeima .
z−a
R
Γ
f (z)dz → 0
Note that this contour lies on the x-axis, therefore z = x. If we take now the real
parts of both sides of this equation we obtain:
Z
−∞
120
CHAPTER 3. COMPLEX ANALYSIS
Figure 3.11: Contour to use in the case in which there is a branch point at the origin.
√
We have seen at the beginning of this Chapter that some functions (like z or
ln z) are multiple-valued and we need a branch cut to make them single-valued. The
branch cut can be used to evaluate integrals involving this kind of functions. Usually
a contour to use in this case is the one in Fig. 3.11, namely we evaluate the integral
on the big circle Γ (letting the radius R go to infinity), on the small circle γ (letting
the radius ε go to zero) and on the two straight lines r1 and r2 (noticing that, since
we are dealing with multiple-valued functions, these two integrals do not cancel out).
Example 3.2.9 Evaluate the integral
Z ∞
0
√
x
dx.
(x + 1)2
√
z
The function f (z) = (z+1)
2 has a branch point at z = 0 and a pole (of order 2) at
z = −1. We can evaluate the integral of f (z) along the curve C = Γ − r2 − γ + r1
indicated in Fig. 3.11. The integral will be 2πi times the residue at the pole z = −1,
namely:
I
√
1
z
d
2
(z + 1)
= 2πi √
= π.
f (z)dz = 2πi lim
2
z→−1 dz
(z + 1)
2 −1
C
To evaluate the integrals along Γ and γ we set z = Reiθ for Γ and z = εeiθ for γ and
we let R → ∞ and ε → 0. We obtain:
121
3.2. COMPLEX INTEGRATION
I
f (z)dz =
Z
f (z)dz =
Z
Γ
I
2π
0
γ
2π
0
√
θ
Rei 2 · iReiθ
dθ →R→∞ 0
(Reiθ + 1)2
√ iθ
εe 2 · iεeiθ
dθ →ε→0 0.
(εeiθ + 1)2
To evaluate the integral along the lines r1 and r2 the easiest thing to do is to substitute
z = xe0i along r1 and z = xe2πi along r2 . In fact, the two curves will eventually
overlap with the x-axis and therefore |z| → x in both cases, but the argument is
different because the function is multiple-valued. We have therefore:
√
√ πi
Z 0
xe0i
xe
dx = π.
2 dx +
0i
2πi
(xe + 1)
+ 1)2
0
∞ (xe
R ∞ √x
The first integral is equal to 0 (x+1)
2 dx; in the second integral we have at the de√
√
2
nominator again (x + 1) but at the numerator we have xeiπ = − x. We have
therefore:
Z
Z
∞
0
√
∞
x
dx +
(x + 1)2
Z
∞
0
√
x
dx = π ⇒
(x + 1)2
Z
0
∞
√
x
π
dx = .
2
(x + 1)
2
Another important application of the residue theorem is to evaluate integrals of the
form
Z 2π
F (sin θ, cos θ)dθ.
0
In this case we can transform this integral into a complex integral around the unit
circle C with |z| = 1. In fact, along this curve is z = eiθ , therefore dz = ieiθ dθ,
namely:
dz
.
iz
Recalling then the definitions of sine and cosine, the given integral can be transformed
into:
dθ =
2π
z − z −1 z + z −1 dz
F (sin θ, cos θ)dθ =
F
,
,
(3.35)
2i
2
iz
0
C
which can be evaluated by checking how many singularities of the function F lie
inside C and calculating their residues.
Z
I
Example 3.2.10 Calculate the integral
Z 2π
0
dθ
.
sin θ
122
CHAPTER 3. COMPLEX ANALYSIS
We can transform this integral as an integral along the circle of radius 1 obtaining:
Z
2π
dθ
=
sin θ
0
I
C
1
z−z −1
2i
dz
=2
iz
I
C
dz
.
z2 − 1
The given function has (simple) poles at z = 1 and z = −1. The residues of the
given function at these points are:
1
1
R(−1) = lim (z + 1) 2
=−
z→−1
z −1
2
1
1
R(1) = lim (z − 1) 2
=
z→1
z −1
2
We have therefore:
Z
0
2π
1 1
dθ
= 0.
= 4πi − +
sin θ
2 2
Chapter 4
Integral transforms
In mathematics, an integral transform is any transform T of a given function f of
the following form:
T f (s) =
x2
Z
K(x, s)f (x)dx.
(4.1)
x1
The input is a function f (x) and the output is another function T f (s). There
are different integral transforms, depending on the kernel function K(x, s). The
transforms we consider in this chapter are the Laplace transform and the Fourier
transform.
4.1
4.1.1
Laplace transform
Basic definition and properties
To obtain the Laplace transform of a given function f (x) we use the kernel K(x, s) =
e−sx , namely:
L{f } = F (s) =
Z
∞
f (x)e−sx dx.
(4.2)
0
Here s can also be a complex variable, namely the Laplace transform maps a real
function to a complex one. For our purposes it is enough to consider for the moment
s real. We can easily verify that L is a linear operator. In fact:
L{af + bg} =
Z
∞
−sx
[af (x) + bg(x)]e
dx = a
0
Z
0
⇒ L{af + bg} = aL{f } + bL{g}.
123
∞
−sx
f (x)e
dx + b
Z
∞
g(x)e−sx dx
0
(4.3)
124
CHAPTER 4. INTEGRAL TRANSFORMS
Example 4.1.1 Find the Laplace transform of the function f (x) = 1.
It is
L{1} =
Z
∞
0
e−sx dx = −
1 −sx ∞ 1
e
= .
0
s
s
Example 4.1.2 Find the Laplace transform of f (x) = xn , with n positive integer.
We integrate by parts and obtain:
n
L{x } =
Z
∞
n −sx
x e
0
∞ n
1
dx = − xn e−sx 0 +
s
s
Z
∞
xn−1 e−sx dx =
0
n
L(xn−1 ).
s
To obtain L{xn−1 } we proceed the same way and obtain L{xn−1 } =
We iterate n times and obtain:
L{xn } =
n−1
L{xn−2 }.
s
n!
n(n − 1)(n − 2) . . .
L{1} = n+1 .
n
s
s
Example 4.1.3 Find the Laplace transform of f (x) = sin(mx).
R∞
It is L{f (x)} = 0 e−sx sin(mx)dx. By using the relation sin(mx) =
obtain:
Z ∞
Z ∞
1
(im−s)x
−(im+s)x
e
dt −
L{f (x)} =
e
dx
2i
0
0
(im−s)x ∞ −(im+s)x ∞ e
e
1
−
=
2i
im − s 0
−im − s 0
1
1
m
1
=
= 2
−
.
2i s − im s + im
s + m2
eimx −e−imx
2i
we
In these three simple cases, it was clear that the integral 4.2 was convergent for
any possible value of s; in fact, lim xn e−sx = 0 ∀s, n. This is not always the case,
x→∞
as the two following examples show.
Example 4.1.4 Find the Laplace transform of f (x) = eax .
ax
L{e } =
Z
0
∞
ax −sx
e e
dx = lim
A→∞
Z
A
(a−s)x
e
0
dx = lim
A→∞
e(a−s)x
a−s
A
.
0
It is clear that this limit exists and is finite only if a < s (a < Re (s) if s ∈ C),
namely we can define the Laplace transform of the function f (x) = eax only if Re
(s) > a. In this case it is:
L{eax } =
1
.
s−a
125
4.1. LAPLACE TRANSFORM
Example 4.1.5 Find the Laplace transform of the function f (x) = cosh(mx).
It is L{f (x)} =
we obtain:
R∞
0
e−sx cosh(mx)dx. By using the relation cosh(mx) =
emx +e−mx
2
Z ∞
Z ∞
1
(m−s)x
−(m+s)x
L{f (x)} =
e
dt +
e
dx
2
0
0
(m−s)x ∞ −(m+s)x ∞ e
e
1
−
=
2
m−s 0
m+s 0
1
s
1
1
=
= 2
+
.
2 s−m s+m
s − m2
This result holds as long as e(m−s)x and e−(m+s)x tend to zero for x → ∞, namely it
must be s > |m|.
There are a few properties of the Laplace transform that help us finding the
transform of more complex functions. If we know that F (s) is the Laplace transform
of f (x), namely that L{f (x)} = F (s), then:
•
L ecx f (x) = F (s − c)
(4.4)
This property comes directly from the definition of Laplace transform, in fact:
•
L ecx f (x) =
Z
∞
cx −sx
f (x)e e
0
dx =
Z
∞
0
f (x)e−(s−c)x dx = F (s − c).
1 s
, (c > 0)
L{f (cx)} = F
c
c
(4.5)
To show that it is enough to substitute cx with t. In this way is x = ct , dx =
and therefore:
L{f (cx)} =
•
Z
0
∞
−sx
e
1
f (cx)dx =
c
Z
∞
0
dt
c
s
1 s
.
e− c t f (t)dt = F
c
c
L{uc(x)f (x − c)} = e−sc F (s)
(4.6)
Here is uc (x) the Heaviside or step function, namely:

0 x < c
uc (x) =
1 x ≥ c
(4.7)
126
CHAPTER 4. INTEGRAL TRANSFORMS
The function uc (x)f (x − c) is thus given by:

0 x < c
uc (x)f (x − c) =
f (x − c) x ≥ c
We have thus:
L{uc (x)f (x − c)} =
Z
c
∞
e−sx f (x − c)dx.
With the substitution t = x − c we obtain:
Z ∞
L{uc(x)f (x − c)} =
e−s(c+t) f (t)dt = e−sc F (s).
0
•
L{xn f (x)} = (−1)n F (n) (s)
(4.8)
It is enough to derive F (s) with respect to s, to obtain:
Z
Z ∞
d ∞ −sx
e f (x)dx = −
xe−sx f (x)dx = −L{xf (x)}.
F (s) =
ds 0
0
If we now differentiate n times F (s) with respect to s we obtain:
′
F (n) (s) = (−1)n L{xn f (x)}.
From it, Eq. 4.8 is readily obtained.
•
L{f ′(x)} = −f (0) + sF (s)
(4.9)
This property can be obtained integrating e−sx f ′ (x) by parts, namely:
′
L{f (x)} =
Z
0
∞
∞
f (x)dx = f (x)e−sx 0 +s
−sx ′
e
Z
∞
0
e−sx f (x)dx = −f (0)+sF (s).
Example 4.1.6 Find the Laplace transform of cos(mx).
We could calculate this transform directly but it is easier to use the Laplace transform
m
of sin(mx) that we have calculated in Example 4.1.3 (L{sin(mx)} = s2 +m
2 ). From
Eq. 4.9 (and reminding that L is a linear operator) we have:
d
L
sin(mx)
dx
⇒ L{cos(mx)} =
= mL{cos(mx)} = − sin(0) + s ·
s2
s
.
+ m2
s2
m
.
+ m2
127
4.1. LAPLACE TRANSFORM
Example 4.1.7 Find the Laplace transform of x cosh(mx).
s
We remind from Example 4.1.5 that L{cosh(mx)} = F (s) = s2 −m
2 (s > |m|). Eq.
′
4.8 tells us that F (s) is the Laplace transform of −x cosh(mx). We have therefore:
L{x cosh(mx)} = −F ′ (s) = −
s2 − m2 − 2s2
s2 + m2
=
.
(s2 − m2 )2
(s2 − m2 )2
Example 4.1.8 Find the Laplace transform of the function f (x) defined in this way:

x x < π
f (x) =
x − cos(x − π) x ≥ π
By means of the step function (Eq. 4.7) we can rewrite f (x) as f (x) = x −
uπ (x) cos(x−π). The Laplace transform of this function can be found by means of Eq.
n!
s
4.6 and of the known results L{xn } = sn+1
(Example 4.1.2) and L{cos(mx)} = s2 +m
2
(Example 4.1.6).
L{f (x)} = L{x} − L{uπ (x) cos(x − π)} =
4.1.2
1
se−πs
1
−πs
−
e
L{cos
x}
=
−
.
s2
s2 s2 + 1
Solution of initial value problems by means of Laplace
transforms
We have seen (Eq. 4.9) that the Laplace transform of the derivative of a function
is given by L{f ′(x)} = −f (0) + sF (s), where F (s) = L{f (x)}. If we consider the
Laplace transform of higher order derivatives we obtain (always integrating by parts):
′′
L{f (x)} =
Z
∞
−sx ′′
e
−sx ′
f (x)dx = e
0
′
2
f (x)
∞
0
+s
Z
∞
e−sx f ′ (x)dx
0
= −f (0) − sf (0) + s F (s)
Z
Z ∞
−sx ′′ ∞
′′′
−sx ′′′
L{f (x)} =
e f (x)dx = e f (x) 0 + s
∞
e−sx f ′′ (x)dx
0
0
′′
′
2
3
= −f (0) − sf (0) − s f (0) + s F (s)
..
.
L{f (n) (x)} = sn F (s) − sn−1 f (0) − sn−2 f ′ (0) − · · · − sf (n−2) (0) − f (n−1) (0). (4.10)
128
CHAPTER 4. INTEGRAL TRANSFORMS
This result allows us to simplify considerably linear ODEs. Let us take for instance an
initial value problem consisting of a second-order inhomogeneous ODE with constant
coefficients (but the method can be applied also to more complex ODEs):


′′
′


a2 y (x) + a1 y (x) + a0 y(x) = f (x)
y(0) = y0



y ′ (0) = y0 ′
If we now make the Laplace transform of both members of this equation (calling Y (s)
the Laplace transform of y(x) and F (s) the Laplace transform of f (x)), we obtain:
a2 s2 Y (s) − sy0 − y0 ′ + a1 [sY (s) − y0 ] + a0 Y (s) = F (s)
⇒ Y (s)(a2 s2 + a1 s + a0 ) = F (s) + a1 y0 + a2 (sy0 + y0 ′ )
⇒ Y (s) =
F (s) + a1 y0 + a2 (sy0 + y0 ′ )
.
a2 s2 + a1 s + a0
(4.11)
Namely, we have transformed an ODE into an algebraic one, which is of course easier
to solve. Moreover, the particular solution (satisfying the given initial conditions)
is automatically found, without need to search first the general solution and the
look for the coefficients that satisfy the initial conditions. Further, homogeneous
and inhomogeneous ODEs are handled in exactly the same way; it is not necessary
to solve the corresponding homogeneous ODE first. The price to pay for these
advantages is that Eq. 4.11 is not yet the solution of the given ODE; we should
invert this relation and find the function f (x) whose Laplace transform is given by
F (s). This function is called the inverse Laplace transform of F (s) and it is indicated
with L−1 {F (s)}.
Since the operator L is linear, it is easy to show that also the inverse operator
−1
L is linear. In fact, given two functions f1 (x) and f2 (x) whose Laplace transforms
are F1 (s) and F2 (s), respectively, the linearity of the operator L ensures us that:
L{c1 f1 (x) + c2 f2 (x)} = c1 F1 (s) + c2 F2 (s).
If we apply now the operator L−1 to both members of this equation we obtain:
L−1 L{c1 f1 (x) + c2 f2 (x)} = L−1 {c1 F1 (s) + c2 F2 (s)} = c1 L−1 {F1 (s)} + c2 L−1 {F2 (s)}.
To invert the function F (s) it is therefore enough to split it into many (possibly
simple) addends and find for each of them the inverse Laplace transform. Based
on the examples in Sect. 4.1.1 (and others that we do not have time to calculate,
but that can be found in the mathematical literature) it is possible to construct a
129
4.1. LAPLACE TRANSFORM
Table 4.1: Summary of elementary Laplace transforms
f (x) = L−1 {F (s)}
1
emx
xn
sin(mx)
cos(mx)
sinh(mx)
cosh(mx)
emx sin(px)
emx cos(px)
n mx
x e
x−1/2
√
x
δ(x − c)
uc (x)
uc (x)f (x − c)
ecx f (x)
f (cx)
Rx
f (x̃)dx̃
R0x
f (x − ξ)g(ξ)dξ
0
(−1)n xn f (x)
f (n) (x)
F (s) = L{f (x)}
Convergence
1
s
s>0
s>m
s>0
s>0
s>0
s>m
s>m
s>m
1
s−m
n!
sn+1
m
s2 +m2
s
s2 +m2
m
s2 −m2
s
s2 −m2
p
(s−m)2 +p2
s−m
(s−m)2 +p2
n!
(s−m)n+1
p
π
s
p
1
π
2
s3
−cs
s>m
s>m
s>0
s>0
c>0
s>0
e
e−cs
s
−cs
e F (s)
F (s − c)
1
s
F
c
c
c>0
F (s)
s
F (s)G(s)
F (n) (s)
sn F (s) − sn−1 f (0) − · · · − f (n−1) (0)
“dictionary” of basic functions/expressions and corresponding Laplace transforms,
as in Table 4.1. Any time we face a particular F (s), we can look at the dictionary and
check whether it is possible to recover the function f (x) whose Laplace transform is
F (s).
Example 4.1.9 Find the inverse Laplace transform of the function
F (s) =
s2 + 5
s3 − 9s
We can write the given function as:
F (s) =
s2 + 5
s2 + 5
=
.
s(s2 − 9)
s(s − 3)(s + 3)
To invert this function we have to apply the method of the partial fractions, namely:
130
CHAPTER 4. INTEGRAL TRANSFORMS
s2 + 5
A
B
C
As2 − 9A + Bs2 + 3Bs + Cs2 − 3Cs
= +
+
=
.
s(s − 3)(s + 3)
s
s−3 s+3
s(s − 3)(s + 3)
Now we can compare terms with like power of s, obtaining the following system of
equations:




A + B + C = 1
3B − 3C = 0



−9A = 5
From the second we obtain B = C, from the last A = − 59 . From the first equation:
14
7
⇒ B=C= .
9
9
Now we can invert all the terms of the given function and obtain:
2B =
51 7
1
1
s2 + 5
−1
−
=L
+
+
f (x) = L
s3 − 9s
9s 9 s−3 s+3
5 14 −1
5 14
s
=− + L
=− +
cosh(3x).
2
9
9
s −9
9
9
−1
Example 4.1.10 Solve the initial value problem


′′
x


y (x) + 4y(x) = e
y(0) = 0



y ′ (0) = −1
We have to apply the operator L to both members of the given ODE. Since this is
a second-order ODE with constant coefficients, we can apply directly Eq. 4.11 to
obtain:
1
−1
2−s
F (s) − 1
s−1
= 2
=
.
Y (s) = 2
s +4
s +4
(s − 1)(s2 + 4)
We apply now the method of the partial fractions to decompose this function:
A
Bs + C
2−s
+ 2
=
s−1
s +4
(s − 1)(s2 + 4)
⇒ As2 + 4A + Bs2 + Cs − Bs − C = 2 − s.
131
4.1. LAPLACE TRANSFORM
By equating the terms with like power of s we obtain the system of equations:




A + B = 0
⇒
C − B = −1



4A − C = 2




A = −B
C =B−1



−4B − B = 1
The decomposed Y (s) is thus given by:


1


B = − 5
⇒
A = 15



C = − 6
5
1 1
1 s
6 1
− 2
− 2
.
5s−1 5s +4 5s +4
With the help of Table 4.1 we can easily identify the inverse Laplace transforms of
these addends, obtaining therefore:
Y (s) =
y(x) =
ex cos(2x) 3 sin(2x)
−
−
.
5
5
5
The method of the Lagrange transform is sometimes more convenient, sometimes less
convenient compared to traditional methods of ODE resolution. It proves however
to be always more convenient in the case in which the inhomogeneous function is
a step function. In fact, in this case the only available traditional method is the
laborious variation of constants, whereas the Laplace transform of the step function
can be readily found.
Example 4.1.11 Find the solution of the initial value problem:


′′

y (x) + y(x) = g(x)

where g(x) is given by:
y(0) = 0



y ′ (0) = 0




0 0 ≤ x < 1
g(x) = x − 1 1 ≤ x < 2



1 x ≥ 2
(also known as ramp loading).
The function g(x) can be written as:
g(x) = u1 (x)(x − 1) − u2 (x)(x − 2),
where uc (x) is the Heaviside function (Eq. 4.7). In fact, for x < 1 both u1 and u2 are
zero. For x between 1 and 2 is u1 = 1 but u2 is still zero. For x ≥ 2 both functions
132
CHAPTER 4. INTEGRAL TRANSFORMS
are 1 and therefore u1 (x)(x − 1) − u2 (x)(x − 2) = x − 1 − x + 2 = 1. If we make the
Laplace transform of both members of the given ODE we obtain:
e−s e−2s
− 2
s2
s
1 + s2 − s2
e−s − e−2s e−s − e−2s
1
= e−s − e−2s 2 2
=
−
.
⇒ Y (s) = e−s − e−2s 2 2
s (s + 1)
s (s + 1)
s2
s2 + 1
s2 Y (s) + Y (s) =
To invert this function Y (s) we use again the relation L−1 {uc (x)f (x−c)} = e−cs F (s)
to obtain:
f (x) = u1 (x)(x − 1) − u2(x)(x − 2) − u1 (x) sin(x − 1) + u2 (x) sin(x − 2).
Among the results presented in Table 4.1 very significant is the one concerning the
Dirac delta function δ(x−c). We remind here briefly what is the Dirac delta function
and what are its properties. Given a function g(x) defined in the following way:
g(x) = dξ (x) =

1
−ξ <x<ξ
2ξ
0 x ≤ −ξ or x ≥ ξ
(4.12)
it is clear that the integral of this function is 1 for any possible choice of ξ, in fact:
Z
∞
g(x)dx =
−∞
Z
ξ
−ξ
1
dx = 1.
2ξ
It is also clear that if ξ tends to zero, the interval of values of x in which g(x) is
different from zero becomes narrower and narrower until it disappears. Analogously,
the function g(x − c) = dξ (x − c) is non-null only in a narrow interval of x centered
on c that disappears for ξ tending to zero. The limit of the function g(x) = dξ (x)
for ξ → 0 is called Dirac delta function and is indicated with δ(x). It is therefore
characterized by the properties:
δ(x − c) = 0 ∀x 6= c
Z ∞
δ(x)dx = 1.
(4.13)
(4.14)
−∞
Given a generic function f (x), if we integrate f (x)δ(x − c) between −∞ and ∞ we
obtain:
∞
1
f (x)δ(x − c)dx = lim
ξ→0 2ξ
−∞
Z
Z
c+ξ
c−ξ
1
[2ξf (x̃)] , x̃ ∈ [c − ξ, c + ξ].
ξ→0 2ξ
f (x)dx = lim
133
4.1. LAPLACE TRANSFORM
The last step is justified by the mean value theorem for integrals. But the interval
of values in which x̃ must be taken collapses to the point c for ξ → 0, therefore we
obtain the important property of the Dirac delta function:
Z
∞
f (x)δ(x − c)dx = f (c).
−∞
(4.15)
To calculate the Laplace transform of δ(x − c) (with c ≥ 0) it is conveninet to
calculate first the Laplace transform of the function dξ (x − c) and then take the limit
ξ → 0, namely:
L{δ(x − c)} = lim
ξ→0
Z
∞
−sx
e
0
−s(c+ξ)
dξ (x − c)dx = lim
ξ→0
Z
c+ξ
c−ξ
sξ
e−sx
dx
2ξ
e − e−sξ
−e
= e−sc lim
ξ→0
ξ→0
−2sξ
2sξ
sξ
−sξ
ξ(e + e )
= e−sc lim
= e−sc .
ξ→0
2ξ
= lim
e
−s(c+ξ)
The last step is justified by the de l’Hopital’s rule for limits. In this way we have
found the result reported in Table 4.1 about the Laplace transform of δ(x − c). In
the case that c = 0 we have L{δ(x)} = 1.
4.1.3
The Bromwich integral
Although for most of the practical purposes the inverse Laplace transform of a given
function F (s) can be found by means of the “dictionary” provided by Tab. 4.1 (or
of more extended tables that can be found in the literature), a general formula for
the inversion of F (s) can be found treating F (s) as a complex function and is given
by the so-called Bromwich integral:
1
f (x) = L {F (s)} =
2πi
−1
Z
λ+i∞
esx F (s)ds,
(4.16)
λ−i∞
where λ is a real positive number and is larger that the real parts of all the singularities of esx F (s). In practice, the integral must be performed along the infinite line L,
parallel to the imaginary axis, indicated in Fig. 4.1. At this point, a curve must be
chosen in order to close the contour C. Possible completion paths are for instance
the curves Γ1 or Γ2 indicated in Fig. 4.2, namely the half-circles on the left and on
the right of L, respectively. For R → ∞ these curves make with L a closed contour.
The Bromwich integral can be evaluated by means of the residue theorem provided
that the integral of the function esx F (s) tends to zero for R (radius of the chosen
half-circle) tending to infinity. If we choose the completion path Γ1 , then the residue
theorem ensures us that:
134
CHAPTER 4. INTEGRAL TRANSFORMS
Figure 4.1: The infinite line L along which the Bromwich integral must be performed.
f (x) =
X
X
1
Rj ,
Rj =
· 2πi
2πi
C
C
(4.17)
where the sum is extended to all the residues of the function esx F (s) in the complex
plane. In fact, by construction L lies on the right of each singularity of esx F (s) and
on the limit R → ∞ the closed curve C = L + Γ1 will enclose them all (including for
instance the singularity z1 that in Fig. 4.2 is not yet enclosed in C). If we instead
have to choose the completion path Γ2 , then the closed curve L + Γ2 will enclose no
singularities and therefore f (x) will be zero.
Example 4.1.12 Find the inverse Laplace transform of the function
F (s) =
2e−2s
.
s2 + 4
From the relation L{uc (x)f (x − c)} = F (s) we can already derive the inverse Laplace
transform of the given function, namely u2 (x) sin[2(x−2)]. We check if we can obtain
the same result be means of the Bromwich integral. We have to evaluate the integral
1
2πi
Z
λ+i∞
λ−i∞
2es(x−2)
ds.
s2 + 4
4.1. LAPLACE TRANSFORM
135
Figure 4.2: Possible contour completions for the integration path L to use in the
Bromwich integral.
We notice first that the given function has two simple poles at s = 2i and s = −2i
(in fact it is s2 + 4 = (s + 2i)(s − 2i)), both of which have Re (z) = 0. We can
therefore take an arbitrarily (but positive) small value of λ. We can distinguish two
cases: i) x < 2 and ii) x > 2. For x < 2 the exponent s(x − 2) has negative real
part if Re (s) > 0. We notice here that es(x−2) = e(x−2)Re(s) ei(x−2)Im(s) , therefore
what determines the behavior of this function at infinity is e(x−2)Re(s) (ei(x−2)Im(s) has
modulus 1 and does not create problems). That means that, for Re (s) → +∞ the
function es(x−2) tends to zero. At the same time the denominator s2 + 4 diverges as
Re (s) → +∞ but not as fast as the exponential function tends to zero. Therefore
the integral of the function F (s)esx tends to zero along the curve Γ2 of Fig. 4.2
(for R → ∞) and we can calculate the Bromwich integral by means of the contour
C = L + Γ2 . For what we have learned, since the given closed contour does not
enclose the poles, the function f (x) is zero.
For x > 2, the function es(x−2) tends to zero for Re (s) → −∞. That means
s(x−2)
that the integral of the function es2 +4 tends to zero (for R → ∞) along the curve Γ1
of Fig. 4.2 and we take therefore Γ1 as a completion of L to calculate the Bromwich
integral. For the residue theorem, this integral is given by the sum of the residues of
the function esx F (s) at all the poles, namely:
136
CHAPTER 4. INTEGRAL TRANSFORMS
f (x) = Res(2i) + Res(−2i).
We have:
2es(x−2)
2es(x−2)
e2i(x−2)
=
lim
=
s→2i
s→2i s + 2i
s2 + 4
2i
s(x−2)
s(x−2)
2e
e−2i(x−2)
2e
= lim
=
Res(−2i) = lim (s + 2i) 2
s→−2i s − 2i
s→−2i
s +4
−2i
Res(2i) = lim (s − 2i)
By summing up these two residues we obtain:
1 2i(x−2)
e
− e−2i(x−2) = sin[2(x − 2)].
2i
This is what we obtain if x > 2 whereas, as we have seen, if x is smaller than 2
the function is zero. Recalling the definition of the Heaviside function uc (x) we can
conclude that the inverse Laplace transform of the given function is:
f (x) =
f (x) = L
−1
2e−2s
s2 + 4
= u2 (x) sin[2(x − 2)].
Example 4.1.13 Find the inverse Laplace transform of the function:
F (s) =
√
s − a,
with a ∈ R.
√
√
The function esx s − a has no poles, but the function z is multiple-valued in the
complex plane, therefore, as we have seen, a branch point is present at the point
z = 0, namely at s = a. This is the only singularity of our F (s)esx and therefore,
in order to evaluate the Bromwich integral, we have to take λ larger than a. The
integral to calculate will be:
√
1
L { s − a} =
2πi
−1
Z
λ+i∞
λ−i∞
√
s − aesx ds.
By means of the substitution z = s − a we obtain:
Z λ+i∞
Z
√
√ (z+a)x
1
eax λ+i∞ √ zx
−1
L { s − a} =
ze
dz =
ze dz.
2πi λ−i∞
2πi λ−i∞
In this case, the branch point is at zero, therefore λ can be arbitrarily small (but
always larger than zero). Since z = 0 is a branch point of the function to integrate,
we have to introduce a branch cut to evaluate the integral. Although we have taken
so far the positive real axis as a branch cut, we have also said that this choice is
137
4.1. LAPLACE TRANSFORM
Figure 4.3: Contour to use in Example 4.1.13.
√
arbitrary and to make the function z singe value it is enough that closed curves are
not allowed to enclose the origin. We can therefore take as branch cut the negative
real axis. In Fig. 4.3 we indicate the contour we must use to integrate the given
function. Since the closed contour C = L + Γ1 + r1 + γ + r2 + Γ2 does not enclose
singularities, its integral is zero. To evaluate the Bromwich integral (namely the
integral along L) we have to calculate the integral along the arcs Γ1 and Γ2 , along
the straight lines r1 and r2 and along the circumference γ.
√
√
Since the function zezx tends to zero for Re (z) → −∞ (the term z cannot
contrast the exponential decay of ezx ), the integral along the arcs Γ1 and Γ2 disappears.
To evaluate the integral along γ we take as usual z = εeiθ and we take the limit
for ε → 0. The interval of values of θ is [π, −π], in fact, as we arrive at γ the first
argument will be π. Then, we rotate clockwise around the origin and after a whole
circuit the argument will be −π. Since dz = iεeiθ dθ we have:
I
√
γ
zx
ze dz =
Z
π
−π
√
θ
iθ
εei 2 exεe · iεeiθ dθ.
The integrating function clearly tends to zero for ε → 0, therefore there is no contribution from the integral over γ.
Along the straight lines r1 and r2 we can assume that the arguments of the
complex numbers lying on them are π (along r1 ) and −π (along r2 ) and that their
138
CHAPTER 4. INTEGRAL TRANSFORMS
imaginary parts tend to zero, therefore we have z = reiπ (r1 ) and z = re−iπ (r2 ).
Notice here that, although we are on the negative real axis, r is positive. In fact,
eiπ = e−iπ = −1. The parameter r runs between +∞ and 0 (r1 ) and between 0 and
+∞ (r2 ). The integral of the given function along r1 turns out to be:
Z
√
zx
ze dz =
0
Z
√
i π2 xreiπ
re e
∞
r1
iπ
· e dr =
Z
0
√
∞
−xr
r·i·e
· (−1)dr = i
Z
∞
√
re−xr dr.
0
Along r2 we have:
Z
√
zx
ze dz =
r2
Z
∞
√
−i π2 xre−iπ
re
e
0
−iπ
·e
dr =
Z
∞
√
−xr
r·(−i)·e
0
·(−1)dr = i
Z
∞
√
re−xr dr.
0
In the end we have:
√
eax
f (x) = L { s − a} = −
2πi
−1
Z
√
r1 +r2
eax
ze dz = −
π
zx
Z
∞
√
re−xr dr.
0
The sign minus is due to the fact that, as we have said, the integral along the whole
R
R
closed curve C is zero, therefore L F (s)esx ds = − r1 +r2 F (s)esx ds. To evaluate
R ∞ √ −xr
2
re dr we make the substitution xr = t2 , therefore r = tx and
the integral 0
. We obtain:
dr = 2tdt
x
Z
∞
√
−xr
re
dr =
0
2
1
x3/2
Z
0
∞
2
te−t · 2tdt.
2
Since −2te−t is the differential of e−t we can integrate the given function by parts
and obtain:
Z ∞
Z ∞
√ −xr
1 h −t2 i∞
−t2
−
e dt .
re dr = − 3/2 te
x
0
0
0
√
R∞ 2
The term under square brackets is zero. By using the known result 0 e−t dt = 2π
we obtain:
Z
0
∞
√
−xr
re
dr =
√
π
2x3/2
.
This result completes our inversion of the function F (s) =
√
eax
f (x) = L−1 { s − a} = − √
.
2 πx3
√
s − a, namely we have:
139
4.2. FOURIER TRANSFORMS
4.2
Fourier transforms
Fourier transforms are widely used in physics and astronomy because they allow to
express a function (not necessarily periodic) as a superposition of sinusoidal functions, therefore we devote this section to them. Since the Fourier transforms are used
mostly to represent time-varying functions, we shall use t as independent variable
instead of x. On the other hand, the transformed variable represents for most of the
application a frequency and will be indicated with ω instead of s.
4.2.1
Fourier series
For some physical applications, we might need to expand in series some functions
that are not continuous or not differentiable and that therefore do not admit a
Taylor series. Fourier series allow to represent periodic functions, for which a Taylor
expansion does not exist, as superposition of sine and cosine functions. Given a
periodic function f (t) with period T such that the integral of |f (t)| over one period
converges, f (t) can be expressed in this way:
∞ 2πnt
2πnt
a0 X
+ bn sin
,
+
an cos
f (t) =
2
T
T
n=1
where the constant coefficients an , bn are called Fourier coefficients. Defining the
angular frequency ω = 2π
we simplify this expression into:
T
∞
a0 X
+
[an cos(ωnt) + bn sin(ωnt)] ,
f (t) =
2
n=1
(4.18)
namely the function f (t) can be expressed as a superposition of an infinite number
2π
of sinusoidal functions having periods Tn = ωn
.
It can be shown that these coefficients are given by:
2
an =
T
Z
2
bn =
T
Z
T
2
f (t) cos(ωnt)dt
(4.19)
f (t) sin(ωnt)dt
(4.20)
− T2
T
2
− T2
Example 4.2.1 Find the Fourier series expansion of the function

−1 − T + kT ≤ t < kT
2
f (t) =
1 kT ≤ t < T + kT
2
140
CHAPTER 4. INTEGRAL TRANSFORMS
This is a square wave: a series of positive impulses followed periodically by negative
impulses of the same intensity. We can notice immediately that the function f (t)
is odd (f (t) = −f (−t)). Since the function cos(ωnt) is even, the whole function
f (t) cos(ωnt) is odd and its integral between −T /2 and T /2 is zero. That means that
the coefficients an are zero.
To find the coefficients bn we apply Eq. 4.20 obtaining:
2
bn =
T
Z
T
2
− T2
" Z
#
Z T
0
2
2
−
f (t) sin(ωnt)dt =
sin(ωnt)dt +
sin(ωnt)dt
T
− T2
0
Z T
T
2
2
4
sin(ωnt)dt = −
[cos(ωnt)]02
=
T 0
nπ
2
=
[1 − cos(nπ)] .
nπ
Here we have used the relation ωT = 2π. We can notice here that cos(nπ) is 1 if
n is even and -1 if n is odd, namely cos(nπ) = (−1)n . We could find the same
result by means of the de Moivre’s theorem applied to the complex number z = eiπ .
4
if n is odd. The Fourier
The coefficients bn are equal to zero if n is even and to nπ
expansion we looked at is therefore:
4
f (t) =
π
sin(3ωt) sin(5ωt)
sin(ωt) +
+
+ ... .
3
5
By using the identities cos z = (eiz + e−iz )/2 and sin z = (eiz − e−iz )/2i the Fourier
expansion of a function f (t) can also be written as:
∞ a0 X
eiωnt + e−iωnt
eiωnt − e−iωnt
f (t) =
+
an
+ bn
2
2
2i
n=1
∞
a0 eiω0t 1 X =
+
(an − ibn )eiωnt + (an + ibn )e−iωnt .
2
2 n=1
In this way we can see that the function f (t) can be expressed as sum, extending
from −∞ to +∞, of terms of the form eiωn t , where ωn = ω · n, namely we have:
f (t) =
∞
X
−∞
cn eiωn t ;

 1 (a − ib ) n ≥ 0
n
n
cn = 2
1
 (an + ibn ) n < 0
2
.
(4.21)
This compact representation of the periodic function f (t) is called complex Fourier
series. If we combine the coefficients an and bn as indicated in Eq. 4.21 we find that,
irrespective of the sign of n, we have:
141
4.2. FOURIER TRANSFORMS
1
cn =
T
4.2.2
T
2
Z
f (t)e−iωn t dt.
(4.22)
− T2
From Fourier series to Fourier transform
We have seen that the Fourier series allow us to describe periodic functions as superpositions of sinusoidal functions characterized by angular frequencies ωn . To
represent non-periodic functions, what we can do is to extend the period T to infinity (every function can be considered periodic if the period is large enough). That
corresponds to consider a vanishingly small “frequency quantum” ∆ω = ωnn = 2π
T
and therefore a continuous spectrum of angular frequencies ωn . Given a function
RT
P
iωn t
, with cn = T1 −2T f (t)e−iωn t dt, we want to see what happens
f (t) = ∞
n=−∞ cn e
2
2π
T
in the limit T → ∞ (or, analogously, ∆ω =
→ 0). We have:
Z T
Z T
∞
∞
X
X
2
∆ω 2
1
−iωn t
iωn t
f (t) =
f (t)e
dt · e
=
f (t)e−iωn t dt · eiωn t .
T
T
T
2π
−2
−2
n=∞
n=∞
In the limit for T → ∞ and ∆ω → 0 the limits of the integration extend to infinity,
the sum becomes an integral and the discrete values ωn become a continuous variable
ω (with ∆ω → dω). We have thus:
1
f (t) =
2π
Z
∞
iωt
dωe
Z
∞
duf (u)e−iωu .
(4.23)
−∞
−∞
From this relation we can define the Fourier transform of a function f (t) as:
1
f˜(ω) = F {f (t)} = √
2π
Z
∞
f (t)e−iωt dt.
(4.24)
−∞
R∞
Here we require, in order this integration to be possible, that −∞ |f (t)|dt is finite.
Unlike the Laplace transform, the Fourier transform is very easy to invert. In fact,
we can directly see from Eq. 4.23 that:
1
f (t) = √
2π
Z
∞
f˜(ω)eiωt dω.
(4.25)
−∞
Example 4.2.2 Find the Fourier transform of the normalized Gaussian distribution
t2
1
f (t) = √ e− 2τ 2 .
τ 2π
By definition of Fourier transform we have:
142
CHAPTER 4. INTEGRAL TRANSFORMS
Z ∞
Z ∞
t2
1
1
−iωt
˜
e−iωt− 2τ 2 dt.
f(ω)
=√
f (t)e
dt =
2πτ −∞
2π −∞
We can modify the exponent of e in the integral as follows:
t2
1 2
2
2 2
2 2
=
−
t
+
2iωtτ
+
(iωτ
)
−
(iωτ
)
.
2τ 2
2τ 2
The first 3 addends inside the square brackets are the square of t + iωτ 2 , namely we
obtain:
−iωt −
t2
(t + iωτ 2 )2 (iωτ 2 )2
−iωt − 2 = −
+
=−
2τ
2τ 2
2τ 2
1
Since the term e− 2 ω
2τ 2
t + iωτ 2
√
2τ
2
1
− ω 2τ 2 .
2
does not depend on t we obtain:
Z
”
“
2 2
1 − 1 ω2 τ 2 ∞ − t+iωτ
√
˜
2τ
e 2
dt.
f (ω) =
e
2πτ
−∞
This is the integral of a complex function, therefore we should use the methods of
complex integration we have learned so far. However, we can see that the integration
simplifies significantly by means of the substitution:
t + iωτ 2
√
= s,
2τ
dt =
√
2τ ds.
In this way we obtain:
1 2 2
1
f˜(ω) = √ e− 2 ω τ
2π
∞
1 2 2
1
2
e−s ds = √ e− 2 ω τ ,
2π
−∞
R ∞ −s2
√
where we have made use of the known result −∞ e ds = π. It is important to
note that the Fourier transform of a Gaussian function is another Gaussian function.
Z
The Fourier transform allows us to express the Dirac delta function in an elegant
and useful way. We recall Eq. 4.23
Z ∞
Z ∞
1
iωt
duf (u)e−iωu .
f (t) =
dωe
2π −∞
−∞
By exchanging the variable of integration we obtain:
Z ∞
Z ∞
1
dω
duf (u)eiω(t−u)
f (t) =
2π −∞
Z ∞
Z −∞
∞
1
=
du
dωf (u)eiω(t−u)
2π −∞
−∞ Z ∞
Z ∞
1
iω(t−u)
e
dω ,
=
duf (u)
2π −∞
−∞
143
4.2. FOURIER TRANSFORMS
where the exchange of the order of integration has been made possible by the Fubini’s
theorem. Recalling Eq. 4.15 we can immediately recognize that:
1
δ(t − u) =
2π
Z
∞
eiω(t−u) dω.
(4.26)
−∞
Analogously to the Laplace transform, it is easy to calculate the Fourier transform of the derivative of a function. It is:
Z ∞
1
F {f (t)} = √
f ′ (t)e−iωt dt
2π −∞
Z
1 (−iω) ∞
−iωt ∞
=√
f (t)e
f (t)e−iωt dt
− √
−∞
2π
2π −∞
′
= iωF {f (t)}.
(4.27)
Here we have assumed that the function f (t) tends to zero for t → ±∞ (as it should
R∞
be since −∞ |f (t)|dt is finite). It is easy to iterate this procedure and show that:
F {f (n) (t)} = (iω)n F {f (t)}.
(4.28)
This relation can be used in some cases to solve ODEs analogously to what done by
means of Laplace transforms, namely we transform both members of an ODE, solve
the obtained algebraic equation as a function of F {y(x)} (the Fourier transform of
the solution y(x) we seek) and then invert the function we have obtained. However,
for most of the practical cases, it is more convenient to use Laplace transformation
methods to solve ODEs. Fourier transformation methods can be extremely useful
instead to solve partial differential equations (see Sect. 6.3.1).
144
CHAPTER 4. INTEGRAL TRANSFORMS
Chapter 5
Systems of differential equations
In many physical applications it is necessary to solve simultaneously n ODEs involving n unknown functions y1 (x), y2 (x), . . . , yn (x) of the same independent variable x.
Such a system of differential equations shares many analogies with a system of ordinary algebraic equations and can be solved with the matrix formalism (taking into
account all the properties of ODEs we have encountered so far).
We will consider only systems of linear ODEs. At the end of this Chapter we
will show that systems of linear ODEs with order larger than one can always be
transformed into systems of first order ODEs, therefore we will devote most of our
attention to this case.
5.1
Review of matrices and systems of algebraic
equations
Because of the utility of the results of matrix theory to solve systems of ODEs,
we recall here (without proof) some of the most important and useful properties of
matrices and of systems of linear algebraic equations.
5.1.1
Matrices
A matrix (usually indicated with a boldface capital letter like A) is a rectangular
array of elements arranged in m rows and n columns like the following one:



A=


a11
a21
..
.
a12
a22
..
.
...
...
a1n
a2n
..
.
am1 am2 . . . amn



.


In this case we refer to A as a m × n matrix. The transpose of a matrix AT is
145
146
CHAPTER 5. SYSTEMS OF DIFFERENTIAL EQUATIONS
the matrix obtained by inverting rows with columns. If we, besides inverting rows
with column, take also the complex conjugate of each element of the matrix, then
we obtain the adjoint A∗ . Of course, for real matrices, the transpose and the adjoint
coincide.
The multiplication of two matrices is possible only if the number of columns of
the first matrix equals the number of rows in the second matrix. In this case, if we
multiply a m × r matrix A times a r × n matrix B, we obtain a m × n matrix C
whose element in the i-th row and j-th column is obtained multiplying each element
of the i-th row of A by each corresponding element of the j-th column of B, namely:
cij =
r
X
aik bkj .
(5.1)
k=1
If A and B are square matrices (namely if the number of rows and columns are the
same) it is possible to define both AB and BA but, from the definition of matrix
product, it derives that in general AB 6= BA, namely the matrix multiplication is
not commutative.
A matrix with a single column is also called vector, namely a vector x looks like
that:



x=


x1
x2
..
.
xn



.


Of course, the transpose xT of a vector x is composed by a single row (x1 , x2 , . . . , xn ).
If we multiply the transpose of a vector x by the complex conjugated of a second
vector y, we obtain the scalar product (or inner product) x · y (sometimes indicated
with (x, y)), which is the usual way to multiply vectors. In symbols:
x·y =
n
X
xi yi ∗ .
(5.2)
i=1
If (x, y) = 0, then the two vectors x and y are said to be orthogonal.
The identity matrix I is the square matrix whose diagonal terms are 1 and whose
non-diagonal terms are all 0, namely:



I=



0 ... 0

1 ... 0 
..
.. 
.
.
. 
0 0 ... 1
1
0
..
.
Given a square matrix A, from the definition of matrix multiplication we have:
5.1. REVIEW OF MATRICES AND SYSTEMS OF ALGEBRAIC EQUATIONS147
AI = IA = A.
If the determinant of A is different from zero, then it is always possible to find
the inverse A−1 of the matrix A, namely the matrix such that AA−1 = A−1 A = I.
The general formula to find the element bij of the matrix A−1 is:
(−1)i+j Mji
det A
(note the inversion of the indices in Mji ), where Mij is the minor of the matrix A
associated with the element aij , namely the determinant of the matrix obtained by
deleting the i-th row and the j-th column. This is quite an inefficient way to find the
inverse of a matrix. Much more effective is the method of the Gauss elimination. It
consists on manipulating the rows of a matrix until one obtains the diagonal matrix.
The allowed row operations are:
bij =
• interchange of two rows;
• multiplication of a row by a non-zero scalar;
• addition of any multiple of one row to another row.
An example can clarify how the Gauss elimination method works.
Example 5.1.1 Find the inverse of the matrix


3 −1 1


A =  1 1 −1 
2 −1 0
We begin by forming the augmented matrix A|I, namely:

3 −1 1 1 0 0


A|I =  1 1 −1 0 1 0  .
2 −1 0 0 0 1

We perform now a series of row operations on A in order to transform it into I. At
the same time, I will be transformed into A−1 . To shorten the notation we indicate
with Rmn the swap of the m-th with the n-th row, with Rm + Rn (α) the sum of the
m-th row with α times the n-th row and with Rn (β) the product of the n-th row by
β. We have:
148
CHAPTER 5. SYSTEMS OF DIFFERENTIAL EQUATIONS

3 −1 1 1 0 0


A|I =  1 1 −1 0 1 0 
2 −1 0 0 0 1


1 1 −1 0 1 0


[R12 ] =  3 −1 1 1 0 0 
2 −1 0 0 0 1


1 1 −1 0 1 0


[R2 + R1 (−3)]; [R3 + R1 (−2)] =  0 −4 4 1 −3 0 
0 −3 2 0 −2 1


1 1 −1 0
1 0
1


R2 −
=  0 1 −1 − 14 43 0 
4
0 −2 1
0 −3 2


1
1
1 0 0
0
4
4


[R1 + R2 (−1)]; [R3 + R2 (3)] =  0 1 −1 − 14 43 0 
0 0 −1 − 34 41 1


1
1
0
1 0 0
4
4


[R3 (−1)] =  0 1 −1 − 14 43
0 
3
− 14 −1
0 0 1
4


0
1 0 0 14 41


[R2 + R3 (1)] =  0 1 0 12 21 −1  = I|A−1.
0 0 1 34 − 14 −1

In the end, the inverse matrix of A is:
A−1

1 1
0
1

=  2 2 −4  .
4
3 −1 −4

If the elements aij of a matrix A are functions of an independent variable x, then
we can define the derivative and the integral of the matrix A as the matrix whose
elements are the derivatives and the integrals of the element aij (x), respectively.
Given two matrices A(x) and B(x) and a constant matrix K we have:
dA
d
(KA) = K
dx
dx
d
d
d
(A + B) =
A+ B
dx
dx
dx
dB dA
d
(AB) = A
+
B.
dx
dx
dx
(5.3)
(5.4)
(5.5)
5.1. REVIEW OF MATRICES AND SYSTEMS OF ALGEBRAIC EQUATIONS149
Since, as we have said, the matrix multiplication is not commutative, care must be
taken in respecting the order of multiplications.
5.1.2
Systems of linear algebraic equations
A system of n linear algebraic equation in n variables can be written as:


a11 x1 + a12 x2 + · · · + a1n xn = b1




a x + a x + · · · + a x = b
21 1
22 2
2n n
2
.

..





an1 x1 + an2 x2 + · · · + ann xn = bn
.
(5.6)
By using the matrix formalism, we can write this system in the compact form:
Ax = b,
(5.7)
where



A=


a11 a12 . . . a1n
a21 a22 . . . a2n
..
..
..
.
.
.
an1 an2 . . . ann



,





x=


x1
x2
..
.
xn






, b = 




b1
b2
..
.
bn






If b = 0 then the system is said to be homogeneous, otherwise it is nonhomogeneous.
If the determinant of A is different from zero, then we can calculate the inverse
matrix A−1 . By multiplying both sides of the equation Ax = b by A−1 we obtain
the solution of the system, namely:
x = A−1 b.
(5.8)
This solution is therefore unique. For homogeneous systems, we can only have the
trivial solution x = 0, whereas if b 6= 0 the solution can be either found by Gauss
elimination, or by means of the Cramer’s rule. The Cramer’s rule states that the i-th
component of the vector x, solution of the given system of linear algebraic equations,
is given by:
det Ai
,
(5.9)
det A
where Ai is the matrix obtained by replacing the i-th column of A by the column
vector b.
If det A = 0, then the homogeneous system Ax = 0 has an infinite number of
solutions. The nonhomogeneous system Ax = b has instead, in general, no solutions
xi =
150
CHAPTER 5. SYSTEMS OF DIFFERENTIAL EQUATIONS
if the determinant of A is zero. It has however an infinite number of solutions if the
vectors y such that A∗ y = 0 are orthogonal to b, namely if:
(b, y) = 0.
In the end, this condition implies that one or more than one equation of the system
Eq. 5.6 can be obtained as a linear combination of the others.
Example 5.1.2 Find, as a function of a, the solutions of the system of algebraic
equations
!
!
a 1
b1
x = Ax =
.
1 a
b2
The determinant of the given system is a2 − 1, therefore we have an unique solution
for |a| =
6 1. In this case, the solution is given by x = A−1 b. It is easy to see that
the inverse of the matrix A is given by:
A−1
1
= 2
a −1
a −1
−1 a
!
,
therefore the solution is:
1
x= 2
a −1
a −1
−1 a
!
b1
b2
!
1
= 2
a −1
If a = 1, the system becomes:

x + x = b
1
2
1
x1 + x2 = b2
ab1 − b2
−b1 + ab2
!
.
.
It is quite clear that, since the left-hand sides of the equations are equal, also the
right-hand sides must be equal, namely solutions are possible only if b1 = b2 . In this
case, the equations are proportional to each other, with proportionality constant 1.
We just need to solve one of them, for instance:
x1 + x2 = b1 .
If x1 takes the value K, then x2 = b1 − K, therefore the (infinite) solutions in this
case are represented by the vector:
!
K
x=
.
b1 − K
5.1. REVIEW OF MATRICES AND SYSTEMS OF ALGEBRAIC EQUATIONS151
If a = −1, the system becomes:

−x + x = b
1
2
1
x1 − x2 = b2
.
Here, solutions are possible only if b1 = −b2 . In this case, the equations are proportional to each other, with proportionality constant −1. If we solve:
−x1 + x2 = b1 ,
the (infinite) solutions in this case are represented by the vector:
!
K
x=
.
b1 + K
A set of n vectors x(1) , . . . , x(n) is said to be linearly dependent if there exists a
set of n numbers c1 , . . . , cn , at least one of which is different from zero, such that:
c1 x(1) + · · · + cn x(n) = 0.
(5.10)
If this relation is satisfied only by the set of values c1 = c2 = · · · = cn = 0, then
the vectors x(1) , . . . , x(n) are said to be linearly independent. To check the linear
independence of a set of vectors, the best way is to construct the matrix X whose
columns are the vectors x(1) , . . . , x(n) . Eq. 5.10 can be thus written as:
 
(1)
(n)
x1 c1 + . . . + x1 cn
x11 c1 + . . .
 ..


.
...
 .
 =  ..
(1)
(n)
xn c1 + . . . + xn cn
xn1 c1 + . . .


+ x1n cn
.. 
.  = Xc = 0.
+ xnn cn
If det X 6= 0, then the only solution of this equation is c = 0 (therefore the vectors
are linearly independent). If instead det X = 0 then the system Xc = 0 admits a
solution c 6= 0 and the vectors are linearly dependent.
For many physical and mathematical applications it is necessary to search under
what conditions a vector x can be transformed into a multiple of itself through the
transformation Ax, namely under what conditions
Ax = λx.
Recalling the identity matrix I, this equation can be written as:
(A − λI)x = 0.
(5.11)
152
CHAPTER 5. SYSTEMS OF DIFFERENTIAL EQUATIONS
This is an homogeneous system of equations, therefore non-zero solutions are possible
only if
det (A − λI) = 0.
(5.12)
The values of λ that satisfy this equation are called eigenvalues of the matrix A. The
non-zero solutions of Eq. 5.11 that are obtained by using such values of λ are called
eigenvectors. If all the roots of the Eq. 5.12 have multiplicity 1, then it can be shown
that all corresponding eigenvectors are linearly independent. If instead one root λi is
repeated, with algebraic multiplicity m, then it can happen that, associated to it, are
only q linearly independent eigenvectors. In this case, q is said to be the geometric
multiplicity of the eigenvalue λi . It is always 1 ≤ q ≤ m.
5.2
Systems of first order linear ODEs
5.2.1
General properties
A generic system of n first order linear ODEs can be written as:


y1 ′ (x) = p11 (x)y1 (x) + p12 (x)y2 (x) + · · · + p1n (x)yn (x) + g1 (x)




y ′ (x) = p (x)y (x) + p (x)y (x) + · · · + p (x)y (x) + g (x)
2
21
1
22
2
2n
n
2
.

..




 ′
yn (x) = pn1 (x)y1 (x) + pn2 (x)y2 (x) + · · · + pnn (x)yn (x) + gn (x)
(5.13)
This system can be written in a more compact way in matrix notation, namely:
y′ = P(x)y + g,
(5.14)
where



y =


′
y1 ′ (x)
y2 ′ (x)
..
.
yn ′ (x)
and P is the n × n matrix:



,





P=





y=


y1 (x)
y2 (x)
..
.
yn (x)
p11 (x) . . .
p21 (x) . . .
..
.




,




g=


p1n (x)
p2n (x)
..
.

pn1 (x) . . . pnn (x)


.


g1 (x)
g2 (x)
..
.
gn (x)



,


5.2. SYSTEMS OF FIRST ORDER LINEAR ODES
153
By means of this formalism, we can extend practically all the definitions and properties we have already encountered studying linear ODEs.
In the case of the homogeneous system:
y′ = P(x)y,
(5.15)
in compliance with what we have learned on ODEs, we expect Eq. 5.15 to have
in general n solutions y(1) (x), . . . , y(n) (x). Are these solutions linearly independent?
To check it it is enough to consider the n × n matrix Y(x) whose columns are the
vectors y(1) (x), . . . , y(n) (x), namely:

y11 (x) . . .

..
Y(x) = 
.

y1n (x)

..
.
.
(5.16)
yn1(x) . . . ynn (x)
As we recalled in Sect. 5.1.2, the vectors y(1)(x), . . . , y(n)(x) are linearly independent provided that the determinant of Y(x):
$$ W[y^{(1)}(x), \dots, y^{(n)}(x)] = \det Y(x), \qquad (5.17) $$
is different from zero. This determinant W is called the Wronskian of the n solutions y(1)(x), . . . , y(n)(x). If W ≠ 0, then we can express any solution y(x) of the homogeneous system of ODEs Eq. 5.15 as a linear combination of the solutions y(1)(x), . . . , y(n)(x), namely
y(x) = c1 y(1) (x) + · · · + cn y(n) (x).
(5.18)
In this case, the n vectors y(1) (x), . . . , y(n) (x) form a fundamental set of solutions of
the given system of ODEs.
5.2.2 Homogeneous linear systems with constant coefficients
A system of homogeneous linear ODEs with constant coefficients can be written as:
y′ = Ay,
(5.19)
where A is a constant n × n matrix. Analogously to what we have seen with normal
ODEs (that can be interpreted as systems of ODEs with n = 1), we expect an
exponential solution of the form y = pe^{λx}, where the exponent λ and the constant vector p must be determined. It can be easily shown that d/dx (pe^{λx}) = λpe^{λx}, therefore substituting y = pe^{λx} in the system Eq. 5.19 we obtain:
λpeλx = Apeλx .
We can cancel out eλx from this equation. Moreover, by using the identity matrix I
we can write p = Ip obtaining:
(A − λI)p = 0.
(5.20)
In the end, the solution of the system of ODEs Eq. 5.19 reduces to the system
of algebraic equations Eq. 5.20, which is precisely the one that determines the
eigenvalues and eigenvectors of the matrix A.
Example 5.2.1 Solve the system of ODEs
$$ y'(x) = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix} y. $$
According to Eq. 5.20 we have to solve the system of algebraic equations:
$$
\begin{pmatrix} -2-\lambda & 1 \\ 1 & -2-\lambda \end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\qquad (5.21)
$$
This system admits a non-trivial solution p 6= 0 only if the determinant of the matrix
is zero (namely if the two rows are linearly dependent). This occurs when:
$$ (-2-\lambda)(-2-\lambda) - 1 = 0 \;\Rightarrow\; \lambda^2 + 4\lambda + 3 = 0 \;\Rightarrow\; \lambda_{1,2} = -2 \pm \sqrt{4-3}, $$
namely the eigenvalues are λ1 = −3 and λ2 = −1. By substituting these values in
the Eq. 5.21 we can obtain the eigenvectors p(1) and p(2) . We start with λ1 = −3
and obtain:
$$
\begin{pmatrix} -2+3 & 1 \\ 1 & -2+3 \end{pmatrix}
\begin{pmatrix} p_1^{(1)} \\ p_2^{(1)} \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \end{pmatrix}
\;\Rightarrow\; p_1^{(1)} + p_2^{(1)} = 0.
$$
Of course one equation linking p_1^{(1)} and p_2^{(1)} is sufficient because the rows of the matrix are now linearly dependent. The solution of this equation is therefore p_1^{(1)} = −p_2^{(1)} and, taking p_1^{(1)} = 1, we obtain the eigenvector
$$ p^{(1)} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. $$
The choice of p^{(1)} is arbitrary, because all the remaining infinitely many eigenvectors are linearly dependent on p^{(1)}.
If we now substitute λ₂ = −1 into Eq. 5.21 we obtain the equation
$$ -p_1^{(2)} + p_2^{(2)} = 0, $$
namely p_1^{(2)} = p_2^{(2)}, and therefore the second eigenvector is
$$ p^{(2)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$
The general solution of the given system of ODEs is thus:
$$ y = c_1 \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{-3x} + c_2 \begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{-x}. $$
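As a quick cross-check of this example (a sketch, assuming numpy and scipy are installed and not part of the original text), one can compare the eigenvector-based solution with a direct numerical integration of the system.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-2.0, 1.0],
              [1.0, -2.0]])

# Analytic solution with (arbitrarily chosen) constants c1 = 1, c2 = 2.
c1, c2 = 1.0, 2.0
def y_exact(x):
    return c1 * np.array([1, -1]) * np.exp(-3 * x) + c2 * np.array([1, 1]) * np.exp(-x)

sol = solve_ivp(lambda x, y: A @ y, (0, 2), y_exact(0), dense_output=True, rtol=1e-10)
print(np.allclose(sol.sol(2.0), y_exact(2.0), atol=1e-6))   # expected: True
```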
In the simple case in which the system is composed of only 2 equations, we can write Eq. 5.20 as:
$$ \begin{vmatrix} a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2 - (a_{11}+a_{22})\lambda + a_{11}a_{22} - a_{12}a_{21} = 0. \qquad (5.22) $$
Let us now write explicitly the system y′(x) = Ay in this case. We have:
$$ \begin{cases} y_1' = a_{11} y_1 + a_{12} y_2 \\ y_2' = a_{21} y_1 + a_{22} y_2 \end{cases}. $$
We can recover y₂ from the first equation, obtaining:
$$ y_2 = \frac{1}{a_{12}}\left( y_1' - a_{11} y_1 \right). $$
Substituting this function into the second equation of the system we obtain:
$$ \frac{1}{a_{12}}\left( y_1'' - a_{11} y_1' \right) = a_{21} y_1 + \frac{a_{22}}{a_{12}}\left( y_1' - a_{11} y_1 \right) $$
$$ \Rightarrow\; y_1'' - (a_{11}+a_{22})\, y_1' + (a_{11}a_{22} - a_{12}a_{21})\, y_1 = 0. \qquad (5.23) $$
This is a second order homogeneous ODE with constant coefficients, whose characteristic equation is exactly Eq. 5.22. For this reason we can call the equation det(A − λI) = 0 the characteristic equation of the given system of ODEs. Eq. 5.23 also suggests how to solve the system y′ = Ay: from it we can recover y₁(x) and, substituting it into the equation y₂ = (1/a₁₂)(y₁′ − a₁₁y₁), we can find y₂(x) as well. This method is also called the elimination method and might be faster than the methods involving the matrix formalism in the case of simple systems.
As we have seen in Example 5.2.1, given a system of linear ODEs y′ = Ay, to
find the solution we have to find the eigenvalues λi and the corresponding eigenvectors
p(i) of the matrix A. The eigenvalues are the roots of the Eq. 5.12, therefore, if A
is a real matrix, 3 possibilities may arise:
• all eigenvalues are real and different from each other;
• some eigenvalues occur in complex conjugate pairs;
• some eigenvalues are repeated.
Example 5.2.1 belongs to the first category and we have seen that, once we have
found the eigenvalues λ1 , . . . , λn and the corresponding eigenvectors p(1) , . . . , p(n) ,
the general solution is given by:
y = c1 p(1) eλ1 x + · · · + cn p(n) eλn x .
(5.24)
It is easy to show that the vectors y(1) = p(1)e^{λ₁x}, . . . , y(n) = p(n)e^{λₙx} are linearly independent; in fact:
$$
W[y^{(1)},\dots,y^{(n)}](x) =
\begin{vmatrix}
p_1^{(1)} e^{\lambda_1 x} & \dots & p_1^{(n)} e^{\lambda_n x} \\
\vdots & & \vdots \\
p_n^{(1)} e^{\lambda_1 x} & \dots & p_n^{(n)} e^{\lambda_n x}
\end{vmatrix}
= e^{(\lambda_1 + \dots + \lambda_n)x}
\begin{vmatrix}
p_1^{(1)} & \dots & p_1^{(n)} \\
\vdots & & \vdots \\
p_n^{(1)} & \dots & p_n^{(n)}
\end{vmatrix},
$$
and this quantity is different from zero because the vectors p(1) , . . . , p(n) are linearly
independent.
If some of the eigenvalues of the matrix A are complex, we know that, if A is real, they must appear in complex conjugate pairs λ₁,₂ = µ ± iα. In this case, the eigenvectors p(1) and p(2), corresponding to the eigenvalues λ₁,₂, will be complex conjugates as well. If we take the solution y(1) = p(1)e^{λx}, with p(1) = a + ib, then we have:
y(1) = p(1) eλx = (a+ib)e(µ+iα)x = eµx [a cos(αx)−b sin(αx)]+ieµx [a sin(αx)+b cos(αx)].
If we write y(1) = u + iv, then the vectors:
$$ u(x) = e^{\mu x}\left[ a\cos(\alpha x) - b\sin(\alpha x) \right] \qquad (5.25) $$
$$ v(x) = e^{\mu x}\left[ a\sin(\alpha x) + b\cos(\alpha x) \right] \qquad (5.26) $$
are linearly independent, real-valued solutions and can be made part of the fundamental set of solutions of the system of ODEs y′ = Ay.
Example 5.2.2 Find the solution of the system of ODEs
$$ y' = \begin{pmatrix} -1 & -4 \\ 1 & -1 \end{pmatrix} y. $$
We have to find the eigenvalues of the matrix A, namely to solve:
$$ \begin{vmatrix} -1-\lambda & -4 \\ 1 & -1-\lambda \end{vmatrix} = 0 \;\Rightarrow\; (1+\lambda)^2 + 4 = 0 \;\Rightarrow\; \lambda_{1,2} = -1 \pm \sqrt{1-5}. $$
The eigenvalues are therefore λ1,2 = −1 ± 2i. As we have learned, we need only one
of these eigenvalues because the real and imaginary parts of the complex vector peλx
are linearly independent. The eigenvector p can be obtained substituting the chosen
eigenvalue into the equation (A − λI)p = 0. If we take the value λ = −1 − 2i we
obtain:
$$ \begin{pmatrix} 2i & -4 \\ 1 & 2i \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$
We must take only one of these equations (they are linearly dependent), for instance
p1 + 2ip2 = 0.
If p₁ takes the value 2i, p₂ must take the value −1, therefore all the solutions of the equation p₁ + 2ip₂ = 0 are multiples of the eigenvector
$$ p = \begin{pmatrix} 2i \\ -1 \end{pmatrix}. $$
We have now to separate the vector pe^{λx} into its real and imaginary components u and v. We obtain:
$$ p e^{\lambda x} = e^{-x} \begin{pmatrix} 2i \\ -1 \end{pmatrix} \left[ \cos(2x) - i\sin(2x) \right] = e^{-x} \begin{pmatrix} 2i\cos(2x) + 2\sin(2x) \\ -\cos(2x) + i\sin(2x) \end{pmatrix} $$
$$ \Rightarrow\; u = e^{-x} \begin{pmatrix} 2\sin(2x) \\ -\cos(2x) \end{pmatrix}, \qquad v = e^{-x} \begin{pmatrix} 2\cos(2x) \\ \sin(2x) \end{pmatrix}. $$
The general solution is thus given by:
$$ y = e^{-x}\left[ c_1 \begin{pmatrix} 2\sin(2x) \\ -\cos(2x) \end{pmatrix} + c_2 \begin{pmatrix} 2\cos(2x) \\ \sin(2x) \end{pmatrix} \right]. $$
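The construction of real solutions from a complex eigenpair can also be verified numerically. The sketch below (hypothetical helper code, assuming numpy and scipy; not part of the original text) takes whatever complex eigenvector numpy returns for the matrix of Example 5.2.2, splits pe^{λx} into real and imaginary parts, and checks that the real part indeed solves y′ = Ay.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-1.0, -4.0],
              [1.0, -1.0]])

# Take a complex eigenpair and split p*exp(lam*x) into real and imaginary parts.
lam, P = np.linalg.eig(A)
k = np.argmax(lam.imag < 0)          # pick the eigenvalue with negative imaginary part
p, l = P[:, k], lam[k]

def u(x): return (p * np.exp(l * x)).real    # real part of the complex solution
def v(x): return (p * np.exp(l * x)).imag    # imaginary part

# Both u and v must solve y' = A y; check u numerically.
sol = solve_ivp(lambda x, y: A @ y, (0, 1), u(0.0), rtol=1e-10)
print(np.allclose(sol.y[:, -1], u(sol.t[-1]), atol=1e-6))   # expected: True
```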
If some eigenvalues of the matrix A are repeated, namely if one (or more) root λ of the equation det(A − λI) = 0 has algebraic multiplicity m larger than one, as we have said two possibilities may arise:
• the geometric multiplicity q of λ is equal to m;
• q is smaller than m.
In the first case, there are still m linearly independent eigenvectors p(1) , . . . , p(m)
corresponding to the eigenvalue λ and therefore the vectors p(1) eλx , . . . , p(m) eλx are
linearly independent. The general solution is still of the form of Eq. 5.24, namely:
y = c1 p(1) eλ1 x + · · · + cn p(n) eλn x ,
although some values of λ are repeated.
If instead there are fewer than m linearly independent eigenvectors corresponding to an eigenvalue with algebraic multiplicity m, then not all the vectors forming the fundamental set of solutions have the form p(i)e^{λᵢx}. By analogy with the results for linear ODEs of order n, we might expect additional solutions involving products of polynomials with exponential functions. If the root λ of the equation det(A − λI) = 0 is double, then one solution has the standard form pe^{λx}, whereas the second solution must have the form:
$$ u\,x e^{\lambda x} + v\,e^{\lambda x}. \qquad (5.27) $$
We can see here that, at variance with higher-order linear ODEs, we have both a term proportional to xe^{λx} and a term proportional to e^{λx}. We cannot drop the latter term because the vector v is not a multiple of the vector p.
Example 5.2.3 Find the solution of the system of ODEs
$$ y' = \begin{pmatrix} 1 & -1 \\ 1 & 3 \end{pmatrix} y. $$
We have to find the eigenvalues of the matrix A, namely to solve:
$$ \begin{vmatrix} 1-\lambda & -1 \\ 1 & 3-\lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2 - 3\lambda - \lambda + 3 + 1 = 0 \;\Rightarrow\; \lambda_{1,2} = 2. $$
The eigenvalue λ = 2 is thus a double eigenvalue (namely, the algebraic multiplicity
m of it is 2). The eigenvector p can be obtained substituting it into the equation
(A − λI)p = 0. We obtain:
$$ \begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$
We must take only one of these equations (they are linearly dependent), for instance
p1 + p2 = 0,
obtaining, as eigenvector:
$$ p = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. $$
There is no possibility to obtain another linearly independent eigenvector corresponding to the eigenvalue 2 (this means that the geometric multiplicity of λ = 2 is 1). One solution of the given system of ODEs has thus the form
$$ \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{2x}, $$
whereas, in order to obtain the second solution, we have to apply Eq. 5.27. Upon substitution of this expression into the original system of ODEs, we obtain:
$$ \left[ 2(ux + v) + u \right] e^{2x} = A\,(ux + v)\, e^{2x}. $$
We can now equate the terms with the same power of x, obtaining:
$$ 2u = Au $$
$$ 2v + u = Av. $$
We can write alternatively these two equations as:
$$ (A - 2I)\,u = 0 \qquad (5.28) $$
$$ (A - 2I)\,v = u. \qquad (5.29) $$
The first equation simply implies that u must be an eigenvector corresponding to the eigenvalue λ = 2, namely
$$ u = p = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. $$
From the second equation we obtain instead:
$$ \begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. $$
These are two linearly dependent equations. We solve one of them, for instance:
v1 + v2 = −1.
If v₁ takes the value C, then v₂ must be −1 − C, namely we have:
$$ v = \begin{pmatrix} C \\ -1-C \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix} + C \begin{pmatrix} 1 \\ -1 \end{pmatrix}, $$
and the second solution of the given system of ODEs has, according to Eq. 5.27, the
form:
$$ y_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix} x e^{2x} + \begin{pmatrix} 0 \\ -1 \end{pmatrix} e^{2x} + C \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{2x}. $$
The last summand of the right hand side of this equation is proportional to pe2x and
may be ignored. The general solution is thus given by:
$$ y = e^{2x}\left\{ c_1 \begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2 \left[ \begin{pmatrix} 1 \\ -1 \end{pmatrix} x + \begin{pmatrix} 0 \\ -1 \end{pmatrix} \right] \right\}. $$
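The generalized eigenvector v of Eq. 5.29 can also be obtained numerically. The following sketch (assuming numpy and scipy; the use of a least-squares solve for the singular system is an implementation choice, not something prescribed by the text) reproduces the second solution of Example 5.2.3 and checks it by direct integration.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, -1.0],
              [1.0, 3.0]])
lam = 2.0
u = np.array([1.0, -1.0])                         # eigenvector of the double eigenvalue

# Solve (A - 2I) v = u; the system is singular but consistent, so least squares
# returns a particular solution (the ambiguity is exactly the C p e^{2x} term).
v, *_ = np.linalg.lstsq(A - lam * np.eye(2), u, rcond=None)

def y2(x):                                        # second solution u x e^{2x} + v e^{2x}
    return (u * x + v) * np.exp(lam * x)

sol = solve_ivp(lambda x, y: A @ y, (0, 1), y2(0.0), rtol=1e-10)
print(np.allclose(sol.y[:, -1], y2(sol.t[-1]), atol=1e-6))   # expected: True
```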
It is not difficult to demonstrate that, if an eigenvalue λ has algebraic multiplicity
m = 2 and geometric multiplicity q = 1, then the vectors u and v forming the second
solution (according to Eq. 5.27) can always be determined by equations of the form
of Eqs. 5.28 and 5.29, namely:
(A − λI) u = 0
(A − λI) v = u,
5.2.3
(5.30)
(5.31)
Nonhomogeneous linear systems with constant coefficients
Analogously to the solution of an n-th order ODE, a system of n first-order nonhomogeneous linear ODEs
y′ = Ay + g,
has, as general solution, the vector
y(x) = c1 y(1) (x) + · · · + cn y(n) (x) + yp (x),
(5.32)
where c₁y(1)(x) + · · · + cₙy(n)(x) is the general solution of the corresponding homogeneous system y′ = Ay, whereas the vector y_p(x) is a particular solution of the nonhomogeneous system.
In order to find the particular solution of a system of ODEs, the same methods
employed in the solution of linear n-th order nonhomogeneous ODEs can be used,
namely:
• D-operator;
• undetermined coefficients;
• Laplace transforms;
• variation of parameters.
Method of the D-operator
This method is an extension of the elimination method we have seen in Sect. 5.2.2.
Taking for simplicity a system of 2 linear ODEs, we can write them as:
$$ \begin{cases} D y_1 = a_{11} y_1 + a_{12} y_2 + g_1(x) \\ D y_2 = a_{21} y_1 + a_{22} y_2 + g_2(x) \end{cases}. $$
From the first we obtain:
$$ y_2 = \frac{1}{a_{12}}\left[ D y_1 - a_{11} y_1 - g_1(x) \right]. $$
Substituting it into the second ODE we obtain:
$$ \frac{1}{a_{12}}\left( D^2 y_1 - a_{11} D y_1 - D g_1 \right) = a_{21} y_1 + \frac{a_{22}}{a_{12}}\left[ D y_1 - a_{11} y_1 - g_1(x) \right] + g_2(x) $$
$$ \Rightarrow\; D^2 y_1 - (a_{11}+a_{22}) D y_1 + (a_{11}a_{22} - a_{12}a_{21}) y_1 = D g_1 - a_{22} g_1(x) + a_{12} g_2(x). \qquad (5.33) $$
This is a second-order nonhomogeneous ODE whose complementary solution can
be found by means of the standard methods and whose particular solution can be
obtained using the properties of the D-operator.
Example 5.2.4 Find the solution of the system of ODEs:
$$ \begin{cases} D y_1 = y_1 + y_2 \\ D y_2 = 4 y_1 + y_2 + e^x \end{cases}. $$
From the first equation we obtain:
y2 = Dy1 − y1 .
We substitute it into the second ODE obtaining:
D 2 y1 − Dy1 = 4y1 + Dy1 − y1 + ex
⇒ (D 2 − 2D − 3)y1 = ex
The characteristic equation of the corresponding homogeneous ODE is:
$$ \lambda^2 - 2\lambda - 3 = 0 \;\Rightarrow\; \lambda = 1 \pm \sqrt{1+3}, $$
whose roots are therefore λ₁ = −1 and λ₂ = 3. The particular solution is given by:
$$ y_{1,p} = \frac{1}{D^2 - 2D - 3}\, e^x = \frac{1}{1 - 2 - 3}\, e^x = -\frac{1}{4} e^x. $$
We have thus:
$$ y_1(x) = c_1 e^{-x} + c_2 e^{3x} - \frac{1}{4} e^x. $$
We can find the second solution y₂ by means of the relation y₂ = Dy₁ − y₁, namely:
$$ y_2(x) = -c_1 e^{-x} + 3c_2 e^{3x} - \frac{1}{4} e^x - c_1 e^{-x} - c_2 e^{3x} + \frac{1}{4} e^x = -2c_1 e^{-x} + 2c_2 e^{3x}. $$
We can express the solution in vector form:
$$ y(x) = c_1 \begin{pmatrix} 1 \\ -2 \end{pmatrix} e^{-x} + c_2 \begin{pmatrix} 1 \\ 2 \end{pmatrix} e^{3x} - \frac{1}{4} \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^x. $$
Method of the undetermined coefficients
This method consists in guessing the correct form of the particular solution (guided
by the form of the vector g(x)), leaving the coefficients undetermined, and determining the coefficients by direct substitution into the given system of ODEs. This
method is analogous to the method we have already seen in the section about n-th
order linear ODEs, the only difference being that, if we have a nonhomogeneous term
g of the form g = heλx and if λ is a simple root of the characteristic equation, then
the solution to seek has the form axeλx + beλx and not simply axeλx .
Example 5.2.5 Find the solution of the system of ODEs
$$ \begin{cases} y_1' = 2y_1 - y_2 + e^x \\ y_2' = 3y_1 - 2y_2 \end{cases} $$
With the matrix formalism this system can be written as:
$$ y' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix} y + \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^x. $$
To find the general solution of the corresponding homogeneous system, we have to
solve the equation:
$$ \begin{vmatrix} 2-\lambda & -1 \\ 3 & -2-\lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2 - 4 + 3 = 0. $$
The eigenvalues are thus λ₁ = 1 and λ₂ = −1. Corresponding to the eigenvalue λ₁ is the eigenvector p(1) given by:
$$ \begin{pmatrix} 1 & -1 \\ 3 & -3 \end{pmatrix} \begin{pmatrix} p_1^{(1)} \\ p_2^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\Rightarrow\; p_1^{(1)} = p_2^{(1)} \;\Rightarrow\; p^{(1)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$
The eigenvector p(2) corresponding to the eigenvalue λ₂ is instead given by:
$$ \begin{pmatrix} 3 & -1 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} p_1^{(2)} \\ p_2^{(2)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\Rightarrow\; 3p_1^{(2)} = p_2^{(2)} \;\Rightarrow\; p^{(2)} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}. $$
The complementary solution is thus given by:
$$ y_c = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} e^x + c_2 \begin{pmatrix} 1 \\ 3 \end{pmatrix} e^{-x}. $$
Since the vector p(1) ex is part of the complementary solution, to find yp we have to
assume a solution of the form yp = axex + bex . Substituting it into the original
system of ODEs we obtain:
$$ a e^x + a x e^x + b e^x = A\left( a x e^x + b e^x \right) + \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^x. $$
We can cancel out ex from this system of equations. Moreover, we can compare the
coefficients containing x and the coefficients not containing x in both members of the
system, obtaining the two vector equations:



$$ a = A a $$
$$ a + b = A b + \begin{pmatrix} 1 \\ 0 \end{pmatrix} $$
From the first equation we know that a must be proportional to the eigenvector corresponding to the eigenvalue λ = 1, namely
$$ a = C p^{(1)} = C \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$
Substituting this value of a into the second equation we obtain:
$$ (A - I)\, b = C \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \;\Rightarrow\; \begin{cases} b_1 - b_2 = C - 1 \\ 3b_1 - 3b_2 = C \end{cases}. $$
This system of equations admits (infinitely many) solutions only if the second equation is proportional to the first one, namely only if
$$ C = 3(C - 1) \;\Rightarrow\; C = \frac{3}{2}. $$
This leads to the equation
$$ b_1 - b_2 = \frac{1}{2}. $$
If we give b₁ a value K, then b₂ must be K − 1/2, therefore the vector b is of the form:
$$ b = \begin{pmatrix} K \\ K - \frac{1}{2} \end{pmatrix}. $$
In the end, the particular solution of the given system of ODEs is given by:
$$ y_p(x) = \frac{3}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} x e^x + \begin{pmatrix} K \\ K - \frac{1}{2} \end{pmatrix} e^x. $$
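A simple numerical check of this particular solution (a sketch assuming numpy; the evaluation point x = 0.7 and the value of K are arbitrary choices, not part of the original text) verifies that y_p′ − Ay_p equals the nonhomogeneous term.

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
K = 0.0                                   # the free constant in b; any value works

def yp(x):                                # particular solution found above
    return 1.5 * np.array([1, 1]) * x * np.exp(x) + np.array([K, K - 0.5]) * np.exp(x)

def residual(x, h=1e-6):                  # y_p'(x) - A y_p(x) - g(x), with g = (e^x, 0)
    dyp = (yp(x + h) - yp(x - h)) / (2 * h)
    return dyp - A @ yp(x) - np.array([np.exp(x), 0.0])

print(np.allclose(residual(0.7), 0.0, atol=1e-5))   # expected: True
```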
Method of the Laplace transforms
The Laplace transform of a vector of functions f(x) is simply given by the vector
whose components are the Laplace transforms of the components of f(x). If we
apply the Laplace transformation to the vector y′ (x), by simple extension of the
result L{f ′(x)} = sF (s) − f (0), we find:
L{f ′ (x)} = sF(s) − f(0).
Given the nonhomogeneous system of ODEs with constant coefficients
y′ (x) = Ay(x) + g(x),
with the initial condition y(0) = y0 , we can apply the Laplace transformation to
both members of this equation. If we call Y(s) the Laplace transform of the vector
y(x) and with G(s) the Laplace transform of g(x), we obtain:
sY(s) − y0 = AY(s) + G(s)
⇒ (sI − A)Y(s) = y0 + G(s).
(5.34)
In fact, since A is a constant matrix, we have L{Ay} = AL{y}. Eq. 5.34 can be solved (for instance by means of Cramer's rule or by inverting the matrix sI − A) and, once we have obtained Y(s), we can invert it element by element and obtain the vector solution y(x) = L⁻¹{Y(s)}.
Example 5.2.6 Solve the initial value problem:








$$ y' = \begin{pmatrix} 1 & 3 \\ 2 & 0 \end{pmatrix} y + \begin{pmatrix} 0 \\ \sin x \end{pmatrix}, \qquad y(0) = 0. $$
We use the property L{y′} = sY(s) − y₀, where Y(s) is the Laplace transform of the vector y(x), and apply the Laplace transformation to both members of the given system of ODEs, obtaining:
$$ s\,Y(s) = \begin{pmatrix} 1 & 3 \\ 2 & 0 \end{pmatrix} Y(s) + \begin{pmatrix} 0 \\ \frac{1}{s^2+1} \end{pmatrix} \;\Rightarrow\; \begin{pmatrix} s-1 & -3 \\ -2 & s \end{pmatrix} Y(s) = \begin{pmatrix} 0 \\ \frac{1}{s^2+1} \end{pmatrix}. $$
This system of algebraic equations can be solved for instance by means of Cramer's rule. We calculate first the component Y₁(s) of the vector Y(s), obtaining:
$$ Y_1(s) = \frac{\begin{vmatrix} 0 & -3 \\ \frac{1}{s^2+1} & s \end{vmatrix}}{\begin{vmatrix} s-1 & -3 \\ -2 & s \end{vmatrix}} = \frac{\frac{3}{s^2+1}}{s^2 - s - 6}. $$
The roots of the denominator s² − s − 6 are:
$$ s = \frac{1 \pm \sqrt{1+24}}{2} \;\Rightarrow\; s_{1,2} = 3, -2. $$
We can thus express Y₁(s) as:
$$ Y_1(s) = \frac{3}{(s^2+1)(s-3)(s+2)}. $$
We now apply the method of partial fractions, namely we seek the coefficients A, B, C, D such that:
$$ \frac{As+B}{s^2+1} + \frac{C}{s-3} + \frac{D}{s+2} = \frac{3}{(s^2+1)(s-3)(s+2)}, $$
namely:
As3 −As2 −6As+Bs2 −Bs−6B +Cs3 +2Cs2 +Cs+2C +Ds3 −3Ds2 +Ds−3D = 3.
This leads to the system:


$$ \begin{cases} A + C + D = 0 \\ -A + B + 2C - 3D = 0 \\ -6A - B + C + D = 0 \\ -6B + 2C - 3D = 3 \end{cases} \;\Rightarrow\; \begin{cases} A = -C - D \\ B + 3C - 2D = 0 \\ -B + 7C + 7D = 0 \\ -6B + 2C - 3D = 3 \end{cases} \;\Rightarrow\; \begin{cases} A = -C - D \\ B = -3C + 2D \\ 10C + 5D = 0 \\ 20C - 15D = 3 \end{cases} $$
If we multiply the third equation by 3 and sum it with the fourth equation we obtain
50C = 3. From it, we find the solution for all the other coefficients, namely:

$$ A = \frac{3}{50}, \qquad B = -\frac{21}{50}, \qquad C = \frac{3}{50}, \qquad D = -\frac{3}{25}. $$
The function Y₁(s) is therefore:
$$ Y_1(s) = \frac{3}{50}\,\frac{s}{s^2+1} - \frac{21}{50}\,\frac{1}{s^2+1} + \frac{3}{50}\,\frac{1}{s-3} - \frac{3}{25}\,\frac{1}{s+2}. $$
It is easy to invert this function, obtaining:
$$ y_1(x) = L^{-1}\{Y_1(s)\} = \frac{3}{50}\cos x - \frac{21}{50}\sin x + \frac{3}{50}e^{3x} - \frac{3}{25}e^{-2x}. $$
The second component of the vector Y(s) is given by:
$$ Y_2(s) = \frac{\begin{vmatrix} s-1 & 0 \\ -2 & \frac{1}{s^2+1} \end{vmatrix}}{(s-3)(s+2)} = \frac{s-1}{(s^2+1)(s-3)(s+2)}. $$
The method of partial fractions yields the same system of equations we have already encountered solving for Y₁(s); only the right-hand sides of the equations are different, namely:


$$ \begin{cases} A + C + D = 0 \\ -A + B + 2C - 3D = 0 \\ -6A - B + C + D = 1 \\ -6B + 2C - 3D = -1 \end{cases} $$
We obtain therefore:
$$ \begin{cases} A = -C - D \\ B = -3C + 2D \\ 10C + 5D = 1 \\ 20C - 15D = -1 \end{cases} \;\Rightarrow\; A = -\frac{4}{25}, \quad B = \frac{3}{25}, \quad C = \frac{1}{25}, \quad D = \frac{3}{25}. $$
$$ Y_2(s) = -\frac{4}{25}\,\frac{s}{s^2+1} + \frac{3}{25}\,\frac{1}{s^2+1} + \frac{1}{25}\,\frac{1}{s-3} + \frac{3}{25}\,\frac{1}{s+2} $$
$$ \Rightarrow\; y_2(x) = -\frac{4}{25}\cos x + \frac{3}{25}\sin x + \frac{1}{25}e^{3x} + \frac{3}{25}e^{-2x}. $$
The functions y1 (x) and y2 (x) we have found in this way are the components of the
vector solution y(x).
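The result of Example 5.2.6 can be checked against a direct numerical integration of the initial value problem; the sketch below (assuming numpy and scipy, not part of the original text) compares the two at an arbitrary point.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, 3.0],
              [2.0, 0.0]])

def y_exact(x):   # closed-form solution obtained with the Laplace transform above
    y1 = 3/50*np.cos(x) - 21/50*np.sin(x) + 3/50*np.exp(3*x) - 3/25*np.exp(-2*x)
    y2 = -4/25*np.cos(x) + 3/25*np.sin(x) + 1/25*np.exp(3*x) + 3/25*np.exp(-2*x)
    return np.array([y1, y2])

rhs = lambda x, y: A @ y + np.array([0.0, np.sin(x)])
sol = solve_ivp(rhs, (0, 1), [0.0, 0.0], rtol=1e-10, atol=1e-12)
print(np.allclose(sol.y[:, -1], y_exact(sol.t[-1]), atol=1e-6))   # expected: True
```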
Method of variation of parameters
The methods we have encountered so far can be applied to the system of nonhomogeneous ODEs y′ = Ay + g only if the components of the vector g are simple functions
(sinusoidal functions, exponentials, polynomials). For more complex functions, the
only viable method is the variation of parameters.
We have seen that, given the homogeneous system of ODEs y′ = Ay, the
solution can be expressed as:
y = c1 y(1) (x) + · · · + cn y(n) (x),
where the vectors y(1) (x), . . . , y(n) (x) form a fundamental set of solutions and are
usually of the form y(i) = p(i) eλi x . This solution can be also written with the
compact notation y = Y(x)c, where Y is the matrix whose columns are the vectors
y(1) (x), . . . , y(n) (x) and c is the (constant) column vector formed by the coefficients
c1 , . . . , cn . If we now allow the coefficients to vary, the vector c becomes dependent
on x. If we substitute the vector y = Y(x)c(x) into the nonhomogeneous system
y′ = Ay + g we obtain:
Y ′(x)c(x) + Y(x)c′ (x) = AY(x)c(x) + g(x).
Since Y is made of solutions of the corresponding homogeneous system of ODE, it
is Y ′(x) = AY(x) and, consequently, Y ′ (x)c(x) = AY(x)c(x). We can cancel out
these two terms from the previous equation and what remains is:
Y(x)c′ (x) = g(x) ⇒ c′ (x) = Y −1 (x)g(x).
In fact, the matrix Y is made of the (linearly independent) vectors that form the
fundamental set of solutions, therefore det Y 6= 0 and we can always find the inverse
Y −1 . From this equation, we obtain:
$$ c(x) = \int Y^{-1}(x)\, g(x)\, dx + K \;\Rightarrow\; y(x) = Y(x)\left[ \int Y^{-1}(x)\, g(x)\, dx + K \right], \qquad (5.35) $$
where K is a constant vector.
Example 5.2.7 Find the particular solution of the system of ODEs
$$ y' = \begin{pmatrix} 2 & -5 \\ 1 & -2 \end{pmatrix} y + \begin{pmatrix} \frac{1}{\sin x} \\ \frac{1}{\cos x} \end{pmatrix}. $$
We first have to find the matrix Y whose columns are solutions of the homogeneous system:
$$ y' = \begin{pmatrix} 2 & -5 \\ 1 & -2 \end{pmatrix} y. $$
The characteristic equation is:
(2 − λ)(−2 − λ) + 5 = 0, ⇒ λ2 + 1 = 0.
The eigenvalues are therefore the (complex conjugate) values λ₁,₂ = ±i. We can take λ = i and calculate the corresponding eigenvector:
$$ \begin{pmatrix} 2-i & -5 \\ 1 & -2-i \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$
The second equation of this system is:
$$ p_1 - (2+i)\,p_2 = 0 \;\Rightarrow\; p = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}. $$
We have obtained a complex eigenvalue λ = i and a complex eigenvector p = (2+i, 1)ᵀ. As we have learned, we have to find the real and the imaginary parts of pe^{λx}, namely:
$$ y = \begin{pmatrix} 2+i \\ 1 \end{pmatrix} e^{ix} = \begin{pmatrix} 2+i \\ 1 \end{pmatrix}(\cos x + i\sin x) = \begin{pmatrix} (2+i)\cos x + (2i-1)\sin x \\ \cos x + i\sin x \end{pmatrix} $$
$$ \Rightarrow\; u = \begin{pmatrix} 2\cos x - \sin x \\ \cos x \end{pmatrix}, \qquad v = \begin{pmatrix} \cos x + 2\sin x \\ \sin x \end{pmatrix}. $$
The matrix Y is thus given by:
$$ Y = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix}. $$
Given a 2 × 2 matrix
$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix}, $$
it is easy to show that the inverse is given by:
$$ \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. $$
In the case of the matrix Y it is:
$$ ad - bc = \det Y = 2\sin x\cos x - \sin^2 x - \cos^2 x - 2\sin x\cos x = -1, $$
therefore Y⁻¹ is given by:
$$ Y^{-1} = \begin{pmatrix} -\sin x & \cos x + 2\sin x \\ \cos x & \sin x - 2\cos x \end{pmatrix}. $$
We can now apply Eq. 5.35 and obtain:
$$ y(x) = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix} \left[ \int \begin{pmatrix} -\sin x & \cos x + 2\sin x \\ \cos x & \sin x - 2\cos x \end{pmatrix} \begin{pmatrix} \frac{1}{\sin x} \\ \frac{1}{\cos x} \end{pmatrix} dx + K \right] $$
$$ = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix} \left[ \int \begin{pmatrix} -1 + 1 + 2\tan x \\ \cot x + \tan x - 2 \end{pmatrix} dx + K \right] $$
$$ = \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix} \begin{pmatrix} -2\ln(\cos x) \\ \ln(\sin x) - \ln(\cos x) - 2x \end{pmatrix} + \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix} K $$
$$ = \begin{pmatrix} -2(2\cos x - \sin x)\ln(\cos x) + (\cos x + 2\sin x)\left[\ln(\sin x) - \ln(\cos x) - 2x\right] \\ -2\cos x\,\ln(\cos x) + \sin x\left[\ln(\sin x) - \ln(\cos x) - 2x\right] \end{pmatrix} + \begin{pmatrix} 2\cos x - \sin x & \cos x + 2\sin x \\ \cos x & \sin x \end{pmatrix} K. $$
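Eq. 5.35 also lends itself to a numerical implementation when the integrals cannot (or need not) be done analytically. The sketch below (assuming numpy and scipy; the lower integration limit x₀ = 0.5 is an arbitrary choice that only shifts the particular solution by a homogeneous one) builds a particular solution of Example 5.2.7 by quadrature and verifies that it satisfies y′ = Ay + g.

```python
import numpy as np
from scipy.integrate import quad

A = np.array([[2.0, -5.0],
              [1.0, -2.0]])

def Y(x):            # fundamental matrix built from u(x) and v(x) above
    return np.array([[2*np.cos(x) - np.sin(x), np.cos(x) + 2*np.sin(x)],
                     [np.cos(x), np.sin(x)]])

def g(x):
    return np.array([1/np.sin(x), 1/np.cos(x)])

def yp(x, x0=0.5):   # particular solution: Y(x) * integral_{x0}^{x} Y^{-1}(s) g(s) ds
    integrand = lambda s, i: np.linalg.solve(Y(s), g(s))[i]
    c = np.array([quad(integrand, x0, x, args=(i,), epsabs=1e-12, epsrel=1e-12)[0]
                  for i in range(2)])
    return Y(x) @ c

# Check that y_p' = A y_p + g at a point where sin and cos are both nonzero.
x, h = 0.8, 1e-5
dyp = (yp(x + h) - yp(x - h)) / (2 * h)
print(np.allclose(dyp, A @ yp(x) + g(x), atol=1e-4))   # expected: True
```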
5.3 Systems of second order linear ODEs
A generic system of second order linear ODEs with constant coefficients can be written as:
$$ \begin{cases} y_1'' = a_{11}^{(1)} y_1 + a_{11}^{(2)} y_1' + \dots + a_{1n}^{(1)} y_n + a_{1n}^{(2)} y_n' + g_1(x) \\ \quad\vdots \\ y_n'' = a_{n1}^{(1)} y_1 + a_{n1}^{(2)} y_1' + \dots + a_{nn}^{(1)} y_n + a_{nn}^{(2)} y_n' + g_n(x) \end{cases} \qquad (5.36) $$
This system of n second order ODEs can be transformed into a system of 2n first order ODEs by the substitutions:
$$ \begin{cases} y_1' = u_1 \\ \quad\vdots \\ y_n' = u_n \end{cases}. $$
The resulting system of ODEs is:
$$ \begin{cases} u_1' = a_{11}^{(1)} y_1 + a_{11}^{(2)} u_1 + \dots + a_{1n}^{(1)} y_n + a_{1n}^{(2)} u_n + g_1(x) \\ \quad\vdots \\ u_n' = a_{n1}^{(1)} y_1 + a_{n1}^{(2)} u_1 + \dots + a_{nn}^{(1)} y_n + a_{nn}^{(2)} u_n + g_n(x) \\ y_1' = u_1 \\ \quad\vdots \\ y_n' = u_n \end{cases}, $$
and it can be solved with the methods described so far.
On the other hand, a system of second order ODEs can be also solved by means
of elimination methods. For instance in the system:

$$ \begin{cases} y_1'' = a y_1 + b y_2 + g_1(x) \\ y_2'' = c y_1 + d y_2 + e y_1' + f y_2' + g_2(x) \end{cases}, $$
we can see that it is possible to recover y2 from the first equation and, substituting
it into the second equation, we obtain a fourth order nonhomogeneous ODE in y1
that can be solved with the known method and, after substituting it back into the
first ODE, also y2 can be found.
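The reduction of a second order system to first order is exactly what general-purpose ODE integrators expect. A minimal sketch (with made-up coefficients, assuming numpy and scipy; not part of the original text) of the substitution described above:

```python
import numpy as np
from scipy.integrate import solve_ivp

# The 2 second-order equations y1'' = a y1 + b y2, y2'' = c y1 + d y2 + e y1' + f y2'
# become 4 first-order equations for the state (y1, y2, u1, u2) with u_i = y_i'.
a, b, c, d, e, f = -2.0, 1.0, 1.0, -3.0, 0.1, -0.2

def rhs(x, s):
    y1, y2, u1, u2 = s
    return [u1,
            u2,
            a*y1 + b*y2,
            c*y1 + d*y2 + e*u1 + f*u2]

sol = solve_ivp(rhs, (0, 10), [1.0, 0.0, 0.0, 0.0], rtol=1e-8)
print(sol.y[:2, -1])   # y1(10), y2(10)
```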
Chapter 6
Modeling physical systems with ODEs
In this chapter we will give some indications on how to treat physical problems by
means of differential equations (or systems of differential equations). We will start
providing some guidelines on how to construct the mathematical model (namely
the underlying differential equations) of some given physical problem. We will then
describe some famous (and some less famous) physical problems and show how the mathematical tools learned so far can help us find their solutions.
6.1
Constructing mathematical models
Differential equations are useful in practically all physical problems of some relevance. However, some work is needed to translate a physical problem into mathematical language and to formulate the appropriate differential equation(s) that describe the problem being investigated. We have already seen in Chapter 2 some very easy physical problems (like the fall of a body with and without air resistance); in this section we will take into account (slightly) more complicated problems, trying to learn general methods useful to treat any physical system. It is however worth remarking that "magical recipes" on how to find the mathematical model of a physical process do not exist, and a lot of experience and hard work are always the best way to find the solutions we look for.
It might be useful to start this section with two examples.
Example 6.1.1 A baseball player (with some knowledge of physics) is interested in
knowing the “optimum angle” (that he calls αo ) at which he should hit the ball, that
guarantees the maximum range of the ball. Moreover, he knows that he can give the
ball an initial speed of ∼ 35 m/s and he wishes to know how much he can deviate
from the optimum angle and still make a home run (namely to have a range of ∼ 120 m).

Figure 6.1: Reference frame to use in Example 6.1.1.
The baseball player needs to know the range of the ball, therefore a spatial coordinate
(that he will call x and will measure in meters). However, he needs to calculate the
ball orbit, therefore also the vertical coordinate (y) must be taken into account. He
must start therefore deciding the best reference frame, so that he can know the zero
point of the horizontal and vertical coordinates. The best possible reference frame is
indicated in Fig. 6.1. In fact, the height of the bat at the moment in which he hits
the ball is comparable to the height of the external wall, therefore the natural choice
is to take as zero point of the y-axis this height. He will then call α the angle of
elevation of the ball at the moment he hits it.
To obtain the orbit of the ball, the baseball player needs to know how the spatial
coordinates x and y vary with time, therefore the time (that he will measure in seconds) is the independent variable of the problem. He now needs to know the forces
acting on the ball. He can argue that, since the ball is small and the whole orbit of the ball should last only a few seconds, the air resistance can be neglected and the only force in action is gravity, which acts along the y direction, towards the surface of the Earth. From Newton's second law F = ma the differential equations he needs to solve are:
$$ \begin{cases} m a_x = 0 \\ m a_y = -mg \end{cases}, $$
where m is the mass of the ball (which can be cancelled out), a_x = x″(t) and a_y = y″(t) are the accelerations along the x and y directions, respectively, and g = 9.8 m s⁻² is the gravitational acceleration at sea level. All he needs to know now are the initial conditions.
Taking t = 0 at the moment he hits the ball, of course it is x(0) = y(0) = 0. The
initial velocities x′ (0) and y ′(0) are the components of the initial speed v0 along the
x and y direction, respectively, therefore:

$$ \begin{cases} x'(0) = v_0\cos\alpha \\ y'(0) = v_0\sin\alpha \end{cases}. $$
By integrating twice with respect to t the ODEs x″(t) = 0 and y″(t) = −g, one obtains:
$$ \begin{cases} x(t) = v_0 t\cos\alpha \\ y(t) = v_0 t\sin\alpha - \frac{1}{2} g t^2 \end{cases} $$
Since he needs to know the orbit of the ball, he can recover t from the first equation, namely
$$ t = \frac{x}{v_0\cos\alpha}. $$
By substituting it into the second equation, one obtains:
$$ y = x\tan\alpha - \frac{1}{2}\, g\, \frac{x^2}{v_0^2\cos^2\alpha}. $$
The range of the ball is the value of x for which y = 0. The equation y = 0 is satisfied for x = 0 (that is the initial condition) and for
$$ x = \frac{2 v_0^2\cos^2\alpha\,\tan\alpha}{g} = \frac{2 v_0^2\sin\alpha\cos\alpha}{g} = \frac{v_0^2\sin(2\alpha)}{g}. $$
The baseball player needs to know when the function x(α) has a maximum, therefore he calculates the derivative dx/dα and sets it equal to zero, obtaining:
$$ \frac{dx}{d\alpha} = \frac{2 v_0^2\cos(2\alpha)}{g} = 0 \;\Rightarrow\; \cos(2\alpha) = 0 \;\Rightarrow\; 2\alpha = \frac{\pi}{2}. $$
In the end, the optimum angle α_o is π/4.
To answer the second question he makes use of the relation x = v₀²sin(2α)/g he has just found. It is required to know for which values of α x is larger than 120 m. This occurs when:
$$ x = \frac{v_0^2\sin(2\alpha)}{g} > 120 \;\Rightarrow\; \sin(2\alpha) > \frac{120 \times 9.8}{v_0^2}. $$
For v₀ = 35 m s⁻¹ he obtains:
$$ \sin(2\alpha) > 0.96, $$
that is verified in the interval
$$ \alpha \in [37°, 53°]. $$
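The baseball player's two answers are easily reproduced numerically; the following sketch (assuming numpy, not part of the original text) scans the range formula over the elevation angle.

```python
import numpy as np

g, v0 = 9.8, 35.0                       # m/s^2, m/s

def x_range(alpha):                     # range for elevation angle alpha (radians)
    return v0**2 * np.sin(2*alpha) / g

alphas = np.radians(np.linspace(1, 89, 8801))
print("optimum angle:", np.degrees(alphas[np.argmax(x_range(alphas))]))   # ~45 deg
ok = alphas[x_range(alphas) > 120.0]
print("home-run interval: %.1f - %.1f deg" % (np.degrees(ok[0]), np.degrees(ok[-1])))
```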
Example 6.1.2 A team of astronomers has observed a Supernova Remnant (the
remnant left behind after the explosion of a star) but they do not know when the
explosion took place. From the spectrum of the remnant they are able to understand
that it contains a large amount of iron, a small amount of cobalt and a negligible
amount of nickel. They are also able to determine that the iron is ∼ 10 times as
much as the cobalt. They know, too, that a typical Supernova explosion releases into the interstellar medium some amount x₀ of ⁵⁶Ni and negligible amounts of ⁵⁶Co and ⁵⁶Fe. However, ⁵⁶Ni is an unstable isotope and decays into ⁵⁶Co with a decay time of ∼ 6 d. Also ⁵⁶Co is unstable and decays into ⁵⁶Fe in ∼ 77 d. On the other hand, ⁵⁶Fe is a stable isotope, therefore in the long run it is to be expected that all the nickel and cobalt will be turned into iron. Is this enough to estimate how many days have elapsed since the explosion?
The rate at which the number of atoms of a radioactive isotope decreases with time is
proportional to the amount of atoms of that species present at the time t (the larger
the number of atoms of the radioactive isotope, the larger the number of transitions to
a stable isotope), namely the number N(t) of atoms of the radioactive isotope obeys
the differential equation:
dN
= −λN,
dt
where λ is the inverse of the decay time (indeed, it must have the dimensions of an inverse time).
In our example, we can call λ_Ni the decay rate of nickel to cobalt (namely λ_Ni = 1/(6 d)) and λ_Co the decay rate of cobalt to iron (λ_Co = 1/(77 d)). The equation that gives the number of atoms of Ni as a function of time is simply
$$ \frac{dNi}{dt} = -\lambda_{Ni}\, Ni. $$
In fact, the Ni population reduces in size because it decays to cobalt and there are no chemical processes replacing it. On the other hand, the number of atoms of cobalt increases because of the decay of Ni but decreases because Co decays to Fe, namely the number of Co atoms as a function of time obeys the following ODE:
$$ \frac{dCo}{dt} = \lambda_{Ni}\, Ni - \lambda_{Co}\, Co. $$
The number of iron atoms can only increase with time and its rate of change is dictated by the decay of cobalt, namely we have:
$$ \frac{dFe}{dt} = \lambda_{Co}\, Co. $$
In the end, if we solve the system of ODEs
$$ \begin{cases} \dfrac{dNi}{dt} = -\lambda_{Ni}\, Ni \\[4pt] \dfrac{dCo}{dt} = \lambda_{Ni}\, Ni - \lambda_{Co}\, Co \\[4pt] \dfrac{dFe}{dt} = \lambda_{Co}\, Co \end{cases}, $$
we will obtain the functions Ni(t), Co(t) and Fe(t) that give us the number of atoms of the 3 species as a function of time. It is worth noticing that dNi/dt + dCo/dt + dFe/dt = 0, as it should be since, after the Supernova explosion, these elements are neither created nor destroyed, but only transformed one into another. This system of ODEs can be solved in many ways. We will choose here a method that makes use of the Laplace transform.
If we apply the Laplace transform to the first ODE, we obtain:
$$ s L\{Ni\} - x_0 = -\lambda_{Ni} L\{Ni\} \;\Rightarrow\; L\{Ni\} = \frac{x_0}{s + \lambda_{Ni}}, $$
where x₀ is the (unknown) amount of Ni produced by the Supernova. We now apply the Laplace transform to the second ODE and obtain:
$$ s L\{Co\} = \lambda_{Ni} L\{Ni\} - \lambda_{Co} L\{Co\} = \frac{\lambda_{Ni} x_0}{s + \lambda_{Ni}} - \lambda_{Co} L\{Co\}. $$
Here, we have considered that the amount of cobalt produced by the Supernova is negligible, namely Co(0) = 0. From it, we obtain:
$$ L\{Co\} = \frac{\lambda_{Ni} x_0}{(s + \lambda_{Co})(s + \lambda_{Ni})}. $$
Inverting this relation, we can already calculate Co(t). We apply the method of partial fractions and obtain:
$$ \frac{A}{s + \lambda_{Co}} + \frac{B}{s + \lambda_{Ni}} = \frac{\lambda_{Ni} x_0}{(s + \lambda_{Co})(s + \lambda_{Ni})} \;\Rightarrow\; A s + A\lambda_{Ni} + B s + B\lambda_{Co} = \lambda_{Ni} x_0. $$
From it, we obtain the system of equations:
$$ \begin{cases} A + B = 0 \\ A\lambda_{Ni} + B\lambda_{Co} = \lambda_{Ni} x_0 \end{cases} \;\Rightarrow\; \begin{cases} A = -B \\ A(\lambda_{Ni} - \lambda_{Co}) = \lambda_{Ni} x_0 \end{cases} \;\Rightarrow\; \begin{cases} A = \dfrac{\lambda_{Ni} x_0}{\lambda_{Ni} - \lambda_{Co}} \\[6pt] B = -\dfrac{\lambda_{Ni} x_0}{\lambda_{Ni} - \lambda_{Co}} \end{cases}. $$
Recalling that L⁻¹{1/(s − a)} = e^{ax}, we obtain:
$$ Co(t) = \frac{x_0\lambda_{Ni}}{\lambda_{Ni} - \lambda_{Co}}\left( e^{-\lambda_{Co}t} - e^{-\lambda_{Ni}t} \right). $$
We can notice from this equation that Co(0) = 0 and that Co(t) tends to zero for t → ∞, as we expect since all the cobalt will eventually be turned into iron.
From the equation dFe/dt = λ_Co Co, since Fe(0) = 0 we obtain:
$$ s L\{Fe\} = \lambda_{Co} L\{Co\} \;\Rightarrow\; L\{Fe\} = \frac{x_0\lambda_{Co}\lambda_{Ni}}{s(s + \lambda_{Co})(s + \lambda_{Ni})}. $$
We apply again the method of partial fractions, looking for coefficients A, B, C such that
$$ \frac{A}{s} + \frac{B}{s + \lambda_{Co}} + \frac{C}{s + \lambda_{Ni}} = \frac{x_0\lambda_{Co}\lambda_{Ni}}{s(s + \lambda_{Co})(s + \lambda_{Ni})}. $$
We obtain:
$$ A s^2 + A s(\lambda_{Co} + \lambda_{Ni}) + A\lambda_{Co}\lambda_{Ni} + B s^2 + B s\lambda_{Ni} + C s^2 + C s\lambda_{Co} = x_0\lambda_{Co}\lambda_{Ni} $$
$$ \Rightarrow\; \begin{cases} A + B + C = 0 \\ A(\lambda_{Co} + \lambda_{Ni}) + B\lambda_{Ni} + C\lambda_{Co} = 0 \\ A\lambda_{Co}\lambda_{Ni} = x_0\lambda_{Co}\lambda_{Ni} \end{cases} \;\Rightarrow\; \begin{cases} A = x_0 \\ B = -C - x_0 \\ x_0(\lambda_{Co} + \lambda_{Ni}) - C\lambda_{Ni} - x_0\lambda_{Ni} + C\lambda_{Co} = 0 \end{cases} \;\Rightarrow\; \begin{cases} A = x_0 \\ B = \dfrac{x_0\lambda_{Ni}}{\lambda_{Co} - \lambda_{Ni}} \\[6pt] C = \dfrac{x_0\lambda_{Co}}{\lambda_{Ni} - \lambda_{Co}} \end{cases} $$
Inverting this function we obtain:
$$ Fe(t) = x_0\, L^{-1}\left\{ \frac{1}{s} + \frac{1}{\lambda_{Co} - \lambda_{Ni}}\left( \frac{\lambda_{Ni}}{s + \lambda_{Co}} - \frac{\lambda_{Co}}{s + \lambda_{Ni}} \right) \right\} = x_0\left[ 1 + \frac{1}{\lambda_{Co} - \lambda_{Ni}}\left( \lambda_{Ni} e^{-\lambda_{Co}t} - \lambda_{Co} e^{-\lambda_{Ni}t} \right) \right]. $$
Also in this case we can see that F e(0) = 0 whereas F e(t) tends, for t → ∞, to x0
(as it should be since at sufficiently large times all the initial amount of nickel will
be turned into cobalt and all the cobalt into iron).
The ratio between Fe(t) and Co(t) is given by:
$$ \frac{Fe(t)}{Co(t)} = \frac{1 - \dfrac{\lambda_{Ni} e^{-\lambda_{Co}t} - \lambda_{Co} e^{-\lambda_{Ni}t}}{\lambda_{Ni} - \lambda_{Co}}}{\dfrac{\lambda_{Ni}}{\lambda_{Ni} - \lambda_{Co}}\left( e^{-\lambda_{Co}t} - e^{-\lambda_{Ni}t} \right)}. $$
We know that this ratio must be ∼ 10, therefore from the above equation we can recover the time at which this happens. We can simplify this equation by noticing that λ_Co ≪ λ_Ni, therefore λ_Co − λ_Ni ≃ −λ_Ni and e^{−λ_Ni t} − e^{−λ_Co t} ≃ −e^{−λ_Co t}, therefore we obtain:
$$ \frac{Fe(t)}{Co(t)} \simeq \frac{1 + \dfrac{\lambda_{Co} e^{-\lambda_{Ni}t} - \lambda_{Ni} e^{-\lambda_{Co}t}}{\lambda_{Ni}}}{e^{-\lambda_{Co}t}} \simeq e^{\lambda_{Co}t} + \frac{\lambda_{Co}}{\lambda_{Ni}} e^{-(\lambda_{Ni}-\lambda_{Co})t} - 1 \simeq e^{\lambda_{Co}t} - 1. $$
By equating this ratio to 10, we obtain:
$$ e^{\lambda_{Co}t} \sim 11 \;\Rightarrow\; t \sim \frac{1}{\lambda_{Co}}\ln 11 \simeq 185\ \mathrm{d}. $$
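The decay-chain model can also be solved numerically, which provides a check of the approximations made above. A sketch (assuming numpy and scipy; x₀ is arbitrary because only the Fe/Co ratio matters; not part of the original text):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

lam_ni, lam_co = 1/6.0, 1/77.0          # decay rates in 1/day
x0 = 1.0                                # initial amount of 56Ni (arbitrary units)

def rhs(t, y):                          # y = (Ni, Co, Fe)
    ni, co, fe = y
    return [-lam_ni*ni, lam_ni*ni - lam_co*co, lam_co*co]

def ratio_minus_10(t):                  # Fe/Co - 10 as a function of time
    y = solve_ivp(rhs, (0, t), [x0, 0, 0], rtol=1e-10, atol=1e-12).y[:, -1]
    return y[2]/y[1] - 10.0

print("age of the remnant: %.0f days" % brentq(ratio_minus_10, 30, 400))   # ~185 d
```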
As we have seen from the previous two physical problems, the construction and
the solution of a mathematical model is a process that can be loosely described with
the following steps:
• identify the independent and dependent variables and assign letters to them.
In our case, the independent variable was always the time and the dependent
variables were the spatial coordinates x and y (Example 6.1.1) and the number
of atoms of Fe, Co, Ni (Example 6.1.2).
• Choose the units of measurement of each variable. This choice is arbitrary but
the calculations are easier if the measured quantities are close to unity (for this
reason we have chosen to measure the time in seconds in the first example and
in days in the second).
• Articulate the basic principle underlying the problem you are investigating.
This can be a widely recognized physical law (such as Newton's second law
or the law of decay of a radioactive isotope), or it may be a more speculative
assumption based on experience or observations. Since in the real world many
forces and effects act at the same time, be sure that you have isolated the main
driver of the physical process and neglect all the forces that you may reasonably
consider negligible.
• Express this principle or law in terms of the variables you chose in the first step.
That may require the introduction of physical constants (like g) or parameters
(like λN i,Co ).
178
CHAPTER 6. MODELING PHYSICAL SYSTEMS WITH ODES
• Make sure that each term in your equations has the same physical units.
• Formulate the initial conditions of the problem. For problems involving orbits
and trajectories of bodies, this implies the choice of a reference frame.
• If the obtained equations are still too complicated and intractable, further
approximations may be required. For example, at one stage in Example 6.1.2
we have neglected λCo with respect to λN i .
• Solve the obtained ODEs. Try to make all the possible checks to verify that the
solution is correct. In particular, check if the initial conditions are satisfied and
check if the behavior at large times appears physically reasonable (as we have
done in Example 6.1.2, where we have verified that at large time all the cobalt
and nickel will be turned into iron). If possible, one should also calculate the
values of the solutions at selected points and compare with observations. Or
examine the solutions corresponding to certain special values of the parameters
of the problem.
6.2 Mechanical and electrical vibrations

6.2.1 The spring-mass system
We will start here considering some of the most famous and important physical
problems to understand how the methods learned so far help us solve them. The
first process we analyze is the motion of a mass on a spring because the principles
involved are common in many systems.
Referring to Fig. 6.2, we consider a mass hanging on the end of a spring with
original length y0 . The mass causes an elongation ye of the spring in the downward
direction, which we will take as the positive one. There are at this point two forces
acting on the mass m: the gravitational force w = mg and a force Fs due to the spring that acts upward and opposes gravity. Already in the seventeenth
century Hooke discovered that, for small elongations y, the force is proportional to
the elongation, namely
Fs = −ky,
(6.1)
where the constant of proportionality k is called the spring constant. This law is
called Hooke’s law. Since ye is the elongation at which the gravity and the elastic
force balance, it must be:
Figure 6.2: A spring-mass system and the forces acting on the mass m.
mg = kye .
(6.2)
This means that if we attach a body of mass m to a spring whose spring constant is
unknown, it is enough to measure the elongation ye and calculate k by means of the
relation
$$ k = \frac{mg}{y_e}. $$
We are interested in studying the motion of the body out of its equilibrium, namely how the spring-mass system reacts after we displace the mass m by an amount u = y − y_e. The elastic force will be at this point F_s = −ky = −k(y_e + u). From Newton's second law F = ma we have:
mu′′ (t) = Fs + mg = −k[ye + u(t)] + mg = −ku(t),
(6.3)
where we have made use of the relation Eq. 6.2.
The equation of motion of the spring is thus given by the solution of the ODE mu″(t) + ku(t) = 0. Since the constants m and k are always positive, the solutions of the characteristic equation mλ² + k = 0 are always the imaginary numbers:
$$ \lambda_{1,2} = \pm i\sqrt{\frac{k}{m}} = \pm i\omega_0, $$
where ω0 is called the natural frequency of the vibration. The solution u(t) of the
motion of the spring is thus given by:
u(t) = A cos(ω0 t) + B sin(ω0 t).
We can express this solution also as:
$$ u(t) = R\cos(\omega_0 t - \delta). \qquad (6.4) $$
In fact,
R cos(ω0 t − δ) = R cos δ cos(ω0 t) + R sin δ sin(ω0 t),
therefore the two solutions coincide provided that
A = R cos δ,
B = R sin δ.
Thus:
$$ R = \sqrt{A^2 + B^2}, \qquad \delta = \arctan\frac{B}{A}. \qquad (6.5) $$
The motion of the spring subject only to gravity and to the elastic force is thus given by a vibration (also called undamped free vibration) about the equilibrium position y_e, with period
$$ T = \frac{2\pi}{\omega_0} = 2\pi\sqrt{\frac{m}{k}}, $$
amplitude R and phase δ. The amplitude and phase are dictated by the initial
conditions of the problem.
The idealized configuration of the system in which only gravity and the elastic
force act is never attainable in practice. In fact, it would predict a vibration, always
with the same amplitude R, lasting forever. Resistance from the medium in which
the mass moves, internal energy dissipation and other dissipative phenomena are
expected to damp the motion of the mass m. We can take into account these damping
effects by introducing a resistive force Fd that opposes the motion of the mass
(therefore it has a negative sign) and can be assumed proportional to the velocity,
namely:
Fd (t) = −γu′ (t).
In this way, the motion of the spring must obey the ODE
mu′′ (t) + γu′ (t) + ku(t) = 0.
Also this ODE has constant coefficients, therefore the solution is simply given by:
u(t) = Aeλ1 t + Beλ2 t ,
where λ1,2 are the solutions of the characteristic equation mλ2 + γλ + k = 0. These
solutions are:
$$ \lambda_{1,2} = \frac{-\gamma \pm \sqrt{\gamma^2 - 4km}}{2m} = -\frac{\gamma}{2m} \pm \frac{1}{2m}\sqrt{\gamma^2 - 4km}. \qquad (6.6) $$
We can now distinguish two cases:
• γ < 2√(km)
In this case, γ² − 4km < 0 and the solutions of the characteristic equation are the complex conjugate numbers:
$$ \lambda_{1,2} = -\frac{\gamma}{2m} \pm \frac{i}{2m}\sqrt{4km - \gamma^2}. $$
The law of motion of the mass m is thus given by:
$$ u(t) = e^{-\frac{\gamma}{2m}t}\left[ A\cos(\mu t) + B\sin(\mu t) \right] = R\, e^{-\frac{\gamma}{2m}t}\cos(\mu t - \delta), \qquad (6.7) $$
where
$$ \mu = \frac{\sqrt{4km - \gamma^2}}{2m} > 0, \qquad (6.8) $$
and R and δ are given again by Eq. 6.5. This is again a vibration (called damped free vibration) but with amplitude R e^{-γt/(2m)} decreasing with time (in fact γ, m > 0) and tending to zero for t → ∞.
• γ ≥ 2√(km)
In this case, γ² − 4km ≥ 0 and therefore we have two real solutions (either distinct or coincident) of the characteristic equation. However, γ² − 4km is always less than γ² (k and m are positive constants), therefore both λ₁,₂ given by Eq. 6.6 are negative. That means that the law of motion of the spring is given by the sum of two exponentially decaying functions u(t) = c₁e^{λ₁t} + c₂e^{λ₂t} (u(t) = (c₁ + c₂t)e^{λt} in the case in which γ = 2√(km)) and therefore the mass tends to return to the equilibrium position for large times without oscillating about the equilibrium position. That corresponds to a very strong damping that prevails over the elastic force. The threshold value γ = 2√(km) is also called critical damping.
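The three damping regimes translate directly into a small routine. The sketch below (assuming numpy; the initial conditions u(0) = 1, u′(0) = 0 and the numerical values are arbitrary choices, not part of the original text) returns u(t) for the free spring-mass system.

```python
import numpy as np

def motion(m, k, gamma, u0=1.0, v0=0.0):
    """Return u(t) for the free spring-mass system m u'' + gamma u' + k u = 0."""
    disc = gamma**2 - 4*k*m
    if disc < 0:                                     # underdamped: damped vibration
        mu = np.sqrt(4*k*m - gamma**2) / (2*m)
        A, B = u0, (v0 + gamma/(2*m)*u0) / mu
        return lambda t: np.exp(-gamma/(2*m)*t) * (A*np.cos(mu*t) + B*np.sin(mu*t))
    else:                                            # critically or overdamped: no oscillation
        l1 = (-gamma + np.sqrt(disc)) / (2*m)
        l2 = (-gamma - np.sqrt(disc)) / (2*m)
        if np.isclose(l1, l2):                       # critical damping: (c1 + c2 t) e^{l t}
            return lambda t: (u0 + (v0 - l1*u0)*t) * np.exp(l1*t)
        c2 = (v0 - l1*u0) / (l2 - l1)
        c1 = u0 - c2
        return lambda t: c1*np.exp(l1*t) + c2*np.exp(l2*t)

u = motion(m=1.0, k=4.0, gamma=0.5)                  # gamma < 2 sqrt(km): damped vibration
print(u(0.0), u(5.0))                                # amplitude decays towards zero
```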
In some cases, an external force Fe (t) might be applied to the mass m, therefore
the law of motion is given by the solution of the following ODE:
mu′′ (t) + γu′ (t) + ku(t) = Fe (t).
(6.9)
The solution of this ODE is the sum of the complementary solution uc (t) (given by
Eq. 6.7 in the case of small damping), plus a particular solution up (t) related to the
nature of the external force Fe (t).
Example 6.2.1 A mass m = 1 kg is attached to a spring with spring constant k = 4
N/m and to a damping device with variable but small damping coefficient γ. Analyze
the motion of m as a function of γ under the influence of the external forces

$$ F_e(t) = \begin{cases} F_1\cos t \\ F_2\cos(2t) \end{cases}. $$
To simplify the notation we can write γ = 2β. The damping coefficient is assumed to
be small, therefore we can assume β < 1. The law of motion must obey the following
ODE:
u′′ (t) + 2βu′ (t) + 4u(t) = Fe (t).
According to Eq. 6.7 the complementary solution, namely the solution of the corresponding homogeneous ODE, is given by:
uc (t) = Re−βt cos(µt − δ),
where R and δ can be obtained from the initial conditions and
$$ \mu = \frac{\sqrt{16 - 4\beta^2}}{2} = \sqrt{4 - \beta^2}. $$
Let us take first the external force Fe (t) = F1 cos t. To find the particular solution,
we can apply the method of the undetermined coefficients. Provided that µ ≠ 1 we
know that the particular solution has the form:
up (t) = A1 cos t + B1 sin t.
The fact that β < 1 ensures us that µ > 1, therefore this is indeed the form of the
particular solution. The first and the second derivative of up (t) are given by:
up ′ (t) = −A1 sin t + B1 cos t
up ′′ (t) = −A1 cos t − B1 sin t
By substituting them into the given ODE and comparing the coefficients of sin t and cos t we obtain the system of equations:
$$ \begin{cases} 3A_1 + 2\beta B_1 = F_1 \\ 3B_1 - 2\beta A_1 = 0 \end{cases} \;\Rightarrow\; \begin{cases} B_1 = \dfrac{2\beta A_1}{3} \\[4pt] 3A_1 + \dfrac{4\beta^2 A_1}{3} = F_1 \end{cases} \;\Rightarrow\; \begin{cases} A_1 = \dfrac{3F_1}{9 + 4\beta^2} \\[4pt] B_1 = \dfrac{2\beta F_1}{9 + 4\beta^2} \end{cases}. $$
The law of motion of the mass m is thus given by:
$$ u(t) = R\, e^{-\beta t}\cos\!\left(\sqrt{4 - \beta^2}\, t - \delta\right) + \frac{F_1}{9 + 4\beta^2}\left( 3\cos t + 2\beta\sin t \right), $$
namely it is the superposition of two oscillations with angular frequencies √(4 − β²) and 1, respectively.
In the case in which F_e(t) = F₂cos(2t), provided that µ ≠ 2 (namely that β ≠ 0) we can assume a particular solution of the form:
$$ u_p(t) = A_2\cos(2t) + B_2\sin(2t) $$
$$ u_p'(t) = -2A_2\sin(2t) + 2B_2\cos(2t) $$
$$ u_p''(t) = -4A_2\cos(2t) - 4B_2\sin(2t) $$
Since in this case u_p″(t) + 4u_p(t) = 0 we can immediately realize that A₂ = 0 and that B₂ = F₂/(4β). The law of motion is thus:
$$ u(t) = R\, e^{-\beta t}\cos\!\left(\sqrt{4 - \beta^2}\, t - \delta\right) + \frac{F_2}{4\beta}\sin(2t). $$
Also in this case the motion is characterized by the superposition of two oscillations, but the amplitude F₂/(4β) of the second oscillation becomes unbounded for β → 0. This happens any time the frequency ω of the applied force is equal (or very close) to the natural frequency ω₀ = √(k/m) (2 in our example) of the considered system. This phenomenon is called resonance.
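The resonance phenomenon is easy to explore by evaluating the steady-state amplitude of the forced, damped oscillator as a function of the driving frequency. The formula used below is the standard amplitude of the particular solution of u″ + 2βu′ + ω₀²u = F cos(ωt); at ω = ω₀ = 2 it reduces to F/(4β), consistently with the example (a sketch assuming numpy, not part of the original text).

```python
import numpy as np

def forced_amplitude(omega, omega0=2.0, beta=0.05, F=1.0):
    """Steady-state amplitude of u'' + 2*beta*u' + omega0**2*u = F*cos(omega*t)."""
    return F / np.sqrt((omega0**2 - omega**2)**2 + (2*beta*omega)**2)

for w in (1.0, 1.9, 2.0, 2.1):
    print(w, forced_amplitude(w))      # peaks near omega = omega0 = 2 (resonance)
```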
6.2.2 Electric circuits
A simple electric circuit (also called RLC circuit) is composed of a resistor, a capacitor
and an inductor connected in series, as shown in Fig. 6.3. The current I, measured
in amperes, is a function of time t. The resistance R (ohms), the capacitance C
(farads) and the inductance L (henrys) are all positive constants. The impressed
voltage V (in volts) is a given function of time. Another physical quantity that
enters the problem is the total charge Q (measured in coulombs), which is linked to I by the relation I(t) = Q′(t).

Figure 6.3: A simple RLC electric circuit.
The flow of current in the circuit is governed by Kirchhoff's second law, which states that in a closed circuit the impressed voltage is equal to the sum of the voltage drops in the rest of the circuit. According to elementary laws of electricity, it is
known that:
• the voltage drop across the resistor is IR;
• the voltage drop across the capacitor is Q/C;
• the voltage drop across the inductor is LI ′ (t).
Kirchhoff's second law thus translates into:
$$ L\frac{dI}{dt} + RI + \frac{Q}{C} = V(t). $$
By means of the relation I(t) = Q′(t) we obtain:
$$ LQ''(t) + RQ'(t) + \frac{1}{C}Q(t) = V(t). \qquad (6.10) $$
This is a nonhomogeneous second-order ODE with constant coefficients that resembles very closely Eq. 6.9, which describes the dynamics of a spring-mass system in the presence of an external force; therefore the solutions can be found exactly as in Sect. 6.2.1. It often happens in physics that similar differential equations can describe quite different physical systems.
6.3 Other physical processes

6.3.1 Wave propagation
It is known that a small perturbation expands in a medium according to the wave equation
$$ \frac{\partial^2 y}{\partial t^2} = v^2\nabla^2 y, $$
where v is the propagation speed of the wave and ∇² is the Laplace operator (see Sect. 7.2.3). For instance, a sound wave in the air propagates with the sound speed 343 m/s. A wave in one dimension (for example a vibrating string) can be described by the one dimensional wave equation:
$$ \frac{\partial^2 y}{\partial t^2} = v^2\frac{\partial^2 y}{\partial x^2}. \qquad (6.11) $$
Many textbooks report the solution of this equation (y(x, t) = f(x ± vt)) without explaining how it is obtained. The reason is that the wave equation is not an ODE but a partial differential equation, and most textbooks do not treat PDEs. We will not treat PDEs either, but with a method based on the Fourier transform it is possible to convert the wave equation into an ODE and solve it.
The solution of the wave equation is a function y(x, t). We shall assume that
the function y(x, t) at the time t = 0 is given by the function f (x), namely:
y(x, 0) = f (x).
This can be considered the initial condition of the problem. We now define Y(α, t) as the Fourier transform of y(x, t) with respect to x, namely
$$ Y(\alpha, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y(x, t)\, e^{-i\alpha x}\, dx. \qquad (6.12) $$
Consequently, Y(α, 0) is the Fourier transform of f(x), namely:
$$ Y(\alpha, 0) = F(\alpha) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i\alpha x}\, dx. \qquad (6.13) $$
We now apply the Fourier transform (with respect to x) to both members of the wave equation, obtaining:
$$ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial t^2}\, e^{-i\alpha x}\, dx = v^2 \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial x^2}\, e^{-i\alpha x}\, dx = v^2 (i\alpha)^2\, Y(\alpha, t) = -v^2\alpha^2\, Y(\alpha, t), \qquad (6.14) $$
where we have made use of the relation F{f⁽ⁿ⁾(t)} = (iω)ⁿF{f(t)} that we encountered in Section 4.2. In the left hand side of Eq. 6.14, the derivative with respect to time can be taken out of the integral, therefore we obtain:
$$ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{\partial^2 y}{\partial t^2}\, e^{-i\alpha x}\, dx = \frac{\partial^2}{\partial t^2} Y(\alpha, t). $$
Therefore, Eq. 6.14 transforms to:
$$ \frac{\partial^2}{\partial t^2} Y(\alpha, t) = -v^2\alpha^2\, Y(\alpha, t). \qquad (6.15) $$
Since no derivatives with respect to x enter the above equation, this is an ODE and not a PDE anymore. This is already a major achievement. It is easy to solve Eq. 6.15 (it is the equation of an undamped free vibration). The characteristic equation is λ² + v²α² = 0, with roots λ₁,₂ = ±ivα, therefore the solution is:
Y (α, t) = K(α)e±ivαt .
For t = 0 we obtain Y (α, 0) = K(α), therefore K(α) is the Fourier transform of
f (x), namely:
Y (α, t) = F (α)e±ivαt .
The solution of the wave equation y(x, t) is the inverse Fourier transform of the function Y(α, t), namely:
$$ y(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\alpha)\, e^{\pm i v\alpha t}\, e^{i\alpha x}\, d\alpha = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\alpha)\, e^{i\alpha(x \pm vt)}\, d\alpha. \qquad (6.16) $$
Since f(x) is the inverse Fourier transform of F(α), we immediately obtain from this equation the solution we seek, namely:
$$ y(x, t) = f(x \pm vt), \qquad (6.17) $$
corresponding to two waves propagating with velocity v in the +x and −x directions, respectively.
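The Fourier-transform solution of the wave equation can be mimicked with a discrete FFT on a periodic domain. The sketch below (assuming numpy; the Gaussian initial profile and the domain parameters are arbitrary choices, not part of the original text) propagates a right-moving wave and checks that y(x, t) = f(x − vt).

```python
import numpy as np

# Spectral solution of the 1D wave equation for a right-moving wave, y(x,t) = f(x - v t).
# Periodic domain; numpy's FFT plays the role of the Fourier transform above.
N, L, v, t = 512, 20.0, 1.5, 2.0
x = np.linspace(0, L, N, endpoint=False)
f = np.exp(-(x - 5.0)**2)                        # initial profile f(x)

alpha = 2*np.pi*np.fft.fftfreq(N, d=L/N)         # wave numbers
F = np.fft.fft(f)                                # F(alpha) = transform of f
y_t = np.fft.ifft(F * np.exp(-1j*v*alpha*t)).real   # Y(alpha,t) = F(alpha) e^{-i v alpha t}

print(np.allclose(y_t, np.exp(-(x - 5.0 - v*t)**2), atol=1e-6))   # expected: True
```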
6.3.2 Heat flow
The one dimensional heat flow is described by the PDE
$$ \frac{\partial\psi}{\partial t} = a^2\frac{\partial^2\psi}{\partial x^2}, \qquad (6.18) $$
where the solution ψ(x, t) gives the temperature at each coordinate x as a function of time. We proceed as for the wave equation and define Ψ(α, t) as the Fourier transform of ψ(x, t) with respect to x, namely:
$$ \Psi(\alpha, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(x, t)\, e^{-i\alpha x}\, dx. $$
If we apply the Fourier transform to both sides of Eq. 6.18, this yields an ODE for the function Ψ(α, t) in the time variable t, given by:
$$ \frac{\partial\Psi(\alpha, t)}{\partial t} = -a^2\alpha^2\,\Psi(\alpha, t), $$
that can be easily solved yielding:
$$ \ln\Psi(\alpha, t) = -a^2\alpha^2 t + K(\alpha), $$
or:
$$ \Psi(\alpha, t) = C(\alpha)\, e^{-a^2\alpha^2 t}, $$
where C(α) = Ψ(α, 0) is dictated by the initial conditions of the problem. By inverting this expression, we obtain the solution:
$$ \psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} C(\alpha)\, e^{i\alpha x - a^2\alpha^2 t}\, d\alpha. \qquad (6.19) $$
Chapter 7
Vector and tensor analysis
We discuss in this chapter the calculus (i.e. differentiation and integration) of vectors
and we introduce the concept of tensors. Particular emphasis will be given to the most widely used application of vectors, namely the vectors describing the position of a body in 3-dimensional space, because this is the perfect framework to treat
dynamics. We start with a review of space vectors and their properties.
7.1 Review of vector algebra and vector spaces

7.1.1 Vector algebra
We have already encountered vectors as matrices composed of a single column (see Section 5.1.1). However, when we think of a vector we mentally associate it with an arrow in space. Indeed, a vector is a geometrical object that has a magnitude (or
length) and a direction. The association between vectors as geometrical objects and
columns of real numbers arises when we consider a vector a in a 3-dimensional space
and decompose it along 3 vectors e1 , e2 and e3 , not lying in a plane. In this case it
is always possible to find 3 real numbers a1 , a2 and a3 such that:
a = a1 e1 + a2 e2 + a3 e3 .
(7.1)
At this point we can identify the vector a with the 3 scalars a1 , a2 and a3 , namely:


$$ a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}. $$
The three vectors e1 , e2 and e3 are said to form a basis for the 3-dimensional space
and the numbers a1 , a2 and a3 are the components of the vector a with respect to
this basis. If the vectors e1 , e2 and e3 lie in the same plane, then they are linearly
dependent (one can be obtained as a linear combination of the other two), therefore they form a basis if and only if they are linearly independent, namely if and only if the only solution of the equation c₁e₁ + c₂e₂ + c₃e₃ = 0 is c₁ = c₂ = c₃ = 0. As we have seen, this corresponds to the condition det E ≠ 0, where E is the matrix whose columns are the vectors e₁, e₂, e₃.

Figure 7.1: A Cartesian basis set and the components of the vector a.
If we wish to label points in space using a Cartesian coordinate system (x, y,
z), we can introduce the unit vectors i, j and k. At this point the vector a can be
written as sum of three vectors, each parallel to a different coordinate axis:
a = ax i + ay j + az k,
(7.2)
(see Fig. 7.1). Therefore, each vector a can be associated with the three numbers a_x, a_y, a_z, namely:
$$ a = \begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix}. $$
Clearly, the vectors i, j and k can be represented as:
$$ i = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad j = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. $$
The magnitude of a vector is a measure of its length. It is indicated with |a| or a and it is given by:
$$ a = |a| = \sqrt{a_x^2 + a_y^2 + a_z^2}. \qquad (7.3) $$
A vector whose magnitude equals unity is called a unit vector. The unit vector in the direction of a is indicated as â and is evaluated as:
$$ \hat{a} = \frac{a}{a}. \qquad (7.4) $$
The relation a = aâ is useful because, given the vector a, it separates clearly its
magnitude from its direction.
Sums and differences between vectors are simple applications of sums and differences between matrices (each component must be added or subtracted separately).
As for the product of two vectors, we have already encountered the scalar product
(Eq. 5.2), namely:
a · b = ax bx + ay by + az bz .
(7.5)
In fact, we have seen that a·b = aT b∗ , but vectors in the three-dimensional space are
real vectors and therefore b∗ = b. It can be shown that the scalar product between
two vectors is also given by:
a · b = ab cos θ,
(7.6)
where θ is the angle between the two vectors. From it, we recover immediately that:
$$ a = \sqrt{a\cdot a}. $$
The simplest application of the scalar product in physics is the work, which is given
by W = F · r, namely we have to take into account the component of the force F
along a displacement r.
There is another kind of multiplication between vectors that yields a vector
(instead of yielding a scalar as the scalar product). It is called vector product (or
cross product). The vector product is indicated with a × b and it is a vector whose
magnitude is:
|a × b| = ab sin θ,
where θ is again the angle between the two vectors. The vector a×b is perpendicular
to the plane identified by a and b and the direction is that in which a right-handed
screw moves forward rotating between a and b (see Fig. 7.2). Therefore, a × b is
opposite to b × a, namely the vector product is anticommutative. It is also clear
that a × a = 0.
In a Cartesian coordinate system, the vector product is expressed as:
Figure 7.2: Vector product between two vectors a and b.

$$ a \times b = (a_y b_z - a_z b_y)\,i + (a_z b_x - a_x b_z)\,j + (a_x b_y - a_y b_x)\,k = \begin{pmatrix} a_y b_z - a_z b_y \\ a_z b_x - a_x b_z \\ a_x b_y - a_y b_x \end{pmatrix}. \qquad (7.7) $$

This can also be written as:
$$ a \times b = \begin{vmatrix} i & j & k \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}. \qquad (7.8) $$
We can extend our discussion to the product of three vectors and distinguish a scalar triple product:
$$ [a, b, c] \equiv a \cdot (b \times c) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}, $$
and a vector triple product:
$$ a \times (b \times c) = (a \cdot c)\,b - (a \cdot b)\,c $$
$$ (a \times b) \times c = (a \cdot c)\,b - (b \cdot c)\,a. $$
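These products and identities are convenient to check numerically; the short sketch below (assuming numpy and using arbitrary vectors, not part of the original text) evaluates them and verifies the vector triple product identity.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 2.0])
c = np.array([4.0, -2.0, 1.0])

print(np.dot(a, b))                 # scalar product a . b
print(np.cross(a, b))               # vector product a x b
print(np.dot(a, np.cross(b, c)))    # scalar triple product [a, b, c]

# Check the vector triple product identity a x (b x c) = (a.c) b - (a.b) c
lhs = np.cross(a, np.cross(b, c))
rhs = np.dot(a, c)*b - np.dot(a, b)*c
print(np.allclose(lhs, rhs))        # expected: True
```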
7.1.2 Vector spaces
A vector space is nothing else than a collection of vectors. A vector space V over a
field F (for instance the field of the real numbers) is said to be linear if, for any a,
b, c ∈ V and for any λ, µ ∈ F the following five conditions are satisfied:
• the addition in V is commutative and associative, namely:
a+b= b+a
a + (b + c) = (a + b) + c
(7.9)
(7.10)
• There exists a null vector 0 such that
a + 0 = a ∀a ∈ V
(7.11)
• All vectors have a corresponding negative vector −a such that
a + (−a) = 0.
(7.12)
• Multiplication with scalars (elements of the field F ) is associative and distributive with respect to vector and field addition, namely:
λ(µa) = (λµ)a
(7.13)
λ(a + b) = λa + λb
(7.14)
(λ + µ)a = λa + µa
(7.15)
• Multiplication by unity always leaves the vector a unchanged, namely
1(a) = a.
(7.16)
It can be easily shown that the vectors in the three-dimensional Euclidean space
form a linear vector space over the field of the real numbers. This vector space is
often indicated with R3 .
7.1.3 Linear operators
An operator in a vector space is any kind of vector manipulation A that transforms
a vector x into another vector y, namely:
y = Ax.
If the operator A has the property that, for any scalars λ and µ,
A(λa + µb) = λAa + µAb,
(7.17)
then the operator A is said to be linear. If x is a vector and A and B are two linear
operators, then it follows that:
(A + B)x = Ax + Bx
(7.18)
(λA)x = λ(Ax)
(7.19)
(AB)x = A(Bx).
(7.20)
In general, the multiplication between linear operators is not commutative, namely ABx ≠ BAx. It is always possible to define a null operator O and an identity
operator I such that
Ox = 0,
Ix = x.
Finally, it is often (but not always) possible to find the inverse operator A−1 of A,
namely the operator such that:
AA−1 = A−1 A = I.
If an operator A does not possess an inverse, it is called singular.
We have already seen (Eq. 7.1) that, given a basis e1 , e2 , e3 (e1 , e2 , . . . , eN in
a generic N-dimensional vector space), it is always possible to express any vector a
as:
    a = \sum_{i=1}^{N} a_i e_i.
A linear operator A will transform the vector a into the vector b = Aa. It will
transform, too, the basis vectors ej and it will always be possible to express the
transformed vector Aej as linear combination of the basis vectors e1 , e2 , . . . , eN ,
namely it will always be possible to find N numbers A1j , A2j , . . . , AN j such that:
    A e_j = \sum_{i=1}^{N} A_{ij} e_i.                                        (7.21)
Since a is a linear combination of the basis vectors ej , the vector Aa can be expressed
as:
    A a = A \left( \sum_{j=1}^{N} a_j e_j \right) = \sum_{j=1}^{N} a_j A e_j = \sum_{j=1}^{N} a_j \sum_{i=1}^{N} A_{ij} e_i = \sum_{i=1}^{N} \sum_{j=1}^{N} A_{ij} a_j e_i.
Here, we have made use of the linearity of the operator A (Eq. 7.17) and of Eq.
7.21. Since also the vector b = Aa can be expressed as linear combination of the
basis vectors ei , from the comparison with the previous equation it follows that:
    b = \sum_{i=1}^{N} b_i e_i  \;\Rightarrow\;  b_i = \sum_{j=1}^{N} A_{ij} a_j.                       (7.22)
This relation suggests an association between linear operators and matrices: once we have fixed a basis e_1, e_2, . . . , e_N of our vector space, the operator equation b = Aa is equivalent to the matrix equation b = Aa, where A is the matrix whose elements A_{ij} are defined by Eq. 7.21. It is not difficult to see that matrices are indeed linear operators.
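A small numerical illustration of Eq. 7.22 (a sketch in Python with NumPy; the operator used here, a rotation about the z-axis, is an arbitrary choice): once the matrix A_{ij} of the operator in the chosen basis is known, b = Aa reduces to an ordinary matrix-vector product.

    import numpy as np

    theta = 0.3  # an arbitrary rotation angle about the z-axis

    # Matrix of the operator in the basis e1, e2, e3 (its columns are A e_j, Eq. 7.21).
    A = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

    a = np.array([1.0, 2.0, 3.0])        # components a_j of the vector a

    # Eq. 7.22: b_i = sum_j A_ij a_j, i.e. the usual matrix-vector product.
    b = A @ a
    b_explicit = np.array([sum(A[i, j] * a[j] for j in range(3)) for i in range(3)])

    assert np.allclose(b, b_explicit)
    print(b)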
7.2 Vector calculus

7.2.1 Differentiation of vectors
Let us suppose that the vector a is a function of the scalar variable t (the best
example is a vector representing the position of a moving body, depending therefore
on the time). Then, to each value of t we can associate a different vector a (t). In
Cartesian coordinates we have:
a (t) = ax (t)i + ay (t)j + az (t)k.
The derivative of a vector function a (t) can be defined analogously to the derivative
of a normal function, namely:
    \frac{da}{dt} ≡ ȧ = \lim_{\Delta t \to 0} \frac{a(t + \Delta t) − a(t)}{\Delta t}.                  (7.23)
In Cartesian coordinates, this can be written as:
    ȧ = ȧ_x i + ȧ_y j + ȧ_z k.

In fact, the basis vectors i, j and k are constant in magnitude and direction, so they need not be differentiated.
If the vector r(t) = x(t)i + y(t)j + z(t)k represents the position vector of a
particle with respect to the origin in a Cartesian coordinate system, then ṙ and r̈ are
the velocity and acceleration of the particle, respectively, namely:
    v = ṙ = \frac{dx}{dt} i + \frac{dy}{dt} j + \frac{dz}{dt} k                                        (7.24)

    a = r̈ = \frac{d^2 x}{dt^2} i + \frac{d^2 y}{dt^2} j + \frac{d^2 z}{dt^2} k                          (7.25)
From Fig. 7.3 it is clear that, for ∆t → 0 the vector v(t) = ṙ(t) tends to be tangent
to the curve C that describes the motion of the particle in space.
Given a scalar function ψ(t) and two vector functions a (t) and b (t), it can be
shown that:
Figure 7.3: A small change in a vector r(t) resulting from a small change in t.
    \frac{d}{dt}(ψ a) = ψ ȧ + ψ̇ a                                                                      (7.26)

    \frac{d}{dt}(a · b) = a · ḃ + ȧ · b                                                                 (7.27)

    \frac{d}{dt}(a × b) = a × ḃ + ȧ × b.                                                                (7.28)
Example 7.2.1 Demonstrate Eq. 7.28.
We will use Eq. 7.7 and will calculate the derivative of the three components (a×b)i ,
(a × b)j and (a × b)k separately, obtaining:
    \frac{d}{dt}(a × b)_i = \frac{d}{dt}(a_y b_z − a_z b_y) = ȧ_y b_z − ȧ_z b_y + a_y ḃ_z − a_z ḃ_y = (ȧ × b)_i + (a × ḃ)_i

    \frac{d}{dt}(a × b)_j = \frac{d}{dt}(a_z b_x − a_x b_z) = ȧ_z b_x − ȧ_x b_z + a_z ḃ_x − a_x ḃ_z = (ȧ × b)_j + (a × ḃ)_j

    \frac{d}{dt}(a × b)_k = \frac{d}{dt}(a_x b_y − a_y b_x) = ȧ_x b_y − ȧ_y b_x + a_x ḃ_y − a_y ḃ_x = (ȧ × b)_k + (a × ḃ)_k
By summing up these three components, we obtain Eq. 7.28.
Example 7.2.2 Given a vector a with constant magnitude, demonstrate that ȧ is
perpendicular to a.
By using Eq. 7.27 we obtain:
    \frac{d}{dt}(a · a) = a · ȧ + ȧ · a = 2 a · ȧ.

The left hand side of this equation is:

    \frac{d}{dt}(a · a) = \frac{d a^2}{dt} = 0,
because the magnitude of a is constant. Therefore, a · ȧ = 0 and this implies that
these two vectors are perpendicular.
If a vector function a depends not only on a parameter t but on several parameters, it is possible to define the partial derivatives of a. A typical example is a vector quantity a(r) = a(x, y, z) that depends on the position vector r = (x, y, z) in
a Cartesian coordinate system. In this case, the partial derivatives of a are defined
as:
    \frac{∂a}{∂x} = \lim_{\Delta x \to 0} \frac{a(x + \Delta x, y, z) − a(x, y, z)}{\Delta x}           (7.29)

    \frac{∂a}{∂y} = \lim_{\Delta y \to 0} \frac{a(x, y + \Delta y, z) − a(x, y, z)}{\Delta y}           (7.30)

    \frac{∂a}{∂z} = \lim_{\Delta z \to 0} \frac{a(x, y, z + \Delta z) − a(x, y, z)}{\Delta z}           (7.31)
Given a curve C in space that describes the motion r(t) of a particle, we can
define s as the arc length along the curve, measured from some fixed point. It is
possible to find s as a function of t in this way: the infinitesimal displacement dr of
the particle along the curve C is given by:
dr = dxi + dyj + dzk.
For an infinitesimal displacement, the curve can be approximated by a straight line
and therefore its length ds is given by:
    ds = |dr| = \sqrt{(dx)^2 + (dy)^2 + (dz)^2} = \sqrt{dr · dr}.
If we now divide both sides of this equation by dt, we obtain:
    \frac{ds}{dt} = \sqrt{\frac{dr}{dt} · \frac{dr}{dt}} = \sqrt{ṙ · ṙ},                                (7.32)
therefore:
    s(t) = \int \sqrt{ṙ · ṙ}\, dt.                                                                      (7.33)

If we wish to know the length of the curve C between the points r(t_1) and r(t_2), this is given by:

    s = \int_{t_1}^{t_2} \sqrt{ṙ · ṙ}\, dt.                                                             (7.34)
Indeed, it is not necessary that the parameter describing the curve C is the time t.
The curve C can be parameterized through a parameter u such that, in a Cartesian
coordinate system, r can be described in parametric form as x(u)i + y(u)j + z(u)k.
In this case then we obtain:
    s(u) = \int \sqrt{\frac{dr(u)}{du} · \frac{dr(u)}{du}}\, du.                                        (7.35)
Example 7.2.3 Given a curve C in the xy-plane described by the equation y = y(x),
demonstrate that the arc length of the curve between x = a and x = b is given by:
    s = \int_a^b \sqrt{1 + [y'(x)]^2}\, dx.
We can assume that the curve C is described by the parameter u = x. In this way,
the parametric equation of the curve C is:
C : r(u) = ui + y(u)j.
The derivative of r with respect to u is thus given by:
    \frac{dr}{du} = i + \frac{dy(u)}{du} j = i + y'(x) j.
The scalar product dr/du · dr/du is thus given by:

    \frac{dr}{du} · \frac{dr}{du} = 1 + [y'(x)]^2,
(see Eq. 7.5). We can now apply Eq. 7.35 obtaining:
    s = \int_a^b \sqrt{1 + [y'(x)]^2}\, dx.
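A short numerical check of this formula (a sketch in Python with NumPy; the parabola y = x^2 between x = 0 and x = 1 is an arbitrary test curve) compares the numerical integral of \sqrt{1 + [y'(x)]^2} with the known closed form for that curve:

    import numpy as np

    # Test curve: y = x^2 between x = 0 and x = 1 (arbitrary choice).
    x = np.linspace(0.0, 1.0, 100001)
    dydx = 2.0 * x                      # y'(x)

    # s = integral of sqrt(1 + y'(x)^2) dx, evaluated with the trapezoidal rule.
    s_numeric = np.trapz(np.sqrt(1.0 + dydx**2), x)

    # Closed form for this parabola: sqrt(5)/2 + arcsinh(2)/4.
    s_exact = np.sqrt(5.0) / 2.0 + np.arcsinh(2.0) / 4.0

    print(s_numeric, s_exact)           # the two values agree to several digits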
We have seen that the vector r(t + ∆t) − r(t) tends to be tangent to the curve
C for ∆t → 0 (see Fig. 7.3). This continues to be true no matter how the curve C is parameterized. For instance, we can use the arc length s as a parameter and still
Figure 7.4: The unit tangent t̂.
r(s + ∆s) − r(s) will tend to be tangent to C for ∆s → 0 (see Fig. 7.4). Moreover,
the magnitude of the vector ∆r = r(s + ∆s) − r(s) will tend to ∆s (see
again Fig. 7.4 and see Eq. 7.32). This means that the vector
    \frac{dr}{ds} = \lim_{\Delta s \to 0} \frac{r(s + \Delta s) − r(s)}{\Delta s},
is a unit vector, tangent to the curve C. We will call it unit tangent and will denote
it with t̂. Taking into account Eq. 7.32 (namely that |dr| = ds) we can write:
    t̂ = \frac{dr}{ds} = \frac{dr/dt}{ds/dt} = \frac{dr/dt}{|dr/dt|}.                                   (7.36)
Since t̂ is a vector with constant (unit) magnitude, it follows from Example 7.2.2
that it must be perpendicular to dt̂/ds. We can therefore write:
    \frac{d\hat{t}}{ds} = κ n̂,                                                                          (7.37)
where n̂ is called principal normal and it is a unit vector perpendicular to t̂ (and
therefore normal to the curve C). The quantity κ is called the curvature of the curve
C and its inverse ρ = 1/κ is called radius of curvature.
The unit vector perpendicular to both t̂ and n̂ (namely the vector b̂ = t̂ × n̂)
is called binormal to C. Its derivative with respect to s is perpendicular to b̂ (see
again Example 7.2.2). It can be shown that it is also perpendicular to t̂. It must be
therefore parallel to n̂. One obtains:
    \frac{d\hat{b}}{ds} = −τ n̂,
where τ is called torsion of the curve. At any given point on C, the three vectors t̂,
n̂ and b̂ form a right-handed rectangular coordinate system.
Example 7.2.4 Given a particle moving along a trajectory r(t), calculate the components of the acceleration a(t) with respect to the basis t̂, n̂, b̂.
The velocity of the particle is given by:
    v(t) = \frac{dr}{dt} = \frac{dr}{ds} \frac{ds}{dt} = \frac{ds}{dt} t̂ = v t̂,
where v = ds/dt is the speed of the particle. To obtain the acceleration we have to differentiate this expression once more, obtaining:

    a(t) = \frac{dv(t)}{dt} = \frac{dv}{dt} t̂ + v \frac{d\hat{t}}{dt}.

With the help of Eq. 7.37 we can express dt̂/dt as:

    \frac{d\hat{t}}{dt} = \frac{d\hat{t}}{ds} \frac{ds}{dt} = v κ n̂ = \frac{v}{ρ} n̂.
Therefore, we have:

    a(t) = \frac{dv}{dt} t̂ + \frac{v^2}{ρ} n̂,

namely the acceleration has a component along the tangent of the particle's trajectory and a component (the centripetal acceleration) in the direction of the principal normal.
Example 7.2.5 The logarithmic spiral can be described by means of the parametric
equations:

    x = a e^{bθ} \cos θ
    y = a e^{bθ} \sin θ.
Find the arc length s as a function of θ, the vectors t̂ and n̂ and the curvature κ at
each point of this curve.
The vector r is parameterized through the parameter θ, namely we have:
    r = r(θ) = \begin{pmatrix} a e^{bθ} \cos θ \\ a e^{bθ} \sin θ \end{pmatrix}.
Since the curve lies in the xy-plane, we do not need the z-coordinate (that can be
taken equal to zero). The arc length s is given by:
    s = \int \sqrt{\frac{dr}{dθ} · \frac{dr}{dθ}}\, dθ
      = \int \sqrt{ \begin{pmatrix} a b e^{bθ}\cos θ − a e^{bθ}\sin θ \\ a b e^{bθ}\sin θ + a e^{bθ}\cos θ \end{pmatrix} · \begin{pmatrix} a b e^{bθ}\cos θ − a e^{bθ}\sin θ \\ a b e^{bθ}\sin θ + a e^{bθ}\cos θ \end{pmatrix} }\, dθ
      = a \int e^{bθ} \sqrt{b^2\cos^2 θ + \sin^2 θ − 2b\sin θ\cos θ + b^2\sin^2 θ + \cos^2 θ + 2b\sin θ\cos θ}\, dθ
      = a \int e^{bθ} \sqrt{b^2 + 1}\, dθ
      = \frac{a\sqrt{b^2 + 1}}{b}\, e^{bθ}.
The unit tangent vector t̂ is given by:
    t̂ = \frac{dr}{ds} = \frac{dr/dθ}{ds/dθ} = \frac{a e^{bθ} \begin{pmatrix} b\cos θ − \sin θ \\ b\sin θ + \cos θ \end{pmatrix}}{a \sqrt{b^2 + 1}\, e^{bθ}} = \frac{1}{\sqrt{b^2 + 1}} \begin{pmatrix} b\cos θ − \sin θ \\ b\sin θ + \cos θ \end{pmatrix}.
The vector κ n̂ = dt̂/ds is given by:

    \frac{d\hat{t}}{ds} = \frac{d\hat{t}/dθ}{ds/dθ} = \frac{\frac{1}{\sqrt{b^2 + 1}} \begin{pmatrix} −b\sin θ − \cos θ \\ b\cos θ − \sin θ \end{pmatrix}}{a \sqrt{b^2 + 1}\, e^{bθ}} = \frac{e^{−bθ}}{a(b^2 + 1)} \begin{pmatrix} −b\sin θ − \cos θ \\ b\cos θ − \sin θ \end{pmatrix}.
The magnitude of this vector is:

    \left| \frac{d\hat{t}}{ds} \right| = \frac{e^{−bθ}}{a(b^2 + 1)} \sqrt{b^2\sin^2 θ + \cos^2 θ + 2b\sin θ\cos θ + b^2\cos^2 θ + \sin^2 θ − 2b\sin θ\cos θ} = \frac{e^{−bθ}}{a\sqrt{b^2 + 1}} = κ.
Consequently, the vector n̂ is given by:

    n̂ = \frac{1}{\sqrt{b^2 + 1}} \begin{pmatrix} −b\sin θ − \cos θ \\ b\cos θ − \sin θ \end{pmatrix}.
It is worth noticing that κ is proportional to the inverse of the arc length s, namely we have the relation:

    κ = \frac{1}{b s}.

The relation between the curvature and the arc length of a curve takes the name of Cesàro equation. In the case of the logarithmic spiral, the Cesàro equation takes the very simple form κ = C/s, where C is a constant (C = 1/b).
We have not calculated the binormal b̂ and the torsion τ of the curve. This is
not necessary since the given curve lies in the xy-plane. For this reason, b̂ (which
is perpendicular to both t̂ and n̂) must necessarily coincide with k. Since k is a
constant vector, its derivative with respect to s is zero. From the relation:
    \frac{d\hat{b}}{ds} = −τ n̂,
it is clear that τ must be zero.
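The results of this example can be checked numerically. The following minimal sketch (Python with NumPy; the values a = 1 and b = 0.2 are arbitrary) approximates the unit tangent and the curvature by finite differences of r(θ) and compares the curvature with the closed form e^{−bθ}/(a\sqrt{b^2+1}) found above:

    import numpy as np

    a, b = 1.0, 0.2                       # arbitrary spiral parameters
    theta = np.linspace(0.0, 2.0 * np.pi, 200001)

    r = np.stack([a * np.exp(b * theta) * np.cos(theta),
                  a * np.exp(b * theta) * np.sin(theta)], axis=1)

    dr_dtheta = np.gradient(r, theta, axis=0)          # dr/dtheta by finite differences
    ds_dtheta = np.linalg.norm(dr_dtheta, axis=1)      # ds/dtheta = |dr/dtheta|
    t_hat = dr_dtheta / ds_dtheta[:, None]             # unit tangent

    dt_dtheta = np.gradient(t_hat, theta, axis=0)
    kappa_numeric = np.linalg.norm(dt_dtheta, axis=1) / ds_dtheta   # |dt_hat/ds|

    # Closed form found above: kappa = exp(-b*theta) / (a*sqrt(b^2 + 1)).
    kappa_exact = np.exp(-b * theta) / (a * np.sqrt(b**2 + 1.0))

    i = len(theta) // 2                                # compare at an interior point
    print(kappa_numeric[i], kappa_exact[i])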
7.2.2 Scalar and vector fields
We have seen that, once we have defined a basis of a linear vector space, we can express any vector of this space as a linear combination of the basis vectors. For
instance, in the ordinary R3 space with a Cartesian coordinate system, the vector
position r can be decomposed into three components x, y, z, parallel to the basis
vectors i, j and k, respectively.
Given a linear vector space R, a scalar field φ(r) = φ(x, y, z) associates a scalar
to each point r = (x, y, z) of R and a vector field a(r) = a(x, y, z) associates a vector
to each point. Typical examples of scalar fields are the temperature in a room (we
can associate each point of the room with a certain value of the temperature), or the
pressure at each point in a fluid. A typical vector field is instead the velocity field in
a fluid (each element of the fluid has a different velocity, characterized by a different
magnitude and direction).
7.2.3 Vector operators
Also for scalar and vector fields it is possible to define differential operators. The
most important are the gradient of a scalar field and the divergence and the curl of a
vector field. Central to all these differential operators is the vector operator ∇ (also
called nabla), defined in this way:
    ∇ = i \frac{∂}{∂x} + j \frac{∂}{∂y} + k \frac{∂}{∂z}.                                               (7.38)
Gradient of a scalar field
The gradient of a scalar field φ(x, y, z) (indicated with grad φ or ∇φ) is a vector
whose components along x, y and z are the partial derivatives of φ(x, y, z) with
respect to x, y, z, respectively, namely:
    grad φ = ∇φ = i \frac{∂φ}{∂x} + j \frac{∂φ}{∂y} + k \frac{∂φ}{∂z}.                                  (7.39)
Given a constant K and two scalar fields φ and ψ it is easy to show that:
    ∇(Kφ) = K ∇φ                                                                                        (7.40)
    ∇(φ + ψ) = ∇φ + ∇ψ                                                                                  (7.41)
    ∇(φψ) = (∇φ) ψ + φ (∇ψ).                                                                            (7.42)
Example 7.2.6 Find the gradient of φ(r) = r (the magnitude of the position vector
r).
It is of course r = \sqrt{x^2 + y^2 + z^2}. We have:

    ∇r = i \frac{∂r}{∂x} + j \frac{∂r}{∂y} + k \frac{∂r}{∂z} = \frac{1}{2\sqrt{x^2 + y^2 + z^2}} (i\,2x + j\,2y + k\,2z) = \frac{r}{r}.
The vector r/r is the unit vector in the direction of r and it is usually indicated with ê_r. We could have obtained the same result by formally differentiating the scalar field r with respect to r, namely:

    ∇r = \frac{d}{dr} r = \frac{d}{dr}\sqrt{r · r} = \frac{1}{2\sqrt{r · r}} \left( \frac{dr}{dr} · r + r · \frac{dr}{dr} \right) = \frac{1}{2|r|}\, 2r = \frac{r}{r},

where we have made use of Eq. 7.42.
The gradient of a scalar field has a simple physical interpretation: it is the rate
of change of φ in some particular direction. In fact, the infinitesimal change dφ in
going from r to r + dr (= (x + dx)i + (y + dy)j + (z + dz)k) is given by:
    dφ = \frac{∂φ}{∂x} dx + \frac{∂φ}{∂y} dy + \frac{∂φ}{∂z} dz = \left( i \frac{∂φ}{∂x} + j \frac{∂φ}{∂y} + k \frac{∂φ}{∂z} \right) · (i\,dx + j\,dy + k\,dz) = ∇φ · dr.
If we divide now both sides of this equation by the infinitesimal arc length ds we
obtain:
    \frac{dφ}{ds} = ∇φ · \frac{dr}{ds} = ∇φ · t̂,                                                        (7.43)
where t̂ is the unit tangent to the curve parameterized by r(s). In general, the rate
of change of φ with respect to the distance s in a particular direction a is given by:
    \frac{dφ}{ds} = ∇φ · â,                                                                             (7.44)
where â is the unit vector in the direction of a. This is also called the directional
derivative. Recalling that a · b = ab cos θ, it is clear that:
    \frac{dφ}{ds} = |∇φ| \cos θ,
where θ is the angle between â and ∇φ. This equation tells us also that ∇φ lies
in the direction of the fastest increase in φ and |∇φ| is the largest possible value of
dφ/ds.
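Before turning to the worked examples, here is a small symbolic sketch of Eq. 7.44 (Python with SymPy; the scalar field and the direction a are arbitrary choices), computing ∇φ and the directional derivative ∇φ · â:

    import sympy as sp

    x, y, z = sp.symbols('x y z', positive=True)
    phi = x**2 * y + sp.sin(z)            # an arbitrary scalar field

    grad_phi = sp.Matrix([sp.diff(phi, v) for v in (x, y, z)])

    a = sp.Matrix([1, 2, 2])              # an arbitrary direction
    a_hat = a / a.norm()                  # unit vector in the direction of a

    # Directional derivative, Eq. 7.44: dphi/ds = grad(phi) . a_hat.
    dphi_ds = grad_phi.dot(a_hat)
    print(sp.simplify(dphi_ds))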
Example 7.2.7 Find the directional derivative of the scalar field φ = 2x\sqrt{y} + yz + z^2 along the direction of the vector a = i + j + k.
The gradient of φ is given by:

    ∇φ = i\, 2\sqrt{y} + j \left( z + \frac{x}{\sqrt{y}} \right) + k (y + 2z) = \begin{pmatrix} 2\sqrt{y} \\ z + \frac{x}{\sqrt{y}} \\ y + 2z \end{pmatrix}.

The unit vector in the direction of a is given by:

    â = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.

The directional derivative of the scalar field φ along a is thus given by:

    ∇φ · â = \frac{1}{\sqrt{3}} \begin{pmatrix} 2\sqrt{y} \\ z + \frac{x}{\sqrt{y}} \\ y + 2z \end{pmatrix} · \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{3}} \left( 2\sqrt{y} + 3z + y + \frac{x}{\sqrt{y}} \right).
Example 7.2.8 Find the directional derivative of the scalar field:

    φ(x, y, z) = \cos^2(xy) + \sin^2 z,

at the point P = (π/4, 1/2, π/8) in the direction a = 2i + 2j + k.
The gradient of φ at the point P is:

    ∇φ_P = \begin{pmatrix} −2y \sin(xy)\cos(xy) \\ −2x \sin(xy)\cos(xy) \\ 2 \sin z \cos z \end{pmatrix}_P = \begin{pmatrix} −y \sin(2xy) \\ −x \sin(2xy) \\ \sin(2z) \end{pmatrix}_P = \begin{pmatrix} −\frac{\sqrt{2}}{4} \\ −\frac{π\sqrt{2}}{8} \\ \frac{\sqrt{2}}{2} \end{pmatrix}.

The vector a has magnitude a = \sqrt{4 + 4 + 1} = 3, therefore the requested directional derivative is:

    \frac{dφ}{ds} = \frac{1}{3} \begin{pmatrix} −\frac{\sqrt{2}}{4} \\ −\frac{π\sqrt{2}}{8} \\ \frac{\sqrt{2}}{2} \end{pmatrix} · \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} = \frac{1}{3} \left( −\frac{\sqrt{2}}{2} − \frac{π\sqrt{2}}{4} + \frac{\sqrt{2}}{2} \right) = −\frac{π\sqrt{2}}{12}.
There is also a simple geometrical interpretation of the gradient of a scalar
field. The points in the space in which the scalar field is equal to a constant value c
designate a surface in the space. In fact, they must obey the equation φ(x, y, z) = c.
Since φ is constant, dφ/ds must be zero if we move along this surface. From Eq.
7.43 we can see that, in this case, ∇φ · t̂ = 0. In other words, ∇φ is a vector normal
to the surface φ(x, y, z) = c at every point.
Divergence of a vector field
The divergence of a vector field a(x, y, z) = iax + jay + kaz is defined as:
    div a = ∇ · a = \frac{∂a_x}{∂x} + \frac{∂a_y}{∂y} + \frac{∂a_z}{∂z}.                                (7.45)
Clearly, ∇ · a is a scalar field.
Example 7.2.9 A central force field is a vector field whose magnitude depends only
on the distance r of the object from the origin. Calculate the divergence of such a
field.
The central force field F(r) can be expressed as:
F(r) = rf (r),
Figure 7.5: Differential rectangular parallelepiped (in the first octant).
for some specific function f (r). The divergence of F(r) is thus given by:
    ∇ · F(r) = \frac{∂}{∂x}[x f(r)] + \frac{∂}{∂y}[y f(r)] + \frac{∂}{∂z}[z f(r)]
             = 3 f(r) + x \frac{∂}{∂x} f(r) + y \frac{∂}{∂y} f(r) + z \frac{∂}{∂z} f(r)
             = 3 f(r) + x \frac{∂f}{∂r}\frac{∂r}{∂x} + y \frac{∂f}{∂r}\frac{∂r}{∂y} + z \frac{∂f}{∂r}\frac{∂r}{∂z}
             = 3 f(r) + \frac{x^2}{\sqrt{x^2+y^2+z^2}} f'(r) + \frac{y^2}{\sqrt{x^2+y^2+z^2}} f'(r) + \frac{z^2}{\sqrt{x^2+y^2+z^2}} f'(r)
             = 3 f(r) + r f'(r).
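A symbolic check of this result (a sketch in Python with SymPy, using the arbitrary radial profile f(r) = e^{−r^2}) confirms that ∇ · [r f(r)] = 3 f(r) + r f'(r):

    import sympy as sp

    x, y, z = sp.symbols('x y z', positive=True)
    r = sp.sqrt(x**2 + y**2 + z**2)

    # A concrete central field: F = r f(r) with f(r) = exp(-r**2) (arbitrary choice).
    f = sp.exp(-r**2)
    F = [x * f, y * f, z * f]

    div_F = sum(sp.diff(Fi, v) for Fi, v in zip(F, (x, y, z)))

    # Expected result from the example: 3 f(r) + r f'(r).
    rr = sp.Symbol('rr', positive=True)
    f_of_rr = sp.exp(-rr**2)
    expected = (3 * f_of_rr + rr * sp.diff(f_of_rr, rr)).subs(rr, r)

    print(sp.simplify(div_F - expected))   # prints 0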
Given two vector fields a(x, y, z) and b(x, y, z) and a scalar field φ(x, y, z), we have:

    ∇ · (a + b) = ∇ · a + ∇ · b                                                                         (7.46)
    ∇ · (φ a) = ∇φ · a + φ ∇ · a.                                                                       (7.47)
A physical interpretation of the divergence is offered by hydrodynamics.
Let us consider a fluid characterized by a density field ρ(x, y, z) and a velocity field
v(x, y, z). If we consider a small volume dx dy dz (see Fig. 7.5) at the origin, the fluid
entering into this volume per unit time in the (positive) x-direction (namely the fluid
crossing the face EF GH per unit time) is given by (rate in)EF GH = [ρvx ]x=0 dy dz.
Only the component vx of the velocity must be taken into account because the other
components vy and vz do not contribute to the flow through this face. The rate of
flow out through the face ABCD is (rate out)ABCD = [ρvx ]x=dx dy dz. To evaluate
this rate, we can make a Taylor series expansion of ρvx centered at [ρvx ]x=0 , namely:
    (\text{rate out})_{ABCD} = [ρ v_x]_{x=dx}\, dy\, dz = \left[ ρ v_x + \frac{∂}{∂x}(ρ v_x)\, dx \right]_{x=0} dy\, dz.
The net flow out of the parallelepiped in the x-direction is now given by the flow out
through the face ABCD minus the flow in through the face EF GH, namely:
    [\text{net rate of flow out}]_x = \left[ ρ v_x + \frac{∂}{∂x}(ρ v_x)\, dx \right]_{x=0} dy\, dz − [ρ v_x]_{x=0}\, dy\, dz = \frac{∂}{∂x}(ρ v_x)\, dx\, dy\, dz.
With the same reasoning we can find the net flow out per unit time along the directions y and z, as well. The total net flow out is thus given by:
    \text{net flow out} = \left[ \frac{∂}{∂x}(ρ v_x) + \frac{∂}{∂y}(ρ v_y) + \frac{∂}{∂z}(ρ v_z) \right] dx\, dy\, dz = ∇ · (ρ v)\, dx\, dy\, dz.            (7.48)
Namely, ∇ · (ρv) is the rate of variation of the density of a fluid per unit time and
per unit volume due to the fluid flow. The vector fields whose divergence vanishes
are called solenoidal.
Since the gradient of a scalar field is a vector field, it is possible to calculate the
divergence of it, obtaining once again a scalar field. The divergence of the gradient
of a scalar field φ, ∇ · ∇φ is indicated with ∇2 φ. The operator ∇2 is the scalar
differential operator:
    ∇^2 = \frac{∂^2}{∂x^2} + \frac{∂^2}{∂y^2} + \frac{∂^2}{∂z^2},                                       (7.49)

and it is called the Laplace operator. Analogously, ∇^2 φ is called the Laplacian of φ.
Curl of a vector field
The curl of a vector field a(x, y, z) = iax + jay + kaz is given by:
    curl a = ∇ × a = \begin{vmatrix} i & j & k \\ \frac{∂}{∂x} & \frac{∂}{∂y} & \frac{∂}{∂z} \\ a_x & a_y & a_z \end{vmatrix} = \left( \frac{∂a_z}{∂y} − \frac{∂a_y}{∂z} \right) i + \left( \frac{∂a_x}{∂z} − \frac{∂a_z}{∂x} \right) j + \left( \frac{∂a_y}{∂x} − \frac{∂a_x}{∂y} \right) k.        (7.50)
Clearly, the curl of a vector field is itself a vector field.
Example 7.2.10 Find the curl of the vector field:

    a = y^2 z^2 i + x^2 z^2 j + x^2 y^2 k = \begin{pmatrix} y^2 z^2 \\ x^2 z^2 \\ x^2 y^2 \end{pmatrix}.

We have to simply apply Eq. 7.50, obtaining:

    ∇ × a = \begin{vmatrix} i & j & k \\ \frac{∂}{∂x} & \frac{∂}{∂y} & \frac{∂}{∂z} \\ y^2 z^2 & x^2 z^2 & x^2 y^2 \end{vmatrix} = \begin{pmatrix} 2x^2 (y − z) \\ 2y^2 (z − x) \\ 2z^2 (x − y) \end{pmatrix}.
Example 7.2.11 Find the curl of the vector field:

    F = r e^{−r^2} = \begin{pmatrix} x e^{−(x^2+y^2+z^2)} \\ y e^{−(x^2+y^2+z^2)} \\ z e^{−(x^2+y^2+z^2)} \end{pmatrix}.

Also in this case we have to simply apply Eq. 7.50, obtaining:

    ∇ × F = \begin{vmatrix} i & j & k \\ \frac{∂}{∂x} & \frac{∂}{∂y} & \frac{∂}{∂z} \\ x e^{−(x^2+y^2+z^2)} & y e^{−(x^2+y^2+z^2)} & z e^{−(x^2+y^2+z^2)} \end{vmatrix} = \begin{pmatrix} e^{−(x^2+y^2+z^2)} (−2yz + 2yz) \\ e^{−(x^2+y^2+z^2)} (−2xz + 2xz) \\ e^{−(x^2+y^2+z^2)} (−2xy + 2xy) \end{pmatrix} = 0.

Indeed, analogously to what we have done in Example 7.2.9, we can demonstrate that any central force field (as r e^{−r^2} is) has curl equal to zero. The vector fields for which the curl is zero are called irrotational.
Given two vector fields a and b and a scalar field φ, it is possible to show that:
    ∇ × (a + b) = ∇ × a + ∇ × b                                                                         (7.51)
    ∇ × ∇φ = 0                                                                                          (7.52)
    ∇ · (a × b) = b · (∇ × a) − a · (∇ × b)                                                             (7.53)
    ∇ × (φ a) = ∇φ × a + φ ∇ × a.                                                                       (7.54)
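Identities such as Eqs. 7.52 and 7.53 are easy to verify symbolically. The following minimal sketch (Python with SymPy; the test fields φ, a and b are arbitrary) implements the gradient, divergence and curl componentwise and checks both identities:

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    coords = (x, y, z)

    def grad(phi):
        return [sp.diff(phi, v) for v in coords]

    def div(a):
        return sum(sp.diff(a[i], coords[i]) for i in range(3))

    def curl(a):
        return [sp.diff(a[2], y) - sp.diff(a[1], z),
                sp.diff(a[0], z) - sp.diff(a[2], x),
                sp.diff(a[1], x) - sp.diff(a[0], y)]

    # Arbitrary test fields.
    phi = x**2 * sp.sin(y) * z
    a = [y * z, x * z**2, sp.exp(x) * y]
    b = [x**2, sp.cos(y), x * y * z]

    # Eq. 7.52: the curl of a gradient is identically zero.
    print([sp.simplify(c) for c in curl(grad(phi))])

    # Eq. 7.53: div(a x b) = b . curl(a) - a . curl(b).
    axb = [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]
    lhs = div(axb)
    rhs = sum(b[i]*curl(a)[i] for i in range(3)) - sum(a[i]*curl(b)[i] for i in range(3))
    print(sp.simplify(lhs - rhs))   # prints 0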
Figure 7.6: Curl of a fluid flow.
The physical significance of the curl of a vector is not quite as transparent as
that of the divergence. If we consider again a fluid characterized by a velocity field
v(x, y, z) we can see from Fig. 7.6 that, if the component along y of the velocity (the
vy component) increases with z, then the fluid tends to curl clockwise (namely in a
negative sense) about the x-axis. How much does it curl? It depends on how much
the component v_y of the velocity increases with z, namely it depends on ∂v_y/∂z. On the other hand, if the v_z component of the velocity field increases with y, the fluid tends again to curl about the x-axis, but this time counterclockwise (in the positive sense; see again Fig. 7.6). The curl depends in this case on ∂v_z/∂y. The curl of v about the x-axis is thus given by the sum of these two contributions, namely:

    [\text{curl}\, v]_1 = \frac{∂v_z}{∂y} − \frac{∂v_y}{∂z},

which is exactly the first component of Eq. 7.50. The other two components can be found analogously.
7.3 Transformation of coordinates
As we have already said, a vector is just a mathematical object characterized by
a direction and by a magnitude and it can be associated with a column of N real
numbers (3 real numbers in the case of the familiar R3 vector space) only once we
Figure 7.7: Rotation of Cartesian coordinate axes about the z-axis.
have established a basis of our vector space. So far we have always worked with the
familiar Cartesian coordinate system x, y, z but for many problems it is convenient
to use other coordinate systems.
7.3.1 Rotation of the coordinate axes
We start here with one of the simplest possible coordinate transformations, namely
the rotation of the coordinate axes. Referring to Fig. 7.7, we may wish to express
the vector position r (and any other vector v) as a function of the new coordinates
x′ and y ′ that are obtained by rotating the x- and the y-axis by the same angle θ
(we leave the z-axis unaltered). Looking at the figure it is easy to see that:
x = r cos ϕ1
y = r sin ϕ1
x′ = r cos ϕ2
y ′ = r sin ϕ2 ,
where r is the magnitude of the position vector r. But ϕ1 = θ + ϕ2 (and therefore
ϕ2 = ϕ1 − θ), therefore we have:
    x' = r \cos(ϕ_1 − θ) = r \cos ϕ_1 \cos θ + r \sin ϕ_1 \sin θ = x \cos θ + y \sin θ                  (7.55)
    y' = r \sin(ϕ_1 − θ) = r \sin ϕ_1 \cos θ − r \cos ϕ_1 \sin θ = −x \sin θ + y \cos θ                 (7.56)
    x = r \cos(ϕ_2 + θ) = r \cos ϕ_2 \cos θ − r \sin ϕ_2 \sin θ = x' \cos θ − y' \sin θ                 (7.57)
    y = r \sin(ϕ_2 + θ) = r \sin ϕ_2 \cos θ + r \cos ϕ_2 \sin θ = x' \sin θ + y' \cos θ.                (7.58)
Indeed, we can apply this argument to any vector because any vector can be represented with an arrow starting at the origin of the axes and ending at some point R (translations of the axes are always possible if the vector does not start at the origin). Therefore, given a vector v represented by (v_x, v_y)^T in the xy-coordinate system, rotation of the axes by an angle θ will transform its components in this way:

    v_x' = v_x \cos θ + v_y \sin θ
    v_y' = −v_x \sin θ + v_y \cos θ                                                                     (7.59)
We can even use this formula as a definition of a vector, namely we call a vector any geometrical object whose components under rotation of the coordinate axes satisfy
Eq. 7.59. This condition is also called covariance.
We can write Eq. 7.59 with the matrix formalism in this way:
    v' = \begin{pmatrix} v_x' \\ v_y' \end{pmatrix} = A v = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} v_x \\ v_y \end{pmatrix},                  (7.60)
namely, recalling Sect. 7.1.3, the rotation of axes is a linear operator that can be
expressed with the matrix A. In the case of the rotation of axes by an angle θ, the
coefficients Aij are given by:
    A_{11} = \cos θ,    A_{12} = \sin θ,    A_{21} = −\sin θ,    A_{22} = \cos θ.                       (7.61)
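A small numerical illustration of Eqs. 7.59, 7.60 and 7.61 (a sketch in Python with NumPy; the angle θ and the components of v are arbitrary): the rotated components preserve the magnitude of the vector, and the inverse transformation is obtained with the transpose of A.

    import numpy as np

    theta = 0.7                                       # arbitrary rotation angle
    A = np.array([[ np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])   # Eq. 7.61

    v = np.array([3.0, -2.0])                         # components in the xy system
    v_prime = A @ v                                   # components in the rotated system, Eq. 7.60

    # The magnitude of the vector does not depend on the coordinate system.
    assert np.isclose(np.linalg.norm(v), np.linalg.norm(v_prime))

    # The inverse transformation uses the transpose of A.
    assert np.allclose(A.T @ v_prime, v)
    print(v_prime)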
By looking at Eqs. 7.55 and 7.56 we can even notice that:
    \frac{∂x'}{∂x} = \cos θ = A_{11},    \frac{∂x'}{∂y} = \sin θ = A_{12},    \frac{∂y'}{∂x} = −\sin θ = A_{21},    \frac{∂y'}{∂y} = \cos θ = A_{22}.
We can generalize these four equations with the relation:
    \frac{∂x_i'}{∂x_j} = A_{ij},                                                                        (7.62)

where 1 (i = 1 or j = 1) is the index relative to the x-components (x or x') and 2 (i = 2 or j = 2) refers to the y-components (y or y'). We can thus rewrite Eq. 7.60 as:

    v_i' = \sum_{j=1}^{2} \frac{∂x_i'}{∂x_j} v_j.
We can now generalize these formulae to an N-dimensional vector space. For any kind of axes rotation in an N-dimensional vector space it is possible to find the coefficients A_{ij} such that the equation:

    v' = A v,

or, in terms of components,

    v_i' = \sum_{j=1}^{N} A_{ij} v_j,    i = 1, . . . , N,                                              (7.63)
represents the transformation of the components of v in the two coordinate systems.
The position vector r = (x_1 e_1, . . . , x_N e_N)^T must also transform in the same way, namely:

    x_i' = \sum_{k=1}^{N} A_{ik} x_k.
If we now differentiate both members of this equation with respect to x_j, we obtain:

    \frac{∂x_i'}{∂x_j} = \frac{∂}{∂x_j} \sum_{k=1}^{N} A_{ik} x_k = \sum_{k=1}^{N} A_{ik} \frac{∂x_k}{∂x_j} = A_{ij},
since the various coordinates xj are supposed to be independent, therefore the only
value of k for which the derivative of xk with respect to xj is different from zero is
k = j. We have therefore obtained that:
    A_{ij} = \frac{∂x_i'}{∂x_j},                                                                        (7.64)

thus Eq. 7.63 translates into:

    v_i' = \sum_{j=1}^{N} \frac{∂x_i'}{∂x_j} v_j.                                                       (7.65)
On the other hand, in our 2-dimensional example we have also seen the inverse
vector transformations, namely the transformations from the coordinates x′ , y ′ to the
coordinates x, y (Eqs. 7.57 and 7.58). For a generic vector v these relations can be
expressed as:

    v_x = v_x' \cos θ − v_y' \sin θ
    v_y = v_x' \sin θ + v_y' \cos θ,

or:

    v = A' v',                                                                                          (7.66)
where:

    A' = \begin{pmatrix} \cos θ & −\sin θ \\ \sin θ & \cos θ \end{pmatrix},

namely the transformation matrix A' is very similar to the transformation matrix A of Eq. 7.60, the only difference being that the off-diagonal elements have been interchanged. That means that

    A' = A^T.
By looking again at Eqs. 7.57 and 7.58 we can notice that:
    \frac{∂x}{∂x'} = \cos θ = A_{11}' = A_{11},    \frac{∂x}{∂y'} = −\sin θ = A_{12}' = A_{21},    \frac{∂y}{∂x'} = \sin θ = A_{21}' = A_{12},    \frac{∂y}{∂y'} = \cos θ = A_{22}' = A_{22}.
We can generalize these four equations with the relation:
    \frac{∂x_j}{∂x_i'} = A_{ij},                                                                        (7.67)

where 1 is again the index relative to the x-components and 2 refers to the y-components (note the inversion of i and j with respect to Eq. 7.62, due to the fact that A' = A^T). We can thus rewrite Eq. 7.66 as:

    v_i' = \sum_{j=1}^{2} \frac{∂x_j}{∂x_i'} v_j,
Figure 7.8: General curvilinear coordinates.
or, in the general case of an N-dimensional vector space:

    v_i' = \sum_{j=1}^{N} \frac{∂x_j}{∂x_i'} v_j.                                                       (7.68)

7.3.2 General curvilinear coordinates
In general, the position of a point P in space (having Cartesian coordinates x, y, z)
can be expressed in terms of three curvilinear coordinates u1, u2 , u3 , provided that
there is a one-to-one correspondence between (x, y, z) and (u1 , u2 , u3). The point P
is placed at the interception between three surfaces of the kind u1 = c1 , u2 = c2 and
u3 = c3 , where c1 , c2 and c3 are constants. The intersection of these three planes
forms three curves u1 , u2 and u3 (see Fig. 7.8). We can take as basis vectors of this
new coordinate system the vectors ê1 , ê2 and ê3 , tangent to the curves u1 , u2 and
u3 at the point P .
We will concentrate here on orthogonal coordinate systems (the most useful),
namely systems for which ê_1, ê_2 and ê_3 are mutually perpendicular. In this case we have ê_i · ê_j = 0 if i ≠ j and ê_1 = ê_2 × ê_3. If r(u_1, u_2, u_3) is the position vector of
the point P , then ∂r/∂u1 is a vector tangent to the u1-curve at P (u2 and u3 must
remain constant), therefore it has the direction of ê1 . We denote its magnitude with
h1 . We can define similarly h2 and h3 as the magnitudes of the vectors ∂r/∂u2 and
∂r/∂u3 , having the direction of ê2 and ê3 , respectively. We have thus:
    ê_1 = \frac{1}{h_1} \frac{∂r}{∂u_1},    ê_2 = \frac{1}{h_2} \frac{∂r}{∂u_2},    ê_3 = \frac{1}{h_3} \frac{∂r}{∂u_3}.                (7.69)
The quantities h1 , h2 and h3 are called scale factors of the curvilinear coordinate
system. It is worth remarking that the quantities u1 , u2 and u3 need not be lengths.
From this equation we notice however that hj duj must have the dimension of a
length. In a Cartesian coordinate system it is clear that ∂r/∂x = i (and analogously
for the other two components), therefore h1 = h2 = h3 = 1.
The infinitesimal vector displacement dr is expressed as:
    dr = \frac{∂r}{∂u_1} du_1 + \frac{∂r}{∂u_2} du_2 + \frac{∂r}{∂u_3} du_3 = h_1 du_1 ê_1 + h_2 du_2 ê_2 + h_3 du_3 ê_3.              (7.70)
The arc length is obtained by the formula (ds)^2 = dr · dr. In the case of orthogonal curvilinear coordinates, as we have said, ê_i · ê_j = 0 if i ≠ j, therefore the arc length is given by:

    (ds)^2 = h_1^2 (du_1)^2 + h_2^2 (du_2)^2 + h_3^2 (du_3)^2.                                          (7.71)
From Eq. 7.70 it is clear that, if the coordinate system is orthogonal, then the
element of volume dV is a rectangular parallelepiped whose sides are h1 du1, h2 du2 ,
h3 du3, namely:
dV = h1 h2 h3 du1 du2 du3 .
Given a scalar field φ, we might still retain valid the expression dφ = ∇φ · dr
we have encountered in Sect. 7.2.3. In fact, the interpretation of the gradient as
rate of change of φ along a particular direction remains, irrespective of the chosen
coordinate system. Namely, for any coordinate system the gradient of a scalar field
must be always the vector having magnitude and direction of the maximum space
rate of change. Since:
    dφ = \frac{∂φ}{∂u_1} du_1 + \frac{∂φ}{∂u_2} du_2 + \frac{∂φ}{∂u_3} du_3,
upon substituting dr with the expression found in Eq. 7.70 we find:
    ∇φ = \frac{1}{h_1} \frac{∂φ}{∂u_1} ê_1 + \frac{1}{h_2} \frac{∂φ}{∂u_2} ê_2 + \frac{1}{h_3} \frac{∂φ}{∂u_3} ê_3.                    (7.72)

Consequently, the nabla operator can be expressed as:

    ∇ = ê_1 \frac{1}{h_1} \frac{∂}{∂u_1} + ê_2 \frac{1}{h_2} \frac{∂}{∂u_2} + ê_3 \frac{1}{h_3} \frac{∂}{∂u_3}.                        (7.73)
If we calculate the gradient of uj by means of this formula we obtain êj /hj , namely:
êj = hj ∇uj .
This formula can be used in combination with the Eqs. 7.46, 7.47, 7.52, 7.53, 7.54
to express the divergence and the curl of a vector field a and the Laplacian of the
scalar field φ in a curvilinear coordinate system, namely:
    ∇ · a = \frac{1}{h_1 h_2 h_3} \left[ \frac{∂}{∂u_1}(h_2 h_3 a_1) + \frac{∂}{∂u_2}(h_3 h_1 a_2) + \frac{∂}{∂u_3}(h_1 h_2 a_3) \right]                      (7.74)

    ∇ × a = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1 ê_1 & h_2 ê_2 & h_3 ê_3 \\ \frac{∂}{∂u_1} & \frac{∂}{∂u_2} & \frac{∂}{∂u_3} \\ h_1 a_1 & h_2 a_2 & h_3 a_3 \end{vmatrix}             (7.75)

    ∇^2 φ = \frac{1}{h_1 h_2 h_3} \left[ \frac{∂}{∂u_1}\left( \frac{h_2 h_3}{h_1} \frac{∂φ}{∂u_1} \right) + \frac{∂}{∂u_2}\left( \frac{h_3 h_1}{h_2} \frac{∂φ}{∂u_2} \right) + \frac{∂}{∂u_3}\left( \frac{h_1 h_2}{h_3} \frac{∂φ}{∂u_3} \right) \right].       (7.76)
We can for instance demonstrate Eq. 7.74 which is a bit more complicated than the
other two equations. Let us consider the sub-expression ∇ · (a1 ê1 ). Since ê1 = ê2 × ê3
we have:
∇ · (a1 ê1 ) = ∇ · (a1 h2 h3 ∇u2 × ∇u3 ).
In fact, we have seen that for an orthogonal coordinate system it is êj = hj ∇uj . By
using Eq. 7.53 we obtain:
∇ · (a1 ê1 ) = ∇(a1 h2 h3 ) · (∇u2 × ∇u3 ) + a1 h2 h3 ∇ · (∇u2 × ∇u3 ).
Applying again Eq. 7.53 we obtain:
∇ · (∇u2 × ∇u3 ) = ∇u3 · (∇ × ∇u2 ) − ∇u2 · (∇ × ∇u3 ),
which is equal to zero because of Eq. 7.52. Therefore we are left with:
ê3
ê1
ê2
×
= ∇(a1 h2 h3 ) ·
.
∇ · (a1 ê1 ) = ∇(a1 h2 h3 ) · (∇u2 × ∇u3 ) = ∇(a1 h2 h3 ) ·
h2 h3
h2 h3
By applying Eq. 7.72 to the scalar field φ = a1 h2 h3 we obtain:
∇ · (a1 ê1 ) =
1
∂
(a1 h2 h3 ),
h1 h2 h3 ∂u1
which is the first term of Eq. 7.74. Analogously we can proceed to obtain the other
two terms.
Example 7.3.1 The cylindrical coordinate system (ρ, θ, z) is characterized by the
following transformation equations:

    x = ρ \cos θ
    y = ρ \sin θ
    z = z.
Find the expressions of gradient and Laplacian of a scalar field φ; divergence and
curl of a vector field a as a function of ρ, θ, z.
All we need to know are the scale factors of this coordinate system. We can start
considering that the infinitesimal arc length ds in the Cartesian coordinate system is
given by:
(ds)2 = (dx)2 + (dy)2 + (dz)2 .
We have:
dx = dρ cos θ − ρ sin θdθ
dy = dρ sin θ + ρ cos θdθ
dz = dz,
therefore:
    (ds)^2 = (dρ \cos θ − ρ \sin θ\, dθ)^2 + (dρ \sin θ + ρ \cos θ\, dθ)^2 + (dz)^2
           = (dρ)^2 (\cos^2 θ + \sin^2 θ) + (ρ\, dθ)^2 (\sin^2 θ + \cos^2 θ) + (dz)^2
           = (dρ)^2 + ρ^2 (dθ)^2 + (dz)^2.
By comparing it with Eq. 7.71 we can immediately recognize that:
h1 = hρ = 1,
h2 = hθ = ρ,
h3 = hz = 1.
Now we can apply Eqs. 7.72, 7.74, 7.75 and 7.76, obtaining:

    ∇φ = ρ̂ \frac{∂φ}{∂ρ} + θ̂ \frac{1}{ρ} \frac{∂φ}{∂θ} + ẑ \frac{∂φ}{∂z}

    ∇ · a = \frac{1}{ρ} \frac{∂}{∂ρ}(ρ a_ρ) + \frac{1}{ρ} \frac{∂a_θ}{∂θ} + \frac{∂a_z}{∂z}

    ∇ × a = \frac{1}{ρ} \begin{vmatrix} ρ̂ & ρ θ̂ & ẑ \\ \frac{∂}{∂ρ} & \frac{∂}{∂θ} & \frac{∂}{∂z} \\ a_ρ & ρ a_θ & a_z \end{vmatrix}

    ∇^2 φ = \frac{1}{ρ} \frac{∂}{∂ρ}\left( ρ \frac{∂φ}{∂ρ} \right) + \frac{1}{ρ^2} \frac{∂^2 φ}{∂θ^2} + \frac{∂^2 φ}{∂z^2},

where ρ̂, θ̂ and ẑ are the unit vectors of the cylindrical coordinate system.
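The Laplacian formula just obtained can be checked symbolically. The following sketch (Python with SymPy; the test field is an arbitrary choice) compares the cylindrical expression with the Cartesian Laplacian after substituting x = ρ cos θ and y = ρ sin θ:

    import sympy as sp

    rho, theta, z = sp.symbols('rho theta z', positive=True)
    x, y = rho * sp.cos(theta), rho * sp.sin(theta)

    def phi_cart(X, Y, Z):
        # An arbitrary test scalar field written in Cartesian coordinates.
        return X**2 * Y + sp.exp(Z) * X

    # The same field expressed in cylindrical coordinates.
    phi = phi_cart(x, y, z)

    # Laplacian in cylindrical coordinates (result of Example 7.3.1).
    lap_cyl = (sp.diff(rho * sp.diff(phi, rho), rho) / rho
               + sp.diff(phi, theta, 2) / rho**2
               + sp.diff(phi, z, 2))

    # Laplacian computed in Cartesian coordinates, then rewritten in (rho, theta, z).
    X, Y, Z = sp.symbols('X Y Z')
    lap_cart = sum(sp.diff(phi_cart(X, Y, Z), v, 2) for v in (X, Y, Z))
    lap_cart = lap_cart.subs({X: x, Y: y, Z: z})

    print(sp.simplify(lap_cyl - lap_cart))    # prints 0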
7.4 Tensors

The considerations and formulae we have found in Section 7.3.1 might appear excessive in the light of the simplicity of the problem (rotation about one coordinate
axis). They represent however the simplest possible introduction to the concept of
tensors.
7.4.1 Basic definitions
We have seen that we can use Eq. 7.59 as definition of vectors, namely vectors
are mathematical entities whose components, under rotation of the coordinate axes,
can be transformed according to this equation. This guarantees that the properties
of the vector (its magnitude and direction) do not vary if we change the reference
system. We have seen, too, that we can express this transformation by means of
the partial derivatives of the components of the two reference systems. However, we
have found an ambiguity between Eq. 7.65 and Eq. 7.68, namely the components of
a transformed vector v′ can be either expressed as:
    v'^i = \sum_{j=1}^{N} \frac{∂x'^i}{∂x^j} v^j,                                                       (7.77)

or as:

    v_i' = \sum_{j=1}^{N} \frac{∂x^j}{∂x'^i} v_j.                                                       (7.78)
In Cartesian coordinates these two formulations are equivalent, but in general curvilinear coordinates (like the ones introduced in Sect. 7.3.2), some vectors will be
transformed according to Eq. 7.77 and some others according to Eq. 7.78. The
first are called contravariant vectors, while the second are called covariant
vectors. Contravariant vectors are conventionally denoted with a superscript (like in
Eq. 7.77) to distinguish them from the covariant ones.
The prototype of a contravariant vector is the position vector (x, y, z) in a Cartesian coordinate system, which can thus be written as (x^1, x^2, x^3). Of course, with this notation we must pay attention not to confuse the components x^2 and x^3 with the square and the cube of the number x. The prototype of the covariant vector
is instead the gradient. We have seen in fact that, in Cartesian coordinates, the
gradient can be written as:
    ∇φ = i \frac{∂φ}{∂x^1} + j \frac{∂φ}{∂x^2} + k \frac{∂φ}{∂x^3}.
If we now rotate the coordinate axes, the components of the transformed gradient
can be found by means of the chain rule of differentiation, namely:
    \frac{∂φ'}{∂x'^i} = \frac{∂φ}{∂x^1} \frac{∂x^1}{∂x'^i} + \frac{∂φ}{∂x^2} \frac{∂x^2}{∂x'^i} + \frac{∂φ}{∂x^3} \frac{∂x^3}{∂x'^i} = \sum_{j=1}^{3} \frac{∂φ}{∂x^j} \frac{∂x^j}{∂x'^i},
which, since ∂φ/∂xj are the components of the original gradient, has therefore the
same form as Eq. 7.78.
We will define vectors as tensors of rank 1. In a 3-dimensional space, a tensor of
rank n is a mathematical object that transforms in a definite way when the coordi-
nate system changes (in an N-dimensional space a tensor of rank n has N^n components). The way it transforms guarantees that the properties of this
mathematical object do not vary if we change the coordinate system. The simplest
possible tensor is the tensor of rank 0, which has 3^0 = 1 component and therefore
identifies with a scalar. We have seen the definition of covariant and contravariant
tensors of rank 1 (Eqs. 7.77 and 7.78). We can now go on defining contravariant,
mixed and covariant tensors of rank 2 by the following equations of their components
under coordinate transformations:
    A'^{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \frac{∂x'^i}{∂x^k} \frac{∂x'^j}{∂x^l} A^{kl}                (7.79)

    B'^i{}_j = \sum_{k=1}^{3} \sum_{l=1}^{3} \frac{∂x'^i}{∂x^k} \frac{∂x^l}{∂x'^j} B^k{}_l              (7.80)

    C'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \frac{∂x^k}{∂x'^i} \frac{∂x^l}{∂x'^j} C_{kl}.               (7.81)
Clearly, the rank goes as the number of partial derivatives necessary in the definition:
0 for a scalar, 1 for a vector, 2 for a second-rank tensor and so on. Each index
(subscript or superscript) ranges over the number of dimensions in the space. We
see from the above formulae that Akl is contravariant with respect to both indices,
Ckl is covariant with respect to both indices and B k l transforms contravariantly with
respect to the first index k but covariantly with respect to the second index l. Once
again, if we use the Cartesian coordinate system, all these definitions coincide.
The second-rank tensor Akl is often indicated with a boldface capital letter (A)
and its components can be arranged in a 3 × 3 square array:

    A = \begin{pmatrix} A^{11} & A^{12} & A^{13} \\ A^{21} & A^{22} & A^{23} \\ A^{31} & A^{32} & A^{33} \end{pmatrix},
but it shall not be confused with a matrix. A matrix is a second-rank tensor only if
its components transform according to one of the Eqs. 7.79–7.81.
7.4.2 Einstein summation convention
Tensor analysis can be quite messy because of the large number of indices involved.
In order to simplify the algebra and to make the notation more compact, Einstein
introduced the following convention: when an index variable appears twice in a single term, once in an upper position (superscript) and once in a lower position (subscript),
that implies that we are summing over all the possible values of this index variable.
With this convention, Eq. 7.1 can be rewritten as:
    a = a^i e_i.
The only exception to this rule is given by the Cartesian coordinates, which are
assumed to be summed up even if they appear only as superscript. Therefore, Eqs.
7.79–7.81 can be rewritten as:
    A'^{ij} = \frac{∂x'^i}{∂x^k} \frac{∂x'^j}{∂x^l} A^{kl}                                              (7.82)

    B'^i{}_j = \frac{∂x'^i}{∂x^k} \frac{∂x^l}{∂x'^j} B^k{}_l                                            (7.83)

    C'_{ij} = \frac{∂x^k}{∂x'^i} \frac{∂x^l}{∂x'^j} C_{kl},                                             (7.84)
in spite of the fact that the indices k and l appear sometimes two times as superscript.
7.4.3 Direct product and contraction
Given a covariant vector a_i and a contravariant vector b^j in an N-dimensional space,
the most general product between them is a mathematical entity ci j containing all
the possible products between the N components of ai and the N components of bj ,
namely it is a N ×N matrix whose elements are the products ai bj . Is it a second rank
tensor? We can apply Eq. 7.78 and Eq. 7.77 to transform ai and bj , respectively,
under rotation of the coordinate axes, obtaining:
    a_i' b'^j = \frac{∂x^k}{∂x'^i} a_k \frac{∂x'^j}{∂x^l} b^l = \frac{∂x^k}{∂x'^i} \frac{∂x'^j}{∂x^l} a_k b^l,
where we have made use of the Einstein summation convention. The tensor ci j = ai bj
obeys Eq. 7.80 when the coordinate axes rotate and it is thus a mixed second rank
tensor.
This kind of product between tensors is called direct product and always produces another tensor with rank equal to the sum of the ranks of the tensors we have multiplied (respecting also the number of covariant and contravariant indices of the two
original tensors). For instance we can multiply the second order mixed tensor Ai j
by the mixed tensor Bk l obtaining the fourth rank tensor
    C_{ik}{}^{jl} = A_i{}^j B_k{}^l.                                                                    (7.85)
The direct product is thus a way to obtain tensors of progressively larger rank. Is
it possible to reduce the rank of a tensor? This is possible by means of the operation
of contraction, in which we set a covariant index equal to a contravariant index. For
instance, in the fourth rank tensor Cik jl defined by Eq. 7.85 we could set k = j and
obtain (always using the Einstein convention):
Cij jl = Ai j Bj l .
By comparing this expression with Eq. 5.1 of Sect. 5.1.1 we can recognize this
operation as the classical multiplication of two matrices, that produces (as we know)
another matrix. Therefore the fourth rank tensor C_{ik}{}^{jl} has been contracted to the
second rank tensor Ci l . What happens if we contract further Ci l ? This is obtained
by putting l equal to i, namely by considering the quantity Ci i . Because of the
Einstein convention, this corresponds to summing up all the elements of the matrix
Ci l in which the column index is equal to the row index, namely by summing up the
diagonal terms of the matrix. This sum is nothing else that the trace of the matrix,
namely a scalar (a tensor of rank zero). It is easy to see that the trace does not
change if we change the reference system. It is enough to apply Eq. 7.83 to C ′ i i ,
obtaining:
    C'^i{}_i = \frac{∂x'^i}{∂x^k} \frac{∂x^l}{∂x'^i} C^k{}_l = \frac{∂x^l}{∂x^k} C^k{}_l.
But ∂x^l/∂x^k is always zero for l ≠ k (we always assume that the coordinates are independent of each other) and of course ∂x^k/∂x^k = 1, therefore:

    C'^i{}_i = C^k{}_k.
In spite of the fact that we use two different variables (i and k), these two expressions
are both telling us that the sum of all the diagonal terms of C remains the same
even if we change the coordinate system. This is the strength of tensors: they allow us to describe a physical system independently of the coordinate system, with the guarantee that the physical quantities involved do not depend on the reference frame.
We have therefore seen that the operation of contraction always has the effect of reducing the rank of a tensor by 2. It should be remembered that the contraction is possible
only between a covariant and a contravariant index. The analogy with the vector
multiplication helps us to understand why: if we multiply the transpose of a vector
by another vector (or by the complex conjugate of another vector if we deal with
complex vectors), we obtain the scalar product of them, namely a scalar (see Sect.
5.1.1). If we instead perform the opposite operation (multiplying a vector by the
transpose of another vector) we obtain a matrix, namely a tensor of rank two.
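In Cartesian coordinates these operations are conveniently reproduced with index notation in code. The following minimal sketch (Python with NumPy; the arrays A and B and the rotation are arbitrary) forms the direct product of Eq. 7.85, contracts it back to the ordinary matrix product, and checks that the fully contracted quantity (the trace) is unchanged by a rotation of the axes:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))       # components A_i^j (arbitrary)
    B = rng.standard_normal((3, 3))       # components B_k^l (arbitrary)

    # Direct product, Eq. 7.85: C_{ik}^{jl} = A_i^j B_k^l (a rank-4 array).
    C = np.einsum('ij,kl->ikjl', A, B)

    # Contraction k = j reduces the rank by 2 and reproduces the matrix product.
    C_contracted = np.einsum('ijjl->il', C)
    assert np.allclose(C_contracted, A @ B)

    # Full contraction gives the trace, which is invariant under a rotation R.
    theta = 0.4
    R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
    D = A @ B
    D_rotated = R @ D @ R.T               # Cartesian transformation of a rank-2 tensor
    assert np.isclose(np.trace(D), np.trace(D_rotated))
    print(np.trace(D))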
7.4.4 Kronecker delta and Levi-Civita symbol
We will analyze in this section two famous tensors, useful in many branches of physics:
the Kronecker delta and the Levi-Civita symbol. The Kronecker delta is a function
of two integer variables i and j and it is one if the two variables are equal, zero
otherwise. It is indicated with δ and, from its definition, we have:

    δ_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i ≠ j \end{cases}                   (7.86)
Given a sum \sum_{i=−∞}^{∞} a_i, we have the property:

    \sum_{i=−∞}^{∞} a_i δ_{ij} = a_j,

which is analogous to the property \int_{−∞}^{∞} f(x) δ(x − c)\, dx = f(c) we have seen for the Dirac delta function (Sect. 4.1.2, Eq. 4.15).
If we consider the Kronecker delta as a tensor, then it turns out to be a mixed tensor. To demonstrate this, we should recall that in a Cartesian coordinate system the various coordinates are independent of each other. This means that the derivative of the coordinate x^i with respect to the coordinate x^j will always be zero, except in the case i = j, in which of course ∂x^i/∂x^i = 1. This is exactly what the Kronecker delta requires, namely we have:

    \frac{∂x^i}{∂x^j} = δ_j{}^i.                                                                        (7.87)
This condition holds also if we rotate the coordinate axes, namely the transformed
Kronecker delta obeys the relation:
    \frac{∂x'^i}{∂x'^j} = δ'_j{}^i.
By means of the chain rule of partial differentiation we have:

    δ'_j{}^i = \frac{∂x'^i}{∂x'^j} = \frac{∂x'^i}{∂x^k} \frac{∂x^k}{∂x^l} \frac{∂x^l}{∂x'^j}.

However, because of Eq. 7.87 the second term of this product is δ_l{}^k, therefore we have:

    δ'_j{}^i = \frac{∂x'^i}{∂x^k} \frac{∂x^l}{∂x'^j} δ_l{}^k,

which is analogous to Eq. 7.80, telling us therefore that the Kronecker delta is a mixed tensor.
The Levi-Civita symbol is indicated with εijk . It acts therefore on three integer
variables and produces 0 if one (or more than one) of the indices are repeated, +1
if (i, j, k) is an even permutation of (1, 2, 3), −1 if (i, j, k) is an odd permutation of
(1, 2, 3), namely we have:

    ε_{ijk} = \begin{cases} +1, & \text{if } (i, j, k) = (1, 2, 3), (2, 3, 1) \text{ or } (3, 1, 2) \\ −1, & \text{if } (i, j, k) = (1, 3, 2), (3, 2, 1) \text{ or } (2, 1, 3) \\ 0, & \text{otherwise (i = j or i = k or j = k)} \end{cases}                    (7.88)
This can also be expressed by means of the formula:

    ε_{ijk} = \frac{(j − i)(k − i)(k − j)}{2}.
The Levi-Civita symbol is therefore a tensor of rank 3. It is useful to simplify considerably some formulae. For instance, given a vector c = a × b, its i-th component is given by:

    c_i = ε_{ijk} a_j b_k

(compare with the formula Eq. 7.8).
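This last formula is easy to reproduce numerically. A minimal sketch (Python with NumPy; the vectors are arbitrary) builds ε_{ijk} explicitly as a 3 × 3 × 3 array and recovers the cross product by a double contraction:

    import numpy as np

    # Build the Levi-Civita symbol as a 3x3x3 array.
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0    # even permutations of (1, 2, 3)
        eps[i, k, j] = -1.0   # odd permutations of (1, 2, 3)

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, -1.0, 2.0])

    # c_i = eps_ijk a_j b_k (summation over j and k).
    c = np.einsum('ijk,j,k->i', eps, a, b)

    assert np.allclose(c, np.cross(a, b))
    print(c)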