Chapter 2 First-order ordinary differential equations

First-order ODEs involve only the first derivative of the unknown function, y(x), and can be written in the forms

F(x, y, y') = 0,   (2.1a)
y' = f(x, y).   (2.1b)
The form (2.1b) is less general than (2.1a), but it is simpler to analyse.
2.1 Simple exact solutions (revision)
You should already be familiar with (at least) two methods of exact, or analytic, solution for
first-order ODEs from the C4 module of your A-level maths courses.
2.1.1 Direct integration
If the ODE can be written in the form
y' = f(x),   (2.2)

then by the fundamental theorem of calculus, theorem 0.1,

y(x) = ∫ f(x) dx,
is a solution of equation (2.2). Obviously, this method will only work if we can actually integrate
the function f (x). If not, then we must usually resort to numerical methods, see §2.3.
2.1.2 Separable ODEs
The ODE is said to be separable if it can be written in the form
g(y) y' = f(x),   or   g(y) dy/dx = f(x).   (2.3)
The “separation” refers to the fact that the y-dependence is on one side of the equation and the x-dependence is on the other: the variables (x and y) are separated by the equals sign.
The solution of equation (2.3) is obtained on integration of both sides with respect to x.
∫ g(y) (dy/dx) dx = ∫ f(x) dx   ⇒   ∫ g(y) dy = ∫ f(x) dx.
The method will always give an implicit expression for the solution, but an explicit solution might
not be possible. Once again, if we cannot integrate either of the functions g(y) or f (x) then we
must seek an alternative method of solution.
Example 2.1. A separable first-order ODE
Find an expression for the solution to the ordinary differential equation
y' = x² / (1 + cos y).   (2.4)
Solution 2.1. Multiplying both sides of equation (2.4) by 1 + cos y and integrating with respect
to x gives
∫ (1 + cos y) dy = ∫ x² dx   ⇒   y + sin y = (1/3) x³ + C.   (2.5)
As usual in these problems, the two constants of integration can be combined into a single constant, C. Equation (2.5) cannot be solved explicitly for y, but we can write an explicit expression for x:

x = [3(y + sin y − C)]^(1/3).
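Although (2.5) only defines y implicitly, a numerical root-finder can still recover y for any given x. The course's software is MATLAB, but the same check can be sketched in Python; the helper name solve_y and the bracketing interval are illustrative assumptions, not part of the notes. The sketch exploits the fact that g(y) = y + sin y is non-decreasing, so the root is unique.

```python
import math

def solve_y(x, C=0.0, lo=-50.0, hi=50.0, tol=1e-12):
    """Invert the implicit solution (2.5), y + sin y = x^3/3 + C, for y.
    g(y) = y + sin y is non-decreasing (g'(y) = 1 + cos y >= 0), so plain
    bisection on g(y) - rhs finds the unique root."""
    rhs = x ** 3 / 3.0 + C

    def g(y):
        return y + math.sin(y) - rhs

    # Bisection: maintain g(lo) <= 0 <= g(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The recovered y satisfies the implicit relation (2.5) with C = 0
x = 1.5
y = solve_y(x)
assert abs(y + math.sin(y) - x ** 3 / 3.0) < 1e-9
```

The same idea works for any separable ODE whose implicit solution is monotone in y.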
2.2 Graphical Solutions
Before rushing in to solve more general first-order ODEs exactly, we shall pause and consider the
qualitative information that can be obtained. In fact, when an ODE is in the form (2.1b), we can
sketch the solutions without actually having to solve the ODE — a method known as graphical
solution.
2.2.1 Direction Fields
If the ODE can be written in the form (2.1b) then the function f(x, y) represents the gradient of the solution y(x). We can extract information about the function y(x) by plotting this gradient in the x-y plane. The gradient is represented by a straight line, the direction vector, that has a slope given by f(x, y); see Figure 2.1. The direction vector has components (1, y') = (1, f(x, y)) in Cartesian coordinates and emanates from the point (x, y). The direction field of the ODE is defined to be the set of all direction vectors in the x-y plane.
Example 2.2. Plotting a direction field
Calculate and plot the direction field for the ODE y' = x for y > 0.
Solution 2.2. At any point (x, y), the direction vector is (1, x) and the corresponding direction
field is shown in Figure 2.2a.
The angle θ of the direction vector to the horizontal is given by tan θ = y' = f(x, y).

Figure 2.1: The direction vector has the same slope as the solution of the ODE y' = f(x, y).
Figure 2.2: (a) The direction field corresponding to the ODE y' = x. (b) The direction field and integral curves, y = x²/2 + C, corresponding to the ODE y' = x.
2.2.2 Integral curves
The direction field represents the slope of solutions to the ordinary differential equation, so the
solutions must be everywhere tangent to the direction field. Curves that are everywhere tangent to
the direction field are known as integral curves and each integral curve represents a solution of
the ODE. In other words, the integral curves are (graphical) solutions of the ODE.
Example 2.3. Constructing integral curves
Plot the integral curves corresponding to the ODE y' = x.
Solution 2.3. The curves can be constructed by starting from any point on the y-axis and following
the direction field forwards and backwards. In this case, the solution to the ODE is easily obtained
by direct integration, y = x²/2 + C, so the integral curves are a family of parabolas differing only by
the position at which they cross the y-axis (y = C when x = 0), see Figure 2.2b.
2.2.3 Isoclines
When sketching direction fields it can be helpful to identify isoclines, curves along which the
gradient is constant (iso = “same”, and cline = “slope”). The equation defining the isoclines is
simply

dy/dx = K, a constant,   ⇒   f(x, y) = K.   (2.6)
Example 2.4. Constructing another direction field
By determining the isoclines, calculate and plot the direction field and integral curves for the ODE
y' = −x/y for y > 0.
Solution 2.4. The isoclines are given by −x/y = K, or y = −x/K, i.e. straight lines¹ passing
through the origin of slope −1/K. At any point (x, y) the direction vector is (1, −x/y), which can
be rescaled on multiplication by y to give (y, −x). Note that multiplication by a scalar does not
change the direction, merely the magnitude of the direction vector. At any point (x, y), therefore,
the direction vector is perpendicular to the isocline from the origin (0, 0) to the point (x, y). The
direction field and isoclines are shown in Figure 2.3a.
The ODE is separable and, after multiplication by y, both sides may be integrated to give
∫ y dy = ∫ −x dx   ⇒   (1/2) y² = −(1/2) x² + C   ⇒   x² + y² = 2C,

the equation of a circle with radius √(2C). Hence, the integral curves are a family of concentric
circles centred on the origin. The direction field and integral curves are shown in Figure 2.3b.
Figure 2.3: (a) The direction field with isoclines corresponding to the ODE y' = −x/y. (b) The direction field and integral curves, x² + y² = 2C, corresponding to the ODE y' = −x/y.
2.2.4 Plotting direction fields using MATLAB
Plotting direction fields is a tedious, repetitive task, which is the kind of thing that computers do
well and humans do badly. Of course, you should be able to plot direction fields by hand, but if a
computer is available, then learn to use it!
One of MATLAB’s many strengths is its built-in plotting capabilities, see §0.2.5. Unfortunately,
direction fields cannot be plotted by default, but a vector field can be plotted by using a “quiver
plot”. The quiver command requires four types of input data x, y, dx and dy where dx and dy are
the x and y components of the vector located at (x, y). As with most data in MATLAB, the input data must be MATLAB matrices, and corresponding entries in each matrix contain the data used to generate a single arrow in the plot. It is extremely important that all four matrices have the same dimensions. We must now find a way to generate these four matrices as quickly and easily as possible.

¹ Isoclines will not always be straight lines; e.g. y' = x²/y has quadratic isoclines: K = x²/y ⇒ y = x²/K.
In order to generate a “grid” of starting points in the (x, y) plane we use another MATLAB
command meshgrid. The command
>> [x,y] = meshgrid(-3:3,0:6);
generates two 7 × 7 matrices x and y that represent x and y coordinates of 49 points in the domain
−3 ≤ x ≤ 3 and 0 ≤ y ≤ 6 with a unit spacing between each point.
x =
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3

y =
     0     0     0     0     0     0     0
     1     1     1     1     1     1     1
     2     2     2     2     2     2     2
     3     3     3     3     3     3     3
     4     4     4     4     4     4     4
     5     5     5     5     5     5     5
     6     6     6     6     6     6     6
Note that the x coordinate is constant in each column of the x matrix and the y coordinate is
constant in each row of the y matrix. The spacing between points in the grid can be adjusted by
an additional argument between the two limits. The command
>> [x,y] = meshgrid(-3:3:3,0:3:6);
increases the space between each point to 3 units, rather than the default 1 unit, generating the
two matrices:
x =
    -3     0     3
    -3     0     3
    -3     0     3

y =
     0     0     0
     3     3     3
     6     6     6
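For readers following along outside MATLAB, NumPy's meshgrid behaves the same way (its default "xy" indexing matches MATLAB's meshgrid). This is a small cross-checking sketch, not part of the course code.

```python
import numpy as np

# NumPy equivalent of MATLAB's [x,y] = meshgrid(-3:3:3, 0:3:6)
x, y = np.meshgrid(np.arange(-3, 4, 3), np.arange(0, 7, 3))

print(x)  # each row is [-3, 0, 3]: x is constant down each column
print(y)  # each row is constant: values 0, 3, 6 down the rows
```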
In general the arguments to meshgrid can be any two vectors that contain lists of x and y coordinates
used to generate the grid. Whatever domain and spacing we choose, we now have a grid of starting
points for the vectors. It remains to calculate the x and y components of the direction field. We
can create a matrix of the gradients at each point (x, y), e. g. f (x, y) = −x/y, by the following
command
>> f = -x./y;
which gives the matrix
f =
        Inf       Inf       Inf       NaN      -Inf      -Inf      -Inf
    3.00000   2.00000   1.00000  -0.00000  -1.00000  -2.00000  -3.00000
    1.50000   1.00000   0.50000  -0.00000  -0.50000  -1.00000  -1.50000
    1.00000   0.66667   0.33333  -0.00000  -0.33333  -0.66667  -1.00000
    0.75000   0.50000   0.25000  -0.00000  -0.25000  -0.50000  -0.75000
    0.60000   0.40000   0.20000  -0.00000  -0.20000  -0.40000  -0.60000
    0.50000   0.33333   0.16667  -0.00000  -0.16667  -0.33333  -0.50000
The use of the ./ operator is very important because it specifies element-by-element division. The command x/y would try to calculate a matrix right division, approximately the same as multiplication of x by the inverse of y. Note that MATLAB can distinguish between positive and negative infinity (Inf) and undefined quantities (NaN, which means "Not a Number" and occurs here because 0/0 is undefined).
When plotting a direction field, the vectors should all have the same length, which will not always be the case if we let (dx, dy) = (1, y'). Instead, we use the fact that if θ is the angle between the direction vector and the x-axis, then y' = tan θ; see Figure 2.1. We can then construct a vector of unit length by letting dx = cos θ, dy = sin θ. We calculate the dx and dy matrices as follows:
>> th = atan(f);
>> dx = cos(th);
>> dy = sin(th);
We can now generate the quiver plot by the following command
>> quiver(x,y,dx,dy);
The complete sequence of commands to generate a plot of the direction field corresponding to y' = f(x, y) = −x/y is

>> [x,y] = meshgrid(-3:3,0:6);
>> f = -x./y;
>> th = atan(f);
>> dx = cos(th);
>> dy = sin(th);
>> quiver(x,y,dx,dy);
and direction fields for other ODEs can be generated by changing the definition of f on the second
line.
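The same quiver construction carries over almost line for line to Python's NumPy and Matplotlib, for anyone without a MATLAB licence. This is a sketch with illustrative choices (the Agg backend and the file name dir_field.png are assumptions, not course requirements).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use plt.show() in a desktop session
import matplotlib.pyplot as plt

# Direction field for y' = f(x, y) = -x/y on the same grid as the MATLAB example
x, y = np.meshgrid(np.arange(-3, 4), np.arange(0, 7))
with np.errstate(divide="ignore", invalid="ignore"):
    f = -x / y                   # element-wise; Inf/NaN on the line y = 0, as in MATLAB

th = np.arctan(f)                # angle of each direction vector to the horizontal
dx, dy = np.cos(th), np.sin(th)  # unit-length arrows, as in the text

plt.quiver(x, y, dx, dy)
plt.xlabel("x")
plt.ylabel("y")
plt.savefig("dir_field.png")
```

Matplotlib's quiver, like MATLAB's, simply skips arrows whose components are NaN, so the undefined point at the origin causes no trouble.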
Creating a MATLAB function to generate direction fields
If we are typing in the same commands again and again with only minor changes then that suggests
that we could save ourselves some time by creating a MATLAB function, which is essentially a list
of those commands. A suitable function to generate direction fields is shown below.
function dir_field(func,xrange,yrange)
%dir_field(func,xrange,yrange)
%Generate a quiver plot of a direction field:
%   func: a function of two variables such that y' = func(x,y)
% xrange: a vector of x coordinates
% yrange: a vector of y coordinates
[x,y] = meshgrid(xrange,yrange);
f = func(x,y);
th = atan(f);
dx = cos(th);
dy = sin(th);
quiver(x,y,dx,dy);
end
The only difference between the commands in the above function and those described in section
2.2.4 is the use of an unspecified function, func, of x and y, which is an argument to the MATLAB
function dir_field. This allows for complete generality because we can change the function that
we plot when we call the dir_field function.
In order for MATLAB to use this new function it must be saved in a file called dir_field.m
located in the “MATLAB Search Path”. The easiest way to achieve this is to type edit dir_field
in the MATLAB command window, which should start the MATLAB editor in a new window.
Type the above function into the editor and save it (File->Save). You can test whether the file
has been correctly created by typing
>> help dir_field
If everything is set up properly, you should see the comments at the top of the function definition
dir_field(func,xrange,yrange)
Generate a quiver plot of a direction field:
func: a function of two variables such that y' = func(x,y)
xrange: a vector of x coordinates
yrange: a vector of y coordinates
If you see the message
No help comments found in dir_field.m.
then the file is in the correct place, and you have either not put any comments in the function or the comment (percent) sign, %, is not the first character of the line.
If instead you see the message
dir_field.m not found.
then the file dir_field.m has not been created, or is not in the “MATLAB Search Path”.
Once you have the file in the correct place, with or without working comments, you can use the
function to generate Figure 2.2a as follows
>> dir_field(@(x,y) x,-3:0.5:3,0:0.5:3);
>> axis([-2.95,2.95,-0.05,3.05]);
The syntax @(x,y) x is called an anonymous function and is a quick way of specifying the function that defines the direction field². The notation @(x,y) specifies that we are defining a function of two variables, x and y, and the second x is the function itself. Translation of some more mathematical functions into this MATLAB notation is shown below.
² An alternative is to create a new M-file that defines a function of two variables and then to pass the name of that function to the function dir_field.
f(x, y) = x        @(x,y) x
f(x, y) = −x/y     @(x,y) -x./y
f(x, y) = sin x    @(x,y) sin(x)
f(x, y) = xy       @(x,y) x.*y
The use of the element-by-element operators .* and ./ is necessary because x and y are matrices; look back at the commands used in the function dir_field.
2.2.5 Existence and uniqueness of solutions
If we can construct an integral curve, then the solution that it represents must exist, at least in
the range over which we can draw the curve. There are many different integral curves, however,
suggesting that the solution is not unique. Indeed, the very first example, 1.1, did not have a
unique solution because any constant C satisfies the equation. As you might have guessed, this
non-uniqueness follows from the arbitrary constant introduced on integration.
We can remove the arbitrariness introduced by the constant of integration by specifying that
the solution must pass through a particular point in the (x, y) plane, (x0 , y0 ) say. The constraint
can be written mathematically in the form
y(x0) = y0,   (2.7)

and is known as an initial condition because it represents the value of the solution at a single (starting) point. The complete solution can be constructed by starting from the initial point and following the integral curve in either direction³.
Having removed the non-uniqueness due to the arbitrary constant of integration, we can consider
the existence and uniqueness of solutions to the initial value problem:
y' = f(x, y), and y(x0) = y0.   (2.8)
The solution will fail to exist if we cannot construct an integral curve in the x-y plane starting from the point (x0, y0). In fact, provided that the function f(x, y) is continuous near (x0, y0), we can always construct an integral curve in the local neighbourhood of (x0, y0) and the solution exists; a result known as the Peano existence theorem.
If there is a choice of direction at any point along an integral curve, in other words if two integral
curves cross each other, then the solution is not unique. A stronger condition than simple continuity
is required to prevent such an occurrence and leads to the following (Picard’s existence) theorem:
Theorem 2.1. If f(x, y) and ∂f/∂y are both continuous functions of x and y in a region 0 < |x − x0| < a and 0 < |y − y0| < b, then there exists a unique solution y = y(x) in the interval 0 < |x − x0| < h ≤ a that satisfies the initial value problem

dy/dx = f(x, y) and y(x0) = y0.
The proof of this theorem is beyond the scope of this course; the basic idea is that continuity of the partial derivative allows the "distance between two nearby functions"⁴ to be bounded, which is, eventually, enough to show that the "distance" between two solutions passing through the same point must be zero, i.e. there is only one solution. In order to apply the theorem, the continuity of the functions f(x, y) and ∂f/∂y must be investigated. The theorem will fail if either function is not continuous.

³ In problems for which the independent variable is time, t, it is usually only necessary to follow the integral curve forward in t.
⁴ A precise definition of the term "distance between functions" is surprisingly tricky and requires what is known as functional analysis; see MATH20122, MATH31011 and MATH46111.
Example 2.5. Using Theorem 2.1
Use the existence and uniqueness theorem to determine values of the initial conditions (x0 , y0 ) for
which there is guaranteed to be a unique solution to the initial value problem
y' = 1/x,   y = y0 at x = x0.
Solution 2.5. The function f(x, y) = 1/x has a discontinuity at x = 0 and, therefore, a solution does not necessarily exist⁵ when x0 = 0, and it may not be possible to extend a solution with x0 < 0 to positive values of x, or a solution with x0 > 0 to negative values of x.
We can solve the problem by direct integration:

y' = 1/x   ⇒   y = ∫ (1/x) dx.
We split the integration into two parts.

x > 0:  ∫ (1/x) dx = log x + C.

x < 0:  let u = −x, so du = −dx, and ∫ (1/x) dx = ∫ (1/u) du = log u + C = log(−x) + C.
Thus, y = log |x| + C and applying the initial conditions gives
y0 = log|x0| + C   ⇒   C = y0 − log|x0|,

⇒   y = y0 + log|x| − log|x0| = y0 + log(|x|/|x0|) = y0 + log|x/x0|.
Any solution is undefined (does not exist) when x = 0, which means that the solutions cannot pass smoothly through x = 0. In addition, there is no solution at all when x0 = 0, because the argument of the logarithm is then undefined, which is entirely consistent with the implications of the theorem.
It is also simple to verify that non-differentiability of f with respect to y can lead to non-uniqueness, but, of course, that does not constitute a proof of the theorem.
Example 2.6. Non-uniqueness in an initial value problem
Find solutions to the initial value problem
y' = 2√|y|, where y(0) = 0.   (2.9)
Solution 2.6. The function f(x, y) = 2√|y| is not differentiable at y = 0, which means that ∂f/∂y does not even exist at y = 0, so it is certainly not continuous there. Hence, Theorem 2.1 cannot guarantee uniqueness near y = 0.
⁵ Note that the conditions of the theorem are sufficient, but not necessary: failure of the continuity criterion does not necessarily imply non-existence of a solution.
We can solve the equation by separation, but must split the calculation into two parts to handle
the modulus function.
y > 0: |y| = y, and equation (2.9) becomes y' = 2√y. Separating variables and integrating gives

∫ 1/(2√y) dy = ∫ 1 dx   ⇒   √y = x + C1.

y < 0: |y| = −y, and equation (2.9) becomes y' = 2√(−y). Separating variables and integrating gives

∫ 1/(2√(−y)) dy = ∫ 1 dx   ⇒   −√(−y) = x + C2.
The initial condition states that y = 0 when x = 0. Thus, C1 = C2 = 0 and we obtain the solution
sgn(y)√|y| = x   ⇒   y = sgn(x) x² = x|x|,   (2.10)
where sgn(y) is the sign function:

sgn(y) = 1 for y > 0,   −1 for y < 0,   0 for y = 0.
The constant function y = 0 is also a solution of the initial value problem (2.9) because y' = 0 = 2√|0| and y(0) = 0. We have shown that the initial value problem does indeed have two possible solutions:

y = 0 and y = x|x|.
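The non-uniqueness can be checked mechanically: both candidate solutions satisfy the ODE at every sample point. This sketch is in Python rather than the course's MATLAB, and writes the parabolic branch as y = x|x| (which reduces to y = x² for x ≥ 0); the helper names are illustrative.

```python
import math

def f(x, y):
    """Right-hand side of the IVP (2.9): y' = 2*sqrt(|y|)."""
    return 2.0 * math.sqrt(abs(y))

def satisfies_ode(y, dy, xs, tol=1e-9):
    """True if dy(x) == f(x, y(x)) at every sample point."""
    return all(abs(dy(x) - f(x, y(x))) < tol for x in xs)

xs = [-2.0, -0.5, 0.0, 0.5, 2.0]

# y(x) = 0:     y' = 0     and  2*sqrt(|0|)    = 0
assert satisfies_ode(lambda x: 0.0, lambda x: 0.0, xs)

# y(x) = x|x|:  y' = 2|x|  and  2*sqrt(|x|x||) = 2|x|
assert satisfies_ode(lambda x: x * abs(x), lambda x: 2.0 * abs(x), xs)

# Both solutions also satisfy y(0) = 0, so the IVP really is non-unique.
```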
2.3 Numerical Solutions
We have already seen how to use a computer to generate the direction field for a first-order ODE,
see §2.2.4, but how can we use a computer to plot the integral curves? In particular, for a given
initial value problem of the form (2.8)
y' = f(x, y), and y(x0) = y0,
we seek an approximation to the solution y(x). The MATLAB plot command, and other plotting
packages, connect a set of discrete points by straight-line segments. For the function y(x), these
discrete points are pairs of values (xn , yn ), where yn ≈ y(xn ) and is called a numerical approximation to the function y at the point xn . The values of yn correspond to a chosen set of points
in the domain of the independent variable, x0 < x1 < . . . < xN . If we have an analytic expression
for the solution y(x), then it is easy to generate the pairs (xn , yn ) = (xn , y(xn )), but the aim of
numerical solution methods is to generate these pairs from the ODE directly, without the need for
an exact solution.
2.3.1 Euler's Method
The idea underlying Euler’s Method is to start at the point given by the initial condition (x0 , y0 )
and then to find the value of y1 ≈ y(x1 ) by using a straight-line approximation to y(x) in the
interval [x0, x1]. The obvious approximation is the tangent to y(x) at (x0, y0), i.e. the line with gradient given by y' at (x0, y0). Hence,
y1 = y0 + y'(x0)(x1 − x0) = y0 + f(x0, y0)(x1 − x0).   (2.11)

The method proceeds by repeating the process: start at (x1, y1) and approximate the function y(x) in the interval [x1, x2] by the tangent to y(x) at (x1, y1), and so on. The complete set of points {yn} can be generated from the equations

y0 = C, a given initial value,
yn+1 = yn + f(xn, yn)(xn+1 − xn).   (2.12)
It is very common to choose an equally-spaced set of points {xn}, i.e. the distance between each point is the same, xn+1 − xn = h, say⁶. Then, Euler's method can be, and often is, written in the form

yn+1 = yn + h f(xn, yn).   (2.13)

The constant spacing between points, h, is called the step-size.
Example 2.7. Euler’s method
Use Euler's method with a step-size h = 0.5 to determine an approximate solution to the initial value problem (2.14) in the domain x ∈ [0, 3]:

y' = x, and y(0) = 0.   (2.14)
Solution 2.7. Firstly, a step-size of h = 0.5 in the domain x ∈ [0, 3] means that the chosen set of
points in x will be {0, 0.5, 1, 1.5, 2, 2.5, 3}. The easiest way to lay out the calculation is in a table.
n    xn    yn      f(xn, yn)    yn + h × f(xn, yn) = yn+1
0     0    0       0            0    + 0.5 × 0    = 0
1    0.5   0       0.5          0    + 0.5 × 0.5  = 0.25
2     1    0.25    1            0.25 + 0.5 × 1    = 0.75
3    1.5   0.75    1.5          0.75 + 0.5 × 1.5  = 1.5
4     2    1.5     2            1.5  + 0.5 × 2    = 2.5
5    2.5   2.5     2.5          2.5  + 0.5 × 2.5  = 3.75
6     3    3.75    3            3.75 + 0.5 × 3    = 5.25
Note that the value of yn is copied from the yn+1 column of the row above. The pairs (xn , yn ) are
connected by straight-lines and shown on Figure 2.4a together with the direction field corresponding
to the ODE y 0 = x. The solution “looks” good in the sense that it is always aligned with the direction
field, but, in fact, it is not that accurate. The exact solution to the initial value problem (2.14) can
be obtained by direct integration and is y = x2 /2. Points on the exact solution are shown as open
circles in Figure 2.4 and only coincide with the approximate solution at the initial point (0, 0).
The step-size of h = 0.5 means that we are approximating the exact solution, a quadratic, by very few straight-line segments, so it should not be a surprise that the answer is inaccurate.⁷ The accuracy of the approximation can be improved by increasing the number of straight-line segments, or equivalently decreasing the step-size. Figure 2.4b shows the approximate solution to (2.14) with a step-size of h = 0.01 in red and the exact solution as green markers, which lie exactly (as far as the eye can tell) on the approximate solution.

⁶ Henceforth, we shall assume that the set of points {xn} is equally spaced, unless explicitly stated otherwise.
⁷ Incidentally, the reason why the approximation "looks" good is that, by construction, each straight-line segment is always aligned with the direction field. The biggest inaccuracy occurs in the very first step, in which the base of the parabola is approximated by a straight line of zero gradient. Having moved away from the true solution, the Euler method follows another integral curve.
Figure 2.4: (a) The direction field corresponding to the ODE y' = x; the solution of the initial value problem (2.14) using Euler's method with a step size of h = 0.5 over the interval x ∈ [0, 3] (solid line); and the exact solution to (2.14), y = x²/2 (open circles). (b) The direction field corresponding to the ODE y' = x; the solution of the initial value problem (2.14) using Euler's method with a step size of h = 0.01 over the interval x ∈ [0, 3] (solid line); and the exact solution to (2.14), y = x²/2 (open circles).
Accuracy of Euler’s method
Although Euler's method seems like a sensible approach to approximate the solution of ODEs, example 2.7 shows that, when used without thought, it can give very inaccurate solutions. Quantification of the accuracy of any numerical method is extremely important, so important that it is a branch of mathematics in its own right, known as numerical analysis; see MATH20602 and, in particular for ODEs, MATH36022.
In order to say something definite, we must define what we mean by the accuracy of the method,
which leads to the concept of the (numerical) error. The error is essentially the difference between
the numerical approximation and the exact solution and will vary with the independent variable.
Ideally, we would like to represent the error by a single number, e say, and then deduce how that
number varies as we change the number of approximation points in our numerical scheme. The
precise definition of e depends very much on the problem being analysed, however.
In the present context, the numerical solution of ODEs, we can define an error as follows. Let
{xn }, 0 ≤ n ≤ N , be a set of N +1 points on the real line and let yn be the numerical approximation
to the function y(x) at the point xn . The numerical error, en , at the point xn is defined to be
the difference between the numerical approximation and the exact solution
en = yn − y(xn).   (2.15)
Example 2.8. Evaluating the numerical error
Calculate the numerical errors en corresponding to the values yn determined in example 2.7.
Solution 2.8. The exact solution to the initial value problem (2.14) is y = x2 /2. Thus, we can
present the exact solution, approximate solution and the error in another table.
n    xn    yn      y(xn)    en = yn − y(xn)
0     0    0       0         0
1    0.5   0       0.125    −0.125
2     1    0.25    0.5      −0.25
3    1.5   0.75    1.125    −0.375
4     2    1.5     2        −0.5
5    2.5   2.5     3.125    −0.625
6     3    3.75    4.5      −0.75
The error is negative because the numerical approximation is an underestimate of the true solution;
the graph of the numerical solution lies under that of the exact solution, see Figure 2.4a. By
construction, the error is zero at the initial condition (n = 0), but the magnitude of the error
increases by a constant amount after each step. The error is said to “accumulate” as more steps
are taken.
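The whole table can be reproduced in a few lines of code. The notes implement this in MATLAB below; this Python sketch, with illustrative names, is just an independent check of the hand calculation.

```python
def euler(f, x0, y0, h, nstep):
    """Euler's method (2.13): y_{n+1} = y_n + h*f(x_n, y_n)."""
    xs, ys = [x0], [y0]
    for _ in range(nstep):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

xs, ys = euler(lambda x, y: x, 0.0, 0.0, 0.5, 6)
errors = [yn - xn ** 2 / 2.0 for xn, yn in zip(xs, ys)]

print(ys)      # [0.0, 0.0, 0.25, 0.75, 1.5, 2.5, 3.75]
print(errors)  # [0.0, -0.125, -0.25, -0.375, -0.5, -0.625, -0.75]
```

The printed errors match the table: zero at the initial condition, then growing in magnitude by 0.125 per step.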
Local error analysis and the order of Euler’s method
If a numerical method is to be of any use at all, we require that the error at every point must tend to zero as the spacing between the points tends to zero, a property known as convergence. In other words, the more points we take, the more accurate the approximate solution, and if we were able to take "infinitely many" points then we should recover the exact solution.⁸
A simple way to investigate the accuracy of Euler’s method is to consider what happens if we
replace the approximation yn by the exact solution y(xn ). For a constant step-size h, the equation
(2.13) becomes
y(xn+1) = y(xn) + h f(xn, y(xn)).   (2.16)
Assuming that y is sufficiently differentiable, we can approximate y(xn+1) by a Taylor series about the point xn:

y(xn+1) = y(xn + h) = y(xn) + h y'(xn) + (1/2) h² y''(xn) + O(h³) = y(xn) + h f(xn, y(xn)) + O(h²).   (2.17)
Using equation (2.17) in equation (2.16) yields
y(xn) + h f(xn, y(xn)) + O(h²) = y(xn) + h f(xn, y(xn)).   (2.18)
The difference between the two sides is called the local truncation error and is O(h²). If the O(h²) term is zero (y''(x) ≡ 0) then there is no difference between the two sides of equation (2.18) and Euler's method is said to be exact. It follows that Euler's method will be exact if y(x) is a polynomial of order one, y(x) = Ax + B. For this reason, Euler's method is called a first-order method.
A numerical method is said to be of order p if it recovers exactly every polynomial solution of order less than or equal to p. Alternatively, a method is of order p if the local truncation error, the terms that remain after substitution of the exact solution into the numerical method, is of O(h^(p+1)). The order of the numerical scheme tells us that advancing from the point xn to xn+1 introduces an error of size O(h^(p+1)), for small h. The problem with this approach is that the order only describes the local error: it assumes that we are starting from the exact solution at every step, i.e. that en = 0. After n steps of the method, however, en ≠ 0 in general. In a fixed interval the number of steps N is O(h^(−1)), and the naive expectation is that a global error measure e = Σn en ≈ N en = O(h^(−1)) × O(h^(p+1)) = O(h^p). Sadly, this argument is too simple to be true, in general⁹.

⁸ Once again, really precise definitions of these concepts require quite sophisticated mathematical tools from real, complex and functional analysis.
That said, for Euler's method with constant spacing (2.13) it can be shown that |en| ≤ Ch, where C is a constant that is independent of h, and so |en| → 0 as h → 0 for all n: the method is convergent¹⁰. The error bound also shows that we should expect the error to decrease linearly with step-size. In other words, if we wish to halve the error then we must halve the step-size (approximately).
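The claim that halving the step-size (approximately) halves the error is easy to test numerically. This Python sketch, with an illustrative helper name, measures the error at x = 3 for the test problem (2.14); for this particular problem the ratio works out to be exactly two.

```python
def euler_final_error(h):
    """Solve y' = x, y(0) = 0 on [0, 3] by Euler's method with step-size h
    and return the error at x = 3 against the exact solution y = x^2/2."""
    n = round(3.0 / h)
    x, y = 0.0, 0.0
    for _ in range(n):
        y += h * x   # Euler update (2.13) with f(x, y) = x
        x += h
    return y - 4.5   # exact value is y(3) = 3^2/2 = 4.5

e_coarse = euler_final_error(0.5)
e_fine = euler_final_error(0.25)
print(e_coarse, e_fine)   # -0.75 -0.375: halving h halves the error
```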
Plotting integral curves computed by Euler’s method using MATLAB
Numerical methods are designed to take advantage of the fact that computers are dumb, but can do
repetitive tasks accurately and quickly. Euler’s method is “easy” (it requires only multiplication,
addition and a function evaluation for each step) but very laborious. Imagine having to calculate
one million steps by hand! It is much better, and much more fun, to get the computer to do the
boring work.
As in example 2.7, we shall consider the solution to the initial value problem (2.14)
y' = x, and y(0) = 0.
If we create two vectors x and y that represent the sets {xn } and {yn }, then the integral curve
can then be plotted by using the command
>> plot(x,y)
The initial condition will be the first entry in each array and the first entry in a MATLAB array
has the index 1, so, confusingly, x(1) = x0 and y(1) = y0 ,
>> x(1) = 0; y(1) = 0;
If the next point is x1 = 0.5, we could create the next entries in the array by the commands
>> x(2) = x(1) + 0.5
x =
         0    0.5000
>> y(2) = y(1) + 0.5*x(1)
y =
     0     0

⁹ It usually fails because the (local) order of a method does not say anything about its (global) convergence.
¹⁰ A proof of this statement is actually quite hard and relies on taking a Taylor-series expansion of the exact solution to relate en+1 to en, followed by careful inductive arguments starting from the fact that e0 = 0 by definition.
but we would have to repeat this process for each step, which is as laborious as doing it by hand.
Instead we can (and should) make use of a for loop to write
>> for n=1:6
x(n+1) = x(n) + 0.5;
y(n+1) = y(n) + 0.5*x(n);
end
The first line in the for loop increases the value of x by h = 0.5 and the second is a direct
implementation of Euler’s method, equation (2.13), where f (x, y) = x and the (constant) step-size
is 0.5. The loop will run six times, n=1:6, and populates the vectors with the data shown in solution
2.7.
>> x
x =
         0    0.5000    1.0000    1.5000    2.0000    2.5000    3.0000
>> y
y =
         0         0    0.2500    0.7500    1.5000    2.5000    3.7500
Once again, we can and should generalise the process by writing a more general MATLAB
function
function euler(func,xinit,yinit,h,nstep)
%Function that plots an integral curve calculated using Euler’s method
%   func: a function of two variables such that y' = func(x,y)
% xinit: the initial value of the independent variable, x0
% yinit: the initial value of the dependent variable, y0
%      h: the step-size
% nstep: the number of steps
x(1) = xinit; y(1) = yinit;
for n=1:nstep
x(n+1) = x(n) + h;
y(n+1) = y(n) + h*func(x(n),y(n));
end
%Plot the integral curve in red
plot(x,y,’r’);
If the above commands are saved in the file euler.m, then the integral curve in Figure 2.4a can be
generated by the commands
>> euler(@(x,y) x,0,0,0.5,6)
The integral curve shown in Figure 2.4b was generated by
>> euler(@(x,y) x,0,0,0.01,300)
2.3.2 Explicit vs implicit methods
Euler’s method is known as an explicit method because it gives an explicit expression for the value
of the solution at a point yn+1 as a function of the solution at the previous point yn . In other words,
the formula (2.13)
yn+1 = yn + hf (xn , yn )
can be written in the form
yn+1 = F (xn , yn ).
(2.19)
The explicit nature of the method arises because we approximate the gradient in the region [xn, xn+1] by its value at the left-hand end, f(xn, yn).
An alternative would be to approximate the gradient by its value at the right-hand end of the
interval, f (xn+1 , yn+1 ), which leads to the backward Euler method
yn+1 = yn + hf (xn+1 , yn+1 ).
(2.20)
The backward Euler method is an implicit method because for a general f (x, y), it cannot be
written in the form (2.19). Instead, the method is of the general form
yn+1 = F (xn+1 , yn+1 , xn , yn ).
The use of implicit methods requires the solution of a non-linear algebraic equation at each step, e. g. using Newton’s method, so they are more difficult to program and more expensive per step. Why use them at all? The answer is that implicit methods often give accurate results for large steps, so fewer steps are required in total.
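A minimal sketch of this trade-off (an illustrative example, not from the notes) contrasts the two methods on the stiff test problem ẏ = −50y, y(0) = 1, with the deliberately large step h = 0.1. For this linear f, the backward Euler update can be solved exactly; a general f would need Newton's method at each step.

```python
# forward (explicit) Euler: y_{n+1} = y_n + h*f(x_n, y_n)
def forward_euler_step(y, h, lam=-50.0):
    return y + h * lam * y

# backward (implicit) Euler: y_{n+1} = y_n + h*f(x_{n+1}, y_{n+1});
# here f = lam*y is linear, so the implicit equation solves in closed form
def backward_euler_step(y, h, lam=-50.0):
    return y / (1.0 - h * lam)

ye = yi = 1.0
for n in range(20):
    ye = forward_euler_step(ye, 0.1)
    yi = backward_euler_step(yi, 0.1)

print(abs(ye))  # explicit iterate is amplified by |1 - 50*0.1| = 4 per step
print(abs(yi))  # implicit iterate decays towards the true solution, 0
```

The explicit iterate blows up while the implicit one tracks the decaying exact solution, which is exactly why implicit methods tolerate large steps.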
2.3.3 Higher-order methods
We have seen that Euler’s method is only exact for linear functions and the solution of a general
ODE is often far from linear. Extremely small step sizes are required to compute accurate solutions
to general ODEs by Euler’s method, which means that the computations can take a very long time.
For these reasons, Euler’s method is not that widely used. Instead, higher-order methods have been
developed. These methods have smaller local truncation errors than Euler’s method, which means
that even if larger step-sizes are taken the error does not grow as fast. A large number of different
methods have been proposed, each with different strengths and weaknesses, see e. g. Chapter 19 in
“Schaum’s Outline of Differential Equations”.
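As an illustration (a sketch, not from the notes), the second-order Heun method, one of the standard higher-order schemes, can be compared with Euler's method on ẏ = y, y(0) = 1, whose exact solution at x = 1 is e:

```python
import math

def euler(f, x0, y0, h, nstep):
    x, y = x0, y0
    for _ in range(nstep):
        y += h * f(x, y)
        x += h
    return y

def heun(f, x0, y0, h, nstep):
    # predictor-corrector: an Euler predictor followed by a
    # trapezoidal corrector, giving a second-order method
    x, y = x0, y0
    for _ in range(nstep):
        ypred = y + h * f(x, y)
        y += 0.5 * h * (f(x, y) + f(x + h, ypred))
        x += h
    return y

f = lambda x, y: y
err_euler = abs(euler(f, 0.0, 1.0, 0.1, 10) - math.e)
err_heun = abs(heun(f, 0.0, 1.0, 0.1, 10) - math.e)
print(err_euler, err_heun)  # Heun's error is much smaller for the same h
```

For the same step-size, the higher-order method is markedly more accurate, which is the point of the section above.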
2.4 Exact (analytic) Solutions: Linear Equations
There are many different techniques that can be used to solve first-order ODEs exactly. The
graphical and numerical methods discussed in the previous sections are quite general, but the
applicability of analytic methods can depend crucially on the properties of the ODE. For example,
we can only use the method of separation, §2.1.2, if the ODE is separable. One of the most important
properties in this regard is whether or not the equation is linear.
A first-order ordinary differential equation is linear if it can be written in the form
y' + p(x) y = r(x),    (2.21)
where p(x) and r(x) are functions of the independent variable. The equation (2.21) is the same as
the general definition of a linear ODE (1.8), with n = 1, a1 (x) = 1, (always possible without loss of
generality if the equation is first-order), a0 (x) = p(x) and g(x) = r(x).
The equation (2.21) is often written in the form
Ly = r(x),    where L ≡ d/dx + p(x),    (2.22)
and L is called a linear differential operator.
In fact, as we shall see below, it is possible to write down an explicit expression for the solution
to (nearly) any first-order linear ODE, although the expression might be impossible to simplify
nicely.
2.4.1 Solution using an integrating factor
Example 2.9. A non-separable, first-order, linear ODE
Find the general solution to the ODE
y' + 5y = 1.    (2.23)
Solution 2.9. Equation (2.23) is not separable because the dependence on y cannot be made to
occur only in the y' term. If we divide by y, then there will be a term 1/y on the right-hand side.
Instead, we “notice” the cunning trick that if we multiply equation (2.23) by e^{5x} then

e^{5x} y' + 5 e^{5x} y = e^{5x}   ⇒   (d/dx)(e^{5x} y) = e^{5x},    (2.24)
by the product rule. We can now integrate both sides of equation (2.24) with respect to x
e^{5x} y = (1/5) e^{5x} + C,

where C is an arbitrary constant of integration. The final solution is obtained on division by e^{5x}:

y = 1/5 + C e^{−5x}.    (2.25)
The expression (2.25) is called the general solution of equation (2.23) because it still contains a
free constant C and so represents the entire family of integral curves.
The general case
In example 2.9 we were able to find a solution after multiplying by the function e^{5x}, which made it possible to integrate the resulting equation directly. The solution relied on us being able to “spot” that multiplying by the specific function e^{5x} would allow us to perform the integration. It would be better if we had a more constructive method.
The idea is to seek a function q(x) such that if we multiply the entire equation (2.21) by q(x)
then we can write the left-hand side as a perfect derivative. In other words
q(x) y' + q(x) p(x) y = (d/dx)[q(x) y(x)] = q(x) y' + q'(x) y.    (2.26)
It follows that
q'(x) = q(x) p(x),   ⇒   p(x) = q'(x)/q(x);    (2.27)
we can rule out the pathological case q(x) ≡ 0 because that would mean that we would have
multiplied the equation (2.21) by zero to give 0 = 0; a true statement, but rarely very helpful.
Equation (2.27) is separable and integrating once we obtain
log q = ∫^x p(u) du ≡ P(x),    (2.28)
where u is a dummy variable of integration. Taking the exponential of equation (2.28) gives
q(x) = e^{P(x)} = e^{∫^x p(u) du},    (2.29)
where q(x) is known as the integrating factor. The name arises because if we multiply equation
(2.21) by the factor q(x) then we can integrate the resulting equation directly.
We must multiply both sides of equation (2.21) by q(x), which gives
(d/dx)[y e^{P(x)}] = r(x) e^{P(x)}.    (2.30)

Integrating once yields

y e^{P(x)} = C + ∫^x r(u) e^{P(u)} du,
where C is a constant of integration and, again, u is a dummy variable of integration. Dividing through by e^{P(x)} we obtain an explicit expression for the solution

y(x) = C e^{−P(x)} + e^{−P(x)} ∫^x r(u) e^{P(u)} du = y_c(x) + y_p(x).    (2.31)
The term y_c(x) = C e^{−P(x)} is known as the complementary solution because Ly_c = 0. By linearity L(K y_c) = K Ly_c = 0, which means that y_c can be multiplied by any constant K without changing its behaviour when operated on by L. The integral term, y_p(x), contains no free constants and is called a particular integral11 because Ly_p = r(x). The equation (2.30) is called the self-adjoint form of the equation (2.21).
Example 2.10. Using the general formula (2.31)
Find the general solution to the ODE (2.23)
y' + 5y = 1,
by using the general formula (2.31).
11
The solution of linear ordinary differential equations of any order can always be split into complementary solutions and particular integrals, see §3.2.3.
Solution 2.10. On comparison with the general form of a first-order linear ODE (2.21), y' + p(x) y = r(x), it follows that p(x) = 5 and r(x) = 1. Using the definition (2.28), we have

P(x) = ∫^x 5 du = 5x,
and the general solution follows from equation (2.31)
y = C e^{−5x} + e^{−5x} ∫^x e^{5u} du,   ⇒   y = C e^{−5x} + e^{−5x} × (1/5) e^{5x} = C e^{−5x} + 1/5.    (2.32)

The solution agrees with our previous result (2.25) and we note that 1/5 is the particular integral and C e^{−5x} is the complementary solution.
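A quick numerical sanity check (a sketch, not part of the notes): with C chosen so that y(0) = 0, the solution (2.25)/(2.32) should satisfy y' + 5y = 1 everywhere.

```python
import math

def y(x, C=-0.2):   # C = -1/5 enforces y(0) = 0 in y = 1/5 + C*exp(-5x)
    return 0.2 + C * math.exp(-5 * x)

def residual(x, h=1e-6):
    # centred finite-difference approximation of y', then y' + 5y - 1
    yprime = (y(x + h) - y(x - h)) / (2 * h)
    return yprime + 5 * y(x) - 1

print(all(abs(residual(x)) < 1e-4 for x in [0.0, 0.3, 1.0]))  # True
```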
Example 2.11. A less-trivial example of using integrating factors
Find the general solution of the ordinary differential equation
(x² + 1) y' − (1 − x)² y = x e^{−x},    (2.33a)
and, hence, find the solution of (2.33a) that satisfies the initial condition
y(0) = 0.
(2.33b)
Solution 2.11. Dividing equation (2.33a) by (x² + 1), gives

y' − [(1 − x)²/(1 + x²)] y = [x/(1 + x²)] e^{−x},

which is in the form (2.21) where p(x) = −(1 − x)²/(1 + x²). Hence, by equation (2.28),
P(x) = −∫ (1 − x)²/(1 + x²) dx = −∫ (1 − 2x + x²)/(1 + x²) dx = −∫ [1 − 2x/(1 + x²)] dx
     = −x + log(1 + x²).    (2.34)
Then

e^{P(x)} = e^{−x+log(1+x²)} = e^{−x} e^{log(1+x²)} = (1 + x²) e^{−x}.

The self-adjoint form of equation (2.33a) is

[y (1 + x²) e^{−x}]' = [x/(1 + x²)] e^{−x} (1 + x²) e^{−x} = x e^{−2x}.
Integrating once, gives

y (1 + x²) e^{−x} = C − (1/2)(x + 1/2) e^{−2x},

and dividing by (1 + x²) e^{−x} we obtain the general solution

y = [C e^x − (1/2)(x + 1/2) e^{−x}] / (1 + x²).
The solution that satisfies the initial condition (2.33b) must have

0 = C − 1/4,   ⇒   C = 1/4,

so

y = [(1/4) e^x − (1/2)(x + 1/2) e^{−x}] / (1 + x²) = (sinh x − x e^{−x}) / (2(1 + x²)).    (2.35)
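A numerical check (a sketch, not part of the notes) that equation (2.35) satisfies both the ODE (2.33a) and the initial condition (2.33b):

```python
import math

def y(x):
    # equation (2.35): y = (sinh x - x e^{-x}) / (2(1 + x^2))
    return (math.sinh(x) - x * math.exp(-x)) / (2 * (1 + x * x))

def residual(x, h=1e-6):
    # residual of (x^2 + 1) y' - (1 - x)^2 y = x e^{-x},
    # with y' approximated by a centred finite difference
    yprime = (y(x + h) - y(x - h)) / (2 * h)
    return (x * x + 1) * yprime - (1 - x) ** 2 * y(x) - x * math.exp(-x)

print(y(0.0))                                                 # 0.0
print(all(abs(residual(x)) < 1e-6 for x in [0.5, 1.0, 2.0]))  # True
```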
2.4.2 Existence and uniqueness revisited
The explicit construction (2.31) gives a solution to any linear, first-order ordinary differential equation; and for a given initial condition that solution will be unique. Hence, we can prove a stronger
theorem than Picard’s existence theorem, 2.1, when the equation is linear.
Theorem 2.2. If the functions p(x) and r(x) are continuous on an open interval x ∈ (a, b), then
there exists a unique solution y = y(x) to the initial value problem
y' + p(x) y = r(x),   and   y(x0) = y0,   where x0 ∈ (a, b),
in the interval a < x < b.
The continuity requirements on p(x) and r(x) guarantee that the integrals in equation (2.31)
actually exist.
2.4.3 Solution by power series
Example 2.12. A power-series solution
Find the (unique) solution to the initial value problem
y' − 2xy = 0,   where y(0) = 1,    (2.36)
by assuming that the solution can be written as a power series expansion about x = 0
y(x) = Σ_{n=0}^∞ a_n x^n,    (2.37)

where the a_n are constants.
Solution 2.12. Assuming that the power-series converges we can differentiate equation (2.37)
term-by-term to obtain
y' = Σ_{n=0}^∞ n a_n x^{n−1} = Σ_{n=1}^∞ n a_n x^{n−1}.    (2.38)
Substituting the series (2.37) and (2.38) into the ordinary differential equation (2.36) gives
Σ_{n=1}^∞ n a_n x^{n−1} − 2x Σ_{n=0}^∞ a_n x^n = Σ_{n=1}^∞ n a_n x^{n−1} − 2 Σ_{n=0}^∞ a_n x^{n+1} = 0,

which may be written in expanded form as

a_1 x^0 + {2a_2 − 2a_0} x + {3a_3 − 2a_1} x² + · · · + {(n + 1)a_{n+1} − 2a_{n−1}} x^n + · · · = 0.    (2.39)
The equation (2.39) must be satisfied for every value of x, which is possible if (and only if) the
coefficient of every power of x is zero12 . Setting the coefficient of each power of x to zero gives
x^0 :  a_1 = 0,    (2.40a)
x^1 :  2a_2 − 2a_0 = 0   ⇒   a_2 = a_0,    (2.40b)
  ⋮
x^n :  (n + 1) a_{n+1} − 2a_{n−1} = 0   ⇒   a_{n+1} = (2/(n + 1)) a_{n−1}.    (2.40c)

12 A consequence of the “deep” orthogonality property of polynomials.
The equation (2.40c) is known as a recurrence relation because it can be used recursively to find the values of the coefficients {a_n}, provided that initial values are known. From equation (2.40a), a_1 = 0 and by the recurrence relation (2.40c) all odd coefficients are zero: a_1 = a_3 = a_5 = · · · = 0.
Using equation (2.40b) with the recurrence relation allows the calculation of the even coefficients
(n = 1) :  a_2 = a_0,    (n = 3) :  a_4 = (2/4) a_2 = (1/2) a_0,    (n = 5) :  a_6 = (2/6) a_4 = (1/6) a_0 = (1/3!) a_0,

and, in fact, a_{2n} = (1/n!) a_0. The original power-series expression for (2.37) is, therefore,

y(x) = a_0 [1 + x² + (1/2!) x⁴ + (1/3!) x⁶ + · · ·] = a_0 Σ_{n=0}^∞ (1/n!) x^{2n} = a_0 e^{x²}.
Finally, the initial condition y(0) = 1 gives a0 = 1 and so the solution is
y(x) = e^{x²}.
Note that the initial condition can be applied directly to the power-series representation. Evaluating
(2.37) at x = 0 gives
y(0) = Σ_{n=0}^∞ a_n 0^n = a_0,

because 0^0 = 1, but all other powers of zero are zero. Thus, in most cases, a_0 is set by the initial
condition.
In fact, the equation (2.36) is separable, or it can be integrated directly after multiplication by the integrating factor e^{−x²}; either way, the exact solution that satisfies the initial condition y(0) = 1 is y = e^{x²}, in agreement with the power-series method. The power-series method should be used
with caution because it is strictly only valid in the region near x = 0 in which the series converges.
In the present example, the series converges everywhere and the method gives the global solution,
but, in general, it will give only a local approximation to the solution.
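The recurrence (2.40c) can be checked directly (a sketch, not part of the notes): building the coefficients from a_0 = 1, a_1 = 0 and summing the series should reproduce the closed form e^{x²}.

```python
import math

# build a_0, ..., a_N from the recurrence a_{n+1} = 2 a_{n-1} / (n + 1)
N = 40
a = [0.0] * (N + 1)
a[0], a[1] = 1.0, 0.0
for n in range(1, N):
    a[n + 1] = 2.0 * a[n - 1] / (n + 1)

x = 0.7
partial_sum = sum(a[n] * x ** n for n in range(N + 1))
print(abs(partial_sum - math.exp(x * x)) < 1e-12)  # True: series -> e^{x^2}
```

The even coefficients come out as a_{2n} = 1/n!, exactly as derived above.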
Example 2.13. A “forced” case
Find the (unique) solution to the initial value problem
y' − 2xy = x²,   where y(0) = 1,    (2.41)
by assuming that the solution can be written as a power series expansion about x = 0.
Solution 2.13. The equation (2.41) is very similar to equation (2.36), but there is an additional
“forcing” term of x2 on the right-hand side. Despite this, the method of solution remains exactly
the same as in example 2.12. Substitution of the series (2.37) and (2.38) into equation (2.41) and
subtracting x² from both sides gives

Σ_{n=1}^∞ n a_n x^{n−1} − 2 Σ_{n=0}^∞ a_n x^{n+1} − x² = 0.    (2.42)
The equation (2.42) must be satisfied for every value of x and, again, this is only possible if the
coefficient of every power of x is zero. Setting coefficients of each power of x to zero gives
x^0 :  a_1 = 0,    (2.43a)
x^1 :  2a_2 − 2a_0 = 0   ⇒   a_2 = a_0,    (2.43b)
x^2 :  3a_3 − 2a_1 − 1 = 0   ⇒   a_3 = 1/3,    (2.43c)
  ⋮
x^n :  (n + 1) a_{n+1} − 2a_{n−1} = 0   ⇒   a_{n+1} = (2/(n + 1)) a_{n−1},  for n ≥ 3.    (2.43d)
The recurrence relation (2.43d) is the same as in example 2.12 and the initial condition is the same, y(0) = 1 ⇒ a_0 = 1, so the even coefficients a_0, a_2, a_4, . . . are completely unaffected by the forcing term. Thus, the solution will consist of the even terms, still giving e^{x²}, but with the addition of any odd terms that are “triggered” by the forcing term.
The forcing term appears in the coefficient of x2 , equation (2.43c), which implies that a3 = 1/3;
and the use of the recurrence relation for the subsequent odd coefficients gives
a_5 = (2/5) a_3 = 2/15,    a_7 = (2/7) a_5 = 4/105,    · · ·    a_{2n+1} = (1/3) · 2^{n−1} / Π_{m=2}^n (2m + 1).
In this case, there isn’t really a “nice” expression for the odd coefficients, but they can all be
evaluated, and the final solution is given by
y = e^{x²} + (1/3) x³ + Σ_{n=2}^∞ [2^{n−1} / Π_{m=1}^n (2m + 1)] x^{2n+1},    (2.44)

which consists of a complementary solution, e^{x²}, and a particular integral, the remaining mess.
The integrating factor is still e^{−x²} and on multiplication by the integrating factor equation (2.41) becomes

[y e^{−x²}]' = x² e^{−x²}   ⇒   y = C e^{x²} + e^{x²} ∫ x² e^{−x²} dx.    (2.45)
Thus, the difference between the solutions to the forced and unforced cases, the remaining mess in
equation (2.44), is the series approximation to the integral13 term in equation (2.45).
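This agreement can be checked numerically (a sketch, not part of the notes): using the error-function form of the integral from footnote 13, the closed-form solution of (2.41) with y(0) = 1 is y = e^{x²}[1 + (√π/4) erf(x)] − x/2, and it should match the series (2.44) built from the recurrence (2.43d).

```python
import math

def y_closed(x):
    # exact solution via the error function (see footnote 13)
    return math.exp(x * x) * (1 + math.sqrt(math.pi) / 4 * math.erf(x)) - x / 2

def y_series(x, nterms=30):
    # even part sums to e^{x^2}; odd part: a_3 = 1/3, then
    # a_{2n+1} = 2 a_{2n-1} / (2n + 1) from the recurrence (2.43d)
    total = math.exp(x * x)
    a = 1.0 / 3.0
    for n in range(1, nterms):
        total += a * x ** (2 * n + 1)
        a *= 2.0 / (2 * n + 3)
    return total

print(abs(y_closed(0.5) - y_series(0.5)) < 1e-12)  # True
```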
Example 2.14. A warning about series
Find the general solution to the ODE
xy' + y = 10,    (2.46)
using series solution methods.
Solution 2.14. If, as before, we pose
y = Σ_{n=0}^∞ a_n x^n   ⇒   y' = Σ_{n=0}^∞ n a_n x^{n−1},

13 You might like to know that the integral can actually be written as ∫ x² e^{−x²} dx = (√π/4) erf(x) − (1/2) x e^{−x²}, where erf(x) ≡ (2/√π) ∫_0^x e^{−ξ²} dξ is called the error function.
and substituting the above expressions into equation (2.46) yields
x Σ_{n=0}^∞ n a_n x^{n−1} + Σ_{n=0}^∞ a_n x^n = Σ_{n=0}^∞ (n + 1) a_n x^n = 10.
Gathering coefficients of powers of x implies that
a_0 = 10,   and   (n + 1) a_n = 0   ⇒   a_n = 0,  for n > 0.
The solution is then
y = 10,
which satisfies the equation (2.46), but does not contain a free constant. The equation (2.46) is
linear, so we should have a complementary solution, where is it?
The equation (2.46) can be solved directly because the left-hand side is an exact derivative:
xy' + y = (xy)' = 10.
Integrating, we obtain

xy = 10x + C,   ⇒   y = 10 + C/x;

the complementary solution C/x was not included in our series representation, so we did not find it.
This example demonstrates that a simple Taylor-series method (series with positive powers of x) only works if the solution possesses a convergent Taylor expansion in the region of interest14.

2.5 Exact Solutions: Nonlinear Equations
In contrast to linear equations, there is no general solution method to find analytic solutions of
nonlinear equations. Existence and uniqueness are harder to establish and, in general, non-linear
equations do not have unique solutions. There are many special forms of non-linear equations
for which analytic solutions are possible, however, e. g. separable equations can be non-linear,
see example 2.1. In fact, the majority of these solution methods use substitutions to convert the
nonlinear equation into a linear or separable equation. We shall discuss only one of the many
special forms; a more comprehensive list can be found in the “Handbook of Differential Equations”
by Zwillinger.
2.5.1 Homogeneous equations
A first-order ODE is said to be homogeneous if it can be written in the form
dy/dx = f(y/x),    (2.47)
14 The method can be generalised by considering a series expansion of the form y = Σ_{n=0}^∞ a_n x^{n+c}, where a_0 ≠ 0 by choice and the value c must be determined as part of the solution. A power-series with negative powers of x is called a Laurent series.
for some function f . Examples are shown below (some rearrangement may be necessary):
dy/dx = y/x,    (2.48a)
dy/dx = (2x³y − y⁴)/(x⁴ − 2xy³),    (2.48b)
x² dy/dx + y² − xy = 0.    (2.48c)
Equation (2.47) can be solved by means of the substitution z(x) = y(x)/x; i. e. y(x) = xz(x).
Differentiating with respect to x and using the product rule gives

dy/dx = (d/dx)[x z(x)] = z(x) + x dz/dx,

and so equation (2.47) becomes

z(x) + x dz/dx = f(y/x) = f(z),    (2.49)
which is now separable

dz/(f(z) − z) = dx/x   ⇒   ∫ dz/(f(z) − z) = ∫ dx/x.    (2.50)
Equation (2.50) can be evaluated provided, of course, that we can perform the integral in z.
Example 2.15. Solving a homogeneous equation
Find the general solution of the ODE (2.48c)

x² dy/dx + y² − xy = 0.
Solution 2.15. First, we put the equation (2.48c) into homogeneous form on division by x² (we assume x ≠ 0)

dy/dx + y²/x² − y/x = 0.    (2.51)
Now, we let z = y/x, so that dy/dx = z + x dz/dx and equation (2.51) becomes

z + x dz/dx + z² − z = 0   ⇒   x dz/dx = −z².
Dividing both sides by −xz² and integrating with respect to x gives

−∫ dz/z² = ∫ dx/x   ⇒   1/z = log |x| + C.
Returning to the original variables we obtain our final solution

1/z = x/y = log |x| + C   ⇒   y = x/(log |x| + C).    (2.52)
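A numerical check of (2.52) (a sketch, not part of the notes; the value C = 2 is an arbitrary choice of the constant):

```python
import math

def y(x, C=2.0):
    # general solution (2.52): y = x / (log|x| + C)
    return x / (math.log(abs(x)) + C)

def residual(x, h=1e-6):
    # residual of x^2 y' + y^2 - x y = 0 with y' from a centred difference
    yprime = (y(x + h) - y(x - h)) / (2 * h)
    return x * x * yprime + y(x) ** 2 - x * y(x)

print(all(abs(residual(x)) < 1e-6 for x in [0.5, 1.0, 3.0]))  # True
```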
Example 2.16. Solving a homogeneous equation including boundary conditions
Find the solution of the initial value problem
xy dy/dx + x² + y² = 0,   where y(1) = 1.    (2.53)
Solution 2.16. We divide through by yx (y ≠ 0, x ≠ 0) to put the equation into homogeneous form

dy/dx + x/y + y/x = 0,

and using the usual substitution z = y/x we obtain

z + x dz/dx + 1/z + z = 0   ⇒   x dz/dx = −(1 + 2z²)/z.    (2.54)
We could now separate variables and integrate, keep the arbitrary constant of integration and find its value by using the initial condition. Instead, however, we shall build the initial condition into the limits of our integration. For the lower limit, we let x = 1 for which y = 1 and so z = y/x = 1; we leave the upper limit unknown as x or z and, to avoid confusion, we introduce two dummy integration variables ξ and ζ. The equation (2.54) becomes

∫_{ζ=1}^{z} ζ/(1 + 2ζ²) dζ = ∫_{ξ=1}^{x} −(1/ξ) dξ,
⇒   [(1/4) log(1 + 2ζ²)]_{ζ=1}^{z} = [−log ξ]_{ξ=1}^{x},
⇒   (1/4) log(1 + 2z²) − (1/4) log 3 = −log x + log 1,
⇒   log[(1 + 2z²)/3]^{1/4} = log(1/x).
Note that there is no constant of integration because we are calculating a definite integral. Exponentiating both sides of the equation gives
[(1 + 2z²)/3]^{1/4} = 1/x   ⇒   (1 + 2z²)/3 = 1/x⁴   ⇒   z² = (1/2)(3/x⁴ − 1).

Returning to the original variables we obtain the final (implicit) solution

(y/x)² = (1/2)(3/x⁴ − 1)   ⇒   y² = (1/2)(3/x² − x²).    (2.55)
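A numerical check of (2.55) (a sketch, not part of the notes): taking the positive root y = √((3/x² − x²)/2), we should recover y(1) = 1 and a zero residual in (2.53) on the interval where y is real (0 < x < 3^{1/4}).

```python
import math

def y(x):
    # positive branch of the implicit solution (2.55)
    return math.sqrt((3 / x ** 2 - x ** 2) / 2)

def residual(x, h=1e-6):
    # residual of x y y' + x^2 + y^2 = 0 with y' from a centred difference
    yprime = (y(x + h) - y(x - h)) / (2 * h)
    return x * y(x) * yprime + x * x + y(x) ** 2

print(abs(y(1.0) - 1.0) < 1e-12)                              # True
print(all(abs(residual(x)) < 1e-5 for x in [0.8, 1.0, 1.2]))  # True
```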
2.6 Autonomous equations
In my opinion, the concepts introduced in this section are easier to understand if the independent
variable is time, t, rather than space x. The underlying mathematical structure is, of course, the
same, but the physical analogies are better when we think of quantities varying in time rather than
space.
A first-order ODE is autonomous if there is no explicit dependence on the independent variable
and can, therefore, be written in the form
ẏ = f (y),
(2.56)
where ẏ = dy/dt.
A direct consequence of the autonomy of the system is that the “shape” of the solution is
independent of t. In other words, if y1 (t) is a solution of equation (2.56) then another solution is
given by y2 (t) = y1 (t + c) for any real constant c. A simple proof is given below
Let τ = t + c,   then y2(t) = y1(τ),

and differentiating with respect to t gives

ẏ2 = (d/dt) y1(τ) = (dy1(τ)/dτ)(dτ/dt),   by the chain rule;
   = dy1(τ)/dτ,   because dτ/dt = 1;
   = f(y1(τ)),   because y1(τ) is a solution of (2.56);
   = f(y2(t)).
Hence, y2 (t) also satisfies equation (2.56). In other words, the exact value of the time doesn’t matter
for the solution; e. g. if you drop ball from a given height, the path of the ball and the time it takes
to hit the ground won’t depend on whether you release the ball at 12:00am on a Tuesday or 1:00pm
on a Friday15 .
2.6.1 The phase portrait
In an autonomous system, the lack of any explicit dependence on t means that the complete behaviour of the system is determined only by y and dy/dt. One useful way of representing the
behaviour of the system is to plot ẏ as a function of y, which is known as a phase portrait; hence,
the ẏ-y plane is called the phase plane. For equations in the form (2.56), ẏ = f (y), the phase portrait
is simply a graph of f (y) as a function of y.
Example 2.17. Plotting a phase portrait
Plot the phase portrait of the ordinary differential equation
ẏ = y.
(2.57)
Solution 2.17. The phase portrait is a straight line passing through the origin with gradient one,
see Figure 2.5.
Interpreting the phase portrait
A lot of information about the solutions of the ODE can be obtained by interpreting the phase
portrait. In example 2.17, if our initial condition16 is such that y > 0, then ẏ > 0. A positive value
of ẏ means that y will increase with time and the solution moves “to the right” in the phase portrait
as t increases. As y increases, ẏ also increases, see Figure 2.5, and so the solution moves “faster and
15 Assuming that physical conditions like wind, air density, etc. don’t change.
16 The precise time at which this condition is applied does not matter; that’s the beauty of autonomous systems.

Figure 2.5: Phase portrait corresponding to the ODE ẏ = y.
faster” and always to the right. On the other hand if the initial condition is such that y < 0, then
ẏ < 0 and our solution decreases in value as time increases moving further and further “to the left”
in the phase portrait. The above argument demonstrates that if y(ti ) > 0, for some initial time ti ,
then y → ∞ as t → ∞ and when y(ti ) < 0, y → −∞ as t → ∞. Finally, if the initial condition is
such that y = 0, then ẏ = 0 and the solution cannot change, i. e. if y(ti ) = 0, then y(t) = 0 for all
t > ti .
The above analysis of the phase portrait reveals all the qualitative information about the possible
dynamics of the system without the need to find an explicit solution. In this case, we know that
the general solution is y = Ae^t, which does exhibit the appropriate qualitative behaviour:

Ae^t → ∞ (A > 0),   Ae^t → 0 (A = 0),   Ae^t → −∞ (A < 0),   as t → ∞.
2.6.2 Fixed points
Any point for which ẏ = 0 is called a fixed point, or equilibrium point, of the system because
once at that point the value of the solution cannot ever change. In example 2.17, the point y = 0
was the only fixed point of the system17 .
Example 2.18. Finding fixed points
Find the fixed points of the ODE
ẏ = y³ − 6y² + 11y − 6.    (2.58)
Solution 2.18. Fixed points occur when ẏ = 0, i. e.
y³ − 6y² + 11y − 6 = 0.    (2.59)
We can factorise the cubic (2.59) by spotting that y = 1 is a solution (or otherwise)
(y − 1)(y² − 5y + 6) = (y − 1)(y − 2)(y − 3) = 0,
and so the fixed points are located at y = 1, y = 2 and y = 3.
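A quick check (a sketch, not part of the notes) that these are the only integer zeros of the cubic in (2.59):

```python
# fixed points of (2.58) are the roots of y^3 - 6y^2 + 11y - 6 = 0
def f(y):
    return y ** 3 - 6 * y ** 2 + 11 * y - 6

fixed_points = [y for y in range(-10, 11) if f(y) == 0]
print(fixed_points)  # [1, 2, 3]
```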
17 Actually that’s not quite true: there is also a fixed point “at infinity”, which can be analysed by means of the transformation y = 1/z ⇒ ẏ = −ż/z². The ODE (2.57) in example 2.17 becomes −ż/z² = 1/z ⇒ ż = −z, which has a fixed point at z = 0, corresponding to y → ∞.
Interpretation of fixed points
If a system ever reaches a fixed point then it will stay there for all time (by definition), so, in some
sense, the fixed points represent all possible final destinations of the system. Once we know the
fixed points of a system its behaviour can be completely characterised if we can answer the following
question:
For a given initial condition, which, if any, of the fixed points will the system approach as t → ∞?
The answer to this question can be found by examination of the phase portrait. In example 2.17, for
y 6= 0 the solution moves away from the fixed point at y = 0 and so the system always approaches
the “fixed point at infinity” unless y = 0.
2.6.3 Phase-plane analysis and mathematical modelling
The interpretation of the phase portrait and fixed points of an ordinary differential equation can
be very useful when constructing mathematical models. If the expected qualitative behaviour is
not present in the phase plane then we can reject the model without having to go to the effort of
actually finding solutions. A more dubious approach is to “design” a phase plane that exhibits the
appropriate behaviour, but this is closer to “fitting” a model to the data, rather than building a
model from underlying fundamental principles to explain the data. The “fitting” approach is not
without its uses, but the true test of a model is whether it has predictive power: can the model
predict previously unobserved (and experimentally verifiable) behaviour?
In example 1.2, we derived a simple (linear) model of population growth
Ṅ = f (N ) = αN,
but the model was rejected because it predicted unbounded growth for α > 0 (the phase portrait
is essentially the same as the example 2.17 but the gradient of the line is α rather than one). The
unbounded growth arises because the rate of change, Ṅ , increases as N increases — the population
grows faster the bigger it gets. We can remove this property by changing the function f (N ) so that
Ṅ decreases for sufficiently large populations and the rate of growth eventually slows down. If Ṅ is
to increase for small N , but decrease for large N , then f (N ) must have a turning point and must
be a nonlinear function of N .
Example 2.19. Population growth revisited: the logistic equation
The logistic equation is a nonlinear, autonomous ODE that represents another simple model of
population growth,
Ṅ = αN(1 − N/N1),    (2.60)
where α > 0 and N1 > 0 are positive constants. Sketch the phase portrait for equation (2.60) and
find the fixed points of the system. Establish the qualitative behaviour of the system by phase-plane
analysis. Confirm your analysis by finding the general solution of equation (2.60).
Solution 2.19. Sketch
The phase portrait is given by the quadratic g(N ) ≡ αN (1 − N/N1 ) which is zero at the points
N = 0 and N = N1 and has a maximum at (N1 /2, αN1 /4), see Figure 2.6.
Figure 2.6: Phase portrait corresponding to the ODE Ṅ = αN(1 − N/N1).
Fixed points
Fixed points correspond to the zeros of g(N ) and so there are two fixed points at N = 0 and
N = N1 .
Phase-plane analysis
It does not make sense for our population size to be negative, so we shall assume that N ≥ 0. For
N > 0 and small, Ṅ > 0 so N increases and moves away from the fixed point at N = 0. N will keep
increasing because Ṅ remains positive until N = N1 at which point the system is at a fixed point
and N will remain at N1 for all time. If we start with N > N1 , then Ṅ < 0 and the population
decreases until N = N1 and again the system reaches the fixed point at N1 . Thus, for any initial
condition N > 0 the population will eventually reach the fixed point at N = N1 . The fixed point
at N = 0 is said to be unstable because the system moves away from that point. The fixed point
at N = N1 is said to be stable because the system moves towards it; we shall have more to say
about this in the mechanics section of the course.
General solution
The logistic equation is separable and we can find an explicit expression for the general solution.
Multiplying both sides by N1 /(N (N1 − N )) and integrating with respect to t gives
∫ N1/(N(N1 − N)) dN = ∫ α dt.

We use partial fractions to rewrite the left-hand side

∫ [1/N + 1/(N1 − N)] dN = log N − log(N1 − N) = αt + C,

and, hence,

N/(N1 − N) = A e^{αt},    (2.61)
where A = e^C. We denote the initial population (at t = 0) by N0, so A = N0/(N1 − N0). Rearranging equation (2.61) gives

N = (N1 − N) A e^{αt}   ⇒   N(1 + A e^{αt}) = N1 A e^{αt}   ⇒   N = N1 A e^{αt}/(1 + A e^{αt}),

⇒   N = N1 N0 / (N0 + (N1 − N0) e^{−αt}).    (2.62)
Firstly, observe that if N0 = 0, then N = 0 for all time. Next, we consider the limit t → ∞ for
which e−αt → 0, because α > 0; hence,
N → N1 N0 / N0 = N1,   as t → ∞ (N0 ≠ 0),
and the system always evolves towards the stable fixed point, in agreement with the analysis of
the phase portrait. Finally, we remark that if we take the limit N1 → ∞ of the solution (2.62), we
obtain

lim_{N1→∞} N1 N0 / (N0 + (N1 − N0) e^{−αt}) = lim_{N1→∞} N0 / (N0/N1 + (1 − N0/N1) e^{−αt}) = N0 / e^{−αt} = N0 e^{αt},
in agreement with the solution of our simple linear model of population growth (1.4).
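The solution (2.62) can also be checked numerically (a sketch, not part of the notes; the values α = 1, N1 = 100, N0 = 10 are illustrative):

```python
import math

alpha, N1, N0 = 1.0, 100.0, 10.0

def N_exact(t):
    # closed-form solution (2.62)
    return N1 * N0 / (N0 + (N1 - N0) * math.exp(-alpha * t))

# crude Euler integration of N' = alpha*N*(1 - N/N1) as an independent check
N, t, h = N0, 0.0, 1e-4
while t < 5.0:
    N += h * alpha * N * (1 - N / N1)
    t += h

print(abs(N - N_exact(5.0)) < 0.01)    # Euler agrees with the closed form
print(abs(N_exact(50.0) - N1) < 1e-6)  # N -> N1, the stable fixed point
```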