Chapter 2

First-order ordinary differential equations

First-order ODEs involve only the first derivative of the unknown function, y(x), and can be written in the forms

    F(x, y, y') = 0,        (2.1a)
    y' = f(x, y).           (2.1b)

The form (2.1b) is less general than (2.1a), but it is simpler to analyse.

2.1 Simple exact solutions (revision)

You should already be familiar with (at least) two methods of exact, or analytic, solution for first-order ODEs from the C4 module of your A-level maths courses.

2.1.1 Direct integration

If the ODE can be written in the form

    y' = f(x),        (2.2)

then by the fundamental theorem of calculus, theorem 0.1,

    y(x) = ∫ f(x) dx

is a solution of equation (2.2). Obviously, this method will only work if we can actually integrate the function f(x). If not, then we must usually resort to numerical methods, see §2.3.

2.1.2 Separable ODEs

The ODE is said to be separable if it can be written in the form

    g(y) y' = f(x),   or   g(y) dy/dx = f(x).        (2.3)

The "separation" refers to the fact that the y-dependence is on one side of the equation and the x-dependence is on the other — the variables (x and y) are separated by the equals sign.

The solution of equation (2.3) is obtained on integration of both sides with respect to x:

    ∫ g(y) (dy/dx) dx = ∫ f(x) dx   ⇒   ∫ g(y) dy = ∫ f(x) dx.

The method will always give an implicit expression for the solution, but an explicit solution might not be possible. Once again, if we cannot integrate either of the functions g(y) or f(x) then we must seek an alternative method of solution.

Example 2.1. A separable first-order ODE
Find an expression for the solution to the ordinary differential equation

    y' = x^2 / (1 + cos y).        (2.4)

Solution 2.1. Multiplying both sides of equation (2.4) by 1 + cos y and integrating with respect to x gives

    ∫ (1 + cos y) dy = ∫ x^2 dx   ⇒   y + sin y = x^3/3 + C.        (2.5)

As usual in these problems the two constants of integration can be combined into a single constant, C.
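The implicit solution (2.5) can be checked numerically: g(y) = y + sin y is non-decreasing, so for each x it can be inverted by bisection and the slope of the resulting curve compared with the right-hand side of (2.4). Here is a sketch in Python rather than MATLAB, purely so the check is self-contained; the helper name `y_of_x` is hypothetical, not from the notes.

```python
import math

def y_of_x(x, C=0.0):
    """Solve the implicit relation y + sin(y) = x^3/3 + C for y by bisection.

    g(y) = y + sin(y) is non-decreasing, so the root is unique."""
    target = x**3 / 3 + C
    lo, hi = target - 2.0, target + 2.0  # |sin(y)| <= 1 brackets the root
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if mid + math.sin(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Check dy/dx = x^2 / (1 + cos(y)) along the curve via a centred difference.
x, h = 1.3, 1e-5
slope_fd = (y_of_x(x + h) - y_of_x(x - h)) / (2 * h)
slope_ode = x**2 / (1 + math.cos(y_of_x(x)))
print(abs(slope_fd - slope_ode))  # the two slopes agree to high accuracy
```

The same idea works for any separable ODE whose implicit solution involves a monotone function of y.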
The solution (2.5) cannot be solved explicitly for y, but we can write an explicit expression for x:

    x = (3(y + sin y − C))^{1/3}.

2.2 Graphical Solutions

Before rushing in to solve more general first-order ODEs exactly, we shall pause and consider the qualitative information that can be obtained. In fact, when an ODE is in the form (2.1b), we can sketch the solutions without actually having to solve the ODE — a method known as graphical solution.

2.2.1 Direction Fields

If the ODE can be written in the form (2.1b) then the function f(x, y) represents the gradient of the solution y(x). We can extract information about the function y(x) by plotting this gradient in the x-y plane. The gradient is represented by a straight line, the direction vector, that has a slope given by f(x, y), see Figure 2.1. The direction vector has components (1, y') = (1, f(x, y)) in Cartesian coordinates and emanates from the point (x, y). The direction field of the ODE is defined to be the set of all direction vectors in the x-y plane.

Example 2.2. Plotting a direction field
Calculate and plot the direction field for the ODE y' = x for y > 0.

Solution 2.2. At any point (x, y), the direction vector is (1, x) and the corresponding direction field is shown in Figure 2.2a.

[Figure 2.1: The direction vector has the same slope as the solution of the ODE y' = f(x, y). The angle θ of the direction vector to the horizontal satisfies tan θ = y'/1 = f(x, y).]

[Figure 2.2: (a) The direction field corresponding to the ODE y' = x. (b) The direction field and integral curves, y = x^2/2 + C, corresponding to the ODE y' = x.]

2.2.2 Integral curves

The direction field represents the slope of solutions to the ordinary differential equation, so the solutions must be everywhere tangent to the direction field.
Curves that are everywhere tangent to the direction field are known as integral curves and each integral curve represents a solution of the ODE. In other words, the integral curves are (graphical) solutions of the ODE.

Example 2.3. Constructing integral curves
Plot the integral curves corresponding to the ODE y' = x.

Solution 2.3. The curves can be constructed by starting from any point on the y-axis and following the direction field forwards and backwards. In this case, the solution to the ODE is easily obtained by direct integration, y = x^2/2 + C, so the integral curves are a family of parabolas differing only by the position at which they cross the y-axis (y = C when x = 0), see Figure 2.2b.

2.2.3 Isoclines

When sketching direction fields it can be helpful to identify isoclines, curves along which the gradient is constant (iso = "same", and cline = "slope"). The equation defining the isoclines is simply

    dy/dx = K, a constant,   ⇒   f(x, y) = K.        (2.6)

Example 2.4. Constructing another direction field
By determining the isoclines, calculate and plot the direction field and integral curves for the ODE y' = −x/y for y > 0.

Solution 2.4. The isoclines are given by −x/y = K or y = −x/K, i.e. straight lines¹ passing through the origin of slope −1/K. At any point (x, y) the direction vector is (1, −x/y), which can be rescaled on multiplication by y to give (y, −x). Note that multiplication by a scalar does not change the direction, merely the magnitude of the direction vector. At any point (x, y), therefore, the direction vector is perpendicular to the isocline from the origin (0, 0) to the point (x, y). The direction field and isoclines are shown in Figure 2.3a.

The ODE is separable and, after multiplication by y, both sides may be integrated to give

    ∫ y dy = ∫ −x dx   ⇒   y^2/2 = −x^2/2 + C   ⇒   x^2 + y^2 = 2C,

the equation of a circle with radius √(2C). Hence, the integral curves are a family of concentric circles centred on the origin.
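As a cross-check, implicit differentiation of the integral curves recovers the ODE directly:

```latex
x^2 + y^2 = 2C
\quad\Rightarrow\quad
2x + 2y\,\frac{dy}{dx} = 0
\quad\Rightarrow\quad
\frac{dy}{dx} = -\frac{x}{y}.
```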
The direction field and integral curves are shown in Figure 2.3b.

[Figure 2.3: (a) The direction field with isoclines corresponding to the ODE y' = −x/y. (b) The direction field and integral curves, x^2 + y^2 = 2C, corresponding to the ODE y' = −x/y.]

¹ Isoclines will not always be straight lines, e.g. y' = x^2/y has quadratic isoclines: K = x^2/y ⇒ y = x^2/K.

2.2.4 Plotting direction fields using MATLAB

Plotting direction fields is a tedious, repetitive task, which is the kind of thing that computers do well and humans do badly. Of course, you should be able to plot direction fields by hand, but if a computer is available, then learn to use it!

One of MATLAB's many strengths is its built-in plotting capabilities, see §0.2.5. Unfortunately, direction fields cannot be plotted by default, but a vector field can be plotted by using a "quiver plot". The quiver command requires four input arguments, x, y, dx and dy, where dx and dy are the x and y components of the vector located at (x, y). As with most data in MATLAB, the input data must be MATLAB matrices, and corresponding entries in each matrix contain the data used to generate a single arrow in the plot. It is extremely important that all four matrices have the same dimensions. We must now find a way to generate these four matrices as quickly and easily as possible.

In order to generate a "grid" of starting points in the (x, y) plane we use another MATLAB command, meshgrid. The command

>> [x,y] = meshgrid(-3:3,0:6);

generates two 7 × 7 matrices x and y that represent the x and y coordinates of 49 points in the domain −3 ≤ x ≤ 3 and 0 ≤ y ≤ 6 with a unit spacing between each point.
x =
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3
    -3    -2    -1     0     1     2     3

y =
     0     0     0     0     0     0     0
     1     1     1     1     1     1     1
     2     2     2     2     2     2     2
     3     3     3     3     3     3     3
     4     4     4     4     4     4     4
     5     5     5     5     5     5     5
     6     6     6     6     6     6     6

Note that the x coordinate is constant in each column of the x matrix and the y coordinate is constant in each row of the y matrix. The spacing between points in the grid can be adjusted by an additional argument between the two limits. The command

>> [x,y] = meshgrid(-3:3:3,0:3:6);

increases the space between each point to 3 units, rather than the default 1 unit, generating the two matrices:

x =
    -3     0     3
    -3     0     3
    -3     0     3

y =
     0     0     0
     3     3     3
     6     6     6

In general the arguments to meshgrid can be any two vectors that contain lists of x and y coordinates used to generate the grid. Whatever domain and spacing we choose, we now have a grid of starting points for the vectors. It remains to calculate the x and y components of the direction field. We can create a matrix of the gradients at each point (x, y), e.g. f(x, y) = −x/y, by the following command

>> f = -x./y;

which gives the matrix

f =
        Inf        Inf        Inf        NaN       -Inf       -Inf       -Inf
    3.00000    2.00000    1.00000   -0.00000   -1.00000   -2.00000   -3.00000
    1.50000    1.00000    0.50000   -0.00000   -0.50000   -1.00000   -1.50000
    1.00000    0.66667    0.33333   -0.00000   -0.33333   -0.66667   -1.00000
    0.75000    0.50000    0.25000   -0.00000   -0.25000   -0.50000   -0.75000
    0.60000    0.40000    0.20000   -0.00000   -0.20000   -0.40000   -0.60000
    0.50000    0.33333    0.16667   -0.00000   -0.16667   -0.33333   -0.50000

The use of the ./ operator is very important because it specifies element-by-element division. The command x/y would try to calculate a matrix right division, approximately the same as multiplication of x by the inverse of y. Note that MATLAB can distinguish between positive and negative infinity (Inf) and undefined quantities (NaN, which means "Not a Number", occurs because 0/0 is undefined).
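If you want to see what meshgrid is doing without opening MATLAB, the same grids can be built in a few lines of Python. This is only an illustration of the idea (NumPy's meshgrid is the usual tool in Python; plain lists are used here to keep the construction visible):

```python
def meshgrid(xs, ys):
    """Mimic MATLAB's meshgrid: X repeats xs along each row,
    Y repeats each y across the corresponding row."""
    X = [list(xs) for _ in ys]
    Y = [[y] * len(list(xs)) for y in ys]
    return X, Y

X, Y = meshgrid(range(-3, 4), range(0, 7))
print(len(X), len(X[0]))  # 7 7  (a 7-by-7 grid, 49 points)
print(X[0])               # [-3, -2, -1, 0, 1, 2, 3]
print(Y[2])               # [2, 2, 2, 2, 2, 2, 2]
```

As in MATLAB, the x coordinate is constant down each column of X and the y coordinate is constant along each row of Y.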
When plotting a direction field, the vectors should all have the same length, which will not always be the case if we let (dx, dy) = (1, y'). Instead, we use the fact that if θ is the angle between the direction vector and the x-axis, then y' = tan θ, see Figure 2.1. We can then construct a vector of unit length by letting dx = cos θ, dy = sin θ. We calculate the dx and dy matrices as follows:

>> th = atan(f);
>> dx = cos(th);
>> dy = sin(th);

We can now generate the quiver plot by the following command

>> quiver(x,y,dx,dy);

The complete sequence of commands to generate a plot of the direction field corresponding to y' = f(x, y) = −x/y is

>> [x,y] = meshgrid(-3:3,0:6);
>> f = -x./y;
>> th = atan(f);
>> dx = cos(th);
>> dy = sin(th);
>> quiver(x,y,dx,dy);

and direction fields for other ODEs can be generated by changing the definition of f on the second line.

Creating a MATLAB function to generate direction fields

If we are typing in the same commands again and again with only minor changes then that suggests that we could save ourselves some time by creating a MATLAB function, which is essentially a list of those commands. A suitable function to generate direction fields is shown below.

function dir_field(func,xrange,yrange)
%dir_field(func,xrange,yrange)
%Generate a quiver plot of a direction field:
% func: a function of two variables such that y' = func(x,y)
% xrange: a vector of x coordinates
% yrange: a vector of y coordinates
[x,y] = meshgrid(xrange,yrange);
f = func(x,y);
th = atan(f);
dx = cos(th);
dy = sin(th);
quiver(x,y,dx,dy);
end

The only difference between the commands in the above function and those described in section 2.2.4 is the use of an unspecified function, func, of x and y, which is an argument to the MATLAB function dir_field. This allows for complete generality because we can change the function that we plot when we call the dir_field function.
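The normalisation step can be checked independently of MATLAB: for any gradient f, the vector (cos(arctan f), sin(arctan f)) has unit length and slope f. A small Python sketch mirroring the MATLAB commands (the function name is hypothetical):

```python
import math

def direction_vector(f):
    """Unit-length direction vector with slope f, as in the MATLAB code:
    th = atan(f); dx = cos(th); dy = sin(th)."""
    th = math.atan(f)
    return math.cos(th), math.sin(th)

dx, dy = direction_vector(2.0)
print(math.hypot(dx, dy))  # unit length
print(dy / dx)             # recovers the slope f = 2
```

Because atan returns angles in (−π/2, π/2), dx is always positive, so every arrow in the quiver plot points "forwards" in x.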
In order for MATLAB to use this new function it must be saved in a file called dir_field.m located in the "MATLAB Search Path". The easiest way to achieve this is to type edit dir_field in the MATLAB command window, which should start the MATLAB editor in a new window. Type the above function into the editor and save it (File->Save). You can test whether the file has been correctly created by typing

>> help dir_field

If everything is set up properly, you should see the comments at the top of the function definition

  dir_field(func,xrange,yrange)
  Generate a quiver plot of a direction field:
   func: a function of two variables such that y' = func(x,y)
   xrange: a vector of x coordinates
   yrange: a vector of y coordinates

If you see the message

  No help comments found in dir_field.m.

then the file is in the correct place, and you have either not put any comments in the function or the comment (percent) sign, %, is not the first character in the line. If instead you see the message

  dir_field.m not found.

then the file dir_field.m has not been created, or is not in the "MATLAB Search Path". Once you have the file in the correct place, with or without working comments, you can use the function to generate Figure 2.2a as follows

>> dir_field(@(x,y) x,-3:0.5:3,0:0.5:3);
>> axis([-2.95,2.95,-0.05,3.05]);

The syntax @(x,y) x is called an anonymous function and is a quick way of specifying the function that defines the direction field². The notation @(x,y) specifies that we are defining a function of two variables, x and y, and the second x is the function itself. Translation of some more mathematical functions into this MATLAB notation is shown below.

² An alternative is to create a new M-file that defines a function of two variables and then to pass the name of that function to the function dir_field.
    f(x, y) = x        @(x,y) x
    f(x, y) = −x/y     @(x,y) -x./y
    f(x, y) = sin x    @(x,y) sin(x)
    f(x, y) = xy       @(x,y) x.*y

The use of the element-by-element operators .* and ./ is necessary because x and y are matrices; look back at the commands used in the function dir_field.

2.2.5 Existence and uniqueness of solutions

If we can construct an integral curve, then the solution that it represents must exist, at least in the range over which we can draw the curve. There are many different integral curves, however, suggesting that the solution is not unique. Indeed, the very first example, 1.1, did not have a unique solution because any constant C satisfies the equation. As you might have guessed, this non-uniqueness follows from the arbitrary constant introduced on integration.

We can remove the arbitrariness introduced by the constant of integration by specifying that the solution must pass through a particular point in the (x, y) plane, (x_0, y_0) say. The constraint can be written mathematically in the form

    y(x_0) = y_0,        (2.7)

and is known as an initial condition because it represents the value of the solution at a single (starting) point. The complete solution can be constructed by starting from the initial point and following the integral curve in either direction³.

Having removed the non-uniqueness due to the arbitrary constant of integration, we can consider the existence and uniqueness of solutions to the initial value problem:

    y' = f(x, y),   and   y(x_0) = y_0.        (2.8)

The solution will fail to exist if we cannot construct an integral curve in the x-y plane starting from the point (x_0, y_0).
In fact, provided that the function f(x, y) is continuous near (x_0, y_0), we can always construct an integral curve in the local neighbourhood of (x_0, y_0) and the solution exists; a result known as the Peano existence theorem.

If there is a choice of direction at any point along an integral curve, in other words if two integral curves cross each other, then the solution is not unique. A stronger condition than simple continuity is required to prevent such an occurrence and leads to the following (Picard's existence) theorem:

Theorem 2.1. If f(x, y) and ∂f/∂y (x, y) are both continuous functions of x and y in a region 0 < |x − x_0| < a and 0 < |y − y_0| < b, then there exists a unique solution y = y(x) in the interval 0 < |x − x_0| < h ≤ a that satisfies the initial value problem

    dy/dx = f(x, y)   and   y(x_0) = y_0.

The proof of this theorem is beyond the scope of this course; the basic idea is that continuity of the partial derivative allows the "distance between two nearby functions"⁴ to be bounded, which is, eventually, enough to show that the "distance" between two solutions passing through the same point must be zero, i.e. there is only one solution.

In order to apply the theorem, the continuity of the functions f(x, y) and ∂f/∂y must be investigated. The theorem will fail if either function is not continuous.

³ In problems for which the independent variable is time, t, it is usually only necessary to follow the integral curve forward in t.
⁴ A precise definition of the term "distance between functions" is surprisingly tricky and requires what is known as functional analysis, see MATH20122, MATH31011 and MATH46111.

Example 2.5. Using the theorem 2.1
Use the existence and uniqueness theorem to determine values of the initial conditions (x_0, y_0) for which there is guaranteed to be a unique solution to the initial value problem

    y' = 1/x,   y = y_0 at x = x_0.

Solution 2.5.
The function f(x, y) = 1/x has a discontinuity at x = 0 and, therefore, a solution does not necessarily exist⁵ when x_0 = 0, and it may not be possible to extend a solution with x_0 < 0 to positive values of x, or a solution with x_0 > 0 to negative values of x. We can solve the problem by direct integration:

    y' = 1/x   ⇒   y = ∫ (1/x) dx.

We split the integration into two parts:

    x > 0:  ∫ (1/x) dx = log x + C.
    x < 0:  Let u = −x ⇒ du = −dx, and ∫ (1/x) dx = ∫ (1/u) du = log u + C = log(−x) + C.

Thus, y = log |x| + C and applying the initial conditions gives

    y_0 = log |x_0| + C   ⇒   C = y_0 − log |x_0|,
    ⇒   y = y_0 + log |x| − log |x_0| = y_0 + log(|x|/|x_0|) = y_0 + log |x/x_0|.

Any solution is undefined (does not exist) when x = 0, which means that the solutions cannot pass smoothly through x = 0. In addition, there is no solution at all when x_0 = 0 and the argument of the logarithm is undefined, which is entirely consistent with the implications of the theorem.

It is also simple to verify that non-differentiability of f with respect to y can lead to non-uniqueness, but, of course, that does not constitute a proof of the theorem.

Example 2.6. Non-uniqueness in an initial value problem
Find solutions to the initial value problem

    y' = 2√|y|,   where y(0) = 0.        (2.9)

Solution 2.6. The function f(x, y) = 2√|y| is not differentiable at y = 0, which means that ∂f/∂y does not even exist at y = 0, so it is certainly not continuous there. Hence, the theorem 2.1 cannot guarantee uniqueness near y = 0. We can solve the equation by separation, but must split the calculation into two parts to handle the modulus function.

    y > 0:  |y| = y, and equation (2.9) becomes y' = 2√y. Separating variables and integrating gives
            ∫ 1/(2√y) dy = ∫ 1 dx   ⇒   √y = x + C_1.

    y < 0:  |y| = −y, and equation (2.9) becomes y' = 2√(−y). Separating variables and integrating gives
            ∫ 1/(2√(−y)) dy = ∫ 1 dx   ⇒   −√(−y) = x + C_2.

⁵ Note that the conditions of the theorem are sufficient, but not necessary: failure of the continuity criterion does not necessarily imply non-existence of a solution.
The initial condition states that y = 0 when x = 0. Thus, C_1 = C_2 = 0 and we obtain the solution

    sgn(y) √|y| = x   ⇒   y = x^2,        (2.10)

where sgn(y) is the sign function:

    sgn(y) =  1,  y > 0,
             −1,  y < 0,
              0,  y = 0.

The constant function y = 0 is also a solution of the initial value problem (2.9) because

    y' = 0 = 2√|0|   and   y(0) = 0.

We have shown that the initial value problem does indeed have two possible solutions, y = 0 and y = x^2.

2.3 Numerical Solutions

We have already seen how to use a computer to generate the direction field for a first-order ODE, see §2.2.4, but how can we use a computer to plot the integral curves? In particular, for a given initial value problem of the form (2.8),

    y' = f(x, y),   and   y(x_0) = y_0,

we seek an approximation to the solution y(x). The MATLAB plot command, and other plotting packages, connect a set of discrete points by straight-line segments. For the function y(x), these discrete points are pairs of values (x_n, y_n), where y_n ≈ y(x_n) and is called a numerical approximation to the function y at the point x_n. The values of y_n correspond to a chosen set of points in the domain of the independent variable, x_0 < x_1 < … < x_N. If we have an analytic expression for the solution y(x), then it is easy to generate the pairs (x_n, y_n) = (x_n, y(x_n)), but the aim of numerical solution methods is to generate these pairs from the ODE directly, without the need for an exact solution.

2.3.1 Euler's Method

The idea underlying Euler's Method is to start at the point given by the initial condition (x_0, y_0) and then to find the value of y_1 ≈ y(x_1) by using a straight-line approximation to y(x) in the interval [x_0, x_1]. The obvious approximation is the tangent to y(x) at (x_0, y_0), i.e. the line with gradient given by y' at (x_0, y_0). Hence,

    y_1 = y_0 + y'(x_0)(x_1 − x_0) = y_0 + f(x_0, y_0)(x_1 − x_0).        (2.11)

The method proceeds by repeating the process: start at (x_1, y_1) and approximate the function y(x) in the interval [x_1, x_2] by the tangent to y(x) at (x_1, y_1), and so on. The complete set of points {y_n} can be generated from the equations

    y_0 = C, a given initial value,
    y_{n+1} = y_n + f(x_n, y_n)(x_{n+1} − x_n).        (2.12)

It is very common to choose an equally-spaced set of points {x_n}, i.e. the distance between each point is the same, x_{n+1} − x_n = h, say⁶. Then, Euler's method can be, and often is, written in the form

    y_{n+1} = y_n + h f(x_n, y_n).        (2.13)

The constant spacing between points, h, is called the step-size.

Example 2.7. Euler's method
Use Euler's method with a step-size h = 0.5 to determine an approximate solution to the initial value problem (2.14) in the domain x ∈ [0, 3],

    y' = x,   and   y(0) = 0.        (2.14)

Solution 2.7. Firstly, a step-size of h = 0.5 in the domain x ∈ [0, 3] means that the chosen set of points in x will be {0, 0.5, 1, 1.5, 2, 2.5, 3}. The easiest way to lay out the calculation is in a table.

    n    x_n    y_n     f(x_n, y_n)    y_n + h × f(x_n, y_n)  =  y_{n+1}
    0    0      0       0              0    + 0.5 × 0         =  0
    1    0.5    0       0.5            0    + 0.5 × 0.5       =  0.25
    2    1      0.25    1              0.25 + 0.5 × 1         =  0.75
    3    1.5    0.75    1.5            0.75 + 0.5 × 1.5       =  1.5
    4    2      1.5     2              1.5  + 0.5 × 2         =  2.5
    5    2.5    2.5     2.5            2.5  + 0.5 × 2.5       =  3.75
    6    3      3.75    3              3.75 + 0.5 × 3         =  5.25

Note that the value of y_n is copied from the y_{n+1} column of the row above. The pairs (x_n, y_n) are connected by straight lines and shown on Figure 2.4a together with the direction field corresponding to the ODE y' = x. The solution "looks" good in the sense that it is always aligned with the direction field, but, in fact, it is not that accurate. The exact solution to the initial value problem (2.14) can be obtained by direct integration and is y = x^2/2. Points on the exact solution are shown as open circles in Figure 2.4 and only coincide with the approximate solution at the initial point (0, 0).
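The arithmetic in the table is mechanical, which makes it easy to check by machine. A short Python version of the same calculation, sketching equation (2.13) (this is not the MATLAB function developed later in the notes; the name `euler` is reused for convenience):

```python
def euler(f, x0, y0, h, nstep):
    """Euler's method, equation (2.13): y_{n+1} = y_n + h*f(x_n, y_n)."""
    xs, ys = [x0], [y0]
    for _ in range(nstep):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

xs, ys = euler(lambda x, y: x, 0.0, 0.0, 0.5, 6)
print(ys)  # [0.0, 0.0, 0.25, 0.75, 1.5, 2.5, 3.75], matching the table
```

The values agree with the y_n column of the table above.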
The step-size of h = 0.5 means that we are approximating the exact solution, a quadratic, by very few straight-line segments, so it should not be a surprise that the answer is inaccurate⁷. The accuracy of the approximation can be improved by increasing the number of straight-line segments, or equivalently decreasing the step-size. Figure 2.4b shows the approximate solution to (2.14) with a step-size of h = 0.01 in red and the exact solution as green markers, which lie exactly (as far as the eye can tell) on the approximate solution.

⁶ Henceforth, we shall assume that the set of points {x_n} is equally spaced, unless explicitly stated otherwise.
⁷ Incidentally, the reason why the approximation "looks" good is because, by construction, each straight-line segment is always aligned with the direction field. The biggest inaccuracy occurs in the very first step, in which the base of the parabola is approximated by a straight line of zero gradient. Having moved away from the true solution, the Euler method follows another integral curve.

[Figure 2.4: (a) The direction field corresponding to the ODE y' = x; the solution of the initial value problem (2.14) using Euler's method with a step-size of h = 0.5 over the interval x ∈ [0, 3] (solid line); and the exact solution to (2.14), y = x^2/2 (open circles). (b) The direction field corresponding to the ODE y' = x; the solution of the initial value problem (2.14) using Euler's method with a step-size of h = 0.01 over the interval x ∈ [0, 3] (solid line); and the exact solution to (2.14), y = x^2/2 (open circles).]

Accuracy of Euler's method

Although Euler's method seems like a sensible approach to approximate the solution of ODEs, the example 2.7 shows that when used without thought it can give very inaccurate solutions.
Quantification of the accuracy of any numerical method is extremely important, so important that it is a branch of mathematics in its own right known as numerical analysis, see MATH20602 and, in particular for ODEs, MATH36022. In order to say something definite, we must define what we mean by the accuracy of the method, which leads to the concept of the (numerical) error. The error is essentially the difference between the numerical approximation and the exact solution and will vary with the independent variable. Ideally, we would like to represent the error by a single number, e say, and then deduce how that number varies as we change the number of approximation points in our numerical scheme. The precise definition of e depends very much on the problem being analysed, however. In the present context, the numerical solution of ODEs, we can define an error as follows. Let {x_n}, 0 ≤ n ≤ N, be a set of N + 1 points on the real line and let y_n be the numerical approximation to the function y(x) at the point x_n. The numerical error, e_n, at the point x_n is defined to be the difference between the numerical approximation and the exact solution:

    e_n = y_n − y(x_n).        (2.15)

Example 2.8. Evaluating the numerical error
Calculate the numerical errors e_n corresponding to the values y_n determined in example 2.7.

Solution 2.8. The exact solution to the initial value problem (2.14) is y = x^2/2. Thus, we can present the exact solution, approximate solution and the error in another table.

    n    x_n    y_n     y(x_n)    e_n = y_n − y(x_n)
    0    0      0       0          0
    1    0.5    0       0.125     −0.125
    2    1      0.25    0.5       −0.25
    3    1.5    0.75    1.125     −0.375
    4    2      1.5     2         −0.5
    5    2.5    2.5     3.125     −0.625
    6    3      3.75    4.5       −0.75

The error is negative because the numerical approximation is an underestimate of the true solution; the graph of the numerical solution lies under that of the exact solution, see Figure 2.4a. By construction, the error is zero at the initial condition (n = 0), but the magnitude of the error increases by a constant amount after each step.
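The error column can be generated the same way as the table. A Python sketch using the Euler values from example 2.7 and the exact solution y = x^2/2 (variable names are illustrative only):

```python
# Euler approximations from example 2.7 (h = 0.5, y' = x, y(0) = 0)
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.0, 0.0, 0.25, 0.75, 1.5, 2.5, 3.75]

# e_n = y_n - y(x_n), equation (2.15), with exact solution y(x) = x^2/2
errors = [y - x**2 / 2 for x, y in zip(xs, ys)]
print(errors)  # [0.0, -0.125, -0.25, -0.375, -0.5, -0.625, -0.75]
```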
The error is said to "accumulate" as more steps are taken.

Local error analysis and the order of Euler's method

If a numerical method is to be of any use at all, we require that the error at every point must tend to zero as the spacing between the points tends to zero, a property known as convergence. In other words, the more points we take, the more accurate the approximate solution, and if we were able to take "infinitely many" points then we should recover the exact solution⁸.

A simple way to investigate the accuracy of Euler's method is to consider what happens if we replace the approximation y_n by the exact solution y(x_n). For a constant step-size h, the equation (2.13) becomes

    y(x_{n+1}) = y(x_n) + h f(x_n, y(x_n)).        (2.16)

Assuming that y is sufficiently differentiable, we can approximate y(x_{n+1}) by a Taylor series about the point x_n:

    y(x_{n+1}) = y(x_n + h) = y(x_n) + h y'(x_n) + (1/2) h^2 y''(x_n) + O(h^3)
               = y(x_n) + h f(x_n, y(x_n)) + O(h^2).        (2.17)

Using equation (2.17) in equation (2.16) yields

    y(x_n) + h f(x_n, y(x_n)) + O(h^2) = y(x_n) + h f(x_n, y(x_n)).        (2.18)

The difference between the two sides is called the local truncation error and is O(h^2). If the O(h^2) term is zero (y''(x) ≡ 0) then there is no difference between the two sides of equation (2.18) and Euler's method is said to be exact. It follows that Euler's method will be exact if y(x) is a polynomial of order one, y(x) = Ax + B. For this reason, Euler's method is called a first-order method.

A numerical method is said to be of order p if it recovers exactly every polynomial solution of order less than or equal to p. Alternatively, a method is of order p if the local truncation error, the terms that remain after substitution of the exact solution into the numerical method, is of O(h^{p+1}). The order of the numerical scheme tells us that advancing from the point x_n to x_{n+1} introduces an error of size O(h^{p+1}), for small h.
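The O(h^2) local truncation error can be observed numerically: take a single Euler step starting from the exact solution of y' = y (so y = e^x), and watch the one-step error drop by a factor of roughly four when h is halved. A hypothetical Python check, not part of the original notes:

```python
import math

def one_step_error(h, x0=0.0):
    """Error of a single Euler step for y' = y, starting from the exact value e^{x0}."""
    y0 = math.exp(x0)
    y1 = y0 + h * y0                   # one Euler step: y1 = y0 + h*f(x0, y0)
    return abs(y1 - math.exp(x0 + h))  # compare with the exact solution

e1, e2 = one_step_error(0.1), one_step_error(0.05)
print(e1 / e2)  # close to 4: halving h quarters the O(h^2) local error
```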
The problem with this approach is that the order only describes the local error; it assumes that we are starting from the exact solution at every step, i.e. that e_n = 0. After n steps of the method, however, e_n ≠ 0 in general. In a fixed interval the number of steps N is O(h^{−1}) and the naive expectation is that a global error measure

    e = Σ_n e_n ≈ N e_n = O(h^{−1}) × O(h^{p+1}) = O(h^p).

Sadly, this argument is too simple to be true, in general⁹. That said, for Euler's method with constant spacing (2.13) it can be shown that |e_n| ≤ Ch, where C is a constant that is independent of h, and so |e_n| → 0 as h → 0 for all n and the method is convergent¹⁰. The error bound also shows that we should expect the error to decrease linearly with step-size. In other words, if we wish to halve the error then we must halve the step-size (approximately).

⁸ Once again, really precise definitions of these concepts require quite sophisticated mathematical tools from real, complex and functional analysis.

Plotting integral curves computed by Euler's method using MATLAB

Numerical methods are designed to take advantage of the fact that computers are dumb, but can do repetitive tasks accurately and quickly. Euler's method is "easy" (it requires only multiplication, addition and a function evaluation for each step) but very laborious. Imagine having to calculate one million steps by hand! It is much better, and much more fun, to get the computer to do the boring work. As in example 2.7, we shall consider the solution to the initial value problem (2.14),

    y' = x,   and   y(0) = 0.
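The claim that halving the step-size (approximately) halves the error is easy to test on the model problem (2.14). A quick Python sketch with a bare-bones Euler loop (function name hypothetical); for this problem the halving is in fact exact:

```python
def euler_final(f, x0, y0, h, nstep):
    """Return the Euler approximation to y at x0 + nstep*h, equation (2.13)."""
    x, y = x0, y0
    for _ in range(nstep):
        y = y + h * f(x, y)
        x = x + h
    return y

exact = 3.0**2 / 2  # y(3) = 4.5 for the exact solution y = x^2/2
errs = []
for h, n in [(0.5, 6), (0.25, 12), (0.125, 24)]:
    errs.append(euler_final(lambda x, y: x, 0.0, 0.0, h, n) - exact)
print(errs)  # [-0.75, -0.375, -0.1875]: each error is half the previous one
```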
If we create two vectors x and y that represent the sets {x_n} and {y_n}, then the integral curve can then be plotted by using the command

>> plot(x,y)

The initial condition will be the first entry in each array and the first entry in a MATLAB array has the index 1, so, confusingly, x(1) = x_0 and y(1) = y_0,

>> x(1) = 0; y(1) = 0;

If the next point is x_1 = 0.5, we could create the next entries in the array by the commands

>> x(2) = x(1) + 0.5
x =
         0    0.5000
>> y(2) = y(1) + 0.5*x(1)
y =
     0     0

but we would have to repeat this process for each step, which is as laborious as doing it by hand. Instead we can (and should) make use of a for loop to write

>> for n=1:6
     x(n+1) = x(n) + 0.5;
     y(n+1) = y(n) + 0.5*x(n);
   end

The first line in the for loop increases the value of x by h = 0.5 and the second is a direct implementation of Euler's method, equation (2.13), where f(x, y) = x and the (constant) step-size is 0.5. The loop will run six times, n=1:6, and populates the vectors with the data shown in solution 2.7.

⁹ It usually fails because the (local) order of a method does not say anything about its (global) convergence.
¹⁰ A proof of this statement is actually quite hard and relies on taking a Taylor-series expansion of the exact solution to relate e_{n+1} to e_n, followed by careful inductive arguments starting from the fact that e_0 = 0 by definition.
>> x
x =
         0    0.5000    1.0000    1.5000    2.0000    2.5000    3.0000
>> y
y =
         0         0    0.2500    0.7500    1.5000    2.5000    3.7500

Once again, we can and should generalise the process by writing a more general MATLAB function

function euler(func,xinit,yinit,h,nstep)
%Function that plots an integral curve calculated using Euler's method
% func: a function of two variables such that y' = func(x,y)
% xinit: the initial value of the independent variable, x0
% yinit: the initial value of the dependent variable, y0
% h: the step-size
% nstep: the number of steps
x(1) = xinit;
y(1) = yinit;
for n=1:nstep
  x(n+1) = x(n) + h;
  y(n+1) = y(n) + h*func(x(n),y(n));
end
%Plot the integral curve in red
plot(x,y,'r');

If the above commands are saved in a file called euler.m, then the integral curve in Figure 2.4a can be generated by the command

>> euler(@(x,y) x,0,0,0.5,6)

The integral curve shown in Figure 2.4b was generated by

>> euler(@(x,y) x,0,0,0.01,300)

2.3.2 Explicit vs implicit methods

Euler's method is known as an explicit method because it gives an explicit expression for the value of the solution at a point, y_{n+1}, as a function of the solution at the previous point, y_n. In other words, the formula (2.13),

    y_{n+1} = y_n + h f(x_n, y_n),

can be written in the form

    y_{n+1} = F(x_n, y_n).        (2.19)

The explicit nature of the method arises because we approximate the gradient in the region [x_n, x_{n+1}] by its value at the left-hand end, f(x_n, y_n). An alternative would be to approximate the gradient by its value at the right-hand end of the interval, f(x_{n+1}, y_{n+1}), which leads to the backward Euler method

    y_{n+1} = y_n + h f(x_{n+1}, y_{n+1}).        (2.20)

The backward Euler method is an implicit method because for a general f(x, y) it cannot be written in the form (2.19). Instead, the method is of the general form y_{n+1} = F(x_{n+1}, y_{n+1}, x_n, y_n). The use of implicit methods requires the solution of non-linear algebraic equations, e.g.
using Newton's method, so they are more difficult to program and more expensive per step. Why use them at all? The answer is that implicit methods often give accurate results for large steps, so fewer steps are required in total.

2.3.3 Higher-order methods

We have seen that Euler's method is only exact for linear functions, and the solution of a general ODE is often far from linear. Extremely small step sizes are required to compute accurate solutions to general ODEs by Euler's method, which means that the computations can take a very long time. For these reasons, Euler's method is not that widely used. Instead, higher-order methods have been developed. These methods have smaller local truncation errors than Euler's method, which means that even if larger step-sizes are taken the error does not grow as fast. A large number of different methods have been proposed, each with different strengths and weaknesses, see e. g. Chapter 19 in “Schaum's Outline of Differential Equations”.

2.4 Exact (analytic) Solutions: Linear Equations

There are many different techniques that can be used to solve first-order ODEs exactly. The graphical and numerical methods discussed in the previous sections are quite general, but the applicability of analytic methods can depend crucially on the properties of the ODE. For example, we can only use the method of separation, §2.1.2, if the ODE is separable. One of the most important properties in this regard is whether or not the equation is linear. A first-order ordinary differential equation is linear if it can be written in the form

y' + p(x) y = r(x),   (2.21)

where p(x) and r(x) are functions of the independent variable. The equation (2.21) is the same as the general definition of a linear ODE (1.8), with n = 1, a_1(x) = 1 (always possible without loss of generality if the equation is first-order), a_0(x) = p(x) and g(x) = r(x).
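Before leaving numerical methods behind, the pay-off promised in §2.3.2 for implicit methods is easy to demonstrate. For the stiff test problem y' = −50y, y(0) = 1 (my choice of illustration; it is not in the notes), forward Euler with h = 0.1 multiplies the solution by (1 − 50h) = −4 at every step and explodes, while backward Euler, which can be rearranged explicitly here because f is linear in y, multiplies it by 1/(1 + 50h) = 1/6 and decays like the exact solution e^{−50x}:

```python
h, steps = 0.1, 20          # a deliberately large step for y' = -50 y
y_fwd = y_bwd = 1.0         # y(0) = 1 for both schemes

for _ in range(steps):
    # forward Euler (2.13): y_{n+1} = y_n + h f(x_n, y_n)
    y_fwd = y_fwd + h * (-50.0 * y_fwd)    # equivalent to y_fwd *= -4
    # backward Euler (2.20): y_{n+1} = y_n + h f(x_{n+1}, y_{n+1}),
    # rearranged as y_{n+1} (1 + 50 h) = y_n because f is linear in y
    y_bwd = y_bwd / (1.0 + 50.0 * h)       # equivalent to y_bwd /= 6

print(abs(y_fwd))   # enormous: the explicit scheme has blown up
print(y_bwd)        # tiny and positive, like the exact e^{-50 x}
```

The implicit scheme is stable at a step-size for which the explicit scheme is useless, which is precisely why one pays the extra per-step cost.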
The equation (2.21) is often written in the form

Ly = r(x), where L ≡ d/dx + p(x),   (2.22)

and L is called a linear differential operator. In fact, as we shall see below, it is possible to write down an explicit expression for the solution to (nearly) any first-order linear ODE, although the expression might be impossible to simplify nicely.

2.4.1 Solution using an integrating factor

Example 2.9. A non-separable, first-order, linear ODE
Find the general solution to the ODE

y' + 5y = 1.   (2.23)

Solution 2.9. Equation (2.23) is not separable because the dependence on y cannot be made to occur only in the y' term. If we divide by y, then there will be a term 1/y on the right-hand side. Instead, we “notice” the cunning trick that if we multiply equation (2.23) by e^{5x} then

e^{5x} y' + 5 e^{5x} y = e^{5x}  ⇒  d/dx (e^{5x} y) = e^{5x},   (2.24)

by the product rule. We can now integrate both sides of equation (2.24) with respect to x

e^{5x} y = (1/5) e^{5x} + C,

where C is an arbitrary constant of integration. The final solution is obtained on division by e^{5x}

y = 1/5 + C e^{−5x}.   (2.25)

The expression (2.25) is called the general solution of equation (2.23) because it still contains a free constant C and so represents the entire family of integral curves.

The general case

In example 2.9 we were able to find a solution after multiplying by the function e^{5x}, which made it possible to integrate the resulting equation directly. The solution relied on us being able to “spot” that multiplying by the specific function e^{5x} would allow us to perform the integration. It would be better if we had a more constructive method. The idea is to seek a function q(x) such that if we multiply the entire equation (2.21) by q(x) then we can write the left-hand side as a perfect derivative. In other words

q(x) y' + q(x) p(x) y = d/dx [q(x) y(x)] = q(x) y' + q'(x) y.   (2.26)

It follows that

q'(x) = q(x) p(x)  ⇒  p(x) = q'(x)/q(x);   (2.27)

we can rule out the pathological case q(x) ≡ 0 because that would mean that we would have multiplied the equation (2.21) by zero to give 0 = 0; a true statement, but rarely very helpful. Equation (2.27) is separable and integrating once we obtain

log q = ∫^x p(u) du ≡ P(x),   (2.28)

where u is a dummy variable of integration. Taking the exponential of equation (2.28) gives

q(x) = e^{P(x)} = e^{∫^x p(u) du},   (2.29)

where q(x) is known as the integrating factor. The name arises because if we multiply equation (2.21) by the factor q(x) then we can integrate the resulting equation directly. We must multiply both sides of equation (2.21) by q(x), which gives

d/dx [y e^{P(x)}] = r(x) e^{P(x)}.   (2.30)

Integrating once yields

y e^{P(x)} = C + ∫^x r(u) e^{P(u)} du,

where C is a constant of integration and, again, u is a dummy variable of integration. Dividing through by e^{P(x)} we obtain an explicit expression for the solution

y(x) = C e^{−P(x)} + e^{−P(x)} ∫^x r(u) e^{P(u)} du = y_c(x) + y_p(x).   (2.31)

The term y_c(x) = C e^{−P(x)} is known as the complementary solution because L y_c = 0. By linearity, L(K y_c) = K L y_c = 0, which means that y_c can be multiplied by any constant K without changing its behaviour when operated on by L. The integral term, y_p(x), contains no free constants and is called a particular integral[11] because L y_p = r(x). The equation (2.30) is called the self-adjoint form of the equation (2.21).

Example 2.10. Using the general formula (2.31)
Find the general solution to the ODE (2.23)

y' + 5y = 1,

by using the general formula (2.31).

[11] The solution of linear ordinary differential equations of any order can always be split into complementary solutions and particular integrals, see §3.2.3.

Solution 2.10. On comparison with the general form of a first-order linear ODE (2.21), y' + p(x) y = r(x), it follows that p(x) = 5 and r(x) = 1.
Using the definition (2.28), we have

P(x) = ∫^x 5 du = 5x,

and the general solution follows from equation (2.31)

y = C e^{−5x} + e^{−5x} ∫^x e^{5u} du  ⇒  y = C e^{−5x} + e^{−5x} × (1/5) e^{5x} = C e^{−5x} + 1/5.   (2.32)

The solution agrees with our previous result (2.25) and we note that 1/5 is the particular integral and C e^{−5x} is the complementary solution.

Example 2.11. A less-trivial example of using integrating factors
Find the general solution of the ordinary differential equation

(x² + 1) y' − (1 − x)² y = x e^{−x},   (2.33a)

and, hence, find the solution of (2.33a) that satisfies the initial condition

y(0) = 0.   (2.33b)

Solution 2.11. Dividing equation (2.33a) by (x² + 1) gives

y' − [(1 − x)²/(1 + x²)] y = [x/(1 + x²)] e^{−x},

which is in the form (2.21) where p(x) = −(1 − x)²/(1 + x²). Hence, by equation (2.28),

P(x) = −∫ (1 − x)²/(1 + x²) dx = −∫ (1 − 2x + x²)/(1 + x²) dx = −∫ [1 − 2x/(1 + x²)] dx = −x + log(1 + x²).   (2.34)

Then

e^{P(x)} = e^{−x + log(1 + x²)} = e^{−x} e^{log(1 + x²)} = (1 + x²) e^{−x}.

The self-adjoint form of equation (2.33a) is

d/dx [y (1 + x²) e^{−x}] = [x/(1 + x²)] e^{−x} (1 + x²) e^{−x} = x e^{−2x}.

Integrating once gives

y (1 + x²) e^{−x} = C − (1/2)(x + 1/2) e^{−2x},

and dividing by (1 + x²) e^{−x} we obtain the general solution

y = [C e^{x} − (1/2)(x + 1/2) e^{−x}] / (1 + x²).

The solution that satisfies the initial condition (2.33b) must have

0 = C − 1/4  ⇒  C = 1/4,

so

y = [(1/4) e^{x} − (1/2)(x + 1/2) e^{−x}] / (1 + x²) = [sinh x − x e^{−x}] / [2(1 + x²)].   (2.35)

2.4.2 Existence and uniqueness revisited

The explicit construction (2.31) gives a solution to any linear, first-order ordinary differential equation, and for a given initial condition that solution will be unique. Hence, we can prove a stronger theorem than Picard's existence theorem, 2.1, when the equation is linear.

Theorem 2.2. If the functions p(x) and r(x) are continuous on an open interval x ∈ (a, b), then there exists a unique solution y = y(x) to the initial value problem

y' + p(x) y = r(x), with y(x_0) = y_0, where x_0 ∈ (a, b),

in the interval a < x < b.
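Theorem 2.2 rests on the explicit construction (2.31), and that formula can also be evaluated numerically when the integrals have no closed form. A Python sketch (the function name and the trapezoidal quadrature are my choices, not from the notes), checked against example 2.10, where p(x) = 5, r(x) = 1 and y(0) = 1:

```python
from math import exp

def solve_linear_ivp(p, r, x0, y0, x_end, n=10000):
    """Evaluate y(x_end) from (2.31), y = e^{-P} (y0 + integral of r e^P),
    with P(x) the integral of p from x0, using the trapezoidal rule."""
    h = (x_end - x0) / n
    P = 0.0    # running value of P(x)
    I = 0.0    # running value of the integral of r(u) e^{P(u)}
    x = x0
    for _ in range(n):
        P_next = P + 0.5 * h * (p(x) + p(x + h))
        I += 0.5 * h * (r(x) * exp(P) + r(x + h) * exp(P_next))
        P, x = P_next, x + h
    return exp(-P) * (y0 + I)

# y' + 5y = 1 with y(0) = 1; exact solution y = 1/5 + (4/5) e^{-5x}
y = solve_linear_ivp(lambda x: 5.0, lambda x: 1.0, 0.0, 1.0, 1.0)
exact = 0.2 + 0.8 * exp(-5.0)
print(abs(y - exact))   # only the small quadrature error remains
```

Taking the lower limits of both integrals at x_0 fixes the arbitrary constant so that y(x_0) = y_0, which is exactly the initial-value reading of theorem 2.2.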
The continuity requirements on p(x) and r(x) guarantee that the integrals in equation (2.31) actually exist.

2.4.3 Solution by power series

Example 2.12. A power-series solution
Find the (unique) solution to the initial value problem

y' − 2xy = 0, where y(0) = 1,   (2.36)

by assuming that the solution can be written as a power series expansion about x = 0

y(x) = Σ_{n=0}^{∞} a_n x^n,   (2.37)

where the a_n are constants.

Solution 2.12. Assuming that the power series converges, we can differentiate equation (2.37) term-by-term to obtain

y' = Σ_{n=0}^{∞} n a_n x^{n−1} = Σ_{n=1}^{∞} n a_n x^{n−1}.   (2.38)

Substituting the series (2.37) and (2.38) into the ordinary differential equation (2.36) gives

Σ_{n=1}^{∞} n a_n x^{n−1} − 2x Σ_{n=0}^{∞} a_n x^n = Σ_{n=1}^{∞} n a_n x^{n−1} − 2 Σ_{n=0}^{∞} a_n x^{n+1} = 0,

which may be written in expanded form as

a_1 x⁰ + {2a_2 − 2a_0} x + {3a_3 − 2a_1} x² + · · · + {(n + 1)a_{n+1} − 2a_{n−1}} x^n + · · · = 0.   (2.39)

The equation (2.39) must be satisfied for every value of x, which is possible if (and only if) the coefficient of every power of x is zero[12]. Setting the coefficient of each power of x to zero gives

x⁰ : a_1 = 0,   (2.40a)
x¹ : 2a_2 − 2a_0 = 0  ⇒  a_2 = a_0,   (2.40b)
 ⋮
x^n : (n + 1) a_{n+1} − 2a_{n−1} = 0  ⇒  a_{n+1} = 2a_{n−1}/(n + 1).   (2.40c)

[12] A consequence of the “deep” orthogonality property of polynomials.

The equation (2.40c) is known as a recurrence relation because it can be used recursively to find the values of the coefficients {a_n}, provided that initial values are known. From equation (2.40a), a_1 = 0 and by the recurrence relation (2.40c) all odd coefficients are zero

a_1 = a_3 = a_5 = · · · = 0.

Using equation (2.40b) with the recurrence relation allows the calculation of the even coefficients

(n = 1): a_2 = a_0,  (n = 3): a_4 = (2/4) a_2 = (1/2!) a_0,  (n = 5): a_6 = (2/6) a_4 = (1/3!) a_0,

and, in fact, a_{2n} = (1/n!) a_0. The original power-series expression (2.37) is, therefore,

y(x) = a_0 [1 + x² + (1/2!) x⁴ + (1/3!) x⁶ + · · ·] = a_0 Σ_{n=0}^{∞} (1/n!) x^{2n} = a_0 e^{x²}.

Finally, the initial condition y(0) = 1 gives a_0 = 1 and so the solution is

y(x) = e^{x²}.

Note that the initial condition can be applied directly to the power-series representation. Evaluating (2.37) at x = 0 gives

y(0) = Σ_{n=0}^{∞} a_n 0^n = a_0,

because 0⁰ = 1, but all other powers of zero are zero. Thus, in most cases, a_0 is set by the initial condition.

In fact, the equation (2.36) is separable, or it can be integrated directly after multiplication by the integrating factor e^{−x²}; either way, the exact solution that satisfies the initial condition y(0) = 1 is y = e^{x²}, in agreement with the power-series method.

The power-series method should be used with caution because it is strictly only valid in the region near x = 0 in which the series converges. In the present example, the series converges everywhere and the method gives the global solution, but, in general, it will give only a local approximation to the solution.

Example 2.13. A “forced” case
Find the (unique) solution to the initial value problem

y' − 2xy = x², where y(0) = 1,   (2.41)

by assuming that the solution can be written as a power series expansion about x = 0.

Solution 2.13. The equation (2.41) is very similar to equation (2.36), but there is an additional “forcing” term of x² on the right-hand side. Despite this, the method of solution remains exactly the same as in example 2.12. Substitution of the series (2.37) and (2.38) into equation (2.41), after subtracting x² from both sides, gives

Σ_{n=1}^{∞} n a_n x^{n−1} − 2 Σ_{n=0}^{∞} a_n x^{n+1} − x² = 0.   (2.42)

The equation (2.42) must be satisfied for every value of x and, again, this is only possible if the coefficient of every power of x is zero. Setting coefficients of each power of x to zero gives

x⁰ : a_1 = 0,   (2.43a)
x¹ : 2a_2 − 2a_0 = 0  ⇒  a_2 = a_0,   (2.43b)
x² : 3a_3 − 2a_1 − 1 = 0  ⇒  a_3 = 1/3,   (2.43c)
 ⋮
x^n : (n + 1) a_{n+1} − 2a_{n−1} = 0  ⇒  a_{n+1} = 2a_{n−1}/(n + 1), for n ≥ 3.   (2.43d)

The recurrence relation (2.43d) is the same as in example 2.12 and the initial condition is the same, y(0) = 1 ⇒ a_0 = 1, so the even coefficients a_0, a_2, a_4, . . . are completely unaffected by the forcing term. Thus, the solution will consist of the even terms, still e^{x²}, but with the addition of any odd terms that are “triggered” by the forcing term. The forcing term appears in the coefficient of x², equation (2.43c), which implies that a_3 = 1/3; and the use of the recurrence relation for the subsequent odd coefficients gives

a_5 = (2/5) a_3 = 2/15,  a_7 = (2/7) a_5 = 4/105,  · · ·  a_{2n+1} = (1/3) 2^{n−1} / Π_{m=2}^{n} (2m + 1).

In this case, there isn't really a “nice” expression for the odd coefficients, but they can all be evaluated, and the final solution is given by

y = e^{x²} + (1/3) x³ + Σ_{n=2}^{∞} [2^{n−1} / Π_{m=1}^{n} (2m + 1)] x^{2n+1},   (2.44)

which consists of a complementary solution, e^{x²}, and a particular integral, the remaining mess. The integrating factor is still e^{−x²} and on multiplication by the integrating factor equation (2.41) becomes

d/dx [y e^{−x²}] = x² e^{−x²}  ⇒  y = C e^{x²} + e^{x²} ∫ x² e^{−x²} dx.   (2.45)

Thus, the difference between the solutions to the forced and unforced cases, the remaining mess in equation (2.44), is the series approximation to the integral[13] term in equation (2.45).

[13] You might like to know that the integral can actually be written as
∫ x² e^{−x²} dx = (√π/4) erf(x) − (1/2) x e^{−x²}, where erf(x) ≡ (2/√π) ∫_0^x e^{−ξ²} dξ is called the error function.

Example 2.14. A warning about series
Find the general solution to the ODE

x y' + y = 10,   (2.46)

using series solution methods.

Solution 2.14. If, as before, we pose

y = Σ_{n=0}^{∞} a_n x^n  ⇒  y' = Σ_{n=0}^{∞} n a_n x^{n−1},

and substitute these expressions into equation (2.46), we obtain

x Σ_{n=0}^{∞} n a_n x^{n−1} + Σ_{n=0}^{∞} a_n x^n = Σ_{n=0}^{∞} (n + 1) a_n x^n = 10.

Gathering coefficients of powers of x implies that

a_0 = 10,  (n + 1) a_n = 0  ⇒  a_n = 0 for n > 0.
The solution is then y = 10, which satisfies the equation (2.46) but does not contain a free constant. The equation (2.46) is linear, so we should have a complementary solution; where is it? The equation (2.46) can be solved directly because the left-hand side is an exact derivative: x y' + y = (x y)' = 10. Integrating, we obtain

x y = 10x + C  ⇒  y = 10 + C/x;

the complementary solution C/x was not included in our series representation, so we did not find it. This example demonstrates that a simple Taylor-series method (series with positive powers of x) only works if the solution possesses a convergent Taylor expansion in the region of interest[14].

[14] The method can be generalised by considering a series expansion of the form y = Σ_{n=0}^{∞} a_n x^{n+c}, where a_0 ≠ 0 by choice and the value c must be determined as part of the solution. A power series with negative powers of x is called a Laurent series.

2.5 Exact Solutions: Nonlinear Equations

In contrast to linear equations, there is no general solution method to find analytic solutions of nonlinear equations. Existence and uniqueness are harder to establish and, in general, non-linear equations do not have unique solutions. There are many special forms of non-linear equations for which analytic solutions are possible, however, e. g. separable equations can be non-linear, see example 2.1. In fact, the majority of these solution methods use substitutions to convert the nonlinear equation into a linear or separable equation. We shall discuss only one of the many special forms; a more comprehensive list can be found in the “Handbook of Differential Equations” by Zwillinger.

2.5.1 Homogeneous equations

A first-order ODE is said to be homogeneous if it can be written in the form

dy/dx = f(y/x),   (2.47)

for some function f. Examples are shown below (some rearrangement may be necessary):

dy/dx = y/x,   (2.48a)
x² dy/dx + y² − xy = 0,   (2.48b)
dy/dx = (2x³y − y⁴)/(x⁴ − 2xy³).   (2.48c)

Equation (2.47) can be solved by means of the substitution z(x) = y(x)/x; i. e. y(x) = x z(x). Differentiating with respect to x and using the product rule gives

dy/dx = d/dx [x z(x)] = z(x) + x dz/dx,

and so equation (2.47) becomes

z(x) + x dz/dx = f(y/x) = f(z),   (2.49)

which is now separable

dz/(f(z) − z) = dx/x  ⇒  ∫ dz/(f(z) − z) = ∫ dx/x.   (2.50)

Equation (2.50) can be evaluated provided, of course, that we can perform the integral in z.

Example 2.15. Solving a homogeneous equation
Find the general solution of the ODE (2.48b)

x² dy/dx + y² − xy = 0.

Solution 2.15. First, we put the equation (2.48b) into homogeneous form on division by x² (we assume x ≠ 0)

dy/dx + (y/x)² − y/x = 0.   (2.51)

Now, we let z = y/x, so that dy/dx = z + x dz/dx and equation (2.51) becomes

z + x dz/dx + z² − z = 0  ⇒  x dz/dx = −z².

Dividing both sides by −x z² and integrating with respect to x gives

∫ −(1/z²) dz = ∫ dx/x  ⇒  1/z = log |x| + C.

Returning to the original variables we obtain our final solution

1/z = x/y = log |x| + C  ⇒  y = x/(log |x| + C).   (2.52)

Example 2.16. Solving a homogeneous equation including boundary conditions
Find the solution of the initial value problem

x y dy/dx + x² + y² = 0, where y(1) = 1.   (2.53)

Solution 2.16. We divide through by x y (y ≠ 0, x ≠ 0) to put the equation into homogeneous form

dy/dx + x/y + y/x = 0,

and using the usual substitution z = y/x we obtain

z + x dz/dx + 1/z + z = 0  ⇒  x dz/dx = −(1 + 2z²)/z.   (2.54)

We could now separate variables and integrate, keep the arbitrary constant of integration and find its value by using the initial condition. Instead, however, we shall build the initial condition into the limits of our integration. For the lower limit, we let x = 1, for which y = 1 and so z = y/x = 1; we leave the upper limit unknown as x or z and, to avoid confusion, we introduce two dummy integration variables ξ and ζ.
The equation (2.54) becomes

∫_{ζ=1}^{z} ζ/(1 + 2ζ²) dζ = −∫_{ξ=1}^{x} (1/ξ) dξ,

⇒  [(1/4) log(1 + 2ζ²)]_{ζ=1}^{z} = [−log ξ]_{ξ=1}^{x},

⇒  (1/4)[log(1 + 2z²) − log 3] = −log x + log 1,

⇒  log [(1 + 2z²)/3]^{1/4} = log(1/x).

Note that there is no constant of integration because we are calculating a definite integral. Exponentiating both sides of the equation gives

[(1 + 2z²)/3]^{1/4} = 1/x  ⇒  (1 + 2z²)/3 = 1/x⁴  ⇒  z² = (1/2)(3/x⁴ − 1).

Returning to the original variables we obtain the final (implicit) solution

(y/x)² = (1/2)(3/x⁴ − 1)  ⇒  y² = (1/2)(3/x² − x²).   (2.55)

2.6 Autonomous equations

In my opinion, the concepts introduced in this section are easier to understand if the independent variable is time, t, rather than space, x. The underlying mathematical structure is, of course, the same, but the physical analogies are better when we think of quantities varying in time rather than space. A first-order ODE is autonomous if there is no explicit dependence on the independent variable and can, therefore, be written in the form

ẏ = f(y),   (2.56)

where ẏ = dy/dt. A direct consequence of the autonomy of the system is that the “shape” of the solution is independent of t. In other words, if y_1(t) is a solution of equation (2.56) then another solution is given by y_2(t) = y_1(t + c) for any real constant c. A simple proof is given below.

Let τ = t + c, then y_2(t) = y_1(τ), and differentiating with respect to t gives

ẏ_2 = d/dt [y_1(τ)] = [dy_1(τ)/dτ] (dτ/dt), by the chain rule;
    = dy_1(τ)/dτ, because dτ/dt = 1;
    = f(y_1(τ)), because y_1(τ) is a solution of (2.56);
    = f(y_2(t)).

Hence, y_2(t) also satisfies equation (2.56). In other words, the exact value of the time doesn't matter for the solution; e. g. if you drop a ball from a given height, the path of the ball and the time it takes to hit the ground won't depend on whether you release the ball at 12:00am on a Tuesday or 1:00pm on a Friday[15].
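The time-shift property can also be checked numerically: take the known solution y_1(t) = e^t of ẏ = y, shift it by an arbitrary c, and verify with a centred finite difference that y_2(t) = y_1(t + c) still satisfies the equation. A Python sketch (the shift c = 0.7 and the sample times are arbitrary choices of mine):

```python
from math import exp

f = lambda y: y           # autonomous right-hand side: y' = f(y) = y
y1 = lambda t: exp(t)     # a known solution of the ODE
c = 0.7                   # an arbitrary time shift
y2 = lambda t: y1(t + c)  # the claimed second solution

# check dy2/dt = f(y2(t)) at a few times via a centred difference
dt = 1e-6
for t in [0.0, 1.0, 2.5]:
    deriv = (y2(t + dt) - y2(t - dt)) / (2 * dt)
    assert abs(deriv - f(y2(t))) < 1e-4 * y2(t)
print("y2(t) = y1(t + c) satisfies the autonomous ODE")
```

The same check fails for a non-autonomous equation such as y' = t y, because shifting the solution in time changes which gradient it should have at each instant.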
[15] Assuming that physical conditions like wind, air density, etc. don't change.

2.6.1 The phase portrait

In an autonomous system, the lack of any explicit dependence on t means that the complete behaviour of the system is determined only by y and dy/dt. One useful way of representing the behaviour of the system is to plot ẏ as a function of y, which is known as a phase portrait; hence, the ẏ-y plane is called the phase plane. For equations in the form (2.56), ẏ = f(y), the phase portrait is simply a graph of f(y) as a function of y.

Example 2.17. Plotting a phase portrait
Plot the phase portrait of the ordinary differential equation

ẏ = y.   (2.57)

Solution 2.17. The phase portrait is a straight line passing through the origin with gradient one, see Figure 2.5.

Figure 2.5: Phase portrait corresponding to the ODE ẏ = y.

Interpreting the phase portrait

A lot of information about the solutions of the ODE can be obtained by interpreting the phase portrait. In example 2.17, if our initial condition[16] is such that y > 0, then ẏ > 0. A positive value of ẏ means that y will increase with time and the solution moves “to the right” in the phase portrait as t increases. As y increases, ẏ also increases, see Figure 2.5, and so the solution moves “faster and faster” and always to the right. On the other hand, if the initial condition is such that y < 0, then ẏ < 0 and our solution decreases in value as time increases, moving further and further “to the left” in the phase portrait.

[16] The precise time at which this condition is applied does not matter; that's the beauty of autonomous systems.

The above argument demonstrates that if y(t_i) > 0, for some initial time t_i, then y → ∞ as t → ∞, and when y(t_i) < 0, y → −∞ as t → ∞. Finally, if the initial condition is such that y = 0, then ẏ = 0 and the solution cannot change, i. e. if y(t_i) = 0, then y(t) = 0 for all t > t_i.
The above analysis of the phase portrait reveals all the qualitative information about the possible dynamics of the system without the need to find an explicit solution. In this case, we know that the general solution is y = A e^t, which does exhibit the appropriate qualitative behaviour:

as t → ∞,  A e^t → ∞ if A > 0;  0 if A = 0;  −∞ if A < 0.

2.6.2 Fixed points

Any point for which ẏ = 0 is called a fixed point, or equilibrium point, of the system because once at that point the value of the solution cannot ever change. In example 2.17, the point y = 0 was the only fixed point of the system[17].

[17] Actually that's not quite true; there is also a fixed point “at infinity”, which can be analysed by means of the transformation y = 1/z ⇒ ẏ = −ż/z². The ODE (2.57) in example 2.17 becomes −ż/z² = 1/z ⇒ ż = −z, which has a fixed point at z = 0, corresponding to y → ∞.

Example 2.18. Finding fixed points
Find the fixed points of the ODE

ẏ = y³ − 6y² + 11y − 6.   (2.58)

Solution 2.18. Fixed points occur when ẏ = 0, i. e.

y³ − 6y² + 11y − 6 = 0.   (2.59)

We can factorise the cubic (2.59) by spotting that y = 1 is a solution (or otherwise)

(y − 1)(y² − 5y + 6) = (y − 1)(y − 2)(y − 3) = 0,

and so the fixed points are located at y = 1, y = 2 and y = 3.

Interpretation of fixed points

If a system ever reaches a fixed point then it will stay there for all time (by definition), so, in some sense, the fixed points represent all possible final destinations of the system. Once we know the fixed points of a system, its behaviour can be completely characterised if we can answer the following question: for a given initial condition, which, if any, of the fixed points will the system approach as t → ∞? The answer to this question can be found by examination of the phase portrait. In example 2.17, for y ≠ 0 the solution moves away from the fixed point at y = 0 and so the system always approaches the “fixed point at infinity” unless y = 0.
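The fixed points in example 2.18 can be confirmed, and their stability anticipated, with a few lines of Python. The stability test used here (sign of f'(y) at the fixed point: negative means nearby solutions approach it, positive means they move away) is the standard linearisation criterion; the notes defer stability until later in the course, so treat this as a preview rather than part of the example:

```python
f  = lambda y: y**3 - 6*y**2 + 11*y - 6    # right-hand side of (2.58)
df = lambda y: 3*y**2 - 12*y + 11          # its derivative

fixed_points = [1.0, 2.0, 3.0]             # roots of (y-1)(y-2)(y-3)
for p in fixed_points:
    assert f(p) == 0.0                     # each really is a fixed point
    kind = "unstable" if df(p) > 0 else "stable"
    print(p, kind)   # 1.0 unstable, 2.0 stable, 3.0 unstable
```

So a solution started between 1 and 3 is funnelled towards the middle fixed point at y = 2, exactly the kind of conclusion phase-portrait reading gives without solving the ODE.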
2.6.3 Phase-plane analysis and mathematical modelling

The interpretation of the phase portrait and fixed points of an ordinary differential equation can be very useful when constructing mathematical models. If the expected qualitative behaviour is not present in the phase plane then we can reject the model without having to go to the effort of actually finding solutions. A more dubious approach is to “design” a phase plane that exhibits the appropriate behaviour, but this is closer to “fitting” a model to the data, rather than building a model from underlying fundamental principles to explain the data. The “fitting” approach is not without its uses, but the true test of a model is whether it has predictive power: can the model predict previously unobserved (and experimentally verifiable) behaviour?

In example 1.2, we derived a simple (linear) model of population growth

Ṅ = f(N) = αN,

but the model was rejected because it predicted unbounded growth for α > 0 (the phase portrait is essentially the same as in example 2.17, but the gradient of the line is α rather than one). The unbounded growth arises because the rate of change, Ṅ, increases as N increases: the population grows faster the bigger it gets. We can remove this property by changing the function f(N) so that Ṅ decreases for sufficiently large populations and the rate of growth eventually slows down. If Ṅ is to increase for small N, but decrease for large N, then f(N) must have a turning point and must be a nonlinear function of N.

Example 2.19. Population growth revisited: the logistic equation
The logistic equation is a nonlinear, autonomous ODE that represents another simple model of population growth,

Ṅ = αN(1 − N/N_1),   (2.60)

where α > 0 and N_1 > 0 are constants. Sketch the phase portrait for equation (2.60) and find the fixed points of the system. Establish the qualitative behaviour of the system by phase-plane analysis.
Confirm your analysis by finding the general solution of equation (2.60).

Solution 2.19.
Sketch: The phase portrait is given by the quadratic g(N) ≡ αN(1 − N/N_1), which is zero at the points N = 0 and N = N_1 and has a maximum at (N_1/2, αN_1/4), see Figure 2.6.

Figure 2.6: Phase portrait corresponding to the ODE Ṅ = αN(1 − N/N_1).

Fixed points: Fixed points correspond to the zeros of g(N) and so there are two fixed points, at N = 0 and N = N_1.

Phase-plane analysis: It does not make sense for our population size to be negative, so we shall assume that N ≥ 0. For N > 0 and small, Ṅ > 0, so N increases and moves away from the fixed point at N = 0. N will keep increasing because Ṅ remains positive until N = N_1, at which point the system is at a fixed point and N will remain at N_1 for all time. If we start with N > N_1, then Ṅ < 0 and the population decreases until N = N_1, and again the system reaches the fixed point at N_1. Thus, for any initial condition N > 0 the population will eventually reach the fixed point at N = N_1. The fixed point at N = 0 is said to be unstable because the system moves away from that point. The fixed point at N = N_1 is said to be stable because the system moves towards it; we shall have more to say about this in the mechanics section of the course.

General solution: The logistic equation is separable and we can find an explicit expression for the general solution. Multiplying both sides by N_1/(N(N_1 − N)) and integrating with respect to t gives

∫ N_1/(N(N_1 − N)) dN = ∫ α dt.

We use partial fractions to rewrite the left-hand side

∫ [1/N + 1/(N_1 − N)] dN = log N − log(N_1 − N) = αt + C,

and, hence,

N/(N_1 − N) = A e^{αt},   (2.61)

where A = e^C. We denote the initial population (at t = 0) by N_0, so A = N_0/(N_1 − N_0). Rearranging equation (2.61) gives

N = (N_1 − N) A e^{αt}  ⇒  N(1 + A e^{αt}) = N_1 A e^{αt}  ⇒  N = N_1 A e^{αt}/(1 + A e^{αt}),

⇒  N = N_1 N_0 / (N_0 + (N_1 − N_0) e^{−αt}).   (2.62)

Firstly, observe that if N_0 = 0, then N = 0 for all time.
Next, we consider the limit t → ∞, for which e^{−αt} → 0, because α > 0; hence,

N → N_1 N_0 / N_0 = N_1, as t → ∞ (N_0 ≠ 0),

and the system always evolves towards the stable fixed point, in agreement with the analysis of the phase portrait. Finally, we remark that if we take the limit N_1 → ∞ of the solution (2.62), we obtain

lim_{N_1→∞} N_1 N_0 / (N_0 + (N_1 − N_0) e^{−αt}) = lim_{N_1→∞} N_0 / (N_0/N_1 + (1 − N_0/N_1) e^{−αt}) = N_0 / e^{−αt} = N_0 e^{αt},

in agreement with the solution of our simple linear model of population growth (1.4).
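As a final check, the closed-form solution (2.62) can be compared with a direct numerical march of the logistic equation; a Python sketch with parameter values (α = 1, N_1 = 100, N_0 = 5) chosen by me for illustration:

```python
from math import exp

alpha, N1, N0 = 1.0, 100.0, 5.0

def N_exact(t):
    """The solution (2.62): N = N1 N0 / (N0 + (N1 - N0) e^{-alpha t})."""
    return N1 * N0 / (N0 + (N1 - N0) * exp(-alpha * t))

# march N' = alpha N (1 - N/N1) with small forward-Euler steps
N, t, h = N0, 0.0, 1e-4
while t < 10.0:
    N += h * alpha * N * (1.0 - N / N1)
    t += h

print(N, N_exact(t))                      # the two agree closely
assert abs(N - N_exact(t)) < 0.05
assert abs(N_exact(50.0) - N1) < 1e-9     # N -> N1 as t -> infinity
```

The numerical trajectory shows exactly the qualitative picture read off the phase portrait: rapid growth away from the unstable fixed point at N = 0, then a smooth approach to the stable fixed point at N = N_1.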