THE UNIVERSITY OF SYDNEY
SCHOOL OF MATHEMATICS AND STATISTICS

MATH 1003 Integral Calculus and Modelling

N.R. O’Brian, C.J. Durrant and D.J. Galloway

© The University of Sydney 2009.

Contents

1 Introduction
2 The Definite Integral: Definition
3 The Definite Integral: Properties
4 The Definite Integral: Applications
5 Integrals as Functions
6 Integration Techniques: 1
7 Integration Techniques: 2
8 Models and Differential Equations
9 Separable Equations
10 Applications of Separable Equations
11 Linear Differential Equations
12 Second-Order Differential Equations
13 Systems of Differential Equations
A Table of Standard Integrals

CHAPTER 1 Introduction

THESE NOTES develop the theory of integration, along with some of its applications. We expect that everyone in the class has had exposure to the ideas of integration. In this course we will fit those ideas into a systematic framework and show how many of the most powerful applications of mathematics are based on integration.

As a first exercise, you should try to describe in a single sentence what you understand by the term ‘integration’. Perhaps your answer looks like one of the following:

• It’s the opposite of differentiation.
• It’s the area under the graph.
• It’s a kind of summation process.
• It’s the result of adding up small changes.

In fact all of these are correct. The diversity of possible answers shows that there are several ways of looking at integration, and understanding the theory depends on appreciating the relationships between these different points of view.

One of the main reasons for getting a good understanding of integration is that many of the important applications of mathematics are based on this idea. A major practical use of mathematics is in building ‘mathematical models’ of different types of physical, biological or financial systems.
Typically these models are equations which express certain relationships between the different quantities involved in the system. Solving the equations then tells us something about the consequences of these relationships. It is very common for the equations to involve derivatives, in which case they are called differential equations. Then we have the problem of deducing the values of some quantity from information about its derivative. This is where we need the theory of integration.

As a simple example, if we know the interest rate on a savings account then we know the rate at which the amount in the account will change (assuming no deposits or withdrawals). We then know the derivative with respect to time of the account balance, and we can try to use this information to find the actual balance after a certain period of time. We will see many other examples of this type of problem.

A Note on Notation

In most of the examples discussed here the independent variable is taken to be x or t. Often t will stand for time, and the derivative of any quantity depending on t can be thought of as the rate at which that quantity is changing. We generally draw graphs according to the usual convention of measuring the independent variable along the horizontal axis, and the dependent variable along the vertical axis.

The fact that a quantity y depends on x is equivalent to saying that y is a function of x. In this case we might write y = f(x). It is also common to think of y itself as the function, and write just y(x) or y = y(x). We often use a prime to denote the derivative, so

\[ \frac{d}{dx} f(x) = f'(x). \]

Rates of Change

One of the basic facts of physics is that the velocity of a falling body increases in proportion to time (at least if we ignore small effects due to air resistance or variations in the gravitational field).[1] If t is time and v(t) the velocity at time t then, if the body starts from rest at t = 0, we can express this fact mathematically as an equation: v(t) = gt, where g is a constant. This is an example of a physical law. It contains a lot of information about falling bodies in a compact formula. In fact it is a simple ‘mathematical model’ of a falling body. Any consequences we can deduce from this mathematical relationship will tell us more about falling bodies. The constant g is the same for all bodies, and can be determined by experiment.

[1] This topic is a recurring theme in these notes. The historical development is sketched in Chapter 8 and the mathematical development in Chapter 9.

Suppose we want to know how far the body has fallen when, say, t = 10. If s(t) is the distance fallen after time t we can use the following argument:

• Velocity is the derivative of distance with respect to time, so we can write v as ds/dt in the previous equation:

\[ \frac{d}{dt} s(t) = gt. \tag{1.1} \]

• Although we don’t know s(t) explicitly, we know its derivative is gt. On the other hand, from what we know about differentiation, the function gt²/2 has derivative gt, so

\[ s(t) = \frac{g t^2}{2} \tag{1.2} \]

certainly satisfies equation (1.1).

• Now this argument is open to the criticism that gt²/2 isn’t the only solution to (1.1). We can add on any constant C without changing the derivative, so we ought to consider the possibility that

\[ s(t) = \frac{g t^2}{2} + C. \tag{1.3} \]

If we specify that s is measured from the starting position, then we have s(t) = 0 when t = 0. This forces us to choose C = 0 anyway, and we end up with the formula s(t) = gt²/2 for the distance fallen after time t. In particular, if t = 10, then s = 50g.
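The argument above can be checked numerically. The following Python sketch is illustrative only: the value g = 9.8 and the helper names s and derivative are assumptions introduced here, not part of the notes. It suggests that gt²/2 + C has derivative gt whatever the constant C, and that the choice C = 0 gives s = 50g at t = 10.

```python
# Illustrative check of equations (1.1)-(1.3).
# Assumption: g = 9.8 (the notes only say that g is a constant).
g = 9.8


def s(t, C=0.0):
    """Candidate distance function s(t) = g t^2 / 2 + C."""
    return g * t**2 / 2 + C


def derivative(func, t, h=1e-5):
    """Central-difference approximation to the derivative of func at t."""
    return (func(t + h) - func(t - h)) / (2 * h)


# The constant C does not affect the derivative.
for C in (0.0, 3.0, -7.5):
    slope = derivative(lambda t: s(t, C), 4.0)
    print(f"C = {C}: ds/dt at t = 4 is about {slope:.6f}, g t = {g * 4.0:.6f}")

# With s(0) = 0 we must take C = 0, and then s(10) = 50 g.
print("s(10) =", s(10.0), " 50 g =", 50 * g)
```

The finite-difference quotient only approximates the derivative, so agreement is up to rounding error rather than exact.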
In this example the process of recovering s(t) from its derivative is called integration. In this sense integration is the reverse of the differentiation process.

The next example is superficially quite similar, but it does not have such a simple answer. In some parts of the world trucks and other heavy vehicles are fitted with a device called a tachograph. This is an instrument which records how fast the vehicle is travelling at each instant of time, typically by moving a pen over graph paper. The output might look like the following.

[Figure 1: tachograph trace of speed (km/h, from 0 to 100) against time (hours, from 0 to 10).]

Mathematically this shows the speed of the vehicle as a function f(t) of time, and Figure 1 is the graph of f(t) over the range 0 ≤ t ≤ 10. This is similar to the preceding example in that we know how the speed depends on time.

The big difference is that, for the falling body, the time dependence was given by a simple formula f(t) = gt, but in this example there is no such formula. The function f is specified by its graph, rather than algebraically. It is still true to say that speed is the derivative of distance with respect to time, so if s(t) is the distance travelled by the vehicle after time t we still have an equation

\[ \frac{d}{dt} s(t) = f(t) \tag{1.4} \]

relating distance to time. Unlike the previous example, we cannot solve this equation in terms of a simple formula for s(t). In this case, the idea that integration is the reverse of differentiation does not help us to find explicitly how the distance travelled depends on t. This shows us that the previous method does not tell the whole story. We need a more general idea, and will return to this theme in the next chapter.

In mathematics the term integration is used for any process of working out the size of some quantity from information about its rate of change. Besides the mathematical definition, the Macquarie Dictionary gives

integrate v., 1. To bring together (parts) into a whole.
In the examples looked at so far we are adding up (and so bringing together) the changes in s(t) obtained from information about the derivative in order to find the cumulative effect on s.

In these examples we are given the derivative of a function explicitly (either as a function or as a graph). This is actually a special case of a more general type of problem to be considered later in the course. The next section gives a brief preview.

Differential Equations

If we have some quantity s depending on a variable t then a differential equation is a relation between s, t and one or more derivatives of s with respect to t. In the previous section we have looked at only the simplest kind of differential equation, exemplified by (1.4). The applications we look at later will lead us to consider many other types of differential equations. Here are some examples.

Exponential Decay and Growth

One of the most common examples of a differential equation describes processes of decay and growth. Suppose we have a sample of mineral ore containing a radioactive isotope. Let x(t) be the mass of the isotope present in the sample at time t. The isotope decays at a rate proportional to the mass present, so the rate of change of x(t) is given by the equation

\[ \frac{dx}{dt} = -ax \tag{1.5} \]

for some positive constant a. This process is called exponential decay.

The reverse of exponential decay is exponential growth. This typically occurs in collections of living organisms where each individual reproduces at a constant rate. This produces population growth which is proportional to the size of the population itself. If x is some measure of the size of the population we can describe this mathematically by the equation

\[ \frac{dx}{dt} = ax, \tag{1.6} \]

where a > 0 is the constant of proportionality.

One thing to notice here is that these equations are not special cases of (1.4). In (1.5) and (1.6) the right hand side depends on the value of x, which is what we are trying to find.
This means we cannot just integrate both sides with respect to t. We return to this problem in a later chapter, but, in the meantime, you may already know the solutions to (1.5) and (1.6). Otherwise they are not too hard to guess from what you know about differential calculus. The description of this kind of growth or decay as ‘exponential’ gives a strong clue. A systematic treatment of these equations will be found in Chapter 9.

Force and Acceleration

According to Newton’s second law of mechanics,

Mass × Acceleration = Force.

If x measures position then acceleration is the second derivative of x with respect to time. If the force depends on both position and time then this relation can be expressed as the differential equation

\[ m \times \frac{d^2 x}{dt^2} = F(x, t). \]

In the special case where F(x, t) = −kx for some constant k > 0 this reduces to the equation of simple harmonic motion:

\[ m \frac{d^2 x}{dt^2} = -kx. \tag{1.7} \]

This is an example of a second order differential equation, since it involves second order derivatives. Notice again that the right side involves the function x(t) that we are looking for, so we cannot just integrate twice to get a solution. The equation has the ‘trivial’ solution x(t) ≡ 0, representing a system ‘at rest’ with x = 0. Otherwise the force −kx has magnitude proportional to the displacement from rest, and is directed back towards the rest position. This is typical of systems such as springs or pendulums. In Chapter 12 we will develop methods for equations of this type which allow us to incorporate the effect of friction, damping and external forces varying with time.

Other Types of Differential Equation

So far all the examples we have looked at involve a single independent variable (which is often, but not always, the time t) and a single unknown dependent variable (x in the previous examples) which we are trying to find as a function x(t) of t. Other types of differential equations are mostly beyond the range of this course.
However, it is worth looking at some examples (without trying to find solutions) in order to get some idea of the scope of this branch of mathematics.

Systems of Differential Equations

Sometimes we have several variables, all depending on a single independent variable. Just as in linear algebra, this situation may involve several equations which we have to solve simultaneously.

One of the most famous examples of such a system of equations is the Volterra Predator-Prey equations. Here x and y are respectively measures of the populations of two animal species, where one of the species, y say, is the predator and the other species x is its prey. We suppose that both x and y vary with time t. The pair of equations

\[ \frac{dx}{dt} = Ax - Bxy, \qquad \frac{dy}{dt} = -Cy + Dxy, \]

where A, B, C, D are positive constants, has been suggested as a simple mathematical model of this situation.

Where does such a model come from? If we forget the xy terms (or equivalently take B = D = 0) we are left with the two equations

\[ \frac{dx}{dt} = Ax, \qquad \frac{dy}{dt} = -Cy, \]

which are just the equations of exponential growth and decay. These equations reflect what might happen to the numbers of prey and predator if they never meet each other: the prey increases, and the predator dies out. In this case the two equations can be solved independently of each other.

The xy terms represent the interaction of the two species. The equations (like the two species themselves) are no longer independent. The extra term has the desired effect of benefiting the predator (increasing its growth rate) and disadvantaging the prey (decreasing its growth rate).

Analysis of these equations and their various refinements leads to many fascinating insights into the behaviour of ecological systems. The Volterra equations take us beyond the scope of this course, but these notes conclude by looking at a version of the system of equations with a simplified interaction.
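The effect of the interaction terms can be explored numerically. The sketch below is not a method from these notes: it uses a simple fixed-step (Euler) approximation, and the function names, parameter values and starting populations are all assumptions chosen for illustration. It takes the predator equation in the form dy/dt = −Cy + Dxy, which reduces to the decay equation dy/dt = −Cy when the interaction is switched off.

```python
import math


def euler_step(x, y, dt, A, B, C, D):
    """One Euler step for dx/dt = A x - B x y, dy/dt = -C y + D x y."""
    dx = A * x - B * x * y
    dy = -C * y + D * x * y
    return x + dt * dx, y + dt * dy


def simulate(x0, y0, t_end, dt, A, B, C, D):
    """Advance (x, y) from t = 0 to t = t_end using steps of size dt."""
    x, y = x0, y0
    for _ in range(round(t_end / dt)):
        x, y = euler_step(x, y, dt, A, B, C, D)
    return x, y


# No interaction (B = D = 0): the prey should grow like e^(A t)
# and the predator should decay like e^(-C t).
x1, y1 = simulate(10.0, 5.0, t_end=1.0, dt=1e-4, A=1.0, B=0.0, C=0.5, D=0.0)
print("prey:    ", x1, "compare", 10.0 * math.exp(1.0))
print("predator:", y1, "compare", 5.0 * math.exp(-0.5))

# With interaction (hypothetical values of B and D): the prey ends up
# lower and the predator higher than in the uncoupled case.
x2, y2 = simulate(10.0, 5.0, t_end=1.0, dt=1e-4, A=1.0, B=0.1, C=0.5, D=0.02)
print("with interaction:", x2, y2)
```

A smaller step dt gives a closer match to the exponential solutions at the cost of more steps; the comparison is approximate in any case.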
Partial Differential Equations

In other situations there may be several independent variables. For example, the temperature T along a uniform rod depends on both position x and time t. Then T = T(x, t) and the heat equation

\[ \frac{\partial^2 T}{\partial x^2} = k \frac{\partial T}{\partial t} \]

is a mathematical expression of physical laws which apply to the flow of heat in a solid body. Here k is a constant depending on the choice of units and the material of the rod. Solving the equation for a specified initial temperature distribution will allow us to predict the future temperature at different parts of the rod.

The displacement of a tightly stretched vibrating string (like the string of a musical instrument) is also a function y(x, t) of two variables, where y is the transverse movement of the string as a function of the position x along the string and the time t. In this case the function y satisfies the wave equation:

\[ \frac{\partial^2 y}{\partial x^2} = c^2 \frac{\partial^2 y}{\partial t^2}, \]

where c is a constant depending on the mass per unit length of the string and its tension.

The equations which describe the phenomena of electromagnetism and fluid flow are also partial differential equations. The equations of fluid flow in particular are very difficult to analyse mathematically, but in many cases can be solved very accurately by powerful computers. Engineering design problems (such as those involving structural analysis or aircraft aerodynamics) which once required extensive model building and prototyping can be handled more efficiently by complex mathematical models based on differential equations and solved on supercomputers.

Infinite Sequences (Stewart § 12.1)[2]

At several points in the course we have to deal with infinite sequences of numbers. We conclude this introduction with a short summary of the basic facts about such sequences.

Infinite sequences appear in almost all areas of mathematics. The natural numbers 0, 1, 2, 3, . . . are the most obvious example of an infinite sequence.
In some cases each term in the sequence is given by a formula depending on its position in the sequence, or there may be some rule which generates each term from the values of preceding terms. Examples include:

• Arithmetic sequences: each term differs from its predecessor by the addition of a constant amount, as in the sequence

1, 4, 7, 10, 13, 16, 19, 22, 25, 28, . . . ,

• Geometric sequences: each term is a constant multiple of its predecessor, as in

1, 2, 4, 8, 16, 32, 64, 128, 256, 512, . . . ,

• Fibonacci sequence: each term (after the second) is the sum of the two preceding terms, as in

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, . . . .

• The terms of the sequence do not have to be integers; they can be any real numbers:

\[ 1, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{4}}, \frac{1}{\sqrt{5}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{7}}, \ldots . \]

The terms of a general sequence of real numbers are usually labelled by the non-negative integers, as in

\[ a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, \ldots . \tag{1.8} \]

Sometimes the labelling starts with 1 rather than 0. Most of the interesting properties of sequences concern what happens to a_n for large n, so this is not an important feature of the sequence.

One important property of a sequence is the existence (or otherwise) of the limit of successive terms. Informally, a number L is the limit of the sequence (1.8) if the a_n ‘home in’ on L as n increases. We write

\[ \lim_{n \to \infty} a_n = L \]

in this case. Sometimes this is abbreviated to lim a_n = L or just a_n → L. The notation

\[ \lim_{n \to \infty} a_n = \infty \]

is used when the terms of the sequence become arbitrarily large. If a sequence has a finite limit, the sequence is said to be convergent. Otherwise it is divergent. In the case of either a finite or infinite limit, the sequence is said to approach, or tend to, the limit.

[2] Throughout the notes we provide cross-references in the form Stewart § 4.1 to the relevant section in Calculus, Sixth Edition by James Stewart (Brooks/Cole Publishing Company, 2009).
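The rules that generate the example sequences above translate directly into short programs. The Python sketch below is illustrative only; the generator names (arithmetic, geometric, fibonacci) and the helper first_terms are assumptions introduced here. It reproduces the listed terms and prints a few values of 1/√n to show the terms homing in on 0.

```python
import math
from itertools import islice


def arithmetic(first, step):
    """Each term is its predecessor plus a constant amount."""
    term = first
    while True:
        yield term
        term += step


def geometric(first, ratio):
    """Each term is a constant multiple of its predecessor."""
    term = first
    while True:
        yield term
        term *= ratio


def fibonacci():
    """Each term after the second is the sum of the two preceding terms."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


def first_terms(sequence, count):
    """Return the first `count` terms of an infinite sequence as a list."""
    return list(islice(sequence, count))


print(first_terms(arithmetic(1, 3), 10))
print(first_terms(geometric(1, 2), 10))
print(first_terms(fibonacci(), 11))

# Terms of 1/sqrt(n) for increasingly large n.
for n in (1, 100, 10_000, 1_000_000):
    print(n, 1 / math.sqrt(n))
```

Printing finitely many terms can only suggest a limit; it does not prove that one exists.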
Worked Example 1.1 Decide which of the following sequences has a limit (finite or infinite):

(a) a_n = 1/n      (b) a_n = √n       (c) a_n = (−1)^n
(d) a_n = 1        (e) a_n = 1/2^n    (f) a_n = 2^n.

Solution. For (a) it is intuitively clear that a_n → 0 as n → ∞ since we can make the terms as small as we like by making n sufficiently large. In case (b) the sequence tends to ∞ since we can make the terms as large as we like by making n sufficiently large. In (c) the sequence has no limit: it oscillates between the two values +1 and −1. The sequence in (d) is an example of a constant sequence. It has a limit equal to the value of each term, in this case 1. The final examples (e) and (f) have limits 0 and ∞ respectively.

Exercises

The following exercises are designed to give a flavour of the type of problem which will be studied in more detail later on. See if you can work out how to tackle them now. Don’t worry if you can’t; we will return to them in due course.

1. A mothball initially has radius 0.5 cm and slowly evaporates.

(a) If V denotes the volume of the mothball and r the radius, use the chain rule for differentiation to show that

\[ \frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt}. \]

(b) Suppose that the rate of change of the volume is proportional to the surface area of the mothball. Express this condition as a differential equation for r as a function of t.

(c) Find a formula for the radius as a function of time, assuming that after 30 days the radius is 0.25 cm. How long before the mothball disappears altogether?

2. A car is travelling at 100 km/h on a level road when it runs out of fuel. Its speed v starts to decrease according to the formula

\[ \frac{dv}{dt} = -kv, \]

where k is a constant. One kilometre after running out of fuel its speed has fallen to 50 km/h. Use the chain rule substitution

\[ \frac{dv}{dt} = \frac{dv}{ds} \frac{ds}{dt} = v \frac{dv}{ds} \]

to solve the differential equation. How far will the car travel from the point where it runs out of fuel?
How long after running out of fuel will the car come to a stop? Is the model reasonable?

Summary of Chapter 1.

• If we know that the derivative of an unknown function s(t) is a given function f(t), we may be able to find s(t) by working backwards, using what we know about differentiation.

• If f(t) is not given by a simple formula (and often even when it is) this method fails. We need to find a more fundamental relation between the derivative of a function and the original function.

• The process of recovering s(t) from information about its derivative is called integration. This information can simply specify the derivative explicitly (by a formula or graph). Other cases lead us to study more general types of differential equation.

• Differential equations are used as ‘mathematical models’ in many areas of science and technology. They may involve ‘ordinary’ or partial derivatives. Sometimes several differential equations need to be satisfied simultaneously. This leads to the analysis of systems of differential equations.

• Later on in the course we need to investigate infinite sequences. One of the most important properties of such sequences is the existence of a limit.

CHAPTER 2 The Definite Integral: Definition

HOW IS THE TOTAL CHANGE in the value of a function over an interval related to the values of its derivative on the interval? We start by investigating this problem from first principles, without using derivatives or differential calculus. Given the rate of change, we want to find the total change. Later on we will relate this to the idea of ‘reverse differentiation’ already discussed.

Velocity vs Distance (Stewart § 5.1)

Imagine a car accelerating along the road over a period of 10 seconds. Suppose the car starts with a speed of 5 m/sec and ends up with a speed of 32.5 m/sec. What can we say about the distance travelled? We can make a rough estimate as follows. Since the car is accelerating the speed is always increasing.
In particular, the speed is always between 5 m/sec and 32.5 m/sec. Over the period of 10 seconds the car therefore travels at least 5 × 10 = 50 metres, but no more than 32.5 × 10 = 325 metres. We can write these two inequalities as:

50 m ≤ Distance travelled ≤ 325 m.

This is a very rough estimate indeed. We can do much better if we know more about the velocity at intervening points of time. For example, suppose that we measure the velocity every two seconds. We can present the results as a table, which might look like the following:

Time (sec)          0     2     4     6     8     10
Velocity (m/sec)    5     14.5  22    27.5  31    32.5

The minimum and maximum velocities over the first two seconds are 5 m/sec and 14.5 m/sec. Therefore the distance travelled in this period is between 5 × 2 = 10 m and 14.5 × 2 = 29 m. Applying this to each interval in turn and adding up over all five intervals, we get a lower estimate of

\[ (5 \times 2) + (14.5 \times 2) + (22 \times 2) + (27.5 \times 2) + (31 \times 2) = 200 \text{ m}, \tag{2.1} \]

and an upper estimate of

\[ (14.5 \times 2) + (22 \times 2) + (27.5 \times 2) + (31 \times 2) + (32.5 \times 2) = 255 \text{ m}. \tag{2.2} \]

The gap between the two estimates is now much smaller, with a maximum possible error of 255 − 200 = 55 m.

It is very instructive to draw a graph of velocity against time and use it to interpret these calculations. This is done in Figure 1 below, where the curved line shows the actual velocity of the car plotted against the time t. On each 2 second interval along the t-axis the height of the dark rectangle is equal to the minimum velocity on that interval. Since the velocity is increasing, this always occurs at the left endpoint of each subinterval. Thus the first rectangle has height 5, the second height 14.5, and so on. The total height of the dark and light rectangles together is equal to the maximum velocity on each interval. For an increasing function this will occur at the right endpoint.
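The arithmetic behind the estimates (2.1) and (2.2) is easy to automate. The Python sketch below uses the two-second table of velocities from the text; the function name distance_bounds is an assumption introduced for illustration, and the calculation relies on the velocity being increasing, so that the minimum on each interval is the left-hand reading and the maximum is the right-hand reading.

```python
# Velocity readings taken every two seconds (from the table above).
times = [0, 2, 4, 6, 8, 10]                 # seconds
velocities = [5, 14.5, 22, 27.5, 31, 32.5]  # metres per second


def distance_bounds(times, velocities):
    """Lower and upper distance estimates for an increasing velocity."""
    lower = 0.0
    upper = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        lower += velocities[i] * dt       # minimum velocity on the interval
        upper += velocities[i + 1] * dt   # maximum velocity on the interval
    return lower, upper


lower, upper = distance_bounds(times, velocities)
print("lower estimate:", lower)           # 200.0
print("upper estimate:", upper)           # 255.0
print("maximum error: ", upper - lower)   # 55.0
```

The same function can be applied to a finer table of readings without change, since it only uses the spacing between consecutive times.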
We relate this geometrical construction to the distance travelled by introducing the idea of area. The width of each rectangle is a time interval and the height corresponds to our estimate of velocity over the same interval. Therefore the product width × height gives the distance travelled during the interval, assuming the velocity is constant and equal to the height of the rectangle. Of course this product is also just the area of the rectangle. Therefore the lower estimate for the distance given by (2.1) is just the sum of the areas of the dark rectangles. Similarly the expression (2.2) is the total area of the dark and light rectangles taken together.

[Figure 1: velocity (m/sec) against time t, with dark and light rectangles on 2 second intervals and a stacked rectangle at the right labelled ‘Difference in Area’.]

The difference between these upper and lower estimates of the distance is then equal to the sum of the areas of the light rectangles. In Figure 1 the light rectangles have been copied over to the right of the diagram and stacked together, in order to better visualize their total area. In fact it is easy to see that the composite rectangle has dimensions (32.5 − 5) × 2, with a total area of 55, in agreement with our earlier calculation.

With more data on the car’s speed we can improve accuracy further still. Suppose we record the speed twice as often, so every second:

Time (sec)          0   1    2     3     4    5    6     7     8    9    10
Velocity (m/sec)    5   10   14.5  18.5  22   25   27.5  29.5  31   32   32.5

This gives us 10 intervals instead of 5, and we can again use the lowest and highest speeds on each interval to estimate the distance travelled. This is shown graphically in Figure 2.

[Figure 2: velocity (m/sec) against time t, with rectangles on 1 second intervals and a stacked rectangle at the right labelled ‘Difference in Area’.]

As before the areas of the shaded rectangles give lower and upper estimates for the distance travelled, and the difference between these two estimates is equal to the total area of the light rectangles.
Comparison with Figure 1 shows that this difference is now much smaller (in fact it is equal to half its previous value). Adding up the upper and lower estimates of distance over each 1 second interval gives us inequalities:

215 m ≤ Distance travelled ≤ 242.5 m.

The maximum possible error is now 242.5 − 215 = 27.5 m.

Of course, we can continue in the same way, using shorter and shorter subintervals. Here is the result of measuring the velocity every 0.5 seconds:

[Figure 3: velocity (m/sec) against time t, with rectangles on 0.5 second intervals and a stacked rectangle at the right labelled ‘Difference in Area’.]

The difference between the upper and lower estimates is again equal to the area of the rectangle drawn at the side of the figure. It should be clear by now that by taking small enough steps, we can make this area as small as we like.

In mathematical terms this means that both the upper and lower estimates approach a common limit as the size of the steps shrinks towards zero. There is an obvious relation between this limit and the area under the curved line. In each case the lower estimate is the sum of areas of rectangles which lie under the curve. The upper estimate is a sum of areas of rectangles which enclose the curve. The area under the curve, like the total distance, therefore also lies between these upper and lower estimates. Since the upper and lower estimates have a common limit, there is only one number with this property. We conclude that

Total Distance = Area under the Curve.

There are two new concepts here. First, we have a way of estimating total distance travelled from a knowledge of velocity. By ‘sampling’ the velocity sufficiently frequently, we can make this estimate as accurate as we wish. Second, we see that the total distance travelled is the same as the area under the graph of velocity plotted against time.

Note that the argument does not depend on having an algebraic formula for the velocity, and it does not use any differential calculus.
The only place where we use the fact that velocity is rate of change of distance with time is in the formula distance = velocity × time for motion with constant velocity. In the next section we generalize this argument into a purely mathematical construction which we can apply to any continuous function.

Riemann Sums (Stewart § 5.1, § 5.2)

Start with a continuous function f(x) defined on an interval [a, b]. Recall that the notation [a, b] stands for the set of real numbers x satisfying the condition a ≤ x ≤ b. For simplicity we assume the function is non-negative, so f(x) ≥ 0 for all x in the interval.

We can mimic the construction of the previous section. Fix an integer N ≥ 1 and divide the interval [a, b] into N subintervals of equal length. This is called a partition of the interval. Then take the minimum value of f(x) on each subinterval, and draw rectangles of this height based on the subintervals. The result for 8 subintervals is shown in Figure 4. Repeat the construction, this time using the maximum value of f(x) on each subinterval. This is shown in Figure 5.

[Figure 4: graph of f(x) on [a, b] with rectangles of minimum height on 8 subintervals.]

[Figure 5: graph of f(x) on [a, b] with rectangles of maximum height on 8 subintervals.]

In the case of Figure 4 the total area of the rectangles is clearly a lower estimate for the area under the graph of f(x). Similarly the total shaded area in Figure 5 is an upper estimate for this area.

We can also express these quantities using summation notation. Let ∆x be the length of the subintervals, so if there are N equal subintervals then ∆x = (b − a)/N. Let m_i and M_i be respectively the minimum and maximum values of f(x) on the ith subinterval. In Figure 4 the area of the ith rectangle is therefore given by the product m_i × ∆x. In Figure 5 the height of the ith rectangle is M_i, so the area is M_i × ∆x.

Let L_N be the total area of the smaller rectangles, as in Figure 4. Then

\[ L_N = (m_1 \times \Delta x) + (m_2 \times \Delta x) + \cdots + (m_N \times \Delta x) = \sum_{i=1}^{N} m_i \times \Delta x. \tag{2.3} \]
The number L_N is called a Riemann Lower Sum for the function f on the interval [a, b]. It depends not only on N, but also on f and the interval [a, b].

Similarly, let U_N be the total area of the larger rectangles (as in Figure 5). Then

\[ U_N = (M_1 \times \Delta x) + (M_2 \times \Delta x) + \cdots + (M_N \times \Delta x) = \sum_{i=1}^{N} M_i \times \Delta x. \tag{2.4} \]

This is called a Riemann Upper Sum for f on [a, b]. As before, the Riemann lower and upper sums give us lower and upper estimates for the area under the graph of f(x):

L_N ≤ Area under the Graph ≤ U_N.

What happens as the number N of subintervals is increased or, equivalently, as the length of each subinterval is decreased? Figure 6 shows the effect of taking first 16 and then 32 subintervals. In these pictures both the upper and lower estimates are shown.

[Figure 6: upper and lower rectangles for the same function, in two panels labelled n = 16 and n = 32.]

We can think of the rectangles for the lower sum geometrically, as the tallest rectangles that will fit under the graph. Similarly, the rectangles for the upper sum are the shortest which enclose the graph. The rectangles contributing to the Riemann lower sum are unshaded, so the difference between the upper and lower sums is equal to the total area of the shaded rectangles.

In our earlier example we were able to use the fact that the function was increasing over the whole interval to visualize this difference. In this example the function is increasing for some values of x and decreasing for others, so it is not so easy to find a simple interpretation of the difference in areas. However, the pictures still suggest that the difference approaches zero as the number of intervals is increased.

In fact, careful analysis (using more sophisticated mathematical ideas than we have available at this stage) confirms this intuition: as the size of the subintervals is decreased to zero, the upper and lower Riemann sums approach a common value. This is the number we call the definite integral of f(x) over the interval [a, b].
We can summarize this as follows.

The Definite Integral.
Suppose that f(x) is a continuous function defined on the interval [a, b]. For each integer N ≥ 1 we can divide [a, b] into N equal subintervals and form the associated Riemann upper and lower sums. As N → ∞ both the upper and lower sums approach the same value. This value is called the definite integral of f over the interval [a, b], and is written as

\[ \int_a^b f(x)\,dx. \]

It is the unique number which satisfies

\[ L_N \le \int_a^b f(x)\,dx \le U_N \]

for all N ≥ 1.

Since the area under the graph of f(x) satisfies the same inequalities, it must be equal to the integral of f over the interval. See below for how to interpret the area in the case where f takes negative values.

Since both the upper and lower sums have the same limit as N increases, we can also write

\[ \lim_{N \to \infty} L_N = \lim_{N \to \infty} U_N = \int_a^b f(x)\,dx. \]

Non-positive Functions (Stewart § 5.2)

If f(x) takes negative values we must modify the above argument slightly. We are guided here by the algebraic formulas (2.3) and (2.4) for the lower and upper sums. As before we divide [a, b] into N equal subintervals and let m_i and M_i be the minimum and maximum values of f on the ith subinterval. Now one or both of these numbers may be negative.

We can still use rectangles to visualize the upper and lower Riemann sums. If m_i (or M_i) is negative the rectangle appears below the axis. The term m_i × ∆x (or M_i × ∆x) in the Riemann sum is equal to minus the area of the rectangle. The Riemann sum is therefore equal to the sum of the areas of all the rectangles above the axis, minus the sum of the areas of all the rectangles below the axis.

It is still true that the lower and upper Riemann sums converge to a common value. The relation between the definite integral and area continues to hold, except that areas below the axis count as negative.
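The definition can be tried out on a simple function that takes negative values. The Python sketch below is illustrative only: the function name lower_upper_sums and the choice f(x) = x on [−1, 2] are assumptions made here, and the code is valid only for increasing functions, where the minimum and maximum on each subinterval occur at its left and right endpoints. For this f the region above the axis is a triangle of area 2 and the region below is a triangle of area 0.5, so counting the area below the axis as negative gives 1.5.

```python
def lower_upper_sums(f, a, b, N):
    """Riemann lower and upper sums for an INCREASING function f on [a, b].

    For an increasing function the minimum on the i-th subinterval is at its
    left endpoint and the maximum is at its right endpoint.
    """
    dx = (b - a) / N
    lower = sum(f(a + (i - 1) * dx) * dx for i in range(1, N + 1))
    upper = sum(f(a + i * dx) * dx for i in range(1, N + 1))
    return lower, upper


def f(x):
    """Example function, negative on [-1, 0) and positive on (0, 2]."""
    return x


for N in (10, 100, 1000):
    L_N, U_N = lower_upper_sums(f, -1.0, 2.0, N)
    print(f"N = {N}: L_N = {L_N:.4f}, U_N = {U_N:.4f}, U_N - L_N = {U_N - L_N:.4f}")
```

As N grows the two sums close in on 1.5 from either side, and their difference equals (f(b) − f(a)) × ∆x, the same stacked-rectangle calculation used for the accelerating car.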
[Figure 7: two panels, Upper Sum and Lower Sum, for a function f(x) on the interval from a to b]

Note

In this discussion we have always used a subdivision of the interval of integration into subintervals of equal length. This is sufficient for many practical applications, but from a theoretical point of view there are some advantages in relaxing this condition. If we subdivide into (finitely many) subintervals of possibly different lengths the upper and lower Riemann sums are still defined. It can be proved that both sums converge to the value of the definite integral, as defined above, as the length of the longest subinterval is decreased towards zero.

Exercises

1. Partition the interval $[1, 2]$ into $N$ subintervals of equal length,
$$[x_0, x_1],\ [x_1, x_2],\ \ldots,\ [x_{N-1}, x_N] .$$
(a) Show that $x_i = (N + i)/N$.
(b) Show that the maximum value of $f(x) = 1/x$ on $[x_{i-1}, x_i]$ is $N/(N + i - 1)$ and the minimum value is $N/(N + i)$.
(c) Show that the upper and lower Riemann sums for $f(x) = 1/x$ on $[1, 2]$ with $N$ subintervals are respectively
$$\frac{1}{N} + \frac{1}{N+1} + \cdots + \frac{1}{2N-1} \quad\text{and}\quad \frac{1}{N+1} + \frac{1}{N+2} + \cdots + \frac{1}{2N} .$$
(d) Find a value of $N$ such that the difference between the upper and lower sums is less than $10^{-6}$.

Summary of Chapter 2.

• For a vehicle travelling with variable velocity, we can estimate the total distance travelled over an interval of time by dividing the interval into equal subintervals and making the approximation that the velocity is constant on each subinterval. Taking this constant velocity equal to the minimum (resp. maximum) velocity on each subinterval gives a lower (resp. upper) estimate for the distance travelled.
• The difference between the upper and lower estimates approaches zero as the length of the subintervals approaches zero. Their common limit is the actual distance travelled.
• A similar construction can be applied to any continuous function $f(x)$ defined on an interval $[a, b]$.
In this case the upper and lower estimates are called the upper and lower Riemann sums.
• The common limit of the upper and lower Riemann sums as the length of the subintervals shrinks to zero is called the definite integral of $f$ over $[a, b]$. This number can also be interpreted as the area under the graph of $f$ (areas below the axis counting as negative).

CHAPTER 3 The Definite Integral: Properties

IN A GIVEN EXAMPLE the upper and lower Riemann sums may be difficult to calculate. The reason for this is that, except in some special situations, it may be hard (or inconvenient) to find the minimum and maximum values of $f(x)$ on each subinterval. If the function $f(x)$ increases with $x$ then the minimum value on an interval is always at the left endpoint, and the maximum value at the right. There is a similar conclusion (with right and left reversed) if the function is decreasing on the interval. In general the situation is not so simple. Fortunately we can avoid this difficulty by introducing a more general idea of Riemann sum. The upper and lower Riemann sums will then appear as special cases of this construction. As usual $f(x)$ is a continuous function defined on the interval $[a, b]$.

Practical Calculation of Riemann Sums (Stewart § 5.2)

Suppose $[a, b]$ is divided into $N$ equal subintervals. For $1 \le i \le N$, let $c_i$ be any point in the $i$th subinterval and form the sum

$$(f(c_1) \times \Delta x) + (f(c_2) \times \Delta x) + \cdots + (f(c_N) \times \Delta x) = \sum_{i=1}^{N} f(c_i) \times \Delta x . \tag{3.1}$$

This is the sum of areas of rectangles of width $\Delta x$ and height $f(c_i)$. Figure 1 shows the result of one choice of the $c_i$. In this diagram the sum (3.1) is equal to the total area of the shaded rectangles and the dots are located at the points $(c_i, f(c_i))$. In this example the $c_i$ were chosen randomly. Alternatively we could consistently choose $c_i$ to be the midpoint, left or right endpoint of the $i$th subinterval.
Whatever the choice of the $c_i$, we certainly have $m_i \le f(c_i) \le M_i$, since $m_i$ and $M_i$ are the minimum and maximum values of $f$ on this subinterval. Multiply by $\Delta x$ and add up over all subintervals. Then the sums $\sum m_i \Delta x$ and $\sum M_i \Delta x$ are respectively the lower and upper Riemann sums $L_N$ and $U_N$, and we get the inequality:

$$L_N \le \sum_{i=1}^{N} f(c_i)\,\Delta x \le U_N . \tag{3.2}$$

[Figure 1: a general Riemann sum on the interval from a to b]

The middle expression here is an example of a (general) Riemann Sum for $f(x)$ on the interval $[a, b]$. Its value obviously depends on the choice of the $c_i$. By taking $N$ large enough, we can make $L_N$ and $U_N$ as close as we like to the value of the definite integral. Since the integral also lies between $L_N$ and $U_N$, the inequalities (3.2) then imply that any Riemann sum differs from the actual value of the integral by at most $U_N - L_N$. We illustrate this with an example.

Worked Example 3.1

Use Riemann sums to estimate the integral

$$\int_1^2 \sin x \, dx .$$

Use a partition of $[1, 2]$ into 20 subintervals and calculate the Riemann sum for each of the three cases:
(a) $c_i$ is the left endpoint of the $i$th subinterval,
(b) $c_i$ is the right endpoint of the $i$th subinterval,
(c) $c_i$ is the midpoint of the $i$th subinterval.

Solution. Each subinterval has length $1/20 = 0.05$, so the points

$$x_i = 1.0 + 0.05 \times i \tag{3.3}$$

(for $0 \le i \le 20$) will mark out the interval $[1, 2]$ into the required subintervals. The $i$th subinterval will then be the interval $[x_{i-1}, x_i]$. The three Riemann sums correspond to the choices

$$c_i = x_{i-1}, \qquad c_i = x_i, \qquad c_i = x_{i-1} + 0.025 . \tag{3.4}$$

In each case we have to work out the sum

$$\sum_{i=1}^{20} \sin c_i \,\Delta x = \left( \sum_{i=1}^{20} \sin c_i \right) \times 0.05 . \tag{3.5}$$

This is quite easy with a programmable calculator or by writing a simple computer program. But perhaps the simplest way to do the calculation is to use a spreadsheet program. In the first column enter the numbers 1 to 20. On most spreadsheets these will be the cells A1 to A20. Fill the second column with the numbers $x_{i-1}$ using the formula (3.3).
Then the next three columns can be filled with the various choices of $\sin c_i$, using the formulas (3.4). Here is the resulting table, where the columns (1), (2), (3) contain the values of $\sin c_i$ for the three specified choices of $c_i$.

  i    x_{i-1}      (1)        (2)        (3)
  1    1.000000   0.841471   0.867423   0.854714
  2    1.050000   0.867423   0.891207   0.879590
  3    1.100000   0.891207   0.912764   0.902268
  4    1.150000   0.912764   0.932039   0.922690
  5    1.200000   0.932039   0.948985   0.940806
  6    1.250000   0.948985   0.963558   0.956570
  7    1.300000   0.963558   0.975723   0.969944
  8    1.350000   0.975723   0.985450   0.980893
  9    1.400000   0.985450   0.992713   0.989391
 10    1.450000   0.992713   0.997495   0.995415
 11    1.500000   0.997495   0.999784   0.998952
 12    1.550000   0.999784   0.999574   0.999991
 13    1.600000   0.999574   0.996865   0.998531
 14    1.650000   0.996865   0.991665   0.994576
 15    1.700000   0.991665   0.983986   0.988134
 16    1.750000   0.983986   0.973848   0.979223
 17    1.800000   0.973848   0.961275   0.967864
 18    1.850000   0.961275   0.946300   0.954086
 19    1.900000   0.946300   0.928960   0.937923
 20    1.950000   0.928960   0.909297   0.919416
 Riemann sum      0.954554   0.957946   0.956549

Once these numbers are entered, we can calculate the Riemann sums by summing the columns and using the formula (3.5). This gives the values shown in the bottom line of the table.

Of course (anticipating a later section), it is easy to calculate this integral exactly. In fact

$$\int_1^2 \sin x \, dx = \int_1^2 \frac{d}{dx}(-\cos x)\, dx = (-\cos 2) - (-\cos 1) \approx 0.956449 .$$

Note that in this example none of the three Riemann sums gives the lower or upper Riemann sum. The maximum value of $\sin x$ occurs at $x = \pi/2$, which is inside the range of integration. Up to this point the function is increasing. After this point it is decreasing. On some subintervals the maximum value of the function occurs at the right endpoint, on others at the left endpoint, and on one subinterval (the one containing $\pi/2$) in the interior.
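The text notes that the sums in Worked Example 3.1 can also be found by writing a simple computer program. The following is one possible sketch in Python, not taken from the original notes. The function name `riemann_sum` and its `rule` argument are illustrative choices. The three rules correspond to the choices of $c_i$ in (3.4).

```python
import math

def riemann_sum(f, a, b, n, rule):
    """Riemann sum of f on [a, b] with n equal subintervals.
    rule selects c_i: 'left', 'right' or 'mid' point of each subinterval."""
    dx = (b - a) / n
    offset = {"left": 0.0, "right": 1.0, "mid": 0.5}[rule]
    # c_i = x_{i-1} + offset * dx, where x_{i-1} = a + (i - 1) * dx
    return sum(f(a + (i - 1 + offset) * dx) for i in range(1, n + 1)) * dx

for rule in ("left", "right", "mid"):
    print(rule, round(riemann_sum(math.sin, 1.0, 2.0, 20, rule), 6))

# Exact value from the antiderivative -cos x, for comparison.
print("exact", round(math.cos(1.0) - math.cos(2.0), 6))
```

The three printed sums can be compared with the bottom line of the table, and the last value with the exact integral.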
Properties of the Definite Integral (Stewart § 5.2)

Before continuing, we will note some of the basic properties of the definite integral as defined above. After some practice, most people use these properties all the time without further thought. However, it is instructive to see how they follow from our definition of the integral.

Properties of the Definite Integral.

• If $m$ and $M$ are the minimum and maximum values of $f$ on the interval $[a, b]$, then
$$m \times (b - a) \le \int_a^b f(x)\,dx \le M \times (b - a) . \tag{3.6}$$
• If $c$ is a constant, then
$$\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx . \tag{3.7}$$
• For functions $f$ and $g$ defined on the interval $[a, b]$,
$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \tag{3.8}$$
• If $f$ is defined on the interval $[a, c]$, and $b$ is a point between $a$ and $c$, then
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx . \tag{3.9}$$

The first formula (3.6) is just the relation between the definite integral and the upper and lower Riemann sums in the case $N = 1$, so there is a single interval equal to all of $[a, b]$.

To see where the next two equations come from, we use the definition of the integral as a limit of Riemann sums. We also need some basic properties of limits.

For any integer $N \ge 1$ we subdivide $[a, b]$ into $N$ subintervals. For each $i$ with $1 \le i \le N$ we choose a point $c_i$ in the $i$th subinterval. Let $S_N(f)$ and $S_N(cf)$ be the corresponding Riemann sums for the two functions $f(x)$ and $cf(x)$. It follows immediately from the definition of the Riemann sum that $S_N(cf) = c S_N(f)$. Then, from the standard properties of limits,

$$\int_a^b c f(x)\,dx = \lim_{N \to \infty} S_N(cf) = \lim_{N \to \infty} c S_N(f) = c \lim_{N \to \infty} S_N(f) = c \int_a^b f(x)\,dx .$$

This proves (3.7). For (3.8), take $N$ and the $c_i$ as before and let $S_N(f)$, $S_N(g)$ and $S_N(f + g)$ be the corresponding Riemann sums for $f$, $g$ and $f + g$. Then

$$S_N(f + g) = \sum_{i=1}^{N} [f(c_i) + g(c_i)] \times \Delta x = \left( \sum_{i=1}^{N} f(c_i) \times \Delta x \right) + \left( \sum_{i=1}^{N} g(c_i) \times \Delta x \right) = S_N(f) + S_N(g) .$$

Then

$$\int_a^b (f(x) + g(x))\,dx = \lim_{N \to \infty} S_N(f + g) = \lim_{N \to \infty} (S_N(f) + S_N(g)) = \lim_{N \to \infty} S_N(f) + \lim_{N \to \infty} S_N(g) = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx .$$

If we interpret the definite integral as an area, the final formula (3.9) is just the fact that the area over the interval $[a, c]$ is the sum of the areas over the two intervals $[a, b]$ and $[b, c]$. A formal mathematical proof of this depends on the more general type of Riemann sum (with subintervals of different lengths) mentioned at the end of the previous chapter.

Reversing the Direction of Integration (Stewart § 5.2)

So far we have only defined $\int_a^b f(x)\,dx$ in the case where $a \le b$. It is quite useful to extend the definition to the case $a > b$. We can easily do this in a way which is consistent with the existing definition and properties. Note that in the definition of the Riemann sum as

$$\sum_{i=1}^{N} f(c_i)\,\Delta x \tag{3.10}$$

we have $\Delta x = (b - a)/N$. If $a > b$ we can use the same formula. The only new feature is that now $\Delta x$ is negative. From an algebraic point of view this has no effect on our formulas. Geometrically it means that in the Riemann sum areas of rectangles above the axis now count as negative, and areas below the axis count as positive.

The easiest way to see the implication of all this is to go back to the Riemann sum (3.10) and look at the effect of interchanging $a$ and $b$. The only difference which this makes to the formula is to change the sign of $\Delta x$. Hence the Riemann sum also simply changes sign. In the limit as $\Delta x \to 0$ the same is true of the definite integral, and we conclude that

$$\int_b^a f(x)\,dx = - \int_a^b f(x)\,dx .$$

One result of this is that the formula (3.9) now holds whatever the order of the numbers $a$, $b$, $c$. For example, starting with $a \le b \le c$ and the relation (writing $\int_a^b$ as shorthand for $\int_a^b f(x)\,dx$)

$$\int_a^b + \int_b^c = \int_a^c$$

we can rearrange to get

$$\int_a^b = \int_a^c - \int_b^c = \int_a^c + \int_c^b .$$

This is essentially just (3.9) again, except the point $c$ no longer lies between $a$ and $b$.
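The properties (3.7) to (3.9) and the rule for reversing the direction of integration can be checked on Riemann sums directly. This is a minimal sketch and not part of the original notes. The test functions $e^x$ and $\cos x$, the points 0, 1, 3 and the constant 2.5 are hypothetical choices, and `riemann_sum` is an illustrative helper that takes $c_i$ to be the midpoint of each subinterval. Note that $\Delta x = (b - a)/N$ is allowed to be negative, exactly as in the discussion of (3.10).

```python
import math

def riemann_sum(f, a, b, n=1000):
    """Midpoint Riemann sum; dx = (b - a)/n is negative when a > b."""
    dx = (b - a) / n
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx

f, g = math.exp, math.cos      # hypothetical test functions
a, b, c = 0.0, 1.0, 3.0        # hypothetical points with a <= b <= c
k = 2.5                        # hypothetical constant

# (3.7): S_N(k f) = k S_N(f), up to rounding.
print(riemann_sum(lambda x: k * f(x), a, b), k * riemann_sum(f, a, b))

# (3.8): S_N(f + g) = S_N(f) + S_N(g), up to rounding.
print(riemann_sum(lambda x: f(x) + g(x), a, b),
      riemann_sum(f, a, b) + riemann_sum(g, a, b))

# (3.9): holds only approximately for finite N, exactly in the limit.
print(riemann_sum(f, a, c), riemann_sum(f, a, b) + riemann_sum(f, b, c))

# Reversing the direction: interchanging a and b changes the sign.
print(riemann_sum(f, b, a), -riemann_sum(f, a, b))
```

Each line prints a pair of numbers that should agree. For (3.7), (3.8) and the reversal rule the agreement is exact apart from rounding, while for (3.9) the two sides use different partitions and agree only in the limit.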
Fundamental Theorem of Calculus (Part 2) (Stewart § 5.3)

We used the relation between velocity and distance to motivate the introduction of the definite integral, but we have not so far discussed the exact mathematical relation between differentiation and integration. There are really two parts to this relation. One part involves the derivative of an integral, and the other the integral of a derivative. Together these make up the Fundamental Theorem of Calculus. In this section we consider the second part: what happens when we integrate the derivative of a function? The answer is given by the following theorem.

The Fundamental Theorem of Calculus II. Let $F(x)$ be a function defined on an interval $[a, b]$ of the real line. Suppose that the derivative of $F$ is defined at each point $x$ of the interval, and that the resulting function $F'(x)$ is continuous. Then

$$\int_a^b F'(x)\,dx = F(b) - F(a) . \tag{3.11}$$

Recall that the definite integral of $F'(x)$ is the unique number which lies between the upper and lower Riemann sums of $F'(x)$ for all subdivisions of the interval $[a, b]$. If we can show that the number $F(b) - F(a)$ has the same property, then (3.11) is proved. For this we need the Mean Value Theorem of differential calculus (see, for example, Stewart § 4.2). Here is the statement of the theorem: for a function $F(x)$ defined on an interval $[u, v]$ with continuous derivative $F'(x)$, there exists a point $w$ somewhere in the interval with the property that

$$F'(w) = \frac{F(v) - F(u)}{v - u} .$$

Essentially, this theorem says that the average rate of change over the interval (the ratio of the change in $F(x)$ to the change in $x$) is equal to the derivative of $F$ at some point in the interval.

To prove the formula (3.11) we apply this theorem as follows. Starting with the function $F$ defined on $[a, b]$, partition the interval into $N$ equal subintervals.
Label the division points as $x_i$, with $0 \le i \le N$, so that the $i$th subinterval is $[x_{i-1}, x_i]$ and

$$a = x_0 \le x_1 \le \cdots \le x_{N-1} \le x_N = b .$$

As usual we let $m_i$, $M_i$ be the minimum and maximum values of $F'(x)$ on this subinterval, and $\Delta x$ the length of the subintervals. According to the Mean Value Theorem there exists a point $c_i$ in the $i$th subinterval where

$$F(x_i) - F(x_{i-1}) = F'(c_i) \times (x_i - x_{i-1}) = F'(c_i) \times \Delta x . \tag{3.12}$$

Adding up over the range $1 \le i \le N$, the left side is just

$$(F(x_1) - F(x_0)) + (F(x_2) - F(x_1)) + (F(x_3) - F(x_2)) + \cdots + (F(x_{N-1}) - F(x_{N-2})) + (F(x_N) - F(x_{N-1})) .$$

All the terms except $F(x_0)$ and $F(x_N)$ appear twice, with opposite signs. Therefore they cancel out, leaving only

$$F(x_N) - F(x_0) = F(b) - F(a) .$$

Summing the right side just gives a Riemann sum for $F'(x)$ over the interval $[a, b]$. We conclude that

$$\sum_{i=1}^{N} F'(c_i) \times \Delta x = F(b) - F(a) .$$

But any Riemann sum for the specified partition lies between the upper and lower Riemann sums, so the same is true for $F(b) - F(a)$. This completes the proof of this part of the Fundamental Theorem.

Notation

The change $F(b) - F(a)$ of a function $F$ over an interval is often denoted by

$$F(b) - F(a) = [F(x)]_a^b .$$

The theorem then appears in the form

$$\int_a^b F'(x)\,dx = [F(x)]_a^b .$$

The important thing about this result is that it gives us a potential shortcut to working out a definite integral. In order to evaluate the integral

$$\int_a^b f(x)\,dx$$

we can look for a function $F(x)$ with the property that $F'(x) = f(x)$ on the interval $[a, b]$. According to the fundamental theorem we then have

$$\int_a^b f(x)\,dx = \int_a^b F'(x)\,dx = F(b) - F(a) .$$

The function $F(x)$ is called an antiderivative of $f(x)$. In this way the Fundamental Theorem of Calculus gives us a link between ‘the area under the curve’ (in terms of Riemann sums) and ‘reverse differentiation’ (finding an antiderivative).
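The key step of the proof, choosing each $c_i$ by the Mean Value Theorem so that the Riemann sum telescopes, can be made concrete. This is a sketch under stated assumptions and not part of the original notes. The function $F(x) = x^3$ on $[0, 2]$ is a hypothetical example, chosen because the Mean Value Theorem point can be written down explicitly: on $[u, v]$ with $0 \le u < v$ the equation $3c^2 = (v^3 - u^3)/(v - u) = u^2 + uv + v^2$ gives $c = \sqrt{(u^2 + uv + v^2)/3}$. The names `mvt_point` and `mvt_riemann_sum` are illustrative.

```python
import math

def F(x):
    return x ** 3            # hypothetical example function

def F_prime(x):
    return 3.0 * x ** 2      # its derivative

def mvt_point(u, v):
    """Point c in [u, v] (with 0 <= u < v) where F'(c) equals the
    average rate of change (F(v) - F(u)) / (v - u)."""
    return math.sqrt((u * u + u * v + v * v) / 3.0)

def mvt_riemann_sum(a, b, n):
    """Riemann sum of F' using the Mean Value Theorem point on each subinterval."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        u, v = a + (i - 1) * dx, a + i * dx
        total += F_prime(mvt_point(u, v)) * dx   # equals F(v) - F(u), as in (3.12)
    return total

for n in (1, 4, 50):
    print(n, mvt_riemann_sum(0.0, 2.0, n), F(2.0) - F(0.0))
```

For every $N$, even $N = 1$, this particular Riemann sum equals $F(b) - F(a)$ apart from rounding. That is why $F(b) - F(a)$ lies between the lower and upper sums for every partition.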
The Fundamental Theorem also motivates us to look at techniques for finding antiderivatives. You should be aware however that there are many examples of quite simple functions for which there is no simple formula for an antiderivative.

Exercises

1. Given that
$$\int_{-3}^{1} f(x)\,dx = -2 , \qquad \int_{1}^{2} f(x)\,dx = 5 , \qquad \int_{-3}^{2} g(x)\,dx = 8 ,$$
evaluate, where possible:
(a) $\int_{-3}^{2} (f(x) + g(x))\,dx$,
(b) $\int_{2}^{-3} g(x)\,dx$,
(c) $\int_{-3}^{2} f(x) g(x)\,dx$.

Summary of Chapter 3.

• We can estimate the value of the definite integral using the values of $f$ at any point in each subinterval. The resulting Riemann sum lies between the upper and lower Riemann sums for the same partition.
• This type of Riemann Sum is easier to calculate, since we do not have to locate the maximum and minimum values of the function on each interval. The difference between such a Riemann sum and the definite integral is no bigger than the difference between the lower and upper Riemann sums for the same partition.
• The Fundamental Theorem of Calculus shows that the definite integral of the derivative of a function $F(x)$ is equal to the change in $F$ over the interval. The proof uses the Mean Value Theorem from differential calculus.
• We can try to evaluate a definite integral for a function $f$ by finding an antiderivative of $f$ and using the Fundamental Theorem. This method fails when we cannot find a formula for the antiderivative.