Integral Calculus and Modelling - School of Mathematics and Statistics

THE UNIVERSITY OF SYDNEY
SCHOOL OF
MATHEMATICS AND STATISTICS
MATH 1003
Integral Calculus and Modelling
N.R. O’Brian
C.J. Durrant and D.J. Galloway
© The University of Sydney 2009.
Contents
1 Introduction
2 The Definite Integral: Definition
3 The Definite Integral: Properties
4 The Definite Integral: Applications
5 Integrals as Functions
6 Integration Techniques: I
7 Integration Techniques: 2
8 Models and Differential Equations
9 Separable Equations
10 Applications of Separable Equations
11 Linear Differential Equations
12 Second-Order Differential Equations
13 Systems of Differential Equations
A Table of Standard Integrals
CHAPTER 1
Introduction
THESE NOTES develop the theory of integration, along with some of its applications. We expect that everyone in the class has had exposure to the ideas of
integration. In this course we will fit those ideas into a systematic framework and show
how many of the most powerful applications of mathematics are based on integration.
As a first exercise, you should try to describe in a single sentence what you understand by the term ‘integration’. Perhaps your answer looks like one of the following:
• It’s the opposite of differentiation.
• It’s the area under the graph.
• It’s a kind of summation process.
• It’s the result of adding up small changes.
In fact all of these are correct. The diversity of possible answers shows that there are
several ways of looking at integration, and understanding the theory depends on an
appreciation of the relationships between these different points of view.
One of the main reasons for getting a good understanding of integration is that
many of the important applications of mathematics are based on this idea. A major
practical use of mathematics is in building ‘mathematical models’ of different types of
physical, biological or financial systems. Typically these models are equations which
express certain relationships between the different quantities involved in the system.
Solving the equations then tells us something about the consequences of these relationships. It is very common for the equations to involve derivatives, in which case they
are called differential equations. Then we have the problem of deducing the values of
some quantity from information about its derivative. This is where we need the theory
of integration.
As a simple example, if we know the interest rate on a savings account then we
know the rate at which the amount in the account will change (assuming no deposits or
withdrawals). We then know the derivative with respect to time of the account balance,
and we can try to use this information to find the actual balance after a certain period
of time. We will see many other examples of this type of problem.
A Note on Notation
In most of the examples discussed here the independent variable is taken to be x or t.
Often t will stand for time, and the derivative of any quantity depending on t can be
thought of as the rate at which that quantity is changing. We generally draw graphs
according to the usual convention of measuring the independent variable along the
horizontal axis, and the dependent variable along the vertical axis. The fact that a
quantity y depends on x is equivalent to saying that y is a function of x. In this case
we might write y = f (x). It is also common to think of y itself as the function, and
write just y(x) or y = y(x). We often use a prime to denote the derivative, so
\[ \frac{d}{dx} f(x) = f'(x). \]
Rates of Change
One of the basic facts of physics is that the velocity of a falling body increases in
proportion to time (at least if we ignore small effects due to air resistance or variations
in the gravitational field).¹ If t is time and v(t) the velocity at time t then, if the body
starts from rest at t = 0, we can express this fact mathematically as an equation:
v(t) = gt,
where g is a constant. This is an example of a physical law. It contains a lot of information about falling bodies in a compact formula. In fact it is a simple ‘mathematical
model’ of a falling body. Any consequences we can deduce from this mathematical
relationship will tell us more about falling bodies. The constant g is the same for all
bodies, and can be determined by experiment. Suppose we want to know how far the
body has fallen when, say, t = 10. If s(t) is the distance fallen after time t we can use
the following argument:
• Velocity is the derivative of distance with respect to time, so we can write v as
ds/dt in the previous equation:
\[ \frac{d}{dt} s(t) = gt. \tag{1.1} \]
¹ This topic is a recurring theme in these notes. The historical development is sketched in Chapter 8
and the mathematical development in Chapter 9.
• Although we don’t know s(t) explicitly, we know its derivative is gt. On the
other hand, from what we know about differentiation, the function gt^2/2 has
derivative gt, so
\[ s(t) = gt^2/2 \tag{1.2} \]
certainly satisfies equation (1.1).
• Now this argument is open to the criticism that gt^2/2 isn’t the only solution
to (1.1). We can add on any constant C without changing the derivative, so we
ought to consider the possibility that
\[ s(t) = gt^2/2 + C. \tag{1.3} \]
If we specify that s is measured from the starting position, then we have s(t) = 0
when t = 0. This forces us to choose C = 0 anyway, and we end up with the
formula s(t) = gt^2/2 for the distance fallen after time t. In particular, if t = 10,
then s = 50g.
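The argument above can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the illustrative value g = 9.8 (the notes leave g unspecified) and uses a hypothetical helper `central_difference` to approximate the derivative of s(t) = gt^2/2, comparing it with gt.

```python
# Assumed illustrative value of g (m/s^2); the notes do not fix a value.
g = 9.8

def s(t):
    """Distance fallen after time t, measured from the starting position."""
    return g * t**2 / 2

def central_difference(func, t, h=1e-5):
    """Approximate the derivative of func at t by a symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)

# The approximate derivative of s should agree closely with g*t.
for t in (1.0, 5.0, 10.0):
    print(t, central_difference(s, t), g * t)

# The distance fallen at t = 10 should agree with 50g.
print(s(10.0), 50 * g)
```

Agreement of the two printed columns is numerical evidence only; the exact statement is the differentiation rule used in the text.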
In this example the process of recovering s(t) from its derivative is called integration. In this sense integration is the reverse of the differentiation process. The next
example is superficially quite similar, but it does not have such a simple answer.
In some parts of the world trucks and other heavy vehicles are fitted with a device
called a tachograph. This is an instrument which records how fast the vehicle is travelling
at each instant of time, typically by moving a pen over graph paper. The output might
look like the following.
[Figure 1: tachograph output, showing speed in km/h (0 to 100) against time in hours (0 to 10).]
Mathematically this shows the speed of the vehicle as a function f (t) of time, and
Figure 1 is the graph of f (t) over the range 0 ≤ t ≤ 10. This is similar to the preceding
example in that we know how the speed depends on time.
The big difference is that, for the falling body, the time dependence was given by
a simple formula f (t) = gt, but in this example there is no such formula. The function
f is specified by its graph, rather than algebraically. It is still true to say that speed
is the derivative of distance with respect to time, so if s(t) is the distance travelled by
the vehicle after time t we still have an equation
\[ \frac{d}{dt} s(t) = f(t) \tag{1.4} \]
relating distance to time.
Unlike the previous example, we cannot solve this equation in terms of a simple
formula for s(t). In this case, the idea that integration is the reverse of differentiation
does not help us to find explicitly how the distance travelled depends on t. This shows
us that the previous method does not tell the whole story. We need a more general
idea, and will return to this theme in the next chapter.
In mathematics the term integration is used for any process of working out the size
of some quantity from information about its rate of change. Besides the mathematical
definition, the Macquarie Dictionary gives
integrate v., 1. To bring together (parts) into a whole.
In the examples looked at so far we are adding up (and so bringing together) the changes
in s(t) obtained from information about the derivative in order to find the cumulative
effect on s. In these examples we are given the derivative of a function explicitly (either
as a function or as a graph). This is actually a special case of a more general type of
problem to be considered later in the course. The next section gives a brief preview.
Differential Equations
If we have some quantity s depending on a variable t then a differential equation is a
relation between s, t and one or more derivatives of s with respect to t. In the previous
section we have looked at only the simplest kind of differential equation, exemplified
by (1.4). The applications we look at later will lead us to consider many other types of
differential equations. Here are some examples.
Exponential Decay and Growth
One of the most common examples of a differential equation describes processes of decay
and growth. Suppose we have a sample of mineral ore containing a radioactive isotope.
Let x(t) be the mass of the isotope present in the sample at time t. The isotope decays
at a rate proportional to the mass present, so the rate of change of x(t) is given by the
equation
\[ \frac{dx}{dt} = -ax \tag{1.5} \]
for some positive constant a. This process is called exponential decay.
The reverse of exponential decay is exponential growth. This typically occurs in
collections of living organisms where each individual reproduces at a constant rate. This
produces population growth which is proportional to the size of the population itself.
If x is some measure of the size of the population we can describe this mathematically
by the equation
\[ \frac{dx}{dt} = ax, \tag{1.6} \]
where a > 0 is the constant of proportionality.
One thing to notice here is that these equations are not special cases of (1.4). In
(1.5) and (1.6) the right hand side depends on the value of x, which is what we are
trying to find. This means we cannot just integrate both sides with respect to t. We
return to this problem in a later chapter, but, in the meantime, you may already know
the solutions to (1.5) and (1.6). Otherwise they are not too hard to guess from what
you know about differential calculus. The description of this kind of growth or decay
as ‘exponential’ gives a strong clue.
A systematic treatment of these equations will be found in Chapter 9.
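One natural guess, suggested by the word ‘exponential’, is x(t) = x0 e^{−at} for (1.5) and x(t) = x0 e^{at} for (1.6). The sketch below tests these guesses numerically; the values of x0 and a and the helper `derivative` are assumptions made purely for illustration.

```python
import math

# Hypothetical initial amount and rate constant, chosen only for illustration.
x0, a = 1.0, 0.5

def x_decay(t):
    """Candidate solution of dx/dt = -a*x with x(0) = x0."""
    return x0 * math.exp(-a * t)

def x_growth(t):
    """Candidate solution of dx/dt = a*x with x(0) = x0."""
    return x0 * math.exp(a * t)

def derivative(func, t, h=1e-6):
    """Approximate the derivative of func at t by a symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)

# In each row the numerical derivative should match the right-hand side
# of the corresponding differential equation.
for t in (0.0, 1.0, 2.0):
    print(t, derivative(x_decay, t), -a * x_decay(t))
    print(t, derivative(x_growth, t), a * x_growth(t))
```

This only checks that the guesses satisfy the equations at a few sample times; it does not show how to derive them.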
Force and Acceleration
According to Newton’s second law of mechanics,
Mass × Acceleration = Force.
If x measures position then acceleration is the second derivative of x with respect to
time. If the force depends on both position and time then this relation can be expressed
as the differential equation
\[ m \times \frac{d^2 x}{dt^2} = F(x, t). \]
In the special case where F (x, t) = −kx for some constant k > 0 this reduces to the
equation of simple harmonic motion:
\[ m \frac{d^2 x}{dt^2} = -kx. \tag{1.7} \]
This is an example of a second order differential equation, since it involves second
order derivatives. Notice again that the right side involves the function x(t) that we
are looking for, so we cannot just integrate twice to get a solution.
The equation has the ‘trivial’ solution x(t) ≡ 0, representing a system ‘at rest’
with x = 0. Otherwise the force −kx is proportional to the displacement from rest, and
directed back towards the rest position. This is typical of systems such as springs or
pendulums. In Chapter 12 we will develop methods for equations of this type which
allow us to incorporate the effect of friction, damping and external forces varying with
time.
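Although (1.7) cannot be solved by integrating twice, a candidate solution can still be tested by substitution. The sketch below assumes hypothetical values of m and k and checks numerically that x(t) = cos(ωt), with ω = sqrt(k/m), satisfies the equation; the helper `second_derivative` is an illustrative finite-difference approximation, not a method from the notes.

```python
import math

# Hypothetical mass and force constant, chosen only for illustration.
m, k = 2.0, 8.0
omega = math.sqrt(k / m)

def x(t):
    """Candidate solution of m * x'' = -k * x."""
    return math.cos(omega * t)

def second_derivative(func, t, h=1e-4):
    """Approximate the second derivative of func at t by a central second difference."""
    return (func(t + h) - 2 * func(t) + func(t - h)) / h**2

# The two printed values in each row should agree closely.
for t in (0.0, 0.3, 1.0):
    print(t, m * second_derivative(x, t), -k * x(t))
```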
Other Types of Differential Equation
So far all the examples we have looked at involve a single independent variable (which
is often, but not always, the time t) and a single unknown dependent variable (x in the
previous examples) which we are trying to find as a function x(t) of t. Other types of
differential equations are mostly beyond the range of this course. However, it is worth
looking at some examples (without trying to find solutions) in order to get some idea
of the scope of this branch of mathematics.
Systems of Differential Equations
Sometimes we have several variables, all depending on a single independent variable.
Just as in linear algebra, this situation may involve several equations which we have to
solve simultaneously. One of the most famous examples of such a system of equations is
the Volterra Predator-Prey equations. Here x and y are respectively measures of
the populations of two animal species, where one of the species, y say, is the predator
and other species x is its prey. We suppose that both x and y vary with time t. The
pair of equations
\[ \frac{dx}{dt} = Ax - Bxy, \qquad \frac{dy}{dt} = -Cy + Dxy, \]
where A, B, C, D are positive constants, has been suggested as a simple mathematical
model of this situation. Where does such a model come from? If we forget the xy terms
(or equivalently take B = D = 0) we are left with the two equations
\[ \frac{dx}{dt} = Ax, \qquad \frac{dy}{dt} = -Cy, \]
which are just the equations of exponential growth and decay. These equations reflect
what might happen to the numbers of prey and predator if they never meet each other:
the prey increases, and the predator dies out. In this case each equation can be
solved independently of the other. The xy terms represent the interaction of the two
species. The equations (like the two species themselves) are no longer independent. The
extra term has the desired effect of benefiting the predator (increasing its growth rate)
and disadvantaging the prey (decreasing its growth rate). Analysis of these equations
and their various refinements leads to many fascinating insights into the behaviour of
ecological systems.
The Volterra equations take us beyond the scope of this course, but these notes
conclude by looking at a version of the system of equations with a simplified interaction.
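The effect of the interaction terms can be seen by evaluating the right-hand sides directly. This is a minimal sketch with hypothetical constants and population values; it takes the predator equation with the term −Cy, consistent with the decoupled equations above, and does not attempt to solve the system.

```python
def volterra_rates(x, y, A, B, C, D):
    """Return (dx/dt, dy/dt) for prey x and predator y."""
    dx_dt = A * x - B * x * y
    dy_dt = -C * y + D * x * y
    return dx_dt, dy_dt

# Hypothetical constants and populations, chosen only for illustration.
A, B, C, D = 1.0, 0.1, 0.5, 0.02
x, y = 40.0, 9.0

# Without the xy terms the rates reduce to A*x and -C*y.
print(volterra_rates(x, y, A, 0.0, C, 0.0))

# With the xy terms the prey grows more slowly and the predator does better.
print(volterra_rates(x, y, A, B, C, D))
```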
Partial Differential Equations
In other situations there may be several independent variables. For example, the temperature T along a uniform rod depends on both position x and time t. Then T = T (x, t)
and the heat equation
\[ \frac{\partial^2 T}{\partial x^2} = k \frac{\partial T}{\partial t} \]
is a mathematical expression of physical laws which apply to the flow of heat in a solid
body. Here k is a constant depending on the choice of units and the material of the
rod. Solving the equation for a specified initial temperature distribution will allow us
to predict the future temperature at different parts of the rod.
The displacement of a tightly stretched vibrating string (like the string of a musical instrument) is also a function y(x, t) of two variables, where y is the transverse
movement of the string as a function of the position x along the string and the time t.
In this case the function y satisfies the wave equation:
\[ \frac{\partial^2 y}{\partial x^2} = c^2 \frac{\partial^2 y}{\partial t^2}, \]
where c is a constant depending on the mass per unit length of the string and its tension.
The equations which describe the phenomena of electromagnetism and fluid flow
are also partial differential equations. The equations of fluid flow in particular are very
difficult to analyse mathematically, but in many cases can be solved very accurately by
powerful computers. Engineering design problems (such as those involving structural
analysis or aircraft aerodynamics) which once required extensive model building and
prototyping can be handled more efficiently by complex mathematical models based on
differential equations and solved on supercomputers.
Infinite Sequences (Stewart § 12.1)²
At several points in the course we have to deal with infinite sequences of numbers.
We conclude this introduction with a short summary of the basic facts about such
sequences.
Infinite sequences appear in almost all areas of mathematics. The natural numbers
0, 1, 2, 3, . . . are the most obvious example of an infinite sequence. In some cases each
term in the sequence is given by a formula depending on its position in the sequence, or
there may be some rule which generates each term from the values of preceding terms.
Examples include:
• Arithmetic sequences: each term differs from its predecessor by the addition of a
constant amount, as in the sequence
1, 4, 7, 10, 13, 16, 19, 22, 25, 28, . . . ,
• Geometric sequences: each term is a constant multiple of its predecessor, as in
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, . . . ,
• Fibonacci sequence: each term (after the second) is the sum of the two preceding
terms, as in
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, . . . .
• The terms of the sequence do not have to be integers; they can be any real
numbers:
\[ 1, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{4}}, \frac{1}{\sqrt{5}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{7}}, \ldots. \]
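The rules in the list above translate directly into short generators. The following sketch is illustrative only; the function names are hypothetical, and each generator produces the terms of one of the example sequences.

```python
import math
from itertools import islice

def arithmetic(first, step):
    """Each term is the previous term plus a constant step."""
    term = first
    while True:
        yield term
        term += step

def geometric(first, ratio):
    """Each term is the previous term times a constant ratio."""
    term = first
    while True:
        yield term
        term *= ratio

def fibonacci():
    """Each term after the second is the sum of the two preceding terms."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

def inverse_sqrt():
    """Terms 1/sqrt(n) for n = 1, 2, 3, ..."""
    n = 1
    while True:
        yield 1 / math.sqrt(n)
        n += 1

print(list(islice(arithmetic(1, 3), 10)))
print(list(islice(geometric(1, 2), 10)))
print(list(islice(fibonacci(), 11)))
print(list(islice(inverse_sqrt(), 7)))
```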
The terms of a general sequence of real numbers are usually labelled by the non-negative
integers, as in
\[ a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, \ldots. \tag{1.8} \]
Sometimes the labelling starts with 1 rather than 0. Most of the interesting properties
of sequences concern what happens to a_n for large n, so this is not an important feature
of the sequence.
One important property of a sequence is the existence (or otherwise) of the limit
of successive terms. Informally, a number L is the limit of the sequence (1.8) if the a_n
‘home in’ on L as n increases. We write
\[ \lim_{n \to \infty} a_n = L \]
² Throughout the notes we provide cross-references in the form Stewart § 4.1 to the relevant section
in Calculus, Sixth Edition by James Stewart (Brooks/Cole Publishing Company, 2009).
in this case. Sometimes this is abbreviated to lim a_n = L or just a_n → L. The notation
\[ \lim_{n \to \infty} a_n = \infty \]
is used when the terms of the sequence become arbitrarily large.
If a sequence has a finite limit, the sequence is said to be convergent. Otherwise
it is divergent. In the case of either a finite or infinite limit, the sequence is said to
approach, or tend to the limit.
Worked Example 1.1 Decide which of the following sequences has a limit (finite or
infinite):
(a) a_n = 1/n        (b) a_n = \sqrt{n}     (c) a_n = (−1)^n
(d) a_n = 1          (e) a_n = (1/2)^n      (f) a_n = 2^n.
Solution. For (a) it is intuitively clear that a_n → 0 as n → ∞ since we can make the
next term as small as we like by making n sufficiently large. In case (b) the sequence
tends to ∞ since we can make the next term as large as we like by making n sufficiently
large. In (c) the sequence has no limit: it oscillates between the two values of +1 and
−1. The sequence in (d) is an example of a constant sequence. It has a limit equal
to the value of each term, in this case 1. The final examples (e) and (f) have limits 0
and ∞ respectively.
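Tabulating a few terms for increasing n gives a feel for these conclusions. The sketch below is only numerical evidence, not a proof; the dictionary `sequences` and the sample values of n are choices made for illustration.

```python
import math

# The six sequences of Worked Example 1.1, as functions of n.
sequences = {
    "a": lambda n: 1 / n,
    "b": lambda n: math.sqrt(n),
    "c": lambda n: (-1) ** n,
    "d": lambda n: 1,
    "e": lambda n: (1 / 2) ** n,
    "f": lambda n: 2 ** n,
}

# Print each sequence at a few increasingly large values of n.
for label, term in sequences.items():
    print(label, [term(n) for n in (1, 10, 100, 1000)])
```

Sequence (c) shows why a few samples can mislead: every n in the list after the first is even, so the printed terms are all +1 even though the sequence oscillates.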
Exercises
The following exercises are designed to give a flavour of the type of problem which will
be studied in more detail later on. See if you can work out how to tackle them now.
Don’t worry if you can’t; we will return to them in due course.
1. A mothball initially has radius 0.5 cm and slowly evaporates.
(a) If V denotes the volume of the mothball and r the radius, use the chain rule
for differentiation to show that
\[ \frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt}. \]
(b) Suppose that the rate of change of the volume is proportional to the surface
area of the mothball. Express this condition as a differential equation for r
as a function of t.
(c) Find a formula for the radius as a function of time, assuming that after
30 days the radius is 0.25 cm. How long before the mothball disappears
altogether?
2. A car is travelling at 100 km/h on a level road when it runs out of fuel. Its speed v
starts to decrease according to the formula
\[ \frac{dv}{dt} = -kv, \]
where k is a constant. One kilometre after running out of fuel its speed has fallen
to 50 km/h. Use the chain rule substitution
\[ \frac{dv}{dt} = \frac{dv}{ds} \frac{ds}{dt} = v \frac{dv}{ds} \]
to solve the differential equation. How far will the car travel from the point where
it runs out of fuel?
How long after running out of fuel will the car come to a stop? Is the model
reasonable?
Summary of Chapter 1.
• If we know that the derivative of an unknown function s(t) is a given
function f (t), we may be able to find s(t) by working backwards, using
what we know about differentiation.
• If f (t) is not given by a simple formula (and often even when it is) this
method fails. We need to find a more fundamental relation between the
derivative of a function and the original function.
• The process of recovering s(t) from information about its derivative is
called integration. This information can simply specify the derivative
explicitly (by a formula or graph). Other cases lead us to study more
general types of differential equation.
• Differential equations are used as ‘mathematical models’ in many areas of science and technology. They may involve ‘ordinary’ or partial
derivatives. Sometimes several differential equations need to be satisfied simultaneously. This leads to the analysis of systems of differential
equations.
• Later on in the course we need to investigate infinite sequences. One
of the most important properties of such sequences is the existence of a
limit.
CHAPTER 2
The Definite Integral: Definition
HOW IS THE TOTAL CHANGE in the value of a function over an interval related
to the values of its derivative on the interval? We start by investigating this
problem from first principles, without using derivatives or differential calculus. Given
the rate of change, we want to find the total change. Later on we will relate this to the
idea of ‘reverse differentiation’ already discussed.
Velocity vs Distance (Stewart § 5.1)
Imagine a car accelerating along the road over a period of 10 seconds. Suppose the car
starts with a speed of 5 m/sec and ends up with a speed of 32.5 m/sec. What can we say
about the distance travelled? We can make a rough estimate as follows. Since the car is
accelerating the speed is always increasing. In particular, the speed is always between
5 m/sec and 32.5 m/sec. Over the period of 10 seconds the car therefore travels at least
5 × 10 = 50 metres, but no more than 32.5 × 10 = 325 metres. We can write these two
inequalities as:
50 m ≤ Distance travelled ≤ 325 m .
This is a very rough estimate indeed. We can do much better if we know more about
the velocity at intervening points of time. For example, suppose that we measure the
velocity every two seconds. We can present the results as a table, which might look like
the following:
Time (sec)          0     2      4     6      8     10
Velocity (m/sec)    5     14.5   22    27.5   31    32.5
The minimum and maximum velocities over the first two seconds are 5 m/sec and
14.5 m/sec. Therefore the distance travelled in this period is between 5 × 2 = 10 m and
14.5 × 2 = 29 m. Applying this to each interval in turn and adding up over all five
intervals, we get a lower estimate of
\[ (5 \times 2) + (14.5 \times 2) + (22 \times 2) + (27.5 \times 2) + (31 \times 2) = 200 \text{ m}, \tag{2.1} \]
and an upper estimate of
\[ (14.5 \times 2) + (22 \times 2) + (27.5 \times 2) + (31 \times 2) + (32.5 \times 2) = 255 \text{ m}. \tag{2.2} \]
The gap between the two estimates is now much smaller, with a maximum possible
error of 255 − 200 = 55 m. It is very instructive to draw a graph of velocity against
time and use it to interpret these calculations.
This is done in Figure 1 below, where the curved line shows the actual velocity
of the car plotted against the time t. On each 2 second interval along the t-axis the
height of the dark rectangle is equal to the minimum velocity on that interval. Since
the velocity is increasing, this always occurs on the left endpoint of each subinterval.
Thus the first rectangle has height 5, the second height 14.5, and so on. The total
height of the dark and light rectangles together is equal to the maximum velocity on
each interval. For an increasing function this will occur at the right endpoint.
We relate this geometrical construction to the distance travelled by introducing the
idea of area. The width of each rectangle is a time interval and the height corresponds to
our estimate of velocity over the same interval. Therefore the product width × height
gives the distance travelled during the interval, assuming the velocity is constant and
equal to the height of the rectangle. Of course this product is also just the area of the
rectangle. Therefore the lower estimate for the distance given by (2.1) is just the sum
of the areas of the dark rectangles. Similarly the expression (2.2) is the total area of
the dark and light rectangles taken together.
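[Figure 1: velocity (m/sec) against time t for 0 ≤ t ≤ 10, with dark and light rectangles on each 2 second interval; the light rectangles are stacked at the right and labelled ‘Difference in Area’.]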
The difference between these upper and lower estimates of the distance is then equal to
the sum of the areas of the light rectangles. In Figure 1 the light rectangles have been
copied over to the right of the diagram and stacked together, in order to better visualize
their total area. In fact it is easy to see that the composite rectangle has dimensions
(32.5 − 5) × 2, with a total area of 55, in agreement with our earlier calculation.
With more data on the car’s speed we can improve accuracy further still. Suppose
we record the speed twice as often, so every second:
Time (sec)          0    1    2      3      4    5    6      7      8    9    10
Velocity (m/sec)    5    10   14.5   18.5   22   25   27.5   29.5   31   32   32.5
This gives us 10 intervals instead of 5, and we can again use the lowest and highest
speeds on each interval to estimate the distance travelled. This is shown graphically in
Figure 2.
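[Figure 2: velocity (m/sec) against time t for 0 ≤ t ≤ 10, with dark and light rectangles on each 1 second interval; the light rectangles are stacked at the right and labelled ‘Difference in Area’.]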
As before the areas of the shaded rectangles give lower and upper estimates for the
distance travelled, and the difference between these two estimates is equal to the total area
of the light rectangles. Comparison with Figure 1 shows that this difference is now
much smaller (in fact it is equal to half its previous value). Adding up the upper and
lower estimates of distance over each 1 second interval gives us inequalities:
215 m ≤ Distance travelled ≤ 242.5 m .
The maximum possible error is now 242.5 − 215 = 27.5 m. Of course, we can continue
in the same way, using shorter and shorter subintervals. Figure 3 shows the result of
measuring the velocity every 0.5 seconds.
The difference between the upper and lower estimates is again equal to the area of the
rectangle drawn at the side of the figure. It should be clear by now that by taking small
enough steps, we can make this area as small as we like.
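[Figure 3: velocity (m/sec) against time t for 0 ≤ t ≤ 10, with dark and light rectangles on each 0.5 second interval; the light rectangles are stacked at the right and labelled ‘Difference in Area’.]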
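The lower and upper estimates for the two tables of readings can be reproduced with a few lines of code. This is a minimal sketch; the function name `lower_upper_estimates` is hypothetical, and it relies on the assumption stated in the text that the velocity is increasing, so the minimum on each interval is the left reading and the maximum is the right reading.

```python
def lower_upper_estimates(times, velocities):
    """Lower and upper distance estimates for an increasing velocity.

    On each interval the left reading is taken as the minimum speed and
    the right reading as the maximum speed.
    """
    lower = upper = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        lower += velocities[i] * dt
        upper += velocities[i + 1] * dt
    return lower, upper

# Readings every 2 seconds and every second, copied from the tables in the text.
t2 = [0, 2, 4, 6, 8, 10]
v2 = [5, 14.5, 22, 27.5, 31, 32.5]
t1 = list(range(11))
v1 = [5, 10, 14.5, 18.5, 22, 25, 27.5, 29.5, 31, 32, 32.5]

print(lower_upper_estimates(t2, v2))  # (200.0, 255.0)
print(lower_upper_estimates(t1, v1))  # (215.0, 242.5)
```

In both cases the gap between the estimates equals (32.5 − 5) multiplied by the interval length, which is why halving the interval halves the maximum possible error.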
In mathematical terms this means that both the upper and lower estimates approach a common limit as the size of the steps shrinks towards zero. There is an
obvious relation between this limit and area under the curved line. In each case the
lower estimate is the sum of areas of rectangles which lie inside this curve. The upper
estimate is a sum of areas of rectangles which enclose the curve. The area under the
curve, like the total distance, therefore also lies between these upper and lower estimates. Since the upper and lower estimates have a common limit, there is only one
number with this property. We conclude that
Total Distance = Area under the Curve.
There are two new concepts here. First, we have a way of estimating total distance
travelled from a knowledge of velocity. By ‘sampling’ the velocity sufficiently frequently,
we can make this estimate as accurate as we wish. Second, we see that the total
distance travelled is the same as the area under the graph of velocity plotted against
time. Note that the argument does not depend on having an algebraic formula for
the velocity, and it does not use any differential calculus. The only place where we
use the fact that velocity is rate of change of distance with time is in the formula
distance = velocity × time for motion with constant velocity. In the next section
we generalize this argument into a purely mathematical construction which we can
apply to any continuous function.
Riemann Sums (Stewart § 5.1, § 5.2)
Start with a continuous function f (x) defined on an interval [a, b]. Recall that the
notation [a, b] stands for the set of real numbers x satisfying the condition a ≤ x ≤ b. For
simplicity we assume the function is non-negative, so f (x) ≥ 0 for all x in the interval.
We can mimic the construction of the previous section. Fix an integer N ≥ 1 and
divide the interval [a, b] into N subintervals of equal length. This is called a partition
of the interval. Then take the minimum value of f (x) on each subinterval, and draw
rectangles of this height based on the subintervals. The result for 8 subintervals is shown
in Figure 4. Repeat the construction, this time using the maximum value of f (x) on
each subinterval. This is shown in Figure 5.
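[Figure 4: graph of f(x) on [a, b] with rectangles of minimum height on each of 8 subintervals.]
[Figure 5: graph of f(x) on [a, b] with rectangles of maximum height on each of 8 subintervals.]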
In the case of Figure 4 the total area of the rectangles is clearly a lower estimate
for the area under the graph of f (x). Similarly the total shaded area in Figure 5 is an
upper estimate for this area.
We can also express these quantities using summation notation. Let ∆x be the
length of the subintervals, so if there are N equal subintervals then ∆x = (b − a)/N.
Let m_i and M_i be respectively the minimum and maximum values of f(x) on the ith
subinterval. In Figure 4 the area of the ith rectangle is therefore given by the product
m_i × ∆x. In Figure 5 the height of the ith rectangle is M_i, so the area is M_i × ∆x. Let
L_N be the total area of the smaller rectangles, as in Figure 4. Then
\[ L_N = (m_1 \times \Delta x) + (m_2 \times \Delta x) + \cdots + (m_N \times \Delta x) = \sum_{i=1}^{N} m_i \times \Delta x. \tag{2.3} \]
The number L_N is called a Riemann Lower Sum for the function f on the interval [a, b]. It depends not only on N, but also on f and the interval [a, b]. Similarly, let
U_N be the total area of the larger rectangles (as in Figure 5). Then
\[ U_N = (M_1 \times \Delta x) + (M_2 \times \Delta x) + \cdots + (M_N \times \Delta x) = \sum_{i=1}^{N} M_i \times \Delta x. \tag{2.4} \]
This is called a Riemann Upper Sum for f on [a, b]. As before, the Riemann lower
and upper sums give us lower and upper estimates for the area under the graph of f(x):
\[ L_N \le \text{Area under the Graph} \le U_N. \]
What happens as the number N of subintervals is increased or, equivalently, as the
length of each subinterval is decreased? Figure 6 shows the effect of taking first 16
and then 32 subintervals. In these pictures both the upper and lower estimates are
shown. We can think of the rectangles for the lower sum geometrically, as the tallest
rectangles that will fit under the graph. Similarly, the rectangles for the upper sum are
the shortest which enclose the graph.
[Figure 6: upper and lower rectangles for n = 16 and n = 32 subintervals.]
The rectangles contributing to the Riemann lower sum are unshaded, so the difference between the upper and lower sums is equal to the total area of the shaded
rectangles. In our earlier example we were able to use the fact that the function was increasing over the whole interval to visualize this difference. In this example the function
is increasing for some values of x and decreasing for others, so it is not so easy to find a
simple interpretation of the difference in areas. However, the pictures still suggest that
the difference approaches zero as the number of intervals is increased. In fact, careful
analysis (using more sophisticated mathematical ideas than we have available at this
stage) confirms this intuition: as the size of the subintervals is decreased to zero, the
upper and lower Riemann sums approach a common value. This is the number we call
the definite integral of f (x) over the interval [a, b]. We can summarize this as follows.
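The behaviour of the lower and upper sums as N grows can be explored numerically. The sketch below is an illustration under a stated assumption: the minimum and maximum of f on each subinterval are approximated by sampling f at evenly spaced points (including both endpoints), which is exact only when f is monotonic on the subinterval. The function name `riemann_sums`, the parameter `samples`, and the test function x^2 on [0, 1] are choices made for this example.

```python
def riemann_sums(f, a, b, N, samples=1000):
    """Approximate Riemann lower and upper sums of f on [a, b] with N equal subintervals.

    The minimum m_i and maximum M_i on each subinterval are estimated by
    sampling, which is an assumption of this sketch, not an exact method.
    """
    dx = (b - a) / N
    lower = upper = 0.0
    for i in range(N):
        left = a + i * dx
        values = [f(left + dx * j / samples) for j in range(samples + 1)]
        lower += min(values) * dx  # contributes m_i * dx
        upper += max(values) * dx  # contributes M_i * dx
    return lower, upper

def square(x):
    return x * x

# The gap U_N - L_N shrinks as N increases.
for N in (4, 16, 64):
    L_N, U_N = riemann_sums(square, 0.0, 1.0, N)
    print(N, L_N, U_N, U_N - L_N)
```

For this increasing function the gap U_N − L_N equals (f(1) − f(0)) × ∆x = 1/N, so both sums are squeezed towards a common value as N increases.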
The Definite Integral.
Suppose that f(x) is a continuous function defined on the interval [a, b]. For
each integer N ≥ 1 we can divide [a, b] into N equal subintervals and form
the associated Riemann upper and lower sums. As N → ∞ both the upper
and lower sums approach the same value. This value is called the definite
integral of f over the interval [a, b], and is written as
\[ \int_a^b f(x)\,dx. \]
It is the unique number which satisfies
\[ L_N \le \int_a^b f(x)\,dx \le U_N \]
for all N ≥ 1. Since the area under the graph of f(x) satisfies the same
inequalities, it must be equal to the integral of f over the interval. See below
for how to interpret the area in the case where f takes negative values.
Since both the upper and lower sums have the same limit as N increases, we can
also write
\[ \lim_{N \to \infty} L_N = \lim_{N \to \infty} U_N = \int_a^b f(x)\,dx. \]
Non-positive Functions (Stewart § 5.2)
If f (x) takes negative values we must modify the above argument slightly. We are
guided here by the algebraic formulas (2.3) and (2.4) for the upper and lower sums. As
before we divide [a, b] into N equal subintervals and let mi and Mi be the minimum
and maximum values of f on the ith subinterval. Now one or both of these numbers
may be negative.
We can still use rectangles to visualize the upper and lower Riemann sums. If mi
(or Mi ) is negative the rectangle appears below the axis. The term mi ×∆x (or Mi ×∆x)
in the Riemann sum is equal to minus the area of the rectangle. The Riemann sum is
therefore equal to the sum of the areas of all the rectangles above the axis, minus the
sum of the areas of all the rectangles below the axis. It is still true that the lower and
upper Riemann sums converge to a common value. The relation between the definite
integral and area continues to hold, except that areas below the axis count as negative.
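The sign convention can be checked with a short Python sketch. The example f (x) = x on [−1, 2] and the helper name `lower_sum_terms` are assumptions made for illustration. Since this f is increasing, mi is the value at the left endpoint of each subinterval.

```python
def lower_sum_terms(f, a, b, n):
    """Terms m_i * dx of the lower sum, for a function f increasing on [a, b]."""
    dx = (b - a) / n
    return [f(a + (i - 1) * dx) * dx for i in range(1, n + 1)]


def line(x):
    # Illustrative choice: f(x) = x is negative on [-1, 0) and positive on (0, 2].
    return x


terms = lower_sum_terms(line, -1.0, 2.0, 3000)
area_above = sum(t for t in terms if t > 0)   # rectangles above the axis
area_below = -sum(t for t in terms if t < 0)  # rectangles below the axis
lower = sum(terms)
print(area_above, area_below, lower)
```

The lower sum equals the area of the rectangles above the axis minus the area of those below it. Geometrically the two triangles under this graph have areas 2 and 1/2, so the sums should approach 2 − 1/2 = 1.5.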
[Figure 7: graphs of f (x) over the interval [a, b], in two panels labelled Upper Sum and Lower Sum.]
Note
In this discussion we have always used a subdivision of the interval of integration into
subintervals of equal length. This is sufficient for many practical applications, but from
a theoretical point of view there are some advantages in relaxing this condition. If we
subdivide into (finitely many) subintervals of possibly different lengths the upper and
lower Riemann sums are still defined. It can be proved that both sums converge to the
value of the definite integral, as defined above, as the length of the longest subinterval
is decreased towards zero.
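The following Python sketch illustrates the remark about unequal subintervals. Everything in it is an assumption chosen for illustration: the function f (x) = x² on [0, 1], the partition points (i/n)², which bunch up near 0, and the helper names.

```python
def lower_upper_sums(f, points):
    """(lower, upper) sums of an increasing f over the partition `points`."""
    lower = upper = 0.0
    for left, right in zip(points, points[1:]):
        width = right - left
        lower += f(left) * width   # minimum of an increasing f on [left, right]
        upper += f(right) * width  # maximum of an increasing f on [left, right]
    return lower, upper


def square(x):
    return x * x


def uneven_partition(n):
    """Hypothetical partition of [0, 1] into n subintervals of unequal length."""
    return [(i / n) ** 2 for i in range(n + 1)]


for n in (10, 100, 1000):
    points = uneven_partition(n)
    longest = max(right - left for left, right in zip(points, points[1:]))
    lower, upper = lower_upper_sums(square, points)
    print(n, longest, lower, upper)
```

As the longest subinterval shrinks, the two sums should again close in on the same value as in the equal-length case.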
Exercises
1. Partition the interval [1, 2] into N subintervals of equal length,
[x0 , x1 ], [x1 , x2 ], . . . , [xN −1 , xN ] .
(a) Show that xi = (N + i)/N.
(b) Show that the maximum value of f (x) = 1/x on [xi−1 , xi ] is N/(N + i − 1)
and the minimum value is N/(N + i).
(c) Show that the lower and upper Riemann sums for f (x) = 1/x on [1, 2] with
N subintervals are
\[ L_N = \frac{1}{N+1} + \frac{1}{N+2} + \cdots + \frac{1}{2N} \]
and
\[ U_N = \frac{1}{N} + \frac{1}{N+1} + \cdots + \frac{1}{2N-1} \]
respectively.
(d) Find a value of N such that the difference between the upper and lower sums
is less than $10^{-6}$.
Summary of Chapter 2.
• For a vehicle travelling with variable velocity, we can estimate the total
distance travelled over an interval of time by dividing the interval into
equal subintervals and making the approximation that the velocity is
constant on each subinterval. Taking this constant velocity equal to the
minimum (resp. maximum) velocity on each subinterval gives a lower
(resp. upper) estimate for the distance travelled.
• The difference between the upper and lower estimates approaches zero
as the length of the subintervals approaches zero. Their common limit is
the actual distance travelled.
• A similar construction can be applied to any continuous function f (x)
defined on an interval [a, b]. In this case the upper and lower estimates
are called the upper and lower Riemann sums.
• The common limit of the upper and lower Riemann sums as the length
of the subintervals shrinks to zero is called the definite integral of f
over [a, b]. This number can also be interpreted as the area under the
graph of f (areas below the axis counting as negative).
CHAPTER 3
The Definite Integral: Properties
IN A GIVEN EXAMPLE the upper and lower Riemann sums may be difficult to
calculate. The reason for this is that, except in some special situations, it may
be hard (or inconvenient) to find the minimum and maximum values of f (x) on each
subinterval. If the function f (x) increases with x then the minimum value on an interval
is always at the left endpoint, and the maximum value at the right. There is a similar
conclusion (with right and left reversed) if the function is decreasing on the interval.
In general the situation is not so simple. Fortunately we can avoid this difficulty by
introducing a more general idea of Riemann sum. The upper and lower Riemann sums
will then appear as special cases of this construction. As usual f (x) is a continuous
function defined on the interval [a, b].
Practical Calculation of Riemann Sums (Stewart § 5.2)
Suppose [a, b] is divided into N equal subintervals. For 1 ≤ i ≤ N, let ci be any point
in the ith subinterval and form the sum
\[ (f(c_1) \times \Delta x) + (f(c_2) \times \Delta x) + \cdots + (f(c_N) \times \Delta x) = \sum_{i=1}^{N} f(c_i) \times \Delta x . \tag{3.1} \]
This is the sum of areas of rectangles of width ∆x and height f (ci ). Figure 1 shows the
result of one choice of the ci . In this diagram the sum (3.1) is equal to the total area
of the shaded rectangles and the dots are located at the points (ci , f (ci )).
In this example the ci were chosen randomly. Alternatively we could consistently
choose ci to be the midpoint, left or right endpoint of the ith subinterval. Whatever
the choice of the ci , we certainly have
mi ≤ f (ci ) ≤ Mi ,
since mi and Mi are the minimum and maximum values of f on this subinterval. Multiply
by ∆x and add up over all subintervals. Then the sums $\sum m_i \Delta x$ and $\sum M_i \Delta x$ are
respectively the lower and upper Riemann sums LN and UN , and we get the inequality:
\[ L_N \le \sum_{i=1}^{N} f(c_i)\,\Delta x \le U_N . \tag{3.2} \]
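The inequality (3.2) can be checked numerically with a short Python sketch. The setup is an assumption for illustration: we take f (x) = e^x on [0, 1], which is increasing, so that the lower and upper sums come from the left and right endpoints, and we pick each ci at random inside its subinterval. The helper name `riemann_sum` is our own.

```python
import math
import random


def riemann_sum(f, a, b, n, choose):
    """Sum of f(c_i) * dx, where c_i = choose(left, right) on each subinterval."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * dx
        right = a + i * dx
        total += f(choose(left, right)) * dx
    return total


random.seed(0)  # fixed seed so that the run is repeatable
n = 50
# exp is increasing, so left endpoints give L_N and right endpoints give U_N.
lower = riemann_sum(math.exp, 0.0, 1.0, n, lambda left, right: left)
upper = riemann_sum(math.exp, 0.0, 1.0, n, lambda left, right: right)
# random.uniform(left, right) picks a point c_i in the subinterval.
sampled = riemann_sum(math.exp, 0.0, 1.0, n, random.uniform)
print(lower, sampled, upper)
```

Whatever seed is used, the middle number should lie between the other two.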
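[Figure 1: a Riemann sum for f (x) on [a, b], shown as shaded rectangles of height f (ci ) with dots at the points (ci , f (ci )).]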
The middle expression here is an example of a (general) Riemann Sum for f (x)
on the interval [a, b]. Its value obviously depends on the choice of the ci . By taking N
large enough, we can make LN and UN as close as we like to the value of the definite
integral. Since the integral also lies between LN and UN , the inequalities (3.2) imply
that any Riemann sum differs from the actual value of the integral by no more than
UN − LN . We illustrate this with an example.
Worked Example 3.1 Use Riemann sums to estimate the integral
\[ \int_1^2 \sin x \, dx . \]
Use a partition of [1, 2] into 20 subintervals and calculate the Riemann sum for each of
the three cases:
(a) ci is the left endpoint of the ith subinterval,
(b) ci is the right endpoint of the ith subinterval,
(c) ci is the midpoint of the ith subinterval.
Solution. Each subinterval has length 1/20 = 0.05, so the points
\[ x_i = 1.0 + 0.05 \times i \tag{3.3} \]
(for 0 ≤ i ≤ 20) will mark out the interval [1, 2] into the required subintervals. The ith
subinterval will then be the interval [xi−1 , xi ]. The three Riemann sums correspond to
the choices
\[ c_i = x_{i-1}, \qquad c_i = x_i, \qquad c_i = x_{i-1} + 0.025 . \tag{3.4} \]
In each case we have to work out the sum
\[ \sum_{i=1}^{20} \sin c_i \, \Delta x = \left( \sum_{i=1}^{20} \sin c_i \right) \times 0.05 . \tag{3.5} \]
This is quite easy with a programmable calculator or by writing a simple computer
program. But perhaps the simplest way to do the calculation is to use a spreadsheet
program. In the first column enter the numbers 1 to 20. On most spreadsheets these
will be the cells A1 to A20. Fill the second column with the numbers xi−1 using the
formula (3.3). Then the next three columns can be filled with the various choices of sin ci ,
using the formulas (3.4). Here is the resulting table, where the columns (1), (2), (3)
contain the values of sin ci for the three specified choices of ci .
  i     xi−1        (1)        (2)        (3)
  1   1.000000   0.841471   0.867423   0.854714
  2   1.050000   0.867423   0.891207   0.879590
  3   1.100000   0.891207   0.912764   0.902268
  4   1.150000   0.912764   0.932039   0.922690
  5   1.200000   0.932039   0.948985   0.940806
  6   1.250000   0.948985   0.963558   0.956570
  7   1.300000   0.963558   0.975723   0.969944
  8   1.350000   0.975723   0.985450   0.980893
  9   1.400000   0.985450   0.992713   0.989391
 10   1.450000   0.992713   0.997495   0.995415
 11   1.500000   0.997495   0.999784   0.998952
 12   1.550000   0.999784   0.999574   0.999991
 13   1.600000   0.999574   0.996865   0.998531
 14   1.650000   0.996865   0.991665   0.994576
 15   1.700000   0.991665   0.983986   0.988134
 16   1.750000   0.983986   0.973848   0.979223
 17   1.800000   0.973848   0.961275   0.967864
 18   1.850000   0.961275   0.946300   0.954086
 19   1.900000   0.946300   0.928960   0.937923
 20   1.950000   0.928960   0.909297   0.919416
 Riemann sum     0.954554   0.957946   0.956549
Once these numbers are entered, we can calculate the Riemann sums by summing the
columns and using the formula (3.5). This gives the values shown in the bottom line of
the table.
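For readers who prefer a short program to a spreadsheet, the same calculation can be sketched in Python as follows. This is only one possible way to organise it; the variable names are our own. The three results should agree with the bottom line of the table up to rounding in the sixth decimal place.

```python
import math

a, b, n = 1.0, 2.0, 20
dx = (b - a) / n  # length of each subinterval, 0.05

# Division points x_0, ..., x_20 from formula (3.3).
x = [a + dx * i for i in range(n + 1)]

# The three choices of c_i from (3.4), summed as in (3.5).
left = sum(math.sin(x[i - 1]) for i in range(1, n + 1)) * dx
right = sum(math.sin(x[i]) for i in range(1, n + 1)) * dx
mid = sum(math.sin(x[i - 1] + dx / 2) for i in range(1, n + 1)) * dx

print(f"{left:.6f} {right:.6f} {mid:.6f}")
```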
Of course (anticipating a later section), it is easy to calculate this integral exactly.
In fact
\[ \int_1^2 \sin x \, dx = \int_1^2 \frac{d}{dx}(-\cos x) \, dx = (-\cos 2) - (-\cos 1) \approx 0.956449 . \]
Note that in this example none of the three Riemann sums gives the lower or upper
Riemann sum. The maximum value of sin x occurs at x = π/2, which is inside the
range of integration. Up to this point the function is increasing. After this point it is
decreasing. In some cases the maximum value of the function occurs at the right of the
subinterval, other times on the left, and in one case (at π/2) inside the subinterval.
Properties of the Definite Integral (Stewart § 5.2)
Before continuing, we will note some of the basic properties of the definite integral
as defined above. After some practice, most people use these properties all the time
without further thought. However, it is instructive to see how they follow from our
definition of the integral.
Properties of the Definite Integral.
• If m and M are the minimum and maximum values of f on the interval [a, b], then
\[ m \times (b - a) \le \int_a^b f(x)\,dx \le M \times (b - a) . \tag{3.6} \]
• If c is a constant, then
\[ \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx . \tag{3.7} \]
• For functions f and g defined on the interval [a, b],
\[ \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \tag{3.8} \]
• If f is defined on the interval [a, c], and b is a point between a and c, then
\[ \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx . \tag{3.9} \]
The first formula (3.6) is just the relation between the definite integral and the
upper and lower Riemann sums in the case N = 1, so there is a single interval equal to
all of [a, b].
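As a quick check of (3.6), consider the integral of Worked Example 3.1. On [1, 2] the function sin x has minimum value m = sin 1 ≈ 0.841 (at the left endpoint, since sin 2 ≈ 0.909 is larger) and maximum value M = 1 (at x = π/2), and b − a = 1, so

```latex
\[
0.841 \approx \sin 1 \times (2 - 1) \;\le\; \int_1^2 \sin x \, dx \;\le\; 1 \times (2 - 1) = 1 .
\]
```

The value 0.956449 found in Worked Example 3.1 does lie in this range. The bound is crude, but it costs almost nothing to compute.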
To see where the next two equations come from, we use the definition of the integral
as a limit of Riemann sums. We also need some basic properties of limits.
For any integer N ≥ 1 we subdivide [a, b] into N subintervals. For each i with
1 ≤ i ≤ N we choose a point ci in the ith subinterval. Let SN (f ) and SN (cf ) be
the corresponding Riemann sums for the two functions f (x) and cf (x). It follows
immediately from the definition of the Riemann sum that SN (cf ) = cSN (f ). Then,
from the standard properties of limits,
\[ \int_a^b c f(x)\,dx = \lim_{N\to\infty} S_N(cf) = \lim_{N\to\infty} c\,S_N(f) = c \lim_{N\to\infty} S_N(f) = c \int_a^b f(x)\,dx . \]
This proves (3.7). For (3.8), take N and the ci as before and let SN (f ), SN (g) and
SN (f + g) be the corresponding Riemann sums for f , g and f + g. Then
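\begin{align*}
S_N(f + g) &= \sum_{i=1}^{N} [f(c_i) + g(c_i)] \times \Delta x \\
&= \left( \sum_{i=1}^{N} f(c_i) \times \Delta x \right) + \left( \sum_{i=1}^{N} g(c_i) \times \Delta x \right) \\
&= S_N(f) + S_N(g) .
\end{align*}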
Then
\begin{align*}
\int_a^b (f(x) + g(x))\,dx &= \lim_{N\to\infty} S_N(f + g) \\
&= \lim_{N\to\infty} (S_N(f) + S_N(g)) \\
&= \lim_{N\to\infty} S_N(f) + \lim_{N\to\infty} S_N(g) \\
&= \int_a^b f(x)\,dx + \int_a^b g(x)\,dx .
\end{align*}
If we interpret the definite integral as an area, the final formula (3.9) is just the fact
that the area over the interval [a, c] is the sum of the areas over the two intervals
[a, b] and [b, c]. A formal mathematical proof of this depends on the more general type
of Riemann sum (with subintervals of different lengths) mentioned at the end of the
previous chapter.
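The identities used in these arguments hold already at the level of Riemann sums, which is easy to confirm numerically. The Python sketch below is an illustration under our own assumptions: f = sin, g = exp, the constant 3, midpoints for the ci , and the helper name `riemann_sum`. For (3.9) it takes a = 0, b = 1, c = 2 and uses the same ∆x on each piece, so that the rectangles over [0, 2] are exactly those over [0, 1] together with those over [1, 2].

```python
import math


def riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b], taking c_i as the midpoint of each subinterval."""
    dx = (b - a) / n
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx


f, g = math.sin, math.exp
const = 3.0  # plays the role of the constant c in (3.7)

# (3.7): S_N(cf) - c * S_N(f) should vanish up to rounding error.
gap_scale = (riemann_sum(lambda x: const * f(x), 0.0, 1.0, 200)
             - const * riemann_sum(f, 0.0, 1.0, 200))

# (3.8): S_N(f + g) - (S_N(f) + S_N(g)) should vanish up to rounding error.
gap_add = (riemann_sum(lambda x: f(x) + g(x), 0.0, 1.0, 200)
           - (riemann_sum(f, 0.0, 1.0, 200) + riemann_sum(g, 0.0, 1.0, 200)))

# (3.9): the sum over [0, 2] against the sums over [0, 1] and [1, 2].
gap_split = (riemann_sum(f, 0.0, 2.0, 400)
             - (riemann_sum(f, 0.0, 1.0, 200) + riemann_sum(f, 1.0, 2.0, 200)))

print(gap_scale, gap_add, gap_split)
```

All three printed gaps should be zero apart from floating-point rounding.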
Reversing the Direction of Integration (Stewart § 5.2)
So far we have only defined $\int_a^b f(x)\,dx$ in the case where a ≤ b. It is quite useful
to extend the definition to the case a > b. We can easily do this in a way which is
consistent with the existing definition and properties. Note that in the definition of the
Riemann sum as
\[ \sum_{i=1}^{N} f(c_i)\,\Delta x \tag{3.10} \]
we have ∆x = (b − a)/N. If a > b we can use the same formula. The only new feature
is that now ∆x is negative. From an algebraic point of view this has no effect on our
formulas. Geometrically it means that in the Riemann sum areas of rectangles above
the axis now count as negative, and areas below the axis count as positive.
The easiest way to see the implication of all this is to go back to the Riemann
sum (3.10) and look at the effect of interchanging a and b. The only difference which this
makes to the formula is to change the sign of ∆x. Hence the Riemann sum also simply
changes sign. In the limit as ∆x → 0 the same is true of the definite integral, and we
conclude that
\[ \int_a^b f(x)\,dx = - \int_b^a f(x)\,dx . \]
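The sign change is visible directly in a computation. In the Python sketch below (an illustration with our own choices: f (x) = x² between 1 and 3, midpoints for the ci , and the helper name `riemann_sum`) the same formula is used in both directions, and ∆x = (b − a)/N simply comes out negative when a > b.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum; dx = (b - a) / n is negative when a > b."""
    dx = (b - a) / n
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx


def square(x):
    return x * x


forward = riemann_sum(square, 1.0, 3.0, 100)   # a = 1, b = 3, dx > 0
backward = riemann_sum(square, 3.0, 1.0, 100)  # a = 3, b = 1, dx < 0
print(forward, backward)
```

The two sums use the same midpoints, so they should agree in size and differ only in sign.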
One result of this is that the formula (3.9) now holds whatever the order of the numbers a, b, c. For example, starting with a ≤ b ≤ c and the relation
\[ \int_a^b + \int_b^c = \int_a^c \]
we can rearrange to get
\[ \int_a^b = \int_a^c - \int_b^c = \int_a^c + \int_c^b . \]
This is essentially just (3.9) again, except the point c no longer lies between a and b.
Fundamental Theorem of Calculus (Part 2) (Stewart § 5.3)
We used the relation between velocity and distance to motivate the introduction of
the definite integral, but we have not so far discussed the exact mathematical relation
between differentiation and integration. There are really two parts to this relation. One
part involves the derivative of an integral, and the other the integral of a derivative.
Together these make up the Fundamental Theorem of Calculus. In this section we
consider the second part: what happens when we integrate the derivative of a function?
The answer is given by the following theorem.
The Fundamental Theorem of Calculus II.
Let F (x) be a function defined on an interval [a, b] of the real line. Suppose
that the derivative of F is defined at each point x of the interval, and that the
resulting function F ′ (x) is continuous. Then
\[ \int_a^b F'(x)\,dx = F(b) - F(a) . \tag{3.11} \]
Recall that the definite integral of F ′ (x) is the unique number which lies between
the upper and lower Riemann sums of F ′ (x) for all subdivisions of the interval [a, b]. If
we can show that the number F (b) − F (a) has the same property, then (3.11) is proved.
For this we need the Mean Value Theorem of differential calculus (see, for example, Stewart § 4.2). Here is the statement of the theorem: for a function F (x) defined
on an interval [u, v] with continuous derivative F ′ (x), there exists a point w somewhere
in the interval with the property that
\[ F'(w) = \frac{F(v) - F(u)}{v - u} . \]
Essentially, this theorem says that the average rate of change over the interval (the ratio
of the change in F (x) to the change in x) is equal to the derivative of F at some point
in the interval. To prove the formula (3.11) we apply this theorem as follows. Starting
with the function F defined on [a, b], partition the interval into N equal subintervals.
Label the division points as xi , with 0 ≤ i ≤ N, so that the ith subinterval is [xi−1 , xi ]
and
a = x0 ≤ x1 ≤ · · · ≤ xN −1 ≤ xN = b .
As usual we let mi , Mi be the minimum and maximum values of F ′ (x) on this subinterval,
and ∆x the length of the subintervals. According to the Mean Value Theorem there
exists a point ci in the ith subinterval where
\[ F(x_i) - F(x_{i-1}) = F'(c_i) \times (x_i - x_{i-1}) = F'(c_i) \times \Delta x . \tag{3.12} \]
Adding up over the range 1 ≤ i ≤ N, the left side is just
(F (x1 ) − F (x0 )) + (F (x2 ) − F (x1 )) + (F (x3 ) − F (x2 )) + · · ·
· · · + (F (xN −1 ) − F (xN −2 )) + (F (xN ) − F (xN −1 )) .
All the terms except F (x0 ) and F (xN ) appear twice, with opposite signs. Therefore
they cancel out, leaving only
F (xN ) − F (x0 ) = F (b) − F (a) .
Summing the right side just gives a Riemann sum for F ′ (x) over the interval [a, b]. We
conclude that
\[ \sum_{i=1}^{N} F'(c_i) \times \Delta x = F(b) - F(a) . \]
But any Riemann sum for the specified partition lies between the upper and lower
Riemann sums, so the same is true for F (b) − F (a). This completes the proof of this
part of the Fundamental Theorem.
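The key step of the proof can be watched in action for a function where the Mean Value Theorem points are known explicitly. The Python sketch below is an illustration under our own assumptions: we take F (x) = x³ on [0, 2], for which F ′ (w) = (F (v) − F (u))/(v − u) reduces to 3w² = u² + uv + v², and the helper names are hypothetical.

```python
import math


def F(x):
    return x ** 3


def F_prime(x):
    return 3 * x ** 2


def mean_value_point(u, v):
    """Point w in [u, v] with F'(w) = (F(v) - F(u)) / (v - u), for F(x) = x**3.

    Solving 3 w**2 = u**2 + u*v + v**2 gives the formula below (valid for
    0 <= u < v).
    """
    return math.sqrt((u * u + u * v + v * v) / 3)


a, b, n = 0.0, 2.0, 10
dx = (b - a) / n
x = [a + i * dx for i in range(n + 1)]

# One Mean Value Theorem point c_i in each subinterval [x_{i-1}, x_i].
c = [mean_value_point(x[i - 1], x[i]) for i in range(1, n + 1)]

# Riemann sum for F' built from these particular points, as in (3.12).
riemann = sum(F_prime(ci) * dx for ci in c)
print(riemann, F(b) - F(a))
```

Even with only 10 subintervals, this particular Riemann sum should reproduce F (b) − F (a) = 8 up to rounding error, because the terms telescope exactly as in the proof.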
Notation The change F (b) − F (a) of a function F over an interval is often denoted
by
\[ F(b) - F(a) = [F(x)]_a^b . \]
The theorem then appears in the form
\[ \int_a^b F'(x)\,dx = [F(x)]_a^b . \]
The important thing about this result is that it gives us a potential shortcut to working
out a definite integral. In order to evaluate the integral
\[ \int_a^b f(x)\,dx \]
we can look for a function F (x) with the property that F ′ (x) = f (x) on the interval [a, b].
According to the fundamental theorem we then have
\[ \int_a^b f(x)\,dx = \int_a^b F'(x)\,dx = F(b) - F(a) . \]
The function F (x) is called an antiderivative of f (x). In this way the Fundamental
Theorem of Calculus gives us a link between ‘the area under the curve’ (in terms of Riemann sums) and ‘reverse differentiation’ (finding an antiderivative). It also motivates
us to look at techniques for finding antiderivatives. You should be aware however that
there are many examples of quite simple functions for which there is no simple formula
for an antiderivative.
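As a small illustration of the shortcut (the function is our own choice), take f (x) = x² on [1, 3]. The function F (x) = x³/3 satisfies F ′ (x) = x², so it is an antiderivative of f and

```latex
\[
\int_1^3 x^2 \, dx = \left[ \frac{x^3}{3} \right]_1^3 = \frac{27}{3} - \frac{1}{3} = \frac{26}{3} .
\]
```

No Riemann sums are needed once an antiderivative is known.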
Exercises
1. Given that
\[ \int_{-3}^{1} f(x)\,dx = -2 , \qquad \int_{1}^{2} f(x)\,dx = 5 , \qquad \int_{-3}^{2} g(x)\,dx = 8 , \]
evaluate, where possible:
(a) $\int_{-3}^{2} (f(x) + g(x))\,dx$,
(b) $\int_{-3}^{2} \frac{g(x)}{2}\,dx$,
(c) $\int_{-3}^{2} f(x)g(x)\,dx$.
Summary of Chapter 3.
• We can estimate the value of the definite integral using the values of f at
any point in each subinterval. The resulting Riemann sum lies between
the upper and lower Riemann sums for the same partition.
• This type of Riemann Sum is easier to calculate, since we do not have to
locate the maximum and minimum values of the function on each interval.
The difference between such a Riemann sum and the definite integral is
no bigger than the difference between the lower and upper Riemann sums
for the same partition.
• The Fundamental Theorem of Calculus shows that the definite integral
of the derivative of a function F (x) is equal to the change in F over
the interval. The proof uses the Mean Value Theorem from differential
calculus.
• We can try to evaluate a definite integral for a function f by finding an
antiderivative of f and using the Fundamental Theorem. This method
fails when we cannot find a formula for the antiderivative.