
0. Revision

MAM2083F
2021
Vector Calculus for Engineers
0.1. Functions
Mathematics is really not about numbers; it’s about functions.
In your first-year courses, you learnt that a function is a rule that assigns to each
element π‘₯ of a set 𝐴 an element 𝑦 of a set 𝐡. Often we write something like 𝑦 = 𝑓 (π‘₯),
which then gives the function a name, 𝑓 . The basic idea is that 𝑓 somehow transforms
π‘₯ into 𝑦 : we call 𝑦 the value of the function 𝑓 at the point π‘₯. This transformation might
work in different ways; the rule that assigns 𝑦 to π‘₯ might be given by a formula, or
might be found by doing an experiment or consulting a table or looking at a graph. In
fact, this rule might depend on π‘₯ : for example, the absolute value function is defined
by
|π‘₯| = −π‘₯ if π‘₯ < 0, and |π‘₯| = π‘₯ if π‘₯ ≥ 0,
for all π‘₯ ∈ R. In more extreme examples, you might encounter functions that are given
by a formula for certain values of π‘₯, but by a completely different method for others.
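If you like to see such a piecewise rule in computational form, here is a small sketch in Python (the function name `absolute_value` is mine; Python's built-in `abs` does the same job):

```python
def absolute_value(x):
    """The piecewise rule above: -x when x < 0, and x when x >= 0."""
    if x < 0:
        return -x
    return x

# The two branches of the rule agree with Python's built-in abs.
for x in [-3.5, -1, 0, 2, 7.25]:
    assert absolute_value(x) == abs(x)
```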
Whenever possible, I try to introduce a new function by writing something like 𝑓 : 𝐴 →
𝐡. This has the advantage of specifying the name 𝑓 , the domain 𝐴 and the codomain 𝐡
of the function. However, sometimes I choose to be less formal, and write something
cos π‘₯
. Here I haven’t given the function a name and it isn’t immediately
like 𝑦 =
1 + sin π‘₯
clear what the domain and the codomain are: you somehow have to work this out.
In your first-year courses, you worked mainly with functions whose domains and
codomains were both subsets of R. For such functions, it is often convenient to study
the behaviour of the function by looking at its graph, which is a subset of the cartesian
plane R2 . In MAM2083F, we are more interested in functions whose domains and
codomains are subsets of Rπ‘š or R𝑛 , where either π‘š > 1 or 𝑛 > 1. The graph of such a
function is a subset of Rπ‘š+𝑛 , so it’s not as easy to visualise.
0.2. Limits and continuity
When I started talking about "graphs", you probably thought of a curve in R2 . However,
not all functions are continuous, so even if 𝐴 and 𝐡 are subsets of R, the graph of a
function 𝑓 : 𝐴 → 𝐡 need not be a nice smooth curve in R2 .
The concept of continuity is very closely related to the idea of a limit. In a sense, the
limit of a function 𝑓 : 𝐴 → 𝐡 at a point π‘Ž is the value which the function would have at
the point π‘Ž if it were continuous at π‘Ž. It is useful to think of the function 𝑓 as being some
kind of a "black box", that takes a number π‘₯ as an input and somehow transforms this
into an output 𝑓 (π‘₯). Rather than treat this totally abstractly, let’s imagine that 𝑓 is a
machine, or a piece of electronic circuitry. For example, 𝑓 might be a power supply, that
converts alternating current into direct current. In this analogy, π‘₯ is the input voltage,
while 𝑓 (π‘₯) is the output voltage. Suppose that this power supply is designed to convert
220V of alternating current into 5V of direct current, which you can use to charge your
smartphone. The input is unlikely to be exactly 220V, but fortunately your smartphone
is probably able to tolerate a voltage that is slightly under or over 5V. You would not
be able to use this power supply if its output were outside this range. Even if an input
voltage of exactly 220V produces an output voltage of precisely 5V, you wouldn’t be
happy to use this power supply if you knew that a tiny deviation in the input voltage
could result in an output voltage that is outside the range that your phone can tolerate.
This is the whole idea behind continuity: if the function that relates output voltages to
input voltages "jumps" in some way near 220V, then the power supply can’t be trusted.
Let’s put π‘Ž = 220V and β„“ = 5V. Suppose your phone can tolerate an output voltage
between β„“ − πœ€ and β„“ + πœ€, where πœ€ is a relatively small positive amount, while you know
that the input voltage π‘₯ is certain to lie between π‘Ž − 𝛿 and π‘Ž + 𝛿, where 𝛿 is another
small amount. Thus the question is whether an input voltage that is between π‘Ž − 𝛿 and
π‘Ž + 𝛿 always results in an output voltage between β„“ − πœ€ and β„“ + πœ€.
[Figure: the graph of output voltage against input voltage, with the band from β„“ − πœ€ to β„“ + πœ€ marked on the output axis and the band from π‘Ž − 𝛿 to π‘Ž + 𝛿 marked on the input axis.]
If this isn’t the case, then you can’t trust the power supply. In a sense, πœ€ measures how
much variation your phone can tolerate in the output voltage, while 𝛿 measures how
much variation is acceptable in the input voltage.
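If you want to play with this idea numerically, here is a rough sketch in Python. The transfer function `f` below is invented purely for illustration (a real power supply is certainly not linear like this), and the values of πœ€ and 𝛿 are arbitrary:

```python
def f(x):
    """A made-up, continuous 'power supply' transfer function:
    an input voltage near 220 V is mapped to an output near 5 V."""
    return 5.0 * x / 220.0

a, ell = 220.0, 5.0
eps = 0.1      # how much variation the phone tolerates in the output
delta = 2.0    # how much the input voltage is assumed to vary

# Sample inputs across [a - delta, a + delta] and check whether every
# corresponding output lies within (ell - eps, ell + eps).
inputs = [a - delta + k * (2 * delta) / 1000 for k in range(1001)]
ok = all(abs(f(x) - ell) < eps for x in inputs)
print(ok)
```

For this particular 𝑓, shrinking 𝛿 can always force the outputs into the πœ€-band, which is exactly what continuity at π‘Ž demands.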
Of course, the power supply isn’t actually a "black box": if you could look inside, you
would see that it’s made up of electronic components, such as resistors, capacitors and
so on. A clever engineer should be able to work out how all these components interact
with one another. By consulting the datasheets for these components and applying the
appropriate formulas, this engineer should be able to calculate the output voltage that
the power supply will produce for any given input. With this information, you will be
able to see whether the power supply can be trusted. In a similar way, a function 𝑓 can
often be broken down into simpler parts. A clever mathematician is able to figure out
whether 𝑓 is continuous at π‘₯ = π‘Ž by looking at these parts and how they fit together.
Most of the important results about limits say things like "If 𝑓 and 𝑔 both have limits
at π‘₯ = π‘Ž, then so does 𝑓 + 𝑔, with lim_{π‘₯→π‘Ž} ( 𝑓 (π‘₯) + 𝑔(π‘₯)) = lim_{π‘₯→π‘Ž} 𝑓 (π‘₯) + lim_{π‘₯→π‘Ž} 𝑔(π‘₯)". These
results are all proved using πœ€’s and 𝛿’s, as we have just been discussing. Unfortunately,
you probably weren’t taught these proofs in first year, as some people mistakenly
believe that engineers don’t have to know them. However, I hope that the example
above makes you realise that this isn’t true: on the contrary, engineers are constantly
working with errors and uncertainties and need to know whether a slight variation in
one quantity could result in an unacceptably large variation in another.
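A numerical experiment (which, of course, is not a proof) can make a result like this concrete. In the Python sketch below, 𝑓 (π‘₯) = sin π‘₯/π‘₯ is not even defined at π‘₯ = 0, yet the values of 𝑓 + 𝑔 settle down to lim 𝑓 + lim 𝑔 = 1 + 1 = 2 as π‘₯ → 0:

```python
import math

def f(x):
    return math.sin(x) / x   # undefined at x = 0; the limit there is 1

def g(x):
    return math.cos(x)       # continuous at 0, with g(0) = 1

# Evaluate f + g at inputs approaching a = 0 (never at 0 itself);
# the printed values approach 2.
for h in [1e-1, 1e-3, 1e-6]:
    print(h, f(h) + g(h))
```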
If you know anything at all about alternating current, then you will know that the story
above is simplistic: for example, the input might vary not only in voltage, but also in
frequency. The output might change as certain components heat up, or it might suffer
from "ripple". This means that the input might not depend on a single number π‘₯ and
its output 𝑓 (π‘₯) might have more than one component. This fits in with what we are
going to talk about in MAM2083F: functions which have more than one argument or
whose values are not just a single number.
0.3. Differentiability
You should know from first year that one of the most important applications of limits
is to define the idea of a derivative. I’m not going to repeat most of the story here; in
particular, you should be familiar with the notion of a tangent line and you should
know that the derivative can be interpreted as an "instantaneous rate of change". You
should also know all the rules of differentiation, including the derivatives of the trig
functions sin, cos, tan and sec and the inverse trig functions arcsin, arccos and arctan.
As before, let 𝐴 and 𝐡 be subsets of R.
Definition 0.1. We say that a function 𝑓 : 𝐴 → 𝐡 is differentiable at π‘Ž ∈ 𝐴 if
𝑓 (π‘₯) − 𝑓 (π‘Ž)
π‘₯→π‘Ž
π‘₯−π‘Ž
𝑓 ′(π‘Ž) = lim
exists. If this is the case, then 𝑓 ′(π‘Ž) is known as the derivative of 𝑓 at the point π‘Ž.
3
(1)
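Numerically, you can explore the limit in Definition 0.1 by evaluating the difference quotient for smaller and smaller β„Ž. A sketch in Python (the helper name is mine):

```python
import math

def difference_quotient(f, a, h):
    """(f(a + h) - f(a)) / h, the quantity whose limit is f'(a)."""
    return (f(a + h) - f(a)) / h

# Estimate the derivative of sin at a = 0; the exact value is cos(0) = 1.
for h in [0.1, 0.01, 0.001]:
    print(h, difference_quotient(math.sin, 0.0, h))
```

Making β„Ž too small eventually ruins the estimate through rounding error, which is one reason this is only an illustration of the definition.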
If 𝑓 is differentiable at π‘Ž, then the graph of 𝑦 = 𝑓 (π‘₯) has a tangent line at the point
(π‘Ž, 𝑓 (π‘Ž)). The equation of this tangent line is
𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) .
Note that this line has slope π‘š = 𝑓 ′(π‘Ž).
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) and its tangent line 𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)(π‘₯ − π‘Ž) at the point (π‘Ž, 𝑓 (π‘Ž)).]
An important observation is that near the point (π‘Ž, 𝑓 (π‘Ž)), the graph of 𝑦 = 𝑓 (π‘₯) looks
very much like its tangent line. This is the idea behind the linear approximation
𝑓 (π‘₯) ≈ 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) ,
(2)
which in turn motivates important results such as Taylor’s Theorem. We can make this
"approximation" a little more precise. Suppose for each π‘₯ ∈ 𝐴, we put β„Ž = π‘₯ − π‘Ž, so
that π‘₯ = π‘Ž + β„Ž. As you well know, we can write equation (1) in the form
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
.
β„Ž
β„Ž→0
𝑓 ′(π‘Ž) = lim
For each β„Ž ≠ 0, put
𝑠(β„Ž) = ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)) / β„Ž − 𝑓 ′(π‘Ž).   (3)
Then
lim_{β„Ž→0} 𝑠(β„Ž) = lim_{β„Ž→0} [ ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)) / β„Ž − 𝑓 ′(π‘Ž) ] = 𝑓 ′(π‘Ž) − 𝑓 ′(π‘Ž) = 0.
In other words, 𝑠(β„Ž) → 0 as β„Ž → 0. Remarkably enough, although we’ve used the fact that
𝑓 is differentiable at the point π‘Ž to prove that 𝑠(β„Ž) → 0 as β„Ž → 0, the converse is also
true: we’ll show that if 𝑠(β„Ž) → 0 as β„Ž → 0, then the function 𝑓 is differentiable at π‘Ž.
Before we prove this converse, let’s complete our explanation of how 𝑠(β„Ž) can be used
to make the linear approximation (2) more precise. It’s not difficult to rewrite the
equation
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
𝑠(β„Ž) =
− 𝑓 ′(π‘Ž)
β„Ž
in the form
𝑓 (π‘Ž + β„Ž) = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)β„Ž + 𝑠(β„Ž)β„Ž,
which in turn can be written as
𝑓 (π‘₯) = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž.
(4)
Thus 𝑠(β„Ž)β„Ž is the difference between the actual value of 𝑓 (π‘₯) and the approximate value
𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) predicted by (2).
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) and the tangent line 𝑦 = 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž)(π‘₯ − π‘Ž); at π‘₯ = π‘Ž + β„Ž, the vertical gap between the graph and the tangent line is 𝑠(β„Ž)β„Ž.]
Since 𝑓 (π‘₯) is the 𝑦-coordinate of a point on the graph of 𝑦 = 𝑓 (π‘₯) and 𝑓 (π‘Ž)+ 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž)
is the 𝑦-coordinate of the corresponding point on the tangent line, the fact that 𝑠(β„Ž)β„Ž →
0 as π‘₯ → π‘Ž means that the graph and the tangent line get closer and closer together as
you approach the point (π‘Ž, 𝑓 (π‘Ž)). Notice that
𝑓 (π‘₯) − 𝑓 (π‘Ž)
− 𝑓 ′(π‘Ž)
π‘₯−π‘Ž
is the difference between the slope of the secant line joining the points (π‘Ž, 𝑓 (π‘Ž)) and
(π‘₯, 𝑓 (π‘₯)) and the slope of the tangent line through (π‘Ž, 𝑓 (π‘Ž)). The fact that this difference
tends to 0 explains why the secant line tends towards the tangent line as π‘₯ → π‘Ž.
𝑠(β„Ž) =
[Figure: the graph of 𝑦 = 𝑓 (π‘₯), showing the secant line through (π‘Ž, 𝑓 (π‘Ž)) and (π‘₯, 𝑓 (π‘₯)), whose slope is ( 𝑓 (π‘₯) − 𝑓 (π‘Ž))/(π‘₯ − π‘Ž), and the tangent line through (π‘Ž, 𝑓 (π‘Ž)), whose slope is 𝑓 ′(π‘Ž); the horizontal run between the two points is β„Ž = π‘₯ − π‘Ž, and the vertical gap between the two lines at π‘₯ is 𝑠(β„Ž)β„Ž.]
You’ve probably noticed that 𝑠(β„Ž)β„Ž is actually the product of two factors, both of which
tend to 0 as π‘₯ → π‘Ž. This tells us that the difference between 𝑓 (π‘₯) and 𝑓 (π‘Ž) + 𝑓 ′(π‘Ž) (π‘₯ − π‘Ž) gets
small very quickly as π‘₯ → π‘Ž. As we shall see, this is actually what makes the function
𝑓 differentiable. It’s time for us to prove the converse result I mentioned earlier. Since
𝑓 ′(π‘Ž) appears in the formula
𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž)
− 𝑓 ′(π‘Ž),
β„Ž
when we use this formula to define 𝑠(β„Ž), we are obviously assuming that 𝑓 is differentiable. Suppose however that we have somehow found a quantity 𝑠(β„Ž) that tends to 0
as β„Ž → 0, and which is such that
𝑠(β„Ž) =
𝑓 (π‘₯) = 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž
(5)
for all π‘₯ ∈ 𝐴 for some constant π‘š. (Remember that β„Ž = π‘₯ − π‘Ž.) You will recognise that
𝑦 = 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž)
is the equation of a line that passes through (π‘Ž, 𝑓 (π‘Ž)) and has slope π‘š. Equation (5)
tells us that the graph of 𝑦 = 𝑓 (π‘₯) gets closer and closer to this line as π‘₯ → π‘Ž. To show
that 𝑓 is differentiable at π‘Ž, we look at
𝑓 (π‘₯) − 𝑓 (π‘Ž)
( 𝑓 (π‘Ž) + π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž)β„Ž) − 𝑓 (π‘Ž)
= lim
π‘₯→π‘Ž
π‘₯→π‘Ž
π‘₯−π‘Ž
π‘₯−π‘Ž
π‘š (π‘₯ − π‘Ž) + 𝑠(β„Ž) (π‘₯ − π‘Ž)
= lim
π‘₯→π‘Ž
π‘₯−π‘Ž
= π‘š + lim 𝑠(β„Ž)
lim
π‘₯→π‘Ž
= π‘š.
This calculation shows not only that 𝑓 is differentiable at π‘Ž, but also that 𝑓 ′(π‘Ž) = π‘š.
Thus the line we’ve just been talking about is actually the tangent line. The point
is that we were able to introduce this line, and show that its slope is 𝑓 ′(π‘Ž), without
assuming that the function 𝑓 is differentiable. The fact that 𝑓 is differentiable follows
from equation (5) and the fact that 𝑠(β„Ž) → 0 as β„Ž → 0; notice that we don’t need an explicit
formula for 𝑠(β„Ž).
So we’ve proved the converse. This means that 𝑓 being differentiable at π‘Ž is equivalent
to equation (5) holding for all π‘₯ ∈ 𝐴, where π‘š is a constant and 𝑠(β„Ž) → 0 as β„Ž → 0.
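This characterisation also suggests a numerical test for a candidate slope π‘š: compute 𝑠(β„Ž) = ( 𝑓 (π‘Ž + β„Ž) − 𝑓 (π‘Ž) − π‘šβ„Ž)/β„Ž and watch whether it tends to 0. A sketch in Python (the correct slope for sin at π‘Ž = 0.5 is cos 0.5; the slope 1 is deliberately wrong):

```python
import math

def s(f, a, m, h):
    """s(h) from equation (5): (f(a + h) - f(a) - m * h) / h.
    It tends to 0 as h -> 0 exactly when f'(a) exists and equals m."""
    return (f(a + h) - f(a) - m * h) / h

a = 0.5
for h in [0.1, 0.001, 1e-6]:
    print(h,
          s(math.sin, a, math.cos(a), h),   # correct slope: s(h) -> 0
          s(math.sin, a, 1.0, h))           # wrong slope: s(h) stays away from 0
```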
You might think that this is all very abstract. However, the linear approximation (2)
is one of the most important ideas in Calculus. As I have already mentioned, it is
the motivation for such ideas as Taylor’s Theorem. In the form of equation (4), it can
be used to prove important results such as the Chain Rule. In MAM2083F, we are
going to use these ideas to define the notion of differentiability for functions of several
variables. We will see that for such functions, the idea that "a function is differentiable
if its derivative exists" doesn’t work very well. Instead, we will define differentiability
using something like equation (5).
0.4. The Riemann integral
As you know very well, we use integrals to calculate areas, volumes, arclengths,
averages and so on. Most of the important applications of integrals, such as finding
areas and volumes, involve breaking something complicated into the sum of a
large number of tiny parts and then estimating this sum in some way. In your first-year
courses, one of the most important results you learn is the Fundamental Theorem of
Calculus, which establishes the link between integrals and antiderivatives. In those
courses, you spend a lot of time learning various "techniques of integration", which
are mainly clever tricks for finding antiderivatives. All this is important, but I want
to spend a little time revising how we define the notion of an integral, using Riemann
sums. In MAM2083F we will extend this to functions which are defined on various
subsets of R𝑛 , or which take on values which are vectors in Rπ‘š , but it helps to remind
ourselves how it is done.
In first year, you usually start with the story of calculating the area under the graph of
a function 𝑦 = 𝑓 (π‘₯) on an interval [π‘Ž, 𝑏].
[Figure: the region under the graph of 𝑦 = 𝑓 (π‘₯) between π‘₯ = π‘Ž and π‘₯ = 𝑏.]
So that we can actually get somewhere, let’s assume that 𝑓 : [π‘Ž, 𝑏] → R is a reasonably
nice function, with 𝑓 (π‘₯) ≥ 0 for all π‘₯ ∈ [π‘Ž, 𝑏]. At this stage, I’m not going to explain
what I mean by "reasonably nice": we’ll decide what we actually mean by this later.
The reason why we want 𝑓 (π‘₯) ≥ 0 is so that we can talk about the region bounded by
𝑦 = 𝑓 (π‘₯) and the lines 𝑦 = 0, π‘₯ = π‘Ž and π‘₯ = 𝑏. We’ll drop this requirement quietly as
soon as we no longer need it.
The first step is to divide the interval [π‘Ž, 𝑏] into subintervals [π‘₯0 , π‘₯ 1 ], [π‘₯ 1 , π‘₯ 2 ], [π‘₯2 , π‘₯ 3 ],
. . . , [π‘₯ 𝑛−1 , π‘₯ 𝑛 ]. We can do this by choosing π‘₯ 0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 so that
π‘Ž = π‘₯0 < π‘₯ 1 < π‘₯2 < π‘₯3 < · · · < π‘₯ 𝑛 = 𝑏.
[Figure: the interval [π‘Ž, 𝑏] drawn on the number line, with the partition points π‘Ž = π‘₯ 0 , π‘₯ 1 , . . . , π‘₯ π‘˜−1 , π‘₯ π‘˜ , . . . , π‘₯ 𝑛−1 , π‘₯ 𝑛 = 𝑏 marked on it.]
We call
𝑃 = {π‘₯0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }
a partition of the interval [π‘Ž, 𝑏]. Select a "sample point" π‘₯ ∗π‘˜ from each subinterval
[π‘₯ π‘˜−1 , π‘₯ π‘˜ ] and form the Riemann sum
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 )   (6)
As you know, we can interpret this Riemann sum in terms of areas of rectangles:
[Figure: the graph of 𝑦 = 𝑓 (π‘₯) with the rectangles of a Riemann sum: over each subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ] stands a rectangle of height 𝑓 (π‘₯ ∗π‘˜ ).]
This suggests that the Riemann sum (6) is an approximation to the "area under the
curve".
There are many ways in which to choose a sample point π‘₯ ∗π‘˜ from each subinterval
[π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. In theory, we could choose these sample points at random, but it is natural
to make these choices systematically. For example, we could choose each π‘₯ ∗π‘˜ to be the
right-hand endpoint π‘₯ π‘˜ of the subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. This means that π‘₯ ∗π‘˜ = π‘₯ π‘˜ for all π‘˜ ;
the corresponding Riemann sum is the "right-hand sum"
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) .
Another possibility is to choose each π‘₯ ∗π‘˜ to be the left-hand endpoint π‘₯ π‘˜−1 of the subinterval [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. This gives us the "left-hand sum"
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ π‘˜−1 ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) .
You should be familiar with this from first year. Stewart discusses left-hand and right-hand sums in detail and goes on to talk about midpoint sums, etc.
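These sums are easy to compute. Here is a sketch in Python (the helper names are mine); it evaluates the left-hand and right-hand sums of 𝑓 (π‘₯) = π‘₯ on [0, 1], whose exact "area under the curve" is 1/2:

```python
def riemann_sum(f, partition, samples):
    """The Riemann sum (6): the total of f(x_k*) * (x_k - x_{k-1})."""
    return sum(f(s) * (x1 - x0)
               for x0, x1, s in zip(partition, partition[1:], samples))

def left_sum(f, partition):
    return riemann_sum(f, partition, partition[:-1])   # x_k* = x_{k-1}

def right_sum(f, partition):
    return riemann_sum(f, partition, partition[1:])    # x_k* = x_k

# f(x) = x on [0, 1] with a regular partition into 100 subintervals.
P = [k / 100 for k in range(101)]
print(left_sum(lambda x: x, P), right_sum(lambda x: x, P))
```

The two sums straddle the exact value 1/2, one from below and one from above.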
When we started this story, I asked you to assume that the function 𝑓 : [π‘Ž, 𝑏] → R is
"reasonably nice", without explaining exactly what I meant by this. I will need to add
some more conditions later, but for now, let’s now agree that part of being "reasonably
nice" means that 𝑓 (π‘₯) is bounded on the interval [π‘Ž, 𝑏]. This means that we can find
numbers π‘š, 𝑀 such that
π‘š ≤ 𝑓 (π‘₯) ≤ 𝑀
for all π‘₯ ∈ [π‘Ž, 𝑏]. We call these numbers π‘š, 𝑀 bounds on 𝑓 (π‘₯). If 𝑓 is bounded on [π‘Ž, 𝑏],
then it is also bounded on each of the subintervals [π‘₯ π‘˜−1 , π‘₯ π‘˜ ], so that for each π‘˜, we can
find numbers π‘š π‘˜ , 𝑀 π‘˜ such that
π‘š π‘˜ ≤ 𝑓 (π‘₯) ≤ 𝑀 π‘˜
(7)
for all π‘₯ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ]. I’m going to assume that for each π‘˜, π‘š π‘˜ and 𝑀 π‘˜ are respectively
the biggest and the smallest numbers with these properties, so that if π‘š ′π‘˜ , 𝑀 ′π‘˜ are any
numbers satisfying
π‘š ′π‘˜ ≤ 𝑓 (π‘₯) ≤ 𝑀 ′π‘˜
for all π‘₯ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ], then π‘š ′π‘˜ ≤ π‘š π‘˜ and 𝑀 π‘˜ ≤ 𝑀 ′π‘˜ . For a given function 𝑓 and a given
partition 𝑃, these numbers π‘š π‘˜ and 𝑀 π‘˜ are unique.
Definition 0.2. The lower sum of 𝑓 associated with the partition 𝑃 = {π‘₯0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }
is
𝐿( 𝑓 , 𝑃) = βˆ‘_{π‘˜=1}^{𝑛} π‘š π‘˜ (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ,
while the upper sum is
π‘ˆ( 𝑓 , 𝑃) =
𝑛
Õ
𝑀 π‘˜ (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ,
π‘˜=1
where π‘š π‘˜ and 𝑀 π‘˜ are as defined above.
Suppose now we have chosen a sample point π‘₯ ∗π‘˜ from each subinterval. From (7), it
follows that
π‘š π‘˜ ≤ 𝑓 (π‘₯ ∗π‘˜ ) ≤ 𝑀 π‘˜
for each π‘˜. This means that if
βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) is any Riemann sum of 𝑓 associated
with the partition 𝑃 = {π‘₯ 0 , π‘₯ 1 , π‘₯ 2 , π‘₯ 3 , . . . , π‘₯ 𝑛 }, then
𝐿( 𝑓 , 𝑃) ≤ βˆ‘_{π‘˜=1}^{𝑛} 𝑓 (π‘₯ ∗π‘˜ ) (π‘₯ π‘˜ − π‘₯ π‘˜−1 ) ≤ π‘ˆ( 𝑓 , 𝑃).   (8)
If for each π‘˜, we can find points 𝑒 π‘˜ , 𝑣 π‘˜ ∈ [π‘₯ π‘˜−1 , π‘₯ π‘˜ ] with 𝑓 (𝑒 π‘˜ ) = π‘š π‘˜ and 𝑓 (𝑣 π‘˜ ) = 𝑀 π‘˜ ,
then 𝐿( 𝑓 , 𝑃) is the Riemann sum obtained by choosing π‘₯ ∗π‘˜ = 𝑒 π‘˜ for each π‘˜, while π‘ˆ( 𝑓 , 𝑃)
is the Riemann sum obtained by choosing π‘₯ ∗π‘˜ = 𝑣 π‘˜ for each π‘˜. In this case, 𝐿( 𝑓 , 𝑃) is the
smallest Riemann sum of the function 𝑓 associated with the partition 𝑃, while π‘ˆ( 𝑓 , 𝑃)
is the largest. However, in what we are going to do, it’s not necessary to assume that
the points 𝑒 π‘˜ and 𝑣 π‘˜ exist, since we just need the numbers π‘š π‘˜ and 𝑀 π‘˜ .
You might have noticed that I haven’t assumed that the partition 𝑃 is regular, in the
sense that all the subintervals generated by 𝑃 are of the same length. This means
that there need not be any particular relationship between the lengths of the different
subintervals or between the lengths of the subintervals and the number of points in
𝑃. Thus there is no point in simply taking limits as 𝑛 → ∞; we have to be a bit more
clever.
Let’s think about what happens when we use different partitions of the interval [π‘Ž, 𝑏]. Obviously, the number of subintervals might change, and so might their endpoints. Since π‘š π‘˜
and 𝑀 π‘˜ depend on these subintervals, the lower and upper sums will also vary.
Definition 0.3. Let 𝑃 and 𝑄 be two partitions of the same interval [π‘Ž, 𝑏]. We say that
𝑄 is a refinement of 𝑃 if
𝑃 ⊆ 𝑄.
(The symbol "⊆" means "is a subset of".) Sometimes, we prefer to say that the partition
𝑄 is finer than the partition 𝑃 or that 𝑃 is coarser than 𝑄.
Thus 𝑄 is a refinement of 𝑃 if every endpoint of a subinterval in 𝑃 is also an endpoint
of an interval in 𝑄. Each of the subintervals in 𝑃 has either been left unchanged, or
has been cut up into shorter intervals in 𝑄.
[Figure: a partition 𝑃 of an interval, and below it a refinement 𝑄, which contains every point of 𝑃 together with some extra points.]
Notice that any interval [π‘Ž, 𝑏] has a coarsest partition, the partition {π‘Ž, 𝑏}. In a sense,
this isn’t actually a "partition", since it involves only one subinterval, the interval [π‘Ž, 𝑏]
itself. For any partition 𝑃 of [π‘Ž, 𝑏], we have that {π‘Ž, 𝑏} ⊆ 𝑃. Notice that
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , {π‘Ž, 𝑏}) ≤ π‘ˆ( 𝑓 , {π‘Ž, 𝑏}) ≤ 𝑀 (𝑏 − π‘Ž) ,
(9)
where π‘š, 𝑀 are bounds on 𝑓 (π‘₯).
If you draw some pictures, you will see that if 𝑄 is a refinement of the partition 𝑃, then
𝐿( 𝑓 , 𝑃) ≤ 𝐿( 𝑓 , 𝑄) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ π‘ˆ( 𝑓 , 𝑃).   (10)
In other words: if we replace a partition of the interval [π‘Ž, 𝑏] by one of its refinements,
the lower sum can only get bigger and the upper sum can only get smaller. Inequality (8) tells us that any
Riemann sum associated with 𝑄 is "squeezed" between 𝐿( 𝑓 , 𝑄) and π‘ˆ( 𝑓 , 𝑄).
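Here is a small Python check of inequality (10), again using the endpoint shortcut for an increasing function (for such an 𝑓, the numbers π‘š π‘˜ and 𝑀 π‘˜ are attained at the endpoints of each subinterval):

```python
def lower_sum(f, partition):
    # For an increasing f, the smallest value on each subinterval
    # is taken at the left-hand endpoint.
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def upper_sum(f, partition):
    # For an increasing f, the largest value on each subinterval
    # is taken at the right-hand endpoint.
    return sum(f(x1) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def f(x):
    return x * x               # increasing on [0, 2]

P = [0.0, 1.0, 2.0]
Q = sorted(set(P) | {0.5, 1.5})   # a refinement of P, since P is a subset of Q

# Refining the partition raises the lower sum and lowers the upper sum.
assert lower_sum(f, P) <= lower_sum(f, Q) <= upper_sum(f, Q) <= upper_sum(f, P)
print(lower_sum(f, P), lower_sum(f, Q), upper_sum(f, Q), upper_sum(f, P))
```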
Now consider two partitions 𝑃, 𝑄 of [π‘Ž, 𝑏] that are not necessarily refinements of one
another. Notice that 𝑃 ∪ 𝑄 is also a partition of [π‘Ž, 𝑏] and it is finer than both 𝑃 and 𝑄.
Together, (9) and (10) tell us that
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ 𝐿( 𝑓 , 𝑃 ∪ 𝑄) ≤ π‘ˆ( 𝑓 , 𝑃 ∪ 𝑄) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
In particular, this means for any pair of partitions 𝑃, 𝑄,
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
In summary: if we look at finer and finer partitions of the interval [π‘Ž, 𝑏], the associated
lower sums will get bigger and bigger, but they will always be smaller than any one of
the upper sums. In the same way, the upper sums will get smaller and smaller, but they
will always be bigger than any one of the lower sums. Let S̲( 𝑓 ) be the supremum (the
least upper bound) of all the lower sums of 𝑓 on the interval [π‘Ž, 𝑏] and let S̄( 𝑓 ) be the
infimum (the greatest lower bound) of all the upper sums; these always exist, even when
no single lower sum is the largest and no single upper sum is the smallest. You can think
of S̲( 𝑓 ) as being the "limit" of the lower sums and S̄( 𝑓 ) as being the "limit" of the upper
sums, in the sense that 𝐿( 𝑓 , 𝑃) → S̲( 𝑓 ) and π‘ˆ( 𝑓 , 𝑃) → S̄( 𝑓 ) as
we consider finer and finer partitions of [π‘Ž, 𝑏]. For any pair of partitions 𝑃, 𝑄,
π‘š (𝑏 − π‘Ž) ≤ 𝐿( 𝑓 , 𝑃) ≤ S̲( 𝑓 ) ≤ S̄( 𝑓 ) ≤ π‘ˆ( 𝑓 , 𝑄) ≤ 𝑀 (𝑏 − π‘Ž) .
After all this, we can say what we mean by a function 𝑓 being "reasonably nice" and
define the integral of this function on the interval [π‘Ž, 𝑏] :
Definition 0.4. We say that a function 𝑓 is Riemann integrable on [π‘Ž, 𝑏] if it is bounded
on [π‘Ž, 𝑏] and S̲( 𝑓 ) = S̄( 𝑓 ). If this is the case, then we define the Riemann integral of 𝑓 on
[π‘Ž, 𝑏] to be
∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ = S̲( 𝑓 ) = S̄( 𝑓 ).
It is useful to have an example of a bounded function which is not Riemann integrable:
Example 0.5. Dirichlet’s function on [0, 1] is defined by
π‘Ÿ(π‘₯) = 1 if π‘₯ is rational, and π‘Ÿ(π‘₯) = 0 if π‘₯ is irrational.
This is a "highly discontinuous" function: any interval of the real line contains both
rational and irrational numbers, and thus points where π‘Ÿ(π‘₯) = 1 and points where
π‘Ÿ(π‘₯) = 0. It follows that 𝐿(π‘Ÿ, 𝑃) = 0 and π‘ˆ(π‘Ÿ, 𝑃) = 1 for any partition 𝑃 of [0, 1]. (Can
you see why?) Thus S̲(π‘Ÿ) = 0 < 1 = S̄(π‘Ÿ), and so π‘Ÿ is not Riemann integrable on [0, 1].
One of the basic rules of integration says something like "If 𝑓 and 𝑔 are both Riemann
integrable on [π‘Ž, 𝑏], then so is 𝑓 + 𝑔, with
∫_π‘Ž^𝑏 ( 𝑓 (π‘₯) + 𝑔(π‘₯)) dπ‘₯ = ∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ + ∫_π‘Ž^𝑏 𝑔(π‘₯) dπ‘₯".
You can prove this result by showing that 𝐿( 𝑓 , 𝑃) + 𝐿(𝑔, 𝑃) ≤ 𝐿( 𝑓 + 𝑔, 𝑃) and
π‘ˆ( 𝑓 + 𝑔, 𝑃) ≤ π‘ˆ( 𝑓 , 𝑃) + π‘ˆ(𝑔, 𝑃) for any partition 𝑃. (Be careful: these are inequalities
rather than equalities, since the smallest value of 𝑓 + 𝑔 on a subinterval can be bigger
than the sum of the smallest values of 𝑓 and 𝑔 separately.) From this it follows that
S̲( 𝑓 ) + S̲(𝑔) ≤ S̲( 𝑓 + 𝑔) ≤ S̄( 𝑓 + 𝑔) ≤ S̄( 𝑓 ) + S̄(𝑔); since S̲( 𝑓 ) = S̄( 𝑓 ) and S̲(𝑔) = S̄(𝑔),
all of these quantities must be equal, which is sufficient to prove the result. Most of
the basic rules of integration can be proved in the same way.
If we know that a function 𝑓 is Riemann integrable on [π‘Ž, 𝑏], then we can find the
value of ∫_π‘Ž^𝑏 𝑓 (π‘₯) dπ‘₯ by calculating either S̲( 𝑓 ) or S̄( 𝑓 ). If necessary, we can find these
by considering a suitable sequence of partitions, for example, the regular partitions of
[π‘Ž, 𝑏].
It can be shown that if 𝑓 is continuous on [π‘Ž, 𝑏], then 𝑓 is Riemann integrable on [π‘Ž, 𝑏] ;
I’m not going to prove this. Neither am I going to give a proof of the Fundamental
Theorem of Calculus, even though this follows very easily from the ideas which we
have just been discussing. You will see that the Fundamental Theorem of Calculus
plays a very important rôle in the later part of this course.