Uploaded by 0yangyc0

Elementary+Classical+Analysis+2nd+Edition

advertisement
Introduction
Sets and Functions
The student who wishes to use this book successfully should have a sound
background in elementary calculus and linear algebra and some exposure to
multivruiable calculus. Adequate preparation is normally obtained from two
years of undergraduate mathematics. Also required is a basic knowledge of
sets and functions, for which the necessary concepts are summarized in this
introduction. This material should be read briefly and then consulted as needed.
Set theory is the starting point for much of mathematics and is itself a vast
and complicated subject. For brevity and better understanding, we begin our
study somewhat intuitively. The reader who is interested in the subtleties of set
theory can consult the supplement at the end of this introduction.
A set is a collection of "objects" or "things" called members of the set. For
example, the collection of positive integers l, 2, 3, ... fcmnsa set. Likewise, the
~ l numbers (fr~s) p/q fonn a set. If Sis a set, and xis a member of S,
we write x E S. A subset of the set S is a set A such that every element of A is
also a member of S; symbolically, this relationship is denoted (x E A) • (x E S),
where the symbol • denotes "implies." When A is a subset of S, we write A c S.
Sometimes the notation A C S is used for what we denote as A c S. We can
also define equality of setsbystating that A= B means A c Band BC A; that
is, A and B have the same elements. The empty set, denoted 0, is the set with
no members. For example, the set of integers n such that n2 = - I is empty.
One method of specifying a set is to list its members in braces. Thus
we write N = {I, 2, 3, ... } to denote the set of all positive integers and Z =
{... ,-3,-2,-1,0,1,2,3, ... } for the set of all integers. An example of a
subset of N is the set of even numbers; it is written
A = { 2, 4, 6, ... } = { x E N I x is even } c N.
We read {x E N I x is even} as "the set of all members x of N such that x is even."
Here is an important notational distinction. If S is a set and a E S, then {a}
2
Introduction: Sets and Functions
denotes the subset of S consisting of the single element a. Thus {a} c S, while
a ES.
Let S be a given set and let ACS and BC S. Define A UB = {x ES Ix EA
or x E B}, which is read "the set of all x E S that are members of A or B
-(or both)." The set AU B is called the union of A and B. Similarly, one can
form the union of a family of sets. For example, let A 1,A2 , ••• be subsets of
S and let Ui:'1A; = {x E S I x E A; for some i}; this is sometimes written
U{A1,A2,A3, ... }. Note thatAUB is the special case withA 1 =A,A2 =B, and
A; = 0 for i > 2. Similarly, we form the intersections A n B = { x E S I x E A
and x E B} and n!='1A; = {x E S I x EA; for all i}. Figure I-1 illustrates these
operations.
s
S
(a)
FIGURE I-1
ArlB
(c)
(b)
(a) Subset; (b) union; (c) intersection
For A C S and B c S, we form the complement of A relative to B by defining
B\A = {x EB Ix
\t A},
where x (/. A means xis not contained in A. See Figure I-2.
FIGURE I-2
Complement
Introduction: Sets and Functions
3
As in Worked Example I.I at the end of this introduction, we see that B\ (A, U
(B\A1)n(B\A2) and that B\(A, nA2) (B\A1)U(B\A2) for any sets A1,A2,
B c S. This is an example of a "set identity." Other examples are given in the
exercises.
Given sets A and B, define the Cartesian product A x B of A and B to be the
set of all ordered pairs (a, b) with a EA and b E B; i.e., Ax B = { (a, b) I a EA
and b E B}. See Figure 1-3.
A2)
=
=
y
B
AXB
b
(a, b)
A
a
FIGURE 1-3
Cartesian product
Let S and T be given sets. A function f : S -+ T consists of two sets S
and T together with a "rule" that assigns to each x E S a specific element of
T, denoted /(x). One often writes x t--t f (x) to denote that x is mapped to the
element f(x). For example, the function /(x) = x2 may be specified by saying
x 1-+ x2. Figure 1-4 depicts this function with S = T the set of all real numbers;
this set is denoted R and will be introduced carefully in Chapter I. For now,
we use it informally.
f
X
FIGURE 1-4
The function x t--t x2
Introduction: Sets and Functions
4
Note. In this book, the terms "mapping," "map," "function," and "transformation" are all synonymous.
For a function f : S ---+ T, the set S is called the domain or source off and T
is called the target off. The range, or image, off is the subset of T defined by
f(S) = {f(x) ET Ix ES}. The graph off is the set {(x,f(x)) ES x TI x ES},
as in Figure 1-5.
(x.f(x))
Graph off
f(x)
B
I
A
FIGURE 1-5
AXB
X
Graph of a function
Someone paying careful attention to logical foundations may object to using
colloquial language, such as "rule," and would be happier to define a function
from .S to T as a subset R of S x T with these two properties:
1.
Each member of S occurs as the first component of some member of R, and
2.
1\vo members of R with the same first component are identical; that is, the
first component x determines the second componentf(x), as in Figure 1-5.
A function f : S ---+ T is called one-to-one or an injection if whenever
x 1 -f. x2, then /(xi) -f. /(x2). Thus a function is one-to-one when no two distinct
elements of S are mapped to the same element of T. Equivalently, f is one-toone when for each y E T, the equation f(x) = y has at most one solution x E S.
An extreme example of a function which is not one-to-one (if S has more than
one element) is a constant function, a function f : S ---+ T such that /(x1) = f(xz)
for all x 1,x2 ES. See Figure 1-6.
Note. In definitions it is a convention that "if' stands for "if and only if."
The latter is often written "iff," or {:}, In theorems it is absolutely necessary
to distinguish between "if," "only if," and "iff."
We say that/ : S---+ Tis onto, or is a swjection, when for every y E T, there
is an x ES such that/(x) = y, in other words, when the range equals the target.
5
Introduction: Sets and Functions
FIGURE I-6
Constant function
Another way of saying this is that for each y E T, the equation f(x) = y has at
least one solution x E S. It should be noted that the choice of S and T is part of
the definition off, and whether or not f is one-to-one or onto depends on that
choice. For example, let f be defined by f(x) = x2 . Then/ is one-to-one and
onto when S and T consist of all real numbers x such that x ~ 0, is one-to-one
but not onto when S is all those x such that x ~ 0 and T is all real numbers,
and is neither when S and T consist of all real numbers.
For f : S - t T and A C S, we define /(A)= {/(x) E T I x EA}, and for
B c T we define/- 1(B) to be the set {x ES lf(x) EB}. We callf(A) the image
of A under/ and 1- 1(B) the inverse image, or preimage, of B under f.
Note. We can form
one-to-one or onto.
1- 1(B)
for a set B C T even though f might not be
If f : S - t T is one-to-one and onto, then for each y E T there is a unique
solution x E S to f(x) = y. Thus there is a unique function, denoted 1- 1 : T .-> S
(not to be confused with the operation/- 1(B) defined in the previous paragraph
or 1/f), such that/(/- 1(y)) = y for ally ET and/- 1(/(x)) = x for all x ES.
We call 1- 1 the inverse function off. A one-to-one and onto map is also called
a bijection, or a one-to-one correspondence.
Note.
In calculus we learn how important the choice of domain (source) is
in forming the inverse function. For instance, to form sin - I arcsin, we cut
the domain and regard sin as a map sin: [-7r/2,7r/2]-> [-1, 1] on which it
is a bijection. Then sin- 1 : [-1,1]-> [-71"/2,rr/2] is defined. Consult your
calculus text for more examples.
=
The map/ : S -> S defined by /(x) = x for all x E S is called the identity
mapping on S. One should distinguish the identity mappings for different sets.
For example, one sometimes uses the notation ls for the identity mapping on S.
Clearly, ls is one-to-one and onto.
6
Introduction: Sets and Functions
For two functions/ : S -+ T and g : T -+ U, the composition g of : S -+ U
is defined by (g o f)(x) = g(f(x)), as shown in Figure l-7. For example, if
f : IR. -+ R is defined by f(x)
x2 and g : R -+ R by g(x) x + 3, then
go f: x 1-+ x2 + 3 and/ o g : x 1-+ (x + 3)2 (here S, T, and U consist of all real
numbers). In particular, note that fog '!- g of.
=
=
gof
FIGURE 1-7
Composition of mappings
Note. In calculus, we learn that comP.ositions are important for, among other
things, the chain rule. The same is true in this book.
Sometimes we wish to restrict our attention to just some elements on which a
function is defined. This process is called restricting a function. More fonnally,
if we have a mapping/ : S-+ T and A C S, we consider a new function denoted
f I A : A -+ T defined by (/ I A)(x) = f(x) for all x E A. We call/ I A the
restriction off to A and/ an extension off I A.
A set A is called finite if we can display all of its elements as follows:
A = {a,, a2, ... , an} for some integer n. A set that is not finite is called infinite.
For example, the set of all positive integers N = { l, 2, ... } is an infinite set.
It can be difficult to decide if one infinite set has more elements than another
infinite set. For instance, it is not clear at first if there are more rational or
irrational numbers. To make this notion precise, we say that two sets A and
B have the same number of elements (or have the same cardinality) if there
exists a mapping/ : A -+ B that is one-to-one and onto. If an infinite set has
the same number of elements as the set of integers { l, 2, ... } , then it is called
denumerable. A set that is either finite or denumerable is said to be countable;
otherwise, it is called uncountably infinite, or just uncountable. An example of
an uncountable set is the set of all real numbers between O and I. (We shall
prove this in Chapter 1.)
Let S'be a set. A seque,ice in Sis a mapping/: N -+ S. Thus we associate
to each integer n an element of S, namely /(n). One often suppresses the fact
Supplement on the Axioms of Set Theory
7
that we have a function by simply considering a sequence as the image elements,
say x 1,x2,x3, .. ., or alternatively we write "the sequence x,," or (x,,)~ 1• We call
Y1,Y2, .•• a subsequence of x1,x2, ... if there is a function g : N - N such
that for every i E N, y; = Xg(i), and if for i < j, g(i) < g(j). In other words,
a subsequence is obtained by "throwing out" elements of the original sequence
and naturally ordering the elements that remain. For example, the sequence
y,. = (2n) 2 is a subsequence of x,. = n2 • Here g(n) = 2n. Sometimes one writes a
subsequence of x,, as x,.1 , where the notation g(i) = n; reminds us that the n; are
chosen from among the n's.
An important method for proving statements indexed by the positive integers
N is the technique of induction. A property P(i) is true for all i EN if:
1.
P(l) is true (base case), and
2.
For every n EN, if P(n) is true then P(n + 1) is true (induction step).
The same technique also applies to {O, 1, ... } with the base case replaced by:
11•
P(O) is true.
We shall have more to say about the basis for the natural numbers in §1.1.
Supplement on the Axioms of Set Theory 1
There is no rigorous mathematics today that does not use concepts from set
theory. For this reason we started with set theory in this text. The purpose of
this supplement is to help bridge the gap between the approach in this text and
that in more formal set theory courses using a book like Halmos's Naive Set
Theory. 2 Any introduction to !;et theory has to take into account the following
points:
1.
The concept of a set is so basic that it is impossible to define it in terms
of more basic notions.
2.
For this reason, we specify the concept of a set with axioms, . but the
axiomatic method may not be familiar to the student.
3.
Axiomatic set theory involves logic, but some concepts of logic may not
be familiar either.
In view of these circumstances, the most effective approach, and the one
used in this text, is to start working with the intuitive concept of a set and come
1This
supplement was written with the help of Istvan Fary.
Paul R., 1960. Naive Set Theory. New York: D. Van Nostran<l Co.
2 Halmos,
8
Introduction: Sets a,ul Functions
back to foundations later. When this method is used, the question arises whether
to take up logic first or to treat axiomatic set theory without fonnal logic. We
choose the second approach.
This plan corresponds to the historical development: Set theory based on
intuitive concepts came first, then criticism of this approach inspired the axiomatic foundations, and finally an intensive discussion of this method heralded
new developments in logic. It may be useful, therefore, to say something about
the history of our subject.
On the History of Set Theory
Set theory is one of the most basic areas of mathematics. It includes facts about
finite sets, but the importance of the theory is that it can deal with infinite sets
and can be developed systematically. In this sense, the founder of set theory
was Georg Cantor (1845-1918). He published his important papers just before
the start of the twentieth century. There was a heated debate over his work, and
famous mathematicians disagreed about fundamental questions.
Cantor was led to discover properties of infinite sets in connection with his
work on trigonometric series (see Chapter 10). A trigonometric series is a sum
of the form
(X)
L (ak cos kx + bk sin kx).
4-=0
Convergence properties of these series are delicate questions, and defining sets
of points according to the behavior of the series leads to very general types of
sets of numbers. For this reason, Cantor dealt with sets of real numbers first but
soon discovered that he had to deal with infinite sets in general.
In one of his papers, he gave the following "definition" of the concept of
set:
We understand by "set" any gathering M of well-defined,
distinguishable objects m (which will be called "elements"
of M) of our intuition or our ideas into a whole. 3
(1)
It is customary today to be "ashamed" of the original definition of Cantor and
to say that it is not a definition. Yet there are many so-called definitions in other
fields that do not come close to the clarity and precision of (1). Nevertheless,
since the concept of a set is so important, we will not ultimately accept Cantor's
3 Toe original German rexl (Collected Papers, p. 282) is: "Unrer elner 'Menge' verslehen wir
Jede Zusammenfassung M von 1>es1lmm1en wohlunrerschiedenen Objekren n, unserer Anschauung
oder unseres Denkens (welche die 'Elemente' von M gennanr werden) zu einem Ganzen."
Supplement o,z the Axioms of Set Theory
9
definition. However, for the moment we will use (1) to clarify our ideas about
sets.
The first point is that we "gather together" objects, and we disregard the
order in which they are taken. For example, if we talk about "the set of natural
numbers" we do not imply that the elements of this set are given in some "order,"
even though there is a "natural order" for integers. For practical purposes we
may list the elements in some order, but this has nothing to do with the set itself.
The words "well-defined, distinguishable objects" in (1) point out another
aspect of the concept of a set: The elements of the set do not "appear twice";
thus, for example, a set consisting of 2, 2, 2, 3 contains 2 and 3 and nothing
else. Hence, a set "contains" some objects which "belong" to the set; some
other objects may not belong to the set. For example, 1003 belongs to the set
of natural numbers (positive integers), while 3. 14159 does not belong to it.
Finally, the "whole" at the end of (1) refers to the fact that sets themselves
are treated as objects; for example, they may be elements of other sets. Thus we
may consider sets whose elements are sets. In fact, these are among the most
important sets in set theory.
Let us now criticize Cantor's definition. Consider the following definition:
An integer p is a prime number if p t, 1 and ± 1, ±p are the only divisors of
p. In this definition the concept of "prime number" is defined in terms of other
concepts (integers, divisor, +l, -1, -p), and we suppose that the latter concepts
are known or were defined without the use of the concept of prime number. The
definition thus reduces the concept of prime number to those other concepts. This
definition also tells us what to do in order to test whether or not 1003 is a prime
number (it is not; it is divisible by 17). Let us see whether (1) can stand up to
such criteria. We have in this sentence a number of other concepts: "gathering,"
"well-defined," "distinguishable," "whole" (not to mention our "intuition," our
"ideas"). It is only fair to ask which concept is simpler: "set" or "gathering." (In
fact, the German word "Zusammenfassung" sounds better, but it does not escape
the criticism.) Similarly, we can question every one of the other concepts and
wonder whether it is simpler than the concept of a set. Cantor's definition was
also criticized on the grounds that it does not exclude contradictory sets, as we
will see in the following section. Motivated by this criticism, Cantor discovered
a germ of the axiomatic description in a later paper.
Logic
We treat logic in an uncritical and unsophisticated way. It is probably fair to
say that the basis of our rational thinking is the following belief: If we start
with true premises and make correct deductions from them, then we reach a
true conclusion. We could refuse to accept this, but we would not get far in
Introduction: Sets and Functions
10
mathematics. If we take this belief seriously (as we do in mathematics), then
rather sophisticated results can be reached. For example, suppose that 2, 3, 5,
7, 11, 13, and 17 are the only prime numbers 2::: 2. Then form the number
n = 2 • 3 • 5 • 7 • 11 • 13 • 17 + 1. Note that n is not divisible by a prime S 17.
(If n factored as such a prime times an integer, then we would obtain 1 as a
nontrivial product of integers, which is impossible.) Thus, n is a prime or it
has a divisor that is a prime ~ 18. Since the conclusion plainly contradicts the
premise, both cannot be true, and since our reasoning was correct, the premise
must be false-there is a prime number ~ 18. This is not surprising, as 19
happens to be such a prime number, but we reached the conclusion by reasoning
and not by experience. More important, the same argument shows that there is a
prime number larger than any gi~en prime; i.e., there are infinitely many primes.
This reasoning, sometimes called reductio ad absurdum, is used frequently.
There are English sentences in which we can erase a word, write x in its
place, and still get a meaningful sentence. For example, in the sentence "two is
smaller than five," erasing "two" and writing "x" gives the sentence "x is smaller
than five." Such a combination of words is called a propositional function or
condition and could be denoted S(x). Writing "seven" in place of x we get a
false sentence, and writing "three" in place of x we get a true sentence, so that
S(x) is meaningful if x is an integer and is true for some integers and false for
other integers. Given an arbitrary condition S(x), we may
Take all objects whose name, substituted in the place of
x in S(x), gives a true sentence. ··
(2)
It is understood that x may occur several times, and substitution must be done
consistently (thus x is just a "place holder"). On the basis of definition (1), we
thus obtain a set. There is a standard notation for this set:
{x I S(x)}.
(3)
In spite of the fact that (2) is consistent with ( 1) and with the usual concept of
set, we run into contradictions if we use (2) indiscriminately. Take the following
example:
The set that does not contain itself as an element.
(4)
This phrase seems to be all right; after all, who ever saw a set that contained
itself as an element? Erase "the set that," and write "x" to obtain the equivalent
sentence:
S(x) = x does not contain itself as an element.
(5)
Then take the corresponding set described_ in (2) and call it M as Cantor does
(M for "Menge"). Let us ask the question: Does M contain M? If it does not,
Supplement on the Axioms of Set Theory
11
then it should, by the sentence that defines it. If it does, then it should not, by
virtue of the same sentence.
This property of construction (2), first noticed by Bertrand Russell, is shocking and discouraging. When we were inspecting Cantor's definition, we suggested that it was not really that bad and actually helped clarify ideas. Now we
find that, at the same time, it allows forming the impossible set M.
The example of the set M may suggest that there is something inherently
wrong with the concept of a set, or at least with the concept of "big" sets. In fact
M is as big as they come-it contains every single "decent" set. However, the
kind of contradiction we have in connection with (5) is well known in classical
logic. Let us first mention an example that can be formulated in terms of "small"
sets. Let N be the set of men living in a small village. Suppose that the barber of
the village declares, "I will shave x E N if and only if x does not shave himself."
It seems that this sentence defines a subset P C N. However, the question of
whether the barber belongs to P leads to the following "barber's paradox": "I
will shave myself, if I do not shave myself."
Note. In more modern terms, consider the "Russell light" on your car's
dashboard. This is the light that comes on when any of the dashboard lights
burn out. What happens if the Russell light burns out?
This dilemma was extensively discussed by Greek logicians, who did not
use the concept of a set. Hence, the contradiction may be independent of the
notion of a set. This seems to be confirmed by the following paradox.
Suppose that during one of my lectures a student in the class says,
The last sentence on the blackboard is false.
(6)
This can happen, unfortunately. If it does, I normally do the following: I
reread the sentence. If I find that the student is right, then I apologize, erase
the sentence, and write down a corrected sentence. If I find that the student
was mistaken, then I say so aloud and leave the sentence on the blackboard.
To make this concrete, suppose now that I am lecturing on set theory and so
far have written items (1) through (6); these items, and nothing else, as on the
blackboard, in order. If a student now says, "The last sentence on the blackboard
is false," I am at a loss as to what to do. If the student is right, then (6) is false,
which means that it is true; hence the student was wrong, but in this case the
sentence is right, which means that it is false.
It would be interesting to pursue these questions of logic, but our aim has
been simply to indicate why it is advisable to restrict the form of sentences when
defining subsets of a set in our axiomatic set theory.
Introduction: Sets and Functions
12
The Language of the Axioms
In a complete, advanced presentation of the axioms of set theory, formalized
logic must be used. Thus at least part of the language of the theory is formalized.
We now describe this pru.1 of the language, without carrying out a formalization.
There will be two basic types of sentences, namely, assertions of belonging,
xeA,
(7)
A=B.
(7')
and assertions of equality
All other sentences ai·e to be obtained from such atomic sentences by repeated
application of the usual logical operators, subject to the rules of grammar and
unambiguity.
To make the definition explicit, it is necessary to append to it a list of the
"usual" logical operators and the rules of syntax. Our list of logical operators
will be
J
l
{
not
or (in the nonexclusive sense)
and
if-then- (meaning implies)
if and only if (abbreviated iff)
(8)
for some- (there exists)
for all-
Notice that "not" operates on a single sentence, the next four operators act on
two sentences (S and T, ... , S iff T), and the last two act on conditions (for
some x, S(x) holds, and so forth). This list is redundant: It is proved in logic
that the first five can be replaced by fewer operators. [For example, we can
delete "and" from the list; instead of the sentence "S and T," where S and T
are sentences, we can say "not (not S or not T)." This is clumsy in colloquial
English but very simple with appropriate logical symbolism. Since we do not
want to use formalized logic, we use the longer list (8).] In our list, the first five
operators are called logical connectives, and the last two are called quantifiers.
In the usual formalism, "for some x" is written 3x and "for all x" is denoted
'vx. The connection between these two quantifiers is as follows. The negation
of "for some x, S(x) holds" is "for all x, not S(x) holds." The negation of "for
all x, S(x) holds" is "there is an x such that not S(x) holds." This is possibly the
main idea to be learned here. Often the connection between the two quantifiers
appears in the following fmm. We want to prove a statement:
For every c:o
> 0,
· • • is true.
(9)
Supplement on the Axioms of Set Theory
13
The negation of this is
There exists co
> 0 such that
is not true.
(10)
If we can now deduce a contradiction from (10), we have a proof of (9). As for
the rules of sentence construction, we agree on the following conventions:
1.
When using "not," put it before a sentence and enclose the whole between
parentheses. (The reason for parentheses is to guarantee unambiguity.)
2.
Put "and," or "or," or "if and only if," where used, between the two
sentences it applies to, and enclose the whole between parentheses.
3.
Replace the dashes-in "if-then-" by sentences and enclose the result in
parentheses.
4.
Replace the dash in "for some-" or "for all-" by a variable, follow
the result by a sentence, and enclose the whole in parentheses. [If the
variable used does not occur in the sentence, no harm is done. According
to the usual convention, "for some y (x E A)" just means "x EA." It is
equally harmless if the variable name used has already been used with
"for some-" or "for all-". "For some x (x E A)" means the same as
'.'for some y (y E A)"; a judicious change of notation will always avert
alphabetic collisions.
The Axioms
Instead of giving a definition of the concept of "set A" and that of "belonging to
a set," denoted a EA, we will give properties of these concepts. Enumerating
properties is the main feature of the axiomatic method. We now state the axioms,
accompanying them with a few remarks.
1. Axiom of Extension Two sets are equal if and only if they have the
same elements.
This axiom means, in particular, that if we want to prove A = B, then we
have to prove that x EA implies x E Band that x E B implies x EA. This point
is so important that it is worthwhile to have a notation for the case when only
half of it-say, the first half-is satisfied. We then write A c B. This will be a
relation between sets; it is not an undefined concept but is defined in terms of
"set" and "belonging." See Worked Example 1.1 at the end of this introduction
for a concrete use of this axiom.
14
Introduction: Sets and Functions
2. Axiom of Specification To every set A and condition S(x) there corresponds a set B whose elements are exactly those elements x of A for which
S(x) is true.
We introduce the notation
{xEAIS(x)}
(11)
to denote the set B. Notice that (11) is the same as our set (3) except that now
we do not form the set of all objects satisfying a certain condition but only those
that are already elements of some set (the set A in (ll)). This allows us, for
example, to form sets of real numbers quite arbitrarily, such as
I= {x E JR I a ::; x ::; b},
(12)
where our sentence S(x) is a ::; x :$ b, provided we know that JR is a set. (This
is not yet implied by axioms I and 2.) The simplest set (11) can be formed with
the atomic sentence (7); then we get A = { x E A I x E A}, hence A is a subset of
A. If our sentence S(x) is not satisfied by any element of A, then (11) describes
the empty set 0. We can always write an impossible condition, for example,
x ¢ A. Then 0 = {x EA Ix¢ A}. In conclusion, if there is any set, then there
is an empty set containing no elements. (Our axioms do not yet say that there
are sets at all; we postulate this later.)
On the basis of axiom 2, we introduce the important set theoretical operation
of intersection. Given sets A and B, we write {x EA Ix E B}; this set is denoted
An B, as you know. The set B n A is {x E B I x E A}, which is clearly the
same set as A n B. The most general operation is the intersection of a collection
of sets C (instead of a set of sets we sometimes say a collection of sets, but,
for us, "collection" shall be synonymous with "set"): Suppose C is a set, and,
if A E C, then A is also a set. We define:
n{A I A E C} = {x E Ao I Ao E C and x EA for all A E C}.
(13)
Hence x is an element of the intersection if it belongs to all sets that belong to
C. A n B corresponds to the case when C contains two elements, one being A
and the other being B. If all elements of C are indexed with integers so that
C = {An}, we write
n
00
An = { X E A I
I X E An
for all n}.
(14)
n=I
Clearly (14) is the set (13) in this special case (in some cases the elements of C
cannot be indexed this way).
Supplement on the Axioms of Set Theory
15
3. Axiom of Pairing For any two sets, there exists a set to which they
both belong.
4. Axiom of Unions For every collection of sets, there exists a set that
contains all the elements that belong to at least one of the sets of the given
collection.
If C is as in (13), we write
LJ{A I A EC}
(15)
to denote the set postulated in axiom 4. The notations A U B and n~1An are
used in special cases, as with unions. If we want to prove that x E UAn, then
we must prove x E An for at least one n; if we want to prove x E nAn, then
we must prove x E An for all n. The logical quantifiers 3x, Vx are thus closely
connected to the set theoretical operations U, n.
We must CflTefully distinguish the pairing and the union: The set {A,B},
postulated in axiom 3, has the two elements A and B if A -/- B and a single
element A if B = A (this is not excluded). For example, given the set 0, we
can form the set {0, 0} = { 0}, which is a nonempty set; it has one element.
Axiom 4 postulates the existence of AU B. This set does not in general contain
A or B as elements; its elements are either elements of A or elements of B. For
example, 0 U 0 = 0 has no element; hence it is different from {0}.
Axioms 3 and 4 also imply the existence of a set {A,B, C} with three
elements. To see this, form {A, B} and { C, C} = {C} and then the pair
{ {A, B}, { C}} = D. Take the union of the elements of D. Similarly, given
n sets A1, ... ,An, we can form the set {A1, ... ,An}.
5. Axiom of Powers For each set there exists a collection of sets that
contains among its elements all the subsets of the given set.
The axiom of powers is a basic tool of set theory. We already know what
countable sets are. We shall show in Worked Example 1.6 at the end of this
introduction that if A is countable, then the power set P(A) is not countable and,
more generally, that there is no bijection from A to P(A). This was discovered
by Cantor; set theory, as we understand it today, was launched by this discovery.
If A is denumerable, then there is a bijection from P(A) to JR, that is, the set
of real numbers. Hence, if we accepted the existence of the integers as a set,
axioms 1 through 5 would imply the existence of the set JR of real numbers,
as will be introduced in §1.2. But the axioms given so far do not postulate the
16
Introduction: Sets and Functions
existence of any set, let alone the existence of infinite sets. For instance, axiom
4 is understood this way: If you have a collection of sets, then you can form
the union. But we never said you have any sets to begin with!
Before formulating the last group of axioms, we examine the question of
existence of sets more closely. If we understand sets in the sense of Cantor's
definition (1 ), all our axioms are satisfied. We can deduce from the axioms that
some sets which can be formed by Cantor's definition are not sets in the sense
of the axioms. Specifically, given a set A we can form B = {x E A I x (/. x}.
Suppose now that B E B. Then B (/. B, and hence it is not possible for B to be an
element of itself. In conclusion, B (/. B, and in particular B (/. A. Summing up,
given any set A, a set B can be constructed that is not an element of A. Hence the
axioms exclude the existence of a set that would contain all sets. On the other
hand, Cantor's definition would admit such a set. Similarly, the contradiction
concerning the set M of (5) shows that M is not a set. The axiomatic system
thus accomplishes our purpose: On the basis of the axioms we can introduce a
part of Cantor's set theory, which is indispensable in mathematics, and at the
same time we can exclude the known contradictions of Cantor's theory.
If we replace the word "set" in axioms I through 5 by the words "finite
set," we have consistent statements. Since we want to introduce the concept of
"set" with these axioms, we must accept any interpretation consistent with them.
Hence, there is a need for an axiom of infinity.
Definition If x is a set, then we define x+ = x U {x} and call it the successor
ofx.
6. Axiom of Infinity There exists a set containing 0 and containing the
successor of each of its elements.
This is the sort of axiom needed to introduce the integers. The next axiom,
which has been a point of controversy in the history of set theory, asserts that
from any collection of sets, we can "pick out" one representative from each set
in the collection. This is stated more precisely in the next axiom.
7. Axiom of Choice If A is a collection of pairwise disjoint nonempty
sets, then there exists a choice set C such that x n C contains a single element
for every set x in A.
We have already used the concept of an ordered pair (a, b) of elements a, b.
An ordered pair contains a first element (coordinate) a and a second element
Worked Examples for the Introduction
17
(coordinate) b. The concept of an ordered pair could be reduced to the concept
of a set by defining (a, b) = { {a}, {a, b}}; we will not give the details' here.
If A and B are given sets, we can form the set of all ordered pairs (a, b ); this
set is denoted A x B. By definition, a map f : A _. B is a subset of A x B such
that:
1.
Given a E A there is a b E B such that (a, b) E /;
2.
If (a,b1)
Ef and (a,b2) E/ then h2 = b1.
Instead of (a,b) E f we write b = f(a). We may proceed to define the
following terms, notations, and concepts in connection with functions: injection,
surjection, bijection, resuiction, extension,f(X) if X CA, f- 1(Y) if Y CB, and
composition of maps. If BC IR, then/ is usually called a real-valued function.
8. Axiom of Substitution If S(a, b) is a senrence such thar for each a in
a set A the set {b I S(a,b)} can be formed, then there exists a function F with
domain A such that F(a) = {b I S(a,b)} for each a in A.
By definition, a function F has a target; hence axiom 8 requires the existence
of a set B such that F c A x B.
We can remember these axioms if we summarize them in suggestive form as
follows. The axiom of extension gives a criterion for the equality of two sets.
The axioms of specification, pairing, unions, and powers allow us to specify
subsets and fonn pairs (and finite sets in general), intersections and unions, and
the collection of all subsets of a given set (called the power set of the given
set). We postulate the existence of infinite sets. The axiom of choice ensures
that we can choose a single element from each set of a collection of pairwise
disjoint, nonempty sets and form a set with the chosen elements. The axiom of
substitution shows that we can substitute for each element of a given set some
set depending on this element.
If we give completely detailed proofs in mathematics, then we have to go
back to these axioms and to first principles of logic. In practice, we mainly
use the set theoretical operations of union, intersection, complement, difference,
power set, and choice set (the last usually implicitly).
•
Worked Examples for the Introduction
Example 1.1 For sets A, B, Cc S, show that the distributive law holds:
A
n (B u C) = (A n B) u (A n C).
Introduction: Sets and Functions
18
Solution One method is to show that each side is a subset of the other.
First truce x E A n (B U C). This means that x is a member of both A and
B U C. Therefore, x is in A and hence x is in either A n B or A n C; that
is, X E (A n B) u (A n C), and so A n (B u C) C (A n B) u (A n C). Now
let X E (A n B) u (A n C); thus X is in either A n B or A n C. If X E A n B,
then x is in A and B, and in particular, x is in A and is in B U C, so that
X E A n (B u C). Similarly, if X E A n C, we conclude that X E A n (B u C).
Hence (A n B) U (A n C) c A n (B U C), and so we now have equality. The
distributive law can be verified diagrammatically as in Figure 1-8.
•
FIGURE 1-8
Distributive law
Example 1.2 Show that for A, B c S,
A C B <=} S\A :> S\B.
Solution First we prove that A c B implies S\B c S\A. Assume A c B
and x E S\B. Then x (/.Band therefore x (/. A (since x EA • x E B), hence
x e S\A, proving that S\B c S\A. To prove the converse, suppose S\B C S\A
and x EA. Then x (/. S\A and so x (/. S\B (since S\B C S\A) and thus x E B.
HenceAcB.
•
r
(defined on the set of all real numbers) and
Example 1.3 Let f(x) =
B = {y I y ~ I}. Compute the set l (B).
r
Solution By definition,f- 1(8) consists of all x such thatf(x) e B, that is,
all x such that x2 ~ l. This happens iff x ~ I or x :5 - l. Thus 1- 1 (B) = {x I
x~l}U{xlx:5-l}. •
19
Worked Examples for the Introduction
Example 1.4 Show that f(A n B) c f(A) nf(B). Give an example of A, B,
and f with the property that f(A
n B) 'F f(A) nf(B).
Solution If y e / (A n B), then there is some
x e A n B with y = f (x). We
know that x E A and x E B, and so y E /(A) and y E f(B). This shows that
y E /(A) nf(B), so f(A n B) C f(A) nf(B). For an example, let A = {x E
Z I x ~ O}, B = {x E Z I x ~ O}. with f : Z -+ Z defined by /(x) = x2.
Then f(A) A, f(B) A, and J(A) nf(B) A. However, An B {O}, and so
f(A n B) =/({0}) = {O} ~A.
•
=
=
=
=
Example 1.5 Use induction to show that I + 2 + • • • + n .= n(n + 1)/2 for
every n EN.
Solution Let P(i) be the statement 1 + •• • +; = i(i + 1)/2 whose truth or falsity we are trying to establish. We check the conditions 1 and 2 of the method
of induction given in the first part of this introduction. The statement P(l) is
obviously true, since it reduces to 1 = I · 2/2. Now assume that the statement
P(n) is true and let us slww that P(n + 1) is true:
I+ 2 + • • • + n + (n + 1) = (1 + 2 + • • • + n) + (n + 1)
= n · (n + 1) + (n + I) = n · (n + 1) + 2 • (n + I)
2
=
2
(n + l)(n + 2)
(n + l)[(n + 1) + I]
----=-----2
2
Thus the statement is true for n + I .
•
Example 1.6 let A be a set and let 'P(A) denote the set of all subsets of A.
Prove that A and P(A) do not have the same cardinality.
Solution Suppose there is a bijection f : A
-+ P(A); we shall then derive
a contradiction. Let B = {x E A I x ¢ /(x)}. There exists a y E A such that
f(y) = B, since/ is onto. If y E B, then by definition of B we conclude that
y ¢ B. Similarly, if y <t B, then we conclude that y E B. In either case we get a
contradiction. Actually, the argument shows that there does not exist a function
f : A -+ P(A) that is even onto.
•
20
Introduction: Sets and Functions
Exercises for the Introduction
1.
The following mappings are defined by stating /(x), the domain S and the
target T. For ACS and BC T, as given, compute/(A) and/- 1(8).
a.
/(x)
=x2, S= {-1,0, 1}, T=all real
numbers,
A= {-1, l}, B= {O, l}.
b.
/(X)
={
~
1
!f
X
~0
-.x-, 1fx<O
S = T = all real numbers,
A = {x
E real numbers Ix> O}, B = {O}.
},
C.
f (X) = {
0,
-},
if X > 0
if X = 0
if X < 0
S = T = all real numbers
A= B = {x E real numbers
I -2 < x < 1}.
2.
Determine whether the functions listed in Exercise I are one-to-one or
onto (or both) for the given domains and targets.
3.
Let/ : S - T be a function, C1, C2 C T, and Di, D2 C S. Prove
4.
a.
1- 1(C1 U C2) =/- 1(C1) u/- 1(C2).
b.
/(D1 U D2)
c.
1-•cc, n C2) =1- cci> n1- <C2).
d.
/(D1
= f(D1) U /(D2).
1
1
n D2) Cf(Di) n / (D2),
Verify the relations (a) through (d) in Exercise 3 for each of the functions (a) through (c) in Exercise 1 and the following sets; use the sets
in part a below for the function in Exercise l(a), the sets in part b for
the function in Exercise I(b ), and the sets in part c for the functions in
Exercise l(c):
a.
C1= allx>O,D1={-l,l},C2= allx~O.D2={0,I};
b.
C1
c.
C1 = all x ~ 0, D1 = all x, C2
= all x ~ 0, Di = all x > 0, C2 = all x :5 2, D2 =
=
all x ~ -1;
all x > ..:.1, D2 = all x > 0.
21
Exercises for the Introduction
5.
If f : S - T is a function from S into T, show that the following are
equivalent. (Each implies the other two.)
a.
f is one-to-one.
b.
For every y in T, the set 1- 1( {y}) contains at most one point.
c.
f(D1 n D2) = f(D1) nf(D2) for all subsets D1 and D2 of S.
Develop similar criteria for "ontoness."
6.
Show that the set of positive integers N = {1, 2, 3, ... } has as many elements as there are integers, by setting up a one-to-one correspondence
between the set Z = { ... , -3, -2, -1,0, 1, 2, 3, ... } and the set N. Conclude that Z is countable.
7.
Let A be a finite set with N elements, and let P(A) denote the collection of
all subsets of A, including the empty set. Prove that P(A) has 2N elements.
8.
a.
Let N = {0,1,2,3, ... }. Define cp: N x N - N by cp(i,J) =J+
½k(k + 1) where k = i + j. Show that cp is a bijection and that it has
something to do with the following picture:
5 20
b.
9.
4 14
19
3
9
13
18
2
s
8
12
17
l
2
4
7
11
16
0
0
1
1
3
6
2
3
10
4
15
s
Show that if A 1,A2, ... are countable sets, so is A1 UA2 U · · ·.
Let A be a family of subsets of a set S. Write UA for the union of all
members of A and similarly define nA. Suppose B :::> A. Show that
UA C UB and nB C nA.
22
Introduction: Sets and Functwns
10.
Let f : S - T, g : T - U, and h : U - V be mappings. Prove that
ho (go f) =(hog) of (that is, that composition is associative).
11.
Let f : S - T, g : T - U be given mappings. Show that for C C
U, (g o/)-l(C) =f-l(g-l(C)).
12.
Let A be a collection of subsets of a set S and B the collection of complementary sets; that is, BE B iff S\B EA. Prove de Morgan's laws:
a.
S\ UA=nB.
b.
S\nA=UB.
Here UA denotes the union of all sets in A.
For example, if A= {A 1,A2 }, then
13.
a.
reads S\(A1 UA2) = (S\A1) n (S\A2) and
b.
reads S\(A1
n A2) = (S\A1) U (S\Az).
Let A, B C S. Show that
A xB=0
14.
15.
16.
¢:,
A = 0 or B = 0.
Show that
a.
(A x B) U (A' x B) = (A U A') x B.
b.
(A
X
B) n (A'
X
B') = (A
n A')
X
(B n B').
Show that
a.
f : S - T is one-to-one iff there is a function g : T - S such that
g of = Is; we call g a left inverse off.
b.
f : S - Tis onto iff there is a function h : T - S such that f oh = fr;
we call h a right inverse off.
c.
A map / : S - T is a bijection iff there is a map g : T - S such
that f o g h and g of = ls. Show also that g 1- 1 and is uniquely
determined.
=
=
Let/: S - T and g: T - Ube bijections. Show that g of is a bijection
and (gof)- 1 =/- 1 og- 1• [Hint: Use Exercise 15(c).]
23
Exercises for the Introduction
17.
(For this problem you may wish to review some linear algebra.) Let
A=
(""
a12
a21
a22
a,'.,1
a,,,2
::
a1n )
be an m x n matrix where the aij are real numbers. Use Exercise 15 to
show that A has rank m if and only if there is a matrix B such that AB is
the m x m identity matrix.
Chapter 1
The Real Line and
Euclidean Space
The basic objects of mathematical analysis are numbers and functions. This
book will examine the theory of convergence, continuity, differentiation, and
integration of functions of one or several variables as it has developed over
the last 250 years. In the process, we will place this theory in the context of
some of the techniques of modem analysis. In this chapter we begin with the
real line and n-space; this topic is indispensable for a precise treatment of the
calculus of functions of several variables as well as for a clear understanding of
it. Pru1s of this chapter may appear to be review, the material perhaps having
- been covered in previous mathematics courses. However, our discussion may
be more rigorous and give some further properties of the real line in preparation
for later work.
§1.1 Ordered Fields and the Number Systems
Ordered Fields
Let us start with the main prope11ies of real numbers. The reader should be
familiar with the heuristic (that is, intuitive) arguments that are used to talk
about the real numbers. Begin with the nonnegative integers 0, 1, 2, 3, ... , and
then adjoin negative integers and nonintegral rational numbers. The system of
reals is obtained by adjoining to the rationals all the nonrational limits of rational
numbers. For example, the iITational number ../2 is obtained as the limit of an
< 2 and Xn rational. One might
increasing (or monotone) sequence Xn with
use a decimal sequence such as 1, 1.4, 1.41, 1.414, ... for Xn. It is a well-known
x;
25
26
Chapter 1 The Real Line and Euclidean Space
fact, first proved by Euclid, that /2 is not rational (see Exercise 2 at the end of
this chapter).
How do we cany out the construction of the reals in a formal manner?
This process is a little long, tedious in places and difficult in others. In the
remainder of this section, we provide an outline of this program. First we
isolate some important properties that we believe the real numbers should have.
The construction of systems of objects from logic and set theory that obey these
properties is an important step in the foundations of analysis. It was carried
out by Karl Weierstrass, Richard Dedekind, Georg Cantor, and others in the late
1800s and early 1900s, and refinements of the idea continue today.
We start with a given set whose elements will be called numbers and consider
the following axioms for this set:
IField Axioms I
I.
II.
Addition axioms There is an addition operation "+" such that for all
numbers x, y, z, the following hold:
1.
x+y =y+x.
2.
x + (y + z) = (x + y) + z.
3.
There is a number O such that x + 0 = x.
4.
For each x there is a number denoted -x
such that x + (-x) = O; one writes
y-x=y+(-x).
Multiplication axioms
commutativity
=y·x.
x · (y · z) = (x · y) · z.
7.
There is a number I such that I • x = x.
8.
For each x ~ 0 there exists a number
denoted x- 1 such that x • x- 1 = I;
one writes y • x- 1 = y/x.
X •Y
X•
10.
I
(y + z) = X • y + X • z.
~
0.
zero
additive inverses
There is a multiplication operation"." such that
5.
6.
9.
associativity
commutativity
associativity
unity
reciprocals
distributive law
nontriviality
Any set or "number system" with operations + and • obeying these rules is
called afield. For example, the rational numbers are a field but the integers are
not-since the reciprocal of an integer (other than ±1) is not an integer. From
now on, we will just write xy for x · y. Rule IO outlaws the trivial field consisting
of the single element 0.
§1.1 Ordered Fields and the Number Systems
27
If one starts with the natural numbers N, then axioms 1 and 2 hold along
with 5, 6, 7, 9, and 10. The inclusion of zero and negative numbers to form
Z gives a system obeying axioms 1 through 7, 9, and IO. Finally, inclusion
of multiplicative reciprocals to fonn the rational numbers Q yields a system
obeying all of these axioms. We shall elaborate on this idea later in this section.
From the field axioms, one can deduce the usual properties for manipulation
of algebraic equalities, such as deriving the identity (a - b)2 = a 2 - 2ab +
b2 and the laws of exponents. We will give some examples of this below
in Proposition 1.1.2 and in examples and exercises. We shall assume these
properties thereafter.
The real numbers also come equipped with a natural orderir,ig. We usually
visualize them arranged on a line.
IOrder Axioms I
III.
There is a relation " $ " such that
11.
For each x we have x $ x.
reflexivity
12. If x :5 y and y :5 x then x = y.
antisymmetry
13. If x :5 y and y $ z then x $ z.
transitivity
14. For every pair of numbers (x,y),
either x $ y or y $ x.
15. If x $ y then
for every
linear ordering
x+ z $ y +z
z.
16. If O :5 x and O :5 y then
0 $ xy.
compatibility of $ and +
compatibility of $ and •
Properties 11 to 13 state that the relation $ is a partial ordering. Property 14
says that every two numbers are comparable. This is described by saying that $
is a linear ordering or total ordering. Properties 15 and 16 say that the ordering
and the operations are compatible. A system obeying all 16 prope11ies is called
an ordered.field. The real numbers JR and the rational numbers Qare examples.
By definition, x < y shall mean that x $ y and x 'f y. Also, ~ y means y $ x,
and x > y means y < x. Properties 11, 12, and 14 combine to give the following
o.bservation.
x
1.1.1 Law of Trichotomy If x and y are elements of an ordered field,
then exactly one of the relations x < y, x = y, or x > y holds..
28
Chapter 1 The Real Une and Euclidean Space
There are other systems besides the real numbers in which some of these
axioms play a role. For example, axioms I through 4 define a commutative
group. Axioms I through 9 excluding axioms 5 and 8 define a ring; for instance,
this concept is appropriate for algebraic operations on the set of all n x n matrices.
Notice that the set of all subsets of a given set S, where A :::; B is defined to
mean that A c B, forms a partially ordered set and that this ordering fails to
satisfy property 14.
1.1.2 Proposition In an ordered field the following properties hold:
=a for every a, then x =0.
Unique identities If a + x
every a, then x = 1.
ii.
Unique inverses If a + x =0, then x =-a. If ax =I, then x =a- 1•
iii.
No divisors of zero If xy =0, then x =0 or y =0.
iv.
Cancellation laws for addition
b + x, then a :::; b.
v.
Cancellation for multiplication If ax
ax :2:: bx and x > 0, then a ;?:: b.
vi.
0 · x = 0 for every x.
vii.
-(-x) =xfor every x.
viii.
-x = (- l)x for every x.
ix.
If x 'f 0, then x- 1 'f O and (x- 1)- 1 = x.
x.
If x f: 0 and y f: 0, then xy f: 0 and (xy)- 1 = x- 1y- 1•
xi.
If x
xii.
If x S 0 and y
xiii. 0
xiv.
If a · x
=a for
i.
If a + x = b + x, then a = b. If a + x S
= bx and x
'f 0, then a
=b.
If
$ y and O.:::; z, then xz :::; yz. If x :::; y and z S 0, then yz :::; xz.
$ 0, then xy ~ 0.
If x S 0 and y ~ 0, then xy :::; 0.
< I.
For any x,
x2
:2:: 0.
Proofs of ii, vi, viii, xiii, and xiv are given at the end of the chapter to
illustrate the ideas.
The purpose of the axioms of an ordered field is to isolate the key properties
we need for manipulation of algebraic equalities and inequalities. For example,
29
§I.I Ordered Fields and the Number Systems
here is a typical manipulation: By 1.1.2xiv, for any numbers a and b, (a - b)2 ~
0. Thus, a 2 - 2ab + b2 ~ 0, and so
ab $
I
2(a2 + b2 )
for any numbers a, b. This type of manipulation is important in analysis and
will reappear later.
1.1.3 Example Using the axioms and properties of an ordered field given
in this section, prove that
a2
-
b2 = (a - b)(a + b).
Solution By the,distributive law,
(a - b)(a + b) = (a - b) ·a+ (a - b) · b.
Now use commutativity and the disuibutive law again, along with a - b =
a+ (-b):
(a- b) · a+ (a -b) · b
=a· (a -
b)+b· (a -b)
=a
2
+a• (-b)+b· a+b · (-b).
Now a• (-b) =a• (-1) · b = (-l)ab = -(ab) by 1.J.2viii, associativity, and
commutativity. Similarly, b · (-b) = -b2 • Thus (a - b)(a +b) equals
a 2 +a· (-b) + b ·a+ b · (-b) = a 2
-
(ab)+ ba - b2
= a2
-
ab+ ab - b2
= a 2 - b2
(by axiom 5)
(by axioms 3 and 4).
•
1.1.4 Example In an ordered field prove that if O $ x < y, then i1 < y2.
Solution If 0 ::; x < y, then 0 ::; x ::; y, and so by 1.1.2xi~ x2- ::; yx. By the
same reasoning, x ::; y implies xy ::; y2. Thus
x2 :5 yx = xy
$
(by commutativity)
y2,
and so ;,? $ y2. We now need to exclude the possibility that x2- equals 'y2 • But
if x2- = y2 then
;,? - y2 = 0 (add -y2 to each side)
= 0 (by Example 1.1.3)
(x - y)(x + y)
Chapter 1 The Real line and Euclidean Space
30
By 1.1.2xii we have O 5 x and y > 0. Now x+y ~ 0, since x+y = 0 would
imply y = (-x) 5 0 so that y 5 0, which is impossible by the law of trichotomy.
By cancellation for multiplication,
(x-y)(x+y)=0
implies that x - y = 0, i.e., that x = y. But we are given x
•
is excluded, as desired.
< y,
and so this case
Other familiar symbols may be introduced. The magnitude, or absolute
value, of a number x is denoted !xi and in an ordered field is defined by !xi = x
if x ~ 0 and !xi= -x if x < 0. The distance between two elements x and y is
Ix - yl. Certain elementary properties of absolute value and distance are worth
singling out both for their own value and because we will generalize them to
other settings later. Their proofs. are straightforward, and so they are omitted.
1.1.5 Proposition
i.
lxl ~ 0 for every x.
ii.
lxl = 0 if and only if x = 0.
iii.
!xyl = Ix! !YI-
iv.
lx+yl 5 Jxl + !YI-
v.
l!xJ - IYII 5 Ix - YI-
triangle i11equalily
a/Jemative triangle inequality
For instance, to prove iv one considers all of the cases: For example, if
x ~ 0, y 5 0, and x + y ~ 0, then Ix+ YI x + y 5 x + Y - y x = lxl 5 lxl + IYI•
=
=
1.1.6 Proposition If d(x,y) = Ix - yj. then
i.
d(x,y) ~ 0.
ii.
d(x,y) = 0
iii.
d(x,y) = d(y,x).
iv.
d(x, y) 5 d(x, z) + d(z, y).
if and only if x = y.
triangle inequality
The name triangle inequality comes from the corresponding inequality when
x and y are not numbers but points in the plane and d(x,y) is the usual Euclidean
31
§1.1 Ordered Fields and the Number Systems
distance between them. In that case, it expresses the familiar fact that the sum
of the lengths of two sides of a triangle is always at least the length of the third
side. We will treat this more gener;il situation in §1.6.
The Number Systems
We briefly summruize the various number systems to review and establish notation.
The Natural Numbers
The natural numbers are the counting numbers: N = { 1, 2, 3, 4, ... } . Appending
' zero gives the set N = { 0, I, 2, 3, ... } of nonnegative integers. The identities 0
and-(1 are in N, and addition and multiplication are defined. However, we do
t1ot have1legatives or reciprocals. With the usual ordering, N satisfies all of ttie
axioms 1 to 16 except 4 and 8. The most outstanding fact about N is that the
el~ments of N are the numbers we use for counting. There is a first one, 0, and
for each there is a definite "next" obtained by adding 1. Repeating the process
of adding 1 generates all of N. This is formalized by the
Principle of Mathematical Induction If Sis a subset of N such that O E Sand
k + l is in S whenever k is in S, then S = N.
As the reader already saw in the introductory chapter, this is a very powerful
tool not only in number theory, but throughout mathematics. Some consequences
of this ptinciple are wo11h pointing out. Certainly the set N has a smallest
el~ment, namely 0. The same is tme of every nonempty subset of N. We say
that N is well-ordered by :5 or that it satisfies the well-ordering property. A
grammatically unf011unate but common expression is to describe :5 as a wellordering on N.
1.1.7 Proposition
N is well-ordered by the relation
:5. That is,
N has the
well-ordering property:
Well-Ordering Property If Sis a nonempty subset of N, then there is a smallest
element in S; i.e., there is an so E S such that s0 :5 x for every x in S.
32
Chapter 1 The Real LJne and Euclidean Space
The proof of this, given at the end of the chapter, illustrates a way of using
induction that may be new to the reader. While a formal proof is necessary, the
validity of the well-ordering property for the natural numbers should be clear
intuitively.
As a consequence of the well-ordering property, one can show that the natural
numbers also satisfy the
Principle of Complete Induction
n} CS implies n ES, then S = N.
If SC N is a subset such that {x E N Ix<
This form of the induction principle is applicable to other well-ordered sets
besides N. It is not equivalent to the principle of finite induction; i.e., there
are well-ordered sets that satisfy the principle of complete induction but not the
principle of mathematical induction.
The Integers
To pass from N to the integers, Z = { ... , -3, :- 2, - I, 0, 1, 2, 3, ... }, we include
the negative numbers and thus guarantee solutions to every equation x + n = m
with n and m in N. In doing so we have produced a system in which there is a
unique solution x = m - n whenever n and m are in Z. This system satisfies all
of the properties 1 through 16 except 8. As mentioned previously, a structure
satisfying I through 4, 6, and 9 is called a ring. With 5 it is a commutative
ring, and with 7 a ring with unity or ring with identity. There are examples of
commutative rings besides Z. (Think of the "clock arithmetic" of grade school
mathematics.) When we ask for properties 10-16, the possibilities narrow. In a
sense, the simplest system now possible is Z. By extending Nin both directions,
we have lost the principle of mathematical induction, but in a sense we have it
in two directions: If S is a subset of Z such that O E S and both k + 1 and k - 1
are in S whenever k is in S, then S = Z.
The Rational Numbers
The set Q of rational numbers consists of those numbers that can be expressed
as a fraction or ratio of integers: m/n with n t- 0. We can view this as expanding
Z to include solutions to all equations nx + m = 0 with n, m E Z, aod n t- 0.
In the process we get a system Q in which every equation ax + b = 0 with a
and b in Q and a t- 0 has a unique solution x = -a-• · b. A system such as
this, satisfying 1-10, is called a field. Examples of fields are Q and the real
33
§1.1 Ordered Fields and the Number Systems
numbers JR. In §1.8 we shall study the complex number field C, but there are
other fields as well, including ones with only finitely many elements. These last,
as well as C, are excluded if we want an ordered field satisfying 11-16 also.
But JR, Q, and others remain. If one imposes some sort of minimality conditions
such as demanding that no proper subset also be a field, then it can be shown
that Q is essentially the only possibility.
We continue with two propositions. The first says that there are many elements in Q, and the second that there are not too many.
1.1.8 Proposition Q is "dense in itself':
is an element z in Q with x < z < y.
If x
and y are in Q, then there
This proposition is verified by choosing z = (x + y)/2.
If we start to mark the rational numbers on a number line, we find that they
are scattered densely along the line and seem to be filling it up. (We know they
do not; for example, v1z is missing.)
1.1.9 Proposition Q is
countable.
This ..means that Q can be placed in one-to-one correspondence with N (or a
subset of it). The whole of Q can be displayed as a list or sequence. One way to
do this is to consider the points in the plane with integer coordinates, say (p, q)
and to assign the fraction p/q (simplified to lowest terms) to this point (leave
out assignments when q = 0). Now starting at the origin, spiral out, listing all
the fractions as you go, omitting any fraction that you have already encountered;
that is, avoiding repetitions. In this way, you will count all possible fractions
(positive and negative) and thereby set up a one-to-one correspondence petween
N and Q.
The final property of Q is rather trivial, but it is quite important in JR. It is
the
Archimedean Property
If x
E Q, then there is an integer n with x
< n.
~ 0, take n = 1. If x = p / q with p > 0 and q > 0, take n = p + 1.
In any ordered field there is a copy of N obtained by identifying n with
1 + 1 + • • • + 1 (n terms). This extends to a copy of Z by including ·-n and of
Q by identifying k/n with n- 1 •kif n ~ 0. A field IF is called an Archimedean
If x
Chapter I The Real Line and Euclidean Space
34
ordered field if it has the Archimedean property. Here are several equivalent
formulations:
1.
If x E lF, then there is an integer n with x
2.
If x and y are in lF with O < x
3.
If x E IF' and x
> 0,
< n.
< y, then there is an integer k with kx > y.
then there is an integer n
>0
with O < I/ n
< x.
(fo obtain the third characterization, let y = 1/x and select n > y.) Thus, Q is
an example of an Archimedean ordered field. There are others, and there are
ordered fields that are not Archimedean.
There are other jobs that, in principle, need to be done before proceeding;
for instance, the laws of integer exponents need to be developed. We hope that
we have given enough of the flavor of the techniques so that this is believable
and we can go on. (The job of fully developing these topics belongs to courses
in algebra.)
1.1.10 Example Give an example of a commutative ring without an identity.
Solution We must have axioms 1-6 and 9, but axiom 7 should fail. Perhaps
the simplest such system is the set of even integers with the usual operations of
addition and multiplication.
•
1.1.11 Example Use induction
n
to prove that 1/2n
< l/nfor every integer
> 0.
Solution This is true for n = l since l /2 < l. Assume it is valid for a given
integer n
>
1. Then
l
1
1
I
I
= - - < - = - - <--.
2n+I 2n · 2 2n n + n
n+l
Hence it is true for n + l.
•
Note. The exercises at the end of each section are intended to be relatively
routine to build confidence and provide practice before the more challenging
of the problems at the end of the chapter are attempted. The student will find
a mix of routine and challenging problems at the end of the chapter.
§1.2 Completeness and the Real Number System
35
Exercises for §1.1
1.
In an ordered field prove that (a+ b)2 = a 2 + 2ab + b 2 .
2.
In a field, show that if b -cf. 0 and d -cf. 0, then
a
c
ad+bc
b + d = ,;;r·
3.
In an ordered field, if a > b, show that a2 b < ab2 + (a 3
4.
Prove that in an ordered field, if v'2 is a positive number whose square is
2, then v'2 < 3 /2. (D'o this without using a numerical approximation for
-
b3)/3.
J2.)
5.
Give an example of a field with only three elements. Prove that it cannot
be made into an ordered field.
§1.2 Completeness and the Real Number System
The axioms of an ordered field are not enough to uniquely characterize the real
numbers, because the rational numbers also obey these axioms. Thus we require
another condition to ensure that limits of rationals are included in the system.
This condition is known as completeness. The system of rational numbers is
quite rich, but not quite rich enough for classical analysis. There are numbers
that in some sense ought to be there but are not. For example, there is no
rational square root of 2. More precisely, there are no integers p and q such
that (p/q)2 = 2. This is a problem not only for analysis, but also for geometry.
It is in that context that this difficulty came to the attention of the Pythagorean
mathematicians and philosophers of ancient Greece. There is no rational number
corresponding to the length of the diagonal of a square whose side has length 1.
There are several ways of introducing the completeness property. We shall
P._.Se the method of monotone sequences, since it is closest to our intuition of
constructing real numbers, like v'2 = 1.414213 ... , as a limit of an approximating sequence. (In the next section we use this property to prove the least upper
bound property, which can be used as an alternative axiom.). To state the axiom,
we need a digression on sequences.
The idea of approximation is central to the theory, techniques, and applications of analysis. A version of the idea is that even if you cannot solve a
problem exactly, perhaps you can establish a procedure that when repeated will
give better and better approximations. In a sense this happens even in arithmetic.
If a child is asked to divide 3 into 1 to obtain the decimal expansion for 1/3, the
usual division method will never stop but produces a sequence of ever-improving
Chapter 1 The Real Line and Euclidean Space
36
approximations: 0.3, 0.33, 0.333, 0.3333, .... We begin our study with sequences
of real numbers and later proceed to sequences of vectors. We will return to
this theme throughout the book, looking at sequences of functions and various
ideas regarding convergence.
Recall that a sequence in a set S is simply a function f : N -+ S. The
essence of the idea is that a sequence is a countable list of elements from S
arranged in a particular order. Subscript notation is often used to label the
elements f(n) = Xn, although sometimes other devices are used, such as Jn>
or x(n). The sequence is arranged as a list x 1,x2,X3, ...• Since the labeling is
somewhat arbitrary, we could relabel without loss of information, by putting
Yk = Xk+~ and getting Yo,Y1 ,Y2,Y3, ... , and this might simplify the expression of
f. For example, the sequence
1 l I
2'4'8'
I
16'""
=
=
=
can be represented as Xn
1/2n for n
1, 2, 3, ... or Yk
I /2k+l for k =
0, 1, 2, 3, .... We typically write the sequence as XnA sequence is said to converge to a limit x or to be convergent to x if we
can guarantee that the points in the sequence are as clo~e as we wish to x by
going far enough put in the sequence. This is made precise in the following
definition.
1.2.1 Definition
A sequence Xn is said to converge to a limit x iffor every
~ N. (The number
N may depend on e; smaller e may require larger N.) In this case we write
limn-+oo Xn = X or Xn -+ X as n -+ 00.
e > 0 there is an integer N such that lxn - xi < e whenever n
The definition requires that all terms beyond the Nth in the sequence be
closer than e to x. It is not enough to simply find one term close to x. The
Greek letter e (epsilon) can be thought of as standing for an allowed maximum
error in approximation. You must be able lo assert that beyond some specific
N, all terms of the sequence approximate x within this degree of predsion. A
feeling for this may be gained by plotting the points on a number line, as in
Figure 1.2-1.
Xi
I
FIGURE 1.2-1
X4
e
X -
II 11•11 I I
E
X
I
e
X
+
E
Elements of a sequence Xn get within e of the limit for n large
§1.2 Completeness and the Real Number System
37
/(n)
T
x+t
x
x-£
----l----~------------1
I
I
t
I
T
•
----r• --+-----r------1
I
I
I
•
•
I
I
I
I
----r-7--T-----r-------
1
I
I
,
I
,
I
I
I
I
I
I
I
I
I
I
I
_ _ _..____.2__3...___4_....5._.-.-.-N.____._I_N___,+_2__..._n
. ,.-
I
N+ 1
FIGURE 1.2-2
Convergence viewed in terms of a graph
Definition 1.2.1 requires that all but finitely many terms of the sequence lie
between x - c and x + c. More intuition may be obtained from the graph of the
function defining the sequence (Figure 1.2-2). Far enough to the right, all the
points must lie in the band of width 2£ around the horizontal line at height x. It
is not enough that (3,x3 ) lies in this band, since (4,x4 ) does not.
1.2.2 Sandwich Lemma Suppose Xn
--+
L.
Yn --+
L, and Xn ~
all n (it is enough to assume that there is an No such that Xn $
n > No). Then Zn --+ L
Zn
$
~ Yn for
whenever
Zn
Yn
Intuitively, this is so because Zn is trapped above and below by numbers
getting close to L. The complete proof is given at the end of the chapter. We
will see further variations on this theme later. Another fact worth noting is that
the limit cannot exceed the bound of the sequence:
1.2.3 Proposition If a $
Xn
$ b for every n and Xn
--+
x, then a $ x $ b.
The proof, which we leave to the reader, can be accomplished directly by a
method similar to that used for the sandwich lemma.
1.2.4 Example The sequence Xn
=(- Jt, n = 1, 2, ... , illustrates several
things:
1.
Some points of a sequence can be repeated; i.e., the function from N
defining it need not be one-to-one.
38
Chapter 1 The Real line and Euclidean Space
2.
The sequence does not converge. Let e = I /2. Then no matter what limit
x is proposed for the sequence, there are tenns far out in tfie sequence that
are more than e from x. To see this, first assume x ~ 0. Then lxn - xi =
I - 1 - xi ;?: 1 > e for every odd n. If x < 0 then lxn - xi = 11 - xi ;?: I > e
for every even n.
3.
The sequence clusters frequently at l and at -1. If we use only the even
terms, they converge to 1, while if we use only the odd terms, we get a
limit of -1. This illustrates the ideas of cluster point and subsequences,
which we will examine in §1.5. •
The following is often expressed by the paraphrase "Limits are unique," a
result that should be plausible. (Again, see the end of the chapter for the proof.)
1.2.S Proposition If Xn is a sequence in an ordered field and x~
Xn -+
-+
x and
y, then x = y.
We note a simple but important fact before turning to the arithmetic of
sequences. A sequence is said to be bounded if the set of points making it up
is bounded, i.e., if there is a number M such that lxnl $ M for all n. It is said to
be bounded above if there is a number B with Xn $ B for every n and bounded
below if there is an A with A $ Xn for all n. Clearly a sequence is bounded if
and only if it is bounded both above and below. We call B an upper bound and
A a lower bound for the sequence.
1.2.6 Proposition
A convergent sequence is bounded.
This is so because all the elements of the sequence eventually bunch up at the
limit and what happens for the first few members does not affect boundedness.
Suppose Xn and Yn are sequences in an ordered field. We can add, subtract,
multiply, and divide them tenn by tenn:
+ (Yn) = (Xn + Yn)
A(Xn) = (AXn)
(x,.)(yn) = (XnYn)
(Xn)/(yn) = (Xn/Yn)-
(Xn)
The last operation assumes that Yn '# 0. These operations behave more or less as
expected with respect to limits. If Xn -+ x and y,. -+ y, then for large enough n,
39
§1.2 Completenes:r and the Real Number System
are
close toy. We expect that Xn + Yn, AXn,
the points Xn are close to X and Yn
and XnYn are close to x + y, >.x, and xy and that if y 'f' 0 then Xn/Yn is close to
x/y. We summarize this plausible result.as follows.
1.2.7 Limit Theorem for Sequences Suppose Xn
-+
x, Yn
-+
y, and >.
is constant. Then
i.
Xn +Yn-+ x+y.
ii.
>.xn
iii.
XnYn
iv.
If Yn 'f O and y 'f 0, then XnfYn
-+
-+
>.x.
xy.
-+
x/y.
A sequence Xn is said to be increasing (or better, nondecreasing) if Xn $ Xn+I
for every n, i.e., x 1 $ x2 $ x3 $ .... It is strictly increasing if x 1 < x2 <
X3 < .... It is decreasing (nonincreasing) if Xn ;:=: xn+1 for each n, i.e., x 1 ;:=:
x2 ;:=: X3 ;:=: ••• , and is strictly decreasing if x1 > x2 > X3 > .... It is called
a monotone sequence (monotonically increasing, monotonically decreasing)
if one of these is the case. A sequence that is monotonically increasing but
bounded above cannot get away to infinity and so should crowd up against
some limit, just as the sequence l, 1.4, 1.41, 1.414, ... converges upward to /2.
1.2.8 Monotone Sequence Property Let lF be an ordered field. We
say that lF has the monotone sequence property if every monotone increasing
sequence bounded above converges.
1.2.9 Completeness Property An ordered.field is said to be complete if
it obeys the mqnotone sequence property.
According to these definitions, the monotone sequence property and completeness are synonymous. We have used two names, since as we shall see in
§1.3 and §1.4, there are other properties that can be alternatively used to characterize completeness, and in that context these terminological distinctions become
important.
As with /2, one can use the monotone sequence property to prove the
existence of square roots of positive numbers in a complete ordered field. Given
y > 0, one defines a sequence x,, by letting Xn = Nn/2n, where Nn is that largest
integer such that~ $ y. One proves that Xn is a monotone sequence bounded
Chapter 1 The Real Line and Euclidean Space
40
above and so converges to a number x. (See Example 1.2.10, which follows, for
the techniques involved.) This number x satisfies x2 = y and is denoted ,jy. We
shall accept this and similar facts about the laws of exponents (and associated
laws of logarithms) in what follows.
1.2.10 Example In a complete ordered field, define Xn inductively by xo = 0,
x 1 =,/2, X2
=J2 + x1, ... , Xn =J2 + Xn-1, . ... Show that Xn converges.
'
Solution We shall show that Xn is increasing and bounded above and this
will prove the assertion. First we use induction to show that Xn ~ 0 and rn =
Xn+I - Xn ~ 0. Clearly, it holds for n = 0. Suppose it is true for n - l; then
Xn = J2+Xn-l 2:: ../2 > 0 and
rn
=Xn+I -
Xn
=V~
L. "1"..{n -
=
./
y 2 +Xn-1
Xn-Xn-1
= J2"+'Xn
2 + Xn + y2 + Xn-1
J2 +Xn + J2 +Xn-1'
and so rn-1 2:: 0 implies rn 2:: 0, and therefore Xn is increasing. Now we want to
show that Xn is bounded above. For example, one can prove by induction that
Xn S 2. Clearly, xo,X1 S 2. Suppose Xn-1 S 2. Then
Xn
= J2 + Xn-1
S J2 + 2 S
J4 =2,
and therefore Xn is increasing and bounded above and thus converges. (We·will
compute the value of the limit in Exercise 1 following this section.)
•
As in this example, completeness does not give the value of the limit; it
merely says that there is one. Such results are common in mathematics and
are called existence theorems. This may not seem to be of much practical
help, but sometimes it is. Once we know that the limit exists and so has some
definite value, we can assign it a symbol, perhaps find an equation for it, and
solve. Without knowing that the solution exists, such a procedure might lead to
a spurious solution.
1.2.11 Proposition Complete ordered fields are Archimedean.
Perhaps you are surprised that this is not obvious! In fact there are ordered
fields that are not Archimedean! One is given in Exercise 47 at the end of this
chapter.
I
§1.2 Completeness and the Real Number System
41
A key example that is instrumental in many other arguments is the "almost
obvious" fact that 1/n tends to 0 as n grows large. This follows from the
Archimedean principle:
1.2.12 Example Prove that in a complete orderedfield,
1/n----t0 as n----too.
Solution According to the definition, given any number c > 0, we must
prove that there is an integer N such that if n 2: N then II/n _:_ 0I < €. It will
be so provided that 1/ N < €, and so it suffices to choose N > 1/ c, which is
possible by the Archimedean property.
•
The next example is a bit more complicated but uses essentially the same
idea.
1.2.13 Example Show that Jn2 + I/n!
----t Oas n ----too.
Solution To show that Jn 2 + 1/n! gets small as n gets large, we estimate it:
0
< Vn2+l <
-
Thus, given c
n!
-
> 0 choose N
0
v2n2 = J2n
n!
=
n!
such that N
>
J2 < J2 .
(n - l)! - n - I
✓2/c + l. Then n
Jn 2 +1
J2
J2
n!
- n- 1 - N - 1
< --- < -- < -- <
This proves the assertion.
2: N implies
€.
•
A third example displays other tricks.
1.2.14 Example Show that lim
11 _
00
I/2n = 0.
Method 1 Using a bit of knowledge we can set a target c > 0 as the allowed
error and see just how large N must be to guarantee II/2n - 0I <€whenever
n 2: N. To do this we compute:
I.!_2 _- o I<
€ {=>
11
{:>
.!_ < € {=> ! < 2"
zn
c
1
log - < n log 2
c
{:>
n
>
log(l/c)
1og 2 .
Chapter 1 The Real Line and Euclidean Space
42
Selecting any N > log(l/e)/ log 2 yields the result. Notice it is important that
each step of the computation be reversible, since we really need n 2::: N to imply
ll/2n - OI < c:. The finicky reader may legitimately object to this proof on the
grounds that it uses the logarithm, which has not yet been rigorously introduced.
Here is another method for such readers.
Method 2 This method uses an earlier result and illustrates a useful trick. We
compare our sequence with known sequences. By Example 1.1.11, we know
that O < 1/2n < l/n for every n > 0. We also know by 1.2.12 that 1/n -+ 0.
Given e > 0, fix N large enough so that 1/n < e whenever n 2::: N. Then n 2::: N
implies O < I/2n < 1/n < £. Thus II/2n - OI < £.
Notice that the value of N obtained by this method is 1/c:. This is probably
quite a bit larger than the best possible N obtained in method 1. For c: = 0.00 I,
we have obtained here that N = 1000 (or better, 1001) works. The first method
showed that any N larger than (log 1000) / log 2 ~ 9 .97 would work. We have
a terribly inefficient value of N, but a much simpler argument. For theoretical
purposes, such estimates are often a good idea. For computational purposes the
effort expended in finding the more efficient estimate might be worthwhile. A
second thing to note is the trick of comparing a sequence under study with more
easily understood sequences. •
The arithmetic of limits is used as in calculus to evaluate more complex
limits:
.
n2
-
3n +6
1.2.15 Example Evaluate hm 3 ., 4 2 .
n-oo n- + n +
Solution
I.Im n2.,-3n+6 = 1.Im 1-3/n+6/n2
n-+oo 3n- + 4n + 2
n-oo 3 + 4/n + 2/n 2
lim I - 3 lim 1/n + 6 ( Jim 1/n) 2
n-oo
=------------~
n-oo
n_,.oo
n~1! 3 +4 n~1! 1/n + 2 c~1! Ifn)2
1-3·0+6·02
- 3+4·0+2-02
1
Jim 1/ n
3 (since we know that n-oo
=-
=0).
•
§1.2 Completeness and the Real Number System
43
Next we discuss the construction of the real numbers. The following result
is basic.
1.2.16 Theorem There is a "unique" complete ordered field, called the
real number system.
The real number system is denoted JR. In this text, +oo and -oo are not
included in JR; for other purposes, such as in complex variables and in integration
theory, however, it is useful to adjoin ±oo.
In the theorem above, "uniqueness" means that any two systems satisfying
axiom groups I-ID and the completeness property 1.2.9 can be put into a oneto-one correspondence that is compatible with addition, multiplication, and the
order relation. By compatibility with addition, for example, we mean that the
number in the second system corresponding to the sum of the two numbers
from the first system is the sum of the corresponding two numbers in the second
system. Such a correspondence is called an isomorphism.
We shall not present the proof of Theorem 1.2.16 but rather shall give the
basic construction of JR here and leave the detailed verification of the properties
\as an exercise. We start wi~ the ordered field of rational numbers IQ. Consider
the set of sequences
_.
- S = {(x1,x2, .. .) I each x~ E IQ and the sequence is
increasing and bounded above}.
Call two members of S equivalent if every upper bound for one sequence is
also an upper bound for the other. The set of all sequences that are equivalent to a given sequence is called an equivalence class. One checks that this
divides (or partitions) th~ space S into disjoint subsets. Let JR be the set of all
the equivalence classes in S. The rationals are regarded as a subset of JR by
identifying a rational number r with the class of all sequences equivalent to the
sequence (r, r, r, ... ). Roughly speaking, the reals are the limits of all increasing
sequences of rationals that are bounded above. Now one has to define addition, multiplication, inequality, and so on, and verify all of the axioms. This
is a straightforward but somewhat tedious job. As part of this job one checks
consistency: if x E JR and if (x 1,x2 , .•. ) is a sequence in the class x, then this
sequence really does converge to x.
The existence of IR can also be shown by verifying that the usual decimal
expansions have the required properties, although this method has some subtle
difficulties. 1\vo other standard methods are a geometrical approach due to
Richard Dedekind and an analytic approach due to Georg Cantor. Dedekind
takes for his basic objects "cuts," which are partitions of the rationals into pairs
44
Chapter I The Real line and Euclidean Space
of subsets (A, B) with x < y for all x in A and y in B. Cantor proceeds as we
do, looking directly at sequences, and creates limits for them by considering as
his basic objects equivalence classes of sequences.
A consequence of the Archimedean property from 1.2.11 is that the rational
numbers are dense in JR. This means that given any real number x, we can find
rational numbers as close to it as we want. This confinns our intuitive picture
of the rational numbers as being densely scattered along the line.
1.2.17 Proposition Q is dense in IR. That is,
i.
If x and y are in IR and x
ii.
If x E IR,
< y,
eE IR and E > 0,
then there is an r E Q with x
then there is an r E Q with
< r < y.
Ix - rl < €.
The rational numbers are countable, and from 1.2.17, there are enough of
them to approximate every real number. Does this mean that IR is countable?
Perhaps surprisingly, the answer is no. This is yet another result of Georg
Cantor. At the end of the chapter we give one of the more traditional arguments
for showing this. This argument is often called the diagonal argument and
depends on the representation of the reals as infinite decimal expansions.
1.2.18 Theorem The unit interval ]O, 1[ in JR is uncountable.
A simple rescaling by a straight line function f(x) =a+ (b - a)x puts ]O, l[
into one-to-one correspondence with the interval ]a, b[. Thus, any interval ]a, b[
is uncountable. Thus, although 1.2.17 shows that any such interval contains
infinitely many rational numbers, 1.2.18 shows that it must also contain infinitely
many, in fact uncountably many, in-ational numbers.
1.2.19 .Corollary If x and y are in IR and x < y, then the interval Jx,y[
contains countably many rational numbers and uncountably many irrational
numbers.
We conclude with a less obvious sequence that we will come back to later
as a source of examples.
1.2.20 Proposition: The harmonic sequence
Let x, = 1 arid x,. =
1+ 1/2+· • •+ 1/ n, n = 2, 3, .... Then x,. is monotone increasing but is unbounded
above and so does not converge (we write x,.-+ oo).
-
45
Exercises for §1.2
In the next section we will be studying the least upper bound property.
Some authors like to use it as the axiom of completeness. In such a treatment
the monotone convergence property appears as a theorem. In our treatment,
which we chose for its intuitive appeal, it is the least upper bound property that
is a theorem!
Exercises for §1.2
In Example 1.2.10, let A ::: limn_, 00 x,,.
1.
a.
Show that A is a root of A2
b.
Find limn->oo x,,.
Z.
Show that 3" /n! converges to 0.
3.
Let x,1 =
J n2 + I -
-
A - 2 == 0.
n. Compute lim,,----, 00 Xn,
Let x,, be a monotone increasing sequence such that Xn+I -x,, ::; 1/ n. Must
converge?
4.
Xn
Let lF be an ordered field in which every strictly monotone increasing
sequence bounded above converges. Prove that lF is complete.
5.
§1.3 Least Upper Bounds
The completeness axiom can be put into several other equivalent forms. To state
these, we need some further terminology.
1.3.1 Definitions Let S C JR. A number b is called an upper bound for S
if for all x E S, we have x ::; b.
A number b is called a least upper bound of S if, first, b is an upper bound
of S and, second, b is less than or equal to every other upper bound of S. See
Figure 1.3-1. The least upper bound of S (also called the supremum of S) is
denote1 sup S, sup(S), lub S, or lub(S). If S c JR. is not bounded above (has no
upper bound), we say that sups is infinite and write sups= +oo. 1
1If
S is empty, then one defines sup(S) to be oo.
46
Chapter 1 The Real Line and Euclidean Space
an upper bound
s
li"...::4'..M....""',.,..
~1:h.,.._-~
"\~
~it~~..iNr.J.;v>o.••..
FIGURE 1.3-1
/
•
'\
the least upper bound
Least upper bound
The set ]a, b[ = { x E JR I a < x < b} is called an open interval, while the set
~ x ~ b} is called a closed interval. For example, the closed
inteival (0, 1), the open inteival JO, I[, and all of the rationals less than 1 have
a least upper bound of I. These are relatively simple sets. It is good to keep in
mind that the theory developed in this section also applies to complicated sets,
such as the set of numbers in (0, 1J with a decimal expansion O.a 1a2a3 · • - where
a; is 5, 6, or 7 for each i. This set is not as readily visualized.
[a, bJ= {x E JR I a
Note. In this text, open inteivals are denoted as Ja, b[ rather than (a, b).
This "European" convention avoids confusion with ordered pairs.
There can be at most one least upper bound for a set S. Indeed, if b and/
b' are both least upper bounds, then since b is less than or equal to every other
upper bound, b ~ b'. Similarly, b' ~ b, and so we conclude that b = b'. Thus,
we may speak of the least upper bound.
The least upper bound of a set need not be a member of that set. For
example, the least upper bound of S = JO, 1[, namely 1, does not belong to S.
A set need, not have any upper bound. For example, the whole real number
system has no upper bound, and the set of positive integers has no upper bound.
In the "degenerate" case of the empty set 0, we regard any number as an upper
bound.
If b is an upper bound for the set S and b E S, then b is the least upper
bound. To see this, note that if d is any upper bound for S, then since b E S,
b ~ d, as required.
A useful alternative to the definition of least upper bound is stated next.
1.3.2 Proposition Let S c IR be nonempty. Then b E IR is the least uppe;
bound of S if! b is an upper bound and for every c > 0 there is an x E S such
that x > b - e.
The proof is found at the end of this chapter. However, the theorem should
be quite "obvious," because b sits just at the "top" (that is, to the "right") of the
§1.3 Least Upper Bounds
47
set Sand there are no "gaps" between it and the set S, so that for any e > 0 we
can take x just below b within a distance€. [Warning: This sort of plausibility
argument is intended to give you a feel for the statement-do not confuse it
with a rigorous proof.]
Similarly, a lower bound for a set S is a number b such that b ~ x for all
x E S. Also, b is c~led a greatest lower bound iff it is a lower bound and
for any lower bound c of S, c ~ b. As with least upper bounds, greatest lower
bounds are unique if they exist. The greatest lower bound is sometimes called
the injimum and is denoted by inf S, inf(S), glb S, or glb(S). As in 1.3.2, a
number c is the greatest lower bound for a set S iff c is a lower bound and for
every e > 0 there is an x E S such that x < c + €. If S C JR is not bounded
below, or is empty, we write inf(S) = -oo.
There are a few more or less obvious properties stated in the next proposition.
1.3.3 Proposition Suppose A c B c IR and A and B are nonempty. Then
infB ~ infA ~ supA ~ supB.
In 1.3.3 we assume that the inf's and sup's exist, or we use the conventions
-oo < x < +oo for any x E JR. The more subtle question of existence is dealt
with in the next theorem.
1.3.4 Theorem In JR the following hold:
i.
Least upper bound property Let S be a nonempty set in JR that has an
upper bound. Then S has a least upper bound in JR.
ii.
Greatest lower bound property Let P be a nonempty set in IR that has
a lower bound. Then P has a greatest lower bound in JR.
The proof is at the end of the chapter. However, this result should also be
fairly apparent. Indeed, if a bounded subset of JR had no least upper bound,
there-wolild be a "hole" at the top of the set, and a sequence of members of S
increasing toward that hole would not converge to an element in R Similarly,
we must have property ii. Using the methods of the proof, it is not difficult to
show that conditions i and ii are each equivalent to the completeness axiom for
an ordered field.
1.3.5 Example Consider the set S = {x
infS.
E JR
I x2 +x < 3}.
Find supS and
48
Chapter 1 The Real Line and Euclidean Space
Solution Consider the graph of y =
r
+ x (Figure 1.3-2). From elementary
calculus we see that for x = -1/2, y is a minimum. Thus Smay be pictured as
shown in Figure 1.3-2. •
y
y=x 2 +x
(- l, _ l)
2
4
FIGURE 1.3-2 The set Sin Example 1.3.5
r
The sup and inf clearly occur when +x = 3, or, from the g·uadratic formula,
when
- 1 ± v'f+TI -1 ± v'l3
2
2
Thus
x=----~=----
../f3-l
sup S = - - 2
and
·rs--<v'TI+I)
In . •
2
Exercises for §1.3
1.
Let S = { x I x3
2.
Consider an increasing sequence Xn that is bounded above and that converges to x. Let S = { Xn I n = 1, 2, 3, ... }. Give a plausibility argument
that X = sups.
3.
If P
4.
Let A CIR and B c IR be bounded below and define A +B = {x+y Ix EA
and y E B}. Is it true that inf(A + B) = inf A + inf B?
5.
Let S c [O, I] consist of all infinite tlecimal expansions x' = O.a 1a2a 3 • • •
where all but finitely many digits are 5 or 6. Find supS.
< 1}.
Find sup S. Is S bounded below?
c Q C JR., P -:/ 0, and P and Q are bounded above, show that
supP 5 supQ.
49
§1.4 Cauchy Sequences
§1.4 Cauchy Sequences
Although monotone bounded sequences converge, not every convergent sequence is monotone. The goal of this section is to characterize those sequences
that ought to converge. The terms of a sequence convergent to x are getting
close to x, and so they must also be getting close to each other. This property
has been named after Augustin Louis Cauchy (1789-1857), whose ideas form
much of the theoretical basis for modern calculus and analysis. Many of these
ideas are contained in his 1823 work, Resume des lefons donnees a l'ecole
royale polytecJmfquf sur le:_calcul infinitesimal.. (See The Origins of Cauchy's
Rigorous Calculus by J. Grabiner, MIT Press, Cambridge, Mass., 1981.)
1.4.1 Definition A sequence Xn of real numbers is called a Cauchy sequence if for every number€ > 0, there is an integer N (depending on €) such
that lxn - x,,,I < e whenever n 2: N and m 2: N.
This condition me~ns intuitively that the sequence "bunches up"; that is, all
the elements of the sequence are arbitrarily close to one another sufficiently far
out in the sequence.
If x,, converges to x, we claim that Xn is a Cauchy sequence. Indeed, given
e > 0 choose N so that lxn - xi < e/2 if n ~ N. If n,m ~ N, then lxn - Xml =
lxn -x+x-x,,,I S Ix,, -xi +lx-x,,,I < e/2+e/2 = e, which proves our assertion.
Here we have used the triangle inequality IY + zl S IYI + lzl. We summarize:
1.4.2 Proposition Every convergent sequence is a Cauchy sequence.
The next two theorems are central to much of analysis. The first is the basis
for many of the ideas about compactness in Chapter 3. It exploits the observation
that even when a sequence does not converge, some parts of it might. Consider
the sequence 1,--:1, 1,-1, 1,-1, ... defined for n 0, 1,2,3, ... by x,, (-It.
This sequence certainly does not converge, but if we look only at the terms
with even index: xo,x2 ,x4 , ••• we find the constant sequence I, l, 1, ... , which
certainly converges. If we look only at the terms with odd index: x 1 ,x3 ,x5 , •••
we find another constant sequence -1, - I, -1, ... , which also converges. We
say that we have found convergent subsequences. A subsequence of a sequence
xo,x1,x2, ... is a sequence formed by leaving out some of the terms of the
sequence while keeping others without changing their order. More precisely,
we select indices,n(l) < n(2) < n(3) < ••• and take the sequence with these
indices:
=
Xn(l),Xn(2),Xn(3), • • • •
=
Chapter 1 The Real Line and Euclidea,i Space
50
Theorem 1.4.3 is our first encounter with a very _important observation known
as the Bo/zano-Weierstrass Property. If a sequence in JR stays bounded, it must
cluster somewhere and have a subsequence (possibly many) that converges to
some point in IR. This depends heavily on completeness. The limit that ought
to be there must not be missing. The second theorem is dosely linked to the
first and supplies the version of completeness that turns out to be appropriate
for generalization.
1.4.3 Th'eorem Every bounded sequence in JR has a subsequence that converges to some point in R
1.4.4 Theorem Every Cauchy sequence in JR converges to an element ofR
1.4.5 Corollary If a and b are in JR and a < b, then every sequence of
points in the interval [a, b] = {x I a $ x $ b} has a subsequence that converges
to some point in [a,b].
The property expressed in Theorem 1.4.4 is called Cauchy completeness. It
is often given as the definition of completeness, but does not, by itself, substitute for the least upper bound axiom. However, together with the Archimedean
principle, it does. A Cauchy-complete Archimedean ordered field satisfies the
least upper bound property. (Without the Archimedean property it might not. A
bounded, decreasing sequence such as 1, 1/2, 1/3, 1/4, 1/5, ... might not converge to anything. The only likely candidate for a limit is 0, but without the
Archimedean prope1ty it need not converge to 0. For a similar reason it would
. not be a Cauchy ~equence.) ·
One intuition for Theorem 1.4.4 runs as follows. If we ignore the first N
terms of a Cauchy sequence, we know that the remaining terms will be bunched
together. As we ~isregard more and more terms, the remainder of the sequence
becomes more tightly grouped and squeezes down to some limiting number, the
limit_ of the sequence. To see more precisely how this is done requires more
care, and so the actual proof is our only recourse. The following lemmas contain
the strategy for proving Theorem 1.4.4 fi:om 1.4.3.
1.4.6 Lemma · Every Cauchy sequence is bounded.
'
.
1.4.7 Lemma If a subsequence of a Cauchy sequence converges to x theri
the sequen<;e ifself converges to x.
Exercises/or §1.4
51
Lemma 1.4.6 is reasonable, since a Cauchy sequence has terms that are
bunched together-say, by a distance 1 from some point onward-and leaving
out a finite number of points does not affect boundedness. As for Lemma 1.4.7,
the idea is that the convergent subsequence must drag the entire sequence toward
the limit, since all of the terms are close to one another beyond a certain point.
To prove 1.4.4 from 1.4.3, note that by 1.4.6 the Cauchy sequence is bounded,
and so by 1.4.3 it has a convergent subsequence. By 1.4.7, the whole Cauchy
sequence must converge.
As for Theorem 1.4.3, it is plausible that a bounded sequence has to bunch
up somewhere, and from this one can extract a convergent sequepce. The main
technical difficuliy one faces in the precise proof is how to use the completeness
of IR; specifically, how to relate the given sequence to a monotone one. Try
to think of some ideas before looking at the proof-such attempts, even if
unsuccessful, can be of benefit in sharpening your mathematical skills.
1.4.8 Example Let Xn be a sequence of real numbers with. the property that
lxn - Xn+d
$ 1;2n. Show that Xn converges.
Solution . If we show that
Xn is a. Cauchy seque{!Ce, the result will follow
from Theorem 1.4.4. We can write, by the triangle 'inequality,
lxn - Xn+kl $ lxn - Xn+il
l
I
+ lxn+I
- Xn+2I
l
+
0
•
0
+ lxn+k-1
- :tn+kl
2
<
<- -+-+···+-2n 2n+l
zn+k-1 - 2n'
since a+ar+ar 2 +- • • +arn = a(] - rn+t )/(1- r) < a/(1- r) if O < r < I. Thus
lxn -x,,,j :5 J/2n-t if m ~ n. Given e > 0, if we choose N so that l/(2N- 1) < e,
then x,. satisfies the definition of a Cauchy sequence. •
·
Exercises for §1.4 ·
1.
Let Xn satisfy lxn - Xn+i I < 1/3n. Show that Xn converges.
2.
Show that the sequence Xn = e5in(Snl has a convergent subsequence.
3.
Find a bounded sequenc·e with' three subsequences converging to three
different numbers.
52
4.
Chapter I The Real Une and Euclidean Space
Let Xn be a Cauchy sequence. Suppose that for every e > 0 there is some
n > 1/e such that lxnl < e. Prove that Xn--+ 0.
_,,
·':
r .. '
5.
r
I
~'
...-;A
r
~
I :
I
: ",
~
_,
True or false: If Xn is a Cauchy sequence, then for n and m large enough,
d(Xn+1,X111+1) 5 d(Xn,X111),
§1.5 Cluster Points; lim inf and lim supA useful tool in the study of convergent sequences is the notion of a cluster
point:
4-.5..1:.0efiiiitiofi-' A point x is called a clusterf1Dint of the sequence Xn if
for every € > 0 there are infinitely many values of n with lxn - xi < e.
For example, both 1 and -1 are cluster points of the sequence 1, -1, l, -1,
1, -1, ... ; notice that this sequence does not converge. However, the next proposition shows that there is a relationship between convergence and cluster points.
£5:2::'Proposifioit Let Xn be a sequence in IR and let x E IR.
> 0 and for each N,
i.
x is a cluster point of Xn if and only if for each
there is an index n > N with lxn - xi < £.
i~
x is a cluster point of Xn if and only if there is a subsequence of Xn that
converges to x.
iii.
Xn--+ x if and only if every subsequence of Xn converges to x.
iv.
Xn -+ x if and only if the sequence is bounded and x is its only cluster
point.
v.
Xn -+ x if and only if every subsequence of Xn has a further subsequence
that converges to x.
€
Consider the sequence 1, 0, -1, 1, 0, -1, 1, 0, -1, .... This sequence also does
not converge. It has three cluster points: 1, 0, and -1. Of these, I and -1
53
§1.5 Cluster Points; /im inf and lim sup
are particularly interesting, being the largest and smallest of the cluster points.
It seems obvious at first, less obvious after more thought, that every bounded
sequence of real numbers has largest and smallest cluster points. We want to
try to capture this in a definition for general sequences in IR.
\':S1Definition Let Xn be a sequence in JR that is bounded above. The
limit superior of Xn is the largest cluster point of Xn, i.e., the sup of the set of
cluster points. If the set of cluster points is empty, we set limsupxn = -oo; if
the sequence is not bounded above, we set limsupxn = +oo. Similarly, if Xn is
bounded below, the limrl-inferior of Xn is the smallest cluster point of x,., i.e.,
the inf of the set of cluster points. If the set of cluster points is empty, we set
liminfx,. +oo, and if Xn is not bounded below, we set liminfxn -oo.
=
=
The limit supeijor is denoted by
and the limit inferior by
lim inf Xn,
lim infn-+ooXn, . Or lim
X11 •
~-Examples
a.
For the sequence 1,0,-l,l,O,-l,l,0,-1, ... ,liminfxn = -1 and
lim sup Xn = 1.
b.
The sequence Xn = l / n converges to 0, which is the only cluster point.
Thus lim Xn lim sup x,. lim inf x,. = 0.
c.
Let Xn 0 if n is even and x,. n if n is odd; i.e., Xn = 0, I, 0, 3, 0, 5, 0, 7, ....
Here, lim inf Xn = 0. The sequence is not bounded above, and so we put
lim sup Xn = +oo.
d.
Let
e.
Even if a sequence is bounded, the lim inf is not necessarily the infimum
of the set {x1,x2 ,x3 , •• • }, nor need the lim sup be its supremum. This is
so because we are interested only in cluster points, and any finite number
of the entries qn be thrown away without changing the cluster points.
=
=
Xn
=
=
=n, i.e., 0, 1,2,3,4, .... Here, lim sup =liminfxn = +oo.
Xn
Chapter I The Real Line and Euclidean Space
54
For a more complicated example, consider the sequence Xn defined to be
I+ 1/j if n = 5j
1 - 1/j if n = 5j + 1
0 if n = 5j +2
- 1 + 1/j if n = 5j + 3
-1 - 1/j if n = 5j + 4
=
wherej
2,3,4,5,6,7,8, ... and n
begins with
= 10,11,12,13, .... The
sequence
as shown in Figure 1.5-1.
Xi3
X14
I
X19
I I
I 1111+1111 I
X13
I I
I
0
1x20
I I
I I
II
I I Xu x,o
I
I I I I I 11 • 1II I I I I
I
1
131
3
2
141
I
.2
3
FIGURE 1.5-1
I
X"i1 I
11
II
11
Xu x,6 I I
I I
I
I I
I
1x,s
I I
I I
I
I
I
~
Xu
I
~
151
1
141
I
I .1
4
6
s
s
2
4
3
The sequence of Example 1.5.4, part e
There are subsequences converging to -1, 0, and I. Subsequences approach both +I and -1 from both above and below. Thus, limsupxn = 1
and lim inf Xn = -1.
A somewhat simpler example is:
C.
Xn
=(-l)n(l + 1/n), n =l,2,3,4, .... The first few terms are
3 45 67 89
- 2• 2' - 3' 4' - 5' 6' - 7' 8' ....
The odd-index terms increase to -1, and the even-index terms decrease
to 1, as in Figure 1.5-2. Thus, liminfxn -1 and limsupxn 1.
•
=
=
§1.5 Cluster Points; litn
uif and litn
sup
55
I II •
•
4 11 -I
- - 11
-2
0
11 I
I
3 I'
I ',
6
8
-5 -1
lim inf (xn)
= -1
FIGURE 1.5-2 The sequence of Example 1.5.4, part f
Keep this picture in mind as we proceed. The key idea is that a cluster point of a sequence remains a cluster point if finitely many tenns are discarded. If xis a cluster point of x 1,x2,x3,X4, ... , then it is also a cluster point of
Xn+1,Xn+2,Xn+3, .... In using lim inf and lim sup, it is sometimes useful to have
a direct €-N characterization.
cl::S:S::Pmpositiru:i, Let Xn be a sequence in IR. Then
i:
ii.
If Xn is bounded below, a number a is equal to lim inf Xn if and only if
> 0,
there is an N such that a - c: < .Xn whenever n 2'.: N,
a.
For all
and
b.
For all € > 0 and for all M, there is an n
€
> M with Xn < a+ c:.
If Xn is bounded above, a number b is equal to lim supxn
if and only
> 0, there is an N such that Xn < b + € whenever n ~ N,
a.
For all
and
b.
For all€ > 0 and for all M, there is an n > M with b - c < Xn-
€
if
An examination of Figure 1.5-2 should make this proposition plausible. Using it, we can develop yet another characterization of lim inf and lim sup. For
a sequence X1,X2,X3, ... in R. let Sn
{xn+1,Xn+2,Xn+3,···}, On
inf Sn, and
bn = sup Sn. (Recall that if Sn is not bounded above, then sup Sn= +oo, and if it
is not bounded below, then inf Sn= -oo.) Then S1 ::> S2 ::> S3 ::> S4 ::> Ss ::> •.••
If n 5 k, then sk c Sn, so that Ok 5 bk 5 bn. If k 5 n, then Sn c sk, and so
ak 5 an 5 bn. Thus, the a's and b's are arranged as
=
=
(Note that +oo and -oo are still allowed as values for an and bn.) The reader
should compute the first few of these for Example 1.5.4, part r.
56
Chapter 1 The Real Line and Euclidean Space
1.5.6 Proposition If Xn is a sequence in JR. then
limsupxn = inf{sup{Xn+J,Xn+2,Xn+3, ... } In= 1, 2, 3, ... }
liminf Xn = sup{inf{Xn+!,Xn+2,Xn+3, ... } In= 1, 2, 3, ... }.
Setting Sn= {xn+1,Xn+2,Xn+3, .. ,}, bn = sup Sn, and an= inf Sn, we can rewrite
this as
limsupxn
=inf{supSn}
=inf{b1,b2,b3,b4, ... }
n
liminf Xn = sup{inf Sn} = sup{ a,, a2, a3, a4, ... }.
n
Some general properties follow.
1.5.7 Proposition Let Xn be a given sequence in JR. Then
~
i.
lim inf Xn
ii.
If Xn
iii.
If M :5 Xn for all n, then lim inf Xn 2:: M.
iv.
lim sup Xn = +oo if and only if Xn is not bounded above.
v.
lim inf Xn = -oo if and only if x,. is not bounded below.
vi.
If x is a cluster point of Xm then lim inf Xn :5 x :5 lim sup Xn,
vii.
If a = lim inf Xn is finite, then it is a cluster point of Xn.
viii.
If b = lim sup x,. is finite, then it is a cluster point of x,..
ix.
x,.
-+
~
lim supx,,.
M for all n, then lim sup Xn
~
M.
x E IR if and 011/y if lim sup x,. = lim inf x,. = x E JR.
Exercises for §1.5
1.
Let x,. = 3 + (- I rO + I/ n). Calculate Jim inf x,. and Jim sup x,..
2.
Find a sequence Xn with Jim sup x,.
3.
Let x,.
Show that x,. has subsequences u,. and l,. with u,.
=5 and lim inf x,. =-3.
be a sequence with lim sup x,. = b E IR and lim inf x,. = a E JR.
-+
band I,.
-+
a.
4.
Let limsupx11 = 2. True or false: }f n is large enough, then x,. > I.99.
5.
Trne or false: If lim supx,, = b, then for n large enough, x,.
~
b.
§1.6 Euclidean Space
57
§1.6 Euclidean Space
In this book we work primarily with one-, two-, or three-dimensional Euclidean
space. In many applications, and in developing analysis in three-space, higher
dimensional spaces naturally occur. Therefore, it is important to treat the general
case, but we usually fall back on the case of one-, two-, or three-space for
visualization and intuition. Let us begin with the formal definition.
1.6.1 Definition Euclidean n-space, denoted lRn, consists of all ordered
n-tuples of real numbers. Symbolically,
]Rn= {(xi, ... ,Xn)
I Xt,· .. ,Xn E JR}.
Thus, JR" = JR x • • • x lR (n times) is the Cartesian product of IR with itself n
times.
Elements of JR" are generally denoted by single letters that stand for n-tuples
such as x = (x 1, ••• , Xn), and we speak of x as a point in lR".
Addition and scalar multiplication of n-tuples are defined by
and
a(x1, ... , Xn) = (ax1, ... , axn)
for a E JR.
The geometric meaning of these operations is reviewed in Figure 1.6-1 for
n = 3.
The spaces IR 2 and JR 3 and more generally IR" are examples of a structure
from linear algebra called a vector space. Just as we did with fields, we single
out the vital properties of the familiar spaces, IR 2 and 1R3, and tum them into a
definition. Basically, a vector space is a collection of objects called vectors that
we can add, subtract, and multiply by numbers.
1.6.2 Definition A real vedor space V is a set of elements called v_ectors,
with given operations of vector addition+ : VxV -+ V and scalarmu/Jiplication
· : JR x V -+ V such that:
l
v + w = w + v for every v and w in V.
iL
(v+u) +w = v +(u +w).
-Ju.
There is a zero vector O such that
v + 0 = v for every v in V.
commutativity
associativity
zerQ vector
Chapter 1 The Real Line and Euclidean Space
58
X
= (x1 ,Xz,-'j)
-x
FIGURE 1.6-1
iv.
v.
vi.
vii.
viii.
Addition and scalar multiplication
For each v in V there is a vector -v
such that v + (-v) = 0.
negatives
,\ · (v + w) = ,\ · v
For ,\ E JR and v, w in V,
+ ,\ · w.
distributivity
For ,\, m E JR and v in V,
,\(m · v) = (,\,n) · v.
associativity
For,\, m E JR and v in V,
(,\ + m) . V = ,\ . V + m . V.
distributivity
1 · v = v for every v E V.
multiplicative identity
A vector space over a general field lF is defined in the same way by substituting IF for JR throughout. In analysis we are concerned mostly with vector spaces
over JR, which we call real vector spaces, and over the complex numbers C,
which are called complex vector spaces. (See §1.8 for a discussion of C.) A
subset of Vis called a subspace (or linear subspace or vector subspace) if it is
itself a vector space with the same operations.
1.6.3 Lemma If W is a subset of a vector space V over a field IF, then W
is a vector subspace over IF if and only if ,\v + µw E W whenever ,\ andµ are
in lF and v and w are in W.
The reader can supply the proof or review the relevant section of a linear
algebra text.
59
§1.6 Euclidean Space
FIGURE 1.6-2
Hyperplane and affine hyperplane
In particular, an (n - 1)-dimensional linear subspace of IR" is called a hyperplane. An affine hyperplane is a set x + H, where H is a hyperplane and
x e IR" and where x + H means the set of all x + y as y ranges through H; thus
x + H = {x + y I y E H}. See Figure 1.6-2.
1.6.4 Theorem Euclidean n-space with the operations ofaddition and scalar
multiplication previously defined is a vector space of dimension n.
The proof is a straightforward check of the axioms for a vector space, which
we shall leave for Exercise 16 at the end of the chapter. This theorem should be
no surprise; after all, a vector space is an abstraction of the basic properties of
vectors in Euclidean space. We can show that IRn has dimension n by exhibiting
a basis with n vectors; for example, the standard basis {e 1 = (1, 0, ... , 0),
e2 = (0, 1,0, ... ,0), ... , en= (0,0, ... ,0, l)}.
Using the standard basis, the components of x = (x 1 , • •• , Xn) are just xi, . .. , Xn.
Using another basis for IRn, the components would be different. This means that
if e1, . .. , en denotes the standard basis, x =
1 x;e;, but if f1, . .. ,fn is another
basis, x =
y;f; for some (possibly different) numbers Y1t ... ,y,.. We have·
not defined the terms "basis" or "component" in general. We refer you to your
·
linear algebra course.
E7=
E7=i
1.6.5 Definition The length or norm of a vector x in IRn is defined by
n
llxll = { ~x;
}1/2
Chapter I The Real Line and Euclidean Space
60
where x = (xi, ... , Xn), The distance between two vectors x and y is the real
number
n
d(x,y) = llx - YII =
{
~(x; - y;)2
}1/2
The inner product of x and y is defined by
n
(x,y} = LX;y;.
i=I
Thus we have llxll2 = {x,x}. In JR.3, the reader is familiar with another
expression for (x,y}, namely, {x,y} = llxll 11Yllcos0, where cos0 is the cosine
of the angle formed by x and y. See Figure l.6-3.
X
FIGURE 1.6-3 Length and inner product
We now summarize the basic propenies of these operations:
1.6.6 Theorem For vectors in IR.n, we have
l.
Properties of the inner product:
i.
(x, x} 2: 0.
ii.
(x,x)
iii.
(x,y + w) = (x,y} + (x, w).
=0 iff x =0.
positivity
nondegeneracy
distributivity
§1.6 Euclidean Space
iv.
v.
II.
ii.
iii.
iv.
IV.
E lit
mulnplicativity
symmetry
Properties of the norm:
i.
III.
(o.x,y) =a(x,y) for a
(x,y) = (y,x).
61
llxl! ~ o.
llxll =0 iff x =0.
!!axll = lal !lxll for a E lit
llx+yll s llxll + IIYII-
positivity
nondegeneracy
multiplicativity
triangle inequality
Properties of the distance:
i.
d(x,y)
ii.
d(x,y)
~
0.
iii.
=0 iff x =y.
d(x,y) =d(y,x).
iv.
d(x,y) S d(x,z) +d(z,y).
positivity
nondegeneracy
symmetry
triangle inequality
The Cauchy-Schwarz ineqUfJlity:
I(x, y) I s Ilxll IIYIIProperties i-v of the inner product are almost obvious algebraically. Once
we- have the Cauchy-Schwarz inequality, the properties of the norm and distance
follow easily.
Here is one proof of the Cauchy-Schwarz inequality; another is given in the
theorem proofs section. We want to prove that I(x, y) I S Ilxll IIYI 1- The cases
x 0 or y 0 are trivial, and so we can assume x ~ 0 and y ~ 0. Rescaling
and Y by rewriting l(x,y)I S llxll llYII as l(x/llxll, y/llYll)I S I, we can assume
llxll IIYII l; then we need to prove that l(x,y)I :S I. Consistent with the
l)thagorean theorem, we claim that .
'
=
x
=
=
=
ll(x,y)yll 2 +llx - (x,y}yll 2 = llx112 = 1,
since (x,y)y should be the projection of the vector x along the vector
Figure 1.6-3. To verify this algebraically, write
y,
as in
ll(x,y)yll 2 + llx - (x,y}yll 2 = (x,y} 2 IIYll 2 + (x - (x,y)y,x - (x,y)y)
= (x,y) 2 IIYll 2 + llx112 - 2(x,y) 2 + (x,y) 2 IIYll2
= l!xll 2
Chapter I The Real line tlnd Euclidean Space
62
Bx+yll< Dxl+ DyU
FIGURE 1.6-4 Triangle inequality
since
IIYII = 1.
Therefore
l(x,y)l 2
=ll(x,y)yl!2 ~ ll(x,y)yll2 + llx -
(x,y)yll2 =llx112
=1,
and so the Cauchy-Schwarz inequality follows. This proof also shows that if
the Cauchy-Schwarz inequality becomes an equality, then x and y are linearly
dependent. (The tenn llx - {x,y)yll 2 in the last display is zero in this case, and
so x = (x,y}y.) Note that IV follows just from the properties I.
Many of these properties should also be evident geometrically. For example,
iv in II and III expresses the fact that the length of one side of a triangle is less
than or equal to the sum of the lengths of the other sides (Figure 1.6-4).
A set with a function d obeying the rules of group III is called a metric
space. A vector space with a nonn obeying the rules in group II is called a
normed space, and a vector space with an inner product obeying the rules in
group I is an inner product space. We will look at these concepts in §1.7
and show that properties III follow from II, and properties II follow from I.
Therefore, all that is needed is to show that the proposed inner product on Rn
satisfies the properties in I. (Aside: The famous inequality of group IV should
properly be called the Cauchy-Bunyakovskii-Schwarz inequality.)
Generalizing the concepts from IR.3, we call x,y E Rn orthogonal iff (x,y) =
0. Two subspaces S and T are orthogonal iff (x,y} = 0 for all x e S and
y E T. If in addition S and T span Rn, then they are called orthogonal
complements.
Exercises/or §1.6 ,
63
An observation that will be helpful to us when studying sequences in IR.n is
the (ollowing:
1.6.7 Proposition If v = (v1, ... , vn) and w = (w1, ... , wn) are vectors in
Rn and we let p(v, w) = max{lv1 - W1 I, lv2 - w2I, ... , lvn - wnl}, then
p(v, w) $ llv -
wll $
../np(v, w).
1.6.8 Example Find the length of the line segment joining (1, l, l) to (3, 2,0).
Solution This length is the length of the vector (3, 2, 0)- ( l, l, l) = (2, l, - l ),
which represents the vector from (1, l, l) to (3,2,0). The length is
11(2, 1,-1)11 = ✓22 + 12 + (-1)2 = ./6.
•
1.6.9 Example In IR.3, find the orthogonal complement of the line x =y
=
z/2 (or xi = X2 = X3/2 in different notation).
Solution This line, call it I, is the one-dimensional subspace spanned by
the vector (1, l, 2) (see Figure l.6-5). The orthogonal complement is a plane
(through the origin, since it is a subspace) and so has an equation of the form
Ax+ By+ Cz
=0,
i.e.,
((A, B, C), (x,y, z))
=0.
Thus, (A, B, C) is normal to the plane, but (I, l, 2) is a vector perpendicular
to the" plane, so that the orthogonal complement sought is the plane x + y +
2z =0. •
Exercises for §1.6
1.
If llx + YII =
origin.
llxll + IIYII, show that x and y lie on
2.
What is the angle between (3, 2, 2) and (0, l, 0)?
3.
Find the orthogonal complement of the plane spanned by (3, 2, 2) and
(0, l, 0) in IR. 3•
the same ray from the
Chapter 1 The Real Line and Euclidean Space
64
z
£
I
I
I
I
I (1,1,2)
X
FIGURE 1.6-5 The line I and its orthogonal complement in Example 1.6.10
={x E lR3 I llxll :$ 3} and Q ={x E IR3 I llxll < 3}.
4.
Describe the sets B
S.
Find the equation of the line through (1, I, I) and (2,3,4). Is this line a
linear subspace?
§1.7 Norms, Inner Products, and Metrics
Among the properties of IR.n is its metric property, giving a way of measuring
distance between points. A metric space is a set M equipped with a function
d : M x M ...... JR. that gives a reasonable way of measuring the distance between
two elements of M. The notation and language are meant to recall the familiar
distance between points in Euclidean 1-, 2-, or 3-space but not to be restricted
to that.
1.7.1 Definition A metric space (M,d) is a set Mand a junction d: M x
M ...... JR such that
~
i.
d(x,y)
ii.
iii.
=0 if and only if x =y.
d(x,y) =d(y,x)for every x,y EM.
iv.
d(x, y) $ d(x, z) + d(z, y) for all x, y, z E M.
0 for all x,y EM.
d(x,y)
positivity
nondegeneracy
symmetry
triangle inequality
§1.7 Nonns, Imier Products, and Metrics
65
1.7.2 Examples
a.
We saw in 1.1.6 that the real line JR itself is a metric space with metric
defined by d(x,y) = Ix - YI• Readers who are familiar with complex
numbers will recognize that C is also a metric space with d(z, w) = lz- wl;
see §1.8 for details. Similarly, 1.6.6 shows that ]Rn is a metric space (with
the standard metric).
b.
Discrete metric Let M be any set and let d(x, y) = 0 if x
if x 'F y. Then d is a metric on M.
c.
Word metric This example is useful in combinatorics and computer science. Let M be the set of all "words," each word consisting of an ordered
8-tuple of Os and 1s.
w = (w,, w2, W3, w4, w5, w6, w1, ws)
=y and d(x, y) =1
with each wk = 0 or
l.
Let d(v, w) = the number of places in which v and ware different. For
example, if v = 01100011 and w 00110101, then d(v, w) 4, since v
and w differ in the second, fourth, sixth, and seventh bits. Then d is a
metric on M. (In establishing its properties it may help to check that
d(v, w) =
lvk - wkl.J
=
=
z:::..,
d.
Bounded metric If d is a metric on a set M and p(x,y) is defined by
p(x,y) = d(x,y)/(1 + d(x,y)), then p is also a metric on M, as one can
verify (see Exercise JO for Chapter ]). This new metric has the property
that p(x,y) < 1 for all x,y e M; i.e., p is bounded by 1.
•
Although metric spaces are often vector spaces, this is not always the case,
as the discrete metric space shows. We tum now to concepts that work only in
(_vector spaces'.')
7
1.7.3 Definili::i nonned spac;;i,o, nonned vecio, space) (V, 11 • I[) ;, a
vector space V and a function 11 • II : V
i.
llvll
ii.
=0 if and only if v =0.
II.Xvii= I.XI llvll for every ve V
and every scalar .X.
iii.
~v.
~ 0 for all v
e V.
llvll
llv+wll::; llvll+llwll
--+
JR called a nonn such that
positivity
,iondegeneracy
multiplicativity
tria,igle inequality
66
Chapter I The Real Une and Euclidean Space
~
We saw in 1.1.5 that JR. is a nonned space with llxl I = lxlReaders familiar with complex numbers will recognize that C is a normed
space with llzll = lzl.
a.
Realnumbers
b.
Taxicab nonn
Consider the space R.2 , but instead of the usual norm on
it, set ll(x,y)II, = lxl + IYI- Then II· Iii is a norm on IR. 2• The name comes
from the associated distance. If P (x,y) and Q (a,b), then
=
d,(P,Q) =
=
IIP - Qlli =Ix-al+ IY- bl.
This is the sum of the vertical and horizontal separations. You must travel
this distance to get from P to Q if you always travel parallel to the axes
(stay on the streets in a taxicab). See Figure 1.7-1.
y
--+---x.___ _ _ _ _ _....a - -• x
FIGURE 1.7-1
c.
The taxicab metric
Supnmum norm Let~= all real-valued functions on the interval (0, 1]
that are bounded. That is, let
M = {f: [0, 1] - JR. I there is a number B
with 1/(x)I :$ B for every x E [0, 11}.
For each/ in M, f([O, 1]) is a bounded subset of JR., and so {lf(x)I
[0, 11} is also. It then has a finite least upper bound and
llfll= = sup{l/(x)I
Ix E
Ix E [0, 11}
defines a function 11 • lloo : M - R The set Mis a vector space, and 11 • lloo
is a norm on it. We will thoroughly examine this norm ip Chapter 5.
•
§1.7 Norms, Inner Products, and Metrics
67
As we remarked earlier in the discussion of Rn, norms always produce
metrics.
!l;.7.;S-:f.roR.os.i.t~n , If (V, II · II) is a normed vector space and d(v, w) is
_defined by
d(v, w) = llv - wll,
then d is a metric on V.
The metrics that norms produce sometimes give a measure of the difference
between two vectors. We saw one example already in the taxicab metric. In the
example of the space of bounded functions, the metric is
d(j,g) = II/ - glloo = sup{l/(x) - g(x)I IO~ x
:5 I}.
Thus, the metric given by the sup norm is the largest vertical separation between
the graphs, as in Figure 1.7-2.
y
I
I
I
I
I
If
- -•
FIGURE 1.7-2
x
The sup distance between functions is the largest vertical
distance between their graphs
Although many interesting metrics come from nonns, not all do. Unless the
vector space consists only of the zero vector, choose a nonzero v and note that
IIAvll = IAI llvll --+ oo as IAI -+ oo, so that d(Av,0) becomes arbitrarily large.
Therefore, Examples 1.7.2b and d c1:1nnot be produced from nonns.
68
Chapter 1 The Real Line and Euclidean Space
lt,1:«£De1imtfon.1
A real vector space V with a function (·, ·} from V x V
is called an inner product space if
i.
(v, v}
~
ii.
(v, v}
=0 if and only if v =0.
iii.
(..\v, w}
0 for all v E V.
v.
R
positivity
nondegeneracy
=..\(v, w) for all v, w E V
and for all
iv.
--+
~
E IR.
multiplicativity
=
(v, w + u} (v, w} + (v. u)
for all v, w,u EV.
(v, w}
distributivity
=(w, v} for all v, w E V.
symmetry
The most important example is Rn with the standard inner product, but there
are others.
1.7.7 Example
ontinuous functions Let V be the space of all continreal-valuedfun ions on the interval [0, l]. (Continuity is studied in detail
in Chapter .
r now we use the notion informally. assuming that the reader
is familiar with it from calculus.) As the reader is probably aware. sums of
continuous functions are continuous; we shall soon verify this officially. Thus,
V is a vector space, usually denoted by C([0, I]), and is of great interest in
analysis. We will confirm in Chapter 4 that any continuous function on [0, I] is
bounded, so that the supremum norm defined in Example 1.7.4c makes sense on
C([0, l]). We will see in Exercise 30 at the end of this chapter that this norm
cannot be associated with an inner product. There is. however, a very interesting
inner product on C([O, l]) not associated with the supremum norm. Products of
continuous functions are continuous, and continuous functions are integrable.
We will verify these assertions later, but for now we will assume them and define
an inner product on C([0, I]) by
(f, g} =
1
1
/
(x)g(x)dx.
It is straightforward (given the basic properties of integrals) to check that this
is an inner product. It will be important to us in Chapter JO in the study of
~Fourier series and Fourier analysis.
•
Inner products, when available, are very powerful tools. As in Rn. they supply a notion of orthogonality: Two vectors v and ware orthogonal if (..\v, w} = 0.
§1.7 Nonns, Inner Products, and Metrics
69
They also give us one of the most useful inequalities in analysis, the CauchySchwarz (or Cauchy-Bunyakovskii-Schwarz) inequality. Before turning to it we
collect some easy but useful facts.
a:2.1'-liroposition' If(·,•} is an .inner product on a real vector space V,
then
i.
(Av+µw,u}=A(v,u}+µ(w,u}
ii.
(u, ,\v + µw) = A(u, v} + µ(u, w}
iii.
(v, Aw} = A(v, w}
iv.
(0, w} = (w,O} = 0.
The proof is straightforward and is left to the reader.
~ ~eauchy-Schwarz·Inequalitr ,If (V, ( · ,: }) is an inner product space,
then
l(v, w}I::; ylM ✓(w, w}
for every v and w in V.
The proof we gave in the 'Preceding section is valid in this abstract context,
too! Besides other uses, this inequality supplies the key to showing that every
inner product produces a nonh.
1:7..1:0-Rroposition_
lf(V, (·,·})is an inner product space and 11·11 is de.fined
for v in V by
.then
llvll = ~•
II· II is a norm on V.
The main point here is that one gets the triangle inequality as an easy consequence of the Cauchy-Schwarz inequality:
llv+wll2
=(v+w,v+w} =llvll2 +2(v,w} + llwll2
$ llvll2 + 2l(v, w}I + llwll 2
$
llvll2 + 2llvll llwll + llwll2 = <llvll + llwll>2 ,
and so llv + wll $ llvll + llwll.
Chapter I The Real Line and Euclidean Space
70
Exercises for §1.7
=
=x for both the sup norm
I.
In C([O, 1]) find d(J,g) where /(x) 1 and g(x)
and the norm given in Example 1.7.7.
2.
Identify the space of polynomials of degree n - I in a real variable with Rn.
In the corresponding Euclidean metric, calculate d(J, g) where/(x) = I
and g(x) =x.
3.
Put the inner product (/, g) = J01/ (x)g(x)dx on C([O, I]).
Cauchy-Schwarz inequality for f(x) 1 and g(x) x.
4.
=
=
Verify the
Using the inner product in Exercise 3, verify the triangle inequality for
f (x) = x and g(x) = x2.
5.
Show that
11 '"lloo
is not the norm defined by the metric in Example 1.7.7.
~.l.8 The Complex Numbers
Note. The following material is important in analysis in general but is not
required for the bulk of this book, except for Chapter 10. You will see much
more of this in courses on complex variables; see, for instance, Basic Complex
Analysis by J. Marsden and M. Hoffman, 2d ed., W. H. Freeman, New York,
1987.
The real numbers JR were created to supply a continuous unbroken line of
numbers suitable for the needs of geometry and calculus. However, there are
still a few problems: While we have supplied a square root of 2, so that the
equation x2 - 2 = 0 has a solution, x2 + 2 = 0 has none. Complex numbers
supply solutions to such equations. The set of complex numbers C is created
following the clue given by the quadratic fonnula, which states that the equation
ax2 + bx+ c = 0 with a -,. 0 has solutions x = -b ± Jb2 - 4ac/2a. For the
equation x2 + x - 6 0 this produces the solutions x 2 and x = -3. For the
equation x2+2x+5 = 0, it suggests x = -I ±2A. In school we are sometimes
taught to conclude that there is no solution. But suppose we try to find one in
a spirit of mathematical playfulness. If these values of x are inserted into the
equation and manipulated according to the rules of arithmetic with the proviso
that (..;=T)2 = -1, they do indeed "solve" the equation. Since the quadratic
fonnula always gives real solutions i~ b2 - 4ac ~ 0 and suggests "solutions"
=
=
§1.8 The Complex Numbers
71
z = (-b/2a) ± (✓4ac- b2 /2a)A if b2 - 4ac < 0, we are led to "play" with
objects z = x + y A where x and y are real. To simplify notation, we use the
symbol i to stand for A. We then manipula~ the expressions x+iy according
to the rules of arithmetic with the proviso that i2 = -1. If x,y,a,b E JR, then
we find that
(x + iy) +(a+ ib) = (x +a)+ i(y + b)
(x+ iy) + (0 + iO) =x+iy
(x+ iy) + (-x- iy) = 0 + iO
(x + iy) · (a+ ib) = (ax - by)+ i(bx + ay)
(x+ iy) • (1 + iO) =x+iy.
Can we divide? At least fonnally, we can write
1
1 X- iy
X
• -y
--=-----=--+1-x + iy x + iy x - iy x2 + y2
x2 + y 2 •
If at least one of x or y is not 0, then this is an allowable expression in our
system. Direct multiplication con6nns that
.._
( __, x 2 + i 2-y 2 )
~+y
x+y
•
(x + iy) = (1 + i0).
The complex numbers are given a logical basis and a geometric realization by
identifying the construction x + iy with the ordered pair (x,y) and then with a
point in the Euclidean plane of analytic geometry. The horizontal axis consisting
of points (x, 0) is thus identified with the real numbers by x "=" x;t;.i0 "=" (x, 0).
It is referred to as the real axis. The vertical axis, consisting of points (0,y), is
traditionally referred to as the imaginary axis. The point (0, 1) is identified with
i and is often called the imaginary unit. The rules of addition and multiplication
become definitions:
(x,y) + (a, b) = (x + a,y + b)
(x,y) · (a,b) = (ax- by,bx+ay).
It is a tedious but routine chore to check that the plane with these operations,
-and with (0, 0) as additive identity and (1, 0) as multiplicative identity, is a field.
(It satisfies properties 1-10 of the field axioms given in §1.1.) The reciprocal of
a nonzero point is given, as suggested in our fonnula for 1/(x + iy), by
<x,y>- 1 =
(A. -h).
X
+y
X
+y
72
Chapter 1 The Real Line and Euclidean Space
Furthermore, the real numbers can be identified with the horizontal axis by
x - (x, 0), and everything agrees: x + y - (x, 0) + (y, 0) and xy - (xy, 0).
Finally, we check that (0, 1) · (0, I) = (-1, 0) - -1, so that we have a square
root for -1.
The resulting field is called the complex numbers and is denoted by C.
However, C cannot be an ordered field. In an ordered field we know that z2 ~ 0
for every element z and that -1 < 0. But in C we have an element i with
,-i = -1. Consequently, there is no way to introduce an ordering in C that has
all of the order properties in the definition of an ordered.field.
Identification of C with the plane JR. 2 gives it a geometric reality. Addition
is the same as vector addition in the plane and follows the parallelogram law,
as in Figure 1.8-1.
y
FIGURE 1.8-1
Complex numbers are added like vectors, illustrating (a+bi)+
(c + di) =(a+ c) + (b + d)i
The geometric meaning of multiplication becomes clear if we use polar
coordinates. Suppose
z = (rcos 0, rsin 0) = rcos 0 + irsin 8
w = (p cos cp, p sin <p) = p cos <p + ip sin <p.
Then
z • w = (rcos 0 + irsinB)(p cos <p + ip sin <p)
= rp[(cos 0 cos tp -
sin 0 sin cp) + i(sin 0 cos cp + cos 0 sin <p )]
= rp[cos(O +<p)+ isin(O + cp)] = [rpcos(B +<p),rpsin(O + cp)].
§1.8 The Complex Numbers
73
y
a+ ib
rcos tJ
FIGURE 1.8-2 Polar coordinate representation of complex numbers
To multiply z by w we multiply the magnitudes of the vectors and add the polar
angles. If we think of w as fixed, multiplication by z corresponds to rotating by
an angle () and stretching by a factor of r.
r is the magnitude or absolute value of z and is denoted by lzl.
(J is the argument of z and is denoted by arg(z).
Notice that if z = x + iy with x and y real, then lzl = x2 + y2 • It is the
distance to the point z in the plane from the origin. Computation with these
quantities is simplified by introduction of the complex conjugate of z. This is
denoted by z and is defined by z= x - iy. The horizontal coordinate x is called
the real part of z and is denoted Re z or Re (z), while y is called the imaginary
part and is denoted Im z or lm(z).
· It is useful to have a definition of e', where z may be a complex number.
Since we want e° to coincide with the usual definition for a real and we want
~1,; = e° • e";, we need only define e;9 for (J real. We define e;9 = cos (J + i sin (J
and hence e°+bi = e°(cosb + isinb). Then, since cosO + isinO = 1, we have
e1+0i = e°, and so our definition agrees with the usual definition in the case
where a is real. The reader may also check (see Exercise 7a for this section)
that e41 +:1 = e4• . e41 •
Complex numbers are represented (as already stated) by points in IR 2• Using
polar coordinates, we can thus write
J
z = ri9 = rcos(J + irsinfJ,
where r
=lzl and (J =arg z, the argument of z; see Figure 1.8-2.
74
Chapter 1 The Real Line and Euclidean Space
1.8.1 Proposition Suppose z = x + iy,
w
= u + iv E C, and x, y, u, v e JR.
Then
vii.
lzl ~o.
lzl =0 ~ z=O.
lzwl = lzllwl.
lz + wl :::; lzl + lwl.
zw=zw.
z+w=z+w.
(1/z) = 1/z.
viii.
1z12 =zz.
ix.
lzl2 =iJ-+y2.
max<lxl, IYI) 5 lzl:::; ./imax<lxl, lyl).
i.
ii.
iii.
iv.
v.
vi.
x.
1.8.2 Examples
a. . Prove that l/i = -i and that 1/(i + I)= (l - i)/2.
Solution First,
1
I
-i
I
I
-1
.- ..
because i • -i
. = -i,
=-(i2) =-(-1) = 1. Also,
I
I .1-i
1-i
; + 1 =; + 11 - ; = T·
since (1 + i)(l - i)
b.
=I + 1 =2.
Find the real and imaginary parts of (z + 2)/(z - 1) where z = x + iy.
Solution
iy
z - 1 = (x - 1) +·iy = (x- I)+ iy · (x - l) - iy
(x+2)(x-1)+y2+i[y(x-1)-y(x+2)]
= --------,---,,------(x - 1)2 + y2
z+2
(x + 2) + iy
(x + 2) + iy
(x - 1) -
§1.8 The Complex Numbers
75
Hence,
Re(z+2]_x2+x-2+y2
z - 1 - (x - 1)2 +y2
c.
Slw
ha [ (3 + 7i)
w t
t
and Im
[z+2]
-3y
z - 1 = (x - 1)2 + y2.
2] -
(3 - 7i)2
.,. 8 - 6i .
8 + 6i
Solution The point here is that it is not necessary first to work out (3 +
7i)2 /(8 + 6i) if we simply use the properties developed in the text, namely,
z 2 = z2, and z/z' = z/i!. Thus we obtain
( (3 + 7i)
8 + 6i
2]
d.
= (3 + 7i)2 = (3 + ?i/ = (3 8 + 6i
7;)2.
8 - 6i
8 - 6i
I +~I =
If lzl =I, prove that ~z
bz+a
I for complex numbers a and b.
Solution Since lzl = I, we have z = m- 1. Thus
az+b az+b I
bz+a = b+a z · ~
Because of the properties of the absolute value and because lzl = 1,
I I=I +~I ·_!_ =
~z + b
bz+a
since laz + bl
e.
az
az+b
lzl
I
=laz + bl =la z+ bl.
Show that the maximum absolute value of z2 + I on the unit disk lzl :5 I
is 2.
Solution By the triangle inequality, lz2 + I I ::; lz2 I + 1 = lzl 2 + 1 :5 I 2 + I = 2
since lzl :5 1, and thus lz2 + I j does not exceed 2 on the disk. Since the value 2
is achieved at z = I, the maximum is 2.
•
Chapter I The Real Line and Euclidean Space
76
Sequences of complex numbers present no new difficulties. Since the absolute value in C satisfies the same formal properties as in JR. and is in fact a
norm, there is no difficulty in defining convergellce of a sequence of complex
numbers z1,z2,z3, ... to a complex number z by
lim
k-oo
Zk
= z iff for every e > 0 there is a number N
such that lz1: - zl < e whenever k ~ N.
The arithmetic of sequences goes through as in JR.. For instance, if Zn --+ z,
w, and >. is constant, then Zn+ Wn --+ z + w, A.Zn --+ .Xz, and ZnWn --+ zw. If
Zn 't- 0 and z # 0, then wn/ Zn --+ w/z. Using the inequalities
Wn --+
IRe(z) - Re(w)I :5 lz - wl :5 IRe(z) - Re(w)I + IIm(z) - lm(w)I
and
IIm(z) - Im(w)I :5 lz - wl :5 IRe(z) - Re(w)I + jlm(z) - lm(w)I,
we get:
1.8.3 Proposition
Zn --+
z in C
iff Re(zn) --+ Re(z) and lm(zn) --+ lm(z) in JR..
As with real numbers, a bounded sequence of complex numbers must have
a convergent subsequence, and it follows from this that C is complete. Thus, a
sequence in C converges to a point in C if and only if it is a Cauchy sequence.
We shall generalize all this to !Rn in §2.7.
Sometimes one needs an inner product in a vector space over C instead of JR..
Some changes are needed, however. For example, if (v, v) = llvll2, then (v, v)
must be real, so that (v, v) = (v, v). This makes it plausible that the appropriate
change is to modify the symmetry condition v in the definition of the inner
product (Definition 1.7 .6).
1.8.4 Definition A complex vector space V with a function (•, ·) from V x
V
--+
C is a complex inner product space if
~
i.
{v,v)
ii.
(v, v) = 0 if and only if v = 0.
Oforall v EV.
positivity
nondegeneracy
77
§1.8 The Complex Numbers
iii.
iv.
v.
{>.v, w) = >.{v, w) for all 'v, w E V and
>.ER.
mu/Jiplicativity
{v, w + u) = (v, w) + (v, u)
for all v, w, u, E V.
distributivity
(v, w) = (w, v) for all v, w E V.
Hermitian symmetry
For example, if C' is the complex vector space of ordered n-tuples of complex numbers, that is, if
C' = {(z1,z2, ... ,Zn) I Zk EC for each k}
and (z, w} is defined for z = (z1, ... , Zn) and w = (w1, ... , Wn) by
n
{z, w) =
L
ZkWA:,
b:1
then(·,•} is a complex inner product on C'. Not too surprisingly, most properties
remain essentially unchanged except for an occasional complex conjugation.
Part ii of Proposition 1.7.8 is changed to ii' (u, >.v + µw) = "X(u, v) + µ{u, w),
and the special case iii is changed to iii' (v, ,\w) = "X(v, w). Thus the inner
product (sometimes called a Hermitian or sesquilinear fonn) is linear in the first
variable and conjugate linear in the second. A note of caution: Some authors,
particularly in the physics literature, use another convention with the fonn linear
in the second variable and conjugate linear in the first: (v, >.w) = >.(v, w) and
{>.v, w) = X(v, w). This causes no substantial difference if one is consistent
The statement of the Cauchy-Schwarz inequality still holds for complex
spaces, although the proof is trickier. (The vectors must be "rotated" first by
multiplication by a complex scalar of norm I so that the appropriate coefficients
used in the proof are real.) The equation Ilzl 12 = (z, z) defines a norm on V as
before. The argument for the triangle inequality is this:
llx + Yll 2 = {x + y,x + y) = (x,x) + (x,y) + (y,x) + (y,y)
= llxll2 + (x,y) + (x,y) + IIYll 2 = llxll 2 + 2Re({x,y)) + IIYll 2
$ llxll 2 + 2l(x,y)I + IIYll2
(by Cauchy-Schwarz)
S llxll2 + 2llxll llYII + IIYll2 = <llxll + IIYll>2 •
Chapter I The Real Line and Euclidean Space
78
Exercises for §1.8
1.
2.
Express the following complex numbers in the form a + ib:
a.
(2+3i)+(4+i)
b.
(2 + 3i)/(4 + t)
c.
1/i + 3/(1 + i)
Find the real and imaginary parts of the following, where
a.
(z + 1)/(2z
b.
.z3
z = x + iy:
- 5)
3.
Is it tr_ue that Re(zw) = (Re z)(Re w)?
4.
What is the complex conjugate of (8 - 2i) 10 / ( 4 + 6i)5 ?
5.
Does z2 = lzl 2 ? If so, prove this equality. If not, for what z is it true?
6.
Assuming either izl = 1 or !wl = 1 and
zw-:/- I, prove that
z-w
- I=l.
I1-zw
7.
8.
9.
Prove that
z.
-b.
ez -:j. 0 for any complex number
c.
lei0 1= I for each real number 0.
d.
(cos 0 + isin 0t = cosn0 + i sinn0.
Show that
a.
ez = I iff z = k21ri for some integer k.
b.
eZ 1 = ez2 iff z1 - z2 = k21ri for some integer k.
For what
z does the sequence Zn= nzn converge?
Theorem Proofs for Chapter 1
79
Theorem Proofs for Chapter 1
1.1.2 Proposition In an ordered field the following properties hold:
If a+ x =0, then x =-a, and if ax= 1. then x = a- 1•
ii.
Unique inverses
vi.
0 · x = 0 for every x.
viii. -x = (-I)xfor every x.
xiii. 0 < 1.
xiv.
For any x,
x2
~ 0.
Proof To prove ii, suppose a + x = 0. Then
-a
=-a+ 0 = -a + (a+ x) =(-a +a)+ x =0 + x = x,
and so -a= x, as claimed. Likewise, if ax= I, then a- 1 = a- 1 • 1 = a- 1 (ax) =
(a- 1a)x = Ix= x.
For vi: 0 • x (0 + 0)x 0 · x + 0 • x, and so
=
=
0 = 0 · x + (-0 · x) = (0 · x + 0 · x) + (-0 · x)
= 0 · X + (0 · X + (-0 · X))
= 0 · X + 0 = 0 · X.
For viii:
x+(-I)·x= 1 •x+(-l)·x
=(1+(-l))·X
=0·x=0
by vi. Thus, (-1) • x = -x by ii.
For xiii, suppose 1 $ 0. Then l+(-1) $ 0+(-1), and so O $ -1. We could
then use property 16: Since O $ -1 and O $ -1, we get O $ (-1) • (-1) =
-(-1) = 1. Therefore, 1 $ 0 and O $ I, and so I = 0 by property 12, in
contradiction to property 10.
'
To prove xiv, consider two cases: If x ~ 0, then
= x. x ~ o, by axiom
16. If x < 0, then x 2 = (-(-x))(-(-x)) = (- 1)2(-x)2, by vii and viii. But
(-1)
., 2 = .,1, since O = (-1)(-1 + 1) = (-1)2 + (-1) • 1 = (-1)2 - I. Thus
r
r=(-x)-~0.
•
80
Chapter I The Real LJne and Euclidean Space
1.1.7 Proposition N is well-ordered by the relation $.
Proof Suppose S C N is a set with no smallest element. Let T = N\S. We
=
=
use induction to show that T N and hence S 0. If O were in S it would
have to be a smallest element. Thus, 0 E T. Instead of attacking T directly we
consider To= {n E N I {O, 1, 2, ... ,n} C T}. Since To C T, we will be done
if we show To= N. The argument here shows OE To. Suppose k E T0 • Then
{O, 1, 2, ... ,k} C T. If k + 1 were in Sit would be a smallest element in S,
since all natural numbers smaller than k + 1 are in T. Thus k + 1 must also be
in T. Since {O, l,2, ... ,k,k+ l} CT, we have k+ 1 E To. Thus To= N by
induction. •
1.2.2 Sandwich Lemma Suppose Xn -+ L. Yn -+ L, and Xn $ Zn $ Yn for
all n (it is enough to assume that there is an No such that Xn $ Zn $ Yn whenever
n > No). Then Zn -+ L.
Proof Indeed, let c > 0 be given. Then
~
1.
There is an No such that
Xn
2.
There is an N1 such that
lxn - LI < €
3.
There is an N2 such that
IYn - LI < € whenever n ~ N2.
$ Zn $ Yn whenever n
No,
whenever n ~ N1, and
Selecting N = max(No, Ni, N2), then whenever n
L < e:, and so lzn - LI < €, as needed. •
~
N,
-€
< Xn
-
L :5
Zn -
L$
Yn -
1.2.5 Proposition If Xn is a sequence in IR. and Xn
-+
x and Xn
-+
y, then
x=y.
Proof
Xn
converge to both x and y. Write
by the triangle inequality. If Ix - YI > 0, then using Ix - yl/2 as our ~, we can
choose N so large that Ix-Xnl < lx-yl/2 and lxn - JI < Ix- yl/2 if n ~ N. Thus
we would conclude that Ix - YI < Ix - .vi, which cannot be. Hence Ix,- yj = 0
and so x=y.
•
81
Theorem Proofs for Chapter 1
1.2.6 Proposition A convergent sequence is bounded.
Proof If Xn
-+ x, there is an N with lxn - xi < 1 whenever n 2: N. Thus
lxnl $ lxl + I when n ~ N. Letting M = max{lxl, !xii, lx2I, ... , lxn-il} + 1, we
get lxnl $ M for every n. •
1.2.7 Limit Theorem for Sequences Suppose Xn-+ x, Yn-+ y, and>.
is constant. Then
i.
Xn + Yn
-+
ii.
>.xn
>.x.
iii.
XnYn
-+
iv.
if Yn
'F O and y 'F 0, then Xn/Yn-+ x/y.
-+
X + y.
xy.
Proof The proofs illustrate the use of the triangle inequality.
i.
Let e > 0. Select N1 so that lxn - xi < e/2 whenever n 2: N1 and select
N2 so that IYn -yl < e/2 whenever n ~ N2. If N = max(N1,N2), then
whenever n ~ N, we have
j(Xn +Yn) -
and so Xn + Yn
(x +Y)I
-+ X
= l(xn - X) +(Yn - Y)I 5 lxn - xi+ IYn - YI
< c/2 + c/2 = e,
+ y.
ii.
This is a special case of iii in which Yn is a constant sequence, and so it
suffices to show iii.
iii.
First write
IXnYn - ..tyj = IXnYn - XnY + XnY - xyj
<
_ Ixn..vn - xnYI + Ixn y· - xvi
•
= lxnl IYn -
YI+ IYI lxn - xj.
Since the sequence Xn converges, it is bounded. There is a constant M
such that lxnl < M for every n. Select N, so that lxn - xi < E"/[2(lyJ + l)]
82
Chapter 1 The Real Line and Euclulean Space
whenever n;;:: Ni, and select N2 so that IYn -yl
~ N2. Let N = max(Ni,N2). If n ~ N, then
< t:/[2(M + I)] whenever
n
[XnYn - xyl :::; lxnl [Yn - y[ + IYI lxn - xi
[yle
< - Mt:
- - + -'-'-- 2(M + 1) 2(IYI + 1)
€
€
<-+-=€,
- 2 2
and so XnYn - .xy.
iv.
If y -:j. 0, then [yl > 0. There is an Ni such that IYn - YI
n ~ Ni, so that IYn[ > lyl/2. For such n we have .
~1
Xn _
= 'XnY-.xyn
Yn Y
YnY
'
< IYl/2 whenever
I= I(Xn -x)y+x(y-yn)
I
YnY
< IYI lxn - xi + lxl IY - Yn [
IYnYI
IYnYI
-
2
21xl
:::; jyf [xn -x[ + [yl 2 IYn -y[.
Now select N2 so that lxn - xi < elYl/4 whenever n ~ N2, and select N3
so that IYn -y[ < e[yl 2/4lxl whenever n ~ N3. If N-= max(N1,N2,N3) and
n ~ N, then
I
Therefore, Xn/Yn
::1
Xn _
< ~ t:ly[ + 4[x[ elYl 2 = €
Yn Y
[YI 4
[y[ 2 4[xl
·
---4
x/y.
•
1.2.11 Proposition Complete ordered fields are Archimedean.
Proof
Let lF be a complete ordered field and consider x E JF. We must show
that there is an integer N such that x < N. If not, the monotone sequence
1, 2, 3, ... is bounded above by x and so converges by the monotone sequence
property. We assert that this sequence cannot converge to a number, say y. If it
did, for any € > 0 there would be an N such that for n ;;:: N,
I=
In+ 1 - nl :::; In + 1 - y[ + IY - n[ < Ze,
by the triangle inequality. This gives a contradiction if €
< I /2.
•
Theorem Proofs for Chapter
83
f
1.2.17 Proposition Q is dense in IR. That is,
i.
If x and y are in JR and x < )' there is an r E Q with x < r < y.
ii.
If x E IR and 1::
> 0, there is an r E Q with Ix - rl < e.
Proof The proof illustrates the interplay of the Archimedean property and the
well-ordering property of N. To establish i, suppose x < y, so that y - x > 0. By
the Archimedean property of IR, there is an integer n with O < 1/n < y-x. Again
by the Archimedean property (version 2), there is an integer k with k/n > x.
By the well-ordering property, there is a smallest such positive k Using it,
(k - l)/n ~ x < k/n. If r = k/n, then
k- 1 1
1
X < r = - - + - '.5 X + - < X + (y - x) = y,
n
n
n
so that x < r < y, as required. Part ii follows from part i by letting y = x + r.
Readers are asked to supply the details themselves.
•
1.2.18 Theorem The unit interval ]0, 1[ in IR is uncountable.
Proof Suppose x 1, x2, x3, •.. were proposed as an enumeration (listing) of the
points in ]O, I[. We will show that there must be at least one x in ]O, 1[ that does
not appear in the list so that no one-to-one correspondence of N with ]0, 1[ is
_possible. Write out each Xt as a decimal expansion:
X1
+-+
0.aua12a13a14a1s ...
x2 +-+ 0.a21 a22a23a24a2s •••
X3 +-+ 0.a31a32a33a34a35 ..•
where each aik is a digit from O through 9 and repeating 9s are chosen by
preference over terminating decimal expansions. That is, 1/4 = 0.2499999 ....
Let x = 0.b1b2b3b4 ... , where bt = 5 if akk f. 5 and bt = 6 if akk = 5. Then
x E ]0, 1[ and x is not in the list, since it is different from Xt in the kth decimal
place. •
1.2.19 Corollary 'ff x and y are in IR and x < y, then the interval ]x,y[
contains countablv manv rational numbers and uncountablY manr irrational
·
·
·
·
numbers.
Chapter 1 The Real Line and Euclidean Space
84
Proof This follows from 1.2.17 and 1.2.18.
•
= 1 and xn = 1+
1/2 + • • • + 1/n, n = 2, 3, .... Then Xn is monotone increasing but is
unbounded above and so does not converge (we write Xn ----too).
1.2.20 Proposition: The harmonic sequence Let x1
Proof Note first that
x2 = 1 + 1/2, x 4 = l + 1/2 + 1/3 + 1/4 > 1 + 1/2 +
1/4 + 1/4 = 1 + 1/2 + 1/2, Xg = 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 >
1+ 1/2+1/2+4/8 = 1+1/2+1/2+1/2. In general, induction giyes.x2" ~ l+(n/2),
and so Xn is unbounded.
•
1.3.2 Proposition Let S c JR be nonempty. Then b E JR is the least upper
bound of S iff b is an upper bound and for every € > 0 there is an x E S such
that x > b- c.
Proof First, suppose b = lub(S) = sup(S) and c > 0. We must produce an
< x + c.. If there were no such x, we would have b ~ x + c for
x E S such that b
every x E S; that is, b - c ~ x. Thus b - c is an upper bound strictly less than b
and therefore b is nut the least upper bound, which contradicts our hypothesis.
Conversely, suppose b satisfies the given condition. Let d be an upper bound
of S. According to the definition of sup(S), we must show that b s; d. Suppose
in fact, b > d. Let e = b - d. Then d = b - c and d ~ x for all x E S implies
b-c ~ x orb~ x+c., and so our condition fails. Thus the supposition that b > d
is wrong, and we may then conclude that b :s; d, as required. This completes
the argument.
•
Note. In this proof, we used the following basic principle of logic: Showing that a statement P implies a statement Q (in symbols P =}- Q) is equivalent to showing that ~Q :::}- ~P where ~Q is the negation of Q. We call
~Q :::}- ~P the contrapositive of P =}- Q, whereas Q :::}- P is the converse.
While proving P =}- Q is equivalent to proving ~Q =}- ~P, it need not be the
same as Q :::}- P, as simple examples like (x = -1) :::}- (x2 = 1) show.
1.3.3 Proposition Suppose
infB
0 ~A
c
B
cR
Then
:s; inf A :s; sup A :s; sup B.
Theorem Proofs for Chapter 1
85
Proof For example, to prove that infB :S infA, let b = infB, so that b :S x
for all x E B. Thus b S x for all x EA also, since A C B. Thus b is a lower
bound for A, so that b S inf A.
•
1.3.4 Theorem In JR the following hold:
i.
Least upper bound property Let S be a nonempty set in IR that has an
upper bound. Then S has a least upper bound in R
ii.
Greatest lower bound property Let P be a nonempty set in JR that has
a lower bound. Then P has a greatest lower bound in R
Proof
i.
We shall give two proofs here. See Exercise 46 at the end of this chapter
for a third proof.
Method 1 Let M be an upper bound and let n be a positive integer. Keeping n
fixed, consider the sequence M - I/2n, M- 2/2", M - 3/2n, ... that steps down
an amount 1/2n from M each time we proceed in the sequence. Choose the first
integer k such that M - k/2" fails to be an upper bound. Call this integer kn.
There must be such a k,., since S :j:. 0 and IR satisfies the Archimedean property.
Let b,. = M - k,./2n, so that b,. is not an upper bound but bn + 1/2n is an upper
bound. Since we have decreased the size of the steps by a factor of 2 each time
n is increased by 1, b1 :S b2 :S b 3 :S .... Since b,. :SM for all n, we can invoke
completeness to ensure that b,. ---+ b for some b. Also, b 2; bn (why?). We claim
that b is the least upper bound. To show this, let x E S. If x > b, let c = x - b
and, by Example 1.2.14, choose n so large that 1/2n < e. Then
x
1
= b + c 2; b,. + c > b,. + 2" .
Thus b,. + 1/2" is not an upper bound, a contradiction. Similarly, for any c > 0
there must be an x E S such that b - x < e:, so that b is the least upper bound,
by 1.3.2.
Method 2 Since S :j:. 0, we can choose some x E S. Let us write y 2; S if
? is
an upper bound of S. For k = 0, 1, 2, ... , let Nk be the smallest natural
~ber such that x + Nk/2k 2: S; Nk exists by the well-ordering property. Let
~ x : Nk/2k. Note that Nk+I = 2 · Nk or Nk+t = 2 · Nk - 1. Thus Yk is a
f: · easing sequence bounded below by .x.
Chapter 1 The Real Line and Euclidean Space
86
By the completeness property of JR., Yt -+ y for some y E JR.. We shall show
that y is the least upper bound for S. First, let us demonstrate that it is an upper
bound. Suppose that z E Sand z > y. Since Yt -+ y, there is some k such that
Yt < z. This is impossible, since each Yk is an upper bound for S. Now we need
to show that y is the smallest upper bound. By Proposition 1.3.2, it suffices to
prove that for any given e: > 0 there is a z E S so that y < z + e:. Pick k such
that 1;21r. < e:, which is possible by Example 1.2.14. By the choice of Ykt there
is some z E S such that z > Y1r. - 1/21 . But y ~ Y1r. and so z > y- l/2k > y- e:,
and the proof of i is complete.
ii.
Consider the set -P = {-x I x E P}. Since Pis bounded below, -P is
bounded above, so by i, -P has a least upper bound c E JR. From the
•
definition we find that -c is the required greatest lower bound.
We remarked earlier that properties i and ii are each equivalent to the completeness axiom for an ordered field. We have shown one-half of the implication,
namely, that the completeness axiom implies i and ii for an ordered field. Exercise 11 asks the reader to prove that i and ii both imply the completeness
axiom.
1.4.2 Proposition Every convergent sequence is a Cauchy sequence.
Proof Let Xn -+ x. For c > 0 choose N so that lxn - xi < e: /2 if n 2: N.
Thus, if n, m 2: N, lxn -xml ~ lxn -xi+ Ix - Xml < e:/2 +e:/2 = f:, by the triangle
inequality. ~us Xn is Cauchy.
•
1.4.3 Theorem Every bounded sequence in JR has a subsequence that converges to some point in JR.
Proof · Suppose Xn is a bounded sequence in R There is an integer M with
-M < Xn < M for every n. Bisect the interval [-M, M] into two half intervals
[-M,0] and [O,M]. At least one of these must contain Xn for infinitely many
indices n. Call it Io and select no with ,x,.,, E lo. Now split Io in half and let I 1 be
one of the portions that contains Xn for infinitely many indices n. As there are
infinitely many xn's available, we can select n1 > no with Xn 1 in /i. Continue in
this way to obtain subintervals, indices, and points such that
1.
lo :::> Ii ·::> I2 :::> h :> ....
2.
h = [at, b,t] with b1i - a1r. = M /2k.
Theorem Proofs for Chapter I
87
3.
no < n 1 < n2 < n3 < ....
4.
Xn1; E ft. (See Figure l.P-1.)
Consider the sequence a1,a2,a3, .... Since 11:+i C 11; C [-M,M], we have
-M $ a1c $ at+1 $ M for every k. The sequence is monotone increasing and
bounded and so must converge to some number x. For each k we have
lxn1; -
xi
$ lxn1; -
a.1:I + la.1: - xi
$
M/2" + la1: - xi.
Let e > 0. By Example 1.2.14, 1/2"-+ 0 ask-+ oo, so that there is an integer
K1 such that I /2.t < c / (2M) whenever k ~ Ki. Since a.1: -+ x, there is an
integer K2 such that la1: - xi < e/2 whenever k ~ K2. Let K = max(K1 ,K2 ). If
k ~ K, then lxn,1: < e/2 + e/2 = e. Therefore limt-oo Xnk = x, and so the
subsequence Xn,1: converges to x.
•
xi
lo
/1
I
I
/2
I•
I
I /3 I
1--+l
I
I
I 11,1
0
I
-M
0
I
QI
I H
I I I
I I I
a2 a4 I
I
I I
I I I I
I I I I
I a3 I I I
I x., I
I
x., x., x.,
FIGURE l.P-1
I
•I
I
I
I
I
I
I
I
I
I
I
I
I
I
M
'
I
x..
The bisection process is used to show that every sequence in
[ -M, M] has a convergent subsequence
1.4.S Corollary If a and b are in JR and a < b, then every sequence of
Points in the interval [a,b] = {x I a~ x Sb} has a subsequence that converges
to some point in [a,b].
88
Chapter I The Real line and Euclidean Space
Proof If Xn
is a sequence of points in [a, b], then a ::::; Xn ::::; b for every
n. The sequence is bounded, and so by Theorem 1.4.3 there is a subsequence
Xn,, x,.2 , x,. 3 , ••• and there is a point x in IR such that limk......, 00 x,.t = x. By Proposition 1.2.3, we have a::::; x::::; b, and so x E [a, b].
•
In the text we showed how to get Theorem 1.4.4 from the following lemmas.
1.4.6 Lemma Every Cauchy sequence is bounded.
Proof Suppose x,. is a Cauchy sequence. Then there is an N such that jx,. xkl < 1 whenever n ~ N and k ~ N. Therefore, lxnl ::::; lxNI + I for n =
N,N + 1,N + 2, .... If we put M = max{lxd, lx2I, ... , lxNI} + I, then lxnl ::::; M
for every n.
•
1.4.7 Lemma If a subsequence of a Cauchy sequence converges to x, then
the sequence itself converges to x.
Proof Suppose x,. is a Cauchy sequence and that it has a subsequence x,.k
convergent to x. Suppose E > 0. There is an N such that Ix,. - Xm I < E /2
whenever n ~ N and m ~ N. Since x,.k is a subsequence, there is an no > N
with lx,.0 - xi < E /2. If m > N, then
lxm - xi= lxm - Xno +xno -xi :<S; jx,,, - x,.0 j + lxno - xi < c:/2 + t:/2 = E.
Therefore, x,. converges to x.
•
1.5.2 Proposition Let x,. be a sequence in
IR and let x E IR.
i.
x is a cluster point of x,. if and only if for each
there is an index n > N with lxn - xi < E.
ii.
x is a cluster point of x,.
converges to x.
iii.
x,.
iv.
x,. - t x
point.
v.
Xn - t x if and only if every subsequence of x,. has a further subsequence
that converges to x.
-t
x
if and only if there
if and only if every subsequence of x,.
if and only if the
E
> 0 and for each N,
is a subsequence of x,. that
converges to x.
sequence is bounded and x is its only cluster
Theorem Proofs for Chapter 1
89
Proof
i.
If x is a cluster point and i=; > 0, then there are infinitely many indices with
lxn - xi < c. Only finitely many are smaller than N, and so one of them
must be larger than N. In the other direction, select indices inductively:
pick n1 with !xn 1 - xi < c, then n2 > n1 with lxn2 - xi < c, then n3 > 712
with lxn3 - xi < c, and so forth. We get n1 < n2 < n3 < · · · with
lxn; - xi < c and so x is a cluster point.
ii.
iii.
To prove this part, we refine the selection process just done. If x is a
cluster point, select an index n 1 with !xn 1 -xi < 1. Using i, select n2 > n1
with lxn2 - xi < 1/2. Continuing in this way using i repeatedly, select
indices n1 < n2 < n3 < n4 < · · · with lxnt ...: xi < 1/k. This gives a
subsequence that converges to x, since if c > 0 we know there is a K with
I/K < c. Thus, k 2: K implies lxnt - xi < 1/k::::; 1/K < c. In the other
direction, if there is a subsequence converging to x and c > 0, then there
is an index such that the indices in the subsequence past that one label
point are closer than i=; to x. Thus x is a cluster point.
is a subsequence, let c > 0 and select N so that
Since n1 < n2 < n3 < · · ·, induction shows
that nk 2: k. Hence k 2: N implies nk 2: N, and so lxnt - xi < €. The
subsequence converges to x. To show the other direction, we prove the
contrapositive. Namely, if Xn were a sequence that did not converge to x,
then there would be an c > 0 such that for each N there is an index past
N with lxn - xi > c. Use this repeatedly to select n1 < n2 < n3 < · · ·
with lxn; - xi > c. This gives a subsequence that does not converge to x.
If
Xn -+
lxn - xi
x and
Xn;
<c for n 2: N.
iv.
If Xk -+ x, then we know that the sequence is bounded and that every
subsequence also converges to x, so x is the only cluster point. Conversely,
suppose the sequence is bounded and that x is the only cluster point. If
the sequence did not converge to x, then there would be an c > 0 and a
subsequence xkrn such that lx-xk(j) I 2: i=; for eachj. Since that subsequence
is bounded, it would have a convergent sub-subsequence. The limit of that
sub-subsequence would be a cluster point of the original sequence different
from x, but there are no such points. Therefore, Xk -+ x.
v.
If Xn -+ x, then every subsequence converges to x by iii, and so we do
not need to pass to a sub-subsequence. If every subsequence has a subsubsequence convergent to x, then x is a cluster point of every subsequence
and hence of Xn. If there were any other cluster point y -:J x, then there
would be a subsequence convergent to it by ii. By iv, y would be the only
cluster point of that subsequence, and it could not have a sub-subsequence
90
Chapter 1 The Real Line and Euclidean Space
convergent to x. Hence x must be the only cluster point of Xn- Therefore
Xn - X, by iv.
•
1.5.5 Proposition Let Xn be a sequence in R Then
i.
If Xn is bounded below, a number a is equal to fun inf Xn if and only if
ii.
a.
For all e
and
> 0, there is an N such that a - e < Xn whenever n 2 N,
b.
For all e
> 0 and for all M,
there is an n
>M
with
Xn
< a+€.
If Xn is bounded above, a number b is equal to lim supxn if and only if
< b + e whenever n 2
a.
For all e > 0, there is an N such that
and
b,
For all e > 0 and for all M, there is an n > M with b - e <
Xn
N,
Xn.
Proof Suppose that a= funinfxn. Then a= inf{x Ix is a cluster point of xn},
by definition. Thus, given e > 0, there is a cluster point x such that x - e/2 < a;
also, a S x for all cluster points. If x is a cluster point, then there are infinitely
many Xn with lxn - xi < £ /2, and so such x,, satisfy x,, - e < a, which proves ib.
If ia fails, then there is an e > 0 and infinitely many x,, satisfying Xn < a - e.
In fact, choose a subsequence Xnt with x,,t < a - e. Since Xn is bounded below,
Xn, is bounded and so has a convergent subsequence converging to, say, x. Then
x S a - e, so that x is a cluster point smaller than a, giving a contradiction. The
converse is. proved in the same way, and the proof of ii is similar.
•
1.5.6 Proposition If Xn is a sequence in JR., then
limsupxn=inf{sup{xn+J,Xn+2,x,,;3, ... } In= 1,2,3, ... }
liminfx,, = sup{inf{xn+t,Xn+2,Xn+3, ... }
In= 1,2,3, ... }.
Proof We prove the equality for lim sup x,,. First of all, we can assume that x,,
is bounded above, for otherwise both sides would be +oo (with the understanding
that inf{oo} = oo ). Let b = lim sup x,, and b,, = sup{x,,+ 1, Xn+2, ••. } as in the text,
so that b 1 2 bi 2 •·•and b,,-+ inf{b1,b 2 , ••• } (if b,, is unbounded below, then
both sides would be -oo, a case we leave to the reader). We need to prove
that b,, -+ b. For every e > 0 there is an N such that x,, < b + e for n 2 N,
by 1.5.Siia. Thus, b,, $ b + e for n 2 N. On the other hand, since b - e < Xn
for arbitrarily large n 2 M, we get b - e S b,, for n 2 M. Thus, for large n,
b - e $ b,, $ b + e, and so b,, ----+ b.
•
Theorem Proofs for Chapter I
91
1.5.7 Proposition Let Xn be a given sequence in R Then
i.
liminfxn $ limsupxn.
ii.
If Xn
iii.
If M $
iv.
lim sup Xn = +oo
if and only if Xn
v.
liminfxn = -oo
if and only if Xn is not bounded below.
vi.
If xis a cluster point of Xn then liminfxn $ x $ limsupxn.
vii.
If a = lim inf Xn is finite, then it is a cluster point of Xn-
$ M for all n, then limsupxn ~ M.
Xn
for all n, then lim infxn
~
M.
is not bounded above.
viii. If b = limsupxn is finite, then it is a cluster point of Xnix.
Xn -+ X
E
JR if and only if Jim sup Xn = lim inf Xn = X E R
Proof
i.
,\•
ii.
Given a sequence Xn in JR, let Sn = {Xn+t,Xn+2,· • • }, On = inf Sn (set
an = -oo if Sn is unbounded below), and bn = supSn (set bn = oo if Sn is
unbounded above). From-the discussion preceding the statement of Proposition 1.5.6 in the main text, each bk is an upper bound for {ai, a2 , a 3 , • .• } .
Thus, lim inf Xn = sup{ a1, a2, a3, ... } $ bt for any k. This holds for every
k, and so lim infxn $ inf{ b1, b2, b3, ... } = lim sup XnSince Xn $ M for all n, supSn $ M for all n, and so limsupxn =
inf (sup Sn) $ M. The proof of iii is similar.
n
iv.
If Xn is not bounded above, limsupxn = +oo, by definition. The converse
follows from ii. The proof of v is similar.
vi.
Suppose x is a cluster point of Xn, Then for every n there is an index
k > n with x - E < x1; < x + c:. That is, Sn n ]x - c:,x + €[ 'F 0. Thus, for
every n,
x- € $ supSn and inf Sn$ x+c:.
Since this holds for every i
> 0, we obtain
inf Sn $ x $ sup Sn
for all n,
and so
liminfxn
=sup(infSn)
$ x$
n
t
inf(supSn)
n
=limsupxn,
92
Chapter 1 The Real Line and Euclidean Space
vii.
By definition, a= sup{a1,a2,a3, ... }, where an= inf{Xn+1,Xn~2,•·•}.
Clearly, a1 $ a2 $ a3 $ · · · $ a. Since a is finite and equal to the
least upper bound of the at, they must converge upward to it. Let e and
N be given. There is an No such that a - e < an $ a whenever n :?: N0 •
Selecting n > max(N,No), we get a - e < infSn $ a. Thus, Sn must
intersect the interval ]a - e, a+ e[. There is an index k > n :?: N with
lxt - al < e, and so a is a cluster point.
viii. The proof is left as an exercise.
ix.
If Xn -+ x, then for all e > 0 there is an N such that x - e < Xn < x + e for
all n :?: N. Thus by 1.5.Si, lim inf Xn = x and by 1.5.Sii, lim sup Xn = x.
Conversely, if the conditions of 1.5.S hold with x = a = b, let N be
the greater of the N's for 1.5.Sia and 1.5.Siia, so that x - £ < Xn < x + £
if n:?: N, and thus Xn -+ x as n-+ oo. •
1.6.6 Theorem For vectors in Rn, we have
I.
Properties of the inner product:
i.
(x, x} :?: 0.
ii.
=0 iff x =0..
(x,y + w} =(x,y) + (x, w}.
(ax,y} =a(x,y} for a ER
(x,y} =(y,x).
iii.
iv.
v.
II.
III.
{x,x}
positivity
nondegeneracy
distributivity
mu/Jiplicativity
symmetry
Properties of the norm:
i.
llxll :?:
ii.
llxll
o.
iii.
=0 iff x =0.
llo:xll =lal llxll for a E R
iv.
llx + YII $ llxll + IIYII-
positivity
nondegeneracy
mu/Jiplicativity
triangle inequality
Properties of the distance:
ii.
:?: 0.
d(x,y) =0 iff x =y.
iii.
d(x,y) = d(y,x).
iv.
d(x, y) $ d(x, z) + d(z, y).
i.
d(x,y)
positivity
nondegeneracy
symmetry
triangle inequality
Theorem Proofs for Chapter 1
IV.
93
The Cauchy-Schwarz inequality:
l{x,y)I ::;; llxll llYIIProof
I.
Each property is a routine check except IV, which will be proved in 1.7.9.
II and III will be proved as general consequences of I in 1.7.5, 1.7.9
and 1.7.10. •
1.6.7 Proposition If v = (v 1, ••• , Vn) and w = (w 1, ••• , wn) are vectors in
Rn and we let p(v, w) = max{lv1 - wd, lv2 - w2l, ... , lvn - Wnl}, then
p(v, w) 5
llv - wll 5
/np(v, w).
Proof Clearly
n
Iv; -
w;j = Jlv; -
w;l 25
L lvi - wil2 = !Iv - wllj=t
Also,
n
!Iv - wll =
L {lvi - wil2} 5
j=l
= Jnp(v, w)2 = /np(v, w),
Which proves the result.
•
1.7.S Proposition If (V, II • II) is a normed vector space and d(v, w) is
defined by
d(v, w) = !Iv then d is a metric on V.
wll,
94
Chapter 1 The Real line and Euclidean Space
Proof
i.
ii.
d(v, w) ~ 0, since d(v, w) = llv To show d(v, w) = 0
= 0 ¢=> V = W.
¢=>
v
= w,
wll 2: 0.
note that d(v, w) = 0
¢=>
llv - wll
=0
¢=>
V - W
iii.
iv.
=d(w, v), since d(v, w) =llv-wll =11-(w-v)II = l-11 llw-vll =
llw - vii= d(w, v)
For the triangle inequality, d(x,y) = llx - YII = ll(x - z) + (z - y)il :5
llx - zll + llz - YII =d(x,z) + d(z,y). •
d(v, w)
If (V, (·, ·)) is an inner product
1.7.9 Cauchy-Schwarz Inequality
space, then
l(x,y)I:::; -.fM ~
for every x and y in V.
Proof If either x or y is 0, then (x,y) = 0, and so the inequality holds.
Therefore, we can assume x I- 0 and y I- 0. Then (x,x) > 0 and (y,y) > 0. For
any a in JR, we have
0 :5 (ax+y,ax+y) =ci(x,x) +2(x,y)a+ (y,y)
by using the basic properties of an inner product. Setting (x,x) = a, 2{x,y) = b,
and (y,y) = c, this becomes
aci + ba + c 2: 0 for every a in JR.
We know (since a > 0) that this quadratic expression has a minimum and that
it occurs when a= -b/2a. Therefore we insert that value for a and obtain
a(::
Since a
2)
+b
(-;a) +c ~
0;
i.e.,
c2: ::.
> 0, this means that
b2 $ 4ac; i.e., l(x,y)l 2 $ (x,x)(y,y).
Taking square roots gives what we want.
•
1.7.10 Proposition lf(V, (·,·))is an inner product space and 11·11 is defined
for v in V by
llvll=~.
then
11 · II
is a norm on V.
Worked Examples for Chapter I
95
Proof The only nontrivial property is the triangle inequality, which was ver-
•
ified in the text.
Worked Examples for Chapter 1
Example 1.1 For real numbers, prove that
a.
b.
c.
x S lxl, -lxl S x.
!xi :5 a ijf -a S x S
Ix+ YI :5 !xi + IYl-
a, where a ~ 0.
Solution
a.
If x ?: 0, then lxl = x; if x < 0, then !xi ?: x, since
x S Ix!- The other assertion is similar.
!xi ?: 0.
b:
If x ?: 0, then we must show that O :5 x :5 a iff -a :5 x S a, which is
obvious. Similarly, if x < 0, then the assertion becomes O :5 -x :5 a iff
-a S x :5 a, which is again obvious. Here we use the fact that if c :5 0,
then O :5 x :5 y iff O ~ ex ~ cy.
c.
By a, -!xi :5 x :5 lxl and -!YI :5 y :5 IYI• Adding, we obtain -(lxl+lyl) S
x +y S lxl + IYI- Then, by b, Ix+ YI :5 !xi+ !YI- In addition, this can be
proved by cases as indicated in the text. Note that this is also a special
case of the triangle inequality in !Rn; see Theorem 1.6.6Iliv. •
In either case,
Example 1.2 Let S be a nonempty set in IR bounded above and x = sup(SJ.
Show that there is a sequence x 1, x2 , • •• in S such that Xk
-+
x.
Solution For each k, use Proposition 1.3.2 to find an x1r. E S such that x1r. :5
x < Xk + 1/k. Then Xk-+ x, since for a given e > 0, we choose N?: -I/e; then
•
k ~ N implies Xt S x < x1:. + e; i.e., Ix - Xtl < e.
Example 1.3 For numbers X1, ••• ,Xn,Yl, .•• ,Yn. and Z1, .•. ,Zn, show that
(t x;y;z;) (t x1) (t Yr) (t z1) ·
4
i=l
2
:5
1=l
1=l
1=!
96
Chapter 1 The Real Line and Euclidean Space
Solution The Cauchy-Schwarz inequality (Theorem 1.6.6lv) says that
Applying this to the numbers w; =
Applying this again to
4Zf gives
(LXTZi2) 2:'.S (LX;4)( LZ;4)
and so
XiZi and y; gives
(
i.e., L(x;z;)
2) :'.S (LX;4) 1/2(LZ;4) 1/2
(LXiYiZi)2:'.S (LX;4)1/2 (LZ;4) 1/2 (LYi2) .
Squaring both sides, the result is obtained. (We have used the fact that if a, b 2: 0,
•
then a :'.Sb iff a 2 :'.S b2 .)
Example 1.4 Suppose x
E JR and x
>
0; show that there is an irrational
number between O and x.
Solution If xis rational, then since ../2 is irrational (Exercise 2), so is x/ ../2
(why?) and is between O and x. On the other hand, if x is irrational, then x/2
is irrational (why?) and lies between O and x.
•
Example 1.5 Let A and B be nonempty sets in JR that are bounded above.
Let a = sup(A) and· b =sup(B) and let the set C be defined by C = {xy I x E A,
y E B}. Show that, in general, ab -I sup(C). If a < 0 and b < 0, prove that
ab = inf( C). If a > 0 and b > 0 and A, B have only positive elements, prove
that ab = sup( C).
= {x E JR I -10 < x < -1} =
) - 10, -1[ and B = ]0, 1/2[, so that a= -1, b = 1/2, and ab= -1/2. But
C =] - 5,0[ and sup(C) = 0.
Now we prove that if a < 0 and b < 0, then ab = inf(C). For this,
we use the analogue of Proposition 1.3.2 for greatest lower bounds. First, let
x E A and y E B. We want to show xy 2: ab. But x :::; a and y :::; b, or
-x 2: -a 2: 0 and -y 2: -b 2: 0, and so (using ordered field axiom ill.16 for
JR), (-x)(-y) 2: (-a)(-b); i.e., xy 2: ab. Given e > 0, we seek x EA and y E B
Solution As a specific instance, let A
Exercises for Chapter I
97
so that ab> xy-E, or lab-xyl < E. Choose x and y so that a < x+c/[2(lbl+ l)],
b < y + c/[2lal], and b < y + 1. Then, since luvl = lul lvl and IYI < lbl + 1, we
get
lab -
xyl $
=
xyl
c
c
·
!al lb - YI+ la - xi IYI < !al Zlal + Z(lbl + l) (lbl + 1) = c
lab - ayl + lay -
(using the triangle inequality).
The last assertion can be proven in an analogous way.
•
Exercises for Chapter 1
1.
For each of the following sets S, find sup(S) and inf(S) if they exist:
a.
{x E JR I x2 < 5}
b.
{x E JR I x2 > 7}
c.
{1/n I n, an integer, n
d.
{-1/n
e.
{.3, .33, .333, ... }
> 0}
In an integer, n > 0}
2.
Review the proof that v'2 is irrational. Generalize this to
positive integer that is not a perfect square.
3.
a.
Let x;?: 0 be a real number such that for any c
that x = 0.
b.
Let S = )0, 1[ . Show that for each c > 0 there exists an x E S such
that X < E.
> 0,
vk.
x $
for k a
E.
Show
4.
Show that d = inf(S) iff d is a lower bound for S and for any c > 0 there
is an x E S such that d ;?: x - E.
5.
Let Xn be a monotone increasing sequence bounded above and consider
the set S = { x 1, x2 , .. . } . Show that x,, converges to sup(S). Make a similar
statement for decreasing sequences.
6.
Let A and B be two nonempty sets of real ~umbers with the property that
x $ y for all x E A, y E B. Show that there exists a number c E JR such that
x $ c $ y for all x E A, y E B. Give a counterexample to this statement
for rational numbers (it is, in fact, equivalent to the completeness axiom
and is the basis for another way of formulating the completeness axiom
known as Dedekind cuts).
98
Chapter 1 The Real line and Euclidean Space
7.
For nonempty sets A,B c IR, let A +B = {x+y Ix EA and y EB}. Show
that sup(A + B) = sup(A) + sup(B).
8.
For nonempty sets A, B c JR, detennine which of the following statements
are true. Prove the true statements and give a counterexample for those
that are false:
a.
sup(A
n B) ~ inf{ sup(A), sup(B)}.
n B) = inf{ sup(A), sup(B)}.
b.
sup(A
c.
sup(A U B) ~ sup{ sup(A), sup(B)}.
d.
sup(A U B) = sup{ sup(A), sup(B)}.
9.
Let Xn be a bounded sequence of real numbers and Yn = (-lfxn, Show
that lirnsupyn ~ lirn sup lxnl• Need we have equality? Make up a similar
·
inequality for lim inf.
10.
Verify that the bounded metric in Example 1.7.2d is indeed a metric.
11.
12.
Show that i and ii of Theorem 1.3.4 both imply the completeness axiom
· for an ordered field.
In an inner product space show that
a.
b.
c.
2llxll2 + 2IIYll 2 = llx +Yll2 + llx - Yll 2 (parallelogram law).
llx + YII 11.t - YII ~ llxll2 + IIYll2 •
4(x,y} = llx + Yll2 - llx - Yll2 (polarizatior, identity).
Interpret these results geometrically in terms of the parallelogram formed
by x and y.
13.
What is the orthogonal complement in lR4 of the space spanned by (1, 0, 1, 1)
and (-1,2,0,0)?
14.
a.
Prove Lagrange's identity
(t
2
x;y;) =
i=l
(t xr) (t l) - L
i=l
(x;yi - xm>2
I 9<j$n
i=I
and use this to give another proof of the Cauchy-Schwarz inequality.
b.
Show that
n
(
~(X;+y;)2
1=1
)
1/2
~
(
n
~Xi
1=1
) 1/2
(
+
n
~Yf
1=1
) 1/2
99
Exercises for Chapter 1
15.
16.
Letxn be a sequence in JR such that d(xn,Xn+i) :S d(x,._ 1,x,.)/2. Show that
x,. is a Cauchy sequence.
.,
Prove Theorem 1.6.4. In fact, for vector spaces V 1, ••• , Vn, show that
V = Vi x · · · x V,. is a vector space.,
17.
Let S c JR be bounded below and nonempty. Show that inf(S) = sup
{x E JR I x is a lower bound for S}.
18.
Show that in Rx,. ----+ x iff -Xn ----+ -x. Hence, prove that the completeness
axiom is equivalent to the statement that every decreasing sequence x 1 ?:
x2 ?: X3 • • • bounded below converges. Prove that the limit of the sequence
is inf {x1,x2, ... }.
19.
Let x
= (1, 1, 1) E R 3 be written x =ytf, + y,iJ2 + y-if3, where f1 = (1, 0, 1),
Ji= (0, 1, 1), and!J = (1, 1,0).
Compute the components y;.
20.
Let S and T be nonzero orthogonal subspaces of R". Prove that if S and
T are orthogonal complements (that is, S and T span all of JR"), then
Sn T = { 0} and dim(S) + dim(T) = n, where dim(S) denotes dimension of
S. Give examples in JR3 of nonzero orthogonal subspaces for which the
condition dim(S) + dim(T) = n holds and examples where it fails. Can it
fail in JR 2 ?
21.
Show that the sequence in Worked Example 1.2 can be chosen to be
increasing.
22.
a.
If x,. and y,. are bounded sequences in R, prove that
lim sup(x,. + y,.) :S lim sup x,. + lim sup y,..
b.
Is the product rule true for lim sups?
23.
Let P c R be a set such that x ?: 0 for all x E P and for each integer k
there is an Xk E P such that kxk :S 1. Prove that O = inf(P).
24.
If sup(P)
25.
We say that P :SQ if for each x E P, there is a y E Q with x :Sy.
= sup(Q) and inf(P) = inf(Q), does P = Q?
a.
If P :S Q, then show that sup(P) :S sup(Q).
b.
Is it true that inf(P) :S inf(Q)?
c.
If P :S Q and Q :S P, does P = Q?
Chapter I The Real Line and Euclidean Space
100
26.
Assume that A= {am,n Im= 1, 2, 3, ... and n = 1, 2, 3, ... } is a bounded
set and that am, 11 2 ap,q whenever m 2 p and n 2 q. Show that
lim an,n = supA.
n-+oo
27.
Let S = {(x,y) E JR2 I xy
inf(B).
28.
Let x,. be a convergent sequence in JR and define A,. = sup{xn,Xn+I, .. . }
and Bn = inf{x11 ,Xn+l, .•. }. Prove that A,. converges to the same limit as
Bn, which in tum is the same as the limit of x11 •
29.
> 1}
and B = {d((x,y),(0,0)) I (x,y) ES}. Find
For each x E JR satisfying x 2 0, prove the existence of y E JR such that
l=x.
30.
Let V be the vector space C([0, 1]) with the norm llflloo = sup{lf(x)I I
x E [0, 1]}. Show that the parallelogram law fails 1and conclude that this
norm does not coJI}e from any inner product. (Refer to Exercise 12.)
31.
Let A, B C JR and f : A x B -+ JR be bounded. Is it true that
sup{/(x,y) I (x,y) EA
x B} = sup{sup{J(x,y) Ix EA} I y EB}
or, the same thing in different notation,
f(x,y) = sup (sup f(x,y))?
sup
(x,y)EAXB
32.
a.
b.
yEB
xEA
Give a reasonable definition for what limn-+oo Xn = oo should mean.
Let xi = I and define inductively Xn+I =(xi+··· +x11 )/2. Prove that
Xn-+00.
33.
a.
Show that (logx)/x-+ 0 as x-+ oo. (You may consult your calculus
text and use, for example, l'Hopital's rule.)
b.
Show that n 1111
-+
1 as n-+ oo.
Exercises 34--45 concern complex numbers.
34.
35.
Express the following complex numbers in the form a + bi:
a.
(2 + 3i)(4 + i)
b.
(8
c.
(1+3/(1+0)2
+ 61) 2
What is the complex conjugate of (3 + 8i)4 /(I + i) 10 ?
Exercises for Chapter I
36.
101
Find the solutions to:
a.
(z + 1)2
b.
z4 -
= 3 + 4i.
i = 0.
37.
Find the solutions to z2 = 3 - 4i.
38.
If a is real and z is complex, prove that Re(az) = a Re z and that Im(az) =
a Im z. Generally, show that Re : C -+ R is a real linear map; that is,
that Re (az + bw) = a Re z + b Re w for a, b real and z, w complex.
39.
Find the real and imaginary parts of the following, where z = x + iy:
a.
b.
40.
a.
'
1/z2
1/(3z+2)
Fix a complex number
z = x + iy and consider the linear mapping
'Pz : R 2 -+ IR2 (that is, of C -+ C) defined by '{)z(W) = z · w (that
is, multiplication by z). Prove that the matrix of 'Pz in the standard
basis (1,0),(0, 1) of!R 2 is given by
41.
Show that Re(iz) = - lm(z) and that lm(iz) = Re(z) for all complex numbers z.
42.
Letting z = x + iy, prove that
43.
If a, b E C, prove the parallelogram identity:
lxl + IYI ::; v214
la - bl 2 +la+ bl 2 = 2(lal 2 + 1h12).
44.
Prove Lagrange's identity for complex numbers:
Deduce the Cauchy inequality from your proof.
45.
Show that if
lzl > I then n-oo
lim t' /n = oo.
102
Chapter 1 The Real Line and Euclidean Space
46.
Prove that each nonempty set S of IR. that is bounded above bas a least
upper bound as follows: Choose .xo E S and Mo an upper bound. Let
ao = (xo + Mo)/2. If ao is an upper bound, let M1 = ao and xI = xo;
otherwise let M 1 = Mo and x1 > ao, x 1 E S. Repeat, generating sequences
Xn and Mn. Prove that they both converge to sup(S).
Chapter 2
The Topology of
Euclidean Space
In this chapter we begin our study of those basic properties of !Rn that are
important for the notion of a continuous function. We will study open sets, which
generalize open intervals on IR, and closed sets, which generalize closed intervals.
The study of open and closed sets constitutes the beginnings of topology. This
study will be continued in Chapter 3.
Most of the material in this chapter depends only on the basic properties of
the distance function and so makes sense in a general metric space. Recall that
the distance function d for !Rn is given by
n
d(x,y) =
{
~(x;
-y;f
}1/2
and that the basic properties of d are
2: 0.
1.
d(x,y)
2.
d(x,y) = 0 iff x = y.
3.
d(x,y)
4.
d(x,y) ::; d(x, z) + d(z,y) (triangle inequality).
= d(y,x).
Recall also that a set M together with a distance function d satisfying these
properties is called a meuic space.
103
Chapter 2 The Topology of Euclidean Space
104
§2.1 Open Sets
To define open sets, we first introduce the notion of an c:-disk in a metric space.
As usual, our primary example will be IRn.
2.1.1 Definition Let (M,d) be a metric space. For each.fixed x EM and
c: > 0, the set
D(x,c:) = {y EM I d(x,y) < c:}
is called the £-disk about x (also called the £-neighborhood or £-ball about x).
See Figure 2.1-J~ set Ac A( is said to be open if for each x EA, there exists
ah£> 0 such that D(x,£) C ~ neighborhood of a point in Mis an open set
containing that point.
y
R'
-.e-+X
FIGURE 2.1-1
The £-disk
Note that the empty set 0 and the whole space M are open.
It is important to realize that the c: required in the de nition of an open
set may depend on x. For example, the unit square in IR2 not including the
"boundary" is open, but the €-neighborhoods get smaller as we approach the
boundary. However, the£ cannot be zero for any x. See Figure 2.1-2.
Consider an open interval in JR = JR 1, say, JO, I[. Indeed, this is an open set
(see Figure 2.1-3). However, if we look upon the set as being in JR2 (as a subset
of the x axis), it is no longer open. Thus, for a set to be open it is essential to
specify which ]Rn and, in general, which metric space it is to be considered a
subset of.
There are numerous examples of sets that are not open. The unit disk
{x E lR.2 I llxll ~ 1} is such an example. This set is not open because for a point
105
§2.1 Open Sets
FIGURE 2.1-2
An open set
y
. ft2
e-disk inR2
e-disk in R
I
X
0
0
(x,0)
FIGURE 2.1-3
]0, l[ is open in JR but not in JR2
FIGURE 2.1-4 A nonopen set
on the "boundary" (that is, points x with llxll = I), every c-disk contains points
that do not lie in the set. See Figure 2.1-4.
2.1.2 Proposition In a metric space, each €-disk D(x, c) is open.
Chapter 2 The Topology of Euclidean Space
106
The main idea for the proof is contained in Figure 2.1-5. Notice in this figure
that the size of the disk about the pointy E D(x,£) gets small as y gets closer
to the boundary. The theorem should be "intuitively" clear from this picture.
FIGURE 2.1-5
Some ideas useful for the proof of Proposition 2.1.2
Some basic laws that open sets obey are the following:
2.1.3 Proposition In a metric space (M, d),
~-
~
i.
The intersection of a finite number of open subsets of M is o p /
ii.
The union of an arbitrary collecti9n of open subsets of M is open.
iii.
The empty set
0
and the whole space M are open.
To appreciate the difference between assertions i and ii, note that the intersection of an arbitrary family of open sets need not be open. For example, in
JR 1, a single point (which is not an open set) is the intersection of the collection
of all open intervals containing it (why?).
Note. A set with a specified collection of subsets (called, by definition, open
sets) obeying the rules in Proposition 2.1.3, and containing the empty set and
the whole space, is called a topological space. We shall not deal with general .
topological spaces in this book, but primarily with the cases of Rn and metric
spaces. However, much of what is said in Chapters 2, 3, and 4 does apply to
,
the more general setting.
§2.1 Open Sets
107
2.1.4 Example Let S = { (x, y) e IR2 I O < x < 1}. Show that S is open.
Solution In Figure 2.1-6 we see that about each point (x,y) e S we can draw
the disk of radius r = min{x, I - x}, and it is entirely contained in S; you can
see this from the picture and can prove it using the triangle inequality. Hence,
by definition, S is open.
•
y
FIGURE 2.1-6
Any point of this set has a disk about it also lying in the set
-----------------
/1_ 2.1.5 Ex~ple Let__s = { (x,y) e R 2 I O < x
1
~
I}. I s ~ - - - -
Solution No, because any disk about (l, 0) e S contains points (x, 0) with
•
- - - - - ________
J
x>l.
2.1.6 Example Let A c Rn be open and B c !Rn. Define
A+ B = {x + y E !Rn Ix EA and y EB}.
Prove that A + B is open.
Solution Let w E A + B. There are points x E A and y E B with w = x + y.
Since A is open, there is an € > 0 such that D(x,€) c A. We claim that
~
Chapter 2 The Topology of Euclidean Space
108
wll
D(w,€) CA+ B. Suppose z E D(w,e). Then llz = llz - (x + y)II < €. But
llz - (x + y)II = ll(z - y) - xii, so z - y E D(x, e) CA. Since y E B, this forces
z = (z-y)+y to be in A +B. Thus, D(w,e) c A +Band henceA+B is an open
set.
•
Note that we have in effect shown that if A is open then A + {y} is open. It
follows that A + B = Uyen(A + {y}) is also open.
2.1.7 Example Let M be any set, and define the discrete metric do on M
by do(x,y) =0 if x =y and do(x,y) = 1 if x i' y. Prove that every set A C Mis
open.
Solution Indeed, given x EA,
is open.
•
D(x, 1/2) = {x} c A, and so by definition, A
·
Exercises for §2.1
1.
Show that JR2 \{(0,0)} is open in JR 2•
2.
Let S= {(x,y) E JR 2 I xy
3.
Let AC R be open and BC R 2 be defined by B = {(x,y) E R 2
Show that B is open.
4.
Let BC JR.n be any set. Define C = {x E Rn I d(x,y) < 1 for some y E B}.
Show that C is open.
5.
Let AC R be open and BC R. Define AB= {xy ER Ix EA and y EB}.
Is AB necessarily open?
6.
Show that R 2 with the taxicab metric has the same open sets as it does
with the standard metric.
> I}. Show that Sis open.
Ix EA}.
§2.2 Interior of a Set
2.2.1 Definition Let M be a metric space and A c M. A point x E A is
called an interior point of A if there is an open set U such that x E U c A. The
interior of A is the collection of all interior points of A and is denoted int(A).
This set might be empty.
Exercises for §2.2
109
The condition on x is equivalent to the following: There is an c > 0 such
that D(x,c) c A - see Exercise 5 at chapter's end. For example, the interior
of a single point in IR.n is empty. The interior of the unit disk in IR2 , including
its boundary, is the unit disk without its boundary.
We can describe the interior of a set in a somewhat different manner as
follows: The interior of A is the union of all open subsets of A (the reader
is asked to show this in Exercise 23 at the end of this chapter). Thus, by
Proposition 2.1.3 or directly, int(A) is open. Hence, int(A) is the largest open
subset of A. Therefore, if there are no open subsets of A, then int(A) = 0. Also,
it is evident that A is open iff int(A) = A.
2.2.2 Example Let S = {(x, y) E IR 2 I 0 < x
:::; 1}. Find int(S).
Solution To determine the interior points, we locate points about which it is
possible to draw an c-disk entirely contained in S. From Figure 2.1-6, we see
that these are points (x,y) where O< x < I. Thus, int(S) = {(x,y) IO< x < l}.
•
2.2.3 Example Is it true that int(A) U int(B) = int(A U B)?
Solution No. On the real line, let A= [0, 1] and B = [1,2]. Then int(A) =
]0, ·1 [ (why?) and int(B) =] 1, 2[, so that int(A)Uint(B) = ]0, 1[ U] 1, 2[ = ]0, 2[\ { 1},
while int(A U B) = int[O, 2] = ]0, 2[.
•
2.2.4 Example Is it true in a general metric space (M, d) that int{y EM
d(y,xo):::; r} = {y EM I d(y,xo) < r} where Xo EM and r > 0 are given?
I
~olution No. For example, consider any set M with the discrete metric,
xo E M and r = 1. Then {y E M I d(y, xo) :::; 1} = M, and so its interior is all of
M. On the other hand, {y I d(y,Xo) < I}= {Xo}, which is not M if M has more
than one point.
•
Exercises for §2.2
1.
LetS={(x,y)EIR.2 lxy2: l}. Findint(S).
2.
Le~.S = {(x,y, z) E JR3 I 0
<x <
I, y2 + z2
:::;
1}. Find int(S).
I IO
Chapter 2 The Topology of Euclidean Space
3.
If A C B, is int(A) C int(B)?
4.
Is it true that int(A) n int(B) = int(A n B)?
5.
Let (M, d) be a metric space, and let Xo E M and r
> 0.
Show that
D(Xo, r) C int{y EM I d(y,xo) ~ r}.
§2.3 Closed Sets
2.3.1 Definition A set B in a metric space M is said to be closed if its
complement (that is, the set M \ B) is open.
For example, a single point in Rn is a closed set. The set in R 2 consisting of
the unit disk with its boundary circle is closed. Roughly speaking, a set is closed
when it contains its "boundary points" (this intuition will be made precise in
§2.6). See Figure 2.3-1. Our intuition can be hard to interpret for complicated
sets, and so the technical definition 2.3.1 is then used.
I
•
X
FIGURE 2.3-1
Examples of closed sets
It is possible to have a set that is neither open nor closed. For example, in
JR 1, a half-open interval ]O; 1] is neither open nor closed. Thus, even if we know
that A is not open, we cannot conclude that it is closed or not closed.
The next theorem is analogous to Proposition 2.1.3.
2.3.2 Proposition In a metric space (M, d),
i.
The union of a finite number of closed subsets is closed.
ii.
The intersection of an arbitrary family of closed subsets is closed.
iii.
The wlwle space M and the empty set 0 are closed.
,r
§2.3 Closed Sets
111
This theorem follows from Proposition 2.1.3 by noting that unions and intersections are interchanged when we talce complements (see the Introduction).
The proof is left to the reader (Exercise 22 at the end of the chapter), who
should also show that in the first assertion, the finite union cannot be replaced
by arbitrary unions.
2.3.3 Example Let S = {(x,y) E IR.2 1 O < x $ 1,0 $ y $ 1}. Is S closed?
Solution See Figure 2.3-2. Intuitively, S is not closed, because the portion
of its boundary on the y-axis is not in S. Also, the complement is not open
because any e-disk about a point on the y-axis, such as the point (0, 1/2), will
intersect S (and hence not be in IR 2 \ S).
•
y
FIGURE 2.3-2
Is this set closed?
2.3.4 Example Let S = { (x,y) E 1R 2 I x2 + y2 $ I}. Is S closed?
Solution Yes; S is the unit disk, including its boundary.
Tj
a!l open set, because for (x,y) E IR 2 \ S, the disk of radius e =
be entirely contained in IR 2 \ S (Figure 2.3-3).
•
complement is
- 1 will
x2 + y 2
2.3.5 Example Show that any finite set in !Rn is closed.
'
Solution Single points are closed, and so the assertion folfows frofll Proposition 2.3.2i.
•
·
/
Chapter 2 The Topology of Euclidean Space
112
y
I
\
I
-
,. '
•
__,,,
'
\
(x,y) /
1------x
FIGURE 2.3-3
The complement of the set S in Example 2.3.4 is open'
2.3.6 Example Let (M, d) be a metric space and A C M be a finite set. Let
:S I for some y EA}. Show that Bis closed.
B = {x EM I d(x,y)
Solution We will show that M \ B is open. Let z E M \ B so that d(z, y) > I
for ally E A. Let Y1, ... ,YN denote the points of A, so that d(z,y;) > 1 for
i = 1, .. . ,N. Let e be the minimum of d(z,y1)- 1, .. . ,d(Z,YN)- I, so that
e > 0. If d(x,z) < e/2, the triangle inequality gives d(x,y1)+d(x,z) 2: d(y1,z)
or d(x,y;) 2: d(y;,z) - e/2, which, by construction of e, is strictly greater than
1. Thus, D(z, e) c M \ B, and so M \ B is open and thus B is closed.
•
Exercises for §2.3
1.
Let S = {(x,y) E 1R2 Ix 2: 1 and y 2: 1}. Is S closed?
2.
Let S= {(x,y) E JR2 Ix= 0, 0
3.
Redo Example 2.3.S directly, this time showing that the complement is
open.
4.
Let A c ]Rn be arbitrary. Show that ]Rn \ (intA) is closed.
S.
Let S = {x E JR Ix is irrational}. Is S closed?
6.
Give an alternative solution of Example 2.3.6 by showing that B is a union
of finitely many closed sets.
< y < l}. Is S closed?
113
§2.4 Accumulation Points
§2.4 Accumulation Points
Another useful way to detennine whether or not a set is closed is based on the
concept of an accumulation point.
2.4.1 Definition A point x in a metric space M is called an accumulation
point of a set A c M if every open set U containing x contains some point of A
other than x.
In other words, an accumulation point of a set A is a point such that other
points of A are arbitrarily close by. Accumulation points are also referred to as
cluster points. Using Proposition 2.1.2, the statement that x is an accumulation
point of A is equivalent to the statement that for every € > 0, D(x, e) contains
some point y of A with y f. x. For example, in JR 1 , a set consisting of a single
point has no accumulation points and the open interval ]0, I [ has all points of
[0, I] as accumulation points. Note that an accumulation point of a set need not
lie in that set. The definitions of accumulation point and closed set are closely
related, as shown by the next theorem.
2.4.2 Theorem A set A
C M is closed iff the accumulation points of A
belong to A.
A set need not have any accumulation points (a single point and the set of
integers in R 1 are examples), in which case Theorem 2.4.2 still applies and we
can conclude that the set is closed.
Theorem 2.4.2 is intuitively clear because the property of being closed means,
roughly speaking, that a set co_ntains all points on its "boundary," and such points
are accumulation points. This sort of rough argument has a pitfall, and one has
to be more careful, since some sets are sufficiently complicated that our intuition
may fail us. For example, consider A= { 1/n E JR I n = 1, 2, 3, ... } U {0}. This
is a closed set (verify!) and its only accumulation point is {0}, which lies in
A. But our intuition about "boundary" is not very clear for this set; hence, the
need for inore careful arguments.
2.4.3 Example Let S = {x E JR
accumulation points of S.
Ix
E [0, 1] and x is rational}. Find the
Solution The set of accumulation·points consists of all points in [0, 1). Indeed, let y E [0, 1) and D(y,€) = ]y - e,y + e[ be· a neighborhood of y .. We
Chapter 2 The Topology of Euclidean Space
114
can find rational points in [O, l] arbitrarily close to y (other than y) and in particular in D(y, e). Hence y is an accumulation point. Any point y <t [0, 1] is
not an accumulation point, because y has an e-disk containing it that does not
meet [0, 1].
•
2.4.4 Example Verify Theorem 2.4.Zfor the set A= {(x,y)
E IR.2 j O::; x::;
1 or x = 2}.
Solution The set A is shown in Figure 2.4-1. Clearly, A is closed. It is also
clear that the set of accumulation points of A equals A, so that every accumulation
point of A lies in A. Note that on JR., the set of accumulation points of [0, I] U
{2} is [0, l] without the point {2}.
•
y
(1,0)
FIGURE 2.4-1
(2,0)
X
This set contains all its accumulation points
2.4.5 Example Let S = {(x,y)
E IR.2 1 y
< x2 + I}. Find the accumulation
points of S.
Solution The set S is shown in Figure 2:4-2. The accumulation points con•
stitute the set {(x, y) I y :$ x2 + I}, as is evident from the figure.
2.4.6 Example In a general metric space M, let B(x, r) = {y E M I d(x,y) ::;
r}. ls it true that points of B(x, r) are accumulation points of D(x, r)?
115
Exercises for §2.4
y
0
FIGURE 2.4-2
The set of points (x,y) satisfying y
< x2 + l
Solution This is not true in general (although it is true in JR."). For example,
if Mis a set with the discrete metric, then D(x, 1) = {x}, while B(x, 1) = M.
There are no accumulation points of {x}, since it consists of only one point,
so that if M has more than one point, B(x, I) will contain points that are not
•
accumulation points of D(x, 1).
2.4.7 Example Show that a bounded sequence Xn of distinct points in JR
has an accumulation point; i.e., the set {x,,x2, ... } has an accumulation point.
Solution In Chapter l (see 1.4.3) we proved that there is some subsequencesay Xn1 --0f x,. that converges: Xn 1 - x. Then xis an accumulation point, since
for every c > 0, D(x, c:) contains Xn1 with nk large enough (by the definition of
convergence) and at least one such Xn 1 is not equal to x, since the Xn 1 are all
distinct.
•
Exercises for §2.4
f
= {(x, y) E JR2 I y =0 and O < x < l}.
1.
Find the accumulation points of A
2.
If A C B and x is an accumulation point of A, is x an accumulation point
of Bas well?
116
3.
Chapter 2 The Topology of Euclidean Space
Find the accumulation points of the following sets in IR.2:
a.
{ (m, n) I m. n integers}
b.
{(p,q) I p,q rational}
c.
{(m/n, 1/n) I m,n integers, n ~0}
d.
{(1/n + 1/m, 0) I n, m integers, n ~ 0, m ;if 0}
4.
Let A c JR be nonempty and bounded above and let x = sup(A). Must x
be an accumulation point of A?
5.
Verify Theorem 2.4.2 for the set A= { (x,y) E IR 2
6.
Let M be a set with the discrete metric and A C M be any subset. Find
the set of accumulation points of A.
I x2 + y + 2x = 3}.
§2.5 Closure of a Set
The interior of a set A is the largest open subset of A. Similarly, we can form
the smallest closed set containing a set A. This set is called the closure of A and
is denoted cl(A) or, sometimes, A.
2.5,1 Definition Let (M, d) be a metric space and A c M. The closure of
A, denoted cl(A), is defined to be the intersection of all closed sets containing A.
Since the intersection of any family of closed sets is closed, cl(A) is closed;
it is also clear that cl(A) ::> A, and that A is closed iff cl(A) = A. For example,
on JR 1, cl( ]0, I [ ) = [0, I]. The connection between closure and accumulation
points is the following:
2.5.2 Proposition For A c M, cl(A) consists of A plus the accumulation
points of A. That is, cl(A) = A U {accumulation points of A}.
In other words, to find the closure of a set A, we add to A all of the accumulation points not already in A. Proposition 2.5.2 should be intuitively clear
from the examples presented earlier.
2.5.3 Example Find the closure of A= [0, 1[ U {2} in
R
Exercises for §2.5
117
Solution The accumulation points are [O, I], and so the closure is [O, l] u
{2}. This is clearly the smallest closed set we could find containing A.
•
2.5.4 Example
For any A
c !Rn, show that IR"\ cl(A) is open.
Solution cl(A) is a closed set, and by definition of a closed set, its comple•
ment is. open.
2.5.5 Example
Is it true that cl(A n B) = cl(A) n cl(B)?
Solution No. Take, for example, A = [O, I] and B =]I, 2]. Then A n B =
and cl(A) n cl(B) = { 1}.
•
0
2.5.6 Example For a subset A
if inf{d(x,y) I y EA}= 0.
if
of a metric space, show that x E cl(A)
and only
a = inf{d(x,y) I y E A}. If x E A,
then taking x = y we get a = 0. If x is an accumulation point of A, then
for any c >. 0, there is a y E A such that d(x,y) < €. Thus, again, a = 0.
Conversely, if o = 0 and x (/. A, then for any c > 0 there is a y E A such that
d(x,y) < € (by a property of inf), so that xis an accumulation point of A and thus
x E cl(A).
•
Solution First, let x E cl(A) and
Exercises for §2.5
Ix> y2}.
1.
Find the closure of A= {(x,y) E IR 2
2.
Find the closure of {1/n In= 1,2,3, ... } in IR.
3.
Let A = { (x, y) E IR 2 I x is rational}. Find cl(A ).
4.
a.
For A C Rn, show that cl(A) \ A consists entirely of accumulation
points of A.
b.
Need it be all of them?
S.
In a general metric space M, let A c D(x, r) for some x E Mand r > 0.
Show that cl(A) C B(x, r) ·= {y E MI d(x,y) 5 r}.
118
Chapter 2 The Topology of Euclidean Space
§2.6 Boundary of a Set
If we consider the unit disk in R.2, we know what we would like to call the
boundary-the obvious choice is the unit circle. For more complicated sets,
such as the rationals, it is not as clear what the boundary should be. Therefore
a precise definition is needed.
2.6.1 Definition For a given set A in a metric space (M,d), the boundary
is defined to be the set
bd(A) = cl(A) n cl(M \ A).
Sometimes the notation 8A = bd(A) is used.
Since the intersection of two closed sets is closed, bd(A) is a closed set.
Also, note that bd(A) = bd(M \ A). From Proposition 2.5.2, we deduce that the
boundary is also described as follows:
2.6.2 Proposition Let A c M. Then x e bd(A) iff for every e > 0, D(x, e)
contains points of A and of M \ A (these points might include x itself). See
Figure 2.6-1.
FIGURE 2.6-1
Disks about boundary points contain points in the set and
· points not in the set
The original definition states that bd(A) is the border between A and M \ A.
This is also what Proposition 2.6.2 is asserting, and therefore it should be intuitively clear.
119
§2.6 Boundary of a Set
2.6.3 Example Let A= {x E IR Ix E [0, 1] and xis rational}. Find bd(A).
=
=
[0, I], since, for any c: > 0 and x E [0, l], D(x, c:)
]x- c:, x+E:[ contains both rational and irrational points. The reader should also
verify that bd(A) = [O, 1] using the original definition of bd(A). This example
shows that if A C B, it does not necessarily follow that bd(A) C bd(B). (Let A
be as in this example and B = [O, 1) in R) •
Solution bd(A)
2.6.4 Example If x E bd(A), must x be an accumulation point?
Solution No. For example, let A = { 0} c IR. Then A has no accumulation
points, but bd(A) = {O}. •
2.6.5 Example Let S = {(x,y) E IR 2 I x2 - y2 > I}. Find bd(S).
Solution The set S is sketched in Figure 2.6-2. Clearly, 'bd(S) consists of the
hyperbola x2 - y2 = 1.
•
y
FIGURE 2.6-2 The set S in Example 2.6.5
Chapter 2 The Topology of Euclidean Space
120
Exercises for §2.6
In= 1,2,3, ... }.
1.
Find bd(A) where A= {1/n E JR
2.
If x E cl(A) \ A, then show that x E bd(A). Is the converse true?
3.
Find bd(A) where A= {(x,y) E 1R2 Ix :5 y}.
4.
Is it always true that bd(A) = bd(int A)?
5.
Let A c JR be bound~d and nonempty and let x = sup(A). Is x E bd(A)?
6.
Prove that the boundary of a set in IR. 2 with the standard metric is the same
as it would be with the taxicab metric.
§2.7 Sequences
The definition of convergence of a sequence in IR.n is similar to that of convergence of a sequence of real numbers. In fact, the definition is meanin~ful in a
metric space.
2.7.1 Definition Let (M,d) be a metric space and Xk a sequence of points
in M. We say that
Xk
converges to a point x E M, written
lim
Xk
= x or
Xk -+
x as k
-+
oo,
k-+<X>
provided that for every open set U containing x, there is an integer N such that
E U whenever k?: N. See Figure 2.7-1.
Xk
This definition coincides with the usual c:-N definition, as the next theorem
shows.
2.7.2 Proposition A sequence Xk in M converges to x
E M
if! for every
c > 0 there is an N such that k?: N implies d(x,xk) < c:.
Let us specialize our discussion of convergence to ]Rn. In this case, Definition
2.7.1 reads: A sequence of points vk in lRn converges to v if for every c > 0
there is an N such that llvk - vii < c: whenever k ?: N. Indeed, this is so just
because d(v, vk) !Iv - vkll- Also note that for n l, we recover the definition
of convergence on JR.. The definition is essentially the same for convergence in
=
=
§2.7 Sequences
121
•
•
FIGURE 2.7-1
Convergence of a sequence in IR 2
any other normed space: If v· is a normed space and vk is a sequence in V, then
Vk converges to v E V if llvk - vii -+ 0 as k -+ oo. In IR 2 , the situation appears
as in Figure 2.7-1. Much of what we did in one dimension still makes sense in
!Rn, but some does not. Most noticeably, there is no natural order imposed on
!Rn, and so the discussion of monotone sequences does not apply. Much of the
rest goes through by replacing absolute values by norms or distances throughout.
For example, since normed spaces are vector spaces we still know how to add
and multiply by numbers and so we can examine the arithmetic of sequences.
2.7.3 Proposition Suppose Vt and wk are sequences of vectors in a normed
space (such as !Rn), ,\k is a sequence of numbers in IR, ,\ E IR is a constant
number; and u is a constant vector. If vk -+ v, Wk -+ w, and ,\k -+ ,\, then
+ Wk
i.
Vk
ii.
>.vk-+ >.v.
iii.
>.ku-+ >.u.
-+ V
+ W.
iv.
v.
If >.k
I
I
~ 0 and >. ~ 0, then ,\k Vk -+ Av.
For some computations, we use the specific representation of vectors in !Rn
in terms of a finite number of coordinates. If v and vk are in !Rn, then we write
v (v 1, ••• , yft) and Vk = (vk, vl, ... ,'1) , where each vi and each v{ lie in R
=
2.7.4 Proposition vk -+ v in !Rn if and ~nly if each sequence of coordinates
converges to the corresponding coordinate of v as a sequence in IR. That is,
iim vk = v in !Rn iff Jim v~ = vi in JR for each i = I, 2, ... , n.
....... OCI
k-c,o
Chapter 2 The Topology of Euclidean Space
122
This can be written a bit more compactly as
lim (vl, . .. , v;> = ( lim vl, ... , lim v:) .
k-+oo
k-oo
k-+oo
For the case of JR 2, Proposition 2.7.4 should be clear from Figure 2.7-1. Note
that this proposition does not make sense in the general metric space setting.
2.7.S Example Show that the sequence of vectors
verges to (0, 0) in JR 2 as k
--+
Vk
= (1/k, l/k2) con-
oo.
Solution The component sequences 1/k and 1/k2 each converge to 0. By
Proposition 2.7.4, the vectors (1/k, 1/k2) converge to (0, 0) in JR 2•
•
We can use sequences to determine whether a set is closed. The method is
as follows:
2.7.6 Proposition
i.
A set A C M is closed ijf for every sequence Xk E A that converges in M,
the limit lies in A.
ii.
For a set BC M, x E cl(B) ijf there is a sequence Xt E B V.:ith
is,
Xk--+
x.
One should note that the sequences in i and ii are allowed to be trivial-that
= x for all k is allowed.
Xk
2.7.7 Example Let Xn E JR"' be a convergent sequence with llxnll $ 1 for
all n. Show that the limit x also satisfies llxll $ I. If llxnll < l, then must we
have llxll < 1?
I IIYII $ 1} is closed. Hence, by
Proposition 2.7.6i, Xn E B and Xn --+ x implies x E B. This is not true if $ is
replaced by <; for example, on JR, consider Xn = 1 - (1/n).
•
·
Solution The unit ball B = {y E IR."'
2.7.8 Example Find the closure of A= { 1/n E JR I n = 1, 2, ... }.
Exercises for §2. 7
123
Solution We can use, for example, Proposition 2.7.6ii. The sequence 1/n-+
0, and so O E cl(A). Taldng other sequences from A will not yield any new
points, and so cl(A) =AU {O}. •
Exercises for §2.7
1.
Find the limit of the sequence ((sin nt /n, 1/n2 ) in JR.2.
2.
Let Xn-+ x in Rm. Show that A= {xn In= 1,2, ... } U {x} is closed.
3.
Let A C Rm,
4.
Verify Proposition 2.7.6ii for the set B = {(x,y) E R 2
5.
Let S = {x ER
Xn
E A, and
Ix
Xn -+
x. Show that x E cl(A).
Ix< y}.
is rational and x2 < 2}. Compute cl(S).
§2.8 Completeness
In a general metric space, or even in Rn, the notions of monotone sequence
and least upper bound do not make sense. However, the Cauchy sequence
characterization of completeness does.
2.8.1 Definition Let (M, d) be a metric space. A Cauchy sequence is a
sequence Xk E M such that for all e > 0, there is an N such that k, I 2:: N implies
< c. The space M is called complete ijf every Cauchy sequence in M
converges to a point in M.
d(xk, x 1)
In. a normed space, such as Rn, a sequence Vk is a Cauchy sequence if for
every c > 0 there is an N such that !lvk - vjll < c whenever k,j 2:: N.
2.8.2 Definition A sequence xk in a normed space is bounded if there is
a number B such that llxk II ::; B for every k. Jn· a metric space we require that
there be a point x 0 such that d(xk, xo) ::; B for all k.
2.8.3 Proposition A convergent sequence in a normed or metric space is
bounded.
Chapter 2 The Topology of Euclidean Space
124
Some general properties of Cauchy sequences follow; compare the development in §1.4.
·
2.8.4 Proposition
i.
Every convergent sequence in a metric space is a Cauchy sequence.
ii.
A Cauchy sequence in a metric space must be bounded.
iii.
If a subsequence of a Cauchy sequence converges to x, then the sequence
converges to x.
According to Theorem 1.4.4, JR. is a complete metric space. An .example of
an incomplete space is the set ofrational numbers with d(x,y) = lx-yj. Another
example is JR\ {O} with the same metric. Theorem 2.8.5 asserts that R_n is also
a complete metric space.
2.8.5 Theorem A sequence Xk in !Rn converges to a point in !Rn iff it is a
Cauchy sequence.
This theorem follows from the corresponding result for lR by using Proposition 2.7.4--convergence in !Rn is equivalent to componentwise convergence in
R
.
The notion of cluster point will help us in the development of completeness.
We generalize the definition and some basic properties we used in §1.5.
2.8.6 Definition A point x in a metric space is called a cluster point of
Xk if for every f: > 0, there are infinitely many values of k with
the sequence
d(Xt,X)
< €.
2.8.7 Proposition If Xk is a sequence in a metric space M and x
i.
x is a cluster point iff for every f:
k > N with d(xk, x) < c.
ii.
xis a cluster point
iii.
x.e:
iv.
Xk -+
--+
E M, then
> 0 and for each integer N, there is a
iff there is a subsequence convergent to x.
x if and only if every subsequence converges to x.
x if! every subsequence of X.e: has a further subsequence that converges to x.
125
Exercises for §2.8
2.8.8 Example
Let (M, d) be a complete metric space and N C M a closed
subset. Show that N is complete as well.
Solution Here it is understood that the metric used on N is the same as that
used on M. Let Xn be a Cauchy sequence in N. Regarding it as a sequence in ,
M, it is still a Cauchy sequence, and so, since M is complete, it converges in M
to, say, x.-Since Xn EN and Xn - t x and N is closed, x EN. Thus Xn converges
in N, and so N is complete.
•
2.8.9 Example
Show that the set of cluster points of a sequence Xk is closed.
Solution Let Yn be a sequence of cluster points and Yn - y. For every c: > 0,
thPrP. is an N such that d(yn,Y) < c:/2 if n 2'. N. There are infinitely many Xk
with d(xk,YN) < g/2, since YN is a cluster point. Thus, by the triangle inequality,
d(xk,Y) :::; d(xk,YN) + d(yN,Y)
of Xk,
•
< c:/2 + c:/2 =
c:, and soy is also a cluster point
Exercises for §2.8
1.
Suppose N is a complete subset of a metric space (M, d). Show that N is
closed.
2,
Let (M, d) be a metric space with the property that every bounded sequence
has a convergent subsequence: Prove that M is complete.
3.
Let M be a set with the discrete metric. Is M complete?
4.
Show that a Cauchy sequence can have at most one cluster point.
5.
Suppose that a metric space M has the property that every bounded sequence has at least one cluster point in M. Show that M is complete.
~
§2.9 Series of Real Numbers _and Vectors
Just as in IR, we can cons.ider series in IRn or, more generally, in a normed space.
A
2.9.1 Definition
I::f:o
Let V be a normed space. series
Xk, where Xk E· V ,
is said to converge to x E V if the sequence of partial sums sk =
x;
converges to x, and if so we write
xk =x.
~:O
I::=0
Chapter 2 The Topology of Euclidean
.. Space
126
As with our discussion of sequences in §2.7, L~Xk =xis equivalent to the
corresponding component series converging to the components of x. Applying
the Cauchy criterion to Sk yields the following:
2.9.2 Theorem Let V be a complete normed space (such as Rn). A series
E Xt in V converges if! for every e > 0, there is an N such that k ~ N implies
llxt +Xt+I + · · · + Xk+pll < e for all integers p = 0, 1, 2, ....
In particular, taking p = 0, we see that if E Xt converges, then Xt -. 0 as
k -. oo (see Exercise 3 in this section).
A series E x1: is said to be absolutely convergent iff the real series E llx1:II
converges. A series that is convergent but not absolutely convergent is said to
be conditionally convergent. Using Theorem 2.9.2 and the triangle inequality
yields the following:
2.9.3 Theorem In a complete normed space, if E Xt converges absolutely,
then
L Xt converges.
This theorem is useful because it allows us to apply tests-for real series (such
as the ratio test) to the series E llxtll to test for convergence of E Xt. Of course,
it could happen that a particular test fails even though E x1: is convergent, in
which case some other method is needed.
Now we shall review the most important tests for the convergence of a real
series. Some of the main points are presented in the following theorem. Some
other tests for convergence will occur in the exercises and, later, in Chapter 5.
In part vi of the theorem, we assume that the reader is familiar with one-variable
integration. The needed theory is reviewed in §4.8.
2.9.4 Theorem
i.
Geometric series: If lrl < I, then L:0 rn converges to 1/(1 diverges (does not converge) if lrl ~ 1.
ii.
Comparison test: If
1 a1: converges, ak 2::: 0, and O ::;; ht ::;; ak, then
L:i bt converges; if E:1 c1: diverges, Ct ~ 0, and O ::; Ck $ dk, then
1 dk diverges.
E:
E:
r), and
127
§2.9 Series of Real Numbers and Vectors
iii.
p-series test: :E:1 1/nP converges if p > l and diverges to oo (that is,
the partial sums increase without bound) if p 5 1.
iv.
Ratio test: Suppose that lim,,_ 00 lan+1/anl exists and is strictly less than 1.
Then
1 an converges absolutely. lf the limit is strictly greater than l,
then the series diverges. lf the limit equals l, then the test is inconclusive.
v.
Root test: Suppose that limn-oo lanl 1/n exists and is strictly less than 1.
Then
1 an converges absolutely. lf the limit is strictly greater than l,
then the series diverges; if the limit equals 1. then the test is inconclusive.
vi.
Integral test: If f is continuous, nonnegative, and monotone decreasing
f(x)dx converge or diverge together.
on [l, +oo[, then E:iJ(n) and
vii.
Ratio comparison test: Let
1 a; and
1 b; be series, with b; > 0
for all i.
lf (a) la;I 5 b; for all i, or lim;. . . . oo la;l/b; < oo, and if (b) :E:1 b; is
convergent, then
.
1 a; is convergent.
lf (c) a; ~ b; for all i, or lim;- 00 a;/b; > 0, and if (d)
1 b; is
divergent, then :E~ a; is divergent.
r::
E:
ft
E:
r::
E:
r::
E:
viii. Alternating series: lf
1 a; is such that the a; alternate in sign, are
decreasing in absolute value, and a; -+ 0 as i -+ oo, then the series
converges. The error in approximating the sum by Sn = :E7=t a; is no
greater than ja,r+1 I-
2.9.5 Example Let Xn = (l/n2 , l/n). Does :Exn converge?
Solution . No, because the harmonic series
2.9.6 Example Let
II :E:Oxnll :5 2.
llxnll
5
1/2n;
I: l / n diverges, by 2.9.4iii.
•
prove that :Exn converges and
Solution We verify the conditions of Theorem 2.9.2. First,
Chapter 2 The Topology of Euclidean Space
128
(by the formula I::O arn = a/(1 - r) for the sum of a geometric series). Thus,
given e > 0, if we choose N so that 1/2N-t < c:, then the conditions of 2.9.2
will hold. Hence I: Xk converges. Moreover, the partial sums satisfy _
Thus the limits also satisfies llsll :S 2 (as in Example 2.7.7). We could
also show that I: Ilxn 11 converges by comparison with the geometric series
I: l/2n. •
(X)
2.9.7 Example Test for convergence:
L ;.
n=I
Solution The ratio test is applicable:
•
and so the series converges.
00
2.9.8 Example Determine whether or not
L n-f!--converges.
+I
11=1
Solution Observe that for x 2 1, f(x) = x/(x2 + 1) is positive and continuous.
Since f'(x) = (-x2 + 1)/(x2 + 1)2 :S O,f is monotone decreasing. Also,
1
00
1
-xdx
- = lim
x 2 + 1 b-+oo
= lim
b-+oo
lb
I
xdx
x1 + I
1 + I)]b = lim -1log (b-+1)
- .
[-log(.x2
2
2
I
b-+oo
2
2
As b - oo, (1/2)log[(b 2 +1)/2] -too, and so the series diverges by the integral
129
Exercises for §2.9
test. One can also proceed as follows:
n
I
n
- > - - = 2n
-•
n2 + I - n 2 + n2
and so by comparison with the divergent series ½"i:,(1/n), we get divergence.
•
2.9.9 Example Discuss the convergence of the series
L (-IYv'l
i+4
OO
1•
i=l
a; = (-l);Vi/(i + 4). We notice that for i large, lad appears to "behave like" b; = l / Vi. The series
b; diverges, by the p-series
test. To make the comparison between lad and b; precise, look at the ratios:
lim;_ 00 ladfb; = lim;- 00 [i/(i +4)] = 1, and so
la;j diverges as well; hence
our series is not absolutely convergent.
The series does look as if it could be alternating: the terms alternate in
sign, and lim;-+oo a; = 0. To see whether the absolute values lad form a
decreasing sequence, it is convenient to look at the function J(x) =
vx/(x+4). The derivative is
Solution Let
r:,:,
:E:,
/'(x) = (1/2.fi )(x+4), . (x + 4) 2
VX·
l = 2/.fi- .fi/2 =
4- x
'
(x + 4)2
2.fi(x + 4)2
which is negative for x > 4, and so f(x) is decreasing for x > 4. Since lad = f(i),
we have ia4j > iasl > la61 > · · ·, which implies that our series "f:, a;, with its
first three terms omitted, is alternating. It follows that the series is convergent;
_since it is not absolutely convergent, it is conditionally convergent. •
Exercises for §2.9
1.
.
~ ((sinn)ff nl ) converges.
Determme whether
2
2.
Show that the seties in Example 2.9.6 converges absolutely.
3.
Let "f:, Xk converge in Rn. Show that
';:1 ~ ,
Xt -+
0 = (0, ... , 0) E Rn.
Chapter 2 The Topology of Euclidean Space
130
4.
00
2n +n
Test for convergence "~ -3n_n
- -.
n=3
5.
Test for convergence
"n.3
oo
~
I
n•
n=O
Theorem Proofs for Chapter 2
2.1.2 Proposition In a metric space, each £-disk D(x, £) is open.
Proof Choose y E D(x, £). We must produce an £' such that D(y, £1 ) c
D(x,£). Figure 2.1-5 suggests that we try £1 = c - d(x,y), which is strictly
positive, since d(x,y) < £. With this choice (which depends on y), we shall
show that D(y,£') C D(x,£). Let z E D(y,£ 1 ), so that d(z,y) < £1• We need to
prove that d(z,x) < £. But, by the triangle inequality, d(z,x) $ d(z,y)+d(y,x) <
£1 +d(y,x), and by the choice of£', £1 +d(y,x) = £. •
2.1.3 Proposition In a metric space (M, d),
I.
The intersection of a finite number of open subsets of M is open.
ii.
The union of an arbitrary collection of open subsets of M is open.
iii.
The empty set 0 and the whole space M are open.
Proof
i.
It suffices to prove that the intersection of two open sets is open, since
we can use induction to get the general result by writing A 1 n ••• n An =
(A1 n--•nAn_i)nAn.
Let A and B be open and C
=
=
and
D(x, €1 ) c B.
A n B; if C
0, C is open, by a
degenerate case of the definition. Therefore, suppose x E C. Since A and
Bare open, there exist£, £ 1 > 0 such that
D(x, €)CA
Let £ 11 be the smaller of c and e:'. Then D(x, € 11 ) c D(x, £ ), and so
D(x, € 11 ) C A; and similarly, D(x, € 11 ) C B, and so D(x, c 11 ) C A n B = C,
as required.
Theorem Proofs for Chapter 2
131
ii.
The proof for unions is easier. Let U, V, ... be a collection of open sets
with union A. For x EA, x E U for some U in the collection. Hence,
since U is open, D(x, c) C UC A for some c > 0, proving that A is open.
iii.
The empty set 0 and the whole space M are open directly from the definition.
•
2.4.2 Theorem A set A
C M is closed ijJ the accumulation points of A
belong to A.
Proof
First, suppose A is closed. Then M\A is open. Thus if x E M\A, there
is an c > 0 such that D(x, c) C M \ A; i.e., D(x,c) n A = 0. Thus x is not an
accumulation point, and so A contains all its accumulation points. Conversely,
suppose A contains all its accumulation points. Let x E M \ A. Since x is not
an accumulation point and x (j A, there is an £ > 0 such that D(x, £) n A = 0;
i.e., D(x, c) c M \ A. Hence M \ A is open, and so A is closed.
•
2.5.2 Proposition For A C M, cl(A) consists of A plus the accumulation
points of A. That is, cl(A) = A U { accumulation points of A}.
Proof Let B =AU{x Ix is an accumulation point of A}. By 2.4.2, any closed
set containing A also contains B. If B is closed, then it will therefore be the
smallest closed set containing A, so that B = cl(A). To show that Bis closed, we
use 2.4.2; let y be an accumulation point of B. If c > 0, then D(y, £) contains
other points of B. If z is such a point, then either z E A or z is an accumulation
point of A. In the latter case, D(z, £ - d(z, y )) is an open set containing z, and so,
by 2.4.1, it contains some other points of A, necessarily distinct from y. Thus y
is an accumulation point of A, and soy E B. Thus B is closed.
•
2.6.2 Proposition Let A c M. Then x e bd(A) ijJ for every£ > 0, D(x, £)
contains points of A and of M \ A (these points might include x itself).
Proof Let X E bd(A) = cl(A) n cl(M \ A). Either X E A or X E M \ A. If
x E A, then, by Proposition 2.5.2, x is an accumulation point of M \ A, and-the
conclusion follows. The case x E M \ A and the converse are similar.
•
2.7.2 Propositi9n A sequence Xt in M converges to x E M ijJ for every
> 0 there is an N such that k ~ N implies d(x, Xt) < £.
£
Chapter 2 The Topology of Euclidean Space
132
x1, -+ x and e > 0. Since D(x,c) is open, there is an N
such that k ~ N implies x1, E D(x, c ), or d(x, x1,) < c, as required. Conversely,
suppose the condition holds and U is a neighborhood of x. Find e > 0 such that
D(x,e) c U. Then there is an N such that k ~ N implies d(x,xk) < e, that is,
Xt E D(x,e) C u and so XJ;-+ x, by definition.
Proof Suppose
•
2.7.3 Proposition Suppose Vt and Wk are sequences of vectors in a normed
space (such as IR.n), Ak is a sequence of numbers in JR., A E R is a constant
number; and u is a constant vector. If v,, -+ v, Wt -+ w, and A.t -+ A, then
i.
Vk
ii.
AV.t-+ Av.
+Wt-+ v+w.
iii.
iv.
A.tVt-+ AV.
v.
If At
I
I
~ 0 and A ~ 0, then At Vt -+
>/·
Proof In each case the proof parallels that of 1.2.7, and so we leave the
adaptation to the reader. To save effort, note that, as in 1.2.7, iv implies each of
ii and iii, and so only proof of i, iv, and v need be shown.
•
2.7.4 Proposition v,, -+ v in IR.n if and only if each sequence of coordinates
converges to the corresponding coordinate of v as a sequence in JR.. That is,
lim Vt= v in Rn iff lim v~ = v; in JR/or each i = 1, 2, ... ,n.
k--+oo
·
k--+oo
Proof If 6(v, vk) = max.{jv 1 - vfl, ... , jvn - v;I}, then 6(v, v.1:) $ llv - v1,II $
y'n8(v, Vk) by 1.6.7. Suppose Vt -+ v in IR.n and I $ j $ n. Letting e > 0, then
there is an integer K such that llv - Vtll < e whenever k ~ K. Therefore, for
such k, we have jvj - vi I $ 6(v, vk) $ llv - Vtll < e, and so vi-+ vj in JR.. The
proof of the converse shows where the finiteness of n is vital. Suppose that for
each j = 1, 2, ... , n, we have limk-oo vi = vi. Let e > 0. For any eo > 0, there
are integers Ki, K2, K3, ... , Kn such that
whenever k ~
lv 1 - v}I < eo
jv2 - vii < t:o
whenever k ~ K2,
< io
whenever k ~ Kn.
I~ - v;I
K1,
Tlieorem Proofs for Chapter 2
133
There are only finitely many such K;, and so one of them is largest. Let K =
max(K1, K2, ... , Kn), For k ~ K, 1.6.7 again gives llv - Vt-II 5 ✓,i6(v, Vt) ~
.fiigo. If this is done with go = g / ✓,i, we obtain llv - Vtll < e whenever k ~ K,
and so Vt--+ Vin Rn.
•
2.7.6 Proposition
i.
A set A C M is closed iff for every sequence Xk E A that converges in M,
the limit lies in A.
ii.
For a set B C M. x E cl(B) if! there is a sequence Xk E B with
Xk --+
x.
Proof
i.
. First, suppose A is closed. Assume Xt --+ x and x ff. A. Then x is an
accumulation point of A, for any neighborhood of x contains Xt E A for k
large. Hence, x E A, by Theorem 2.4.2.
Conversely, we shall use Theorem 2.4.2 to show that A is closed. Let
x be an accumulation point of A, and choose Xt E D(x, 1/k) n A. Then
Xt --+ x, since for any e > 0, we can choose N ~ 1/g; then k ~ N implies
Xt E D(x, g); see Figure 2.P-1. Hence, by hypothesis, x E A, and so A is
closed.
ii.
The argument here is similar and we shall leave it as an exercise.
FIGURE 2.P-1
•
Accumulation points of a set
2.8.3 Proposition A convergent sequence in a normed or metric space is
bounded.
Chapter 2 The Topology of Euclidean Space
134
Proof We proceed as in 1.2.6.
if n
~
N. Thus
Xn
E D(x, 1) if n
If Xn - x, there is an N such that d(xn,x)
N. Let
<1
~
R = rnax{l, d(x,,x), ... ,d(XN-1,x)}.
Then d(x,xn) :SR for all n, and so Xn E D(x, R) for all n.
•
2.8.4 Proposition
i.
Every convergent sequence in a metric space is a Cauchy sequence.
ii.
A Cauchy sequence in a metric space must be bounded.
iii.
If a subsequence of a Cauchy sequence converges to x, then the sequence
converges to x.
Proof i is proved just as in 1.4.2, ii as in 1.4.6, and iii as in 1.4.7. In each
case we replace absolute values by corresponding distances; for example, read
d(X,Xm) for Ix - Xml-
•
2.8.5 Theorem
A sequence Xk in ]Rn converges to a point in ]Rn
if! it is a
Cauchy sequence.
If xk converges to x, then for c: > 0, choose N so that k ~ N implies
l!xk - xii < c/2. Then, for k,l ~ N, llxk - xdl = ll(xk - x) + (x - x1)II :S
l!xk - xi I + !Ix - xd I < c: /2 + c: /2 = c: by the triangle inequality. Thus, Xk is a
Cauchy sequence.
'
Conversely, suppose Xk is a Cauchy sequence. Since lx1 :S l!xk - xiii,
the components are also Cauchy sequences on the real line. By completeness
of IR and Theorem 1.4.4, xj converges to, say, xi. By Proposition 2.7.4, Xk
converges to x = (x 1, ... ,x").
Proof
x/1
•
2.8.7 Proposition If xk is a sequence in a metric space M and x
i.
x is a cluster point if! for every c:
k > N with d(xk,x) < c:.
ii.
x is a cluster point
iii.
Xk - x
E M, then
> 0 and for each integer N, there is a
if! there is a subsequence convergent to x.
if and only if every subsequence converges to x.
135
Theorem Proofs for Chapter 2
iv.
Xk --+ x iff every subsequence of x1c has a further subsequence that converges to x.
Proof This proof is analogous to the proof of 1.5.2, and we leave the adaptation to the reader.
•
2.9.2 Theorem Let V be a complete normed space (such as IR.n). A series
E Xk in V converges iff for every E > 0, there is an N such that k ;?: N implies
llxk + Xk+I + · · · + Xk+pll
< E for all integers p = 0, I, 2, ....
Proof Let St= z=:= 1 x;. By completeness, Exk converges iff sk is a Cauchy
sequence. This is true iff for every E > 0 there is an N such that l ;?: N implies
!Is, - St+qll < £ for all q = I, 2, .... But lls1+q - s,11 = llx1+1 + · · · + XJ+qll, and so
the result follows with k = l + I and p = q - I = 0, 1, 2, . . . . •
2.9.3 Theorem In a complete normed space, if E Xk converges absolutely,
then
E Xt converges.
Proof This follows from Theorem 2.9.2 and the triangle inequality:
2.9.4 Theorem
i.
Geometric series: If
lrl <
I, then
diverges (does not converge)
ii.
Comparison test: If
if lrl
E:0 rn
converges to 1/(1 - r), and
~ I.
z=: ak converges, ak
E: bk converges; if z=: Ck diverges,
1
Ck
~ 0, and O :5 bk
~ 0, and O :5 ck
:5 at, then
:5 dk> then
z=: dk diverges.
p-series test: z=;: 1/nP converges if p > 1 and diverges to oo (that is,
1
1
1
iii.
1
the partial sums increase without bound) if p :5 I.
iv.
Ratio test: Suppose that limn-oo lan+if anl exists and is strictly less than I.
z=:
Then
1 a,, converges absolutely. If the limit is strictly greater than I,
then the series diverges. If the limit equals 1, then the test is inconclusive.
Chapter 2 The Topology of Euclidean Space
136
v.
Root test: Suppose that limn-+oo lanl'/n exists and is strictly less than 1.
Then
1 an converges absolutely. If the limit is strictly greater than I,
then the series diverges; if the limit equals 1, then the test is inconclusive.
vi.
Integral test: If f is continuous. nonnegative, and monotone decreasing
on [1, +oo[. then E:iJ(n) and
f(x)dx converge or diverge together.
vii.
Ratio comparison test: Let E~ a; and E~ b; be series, with b; > 0
for all i.
If ( a) la;! ~ b;for all i, or if lim;-+oo lad/ b; < oo, and if (b) E:, b;
is convergent, then
a; is convergent.
If (c) a; ~ b; for all i, or lim;...... 00 a;/b; > 0, and if (d)
b; is
divergent, then
1 a; is divergent.
E:
ft
E:,
E:,
I;;:
E:,
viii. Alternating series: If
a; is such that the a; alternate in sign, are
decreasing in absolute value, and a; -+ 0 as i -+ oo, then the series
converges. The error in approximating the sum by Sn =
a; is no
greater than lan+1 I-
Z:7=,
Proof
i.
We shall use the following:
Power Lemma
~
lim ,n = {
n-+oo
O
if r > 1
if r= 1
if O ~ r < I.
Proof First consider the case r > 1. We write r as 1 + s, where s > 0. If we
expand rn (1 +st, we get rn
1 + ns + (other positive terms). Therefore,
rn ~ 1 + ns, which goes to oo as n -+ oo. Second, if r 1, then rn 1 for all
n, and so limn-+oo rn = I. Finally, if O ~ r < 1, then, excluding the easy case
r 0, we let p = 1/r, so that p > 1, and hence limn ...... 00 pn oo. Therefore,
limn-+oo rn limn-+oo 1/ pn 0. l'
=
=
=
=
=
=
=
=
By elementary algebra,
2
I - rn+I
I + r + r + • · · + rn = - - -
1- r
if r F- 1. By the power lemma, ,n+l -+ 0 as n -+ 00 if lrl < 1, and Irr' -+ 00
if lrl > 1. Thus we have convergence if lrl < 1 and divergence if lrl > 1.
Obviously,
rn diverges if lrl = I, since rn f+ 0.
E:O
Theorem Proofs for Chapter 2
ii.
137
The partial sums of the series I:~ 1 ak form a Cauchy sequence, and thus
the partial sums of the series I:~1 bk also form a Cauchy sequence, since
for any k and p we have bk+ bk+I + · · · + bk+p ::; ak + ak+I + · · · + ak+p·
Hence
1 bk converges. A positive series can diverge only to +oo,
and so, given M > 0, we can find k o such that k 2': k o implies that
c 1 + c2 + · · ·+ck 2': M. Therefore, for k 2': ko, di + d2 + · · · + dk 2': M, so
that
1 dk also diverges to oo.
I:::
I:::
iii.
First suppose that p ::; 1; in this case I/ nP 2': I/ n for all n = I, 2, ....
Therefore, by ii,
1(1/nP) will diverge if the hannonic series
1(1/n)
diverges. But we proved this in 1.2.20.
Now suppose that p > I. If we let
I:::
I:::
Sk
1
1
1
1
= IP + 2P + 3P + ... + kP'
then sk is an increasing sequence of positive real numbers. On the other
hand,
Sz;
-
I
=
_!_ + (_!_ + _!_) + (_!_ + _!_ + _!_ + ..!..)
IP
2P
3P
4P
5P
+ ... + C2k~I )P + ... + (2k
1
2
4
6P
7P
~ l)P)
2k-l
<-+-+-+···+--- lP 2P 4P
(2k-l)P
1
1
1
1
= --+--+--+··•+--:---,---,IP-I
2P-I 4p-l
(2k-l)p-l
1
<--~1~
1--2P-l
1
(why?). Thus the sequence {sk} is bounded above by ( 1 - 2 p-l
therefore
Note.
'\'00
Lm=I
)-I
1
nP converges.
We can also prove iii using the integral test for positive series (see
vi of the theorem). The demonstration given here, however, also proves
the Cauchy condensation test: Let I: an be a series of positive terms with
~n+I ::; an. Then I: an converges iff Lj~I 2ia2 1 converges (G. J. Porter, "An
Alternative to the Integral Test for Infinite Series," Am. Math. Monthly 19
(1972), p. 634).
;
138
iv.
Chapter 2 The Topology of Euclidean Space
Suppose that limn-+oo lan+i/anl = r < I. Chooser' such that r < r' < 1,
and let N be such that n ~ N implies lan+i/anl < r'. By induction,
laN+pl < laNl(r')P. Consider the series lad+··· +laNl+laNlr' +laNl(r')2 +
laNl(r')3 + · · · . This converges to
laNI
lad+···+ laN-d + -1 - , .
-r
r;:
By ii we can conclude that
1 latl converges. If limn-+oo lan+l / anl =
r > 1, chooser' such that 1 < r' < r, and let N be such that n ~ N implies
that lon+danl > r'. Hence laN+pl > (r')PlaNI, and so limn-+oo laNI = oo,
whereas the limit would have to be zero if the sum converged. Thus
1 Ok diverges. To see that the test fails if limn_,oo lan+i/onl = 1, consider the series 1 + l + l + · •·, and
1 1/nP for p > 1. In both cases,
limn_,00 lon+i/anl = l, but the first series diverges and the second converges.
r::
v.
E:
Suppose that limn ..... 00 (1anl) 1ln = r < 1. Choose r' such that r < r' <
I and N such that n ;?: N implies that lanp!n < r'; in other words,
lanl < (r'Y'. The series la1 I + la2I + · · · + laN-1 I + (r't + (r')N+l + · · ·
converges to lad+ la2I +· · · + loN-d +(r'Y' /(1- r'), and so, by ii,
1 at
converges. If limn-oo(lanl) 1ln = r > 1, choose l < r' < rand N such
that n ;?: N implies that lanp!n > r', or, in other words, lanl > (r')n.
Hence, limn-oo lanl = oo, and therefore,
1 a1c diverges.
To show that the test fails when limn-oo(lanl) 1ln = 1, observe that,
by elementary calculus,
E:
E:
lim
n-+oo
(-I)
t/n
n
1 ) 1/n
lim ( - 2
=1
n
= 1 and
n--+OO
(talce logarithms and note that (logx)/x-+ 0 as x -+ oo). But
diverges and
I/n 2 converges.
E:,
vi.
E: 1/n
1
For this part, we accept some elementary facts about integrals from calculus (see §4.8 for a review). In Figure 2.P-2a, the rectangles of areas
a1 , a2, ... , On enclose more area than that under the curve from x = 1 to
x = n + l. Therefore,
a1+a2+···+an~
J.
1
n+I
f(x)dx.
If we now consider Figure 2.P-2b and talce the area from x = l to x = n,
we have
a2 + a3
+ · · · + On
s· inf(x)dx.
Theorem Proofs for Chapter 2
139
y
l
n n+I
2
(a)
y
l
2
n-ln
(b)
FIGURE 2.P-2
Inequalities needed for the integral test
Adding a1 to both sides gives
a, +a2 +a3 +···+an$ a1 +
in f(x)dx.
Combining the two results yields
ln+I f(x)dx $ a, + a2 + · · · + an $ a1 + in f(x)dx.
Ji°°
f(x)dx is finite, then the right-hand inProvided that the integral
equality implies that the series
1 an is also finite, by the completeness
property of R But if
f(x)dx is infinite, the left-hand inequality shows
that the series is also infinite. Hence, the series and the integral converge
or diverge together.
ft'
vii.
I:;:
For instance, suppose that lim;- 00 ja;lf b; = M < oo, with all b; > 0.
Then for large enough i, we have ja;I/ b; < M + 1, i.e., lad < (M + l)b;. If
Eb; converges, so does E(M + l)b;, and hence Ea; converges, by the
comparison test. The other cases are similar.
Chapter 2 The Topology of Euclidean Space
140
I:;:,
a; be an alternating series. Assuming a, > 0, and if we let
viii. Let
b; = (-1Y+ 1a;, then all the b; are positive, and our series is b1 -bi +b3 b4 + bs · · ·. In addition, we have b, > b2 > b3 > · · · and lim;--+oo b; = 0.
Each even partial sum S2n can be grouped as (b1 - b2) + (b3 - b4) + · · · +
(bn-1 - bn), which is a series of positive terms, and so we have S2 :S S4 :S
S6 :S · · · . The odd partial sums S2n+1 can be grouped as b1 - (b2 - b3) (b4 -bs)- · · · -(b2n -h2n+1), which is a sum of negative terms (except for
the first), so that we have S 1 2: S3 2: Ss 2: · · ·. Next, we note that S2n+1 =
S2,1 + b2n+1 2 S2n• Thus, the even partial sums S2n form an increasing
sequence that is bounded above by any member of the decreasing sequence
of odd partial sums. By the monotone sequence property, the sequence
S2n approaches a limit, Seven• Similarly, the decreasing sequence S2n+1
approaches a limit, Sood.
Thus we have S2 :S S4 :S S6 S · · · S S2n :S · · · S Seven :S Sood $
· · · :S S2n+1 $ · · · $ S3 :S S,. Since S2n+1 - S2n = a2n+1, which approaches
zero as n - oo, the difference Sood - Seven is less than S2n+i - S2n, and so
it must be zero; i.e., Sood= Seven• Call this common value S. Thus
and
IS2n+1 - Sj $ IS2n+1 - S2n+2l = b2n+2 = ja2n+2I,
and so each difference !Sn - Sj is less than ian+1 l- Since an+I - 0, we
get Sn --+ S as n --+ oo. This argument also shows that each tail of an
alternating series is no greater than the first term omitted from the partial
sum.
•
Worked Examples for Chapter 2
Example 2.1 Let S = {(x,,x2) E lR 2 I lxd :S 1, lx2l
closed or neither? What is the interior of S?
Solution
S with
< 1}.
ls S open or
S is not open, since there is no neighborhood around any point of
x, = 1 that is entirely contained in S.
See Figure 2.E-1. On the other
hand, S is not closed, since
and no neighborhood around a point of JR. 2 \ S with x2 = 1 is contained in JR. 2 \ S.
Worked Examples for Chapter 2
----
141
(0,1}
(1,0}
FIGURE 2.E-1
-----------
Detennine the topological properties of this set
Alternatively, we see that S is not closed by noting that the sequence
(0, 1 - 1/n) converges but the limit point (0, 1) does not lie in S (see Proposition
2.7.6).
We assert that int(S) = { (x1. x2) E IR 2 I lx1 I < 1, lx2 I < 1}. We check this by
showing that the members of this set are the interior points of S. If Ix, I < I and
lx2I < 1, then the disk with center (x1 ,x2) and radius r = minimum{ I - lxd, I jx2 1} lies in S. As we have seen, the other points of S are not interior points.
As the student becomes more familiar with this type of argument, some of the
details may be omitted.
•
Example 2.2 Slww that if x is an accumulation point of a set S c IR.n then
every open set containing x contains infinitely many points of S.
Solution We use proof by contradiction. Suppose there were an open set U
containing x and containing only finitely many points of S. Let x 1,x2, ... ,Xm
be the points of S in U other than x. Let e be the minimum of the numbers
d(x,x 1),d(x,x2), . .. ,d(x,x111 ), so that e > 0. Then D(x,c) contains no points of
S other than x, which contradicts the fact that x is an accumulation point of S.
The reader should also supply a direct proof of this result.
•
Example 2.3 If x = sup(S) for S c IR, slww that x E cl(S).
Solution By Proposition 2.5.2, it suffices to show either that x E S, or that
x is an accumulation point of S. By Proposition 1.3.2, for any .c
>0
there is
Chapter 2 The Topology of Euclidean Space
142
a y E S with d(x,y) <
point of S.
•
€.
This means that if x
r/.
S, then xis an accumulation
Example 2.4 Slww that a sequence in a metric space can converge to at
most one point (limits are unique).
y. Given£ > 0, choose N such that k 2: N
implies d(xk,X) < c/2, and M such that k 2: M implies d(xk,Y) < c/2. If k 2: N
and k 2: M, then d(x,y) ~ d(x,xk) + d(xk,Y) < € (by the triangle inequality).
Since O ~ d(x,y) <€holds for every€> 0, d(x,y) = 0, and so x = y.
•
Solution Let Xk-+ x and Xk-+
-------
Example 2.5· "Big Oh" and "little oh" notation Write f = 0( g) if g(x) > 0
for x E JR sufficiently large and f(x)/ g(x) is bounded for x sufficiently large.
g (read f is
Write f = o(g) iff / g goes to zero as x goes to +oo. Also write f
asymptotic to g) if f / g -+ l as x -+ oo. Prove the following:
~
1.
x2 +x = O(x2).
2. --
x2 +x ~ x2.
3.
exp( Jiogx) = o(x).
Solution We note that if/ is asymptotic tog, then it will follow automatically
that/= O(g) (why?). Thus 1 will follow from 2. But 2 is easy, since we know
that (x2 + x)/x2 = I+ 1/ x goes to l as x goes to infinity. To prove 3, we note that
exp(logx) = x, so that (exp Jlogx)/x = exp(Jfcigx- logx). Since logx-+ oo
asx-+ oo, (v'logx)/logx-+ 0 asx-+ oo so that for xlarge, Jlogx ~ (logx)/2
and hence, for x large,
exp-Jiogx
- - < -l ( exp (logx))
-= -I
X
- X
2
,/i'
which goes to zero as x -+ oo.
•
Example 2.6 Recall that one may define ex by
X
x2 x3
e =l+x+-+-+···
2!
3!
.
JR. Hence this definition of
ex makes sense.) Slww that e = e 1 is an irrational number.
(By the ratio test, this series converges for all x E
Exercises for Chapter 2
143
Solution Suppose that e = a/b for integers a and b. Let k be an integer,
k > b, and let a= k!(e - 2 - 1/2! - 1/3! - · • • - 1/k!), so that a is a nonzero
integer as well. Since e = 2 + 1/2! + 1/3! + · · •,
1
k+ 1
1
(k+ l)(k+2)
1
- k+ 1
1
(k+ 1)2
I
k
a=--+----+···<--+---+···=-.
(The last equality follows using the formula for the sum of a geometric series
y+y2 + · • • = y/(1 -y), 0 ~ y < 1.) But a < I/k is impossible if a is an integer
-:/- 0. Thus e =a/bis also impossible, and so e is irrational.
•
Note. Surprisingly, to prove that e' is irrational for r rational is .not at all
simple, and the proof that 71' is irrational is even harder. See, for example,
G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers,
4th ed., Oxford University Press, New York, 1960. In fact, e and 71' are
transcendental numbers, which means that they are not the roots of any polynomial with rational coefficients. This was discovered by Hermite and Lindemann in 1873 and 1882. For an elementary account, see M. Spivak, Calculus,
W. A. Benjamin Co., Menlo Park, Calif.
Exercises for Chapter
1.
2
Discuss whether the following sets are open or closed:
=IR
a.
]1,2[ in IR 1
b.
[2, 3) in JR.
c.
n: [-l, 1/n[ inlR
d.
]Rn in ]Rn
e.
A hyperplane in !Rn
f.
{ r E JO, 1 r is rational} in!R
g.
{(x,y) E IR 2
h.
{ X E ntn
1
[I
IO< X ~
I} inlR. 2
I !!xii =I} in IR.n
2.
Determine the interiors, closures, and boundaries of the sets in Exercise 1.
3.
Let U be open in M and U c A. Show that U
co1Tesponding statement for closed sets?
c
int(A,). What is the
Chapter 2 The Topology of Euclidean Space
144
4.
a.
Show that if Xn -+ x in a metric space M, then x E cl{ x,, x2, . .. }.
When is x an accumulation point?
b.
Can a sequence have more than one accumulation point?
c.
If x is an accumulation point of a set A, prove that there is a sequence
of distinct points of A converging to x.
> 0 such that D(x, €) c A.
5.
Show that x E int(A) iff there is an
6.
Find the limits, if they exist, of these sequences in JR 2 :
a.
(<-1)", ~)
b.
( 1, ~)
'
E
.
c.
( (~) (cosmr), (~) (sin ( n~ +
d.
(~,n-n)
i)))
7.
Let U be open in a metric space M. Show that U = cl(U) \ bd(U). Is this
true for every set in M?
8.
Let S C JR be nonempty, bounded below, and closed. Show that inf(S) E S.
9.
Show that
10.
11.
a.
intB = B \ bdB, and
b.
cl(A) = M \ int(M \ A).
Determine which of the following statements are true.
a.
int(cl(A)) = int(A).
b.
cl(A) nA = A.
c.
cl(int(A)) = A.
d.
bd(cl(A)) = bd(A).
e.
If A is open, ,!hen bd(A) c M \ A.
Show that in a metric space, x,,, -+ x iff for every € > 0, there is an N
such that m ;::: N implies d(xn,,x) ~ € (this differs from Proposition 2.7.2
in that here "< €" is replaced by"~€").
Exercises for Chapter 2
12.
145
Prove the following properties for subsets A and B of a metric space:
a.
int(int(A)) = int(A).
b.
int(A U B) :, int(A) U int(B).
c.
int(A n B) = int(A) n int(B).
13.
Show that cl(A) = A U bd(A).
14.
Prove the following for subsets of a metric space M:
15.
a.
cl(cl(A)) = cl(A).
b.
cl(A U B) = cl(A) U cl(B).
c.
cl(A
n B) C
cl(A) n cl(B).
Prove the following for subsets of a metric space M:
bd(A) = bd(M \ A).
a.
b.
bd(bd(A)) C bd(A).
c.
bd(A U B) C bd(A) U bd(B) C bd(A U B) U A U B.
d.
bd(bd(bd(A))) = bd(bd(A)).
16.
Let a1 = /2, a2 = (v'2 )° 1 , ••• ,an+I = (/2 )"n. Show that an n -+ oo. (You may use any relevant facts from calculus.)
17.
If 1: x,., converges absolutely in Rn, show that
18.
If x,y EM and x 'f y, then prove that there .exist open sets U and V such
that XE u, y EV, and Un V = 0.
19.
Define a limit point of a set A in a metric space M to be a point x E M
such that Un A 'f 0 for every neighborhood U of x.
2 as
1: x,,, sin m converges.
a.
What is the difference between limit points and accumulation points?
Give examples.
b.
If x is a limit point of A, then show that there is a sequence Xn E A
with Xn - x.
c.
If x is an accumulation point of A, then show that x is a limit point
of A. Is the converse true?
d.
If x is a limit point of A and x ¢ A, then show that x is an accumulation point.
e.
Prove: A set is closed iff it contains all of its limit points.
146
20.
Chapter 2 The Topology of Euclidean Space
For a set A in a metric space M and x E M, let
d(x,A) = inf{d(x,y) I y EA},
and for c > 0, let D(A,c) = {x I d(x,A)
< c}.
a.
Show that D(A, c) is open.
b.
Let A c Mand N£ = {x EM I d(x,A) :5 c}, where c > 0. Show
that Ne is closed and that A is closed iff A= n {Ne I c > O}.
I
21.
Prove that a sequence xk in a normed space is a Cauchy sequence iff
for every neighborhood U of 0, there is an N such that k, l ?: N implies
Xk -X1 EU.
22.
Prove Proposition 2.3.2. (Hint: Use Exercise 12 of the Introduction.)
23.
Prove that the interior of a set A c M is the union of all the subsets of A
that are open. Deduce that A is open iff A = int(A). Also, give a direct
proof of the latter statement using the definitions.
24.
Identify R.n+m with R.n x R.111 • Show that A C R.n+m is open iff for each
(x,y) EA, with x E R.n, y E R. 111 , there exist open sets Uc R.n, V C Rm
with x E U, y E V such that U x V c A. Deduce that the product of open
sets is open.
25.
Prove that a set A
family of £-disks.
26.
Define the sequence of numbers an by
c
M is open iff we can write A as the union of some
1
I
ao=l,a,=l+--, ... ,an=I+ 1
I +ao
+an-I
Show that an is a convergent sequence. Find the limit.
27.
Suppose an ?: 0 and an -+ 0 as n -+ oo. Given any c
is a subsequence bn of an such that
bn < €.
28.
Give examples of:
I::,
> 0, show that there
a.
An infinite set in R. with no accumulation points
b.
A nonempty subset of R. that is contained in its set of accumulation
points
Exercises for Chapter 2
147
c.
A subset of JR that has infinitely many accumulation points but contains none of them
d.
A set A such that bd(A) = cl(A).
29.
Let A, B C JR.n and x be an accumulation point of A U B. Must x be an
accumulation point of either A or B?
30.
Show that each open set in JR is a union of disjoint open intervals. Is this
sort of result true in ]Rn for n > 1, where we define an open interval as
the Cartesian product of n open intervals, ]a 1, b 1 [ x · · · x ]an, bn [?
31.
Let A' denote the set of accumulation points of a set A. Prove that A' is
closed. Is (A')' = A' for all A?
32.
Let A C JR.n be closed and Xn E A be a Cauchy sequence. Prove that Xn
converges to a point in A.
33.
Let Sn be a bounded sequence of real numbers. Asfume that 2sn
Sn+I· Show that limn--+oo(Sn+I - Sn)= 0.
34.
Let Xn E JRk and d(Xn+J,Xn) S rd(xn,Xn-1), where O S r < 1. Show that
x,. converges.
35.
Show that any family of disjoint nonempty open sets of real numbers is
countable.
36.
Let A, B C Rn be closed sets. Does A+ B = {x + y I x E A and y E B}
have to be closed?
37.
For Ac M, a metric space, prove that
bd(A) = [A
n
llxk - xiii
s Sn- I +
cl(M \ A)] U [cl(A) \ A].
S 1/k + 1/l. Prove that Xk converges.
38.
Let Xk E JR." satisfy
39.
Let S C JR. be bounded above and below. Prove that sup(S) - inf(S) =
sup{x - y Ix E S and y E S}.
40.
Suppose in JR that for all n, Un
Un converges.
41.
Let An be subsets of a metric space M, An+I C An, and An -:/ 0, but assume
that n~1An = 0. Suppose x E n~1cl(An)- Show that xis an accumulation
point of A 1•
S bn, an S an+I, and bn+I S bn. Prove that
148
C,hapter 2 The Topology of Euciulean Space
42.
Let AC Rn and x E lRn. Define d(x,A) = inf{d(x,y) I.YE A}. Must there
be a z EA such that d(x,A) = d(x, z)?
43.
Let XI = \1'3, ... ,Xn =
44.
A set A c ]Rn is said to be dense in B c ]Rn if B c cl(A). If A is dense in
]Rn a!!d U is o~n, pro,ve that A D U is dense in U. Is· this true if U is not
ope.n?
45.·
Sbow that xlogx = o(~) as x - oo (see Worked Example 2.5).
46.
a.
J3 + Xn-1.
If/= o(g) and if g(x) a:s X - 00.
C~mpute liIDn-+oo Xn-
oo as x -
~. then show that e{<x> = o(ts<x))
b. · Show that limn-+oo(xlogx)/ex = 0 by showing-that x = o(exl 2 ) and
that log x = o(exf 2 ).
47.
Show that 'Y = limn-+oo(l + ½+ · · · + ¾ - logn) exists by using the proof
of the integral test (, is called Euler's consta:nt).
48.
Prove the·follC?-wing generalizations of the ratio and root tests:
.
.
a.
If an > 0 and lim supn-+oo an+ 1/ an < 1, then I: an converges, and if
lirn infn_, 00 an+i/an > 1, then :Z::: an diverges.
b.
If a,, ~ 0 and if Jim supll->00 y'a;; < I (respectively, > 1), then
:z::: all
converges (respectively, diverges).
c.
49.
In the ratio comparison test, can the limits be replaced by lim sup's?
Prove Raabe's test: If all > 0 and if a,.+ifan :::; 1 -A/n for some fixed
constant A > 1 and n sufficiently large, then :Z::: all converges. Similarly,
show that if a11+1/a11 ~ 1 - (l/n), then :Z:::a11 diverges.
Use Raabe's test to prove convergence of the hypergeometric series
whose general term is
an=
a(a + 1) • ••(a+ n - 1)(3((3 + 1) • • • ((3 + n - 1)
1 · 2 · · · n · 'Y('Y + 1).- · · ('y + n - 1)
where a, (3, and 'Y are nonnegative integers, 1 >,a+ (3. Show that the
series diverges if 1 < a+ (3.
50.
Show that for x sufficiently large,f(x) = (xcos2 x + sin 2 x)ex2 is monotonic
and tends to +oo, but that neither the ratio f(x)/(x 112 ex2) nor its reciprocal
is bounded.
Exercises for Chapter 2
51.
a.
149
If Un > 0, n ; 1, 2, ... , show that
·. f Un+I < 1:
. f .;;;< 1·
Ii mm
- _ 1m m
Un _ 1m sup
~
.
b.
.
v
Deduce .that if lim(U11+ 1/ u,.)
vn"7
Un ~
1·1m sup --.
Un+I
.
~
=A, then lim sup efUn ~ A.
c. · Show that the converse of part b is false by use of the sequence'
U2n = U211+t = 2-n.
d.
52.
Calculate limsup,:/nl./n.
Test the following series for convergence.
oo
a.
-k
~✓:+1
k
oo
b..
Lk2+1
k=O
c.
v'n+I.
~
~ n2 -3n+ 1
n=O
00
.
· "log(k + 1) - logk
d. ~
tan- 1(2/k)
k=I
.
00
e.
L sin(n-
0 ),
a real, > 0
11=1
00
r.
3
E.;n
11=!
53.
Given a set A in a metric space, what _is the maximum number of distinct
subsets that can be produced by successively applying the operations closure, interior, and complement .to A (in any order)? Give an example of a
set achieving your maximum.
Chapter 3
Compact and Connected Sets
In this chapter, we study two of the most important and useful kinds of sets in
metric spaces and especially in IR.n. Intuitively, we want to say that a set in IR.n is
compact when it is, closed and is contained in a bounded region, and that a set is
connected when it is "in one piece." Figure 3-1 gives some examples. As usual,
it is necessary to tum these ideas into rigorous definitions. In each case the most
useful technical definition appears to be a little removed from our intuition, but
in the end we will see that it is in good accord with it. The fruitfulness of these
notions will be revealed in Chapter 4, where they will be applied to the study
of continuous functions.
§3.1 Compactness
In this section we give the general definition and properties of compact sets in
metric spaces. A criterion for recognizing compact sets, called the Heine-Borel
theorem, states that a set in IR.n is compact iff it is closed and bounded. This
result, special to the metric space IR.n, is discussed in §3.2.
Recall from our discussion of completeness of IR.n in Chapter 1 that every
bounded sequence has a convergent subsequence. This can be rephrased: If
A C Rn is a closed and bounded set, then every sequence in A has a subsequence
converging to a point of A. Historically, this was recognized to be an important
property of sets, and so was elevated to a definition. This property plays a crucial
role in many basic theorems such as the existence of maxima and minima of
continuous functions on closed intervals, as we shall see in Chapter 4.
3.1.1 Definition Let M be a metric space. A subset A c M is called
sequentially compact if every sequence in A has a subsequence that converges
to a point in A.
151
'
Chapter 3 Compact and Connected Sets
}52
compact
connected
FIGURE 3-1
noncom pact
noncompact
not connected
Compact and connected sets in IR.2
This property is equivalent to another property, called compactness, that
we shall now develop. This property is less obvious, and its equivalence to
sequential compactness is far from clear, at least at first.
Here is some terminology we need for our formal definition. Let M be a
metric space and A C M a subset. A cover of A is a collection { U;} of sets
whose union contains A; it is an open' cover if each U; is open. A subcover of
a given cover is a subcollection of { V;} whose union also contains A or, as we
say, covers A; it is a/mite subcover if the subcollection contains only a finite
number of sets.
Open covers are not necessarily countable collections of open sets. For
example, the uncountable set of disks {D((x, 0), 1) I x E IR} in IR 2 covers the real
axis, and the subcollection of all disks D((n, 0), 1) centered at integer points on
the real line forms a countable subcover. Note that the set of disks D((2n, 0), 1)
centered at even integer points on the real line does not form a subcovering
(why?).
3.1.2 Definition A subset A of a metric space M is called compact if every
open cover of A has a finite subcover.
Here is the first major result, which links compactness and sequential compactness.
§3.1 Compactness
1"53
3.1.3 Bolzano-Weierstrass Theorem A subset of a metric space is
compact
if! it is sequentially compact.
Some simple observations will help give a feel for compactness and for this
theorem. F!rst, a sequentially compact set must be closed. Indeed, if Xn E A
converges to x E M, then by assumption there is a subsequence converging to a
point x 0 EA; by uniqueness of limits, x = xo, and so A is closed. Second, a sequentially compact set A must be bounded, for if not, there is a point xo E A and a
sequence Xn E A with d(xn, xo) ~ n. Then Xn cannot have any convergent subsequence. 'fo show directly that a compact set is bounded, use the fact that for any
x0 EA, the open balls D(x0 , n), n = 1, 2, ... , cover A, so there is a finite subcover.
Note that in the definitions, one can take A = M, in which case one just
speaks of a compact metric space. We shall develop examples of compact
spaces in due course.
Another characterization of compactness relates to completeness. It is a
useful technical tool used in the proof of the Bolzano-Weierstrass theorem.
3.1.4 Definition A set A c M is called totally bounded if for each
there is ajnite_!_~t_lx_1 ~ --~ ·..0N} in M such that :!-C
I D(x;,!_).
uf
E
>0
3.1.5 Theorem A metric space is compact iff it is complete and totally
bounded.
Let A c M, and assume that M is complete. If we apply this theorem to
the metric space A, we conclude that A is compact if! it is closed and totally
bounded.
In Theorem 3.1.5, a few things are obvious, others less obvious. First, note
that D(x;, e) C D(x 1 , € + d(x;,X1 )), so that if
then A C D(x 1, R} and so a totally bounded se,t is bounded. This is consistent
with our earlier remark that compact sets are bounded.
At this stage we do not have effective methods for telling when a given set
is compact. We will remedy this in the next section.
3.1.6 Example The entire real line IR is not compact, for it is unbounded.
Another reason is that
{D(n, l)
=]n -
l, n + l[I n
=0, ±I, ±2, ... }
is an open cover of IR but does not have a finite subcover (why?).
•
· Chapter 3 Compact and Connected Sets
154
3.1.7 Example Let A= JO, 1]. Find an open cover with no finite subcover.
In= 1,2,3, ... }. (Why does
the union contain all of A?) It clearly cannot have a finite subcover. This time,
compactness fails because A is not closed; the point O is "missing" from A.
This collection is not a cover for [O, 1]; in fact any open_ cover for [0, l]
must have a finite subcover, because, as we prove in the next section, [O, 1] is
compact.
•
Solution Consider the open cover {)1/n,2[
3.1.8 Example Give an example of a bounded and closed set that is not
compact.
Solution Let
M be any infinite set with the discrete metric: d(x,y) = 0 if
x = y and d(x,y) = l if x -f:. y. Clearly, MC D(xo, 2) for any xo EM, and so M
is bounded. Since it is already the entire metric space, it is closed. However,
it is not compact. Indeed, { D(x, l /2) I x E M} is an open cover with no finite
•
subcover.
3.1.9 Example A collection of closed sets {Ka} in a metric space M is
said to have the finite intersection property for A if the intersection of any finite
number of the K,,, with A is nonempty. Show that A C M is compact iff every
collection of closed sets with the finite intersection property for A has nonempty
intersection with A.
Solution First, assume A is compact. Let { F;} be a collection of closed sets
and let U; M\F;, so that U; is open. Suppose that An (n~1F;) 0. Taldng
complements, this means that the V; cover A. Since the covering is open, there
is a finite subcovering, say, Ac U 1 U • • • U UN, Then An (F1 n ••• n FN) = 0,
and so { F;} does not have the finite intersection property. Thus, if {F;} is a
collection of closed sets with the finite intersection property, then An { F;} -f:. 0.
Conversely, let { U;} be an open covering of A and let F; = M\ U;. Then
A n (n~1F;) = 0, and so, by assumption, {F;} cannot have the finite intersection
property for A. Thus, A n (F1 n •• • n FN) = 0 for some members F1, ... , FN
of the collection. Hence, U1, ••• , UN is the required finite subcover and thus A
is compact. •
=
=
155
§3.2 The Heine-Borel Theorem
Exercises for §3.1
1.
Show that A c M is sequentially compact iff every infinite subset of A
has an accumulation point in A.
2.
Prove that {(x,y) E R 2 IO
3.
Let M be complete and A C M be totally bounded. Show that cl(A) is
compact.
4.
Let x1c
-+
5 x < l,O 5 y 5 1} is not compact.
x be a convergent sequence in a metric space and let A =
{x1,X2, ... } U fx}.
S.
a.
Show that A is compact.
b.
Verify that every open cover of A has a finite subcover.
Let M be a set with the discrete metric. Show that any infinite subset of M
is noncompact. Why does this not contradict the statement in Exercise 4?
§3.2 The Heine-Borel Theorem
In Euclidean space we can easily tell if a set is compact from the following
theorem:
3.2.1 Heine-Borel Theorem A set A c Rn is compact if! it is closed
and bounded.
.
a
One half of this was already indicated in §3.1. In fact, compact set is
closed and bounded in any metric space. The converse must be special in view
of,Example 3.1.8. Indeed, it is not even obvious that the closed interval (0, I] in
R is compact. In fact, (0, l] is compact, and one of the proofs of the Heine-Borel
theorem begins by treating this case.
3.2.2 Example Determine which of the following are compact:
Ix~ 0} C ~
a.
{x E IR
b.
[0, 1) U [2,3] CIR
c.
{(x,y) E JR2 j x2
·
+y2 < I} C JR.2
Chapter 3 Compact and Connected Sets
156
Solution
a.
Noncompact, because it is unbounded.
b.
Compact, because it is closed and bounded.
c.
Noncompact, because it is not closed.
•
3.2.3 Example Let Xk be a sequence of points in
k. Show that
Xk
)Rn
with Ilxk 11 S 3 for all
has a convergent subseq':'ence.
]Rn I llxll S 3} is closed a~d bounded, and hence
compact. Since Xk EA, we can apply the Bolzano-Weierstrass theorem to obtain
•
the conclusion.
Solution The set A= {x E
3.2.4 Example In the definition of a compact set, can "every" be replaced
by "some"?
Solution No. Let A = JR, and let the open cover consist of the single open
set JR. This has a finite subcover, namely, itself, but being unbounded, JR is not
c9mpact.
•
3.2.5 Example Let A= {O} U {1, 1/2, ... , 1/n, ... }. Show directly that A
'--..
satisfies the definition of compactness.
Solution Let { U;} be an arbitrary open cover of A. We must show that there
is a finite subcover. The point O lies in one of the open sets-relabeling if
needed, we can suppose that OE U1• Since U 1 is open and 1/n - 0, there is an
N such that 1/N, 1/(N +I), ... lie in U1. Relabeling again if needed, suppose
that I E U2, ... , 1/(N - 1) E UN. Then U1, ... , UN is a finite subcover, since it
is a finite subcollection of the { U;} and it includes all of the points of A. Notice
·that if A were the set {I, 1/2, ... }, then the argument would not work. In fact,
this set is not closed, and so it is not compact.
•
·
Exercises for §3.2
1.
Which of the following sets are compact?
a.
{x E JR I OS x S I and xis irrational}
§3.3 Nested Set Property
157
b.
{ (x,y) E lR 2 I O $ x $ 1}
c.
{(x,y) E 1R2 I xy ~.I}
n {(x,y) I x2 + y2 < 5}
2.
Let r 1, r2 , r3, ... be an enumeration of the rational numbers in [0, 1]. Show
that there is a convergent subsequence.
3.
Let M = { (x,y) E IR 2 I x2 +·y2 $ 1} with the standard metric. Show that
A c M is compact iff A is closed.
4.
Let A be a bounded set ~n !Rn. Prove that cl(A) is compact.
5.
Let A be an infinite set in IR with a single accumulation point in A. Must
A be compact?
§3.3 Nested Set Property
The next theorem is an important consequence of the Bolzano-Weierstrass theorem.
3.3.1 Nested Set Property let F1c be a sequence of compact nonempty
sets in a metric space M such that F1<+1 C Fk for·all k = 1, 2, .... Then there is
at least one point in nb1Fk,
Intuitively, the sets Fk are nonempty and decreasing, and so it seems reasonable that there should be a point in all of them. However, if the Fk are not
compact, then the intersection can be empty (see Example 3.3.4). Thus, the
actual proof requires more care.
To prove the nested set property using the Bolzano-Weierstrass theorem, pick
xk E F1c for each k. The sequence x1c has a convergent subsequence, since it lies
in the compact set F 1 • The limit point lies in all of the sets Fk because they are
closed (see Figure 3.3-1). An alternative proof is given at the end of the chapter.
One can rephrase the nested· set property in terms of "growing sets" this way.
Let Uk = M\Fk, so that the U1c are open and Uk+l ::> U1c. Then
1 U1c ¢ M is
F1c
'
I
0.
Thus,
if
Mis
a
metric
space
and
the
open
sets Uk are
equivalent to
1
n:
u:
increasing-i.e., Uk+I ::> Ur-and have compact complements, then the union of
the U1c is not all of M.
Chapter 3 Compact and Connected Sets
158
,> :,
pEnF,,;
I
•
FIGURE 3.3-1
Nested set property
3.3.2 Example Let M be the unit sphere in JR3, M = {(x,y,z) I x2 +y2+z2 =
I} with the standard metric. Let U; be the portion of M strictly belqw latitude
90° - 10/i, i = 1, 2, 3, ... , as in Figure 3.3-2.
latitude 90° - 10/i
FIGURE 3.3-2
An increasing sequence of open sets on the unit sphere
The metric space M is compact (why?), and, consistent with the preceding remarks, the union of the U; is not all of M, since it excludes the north
pole.
•
3.3.3 Example Ver½' the nested set property for Ft= [0; 1/k] C JR.
Solution Each Ft is compact, and F1:+ 1 c F". The intersection is {0}, which
is nonempty. •
Exercises for §3.3
159
3.3.4 Example ls the nested set property true if "compact nonempty" is
replaced by "open nonempty" or "closed nonempty"?
Solution No. Let Fk = ]k, oo[ or [k, oo[.
•
3.3.S Example A more exotic family of decreasing compact sets Fn, for
n:
which
1Fn is quite complicated, is obtained by removing successive triangles
from a given triangle in the plane, as in Figure 3.3-3. •
FIGURE 3.3-3
Sierpinski's gasket
Exercises for §3.3
1.
Verify the nested set property for Fk = { x E IR I x ~ 0, 2 ::;
2.
Is the nested set property true if "compact nonempty" is replaced by "open
bounded nonempty"?
3.
Let x1: -+ x be a convergent sequence in a metric space. Verify the validity
of the nested set property for F1: = {x, I l ~ k} U {x}. What happens if
Fk={xill~k}?
4.
Let x1: -+ x be a convergent sequence in a metric spai:e. Let A be a family
of closed sets with the prope11y that for each A E A, there is an N such
that k ~ N implies Xk E A. Prove that x E nA.
x2 ::; 2 + 1/ k}.
Chapter 3 Compact and Connected Sets
160
§3.4 Path-Connected Sets
The second important topic to be discussed in this chapter is connectedness. We
know intuitively those sets we would like to call "connected." However, our
intuition can fail in judging more complicated sets. For example, how do we
decide whether the set {(x,sin(l/x)) Ix> O} U {(0,y) I y E (-1, ll} c lR 2
is connected (see Figure 3.4-1)? Therefore, we seek a sound mathematical
definition we can depend on.
y
1
----,HH-~-+---------x
FIGURE 3.4-1
Connected?
There are, in fact, two different (but closely related) notions of connectedness. The more intuitive and applicable of these is that of path-connectedness,
and so we begin with it. Our definition must first define what is meant by a
curve (or path) joining two points.
3.4.1 Definition We call a map <p : [a, b] -+ M of an interval [a, b] into a
metric space M continuous if (tk -+ t) implies (<p(tk) -+ <p(t)) for every sequence
tk in [a, b] converging to some t E [a, b]. (Students will recall from their calculus
courses that, intuitively, a continuous function has no "breaks" or "jumps" in
its graph.) A continuous path joining two points x,y in a metric space M is
a mapping <p : [a,b] -+ M such that ip(a) = x, ip(b) = y, and ip is continuous.
Here x may or may not equal y, and b ~ a. A path <p is said to lie in a set A if
<p(t) EA/or all t E [a,b]. See Figure 3.4-2.
We say that a set is path-connected if every two points in the set can be
joined by a continuous path lying in the set.
§3.4 Path-Connected Sets
FIGURE 3.4-2
161
A curve joining x and yin A
For example, it is evident that the region A in Figure 3.4-2 is path-connected.
Another path-connected set is the interval [0, 1). To prove this, let x,y E [0, I]
and define cp : [O, 1) -+ JR by cp(t) = (y - x)t + x. This is a continuous path
connecting x and y, and it lies in [0, l].
Using the above definition of path-connected, a little thought will convince
the reader that the set in Figure 3.4-1 is not path-connected, although this fact
is not obvious. Most of the time it is easy to determine whether a set is pathconnected, by seeing whether any two points can be joined by a continuous
curve lying in the set, which is usually clear geometrically. The second notion
of connectedness is harder to check directly but will be very useful. It appears
in §3.5.
3.4.2 Example Which of the following sets are path-connected?
a.
(0, 3]
b.
[l,2]U[3,4]
c.
{(x,y)E1R2 I0<x:51}
d.
{(x,y)E1R2 I0<x2+y2:51}
Solution Only b is not path-conne'cted, as is clear from a study of Figure 3.4-3.
•
3.4.3 Example Must a path-connected set be open or closed?
Solution No; e.g., [0, I], )0, I[, [O, 1[ are all path-connected.
•
Chapter 3 Compact and Connected Sets
162
0
tf:•:j;r,
(a).
I
3
ta
2
3
MM
lh.i/£1
(b)
y
4
y
--+-x
X
(c)
FIGURE 3.4-3
3.4.4 Example Let cp: [O, I]
-+
Sets for Example 3.4.2
IR 2 be a continuous path, and C = cp([O, 1)).
Show that C is path-connected.
Solution This is intuitively clear, for we can use the path
,p itself to join
two points in C. Precisely, if x = cp(a), y = cp(b), where O ~ a ~ b ~ I, let
c : [a, b] -+ R 2 be defined by c(t) = cp(r). Then c is a path joining x to y and c
lies in C. •
Exercises for §3.4
1.
Detennine which of the following sets are path-connected:
a.
{x E [O, l]
Ix is rational}
I xy ~ l and x > l} U {(x,y) E IR2 I xy ~ l and x ~
lR 3 I x2 +y2 ~ z} U {(x,y,z) I x2 +y2 +z2 > 3}
b.
{(x,y) E
IR 2
c.
{(x,y,z)
E
d.
{(x,y)ElR2 IO~x< l}U{(x,O)I I <x<2}
1}
2.
Let A c R. be path-connected. Give plausible arguments that A must be
an interval (closed, open, or half-open). Are things as simple in R 2 ?
3.
~t cp : [a,b] -+ R 3 be a continuous path and a < c < d < b. Let
C = { cp(t) I c ~ t ~ d}. Must cp- 1(C) be path-connected?
163
§3.5 Connected Sets
§3.5 Connected Sets
There is another way of saying that a set has more than one piece that is both
more sophisticated and more powerful than using path connectedness.
3.5.1 Definition Let A be a subset of a metric space M. Two open sets U, V
are said to separate A if they satisfy these conditions:
a.
unvnA =0.
b.
AnU
c.
An V ~0.
d.
Ac UUV.
~0.
See Figure 3.5-1. We say that A is disconnected
sets do not exist, we say that A is connected.
FIGURE 3.5-1
if such
sets exist, and
if such
1-
A is neither connected nor path-connected
The set in Figure 3 .4-1 can be show~ to be connected but not path-connected;
thus the two notions are not the same. However, there is a valid relation between
the two ideas, which is presented in the next theorem.
3.5.2 Theorem Path-connected sets are connected.
Use of this theorem is perhaps the easiest way to identify a connected set.
The theorem is reasonable, intuitively. In fact, the (false) converse theorem is
also "reasonable." Here, then, is an example of two notions that are closely related and that are intuitively almost identical, but whose true relationship requires
more care to discern.
Chapter 3 Compact and 'Connected Sets
164
If a set is not connected (and hence not path-connected), we can divide it
up into pieces, or components. More precisely, a component of a set A is a
connected subset Ao CA such that there is no connected set in•A containing Ao
other than Ao itself. Thus we see that a component is a maximal connected subset. One can define path component in a similar way, using path-connectedness
instead of connectedness. Some properties of components are given in the exercises at the end of the chapter.
3.5.3 Example Is Z = { ... , -2, -1, 0, 1, 2, 3, ... } c JR connected?
Solution No, for if U = 11 /2, oo[ and V = 1 - oo, 1/ 4(, then Z c U u V,
ZnU={l,2,3, ... } ~0.ZnV={ ... ,-2,-1,0} ~0,andZnUnV=0.
Hence Z is disconnected (i.e., is not connected). It is also evident that Z is
not path-connected, but this fact alone cannot be used to conclude that Z is not·
connected.
•
3.5.4 Example Is { (x,y) E R. 2 IO < x2 + y2 :5 1} connected?
Solution As in Example 3.4.2d, we know that this set is path-connected.
Hence, by Theorem 3.5.2, it is connected. To prove this directly would be more
•
difficult.
Exercises for §3.5
1.
Is (0, l] U ]2, 3] connected? Prove or disprove.
2.
Is {(x,y) E R. 2 1 0
disprove.
3.
Let A c lR.2 be path-connected. Regarding A as a subset of the xy-plane in
R. 3 , show that A is still path-connected. Can you make a similar argument
for A connected?
4.
Discuss the components of
:5
x ~ l} U {(x,0) I I
a.
(0, 1] U (2,3] CR
b.
Z = { ... , - 2, -1, 0, 1, 2, ... } C R
c.
{x E (0, l] I x is rational} C R
< x < 2}
connected? Prove or
·
Theorem Proofs for Chapter 3
165
Theorem Proofs for Chapter 3
3.1.3 Bolzano-Weierstrass Theorem A subset of a metric space is
compact if! it is sequentially compact.
Proof We begin with two lemmas.
Lemma 1 A compact set A c M is closed.
Proof We will show that M\A is open. Let x E M\A and consider the
following collection of open sets: Un= {y I d(y,x) > 1/n}. Since every y EM
with y f. x has d(y,x) > 0, y lies in some Un, Thus, the Un cover A, and so there
must be a finite subcover. One of these has a largest index, say, UN, If e = 1/N,
then, by construction, D(x, l/N) c M\A, and so M\A is open.
Y
Lemma 2 If M is a compact metric space and B
C M is closed, then B is
compact.
Proof Let { U;} be an open covering of B and let V = M\B, so that V is
open. Thus { U;, V} is an open cover of M. Therefore, M has a finite cover, say,
{ U 1, ••• , UN, V}. Then { U1, ••• , UN} is a finite open cover of B.
Y
Proof of 3.1.3 Let A be compact. Assume there exists a sequence Xk
E A that
has no convergent subsequences. In particular, this means that x1: has infinitely
many distinct points, say, y 1,y2 , •••• Since there are no convergent subsequences,
there is some neighborhood Uk of Yk containing no other y;. This is because
if every neighborhood of Y1: contained another Yi• we could, by choosing, the
neighborhoods D(yt, l / m), m - l, 2, ... , select a subsequence converging to ~We claim that the set {y 1,y2 , ••• } is closed. Indeed, it has no accumulation
points, by the assumption that there are no convergent subsequences. Applying
Lemma 2 to {y 1,y2, ... } as a subset .of A, we find that {y1,y2, ... } is compact.
,But { U1:} is an open cover that has no finite subcover, a contradiction. Thus
X.t has a convergent subsequence. The limit lies in A, since A is closed, by
Lemma 1.
" Conversely, assume that A is sequentially compact. To prove that A is
;tompact, let { U;} be an open cover of A. We need to prove that this has a
Jnite subcover. To show this, we proceed in several steps.
Chapter 3 Compact and Connected Sets
166
Lemma 3 There is an r > 0 such that for each y E A, D(y, r) C U; for
some U;. The number r is called a Lebesgue number for the covering. The
in.fimum of all such r is called the Lebesgue number for the covering .
.,
Proof If not, then for every integer n, there is some Yn such that D(yn, I/n)
is not contained in any
say, Zn --+ z E A. Since
such that D(z, E:) C U;0 ,
enough so that d(ZN,Z)
contradiction.
'Y
Lemma 4
U;. By hypothesis, Yn has a convergent subsequence,
the U; cover A, z E U;0 for some U;0 • Choose E: > 0
which is possible since U;0 is open. Choose N large
< c/2 and 1/N < c/2. Then D(ZN, l/N) C U;0 , a
A is totally bounded (see Definition 3.1.4).
E: > 0 we cannot cover A
with finitely many disks. Choose y, E A and Y2 E A \D(y,, E:). By assumption,
we can repeat; choose Yn EA \[D(y,, E:) U • · · U D(yn-1, c)]. This is a sequence
with d(yn, Ym) 2'.: E: for all n and m, and so Yn has no convergent subsequence, a
contradiction to the assumption that A is sequentially compact.
'Y
Proof If A is not totally bounded, then for some
To complete our proof, let r be as in Lemma 3. By Lemma 4 we can write
A C D(y1, r) U · · · U D(yn, r) for finitely many Yi· By Lemma 3, D(yj, r) C U;1 ,
j = 1, ... , n, for some index ii. Then U;p ... , U;. cover A. •
3.1.5 Theorem
A metric space is compact iff it is complete and totally
bounded.
Proof First assume that Mis compact. By 3.1.3, it is sequentially compact.
Thus, if Xk is a Cauchy sequence, it has a convergent subsequence, and so, as
in 1.4.7, the whole sequence converges. Thus Mis complete. It is also totally
bounded, by Lemma 4.
Conversely, assume that M is complete and totally bounded. By 3.1.3, it is
enough to show that M is sequentially compact. Let Yk be a sequence in M. We
can assume that the Yk are all distinct, for if Yk has infinitely many repetitions,
there is a trivially convergent subsequence, and if there are finite repetitions
we may delete them. Given an integer N, cover M with finitely many balls,
D(xi,. 1/ N), ... , D(xiN, 1/ N). An infinite number of the Yk lie in one of these
balls. Start with N = 1. Write M = D(xi1 , 1) U · · · U D(xLN• l), and so we can
select a subsequence of Yk lying entirely in one of these balls. Repeat for N = 2,
getting a further subsequence lying in a fixed ball of radius 1/2, and so on. Now
Theorem Proofs for Chapter 3
167
choose the "diagonal" subsequence, the first member from the first sequence, the
second from the second, and so on. This sequence is Cauchy and since M is
complete, it converges. •
3.2.1 Heine-Borel Theorem A set A c Rn is compact if! it is closed
and bounded.
Proof We have already proved that compact sets are closed and bounded. We
must now show that a set A c Rn is compact if it is closed and bounded. We
will give two proofs of this.
First Proof This proof is based on the Bolzano-Weierstrass theorem and the
fact that any bounded sequence in R has a convergent subsequence, proved in
1.4.3. In' fact, we shall prove that a closed and bounded set A is sequentially compact. Let Xt = (xl,Xi, ... ,xf) E Rn be a sequence. Since A is bounded, x! has a
convergent subsequence, say, x 1/,Ck>• Then x2/,Ck) has a convergent subsequence,
say x2/2(k)• Continuing, we get a further subsequence Xfn(k) = (x 1Jn(k), ••• , Xnfn(k)),
all of whose components converge. Thus Xfn(k> converges in Rn. The limit lies
in A since A is closed. Thus A is sequentially compact, and so is compact.
•
Sec«;md Proof This proof uses the definition of compactness in terms of
open covers directly. We begin with a special case:
Lemma 1 Closed intervals [a,b] in Rare compact.
Proof Let U = { U;} be an open covering of[~, b].'Define
C = {x
e [a,b] I the set [a,x] can be covered by a finite collection of the U;}.
=
=
We want to show that C [a,b]. To this end, let c sup(C). The sup exists
because C "f 0 (since a E C) and C is bounded above by b. Since·a E C and
b is an upper bound for C, c E [a, b], by definition of sup( q. Suppose c E U;0 ;
such a U;0 exists, since the U;'s cover [a,b]. Since U;0 is open, there is an c > 0
such that ]c - c, c + c[ C U;0 • Since c = sup(C), there exists an x E C such that
c- c < x :::; c (see Proposition 1.3.2). Because x E C, [a, x] has a finite subcover,
say, U,, ... , UN; then [a, c + c/2] also has the finite subcover U,, ... , UN, U;0 •
Thus we conclude that c E C and moreover that c = b. Indeed, if c < b, we
would get a member of C larger than c, since [a, c + c/2] has a finite subcover.
The latter cannot happen, since c = sup(C). ,
Chapter 3 Compact and Connected Sets
168
Note. Why does this proof fail for ]a, bJ,
Lemma 2 If AC !Rn is compact and xo
[a, b[ or [a;oo[
E !Rm,
?
then A x {xo} C !Rn x !Rm is
compact.
Proof Let Ube an open cover of A x {xo}, and
V ={VIV= {y I (y,xo) EU}, for some U EU}.
Then V is an open cover of A in JR", and hence V has a finite subcQver of
A, say, V' = {Vi, ... , Vi:}. Each V; E V' corresponds to a U; E U, and U' =
{ U1 , ••• , Uk}
is
then
a
finite
subcover in
IR" x Rm of
AX {xo}.
T
The next step is an induction argument.
Lemma 3 If [-R,R]n-l C Rn-I is compact, then [-R,R]n
where [-R,R]" = [-R,R] x · · · x [-R,R], n times.
C Rn is co,;,pact,
Proof Suppose that [-R,R]n-l is compact and that U is an open cover of
[-R, R]". Define
S= {x E [-R,R]
I [-R,Rt- 1 x [-R,x]
c,fRn has a finite subcover in U}.
)
.
Now -RES, since [-R,RJn-l is compact, by hypothesis, and so, by Lemma
2, [-R,R]n-l x {-R} has a finite subcover in U. Since Sis bounded above by
R, it has a supremum, say, xo. We will show that x0 = R, which will prove the
lemma.
Let U' C U be a finite subcover of [- R, R]"- 1 x {x0 }. For each point
(y,xo) E [-R,R]"- 1 x {xo}, there exists cy > 0 such that D((y,x0 ), ./2£y) is
covered by U'. Because
Vy= D(y, cy)x ]xo - cy, Xo + cy[ C D((y,Xo), /icy),
it is covered by U'. Consider the open cover V = { Vy I y E [-R, R]"- 1 } of
[-R,R]n-l x {.xo}. By Lemma 2, V has a finite subcover of [-R, R]n-l x {xo},
say {Vyw .. , VYN }. Let c = inf{ f:yw •. , c:YN }. Then
Cu
00
[-R,R]"- 1 x ]xo - E:,Xo + c[
i=I
and so [-R,R]n-l x ]xo - c,Xo +c[ is covered by U'.
Vy;,
Theorem Proofs for Chapter 3
169
With this e, there exists x E S such that xo - c: < x $ xo. Since x E S, there
exists a finite subcoverU" c U which covers [-R,Rt- 1 x [-R,x], andU'U U"
is a finite cover of [-R,R]n-l x [-R,x0 +E[. Thus x0 ES. Supp.osex0 < R; then
choose 6 such that x 0 +6 <Rand xo+6 < xo+e. Thus [-R,R]n-l x [-R,.xo+6]
is covered by U' U U", and x 0 + 6 E S, a contradiction, and therefore x0 = R. Y
To conclude the proof of the Heine-Borel theorem, let A c !Rn be closed
and bounded. Since it is bounded, there is an R > 0 such that A C [-R, Rt.
Lemmas 1 and 3 show that [-R, Rt is compact. Lemma 2 in the proof of
3.1.3 shows that A is compact, since it is a closed subspace of the compact set
[-R,R]n.
•
3.3.1 Nested Set Property Let Fk be a sequence of compact nonempty
sets in a metric space M such that Fk+I C Fk for all k = l, 2, .... Then there is
at least one point in
1 Ft.
n~
Proof In the compact set A = F 1, the sets F 1, F 2 , ••• have the finite intersection
property, since the intersection of any finite collection equals the Fk with the
highest index. By Example 3.1.9,
3.5.2 Theorem Path-connected sets are connected.
We begin by first proving a special case of the theorem.
Lemma The interval [a, b] is connected.
Proof.
Suppose the interval were not connected. Then there would be two
open sets U and V with Un [a,b] 'F 0 and vn [a,b] 0, [a, b] n Un V = 0,
and [a,b] C U U V. Further, suppose that b EV. Let c = sup(Un [a,b]), which
;exists, since Un [a, b] is nonempty and is bounded above. The set Un [a, b] is
:~Iosed, since its complement is VU(IR\[a,b]), which is open. Thus c E Un[a,b]
tfsee Exercise 8, Chapter 2). Now c b, since c <I. V and b E V. We claim that
lJlnY neighborhood of c intersects Vn [a,b]. To see this, note that c i:, band no
t-
t-
170
Chapter 3 Compact and Connected Sets
neighborhood of c can be entirely contained in U, since c = sup(Un[a,b]). This
proves the claim and shows that c is an accumulation point of V n [a, b]. Like
Un [a, b], the set V n [a, b] is closed, and so c E V n [a, b]. This contradicts
the statement that Vn un [a,b] = 0.
Y
Corollary There do not exist disjoint nonempty closed subsets C,D of [a,b]
whose union is [a,b].
This follows from the lemma by letting U = IR\D and V = IR\C.
Proof of Theorem 3.5.2 Suppose A is not connected. By definition, there
exist open sets U, V that disconnect A; i.e., the sets U and V are such that
AC uuv, AnUnV=0,UnA-:j0, and VnA-:/0.
Let x E UnA and y E VnA. Since A is path-connected, there is a continuous
path cp: [a,b]-+ A joining X toy. Let C = cp- 1(UnA) and D = cp- 1(VnA),
so that C,D C [a,b]. Suppose tk EC and tk-+ t. Then t E [a,b], since [a,b]
is closed; cp(tk) E U, by definition of C, and cp(ft) -+ cp(t), by continuity. We
claim that cp(t) E U; if not, cp(t) E V, and since V is open, cp(tk) E V for large k,
contradicting cp(tk) E U, which does not intersect An V. Thus cp(t) E U and so C
is closed. Similarly, Dis closed. Then C,D are nonempty, since cp- 1(x) EC and
cp- 1(y) ED and are disjoint, since en D = cp- 1(U) n cp- 1(V) = cp- 1(U n V) = 0
and CUD= cp- 1(U) U cp- 1(V) = cp- 1(U UV)= cp- 1(A) = [a,b]. Thus we have
contradicted the corollary to the Lemma, and so A must be connected.
•
Worked Examples for Chapter 3
Example 3.1 Show that A= {x E !Rn I llxll:::; 1} is compact and connected.
Solution To show that A is compact, we show it is closed and bounded. To
show that it is closed, consider JRn\A = {x E ]Rn l llxll > 1} = B. For x EB,
D(x, llxll - 1) CB, so that Bis open and hence A is closed. It is obvious that A
is bounded, since A C D(O, 2) and therefore A is compact.
·
To show that A is connected, we show that A is path-connected. Let x, y E A.
Then the straight line joining x, y is the required path. Explicitly, use cp : [0, 1] -+
!Rn, cp(t) = (l-t)x+ty. One sees that cp(t) EA, since llcp(t)II :::; (1-t)l!xll+tllYII : : ;
(1 - t) + t = 1, by the triangle inequality.
•
Worked Examples for Chapter 3
171
Example 3.2 Let A c IR.n, x E A, and y E ]Rff\A. Let cp : [O, 1] -+ ]RR be a
(continuous) path joining x and y. Show that there is at such that cp(t) E bd(A).
Solut~on Intuitively, this result says that a path that joins a set to its complement must pierce the boundary of the set at some point. See Figure 3.E-1.
y
\..pierce pomt
.
aA
FIGURE 3.E-1
A path joining A and lRn\A
Let B = {x E [O, l] I cp([O,x]) c A} C [0, l]. Now B f:. 0, since OE B. Let
t = sup(B). We shall show that cp(t) E bd(A). Let U be a neighborhood of cp(t).
Choose tk E [O, t], tk -+ t, such that cp(tk) E A. This is possible, by the definition
of B. Then cp(tk) E U for large k, by continuity of cp. By definition of r, there is
a point sk such that t::; sk ::; t + 1/k and such that !f)(sk) (/. A. Now sk -+ t, and
so, by continuity of cp, cp(sk) (/. U for k large enough. Thus U contains points of
A and lRn\A, and so, by Proposition 2.6.2, cp(t) E bd(A). •
Example 3.3 Prove: A set A C lR is connected ijf it is an interval (an
interval is a set of the form [a, b], [a, bL ]a, b], or ]a, bL where a orb can be
±oo on an open end of the interval).
Solution We have already seen that intervals are connected because they are
path-connected. Now, for the converse, assume that A is not an interval. We shall
show that it is not connected. Saying that A is not an interval means that there
exist points x, y, z such that x < y < z, where x, z E A and y (/. A (why?). Then
U =] - oo,y[ and V = ]y, oo[ are open sets such that Ac vu V, Un VnA = 0,
U n A f:. 0, and V n A f:. 0. Thus A is not connected.
•
Example 3.4 Let A be an open connected subset of Rn. Prove that .A is
path connected.
Chapter 3 Compact and Connected Sets
172
Solution This solution illustrates an important technique for using connect-
edness. We let Xo E A and let B = {y E A I xo and y can be joined by a
continuous path}. Obviously, x0 E B, so that B 1' 0. We claim that B is both
open and closed regarded as a subset of A; i.e., B is both open and closed relative to A. First, B is open for the following reason. If y E B, choose a disk
D(y,c) c A, which is possible since A is open. If z E D(y,c), then z EB, since
we can get a continuous path from x0 to z by concatenation of a path from .to to
y with the straight line from y to z (the reader should prove that this produces a
continuous path). Thus, D(y, e) c B, so B is open. To show that B is closed, let
Jk E Band Yk --+ y EA. Since A is open, there is an c: > 0 such that D(y, e) CA.
Since Jk --+ y, there is an N such that Jk E D(y, c) fork 2: N. Joining Xo to JN by
a continuous path followed by the straight line from YN toy, we see that y E B,
and so B is closed. Since B f:. 0 and B is both open and closed in A, we get
B = A. (Otherwise, B and B\A would disconnect A.) Thus, every point in A can
be joined to xo by a continuous path, and so A is path-connected. •
Exercises for Chapter 3
1.
Which of the following sets are compact? Which are connected?
a.
{(x1,x2) E IR2 I lxd SI}
b.
c.
I llxll S 10}
{X E Rn I I ~ llxll ~ 2}
d.
Z = {integers in_ JR}
e.
A finite set in JR
f.
{x E !Rn I llxll
g.
Perimeter of the unit square in R 2
h.
The boundary of a bounded set in JR
i.
The rationals in [0, l]
j.
A closed set in [0, 1]
{x
E lRn
=1} (distinguish between the cases n =1 and n 2: 2)
2.
Prove that a set A c Rn is not connected iff we can write A c F1 U F2,
where F1, F2 are closed, An F1 n F2 = 0, F1 n A f:. 0, F2 n A f:. 0.
3.
Prove that in Rn, a bounded infinite set A has an accumulation point.
4.
Show that a set A is bounded iff there is a constantM such that d(x,y)::; M
for all x,y E A. Give a plausible definition of the diameter of a set and
·
reformulate your result.
Exercises for Chapter 3
5.
173
Show that the following sets are not compact, by exhibiting an open cover
with no finite subcover.
a.
{x E Rn I llxll < 1}
b.
Z, the integers in JR
6.
Suppose that Fk is a sequence of compact nonempty sets satisfying the
nested set property such that diameter(Fk) -+ 0 ask -+ co. Show that there
is exactly one point in n{Fk}- (By definition, diameter(Fk) = sup{d(x,y) I
x,y E Ft}).
7.
Let Xk be a sequence in Rn that converges to x and let Ak = {xk,Xk+I, ..• }.
Show that {x} = n~1cl(Ak). Is this true in any metric space?
8.
Let A C !Rn be compact and let Xk be a Cauchy sequence in ]Rn with
Xk E A. Show that Xk converges to a point in A.
9.
Determine (by proof or counterexample) the truth or falsity of the following statements:
a.
(A is compact in Rn) => (lRn\A is connected).
b.
(A is connected in lRn) => (IRn\A is connected).
c.
(A is connected in Rn) => (A is open or closed).
d.
(A = {x E Rn I llxll ::; I})=> (lRn\A is connected). [Hint: Check the
cases n = I and n ~ 2.J
10.
A metric space M is said to be locally path-connected if each point in M
has a neighborhood V such that U is path-connected. (This terminology
differs somewhat from that of some topology books.) Show that (M is
connected and locally path-connected) {:} (M is path-connected). ·
11.
a.
Prove that if A is connected in a metric space M and A C B C cl(A),
then B is connected.
b.
Deduce from a that the components of a set A are relatively closed.
Give an example in which they are not relatively open. (C c A is
called relatively closed in A if C is the intersection of some closed
set in M, with A, i.e., if C is closed in the metric space A.)
c.
Show that if a family {B;} of connected sets is such that B; n Bi ":J 0
for all i,j, then U;B; is connected.
d.
Deduce from c that every point of a set lies in a unique component.
e.
Use c to show that Rn is connected, starting with the fact that lines
in Rn are connected.
174
Chapter 3 Compact and Connected Sets
12.
Let S be a set of real numbers that is nonempty and bounded above. Let
-S = {x E JR I -x E S}. Prove that
a.
-S is bounded below.
b.
sup S = - inf( -S).
13.
Let M be a complete metric space and Fn a collection of closed nonempty
subsets (not necessarily compact) of M such that Fn+t C Fn and diameter
(Fn) - 0. Prove that n: 1Fn consists of a single point; compare Exercise 6.
14.
a.
A point x E A C M is said to be isolated in the set A if there is
a neighborhood U of x such that U n A = {x}. Show that this is
equivalent to saying that there is an c > 0 such that for ally EA,
y ~ x, we have d(x,y) > €.
b.
A set is called discrete if all its points are isolated. Give some
examples. Show that a discrete set is compact iff it is finite. ·
15.
Let K1 C M, and Ki C Mi be path-connected (respectively, connected,
compact). Show that K 1 x K2 is path-connected (respectively, connected,
compact) in M, x Mi.
16.
If Xk
17.
Let K be a nonempty closed set in ]Rn and x E JR11 \K. Prove that there is
a y EK such that d(x,y) = inf{d(x,z) I z EK}. Is this true for open sets?
Is it true in general metric spaces?
18.
Let Fn CIR be defined by Fn = {x Ix~ 0 and 2- 1/n $ x2 $ 2+ 1/n}.
Show that n:,Fn t, 0. Use this to show the existence of ../2.
19.
Let Vn C M be open sets such that cl(Vn) is compact, V11
cl(V11 ) C Vn-t· Proven:, Vn ~ 0.
20.
Prove that a compact subset of a metric space must be closed as follows:
Let x be in the complement of A. For each y E A, choose disjoint neighborhoods Uy of y and Vy of x. Consider the open cover { Uy hEA of A to
show the complement of A is open.
21.
a.
Prove: a set A C M is connected iff 0 and A are the only subsets
of A that are open and closed relative to A. (A set U C A is called
open relative to A if U = V n A for some open set V c M; "closed
·
·
relative to A" is defined similarly.)
b.
Prove that 0 and ]Rn are the only subsets of IR" that are both open
and closed.
- x in a normed space, prove that llxkll - !!xii- Is the converse true?
Use this to prove that {x E Rn I Ilxl I $ 1} is closed, using sequences.
~ 0,
and
.
Exercises for Chapter 3
175
22.
Find two sutisets A, B c R 2 and a point x0 E JR2 such that A U B is not
connected but A U B U {x0 } is connected.
23.
Let Q denote the rationals in R Show that both Q and the irrationals
JR\Q are not connected.
24.
Prove that a set A C M is not connected if we can write A as the disjoint
union of two sets B and C such that B n A -:j 0, C n A -:j 0, and neither
of the sets B or C bas a point of accumulation belonging to the other set
25.
Prove that there is a sequence of distinct integers n 1, n2, . . . -+ oo such
that limk-oo sin nk exists.
26.
Show that the completeness property of JR may be replaced by the Nested
Interval Property. If {Fn}1 is a sequence of closed bounded intervals in
JR such that Fn+I C Fn for all n = 1, 2, 3, ... , then there is at least one
.point in n~ 1Fn.
27.
Let A C JR be a bounded set. Show that A is closed iff for every sequence
EA, limsupxn EA and liminfxn EA.
Xn
28.
Let A c M be connected and contain more than one point. Show that
every point of A is an accumulation point of A.
29.
Let A
{(x,y) E JR2 I x 4 + y4
~-connected?
30.
Let Uk be a sequence of open bounded sets in JR.n. Prove or disprove:
=
a.
b.
c.
d.
= I}.
Show that A is compact. Is it
'
LJ: Uk is open.
n: Uk is open.
n: (Rn\ Uk) is closed.
n: (JRn\Uk) is compact.
1
1
1
1
31.
Suppose A c Rn is not compact. Show that there exists a sequence
F 1 :> F2 :> F 3 • · • of closed sets such that Fk nA -:j 0 for all k and
32.
Let Xn be a sequence in JR3 such that llxn+l - xnll $ 1/(n2 + n), n
Show that Xn converges.
33.
Baire category theorem.
A set S in a metric space is called nowhere
dense if for each nonempty open set U, we have cl(S) n U -:j U, or equivalently, int(cl(S)) = 0. Show that ]Rn cannot be written as the countable
union of nowhere dense sets.
2::
1.
176
Chapter 3 Compact and Connected Sets
34.
Prove that each closed set A C M is an intersection of a countable family
of open sets.
35.
Let a E JR and define the sequence a I ' a2' . . . in R by a I
a~_ 1 - an- I + 1 if n > l. For .what a E JR is the sequence
a.
Monotone?
b.
Bounded?
c.
Convergent?
= a,
and an =
Compute the limit in the cases of convergence.
36.
Let A C IRn be uncountable. Prove that A has -an accumulation point.
37.
Let A, B
0
b.
c M with A compact, B closed, and A n B = 0.
Show that there is an
yEB.
£
>0
s~ch that d(x, y)
>
e for all x E A and
Is a true if A, B are merely closed?
38.
Show that A C M is not connected iff there exist two disjoint open_ sets
U, V such that UnA f 0, VnA f 0, and Ac UU V.
39.
Let F 1 = (0, 1/3]U[2/3, 1] be obtained from [O, 1] by removing the middle
third. Repeat, obtaining
Fi= [O, 1/9] U (2/9, 1/3)] U [2/3, 7 /9] ·U [8/9, l].
In general, Fn is a union of intervals and
middle third of these intervals. Let C =
40.
Fn+t
is obtained by removing the
n: Fn, the Cantor set. Prove:
1
a.
C is compact.
b.
Chas infinitely many points. [Hint: Look at the endpoints' of Fn.]
c.
int(C) = 0.
d.
C is perfect; that is, it is closed with no isolated points.
e.
Show that C is totally disconnected; that is, if x, y E C and x f y
then x E U and y E V where U and V are open sets that disconnect C.
Let Fk be a nest of compact sets (that is, Fk+t C Fk)- Furthermore, suppose
each F k is connected. Prove that nf:;1{ Fk} is connected. Give a.n example
to show that compactness is an essential condition and we cannot just
assume that "Fk is a nest of closed connected sets."
...
Chapter 4
Continuous Mappings
To obtain interesting and useful theorems, it is often necessary to make restrictions on the mathematical objects one studies. In this chapter we require that
the functions studied be continuous, and we will investigate some of the consequences of this restriction. -In Chapter 6 we study an even stronger restriction,
namely, that of differentiability.
§4.1 Continuity
First we examine the notion of continuity for real-valued functions on the real
line JR. Figure 4.1- la shows a continuous function, and Figure 4.1-1 b shows a
discontinuous one. A continuous function has the important property that when
xis close to x0 , j(x) is close to f(xo) (as shown in Figure 4.1-la). On the
other hand, in Figure 4.1-lb, even if xis arbitrarily close to xo.f(x) need not be
close to /(Xo). The reader should be familiar with these ideas from one:variable
calculus.
To define continuity in more precise and general terms, first the concept of
the limit of a function at a point is defined. Let (M, d) and (N, p) be two metric
spaces, A CM, andf : A-+ Na given mapping.
4.1.1 Definition
~µpose
that x0 is an accumulation point of A. We say.
that b E N is the limit off at xo, wr{tten
limf(x) = b,
x-xo
'. if given any € > 0 there exists fJ > 0 (possibly depending on f. x0 , and€) such
'.that for all x E A satisfying -x ;= x0 and d(xo, x) < 6, we have p(j(x), b) < € .
177
17~
Chapter 4 Continuous Mappings
y
y
___ _/
I
X
X
(a)
FIGURE 4.1-1
(b)
(a) Continuous function; (b) discontinuous function
Intuitively, this says that as x approaches xo, /(x) approaches b. We also
write f (x) -+ b as x -+ Xo- (Compare this with the concept of the limit of a
sequence.) Note that if Xo is not an accumulation point, there will not be any
other points of A near xo, in which case the condition becomes vacuous.
Note. Whenever statements are given for metric spaces, ask yourself: What
is the corresponding statement for Rn? Likewise, when given a statement or
homework problem in JRn, ask: is this special to Rn, or does it work generally
in a metric, normed, or vector space?
In general, limx-xof(x) need not exist. For example, let/: R\{O}-+ JR be
defined by /(x) 1 if x < 0 and /(x) 2 if x > 0. Then O is an accumulation
point of JR\ {O} but limx-of(x) doesn't exist. However, if f(x) = 1 when x 'f 0
0, then limx-of(x)
1. Another example is f : JR\ {O} -+ JR,
and if f(O)
f(x) = sin(l / x); this function oscillates faster and faster near O and so cannot
approach any limit there.
If limx-xof(x) exists, then it is unique, and so we are justified in saying
the limit off at xo. To prove this, suppose limx-xof(x) = b and b'. To show
that b = b', let c > 0 be given. Then there exist 61 > 0 and 62 > 0 such
that O < d(x,'xo) < 61 implies p(f(x), b) < c /2 and O < d(x, xo) < 62 implies
p(f(x),b') < c/2. Let 6 = min{6 1,61 }; then O < d(x,xo) < 6 implies p(b,b') $
p(b,f(x))+p(f(x), b') < c/2+c/2 =€,by the triangle inequality. Thus, p(b,b') <
€ for any€ > 0 and so p(b, b') = 0, which means b = b'. Notice that in 4.1.1, if
we had not required xo to be an accumulation point, then limits need not have
been unique.
=
=
=
=
§4.1 Continuity
179
There are some particular cases of 4.1.1 in which special notation is used.
For instance, suppose/ is defined (at least) on A= ]xo,a] c JR for some a> xo
and f is real-valued. Then
limf(x) = b
x-Xo
means the limit off with domain A= ]xo,a]. In other words, for every c > 0
there is a f, > 0 such that Ix - .xol < fJ and x > .xo implies 1/(x) - bl < c. Thus
we are talcing the limit of/ as x approaches xo from the right. Similarly, we can
define
lim /(x) = b,
x--+xO
the limit as x approaches xo from the left. These are, for obvious reasons, called
one-sided limits. It should now be clear to the reader how to define expressions
like lim,._ 00 /(x) = a, and ~o forth.
We are now ready to define continuity of a function at a point.
4.1.2 Definition Let A c
M,
f :A
-+ N,
and
.xo
E A. We say that f is
continuous at xo if either Xo is not an accumulation point of A or limx-x0 f(x) =
/(Xo)-
This definition requires the existence of limx-.rof(x) in addition to specifying
its value. It can be rephrased as follows:
f is continuous at the point Xo in its domain ifffor all c > 0, there is a
f, > 0 such that for all x E A, d(x, xo) < 6 implies p(f(x),f (xo)) < c.
In Definition 4.1.1, we needed to specify that x f: xo because f was not
necessarily defined at x0 , but here there is no need to specify x ,j: x0 , s,ii;_ice our
condition is certainly valid if x = x0 • In the important special case f : A c Rn -+
R.111 , observe that/ is continuous at Xo EA iff for all c > 0 there is a 6 > 0 such
that/or all x EA with !Ix - xoll < fJ, we have 11/(x) - /(xo)II < £.
4.1.3 Definition A function f : A
C M -+ N is called continuous on the set
B C A if f is continuous at each point of B. If we just say that f is continuous,
we mean that f is continuous on its domain A.
There are other useful ways of formulating the notion of continuity. One
of these, given in iii of the next theorem, is particularly significant because it
involves only the topology (that is, the open sets), and so it would be applicable
in more general situations.
Chapter 4 Co1Zli1Zuous Mappings
180
4.1.4 Theorem Let f : A c M
-+
N be a mapping. Then the foltowing
assertions are equivalent:
i.
f is continuous on A.
ii.
For each convergent sequence Xk
iii.
For each open set U in N, 1- 1(11) C A is open relative to A: that is,
1- 1(11) = V nA for some open set V.
iv.
For each closed set F C N, 1- 1(F) C
1- 1(F) = G n A for some closed set G.
-+
Xo in A, we have f(xk)
-+ f(xo).
A is closed relative to A; that is,
Condition ii in this theorem has an analogous version for limits that can be
proved in the same way. Namely, if f : A C M -+ N and xo is an accumulation
point of A, then
lim /(x) = b
if and only if
X-+Xo
lim f(xk) = b
lc-oo
for every sequence Xt E A that converges to Xo, Xt "f xo.
From this theorem it is evident that our definition of a continuous path, given
in Chapter 3, coincides with continuity as we have defined it here. In §4.3 we
shall introduce theorems that will enable us to readily establish the continuity
of some of the more common functions.
We now briefly discuss the plausibility of the theorem. First of all, that i is
the same as ii should be clear, for i means that/(x) is near /(xo) if xis near x0 ,
and ii is the same except that it lets x approach xo via a sequence. Assertions
iii and iv are also the same if we remember that open sets are complements of
closed sets.
Let us see what iii is telling us. Choose U to be a small open set containing
/(XQ). That 1- 1(11) is open means that there is a whole open disk around Xo
contained in J- 1(11). In this disk, x is mapped to U, which represents points
near /(xo). In other words, using U as a measure of closeness of /(x) to /(x0), if
xis near enough to xo (i.e., if x E 1- 1(11)), /(x) will be near f(xo). This therefore
represents the same idea expressed in i.
4.1.5 Example Let f : Rn
-+
!Rn be the identity function x
1-+
x. Show that
f is continuous.
E Rn. By definition we must find 6 > 0 for given € > ·o
such that llx - xoll < 6 implies 11/(x) - /(xo)II -< c:. If we choose 6 = c:, the
definition becomes the statement that llx- xoll <€implies llx- xoll < €, which
is a tautology. Hence/ is continuous. •
Solution Fix xo
· Exercises for §4.1
181
4.1.6 Example Let f : )0, oo[-+ R x 1-+ 1/ x. Show that f' is continuous.
Solution Fix Xo E JO, oo[; that is, fix Xo > 0. To detennine how to choose 6,
we examine the expression
lf(x) - f(xo)I
If Ix - xol
< 6,
=1-1 - -1 I = lxoIXXo- Ixi
X
Xo
then we would get
1/(x) - /(xo)I <
6
6
lx.xol = XXo.
rc,.
If 6 < xo/2, then x > xo/2 (Figure 4.1-2) and so 6/ xxo < 26 /
Thus, given
e > 0, choose 6 = min(.to/2, erc,/2). Then [/(x) - /(xo)I < e if Ix - .tol < 6, and
so f is continuous at XQ .
•
1+6+1
I
I
•M·; &
0
~
FIGURE 4.1-2
~
Choosi~g 6 for Example 4.-1.6
4.1.7 Example Let f : Rn -+ Rm be continuous. Show that the set {x E
Rn 111/(x)I I < 1} is open.
I IIYII < 1} ), which is the inverse
· · image of an open set, and so by Theorem 4.1.4iii, it is open.
•
·,:, Solution The given set equals 1- 1( {y
Exercises for §4.1
x2.
a.
Let/: JR-+ JR, x
b.
Let/: JR2 -+ JR, (x,y) ...... x. Prove that/ is continuous.
1-+
Prove that/ is continuous.
Chapter 4 Continuous Mappings
182
2.
Use bin Exercise 1 to show that if UC JR is open, then A = {(x,y) E
lR2 I x E U} is open.
3.
Let/: lR 2
is closed.
4.
Prove directly that condition iii implies condition iv in Theorem 4.1.4.
5.
-+
JR be continuous. Show A = { (x,y) E lR 2 I O ~ /(x,y) ~ 1}
- Give an example of a continuous function f : JR
U c JR such that/( U) is not open.
-+
JR and an open set
§4.2 Images of Compact and Connected Sets
This section discusses how compact and connected sets behave under continuous
mappings. It is crucial to distinguish between the tenns image and preimage
(that is, inverse image) in these theorems; compare the following theorems with
Theorem 4.1.1. In this section, as in §4.1, Mand N will be metric spaces with
metrics d and p, respectively.
4.2.1 Theorem Suppose that f :
M -+ N is continuous and K c M is
connected. Thenf(K) is connected. Similarly, if K is path-connected, so isf(K).
Roughly speaking, this theorem gives confirmation to our intuition that continuous mappings cannot pull sets apart. If j(K) were disconnected, then/ would
have to be discontinuous to pull K apart.
Interestingly, compact sets obey the same type of condition as connected sets
(Figure 4.2-1 ).
4.2.2 Theorem Suppose that f : M
-+
N is continuous and B
c M is
compact. Then f (B) is compact.
This result is perhaps easiest to see using the Bolzano-Weierstrass property.
Indeed, if f (x1:) is a sequence in f (B), so that Xt E B, then we can extract
a convergent subsequence of x1.: and the corresponding subsequence of f(x1.:)
converges, by Theorem 4.1.4ii.
If / : A C ]Rn -+ JR"', we can apply the t"{O preceding theorems to sets
K c A and B c A where we use M = A as the domain metric space and N = ]Rm
as the target metric space. Notice that K is connected as a subset of A iff it is
connected as a subset of lRn, with a similar statement for compactness. (The
reader should supply the proofs of these two assertions.)
§4.2 Images of Compact and Connected Sets
183
f
' ~
M
FIGURE 4.2-1
N
Images of compact (or connected) sets are compact
(or connected)
4.2.3 E:¥ample let K c Ilf! be compact. Prove that A = {x E JR
· there exists a y such that (x,y) EK} is compact.,
I
Solution Let/: JR 2
_. JR, (x,y) ...... x. Then/ is continuous (see Exercise l
of §4.1). We claim that A = f(K), so that A would be compact by Theorem
4.2.2. To prove the claim, let x E A. Then (x,y) E K for some y, so that
x =f(x,y) E/(K). Conversely, if x =f(x,y) for some (x,y) EK, then x EA by
definition.
•
4.2.4 Example . Find a continuous map f : JR -> JR and a compact set K
C
JR
such that 1- 1(K) is not compact. Repeat for K connected.
Solution Let/(x) = 0 for all x E JR and let K = {O}. Thenf- 1(K) = IR is not
compact. For the second part, let/(x) = x2 and K = {I}. Then/- 1 (K) = {-1, I},
•
which is not connected.
4.2.5 Example Let f : JR 2 -> JR be continuous, and let A = {/(x) 11 lxl I = I}.
Show that A is a closed interval, i.e., a set of the form [a, b ].
where S = {x E JR 2 I llxll == 1} is the unit
circle. Now S is connected and compact and so A is connected and compact.
Solution Clearly, A
= f(S),
Chapter 4 Continuous Mappings
184
By Worked Example 3.3 at the end of Chapter 3, A is an interval. But the only
•
compact intervals are the closed intervals.
Exercises for §4.2
1.
Let/: JR-+ JR be continuous. Which of the following sets are-necessarily·
closed, open, compact, or connected?
a.
{x E JR lf(x) = O}
b.
{x E JR j/(x) > 1}
{J(x) E JR I x 2: O}
{J(x) E JR I O S x S 1}
c.
d.
=B =
2.
Verify Theorems 4.2.1 and 4.2.2 for f: JR2 -+ JR, f(x,y) = x2 + y, K
{(x,y) I x 2 +IS l}.
3.
Give an example of a continuous map f : JR -+ JR and a closed subset
B C JR such that f (B) is not closed. Is this possible if B is bounded as
well?
4.
Let A, B
5.
c JR,
and suppose A x B
a.
Prove that A is connected.
b.
Generalize to metric spaces.
c JR2 is connected.
Let A and B be subsets of JR with B not empty. If A x B ~ JR2 is open,
must A be open?
§4.3 Operations on Continuous Mappings
It is plausible that the composition of continuous functions should be continuous,
as we shall now discuss. Recall that for f : A C M -+ N and g : B .C N -+ P
with f(A) c B, we define the composition g of : A -----+ P by x 1--+ g(f(x)). If
x is close to x0 , then (g o f)(x) is close to (g o f)(x0 ), because /(x) is close to
/(xo) and hence g(f(x)) is close to g(f(xo)). See Figure 4.3-1. This indicates
the plausibility of the following result.
4.3.1 Theorem Let M, N, and P be metric spaces and suppose that f : A C
M -----+ N and g : B C N -+ P are continuous mappings with f(A) C B. Then
g of: ACM-----+ Pis continuous.
§4.3 Operations ·on Continuous Mappings
185
I
A
B
E::Z:2a4t ,t1J
FIGURE 4.3-1
IR
Composition of mappings
For example, the function e5inx is continuous, because it is the composition
of the two continuous functionsf(x) sinx and g(x)
=
=~-
Note. As we have done several times already, we accept as known from
calculus the properties of the basic functions such as sin x, eT, and so forth.
These basic functions will be reviewed in §5.3 and will be used in several
examples.
The following theorem gives some additional fundamental properties concerning the arithmetic of limits. For these results we will use a vector space
structure to make sense of addition and scalar multiplication. Thus, let us specialize from metric to vector spaces. Specifically, let M be a metric space and
let V be a normed vector space.
4.3.2 Proposition Let A c M, and let xo be an accumulation point of A.
,
i.
Let f : A --+ V and g : A --+ V be functions, and let f +g : A --+ V be defined
by (f +g)(x) = f(x) + g(x); assume that limx-x0 f(x) and limx-xo g(x) exist
and are equal to a and b, respectively. Then limx-xo<f + g)(x) exists and
is equal to a + b.
ii.
Let f : A --+ IR and g : A --+ V be functions, and let f · g : A --+ V be
defined by (f · g)(x) = f(x)g(x); assum'f that limx.....,x0 f(x) and limx-xo g(x)
exist and are equal to a and b, respectively. Then limx.....,xo(f • g)(x) exists
and is equal to ab.
iii.
Let f : A --+ lR and g : A --+ V be functions,· assume that limx.....,xof(x) and
limx--.x0 g(x) exist and are equal to a -:/- 0 and b, respectively. Then f is
nonzero in a neighborhood V of x0 , so that g/J : V --+ V can be defined
by (g/f)(x) = g(x)/f(x), and limx-x0 (g/f)(x) exists and is equal to b/a.
Chapter 4 Continuous Mappings
186
These results are eminently reasonable. For instance, i states that if x is
close to Xo, so that /(x) is close to a and g(x) is close to b, then /(x) + g(x) is
close to a+ b.
We may deduce some basic properties of the arithmetic of continuous functions.
4.3.3 Corollary Let A
I.
C M, and let xo E A be an accumulation point of A.
Let f : A -+ V and g : A -+ V be continuous at
+ g : A -+ V is continuous at x0 •
.xo:
then the sum
f
ii.
Let f : A -+ JR and g : A -+ V be continuous at XQ; then the product
f · g : A -+ V is continuous at Xo.
iii.
Let f : A -+ IR and g : A -> V be continuous at Xo with f (xo) f. O; then
f is nonzero in a neighborhood U of xo and the quotient g/f : U -> V is
continuous at Xo-
For example, we have seen that f(x) = x, mapping IR to IR, is continuous,
and therefore so is f(x) = .x"; also, any polynomial an.x" + Gn-tx"- 1 + · · · + ao is
continuous from IR to R
Consider a real-valued function of two variables f : IR 2 -+ IR. It is crucial
to distinguish between continuity off (sometimes called joint continuity) and
continuity in each variable separately. For example, consider the following
function, shown in Figure 4.3-2:
f(x,y) =
{ 0 if x f. 0 and y f. 0;
1 if either x =0 or y =0.
In each individual variable separately, f is continuous at (0, 0) (the mappings
x H /(x,O) and y H /(0,y) are constant, and so are continuous), but/ itself is
not continuous at (0, 0) (why?). In this example, observe that lilll(x,y)-co.o)/(x,y)
'/: limy-o[limx-of(x,y)]. See Exercise 16 at the end of this chapter for sufficient
conditions on separate continuity needed to imply joint continuity.
4.3.4 Example Let f : IR
-+
IR be de.fined by f(x) = x sinx. Show that f is
continuous.
Solution Since f is the product of the two continuous functions x and sin x,
it is continuous. •
Exercises for §4.3
187
y
FIGURE 4.3-2 Separate and joint continuity
4.3.S Example Let f : JR
-+
JR 2 .be continuous. Show that g(x) = f(x2 + x3)
is continuous.
Solution Since g is the composition off with the continuous function x 1-+
x2 + x3, it is continuous.
•
4.3.6 Example Letf(x) =x2/(l +x). Where isf continuous?
Solution We define/ for x 'f- - l. By Corollary 4.3.3iii, f is continuous at
•
allx'{--1.
Exercises for §4.3
1.
Where are the following functions continuous?
a.
/(x)
= x sin(x'-).
b.
/(x)
=(x + x'-)/(x'- -
c.
f(x) = (sinx)/x, x
I), x2
'f-
I,/(± I)
=0.
'I: 0.f(0) = l.
2.
Let at -+ a and bt -+ b be convergent sequences in JR.
akbt -+ ab, directly and as a consequence of 4.3.3ii.
3.
Let A= {x E JR I sinx = 0.56}. Show that A is a closed set. Is it compact?
Prove that
188
Chapter 4 Continuous Mappings
4.
Show that/(x) =
Jjxj is continuous.
5.
Show that /(x) =
Jx2 + 1 is continuous.
§4.4 The Boundedness of. Continuous Functions on
Compact Sets
We are ready now to prove an important property of coptinuous real-valued
functions called the "maximum-minimum" or "boundedness"' theorem. This
result says that a continuous function is bounded on a compact set and attains
its largest or maximum value and its smallest or minimum value at some point
of the set. The precise definitions will be stated shortly.
To appreciate the results, let us consider what can happen on a noncompact
set. First, a continuous function need not be bounded. Figure 4.4-1 shows the
function/(x) = 1/x on the open interval ]0, l[. As x gets closer to 0, the function
becomes arbitrarily large, but/ is nevertheless continuous (since/ is the quotient
of 1 by the cont~nuous function x 1--+ x, which does not vanish on ]0, 1[ ).
y
0
FIGURE 4.4-1
A
An unbounded continuous function
Next, we can show that even if a function is bounded and continuous, it might
not assume its maximum at any point of its domain. Figure 4.4-2 shows the
function /(x) = x on the interval [0, 1[. This function never attains a maximum
§4.4 The Boundedness of Continuous Functions on Compact Sets
189
value, because even though there are an infinite number of points as near to 1
as we please, there is no point x for which f(x) = I. From these examples, it
should be fairly plausible that, for a continuous function on a compact set, these
pathologies cannot occur.
y
X
0
FIGURE 4.4-2
A
A function with no maximum
Let us now state the theorem formally.
4.4.1 Maximum-Minimum Theorem Let (M,d) be a metric space, let
ACM, and let/: A--+ JR be continuous. Let KC ,1 be a compact set. Then/ is
bounded on K; that is, B = {f(x) I x E K} C lR is a bounded set. Furthermore,
there exist points xo,x1 EK such thatf(xo) inf(B) andf(x1) sup(B). We call
sup(B) the (absolute) maximum off on K and inf(B) the (absolute) "minimum
off on K.
=
=
It should be appreciated that this result is deeper than the usual derivative
tests for the location of maxima and minima that we learn from calculus. For
example, there are continuous fu~ctions on JR that are differentiable at no point;
such functions cannot be graphed by a smooth curve and so our intuition is not
as clear in these cases. (Y{e will meet such a function in Exercise 20, Chapter 5.)
We usually apply Theorem 4.4.1 in the context of functions of several variables, i.e., maps f : A c ]Rn --+ R However, there are some interesting contexts
in which more general (infinite-dimensional, in fact) metric spaces are needed.
One of these is the problem of finding the surface of minimal area spanning
• given wire in JR 3 • Here M is a suitable space of surfaces and/ assigns to
tach such surface its area. To minimize f, some condition such as compactness
'
.Chapter 4 Continuous Mappings
190
is needed. In this case the ideas of 4.4.1 are useful, although serious additional technicalities are necessary to solve this problem, usually called Plateaus
problem.
Despite these technicalities, one can learn much from physical experiments in
which nature does the minimizing for us. This is done by means of dipping wires
into soap solutions and observing the resulting soap films spanning the wires.
The variety of possible shapes-including several local minima-is surprisingly
interesting. For more information, consult J. Marsden and A. Tromba, Vector
Calculus, 3d ed., W. H. Freeman, New York, 1988, and S. Hildebrant and
A. Tromba, Mathematics and Optimal Form, Scientific American Books, New
York, 1985.
4.4.2 Example Give an example of an unbounded discontinuous function
on a compact set.
=
=
1/x if x > 0 and/(0) 0.
Clearly, this function exhibits the same unboundedness property as does 1/ x
on JO, I].
•
Solution Let/: (0, I] -+IR.be defined by f(x)
4.4.3 Example Verify Theorem 4.4.1 for f(x) = x/(x2 + I) on (0, 1).
Solution f (0) = 0, /(I) = 1/2. We shall verify explicitly that the maximum
=
=
is at x
1, and that the minimum is at x 0. (Elementary calculus helps to
determine this, but we shall give a direct verification.) First, as O 5 x 5 l,
x/(x2 + 1) ~ 0, since x ~ 0 and x2 + 1 ~ 1, so that f(x) ~ /(0) for O ::;; x ::;; I.
Thus O is the minimum. Next, note that O 5 (x - I )2 = x2 - 2x + I, so that
x2 + 1 ~ 2x and hence, for x ';/ 0,
X
X
I
- - < - = -2
x2+1-2x
so that f (x) 5 /(1) = 1/2 and thus x = l is the maximum point.
•
4.4.4 Example Show that x0 and x 1 in Theorem 4.4.1 need not be unique.
Solution Let/(x) = I for all x E (0, I]. Then any Xo,x 1 E (0, l] will do. This
example is of course rather trivial. More interesting ones have already been
learned in calculus, such as (x2 - 1)2 on [-2, 2], which has minima at ± l and
•
maxima at ±2.
§4.5 The Intermediate Value Theorem
191
Exercises for §4.4
1.
Give an example of a continuous and bounded function on all of JR that
does not attain its maximum or minimum.
2.
Verify the maximum-minimum theorem for f(x) = x3
3.
Let f : K C JR.n -+ JR. be continuous on a compact set K and let M = { x E
K lf(x) is the maximum off on K}. Show that Mis a compact set.
4.
Let/: AC !Rn-+ IR be continuous, x,y EA, and c: [O, 1)-+ AC !Rn be a
continuous curve joining x and y. Show that along this curve, f assumes
its maximum and minimum values (among all values along the curve).
5.
Is a version of the maximum-minimum theorem valid for the function
f(x) = (sinx)/x on ]O, oo[? On (0, oo[?
-
x on [-1, 1].
§4.5 The Intermediate Value Theorem
The intermediate value theorem should be known to the reader from elementary calculus. It states that a continuous function on an interval assumes all
values between any two given elements of the range, as in Figure 4.5- la. The
discontinuous function! in Figure 4.5-lb never assumes the value 1/2. Roughly
speaking, while a discontinuous function can jump from one value to another, a
continuous function must pass through all intermediate values. Another way the
intermediate value property can fail is if the domain A is not connected, as illustrated by the continuous function in Figure 4.5-2. Thus the crucial a~sumptions
are that f be continuous and f be defined on a connected region. We shall see
that the proof of the intermediate value theorem is in fact quite simple, because
of the way we have formulated the notion of connectedness.
4.5~1 Intermediate Value Theorem Let M be a metric space, A c M,
and f : A -+ JR be continuous. Suppose that K C A is connected and x, y E K.
For every number c E JR such that f(x) < c < f(y), there exists a point z E K
such that f(z) =·c.
•
Since intervals (open or closed) are connected, the usual intermediate value
tbeorem in calculus becomes a special case. However, notice that Theorem 4.5.1
more general. It applies, for example, to continuous real-valued functions of
~veral vatiables f (x 1, ••• , x 11 ) defined on all of !Rn, which is a connected set.
t.'
/
Chapter 4 Conti,zuous Mappings
192
y
y
o--
I
2
f(x)
-
I
4
~+--l&H!!:il--A
FIGURE 4.5-1
lntennediate value· theorem
y
C
____.- f(x)
X
A
FIGURE 4.5-2
ContinUOlls function with a disconnected domain
Since path-connected sets are connected, we get
4.5.2 Corollary Let K
C IR.n be path connected and f : K -
tinuous. Let x,y E Kand f(x)
f(z) = C.
< c < f(y).
IR be conThen there is a z E K such that
4.5.3 Example Prove Theorem 4.5.1 using the fact that f(K) is connected
and that connected sets in IR. are intervals.
Solution That f(K) is connected comes from Theorem 4.2.1. Hence f(K) is
an interval, possibly unbounded. If /(x),/(y) E J(K) are such that/(x)
~
J(y),
X
Exercises for §4.5
193
then [f(x),f(y)] Cf(K) since/(K) is an inteival. Thus, if/(x) S. c S.f(y), then
c E [f(x),/(y)] c /(K), and so c = f(z) for some z. Another proof is given in
the theorem proofs section of this chapter. •
4.5.4 Example Let f(x) be a cubic polynomial. Slww that f has a (real)
root Xo (that is, f(xo) = 0).
Solution We can write f(x) = ax3 + bx2 +ex+ d, where a t- 0. Suppose that
a > 0. For x large and positive, ax3 is large (and positive) and will be bigger
than the other tenns, so that f (x) > 0 if x is large. This requires some exact
estimates but should be intuitively clear. To see it exactly, note that
· (I+ -axb+ ax- +axd)
-
ax3 +bx2 +cx+d = ax3
c
-?
3
and the factor in parentheses tends to I as x -+ co. Similarly, f(x) < 0 if xis
large and negative. Hence, we can apply the intermediate value theorem with
K = lR to conclude the existence of a point .to where /(Xo) = 0. •
4.5.5 Example Let f : [I , 2] -+ [0, 3) be a continuous function satisfying
f(l) = 0 and /(2) = 3. Show that f has a fixed point. That is, show that there is
a point xo E (I, 2) such that f (Xo) = XoSolution Let g(x) =f(x)-x. Then g is continuous, g(l) =/(I)- I= -1, and
g(2) =/(2) - 2 =3 - 2 = I. Hence, by the intennediate value theorem, g must
vanish at some x0 E [I, 2), and this Xo is a fixed point for /(x).
•
(
Exercises for §4.5
1.
What happens when you apply the method used in Example 4.5.4 to
quadratic polynomials? To quintic polynomials?
2.
Let f : an -+ am be continuous. Let r = { (x,/(x)) I x E lRn} be the graph
off in an x lRm. Prove. that r is closed and connected. Generalize your
result to metric spaces.
3.
Let/: [0, I]-+ [O, I] be continuous. Prove that/ has a fixed point.
Chapter 4 Continuous Mappings
194
4.
Let/ : [a, b) --+ IR be continuous. Show that .the range of/ is a bounded
closed interval.
5.
Prove that there is no continuous map taking [O, 1) onto ]0, I[.
§4.6 Uniform Continuity
Sometimes it is useful to have available a variant of the concept of continuity
known as uniform continuity. This variant is mostly useful for technical reasons,
such as labor-saving devices in proofs. The exact definition is as follows.
4.6.1 Definition Let (M, d) and (N, p) be metric spaces, A c M, f : A --+ N,
and B C A. We say that f is uniformly conJinuous on tlze set B if for every
e > 0 there is a 6 > 0 such that x,y EB and d(x,y) < 6 imply p(f(x),f(y)) < e.
The definition is similar to that for continuity, except that here we are required
to choose 6 to work for all x,y once e is given. For continuity, we were required
only to choose a 6 once we were given e > 0 and a particular xo. Clearly, if f
is uniformly continuous, then f is continuous.
For example, consider f : IR --+ IR, j(x) = x2. Then f is certainly continuous,
but it is not uniformly continuous. Indeed, for e > 0 and x0 > 0 given, the
6 > 0 we need is at least as small as :::/(2x0 ) (why?), and so if we choose x0
large, 6 must get smaller; i.e., no single 6 will do for all xo. This phenomenon
,cannot happen on compact sets, as the next theorem shows. ·
4.6.2 Uniform Continuity Theorem Let f : A - N be continuous and
let K c A be a compact set. Then f is uniformly continuous on K.
The use of merely bounded sets in this theorem will not do, for consider
= 1/x on the noncompact set ]0, l]. If we examine the proof that f is
continuous (Example 4.1.6), we see that f is not uniformly continuous. Of
course, we ·cannot extend/ to be continuous on the compact set (0, 1), because
f is already unbounded.
f(x)
4.6.3 Example Let f : ]0, I]
--+
uniformly continuous on [a, 1) for a
IR be de.fined by j(x) = 1/ x. Show that f is
> 0.
§4.6 Uniform Continuity
195
Solution This follows. from the unifonn continuity theorem since
compact set and/ is continuous on ]0, l] and hence on [a, 1).
•
[a, l] is a
In the next section we shall be reviewing differential calculus. However, we
wish to use some of the ideas from that area now, since they are relevant for
the next result.
4.6.4 Example Let f : ]a, b[ - JR be differentiable, and suppose that there
is a constant M > 0 such that V'(x)I ::5 M for all x satisfying a < x < b. Here, a
orb may be ±oo, and f' stands for the derivative off. Show that f is uniformly
continuous on ]a, b[.
Solution The definition of uniform continuity asks us to estimate the difference lf(x) - f(y)I in tenns of Ix - YI- This suggests using the mean value
theorem. Indeed,
f(x)- f(y) =J'(xo)(x- y)
for some xo between x and y. Hence
lf(x) - f(y)I
:5 Mix - yl;
a mapping with this property is called Upschuz. (We will encounter this property
several times again.) Given e > 0, choose 6 = e/M. Then Ix - YI < 6 implies
lf(x) -f(y)I
Hence f is uniformly continuous.
M•e
< M · 6 =. - M = e.
•
The intuition suggested by this result may shed some light on uniform continuity. Specifically, this result says that if the slope of the graph of a function
is bounded, then it is uniformly continuous. Unbounded slopes suggest that one
should suspect a failure of uniform continuity. This is often a good guide when
examining specific functions or their graphs. However, as Exercise 7 shows,
this guide is not foolproof.
4.6.5 Example Show that sin x : JR - JR is uniformly continuous.
Solution d(sinx)/dx = cosx is bounded in absolute value by I, and so, by
4.6.4, sinx is uniformly continuous.
•
Chapter 4 Continuous Mappings
196
Exercises for §4.6
1.
Demonstrate the conclusion in Example 4.6.3 directly from the definition.
2.
Prove that /(x) = I/x is unifonnly continuous on [a, oo[ for a > 0.
3.
Must a bounded continuous function on JR be uniformly continuous?
4.
If/ and g are uniformly continuous maps of JR to lR., must the product
f · g be unifonnly continuous? What if f and g are bounded?
5.
Letf(x) = Ix!- Show that/: JR-+ JR is uniformly continuous.
6.
a.
Show that f : JR -+ JR is not uniformly continuous iff there exist
an c > 0 and sequences Xn and Yn such that lxn - Ynl < l/n and
IJ(xn) - f(Yn)I ~ €. Generalize this statement to metric spaces.
b.
Use a on JR to prove that/(x) =·x2 is not uniformly continuous.
7.
Let/(x) = ./i.
a.
Show that/ is uniformly continuous on the interval [0, IJ.
b.
Discuss the relationship between uniform continuity and bounded
slopes in light of this example.
§4.7 Differentiation of Functions of One Variable
We assume that the reader is familiar with the general ideas of calculus of
functions of one variable. Our purpose here is to recall and sharpen some of
the basic theoretical points. A few of the proofs are included right in the text,
since many will be a review.
4.7.1 Definition Let the jun_ction f be de.fined on some open interval containing xo E IR. We say that f is differentiable at xo if
. /(xo + h) - f(xo)
/ '(xo ) = 11m
h
h-->O
exists. We call J' (xo) the derivative off at xo.
Rewriting this condition as
.
f(x) - /(xo) - f'(xo) · (x - Xo)
I1m - - - - - - - - - x-+xo
x - Xo
= 0,
§4.7 Differentiation of Function.s of One Variable
197
we see that the straight line y = f(x0 ) + f' (xo)(x - xo), called the tangent line to
the graph off at xo, is a good approximation to f near Xo, and rewriting it as
- f(xo)
. [/(Xo + tu)
I1m
A-
11%-0
LU
-
•
J'(Xo )] = 0,
we see thatf'(.xo), being the limit of the slopes of secant lines, can be interpreted
as the slope of the tangent line to the graph off at (xo,/(Xo)), as in Figure 4. 7-1.
y
slope
= /' (x0)
!T
FIGURE 4.7-1 J'(x0 ) is the slope of the tangent line
Another way of writing the definition that avoids division by tu (and the
exclusion tut- 0) is: For any c > 0, there is a 6 > 0 such that !tu! < {J implies
.,
IJ(xo + tu)- f(xo) - /'(xo)tul ~ cltul.
We assume that the reader is familiar with the Leibniz notation dy / dx for the
derivative. Here is the differentiability-continuity relationship:
4.7.2 Proposition /ff is differentiable at xo, thenf is continuous at x0•
Proof Indeed,
. 1:!.~/(x)
.
= _.l:!_~
0
[f(x~
=~:xo) (x - xo) +f(xo)]
=f' (xo) · 0 +f (xo) =f (xo). •
Chapter 4 Continuous Mappings
198
Here is another proof of this proposition, which we shall need later: Given
e
> 0, there is a 6 > 0 such that
IJ(xo + .1.x) - /(xo) - /'(xo)-1.xl :'.5 cl.1.xl
if l.1.xl
< 6. Thus, by the triangle inequality,
lf(xo + .1.x) - /(.xo)I $ Ii' (xo)-1.xl + elAxl •
Choosing e = I,
lf(xo + L\x) - /(.xo)I :'.5 (f'(xo) + l)IAxl
if IA.ti < 6. This shows more than continuity-it shows that/ has the Li.pschitz
property at x0 ; i.e., there exist constants M ~ 0 and c > 0 such that
MIA.ti
from this (given e > 0, choose 6 = E / M).
lf(xo + Ax) -
/(xo)I
$
if IA.ti < c. Continuity follows
This
inequality means that the graph lies inside th~ "bow tie" region in Figure 4.7-2. -
slope= M
AGURE 4.7-2 Geometric meaning of the Lipschitz property
4.7.3 Example Let f(x) = Ix!- Show that f is continuous but not differentiable at xo = 0.
Solution Indeed, the limit of the difference quotient,
lim /(Xo + Ax) - /(Xo) = lim IA.ti'
dx-o
L\x
dx-o .1.x
§4. 7 Differentiation of Functions of One Variable
199
does not exist, since l.1.xl/ .1.x is I if .1.x > 0 and is -1 if tu < 0. The function
is clearly continuous at Xo = 0 (let c = 6 in the definition of continuity).
•
4.7.4 Example
Let f (x) =
r. Calculate f' (.xo) using the definition.
Solution Here
r_
• f(x+L\x) -f(X) _ 1. (x+L\.x:)2 . [2x A-]_ 2x
I1m
- - - - - - 1m - - - - - - 11m
+ LU ,
L\x
dr-+0
L\x
&-+0
dr-+0
and so/'(x) = 2x; thus/ is differentiable at each x E JR.
•
Of course, in practical problems, we don't calculate derivatives using the
definition. Rather, we use the familiar rules of calculus:
4.7.5 Theorem Suppose that/ and g are differentiable at xo and that k E JR..
Then kf, f + g, Jg, and f / g (assuming g(xo) ~ 0) are all differentiable at Xo and
ii.
=k(f'(xo))
(f + g)'(.xo) =J'(xo) + g'(xo)
iii.
(fg)'(xo) = f(.xo)g'(xo) +/'(.xo)g(xo)
iv.
( [_)' (.xo)
· g
i.
(kf)'(xo)
=g(xo) !' (Xo) - f (xo)g' (Xo)
[g(xo)l2
constant multiple rule
sum rule
product rule
quotient rule
'r
The proof in each case is a straightf01ward limit manipulation the reader
should write out. A bit trickier is:
4.7.6 Chain Rule
/ff is differentiable at xo and g is differentiable atf(.xo),
then g of is differentiable at xo and
(g o /)' (xo) = g' (f(xo))/' (xo).
This proof is found at the end of the chapter. Of course it is via these
rules-4.7.2, 4.7.5, and 4.7.6---that we learn how to differentiate, for example,
rational functions.
'· If/ is differentiable on ]a, b[ and f' is continuous, we say that/ is of class
iC1• Of course, we can differentiate f' again, assuming that it is differentiable,
200
Chapter 4 Continuous Mappings
to get the second derivative f". Iff" is continuous, we say that f is of class
and so on.
C2,
4.7.7 Definition
We say that a function f defined in a neighborlwod of Xo
is increasing (respectively, strictly increasing) at xo if there is an interval ]a, b[
containing Xo such that:
i.
If a < x < Xo, then J(x)
~ /(xo);
respectively, J(x)
< /(xo).
ii.
If Xo < x < b, then f(x)
~ /(.xo);
respectively, J(x)
> J(.xo).
Similarly, f is decreasing (respectively, strictly decreasing) at x0
an interval ]a, b[ containing xo such that:
i.
If a < x < xo, then f(x)
~ J(xo):
respectively, J(x)
> /(.xo).
ii.
If Xo < x < b, then f (x)
$ /(.xo); respectively, f (x)
< f (xo).
4.7.8 Theorem
i.
if there is
Assuming that/ is differentiable at xo,
If f is increasing at Xo, then J' (.xo) ~ 0.
ii. · If f is decreasing at xo, then f' (Xo) $ 0.
> 0, then f is strictly increasing at .xo.
iii.
If f'(xo)
iv.
If f' (xo) < 0, then f is strictly decreasing at Xo.
We defer the proof to the end of the chapter. Using this result, we can relate
calculus to maxima and minima.
4.7.9 Proposition
lff: ]a,b[-. ]R is differentiable at c E ]a,b[ andj has
a maximum (or a minimum) at c, then f'(c) = 0.
Proof Let f have a maximum at c. Then for h > 0, [f(c + h) - f(c)]/ h ~ 0;
and so, letting h-. 0, h > 0, we get/'(c) ~ 0. Similarly, for h < 0 we obtain
f'(c) ~ 0. Hence/'(c) = 0. •
We can now put this together with the maximum-minimum theorem from
§4.4.
4.7.10 Rolle's Theorem If/: [a,b]-. lR is continuous,/ is differentiable
on ]a,b[. and j(b) = /(a) = 0, then there is a number c E ]a,b[ such that
/'(c) = 0.
§4.'7 Differentiation of Functions of One Variable
201
Proof If f(x) = 0 for all
x E [a,b], we can choose any c. Therefore assume
that f is not identically zero. From the maximum-minimum theorem (4.4.1),
we know that there is a point c1 where / assumes its maximum and there is
a point c2 where / assumes its minimum. Since/ is not identically zero and
J(a)
f(b)
O; at least one of CJ, c2 lies in ]a, b[. If c1 E ]a, b[, we get
J'(ci) = 0, by Proposition 4.7.9; similarly for c2.
•
=
=
4.7.11 Mean Value Theorem IfI: [a,b]-+ JR is
continuous and differentiable on ]a, b[. there is a point c E ]a,b[ such that f(b) - f(a) =f'(c)(b - a).
Proof Let cp(x) = f(x)- f(a)- (x - a)[f(b)- f(a)]/(b - a) (see Figure 4.7-3),
and apply Rolle's theorem. •
y
_ _ _ _ ___._ _ _ _ _ _ _ ___._ _ _ _ x
a
FIGURE 4.7-3
X
b
The mean value theorem
4.7.12 Corollary If f: [a,b] -+ JR
,'
is continuous and if f' "". 0 on ]a,b[,
then f is constant.
Proof Applying the mean value theorem to/ on [a,x] gives a point c such
that f(x) - J(a) = J'(c)(x - a) =0, so that /(x) =f(a) for all x E [a, b], and
therefore/ is constant.
4.7.13 Example
•
Let/: ]a, b[-+ JR be differentiable and
that IJ(x) - f(y)I :5 Mix - YI for all x,y E ]a, b[.
1/'(x)I :5 M.
Prove
Chapter 4 Continuous Mappings
202
Solution By the mean value theorem,
/(x)-/(y) =/(c)(x- y)
for some c E ]x,y[. Tal<lng absolute values gives the result.
•
To see how powerful ttie mean value theorem is, consider a fiJP,~tion
J·:
[a,b]-+ JR that is differentiable on ]a,b[ and continuous on [a,b], and suppose
/'(x) ~ 0 for all x E ]a,b[. Then for x1, x2 E ]a,b[ with xi < x2, we can use
the mean value theorem on/ restricted to [x1,x2]. We find that there is some
~ 0, by our
assumption. Thus/(x2) ~/(x1). We have thus ·shown that if X1,X2 E [a,b] and
x1 < x2 , then/(x 1) S/(x2). Thus/ is increasing on [a,b]. This proves i of the
next proposition.
c E lx1,x2[ such that /(x2) - /(x1) = / 1(c)(x2 - x1). But /'(c)
4.7 .14 Proposition Suppose that f is continuous on [a, b] and differentiable
on ]a,b[.
-
i.
Iff'(x)
ii.
If f' (x) S Ofor all x E ]a, b[,
iii.
/f /'(x)
iv.
If f' (x) < 0 for all x E
v.
If f' (x) 0 for all x E ]a, b[, then f(x)
x E [a,b].
~
0 for all x E ]a, b[, then f is increasing on [a, b].
> Ofor all x
=
then f is decreasing on [a, b ].
E ]a,b[, then/ is strictly increasing on [a,b].
]a, b[, then f is strictly decreasing on [a, b ].
= k (k some constant) for all
Now suppose f : ]a, b[ -+ R is such that /' (x) > 0 for all x E ]a, b[ or
f'(x) < 0 for all x E ]a, b[. Then J is strictly increasing or strictly decreasing
on ]a,b[ and so f has an inverse function. The following theorem allows us to
compute the derivative off- 1 in tenns of the derivative of'j.
4.7.15 Inverse Function Theorem Suppose that f : ]a,b[
-+ JR and
that either f'(x) > 0 for all x E ]a,b[ or f'(x) < 0 for all x E ]a, b[. Then
/: ]a, b[ -+ R is a bijection onto its range, 1- 1 .is differentiable on its domain,
and (/- 1 )'(y) = 1/f'(x) wherej(x) =y.
·
Exercises for §4.7
203
Proposition 4.7.14 is also useful for discussions of maxima and minima. For
instance:
4.7.16 Proposition Suppose that f is continuous on [a,b] and is twice
differentiable on ]a, b[, and that Xo E ]a, b[.
If f'(xo) = 0 and f" (xo) > 0, then Xo is a strict_local minimum off.
i.
--
If f' (xo) = 0 anlj" (xol__< Q, then Xo is a strict local maximum off.
ii.
What is going on in i is that f" (x0) > 0 implies that f' is strictly increasing
at x0 , and so/'< 0 just to the left of xo and/'> 0 just to the right. Now 4.7.14
can be used to get the result.
Exercises for §4.7
1./
Give an example of a function defined on JR, which is continuous everywhere and which fails to be differentiable at exactly n given points
/XI,··· ,Xn.
2.
Does the mean value theorem apply to /(x) = .,/x on [0, I]? Does it apply
to g(x) = vfxI on [-1, I]?
3. /
Let f be a nonconstant polynomial such that /(0) = /(I). Prove that /
has a local minimum or a local maximum point somewhere in the open
inteival JO, I[.
4.
A rubber cube of incompressible material is pulled on all faces with a force
T. The material stretches by a factor v in two directions and c6ntracts by
a factor v- 2 in the other. By balancing forces, one can establish Rivlin's
equation:
v3
/
S.
Tv 2
-
20 + 1 = 0,
l
where. a is a strictly positive constant (analogous to the spring constant for
a spring). Show that Rivlin's equation has one (real) solution if T < 6/2a
and has three solutions if T > 6/2a.
.
·
Let f be continuous on [3, 5] and differentiable on ]3, 5[, and suppose
that f(3) = 6 and /(5) = 10. Prove that, for some point xo in the open
inteival ]3, 5[, the tangent line to the graph off at xo passes through the
origin. Illustrate your result with a sketch.
204
Chapter 4 Continuous Mappings
§4.8 Integration of Functions of One Variable
Let A c JR be a bounded set and f : A -+ JR a bounded function. When we
say that we want to integrate the function f over the set A, we mean that we
would like to find the area under the graph off (see Figure 4.8-1). If f ~ 0,
this is the literal area; in general, it is the signed area. To do this, note first,
since A is bounded, that there is a closed interval [a,b] ::.> A. We consider f to
be defined over the whole interval [a,b] by lettingf be zero on [a,b]\A. Next,
we partition [a,b], which means that we pick an integer n and points xo = a,
x,, ... ,Xn-1,Xn =bin such a way that a= Xo < x, < · .. < Xn-1 < Xn = b.
Denote such a partition by P; that is, let P = {xo, ... , Xn}. Then, form the two
sums
n-1
U(f,P)
= I)sup{J(x) Ix E [x;,X;+d})(x;+1 - x;)
j::(}
and
n-1
L(f,P) = I)nf{f(x) Ix E [x;,X;.1-1]})(x;+1 - x;),
i=O
called the upper and lower sums, respectively. The first sum is the sum over all
intervals [x;,X;+il of the maximum(= sup) of/ in each interval times the length
of that interval and has value equal to the area of the shaded region shown in
Figure 4.8-1. Since f is assumed to be bounded, the sup exists in each interval.
The second sum is the sum over all intervals [x;,X;+d of the minimum (or inf)
off in each interval times the length of that interval and is the hatched region
shown in Figure 4.8-1. The boundedness of the function again guarantees that
the inf exists.
Since f is bounded-say, -M :$ f :$ M-we see that
-(b - a)M :$ L(f, P) :$ U(f, P) :$ (b - a)M
for any partition P of [a,b]. Let
S = inf{ U(f, P) I P is any partition}
and
s = sup{ L(f, P) I P is any partition}.
If we again look at Figure 4.8-1, it seems reasonable to expect that as the
size of the intervals in P gets smaller, U(f, P) decreases while l(f, P) increases.
In the limit of decreasing size of the intervals of P, the numbers U(f, P) and
l(J, P) should converge to a common value. This leads us to the following
definition.
205
§4.8 lntegration of Functions of One Variable
y
----·X
FIGURE 4.8-1
The upper and lower sums for this function
4.8.1 Definition We say that/ is Riemann integrable (or just integrable, or
that the integral exists, for short) ifs S. The common value s S is denoted
by JAi or by JAJ(x)dx. If A= [a,b], we write
=
lb
f(x)dx =
=
11 lb
=
f.
It should be noted that integrability does not really involve smoothness or
continuity properties off. In fact, some badly discontinuous functions can still
be integrable.
•#
One thing about this procedure may seem puzzling. Why do we insist that
inf{ U(J, P)} = sup{ L(J, P)}? At first, we might think that this relation will
always hold. However, this is not always the case, as the next example shows.
4.8.2 Example Calculate inf{ U(J, P)} and sup{ L(J, P)} for the function
f : [0, I] C
JR
--+
JR defined by f(x)
= 1 if x is irrational and f(x) = 0 if x is
rational.
=
=
Solution We claim that inf{ U(f, P)} I and sup{ L(J, P)} 0. Indeed, on
any interval,/ is always I at some points and zero at others, so that the inf on
any interval is .zero and the sup is I. Therefore, the integral of this function
over the set [0, I] does not exist for our purposes. In more advanced work the
206
Chapter 4 Continuous Mappings
integral of such a "pathological function" can be defined, but we shall be dealing
mostly with "~ecent functions," for which the integral exists. •
4.8.3 Example Suppose that f : [a,b]
-+
JR is (Riemann) integrable and
f ~ 0. Show that J: f (x) dx ;:: 0.
Solution By definition, the integral is the infimum of sums of the form
L
n-1 (
i=O
sup
f(x)
)
· (x;+ 1 - x;)
xE[x;,x;. 11)
over all partitions. But each of these sums U(f, P) is nonnegative, since f;:: 0.
Hence the integral is ;:: 0, since the inf of a set of nonnegative numbers is also
nonnegative. •
The first result is a theorem giving us some insight into which functions are
integrable. We shall study the problem in greater depth in Chapter 8.
4.8.4 Theorem
i.
If f : [a, bJ -. JR is bounded and is continuous at all but finitely many
points of[a,b], then it is integrable on [a,bJ.
ii.
Any increasing or decreasing function on [a,bJ is integrable on [a,bJ.
We can also prove the following proposition, which gives some rules of
integration.
4.8.5 Proposition
i.
If f is integrable on
[a, bJ and k E JR, then kf is integrable on [a, b J and
f:kf=kf:f.
iii.
If f and g are integrable on [a, bJ, then f + g is integrable on [a, bJ and
J:(!+g) = J:1 + I: g.
If f and g are integrable on [a, bJ and f(x) $ g(x) for all x E [a, b], then
iv.
I:!$
If f is
ii.
J:g.
integrable on [a,b] and [b,cJ, then f is integrable on [a,c] and
1:1=I:J+ J:f.
§4.8 Integration of Functions of One Variable
207
Each of these properties is plausible in terms of the interpretation of the
integral as the signed area under the graph. Here is a useful consequence of
iii. Note that -1/1 f
I/I, so that - I/(x)I dx
f(x) dx
IJ<x)I dx.
Thus,
55
Ilb
5J:
J:
f (x) dx
51b I/(x)I
5J:
dx.
For the development of the proofs and for use in the examples, we introduce
the following notation:
1b
f(x) dx = inf{ U(f, P) I P is any partition},
the upper integral
1b
f(x) dx = sup{ L(f, P) I P is any partition},
the lower integral
and
so that the definition of integrability reads
1b
f (x) dx =
1bI
(x) dx.
In the theorem proofs section, we shall show that
1b
f(x)dx
51b
f(x)dx
is always true; thus, integrability asserts that strict inequality is not the case.
Let us now compute some integrals directly from the definition.
'f'
4.8.6 Example Compute
Ji3 1dx.
Solution Consider a partition P : I
of I over each subinterval is I,
=x
0
< x 1 < ••• < Xn =3.
Since the inf
n
L(l, P) =
L 1·
(X; - X;-1)
= Xn
-
Xo
= 3 - I = 2. •
i=I
Li,kewise, each sup over [x;,X;+il is 1, and so U(l,P) = 2. Thus
1
3
I dx
=2,
1
3
I dx = 2,
and so
1
3
1 dx
=2.
•
Chapter 4 Continuous Mappings
208
4.8.7 Example
Compute f01x dx.
Solution Define Pn by x; = i/n and let m; be the inf off(x) over [x;-1,x;].
In this case, m; = (i - 1)/n, and so
L(x,Pn)=
t (i-1). (~ -i-1) t i-1. !
=
n
i=I
n
n
~
i=l
n
= __!_ ~(i _ l) = __!_ . n(n - 1)
n 2 L.,;
(since
2
n2
i=l
I:7= 1 i = n(n + 1)/2). Thus
n- 1 1
lim L(x, P,,) = lim - 2= -2 .
n-+co
n
n--+CX)
Hence f01x dx ~ 1/2, since f01x dx is an upper bound for the lower sums.
Similarly, the sup off(x) over [x;_ 1,x;] is M; = i/n, and so
n
U(x,Pn)=L
(i) (i i-1)
-
·
n
~l
1
n i
- - - =L-•-
n
n
~I
n
n
_ __!_ ~ i _ __!_ . n(n + 1)
- n2 L.,; - n2
2
·
i=l
Thus limn--+oo U(x, P11 )
= 1/2, and so f01 x dx :::; l /2.
-1 <
11 11
xdx<
2-_o_
and so
1 =1
1
xdx
_o_
o
1
xdx
-
o
xdx< -1
-2
1
=~ =
2
Therefore,
o
1
xdx.
•
Now that we have characterized a large class of integrable functions, we
may still ask what is a practical way to compute. integrals. The answer in one
dimension is, of course, that we use antiderivatives and the_ usual techniques of
integration. The techniques for higher dimensions are given in Chapters 8 and 9.
For f : [a, b] - IR, an antiderivative off is a continuous fuuction F :
[a, b] - lR such that Fis differentiable on ]a, b[ and F'(x) = f(x) for a< x < b.
The following theorem provides an effective method for compu~i_ng integrals of
a wide class of functions.
§4.8 Integration of Functio,is of One Variable
209
4.8.8 Fundamental Theorem of Calculus
Let f : [a,b]
-+
IR be
continuous. Then f has an antiderivative F and
1b
f(x)dx = F(b) - F(a).
If G is any other antiderivative off, we also have
4.8.9 Example
Jt
2 sinx dx
J0"' 12 sinxdx
= - cos('rr /2) -
J: f(x)dx = G(b) -
G(a).
= I, because d(- cosx)/dx = sinx and so
= I.
(- cos(0))
The reader should be familiar
•
with these ideas.
Recall the basic intuition concerning Theorem 4.8.8. Namely, define
F(x) =
and suppose/
~
l<
f (t) dt
0, for simplicity, so that F is the area under the graph off from
a to x. The fact that F' = f comes about because/(x) is the rate at which this area
is increasing. Indeed, this ought to be clear, because F(x + At) - F(x) :::::: f(x)!u
(Figure 4.8-2).
.,,
area"" /(x) Ax
a
FIGURE 4.8-2
x x+.O.x
b
The geometry of the fundamental theorem of calculus
We assume the reader knows, or is willing to review, the basic integration
techniques that are obtained from this theorem, such as the method of substitution
Chapter 4 Continuous Mappings
210
(chain rule) and integration by parts (product rule). These shall be taken for
granted in the rest of this book.
4.8.10 Example Let F(x) = J:J(t)dt. Is F differentiable if f is merely
(Riemann) integrable?
Solution No, continuity in the fundamental theorem cannot
be replaced by
integrability. For example, let /(x) = 0 on [0, 1) and f(x) = 1 on ) 1, 2). Then
F(x)
0 on [O, I] and F(x) x - 1 on ]I, 2), wh~ch is continuous but not
differentiable at x = I (see Figure 4.8-3).
•
=
=
y
y
f
--------x
FIGURE 4.8-3
The "indefinite" integral F off need not be differentiable
Exercises for §4.8
1. /show directly from the definition that
2.
J: dx = b - a.
t/ Evaluate Jt (x + 5) dx using the definition of the integral.
3. JShow directly from the definition that for f : [a, b] -+ IR and P any
partition of [a,b], U(f,P) ~ L(J,P).
4.
J..e-tf: [a,b] c IR-+ IR be integrable and/~ M. Prove that
lb
f(x)dx
~ (b -
a)M.
211
Theorem Proofs for Chapter 4
5.
Evaluate
6.
Evaluate
1
. I
(x+2)9 dx.
7./2tf: [0, 1)
-+ lR,f(x)
= 1 if x =1/n, nan integer, and/(x) =0 otheiwise.
a.
·Prove that/ is integrable.
b.
Show that
Jd f (x) dx = 0.
~
8. ~ t f : [a, b] -+ JR be Riemann integrable and 1/(x)I
M. Let F(x) =
f(t) dt. Prove that IF(y) - F(x)I ~ Mly - xi. Deduce that Fis continuous. Does this check with Example 4.8.10?
J:
Theorem Proofs for Chapter 4
4.1.4 Theorem Let f : A c M -+ N be a mapping. Then the following
assertions are equivalent:
i.
f is continuous on A.
ii.
For each convergent seque.nce Xt -+ Xo in ,A, we have f(xt.) -+ f(.xo).
Iii.
For each open set U in N, 1- 1(U) c
1- 1(U) = VnAfor some open set V.
iv.
For each closed set F C N, f- 1(F) C A is closed relative to A; that is,
1- 1(F) = G n A for some closed set G.
A is open relative to.A; that is,
Proof The pattern of the proof will be i ~ ii ~ iv ~ iii ~ i.
i=>ii Suppose that X.t -+ .xo. To show that/(xt.)-+ /(x0 ), let e > 0; we must find
an integer N so that k ~ N implies p(f(xt.),f(.xo)) < e. To do this, choose
6 > 0 so that d(x,x0 ) < 6 implies p(/(x),/(.xo)) < e. The existence of
such a 6 is guaranteed by the continuity of/. Then choose N so that k ~ N
implies d(Xt.,Xo) < 6. This choice of N yields the desired conclusion.
Chapter 4 Continuous Mappings
212
ii==>iv Let F c N be closed. To show that/- 1(F) is closed in A, we note that a
set B is closed relative to A iff for every sequence Xk E B that converges
to a point x E A, we necessarily have x E B. Let Xk E /- 1(F) and let
Xk-+ x, where x EA. We must show that x E/- 1(F). By ii,f(xk)-+ f(x),
and since f(xk) E F and F is closed, we conclude that f(x) E F. Thus
X
Ef- 1(F).
iv==>iii If U is open, let F = N\ U, which is closed. Then, by iv, 1- 1(F) = G n A
for some closed set G. Thus/- 1(U) =An (M\G), and so/- 1(U) is open
relative to A.
> 0, we must find 8 so that d(x, Xo) < 8 implies
Since D(f(Xo),c) is an open set, /- 1(D(f(xo),€)) is
open, by iii. Thus, by the definition of an open set and the fact that x0 E
1- 1(D(f(Xo),c)), there is a 8 > 0 such thatD(x0 ,o)nA cf- 1(D(f(x0 ),c)).
This is another way of saying that
iii==>i Given xo E A and
p(j(x),f(xo))
<
€
€.
(x EA and d(x,xo)
< 8) ==> (p(f(x),/(xo)) < €).
•
To gain practice with these concepts, one might try proving (directly) other
implications within the theorem, for example, i ==> iii, or ii ==> i.
.
'
4.2.1 Theorem Suppose that f : M -+ N is conti~uous and K c· M is
connected. Then f (K) is connected. Similarly, if K is path-connected, so is f (K).
Proof Suppose f(K) is not connected. By definition, we can write /(K) C
UU V, where Un Vnf(K) = 0, Unf(K) f. 0, Vnf(K) f. 0, and U, V are open
sets. Now,f- 1(U) = U'nK for some open set U', and similarly,f- 1(V) = V' nK
for some open set V'. From the conditions on U, V, we see that U' n V' nK = 0,
K c U' U V', U' n K f. 0, and V' n K f. 0. Thus K is not connected, which
prov~s the first assertion.
For the second part, let z = f(x), w = f(y) be two points in f(K), where
x, y E K. Let c(t) ,be a continuous curve joining x and y. Then d(t) = f(c(t)) is
a continuous curve joining z and w. It is continuous because if tk -+ to, then
c(h) -+ c(to), since c is continuous; consequently f(c(tk)) = d(tk) -+ f(c(to)) =
d(to), since f is continuous. Thus d is continuous, and hence f (K) is pathconnected.
•
4.2.2 Theorem Suppose that f
compact. Then f (B) is compact.
: M
-+
N is continuous and B c M is
213
Theorem Proofs for Chapter 4
Proof Let Yk be a sequence in f (B).' By Theorem 3.1.3 it must be shown
that Yk has a subsequence converging to a point in f (B). Let Yk = f (xk), for
Xk E B. Since B is compact, there is a convergent subsequence, say, Xkn -+ x,
for x E B. Then, by Theorem 4.1.4ii,/(Xkn) -+ /(x), and so/(xkn) is a convergent
subsequence of y,,_. •
4.3.1 Theorem Let M, N, and P be metric spaces and suppose that f : A c
M -+ N and g : B C N -+ P are continuous mappings with f(A) C B. Then
g of : A C M -+ P is continuous.
Proof Let Uc P be open. Then (gof)- 1(U) =J- 1(g- 1(U)). Now, g- 1(U) =
U' n B for some U' open, and 1- 1(U' n B) = 1- 1(U'), since f(A) c B. Since
f is continuous, 1- 1( U') = U" n A for U" open. Thus g of is continuous, by
Theorem 4.1.4. •
The other conditions of Theorem 4.1.4 could also be used to prove Theorem
4.3.1. Instead of proving Theorem 4.3.2, we shall confine ourselves to proving
its corollary. The general case is similar; the only complexity is in the notation.
4.3.3 Corollary Let A
i.
ii.
C M,
and let x0 E A be an accumulation point of A.
Let f : A -+ V and g : A -+ V be continuous at xo; then the sum
f + g : A -+ V is continuous at xo.
Let f : A
-+
JR and g : A
-+
V be continuous at xo; then the product
f · g : A -+ V is continuous at xo.
iii.
Let f : A -+ JR and g : A -+ V be continuous at xo with f (xo) #0; then
f is nonzero in a neighborhood U of xo and the quotient g/f: U-+ Vis
continuous at xo.
Proof
i.
Let xo E A and suppose £ > 0 is given. Choose 61 > 0 such that
d(x,xo) < 61 implies d(f(x),f(xo)) < e/2 and 62 > 0 such that d(x,xo) <
62 implies d(g(x), g(x0 )) < £ /2. Let 6 be the minimum of 61, 62. Therefore,
if d(x,x0 ) < 6, the triangle inequality gives
II{/+ g)(x) - (J + g)(Xo)II = llf(x) - f(xo) + g(x) - g(xo)II
:::; 11/(x) - /(xo)II + llg(x) - g(Xo)II
:::; e/2+£/2 = £.
Chapter 4 Continuous Mappings
214
ii.
Let xo E A and suppose e > 0. Choose 61 such that d(x,Xo) < 61
implies IJ(x) - f(xo)I < e/2(llg(xo)II + 1) and IJ(x)I :5 IJ(xo)I + I (why
is this possible?). Also, choose 62 such that d(x,Xo) < 62 implies that
llg(x) - g(xo)I I < e /[2(j/(xo)I + l)]. Then for 6 = min(61, 62), d(x, xo) < 6
implies (by the triangle inequality)
llfg(x) - fg(xo)II = llf(x)g(x) - f(x)g(Xo) +f(x)g(xo) - f(xo)g(xo)II
:5 1/(x)l llg(x) - g(.xo)II + 1/(x) - /(xo)l llg(.xo)II
(since llavll = lal llvll for v E V and a E IR). Continuing with this line of
reasoning, we get
(1/(xo)I + l)e
llfg(x) - fg(xo)II
iii.
llg(xo)lle
< 2(1/(xo)I + 1) + 2(1lg(xo)II + 1) :5
e
e
2 + 2 = e.
By the proof of ii, it suffices to consider the case 1//, because g/f =
g • (l/f). To show that 1// is continuous, given xo E A, choose 61 such
that lf(x) - /(xo)I :5 (l/(xo)l/2) for llx - xoll < 61. This is possible by
the continuity off. It follows that 1/(x)I ~ (l/(xo)l/2). Now, given e > 0,
choose 62 such that llx - xoll < 62 implies
1/(x) - /(xo)I
Letting 6 = min(61,62), llx- xoll
< elf~o)l2.
< 6 implies
I_I __l_, _
f(x)
/(Xo) -
I
,/(Xo) - f(x) < !J(x) - /(xo)I
f(xo) f(x) l/(xo)l 2/2
This shows that 1/f(x) is continuous at x0 •
<e
·
•
4.4.1 Maximum•Minimum Theorem let (M,d) be a metric space, let
--+ IR be continuous. Let K C A be a compact set. Then f is
bounded on K; that is, B = {f(x) I x E K} C IR is a bounded set. Furthermore,
there exist points Xo, X1 E K such that f(xo) inf(B) and J(x1) sup(B). We call
sup(B) the (absolute) maximum off on Kand inf(B) the (absolute) minimum of
f on K.
A C M, and let f : A
=
=
Proof First, B is bounded above, for B = f(K) is compact, by Theorem 4.2.1,
and so it is closed and bounded, by the definition of compactness. Second, we
215
Theorem Proofs for Chapter 4
want to produce an x 1 such' that x 1 E K and /(xi) = sup(B). Now, since B is
closed, sup(B) E B J(K) (see Exercise 8, Chapter 2). Thus sup(B) f (xi) for
some x 1 E K. The case of inf(B) is similar.
•
=
=
Note. We can also prove the case of inf(B) by applying the foregoing proof
of the supremum case to -f and observing that the maximum of -f is - the
minimum off.
4.5.1 Intermediate Value Theorem Let M be a metric space, A c M,
and f : A -+ JR be continuous. Suppose that KC A is connected and x,y E K.
For every number c E JR such that f(x) < c < f(y), there exists a point z E K
such that f (z) = c.
Proof Suppose no such z exists. Let U =] - oo, c[ ={t E JR I t < c} and
let V = ]c, oo[. Clearly, both U and V are open sets. Since f is continuous, we
have 1- 1( U) = Uo n K for an open set Uo, and similarly, 1- 1(V) = Vo n K. By
the definition of U and V, we have Uo n Von K = 0, and by the assumption that
{z EK lf(z) = c} = 0, we have Uo U Vo:, K. Also, Uo n K f- 0, since x EU;
and V0 n K -:j:. 0, since y E V0 • Hence, K is not connected, a contradiction. •
4.6.2 Uniform Continuity Theorem Let/: A-+ N be continuous and
let K c A be a compact set. Then f ~s uniformly continuous on K.
Proof Given c > 0 and x E K, choose Dx such that d(x,y) .~ Dx implies p(f(x),f(y)) < c/2. The sets D(x, Dx/2) cover K and are open. Therefore, th~re is a finite covering, say, D(x1, bx, /2), ... , D(XN, bxN /2). _Let 6 =
minimum Dx./2, ... , DxN/2. If d(x,y) < f,, then there is an x; such that d(x,x;) <
6xJ2 (since the disks cover K), and therefore d(x;,y) $ d(x,x;) + d(x,y) < bx;·
Thus, by the choice of 6,
p(f(x),J(y)) $ p(f(x),J(x;)) + p(f(x;),J(y))
< c/2 + c/2 = c.
•
4.7.5 Theorem Suppose that/ and g are differentiable at x0 and that k E JR.
Then kj. f + g, Jg, and/Jg (assuming g(Xo) f- 0) are all differentiable at Xo and
i.
(kj)' (xo) = k(f' (xo))
ii.
(J + g)'(xo) = J'(xo) + g'(xo)
constant multiple rule
sum rule
Chapter 4 Continuous Mappings
216
iii.
(Jg)' (Xo) = f(.xo)g'(Xo) +f'(.xo)g(xo)
product rule
iv.
(£)'
quotient rule
(.xo) = g(xo) f' (xo) - f (xo)g' (Xo)
[g(xo)F
g
Proof The result in each case follows from the laws of limits. For instance,
to prove i,
Ckf)'(xo) = lim kf(x) - k/(Xo) = lim k [J(x) - f(Xo)]
x-+xo
x - xo
x-+Xo
x - Xo
_ k 1. f(x) - f(xo) _ kf'( )
1m
xo.
x-+xo
x - Xo
,
-
Readers should fill in the remaining parts themselves, consulting a calculus text
if needed.
•
4.7.6 Chain Rule If! is differentiable at Xo and g is differentiable atf(xo),
then go f is differentiable at Xo and
(g o [)' (Xo)
= g' (f(xo)) J' (xo).
Proof Let y =f(x) and z = g(y). The often used argument, namely,
&:
(&:
= ( lim &:) ( lim !::,,y)
!ly
Ax
=( lim Ay
&:) ( Ii~ !ly)
=dy
dz dy
fix
dx'
dz= lim
= lim
Ay)
dx ar-+o Ax ar-+o Ay Ax
ar-+o
Ar-+O
by-+O
Ar-+0
has the flaw that t::,,y could vanish even if Ax does not.
Here is a more careful argument that avoids these problems with denominators. The definition of differentiability of a function k(-u) at u can be written as:
Given c > 0, there is a r, > 0 such that l!lul < r, implies
lk(u + Au) - k(u) - k'(u)!lul
Given e
< clllul.
> 0, we must find a 6 > 0 such that
lg(f(x + Ax)) - g(f(x)) - g' (f(x)) J' (x)Axl
< c Ax
(1)
Theorem Proofs for Chapter 4
217
if l&I < 8. Here we have written x for xo to simplify notation. Adding and
subtracting g'(f(x))!J./, the left-hand side of (1) reads
!Ag - g' (f(x))Af + g' (J(x))N - g' (f(x)) J' (x)&I
~ IAg - g'(f(x))Afl
where
Ag
=g(f(x + &)) -
g(f(x))
+ jg'(f(x))j IA/ - J'(x)&!
=g(f(x) + Al) -
(2)
g(f(x))
and
N = f(x + &) - f(x).
Since f has the Lipschitz property at x (see the comments following 4.7.2),
!NI < Ml&I if!&! < c for some c > 0. Since g is differentiable atf(x), given
c1 > 0, there is a 81 > 0 such that if !NI < 81, then IAg- g'(f(x))Afl < cdNIThus we should choose 8 so that IN! < 81, i.e., 8 ~ 8ifM. Then the
right-hand side of (2) is bounded above by
ct INI + jg' (J(x))IA/ -
!' (x)&I
~ c1Ml&I + Nltif -
f' (x)&I
where N = lg'(f(x))j. Given E2 > 0, choose 82 so that IN -f'(x)&I
l&I < 82. Then the right-hand side of (3) is bounded above by
(3)
< e2l&I if
(4)
Now it is clear how to choose things: Given e > 0, let e: 1 = e:/2M, c 2 = c/2N,
and 81,82 be the corresponding o's. Let 8 = min(8 1/M, 82). Then (4) shows that
(1) holds.
•
4.7.8 Theorem Assuming that/ is differentiable at x0,
i.
If f is increasing at xo, then f' (xo) 2'.'. 0.
ii.
If f
iii.
Iff' (xo) > 0,
iv.
If f' (xo) < 0, then f is strictly decreasing at xo.
.'
is decreasing at xo, then f' (xo) ~ 0.
then f is strictly increasing at XQ.
Proof i and ii follow from iv and iii (why?), and so we illustrate by proving
iii. Thus, suppose f' (xo) > 0. By definition,
lim f (x) - f(xo) = J' (xo).
x-+xo
X -Xo
Chapter 4 Continuous Mappings
218
Let c = /'(Xo). There is a neighborhood N of Xo such that for x E N and x -;. xo,
we have
- /(Xo) _ J'(:xo)I <J'(:xo).
1/(x)X-Xo
or equivalently,
0
< J(x) -
/(Xo)
X-Xo
< 2/' (XQ).
Thus
j(x)- J(xo)
X-Xo
is positive for x E N, x 'F xo, so that if x > xo, then x - xo
f(x) - f(xo) > 0 (i.e., f(x) > f(:xo)); and if x < xo, then x - xo
f(x) - f(xo) < 0 (i.e., f(x) < f(xo)).
•
>0
<0
and so
and so
4.7.14 Proposition Suppose that/ is continuous on [a, b] and differentiable
on ]a,b[.
i.
IfJ'(x) 2'. 0 for all x
iL
If f' (x)
iii.
If f'(x) > 0 for all x E ]a, b[. then f is strictly increasing on [a, b].
iv.
If f'(x) < 0 for all x E ]a,b[, then/ is strictly decreasing on [a,b].
v.
If f' (x)
~
E ]a, b[, then f is increasing on [a, b].
0 for all x E ]a, b[, then f is decreasing on [a, b].
= 0 for all x
E ]a, b[, then f(x)
= k (k some constant) for all
xe[a,b].
Proof The result in each case follows directly from the mean value theorem.
The text proved i, and the proofs in the other cases are similar. •
4.7.15 Inver~e Function Theorem Suppose that f : ]a,b[ -+ lR and
that either f' (x) > 0 for all x E ]a, b[ or f' (x) < 0 for all x E ]a, b[. Then
/ : ]a, b[ -+ lR is a bijection onto its range, 1- 1 is differentiable on its domain,
and (f- 1)'(y) 1//'(x) where f(x) y.
=
=
Proof Since/ is monotone by the preceding proposition, 1- 1 exists, and by
4.2.1 and 4.3.S, its domain is an interval. First we show that 1- 1 is continuous.
Suppose that f' (x) > 0, so that f is strictly increasing. (A similar proof works
if f is strictly decreasing.) If U c ]a, b[ is an open set, let y E /(U), so that
219
Theorem Proofs for Chapter 4
y = f(x) for some x E U and there is an open inteival ]x1, x2[ with x E ]x1 ,x2[ and
]x1,x2[ c U. Now f(x1) < J(x2) andJOx1 ,x2D c Jf(x1),J(x2)[, sincef is strictly
increasing. For any c with /(xi) < c < f(x2), there is some z E ]x1,x2[ with
f(z) = c, by the intermediate value tlieorem. Thus/Gx1,x2D = ]/(xi),/(x2)( c
J(l]) and y =/(x) E ]J(x1),/(x2 )[. We have shown that (f- 1 1(U) =J(U) is
open. Hence, by Theorem 4.1.4,/- 1 is continuous.
Now write y = f(x), so that x =f- 1(y); and Yo =/(xo), so that Xo =1- 1(yo).
Since 1- 1 is continuous, limy-Yo x = Xo, Then
·
r
(f-t)'(yo) = lim 1-t(y) - 1-t(yo) = lim
x - Xo
Y-Yo
y - Yo
Y-Yo f(x) - J(:xo)
=Y-Yo
lim
I
=_l_ •
J(x) - f(xo) /'(:xo) ·
x- Xo
4.7.16 Proposition Suppose that
f is continuous on [a,b] and is twice
differentiable on ]a, b[, and that Xo E ]a, b[.
> 0, then
i.
If f' (Xo) = 0 and f" (xo)
ii.
If f' (Xo) = 0 and f" (xo) < 0, then xo is a strict local maximum off.
Xo is a strict local minimum off.
Proof
i.
If /"(:xo) > 0, then f'(x) is increasing at Xo, and so there is a 6 > 0 such
that/'(x) < 0 if xo- 6 < x < xo and/'(x) > 0 if xo < x < :xo+6. Thus, by
4.7.14, /(x) > J(xo) if Xo - 6 < x < xo andf(x) > f(:xo) if Xo < x < Xo +6.
ii.
The proof is similar.
•
• I
4.8.4 Theorem
i.
If f : [a, b] -+ JR is bounded and is continuous at all but finitely many
points of[a,b], then it is integrable on [a,b].
ii.
Any increasing ordecreasingfunction on [a,b] is integrable on [a,b].
Let us first prove:
Upper-Lower Integral Inequality Fora boundedfunctionf: [a,b]-+
R,
f:J :5 f:J.
Chapter 4 1Continuous Mappings
220
Proof If P and P' are partitions of [a,b] with P c P', then P' is called a
refinement of P. To get our inequality, we first prove the following result:
Lemma If P' is a refinement of P, then L(f, [a,bl,P')
U(i,[a,b],P') $ U(f,[a,b],P).
~ L(f, [a,b],P) and
.
.
Proof Let Po= P and P1 =PU {at}, where a1 E P'\P, and let -P; = P;-1'U
{a1}, where a; E P'\P;- 1, so that for some t, P = Po c P1 C P2 C · • • C P, = P'
and P;+1 has one more point than P;. If P; = {xo,x1, ... ,xn} with Xj < Xj+l and
Xk-l < lli+I
Xk, then
.
<
'.
L(f, [a, b],P;) = m1 (xi - Xo) + m2(X2 - X1) + • • • + mk-1 (Xk-l - XA:-2)
+ m1;(X1; - Xt-1)
+ mk+1(Xk+I - Xt) + • · · + mn(Xn - Xn-1)
i.
and'
L(f, [a,b],P;+1) = m1(x1 - Xo) + m2(.x2 :- xi)+···+ mk-t(Xk-t - XA:-2)
+ li (a;+1 - X1:-1) + l2(Xk - a;+1)
+ mk+I (Xt+l __: X.t) + · · · + mn(Xn - Xn-1)
where mi = inf{/(x) I x E [Xj-1,xj]}, li = inf{/(x) I x E [Xk-1,a;+d}, and
I x E [a1+1,X.1:1}. The two sums are identical except that in the
second sum, m.1:(x1: - X.t'-1) is replaced by l1(a;+1 -;- X.1:-1) + '2(x.1: - ai+l). Since
m1: = inf{/(x) Ix E [X1:-1,xk]}, mk is a lower bound for {/(x) Ix E [x1:-1,ai+t1}
and {/(x) Ix E [a;+1,xkl}. Thus mk $ l1 and m1: $ '2 and
12 = inf{/(x)
/i(a;+1 - Xk-1) + /2(X1: - a;+i)
~
mt(lli+I - Xt-1)+.m.1:(Xk .- a;+1)
= mt(Xk - X.t-1 ).
Thus L(f, [a,b],P;+ 1) ~ L(f, [a,b],P;). Since this inequality holds for i = 0, 1,
... , t - 1, we have L(f, [a, b], P') ~ L(f, [a, b], P). The proo~ of the result for
upper sums is similar.
Y
For a given partition P of [a,b], we have M; ~ m;, so that L(f, [a,b],P) $
U(f,[a,b],P). Now let P1,P2 be any partitions of [a,b]. Then, by the lemma,
L(f,[a,b],Pi) $ L(f,[a,b],P1 UP2) $ U(f,[a,b],P1 UP2) ' $ U(f,[a,b],P2).
Thus an upper sum for any partition is an upper bound.for
{ L(f, [a, b], P) I P is a partition of [a, b]}.
Theorem Proofs for Chapter 4
221
J:
J:
Hence, for any partition P,
f 5 U(f, [a,b],P). Thus
f is a lower bound
for {U(f,[a,b],P) IP is a partition of[a,b]}, and so
-
1bf 51bf.
Proof of 4.8.4
•
J;J ;:=: L(f,[a,bJ,P) for any partition P of [a,b] and
ibf
5 U(f,[a,b],P).
J: J;
Hence, to show that f = f, it suffices to show that for any c
a partition P of [a,b]with U(f, [a,b].P) -L(f, [a,b],P) < €.
i.
> 0,
there is
Suppose that/ is continuous on [a,b] except at a 1 ,a2,, .. ,am. Since/
is bounded, there are numbers k1, k2 such that k1 < /(x) < k 2 for all
x E [a, b]. Use a partition P 1 such that every subinterval including some
a; has length less than or equal to c /2m(k2 - k1 ). Since/ is continuous on
the union U of the remaining subintervals of P1,f is unifonnly continuous
on U. Hence there is some 6 such that x1, x2 E U and lx1 - x2 I < 6
implies l/(x1) - /(x2)] < € /2(b - a). Refine P1, if necessary, so that every
subinterval of the refined partition P not containing some a; has length
less than 6. Now
n
U(f, [a,b],P) - L(f, [a,b],P) = L(Mj - mj)(Xj - Xj-i),
j=I
=
=
where P consists of a xo < x, < · · · < Xn b, Mi = sup{/(x) I x E
[Xj-1,Xj]}, and mi = inf{/(x) I x E [Xj-J,Xj]}. For every j, Mi 5 k2
and mi ;:=: k1, since k2 and k1 are upper and lower bounds for /(x) on
[a, bJ. We first estimate the part of the sum for subintervals containing
discontinuities, and then the rest of the sum.
j
j
wllh some
a;erxj-•· Xjl
with
101H
a;e~j-i·
xi1
L
= (k2 - k1)
j
wilh
10mC
a;e~j-•· xjl
~
€
(k2 - ki)2(k2 - k1) =
€
2'
Chapter 4 Continuous Mappings
222
since there are at most m subintervals including a point of discontinuity.
For [Xj-t,Xj], on which/ is continuous,/ takes on a maximum value Mi
and a minimum value mi, and since the interval has length less than 6 and
is contained in U, Mi - mi< e:/2(b - a). Thus
L
L
2(b ~ )
a J. with I
<
(Mj - mj)(Xj - Xj-1)
J. with I continuolll
on 1Xj-i• "jl
(xi - Xj-1)
can1inuoaa
on 1"j_ 1.
"jl
e:
~ 2(b - a) · (b - a)=
e:
2'
since the total length of the subintervals cannot exceed the length of [a, b].
Thus
n
U(f, [a,b], P) - L(f, [a,b],P) = L(Mj - mj)(Xj - Xj-i)
j=I
L
=
(Mj - mj)(Xj -Xj-i)
j wllb IOIDC
a;erxj-i· "jl
L
+
j
with /
on
(Mi - mj)(Xj - Xj-i)
contlnuow
rxj-1 ·
"ii
< e:/2 + e:/2 = e:.
Hence f is integrable on [a, b].
ii.
If/ is increasing on
[a,b], then m; =/(x;_ 1) and M; = f(x;). Thus
n
L(f, [a,b], P) = Lf(x;_,)(x; - X;-1)
i=l
and
n
U(f, [a,b],P) = Lf(x;)(x; - X;-1).
i=I
Pick a partition P such that all subintervals have length (b - a)/n. Then
n
n
)
U(f,[a,b],P)-L(f,[a,b],P)= b :a ( ~f(x;)~f(x;-1)
b-a
= --(f(xn) - J(xo))
n
b-a
= -(f(b) - /(a)).
n
Theorem Proofs for Chapter 4
Pick n
223
> (b - a)(f(b) - f(a))/c:.
Then
(b - a)(f(b) - f(a))
U(f, [a,b],P) - L(f, [a,b],P)
< (b _ a)(f(b)- /(a))/c = c
and so f is integrable on [a,b]. The proof for a decreasing function is
similar.
•
4.8.5 Proposition
i.
If f is integrable on [a, b] and k
E JR., then kf is integrable on [a, b] and
f:kf=kJ:f.
ii.
If f and g are integrable on [a,b], thenf + g is integrable on [a,b] and
J:(f+g)= I:t+
iii.
If f and g are integrable on [a,b] andf(x)
J:15,
iv.
I: g.
I: g.
~
g(x)for all x E [a,b]. then
If f is integrable on [a,b] and [b,c], then f is integrable on [a,c] and
1:1=I:t+ J:J.
Proof
i.
For a partition P of [a,b], let m;(kf) = inf{kf(x) I x E [x;-1,x;]} and
m;(f) inf{/(x) Ix E [x;-1,x;]}. If k ~ 0, then m;(kf) k · m;(f) and
=
=
n
L(kf, [a, b], P) =
L m;(k/)(x; -
X;-1)
..
i:I
n
=
L km;(f)(x; -
X;- 1)
i=I
n
=k Em;(/)(x;
-X;- 1)
= k• L(f,[a,b],P).
i=I
Since this holds for any partition P of [a, b],
Similarly,
1
-b
-b
kf =kl f=k
1
b
f,
·
224
Chapter 4 Continuous Mappings
and so
If k
< 0, then M;(kf) = k · m;(j), and so
n
n
U(kf, [a, b], P) = L
m;(lif)(x; -
X;-1)
= L km;(f)(x; -
i=I
X;-1)
i=l
n
=k L m;(j)(x; -
X;-1)
=k • L(j, [a,b],P).
i=I
Therefore
Similarly,
and so
ii.
Let m;(f + g) = inf{ (j + g)(x) Ix E [X;- 1,x;]}. Since /(x) + g(x) ~ m;(f) +
m;(g) for all x E [X;- 1,x;], we get m;(f + g) ~ m;(/) + m;(g) and thus
n
L(j + g, [a,b],P) = Lm;(j + g)(x; - x;_i)
i=I
n
~ L(m;(f) + m;(g))(x; -
X;-1)
i=I
= L(j,[a,b],P)+L(g,[a,b],P).
Hence
Similarly,
Theorem Proofs for Chapter 4
225
Thus we have
l b lb lb
·1b
a f+ a g::;_a_<J+g)::; a (J+g)::;
Hence,
lb
(J + g)
lb
=
(J + g)
lb lb
a
lb lb
=
f +
f+
a
g.
g
and therefore
iii.
iv.
Since f(x) ::; g(x) for all x E [a, b], m;(f) ::; m;(g) and L(J, [a, b], P) ::;
L(g, [a, b], P). Thus
Given partitions P1 : a = Xo < x, < · · · < x" = b of [a, b] and P2 : b =
< Xt+i < · · · < Xn c of [b, c], we can fonn a partition P P1 U P2 of
=
Xk
=
[a,c]. Now
k
n
L(J, [a,cJ,P) = Lm;(x; i=I
X;-1)
n
= Lm;(x; -x;_ 1) + L
i=I
m;(x; -x;_i)
i=k+I
= L(J, [a, c], P1) + L(f, [a, c], P2).
Thus
{L(J, [a, c], P) I P is a partition of [a, cl}
:::> {L(f,[a,c],P1)+L(f,[a,cJ,P2) I P1
is a partition of [a,b] and P 2 is a partition of [b,c]}.
Hence
Similarly,
226
Chapter 4 Continuous Mappings
and so
i.e.,
4.8.8 Fundamental Theorem of Calculus Let f
: [a,b] -+ JR be
continuous. Then f has an antiderivative F and
lb
f(x)dx = F(b) - F(a).
If G is any other antiderivative off, we also have J: f(x) dx = G(b) -
G(a).
Proof Define F: [a,b] --+ JR by F(x) = J;J(y)dy. We claim that Fis an
antiderivative of/. Let x E ]a, b[ and let h > 0 be such that ]x, x + h[ C ]a, b[.
By 4.8.Siv,
Given c > 0, choose h such that 1/(y) - /(x)I < c for ally E ]x,x + h[; such a
choice is possible by the continuity off. Using this and
i
x+/1
f(x)
hdy=f(x),
X
we get
hdy-f(x) I= lix+h (/(y) -f(x))
h
dy I
11x+h f(y)
X
X
< 1x+h
-.
X
1/(y) -f(x)
Idy < chh = e.
h
Hence
lim F(x + h) - F(x) = f(x)
h
h-o+
(the limit through h
> 0).
227
Worked Examples for Chapter 4
Similarly,
lim F(x) - :(x - h) = f(x)
h-o-
(the limit through h
< 0).
Hence F' exists and F' (x) = J(x). From the definition of F, we get
lb
f(y)dy = F(b)- F(a).
The function F is easily seen to be continuous at a and b. Now let G be any
antiderivative of/. We shall show that G F + constant. Since G' (x) F' (x)
f(x) for all x E ]a, b[, we have (G - F)'(x) = 0 for all x E ]a, b[. By 4.7.12,
if a function has zero derivative on an interval, it is constant on that interval.
Therefore G = F + constant, and hence
=
lb
for any antiderivative G off.
=
=
f(y)dy = G(b) - G(a)
•
Worked Examples for Chapter 4
Example 4.1 Let f : A -+ IR.m be written as
f(x) = (/1 (x), . .. ,fm(X)).
Show that f is continuous iff each component f; is continuous, i = I, ... , m.
Solution Let f be continuous. If Xk -+ x in A, we must show that f;(x1c) -+
f;(x) for each i. But this is a consequence of the fact that f(x1c) -+ J(x), and a
sequence in IR.m (here, f(x1c)) converges iff its component sequences converge
(see §2.7). The same reasoning proves the converse. •
Let f : A c M -+ N be continuous. For K C A a connected
set, show that the graph off, {(x,f(x)) Ix EK}, is connected in M x N.
Example 4.2
Solution Consider the mapping g : K c M
-+ M x N defined by g(x) =
(x,/(x)). By the previous example, g is continuous. But g(K) = { (x,/(x)) I x E
K}, and the image of a connected set is connected, by Theorem 4.2.1. •
228
Chapter 4 Continuous Mappings
Example 4.3 let f : A f(xo)
~
]Rm be continuous at x0 E A, A open, and
0. Show that f is nonzero in some neighborhood of Xo,
Solution Given c: > 0, there is a neighborhood U of xo such that llf(x) f(xo)II < c: for all x E U, by the definition of continuity. For our purpose, choose
e = llf(xo)II, Then 11/(x) - f(xo)II < llf(xo)II implies that f(x) ~ 0, because
f(x) = 0 would imply the contradiction II - f(.ro)II < llf(.ro)II. Therefore, on the
neighborhood U determined by c: = 11/(xo)II, f is not zero. •
Example 4.4 Show that a linear map L : ]Rn - ]Rm is continuous.
Solution We show that for our given linear map L : ]Rn - JRnl we can find
a number M such that IIL(x)II $ Mllxll for all x E lRn. Then llx - xoll < c:/M
implies IIL(x) - L(xo)II = IIL(x - .ro)II :::;; Mllx - xoll < c:, which will prove that
L is continuous.
Let M1 = sup{IIL(e1)II, ... , IIL(en)II}, where e1, ... , en is the standard basis
for ]Rn. Letting x = (x1 , ••• , Xn), and using the triangle inequality,
IIL(x)II = llx1L(e1) + · · · + XnL(en)II $ lxd IIL(e1)1l + · · · + lxnl IIL(en)II
$ M1(lx1I + · · · + lxnl) $ M1nllxllThus we can take M = nM1, and we get our result.
•
Example 4.5 A multilinear map L from 1Rn1 x 1Rn2 x • .. x ]Rnk to Rm is
defined to be a mapping such that for each r, l $ r $ k, we have
where the a; E
continuous.
Rn;,
br E ]Rnr, and A E JR. Show that a multilinear map is
Solution Let e1, ... , en be the standard basis of ]Rn, and for x E ]Rn, let
x = (x1, ... ,.xR) = I:7=1x;e;. Define elements of lRm, a;., ... ,;k for integers ii with
I $ ij $ nj, j = 1, ... , k, by a;1, ••• ,ik = L(e;1 , ••• , e;k). We claim that
L(x1, .. . , x,J =
n1
nk
i1=l
it=I
L ···L a;
1, •••
,;kJ.• .. ·
4,
which is the analogue of writing a linear transformation in terms of a matrix.
229
Worked Examples for Chapter 4
Indeed, by multilinearity,
Repeating this k times gives the claimed result.
From this formula it is clear that L is continuous, since the functions x 1;,
through x/t are products of continuous functions and are thus continuous, and
L is a sum of these.
Another solution to this problem, which proceeds as in Worked Example
4.4, is to show that there is a constant M > 0 such that IIL(x1, · · · , Xk)II $
Mllx1 II··· llxkll, from which continuity can be deduced directly. •
Example 4.6 Show that J; x1- dx = a 3 /3 by using the fundamental theorem.
Verify this answer directly by showing that for any given e > 0 there exists a
partition P of [0, a] such that U(x2, P) - L(r, P) < e and that
inf{ U(r' P) I p is any partition )
=sup{ L(:t'' P) I p is any partition} = ~.
Solution The function F defined by F(x) = x3 /3 is an antiderivative of f(x) =
x1-, since F' (x) = x1-.
Thus by the fundamental theorem,
fa
Jo
a3
,? dx = F(a)- F(O) = 3 .
To verify our answer using the upper and lower sums, partition [O, a] into the n
subintervals [O,a/n), [a/n,2a/n], ... ,[(n- I)a/n,a]. Ifwe call this partition P,
then
U(i2,P)=t(~r~= (::) (tk2) =(::) (i)n(n+l)(2n+l)
(see Exercise 41) anq
L(r,P) =
t ((k ~
l)a) 2
~
k=I
=( ~) ( ~ ~) =( :: ) ( i) (n -
l)(n)(2n - 1) .
230
Chapter 4 Continuous Mappings
Hence
U(x2, P) = ( ~) ( 1 + ~) ( I +
;n)
and
Choosing n sufficiently large, we get U(x2, P)-L(x2, P) < c and thus inf{ U(x2, P)
P is any partition} sup{ L(x2, P) I P is any partition} a3 /3.
•
=
=
Example 4.7
a.
Formulate a definition and some basic properties of integration of complexvalued functions on an interval.
b.
Evaluate
f0" eu: dx.
Solution
a.
Any function/: [a,b]-+ C may be divided into two real-valued functions,
and/2, such that/(x) = /1 (x)+if2(x) (that is, define/1(x) as the real part of
f(x) andh(x) as the imaginary part). Then we make the natural definition
/1
ifboth/1 and.hare integrable over [a,b]. In this case/ is called integrable
over [a,b]. It is easily seen that
J:
J:
i.
cf(x) dx = c f (x) dx for any complex number c and any integrable function/;
ii.
J;(J(x) + g(x)] dx = f(x) dx + g(x) dx for integrable functions/
and g; and letting an overbar denote complex conjugation,
iii.
J;
J;
J: f(x)dx = J: f(x)dx.
With a little more effort, one can also prove
iv.
IJ:
f(x) dxl
:5
J: lf(x)I dx.
I
Exercises for Chapter 4
231
Similarly, /'(x) = f{(x) + lf;(x) can be defined, and the usual rules for
derivatives hold. We ask the reader to formulate the fundamental theorem
of calculus in Exercise 46.
b.
Since ea= cosx+isinx, we get
cosx)lo o+ 2; 2;.
•
=
i<-
=
J0
1r
eadx = Jt(cosx+isinx)dx = sinxl 0+
Exercises for Chapter 4
1.
2.
a.
Prove directly that the function 1/.r is continuous on ]O, oo[.
b.
A constant function/ : A -+ Rm is a function such that /(x) = f(y)
for all x,y EA. Show that/ is continuous.
c.
Is the function/(y) = 1/(y4+y2+ 1) continuous? Justify your answer.
a.
Prove that if/: A -+ Rm is continuous and B
continuous.
c
A, then the restriction
JIB is
3.
0
Find a function g : A -+ JR and a set B C A such that glB is
continuous but g is continuous at no point of A.
a.
If f : JR -+ JR is continuous and K C JR is connected, is 1- 1(K)
necessarily connected?
€) Show that if f :
]Rn -+ ]Rm is continuous on all of ]Rn and B c ]Rn is
bounded, then /(B) is bounded.
4.
0
6.
Discuss why it is necessary to impose the condition x ~ x 0 in the definition
of limx--+xof(x) by considering what would happen in the case/ : JR -+ JR,
f(x) = 0 if x ~ 0 and/(0) = 1, and how one would define the derivative.
Show that f : A -+ IR"' is continuous at .to iff for every c > 0 there is a
6 > 0 such that llx -xoll ::; 6 implies 11/(x)-/(xo)II ::; c. Can we replace
c > 0 or 6 > 0 by c ~ 0 or 6 ~ O?
a.
Let { ct} be a sequence in JR. Show that Ct -+ c iff every subsequence
of Ck has a further subsequence that converges to c.
b.
Let f : JR -+ JR be a bounded function. Prove that f is continuous
if and only if the graph off is a closed subset of 1R2 . What if f is
unbounded?
Chapter 4 Continuous Mappings
232
7.
Consider a compact set B C JR.n and let f : B -+ Rm be continuous and
one-to-one. Then prove that 1- 1 : /(B) -+ B is continuous. Show by
example that this may fail if B is connected but not compact. (To find a
counterexample, it is necessary to take m > 1.)
8.
Define maps s : JR.n x Rn -+ Rn and m : JR x JR.n -+ JR.n as addition and
scalar multiplication defined by s(x,y) x+y and m(A,x) AX. Show that
these mappings are continuous.
9.
Prove the following "gluing lemma": Let/: [a, b] -+ ]Rm and g: [b, c]-+
JR.m be continuous. Define h: [a, c] -+ JR.m by h =f on [a, b] and h =g on
[b, c]. If f(b) = g(b), then h is continuous. Generalize this result to sets
A, B in a metric space.
10.
Show that f : A -+ JR.m, A
f(cl(B) nA) C cl(f(B)).
11.
a.
For f : ]a, b[ -+ JR., show that if f is continuous, then its graph r
is path-connected. Argue intuitively that if the graph off is pathconnected, then f is continuous. (The latter is true, but it is a little
more difficult to prove.)
b.
For f : A -+ lR.m, A c JR.n, show that for n 2:: 2, connectedness of the
graph does not imply continuity. [Hint: For f : JR. 2 -+ JR., cut a slit
in the graph.]
c.
Discuss b for m =n = l. [Hint: On JR., consider f(x)
f(x) = sin(l/x) if x > O.]
a.
A map f : A C Rn -+ JR.m is called lipschitz on A if there is a
constant L 2:: 0 such that 11/(x)-/(y)jj s; Lllx-yll, for all x,y EA.
Show that a Lipschitz map is uniformly continuous.
b.
Find a bounded continuous function/:
continuous and hence is not Lipschitz.
c.
Is the sum (product) of two Lipschitz functions again a Lipschitz
function?
d.
Is the sum (product) of two uniformly continuous functions again
uniformly continuous?
e.
Let f be defined and have a continuous derivative on ]a - e:, b + e:[
for some e: > 0. Show thatf is a Lipschitz function on [a, b].
12.
=
c Rn,
=
is continuous iff for every set B
JR-+
c A,
= 0 if x = 0 and
JR that is not uniformly
Exercises for Chapter 4
13.
233
Let f be a bounded continuous function f : ]Rn - t JR. Prove that f (U) is
open for all open sets U c Rn iff for all nonempty open sets V C ]Rn,
'
inf f(x) <f(y) < supf(x)
xEv'
xEV
for ally EV.
14.
a.
Find a function!: JR2
-t
JR such that
lim limf(x,y)
X->0 y...... o
and
·,
lim Iimf(x,y)
y...... o
x->O
exist but are pot equal.
b.
Find a function f : '!!i2 • - t JR such that the two limits in a exist and
are equal butf is not continuous. (Hint: f(x,y) = xy/(x2 + y2) with
f = 0 at (0, O).]
c.
Find a function f : JR. 2 - t JR that is continuous on every line through
the origin but is not continuous. [Hint: Consider the function given
in polar coordinates by rtan(0/4), 0:;; r < oo, 0 S 0 < 27r.]
15.
Let Ji, ... ,fN be functions from A C ]Rn to JR. Let m; be the maximum
of/;, that is, m1 = sup(/;(A)). Let/= "E,f; and m = sup(f(A)). Show that
m S "I:, m;. Give an example where equality fails.
16.
Consider a function f : A x B - t ]Rm, where A c JR, B c ]RP. Call f
separately continuous if for each fixed x0 E A, the map g( y) = f (x0 , y)
is continuous and for Yo E B, h(x) = f(x,yo) is continuous. Say that f is
continuous on A uniformly with respect to B if for each € > 0 and x 0 E A,
there is a 8 > 0 such that llx - xoll < 6 implies 11/(x,y) - f(xo,Y)II < €
for all y E B. Show that if f is separately continuous and is c~ntinuous
on A uniformly with respect to B, then f is continuous.
17.
Demonstrate that multilinear maps on Euclidean space are not necessarily
uniformly continuous.
18.
Let A C M be connected and let f : A - t JR be continuous with f(x) f- 0
for all x E A. Show that f(x) > 0 for all x E A or else /(x) < 0 for aV
xEA.
19.
Find a continuous map f : JR.n -+ ]Rm and a closed set A c R.n such that
/(A) is not cbsed. In fact, do this whenf: R.2 ---t IR is the projection on
the x axis.
20,
Give an alternative proof of Theorem 4.4.1 using sequential compactness.
Chapter 4 Continuous Mappings
234
21.
22.
Which of the following functions on JR are uniformly continuous?
a.
f(x) =
b.
f(x)
Ij(i2 + 1).
c.
=cos3 x.
f(x) =i2 /(i2 + 2).
d.
f(x) =xsinx.
Give an alternative proof of the uniform continuity theorem using the
Balzano-Weierstrass theorem as follows. First, show that if f is not uniformly continuous, there is an c > 0 and there are sequences Xn,Yn such
I/n and p(f(xn),f(Yn)) 2:: c. Pass to convergent subsethat p(Xn,Yn)
quences and obtain a contradiction to the continuity off.
<
23.
Let X be a compact metric space and f : X ~ X an isometry; that is,
d(f(x),f(y)) = d(x,y) for all x,y EX. Show that/ is a bijection.
24.
Let/: ACM--+ N.
a.
Prove that f is uniformly continuous on A iff for every pair of sequences x1c,Yk of A such that d(x1c,Y1c)--+ 0, we have p(f(x1c).f(yk))--+
0.
b.
Let f be uniformly continuous, and let X1c be a Cauchy sequence of
A. Show that f (x1c) is a Cauchy sequence.
c.
Let f be uniformly continuous and N be complete. Show that f has
a unique extension to a continuous function on cl(A).
25.
Let f : ]0, 1[ --+ JR be differentiable and let f' (x) be bounded. Show that
limx-+o+ f(x) and lim.._ 1-/(x) exist. Do this both directly and by applying
Exercise 24c. Give a counterexample if f'(x) is not bounded. ·
26.
Let/: ]a,b]--+ JR be continuous and such thatf'(x) exists on ]a,b[ and
limx-+a+ f' (x) exists. Prove that f is uniformly continuous.
27.
Find the sum of the series I:~(3/4)".
28.
Let/: JO, I[-+ JR be uniformly continuous. Must/ be bounded?
29.
Let/: JR-+ JR satisfy lf(x) - f(y)I $
[Hint: What is f' (x)?J
30.
a.
Let/: [O, oo[--+ JR,f(x) = y'x. Prove that/ is uniformly continuous.
b.
Let k > 0 and f(x) = (x - x")j log x for O < x < 1 and /(0) = 0,
/(1) = 1 - k. Show that/: [O, lJ-+ JR is continuous. Is/ uniformly
continuous?
Ix -
yj 2 • Prove that/ is a constant.
Exercises for Chapter 4
235
31.
Let f(x) = x 1f<x-l) for x :/- 1. How should /(1) be defined to make/
continuous at x = 1?
32.
Let A C Rn be open, Xo EA, ro > 0 and B,0 = {x E Rn I !Ix - xoll $ ro}.
Suppose that B,0 c A. Prove that there is an r > ro such that B, C A.
33.
A set A C Rn is called relatively compact when cl(A) is compact. Prove
that A is relatively compact iff every sequence in A has a subsequence
that converges to a point in Rn.
34.
Assuming that the temperature on the surface of the earth is a continuous
function, prove that on any great circle of the earth there are two antipodal
points with the same temperature.
35.
Let f : JR. -+ JR. be increasing and bounded above. Prove that the limit
limx-o-- /(x) exists.
36.
Show that {(x, sin(l/x))
not path-connected.
37.
Prove the following intenriediate value theorem for derivatives: If f is
differentiable at all points of [a,b], and if/1({1) andf'(b) have opposite
signs, then there is a point Xo E ]a, b[ such that f' (Xo) = 0.
Ix> O} U ({0} x
[-1, 1)) in R 2 is connected but
•'
38.
A real-valued function defined on ]a, b[ is called convex when the following inequality holds for x,y in ]a, b[ and t in (0, 1]:
f(tx + (1 - t)y) $ tf(x) + (1 - t)/(y).
~
If/ has a continuous second derivative and/"> 0, show that/ is convex.
39.
Suppose/ is continuous on [a,b],f(a) =f(b) = 0, and x2J"(x) +4xf'(x) +
2/(x) ~ 0 for x E ]a, b[. Prove that f(x) $ 0 for x in [a, b].
40.
Calculate
d
f'
dx
dt } 0 I +x2 •
41.
Prove that
236
Chapter 4 Continuous Mappings
t
and
k2 = n(n + I~2n + I).
k=l-
These formulas were used in Worked.Example 4.6.
42.
For x > 0, define L(x) = /t(I/t)dt.
definition:
Prove the following, using this
a.
L is increasing in x.
b.
L(xy) = L(x) + L( y).
c.
L'(x) = 1/x.
d.
L(l) = 0.
e.
Properties c and d uniquely determine L. What is L?
J; f(y)
43.
Let f : JR ---t JR be continuous and set F(x) =
F'(x) = 2xf(x2). Give a more general theorem.
44.
Letf: [O, 1) ---t lll be Riemann integrable and suppose for every a, b with
f = 0.
0 S a < b S I there is a (i;, a < c < b, with f(c) = 0. Prove
Mustf be zero? What if f is oontinuous?
dy.
Prove that
Jd
45.
Prove the following second mean value theorem. Let f and g be defined
on [a, b] with g continuous, f 2:: 0, andf integrable. Then there is a point
xo E ]a, b[ such that
1b
46.
f (x)g(x) dx = g(xo)
1b
f(x) dx.
a.
For complex-valued functions on an interval, prove the fundamental
theorem of calculus.
b.
Evaluate
f0
11'
etc dx using a.
Chapter 5
Uniform Convergence
Many important functions are defined using infinite sequences or series. To study
such functions, we need to understand the concept of uniform convergence. To
deal effectively with concrete situations and examples, we use specific tests for
uniform convergence. Perhaps the most helpful such test is the Weierstrass M
test for series. Another test is the Cauchy criterion, which is mainly of theoretical
use. We also include the more specialized tests of Dirichlet and Abel.
In connection with uniform convergence, we look again-at-th~pare-of
continuous functions, which we introduced briefly in §LL In this space the
"points" or "vectors .. are functions, and convergence of a sequence- corresponds
to uniform convergence of these functions. The space is proved to be complete
in the sense that C~nchy sequences..~e. A second basic property of this
space, called the Arzela-Ascoli theorem, establishes the <;_9m_uactness ~ i n
subsets. Another important result, called the Stone:--Weiel:strass theoreJll, enables
orreto awrQ~i_rriate _2ontinuous fu_ncti.Qil.S. b~olynomia1s-oc..by..functions...fro11_1_
other_fil)_m-opriate classes. In connection with a basic iterative technique called
the c.ontraction.mapping...principle,....&ome applications to differential and integral
equations and some problems in control theory are given.
§5.1 Pointwise and Uniform Convergence
There are several different ways in which we might think about the convergence
of a sequence of functions. Perhaps the most natural is what is called pointwise
convergence. In this idea, we simply ask that for each point x in the domain,
the sequence of values fk(x) converge.
237
/
/
238
Chapter 5 Uniform Convergence
5.1.1 Definition Let N be a metric space, A a set, and fk : A -+ N, k =
1, 2, . . . . This sequence of functions fk is said to converge pointwise (or simply)
to a function f : A -+ N if for each x E A, fk(x) -+ f(x) (convergence as a
sequence in N). We often writefk-+ f (pointwise) if!k converges pointwise to f.
While this type of convergence is useful for certain purposes, there are other
situations where it is not. If one is numerically computing f using the functions
fk as approximations, it is not clear that pointwise convergence is an appropriate
notion. Here is what can go wrong: Even if the functions fk are continuous, f
need not be continuous. For example, consider Figure 5.1-1, in which
~( )
Jk
x =
{ 0,
x 2 1/k,
-kx + 1, 0 :S x < 1/k.
y
(0, I)
X
{I, 0)
FIGURE 5.1-1
Pointwise convergence
In this case, the sequence fk(x) converges for each x in the interval [0,1]. The
sequence fk(x) converges to 0 if x > 0 and to 1 if x = 0. The limit function is
thus defined by f(x) =0 if x > 0 and/(0) = I. This is not a continuous function,
even though each of the functions fk is continuous.
What is going wrong? If we focus attention on any one value of x, then for
large enough k, the values fk(x) are close to f(x). However, the k needed for any
particular degree of accuracy depends very much on x. For x near 0, very large
values of k will be needed to make fk(x) close to f(x), and there will always be
points x still closer to 0 for which fk(x) is not close to f(x). To remedy this, we
introduce a stronger notion of convergence. We ask not only that the values of/k
be close to those off, but that they be so uniformly, that is, independently of x.
,.,_
.
§5.1 Pointwise and Uniform Convergence
239
5.1.2 Definition Letfk: A - t N be a sequence offunctions with the property
that for every € > 0 there is an integer L such that k 2 L implies p(jk(x),f(x)) <
for all x E A. Here, p is the metric on N. Under these conditions, we say that
the sequence A comerges uniformly to f on A, and we write fk - f (uniformly).
€
The important point to note here is that the choice of L may depend on c but
not on x. One must be able to choose Lin such a way that the same L works
everywhere on the set A. The notion of uniform convergence thus depends very
much on the set A being considered as domain. Convergence might be uniform
on one domain but not on a larger domain. In the last example, the convergence
of the functions ft to the limit/ is uniform on any interval [a, I] with O <a< I.
Given c > 0, we need only select an integer L large enough that 1/L < a. If
k 2 L, then fk(x) = 0 = f(x) for every x in [a, 1). Thus fx - t f uniformly on
[a, I]. However, if a is very close to 0, then correspondingly large values of L
are required. If c < I /2, then it is not possible to select a single k that makes
1/k(x) - /(x)I < c for every x in [O, l] at the same time. Thus, although the
functions A do converge to f pointwise on [O, I], they do not do so uniformly.
There is a good geometric way to visualize uniform convergence. The condition p(jk(x),f(x)) < € for every x means that/k(x) is always closer than a distance
c to f(x). This is easily described in terms of the graphs of the functions. The
graph of fk must lie within an "c tube" around the graph of/ whenever k 2 L.
See Figure 5.1-2.
y
A
FIGURE 5.1-2
Uniform closeness for f: AC JR--+ JR
Perhaps another example will make the idea clearer. Consider the sequence
of functions A : JR --+ JR defined by
fk(x) = {
0, x< k,
l, x 2 k
f,
~
240
\
b;
Chapter 5 Uniform Convergence
=
=
(k I, 2, 3, ... ). Thenfk --+ 0 (pointwise), because for each x E JR,fk(x) 0 for
k large (k > x). However, A does not converge to zero unifonnly, for, no matter
how large k is, there are points x such thatfk(x) - 0 is not small.
Observe that if fk(x) --+ f (unifonnly), then fk --+ f (pointwise). This is
because for any x E A and e > 0, we have an integer L such that p(fk(x),f(x)) < e
if k ~ L, that is, fk(x) --+ f(x).
We make similar definitions for a series of functions. Here we choose N = V
a normed vector space and let gk : A --+ V, so that addition makes sense.
EZ:
5.1.3 Definition We say that the series
1 gt converges to g pointwise,
and write
gk
=
g
(pointwise),
if
the
sequence
St = I:~=I g; of partial sums
1
gk
= g (uniformly) or
gk
converges pointwise tog.Also, we say that
1
converges tog uniformly if St --+ g (uniformly). For a sequence fk (or serirs
E gt), we say that /k (or E gt) converges uniformly if there exists a function
to which it converges uniformly.
EZ:
E:
E:,
The first basic property of uniform convergence is its preservation of conti' nuity, given in the next result.
5.1.4 Proposition let A c M where M is a metric space, let ft : A
be a sequence of continuous functions, and suppose that fk
A). Then f is continuous on A.
--+
--+ N
f (uniformly on
Thus, uniform convergence is a strong enough condition to guarantee that
the limiting function of a sequence of continuous functions is continuous. In
view of the preceding examples, this should not be unreasonable.
5.1.5 Corollary If the functions gk : A
(uniformly), then g is continuous.
--+
V are continuous and
E:, gk = g
This follows by applying the proposition to the sequence of partial sums.
This result can be restated as the validity of interchanging limits and sums:
00
Jim
00
L gt(X) = L
.r-+.ro k=I
5.1.6 Example let fn : JR--+ JR
fn --+ 0 uniformly as n --+ oo.
Jim gk(x)
k=I .r-+.ro
be defined by fn(X) = (sinx)/n. Show that
§5.1 Pointwise and Uniform Convergence
241
Solution We must show that lfn(x} - 0I = IJ,,(x)I gets small independently of
x as n -+ oo. But l/n(x)I = I sinxl/n S l/n, which gets small independently of
x as n-+ oo.
•
5.1.7 Example Show that the series for sinx,
x3
x5
3!
5!
sin x = x - - + - - • • ·
converges uniformly on the interval [O, r] for any r
'
> 0.
Solution We must show that the sequence of partial sums
n
' Sn(X) =
L
(-1/x2k+I
(2k + 1)!
k=O
converges uniformly. To do this, estimate the difference:
IL (-1/ (2k + IS L (2k + l)!"
00
lsn(X) - sin xi=
x2.t+1
oo
r2k+1
1)!
k-=n+I
k=n+l
The right-hand side is independent of x and tends to 0 as n tends to oo, since it
is the tail of a convergent series. Thus, the convergence of this series is uniform.
Note that continuity of sinx follows from this; a result we know already.
•
5.1.8 Example Let fn(X) = xn,
0
S
x
S 1. Does fn converge uniformly?
·What if0 S x < 1?
'
.
Solution First we determine the limit point by point. We havy fn(0) = 0 for
alln,andf,,(x)-+ 0 if x
to
< I, butfnO) = 1 for all n. Thusf,,
f(X) = { 0,
1,
X
X
converges pointwise
~1
= 1.
It cannot converge uniformly, because this limit is not continuous (Figure 5.1-3).
If 0 S x < 1, then fn converges pointwise to zero and 0 is a continuous function.
However, the convergence is still not uniform, for given any n, xn 2:: 1/2 if xis
close enough to I {since lim_....... 1 xn = 1).
•
242
Chapter S Uniform Convergence
JI
X
x=O
x=I
FIGURE 5.1-3
The sequence.f;.(x) =xn
5.1.9 Example Consider the geometric series E:Oxk.
a.
Show that this series converges pointwise to the function g(x) = 1/(1 -x)
for x in the open interval] - 1, 1(.
b.
IJO
c.
Show that the convergence is not uniform on] - 1, l[.
<a<
1, show that the convergence is uniform on the interval [-a, a].
Solution
a.
The partial sums are given by
n
I -xn+I
Sn(x)= '""'xk= - - - ,
~
1-x
k=O
as can be checked by multiplying (1 - x)(l + x + x2 + • • • + xn). Thus,
lsn(x) - g(x)I = lxln+I /I 1'-xf. For'tixed x in the open interval ] - 1, 1(, the
denominator, 11 .:._
is fixed while the numerator, lxln+t, tends to O as n
tends to oo. More precisely, we compute that 1xln+I /11 < e: whenever
n > (log(e:I 1 - xi)/ logx) - 1.
xi,
b.
The error
xi
lxr I/11 - xi < € is smaller for -x than it is for X, an~
!__
dx
(lxln+I) =
1-
X
\ { _Q_} '
(nxn(l-x)@'(=-- .;..,._t-,:rµ(1 - x)
)
(why?). Since this is positive, we see that the error is largest on [-a, a]
at the right endpoint, x = a. If we select an N large enough to -work there,
243
§5.1 Pointwise and Uniform Convergence
it will also work everywhere on [-a,a], and the definition of uniform
convergence on [-a, a] will be met. c.
If x > 1/?, then the error, lxn+ 1 /(1 - x)I, will be larger than 2lxn+lj.
Thus, for any particular value of n, we can make the error larger than any
specified c between O and -2 by selecting x in the open interval ] 1/2, 1[
and larger than (c /2) 1/(n+l). The convergence is thus not uniform on all of
]-1,1[.
•
Let fn(x) = xn /(1 + xn) for x in the interval [O, 2). Show
that the sequence of functions /1,'2,h, ... converges pointwise on the interval
[0, 2] but that the convergence is not uniform.
5.1.10 Example
Solution
The denominator, 1 + xn, is greater than or equal to 1 for each x
in our domain, and so lfn(x)I 5 xn. If O 5 x < I, this tends to O as n -+ oo.
Therefore limn-+oo.fn(x) 0 if O 5 x < 1. If x 1, then/n(X) 1/2 for every n.
If x > 1, then xn -+ oo as n -+ oo. Since fn(x) = 1/(1 + (1/xn)), this leads to
limn-+oofn(x) = I for x > 1. Thus.fn tends pointwise on [O, 2] to the function/(x)
defined by j(x) = 0 for O 5 x < 1,/(1) 1/2, and/{x) I for I< x 5 2. Each
of the functions In is continuous on [O, 2). If the convergence were uniform,
then the limit function would also be continuous. Since f has a discontinuity at
x = l, the convergence must not be uniform.
.The first few of these functions are shown plotted in Figure 5.1-4 together
with the discontinuous limit function.
•
=
=
=
=
=
y
FIGURE 5.1-4 This sequence Ji: converges.pointwise but not uniformly
244
Chapter 5 Uniform Convergence
Exercises for §5.1
1.
Letfn(x) = (x- I/n)2, 0 $ x $ I. Doesfn converge unifonnly?
2.
Letfn(X)
3.
Let fn : IR -+ IR be unifonnly continuous and let f,, converge unifonnly to
/. Do you think that/ must be unifonnly continuous? Discuss.
4.
Letfn(x) = xn, 0 $ x $ 0.999. Doest,, converge unifonnly?
s.
Let
= x- xn, 0 $ x $ I. Does!,, converge unifonnly?
00
f(x) =
n/2
L nx(n.
1)2
0 $ x $ l.
n=I
Discuss how you might prove that/ is continuous.
§5.2 The Weierstrass M test
'
One reason there were few examples in the lasf section involving infinite series
is that it is usually difficult to study uniform convergence directly unless a
convenient formula can be found for the partial sums, as was the case with
the geometric series. In this section we present some convenient tests for the
uniform convergence of a series of functions. The first is a characterization of
those series of functions that are unifonnly convergent expressed in terms of
the functions in the series without mentioning explicitly the limit function to
which the series converges. The basic idea is to use the Cauchy condition for
the convergence of the sequence of partial sums. Recall that every convergent
sequence satisfies the Cauchy condition. These are the sequences that in some
sense "ought" to converge. If the space in which the points lie is complete, then
the limit actually exists and the sequence does converge.
5.2.1 Cauchy Criterion Let N be a metric space with metric p, and let A
be a set. Suppose that N is complete and ft : A -+ N is a sequence of functions.
Then /k converges uniformly on A if! for every E > 0 there is an N such that
l, k ~ N implies
p(/t(X),J,(x))
for all x EA.
<E
§5.2 T/Je Weierstrass M test
245
For a series, the Cauchy criterion talces the following fonn when applied to
the sequence of partial sums. The series
1 g1c converges uniformly on A ijJ
for every c > 0 there is an N such that k ~ N implies
E:
for all x E A and all integers p ~ 0. As in §5.1, for series, the g1c are assumed
to talce values in a nonned vector space V. To use 5.2.1, we require V to be
R_n or even V R is a common choice.
Using the Cauchy criterion, we can obtain the following important technique
for detennining the uniform convergence of a series.
complete. Of course, V
=
=
5.2.2 Weierstrass M test
Suppose that V is a complete normed vector
space and g1c : A -+ V are functions such that there exist constants M1:. with
llg1c(x)l1 :::; M1c for all x E A, and E:1 M1:. converges. Then E:1 g1c converges
uniformly (and absolutely).
It is not always possible to use the M test, but it is effective in the majority
of cases. For more refined tests, see the Dirichlet and Abel tests in § 5.9.
In the M test , the constants M1:. give a bound on the "rate of convergence,"
the point being that the bound is independent of x. More exactly, the tail of]
the series
1 gk, which represents the error, is bounded by that of
1 Mt,
which tends to O independently of x.
E:
E:
5.2.3 Example Show that the series
verges uniformly on R
E: gn(x) = E: (sin nx)2 / n
1
1
2
con-
lgn<x)I 5 Mn, since Isinnxl 5 1. We also
know-that EO/n2) converges since it is a p-series with p = 2 > 1. Hence, by
Theorem 5.2.2, the series converges unifonnly.
•
Solution Let Mn = l/n2 • Here,
5.2.4 Example Prove that f(x) =
Loo_,.. ( xn.n)2
~ is continuous on JR.
n=u
Solution Here we cannot choose an Mn for the nth term, because x is not
bounded. We therefore do not expect uniform convergence on all of R, but we
can prove unifonn convergence on each interval [-a,a] by letting Mn= (an/n!)2,
246
Chapter 5 Uniform Convergence
which is an upper bound for the nth term on [-a, a]. The ratio test shows that
I: Mn converges, since
converges to zero, which is less than 1. Hence, we have uniform convergence
on [-a,a], and so, by 5.1.5, we get continuity of/ on [-a,a]. Since a was
arbitrary, we get continuity on all of JR.
•
5.2.5 Example Suppose that a sequence fn(x), 0 ::; x ~ 1, converges uniformly andfn is differentiable. Mustf~(x) converge uniformly?
Solution The answer is no. In general, control on the derivatives gives
control on the functions via the mean value theorem, but not vice versa. For
example, letf,,(x) = [sin(n2x)]/n. Then/n - 0 uniformly, but/~(x) =ncos(n2 x)
does not converge even pointwise (set x = 0, for example). Here is another
example of this phenomenon. Let gn(X) = xn+I /(n + 1) on [0, 1). Since lgn(x)I ::;
1/(n + 1), gn -+ 0 uniformly. But g~(x) = xn does not converge uniformly, as
was previously shown.
•
5.2.6 Example
Suppose that ao, a 1, a2, • •• is a bounded sequence of real
numbers. Show that the series
00
~a1e 1e
~-x
k=O k!
converges to a continuous function.
Solution Fix a point x0 • We need to show convergence to a limit function
/(xo) and continuity of/ atxo. LetA = {x I lxl 5 21.xol}. Then.xo e A. The partial
sums are polynomials and so are certainly continuous on the set A. If we show
that the series converges uniformly on A, the sum will thus be continuous on A
and will certainly be so at xo. Since x0 was an arbitrary point, this will compl~te
the proof. Uniform convergence on A may be established by the Weierstrass M
test. If Bis an upper bound for the numbers la1cl, we can take M1c = B-2klxol* /k!.
With g1c(x) = (a1c/ k!)x 1e, we have lg1e(x)I ~ M1e for all x in A. The ratio test shows
that :E:iM1c converges, since M1c+i/M1c = 2lxol/(k + 1) and this tends to Oas k
tends to infinity. The Weierstrass test applies and shows that our series converges
uniformly on A as we wanted.
•
247
§5.3 Integration and Differentiation of Series
Exercises for § 5.2
1.
Discuss the convergence and unifonn convergence of
=xn /(n + xn), x ~ 0, n =l, 2, ... .
a.
fn(X)
b.
fn(X) = e-x2 /njn, XE IR, n = l, 2, ... .
1::
/n 2, 0
2.
Discuss the uniform convergence of
3.
Prove that/(x) =
4.
Discuss the uniform convergence of
5.
If
1 an is absolutely convergent, prove that
unifonnly convergent.
"E: xn /n
1
2
1 xn
:::;
x:::; I.
is continuous on [O, l].
"E:, 1/(r + n
"E:
2 ).
"E: an sin nx
1
is
§5.3 Integration and Differentiation of Series
We saw in the last two sections that infinite series can often be shown to represent continuous functions. If we want to see how they can be used in the
study of calculus, we will need to know how they behave with respect to differentiation and integration. If a sequence of functions / 1.f2 ,/3, ... converges
to a limit function f, under what circumstances does f{,f~,J;, ... converge to
J' or do the in!egrals Jf 1, JJi, J/3, ... converge to JJ? The idea of unifo1m
convergence supplies a tool that is often useful. The result for integrals is fairly
straightforward.
5.3.1 Theorem Suppose that/1,h,h, ... are integrable functions on a closed
bounded interval. [a, b ], and that they converge uniformly to a limit function f
on [a, b]. Then f is integrable on [a, b] and
lim
n-oo
lb
a
fn(X) dx =
lb
j(x) dx.
a
The idea is that for large n, the values fn(X) are all unifonnly close to f(x).
Thus, any Riemann sum for f is close to the corresponding Riemann sum for fnThe finite length of the interval is important at this step. Since fn is integrable,
· its Riemann sums are all close together if the mesh of the partitions is smaR
enough, and so those for f are also close together. In particular, the upper and
lower sums for f are close together. With appropriate details, this shows that f
Chapter 5 Uniform Convergence
248
is integrable. The assertion that the limit of the integrals is equal to the integral
of the limits then follows from the observation that
lb
fn(X) dx -
lb
J(x) dx ::;
lb
l.fn(x) - J(x)I dx.
Selecting n large enough that l.fn(x) - f(x)I < £ for all x E [a, b] makes this
smaller than (b - a)t: and gives us our assertion about limits. Again, the finite
length of the interval is important. The details of the proof are supplied at the
end of this chapter.
If we apply this result to the partial sums of an infinite series of integrable
functions that converges uniformly on a closed bounded interval [a, b], we find
that we can interchange the order of integration and summation.
5.3.2 Corollary Suppose that the junctions gk : [a, b] -+ JR are Riemann
integrable and
1 gk converges uniformly on [a, b]. Then we may interchange
the order of integration and summation:
E:
The corollary follows from Theorem 5.3.1 applied to the sequence of partial
sums.
Intuitively, the theorem should be fairly clear, because if /k is very close to f,
then its integral (the area under the curve) should be close to that off. But be
careful here. Indeed, this result may be false if /k converges only pointwise!
(See Example S.3.S.)
Note. There is a theorem with a wider scope than 5.3.1, called Lebesgue's
dominated convergence theorem. One version of this result (due to Ascoli)
states that if ft converges pointwise to f and the fi are uniformly bounded
(that is, l.fi.(x)I ::; M for all k = I, 2, ... and all x E [a,b]), then the conclusion
of Theorem 5.3.1 remains valid. We shall be content for most of this book
with the more elementary form of the result in Theorem 5.3.1.
Can we take the same liberties with derivatives? The answer to the question
of term-by-term differentiation of a uniformly convergent sequence or series is
often no, as we saw in Example 5.2.5. This result is a good illustration of the
sort of care that is often needed to tum an intuitively plausible statement into one
249
§5.3 lntegraJion and Differentiation of Series
of actual fact. Thus we need more assumptions than just uniform convergence.
Sufficient conditions are given in the following theorem.
5.3.3 Theorem Let /k : ]a, b[ -+ IR be a sequence of differentiable functions
on the open interval ]a, b[ converging pointwise to f : ]a, b[ -+ R Suppose that
the derivatives J; are continuous and converge unifonnly to a function g. Then
f is differentiable andf' = g.
5.3.4 Corollary If the g1; are differentiable, the gk are continuous,
converges pointwise, and
1 g~ converges uniformly, then
I::
I:: 8k
1
As usual, the corollary follows by applying the theorem to the sequence of
partial sums.
5.3.5 Example Give an example of a sequence ft : [O, 1) -+ IR that converges to zero pointwise, but for which
J:J,, dx does not converge to zero.
Solution Letfk have the graph in Figure 5.3-1. Thus,.fic is such that
1; f1c dx =
l for all k = l, 2, 3, .... Furthermore, for each x, f1c(x) -+ 0 as k -+ oo (clearly, if
x = 0, and if x > 0, thenft(x) = 0 as soon ask> 1/x). Thus/1:-+ 0 pointwise,
but ft dx = l for all k.
•
1;
5.3.6 Example Let gn(X) = ru2/(1 + nx2), -1 ~ x ~ 1, n = 1,2,3, ....
Examine the conclusions of Theorem 5.3.3 in this case,
Solution As
n grows, the quantities fn(x) tend pointwise, but not uniformly,
to the function /(x) defined on [-1, I] by f(x) = 1 if x ":/: 0 and /(0) = 0.
The limit function f is not even continuous, much less differentiable, at x =
0. Something must be going wrong with the derivatives. These are given by
t;(x) = 2nx/(1 + nx2)2. These do converge pointwise to O on [-1, 11, but not
uniformly. For n ~ I, we compute thatf~(l/n) ~ 1/2. This example shows that
Chapter 5 Uniform Convergence
250
y
.0, 2k) -
area= 1
...._____--I~---
=1
X
FIGURE 5.3-1
X
This sequence converges to zero pointwise, but the correspond~ng sequence of integrals does not converge to zero
some assumption such as uniform convergence of the derivatives is important
in Theorem 5.3.3. Pointwise convergence is not enough.
•
5.3.7 Example Verify that
J; et dt
= ex - 1, using ex =
EC: x" /n!
and
Corollary 5.3.2.
Solution By the Weierstrass M test , ex =
EC:
xn /n! converges uniformly
on any finite interval. Thus, by Corollary 5.3.2 applied to the interval [0,x],
r
lo
et dt
=
f lor : _ =f
~ Ix = _:_ + x2 + ...
dt
n=O
n!
n=O
(n + 1)!
= ex - I.
0
1!
2!
•
5.3.8 Example
E:
a.
Sum the series
b.
Sum 1 -1/2+ 1/3 -1/4+ • • •.
1 xn /n,
lxl <
I.
Solution
a.
E:
We know that
0 xn = 1/(1 - x), since it is a geometric series. The
convergence is unifonn on [-1 + c:, 1 - c:] for any c: > 0, by the M test.
§5.3 Integration and Differentiation of Series
Thus we may integrate tenn by tenn from O to x:
n+l
00
n
00
L~
= - log(l - x);
n+
L xn
i.e.,
n;()
= - log(l - x).
n=I
This is valid pointwise for all x in ] - 1, 1[, since c is arbitrary.
b.
It is actually valid at x = -1 as well. To see this, recall that
N
1-xN+I
1 _ _xN+I
~xn
_
____
L...,
1-x
-
-1-x
1-x·
n;()
Thus,
L 1x t"dt= Ln+l
- = -log(l-x)- 1x -1-t
dt
N
N
_,..
n~
and hence
0
xn+I
tN+l
O
n;()
It::~
+ log(l -
Letting x = -1, and using 1/(1- t)
t
n=O
(-l)n+I +log2
n+ 1
x)I
= 11x :~It dt,.
:S 1 for -1 :St< 0, we get
:SI
r-1
Jo
tN+I
dtl = _1_
N+2
which tends to O as N tends to oo. Thus,
(-1)"
L -n= 100
log 2 = -
1
1
1
2+ 3 - 4+ · ··· •
n=I
5.3.9 Example
a.
Expandf(x) = 1/(1 +x2) in a geometric series.
b.
Integrate the series in a to prove
c.
Justify the formula of Euler:
1r
1
3
1
5
1
7
-=1--+---+···.
4
Chapter 5 Uniform Convergence
252
Solution
a.
We expand 1/(1 +.x2) as a geometric series:
·_x2
'.x2 2
.x2 3
I
I
l+x2 =1-(-x2)=l+(- )+(- ).+(-) +···
= 1 - .x2 + x4
-
x6 + • •· ,
which is valid if I-x2 I < 1, that is, if lxl < I. The convergence is uniform
on [-1 +t:, 1- t:] for any t: > 0 by the M test.
b.
Integrating from zero to x gives
r ...!!!_
=
I +1
X -
2
} 0
x3 + x5 -
X1
+ ....
3
7
'
5
but we know that the integral of 1/(1 + r2) is tan- 1 1, and so
tan -1 X = X -
c.
x3
3
+ x5
- - x1
- +··•
5
7
for
IXI < 1.
If we set x = 1 and use tan- 1 1 = rr /4, we get Euler's formula:
rr
1 1 1
4=l-3+5-7+·••;
but this is not quite justified, since the series for tan- 1 xis valid only for
lxl < 1. (It is plausible, though, since 1 - ½ +
½ + · · · ,· being an
alternating series, converges.) To justify Euler's formula, we may use the
finite form of the geometric series expansion:
!-
1
,2 4
1 "r"
f+l jbt+2
1+12=1- +t+···+(-)
+(-1
1+12·
Integrating from O to 1, we have
rr
I I
(-1)"
- = tan- 1 1 = 1 - - + - - • • • + - - + (-1)"+ 1
4
3 5
2n+l
11
0
jln+2
--
1+t2
dt.
We will be finished if we can show that the last tenn goes to zero as
oo. We have
n-
0$
11
o
jbt+2
2
+I
-1
d1 $
11 r2"•
O
2 dt
= -2
1
n+ 3
-+
0 as n -+ oo.
•
253
Exercises for §5.3
5.3.10 Example Suppose that ao,a1,a2,a3, ... is a bounded sequence of
real numbers. Show that the series
converges on JR to a differentiable function f(x), and ·that
00
k
/ '(X) = ~
w llt+I
k! X •
k=O
Solution We saw in Example 5.2.6 that the series "£':o(a1c/k!)xk converges
on lR to a continuous function/(x) by considering the functions gk(x) = (ak/k!)x"
on intervals of finite length. But g~(x)
so
00
,00
~
w
k=O
1
gk(~)
,
~
=(ka1c/k!)x"- 1 =(a,J(k -
Gk
=L.JI (k _
l)!)xk-t , and
00
l)!x
k-1
k=
~ ak+I
le
=L.J
k!x .
k=O
The argument used in Example 5.2.6 may be applied again to show that this
series converges uniformly on any interval of finite length to a function g(x).
Corollary 5.3.4 applies and shows that/ is differentiable and that/'(x) = g(x). •
Exercises for §5.3
1.
Investigate the validity of Theorem 5.3.1 for the sequence fn defined by
nx
fn(X) = -1 - -2 ,
+nx
0 $ X $ 1.
2.
Show that the sequence {/n} defined by fn(x) = n3xn(I - x) converges
pointwise to f = 0 oo [O, 1], and then use Theorem 5.3.1 to show that the
convergei:ice is not uniform.
3.
Investigate the validity of Theorems 5.3.1 and 5.3.3 forfn(X) = y,ixn(l-x)
on [O, l]. [Hint: Locate the maximum of fn(x).]
4.
Verify that
J; sin t dt = 1 -
cosx, using
oo
_x2n+I
sinx = ~(-It (2n+ 1)!"
Chapter S Uniform Convergence
254
5.
Verify that sin' x = cosx, using series.
6.
Express the sum of the series
E: xn / n
1
2
as an integral. What if x = - I?
§5.4 The Elementary Functions
A few of the most important functions of mathematical analysis are often referred
to as "elementary functions." These include such functions as:
1.
Exponential functions, such as ex, 2X, and 10'".
2.
The corresponding logarithms.
3.
Power functions, such as x2,x13117 , and, more mysteriously, x"; and closely
related functions such as polynomials and rational functions.
4.
Trigonometric functions, such as sinx and cosx.
The calculus of these functions is usually developed in a beginning course in
calculus and need not be completely repeated here-in fact, we have assumed
that you know something about them already. We will, however, point out some
of the places where there are potential difficulties and show how the ideas from
the last few sections may be applied in this context.
Exponential Functions• and Logarithms
Students encounter exponentiation early in their mathematical experience when
they learn to abbreviate the repeated product of n copies of a number b as bn. It
is apparent that these positive integer exponents satisfy the fundamental property
bn. bk= bn+k_
Extension of this property quickly leads to useful definitions for exponentiation
with negative integer exponents and with rational exponents:
b-n =
_!_
b''
and bp/q = f!/bP = (!J'b)P.
There is no problem with any of these, at least when b is positive, except for
possible doubt about the existence of qth roots of positive reaJ- numbers b. We
solved that problem in Chapter 1 with the study of the completeness of JR-we
can define the qth root of b to be the least upper bound of the set of all real
numbers whose qth power is smaller than b.
255
§5.4 The Elementary Functions
Another subtle problem arises when we consider irrational exponents. What
should be the meaning of 2"? We would like to have a well-behaved function
of x, denoted by 2x, such that 2Pfq = ( ¥1,)P if x happens to be a rational number
p/q. One way to attack this problem would.be to observe that Q is dense in R,
so that any real number such as 1r can be approximated by rational numbers.
We could select a sequence r1, r2, r3, ... in Q with rn -+ 1r as n -+ oo. We know
what meaning we want for 2'1 , 2'2 , 2'3, ... , and we want 2x to be a continuous
function of x, and so we should have 2" = limn-oo 2'n. This is a good idea and
not an unreasonable way to proceed for a numerical approximation of 2,.. It
presents difficulties as a basis for a definition:
1.
Why does the limit exist?
2.
Why is the limit independent of the choice of a sequence of rationals
approximating rr?
3.
If we do the same thing for arbitrary real x, how does one establish differentiability and so forth for the resulting function 2x?
4.
How does one establish the "laws of exponents" such as 2x2Y = 2x+y?
These problems are not insuperable, but they are not trivial, either.
The attack on these questions most commonly used in a calculus course
begins with the logarithm instead of the exponential function. The natural logarithm is introduced by means of the fundamental theorem of calculus as an
antiderivative for the reciprocal function:
ln(x) =
i
I
x
l
- dt for x > 0.
I
This automatically makes the natural logarithm a differentiable function of x on
the open half line x > 0 with derivative equal to 1/ x. The fundamental properties
of the logarithm.follow readily. The exponential function is then introduced as
the inverse of the logarithm. Although perhaps not very intuitive, this approach
has a number of advantages. Exercises 2 through 7 at the end of this section
guide the reader through some of this development.
We present here an approach that starts with the more familiar exponential
function, but defines it in tenns of an infinite series. The reader is probably
familiar with the expansion of the elementary functions in power series called
Taylor series or Maclaurin series. In particular, a familiar fact is that the exponential function can be expressed for all real x as
x Loo -x
I
e =
k=O
k!
k
1 2
1 3 1 4
=l+x+-x-+-x +-x +···
2
3!
4!
.
Chapter 5 Uniform Convergence
256
Our plan is to use this series to define a function, which we then show to behave
in the way the exponential should.
5.4.1 Definition For every number x, set exp(x) =
f
k1,xt.
k=O
•
From 5.3.10, this series converges uniformly and absolutely on any bounded
subset of JR (or <C) and defines a function differentiable on all of JR. We can
use the facts that we know about series to show that the fundamental laws of
exponentials are obeyed. This follows from the absolute convergence of the
series, since we can rearrange the terms of the product of two series to bring
together the terms with like powers:
(1+x+~ +~ +···) (1+y+y; +y; +--·)
x2 + 2xy + y2 x3 + 3x2y + 3xy2 + y3
= I +(x+v)+-----+-------+···
·
2
6
(x+y)2 (x+y)3
= l+(x+y)+-2-+-6-+···.
Writing this out precisely with summation notation and using the binomial formula for expanding (x + yt yields the "law of exponents."
5.4.2 Proposition exp(x) · exp(y) = exp(x + y) for all numbers x and y.
Proof
••W)exp(y);
(t ;/) (t F,;)
L
00
=
n=O
=
11
·)
~ k!j!xkyJ
00
(
k+,FII
Loo(nl
L k! (n -1k)!xk~-k)
l-=O
n=O
00
~
=~
(
n.I
k n-1: )
1~
n
n! {;;' k!(n - k)!x y
L 1n.l (x + yl = exp(x + y).
oo
=
n=O
From this starting point, much else follows quickly.
257
§5.4 The Elementary Functions
5.4.3 Proposition exp(x) is a differentiable function on IR with
i.
exp(0) = 1.
ii.
d
dx (exp(x)) = exp(x).
iii.
exp(x) > 0 for all x in R
iv.
exp(- x) = 1/ exp(x).
v.
exp(x) is strictly increasing everywhere on IR.
vi.
exp(x)
-t
+oo as x
vii.
exp(x)
-t
0 as x - t -oo.
-t
+oo.
viii. exp maps JR one-to-one onto JR+ = {x E JR : x > 0}.
Differentiability is given by Proposition 5.3.10. The first assertion is obvious
from the series definition of exp(x). The second follows easily, since term-byterm differentiation is allowed by 5.3.10:
~(exp(x)) = ~
dx
dx
(1
+x+
= 0 + I +x +
!x1+ ..!..x3 + ..!..x4 + .. •)
2
3!
4!
ir
+ ~x3 + 2-x4 + · ..
3!
4!
5!
,x +··· =exp(x).
12
l _3 1
=l+x+ 2x-+ 31 x- + 4
4
From 5.4.2 we know that exp(-x) • exp(x) = exp(-x +x) = exp(0) = 1.
Thus, exp(-x) = 1/ exp(x). In particular, exp(x) is never equal to 0. Since
exp(0) = 1, we must have exp(x) > 0 for all real x. Otherwise, the continuous
function exp woti'ld have a real root, by the intermediate value theorem. Since
exp'(x) = exp(x) > 0 everywhere, we conclude that exp(x) is strictly increasing
on R This gives exp(t) > 1 for every t > 0. Using the fundamental theorem of
calculus, we can compute that, for x > 0:
exp(x) = exp(0) +
fox exp'(t)dt > I+ fox I dt = l +x.
This shows that exp(x) - t +oo as x - t +oo. Since exp(-x) = 1/ exp(x), we
also get exp(x) - t 0 as x - t -oo. The combination of v, vi, and vii with
the intermediate value theorem shows that exp maps JR one-to-one onto JR+.
Continuity and the intermediate value theorem are used to obtain "onto."
Chapter 5 Uniform Convergence
258
Propositions 5.4.2 and 5.4.3 indicate that the function exp acts a lot like
exponentiation. We now show that it really is. First, we define the special
number e:
00
1
1 1
e exp(l) ~ k! 1 + 1 + 2 + 3 ! + • • •.
=
=
=
Using logarithms, introduced after the next heading, and l'Hopital's rule, it can
be shown that e = limn-ooO + (1/n)Y,. This supplies a link to such subjects as
"continuously" compounded interest. From either of these characterizations or
others, the value of e can be computed to any desired degree of accuracy. The
first few decimal places are
e ~ 2.71828 18284 59045 23536 02874 71352 66249 ...
It may not be too surprising that e is irrational, but more is true. ·rt is not even
algebraic. An algebraic number is a number that is the root of a polynomial with
rational coefficients. Numbers that are not algebraic are called transcendental.
That e is transcendental was shown by Charles Hermite in 1873. (The same
conclusion for 1r came in 1882 from C. L. F. Lindemann.)
The function exp(x) acts just the way e raised to the "power" x should.
First, suppose that n is a positive integer. Repeated application of 5.4.2 gives
exp(n) exp(l +I+•••+ 1) exp(l) • exp(l) • • • exp(l) (exp(l))n en. (The
reader should be able to supply a proof by induction.) For negative integers,
we invoke 5.4.3iv to conclude that exp(-n) I/ exp(n) I/en. This is exactly
what we are accustomed to writing as e-n. Similarly,
=
=
=
=
=
=
(exp(l/n)t = exp(l/n) · exp(l/n) · · · exp(l/n)
=exp((l/n) + 0/n) + · · • + (1/n)) =exp(l) =e.
Thus, exp(l/n) is an nth root for e. We are accustomed to writing this as
exp(l/n) = fie= e1fn. Putting this all together, we see that for integers p
and q with q ~ 0, exp(p/q) = ( fe)P, exactly the number we are accustomed to
writing as ef'/q_
5.4.4. Proposition
exp(x) is a differentiable function on JR. such that
exp(x) = ex for x E Q.
The function exp(x) thus serves as a quite reasonable definition for what ex should mean for any real x (or, in fact, for any complex number). We now have
a good definition for e,r, but our original challenge was 2,r. Before attacking this
·
problem, we turn our attention to logarithms.
259
§5.4 The Elementary Functions
Natural Logarithms
The exponential function exp(x) = ex has been defined to produce a differentiable
function from JR one-to:.one onto JR+. It is strictly increasing, with a derivative
that is strictly positive at every real x. In particular, the derivative is never 0.
By the single-variable inverse function theorem (4.7.15), there is a differentiable
inverse function taldng JR+ one-to-one onto JR. This function is called the natural
logarithm function and is denoted lnx or logx. That it is an inverse for the
exponential means that
· 1og(e.r)
= x for every real x
and
exp(logx) = x
for every x
> 0.
Computation with the chain rule yields the derivative:
d
d
dx (exp(log x)) = dx (x),
i.e.,
d
exp(logx) d.x (logx) = 1.
i.e.,
d.x(logx)
Thus,
d
x d.x(logx)
=l,
d
=x·1
Thus, the natural logarithm is an antiderivative for the reciprocal function. Since
log(l) = 0, the fundamental theorem of calculus gives
logx =
f
I
.r
l
- dt for x
t
> 0,
the formula frequently taken as the starting point for the theory in a calculus
course, as mentioned previously.
· Other Bases
With the natural logarithm available, we can tum our attention to our original
challenge problem. What is 2-rr? Ifwe look ahead for a moment and assume that
it and the natural logarithm both behave as we hope they will, we can perform a
computation that will indicate what the proper definition should be. We would
like to have log(2-rr) 1r log 2, so that 2-rr exp(log(2")) exp(1r log 2). Since
exp and log are known functions, this last expression supplies the basis for a
definition. We can try defining 21r as exp(1r log 2) or, more generally, 2.r as
exp(x log 2) for any x. We will show that this works as desired not only for 2
but for any positive number b as base to give a reasonable definition for ~
=
=
=
Chapter 5 Uniform Convergence
260
5.4.5 Definition
Let b > 0. For any number'x, define
expb(x) = exp(x log b).
Applying the properties we already have for exp and the chain rule yields
the following basic properties of expb:
5.4.6 Proposition If b > 0, then expb(x) is a differentiable function on IR
and
i.
expb(x + y) = expb(x) · expb(y) for all numbers x and y.
ii.
expb(O) = 1.
iii.
dx (expb(x)) = log(b) · expb(x) for all real x.
iv.
expb(x)
v.
expb(-x) = 1/ expb(x).
d
.
> 0 for all real x.
=
=
If b 1, then logb 0. For O < b < 1, we have logb < 0, while if b > 1,
then logb > 0. Since expb(t) > 0 for every t and exp~(x) = log(b)-expb(xlogb),
this gives us all the information we need about the sign of exp~(x).
5.4.7 Proposition
i.
If b =1, then expb(x) =1 for every x E JR.
ii.
If b > 1, then
iii.
a.
expix) is strictly increasing for every x E R
b.
expb(x) --+ +oo as x --+ +oo.
c.
expb(x) --+ 0 as x --+ -oo.
/f O < b
< 1, then
a.
expb(x) is strictly decreasing for every x E R
b.
expb(x) --+ 0 as x
c.
expix)--+ +oo as x--+ -oo .
--+
+oo.
In both cases ii and iii, the function expb(x) maps IR one-to-one onto JR+.
§5.4 The Elementary Functions
261
The computation showing that expb(x) agrees with our usual notion of Ir for
rational x proceeds just as it did for exp(x).
5.4.8 Proposition expb(x) is a continuous and differentiable function on JR
such that expb(x) = Ir for all x E Q.
Thus expb(x) serves as a quite reasonable definition for Ir for all numbers x.
Of course, since expb(x) maps JR one-to-one onto JR+ (at least for b :j: 1), with
derivative never equal to 0, there is a differentiable inverse function,
taking JR+ one-to-one onto JR, and
expb(log/x)) = x
for every x
> 0.
We are now in a position to write down the basic properties of exponentials
and logarithms. First, we use the base e to compute log(/r) = log(exp(xlogb)) =
xlogb. Thus, (fr)Y =exp(ylog(/r)) = exp(yxlogb) =exp(xylogb) =frY. This
is the last of the principal algebraic properties of exponentiation.
5.4.9 Proposition If b > 0 and x and y are in JR, then
i.
b0 = 1.
ii.
Ir
iv.
(fr)Y
> 0 for every x in JR.
= frY.-
The corresponding properties of logarithms are also now available:
5,4.10 Proposition
If b > 0 and b :j: I, then
ii.
logb(xy) = logb(x) + logb(y) for all positive x and y.
iii.
logb(x') = t logb(x) for x
> 0 and t E JR.
Chapter 5 Uniform Convergence
262
=Ir and y =b' and compute
logb(xy) =logb(b b =logbW+') = s + t = logbW) + logb(b')
To obtain ii, we let x
5 1)
= logb(x) + logb(y).
To obtain iii, we let x = b5 and compute
The calculus of these. functions is summarized in the next proposition.
5.4.11 Proposition
i.
!!_(ex)= ex; Jex dx =ex+ C.
ii.
d
dt(logx)
= 1/x;
iii.
1
iv.
d
dx(logbx) = 1/(xlogb).
dx
(bx) =
~ log b;
ix
1
(1/t)dt = logx, x
J~
> 0.
dx = bx/ log b + C.
Power Functions
By a "power function," we mean a function of the form/(x) = xP for a constant
p. The only power functions we have really used so far in the development of
theory are those with p an integer, although we have discussed roots to some
extent, which correspond to rational values of p. We now have a good definition
of xP for any p, at least if x is positive:
xP =exp(plogx).
We know that this satisfies most of the correct properties for powers of positive
numbers:
xPx =xp+r
(xPY =xpr
= 1/xP
XO = I if X f 0.
x-P
263
§5.4 The Elementary Functions
A remaining property, xPyP = (xy)P, is easily checked:
(xy)P = exp(p logxy) = exp(p(logx +logy))= exp(p logx + p logy)
= exp(p log x) • exp(p logy) = xPyP.
The familiar rule for derivatives of power functions is readily obtained from
the chain rule:
d
d
.
d
dx(xP) = dx(exp(plogx)) = exp'(plogx) • dx(plogx)
= exp(p logx) •p/x = (xP). p/x = pxP- 1.
Of course, this works in full generality only for x > 0. However, we can
always set QP = 0 for every p > 0 and
~p/q
= efxi
for every real x if p and q are integers and p is even.
Trigonometric Functions
The expansions of the trigonometric functions sine and cosine are probably
familiar to the reader:
x3 x5
x7
x2
x6
sinx = x - - + - - - + • · ·
3! 5!
7!
and
cosx = 1 - -
2!
z::
x4
+ - - - + ···
4!
6!
.
Each of these series is of the form
0(ak/k!)xk, where each of the coefficients
ak is either -1, 0,- or 1. They thus fit the pattern of Example 5.3.10. Thus, if
we define functions s(x) and c(x) by the series
oo
s(x)
=x- _!_x3 + _!_x5 - I.x7 + ... = ~
3!
5!
1
1
= I - -:x!+ -x4 2!
4!
x2k+I
L..,, (2k + 1)!
7!
k=O
and
c(x)
k
(- I)
1
6!
-x6 + ...
Ii
=~
--:?
L..,, (2k)!
'
00
(-
k=O
each of s(x) and c(x) converges uniformly and absolutely on any bounded set
of numbers and represents a differentiable function on all of JR. Term-by-term
Chapter 5 Uniform Convergence
264
differentiation is valid for any real x. Carrying out this differentiation shows
that
s'(x) = c(x) and c'(x) = -s(x) for all real x.
From the series and from these derivative relations, we can show that s(x) and
c(x) satisfy the same key identities as do sine and cosine. For example, let h(x) =
s(x)2 + c(x)2. Then h(x) is differentiable and h'(x) 2s(x)c(x) - 2c(x)s(x) 0
for every x. Thus h(x) is constant. Since c(O) = 1 and s(O) = 0, we conclude
that h(x) = 1 for every x. That is,
=
s(x)2 + c(x)2 = 1
=
for every real x.
From the series, it is also apparent that s(x) is an odd function of x and that c(x)
is even:
s(-x) = -s(x) and c(-x) = c(x) for all x.
The formulas for the sine and cosine of a sum can also be recovered by direct
manipulation of the series in much the same way that we obtained the fundamental formula for the exponential function, exp(x + y) = exp(x) • exp(y). However,
we can also obtain them indirectly by noting an important link between the exponential and the trigonometric functions. This relationship depends on the fact
that one has uniform and absolute convergence of the series 'E~(ak/k!)xk on
any bounded subset of complex numbers provided that the complex coefficients
a1c stay bounded. The proof follows just as it does for real numbers, as do the
manipulations leading to the formula exp(x + y) = exp(x) •exp(y). If we compute
exp(ix) for real x, we find that
I?
•
(X)
1
eve= ""'-(ixl
~
k=O k'•
. x2 ix3x"ir x6
= l+1x- - - - + - + - - - - ···
2
=(l-
3
4
5
6
~ + ~ - · · ·) + i (x- ~ + ~ -
· · ·)
= c(x) + is(x).
The law of exponents assures us that
c(x + y) + is(x + y)
ei<x+y)
= eixeiY, and so we can compute
= [c(x) + is(x)] · [c(y) + is(y)]
= [c(x)c(y) -
s(x)s(y)]
+ i[s(x)c(y) + c{x)s(y)].
Comparing the real and imaginary parts, we obtain
c(x + y)
= c(x)c(y) - s(x)s(y) and s(x + y) = s(x)c(y) + c(x)s(y).
265
§5.4 The Elementary Functions
The functions defined by the series are thus differentiable functions on all of
R that satisfy the same fundamental identities as do the trigonometric functions.
It seems almost inescapable that they should be the same functions. In fact,
they are. Forging the link is not quite so straightforward as it was in the case
of the exponential. Our previous knowledge of the exponential was essentially
algebraic: e"/q = ~- Confinning that this was the same as exp(p/q) followed
fairly readily by algebraic manipulation of the series. The usual definitions of
sine and cosine, however, are based firmly in the geometry of angles or in the
geometry of arc length along a circle. If 0 is the length of an arc along the
circle of radius I centered at (0, 0) from (1, 0) to a point (a, /3) on the circle,
then a = cos(0) and /3 = sin~0). See 'Figure 5.4.1.
·
a
FIGURE 5.4• l
The sine and cosine of 0 are a and
/3
In a beginning calculus course, the basic limits
. sin 0 1
11m
and
-- =
0-0
0
. cos0 - 1 0
11
m---=
0-+O
0
are usually proved geometrically. The formulas for the sine and cosine of the
sum of two angles are assumed to be known and are used with these two limits
to show that sine and cosine are differentiable and that
:U (sin 0) = cos 0
and
:U (cos 0) = - sin 0.
This process can be reversed if we use the formula for the length of the graph
of a differentiable function,
Length =
1b J1 +
[/'(x)]2 dx.
266
Chapter 5 Uniform Convergence
The first quadrant of the circle
functions in both directions:
y
i1- + f = 1 can be written
=f(x) =vT="x2
and
as the graph of such
=g(y) =v'i""=?.
x
This leads to
i- =1"' Jt +
8
or
8=~ -
2
[f'(x)]2 dx
[°
l
Jo v'f=x2
and
8
=1/J Jt + [g'(x)]2 dy
dx and () =
rr l dy.
Jo JI-y
3
2
The inverse functions 8 = arccos a and 8 = arcsin /3 are thus differentiable
functions of o and /3, and the fundamental theorem o( calculus yields
d()
do = -
1
JI -
and
d() =
d/3
o2
The inverse function theorem says that
() with
1
~-
o and /3 are differentiable functions of
~; = - ~ and
'do= J1 ~ /3
2•
Recalling that o 2 + /3 2 = 1, we have
do = -/3 and d/3 = a
d0
d()
or
:0 cos0 = - sin8
:0 sin0 = cos 8.
and
The pairs of functions
x(t) = c(t)
y(t) = s(t)
= cost
y(t) = sint
x(t)
both satisfy the system of first-order equations
x'(t)
= -y(t)
y'(t) = x(t)
with the same set of initial conditions
x(O)
= 1 and y(O) = 0.
A basic result from the theory of differential equations that we prove in §7.S
shows that the solution to such a system is unique. We conclude that c(t) is the
same as cost and s(t) is the same as sin t.
Exercises for §5.4
267
We did not gain as much from the series approach to the trigonometric
functions as we did from the exponential. However, by allowing the argument
to take on complex values, we have discovered a beautiful link between the two:
eb: = cos x + i sin x, and in the other direction,
eb: +e-ix
·
cosx
=
.
smx
and
2
el"' - e-b:
=
Zi
.
If we let x = 1r, we find a remarkable equation linking five of the most important
numbers in mathematics: i1r + 1 = 0.
5.4.12 Example
Verify that
J; cos f} df} = sinx using series.
Solution We know that the series
00
(-ll'
cos fJ = ~ (Zk)! 92k
converges unifonnly on any bounded subset of R. If we choose b > lxl, then we
have unifonn convergence on [-b, b], so we can integrate term-by-tenn between
0 and x:
tcosfJdfJ=
lo
t(f (-Il921:) dfJ=f ( lot<-ll°9itde)
lo
1:=0
(2k)!
~ (-I}"
~2k+l
= L.., (2k + l)!x-
t=O
•
= smx.
(2k)!
•
k=O
Exercises for §5.4
1.
Verify that
J; sin t dt = I -
cos x, using the series expansion.
Exercises 2 through 7 ask the reader to carry through some aspects of the development of exponentials and logarithms from the starting point of defining the
natural logarithm as an antiderivative for the reciprocal function.
2.
Let/(x) = KO/t)dt for x > 0. Show that/ is a differentiable function on
JR+= {x ER l x > O} with/(])= 0 and/'(x) = 1/x for all x > 0.
3.
Show that if a
Exercise 2).
> 0 and
b
> 0, then /(ab) =/(a)+ f(b)
(/ is defined in
Chapter S Unifonn Convergence
268
4.
Show that
a.
f is strictly increasing on R• (f is defined in Exercise 2).
b.
= +oo.
lim.x-o/(x) = -oo.
lim.x-oof(x)
c.
5.
Show that/ has a differentiable inverse function g = 1- 1 talcing IR one-toone onto JR.+ and that g' (x) = g(x) for all real x (f is defined in Exercise
2).
6.
Show that g(x + y) = g(x) • g(y) for all real numbers x and y (g is defined
in Exercise 5).
'
·
7.
Let e g(l), and show that g(p/q)
p and q in Z.
8.
The "error function" is defined by
=
=eP/q for every rational number p/q,
erf(x) = -1-
./2-i
a.
1"' e-
12 dt .
12
0
Show that erf(x) can be represented by a power series
valid for all x, and compute ao,a 1,a2,a3,a4, and as.
b.
Use your main result from part a to estimate the value of
(This integral gives the probability that a measurement taken at random from a normally distributed population lies within one standard
deviation of the mean of 0.)
§5.5 The Space of Continuous Functions
Fix a metric space M, a subset A c M, and a normed vector space N. Consider
the set V of all functions f : A -+ N. Then V is easily seen to be a vector space.
In V, the zero vector is the function that is O for all x E A, and addition and
269
§5.5 The Space of Continuous Functions
=
scalar multiplication are defined by (J +g)(x) f(x) + g(x) and (>./)(x)
for each l ·c. 'JR,f,g E V. Let
~,
=>.(J(x))
.. •
C = {J EV If is continuous} .
•
If there is danger of confusion, we write C(A, N) to indicate that C depends
on our choice of A and N. Then C is also a vector space, since the sum of two
continuous functions is continuous and a scalar multiple of a continuous function
is continuous.
~
Let Cb be the vector subspace of C consisting of bounded functions: Cb =
{J E C If is bounded}. Recall that "f is bounded" means that there is a constant
M such that 11/(x)II $ M for all x EA. If A is compact, then Cb= C, by Theorem
4.4.1 applied to the real-valued function x 1--+ 11/(x)IIFor f E Cb, let IIii I = sup{ IIJ(x)I I I x E A}, which exists, since f is bounded.
The number 1111 I is a measure of the size off and is called the norm of J. See
Figure 5.5-1. Note that 11111 $ M iff IIJ(x)II $ M for all x E A.
y
y
A
FIGURE 5.5-1
The norm of a function picks out the largest absolute value
off(x)
What we are trying to do here is to look at the space Cb in the same way as
we look at '/Rn. Namely, each point in Cb (which is a function) has a norm, and
so we can hope that many of the concepts developed for vectors in '/Rn will carry
over to Cb, Such a point of view is useful in doing analysis, and some important
results can be proved by using some of our techniques on the space Cb, For this
program to be successful, tHe first task is to establish that Cb is .a normed space.
,.
l' •
270
Chapter 5 Uniform Convergence
Note. Although we have a norm, we do not have an inner product associated
with it such that 11/112 = (/,/}. Other spaces of functions that we study in
Fourier analysis (Chapter 10) do have such an inner product.
If N is only a metric space, we still get a metric on Cb, namely,
d(f,g) = sup{p(f(x),g(x))
Ix EA}.
Most of what follows holds in this context: When a vector space structure on
N is needed, we will explicitly say so.
5.5.1 Theorem
i.
If (M,d} and (N, p) are metric spaces, then so is Cb(A,N); that is, d(j,g)
satisfies
ii.
a.
d(f,g) ~ O and d(f,g)
b.
d(f,g) = d(g,f).
c.
d(f, g)
~
=O if/ f =g.
d(f, h) + d(h, g).
If N is a normed space, then so is Cb(A, N); that is,
a.
b.
c.
II •II S{!tis.fies
11111 ~ o and II/II =o if! I= o.
Ila/II= lal II/II for a ER,/ E Cb.
Iii+ gll ~ 11111 + llgll (triangle inequality).
· These are the basic rules we need to talk about open sets, convergence, and
so forth. For example, write /1c -. f in Cb iff II/le - /II - t 0. The connection with
uniform convergence is simple.
5.5.2 Theorem (fi:
-t
f (uniformly on A))
<=}
(f1c -. f in Cb).
Remember that a sequence ft is called a Cauchy sequence if for each t:: > 0
there is an N such that k,l ~ N implies ll/1c - Jill < t::. Recall that a space
is called complete if every Cauchy sequence converges. Another name for
a complete normed space is a Banach space. Completeness is an important
technical property for a space, since often we may be able to prove that a
sequence satisfies the Cauchy criterion and we want to deduce that it converges
to some element of the space.
§5.5 The Space of Continuous Functions
271
S.S.3 Theorem If N is a complete metric space, then Cb(A, N) is a complete
metric space. If N is a Banach space, so is Cb(A, N).
This result is really a rephrasing of two basic results:
1.
A unifonnly Cauchy sequence of functions into a complete space converges unifonnly to something.
2.
A uniform limit of continuous functions is continuous.
The space Cb is only one of a host of spaces of functions of great importance
in analysis. While both Cb and Rn are complete normed spaces, they are quite
different in other respects. For instance, as we have mentioned, Cb does not
have an inner product that gives the norm II · 11 (Exercises 12 and 30 at the
end of Chapter 1). Another is that Cb is not finite-dimensional. In the following
sections, we shall see some specific problems to which this theory can be applied.
5.5.4 Example Let B = {f E C([0, l],R) lf(x) > 0/or'all x E [0, I]}. Show
that Bis an open set in C([0, l],R).
E B we must find an r: > 0 such that D(/, r:) =
{g E C I II/ - KIi < r:} C B. Since [O, I] is compact,/ has a minimum valuesay, m-at some point of [0, 1). Thus, /(x) ~ m > 0 for all x E [0, 1). Let
r: = m/2. If II/ - KIi < e, then for any x, 11/(x) - g(x)II < e =m/2. Hence,
g(x) ~ m/2 > 0, and so g E B.
•
Solution By definition, for/
S.5.5 Example What is the closure of the set B in Example 5.5.4?
Solution We assert that the closure is D = {/ E C I f(x) ~ 0 for all x E
[O, 11}. This is a closed set because iffn(x) ~ 0 and.fn -+ / unifonnly, and hence
pointwise, then /(x) > 0 for all x. To show that D is the closure, it suffices to
show that for f E D there is an fn E B such that .fn -+ f (why?). Simply let
fn=J+l/n.
•
5.5.6 Example Consider a sequence fn E Cb such that ll.fn+1 - fnll ~ Tn,
where I: rn is convergent, Tn
~
0. Prove that fn con·verges.
272
Chapter 5 Uniform Convergence
Solution By the triangle inequality,
11.fn - fn+kll $
llfn - .fn+dl + 11.fn+I - fn+2II + · · · + llfn+t-1
$ Tn + Tn+I + · · " + Tn+k•
- fn+kll
Since I: r1 is convergent, this expression tends to O as n - t oo, since it is less
than or equal to s - Sn-1 where Sn is the nth partial sum and s is the sum. Hence,
f,, is a Cauchy sequence, and so it converges. •
Exercises for § 5.5
> 0 for all x E IR}. Is B open? If not, what
1.
Let B = {f E Cb(IR,IR) lf(x)
is int(B)?
2.
What is the closure of B in Exercise 1?
3.
Do you see a connection between Example 5.5.6 and the Weierstrass M
test? Discuss.
4.
Let
I nx
fn(X)=--,
n 1 +nx
Show thatfn
5.
-t
0$x$1.
O in C([O, l],IR).
Let/I: be a convergent sequence in Cb(A, Rm). Prove that {fk I k = I, 2, ... }
is bounded in Cb(A, IR"'). Is it closed?
§5.6 The Arzela-Ascoli Theorem
Analogous to the Heine-Borel theorem concerning compact subsets of Rn, this
theorem gives conditions for a set in Cb to be compact. Recall that in Rn a set
is compact if and only if it is closed and bounded, but that this criterion need
not hold in other metric spaces. In particular, it does not hold in Cb. What the
Arzela-Ascoli theorem does is come as close to this as possible with the intent
of providing conditions for compactness that can be verified in examples. We
begin with some terminology.
5.6.1 Definition let B C C(A, N). We say that l3 is an equicontinuous set
of functions if for any c: > 0 there is a 6 > 0 such that x,y E A and d(x,y) < 6
implies d(J(x),f(y)) < c: for all f E /3.
273
§5.6 The Arzela-Ascoli Theorem
This definition is the same as that of uniform continuity, except that now we
also demand that 6 can be chosen independent of/ as well as .xo.
Let Bx = U(x) I / E B} for fixed x. This is the set of all values of the
functions in B at the point x E A. We say that B is pointwise compact if and
only if Bx is compact in N for each x EA.
5.6.2 Arzela-Ascoli Theorem Let Ac M be compact and B c C(A,N).
Then B is compact if and only if B is closed, equicontinuous, and pointwise
compact.
The proof strategy is based on the Balzano-Weierstrass property. The main
point is to show that if each in is in B, then the sequence in has a uniformly
convergent subsequence. Since B is pointwise compact, we can do this at each
x E A. The idea now is to use equicontinuity and compactness of A to make this
uniform. A clever diagonal selection process makes this feasible.
For instance, the theorem is often used as follows:
Let A c M be compact and N = Rm. Assume B C C(A, R_m)
is equicontinuous and pointwise bounded. Then every sequence in B has a
uniformly convergent subsequence.
5.6.3 Corollary
,,I
Here one uses the fact that any bounded set in Rm lies in a compact set (a
ball, for instance). A related result is this: If A c Mis compact and N = R.11 ,
'then B C C(A, R_m) is compact iff it is bounded, closed, and equicontinuous. We
leave the proof of this to the reader.
---+ R be continuous and be such that l/11 (x)I ::;
lOOfor every n and for allx in [O, I] and the derivatives!~ exist and are uniformly
bounded on ]0, l[. Prove that in has a uniformly convergent subsequence.
5.6.4 Example Let / 11 : [O, I]
Solution We verify that the set U11 } is equicontinuous and bounded. The
hypothesis is that lf~(x)I ~ M for a constant M. By the mean value theorem,
lfn(X) - fn(Y)I
:=; Mix - YI,
and so, given c:, we can choose Ii= c:/M, independent of x,y, and n. Thus Un}
is equicontinuous. It is bounded because IIJ,,11 = SUPo:s;x::;i lJ,,(x)I ::; 100. •
Chapter S Uniform Convergence
274
5.6.5 Example ls the result of Example 5.6.4 valid if we omit the condition
that IJ,.(x)I is bounded?
Solution No, for let.fn(x) = n. Then/~= 0, but clearly there is no convergent
subsequence. •
To exploit compactness and combine the Arzela-Ascoli theorem with results
about continuous functions from Chapter 4, we need a supply of continuous
functions on C. The next example provides a start.
5.6.6 Example Let I
: C([O, I]), IR) -+ JR be defined by I(/) =
Id f(x)dx.
Prove that I is continuous.
Solution We must show that.fn -+fin C implies l(Jn)-+ I(/). But this is a
consequence of Theorem 5.3.1. •
Exercises for §5.6
1.
Show that in Example 5.6.4, In bounded can be replaced by fn(O) = 0 to
give the same conclusion.
2.
In 5.6.3, need the whole sequence be convergent?
3.
a.
Show that the following set is open:
b.
Show that Cb is closed in the space of all bounded functions on a
set A.
4.
Let B C C([O, I}, IR) be closed, bounded, and equicontinuous. Let I: B-+
R be defined by l(J) = f(x) dx. Show that there is an /o-·E B at which
the value of / is maximized.
Id
§5.7 The Contraction Mapping Principle and Its Applications
5.
275
Let the functions fn : [a, b] -+ IR. be uniformly bounded continuous functions. Set
Fn(X) =
lx
fn(t)dt,
a $ X :5 b.
Prove that Fn has a uniformly convergent subsequence.
§5.7 The Contraction Mapping Principle and
Its Applications
This section deals with an important iterative technique in analysis called the
contraction mapping principle. It proves existence and uniqueness results and,
in addition, gives a specific iterative procedure for locating the solution. We
begin with the general method.
5.7.1 Contradion Mapping Principle Let M be a complete metric
space and <I> : M --+ M a given mapping. Assume that there is a constant k,
0 :5 k < 1, such that d(<l>(x),<t>(y)) $ kd(x,y)for all x,y EM. Then there is a
unique fixed point for <I>, i.e., a point x. E M such that <l>(x.) = x •. In fact, if xo
is any point in M and we define x, <l>(XQ), X2 <l>(x1), •• ,, Xn+I <l>(xn), ••• ,
then
Jim Xn =x•.
=
=
=
n-oo
Intuitively, <I> is shrinking distances, and so as <I> iterates, points bunch up.
The clustering of this bunching is centered at x.; see Figure 5.7-1.
Note. There are other famous fixed-point theorems that have a more topological flavor. For instance, the Brouwer fixed-point theorem (in a special case)
says that any continuous map/: D-+ D, where D = {x E Rn I llxll ~ I} is the
disk, has a fixed point. Certainly such f's need not be contractions. If n = I,
this is proved by the intermediate value theorem (see 4.5.5 and Exercise 3 of
§4.5). In this result there can be many fixed points.
A related result is that any vector field on the 2-sphere has a zero somewhere; or "at every instant, there is somewhere on the earth that the wind is
not·blowing."
.
•
Our more analytical contraction mapping theorem has some everyday interpretations, too: "Take a map of the city in which you are present and put
it on a horizontal table. Then there is exactly one point on the map which
corresponds to the point on the table directly beneath it."
216
Chapter S Uniform Convergence
FIGURE 5.7-1
A contraction shrinks distances between points
As our first application, we study the existence of solutions of differential
equations. We give a specific context, although the technique is fairly general.
We shall generalize it in §7.5.
Consider a continuous functionf(t,x) defined in a neighborhood of (to,xo) E
JR 2• Assume that the following Li,pschill. condition holds:
for all (t,xi) and (t,x2) in a neighborhood of (to,Xo). If f is differentiable in x
and (8f / 8x)(t, x) is continuous, then the condition is automatic (by the mean
value theorem).
•
5.7.2 Theorem Under the above assumptions, there is a 6 > 0 such that
the equation
dx
dt =f(t,x),
has a unique C1 solution x
tp1(t) = f(t, tp(t)).
(I)
x(to)=xo
=tp(t), with tp(to) =x ,for to 0
6
< t < to+ 6,
i.e.,
The technique of proof is based on the following observation. Solving (I)
is equivalent to finding a continuous function 'P on ]to - 6, to+ 6[ such that
tp(t) = Xo +
1'
to
f(s, <p(s)) ds.
(2)
§5.7 The Contraction Mapping Principle and Its Applications
277
Equation (2) is a fixed-point equation for the map
<l> : 'f'
xo +
1-+
1
1
f(s, cp(s)) ds
lo
taldng functions in (a suitable subset ot) C to C. The idea is to apply the
contraction mapping principle to <l>.
We have, as a consequence of the method used in the contraction mapping
theorem, a specific iterative method for solving (1):
'f'n+I
= <l>(cpn);
i.e.,
ffin+I
(t)
=xo +
1
1
/(s, i.pn(s)) ds and cpo
= X'Q.
to
Then 'f'n converges to the desired solution (if ]t - tol is sufficiently small).
The second application is to integral equations. We begin wit~ what are
called Fredholm equati~ns. These integral equations are of the form:
/(x)
= >.1b K(x,y) f(y) dy + cp(x)
(3)
where K and 'f' are given functions and >. is a given constant. Assume that K
and 'f' are continuous and that for all x,y satisfying a ~ x ~ b, a $ y $ b, we
have
(4)
IK(x,y)I ~ M.
5.7.3 Proposition If >.Mlb -
al< I, then the Fredholm equation (3) has a
unique solution.
The essential idea in the proof is very simple.
C([a,b],IR) be defined by
Let <l> : C([a,b],IR) -
(<l>(/))(x) =>.lb K(x,y)/(y)dy+<{)(X).
The map <l> is a contraction, since
d(<l>(Ji), <l>(Ji)) = s~p I>.
lb
K(x,y)(/1 (y) - fi(y))dyl
~ >.lb -
alMd(/1,/2)
and >.lb - alM < I, by assumption. Thus, (3) has a unique solution by the
contraction mapping theorem.
Volterra integral equations are of the form
f(x) = >.
lr
K(x,y) /(y)dy + cp(x~.
(5)
278
Chapter 5 Uniform Convergence
5.7.4 Proposition Assuming that K is continuous on [a,b] x [a,b], the
Volterra integral equation has a unique solution J(x) for any A.
The idea is the same as before, but now we get a contraction for <f>N =
<I> o <I> o • • • o <I> only if N is large; this compensates for the fact that we did not
assume A was small.
5.7.5 Example Give an example of a complete metric space X and a map
<I> : X
--+
X with d(<l>(x), <l>(y)) $ d(x,y) but with <I> not having a unique fixed
point.
Solution Let X =IR with the usual distance d(x,y) =lx-yl. Let <l>(x) = x+ I.
Clearly, there is no x such that x = x + 1. But l<I>(x) - <l>(y)I = Ix - YI- •
This example shows that in the contraction mapping theorem, it is essential
to have k < l; k = 1 will not do.
S.7 .6 Example Show that the method of successive approximations applied
to the differential equation f' (x) = f(x) with f(0) = 1 leads to the usual formula
for ex.
Solution The equation is equivalent tof(x) = 1 + J;f(y)dy. Begin with the
zero function 0. Since <l>(g) = 1 + J; g(y)dy, we get
<l>(0)(x) = 1;
<1>2 (0)(x)
= <1>(<1>(0)) =1 +
1x
dy
=1 + x;
r
x2
x3
<1>(<1>3(0))(x) =_I+ lor (1 + y+ ~2) dy = l +x+ x2
2 + 3 !;
<1>(<1>2 (0))(x) = 1 + lo (1 + y) ~y = 1 + x+ 2 ;
This sequence clearly converges to ex.
•
§5.7 The ConJraction Mapping Principk and Its Applications
279
5.7.7 Example Find a function f such that f'(x) = xf(x) for x near 0-and
/(0) = 3.
Solution We take for our starting point the simplest possible function satisfying the initial condition, / 0 (x) = 3, and iterate the Volterra operator
(T/)(x) = 3 +
1x
tf(t) dt
in search of a function/ such that Tf = f. We iterate to produce the following
sequence:
/o(X) = 3
r t3 dt =3 + 23x2
r t (3 + 23 t2 ) dt =3 + 23x2 + (2)(34)x
/2(x) = (T/1)(x) =3 + Jo
ft (x) =(T/o)(x) =3 + Jo
4
/3(x)
= (Tfi)(x) = 3 +
3x2
fo\
3
(3 + ~t2 + (2~ 4)
3
4
t4) dr
6
= 3 + 2 + (2)(4)x + (2)(4)(6/ ·
Continuing this process, we see that
3,,
3
3
4
3
6
fn(x) = 3 + 2r + (2)(4/ + (2)(4)(6/ + · · · + (2)(4) • · · (2n)
3
n
=L
n
(2)(4) ... (2k)x2k = 3
k--0
=3
_x2n
1
L (2k)(k!)x2k
k=O
L k!1 (x2)"
2
n
" k=O
These are the partial sums of the infinite series
/(x) = 3
L k!I (x2)"
2 '
oo
k=O
which we recognize as having an infinite radius of convergence and as converging to the sum
/(x) = 3ex2/ 2 •
Direct computation of the derivative using the chain rule confirms that this is
in fact a' solution to our problem. (Of course, we could also have found thi's by
separation of variables.) •
Chapter 5 Uniform Convergence
280
5.7.8 Example Consider the integral equation
f(x) =a+
1z
f(y)xe-xy dy.
Check directly on which intervals (0, r] we get a contraction.
Solution Here 4>(f(x)) =a+ J;J(y)xe-xy dy, and so the constant for 4> is
k = sup
zE[O,r)
r IK(x,y)I dy = sup Jor xe-xy dy
Jo
= sup (1 -
zE[O.r)
e-x ) = I - e-r < I.
2
2
xE[O,r)
Thus, we get a unique solution on any interval [O, r].
•
5.7.9 Example We consider a "closed-loop feedback system" denoted
schematically this way:
r
+o
-t
•
c = F(e)
F
e
I•
Here, r,c,c are functions oft E [a,b]. We choose M = C([a,b], IR) and
assume that F : M - M is a contraction. The basic equations are c = r - c; i.e.,
c + F(c) = r. Prove the existence and uniqueness of a solution c te this system
for given r.
Note. This system can be thought of as follows. A signal (such as an
electrical impulse) renters the system, joined by a return signal c to produce
c = r- c. A "'black box" F modifies c to F(c); the signal c is then fed back to
the incoming signal r.
Solution Note that for two solutions ct and c:2,
ct + F(c:i) = t:2
+ F(c:2)
§5.7 Tlze Contraction Mapping Principle arnl Its Applications
281
and therefore
Since Fis a contraction, we have a contradiction unless c 1 = c 2 • Hence, if there
is a solution c, it is unique.
We now prove the existence of a solution for a given "input" r. Let
Gr(c) = r - F(c).
We want
c
= G,(1:).
We claim that G, is a contraction. Indeed,
Since k
system.
<
1, Gr has a fixed point c, the unique solution of the closed-loop
•
S.7.10 Example A Control Problem Consider the system dx/dt =
f(x) + p, x(O) = Xo, where f : [O, I] -+ JR and p E JR, with f fixed. Suppose that
the solution with p = 0 is x(r). Let y be close to x(l). Can we choose p so that
the trajectory x(t) has x(O) Xo, x(l) y? That is, can we control the solution
to end up at y with a suitable choice of p? See Figure 5.7-2. Set up this problem
a~ a fixed point problem.
=
y
x (1)
=
,,,,.-
,--
..,,/
FIGURE 5.7-2 Trying to steer a trajectory to the pointy
Chapter 5 Uniform Convergence
282
Solution We shall show that this is possible. For p = 0, we have
=xo +
fo f(x(s))ds
x(l) =xo +
fo f(i(s))ds.
x(t)
1
and therefore
We want p such that y = xo +
satisfies dx/dt = f + p. Thus,
J; f(x(s))ds + p, where x(s), which depends on p,
p = y- Xo x(t)
1
1
1
/(x(s))ds,
}
(6)
=.ro+ fo J(x(s))ds+pt.
1
Let <I>: JR. x C([O, 1]) -+ JR. x C([O, 1]) be defined by <I>: (p,x)-+ right-hand
side of (6). If we arrange for <I> to be a contraction, the fixed point for <I> will
be the solution of the control problem.
•
Exercises for §5.7
1.
For what a is <l>(x) = ax a contraction on JR.?
2.
Find a series expression on [O, ½1 for the solution of Equation (1) if
K(x,y) x and a I.
3.
For what intervals [O, r], r ~ 1, is f : [O, r]
4.
Show that the system of ~uations
=
=
1
1
-+
[O, r], x....,.
r, a contraction?
2
= -X1 - -X2 + -X3 +3
4
4
15
1
1
1
x2 = -x, + -x2 + -x3 - I
4
5
2
1
1
1
X3 = --X1 + -X2 - -X3 + 2
4
3
3
X1
has a unique solution, using the contraction mapping principle. [Hint:
Either choose a clever nonn on JR.3, or estimate using the Schwarz inequality.]
§5.8 The Stone-Weierstrass Theorem
283
5.
Convert dy/dx = 3.xy, y(O) = 1, to an integral equation and set up an
iteration scheme to solve it.
6.
Set up an iteration scheme to solve dx/dt
7.
Consider dx/dt = x2, x(0) = 1. Solve this explicitly to see that 6 in 5.7.2
is finite.
8.
Let M be a compact metric space and <I> : M -+ M be such that d(<l>(x), <l>(y))
< d(x,y) for all x,y E M, x 'F y.
a.
b.
=tx2+x3, x(0) =1. Take 6 < 1/5.
Show that <I> has a unique fixed point. [Hint: Minimize d(<l>(x),x).]
Show that a is false if Mis not compact (find a counterexample).
§5.8 The Stone-Weierstrass Theorem
In the study of continuous functions and uniform convergence, three of the most
basic results are the Arzela-Ascoli theorem, the contraction mapping principle,
discussed in §5.7, and the Stone-Weierstrass theorem, which will be discussed
here.
The aim of the Stone-Weierstrass theorem is to show that any continuous
function can be unifonnly approximated by a function that has more easily
managed properties, such as a polynomial. Such polynomial approximation
techniques are important theoretically and in numerical work.
We begin by giving the result for the special case of the unit interval. This
was first proved by Weierstrass, but here we present a version due to Sergei
Bernstein (1880-1968). 1
S.8.1 Theorem
Let/: [0, 1)-+ lR be continuous and let€> 0. Then there
is a polynomial p(x) such that IIP - ill < € . In fact, the sequence of Bernstein
p<lljnomials
converges uniformly to fas n-+ oo, where
(;) = n!/k!(n - k)!
denotes the binomial coefficient.
1 Bernstein was a Russian mathematician who was very influential in analysis and in probability
theory. The proof is in the paper, Dlmonsiration du 1Mor~me de Weierstrass, Jondle sur le ca/cul
des probabilitls, Commun. Soc. Math. Kharkov (2), 13 (1912-13). 1-2.
Chapter 5 Uniform Convergence
284
The first statement here is a consequence of the second. The second can
be easily understood if one knows a little probability theory, which is assumed
only for the following paragraph of discussion. Needless to say, Bernstein's
knowledge of probability theory undoubtedly helped him with the understanding
and the proof of this theorem.
Imagine a "coin" with probability x of getting heads and, consequently, with
probability 1 - x of getting tails. In n tosses, the probability of getting exactly
k heads is
Suppose that in a gambling game called "n-tosses," /(k/ n) dollars is paid out
when exactly k heads tum up when n tosses are made. The average amount
(after a long evening of playing "n-tosses") paid out when n tosses are made is
Here f(k/ n) is f at the fraction of tosses that are heads. Now imagine n to be
very large. Then we expect that in a typical game of n-tosses, k/n will be very
close to x = probability of k heads (= fraction of the time k heads occurs), so
that our average payout should be very close to f{x). Hence, when n is large,
we expect Pn(x) to be close to J(x). This is the intuitive reason for the validity
of the result. The actual proof is a little complicated, as might be expected from
the complexity of the game.
Even for simple f such as J(x) = ,Ix, finding an approximating polynomial
is not trivial.
We can rephrase the theorem as follows. Let P denote the set of all polynomials p : [0, l] -+ IR. Then the first statement of the theorem asserts that P is
dense in C([O, I], IR); that is, cl(P) = C([O, I], IR).
In 1937, Marshall Stone discovered a very useful generalization of this theorem by allowing for more general sets than [O, l] and by replacing P with a
general family of functions satisfying certain properties. The proof makes use
of the special case here. The theorem is useful in various branches of analysis
(for example, we shall use it in our study of Fourier analysis).
5.8.2 Stone-Weierstrass Theorem let M be a metric space, A c M a
compact set, and let 13 C C(A, IR) satisfy the following:
i.
13 is an algebra; if f, g E /3 and a E IR, then f + g E /3, f · g ·e 13, and
afE 13.
ii.
The constant function x ._. l lies in B.
285
§5.8 The Stone- Weierstrass Theorem
iii.
l3 separates points,· that is, for x,y EA, x ':/- y, there is anf E l3 such that
f(x) ':/- f(y).
Then B is dense in C(A, JR); that is, cl(B) = C(A, JR).
5.8.3 Example let Pn be a uniformly convergent sequence of polynomials
on [0, l] and f = lim,, ..... 00 Pn• Must f be differentiable?
Solution No; by 5.8.1, any continuous function is a limit of polynomials,
and there are plenty of continuous functions that are not differentiable, such as
_ { 0,
f (x) 2x - I,
0$
I /2
X
$ 1/2,
:S x S I. •
5.8.4 Example Prove (a) directly from 5.8.1 and (b) from 5.8.2, that the
polynomials on [a,b] are dense in C([a,bJ, JR).
Solution
a.
By 5.8.1, if/ : [0, l]
polynomial p with Ill
Rescaling, set
-+
JR is continuous and
- PII <
£.
f(x) = g(x(b - a)+ a),
so that/ : [0, 1]
-+
£
Let g : [a,b]
0
:S x
> 0, then there is a
-+
JR be continuous.
$ 1,
JR. Let
q(x)=p(:=:)'
a$ x Sb,
so that q: [a,b] -+ JR. Thus, q is a polynomial as well. We claim that
Ilg - qll < £. Indeed,
lg(x) - q(x)I =
and so Ilg are dense.
b.
jt (; ~ :) - =:)I•
P (;
qll < £, since Ill - PII < c. Thus, the polynomials on
[a, b]
In Theorem 5.8.2, we let A= [a,b] and set B = {q e C([a,b], JR) I q is
a polynomial}. Then B clearly satisfies i and ii. It also satisfies iii, for if
x ':/- y, we can let
f(t) = t
so that f(x)
# f( y). Thus B
is dense, by Theorem 5.8.2.
•
286
Chapter 5 Uniform Convergence
Exercises for §5.8
1.
Show that there is a polynomial p(x) such that Jp(x) - sinxJ < 1/100 for
,
0 :5 X :5 27l".
2.
Suppose that Pn is a sequence of polynomials converging uniformly to f
on [O, 1] andf is not a polynomial. Prove that the degrees of the Pn are not
bounded. [Hint: An Nth-degree polynomial p is uniquely determined by
its values at N + 1 points xo, ... , XN via Lagrange's interpolation formula
N
~
p(x)
p(x;)
= 6 7r;(x)--:--(
·)'
j=(}
71', x,
where 1r;(x) = (x -xo)(x -xi)··· (x - XN)/(x -x;).]
3.
Prove that the set of polynomials in C([a, b], JR) is not open. Can a subset
of a metric space ever be both open and dense?
4.
Consider the set of all polynomials p(x, y) in two variables (x, y) E [0, l] x
(0, l]. Prove that this set is dense in C([O, 1] x [0, 1], JR).
5.
Consider the set of all functions on [0, l] of the form
n
h(x) = Lajebix,
where aj, bj E IR.
j=I
Is this set dense in C([O, 1], JR)?
6.
E JR2 I x2 +y2 = 1}, then a functionf: T---t JR may be
thought of as a 21r-periodic function from JR to JR by first identifying a
number 0 with the point (cos 0, sin 0) on T
If T
= {(x,y)
0
---t
(cos0,sin 0)
L f(0).
Similarly, any 27r-periodic function on JR may be thought of as a function
on T. A finite sum of sines and cosines of the form
1+
N
L(ak cos(k0) + bk sin(k0))
k=I
is called a trigonometric polynomial (N is arbitrary but finite).
Show that if f is a continuous 21r-periodic function from JR into JR and
c > 0, then there is a trigonometric polynomial q such that Jq(0)-f(0)1 < c
for every 0 in JR. (The number of terms, N, may depend on f as well as
on c.)
§5.9 The Dirichlet and Abel Tests
287
§5.9 The Dirichlet and Abel Tests
In some cases in which we would like to determine if we have unifonn convergence, the Weierstrass M test fails. For such instances, mathematicians have
devised other tests. The first test to be shown in this section was created by the
Norwegian mathematician Niels Abel, and the second is credited to P. G. Dirichlet, a German (of French origin); both of these mathematicians worked in the
first part of the nineteenth century. These tests are useful in many examples,
and are especially useful during the study of Fourier and power series. They are
important when we have uniform convergence but not absolute convergence.
5.9.1 Abel's Test Let A C IR"' and '-Pn : A -+ IR be a decreasing sequence
of functions; that is, 1Pn+1Cx) $ 1Pn(x) for each x EA. Suppose that there is a
constant M such that l'Pn(x)I $ M for all x EA and all n. [f 'E,';,1fn(x) converges
uniformly on A, then so does
1 Cf'n(Xlfn(X).
r,;:
We get useful tests for ordinary series when we take the special case in
which 'Pn and fn are constant functions. One has a similar test if the 'Pn are
increasing, which can be deduced by applying Abel's test to -'-Pn• A related test
is the Dirichlet test.
5.9.2 Dirichlet's Test Let sn(x) = '£,7= 1/;(x)for a sequencefn : A C JR."' -+
IR. Assume that there is a constant M such that lsn(x)I $ M for all x E A and
all n. Let gn : A C R"' -+ IR be such that gn -+ 0 uniformly, gn ~ 0, and
gn+I (x) $ gn(x). Then '£,';,ifn(x)gn(X) converges uniformly on A.
For example, consider the alternating series '£,(- l)n Cn(x), where 8n 2: 0,
8n(X) -+ 0 uniformly, and 8n+I $ gn. Let fn(x) = (-tr. Then· lsn(x)I $ I, so
that '£,(-lrgn(x) converges uniformly. Note that, as a special case of Theorem
5.9.2, an alternating seri~s whose tenns decrease to zero is convergent.
Notice the similarities and the differences between these theorems. The
conditions on 1Pn in Abel's test do not imply that 1Pn converges uniformly. Also,
we do not require 1Pn ~ 0. The proofs of these theorems use a device known as
Abel's partial summation formula.
5.9 .3 Example Show that '£,~ (sin nx)/ n converges uniformly on [6, 271' -6],
where O < 6 < 271'.
288
Chapter 5 Uniform Convergence
Solution We apply Dirichlet's test with fn(x) = sinnx and
gn(x) = 1/n. The
only hypothesis that is not obvious is IE;..,fi(x)I $ M. To show this requires a
special technique: Use the trigonometric identity
2sin(Lt)sin (~x) = cos [
(1 - ~) x] - cos [ (1 + ~) x]
and add from l = 1 to I = n. We get a collapsing sum, so that
I
12sin (½x) (sinx+ · · · + sinnx)I = cos (½x) - cos
(n+ ½x) 15 2.
Thus
lsIDx+•·•+sinnxl,;
_\
1sm 2x
I"
which gives a bound on E7= 1.fi(x). The bound is good as long as sin(x/2) is
bounded away from zero. For example, on [6,211" - 6], we get such a bound.
Note that the arguments needed here are somewhat more delicate than the M
test.
•
5.9.4 Example Show that
(-l)R
L -ne-nx converges uniformly on [0, oo[.
oo
n=I
<pn(x) = e-nx. For x ~ 0, <pn
is decreasing and le-nxl ~ 1 (why?). We know already that the alternating
series E';(- l)R /n converges, and so, by Abel's theorem, the series converges
•
uniformly.
Solution This time apply Abel's test. Let
5.9.5 ;Example Let f(x) =
(-l)n
L -e-nx. Sh".!w that f
n
oo
is continuous on
n=I
[0, oo[.
Solution The solution is immediate from the preceding example and the fact
that the unifonn limit of continuous functions is continuous. •
§5.10 Power Series and Cesaro and Abel Summability
289
Exercises for §5.9
Test the following series for convergence and uniform convergence:
1.
~Xn
-IU
L., , e
I n.
'
2.
3.
oo
L
0
~ X
< 00.
sm nx - , u
L.,--e '
0
<b~
1
00
4.
(-l)R
~
I
n+x'
•
X
~ 7r -
b.
n
§5.10 Power Series and Cesaro and
Abel Summability
In this section we consider some additional, optional topics in the theory of
infinite series. We begin by studying power series.
5.10.1 Definition
A power series is a series of the form
00
Lak(x-xoi
k=O
where the coefficients ak are fixed real (or complex) numbers. Let
lim sup
k-+oo
~ = -R1 ;
R is called the radius of convergence of the power series, and the set {x
Ix -xol = R} is the circle of convergence (x may be real or complex).
I
Note that O ~ R ~ +oo; R may be O or +oo. The reason for the ter~nology
in Definition 5.10.1 is brought out by the following result.
Chapter 5 Uniform Convergence
290
5.10.2 Theorem E:Oak(x - xo)" converges absolutely for Ix - xol < R,
converges uniformly for lx-xol :5 R', where R' < R, and diverges if lx-xol > R.
(The theorem gives no information if Ix -
Xol = R.)
These convergence properties clearly define R uniquely.
5.10.3 Corollary The sum of a power series is a COO function inside its
circle of convergence. It can be differentiated termwise; the differentiated series
then has the same radius of convergence.
The proof uses the results from §5.3 on termwise differentiation of series.
If limn.... oo Ian/ an+i I happens to exist, then the radius of convergence is
I I
R = l.1m - an
n-ex>
Dn+l
.
This is seen by using Theorem 5.10.2 together with the ratio test. We ask the
reader to prove this.
Next, the concept of Cesaro summability is examined.
=
=<I:;=
5.10.4 Definition .Set Sn E:-= 1 at and <Tn
1 Sk)/n; thus <Tn is the
arithmetic mean of the first n partial sums of the series. The series
1 a1c is
called Cesaro I-summable or (C, 1) summable to A if limn.....00 <Tn =A.If this is
the case, write
E:
00
Lak =A (C, I).
k=l
The idea here is to find some way to attach meaning to otherwise divergent
series. For example,
I - I + 1 - 1+ · ·· =
Here the partial sums are
Sn
1
(Tn
Thus, lT2n = n/2n,
general formula
<T2n+l
=~
1
2 (C, 1).
= I, 0, 1, 0, ... , and their averages are
112233
n
L S1c =l' 2' 3' 4' 5' 6' ··· ·
k=l
= (n + l)/(2n + 1), and so limn-oo <Tn = 1/2. Note the
<Tn
=
L
n
k=l
(
k-
I)
I - -n- ak.
§5.10 Power Series and Cesaro and Abel Summability
291
However, we can introduce a yet more powe,ful method of summation by
averaging the un's, just as the (C, 1) method was based on averaging the Sn's.
That is, we define the (C, 2) sum of the given series to be limn- 00 (u1 + u 2 +
· · · + Un)/n if the limit exists.
The reader can easily see how to define (C, r) summability for arbitrary values r = 1, 2, ... , obtaining successively more powerful methods of summation.
Some basic properties of (C, 1) summability follow.
i.
If E~ 1 ak = A (C, 1) and
E~1 bt = B (C, 1), then
00
2)aa.1: + /3b.1:) = aA + /3B (C, l).
k=I
ii.
If
E: ak = A (C, 1), then
L =A 1
00
a1:+1
a1
(C, 1)
(decapitation).
k=I
iii.
If E~1 a,1: = A in the usual sense, then
00
Lat= A (C, I)
(regularity).
k=I
(This property is crucial; any self-respecting summation method must have
it.)
Proof
iii.
We have.S11 -+ A. Thus, given B
implies Sn ~ B. Now
<
A, there is an no such that n ~ no
1
1
n
n
n-no
n
Un= -(S1 + · · · + Sno + Sno+l +···+Sn)~ -(S1 + · · · + Sno) + - - B .
Hence, lim infn-+oo un ~ B. Since B < A was arbitrary, lim infn-oo un
A similar argument shows that lim supn-+oo un 5 A. Thus,
A
5
lim inf Un
n-+oo
5
lim sup Un
5 A.
Hence the two limits are equal, and so limn-+oo Un= A.
~ A.
Chapter 5 Uniform Convergence
292
Next, we tum to another method of summation called Abel summation (although it was actually invented by Euler).
5.10.5 Definition
E:0
at is summable to A in the sense of Abel
Iimx-+ 1- E:Oatxk =A. We write E:Oat =A (Abel).
if
For instance, we again have
I
- = 1 - 1+ l - I + ···
2
=
(Abel)
=
since /(x)
l - x + x2 - .. · l/(1 + x) for lxl < 1 and this tends to 1/2 as
J-.
Note that (at least in this example) the Abel method gives the same result as
the (C, l) method. Actually, this is always the case, as we shall see in Theorem
5.10.7. First we shall prove that Abel summability is regular.
X-+
5.10.6 Theorem (Abel)
If E:O a" = A, then
E:O a"x" converges for
lxl < 1 and limx-+1- E:Oakxk =A.
Thus, if a power series converges throughout a closed interval, its sum is
continuous, even at the endpoints.
Actually, Abel's method is more powerful than the (C, l) method.
S.10.7 Theorem
E:Oak =A (C, I) implies
E:Oat =A (Abel).
It is interesting to ask for conditions under which a Cesaro-summable series
(or Abel-summable series, and so forth) is actually convergent in the usual sense.
Along these lines we give a result of G. H. Hardy.
If I: an = A (C, 1) and if an = O(1/n) (that is, if lanl ~
C/nfor a constant CJ, then Ean converges (to A) in the usual sense.
S.10.8 Theorem
Note. Theorems of the preceding type are known as "Tauberian," after
·A. Tauber, who proved such a theorem relating Abel summability to ordinary
convergence.
293
§5.10 Power Series and Cesaro and Abel Summability
5.10.9 Example
E:Ox*/k!
Find the radius of convergence of
E:Oxk
and of
Solution In these cases we can use the formula
I I
R = l.1m - On
n-oo On+l
.
The first example gives R = 1, and the second gives
1·1m (n+ 1) =oo.
. ((n+l)!)
R = l1m
- -n.,- = n-+oo
n--+oo
Thus E:Oxk converges to
e.r) for all x.
•
5.10.10 Example
~/(1- x) if lxl <
Show that
1, and
"£,:Ox*/k! converges (to
E: (-lik is not summable (C, 1).
1
Solution Here,
an: -1,+2,-3,+4,-5,+6, .. .
Sn: -1,+l,-2,+2,-3,+3, .. .
Tn: -l,0,-2,0,-3,0, ...
u2n = 0, u2n-1 = -n/(2n - I)-+ -1/2.
Thus, limn-oo undoes not exist. However, the (C,2) sum is -1/4 (the student
should check this statement). •
5.10.11 Example Show that
E: (-I)kk = -1/4 (Abel). Here,
1
oo
k
k
d Loo
k k
d
1
X
L (-l)kx =x(-l)x = x - - = - - - ,
dx
dx 1 +x
(I +x)2
k=I
k=I
This tends to -1/4 as x-+ 1-. and so
oo
I
L(-Iik = - 4 (Abel).
k=I
lxl < t.
Chapter 5 Uniform Convergence
294
Exercises for §5.10
1.
Compute the radius of convergence of
oo
~x
~
.t=l
2.
k
oo
and of
k2
~ k!xk.
~
.t=l
Show that
I
(I - x)2
-l<x<l
=~kx"-'=~(k+l)xt,
~
L...J
k=O
k=l
by differentiating an appropriate series.
3.
Show that
. 2
.
'
- =I+ 0 - 1 +I+ 0 - I+ I+ 0 - 1 + · · · (C, 1).
3
(Note that insertion of zeros can alter the Cesaro sum.)
4.
Show that
2
3
I+0-1+1+0-···=(Abel).
5.
Show that
I I
I
1r
1--+---+···=3
5
7
4
by obtaining a series expansion for the arctangent and using Abel's theo-
rem.
Theorem Proofs for Chapter 5
5.1.4 Proposition Let A c M where M is a metric space, let ft : A
be a sequence of continuous functions, and suppose that ft
A). Then f is continuous on A.
-+
-+
N
f (uniformly on
Since fn -+ f uniformly, given c: > 0, we can find an integer N such
that k ~ N implies p({k(x),f(x)) < c:/3 for all x EA. Consider a particular point
x0 E A. Since fN is continuous, there exists a 6 > 0 such that d(x,Xo) < 6,
x E A => p (fN(x).fN(Xo)) < c:/3. Then, for d(x,Xo) < 6, p(f(x),f(xo)) ~
p(f(x),fN(X)) + p(fN(X),fN(Xo)) + p(JN(Xo),f(Xo)) < c:/3 + c:/3 + c:/3 = €. Since Xo
is arbitrary, f is continuous at each point of A; hence it is continuous. •
Proof
295
Theorem Proofs for Chapter 5
5.2.1 Cauchy Criterion Let N be a metric space with metric p, and let A
be a set. Suppose that N is complete and ft : A -+ N is a sequence pf functions.
Then /k converges uniformly on A iff for every c > 0 there is an N such that
l, k ~ N implies p(fk(x),fi(x)) < £ for all x E A.
Proof If fk -+ f unifonnly, then given c > 0, we can find an integer N such
that k ~ N implies p(J,,(x),f(x)) < c /2 for all x. Then if k, l ~ N, p(J,,(x),fi(x)) :::;
p(fk(x),f(x)) + p(f(x),fi(x)) < c/2 + c/2 = c.
Conversely, if, given £ > 0, we can find an N such that k, l ~ N implies
p(ft(x),fi(x)) < £ for all x, then fk(x) is a Cauchy sequence at each point x, and
so fk(X) converges pointwise to something, which we denote by f(x). Moreover,
we can find an N such that k, l ~ N implies p(fk(x),fi(x)) < c/2 for all x.
Since fk(x) -+ f(x) at each point x, we can find for each x an Nx such that
l ~ Nx implies p(fi(x),j(x)) < e/2. Let l ~ max{N,Nx}- Then k ~ N implies
p(/k(x),f(x)) :::; p(/k(x),fi(x)) + p(fi(x),f(x)) < c/2 + e/2 = £. Since this is true
for each point x, we have found an N such that k ~ N implies p(fk(x),f(x)) < £
for all x. Hence fk -+ f (uniformly).
•
5.2.2 Weierstrass M test Suppose_ that V is a complete normed vector
space and gk : A -+ V are functions such that there exist constants Mk with
llgk(x)II :::; Mt for all x EA, and E~1 Mk converges. Then E~1 g1c converges
uniformly (and absolutely).
E:
Proof Since
1 Mt converges, for every £ > 0 there is an N such that
k ~ N implie,s IMk + · · · + Mk+pl < € for all p = I, 2, .... Fork ~ N, we have,
by the triangle inequality,
llgt(X)
+ · · · + gk+p(x)II :::; llgix)II + · · · + llgk+p(x)II :::; Mk + · · · + M1c+p < c
for all x E A. Thus, by the Cauchy criterion for series,
unifonnly.
•
E: gk converges
1
5.3.1 Theorem Suppose thatft ,h ,!J, ... are integrable functions on a closed
bounded interval [a, b], and that they converge uniformly to a limit function f
on [a,b]. Thenf is integrable on [a,b] and
lim
n-oo
lb
a
fn(x)dx =
lb
a
f(x)dx.
296
Chapter 5 Uniform Convergence
Proof Let us first asswne that f is integrable. To prove the limit relation,
recall that if lg(x)I :s; M then
·
llb
g(x)
dxl :s; M(b -
a).
Given e > 0, choose N such that k 2: N implies [{k(x)- f(x)I
< e/(b- a). Then
as required.
The harder part is to show that f is ip.tegrable. To do this, we use Definition
4.8.1, which is equivalent to
lb lb .
_a f
(x) dx = a f (x) dx.
Let e > 0 be given and choose N so that IJ,,(x) - f(~)I < e if n 2: N and
a :s; x :s; b. In particular, this implies that f is bounded (why?) Note that if a
is an upper bound for fn on [x;,X;+il, then a+ e is an upper bound for f on the
same interval. Thus,
sup{f(x) Ix E [x;,X;+il} S sup{fn(x) Ix E [xi,Xi+il} + e
and hence
U(f, P) :S; U(fn, P) + e(b - a).
Similarly,
L(fn, P) - e(b - a) :S; L(f, P).
Taking the inf's and sup's ?Ver P, we get
and
lb
fn(x)dx- e(b - a) S
lb
f(x)dx.
297
Theorem Proofs for Chapter 5
Thus,
11b f(x)dx-1b f(x)dx :$ 2c(b - a);
being valid for all c
integrable. •
>
0, the upper and lower integrals ar~ equal, so f is
5.3.2 Corollary Suppose that the functions g" : [a,b]
--+ IR are Riemann
integrable and
1 g1: converges uniformly on [a,b]. Then we may interchange
the order of integration and summation:
z:=:
fb
la
Proof Let!,,
(f
gk(x)) dx =
k=J
gk(X) dx) .
k=I
=L~-=I gk ; thenfn
lb
f (lat
--+
f
=z::::1 gk (unifonnly), and so, by 5.3.1,
fn(x)dx-+ lb f(x)dx.
•
5.3.3 Theorem Let ft : ]a, b[ --+ lR be a sequence of differentiable functions
on the open interval ]a, b[ converging pointwise to J : ]a, b[ --+ R Suppose that
the derivatives/£ are continuous and converge uniformly to a function g. Then
f is differentiable and J1 = g.
Write fk(x) = /t(Xo) + J~f£(t) dt, where a < xo < b. This is possible
by the fundamental theorem of calculus. Letting k --+ oo, we getf(x) = f(xo) +
J~ g(t) dt, using Theorem 5.3.1. Since g is continuous by 5.1.4, the fundamental
theorem of calculus shows that the right side is a differentiable function of x with
derivative g(x). Hence the left side is also differentiable, and so/' (x) = g(x). •
Proof
.5.5.1 Theorem
i:
/f (M,d) and (N,p) are metric spaces, then so is Cb(A,N); that is, d(f,g)
satisfies
'?. 0 and d(f,g) =0 iff f
a.
d(f,g)
b.
d(f,g) = d(g,f).
c.
d(f,g) :$ d(f,h)+d(h,g).
=g.
Chapter 5 Uniform Convergence
298
ii.
If N is a normed space, then so is Cb(A, N); that is,
a.
II/II~ o and 11111 = oWt= o.
b.
llo/11 = lol II/II for a E R, / E Cb,
c.
Ill+ ell $
11111 + llell
11 · II satisfies
(triangle inequality).
Proof
i.
Each of a and b is a routine check. To prove c, write
d(f,e) = sup{p(f(x),e(x)) Ix EM}.
For each x EM,
p(f(x), e(x)) $ p(f(x), h(x)) + p(h(x),e(x))
and so by definition of least upper bound,
d(f,e) $ sup{p(f(x),h(x)) + p(h(x),e(x)) Ix EM}.
Now d(f, h) + d(h, e) is an upper bound for the set on the right-hand side,
and so it is greater than or equal to the sup. Thus, d(f, e) $ d(f, h)+d(h, e).
ii.
a and b are clear. For c,
Ill+ gll = sup{ll(f + g)(x)II Ix EA} $ sup{llf(x)II + lle(x)II Ix EA}
by the triangle inequality. Now 11111 + llell is an upper bound for the
set on the right-hand side, since 11/(x)II $ 11111 and lle(x)II $ llcll for
all x E A. Thus, it is greater than or equal to the sup of this set. Thus
II/+ ell $ 11111 + llgll, as required. •
5.5.2 Theorem
{ft_
-+
f (uniformly on A)) <=> (ft
-+
f in Cb),
Proof This is nothing more than a transcription of the definitions. The student
should write it out. •
5.5.3 Theorem If N is a complete metric space, then Cb(A, N) is a complete
metric space. If N is a Banach space, so is Cb(A,N).
Theorem Proofs for Chapter S
299
-Proof Letfn E C1,(A,N) be a Cauchy sequence. Thenfn satisfies the Cauchy
criterion (S.2.1), and so it converges unifonnly to a function f. By 5.1.4, f is
continuous, and, taking, say, c = 1 in the definition of convergence, we see
that I,. bounded implies f is bounded. Thus, f E C1,(A, N). Hence C1,(A, N) is
complete. The second assertion is a special case of the first.
•
5.6.2 Arzela-Ascoli Theorem Let Ac M be compact and B c C(A,N).
Then B is compact if and only if B is closed, equicontinuous. and pointwise
compact.
Proof To prove this, we first prove a lemma.
Lemma Let A be compact. Then for any 6 > 0 there is a finite set C6 =
{Yt, ... ,Yk} such that each x EA is within 6 of some y, E C6,
Proof The collection of balls {D(x,6) I x e A} cover A, and so, by compactness, a finite number-say, D(y1, 6), . .. , D(yk, 6}---do as well. Let C6 =
{Y1,, .. ,Yt}. 'v
Now we tum to the proof of the theorem. Let C = LJ{C1;n In= I, 2, 3, ... }.
Since each C1;n is finite, C is countable; say, C = {x1,x2, ... }. Let f,, be. our
sequence in B. Now {/n} is contained in the pointwise compact set B, and so,
from the Bolzano-Weierstrass theorem, there is a subsequence of !,.(xi) that is
convergent. Let us denote this subsequence by
Similarly, the sequence/u(x2 ), k = 1,.2, ... , has a subsequence
which is convergent. Continuing the process, the sequence ht(X3), k = I, 2, ... ,
has a subsequence
which is convergent. We proceed in this way and then set Kn= Inn, so that Kn is
the nth function occurring in the nth subsequence.
Chapter 5 Unifonn Convergence
300
Diagrammatically, gn is obtained by picking out the diagonal:
/11 /12 /13 '' -fin '·' (first subsequence)
/21 /22/23 · · ·hn · · · (second subsequence)
hi /32 /33 · · -f3,, · · · (third subsequence)
(nth subsequence)
fnl fn2 fn3 · · ·fmi · · ·
This trick is called the "diagonal process" and is useful in a variety of situations.
From its construction, we see that the sequence gn converges at each point
of C; indeed, g,. is a subsequence of each sequence J,n1c, k = I, 2, ....
We shall now prove that the sequence g,. converges at each point of A and
that the convergence is uniform, and this will prove the theorem. To do this, let
c > 0 and let 8 be as in the definition of equicontinuity. Let C6 = {y 1, ••• , Yk}
be a finite subset of C such that every point A is within 8 of some point in C6
(see the lemma). Since the sequences
all converge, there is an integer N such that if m, n
p(g,.(y;),gn(y;))
<
€
~
N, then
for i = I, 2, ... , k.
For each x EA, there exists a Yi E C6 such that jjx assumption of equicontinuity, we have
Yill < 8.
Hence, by the
for all n = I, 2, .... Therefore,
p(gn(X),gm(X))::; p(g,.(x),gn(Yj)) + p(gn(Yj),gm(Yj))
+ p(gm(Yj),gm(x))
< €+€+€= 3€,
provided that m, n
~
N. This shows that
and so the uniform convergence of the sequence gn on A follows from the
Cauchy criterion. The limit lies in B since B is closed. This proves that if
B is equicontinuous and pointwise compact, then B is compact. We leave the
converse to the reader. (See also Exercise 68.)
•
Theorem Proofs for Chapter 5
301
5.6.3 Corollary Let A c M be compact and N = Rm. Assume B c C(A, Rm)
is equicontinuous and pointwise bounded. Then every sequence in B has a
unifonnly convergent subsequence.
Proof The set of values {f(x) If E B} for each fixed x E A is bounded in
Rn and so lies in a compact set (its closure, for instance). Thus, the preceding
proof applies to a sequence in B.
Alternatively, one shows that cl(B) is equicontinuous and pointwise compact,
so that it is compact by 5.6.2.
•
5.7 .1 Contraction Mapping Principle Let M be a complete metric
space and <I> : M -4 M a given mapping. Assume that there is a constant k,
0 ~ k < 1, such that d(<l>(x), <l>(y)) ~ kd(x,y) for all x,y EM. Then there is a
unique fixed point for <I>, i.e., a point x* E M such that <l>(x,.) = x.,. In fact, if xo
is any point in Mand we define X1 = <l>(.xo), X2 = <l>(x1), ..., Xn+I = <l>(xn), ... ,
then
Proof First we show the existence of a fixed point, then its uniqueness. Let
xo E M and x 1, x2, X3, •.• be as in the theorem. If x 1 = Xo, then <l>(xo) = xo and xo
is fixed. If not, then d(x 1, xo) is not 0, and we start by showing that the points
{ Xn} form a Cauchy sequence in M. To show this, write
d(x2,x1) = d(<l>(x1), <l>(xo))
~
kd(xi,xo)
d(x3,X2) = d(<l>(x2), <l>(x1)) :::; kd(x2,x1) ~ k2d(x1, xo);
inductively, d(Xn+J,Xn) ~ ~d(x1,Xo). Also,
d(Xn+p, Xn) ~ d(Xn+p, Xn+p-1) + d(Xn+p-1, Xn+p-2) + · · · + d(Xn+I, Xn)
by the triangle inequality, and so
d(Xn+p, Xn) ~ (k!'+p-l + k!'+p- 2 + · · · + k!')d(x1, Xo).
I::i
But the geometric series
k; converges, since 0 ~ k < 1, and so it satisfies
the Cauchy criterion for the series: given e > 0, there is an N such that
Jci+p-1 + ... + k!'
<
if n 2: N and p is arbitrary. Hence d(Xn+p, Xn)
so {xn} is a Cauchy sequence.
e
d(xi,Xo)
< c if n 2: N, with p arbitrary, and
Chapter 5 Uniform Convergence
302
By completeness of M, limn---.oo Xn exists in M. Call this limit x.; i.e.,
X*
= lim
n---+oo
Xn-
We now show that <I> is (uniformly) continuous. Given c > 0, let 8 = c/k. Then
d(x, y) < 8 => d(<l>(x), <l>(y)) :::; kd(x, y) < k8 = to.
Consider Xn+I = <l>(xn); Xn+I -+ x., and by the continuity of Cl>, <l>(xn) -+ Cl>(x.).
Thus, x. = <l>(x.), so x. is a fixed point.
Finally, we prove the uniqueness of the fixed point x*. Let y. be another
fixed point, i.e., Cl>(y.) = y •. Then
d(x.,y.) = d(<l>(x.), Cl>(y.)) $ kd(x.,y.); i.e., (1 - k)d(x.,y.) :::; 0.
But k < 1, and so (1 - k) > 0, implying d(x.,y.)
fixed point is unique.
•
= 0, i.e., x. = y., and thus the
5.7.2 Theorem Under the assumptions in the text, there is a 8 >
0 such
that the equation
dx
=xo
has a unique C1 solution x = cp(t), with <p(to) = xo, for to <p (t) =f(t, <p(t)).
dt
=f(t,x),
(1)
x(to)
8 < t < to +
o, i.e.,
1
Proof First, choose a ball about (t0 ,x0 ) such that lf(t,x)I $ 1 and the Lipschitz
condition holds. Then choose {; such that
i.
It- tol :::; 8, Ix -
ii.
Ko < 1 (where K is the Lipschitz constant).
Let
xol :::; Lo=> (t,x) is in the chosen ball, and
M = { cp : [to - 8, to + o] -+ Ill I <p is continuous,
<p(to) = xo, and jcp(t) - xol $ U}
C([to - 8, to + 8], Ill).
Note that Mis closed in C, and thus Mis a complete metric space with metric
Let <I> : M -+ C be defined by
<l>(<p)(t) = xo +
l
IQ
1
f(s, <p(s))ds.
Theorem Proofs for Chapter 5
303
As we have noted, the integral equation (2) in the main text is equivalent to (1),
and this is equivalent to cp being a fixed point of <I>. We will show that <I> is a
contraction.
First, we need to show that <I> maps M to M. That is, if cp E M, then
,f;(t) = Xo + J,' f(s, cp(s)) ds E M. The function ,j; is continuous since it satisfies a
Lipschitz coJdition with constant L: Ifs< t, then
lt/l(t) - ip(s)I =
11' f(x, cp(x))dxl $ 1' lf(x,
cp(x))I dx. $ Lit - sl.
Note that ip(to) = Xo and
lt/l(t) - xol
=
l.f
f(s,cp(s))dsl $ Lit- tol $ Ll,
and so in fact 1/J E M. Next we seek a k, 0 $ k
kd(cp1, cpz). Note that
d(<l>(cp1 ), <l>(cp2)) = s~p
Ii'
< 1, such that d(<l>(cp, ), <l>(cp2)) $
[/(s, <p1 (s)) - /(s, \02 (s))] dsl
$ 6K sup lcp1 (s) - <p2(s)I = 6Kd(cp1, <p2)
s
and 6 was chosen so that 6K < l. Thus, we take k = 6K.
In conclusion, since <I> is a contraction, it has a unique fixed point, showing
that our equation has a unique solution.
•
5.7.3 Proposition If >.Mlb- al < I, then the Fredholm equation (3) in the
text has a unique solution.
Proof The proof was given in the text except for the fact that <I>(/) is
continuous. To prove this, note that since [a,b] x [a,b] is compact, K is
uniformly continuous. Therefore, given c > 0, there is a 6 > 0 such that
((x-y)2+(r-s)2) 112 < 6 implies IK(x,t)-K(y,s)I < c. In particular, lx-yl < 6
implies IK(x,t)- K(y,t)I <€,for all t E [a,b]. Thus,
1b
IK(x, t) - K(y, t)I 1/(t)I dt
<€
1b
1/(t)I dt
< cll/ll(b -
and so
l(<l>(/))(x) - (<l>(/))(y)I
which implies <l>(/) is continuous.
•
< c>.(b -
a)ll/11
a)
304
Chapters Uniform Convergence
5.7.4 Proposition Assuming that K is continuous on [a,b] x [a,b], ihe
Volterra integral equation has a unique solution f(x) for any ,\_
Proof
We first prepare a useful lemma.
Lemma Let M be a complete metric space, <I> : M -+ M, and, for some
positive integer N, assume that <ti', defined by <ti'= <I> o <I> o · · · o <I> (N times)
is a contraction. Then <I> has a unique fixed point.
Proof Let x. be the unique fixed point of <ti'. Then <l>"(x.) = x,., and so
ctl'+ 1(x.) = <l>(x.) or <t>N(ct>(x.)) = <l>(x.). Thus <l>(x,.) is a fixed point of <t>N also,
and so, by uniqueness, <l>(x,.) = x,.. Thus ct> has a fixed point. Any fixed point
of ct> is also a fixed point of <t>N, and so it is unique. 'v
Proof" of 5.7.4
Let <I>: C([a,b],lR)-+ C([a,b],lR) be defined by
<l>(j)(x) = ,\
lr
K(x,y) f(y) dy + <p(x).
As in 5.7.3, 4>(J) is continuous. Note that
d(<l>(J1 ), cl>(J2)) = sup l<l>(Ji)(x) - cl>(h)(x)I ~ ,\M sup Ix - ald(/1 ,/2)
= -\Mlb - ald<J1 ,/2).
Also,
d(<l>2(J1 ), <1>2 (/2)) = d(<l>(cl>(J1 )), <l>(<l>(J2)))
~ s~p I,\
1"
K(x,y)-\Mly- ald<J1,/2)dyl
~ ,\2Mi (s~p Ix~ aj2) d(J1 ,/2).
Inductively,
d(<l>"(J, ), <l>"(J2)) ~
,\"M"lb I
n.
'al" d(J1 ,/2).
E:0
By the ratio test,
,\"Mnlb-al" /n! converges, and consequently, there exists
an integer N such that ,\" Mn lb - aln / N! < l, implying that <ti' is a contraction.
From the lemma, we conclude that cl> has a unique fixed point.
•
Theorem Proofs for Chapter 5
305
5.8.1 Theorem Let f : [0, I) -+ JR be continuous and let e > 0. Then there
is a polynomial p(x) such that IIP - /II < e. In fact, the sequence of Bernstein
polynomials
Pn(X) =
t (;)
f(k/ n)xk(l - xf-k
k=O
converges uniformly to f as n -+ oo, where
(~) = n!/k!(n - k)!
denotes the binomial coefficient.
Proof
The binomial theorem asserts
(x + yf
=
t (~)
x"yn-1:_
(1)
k=O
Differentiating equation (I) with respect to x and multiplying by x gives the
identity
(2)
Similarly, by differentiating twice,
n(n - I)x2(x + yf- 2 =
t
k(k - l) ( ~) xtyn-t.
(3)
k=O
Let (for notation)
rk(X) = (~) Xk(l - xf-k.
Thus, Equations (1), (2), and (3) read, with y = I - x,
n
n
n
k=O
k=O
L r1:(x) = I, L' krk(x) = nx, L
k(k - 1)r1:(x) = n(n - 1).r.
It follows that we have the identity
n
II
n
n
2
L(k - nx>2r1:(x) = n x2 L rt(X) - 2nx Lkr1:(x) + L k2rt(X)
k=O
=
x2 -
n2
l:=O
l:=O
2nx • nx + [nx + n(n - 1).r]
= nx(l -x).
(4)
306
Chapter S Unifonn Convergence
Now choose M such that 1/(x)I $ M on [O, l]. Since/ is uniformly continuous, there is, for given e > 0, a 6 > 0 such that Ix - YI < 6 implies
lf(x) - /(y)I < e.
We want to estimate the expression
n
1/(x) - Pn(x)I
= f(x) -
n
Lf(k/n)r,.(x)
= L (f(x) - f(k/n)) rk(x)
k=O
.
k=O
To do this, divide this sum into two parts: those for which lk - nxl < 6n, and
those for which lk - nxl ~ 6n. If lk - nxl < 6n, then Ix - (k/n)I < 6, so that
1/(x) - f(k/n)I < e, and therefore, remembering that rk(x) ~ 0, these terms give
a sum $ e E~ rk(x) = E. The terms of the second type have a sum that is
since l(k - nx)/n61 ~ 1 for these terms. By (4), this sum is bounded by
2Mx(l -x)
n6 2
<~
-
262 n'
since x(l - x) :5 1/4 (why?). Thus, we have proven that for any e > 0, there is
a 6 > 0 such that
M
1/(x) - Pn(x)I < e + 262 n.
Thus, if n ~ M/26 2e, then M/26 2n < e:, so that lf(x)-pn(x)I < 2e; thus Pn-+ f
uniformly. •
5.8.2 Stone-Weierstrass Theorem Let M be a metric space, A c M a
compact set, and let l3 C C(A, IR) satisfy the following:
i.
8 is an algebra; if f, g E B and a E IR, then f + g E 8, f • g E !3, and
afE 8.
ii.
The constant function x 1-+ 1 lies in !3.
iii.
l3 separates points; that is.for x,y EA, x
f(x) ~/(y).
~
y, there is anf E !3 such that
Then l3 is dense in C(A, IR); that is, cl(B) = C(A, IR)
307
Theorem Proofs for Chapter 5
Proof
Let us introduce some notation as follows:
(f V g)(x)
=max(f(x), g(x))
and (f A g)(x)
=min(f(x), g(x)).
(See Figure 5.P-1.) Let rJ be the closure of B. Then, by continuity of addition
and multiplication, we see that 8 also satisfies i. It clearly satisfies ii and iii.
Thus, we want to show that 8 = C(A, IR.).
FIGURE 5.P-1 The graphical meaning of/ V g and/ Ag
By the preceding theorem and Example 5.8.4, we can find a sequence of
polynomials Pn(t) such that
lltl -
Pn(t)I
< 1/n for - n ~
t ~ n.
Thus
IIJ(x)I - Pn<J(x))I
< 1/n
if - n ~f(x) ~ n.
This proves that for f E /J, I/I e 8, because Pn of E
Now we have the identities
/J, since rJ is an algebra.
(an exercise for the reader). Thus, if/, g E 8, then/ V g and/ Ag lie in 8 as
well.
Let h E C(A,IR.) and X1,X2 E A with x 1 'F xi. Choose g E B such that
g(x,) 'F g(x2) (which is possible by hypothesis iii), and let.f..1x1 (x) = ag(x) + /3,
where
·
a= [h(x,)- h(x2)] and /3 = [g(x1)h(x2) - h(xi)g(x2)].
[g(x,) - g(x2)]
[g(xi) - g(x2)]
Chapter 5 Unifonn Convergence
308
Let c > 0 and x e A. For y E A, there is a neighborhood U(y) of y
such that f,x(z) > h(z) - c if z E U(y). This is simply by continuity of h. Let
U(y 1), ••• , U(y,) be a finite subcover of A, which is possible, since A is compact.
Setfx = /y 1x V · · · V /y,x• Thus, as before,/x E B andfx(z) > h(z)- c for all z EA.
Also,fx(x) = h(x). Thus, there is a neighborhood V(x) such thatfx(y) < h(y) + €
if y e V(x). Let V(x1 ), ... , V(xk) cover A, and set/ = fx 1 A · · · Afxk. Then, again,
f E B. Now f(z) > h(z) - € for all z E A, because fxlu) > h(u) - c for all
u EA. For y EA, y E V(xj) for some Xj, and so f(y) ~ fx/Y) < h(y) + c. Thus
lf(z) - h(z)I
•
< c, and so h e !l Thus rJ = C(A, IR).
The proofs for §5.9 use Abel's partial summation formula:
Consider two sequences a1,a2, ... and b1,b2, ... of real numbers and Let
Sn = a1 + · · · + an. Then
n
n
k=I
k=I
L ak.bk = Snbn+I - L
Proof
n
S1;(bt+I - bk)= Snb1 + I:<sn - Sk)(bk+I - bk)k=I
Note that an= Sn - Sn-I· Then
n
n
n
n
k=l
A-=l
k=I
k=I
L atbt = I:<sk - St-1)b1; = L Stbk - L S.t-1bk,
where so = 0. Now
n
n
k=I
k=I
L Sk-lbk = L Skbk+I -
Snbn+I,
and so we obtain the first result. The second equality is obtained by putting
n
bn+I = I:<bk+I - bk)+ b1
1-=l
in the first equality.
•
5.9.1 Abel's Test Let A C IR'" and 'Pn : A
-+ JR be a decreasing sequence
of functions; that is, 'Pn+1(x) ~ '{)n(x) for each x E A. Suppose that there is a
constant M such that l<l'n(x)I ~ M for all x EA and all n. lf "'£':..fn(x) converges
uniformly on A, then so does "'£':,1 ',On(x) fn(x).
Theorem Proofs for Chapter S
Proof
309
Let
n
Sn(X)
n
=Lft(x)
and
rn(X)
=L
k=l
'Pk(X) /k(X).
k=I
By the second equ~ity of the lemma, we find that
n
rn(X) - r111(X)
L (sn(X) -
= (sn(X) - s,,,(x))<pm+I (x) +
Sk(X))(<pk+I (x) - <pt(X))
k=m+I
for n
~
m, so that
lrn(X) - rm(x)I $ lsn(X) - s,,,(x)I lcpm+1 (x)I
ft
L
+
lsn(X) - St(x)I l'Pk+l (x) - ff)k(x)I,
k=m+I
since
ff't+l $ tf'k
and
lff't+1 - 'Pkl = 'Pk - tf'k+I •
Given e > 0, choose N so that n,m ~ N implies lsn(x) - s,,,(x)I
all x E A. Then for all x E A,
lrn(x) - r,,,(x)I
i
i
i
ft
< + ( 3~)
=
< e/3M for
L [tf'k(X) -
<,?k+I (x)]
k=m+I
+ ( 3~) [ff'm+l (x) - C,On+I (x)]
+ ( 3~) Clff'm+I (x)j + l'Pn+I (x)I]
e e e
<-+-+-=c:
- 3 3 3
$
Hence, by the Cauchy criterion, rn(X) converges uniformly.
S.9.2 Dirichlet's Test
R.. Assume that there is a
I:7=
•
Let Sn(X) =
1/;(x) for a sequence fn : A C Rm constant M such that lsn(x)I $ M for all x E A and
all n. Let gn : A C R"' - R. be such that 8n --+ 0 uniformly, gn ~ 0, and
gn+1(x) $ gn(x). Then
11,,(x)gn(X) converges uniformly on A.
:E:
310
Chapter 5 Uniform Convergence
Proof We use the same notation as in the proof of Abel's test, writing 'Pn = g,..
Now, however, to compute rn -rm, we use the first equality in the lemma, namely,
n
L s.1:(x)(<p.1:+1 (x) - <p,1:(x)),
rn(X) - r,n(X) = Sn(X)'Pn+l (x) - Sm(X)'Pm+l (x) -
k=m+I
so that, since 'Pk
~
0 and 'Pk+I :5 'Pk,
n
L (<p,t(X) - 'Pk+I (x))
lrn(X) - rm(x)I :5. M('Pn+I (x) - 'Pm+I (x)) + M
=M(<pn+l (x) + 'Pm+l (x) + 'Pm+l (x) -
.t=m+l
'Pn+l (x))
=2M<p,n+I (x).
Given e > 0, choose N so that m > N implies 'Pm(x) < e/2M for all x. Then
m,n ~ N implies lrn(x) - r,,,(x)I < e, proving the assertion.
•
S.10.2 Theorem }:~a.1:xk(x - xof converges absolutely for Ix - .xol < R,
converges uniformly for lx-xol :5 R1, where R1 < R, and diverges if lx-.xol > R.
(The theorem gives no information if Ix -
.xol = R.)
Proof Recall from 5.10.1 that·R- 1 = limsup ViaJ ask--+ oo. Let R' < R.
Choose R" with R' < R" < R. For n sufficiently large, the definition of lim sup
gives
vTa;J :5
Hence, if Ix -
;,, , that is, Ian I $ ( ;,, )
n
.xol :5. R', we have
Since R' / R" < l, we have unifonn and absolute convergence of the series in
the disk Ix - .xol :5. R' by the Weierstrass M test.
On the other hand, suppose that the series E an(X - xot converges. Then
an(X -xot-+ 0, so that lan(X - xotl $ l for n large. Thus vTa;J :5 lx-xol- 1
· for n large. Hence R- 1 = limsup vfa;;j :5 Ix - xol- 1, that is, Ix - .xol :5 R. •
5.10.3 Corollary The sum of a power series is a COO function inside its
circle of convergence. It can be differentiated termwise; the differentiated series
then has the same radius of convergence.
Theorem Proofs for Chapter 5
311
Proof The series obtained by tennwise differentiating the given series is
E ka1,;(x - Xot- 1 . The radius of convergence is R', where .
~, = limsup ~ -
But
iff -+ 1 (why?), and so
~, = limsup {/kiaJ =
Ji•
i.e.,
R' = R.
Thus, the differentiated series converges uniformly inside any smaller circle,
and therefore it is the derivative of the sum of the original series (see Proposition 5.4.3). By induction, we see that the original series is differentiable any
number of times.
•
If E:Oat = A then E:Oakx" converges for
5.10.6 Theorem (Abel)
lxl < I and limx-1- E:Oatx" =A.
Proof By changing a0 , we can assume that A = 0. Since ak is bounded (in
fact, ak-+ 0), the series E:Oakxk converges for lxl < 1, by Theorem 5.10.2.
Write Sn= E~ak, Since Sn is bounded as n-+ oo, the series E:OSkxk
likewise converges for lxl < I. Since A = 0, Sn -+ 0 as n -+ oo. Write f(x) =
E:O akx.t for lxl < 1. Then
00
00
f(x) =So+ L(St - St-1)xk = (1 - x)
k=I
Since Sn
-+
0, given c:
k=l
> 0 we can find no so that !Sn I $
lf(x)I $ (1 - x)
.
L Skxk.
no
oo
k=l
k=no+l
L Stx" + (1 - x) L
c: for n
> no. Then
c:xk
$ (1 - x) 'tskxkl + (I - x) · c:xno• 1 0
- x)- 1
k=I
no
$ (1-x) I:s1cx" +e.
h:I
Therefore, limsupx--+J- 1/(x)I :5 c:. Since c: > 0 was arbitrary, limsupx--+J-f(x)
=O.
•
Chapter 5 Uniform Convergence
312
5.10.7 Theorem
E:Oa1- =A (C, 1) implies E:Oak =A (Abel).
Proof As before, we may suppose that A = 0. Write Sn = E~ ak, Tn =
E~ Sk. By assumption, Tn = o(n). Here we use the little "oh" notation meaning
that for any e > 0, jo(n)I :5 en for n large enough. Hence Sn = Tn - Tn-1 = o(n)
and an = Sn - Sn-I = o(n). By comparison with a multiple of the convergent
series
kx·t, all three series a,:x·t, Skxk, and T1:xk converge if < 1.
E
E
E
E
lxl
Also,
Now, since Tn = o(n), given e > 0 we may choose no so that n
ITnl :5 En. Accordingly,
1/(x)I :5 (1
- x)2
L Ttxk + (1 - x>2
t~no
:5 (1
- x>2
IL
2:: no implies
ekxkl
k>no
L T1cxk + (1 - x)2 • ex(l - x)-
2
k~no
and we find Jim supx_, 1limx-1- J(x) = 0. •
1/(x)I < e.
Thus, . as in the previous theorem,
5.10.8 Theorem IfEan =A (C, l)andif an= O(1/n)(that is, iflanl :5 C/n
for a constant C), then E an converges (to A) in the usual sense.
Proof As in the preceding proofs, we may suppose that
E~=I a1:,
Tn =
A = 0. Write Sn
=
z:::. . Sk, Then the first hypothesis is written as Tn = o(n), as in
1
the preceding proof.
We want to show that Sn -+ 0. If not, then for some t, > 0, ISnl 2:: 6 for
infinitely many indices n. It can be assumed (by reversing all signs if need be)
that Sn 2:: 6 for infinitely many values of n. But if Sn ~ 6 and r > n, we have
(I
I) -
Sr=Sn+On+1+an+2+··•+ar>6-C --+···+n+l
R
This will be
2:: 6/2 provided that Clog(r/n)
r
>6-Clog-.
n
~ 6/2, that is, r/n
:5 e6f 2c = A.
Worked Examples for Chapter 5
(Note that ,\
>
313
1.) Hence, we have
f,
([,\n] - n)
2~
[An]
L S, =
T[An) -
Tn,
t=+l
(Here [x] means the largest integer less than or equal to x.) Now the right side of
this inequality is o(n), but the left side is of the order (,\-1)6n/2, a contradiction.
Hence Sn must tend to 0.
•
Worked Examples for Chapter 5
Example 5.1
a.
If fk
b.
Answer the same question for uniform convergence.
-+ f (pointwise) and gk -+ g (pointwise), show that f1c + g1c
(pointwise) for functions f,g: A C Rn -+ Rm.
-+
f +g
Solution
a.
For x E A, we must show that (ft+ gA:)(x) -+ (J + g)(x). Given € > 0,
choose N1 such that k ~ Ni implies llfk(x) - f(x)II < c/2 and N2 such
that k ~ N2 implies llgk(x) - g(x)II < e/2. Then let N = max(N1,N2), so
that k ~ N implies (by the triangle inequality)
IIVt + g1c)(x) - (J + g)(x)II
b.
~
11/k(x) - f(x)II + llgk(x) - g(x)II < e.
Repeat the argument in a where each statement is to hold for all x E A.
•
Example 5.2 Prove that a sequence _{k : A -+ Rn converges pointwise (uniformly) iff its components converge pointwise (uniformly).
Solution The part of the example on pointwise convergence follows from
the fact that a sequence in Rm converges iff its components do (see Chapter 2).
However, we write out the argument again so that its validity for uniform convergence can be seen.
Chapter S Uniform Convergence
314
Let x = (x 1, ••• ,x"') E Rm. Then Iii S llxll S E:1 Iii. Indeed, the first
inequality is obvious, and the second follows from the triangle inequality if we
write x = (x 1 , 0, ... , 0) + (0, ~, 0, ... , 0) + • • • + (0, 0, ... , x"').
Applied to ft = (ff, ... ,ft), we have
m
lfi(x) - /(x)I S llft(x) - f(x)II ~
L lff (x) - /(x)I.
i=I
Hence.ft converging pointwise implies thatJJ converges pointwise.
Conversely, suppose that f£(x) converges for each i and x. Choose N; such
that k,l 2:: N; implies IJI(x) - J/(x)I < Ejm. Then if N = max(N1, ... ,N,,,),
k, I 2:: N implies llfk(x) - .fi(x)II < E / m + · · · + f / m = E, and so ft(x) converges.
For uniform convergence, we repeat the argument with each statement holding for all x e A. •
Example 5.3 Find an example of a sequence ft that converges 1,1,niformly to
zero on [0, oo[, where each
contradict Theorem 5.3.1?
Solution
It ft(x) dx exists, but It ft(x) dx -+. +oo. Does this
Let
ft(x) = {
1/k,
0,
if O ~ X ~ k2 ,
if X > k2 .
Then ft-+ 0 uniformly, since lfk(x)I ~ 1/k for all x. However,
fo
00
fk(x) dx
=k2/ k =k -+ oo.
This does not contradict Theorem 5.3.1 because that theorem dealt with finite
intervals.
•
Example 5.4 (Dini's Theorem) Let A c Rn be compact and ft a sequence of continuous functions fk : A -+ R such that (a) ft(x) ~ 0 for x E A;
(b) ft -+ 0 pointwise; and (c) J,,(x) s .fi(x) whenever k 2::: I. Prove that ft -+ 0
uniformly.
Solution This example requires a little care, because we are trying to deduce
uniform convergence from pointwise convergence plus some other hypotheses
and we know that the result won't be true without these extra hypotheses (study
Figure 5.1-1, where all the hypotheses here are valid, except f1c(O) -+ 0 as
k-+ oo).
Worked Examples for Chapter 5
315
Given c > 0, we want to find an N so that lfk(x)I < € for all k 2: N and x E A.
For each x EA, find Nx so that lfic(x)I < c/2 if k 2: Nx. We write Nx to emphasize
that this number depends on x. Here we have used hypothesis (b). By continuity
of fic(x) there is a neighborhood Ux,k for x such that 1/k(Y) - fic(x)I < c/2 for
y E Ux,k· The neighborhoods Ux,N, form a covering of A, and so by compactness
there is a finite subcover, say, centered at X1, • .. , XM- Let N = max(Nxw .. , NxM).
Let x EA and k 2: N. Then x E Ux,,N, for some I, and so lfN,(x)-fN,(x1)I < c/2.
Thus, using (a) and (c),
0 $./k(x) $./N(x)
$_/N,(X) = /N,(X1) + [fN,(X) - fN,(X1)]
< c/2+£/2=£.
Therefore lfk(x)I
<€
fork 2: N, x EA, and so we have uniform convergence.
•
E:
Example 5.5 Consider the alternating series
1 (- ll / n, which converges
(see Theorem· 5.9.2). We cannot rearrange the terms of the series, or else we
may get divergence. In fact, the series E(-ll /n can be rearranged to yield
any desired sum! This was discovered by Riemann (see E;icercise 17). To be
able to rearrange series, it is sufficient to have absolute convergence. First, let
a; be a series. A rearrangement is the
us define a rearrangement. Let
series
aa(i),
where
a
is
a
permutation
of { 1, 2, 3, ... }, or, more precisely,
1
a bijection a: {1,2,3, .. .} - {1,2,3, ... }.
E::i
E:
Prove the following theorem.
I::
Theorem Let gk E !Rm and suppose that
1 gk converges absolutely; that
is,
1 llgkll converges. Then any rearrangement of the series
1 gk also
I::
I::
converges absolutely, and to the same limit.
I::
Solution Let
1 ga(k) be the rearranged series. Given c > 0, there is an N
such that n 2: N implies
llgnll + ... + llgn+pll < €.
Choose an integer Ni such that a(n) > N whenever n > Ni. (We can do this
since there are only finitely many integers n for which a(n) $_ N because a is a
Chapter 5 Uniform Convergence
316
bijection.) Thus, if n
> N 1, we have u(n + k) > N,
and so
I::,
By the Cauchy criterion,
Cu(n) converges absolutely.
To show that the limits are the same, given e, select N2 > N, where N is as
before, such that if 1 ~ / ~ N, then l = u(n) for some n, I ~ n ::; N2 • This is
because such k are finite in number and u is onto. Let No = max(N1 , N2 ); then,
form> No,
By construction, E~ 1 Cu(k) - E;:, Cn is a sum of some of the terms g, where
l > N. Thus, the terms in the preceding equation are no larger thane+ e = 2e:.
Thus, the series E~ 1 8u(k)·converges to E:Ogn, v.:hich is the desired conclusion. The result of this example is closely related to important rearrangement
theorems for double series (see Exercise 51).
•
Exercises for Chapter 5
1.
a.
Letfi: be a sequence of functions from A c !Rn to IR"'. Suppose there
are constants mk such that 11/k(x) - f(x)II ~ mk for all x E A and
such that mk -+ 0. Prove that fi: -+ f unifonnly.
b.
If mk -+ m E 1R and llft(X) - Ji(x)II ~ lmt - m,I for all x E A, show
that fi: converges unifonnly.
2.
Determine which of the following sequences converge (pointwise or uniformly) as k -+ oo. Check the continuity of the limit in each case.
a.
(sinx)/k on JR
b.
1/(kx+ 1) on )0, I[
c.
x/(kx+l)on]0,l[
d.
x/(1 + kx2) on JR
(1, (cos x)/ k2 ), a sequence of functions from JR to IR.2
e.
Exercises for Chapter 5
3.
4.
317
E:
Determine which of the following real series
1 8k converge (pointwise
or uniformly). Check the continuity of the limit in each case.
0
x5,k
> k.
a.
gk(x) =
b.
8k(x)
=
!xi 5: k
{ 1/k2,
1/x2, Jxl > k.
c.
8k(x)
=
v-~t) cos(kx) on R
d.
gk(x)
= xk on ]0, 1[.
{
/-ll,
x
Letfn: [1,2]-+ IR be defined by fn(x) =x/(l +x)n.
a.
Prove that E':i/n(x) is convergent for x E [l, 2].
b.
Is it uniformly convergent?
C.
Is
f12 (E~ fn(X))
dx =
L~ f 12fn(X) dx?
5.
Suppose that fk -+ f uniformly, where fk : A C JRn -+ IR; 8k -+ g
uniformly, where 8k : A-+ IR"'; there is a constant M1 such that Jlg(x)II 5,
M1 for all x; and there is a constant M2 such that lf(x)I 5: M2 for all x.
Show thatfi:gk-+ Jg uniformly. Find a counterex~ple if M 1 or M2 does
not exist. Are M 1 and M 2 necessary for pointwise convergence?
6.
Prove that the sequence fk : A -+ IR"' converges pointwise iff for each
x E A.fk(x) is a Cauchy sequence.
·1.
For functions f : A C !Rn -+ IR , form Cb as in the text. Show that
llfgll::; 11/ll llgll'
8.
Does pointwise convergence of continuous functions on a compact set to
a continuous limit imply uniform convergence on tha! set?
9.
Suppose that the functions 8k are continuous and E : 1 8k converges uniformly on A C !Rn. If Xk -+ xo in A, prove that I:;'; 1 8n(Xk) -+ I:;';1 8n(Xo)
ask-+ oo.
10.
For the sequences and series of Exercises 2 and 3, when can we integrate
or differentiate term by term?
11.
a.
Must a contraction on any metric space have a fixed point? Discuss.
b.
Let f : X -+ X, where X is a complete metric space (such as JR),
satisfy d(f(x),f(y)) < d(x,y) for all x,y EX such that x f: y. Must
f have a fixed point? What if X is compact?
Chapter 5 Uniform Convergence
318
12.
A function f : A - t JR, where A C ]Rn, is called lower semicontinuous if
whenever xo E A and >. < f (xo), there is a neighborhood U of xo such- that
>. <f(x) for all x E U nA. Upper semicontinuity is defined suajlarly.
a.
Show that f is continuous iff it is both upper and lower semicontinuous.
b.
If the functions Ji,. are lower semicontinuous, fie - t f pointwise, and
fie+1 (x) ?:. fk(x), then prove that f is lower semicontinuous.
c.
In b, show thatf need not be continuous even if thefic are continuous.
d.
Let f : [O, 1]
-t
JR, and let g(x) = sup6 >0 infJy-xJ<d(y). Prove that
g is lower semicontinuous.
13.
In Theorem 5.3.3, show that fk
14.
Let f : X
-t
-t
f unifonnly.
X be a contraction on a compact metric space X. Show that
Is this true
n:if"(X) is a single point, wheref" =Jojo·· -of (n times).
if X = IR?
15.
E
Let gk E JR.n arid let Ji,. be a subsequence of gk, Prove that if gk converges
absolutely, then Efk converges absolutely as well. Find a counterexample
if
gk is just convergent.
E
16.
Observe that in Worked Example 5.5 the same argument applies in any
normed space. Use this observation and the space Cb to prove the following:
Theorem Let gk : A C IRn - t JR.m be bounded and continuous, and
suppose that
gk converges unifonnly and absolutely. Then any rearrangement also converges uniformly and absolutely, and to the same limit.
E
17.
Let E:0 an be a convergent, not absolutely convergent, real series. Given
any number x, show that there is a rearrangement
bn of the series that
converges to x.
18.
Give an example of a sequence of discontinuous functions fie converging
uniformly to a limit function f that is continuous.
19.
Prove that
E
defines a continuous function on all of JR..
Exercises for Chapter 5
20.
319
Construct the function g(x) by letting g(x) = !xi if x E (-1/2, 1/2] and
extending g so that it becomes periodic. Define
f(x) =
I:
g(::=:x).
n=I
21.
a.
Sketch g and the first few terms in the sum.
b.
Use the Weierstrass M test to show that/ is continuous.
c.
Prove that f is differentiable at no point.
a.
Prove that if A c IRn is compact, B c C(A, !Rm) is compact <=> B is
closed, bounded, and equicontinuous.
b.; Let D = {f E C([0, 1],IR) I II/II S 1}. Show that Dis closed and
bounded, but is not compact. [Hint: Construct a sequence in D that
is not equicontinuous, and make use of a.]
22.
Let B C C(A, IRm) and A C !Rm be compact. Suppose that for each xo E A
and e > 0, there is a 8 > 0 such that d(x, xo) < 8 implies d(f(x),f(xo)) < e
for all f E B. Prove that B is equicontinuous.
23.
Let f : IR - IR and suppose that f of is continuous. Must f be continuous?
,24.
A metric space M is called second countable if there is a countable collection U1, U2, ... of open sets in M such that every open set in M is the
union of members of this collection. Prove that such an M has a countable
subset C such that cl(C) = M. (We then say that Mis separable.) Prove
conversely that a separable metric space is second countable.
25.
Let f : [0, I] - JR be continuous and one-to-one. Show that f is either
increasing or decreasing.
26.
Let k(x,y) be a continuous real-valued function on the square U = {(x,y) I
0 S x S 1, 0 Sy S 1} and assume that lk(x,y)I < I for each (x,y) EU.
Let A : [O, 1) - JR be continuous. Prove that there is a unique continuous
real-valued functionf(x) on (0, 1] such that
f(x) =A(x)+
27.
fo k(x,y)f(y)dy.
1
Let f : ]a, b[ - IR be uniformly continuous, and that Xn E ]a, b[ with
b. Show that limn__, 00 f(xn) exists.
Xn -
28.
Let fn(x) = x/ n. Is fn uniformly convergent on [0, 396]? On IR?
_ Chapter 5 Uniform Convergence
320
29.
Discuss the uniform continuity of the following:
xE)-1,1[.
a.
f(x)=x2-,
b.
/(x) = x 113 ,
x E [O, oo[.
c.
f(x) = e-x,
x E [O, oo[.
d.
f(x)=xsin(l/x),
e.
f(x) = sin[ln(l + .x3)],
O<x:::;1,/(0)=0.
-1 < x :::; 1,/(-1) = 0.'
30.
Let B = {f E C([O, 1), JR) IJ is C1 on )0, i[, /(0) = 0, and 1/'(x)! :::; 1}.
Show that cl(B) is compact.
31.
Let an be a convergent sequence of real numbers, an
(a1 + · · · + an)/n. Sh~w that bn -+ a as well.
32.
Let M and N be metric spaces and f : M ---t N be continuous. Suppose
that f (M) consists of two distinct points. Prove that M is not connected.
33.
Let fn : (0, 11 ---t JR be a sequence of increasing functions on (0, l], and
suppose that fn ---t O pointwise. Must fn converge uniformly? What if fn
just converges pointwise to some limit/?
34.
Find a sequence/n : [0, l]-+ JR of differentiable functions such that/n -+ 0
uniformly, but such thatf:(1/2) does not converge to 0.
35.
Let/:~ - JR be continuous and bijective. Show that/- 1 is continuous.
(See Exercise 7, Chapter 4. For a generalization, see M. Hoffman, "Continuity oflnverse Functions," Mathematics Magazine 48 (1975), pp. 66-73.)
36.
Let/(x,y) = x2-y/(x4 +y2). Discuss the behavior off ne~r (0, 0) with regard
to the limits
a.
limex,y)-+(O,o)/(x, y),
b.
limx..... o[limy-+O/(x,y)],
c.
limy-+O[limx-+O/(x,y)].
-+
a. Let bn =
37.
Suppose that/: JR-+ JR is continuous and/(1) = 7. Suppose that/(x) is
rational for all x. Prove that J is constant.
38.
Prove that 1 + 1/2 + 1/4 + 1/8 + • • • converges and 1 - 1/2 + 1/3 1/4 + • · · converges, but not absolutely.
. 321
Exercises for Chapter 5
39.
A function g : [O, 1] -+ lR is called simple if we can divide up [O, l]
into subintervals on which g is constant, except perhaps at the endpoints.
Let/: [O, 1] -+ IR be continuous and c > 0. Prove that there is a simple
function g such that Iii - gll < c.
40.
a.
Define 8 : C([O, 1], IR) -+ IR, f
continuous.
b.
Let g : lR -+ lR be continuous. Define F: C([O, 1], IR) -+ C([O, 1], JR)
by F(f) = gof. Prove that Fis continuous; prove that if g is unifonnly
continuous, then F is uniformly continuous.
f(O). Prove that 8 is linear and
1--+
41.
Show that there is a polynomial p(x) such that lp(x) - jxj 3 1 < 1/10 for
- 1000 :'S X '.'S 1000.
42.
Study the possibility of replacing the sequence of Bernstein polynomials
in Theorem 5.8.1 by a sequence of Lagrange interpolation polynomials
(see Exercise 2, §5.8, for the definition and properties) to effect the proof
of the theorems in §5.8.
43.
Let Ce((-1, 1],IR) denote the set of even functions in C([-1, 1],IR).
44.
45.
a.
Show that Ce is closed and not dense in C.
b.
Show that the even polynomials are dense in Ce, but not in C.
Projects: Examine the possibility of extending the Stone-Weierstrass theorem to
a.
Complex-valued functions (keep the same hypotheses on B, except
add ''f E B implies J E B," where the ·overbar denotes complex
conjugation).
b.
Noncompact domains (consult Simmons, Introduction to Topology
and Modern Analysis, McGraw llill).
c.
Use b to study the density of the Hermite functions in a suitable
space of continuous functions (the Hermite functions are defined and
studied in, for example, Courant-Hilbert, Methods of Mathematical
Physics, Vol. I, Wiley).
a.
Let fk : K C ]Rn -+ ]Rm be an equicontinuous sequence of functions
on a compact set K converging pointwise. Prove that the convergence
is uniform.
b.
Let
x2
fn(X) =
X
2
+
(l
)2 ,
-nx
Q
:'S X '.'S 1.
322
Chapter 5 Uniform Convergence
Show that fn converges pointwise but not unifonnly. What can you
conclude from a?
46.
Let f(t,x) be defined and continuous for a S t S b and x E Rn. The
purpose of this exercise is to show that the problem dx/dt f(t,x), x(a)
Xo, has a solution on an interval t E [a, c] for some c > a (it is unique
only under more stringent conditions). Perfonn the operations as follows:
Divide [a, b] into n equal parts to = a, ... , In = b, and define a continuous
function x11 inductively by
=
{
X,.(t) = f(t;, Xn(t;)),
Xn(a) =Xo.
Put An(t) = x,.(t) - f(t,Xn(t)),
Xn(t) = Xo
SO
+
t;
=
< t < l;+1,
that
1
1
/(s,Xn(s))
+ A11 (s)ds.
Use the Arzela-Ascoli theorem to find a convergent subsequence of the
=f(t,x) and x(a) =X(). This method
is called polygonal approximation.
x 11 • Show that the limit satisfies dx/dt
47.
Let .fn : A C JR" -. Rm be a sequence of equicontinuous functions. SupP?Se that ft converges on a dense subset K of A. Prove that the sequence
converges on all of A. Does this shed any light on the proof of Theorem
5.6.2?
48.
Prove that the sup norm on C([O, l], JR) is not derived from an inner product
(,) by 11/11 2 = (/,/). [Hint: Show that the parallelogram identity fails.]
49.
Let S be a set and let B denote the set of all bounded real-valued functions
on S; endow B with the sup norm. Prove that B is a Banach space.
50.
Let / : JR -. JR be a uniform limit of polynomials. Prove that f is a
polynomial.
51.
Consider a double series
00
L amn
where amn E JR,
m, n = 0, l, 2, ....
m,n=O
Say that it converges to S if for any e > 0 there is an N such that n, m ~ N
implies
If ak1-SI <
kJ=O
E:.
Exercises for Chapter 5
323
E:,,,_--o
Define absolute convergence and prove that if
anm is absolutely
'
convergent, then the sum can be rearranged as follows:
Interpret this result in terms of summing entries in an infinite matrix by
rows and columns.
52.
Can we differentiate the series
term by term?
53.
54.
Evaluate the following limits: '
.
I - cosx
a.
hmx-o 3X
b.
lim,..__.o+(l+sin2x) 1/x
c.
1- - ~)
limx-o+ (-.smx x
2X
_
Test the following infinite'series for convergence or divergence:
a.
f
f kit
k=J
b.
/klogk
k2 +2k+3
.
k=l
oo (k!)2
C.
55.
Ek=1
(2k)!
Prove that 1r/4 = I - 1/3 + 1/5 - I/7 +•••starting from
00
(1
+x2)- 1 = I:<-Iix2\ lxl < 1.
k=O
56.
Test the following series for absolute and conditional convergence:
00
a.
L)-Itnn=I
0
,
a real
Chapter 5 Uniform' Convergence
324
b.
~ (-1/logk
L., k log log k
.r-=3
00
c.
(-li,
Lka+(-l)k'a>O
.r-=2
57.
Prove that if
::5 x < oo,
ii.
IJ,,(x)I S g(x), n = I, 2, 3, ... , 0 S x < oo,
iii. fn(x) --+ j(x) uniformly, 0 ::5 x S R, for any R < oo, and
iv. Jo"° g(x) dx < oo,
i.
fn(x), g(x) are continuous,
then limn-+oo Jo"° fn(X) dx =
58.
59.
0
ft f(x) dx.
Prove the following convergence tests;
o.
a.
If u,. >
b.
If Un> 0,
a.
u,.+i
< 1-
u,. -
!n - nlogn'
_a_
a
> 1, then I: Un converges.
l , then "L.., Un diverges.
.
-Un+I 2:: l - -1 - - -
n
Un
nlogn
Letp > I with I/p+ 1/q = 1. For a,b,t
> 0, prove that
d'fP bqt-q
ab<-+~
- p
q
and that ab is the minimum value of the right side. (One way to
prove this is to use elementary calculus.)
b.
Prove Holder's inequality: If ak, bk 2:: 0, p > 1, and 1/p + 1/q = I,
then
11
(
~akbk $
n
~~
) J/p (
/I
1/q
)
~bf
[Hint: Imitate the proof of the Cauchy-Schwarz inequality, using
part a.)
c.
Prove Minkowski's inequality: If ak, bk 2:: 0 and p
n
(
60.
~(ak +bkf
) 1/p
(
n
::5 ~~
The series I:;.::::, xk /k converges if
lxl = 1 does it converge?
lxl <
) 1/p
(
> l, then
n
) 1/p
+ ~bf
1. For which complex x with
325
Exercises for Chapter 5
61.
Let Eakxk have radius of convergence R. Show that Eak(x verges inside the disk of center b, radius R.
62.
Find the radius of convergence of
63.
a.
Exkk/(k + 1)
b.
Exk/logk
bl con-
(Binomial series) Consider
~ a(a - 1) ···(a - k + 1) k
Lt
kl
X •
k=O
Assume that a is not an integer 2:: 0. Show that the radius of convergence
is R = l. (See Exercise 49, Chapter 2, on the hypergeometric series, for
behavior of the series at x = ±1.)
64.
Does 1 + 1/2 + 1/3 + • • • converge (C, 1) or (Abel)?
65.
Letf(x) be continuous,
Jof
0::; x < oo. We normally define
00
/(x) dx::: lim
r
R-++oo}o
f(x) dx,
if the limit exists. By analogy with (C, 1) summability, define a notion
of "(C, 1) integrability from 0 to oo," and prove that your method of
integrability is regular, that is, agrees with the usual
if the latter
converges.
ft
ti = l and tn+l = tn/(1 + t~). where /3 is fixed,
E~1 tn converges. [Hint: Find a constant C such
66.
Define tn inductively by
0 ::; f3 < l. Prove that
that tn::; cn-•I.B.]
67.
Let A= {j/2n E [0,1] j n = 1,2,3, ... , j = o,1,2, ... ,2n}, and let
/ : A - JR satisfy the following condition: There is a sequence En > 0
with
1 En < oo and
E:
ke~n
1) - / (
1n) I<
En
for all n > 0, j = 1, 2, ... , 2".
Prove that f has a unique extension to a continuous function from [O, 1]
to R
326
Chapter 5 Uniform Convergence
68.
Let A C ]Rm be compact and let B c C(A, ]jlffl) be compact. Prove that B
is equicontinuous as follows:
69.
a.
Prove that the map E : C(A, Rm)
is continuous.
b.
Use uniform continuity of E restricted to B x A to deduce the result.
(This method of proof is due to J. Allen.)
X
A -+ Rm defined by (f, x)
1-+
f(x)
Let the functions fn be monotone increasing and continuous on [O, 1].
Suppose that F(x) =
11,,(x) converges for each x E [O, 1]. Prove that
F is continuous.
z::
Chapter 6
Differentiable Mappings
In this chapter we develop the notion of a differentiable map from !Rn to lR.111 (or,
more generally, from one normed space to another). Starting with this chapter,
some linear algebra will be used. At this point, the reader might review the
notion of a linear transformation and its matrix representation, since we shall be
defining the derivative as a linear mapping. The connection between the general
definition and partial derivatives will be found in §6.2. After this we generalize
the usual theorems of calculus from the one-variable context treated in §4.7 to
the multivariable case (such as the theorem stating that differentiability implies
continuity, the chain rule, the mean value theorem, Taylor's theorem, tests for
extrema, and so forth).
§6.1 Definition of the Derivative .
A function of one variable f : ]a, b[
if the limit
-+
IR is called differentiable at xo E ]a, b[
. f (xo + h) / '(Xo ) = 11m
h
f (xo)
, h-O
exists. Equivalently, we may write this formula as
r
1m
h-O
f(xo + h) - f(:xo) - f'(xo)li .:_ 0
. /
- ,
l
that is,
lim f(x) - f(xo) - J'(xo)(x - xo) = 0
x-xo
X-Xo
or, what is the same,
lim lf(x) - /(Xo) - /'(x'o)(x - Xo)I = O.
x-+xo
Ix - xol
327
I•."
328
Chapter 6 Differentiable Mappings
The number/' (xo) represents the slope of the line tangent to the graph of/ at
the point (Xo,/(Xo)), as in Figure 6.1-1.
y
/
/(xo)
_ _ __.__ _ __.__ _...._____ X
• b
a
FiGURE 6.1-1
~e derivative is the slope of the tangent line
Differentiability of/ : JR
m such that
r
x~~
-+
JR at x0 is equivalent to the existence of a number
1/(x) - /(Xo) - m(x - xo)I
lx-xol
0
= ·
The function T(s) = ms is linear: ..T(as + /3t) = aT(s) + /3T(t) for all real a, /3.
s, and t. To generaiize this notion to maps/ : A c !Rn -+ IR"', we make the
following definition.
: :4 c lRn -+ R"' is said to be differentiable at
if there is a linear function, denoted D/(xo) : Rn -+ JR"' and called the
derivative off at XO, such that
6.1.1 Definition A mdp f
xo E A
r
x~~o
11/(x) - /(Xo) - Df(xo)(x - xo)II 0
llx - xoll
= ·
In this definition, D/(xo)(x - x0) denotes the value of the linear map D/(xo)
applied to the vector x - xo E Rn, and so D/(xo)(x - Xo) E R"'. We often write
D/(xo) • h foe D/(xo)(h).
We shall be concerned almost exclusively with the Euclidean case, but note
right here that if V, W are nonned spaces and f : A ·c V -+ W, then D/(xo) :
V -+ W, and the definition makes sense.
§6.1 Definition of the Derivative
6
329
The definition may be rewritten by saying that for every e
EA and JJx - xoll < 6 implies
>
0 there is a
> 0 such that x
llf(x) - f(xo) - Df(xo)(x - xo)IJ ::; eJJx - xoJJ.
In 6.1.1 it is implicit that x (a xo, but in this e-6 formulation, we can allow
x = xo, since then both sides reduce to zero.
Intuitively, x 1--+ f (xo) + Df(xo)(x - xo) is supposed to be the best affine
approximation to f near the point xo, (An affine mapping is a linear mapping
plus a constant.) See Figure 6.1-2. In this figure we have indicated the equations
of the tangent planes to the graph of/.
Y =f(:xo) + Df(xo) (x-Xo)
\
...
(a) /: R • IR
FIGURE 6.1-2
(b)
f:
IR. 2
•
(a) f: JR---+ JR; (b) f: IR2
IR
---+
JR
If f is a function of one variable and if we compare the definitions of Df(x)
and df/dx =f'(x), we see that Df(x)(h) =f'(x) • h (the product of the numbers
f' (x) and h E JR). Thus the linear map D/(x) is just multiplication by dfJdx, in
this case. If f is differentiable at each point of A, we say that/ is differentiable
on A. We expect intuitively (as in Figure 6.1-2) that there can be only one best
linear approximation. This is in fact true if we assume that A is an open set.
6.1.2 Theorem Let A be an open set in !Rn and suppose f : A
differentiable at x 0 • Then DJ(xo) is uniquely determined by f.
---+
!Rm is
Chapter 6 Differentiable Mappings
330
6.1.3 Example Let f : JR -+ IR.. /(x) = x3. Compute Df(x) and df / dx.
Solution From one-variable calculus, dx3 / dx = 3x2. Thus, in this example,
D/(x) is the linear mapping
·h
H
Df(x) · h = 3x2h.
•
6.1.4 Example Show that, in general, Df •is not uniquely determined.
Solution For example, if A = {.to} is a single point, any D/(Xo) will do,
because x e A, llx - xoll < 6 holds only when x = Xo, in which case the
expression
11/{x) -
/(Xo) - DJ(xo)(x -
xo)I I
is zero. The definition is thus fulfilled in a trivial way.
•
Note.
If the proofofTheorem 6.1.2 is examined closely, one sees that Df(x)
is unique (assuming it exists) on a wider range of sets than open sets. For
example, the theorem is valid for closed intervals in IR. or generally for closed
disks in Rn.
Exercises for §6.1
1.
Compute D/(x) for/ : IR.
2.
Prove that D(/ + g) =DJ+ Dg.
3.
Let A = {(x,y) e IR. 2 I O :5 x :5 l,y = O}. Prove that the conclusion of
Theorem 6.1.2 is false for this A.
4.
Let f : Rn -+ JR"' and suppose there is a constant M such that for x E
IR.n, 11/(x)I I :5 Ml lxll2. Prove that / is differentiable at xo = 0 and that
D/(Xo) = 0.
S.
If/ : IR. -+ JR is differentiable and 1/(x)I :5 Ix!. must D/(0) = O?
-+
IR. defined by /(x) = x sinx.
331
§6.2 Matrix Representation
§6.2 Matrix Representation
In addition to the definition of Df(x), there is another way to differentiate a
function f of several variables. We write f in components, j(x1 , ••• , Xn) =
(J,(x,, ... ,xn), ... ,fm(x,, ... ,xn)), and compute the partial derivatives 8fi/8x;
for j = I, ... , m and i = 1, ... , n, where the notation 8/j / Bx; means that we
compute the usual derivative of h with respect to x; while keeping the other
variables x,, ... ,Xi-1, X;+1, ••• ,Xn fixed.
6.2.1 Definition 8/j/ 8x; is given by the following limit, when the limit exists:
8/j(
~
~
)
. {fj(x1, ... ,x;+h, ... ,Xn)-/j(x1,••·•Xn)}
I
.
~o
z
x,, ... ,Xn = 1,m
In §6.1 we saw that Df(x) for/ : JR -+ JR is just the linear map "multiplication
by df / dx." This fact can be generalized to the following theorem.
6.2.2 Theorem Suppose A c lRn is an open set and f : A -+ lR"' is differentiable on A. Then the panial derivatives 8/j/ox; exist, and the matrix of the
linear map Df(x) with respect to the standard bases in lR.n and JR"' is given by
8J,
ax,
8/,
OX2
8ft
8xn
8/i
ax,
8/i
8x2
8/i
8xn
8fm
ax,
8/m
8x2
8/m
OXn
where each partial derivative is evaluated at x = (x 1, ••• , Xn). This matrix is
called the Jacobian matrix off or the derivative matrix.
In doing specific computations, we can usually compute the Jacobian matrix easily, and Theorem 6.2.2 gives us DJ. In some books, D/ is called the
differential or the total derivative off.
Chapter 6 Differentiable Mappings
332
When m = I, f is a real-valued function of n variables. Then
matrix
(:~
DJ has
the
:~)
...
and the derivative applied to a vector e = (a 1, ... ,an) is
8/
I: 8-a;.
n
D/(x) · e =
i=I
X,
It should be emphasized that DJ is a linear mapping at each x E A and the
definition of Df(x) is independent of the basis used. If we change from the
standard basis to another basis, the matrix elements will change. If one examines
the definition of the matrix of a linear transformation, it can be seen that the
columns of the matrix relative to the new basis will be the derivative D/(x)
applied to the new basis in Rn with this image vector expressed in the new basis
in Rm. Of course, the linear map Df(x) itself does not change from basis to
basis. In the case m = 1, D/(x) is, in the standard basis, a 1 x n matrix, as just
shown. The vector whose components are the same as those of Df(x) is c_alled
the gradient off and is denoted grad f or 'vf. Thus, for/ : A C JR.n --+ JR,
grad/=
(of,···,
of).
ax,
OXn
(Sometimes it is said that grad f is just DJ with commas inserted!) To give an
intrinsic definition of the gradient requires the use of the inner product. In fact,
'v/(x) is the vector such that for any v E Rn,
('v/(x), v) = D/(x) • v.
An important special case occurs when f = L is already linear. From the
definition of the derivative, we see that DL(.xo) = L, as expected, since the best
affine approximation to a linear map has the linear map itself as the linear part
(see Example 6.2.4). Thus, the Jacobian matrix of L is the matrix of L itself in
this case. Another case of interest is a constant map. Indeed, one sees that a
constant map has derivative zero; zero is the linear map K : Rn --+ JR.m such that
K(x) = 0 = (0, ... , 0) for all x E R".
6.2.3 Example
Compute DJ.
Let f : R 2
--+
lR.3 be defined by f(x,y) = (x2,x3y,x4y2).
§6.2 Matrix Representation
333
Solution According to Theorem 6.2.2, Df(x,y) is the linear map whose matrix is
fJ/1
ox
0/1
{)y
ah
ah
ox
{)y
0/3
8/3
oy
ax
•
where/1(x,y) = x2,.h(x,y) =~y. and/3(x,y) = J!y2.
6.2.4 Example Let L : Rn - Rm be a linear map; that is, L(x + y) =
L(x) + L(y) and L(ax) = crL(x). Show that DL(x) = L.
Solution Given .xo and e > 0, we must find 6 > 0 such that llx -
.xoll
<6
implies
IIL(x) - L(.xo) - DL(x) · (x - .xo)II ~ illx -
.toll-
But with DL(x) = L, the left side becomes
IIL(x) - L(xo) - L(x - .xo)II,
which is zero, since L(x- xo) = L(x)- L(x0 ), by linearity of L. Hence, DL(x) = L
satisfies the definition of the derivative (with any 6 > 0).
•
6.2.S Example Let f(x,y,z) = (xsiny)/z. Compute grad/.
Solution grad/= (of/ox,of/ay,af/oz), and here
8/
ox=
of
siny
-z-,
{)y
xcosy
= -z-,
of
xsiny
oz=-~,
so that
_ (siny , xcosy , _ xsiny) •
2
grad/(x,y,z ) -
z
z
z
•
334
Chapter 6 Differentiable Mappings
Exercises for §6.2
R.2 ,/(x,y,z) = (x4y,xel). Compute D/.
1.
Let/: JR. 3
2.
Let/: JR. 3 -+ JR, (x,y,z)
3.
Let L be a linear map of ]RR -+ lRm, let g : JRR --+ ]Rm be such that
Ilg(x)II ~ Mllxll2, and let f(x) = L(x) + g(x). Prove that 1?/(0) = L.
4.
Letf(x,y) = (xy,y/x). Compute DJ. Compute the matrix of D/(x,y) with
respect to the basis (I, 0), (1, I) in JR. 2 ,
5.
Discuss the possibility of defining D/ for f a mapping from one normed
space to another.
--+
1--+
ex2+y2+z2. Compute D/ and grad/.
§6.3 Continuity of Differentiable Mappings;
Differentiable Paths
It was shown in §4.7 that a differentiable function of one variable is continuous.
This is appealing intuitively, since having a tangent line (or plane) to the graph
is stronger than having no breaks in the graph. We recall the proof: Let f :
]a, b[ --+ lR be differentiable at xo. Then
. (/(x) - f(xo)) = hm
.
hm
x-+Xo
x-+Xo
(/(x)
/(xo))
- -- - · (x - xo)
x - x0
=/'(xo) · lim (x X-+X<)
xo) =/'(xo) · 0 =0
and so limx-+x0 (/(x) - /(xo)) = 0, which implies that/ is continuous at xo. This
argument generalizes to the case off : A c JR.R -+ lR.111 , as in the next proposition.
6.3.1 Proposition Suppose Ac JR.R is open andf: A-+ JR."' is differentiable
on A. Then f is continuous. In fact, for each xo E A, there are a constant M > 0
and a f>o > 0 such that llx - xoll < 60 implies 11/(x) - /(xo)II 5 Mllx - xoll(This is called the local Lipschitz property.)
Earlier, we examined the special case of real-valued functions, f : ]Rn -+ JR.
The case of a function c : JR -+ lR.m is also important. Here c represents a
§6.3 Continuity of Differentiable Mappings; Differentiable Paths
335
curve or path in IRm. In this case, Dc(t): JR-+ ]Rm is represented by the vector
associated with the single column matrix
(~· l
dcm
dt
where c(t) = (ci(t), ... ,c111 (t)). This vector is denoted c'(t) and is called the
tangent vector or velocity vector to the curve. If we note that
'( )
.
c t = 11m
h-0
c(t
+ h) - c(t)
h
and use the fact that (c(t + h) - c(t))/ h is a chord that approximates the tangent
line to "the cuive, we see that c' (t) should represent the exact tangent vector (see
figure 6.3-1). In terms of a moving particle, (c(t+h)-c(t))/ his an approxfmation
to the velocity, since it is displacement/time, and so c' (t) is the instantaneous
velocity.
z
X
FIGURE 6.3-1
The velocity vector of a curve
Strictly speaking, we should always represent c'(t) as a column vector, since
the matrix ofDc(t) is an m x I matrix. However, this is typographically awkward,
and so we often write c'(t) as a row vector.
Chapter 6 Differentiable Mappings
336
6.3.2 Example Let
c(I)
= (fl, t, sin t). Find the tangent vector to c(t) at
c(0) = (0, 0, 0).
Solution c'(t)
=
(2t, l,cost). Setting t
•
vector tangent to c(t) at (0, 0, 0).
= 0,
c'(0)
= (0, 1, 1),
which is the
Next we recall some single-variable ideas from §4.7:
6.3.3 Example Prove that f : JR -+ JR defined by x i---+ lxl is continuous but
not differentiable at 0.
=
=
Solution f(x) x for x 2: 0 and/(x) -x for x < 0, and so/ is continuous
.on ]0, oo[ and ]- oo, 0[. Since lim.r-o/(x) 0 /(0), f is also continuous at 0,
and so f is continuous at all points. Fin~ly, f is not differentiable at 0, for if it
were,.
lim f(x) - /(0) = lim f (x)
.r-+O X - 0
.r-+O X
would exist. But for x
•
cannot exist.
= =
> 0, f(x)/ x is +1 and for x < 0 it is -1, and so the limit
6.3.4 Example Must the derivative of a function be continuous?
Solution The answer is no, but an example is not obvious. An example is
the function
·
f(x) = x2 sin
and
~
X
if x
i
0,
f (X) =0 if X =0.
See Figure 6.3-2.
To demonstrate differentiability at zero, we shall show that
f(x)
-
X
-+
0 as
X -+
0.
Indeed, lf(x) / xi = Ix sin(l / x)I ::; lxl -+ 0 as x -+ 0. Thus, f' (0) exists and is
zero. Hence, f is differentiable at 0. By differentiation rules of one-variable
calculus,
. -1 -. COS -1 if X
/ '(X ) = 2x· SID
X
X
~
0.
337
Supplement to §6.3
y
\
\
\
\
\
I
I
I
I
I
f(x) = x 2 sin(l/x)
,,,,
/
/
,I
,I
,I
,I
I
,I
FIGURE 6.3-2
This function is differentiable, but the derivative is discontinuous at 0
As x -+ 0, the first term tends to O but the second term oscillates between + I
and -1, and so lim.r-o/'{x) does not exist. Thus,/' exists but is not continuous.
•
Supplement to §6.3
Karl Weierstrass gave an example of a continuous functionf(x) of one variable
that has no derivative anywhere (see Exercise 20, Chapter 5). Roughly speaking, this function superimposes infinitely many corners like those in Example
6.3.3. One can also ask if such functions arise in any "practical'' context. Surprisingly, the answer is yes. One place in which "nowhere differentiable" paths
are important is in the study of Brownian motion, the erratic motion of small•
particles suspended in a fluid such as water or air. The American mathematician
Norbert Wiener developed an effective model for studying this motion involving
integration over a "space" of paths. In this method "most" continuous paths are
not differentiable anywhere and have no tangent vectors.
Another context in which nowhere differentiable paths occur is in the study
of fractals and dynamical systems. For instance, fix a complex number c and let
fc : lR 2 ~ C -+ C be defined by fc(z.) z.2 + c. Let J';°
z. IJ;(z.) -+ oo as n -+
oo}, where f': means fc composed with itself n times, and let Jc = bd(J':°). For
c = 0, it is clear that le is a circle, but for c 'F O (but close enough to zero) it
is known that Jc is a continuous but nowhere differentiable curve! One calls Jc
the Julia set of fc; it is shown in Figure 6.3-3.
=
={
338
Chapter 6 Differentiable Mappings
(b)
(a)
(a) The Julia set of/(z) = z2 +½i, which is a simple closed curve
but is nowhere differentiable; (b) the Julia set of/(z) = z2 - 1,
which contains infinitely' many closed curves
FIGURE 6.3-3
The set of c for which le is path-connected is called the Ma,idelbrot set,
shown in Figure 6.3-4. (The Mandelbrot set is also the set of c for which zero
does not go to infinity under iteration of le-this is not obvious, however.) A
deep theorem of Douady and Hubbard is that the Mandelbrot set is connected.
This is an example of a very complicated set whose connectedness is far from
obvious-computer magnification of this set shows how uuly subtle it is. For
more information, references, and the reasons this subject has some practical
importance, see R. Devaney, An Introduction to Chaotic Dynamical Systems
(Addison-Wesley, Reading, M~·ss., 1985), and references therein.
Exercises for §6.3
=0 if x
1.
Let f(x) = x2 if x is irrational and let f(x)
continuous at 0? Is it differentiable at 0?
2.
Is the local Lipschitz condition in Theorem 6.3.1 enough to guarantee
differentiability?
3.
Must the derivative of a continuous function exist at its maximum?
4.
Let f(x) x sin(l / x), x 'f:. 0, and f(0)
differentiability off at 0.
S.
Find the tangent vector to the curve c(t) = (3r, e', t + i2) at the point
corresponding to t = I.
=
=0.
is rational. Is /
Investigate the continuity and
§6.3 Supplement to §6.3
FIGURE 6.3-4
339
The Mandelbrot set (reprinted by pennission, from B. B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, New
York, 1983)
Chapter 6 Differentiable Mappings
340
§6.4 Conditions for Differentiability
Since the Jacobian matrix provides- an effective computational tool, it would be
useful to know if the existence of the usual partial derivatives implies that the
derivative DJ exists. This is, unfortunately, not true in general. For example,
take f : R. 2 -+ R defined by f(x,y) = x when y = 0, f(x,y) = y when x = 0, and
f(x,y) = 1 elsewhere. Then of/ox and of fay exist at (0, 0) and are equal to 1.
However, f is not continuous at (0, 0) (why?), and so the derivative DJ cannot
possibly exist at (0, 0). See Figure 6.4-1. (See the examples and exercises for
more exotic examples.)
y
f(x.y)=y
/=1
/= I
f(x,y) =x
X
/=1
FIGURE 6.4-1
/= l
The partial derivatives of this function exist at (0, 0), but the
function is not continuous at (0, 0)
It is quite simple to understand such behavior. The partial derivatives depend
only on what happens in the directions of the x and y axes, whereas the definition
of DJ involves the combined behavior off in a whole neighborhood of a given
point. We can, however, assert the following.
6.4.1 Theorem Let A c Rn be open and f : A c R_n -+ R.111• Suppose
f = (/1, ... ,f,,,). If each of the partials ofi/ ax; exists and is continuous on A,
then f is differentiable on A.
§6.4 Conditions for Differentiabiliiy
341
The partial derivatives of a function measure its rate of change in the special
directions parallel to the axes. The directional derivatives do this in other
directions also.
6.4.2 Definition Let f be real-valued and defined in a neighborhood of
xo E an, and let e E an be a unit vector. Then
d f(
)I
+ te) - f(xo)
1. /(Xo
dt Xo + te l=O = ,~
---,--is called the directional derivative off at xo in the direction e.
From this definition, we see that the directional derivative is the rate of
change off in the direction e and, for n = 2, gives the slope of the graph off
in that direction; see Figure 6.4-2.
z
FIGURE 6.4-2 Slope of l = tan (J = directional derivative
We claim that the directional derivative in the direction of e equals Df(xo)·e.
To see this, consider the definition of Df(x0 ) with x = xo + te: Given c > 0,
11/(xo + te~ - f(Xo) -
Df(Xo).
ell S cllell
Chapter 6 Differentiable Mappings
342
if ltl is sufficiently small. Thus, if J is differentiable at Xo, then the directional
derivatives also exist and are given by
. f(xo + te) - J(xo) 01.,.( )
l1m
- - - - - - = 'J xo · e.
t
1-+0
In particular, observe that {)J/ Ox; is the derivative of/ in the direction of the ith
coordinate axis (with e e; (0, 0, ... , 0, 1, 0, ... , 0)).
For a function/ : JR 2 -+ JR, the directional derivatives D/(x0 ) • e can be used
to determine the plane tangent to the graph of/ (compare Figure 6.1-2). That
is, the line l given by z = f(xo) + D/(xo) · re is tangent to the graph of/, since, as
in Figure 6.4-2, D/(xo) • e is the rate of change off in the direction e. Thus, the
tangent plane to the graph off at (xo.f(xo)) may be described by the equation
= =
z = f (xo) + Df(xo) · (x - xo)
(see Figure 6.4-3). Since we have not defined the notion of the tangent plane to
a surface, we shall adopt this equation as a definition of the tangent plane.
z
z
Xo
FIGURE 6.4-3
=/(xo) + D/(xo)
= (x1,o, . . .
· (x-Xo)
, x•. o)
The tangent plane to the graph of a function
6.4.3 Example Show that the existence of all directional derivatives at a
point need not imply differentiability.
§6.4 Conditions for Differentiability
343
y
FIGURE 6.4-4
The existence of directional derivatives does not force differentiability or even continuity
Solution Define a function/: IR 2 -+ IR by f(x,y) = I if O < y < x2 and by
f(x,y) = 0 otherwise. Then/ is identically O along the horizontal and vertical
axes. Any other line through the origin also stays in the region in which f = 0
for some short distance on both sides of the origin. Thus f is constantly O on
some interval on both sides of the origin along any such line. This shows that
the directional derivative at the origin in the direction of that line is 0. All
directional derivatives exist and are O at the origin, but the function is not even
continuous, much less differentiable, at the origin. See Figure 6.4-4.
A slightly more complicated but similar example is the function/: IR 2 -+ IR
defined by
f(x,y) = 2xy
if x2 t- -y
X +y
and
f(x,y) =0 if x2 = -y.
Letting e = (e 1, e2 ), where e2
1
t
f. 0,
I t2e1e2
e,e2
= - 2- 2 2
r t e1 + te2 re 1 + e2
-f(te1, te2) = -
-+
e,
as t -+ 0; the case e2 = 0 gives zero. Thus each directional derivative exists at
(0, 0). However,! is not continuous at (0, 0), since for x2 near -y with both x, y
llmall,/ is very large. (We leave the details to the reader.)
•
This example shows that existence of all directional derivatives would not
be a convenient definition of differentiability, since it would not even imply
Chapter 6 Differentiable Mappings
344
continuity. This is the reason one adopts the more restrictive notion in Definition 6.1.1.
6.4.4 Example Let f(x,y) = x2 + y. Compute the equation of the plane
tangent to the graph off at x 1, y 2.
=
=
Solution Here Df(x,y) has matrix
of of)
( ox'
8y
= (2x, l),
and so Df( 1, 2) = (2, 1). Thus, the equation of the tangent plane becomes
z
=3 + (2, 1) (
;
=; ) =
3 + 2(x - 1) + (y- 2);
that is,
•
2x+y-z=l.
Exercises for §6.4
1.
Use Theorem 6.4.1 to show thatf(x,y) defined by
(xy)2
f(x,y) = ~ • (x,y) ~ (0,0)
yx2 +y2
and
f(x,y)
=0,
(x,y)
=(0,0)
is differentiable at (0, 0).
2.
Investigate the differentiability of
f(x,y) =
xy
~
yx2 +y2
at (0, 0) if f (0, 0) = 0.
3.
Find the tangent plane to the graph of z = x2 + y2 at (0°1 0).
4.
Find the equation of the tangent plane to z = x3 + y4 at x = 1, y = 3.
S.
Find a function/ : IR 2 -+ JR that is differentiable at each point but whose
partials are not continuous at (0, 0).
§6.5 The Chain Rule
345
§6.5 The Chain Rule
As we recalled in §4.7, one of the most important techniques of differentiation
is the chain rule ("function of a function" rule). For example, to differentiate
(.x3 + 3) 6 , let y = x3 + 3 and first differentiate y6, getting 6y5, and then multiply
by the derivative of x 3 + 3 to obtain the final answer 6(.x3 + 3)5 3x2. There is a
similar process for functions of several variables. For example, if u, v, and f
are real-valued functions of two variables, then
8
of au of av
a/<u<x.y). v<x,y)) = au ax+ av ax·
The following theorem includes all of these as special cases.
~.5.1 Chain Rule Let A C ]Rn be open and f : A -+ ]Rm be differentiable
at XO E A. Let B C ]Rm be open, f (A) C B, and g : B -+ RP be differentiable
at f(xo). Then the composite go f is differentiable at xo and D(g o /)(X-O) =
Dg(f(xo)) o Df(xa).
Note that the right-hand side is well defined, because Df(X-O) : ]Rn - ]Rm
and Dg(f(X-O)) : ]Rm - ]RP and so their composition as linear transformations is
defined. The chain rule is also called the composite Junction theorem, since it
tells us how to differentiate composite functions.
Since the product of two matrices corresponds to the composition of the
corresponding linear maps they represent, the chain rule can be rephrased by
saying that the Jacobian matrix of go f at x = (x 1, ••• , Xn) is the product of the
Jacobian matrix of g evaluated at f(x) with the Jacobian matrix off evaluated
at x (in that order). Thus, if h =go f and y-::= f(x), then
8g1
0)'1
8g1
oyn;
8f1
OX1
8f1
OXn
ogp
8y1
8gp
oym
8fm
OX1
ofm
OXn
Dh(x)=
where the og;/ oyi are evaluated at y = f(x) and the ofi/ oxi at x. Writing this
out, we obtain, for example,
Chapter 6 Differentiable Mappings
346
This situation occurs when we "change variables." For example, supposef(x, y) is
a real-valued function and x = rcos 8, y = r sin 8 define the new polar coordinate
variables r, 9. Fonn the function
h(r,9) =f(rcos9,rsin9).
Then
8h 8/
8/ .
- =-cos9+-smfl
8r ax
8y
and
oh
80
8/ .
8/
= - oxrsm0 + fJyrcos9.
The reader should derive similar fonnulas for spherical coordinates (r, <p, 9),
where x = rcos0sin<p, y = rsin9sin<p, z = rcos<p (spherical coordinates are
discussed further in §9.5).
Another illustration may clarify matters. Suppose we are given functions
u(x,y), v(x,y), w(x,y), andf(u, v, w) and we fonn the function h(x,y) =f(u(x,y),
v(x,y)', w(x,y)). The chain rule gives
oh
of au of 8v of OW
-=--+- -+- -.
8x
au ax
av OX
aw ax
The idea behind the proof of this formula (as an illustrative case) is as follows.
Write
h(x + At,y) - h(x,y) _ J(u(x + At,y), v(x + At,y), w(x + At,y))
At
-
At
f(u(x,y), v(x + At,y), w(x + At,y))
At
+
f(u(x,y), v(x + At,y), w(x + At,y))
At
.
f(u(x,y), v(x,y), w(x + At,y))
At
f(u(x,y), v(x,y), w(x + At,y))
+
Ax
f(u(x,y), v(x,y), w(x,y))
Ax
Usingf(u+~. v, w)-f('u, v, w)
is approximately
R:J
~of /8u, we find that the preceding expression
8/ ~
of Av
of Aw
--+--+-OUA.x 8vax
.1x·
aw
347
§6.S The Chain Rule
Thus, letting ax -+ 0 gives the fonnula. Can you pinpoint what items in this
argument need clarifying for the proof to be complete?
6.5.2 Example
v
=sinx, w =e~
Verify the chain rule for f(u, v, w) = u2v + wv2 and u = xy,
Solution Here h(x,y) =f(u(x,y), v(x,y), w(x,y)) is given by
h(x,y) = x2y2 sinx + ex sin2 x,
and so, directly,
:: = 2xy2 sinx + x2y2 cosx + ex sin2 x + ex 2sinxcosx.
On the other hand, the chain rule gives
of 8u + 8f ov + 8f ow
8u 8x 8v ax aw ax
= 2uv ou + u2 ov + 2wv 8v + v2 ow
ax
ax
. 8x
ax
which is the same result. The formula for aJi/8y can be checked similarly.
•
6.5.3 Example Letf: JR-+ JR and F: lR.2 -+ JR be given by F(x,y) =f(xy).
Verify that
8F
8F
x-=y-
8x
8y ·
Solution By the chain rule,
8F
,
- =f (xy)y
ox
and so the statement is clear.
and
8F
,
oy =f(xy)x,
•
6.5.4 Example Suppose that 'Y : [O, I] -+ ]Rn is a differentiable parametrization of a curve in ]Rn and that f is a differentiable function from ]Rn into JR. Let
h(t) = f(-y(t)). Show that h'(t) = ('vf(-y(t)),-y'(t)) and explain how this generalizes the formula obtained earlier for directional derivatives. (See Figure 6.5-1.)
348
Chapter 6 Differentiable Mappings
R"
R
FIGURE 6.5-1
A function defined along a parametrized curve
I
Solution Since h(t) =(Jo -y)(t), the chain rule gives
It'< f>= [ llof
vX1
of
'll
vX2
1 "'' 1
of ] .[ :~;~
ll
vXn
;
x,.(r)
= of x~ (I)+ of ~(I)+ .•. + of X,.(t).
ox1
ox2
OXn
All the partial derivatives are evaluated at the point -y(t) = (x 1(t),x2(t), ... ,xn(t)).
This is exactly the same as the inner product of the gradient off that point
with the velocity vector -y'(t) at that point. The directional derivative of/
at the point Vo in the direction of the unit vector e = (a 1,a2, •.. ,an) is the
special case in which the curve is the straight line ,(r) = v0 + te. In that case
x1(t) = 'Yk(t) = Vo.t + tat. Thus, we recover the same fonnula as before:
at
{"v/(,(0)), ,' (0)} = {"vf(vo), e}
since ,(0)
=vo, and -y'(r) =e for all t.
•
Exercises for §6.5
1.
Write out the chain rule for
h(x,y,z) ':=/(u(x,y;z), v(x,y), w(y,z)).
349
§6.6 Product Rule and Gradients
2.
Verify the chain rule for
u(x,y,z)=xe>;
v(x,y,z)=(sinx)yz
and
f(u, v) = u2 + vsinu
with
h(x,y,z) = f(u(x,y,z), v(x,y, z)).
3.
Let F(x,y) =/(x2 + y2). Show that x([}Fjay) = y(8F/8x).
4.
Write out the chain rule relating rectangular coordinates to spherical coordinates in three dimensions.
S.
Let/ : IR
JR and F: lR. 2 --+ IR be differentiable and satisfy F(x,f(x)) = 0
8F/ax
and 8Fj[Jy ;6 0. Prove that/'(x) = - [)Fj[Jy where y =J(x).
--+
§6.6 Product Rule and Gradients
Another well-known rule of differential calculus is the product rule, or Leibniz
rule.
6.6.1 Proposition Let A c Rn be open and let f : A --+ JR"' and g : A -+ JR
be differentiable functions. Then gf is differentiable, and for x E A, D( gf)(x) :
]Rn-+ ]Rm is given by D(gf)(x) · e = g(x)(D/(x)·e)+(Dg(x) · e)f(x) for all e E lRn.
(This makes sense, because g(x) E JR and Dg(x) • e E JR.)
We sometimes abbreviate this result by saying that
D(gf) = gDf + (Dg)f,
but the precise meaning is as stated in the proposition. The reader is already
familiar with the product rule from §4.7 and elementary calculus. In terms of
components: the proposition is in fact just the usual product rule:
[)
[JJ1,.
[Jg
X;
I.IX;
I.IX;
-{) ( gfk:) = g-;:;- + -;:;-ft.
Chapter 6 Differentiable Mappings
350
For quotients, we have a similar result. If g
D (~)
'F 0, then
= gDJ ~ 2(Dg)J _
To prove this formula, it suffices, by Proposition 6.6.1, to demonstrate it for the
case 1/g. This reduces it to a problem in elementary calculus, which we asked
the reader to deal with in §4.7.
Other rules of differentiation are encompassed in the statement that D is
linear; that is, that D(f + g) DJ+ Dg and D(..\f) ..\DJ for ..\ E JR, a cons~t.
Let us consider the geometry of gradients a little further. LetJ : A c !Rn -+ JR
be differentiable. The gradient is
=
=
gradJ(x) = (:~, ... , :{) ·
Hence the directional derivative in the direction h is the inner product
DJ(x) · h = (grad J(x), h}
= rate of change of J at the point x in direction h.
Consider a "surface" S defined by J(x) = constant. We will now show that
gradJ(x) is orthogonal to this surface (our argument will gloss over the precise
nature of this surface-this will be dealt with in §7.7). To prove this, consider
a curve c(t) in S with its tangent vector c' (0) where c(0) = .xo. We assert that
(grad /(.xo), c' (0)) = 0.
Since c(t) E S, f(c(t)) = constant in t. Differentiating and using the chain rule,
we get
DJ(c(t)) · c'(t) = 0.
Setting t = 0, and using Df(x) • h = (gradf(x), h), gives the desired relation. See
Figure 6.6-1.
We may describe the plane tangent to the surface S defined by f (x) = constant
at a point xo on S by (gradJ(.xo},x~ xo) = 0, since grad/(xo) is orthogonal to S.
It is also evident from the equation
(grad/(xo), h) = llgrad /(xo)II cos 0
(where 11h11 = 1 and 0 is the angle between grad /(xo) and h) that the vector
grad / (xo) points in the direction in which f is changing the fastest. It is not
unreasonable, because if we suppose that J represents the height function of
a mountain, then f = constant is a level contour. To climb or descend the
§6.6 Product Rule and Gradients
351
s
FIGURE 6.6-1
The gradient is orthogonal to level sets
polygonal approximation path
for an optimization problem
FIGURE 6.6-2
Direction of steepest ascent is orthogonal to the level contours
mountain as quickly as possible, we should walk perpendicular to the level
contours (Figure 6.6-2).
These facts can be of value in optimization problems. In such problems, one
is given a function f(x 1, •• • , Xn) and the problem is to maximize or "optimize"
/ by some practical scheme. A common method is to take a trial point x0 and
proceed along a straight line in the direction of the gradient off to reach a new
point at which f will be larger (at least if we do not go too far), and repeat.
Chapter 6 Differentiable Mappings
352
6.6.2 Example Find the normal to the surface x2 + y2 + z2 = 3 at (1, I, 1).
Solution Here/(x,y,z) = .x2 +y2 +z2 has gradient grad/= (2x,2y,2z), which
at (1, 1, 1) is (2,2,2). Nonnalizing, a unit nonnal is (1/./3, 1/./3, 1/./3). •
6.6.3 Example Find the direction of greatest rate of increase of f(x,y, z) =
x2y sin z at (3, 2, 0).
Solution The required direction is that of the gradient vector, which is
(2xy sinz,x2 sin z,x2y cosz), which becomes (0,0, 18) at (3, 2,0). Nonnalizing,
•
we get the unit direction vector (0,0, I).
6.6.4 Example What is the tangent plane to the surface x2 - y2 + xz = 2 at
(1, 0, l)?
Solution Here grad/(1, 0, 1) = (3,0, 1), and so the tangent plane has equation
((x- I,y,z- 1),(3,0, 1)) = 0, that is, 3x+z = 4.
•
Exercises for §6.6
1.
Prove that
ddt f(xo + th)I /={)
= Df(xo) · h
by using the chain rule, wheref: R.m-+ R.n.
x2 - y2 + xyz = 1 at (1, 0,
2.
Find a unit nonnal to the surface
3.
Find the equation of the tangent plane to the surface i2
(1, 0, 1).
4.
In what direction is J(x,y) = ex2 y increasing the fastest?
5.
Letf: Rn-+ R., g: R.n-+ R.. Show that grad (fg) = f grad g + g grad/.
6.
Show that grad f, as the nonnal to the tangent plane, is a more general
description of the tangent plane than the description in §6.4.
1).
- y2 + xyz = I at
§6.7 The Mean Value Theorem
353
§6.7 The Mean Value Theorem
Two imponant calculus theorems are the mean value theorem and Taylor's theorem. First, let us turn our attention to the mean value theorem. In §4.7
we recalled the proof of the mean value theorem of elementary calculus, which
stated that if/ : [a, b] - JR is continuous and if/ is differentiable on ]a, b[, there
exists a point c E ]a,b[ such that/(b) - /(a) =J'(c)(b - a), where/'= df /dx.
Unfortunately, for f : A C lRn - lRm, this version of the mean value theorem,
naively interpreted, simply is not true. For example, consider f : lR - JR 2,
defined by f(x) = (x2, x3). Let us try to find a c such that O ::; c :::; 1 and
f(l) - /(0) D/(c)(l - 0). This means that (1, 1) - (0, 0) (2c, 3c2), and thus
2c = 1 and 3c2 = 1. Evidently, there is no c satisfying these equations.
Experience leads us to believe that some restrictive condition might provide
a valid theorem. The preceding·yersion will in fact hold if f is real-valued. To
give the correct theorem, let us first make precise the meaning of "c is between
x and y" for c,x,y E JRn. We say c is on the line segment joining x and y, or is
between x and y, if c = (1 :- >.)x + ,\y for some O ::; ,\ ::; 1, as in Figure 6.7-1.
=
=
y
,,.;,~>)lly-xl
x
~>..Uy-xii
FIGURE 6.7-1
The point c lies on the segment joining x and y
6.7.1 Mean Value Theorem
i.
Suppose f : A C IR.n - JR is differentiable on an open set A. For any
x,y E A such that the line segment joining x and y lies in A (which need
not happen for all x,y), there is a point con that segment such that
J(y) - f(x) = Df(c) · (y - x).
ii.
Suppose f : A c Rn - IR."' is differentiable on the open set A. Suppose
the line segment joining x and y lies in A and f = (Ii, ... ,fm). Then there
exist points c,, . .. , c111 on that segment such that
f;(y)- f;(x)
=Df;(c;)(y- x),
i
=l, ... ,m.
354
Chapter 6 Differentiable M(lppings
An alternative formulation of the mean value theorem is given in Worked
Example 6.S at the end of the chapter.
6.7.2 Example A set A C JR" is said to be convex if for each x,y EA, the
segment joining x and y also lies in A. See Figure 6.7-2. Let A C IR." be an
open convex set and let f : A --+ IR.m be differentiable. If DJ = 0 on A, show that
f is constant on A. (Generalizations of this are given in Exercise 9 at the end
of the chapter.)
(a)
(b)
FIGURE 6.7-2 (a) Not convex; (b) convex
Solution If x,y E A, then for each component/;, there is a vector c; such
that
f;(y) - f;(x)
= Df;(c;)(y - x).
Since DJ = 0, DJ; = 0 for each i (why?), and so f;(y) = f;(x). It follows that
f(y) =f(x), which means thatf is constant.
•
6.7.3 Example Suppose that f : [O, oo[ --+ IR. is continuous, j(0) = 0, f is
differentiable on ]O, oo[. and f1 is nondecreasing. Prove that g(x) = f(x)/x is
nondecreasing for x > 0.
Solution From §4.7 or directly from the mean value theorem, we see that a
function h : IR.--+ IR. is nondecreasing if h'(x) ~ 0, because x ~ y implies that
h(y) - h(x) = h'(c)(y - x) ~ 0.
Now
'( )
xf'(x) - f(x)
g X =---x2
§6.8 Taylor's Theorem and Higher Derivatives
355
and there is a c between O and x such that
f(x) = f(x) - /(0) = f' (c) · x $ xf' (x),
since O < c < x implies that/'(x) '?:.f'(c). Thus xf'(x)-f(x) '?:. 0, and so g' '?:. 0,
which implies that g is nondecreasing. •
Exercises for §6.7
1.
2.
3.
4.
If f : JR .:...+ JR is differentiable and is such that f' (x)
(strictly) increasing.
> 0,
prove that f is
'Prove the following (weak version of) l'HtJpital's rule: If/', g' exist at .to,
g'(xo) f 0, and/(xo) 0 g(xo), then lim.x-xa[/(x)/g(x)] =/'(xo)/g'(xo).
==
Use Exercise 2 to evaluate
a.
lim.x-o[(sinx)/x]
b.
lim.x-o[(e.x - 1)/x]
Which of the following sets are convex?
a.
{(x,y) E JR 2 I y '?:. O}
b.
{x E R" I O <
c.
JR\ {O}
llxll < I }
5.
Let f : A C JR" -+ JR be 'differentiable with A convex, and suppose
llgrad/(x)II $ M for x EA. Prove 1/(x) - /(y)I $ Mllx- YII for x,y EA.
Do you think this is true if A is not convex?
6.
Let f : JR -+ JR be differentiable. Assume that for all x E JR, 0 $ f'(x) $
f(x). Show that g(x) = e-xf(x) is decreasing. If f vanishes at some point,
conclude that/ is zero.
§6.8 Taylor's Theorem and Higher Derivatives
Next, we discuss Taylor's formula for functionsf: Ac JRn-+ Rm. To do this,
we first discuss derivatives of higher order. For f : JR" -+ JR, there is no problem
Chapter 6 Differentiable Mappings
356
defining partial derivatives of higher order; we just iterate the process of partial
differentiation
and so on. However, regarding the derivative as a linear map needs a little ~ore
care.
The second derivative is obtained by differentiating DJ, if it exists, and is
accomplished as follows.
6.8.1 Definition Let L(IRn,IR.m) denote the space of linear maps from !Rn
to IR.111 • (If we choose a basis in !Rn and IR.111, then L(IR.n, IR. 111 ) can be identified
with them x n matrices and hence with IRnm.) Now DJ: A--. L(Rn,IRm); that
is, at each x E A we get a linear map Df(x). If we differentiate DJ at Xo,
we get a linear map from !Rn to L(IRn,IRm), by definition of the derivative. We
write D(Df)(xo) D2f (.xo) and define the map B:ro : !Rn x IR.n --. !Rm by setting
Bzo(X1 ,x2) = [D 2/(xo)(xi)](x2).
=
This definition makes sense because D2/(.xo) : IR.n -+ L(IR.n, IR."'), and so
E L(IR.n,!Rnr); therefore it can be applied to x2 . The reason we do
this is that Bx0 avoids the unnecessary use of the conceptually difficult space
D2/(x0 )(xi)
L(IR.n' L(IRn' IR'")).
By definition, a bilinear map B : E x F --. G, where E, F, G are vector
spaces, is a map that is linear in each variable separately; for example, in the
first variable this means B(o:e1 + f3e2,f> = aB(e1 ,f) + /3B(e2 ,/), where e1, e2 E E,
f E F, and a:, f3 E R The map BX-O we have defined may be checked to be a
bilinear map of !Rn X !Rn - IR"'.
Given a bilinear map B : E x F -+ JR., we can associate a matrix with each
choice of basis e1, ... , en of E and /1, ... ,f,11 of F; that is, we can let
aij
If
= B(e;,fj).
n
x=
L x;e;
m
and y = LYjfj,
i=I
j=I
then
au
B(x,y) =~ aijXiYj =(x1, ... ,Xn) (
· 'J
:
an1
Notice that the matrix associated to B depends on the choice _of basis of E and F.
§6.8 Taylor's Theorem and Higher Derivatives
357
Note. For the second derivative, we shall, by abuse of notation, still write
D2/(x0 ) for the bilinear map Bx,, obtained by differentiating DJ at Xo as described in the preceding paragraphs.
6.8.2 Theorem Let f : A C Rn -+ R be twice differentiable on the open set
A. Then the matrix of D2J(x) : Rn x Rn-+ R with respect to the standard basis
is given by
where each partial derivative is evaluated at the point x = (x 1 , • •• , Xn).
For higher derivatives, we proceed in an analogous manner. For example,
-+ Rm' for each x. We do not
associate a matrix with this map, but rather the third-order derivatives labeled by
four indices: 8 3/t/ 8x18xj8X; for each component/". (Such quantities are closely
related to what are called tensors.)
Before proceeding with Taylor's theorem, we give an important property
of the second derivative. The' matrix in Theorem 6.8.2 is, under fairly mild
conditions, symmetric.
D3/ gives a trilinear map D3/(x) : Rn x Rn x Rn
6.8.3 Symmetry of Mixed Partials Let f : A -+ R"' be twice differentiable on the open set A with D2/ continuous (that is, with the functions
82/ / ox;OXj continuous). Then D2/ is symmetric; that is,
or, in terms of components,
Using this, it follows that all the higher derivatives are symmetric as well
under analogous conditions.
358
Chapter 6 Differentiable Mappings
The symmetry of second derivatives represents a fundamental property not
encountered in single-variable calculus. Let us verify the symmetry in a specific
example: Suppose/(x,y,z) = e%Ysinx+x2y4cos2 z, so that/: IR.3 --+ JR.. Then
!
= e:ry cosx + yexy sinx + 2xy4 cos2 z,
%=
xexy sinx + 4x2y3 cos2 z,
and
a21
oyax
a21
=xe:ry cos x + e:ry sin x + .xye:ry sin x + 8xy3 cos2 z =oxoy.
Theorem 6.8.3 is not so obvious intuitively. However, some intuition can be
gained from the main idea of the proof. For functions f(x, y ), we consider the
quantity
S = f(x + h,y + k)- f(x + h,y) - f(x,y + k) + f(x,y),
which is the difference of the differences in Figure 6.8-1. The algebraic fact that
S can be written as the difference of the differences in two ways (horizontal and
then vertical, or vice versa) is the reason one has equality of the mixed partials.
y
(x, y
+ k)
(x, y)
(x +h, y
+ k)
(x+h,y)
X
FIGURE 6.8-1
S is a difference of differences
6.8.4 Definition A function is said to be of class C' if the first r derivatives
exist and are continuous. (Equivalently, this means that all partial derivatives
§6.8 Taylor's Theorem and Higher Derivatives
359
up to order r exist and are continuous.) A function is said to be smooth or of
class COO if it is of class C' for all positive integers r.
Iterative use of the chain rule can be used to show that the composite of C'
functions is also C'.
6.8.5 Taylor's Theorem Let f : A -+ JR be of class C' for A c !Rn an
open set. Let x,y EA, and suppose that the segment joining x and y lies in A.
Then there is a point c on that segment such that
1
1
L k!Dkf(x)(yx, ... ,y- x) + r!D'f(c)(y - x, ... ,y - x)
r-1
J(y) - J(x) =
k=I
where D~(x)(y - x, ... ,y - x) denotes Dkf(x) as a k-linear map applied to the
k-tuple (y - x, ... ,y - x). In coordinates,
L
n
Dkf(x)(y - x, ... ,y - x) =
.
(
.
•••···••k=l
8kf
8X1•.... 8X1.k
)
(y;. - x; 1 )
• • •
(y;k - x;k).
Setting y = x + h, we can write the Taylor formula as
l
J(x + h) =J(x) + DJ(x) · h + .. · + (r- l)!D' 1/(x) · (h, ... ,h) + R,-i(x,h)
where R,_ 1(x, h) is the remainder. Furthermore,
R,-i (x, h)
llhll'-1
-+
0 as h-+ 0.
Other forms in which the remainder term can be cast are given in the proof
of the theorem. This theorem is a generalization of the mean value theorem (in
which case r = l) and of Taylor's theorem encountered in one-variable calculus.
Note. The last two statements in Taylor's theorem require f to be only cr- 1,
as is seen by an examination of the proof.
Chapter 6 Differentiable Mappings
360
The basic idea of the proof of Taylor's theorem is to use the fundamental
theorem of calculus (§4.8) to write
f(x + h) - f(x) =
fo
1
:,
f(x + th)dt
and then to repeatedly integrate by parts, generating a series.
From Taylor's theorem, we are led to form the Taylor series about Xo,
I
L k!D
/(Xo)(x - Xo, ... ,x - Xo).
l
CX)
k=O
This series need not converge to f (x) even if f is C°°. If it does so in a neighborhood of xo, we say that f is real analytic at xo. Thus, a function f is real
analytic if the remainder term (1/ r!) D'f(c)(x - xo, ... ,x - xo) -+ 0 as r-+ oo.
For example, if ex, sinx, and cosx are defined in more traditional ways rather
than the power series approach we took in §5.4, then this method can be used
to establish the power series expansions.
6.8.6 Example Justify and then verify equality of mixed partials for f(x,y) =
yx2(cosy2).
Solution Since f is C°°, Theorem 6.8.3 guarantees equality of mixed partials.
To see this explicitly, we compute as follows:
of = 2xycosy2, -o2f
= 2xcosy2 -
-
ox
8/
{}y =x2cosy2-
oyox
2y2x2 siny2,
[J2f
4xy2 siny2;
{}x{}y =
2xcosy2 -4y2xsiny2.
•
6.8.7 Example If f is C°° on JR. and for every interval [a,b] there is a
constant M such that lfn)(x)I :5 Mn for all n and x E [a, b], slww that f is
analytic at each x0 and
~ fn>(xo) (
)n
f( X) = ~
- -1 -X-Xo •
n.
n=O
Solution Select b with -b < x0 < b. The remainder is
fn>(c) (.x - xotl < Mnlx - Xoln
l n!
-
n!
•
§6.8 Taylor's ·rheorein and Higher Derivatives
361
where c is between x and xo and lf">(c)I < M" on [-b, b]. This tends to 0 as
-+ oo, since by the ratio test, the corresponding series converges. Observe
that the convergence is uniform on all bounded intervals (why?).
•
n
6.8.8 Example Give an example of a COO junction that is not analytic.
Solution
Let
j(x)
=0
if X
=0,
and
/(x)
2 '
= e-l/z
if x 'f 0.
The only place where smoothness off is in doubt is at x = 0. At that point
we can use l'Hopital's rule to evaluate the deri~ative as the limit of a difference
quotient:
J'(0) = lim /(x) - /(O) = Iim !e-i/.r2 = 0.
z-o x-0
z-ox
(Supply the details for the application of l'Hopital's rule.) Similarly, for x '/; 0,
we havef'(x) = (2/x1)e-l/z2 and a similar application of l'Hopital's rule shows
that f"(0) exists and is 0. One proceeds inductively to show that f">(0) exists
and is O for each n > 0. Thus/ is COO. But the Taylor series for/ with center
at Xo = 0 is identically 0. This Taylor series does not converge to the value of
the function in any nontrivial neighborhood of 0, so/ is not analytic at 0. •
6.8.9 Example Compute the second-order Taylor formula for f(x,y) =
cos(x + 2y) around (0, 0).
Solut.ion
Here /(0, 0)
: (0, 0)
= 1,
=- sin(0 + 2 • 0) =0,
{J2f
.
axi (0,0) = - cos(0) = -1,
%
(0, 0) = -2 sin(O + 2 • 0) = 0,
021 ·
·
oy2 (0, 0) = -4 cos 0 = -4,
and
a21
oxoy (0,0)
a21
.
= oyox (0,0) =-2cos0 =-2.
Chapter 6 Differentiable Mappings
362
Thus
1
f(h, k) = 1 -
.
2(h2 + 4hk + 4k2) + R2((0, 0), (h, k)),
where
R2((0, 0), (Ii, k))
-+
0
ll(h,k)ll 2
as
'
•
(0 0)
(h k)
-+
'
·
Exercises for §6.8
1.
Verify the equality of mixed partials for J(x,y) = (ex2+ix)xy2.
2.
Use Example 6.8.7 to establish the Taylor series and analyticity of ex,
sinx, and cosx on all of IR, assuming we know their derivatives.
3.
Let f : JR
-+
JR be defined by
J(x) = x2 sin ( ~)
if x E }-1, l[,
X
;e0
and
f(x) = 0
if X = 0.
Investigate the validity of Taylor's theorem for f about the point x = 0.
4.
Find the Taylor series representa!ion about x = 0 for log(l - x), -1 < x <
I, and show that it equals log(I - x) on -1 < x < l, and also show that
it converges uniformly on closed subintervals of ]- 1, I[.
5.
Compute the second-order Taylor formula for J(x,y) = ex cosy around
(0,0).
6.
Verify that if the conditions in Example 6.8.7 are met, then we can differentiate the Taylor series term by term to obtainf'(x).
§6.9 Maxima and Minima
An important application of Taylor's theorem is the determination of maxima
and minima of real-valued functions. As we might expect from our knowledge
of functions of one variable, the criteria involve the second derivative. Let us
firsr recall the real-variable case.
§6.9 Maxima and Minima
363
If/: ]a, b[-+ JR has a local maximum or minimum at.xo and/ is differentiable
at Xo, then f' (x0 ) = 0. Furthermore, if f is twice continuously differentiable and
/"(.xo) < 0, then Xo is a local maximum, and if f"(xo) > 0, then xo is a local
minimum. To generalize these facts to real-valued functions of n variables, we
begin by giving the relevant definitions.
6.9.1 Definition Let f : A c Rn -+ JR where A is open. If there is a
neighborhood of Xo E A on which /(Xo) is a maximum, that is, iff(xo) 2: /(x) for
all x in the neighbor/wad, we say that xo is a local maximum point and f (Xo) is
a local maximum value for f. Similarly, we can define a local minimum off. A
point is called extreme if it is either a local minimum or a local maximum for
f. A point Xo is a critical point if f is differentiable at xo and if Df(xo) = 0.
The first basic fact is presented in the next theorem.
6.9.2 Theorem If f : A c JR.n -+ JR is differentiable, A is open, and xo EA
is an extreme point f, then Df(xo) = O; that is, Xo is a critical point.
The proof is much the same as for one-variable calculus. The result is intuitively clear, since at an extreme point the graph off must have a horizontal
tangent plane. However, just being a critical point is not sufficient to guarantee
that the point is extreme. For example, consider f (x) = x-3. For this function,
0 is a critical point, since Df(O) = 0. But x-3 > 0 for x > 0 and x-3 < 0 for
x < 0, and so O is not extreme. Another example is given by J(x, y) = y2 - x2.
Here O (0, 0) is a critical point, since of/ ax
2x, of/ ay
2y, and so
Df(0, 0) = 0. However, in any neighborhood of O we .can find points where f is
greater than O and points where/ is less than 0. A critical point that is not a local
extreme point is called a saddle point. Figure 6.9-1 shows how this tenninology
originated.
For/: A C JR-+ JR, we have already mentioned thatf(x) is a local maximum
value if/'(x) = 0 andf"(x) < 0. This is geometrically clear if we remember that
f"(x) < 0 means/ is concave downward (or that the slopes/'(x) are decreasing).
To generalize this, the concept of the Hessian of a function g at xo is introduced.
=
6.9.3 Definition If g : B c Rn
=-
=
-+ JR is of class C 2 , the Hessian of g
at Xo is defined to be the bilinear function HX-O(g) : Rn x Rn -+ JR given by
Hx0 (g)(x,y) = D2 g(xo)(x,y). Thus the Hessian is. as a matrix, just the matrix .of
second partials.
364
Chapter 6 Differentiable Mappings
y
y
maximum
saddle
FIGURE 6.9-1
Maximum, minimum, and saddle points
A bilinear form, that is, a bilinear mapping B : !Rn x !Rn -+ JR, is called
positive definite if B(x, x) > 0 for all x -:/ 0 in !Rn and is called positive semidefinite if B(x,x) ~ 0 for all x E !Rn. Negative definite and negative semidefinite
bilinear forms are defined similarly.
Now we can make the following generalization to the multivariable case.
6.9.4 Theorem
i.
If f : A C JR.n
ii.
If f has a local maximum at xo, then HX-O(J) is negative semidefinite.
-+ JR is a C 2 function defined on an open set A and xo is a
critical point off such that H,o<f) is negative definite, then f has a local
maximum at xo.
The case for minima is covered by changing "negative" to "positive" in
Theorem 6.9.4. Note that a minimum off is a maximum of -f.
The second-order Taylor polynomial defines the quadratic function that best
fits the graph of the function. The second derivative test then says that we
can determine the behavior at a critical point if the second-order term is not
degenerate. It is the same as that of the corresponding quadratic surface: a
paraboloid opening up or down, or a saddle surface (hyperbolic paraboloid).
§6.9 Maxima and Minima
365
The first two correspond to the cases in which the Hessian is positive definite
or negative definite, and the third to that in which it is indefinite.
As we have noted, the matrix of HX-O(f) with respect to the standard basis is
where the partial derivatives are all evaluated at xo.
If n = I, Theorem 6.9.4i reduces to the one-variable testf"(xo) < 0; one can
have a maximum or a saddle point or a minimum ifJ" (x0 ) = 0 (in this case, the
test fails). For example,f(x) = -x4 has a maximum, x5 a saddle point, and x4 a
minimum at xo = 0, although f"(O) = 0.
A few points from linear algebra will be helpful in using Theorem 6.9.4. Let
Ak be the determinant of the matrix
This is the matrix of the Hessian with the last n - k rows and columns removed.
The symmetric matrix Hx0 ( / ) is positive definite if! Ak > 0 fork= l, ... , n. We
shall not prove this here in general, but in· Example 6.9.5 we prove it for 2 x 2
matrices. There is also a criterion .for the negative definite case given in the
next paragraph. Thus, if Ak > 0 for k = I, ... , n, then f has a (local) minimum
at the critical point xo.
If a symmetric matrix is positive semidefinite, then Ak ~ 0, so that by 6.9.4ii
if Ak < 0 for any k, f cannorhave a maximum at x 0. Similarly, f has a (local)
minimum at x0 if Hx0 (f) is positive definite." By changing the sign of Hx,,(f) in
the paragraph above and using properties of determinants, it follows that Hx0 (f)
is negative definite iff Ak < 0 fork odd and Ak > 0 fork even, and that if HX-O(f)
is negative semidefinite, then Ak :S 0 for k odd and Ak ~ 0 for k even. Thus f
has a minimum at x 0 if Ak < 0 for k odd and Ak > 0 for k even. If Ak > 0 for
some odd k or Ak < 0 for some even k, then f cannot have a minimum value
at xo, In fact, if Ak < 0 for some even k, f can have neither a maximum nor a
minimum at x 0 , and x 0 must be a saddle point off (see Exercise 8 at the end of
this chapter for an instance of this).
366
Chapter 6 Differentiable Mappings
Note. This theorem is also useful in economics for optimizing quantities
such as profits. It is also used in mechanics when f is the potential of a
system, because then a minimum corresponds to stability and the maxima and
saddle points correspond to instability. (See Marsden and Tromba, Vector
Calculus, 3rd ed., W. H. Freeman, New York, 1988, Chap. 4, for details.)
6.9.5 Example Show that the matrix
is positive definite if! a > 0 and ad - b2 > 0.
Solution Positive definite means that
(x,y) ( :
: ) ( ; )
>0
if (x,y)
t- (0,0);
that is, ax2 + 2bxy + dy2 > 0. First, suppose this is true for all (x, y) F- (0, 0).
Setting y 0, x 1, we get a > 0. Setting y 1, we have ax2 + 2bx+ d > 0 for
all x. This function is a parabola with a minimum (since a > 0) at 2ax + 2b = 0.
That is, x = -b/a. Hence,
=
that is, ad - b2
=
> 0.
=
•
The converse may be proved in the same way.
6.9.6 Example Investigate the nature of the critical point (0, 0) of
f(x,y) = x2
- xy + y2.
Solution The partial derivatives are
{)J
{)x
=2x-y,
{)2J
{)x2
{)2J
= 2,
8xoy = -l,
8/
oy = -x+2y,
82/
{)y2
= 2,
367
Theorem Proofs for Chapter 6
and so the Hessian evaluated at x = 0, y = 0 is
( 2 -1)
-1
=
=
2
.
=
Here A1 2 > 0 and A2 4 - 1 3 > 0, and so the Hessian is positive definite.
Thus we have a local minimum.
•
Exercises for §6.9
1.
Prove that
is negative definite iff a
< 0 and ad - b2 > 0.
2.
Investigate the nature of the critical point (0, 0) of/(x,y) = x2+2xy+y2+6.
3.
Investigate the nature of the critical point (0, 0, 0) off(x, y, z) = x2 + y2 +
2z2 +xyz.
4.
(This exercise assumes a knowledge of linear algebra.) Let A be a symmetric matrix. Show that A is positive definite if and only if the eigenvalues
of A (which exist and are real, since A is symmetric) are positive. Is this
true if A is not symmetric?
5.
Check that the matrix
O)
( 0l O
0 0
0 0 -1
has A;
6.
~
0 yet the matrix is not semidefinite.
Determine the nature of the critical point (0, 0) of the function x3 + 2xy2 y4 + x2 + 3.xy + y2 + 10.
Theorem Proofs for Chapter 6
t,.1.2 Theorem Let A be an open set in IR.n and suppose f : A
differentiable at Xo- Then Df(xo) is uniquely determined by f.
-+
IR.m is
Chapter 6 Differentiable Mappings
368
Proof Let L1 and Li be 'two linear mappings satisfying conditions of the
definition of the derivative. We must show that L, Li. Fix e E Rn, llell 1,
and let x = xo + >.e for >. E R Then note that
=
!>.I = llx - Xoll
and
l1L1 · e - Li · ell =
=
IIL, · >.e - Li · >.ell
I-XI
·
Since A is open, x E A for ,\ sufficiently small. By the triangle inequality,
IIL1 •e-Li•ell = IIL,(x-XQ)-Li(x-xo)II
llx -
< llf(x) -
-
Xoll
J(Xo) - L1 (x - xo)II
llx-Xoll
llf(x) - /(xo) - Li(x - Xo)I I
+--------llx - xoll
As >. -+ 0, these two tenns each tend to 0, and so L1 · e =Li· e. Our selection
of e was arbitrary, except that !Jell= 1. But for any nonzero y E Rn, y/llYII = e
has length 1, and, by linearity, if L1(e) Li(e), then L 1(y) Li(y).
=
=
•
6.2.2 Theorem Suppose A c Rn is an open set and f : A -+ R"' is differentiable on A. Then the partial derivatives 8Jj/8x; exist, and the matrix of the
linear map Df(x) with respect to the standard bases in Rn and R"' is given by
8/1
8x1
8/1
OX2
8/1
OXn
afz
afz
afz
ox,
8x2
OXn
81,,.
OX1
8/m
8x2
ofnr
OXn
where each partial derivative is evaluated at x = (xi, ... ,xn)- This matrix is
called the Jacobian matrix off or the derivative matrix.
Proof By definition of the matrix of a linear mapping, the jith matrix element
of Df(x) is the jth component of the vector Df(x) • e; = Df(x) applied to the ith
369
Theorem Proofs for Chapter 6
standard basis vector, e;. Call this component aii• Let y = x + he; for h E JR and
note that
llf(y) - /(x) - Df(x) · (y - x)II
IIY -
xii
_ llf(x1, ... ,X; + h, ... ,Xn) - f(x1, ... , Xn) - hDJ(x) · e;II
lhl
Since this tends to O as h
and so, as h - 0,
-+
0, then so does the jth component of the numerator,
ljj(x1, ... , X; + h, ... , Xn) - Jj(x1, . .. , Xn) - haj;I
O
lhl
- .
Therefore,
ajj
. Jj(x1,, .. ,x;+h, ... ,Xn)-Jj(x1,,,.,Xn)
Oji= lIm
= -.
,,-o
h
OX;
•
6.3.1 Proposition Suppose A C !Rn is open andf: A -+ JRm is differentiable
on A. Then f is continuous. In fact, for each Xo E A, there are a constant M > 0
and a 80 > 0 such that llx - xoll < 60 implies llf(x) - j(xo)II $ Mllx - xoll(This is called the local Lipschitz property.)
For the proof, we recall that if L : !Rn - Rm is a linear transformation, then
there is a constant Mo such that III.xii $ Mollxll for all x E Rn (see Worked
Example 4.4 at the end of Chapter 4). Here we shall be taking L = Df(xo).
Proof To prove continuity, it suffices to prove the stated Lipschitz property,
since, given c > 0, we can choose 8 = min(6o, c/ M). To prove the Lipschitz
property, let c = 1 in the definition of the derivative. Then there is a 60 so that
llx - xoll < 80 implies
IIJ(x) - f(xo) - Df(xo)(x - xo)II $ llx - xoll,
which implies
IIJ(x) - f(xo)II $ IIDJ(xo)(x - xo)II + !Ix - xoll
(here we use the triangle inequality in the form IIYII - llzll $ IIY - zll, which
follows by writing y = (y - z) + z and applying the usual form of the triangle
inequality). Let M = Mo+ I and use the fact that IIDJ(xo)(x-xo)II $ Mollx-Xoll
to give the result.
•
Chapter 6 Differentia.ble Mappings
370
6.4.1 Theorem Let A C Rn be open and f : A C Rn - t ]Rm. Suppose
f = (Ji, ... .fm)- If each of the partials 8.[j/8xi exists and is continuous on A,
then f is differentiable on A.
Proof If DJ(x) is to exist, its matrix representation must be the Jacobian
matrix, by Theorem 6.2.2. We need to show that with x E A fixed, for any
c > 0 there is a 8 > 0 such that IIY- xii < 8, y EA implies
llf(y) - f(x) - Df(x)(y-x)II < cllY- xiiTo do this, it suffices, to prove this for each component off separately (why?).
Therefore, we can suppose m = 1. Write
f(y) - f(x) = f(yi, • • • ,Yn) - f(x1,Y2, • • • ,Yn) +f(x1,Y2, • • • ,Yn)
- f(x1,X2,Y3, · · · ,Yn) + f(x1,X2,y3, · · · ,Yn)
- f(x1,x2,X3,y4, · · · ,Yn)
+ · ·. + f(x1, · · · ,Xn-1,Yn) - f(xi, • • • ,Xn)Now use the mean value theorem, which implies
8f
f(yi, • • • ,Yn) - f(x1,Y2, • • • ,Yn) = ~(u1,Y2, • • • ,Yn)(y1 - xi)
UX(
x,
for some u1 between and Yt (y2, ... ,Yn are fixed). We write ~imilar expressions
for the other terms and get
f(y)-f(x)=
(%{
(u1,Yz, ... ,yn)) (y1 -x,)
+ (:~ (x,,uz,y3, ... ,yn)) (y2 -x2)
+ · · · + ( :~ (x1,X2, ... ,x,._,, Un)) (Yn - Xn).
Since Df(x)(y-x) = "'E,7=/8J/8x;)(x1, ... ,xn)(yi -x;),
11/(y) -f(x) -
Df(x)(y- x)II::; {l-:{.(u1,Y2, .. • ,Yn)- :; (xi,••• ,x,.)I
+···+
1:~
(x,, ... ,Xn-t,Un)- :~ (x1, ...
,x,.)I} lly-xll
using the triangle inequality and the fact that IY; - x;I ::; IIY - xi I. Since the
terms 8f/8x; are continuous and"u; lies between y; and x;, there is a 8 > 0 such
that the term in braces is less than c for IIY - xii < 8. This estimate proves the
assertion.
•
Theorem Proofs for Chapter 6
371
6.5.1 Chain Rule Let A
C ]Rn be open and f : A -+ R'n be differentiable
at xo E A. let B C Rn be open, f(A) C B, and g : B -+ RP be differentiable
at f(xo). Then the composite g of is differentiable at Xo and D( g o f)(xo) =
Dg(/(.:to)) o D/(Xo).
Proof
that
To prove that D(g o /)(Xo) • y = Dg(/(Xo)) • (D/(x0) • y), we will 'show
r Ila o f(x) -
x~~
g of(xo) - Dg(J(xo)) · [Df(xo)(x - Xo)]II 0
llx - .:toll
= .
To do this, estimate the numerator as follows:
118 o f(x) - go f(xo) - Dg(f(xo)) · (DJ(xo)(x - xo))I I
=
llc<f(x)) - g(f(Xo)) - Dg(/(.:to))[/(x) - /(.:to)]
+ Dg(f(Xo))[f(x) - /(.:to) - D/(xo)(x - Xo)]II
$ llg(f(x)) - g(f(xo)) - Dg(/(Xo))[/(x) - /(xo)JII
+ IIDg(J(xo))[f(x) - f(xo) - Df(Xo)(x - xo)]II
by the triangle inequality. Since/ is differentiable, there exist a 60 and an M > 0
such that 11/(x) - /(xo)II $ Mllx - xoll whenever !Ix - .:toll < 60, by Theorem
6.3.1. Given c > 0, there is, by the definition of the derivative of g, a 61 > 0
such that 11.Y - J(xo)II < 61 implies
llg(y) - g(J(Xo)) - Dg(J(Xo))[y - /(Xo)JII
Thus !Ix - Xoll
< 62 = min{ 60, 6i}
< ( 2~)
IIY - /(Xo)II-
implies
llg(f(x)) - g(f(xo)) - Dg(/(Xo))[/(x) - /(Xo)JII
llx- .:toll
c
< 2·
Since Dg(f(xo)) is a linear map, we know that there is a constant N such that
IIDg(f(Xo))(y)II $ N · IIYII for ally E Rm, where it can be assumed that N /; 0.
Now, by definition of the derivative, there is a 63 > 0 such that llx - xoll < 63
implies
=IIJ_(x_)_-..;_J(_Xo_)_----'Df~(x_o_)(x_-_Xo_)....:.:.11 < _c
llx - .:toll
2N"
Then !Ix - xoll
< 63
implies
IIDg(f(Xo))[J(x) - f(xo) - D/(Xo)(x - .:to)JII
llx - xoll
<
-
Njlf(x) - /(.:to) - D/(xo)(x - Xo)II
llx-xoll
< ~2
Chapter 6 Differentiable Mappings
372
Let 6 = min{62,63}. Thus, llx- xoll
< 6 implies
Ilg of(x) - g of(xo)- Dg(f(xo)) · Df(xo)(x llx - xoll
<
xo)II
Ilg(/(~)) - g(f(xo)) - Dg(/(Xo))[f(x) - f(xo)JII
llx - xoll
11Dg(f(xo))[f(x) - f(xo) - Df(xo)(x - Xo)JII
+
which proves the formula.
c
c
< 2 + 2 = £,
llx - xoll
•
Note. At this point it might be good for the student to go back to the proof
of the chain rule in §4.7 and ask how· that proof compares with this one.
6.6.1 Proposition Let A c !Rn be open and let f : A
-+ IR"' and g : A -+ IR
be differentiable functions. Then gf is differentiable, and for x EA, D(gf)(x) :
!Rn-+ IR.111 is given by D(gf)(x) · e = g(x)(DJ(x)· e)+(Dg(x)• e)f(x) for all e E !Rn.
(This makes sense, because g(x) E IR and Dg(x) • e E IR.)
Proof Given £ > 0 and xo e A, choose 8 > 0 such that l!x - xoll < 8 implies
i.
jg(x)I $ jg(xo)I + 1 = M.
ii.
IIJ(x) -f(xo) - Df(xo)(x - xo)II $ 3Mllx - xoll-
iii.
llg(x) - g(xo) - Dg(xo)(x - xo)II $ 3llf(Xo)ll llx - xoll-
iv.
llg(x) - g(.xo)II S £/3M where 11D/(xo)YII S MIIYII (iii and iv are needed
only if /(xo) 1 0 and D/(xo) ~ 0). Why is the choice of 6 possible?
€
€
Theorem Proofs for Chapter 6
With llx - xoll
373
< 6 and using the triangle inequality, we get
llg(x)f(x) - g(Xo)f(Xo) - g(Xo)D/(Xo)(x - Xo) - [Dg(xo)(x - xo)] /(xo)II
$ llg(x)f(x) - g(x)f(Xo) - g(x)D/(Xo)(x - Xo)I I
+ llg(x)DJ(xo)(x - Xo) - g(xo)D/(Xo)(x - xo)II
+ llg(x)f(xo) - g(Xo)f(Xo) - [Dg(Xo)(x - Xo)] /(xo)II
e:llx - xoll
$ M.
3M
= e:llx - Xoll-
e:
e:llx - xoll
+ 3MMllx ~ xoll + 311/(xo)II · 11/(xo)II
•
6.7.1 Mean Value Theorem
i.
Suppose f : A C Rn --+ R is differentiable on an open set A. For any
x,y EA such that the line segment joining x and y lies in A (which need
not happen/or all x,y), there is a point con that segment such that
/(y) - j(x) = D/(c) • (y - x).
ii.
Suppose f : A C Rn --+ R"' is differentiable on the open set A. Suppose
the line segment joining x and y lies in A and J = (f1, ••• ,/,11 ). Then there
exist points c 1, ••• , c,,, on that segment such that
f;(y) - f;(x)
=Df;(c;)(y- x),
i =I, ... ,m.
Proof
i.
The function h : [0, I] --+ R defined by h(t) = J((I-t)x+ty) is differentiable
in t on ]0, 1[. There is a to E JO, I [ such that h(l)- h(O) = h' (to)(l - 0), by
the one-variable mean value theorem. Now h(l) = f(y) and h(O) = f(x).
Differentiating using the chain rule, we obtain h'(to) = D/((1 - to)x +
t0 y)(y - x), since the derivative of (1 - t)x + ty with respect to t is y - x
(explain). Hence we can take c = (I - to)x + t0 y.
ii.
This follows by applying i to each component of/ separately.
•
Chapter 6 Dijferentiable Mappings
374
6.8.2 Theorem Let f : A c Rn -+ JR be twice differentiable on the open set
A. Then the matrix of 0 2/(x) : Rn x Rn
-+
JR with respect to the standard basis
is given by
a2f
8x18Xn
cPf
8x18x1
82f
83/
8xn8X1
8xn8Xn
where each partial derivative is evaluated at the point x =(xi, ... ,xn)-
Proof The matrix representation of Of : A
L(!Rn, IR) is the row vector
(8// 8x 1, ... , of/ 8xn), so that by Theorem 6.2.2, in a version suitable for row
vectors, 0 2/ : A -+ IR."2 is given by
-+
a2f
8x18X1
a 21
OX18X2
a21
8x18Xn
8 2J
8xn8X1
a21
{J2f
OXn8X2
OXnOXn
Regarding D2/ as a bilinear map will not change the matrix representation, as a
consideration of the definitions shows. •
6.8.3 Symmetry of Mixed Partials Let f : A -+ Rm be twice differentiable on the open set A with 0 2/ continuous (that is, with the functions
8 2f / 8x;8xi continuous). Then 0 2/ is symmetric; that is,
0 2f (x)(x1, x2) = 0 2f (x)(x2, X1 ),
or; in terms of components,
Proof Showing that 0 2f(x) • (y,z) = 0 2f(x) • (z,y) is equivalent to stating that
a2f1c
a2/k
OX;OXj = OXj.OX;.
By holding all other variables fixed, we are reduced to the two-dimensional
case. Thus we can assume that f is of class C 2 on A c IR 2 and is real-valued.
Theorem Proofs for Chapter 6
375
As suggested in the text, we consider, for fixed (x,y) E A and small h, k, the
quantity
Sh.t = [f(x + h,y + k) - f(x + h,y)] - [j(x,y + k) - f(x,y)].
Define the function gk by gk(u) = f(u,y + k) - f(u,y), and-motivated by
Figure 6.8-1---observe that the formula for Sh.t can be written
Sh,t = gk(X + h) - gk(X).
Thus, by the mean value theorem, Sh,t = g1c(Ch,t) • h foe some Ch.k lying between
x and x + h. Hence
of
at
}
a21
sh,k = { ox (Ch,t,Y + k) - ax {ch,t,Y) . h = {)y{)x (Ch,Jc, du). hk
for some d1,,k lying between y and y + k.
Now Sh,1c is "symmetrical" in h, k and x,y. By interchanging the two middle
terms in Sh,k, we can derive (in the same way)
a21
-
sh,k = axay (cu, dh.t) • hk.
Equating these two formulas for Sh.k, canceling h, k, and letting h
(using continuity of D2/) gives the result.
•
-+
0, k
-+
0
Note. A refinement is this: Iff is C1 and 8 2/ / oxoy exists and is continuous,
then {J2J/ oyox exists and these are equal. This requires more work than the
preceding proof, but the idea is the same. (See Exercise 24 at the end of this
chapter.)
6.8.5 Taylor's Theorem Let f : A -+ JR be of class er for A c lRn an
open set. Let x,y EA, and suppose that the segment joining x and y lies in A.
Then there is a point c on that segment such that
r-1
f(y) - f(x) =
:,Dkf(x)(y - x, ... ,y - x) + ~Drf(c)(y - x, ... ,y - x)
L
J.-=1
r.
•
where Dkf(x)(y - x, ... ,y - x) denotes okf(x) as a k-linear map applied to the
k-tuple (y - x, ... ,y - x). In coordinates,
Dkf(x)(y - x, • •. ,y - x):
L {:)X,,.8'i
Q . ) (Yi1 •., X,k
n
.
(
. I
11, ••• ,,k=
X;I)' • • (y;k
- X;k).
376
Chapter 6 Differentiable Mappings
Setting y = x + h, we can write the Taylor formula as
f(x + h) = f(x) + D/(x) · h + · · · + (r ~ I)! D'-1/(x) · (h, ... , h) + Rr-1 (x, h)
where R,- 1(x, h) is the remainder. Furthermore,
R,_,(x, h) -+ 0 as
h-+ 0.
llhll'- 1
Proof If we remember that
8/
L ~(x+th)h;
d
d f(x+ th)= Df(x+ th)· h =
t
n
i=l vX,
from the chain rule, then we can integrate both sides from t = 0 to t = l to
obtain
1 n of
f(x+h)-f(x) =
~(x+th)h;dt.
1L
O
X,
i=l
Integrate the expression on the right-hand side by parts, using the general fonnula
f1 u dv dt = -
lo
dt
{' v du dt + uv , ,
lo
dt
0
with u = (of/ 8x;)(x + th)h; and v = t - 1. Then,
of
11 (1 L 1' ~(x
+ th)h; dt = L
n
i=l
n
O vX,
i,k=I
O
a2,
t)7J:8(x + th)h;hk dt +
:X, Xk
du
82f
-d = ~ ( x + th)h;h1c,
t vX;vX1c
from the chain rule, and
1
o
= (t - 1) ~/- (x + th)h;
<IX,
I'
t=O
= ~/- (x)h;.
<IX,
Thus we have proved the identity
f(x + h) - f(x) =
of
L ~(x)
· h; + R (h,x)
n
1
i=I
X,
n
i:::l
since
uvl
8/
L h;~(x),
X,
377
Theorem Proofs for Chapter 6
where
Since
lh;I :5 llhll, we get
IR1 (h,xo)I
:5
llhll2
{t [1
i,!c=I } 0
~¥
(1 - t) I O
x, x"
(xo + th)I dt}.
If we integrate R1 (h, x) by parts again with
tJ2f
(t-
u = ~ ( x + th)h;h"
1)2
and v=---2-
vX;vXk
we get
L 1 (t - 2 1)
1
R1 (h~ x) =
• ••
,.,."
fPJ
2
(x + th)h;hjhk dt +
8x,-8x-o
i Xk
O
L -21
,.,
. .
82/
$'I
_$'I
•
vx,ux,
(x)h;hi .
Thus we have proved that
f(x+h) =J(x) +
of
h·h· IJ2J
L h;8-(x)
+ L Ta-a-(x) +R2(h,x),
n
n
X,
i=I
where
R2(h,x) =
L 1 (t-2 1)2 0
1
,.,. ...k
O
X,
i,j=I
[)3/
{) 0
X; Xj Xk
XJ
(x + th)h;hjhk dt .
The integrand in the last formula is a continuous function and is therefore
bounded on a small neighborhood of x (remember, it has to be close to the
value of the function at x). Thus, for a constant M ~ 0 and for Ilhl I small, we
get
·
IR2(h,x)I $1lhli3M.
In particular, R2(h,x0 )/llhll2 $ llhllM-+ 0 ash-+ 0. The formula stated in the
theorem for the remainder (Lagrange's form of the remainder) is obtained by
applying the second mean value theorem for integrals. Recall that this states
that
lb
f(x)g(x)dx =f(c)
lb
g(x)dx,
Chapter 6 Differentiable Mappings
378
provided that f and g are continuous and g ~ 0 on [a, b]; here c is some number
between a and b (see Exercise 45, Chapter 4). Thus,
R1 (h,Xo) =
L 11 (1 -
i,k=I
=
a21
n
fo
1
t)a:a-(x + rh)h;h1: dr
X, Xt
O
(1 - t)D2(f(x+rh))(h,h)dr
= ~D2/(c) · (h,h)
where c lies somewhere on the line joining x to y = x + h. Similarly,
R2(h,xo)=
=
L
n
11
.. '-I
1,J,t=
O
1
-2
1 (t
0
1)2
(I -
2
1)2
83/
0 -8 _8
(x+rh)h;hjhtdt
X, X1 Xt
l
D3/(x +th)· (h, h, h) dr = 3 ! D3/(c) · (h, h, h)
where c lies somewhere on the line joining x to y = x + h. One can proceed by
induction using the same method to get the general result.
•
Remark 1 With a bit more effort, one can prove a stronger theorem:
of class C~ then
If/ is
L k!1 Dkf(x) · (h, . .. , h) + R,(x, h)
r
f(x + h) = f(x) +
k=I
where
R,(x, h) --+ O
11h11'
as h
--+
0 in lR". We leave this point to the interested reader.
Remark 2 Another proof that uses Taylor's formula from one-variable calculus is as follows. Let g(t) = f(x + t(y - x)) for x E [O, l]. Applying Taylor's
formula on JR, there is some t E [O, I] such that
,-1
g(l) - g(O) =
ha!
=
I
I
L k.c<k>co> + ,g<'>cn.
•
,.
=
Note that g(l) /(y) and g(0) = /(x). Let p(t) x + t(y - x). Then g =fop and
Dp(r)(l) = y - x for al) x, and so, by Exercise 6b at the end of this chapter,
g<'>(r) = D"g(r)(l, ... , 1) = D"/(p(r))(Dp(t)(l), ... , Dp(t)(l))
= D'i(x + t(y - x))(y - x, ... ,y - x).
Theorem Proofs for Chapter 6
379
Substituting, we get
r-1
f(y)-f(x)
=L
;,ntf(x)(y-x, ... ,y-x)+~D'f(x+t(y-x))(y-x, ... ,y-x),
k=I
which completes the proof, with c = x+ t(y - x).
6.9.2 Theorem If f : A
C Rn ------+ JR is differentiable, A is open, and x 0 E A
is an extreme point for f, then DJ(x0 ) = O; that is, Xo is a critical point.
If DJ(xo) ¥- 0, we can find an x E lRn such that DJ(x0 )x = c -:/- 0, say,
Now find a 8 > 0 such that if 11h11 < 8, then
Proof
c
> 0.
C
llf(Xo + h) - f(Xo)-Df(xo)hll ~ lllxll llhll.
Pick A > 0 such that Al!xll < 8. Choosing h = AX gives llf(xo + AX) - f(xo) Df(xo)AXII ~ cAllxll/(2llxll) = cA/2. Since Df(xo)AX = Ac, we get f(xo +
AX) - f(xo) > 0. Similarly, llf(xo - AX) - f(xo) + Df(xo)AXII ~ cA/2 implies
f(xo - AX) - f(xo) < 0. Since f(xo + AX) > f(xo) and f(xo - AX) < f(xo), we see
that f (Xo) is not a local extreme value. That is, we can find points y arbitrarily
close to xo such that f(y) > f(xo), and similarly there are points y arbitrarily
close to Xo such thatf(y) <f(xo).
•
6.9.4 Theorem
i.
If f : A c lRn
JR is a C 2 function defined on an open set A and xo is a
critical point off such that Hx0 (f) is negative definite, then f has a local
------+
ma,ximum at xo.
ii.
If f
has a local maximum at xo, then Hxo(f) is negative semidefinite.
Proof
i.
Hx0 (f)(x,x) < 0 for all x ,f. 0 in lR.11 implies D 2/(xo)(x,x) < 0 for all x-:/- 0
in ]Rn. By Worked Example 4.5 at the end of Chapter 4, a bilinear function
is continuous, and so D2/ (xo)(x, x) is a continuous function of x. Moreover,
S = {x E lR.11 I llxll = 1} is compact, and so there is a point i E S such
that O > D2f(xo)(i,x) 2: D2f(xo)(x,x) for all x ES. Let e = -D2f(xo)(i,i).
Then D2f(xo)(x,x) = llxi!2D2f(xo)(x/llxll,x/llxll) ~ -e!lxll 2 for any x-:/- 0
in !Rn. Since D2/ is continuous, there is a 8 > 0 such that I jy - xo 11 < 8
Chapter 6 Differentiable Mappings
380
implies 11D2/(y)- D2/(Xo)II < e/2, and we may also pick our 6 such that
D(Xo, 6) c A. If y E D(xo, 6), Taylor's theorem gives
1 2
.
f(y) - f(xo) = D/(Xo)(y - xo) + 2D f(c)(y - xo,Y - Xo),
where c E D(xo, 6). Thus,
-implies
D 2/(c)(y - Xo,y..- xo)
S D2/(xo)(y
+
11D
- xo,y- xo)
2/(c)(y
- xo,Y - Xo) - D2/(Xo)(y - Xo,Y - Xo)II
:::; -ellY- xoll2 + illy-xoll2 = -illy- xoW.
Since D/(Xo) = 0, Taylor's theorem gives
f(y) - /(Xo) =
½0 /(c)(y 2
Xo,Y -
xo) S
Hence f(y) < f(xo) for all y E D(xo, 6), y
maximum at Xo,
ii.
½(-; IIY - xo!l2) < 0.
'I-
Xo, and so f has a local
To prove this part of the theorem, we argue by contradiction. Let/ have a
local maximum at xo and suppose D2/(Xo)(x,x) > 0 for some x E lRn. Now
consider g(t) = -f(xo+tx). Since/ is defined in a neighborhood of xo, g is
defined in a neighborhood ofO. We have D 2g(0)(1, 1) = -D2/(Xo)(x,x) <
0. Using the proof of i, there is a 6 such that It! < 6, t 'I- 0 implies
g(t) < g(O). Thus !ti < 6 implies f (Xo + tx) > f(xo), and so j does not have
a local maximum at x0 . This contradiction implies that D2/ (Xo)(x, x) S 0
for all x E lRn. Hence HXo(f)(x,x) S O for all x E Rn.
•
Worked Examples for Chapter 6
let f : B C R" -+ lR where B = {x E Rn I !!xii S 1} be
continuous and let f be differentiable on int(B). Suppose f(x) = 0 on bd(B). Show
that there is a point xo E int(B) for which Df(xo) = 0.
Example 6.1
381
Worked Examples for Chapter 6
Solution This is the multidimensional version of Rolle's theorem. If f is
identically zero, the theorem is trivial. Therefore, suppose f (x) 'F 0 for some
x E int(B). Then f attains a maximum, or minimum, at some interior point,
since B is compact. Thus there is an extreme point .xo E int(B), and hence, by
Theorem 6.9.2,
DJ(.xo) = 0.
•
Example 6.2 Show that for a bilinear map f : ]Rn x
D/(Xo,Yo)(x,y) =J(.xo,y) + f(x,yo).
]Rm -+ ]RP, we have
Solution We know that f is differentiable, because from its matrix representation, we see that the partial derivatives are linear and so are continuous.
Since J(x, Yo) is a linear function of x, the derivative in the direction (x, 0)
is Df(xo,yo)(x,0) = f(x,yo), as in Example 6.2.4. Similarly, Df(xo,Yo)(0,y) =
f(xo,y). Since Df(xo,Yo) is linear and (x,y) = (x,0)+(0,y), we have Df(xo,Yo)(x,y)
=f(xo,y) + f(x,yo).
•
Example 6.3 Find the Jacobian of f(x,y) = (sin(xsiny), (x + y)2), f : R. 2
-+ ]R.2_
Solution We have
:
= :/sin(xsiny))
== cos(x siny) :x (x siny) = siny cos(x siny);
0/2 =2(x+y)a
0 (x+y)=2(x+y);
~
vX
.
X
°
8'!1 = cos(xsiny) (x;iny) = cos(xsiny)xcosy;
0y
0y
.
oh
oy
= 2(x+y).
Thus, by Theorem 6.2.2, the Jacobian matrix (where x = xi and y = x2 ) is
( sinycos(xsiny) xcosycos(xsiny) )
2(x+y)
2(x+y)
·
•
382
Chapter 6 Differentiable Mappings
Note. Jacobian matrices generally are not symmetric and indeed need not
even be square. Symmetry is a property only of the second derivative of a
function f : Rn --. R.
Example 6.4 Find the critical points ofJ(x,y) =
r - 3.r +y
2
and determine
whether f has a (local) maximum, (local) minimum. or saddle point at each of
these critical points.
Solution The critical points are precisely those points (x,y) for which
of = 3x2 -
ox
6x
=0
and
of
8y
=2y = 0.
Solving for x, we see that x =0 or x = 2. Therefore, the critical points off are
(0,0) and (2,0). The matrix of the Hessian at (x,y) is
(
2
cJ2/
8xfJx
8/
8x8y
82/
82/
8y8x
8y8y
-- --
)
= ( 6x - 6
0
~).
At (0, 0) the matrix of the Hessian is
and ..11 = -6, ..12 = -12. Hence, f has a saddle point at (0, 0). At (2, 0), the
matrix of the Hessian is
=
=
and ..11 6, ..12 12 and so/ has a local minimum. This is confirmed by the
computer-drawn graph shown in Figure 6.WE-1.
Example 6.5 Let A be an open convex set in Rn and let f : Rn -+ Rm be
differentiable with a continuous derivative. Suppose 11D/(x)yll ~ Mllyll for all
x EA, y E Rn. Prove the mean value inequality:
Exercises for Chapter 6
383
FIGURE 6.WE-1
The graph of/(x,y)
=t' - 3.r +y2
Solution For n = 1, m = 1, this follows directly from the mean value theorem.
To get the general case, observe that by the chain rule, (d/dt) (f(tx 1 +(1-t)x2 )) =
Df(tx 1 + (1 - t)x2) · (x1 - x2). Integrate both sides in t from t = 0 to t = 1 to
obtain
/(x1) - /(x2) =
fo
1
[Df(txi + (1 - t)x2) · (x1 - x2)J dt.
The integral here is defined as the integral of the component functions. Taking
absolute values and using the hypothesis on DJ now gives the result desired.
We used the fact that the absolute value of an integral is less than or equal to
the integral of the absolute value-see the remarks follow,ing 4.8.S and note that
the case of vector functions is similar.
•
Exercises for Chapter 6
1.
If f : A C IR.n - IR."' and g : B c IR.n - IR."' are differentiable functions
on the (open) sets A and B and er, /3 are constants, prove that af + /Jg :
An B c IR.n - JR."' is differentiable and D(a/ + f3g)(x) = aDf(x) + f3Dg(x).
Chapter 6 Differentiable Mappings
384
IR.".' and assume df; / dx exists for i = I, ... , m. Show that
2.
Let f : A C JR.
DJ exists.
3.
Let f : [0, oo[ -+ IR be continuous and let f be differentiable on ]0, =[.
Assume /(0) = 0 and f (x) -+ 0 as x -+ +oo. Show that there is a c E ]O, oo[
such thatf'(c) = 0.
4.
If/ : A c IR.n
x EA.
5.
Calculate the Jacobians of the following functions:
-+
-+
IR"' is a constant function, show that D/(x) = 0 for all
b.
=sin(.x2 + y3).
f(x,y,z) =(zsinx,zsiny).
c.
J(x,y) =xy.
a.
f(x,y)
=x2 + y2.
e. f(x,y) =(sin(.xy), cos(xy),x2y2).
r. f(x,y, z) =xy+i_
g. f(x,y,z) =xyz.
h. f(x,y,z) =(z-'>',x2, tan(xyz)).
6.
d.
f(x,y,z)
a.
If/ : A c IR.n --+ IR.m and g : B c IR"' -+ JRP are twice differentiable
and/(A) CB, then for xo EA, x,y E IR.n, show that
D2 (g o f)(xo)(x,y) = D2 (g(f(Xo)))(Df(xo) · x, Df(Xo) · y)
+ Dg(f(xo)) · D2/(xo)(x,y).
b.
If p : IR.n -+ R"' is a linear map plus some constant and/ : A C
IR"' -+ IR.s is k times differentiable, prove that
D'=(f o p)(xo)(xi, ..• ,xk) = D.t/(p(.xo))(Dp(xo)(xi), ... ,Dp(xo)(xk)).
7.
Find the critical points of the following functions and determine whether
they are local maxima, local minima, or saddle points:
a.
f(x,y) = x3 + 6x2 + 3y2 - 12.xy + 9x.
b.
/(x,y) = sinx + y2
c.
f(x,y,z) = cos 2x · siny + z2 •
d.
J(x, y, z) = (x + y + d.
-
2y + I.
385
Exercises for Chapter 6
8.
Show that if f: A c IR.2
-.
R bas a critical point x0 EA and we let
be evaluated at xo, then
a.
b.
c.
9.
> 0 and 82/ / 8x1 8x1 > 0 imply f has a l?cal minimum at xo.
!!.. > 0 and 82/ / 8x18x1 < 0 imply f has a local maximum at x0 •
Ll < 0 implies x0 is a saddle point off.
!!..
Consider the following two possible properties for a subset X of Rn:
1
There is a point x0 E X such that every other point x in X can
be joined to .xo by a straight line in X.
2
There is a point .xo E X such that every other point x in X can
be joined to x 0 by a di(ferentiable path in X.
a.
Give .examples of each kind of set that are not convex.
b.
Show that if X is an open set in Rn satisfying either of these conditions and/: X-. Risa differentiable function with zero derivative,
then f is constant.
c.
Show that if X is an open subset of lR.n, then the following are
equivalent:
i. Condition 2 above
ii. Path connectedness of X
iii. Connectedness of X
10.
Prove the analogue of Theorem 6.9.4 for minima.
11.
Prove the analogue of Proposition 5.3.3 for f: A c Rn-. Rm.
12.
A function/: Rn - t R is called homogeneous ofdegree m iff(tx) = t"'f(x)
for all x E lR.n and t E R. If f is differentiable, show that for x E Rn,
Df(x)x = mf(x),
8/
LX;~
=mf(x).
n
that is,
i=I
X,
Show that maps multilinear in k variables give rise to homogeneous functions of degree k. Give other examples.
386
13.
Chapter 6 Differentiable Mappings
Use the chain rule to find derivatives of the following, where f(x, y, z) =
and h(x) sinx:
x2 + yz, g(x,y) =y3 + xy,
=
a.
F(x,y,z) =f(h(x),g(x,y),z).
b.
G(x,y,z)
c.
H(x,y,z) =g(f(x,y,h(x)),g(z,y)).
= h(f(x,y,z)g(x,y)).
Also find general formulas for the derivatives of F, G, H.
14.
a.
Extend Worked Example 6.2 to multilinear maps.
b.
Apply the result in a.to the case of the determinant map det: Rn2 =
Rn X • • • X Rn -+ R, to show that A E Rn2 is a critical point of det
·iff A has rank~ n - 2.
15.
Let f : R -+ R be differentiable. Assume there is no x E R such that f
. and f' both vanish at x. Show that S = { x I 0 $ x $ 1,f(x) = .0} is finite.
16.
If f : Rn -+ Rm is a differentiable and DJ is a constant, show that/ is
a linear term plus a constant and that the linear part off is the constant
value ofD/.
17.
If/ : A c Rn -+ R is of class er and Df(xo) = 0, D2/(xo) = 0, ... ,
or- 1/(Xo) = 0 but D'/(x0 )(x, ... ,x) < 0 for all x E Rn, x 'r" 0, then prove
that/ has a local maximum at xo.
x' + bx+ c = 0 where b > 0 has exactly one solu-
18.
Prove that the equation
tion x ER
19.
In each of the following problems, determine the second-order Taylor
fo1mula for the given function about the given point (xo,Yo):
20.
a.
/(x,y)
=(x + y)2, xo = 0,yo =0.
b.
f(x,y)
=e"+Y, Xo =0, Yo= 0.
c.
/(x,y) = 1/(x2 + y2 + 1), Xo = 0, Yo= 0.
d.
j(x,y)
= e-x2-y2 cos(xy), xo = 0, Yo= 0.
e.
J(x,y)
= sin(xy) + cos(xy), xo = 0, yo= 0.
f.
f(x,y)
= e<x- 1>1 cosy, Xo = 1, Yo= 0.
Let L : Rn - t R"' be a linear map. Define IILi I = inf{ M I l!Lxll ~ Mllxll
for all x E Rn}. Show that II· II is a norm on the space of linear maps of
Rn to R"'.
~ercises for Chapter 6
21.
387
a.
Let f : A C Rn - JR, and assume that for some integer k, !).k < 0,
where ~ is evaluated at x0 • Show that f cannot have a (local)
b.
If ~
minimum at·xo.
<0
for some even k, prove that f has a saddle point at xo.
22.
Give an example of a continuous map f : ]0, 1[ --+ JR whose graph is not
closed. Can this happen for f : A C JR --+ JR where A is closed?
23.
Write down the· first four tenns in the Taylor expansion of log(cos x) about
x=0.
24.
Letf(x,y) be a real-valued function on IR 2 . Show that if/ is of class C1 and
{)2JI oxoy exists and is continuous, then 8 2f I oyox exists and {PfI axay =
8 2/ / oyox (this is weaker than saying that f is of class C 2 ). Generalize.
25.
Let f : Rn --+ JR and suppose that the partial derivatives of/ ox;, i =
1, ... , n, exist and that 8// 8x;, i = 1, ... , n - 1, are continuous. Prove that
f is differentiable.
26.
a.
If f : JR --+ JR and f' exists on a neighborhood of x = a and
limx-a•f'(x) l, prove thatf'(a) l.
b.
Let g(x) 1 if x
of any function?
=
=
=
<0
and g(x)
=0 if x ~ 0. Can g be the derivative
27.
Let f : A --+ JR be continuous, where A c Rn is open. Assume that all
directional derivatives exist and that at each xo E A, they determine a
linear map. Must/ be differentiable?
28.
Supp~se we define a new type of connectedness as follows: A set M c
Rn is "differentiably connected" if every two points x,y E M can be
joined by a differentiable arc t/; : (0, 1] --+ Rn. What do differentiably
connected sets look like? Give an example of a path-connected set that is
not differentiably connected. What about open sets?
29.
Let fn(X) = xe-nx, X E [O, oo[, n = 0, l, 2, ....
30.
a.
Show that f(x) = I::Ofn(x) exists. Compute f explicitly.
b.
Is f continuous?
c.
Find a suitable set on which the convergence is uniform.
d.
May we differentiate term by term?
Suppose f : JR --+ JR is bounded and has a continuous derivative. What is
right and what is wrong in the following string of conclusions?
388
Chapter 6 Differentiable Mappings
We want to prove that the set T of all points at which f assumes its
(absolute) maximum is closed. Since f is differentiable, it is continuous.
Hence it assumes its maximum; that is, T is not empty. Denote by S the set
of points at which/'(x) = 0. Then TC S. On the other hand, if x ES, then
/' (x) = O; hence f achieves either a maximum or a minimum there. If it
achieves a maximum, we must have/(x) ~ 0. Hence T = Sn{x I/(x) ~ O}.
Since {x lf(x) ~ O} is closed, as is S, Tis closed.
Is T really closed, or not?
31.
Let A c Rn be compact, and construct the normed space C(A, JR) as in
Chapter 5. Define, for xo E A, 6,:0 : C(A, JR) -+ JR; f 1-+ f(xo). Prove that
6x,, is differentiable.
32.
Let f : IR.2 -+ lR be defined by
f(x,y) =
xy(x2 -y2)
2
X
2
+y
if (x,y) 'I (0,0) and/(0,0) = 0. Show that 8 2f /8x8y and {Pf /8y8x exist
at (0, 0) but are not equal.
33.
Use Taylor's theorem to prove the binomial theorem,
(a +bt =
t (;)
akbn-k.
k=O
34.
Consider the sequence of real numbers
I
2'
I
2+
½'
2+
i¾-r ,...
(a continued fraction). Show that it is convergent, and find the limit.
35.
Let f : ]a, b[ -+ JR be twice differentiable. Suppose f vanishes at three
distinct points. Prove that there is a c E ]a, b[ such thatf"(c) = 0.
36.
Suppose f : [0, l] -+ JR is continuous, f is differentiable on ]0, l[, and
f(O) = 0. Assume that lf'(x)I ~ 1/(x)I, 0 < x < 1. Prove that/(x) = 0 for
all x E [O, l}.
37.
A C 2 function/: lR2
-+
JR is called harmonic if
a21 a21
ax2 + 1Jy2 = 0.
Assume that (xo, Yo) is a strict local maximum and f is harmonic. Prove
that all second derivatives off vanish at (xo, Yo).
389
Exercises for Chapter 6
38.
39.
40.
41.
Find the equation of the plane tangent to the following surfaces at the
indicated points:
a.
z = x2 + y2 , (0, 0).
b.
z=r-y2 +x,(l,0).
c.
z = (x+y)2, (3,2).
Analyze the behavior of the following functions at the indicated points:
a.
z = x2 - y2 + 3xy, (0,0).
b.
z = Ax2 - By2 + Cxy·, (0, 0).
Find the equation of the tangent plane to the surface S given by the graph
of
a.
/(x,y) = ~
b.
/(x,y) =
+r +y2 at (l,0,2).
Jx2 + 2xy-y2 + I
at (l, I, /3).
Give an example of a differentiable function/: JR.-+ JR. such that/(0) = 0,
and yet no inequality of the form If(x)I $; C.x2 holds in any
neighborhood of the origin.
f' (0) = 0
Chapter 7
The Inverse and Implicit
Function Theorems and
Related Topics
We learn in linear algebra that a system of linear equations
can,be solved uniquely for x 1, ••• , Xn if the matrix A = (aij) is nonsingular, that is,
if det(A) #; 0, where det(A) denotes the determinant of A. What about nonlinear
equations? When can we solve a system of the fonn
f1 (X1, ••. , Xn)
= Y1
fn(X1, • • • ,Xn) = Yn
for x,, ... , Xn? The object that generalizes the determinant in the case of linear
systems is the Jacobian determinant, defined by Jf(x) = det(D/(x)), where
x = (x,, ... , Xn) and/ = (/1, ... ,/n). Written out in coordinates,
eJJ,
ox,
8/1
OXn
8/n
8x1
0/n
OXn
J/(x) =
391
Chapter 7 The Inverse and Implicit Function Theorems
392
where the partials are evaluated at x = (x1, ... , Xn). Sometimes one writes Jf as
8(/1, ... ,fn)
O(X1, .•• ,Xn).
If Jf(x) f; 0, one might expect to be able to solve f(x) = y for x.
The theorem that justifies such results is the main subject of §7.1. We consider the case in which we wish to solve f(x,y) = 0 for y (implicit function
theorem) in §7.2. In later sections of this chapter we apply some similar existence theorems to ordinary differential equations and a result concerning the
fonn of a function near a maximum, minimum, or saddle, called the "Morse
lemma." The final section is concerned with extremum problems in the presence
of constraints.
§7 .1 Inverse Function Theorem
Since Jf(x) is the determinant of Df(x), Jf(x) '{" 0 implies that Df(x) : IR.n -+ IR.n
is a linear isomorphism (that is, its matrix is invertible). From the fact that the
best linear approximation to f at x is invertible, we want to conclude that the
function/ itself is invertible. However, there are some restrictions. To appreciate
these, first consider the case f : JR. -+ R It is true that if f is C 1 and if f' (xo) '{" 0,
then/ is invertible (one-to-one) in a neighborhood of .xo. Geometrically, this is
quite clear, for f' (x0 ) ~ 0 means that/ has a nonzero slope at, and consequently
near; xo (see Figure 7 .1-1).
If/'(xo) = 0, then f may or may not be invertible near xo; in Figure 7 .1-1,
f is not invertible near x 1 but is invertible near x3, even though/' vanishes at
both points (near x 1 it behaves like c 1 + (x - x1)2, and near x3 it behaves like
C3 + (x - x3 ) 3 ). Thus, if f'<xo) = 0, no conclusion can be drawn (some further
analysis would be required). Also, f' (xo) '{" 0 does not guarantee that we can
solve /(x) y for all y. For example, there is no X4 such that /(x4 ) y4 for Y4
as in Figure 7 .1-1. From the same figure we see that solutions are generally not
unique, for /(xo) = /(x2). There will be a unique solution only if our attention is
restricted to a suitably small neighborhood of x0 •
Therefore, all we can expect is that f is invertible near f (x0 ). That is, for Y
close to /(Xo), we can solve uniquely for some x near .xo such that/(x) = y. The
question "how near?" is a subtle one requiring a detailed analysis of the proof.
Thus, our main concern will be with local invertibility, that is, with invertibility
of f(x) for x near Xo and y near Yo = f(xo).
=
=
393
§7.1 Inverse Function Theorem
y
I
I
--1------1
I
I
I
I
I
I
I
I
I
X
FIGURE 7 .1-1
This function has an inverse near x0
To compute the derivative of the inverse function 1- 1( y), we use the chain
rule: Fromf- 1(/(x)) = X, we get (dr 1 /dy) -f'(x)'= 1, and so
dj- 1 1
=
1
dy y=J(x) df / dx ·
To prove that 1- 1 really is differentiable requires a little more care. Theorem
7.1,1 includes this single-variable situation as a special case.
7.1.1 Inverse Function Theorem Let Ac JR" be an open set and let
f : A C !Rn - t JR" be of class C 1 ( that is, assume that DJ exists and is continuous).
Let xo E A, and suppose Jf(xo) -/; 0. Then there exists an open neighborhood U
of Xo in A and an open neighborhood W off(xo) such that f ( U) = W and the
restriction off to U has a C 1 inverse 1- 1 : W - t U. Moreover, for y E W and
x =J- 1(y), we have
the inverse of Df(x), meaning the inverse as a linear mapping (corresponding
to the inverse matrix). If f is of class CT', p 2': 1, then so is 1- 1•
Chapter 7 The Inverse and Implicit Function Theorems
394
Note.
1.
If f : A c Rn -. Rm is of class C1 and D/(Xo) is one-to-one, then f
is also locally one-to-one near xo. Similarly, if D/(xo) is onto, then f is
onto some neighborhood of f(xo). These results follow from the inverse
function theorem by the methods of §7.2; see Exercise 11 at the end of
this chapter.
2.
Saying that f has an inverse 1- 1 means exactly that we can uniquely
solve J(x) = y for x E U given any y E W. That there is a solution is the
fact that/ takes U onto W (surjectivity) and that the solution is unique
is that f is one-to-one on U (injectivity).
For y near Yo, we need to prove the existence and uniqueness of an x such that
f(x) = y. The basic technical tool we use is the contraction mapping principle.
In §5.7 we saw how that result could be used to prove the existence of solutions
to some simple differential and integral equations, and in §7.5 we shall amplify
on this.
7.1.2 Example Consider the equations (x4 +y4)/x = u(x,y) and sinx+cosy
= v(x,y). Near which points (x,y) can we solve for x,y in terms of u, v?
Solution Here the functions are u(x,y)
=/1(x,y) =(x4 + y4)/x and v(x,y) =
fi(x,y) = sinx + cosy. We want to know the points near which we can solve
for x,y as functions of u and v and to compute 8x/8u, and so forth. To use
the inverse function theorem, we must first compute 8(f1 ,h)/8(x,y). Letting
f = (/1 ,/2), we take its domain to be A = { (x,y) E R 2 Ix f- O}. Now
of, 8/1
3x4 - y4
4y3
8x 8y
x2
=
X
8/2 ofi
-siny
cosx
8x {)y
siny 4
4y3
4
= - ( y -3x )- -cosx.
2
x
X
a(f1 ,h)
=
o(x,y)
At points where this expression is defined and does not vanish, we can solve for
x, y in terms of u and v. In other words, we can solve for x, y near those x, y for
which x f- 0 and (siny)(y4 - 3x4) f- 4.xy3 cosx. Such conditions generally cannot
be solved explicitly. For example, if xo = 7r /2, yo = 1r /2, we can solve for x, Y
near xo,Yo, because there, 8(/1 ,h)/o(x,y) f- 0.
§7.1 Inverse Function Theorem
395
The derivatives fJx/ au and so forth are obtained according to the inverse
function theorem by inverting the Jacobian matrix. In the 2 x 2 case, this
amounts to the following:
ox
l
av
-=---,
au
JJ(x,y) {)y
{)y
-1
av
Jf(x,y) ox
-=---,
au
-l au
Jf(x,y) {)y
ox
av
-=---,
ay
1
au
av = Jf(x,y) ax
As can be seen by multiplying
1
av au
au 8v)
8y ay - 8y8y
av -au)
(au
ay
fJy
ax
(
Jf -av
ax
8v
f)u
8x
av8u
8u 8v
-{)x{)y
- - +8x8y
--
8x
I
= Jf
(J/O
0 )
Jf
=
( 1 0)
0 1
(See also Worked Example 7.2 at the end of this chapter.)
In this case
ax
-(x2 siny)
au= (siny)(y4 - 3x4)- 4y3xcosx·
Notice that the answer is expressed in terms of x and y, not u and v. Evaluation at
a point (x, y) gives the value of 8x/8u at the corresponding point (u(x,y), v(x,y)).
In our example xo = yo 1r /2 giving u 1r 3/4 and v l. The last fonnula
becomes
=
ax
8u
I
.=1
u=,.-3;4
=
=
-x2 siny .
(siny)(y4- 3x4) - 4y3xcosx
=
I
x=y=rr/ 2
=~2
•
1r ·
The inverse function theorem is useful because it tells us that there are
solutions to equations and it explains how to differentiate the solutions, although
it may be impossible to solve the equations explicitly.
7.1.3 Example Let u(x,y) = excosy
and v(x,y) = exsiny. Show that the
transformation (x,y)....,. (u(x,y), v(x,y)) is locally invertible near every point, but
is not invertible.
Chapter 7 The Inverse and Implicit Function Theorems
396
Solution Here,
au au
8(u, v)
a(x,y)
=
Bx
8y
ox
oy
av av
I
= ex cosy -exsiny
ex siny
ex cosy
= e2x(cos2 y + sin2 y) = e2x ,j: 0.
Hence, by the inverse function theorem the map is locally invertible. It is not
globally one-to-one, however, because
u(x,y + 21r) = u(x,y),
v(x,y + 21r) = v(x,y).
•
Notice that for f : JR -+ JR, if f is differentiable and if f'(x) ,j: 0 for all x,
then f' (x) is either > 0 or < 0, since f' satisfies the intermediate value theorem
(see Exercise 37, Chapter 4 ); hence,/ must be (globally) one-to-one, since f is
always increasing or decreasing. Example 7 .1.3 shows that this need not be the
case in IR 2•
Exercises for §7 .1
1.
Let u(x,y) = x2 -.y2, v(x,y) = 2xy. Show that the map (x,y)
locally invertible at all points (x,y) ,j: (0,0).
2.
Compute ox/8u, 8x/8v, 8y/8u, 8y/8v for the functions in Exercise 1.
3.
Let /(x) = x + 2x2 sin()/ x), x ,j: 0, /(0) = 0. Show that f' (0) ,j: 0 but that/
is not locally invertible near 0. Why does this not contradict the inverse
function theorem?
4.
Let L : !Rn -+ !Rn be a linear isomorphism and f(x) = l(x) + g(x), where
jjg(x)II $ Mllxll2 and/ is C 1• Show that/ is locally invertible near 0.
5.
Investigate whether the system
u(x, y, z) = x + xyz,
v(x,y,z) = y +xy,
w(x,y, z) = z + 2x + 3z2
can be solved for x, y, z in tenns of u, v, w near (0, 0, 0).
t-t
(u,v) is
§7.2 Implicit Function Theorem
397
§7 .2 Implicit Function Theorem
In studying the implicit function theorem, we are again interested in the existence
and differentiability of implicitly defined functions. Undoubtedly, the student has
worked before with functions defined implicitly; however, it may not have been
adequately explained why the ~nipulations are justified. Possible questions
we would like to ask will be more obvious after we look at some examples.
Consider those x and y related by an equation F(x,y) = 0. We would like to say
that this defines a function y f(x) (one says that y /(x) is defined implicitly),
and we would like to compute dy/dx. As in the previous section, given such an
F, one generally cannot solve for y explicitly, and so it is important to know
that such a function exists, and to know how to compute with it, without having
to solve for it.
To provide motivation for the theorem, consider the function F(x,y) = +
y2 -1. We are interested in those x ana y related by F(x,y) = 0, which is just the
unit circle. A function f(x) is a "solution" when F(x.f(x)) = 0 for all x in the
domain of/. Clearly,/ is given by the two values /(x) = ±v"f=xI, and either
of these is a solution, and so / need not be unique. Given (xo, yo) such that
F(xo,Yo) = 0, we wish to know if we can find/(x) such that F(x,f(x)) = 0 and/
is differentiable and unique near (xo, yo). If xo "/; ± l, this is true if/ is taken to
be the appropriate square root. The given y 0 determines which square root must
be selected. See Figure 7.2-1. The points xo = ±1 are exceptional for several
reasons. First,/ is not differentiable there, and second, near xo = ±l,f could be
either square root, and so it is not uniquely determined. These exceptional points
are the places where 8F/8y = 0. Thus, in general, we want some condition like
aF/ fJy "/; 0 to guarantee that, locally at least, we can find a unique differentiable
/ such that F(x,f(x)) = 0.
In the general case, we consider a function F: Rn x Rm -+ Rm, and study
the relation F(x,y) = 0, or, written out,
=
=
r
The goal is to solve for the m unknowns
terms of x 1 , ••• ,Xn.
y,, ... ,Ym
from the m equations in
7.2.1 Implicit Function Theorem Let A c R 11 x Rm be an open set,
and let F : A ..... R"' be a function of class O' (that is, assume char F hasp
continuous derivatives, where p is an integer. p 2::: 1). Suppose (xo,Yo) EA and
398
Chapter 7 The Inverse and Implicit Function Theorems
y
X
FIGURE 7 .2-1
The circle implicitly defines two functions
F(xo, Yo) = 0. Form the determinant
Li=
evaluated at (xo, Yo), where F = (F1, ••• , Fm). Assume that Li ~ 0. Then there are
an open neighborhood UC IR.n of xo, an open neighborhood V ofy0 in 1Rm, and
a unique function f : U - V such that
F(x,f(x))
=0
for all x E U. Furthermore, f is of class CJ'.
This theorem will be proved using the inverse function theorem applied to
the map (x, y) - (x, F(x, y) ). The intuitive reason for the validity of the theorem
and the necessity for the restriction 11 ~ 0 should be clear from the example of
Figure 7 .2-1.
399
§7.2 Implicit Function Theorem
From the equation F(x,f(x)) = 0, one can determine DJ using the chain rule.
First, take the case m = 1. Then the chain rule yields
8
0 = -8 F(x,f(x))
X;
= -08FX; + -8F
8y
8f
~•
UX;
and so we get the important equation (notice the minus sign):
8/
8F/8x;
OX;= - oF/8y.
The reader is especially warned that, because of the minus sign, in the expression
oF/oxi
8Fjoy
it is incorrect to "cancel" the 8F's to obtain 8y/8x;. Thus, while such mnemonic
devices are sometimes useful, they do have limitations.
We can formulate the general solution analogous to Theorem 7.2.1.
7.2.2 Corollary In the implicit function theorem, the ofj / ox; are given by
where - I denotes the inverse matrix.
The proof is similar to the preceding case m = 1 and will be left as an
exercise.
7.2.3 Example Consider the following system of equations:
xu+-yv2 = 0
xv3 +y2u6 = 0.
=
=
Are they uniquely solvable for u, v in terms of x and y near x 1, y = -1, u 1,
'v = -1? Near x =0, y = 1, u = 0, v =0? Compute ou/8x at x = 1, y = -1 and
at x = 0, y = 1 if it exists.
400
Chapter 7 The Inverse and Implicit Function Theorems
Solution Write the equations as F(x,y, u, v) = 0, where F stands for the lefthand sides of the given equations. We want to see if we can solve for u(x,y),
v(x,y). Thus, we form
2yv
Li=
3xv2
I·
At x = 1, y = -1, u = 1, v = -1, we get
Li =
I! ; I= -9 ~ o.
and so there is a unique solution for u, v near this point. Differentiating the
given equations in x results in the pair of equations
U
and
au
av
+ X ax + 2yv ax = 0
~ OV
2 50U
3
v +3xv--+6yu -=0,
ax
ox
evaluating· at the selected point, gives the result
{)u
av
l+-+2- =0
ax
ax
OU
ax
av
ax
-1 +6-+3- =0.
=
=
Eliminating 8v/ax to solve for /Ju/ax gives au/ox = 5/9. At x 0, y
I,
u = 0, v = 0, we have Li = 0. Thus, the implicit function. theorem states that
we cannot expect to uniquely solve for u, v in terms of x and y in this case. To
actually determine solvability would require a direct analysis not provided by
the implicit function theorem.
•
Exercise_s for §7 .2
1.
Check directly near which points we can solve the equation F(x,y) =
y2 + y + 3x + I = 0 for y in terms of x.
2.
Check that your answer in Exercise 1 agrees with the answer you expect
from the implicit function theorem. Compute dy/dx.
§7.3 The Domain-Straightening Theorem
3.
401
In the system
3x+2y+z2 +u+v2 =0
4x+ 3y+z+ u2 +v+w+2 =0
X
+ Z + W + u2 + 2 = 0,
discuss the solvability for u, v, w in terms of x, y, z near x = y = z = 0, u =
v=0,w=-2.
4.
Does the map
(x,y) .,_.
xy )
x2-y2 •-2--2
( -2--2
X
+y
X
+y
have a local inverse near (0, 1)?
5.
Discuss the solvability of
y+x+uv=0
uxy+v=0
for u, v in terms of x, y near x = y = u = v = 0, and check directly.
§7 .3 The Domain-Straightening Theorem
We now give a consequence of the implicit function theorem that is an important
technical tool in the study of surfaces. This result states, roughly, that if f : A C
Rn --+ R has a nonzero derivative at a point xo, then in a neighborhood of x 0 ,
f can be "straightened out"; in fact, f can be defonned into the map that is
the projection onto the coordinate axis Xn by composing it with a "coordinate
change," which is (by definition) a smooth function that has a smooth inverse.
See Figure 7.3-1, where the coordinate change is denoted h, and in which h
straightens out the surfaces of constant/ into planes. The exact result is stated
in the next theorem.
7.3.1 Domain-Straightening Theorem Let A c IR.n be an open set,
and let f : A -+ IR. be a function of class 0: p 2'. I. let xo E A and suppose
f(Xo) = 0 and Df(xo) I- 0. Then there are an open set U, an open set V containing
Xo, and a function h : U--+ V of class 0: with inverse 1i- 1 : V --+ U of class 0:
such that
f(h(X1, . .. , Xn)) = Xn.
402
Chapter 7 The Inverse and Implicit Function Theorems
x,,
IR
A
h
f = constant
/• h = constant
FIGURE 7.3-1
A change of coordinates can straighten out the level sets off
This theorem has a generalization to functions f : A C !Rn --+ !Rm, m :'.5 n,
given in Exercise 18 at the end of the chapter.
The plausibility of the theorem is seen from Figure 7 .3-1. The function of
h is to twist things in a way so that the level surfaces off become planes of
dimension n - 1. The condition Df(xo) 'f 0 is used to guarantee that the surfaces
f = constant are "nondegenerate" or, intuitively, are smooth and of dimension
n - 1. An example will clarify this point.
7.3.2 Example letf(x,y) = :t2 +y2 • Can we "straighten out" f near (0,0)?
What about near other points?
Solution The answer to the first question is "no, not necessarily," because
Df(0,0) = 0. This is clear intuitively, because the surfaces of constant! degenerate at (0, 0) from being circles to being a point (Figure 7.3-2). Evidently, there
is no way we can defonn the surfaces f = constant near (0, 0) to planes. But we
can do this near any point (xo, Yo) 'f (0, 0). •
7 .3.3 Example let f(x, y) = x3 + x + y. Can f be "straightened out" near
(0, 0)?
Solution Yes, since D/(0,0) = (1, l) 'f 0. •
'§7.4 Further Consequences of the Implicit Function Theorem
403
y
/
FIGURE 7.3-2
f = constant
The level curves of x2 + y2 degenerate at (0, 0)
Exercises for §7.3
1.
Near what (x, y) can f(x, y) = x2
- y2 be "straightened out"?
2.
Sketch the graphs off = constant in Exercise l and explain your answer
geometrically.
3.
Can f(x, y) = x3 + y2 + 1 be straightened out near (0, O)? Near (0, I)?
§7 .4 Further Consequences of the Implicit
Function Theorem
The domain-straightening theorem says that we can find a function h that "straightens out" the domain off so that/oh is simply a projection. By analogy to this,
we can look for a function g that "straightens out" the range off so that g of
lo.oks like an "insertion."
7.4.1 Range-Straightening Theorem Let Ac JRP be an open set and
f : A --t ]Rn a function of class c; r ?: 1, and assume p s; n. Let xo E A and
suppose the rank of Df(xo) is p. Then there are open sets U and V in ]Rn with
f (xo) E U and a function g : U --t V of class C' with inverse g- 1 : V --t U
also of class C' and a neighborhood W of x 0 in ]RP such that g o f(x 1, ..• , Xp) =
(x,, ... ,Xp, 0, ... , 0) for all (x 1 , .•• ,xp) E W.
· • Chapter 7 The Inverse and Implicit Function Theorems
404
+
An intuitive illustration is given in Figure 7 l, which should be compared
with Figure 7.3-1. In the present case, the function g flattens out the image
off. Notice that this is intuitively correct; we expect the range off to be a pdimensional "surface," so that it should be possible to flatten it to a piece of JR.P.
Note that the range of a linear map of rank p is a linear subspace of dimension
exactly p, and so this result expresses, in a sense, a generalization of the linear
case.
g
FIGURE 7 .4-1
A change of variables can flatten the range off
To use the straightening theorems, we must have the rank of D/ equal to
the dimension of its image space (or the domain space). However, we can use
the inverse function theorem again to tell us that if D/(x) has constant rank
m in a neighborhood of xo, we can straighten out the domain off with some
invertible function h such that f o h depends only on x 1, ••• , Xrn, Then we can
straighten the range as well. This is the essence of the following theorem and
its corollary. Roughly speaking, it says that if D/ has rank m on !Rn, then n - m
variables are redundant and can be eliminated. For example, if f : JR 2 --+ JR 1,
f (x, y) = x - y, DJ has rank 1, and so we can express/ using just one variable;
let h(x,y) (x + y,y), so that/ o h(x,y) x, which depends only on x.
=
=
Note. Recall that the rank of a linear map is the dimens,ion of its image.
Equivalently, the rank is the size of the largest square submatrix with nonzero
determinant (consult your linear algebra text for details).
§Z.4 Further Consequences of the Implicit Function Theorem
405
7.4.2 Rank Theorem Letf: AC ntn-+ RN (where A is open in IRn) be a
er function such that Df(x) has rank m for all x in a neighborhood of Xo E A.
Then there are an open set U C IRn, an open set V C Rn with xo E V, and a
function h : U -+ V of class C' with inverse h- 1 : V -+ U of class er such
that f Oh depends only on X1, .. , ,-:,n• That is, (f O h)(X1, •, • ,Xm,Xm+I, · · · ,Xn) =
j(x1 , • •• , x,,.) for some C function f. See Figure 7.4-2.
N=3
m=l
11=2
,, - I
I'
, r· -i ·-_
h
A
.,
1
I ii .
, i:
I,
. I I;
I
I
', l _I i I_,'
I
- 1 - - - - - - - - - + - Xi - ~ - - - - - -
FIGURE 7.4-2
A change of variables can make a map of rank m depend on
only m vruiables
7.4.3 Corollary Letf: A C IRn-+ ]RN (where A is open in lRn) be a function
of class C, r ~ 1, such that Df(x) has rank m for all x in a neighborhood of
Xo E A. Then there are an open set U1 C IRn, an open set U2 C ntn with
xo E U2, an open set V. around f(xo), an open set V2 C IRN, and functions
h : U1 -+ U2 and g : V. -+ V2 of class er with inverses of class er such that
(g Of O h)(X1, • .. , Xn) = (X1, .. • , X11,, 0, ... , 0).
Some further applications of the implicit function theorem to surface theory and Lagrange multipliers (extremum problems with constraints) are given
Chapter 7 The Inverse and Implicit Function Theorems
406
in §7.7. Also, in the remaining sections of the chapter some (optional) topics
are treated by the same or similar methods.
7.4.4 Example Letf: IR2 -+ IR3, (x,y)- (x+y3,xy,y+y2). Can the range
off be "straightened out"' near (0,0)?
Solution Here we employ the range-straightening theorem. First, we compute the Jacobian matrix:
I
( y
0
3y2
)
X
'
1 +2y
which, at (0, 0), is
This matrix has rank 2 (since there is a 2 x 2 submatrix with nonzero determinant). Hence, the range-straightening theorem applies, and so we can straighten
out the range. [It will be, intuitively, a two-dimensional surface near (0, 0).]
•
7.4.S Example Let f: IR 2 -+ IR. f(x,y) =
function of only one variable near (0, 0)?
x2 + y. Can f be expressed as a
Solution Yes, since (by the rank theorem) D/(0, 0) = (0, 1) 'f 0; by continuity,
Df 'f O near (0, 0) as well. Thus, D/ has rank 1 near (0, 0). Note that this can
also be answered by using the domain-straightening theorem.
•
Exercises for §7.4
1.
Let/: R 2 -+ IR 3, (x,y) - (x+y2,xy,y2). Can the range be straightened
out near (0, 0)? Near (0, I)?
2.
What does the rank theorem say about the function f : JR 3 -+ IR 2 defined
by (x, y, z) - (x2 + 2y2 , z2 + 3xy) near (0, 0, 0)? Near (0, 1, 0)?
3.
What does Corollary 7.4.3 say about/ : IR3
12y,x+y3 +z3 ) near (0,0,0)?
4.
Examine the statement of Corollary 7.4.3 in the case where f is a linear
mapping.
-+
JR.3, (x, y, z) - (x + 2y, 6x +
§7.5 An Existence Theorem for Ordinary Differential Equations
401
§7.5 An Existence Theorem for Ordinary
Differential ~quations
In §5.7 we used the contraction mapping principle to begin our study of differential and integral equations. Here we consider differential equations a little
further. In calculus we learn how to solve simple linear differential equations
explicitly; for example, one learns that the solution to d2x/dt 2 + k2x = 0 is
x(t) = A cos(kt-w) for constants A and w. It is interesting to investigate whether
or not general differential equations always have solutions. The methods used
here are constructive and suitable for numerical computation; that is, a specific
sequence of approximating solutions is constructed; an example may clarify
matters.
7.5.1 Example Consider the nonlinear equation dx/dt =x2, x(0) =1. Can
we compute x(l)?
Solution In this case we can solve the equation explicitly by separating vari-
=
=
=
ables: Write dx/x2 dt and integrate, getting -1/x r+C, that is, x -1/(t+C).
At t 0, x 1, and so C
l. Thus, x 1/(1 - t) is our solution. This is
evidently the only solution starting at t 0, with x(O) 1. At t 1, the solution
x(t) "blows up," and so x(l) is not defined. Thus we cannot find a differentiable
solution x(t) defined for all t 2:: 0 (Figure 7.5-1).
•
=
=
=-
=
=
=
=
x(t)
I = x(O)
--+-------,
l=l
FIGURE 7.5-1
The solution to x'(t) =
tends to l
x2
through (0, 1) "blows up.. as r
408
Chapter 7 The Inverse and Implicit Function Theorems
This example points out the important fact that, in general, solutions x(t) may
be defined and differentiable only for a small t interval.
Another statement is important here. If we allow vector differential equations, then higher order equations may be reduced to first-order equations.
7.5.2 Example Write d2xjdt2 +lex= 0 as a first-order equation.
Solution We let y = dx/dt and write
{ dx/dt=y
dy/dt= -lex,
which 'is first-order in the vector (x,y) and is equivalent to the original equation.
•
We now give the main existence and uniqueness theorem. In the theorem,
we write B(x0 , r) for the closed ball of radius r about x0 , so that
B(Xo, r) = {y E lRn
I II.to - YII
$ r}.
7.5.3 Theorem Let f : [-a, a] x
Let C = sup{llf(r,x)II
K such that
I (t,x)
B(Xo, r) -+ ]Rn be a continuous mapping.
E [-a,a] x B(xo,r)}. Suppose there is a constant
lif(t,x) - /(t,y)il $ Kllx -
YII
(L)
for all t E [-a,a], x,y E B(xo,r). Let b < min{a,r/C, l/K}. Then there is a
unique continuously differentiable map x : [-b, b] -+ B(xo, r) C lRn such that
x(O) = xo (initial condition)
and ~; = f(t,x(t)).
The main condition on/ is the Lipschitz condition (L) stated in the theorem,
where the number K is called the Upschitz constant. When f meets the condition, we say that f is Upschitz in the variable x. To verify this condition, we
can often use the ideas we learned in Chapters 4, 5, and 6: If Dx f(t, x) denotes
the derivative off for fixed t and IIDx f(t,x)yll $ KIIYII for ally E lRn, then
f is Lipschitz, with Lipschitz constant K. For example, if n = l, this holds if
18/(t,x)/oxl $ K for -a $ t $ a, -r $ x - x0 $ r. We can see this by using
the chain rule as follows:
d
ds f(t,y + s(x - y)) = Dx f(t,y + s(x - y)) • (x - y),
§7.5 An Existence Theorem/or Ordinary Differential Equations
so that integrating between s
409
=0 and s =I, we have
f(t,x) - f(t,y) =
fo D.r
1
f(t,y
+ s(x - y)) · (x - y)ds.
Taking absolute values then yields the result. Note that if f is C1, such a K will
always exist (why?).
Often,/ is independent oft, in which case we say we have an autonomous
system. If/ is merely continuous, the existence (but not uniqueness) of x(t) is
true; see Exercise 45 at the end of Chapter 5.
The idea of the proof of 7.5.3 is to use successive approximations; start with
and write
fo f(s,x1(s))ds
X3(t) =.to+ fo f(s,x2(s))ds
1
x2(1) =xo +
1
Xn(I) = Xo +
fo f(S,Xn-1(s))ds.
1
Then one proves that Xn(t) converges to a function x(t) that satisfies
x(t)
= Xo +
fo f(s,x(s))ds
1
Since f(t) and x(t) are continuous, the fundamental theorem of calculus asserts that the right side of the last equation is a differentiable function of t
with derivative /(t,x(t)). Thus, the left side, x(t), must be differentiable and
x'(t) = f(t,x(t)). Since an integral from O to O is certainly 0, we evidently have
x(O) = Xo- If we compare this with §5.7 we see that we are seeking a fixed point
for the transformation of one function to another given by
(Ty)(t)
= Xo +
1
1
/(s,y(s))ds.
The iteration process we have outlined is called the Picard iteration process
and is the same as that used abstractly in the proof of the contraction mapping
principle. The Lipschitz condition and the choice of interval [-b, b] are used to
show that T is a contraction on the space C([-b, b]). ·
Chapter 7 The Inverse and Implicit Function Theorems
410
7.5.4 Example What value of b does Theorem 7.5.3 give for the problem
in Example 7.5.1?
Solution Here dx/dt = x2 and x(O) = 1. Let, for the moment, a, r be undeter~
mined. Now
C= sup{lf(t,x)I I -a$ t $ a, -r $ x- 1 $ r}
= sup{x2 I -r $ x - 1 $ r} = (r+ 1)2
(see Figure 7.5.2). Thus r/C= r/(r+ 1)2 • Also, 8f/8x= 2x, and so
K
=sup{2lxl 1-r $ x -
1 $ r}
=2(r+ I).
(1 +,)2 - - - - - - - -
Applying Theorem 7.5.3 to the case f(x, t) = x2
FIGURE 7.5-2
Since a is not involved in the estimates for C and K, we choose a large
enough that it does not interfere, say, a= 100. By the theorem, we choose
b
< min { (r+rI)2'
2 (r+
1 1) } .
This works for any r > 0. The two are both equal to ¼if r = 1. If O < r < 1,
the first is smaller and increases from O at r 0 to ¼ at r 1. If r > 1, the
second is smaller and decreases from ¼ as r increases from l. Thus the best
we can do from this is to set r = 1 and conclude that the time of existence is at
least as great as any b < ¼. This is not as good as we found directly (a time of
existence < l), but we can reapply· the theorem to get a new time of existence
at t ¼and work our way to any t < 1. But we can never go past t 1.
•
=
=
=
=
411
§7.6 The Morse Lemma
Exercises for §7.S
=
=
1.
Solve dx/dt 1 +x1,x(O) 0 by the method of successive approximations.
Is x(t) defined for all t ~ O?
2.
Compute b from Theorem 7.5.3 for Exercise I.
3.
Show that dx/ dt = ,Ix, x(O) = 0 has two solutions:
x(t) = 0 for all t
and
x(t)
={
0
r1-/4,
t~ 0
t > 0.
Does this contradict Theorem 7.5.3?
4.
Consider the equation dx/dt = te; sinx, x(O) = 1. Obtain an estimate on
how long we can define the solution x(t).
5.
Let A be an n x n matrix, and consider the linear system
dx
dt
a.
= A · x(t),
x(t) E
Rn.
Show that a solution is
00
x(t) = e'A x(O),
n=()
b.
B"
where e8 = ~ , ·
n.
The time of existence here extends for all t; can this conclusion also
be derived from Theorem 7.5.3?
§7.6 The Morse Lemma
In Chapter 6 we saw that the Hessian of a function/: Rn-+ Rat a critical point
determined the local behavior of/ near this point. The Morse lemma carries this
result one step further. It states, for example, that if/ has a nondegenerate local
minimum at xo, then not only does f look like a paraboloid, but we can change
the coordinates so that/ really "is" a paraboloid in the new coordinates. The
"lemma" (it really is a theorem) also applies to saddle surfaces.
The Morse lemma is fundamental in more advanced work in topology and
analysis, but even here it helps us better understand the shape of functions near
a critical point.
Chapter 7 The Inverse and Implicit Function Theorems
412
7.6.1 Morse Lemma Let A c Rn be open and f : A - JR a smooth
(infinitely differentiable) function. Suppose Df(xo) = 0 and the Hessian off at
xo is nonsingular. Then there are a neighborhood U of Xo and a neighborhood
V of O in ]Rn and a smooth map g : V - U with a smooth inverse such that
f o g = h has the form
h(y) = f(xo) - [y~ + Y~ + · · · +Yi]+ [Yi.1 + · · · + Y~].
where .X is some fixed integer between O and n.
One calls g a change of coordinates, and we speak of y = g- 1(x) as the new
coordinates.
A critical point at which the Hessian matrix .1 = 82f / 8x;8xi is nonsingular
is called a nondegenerate critical point. Thus the Morse lemma gives a rather
complete description of functions in the neighborhood of a nondegenerate critical
point. The number .X is called the index of the critical point. Figure 7 .6-1
+ Y1+ 1 + · · · + y~ for
illustrates the graphs of the quadratic fonns
various indices in JR 2 •
-yr - ···- Yi
(a)
FIGURE 7.6-1
(b)
(c)
(a) Index= 0; (b) index= 1; (c) index= 2
For functions of two variables, it is easy to detennine the index: If .1 is
negative definite, the index is 2; if .1 is positive definite, the index is zero, and
otherwise it is I. In general, to find the index, one needs to know a little more
linear algebra. The knowledgeable reader can check that the index is exactly
the number of negative eigenvalues of .1.
7 .6.2 Example What is· the shape of the surface z = x2 + 2xy + 2y2 + y3 near
(0,0)?
413
Exercises for §7.6
Solution We have a critical point at (0, 0), and the Hessian is
L\
= [a:;ixJ = [ ; ; ] ·
which is positive definite, since 2 > 0 and det(L\) = (2)(4) - (2)(2) > 0. Thus,
the index is 0, and near (0, 0) the surface is approximately a paraboloid and in
some other coordinate system is exactly a paraboloid. •
7.6.3 Example
Compute the index of x2
-
3xy + y2 + 8xy2 + 6 at (0, 0).
Solution (0, 0) is a critical point, and the Hessian is
L\=[; ~]Since 2 > 0 and detL\ = -5 < 0, this matrix is neither positive definite nor
negative definite. Thus, we have index 1 and hence a saddle point at (0, 0). •
Exercises for §7.6
1.
Compute the index of 2x2 + 6xy - y2 + y4 at (0, 0).
2.
What is the shape of the surface x2 + 3xy - y2 at (0, 0)?
3.
Does the Morse lemma apply to
4.
Letf(x,y) = +y2 + 3y3 +8x4 +x2exsinx+6. Show that there exist new
coordinates(, 7/, where
r - 2xy + y2?
r
( =((x,y),
for which
f(x,y) =
1J
=17(x,y),
e+11 +6
2
in a neighborhood of (0, 0).
5.
a.
If f has a nondegenerate critical point at Xo E IR.n, show that there is
a neighborhood of xo containing no other critical points.
b.
What are the critical points of the functionf(x,y) = x2y2?
414
Chapter 7 The Inverse and Implicit Function Theorems
§7.7 Constrained Extrema and
Lagrange Multipliers
In some problems, one is asked to maximize a function subject to certain constraints or side conditions. Such situations arise, for example, in economics:
Suppose we are selling two kinds of goods, say, I and II; let x and y represent
the quantity of each sold. Let/(x,y) represent the profit we earn when'x amount
of I and y amount of II is sold. Since our production is limited by' our capital, we
are constrained to work subject to a relation, say, g(x,y) = 0. Thus we want to
maximize/(x,y) among those x,y satisfying g(x,y) = 0. The condition g(x,y) = 0
is called the constraint in the problem. The purpose of this section is to discuss
some methods that will enable us to handle this and similar situations.
7.7 .1 Lagrange Multiplier Theorem Let f : U c Rn
--+
R and g :
UC Rn--+ R be given C 1 functions. Let xo E U, g(Xo) =co, and let S =g- 1(co),
the level set for g with value co. Assume V g(Xo) 'F 0. If /IS, which denotes f
restricted to S (that is, to tlwse x E U satisfying g(x) = c0 ), has a maximum or
minimum at Xo, then there is a real number A such that
Vf(xo) = .\Vg(xo).
The idea of the proof is as follows. Recall that the tangent space to S at
defined as the space orthogonal to V g(xo) (see §6.6). We motivated this
definition by considering tangents to paths c(t) that lie in S, as follows: If c(t)
is a path in S with c(O) = .xo, then c'(O) is a tangent vector to Sat x0 , since
.xo is
d
d
dtg(c(t)) = dt co = 0,
and, on the other hand, by the chain rule,
d
I,=0 = V g(xo) · c'(0),
dtg(c(t))
and so c' (0) is orthogonal to V g(~).
If /IS has a maximum at xo, then certainly f(c(t)) has a maximum at t = 0.
Hence
d
0 df(c(t))lt=<> V/(xo) · c'(O).
=
=
Thus V/(xo) is also perpendicular to the tangent space to Sat Xo, and so V/(x0 )
and V g(.xo) are parallel. Since V g(xo) 1 0, it follows that V/(xo) is a multiple
of V g(x0 ), which is exactly the conclusion of the theorell!,The following corollary extracts some useful geometry from this proof.
§7.7 Constrained Extrema and Lagrange Multipliers
415
7.7.2 Corollary If f, when constrained to a surface S, has a maximum or
minimum at Xo, then VJ(xo) is perpendicular to Sat xo (see Figure 7.7-1).
X
FIGURE 7.7-1
The geometry of constrained extrema
These results tell us that to find the constrained extrema off, we should
look among those x0 satisfying the conclusions of the theorem or the corollary.
We shall give several illustrations of how to use each. When the method in
Theorem 7.7.1 is used, we must look for a point xo and a constant A, called a
Lagrange multiplier, such that Vf(x0 ) = AV g(.xo). This method is more analytic
in nature, while the method of Corollary 7.7.2 is more geometric. In either case,
the Lagrange multiplier techniques are most effective when used in conjunction
with, rather than in place of, the intuition gained in the study of elementary
calculus.
For a second-derivative test for constrained extrema, we refer fue reader to
Vector Calculus, 3d ed., by Marsden and Tromba (W. H. Freeman, New York,
1988).
Example 7.7.3 Let S c IR2 be a line through (-1, 0) inclined at 45~ and
let f: IR 2 --+ R, (x, t) 1-+ x2 + t2 . Find the minimum off on S.
=
=
Solution Here S {(x,t) I t-x-1 O}, and so we choose g(x, t) = t-x-1.
The extrema of /IS must be found among the points at which VJ is orthogonal
to S, that is, is inclined at -45°. But Vf(x, t) = (2x, 2t) and has the desired
416
Chapter 7 The Inverse and Implicit Function Theorems
slope whenever x = -t, i.e., (x,t) lies in the line L through the origin inclined at
-45°. This can occur for a point (x, t) lying in the set S only for the single point
at which L and S intersect (see Figure 7.7-2). Reference to the level curves of
f indicates that this point, (-1 /2, 1/2), is a relative minimum of /IS (but not
off).
•
y
X
FIGURE 7.7-2 · Locating the critical points of the restriction of/ to S
-+ JR be defined by (x,y) H x2 - y'-, and let S
be the circle of radius l centered at the origin. Find the critical points off
restricted to S.
7.7.4 Example Let f: JR. 2
-+ JR, (x,y) H x2 + y2. The level
curves, tangent spaces, and gradients are shown in Figure 7.7-3. Clearly, the
gradient off is orthogonal to S at the four points (0, ± 1), (± 1, 0), which are
relative minima and maxima, respectively, of /IS.
This problem can be performed apalytically by the method of Lagrange
multipliers. Clearly,
Solution Here S = g- 1(1), where g : JR2
'vf(x,y) =
of 8y
of) = (2x, -2y)
( 8x'
'vg(x,y) = (2x, 2y).
§7.7 Constrained Extrema and Lagrange Multipliers
417
Thus, we seek points (x,y) with x2 + y2 = 1 for which there is a >. satisfying
(2.x, -2y) = >.(2x,2y). These give us the three equations 2x = >.-2.x, -2y = >.-2y,
and x2 + y2 = 1, which can be solved for the three unknowns x,y, and>.. From
2x
>.2x, we conclude either x 0 or >.
1. If x
0, then y
± I and
- 2y >.2y implies >. -1. If >. I, then y 0 and x = ± 1. Thus we get the
same points (0, ± 1), (± 1, 0) as before. As we have mentioned, the method only
locates potential extrem1; whether they are maxima or minima or neither must
be determined by inspection or some other means. •
=
=
FIGURE 7.7-3
=
=
=
=
=
=
=
The hyperbolas are tangent to Sat the critical points of/IS
If the surface S is defihed by a number of constraints,
g1(X1,, .. ,Xn)=C1
g2(X1, ·,, ,Xn) = C2
(so far we just had one g), then Theorem 7.7.l may be generalized as follows. If
/ has a maximum or minimum at xo on S and 'vg, (xo), ... , 'vgk(Xo) are linearly
Chapter 7 The Inverse and Implicit Function Theorems
418
independent, then there are constants ,\ 1, ... , ..\t such that
VJ(xo) = ..\ 1V 81 (Xo) + · · · + ..\t V 8t(Xo).
This may be proved by generalizing the method used to prove Theorem 7.7.1
and is left to the interested reader. Let us now give an example of how this
more general formulation may be used.
7.7.5 Example Find the extreme points off(x,y,z) = x+y+z subject to the
conditions x'- + -y2 = 2 and x + z = I.
Solution
Here there are two constraints,
81(x,y,z) = x2 + y2
-
2 =0
and
g2(x,y,z) =_x+z - 1 = 0.
Thus, we must find x,y,z and ..\1 and ..\2 such that
and
g1(x,y,z) = 0
g2(x,y, z) = 0;
that is, we must solve
1 = ..\1 . 2x + ..\2 • 1
1 = ..\1 . 2y + ..\2. 0
I = ..\1 • 0 + ..\2 · I
and
x2+y2=2
x+z=l.
These are five equations for x,y, z, ..\1, and ..\2. From the third, ..\2 = 1, and
so 2x..\1 = 0, 2y..\1 = 1. Since the second implies ,\ 1 -,. 0, we have x = 0.
Thus y = ±/2 and z = I. Hence, our points are (0, ±./2, 1). The constraints
x'- + y2 = 2 and x + z = 1 together define a compact set -in JR3, the ellipse
formed by intersecting a cylinder and a plane. The continuous function/ must
attain a maximum and a minimum on it, and these must occur among the points
we have found. Therefore, /(0, /2, 1) == 1 + ./2 must be the maximum and
f (0, - /2, 1) = 1 - ./2 must be the minimum.
•
§7.7 Constrained Extrema and Lagrange Multipliers
419
7.7.6 Example Maximizef(x,y,z) = x+z subject to the constraint x2 +y2 +
z2 = 1.
Solution Here we use Theorem 7.7.1. We seek A and (x,y,z) such that
1 = 2.tA
0=2yA
1 = 2zA
and
x2+y2+z2=1.
From the first equation, A 'I- 0, and so the second gives y = 0, and the first
and third give z = x. The fourth becomes x2 + x2 = 1, so x = ± 1/ ../2.
Our points are (1 / ../2, 0, 1/ ../2) and (-1 / ../2, 0, -1 / ../2). The constraint defines a compact set (the unit sphere in IR 3) on which the continuous function f
must attain a maximum and a minimum. These must occur among our points,
so we compare values: /(l/../2,0, 1/../2) = ../2 must be the maximum, and
/( -1 / ../2, 0, -1 / ../2) = - ../2 must be the minimum. •
7.7.7 Example Find the largest volume a rectangular box can have subject
to the constraint that the surface area be fixed at JO square meters.
Solution Here, if x, y, z are the lengths of the sides, the volume is f (x, y, z) =
xyz. The constraint is that 2(xy + xz + yz) = IO, that is, xy + xz + yz = 5. Thus, our
conditions are
yz = A(y +z)
xz = A(x+z)
xy = A(y +x)
xy + xz + yz = 5.
First of an, x 'I- 0, for x = 0 implies yz = 5 and O = Az, so that A = 0 and
yz = 0. Similarly, y 'F 0, z 'F 0, and x + y 'F 0, and so forth. Elimination of A
from the first two equations gives yz/(y + z) = xz/(x + z), which gives x = y.
Similarly, y = z. Using the last equation, 3x2 = 5, and so x = .js/3. Thus
x = y z .js/3, and so xyz (5 /3)312 • This is the solution. It should be
geometrically clear that the maximum occurs when x = y = z. •
= =
=
Chapter 7 The Inverse and Implicit Function Theorems
420
Exercises for §7.7
In Exercises l through 5, find the extrema off subject to the stated constraints.
x2 +y2 +z2 = 2.
1.
f(x,y,z) =x-y+z,
2.
f(X, y)
3.
f(x,y)=x,x2+2y2=3.
4.
f (x, y, z)
5.
f(x,y) = 3x + 2y, 2x2 + 3y2 = 3.
6.
Supranational Sludge Corporation produces sludge using equipment and
material costing p = $243 per unit and labor at a wage of w = $16 per
hour. If x units of equipment/material and y hours of labor are used, then
20x3l4y 1l 4 liters of sludge are produced. If the company has a budget of
B = $51,840,000 to spend, find the maximum amount of sludge that can
be produced and the amounts of equipment/material and of labor used to
produce it.
=X -
y, x2 -
y2
=2.
=x + y + z, x2 - y2 = 1, 2x + z =l.
Theorem Proofs for Chapter 7
7.1.1 Inverse Function Theorem Let A c !Rn be an open set and let
f : A c IR" --+ IR" be of class C1 (that is, assume that DJ exists and is continuous).
Let .xo E A and suppose Jf(xo) 'F 0. Then there exist a neighborhood U of .xo in
A and an open neighborhood W of f(xo) such that f(U) = Wand the restriction
off to Uhas a C1 inverse/-•: W--+ U. Moreover.for y E Wand x =f- 1(y),
we have
of- 1<y> = [DJ(x)r 1 •
the inverse of Df(x), meaning the inverse as a linear mapping (corresponding
to the inverse matrix). If f is of class O', p ~ 1, then so is f- 1•
Proof The proof of the inverse function theorem is not especially easy in its
technical details, but this theorem represents a cornerstone of analysis and so it
should be mastered. We are presenting the proof in !Rn, but the theorem actually
works with very similar ideas in a Banach space (see Abraham, Marsden, and
Ratiu, Manifolds, Tensor Analysis and Applications, Springer-Verlag, New York,
1988, for details). The proof will be based on the contraction mapping principle;
this technique is useful, since it is applicable to many situations. We use the
following special case.
Theorem Proofs for Chapter 7
421
Lemma 1 Let M be a closed subset of !Rn and d the distance function on
JR~ Let f be a mapping of M into M. Assume that there is a constant K, where
.0 < K < I, such that for any two points x and y in M, we have d(J(x),f(y)) ~
Kd(x,y). Then there exists a unique x. EM such that f(x.) = x. (x. is called a
fixed point off).
Before beginning the proof of the inverse function theorem, it is helpful to
have a technical lemma about the set of invertible linear maps (or, equivalently,
the set of invertible matrices). An m x n matrix (or a linear map from JR" -+ !Rm)
is simply an mn-tuple of real numbers, since a matrix A with entries (au) can
be regarded as an mn-tuple (au, ... , a,n, a21, ••• , am1, .. . , amn)- It thus makes
sense to say that a certain subset of the set of all matrices is open or that a map
from the set of m x n matrices to the set of p x q matrices is differentiable.
Let L(]Rn,lRn) denote the set of all n x n matrices (or linear maps from lR" to
!Rn), and let GL(lRn, !Rn) denote the set of all invertible matrices (or invertible
linear maps from !Rn to !Rn), which is called the general linear group. Thus,
GL(lRn,lRn) = {A E L(]Rn,lRn) I detA ~ O}. Let c- 1 : GL(lRn,lRn)-+ GL(lRn,lRn)
denote the map that takes an invertible matrix A to its inverse A- 1•
Lemma 2
i.
GL(lRn, !Rn) is an open subset of L(lRn, lR").
ii.
c- 1 is a COO mapping.
Proof
i.
The determinant mapping det: Rn x • • • x !Rn (n times)-+ JR is an n-linear
map. (Recall that the determinant is linear in the rows.) Thus, det may be
viewed as a polynomial in n 2 variables, and so it is continuous and even
of class COO. Because the set {O} is closed, dec 1({0}) is closed. Hence,
L(lRn,lRn) \ deC 1({0}) is open. But L(lRn,lR") \ deC 1({0}) is the set of
all those n x n matrices with nonzero determinant, and these are exactly
all the invertible matrices GL(lRn, JR").
ii.
From the explicit expression for the inverse of a matrix, we see that
c- 1 is COO. Indeed, the expression for the inverse of the matrix A is
A- 1 = (det A)- 1 adj A, where adj A is a matrix such that (adj A)u =
(- ti•i detA(jli), where det A(jli) denotes the determinant of the matrix obtained from A by deleting the jth row and the ith column. Since
(detAr 1 is a differentiable function of A, we need only show that the
' mapping adj : L(lRn, !Rn) -+ L(lRn, !Rn), which takes a matrix to its adjoint,
Chapter 7 The Inverse and Implicit Function Theorems
422
is C°°. Regarded as a function from 1Rn2 to 1Rn2 , the adjoint is simply an
n2-tuple of functions like (adj A)ij = (- l)i+j detA(jli). As we have mentioned, a multilinear map from 1Rn1 x • • • x )Rnt to Rm is C°°. Thus, each of
the n2 component functions of adj is COO. Hence, the adj map is COO. "f
Proof of the Inverse Function Theorem For clarity, we break up the
proof into a number of steps.
Step 1 Simplification to a special case.
We prove the theorem when D/(.xo) is the identity transfonnation. To see
that this is sufficient to prove the general case, let T = Df(.xo). Then T- 1 exists
and is linear, so D(T- 1)(/(x0 )) = r- 1, and by the chain rule,
D(T- 1 of)(xo) = D(T- 1)(/(xo)) o D/(Xo) = T- 1 o D/(Xo)
= identity transfonnation.
If the theorem is true for T- 1 of, then the theorem is also true for/. Indeed,
if g is an inverse for r- 1 o/, the inverse for f will be g o r- 1•
We can make one further simplifying assumption, namely, that .xo = 0 and
/(x0 ) = 0. To see this, let us suppose we have proved the theorem for the
special case xo = 0 and /(xo) = 0. To prove the general case from this, let
h(x) f(x + xo) - f (xo). Then h(O) 0 and Dh(O) Df(xo), so that Dh(O) is
invertible. If h has an inverse near x = 0, then the required inverse for f near x0
is given by
=
=
=
In summary, step I demonstrates that it is sufficient to prove the theorem
under the assumptions that xo = 0, f(xo) = 0, and D/(0) is the identity. These
will be assumed in the remainder of the discussion.
Step 2 Application of the contraction mapping principle to get a local inverse.
Keeping the preceding remarks in mind, what we would like are two neighborhoods of O such that given any y from the first neighborhood of 0, there is a
unique x from the second neighborhood such.that/(x).= y. To show this, consider
the function gy defined by gy(X) = y + x ~f(x). If for some closed neighborhood
of zero this is a contraction mapping, then it has a unique fixed point, say, x,
and so x = y + x - f(x), or x is the unique point belonging to the neighborhood
such that /(x) = y. To construct this neighborhood, define g(x) = x - f (x); then
Dg(O) = 0. Assume g to be of class Cl', witfi p ?: I. This means in particular that
423
Theorem Proofs for Chapter 7
Dg is a continuous function, and so, by continuity at 0, there exists an r > 0
such that llxll < r implies IIDg;(x)II < 1/2n, where g = (g1, ... ,gn). By the
mean value theorem, given x E D(O, r), there are points c1, c2, ... , Cn in D(O, r)
such that g;(x) g;(x) - g;(O) Dg;(c;)(x - 0) Dg;(c;)(x). Therefore,
=
llg(x)II $
t
i=l
=
lg;(x)I =
t
IDg;(c;)(x)I $
i=l
=
t
IIDg;(c;)II llxll
< ll~II <
i,
, i=l
using the Cauchy-Schwarz inequality on Dg;(c;)(x) = ('vg;(c;),x).
This shows that g maps the closed r ball B(O, r) into the closed r/2 ball
B(O, r/2). Now let y be any member of B(O, r/2). The mapping gy takes B(O, r)
into B(O, r); to see this, note that IIYI I $ r /2 and x E B(O, r) implies
llgy(x)II = IIY + g(x)II $ IIYII + llg<x>II
< r/2 + r/2 = r.
Let x 1 and x 2 be any two points in B(O, r). Then llgy(xi) - gy(x2)II = llg(x1) g(x2)II, and by the mean value theorem, as before, llg(x1)- g(x2)1l $ (1/2)llx1 x2II, and so g1 is a contraction map (with constant K = 1/2). The contraction
mapping principle implies that there is a unique fixed point x E B(O, r) for g1 ,
and as we observed before, this implies/(x) = y. This means that/ has an inverse
1- 1 : B(O, r/2) C JR.ff-+ B(O, r) C JR.ff.
Step 3 The inverse is continuous.
Let x 1 and x2 E B(O, r). Recalling the definition of g, we get
and hence llx1 - x2II $ 211/(xi) - /(x2)1i. Thus, if Y1 and Y2 E B(O, 1/2), then
ll/- 1 (y1)-/- 1(y2)II $ 2IIY1 -Y2II, and so/- 1 is continuous.
Step 4 For suitably small r, the inverse is differentiable on D(O, r/2).
We were given that D/(0) is invertible and that D/ : A c JR.ff --+ IR.n1 is
continuous, and we have shown that GL(JR.ff,JR.ff) is open in L(IR.n, JR.ff). These
facts show that for all x in some neighborhood around 0, _[D/(x)r 1 exists. If
this neighborhood does not contain D(O, r /2), r is restricted further until this
is the case. Hence, we can assume that [D/(x)]- 1 exists for all x E D(O, r/2).
Moreover, we can assume IICD/(x)]- 1yll $ MIIYII for all x E D(O,r/2) and
y E ntn, by continuity of [D/(x)]- 1 (see Worked Example 4.4, Chapter 4).
Chapter 7 The Inverse and Implicit Function Theorems
424
For Y1 and Y2 E D(0,r/2), x1
ll/- 1(y,) -
1- 1(y2) -
=/- 1(y,) and x2 =/- 1(y2),
[D/(x2)J- 1 • (y1 - Y2)II
IIY1 -Y2II
_ llx1 - X2 - [D/(x2)r 1 '(/(x,)- /(x2))II
11/(x,) - / (x2)I I
_ [ llx1 - x2II ] ll{[D/(x2)]- 1}{D/(x2)(x1 -x2) - (/(x1) - /(x2))}II
- llf(x1) - /(x2)II
llx, - x2II
.
Using llx1 - x2II :5 211/(xi) - /(x2)II and IID/(x2)- 1yll :5 Mllyll shows that this
expression is
< 2MIID/(x2)(x1 - x2) - (/(x,) - /(x2))11 _
llx, - x2II
The last expression has a limit zero as llx1 - x2II -+ 0, by the differentiability of
/ at x2 • This shows that/- 1 is differentiable at y 2 with derivative [D/(x2)J- 1 =
[D/(/-l(y2))J-1.
In the theorem, we set W D(0, r/2) and U 1- 1(W), both open sets.
=
Step S 1-
1:
=
D(0, r/2)-+ IRn is of class O'.
From step 4, it follows that the inverse mapping 1- 1 : D(0, r/2) -+ !Rn is
differentiable on D(0, r/2) and that D/- 1(y) = [D/(/- 1(y))]- 1 • We have also
shown that/- 1 : D(0, r/2)-+ IRn is continuous: DJ is continuous by assumption,
and the inversion mapping from GL(!Rn, IRn) (the set of ipvertible linear maps
from Rn to IRn) to GL(Rn, IRn) is continuous and, in fact, COO by lemma 2. This
implies that 01- 1 is a continuous map from D(0, r/2) into L(IRn, IRn). Hence,
1- 1 is of class C1•
Again look at D/- 1(y) = [D/(/- 1(y))r', and observe that since/- 1 is of
class C1, D/ is of class 0- 1, and since the inversion is C°", 01- 1 is of class
C 1• Hence, 1- 1 is of class C1. Continuing in this way by induction, we finally
conclude that 1- 1 is of class O'. •
7.2.1 Implicit Function Theorem Let A c ntn x IRm be an· open set
and let F : A -+ IR"' be a function of class O' (that is, assume that F has p
continuous derivatives, where p is an integer. p 2:'. l). Suppose (xo,Yo) EA and
F(xo,Yo) = 0. Form the determinant
8F1
8y1
8F1
8y,n
8F,,,
8y1
8Fm
8y,,,
~=
Theorem Proofs for Chapter 7
425
evaluated at (xo, yo), where F = (F1, . .. , Fn,). Assume that A ~ 0. Then there
are an open neighborhood U C Rn of Xo, a neighborhood V of Yo in Rm, and a
unique function f : U -+ V such that
F(x.f(x)) = 0
for all x E U. Furthermore,/ is of class O'.
Proof Define the function G : A C Rn x Rm -+ Rn x Rm by G(x,y) =
(x, F(x,y)). For x near xo we want to find y = f(x) making F(x,y) = 0. The idea
is to show that G is locally invertible near (xo, Yo) and to take /(x) detennined
by (x,f(x)) = G- 1(x, 0). Since F is of class O' and the identity mapping is of
class COO, it follows that G is of class O'. The matrix of partial derivatives of
G (Jacobian matrix) is
1
0
0
0
0
0
I
0
0
8F1
aF,
OXt
OXn
8F1
ay,
aF1
oym
EJF,,,
aF"'
8Fm
8y1
oFm
ay"'
0
I
ox,
OXn
The detenninant of this matrix evaluated at (xo, yo) is equal to
8F1
8y1
8F1
oym
8Fm
ay,
8Fm
ay,n
.:i=
evaluated at (xo,Yo). By hypothesis, JG(xo,Yo) 'f 0, and thus, by the inverse
function theorem, there are an open set W containing (Xo, 0) and an open set S
containing (xo,Yo) such that G(S) = Wand G has a O' inverse G- 1 : W-+ S.
From the definition of an open set, there are open sets U c Rn .and V c Rm
with Xo E U and Yo E V such that U X V C S. Let G(U x V) = Y C W.
Thus G : U x V -+ Y is a O' diffeomorphism (this means that G is of class
O' and has inverse G- 1 : Y -+ U x V, also of class O'). Now G- 1 is of the
fonn G- 1(x, w) = (x, H(x, w)), where H is a O' function from Y to V, since
426
Chapter 7 The Inverse and Implicit Function Theorems
G is of this fonn, as is easy to see. Let 1r : !Rn x !Rm --+ !Rm be defined by
1r(x,y) = y, so that F(x,H(x, w)) 1r o G(x,H(x, w)) = 1r o Go G- 1(x, w) w.
Because G- 1 is of the form G- 1(x, w) = (x,H(x, w)), x e U if (x, w) E Y. Define
f : U ---+ V by f(x) = H(x, 0). Since F(x, H(x, w)) = w, we get F(x,f(x)) = 0, and
since H is of class O', f is also of class O'. By the inverse function theorem,
H(x, w) is uniquely detennined. Since f must be given by H(x, 0), f is unique
as well.
=
=
•
7.3.1 Domain-Straightening Theorem Let A c !Rn be an open set,
and let f : A --+ JR be a function of class 0, p ~ 1. Let Xo E A and suppose
f (xo) = 0 and Df(xo) '# 0. Then there a re an open set U, an open set V containing
xo, and a function h : U--+ V of class 0, with inverse h- 1 : V---+ U of class 0,
such that
f(h(X1, . .. , Xn)) = Xn.
Proof Since D/(xo) '# 0, there exists some i such that (of/ ax;)(Xo) 'F 0.
Define g: Rn--+ !Rn by (X1,.,. ,Xn) - (X1, ... ,Xi-1,Xn,Xi+I, .. , ,Xn-1,X;). The
permutation map g is linear and hence of class COO, and because/ is of class
O', the chain rule shows that/ o g is of class O'. Thus (8(/ o g)/8xn)(g- 1)(x0) =
af(xo) / ax; '# 0, which implies that fog is a function of the type described in the
hypotheses of the implicit function theorem with m = I. As in the proof of 7.2.1,
ifwe define G: AC ]Rn-Ix JR_. Rn-I x!R by G(x,y) = (x,fog(x,y)), there are
1 , 0) E U (where
open sets W C !Rn and U C !Rn with xo E W and (xA, ... ,
1
x0 = (xA, ... , .to)) such that G : W ---+ U has an in·verse
: U _. W of class
1(x) = (x1, ... ,Xn) = Xn, where
1cx1, .•• ,Xn) = (7r O G) 0
O'. Now,(/ 0 g) 0
1r : !Rn- I x JR ---+ JR is the projection on the last coordinate. Define V = g(W)
1• Then h is a C1' function with C1' inverse, since
and h : U -. V by h = g o
1
both g and
have this property, andf(h(x1, ... ,xn)) = Xn-
a-
a-
a-
x;;a-
c-
•
It is possible to prove a more general theorem than 7.3.1, using a similar
technjque. That is, if f: A C !Rn ---+ !Rm, n ~ m, and D/(.xo) as a linear map has
rank m, then/ can locally be made to look like a projection on the last m factors
by composing it after a smooth function with smooth inverse. In Exercise 18 at
the end of this chapter, we state this exactly and give a hint for the proof. Note
that here the range has dimension less than or equal to that of the domain. In
the following theorem, the opposite is the case.
7.4.1 Range-Straightening Theorem Let A c JRP be an open set and
f : A --+ !Rn a function of class C~ r
~
1, and assume p
~
n. Let x0 E A and
Theorem Proofs for Chapter 7
427
suppose the rank of Df(Xo) is p. Then there are open sets U and V in !Rn with
f(Xo) E U and a function g : U --+ V of class er with inverse g- 1 : V --+ U
also of class er and a neighborhood W of Xo in ]RP such that g of(xi, ... , Xp) =
(x 1, ••• ,xp,O, ... ,O)forall (x 1 , ••• ,xp) E W.
Proof Since Df(xo) has rank p, some p x p submatrix of Df(x0) has a nonzero
determinant. By relabeling, if necessary, we may assume that
~
8fP
8fP
8x1
8xp
where f = (/ 1, ••• ,f'). Define <p : A x IRn-p
the matrix of D<p is
8/1
--+
JR" by <p(x, y) =f(x) + (0, y). Then
0
8x1
{}JP
8xp
8jP+I
8xp
8f'
8xp
o,
...
0
0
0
1
0
0
1
and so
8J'
8x1
8J'
8xp
~o.
J<p(Xo, 0) =
cJfP
8x 1
cJfP,
8xp
By the inverse function theorem, there are an open set U around f(xo), an
open set V around (xo, 0), and a function g : U --+ V of class er such that
g = ¢- 1• Then g(f(x)) = g(f(x)) + (0, 0) = (x, 0), as desired.
•
428
Chapter 7 The Inverse and Implicit Function' Theorems
7.4.2 Rank Theorem Let f: A c !Rn-+ ]RN (where A is open in !Rn) be·a
C' function such that Df(x) has rank m for all x in a neighborhood of Xo E A.
Then there are an open set UC !Rn, an open set V C !Rn with xo E V, and a
function h : U -+ V of class C' with inverse h-J : V -+ U of class C' such
that j O Ji depends only on Xi, ... ,Xm- That is, (f O h)(x1, ... ,Xm,X,n+J,. •. ,Xn) =
!<x1, ... , Xm) for some C' function l
Proof Let No be the kernel of D/(xo); that is,. let No = {y E !Rn I D/(xo) · y =
O} (a subspace of !Rn of dimension n - m) and let M be an m-dimensional
complement of No in !Rn, that is, let Mn No = {O} and {x + y I x E M, y E
No}= !Rn. Let CJ,••. ,Cm be a basis for Mand Cm+J•··· ,Cn l:>e a basis for No.
Each x E !Rn can be written uniquely as x = t/11 (x)c1 + · · · + 1/Jn(x)cn. Define
G(x) = (0, ... , 0, 1/1111+1 (x), ... , 1/Jn(x)). Then G is linear and hence smooth. Since
D/(x0 ) has rank m, D/(Xo)(IRn) is an m-dimensional subspace P of JR"'. Moreover,
the set {d; = D/(x0 )c; I I $ i $ m} is a basis for P. Any x E JRN may be written
uniquely as
X
= 'Pl (x)dJ + ... + 'P111(x)d111 + 'Pm+J (x)dm+I + ... + 'PN(x)dN,
where dJ, ... , dN is a basis for JRN, with d 1, ... , din the basis for P just given.
Define H : JRN-+ IRn by H(x) = (<p1 (x), ... , 'Pm(x), 0, ... , 0).
Let g(x) = H(f(x))+G(x). Then g maps !Rn to IRn, and since Hand Gare linear, we have Dg(xo)·s = DH(f(xo))oDJ(Xo)(s)+DG(xo)(s) = H(DJ(xo)(s))+G(s).
If we write the matrix of the linear transformation Dg(Xo) in terms of the bases
CJ, ... , Cn and the standard basis, we get the identity matrix. Hence Dg(x0 ) is
invertible. We may use the inverse function theorem to find an open set U
around H(f(x0)) + G(xo) and an open set V around x0 and a smooth inverse
function g-J : U -+ V. For each x E V, Dg(x) is invertible; i.e., Dg(x) is a
one-to-one linear map of !Rn onto JR~ We may assume rank {D/(x)} = m for
all x E A (otheiwise, we restrict f to an even smaller neighborhood of x0 ). For
x E A, D/(x)(IRn) is an m-dimensional subspace, say, Px, of !Rn. For s e M,
Dg(x) · s = H(Df(x) • s) + G(s) = H(Df(x) • s). Thus, if x E V, Df(x) restricted
to M is a one-to-one linear map of M onto Px. That the mapping is onto follows from the fact that M and Px both have dimension m. Similarly, H must
be a one-to-one linear map of Px onto IR"'. Denote the inverse of this map by
Lx : IR,n -+ Px.
Leth= g-J : U-+ V; we shall show that/ o h(x 1, ••• ,xn) does not depend
on Xm+J, ... , Xn. To do this, we may assume that U is a ball. It suffices to
show that D:if1 = 0, where Di.fJ is the derivative of / 1 = f o h restricted to
{O} x IRn- 111 ; that is, we are showing that fJfi/fJx; = 0, i = m + I, ... ,n. It
429
Theorem Proofs for Chapter 7
follows that /1 is constant with respect to Xm+I, .•. ,Xn, Now f = Ji o g, and
so
Df(x) · y = Dfi(g(x)) · Dg(x) · y
= D1 /1 (g(x)) · H(Df(x) · y) + Di/1(g(x)) · G(y).
(1)
Since G is a mapping of !Rn onto {0} x IRn-m, it suffices to show that Di/1(g(x))o
G(y) = 0 for y E Rn. Returning to Equation (1) and using Lx o H = identity, we
obtain
Dif,(g(x)) o G(y) = Lx o H(D f(x) · y) - D1 /1(g(x)) o H(Df(x) · y)
= (Lx - Difi (g(x))) o H(Df(x) · y)
(2)
for ally E Rn. Now Lx - D1 /1 (g(x)) is defined on !Rm x {0} and Ho Df(x) maps
M onto !Rm x {0}. Hence, to show that Lx - D 1 / 1(g(x)) = 0, it suffices to show
that
(Lx - D1 /1 (g(x))) o H(Df(x) · y) = 0
for y EM. This follows because for y EM, G(y) = 0, and so Di/1(g(x)) o G(y) =
0. Therefore, Lx - D1 f,i (g(x)) is identically zero, and thus Di /1 (g(x)) = 0.
•
7.4.3 Corollary Letf: AC Rn--+ JRN (where A is open in Rn) be a function
of class e~ r 2:: 1, such that Df(x) has rank mfor all x in a neighborhood of
Xo E A. Then there are an open set U 1 C Rn, an open set Ui C Rn with
'xo E Ui, an open set V1 around f(xo), an open set Vi C RN, and functions
h : U1 --+ Ui and g : Yi --+ Vi of class er with inverses of class er such that
(go f o h)(xi, ... ,Xn) = (X1, ... ,Xm, 0, ... , 0).
Proof By Theorem 7.4.2, there is a er function h : U-+ V with er inverse,
such that
(f o h)(X1, ... ,Xm,Xm+I, ... ,Xn) =i(xi, ... ,Xm)
for some j : W c Rm -+ RN. Since h is invertible, Dj has rank m, and so, by
the range-straightening theorem, there is an invertible er function go such that
(go O h<x,' ... ,Xm) = (xi, . .. 'Xm, 0, ...• 0).
Define g on_RN by g(x 1, ... ,XN) = (go(x,, ... ,xm),xm+1, ... ,xN). Then g is also
er and is invertible, and we have
(go f) o h(xi, ... ,Xn) = (x1, ... ,Xm, 0, ... , 0).
•
430
Chapter 7 The Inverse and Implicit Function Theorems
7.5.3 Theorem Let/: [-a,a) x B(xo,r)-+ !Rn be a continuous mapping.
Let C = sup{ll/(t,x)II I (t,x) E [-a,a] x B(xo, r)}. Suppose there is a constant
K such that
11/(t,x) - f(t,y)II
$
Kllx - YII
(L)
for all t e [-a,a], x,y E B(xo,r). Let b < min{a,r/C, 1/K}. Then there is a
unique continuously differentiable map x: [-b,b]-+ B(.xo,r) C !Rn such that
x(O) = xo (initial condition)
and : = f(t, x(t)).
Proof The differential equation and the initial condition x(O) = Xo are equivalent, by the fundamental theorem of calculus, to the condition
x(t) = xo +
fo f(s,x(s))ds
1
for a continuous function x(t). Consider C([-b, b], !Rn), which we know (from
§5.4) is a complete metric space. Let
A
={cp E C([-b, b], IR") I cp(O) =Xo and cp(t) E B(xo, r)}.
Then A C C([-b, b], Rn) is closed (why?), and therefore A is also a complete
metric space. We will apply the contraction mapping principle to this space A.
Define F : A -+ A by
'
F(cp)(t) =xo+
fo f(s,cp(s))ds.
1
Here, the expression f~f(s, cp(s))ds is obtained by integrating each component
off; the result is a vector. The general inequality
111 /(s,cp(s))dsll 1' llf(s,cp(s))llds
1
$
is analogous to the similar result for the case of real functions, which we accept
here (see §4.8). First, we must show that F(cp) E A. The function F(cp) is
continuous, since
1
IIF(cp)(r) - F(cp)(t)II = lll 1(s,cp(s))dsll $
JT IIJ(s,cp(s))II ds $ Cir - ti
for -b $ t < r $ b. Thus, F(cp) E C([-b,b],Rn)_ Also, F(cp)(O) = x0 , and for
all t E [-b,b],
IIF(cp)(t) -
.xoll = llfo'1cs, cp(s))dsll
$ sign(t)
1' 11/(s,
cp(s))dsll $ b. C < r,
Theorem Proofs for Chapter 7
431
since b < r/C. The factor sign (t) talces into account the possibility that t
Thus F(cp)(t) E B(Xo, r), and so F(cp) EA. Next, for cp, 1/; EA,
IIF(<p) - F(1/l)II =
< 0.
sup IIF(cp)(r) - F(tj,)(t)II
-b$t$b
= sup
-b9$b
S
II lo{' f(s, cp(s)) - f(s, 1/J(s)) dsll
sup sign(t) {' llf(s, cp(s)) - f(s, ¢(s))II ds
lo
-b$1$b
S sup sign(t) {' Kllcp(s) - tp(s)II ds
lo
-b$t$b
S
1
sup K sign(t)1 llcp - t/illds S Kbllcp-,J,II
-b:S,$b
O
where Kb < 1. Therefore, if we let k = b · K < 1, then d(F(<p), F('I/J)) S kd(<p, 1/J),
and so Fis a contraction and thus has a unique fixed point: x = F(x). This fixed
point x(t) is the unique solution we were seeking. •
The iteration scheme mentioned in the text comes about because, as we saw
in the proof of the contraction mapping theorem, the unique fixed point is the
limit P'(cp) as n-+ oo for any <p EA, such as <p(t) xo.
=
7.6.1 Morse Lemma Let A c
]Rn be open and f : A -+ JR a smooth
(infinitely differentiable) function. Suppose Df(xo) = 0 and the Hessian off at
x0 is nonsingular. Then there are a neighborhood U of Xo and a neighborhood
V of O in ]Rn and a smooth map g : V -+ U with a smooth inverse such that
f o g = h has the form
h(y> =
t<Xo> - [yr+ y~ + ... + Yll + [Yl+1 + · •. + y~1.
where >. is some fixed integer between O and n.
Proof We lose no generality if we assume Xo = 0 and/(x0) = 0. Write
1
I
f (x, , ... , Xn)
=
1"
df(tx,, ... ,tXn)
d
dt =
0
I
If we set
g;(X1,- ..
,Xn) =
1 n
O i=l
1,of
o
8/
~ X;-!)--:(tx1, .•. , tXn) dt.
I.IX,
i:i""""(tx1, ... ,txn)dt,
vX;
Chapter 7 The Inverse and Implicit Function Theorems
432
then
n
J(x,, . .. , Xn) =
L X;g;(x,, . ..• Xn).
i=l
=
=
=
Since X-0 0 is a critical point, (8/ / 8x;)(0) g;(0) 0. Also, the g; are smooth
functions-one need only justify differentiating under the integral sign. (You
may accept this statement now, or refer ahead to Worked Example 9.2 at the
end of Chapter 9 for a detailed justification.)
Since g;(0) = 0, we can apply the same procedure again to write
n
g;(X1, ... ,Xn) = LXjhij(X1, ... ,Xn)
j=I
for certain smooth functions h;j, and therefore
n
f(x1, . .. , Xn) =
L X;Xjhij(X1,, .. , Xn).
i,j=I
We can assume h;i = hii by replacing hu by iiu = (l/2)(hu + hi;) if necessary,
which does not alter the expression for f. Note that at zero, {)2J/ 8x;8Xj = 2hu(0),
and so hu(0) is nonsingular.
·
Now f is written in a way analogous to a quadratic form. What we want to
do is to "diagonalize" it. Proceed by induction. Suppose there exist coordinates
u 1, • •• , Un in a neighborhood U1 of 0 such that
f = ±(ui)2 ± · · · ± (Ur-d +
L U;UjHij(U1,. , , , Un)
(1)
i,j'?:_r
on U1 where r ~ I and the Hu are symmetric. We have already established
this for r = 1 in the preceding paragraph. (The phrase "coordinates u 1 , ••• , Un"
means, as in the text, that (u,, . .. , Un) are invertible functions of (x1 , ••• , Xn).)
We can make a line.ar coordinate change in Ur, ... , Un to diagonalize
L
U;UjHij(0).
i,j'?:_r
In particular, since Hu(0) is nonsingular, the diagonal tenns are nonzero. Thus,
we can assume that Hr,(0) ~ 0. Let g(u1, ... , Un)= IHrrCu 1, ...• un)l1l 2 ; in some
smaller neighborhood U2 C U1 of 0, g will be a COO nonzero function. Define
V; = u;,
{
i ~r
[ '°'
Un)]
u;H;,(u,, ... ,
V,(u1, ... , Un)= g(u1, ... , Un) u, + ~ H (
)
i>r
r,UJ, ••.
,Un
(2)
Theorem Proofs for Chapter 7
433
The Jacobian matrix at 0 is
1
0
o(Vi, ... ,Vn)
O(U1, .. , , Un)
0
1
0
O·
av,
=
&V,
g(O)
&x1
0
0
OXn
0
0
1
which is nonsingular. Therefore, by the inverse function theorem, (u 1 , • •• , un) 1--+
(V1, . .. , Vn) is a COO map with a COO inverse on some smaller neighborhood U3
of 0. In other words, (Vi, ... , Vn) will serve as coordinates. Now consider
n
u,u,H,,(u,, ... 'Un)+ 2
L UjU,Hj,(U1, ... 'Un),
(3)
j=r+I
which consists of the tenns in Equation (I) with either i or j = r. Here we have
used symmetry of Hij, Comparing Equation (1) with Equation (2), we see that
expression (3) equals
±V,V, - ;
rr
[2::
u,H;,(u1, ... ,un)] 2,
i>r
the plus or minus coming about because H,, = ±g2 , where we use + if H,, is
positive and - if H,, is negative. From this we see that Equation (1) becomes
f =
L ±(V;)2 + L V; YjHij{Vt, . .. , Vn)
i~r
i.j>r
for new symmetric iiij. Thus, we have gone inductively from r to r + 1 in
Equation (1 ). Hence, it is true for r = n + 1, which proves the theorem. •
7.7.1 Lagrange Multiplier Theorem Let f: Uc R"
--+ Rand g :
U C Rn --+ R be given C 1 functions. Let Xo E U, g(Xo) = co, and let S = g- 1(co),
the level set for g with value co. Assume v' g(xo) '/a 0. If JIS, which denotes f
Chapter 7 The Inverse and Implicit Function Theorems
434
restricted to S (that is, to those x E U satisfying g(x) = co), has a maximum or
minimum at x 0 , then there is a real number >. such that
V/(Xo) = >. V g(xo),
Proof The only thing not complete about the sketch of the proof given in
§7.7 is that we need to know that if v .L Vg(Xo), then v = c'(0) for a C 1 curve
c(t) in S, with c(0) = Xo,
This can be established as follows. By the domain-straightening theorem,
there is a change of coordinates h such that g(h(x1, ... ,Xn)) = Xn, Thus h- 1(S) is
the coordinate plane x,. = co. Let w = Dh- 1(x0 ) • v. We claim that the last
coordinate of w is zero, that is, that w lies in the plane x,. = c0 • Indeed,
let e,. = (0, 0, ... , 1). We shall show that (w, e,.} = 0. From the chain rule,
g(h(x1, ... ,Xn)) = Xn implies
(V g(xo), Dh(yo) · w} = (w, en},
=
=
where h(yo) xo. But the left side is (V g(xo), v} 0. Let c(t) = h(yo + tw), which
lies in S, and c(0) = .to; from the chain rule, c'(0) = v. The proof may now be
completed as in the text.
•
Here is another proof: Assume that (8g/8x,.)(;xo) t, 0. Then there is a function h such that g(x1, ... ,Xn-1, h(x1, ... ,Xn-1)) = 0 for all (x1, ... ,Xn-1) in an
open set UC JR 11 - 1. Set k(x,, ... ,Xn-i> =J(x1,,,. ,Xn-1,h(x1, .. , ,Xn-1)). We
wish to maximize k. If k has a maximum or minimutn at a point, then
8k = 8/
8x;
8x;
+
8/ 8h
8x,. 8x;
= O.
But
8g
8x;
+ 8g 8h =0,
and so
8xn 8x;
{)J
8/
8x,.
8g
ax; = 8g ax;.
8x,.
Setting
gives the result.
•
Worked Examples for Chapter 7
435
Worked Examples for Chapter 7 ·
Example 7.1 (Product Rule for Jacobians)
g : B C IR.n
--+
Let f: A c IR.n
--+
IR.n,
IR.n, and f(A) C B. Show that for x E A,
lgof(X) = Jg(f(x)) · Jj(x)
(product of real numbers).
Solution By the chain rule,
D(g o /)(x) = Dg(f(x)) o Df(x),
which may be interpreted either as a composition of linear maps or as a matrix product. Since the detenninant of a matrix product is the product of the
detenninants, we get the required result.
•
Example 7.2
Consider equations u = fi(x,y) and v = f2(x,y). Show that they
are invertible near (xo,Yo) iff
8=
ofi oh _ ofi ofi
ox ay
oy ax
does not vanish at (xo,yo). If x(u, v), y(u, v) are the solutions, show that
ox I 8v
OU=~ {)y'
oy
1
ax
I au
av=-~ oy'
oy 1 au
8v = ~ ax·
av
-=- - -,
OU
8 ox
n = 2,
where 8 is the Jacobian detenninant. The matrix of derivatives of the solutions
is the inverse of the matrix of derivatives of/1 ,Ji, Since the inverse of the matrix
Solution This is a special case of the inverse function theorem for
is
_!_ (
8
d
-c
-b )
a
where 8 = ad - be, we get the stated result.
'
•
Chapter 7 The Inverse and Implicit Function Theorems
436
Example 7.3 Let A C Rn be an open set and f: A c Rn --+ Rn a one-toone continuously differentiable function such that Jf(x) = det(DJ(x)) i- 0 for all
x E A. Show that f(A) is an open set and 1- 1 : f(A) -+ A is differentiable.
Solution Let y E J(A) and suppose y = J(x). Since f is continuously differentiable and Df(x) has nonzero detenninant, the inverse function theorem tells
us that there exist open neighborhoods U of x and V .of y such that JIU (the
restriction of/ to U) is a C1 diffeomorphism (that is, it has a C1 inverse) of U
onto V. Hence V c f(A), and so f(A) is open. Now (JIU)- 1 =1- 1 lf(U) and
(flll)- 1 is differentiable at y, and so 1- 1 is differentiable at y. Hence, 1- 1 is
differentiable on J(A). •
Example 7.4 Consider ti1e following equations:
{ x2 -yu = 0,
xy+uv = 0.
Using the implicit function theorem. describe under what conditions these equations can be solved for u and v. Then solve the equations directly and check
these conditions.
Solution Define / 1 : R 4 --+ R by / 1(x, y, u, v) = x2-yu, and define /2 : R 4 --+ R
by /2(x,y, u, v) = xy + uv. Letf: R 4
smooth function. The matrix
8/1
(
au
8.fi
au
--+
8/i
8v )
8/2
R 2 be defined by/=
(/1 ,h);
thenf is a
= ( -y O )
V
U
8v
has detenninant -yu. If (xo,YQ, uo, vo) is such that y0 uo i- 0, then the hypotheses
of the implicit function theorem are satisfied, and so there are neighborhoods
A of (xo,Yo) and B of (uo, vo) and a unique continuously differentiable function
g : A -+ B such that f(x,y,8(X,y))
0 for all (x,y) E A. If we let u 8i and
v = 82 (where 8 = {g1,82)), then u and v are the solutions to the simultaneous
equations. These equations can be solved uniquely for u and v in neighborhoods
around (xo, yo) and (uo, vo) to satisfy the equations, provided that y0u0 i- 0, which
is equivalent to requiring that Xo i- 0 and Yo i- 0, since f (Xo, Yo, uo, vo) = 0, or
that
youo O and XoYo + uovo 0.
By direct computation, the solutions are u = x2 /y and v = -y2 /x, which are
valid except near xo = 0 or Yo= 0. •
=
ro -
=
=
=
Worked Examples for Chapter 7
437
Example 7.5 Functional Dependence Let A c lR.n be an open set and
let the functions /1, ... ,/n : A -+ JR. be smooth. The functions /1, ... ,fn are said
to be functionally dependent at Xo E A if there are a neighborhood U of the
point <J1 (xo), ... ,fn(Xo)) E lR.n and a smooth function F : U -+ JR. such that
DF f. 0 on a neighborhood of <f1 (Xo), ... ,fn(xo)) and if
F(/1 (x), ... .fn(X)) = 0
for all x in some neighborhood of xo.
a.
Show that if f1, ... ,In are functionally dependent at xo, then
fJ(/1, ... ,/n)
8(x1, ... ,Xn)
0
- - - - = atXQ.
b.
If
fJ(f1' ... .fn-i>
fJ(X1, .. · , Xn-t)
f. b
and
fJ(/1 , ... ,/n) = 0
8(X1, · · • ,Xn)
on a neighborhood of Xo, show that /1, ... ,fn are functionally dependent,
and that for some G,
/n = G(/i, .. · ,fn- I).
Solution Let f = (/1, ... ,fn).
a.
b.
=
=
We have F of
0, and so DF(J(x)) o Df(x)
0. If J/(Xo) "/: 0, D/(x)
would be invertible in a neighborhood of xo, implying DF(J(x)) = 0.
By the inverse function theorem, this implies DF(y) = 0 on a whole
neighborhood of /(xo).
The conditions of b imply that D/ has rank n - I. Hence, by" Corollary
7.4.3, there are functions g and h such that
g Of O h(x1, ... ,Xn) = (x1, ... ,Xn-1,0).
Let F be the last component of g. Then
F(/1, ... ,fn) = 0.
Since g is invertible, DF 'I: 0.
It follows from the implicit function theorem that/n = G(/1 , •.•• ,fn- 1);
that is, we can locally solve F(/1, ... ,/n) = 0 for fn = G(/1, ... ,/n), provided that we can show that fJF/ayn 'I: 0. As we saw in part a,
DF(/(x)) o D/(x) = 0,
438
Chapter 7 The hiverse and Implicit Function Theorems
or, in the component notation y; = f;(x1, ••• , Xn),
Ofn
:. l
OX1
OXn
(
8/1
(;:. ..·... :;,)
OX1
=O.
ofn
If 8F/8yn = 0, we would have
(;:. ... ' ~~.)
,
or
(aYI
8f1
8/1'
OX1
OXn-1
ofn-1
Ofn-1
OX1
OXn-1
(
l
=0.
{)F , ... ' 88F ) =0,
Yn-1
since the square matrix is invertible by the assumption that
fJ(/1, • • • ,fn-1)
O(X1, •.. ,Xn-1)
;6 O.
This implies DF = 0, which is not true. Hence oF/ oyn
the desired result.
;6 0, and we have
The reader should note the analogy between linear dependence and functional
dependence, where rank or determinant conditions are replaced by the analogous
conditions on the, Jacobian matrix.
•
Exercises for Chapter 7
1.
Write an expression for of/ ox iff(x,y) = g(x, h(x,y)) where g, h : IR 2 -
2.
Consider the following set of p equations in n + p unknowns:
aqx1 +• · • + 01nXn + a1.,,+1Xn+I +• · · + a1,n+pXn+p
Op1X1 +• · ·
+ OpnXn +
Op,n+lXn+I
+· · • +
Op,n+pXn+p
=0
I.
= 0.
R
Exercises for Chapter 7
439
What does the implicit function theorem say about the solution of these
equations for the unknowns Xn+h •.• ,Xn+p? Does it reduce to a theorem
you know from linear algebra?
3.
Let/: IR-+ IR be C 1 and
u = f(x),
v = -y + xf(x).
If J' (xo) -/- 0, show that this transformation is invertible near (Xo, Yo) and
the inverse has the form
4.
Show that the equations
x2 - y2 _:_ u3 + v2 + 4 = 0
2xy +
I - 2u2 + 3v4 + 8 = 0
determine functions u(x, y ), v(x, y) near x = 2, y = -1 such that u(2, -1) =
2, v(2, -1) = 1. Compute au/ ax.
5.
a.
Define x : .IR2 -+ JR by x(r, 0) = rcos 0 and define y : .IR2 -+ IR by
y(r,0) = rsin0. Show that
8(x,y)
o(r, 0) (ro, 0o) = ro.
b.
When can we form a smooth inverse function r(x,y), 0(x,y)? Check
directly and with the inverse function theorem.
c.
Consider the following transformation for spherical coordinates:
x(r, cp, 0) = r sin cp cos 0;
y(r, cp, 0) = r sin cp sin 0;
z(r, <p, 0) = r cos cp.
- Show that
d.
6.
8(x,y,z)
2 •
a(r, cp, 0) = r sm cp.
When can we solve for (r, cp, 0) in terms of (x, y, z)?
Determine whether the "curve" described by the equation x2 + y + sin(xy)
= 0 can be written in the form y = f(x) in a neighborhood of (0, 0). Does
the implicit function theorem allow you to say whether the equation can
be written in the form x = h(y) in a neighborhood of (0, 0)?
Chapter 7 The Inverse and Implicit Function Theorems
440
7.
Letf satisfy the conditions of the inverse function theorem and let g be the
local inverse g = 1- 1 : W -+ U. Let xo E U and let Yo = f (xo). Consider
the case n = 3, and show that
6;,1
Jf(xo)D1g;(yo) =
6;,2
6;,3
D1 fi(xo)
D2 /i(Xo)
D3 /z(Xo)
D1 /3(Xo)
D2 /)(xo)
,
D3 /)(xp)
where 6;,i = 1 if i = j and 0 if i ':/- j. From this, deduce the following
expression for D1g1:
Also, obtain expressions for the other eight partial derivatives Djg;.
8.
Let (Xo, Yo, zo) be a point of the locus defined by z2 + xy - a = 0, z2 + x2 -
y2-b=0.
9.
a.
Under what sufficient conditions may the part of the locus near
(Xo,Yo, zo) be represented in the form x /(z), y g(z)?
b.
Compute/' (z) and g' (z).
a.
Let/: IR.2
=
-+
=
1R2 be smooth and suppose that
a/1
afz
ox
f)y
ofi
-=-,
oh
fJy = - fJx.
(These are called the Cauchy-Riemann equations and arise naturally in complex variable theory-see, for example, J. Marsden and
M. Hoffman, Basic Complex Analysis. 2d ed., W. H. Freeman, New
York, 1987.) Show that Jf(x,y) 0 iff Df(x,y) 0 and hence that/
is locally invertible iff D/(x,y) I- 0. Prove that the inverse function
also satisfies the Cauchy-Riemann equations.
=
b.
10.
=
Show that the invertibility criterion of a may be false (by giving an
example) if/ does not satisfy the Cauchy-Rie~n equations.
Let /1 ,h,/3 be continuously differentiable functions from 1R4 to JR.. Give
sufficient conditions so that the equations
/1(x,y,z,t)
=0,
fi(x,y,z,t)
can be solved for x,y,z in tenns oft.
=0,
/3(x,y,z,t)
=0
Exercises for Chapter 7
11.
441
a.
Suppose that f : IR" -+ !Rm is of class C 1 and D/(xo) has· rank m.
This means that D/(Xo) as a linear map is onfo. Show that there is a
whole neighborhood of /(Xo) lying in the image off.
·
b.
Suppose/: JR"-+ !Rm is C1 and D/(xo) is one-to-one. Show that/•
is one-to-one on a neighborhood of xo.
12.
Show that the implicit function theorem implies the inverse function theorem.
13.
Prove Corollary 7.2.2.
14.
If / 1, ••• ,fk are functionally independent on R." (that is, if DJ has rank
k for f = (ti, ... ,t,.), k $ n) and g,/1, ••• .fk are functionally dependent,
show that locally we can write g = F(J1 , ••• ,/1.:).
15.
Consider the map r,-• : GL(lR\IR")-+ GL(R.",R."),A t-t A- 1 , taking a
matrix to its inverse. Show that the derivative of this map is given by
16.
Give a direct proof of the Morse lemma for functions f : R. -+ R.. Does it
apply to:
a.
/(x) = x3?
b.
J(x)
= xsin(l/x)?
17.
Does the function h in the domain-straightening theorem have to be unique?
Discuss.
18.
Prove the following generalization of the domain-straightening theorem.
Let A C R.n be an open set and f : A C IR" -+ !Rm, m $. n, a function
of class O'. Let xo E A, and supposef(xo)
0 and rank Df(x0 )
m.
Then there are an open set U, an open set V containing x0 , and a function
h : U-+ V of class O', with inverse 1i-• : V-+ U of class O' (that is,
a O' diffeomorphism), such that f(h(xi, . .. , Xn)) = (Xn-m+l, .•. , Xn). [Hint:
If Df(xo) has rank m, there must existj;, ... ,im such that the matrix (Dj!J;),
l $ i, k $ m, is invertible. Define the permutation map g : IR" -+ IR" by
=
=
g(X1, · • • ,Xn)
Xji, ••• ,Xj111 )
19.
=
(X1,,, · ,Xji-t,Xn-m+t,Xji+I, · • • ,Xjm-t,Xn,Xj,n+l, • ·• ,Xn-m,
.]
Let F : IR 4 -+ IR 2 be defined by F(x, y, u, v) = (u3 + vx + y, uy + v3 - x).
At what points can we solve for F(x,y,u, v) = 0 for u, v in terms of x,y?
Compute 8u/
ox.
442
Chapter 7 The Inverse and Implicit Function Theorems
20.
~'Iff(x,y,z) = 0, then 8x/8y•oy/8x·ox/8z = -I." What do you think this
really means? (Thermodynamics books are notorious for such mystifying
statements,)
21.
Let f : IR.n -+ Rn and g; : R -+ IR, i = 1, ... , n be functions of class C 1 •
Define h: Rn-+ R_n by h(x) =f(g,(x,), ... ,gn(Xn)), where g = (g,, ... ,gn)
and x = (x,, ... ,Xn), Show that
22.
Letf(x,y) = (xy(x2 - y2))/(x2 + y2) for (x,y) 'F (0, 0) and/(0, 0) = 0. Is f
of class Cl?
23.
Let C C Rn be a closed subset such that x E C::} ax E C for all a:
a.
b.
24.
~
0.
Discuss what C "looks like."
Let/: C-+ Rn be continuous and/(a:x) = a:f(x) for x E C, a: 2;: 0.
· Show that there is an M such that 11/(x)II $ Mllxll for all x E C.
Let f(x, y, z) = x2 - yz - sin(xz) and
g(x,y,z) = (xcosy,xsinycosz,xsinysinz).
<;ompute the de,rivative off o g.
I
25.
•
,
l
Let B(0, r) = {x E Rn I !!xii ~ r}. Let/: B(0, r)-+ Rn be a map with
a.
llf(x) - f(y)II $ ½llx - YII
b.
11/(0)11
$ ~r
Prove that there is a unique x E B(O, r) such that /(x) = x.
26.
'Show that there exist positive numbers p > 0, q > 0 such that there are
unique functions u, v from ]-1 -p, -1 +p[ into ]1 -q, 1 +q[ for which
xe"<.r)
+ u(x)ev<.r> = 0 = xe*> + v(x)eu(.r>
for all x E ] - 1 - p, -1 + p[ and u(-1)
27.
=1 =v(- 1)(
Obtain an estimate on the length of time the solution of dx/dt = t2x3e'~
x(O) = 1 exists.
443
Exercises for Chapter 7
28.
Let A c Rn be compact and let B c C(A, R) be compact. Show that there
are an /o E B and an .xo E A such that g(x) :$ /o(xo) for all g E B and
xEA.
29.
Let an ~ an+I ~ 0 and an -+ 0. Let f(x) =
continuous on [-1, O].
30.
Is it possible to solve
'E:0 anxn. Show that f(x) is
xy2 + xzu + yv2=3
u3yz + 2xv - u2 v2=2
for u(x,y,z), v(x,y,z) near (x,y,z)
fJv/ fJy.
= (1, 1, 1), (u,v) = (1, 1)?
Compute
31.
Consider the equation dx/dt = 1 + tx, x(O) = 0. Examine the iteration
scheme given in §7.5 to obtain a power series expression for the solution.
Examine the radius of convergence.
32.
Compute the index of the function x2 + y2 - 7x - Sy+ xy + 16 + (x - 2)3 at
its critical point x = 2,y = 3. Discuss the nature of the function near this
point.
33.
Give another proof of the Morse lemma as follows. Assume x 0 = 0 and
/(.xo) = 0. Use Taylor's theorem to write/(x) =½D2/(0) •(x,x)+ ½Rx(x,x) =
½{Axx, x} so that for each x, Ax is a symmetric linear transfonnation of
IRn. By assumption, Ao is an isomorphism, and so Ax is an isomorphism if
xis near 0. Let Q,, AoA;', so that Q0 I. Using a power series, we can
define the square root Tx of Qx for x close to 0, that is, T; = Qx. Show that
QxAx = AxQ;, where T means the transpose matrix, and using the power
series for Tx, show that the same equation holds for Tx. Let Sx = T; 1, and
conclude that Ax = SxAoS;. Let h(x) S;x, and show that Dh(O)
now
apply the inverse function theorem to conclude that h is locally invertible.
Let g = 1i- 1• Show that
=
=
=
f(x) =
=/;
I
2{Aoh(x), h(x)}
and deduce that f o g(x) = ½D2/(0)(x, x). Finally, use a linear change of
coordinates to diagonalize the quadratic fonn ½D2/.
Find the relative extrema of /IS in Exercises 34 through 37.
34.
/:IR 2 -+IR.,(x,y)i--+x2+y2,
S={(x,2)jxEIR}.
444
Chapter 7 The 1,iverse and Implicit Function Theorems
35.
/:R 2 -+R,(x,y)-x2+y2,
S={(x,y)lx2-y2=I}.
36.
/:R2 -+R,(x,y)-x2-y2,
S={(x,cosx)lxEIR.}.
31.
f: R 3 --+ R, (x,y,z) - x2 + y2 +z2 ,
38.
A rectangular box with no top is to have a surface area of 16 square
meters. Find the dimensions that maximize the volume.
39.
Design a I-liter cylindrical water container that uses the minimum amount
of metal.
S = {(x,y,z) I z ~ -2 +x2 + y2 }.
Chapter 8
Integration
The reader is undoubtedly familiar with the integration process for functions of
one variable and its application to practical problems involving area, volume,
arc length, and so on. The purpose of this chapter and the next is to review,
solidify, and extend this knowledge. In this chapter we formulate the basic
definitions for a general theory of integration. Some familiarity with simple
situations involving multiple integrals is useful but not essential.
The powerful computational theorems for multiple integrals will be given
in the next chapter. These are Fubini's theorem, which enables us to reduce a
multiple integral to iterated single integrals, and the change of variables formula,
which enables us to change from rectangular coordinates to a more convenient
system of coordinates such as polar or spherical coordinates and more generally
to change from one coordinate system to another. To obtain a satisfactory theory
of multiple integrals, even for continuous functions, it is convenient to introduce
the notion of a set of "measure zero." We shall see that one of the main theorems
states that a bounded function is integrable iff its discontinuities form a set of
measure zero. In particular, a bounded function with a finite or countable number
of discontinuities is integrable.
§8.1 Integrable Functions
Building on our work in §4.8, let us outline how integration should be developed
in R 2• Suppose f : A c R 2 -+ R is a bounded nonnegative function (see
Figure 8.1-1), where A is a bounded set. The graph of the function f is a
surface in R 3, and the integration process is used to find the volume under
this surface. We enclose A in some rectangle [a1,bd x [a 2,b2] and extendf
to the whole rectangle by defining it to be zero outside of A. Then we divide
[a1,bd x [a2,b2] into smaller rectangles by partitioning [a1,bd by, for example,
445
446
Chapter 8 Integration
a1 = Xo < X1 < · · · < Xn-1 < Xn = b1 and partitioning [a2, b2] by, say, a2 = Yo <
Y1 < · · · < Y111-1 < Ym = b2, thus forming mn rectangles [x;,X;+d x [Yi,Yi+dThen we calculate the volume of the shaded block of Figure 8.1-1 to be inf{/(z) I
z E [x;,X;+d x [Yi,Yi+d} · (X;+1 - x;)(Yi+I - Yi) and the volume of the shaded
block plus the crosshatched block of Figure 8.1-1 to be
sup{/(z) I z E [x;,x;+d x [Yi,Yi+d} · (x;+1 - x;)(Yi+I - Yi>·
Summing these volumes over all i and j (that is, all the mn "subrectangles" of
the rectangle [a 1 , bi] x [a2, b2D gives two values, denoted L(J, P) and U(J, P),
where P stands for the partition. If
sup{L(f, P) IP is a partition}= inf{ U(f, P) I Pis a partition},
we say that/ is (Riemann) integrable over A and define the (Riemann) integral
over the set A, written JAi' or JJ,,.f(x,y)dxdy, by
1t =
sup{L(/,P)}
=inf{ U(f, P)}.
We have already introduced most of the ideas needed for a theory of integration of bounded functions over bounded sets for arbitrary dimensions. Most
of what remains is to fonnalize the statements for the case lRn.
z
1·
~--~J~~'
l,·1
I
I
X
FIGURE 8.1-1
I
I
~I
I I
I I
t):t;t---t---....,,,
The graph of a bounded function and a column helping make
up the volume under it
Let/ : A C !Rn - IR be a bounded function with domain a bounded set A.
Choose a rectangle [a 1 , bi] x · · • x [an, bn] that encloses A. Furthermore, let/ be
447
§8.1 llltegrable Functions
defined over the whole rectangle by setting it equal to O at points not contained
in A. Let P be a partition of [01, bi] x · · · x [an, bn] obtained by dividing each
[a;,b;] by points ,to, ... ,J,11 I• and fonning the m1m2 • · · mn rectangles
Define the volume of the rectangle B = [a 1, b1] x ••• x [an, bn] to be the product
of the lengths of its edges: vol(B) = v(B) = (b1 - a 1)(b2 - a2)· · · (bn - an). Let
L(j, P) denote the lower sum off for P, defined by
L(j,P) = L[inf{f(x)
Ix E R}]v(R),
REP
the sum being over all subrectangles R of the partition P. Similarly, let U(j,P)
denote the upper sum for P, defined by
U(f, P) = I:rsup{f(x) I X E R} ]v(R).
REP
Now we observe some properties of L(/, P) and U(f, P). From the definition,
we see that for any partition P, L(f, P) S U(f, P). Now suppose P' is any partition that is a refinement of or is finer than P; this means that each subrectangle
belonging to P' is contained in a subrectangle belonging to P. We assert that
L(f, P) $ l(f, P'). Indeed, this follows from the fact that the minimum off on
a rectangle is less than or equal to the minimum on any rectangle contained in
it. Similarly, U(j, P') $ U(f, P). This has the following consequence. If P' and
P" are any two panitions of [a,, b,] x · · · x [an, bn], then l(f, P') $ U(f, P'').
To prove this, let P be a partition of the rectangle that refines both pt and P",
which we can arrange by using all the subdivision points of P' and P"; then
L(f,P') :5 l(f,P) :5 U(f,P) :5 U(f,P'').
Since the function f and the rectangle B are bounded, the sets of numbers
{ L(f, P) I P is a partition of B} and { U(f, P) I P is a partion of B} are bounded
below by inf{f(x) I x E B} vol(B) and above by sup{f(x) Ix E B} vol(B). Each
number in the first set is smaller than all numbers in the second set. Thus, if
we set
s = sup{L(f, P) IP is a partition of B}
S = inf{ U(f, P) I Pis a partition of B},
then s S S. With this notation we can make another definition.
448
Chapter 8 Integration
8.1.1 Definition /f f is a bounded function on a bounded region A, then the
upper integral off on A is defined by
l f == S == inf{ U(f, P) I P is a partition of B}
and the lower integral off on A by
l f == s == sup{l(f..P)
IP is a partition of B}
where B is any rectangle containing the region A. We say that f is Riemann
integrable (from now on we will just use the word "integrable") ifs ==Sand
define the integral off on A as the common value:
r·
Instead of JAJ. the notation JAJ(x)dx Qr
·JAJ(x1,,,, ,Xn)dx1 · · · dxn is
frequently employed. If f : [a, b] -+ JR, the notation
f or
f(x) dx is also
used.
There is an important equivalent characterization of the Riemann integral as
presented in the next theorem.
J:
J:
let A c Rn be bounded and lie in some
rectangle S. let f : A -+ JR be bounded and be extended to S by defining f == 0
outside A. Then f is integrable with integral I iff for any e > 0 there is a o > 0
such that if P is any partition of S into rectangles S1, ... , SN with sides of length
< o and ifx1 E S1, ... ,XN E SN, we have
8.1.2 Darboux's Theorem
N
Lf(x;)v(S;) - I
< e.
;.,1
We call
I:!1 f (x;)v(S;) a Riemann sum.
A condition closely related to Theorem 8.1.2 follows.
8.1.3 Riemann's Condition f is integrable if! for any e > 0 there is a
partition P" of S such that O $ U(f,P,:)- l(f,P,:) < e.
449
§8.1 Integrable Functions
The proof of Riemann•s condition will be given along with the proof of
Darboux's theorem at the end of the chapter.
Notice that if f is continuous, we can realize the upper and lower sums for a
partition as special Riemann sums, since/ assumes its maximum and minimum
on each subrectangle at some point of that subrectangle. If f is continuous on the
whole rectangle S (= interval if n = l), then it follows from uniform continuity
off (see §4.6) and Riemann's condition that f is integrable. We shall, in fact,
prove a more general result in 8.3.1.
Let us now illustrate a few of these ideas in the special case of one dimension.
These were treated in §4.8, but are worth reviewing at this point.
8.1.4 Example Interpret Riemann sums geometrically for f: [a,b]-+ JR.
Solution Let P : a
= xo
< x1 < · · · <
Xn
= b be
a partition and let
c; e [x;,X;+il- By definition, the Riemann sum is
n-1
R = Lf(c;)(x;+1 - x;),
j=()
which is the total area of the rectangles represented in Figure 8.1-2, with the
area of rectangles below the x axis counted with a negative sign. We observe
that L(f,P) ~ R ~ U(f,P), and so Darboux's theorem is plausible.
•
y
FIGURE 8.1-2
Approximation of an integral by a Riemann sum
450
Chapter 8 Integration
J;
8.1.5 Example
Show that
xdx = 1/2, using the definition of the integral.
Compare with a geometrical computation.
Solution Break up [O, I] into n equal parts
Using this as a partition, the infimum of f(x) = x on [i/n, (i + 1)/n] is i/n and
the supremum is (i + 1)/n. Thus, calling this partition P,.
n-1 •
U(f,P)
=
l
l
l
n-1
L !..±.__.
- =--:.. L(i+.l)
n
n ni=()
;=()
I
1 1
=-(1
+ 2 + · • • + n) =-2 •-n • (n + I)
n
n 2
2
and
L(f, P)
i 1 I
=~
· - =-(0 + I + · · • + (n ~n n n
n-i
i=()
2
1))
=-nI2 (I)
-2 (n -
l)(n)
since I + 2 + .. · + k = (1 /2)k(k + 1). Thus
. U(f, P) =
~ ( l + ~)
and
L(f, P) =
~ (1-
~) ·
These both converge to 1/2 as n _. oo. Thus, from Riemann's condition (or
Darboux's theorem), we see that/ is integrable, with integral equal to 1/2. This
•
is also geometrically obvious from Figure 8.1-3.
Exercises for §8.1
1.
Prove that if R is a Riemann sum for a function f and partition P, then
L(J,P) $ R $ U(f,P).
2.
Let/: (0, 2) _. R be defined by f(x) = 0 for O $ x :5 1, and by f(x) = 1
for 1 < x $ 2. Compute, using the definiti~n. f (x) dx.
3.
Let/: [0, l] _. R be defined by /(x) = 0 if x
that f is integrable and f(x) dx = 0.
J;
J;
"I- 1/2 and/(1/2) = 1. Prove
451
§8.2 Volume and Sets of Measure Zero
y
area=
i
i+l
n --,,
FIGURE 8.1-3
i
(base) X (height)=½
x=l
The integral of f(x) = x is the area of a triangle
4.
Let A C JR.ff and let/(x) = I for x EA. What do you think fAf should be?
S.
Evaluate (3x + 4) dx using the definition and compare the answer with
a geometrical computation of area.
6.
Let/: [a,bJ -+ IR be continuous. Use Riemann's condition and unifonn
continuity off to prove that f is integrable.
Jd
§8.2 Volume and Sets of Measure Zero
On the real line, one usually integrates over intervals. However, in JR.ff we
usually need to integrate over more complicated sets. We must be sure that the
sets we are dealing with are restricted in such a way that the partitioning in
the definition of integrability is reasonable. Here, "reasonable" means, roughly
speaking, that the boundary of the set is not too complicated. Our immediate
goal is io develop enough concepts to make these ideas precise. First, we define
the volume of a s~t.
8.2.1 Definition If A c IR.ff is a bounded set, let the characteristic function
IA of A be the map IA : JR.ff -+ IR defined by lA(x) = I if x E A and IA(x) = 0
if x ~ A. We say that A has volume if IA is integrable, and the volume of A is
the number
1
lA(x)dx
= v(A).
452
Chapter 8 Integration
This definition is natural because the region under the graph of lA is the
"cylindrical" region with height 1 and base A (Figure 8.2-1). We shall also use
the phrase "A has content" to mean the same as "A has volume." Sometimes a
set that has volume is called Jordan measur_able.
,,.
,r- - - - - - - I
I
I
--f''-"--.-
y
}
!
_r.....1-------·---- I
~
I
A
X
FIGURE 8.2-1
The characteristic function of a set
If n = 1, so that A C JR, we speak of v(A) as the length of A, and when
A c IR 2, we use the term area of A for v(A).
We say that A has volume zero (or content zero) if v(A) = 0. From the
definition of the integral, this is equivalent to the statement that for every t: > 0
there is a finite covering of A by rectangles, say, S 1, .•. , Sm, such that the total
volume is less than t:; that is,
Ill
I><S;) < E,
i=I
where v(S;) is computed for rectangles as' before. (The details are worked out
in Worked Example 8.1 at the end of the chapter.)
It is useful to allow countable coverings as well as finite ones. These ideas
were systematically introduced by Henri Lebesgue around 1900.
8.2.2 Definition A set A c
(not necessarily bounded) is said to have
measure zero if for every E > 0 there is a covering of A, say, S1,S2, ... ,
by a countable (or finite) number of rectangles such that the total volume
~;:i v(S;) < E. Recall that S1,S2,- .. are said to cover A wizen U~1S; :) A.
Note that these rectangles may overlap.
]Rn
453
§8.2 Volume and Sets of Measure Zero
It is important to realize that these concepts depend on the space in which
we are working. To illustrate the point, consider an example.
8.2.3 Example Slww that, regarded as a subset of R 2 , the real line has
measure zero, but as a subset of JR it does not.
Solution To prove the first assertion, given E > 0, we want to find rectangles
S 1,S2 , ••• that enclose the x axis and have total area < E. Considering Figure
8.2-2, let
S; = [-i, i]
X [-
2i ·E2i+I • 2i ·E2i+I]
.
y
X
x=2
FIGURE 8.2-2
A family of rectangles that covers the horizontal axis
Since v(S;) = 2i[(2c)/(2i • i• 1)] = c/i, we get :E~ v(S;) < :E':1(c/i) = c,
since 1/2 + 1/4 + 1/8 + • • • = 1. It is clear that the real line as a subset of itself
cannot have measure zero, because in covering it by intervals, the total length
of the intervals will be +oo. •
This demonstration is typical of the way one proves that a set has measure
zero. Another example of a set of measure zero is the sphere
S = { X E !Rn
11 lxll = l }•
From the definition of volume, it is clear that if A has volume zero, then A
has measure zero. Indeed, if A has volume zero and E > 0, we can even find a
454
Chapter 8 Integration
finite covering by rectangles for A with total volume <
€. Also, note that if A
has measure zero and BC A, then B has measure zero as well.
The main advantage of measure zero over volume zero is indicated in the
following theorem.
8.2.4 Theorem Suppose that the sets A1,A2, . .. have measure zero in Rn.
Then A1 U A2 U · · · has measure zero in Rn.
From this we conclude, for example, that any set composed of a countable
number of points has measure zero.
8.2.5 Example Consider the set A of rationals in [0, l] c R. The set A
does not have volume; that is, lA is not integrable. Indeed, the function that
has the value I on rationals, 0 on the irrationals, is not integrable as we have
seen in §4.8. Nevertheless, the set A does have measure zero, because a point
has volume ~d measure zero and A consists of countably many points, and so
Theorem 8.2.4 applies.
•
Exercises for §8.2
1.
Show that {(x,y) E R 2 I x2 +y2 = l} has volume zero.
2.
Show that the xy plane in JR 3 has 3-dimensional measure 0.
3.
If A C [a,b] has measure zero in R, prove that [a,b]\A does not have
measure zero in R (It follows from Exercise IO at the end of the chapter
that [a, b] does not have measure zero.)
4.
Use Exercise 3 to show that the irrationals in (0, l] do not have measure
zero.
5.
Must the boundary of a set have measure zero?
6.
Must the boundary of a set of measure zero have measure zero?
§8.3 Lebesgue's Theorem
We now consider one of the most important results in integration theory. We feel
intuitively that most "decent" functions, such as continuous functions, ought to
455
§8.3 Lebesgue's Theorem
be integrable, since the area under their graphs should be definable. To settle the
question of exactly how decent is "decent" we have the theorem of H. Lebesgue.
With this theorem Lebesgue opened up new advances in integration theory by
stressing the concept of measure zero.
8.3.1 Lebesgue's Theorem Let A c ]RR be bounded and let f : A -+ IR
be a bounded function. Extend f to all of R_R by letting it be zero at points
not contained in A. Then f is (Riemann) integrable iff the points at which the
extended f is discontinuous form a set of measure zero.
We can draw two important conclusions from this result as stated in the
following two corollaries.
8.3.2 Corollary A bounded set A c
]RR
has volume
iff the
boundary of A
has measure zero.
This is obtained by applying the theorem to the characteristic function IA.
Since finite and countable sets have measure zero, the theorem also gives:
A c ]RR be bounded and have volume. A bounded
IR with a finite or countable number of points of discontinuity
8.3.3 Corollary Let
junction f : A
is integrable.
-+
This result shows that many of the common functions one meets in practice are integrable. For example, a continuous function on an interval [a, b]
is integrable because [a,b] has volume (the boundary consists of two points).
Functions with a finite number of points of discontinuity are similarly integrable.
Notice that integrability off in Theorem 8.3.1 depends on the extension
of J. For instance, if A is the set of all rationals in [O, l) and f is identically 1,
/ restricted to A is continuous on A but the extended/ is nowhere continuous
and in fact is not integrable. In Corollary 8.3.3 it is not necessary to extend
f. This is accounted for using the fact that A has volume and making use of
Corollary 8.3.2. Another useful result is as follows.
8.3.4 Theorem
i.
Let A c ]RR be bounded and have measure zero and let f : A
(bounded) integrable function. Then JA/(x) dx = 0.
-+
R. be any
456
ii.
Chapter 8 Integratio,i
If f: A
-+ IR is integrable and f(x) 2: 0 for all x and JAf(x)dx = 0, then
the set {x EA 1/(x) 'f' O} has measure zero.
This theorem is not unreasonable. Indeed, a set of measure zero is "small"
with, essentially, zero volume, and so the integral of any function over it ought
to be zero. The second part is likewise reasonable.
8.3.5 Example Let
x
f (X)
-l~x~O
= { 3~ + 8, 0 < X $ 1.
Show that f is integrable on [-1, I].
Solution The set [- 1, I] has volume, and f has only one discontinuity, at
x = 0. Since f is bounded, 8.3.3 shows that/ is integrable.
•
8.3.6 Example Let f(x) = sin(l / x) for x
integrable on [-I, I].
'f' 0
and f(0) = 0. Show that f is
Solution Here/ has one point of discontinuity, at x = 0. Also, 1/(x)I S 1,
and so f is bounded. Thus, by Corollary 8.3.3,f is integrable.
•
8.3.7 Example Let f (x, y) = x2 + sin( I/y) for y
that/ is integrable on A = { (x,y) I x2 + y2 < I}.
'f' 0
and f (x, 0) = x2. Show
Solution Here f is bounded on A = interior of unit disk in IR.2 , and has
discontinuities on the line y = 0, which is a set of zero measure in IR.2• Also,
A has volume (its boundary has zero volume). Hence, by Theorem 8.3.1, / is
integrable.
•
Exercises for §8.3
1.
Let f(x) = x3 on [-1, I]. Prove that/ is integrable.
2.
Let f(x,y) = I if x 'f' 0 and f(0,y) = 0. Prove that f is integrable on
A = [0, l] x [0, I) C IR. 2•
§8.4 Properties of the Integral
457
3.
Compute JAi where f and A are as in Exercise 2.
4.
Let A c JR.n be open and have volume, and let/ : A -+ JR. be continuous,
f(x) 2: 0, and f(xo) > 0 for some Xo E A. Show that JAJ > 0.
5.
Let
rk
= l/k, k = l, 2, ... , and on JR let Ube the open set
Decide whether or not U has volume.
6.
Let f(x) = cos(l/x) for x
on[-1,1].
f:.
0 and /(0) = 0. Show that/ is integrable
§8.4 Properties of th~ Integral
We now present some of the elementary properties of the integral analogous to
those for functions on an interval.
8.4.1 Theorem Let A, B be bounded subsets of Rn, c E
A -+ IR be integrable. Then
R
and let f, g :
i.
f + g is integrable and Jif + g) = JAi + JA g.
ii.
cf is integrable and JA(cf):;: c JAJ.
iii.
Iii is integrable and I JAJI :$ JA
iv.
If f :$ g, then JAi :$ JA g.
v.
If A has volume and Iii :$ M, then I JAJI :$ Mv(A).
vi.
Iff : A -+ IR is continuous
and A has volume and is compact and connected, then there is an xo E A
such that JAJ(x)cfx_= f(xo)v(A). The quantity (1/v(A)) JAi is called the
average off over A.
vii.
Let f : A U B -+ R If the sets A and B are such that A n B has measure
zero and fl(A n B), JIA, and JIB are all integrable, then f is integrable on
A U B and fAusf = JAJ + f 8 f-
lfl-
Mean Value Theorem for Integrals
Chapter 8 Integration
458
If a < b < c on JR, condition vii implies that
1c f(x)dx = 1b f(x)dx + 1cf(x)dx.
In the plane, if A and B are as depicted in Figure 8.4-1, the integral over their
union is the sum of the individual integrals, because the intersection has zero
measure (it is a point).
RGURE 8.4-1
Two sets whose intersection has measure 0
8.4.2 Example If A, B have volume, slww directly (without using Theorem 8.4.1) that A U B has volume.
Solution We must show that bd(AUB) has measure zero (see Corollary 8.3.2).
But
bd(A U B) C bd(A) U bd(B)
(see Exercise 15, Chapter 2), so that, since the right side has measure zero, so
does the left.
•
8.4.3 Example Give a geometric interpretation of property
rem 8.4.1 for f : [a, b]
Solution
J:
-+
iii of Theo-
lR.
f(x)dx represents the area under the graph off with the portion
below the x axis counted negatively. The magnitude of this is clearly less than
•
(or equal to) the area under the graph of lfl; see Figure 8.4-2.
§8.5 Improper Integrals
459
y
1/1
FIGURE 8.4-2
The integral of a function is no larger than that of its absolute
value
Exercises for §8.4
1.
If A1,A 2 , ••• have volume and A= A1 UA2 U · · · is bounded, need A have.
volume?
2.
Explain the geometric meaning of properties iv and v in Theorem 8.4.1.
3.
Let A, B have volume and A n B have zero volume. Use 8.4.1 to show
that v(A U B) = v(A) + v(B).
4.
Set
J; f(x) dx = - J: f(x) dx if a > b. Establish
for all a, b, c (not assuming a
< b < c, as
,,,
we did in the text).
§8.5 Improper Integrals
We often need to integrate unbounded functions or to integrate over unbounded
regions. The resulting improper integrals lead to convergence problems analogous to those for an infinite series.
460
Chapter 8 Integration
One usually defines improper integrals by
r=
lo
{b f(x)dx,
f(x)dx = lim
b-+oo
lo
or, if the function h is unbounded near 0, by
l b.
h(x)dx = lim
0
€-+0
lb
h(x)dx,
,:
and so forth. The definitions given in this section will conform to these notions.
However, a word of caution is advisable at this point: We do not normally define
f f(x)dx = lim t f(x)dx.
J_oo
k-+oo J_k
00
If we did, consider what would happen for /(x) = x: f~ooxdx would be zero,
f-
while f t xdx and
00 xdx would not exist; they would be +oo and -oo,
respectively. Thus, if one wishes to retain additivity of integrals, one must
proceed more carefully. One way to avoid this "canceling of infinities" is to
break up/ into positive and negative parts, as we shall do in 8.5.4 below.
Generally, improper integrals are of two types, depending on whether it is
the function or the domain that is unbounded. First, we shall consider the case
of unbounded regions. To begin, suppose f ~ 0 is bounded and A c Rn is
arbitrary, possibly unbounded. Extend/ to the whole space as usual by setting
f = 0 outside A. See Figure 8.5-1.
y
n= l
f extended to
be zero here
X
a
FIGURE 8.5-1
A= [a, oo[
An improper integral on the real hne
§8.5 Improper Integrals
461
~ 0 for all x in a set A C !Rn and
that f is bounded and integrable on each n-dimensional "cube" [-a,a]n =
[-a,a] x ··· x [-a,a]. If the limit lim,, ..... ooft_ 0 _01 nf exists, we call it JAJ. If it
exists and is finite, we say that f is integrable on A.
8.5.1 Definition Suppose that f(x)
Although we are talcing limits expanding symetrically from 0, we are avoiding the problem mentioned above since we are considering only nonnegative
functions. Functions with arbitrary sign will be handled shortly.
~ 0 and is bounded and integrable on every
cube [-a, a]n. Then f is integrable if! the following condition holds: For each
sequence Bk of bounded sets with volume such that
8.5.2 Theorem Assume that f
i.
Bk C Bk+l
ii.
For each cube C we have C C
for each k and
Bk
for sufficiently large k,
limk..... 00 fa/ exists. In this case, lim,1:..... 00 fa/= JAf. See Figure 8.5-2.
This theorem is saying that we get
infinity.
FIGURE 8.5-2
JAi independently of how we expand to
An increasing chain of sets that absorb each c;:ube
Notice that we have this comparison test: If f ~ 0 is integrable, g is integrable on every cube, and O ::; g ::; f, then g is integrable as well, since fi-a,aln g
462
Chapter 8 Integration
is increasing with a and is bounded by the integral off, and so converges as
a-oo.
Next, let us treat the case of an arbitrary function f 2:: 0 that is unbounded
and defined on an unbounded region. The significance of these conditions in the
case of lR will be given shortly. We reduce the problem to the case just treated
by chopping off the graph off to produce a bounded function.
8.5.3 Definition For each positive real number M, let
Ji ( ) = {
M
x
f(x),
0,
if f(x) $ M,
if f (x) > M
(see Figure 8.5-3), so that /M is bounded by Mand fM 2:: 0. (Note that JA!M
increases as M increases, and O $ /M $ f.) We define
{f
}A
= lim { /M
M-oo}A
if this limit is finite, and in that case we say that f is integrable.
M
FIGURE 8.5-3
Truncating an unbounded function
As before, if f 2:: 0 is integrable and O $ g $
(this is called the comparison test).
f, then g is also integrable
8.5.4 Definition For a general function f : A - JR (that is, a function f
that is not necessarily positive), we let
r<x) = { f(x),
0,
if f(x) 2:: 0,
if f(x) < 0,
§8.5 Improper l11tegra/s
463
and
f-(x)-{ -f(x),
0,
(see Figure 8.5-4).
let
If both JAr and JAJ-
iff(x) :5 0,
if f(x) > 0
exist, we say that f is integrable and
I
f
f
FIGURE 8.5-4
Observe that
I/I
=
The positive and negative parts of a function
r +1-. Thus, if f
is integrable, so is
lfl,
and
IA Ill =
JAr + JA1- ;?: 1JA11.
If I/I is integrable on every cube and if f Ill exists, then both fr and
J1- must be finite if they exist. They are nonnegative and O :5 r :5 I/I and
0 '5:. f- '5:. Iii- Thus, if Iii is integrable and/ is integrable on each cube, then
f is integrable. This assumption on f is needed. Even on a bounded region
it is quite possible for Iii to be Riemann integrable while f itself is not. (See
Exercise 33 at the end of the chapter.)
While improper integrals are important in higher dimensions, the case of the
real line deserves special attention. In this case there is a method for computing
integrals that is particularly simple to use.
8.5.5 Theorem
i.
Suppose f : [a, oo[ - t JR. is continuous and f ;?: 0. Let F be an antiderivative off. Then f is integrable if! limx......00 F(x) exists. In this case
1 1=
f =
[a.oo[
a
f(x)dx = {
x
~ F(x)} ex,
F(a).
Chapter 8 Integration
464
ii.
Suppose f : ]a, b] -+ JR is continuous and f
lim
~
0. Then f is integrable iff
{b f(x)dx
e-+O}a+c
exists. This limit equals
J: f(x) dx.
The generalization of this statement to .!Rn is given in Exercise 26 at the end
of this chapter. Instead of "JAJ exists," one often says "JAJ converges." As
with infinite series, it is necessary to be able to test an improper integral for
convergence or divergence. Some of the tests closely resemble those used for
series.
One of the most useful tests is the comparison test. If fa00 f (x) dx converges,
f ~ 0, 0 :::; g :::; f, and g is integrable on finite intervals, then fa00 g(x) dx
converges. The reason, as stated earlier, is simply that fa00 g(x) dx increases as
b -+ oo and is bounded above by J000 f (x) dx, and so it converges.
In our treatment of improper integrals, we have used what amounts to the
most natural approach on .!Rn. Our method is particularly desirable because most
of the properties of the integral still hold. However, in the special case of JR 1 = JR,
it is also useful to consider a weaker type of convergence, called conditional
convergence. Here we define
f
la
00
f(x) dx (conditional)= lim
b-+oo
rb f(x) dx
la
if the limit exists. This is not the same as our earlier definition (called absolute
convergence), because we demanded that the limit hold separately for j+ and
1-. An example will serve to point up the difference.
8.5.6 Example Let f(x) = (sinx)/x. Show that
ft' f(x),dx is conditionally
but not absolutely convergent.
Solution If f were absolutely integrable on [I. oo[, then I/I would be integrable also. However,
since on the interval [(k - l)1r,krr], 1/x 2: 1/krr and
I:Z=2
1/k-+ oo as n-+ oo, and so
J1
00
Jc!1:..iin Isinxl dx = 2. But
j(sinx)/xl dx = +oo.
§8.5 Improper Integrals
465
However, limb-oo Jt«sinx)/x)dx exists. To see this, note than an integration by parts gives
lb
sinxd
- - x= x
1
and
f1
00
lb
d(cosx)
cosb
- - - = ---+cos 1 x
I
b
lb
1
cosx d
- 1- x,
x-
((cosx)/x2)dx exists because
l bI
cos xi dx
I
x2
<
-
which converges as b-+ oo. Thus,
lb ..!_
,
fi°
0
x2
dx = l -
!
b'
(sinx)/xdx is conditionally convergent.
•
One can give refined tests such as the Dirichlet test for series to show conditional convergence when a test for absolute convergence fails. See Exercise 30
at the end of this chapter.
8.5.7 Example We give some standard improper integrals that are useful
in conjunction with the comparison test. They may all be proved by direct
integration, successive integration by parts, or similar uicks.
a.
roo x!' d { converges if p < -1
x
Jt
ra x!' d
b.
Jo
c.
J1
00
x
diverges if p ~ - I
{ converges if p > - I
diverges if p ~ - I
e-xx!' dx converges for all p.
e.
J; e /x,x;P dx diverges for all p
J; logxdx converges
f.
JtOf logx)dx diverges.
d.
1
ft
r
1/(p + 1)11 if pt- -1, and Jt(l/x)dx = logx!J. Now
For instance,
x!' dx =
log c -+ oo as c -+ oo, and c1'1' 1 -+ oo as c -+ oo if p + I > 0, that is, p > -1.
This gives a. Part b is similar, and e and f can be proved in the same way. We
outline c in Exercise 1 at the end of this section, and d is similar.
•
1 Jx3"+I
00
8.5.8 Example . Show that
1
I
x3 + I
dx converges.
Chapter 8 Integration
466
Solution This is improper as x - oo. Now, for x ~ 0,
1
< _1_ = x-J/2
✓x3 +1 -
n
'
and ft° x- 312 dx converges, by a of 8.5.7. Hence, by comparison, this integral
converges as well.
•
Exercises for §8.5
1.
Establish formula c of Example 8.5.7 as follows. Prove that e-:rxi>+"2
as x - oo, and then compare the integral with ft°(l/r)dx.
2.
Let f : [a, oo[ - JR be Riemann integrable on bounded intervals. Show
that fa00 f (conditional convergence) exists iff for every t: > 0, there is a
T such that t 1 , t2 ~ T implies
1112/(x)dxl
-
0
< t:
(this is called the Cauc/zy criterion).
3.
Show that
f01 e-:rxp dx converges if -1 < p.
4.
Show that
!
00
I
sinx
.
- 2- - dx 1s convergent.
X + l
1
00
5.
For what a is
0
x°
-1- - dx convergent?
+X°'
§ 8.6 Some Convergence Theorems
In Chapter 5 we saw that uniform convergence is sufficient to allow us to
interchange limit and integration operations. In this section we refine that result.
A course at this level is not the proper place for an exhaustive treatment
of convergence theorems, since they fit in more naturally in advanced courses
in measure and integration. Therefore we confine ourselves to an illustrative
theorem: the monotone convergence theorem, which will be needed in Chapter
10 for Fo1,1rier series. For a more complete discussion of convergence theorems
in the Riemann theory, see W. A. J. Luxemburg, "Arzela's Dominated Convergence Theorem for the Riemann Integral," Am. Math. Monthly, 78 (1971),
970-979.
467
§8.6 Some Convergence Theorems
8.6.1 Lebesgue's Monotone Convergence Theorem Let gn: [O, 1)
- JR be a sequence of nonnegative functions such that each improper integral
gn(x)dx exists and is finite. Suppose that 0 :5 gn+I :5 gn and that gn(X) --+ 0
for each x E [0, I). Then
J;
lim
n-+oo
1
1
0
gn(X)dx = 0.
At this point the reader should return to the examples in §5.3 to see that
they are in accord with this theorem. Certainly, if the gn's are not decreasing,
the result is not true, as examples in that section show.
8.6.2 Coronary Let fn,f : [0, l] --+ JR be nonnegative integrable functions
and suppose thar O :5 fn :5 /n+I :5 / and that f(x)dx exists as an improper
integral. Furthermore, suppose fn(x)--+ f(x) for all x E [0, I]. Then
J;
1
1
lim 1 f,,(x)dx=1 f(x)dx.
n-+oo
O
0
This result follows by applying Theorem 8.6.1 to the functions gn(x) = f(x) fn(x), which decrease monotonically to zero. The details are left to the reader.
We used the interval [0, l] for definiteness, but any other interval could be used
as well and the results could also generalize to lRn.
8.6.3 Coronary /ff:
oo exists, then
J:<f - fM>2
[a,b]--+
--+
JR./~ 0, and the improper integral J: f 2 <
0 as M--+ oo.
Here we use Theorem 8.6.l with !Jn = f - fn (the function fM, with M here
chosen to be n, is defined in the previous section).
8.6.4 Example Prove that limn ..... oo
f01 e-nx2 x!' dx = 0 if p > -1.
Solution Theorem 8.6.1 applies. The functions gn(x) = e-nx.2 x!' satisfy gn(x) :5
x!' and so are integrable, and moreover, the gn 's decrease pointwise to zero. •
468
Chapter 8 Integration
8.6.5 Example Let gn be nonnegative and integrable on
I:::'. 1 gn(x) and assume that g(x) is integrable.
11 g(x)dx=
0
Solution
f
[O, I]. Let g(x) =
Prove that
11 gn(x)dx.
0
n=I
Let
n
fn(x) =
L gk(x)dx,
n
I
so that
k=l
{ fn(x) dx =
lo
I
L { gk(x) dx
k=I
lo
Since thefn(x) increase tog, Corollary 8.6.2 shows that their integrals converge
to the integral of g. •
8.6.6 Example Let a 1 = I and I 2 a; 2 0, i = 1, 2, ... , be a decreasing
sequence with a; ----+ 0. Let fj 2: 0 be a sequence of positive numbers, and let
f(x) =fj on the interval [a;+ 1,a;[, as in Figure 8.6-1. Assume that/ is integrable.
Show that
Io' f(x) dx = "£,'j: jj(aj+I 1
aj).
y
JC.x)
_J
_ _ _ _ _.....__
___......__~-----.. X
0
FIGURE 8.6-1
A "step function" with infinitely many steps
Solution Letfn be the function that is zero if x :San and equals/(x) otherwise.
Id
Clearly fn(x) dx = "£,j=~ 1/j(aj+1-aj). Butfn is monotone increasing to/ (except
•
possibly at x = 0), and so by 8.6.2 we get the result.
469
§8.7 Introduction to Distributions
Exercises for §8.6
1.
Show that Theorem 8.6.1 can be proved using the methods of Chapter 5
if the 8n are continuous.
2.
Evaluate lim
3.
Evaluate
1
1
n-oo
o
exsinnx
- - - dx.
1
1
lim
n
1-e-n.x
n-oo O
. r;.
yX
dx.
§8.7 Introduction to Distributions
Around 1930, in his famous book The Principles of Quantum Mechanics, Dirac
emphasized the usefulness of the 6 function, which he defined by the following
properties:
f,(x)
={ 0
00
!fIf
X
X
;6 0,
= 0,
and
1:
8(x)dx = I.
One can imagine approximations to c5 where!,, -+ c5 in some sense (Figure 8.7-1),
but c5 itself is hard to visualize directly. Physicists quickly realized (as engineers
had done independently) the usefulness of such ideas and proceeded to use the
6 function in their computations. For example, to describe the charge density <T
of a point charge with charge e, it is convenient to write <T = e6. One imagines
e6 as a limit of well-defined charge densities <Tn smeared out over small areas
that concentrate down to a single point as n -+ oo.
0
FIGURE 8.7-1
An approximate 6 sequence
At the same lime as the physicists and engineers were computing, mathematicians sat back in smug amusement, occasionally pointing out that this 6 function
Chapter 8 /,itegratio,i
470
business was really all nonsense because no such function could exist. The
definition did not really make sense, as anyone could plainly see. To add to the
mathematicians' enjoyment, Dirac proceeded to differentiate this function 6!
But the physicists turned out to have had a good idea. Today, distributions,
of which 6 is an example, are indispensable in analysis and applied mathematics.
But it took mathematicians almost 20 years to establish the theory of distributions
satisfactorily. This was done by L. Schwartz and S. L. Sobolev around 1948,
although hints of the theory had occurred in the works of earlier mathematicians
as well. We give only the briefest glimpse of the theory.
In actual computations, 6 almost always appears under the integral sign in
the form
j 6(x)j(x)dx =f(0).
(I)
We can see the idea behind Equation (l) from Dirac's definition, because the
fact that 6 is zero away from x = 0 means that only /(0) counts, so that one
ought to have
j 6(x)f(x)dx =f(0) j 6(x)dx =f(0).
Equation (I) is the clue to how to proceed, as follows. Consider the space
Cb(IR, JR) of bounded continuous functions on IR (§5.5). Then, regard
6 : Cb(IR, IR)-+ IR,
f
1-+
f(0).
Thus we do not regard 6 as a function on JR at all, but a function on Cb(IR, IR)
that maps f to /(0). This operation is well defined, and 6 is easily seen to be
continuous (see Exercise 40, Chapter 5).
Thus, it is possible to circumvent the difficulty with Dirac's definition by
taking a whole new point of view; that is, think of 6 as being an assignment of a
number to each function f. This assignment is denoted by the symbolic expression J 6(x)f(x) dx. Any continuous function g also defines such an operation; it
sendsf to
1:
g(x)j(x) dx.
Thus distributions (of which linear maps from Cb to IR are examples) include
more than just ordinary functions.
How does one differentiate 6? For this, note that if g is differentiable, then
1=
-=
dg f(x) dx = -d
X
provided that f is zero for large
lxl,
1=
-=
df (x) dx,
g(x)-d
X
as can be seen by an integration by parts.
Exercises for §8.7
471
Thus dg/ dx sends f to the same number that g sends -df/ dx to. Thus it is
logical to define 6' by
8, (j)
df)
df
=6 ( -dx
- =--(0).
dx
This leads us to replace Cb by the functions f that are C1 and vanish for large
lxl. We can also replace Cb by the C'° functions vanishing for large enough lxlWe are led to the next definition.
8.7.1 Definition Let V denote the space of C'° functions that are identically zero outside some interval (V is called the space of test functions). A
distribution T is a linear map T : 'D --+ JR.. (Strictly speaking, one must require
T to be continuous in the sense that if fn -+ f uniformly on bounded sets and all
derivatives offn converge uniformly TO those off on bounded sets and all the fn
vanish outside a fixed compact set, then T<fn) --+ T(j). The actual topology on
V is a bit complicated, however.) The derivative T' is defined by T' : 'D --+ R,
/ H T(-df/dx).
If g is a continuous function, it is customary to use the same symbol g for
the distribution that maps f H J~00 g(x)f(x) dx.
This elegant formulation was due to the founders of distribution theory.
Much work remained to prove significant theorems about distributions, and this
led to an important revitalization of the theory of partial differential equations.
The physicists were pleased and everyone was happy-and will be at least for
the moment, until the scenaiio repeats on other topics.
Exercises for §8.7
1.
Show that 811 (f) = f" (0).
2.
Let T,, and T be distributions. Say Tn
Show that
--+
2
T if Tn(f)--+ T(f) for a1l/ E 'D.
C
e-nx __, u.
7r
412
3.
Chapter 8 Jntegration
If Tn
--+
T (see Exercise ,2), show that T~
--+ • T'.
Discuss and compare with
§5.3.
4.
Find a sequence of continuous functions gn such that gn
-+
6'.
Theorem Proofs for Chapter 8
We will prove Darboux's theorem and Riemann's condition together.
8.1.2 Darboux's Theorem Let A c Rn be bounded and lie in some
rectangle S. Let f : A -+ R be bounded and be extended to S by defining f = 0
outside A. Then f is integrable with integral I if! for any e > 0 there is a 6 > 0
such that if Pis any partition of S into rectangles S1, ... , SN with sides of length
< 6 and if x1 E St, ... ,XN E SN, we have
[t,f(x;)v(S;) - I
We call
I::
1
< e.
f(x;)v(S;) a Riemann sum.
8.1.3 Riemann's Condition f is integrable if! for any c > 0 there i's a
O::; U(/,Pc)- L(f,Pc) < €.
partition Pc of S such that
Proof We will show that the conditions ''f integrable," ''f satisfies Riemann's
condition," and "f satisfies Darboux's condition" are equivalent. This will be
done in four steps.
Step 1 If f is integrable, then f satisfies Riemann's condition.
Proof Given e > 0, there is a partition P'.: such that
Theorem Proofs for Chapter 8
473
where/= JAJ. We can do this, since/= inf{U(f,P)
finer than ~. then we know that
IP is a partition}. If Pis
Similarly, choose P': such that for P finer than P~. we have L(f,P)
Let Pi:=~ UP':. If Pis finer than P1:, then
I-
c
2 < L(f,P) $
U(f,P) <I+
> I - e/2.
€
2,
and so
0 :$ U(f,P)- L(f,P) <
€,
which is Riemann's condition.
Step 2 If f satisfies Riemann s condition, then f is integrable.
Proof
For any e
> 0,
there is a P~ such that
This implies that S = s. Indeed, for each P, we have
L(f,P) $ s $ S $ U(f,P),
and so if U(f, Pi:) - L(f, P,J
hence S .,; s. ,
< €,
we also have S - s
<e
for every e
>0
and
Step 3 Iff satisfies Darboux's condition, then f is integrable.
Proof We will show that the/ given in Darboux's condition will be the same
as S = inf{ U(f, P) I P is a partition} and also the same as s. To accomplish this,
given e > 0, we prc,duce a pru1ition P such that
IU<J,P)-
II<€,
474
Chapter 8 Integration
which will show that S 5 I. Similarly, we will have/ 5 s, and then/ 5 s 5 S :5 /
will imply s = S = I. To do this, choose 6 > 0 such that if P is a partition with
sides < 6, then
ILf(x;)v(S;) < ;.
,1
where S1 , •.• , SN are the rectangles making up the partition P. Choose x; such
that
c
lf(x;) - s~_p(f)I < v(S;)2N.
I
Then
IU(f, P) - II
Now
so that IU(f,P) -
s IU(f, P) -
Lf(x;)v(S;)I + ILf(x;)v(S;) -
11 .
~
I< ~
ev(S;)
e
IU(f,P)- ~f(x;)v(S;)
~ v(S;)2N = 2,
II < e, as required.
The case for lower sums is similar.
Step 4 If f is integrable, then f satisfies Darboux's condition.
Proof Suppose f is integrable with integral I. We will show, in two steps,
that for any e > 0, there is a 6 > 0 such that if P is any partition into rectangles
S1, ••• ,SN with sides < 6, and if X1 E S 1, ••• ,XN E SN, we have
lt.f(x;)v(S;) -
,1 < e.
Rn. Given c > 0, we shall
show that there exists a 6 > 0 such that for any partition P' into subrectangles
with sides less than 6, the sum of the volumes of the subrectangles of P' that
are not entirely contained in some rectangle of Pis less than£.
To see this, we examine the cases n = l end n > l separately. First, suppose
that we are working on the interval [a, b]; suppose that the partition P consists
of N points. We assert that the 6 that is needed is simply given by 1::/N. Indeed,
the length of the intervals in P' that are not contained in an interval of P is
N x 6 = (maximum number of intervals not contained entirely in an interval of
P) x (maximum length of each such interval of P') = £. Turning to the general
case, let the partition P consist of rectangles Vi, ... , VM. We denote the total
"area" of the faces lying between any two rectangles by T. Let 6 = c/T and let
P' be any partition of B into subrectangles of sides less than 6. For any rectangle
Step 4A Let P be a partition of the rectangle B C
475
Theorem Proofs for Chapter 8
S E P' such that S is not contained in one of the V;, S intersects two adjacent
rectangles. One can see that v(S) 5 M, where A is the total area of faces between
two subrectangles contained in S (see Figure 8.P-1). Thus LseP' v(S) < 6T = c.
rectangles in P
SEP'
A = area of this face
FIGURE 8.P-1
Showing that v(S) = wA 5 6A
Since f is bounded, there exists an M > 0 such that IJ(x)I < M
for all x E S. There are partitions Pi and P2 of S such that / - l(f, Pi) < c /2
and U(f,P2 ) - I< c/2. Choose a partition P that refines both Pi and P2 • Then
U(f,P)-1 < c/2 and/ - L(f,P) < c/2. By step 4A, there exists a 6 > 0 such
that for any partition of Pinto rectangles of sides < 6, the sum of the volumes
of the subrectangles not contained in some subrectangle of Pis less than c/2M.
Let S1, ••• , SN be a partition into subrectangles of side less than 6, let Si, ... , Sx
be the subrectangles contained in some subrectangle of P, and let Sx+r, .•. , SN
be the remaining subrectangles. If x 1 E S1, ... ,XN E SN, then
Step 4B
N
K
N
Lf(x;)v(S;) = Lf(x;)v(S;) + L
f(x;)v(S;)
i=K+i
i=-1
i=l
c
<
- U(f ' P)+AA,
•r• -2M
c
= U(f, P) + 2 < / + c.
•
Similarly,
N
Lf(x;)v(S;)
i=l
~ l(f,P)- ~>I - c.
476
Chapter 8 I,itegration
Therefore,
N
Lf(x;)v(S;) - I
•
< e.
i=I
In some later proofs it will be convenient to have the following technical point
at hand: In the definition of measure zero, one can use either closed or open
rectangles.
Proof Let A C IR~ First, suppose that, given e > 0, there are open rectangles
V1 , V2, ... covering A of total volume < e. Let B; = cl(V;). Then 81, 82, ... are
closed rectangles covering A with the same total volume < .::
Conversely, given e > 0, suppose we have a covering by closed rectangles
B1,B2 , •.. with total volume< t/2n. Let V; be the open rectangle containing B;
with twice the side. Then v(V;) = 2nv(B;), and so
00
L
00
v(V;) = 2n L
v(B;) < E:.
This same argument also works for content zero. See Exercise 11 at chapter's
end. •
8.2.4 Theorem Suppose that the sets A 1 ,A2, ... have measure zero in IR~
Then A1 U A2 U • • • has measure zero in JR~
Proof Since all of the A; have measure zero, there is a covering of the A;
Lj':
with rectangles Bil, B,'2, ... such that
1 v(Bij) < e/2~ Since the collection
Bil, B,'2, ... covers the A;, the countable collection of all Bij covers A I U A 2 U · · · .
But
00
00
00
00
i=I j=I
i=I
L v(Bij) =LL v(B;j) < L ;; =e.
i,j=I
Since e is arbitrary, A I U A2 U · · · has measure zero.
•
Note. That we can sum up the v(B;j) first by j, then by
i, follows from the
fact that the tenns can be rearranged in an absolutely convergent double series.
See Exercise 51, Chapter 5.
8.3.1 Lebesgue's Theorem let A c !Rn be bounded and let f : A -+ JR
be a bounded function. Extend f to all of 1Rn by letting it be zero at points
'heorem Proofs for Chapter 8
477
ot contained in A. Then f is (Riemann) integrable iff the points at which the
·xtended f is discontinuous form a set of measure zero.
roof
Consider some rectangle B that contains A. Then we must show that
e function f is integrable on A iff the set of discontinuities of the function g,
hich equals f on A and zero elsewhere, has measure zero.
It is useful for the proof to have a measure of how "bad" a discontinuity is .
. o do this, we define the oscillation of a function h at xo, written O(h, .xo), to
O(h,xo)
= inf{sup{lh(xi)- h(x2)I I x1,x2 E U} I U is a neighborhood of xo}.
ote that the inf is taken over all neighborhoods U of xo and that O(h,xo) ~ 0.
e· claim that O(h,.xo) = 0 iff h is continuous at x0 • To see this, note that h is
ontinuous at x 0 iff for any E > 0, there is a neighborhood U of xo such that
up{lh(xo) - h(xi)I I xi E U} < e, and this is equivalent to O(h,x0 ) = 0.
, We are now ready to continue the proof-for convenience, it is broken into
two steps. Let g : B -+ R be defined by g(x) f(x) if x E A, and g(x) 0 if
=
=
t~¢ A.
I'
~tep 1 We assume that the set of discontinuities of g has measure zero. Thus,
~ we let De= {x I O(g,x) ~ c} for E > 0 and D = {discontinuities of g}, then
f e CD. If y is an accumulation point of De, every neighborhood of y contains
rl point of De. Then every neighborhood U of y is a neighborhood of a point
I x1,x2 E U} ~ c. This
:tmplies O(f,y) ~ e, and soy E De. This proves that De is a closed set. Since
· De C B, De is bounded and therefore compact. Now De has measure zero, since
De C D, and so by definition there is a collection B 1, B2 , ••• of (open) rectangles
that cover De such that E~ v(B;) < E. We know that a finite number of the
B; cover De, since De is compact. Suppose B1, ••• , BN cover De, Certainly,
1of De, and by construction of De, sup{lf(x1) - f(x2)l
L~ 1 v(B;) < E.
Now pick a partition of B. By refining the partition, we may assume that each
re~tangle of it is either disjoint from De or contained in one of the rectangles
B1, ~2, .•• , BN that cover De;. Thus the rectangles of the partition fall into two
, (not necessarily disjoint) collections: C1 = { those rectangles that are contained in
one of the Bk} and C2 = {those rectangles that do not intersect De}. We now use
compactness to subdivide the rectangles in C2 to obtain a further refinement of
our partition. For each rectangle S that does not intersect De, the oscillation of g
at each point of the rectangle is less than c. Hence we can find a neighborhood
Ux of each point x of the rectangle such that Mux(g) - mux(g) < E, where
Mux(g) = sup{g(y) I y E Ux} and mux(g) = inf{g(y) I y E Ux}, Since Sis
Chapter 8 lntegratum
478
compact, a finite collection of the open sets U,, covers S. Pick a refined partition
in S such that each rectangle of the partition is contained in U:r-I for some U,,.I
in the finite collection that covers S. If we do this for each S in C2, we get a
partition P such that
U(g, P) - L(g, P)
:5
L (Ms(g) -
ms(g))v(S) +
SEC1
L (Ms(g) -
ms(g))v(S)
SEC1
:5 ev(B) + L
2Mv(S), where
1/(x)I < M on A
SEC1
:5 ev(B) + 2Me,
L
since
N
v(S)
SEC1
<
L v(B;) < e.
i=I
Since e is arbitrary, Riemann's condition shows that g and hence/ are integrable.
Step 2 Suppose g is integrable. The set of discontinuities of g is the set of
points of oscillation greater than zero. Hence {discontinuities of g} = D 1UD 1; 2 U
D 1;3 U ···,where D 1;n = {x EB I O(g,x) ~ 1/n}. By Theorem 8.1.2, there is a
partition of B such that U(g, P) - L(g, P) = L-sEP(Ms(g)- ms(g))v(S) < e. Now
D1;n = {x E D1;n Ix lies on the boundary of some S} U {x E D1;n I x E interior {S) for some S} = S 1 U S2 • The first of these sets, S 1, has measure zero, since
we can cover the boundary of a rectangle with arbitrarily thin rectangles. Let C
denote the collection of rectangles of the partition that have an element of D1;n
in their interior. Then, if S E C,
1
Ms(g) - ms(g) ~ n
and
-1 L v(S) :5 L(Ms(g) - ms(g))v(S) :5 I)Ms(g) - ms(g))v(S) < e.
n SEC'
SEC
SEP
Hence C is a collection of rectangles that covers S2 and LSEC v(S) < ne:. We
can find a collection C' of rectangles that covers S1 with Lsec• v(S) < e:. Then
CUC' covers Di;n and Esecuc' v(S) < (n + 1)€. Since e: is arbitrary, D 1;n
has measure zero. Finally, {discontinuities of g} = D 1 U D 10 u D 1; 3 U · • • has
measure zero, by Theorem 8.2.4.
•
8.3.2 Corollary A bounded set A
has measure zero.
C !Rn has volume
if! the boundary of A
Theorem Proofs for Chapter 8
479
Proof
By Theorem 8.3.1, it suffices to show that the set of discontinuities of
1,t. where
~f x ¢ A
1, 1f x EA,
l,t(X) = { 0,
is 1 the boundary of A. Bl;lt if x E bd(A), then any neighborhood of x intersects A
and IR."\A. Hence there are points yin the neighborhood such that l..t(x)-1..t(Y) =
I. Thus IA is not continuous at x. If x ¢ bd(A), then there is a neighborhood of x
that lies entirely in A or IR.'\A. In either case, IA is constant on this neighborhood,
and so 1..t is continuous at x.
•
8.3.3 Corollary Let A C IR." be bounded and have volume. A bounded
function f : A
is integrable.
-t
JR. with a finite or countable number of points of discontinuity
Proof The discontinuities of the extended function g, which is equal to/ on A
and zero at points outside A, are simply the discontinuities off possibly together
with some discontinuities of g on the boundary of A, for the same reason as in
the proof of 8.3.2. But bd(A) has measure zero, by Corollary 8.3.2. Hence it is
sufficient to show that a countable set has measure zero. But this follows from
Theorem 8.2.4 and from the fact that a point has measure zero.
•
8.3.4 Theorem
i,
Let A C JR" be bounded and have measure zero and let f : A ..... JR be any
(bounded) integrable function. Then JAf(x) dx = 0.
ii.
If f : A ..... JR
is integrable and f(x) 2 0 for all x and f..tf(x) dx = 0, then
the set {x EA lf(x) 'f' O} has measure zero.
Proof
i.
We claim that a set of measure zero cannot contain a nontrivial rectangle,
that is, a rectangle [a,, bi] x · · · x [an, bnJ such that a; < b; for each i.
The reason is that a subset of a set of measure zero must be of measure
zero and a nontrivial rectangle does not have measure zero. Let S be a
rectangle enclosing A, and extend/ to S by setting it equal to O on S\A;
let P be any prutition of S into subrectangles S 1, •.• , SN, and let M be such
· that f(x) =:; M for all x E A.
Chapter 8 Integration
Then
L(f, P) =
N
N
i=I
i=I
L ms;(/)v(S;) ::; ML ms;OA)v(S;).
Suppose ms;OA) ~ 0 for some i, such that S; is a (nontrivial) rectangle. This means that S; c A, which contradicts the opening remarks
of the proof. Thus, for any nontrivial S;, ms;OA) = 0, while for any
trivial S;, v(S;) 0. Hence }:~1 ms;OA)v(S;) 0, or L(f, P) ::; 0. Now
SUPxes/(x) = -infxes;(-/(x)), and so
=
=
V(f,P) = "sup/(x)v(S;) = - "
L,__ ES
S;EPX
inf(-/(x))v(S;) == -l(-f,P),
L,__ xES·
;
S;EP
I
and by the same argument again, l(-f,P) :s; 0. Hence -l(-f,P) =
U(f, P) ~ 0. Since P was arbitrary, U(f, Q) ~ 0 ~ l(J, Q) for any
partition Q of S, and hence
and so, since f is integrable,
Consider the set A,,,= {x EA I f(x) > 1/m}; we first show that Am has
content zero. Given e > 0, let S be a rectangle enclosing A, extend/ to S
by setting it equal to O on S\A, and let P be a partition of the rectangle S
such that V(f, P) < e / m. Such a partition exists by the fact that JAi = 0.
If S 1, ••• ,Sx are the subrectangles of the partition P that have nonempty
intersection with A,,,, then, if Ms;(/) is the sup of/ on S;,
K
K
i=I
i=I
L v(S;) ~ L mMs;(f)v(S;) < e,
since mMs;Cf) > l. Therefore S1, ••• , Sx is a cover by closed rectangles
of the set A,,, such that
1 v(S;) < e. Hence Am has content zero. Since
A,,, has content zero, it also has measure zero.
"I:f:
Finally, observe that
00
{x EA lf(x) ~0} = LJAm.
m=I
Thus, by Theorem 8.2.4, this set has measure zero.
•
481
Theorem Proofs for Chapter 8
8.4.1 Theorem Let A, B be bounded subsets of JR~ c E JR, and let f,g :
A - R be integrable. Then
i.
f + g is integrable and Iii+ g) =IA!+ IA g.
ii.
cf is integrable and IA (cf) = c JAJ.
iii.
Ill is integrable and I JAJI
iv.
/f f
v.
If A has volume and Iii :5 M, then I JAJI :5 Mv(A).
vi.
Mean Value Theorem for Integrals If f : A -+ JR is continuous
and A has volume and is compact and connected, then there is an xo E A
such that JAJ(x) dx = f(xo)v(A). The quantity
:5 JA IJI.
:5 g, then JAJ :5 JAg.
1
v(A)
is called the average off over A.
vii.
Let f : A U B -+ JR. If the sets A and B are such that A n B has measure
zero and fl(A n B)./IA, and JIB are all integrable, then f is integrable on
AU Band IAuBf = JAi + fng.
A and B, then it is integrable on A n B. Indeed,
if A and B have volume, so does A n B, since bd(A n B) c bd A u bd B.
Note. If f is integrable on
Proof
i.
Let S be a rectangle enclosing A and let f and g be extended to S by setting
them equal to zero on S\A. Suppose£ > 0 is given. By Theorem 8.1.2,
there is a 61 > 0 such that if P1 is any partition of S into subrectangles
St, ... ,SN with sides less than 61 and if Xt E S1, ... ,XN E SN, then
Similarly, there is a 62 > 0 such that if P 2 is any partition of S into
subrectangles R1 , ••• , RM with sides less than 62 and if x 1 E R1, ••• ,XM E
RM, then
Chapter 8 Integration
If we let 6 = min(61, 62), then if P is any partition of S into subrectangles
T1, ... , TK with sides less than 8 and if x1 E T1, ... ,XK E TK, then
t,(g(x;) +f(x;))v(T;)-1/-1
gl <
€.
By Theorem 8.1.2, we may conclude that/+ g is integrable and IA (f + g) =
JAf+JAg.
·
Suppose c > 0 is given. Let S be an enclosing rectangle for A with f
extended to S as in i. Let 6 > 0 be such that if P is any partition of S into
subrectangles S1, ... , SN with sides < 6 and if x, E S1, ... , XN E SN, then
This implies that
It.
c/(x)v(S;) - c
111 <
€.
Hence cf is integrable and JA (cf) = c J;.f.
This part will be proved as a corollary of iv.
Let P be any partition of the enclosing rectangle S. Since f $ g, we see
that U(J, P) $ U(g, P). Thus
1I
= inf{ U(J, P) I P is any partition}
$ inf{U(g,P) IP is any partition}=
1
g.
We use the fact that if f is continuous at a point x in its domain, then I/I is
continuous at the point, since I/I is the composition of y ..... !YI following
f. Hence, if/ is integrable over A, then, by Theorem 8.3.1, I/I is integrable
over A. Now -I/I $ f $ Ill, and so, by iv, l/1 $
$ IA Ill, and
therefore I J;.fl $ JA lfl,
J;.
JAi
If Pis any partition of the enclosing rectangle S into subrectangles S 1, ... , SN,
then
1111 $
A
U(lfl,P)
N
N
i=I
i=l
=LMs;<IJl)v(S;) $ MLMs;OA)v(S;) =MU(lA,P).
483
Theorem Proofs for Chapter 8
·This implies that JA !fl $ Mv(A), and so
11/1 11/1
$
vi.
$ Mv(A).
Let m = inf{f(x) I x E A} and M = sup{/(x) I x E A}. By· assumption, m
and M are assumed values, since A is compact. Let ,\ = (JAJ) /v(A). (If
v(A) = 0, the theorem follows from Theorem 8.3.4.) By v, m $ >. $ M,
and so, by the intennediate value theorem, there is an Xo E A such that
f (xo) = ..\, which proves the assertion.
Note. More careful reasoning shows that compactness of A is not necessary;
see Exercise 19.
vii.
Let /1 = f · lA,h = f · 1B, and h = f · lAnB, so that these represent the
extensions of/IA.JIB, andflAnB required by the definition ofintegrability.
By assumption,/1,h, and/) are integrable and, for example, fAuBf ·IA=
JAi, by definition. Now f = f1 +h - h so by i,
[
JAuB
f=[
JAuB
~+f h- [ h=
JAuB
JAuB
0+0[ ~
Jn
JAnB
JA
Since A n B has measure zero, Theorem 8.3.4 gives fAnBf = 0.
•
~ 0 and is bounded and integrable on every
cube [-a, at. Then f is integrable if! the following condition holds: For each
sequence B1c of bounded sets with volume such that
8.5.2 Theorem Assume that f
i.
Bk C Bt+l and
ii.
For any cube C we have Cc Bk for sufficiently large k,
Iimk-+oo f 8
J exists. In this case, lim1-00J
= JAJ.
8/
Proof Suppose/ is integrable. If [-a,a]n C Bt c [-b,b]n, then
1
[-a,a]n
1$
r f $1
} 81c
[ -b,b]n
f.
since f · Ic-a,a)n $ f · Ink $ f · 1[-b,b)n· Since lima-+oo ~-a,a1nf exists, so does
limk-+oo f 8
and the limit is the same. By ii, fork large, we have [-a, a]n C Bk
for any given a.
J,
484
Chapter 8 lntegrlllion
f8 ,J is an increasing sequence in k, using i, and has a
-+ C as k --+ oo. Thus, since f 8 ,J $ C for all k and since, for
For the converse,
limit. Say
f8/
each a, [-a,a]n C B1:; for some k, fi-a.aY.f $ C for all a. Hence, as a--+ oo
this is an increasing function of a that is bounded above, and so it converges
(why?). •
8.5.5 Theorem
Suppose f : [a, oo[ -+ JR is continuous and f ~ 0. Let F be an antiderivative off. Then f is integrable iff limx-= F(x) exists. In this case
i.
1 =1
f
[a,cx,[
ii.
Suppose f : ]a, b]
-+
00
a
f(x)dx
={x-°:'
Jim F(x)} -
JR is continuous and f
lim
lb
~
F(a).
0. Then f is integrable ijf
f(x)dx
,:-0 a+£
exists. This limit equals
J:f(x) dx.
Proof
=J:
J: =
i.
Ji-b,bY'f•lca.bl
f for b > a, and f F(b)-F(a). Hence limb-= lb!
exists iff limb-oo F(b) does, and fa00 f = (lilllb-oo F(b)) - F(a).
ii.
The second part is a little trickier. Fore> 0 definer to be O on [a,a+c]
and f on ]a + E, b]. We proceed by first giving two preliminary steps.
Step 1 If M =sup{/(x) I a + t:
J(x)
~
::; x ::; b}, then, recalling that fM(x)
M and zero othenvise, we have
=/(x) if
because f ::5 /M on [a,b]. Note that /M might not be zero on [a,a + E]; in
Figure 8.P-2, it is not.
485
Theorem Proofs for Chapter 8
M
a=O
FIGURE 8.P-2
b
t
Trimming and truncating an unbounded function
Step 2 For any e and M,
, This is because/M $Mon [a,a+e], and for x E [a+e,b].fM(x) $/(x), and so
l b/M - lbr 5(la+€ /M + 1" 1) -1" = la+.!M
I
f
a
a
a
a+e
I
r
I
J
a+e
$ eM.
a
To demonstrate the theorem, first suppose /M -+ I as M -+ oo. Notice that since
f ~ 0, JJM increases as M i_ncreases. We must show that fr also converges to
/ as e -+ 0. It clearly increases to something :$ /, by step 1. Given 6 > 0, choose
M such that/ - f /M < 6/2. If e = 6/2M, then, by step 2, f fM - f JE < 6/2.
Hence I < 6. Thus lim.-+o
= I.
The converse follows in much the same way, again using steps 1 and 2 to
show that if Ir-+ I, then J!M-+ I. •
To prepare for the monotone convergence theorem, we prove the following
lemma.
I
fr
fr
Chapter 8 Integration
486
Lemma Suppose f is Riemann integrable, f : (0, 1) ...... JR. I/I :5 M, and
f 2:: a > 0. Then E = { x E (0, 1) I / (x) 2:: a /2} contains a finite union of
intervals of total length l 2:: a./4M.
J;
J;
Proof Let P be a partition of (0, 1) such that 0 :5 f - L(f, P) :5 a/4. Then
L(f,P) 2:: 3o/4. We will now show that the intervals REP with RC E satisfy
the conclusion; let l denote their total length. We have
L
3° $ L(f, P) =
4
REP,RCE
$Ml+
inf f(x)v(R) +
xek'
a
2o -
l) $Ml+
L
REP.R~E
inf f(x)v(R)
xek'
a
2.
•
Therefore I 2:: a/4M, as claimed.
8.6.1 Lebesgue's Monotone Convergence Theorem Let 8n : (0, I]
...... IR be a sequence of nonnegative functions such that each improper integral
8n(X) dx exists and is finite. Suppose that O $ 8n+I :5 8n and that 8n(x) -+ 0
for each x E (0, I]. Then
J;
lim
[1 gn(X) dx = 0.
n-=Jo
Proof We have O :5
J01 8n+1 :5 Jd gn;
that is, the integrals form a bounded
decreasing sequence, and so A = limn-oo 8n exists, and A 2:: 0. We wish to
J;
show that A = 0, and so we assume that A > 0. Note that f01 8n 2:: A for all n.
Consider En= {x E (0, l] I Bn(X) c: 2A/5}. Observe that En+I C En, We want to
apply the lemma, but 8n might not be bounded. However, 8n :5 81; since
81
exists as an improper integral,
81M ......
81 as M ...... oo. Here
J;
fd
gnM
(x)
={
8n(X)
M
Jd
for 8n(X) < M
for 8n(X) C: M.
Choose M > 2A/5 and such that 0 $ Jd(g1 - 81M) $ A/5, so that for all n,
Worked Examples for Chapter 8
487
J;
Therefore,
gnM ~ 4>./5 = o. Note that since M > 2>./5,En {x E [0, l] I
gnM(x) ~ 2>./5}. Therefore, by the lemma, En contains a finite union of intervals
of total length ~ >./5M. Now define
00
D = LJ{x E [O, l] I gn is not continuous at x};
n=I
then D has measure 0. Thus D is contained in the union U of a countable
number of disjoint open intervals with total length < >./5M. It may be readily
seen that En is not a subset of U. Observe that if xo is an accumulation point of
En but is not in En, then Cn must be discontinuous at xo, so that xo E D C U.
That is, cl(En) C En U U. Define Fn = cl(En)\ U. By what we have just shown,
Fn C En. But Fn is compact and Fn+I C Fn, and so by the nested set property,
Fn "f 0, and hence
1 En ~ 0. But tliis means that for some x E [0, l],
gn(x) ~ 2>./5 > 0 for all n, contradicting the hypothesis gn(x) -+ 0.
n:,
n:
•
8.6.3 Corollary /ff: [a,b]-+ R,f ~ 0, and the improper integral
J: /
2
<
oo exists, then J:<f-/M) 2 -+ 0 as M-+ oo.
Proof Apply the monotone convergence theorem with gn = (f - fn)2. Since
(f - fn + 1)2 ~ (/ - fn)2 ~ / 2 , the integrals 8n exist and gn+I ~ 8n· •
J:
Worked Examples for Chapter 8
an
Example 8.1 Show that a bounded set A c
has zero volume if! it can
be covered by a finite number of rectangles of arbitrarily small total volume.
Solution Suppose A has zero volume and let c > 0 be given. Let S be
a closed rectangle containing A, and let IA be the characteristic function of
A (i.e., let IA equal I on A and O on S\A ). By definition of zero volume,
there is a partition P of S into subrectangles S1 , ••. , SM such that U(IA, P) < €.
Let Po be the collection of all those subrectangles S; that intersect A. Then
U(IA, P) = LsePo v(S), and so Po is a collection of rectangles covering A with
a total volume < €.
Suppose, on the other hand, that given c > 0, A can be covered by a finite
number of rectangles of total volume< €. Let these rectangles be Vi, ... , VM.
Let S be a closed rectangle containing A, and let P be a partition of S into
Chapter 8 Integration
488
subrectangles S1, ... ,SN such that each S; is either contained in some Vi or has
at most its boundary in common with some V;'s-the partition is defined by
using all the edges of the Vj's. Then U(lA, P) = I:~ 1 v(V;) < i::. This implies
that inf{ U(lA,P) I Pis a partition} = 0, and hence l(lA,P') = 0 for any partition
P', since O $ l(lA,P) $ U(lA,P) for all P. Therefore A has volume, and this
volume is zero.
•
Example 8.2 Let ft be a sequence of bounded (Riemann) integrable functions defined on R = [a, b] x [c, d]. Suppose ft
integrable on R and
f
-+
f uniformly. Prove that f is
j
lf1c(x,y)dxdy = kf(x,y)dxdy
Solution
(Compare Theorem 5.3.1) First observe that/ is bounded. Indeed,
iffN is such that IJN(x,y) - f(x,y)I < l for all (x,y) ER, then, using the triangle
inequality,
lf(x,y)I ~ 1/N(x,y)I + I.
Therefore, since fN is bounded, so is f.
By Theorem 8.1.2, we must find a number/ such that for every
is a 6 > 0 such that
£
> 0, there
n,m
Lf<.t;,.y~)(Xp - Xp-d(yq - Yq-1) - I
<€
p.q=I
for any subdivision .to, ... , Xn of [a, b] with lxp - Xp-d < 6 and Xp-1 $
$ Xp and a similar subdivision yo, ... ,Ym of [c, d]. Thus we expect I =
Iimt-oo JJRft(x,y)dxdy. Now JJRf1h,y)dxdy is a Cauchy sequence. To see
this, note that
x',,
If
ltt(x,y)dxdy -
j kj.;(x,y)dxdyl <
€
if lft(x,y) - .fi(x,y)I
< are: (R)'
Hence, the sequence converges to a value that we call /.
Given e > 0, choose N such that k ~ N implies I JJR/t(X, y) dx dy-11 < i:: /3,
choose N1 such that k ~ N1 implies lft(x,y)-/(x,y)I < c/3(b- a)(d - c) for all
(x,y) ER, and choose N2 = max(N,N1). Since/N2 is integrable, there is a 6 > 0
such that for lxp -Xp-d < 6, IYq - Yq-d < 6,
Worked Examples for Chapter 8
489
With this choice, the triangle inequality gives
n.m
Et<X:,.y~)(Xp - Xp-1)(yq - Yq-1) -1
p,q=l
n,111
n,m
~ I)<X:,,y~)(Xp - Xp-1)(yq - Yq-d - LfN2<X:,.y~)(Xp - Xp-1)(yq p,q=I
Yq-1)
p,q=l
,1 < i + i + i = c:,
+ Iff/N2(x,y)dxdy -
•
-.yhich proves the assertion.
Note that this procedure was done on rectangles in IR 2 , but the same ideas
work on subsets of !Rn. The reader should formulate the precise statement (see
Exercises 14 and 16 at the end of this chapter).
Example 8.3 Find
f
110.001
(l dx )2 •
+x
Solution We are integrating a nonnegative function defined on an unbounded
set, and so by definition
1
dx
---->
[O,oo(
(1 + .x)-
= Jim
a-oo
1°dx
---1.
o (I + x)-
Since
d(-1/(l+x))
I
dx
= (I +x)2 '
the fundamental theorem shows that
"dx
l
I
a
2
(1
+
x)
=
1
+
a
+
1
+
0
=
1
+a·
0
Hence
1
. la ---2= =l
~
tr-+oo
dx
O
(1
+ X)
1
(O.oo[
dx
---2·
(I + X)
•
490
Chapter 8 llltegraJion
Exercises for Chapter 8
1.
a.
b.
Let / : A C lR.n -+ JR, where A is bounded and f is bounded and
integrable over A. Consider another bounded integrable function g :
A -+ JR. such that g(x) = f(x) except on a set S C A of measure
zero. Assuming that/ and g are integrable on Sand A \S, prove that
JAg = JAf.
If f : A C !Rn -+ JR and g : A C IRn -+ JR are bounded functions,
integrable on the bounded set A, and JA If - gj = 0, prove that
j(x) = g(x) for all x E A, except possibly for a set of measure zero.
2.
Give a proof of Theorem 8.4.liii directly from Darboux's theorem and the
triangle inequality for real numbers.
3.
Prove that an increasing function/: [a,b]-+ JR is Riemann integrable.
4.
Show that /(x) = x2"+ 1 for n an integer ~ 0 is not (absolutely) integrable
over JR, even though lim,,_oo 0 J exists.
5.
In JR 3, prove that any subset of the xy plane has measure zero.
6.
If f : A C lR.n -+ JR and g : A c lR.n -+ JR are bounded integrable functions
on the bounded set A such that f(x) < g(x) for all x E A and v(A) f. 0,
then show that JAi < JA g.
7.
Let f : [a, b] -+ JR be continuous and differentiable on ]a, b[. Assume that
/(a)= 0,/(b) = -1, and J:t(x)dx = 0. Prove that there is a c E ]a,b[
such that /'(c) = 0.
8.
Give a proof of Theorem 8.4.li, ii, iii, iv, and vii for improper integrals.
(The parts of Theorem 8.4.1 omitted here do not make sense if A has
infinite volume.)
9.
Compute the following integrals:
t
r21r
•
d
smx
x;
a.
Jo
b.
J; r(sinx)dx.
10.
If S is a closed or open rectangle, show that the two definitions of volume
coincide. That is, prove that fs 1 = (b1 - a1 )(b2 - a2) · · · (bn - an) where
either S = [a1, bi] x · · · x [an, bn] or S = ]a1, bi [ x · · · x ]an, bn[-
11.
a.
If A C A I U A2 U · · · U AN, all sets having -volume, show that v(S)
}:f: v(A;).
1
~
Exerpises for Chapter 8
b.
491
If A is compact, show that A has measure zero iff A has content zero.
Show that a bounded set B has volume iff bd(B) has content zero.
•· 12~. Prove that A has measure zero iff for ev~ry E > 0 there is a covering of
A by. sets Vi, V2, ... with volume such that I:: v(V;) < E.
13.
Prove that a bounded function f : S -+ IR is integrable on the re<;tangle
S iff there is a number/ such that for any E > 0 there is a partition P£
such that for any refinement P of P, and any choice of x; e S; for S; e P,
II:s;=Pf(x;)v(S;) - 11 < E.
14.
Generalize Worked Example 8.2 to functions I : A c !Rn
-+
JR.
r 15..
Is
I 16.
a.
Suppose fk -+ I uniformly on A C JR". Let Ak be the points of
discontinuity of ft (extended). Show that the discontinuities of I
(extended) are contained in A 1 U A2 U · · ·.
· b.
Use a to give another proof of integrability of the limit function in
Example 8.2.
c.
Jo"° :xP dx convergent for any real p? If so, which p?
Find integrable functions lk : A
-+
JR such that fk
-+
f pointwise but
f is not integrable.
17.
a.
Let Pn denote the division of [a, b] into 2" equal subintervals, and
form L(f,Pn) and U(J,Pn) for I: [a,b] -+ JR a bounded function.
Show that I is integrable iff limn-oo l(j, Pn) = lim,,_ 00 U(f, Pn).
Why do these limits always exist?
b.
Generalize a for A c IR".
! 18. a.
I
(
Generalize the mean value theorem for integrals (8.4.1 vi) to the case
where A is any bounded connected set.
b.
~ 0 for x E a connected compact set A C JR, cp is continuous
and increasing in x, and I is positive and integrable, then / <p is
integrable and JAl'P = <p(c) JAi for some point c EA (second mean
value theorem).
c.
Show that b fails if A is bounded but not compact.
If <p(x)
19.
Let f : B -+ 1R be integrable, I ~ 0. If A C B and f is integrable on A,
then JAi ::; f 8 f. ls this true if we do not assume f ~ O?
~ 20.
Suppose f : ]0, b] -+ JR is continuous, positive, and integrable on ]0, b]
and thl}t as x -+ 0 from the right, f(x) increases monotonically to +<xi.
Prove that Ej(E)-+ 0 as E -1 0.
I
t
Chapter 8 Integration
492
Ji°°
>
1. Show that if O < p $ I,
21.
Show that
x-P sin x dx converges if p
then the convergence is conditional.
22.
The gamma function is defined to be the function given by the improper
integral r(p) = Ji°" e-:r#- 1 dx. Show that the integral is convergent for
p> 0.
23.
a.
If <p : [a, b] -. IR" is a continuous function, show that the set S =
graph <p = { (x, cp(x)) I x E [a,bl} c an+! has content and measure
zero.
b.
For <p : IR -. IR" continuous, show that graph <p has measure zero.
c.
Let f : [a, b] -. lR. be integrable. Show that the graph off has
volume zero by considering the difference of the upper and lower
sums for/.
d.
Show that the ellipse
24.
x2 + 3y2 = 9 in a 2
Give an example to show that the following is not equivalent to the integrability off. For any e > 0, there is a 6 > 0 such that if P is any partition
into rectangles S1, ••• , SN with sides < 6, there exist x 1 E S1, ••• ,XN E SN
such that
It.
2S.
has volume zero.
Let f : [0, oo[ -.
Show that
f(x;)v(S;) -
/I
< e.
a be nonnegative, integrable, and uniformly continuous.
lim f(x) = 0.
X--+00
26.
c a", where A is bounded and has volume.
Let f : A c
to be unbounded. Suppose C; is a sequence
of compact sets with volume, C; c A with C; increasing to A, and assume
v(C;) -. v(A) (this is automatically true, as shall be seen in Exercise 15
at the end of Chapter 9). Show that f is integrable iff f is integrable on
each C;, lim;_oo fc.f exists, and
Consider a set A
an-. IR,/~ 0, but allow f
I
[ f = _lim [
}A
,. . . oo)C;
f.
27.
Prove that if f : A C IR." -. a is continuous, A is open with volume, and
0 for each B C A with volume, then f 0.
8/
28.
Let/: [O, I] -. a be integrable and be continuous atxo. Show that the map
x 1-• J;J(y) dx is differentiable with derivative /(Xo). Give an example of
f =
=
493
tfixercises for Chapter 8
a discontinuous integrable f for which this map is not differentiable. For
bounded integrable f, prove that this map is always continuous and, in
fact, is Lipschitz.
l9,
Show that the Cantor set CC [O, 1) has measure zero (see Exercise 38 at
the end of Chapter 3).
30..,
Prove the following analogues of the Weierstrass and Dirichlet. tests for
uniform convergence using the Cauchy criterion:
a.
Let/: [a,oo[ x [c,d]-+ R. and suppose there is a positive function
M(x) defined for x E [a, oo[, such that 1/(x, s)I :5 M(x) for all s E
[c,d], and such that l000 M(x)dx < oo. Then F(s) = l000 /(x,s)dx
converges uniformly in s. If f(x, s) is continuous in (x, s), prove that
F is continuous.
b.
Let f : [a, oo[ x [c, d] -+ R. be continuous, and suppose that
I f(x, s) dxl :5 M for a constant M for all r ~ a, s E [c, d]. Suppose
cp(x, s) is decreasing in x and cp(x, s) -+ 0 as x -+ oo uniformly in s.
Prove that F(s) = l000 cp(x,s'J/(x,s)dx converges uniformly.
l:
31.
Let/: [a,b]-+ R be differentiable and assume thatf' is integrable. Prove
that J:J'(x)dx =f(b) - f(a).
32.
Suppose that rp is a differentiable function on [a, b] and that/ is continuous
on the range of cp. Show that
]
ddX [1,p(.r>
,p(a) f(t)dt =J(cp(x))cp'(x).
33.
Define a functionf on the interval [0, 1) by puttingf(x) = I if xis rational
and /(x) = -1 if x is irrational. Show that I/I is integrable on [0, 1] but
that f is not.
34.
Define a function f on [0, I] by putting f(x) = I if x is irrational and
f(x) = 1/q if xis a rational number equal to p/q for integers p and q with
no common factor. Is f integrable on [O, I]? (See also Exercise 38.)
35.
Let An= [(n + 1) + (n + 2) + · · · + (n + n)]/n. Prove that limn_ 00 (1/n)An =
3 /2, using the Riemann integral.
36.
Prove that limn-oo(n!) 1/n /n = e- 1 by considering Riemann sums for
log xdx based on the partition 1/n < 2/n < ••• < I.
37.
1;
a.
Under what conditions is J: f(cp(t))cp'(t)dt =
1::;:/ f(x) dx?
494
Chapter 8 Integration
b.
38.
Evaluate
j
~ using x = cost.
(1 -x) 1 -x2
Let f : [O, 1] -+ JR be defined by
if x is irrational
f(x) = { ~/q ifx=p/q,
where p, q ~ 0 with no common factor. Show that f is integrable, and
compute J~ f. (See also Exercise 34.)
! !
1 + n 2 + · · · + 2:] .
39.
Prove that {og 2 = limn---+oo [ n
40.
Find an open subset of JR contained in JO, I [ that does not have volume,
as follows:
a.
Review the construction of the Cantor set (see Exercise 39, Chapter 3).
b.
Modify the Cantor set by letting C1c be obtained from Ct-I by removing the middle 1/2/r.th from each interval of Ct- I and letting
Co= [0, 1). Set C =
1 C1c.
n:
k
·
I
4.
c.
Show that v(C1c) = IL= 1(1 - 1/2') ~
d.
Let U be the complement of C. Compute the boundary of U, and,
using c, show that it cannot have measure zero.
This exercise also produces an example of a compact set C with empty
interior that does not have volume.
41.
I/ is Riemann integrable}. Set
Let R([a, b]) = {/ : [a, b] -+ JR
d(f, g) =
lb
lf(x) - g(x)I dx.
Is d a metric on the space R([a,b])?
42.
Find a subset A of [0, 1] such that A = cl(int A) and yet bd(A) does not
have measure zero.
43.
It is a fact (see J. Marsden and M. Hoffman, Basic Complex Analysis,
2d ed., W. H. Freeman, New York, 1987, p. 26) that
. 7f' • 27f'
SID - SID -
n
n
• (n - 1)7f'
• • • SID
n
n
= 2n-l •
Use this identity to evaluate the improper integral
J0
1r
log sin x dx.
j:;xercises for Chapter 8
495
44.
Discuss generalizations of Theorem 8.6.1 to R_n.
45.
Discuss generalizations of Theorem 8.6.1 from [0, 1) to [0, oo[.
46.
a.
Suppose U::: ]-1, 1( x ]-1, I[ C R. 2, / : U-+ R.. Assume that 8f/8x
and 8// 8y exist at each point of U and are bounded on U, where
(x,y) are the standard coordinates for R. 2• Show that/ is continuous
at (0,0).
b.
Show by example that boundedness of the partial derivatives is necessary in part a; mere existence is not enough.
47.
For every a > 0, compare
hence determine
J:
x 0 dx with
and
:E~1 n° dx and
I: -n- .
N
.
l1m
N-+oo
:E;:1 n°
n=I
o
Nl+o
48.
For any function/(x) continuous over the reals, define the sequence/n(X):::
n J;+J/n f(Od{ for n = 1, 2, 3, .... Show that dfn(x)/dx exists even if
df(x)/dx does not, that/(x) = limn-+oofn(X), and that convergence to the
limit is uniform when f is uniformly continuous.
49.
Suppose {h} is a collection of open intervals whose union covers a compact interval C on the real axis; show that some positive e: exists such
that every subinterval of C no wider than s lies entirely in at least one of
the /k's.
·
50.
State whatever lemmas, theorems, and so forth are needed to justify each
of the following assertions:
I:: 2-k sin(k/n) = 0.
a.
limn-+oo
b.
If f (x) is a power series converging in )-1, l [ , then the same is true
for J'(x).
c.
Letf(x) = tan(1rx/2) and set On =fn>(0)/n!. Then :E:Oan is not a
convergent series. (Do not attempt to compute an.)
d.
1
If fn(x) is differentiable on [a,bJ with l{;(x)I < IO for all n and if
0 at each x, thenf,,(x)-+ 0 uniformly.
x E [a,b] andfn(x)-+
e.
f<x) = "wL-=I
f.
lex-l - x - x2 /2 - · · · - x99 /99!j ~ exx100 /100! for x > 0.
g.
f (x) :::
h.
.
ea.r-1
a
hmx-+O - .-b- = -b.
sm x
00
cos(2"x) h
.
d . .
as a continuous envat1ve.
3"
:E: xn /
1
n 3 is continuous in the closed interval [- 1, 1].
496
51.
Chapter 8 Integration
< oo,
then J~<E:Oancosnx)dx=E:O(an/n)sinnt.
i.
IfE:Olanl
j.
For some integer n, n
> 10100(logn) 1000•
Is the following true or false? A functibQ____de~ned on (0, oo[ but not
infinitely often differentiable cannot be expresS'OQ__as the sum of a Dirichlet
'··
series
00
f (t) =
L ane-nl.
=<>
52.
Let Q be the set of points (x,y) E IR. 2 that can be expressed in the form
(sin0 +-sintJ.,,cosO - cos'efJ) for some (0,tJ.,). Find an interior point of Q
(that is, a poi!}t that, together with a disk around it, belongs to 0).
53.
Show that the series
1
00
L 2" - k sin(kx)
k=I
is uniformly convergent on IR..
54.
Prove the "dominated convergence theorem for series": If
i.
for each k, a1c.,. -+
ii.
for each
iii.
E: ak.,. are convergent), then
L ak,n E ak as n
Ok
as n -+ oo, at and a1c.,. E !Rm,
n and k, lla1c.,.II :5 bk, lla1cll :5 bk, some b1: E IR., and
E:'1 b1c is convergent (so that, by the comparison test, E:'1 a1c and
1
00
00
-+
k=I
-+
oo.
k=I
Give an example to show that condition iii is necessary, even if it is
assumed that Ek ak.,. and Ek a1c are convergent.
55.
Find a differentiable function J(x) such that /'(x) is bounded but is not
integrable on [0, l].
Chapter 9
Fubini's Theorem and the
Change of Variables Formula
§9.1 Introduction
'J1lere are two fundamental integration theorems that help us evaluate multiple
integrals. The first concerns the evaluation of multiple integrals by means of
iterated integration. Using this method, we can calculate the value of a multiple
integral by performing successive single integrations.
9.1.1 Example letting A be the square defined by O $ x $ 1 and O $ y $ 1,
evaluate the integral of (x + y)x over A.
Solution
1
(x+y)xdxdy=1
A
1
r-=O
1'
(1'
[.r+yx]dy) dx
y=O
= o (J.1 + !x)
dx =! + ! = 2..
2
3 4
12
•
If A is not a square but, say, a triangle, then we extend the function to a square
by letting it be zero outside A. Then, in the integration process shown in the
example, the y integration becomes cut off at some point that depends on x, as
indicated in Figure 9.1-1.
The intuition behind the method of Example 9.1.1 is this: For a given function f(x,y), 0 $ x $ l, 0 $ y $ I, the number Jdf(x,y)dy is the area under
the graph off on the line x = constant. Integrating this area over x gives the
497
498
Chapter 9 Fubini's Theorem and the Change of Variables Formula
y
y
ft(x, y)dy
0
/
V
--+--------......L----1-dx
X
FIGURE 9.1-1
dx
Iterated integrals
volume under the graph. This suggests that JAJ(x,y)dxdy = fo'<Jd f(x,y)dy)dx.
The precise result will be given in Fubini's theorem, 9.2.1.
•
The second basic theorem of the chapter is called the change of variables
formula. This is used in conjunction with the method of Example 9.1.1 to
evaluate cenain types of integrals. The following example is typical.
9.1.2 Example Evaluate JA (x1 + y2) dx dy where A is the unit disk given by
A={(x,y)EIR. 2 1.1..2+y2~1}.
-
Solution Change variables to polar coordinates (r, 8) : x = r cos 8, y = r sin 6,
for r > 0 and O < 0 < 21r. We have the "rnle" dxdy = rdrd8. Since x2+y2 = ,2,
we get
1
(x2+y2)dxdy=
A
1
r 2 rdrd8 =
A
1112~
1-=0
IJ=O
rdBdr=
1'
r=O
21rrdr=
21T . •
The justification of the "rule" dxdy = rdrd0 is given by the change of
variables formula (Theorem 9.3.1). The exu·a factor r is just the Jacobian
8(x,y)/ 8(r, 0) = r. However, it is easy to heuristically "justify" the rule by
regarding dr and dO as infinitesimals: dr represents a radial infinitesimal while
rd0 represents an infinitesimal length of arc generated by a change in angle
of d6 along a circle of radius r. Thus rdrdO is the area element in a sector
bounded by r, r + dr and 0, 0 + dB (see Figure 9.1-2).
'
In one dimension, the change of vruiables fonnula is easy. It states that if f
is continuous on [a,b] and we have a mapping <p: [a,,B]-. [a,b] with cp(a) = a,
499
§9.1 Introduction
y
FIGURE 9.1-2 An "element of area" in polar coordinates
cp(/J)
= b and cp' exists and is continuous, then
, lb
f(x)dx
=
1/3
f(cp(u))cp'(u)du.
To prove this, first find an F such that F' = f, which is possible by the fundamental theorem, and obse1ve that
lb
f(x)dx = F(b) - F(a).
Now let G(u) = F(cp(u)). By the chain rnle,
G'(u) = F'(cp(u))cp'(u) =J(cp(u))cp'(u).
Hence, again by the fundamental theorem,
1/3
f(cp(u))cp'(u)du
= G(fJ)
- G(a)
= F(b) _-
F(a),
as required. (This theorem is also trne if f is not continuous but merely integrable, as we shall see in §9.3.) This technique is often called "integration by
substitution," and its power is well known to the student:
9.1.3 Example Integrate
J(l +x2) 10fdX.
Solution Let y = 1 + :,.2 and note that y' = 2x, and so
!
!
/ (1 + r)ioxdx = 2 Jyioy' dx = 2 jy10 dy
where C is a constant.
•
= _!_Yu + C = _!_(l + x2)11 + C
22
22
'
500
Chapter 9 Fubini's Theorem and tlie Change of Variables Fonnula
The generalization of this result to higher dimensions is contained in the
statement of the change of vruiables formula; while it is a good deal more subtle
to prove, it is easy to understand and along with Fubini's theorem is one of the
most powerful computational methods we have.
Exercises for §9.1
1.
Evaluate
J~ ~~ dx.
2.
Evaluate
1
3.
Evaluate IA(x + y2)dxdy where A== [O, l] x [O, l].
4.
Evaluate IA e- 12 -I dx dy where A is the unit disk in IR.2.
5.
Evaluate IA dxdy where A is the triangle in IR 2 bounded by the lines x == 0,
y :: 0, X + y-= l.
sinx
-,,- dx.
o cos-x
1
§9.2 Fubini's Theorem
We now state the first of our two basic theorems. We start by giving Fubini's
theorem for the case of the plane IR2•
9.2.1 Theorem
i.
Let A be the rectangle described by a :5 x :5 b, c :5 y :5 d, and let
f : A -+ IR be continuous. Then
The expression
lb (id
f(x,y)dy) dx
means that the function
g(x) =
is integratedfrom a to b.
1d
f(x,y)dydy
§9.2 Fubini's Theorem
ii.
501
In i, suppose f is integrable and the function fx : [c, d] -+ JR defined by
fx(y) =f(x,y) is inregrablefor each.fixed x E [a,b]. Then
One can similarly assume that
exists for each y and obtain
As usual, we can apply this to a nonsquare region A by extending f to be
zero outside A and applying this theorem to a containing rectangle. Note that as
we perform this extension,/ may develop discontinuities on the boundary of A.
Dropping the assumption of existence of the iterated integrals and replacing
them by existence of the double integral off is not possible in the present context.
To obtain such a result, the student will have to wait for more advanced courses
in measure theory. In practice, however, Theorem 9.2.1 is often adequate. As
mentioned in §9.1, the result is, intuitively, entirely reasonable.
The following corollary is a typical application of the theorem. This corollary
can often be used effectively by breaking up a complicated region into smaller
regions to each of which the corollary applies.
9.2.2 Corollary
Let cp, t/; : [a, b]
t/;(x) for all x E [a, b], let
A= {(x,y)
and let f: A
-+
-+
JR be continuous maps such that cp(x) ~
I a:::; x ~ b,cp(x):::; y:::; 1/,(x)},
JR be continuous. Then (see Figure 9.2-1)
1I= 1 (11/)(x) f(x,y)dy) dx.
1>
A
a
'f'(X)
There is an analogous theorem with the roles of x and y interchanged. The
corollary is a consequence of the theorem if we remember that f is extended to
be zero outside A. These results are extended to multiple integrals as follows.
502
Chapter 9 Fubini's Theorem and the Change of Variables Formula
y
J
,t,(r)
f(x, y)dy
~(r)
ef>(x)
-+----'-------'----- X
FIGURE 9.2-1
Fubini's theorem
9.2.3 Theorem
i.
Let A C !Rn and B C IR"' be rectangles and let f : A x B C !Rn x IR.111 -+ JR
be continuous. Define.for each x EA, fx : BC IR"' -+ 1R by fx(y) = f(x,y).
Then
ii.
If f is integrable and fx is integrable for each fixed x, then again
1x/ 1
=
(1f(x,y)dy) dx.
Similarly, if JAf(x,y) dx exists for each y, then
In practice, this theorem may be used repeatedly to reduce a problem to
iterated one-dimensional integrals.
9.2.4 Example Evaluate
1
(x + y + 7.)2 dx dy dz
where A is the three-dimensional volume sketched in Figure 9.2-2.
§9.2 Fubini's Theorem
503
z
l
co.o.•i
A
(0,),0)
/(1,0,0)
X
A tetrahedral region in IR 3
FIGURE 9.2-2
Solution Here A is the set {(x,y, z) E IR 3 Ix~ 0,y ~ 0, z ~ 0, and x+y+z $
1}, and so it consists of those points (x, y, z) for which x ~ 0, y ~ 0, x + y $ I,
and O :5 z :5 1- (x+y). Let B = {(x,y) Ix~ 0,y ~ 0, and x+y :5 I}. Then, by
Theorem 9.2.3, and remembering that/ is zero outside A,
r (lor,-,.
1
+.Y)
}x+y+zf-dxdydz= ln
)
(x+y+ddz dxdy.
Similarly, B consists of those points (x,y) for which x E [0, I] and y E [O, 1-x],
and so
1
[1
,,_<x+y+z)2dxdydz= lo
r1-x r1-(x+y)
lo
lo
(x+y+ddzdydx.
Now use the fundamental theorem to evaluate these integrals. First note that
:z D<x+y+z>3] =(x+y+d,
504
Chapter 9 Fubi11i's Theorem a11d the Cha11ge of Variables Formula
and so pe1forming the z integration yields
1
(x + y
=
=
+ d dx dy dz
1111-x
1111-x (1
cx+y+
0
1;
(x+y))3
(x+y+0)3) d dx
3
y
(x+y)3)
- - - - dvdx.
3
3
.
0
Again, using the fundamental theorem, the y integration yields
For improper integrals it is generally sufficient to apply the theorem first on
a bounded region and then take limits. See Example 1 at the end of this chapter.
9.2.5 Example In the following integral, change the order of integration
and evaluate:
111'
xydydx.
Solution The region in question is shown in Figure 9.2-3 (see Corollary 9.2.2).
In the original order, we do the y integral first. For each x between O and 1,
y runs from x up to 1, as in pru1 (a) of the figure. In the reverse order for each
y between O and 1, we let x run from Oto y, as in part (b) of the figure. We
obt\in
1l1y
o
11 2
o xydxdy = o
y3
s·]
dy =
•
sos
§9.3 Change of Variables Theorem
y
y
y
X
FIGURE 9.2-3
A triangular region of integration in JR 2
Exercises for §9.2
1.
Show that the volume of region A in Example 9.2.4 is 1/6.
2.
Draw the region con-esponding to the integral
uate.
3.
Interchange the order of integration in Exercise 2 and check that the answer
is unaltered.
4.
Let A be the region in JR3 bounded by the planes x = 0, y = 0, z = 2 and
the surface z = x2 + y 2• Show that
1
xdu(vdz = 8 (
J; J.,-X (x+y)dydx and eval-
Y;).
§9.3 Change of Variables Theorem
Next we tum to the change of variables fonnula for multiple integrals.
9.3.1 Change of Variables Theorem let A c IRn be an open bounded
set with volume and let g : A -+ IRn be a C1 mapping that is one-to-one. Assume
Chapter 9 Fubini's Theorem and the Change of Variables Formula
506
that Jg(x) "F O for all x EA and that IJg(x)I and 1/IJg(x)I are bounded on A.
Let B = g(A) and assume that B has volume (see §8.2). /ff: B-+ JR. is bounded
and integrable, then (/ o g)IJ(g)I is integrable on A and
fsf =
1
<f o g)IJgl,
that is,
1
8
f(Y1, ... ,y,.)dyi · · · dyn =
1
A
8(g1, ... ,gn)
) dxi · · · dxn.
X1, ... ,Xn
-
f(g(X1, ... ,Xn)) {)(
The proof of this theorem requires some subtle manipulations, but we can
give a fairly simple plausibility argument as follows:
To begin, suppose we isolate a small rectangle S in .I:\. Then g is approximately affine near S, and so g(S) is approximately a parallelepiped, as in Figure 9.3-1. If g were affine, the volume of g(S) would be jdet gjv(S), where g is
the linear part of g. However, g(xo) + Dg(.xo) approximates g well near .xo and
is affine, and so v(g(S)) ~ IJglv(S). Thus,
f(g(x))IJg(x)jdx
~ f(y)dy,
where y = g(x), and so "adding" these infinitesimal quantities gives the result.
B
g
~
I)
....
g(S)
FIGURE 9.3-1
Change of coordinates
9.3.2 Example What is the volume of the parallelepiped spanned _by the
vectors (1, 1, 1), (2, 3, 1), (0, I, 1) in IR.3 ?
§9.3 Change of Variables Theorem
507
Solution These vectors are the images of the standard basis under the linear
mapping with matrix
By the change of variables theorem, the volume of the image set B is ldet gl
times 1, the volume of A (the unit cube in IR 3). This detenninant is calculated
to be 2, and so the volume required is 2.
•
t,( (,'1,;J \.
9.3.3 Example Evaluate J~
= i1- - v2, V : 2xv.
U
•
L1.
J;(x4 - y4)dxdy using the change of variables
.
11°ti'/
-2>.IJ
..
x~1-1..c ;
'--
c.-..__
Solution Take
B = JO, I [ x JO, I [ and let g be the map taking (u, v) to the
corresponding (x,y), so that g- 1(x,y) = (.\.2-y2, 2xy). The region A corresponding
to B is shown in Figure 9.3-2. We ·can check that g: A ---+Bis one-to-one and
1
Jg(u, v) = - - = ---,
Jg- 1(x,y) 4(x'.! + y 2 )
since
-2y ) = 4(r + v2)
2x
· '
cJ(u, v) = ( 2x
o(x,y)
2y
which is nonzero. Thus, by the change of variables theorem, our integral is
1 (r
u·
A
d d
+y2) 4( : \)
X
+y
=-4I
1
ududv
A
=1211-c,,214>
ududv= 0.
O (,.2/4)-1
(Strictly speaking, one should first apply the change of vruiables theorem only
to the integral
fc1<x4 -y4)dxdy and then let~---+ 0 to make Jg(u, v) bounded;
a refinement shows that this assumption is actually not necessary.) The given
integral can also be evaluated directly as an iterated integral:
J.:
508
Chapter 9 Fubini's Theorem and the Change of Variables Formula
V
y
FIGURE 9.3-2 A change of variables in IR 2
Exercises for §9.3
1.
Show that the change of vruiables theorem covers the one-variable theorem
discussed in §9.1 as a special case.
2.
What is the volume of the parallelepiped spanned by ( 1, 1, 1, 1), (0, 1, 1, 0),
(2, 0, 3, 0), (1, 1, 0, I) in IR4 ?
3.
Transform
y =u -v.
4.
Show that the volume of the parallelepiped spanned by vectors v 1 , • •• , Vn
in !Rn is given by ldet Aijl 112 where Aii = (v;, Vj}-
5.
Show that the area of the ellipse
a2 + y2 / b2 S I is rrab by perfonning
a change of vru"iables and reducing the problem to one of finding the area
of a circle.
Jd f11(.r + y2)dxdy using the change of variables x = u + v,
r/
§9.4 Polar Coordinates
An application of the change of variables formula is to the evaluation of integrals
using polar coordinates. The function that changes from polar coordinates to
rectangular coordinates is g(r, 8) = (r cos 8, r sin 8). The Jacobian detenninant is
Jg(r,8)
=I cos
.8
sm 8
- r sin 8
rcos 8
I=
?
•
?
rcos- 8 +rsm-0
=r.
§9.4 Polar Coordinates
509
If g is defined on the set { (r, 0) I r > 0, 0 < 0 < 21r}, then Jg(r, 0) is never zero
and g is one-to-one on this set. We leave to the student the verification that g
is one-to-one. Although the image of this function excludes the set of points
on the x axis with x ~ 0, this is a set of measure zero and therefore does not
contribute to the value of an integral. (See Figure 9.4-1.)
y
8
2ir
r-------I
I
I
L
-
r
FIGURE 9.4-1
Polar coordinates
9.4.1 Example Find the mass of a thin plate in the shape of an annulus
with inner radius 1, outer radius 2, and mass density 1/r 3 at points a distance
r from the center. See Figure 9.4-2.
y
8
~---,
2
FIGURE 9.4-2
Polar coordinate description of an annulus
510
Chapter 9 Fubini's Theorem and the Change of Variables Formula
Solution If we let B denote the annulus, then B is the image (except for
points on the positive x axis) under the polar coordinate map g of A = { (r, 0) I
0 < () < 271", 1 < r < 2}. Hence the required mass is given by
fn<r+y 2)- 312 dxdy= 1o/?)jlgldrd0= 11/r 2 d~d(J
=
fo
2
2
1r /
1/r 2 drd0=
fo
2
1r
(-~+1) d0=7r.
•
Exercises for §9.4
Evaluate the integrals in Exercises l and 2 using polru: coordinates.
1.
fvexp(x2 +y2)dxdy where D = {(x,y) I x2
2.
fvln(x2 +y2)dxdy where D = {(x,y) Ix~ O,~
and where a and b are positive real constants.
3.
Find the area of a circle of radius r using polar coordinates.
+y2 < l}.
~
0,a2
~ x2 +y2 ~
b 2 },
§9.5 Spherical and Cylindrical Coordinates
The same techniques that were applied to polar coordinates can also be applied
to spherical coordinates. Here, let g(r, <p, 0) = (rsin <p cos(), rsin () sin rp, rcos cp)
and consider g to be defined on {(r, cp, 0) I r > 0, 0 < () < 211", 0 < rp < 7l'}. See
Figure 9.5-1. The image under g of this set is all of JR 3 except for the part of
the xz plane where x 2:: 0. From Exercise 5, Chapter 8, this is a set of measure
zero, and so it can be neglected in integrals. The Jacobian determinant is given
by
lg(r, <p, 0) =
sin cp cos()
sin cp sin()
cos cp
rcos cp cos 0 -rsincpsin0
rcos <p sin()
r sin cp cos ()
-rsin cp
= r 2 sin rp.
0
Hence Jg(r, <p, ()) -:j:. 0 in the specified region; g is also one-to-one on the region.
Therefore the change of variables formula gives
f
jg(A)
f(x,y, z)dxdydz = 1t(rsin cp cos 8, rsin rp sin 0, rcos cp)r 2 sin rpdrd<p d0.
A
§9.5 Spherical and Cylindrical Coordinates
511
z
(x. y, z)
y
X
FIGURE 9.5-1
Spherical coordinates in JR. 3
9.5.1 Example Integrate the function f(x,y, z) = x2 + y2 + z2 over the set
I x2 +y2 +z2 < 1}.
B = {(x,y,z)
B is the image of A = { (r, <p, 0) I O < r < I, 0 < () <
271", 0 < <p < 7r} under g (except for the points of B on the xz plane where
x ~ 0). Hence
Solution The set
i
1 1Jgl
(x2 +y2 +z2)dxdydz =
(since sin cp
1r
drdipdO
1
=
r 2 • r 2 sinrpdrdcpdO
> 0 in the relevant region). Thus our integral is
fo 11r fo
2
r2
1
r 4 sin cp drdrp d()
=fo 2 11r si~ rp dcp d() =~ fo 2
1r
1r
2 d0
=;11".
•
Cylindrical coordinates are treated much the same way as polar and spherical
coordinates. The appropriate mapping is g(r, (), z) = (r cos 0, r sin 0, z) on the set
{(r,0,z) Ir> 0,0 < 0 < 27r}. See Figure 9.5-2. The Jacobian is Jg(r,O,z) = r,
and so the change of variables theorem becomes
f
jg(A)
f(x,y,z)dxdydz= ff(rcos0,rsin0,z)rdrd0dz.
}A
This is useful for triple integrals that have "cylindrical symmetry," as opposed
to "spherical symmetry" problems solved using spherical coordinates.
•
512
Chapter 9 Fubini's Theorem and the Cha,rge of Variables Formula
z
X
FIGURE 9.5-2
Cylindrical coordinates in JR. 3
9.5.2 Example Evaluate JRze-,:2-l dxdydz over the
x2 + y2 S 1, 0 ~ z ~ 1}.
region R = {(x,y,z)
I
Solution Here we get
/1 (2"
l z=0 Jo=0
1
1
ze-r? rdrdO dz=
r=O
i(l - e-
1 ).
•
Exercises for §9.5
1.
Evaluate fv e(x?+y2+z2>312 dx dy dz, where D is the unit ball in JR. 3•
2.
Let D be the unit ball in JR. 3 . Evaluate Jv<dx dy dz)/ ✓ 1 + x 2 + y2 + z2 •
3.
Evaluate fv z ✓x2 + y 2 dxdydz where D = {(x,y,z)
11 S x2 + y2 ~ 2, I ~
z S 2}.
4.
Perform a change of variables to evaluate
where D = {(x,y) IO~ y S x,O S x ~ I}.
5.
Work out
1
(l-y2) 1/?
-1
-(1-yl)l/l
111
1
0
by using cylindrical coordinates.
JD exp ((x + y)/(x -
z(x2 +y2 )dxdydz
y)) dxdy
§9.6 A Note on the Lebesgue Integral
513
§9.6 A Note on the Lebesgue Integral
Several times in this and the previous chapter, we have hinted of the existence
of another theory of integration. There is such a theory, called the Lebesgue
theory. We shall now discuss the need for this theory and its basic underlying
differences from the Riemann theory.
The need for the Lebesgue integral is largely a technical one: Some functions
might not be Riemann integrable, but we might need to integrate them anyway.
For example, such a function can occur as a limit of Riemann integrable functions
(a uniform limit of Riemann integrable functions is Riemann integrable, but a
pointwise limit need not be). It is desirable, however, to work with "complete"
spaces, which contain limits of Cauchy sequences-for example, IR.n in Chapter 2
and C(A, JR) in Chapter 5. In the next chapter, on Fourier series, we shall see a
useful space of functions with the property that convergence in this space may
be pointwise but not necessarily uniform. The Riemann theory is not sufficient
to integrate such limit functions.
Henri Lebesgue's problem was to find a more general theory of integration
than the Riemann theory that, moreover, was more useful technically. Actually,
this is a simplification; the factual history is more complicated. For example, his
work first received prominence when he used his ideas to give a characterization
of Riemann integrable functions (see Theorem 8.3.1, Chapter 8). For historical details see, for example, M. Kline, Mathematical Thought from Ancient to
Modern Times (Oxford University Press, New York, 1972).
Here we give a brief glimpse of the ideas. To develop them fully requires
a course in itself (see, for example, H. L. Royden, Real Analysis, 3rd ed.,
MacMillen, New York, 1988).
The basis of Lebesgue's theory is as follows. For the Riemann integral, the
area under the graph of a function was divided into vertical rectangles in order
to define the integral. But why could it not be divided into horizontal rectangles
just as well? See Figure 9.6-1.
Intuitively, this type of subdivision, where we take a partition on the y axis
rather than the x axis, ought to yield the same area. But in fact, there is a
technical difference, and Lebesgue's idea (horizontal rectangles) is what we
want. This also leads to more complicated mathematics, and let us now see
why.
Let f : [a, b] -. JR and suppose f is bounded and ? 0. Let O = Yo < Y1 <
· · · < Yn be a partition of an interval [0, Y] with Y > sup{f(x) I x E [a, b]}
containing the range off. Our candidate for an approximating sum is (look at
Figure 9.6-1 to see this):
L)Y;+1 -y;) · ("length" of the set {x lf(x)? y;}).
514
Chapter 9 Fubi,zi's Theorem and the Cha,zge of Variables Formuk
y
y
(a)
FIGURE 9.6-1
(b)
(a) Riemann approximations; (b) Lebesgue approximations
The technical point is that {x I f(x) 2: y;} = /; might be a complicated set;
after all, the Riemann theory can already handle "decent" functions, and so we
have to be prepared to handle fairly complex ones. Thus our problem is that if
/; is a complicated set, how do we compute its length? This is where a main
part of the Lebesgue theory comes in. One first has to develop the idea of the
measure (or length) of a set. This turns out to be a bit complicated, but once this
measure theory is developed, one can then proceed with the study of Lebesgue
approximating sums as just described. Note that we already saw one aspect of
"measure" in Chapter 8 when we studied sets of measure zero. This concept is
taken directly from the Lebesgue theory.
The conclusion is, therefore, that with considerably more effo1t, a more
comprehensive theory of integration (which includes the Riemann theory as a
special case) is possible. In more advanced areas of mathematics, the technical
rewards are more than wo1th the effo11 involved.
§9.7 Interchange of Limiting Operations
In the next chapter we begin the study of a large area of analysis that is both the
inspiration for and one of the main applications of much of the analysis invented
in the last two centmies, including much of the contents of this book. This area
involves the expansion of a given function in te1ms of sines and cosines, either
as an integral or as a sum, and is known as Fourier analysis. Before doing
that, this might be a good place to bring together for comparison several of th e
results which we have seen in the text so far.
Analysis is concerned largely with infinite processes. _Jhe quantities studied
are often the values of a limit of some kind. This applies to derivatives, integrals,
§9.7 Interchange of limiting Operations
515
the sums of infinite series, the limits of sequences, and so on. Many of the
theorems that we have encountered can be viewed as saying that under specified
conditions, certain limiting processes may be interchanged or performed in either
order.
9.7.1 Example A uniform limit of continuous functions is continuous.
One of the first results that we had in Chapter 5 was that if f1,fi,f3 , •••
are continuous functions that converge unifonnly to a function f, then f is also
continuous. The asse11ion of continuity at a point a is that limx-af(x) = f(a),
so the result of this theorem can be expressed as
f lim(fn(x)) .
lim ( lim (f,,(x)) = lim
x---.a
n-+oo \.:-a
n-oo
The expression on the left is limx-af(x) while that on the tight is lim11 _
00
(f0 (a)) =
f(a).
9.7.2 Example Under appropriate hypotheses, the integral of a limit is the
limit of the integrals.
This sort of conclusion also asserts the validity of the interchange of two
limiting operations. In this case we are interested in conclusions of the form
l b(
lim f,,(x)) dx = lim
n-oo
11-oc,
0
(lb
a
fn(x) dx) .
In Chapter 5 we saw examples showing that this interchange cannot always be
done. We also had one rather imp011ant case in which it could.
If /i.h,/3, ... is a sequence of integrable functions on an interval [a,b]
converging uniformly on [a, b] to f, then f is integrable and
[
b
a
J(x)dx = !!_m
11
oo
(1b
a
9.7.3 Example Term-by-term operations.
f,,(x)dx) .
516
Chapter 9 Fubini's Theorem and the Change of Variables Formula
Theorems asserting the validity of term-by-te1m differentiation or integration
of series under various conditions are also of this type. Their conclusions can
be expressed as
or as
d[oo
lL
00
dx Lfn(X) =
n=l
d fn(x).
dx
n=l
Either of these conclusions requires some conditions for its validity. We encountered theorems of this type in Chapter 5. They were particularly effective
in the case of power series.
Differentiation and Integration
There are a number of results involving integrals and derivatives. Some of
these we have met; most recently Fubini's theorem, which asserts that iterated
integrals may be evaluated in either order under appropriate conditions: A pair
of integrations may be interchanged. Earlier in Chapter 6 we encountered the
result that with fairly mild assumptions we can conclude that mixed partial
derivatives may be taken in either order: A pair of differentiation operations
may be interchanged. It is rather interesting to see how some of these results can
be related to each other, perhaps with slightly more restrictive assumptions. The
fundamental theorem of calculus itself asserts the interchange of an integration
operation and a differentiation.
Theorem If f andf' are continuous on [a,b] and a< x < b then
d
dx
lx
a
f(t) dt = J(x) =/(a}+
lxJ'
a
(t) dt
This result lies at the heart of everything else that is going on, but we begin with
a somewhat different version of interchanging the operations of differentiation
and integration.
9.7.4 Example Differentiation of an integral depending on a parameter.
(Differentiation under the integral sign.) Under reasonable conditions,
d
dy
(lb
a
f(x,y)dx
) lb
=
a
8/ (x,y)dx
ay
§9.7 Interchange of Limiting Operations
517
Here is a precise version:
Proposition Suppose/: [a,b] x [c,d]-. JR is continuous and of/oy exists
for c
< y < d and extends to be continuous on
[a,b] x [c,d]. Let
F(y) = lb /(x,y)dx.
Then Fis differentiable and
F'(y) =
l
b
a
f}J
fJ(x,y)dx.
Y
Proof We need to show that the limit of the difference quotient exists and is
the desired integral. Thus, consider
I
F<y+h)-F<y)_jbof< >d
!=l
x.y X =
II
a uy
By the mean value theorem,
f(x,y + h) - f(x,y)
of
h
= a/x, Cx,1,)
for some
I
Cx,1,
between y and y + h, so
I
F(y+h)- F(y)
jb of
lib
h
' - a o/x,y)dx = "
I
(8/
8/
) dx .
o/X,Cx,1r)- a/x,y)
Since 8J/8y extends to be continuous on the compact set [a,b] x [c,d], it is
uniformly continuous. Given e > 0, we can choose 6 > 0 such that
8/
18y (xo,Yo)
8/
- oy (x,y)
I<
e
b_
0
Chapter 9 Fubini's Theorem and the Change of Variables Formula
518
whenever
Ix - xol < 6 and IY - Yol < 6.
I
F(y+h)- F(y)
h
-
lb
a
fJf
lhl < 6, then
If
[
a/x,y)dx :::;
lb 18!
of
I
a/X,Cx,1i)- a/x,y) dx
a
€
< -b-(b-a)=c.
-a
This establishes the limit we wanted.
Cl
There are similar results for improper integrals, but they require some form
of uniformity. The situation is analogous to that for sums. This may not be
surprising if we consider the integral as a limit of sums. The de1ivative of the
sum of a finite number of te1ms is equal to the sum of the derivatives. The
same conclusion for infinite series requires some assumptions. The theorem we
obtained for series involved the uniform convergence of the seiies of derivatives.
Here we use a similar assumption on the integral of the pru1ial detivative.
9.7.5 Proposition Suppose f: [a,oo) x [c,d] -+ IR is continuous and, as
in the preceding proposition, fJJ/fJy exists and is continuous on [a,oo) x [c,d].
Suppose further that the integrals
1
00
18
00
f(x,y)dx
and
of (x,y)dx
a
Y
converge, the second uniformly for y E [c, d). Let
1
f(x,y)dx.
1
fJf
a/x,y)dx.
F(y) =
00
Then Fis differentiable and
00
F'(y)
=
0
The assumption of unifmm convergence of the second integral to some G(y)
means that for each c > 0, there is a B such that
IG(y)-
lb
%<x,y)dxl
<E
for every y E [c,d] wheneverb
> B.
The proof is analogous to that for the coITesponding result about the term-byterm differentiation of selies.
§9.7 /nterclzange of limiting Operations
519
These two results justify differentiation under the integral sign in many cases.
This idea is of considerable practical value in computations-some examples are
in the exercises. In Worked Example 9.2 at the end of this chapter, we embed this
result in a general formula for delivatives of integrals. Here we are interested in
how it can be related to our other interchange theorems such as that of Fubini. .
9.7.6 Example Fubini's Theorem: With appropriate conditions, iterated
integrals may be evaluated in either order:
It is interesting to see how a proof for continuous f can be based on the
process of differentiation under the integral sign just obtained.
The function h(t,y) = J~f(x,y)dx is continuous on [a,b] x [c,d] as is its
partial derivative ah/at =f(t,y). The integral
g(x) =
id
f(x,y)dy
is a continuous function of x. Now we need only consider the function
Clearly, H(a) = 0 and we want to show that H(b) = 0. But
H(t) =
so
H'(t)
=g(t) -
1'
g(x)dx -
id~: id
dy =
id
h(t,y)dy
f(t,y)dy -
id
f(t,y)dy
=0
for all t. Thus, H must be constant on [a,b], so H(b) = 0 as desired.
Finally, we will see how the equality of mixed prutial derivatives can be
obtained from Fubini's theorem.
9.7.7 Example Equality of mixed partial derivatives.
a2g
ti2g
I f ~ and n n both exist and are continuous, then thev are equal.
uxuy
uyux
·
Chapter 9 Fubini's Theorem and the Change of Variables Formula
520
We want to show that
a2g
a2g
---=0.
oxoy
8yox
Since it is a continuous function, it suffices to show that the integral over
every small rectangle [a, b] x [c, d] is 0. If R is such a rectangle, then
JJR
r{ [
=
=
0 28
02 8 ]
,
ox8y - 8y8x dxdy
1d [1b :x (:i) dx] dy-1b [1d :y (::) dyl dx
i
d
C
[{)g
8g
]
a/b,y)- a/a,y) dy-
lb
a
[{)g
f)g
]
ax<x,d)- ax<x,c) dx
= (g((b, d) - g(b, c) - g(a, d) + g(a, c)) - (g(b, d) - g(a, d) - g(b, c) + g(a, c))
=0
as desired.
•
Exercises for §9.7
1.
2.
> -1.
Jd x' log(x) dx exists.
a.
Let t
b.
Evaluate the integral in a by considering
c.
Let n be a positive integer and t
Show that
Show that
1
2 ,r
O
• (
xsm tx
)
dx
=
> -1.
Evaluate
J01 x' (log(x))R dx.
sin(21rt) - 21rtcos(21rt)
t
2
by considering the derivative of J;" cos(tx)dx with respect tot.
:t fo
2
3.
Evaluate
4.
Carry out the proof of Proposition 9.7.S.
s.
Evaluate
"
x cos(tx) dx.
Jo"° x'e-ax dx
by repeatedly differentiating
a> 0.) Justify the application of Proposition 9.7.5.
Jo"° e-ax dx.
(Here
Theorem Proofs for Chapter 9
521
Theorem Proofs for Chapter 9
9.2.1 Theorem
i.
Let A be the rectangle described by a $ x $ b, c
f : A --+ JR be continuous.
The expression
~
y $ d, and let
Then
lb (1d
f(x,y)dy) dx
means that the function
g(x) =
1d
f(x,y)dy
is int~gratedfrom a to b.
ii.
In i, suppose f is integrable and the function fx : [c,d] = f(x,y) is integrable for each fixed x E [a, b]. Then
lR defined by
fx(y)
One can similarly assume that
lb
/(x,y)dx
exists for each y and obtain
Proof Since i is a special case of ii, we need only prove ii. Let g : [a, b] c
JR --+ JR be the function defined by
g(x) =
id
J(x,y)dy.
522
Chapter 9 Fubini's Theorem and the Change of Variables Formula
f
J:
We must show that g is integrable over [a, b] and that 1.f =
g(x) dx.
Suppose a xo < x1 < · · · < x,. b and c = Yo < · · · < y,. d are partitions
of the intervals [a, b] and [c, d]. Let P[a,bJ be the pru1ition of [a, b] given by the
sets V; = [X;-1,X;]. Pie.di the partition of [c,d] given by the sets Wj = [Yj-1tYj],
and PA the pru1ition of A given by the sets
=
=
=
Then
L(J, PA)=
L
ms/f)v(Sij) =LL ms/f)v(Wj)v(V;),
i
ij
j
where ms(/) is the minimum (inf) off on the set S. For x E V;, we have
msii(f) 5 mw//..-), where f.- is defined by fx(y) = J(x,y). Hence,
~ ms/f)v(Wj) 5 ~ mw/fx)v(Wj) 5
J
id
fx(y)dy = g(x).
J
Since this inequality holds for any x E V;, we get
Therefore,
L(f, PA)
5
I;i ms/f)v(Wj) 5
L mv;(g)v(V;) 5 L(g,
mv;(g).
P[a,b]),
i
From this and from a similar argument for upper bounds, we obtain the inequalities
Since f is integrable over A, these inequalities show that g is integrable and
Using the same argument, if we assume that
we get
J: f(x,y)dx exists for each y,
523
Tl,eorem Proofs for Chapter 9
9.2.2 Corollary Let ip, ·I/J : [a, b] -+ 'IR. be continuous maps such that ip(x) :s;
'lj;(x)for all x E [a,b], let A= {(x,y) I a :5 x :5 b,ip(x) :5 y :5 'lj;(x)}, and let
f: A -+ 'IR. be continuous. Then (see Figure 9.2-1)
1l
f =
b
(11/J(x) f(x,y)dy ) dx.
a
A
'P(X)
Proof Let S = [a,b] x [c,d] be a closed rectangle enclosing A, and extendf
to S by setting it equal to O on S\A. The sets graph(cp) = {(x,ip(x)) Ix E [a,b]}
and graph(tJ,) = { (x, ·r/;(x)) I x E [a, bl} are of measure zero (by Exercise 23,
Chapter 8). Thus, the set of discontinuities of/ defined on S is of measure zero
and so f is integrable over S. Also, for any x.fx is continuous on [c, d], except
possibly at <p(x). and 1/J(x), and so f,. is integrable for any x E [a,b]. Applying
Theorem 9.2.1, we get
r1 = Jrs1 = 1b 1df,(y) dy dx = lb (Jr<x) fx(y) dy) dx.
JA
a
a
C
•
'P(X)
The proof of Theorem 9.2.3 is analogous to the proof of 9.2.2, and so it is
left as an exercise.
We now tum to Theorem 9.3.1. The proof of this can be very laborious
if not dealt with effectively. In particular, the idea given in the text is hard to
make precise if we interpret it too literally. The proof we give here is due to J.
Schwartz in the American Mathematical Monthly 61 (1954), pp. 81-85.
9.3.1 Change of Variables Theorem Let Ac 'IR." be an open bounded
set with volume and let g : A -+ 'IR." be a C1 mapping that is one-to-one. Assume
that Jg(x) 'f Ofor all x EA and that IJg(x)I and 1/IJg(x)I are bounded on A. Let
B = g(A) and assume that B has volume (see §8.2). If f :rB-+ R is bounded
and integrable, then (/ o g)IJ(g)I is integrable on A and
fn1 =
1
(f O g)llgl,
that is,
1
f(y1, • •. ,Yn)dYt · · · dyn =
B
1
8(g1,. • • ,gn)
f(g(x1, ... ,Xn)) a(
) dx1 · · · dxn.
A
Xt,• • • ,Xn
524
Chapter 9 Fubini's Theorem and the Change of Variables Formula
Note. An analysis of the proof shows that/, !Jg!, and !Jgl- 1 need not be
assumed bounded (see §8.7, for improper integrals, and the remark following
Worked Example 9.4).
The first stage of the proof consists of establishing our formula when g = L is
a linear map, in which case JL = det L. This yields the geometlic interpretation of
det L: It is the factor by which volumes are changed under the transformation L.
Since we do not want to assume this from linear algebra, we will go through
the proof in some detail. We do need, however, to recall two points from linear
algebra: (i) det TS= det T · det S, and (ii) any matdx is a product of elementary
matrices.
Lemma 1 If L : Rn -. IR." is a linear map and A C IR." is a set that has volume
(that is, JA IA exists), then L(A) has volume and the volume is I det LI• v(A), i.e:,
JL(A) lA = JA Idet LI. (See Figure 9.P·l.)
A
}----/
FIGURE 9.P-1
Image of a rectangle under a linear map
Proof We first consider the special case where A is a rectangle and L is a
linear map whose mauix in terms of the standard basis is of one of the following
two types:
I 0
0
0
1
L1 =
C
1 0
0
0
I
525
Theorem Proofs for Chapter 9
or
0
1 0
0 1
1
0
0
0
1
The first matrix is obtained from the identity matlix by replacing a single I on
the diagonal by a constant c. The second matrix is obtained from the identity
mattix by writing a single 1 anywhere off the diagonal. These are called the
elementary matrices.
If A= [a 1,bi] x · • • x [a,,,b,,] and c is in the ith row, then
L1(A) = [a1, bi] x · • • x [ca;, cb;] x · · · x [a,,, b,,],
so that the volume of L1(A) is
v(L1 (A))
=lclv(A) =Idet
L1 lv(A).
If the I off the diagonal is in the (i,j) position, then
L2(A)
= {(x1,- .. ,x;-1,x; +Xj,Xi+t,· .. ,x,,) I Xk E [ak,btl}
for I ::; k ::; n. This image is a parallelopiped in which the cross sections parallel
to the x;xrplane have been sheared to become parallelograms. The cross sections
in all perpendicular planes are still rectangles. The possibilities are indicated in
Figure 9.P-2, which shows the x;xrplane cross sections.
For convenience, we renumber the coordinates so that the "I" appears in
the second column of the top row and the vruiables affected are x 1 and x2 .
Geomettically we see that the volume has not been changed since these cross
sections are parallelograms with the same base and height as the 01iginal twodimensional rectangle. This can be confilmed analytically by examining the
integral which gives Jhe volume of the image and using the· Fubini theorem
to do the x 1 integral next to last and the x2 integral last. (We continue the
renumbering by which the variables affected by the shear are x 1 and x2 .)
The volume of ~(A) is thus
v(L2(A)) =
j J···JJI dx2 dx1 dx3 · · · dx,,-1 dx,,
= (b,, - a,,)(b,, - a,i) x · · · x (b3 - a3)
jjl
dx:?dx1.
Chapter 9 Fubini's Theorem and the Change of Variables Formula
526
y
--•I
b2
y
I
b2
case I
b1 -
--I
02
I
I
I
I
I
01
bi
o1
<
b2 -
I
I
o2
-
02
X
Li
= (b
I
I
I
D
01
+
--•
I
I
bi
02
--
I
I
01
+
b1
+
b2
+
I
I
I
b2
bi
\+
bi
+
I
I
I
I
I
I
b2
b1 -
o1
>
b2 -
--+--I
I
case 2
02
I
02
X
01
b1
+ 02
oi
o2
bi
(b)
FIGURE 9.P-2
Image of a rec1angle under an elementary matrix L 2
For each x2 between a2 and b2, X1 runs between the 45° lines x = y + a 1 and
x = y + b 1• Therefore,
JJ
b2
y
I
I
I
01
I
I
--,----1-
I
02
o2
(a)
y
I
I
I
I
I
I
1 dx1 dx2 =
1b1 (1•'1+b1 I dx1 ) dx2 = 1b1 (b1 a1
.x1+a1
a1 )dx2
02
= (b2 - a2)(b1 - a1),
which leads to
v(~(A)) = (b,, - a,,)(bn - a11 ) x · · · x (b3 - G3)(b2 - a2)(b1 - a1) = v(A).
Since <let(~)= I, we have v(l 2(A)) = I<let l2Jv(A).
Let A be an arbitrary set with volume and let L; be one of the elementary
matiices. Assume lhat <let l; f O (that is, c f O if L; is an elementary matrix of
+
bi
Theorem Proofs for Chapter 9
527
the first type). Let S be a rectangle enclosing A, and let P be a partition of A
into subrectangles S1, ... , SN such that
U(lA, P) - v(A)
< e:(21 det
l;I)
and
v(A) - l;(lA, P)
< e:(21 det
LI).
Letting Ve= LJ{S; IS; CA} and We = LJ{S; IS; n A ~ 0}, we get v(l;(Ve)) =
Idet l;ll;(lA,P) and v(l;(We)) = Idet L;IU(lA,P). Thus v(l;(We))-v(l;(Vc)) <
E, and so L;(A) has volume and v(l;(A))
Idet L;lv(A). If detL; 0, i.e., L; is
a matrix of the first kind with c 0, then v(L;(S)) 0 for any rectangle S, and
so v(L;(A)) = 0 for any set A with volume.
If L is a linear map and A is a set with volume, then from ii immediately
preceding Lemma I, we can write L = L 1U! •••lki where the L; are elementary
matrices. By repeated application of what we have just proved, it follows that
L(A) has volume and
=
=
v(l(A)) = I det
Lill det ~I· .. Idet
=
=
Lklv(A) = Idet Llv(A).
•
Lemma 2 If the theorem is true for the function f = l, then it is true for any
integrable/.
Proof If the theorem is true for f = I, it is true for any constant function
(why?). If f is any integrable function on g(A), let S be a rectangle enclosing
g(A) and let P be a prutition of g(A) into rectangles S 1, ... , SN, Let ms;(/) =
inf{f(x) Ix ES;}, and let it also denote the constant function on S; with constant
value ms;<f). Then
N
l(f,P)
•
L 1ms;(f)
N
= Z:ms;(f)v(S;)=
~I
~
~I
N
=
~ 1-,cs;>(ms;(/) o g)llgl
N
5
~ 1-l(S;)(f
=
1
g- 1CS;)
<t
O
O
g)llgl
g)llgl =
r<t
}A
O
g>11g1.
Hence f 8 (A/ 5 Jif o g)llgl. A similar argument with Ms;(f) = sup{f(x)
S;} shows that f8 cA/ 2:: JA(f o g)llgl, so that f8cA/ = JA(f o g)llgl.
•
Ix E
Chapter 9 Fubini's Theorem and the Change of Variables Formula
528
Lemma 3 The theorem is true if g is a linear transformation.
Proof By Lemma 1,
1 =1I
1
1
gl = A IJgl,
det
g(A)
A
since g = Dg. By Lemma 2,
r
1=
}g(A)
.
1u
O
g)IJgl.
A
•
We need one more key observation:
Lemma 4 If the theorem is true for g: A-+ !Rn andfor h: B-+ !Rn, where
g(A) CB, then the theorem lwlds for hog : A -+ Rn.
Proof Perform the following computation:
r
Jhog(A)
1
1=
=
h(g(A))
1
1=
ru
o 1i>111i1
}g(A)
(f Oh O g)(IJhl
O
g)IJgl =
1
(f O (h O g))IJ(h O g)I-
•
Some more notation will enable our proof to run smoothly. If x E !Rn, x ==
Ix! = maxi..::;;..::;,. Ix;!. This "nonn" has the property that in terms
of it, a cube with center p and side of length 2s can be characterized by Ix-pl :$ s.
If A : JR.n -+ Rn is the linear transformation with matrix aij, so that
(x1, ... , x,.), put
A(x)
= A(x1, ... ,Xn) =
(t
a1jXj, ... ,
1=1
t
OnjXj),
r-1
we put
n
!Al = 1<,<11
m!}x L laiil- -
j=I
Thus, IA(x)I ~ IAllxl, We also introduce the Jacobian matrix j(x) = (j;k(x)) of
the transformation g(x) = (g 1(x) • • • g,.(x)), where, as usual, j;k(x) = og;(x)/ OXk-
529
Theorem Proofs for Chapter 9
Ix -
If C is a cube in the open set A, such that C is the set of all x satisfying
Pl :'.5 s, then v(C) = (2s)2. By the mean value theorem,
n
g;(x) - g;(p)
= L)k[P + 0;(x)(x
- p)](xk - Pk),
t-=1
where O :'.5 0;(x) :'.5 I. Thus, lg(x) - g(p)I :'.5 s max_vec jj(y)I: that is, g(C) is
entirely contained in the cube defined by Jz - g(p)I :'.5 smax_vec Jj(y)J, and so if
g(C) has volume, then
(I)
v(g(C)) :'.5 {max lj(yWv<C)}.
yEC
To ensure that g(C) has volume, we prove another lemma.
Lemma 5 If h
: U c Rn --. R" is a C1 map tluu is one-10-one, Jh(x) f. 0,
U is a bounded open set, and C is a set with volume such that cl(C) C U, then
h(C) has volume.
Proof It is sufficient to show that the boundary bd(h(C)) has content zero.
To do this, we first show that bd(h(C)) C /z(bd(C)). Let x E bd(h(C)): to show
that x E h(bd(C)), let y = 1i- 1(x). Then we must show that y E bd(C). Let V be
a neighborhood of y, and suppose V c U. Then h(V) is an open neighborhood
of x, since 1,- 1 is continuous. Thus, h(V) contains points of h(C) and Rn\h(C),
since x E bd(C). Since /z is one-to-one, V contains points of A and Rn\c, and
soy E bd(C). Applying this argument to 11- 1, we see that in fact bd(h(C)) =
h(bd(C)).
To show that h(bd(C)) has volume zero, given c > 0, cover bd(C) with
rectangles B1 , ••• ,BN of total volume :'.5 c. Equation(]) showed that h(bd(C))
lies in a covering by rectangles with total volume (max llh(x)l)c, where the
maximum is over B1 U • • • U BN, Thus, h(bd(C)) has volume zero.
•
If A is a linear transfonnation and S has volume, then
v(A - I (S))
= det(A - I )v(S)
(taJce/ = l on A- 1(S),f = 0 on the complement, and apply Lemma 3). In (1),
let S = g(C), which we now know has volume; since
ldet(A- 1)Jv(g(C)) :'.5 {maxlA- 1j(y)l}n v(C),
·
yEC
530
Clzapter 9 Fubilli's Tlzeorem and tlze Change of Variables Fonnula
we obtain
g(C) $ I det(A)I {max IA- 1j(y)l}n v(C).
yEC
(2)
Let the cube C be subdivided into ~ finite set C 1 , ••• , CM of nonoverlapping
cubes with centers x 1, ••• ,XM, and suppose that 6 is greater than the length of a
side of any of them. Apply (2) to each of C1, ••• , CM, taking A = j(x;) in (2),
and then add. This gives
v(g(C)) $
t,
ldet(j(x;))I
{~i7u- (x;)j(y)lr v(C;).
1
r
1(z)j(y) approaches the
Since j(x) is a continuous (matrix-valued) function,
1
identity matrix as z-+ y and hence {max.vEC; I
(x;)j(y)W 5 1 + 77(6), where
77(6) approaches zero with 6. Thus,
r
M
v(g(C))
:5
[I + 17(6)) L I det(j(x;))lv(C;);
i=I
as 6 approaches zero, the sum on the right approaches
inequality becomes
v(g(C)) $
fc IJg(x)I dx, and the
1
jJg(x)I dx.
(3)
The proof of Lemma 2 with f = 0 outside B, together with (3), gives
1t
g(A)
$
f (f og)llgl.
.}A
(4)
Equation (4) can be applied equally well to g- 1 to give
i.e.,
f <t O g)IJgl
:51
Combining (4) and (5) gives the theorem.
•
}A
g(A)
f.
(5)
531
Worked Examples for Chapter 9
Worked Examples for Chapter 9
Example 9.1 Use the clumge of variables formula and polar coordinates to
1-:
show that
e-x2 dx = .fir
(the function e-x2 is called the Gaussian function; see Figure 9. WE-1 ).
y
FIGURE 9.WE-1
The Gaussian function
Solution To use polar coordinates, let h = JAb e-x2-_v2 dxdy, where Ab is the
circle of radius b centered at the origin. Thus
which we evaluate by iterated integrals. Since (d/ dr)(e-' 2 ) = -2re-' 2, we get
h = 211' lb re-' 2 dr = 11'(1 - e-b\
r=O
To relate h to
J~
00
e-x2 dx, first note that
f
}rt..2
e-x1 -l dxdv = lim 11'(1 - e-b2 ) = 11'.
·
b-oo
Since the integral exists and e-·•·1 -l ~ 0, we can evaluate it any way we please.
Let us evaluate J'JP'.2 e-x2-y2 dxdy using the rectangle [-b,b]2 = [-b,b].x [-b,b].
Thus
f
}l'J.1
e-~:z_y2 dxdy = lim
b-oo
1
(-b,W
e-x2 -l dxdy.
Chapter 9 Fubini's Theorem and the Change of Variables Formulo.
532
But
1
e-x2-y2 dx dy =
[ -b,b]2
(lb
e-•·2 dx)
-b
(lb
e-i dy)
=
-b
.l
(lb
e-"2 dx)
2
-b
2
2
2
by Fubini's theorem, and the fact that e-x -y = e-x e- 1 . We conclude that
limb-oo(t be_,., dx)2 = 1r. Since e-x2 ~ 0, we have J~00 e-r2 dx = ,fir, as
•
required.
Example 9.2 Leibniz's Rule for Differentiation of Integrals
Suppose f : [a, b] x [c, d] -+ JR is continuous and of/ 8y exists and is continuous
on [a, b] x [c, d]. let u(t) and v(t) be continuously differentiable functions from
[c, d] into [a, b], and put
1
1{/)
F(t) =
f(x, t) dx.
11(/)
Prove that F is differentiable and that
1 7i
,,(t)
F' (t) = f(v(t), t)v' (t) - f(u(t), t)u' (t) +
11(1)
8f
t
(x, t) dx.
Solution This is an application of the chain rule after we set it up properly.
Let u = u(t), v
=v(r) and w = r. Then F(t) = g(u, v, w) =J,;'t(x, w)dx. So,
F'(t) = fJg du+ fJg dv + 8g dw.
8u dt
dt
dt
av
aw
The pm1ial de1ivatives of g with respect to u and v are obtained from the fundamental theorem of calculus. Since that theorem involves the upper limit of
the integral, differentiation with respect to the lower limit is accomplished by
reversing the limits and applying the theorem. This reversal introduces a minus
sign. The partial de1ivative with respect to w is obtained by differentiation under
the integral sign by Proposition 9.7.4. Since w t and dw/dt I, this leaves us
with the desired formula. •
=
=
Example 9.3 Compure the volume of the ball of radius r in Rn with center
at the origin (that is, of the set {x E IR" I llxll < r}J.
Solution Use induction on the dimension n. In JR the ball is simply the open
interval ]-r, r[ and has volume 2r. Suppose we have computed the volume of
Worked Examples for Chapter 9
533
the (n - 1)-ball of radius r to be an_ 1rn-i. (One guesses that the answer is of
this fonn, since the (n - 1)-ball is an (n - 1)-dimensional object). Then, since
the boundary of the n-ball has measure zero, we may apply Fubini's theorem.
For each fixed Xn, 0 ~ Xn < r, the cross section of the n-ball of radius r, which
is denoted B(n, r), is
{(x1, ... ,Xn-1,x,,) I ri +···+~-I < r 2 - ~}.
which is an (n - !)-ball of radius (r 2
-
x~) 112 • Hence Fubini's theorem gives
Letting x,1 = r sin 0 for O < 0 < 1r /2, we get (r 2
rcos 0 > 0 on ]O, 1r /2[. Thus,
r
ln(n,r)
l
1'-r
=
a,,_1((r2 - x;,)1/2)"-I dx,,
r~ a -1(rcos 8)
11
= a,,r",
where a,,= 2a,,_1
-•
x;,) 112
=rcos 0 and dxn/ d0 =
r a,,_1((r2 - x;,)1/2)"-I dx,,
lo
r~ cos" 8 d0
= 2 lo
11
=2
-
rcos 8 dB= 2a,,_ 1rn lo
1"
12
cos" 0 d0.
Using elementary calculus, we find
(n - l)(n - 3) · • •
lor1
2
n
n(n - 2)· · •
cos" 0d0 =
{
(n -
I )(n - 3) • • ·
n(n - 2) .. •
7r
odd,
n even.
2
Hence we have
v(B(l, r))
v(B(3, r))
= 2r
v(B(2, r))
?
.,
= 1rr2
= 2 · a,- · r3 · -3
v(B(4, r))
4
·7r2
2
3
= 2 · a4 · r 5 · -158
8 ') 5
= -1r-r
15
= 2 · a3 · r 11 · 83 · 27r
= -r4
= -1r?
v(B(5, r))
= 2 · a1 • r- · 21 · 27r
v(B(6, r))
15 1r
= 2 · as · r6 • 48
·2
I
-1r3 r6
= 6
. •
534
Chapter 9 Fubini's Theorem and the Change of Variables Formula
Example 9.4 Improve the change of variables formula by replacing "A is
open" with "A has volume. "
Solution There are (at least) two ways in which this can be done, and we
shall give both as theorems.
Theorem Let g : D c IR.n -+ IR.n be of class ct, where D is open. Furthermore. let g be one-to-one and Jg(x) 'F Ofor all x ED. Let B = g(D). Suppose D
and B have volume. Let AC B have volume andf: A-+ JR. be integrable. Then
1
g-l(A)
<f og)IJgl = ff.
}A
Proof Extend/ to B by letting/= 0 outside A. By 9.3.1, J0 (! o g)IJgl = J8 f.
Since f = 0 outside A, fog is zero outside g-t(A), and so the conclusion
•
follows.
Theorem Let B have volume and f : B - JR. be integrable. Let A have
volume and suppose g : int(A) C IR.n - int(B) C IR.n is ct, one-to-one, and onto
and Jg(x) -:/- 0 for all x e int(A). If f : B -+ JR. is integrable, then
L
fat= <t
O
g)IJgl.
Proof Since B has volume and bd(int(B)) c bd(B), int(B) has volume (since
= J8 f.
bd(B) has measure zero). Also, int(B) U (bd(B) n B) = B, and so
Hence, we get the result by the change of vruiables theorem. •
hn,cB/
Notice that the conditions on g are equivalent to the existence of a ct inverse
for g (by the inverse function theorem).
•
Note. In these two theorems, one can show that Jg(x) "IO can be dropped
as a hypothesis (then g does not necessarily have a C1 inverse). This can be
deduced from Exercise 27. Exercise 26 asks the reader to prove the change
of variables formula for improper integrals. A solution may be based on the
usual change of variables formula and our discussion of improper integrals
from Chapter 8.
Exercises for Chapter 9
535
Exercises for Chapter 9
1.
Use cylindrical coordinates g(r, 0, z) = (rcos 0, rsin 0,z) defined on {r, 0, z)
r > 0, 0 < 0 < 21r} to calculate the integral over A = {(x,y, z) I x2 + y2 <
I,lzl < I} of/(x,y,z)=(x2+y2)z2 •
2.
Give a counterexample to show that the change of variables fonnula does
not hold if g is not one-to-one .even though Jg(x) ~ 0. [Hint: Take/= I
and g(x,y) = (e'"cosy,e'"siny).]
3.
Evaluate the following integrals, if they exist:
a.
b.
c.
f,..(y//x) dxdy, where A is the unit square {(x,y) IO< x < 1,0 <
y < I}.
e.
JA xdxdy, where A= {(x,y) IO< x <
fo" Jd r2 drd0.
J_I 1 Jor(I - x2)1/2 (x-__, + 3y-x)dydx.
g.
fo, 0 < y < sinx2}.
~
Suppose that /,/1 ,/2,/3, ... are continuous real-valued functions on [O, I]
and that fn -+ f unifonnly on [0, l] as n -+ oo. Prove that each of
I/I, lid, l/21, l/31, ... is integrable on (0, l] and that
Ifni -+ 01 I/I as
n-+ oo.
Jd
5.
6.
< 2 +x,x < I}.
d.
f.
4.
f,.. x2y2 dxdy, where A= {(x,y) IO< x < y2,0 < y
J,.. sin(x2 + y2) dx dy, where A is the unit disk.
fp_; I/(x2 +y2 +z2 ) dxdydz.
J
Compute the volume of the following sets:
a.
A tetrahedron with base area A and height h.
b.
A cone with base radius ro and height
c.
d.
{(x,y)
I x2 < y < I -
ho.
x2}.
{(x,y,z)lx2+y2+z'.?< I andz< 1/2}.
Prove:
a.
.X = inf {
If A has volume and if .X is defined as
~ v(S;) I S ,S2, ...
1
then v(A) = .X.
is a countable cover of A by open rectangles}
536
Chapter 9 Fubini's Theorem and the Change of Variables Fonnula
b.
Let A be a bounded set with volume and let A; be a sequence of
sets with volume such that the A;'s are nonoverlapping (that is, have
nonintersecting interiors) and
00
LJA;=A.
i=I
Show that
00
v(A) =
L v(A;).
i=I
7.
Find
JA xy sin(.x2 -
y2) d.x dy, where
A= {(x,y)
8.
a.
IO< y < l,x > y, and x2 - y2 < l}.
Let u : A C R 2 --+ ]a, b[ and v : B c R 2 --+ ]c, d[ be two functions
of class C 1 from open sets A and B onto the intervals ]a, b[ and
]c,d[ such that u(x,y) = u(x',y') and v(x,y) = v(x',y') only when
(x,y) = (x',y'), and such that
fJu av _
fJx ay
av au ~ O
ax ay
at any point (x,y) E An B. Let W = {(x,y) I a < u(x,y) < b,
c < v(x,y) < d}, and let f be an integrable function on W. Show
that
b.
Use a to evaluate
where W = {(x,y)
9.
l
(xi
+ y2) d.x dy,
Ix> O,y > 0, -I < .x2 - y2 < r,xy < I}.
Suppose f : ]a, b[ --+ R and g : ]c, d[ --+ R are two integrable functions.
Define J(x, y) = f(x), g(x, y) = g( y) (assume f and g are bounded). Prove
1
~~x~~
l(x,y)g(x,y)d.xdy== (lbf(x)d.x)
a
(1d
c
g(x)d.x).
Exercises for Chapter 9
537
10.
Suppose A c !Rn and A has zero volume. Suppose/: A
function. Prove that f is integrable and JAJ = 0.
11.
Let S {(x,y) E IR 2 Ix rational, 0 < x < I, and if x
f01m, y = k/ m, k = I, ... , m - I}. Show that
=
-+
JR is a bounded
=p/m in
lowest
but that
f
Is
J10.11xro,11
doesn't exist.
12.
Show, if f"(x)
> 0 for all x, that/ is convex upward, i.e.,
!(x;y) $f(x)~f(y)_
13.
=
=
Suppose C C A x B and v(C) 0. Let Cx
{y E B
assume that
I
v _ { 1 if (x,y) E C
ex<.) 0 othe1wise
I (x,y)
E C} and
is integrable over B for each x E A. Show that Cx has volume zero in A
for all x except possibly a set of measure zero. Give an example where
v(Cx) 'I- 0 for some x.
14.
True or False-If tme, give a reason; if false, give a counterexample.
f.
a.
Suppose/ is integrable on A and g is a function on A such that g $
Then g is also int\!grable on A.
b.
Suppose A has volume and/ is continuous on A. Then
c.
Suppose A has volume and
d.
Any bounded subset of IR 2 has zero volume when considered as a
subset of IR. 3•
e.
If/ = [0, I] and ,p 1 and cp 2 are continuous functions : / -+ JR. with
<P1 (x) < (;?2(x) for all x E /, then S = { (x,y) I x E /, tp1 (x) < y <
<p2(x)} has volume.
JAJ exists.
JAJ exists.
Then f is continuous on A.
538
Chapter 9 Fubini's Theorem and the Change of Variables Formula
15.
If A is a bounded set with volume and A; is a sequence of sets with volume
such that A;+1 :::J A; and A1 UA2 U · · · =A, then show that v(A;)-+ v(A) as
i-+ 00.
16.
Suppose g : JR -+ JR is differentiable everywhere, and that lg'(x)! $ M
for all x E JR. Show that, if c: is small enough, the function f defined by
f(x) =x + c:g(x) is one-to-one.
17.
Let/: [0, 1] -+ JR be defined by
f(x) =
{
n2
0
Prove that/ is integrable and
18.
19.
if x = 1/n
if x E [0, 1], x-:/- 1/n.
f01f(x) dx = 0. (Warning: f
is not bounded.)
Letfn(x) = L~n=I (lj2m) sinmx be defined for all x E R
a.
Show that limn__,oofn(X) exists.
b.
Show that the sequence converges uniformly.
c.
Show that 021r (Iimn__,oofn(t)) dt = 0.
f
Let A C 1Rn, B C ]Rm have volume and f : A -+ JR, g : B -+ JR be
integrable. Let F(x,y) = f(x) + g(y). Show that
LxB F(x,y)dxdy = (lt) v(B) + (ls g) v(A).
20.
21.
Compute
> 0.
a.
limn-+oo(l + 2k + · · · + nk)jnk+I, where k
b.
lim"__, 00 (1/(n+ 1) + 1/(n + 2) + • • • + 1/2n).
For what values of p is rP integrable in JR 3 , where
r= Jx2 +y2 +z2?
22.
Let f and g be two integrable real-valued functions on [a, b]. Let
h(x) = max(f(x), g(x)) = { fg((xx))
Show that h(x) is integrable.
if f(x) 2". g(x),
if f(x) ~ g(x).
539
Exercises for Chapter 9
23.
Let A be a closed rectangle in Rn and C c A have volume. Prove that for
any e > 0 there is a compact set K C C such that v(C\K) < €, and that
there is a compact set L ::> C such that v(L\C) < €.
24.
True or False:
a.
b.
If/ is a continuous function on [0, I], then/ is bounded on [0, I].
If/ : S -+ lR" is a continuous function, where S is a compact subset
of JR, then f (S) is compact.
c.
An integrable function on [0, I] must be continuous on [O, I].
d.
If U and V are open subsets of JR, then Ux V = {(x,y) Ix E U,y E V}
is open in lR 2•
e.
If/ and g are integrable functions on [a,b], then/- 2g is integrable
on [a,b].
f.
Any bounded sequence in
g.
If/ is a continuous real-valued function on [0, I] such that f(x) ~ 0
for all x E [O, I] and Jd f(x)dx = 0, then/(x) = 0 for all x E [0, I].
If/ is an infinitely differentiable real-valued function on JR, then f
h.
]Rn
must have a convergent subsequence.
has a power se1ies expansion about each point of JR.
i.
If /,/1 ,h,h, ... are uniformly bounded continuous real-valued functions on [0, I] such that/(x) = lim,,_ 00 f,,(x) for each x E [0, I], then
converges to f unifo1mly on [0, I].
/1 ,h,h, ...
j.
If a0 ,a 1 ,a2 , ••• is a sequence of real numbers and r is the radius of
convergence of L:0 a,,x", then L:Oa,,r" converges.
k.
If/1,h,/3, ... converges uniformly to/ on
I.
If/1,/2,/3, ... converges unifotmly to/ on [a,b] and if/,, is differ-
[a, b] and if f,, is integrable
for each n = 1, 2, 3, ... , then f is integrable on [a, b ].
entiable on ]a, b[ for each n = I, 2, 3, ... , then f is differentiable
on ]a,b[.
m.
An open connected subset of JR" is arcwise connected.
n.
If f is a differentiable real-valued function on ]0, 1[ and /(I /2) 2'.:
f (x) for all x E ]0, I[, then f' (1 /2) = 0.
o.
If f is an integrable function on [0, I] and E: > 0, then there exists a
step function g on [0, l] such that 01 If - Bl < E:.
J
p.
Every closed and bounded subset of JR contains its least upper bound.
q.
Iff(x) has a power series expansion on ]-r, r[, then/ is differentiable
on ]-r, r[.
Chapter 9 Fubini's Theorem and the Change of Variables Formula
540
< c,
If f is integrable on [a, b] and [b, c], where a
integrable on [a, c].
s.
If f: ]Rn-+ lRm is continuous and Sis closed in lRm, then/- 1(S) is
closed in Rn.
t.
If a series
partial sums
u.
25.
<
r.
b
then f is
E,.: ai converges conditionally, then the sequence of
1
Ej=1 ai is bounded.
If f is a continuous real-valued function on ]a, b[ , then there exists
a differentiable function Fon ]a,b[ such that/= F' on ]a,b[.
Compute the area of the region D = {(x,y)
I15
x
< 3 and
r
5
y
5
r+l}.
26.
Investigate possible generalizations of Theorem 9.3.1 to unbounded regions.
27.
The purpose of this problem is to prove a simplified version of a fairly
difficult theorem known as Sard's theorem. A general treatment can be
found in Milnor, Topology from the Differentiable Viewpoint, Princeton
University Press (1967), or in Abraham, Marsden, and Ratiu, Manifolds,
Tensor Analysis and Appli~ations, Springer-Verlag (1988). In our case the
statement of the theorem is as follows.
Theorem Let A c lRn be open and g : A -+ R" be of class C1. Let
B
={x EA I Jg(x) =O}. Then g(B) has measure zero.
The set B need not have measure zero (to see this, take g to be a constant
mapping). Before outlining the proof, we ask the reader to assume the
result and use it to show that in Theorem 9.3.1, the assumption that Jg(x)
¢ 0 can be omitted (provided that the (open) set of points where Jg(x) ¢ 0
has volume).
The theorem is proved as follows. First show that if U is a closed rectangle
in A, it suffices to show that g( U n B) has measure zero (show that g(B)
is the countable union of these intersections). In fact, we shall show that
g(U n B) has content zero.
Prove these two points: For any e
x,y E U, llx - YII < 6, we have
> 0, there is a 6 > 0 such that for
llg(x) - g(y) - Dg(x)(x -
y)II < c:llx - YII-
Exercises for Chapter 9
541
Also, there is an M such that
llc<x) -
g(y)II $ MIIY -
xii-
Let U have sides of length I. Choose N ~ l/ 6 such that if U is divided into
Nn rectangles (of sides l/N) and Sis such a rectangle, then for x,y E S,
the two preceding inequalities hold. Suppose x E Sn B. Find a hyperplane
H in IR.n (H will be some (n - ))-dimensional subspace) such that
{Dg(x)(y - x) I y ES} c H.
Next show that {g(y) I y E S} lies in a cylinder of height < 2cn(l/ N) and
base an (n - ))-cube of side < 2Mn(l/N). Deduce that g(U n B) lies in
N'' rectangles of total volume < eK, where K = 2nMn- 1(n)Rln, a constant
independent of N. This will prove the result.
28.
Suppose f(x) is continuous on )-1, 1[, /(0) = 0, and /(x) f; 0 if x f; 0.
Prove: lim......... 00 [log(I + f(x))J/f(x) exists. What does it equal?
29.
Suppose f is bounded, defined on [0, b ], b
for all O < c < b. Show that
30.
ft f exists.
> 0, and suppose J: f exists
Consider the following theorem.
Theorem Let A be an open subset of lR.n and cp : A -+ !Rn a one-to-one
continuously differentiable map whose Jacobian Jcp is nowhere zero on A.
Suppose that the function/: cp(A)-+ JR. is continuous and is zero outside a
compact subsetof cp(A) and that
exists. Then
= fifocp)IJcpl.
J..,,A/
J..,,A/
Suppose A;, i = l, 2, ... , are open subsets of IR.n such that the theorem is
true for each A; and the restJiction of cp to A;. Let A = UA;. Show that
the theorem is true for A and cp.
31.
Compute limn-oonlog (I+
1/n).
32.
Suppose S C Rn has volume and t E JR.. Let tS = { (tx 1 , • •• , txn)
(x 1 , ••• , Xn) E S}. Show that tS has volume and vol(tS) = W vol(S).
33.
a.
Prove Theorem 9.2.3. [Hint: As in Theorem 9.2.1, it suffices to
prove ii. Do this in the same way as for Theorem 9.2.1.)
b.
Prove the following generalization of Corollary 9.2.2. Let A c IR.n be
a closed rectangle and let cp : A C IR.n -+ IR.m and t/; : A c IR.n -+ Rm
I
542
Chapter 9 Fubini's Theorem and the Change of Variables Formula
graph of 1/1
_)
IR
-
( n
-
graph of ,p
=2)
m=I
°'
I
~
y
the set D
_2/J--:.-:A
X
FIGURE 9.E-1
The set D in Exercise 33 is the region between the graphs of
functions
be continuous functions, such that 'Pi(x) $ 'I/Jj(x) for all x E A, I $
j $ m. Let D = { (x,y) E JR" x JR"' I x E A, cpi(x) $ Yi $ 1/;j(x), I $
j $ m}. See Figure 9.E-1. Let f : A -+ JR be continuous, define
Bx C JR"' by Bx = {y E JR"' I 'Pi(x) $ Yi $ if;j(x), l $ j $ m}, define
fx: Bx C JR"'-+ JR by !t(y) =J(x,y), and define g: AC JR"-+ JR by
g(x) =f 8 fx. Then g is integrable over A and IA g =Inf.
Chapter 10
Fourier Analysis
Fourier analysis arose historically in connection with problems in mechanics
such as heat conduction and wave motion. The subject has evolved into an
extensive theory with many applications, both mathematical and physical. This
chapter is intended to give a brief but basic working knowledge of some Fourier
methods, to introduce the student to the general theory, and to delineate some
fundamental applications.
As an introduction to the study of Fourier analysis, we consider a vibrating
strirg. Further applications (for example, to quantum mechanics) are presented
in later sections. Consider a string of length l with clamped ends that is free
to vibrate when plucked. The position (vertical displacement) of the string
is represented by a function y(t, x), where t is the time and x E [O, l]. See
Figure 10-1. It is known from elementary mechanics that for small-amplitude
vibrations, y obeys the wave equatio,i
&2y
ot2
2 &2y
=C
f)x2'
where c is a constant dete1mined by the nature of the string and the tension in
it. That the string has clamped ends entails that y(t, 0) = y(t, l) = 0 for all t.
To simplify matters, let ·us first look at the case of special solutions called
sta,iding waves; these are solutions of the form y(t,x) = u(x) cos wt, where y(t,x),
as in Figure 10-1, represents the ve1tical displacement at x at time t and where
w is the frequency. Thus lu(x)I represents the amplitude at point x, and the
wave shape is given by the function u(x). Physically, a standing wave is a
synchronous up-and-down motion that repeats its shape periodically after time
t = 21r / w, such as occurs when a string produces a pure note. Specific standing
waves called fundamental solutions or lzarmo,iics are given by
Yn(t,x) = sin
c;x)
cos(w,.t),
543
n = 0, 1, 2, ... ,
544
Chapter 10 Fourier Analysis
y(t, x)
FIGURE 10-1
The displacement of a st1ing at location x at time tis denoted
by y(x, t)
=
=
=
where w,. mrc/l is the frequency. For example, with n 2 and t 1r/w11 , we
obtain the illustration in Figure I 0-1.
It is both important and remarkable that any solution y(x, t) describing the
motion of the string can be decomposed into harmonics-that is, written as a
series
00
y(x, t)
00
=L c y (x, t) =L c,.u,,(x) cos (w,., t),
11
11=1
11
n=l
where y,. is as before and Un(x) = sin(mrx/l). We call u1 the first harmonic component of y, u2 the second, and so fo1th. Thus a complicated-looking vibration
(such as occurs on a violin string) is in reality an infinite combination of harmonics, where each harmonic component appears with weight c,.. In Figure 10-2
we illustrate how summing three sine cmves of varying amplitudes can lead to
a more complicated wave shape. For a general wave shape, one may require an
infinite combination of sine cmves.
n=I
+
s= 2
n =2
+
I ~
~ n=3
s=2~
FIGURE 10-2
The superposition of three harmonics
The purpose of Fomier analysis is to catl)' out this procedure of decomposition using a general method. For finite regions (such as [O, /], used in th e
§JO.I Inner Product Spaces
545
previous illustrations), the appropriate method to use is that of Fourier series,
while on an infinite region (the whole real line, for instance), Fourier integrals
are required.
The series obtained from sin nx, cos nx, or einx are called the classical Fourier
series. For other types of problems (the harmonic oscillator in quantum mechanics, for example), other types of basic solutions enter, and arbitrary solutions need
to be expanded in te1ms of these basic solutions (for the quantum mechanical
harmonic oscillator, for instance, Hermite functions are used). Therefore, it is
useful to discuss the general theory of expansions, as will be done in §10.1
and §10.2.
In §10.3, §10.4, and §10.5 we study the special case of trigonometric Fourier
seties and justify the expansion procedure. To justify this procedure for other
families of functions often requires an examination of special situations-for
instance, the differential equation giving rise to the problem. In this regard,
there are two main theorems. The first deals with the imp011ant concept of
mean convergence and states that any square integrable function has a Fourier
series that converges in the mean. As we later explain, this must not be confused
with pointwise convergence. For the latter, one can use the basic theorem of
Dilichlet and Jordan. Some fm1her theorems on convergence, such as justifying
term-by-term differentiation, are given in §10.6. A few important applications
ofFomier methods are presented in §10.7. There we study special cases of three
problems-the wave equation, the Dilichlet problem (Laplace's equation), and
the heat equation-all from the point of view of Fomier series. In §10.8, an
informal treatment of Fomier integrals is given, stating the basic properties and
definitions without proofs (these are left for a more advanced course). We hope
that the student will obtain some perspective of the role of Fourier integrals and
their relationship to Fourier series. Finally, in §10.9, a brief glimpse is given of
quantum mechanics and how the techniques of §10.1 to §10.3 can be used to
establish some of the basic results of the subject.
§10.1 Inner Product Spaces
Before we begin our study of Fourier series, let us look at some concepts that
enable us to simplify the job. These ideas reveal an imp011ant geometric aspect
of Fourier seties. In this chapter we also need some basic facts about complex
numbers from Chapter l.
In Cliapter I we studied the inner product ( , ) on IR". Now we extend
th ese notions to an arbitrary vector space V. In the present chapter, V is no
longer finite-dimensional, but is an infinite-dimensional space of functions such
as the space C(A, JR 111 ) studied in Chapter 5. For example, V might be a space
of functions f : [O, 27r] -+ IR. This is a vector space using the usual definitions,
546
Cliapter JO Fourier Analysis
=
=Jt
=
(f+g)(x) f(x)+g(x) and (o:f)(x) af(x). The expression (f, g)
f(x)g(x) dx
is called the inner product off and g. It is this sort of space V that the reader
should keep in mind when studying the next two sections.
It is important to allow the members of our function space to talce on complex
values, for the simple reason that it is often more convenient to work with e; 9
than with sin 0 and cos 0. In the complex case, we let
(f, g) =
fo
2
1r
f(x)g(x) dx,
where g(x) is the complex conjugate of g(x). The reason we use g(x) is so that
we can (as in the real case) define the length or nonn off by
111112 =(/,/) = fo 2 lf(x)l 2 dx
1r
(for complex numbers z, recall that lzl 2 = zz is a positive real number).
Our study begins with abstract vector spaces with inner products rather than
the special space of functions above (which is actually the one of most interest to
us), because it is conceptually and notationally simpler to work with the notation
( , ) than with integrals. At this point only the following basic properties of ( , )
are significant.
10.1.1 Definition Let V be a complex vector space (that is, a vector space
where we allow complex numbers for scalar multiplication). An inner product
on V is a mapping ( , ) : V x V --+ C (where C denotes the complex numbers)
with the following properties:
•
i.
(of+ /Jg, h) = a:(f, h} + /J(g, h) for all f,g, h E V and a, /3 E IC.
ii.
(f, h) = (h,f).
iii.
(/,/) ~ 0, and (f,J) = 0 implies f = 0.
From i and ii we deduce that (h, af + /Jg) = <i{h,f} + f3{h,g). Notice that if
all quantities were real, we would have the same properties as the usual inner
product on R". In general, as we have stressed, V is not finite-dimensional, and
so we cannot use finite bases and mauices.
10.1.2 Theorem The space V of continuousfunctionsf: [a,b]--+ Cforms
an inner product space if we define
{f,g) =
lb
f(x)g(x)dx.
547
§10.1 Inner Product Spaces
The integral of a complex-valued function is defined as
where f = / 1 + if2 • The properties of complex integrals are similar to and may be
deduced from those of real integrals.
In an inner product space we can introduce many of the ideas considered in
Chapter l. The norm off, denoted 11/11, is defined by
111112 = (f,J)
and the distance between f and g by
d(f,g) =
II/- gll-
We use the same language as in JR." by analogy. For example, we say that f
and g are orthogonal if (J, g) = 0. Since V is a vector space, we can also talk
about linear dependence and other related ideas we saw in IR". The following
theorem develops the analogy with JR." further.
10.1.3 Cauchy-Schwarz Inequality Let f, g belong to the inner product
space V; then
l(f,g)I
~
llfll llgJI-
Also, V with norm IIll I satisfies the axioms of a normed space and, with d(f, g) =
II/- gjl, the axioms of a metric space.
Since we have a metric space, the notions of open and closed sets from
Chapter 2 are relevant here. The main concept for us here is that of convergence
of a sequence or series, which we now recall.
10.1.4 Definition Let V be an inner product space and let J,, be a sequence
in V. We say that fn converges to f and write fn -+ f if Ilfn - fl I -+ 0, that
is, if for every real number £ > 0 there is an N such that n ~ N implies
Iii,, - /II < c. Similarly, a series E;::::i g,. converges to f if the sequence of
partial sums Sn = I:~=! 8k converges to f.
If V consists of functions f : [a, b]
reads
-+
C, then the Cauchy-Schwarz inequality
Chapter IO Fourier Analysis
548
The triangle inequality (i.e., II/+ Bil ~
Minkowski's inequality and reads
b
{
11/(x) + g(x)l2 dx
} 1/2
:5
{
1
b
11/11 + llgll)
l/(x)l 2 dx
} 1/2
+
becomes what is called
{
1
b
lg(x)l 2 dx
} 1/2
With this inner product on the space of functions V, convergence is called
co,ivergence in the mean. It need not coincide with pointwise or uniform convergence, and so we write J,, -+ f (in mean), fn -+ f (pointwise), and fn -+ f
(unifmmly) to distinguish them. As we saw in Chapter 5, uniform convergence
implies mean convergence; however, pointwise convergence need not imply
mean convergence. Fm1her relationships between the different types of convergence are studied in §10.3.
Thus fn -+ f (in mean) has the same meaning as the statement
(lb
lfn(X) -
/(x)l 2 dx)
-+
0.
For example, consider the function/n defined by fn(X) = l - nx for O :5 x :5 1/n
and fn(X) = 0 for all other x E (0, I). Then fn -+ 0 (in mean), but fn does
not converge to f = 0 pointwise (at x = 0 specifically), and therefore it does
not do so unifo1mly. See Figure 10.1-1. One could contemplate other types
of convergence, such as
IJ,,(x) - f(x)I dx -+ 0, but mean convergence is the
most appropriate for Fomier analysis because of its close connection with inner
product spaces.
J:
y
0
n
FIGURE 10.1-1 The sequence J,, converges to zero in mean, but not pointwise
549
§10.1 Inner Product Spaces
The space V of Theorem 10.1.2 may be extended to include other functions
such as piecewise continuous functions. However, even if V is extended to all
Riemann integrable functions, it still suffers from a serious deficiency-it is not
complete; that is, there are Cauchy sequences that do not converge. We recall
this notion.
10.1.5 Definition A sequence fn in an inner product space V is a Cauchy
sequence when for any e > 0 there is an N such that m, n ;::: N implies
llfn - J,,.11 < e. An inner product space is called complete if every Cauchy
sequence in the space converges. A complete inner product space is called a
Hilbert space.
To make V in Theorem 10.1.2 complete, the Lebesgue integral is used. Our
elementary discussion does not require this notion, but the student should be
aware that a solution to this problem can be found.
Can we work with noncontinuous functi_ons and still have an inner product
space? The answer is quite simple. The only place in Theorem 10.1.2 where
continuity is used is in the statement that
1b l/(x)l
2
dx = 0
implies
f=O.
For an integrable function f, we saw in Chapter 8 that
1b l/(x)l
2
dx = 0
implies
f =O,
except possibly for those x in a set of measure zero. If we regard such an
f as equivalent to zero (change f on a set of measure zero if necessary), then
Theorem 10.1.2 carries over. With that understanding, the next theorem follows.
10.1.6 Theorem Let V = L2 be the space of functions f : [a, b] --+ <C such
lf(x)l2 dx < oo). Then the space Vis an inner
that l/1 2 is integrable (that is,
J:
product space with inner product
and norm
b
11111 = ( 111<x)l 2 dx
) 1/2
550
Chapter 10 Fourier Analysis
Another convenient class of functions that fonns an inner product space is
the class of sectionally continuous (or piecewise continuous) functions. They
are defined as follows.
10.1.7 Definition A function f: [a,b] -+ C is sectionally continuous if
[a, b] has a finite partition a= xo < x1 < · · · < Xn = b such that f is continuous
and bounded on each open section ]x;, X;+ 1[, i = 0, ... , n - 1.
We shall be working with this class later.
10.1.8 Example If /1, ... ,fn are orthonormal vectors in the inner product
space V (that is, if (f;,fj) = 0 for i -,. j and (J;,f;) = 1), prove that / 1, ••• ,fn are
linearly independent.
=
I:7=1 cJ; 0. We must show that c;
and form (g,f;). We have
Solution Suppose
LJ=I Cj fj,
L
n
(g,f;} =
(
)
Cj
fj, f;
=
j=I
L cj{fj,f;) = L
n
n
j=I
j=I
=
(where Dj; I if j = i and is zero if j -,. i). Since g
we have the desired result. •
Cj
= 0. Fix i, let g =
·bi;= c;
=0, we obtain c; =0, and so
•
10.1.9 Example Let V be an inner product space and f, g E V, g f. 0.
Define the projection off on g to be the vector h = (f, g)(g/jlgll2). Show that
h and f - h are orthogonal, and interpret this result geometrically.
Solution We compute as follows:
(hf- h) = {h /) - llhli2 = (g,/}(/,g} - ((/,g}g, (/,g)g)
'
'
.
llgl 12
- (g,J)(f,g)
-
llgll2
llgll 4
(f,g)(f:i} - 0
-
llgll 2
-
'
since (f,g) = (g,J). Hence hand/ - h are orthogonal. The geometric significance of this is illustrated in Figure 10.1-2. •
§10.2 Orthogonal Families of Functions
551
_/2
0
h
g
FIGURE 10.1-2 The function his the projection of/ along g
Exercises for § 10.1
1.
Prove the following in an inner product space:
(/, g) = 0 •
II/+ gll 2 = 11/112 + llgll 2 (Pythagorean theorem).
4(/,g) = <II/+gll2- llf- gll 2 )+i<ll/ +igll2- ll/ - igll 2 ) (polarization
a.
b.
identity).
211/112 +2118112 =II/+ gll 2 + III - gll2
II/+ gll · II/- gll:;; 111112 + llgll 2 -
c.
d.
2.
Show that
ll
f(x)d,J' <; (b- a) [
(parallelogram law).
1t<x)J2 dx,
and deduce that a square integrable function on [a, b], continuous on ]a, b[,
is also integrable. Is the converse true?
3.
Let fi?t, ••• , fi?,, be 011hono11nal vectors in V and/ E V. Define the projection of .f. on cp,, ... , fi?,, by g =
1 (/, cp;)rp;. Show that g and/ - g are
orthogonal. Interpret geomet1ically.
4.
In an inner product space, prove that 111/11-1 lg 111 :;; 11/ - g 11- In particular,
II/II ~ Ilg II+ II/ - gll. [Hint: Write f = (/ - g) +g and apply the triangle
inequality.]
5.
If/,,
I:7.
-+ f
(in mean), prove that
IIi,, 11
is a bounded sequence.
§ 10.2 Orthogonal Families of Functions
In this section we study some general properties of orthogonal vectors in an
inner product space. The main notions developed are those of a general Fourier
series, a complete 011hon01mal system, and the relations between these concepts.
Chapter 10 Fourier Analysis
552
Let V be a vector space with an inner product ( , ). A vector c.p E V is
called normalized if 11'-PII (c.p,c.p) 1/ 2 1. For g EV, if g 'I- 0, then g/llgll is
normalized. Also, recall that/ and g are called orthogonal vectors if (f,g} = 0.
A sequence c.p 0 , r.p 1, c.p 2 , ••• in V is called an orthonormal family if each <p; is
normalized, and c.p;, '{)j are 011hogonal if i 'I- j. These conditions may be restated
as
=
=
(c.p;, 'Pj} = 6ij,
=
=
where 6ij 1 if i j and O if i 'I- j.
Ultimately we shall study the space V = L2 consisting of the square integrable
functions f : [a, b] --+ C with (/, g} = f(x)g(x) dx, as was stressed in §10.1,
but for now our discussion will be in the context of general inner product spaces.
The object of Fomier analysis is to wlite each f E V in the form
J:
where Ck E C and <po, c.p 1 , <p2, ••. is a given 011hono1mal family. In general,
one cannot do this; however, if it can be done for each f E V, the family
<po, rp 1, cp 2 , ••• is called complete. (This is not to be confused with the notion of
completeness indicating that all Cauchy sequences converge.)
The sum I:~ Ck<fJk is understood to be taken in the sense of "mean convergence"; that is, if Sn = I::;=0 Ck<fJk, then II I:::O Ckcpk - s,,11 --+ 0 as n --+ oo. The
questions of pointwise or uniform convergence (in the case where V is a space
of functions) will be dealt with in the ensuing sections.
Our first job is to determine the constants ck in the expression for f. This
is easy if we keep in mind the geometlic intuition. We refer specifically to the
fact that in IR", if e 1, ... , e,, is an 011hono1mal basis, each x E IR" is written
II
x= Lx;e;,
i=I
where x; = (x, e;}. The latter is called the projection of x along e;. The same is
true in general.
10.2.1 Theorem In an inner product space V, suppose f = L~ ckcpk for
an orthonormal family <po, <p1, ..• in V (convergence in the mean) and f E V.
Then c1: (f, '-Pk) (c.p,.,f).
=
=
We gather some imp011ant terminology in the following definition.
§10.2 Orthogonal Families of Functions
553
10.2.2 Definition An orthonormal family
<po, <p1, ••• in an inner product
space V is called complete if every f E V can be written f = E:0 Ck'Pk• We call
E:O{f, <fJk}<fJ1: the Fourier series off with respect to <po, <p1, ••• and {f, 'Pk) the
Fourier coejftciems.
Theorem 10.2.1 says that the only candidate for representing f in terms of
the "'" is the Fomier series with Ck = {f. 'Pk}, Saying that <po, ip 1, ... is complete
is equivalent to the condition that eachf is "equal" to its Fourier series, that is,
that the Fourier series off converges in the mean to f. Thus {'Pk} is complete
if! for every f E V, II! - E~'..ao{f, 'P.t}'Pkll - 0 as n - oo.
Before proceeding with the theory, let us give some examples of orthonormal
families (which will turn out to be complete orthonormal families).
First, there are the classical Fourier series where the 'Pn are taken to be the
functions defined by
eim·
rp,,(x)
= r-:.=• n = 0, ± 1, ±2, ... , x E [O, 21r ].
v21r
The Fourier series for f : [0, 21r] - C for this family is given by
,
where
c1: =
~ (211: f(x)e-ikx dx = {f, cpi.)
v21r Jo
(the term e-ih in the expression for c,: has a minus sign because we use the
complex conjugate of g in {f,g}). After we prove completeness (§10.3), we can
assert that/ equals its Fomier se1ies in the sense of convergence in the mean.
Another family that is related to the preceding one by einx = cos nx + i sin nx
and is also called the classical Fourier series is
1
cos mx sin nx
.../i.1r , ,/ii , ,/ii ,
m, n = I . 2, . ..
Orthonormality of this family will be checked in Example 10.2.8. The reader
should write out the Fomier series of a function with respect to this family.
The preceding are the 011hono1mal families most pertinent to our later discussions. For reference, however, we give other classical examples that arise
in certain applications. To describe these, we first review the Gram-Schmidt
process.
Chapter·JO Fourier Analysis
554
Given an inner product space V and linearly independent vectors g0 ,g 1,
g2, ... in V, one can· form a corresponding orthonormal system cp0, r.p 1, ... as
follows:
go
cpo = llgoll'
(g1, cpo)cpo
(g1. cpo)cpoll'
g2 - (g2, cp1 )cp1 - (g2, cpo)cpo
- llg2 - (g2, cp1)r.p1 - {g2, cpo)'Poll'
C/)1 -
g1 -
- Ilg1
C/)2
-
and so on. Geometrically, this is obtained by using repeated orthogonal projections. It is left to the reader to verify that the process leads to an orthonormal
family; see Exercise 2 for this section.
The normalized Legendre polynomials are obtained by applying the GramSchmidt process to the polynomials 1,x,x2, ... ,x", ... on [-1, l]. It can be
shown by induction (a fairly tedious but straightforward proof) that the nth
normalized Legendre polynomial is
Pn(x)=
J2n+I
/2. 2Rn!
.!:_(x2 -lt.
dx"
On JR. = ]-oo, oo[, the Gram-Schmidt process applied to the functions
x"e-x21 2 , n = 0, 1, 2, ... , gives the normalized Hermite functions, and applied
to the functions x"e-x, n ::: 0, 1, 2, ... , on (0, oo[, it gives the normalized Laguerre functions. These functions are best treated in the context of differential
equations, where they represent fundamental solutions of certain differential
equations, just as sin nx is a fundamental solution for the vibrating string.
Let us continue with the general theory.
10.2.3 Bessel's Inequality Let cp0 , cp 1, ••• be an orthonormal system in an
inner product space V. For eachf EV, the real series
I(/, cp;) 12 converges
and we have the inequality
I:~
00
Ll(J,cp;)l 2 ~
ll/1I 2 -
i=O
It follows that the Fourier coefficients Ct = (J, cpk) converge to O as
k -+ oo. Thus Bessel's inequality gives some control over the behavior of the
Fourier coefficients.
The next result relates completeness of a system cp0 , cp 1••• ~ to Bessel's inequality.
555
§10.2 Orthogonal Families of Functions
10.2.4 Parseval's Theorem Let V be an inner product space and 'Po, 'Pi, ...
an orthonormal system. Then '{)o, <pi, ••. is complete iff for each/ E V, we have
00
11/112 =
L
l(f,cpn}l
n=O
2•
Hence, we see that Bessel's inequality becomes an equality exactly when
completeness holds. This theorem gives many useful relations in Fourier series,
but n01mally it is not very practical for telling when a specific family <.po, '{Ji, ••. is
complete. For this, one usually uses direct techniques, which are given in §10.3.
Geometrically, Parsevars relation may be regarded as a generalized Pythagorean theorem. If g is perpendicular to h (that is, (g,h) = 0), then Ilg+
hll2 = llgll2 + 1111112, which is the Pythagorean theorem for right triangles. If
"£:0(1, '{)n)'Pn = f, then f is a sum of orthogonal vectors {/, '{)n}'{)n, and so
11/112 should equal the sum of the squa,res of the lengths of {f, t.p,.}t.pn. Since cp,. is
normalized, (/, cp,.)cp,, has square length l(f, cp,.)12, and so we should get 11/112 =
E:O I{/, cp,,)12, which is Parseval's relation. If we have an incomplete orthonormal system, then, intuitively speaking, there are some terms missing on the
right-hand side and so only an inequality prevails, namely, Bessel's inequality.
We have seen that it is natural and, indeed, obligatory to choose the Fourier
coefficients c; = (I. cp;} when expanding/= z:=:i c;cp;. There is another reason
for this choice that aids in the geomettic understanding. The constants c; for
which the length
· f- tc;cp;I
j=()
is smallest are c; = (/, cp;}, the Fourier coefficients (that is, the choice c; = (I, t,p;)
yields the best mean approximation). This is reasonable because "£7=0(/, cp;}cp;
is just the projection of/ on the space spanned by '{)o, ... , 'fJn, and the shortest
distance to a plane from a point is the perpendicular distance. See Figure 10.2-1.
The precise statement is given next.
10.2.5 Best Mean Approximation Theorem Let V be an inner product space and <po, cp1, . .. , cp,. a set of orthonormal vectors in V. Then for each
set of numbers to, ti, . .. , t,.,
n
f -
L (f. '{)1:}cpk
l:=O
Equality holds iff t1: = (f, ,p1:}.
556
Chapter 10 Fouri;er Anf.!{J~is
FIGURE 10.2-1
The geometry behind the best mean approximation
This concludes our introduction to the general theory. The rest of the chapter
is devoted to the study of the classical cases of the orthonormal families
{
einx
..,/2-i' n=0,±1, ...
}
and
{
1 sinnx cosmx
../2-i
Ji, Ji , n,m= 1,2, ... }
on [O, 271'] or [-71', 71']. The co1responding 011honormal families on other intervals
follow from this by rescaling (see Exercise 3 for this section).
We cannot overstress the fact that/= "E,~(f. t.pk)'Pk means only that the sum
converges in the mean to f and that this does not entail pointwise convergence
without some additional conditions. In the general situation (see §10.3), we
usually do have mean convergence, but to obtain pointwise or uniform convergence, we require more careful hypotheses, such as continuity or differentiability
assumptions on the function f.
10.2.6 Example Let V be an inner product space and <po, <p 1, ••• a complete
orthonormal system. Show that <p 1, <p2, .•• is not complete.
Solution If cp,, <p2, ••. were complete, we could write f =
each/EV. Take/= <po; then we would have
r;:1(f, <p;)<p; for
,x,
.Po = 1:)Po, 'r';)<p;.
i=I
But ( <po, <p;) = 0, i = I, 2, ... , and so <po = 0, which is impossible, since IIcpo 11 = I.
Hence cp 1, ip2 ,.. • is not complete. As an alternative method of solving the
557
§10.2 Ortltogonal Families of Functions
problem, observe that Parseval's relation
00
111112 =
L l(f, <p;)l
2
i=I
does not hold for/ = cpo, because the left side would be I, while the right
side would be 0. Similarly, 'PN, 'PN+I, .•• or any proper subcollection is not
complete.
•
10.2.7 Example lfcpo,cp 1, ••• is a complete orthonormal system in an inner
product space V and f is orthogonal to each cp;, then f = 0.
Solution Since the system is complete, we can write
00
1 = Eu. cp;)cp;.
i=O
By assumption, each (f, cp;) = 0, so that/= 0. (If V were a Hilbert space, the
converse would also be true; see Exercise 14 at the end of this chapter.)
•
10.2.8 Example Show that the functions
sin nx cos mx
I
,/fir'
'1ir , n,m= 1,2, ... ,
,/ii,
are ortlwnormal on [0, 21r].
Solution In effect, this problem means that
Jo{21r (
1
21r
I )2
,/fir dx = 1,
cos 2 mx
---dx= I,
0
7r
(for normalization); and (for 011hogonality)
1
1
1
2tr
o
21r
0
21r
0
l
r-c: cos mx dx = 0,
v21r
I
r,;
sin nx dx = 0,
y21f
I
- cosmxsinnxdx = 0,
1f
1
'.?1r
0
sin2 nx
--dx=I
1f
Chapter 10 Fourier Analysis
558
for all m,n ;=:: I;
1
2,,.
0
and
l
f: m1 ;
- cos mx cos m1x dx = 0, m
7r
1
2,,.
0
l
- sinnx sinn 1xdx = 0, n f: n1•
7r
Each of these relations may be velified by techniques from calculus for integrating tiigonometric functions. An easier way is to note that
-I
27r
because if n
12" . . dx e'nxe- 111"'
-
i:
Unm,
0
f: m,
1h
ei<n-m>x dx = .
I
ei<n-m)x
1(n - m)
0
lh
= 0.
0
Taking the real and imaginary pru1s of this relation for all n, m gives the desired
relations.
This method also shows that
in.rl
{ jh
n =0,±1,±2, ... }
is an orthon01mal system on [O, 21r].
•
10.2.9 Example Let f : (0, 21r]
C be such that
fo
2
"
--+
l/(x)j 2 dx < oo.
Show that
Jim
n~oo
1:hr f (x) sin nx dx.= 0
0
1
2,r
and
Jim
111-00
0
j(x) cos mx dx = 0.
•
559
Exercises/or §10.2
Solution By Example 10.2.8, the sets
{ sinnx
,.fi
I
n=l,2, ...
}
and
ft
{ cosmx
I m = 1,2, ... }
are orthononnal families. Hence, by Bessel's inequality, the Fourier coefficients
off with respect to these systems converge to zero, and so the result follows.
The reader may legitimately ask where the hypothesis
{2'1r
IIJJl2 = Jo l/(x)l 2 dx < oo
is used in this solution. This is required so that we can fonn the inner product
space V of such functions and obtain the upper bound 11!112 < oo so that the
series in Bessel's inequality will converge and thus the nth tenn will tend to
zero. We shall retum to a more careful study of these points in the course of our
proof of the completeness theorems (cf the Riemann-Lebesgue lemma).
•
Exercises for §10.2
I.
Show that any n 011honormal vectors in Rn form a complete set.
2.
Let g0 , g 1, g 2 , ••• be linearly independent vectors in an inner product space.
Inductively define
ho= go,
ho
<po= llholl''"''
n-1
hn
I
=gn - ~ (gn, <{)k)<{)k, <,?n = 11~:11'" ••
Show that <po, cp 1, <p 2 , ••• are 011hon01mal. Why must we assume that the
g's are linearly independent?
3.
a.
Suppose <po(x), <p 1(x), ... are 011hono1mal functions on (0, 211"]. Show
that the functions
t/Jn(X) =
V{ii
T 'Pn (21l"X)
-L-
are orthononnal on [0, l].
b.
Write the family obtained by modifying 1/ J'f;, (sin nx)/ ft,
(cos nu)/ ft, or, altematively, einx / /2ir to [0, l] as in a.
Chapter 10 Fourier Analysis
560
4.
5.
c.
Write the Fourier series off for the families obtained in b.
d.
Show that if the
<{)n
in a are complete, so are the t/J,,.
Assume for the moment that the functions I/ ../2ir, (sin nx)/ yfi,
(cosmx)/ ,/ii are complete on the interval [0, 21r] (this will be proved
later).
a.
Apply this to the function x to show that x = 1r
(convergence in the mean).
b.
Using the Fourier coefficients found in a, apply Parseval's relation
to show that 1r2 /6 =
1/n2 •
c.
Use the same procedure on
-
zr::,(sinnx)/n
E:,
x2
to get 1r4 /90 =
E:, 1/n4.
Prove that
2~
L., cos
kB= sin[(n + 1/2)8] _ 1
sin(B /2)
k=I
by usincr
0
6.
.
e' 9
.
.
+ e 2' 9 + ... + em 0 =
e;o(I _
ein0)
.
I - e19
Prove that the Fomier series of a sum of two functions is the term-by-term
sum of the Fourier series.
§10.3 Completeness and Convergence Theorems
This section investigates the convergence of the Fomier series of a function. The
Fourier se1ies of a function is uniquely detennined by that function, but there
is no prior guarantee· that the series converges or, if it does converge, that its
sum is the given function. The type of convergence we obtain depends on the
hypotheses we place on/. The important results are summarized in Table 10.3-l;
further convergence theorems are given in §10.4 and §10.6.
It is possible to slightly weaken the hypotheses of the pointwise convergence
theorems presented, and some of the slightly sharper hypotheses that result are
given in the optional §10.4. 1 Examples and computational techniques are given
in §10.5.
1There is a deep result of L Carleson which states that for 1/1 2 integrable, the Fourier series of
/ converges pointwisc to/, except possibly on a sci of measure zero. However, this result is beyond
the scope of this book. See Acta Math. 116 (1966), p. 135.
§10.3 Completeness and Convergence Theorems
TABLE 10.3-1
Convergence Properties of Fourier Series
Hypotheses on the function/
Convergence of Fourier series
J02
Converges in mean to f
1r
561
l/(x)l2 dx < oo
f,f' both sectionally
continuous with only
jump discontinuities
Converges pointwise (and in mean)
f continuous with /(-1r) = f(1r),
f' sectionally continuous with
only jump discontinuities
Converges uniformly (§10.6),
pointwise, and in mean to/
to
f(x+) + f(x-)
2
From now on, we deal primarily with the following two orthonormal systems
on the interval [0,21r] or [-1r,1r]:
a.
Exponential system:
n = 0, ± I, ±2, ... ;
b.
Trigonometric system:
I
sin nx cos mx
..ff;' ,Ii ' ,Ii .
n,m = 1,2, ....
These two systems are closely related. Indeed, the uigonometric system is
obtained by taking the real and imaginary pru1s of the exponential system (see
Exercise I for this section).
The Fourier se1ies of a function f : [0, 21r] --, C with respect to the exponential system is the series
where the Fomier coefficients are given by
Cn
= -I
271"
121r f(y)e-m,v. dy
0
Chapter JO Fourier Analysis
562
(we have gathered two ./fir's to give 21r for convenience and for historical and
conventional reasons).
The Fourier series of a function f with respect to the trigonometric system
is
'°"'
00
ao + ~OnCOSnX+ bnSinnX,
.
2
n=I.
where the coefficients are given by
a,.= -l
12'11' f(x)cosnxdx,
71"
n
= 0, 1,2, ...
0
and
b,,=-• 12'11' f(x)sinnxdx,
71"
n = 1,2, ....
0
The reader should review §10.1 and §10.2 if these statements are not clear.
The prutial sums for the uigonometric series and the exponential series are the
same (Exercise 1 at the end of this section). Thus, if we can prove convergence
theorems for one system, we automatically obtain theorems for the other. The
system used depends on the pruticular problem and, to some extent, personal
taste. Examples of the computational differences are given in §10.5.
The primary goal here is to give theorems that say that a function "equals" its
Fourier series. If we take equality to be convergence in the mean, then this is a
problem of completeness of the 01thono1mal system. F01tunately, and this is one
of the main theorems of the subject, the exponential and trigonometric systems
are complete. On the other hand, if we take the equality to mean pointwise or
uniform convergence, then extra conditions must be put on/. We first deal with
completeness:
10.3.1 Mean Completeness Theorem The exponential and trigonometric systems on [0,21r] (or [-1r,1r)) are complete in the space V = L2 of
functions f : [0, 21r) - C with J;'II' lf(x)l2 dx < oo (the integral may be improper).
This means that if a function f has l/1 2 integrable (that is, the function/ is
square integrable), then f equals the sum of its Fourier series in the sense of
convergence in the mean. One need not get pointwise convergence. Indeed, one
can constrnct a continuous (periodic) function f whose Fourier series diverges
at a given point. (See, for example, Widom, Drasin, and Tromba, Lectures on
Measure and Integration Theory, Van Nostrand Mathematical Studies, No. 20,
New York, 1967, p. 153).
§10.3 Completeness and Converge,ice Theorems
563
Let Sn be the nth pru1ial sum of the trigonometric Foulier selies for a realvalued function/. The intuition behind the mean completeness theorem is illustrated in Figure 10.3-1. Each Sn is a smooth function (since it is a trigonometric
polynomial), but as n -+ oo, Sn may converge to something discontinuous, as
illustrated. If the convergence is uniform, then/ must be continuous, as we saw
in Chapter 5. Thus, if f is discontinuous, we get mean convergence, but not
unifo1m convergence.
The shaded "area" • 0
or, more precisely
Sn
,f
2•
lf(x) - sn (x) l2 dx • 0
0
0
FIGURE 10.3-1
2,r
x.,
The geometlic significance of mean convergence
While the mean completeness theorem deals with a wide variety of functions,
it does not answer the question of pointwise convergence. The next theorem
considers this question. To state the theorem, we need some terminology. For
this theorem we may use either real or complex functions, but it is often enough
to consider real functions, for if f = / 1 + ifi, the Foulier series off is that of
/1 + i (that of b) (why?). Suppose that f : [0, 21r] -+ R (or [-1r, 1r] -+ R) has
a possible discontinuity at xo E [0, 21r ]. In the case where x0 = 0 or 271", this
shall mean that we are to take the function f extended to be periodic; that is,
we define f(x + 21r) = /(x). This is reasonable because the Fomier series itself
is pe1iodic. This periodic extension is illustrated for two cases in Figure l 0.3-2.
We define
f(xi) = lim /(x) = Jim /(x)
x-xO
x-+xo, x>xo
if it exists. This means that for every £ > 0, there is a 8 > 0 such that if
Ix - xol < 8 and x > Xo, then IJ(x) - f(xo>I < €. Intuitively, J(xt) means the
value of/ just to the light of xo. See Figure 10.3-3. Of course, /(x0) may not
exist; look at Figure 10.3-3b. One defines/(x0 ) in an entirely analogous way. A
discontinuity xo where both/(x0 ) and/(x0) exist is called ajump discontinuity,
and /(x0) - /(x0 ) is called the jump off at xo. The jump can be either positive
or negative, and is zero when /(xo) /(xo) f(Xo) iff / is continuous at Xo.
=
=
Chapter JO Fourier Analysis
564
f(x) = lsin (x/2)1
0
271'
271'
0
(a)
(b)
FIGURE 10.3-2
(a) Continuous at O; (b) discontinuous at 0
y
(b)
(a)
FIGURE 10.3-3
(a) Jump discontinuity; (b) /(4,) does not exist
Suppose f is differentiable on some open interval ]xo,xo + e[. Then we can
talk aboutf'(x0) andf'(x0 ) if they exist. lntuitively,J'(x0) is the slope of/ just to
the right of x0 . For instance, in Figure 10.3-2a.f' (O+) = limx-o+ (d / dx)(sin(x/2)) =
1/2, and in Figure I0.3-2b,/'(O+) = +I.
There is a slightly weaker definition of f'(x0) that is sometimes important.
The previous definition demanded that J'(x) ~xist for x > xo and that J'(x0) =
limx-x0f'(x) exist. It is easy to prove that if this is so, then
(see Exercise 39 at the end of the chapter). For the following theorem, the
existence of this limit is sufficient, and so we shall adopt it as our definition,
with f' (x0 ) similarly defined by
/'(x_) _1. {f(x0) - 1f(xo - h)} .
0
-
1m
1,~o+
l
565
§10.3 Completeness and Convergence Theorems
The proof that this second method is actually a weaker assumption is left
to the reader in the same exercise. Observe that f is differentiable at xo iff
f(:xo) =f(x0) =f(x0 ) and/'(x0) and/'(x0) exist and are equal.
The next theorem (due to Jordan) contains the principal result on pointwise
convergence.
10.3.2 Pointwise Convergence Theorem Let f : (0, 2rr] -+ IR (or
f : [-rr, rr J -+ IR) be sectionally continuous and have a jump discontinuity at
xo, and assume that /'(x0) and f'(x0 ) both exist. Then the Fourier series of
f (in either exponential or trigonometric form) evaluated at xo converges to
[/(Xo) +/(xo)l/2. In particular. if f is differentiable at Xo, the Fourier series of
f converges at xo to f(xo).
If :xo is an endpoint of the interval, then, as mentioned previously, the numbers f(x't,) and /(x0 ) are computed for the function after it is extended to be
periodic (see Figure 10.3-2). Notice also that the Fourier series does not necessruily converge to f(xo) at a jump discontinuity but converges to the average of
f(x't,) andf(x0 ). A typical example is a step function (see Figure 10.3-4).
y
/(;~o+)
/(xo+) + /(xo-)
2
f(x0 -
)
.-+~...._+-..,_
------------~--
0
X
2,r
FIGURE 10.3-4 At a discontinuity, the Fourier series converges to the average
value
The pointwise convergence theorem is useful because it gives us conditions
that are easily verified in examples and that hold in many cases of interest.
Without benefit of the pointwise convergence theorem, it could be difficult even
in simple examples to prove directly that the Fourier series converges to the
associated function.
C/zapter 10 Fourier Analysis
566
In the pointwise convergence theorem we also have mean convergence of
the Foulier seties to f by the mean -completeness theorem. However, pointwise
convergence is a more delicate and sometimes more useful condition.
10.3.3 Example Suppose f :
Show that
t
IJ(x)I'
dx ~
.t It
NY""'
2~
= 2111"
Ilo.('Ir
f(x)dx
12 +
0
Ilor
oo.
L;l Ilo.(1f f(x)cosnxdx 12
00
0
f(x)sinnxdx 12
Solution Let
V =L2
<
dxl'
n=l
00
+~
;I
J;1r lf(x)l 2 dx
[0, 21r] --+ C satisfies
1 lf(x)l
={t: [0,21r]--+ C 1111112 =
2
1r
2 dx
< oo }·
By the mean completeness theorem, we have the complete orthon01mal family
=~I
{ <p,,(x)
Thus Parseval's relation· holds:
(/, lf'n}
=10
n =0,±1,±2, .. -}-
111112 = E:_ I(!, lf'n)l2. where
00
21r
--
f(x)<pn(X) dx
and so the first equality follows.
12,..
=0
f(x)e-in.T dx
Remember that
~
211"
,
E:_ I(!, lf'n)l 2
00
means
limN-+oo E:-N l(f, <p,,)12 for the exponential functions. (That is, they are taken
in the order <po, If'± 1, /f'±2, .... Here the te1ms are positive, and so the series can
be rearranged arbitrarily, by Worked Example S.S at the end of Chapter 5.)
The second equality follows by applying the same procedure to the complete
orthonormal family
{
I
..fi;'
sinnx cosmx
..fi . ..fi ' n, m = 1, 2, ... } .
One can also delive the second equality from the first by writing e-in.T = cosnx-
i sin nx, squaling and gatheling terms, and noting that the cross.terms from n and
-n cancel. •
567
§10.3 Completeness and Convergence Theorems
10.3.4 Example For the following functions on [-71', 11'], state whether we
have mean or pointwise convergence of the Fourier series and what the series
converges to at xo = 0.
a.
f(x)
={ 2-2
b.
f(x)
={ Xx+l
c.
f(x) = sinx.
d.
f(x)
={
I +x
x:$0
x> 0.
X
:$ 1
X
>
x + sin(l / x)
1.
X
X
:$ 0
> 0.
Solution The graphs of these functions are given in Figure 10.3-5. Each
function is piecewise continuous, and the discontinuities are jump discontinuities.
This is obvious except perhaps for d. There, f(x) = x sin( I/x) ---+ 0 as x ---+ 0,
since Ix sin(l / x)I ~ lxl. so that f(O+) = 0.
y
y
/:
'
fl
'
I
X
(a)
I
/
'
X
(b)
X
(c)
I
X
(d)
FIGURE 10.3-5 Determine the convergence properties of the·Fouiier series of
these functions
Chapter 10 Fourier Analysis
568
At 0, f' CO+) and f' co-) exist in cases a, b, and c. All of these are fairly
obvious. For instance, in a, f (x) = 2 for x > 0, and so limx--+o+ J' Cx) exists and
equals 0. In case d, this is not true. Here, for h > 0, we have
f(O +h) - f(O+)
. I
h
= smh,
which does not converge as h --+ 0. Thus the pointwise convergence theorem
does not apply in this case. In each case, however, we do have mean convergence
by the mean completeness theorem. At x = 0, the Fourier series converges in a
to O = [/(0+)+/(0-))/2, and in b to 0, and inc to 0, and ind our theorems fail.
(One can show the convergence of the Fourier series in d to I /2 by a direct
analysis.) •
10.3.5 Example Find an example of a function f such that the Fourier
series off converges poinrwise and in the mean to f, but does not converge
uniformly.
Solution Let
f
={
0
I /2
l
-11" < X < 0
if x = 0 or x =Ti
0<X<11'.
The discontinuities off are jump discontinuities (see Figure 10.3-6). From the
mean completeness theorem, the Fouder se1ies off converges to f in mean,
and by the pointwise convergence theorem, it converges pointwise, since f Cx0 ) =
[/(xt) +/(x0)]/2 at each point.
y
f
•
•
•
-71'
'II'
•
X
FIGURE 10.3-6 This function has a Fourier series that converges in the mean
and pointwise, but not uniformly
However, the Fourier seiies cannot unifo1mly converge to f, because each
--+ f(x) uniformly, f would be continuous, which
is not the case. •
sn(X) is continuous and if s,,(x)
569
Exercises for §10.3
Exercises for §10.3
1.
a.
Show that the nth partial sum of the trigonometric Fomier series
of a (real or complex) function equals the nth prutial sum of the
exponential series.
b.
Write the corresponding series on (-11',11'].
c.
Show that if, on [-11',11'],/ is even (that is,/(x) =/(-x)), then in the
trigonomettic Fourier series, all bn = 0. The series is then called the
cosine series.
d.
Repeat question c for/ odd; that is, if f(-x) = -f(x), show that all
an = 0. The series is then called the sine series.
2.
For f : [0, 211'] -+ IR, show that f is continuous at zero (in the sense
off being periodic) iff /(0) = /(211') and f _is continuous in the usual
sense at both points O and 271" in [0, 211' ]-that is, limx-o+ /(x) = /(0) and
lim,.._ 2,..- f(x) = /(211').
3.
Suppose f : [0, l]
j
I
-+
C satisfies
L
lJ(x)l 2 dx = -}
0
00
na:-oo
J; lf(x)l2 dx < oo. Show that
1
I
2
, f(x)e 2""inx/l dx 1
0
4.
Use the theorems of this section to justify your manipulations in Exercise 4, §10.2.
5.
For each of the following functions on [-rr, rr] detennine whether the
Fourier se1ies converges pointwise or in mean, and what the pointwise
limit is if it exists.
a.
f(x) = x' (consider all possible values of n: n = ... , -3, -2, -1,0,
1,2, ... ).
b.
< 0 for some k E IR.
0
X
,u
X ~
f(x) = { ,__
0
570
Chapter 10 Fourier Analys
c.
f(x) = tanx.
d.
f(x) = e-x2.
e.
-l/x2
f(x) = { ~
x>O
X
~Q.
§10.4 Functions of Bounded Variation and Fejer
Theory (Optional)
There is a theorem that is similar to the pointwise convergence theorem, but tha1
holds under more general conditions and that also gives a criterion for uniform
convergence. We state this theorem without proof (the proof is similar to that of
the pointwise convergence theorem-it is just a little more intricate). We shall
be content to prove a weaker version in §10.6 and to prove a related theorem
of Fejer.
To understand the theorem, the notion of a function of bounded variation is
needed. For f : [a, b] - JR, we say that f is of bounded variation if there is a
number M such that for all partitions a = xo < x 1 < ••• < Xn = b of [a, b],
II
L lf(xk) - f(xk-ill ~ M.
k=I
Roughly, saying that f is of bounded variation means that the graph off has
finite arc length. One can show that a function is of bounded variation iff it is
the difference of two bounded monotone functions. 2 It follows that if f is of
bounded variation, then its discontinuities are all jump discontinuities and are
countable in number, so that f(x0) and f (x0) are defined.
10.4.1 Dirichlet-Jordan Theorem
Let f : [0, 211') -
JR be a bounded
function.
i.
If f is of bounded variation on an interval [xo - £,xo + e] (for some
e > 0) about xo, then the Fourier series off evaluated at xo converges to
[f(xo) +f(xo)]/2.
ii.
If f
is continuous, f(O) = f(?11'i and f is of bounqed variation, then the
Fourier series off converges uniformly to f.
is of bounded variation, set v(x) = sup{}:Z~, IJ(xk) - f(xt-i>I I a= xo ~ xi ::=: · · · ~
= p - q, where p = v + f /2 and q = v - f /2. One checks that P
and q are increasing. The converse is easy to verify.
2 1f /
xn
=x}, the variation of J. Write f
§10.4 Functions of Bounded Variation and Fejer Theory (Optional)
511
Both the pointwise convergence theorem and the Dirichlet-Jordan theorem
give sufficient conditions for the Fourier series to converge. Exercise 34 at
the end of the chapter gives an example to show that the conditions are not
necessary. Useful necessary and sufficient conditions are not known.
As we have remarked, the Fourier series of a continuous function need not
converge pointwise. By the Dirichlet-Jordan theorem, such a function cannot
be of bounded variation. Fejer's theory covers this case by weakening the
requirement of pointwise convergence of the series to Cesaro summability of
the series. Recall from §5.10 that a sequence a,, a2 , •.. is said to converge in
the sense of Cesaro, or (C, 1), if Un= (a, + · • • + an)/n converges. If an - x,
then <Tn - x, but not necessruily conversely. For series, this criterion is applied
to the partial sums. In 1904 Fejer proved the remarkable fact that alt~ough the
Fourier se1ies of a continuous function need not converge pointwise, it is always
(C, 1) convergent.
10.4.2 Fejer's Theorem Assume that f is piecewise continuous on [O, 271']
and that f(xt) and f(x 0 ) exist. Then the Fourier series off converges (C, 1) at
Xo to [f(xt) + f(x0 )]/2. If f is continuous and f(0) = f(211'), the Fourier series
converges (C, 1) uniformly to f.
Note that no assumption of bounded variation or differentiability is required.
When one considers "distributions" or "generalized functions," such as the
Dirac delta function (§8.9), Fourier series still make sense when suitably interpreted, and every distribution has a convergent Fourier series (convergent in an
approp1iate sense). These points regarding convergence are quite useful in practice, but space does not allow a systematic u·eatment of them here. However,
we can give an example.
10.4.3 Example Compute the Fourier series of the delta junction 6 on
[-71', 11'].
Solution Recall that this function has the defining prope1ty:
1-:f(x?f>(x - a)dx =f(a).
Therefore,
1"
li(x)e_;,.,, dx = l,
-,r
Chapter JO Fourier Analys,
572
and so the Fourier series of 8 is
00
I:
n=-oo
Of course, this does not converge at x = 0, but we do not expect it to, since 8(0
is undefined. What is true is that
o<x> =
n=-oo
in the sense that convergence holds under the integral sign; that is, for any C1
functionf,
f(O) =
1
,r
-,r
L
00
b(x)f(x) dx =
n=-oo
1"
einx
.,J<x) dx
2
-1r
The validity of this follows from the pointwise convergence theorem:
for each y. (Since the sum is from -oo to +oo, we can replace n by -n.) The
situation for a general distribution and the proof of convergence of its Fourier
series is analogous. That is, if T is a distribution, then,
where
On
= T(e-inx).
•
Exercises for §10.4
1.
Compute the Fourier series of {>', the derivative of the delta function.
2.
Prove that
E:-oo eikx is (C, l) summable to Ofor x not a multiple of 211"-
573
§10.5 Computation of Fourier Series
§10.5 Computation of Fourier Series
In this section we are concerned with specific examples of Fourier series and
methods to compute them. Included in our discussion is an interesting phenomenon that occurs in the behavior of a Fourier series at a jump discontinuity
known as the Gibbs phenomenon.
The trigonometric and exponential forms of Fourier series are equivalent,
as we have seen (Exercise I, §10.3). The vruious forms of Fourier series and
their convergence prope11ies are summruized in Tables 10.5-1 and 10.5-2. The
functions can be real or complex, but we will work with real functions for
simplicity. There ru·e several comments to be made on these formulas. The first
two forms (Table 10.5-1) should be self-explanatory. The Fourier sine series
arises when/ is odd, because then we have/(-x) = -f(x) and hence
J,r f(x)cosnxdx = -I J,. f(-x)cos(-nx)d(-x)
= --I J,r f(x)cosnxd(-x) = -an,
an= -I
7r
7r
-,r
7r
-,r
-,r
so that On = 0.
Similru·ly, for f even, the Fomier series reduces to the cosine series. See
Figure 10.5-1.
y
y
periodic
extension
f;,
~\
,,
I , _•
':\,off
.
'
A
.._ - I
I
I
X
-'IT
(a) f odd
FIGURE 10.5-1
-x
X
1T
(b) / even
(a)/ is odd; (b) / is even
For the interval [-/, l], we replace 011honormal functions 'Pk on [ -1r. 1r] by
. ctions 1Pk(X) = Rt,cpk(rrx/1), whi_ch are 011honormal on [-l,l]. This change
TABLE 10.5-1 Various Forms of Fourier Series
VI
...:i
~
The function f
Fourier series
Coefficients
Exponential
Fourier series
f defined on [O, 21r]
Loon=-oo n /n.T
Cn -..=
Trigonometric
Fourier series
/ defined on [-r.. 1r]
(or (0. 2r.])
C
(or [-1r.1r])
2I,,
J2" /(Xe) -inx dx
(or
, +
L:, (an cos
nx + bn sin nx)
f is odd: /(-x) = -/(x)
Fourier
cosine series
f defined on (-ir.ir] and
f is even: /(-x) = /(:c)
1°" + E:, Gn cos
Exponential
series on [-/. /]
f defined on [-/, /]
Loon=-oo an /nx.,,/1
Trigonometric
series on [-/. /J
f defined on [-/, /]
~
Half-interval
cosine series
f defined on [0, {)
Half-interval
sine series
f defined on (0, I]
/ defined on [- ir. 1r] and
L;::,b,,sin nx
r.
>
¾r,J(x)cos nxdx
bn = ¾f~.,J(x)sin nxdx
On=
(or
Fourier
sine series
0
bn =
t"
O
n =0, 1, ...
n = 1,2, ...
)
¾f~.,J(x)sin
nxdx
n = 1,2, ...
= ¾f0" f(x) sin nx dx
an=
nx
¾f~J(x)cos
nxdx
n = 1,2, ...
¾f 0" f(x) cos nx dx
Cn = f, f~,J(x)e-inxr/l dx
=
2
+ Loo
n=I (OnCOS nir:c
I
+ bn sin
!!n
2
+
Eoon=I on
COS
Loo
b · nwx
n=I nSin T
In1rx
t f ,J(x) cos T dx
bn = t t,J(x) sin T dx
. I
On = f f0 /(:c) cos T dx
On =
n;.r)
bn =
j
J: f(x) sin T
n =0, l ....
n = 1,2, ...
n =0, I, ...
Q
-§
.,~
....
~
~
s::
dx
n = 1,2•...
~-.,
?:
§10.S Computation of Fourier Series
575
TABLE 10.5-2 Convergence Properties
fo2,. lf(x)l2 dx
Properties ofJ
Convergence of Fourier series
< oo
Converges in mean to f
f has a jump discontinuity at xo and
J'(x~).f (x0 ) exist. If x0 is an endpoint,
we regard/ as extended so that f becomes
periodic (see the text for the half-interval
forms).
Converges pointwise at xo to
t<xo> +1<x;>
2
f continuous with /(-'It")= J('rr),
/' sectionally continuous with only
jump discontinuities
Converges uniformly to f
of scale preserves the convergence properties. The reader should write down
the sine and cosine series (for/ odd or even) on a general interval [-l, /].
The half-interval fonnulas are obtained as follows: For the cosine series,
extend/ to [-1,l] by defining
f(-x) = J(x).
Then/ becomes even, and so it has a cosine series. See Figure 10.5-2. For
convergence at x0 = 0 we must check this extended function, not the original
one. If /(O•) exists, then the even extended function evidently has no jump at 0.
Similarly, we do not get a jump at I or -/. Thus the usual convergence criterion
applies without modification for the cosine selies.
y
X
-JI
FIGURE 10.5-2
JI
Extending/ to be even
The half-interval sine seties is similar. On [-/,0[, define/ by /(-x) = -/(x),
for O < x $ I, so thatf becomes odd and so has a sine series for its Fourier series.
576
Chapter 10 Fourier Analysis
See Figure 10.5-3. In this case, at the point O the Fourier series is always zero
(sin O = 0). Thus, at 0, a jump discontinuity is usually introduced, but the Fourier
series is zero at these points. To ensure continuity of the extended/, we would
have to impose the conditions that/ be continuous and/(0) =/(/) = 0. Thus the
convergence crite1ia apply to the half-interval sine series without modification
if we keep in mind that at O and I the convergence is to zero.
y
"I
,,,
A
._.,I
,......
I
--1------1.------+I
'
I
I
-£ I
y
'"
X
£1
-,
'
FIGURE 10.5-3 Extending/ to be odd
From the general.theory, we know that Parseval's relation holds for each f
with
that is,
where the en are the Fourier coefficients in exponential fonn. Care must be taken,
because cn,an,bn of Table 10.5-1 are not the Fourier coefficients in the previous
sense, since we have gathered factors of ../Fi, Ji for traditional reasons. But if
we remember this, Parseval's relation is easy to find. The results are tabulated
in Table 10.5-3.
If we know the Fourier series of /(x), say, on [-1r, 1r], then we can get an
expansion for the function g(x) = f~ 1J(y)dy using the following theorem.
10.5.1 Integration Theorem
Suppose
f ~" l/(x)l 2 dx < oo
Fourier series
~+
00
L (an cos nx + bn sin nx).
n=I
and f has
577
§10.5 Computation of Fourier Series
TABLE 10.5-3 Parseval's Relation
Type of series
Parseval's relation
Exponential series
1
2ir
Trigonometric series
;I
Sine series
.!.r.
12"
O
00
IJ(x)l2 dx
1"
_,. l/(x)j 2 dx
2
I
Half-interval cosine
series
II
ao
2
IJ(x)I di:=
12")
0
2
00
11
00
11
2
n=-oo
2
ao
00
2
2
lf(x)f di:= 22 + L(an +bn)
•I
2
ao2
00
2
1/(x)I dx= 2 + Lan
0
Half-interval sine
series
or
n=l
-1
2
7
(
1/(x)I dx = 22 +Lan
-I
~igonometric series
on[-/,/]
_,.
n=l
-w
Exponential series
~n [-/./]
1~)
b~
dx
I
or
00
1" IJ<x)i2 =f
1"
I 11
L lcnl
;
(
= ; 02 +~(a!+ b!)
-ff'
Cosine series
=n ~ lcnl 2
•I
711 lf(x)i2dx= Lb!
00
n=l
O
!7'en, l,etting g(x) = J~ 1J(y)dy, we have
(,"
g(x)= ao(x2+1r) +
t
(an
[x,. cosnydy+bn 1-: sinnydy)
~ { -smnx+a11 .
bn ((-ll-cos nx) }
= ao(X2+ tr) +~
n=I
n
n
the convergence is uniform for -1r::; x::; 1r.
578
Chapter 10 Fourier Analysis
Note that this expansion is not the Fourier sedes of g, but it does give the
Fourier expansion of g(x) - aoX/2. Also, the expression is obtained simply by
integrating the Fomier sedes off term by term. Similarly, any of the series in
Table 10.5-1 may be integrated term by term to give a uniformly convergent
series (the proof is the same in each case). This is useful when the constants
an and bn have already been computed for J. To get the Fourier series for g, we
substitute the series for x and then gather terms. (The series for x is given later
in this section. See also Worked Example 10.3 at the end of the chapter.)
As in §5.3, differentiation of Fourier series requires more care than integration. It will be treated in §10.6.
In Table 10.5-4 are assembled some of the common Fourier expansions. In
using this table, one should keep in mind that the Fomier series is linear. That is,
(Fomier series of af + bg) = a(Fomier se1ies off) + b(Fomier series of g ).
One should also remember that if one has found an expansion in t1igonometric
functions, it must be the Fomier series, since it is unique.
Theorem 10.5.1 can be used to build fmther series successively by integrations, for example, x, x2 , x3, .... Also, note that if/ is modified at a finite number
(or even a countable number) of points, the Fomier se1ies is unchanged (why?).
Specific illustrations will be given sho11ly. In Table 10.5-4, intervals of length
21r and 1r are presented for convenience. These can be changed to intervals of
length 2/ and I by introducing constants and new variables as indicated in Table
10.5-1. Some further expansions are found in the exercises and examples.
In these formulas, care should be taken with regard to the domain. For
example,/(x) = x on [0,21r] is quite different from/(x) = x on [-1r,1r] as a
periodic function (Figure 10.5-4).
,,
I
_,,-
,,
I ,-"'
,,1I
,-
I
I
,,
0
,, ,,
,;"
,,-1
I
I
,,,I
211'
v"
,,1
I
,,
(b)
(a)
FIGURE 10.5-4 (a)f(x) =x on [0,211']; (b)/(x) =x on [-1r,1r]
Of course, on ]O, 1r[ the functions pointwise equal their series. A comparison
of these series can le~d to interesting identities. For the function x, for example,
we deduce that
00 sin nx
00 ( - l)n+I
x=1r-2I:-- =2I:---sinnx,
n
n
n=I
n=I
TABLE 10.5-4
Function
1 J(x)
={
-11'
f:
! +~
2
<.O'>
e:g
Valid pointwise
on the interval
Series
6 0$x$r.
Sx < o
...
Some Fourier and Related Series
r.
sin[(2n - l)x]
2n-l
n;J
l
l
00
) -
n
1r.
~
r.[: series is
~
at x
=0,
1r. -11'
la /(x) = I. 0 $ x $
1r
~
r.
f:
11=1
~
;::
n;J
sin[(2n - l)x]
2n - l
~-.,
JO, r.[: series is O at x = 0, x = 11'
(half-interval sine series)
l (half-interval cosine series)
2
L (-1)"+1 sin
--n-
nx
~
~-...
[0, 7r)
00
2/(x)=x
i·
~
sin nx
=-+-I:0-<-'>>-2
n
71'
I
J-
1r,
ir[: series is O at x
=
71",
x = -71"
n;J
L sin
00
71"-
2
nx
--
ft;)
~
_i
2
1r
n
L cos[(2n -
JO, 2ir[: series is
1r
at x = O. x = 2ir
00
n;J
l)x]
(2n - 1)2
[0, 71"]
(half-interval cosine series)
u,
-.J
\0
VI
00
0
TABLE 10.5-4 Continued.
0
2a f(x) = { x
-'Tr< X < 0
O :::; :; :::; 1r
1r
2 ~ cos[(2n - l)x]
L.,
(2n - 1)2
4- ;
-
n=I
2
3 f(x)
=x2
27rX- -21r + 4
3
~ (-1)" .
L., -n- sm nx
]- 1r,1r[; series is -rr/2 at x = -rr,x = --rr
n=I
I: -cos-nx00
00
~,r 2 + 4 ~ (cos
3
L.,
n2
n=I
[0, 2,r)
n2
n=I
nx _,rsmn nx)
•
,r2
00
(-1)"
- +4 ~ --cos nx
3
L., n2
JO, 2-rr[; series is 21r 2 at 0, 2,r
[-?r,71']
n=I
,rx _
!
~ sin[(2n - l)x]
,r L.,
(2n - 1)3
[0, ,r]
n=I
~ (27r(-t)"• 1
L.,
4 l - (-1)") .
-~n-'-- - ;
n3
sm nx
[0, ,r[: series is 0 at x = ,r
n=I
n=l
(half-interval cosine series)
\
~
s::
.,~::i
00
2
4 ~ cos 2nx
;-;L.,4n2 - l
....
~
(half-interval sine series)
4 f(x) = sin x
Q
-El
~
[0, ,r[
;i...
s
...t;·
~
TABLE 10.5-4
....
(.()')
Continued.
~
I.Ii
-
4a f (x) = { ~in x
0:5x:51r
1 I.
2 f cos 21U'
-+-smx----,r :5 X < 0 ,r
2
1r
4n 2 - l
~
[-1r,,r]
~
all of JR
;:
11=1
4b /(x) = jsinxl
2
4fcos21U'
----1r
r.
4n 2 - I
~-~
~
n=I
5 /(X) = COS
X
_!
1r
L nsm
21U'
4n
I
00
•
]O, 1r [ ; series is O at x = 0, x = 1r
2 -
n=I
6 f(x) = eX
f: (-
1r
-oo
It
---e
I - in
inx
:!.
~
~
(half-interval sine series)
sinh 1r
---
~
s:::
~] - r.. ir[;
series is cosh
1r
at
1r. -1r
Vt
00
Chapter 10 Fourier Analysis
582
and so we recover la in Table 10.5-4:
f
n=I
sin[(2n - l)x] = ~
2n - 1
4
for 0 < x < 1r. However, outside the interval ]0, 7r[ they will not agree; see
Figure 10.5-5. We have sketched roughly what the two series look like up to
the nth tenn.
(a)
(b)
FIGURE 10.5-5 The Fomier series of (a) f(x) = x on [0, 21r] and (b) on [-11", 7r J
We now tum to the Gibbs phenomenon, which occurs when f has a jump
discontinuity. (It is named after J. Gibbs, a mathematical physicist and physical
chemist who discovered it. Gibbs is usually credited with developing vector
notation around 1880.) The idea is illustrated in Figure 10.5-6. This shows that
if s,. is the nth partial sum of the Fourier series, then the maxima and minima of
s,. near the jump differ by more than the jump off, and this excess persists as
n -+ oo. Roughly, the Fourier se1ies "overshoots" the jump, and the overshoot
this is that as n -+ oo, s11 (x) tends
continues to the limit. Another way of saying
'.../
to approximate a ve11ical line longer than the jump.
We will be content with treating a special case of the jump discontinuity for
which we can dete1mine the overshoot exactly.
10.5.2 Gibbs Phenomenon Consider
f(x) = { a
b
and suppose a
-1r $ x < 0
0$x$1r,
< b. let s (x) be the nth partial sum of the trigonometric Fourier
11
§10.5 Computation of Fourier Series
583
f
FIGURE 10.5-6 The Gibbs' phenomenon
series. Then the maximum of Sn occurs at 1r /2n and the minimum at
. S11 ( 1r)
(b -a)
n~'!
2n = - 2 -
(2 Jor
;
-1r
/2n and
sint dt + 1) + b ~ (b - a)(0.089) + b.
-t-
Similarly,
.
hm
n-.oo
Sn
( - -1r) =
2n
(b--a)
- (- -211r -sint dt + 1) + a
2
7r
o
t
~
a - (b - a)(0.089),
and the difference of these limits is
(b-a)(
- 2-
=
4 f"sint
-;
-t-dt+ 1) ~ (b - a)(l.l79).
Jo
=
For a -1 and b 1, this is illustrated in Figure 10.5-7. Thus the overshoot
of the maxima and minima is about 9 percent of the jump inf for each.
10.5.3 Example Verify formula I of Table
10.5-4.
Solution The se1ies is obtained by evaluating the Fomier coefficients by
direct integration on [-1r,1r] (see Table 10.5-1). We obtain
an = -1
7r
1"
o
cos nx dx =
{
01 n =
_ 01 2
n- • , ...
Chapter JO Fourier Analysi.
584
y
1.179
f
X
-1.179
FIGURE 10.5-7
and
bn
= _!_
-rr
The Gibbs' phenomenon for a step function
r sinnxdx= _!_ (- cosnx)I"
=_!_(I -(-It).
n
n
Jo
o
-rr
-rr
which equals 2/-rrn if n is odd and is zero if n is even. This establishes formula 1.
.
•
10.5.4 Example Use the tables to find the series for the function
-1
g(x) = { 1
-7r
<X <0
0 $; $ -rr .
Solution Let/ be defined as in fmmula 1 of Table 10.5-4. Then
g(x) = 2/(x) - I,
and so the expansion for g is
2
(!
2
+ ~ ~ sin[(2n - l)x]) _ 1 =
-rr~
n=I
2n-1
i ~ sin[(2n -rr~
n=I
l)x],
2n-l
since the Fourier expansion of I on [--rr, -rr] is 1. (One could also obtain this
directly.)
•
§10.5 Computation of Fourier Series
585
The half-interval sine expansion of I is not 1 itself but is the expansion of
g in Example 10.5.4 (why?).
10.5.5 Example For each O < x < 1r, prove that
oo
2
oo (-l)n+l
~ - ~-?(I - (-l)'')cosnx = 2 ~ - - - sinnx.
2 ~1rn~
n
n=l
n=l
Solution The left side is the cosine series for f(x) = x on [0, 1r ], while the
right side is the sine se1ies. For each O < x < 1r, we have convergence to the
value x, and so we get the desired identity. (At ±1r, the right side is zero; what
is the left side?) •
10.5.6 Example Establish the first formula in item 2 of Table 10.5-4 for
f(x) = x and state how one obtains the formulas for x2 •
Solution Since f(x) = x is odd, we use the sine selies. Then
21"
b,, = -
7f
xsinnxdx.
0
Integrating by parts gives
bn =
~
1: + ~ 1" co:nx dx
( xc:nx)
21r(-l)n+l
2
= --'----'- + - . 0.
1f
n
1f
Hence the series is
oo
.
00
Lb,, smnx = 2 L
n=I
11=1
(-
l)n+l .
- - - sm nx.
n
The series for x2 is obtained as follows: The first fonnula is the integral of
the series for x on [0, 21r] (with a factor of 2, since
y dy = x2 /2). The term
21r2 /3 comes from the cosine te1m at 0, using
1 1/n 2 = 1r2/6 (Exercise 4,
I:.:
J;
§I0.2). Here Theorem 10.5.l is used. The second formula uses the first and
the expansion for x on ]0, 21r[ from 2 of Table I0.5-4. (The first two formulas
can also be found directly.) The third formula is the Fourier series (= cosine
~eries, in this case) for x2. The remaining formulas are obtained by taJcing the
Integral of the cosine expansion of x on [O, 1r], and using the sine expansion of
x on ]O, 1r[. •
586
Chapter JO Fourier Analysis
10.5.7 Example Find the Fourier series on [-1r, 7r] for
f(x) = { 0~
.x-
-71'
'.5 X '.5 0
0'.5x'.51r.
Solution If we integrate function 2a of Table 10.5-4, we get 1/2 of the f
given here. Hence, by Theorem 10.5.1,
!f
71' 2 _
)= 1rx
4 + 4
2 (x
7r
- L lx
~1..~
_
n=I
oc
11=1
-1r
= 1l"X +
71' 2 _
4
~
4
,,.
cos[(2n - l)x] d
(2n - I )2
x
(-It
--sinnxdx
n
~
1r
f
n=I
sin[(2n - l)x]
(2n - 1) 3
+ ~(-I)" (co\nx) _
~
n-
11=1
(-!)". (-lt.
n-
Inse11ing the first series for f(x) = x from 2 gives
1r ( 2 ~ (- lt+ 1 •
~ 2
2 ~ sin[(2n - l)x]
~
n
sm nx 4 - ;;: ~ (2n - 1)3
4
n=I
n=I
;:- (-It cosnx _ ~ _!__
2
~ 2·
-z.....
n=I
n
n=I
n
This is the desired series but in a slightly awkward form. Since (by Exercise 4, §10.2)
1 I/n2 = 1r 2 /6, the resulting series is
z::::
f(x) = 2 (
1r2
00
12 + ~
{
[1r(-1)11+ 1
2n
-
I -
(-1)"]
1rn 3
sinnx+
Exercises for §10.5
1.
Establish the following in Table 10.5-4:
a.
Formulas 2, 2a
b.
Formulas 4, 4a, 4b
(-It
})
-;;rcosnx
.
•
§10.6 Further Convergence Theorems
2.
587
Establish the following:
I .
~ (-lln sinnx
xcosx=- 2 smx+2L...,
n2 _ 1 , -1r<x<1r.
n=2
x2.
3.
Find the half-inte1val sine and cosine series for
4.
Compute the Fomier seties on [-1r, 1r] for each of the following functions:
a.
IO
f (x) = { -11
b.
/(x)
C.
j(x) = x2
X
S.
x>O
x
<0
X
>0
= { -x2 x < O
+X +3
Compute the jump in the Gibbs phenomenon for the function
X > 0
-4 x< 0
f(x) = { 8
on the inte1val [-1r,1r].
6.
By conside,ing
f x)-{ n/4
(
-
O$x$1r
-1r:s;x<O
-1r/4
and the point x = 1r /2, prove Leibniz' formula:
I
I
I
1r
1--+---+ .. ·=-
3
5
7
4.
§10.6 Further Convergence Theorems
In this section, we give some additional convergence theorems concerned mainly
with unifonn convergence, differentiability, and integration of Fourier series. We
have already stated in §10.4 that if f is continuous and of bounded variation,
then the Fourier seiies off converges unifonnly to f. Let us now give a slightly
Weaker version that is easier to prove and is almost as good in practice. The
reader may wish to review the notion of unifo1m convergence from Chapter 5.
For example, in Figure 10.5-6 why is s,, not unifonnly convergent to f?
588
Chapter IO Fourier Analysii
10.6.1 Uniform Convergence Theorem
Suppose f is continuous on
[-11", 1r], f(-1r) = f(1r), and f' is sectionally continuous with jump discontinuities. Then the (trigonometric or exponential) Fourier series off converges to f
absolutely and uniformly. A similar statement holds for f on [0, 21r ].
In particular, this implies that the Fourier series converges in the mean and
pointwise, consistent with what we already learned in §10.3.
For example, consider f(x) = lxl on [-11", 1r]. Here the conditions of Theorem
10.6.1 are satisfied (but they are not satisfied on [0, 271"], since this is not the
same function), and so the Fomier seiies of/, namely,
~
2
_f
~
1r
11=1
cos[(2n - l)x]
(2n - 1)2 '
converges unifonnly. See Figure I 0.6-1.
FIGURE 10.6-1
The Fourier se1ies of this function converges unifonnly
Thus, by the definition of uniform convergence, there is, for every e:
N such that n 2: N implies
lxl _ (~ _
2
~ ~ cos[(2n 1r L,
I
I
l)x]) <
(2n - 1)2
for all XE [-7r, 7r].
One might think that if
00
f(x) = ;o +
L (a,, cos nx + b,. sin nx),
n=l
then
00
J'(x) = 1)<-nan)sinnx+nb,,cosnx}
n=I
€
> 0, an
§10.6 Furtlzer Convergence Tlzeorems
589
at each point where/' (x) exists. Unfo11unately, this is not true. For example, let
f (x) = { I
-1
Then
/
O< X
::; 1T
-1r:'.5x:'.50.
) 4 ~ . [(2n - l)x]
(x = ; L-, sm (2n - I) '
n=I
1::
so that for x > 0 we would expect O = (4/1r)
1 cos[(2n- l)x]. But this series
does not converge, since cos[(2n - l)x] does not converge to 0. (To make sense
out of this, distribution theory can be used.)
To get a differentiation theorem, one naturally thinks of using the techniques
from Chapter 5. However, we can get a better theorem in this case by arguing
directly. The result is as follows.
10.6.2 Differentiation Theorem let f be continuous on [-11", 11"], let
f(-1r) = f(1T), and let J' be sectionally continuous with jump discontinuities.
Suppose J" exists at x E [-1r, 1r]. Then the Fourier series/or
00
f(x) =
;o + L
(an cos nx + bn sin nx)
11=1
may be differentiated term by term at x to give
00
J' (x) = L (-na,, sin nx + nb,. cos nx).
11=1
Furthermore, this is the Fourier series off'.
As in Chapter 5, one must be careful when differentiating series; certain
conditions must hold to justify the operations. The result should be compared
with the integration theorem (10.5.1).
10.6.3 Example Consider f(x) = lxl, where x E [-1r, 11"[, which satisfies the
conditions of the differentiation theorem at each x 'I- 0. It has the Fourier series
~
2
_ i ~ cos[(2n 1T L-,
I
Hence
J'(x) = {
I
-1
l)x]
(2n - 1)2
•
Chapter 10 Fourier Analysis
590
has the Fourier series
if
1r
n=I
which agrees with what we know.
sin[(2n - l)x]
(2n - I)
'
•
10.6.4 Example Give the version of the uniform convergence theorem that
is valid on [-l, l].
Solution We want to show that if/ is continuous on [-l, l],f(-1) = f(l), and
f'(x) is sectionally continuous with jump discontinuities, then the Fourier series
'°" (
00
ao + ~
2
a 11
. n1rx)
cos -mrx
1- + b11 sm - 1-
11=1
converges unifonnly and absolutely to/, where
On
l {'
n7rX
J_/(x) cos - 1- dx
=l
and
b,. = lI
j' f sm. - 1- dx
(x)
n1rx
-I
(see Table 10.5-1). The proof could be accomplished following the 'method of
proof of the uniform convergence theorem, but we can also deduce the result
directly from this theorem as follows: Let g(x): [-1r,1r]-+ IR be defined by
g(x) = f(lx/1r). Then
a11
1j,,. ('Y)
=ll j'_/<x) cos -n1rx
1- dx =; _/ ; cos ny dy
(using x = ly/1r). Thus a11 is also the Fourier coefficient of g, and similarly
for b11 • Since g satisfies the conditions of the uniform convergence theorem, we
find that
00
0; + ~)011 cos ny + b11 sin ny)
11=!
converges uniformly and absolutely tog on [-1r,1r]. Replacing y by 1rx/l, we
see that the same is true of/.
•
591
§10.6 Further Convergence Theorems
10.6.5 Example For each of the following functions, explain whether the
Fourier series converges in the mean, pointwise, or uniformly. Determine if we
can differentiate the Fourier series.
a.
f: [0,211']-+ IR, where f(x) = 1/n if 1/(n + 1) :5 x < 1/n andf(x) = 1 if
1/2 :5 x :5 211', where n = 1,2, ....
b.
f:
c.
f: (-11', 11']-+ IR. where f(x) = x2 + 1 if -11' :5 x < 0 and f(x) = x + 1 if
0 < X :$ 11'.
lxl-
(-11',71']-+ IR, wheref(x) = 11' -
Solution The three functions are sketched in Figure 10.6-2. In all three cases,
f is bounded and square integrable (the function in a is integrable because its
discontinuities fo1m a countable set; see Chapter 8). Hence the Fourier series
converges in the mean in all cases.
·
y
y
y
f
__ ..J
_.....µ...._. ________..~---"
'
I I I
432
(a)
"
2"
(b)
f
.,
..... ... _
-"
(c)
FIGURE 10.6-2 Determining the convergence prope1ties of the Fourier series
of these functions
The Fourier series in a converges pointwise to the function midway between the jumps at a discontinuity, and to 1/2 at the origin, by t~e pointwise
convergence theorem.
In cases a and c, the convergence is not unifonn, because/ is not continuous
(for continuity at the endpoints, one must look at the periodic extension; then c
develops a discontinuity).
The function in b has a unifo1mly convergent Fourier series, since it satisfies
the conditions of the unifonn convergence theorem.
Chapter 10 Fourier Analysis
592
The Fourier sedes of c converges to f at each x such that -1r
at -1r and 1r it converges to
< x < 1r, and
1
I
+-1r) + 1.
2[f(-1r)
+f(1r)] = 2((7!"2
+ 1) + ('ll" + 1)] = (11'2
-2
Only the series of b may be differentiated, to give the Fourier series of
/(x)
={ 1
-1
which converges to f' for x
~
-7!"
5
X
50
0<x51r
0 and to O at x = 0.
•
Exercises for §10.6
For each of Exercises 1-3, determine what type of convergence the Fourier
series will have and if we can differentiate the series.
1.
f(x) = x2 on [-1r, 11'].
2.
f(x) = 1r -
3.
3 if - 7l" 5 X < -1/2
{
f(x) =
0 if - 1/2 5 x < I /2
3 if 1/2 5 X 5 'll".
x2
on [-1r, 1r].
4.
Use Theorem 10.5.1 to find the Fourier series of x3 on [-1r, 'll"], using the
Fourier series of x2 from Table 10.5-4.
5.
a.
Suppose/: [-1r,1r]-+ IR is differentiable on [-1r,'ll"],f(-'ll") =/(1r),
and/' and f" are sectionally continuous, with jump discontinuities.
Show that
00
-I
If' (x)l 2 dx = n2 (a~ + b~).
1,..
'/l"
-,r
L
n=l
where an, bn are the Fourier coefficients off.
b.
6.
Use a and Schwarz' inequality to deduce that
z=:
1 (a~ +b~) 112
< oo.
Consider the half-interval cosine series for sinx on [-1r, 'll"). Verify
the differentiation theorem directly in this case.
§10. 7 Applications
593
§10.7 Applications
In this section, we describe some applications of Fourier methods to simple
boundary value problems that occur in physics and engineering. These examples are fairly easy, yet serve to illustrate basic techniques. This material is
intended solely for illustration and as a link with other courses in mathematics,
engineering, or physics the student may be taking. It is by no means a complete course in boundary value problems. For example, we use only rectangular
coordinates, when in fact polar and spherical coordinates are also very useful.
The problems we consider are standard ones-the vibrating string, heat conduction, and Laplace's equation. Further applications to boundary value problems for ordinary differential equations are given in Exercises 19 and 71 at the
end of the chapter.
We begin by considering the vibrating string. From standard physical arguments, we find that an approximating mathematical model for a vibrating string
with uniform density and (small) vertical displacement y(x, t) at x at time t is
that y(x, t) should satisfy the wave equation,
(1)
Here c is a constant determined by the physics of the string and represents the
velocity of wave propagation along the string (as will be seen in what follows).
See Figure 10.7-1.
y(x, t)
-AP~-----...r;_
FIGURE 10.7-1 The wave equation describes the propagation of waves on a
string
To completely specify the problem, it is necessary to give additional conditions. For instance, we can specify initial conditions, including the configuration
of the string at t = 0, that is, how it is initially "plucked," or the function y(x, 0).
It is physically reasonable to assume that dy / dt is zero at t = 0 (that is, the string
is motionless at the instant of plucking). It is also necessary to specify what
happens at the ends of the string. Typically they are held fixed; for example,
y(O, t) = 0, y(l, t) = 0, although other choices are possible. Such specifications
are called the boundary conditions.
Chapter 10 Fourier Analysis
594
Once we have selected this model of the vibrating string, we have a mathematical problem-physics reenters when we wish to interpret the answer. There
is a basic method used in these problems called separation of variables, which
yields special solutions; from these, one can build up general solutions.
Consider the case of a given initial displacement. To be more precise, let us
call the initial displacement problem the problem of finding y(x, t) for O < x < l,
such that
1.
ffly
8t2
fPv
8x2
-=c2_.
equation of motion
y(x, 0) = f(x)
{
(for given/)
av
2.
at t = 0,
3.
y(O, t) = 0, y(l, t) = 0
initial conditions
(no initial
velocity)
a/x,0) = 0
boundary conditions
(for all t)
Thus we seek the motion of the string for future (or past) time when it is
initially "plucked" in shape f(x). For 2 and 3 to be consistent, we assume also
that/(0) = 0 = f(l). Other types of initial conditions are considered in Exercise 3
for this section.
Separation of vaiiables means that we first seek solutions to the equation of
motion of the f01m
y(x, t)
= h(x)g(t).
Substituting in the equation of motion, we obtain
h(x)g"(t)
= c2h"(x)g(t),
which is satisfied if
=0
and
g"(t) + >-.c2g(t)
(2)
=0
A solution of (2) with h(0) =h(l) =0 and g'(0) =0 is
=sm. n11'x
- 1-
and
g(t)
h"(x) + )..h(x)
for a constant).. (why?).
h(x)
where
=cos -n11'ct
1-
,, ,,
n-1r-
).. =T·
n
=1,2,3, ....
Thus, for each n, a solution of the equations of motion satisfying conditions 1
and 3 is
. n1rx
n1rct
y,,(x, t) = sm - 1- cos - 1-, n = 1, 2, ....
595
§10. 7 Applications
The initial conditions for this solution are
y(x, 0) = sin n;x
and
it
(x, 0) = 0.
Thus we have the solution for a particular initial condition sin(mrx/ /). However,
we know that any/ can be expanded in a half-interval sine series, and since all
the conditions are linear, we should be able to add up the solutions corresponding
to the terms in this expansion. This is done more precisely as follows.
10.7.1 Theorem In the initial-displacement problem, suppose that f is twice
differentiable. Then the solution to the initial-displacement problem is
y(x, t)
=2I [f(x -
ct)+ f(x + ct)]
= bn sm. -mrx
mrct
=L
1- cos - 1- ,
(3)
n=l
where the bn are the half-interval sine coefficients,
bn =
2 [1
. mrx
l Jo f(x) sm - 1- dx,
and f is extended to be odd periodic. (Twice differentiable means we are assuming that the extended/ is twice differentiable.) (See Figure 10.7-2.)
y
'
'\
\
/'
I
\,._.,,
FIGURE 10.7-2
',
\
X
\
I
I
'-"'
Solving the initial-displacement problem
It happens in this case that the solution (3) could be simplified to a more
easily handled and explicit form. Often, however, one must deal directly with
the Fourier series itself.
Before generalizing, let us note the simple physical interpretation of the
result. The graph of /(x - ct) is that of/ moved over to the right a distance
596
Chapter 10 Fourier Analysis
ct, and so we can interpret the function g,(x) = f(x - ct) as / moving to the
right with velocity c after time t. Similarly, h,(x) = f(x + ct) is f moving to
the left with velocity c. See Figure 10.7-3. Thus, the initial shape of the string
propagates away to the left and right with velocity c, forming two waves, each
with one-half the initial amplitude, and the waves "reflect" (with sign change)
when they reach the endpoints.
,•-
/
•
,,
£ \
0
I
,..,,._.;
\
c,
h1
,_..,,.
:D!:{"'
I
,--~,,•
,
'
\
g
h
66'
\
.
1
~ - - 1 (•
\
\
FIGURE 10.7-3
,,_,,,
I
I \
\
~·
I
I
,..__/
Left- and right-traveling waves
To use the tfalf-interval sine series, recall that we made/ odd periodic. If
we look only on the interval [O, l], we see that when/ moves to l, it reflects
from the wall; see Figure 10.7-4. Since the solution is the sum, there will be
complicated canceling (or "interference").
To keep track of this, it is useful to visualize a simpler situation first. Suppose
/ were concentrated near a point (possibly a 6 function) and we were to watch
it move. The motions in this case are called the characteristics of the problem.
They should be visualized as if one were watching a movie. See Figure 10.7-5.
To use genuine delta functions or functions/ that are continuous but not twice
differentiable, we must generalize the scope of Theorem 10.7.1 and also generalize what we mean by a solution of fJ2y / 012 = c2({)2y/ ox2) for y that are not
differentiable. This is done using the theory of distributions. Admitting distributions, the result still holds for f a dist1ibution (that is, the formal manipulations
§10. 7 Applications
597
-- -
-- -
t= 1
= ~
FIGURE 10.7-4 Waves reflecting at the boundary
•
A'
•
2
0
(a)
=u:
±
±
(b)
;y___·
•
-v-
•
(d)
(c)
:ft
(e)
:ft
(f)
=
FIGURE 10.7-5 (a) t 0; (b) t = l; (c) t = 2; (d) t = 3; (e) t = 4; (t) t
(g) return to (a)
=5;
can be justified when properly interpreted). We then regard (j(x-ct)+f(x+ct))/2
as the solution for any f, differentiable or not.
In two-dimensional problems (such as a vibrating drum), the wave equation
reads
{}2y - 2
(4)
ot2 - C or. + o_tj .
(82y 82y)
In this case, the general solution can be written in Fourier series but does not
have a simple explicit expression as did the one-dimensional case. The solution
Chapter 10 Fourier Analysis
598
(for the similar initial-displacement problem) on the rectangle [0, /] x [0, l'] is
given by
'°'
.
00
V
. m,rxz
[
y(x1,X2,t) = ~ bnm sm -mrx1
1- sm - 1,- cos ,rct(n/1)2 + (m/1')2
]
(5)
n,m=l
where
. mrx11- sm. -mu2
lorlor J(x1,X2)sm1,-dx1 dx2•
I
4
bnm = IL'
I'
The reader is asked to go through the delivation of this in Exercise 68 at chapter's
end.
Turning to heat conduction, consider a bar whose temperature is T(x, t) at the
point x at time t. Interpreting -(8T/ax) as the rate of heat flow, the condition
of "insulation" at x 0 is (aT / ax) 0 (evaluated at x 0). The law of heat
conduction asserts that
=
=
8T
=
a 2T
81 (x, t) = k ax2 (x, t),
(6)
where k is a constant determined by the conductivity of the material. This is
called the heat equation. (For a derivation see Marsden and Tromba, Vector
Calculus, 3d ed., W. H. Freeman, New York, 1988, Chap. 7.)
The heat equation (6) differs from the wave equation in that aT/ at replaces
a2 T/ai1. This difference is very jmportant, for solutions to the heat conduction
problem are different in their behavior from those of the wave equation. For
example, in the heat equation one obtains solutions only for t ~ 0: Intuitively,
for the wave equation the graph of the solution "bounces around" like water
waves. For the heat equation, the solution diffuses out and becomes steady as
t - oo (as temperature tends to become evened out).
To study this simple situation, let us create the following model for heat
conduction of a bar with insulated ends (for simplicity, talce k = l ). Hence, we
wish to solve for T(x, t) satisfying
1.
aT
a2 T
-aX (x,t) = k-a
(x,t), 0 < x < l,
x-
2.
T(x,O) =J(x),
?
0
<x <l
t~0
heat equation
initial conditiom
and
aT
3.
{
ax (0,t) = 0
aT
a/l,t) =0
boundary conditions
599
§10.7 Applications
First, we find special solutions for special/ by separation of variables. Letting T(x, t) = g(x)h(t), we get
g(x)h' (t) = g" (x)h(t).
(Recall that we are taking k = 1.) These equations are true if, for a constant ..\,
g(x) + ..\g" (x) = 0
and
h(t) + ..\h' (t) = 0.
(7)
Solutions of these equations satisfying the boundary conditions are
n1rx
g(x) = cos l
22 2
h(t) = e-n ., . 111 ,
and
n
= 0, 1, 2, ... ,
where ,\ = n 2 1r 2 /12. "we use cosine and not sine so that the third boundary
condition will hold. (These boundary conditions can't be met if we try to solve
for g and h with ,\ < 0.) Thus a solution with f(x) = cos(n7rX/ l) is given by
e-n 2.,,. 21 /t2 cos(mrx/l), n = 0, 1, 2, .... Since all expressions are linear and
f(x) f;·
ao
2
00
mrx
~
+6
1-
On COS -
n=I
(half-interval cosine series), we expect that the general solution with initial condition f is given by
ao
2
00
~
+6
a,,e
-n2.,,.2,;,2
n1rx
cos - 1-.
n=I
10.7.2 Theorem If f is square integrable, then, for each t > 0,
a0
T(x . t) = -2 +
L a e-n .,,.
00
2 2
111
/I
2
n1rx
cos - /
(8)
n=I
converges uniformly, is differentiable, and satisfies the heat equation and boundary conditions. At t = 0, it equals f in the sense of convergence in the mean,
and poinrwise if f is of class C1. As usual,
2 {'
n1rx
a,, = 1 lo f(x) cos - 1-
dx.
Chapter 10 Fourier Analysis
600
The exponential tenn in (8) makes the convergence rapid for r > 0. Fort < 0,
divergence usually prevails. As t -+ oo, all the tenns in the series converge to 0,
and also the sum converges to 0, leaving
Jim T(x, t) = -21 ao,
/-+00
so that T becomes a uniform constant temperature in accordance with our intuition. The reader is asked to prove this in Exercise 69 at the end of the chapter.
What happens as t -+ 0 is answered by the following, more delicate result.
10.7.3 Theorem In Theorem 10.7.2,
lim
1-0, t>O
T(x, t) = f(x)
(9)
in the sense of convergence in mean and uniform convergence (and pointwise)
if f is continuous, with f1 sectionally continuous. More generally, if the Fourier
series off converges at x to j(x), then T(x, t) -+ j(x) as t -+ 0.
This important result tells us in what sense we recover the initial value /(x)
from the values for r > 0. This is not derivable from differentiability of T(x, t)
for r > 0.
Our final application will be the consideration of Laplace's equation on a
square. Laplace's equation in Rn is
'\l2r.p
=0;
L f:P<p
8x2 =0.
n
i.e.,
k=I
(10)
k
Such a function cp is called harmonic. This equation ruises in many problems of
electrostatics, fluid flow, and heat conduction. For instance, in heat conduction
in IR.n, if Tis independent oft, so that fJT/81 = 0, then T solves Laplace's
equation.
The basic problem, called Dirichlet's problem, is the following: Given values of cp on some closed curve in the plane, find <p inside. This seemingly simple
problem is at the core of the vast subject of potential theory. (The terminology
arises from electrostatics, in which cp represents the electric potential; again, see
Marsden and Tromba, Vector Calculus, Chap. 7, for details. This problem can
also be attacked by methods of complex variables; see, for example, J. Marsden
and M. Hoffman, Basic Complex Analysis, 2d ed., W. H. Freeman, New York,
1987, Chap. 5.)
§10.7 Applications
601
Let us use Fourier seiies to solve this problem for a square in IR 2 (cubes in
R 3 are similar). The problem is summruized as follows:
In IR 2 , on [0,a] x [0,b], find a fun~tion cp such that
lAplace 's equation
1.
2.
cp(x, 0) = g1 (x), cp(x, b) = g2(x),
boundary conditions
cp(0,y) =/,(y), cp(a,y) =/2(y),
where/; and gi ru·e given functions. See Figure 10.7-6.
y
K2
b
J;
f,
vitp=Q
. ,. ,-·
--+-----------''-----+-
X
a
FIGURE 10.7-6
Solving Laplace's equation with given boundary conditions
First, let us obtain special solutions by separation of variables.
cp(x,y) = 'Pl (x)cp2(y), we get
Trying
which holds if, for a constant ,\, the functions cp 1 and cp 2 satisfy the equations
(11)
Solutions are
. mrx
cpi (x) =sm-,
a
n = 1,2, ... ,
and
tp2(y)
where,\
.
mr(b - y)
= smh - - - ,
a
=n21!'2/a 2• We choose sinh(z) =(e: -
e-i)/2, z = n1l'(b - y)/a, rather
602
Chapter JO Fourier Analysis
than e4 or e-z, because it will vanish when y = b. Similarly, we choose sine
rather than cosine. Thus
. mr(b - y) . mrx
ip(x,y) = smh ---'-- sm -
a
a
satisfies the boundary conditions
. h mrb . mrx
g1 =sm - s m - ,
a
a
/1 =h =gz =0.
We obtain other basic solutions similarly. It can be expected, therefore, that
when / 1 /2 g 2 0, the solution of the problem is
= = =
. h mr(b ( ) _ Loo b,, sm
'P x,y -
a
11=1
y) sin(mrx/a)
. h(mrbf a)'
sm
(12)
I::
where g 1(x) =
1 b,. sin(nrrx/ a) (half-interval sine series). Similar solutions
hold for the other sides, and the sum is the solution for all sides.
10.7.4 Theorem
i.
Given g1, let ip(x,y) be defined by (12). Suppose 81 is of class C 2 and
g1(0)
g1(a)
0. Then 'P converges uniformly, is the solution to the
Dirichlet problem with / 1 h g2 0, and is continuous on the whole
square, and "v2 ip = 0 on the interior.
=
=
= = =
'
U.
If each of /1 ,h, g1, g2 is of class C 2 and vanishes at the comers of the
rectangle, then the solution rp(x,y) is the sum of four series like Equation
( 12), 'v 2 cp = 0 on the interior, and 'P is continuous on the whole rectangle
and assumes the given boundary values. Furthermore, 'P is C00 on the
interior.
iii.
/f f1,h,g1,g2 are only square integrable, then the series for <p converges
on the interior, 'v 2 ip = 0, and cp is C00 • Also, cp takes on the boundary
values in the sense of convergence in mean. This means.for example, that
limy.....o cp(x,y) <p(x, 0) g, (x) with convergence in mean.
=
=
Results i and ii are still true if we assume only thatf; and g; are continuous,
but they require a different method of proof. The present procedure is good
because it gives the solution explicitly in tenns of Fourier series.
§10.7 Applications
603
10.7.5 Example In the initial-displacement problem, define the total energy
of the string at rime t to be
.l'.
11'(8)
E(t) = 2
.l'.
c7-1'(a)
ax
2
dx + 2
at
0
2
(13)
dx
0
(kinetic plus potential energy). Show that E(t) is constant in t.
Solution It suffices to show that dE/dr = 0. Now
11 a (av)
1
dE- dt - 2
0
at
....:...
at
2
2
c
dx+2
1'
0
-fJ (oy)
at ax
2
dx
.
(14)
This is justified by differentiation under the integral sign if y is twice continuously differentiable (see Chapter 9). Then
c2
{'
2 lo
r'
O (f}y) 2
1
{)y [)2y
at ax
dx = c- lo ox 8t8x dx.
(15)
Integrating the right-hand side of (15) by parts and using the that oy / 81 = 0
at x = O,l gives
?
[
1 ay
fPy
-c- lo at 8x2 dx,
which, in view of the equation of motion, equals
1/
{)y f)2y
-
-;- --;;-:,dx.
o vt vt-
Thus dE / dt = 0, since the first term in (14) is
1/
{)y fJ2y
-;- --;;-:, dx.
o vt vt-
•
In the case where y is not twice differentiable, more care is needed. (For
example, if y is a 8 function, E is not even defined.) For the heat equation,
we do not have conse1vation of energy, because, roughly speaking, the energy
diffuses away. (See Exercise 6 for this section.)
10.7.6 Example A bar with insulated ends has a temperature distribution
given by f(x) = x, 0
t > 0.
<
x
<
l, at I = 0. Find the temperature distribution for
Chapter JO Fourier Analysis
604
Solution According to Theorem 10.7.2, we take the half-interval cosine series
for x and insert factors e-n 2 " 21112 . The series for x is given by
x _ ~ _ 41 ~ cos[(2n - l)rrx/l]
'
- 2 1r 2 L..,
(2n - I )2
n=l
and so the required solution is
T(
x, t
) = ~ _ 4/ ~ -(f.1n-I)2..-2,)//2cos[(2n - l)1rx/l]
2
7r2 L.., e
(2n - 1)2
.
'
n=l
In general, one cannot reduce this expre~sion to a compact fo1m but must
instead work with this series expan-sion.
•
10.7.7 Example Solve the Dirichlet problem on [0, 1r] x [0, 1r] with boundary
values g 1
=1, Ji =0, 12 =0, g2 =0. How are the boundary values assumed?
Solution The sine series for l is
~ ~ sin[(2n - l)x]
1rL..,
2n-1
·
n=l
According to Theorem 10.7.4, <p is obtained by inserting factors
sinh[n(1r - y)]
sinh(n1r)
and so we obtain
<p (x,y )
4
=-
Loo sinh[(2n -
1r
l)(1r - y)] sin[(2n - l)x]
smh[(2n - l)1r]
2n - l
.
11=1
as the solution. By Theorem 10.7.4, <p is C"'° and satisfies V 2 cp = 0 on the
interior of the square. Here <p(x, 0) = 1 in the sense of convergence in mean;
that is, <p(x,y) - 1 as y - 0 (in mean). The partial sum Sn of <p is roughly
sketched in Figure 10.7-7.
Exercises for §10.7
1.
For the initial-displacement problem of a suing, consider the "plucked
string" with f(x) hx if O $ x $ l/2 and /(x) lh - lu: if 1/2 $ x $ l.
=
=
605
§10.8 Fourier Integrals
Gibbs' phenomenon
X
FIGURE 10.7-7
A specific solution of the Dirichlet problem
Show that f does not satisfy the hypotheses of Theorem 10.7.1. Find an
expression for the solution after time 1 = I/ c, 3l/2c.
{
Suppose, in the initial-displacement problem, that f has a maximum or a
discontinuity at x0 • Show that, after time t, this feature is propagated like
a chm·acteristic.
3.
a.
2.
State the initial-velocity problem for a suing with no initial displacement. If the initial velocity is g(x), show that the solution is
I
y(x, 1) = 2c
b.
4.
5.
O
g(z) dz -
1x-cl g(z) dz }.
0
Combine a with the initial-displacement problem to get a solution
for the problem with both initial displacement and initial velocity.
(This is called d'Alembert's solution.)
In Theorem 10. 7.2, prove that for any 1 ~ 0, (2/ /) f~ T(x, 1) dx = ao.
=
A bar with insulated ends has at t 0 the temperature distribution J(x, 0)
Find the temperature at 1 > 0 and the limit as 1 -+ oo.
x2.
6.
{1x+cl
=
Let T(x, 1) be a solution of the heat equation and set l(l) = J~ IT(x, 1)12 dx.
Show that L(l) is nonincreasing.
§10.8 Fourier Integrals
This section consists of an informal discussion of Fourier integrals. We sketch
the main results so that the reader may see their role in Fourier analysis. As we
have seen in the previous sections, Fourier series are a useful tool for analyzing
606
Chapter 10 Fourier Analysis
functions on a finite interval. Since many functions are given on the whole real
line JR, it would be helpful to have an analogous theory on JR.. Fourier integrals
provide this theory.
Considerf : [-l, I] -+ JR.. Writing f in tenns of its exponential Fourier series,
L c,,ei111rx/l,
00
f(x):
1
where c,, = 21
-oo
JI
.
f(y)e-m1ry/l dy.
-I
Let a: = mr / l and write
~
•
1r
f(x) = ~c(a)e"rx 1 ,
where c(o:) = -21
1r
-oo
JI
.
/(y)e-' 0 Y dy.
-I
For l large, a: approximates a continuous variable, and this sum is a Riemann
sum with Aa = 1r / I. This suggests that
J(x) =
1-:
where c(a) = -21
1r
c(o:)eiox da,
Joo f(y)e-'°'Y. dy.
(1)
-oo
In short, we expand our finite intervals to infinite intervals, and the Fourier series
is then expressed as an integral.
The same steps can be used in trigonometric form, except that integrals are
taken from O to oo, as are the corresponding sums.
The relevant theorem states that if f is sectionally continuous with jump
discontinuities, f' (Xo) and f' (Xo) exist there, and J~ If(x)I dx < 00 (/ is integrable), then
(2)
where
c(a)'= -21
1r
1
00
j(y)e-'•0 Y dy.
-oo
One proves this in a way similar to the pointwise convergence theorem.
Formula (2) is called the Fourier inversion formula. In trigonometric form
the formula is
I
2 [/(x•)+f(x-)] =
f'X>
lo
[A(o:)cosax+B(o:)sinax]da,
where
11
00
A(a) = -
7r
0
f(y)cosaydy
and
11"°
B(a) = -
7r
0
f(y)sino:ydy.
(3)
§10.8 Fourier llltegra/s
607
This form is especially convenient if/ is even or odd.
In view of the inversion theorem, the Fourier transform off is defined as
f(o:)
A
100 f(x)e-,xo
• dx.
= -21
(4)
-oo
7r
If f is continuous, differentiable, and integrable on JR, then
(5)
There is a similar formula on !Rn; that is,
~
/(x)
L
j(o:)i(o.
x)
do:,
(6)
where
j(Oi)
f
= _I_
J'IW'
(21r)"
f(x)e-i(x,o) dx,
x, a E !Rn, and (x, o:) is the usual inner product in !Rn.
Given f : [0, oo[ --+ JR, we can extend f to all of JR by making it even or
odd. Just as with the cosine and sine se1ies, we can then introduce the Fourier
cosine transform by extending f to be even, and setting
21
fc(o:) = -
00
/(y) cos(o:y) dy.
(7)
0
7f
The inversion formula becomes
f(x) =
100
lc(o:) COS(OIX) do:.
(8)
Similarly, extending/ to be odd leads to the Fourier sine transform,
].(a-)=~
1r
.['xi /(y)sin(01y)dy,
lo
(9)
and the inversion formula becomes
f(x) =
fo
00
j.(o:) sin(X01) d01.
(10)
A standard fact one should know (using Worked Example 9.1, Chapter 9) is
that the Fomier transfo1m of e-x212 (Gaussian function) on JR is e- 02 / 2 / .._ff;,
which is consistent with the inversion theorem.
608
Chapter 10 Fourier Analysis
In general, an integral tramform is an association of the function
g(x) =
1
(11)
k(x,y)f(y)dy
with the function/ for some fixed function k called the kernel, and some fixed
range A of integration. Such operations are common in mathematical physics.
Thus the Fourier transform is an integral transform with kernel k(x,y) = e-ixy /2rr.
Here we come to an important, general problem. The transform maps f to g;
that is, given/, we get g. Can we invert this? In other words, given g, can we
invert the transformation to find/?
The Fourier inversion formula solves this problem in the case of k(x,y) =
e-ixJ /2rr. That is, knowing the Fourier transform, we can recover the function
using the inversion formula.
Another common integral transfonn is the Laplace transform with kernel
k(x,y) = e-xy and range [0,oo]. Thus, the Laplace transfo1m off is
C(f)(x) =
1
00
(12)
e:.."-,f(y)dy.
The inversion problem for Laplace transforms also has a solution. (See, for
example, Marsden and Hoffman, Basic Complex Analysis, 2d ed., Chap. 7, for
details.)
For/: [-tr, rr] --t IR, Parseval's relation gives the identity (see Table 10.5-3)
1
1<
11/112 =
-:r
00
lf(x)l 2 dx = 2tr
L lcnl
2
(13)
-oo
for the Fourier coefficients c,, in exponential form. Since Cn is analogous to the
Fomier transforms, we might expect that something similar holds in terms of j.
This is true-if l/1 2 and I/I are integrable, then letting
we have
(14)
More generally, {/,g} = 21r(j,g}. Here the Fourier transform can be any of the
types-exponential, trigonometric, sine, or cosine.
Formula (14) is variously known as Parseval's relation and Plancherel's
theorem. To deal effectively with the technicalities involved requires the theory
of the Lebesgue integral.
Exercises/or §10.8
609
For/ and g integrable on lR (or !Rn), we define the convolution of/ and g
by
(/ * g)(x) = 1:f(x- y)g(y)dy.
(15)
This operation enters naturally in many problems. One of its main prope11ies is
that
(16)
(see Worked Example 10.4 at the end of this chapter).
Fomier transfo1ms have impo11ant applications in both pure and applied
mathematics. For example, they are important in partial differential equations,
such as the wave equation on all of IR". The reason is that in tenns of Fourier
transfo1ms, the equations become much simpler, often algebraic, and when these
equations are solved the answer is obtained using the inversion fonnula. Convolutions are encountered when we invert.
Using Fourier transforms, many problems solved in earlier sections on finite
intervals can be translated easily to problems on the whole real line lR. The
following exercises outline how to do this.
Exercises for §10.8
These problems can be done infonnally with little attention to rigor, in the
manner in which this section has been presented.
1.
Show that if j(a') is the Fourier transfonn of/, then j(o:) = io:](a) and
271' f ~00 a 2 IJl2 da = f ~00 If' 12 dx.
2.
Let J(x, y) satisfy 8 2/ / 8r + 8 2/ / 8y2
Iimy-oof(x,y) = 0 for all x.
a.
=0. Suppose that J(x, 0) =g(x) and
Let](o:,y) be the Fourier transform of J(x,y), with y regarded as a
constant. Show that](o:,y) = g(a)e-1°1.Y.
b. .. Show that e-lol.v is the Fourier transfonn, with respect to x, of
2y
x2 +y2·
c.
Deduce that the solution of Laplace's equation is
f(x,y) = -l
7f'
1
00
-oo
1
~ ( Y x'fg(x')dx.
y- + X -
Chapter JO Fourier Analysis
610
3.
Suppose that f (x, t) is a function for which
{)J = k2 cPf,
{)r
{)x2
-oo
< x < oo,
t
c: 0'
and f(x, 0) = g(x). Let j(o:, t) be the Fourier transfonn off.
~
k2
Show that/(a, t) = g(a)e-
b.
Deduce that the solution of the heat equation is
f(x, t) = _I_
2k,/iri
4.
2
a.
0
1•
1=
g(x')e-<x-x')2 /4k2, dx'.
-oo
Suppose that f(x, t) is a function for which
[)2J
fi1f
vt-
vx-
-;--;, = -;:;-;,,
f(x,0) = g(x),
and
{)J
{)r (x, 0) = h(x).
Letj(a, t) be the Fourier transfonn off(x,t), with t regarded as a constant.
a.
Show thatj(a, t) = g(a) cos at+ h(o:)([sin o:t]/ o).
b.
Deduce that the solution of the wave equation is
f(x, t) =
I
2[g(x -
t)
I
f'
+ g(x + t)] + 2 Jo [h(x - r) + h(x + r)] dr.
§10.9 Quantum Mechanical Formalism
This section will explain how the theory of Fourier series in an inner product
space (developed in §IO.I and §10.2) is related to the fonnalism of quantum
mechanics.
First, let us explain some differences between classical and quantum mechanics. In classical mechanics, a particle's motion is described by a definite
path with its associated velocity. In quantum mechanics, there is always some
"unce11ainty" about the position or velocity (or both). For atomic phenomena
this unce11ainty is necessary, and these effects are outside the domain of applicability of classical mechanics. For example, if an atomic particle with a definite
initial velocity is prepared and projected at a screen, we do not know precisely
what its future path will be. Instead, when we look for the particle we can only
detennine the probability of finding it in a given region.
§10.9 Quantum Mechanical Formalism
611
In Figure 10.9-1, we consider projecting particles through a screen with two
slits onto a detection plate.3 Only the probability of location can be determined,
not the exact location. For repeated trials, light and dark areas on the screen
corresponding to high and low probability are obtained. This is represented by
the curve in the figure. Other physical phenomena, such as the discrete spectra
for atoms, also require a quantum mechanical description, as texts on quantum
mechanics explain.
detection
plate
screen
probability
distribution
particles all
prepared . .
identically
FIGURE 10.9-1
The double-slit experiment
This info1mation should not be taken to mean that classical mechanics is
"false" while quantum mechanics is "true." Both are mathematical models that
approximate nature well in their own limited circumstances. Quantum mechanics
is, for example, not very useful when we study the dynamics of the space shuttle.
The question, then, is, How does one describe the behavior of a quantum
mechanical particle? A single quantum mechanical particle is described by a
complex-valued function, 1/J(x), where x E JR.3. For more particles, one changes
IR 3 to another space (for N particles, JR 3N is used). If the system depends on
time, then we use ·¢(1,x). The probability density for locating the particle in
space is given by 1f,,(x),p(x), so that if the total probability is 1, then we should
have
ll'¢11 2 = ),.3
r 1/;(x)-,p(x) dx = 1,
that is, '¢ should be no1malized. This last sentence provides one of the links
between studying the mathematical model (that is, studying the inner product
space of square integrable functions 1/; and operations on it discussed in this
section) and the physical interpretation of this model.
3This is an imaginary experiment lhal is used for illustralive purposes only. In real experiments.
the "screen" might be, for instance, a cryslal, and the slits mighl correspond 10 interatomic spacings.
Chapter 10 Fourier Analysis
612
If one is measu1ing a definite quantity, such as the x coordinate, the momentum, or the angular momentum, then it cannot be measured with certainty. The
aspect of quantum mechanics we wish to explore is the mathematical structure of
the 'wave functions t/; and the mathematical objects corresponding to physically
measurable quantities. Of course, the subject goes much deeper than this, and
our discussion only begins to scratch the surface.
Let us introduce some rigor into our discussion. First, suppose V = L2 is the
space of functions 'ljJ : JR 3 -+ C that are square integrable:
ll1"11 2 = ('I/J, 1J,) ::
j t/;(x)t/;(x) dx < oo.
As we have seen before, this is an inner product space with inner product
{f,g} =
r f(x)g(x)dx.
J.,_1
Thus, all the discussion relating to an inner product space (see §10.1 and §10.2)
is relevant here.
A quantum mechanical state is (by definition) a function t/J E V such that
llt/JII = I; that is, t/J is no1malized. An observable is an operator A on V that is
symmetric (or self-adjoint, or Hermitian); this means that A : V-+ V is a linear
map satisfying
{Af,g) = (/,Ag}
for all f,g E V. Actually, A may be defined only on some elements of V. For
example, if A is the Laplacian; i.e., if
2
2
Af =
2
0/
0/
0/
2
-o
x- + -o
r + -o
z- = v' f,
?
?
?
then A is defined on those f E V whose second derivatives also lie in V. One
can check fonnally that v' 2 is symmetric by using integration by parts twice. On
the other hand,
is not symmeuic, but i({) /
is (the reader should prove
this). (Operators like the Laplacian are called unbounded, and for them one
has to distinguish between symmet1ic and self-adjoint. For further information,
see Reed-Simon, Methods of Modern Mathematical Physics, Vol. I, Functional
Analysis, Academic Press, New York, 1972.)
An eigenfunction of an operator A is an / E V, f 'F 0, such that Af = >.f
for some complex number >. called the eigenvalue. Observe that if f is an
eigenfunction, then so is f /11111, and so we can assume that our eigenfunctions
are normalized.
There are two impo11ant remarks to be made concerning eigenfunctions of
symmetric operators. First, if A is symmetric and f is an eigenfunction with
eigenvalue >., then >. is real.
o/ox
ox)
§10.9 Quantum Mechanical Formalism
Proof {Af,f)·= (>..J,f) =,\IIJll2. On the other
,\f) =Allfll2, and so ,\ =X, that is, ,\ is real. •
613
=(f, Af} =(f,
hand, {Af,f)
Second, if f and g are eigenfunctions with eigenvalues ,\ and Jt and ,\ -:j; µ,
then f and g are orthogonal.
Proof Consider (Af,g) - (/,Ag}
Since Jt ~ ,\, (f,g} = 0. •
=0 = (,\/,g) -
= (,\ - µ)(f,g}.
(f,µg)
If f and g are independent eigenfunctions with the same eigenvalue >.., then
we can obtain two new 011hogonal eigenfunctions by the Gram-Schmidt process. A similar operation can be pe1f01med for any number of eigenfunctions.
Thus, the eigenfunctions of a symmetric operator A form an ortlwnormal set.
Hence, we have the connection with Fourier series, for, if <po, <p1, <p2, ... are the
orthono1mal eigenfunctions of A and if they are complete, then we can write
t/J = 'E:0(-t/J,<pn)<p,, for any 1/J EV.
Unfortunately, the eigenfunctions are often not complete. For example, the
Laplacian operator on Rn has no (square integrable) eigenfunctions. In many
important problems ("bound-state" problems), however, the eigenfunctions are
complete. This is proved in advanced courses on functional analysis by means
of l!, theorem called the spectral theorem.
Let us return to the physical interpretation of obse1vables. Again, let V be
as before and let A be a symmetric operator.
Physical Interpretation If A is "measured" in a state t/;, only the eigenvalues of A are observed. The value >.n is observed with probability l('lf', 'Pn)l 2 ,
where A<p,, = ,\n'Pn•
This interpretation is consistent because (see Exercise 22 at the end of the
chapter)
I = (1/J, t/1) =
L (t/ 'Pn) (<f!n, t/J) = L l{t/J, lf>n) 1
1,
2,
and so the total probability is I. Fui1hennore, if the system is already in state if>n
(which it need not be generally), then we observe ,\n with probability I (that is,
With ce11ainty). Thus the Fourier expansion of tp exhibits 1/J as a "mixture" of the
eigenstates <fin, and the squares of the absolute values of the Fourier coefficients
are the probabilities of observing the pru1icular eigenvalues.
As we have seen before, in a given state ¢ and given an obse1vable A, one
cannot generally predict with certainty the obse1ved value of A. But the average
value observed, after many trials, is E:0 l(t/1, <f'n) j2..\11 = (AtJ,, ¢) (see Exercise 3
Chapter 10 Fourier Analysis
614
for this section). This quantity {A'I/J, '1/J) is also called the expectation value of
A in the state t/J. What is this for an eigenstate?
Probably the most important example of an observable is the energy operator,
denoted H, also called the Hamiltonian. For a single particle in a potential U,
it is given by
li2
Ht/J = - 2m 'v2'1/J + Ut/J.
The justification of this choice depends on a more detailed analysis of the foundations of the subject; we refer to quantum mechanics texts for this. Here U
is a given real-valued function representing the potential energy, m is the particle's mass, and h is a certain constant that depends on units of measurement
(Ii= 1.05 x 10-27erg • sec) and is called Planck's constant.
For example, in the hydrogen atom we can observe only discrete energy
levels, which are eigenvalues of the operator
Ht/) = - /i2 'v2t/J - J!..'
2m
r
(r
where r(x,y, z) =
+y2 +z2 ) 112 and m is the mass of the electron. These energy
levels are directly measured in atomic spectra. A word of caution-any t/J is
an admissible state, not just the eigenstates. But the eigenstates are particularly
impo11ant states because their eigenvalues give the values that are observable.
The reason the energy operator is so impo11ant is twofold. First, its eigenvalues give the possible energies we can observe. Second, this operator governs
the time dependence of t/J by means of the celebrated Schrodinger equation,
which reads:
iii at/) = Ht/J
at
(the solution t/J of this equation using Fourier series is given in Exercise 23 at
the end of the chapter).
Other operations are
1.
The position operator (in the x direction):
Q.(t/J) = xtjJ(x,y, z).
2.
The momentum operator (in the x direction):
Px(t/J) =
3.
~
~t/J' vx
The angular momentum operator (about the z axis):
Ii. ( ot/J
07/J)
Jz<t/J> = T Yax - x ay ·
Similar definitions can be made for Qy, Ql, and so on.
§10.9 Quantum Mechanical Formalism
615
The eigenfunctions of Jz are complete, and the eigenfunctions and eigenvalues are computed in any quantum mechanics book. The operators Qx, Px do not
have square integrable eigenfunctions.
Finally, before looking at a specific example, we examine the important
notion of the commutator. The commutator of two operators A and B is the
operator [A, B] defined by
[A,B] =AB- BA,
where AB means Ao B; that is, (AB)(f) = A(B(f)).
Suppose/ is an eigenfunction of both A and B. If A/= >.J and Bf = µf, then
[A, B] f
=(AB -
BA)/
=A(pf) -
B(>.f) = Ji>./ - AJL/
=0.
Thus, if A and B have the same eigenfunctions, then [A, B"jf = 0 for such
eigenfunctions. If the eigenfunctions are complete, we expect [A, B] = 0 for all
f, but this requires more assumptions than we can go into here.
Conversely, if [A, B] = 0, we can select the eigenfunctions to be simultaneous
eigenfunctions of A and B. This is easy to see if there are no repeated eigenvalues
(the more general case requires a bit more argument). To illustrate, suppose Af =
)if. Then A(Bf) - B(Af) = 0, and so A(Bf) = >.(Bf). Thus Bf is an eigenfunction
of A, so that Bf=
for someµ (since, by assumption, >. is a simple eigenvalue).
Thus f is an eigenfunction of both A and B.
In summary, [A, B] = 0 iff A and B have simultaneous eigenfunctions. Physically this means that these eigenfunctions give exact observables for both A
and B at once, or, as we say, A and B can be measured simultaneously. Further justification of this statement is given by the famous uncertainty principle
(Exercise 5 for this section), which states that the product of the "errors" in
measuring A and B for the same 1/,J (that is, measuring A, B simultaneously) is
at least 21 (Ctt,, tt, }I, where C = [A, B]. The definition of "error" is also given in
Exercise 5.
Finally, let us look at the example of a particle in an "infinitely deep well."
Other impo11ant examples such as the harmonic oscillator (H = -(h2 /2m)V 2 +kr)
and the hydrogen atom (H = -(h2 /2m)V 2 - 1/r) are found in standard texts,
and are a little more laborious to perform fully. The Hamiltonian for the deep
well problem in one dimension is
,,f
li2 IJ2tt,
H= - - --+
Vtt,
2
2m 8x
'
where V = 0 on [O, I] and V = oo outside. Since this is not workable as it
Chapter 10 Fourier Analysis
616
stands within the space of square integrable functions, let us reformulate H by
demanding
H'ljJ = - Ti2 EP'I/J
2m
{
1/J and
ax2
H't/l = 0
on [0, l}
outside [0, I}.
Thus H is really an operator on the functions 'lj;(x), 0 ::;; x ~ l, with tf;(O) =
t/J(l) = 0.
It follows that ·I/J is an eigenfunction iff there is a constant E such that
The general solution to this equation is
t/)(x) = A sin >.x + B cos >.x,
where >. 2 = 2mE/ !i2. If t/; =0 at 0, l, then B = 0 and also >.
Thus, the nonnalized eigenfunctions are
'Pn(X)
(see Figure 10.9-2), where >.n
=
=mr / l, n = 1, 2, ....
y2/l sin AnX
=mr / l, n =1, 2, ... , and the eigenvalues are
Thus, these E,, are the only possible energy values we can observe:
FIGURE 10.9-2 Energy levels for a pru1icle in a deep well
Exercises/or §10.9
617
Here these functions are complete, as has been proved in §10.3. Thus, if a
particle is in a state t/J, the probability that we will observe energy En is given
by
2
l{t,b,cp,.)j2 = 1
I1
1
2
•
mrx
t/J(x)sm-1-dx
Exercises for §10.9
t..
Let A be an operator on V. Define its adjoint A* by (A*x,y) = (x,Ay)
(assume such an A* exists). Prove that (AB)*= B*A*.
2.
Let V be Rn and A : V -+ V be linear. Prove that A is symmetric in our
sense if its matrix with respect to any orthonoimal basis is symmetric in
the usual sense of matrices, that is, aij = aji.
3.
Let A be a symmetJ.ic operator on V with a complete set of eigenfunctions cp0 ,cp 1 ,cp2 , ••• and eigenvalues .Xo,.X 1, •••• If t/J EV, show that the
expectation of A is given by
00
{At/J,t/J) =
L 1{7/J,cpn)l -Xn.
2
n=O
Interpret this quantum mechanically (that is, probabilistically).
4.
Suppose (Af,g) = -(/,Ag) for all f,g E V. Show that A2 is symmetric
and that it has only negative eigenvalues.
5.
Uncertainty Principle Let A be a symmetric operator. The uncertainty
(or variance) in observing A in a state t/J is given by
a.
Show that this equals {A 2 ¢, t/J) - (At/I, ¢)2.
b.
Let A, B be two symmetric operators and let C = [A, B]. Show that
Note the special case [A, B] = 0 and interpret it.
Chapter 10 Fourier 1nalysis
618
c.
For the case Ar/; = xi/I and B'f/; = (Ii/ i)o'ljJ / ox (position and momentum), show that
(for 11'1/lll = 1). This is called the Heisenberg uncertainty principle.
Theorem Proofs for Chapter 10
10.1.2 Theorem The space V of continuous functions/: [a,b]---+ Cforms
an inner product space if we define
1
b
(f,g) =
f(x)g(x)dx.
Proof The properties of the inner product follow from these computations:
i.
(af + bg, h) = lb [af(x) + bg(x)]h(x) dx
=alb f(x)h(x) dx + b lb g(x)h(x) dx
= a(f,g) + b(f,g).
ii.
fb
(f,g)
=la
lb
=
iii.
1
b
f(x)g(x)dx
=
f(x)g(x) dx
=(g,f).
f(x)g(x)dx
Note from ii that (/,/) = (/,/), and so (/,/) is real; thus (/,/) ~ 0
makes sense. Here
l
(/,/) =
since
l/(x)l 2
~ 0.
a
b
f(x)f(x) dx
1
=
b
IJ(x)l 2 dx ~ 0,
l'heorem Proofs for Chapter 10
619
Finally, suppose (/,/) = 0. If h is continuous and h ~ 0, then
lb
(see §8.4), and so (f,J)
h(x)dx
=0
implies
h
=0
=0 implies J: l/12 dx =0 ,and hence f =0. •
10.1.3 Cauchy-Schwarz Inequality
Letf,g belong to the inner prod-
uct space V; then
l(f,g)I:::; IIJll llgll.
Also, V with norm II/II satisfies the axioms of a normed space and, with d(f, g) =
Ill - gll, the axioms of a metric space.
Proof First, let us prove the inequality when ll811 = I:
o:::; II! - (f,g)gll 2 = U- if,g}g,f- (f,g}g}
= (f,J} - (f,g}(f,g} - (f,g}(f,g} + (f,g}(f,g}(g,g}
=(f,f} - (f,g}(f,g} =11111 2 - l(t,g}l22 :::; 111112Thus I(/, g} 1
For the general case I(f, g) I :::; II/I I Ilg II, we can suppose g 'F 0, so that
llgll 'F 0. Let h = g/llgll, and so that 11h11 = I. Then l(f,h}I :::; 11/11- But
l(f,h}I = l(f,g}lfllgll $ 11111, and so we obtain the result. •
This method is similar to that used to prove the Cauchy-Schwarz inequality
in real inner product spaces (Chapter I), except that now a bit more care is
needed to keep track of complex conjugates. The reader should derive the other
prope11ies, taking special care with the tliangle inequality II/+ gll :::; 11111 + 11811-
10.1.6 Theorem
Let V = L2 be the space of functions/: [a,b]--+ C such
that IJl is integrable (that is,
IJ(x)l2 dx < oo). Then the space V is an inner
product space with inner product
2
J:
and norm
1lf(x)l
) 1/2
b
11/11 =
(
2 dx
620
Chapter 10 Fourier Analysis
J:
Proof First, if 11/11 = 0, we have IJ(x)l 2 dx = 0, and so by Theorem 8.3.4ii,
Chapter 8,/ is zero except possibly on a set of measure zero. Since we are identifying functions that agree except on a set of measure zero, f = 0. Finally, {f, g)
satisfies all the other rules of an inner product space as shown in Theorem 10.1.2.
We need only show that (f,g) is finite (that is, thatfg is integrable).
If we were to work only with bounded functions.Jg would be integrable and
bounded, as are both/ and g (see Chapter 8). Since we allow improper integrals,
f and g need not be bounded. If we split f and g into real and imaginary parts,
and into positive and negative parts, we are reduced to the case in which/ and g
are real and positive (the reader is asked to carry out the details as an exercise).
For each M ~ l, define (jg)M as in Chapter 8. We want to show that
lim
M-oo
lb
(Jg)M
< oo.
0
By examining cases, we see that
and so
i\Jg)M
s {f./M,g./M) + llfv1Mll2 + llcv1Mll2
s llfv1Mll2llgv1MW + llfv1Mll2 + llgv1Mll2.
by the Schwarz inequality. But 11/v'Mll $
II/II and llcv'MII $
llcll, and so
i\tg)M :5 ll/11 llgll + 111112 + llcW < oo.
Hence we obtain the result (the limit exists as the integral increases with M; we
needed only to show that it was bounded above).
Finally, for sectionally continuous functions, observe that they form a vector
space (Exercise 9 at the end of the chapter) and are bounded (Exercise 11).
Hence both functions, f and 1/1 2 , are integrable, since the set of discontinuities
is finite (see Chapter 8).
•
I::O
10.2.1 Theorem In an inner product space V, suppose f =
Ck'Pk for
an orthonormal family <po, cp1, . .• in V (convergence in the mean) and f E V.
Then Ck=(/, 'Pk) = (cpk.f).
Proof Lets,,= I::-=0ckcpk, so that II/ - snll
-+
0. Fix i and choose n ~ i.
Form
(/- s,,,cp;) = (/,cp;) - (s,,,cp;) .
.
Theorem Proofs for Chapter JO
621
This expression approaches zero as n--+ oo, since
n 2:'.: i, we have
n
(sn,'Pi}
l(f- Sn, <p;}I S II/- snll- For
n
n
=:~::)ct<pt,cp;) =LCt('Pk,cp;) =Lck6ki=c;.
k=O
k=O
Thus, (/, cp;) - c;--+ 0 as n
have (/, cp;} = c;. •
--+
oo. Since this expression is independent of n, we
10.2.3 Bessel's Inequality Let cpo, <p1, ••. be an orthonormal system in an
inner product space V. For eachf EV, the real series I::i I(!, <p;}l 2 converges
and we have the inequality
00
L l{f,ip;)l s 111112.
2
i=O
Proof Let Sn = L7=0U, <p;}<p;. We first show thatf-sn and Sn are orthogonal.
To see this, it is enough to show that f - Sn and <p;, I S i S n, are orthogonal
(why?). Indeed, (f - Sn, <p;)
<p;} - (sn, <p;) and (sn, <p;) (f, cp;}, since
=(/,
=
n
<sn,'Pi)
n
=L<if,<pj}'Pj,'Pi} =LU•'Pi}6u =u.<p;}
j=O
j=O
(this is the same computation as in the proof of Theorem 10.2.1). If g and h
are orthogonal, then Ilg+ Jzl!2 = llgll 2 + llhll2 (Pythagoras' relation, Exercise I,
§10.1), and so
11/112 = II/ hence llsnll S 111112. Now
Sn+
snW = llsnW + II/ -
snll2;
n
=
L l(f, cp;)l ll'f';ffl'
2
i=O
since the <p; are orthogonal, and therefore
n
llsnll2 = L l{f,<p;}l 2 $ ll/112.
i=O
Thus the series I::i l(f, cp;}i2 has partial sum llsnll2, which is an increasing
sequence, since the terms of the selies are > 0 and the selies is bounded above
by 11/112. Hence the series converges, with -;um $ 11/112. •
622
Chapter 10 Fourier Analysis
10.2.4 Parseval's Theorem Let V be an inner product space and cp0 , cp1, ...
an orthonormal system. Then <po, cp 1, ... is complete if! for each f E V, we have
00
11/112 =
L I(/, cpn} I~·
n=O
Proof Let Sn= 'E7=0U, cp;)cp;. In the proofofBessel's inequality, it was shown
that
If <po, <p1, ••• is complete, then Sn -+ f, and so II/ - snll 2 -+ 0. Therefore, letting
n -+ oo in llsnll 2 = I:7=0 l(f, cp;)l 2 gives 11/11 2 = 'E~ l(f, cp;)l 2 •
Conversely, if this relation holds, then llfll2 - llsnll2 -+ 0 as n -+ oo, and
so II! - snll2 -+ 0, that is, s,, -+ f, which means that
00
f=
I,:{f,cp;)cp;. •
j=()
10.2.5 Best Mean Approximation Theorem Let V be an inner product space and ipo, cp1, ... , 'Pn a set of orthonormal vectors in V. Then for each
set of numbers to, t1, .•. , tn,
Equality holds iff tk = {f,cpk)-
Proof Let
show that
Ck
=(!, 'Pt}, s,, = 'E7=0 c;cp;, and hn ='E7:0 t;cp;. It is required to
II/ - s,,112
with equality iff ft
$
II/ -
hnll2
= c1:,. To see this, we show that
n
llf- hnll2 = 11/112
n
n
- L lc,:12 + L lc1:. - t1:.l 2 = llf-snll2 + L lc1:. - tl,
which suffices to prove the theorem. To prove this equality, note that
Iii -
hnll 2
={f- hn,f- hn} =lf,f) -
{f,hn} - (hn,f} + (hn,hn},
Theorem Proofs for Chapter 10
623
First,
n
{hn,hn}
=L,{t;cp;,tjc,?j} =L,t;ljOij =L, lt;l2.
i,j
i,j
Second,
{f, hn)
(1, t
=
t1c<p1c)
i=O
t
=
k=O
c1c"fi,
k=O
Thus
n
n
n
ll/-hnll2 = 11/211- L,c1:"fi- Lt1cc1:+ L
k=O
as required.
n
lt1cf = 11/11
k=O
b:O
2-
n
L lctl2+ L
k=O
lc1c-ttl 2 ,
!-=O
•
10.3.1 Mean Completeness Theorem The exponential and trigonometric systems on [0,21r] (or [-1r,1r]) are complete in the space V = L2 of
functions f : (0, 21r] --+ C with
lf(x)l 2 dx < oo (the integral may be improper).
J;"
Proof 4 By our remarks in the text and Exercise I, §10.3, it suffices to consider the exponential case. Two necessary points are contained in the following
lemmas.
Lemma 1 (Stone-Weierstrass theorem in a special case) Let f : (0, 21r] --+ C
be continuous and let f(0) = /(21r) (periodicity). Then for any e > 0 there is
an n and there are constants c;, i = -n, ... , - I, 0, I, ... , n, such that if we form
the function
then
1/(x) - Pn(x)I < e
for all x E (0, 21r].
4 A proof due to Luxemburg and not relying on the Stone-Weierstrass theorem Is outlined in
Exercise 75 at chapter's end. Another proof, due to Lebesgue, is given in Exercise 76. Both proofs,
however, rely on the converse of Example 10.2.7 (see Exercise 14 at the end of the chapter), which
uses completeness of L2 , that Is, the Lebesgue integral.
624
Chapter JO Fourier ,Analysis
The Stone-Weierstrass theorem was proved in Chapter 5. See also Exercise 44b, Chapter 5.
-+ C be square integrable, and c > 0. Then there
is a continuous function g : [0, 211'] -+ C with g(0) = g(21r) such that
Lemma 2 let f: [0, 211']
{2,r
llf -
gll 2 = Jo
g(x)l2 dx < c.
lf<x> -
Proof First suppose that f ~ 0 and is bounded by M. Given c
partition P of [0, 21r] such that, setting h = f2,
('' h -
Jo
t
h(c;)(x;+ 1
-
x;)
> 0, choose a
< ~-
i=l
and make a similar estimate for f. We can, by drawing a graph composed of
straight lines, construct a continuous g such that g is constant = f(c;) on [y;, z;],
where [y;,z;] c [x;,X;+il, jy;-x;] < c/8M2n, and lx;+ 1 -z;j < c/8M2n, and such
that g is bounded by M. Then
{2...
(''
c
e
Jo lf-gl 2 dx= Jo (/2 +g2-2Jg)dx < 2 +4M2 • 8M2n •n =€,
J/
J
J
2 =
by adding and subtracting the approximations for
h and f and using
the definition of g. The details are left to the reader.
The general case may be dealt with by writing/ as/= r-1- (see Chapter 8),
so that we can assume f ~ 0. Form /M as in Chapter 8 and choose M large
2 < c / 4, which is possible by a corollary of the monotone
enough that f If - /M 1
convergence theorem (Chapter 8). Thus, we can choose g such that J lg-/Ml 2 <
c/4, and so J If - gl 2 < c, since
Ill - gll $ Ill - /MIi + Ilg - /MIi
$
I_; + I_; = ../e.
To prove the theorem from these lemmas requires two steps.
Step 1 Proof of the theorem for f continuous and periodic.
Let
II
s,, =
I: u. r.p1:)r.pk,
-n
eixk
where <fk(x) = r,c.
v21r
•
T},eorem Proofs for Chapter 10
625
For c > 0, we must show there is an N such that n ~ N implies Iii - snll < €. It
suffices to produce a single n, because, by Theorem 10.2.S, llf-sn+kll $ 11/-snllUsing Lemma 1, choose p,, so that 1/(x) - Pn(x)I < cf .,/iii, and form the
corresponding Sn. Now
Iii - Pnll2 = lo('1r
- lf(x) Thus
II/ -
p,,jj
< c.
p,,(x)j2 dx $
('1r ( ~ ) 2 dx = c2.
lo~
By Theorem tp.2.S,
Iii - s,,11
$
Iii - Pnll < €,
since the Fourier sedes gives the best mean approximation to f. This proves
Step 1.
Step 2 General case.
In view of Lemma 2 and Step 1, it suffices to prove the following. Here, V
is the space of square integrable functions, but the lemma is stated in general
ter.rns.
Lemma 3 let V be an inner product space and let <po, ip 1, ••• be an orthonormal family. Suppose f E V and f,, --+ f. If
00
fn =
L (/n, 'Pl:}'Pk
1-=0
for each n, then
1=
I: u. <p1:}<p1c.
l:=O
Proof Given€ > 0, choose N such that k ~ N implies
M such that n ~ M implies
n
LUN,'PJ'Pi - IN < ;.
j=O
II.ft -ill = € /3. Choose
Chapter JO Fourier Analysis
626
Using the tiiangle inequality,
n
n
n
LU,'Pi)'Pi -
t
~ "LU,'Pj}cpj- LUN,'Pj)'Pj
j=O
.i=O
j=O
n
L (/N, ,Pj},Pj - /N
+
+ ll!N - 111.
j=O
By Bessel's inequality, the first term is ~
which proves our assertion.
II/ - /NII• Thus n 2:'. M
implies
•
10.3.2 Pointwise Convergence Theorem Let f : [0, 21r] -+ JR (or
/ : [-1r, 1r] -+ JR) be sectionally continuous and have a jump discontinuity at
Xo, and assume that f'(x'ti) and J'(x0 ) both exist. Then the Fourier series of
f (in either exponential or trigonometric form) evaluated at xo converges to
[/(x<)) +/(x0 )]/2. In particular; if f is differentiable at Xo, the Fourier series of
f converges at Xo to f (xo).
Proof It is convenient to first prove the following special case:
Lemma 4 Let f : [-1r, 1r] -+ CC be square integrable and differentiable at
xo (as usual, extend/ so that it is periodic). Then the Fourier series off at xo
converges to /(Xo).
Proof (We follow a proof of P. Chernoff.) By translating and adding a constant, we can assume xo = 0 and f(x0 ) = 0 (why?). Define a new function g(x)
by setting
g(x)
=e'X((x)- 1 if x 'f- 0
and
g(x)
=f'~O)
l
if x
=0.
By the quotient rule of calculus, it follows that g is continuous at 0. Since
I/(eix - I) is bounded in absolute value outside a neighborhood of 0, it follows
that g is square integrable (why?).
Theorem Proofs for Chapter JO
627
Now f(x) = (eu: - l)g(x). Let c,,(/) be the nth Fourier coefficient off and
c,,(g) that for g. By definition,
Thus,
N
L c,,(f) = C-N-1(g) -
CN(g),
n=--N
since we have a telescoping sum. Since xo = 0, 'E~N c,,(f) is the Nth partial
sum at x = 0 of the Fourier series of/. But CN(g) -+ 0, by Bessel's inequality.
Hence 'E~N c,,(f) -+ 0 = f(xo). •
Actually, we do not need the fact that f is differentiable at xo. If f "is
Lipschitz" at xo (that is, if there is a constant M such that 1(/(x) - /(xo))/(x xo)I 5 M for Ix - xol < 6, x -,: xo), we could obtain the same result by a
similar proof (we only need g in the proof to be square integrable-or even just
integrable). For example, if/ is continuous and/'(x0) and/'(x0 ) exist, then this
condition is satisfied (why?).
To prove the theorem from Lemma 4 and the preceding remarks, consider
h(x)
={
/(Xo)
f(xo)
x
f(Xo)
X
X
< Xo
=Xo
> Xo.
Then /J is a step function and we can compute its Fourier series explicitly (as in
item l, Table 10.5.4; see Example 10.5.3). In particular, this se1ies converges
to [/(x0 ) +/(x0)]/2 at x0 • Now consider the function
k(x) = f(x) - h(x).
= =
=
Then k(xo) 0 k(x0) k(x0 ) and k'(x0), k'(x0 ) exist. Hence, by Lemma 4,
the Fourier series of k converges to O at x0 . Therefore, the Fourier series for/
converges to [/(x0) +/(x0 )]/2 at x0 . This proves the assertion. •
Now we tum to the longer, classical proof of Theorem 10.3.2. Later, it will
be convenient to have this longer proof at hand, despite the fact that it is more
complex than the one just given. First, we explain the basic idea behind the
proof. Let s,,(x) be the nth partial sum of the trigonometric Fourier series. We
will write
1
'>,r
s,,(x) =
0
f(OD,.(x - Od{
Chapter JO Fourier Analysis
628
for a function· Dn specified later (Lemma 9); we say that Sn is the convolution
off and Dn. We will prove that Dn has unit area and "concentrates" around O;
that is, it behaves like a Dirac delta function. As n -+ oo, the convolution will
pick off the value off at x. See Figure 10.P- l. For this reason, Dn is also called
:m approximate identity.
y
X
FIGURE 10.P-I
The graph of the approximate identity function Dn(X)
Before we can formalize these ideas, we need some preliminary results. The
first lemma is a generalization of Example 10.2.9.
Lemma 5 Riemann-Lebesgue LeIPma Suppose f is bounded and
(Riemann) integrable on [a,b]. Then
lim
o-+oo
lb
a
f(x)sin(ax)dx = 0
(where the limit is taken through all real a
> 0).
Proof First, suppose f is a constant M. Then
lb
j(x)sin axdx =
IMI
lb
sin axdx = IMI Icos(aa): cos(ab)I
51 2:1-+ 0 as
er-+ oo.
Thus the result is true if f is a constant.
For the general case, given e > 0 choose a partition P = {.to,x1 , ••• ,x,,} of
[a,b] such that U(J,P) - L(f,P) < e/2. Then
n
U(f,P)=LM;(x;-X;-1)
i=l
n
and
l(J,P)=Lm;(x;-X;- 1),
i:l
629
Theorem Proofs for Chapte,.. 10
where M; is the maximum of/ on [x;-1,X;] and m; is the minimum. Let m be
the step function equal tom; on ]x;-1,x;]. Choose N so that
l
b
m(x)sinaxdx =
L l:r'r· m;sinaxdK < ~
n
i=I
a
x;_I
if o: ~ N, which is possible because m; is constant and n is fixed and finite. By
the triangle inequality, for a ~ N,
11b f(x) sin ax
dKI s Ilb m(x) sin ax dxl + lib [/(x) <
where M
2€ + lb
a jM(x) -
m(x)I <U,
=M; on ]x;-1, x;]. (Here we have used jsin o:xl
0 and
l
b
a
m(x)] sin ax dxl
$ 1.) But M(x)- m(x) 2:'.:
[M(x) - m(x)] dx = U(f, P) - l(f, P)
€
< 2,
and so, for o > N,
Lemma 6 Suppose g : [0, a]
Then
.
11m
1:---+oo
Proof
-+ R is sectionally continuous and g' (O+) exists.
1° ()
o
1r (O+)
gx sin kxdx =2g .
X
Since
1
° g (x )sinkxdx
sinkxdx
- - - = g(0+)1a -- +
0
O
X
1n
O
X
g(x) - g(O+) sm
. kxd
-'-----"-x,
X
it suffices to show that
{°
lo
and
1
n
0
sin kxdx dK -+
x
~
as k -+ oo,
2
g(x) - g(O+) . kxd
0
----sm
x-+
X
ask-+ oo.
(I)
(2)
630
Chapter 10 Fourier Analyst!
To prove (1), note that
lo - -kx dx= lirao a
sin t d
t,
t
sin
X
which converges to 1r/2 ask-+ oo, since fo"°[(sint)/t]dt = 1r/2; see Example
8.5.6 an~ Exercise 29 at the end of this chapter.
To prove Equation (2), obse1ve that [g(x)-g(0•)]/xis bounded and integrable
(since, as x-+ 0, this approaches a limit g'(0•)). Therefore
l
a
g(x) - g(0+) . kxdx-+ 0
.a,___....;a__sm
X
0
as k
-+
oo, by Lemma 5.
•
Lemma 5 is needed for a real and an arbitrary interval [a, b]. Note that this
case does not follow from Example 10.2.9.
Lemma 7 let g be sectionally continuous on ]a, b[ and have a jump discontinuity at xo. Suppose g'(x0) and g'(x0 ) exist. Then
.
I1m
lb
k-+oo
Proof
a
g
()sink(x-xo)dx
X ---X- Xo
1r[g(x"t,)+g(xo)l
2
= ----~-.
Write the integral as a sum,
and note that
- xo)) d
1xo-a gxo-t-( )sin kt dt
x - - - - - x=
1xo g()sin[k(x
t
X-Xo
0
and
1b(
)sin[k(x - x0 )] d
g X ---- X=
xo
X-Xo
o
1b-xo g(xo O
)sin kt d
t.
t
I --
Now let k-+ oo and employ Lemma 6. We get 1rg(x0 )/2 and 1rg(x0)/2 for the
limit of these two integrals, respectively, as k -+ oo. •
Theorem Proofs for Chapter I 0
631
Lemma 8
Let f : [0,21r] -+ R
series off may be written as
I
Sn(X) = 21T
Proof
[2'"
lo
Then the nth partial sum of the Fourier
I
f(t) dt + ;
0
I: lo/2" f(t) cos k(t - x) dt.
n
0
k=I
This is clear if we remember that
cos[k(t - x)] = cos kt cos kx + sin kt sin kx.
Lemma 9
Let sn(x) be the nth partial sum of the Fourier series off. Then
1
(11r
Sn(x) = 21T lo
Proof
•
f(t)D,,(t - x) dt
where Dn(u) =
sin[(n + 1/2)u]
sin(u/ 2 ) .
This follows from Lemma 8 and the identity
~ iku_ 1 2 ~
L e - + L
k=-n
(Exercise 5, §10.2).
k _sin[(n+l/2)u]
cos u sin(u/2)
k=I
•
We are now ready to prove the pointwise convergence theorem. We must
show that
f(xo) +J(x"o)
( )
Sn Xo -+
Z
-+ oo. We shall assume that O < x0 < 21r. The reader is asked to consider
the cases xo = 0 and x0 = 21r separately. By Lemma 9,
as n
1 1'.!1r
Sn (XO ) = -
1T
o
( )sin[(n + 1/2)(1 -
Xo)] d
g t - - - - ' - - - - - t,
t-Xo
where
(r - xo)/2
g(x) = f(t) . [(
)/Z], Xo
sm
t-xo
i- t.
By Lemma 7 (which is applicable by Exercise 41 at the end of this chapter),
we have
Sn
( )
XO -+
g(Xo) + g(x0 )
z
.
Chapter JO Fourier Analysis
632
Now it is a simple matter to see that
g(x;) =f (x6) and g(x0 ) =/ (x0 ),
•·
and so the theorem is obtained.
10.4.2 Fejer's Theorem Assume that/ is piecewise continuous on [O, 271']
and that f(x0) and J(x0 ) exist. Then the Fourier series off converges (C, 1) at
Xo to [f(xo) + /(xo)J/2. If f is contiyYJOUS and /(0) = /(271'), then the Fourier
series converges (C, I) uniformly to fProof For notational -reasons it is slightly more convenient to use [-11", 11"]
instead of [O, 21r] for this proof; this does not aff1rct the conclusions. Consider
sn(x) =
ckeikx, the nth partial sum of the Fourier series of/. To discuss
(C, I) summability, we let
z::=-n
l
a-,,(x) = - -1
n+
L sk(x).
n
k=O
Using Lemma 9,
l
a-,,(x) = - 1
'
n+
L 2I 1,,. f(x - t)Dk(t)dt;
n
k=O
'Tr
-,r
that is,
a-,,(x) = -2I
7l'
1,,. /(x - t)Fn(t)dt,
-,r
where
I
II
F,,(t) = ~ Dk(I)
n+l~
k=O
is the Fejer kemeL We shall need the following lemmas.
1- sin 2 [(~ + l)r/ 2].
n+1
sin-[t/2]
Lemma 10 F,,(t) = -
Theorem Proofs for Chapter I 0
633
Proof By the fonnula for Dn (Lemma 9), we have
(n +
l)F ()
n I
=~ sin[(k + l/2)t] =
L-
sin(r/2)
k=O
'
1 1 {~ i(k+i;ii,}
sin(t/2) m L- e
k=O
l
{ . ei(n+l)t - I }
- ---Im e"/2 - - - - sin(t /2)
ei1 - I
1
== sin(t/2)
{
Im
I }
i(n+l)t -
eir/2 _ e-it/2
2
l - cos[(n + l)t]
sin [1/2(n + l)t]
==--~--=--'-,c----.
2
2 sin (r/2)
sin2 (t/2)
Lemma 11 The Fejer kernel has the following properties:
i.
Fn(t) is 21r-periodic.
ii.
-2I
11"
lir
Fn(t)dt = 1.
-,r
~
iii.
Fn(t)
iv.
For each fixed 6
0.
> 0, limn-oo f.1::;J,JS~ Fn(t)dt = 0.
Proof i and ii follow from the definition of Fn; iii follows from Lemma 10.
iv: For 6
:5 ltl :5
:5 l / (sin2 6/2). Hence
1r
we have 1/ (sin 2 r/2)
0
<
F (t) < --=-n
- n + I sin2 (6 /2)'
I
I
6<
-
ltl <- 1r ·
Since this converges to 0 uniformly as n-+oo, the integral
f6 sJ,JS1rFn(t)dt-+O.
•
Let us now prove Fejer's theorem. Using the same technique that was used
in the proof of the pointwise convergence theorem (see the arguments following
Lemma 4), it suffices to prove the last pmt of the theorem. Thus, assume thatf
is continuous. We have
u,,(x) = -2I
7r
1"'
f(x - t)F,,(t)dt.
-,r
Hence, by ii of Lemma 11,
f(x) - u,.(x) = -2I
7r
1"'
-,r
(f(x) - f(x - r))Fn,(t)dt.
Chapter JO Fourier Analysis
634
Accordingly, by iii of Lemma 11 (positivity of Fn),
lf(x) - O'n(x)I :5 2I11" J:r
- , r 1/(x) - f(x - t)IFn(t)dt.
Given e > 0, we can use uniform continuity of f to find 6
lf(x) - /{y)j $ e if Ix - YI $ 6. Then
1/(x) - o-,.(x)I $ -21 {
J,,,$6
1l"
0 such that
lf(x) - f(x - t)IFn(t)dt
1 {
+ -2
1r
>
}6$ltl$:r
lf(x) - f(x - t)IFn(I) dt.
The first integral on the right-hand side is bounded above by
1
-2I
1r
1119
eFn(t)dt :5 -2I
J,r eF (t)dt = e.
11
1r
-,r
The second integral is bounded by
-21
1r
1
2MF,.(t)dt = M
6$1il$ir
1r
1
Fn(t)dt,
6$l1l$1r
where M = sup, 1/(t)j. By property iv of Lemma 11, we may choose N so that if
~ N, this last integral is :5 e. Thus, if n ~ N, lf(x) - O'n(x)I $ e + e = 2e.
•
n
One can prove that the Cesaro sums converge to an integrable function except
possibly on a set of measure zero (see Hewitt and Stromberg, Real and Abstract
Analysis, Springer-Verlag, p. 294).
10.5.1 Integration Theorem
Fourier series
Suppose
J:::r l/(x)l 2 dx < oo
00
0;
~ L)a,. cos nx + bn sin nx).
n=l
Then, letting g(x) = J~ 1J(y)dy, we have
g(x)= ao(\+rr) +~(an
=
ao(x + rr)
2
00
{
1-:
a,. .
cosnydy+bn
bn
1:,r sinnydy)
+~ -;smnx+-;«-It-cosnx)
and the convergence is uniform for -rr $ x :5 rr.
}
and f has
Theorem Proofs for Chapter I 0
Proof
We prepare the following lemma.
Lemma 12 Suppose fn : [a, b]
fn
-+
635
-+
R. are such tha,t
f in mean. Let
gn(X)
Then gn
-+
Proof
=l~f,,(y)dy
and
g(x)
l"
=
J; l.fn(x)j2 dx < oo and
f(y)dy.
g uniformly on [a, b].
By the Cauchy-Schwarz inequality,
l8n(X) - g(x)l 2 $
$
(1x
(1x l.fn(y) - f(y)l
lfn(Y) - f(y)j dy)
from which the result is obvious.
2
2
dy) (x - a) $ II!,, - fll2(b - a),
•
For the theorem, let s,,(x) be the nth partial sum of the Fourier series and
taJcef,, = Sn in the lemma. We know thatfn -+fin mean (Theorem 10.3.1), and
so gn -+ g unifonnly. Here 8n is the partial sum of the integrated Fourier series,
and so we have the result. •
10.5.2 Gibbs' Phenomenon Consider
f(x) = {
a -'Ir$
X
<0
b 0:5x51r,
and suppose a < b. Let s,.(x) be the nth partial sum of the trigonometric Fourier
series. Then the maximum of Sn occurs at 1r /2n and the minimum at -1r /2n and
n~'! sn (:) = ( b ;
a) (
¾11r si~ 1 dt + 1) + b :::: (b - a)(0.089) + b.
Similarly,
n~~ Sn ( -
:)
= ( b;
a) (-¾ 11r si; t dt + 1) + a: : a- (b - a)(0.089),
and the difference of these limits .is
a) (-;Jo
4 r -t-dt+l
sin t
)
( -b -2 -
::::(b-a)(l.179).
Chapter JO Fourier Analysis
636
Proof Let us first prove this for the special case a =- I, b =1. We have seen
that the Fourier series of
-1
g(x) = { 1
is
Let
i
f
71'
n=l
-71' < X < 0
0 $ ; $ rr
sin[(2n - l)x].
2n - 1
_i ~ sin[(2k Sn(x) - 71'
~
l)x]
2k - 1
·
J.-=1
By differentiating, we see that s,. has its maximum at Xn = 71' /2n (some details
here are left to the reader). The value at this maximum is
s,. (~)
=~
2n
1r
t
k=l
sin[(2k - 1)71' /2n]
2k - l
=~
71'
t
k=l
sin[(2k - 1)1r /2n] (~) .
(2k -
1)1r
/2n
n
This sum is a Riemann sum for the function (siny)/y on [O, 71'] with partition
{O, 71'/n, 271' /n, ...• 71' }. Hence, ifwe choose n even and let n-+ oo, this converges
(see Chapter 8)
to
211rsinyd
y.
71' 0 y
-
The case of the minimum off for x < 0 holds, since f and s,, are both odd.
The numetical value of the integral is approximately 1.179 and is computed
by numerical methods, as readers can verify on their calculators or home computers.
The general case for f follows by observing that its Fourier series has nth
partial sum (1/2)(b - a)(s,. +I)+ a (why?). •
10.6.1 Uniform Convergence Theorem
Suppose f is continuous on
(-71'. 71' ]. f(-71') = f(11' ), and f' is sectionally continuous with jump discontinuities. Then the (trigonometric or exponential) Fourier series off converges to f
absolutely and uniformly. A similar statement holds for f on [O, 271'].
Proof We can wtite
f(x) =
~+
co
L (a,. cos nx + bn sin nx).
n=l
Theorem Proofs for Chapter JO
637
by Theorem 10.3.2. The Fourier coefficients off' are
I
a,. = 11"
j,.. J'(x) cos nx dx,
I
/311 = -
11"
-,r
j,.. /'(x)sinnxdx,
-,r
and we have
n
I,..
a,.= -f(x)cosnx
11"
-,r
+ -n
11"
j,.. f(x)sinnxdx = nb,.,
-,r
since we can integrate by parts and since/(1r) =/(-1r). Similarly, /311 = -na,..
Some care is needed to justify integration by patts, since f' exists only in
sections. But if integration by parts is applied on each section using the fact
that f' has jump discontinuities only, and noting the continuity off, then we get
the results we obtained here.
Lemma 13 Under the conditions of Theorem 10.6.1, we have
00
00
n=I
n=I
L la,.I < oo, L lbnl < oo,
and na,.
-+
0, nb,.
-+
0.
Proof We know that I::1 /3';, converges (by Bessel's inequality for/'). Lettings,, = I:~': 1 lakl,
by the Schwarz inequality. Since this is bounded, so is s,., and therefore it
converges (an increasing sequence converges iff it is bounded). Thus I::1 la11 I
converges. Since /311 -+ 0, we also have na 11 -+ 0. The case of the b,. 's is
similar.
•
It suffices to show that ao/2+ I::1(a,. cos nx+b,. sin nx) converges uniformly,
since the limit must be f(x). It is enough, in tum, to show that I::1 (a,. cos nx +
b,. sin nx) is unifmmly convergent. However,
la,. cos nx + b,. sin nxl ~ la,. I + lb,. I = M,.,
lemma, I:;;1 M,. converges. Hence, by the Weierstrass M test,
and so, by the
the series converges unifo1mly and absolutely.
•
Chapter 10 Fourier Analysis
638
10.6.2 Differentiation Theorem Let f be continuous on [-1r,1r], let
/(-1r) = /(1r), and let f' be sectionally continuous with jump discontinuities.
Suppose/" exists at x E [-1r, 1r]. Then the Fourier series for
00
f (x) = 0; +
L (a,. cos nx + bn sin nx)
n=I
may be differentiated term by term at x to give
00
f' (x) =
L (-nan sin nx + nbn cos nx).
n=I
Furthermore, this is the Fourier series off'.
Proof
The proof of Theorem 10.6.1 showed that the Fourier coefficients of
f' are given by
a,,
=nb,.,
/3
11
=-nan,
This remark suffices to prove the theorem, since if/" exists, f' (x) will be the
sum of Fourier series.
•
10.7.1 Theorem In the initial-displacement problem, suppose that/ is twice
differentiable. Then the solution to the initial-displacement problem is
y(x, t)
=21 [/(x -
mrct
=Lb,. sm -mrx
1- cos - 1-,
00
ct)+ f(x + ct)]
•
n=I
where the b,. are the half-interval sine coefficients,
2
b,. = 1
Jo[1 J(x) sm. -n1rx
1- dx,
and f is extended to be odd periodic. (Twice differentiable means we are assuming that the extended/ is twice differentiable.) (See Figure 10.7-2.)
1::
Proof
The series for y(x, t) converges because
1 bn sin(n1rx/l) converges
uniformly and absolutely to f (Theorem 10.6.1). Let us show that
00
•
n1rx
n1rct l
Lb,, sm - 1- cos - 1- = 2[J(x - ct)+ f(x + ct)].
n=I
Theorem Proofs for Chapter I 0
639
For this, note that
. mrx
mrct
. mr(x - ct)
. mr(x + ct)
2 sm+sm
,
1
1-cos-1-=sm
1
so that
~ b . mrx
L..J
n=I
n
n1rct _ I ~ b . n1r(x - ct) ~ b . mr(x + ct)
sm - 1- cos - 1- - 2 ~ n sm
l
+ ~ n sm
l
n=I
=
n=I
I
2[f(x -
ct)+ f(x + ct)].
Now we verify that
.
I
y(x, t) = ~[f(x - ct)+ f(x + ct)]
satisfies all the conditions. First,
cPy =
8t2
82Y.
!c2c1"<x
- ct+ J"<x +ct)]= c2 ax
2
2
Second, at t = 0, y(x, 0) = J(x) and
at (x,0) =2I c[-/'(x) +j'(x) +j'(x)] =0.
8y
Third, y(O, t) = ½[f(-ct) +/(ct)] = 0, because f is odd (when extended); and
.
y(l, t)
=2I [J(l -
ct)+ J(l + ct)]
=0,
because f(l - ct) = -f(ct - l) = -f(ct + l), since f(x) = f(x + 21) by periodicity.
•
10.7.2 Theorem .If f is square integrable, then, for each t > 0,
T(x t) = -ao +
'
2
L a e-n ,.. 11 cos -n1rx
00
2 2
1
2
n
n=I
l
converges uniformly, is differentiable, and satisfies the heat equation and boundary conditions. At t = 0, it equals f in the sense of convergence in the mean,
and pointwise if f is of class C1• As usual,
On
=
21
I
1
O
f(x) cos -n1rx
1- dx.
Chapter JO Fourier Analysis
640
Proof To show that T(x, t) satisfies the heat equation, what we must do is
justify tenn-by-term differentiation in both x and t. For this we use Theorem
S.4.3. What we must show is that the series of derivatives
(which represents both oT/ot and o2 T/ox2) converges uniformly int and in x,
which we do by the Weierstrass M test. Since lanl is bounded (an -+ 0, in fact),
we can omit the tenns On1r 2 JP. Now in x, let Mn = n 2 e-n2 1r 21112 • By the ratio
test, L Mn < oo, and so the series converges uniformly in x.
Uniformly in t means uniformly for all t ~ e, where e > 0 is arbitrary but
.
,,
22;12
'I;""
fixed. In this case we let Mn = n-e-n 1r e and note that LJ Mn converges. (We
cannot allow t = 0.) The rest of the theorem is obvious. •
10.7.3 Theorem In Theorem 10.7.2,
lim T(x, t) = j(x)
1-0,r>O
in the sense of convergence in mean, and, converges uniformly (and pointwise)
if f is continuous, with J' sectionally continuous. More generally, for any f, if
the Fourier series off converges at x to j(x), then T(x,t)-+ f(x) as t-+ 0.
Proof For the first part, it will suffice to show the following.
Lemma 14 For each t > 0, suppose J, E V, an inner product space, and
cpo, cp1, ... is 'a complete ortlwnormal basis. Let
00
/, =L Cn(t)'Pn,
n=I
If
00
f
=L Cn'Pn•
n=l
00
lim ~ lcn(t) - cnl 2 = 0,
1-+0 ~
n=I
then f,
-+
f (in mean).
Proof The result follows from Parseval's relation
IIJ,-!112 =
z:: lcn(t)-cnl
1
2•
•
Theorem Proofs for Chapter I 0
641
In the case of Theorem 10.7.3, we must show that
00
lim ~ lanl20
,-o~
- e-n2,..2,1,2>2 = o.
n=l
E:
2 21 12
To do this, it is enough to show that the function g(t) =
1 lanl 2 (1-e-n 1r l )2
is continuous int, since g(0) = 0. To show that g(t) is continuous, we shall show
that the series converges unifonnly in t. To do this, Abel's test will be used.
The fonn we need is the following:
E:
Lemma 15 Let
1 Cn be a convergent series and 'Pn(t) a uniformly bounded, decreasing (respectively, increasing) sequence defined fort~ 0. Then g(t) =
1 Cn<,?n(t) converges uniformly in t. In particular. g is continuous and g(O) =
lim, .....o g(t).
E:
See Theorem 5.9.l for the proof. One deduces the increasing case from the
decreasing case by considering -g(t), instead of g(t). In our case Cn = lanl 2and
•
'Pn(t) = (1 - e-n l ,.. 21/12 )-? • Now 'Pn :5 cp,,. 1f
n :5 m, and I'Pn(t) I :5 I. Thus, from
the lemma and the fact that L Cn converges, we have our result.
Now suppose/' is sectionally continuous. From the proof of Theorem 10.6.1,
1 lonl < oo. Thus, for a given x,
E:
00
lf(x) - T(x,t)I $
L lanlO - e-n
2 211
,..
'\
n=l
By an argument like the preceding, the series on the light converges unifonnly,
and so we can let t -+ 0 in each term to conclude that T(x, t) -+ f(x) as t -+ 0.
Indeed, note that the convergence is uniform in x because we have the bound
2 21
which approaches O as t -+ 0 and is independent of x.
1 lanlO - e-n ,.. l 1\
Finally, suppose
On cos(mrx/1) converges for some fixed x. Then we
wish to show that (for this x fixed)
E:
E:,
00
•
•
~
2 2 1,
mrx
1
= ,hm
,hmg(t)
.....o
.....o ~ a,.e-n,.. cos - 1- = 0.
n=I
Here we cannot make the same estimate, because the factor cos(mrx/[) is needed
for L On cos(mrx/ l) to converge. However, Lemma 15 can be applied with
Cn On cos(mrx/ I) and cpn(t) e_,,i,..i,;t2 to yield the conclusion, since the 'Pn
are decreasing and are bounded by I.
=
=
•
From this proof we also conclude that
lim T(x, t) = T(x, to);
1--+lo
642
Chapter JO Fourier Analysis
that is, Tis continuous in r, in each of the three cases of Theorem 10.7.3. Indeed,
we already know that for r > 0, T(x, t) is differentiable and hence continuous.
However, T(x, t) may not be differentiable at r = 0, but the proof just given does
show that we have continuity at r = 0.
These methods using Abel's and Ditichlet's tests are important for establishing convergence in other problems (such as Laplace's equation), as we shall see
in the next proof.
10.7.4 Theorem
i.
Given g1, let rp(x,y) be defined by
_ ~b
rp (x, y ) - L
n=I
. h mr(b- y) sin(mrx/a)
.
/ ,
a
smh(mrb a)
,, sm
(l 2 )
Suppose g1 is of class C 2 and g 1(0) = g1 (a) = 0. Then rp converges
uniformly, and is the solution to the Dirichlet problem with f 1 = /2 = g2 = 0,
and is continuous on the whole square, and V 2 ¢ = 0 on the interior.
ii.
If each of /1 ,/2, g 1, g2 is of class C 2 and vanishes at the corners of the
rectangle, then the solution rp(x,y) is the sum of four series like Equation
( 12), V 2 rp = 0 on the interior. and V is continuous on the wlwle rectangle
and assumes the given boundary values. Furthermore, rp is COO on the
interior.
iii.
If /1 ,h, g1, g2 are only square integrable, then the series for rp converges
on the interior. V2 rp = 0, and rp is COO. Also, <p takes on the boundary
values in the sense of convergence in mean. This means, for example, that
limy-O rp(x,y) rp(x, 0) g1 (x) with convergence in mean.
=
=
= =
Proof For simplicity, let us take the case a b 1r, the general case being
obtained by a change of coordinates. To prove parts i and ii of the theorem, we
show that rp(x,y) converges uniformly in x and y and that we can differentiate
twice, tenn by term, on the interior. In view of the preceding remarks, this
suffices to prove the theorem. Part ii is a consequence of i and linearity; the
boundary values are assumed because g 1 is represented by its Fourier series.
By Theorems 10.6.1 and 10.6.2,
00
00
g1(x)
=Lb,, sinnx,
11=!
g~ (x) =
L nb,, cos nx,
n=I
Theorem Proofs for Chapter 10
643
and these series converge uniformly and absolutely. Here we use the fact that
g1 (11") 0.
To show that <p converges unifonnly we use Abel's test, as in the proof of
Theorem 10.7.3, on the square [0, 1r] x [O, 1r]. Thus we must show, since the
series for g 1 converges unifonnly, that l{)n = (sin n(1r - y))/ sin n1r is decreasing
with n and these functions are unifonnly bounded. If we can show that they are
decreasing, then unifonn boundedness follows, because O $ l{)n(Y) $ cp1(y) and
cp 1 is bounded, since it is continuous. In fact, cp 1 :5 1 in this case.
To show that l{)n+I :5 <p,., fix y and consider the function t/J(t) = (sinh t(1r y))/ sinh t1r, t > 0. It suffices to show that tf;'(t) :5 0, for then tf; decreases as
t increases and, in particular, 'lf;(n + 1) :5 t/J(n). This is a special case of the
following lemma.
g1 (0)
=
=
Lemma 16 For constants a, /3 satisfying fJ > 0, (3 2::: a, and letting 1/J(t) =
sinh(at)/ sinh((Jt), 1/J'(t) :5 0 fort? 0.
Proof
sinh2 ((3t) 1/J' (t) = a: sinh(/Jt) cosh(M) - /J sinh(at) cosh(/3t)
= _ [J 2 - a 2 [sinh[(a + fJ)t] _ sinh[(/J - a)r]] ,
a+/3
2
/3-a
using the identity sinh(u+v) = sinh u cosh v+sinh vcoshu. If the tenn in brackets
is ? 0, we are finished, since [J 2 - a 2 > 0. This is in fact true. To see it, let
p
(t) = sinh[(a + ,B)r] _ sinh[(/3 - a:)t].
a+[J
[J-a
Now p(0) = 0 and p'(t) = sinh[(a· + /J)r] - sinh[(fJ - a)t] ? 0, since sinh is
increasing. Hence p(r) ? 0 for all r ? 0.
•
This establishes the first part of the proof, which says that the series for
ip(x,y) converges unifonnly. For the differentiability part, we must show that
00
"'""' .,
•
.
.X(x,y) = L, n-b,. smhn(1r - y)
n=I
-
smnx
)
sm n1r
. h(
converges uniformly (this is the second formal y derivative; the second x derivative is its negative).
It is important to realize that we can get uniform convergence only if we
stay away from the boundary; in fact, for any i > 0 we shall establish unifonn
Chapter 10 Fourier Analysis
644
convergence on O < e $ y $ 1r and x arbitrary. With this extra restriction, the
delicacy of Abel's test is no longer needed; the Weierstrass M test will do. We
have lbnl $ M. Let
_ 2Msinh[n(1r - e)]
Mn-n
.
sinh(1rn)
Then Mn bounds the terms in,\. But 2 sinh[n(1r-e)]
e"n(l - e"n), from the definition of sinh. Thus
< e"(n-c) and 2
sinh(n1r) ~
Since e > 0, L Mn converges, and so we have unifonn convergence.
Note that we could use nk instead of n 2 for any k and still have convergence;
in fact, we can differentiate any number of times; that is, cp is COO (a little
thought shows that cp is analytic-see Example 6.8.7).
The proof of part iii is routine. To show that V 2cp = 0 and cp is a COO
function on the interior, the proof is similar to the preceding (all that was used
was that the bn are bounded). For convergence in mean, we proceed as in the
proof of Theorem 10.7.3, using Lemma 15.
Worked Examples for Chapter 10
Example 10.1 Let f : [O, 1r] _. C be a continuous .function. Prove that the
following inequality holds:
111' f(x)
2
sinxdxl + ... +
11,,. f(x)
sinnxdxr $
i 1,,. l/(x)l dx.
2
Solution This follows from Bessel's inequality applied to the following (incomplete) 011honormal system on [0, 1r]:
.,/2j;-
sinx, ... , ,/2r; sinnx.
If we used an infinite sum, we would have equality by Parseval's theorem (see
Table 10.5-3).
•
Example 10.2 Show that iff,,
-+
f (in mean) and 8n _. g (in mean) in an
inner product space, then
{fn,gn)-+ (/,g}.
Worked Examples for Chapter 10
645
Solution First, make an estimate using the Schwarz inequality and the triangle inequality:
IUn,gn) - (f,g)I 5 l(fn,gn) - (fn,g)I + l(fn,g)- (f,g)I
= l(fn,8n - g)I + IUn - /,g}I
5 ll!nll llgn - gll +llfn - 111 llgll.
The result follows from this. Given e > 0, choose N so that n
each of the following estimates:
N implies
e
2.
llfn - !II< 211811;
llfn - !II ~ I; and
3.
ll8n - gll < 2(11/11 + l)"
1.
~
e
(Why is this possible?)
Then 11/n - /II ~ 1 implies
l(fn,gn} -
(f,g}I
-+
(why?), and for n ~ N,
~{II/II+ I)llgn - gll +llfn - /11 llgll
5
This proves that (/n,8n}
llfnll 5 II/II+ I
c11111 + 1)~<11111 + 1)-l +
~:: l
= e.
•
(f,g}.
Example 10.3 Let f : [0, 271"]
-+ lR. be square integrable and de.fine g(x) =
Jtf(y)dy. Find the Fourier expansion of g and state where it is valid.
Solution As we did in Theorem 10.5.1, we can integrate the Fourier series
co
f(y)=; + L(ancosny+bnsinny)
n=I
term by te1m from y = 0 to y = x to obtain
g(x) =
1
x
o
ao
f(y)dy = 2 x+
. -(l
bn - cosnx)) .
L (an-smnx+
n
n
co
n=I
E:
From Table 10:5-4 we know that x = 1r - 2
1(1 / n) sin nx for x in [O_, 21r]
converging in mean and pointwise for x not equal to O or 21r. Also, the Cauchy-
Chapter 10 Fourier Analysis
646
Schwarz inequality shows that "L,(bn/n) converges since L, b; and "L,(l/n)2 both
do. Putting these together, we can rewrite the last series as a Fourier series:
ao71'
g(z) = 2 +
L -;;bn + L
00
00
!-=I
k=l
(an - ao .
bn
)
n
sm nx - -;; cos nx .
This converges in mean and pointwise for x not equal to O or 271'. Since the
Fourier coefficients are unique, this is the expansion of g(x).
Since g is bounded by
If(x)I dx, which is finite, g is square integrable, and
so it has a Fourier series. Note that g is continuous but need not be differentiable
(see Exercise 61 ).
•
J:".
Example 10.4 Let f : [O, 271']
-+ IR, g : [0, 271']
icity. Define the convolutio11 off and g by
(/ * g)(x) = 1''.,, f(y)g(x -
-+
IR, and extend by period-
y) dy.
Compute the Fourier series off* g in terms of the series off and g, using the
exponential form of Fourier series.
Solution The Fourier coefficients off* g are given by
Cn = 2~
12.,,(/ * g)(x)e-inx dx = 2~ 121r 121r f(y)g(x - y)e-inx dydx
= _I [21r [21r f(y)e-inyg(x - y)e-in(x-y) dydx,
271' lo
Jo
since e-i,ix = e-iny .e-illT+iny_ Changing variables (y 1-+ y and x-y 1-+ t) and using
periodicity (we may interchange the order of integration by Fubini's theorem,
§9.2) leads to
_!._ [21r f(y)e-iny dy [21r g(t)e-inr dt.
=r,:_
Letf(x)
g, so that
an
271' lo
aneinx
and g(x)
00
=r,:_
=2~ fo 21r f(x)e-illr dx
and
lo
00
bneinx be the Fourier series off and
bn
=2~ fo 2
1r
g(x)e-inx dx.
Then the previous computation shows that we have
00
Cn
=27l'anbn,
and thus
(/ * g)(x)
=2tr L anbninx_
-oo
647
Worked Examples for Chapter 10
A similar operation could be done with the trigonometric series, although
the computations and the results are more awkward. Sufficient conditions on f
and g for the preceding calculations to be valid are that f apd g be sectionally
continuous, or, more generally, square integrable. If we want the series to
converge pointwise, we must add the hypotheses of the pointwise convergence
•
theorem (10.3.2).
Example 10.5 Consider f(x) = cos .u, -,r
$
x $
1r,
where .,\ is real and
non-integral. Compute the Fourier series off.
Solution
Cn
= -I
211'
j" cos .,\xe-mx. dx = - j" (e'""".,_ + e-''")e-•nx
.,_ . dx
I
4,r
-11"
I [ ei<.>.-n)r
e-i(.>.+11)r ] ,,.
=- - - - + - - 4,r i(.,\ - n)
-11"
-i(.,\ + n)
=-1r
(-1)" 2 . . .,\ ( I
I )
(- It . .,\
= 41ri . 'sm ,r .,\ - n + .,\ + n = ~ sm ,r .
2.,\
.,\2 - n2.
By the pointwise convergence theorem, the Fourier series converges to /(x) at
all points. Hence, for lxl $ 1r,
sin,r.,\ ~
t
2.,\
;1:x
cos.,\x = ~ ~ (-1) .,\2 _ k2 e .
l-=-oo
Formally, the solution is complete, but let's go on. If we set x = 1r, we get
, =sin1r.,\
~ - l ).,\2t- ~
-l)k =sin1r.,\ {.!. ~ 2.,\ }
21r ~ (
k2 (
>, + ~ >,2 - k2 .
cos ,r"
7r
k=-oo
k=I
Hence
1r
cos1r.,\
sin 1r .,\
1
-
l
~
= ~ .,\ 2
2>.
- k2'
.,\ ,- integer.
l-=1
Note that the series on the right converges unifonnly for O $ .,\ $ .,\0 < I.
Note also that ,r(cos 1r .,\/ sin 1r .,\) - (I/>.) --+ 0 as >. -+ 0, and so is Riemann
integrable. By integrating,
sin,r.,\)
Loo lo 0
Joo., ( ,r.,\- =
o
k=I
(1-
.,\2) •
-k2
l>-1 <
1.
Chapter JO Fourier Analysis
648
Exponentiating,
sin 71" >. = rroo ( I 71"
>.
>.
2) .'
k2
i.e.,
sin 1r >. = 1r >.
k=I
IT
00
(
>.2)
1 - k2
,
l>-1 <
1.
k=I
The product on the tight defines a function of,\ of period 2 (see Exercise 74),
as does the left side, and so this fonnula holds for all real values of >.. This
product fonnula for the sine was discovered (though not rigorously proved) by
Euler. (For another method of proof, see J. Marsden and M. Hoffman, Basic
Complex Analysis, 2d ed., Chap. 7.)
If we take >. = 1/2, then
l
=~
2
rr
00
(
~I
l __
I ) =~
4k2
2
IT (2k (2k)
- 1)(2k + I)
(2k)
'
00
~I
and so
71"
00
(2k) (2k)
.
1)(2k + 1); i.e.,
2 = IT (2k k=I
71"
(2 · 2)(4 · 4)(6 • 6) · .. (2n · 2n)
•
- = hm - - - - - - - - - - - - - - ,
2 n-oo (1 · 3)(3 · 5)(5 · 7) · · · ((2n - 1) · (2n + 1))
which is called Wallis' product formula for 1r /2.
•
Example 10.6 An interesting application of the Parseval relation to the
isoperimetric problem is as follows: Show that among all plane curves of a
given perimeter. the largest area is enclosed by the circle.
Solution
Let (x(t), y(t)), 0 $ t $ 21r, be a parametric representation of a
simple closed curve. We assume that x(t), y(t) are C1 functions oft, and that
the parameter t is arc length. Thus the total length is 21r, and .x(t)2 + y(t)2 = 1,
and so J;1r (.x(t)2 + y(t)2) dt = 21r. The enclosed area is
A=
fo 1r x(t)y(t) dt.
2
(fhis is a standard calculus fo1mula; see, for example, Marsden and Tromba,
Vector Calculus, 3d ed., Chap. 7 .) We claim that A $ 1r, and that A = 1r only
if the curve is a circle. To prove this, we express A in tenns of the Fourier
coefficients of x and y. Wtite
00
x(t) =
~ + L(ak cos kt+ ht sin kt)
k=I
649
Worked Examples for Chapter I 0
and
00
y(t) = ~o +
L(
D'k
cos kt + fJk sin kt).
k=I
All coefficients are real, and so, by a change of origin in the plane, we may
assume ao ao 0.
The Fourier series of the derivatives i(t), y(t) are
= =
00
i(t) =
L (kht cos kt + ba" sin kt)
k=I
and
00
Y(t)
=
L (kf3k cos kt+ ba" sin kt).
k=I
By Parseval's relation,
and the area is
Hence
k=I
00
~.,
')
.,
?
,,
= 1r L., (k- - k)(a'i: +bi:+ a*+ /3i)
k=I
00
+
1r
L k{ (ak -
/3t)2 + (ak + bd }.
L-=1
Thus
1r -
i.
ak
A ~ 0, and
1r -
A = 0 if and only if
=bk =ak =f)k =0 for k ~ 2;
In this case,
x(t)
=a 1 cos
t +b 1 sin t
and
y(t)
=-b 1 cost+ a 1 sin t =-x(t + 1r /2).
Chapter 10 Fourier Analysis
650
Equivalently, for some Rand 6,
x(t) = R cos(t + 6)
and y(t) = R sin(t + 6).
The condition x2+y2 = l implies R = I, and therefore we have a circle of radius 1.
•
Exercises for Chapter 10
1.
Let V be an inner product space and M c V a vector subspace. Define
the orthogonal complement of M by
M.1.. ={/EV I (f,g} = 0 for all g EM}.
Show that M.1.. is a vector subspace of Vandis closed (that is, if/n E M.1..
andfn -+ f (in mean), then/ E M.1..).
2.
Prove that the Legendre polynomials (see §10.2) are complete in L2 of
[-1, 1]. [Hint: First show that any polynomial can be expanded in Legendre polynomials, and then employ the method of proof of the mean
completeness theorem.]
3.
a.
Use the Fourier series for eX on [-71', 71'] to prove the following identity:
00
I
l
-2(1rcoth 7r - 1) =
Z::-2-1
·
n +
I
b.
Use the half-interval cosine series for cos ax where a is not an integer
to prove the following identity:
1l'Cot1ra
I
=- +
a
Z::-a 2a-n
oo
2- 2.
I
4.
Prove that if In-+ f (in mean), then 11/nll-+ 11111- Is the converse true?
5.
Prove that uniform convergence implies mean convergence (on finite intervals).
6.
Consider the space h of all sequences x = (x1,x2, .. . ) of real numbers
< oo. Show that '2 is an inner product space with {x,y} =
with
1
1 x;y;. In addition, show that this space is complete (by showing that
all Cauchy sequences converge).
r;:
r;: r;
Exercises for Chapter 10
7.
651
Let
ltxr<oo},
l2={(x1,X2,,,.)
,
1-1
which, by Exercise 6, is a Hilbert space. Let 1Pn E '2, n = 1, 2, ... , be
defined as 'f'n = (0, 0, ... , 1, 0, ... ) with the I in the nth spot. Show in two
ways that 'Pl, cp2, •.. is a complete orthonormal set:
a.
Directly,
b.
Using Parseval's theorem (see also Exercise 14c).
8.
Find functions fn and f on [-1, I] such that fn
mean.
9.
Verify that the sectionally continuous functions f : [a, b]
vector space.
10.
Find a sequence fn of sectionally continuous functions with fn -+ f pointwise (respectively, in mean) such that/ is not sectionally continuous.
11.
Prove that if/: [a,b]-+ C is sectionally continuous, then so is
also that IfI is bounded.
I/I. Show
12.
Prove that if f and g are square integrable on [a, b], then
f = g except on a set of measure zero.
II/ -
gj I = 0 iff
13.
In the proof of Bessel's inequality, we showed that
2
-+
n
= 11/112
-
f (pointwise) but not in
-+
C form a
n
L l(/,'f'k)l + L 1(/,'f'k) - tkl
2
2-
k::O
Use this equality with IS: = (f, cpk) to give another proof of Bessel's inequality.
14.
Suppose V is a Hilbe11 space (that is, is a complete inner product space).
Let 'Po, 'Pl, ... be an 011honormal set in V.
a.
For each/EV, show that
n
Sn
=
L (/,
Cf)k}Cf)k
k=O
converges to some element of V. Hint: Show that
m
llsn - smf =
L l(f, <p;)l2.
i=n+l
and use Bessel's inequality to show that sn is a Cauchy sequence.
Chapter 10 Fourier Analysis
652
b.
For eachf EV, show that, ifs= :E:(J, then <p;)rp;, thenf - sis
orthogonal to each cp; andf - sis orthogonal to s.
c.
Show that if, whenever f is orthogonal to each rp;, we have f = 0,
then rp 0 , cp 1 , ••• is complete. [Hint: By b, f - s is orthogonal to each
<p;, and so f = s.]
15.
Show that in V of Theorem 10.1.2, we do not have the Bolzano-Weierstrass
theorem; that is, in a closed bounded set, a sequence need not have any
convergent subsequences.
16.
Compute the Fomier seties of /(x) = (x + x2)/2, 0 ~ x ~ 21r.
17.
a.
Let <po, rp 1 , ••• be orthononnal vectors in an inner product space V.
If
00
LCk<{)lc
=0,
k=:O
then show that c1c
18.
=0, k =0, 1, 2, ....
b.
If <po, <p1, •.• is a complete set and (/, rp;) = (g, rp;) for all i, then
prove that/= g.
c.
If V is a space of integrable functions, prove that b implies f(x) = g(x)
except possibly on a set of measure zero.
This is an exercise on Fourier series in several variables.
a.
We have seen that einx / ...fi;, n = 0, ± l, ... , are orthonormal functions on [0, 21r]. Consider the functions
einx+imy
'Pn.m(x,y)
=~=
einx eimy
../fir ..,fj;i.
Show that 'Pn,m are orthononnal on [0, 21r] x [0, 21r ].
b.
Generalize a to construct 'Pn.,11 , given general 'Pn that are orthonormal
on [a,b].
Note. The einx /../fir are complete, and so are the 'Pn,m• This is proved in
Theorem 10.3.1 (the mean completeness theorem) for ei= / ../fir, and the proof
for 'Pn,m is similar.
Exercises for Chapter 10
19.
653
Sturm-Liouvi/le Problems Consider the differential equation
Z
+ [q(x) + ,\p(x)] f(x) = 0
to be solved for f(x) with boundary conditions /(a) = f(b) = 0 and a ::;
x ::; b. The functions q,p are fixed, and we assume that p(x) > 0. The ,\
for which solutions/ exist are called eigenvalues.
a.
Show that if f and g are solutions with eigenvalues ,\ and µ and
,\ 'f., J'·, then
lb
p(x)f(x)g(x) dx = 0.
Hint: Use the differential equations to show that
d
(,\ - Jt)p(x)f(x)g(x) = dx[g'(x)f(x) - /'(x)g(x}],
and then integrate.
b.
Interpret a as orthogonality off and g with
(/,g) =
lb
p(x)f(x)g(x)dx.
Show that this is an inner product. 5
20.
Show that {(I/ ,/ir) sin nx I n = 1, 2, ... } is an orthonormal family on
[O, 2,r ]. What is a Fourier series for this family? Is it complete?
21.
Let ipo, 1{)1, ... be an orthonormal system in an inner product space V and
let f E v. Let Sn =
t.pk)<p1:, the nth pattial sum of the Fourier
series. Show that for any integer p ~ 0,
22.
Let V be an inner product space and ip0, ip 1, ... a complete orthonormal
system. for f,g E V, show that
E;=0u,
00
(/,g) = L,(/,ipk)(<pt,g}.
k=O
Of course, part of this problem is to show that the sum converges. [Hint:
Write f and g as limits of Fourier se1ies and apply Worked Example 10.2.]
5 Many orthonormal systems arise this way. The trigonometric system arises with p = l,q = 0.
lbere Is an advanced theorem that asserts that such systems are complete. See, for example,
Coddington and Levinson, Theory of Ordinary Differential Equmions, McGraw Hill, New York.
654
Chapter 10 Fourier Analysis
Exercises 23-28 refer to quantum mechanical systems (§10.9).
23.
Show that the solution of ili(otJ;/81)
= HtJ;, if t/J = 'I/Jo at t = 0, is given by
00
1P = "fJt/Jo,'Pn)e-iEnt/li'Pn,
n=O
where the 'Pn E V are the eigenfunctions of H with eigenvalues En (you
may assume that the selies can be differentiated term oy term and that the
eigenfunctions are complete). What happens if 1/; is already an eigenfunction?
24.
Compute the commutators of the operators Q..., Qy, Qt, Px, Py, Pt, and
lx, l_v, ]z given in the text. What is the uncertainty principle for these
operators (Exercise 5, §10.9)?
25.
Suppose i/i.(f}'lj;/at) =Hip.Prove that (1/;,H,p) is constant in time. This
result is called conservation of energy.
26.
If A and B are symmetric, is [A, B] symmetric? What about i[A, BJ?
27.
Solve for the eigenfunctions in a deep well if we replace [0, /] by [-/, /].
28.
If A is symmetric, then show that (At/J,t/J) is real for any t/J EV. Interpret
this result using Exercise 3, §10.9.
29.
In the second proof of the pointwise convergence theorem, we used the
equality
00
sinx d _ 1r
- - X-0
X
2
1
(recall that this integral is conditionally convergent). Prove this equality
as follows. Let
v()
r·
t = loo e-,xsinx
- - dx.
t
Show that
F'(t)
-1=
=
O
X
e-ix sinxdx =-(r2 + l)- 1
(see Worked Example 9.2 at the end of Chapter 9) and hence, that F(t) =
- tan- 1 t + C. Show that F(t) --. 0 as t --. oo, so that C = 1r /2. Then look
at F(O) for the ~esult. The main difficulty here is the justification of these
steps. (This integral can also be evaluated using methods of complex variables; see J. Marsden and M. Hoffman, Basic Complex Analysis, 2d ed.,
Chap. 4.)
Exercises for Chapter 10
655
J::
30.
The derivative of the delta function is defined by
6'(x)f(x)dx = -f'(O)
(see §8.7). Compute the Fourier transform of 61 ; how is it related to that
of 6?
31.
Convolutions were defined in Worked Example 10.4. Show that/*C = C*f.
Use that example to write out Parseval's relation for f * g.
32.
The Riesz-Fischer Theorem This theorem represents one of the
early and most important successes of the Lebesgue integral. For this
problem we do not assume a knowledge of the Lebesgue integral, but
take for granted that the set of all square integrable functions forms a
Hilbe1t space. Assuming this, the Riesz-Fischer theorem is quite easy.
Sometimes, depending on how you read the history, the statement we just
took for granted is called the Riesz-Fischer theorem!
a.
Prove the following theorem.
Riesz-Fischer Theorem Let V be a Hilbert space and
<po, cp 1, . . . a complete orthonormal set. let co, c1, . . . be complex
numbers and suppose E:O lcnl2 < 00. Then there exists an f E V
with
(/, cp;} = C;.
Thus every series
00
L
Cn'P11
with
n:O
is the Fourier series of some f.
[Hint: Letfn = I:7=0 c;cp; and show thatfn is a Cauchy sequence, by
showing that Iii,,. - fnf = :E;~+l jc;l 2.]
b.
Use a to prove that for every sequence
c1;,
k = 0, ± 1, ±2, ... , with
N
oo
"~ lci:12 = N-+oo
lim "~ lc.tl 2 < oo,
-oo
-N
there is a square integrable f on [O, 271"] (or [-71", 11']) such that
00
f =
L
Ck'{Jk,
-oo
where
cp1;
= e;1:x and the convergence is convergence in the mean.
Chapter 10 Fourier Analysis
656
c.
Is
00
L (1/n)einx
n•-oo
nto
the Fourier series of some some integrable function on [-rr, ,r]? Is
/la-oo
n~o
33.
If f: [a, b] -+ JR, has a discontinuity only at Xo E ]a, b[ andf' is bounded
on ]x0 , b[ and on ]a,xo[, then prove that f is a function of bounded
variation. Show that one can apply the Jordan-Dirichlet theorem.
34.
Letf(0) 0,f(x) = xsin(l/x) if 0 < x S rr,f(x) (2rr-x) sin(l/(2rr-x))
if rr $ x < 2rr, andf(2,r) = 0. Show that/ is continuous and periodic on
[0, 2rr] and the Fourier series off converges at each point, but that the
hypotheses of both the Jordan and the Jordan-Dirichlet theorems fail.
35.
Investigate the nature of convergence of the Fourier series of each of the
following functions on ]-,r,,r]:
=
=
c.
=x3
J(x) =(sinx)2
f(x) ={ ~x2
d.
f(x)
={
e.
j(x)
={ ~ sm. I
a.
b.
f(x)
t
x>0
xso
x~O
x<0
+l
X
36.
xso
x>0
Suppose f is real and square integrable on [-/, l] and that
On
= II
j'
-I
bn
Show that
=Il
f(x) cos nrrx
- 1- dx, n = 0, 1, 2, ...
1
1
•
nrrx
_/<x)sm - 1-dx, n
=1,2, ....
1+ I:a;+ Lh; = 11j'
.,
00
n=I
00
n=I
-I
f(x) 2 dx.
Exercises/or Chapter 10
37.
657
Form the function
cp(x)
= a sinx + b sin 2x + csin 3x
on [0, 71"]. For what values of a, b, c is rp closest in mean to the constant
function I? What about on the interval [-71", 71"]?
38.
Let g : ]a, b[-+ R be continuous and suppose g(a•) and g(b-) exist. Prove
that g is bounded. What does this say about the definition of a sectionally
continuous function?
39.
a.
Suppose/ is differentiable for x
show that /(x0) exists as well.
b.
If limA--x0f'(x) exists, show that it equals
l.
,m
{f(xo + h) - f(xo)}
h
h-+o+
c.
40.
> Xo and lim_.._ _..,J'(x) exists. Then
.
Consider the functions / 1(x) = xsin(l/x), fi(x) = x2 sin(l/x),
/J(x) = .x3 sin(l/x), for x > 0. Which of limx-o+ //(x) and Jf (0•)
exist?
Let x 0 be a jump discontinuity off. Define h(x) = /(Xo - x) and g(x) =
+ x). Show that h(O•) =/(x0 ) and g(O•) =/(x0).
f(xo
41.
Let/ : [a, b] -+ IR. have a jump discontinuity at xo E ]a, b[ with /'(x0),
f'(x 0 ) existing. Let g: ]a,b[ -+ R be differentiable and suppose
g(x)
= /(x)rp(x).
Prove t~at g has a jump discontinuity at xo, g'(x0) and g'(x0 ) exist, and
=f(x"o)cp(xo)
g(Xo) =/(Xo )cp(Xo),
g(xo)
Apply this when
cp(x) = .
(x-xo)/2
/
sm[(x - xo) 2]
,
if x ~ xo and cp(xo) = I,
to complete the proof of the pointwise convergence theorem.
42.
If In -+ f in mean on [a,b], then prove that
subinterval.
In
-+
f
in mean on any
658
Chapter 10 Fourier Analysis
43.
Suppose f,, -. f and 8n -. g in mean on [a, b]. Define the function
h by h(x) = J: f(y)g(y)dy and let hn be defined similarly by hn(x) =
fn(y)gn(Y) dy. Prove that hn - t h unifonnly.
J:
44.
Establish fonnulas 5 and 6 in Table 10.5-4.
45.
Establish the following fonnulas:
a.
xsinx= l-(cosx)/2-2E~((-l)"cosnx)/(n2 -l),
b.
log[sin(x/2)] = - log 2 - E~(cosnx)/n,
-1r
::;x~
1r.
on ]0, 21r[.
46.
Apply Parseval's relation to fonnulas 4a and 6 in Table 10.5-4 to obtain
some arithmetical identities. What justifies reading off an and bn as the
coefficients of cos nx and sin nx?
47.
Discuss the Gibbs phenomenon for f(x)
48.
Let/ have a jump discontinuity at xo and let/'(x~)./'(x0) exist andf'(x)
exist and be continuous for x e ]xo - e,xo[ and x e ]xo, .xo+e]. Show that
/ "exhibits a Gibbs phenomenon" at Xo and the "overshoot" ~ (f(x~) /(Xo )) • (l.179).
49.
Use the Fourier series of lsinxl in Table 10.5-4 to show that
00
L 4n
1
2 -
I
50.
I
1=
2
and
=2, x ~ 0;f(x) =0, x < 0.
~ (-1)"
I
~ 4n 2 - 1 = 2 1
1r
4·
Use the Fourier sine series on [O, rr] to show that
cos 1rx =
~7r
f
n=I
4n-?n- 1 sin(21rnx),
0<x
< 1.
51.
In Table 10.5-4, state the discontinuities of all functions (including endpoints of intervals) and the values of the Fourier series at those points.
52.
Derive fonnula 2a of Table 10.5-4 by noting that f(x) = (x + lxl)/2 and
observing that the series for lxl is just the half-interval cosine series for x.
53.
Use Jordan's theorem at x = 0 applied to Example 10.5.S) to prove that
1r2 /8 =
1/<2n - 1)2 •
54.
a.
E:,
Let/ be smooth (COO) on [-1r, 1r], and suppose f(-1r) = f(1r),
f(k>(-1r) = fk>(1r), k = 1,2, .... Prove that the Fourier series of
/ may be differentiated any number of times and will still converge
uniformly.
659
Exercises for Chapter 10
55.
56.
b.
Show that for any integer p and nPan -+ 0 and nPbn -+ 0, where an,
bn are the Fourier coefficients off.
a.
Let/: (0, rr] -+ R be continuous and let/' be sectionally continuous
with jump discontinuities. Then show that the half-interval cosine
series off converges uniformly and absolutely to/.
b.
Justify the same conclusion for the half-interval sine series if we
assume also that/(0) =f(rr) = 0. Explain.
c.
Show that without the condition/(-rr) =/(tr) in Theorem 10.6.1,
the conclusion is false.
Let
f (x)
0 -'Ir$ X < 0
= { l O $ x $ 11".
Show that if we differentiate the Fourier series off, we get the Fourier
series of 6(x) - 6(x + rr) (6 is the Dirac 6 function). Can you explain in
what sense/' = 6?
·
57.
Give the theorem that is obtained by combining Theorem 10.6.1 and Corollary 5.3.4, and show why Theorem 10.6.2 is better.
58.
Use Theorem 10.6.2 to derive a differentiation theorem for half-interval
cosine series.
59.
If f is square integrable on [-rr, rr] with Fourier coefficients an, bn, then
prove that
1 bn/n converge absolutely.
1 an/ n and
60.
a.
Let/: [-1r 1 ,r]-+ R be square integrable and g(x) = J~.,,J(y)dy.
Find the Fourier expansion of g, and state where it is valid.
b.
Repeat part a for the half-interval cosine series off : (0, 71"] -+ R.
1:;:
1::
61.
In Worked Example 10.3, show that g is of bounded variation (hence the
Jordan-Dirichlet theorem applies).
62.
Use Worked Example 10.3 to find the Fourier series of x3 on [O, 2,r] using
that of x2 from Table 10.5-4.
63.
For each of the following, determine what type of convergence the Fourier
series has and whether or not the series can be differentiated tennwise.
= x3 on [-,r, rr]
a.
f(x)
b.
3
{
J(x) = . x
1
-,r $ X < -(I /2)
-(1 /2) $ x < 1
1$
rr
xs
660
64.
Chapter 10 Fourier Analysis
c.
f(x)
= 1r3 - lxl 2 on [-rr, rr]
d.
f(x)
= { x 3 + 8 -rr :$ x S 0
e.
f(x)
f.
f(x)
g.
loglsin(x/2)1 on ]-1r,1r[
x2+8
O:$x:$1r
0
-1r < X < 0
= { x3 sin(l/x) 0
<; :$;
-2n(x + l).i + 2n + 1 1/(n + 1) :$ x S 1/n
-71" :$ X :$ 0
otherwise on [-rr, rr]
={ 0
1
Suppose/ is square integrable on [0,2rr] with Fourier coefficients an,bn,
If f (O+), /(0-), f' (0+), and f' (0-) all exist, prove that :E:O an converges.
What assumptions guarantee that
1 bn converges?
z:=:
Exercises 65-71 are based on §10.7.
65.
If the temperature at the ends of a bar is kept constant at zero, show that
the temperature T after time t, if T is equal to f at r = 0, is given by
00
'
( 2 2 /I) . mrx
T(x, t) = "
~ bne- n 'Ir I sm -,-.
n=l
where the bn are the half-interval sine series coefficients off.
66.
Show that the solution to the heat equation is always COO for t
Theorem 10.7.4).
67.
a.
> 0 (see
On (0, rr] x (0, rr], find a function cp such that V 2 cp = 0 and
ip(x,0)=
(x;1rf- ~
2
,
cp(0,y)=0, cp(rr,y)=0, cp(x,rr)=0.
Explain in what way the boundary values are assumed.
b.
Repeat the problem with cp(x 1 0) = x.
68.
Show that the solution in the text is correct for the two-dimensional wave
equation. Derive the fundamental solutions by separation of variables.
You may wish to use Exercis~.18.
69.
In Theorem 10.7.2, prove that
(uniformly in x).
Exercises for Chapter 10
661
70.
In Theorem 10.7.4iii, show that cp converges unifonnly on any compact
set in the interior of the square.
71.
Boundary Value Problems for Ordinary Differential Equations Suppose/ on [-1r, 1r] has/(-1r) =/(1r) and Fourier coefficients
On, bn. The fact that f' has coefficients On = nbn and f3n = -nan (see
Theorem 10.6.1) is useful in solving certain boundary value problems.
a.
Solve the equation f" (x) + kf(x) = g(x) for given g on [-1r, 1r] if we
require/'(-1r) = /'(1r) and/(-1r) = /(1r), by noting that -n2 an+kan =
iin and -n 2bn + kbn = bn, where iin and bn are the coefficients for g.
Hence show that
·
~
iio
iin
bn
.
/(x) = 2k + L..,; k _ n2 cosnx+ k- n2 smnx.
n=I
b.
72.
Solve the equation con-esponding to a for [-l, LJ.
Let 6 be a given real number, 0
< 6 < 1r.
-{ 1
f(x) -
73.
0
Define
!xi~
6
< lxl $
r,
11".
a.
Calculate the Fourier series off.
b.
By evaluating at x = 1r, show that
c.
What does the Parseval relation say in the case of/?
Evaluate
lim
1,r /x sin kx dx.
2
k-oo O
74.
Vetify that the infinite product in Worked Example 10.5,
is periodic with period 2,J(>. +2) = /(>..). First show that/(>.+ 1) = -J(>..).
662
75.
Chapter 10 Fourier Analysis
(W. A. J. Luxemburg) Let 'Pn be a set of orthonormal functions in L2 of
the interval [a, b ]. Show that
a.
'Pn is complete iff x - a=
b.
'Pn is complete iff
"E: IJ; 'Pn(t)dtj2 for all x E [a,b] and
1
I
<b ~ a)2 =~larb lar 'Pn(t)dt 12 dx.
00
76.
(Lebesgue's proof of completeness of the trigonometric system, from
A. Zygmund, Trigonometric Series, Chelsea, New York) Give another
proof of the mean completeness theorem as follows. Let f : [ -71", 71"] -+ IR
be continuous and be orthogonal to cos nx, sin mx. Prove f = 0 as follows:
Assume j(x) > c for x E / ]xo - 6, Xo + 6[. Let Tn(x) [t(x)]", where
t(x) = l +cos(x-xo)- cos 6, and show that Tn(X) ~ 0 on I, Tn(X) -+ oo uniformly on every closed subinterval of/, and the Tn are uniformly bounded
outside /. Use this to show that (j, Tn) = 0 is impossible for n large. For
generalf, attempt to apply the results just obtained to F(x) = 1J(t)dt.
=
=
f
Appendix A
Miscellaneous Exercises
This appendix contains supplementary exercises. Numbers 1-53 are miscellaneous exercises that cover a variety of topics. Most are not routine questions,
but rather ask for examples, true-false evaluatons of assertions, or developments of variations or extensions of ideas from the text. Numbers 54--66 are
examination-type questions that may be useful for review.
Exercises
1.
For two sets A, B C Rn, define the distance between them by d(A, B) =
inf{d(x,y) Ix e A,y e B}.
a.
Find two closed sets such that A n B = 0 and yet d(A, B) = 0.
b.
For A compact and x <I. A, show that d(A,x) = d(y,x)
y eA.
c.
Show that for each set A C Rn we have cl (A)= {x
> 0 for some
e R_n I d(x,A) =
O}.
2.
Let A C R_n. Show that A is compact iff every continuous map/ : A -+ R
is bounded above and assumes its maximum at some point in A.
3.
Show that/ : R-+ R has a continuous derivative iff the double limit
lim
(x,y}--+(Xo.XO)
exists for every Xo E R.
663
f(x)- /(y)
x-y
Appendix A Miscellaneous Exercises
664
E:1(x+ 1)2 /n
Show that
on [0, l].
S.
What is wrong with this argument?
f(x)
2
00
-oo
1
=1/(1 +x)2 =(d/dx)[-1/(1 +x)],
so
1
and
E: (x + l'f/n! each converge unifonniy
4.
f(x) dx = lim
a-oo
la
-a
f(x) dx = lim
a-oo
[
-1-1
- - - -1
- ] = 0.
+a 1- a
6.
Give an example of a continuous function f : lR - JR such that f(IR) is
not closed. If A is a closed bounded interval in JR, mustf(A) be closed?
7.
At first, one might think the intervals ]O, 1[ and [0, 1] are very similar. State
at least five significant differences between them in terms of topology and
continuous functions.
8.
a.
b.
c.
d.
e.
9.
Are the following statements true or false? (All sets have volume and all
functions are bounded and integrable.)
c.
JAi = J8f.
If {x I f(x) ~ g(x)} has measure 0, then JAi = f8j.'
If/ 2: 0, g 2= 0, and JAi = JA g, then/= g on A (except possibly on
d.
a set of measure 0).
The same question as c with the additional assumption that f 2= g.
a.
b.
10.
11.
Find an example of a closed set A in !Rn such that A = bd (A).
If A = bd (A) show that A is closed.
Prove: If int(A) = 0, then (A= bd(A) ¢=> A is closed).
Prove: If A is closed and A C bd (A), then A = bd (A) and int (A) = 0.
Find a set A such that A c bd (A) but A ~ bd (A).
If A :::> B, and A \ B has measure 0, then
Let/ : [a,b]-+ lR be integrable.
a.
b.
c.
Prove that F(x) = f0xf(t)dt is uniformly continuous on [a,b].
a.
Let T : !Rn -+ Rn be a linear mapping. Prove that T is norm
preserving (that is, II Tx II = II x II) iff T preserves the inner product,
{Tx, Ty}= {x,y ). (See Exercise 12, Chapter 1).
Show that F has a derivative at xo if f is continuous at .xo.
Show that F is differentiable except on a set of measure 0.
,pxercises
665
I
b.
1.2.
If T preserves the norm (or inner product), then Tis an isomorphism.
Prove Cavalieri's principle: If A, B c IR.3 have volume, and if every plane
parallel to the xy-plane intersects A and B in equal area, then A and B
have the same volume. [Hint: Make use of Fubini's theorem.]
Remark: In connection with this problem, see Gelbaum and Olmsted,
Counterexamples in Analysis, Example 6, Chapter I 1. For applications
see a calculus text.
13.
Show that a set A c Rn has volume iff for every e > 0 there is a
set Ve C A and a set W,, :, A such that V,. and We have volume and
v(We \ V,.) = v(W,.) - v(V,.) < e. Show that if the latter condition holds
then the volume of A is inf{v(W,.) I e > O} = sup{v(V,.) I e > O}.
14.
Show that the volume of the figure obtained by revolving the area under
the graph of a nonnegative function / : [a, b] -+ R once around the xaxis is given by
1rf(x)2 dx. Use this fonnula to compute the volume of
{(x,y,z) E R 3 I l < X < 2 and y2 +z2 < x4}.
J:
< 0.
15.
Let/ : [a,b]-+ R be continuous and suppose/(a)/(b)
there is an x E]a, b[ such that f(x) = 0.
16.
Let f : Rn -+ Rn be of class C1 and suppose J/(x) 'f O for all x. Let
x0 E Rn and B {x E Rn I/(x) .xo}. Show that B has no accumulation
points.
t,.
Give an example of a function/ of class C1 that has derivative equal to
zero at a point x but is one-to-one in a neighborhood of x. Show that 1- 1
cannot be differentiable at/(x).
18.
Show that if the sets A and B have volume then so does A U B. If A n B
and A \ B have volume as well, then prove:
=
Show that
=
n B).
a.
v(A U B) = v(A) + v(B) - v(A
b.
v(A \ B) = v(A) - v(B) if A ::> B.
19.
Suppose that f : A C Rn -+ JR is integrable and that/ = g except on a set
of content zero. Show that g is integrable. Show that this is false if we
replace "content zero" by "measure zero."
20.
a.
Show that if {U0 } is a family of disjoint open sets in Rn, then the
family is countable. [Hint: Pick a point with rational coodinates in
each set and use the fact that the set of such points is countable.]
b.
If A is an open subset of Rn, show that the connected components
of A are open and that there are countably many of them.
666
Appendix A Miscellaneous Exercises
c.
Prove that each open set in R is the union of a countable collection
of intervals.
21.
Let/ : [a,b]-+ [a,,0] be a strictly increasing map of [a,b] onto [a,,0].
Show that f and 1- 1 are continuous.
22.
Prove the Lebesgue Covering Lemma: Let A be a compact subset of Rn
and let {VO} be an open cover of A. Then there is an e > 0 such that if
S is any rectangle contained in A with sides shorter than e, then S c V00
for some ao.
23.
Let f : A C Rn -+ R be a mapping and let M C A be the set of (strict)
local maxima of/. Show that J(M) is finite or countable. Give examples.
24.
Let S be an open connected set in Rn. Let A be a connected component
of IRn \ S. Show that Rn \ A is connected.
25.
Find a nonconstant continuous function/ [a, b] -+ R that has its maximum
at xo E )0, I [ but for which/' (Xo) does not exist.
26.
A set B c A is said to be dense in A if A c cl (B). Show that this is
equivalent to the condition that for every open set U with A n U f: 0 we
have B n U f: f2J. Is A dense in cl (A)? Show that Rn has a countable
dense subset.
27.
a.
Let A c Rn have volume. Show that int (A) and cl (A) have volume
and v(A) v(int (A)) v(cl (A)).
b.
Show that if A is a set such that cl (A) has volume, we cannot conclude that A itself has volume.
Show that if A is a set with int (A) = 0, we cannot conclude that A
has content O or measure 0.
c.
=
=
28.
Rn -+ B c Rn be of class C1 on an open set A and suppose
B = g(A). We say g is volume preserving if for every set D c A such that
D and g(D) have volume, we have v(g(D)) = v(D). Suppose g is one-toone and Jg(X) t,. 0 at each x E A, then show that g is volume preserving
iff jJg(x)I = I for all x EA.
29.
a.
Let g : A
b.
c
Show that if a set A has content 0, then cl (A) has content 0. Is this
true for measure O?
Let f : A C Rn -+ Rm be of class C1 on an open set A. Let B C A
and cl (B) c A and suppose that B is compact. If B has content or
measure 0, show that /(B) does as well. [Hint: Consider the case
in which Jf(x) = 0 separately and use Sard's theorem (Exercise 27,
Chapter 9).]
667
Exercises
30.
31.
32.
A subset A of Rn is said to be homeomorphic to a subset B of Rm if there
is a continuous map '{) : A - B with a continuous inverse 'P- 1 : B - A.
We call '{) a homeomorphism.
a.
Find an example of a bijection '{) : A - B that is continuous but is
not a homeomorphism.
lb.
Let/ : A C R_n - Rm be continuous and let r = { (x,/(x) E Rn X Rm
x e A} be the graph of/. Show that A and rare homeomorphic.
I
Let/ : [a, b] - R be a monotone function. (Say, for example, that f is
nondecreasing: x $ y => f(x) $ /(y).)
a.
Show that the left and right limits, /(x+) = limh-+o+ /(x + h) and
f(x-) = liffih-0" /(x- h), exist at each x in [a, b]. (Only one of them
at each endpoint.)
b.
Show that/ has at most a countable set of discontinuities. [Hint: Let
Pn be the set of points at which the jump of/ exceeds 1/n. Show that
Pn is finite and consider the union of all the Pn for n = 1, 2, 3, ....]
c.
It is a famous theorem of Lebesgue that for each such/ the derivative
of/ exists, except possibly for points in a set of measure 0. Consider
some examples to verify the validity of the results. Look up a proof
in, for example, Hewitt and Stromberg, Real and Abstract Analysis,
and write a brief essay on the essential features of the proof.
Prove that the transformation
X1
=
U1
X2
=
U1 +U2
X3=
Xn
=
u1+u2+u3
U1
+···+Un
leaves volumes unchanged.
33.
Let g [O, I] - R be integrable. Show that
1[1
1
34.
1
g(t) dt] dx =
1
1
tg(t) dt.
Let K be a compact set. If (/n}'f is a uniformly convergent sequence
of continuous real-valued functions on K, show that In is equicontinuous.
The converse is not true. Give a counterexample.
668
Appendix A Miscellaneoui Exercise,
35.
Let S be a subset of Rn with volume," and let t > 0. Let R be the set
of points R
(tx1, ••• , txn) I (x1, ... , ~n) E S}. Show that v(R) rnv(S).
What if t < O?
36.
Explain how the Gibbs phenomenon is possible and the Fourier series still
converge in the mean and pointwise.
37.
a.
Let /(x) have the Fourier series (ao/2) + ~:1 [an cos nx + bn sin nx]
on [-1r,1r]. Define the reflection off by g(x) =f(-x). Show that
the Fourier series of g is (ao/2) + ~:1 [an cos nx - bn sin nx].
b.
Let
={
=
f(x) = { O,
x,
for - 1T $ X $ 0,
for O < X $ 1T.
Recall that the Fourier series for f is
,r ~ ((-1)" - 1
(-1)" .
)
-4 + L...,
cosnx- - - smnx .
2
,...
,rn
n
1
Use part a to show that the Fourier series for
~
2
2~
+ ~
(-If - I
1rn2
n=I
c.
d.
38.
lxl
on [-1r, 1r] is
_ ~ _ 4 ~ cos[(2k- l)x]
cosnx - 2
L..., ,r(2k- 1)2 •
k=l
1r2
I
I
I
Use part b to show that - = I + - 2 + - 2 + - 2 + • • •
8
3 5 7
•
Use part b to obtain the Fourier cosine series of x on [0, 1r] and
conversely.
a.
Let V be an inner product space and ipo, ip1, ... a complete orthonormal sequence. Suppose that W is a subspace of V and that/ E W iff
{f,ipo} = 0. Prove that ip1,cp2, ... is a complete orthonormal system
for W. Generalize.
b.
Apply part a to the trigonometric system and
i. W = {/ : [-,r, ,r]-+ JR I J~.,J(x)dx = O}.
ii. W = {/ : [-1r,1r]-+ JR. If is even}.
iii. W = {/ : [-,r,,r]-+ JR If is odd}.
39.
If/: [a,b]-+ JR is square integrable, prove that/ is integrable. That is:
l/(x)l 2 dx < oo =}
lf(x)I dx < oo. [Suggestion: Use the CauchySchwarz inequality.]
J:
J:
669
Exercises
40.
a.
b.
41.
Suppose that/ : [-1r, rr] -+ JR is sectionally continuous with only
jump discontinuities. Show that the sum of the Fourier series of/ at
x depends only on the values of/ in any neighborhood of x. This is
called Riemann s localization property. [Sugestion: Apply Theorem
10.3.2.]
The Fourier coefficients of/ depend on the values of/ throughout [-rr, rr ]. How do you reconcile this with the result of part a?
[Suggestion: Study the proof of 10.3.2.]
\Suppose the Fourier series of/ : [-1r, rr] x [-1r, rr]
-+
JR is
00
L
Cn,milVtimy.
n,1n=-oo
(See Exercise 18, Chapter 10.)
a.
b.
Write out the Fourier series off in trigonometric form.
For fixed y, let g(x) = J(x,y). Show that the exponential Fourier
coefficients for g are
00
G(n) = Cn
=
L
Cn,meimy_
m=-oo
c.
If/ is square integrable we know that its Fourier series converges
to f in mean. (See Theorem 10.3.1.) The purpose here is to give a
pointwise convergence theorem. Show that if f is of class C1 and
J(x, 1r) = f(x, -1r) and /(rr,y) = /(-rr,y) for all x and y, then the
Fourier series for f converges to f(x,y) pointwise. [Suggestion: Use
part b and 10.3.2.]
:E.:
42.
For what values of p is
1(sinnx)/nP the Fourier series of a square
integrable function? (See Exercise 32, Chapter 10.)
43.
a.
b.
Show that
t
cos kx
s isin(~/2) I
for x I 0.
Consider the Fourier series for ·the step function,
00
f(x)
=
L co:nx.
n=l
Show that for each 6 > 0, this converges uniformly on [6, rr]. [Suggestion: Use part a and the Dirichlet test]
Appendix A Miscellaneous Exercises
610
z:;:
c.
Generalize b to any Fourier series /(x) =
1 an cos nx with an
decreasing. Conclude that/ must be continuous on ]0, 1r[.
d.
Deduce from c that if/ has a discontinuity at Xo and O < x0
then the Fourier coefficients off are not decreasing.
=
=(x -
< 1r,
l/2)2 - (2 / 4.
44.
A string on [0, l] is initially displaced at t 0 by f(x)
Find a formula for the displacement after time t.
45.
a.
If a bar with insulated ends has temperature T = constant at t = O,
show that T = constant for all t > 0.
b.
If a bar on [0, 1r] has temperature at t = 0 given by sinx, find the
temperature for t > 0.
c.
Same as part b except that T = cosx at t = 0.
a.
Find a function <p on [O, 1r] x [0, 1r] such that 'v2 <p
cosx, and <p(x, 1r) = 0 = <p(O,y) = <p(1r,y).
b.
In part a, replace fi?(O,y)
c.
In what sense are the boundary values in a and b assumed?
46.
47.
=0 and <p(x, 0) =
=0 by <p(O,y) =I and find the function.
Let V be an inner product space. Usually 11.fn II -+ II/ II does not imply
-+ f (Exercise 16, Chapter 3). However, show that llfn II -+ Iii II does
imply fn -+ f in mean if .fn is the nth partial sum of the Fourier series for
/ with respect to an orthonormal family.
In
48.
49.
Let/ : IR. -+ lR be twice differentiable. Show that
a.
If F(x,y) =f(xy), then x8F/8x = yfJF/ay.
b.
If F(x,y) =f(ax+by), then b{)Fj{)x=a8F/{)y.
c.
If F(x,y) =f(x2 +y2), then y8F/8x =xoF/{)y.
d.
If F(x,y) =f(x+cy) +f(x- cy), then c282 F/8x2 = {)2 F/8y2.
z:;: (xn/n) converges, then (x
Prove Kronecker's Lemma: If
-+ 0 as n -+ oo. (That is, Xn
Xn)/n
50.
+ • •· +
0 in the Cesaro sense.)
Let f : [a, b] -+ JR be a bounded integrable function with / (x) ~ m
for all x in [a,b]. Show that
51.
1
1
-+
(J:O/f)) (J;t) ~ (b- a)2.
J~
>0
Suppose that/ : [O, I] -+ JR is integrable, that
f(x) dx ~ 1, and that
0 $ f(x) $ 10 for all x in [0, 1]. Define the set E = {x E [0, 1] If(x) ~ 1},
and assume that E has volume. Show that v(E) ~ 1/2.
Exercises
52.
671
Suppose that f : [O, 21r] - JR is continuous and /(0) = /(21r ). Let
N
L (/,'Pk) 'Pk
SN=
k=-N
be the Nth partial sum of the Fourier series for/ and define
4>(y)
=
1'
and
f(x) dx
State whether each of the following "must be true" (MBT) or "could be
false" (CBF).
53.
a.
I;,r SN(x)eix dx - I;,r f(x)dx as N -
b.
It x2sN(X)dx- It x2f(x)dx.
c.
II SN - f II -
d.
SN(2) - /(2).
e.
l:N is the Nth partial sum of the Fourier series for 4>.
r.
II i:N -
oo.
0.
g,
4> II - o.
I:N(2) - 4>(2).
h.
I:N - cl> uniformly on [O, 21r].
The Poisson kernel and harmonic functions. Let/(0) be continuous and
21r-periodic. (We can think off as a function defined on the circumference
of the unit circle in the plane .. ) By Fej6r's theorem, we know that the
Fourier series of/ converges to/ in the (C, I) sense. Deduce that
00
lim ~ ckrltli" 9 = f(0).
,~1•
L...,,,
k=-oo
(Note the exponent lkl for the negative indices.) Define
00
u(r, 0) =
L
c,_,l"leik8
•
k=-oo
for O $ r < 1. We regard u as a function in the interior of the unit disk
in the plane. In rectangular coordinates x, y we have
00
u(x,y) = co+
L (ck(X + iy)" + C-k(X k=l
iy)").
672
Appendix A Miscellaneous Exercises
Prove that this series converges uniformly in each disk of radius smaller
than 1.
Show (by the general theory of power series) that we can differentiate u
tenn-by-tenn any number of times. In this way prove that
{J2u 82 u
8x2 + f)y2 =0;
that is, u is a solution of Laplace's equation-a harmonic function. We
have already seen that u(r, 0) --+ /(0) as r --+ 1-, so we have solved the
"Dirichlet problem": to find a hannonic function in the unit disk that has
a given function for its boundary values.
For O $ r < 1 prove that
u(r, 0) = -2I
11"
1"
f(t)P,(0 - t)dt
-,r
=
=
where P,(O - t) I:;~00 rlkleik(e-t>. The function P,(y) E~oo rlkleikJ is
called the Poisson kernel. Sum this series explicitly to prove that
P,(y) =
1+
1 - ,2
.
- 2rcosy
,:i
Show that this kernel has the same crucial properties as the Fej6r kernel,
namely
a.
2,r-periodicity.
b.
(1/2,r) f:.1r P,(t)dt = 1.
c.
P,(t)
d.
For each fixed 6 > 0, we have
~
0.
lim [
,-1- }6$ltl$1r
P,(t)dt = 0.
Deduce that u(r, 0) discussed above converges to /(0) uniformly as r-+ 1-.
Examination-Type Questions
Exercises 54-58 are examination-type questions based on Chapters 1-6 and 8.
54.
a.
Define the least upper bound of a set S.
Examination-Type Questions
55.
673
b.
Find sup{x ER I x2 +x < 3}.
c.
What is meant by saying that R is complete?
d.
Let (xn)f be a convergent sequence in R. Prove that it is a Cauchy
sequence.
e.
Let O <a< l, and define .xo = a, and inductively, Xn = Cxn-t + 1)/2
for n = 1, 2, 3, .... Prove that Xn converges to 1 as n -+ oo.
a.
b.
Define the term "connected set"
c.
State and prove a general version of the intermediate value theorem.
d.
Prove that {(x,y) E R 2 I x = 0 and y ~ O} U {(x,y) E R 2
y and x > O} is connected.
e.
If A and B are connected sets in Rn and A n B
Define the term "path-connected set."
~
Ix
=
0, show that A U B
is connected.
56.
a.
b.
c.·
Define what it means for a sequence of functions /1:
converge uniformly.
: Rn
-+
R to
E:, (sin kx) k31 converges uniformly for x E R.
Is f(x) = E: (sin kx) /k31 a continuous function of x? Justify your
2/
Prove that
1
2
2
2
answer.
d.
Let /1:(x) = (1/kx) + I fork = 1, 2, 3, ... and x e]O, l[. Prove that
1 pointwise.
/1: -+
57.
IJ,)f in part d converge uniformly on ]0.1[?
e.
Does the sequence
a..
Let In : [a, b] -+ R be continuous functions differentiable on ]a, b[
with/~ continuous. Suppose fn converge uniformly to/ and/~ converge uniformly tog. State a theorem concerning differentiablity of
f.
b.
Prove your theorem in a. Clearly state any results used.
c.
Let/1:(x) = (sinkx)/k2. Does your theorem work?
d.
State a result that would guarantee that the following operation would
be valid:
1
e.
Definer= E:Oxn /n!. Use part d to prove that
J; e' dt = r - 1.
674
58.
Appendix A Miscellaneous Exercises
a.
Let/ : A C R_n
derivative of/.
b.
For / : A C Rn -+ IR, define the gradient of / and discuss the
geometrical meaning of ( grad /(x), e ).
c.
If Sis a surface defined by S
d.
Find the equation of the plane tangent to the surface x2 + y3 + z4 = 3
at (l, 1, 1).
e.
Argue that the two surfaces x2 + y2 +
tangent at the point (1, 1, 1).
-+ R_n
where A is open. Give a definition of the
=
=
{x I F(x) c} for some constant c,
argue that grad / (x) is perpendicular to S if x E S.
r = 3 and x1 + y3 + z = 3 are
4
Exercises 59-62 are based on Chapters 6-10.
59.
a.
Let/ : Rn
IR".
-+
IRm. Define what it means for f to be differentiable at
XE
60.
b.
Is it true that existence of the partial derivatives implies that / is
differentiable? Discuss.
c.
Let/ : IR2
d.
Let h(x,y) =J(g(x,y),k(y),p(x)). Write down a formula for oh/ox.
Justify this in terms of the chain rule.
e.
Let / : lR -+ lR be differentiable. Assume that / and /' have no
common zeros. Prove that/ has only finitely many zeros in (0, 1).
a.
What does the inverse function theorem state for functions/ : lR
IR?
b.
Consider the equations
-+
R.3,/(x,y) = (xy,eY,cosx). Compute D/(1,0).
{
-+
=2
xz+y2 +y = 3.
x1+y40
Show that they are solvable for y(x),z(x) near (x,y,z) = (I, I, 1).
Compute dy/dx at x = 1.
61.
[0, 1} be continuous. Prove that cp has a fixed point.
c.
Let cp : (0, 1)
d.
Let F : IR" -+ IR" be of class C1 and have nonzero Jacobian at every
point. Prove that F(R") is open.
e.
Let / : IR.2 -+ lR be continuous. Show that / is not one-to-one.
[Suggestion: If/ were one-to-one, then the images of the x and y
axes would both be intervals in IR.]
a.
Evaluate
-+
JA e-x2-I dxdy where A= {(x,y) E 1R2 I x2 +y2 ~ 1}.
Examination-Type Questions
b.
c.
d.
e.
62.
a.
b.
675
Evaluate J8 x dxdy where B is the region in the plane bounded by
x = 0, y = 0, and x + y = 1.
State one version of Fubini's theorem.
Use part c to write a fonnula for I,1./(x,y)dxdy where A= {(x,y) E
R.2 I c :5 y :5 d and t/J(y) :5 x :5 cp(y)}. Here t/1 and cp are to
be smooth functions on the interval [c,d] with t/J(y) :5 cp(y) for all
y E [c, d]. Sketch this region.
.
Let cp : IR2 - R.2 be of class C1 and bijective with Jcp '/; 0. Assume
IA dxdy I,;,CA) dxdy for all open disks A. Prove that Jcp 1.
=
=
Let V be an inner product space and <po, cp1, cp2 , ••• an orthonormal
sequence in V. Write the Fourier series off EV relative to {cp;}.
Explain how part a is related to the fonnula
00
f(x)
~ 0;
+ :~::::Oncosnx+
n=l
00
Lbn sinnx
n=l
where
a1c
b1c =
c.
d.
e.
J1r_/<x)coskxdx
-I 11r f(x) sin kxdx
= ;I
11'
k = 0, 1,2, ...
k = 1, 2, ....
-11'
Compute the Fourier series of f(x) = x, -,r < x < ,r.
What is the pointwise limit of the series in part c? Does the series
converge in mean? Discuss.
Assuming the completeness theorems in the text, prove that {sin nx I
n = 1, 2, 3, ... } is complete in the space of square integrable functions
on (0, ,r].
Exercises 63--66 are based on Chapters 1-7.
63.
Let B = D(O, I) be the open unit ball in Rn centered at 0, and let/ : B - JR.
be continuous. Assume there is a continuous map g : B - L(IRn, IR)
Oinear maps from Rn to R), such that for each pair of points x,y in B,
f(y) - f(x) = [g(ty + (1 - t)x))(y - x)] dt. Show that f is of class C 1,
and that Df = g.
Id
64.
Let D = {x E R.2 111 x II :5 1} be the closed unit disk centered at the origin
in R.2• For each integer n > 0, lett,. : D - R be continuous and assume
that/ : D - R is a continuous function such that the sequence (fn}'f
converges uniformly to f on D. Is the set of functions {J,, I n = 1, 2, 3, ... }
equicontinuous and/or bounded? Justify your answer.
676
65.
Appendix A Miscellaneous Exercises
For each pair of functions/,g : [0, 1]-+ IR, define/ V g : [0, 1]-+ R by
(JV g)(x) = max{f(x),g(x)}.
a.
b.
If f and g are continuous, show that f V g is continuous.
Defineamaptt, : C([0, l],IR)xC([0, l],IR)-+ C([0, 1],IR)bytt,(f,g) =
Show that 1/J is continuous (with respect to the supremum norm
on the space C([0, l],IR) of all continuous real valued functions on
JV g.
[0, I]).
66.
Define a map a: C([0,1],IR)-+ C([0,1],lR) as follows: For each/ E
C([0, I], IR), define a(J) E C([0, lJ, IR) by (a(J))(x) = f(t) dt.
J;
a.
Show that a is a continuous map from C([0, l],R) to C([0, l],IR)
with respect to the supremum norm on C([0, I], IR).
b.
Is a a compact linar map? Justify your answer. (A linear map is
called compact if the closure of the image of the unit ball is compact.)
Appendix B
References and
Suggestions for Further Study
This appendix contains references from the text and a number of suggestions for
further study. They are grouped loosely by subject area, and we have supplied
a few remarks on the level and material of the selections.
The number of books on advanced calculus and introductory analysis is
overwhelming. Despite the large number of recent texts, some of the older
books remain the best. Some favorites are:
[1] Carslaw, H. S., 1930. Theory of Fourier's Series and Integrals. 3rd ed. New
York: Dover.
[2] Hardy, G. H., 1947. Pure Mathematics. 9th ed. New York: Cambridge Univ.
Press.
[3] Hobson, E.W., 1921. The Theory of Functions of a Real Variable and the
Theory of Fourier's Series. Cambridge, Eng.: Cambridge Univ. Press.
[4] Titchmarsh, E. C., 1937. Theory of Fourier Integrals. London: Oxford Univ.
Press.
[5] Whittaker, E. T., and Watson, G. N., 1926. A Course of Modem Analysis.
Cambridge, Eng.: Cambridge Univ. Press.
Of the more recent texts on roughly the same level as this one, the following
have been popular. Of these, [6, 7, 8, 9, 12, 13, 15, 16] are fairly classical,
while [10, 11, 14] tend to be a bit more abstract.
677
678
Appendix B References and Suggestions for Further Study
[6] Apostol, T. M., 1974. Mathematical Analysis. 2nd ed. Reading Mass.:
Addison-Wesley.
[7] Bartle, R. G., 1976. The Elements of Real Analysis. 2nd ed. New York:
Wiley.
[8] Buck, R. C., 1965. Advanced Calculus. 2nd ed. New York: McGraw-Hill.
[9] Graves, L. M., 1956. Theory of Functions of Real Variables. 2nd ed. New
York: McGraw-Hill
[10] Lang, S., 1968. Analysis I. Reading, Mass.: Addison-Wesley.
[11] Loomis, L. H., and Sternberg, S., 1968. Advanced Calculus. Reading,
Mass.: Addison-Wesley.
[12] Olmsted, J. M. A., 1961. Advanced Calculus. New York:
Century-Crofts.
Appleton-
[13] _ __, 1956. Real Variables. New York: Appleton-Century Crofts.
[14] Rosenlicht, M., 1968. Introduction to Analysis. Glenview, Ill: Scott, Foresman and Co.
[15] Rudin, W., 1976. Principles of Mathematical Analysis. 3rd ed. New York:
McGraw-Hill.
[16] Widder, D. V., 1965. Advanced Calculus. 2nd ed. Englewood Cliffs, N.J.:
Prentice-Hall.
For more information on the foundations of set theory, consult [17, 18, 19]. In
[17] there is not much material on logic and the axioms of set theory, but all the
facts on set theory that are of practical importance in the course are presented
concisely. In addition, [ 17] thoroughly develops the abstract differential calculus
(see our Chapters 6 and 7) in the context of Banach spaces. The axioms for sets
in the introduction are taken verbatim from [18]. [19] is a more modem text
designed to introduce the student to basic set theory and to begin the study of
the subject itself.
[17] Dieudonne, Jean, 1966. Foundations of Modem Analysis. New Jersey:
Prentice-Hall.
[18] Halmos, Paul R., 1960. Naive Set Theory. New York: Springer-Verlag.
[19] Roitman, J., 1990. Introduction to Modern Set Theory. New York: Wiley.
Appendix B References and Suggestions for Further Study
679
The following are some general references for more advanced work in real
analysis, including Lebesgue integration and abstract analysis in general Banach
and Hilbert spaces.
[20] Burk.hill, J.C., 1951. The Lebesgue Integral. Cambridge, Eng.: Cambridge
Univ. Press.
[21] Halmos, P.R., 1950. Measure Theory. New York: Springer-Verlag.
[22] Hewitt, E., and Stromberg, K., 1969. Real and Abstract Analysis. New
York: Springer-Verlag.
[23] Gleason, A. M., 1966. Fundamentals of Abstract Analysis. Reading, Mass.:
Addison-Wesley.
[24] Lang, S., 1969. Analysis II. Reading, Mass.: Addison-Wesley.
[25] Royden, H. L., 1989. Real Analysis. 3rd ed. New York: Macmillan.
[26] Rudin, W., 1987. Real and Complex Analysis. 3rd ed. New York: McGrawHill.
[27] _ __, 1991. Functional Aalysis. 2nd ed. New York: McGraw-Hill.
[28] Simmons, G., 1963. Introduction to Topology and Modern Analysis. New
York: McGraw-Hill.
A handy' book to use for finding counterexamples to theorems with missing
hypotheses is [29].
[29] Gelbaum, B. R., and Olmsted, J.M. H., 1964. Counterexamples in Analysis.
San Francisco: Holden Day:
"
Our text studied quite a bit about series. Classical references are [30, 31].
[30] Hardy, G. H., 1949. Divergent Series. London: Oxford Univ. Press.
[31] Knopp, K., 1951. Theory and Application of Infinite Series. New York:
Dover.
Those wishing to_pursue distribution theory can consult the following in addition
to [27].
[32] Gelfand, I. M., and Shilov, G. E., 1964. Generalized Functions. New York:
Academic Press.
[33] Schwartz, L., 1966. Theorie des distributions. Paris: Hermann.
Appendix B References and Suggestions for Further Study
·680
[34] 7.emanian, A., 1965. Distribution Theory ahd Transform Analysis. New
York: McGraw-Hill.
The following texts develop the theory of ordinary differential equations and
integral equations. Of these, [35] and [36] are comprehensive treatises.
[35] Coddington, E. A., and Levinson, N., 1955. Theory of Ordinary Differential
Equations. New York: McGraw-Hill.
[36] Hartman, P., 1964. Ordinary Differential Equations. New York: Wiley.
[37] Hurewicz, W., 1958. Lectures on Ordinary Differential Equations. New
York: Dover.
[38] Roxin, E. 0., 1972. Ordinary Differential Equations. Belmont, Calif.:
Wadsworth.
[39] Widom, H., 1969. Lectures on Integral Equations. New York: Van Nostrand Mathematical Studies No. 17.
Advanced calculus can be elegantly applied to study problems in geometry and
vector analysis, including fractal geometry. Besides [10, 11, 24] consult:
[40] Barnsley, M., 1988. Fractals Everywhere. San Diego: Academic Press.
[41) Devaney, R. L., 1989. An Introduction to Chaotic Dynamical Systems.
2nd ed. Redwood City, Calif.: Addison-Wesley.
[42) Fleming, W., 1965. Functions of Several Variables. Reading, Mass.:
Addison-Wesley.
[43) Munkres, James R., 1991. Analysis on Manifolds. Redwood City, Calif.:
Addison-Wesley.
[44) Peitgen, H.-O.,. Jurgens, H., and Saupe, D., 1992. Chaos and Fractals;
New Frontiers of Science. New Yodc: Springer-Verlag.
[45] Spivak, M., 1965. Calculus on Manifolds. New York: Benjamin.
We have already cited several texts that deal with Fourier series and analysis:
[I, 3, 4, 22, 24, 26, 33, 34). Others, somewhat more advanced, are:
'
[46) Korner, T. W., 1988. Fourier Analysis. Cambridge, Eng.: Cambridge Univ.
Press.
[47) Stein, M., and Weiss, G., 1971. Introduction to Fourier Analysis on Euclidean Spaces. Princeton, N.J.: Princeton Univ. Press.
Appendix B References and Suggestions for Further Study
681
(48] Widom, H., 1969. Lectures on Measures and Integration. New York: Van
Nostrand Mathematical Studies No. 20.
(49] Zygmund, A., 1959. Trigonometric Series. 2nd ed. Cambridge, Eng.: Cambridge Univ. Press.
Our chapter on Fourier analysis gave an introduction to partial differential equations. Further information can be found in the following texts. The last two use
distribution theory, with (53] being advanced.
[50] Churchill, R. V., and Brown, J. W., 1987. Fourier Series and Boundary
Value Problems. 4th ed. New York: McGraw-Hill.
(51] Courant, R., and Hilbert, D., 1962. Methods of Mathematical Physics
(2 vol.). New York: Wiley-Interscience.
[52] Duff, G. F. D., and Naylor, D., 1966. Differential Equations of Applied
Mathematics. New York: Wiley.
[53] Sobolev, S. L., 1963. Applications of Functional Analysis in Mathematical
Physics. Providence, R.I.: American Mathematical Society Translations,
vol 7.
A few references on quantum mechanics follow. [56] is a standard elementary
text, while [54, 55] are more advanced and mathematically oriented.
[54] Jauch, J. M., 1968. Foundations of Quantum Mechanics. Reading, Mass.:
Addison-Wesley.
[55] Mackey, G. W., 1963. The Mathematical Foundations of Quantum Mechanics. New York: Benjamin.
(56] Merzbacher, E., 1970. Quantum Mechanics. 2nd ed. New York: Wiley.
There are a number of important topics in classical analysis that we did not
cover. For example, we could have studied the gamma function following (16]
or [57]. This topic is often covered in courses in complex variables as well.
[57] Artin, E., 1964. The Gamma Function. New York: Holt, Rinehart and
Winston.
There are a large number of excellent texts that are not in English. For example:
[58] Bourbaki, N., 1961. Elements de Mathematique; Fonctions d'une variable
reele. Paris: Hermann.
682
Appendix B References and Suggestions for Further Study
[59] Dieudonne, J., 1971. Ca/cul Infinitesimal. Paris: Hennann.
A rigorous treatment of elementary analysis did not evolve rapidly or smoothly.
The creators of this area of mathematics traveled over cobblestones and encountered numerous blind alleys before experiencing their brilliant insights. An
appreciation of this history is important to the student's education in mathematics. Recommended texts are:
·
[60] Grabiner, J. V., 1981. The Origins of Cauchy's Rigorous Calculus. Cambridge, Mass.: The MIT Press.
[61] Kline, M., 1972. Mathematical Thought from Ancient to Modem Times.
New York: Oxford Univ. Press.
Appendix C
Answers and Suggestions for
.Selected Odd-Numbered
Exercises
In this appendix we supply answers, hints, or suggestions to many of the oddnumbered exercises. These are very seldom complete solutions. Supporting
computations·and arguments usually must be given to support the answers or to
fill in the gaps and follow up the hints and suggestions.
Introduction: Sets and Functions
1.
={l},f- 1(8) =S.
a.
f(A)
b.
J(A) =A,f- 1(8) = B.
c.
f(A) = {l,O,-l},/- 1(8) = {x Ix$ O}.
3.
In each case use the definitions: 'For example, for a, x E/- 1(C1 U C2 ) iff
/(x) E C1 UC2 iff (f(x) E C1 or/(x) E C2) iff x Ef- 1(C1)U/- 1(C2). One
need not have equality in d.
5.
For example, assume /(D1 n D2) = /(Di) n/(Di). If x1 -:f. x2 and/(x1) =
/(x2) y, let D 1 {xi} and D2 {x2} to deduce a contradiction. Thus,
/ is one-to-one. Some possible equivalents to "onto" are: for all y e
T,/- 1({y}). "I, 0, or, for all R C T,R = f(f- 1(R)). Other answers are
possible.
=
=
=
683
Appendix C Answers and Suggestions
684
7.
A subset B of A is determined by choosing for each x EA either x E B or
x , B. Apply a standard argument for counting a sequence of choices or
use induction.
9.
If x E UA, then x E A for some A E
A. Since A
C B, A E B, and so
x E UB. Thus UA CUB. The second part is similar.
11.
x E (g o/r 1(C)
g-l(C)
13.
¢=> (g of)(x) E C
E/-t(g-l(C)).
¢=>
A,B '/: 0 <==> (3a EA and 3b E B)
A xB '/:0.
15. a.
b.
17.
¢:=> X
If f is one-to-one, let g(y)
there is no such x.
g(f(x)) E C
~
<==>
/(x) E
(a,b) EA x B, and so
=x if f(x) =y and let g(y) be anything if
If f is onto and y E T, choose some x such that f(x) = y and let
h(y) = x. (Note the use of the axiom of choice.)
Let T : ]Rn-+ ]Rm be the linear transfonnation associated with A. We learn
in linear algebra that T is onto, i.e., Tx = y is solvable for x e ]Rn for any
y e lRm, if and only if T has rank m. If B exists, Tis onto by Exercise 15b.
Conversely, if T is onto, choose a complementary subspace W c IR.n of
dimension m to ker(D. Then Tl W : W -+ R_m is an isomorphism. Let U
be its inverse. Then T o U = identity . If B is the matix of U, we get
AB = identity.
Chapter 1: The Real Line and Euclidean Space
1.1 Ordered Fields and the Number Systems ·
1.
(a+ b)2 = (a+ b)(a + b) = (a+ b)a +(a+ b)b = a 2 + ba +ab+ b 2 (by the
distributive law twice). Now use commutativity.
3.
Expand (a - b)3
5.
Let lF = {0, 1,2} with arithmetic mod 3. For example, 2 • 2 = 1 and
1 +·2 = 0. To show it cannot be ordered, get a contradiction from (for
example) 1 > 0 so 1 + 1 = 2 > 0 so l + 2 = 0 > 0.
> 0 and rearrange.
685
Appendix C Answers and Suggestions
1.2 Completeness and the Real Number System
1.
a.
Take limits as n - oo in x~ = 2 + Xn-1 •
b.
2.
3.
0.
5.
From a (nontrivial) monotone sequence (xn)f", extract a subsequence that
is strictly monotone.
1.3 Least Upper Bounds
1.
sup(S) = 1; S is not bounded below.
3.
sup(Q) is an upper bound for P, so sup(Q)
5.
sup(S) = l.
2: sup(P).
1.4 Cauchy Sequences
1.
lxn - Xn+kl :::; E:;t-l lxi - X;+il :::; E:(1/3;) ::; 1/3n-l. So, for all
c > 0, we can choose N such that 1/3n-l < c. We get a Cauchy sequence,
so (xn)?" converges.
3.
A possibility is 1, 0, -1, 1, 0, -1, 1, 0, -1, ....
5.
False.
1.5 Cluster Points, Jim inf, and Jim sup
=2; lim sup(xn) =4.
1.
lim inf(xn)
3.
Use 1.5.S to show that there are points XN(n> within ¾ of a (orb). Make
sure you end up with a subsequence.
5.
False.
1.6 Euclidean Space
1.
If equality holds in the Cauchy-Schwarz or triangle inequality, then x and
y are parallel. Expand II x + y 11 2 = (II x II + II y 11>2 to obtain II x II · II y II =
(x,y) = llxll · IIYII · cos(0). Use this directly or let u = y/llYII and
z = x - ( x, u ) u. .Check that ( z, u ) = 0 and that I x II 2 = II z II 2 + I(x, u ) I2 =
II x 11 2 , Conclude that z = 0 and x = ((x,y) / II y ll 2)y.
Appendix C Answers and Suggestions
686
3.
{A· (-2,0,3) E JR. 3 JAE JR}.
5.
x
=(y + 1)/2 =(z + 2)/3, or P(t) =(1 + t, 1 + 2t, 1 + 3t). This line is not a
linear subspace since (0, 0, 0) is not on it.
1.7 Norms, Inner Products, and Metrics
1.
The distance given by the sup norm is 1. That from 1.7.7 is 1//3.
3.
-J7JY)
=1
v1fi:i')
,J7J:T) •v1fi:i') is true.
5.
For f(x) =
different.
X,
= 1//3, and (f,g)
·
Ill lloo = 1, but
u:n
112
= 1/2.
So l(f,g)j
= 1//3. So these norms are
1.8 The Complex Numbers
1.
a.
6 + 4i.
b.
(11/5) + 2i.
c.
(3 /2) - (5 /2)i.
3.
No, not always. _,--.. ·
5.
No; true iff z is real.
7.
a.
Use trig identities and the properties of real exponents.
b.
le'J = eRe(z)
c.
cos2 0 + sin2 0 = 1
d.
Use induction.
Jzl <
1; then Zn
9.
-/
--+
0. So
0 as n
ez
"I 0.
--+
oo.
Exercises for Chapter 1 (at end of chapter)
1.
<
= v'S; inf(S) = -v'S.
a.
sup(S)
b.
Neither sup(S) nor inf(S) exist (except as ±oo).
£.
sup(S) = 1; inf(S) = 0.
d.
sup(S) = O; inf(S) = -1.
e.
sup(S) = 1/3; inf(S) =0.3.
Appendix C Answers and Suggestiom
3.
687
> 0 and consider c = x/2.
a.
Suppose x
b.
Let x = min{e/2, 1/2}.
5.
If c > 0, use 1.3.2 to get N with sup(S) - c < XN $ sup(S). If n ~ N,
then sup(S) - e < XN $ Xn $ sup(S), so lxn - sup(S)I < c. If (Yn}i is
monotone decreasing and R = {Y1,Y2,y3, ... }, then Yn -+ inf(R).
7.
Since sup(A) + sup(B) is an upper bound for A+ B (why?), sup(A + B) $
sup(A) + sup(B). Next show that elements of A + B get arbitrarily close to
sup(A) + sup(B).
9.
Use Yn $ lxnl• Equality need not hold. For example, let Xn = (- l)n+I.
Then limsupyn -1 and limsuplxnl l; liminfyn ~ liminf(- lxnl),
11.
i => completeness: Let (xn}f be increasing and bounded above. By i,
(xn}i has a lub, say S. Check that Xn -+ S. See exercise 5. The proof
that ii => completeness is done by showing that Xn -+ - inf{-xn}-
13.
The space spanned by { (2, 1, -2, 0), (2, 1, 0, - 2)}.
15.
=
=
First show that d(xn,Xn+1) $ d(x1,X2)/2n-l, then that d(xn,Xn+k) $
d(x1, X2) /2n- 2 •
17.
If L is the set of lower bounds for Sandy E L, then y 5 inf(S) (why?).
So sup(L) $ inf(S). But inf(S) E L. So inf(S) $ sup(L).
19.
YI =y2 =y3 = 1/2.
21.
If x ES, let x1 = x. Otherwise, x1
of x2,x3, ... with Y1 $ Y2, etc.
23.
Since O is a lower bound for P, 0 $ inf(P). Use the given condition to
rule out O < inf(P).
25.
a.
If x E P, then there is a y E Q with x $ y $ sup(Q). So sup(Q) is
an upper bound for P.
b.
No; consider P = (0, l]. Q = (-1, l]. Then P $ Q but inf(P)
inf(Q).
c.
No; consider P = irrationals, Q = rationals. Here is another: P = { l}
and Q = {O, l}. Then P $ Q, but inf(P) > inf(Q). Also Q $ P, but
< x for all k. Let Y1 = xi, y2 be the first
>
Pf:.Q.
27.
inf(B) =. /2.
29.
Let y = sup{ u E R. I u2 $ x}. Obtain contradictions from y2 < x and from
r>x.
Appendix C Answers and Suggestions
688
31.
True.
33.
a.
Use the
b.
Take logarithms and use a.
00/00 form of l'Hopital's rule.
35.
(3-8i)4 /(1-l) 10 .
31.
z = ±(2 - i).
39.
a.
Re (1/z2) = (x 2 - y2)/(x2 + y2)2;
b.
Re(l/(3z+2)) =(3x+2)/((3x+2)2+9y2);Im(1/(3z+2)) =-3y/
((3x + 2)2 + 9y2)
Im (1/z2) = -2xy/(x2 + y2)2.
41.
If z = x + iy, then Re(iz) = Re(ix -y) = -y = -lm(z), and lm(iz) =
lm(ix -y) = x = Re(z).
43.
Use lzl 2 =z · and z + w = + w to expand
45.
lzln /n > R {::::::=:} loglzl > (logn+logR)/n. The left side is positive and
the right side tends to Oby l'Hopital's rule.
z
z
la - bl 2 + la+ bj2.
Chapter 2: Topology of Euclidean Space
2.1 Open Sets
1.
If v = (a, b) -:/- (0, 0), then v E U = D(v, II v II) C JR2
detail.)
3.
If Vo= (xo,Yo) EB, there is an r
Show II v - vo II < r => v E B.
5.
No in general; yes if B is open or 0 (j. B.
> 0 with
Ix - xol
\ {
(0, 0) }. (Give
< r => x
EA. (Why?)
2.2 Interior of a Set
1.
int(S) = {(x,y) E JR2 I xy > l}.
3.
Yes.
5.
Show more generally that if U is open and U c;: A, then U c int (A). (See
Exercise 3 at the end of the chapter.)
Appendix C Answers and Suggestions
689
2.3 Closed Sets
1.
Yes.
3.
If S = {xi, ... ,xk} and vo E IR.n\s, let r = (1/2) min{d(v0 ,xi), ... , d(vo,xk)}.
Show that I v - Vo II < r =} v E IR.n \ S.
5.
No.
2.4 Accumulation Points
1.
{(x, y) E IR.2 I y = 0 and O :S x :S 1}.
3.
a.
0.
b.
JR.2.
c.
x-axis.
d.
{(1/n,0) E IR 2 J n E .Z,n
5.
f. O} U {O}.
A is the parabola y = 4 - (x + 1)2. If (a, b) is an accumulation. point,
then there are (xb Yk) on it with Xk - t a and Yk - t b. By continuity,
b=4-(1 +a)2. So (a,b) EA.
2.5 Closure of a Set
1.
cl(A) = {(x,y) E IR.2 Ix~ y2}.
3.
JR.2.
5.
The set B(x, r)
cl (A) C B(x, r).
= M \ {y I d(y,x) >
2.6 Boundary of a Set
1.
bd(A) = {O} UA.
3.
bd (A)= {(x,y) E IR.2
5.
Yes.
Ix= y}.
r} is closed and A c B(x, r). So
Appendix C Answers and Suggestions
690
2.7 Sequences
1.
(0,0).
3.
Use Theorem 2.7.6 ii.
5.
cl (S) = {x ER
I x2 ~ 2} = [-v2, yl].
2.8 Completeness
1.
If Xn E N and Xn -+ x, then (xn)f' is a Cauchy sequence. Since N is
complete, it has a limit in N c M. Since limits in M are unique, this limit
must !>e x. So x E N. This shows that N is closed. (Why?)
3.
Yes, every Cauchy sequence is eventually constant.
S.
If (xn)f' is a Cauchy sequence it is bounded. It has a cluster point
and hence a convergent subsequence. So the whole sequence converges.
-.
(Why?)
2.9 Series of Real Numbers and Vectors
1.
Converges.
3.
Let e > 0. By 2.9.2, there is an N such that
So Xt-+ 0.
S.
Does not converge.
II Xt II < e for every k ~ N.
Exercises for Chapter 2 (at end of chapter)
1.
a.
Open.
b.
Closed.
C.
[-1, OJ is closed.
d.
Open and closed.
e.
Closed.
r.
Neither open nor closed.
g.
Neither open nor closed.
h.
Closed.
Appendix C Answers and Suggestions
691
3.
Each x in U satisfies the definition of an interior point of A. For closed
sets: If i4. c C and C is closed, then cl (A) C C.
5.
There is an open set U with x E U C A. Use the definition of openness
of U.
7.
It is not true for every subset of M. Try U = [0, l] C JR.
9.
Use 2.6.2 to get a, then use a.to get b.
11.
Consider d(xm,X) ::; e/2
13.·
Compare with Exercise 9.
15.
a.
Use M \ (M \ A) = A and the definition of boundary.
b.
Use the fact that bd (A) is closed. (Why?)
< e.
Lm 11 Xm II - (Why?)
17.
The series Lm((sinm)xm) converges by comparison to
So
Xm converges by 2.9.3.
19.
a.
21.
Each open neighborhood U of O contains a disk D(O, e).
23.
If V is the union of all open subsets of A, then V is open and contained
in A. So V c int(A). (See Exercise 3.) If x E int(A), there is an open U
with x E U C A, so x E V.
25.
Since e-disks are open, so is any union of them. Conversely, if A is open
and x E A, there is an ex > 0 with x E D(x1ex) C A. So A = UxEA D(x, ex),
27.
Pick bn with bn = lbnl
29.
Yes.
31.
What needs to be done is to show that an accumulation point of A' must
be an accumulation point of A. (A')' need not be equal to A'. Consider
A= {l/2, 1/3,, .. }.
33.
Show that On = 'sn+l - Sn are increasing and bounded above. Use the
telescoping series Sn = so +
Ok to show that the limit must be 0.
35.
Hint: Each of the sets must contain a rational number.
37.
[Ancl (M \A)]U[cl (A)\A] = [AU(cl (A)\A)]n[cl (M \ A) U(cl (A) \A)]=
cl (A) n [cl (M \ A) U (cl (A)\ A)]. Why is this bd (A)?
Lm
A limit point either belongs to A or is an accumulation point of A.
It might be an isolated point of A; an accumulation point is not.
< e/2n.
I;;:-/
Appendix C Answers and Suggestions
692
39.
First show that sup(S)- inf(S) is an upper bound for {x-y I x e S and y
S}. Then consider x and y in S very close to sup(S) and inf(S).
e
41. · Let Ube a neighborhood of x. There is an n with x E cl (An') \An. (Why?)
So Un (An\ {x})-,. 0.
43.
(1 + ./13)/2.
45.
We need
47.
Compare the graph of 1/x to inscribed and circumscribed rectangles to
show that the sequence "In = 1 + (1/2) + • · • + (1/n) - logn is decreasing
and bounded below by 0.
49.
Show that an
O(n-A) by considering Pn
11~1(1 - A/k) and use
Exercise 47 to establish that logPn = -A logn + 0(1).
51.
d.
53.
14.
r P /e-+ 0. Do some manipulations and use l'Hopital's rule.
0
=
=
1/e.
Chapter 3: Compact and Connected Sets
3.1 Compactness
1.
Let (xn}i be a sequence in A. If {xn} is a finite set, some entry must
b;e repeated infinitely often. Use it to get a convergent subsequence. If
it is an infinite set, then an accumulation point must be the limit of a
subsequence. (Careful, this is trickier than it looks. We need the limit of
a subsequence, not just any sequence.)
3.
cl (A) is closed. To show that it is totally bounded, let c > 0 and cover A by
finitely many c/2-balls. Show that cl (A) is covered by the corresponding
c-balls.
5.
A sequence converges iff it is eventually constant. This does not contradict
Exercise 4 since the entries i~ such a convergent sequence form a finite
set.
3.2 The Heine-Borel Theorem
1.
None of them.
Appendix C Answers and Suggestions
693
3.
All subsets of M are bounded, so A compact
{==::} A closed.
5.
No.
{==::}
(A closed and bounded)
3.3 The Nested Set Property
1.
nkFk={v'2} "f"0.
3.
If Fk
= {x1 I l 2: k}, then nk Fk = 0.
None of the sets Fk are compact.
3.4 Path-Connected Sets
1.
3.
a.
Not path-connected. Any path between two rationals must contain
an irrational.
b.
Path-connected.
c.
Path-connected.
d.
Not path-connected. If the point (1, 0) were added, it would be pathconnected.
No.
3.5 Connected Sets
1.
No. ] - 1/2, 3/2[ and ]2, 7 /2[ are disjoint open sets whose union contains A.
3.
If A c ~ 2 • let A* = {(x,y,O) E ~ 3 I (x,y) EA}. If "/(t) = (x(t),y(t))
is a path in A, then "/* (t) = (x(t), y(t), 0) is a path in A"'. If U* and V*
disconnect A*, then U = {(x,y) E ~ 2 I (x,y,O) E U*} and V = {(x,y) E
~ 2 1 (x,y,0) EV*} would disconnect A.
Exercises for Chapter 3 (at end of chapter)
1.
a.
Connected, not compact.
b.
Connected and compact.
c.
Compact, not connected if n
d.
Neither connected nor compact.
e.
Compact, but not connected if it contains more than one point.
= 1, connected and compact if 11 2
2.
Appendix C Answers and Suggestions
694
f.
n = 1, compact and not connected. n
g.
Compact and connected.
h.
Compact, not necessarily connected.
i.
Neither compact nor connected.
j.
Compact, not necessarily connected.
~
2, compact and connected.
3.
cl (A) is bounded since A is (why?), and it is closed. So it is compact.
Every infinite subset, such as A, must have an accumulation point. (See
Section 3.1, Exercise l.)
5.
a.
Uk=
b.
Uk= ]k - (1/3),k + (1/3)(, k = ... , -3, -2, -1, 0, 1, 2, 3, ....
{x E R 2
111 x II
< kj(k + 1)}, k = 1, 2, 3, ....
7.
Start with cl (Ak) = {x} U {xk,Xk+I, .. .}. Remember to show that if y-/- x,
then y ¢. cl (Ak) for large k. (No such Xk can be close to y since they are
close to x.) (Give detail.)
9.
a.
False; [0, 1] is compact, but JR. \ [0, 1] is not connected. In .IR.n,
A = {x E .IR.n I 1 :S II x II :S 2} is compact, but Rn \ A is not
connected.
b.
False; same examples as in a.
c.
False; ]a, b] is connected but is neither open nor closed.
d.
False for n = 1, true for n ~ 2. (Rn \ A is path-connected if n ~ 2.)
a.
If U and V disconnect Band x EB n U, then An U-/- 0. (Why?)
Similarly, A n V -/- 0. So A would be disconnected.
c.
Establish lemma: If B is connected, B C C, and C is disconnected
by U and V, then either B C U or B C V.
11.
13.
Pick Xn E Fn. Show (xn)i is a Cauchy sequence. Its limit must be
in cl (Fn) for all n. (Why?) There cannot be two such points since
diam(Fn) --t 0.
·
15.
a.
17.
As in Worked Example 1.2 of Chapter 1, find Zic E K' with d(x,Zk) --t
d~K) = inf{d(x,z) I z E K}. For large k they are all in the closed
ball of radius 1 + d(z, K) around x. Use compactness to get a subsequence
converging to some z. Then z E K (why?), and d(Zn(j), x) ----+ d(z, x) (why?).
So d(x, z) = d(x, K). (Why?) This does not work for open sets. The proof
does not work unless closed balls are compact.
If 1 (t) and µ(t) are paths from x to a in K 1 and from y to b in K2,
then r.p(t) = (-y(t),µ(t)) is a path from (x,y) to (a,b) in K1 x K2. ·
Appendix C Answers and Suggestions
695
19.
cl<Vn+1) C Vn C cl(Vn). Use the nested set property.
21.
Look closely at the definitions for the general statement. We know that
Rn is path-connected, so it is connected.
23.
Q C ] - oo, v'2 [ U ]v'2, oo[; both intervals are open, they are disjoint.
They disconnect Q. Similarly, R \ Q C] - oo, O[ U ]O, oo[ disconnects
R\Q.
25.
The sequence sin n for n = I, 2, 3, . . . is contained in the compact set
[-1, 1] and hence has a convergent subsequence sin(nk),
27.
If x E cl (A), there is a sequence (xn)i"' in A converging to x. If the
condition holds, then x E A. (Why?) So cl (A) c A, and A is closed. For
the converse, if A is closed and bounded, the lim inf and lim sup of a
sequence in A are the limits of subsequences, so they are in A.
29.
A is both compact and connected.
31.
The set A must be either not closed or not bounded or both. Treat these
cases separately. If A is not closed, there must be an accumulation point
of A which is not in A.
33.
If Ai ,A2,A3, ... are closed nowhere dense sets, put Bk = !Rn\ Ak to see
that the assertion is equivalent to
Theorem /f B1,B2, ... are open dense subsets of!Rn, then
B
= nkBk
is dense in Rn.
Let x E Rn and c > 0. Inductively define a nested sequence of closed
·
disks B(xk, rk) = {y 111 Xk - Y II S rk} by
=
D1
B(x1, r1) where x, E B 1,
II x -
x,
II < c/3,
II x,
- X2
r 1 < c/3,
and B(x 1,r1) CB,;
=
D2
B(x2, r2) where x2 E B2,
II < r1 /3,
r2
< r1/3,
and B(x2, r2) c B2;
etc.
Check that Dk C Bk for each k and D(x, c) 2 cl (D1) 2 D1 2 cl (D2) 2
D2 2 · .. . Existence of y E
cl (Dt) shows that B is dense in !Rn.
(Why?)
nk
35.
a.
All a. If a= 0 or 1, the sequence is constant.
b.
0
s as 1.
Appendix C Answers and Suggestions
696
37.
c.
0:Sa:Sl.
a.
For each x E A there is a Ox such. that D(x, Ox) c M \ B; apply
compactness to the covering {D(x, Dx/2) I x E A}.
b.
No; let A be the y-axis and B be the graph of y = 1/x.
Chapter 4: Continuous Mappings
4.1 Continuity
For xo E JR and e
> 0,
try 8 = min(l, e/(1 + 2 lxoi)).
1.
a.
3.
A
5.
a.
f(x) = 1, U = any non-empty open set.
b.
f(x) = { x,
1,
=/- 1([0, I]).
This is closed since/ is continuous and [0, l] is closed.
0, if X :S 0.
if 0 < x < 1; U = ] -1, 2[; f(U) = [0, 1] is closed.
if X ~ l.
4.2 Images of Compact and Connected Sets
1.
a.
Closed, not necessarily compact or connected.
b.
Open, not necessarily compact or connected.
c.
Connected, not necessarily compact, open, or closed.
d.
Compact, closed, and connected; not necessarily open.
3.
It is not possible if B is closed and bounded since then it is compact.
5.
Yes.
4.3 Operations on Continuous Mappings
1.
a.
Everywhere.
b.
f is continuous on JR\ {-1, l}.
c.
Everywhere.
3.
Use the fact that {0.56} is closed and sinx is continuous. A is not compact.
5.
f
=
g o h where g(x)
h are continuous.
=vx and h(x) =x2 + l. f
is continuous since g and
Appendix C Answers and Suggestions
697
4.4 The Boundedness of Continuous Functions on Compact
Sets
1.
f(x) = x/(1 + jxj).
3.
M =f- 1({sup(f(x))}). Why does this exist; why compact?
5.
sup(f{]0, oo[)) = 1 is not attained on ]0, oo[ . Extend by /(0) = 1 (continuous?) to get it on [0, oo[.
- --4.5 The Intermediate Value Theorem
'
1.
Quadratic polynomials need not be negative anywhere, so the method
,fails. The method works for quintic polynomials and, in general, for all
·
odd-degree polynomials.
3.
Apply the intermediate value theorem to g(x) = f (x) - x.
5.
f([0, 11) would be compact, and ]0, 1[ is not compact.
4.6 Uniform Continuity
1.
IO/x)- (1/y)I = l(x -y)/xyl :5 Ix - YI /a2
Download