INTRODUCTION TO ANALYSIS
TIM TRAYNOR
UNIVERSITY OF WINDSOR
CONTENTS
Axioms for the real number system
Consequences of the field axioms
Consequences of the ordered field axioms
Intervals
Maximum and minimum
Absolute value and distance
The natural numbers
A little number theory
The rational numbers
Incompleteness of the rationals
The existence of roots — a consequence of completeness
The extended real number system
The Complex Number System
A bit about $\mathbb{R}^n$
Cantor’s Principle and the uncountability of $\mathbb{R}$
The uncountability of the reals
Countability of the Rationals
Cantor’s Principle in $\mathbb{R}^n$
Suprema, infima, and the Archimedean Property
Supremum and Infimum as operations
x
Supremum and infimum in the extended real numbers system, R
Exponents
Natural exponent
Integer exponent
Rational exponents
Arbitrary real exponents
The existence of Topology in R and Rn and other metric spaces.
Open and Closed sets
Balls, open sets, and closed sets in subspaces
Interior, boundary, and closure
Closure
Bounded sets
Boundedness
Accumulation and the Bolzano-Weierstrass Theorem (set form)
Accumulation points
Compactness and the Heine-Borel Theorem
Compactness in subspaces
Convergence of sequences: Definition and Examples
Examples
Limit theorems for sequences of reals
Convergence to $+\infty$ and $-\infty$
Existence: Monotone sequences
Cluster points and subsequences: The Bolzano-Weierstrass theorem (sequence form)
Subsequences
Existence: Cauchy sequences
The number e, an application of Monotone Convergence
Connection with the series $\sum_k \frac{1}{k!}$
Convergence of $((1 + 2/n)^n)$
Limit inferior and limit superior.
Unbounded sequences
Series of numbers
Limits of functions
Uniqueness
Left and right limits
Infinite limits of functions and limits at $\pm\infty$
Continuity of functions
Discontinuities of a monotone function
Continuity and compactness
The Intermediate Value Theorem
Banach’s Contraction Mapping Theorem
Uniform Continuity
Differentiation
Mean Value Theorems
The Real Inverse Function Theorem
L’Hôpital’s Rule
Taylor’s Theorem
Convex Functions
The Riemann Integral
Existence of the Riemann integral
The Fundamental Theorem of Calculus
Lebesgue’s Criterion for Riemann Integrability (optional)
The Cantor set
Pointwise and uniform convergence
Distance interpretation
Uniform convergence of series of functions
Uniform convergence: Continuity, integral, derivative.
Continuity
Integration
Differentiation
A continuous nowhere differentiable function on R.
Power series
The complex case
The product of two power series
The exponential and trigonometric functions
Details (proofs)
The connection with the angles of geometry
Differentiation of vector-valued and complex-valued functions
Integration of vector-valued functions
Rectifiable curves — arc length
Differentiation of vector functions of a vector variable
Connection with the usual definition for functions of a real variable
The case of real-valued functions of a vector variable
The space $L(\mathbb{R}^n, \mathbb{R}^m)$
Rules of differentiation
Directional derivatives and partial derivatives
The gradient
The matrix of the derivative. The Jacobian
Continuous differentiability
The Inverse Function Theorem
The Implicit Function Theorem
Appendix: Countability
Many thanks to Maria Pap, Aleksandra Katafiasz,
and students of several years. T.T.
1. AXIOMS FOR THE REAL NUMBER SYSTEM
We begin by assuming the existence of a set R called the set of real numbers with certain
properties. We will see that everything is based on these.
The system of real numbers forms a complete ordered field. This means that it satisfies
the following axioms:
The field axioms. $\mathbb{R}$ is a field; that is, a set with two binary operations $(x, y) \mapsto x + y$ and
$(x, y) \mapsto xy$, called addition and multiplication, with two members $0$ and $1$ and two
unary operations $x \mapsto -x$ and $x \mapsto x^{-1}$ (for $x \ne 0$) such that:
(A1) For all $x, y \in \mathbb{R}$, $x + y = y + x$ (commutativity).
(A2) For all $x, y, z \in \mathbb{R}$, $(x + y) + z = x + (y + z)$ (associativity).
(A3) For all $x \in \mathbb{R}$, $x + 0 = x = 0 + x$ ($0$ is an identity for addition).
(A4) For all $x \in \mathbb{R}$, $x + (-x) = 0 = (-x) + x$ ($-x$ is an additive inverse of $x$).
(M1) For all $x, y \in \mathbb{R}$, $xy = yx$ (commutativity).
(M2) For all $x, y, z \in \mathbb{R}$, $(xy)z = x(yz)$ (associativity).
(M3) $1 \ne 0$ and for all $x \in \mathbb{R}$, $x1 = x = 1x$ ($1$ is an identity for multiplication).
(M4) For all $x \in \mathbb{R}$ with $x \ne 0$, $xx^{-1} = 1 = x^{-1}x$ ($x^{-1}$ is a multiplicative inverse of $x$).
(DL) For all $x, y, z \in \mathbb{R}$, $x(y + z) = xy + xz$ and $(y + z)x = yx + zx$ (distributive law).
The order axioms. In addition to the above there is a relation $<$ on $\mathbb{R}$ making it an ordered
field, that is, it satisfies:
(O1) For all $x, y \in \mathbb{R}$, exactly one of $x < y$, $x = y$, $y < x$ holds (trichotomy).
(O2) If $x < y$ and $y < z$, then $x < z$ (transitivity).
(O3) $x < y$ implies $x + z < y + z$ (addition preserves order).
(O4) $x < y$ and $z > 0$ implies $xz < yz$ (multiplication by a number $> 0$ preserves order).
The Completeness Axiom. If $A$ and $B$ are non-empty subsets of $\mathbb{R}$ such that for all $a \in A$
and $b \in B$, $a < b$, then there is an $x \in \mathbb{R}$ such that $a \le x$ for all $a \in A$ and $x \le b$ for all
$b \in B$. (Here $x \le y$ is an abbreviation for $x < y$ or $x = y$.)
(There are many other ways of expressing the completeness axiom and we will meet
some of them. We have chosen one that can be stated with very little theory.)
Notice, by the way, that the conclusion of the completeness axiom still holds if $a < b$
is replaced by $a \le b$. (Why?)
In analysis books, it is traditional to write only $x + 0 = x$ in the axiom for additive
identity, since the other part, $x = 0 + x$, follows from commutativity. Similar statements
apply for the multiplicative identity and the additive and multiplicative inverses. Here,
however, we have elected to use the conventions usually used in algebra books, to facilitate
the comparison with other algebraic systems.
The distributive law (DL) is described by saying that multiplication is distributive over
addition. This distinguishes it from $x + (yz) = (x + y)(x + z)$, which does not hold.
By contrast, in elementary set theory we do have two distributive laws (intersection over
union and union over intersection).
From the field axioms we can deduce all the usual rules of algebraic manipulation that
we know so well. The order axioms add the ability to handle inequalities in the way we
are accustomed, and the completeness axiom adds the power of Calculus and much more.
Even though the field axioms give us all the rules of the algebra of numbers, they don’t
guarantee the existence of very many numbers. In fact, the field could have only the members 0 and 1. (You might want to amuse yourself by figuring out what the addition and
multiplication tables would have to look like then.) But, when we add the order axioms,
we find that
$0 < 1 < 1 + 1 < (1 + 1) + 1 < ((1 + 1) + 1) + 1 < \cdots$.
This gives an infinite number of numbers. We will be able to produce the set $\mathbb{N} =
\{1, 2, \dots\}$ of natural numbers, complete with the Principle of Mathematical Induction.
From them and their additive inverses and 0 we will get the set Z of integers and, using
multiplicative inverses, we can upgrade to Q, the rational numbers. This set is again an
ordered field.
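The remark above that a field could have only the members 0 and 1 can be checked mechanically. The following is a minimal sketch, assuming the operations are arithmetic modulo 2; the names `F`, `add`, `mul`, `neg`, `inv` and `field_axioms_hold` are illustrative and not part of the notes.

```python
from itertools import product

# Hypothetical model: the two-element set {0, 1} with arithmetic modulo 2.
F = (0, 1)

def add(x, y):
    return (x + y) % 2

def mul(x, y):
    return (x * y) % 2

def neg(x):
    # The additive inverse of x: the element a of F with x + a = 0.
    return next(a for a in F if add(x, a) == 0)

def inv(x):
    # The multiplicative inverse of a non-zero x: the element a with x * a = 1.
    return next(a for a in F if mul(x, a) == 1)

def field_axioms_hold():
    """Check A1-A4, M1-M4 and DL for every choice of x, y, z in F."""
    for x, y, z in product(F, repeat=3):
        checks = [
            add(x, y) == add(y, x),                           # A1
            add(add(x, y), z) == add(x, add(y, z)),           # A2
            add(x, 0) == x == add(0, x),                      # A3
            add(x, neg(x)) == 0 == add(neg(x), x),            # A4
            mul(x, y) == mul(y, x),                           # M1
            mul(mul(x, y), z) == mul(x, mul(y, z)),           # M2
            mul(x, 1) == x == mul(1, x),                      # M3 (1 != 0 is clear)
            x == 0 or mul(x, inv(x)) == 1 == mul(inv(x), x),  # M4
            mul(x, add(y, z)) == add(mul(x, y), mul(x, z)),   # DL
        ]
        if not all(checks):
            return False
    return True
```

In this model `add(1, 1)` is `0`, so the chain $0 < 1 < 1 + 1$ cannot hold for any relation satisfying trichotomy and transitivity, which is one way to see that this two-element field admits no order making it an ordered field.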
But it is the Completeness Axiom that really gives us lots of numbers. You probably
have heard that $\sqrt{2}$ and $\pi$ are not rational numbers. It is the completeness axiom that
guarantees they exist as real numbers. It is completeness that will tell us that a continuous
function that is negative at some point and positive somewhere else must be 0 somewhere
in between. (Without completeness, the graph of the cosine function would never intersect
the x-axis). It will tell us that many integrals exist, even if we can’t figure out their values.
It will enable us to solve differential equations, etc.
Look closely at this axiom: it is of a different nature than the others. The others talk
only about one, two, or three real numbers at a time. The Completeness Axiom, however,
talks about two sets of real numbers. That is what gives it its power.
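To see the two sets of the Completeness Axiom at work, take $A$ to be the positive rationals with $x^2 < 2$ and $B$ the positive rationals with $x^2 > 2$, so that $a < b$ for all $a \in A$, $b \in B$. The sketch below (an illustration only; the function name `bracket_sqrt2` and the use of bisection are assumptions, not the author's method) produces members of $A$ and $B$ that are as close together as we like. Any number separating $A$ from $B$ is trapped between them.

```python
from fractions import Fraction

def bracket_sqrt2(steps):
    """Return rationals (a, b) with a*a < 2 < b*b after `steps` bisections of [1, 2]."""
    a, b = Fraction(1), Fraction(2)
    for _ in range(steps):
        mid = (a + b) / 2
        # No rational has square exactly 2, so mid lands in A or in B.
        if mid * mid < 2:
            a = mid      # mid belongs to A
        else:
            b = mid      # mid belongs to B
    return a, b
```

Each step halves the gap, so after `steps` bisections the gap is exactly $1/2^{\text{steps}}$. Exact rational arithmetic is used so that the comparisons are not affected by rounding.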
2. CONSEQUENCES OF THE FIELD AXIOMS
We won’t spend much time on these, since the student will have ample practice with
them in Algebra courses; we just give a few examples which should convince you that all
the usual algebraic properties will be deducible.
Before we begin, look at the first axiom. It starts
“for all $x, y \in \mathbb{R}$”.
What this really means is
“for all $x \in \mathbb{R}$ and for all $y \in \mathbb{R}$”.
This is a common way to abbreviate. Make sure you don’t get “for all $x, y$ . . . ” mixed up
with “for all $(x, y)$ . . . ”; the latter is talking about an ordered pair of elements.
An expression such as “for all $x$” is an example of a universal quantifier. Logically,
“for each $x$” and “for every $x$” ($\forall x$) also mean the same thing. The other type of quantifier
is the existential quantifier, “there exists”, “for some” ($\exists$). Quantifiers play a very
significant role in Analysis.
The first results we will establish are
Uniqueness of the identity elements
Cancellation laws for addition and multiplication
Uniqueness of the inverses
2.1 Theorem. (Uniqueness of the identity elements.)
(1) If $a \in \mathbb{R}$ and for all $x \in \mathbb{R}$, $x + a = x$, then $a = 0$.
(2) If $b \in \mathbb{R}$ and for all $x \in \mathbb{R}$, $xb = x$, then $b = 1$.
Proof. (1) Since $0$ is an identity for addition (A3),
for all $x \in \mathbb{R}$, $x = 0 + x$.
In particular,
$a = 0 + a$. (1)
By hypothesis,
for all $x \in \mathbb{R}$, $x + a = x$.
Therefore,
$0 + a = 0$,
which combined with (1) gives
$a = 0$,
as required.
(2) The corresponding statement for multiplication and 1 is proved in the same way.
Here is a more compact version of the proof of (1).
Proof.
$a = 0 + a$, by A3, (2)
$= 0$, by hypothesis; (3)
and hence $a = 0$.
2.2 Theorem. (Cancellation)
(1) For all $x, y, z \in \mathbb{R}$, if $x + z = y + z$, then $x = y$.
(2) Similarly, for all $x$, $y$ and $w$ in $\mathbb{R}$, if $w \ne 0$ and $xw = yw$, then $x = y$.
Proof. (1) Fix arbitrary $x, y, z \in \mathbb{R}$ for which
$x + z = y + z$.
Then,
$(x + z) + (-z) = (y + z) + (-z)$, (4)
$x + (z + (-z)) = y + (z + (-z))$, by A2, associativity, (5)
$x + 0 = y + 0$, by A4, (6)
$x = y$, by A3. (7)
The proof of (2) is left as an exercise.
Inverses, too, are unique:
2.3 Theorem. For $x \in \mathbb{R}$, the only additive inverse of $x$ is $-x$; if $x$ is not $0$, its only
multiplicative inverse is $x^{-1}$.
Proof. Let $x \in \mathbb{R}$. Suppose $a$ is an additive inverse of $x$. Then,
$x + a = 0$.
But also
$x + (-x) = 0$;
hence
$x + a = x + (-x)$.
By commutativity,
$a + x = (-x) + x$,
and then $a = -x$, by cancellation (the previous result).
The proof of the corresponding result for multiplicative inverse is left to the reader.
The word inverse by itself usually refers to the multiplicative inverse, while $-x$ is called
minus $x$, or “the negative of $x$”. In the latter case, we must be aware that the negative of a
negative number (to be defined later) is not negative, but positive.
We define $x - y$ to be $x + (-y)$ and, if $y \ne 0$, $\frac{x}{y}$ to be $xy^{-1}$.
Once we get going, of course, we will not even bother with the justifications of simple
algebraic calculations, since we will have much more immediate concerns. The student
should, however, achieve a level of competence that such tiny gaps in reasoning can be
filled anytime.
Here are some more consequences of the field properties:
2.4 Theorem. Let $a, b, c, \dots$ be elements of $\mathbb{R}$ (or another fixed field). Then,
(1) $-0 = 0$;
(2) $1^{-1} = 1$;
(3) $-(-a) = a$;
(4) if $a \ne 0$, then $(a^{-1})^{-1} = a$;
(5) $a \cdot 0 = 0$;
(6) $(-a)b = -(ab) = a(-b)$;
(7) $(-a)(-b) = ab$;
(8) $ab = 0$ implies either $a = 0$ or $b = 0$;
(9) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$;
(10) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$;
(11) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} = \frac{c}{d}$ if and only if $ad = bc$.
We prove a few of these and leave the rest to the reader.
We prove a few of these and leave the rest to the reader.
Proof. (1) Since $0 + 0 = 0$, by (A3), $0$ is an additive inverse of $0$. But such an inverse is
unique, hence $-0 = 0$.
(5) Start with
$0 + 0 = 0$.
Multiply by $a$ on the left to obtain
$a(0 + 0) = a0$.
Now use the distributive law:
$a0 + a0 = a0$.
The right side here is also $0 + a0$, by A3 and A1, so
$a0 + a0 = 0 + a0$,
yielding
$a0 = 0$, by cancellation.
(6) As with (1) we show $(-a)b$ is an additive inverse of $ab$.
$ab + (-a)b = b(a + (-a))$, by commutativity and distributivity,
$= b0$, by A4,
$= 0$, by the previous result.
Thus $ab + ((-a)b) = 0$, so by uniqueness of additive inverse, $(-a)b = -(ab)$.
That $a(-b) = -(ab)$ is proved in the same way.
3. CONSEQUENCES OF THE ORDERED FIELD AXIOMS
We begin with a very simple, but very striking, consequence of these seemingly innocent
assumptions. It illustrates the role of the trichotomy axiom.
3.1 Theorem. 0 < 1.
Proof. By trichotomy $0 < 1$, $0 = 1$, or $1 < 0$. We know $0 \ne 1$, by axiom (M3). Now,
suppose $1 < 0$. Then, since addition preserves order (O3),
$1 + (-1) < 0 + (-1)$.
And, since $1 + (-1) = 0$ and $0 + (-1) = -1$, we have
$0 < -1$.
This and O4 yield
$0(-1) < (-1)(-1)$.
But $0(-1) = 0$ and $(-1)(-1) = 1 \cdot 1 = 1$, so
$0 < 1$.
This contradicts $1 < 0$. The only possibility remaining is $1 > 0$, as required.
Let us officially define
$2 = 1 + 1$, $3 = 2 + 1$, $4 = 3 + 1$, $5 = 4 + 1$,
$6 = 5 + 1$, $7 = 6 + 1$, $8 = 7 + 1$, $9 = 8 + 1$.
From the previous result and O3 we get
$1 + 0 < 1 + 1$,
so
$1 < 2$.
In general, for all $x \in \mathbb{R}$, $x < x + 1$, so $1 < 2 < 3 < 4$ and so on. This shows that there are
lots of numbers going out to the right. Since taking negatives reverses the order (as we will
see), there are also lots of numbers going to the left. Here is how we can also get lots of
numbers between two numbers. It is surprising how useful this fact is.
3.2 Theorem. For real numbers $a$ and $b$, if $a < b$, then $a < \frac{a+b}{2} < b$. Hence, if $a < b$,
there exists $x \in \mathbb{R}$ with $a < x < b$.
Proof. (Practice exercise.)
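Applying Theorem 3.2 repeatedly gives as many numbers between $a$ and $b$ as we please. Here is a small sketch of that idea, assuming exact rational inputs; the helper name `midpoints` is hypothetical.

```python
from fractions import Fraction

def midpoints(a, b, count):
    """Return `count` distinct numbers strictly between a and b (requires a < b)."""
    found = []
    right = b
    for _ in range(count):
        # By Theorem 3.2, a < (a + right)/2 < right, so each new point
        # lies strictly between a and the previous one.
        right = (a + right) / 2
        found.append(right)
    return found
```

For instance, `midpoints(Fraction(0), Fraction(1), 4)` returns the four fractions $1/2, 1/4, 1/8, 1/16$, all strictly between $0$ and $1$.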
As you are aware, a < b < c means a < b and b < c. One of the conveniences of this
notation is that, by transitivity, we can drop the middle one, so it includes the statement
a < c.
The expression $a \le b$ is an abbreviation for $a < b$ or $a = b$. The new relation $\le$ on $\mathbb{R}$
is still transitive and addition preserves this order:
$a \le b$ implies $a + c \le b + c$.
And similarly if $c \ge 0$,
$a \le b$ implies $ac \le bc$.
The trichotomy axiom O1 breaks into two conditions:
(O1a) For all $x, y \in \mathbb{R}$, $x < y$ or $x = y$ or $y < x$, and
(O1b) For all $x, y \in \mathbb{R}$, not ($x < y$ and $x = y$) and not ($x < y$ and $y < x$) and not
($x = y$ and $y < x$).
These can be stated in terms of the relation $\le$ as the following two statements:
(TOa) For all $x, y \in \mathbb{R}$, either $x \le y$ or $y \le x$.
(TOb) For all $x, y \in \mathbb{R}$, if $x \le y$ and $y \le x$ then $x = y$.
(A relation that satisfies TOa and TOb and O2 (transitivity) is called a total order.)
The student should prove the above statements as an exercise in simple logic. We will
assume them in what follows.
The expression $a \le b < c$ means $a \le b$ and $b < c$, and we can conclude $a < c$;
$a < b \le c$ is interpreted similarly and yields the same conclusion. (You should check
this.)
As you expect, the expression $a > b$ is defined to mean $b < a$ and $a \ge b$ is defined to
mean $b \le a$.
3.3 Theorem. (Our First Analysis Result) For real numbers $a, b, c, \varepsilon_0$:
(1) If for all $x > a$, $c \le x$, then $c \le a$.
(2) Let $a < b$. If for all $x$ such that $a < x < b$, $c \le x$, then $c \le a$.
(3) If for all $\varepsilon > 0$, $c \le a + \varepsilon$, then $c \le a$.
(4) Let $\varepsilon_0 > 0$. If $c \le a + \varepsilon$ whenever $0 < \varepsilon < \varepsilon_0$, then $c \le a$.
In the sequel, we will refer to this result as FAR.
Many students have difficulty believing this result, because they are thinking of $x$ as
fixed. But the hypothesis is about all possible $x$ in the interval $(a, +\infty)$ (in version (1))
and all possible $x$ in the interval $(a, b)$ in version (2). (See the definitions of intervals
below.) If you draw yourself the picture, the results become “obvious”: if $c$ is less than
or equal to every element of $(a, b)$, it must be less than or equal to the left endpoint $a$.
Proof. (1) Assume that for all $x > a$, $c \le x$. By trichotomy, the negation of $c \le a$ is
$c > a$. So, suppose that
$c > a$.
Then,
$a < \frac{a+c}{2} < c$.
Thus, choosing $x = \frac{a+c}{2}$ yields $x > a$ and $c > x$. This proves that
there exists $x > a$ with $c > x$.
This contradicts the hypothesis that
for all $x > a$, $c \le x$.
Thus, $c > a$ is false so $c \le a$, as required.
By the way, even though the hypothesis is just that $\forall x > a$, $c \le x$, this actually implies
$\forall x > a$, $c < x$. In fact, after the result has been proved, we even see $\forall x > a$, $c \le a < x$.
(2) Method 1:
Let $a < b$ and assume
$c \le x$, for all $x$ with $a < x < b$.
Suppose $c > a$. There are two cases: either $c > b$ or $c \le b$.
If $c > b$, take $x = \frac{a+b}{2}$, so that $a < x < b$. By the hypothesis, $c \le x$, and by
transitivity $c < b$. Since $c > b$, this contradicts trichotomy.
If, instead, $c \le b$, take $x = \frac{a+c}{2}$. Then, $a < x < c$. But then, by transitivity, $a < x < b$;
so again, by hypothesis, $c \le x$ and again this contradicts trichotomy, since $x < c$.
Thus, $c > a$ is false, so $c \le a$.
Method 2 (reducing to version (1)): Let $a < b$. Assume
$c \le x$, for all $x$ with $a < x < b$.
Now, let $x > a$. If $x < b$, then $a < x < b$, so $c \le x$. But $a < \frac{a+b}{2} < b$, so if $x \ge b$,
then
$c \le \frac{a+b}{2} < b \le x$,
so again $c \le x$.
Thus,
for all $x > a$, $c \le x$;
hence, $c \le a$, by (1).
(3) Suppose $c \le a + \varepsilon$, for all $\varepsilon > 0$. Let $x > a$. Then $x - a > 0$, so $c \le a + (x - a) = x$.
Thus,
for all $x > a$, $c \le x$;
hence, $c \le a$, by (1) again.
(4) This is an exercise. You can mimic the proof of version (2) or deduce it from (3) as
(2) was deduced from (1).
Each of these versions of the First Analysis Result has a corresponding version for the
opposite inequalities. For example, if $c \ge x$, for all $x < a$, then $c \ge a$. I suggest you
prove these.
A real number $a$ is called
positive, if $a > 0$,
negative, if $a < 0$,
non-negative, if $a \ge 0$, and
non-positive, if $a \le 0$.
3.4 Theorem. For real numbers $a, b$:
(1) if $a < b$ then $-b < -a$;
(2) if $a$ is negative then $-a$ is positive;
(3) if $a$ is positive then $-a$ is negative.
Proof. (1) Let $a < b$. Adding $(-a) + (-b)$ to both sides gives
$a + ((-a) + (-b)) < b + ((-a) + (-b))$, by O3,
$(a + (-a)) + (-b) < (b + (-b)) + (-a)$, by A2 and A1,
$0 + (-b) < 0 + (-a)$,
$-b < -a$,
as required.
(2) Let $a$ be negative. Then $a < 0$, so by (1), $-0 < -a$. But $-0 = 0$. Thus, $-a$ is
positive by definition.
Statement (3) is proved in the same way.
3.5 Theorem. Let $a, b, c, d$ be real numbers.
(1) If $a < b$ and $c < d$ then $a + c < b + d$.
(2) If $0 < a < b$ and $0 < c < d$ then $ac < bd$.
Proof. (1) Start with
$a < b$.
By O3 we may add $c$ to both sides, yielding
$a + c < b + c$. (*)
But also
$c < d$,
so we may add $b$ to both sides and use commutativity, yielding
$b + c < b + d$. (**)
Combining (*) and (**), using transitivity (O2), we have
$a + c < b + d$,
as required.
(2) This is proved similarly, using O4 instead of O3. Note carefully how the positivity
is used.
Here are a few more familiar properties. The proofs of these should present no serious
difficulties. They are left as exercises. Make sure you try some. You should also state and
prove versions of these results that use $\le$ and $\ge$ in place of one or both of $<$ and $>$.
3.6 Theorem. Let $a, b, c, d$ be real numbers. Then:
(1) $a < b$ and $c < 0$ implies $ac > bc$.
(2) $a > 0$ and $b > 0$ implies $ab > 0$.
(3) $a < 0$ and $b < 0$ implies $ab > 0$.
(4) $a < 0$ and $b > 0$ implies $ab < 0$.
(5) $a^2 \ge 0$, and $a^2 > 0$ if $a \ne 0$.
(6) If $a > 0$ then $a^{-1} > 0$.
(7) If $0 < a < b$ then $0 < b^{-1} < a^{-1}$.
(8) If $0 < a < 1$, then $a^2 < a$.
(9) If $a > 1$, then $a < a^2$.
(10) $0 < a < b$ implies $a^2 < b^2$.
(11) $0 < b, d$ implies $\frac{a}{b} < \frac{c}{d}$ iff $ad < bc$.
(12) $0 \le a < b$ implies $\frac{a}{a+1} < \frac{b}{b+1}$.
(13) $0 < b, d$ and $\frac{a}{b} < \frac{c}{d}$ implies $\frac{a}{b} < \frac{a+c}{b+d} < \frac{c}{d}$.
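Before proving items (11) and (13) of Theorem 3.6, it can be reassuring to test them on small cases. The sketch below tries all integer numerators and positive integer denominators up to a bound. It is only a sanity check on finitely many rationals, not a proof, and the name `fraction_checks` is illustrative.

```python
from fractions import Fraction
from itertools import product

def fraction_checks(limit):
    """Test items (11) and (13) of Theorem 3.6 for |a|, |c| <= limit and 1 <= b, d <= limit."""
    for a, c in product(range(-limit, limit + 1), repeat=2):
        for b, d in product(range(1, limit + 1), repeat=2):
            lhs, rhs = Fraction(a, b), Fraction(c, d)
            # (11): a/b < c/d if and only if ad < bc (here b, d > 0).
            if (lhs < rhs) != (a * d < b * c):
                return False
            # (13): if a/b < c/d then (a + c)/(b + d) lies strictly between them.
            if lhs < rhs and not (lhs < Fraction(a + c, b + d) < rhs):
                return False
    return True
```

The quantity $\frac{a+c}{b+d}$ in item (13) is computed from the particular numerators and denominators given, which is why the loop runs over the integers $a, b, c, d$ rather than over the fractions themselves.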
Intervals. If $a, b$ are real numbers, we define four bounded intervals by
$(a, b) = \{x \in \mathbb{R} : a < x < b\}$
$[a, b] = \{x \in \mathbb{R} : a \le x \le b\}$
$(a, b] = \{x \in \mathbb{R} : a < x \le b\}$
$[a, b) = \{x \in \mathbb{R} : a \le x < b\}$
We notice that if $b < a$ these all turn out to be empty. We usually assume $a \le b$ when
we use this notation. If $a = b$, $[a, b] = \{a\}$, whereas the others are empty. We call $(a, b)$
an open interval and $[a, b]$ a closed interval; $(a, b]$ and $[a, b)$ are half-open intervals.
In all four cases, $a$ is the left endpoint and $b$ is the right endpoint of the interval. If
$a < b$, the interval is called non-degenerate, otherwise degenerate. Notice that each such
non-degenerate interval is non-empty since $\frac{a+b}{2}$ belongs to it.
There are also unbounded intervals defined by
$(a, +\infty) = \{x \in \mathbb{R} : x > a\}$ (8)
$[a, +\infty) = \{x \in \mathbb{R} : x \ge a\}$ (9)
$(-\infty, b) = \{x \in \mathbb{R} : x < b\}$ (10)
$(-\infty, b] = \{x \in \mathbb{R} : x \le b\}$ (11)
$(-\infty, +\infty) = \mathbb{R}$ (12)
The symbols $+\infty$ and $-\infty$ are from the extended real number system. (See section 8.)
Every interval $J$ of $\mathbb{R}$ (bounded or unbounded) has the property that for all $x, y \in J$, if
$x < t < y$ then $t \in J$. Once one has the concept of supremum and infimum (see section
12, Suprema, infima, and the Archimedean Property) one can show that this
property characterizes intervals.
3.7 Theorem. (Characterization of interval) A subset $J$ of $\mathbb{R}$ is an interval if and only if
for all $x, y \in J$ and $t \in \mathbb{R}$, if $x < t < y$ then $t \in J$.
Maximum and minimum. If $x$ and $y$ are two real numbers, the maximum of $x$ and $y$ is
$\max\{x, y\} = \begin{cases} x, & \text{if } x \ge y \\ y, & \text{if } y \ge x; \end{cases}$
the minimum of $x$ and $y$ is
$\min\{x, y\} = \begin{cases} x, & \text{if } x \le y \\ y, & \text{if } y \le x. \end{cases}$
3.8 Theorem. For real numbers $x, y, a$,
(1) $\max\{x, y\} \le a$ if and only if $x \le a$ and $y \le a$, and
(2) $\min\{x, y\} \ge a$ if and only if $x \ge a$ and $y \ge a$.
(3) $\max\{x, y\} < a$ if and only if $x < a$ and $y < a$, and
(4) $\min\{x, y\} > a$ if and only if $x > a$ and $y > a$.
Proof. (Exercise)
Absolute value and distance. For each $x \in \mathbb{R}$, the absolute value of $x$ is defined to be
$|x| = x$ if $x \ge 0$ and $-x$ if $x < 0$.
An immediate consequence (which you should prove for practice) is:
3.9 Lemma. Let $x, a \in \mathbb{R}$. Then, $|x| \le a$ iff $-a \le x \le a$. (Similarly for $<$.)
3.10 Theorem. For real numbers:
(1) $|x| \ge 0$, for all $x$; $|x| = 0$ iff $x = 0$.
(2) $|x + y| \le |x| + |y|$.
(3) $|xy| = |x||y|$.
We are using the common abbreviation “iff” for “if and only if”.
Proof. (1) If $x \ge 0$ then $|x| = x \ge 0$. If $x < 0$, $|x| = -x > 0$. In both cases, $|x| \ge 0$.
Now, if $x = 0$, $|x| = x = 0$ by definition. Conversely, suppose $x \ne 0$. Then either
$x < 0$ or $x > 0$. If $x < 0$ then $|x| = -x > 0$, while if $x > 0$ then $|x| = x > 0$. Thus, in
both cases $|x| \ne 0$.
(2) Since $|x| \le |x|$, we have, by the Lemma,
$-|x| \le x \le |x|$
and similarly
$-|y| \le y \le |y|$,
and adding gives
$-(|x| + |y|) \le x + y \le |x| + |y|$.
Now using the Lemma again with $a = |x| + |y|$, we have
$|x + y| \le |x| + |y|$,
as required.
Alternate Proof. In case $x \ge 0$, $y \ge 0$, we have $|x| = x$, $|y| = y$, and $x + y \ge 0$,
so that
$|x + y| = x + y = |x| + |y|$.
In case $x < 0$, $y < 0$, we have $|x| = -x$, $|y| = -y$ and $x + y < 0$, so
$|x + y| = -(x + y) = -x + (-y) = |x| + |y|$.
In case $x \ge 0$, $y < 0$, we have $|x| = x$, $|y| = -y > 0$. If $x + y \ge 0$, this gives
$|x + y| = x + y < x + (-y) = |x| + |y|$,
while if $x + y < 0$ it gives
$|x + y| = -x - y \le x + (-y) = |x| + |y|$.
The case $x < 0$, $y \ge 0$ is similar. In all 4 cases then, we had $|x + y| \le |x| + |y|$.
The proof of (3) is similar to the alternative proof of (2), but easier.
Property (2) here is called the triangle inequality for absolute value. It gets its name
from its analogue in the Euclidean plane, where it describes the relationship between the
lengths of sides of a triangle. Often the following variant of the triangle inequality is useful.
3.11 Corollary. Let $x, y \in \mathbb{R}$. Then $\bigl||x| - |y|\bigr| \le |x - y|$ and $\bigl||x| - |y|\bigr| \le |x + y|$.
Proof. $|x| = |x - y + y| \le |x - y| + |y|$, by the triangle inequality, so
$|x| - |y| \le |x - y|$.
Similarly
$|y| - |x| \le |x - y|$.
Now $\bigl||x| - |y|\bigr|$ is one of $|x| - |y|$ or $|y| - |x| = -(|x| - |y|)$, depending on the sign, so
in both cases
$\bigl||x| - |y|\bigr| \le |x - y|$.
To obtain the second statement from the first, replace $y$ by $-y$ and use $|-y| = |y|$.
3.12 Remark. It is a common source of error to reverse the inequality in this corollary,
writing $|x - y| \le \bigl||x| - |y|\bigr|$, which is false in general.
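A quick experiment makes the remark concrete. The sketch below checks the triangle inequality and Corollary 3.11 on a handful of sample rationals and collects the pairs at which the reversed inequality fails. The sample set and function names are arbitrary choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample points -3, -5/2, ..., 5/2, 3 (an arbitrary finite test set).
samples = [Fraction(n, 2) for n in range(-6, 7)]

def triangle_holds(x, y):
    return abs(x + y) <= abs(x) + abs(y)          # Theorem 3.10 (2)

def corollary_holds(x, y):
    return abs(abs(x) - abs(y)) <= abs(x - y)     # Corollary 3.11

def reversed_holds(x, y):
    return abs(x - y) <= abs(abs(x) - abs(y))     # the mistaken form of Remark 3.12

counterexamples = [(x, y) for x, y in product(samples, repeat=2)
                   if not reversed_holds(x, y)]
```

The pair $x = 1$, $y = -1$ appears among the counterexamples: there $|x - y| = 2$ while $\bigl||x| - |y|\bigr| = 0$. In general the reversed form fails whenever $x$ and $y$ are non-zero with opposite signs.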
For real numbers $x$ and $y$, the quantity $|x - y|$ is called the distance from $x$ to $y$. We
will write $d(x, y)$ or $\mathrm{dist}(x, y)$ for this.
3.13 Theorem. For all $a, b, c \in \mathbb{R}$, $d(a, c) \le d(a, b) + d(b, c)$.
Proof. Let $a, b, c \in \mathbb{R}$. Then, $|a - c| = |(a - b) + (b - c)| \le |a - b| + |b - c|$, by the triangle
inequality for absolute value. The rest is just translation into the distance notation.
The property stated in the above theorem is called the triangle inequality for distance.
Again, the name really comes from the analogous result in two dimensions. It says that the
length of one side of a triangle is less than or equal to the sum of the lengths of the other
two sides.
3.1. Use the concept of minimum of two real numbers to simplify the proof of Our First Analysis
Result (2).
3.2. Let $a, b, c$ be real with $b > 0$. If $\varepsilon_0 > 0$ and $c \le a + \varepsilon b$, whenever $0 < \varepsilon < \varepsilon_0$, then $c \le a$.
3.3. Let $I$ be a set of real numbers, and let $a, b \in \mathbb{R}$ with $a \le b$. Then $I$ is an interval with endpoints
$a$ and $b$ if and only if $(a, b) \subseteq I \subseteq [a, b]$. (Check the 4 cases.)
3.4. Let $I = (a, b)$ and $J = (c, d)$ be bounded open intervals. Find a formula for the intersection,
$I \cap J$, and prove it. Under the additional hypothesis that $I \cap J \ne \emptyset$, find and prove a formula
for $I \cup J$.
3.5. If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, all containing the point $c$, then $\bigcup_{\alpha \in A} I_\alpha$ is also
an interval.
3.6. (Improving the previous problem) If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, such that for
each $\alpha, \beta \in A$, $I_\alpha \cap I_\beta \ne \emptyset$, then $\bigcup_{\alpha \in A} I_\alpha$ is also an interval.
3.7. If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, $\bigcap_{\alpha \in A} I_\alpha$ is also an interval, though possibly
empty.
3.8. Find and prove formulas for the intersection of two closed bounded intervals and for the union of
two non-disjoint ones.
3.9. Find, with proof,
$\bigcap_{x > 3} [2, x)$,
that is, the intersection of the family of intervals $\{[2, x) : x > 3\}$.
3.10. Find, with proof,
$\bigcup_{x < 7} (3, x]$.
3.11. Let $I$ be an interval in $\mathbb{R}$. If $a \in I$ and $a < t \notin I$, then, for all $x \in I$, $x < t$.
3.12. Let $A$ and $B$ be disjoint intervals in $\mathbb{R}$. Suppose $a \in A$, $b \in B$, and $a < b$. Then, for all
$x \in A$ and all $y \in B$, $x < y$.
3.13. If $|a| \le \varepsilon$ for all $\varepsilon > 0$, then $a = 0$.
3.14. For real $a, b, c$, $d(a, b) = d(a, c) + d(c, b)$ if and only if $a \le c \le b$ or $a \ge c \ge b$.
3.15. If $x, y \in [a, b]$, then $d(x, y) \le d(a, b)$.
3.16. If $x, y \in [a, b]$, then $d(x, y) < d(a, b)$, unless one of $x, y$ is $a$ and the other is $b$.
3.17. Let $|x - 2| < \delta$ and $\delta < 1$. Prove $|x + 7| < \delta + 9$ and $|x^2 + 5x - 14| < 10\delta$.
4. THE NATURAL NUMBERS
For a subset $A$ of $\mathbb{R}$, $A$ is called inductive (or inductively closed) if for all $x$, $x \in A$
implies $x + 1 \in A$. One inductive set is $\mathbb{R}$ itself.
The set of natural numbers is defined to be the set
$\mathbb{N} = \bigcap \{A : A \text{ is inductive with } 1 \in A\}$.
4.1 Theorem. $\mathbb{N}$ is an inductive subset of $\mathbb{R}$ containing 1. If $A$ is any inductive set that
contains 1, then $A \supseteq \mathbb{N}$.
Proof. First, there is an inductive subset of $\mathbb{R}$ containing 1, namely $\mathbb{R}$ itself, so $\mathbb{N}$ exists
and is a subset of $\mathbb{R}$.
From the definition of intersection, $1 \in \bigcap \{A : A \text{ is inductive and } 1 \in A\} = \mathbb{N}$.
To see that $\mathbb{N}$ is inductive, let $x \in \mathbb{N}$. Then,
$x \in A$, for all inductive sets $A$ with $1 \in A$.
Hence
$x + 1 \in A$, for all inductive sets $A$ with $1 \in A$.
Therefore,
$x + 1 \in \bigcap \{A : A \text{ is inductive with } 1 \in A\} = \mathbb{N}$,
by definition. Thus $x \in \mathbb{N} \implies x + 1 \in \mathbb{N}$, so $\mathbb{N}$ is inductive.
For the second statement, let $A$ be inductive with 1 belonging to it. If $x \in \mathbb{N}$, then
$x \in A$, by definition of intersection. So $\mathbb{N} \subseteq A$.
The end of the proof is a special case of the general fact that the intersection of a family
of sets is always contained in each set in the family.
The above immediately yields the Principle of Mathematical Induction:
4.2 Theorem. (PMI)
(1) If $A$ is an inductive set of natural numbers including the element 1, then $A = \mathbb{N}$.
(2) If $P(n)$ is a statement about natural numbers $n$, such that
(i) $P(1)$ is true, and
(ii) whenever $P(n)$ is true, $P(n + 1)$ is true,
then $P(n)$ is true for all $n \in \mathbb{N}$.
Proof. (1) By hypothesis, $A \subseteq \mathbb{N}$. Also $A$ is inductive with $1 \in A$, so $\mathbb{N} \subseteq A$ by the
previous result. Thus $A = \mathbb{N}$.
(2) Let $A = \{n \in \mathbb{N} : P(n) \text{ is true}\}$. By hypothesis, $1 \in A$ and $x \in A \implies x + 1 \in A$.
Thus $A$ is inductive, so $A = \mathbb{N}$, by part (1). That is, for all $n \in \mathbb{N}$, $P(n)$ is true.
4.3 Theorem. For all $n \in \mathbb{N}$, $n \ge 1$, hence $0 \notin \mathbb{N}$.
Proof. Here the statement to be proved by induction on $n$ is:
$n \ge 1$.
Now,
(i) $1 \ge 1$ is true of course and,
(ii) if $n \ge 1$, then $n + 1 \ge 1 + 0 \ge 1$.
Thus, by PMI, $n \ge 1$, for all $n \in \mathbb{N}$.
Moreover, since $0 < 1$, $0 \notin \mathbb{N}$.
So, if $n = 1$, its predecessor $n - 1$ is not a natural number, but this is the only such
case.
4.4 Theorem. If $n \in \mathbb{N}$ and $n > 1$, then $n - 1 \in \mathbb{N}$.
Proof. Since every natural number is greater than or equal to 1, saying $n > 1$ is the same
as saying $n \ne 1$. Let $A$ be $\{n \in \mathbb{N} : n = 1 \text{ or } n - 1 \in \mathbb{N}\}$.
Then $A$ contains 1, by definition.
Suppose now $n \in A$. Then $n \in \mathbb{N}$; hence, $(n + 1) - 1 = n \in \mathbb{N}$.
Thus $n \in A \implies n + 1 \in A$, so $A$ is inductive.
Hence $\mathbb{N} = A$, by PMI. Thus, for all $n \in \mathbb{N}$, $n = 1$ or $n - 1 \in \mathbb{N}$. In other words, $n \ne 1$
implies $n - 1 \in \mathbb{N}$.
The last step of the proof used the fact, from elementary logic, that the statement $p \lor q$
is equivalent to the statement $(\text{not } p) \implies q$. ($\lor$ means “or” and $\implies$ means “implies”.)
4.5 Theorem. If $n \in \mathbb{N}$ and $n < m \in \mathbb{N}$, then $n + 1 \le m$.
Proof. We use induction on $n$, so let
$A = \{n \in \mathbb{N} : \text{for all } m \in \mathbb{N}, \text{ if } n < m \text{ then } n + 1 \le m\}$.
Let $m \in \mathbb{N}$. If $1 < m$, then $m - 1 \in \mathbb{N}$, so $1 \le m - 1$, hence $1 + 1 \le m$. Since $m$ was
arbitrary, $1 \in A$.
Suppose $n \in A$, and let $m \in \mathbb{N}$ again be arbitrary. If $n + 1 < m$, then $1 < m$, so
$m - 1 \in \mathbb{N}$ and $n < m - 1$; hence, $n + 1 \le m - 1$, by the inductive hypothesis (that is,
since $n \in A$). Thus, $(n + 1) + 1 \le m$. And this was true for arbitrary $m$, so $A$ is inductive,
and $A = \mathbb{N}$, as required.
In the above proof we used a general principle. If we have a statement to prove about
two natural numbers $n$ and $m$, we try to do induction on just one of them (in the above
case, $n$). We get more power by making the inductive hypothesis strong, saying “for all
$m \in \mathbb{N}$”. The idea is that if you assume more, you can prove more.
4.6 Corollary. If $n \in \mathbb{N}$ and $n < a < n + 1$, then $a \notin \mathbb{N}$.
In other words, there is no natural number between successive ones.
Proof. Let $n \in \mathbb{N}$ and $n < a < n + 1$. If $a \in \mathbb{N}$, then by the theorem $n + 1 \le a$, which
contradicts $a < n + 1$.
4.7 Theorem (Well-ordering property). Each non-empty subset of $\mathbb{N}$ has a least element.
If $A$ is a set of real numbers, to say that $a_0$ is a least element of $A$ means $a_0 \in A$ and
for each $a \in A$, $a_0 \le a$. Clearly, a set $A$ can have no more than one least element (why?).
Proof. Let $M \subseteq \mathbb{N}$ and let $M$ be non-empty.
Suppose $M$ has no least element. We will show that this leads to a contradiction.
Let $B$ be the set of $n \in \mathbb{N}$ for which $n \le m$, for all $m \in M$. We claim that $1 \in B$ and
$B$ is inductive.
First, $1 \in B$. Indeed, $1 \le m$, for all $m \in M$, since $M$ is a set of natural numbers.
Now let $n \in B$. Then
$n \le m$, for all $m \in M$.
But $n \notin M$, for otherwise $n$ would be the least element of $M$, which we are assuming
doesn’t exist. Thus,
$n < m$, for all $m \in M$,
so
$n + 1 \le m$, for all $m \in M$,
by the previous result, that is $n + 1 \in B$. Thus, $n \in B$ implies $n + 1 \in B$.
This shows $B$ is inductive, so contains all the elements of $\mathbb{N}$. In other words, for all
natural numbers $n$,
$n \le m$, for all $m \in M$.
Since $M \ne \emptyset$ it contains some element, say $m_0$. But $m_0$ is a natural number so
$m_0 \le m$, for all $m \in M$.
So $m_0$ is the least member of $M$ after all, a contradiction.
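The set $B$ in this proof suggests a direct search: start at 1 and step upward while staying below every member of $M$. The following sketch carries that out for a finite set of natural numbers given as a Python `set`; the function name `least_element` is hypothetical, and the proof above is by contradiction rather than by search, so this is an illustration of the idea, not a transcription of the argument.

```python
def least_element(M):
    """Return the least element of a non-empty finite set M of natural numbers."""
    if not M:
        raise ValueError("M must be non-empty")
    n = 1
    # Invariant: n <= m for every m in M, i.e. n belongs to the set B of the proof.
    while n not in M:
        # n is not in M, so n < m for all m in M; by Theorem 4.5, n + 1 <= m.
        n += 1
    return n
```

For example, `least_element({7, 3, 12})` returns `3`. The loop terminates because $M$ is non-empty: the counter can never pass an element of $M$ without stopping at it.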
There is a strengthening of the Principle of Mathematical Induction which is often more
convenient, known as the Principle of Strong Induction, 2nd Principle of Mathematical
Induction or the Principle of Complete Induction. As usual $\{1, \dots, n\}$ means the set of
all natural numbers $k$ with $1 \le k \le n$.
4.8 Theorem (Strong Induction).
(1) Let $B$ be a set of natural numbers such that $1 \in B$ and, for all $n$, $\{1, \dots, n\} \subseteq B$ implies $n + 1 \in B$. Then $B = \mathbb{N}$.
(2) If $P(n)$ is a statement about natural numbers $n$ such that
(i) $P(1)$ is true, and
(ii) whenever $P(k)$ is true for all natural numbers $k \le n$, $P(n + 1)$ is also true,
then $P(n)$ is true for all $n \in \mathbb{N}$.
Proof. Let $A$ be the set of all $n \in \mathbb{N}$ such that $\{1, \dots, n\} \subseteq B$. We will show that $A$ is
inductive.
Since $1 \in B$, $\{1\} \subseteq B$, so $1 \in A$.
Let $n \in A$. Then $\{1, \dots, n\} \subseteq B$, so $n + 1 \in B$. But then, $\{1, \dots, n + 1\} \subseteq B$ so that
$n + 1 \in A$.
Thus, by the Principle of Mathematical Induction, $A = \mathbb{N}$, so that for all $n \in \mathbb{N}$,
$\{1, \dots, n\} \subseteq B$; in particular, $\mathbb{N} \subseteq B$. Since $B \subseteq \mathbb{N}$, $B = \mathbb{N}$.
4.9 Theorem. The set $\mathbb{N}$ is closed under addition.
The statement means that for all $m, n \in \mathbb{N}$, $m + n \in \mathbb{N}$. In the proof, we will use
the general principle mentioned earlier: We will choose one of $m$ and $n$ and prove the
statement with the other universally quantified.
Proof. Let
$$A = \{n \in \mathbb{N} : \text{for all } m \in \mathbb{N},\ m + n \in \mathbb{N}\}.$$
Then $1 \in A$, because this just says $m + 1 \in \mathbb{N}$ for all $m \in \mathbb{N}$, which we already know.
Now let $n \in A$. Then, for each $m \in \mathbb{N}$, $m + n \in \mathbb{N}$, so
$$m + (n + 1) = (m + n) + 1 \in \mathbb{N},$$
so that $n + 1 \in A$.
Thus, $A$ is inductive, so each $n \in \mathbb{N}$ belongs to $A$, which says that for all $n \in \mathbb{N}$ and for
all $m \in \mathbb{N}$, $m + n \in \mathbb{N}$, as required.
In the above proof, how did we know to add $n + 1$ to the $m$? Because we assumed
$n \in A$ and we were trying to show $n + 1 \in A$. For that, we needed to show $n + 1$ satisfied
the definition of the members of the set $A$. Thus, we fixed one $m$ and added $n + 1$ to it to
see if $m + (n + 1)$ still belonged to $\mathbb{N}$. It did.
We also followed a common custom of not bothering to write out the line:
$n \in A$ implies $n + 1 \in A$.
4.10 Theorem. The set $\mathbb{N}$ is closed under multiplication.
Proof. Let
$$A = \{n \in \mathbb{N} : \text{for all } m \in \mathbb{N},\ mn \in \mathbb{N}\}.$$
Then $1 \in A$, since 1 is a multiplicative identity. So suppose $n \in A$ and fix $m \in \mathbb{N}$. Then
$m(n + 1) = mn + m$, by the distributive law. Since $n \in A$, $mn \in \mathbb{N}$ and thus, by closure
under addition, $mn + m \in \mathbb{N}$. This proves $n + 1 \in A$, so $A$ is inductive.
Therefore by PMI, for all $n \in \mathbb{N}$ and for all $m \in \mathbb{N}$, $mn \in \mathbb{N}$.
For $n \in \mathbb{N}$ and $x \in \mathbb{R}$ we define $x^n$ using the idea of induction: $x^1 = x$; and assuming
$x^n$ defined we let $x^{n+1} = x^n x$. A definition of this type is called a recursive definition
or a definition by recursion.
One also lets $x^0 = 1$. (In some books, $0^0$ is left undefined, but these still tend to use
it as though its definition were $0^0 = 1$, for example in an expression such as $\sum_{k=0}^{n} x^k$,
when $x$ is allowed to be 0. We take $0^0 = 1$ always. We have to be very careful, though,
not to give this expression properties that it doesn’t have.)
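The recursive definition of the power can be mirrored directly in code. The following is only an illustrative sketch; the function name `power` is hypothetical, and the base case uses the convention $x^0 = 1$ (so $0^0 = 1$), from which $x^1 = x$ follows.

```python
def power(x, n):
    """Return x**n for an integer n >= 0, defined by recursion on n."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    if n == 0:
        return 1                 # convention x^0 = 1, including 0^0 = 1
    return power(x, n - 1) * x   # x^n = x^(n-1) * x
```

Each call reduces the exponent by one, exactly as the definition of $x^{n+1}$ refers back to $x^n$.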
4.1. Sometimes the Principle of Strong Induction is stated in only one step: If $B$ is a set of natural
numbers such that, for each $n \in \mathbb{N}$, the condition “$k \in B$ for all $k \in \mathbb{N}$ with $k < n$” implies $n \in B$, then $B = \mathbb{N}$.
Show that in using this version one still has to prove $1 \in B$.
4.2. If $m, n \in \mathbb{N}$, with $m > n$, then $m - n \in \mathbb{N}$.
4.3. The Binomial Theorem: Let $a, b \in \mathbb{R}$ (or in any field). For any $n \in \mathbb{N}$, $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$. Here $\binom{n}{k}$ is the binomial coefficient $\frac{n!}{k!\,(n-k)!}$. It satisfies Pascal’s Law:
$\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$.
4.4. The Bernoulli inequalities: For $n \in \mathbb{N}$, if $b > -1$, $(1 + b)^n \ge 1 + nb$; if $b < 1$, $(1 - b)^n \ge 1 - nb$.
4.5. For each $n \in \mathbb{N}$, $a^n - b^n = (a - b) \sum_{i=1}^{n} a^{n-i} b^{i-1}$.
4.6. If $S$ is a non-empty finite subset of $\mathbb{R}$, then $S$ has a maximum (that is, a largest element) and a
minimum (smallest element). (This can be proved by induction on the number of elements of $S$.)
4.7. Let $a_n \in \mathbb{R}$, for all $n \in \mathbb{N}$. Suppose $a_n \ge a_{n+1}$, for all $n \in \mathbb{N}$. (Then $(a_n)$ is called a
decreasing sequence.) Prove that for all $n, k \in \mathbb{N}$, if $n \le k$, then $a_n \ge a_k$.
4.8. Let $a_n \in \mathbb{R}$, for all $n \in \mathbb{N}$. Suppose $a_n \le a_{n+1}$, for all $n \in \mathbb{N}$. (Then $(a_n)$ is called an
increasing sequence.) Prove that for all $n, k \in \mathbb{N}$, if $n \le k$, then $a_n \le a_k$.
4.9. (Extension of PMI) The set of integers is defined as $\mathbb{Z} = \{n - m : m, n \in \mathbb{N}\}$. If $A$ is an
inductive set of integers with $a \in A$, then $\{n \in \mathbb{Z} : n \ge a\} \subseteq A$. Consequently, if $P(n)$ is
a statement about integers $n$ for which $P(a)$ is true and $P(n)$ implies $P(n + 1)$, then $P(n)$ is
true for all $n \ge a$.
5. A LITTLE NUMBER THEORY
We now define the set $\mathbb{Z}$ of integers by
$$\mathbb{Z} = \mathbb{N} - \mathbb{N} = \{n - m : n \in \mathbb{N} \text{ and } m \in \mathbb{N}\}.$$
From this, you can easily prove that $\mathbb{Z}$ is closed under addition and multiplication. It is
almost as easy to show that $\mathbb{Z} = \mathbb{N} \cup \{0\} \cup (-\mathbb{N})$; this is based on the fact that, if $n, m \in \mathbb{N}$
and $n > m$, then $n - m \in \mathbb{N}$, which you can prove by induction.
For $n, d \in \mathbb{Z}$, if $n = cd$ for some $c \in \mathbb{Z}$, we say $d$ divides $n$, or $d$ is a divisor of
$n$, or $n$ is a multiple of $d$, denoted $d \mid n$. One also says $n$ is divisible by $d$. Evidently 0 is
divisible by every integer and $d$ divides $n$ if and only if it divides $-n$, so we often may
assume $n > 0$.
If $n$ is positive, then $d \mid n$ implies $d \le n$.
5.1 Theorem (The division algorithm). If $n, d$ are integers with $d > 0$, there exist unique
integers $q, r$ with $n = dq + r$, $0 \le r < d$.
Proof. Let $A = \{m \in \mathbb{N} \cup \{0\} : m = n - dq,\ q \in \mathbb{Z}\}$. Then $A$ is not empty. (If
$n \ge 0$, then $n = n - d \cdot 0 \in A$; if $n < 0$, $n - d(n) = (d - 1)(-n) \in A$.) Using the Well
Ordering Property, let $r$ be the least element of $A$ and choose $q$ so that $r = n - dq$. Then
$n = dq + r$ and $r \ge 0$. We need to show $r < d$.
If, to the contrary, $r \ge d$, then $r - d = n - d(q + 1) \in A$, contradicting the fact that $r$
is the smallest element of $A$.
To prove uniqueness, let $n$ also be $dq_1 + r_1$, with $0 \le r_1 < d$. Then $dq + r = dq_1 + r_1$,
so $d(q_1 - q) = r - r_1$ and $-d < r - r_1 < d$, so $-1 < q_1 - q < 1$. Since the only integer
between $-1$ and $1$ is 0, $q = q_1$ and $r = r_1$, as required.
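The existence part of the proof can be traced in code: start from some non-negative number of the form $n - dq$ and step down to the least one. This is a minimal sketch under that reading; the name `divide` is hypothetical, and for $d > 0$ the result agrees with Python's built-in `divmod`.

```python
def divide(n, d):
    """Return (q, r) with n == d*q + r and 0 <= r < d, for integers n, d with d > 0."""
    if d <= 0:
        raise ValueError("d must be positive")
    # Pick a starting member of A = {n - d*q >= 0 : q an integer}, as in the proof:
    # q = 0 works when n >= 0, and q = n works when n < 0.
    q = 0 if n >= 0 else n
    r = n - d * q
    # While r >= d, the number r - d is a smaller member of A, so keep stepping down.
    while r >= d:
        q += 1
        r -= d
    return q, r
```

The loop stops exactly when $r < d$, which is the property the proof derives for the least element of $A$.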
If $d$ is a divisor of both $n$ and $m$, we call $d$ a common divisor of $n$ and $m$.
5.2 Theorem. Each pair of integers $n, m$, not both 0, has a unique positive common divisor
of the form $d = nx + my$, where $x, y \in \mathbb{Z}$. If $e$ is any other common divisor of $n, m$, then $e \mid d$,
so $e \le d$.
Thus, $d$ is the greatest common divisor of $n$ and $m$.
Proof. Without loss of generality, we may assume $n, m \in \mathbb{N}$. Let
$$A = \{k \in \mathbb{N} : k = nx + my,\ x \in \mathbb{Z},\ y \in \mathbb{Z}\}.$$
The set $A$ is not empty since $n + m$ belongs to it. Let $d = nx + my$ be the least element
of $A$, where $x, y \in \mathbb{Z}$.
To see that $d$ divides $n$, use the Division Algorithm to find $q, r$ with $n = dq + r$,
$0 \le r < d$. If $r > 0$, then $d > r = n - dq = n - (nx + my)q = n(1 - xq) + m(-yq)$,
so $r$ is an element of $A$ smaller than $d$. Thus, $r = 0$, so $d \mid n$. Similarly, $d \mid m$.
Of course, if $e$ divides $n$ and $m$ it divides $nx + my = d$; hence, $e \le d$. Uniqueness
follows from this.
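The proof obtains $d = nx + my$ from a least element, which does not say how to find $x$ and $y$. One standard way to compute them is the extended Euclidean algorithm; the sketch below uses that technique in place of the proof's argument, and the name `bezout` is hypothetical.

```python
def bezout(n, m):
    """Return (d, x, y) with d > 0, d dividing n and m, and d == n*x + m*y."""
    if n == 0 and m == 0:
        raise ValueError("n and m must not both be 0")
    # Invariant: r0 == n*x0 + m*y0 and r1 == n*x1 + m*y1.
    r0, x0, y0 = n, 1, 0
    r1, x1, y1 = m, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if r0 < 0:                      # normalise so that d is positive
        r0, x0, y0 = -r0, -x0, -y0
    return r0, x0, y0
```

Since the returned $d$ is a common divisor of the form $nx + my$, the last part of the proof shows it is the greatest common divisor.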
A natural number $n > 1$ is called prime if the only positive divisors of $n$ are 1 and $n$.
Otherwise it is called composite. Integers $a, b$ are called relatively prime if the greatest
common divisor of $a$ and $b$ is 1, that is, they have no common factor $d > 1$.
5.3 Lemma. If $n, m$ are integers that are not relatively prime, and $d$ is their greatest
common divisor, then $n/d$ and $m/d$ are relatively prime.
Proof. We may assume $n, m \in \mathbb{N}$. If a natural number $k > 1$ were to divide both $n/d$ and $m/d$,
then $kd$ would be a common divisor of $n$ and $m$ larger than $d$, a contradiction.
5.4 Euclid’s Lemma. If $a, b, c$ are integers, $a$ divides $bc$ and $a$ and $b$ are relatively prime,
then $a$ divides $c$.
Proof. Since 1 is the greatest common divisor of $a$ and $b$, $1 = ax + by$, where $x, y \in \mathbb{Z}$.
Thus, $c = axc + bcy$ is divisible by $a$, since both terms are.
5.5 Corollary. If a prime number $p$ divides a product $a_1 a_2 \cdots a_k$, then it divides one of
the factors $a_i$.
Proof. This is a simple induction, using Euclid’s Lemma.
5.6 Unique Factorization Theorem. Each natural number $n > 1$ can be represented as
a product of primes $n = p_1 p_2 \cdots p_m$ in one and only one way, except for the order of the
factors.
Proof. To clarify, if $n$ itself is prime, $n = n$ is the required product.
Let $A = \{n \in \mathbb{N} : n > 1 \text{ and } n \text{ is not a product of primes}\}$. We are to show that $A$ is
empty. If it is not, it has a smallest element $a$. Since $a$ is not prime, it can be written as
$a = bc$, where $1 < b < a$ and $1 < c < a$. Thus, both $b$ and $c$ are products of primes.
Say $b = p_1 p_2 \cdots p_m$ and $c = q_1 q_2 \cdots q_k$, where each of the $p_i$ and $q_j$ are primes. Thus,
$a = p_1 p_2 \cdots p_m q_1 q_2 \cdots q_k$ is a product of primes, a contradiction. So $A$ is empty and
every natural number greater than 1 is a product of primes.
To prove uniqueness, we must show that if $n > 1$ is a natural number and
$$n = p_1 p_2 \cdots p_m = q_1 q_2 \cdots q_k, \tag{$*$}$$
where all the $p_i$ and $q_j$ are primes, then $m = k$ and $q_1, q_2, \dots, q_k$ is just a rearrangement
of $p_1, p_2, \dots, p_m$. We do this by induction on $n$. Since every product of two or more primes
is greater than 2, 2 cannot be written in any other way as a product of primes. Now suppose
the uniqueness holds for integers less than $n$ and that ($*$) holds. Since $p_1$ is prime, we find
from Euclid’s lemma that $p_1$ must be $q_i$, for some $i = 1, \dots, k$. By rearranging the order,
we can assume $p_1 = q_1$. Then,
$$p_2 \cdots p_m = q_2 \cdots q_k.$$
Since this integer is smaller than $n$, the $p_2, \dots, p_m$ and $q_2, \dots, q_k$ are the same except for
the order; hence $p_1, p_2, \dots, p_m$ and $q_1, q_2, \dots, q_k$ are the same except for the order.
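For small numbers the factorization can be found by trial division. This is just one simple way to exhibit the product of primes whose existence and uniqueness the theorem asserts, not the method of the proof; the name `prime_factors` is hypothetical.

```python
def prime_factors(n):
    """Return the prime factors of an integer n > 1 in non-decreasing order."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:      # p is the smallest divisor > 1 of n, hence prime
            factors.append(p)
            n //= p
        p += 1
    if n > 1:                  # what remains has no divisor up to its square root
        factors.append(n)
    return factors
```

Listing the factors in non-decreasing order gives a canonical representative of the factorization, which is unique up to order.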
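6. THE RATIONAL NUMBERS
We define the set $\mathbb{Q}$ of rational numbers by
$$\mathbb{Q} = \left\{ \frac{m}{n} : m, n \in \mathbb{Z},\ n \ne 0 \right\}.$$
We find that $\mathbb{Q} = \left\{ \frac{m}{n} : m \in \mathbb{Z},\ n \in \mathbb{N} \right\}$. Moreover: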
6.1 Theorem. Q is closed under addition and under multiplication and taking of additive
and multiplicative inverses. Hence, Q satisfies the axioms of an ordered field.
The above unproved statements should be considered exercises. Do a few of them from
time to time.
Incompleteness of the rationals. Although $\mathbb{Q}$ is an ordered field, it still has lots of holes.
We expect there to be a number ($\sqrt{2}$) whose square is 2, for example. But:
6.2 Theorem. There is no rational number $a$ such that $a^2 = 2$.
Proof. Suppose that $a = m/n$, in lowest terms (meaning $m$ and $n$ are relatively prime,
that is, have no common divisor $> 1$) and that $a^2 = 2$. Then,
$$\left(\frac{m}{n}\right)^2 = 2,$$
so that
$$m^2 = 2n^2.$$
This says $m^2$ is even, so $m$ itself must be even.
(If $m = 2k + 1$, odd, then $m^2 = 4k^2 + 4k + 1$, odd.)
But if $m = 2k$,
$$4k^2 = 2n^2,$$
so
$$2k^2 = n^2,$$
so $n^2$ is even, and hence $n$ is also even.
Thus, $m$ and $n$ are both even, which contradicts the fact that $\frac{m}{n}$ was in lowest terms.
The above may be extended to show that no prime has a rational square root. This result will
be often used in examples. If you would like to prove this, remember the words “even” and
“odd” mean “divisible by 2” and “not divisible by 2”.
In any case, we see that $\sqrt{2}$, if it exists, is irrational; that is, not rational. We will
prove, however, that it does exist in $\mathbb{R}$ as a consequence of completeness. (See the section
EXISTENCE OF ROOTS.)
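A small computational companion to this result, offered as a sketch only: it relies on the fact (a consequence of unique factorization) that a fraction in lowest terms is the square of a rational exactly when its numerator and denominator are both perfect squares. The function name `rational_sqrt` is hypothetical.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(q):
    """Return the non-negative rational r with r*r == q, or None if none exists."""
    q = Fraction(q)
    if q < 0:
        return None
    num, den = q.numerator, q.denominator   # Fraction stores q in lowest terms
    a, b = isqrt(num), isqrt(den)           # integer square roots, rounded down
    if a * a == num and b * b == den:
        return Fraction(a, b)
    return None
```

For example, `rational_sqrt(Fraction(9, 4))` gives `Fraction(3, 2)`, while `rational_sqrt(2)` gives `None`, in line with Theorem 6.2.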
6.1. No prime has a rational square root. Euclid’s Lemma 5.4 from the section A LITTLE NUMBER
THEORY is needed.
6.2. A natural number $n$ has a rational square root only if it is actually a perfect square: $n = m^2$,
where $m$ is also a natural number. (Uses the Unique Factorization Theorem 5.6 of Number Theory.)
7. THE EXISTENCE OF ROOTS: A CONSEQUENCE OF COMPLETENESS
We now find out that the irrational number $\sqrt{2}$ does indeed exist.
7.1 Theorem. Let $y$ be a positive real number. Then, for every $n \in \mathbb{N}$, there exists a
unique positive real number $x$ such that $x^n = y$.
Proof. (1) First, note that for positive numbers $a, b$,
$$a < b \implies a^n < b^n.$$
This is proved by induction on $n$. (Exercise).
(2) This implies uniqueness: Suppose $x_1^n = y$ and $x_2^n = y$, with $x_1$ and $x_2$ positive but
not equal. Then one must be smaller, by trichotomy. Say $x_1 < x_2$. Then $x_1^n < x_2^n$, so we
can’t have $x_1^n = x_2^n$. The contradiction shows $x_1 = x_2$.
(3) Let $A = \{a > 0 : a^n \le y\}$, $B = \{b > 0 : b^n \ge y\}$. I claim that $A$ and $B$ are not
empty and every element of $A$ is $\le$ every element of $B$.
Indeed, since $y > 0$,
$$0 < \frac{y}{y+1} < 1,$$
so we have
$$0 < \left(\frac{y}{y+1}\right)^n \le \frac{y}{y+1} < y.$$
Thus, $\frac{y}{y+1} \in A$. In the same way, $y + 1 > 1$, so $(y + 1)^n \ge y + 1 > y$ and hence
$y + 1 \in B$.
Now, if $a \in A$ and $b \in B$ we have $a^n \le y \le b^n$, so $a^n \le b^n$, and therefore $a \le b$.
This comes from step 1, because if $a > b$ we would have $a^n > b^n$.
(4) Step 3 sets us up for our completeness axiom. There must exist an $x$ with $a \le x \le b$,
for all $a \in A$ and all $b \in B$. We will now show that for this $x$, $x^n = y$.
Let $0 < a < x < b$. Then $a^n < x^n < b^n$. But also, $a \notin B$ (since $x \le$ each element of
$B$), so $a^n < y$. Similarly, $y < b^n$. Thus,
$$a^n < x^n < b^n, \qquad a^n < y < b^n.$$
Since both $x^n$ and $y$ are in the interval $(a^n, b^n)$,
$$|x^n - y| < b^n - a^n.$$
But,
$$b^n - a^n = (b - a) \sum_{i=1}^{n} b^{n-i} a^{i-1} \le (b - a)\, n\, b^{n-1}.$$
So, $|x^n - y| \le (b - a)\, n\, b^{n-1}$. Now take any $\varepsilon$ such that $0 < \varepsilon < x$ and put $a = x - \varepsilon$,
$b = x + \varepsilon$. Then $b - a = 2\varepsilon$ and $b < 2x$ and so
$$|x^n - y| \le 2\varepsilon\, n\, (2x)^{n-1}.$$
Hence,
$$\frac{|x^n - y|}{2n(2x)^{n-1}} \le \varepsilon.$$
Now $\varepsilon$ here was arbitrary satisfying $0 < \varepsilon < x$. So,
$$0 \le \frac{|x^n - y|}{2n(2x)^{n-1}} = 0.$$
Hence $|x^n - y| = 0$. But then $x^n - y = 0$, so $x^n = y$, which completes the proof.
The proof just given may appear difficult at first, but will seem easy later in the course.
I just wanted to show how the axioms immediately give such an important result, without
developing a big theory. Though the words aren’t there, the proof gives a preview of
arguments about limits and continuous functions. The result itself is a special case of the
Intermediate Value Theorem 32.1.
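The sets $A = \{a > 0 : a^n \le y\}$ and $B = \{b > 0 : b^n \ge y\}$ from the proof also suggest a way to approximate the root numerically. The sketch below uses bisection, a technique swapped in for illustration (the proof itself appeals to completeness and is not constructive); the name `approx_root` and the tolerance parameter are hypothetical.

```python
from fractions import Fraction

def approx_root(y, n, tol=Fraction(1, 10**6)):
    """Return rationals (a, b) with a**n <= y <= b**n and b - a <= tol, for y > 0."""
    y = Fraction(y)
    if y <= 0 or n < 1:
        raise ValueError("need y > 0 and n >= 1")
    a = y / (y + 1)          # a is in A: a**n <= a < y
    b = y + 1                # b is in B: b**n >= b > y
    while b - a > tol:
        mid = (a + b) / 2
        if mid ** n <= y:
            a = mid          # mid belongs to A
        else:
            b = mid          # mid belongs to B
    return a, b
```

The number $x$ supplied by completeness lies between every such pair $(a, b)$, so the returned interval encloses the $n$-th root of $y$.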
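8. THE EXTENDED REAL NUMBER SYSTEM
The extended real number system $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$ consists of the set of real
numbers $\mathbb{R}$ together with two additional objects, $+\infty$ and $-\infty$. The ordering of $\mathbb{R}$ is
extended to $\overline{\mathbb{R}}$ by setting
$$-\infty < x < +\infty,$$
for each $x \in \mathbb{R}$. The resulting order still satisfies the trichotomy and transitivity axioms.
The addition is partially extended to $\overline{\mathbb{R}}$ by setting for $x \in \mathbb{R}$
$$x + (+\infty) = +\infty + x = +\infty, \qquad x - (+\infty) = -\infty \tag{13}$$
$$x + (-\infty) = -\infty + x = -\infty, \qquad x - (-\infty) = +\infty \tag{14}$$
$$x / {+\infty} = x / {-\infty} = 0 \tag{15}$$
if $x > 0$,
$$x(+\infty) = +\infty\, x = +\infty, \qquad x(-\infty) = -\infty\, x = -\infty \tag{16}$$
if $x < 0$,
$$x(+\infty) = +\infty\, x = -\infty, \qquad x(-\infty) = -\infty\, x = +\infty \tag{17}$$
$$+\infty + (+\infty) = +\infty, \qquad -\infty + (-\infty) = -\infty \tag{18}$$
$$+\infty(+\infty) = (-\infty)(-\infty) = +\infty, \qquad (+\infty)(-\infty) = (-\infty)(+\infty) = -\infty \tag{19}$$
The expressions $+\infty - (+\infty)$, $+\infty + (-\infty)$, and $-\infty + (+\infty)$ are left undefined, as
are the corresponding quotients $+\infty / {+\infty}$, etc.
The expressions $0(+\infty)$ and $0(-\infty)$ are sometimes defined to be 0, since a “rectangle”
of infinite length and 0 height should have area 0. We shall have little use for this in this
course.
Be very careful not to assume that the axioms of the real number system hold in $\overline{\mathbb{R}}$. For
example, $+\infty(2 + (-1)) = +\infty$, but $+\infty(2) + (+\infty)(-1)$ is not defined.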
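Floating-point arithmetic offers a rough analogy to these conventions. The sketch below assumes Python floats (IEEE 754 doubles), where `math.inf` plays the role of $+\infty$ and the special value `nan` stands in for "not defined"; this is an illustration by analogy, not a model of the extended real number system, and the variable names are hypothetical.

```python
import math

INF = math.inf
x = 5.0

sum_with_inf = x + INF            # a real number plus +infinity: inf
quotient = x / INF                # a real number divided by +infinity: 0.0
product_neg = (-x) * INF          # a negative number times +infinity: -inf
undefined_sum = INF + (-INF)      # left undefined in the text; floats give nan

lhs = INF * (2 + (-1))            # inf
rhs = INF * 2 + INF * (-1)        # inf + (-inf), which floats report as nan

zero_times_inf = 0.0 * INF        # nan here, unlike the optional convention 0 * inf = 0
```

The pair `lhs`, `rhs` reproduces the failure of the distributive law noted in the text: one side is $+\infty$, the other has no value.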
9. THE COMPLEX NUMBER SYSTEM
The system $\mathbb{C}$ of complex numbers is a field (that is, it satisfies the field axioms A1–A4,
M1–M4, and DL listed in section 1, AXIOMS OF THE REAL NUMBER SYSTEM)
which contains $\mathbb{R}$ as a subfield.¹ The set $\mathbb{C}$ contains an element $i$ for which $i^2 = -1$. This
shows that $\mathbb{C}$ cannot be made into an ordered field, since the square of any member of an
ordered field is non-negative. We take as an axiom that every element of $\mathbb{C}$ is of the form
$$z = x + iy, \quad \text{where } x, y \in \mathbb{R}.$$
This representation is unique. Indeed, if
$$x + iy = x' + iy', \quad \text{with } x, y, x', y' \in \mathbb{R},$$
then $x - x' = i(y' - y)$, so $(x - x')^2 = -(y' - y)^2$. Since there are no real numbers that
are both positive and negative, this shows $x = x'$ and $y = y'$, as required.
Because of the unique representation just shown, $x + iy$ can be identified with the
ordered pair $(x, y)$. (This is often used as a way of constructing $\mathbb{C}$ from $\mathbb{R}$.)
If $z = x + iy$, where $x, y$ are real, then $x$ is called the real part of $z$, denoted $\operatorname{Re}(z)$,
and $y$ is called the imaginary part of $z$, denoted $\operatorname{Im}(z)$; the number $\bar{z} = x - iy$ is called
the complex conjugate of $z$. Thus, $z\bar{z} = x^2 + y^2$, and one defines the magnitude or
absolute value of $z$ to be $|z| = \sqrt{z\bar{z}} = \sqrt{x^2 + y^2}$. (See 7. THE EXISTENCE OF ROOTS: A
CONSEQUENCE OF COMPLETENESS.) Notice that when $z = x$, real, then $|z|$ is the same as
$|x|$ as defined in section 3. CONSEQUENCES OF THE ORDERED FIELD AXIOMS.
9.1 Theorem. If $z$ and $w$ are complex numbers, then
(1) $\overline{z + w} = \bar{z} + \bar{w}$,
(2) $\overline{zw} = \bar{z}\,\bar{w}$,
(3) $z + \bar{z} = 2\operatorname{Re}(z)$, and $z - \bar{z} = 2i\operatorname{Im}(z)$.
9.2 Theorem. If $z$ and $w$ are complex numbers, then
(1) $|z| \ge 0$, $|z| = 0$ if and only if $z = 0$,
(2) $|\bar{z}| = |z|$,
(3) $|zw| = |z||w|$,
(4) $|\operatorname{Re}(z)| \le |z|$, $|\operatorname{Im}(z)| \le |z|$, and
(5) $|z + w| \le |z| + |w|$ (triangle inequality).
All the proofs here are straightforward, except perhaps the proof of the triangle inequality: Since $\bar{z}w$ is the conjugate of $z\bar{w}$,
$z\bar{w} + \bar{z}w = 2\operatorname{Re}(z\bar{w})$; hence,
$$\begin{aligned}
|z + w|^2 &= (z + w)(\overline{z + w}) = z\bar{z} + z\bar{w} + \bar{z}w + w\bar{w} \\
&= |z|^2 + 2\operatorname{Re}(z\bar{w}) + |w|^2 \\
&\le |z|^2 + 2|z\bar{w}| + |w|^2 \\
&= |z|^2 + 2|z||w| + |w|^2 \\
&= (|z| + |w|)^2,
\end{aligned}$$
from which the result follows.
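These identities can be spot-checked with Python's built-in complex type. The sample values below are arbitrary choices with small integer parts, so the arithmetic is exact in floating point; a check on one pair of numbers is of course not a proof.

```python
z = 3 + 4j
w = 1 - 2j

# Conjugation respects sums and products.
conj_sum_ok = (z + w).conjugate() == z.conjugate() + w.conjugate()
conj_prod_ok = (z * w).conjugate() == z.conjugate() * w.conjugate()

modulus = abs(z)                  # sqrt(3**2 + 4**2)
zz_bar = z * z.conjugate()        # x**2 + y**2, returned as a complex number

# Triangle inequality for this particular pair.
triangle_ok = abs(z + w) <= abs(z) + abs(w)
```

Here `abs` computes the magnitude and `conjugate()` the complex conjugate, matching the definitions above.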
¹ This means that the operations applied to complex numbers belonging to $\mathbb{R}$ yield the same results as in the
real number system.
10. A BIT ABOUT $\mathbb{R}^n$
As the reader will know from a first course in Linear Algebra, $\mathbb{R}^n$ (real Euclidean $n$-space) is the set of $n$-tuples $x = (x_1, \dots, x_n)$ of real numbers. We assume the reader is
familiar with the basic algebraic properties of this space. There is an addition $x + y$ and a
multiplication $cx$ defined for $x, y \in \mathbb{R}^n$ and $c \in \mathbb{R}$ by
$$x + y = (x_1 + y_1, \dots, x_n + y_n),$$
$$cx = (cx_1, \dots, cx_n),$$
making $\mathbb{R}^n$ a vector space (we won’t reproduce the definition here).
There is a dot product defined by $x \cdot y = \sum_{i=1}^{n} x_i y_i$ and a norm $|x| = \left(\sum_{i=1}^{n} x_i^2\right)^{1/2}$.
The notation $\|x\|$ is also used, especially if one wants to emphasize that some letters represent real numbers and others represent vectors.
After establishing simple properties of the dot product, one proves the Cauchy-Schwarz
inequality:
$$|x \cdot y| \le |x||y|,$$
as follows.
If $x$ or $y$ is 0, both sides are 0.
If $|x| = 1$ and $|y| = 1$,
$$\begin{aligned}
0 \le |x - y|^2 &= (x - y) \cdot (x - y) \\
&= x \cdot x - 2x \cdot y + y \cdot y \\
&= |x|^2 - 2x \cdot y + |y|^2 \\
&= 2 - 2x \cdot y,
\end{aligned}$$
so that
$$x \cdot y \le 1.$$
Similarly, using $0 \le |x + y|^2$, we have
$$-x \cdot y \le 1.$$
Thus, if $|x| = |y| = 1$,
$$|x \cdot y| \le 1. \tag{$*$}$$
If $x \ne 0$ and $y \ne 0$, we can replace $x$ and $y$ in ($*$) by $x/|x|$ and $y/|y|$ to obtain
$$\left| \frac{x}{|x|} \cdot \frac{y}{|y|} \right| \le 1,$$
which is the same as
$$|x \cdot y| \le |x||y|,$$
as required.
The triangle inequality
$$|x + y| \le |x| + |y|$$
follows from the Cauchy-Schwarz inequality via the calculation
$$\begin{aligned}
|x + y|^2 &= (x + y) \cdot (x + y) \\
&= x \cdot x + 2x \cdot y + y \cdot y \\
&\le x \cdot x + 2|x||y| + y \cdot y \\
&= |x|^2 + 2|x||y| + |y|^2 \\
&= (|x| + |y|)^2.
\end{aligned}$$
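A numerical spot check of the two inequalities, including the normalisation step used in the proof, can be sketched as follows. The helper names `dot` and `norm` and the sample vectors are hypothetical, and a small slack is allowed where floating-point rounding could matter.

```python
import math

def dot(x, y):
    """Dot product of two tuples of equal length."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm |x| = (x . x)**(1/2)."""
    return math.sqrt(dot(x, x))

x = (1.0, -2.0, 3.0)
y = (4.0, 0.5, 1.0)

cauchy_schwarz_ok = abs(dot(x, y)) <= norm(x) * norm(y)
triangle_ok = norm(tuple(a + b for a, b in zip(x, y))) <= norm(x) + norm(y)

# Normalising first, as in the proof: unit vectors u, v satisfy |u . v| <= 1.
u = tuple(a / norm(x) for a in x)
v = tuple(b / norm(y) for b in y)
unit_ok = abs(dot(u, v)) <= 1.0 + 1e-12   # slack for rounding
```

For these vectors $x \cdot y = 6$ while $|x||y| = \sqrt{14}\,\sqrt{17.25}$, which is a little over 15.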
In $\mathbb{R}^n$, the distance between two points $x$ and $y$ is $d(x, y) = |x - y|$.
10.1. Let $x, y \in \mathbb{R}^n$, $t_0 > 0$. If $|x + ty| \ge |x|$, for all real $t$ with $0 < t < t_0$, then $x \cdot y \ge 0$.
10.2. The line through $a$ in the direction $v$ is given by $L = \{a + tv : t \in \mathbb{R}\}$. Find the distance
between 2 arbitrary points $x = a + tv$ and $y = a + sv$ on the line.
11. CANTOR’S PRINCIPLE AND THE UNCOUNTABILITY OF $\mathbb{R}$
Recall that a bounded closed interval in $\mathbb{R}$ is a set of the form $[a, b] = \{x : a \le x \le b\}$, where $a, b \in \mathbb{R}$. (Often, when the context makes it clear, one omits the word
“bounded”.)
11.1 Theorem (Cantor’s Principle of nested intervals). For each $n \in \mathbb{N}$, let $I_n$ be a
(non-empty) bounded closed interval of the real number system and suppose $I_n \supseteq I_{n+1}$,
for all $n$. Then $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$.
Proof. Let $I_n = [a_n, b_n]$, for each $n$. The condition $I_n \supseteq I_{n+1}$ says that for each $n$,
$a_n \le a_{n+1} \le b_{n+1} \le b_n$:
$$a_1 \le a_2 \le \dots \le a_n \le \dots \le b_n \le \dots \le b_2 \le b_1.$$
A simple induction yields that $n \le k$ implies $a_n \le a_k$ and $b_k \le b_n$. (See Exercises 4.7
and 4.8.) From this we see that for any $n, m \in \mathbb{N}$, $a_n \le b_m$. Indeed, let $k = \max\{n, m\}$;
then
$$a_n \le a_k \le b_k \le b_m.$$
Therefore, for all $n, m$, $a_n \le b_m$; hence, by completeness there exists a real number $x$ such
that for all $n, m$, $a_n \le x \le b_m$. Thus, for all $n$, $a_n \le x \le b_n$; that is, $x \in \bigcap_{n \in \mathbb{N}} I_n$, so the
intersection is not empty.
With the concepts of supremum and infimum, we can prove a more precise result:
$$\bigcap_{n \in \mathbb{N}} I_n = [a, b], \tag{20}$$
where $a = \sup\{a_n : n \in \mathbb{N}\}$, and $b = \inf\{b_m : m \in \mathbb{N}\}$. [See 12. SUPREMA, INFIMA,
AND THE ARCHIMEDEAN PROPERTY].
The uncountability of the reals. We saw in the section on existence of roots that there
are irrational numbers, $\sqrt{2}$ in particular. We are now going to see that there are many more
irrational numbers than there are rationals.
One can prove that it is possible to “count” the rational numbers; that is, to list all
the rational numbers $r_1, r_2, r_3, r_4, \dots$, letting one rational number $r_n$ correspond to each
natural number $n$. The theorem below will prove that this cannot be done for the real
numbers. More explanation about the countability of $\mathbb{Q}$ and of the concepts will follow the
proof.
An interval of $\mathbb{R}$ is called non-degenerate if it has at least 2 points in it. In particular,
an interval $[a, b]$ of $\mathbb{R}$ is non-degenerate if $a < b$.
11.2 Lemma. If $x \in \mathbb{R}$ and $[a, b]$ is a non-degenerate closed interval of $\mathbb{R}$, then there exists
a non-degenerate closed interval $I \subseteq [a, b]$ with $x \notin I$.
Proof. Let $a'$ and $b'$ be any two points with $a < a' < b' < b$, for example $a' = a + (b - a)/3$
and $b' = a + 2(b - a)/3$. Then the intervals $[a, a']$ and $[b', b]$ are disjoint, so either
$x \notin [a, a']$ or $x \notin [b', b]$.
Of course, it is quite possible that $x$ is in neither of the two intervals constructed in the
above proof.
11.3 Theorem. Each non-degenerate interval $I \subseteq \mathbb{R}$, including $\mathbb{R}$ itself, is uncountable.
Proof. To say that $I$ is uncountable is to say that it cannot be written in the form
$$I = \{x_1, x_2, \dots\} = \{x_n : n \in \mathbb{N}\}.$$
Suppose to the contrary, that we could write $I$ in that form. Start with 2 points $a < b$
in $I$. Then, by the lemma, there exists a non-degenerate closed interval $I_1 \subseteq [a, b]$ with
$x_1 \notin I_1$. Again, there is a non-degenerate closed interval $I_2 \subseteq I_1$ with $x_2 \notin I_2$. Continuing
recursively, assuming $I_k$ defined with $x_k \notin I_k$, we find a non-degenerate closed interval
$I_{k+1}$ contained in $I_k$ with $x_{k+1} \notin I_{k+1}$. Thus, we have a nested sequence of closed
intervals
$$[a, b] \supseteq I_1 \supseteq I_2 \supseteq \dots \supseteq I_k \supseteq I_{k+1} \supseteq \dots,$$
with $x_k \notin I_k$, for all $k$. The intersection $\bigcap_{n \in \mathbb{N}} I_n$ of all these $I_n$ contains none of the points
of $I$, yet by Cantor’s Principle of Nested Intervals, $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$. This is a contradiction.
Let us go through that contradiction slowly. Cantor’s principle says $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$. But,
if $c$ is a point of $\bigcap_n I_n$, then $c$ belongs to $I$ so is one of the $x_k$. But $x_k \notin I_k$ and
$I_k \supseteq \bigcap_n I_n$; so $c = x_k \notin \bigcap_n I_n$.
The set of real numbers $\mathbb{R}$ itself can be viewed as the interval $(-\infty, +\infty)$, so the above
argument applies.
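The construction in the proof can be run on any finite list of points: shrink the interval so as to exclude each listed point in turn, then pick a point from what is left. The sketch below does this with exact rationals; the names `avoid` and `point_missing_from` are hypothetical, and a finite list only illustrates the mechanism (the theorem itself concerns infinite lists and needs Cantor's Principle).

```python
from fractions import Fraction

def avoid(x, a, b):
    """Return a closed subinterval (lo, hi) of [a, b], with a < b, not containing x."""
    third = (b - a) / 3
    left = (a, a + third)
    right = (b - third, b)
    # The two outer thirds are disjoint, so x lies in at most one of them.
    return right if left[0] <= x <= left[1] else left

def point_missing_from(xs, a=Fraction(0), b=Fraction(1)):
    """Return a point of [a, b] that differs from every element of the list xs."""
    for x in xs:
        a, b = avoid(x, a, b)     # nested intervals: each step excludes one more x
    return (a + b) / 2
```

Because the intervals are nested, the final interval excludes every listed point, so its midpoint is a number the list has missed.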
The reader may have seen elsewhere a proof involving decimal expansions. Although
it looks quite different, it is essentially the same type of proof. The reason is that if $x$
has decimal expansion $0.d_1 d_2 d_3 \dots$ then the digits $d_1, d_2, d_3, \dots$ determine a decreasing
sequence $I_1, I_2, I_3, \dots$ of intervals to which $x$ belongs and can be chosen to ensure that
$x_n \notin I_n$. You might try working this out.
Countability of the Rationals. One famous way of counting the positive rationals is
$$\frac{1}{1},\ \ \frac{2}{1}, \frac{1}{2},\ \ \frac{3}{1}, \frac{2}{2}, \frac{1}{3},\ \ \frac{4}{1}, \frac{3}{2}, \frac{2}{3}, \frac{1}{4},\ \ \frac{5}{1}, \frac{4}{2}, \frac{3}{3}, \frac{2}{4}, \frac{1}{5},\ \dots,$$
listing first those for which the numerator and denominator sum to 2, then those for which
they sum to 3, then to 4, then to 5, etc. (If we want to list a number only once we would
have to omit duplicates, that is, those which are not in lowest terms, such as $\frac{2}{2}$, $\frac{3}{3}$, $\frac{6}{2}$, etc.) To
list all the rationals, start with 0, then alternate positive and negative, thus:
$$0,\ \frac{1}{1}, -\frac{1}{1},\ \frac{2}{1}, -\frac{2}{1},\ \frac{1}{2}, -\frac{1}{2},\ \frac{3}{1}, -\frac{3}{1},\ \frac{2}{2},\ \dots$$
So, the set of rational numbers is countable. A set is finite if it is empty or it can be written in
the form $\{x_1, x_2, \dots, x_n\}$ where $n$ is a fixed natural number. Finite sets are also considered
countable. It is evident, then, that each subset of a countable set is countable.
It turns out that a set (such as $\mathbb{Q}$) which is countable and infinite can always be written
$$\{x_1, x_2, x_3, \dots\} = \{x_i : i \in \mathbb{N}\},$$
with all of the $x_i$ distinct. Such sets are called denumerable (or countably infinite).
An interesting way to enumerate the positive rationals is given by the following recursive definition. Put $x_1 = 1$, then for $k \in \mathbb{N}$, put $x_{2k} = x_k + 1$ and $x_{2k+1} = 1/x_{2k}$.
I learned about this from an article by Shen Yu-Ting (Shen You Ding) in the American
Mathematical Monthly, vol 87, 1980, pages 25–29.
For more about these things, including rigorous proofs, see the section COUNTABILITY.
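Both enumerations of the positive rationals are easy to generate. The sketch below lists the diagonal scheme as (numerator, denominator) pairs, duplicates included, and the recursive scheme as exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def diagonal_enumeration(max_sum):
    """Pairs (m, n) with m + n = 2, 3, ..., max_sum, numerators decreasing in each group."""
    return [(s - n, n) for s in range(2, max_sum + 1) for n in range(1, s)]

def recursive_enumeration(count):
    """Return [x_1, ..., x_count] where x_1 = 1, x_{2k} = x_k + 1, x_{2k+1} = 1 / x_{2k}."""
    xs = [None, Fraction(1)]              # index 0 is unused so that xs[k] is x_k
    for i in range(2, count + 1):
        if i % 2 == 0:
            xs.append(xs[i // 2] + 1)     # even index 2k: x_k + 1
        else:
            xs.append(1 / xs[i - 1])      # odd index 2k + 1: reciprocal of x_{2k}
    return xs[1:count + 1]
```

The recursive scheme begins $1, 2, \frac12, 3, \frac13, \frac32, \frac23, 4, \dots$, and, unlike the diagonal scheme, no value appears twice among the terms generated here.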
Cantor’s Principle in $\mathbb{R}^n$. To say that $I$ is a bounded closed interval of $\mathbb{R}^n$ means
$$I = [a_1, b_1] \times [a_2, b_2] \times \dots \times [a_n, b_n],$$
where each of $[a_1, b_1], \dots, [a_n, b_n]$ is a bounded closed interval of $\mathbb{R}$. If $a = (a_1, \dots, a_n)$
and $b = (b_1, \dots, b_n)$ we write $[a, b]$ for this $I$.
With these definitions we obtain the same result as before.
11.4 Theorem (Cantor’s Principle of Nested Intervals in $\mathbb{R}^n$). Let $(I_k)$ be a sequence of
non-empty bounded closed intervals in $\mathbb{R}^n$ with $I_k \supseteq I_{k+1}$ for all $k \in \mathbb{N}$. Then $\bigcap_k I_k \ne \emptyset$.
The proof is an exercise. It consists of reducing the problem to the corresponding theorem in $\mathbb{R}$, using the fact that if $A = A_1 \times \dots \times A_n$, then $c \in A$ if and only if $c_i \in A_i$, for
$i = 1, \dots, n$.
11.1. Bounded intervals are required in Cantor’s Principle: If $(I_k)$ is a sequence of unbounded non-empty closed intervals (of the form $[a_k, +\infty)$, say) with $I_k \supseteq I_{k+1}$, for all $k$, we can have
$\bigcap_{k=1}^{\infty} I_k = \emptyset$. This requires the Archimedean property. See section 12: SUPREMA, INFIMA,
AND THE ARCHIMEDEAN PROPERTY.
11.2. Closed intervals are required in Cantor’s Principle:
If $(I_k)$ is a sequence of bounded non-empty
intervals with $I_k \supseteq I_{k+1}$, for all $k$, we can have $\bigcap_{k=1}^{\infty} I_k = \emptyset$. This also requires the Archimedean
property. See SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY.
11.3. Let $\{I_\alpha : \alpha \in A\}$ be a family of closed bounded intervals $I_\alpha = [a_\alpha, b_\alpha]$ of $\mathbb{R}$. Assume
that each pair of intervals of the family intersect. Then, $\bigcap_{\alpha \in A} I_\alpha \ne \emptyset$. (Show that for all
$\alpha, \beta \in A$, $a_\alpha \le b_\beta$ and use completeness.) This is a special case of Helly’s Theorem on
compact convex sets in $\mathbb{R}^n$.
12. SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY
For a subset $S$ of $\mathbb{R}$, a real number $c$ is the largest element of $S$ (also called greatest
element or maximum of $S$), if
(1) $c \ge x$, for all $x \in S$ and
(2) $c \in S$.
We then write $c = \max S$.
Similarly, $c$ is called the smallest element of $S$ (also called least element or minimum
of $S$), and we write $c = \min S$, if
(1) $c \le x$, for all $x \in S$ and
(2) $c \in S$.
It is immediate that a set can only have one maximum and only one minimum. Indeed, if
$c_1$ and $c_2$ are different elements of $S$ then either $c_1 < c_2$ or $c_2 < c_1$, so they can’t both be
the largest, nor can they both be the smallest.
Recall that a finite set $F$ is one which is either $\emptyset$ or can be written as $F = \{x_1, \dots, x_n\}$,
where $n$ is a natural number. It is a nice exercise in induction to prove that every non-empty finite set of real numbers has a largest and a smallest element. We also know, from
the Well-ordering Property, that each non-empty set of natural numbers has a least element.
The existence of greatest or least elements in other situations is guaranteed by completeness.
For a subset $S$ of $\mathbb{R}$, a real number $u$ is said to be an upper bound for $S$ if $u \ge x$, for
all $x \in S$. Similarly, $\ell$ is called a lower bound for $S$ if $\ell \le x$, for all $x \in S$.
12.1 Theorem.
(a) If a non-empty set $S \subseteq \mathbb{R}$ has an upper bound, then it has a least one.
(b) If such an $S$ has a lower bound, then it has a greatest one.
Proof. (a) First suppose $S$ has an upper bound and let $U$ be the set of all the upper bounds
of $S$. Our job is to prove that $U$ has a least element.
Now $S$ is non-empty, by hypothesis, and $U$ is non-empty, since an upper bound is assumed to exist. By definition of upper bound, if $x \in S$ and $u \in U$, we have $x \le u$. Thus,
we are in the setting of the Completeness Axiom: $S$ plays the role of $A$ and $U$ plays the
role of $B$. Therefore, there exists a real number $c$ such that $x \le c \le u$, for all $x \in S$ and
$u \in U$.
This is exactly what we want. That $x \le c$ for all $x \in S$ says that $c$ is an upper bound
for $S$ and that $c \le u$, for all $u \in U$ says that $c$ is the least one.
(b) Let us briefly outline the proof for lower bounds. If $S \ne \emptyset$ and has a lower bound,
then the set $L$ of all its lower bounds is non-empty, and by definition $\ell \le x$ for all $\ell \in L$
and all $x \in S$. By the completeness axiom, there is a $c$ with $\ell \le c \le x$ for all $\ell \in L$ and
all $x \in S$. This $c$ is then the greatest lower bound of $S$.
The least upper bound, if it exists, is also called the supremum of $S$ ($\sup S$) and the
greatest lower bound of a set $S$, if it exists, is also called the infimum of $S$ ($\inf S$).
A set which has an upper bound $u$ is called bounded above by $u$ and a set which has a
lower bound $\ell$ is called bounded below by $\ell$. The theorem we just stated is often quoted
as:
Theorem. Every non-empty set bounded above has a supremum, and every non-empty set
bounded below has an infimum.
In many books, one of these is taken as an axiom and is called completeness. We have
shown these follow from our Completeness Axiom. It is also easy to show that each implies
our axiom.
Going back to the meanings of the terms we see that $c = \sup S$ iff
(1) $c \ge x$, for all $x \in S$ ($c$ is an upper bound of $S$) and
(2) if $u \ge x$ for all $x \in S$, then $u \ge c$ (the least one).
It is often useful, mainly in checking individual examples, to use condition (2) in
its contrapositive form:
(2′) if $u < c$ then there exists $x \in S$ with $u < x$.
(In other words, if $u$ is less than $c$, then it is no longer an upper bound of $S$.)
You should compare (1) and (2) to the corresponding conditions for maximum, to see
that if $c = \max S$, then $c = \sup S$.
In the case of lower bounds we see that $c = \inf S$ iff
(1) $c \le x$, for all $x \in S$ and
(2) if $\ell \le x$ for all $x \in S$, then $\ell \le c$, or in contrapositive form
(2′) if $\ell > c$ then there exists $x \in S$ with $\ell > x$.
(In other words, if $\ell$ is greater than $c$, then it is no longer a lower bound of $S$.)
Again a minimum of a set is always its infimum.
12.2 Examples.
(a) For the set $A = \{\frac{1}{2}, 3, 2, 1\}$, 3 is the maximum of $A$ since it belongs to $A$ and is $\ge$ each of the members of $A$. It is therefore also the supremum of $A$. Similarly, the minimum
(hence also the infimum) of $A$ is $\frac{1}{2}$.
(b) The infimum of the set $B = (3, \infty) = \{x : x > 3\}$ is 3. This requires proof.
Certainly, (1) $3 < x$, for all $x \in (3, \infty)$.
Now, suppose $\ell$ is a lower bound for $B$. That is, suppose $\ell \le x$ for all $x > 3$.
Then, $\ell \le 3$. This was “our First Analysis Result”, 3.3 (in section 3: CONSEQUENCES
OF THE ORDERED FIELD PROPERTIES.)
Thus, (2) if $\ell \le x$ for all $x \in B$, then $\ell \le 3$. Hence, 3 is the greatest lower bound of $B$
(infimum).
(c) The set $\mathbb{N}$ of natural numbers has 1 as a minimum, hence infimum. However, as the
following theorem shows, it is not bounded above, so has no supremum.
12.3 The Archimedean Property. If $\varepsilon, b \in \mathbb{R}$ and $\varepsilon > 0$, then there exists an $n \in \mathbb{N}$ such
that $n\varepsilon > b$ (equivalently, that $\frac{b}{n} < \varepsilon$).
Proof. Let $\varepsilon > 0$, and $b$ any element of $\mathbb{R}$. Suppose the conclusion were false. Then we
would have $n\varepsilon \le b$, for all $n \in \mathbb{N}$. This means that the set $S = \{n\varepsilon : n \in \mathbb{N}\}$ is bounded
above (by $b$). Thus, $S$ has a supremum (say, $c = \sup S$). Since $c$ is an upper bound for $S$,
$$n\varepsilon \le c, \text{ for all } n \in \mathbb{N}.$$
But if $n \in \mathbb{N}$, so is $n + 1$, hence
$$(n + 1)\varepsilon \le c, \text{ for all } n \in \mathbb{N},$$
and therefore,
$$n\varepsilon \le c - \varepsilon, \text{ for all } n \in \mathbb{N}.$$
But this shows that $c - \varepsilon$ is also an upper bound for $S$, even though it is smaller than $c$,
since $\varepsilon > 0$. This is a contradiction, establishing the result.
Another way to look at the contradiction at the end of the proof is to conclude from
$$n\varepsilon \le c - \varepsilon, \text{ for all } n \in \mathbb{N}$$
that
$$c \le c - \varepsilon,$$
by property (2) of supremum, and hence $c < c$.
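For rational data, a natural number $n$ with $n\varepsilon > b$ can be written down explicitly. This is a sketch that computes such an $n$ directly, whereas the proof above argues by contradiction from completeness; the function name `archimedean_n` is hypothetical.

```python
from fractions import Fraction
import math

def archimedean_n(eps, b):
    """Return a natural number n with n * eps > b, for rational eps > 0 and rational b."""
    eps, b = Fraction(eps), Fraction(b)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # floor(b / eps) + 1 is the least integer exceeding b / eps; n must also be >= 1.
    return max(1, math.floor(b / eps) + 1)
```

For instance, with $\varepsilon = \frac{1}{1000}$ and $b = 7$ the function returns 7001, and $7001 \cdot \frac{1}{1000} > 7$.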
T
12.4 T
Example.
n2N .0; 1=n D ;. To prove this, suppose the set is not empty; say,
x 2 n2N .0; 1=n. Then, for all n 2 N, x 2 .0; 1=n; that is, for all n 2 N, 0 < x 1=n.
But by the Archimedean Property,
if x > 0, there exists n 2 N with 1=n < x, so this is a
T
contradiction. Therefore, n2N .0; 1=n D ;.
12.5 Examples. Practice Problems The solutions appear below —12.12
T
(1) Determine, with proof, the set S D n2N .1 n1 ; 1 C n1 /, and find its supremum
and infimum (if possible).
(2) Find the supremum, infimum, maximum and minimum of f n1 W n 2 Ng, if they
exist.
(3) Do the same for .1; 2.
12.6 Theorem (Order density of the rationals in R). If a and b are real numbers with
a < b, then there is a rational number r with a < r < b.
The intuitive idea is that we can use the Archimedean property to get to the left of a
and then move in small steps of size n1 < b a to the right till we (or the rational number)
land(s) in .a; b/.
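Proof. Fix $a, b \in \mathbb{R}$ with $a < b$. By the Archimedean property, there exists $k \in \mathbb{N}$ with $k > -a$, so
\[ -k < a. \]
Choose such a $k$. Now, by the second form of the Archimedean property, we can choose $n \in \mathbb{N}$ with
\[ \frac{1}{n} < b - a. \]
Again, by the Archimedean property, there exists a natural number $m$ such that $m \cdot \frac{1}{n} > a + k$. By the Well-ordering Property of $\mathbb{N}$, there is a smallest such natural number. If $m$ is this one, we have
\[ \frac{m}{n} > a + k \quad \text{but} \quad \frac{m-1}{n} \le a + k \]
(if $m = 1$, the second inequality holds because $a + k > 0$). Thus,
\[ a < \frac{m}{n} - k, \]
and
\[ \frac{m}{n} - k = \frac{1}{n} + \frac{m-1}{n} - k \le \frac{1}{n} + a < b - a + a = b, \]
since $\frac{1}{n} < b - a$.
Summarizing, we have
\[ a < \frac{m}{n} - k < b, \]
that is, $a < r < b$, where $r = \frac{m}{n} - k$, a rational number.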
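The construction in the proof can be traced on concrete numbers. The sketch below is an illustration under simplifying assumptions: the inputs are exact `Fraction` values, and `math.floor` replaces the translation-plus-well-ordering step used to find the least numerator $m$. The function name `rational_between` is hypothetical.

```python
from fractions import Fraction
import math

def rational_between(a, b):
    """Return a Fraction r with a < r < b, given a < b."""
    if not a < b:
        raise ValueError("need a < b")
    # n is a natural number with 1/n < b - a (second form of the Archimedean property)
    n = math.floor(1 / (b - a)) + 1
    # m is the least integer with m/n > a, so (m - 1)/n <= a < m/n
    m = math.floor(n * a) + 1
    return Fraction(m, n)

r1 = rational_between(Fraction(1, 3), Fraction(1, 2))
r2 = rational_between(Fraction(-5, 2), Fraction(-7, 3))
print(r1, r2)
```

Because $(m-1)/n \le a$ and $1/n < b - a$, the returned value satisfies $m/n \le a + 1/n < b$, which is exactly the chain of inequalities in the proof.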
12.7 Note. In the above proof, we were unable to use the Well-ordering Property directly to get a smallest integer with a certain property, so we translated $a$ and $b$ to the right by adding a suitable $k$, so that $a + k$ and $b + k$ are positive. Then, we were able to use the Well-ordering Property to get a least natural number $m$ with $m/n > a + k$, and translate back to get $m/n - k > a$. This translation technique is useful in many situations. In the next result we use it to deduce density of the irrational numbers from density of the rational numbers.
In general, a subset $A$ of $\mathbb{R}$ is called order dense in $\mathbb{R}$ if whenever $a$ and $b$ are real numbers with $a < b$, there is an element $x \in A$ with $a < x < b$.
12.8 Theorem (Order density of the irrationals in $\mathbb{R}$). If $a$ and $b$ are real numbers with $a < b$, then there is an irrational number $x$ with $a < x < b$. (That is, $\mathbb{Q}^c$ is order dense in $\mathbb{R}$.)
Proof. Let $a, b$ be real numbers with $a < b$. Then, $a - \sqrt{2} < b - \sqrt{2}$. By the density of the rationals, we can choose $r \in \mathbb{Q}$ with $a - \sqrt{2} < r < b - \sqrt{2}$. Then, $a < \sqrt{2} + r < b$. Take $x = \sqrt{2} + r$. Then, $a < x < b$ and $x$ is irrational (otherwise $\sqrt{2} = x - r$ would be rational), as required.
Here are some more results that you can easily deduce from the Archimedean Property,
using translation and the Well Ordering Property.
12.9 Corollary. For every real number x, there exists an integer n with n < x.
12.10 Theorem. Every non-empty set of integers bounded below has a minimum and every
non-empty set of integers bounded above has a maximum.
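12.11 Example. For each $x \in \mathbb{R}$, there is a largest integer $n$ with $n \le x$. It is denoted $\lfloor x \rfloor$ (pronounced “floor of $x$”). Thus, $\lfloor x \rfloor \in \mathbb{Z}$ and
\[ \lfloor x \rfloor \le x < \lfloor x \rfloor + 1. \]
$\lfloor x \rfloor$ is also called the integral part of $x$ and the function $x \in \mathbb{R} \mapsto \lfloor x \rfloor$ is called the floor function or greatest integer function. The fractional part of $x$ is $\langle x \rangle = x - \lfloor x \rfloor$. (Some use this terminology for $x - \lfloor x \rfloor$ if $x \ge 0$, and $-x - \lfloor -x \rfloor$ if $x < 0$.)
Similarly, there is a least integer $m$ with $m \ge x$, denoted $\lceil x \rceil$, “ceiling of $x$”:
\[ \lceil x \rceil - 1 < x \le \lceil x \rceil. \]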
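A quick numerical check of the two defining inequalities, using Python's standard `math.floor` and `math.ceil`; the helper name `floor_ceiling_fraction` is only for illustration.

```python
import math

def floor_ceiling_fraction(x):
    """Return (floor, ceiling, fractional part) of a real number x."""
    lo = math.floor(x)   # largest integer n with n <= x
    hi = math.ceil(x)    # least integer m with m >= x
    frac = x - lo        # fractional part <x> = x - floor(x), lies in [0, 1)
    return lo, hi, frac

for x in (2.75, -2.75, 3.0):
    lo, hi, frac = floor_ceiling_fraction(x)
    # the defining inequalities from the example above
    assert lo <= x < lo + 1
    assert hi - 1 < x <= hi
    print(x, lo, hi, frac)
```

Note that for negative $x$ the floor moves away from zero, so the fractional part $x - \lfloor x \rfloor$ of $-2.75$ is $0.25$, not $0.75$.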
12.12 Examples (Solution of the practice problems 12.5).
(1) Determine $\bigcap_{n \in \mathbb{N}} (1 - \frac{1}{n}, 1 + \frac{1}{n})$ and find its supremum and infimum.
Call this set $S$. A point $x$ belongs to $S$ iff $x \in (1 - \frac{1}{n}, 1 + \frac{1}{n})$ for all $n \in \mathbb{N}$. Now, certainly 1 belongs to each of these intervals, so $1 \in S$. If $x \ne 1$, then either $x > 1$ or $x < 1$. If $x > 1$, then $x - 1 > 0$, so by the Archimedean property, there exists a natural number $n$ with $\frac{1}{n} < x - 1$, and hence $x > 1 + \frac{1}{n}$. Thus, for this $n$, $x \notin (1 - \frac{1}{n}, 1 + \frac{1}{n})$, so $x \notin S$. Similarly, if $x < 1$, then $1 - x > 0$ and we can find $n$ with $\frac{1}{n} < 1 - x$, so that $x < 1 - \frac{1}{n}$, and again $x \notin S$.
Thus, 1 is the only element of $S$; $S = \{1\}$. So $1 = \inf S = \sup S = \max S = \min S$.
(2) Find the supremum, infimum, maximum and minimum of $\{\frac{1}{n} : n \in \mathbb{N}\}$, if they exist. Let us call the set $W$. Since, for all $n \in \mathbb{N}$,
\[ 0 < \frac{1}{n} \le 1, \qquad (*) \]
we see that $1 \ge x$ for all $x \in W$; and since $1 \in W$, 1 is the maximum of $W$. Since, when it exists, the maximum of a set is always its supremum, this says also that $1 = \sup W$.
Again from $(*)$, we see that $0 \le x$ for all $x \in W$; that is, 0 is a lower bound of $W$.
To show it is the greatest one, suppose $\ell > 0$. Then, by the Archimedean property, there exists $n \in \mathbb{N}$ with $\frac{1}{n} < \ell$; that is, there exists $x \in W$ with $x < \ell$, so that $\ell$ is not a lower bound of $W$, as required. Hence $\inf W = 0$; since $0 \notin W$, the set $W$ has no minimum.
(3) Find the maximum, minimum, supremum and infimum of the set $(1, 2]$, provided these exist.
Proof. Let us put $A = (1, 2]$. Then, by definition, $x \in A$ iff $1 < x \le 2$.
Thus, $2 \ge x$ for all $x \in A$, and $2 \in A$. These are exactly the two conditions that 2 be the maximum of $A$. So,
\[ 2 = \max A. \]
Since, when it exists, the maximum of a set is always its supremum, this says also that
\[ 2 = \sup A. \]
Let us go over this in this special case. We need two things:
that 2 is an upper bound of $A$ (i.e. $2 \ge x$ for all $x \in A$) and
that it is the least one (if $u \ge x$ for all $x \in A$, then $u \ge 2$).
The first of these was already part of 2 being the maximum of $A$. As for the second, since $2 \in A$, if $u \ge$ each element of $A$, it is certainly $\ge 2$.
Now let us investigate the question of minimum and infimum. It looks like 1 “would like to be” the minimum of $A$, but $1 \notin A$, so it cannot be the minimum. We do, however, have that $1 \le x$ for all $x \in A$; that is, 1 is a lower bound of $A$. Is it the greatest one? Well, suppose $\ell > 1$. There are two possibilities: $\ell > 2$ and $1 < \ell \le 2$. In the first case, $\ell$ is not a lower bound of $A$, since $2 \in A$. In the second, if we put $a = \frac{1 + \ell}{2}$, we have $1 < a < \ell \le 2$. In particular, $1 < a \le 2$, so $a \in A$, and also $a < \ell$, so $\ell$ is not a lower bound for $A$.
We have thus proved that the infimum of $A$ is 1:
\[ 1 = \inf A. \]
Now, if $A$ had a minimum, it would have to also be the infimum, so 1 would have to be the minimum, which it is not, since $1 \notin A$. Thus, $(1, 2]$ has no minimum.
Supremum and Infimum as operations. We can view sup and inf as operations acting on the elements of a set $S$, yielding a new real number. This point of view can be quite convenient, especially in theoretical calculations.
The definition of supremum, for example, together with the theorem on its existence, could be stated:
Let $S$ be a non-empty set of real numbers. If $x \le u$ for all $x \in S$, then
(0) $\sup S$ exists,
(1) $x \le \sup S$ for all $x \in S$, and
(2) $\sup S \le u$.
Thus, (for all $x \in S$, $x \le u$) implies $\sup S \le u$.
“The supremum of a set of numbers, each $\le u$, is also $\le u$.”
(Of course, there is nothing new here, but a change of emphasis.)
Similarly, for a non-empty set $S$ of real numbers, from
$\ell \le x$ for all $x \in S$, we deduce
(0) $\inf S$ exists,
(1) $\inf S \le x$ for all $x \in S$,
(2) $\ell \le \inf S$.
“The infimum of a set of numbers, each $\ge \ell$, is also $\ge \ell$.”
Let us look at these principles in action.
12.13 Examples.
(1) If $S$ is a set of real numbers and $c \in \mathbb{R}$, $cS$ denotes $\{cx : x \in S\}$.
Now, if $S$ is a non-empty set of real numbers, bounded above, and $c > 0$, then $\sup(cS) = c \sup S$.
Proof. Let $S$ be non-empty and bounded above, so that $\sup S$ exists.
Let $y \in cS$. Then, $y = cx$ for some choice of $x \in S$.
Thus, $y/c = x \in S$. Since the supremum of $S$ is an upper bound for $S$,
\[ y/c \le \sup S. \]
Since $c > 0$, we may multiply by it and get
\[ y \le c \sup S. \]
But $y$ was an arbitrary element of $cS$, so
\[ \text{for all } y \in cS, \quad y \le c \sup S. \]
Hence, also,
\[ \sup(cS) \le c \sup S. \]
Now, suppose $x \in S$. Then,
\[ cx \in cS. \]
Hence,
\[ cx \le \sup(cS). \]
Therefore, since $c$ is positive,
\[ x \le \frac{1}{c} \sup(cS). \]
The $x \in S$ here was arbitrary, so this statement is true for all $x \in S$. Thus again we may “take the supremum of the left side over all $x \in S$,” obtaining
\[ \sup S \le \frac{1}{c} \sup(cS). \]
Thus,
\[ c \sup S \le \sup(cS). \]
This, with the earlier statement, yields
\[ \sup(cS) = c \sup S. \]
(2) Let $A \subset B$, where $A$ is non-empty and $B$ is bounded above. Then $\sup A \le \sup B$.
Proof. Let $x \in A$. Then, $x \in B$ and, since $\sup B$ is an upper bound of $B$, $x \le \sup B$. Thus,
\[ \text{for all } x \in A, \quad x \le \sup B. \]
Hence, $\sup A \le \sup B$.
How would the above change if we talked about inf instead of sup?
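Both examples can be sanity-checked on finite sets, where the supremum is simply the maximum and the infimum the minimum. This is only a sketch on sample data chosen for illustration; it says nothing about infinite sets, where the proofs above are needed.

```python
from fractions import Fraction

# A finite, non-empty sample set: here sup S = max S and inf S = min S.
S = [Fraction(1, 2), Fraction(-3), Fraction(7, 4)]
c = Fraction(5, 3)                 # a positive scalar
cS = [c * x for x in S]

# Example (1): sup(cS) = c sup S for c > 0
assert max(cS) == c * max(S)

# Example (2): A subset of B gives sup A <= sup B ...
A = S[:2]
assert max(A) <= max(S)
# ... and, for the closing question, the inequality reverses for inf
assert min(A) >= min(S)

print(max(cS), c * max(S))
```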
Supremum and infimum in the extended real number system, $\overline{\mathbb{R}}$. The extended real number system consists of $\mathbb{R}$ together with the additional elements $-\infty$ and $+\infty$, satisfying $-\infty < x < +\infty$ for all real $x$. The definitions of supremum, infimum, maximum and minimum still make sense. Clearly every set in $\overline{\mathbb{R}}$ has $+\infty$ as an upper bound and $-\infty$ as a lower bound. If $A$ is a set which is not bounded above in $\mathbb{R}$, it will have $\sup A = +\infty$ in $\overline{\mathbb{R}}$; if it is not bounded below, then $\inf A = -\infty$ in $\overline{\mathbb{R}}$.
At this point the reader can prove the characterization of intervals mentioned in section 3, CONSEQUENCES OF THE ORDERED FIELD AXIOMS.
12.14 Theorem (Characterization of intervals). A subset $J$ of $\mathbb{R}$ is an interval if and only if for all $x, y \in J$ and $t \in \mathbb{R}$, if $x < t < y$ then $t \in J$.
(Do the first two problems below first to get started.)
NOTE: For general statements below, we assume given sets are non-empty, even if we don’t say so explicitly.
12.1. Let $A$ be a non-empty set of real numbers bounded above and below.
Then $\inf A \le \sup A$.
12.2. If $I$ is a non-empty bounded interval of $\mathbb{R}$ with endpoints $a \le b$, then $a = \inf I$ and $b = \sup I$.
(The possibilities are $I = [a, b]$, $(a, b)$, $(a, b]$, or $[a, b)$.)
12.3. The previous problem extends to unbounded intervals as well, using the conventions about supremum and infimum in $\overline{\mathbb{R}}$.
12.4. If $a < b$, then $\inf((a, b) \cap \mathbb{Q}) = a$ and $\sup((a, b) \cap \mathbb{Q}) = b$.
12.5. If $c = \sup A$ and $a$ is not the maximum of $A$, then $\sup(A \setminus \{a\}) = c$ also.
12.6. Let $A$ and $B$ be sets of real numbers with $A \subset B$.
If $B$ is bounded above, then $\sup A \le \sup B$.
If $B$ is bounded below, then $\inf A \ge \inf B$.
12.7. Let $A$ and $B$ be sets of real numbers bounded above.
Then, $\sup(A \cup B) = \max\{\sup A, \sup B\}$.
12.8. Let $A$ and $B$ be non-empty sets of real numbers bounded below.
Then, $\inf(A \cup B) = \min\{\inf A, \inf B\}$.
12.9. Let $A$ be a non-empty set of real numbers bounded above and let $c \in \mathbb{R}$.
Then, $\sup(c + A) = c + \sup A$. ($c + A = \{c + a : a \in A\}$.)
12.10. Let $A$ be a non-empty set of real numbers bounded above and let $c \in \mathbb{R}$, $c > 0$.
Then, $\sup(cA) = c \sup A$. ($cA = \{ca : a \in A\}$.)
12.11. Let $A$ be a set of real numbers bounded above.
Then, $\inf(-A) = -\sup A$. ($-A = \{-a : a \in A\}$.)
12.12. Let $A$ be a set of real numbers bounded below and let $c \in \mathbb{R}$.
Then, $\inf(c + A) = c + \inf A$. ($c + A = \{c + a : a \in A\}$.)
12.13. Let $A$ and $B$ be sets of real numbers bounded above.
Then, $\sup(A + B) = \sup A + \sup B$. ($A + B = \{a + b : a \in A, b \in B\}$.)
12.14. Let $A$ and $B$ be sets of real numbers bounded below.
Then, $\inf(A + B) = \inf A + \inf B$. ($A + B = \{a + b : a \in A, b \in B\}$.)
12.15. Let $f$ and $g$ be functions with domain $X$ and values in $\mathbb{R}$ such that $\{f(x) : x \in X\}$ and $\{g(x) : x \in X\}$ are both bounded above. Then,
\[ \sup\{f(x) + g(x) : x \in X\} \le \sup\{f(x) : x \in X\} + \sup\{g(x) : x \in X\}. \]
Note: this is often written
\[ \sup_{x \in X} (f(x) + g(x)) \le \sup_{x \in X} f(x) + \sup_{x \in X} g(x). \]
Simple examples show that equality need not hold.
12.16. Let $f$ and $g$ be functions with domain $X$ and values in $\mathbb{R}$ such that $\{f(x) : x \in X\}$ and $\{g(x) : x \in X\}$ are both bounded below. Prove that
\[ \inf\{f(x) + g(x) : x \in X\} \ge \inf\{f(x) : x \in X\} + \inf\{g(x) : x \in X\}. \]
Note: this is often written
\[ \inf_{x \in X} (f(x) + g(x)) \ge \inf_{x \in X} f(x) + \inf_{x \in X} g(x). \]
Equality need not hold.
12.17. Find (with proof)
[
2
(a)
.2 C ; n.
n
n2N
\
1
(b)
.2
; 2/.
n
n2N [
1
1
(c)
;
.
nC1 n
n2N [
1
2
(d)
2 C ;
.
n
n
n2N
12.18. Find the supremum, infimum, maximum, and minimum (if they exist) of the following sets.
Prove your answer.
(a) f3; 2; ; g
1
, for n 2 N.
(b) fxn W n 1g, where xn D 1 C . 1/n C n
(b0 ) fxn W n 2g, for the same xn as in (b).
(b00 ) fx
˚ n W n 3g, for the
same xn as in (b).
(c) x 2 Q W x 2 13 .
1
1
(d) f n
Cm
W n; m 2 Ng.
(e) Œa; b/, where a < b.
x
(f) f 1Cx
W x 2 R; x > 0g.
n
p o
(g) x 2 Q W x 7 .
12.19. If $c$ is a fixed real number, then $\{x \in \mathbb{R} : x = c + r, \text{ where } r \in \mathbb{Q}\}$ is dense in $\mathbb{R}$.
12.20. Prove the more precise version of Cantor’s Theorem referred to in equation 11.20:
For each $n \in \mathbb{N}$, let $I_n = [a_n, b_n]$ be a (non-empty) bounded closed interval of the real number system and suppose $I_n \supset I_{n+1}$ for all $n$. Then,
\[ \bigcap_{n \in \mathbb{N}} I_n = [a, b], \]
where $a = \sup\{a_n : n \in \mathbb{N}\}$ and $b = \inf\{b_m : m \in \mathbb{N}\}$.
12.21. Every additive subgroup of $\mathbb{R}$ is either dense or consists of the integral multiples of one element (depending on whether or not it contains a least positive member).
[Proof sketch] If the infimum $a$ of the set of positive elements of the group $G$ is 0, then the group is dense. [If an interval $I$ has length $\ell > 0$, and $g \in G$ with $0 < g < \ell$, then some integer multiple of $g$ will belong to $I$.] Otherwise, $a > 0$. If $a \notin G$, there exists $g \in G$ with $a < g < a + a$, and again there is another $h \in G$ with $a < h < g$, so that $0 < g - h < a$. This is a smaller positive group element, which is impossible. Thus, $a \in G$. The claim is that $G = a\mathbb{Z}$. For if $g \in G$ and $na < g < (n + 1)a$, we would have $0 < g - na < a$, again impossible.
12.22. Let $x$ be a real number and $N \in \mathbb{N}$. Then, there exist $m \in \{1, \dots, N\}$ and $n \in \mathbb{Z}$ with $|mx - n| < 1/N$. [The $N + 1$ numbers $\langle kx \rangle = kx - \lfloor kx \rfloor$, for $k = 0, \dots, N$, lie in the interval $[0, 1)$. Since there are more than $N$ of these numbers, some pair of them must fall in one of the $N$ intervals $[\frac{i-1}{N}, \frac{i}{N})$ of length $1/N$.]
12.23. If $x$ is an irrational number, the set $x\mathbb{Z} + \mathbb{Z} = \{mx + n : m, n \in \mathbb{Z}\}$ is dense in $\mathbb{R}$. [Combine the 2 previous problems.]
12.24. If $x$ is irrational, then there exist an infinite number of rationals $p/q$ such that $|x - p/q| < 1/q^2$.
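The approximation statement in problem 12.22 can be explored numerically. The sketch below searches $m = 1, \dots, N$ by brute force rather than reproducing the pigeonhole argument, and it uses floating point, so it is an illustration only; the function name `dirichlet_pair` is hypothetical.

```python
import math

def dirichlet_pair(x, N):
    """Search for m in {1, ..., N} and an integer n with |m*x - n| < 1/N."""
    for m in range(1, N + 1):
        n = round(m * x)                 # nearest integer to m*x
        if abs(m * x - n) < 1 / N:
            return m, n
    # Not expected in exact arithmetic; floating-point rounding could land here.
    return None

m, n = dirichlet_pair(math.sqrt(2), 10)
print(m, n, abs(m * math.sqrt(2) - n))
```

For $x = \sqrt{2}$ and $N = 10$ the search returns $m = 5$, $n = 7$, and dividing $|5\sqrt{2} - 7| < 1/10$ by 5 gives $|\sqrt{2} - 7/5| < 1/50 < 1/5^2$, which is the kind of rational approximation asked for in problem 12.24.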
13. EXPONENTS
Here we will explain how to use completeness to define $a^x$ where $0 < a \in \mathbb{R}$ and $x \in \mathbb{R}$. What we want is that $a^1 = a$ and the laws of exponents hold, namely:
(1) $a^{x+y} = a^x a^y$
(2) $(a^x)^y = a^{xy}$
(3) $(ab)^x = a^x b^x$,
for as many values of $x$, $y$, $a$, $b$ as possible.
Natural exponent. Start by allowing $a$ to be any real number.
For $n \in \mathbb{N}$, $a^n$ is defined recursively by $a^1 = a$ and $a^{n+1} = a^n a$. Thus, as we expect, $a^n$ becomes $\underbrace{a \cdot a \cdots a}_{n \text{ times}}$.
Using induction, we can prove that
\[ a^{n+m} = a^n a^m, \qquad (a^n)^m = a^{nm}, \qquad \text{and} \qquad (ab)^n = a^n b^n, \]
for all real $a, b$ and natural numbers $n, m$; that is, the laws of exponents for $a, b$ real and $x, y$ naturals.
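The recursive definition translates directly into code, and the three laws can be spot-checked for small exponents. This is a sketch with exact rational arithmetic; the name `nat_power` is hypothetical, and a finite check is of course no replacement for the induction proofs.

```python
from fractions import Fraction

def nat_power(a, n):
    """a^n for a natural number n >= 1, via a^1 = a and a^(n+1) = a^n * a."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    if n == 1:
        return a
    return nat_power(a, n - 1) * a

a, b = Fraction(3, 2), Fraction(-2, 5)
for n in range(1, 6):
    for m in range(1, 6):
        assert nat_power(a, n + m) == nat_power(a, n) * nat_power(a, m)
        assert nat_power(nat_power(a, n), m) == nat_power(a, n * m)
    assert nat_power(a * b, n) == nat_power(a, n) * nat_power(b, n)

print(nat_power(a, 3))
```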
Integer exponent. Now, restrict to the case $a \ne 0$. For $x$ an integer, $x = n - m$ where $n, m \in \mathbb{N}$. We can prove that if $n - m = n' - m'$, where $n, m, n', m'$ are natural numbers, then
\[ \frac{a^n}{a^m} = \frac{a^{n'}}{a^{m'}}. \]
So it makes sense to define
\[ a^{n-m} = \frac{a^n}{a^m}. \]
Notice in particular that $a^0 = a^1/a^1 = 1$.
With this definition, it is an easy exercise to show the laws of exponents hold for $a, b$ non-zero and $x, y$ integers.
Rational exponents. If $x$ is any positive real number and $n$ is a natural number, there exists a unique positive real number $y$ such that $y^n = x$, and we let $x^{1/n} = y$. See the section EXISTENCE OF ROOTS. Here, if $a > 0$ and $r$ is a rational number of the form $m/n$ where $m \in \mathbb{Z}$ and $n \in \mathbb{N}$, we put
\[ a^r = (a^m)^{1/n}. \]
To justify this, we need to show that if $r$ is also $p/q$, where $p \in \mathbb{Z}$ and $q \in \mathbb{N}$, then
\[ (a^m)^{1/n} = (a^p)^{1/q}. \]
By the uniqueness of roots, it is enough to show that when $(a^m)^{1/n}$ is raised to the $q$th power, the result is $a^p$.
As a first step, notice that $(a^m)^{1/n}$ is also $(a^{1/n})^m$, since
\[ ((a^{1/n})^m)^n = (a^{1/n})^{mn} = (a^{1/n})^{nm} = ((a^{1/n})^n)^m = a^m. \]
But $m/n = p/q$ means $mq = pn$. Thus,
\[ ((a^m)^{1/n})^q = ((a^{1/n})^m)^q = (a^{1/n})^{mq} = (a^{1/n})^{np} = ((a^{1/n})^n)^p = a^p, \]
as required.
Again, straightforward calculations allow us to prove the laws of exponents for $a, b > 0$ and $x, y$ rational.
Arbitrary real exponents. We now come to the point of all the above discussion. We would like to extend the definition of $a^x$ so it applies to all real $x$. For this, we look at three cases: $a > 1$, $a = 1$, and $a < 1$. Since we want $(1b)^x = 1^x b^x$, we have no choice but to take $1^x = 1$; and since we want $1 = (a a^{-1})^x = a^x (a^{-1})^x$, we will have to have $a^x = (a^{-1})^{-x}$. So we work with the case $a > 1$.
Let $a > 1$, $x \in \mathbb{R}$. Then, $a^x$ is defined to be the unique number satisfying
\[ a^r \le a^x \le a^s, \quad \text{for all rational numbers } r, s \text{ with } r < x < s. \qquad (*) \]
To establish that such a number exists, consider the two sets
\[ C = \{a^r : r \in \mathbb{Q},\ r < x\} \quad \text{and} \quad D = \{a^s : s \in \mathbb{Q},\ x < s\}. \]
We show
(a) $c \in C$ and $d \in D$ imply $c < d$.
(b) For each $\varepsilon > 0$ there exist $c \in C$ and $d \in D$ with $d - c < \varepsilon$.
It follows that there is exactly one number between the elements of $C$ and $D$, namely, $\sup C = \inf D$ (this is an exercise), and the definition sets $a^x$ equal to this value.
(a) Since $a > 1$, for natural numbers $m, n$, $a^m > 1$, hence $a^{m/n} = (a^m)^{1/n} > 1$. That is, for all rational numbers $r > 0$, $a^r > 1$. From this we deduce that for rationals,
\[ r < s \text{ implies } a^r < a^s. \qquad (**) \]
Indeed, $s - r > 0$, so
\[ a^r < a^r (a^{s-r}) = a^{r+s-r} = a^s. \]
Thus, if $r < x$ and $x < s$, then $a^r < a^s$, which establishes (a).
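(b) By induction one can show that for every natural number $n$,
\[ b > 1 \text{ implies } b^n - 1 \ge n(b - 1); \]
hence, taking $b = a^{1/n}$,
\[ a - 1 \ge n(a^{1/n} - 1); \]
hence, if $0 < s - r < 1/n$, then
\[ a^s - a^r \le \frac{a^r (a - 1)}{n}. \]
Let $M$ be any rational $> x$. Use the Archimedean property to find $n$ so large that $\frac{a^M (a - 1)}{n} < \varepsilon$. Then choose $r, s$ rational with
\[ x - \frac{1}{2n} < r < x < s < x + \frac{1}{2n}. \]
Then, $r < M$ and $s - r < \frac{1}{n}$, so
\[ a^s - a^r = a^r (a^{s-r} - 1) \le a^r (a^{1/n} - 1) \le \frac{a^r (a - 1)}{n} \le \frac{a^M (a - 1)}{n} < \varepsilon. \]
This is what we wanted: $a^r \in C$, $a^s \in D$ and $a^s - a^r < \varepsilon$.
13.1 Note.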
(i) If $x$ is rational, the new definition and the old definition of $a^x$ give the same value, because of $(**)$.
(ii) If $x < y$ and $a > 1$, we see that $a^x < a^y$. Indeed, by density we may choose rationals $u, v$ with $x < u < v < y$, so $a^x \le a^u < a^v \le a^y$.
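The squeeze between the sets $C$ and $D$ can be watched numerically. The sketch below is an illustration under stated assumptions: it takes $a = 2$ and $x = \sqrt{2}$, picks rationals $r < x < s$ with $s - r = 1/(4n)$, and uses Python's floating-point power as a stand-in for the rational powers $(a^m)^{1/n}$ of the text. The function name `bracket_power` is hypothetical.

```python
from fractions import Fraction
import math

def bracket_power(a, x, n):
    """Return (a^r, a^s) for rationals r < x < s with s - r < 1/n, assuming a > 1."""
    # r and s are neighbouring multiples of 1/(4n) on either side of x
    # (assumes x is not itself a multiple of 1/(4n), e.g. x irrational).
    k = math.floor(x * 4 * n)
    r, s = Fraction(k, 4 * n), Fraction(k + 1, 4 * n)
    # a ** float(r) stands in for the rational power defined in the text
    return a ** float(r), a ** float(s)

for n in (1, 10, 100, 1000):
    lo, hi = bracket_power(2.0, math.sqrt(2), n)
    print(n, lo, hi, hi - lo)
```

With $M = 2$ as the rational above $x$, the gap `hi - lo` stays below the bound $a^M(a - 1)/n = 4/n$ from part (b), and the two columns close in on a single value, which is what the definition takes as $2^{\sqrt{2}}$.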
As we stated earlier, if $a = 1$, we let $a^x = 1$ for all $x$; if $0 < a < 1$, we let $a^x = (a^{-1})^{-x}$.
13.2 Theorem. With the above definitions, the expression $a^x$ satisfies the laws of exponents: for all real $a, b > 0$, and for all $x, y \in \mathbb{R}$,
(1) $a^{x+y} = a^x a^y$,
(2) $(a^x)^y = a^{xy}$, and
(3) $(ab)^x = a^x b^x$.
Proof. (1) Start with $a > 1$ and $x, y \in \mathbb{R}$. Let $u$ and $v$ be rational with
\[ u < x + y < v. \]
Then $u - y < x$, so by density of the rationals, there exists a rational $r$ with $u - y < r < x$. Put $s = u - r$. Then, $s < y$ and $u = r + s$, so
\[ a^u = a^{r+s} = a^r a^s \le a^x a^y. \]
Similarly,
\[ a^x a^y \le a^v. \]
Thus, $u$ and $v$ rational with $u < x + y < v$ implies
\[ a^u \le a^x a^y \le a^v. \]
But there is only one number between all such $a^u$ and $a^v$, namely $a^{x+y}$. Thus $a^{x+y} = a^x a^y$.
Note now that $a^x a^{-x} = a^{x+(-x)} = a^0 = 1$, so
\[ a^{-x} = (a^x)^{-1}. \]
In case $a = 1$, $1^x 1^y = 1 \cdot 1 = 1 = 1^{x+y}$.
In case $0 < a < 1$, the result (1) follows easily from the case $a > 1$, by applying it to $a^{-1}$:
\[ a^{x+y} = (a^{-1})^{-(x+y)} = (a^{-1})^{-x+(-y)} = (a^{-1})^{-x} (a^{-1})^{-y} = a^x a^y. \]
(2) This is similar to the proof of (1), but uses multiplication instead of addition.
Case $a > 1$, $x, y > 0$.
Let $u, v$ be rationals with $u < xy < v$; since $xy > 0$ and only rationals $u$ close to $xy$ matter for the uniqueness argument, we may assume $u > 0$. Since $u < xy$ and $y > 0$, $u/y < x$. Thus, there exists a rational $r$ with $u/y < r < x$; note that $r > 0$. And then $u/r$ is a positive rational $s$ with $s < y$. Thus,
\[ a^u = a^{rs} = (a^r)^s \le (a^x)^s \le (a^x)^y. \]
Similarly, $(a^x)^y \le a^v$. Hence,
\[ a^u \le (a^x)^y \le a^v, \]
for all such rationals $u, v$ with $u < xy < v$, so by the uniqueness in the definition, $a^{xy} = (a^x)^y$.
The cases with $a > 1$ and one or both of $x$ and $y$ negative follow from the positive case by simple algebraic manipulation. For example, if $a > 1$, $x > 0$, $y < 0$, we have
\[ a^{xy} = (a^{-xy})^{-1} = (a^{x(-y)})^{-1} = ((a^x)^{(-y)})^{-1} = (a^x)^y. \]
We leave the other cases to the reader. As in (1), the cases with $a = 1$ are trivial and the cases with $a < 1$ follow from the cases $a > 1$ by considering $a^{-1}$.
(3) is left as an exercise.
14. THE EXISTENCE OF $\pi$
Here we would like to prove the existence of the number $\pi$ and indicate how the trigonometric functions can be defined. The most common intuitive definition given for the number $\pi$ is the ratio of the circumference of a circle to its diameter. Of course, that brings up the question as to what the circumference of a circle is exactly. One might say it is the distance a point travels around the circle in one direction till it gets back to the starting point.
This all can be defined rigorously once one has the concept of a continuous parameterized curve. Here, let us do a simple version. You will agree that any reasonable definition of the total distance around the circle will be twice the length of the upper semicircle, so consider the semicircle
\[ C := \{(x, y) : x^2 + y^2 = 1,\ y \ge 0\}. \]
For a given $x \in [-1, 1]$, the unique $y$ with $(x, y) \in C$ is $y = \sqrt{1 - x^2}$. For each list $(-1, 0) = a_0, a_1, \dots, a_n = (1, 0)$ of points $a_i = (x_i, y_i) \in C$ such that $x_{i-1} < x_i$, we calculate $\sum_{i=1}^{n} d(a_{i-1}, a_i)$. The supremum of all such sums, if it exists, may be interpreted as the length of this semicircular arc, and used to define $\pi$. So consider the set of real numbers
\[ S = \Big\{ \sum_{i=1}^{n} d(a_{i-1}, a_i) : (-1, 0) = a_0, a_1, \dots, a_n = (1, 0) \in C \Big\}. \]
(Here we are assuming, as indicated above, that these $a_i = (x_i, y_i)$ are listed in order of their first coordinates $x_i$.)
The set $S$ is non-empty (for example, $2 = d((-1, 0), (1, 0)) \in S$). If we can show that $S$ is bounded above, then we will know that its supremum exists. We will prove that $S$ is bounded above by 4.
So, let $a_0, a_1, \dots, a_n$ be such a list. For each $i$,
\[ d(a_{i-1}, a_i) \le |x_{i-1} - x_i| + |y_{i-1} - y_i|. \]
Notice that if $i \ge 1$ and $x_i$ is negative, $x_{i-1} < x_i$ yields
\[ y_{i-1} = \sqrt{1 - x_{i-1}^2} < \sqrt{1 - x_i^2} = y_i. \]
Similarly, as soon as $x_{i-1} \ge 0$, $y_{i-1} > y_i$. As the $x$ values cross from negative to positive, we could have two values of $y$ the same. Let $k$ be the first index such that $y_k = \max\{y_0, \dots, y_n\}$. Then, since $x_{i-1} < x_i$ for all $i$, and $y_{i-1} < y_i$ for $i \le k$, and $y_{i-1} \ge y_i$ for $i > k$,
\[
\begin{aligned}
\sum_{i=1}^{n} d(a_{i-1}, a_i) &\le \sum_{i=1}^{n} |x_{i-1} - x_i| + \sum_{i=1}^{n} |y_{i-1} - y_i| \\
&= \sum_{i=1}^{n} (x_i - x_{i-1}) + \sum_{i=1}^{k} (y_i - y_{i-1}) + \sum_{i=k+1}^{n} (y_{i-1} - y_i) \\
&= 1 - (-1) + (y_k - 0) + (y_k - 0) \\
&\le 2 + 1 + 1 = 4.
\end{aligned}
\]
This shows that $\pi = \sup S$, the length of the semicircular arc, exists and is $\le 4$.
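The sums in $S$ are easy to compute for particular lists of points. The sketch below uses equally spaced $x$-coordinates, which is just one convenient family of lists (an assumption made for illustration), and compares the result with the library constant `math.pi`, which the text is in the process of defining from scratch.

```python
import math

def polygon_length(xs):
    """Sum of d(a_{i-1}, a_i) for the points a_i = (x_i, sqrt(1 - x_i^2)) on C."""
    pts = [(x, math.sqrt(max(0.0, 1.0 - x * x))) for x in xs]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def uniform_xs(n):
    """The list -1 = x_0 < x_1 < ... < x_n = 1 with equal spacing."""
    return [-1.0 + 2.0 * i / n for i in range(n + 1)]

for n in (1, 2, 10, 100, 1000):
    print(n, polygon_length(uniform_xs(n)))
```

Every value printed is an element of $S$, so it lies between 2 and the bound 4 established above, and the values increase toward $\sup S$ as the list is refined.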
Just to check that this is consistent with what you have learned, consider a semicircle of radius $r$:
\[ C_r = \{(x', y') : x'^2 + y'^2 = r^2,\ y' \ge 0\} = \{r(x, y) : x^2 + y^2 = 1,\ y \ge 0\}. \]
The distance between two points $ra$ and $rb$ on this arc is $d(ra, rb) = |ra - rb| = r|a - b| = r\,d(a, b)$ and, if we compute the arc length in a similar way, we obtain
\[ \sup\Big\{ \sum_{i=1}^{n} d(ra_{i-1}, ra_i) : (-1, 0) = a_0, a_1, \dots, a_n = (1, 0) \in C \Big\} = \sup\{rs : s \in S\} = r\pi, \]
as expected.
Now, let’s go back to the unit circle. In a similar way, we can define arc length $\ell(a, b)$ along the unit circle
\[ U = \{(x, y) : x^2 + y^2 = 1\} \]
from any point $a \in U$ to another $b \in U$, traversing continuously in one direction. This is a bit cumbersome with the theory we have at the moment, but the idea is simple. We find that if $b$ is between $a$ and $c$ as we traverse the circle from $a$ to $c$, the lengths satisfy
\[ \ell(a, c) = \ell(a, b) + \ell(b, c). \qquad (21) \]
Assuming the traversal is in the counterclockwise direction, one can show that for each $\theta \in [0, 2\pi)$, there is exactly one point $u = (x, y) \in U$ with $\ell((1, 0), u) = \theta$, and of course, we put
\[ \cos\theta = x, \qquad \sin\theta = y. \]
Using the usual identification of points in $\mathbb{R}^2$ with vectors whose tails are at the origin, we think of $\theta$ as a measure of the angle between the vector $(1, 0)$ and the vector $(x, y)$, traversing in that direction. The rotation map
\[ \begin{pmatrix} u \\ v \end{pmatrix} \mapsto \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \]
preserves distance. It follows that it preserves arc length. This together with the additivity formula (21) can be used to give the usual formulas for the sin and cos of sums and differences of angles. Other standard identities follow from $\sin^2\theta + \cos^2\theta = x^2 + y^2 = 1$.
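The claim that the rotation map preserves distance can be spot-checked numerically. This sketch takes the library functions `math.cos` and `math.sin` for granted (the text is only outlining how they can be defined), and the sample points and angle are arbitrary choices.

```python
import math

def rotate(theta, p):
    """Apply the rotation matrix [[cos, -sin], [sin, cos]] to the point p = (u, v)."""
    u, v = p
    return (math.cos(theta) * u - math.sin(theta) * v,
            math.sin(theta) * u + math.cos(theta) * v)

p, q, theta = (0.3, -1.2), (2.0, 0.5), 0.7

# distance between the rotated points equals the original distance (up to rounding)
assert math.isclose(math.dist(rotate(theta, p), rotate(theta, q)), math.dist(p, q))

# the point (1, 0) is carried to (cos theta, sin theta), a point of the unit circle U
x, y = rotate(theta, (1.0, 0.0))
print(x, y, x * x + y * y)
```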
15. TOPOLOGY IN $\mathbb{R}$ AND $\mathbb{R}^n$ AND OTHER METRIC SPACES
A metric space is a set $X$ together with a function $d : X \times X \to \mathbb{R}$ (called a metric) such that for all $x, y, z \in X$,
(1) $d(x, y) \ge 0$
(2) $d(x, y) = 0$ if and only if $x = y$
(3) $d(x, y) = d(y, x)$
(4) $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).
15.1 Examples. (a) $\mathbb{R}$. As we know, in $\mathbb{R}$, the distance between $x$ and $y$, defined by $d(x, y) = |x - y|$, satisfies these 4 properties, so $\mathbb{R}$ together with this distance forms a metric space.
(b) $\mathbb{R}^n$. These conditions are also satisfied by distances in $\mathbb{R}^n$. Recall that for points (vectors) in $\mathbb{R}^n$, the distance from $x$ to $y$ is $d(x, y) = |x - y|$, where $|x| = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$, the norm of $x$, and $d$ is called the Euclidean metric or Euclidean distance in $\mathbb{R}^n$; many authors use the notation $\|x\|$ instead of $|x|$. We will occasionally do this for emphasis.
(c) The complex number system $\mathbb{C}$ is also a metric space under the metric defined by $d(z, w) = |z - w|$. Here, if $z = x + iy$, $|z| = (x^2 + y^2)^{1/2}$. As a result, as a metric space, $\mathbb{C}$ can be regarded as the same as $\mathbb{R}^2$.
(d) Metric subspaces. The main examples we work with in this course are subspaces of $\mathbb{R}^n$ (or $\mathbb{R}$).
If $X \subset S$, a metric space with metric $d$, we can define a distance function on $X$ simply by restriction:
\[ d_X(x, y) = d(x, y), \quad \text{for } x, y \in X. \]
This distance is still a metric, because the four properties were already satisfied for elements of $S$, so they are certainly satisfied for elements of $X$.
The set $X$ with the metric $d_X$ is called a (metric) subspace of $S$.
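The four metric properties can be spot-checked for the Euclidean distance on a handful of points. This is a sketch on sample data chosen for illustration; a finite check does not prove the properties, and the small tolerance in the triangle inequality only absorbs floating-point rounding.

```python
import itertools
import math

def d(x, y):
    """Euclidean distance in R^n for points given as equal-length tuples."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

points = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5), (1.0, 2.0)]

for x, y, z in itertools.product(points, repeat=3):
    assert d(x, y) >= 0                          # (1) non-negativity
    assert (d(x, y) == 0) == (x == y)            # (2) zero exactly when x = y
    assert d(x, y) == d(y, x)                    # (3) symmetry
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-12  # (4) triangle inequality, with rounding slack

print(d((0.0, 0.0), (3.0, 4.0)))
```

Restricting `d` to any subset of these points gives the subspace metric $d_X$ of example (d): the same checks pass automatically, since they already hold for all points of the larger space.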
If we are working in a metric space $X$, the complement of a set $A$ in $X$ is $A^c = X \setminus A$. Thus, in $\mathbb{R}$, $A^c = \mathbb{R} \setminus A$, and in $\mathbb{R}^n$, $A^c$ means $\mathbb{R}^n \setminus A$.
Open and Closed sets. Let $X$ be a metric space. For $x \in X$, $\varepsilon > 0$, the set
\[ B(x, \varepsilon) = \{y \in X : d(y, x) < \varepsilon\} \]
is called the $\varepsilon$-neighbourhood or (open) $\varepsilon$-ball about $x$. Here, $\varepsilon$ is called the radius of the ball. For each $x$, we will let $\mathcal{B}_x = \{B(x, \varepsilon) : \varepsilon > 0\}$, the family of all these basic neighbourhoods. Now if $x \in X$ and $A \subset X$, exactly one of three things can happen:
$A$ contains some $U \in \mathcal{B}_x$ ($x$ is an interior point of $A$)
$A$ and $A^c$ intersect each $U \in \mathcal{B}_x$ ($x$ is a boundary point of $A$)
$A$ is disjoint from some $U \in \mathcal{B}_x$ ($x$ is an exterior point of $A$)
The sets of such points are, respectively, called the interior of $A$, the boundary of $A$, and the exterior of $A$, and denoted $\operatorname{int} A$, $\operatorname{bd} A$, $\operatorname{ext} A$. You can check right away that a point $x$ is an exterior point of $A$ if and only if it is an interior point of $A^c$.
A set is called open if all its points are interior points and is called closed if all points outside it are exterior points. Thus,
15.2 Theorem (Main test for open or closed set).
(1) $A$ is open iff for each $x \in A$, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \subset A$;
(2) $A$ is closed iff for each $x \notin A$, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \cap A = \emptyset$.
Since $B(x, \varepsilon) \cap A = \emptyset \iff B(x, \varepsilon) \subset A^c$, (2) says $A$ is closed iff for all $x \in A^c$, there exists $\varepsilon > 0$ with $B(x, \varepsilon) \subset A^c$. That is:
15.3 Corollary. A set is closed if and only if its complement is open.
Since a set clearly contains its interior (proof?) we can say $A$ is open iff it is equal to its interior, and it is closed iff its complement is equal to its exterior.
15.4 Examples. Determine which of the following sets are open, closed, neither or both.
(a) $(1, 5)$.
(b) $[1, 5]$.
(c) $(1, 5]$.
(d) $[2, \infty)$.
(e) $\mathbb{R}$.
(f) $\{1, 2, 3\}$.
(g) $\{\frac{1}{n} : n \in \mathbb{N}\}$.
(h) $\{r \in \mathbb{Q} : r > \sqrt{2}\}$.
(i) $[2, 4] \times [3, 6]$, a closed interval of $\mathbb{R}^2$.
Proof. (a) Let $A$ be the open interval $(1, 5)$.
If $x \in A$, then $1 < x < 5$, so $x - 1 > 0$ and $5 - x > 0$.
Put $\varepsilon = \min\{x - 1, 5 - x\}$. Then, $\varepsilon > 0$.
I claim $B(x, \varepsilon) \subset A$.
Indeed, $y \in B(x, \varepsilon)$ implies $|y - x| = d(y, x) < \varepsilon$. Thus,
\[ 1 = x - (x - 1) \le x - \varepsilon < y < x + \varepsilon \le x + 5 - x = 5, \]
so $y \in (1, 5) = A$.
We have shown that for each $x \in A$, there exists $\varepsilon > 0$ with $B(x, \varepsilon) \subset A$. So all of the points of $A$ are interior points, or in other words, $A$ is open.
On the other hand, $A$ is not closed because $1 \notin A$ and every neighbourhood of 1 contains points of $A$: indeed, if $\varepsilon > 0$ then the neighbourhood $B(1, \varepsilon) = (1 - \varepsilon, 1 + \varepsilon)$, and if $a$ is any point strictly between 1 and $\min\{5, 1 + \varepsilon\}$, such as $\min\{3, 1 + \varepsilon/2\}$, then $1 < a < 1 + \varepsilon$ and $a \in A$, so
\[ a \in B(1, \varepsilon) \cap A. \]
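The choice of radius in part (a) can be written as a small function and checked on sample points. The name `radius_inside` and the sample values are illustrative assumptions; the check verifies that the endpoints of $B(x, \varepsilon) = (x - \varepsilon, x + \varepsilon)$ stay within $[1, 5]$, so the open ball lies inside $(1, 5)$.

```python
def radius_inside(x, lo=1.0, hi=5.0):
    """For x in (lo, hi), return eps > 0 with (x - eps, x + eps) contained in (lo, hi)."""
    if not lo < x < hi:
        raise ValueError("x must lie in the open interval (lo, hi)")
    return min(x - lo, hi - x)

for x in (1.125, 2.0, 3.0, 4.75):
    eps = radius_inside(x)
    # the ball about x reaches at most to the nearer endpoint of the interval
    assert eps > 0 and 1.0 <= x - eps and x + eps <= 5.0
    print(x, eps, (x - eps, x + eps))
```

As $x$ approaches an endpoint the admissible radius shrinks to 0 but never reaches it, which is exactly why every point of $(1, 5)$ is interior while the endpoints 1 and 5 could not be.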
(b) Let $B = [1, 5]$. (Don’t confuse this $B$ with a ball $B(1, \varepsilon)$.) Then $B$ is not open, since for example, $1 \in B$, but is not an interior point. Every neighbourhood of 1 contains points outside of $B$; indeed, if $\varepsilon > 0$, $B(1, \varepsilon) = (1 - \varepsilon, 1 + \varepsilon)$ contains the point $1 - \varepsilon/2$, which does not belong to $B$.
However, $B$ is closed. To prove this, we take an element $x \notin B$ and find a neighbourhood of $x$ which is disjoint from $B$. For such an $x$, either $x < 1$ or $x > 5$. In case $x < 1$, we let $\varepsilon = (1 - x)/2$, which is $> 0$.
Then $B(x, \varepsilon) = (x - (1 - x)/2,\ x + (1 - x)/2)$. All of its elements are $< 1$, so $B(x, \varepsilon) \cap B = \emptyset$.
In case $x > 5$, we put $\varepsilon = (x - 5)/2$. Then $t \in B(x, \varepsilon)$ implies
\[ t > x - (x - 5)/2 > x - (x - 5) = 5, \]
so $t \notin B$. In other words, $B(x, \varepsilon) \cap B = \emptyset$. Thus, in both possible cases, $x$ has a neighbourhood which doesn’t intersect $B$.
(Actually, we could have taken $\varepsilon = 1 - x$ in the first case and $x - 5$ in the second. The smaller choice of $\varepsilon$ was taken just to make us feel “safer”: $B(x, \varepsilon)$ is well away from $B$.)
(c) If $C = (1, 5]$, then $C$ is not closed, because, as for part (a), each neighbourhood of 1 intersects $C$. Use the same $a$. On the other hand, $C$ is not open, because each neighbourhood of 5 contains points of $C^c$, namely points $> 5$. (Indeed, if $\varepsilon > 0$, then $5 + \varepsilon/2 \in B(5, \varepsilon)$, but does not belong to $C$.)
(d) $[2, +\infty)$ means $\{x \in \mathbb{R} : 2 \le x < \infty\}$. Call this set $A$. Then $2 \in A$, but each neighbourhood of 2 intersects $A^c = (-\infty, 2)$, so $2 \notin \operatorname{int}(A)$. Thus, $A$ is not open.
On the other hand, any point of $A^c$ (that is, any point $x < 2$) is in the exterior. Indeed, let $\varepsilon = 2 - x$. Then $t \in B(x, \varepsilon)$ implies $t < x + \varepsilon = 2$, so $B(x, \varepsilon)$ is disjoint from $A$. Thus, $A$ is closed.
(e) If $x \in \mathbb{R}$, then each neighbourhood of $x$ is contained in $\mathbb{R}$, so $x \in \operatorname{int} \mathbb{R}$; thus, $\mathbb{R}$ is open. $\mathbb{R}$ is also closed, for there are no points outside of $\mathbb{R}$, hence all points outside of $\mathbb{R}$ are exterior! The condition is vacuously satisfied.
(f) Let $F = \{1, 2, 3\}$. Here we have only 3 points. If $x \notin F$, put $\varepsilon = \min\{d(x, 1), d(x, 2), d(x, 3)\}$. This minimum exists since the minimum of a finite set always exists, and it is $> 0$, since each of $d(x, 1), d(x, 2), d(x, 3)$ is $> 0$. Now, the ball $B(x, \varepsilon)$ does not intersect $F$, because $y \in B(x, \varepsilon)$ implies $d(x, y) < \varepsilon \le d(x, a)$ for each $a \in F$. This shows that each point outside of $F$ is an exterior point, so $F$ is closed.
On the other hand, if $a \in F$, then $a$ is not an interior point of $F$. Indeed, if $\varepsilon > 0$, then $B(a, \varepsilon) = (a - \varepsilon, a + \varepsilon)$ is an infinite set and $F$ is finite, so $B(a, \varepsilon)$ contains (infinitely many) points that are not in $F$. Thus, $B(a, \varepsilon) \not\subset F$.
(g) Let $M = \{\frac{1}{n} : n \in \mathbb{N}\}$. Then $M$ is neither open nor closed.
$M$ not closed: Certainly, $0 \notin M$. To see that 0 has no neighbourhood disjoint from $M$ involves the Archimedean property. If $\varepsilon > 0$, there exists $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. But then $\frac{1}{n} \in B(0, \varepsilon) \cap M$.
$M$ not open: $1 \in M$, but each neighbourhood of 1 contains many points of $M^c$ [Why?].
(h) Let $S = \{r \in \mathbb{Q} : r > \sqrt{2}\}$. Then the density of the irrationals will show $S$ is not open, and the density of the rationals will allow us to conclude $S$ is not closed.
In more detail: suppose $x \in S$. Then $S$ does not contain a neighbourhood of $x$. Indeed, if $\varepsilon > 0$ then $B(x, \varepsilon)$ is an interval, so contains an irrational, and $S$ consists entirely of rationals. Thus, $S$ is not open.
On the other hand, if $x$ is any irrational $> \sqrt{2}$, then $x \notin S$ and, if $\varepsilon > 0$, $B(x, \varepsilon)$ contains the interval $(x, x + \varepsilon)$. Any rational in this interval belongs to $S$. Thus such an $x$ has no neighbourhood disjoint from $S$, so $S$ is not closed.
(i) Let $I = [2, 4] \times [3, 6]$. Then, $I$ is closed. To prove this, let $c = (c_1, c_2) \notin [2, 4] \times [3, 6]$. Then either $c_1 \notin [2, 4]$ or $c_2 \notin [3, 6]$ (or both). In the case $c_1 \notin [2, 4]$, either $c_1 < 2$ or $c_1 > 4$.
In the first case, let $\varepsilon = 2 - c_1$. We have to show that $B(c, \varepsilon) \cap I = \emptyset$. Let $x = (x_1, x_2) \in B(c, \varepsilon)$. Then $|x_1 - c_1| \le \sqrt{(x_1 - c_1)^2 + (x_2 - c_2)^2} = d(x, c) < \varepsilon$. Thus, $x_1 < c_1 + \varepsilon = 2$. Therefore $x_1 \notin [2, 4]$, so $x \notin [2, 4] \times [3, 6]$.
The case $c_1 > 4$ is proved similarly, using $\varepsilon = c_1 - 4$. And, of course, the case $c_2 \notin [3, 6]$ splits into 2 subcases, also proved in the same way.
We now prove $I$ is not open. There are lots of points that are not interior points. To show the way to more general situations, let us look at the point $a = (2, 5)$. Since $2 \in [2, 4]$ and $5 \in [3, 6]$, $(2, 5) \in I$. To show that no neighbourhood of $a$ is entirely contained in $I$, let $\varepsilon > 0$ be arbitrary. Let $x = (2 - \varepsilon/2, 5)$. Then, $x \notin I$, since $x_1 = 2 - \varepsilon/2 \notin [2, 4]$. But, $x \in B(a, \varepsilon)$, since $d(x, a) = |x - a| = \sqrt{(2 - \varepsilon/2 - 2)^2 + (5 - 5)^2} = \varepsilon/2 < \varepsilon$. This shows $B(a, \varepsilon)$ is not contained in $I$. Since $\varepsilon$ was arbitrary, no ball around $a$ is contained in $I$. Thus, $a$ is not an interior point of $I$ and therefore, $I$ is not open.
Pay close attention to the following two theorems. You will notice that they depend
only on the properties of distance. Thus, they are valid in any metric space.
15.5 Theorem. Every open ball is an open set.
Proof. Let $U = B(a, r)$, where $r > 0$. Let $x \in U$. Then $d(x, a) < r$. Put $\varepsilon = r - d(x, a)$. Then $B(x, \varepsilon) \subset U$. Indeed, if $y \in B(x, \varepsilon)$ then $d(y, x) < \varepsilon$, so
\[
\begin{aligned}
d(y, a) &\le d(y, x) + d(x, a) \\
&< \varepsilon + d(x, a) \\
&= r - d(x, a) + d(x, a) \\
&= r;
\end{aligned}
\]
that is, $d(y, a) < r$. In other words, $y \in U$, as required.
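The radius $\varepsilon = r - d(x, a)$ used in the proof can be tried out in $\mathbb{R}^2$ with the Euclidean distance. The sketch below samples points near $x$ at random and is only an illustration: the name `inner_radius`, the sample centre and radius, and the tiny rounding tolerance are all assumptions made here.

```python
import math
import random

def inner_radius(a, r, x):
    """For x in B(a, r), return eps = r - d(x, a) > 0, so that B(x, eps) lies in B(a, r)."""
    dist = math.dist(x, a)
    if dist >= r:
        raise ValueError("x is not in B(a, r)")
    return r - dist

random.seed(0)
a, r, x = (0.0, 0.0), 2.0, (1.0, 1.0)
eps = inner_radius(a, r, x)

for _ in range(1000):
    # sample y in the square around x and keep only those with d(y, x) < eps
    y = (x[0] + random.uniform(-eps, eps), x[1] + random.uniform(-eps, eps))
    if math.dist(y, x) < eps:
        # triangle inequality: d(y, a) <= d(y, x) + d(x, a) < r (slack for rounding)
        assert math.dist(y, a) < r + 1e-12

print(eps)
```

Only the triangle inequality is used in the proof, so the same choice of $\varepsilon$ works for any metric, not just the Euclidean one sampled here.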
15.6 Theorem.
(1) The union of any family of open sets is open.
(2) The intersection of any finite family of open sets is open.
Proof. (1) Let $\{G_i : i \in I\}$ be a family of open sets and put $U = \bigcup_{i \in I} G_i$. If $x \in U$, then (by definition of union) there exists an $i \in I$ with $x \in G_i$. For such an $i$, since $G_i$ is open, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \subset G_i$. But $G_i \subset U$, so $B(x, \varepsilon) \subset U$. Thus, for each $x \in U$ there is a neighbourhood of $x$ contained in $U$, so $U$ is open.
(2) Let $\{G_1, \dots, G_n\}$ be a finite family of open sets and $U = \bigcap_{i=1}^{n} G_i$. To prove $U$ is open, let $x \in U$. Then for each $i$, $x \in G_i$ and $G_i$ is open. Thus, we may choose $\varepsilon_i > 0$ such that $B(x, \varepsilon_i) \subset G_i$. Put $\varepsilon = \min\{\varepsilon_1, \dots, \varepsilon_n\}$. Then $\varepsilon > 0$ and, for each $i$, $B(x, \varepsilon)$ is contained in $G_i$, so $B(x, \varepsilon) \subset \bigcap_{i=1}^{n} G_i = U$. Thus, each point of $U$ has a neighbourhood contained in $U$, so $U$ is open.
The two properties in the above theorem form the basis for the concept of topology in
more advanced studies.
As a corollary to the previous theorem we see immediately that:
15.7 Theorem. The intersection of any family of closed sets is closed; the union of any finite family of closed sets is closed.
Proof. Recall (corollary 15.3) that a set is closed if and only if its complement is open. So let $\{F_i : i \in I\}$ be a family of closed sets and let $C = \bigcap_{i \in I} F_i$. Then, for each $i \in I$, $F_i^c$ is open. Therefore, $\bigcup_{i \in I} F_i^c$ is open. But, by De Morgan's laws,
$$C^c = \Big(\bigcap_{i \in I} F_i\Big)^c = \bigcup_{i \in I} F_i^c,$$
so $C^c$ is open. Therefore, its complement, $C$, is closed.
The proof of the second statement is similar and is left to the reader.
Balls, open sets, and closed sets in subspaces.
Let $X \subset S$, where $S$ is a metric space with metric $d$. Then $B_X(a,\varepsilon)$ denotes the open ball centred at $a$ and radius $\varepsilon$ in the subspace $X$. We calculate immediately that
$$B_X(a,\varepsilon) = \{x \in X : d_X(x,a) < \varepsilon\} = \{x \in X : d(x,a) < \varepsilon\} = \{x : d(x,a) < \varepsilon\} \cap X = B(a,\varepsilon) \cap X.$$
Of course, a similar result holds for closed balls: $\overline{B}_X(a,\varepsilon) = \overline{B}(a,\varepsilon) \cap X$.
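The identity $B_X(a,\varepsilon) = B(a,\varepsilon) \cap X$ can be illustrated with membership tests on the real line. In this sketch the subspace $X = [0,1)$ and the function names are choices made only for the illustration.

```python
def ball(a, eps):
    """Membership test for the open ball B(a, eps) in R."""
    return lambda x: abs(x - a) < eps

def subspace_ball(in_X, a, eps):
    """Membership test for B_X(a, eps) = B(a, eps) intersected with X."""
    in_ball = ball(a, eps)
    return lambda x: in_X(x) and in_ball(x)

def in_X(x):
    """X = [0, 1), an illustrative subspace of R."""
    return 0 <= x < 1

# As a set, B_X(0, 1/2) is [0, 1/2): the part of (-1/2, 1/2) that lies in X.
B_X = subspace_ball(in_X, 0.0, 0.5)
```

Here $-0.25$ belongs to $B(0, 0.5)$ but not to $B_X(0, 0.5)$, which is why a ball in the subspace can look "one-sided" from the point of view of the whole space.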
15.8 Theorem. Let $X \subset S$. Then,
(1) $U$ is open in $X$ iff there exists an open set $G$ in $S$ such that $U = G \cap X$;
(2) $C$ is closed in $X$ iff there exists a closed set $F$ in $S$ such that $C = F \cap X$.
Proof. (1) ($\Leftarrow$) Let $G$ be open in $S$. We have to prove that $G \cap X$ is open in $X$. Let $a \in G \cap X$. Then $a \in G$. Since $G$ is open in $S$, there exists $\varepsilon > 0$ such that $B(a,\varepsilon) \subset G$. Therefore
$$B_X(a,\varepsilon) = B(a,\varepsilon) \cap X \subset G \cap X.$$
Therefore, every point of $G \cap X$ is an interior point and hence $G \cap X$ is open in $X$.
($\Rightarrow$) Let $U$ be open in $X$. We have to find $G$ open in $S$ such that $U = G \cap X$. For each $a \in U$, there is an open ball in $X$ about $a$ contained in $U$. That is, there exists $\varepsilon_a > 0$ such that $B_X(a,\varepsilon_a) \subset U$. Then
$$\bigcup_{a \in U} B_X(a,\varepsilon_a) = U \quad \text{(why?)}.$$
But $B_X(a,\varepsilon_a) = B(a,\varepsilon_a) \cap X$, for all $a \in U$. Put
$$G = \bigcup_{a \in U} B(a,\varepsilon_a).$$
Then $G$ is open (in the whole space) and
$$G \cap X = \bigcup_{a \in U} \big(B(a,\varepsilon_a) \cap X\big) = \bigcup_{a \in U} B_X(a,\varepsilon_a) = U.$$
Thus, $U = G \cap X$, where $G$ is open in the whole space.
(2) is left as an exercise. (Hint: What is the relationship between open and closed sets?)
There are similar results about interior and about closure. (The corresponding statement about boundary is false. So is the corresponding statement about compact sets.) In the case of compact sets, it turns out that a subset of $X$ is compact in $X$ iff it is compact in the whole space. (See the section: Compactness in subspaces.)
15.9 Examples.
(a) If $S = \mathbb{R}$, $X = [a,b]$ is a closed interval, and $G = \mathbb{R}$, we get that $G \cap X = [a,b]$ is open in $X$ but not in $\mathbb{R}$.
(b) The set $[0,1)$ is neither open nor closed in $\mathbb{R}$. However, it is open in the subspace $X_1 = [0,1)$, though it still is not closed in $X_1$. It is closed in the subspace $X_2 = (-1,1)$, but is not open in $X_2$.
15.1. Classify each of the following as open, closed, neither or both. (Provide proof, of course.)
(a) $\mathbb{Q} \cap (2,7)$.
(b) $\mathbb{Z}$, the set of integers.
(c) $\{40/n : n \in \mathbb{N}\}$.
15.2. Every finite set in a metric space is closed.
15.3. In $\mathbb{R}$, every open interval $(a,b)$ is an open set; one or both of $a, b$ can be infinite. Notice that $(-\infty, +\infty) = \mathbb{R}$ is one of these.
15.4. In $\mathbb{R}$, every closed interval is a closed set. Here we include $[a,b]$, when $a, b$ are real, and the infinite closed intervals $[a, +\infty)$, $(-\infty, b]$ (and $\mathbb{R}$ itself).
15.5. If $A_1$ and $A_2$ are closed sets in $\mathbb{R}$, then $A_1 \times A_2$ is a closed subset of $\mathbb{R}^2$.
15.6. If $A_1$ and $A_2$ are open sets in $\mathbb{R}$, then $A_1 \times A_2$ is an open subset of $\mathbb{R}^2$.
15.7. In a metric space, each closed ball $\{x : d(x,a) \le r\}$ is a closed set.
15.8. If $t \in \mathbb{R}$ and $a, b \in \mathbb{R}^n$, let $x(t) = (1 - t)a + tb$. Then, $d(x(t), a) = |t|\, d(a,b)$.
15.9. In $\mathbb{R}^n$, no open ball $\{x : d(x,a) < r\}$ is closed.
15.10. Give an example of a sequence $(A_n)$ of open sets such that the intersection $\bigcap_{n \in \mathbb{N}} A_n$ is not open.
15.11. The $\varepsilon$-neighbourhoods are called basic neighbourhoods. If $U$ is a set which contains some basic neighbourhood of $x$, then $U$ is still referred to as a neighbourhood. Our definitions were in terms of basic neighbourhoods, but prove that $x$ is
(a) an interior point of $A$ iff $A$ contains some neighbourhood of $x$;
(b) a boundary point of $A$ iff both $A$ and $A^c$ intersect each neighbourhood of $x$;
(c) an exterior point of $A$ iff $A$ is disjoint from some neighbourhood of $x$.
15.12. $A$ is open iff it contains a neighbourhood of each of its points iff $A$ is a neighbourhood of each of its points (in the general sense of the previous exercise). [You will see that in most general results, "basic neighbourhood" can be replaced by "neighbourhood", without affecting their validity.]
16. INTERIOR, BOUNDARY, AND CLOSURE
Interior and boundary points were defined in the section TOPOLOGY IN $\mathbb{R}$ AND $\mathbb{R}^n$. It is worth noticing that the interior points are the points of the set that are not in the boundary.
16.1 Lemma. $\operatorname{int} A = A \setminus \operatorname{bd} A$.
Proof. If $x \in \operatorname{int} A$ then there exists a neighbourhood $U = B(x,\varepsilon)$ of $x$ with $U \subset A$. Since $x \in U$, $x \in A$. But $U \subset A$ also implies $U \cap A^c = \emptyset$, so $x \notin \operatorname{bd} A$. This shows $x \in \operatorname{int} A$ implies $x \in A \setminus \operatorname{bd} A$.
Conversely, if $x \in A \setminus \operatorname{bd} A$, then $x \in A$ and $x \notin \operatorname{bd} A$. The second statement means that there is a neighbourhood $U$ of $x$ which does not intersect both $A$ and $A^c$. But $x \in U$, so $U$ intersects $A$; hence $U$ doesn't intersect $A^c$. That is, $U \subset A$, so that $x \in \operatorname{int} A$.
You should check that all points of $A$ that are not interior points are in the boundary.
16.2 Theorem.
(1) $A$ is open iff it is disjoint from its boundary.
(2) $A$ and $A^c$ have the same boundary.
(3) $A$ is closed iff it contains its boundary.
Proof. (1) Once again, $\operatorname{int} A = A \setminus \operatorname{bd} A$, so
$$A \text{ is open} \iff A = \operatorname{int} A \iff A = A \setminus \operatorname{bd} A \iff A \cap \operatorname{bd} A = \emptyset.$$
(2) and (3) are left to the reader.
16.3 Examples. Find the boundary and interior points for the following sets.
(a) $(1,5)$.
(b) $[1,5]$.
(c) $[1,5)$.
(d) $[2,\infty)$.
(e) $\mathbb{R}$.
(f) $\{1,2,3\}$.
(g) $\{\frac{1}{n} : n \in \mathbb{N}\}$.
(h) $\{r \in \mathbb{Q} : r > \sqrt{2}\}$.
These are the same sets as in examples 15.4 (a)–(h).
Soln. (a) Let $A$ be the open interval $(1,5)$. What are its boundary points and what are its interior points?
As we have shown in example 15.4(a), $A$ is open: each point of $A$ is an interior point and, since $A$ can have no interior points outside itself, these are all of them:
$$A = \operatorname{int}(A).$$
If $\varepsilon > 0$ then the neighbourhood $B(1,\varepsilon) = (1 - \varepsilon, 1 + \varepsilon)$ contains points both of $A$ and of $A^c$. Indeed, if $a$ is any point between $1$ and $\min\{5, 1 + \varepsilon\}$, such as $\min\{3, 1 + \varepsilon/2\}$, then $1 < a < 1 + \varepsilon$ and $a \in A$, so
$$a \in B(1,\varepsilon) \cap A;$$
also
$$1 \in B(1,\varepsilon) \cap A^c.$$
Thus, 1 is a boundary point of $A$. In the same way, each neighbourhood of 5 contains points of $A$ and of $A^c$. Thus, 5 is also a boundary point of $A$. So far we have shown that $\operatorname{bd} A \supset \{1,5\}$. Are there any other boundary points of $A$? The points of $A$ are interior points, so the only other place to look is outside $[1,5]$.
The closed interval $[1,5]$ is a closed set, so if $x$ is any point not in $[1,5]$, it has a neighbourhood $U$ that doesn't intersect $[1,5]$. Such a neighbourhood doesn't intersect $A = (1,5)$ either, so $A$ has no boundary points outside of $[1,5]$; thus,
$$\operatorname{bd} A = \{1,5\}.$$
Note: If we didn't want to use the fact that $[1,5]$ is a closed set we could go through the argument from basics. If $x > 5$, taking $\varepsilon = x - 5$ we find that
$$B(x,\varepsilon) \cap A = \emptyset,$$
so $x$ is an exterior point (hence not a boundary point of $A$). Similarly, if $x < 1$, $x$ is not a boundary point of $A$.
Notice, by the way, that $A$ contains none of its boundary points.
(b) Let $B = [1,5]$. Since $(1,5)$ is an open set, each point of $(1,5)$ has a neighbourhood $U$ contained in $(1,5)$, so also contained in $B$. Thus, $(1,5) \subset \operatorname{int} B$. This time, however, not all the points of $B$ are interior points: the points 1 and 5 are in $B$, but are boundary points rather than interior points. For example, if $\varepsilon > 0$, then $1 - \varepsilon/2 \in B(1,\varepsilon) \cap B^c$ and $B(1,\varepsilon) \cap B \ne \emptyset$, since we know $B(1,\varepsilon) \cap (1,5) \ne \emptyset$. Thus,
$$\operatorname{int} B = (1,5) \quad \text{and} \quad \{1,5\} \subset \operatorname{bd} B.$$
But, as before, since $B$ is closed, no point outside it can be a boundary point of $B$, so in fact,
$$\{1,5\} = \operatorname{bd} B.$$
(c) If $C = [1,5)$, then one more time we see that $\operatorname{bd} C = \{1,5\}$ and $\operatorname{int} C = (1,5)$, but now $C$ contains one of its boundary points but not the other.
(d) Let $D = [2,+\infty) = \{x \in \mathbb{R} : 2 \le x < \infty\}$. The interval $(2,\infty)$ is an open set, so if $x \in (2,\infty)$, $x$ has a neighbourhood $U$ contained in $(2,\infty)$. Such a $U$ is contained in $D$ also, so
$$(2,\infty) \subset \operatorname{int}\,[2,\infty).$$
Each neighbourhood of 2 intersects both $D$ and $D^c = (-\infty, 2)$ (check this), so $2 \in \operatorname{bd} D$. Since there are no more points of $D$,
$$\operatorname{int} D = (2,\infty).$$
And since $D$ is closed (or $(-\infty,2)$ is open) any point $< 2$ is in the exterior, so
$$\operatorname{bd} D = \{2\}.$$
(e) If $x \in \mathbb{R}$, then each neighbourhood of $x$ is contained in $\mathbb{R}$, so $x \in \operatorname{int} \mathbb{R}$; there are no more points, so $\operatorname{int} \mathbb{R} = \mathbb{R}$ and $\operatorname{bd} \mathbb{R} = \emptyset$.
(f) Let $F = \{1,2,3\}$. As we have seen, $F$ is closed. Each point outside $F$ is an exterior point, so not a boundary point.
On the other hand, if $a \in F$, then $a$ is a boundary point of $F$. Indeed, if $\varepsilon > 0$, then
$$a \in B(a,\varepsilon) \cap F,$$
so $B(a,\varepsilon) \cap F \ne \emptyset$. Also, $B(a,\varepsilon) = (a - \varepsilon, a + \varepsilon)$ is an infinite set and $F$ is finite, so $B(a,\varepsilon)$ contains (infinitely many) points that are not in $F$; that is, $B(a,\varepsilon) \cap F^c \ne \emptyset$. This shows that each point of $F$ is a boundary point of $F$.
(g) Let $G = \{\frac{1}{n} : n \in \mathbb{N}\}$. We find $\operatorname{bd} G = G \cup \{0\}$. That is, each of the points of $G$ is a boundary point, 0 is another one, and there are no others. It follows that $G$ has no interior points: $\operatorname{int} G = \emptyset$.
We will let the reader check that all points of $G$ are boundary points of $G$.
To see that 0 is a boundary point, we need the Archimedean property: if $\varepsilon > 0$, there exists $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. But then $\frac{1}{n} \in B(0,\varepsilon) \cap G$. On the other hand, $0 \in B(0,\varepsilon) \cap G^c$.
That there are no other boundary points involves 3 cases: $x < 0$, $0 < x < 1$, and $x > 1$. In the first case, if we choose $\varepsilon = -x$, then $B(x,\varepsilon) \cap G = \emptyset$, since all elements of $G$ are $> 0$. The case $x > 1$ is similar. Just use $\varepsilon = x - 1$.
It is the case $0 < x < 1$ that is more interesting. Let $x \notin G$, but $0 < x < 1$. By the Archimedean property there is a natural number $n$ with $1/n < x$. Choose the smallest such $n$, so that
$$\frac{1}{n} < x < \frac{1}{n-1}$$
($n \ne 1$, since $x < 1$). The interval $(1/n, 1/(n-1))$ is open so it contains a neighbourhood $U$ of $x$. This neighbourhood cannot contain any elements of $G$, since there is no integer between $n - 1$ and $n$. This shows that $x$ cannot be a boundary point of $G$, establishing the claim.
Note: Here $n - 1 = \lfloor 1/x \rfloor$ and $n = \lceil 1/x \rceil$.
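The case $0 < x < 1$ of example (g) is constructive, so it can be carried out exactly for rational $x$. The sketch below uses hypothetical names and Python's `fractions` module to avoid rounding; it computes $n = \lceil 1/x \rceil$ and one radius $\varepsilon$ for which $B(x,\varepsilon) \subset (1/n, 1/(n-1))$.

```python
from fractions import Fraction
from math import ceil

def isolating_radius(x):
    """For rational 0 < x < 1 with x not of the form 1/k, return (n, eps) where
    1/n < x < 1/(n-1) and B(x, eps) is contained in the interval (1/n, 1/(n-1))."""
    x = Fraction(x)
    # In lowest terms, x is in G = {1/k} exactly when its numerator is 1.
    if not 0 < x < 1 or x.numerator == 1:
        raise ValueError("need 0 < x < 1 and x not in G")
    n = ceil(1 / x)  # smallest n with 1/n < x, because 1/x is not an integer
    eps = min(x - Fraction(1, n), Fraction(1, n - 1) - x)
    return n, eps
```

For $x = 2/5$ this gives $n = 3$ and $\varepsilon = 1/15$, the distance from $x$ to the nearer end-point $1/3$.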
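(h) Let $H = \{r \in \mathbb{Q} : r > \sqrt{2}\}$. Then density of the rationals and of the irrationals will allow us to conclude $\operatorname{bd} H = [\sqrt{2}, \infty)$ and $\operatorname{int} H = \emptyset$.
To see this, first note that $[\sqrt{2}, \infty)$ is closed, so $H$ has no boundary points $< \sqrt{2}$. But, if $\varepsilon > 0$ and $x \ge \sqrt{2}$, then $B(x,\varepsilon) \cap (\sqrt{2}, \infty) \supset (x, x + \varepsilon)$; this contains a rational $r$, by density of the rationals. This $r$ is greater than $\sqrt{2}$, so belongs to $H$. Thus, $B(x,\varepsilon) \cap H \ne \emptyset$. Also, the irrationals are dense, so there is an irrational $t \in (x, x + \varepsilon)$, so $B(x,\varepsilon) \cap H^c \ne \emptyset$. This shows that each $x \ge \sqrt{2}$ is a boundary point of $H$. Thus, $\operatorname{bd} H = [\sqrt{2}, +\infty)$ and $\operatorname{int} H = \emptyset$.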
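In the following list of properties we will see the interior of $A$ is the largest open set contained in $A$.
16.4 Theorem (Properties of interior).
(0) $A$ is open iff $A = \operatorname{int} A$.
(1) $\operatorname{int} A \subset A$.
(2) $A \subset B$ implies $\operatorname{int} A \subset \operatorname{int} B$.
(3) If $G$ is open and $G \subset A$, then $G \subset \operatorname{int} A$.
(4) $\operatorname{int} A$ is open; that is, $\operatorname{int}(\operatorname{int} A) = \operatorname{int} A$.
(5) $\operatorname{int}(A \cap B) = (\operatorname{int} A) \cap (\operatorname{int} B)$.
Consequently, $\operatorname{int}(A)$ is the largest open set contained in $A$.
Proof. You will see that we have already used some of the arguments in special cases.
(0) and (1) have been established before; we are just collecting them here together.
(2) If $x \in \operatorname{int} A$, then there exists an open ball $U = B(x,\varepsilon)$ with $U \subset A$. But $A \subset B$, so $U \subset B$ and hence $x \in \operatorname{int} B$.
(3) If $G$ is open and $G \subset A$, then $G = \operatorname{int} G \subset \operatorname{int} A$, by (0) and (2).
(4) Let $x \in \operatorname{int} A$. We have to show that there exists a neighbourhood of $x$ contained in $\operatorname{int} A$. By definition, we do know that there is an open ball $U = B(x,\varepsilon) \subset A$. But an open ball is an open set, so $U \subset \operatorname{int} A$, by (3). Thus $\operatorname{int} A$ is open.
The formula $\operatorname{int}(\operatorname{int} A) = \operatorname{int} A$ follows by (0).
(5) Since $A \cap B \subset A$, (2) yields $\operatorname{int}(A \cap B) \subset \operatorname{int} A$ and similarly, $\operatorname{int}(A \cap B) \subset \operatorname{int} B$, so
$$\operatorname{int}(A \cap B) \subset (\operatorname{int} A) \cap (\operatorname{int} B).$$
On the other hand, $\operatorname{int} A$ and $\operatorname{int} B$ are open, so $(\operatorname{int} A) \cap (\operatorname{int} B)$ is open and $(\operatorname{int} A) \cap (\operatorname{int} B) \subset A \cap B$. Thus,
$$(\operatorname{int} A) \cap (\operatorname{int} B) \subset \operatorname{int}(A \cap B),$$
by (3).
We said that $\operatorname{int} A$ is the largest open set contained in $A$. Well, (4) says it is open, (1) says it is contained in $A$ and (3) says it contains any other open set contained in $A$, so it is the largest.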
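As a quick check of property (5), here is a worked instance in $\mathbb{R}$; the particular intervals are chosen only for illustration. The second line is a reminder that the analogous formula for unions can fail, which is why (5) is stated for intersections.

```latex
% Property (5) for A = [0,2], B = [1,3]:
\operatorname{int}(A \cap B) = \operatorname{int}\,[1,2] = (1,2)
  = (0,2) \cap (1,3) = (\operatorname{int} A) \cap (\operatorname{int} B).
% The union analogue fails for A = [0,1], B = [1,2]:
\operatorname{int}(A \cup B) = \operatorname{int}\,[0,2] = (0,2)
  \neq (0,1) \cup (1,2) = (\operatorname{int} A) \cup (\operatorname{int} B).
```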
Closure. There is an operator that does for closed sets what interior does for open ones. For $A \subset \mathbb{R}$ or $\mathbb{R}^n$ (or any other metric space) we define the closure of $A$ by
$$\operatorname{cl} A = A \cup \operatorname{bd} A.$$
The elements of $\operatorname{cl} A$ are called closure points of $A$.
Here is a characterization of closure which is so important that many take it as the definition.
16.5 Theorem. A point $x$ is a closure point of a set $A$ iff every neighbourhood of $x$ intersects $A$.
You should prove this as an exercise, from the definition. We can also deduce this fact from the following formula.
16.6 Lemma. $(\operatorname{cl} A)^c = \operatorname{int}(A^c)$.
Proof. This is just a calculation. Since $\operatorname{cl}(A) = A \cup \operatorname{bd} A$,
$$(\operatorname{cl} A)^c = (A \cup \operatorname{bd} A)^c = A^c \cap (\operatorname{bd} A)^c = A^c \setminus \operatorname{bd}(A) = A^c \setminus \operatorname{bd}(A^c) = \operatorname{int}(A^c),$$
where the fourth equality holds since $\operatorname{bd}(A) = \operatorname{bd}(A^c)$.
To get theorem 16.5, then, we see that $x \in \operatorname{cl}(A)$ if and only if $x \notin \operatorname{int}(A^c)$; that is, iff there does not exist a neighbourhood of $x$ contained in $A^c$; in other words, iff every neighbourhood of $x$ intersects $A$.
16.7 Theorem (Properties of closure).
(0) $A$ is closed iff $A = \operatorname{cl} A$.
(1) $\operatorname{cl} A \supset A$.
(2) $A \subset B$ implies $\operatorname{cl} A \subset \operatorname{cl} B$.
(3) If $F$ is closed and $F \supset A$, then $F \supset \operatorname{cl} A$.
(4) $\operatorname{cl} A$ is closed; that is, $\operatorname{cl}(\operatorname{cl} A) = \operatorname{cl} A$.
(5) $\operatorname{cl}(A \cup B) = (\operatorname{cl} A) \cup (\operatorname{cl} B)$.
Hence, $\operatorname{cl}(A)$ is the smallest closed set containing $A$.
The proofs are left as exercises. They follow almost immediately from the corresponding results for interior. Alternatively, one may use similar proofs.
16.1. Find the interior and boundary of each of these sets. Give proof.
(a) $\{1 - 1/n : n \in \mathbb{N}\}$
(b) $(3,4] \cup \{7\}$
(c) $\{r \in \mathbb{Q}^c : r \in (3,4]\}$
(d) $\{n^2 : n \in \mathbb{N}\}$
(e) The line segment from $(0,0)$ to $(1,1)$ of $\mathbb{R}^2$.
Note: You can (and may) use theorems about open and closed sets. For example, every open interval is an open set.
16.2. Find the interior, boundary, and closure of the closed ball in $\mathbb{R}^n$: $\overline{B}(a,r) = \{x \in \mathbb{R}^n : d(x,a) \le r\}$ (with proof, of course).
16.3. The minimum of a set $A$ of real numbers is never an interior point of $A$.
16.4. When they exist (in $\mathbb{R}$), $\sup A$ and $\inf A$ are always boundary points of $A$.
17. BOUNDED SETS
As we know, in a metric space, the open ball centred at $a$ of radius $r$ is
$$B(a,r) = \{x : d(x,a) < r\},$$
and the closed ball centred at $a$ of radius $r$ is
$$\overline{B}(a,r) = \{x : d(x,a) \le r\}.$$
One also says "ball about $a$" instead of "ball centred at $a$".
Boundedness. In a metric space, a set $A$ is called bounded if it is contained in some ball. (It doesn't matter whether one uses open or closed balls. Why? See Problem 17.1.)
17.1 Theorem.
(a) If $A$ is a set in a metric space $S$, then $A$ is bounded iff for each point $c \in S$, $A$ is contained in some ball centred at $c$.
(b) In $\mathbb{R}^n$, $A$ is bounded in the metric space sense iff there exists $M$ such that $|x| \le M$, for all $x \in A$.
(c) In $\mathbb{R}$, $A$ is bounded in the metric space sense iff it is bounded above and bounded below, and iff there exists $M$ with $|x| \le M$, for all $x \in A$.
Proof. (a) If $A$ is contained in a ball centred at $c$, it is certainly contained in a ball, so it is bounded by definition. Now, suppose $A$ is contained in the ball $B(a,r)$ and $c$ is some other point of the metric space $S$. Then, for all $x \in A$, $d(x,a) < r$, so $d(x,c) \le d(x,a) + d(a,c) < r + d(a,c)$. In other words, $A$ is contained in the open ball centred at $c$, radius $M = r + d(a,c)$.
(b) For $A \subset \mathbb{R}^n$, we can take $c = 0$ in (a) and obtain: $A$ is bounded iff there exists $M > 0$ such that $A \subset B(0,M)$; that is, $|x| < M$ for all $x \in A$.
(c) Let $A \subset \mathbb{R}$. If $A$ is bounded in the metric space sense, then there exist a point $a$ and a radius $r$ such that $A \subset B(a,r)$. Thus, $x \in A$ implies $a - r < x < a + r$. This shows that $A$ is bounded below by $a - r$ and above by $a + r$.
Now suppose $A$ is bounded below by $b$ and above by $c$. Then $A \subset [b,c]$, and $[b,c]$ is a closed ball $\overline{B}(a,r)$, where $a = (b + c)/2$ and $r = (c - b)/2$. Alternatively, we could let $M = \max\{|b|, |c|\}$ and obtain $-M \le x \le M$, for all $x \in A$; that is, $|x| \le M$, for all $x \in A$. In other words, $A \subset \overline{B}(0,M)$.
17.2 Note. Be aware of the difference between the definition of bounded and the characterization (a) of the Theorem. The definition says there exists $a \in S$ and there exists $r > 0$ such that $A \subset B(a,r)$. The characterization says we can change the centre to any other point of $S$: for all $c \in S$, there exists $r$ such that $A \subset B(c,r)$. As we indicated in (b), if the space is $\mathbb{R}^n$ one often takes the centre to be 0.
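The change of centre in Theorem 17.1(a) comes with an explicit radius, $M = r + d(a,c)$. A minimal sketch in Euclidean space (the function name is made up for the illustration):

```python
import math

def recentred_radius(a, r, c):
    """Radius M = r + d(a, c) from the proof of Theorem 17.1(a):
    every point of B(a, r) lies in B(c, M), by the triangle inequality."""
    return r + math.dist(a, c)
```

With $a = (1,2)$, $r = 1$ and $c = (0,0)$ this gives $M = 1 + \sqrt{5}$, so the unit ball about $(1,2)$ sits inside the ball of that radius about the origin.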
17.3 Example. By an interval in $\mathbb{R}^n$, we mean a cartesian product $I_1 \times \cdots \times I_n$ where each $I_i$ is an interval of the reals. If $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ are points of $\mathbb{R}^n$, then $a \le b$ means $a_i \le b_i$ for $i = 1, \dots, n$. The closed interval $[a,b]$ then means
$$[a,b] = \{x : a_i \le x_i \le b_i \text{ for } i = 1, \dots, n\} = [a_1,b_1] \times \cdots \times [a_n,b_n].$$
Every closed interval $I = [a_1,b_1] \times [a_2,b_2] \times \cdots \times [a_n,b_n]$ of $\mathbb{R}^n$ is bounded. Conversely, every bounded set in $\mathbb{R}^n$ is contained in some such closed interval. The reader should prove this. What is the radius of the smallest ball that contains $I$?
For a set $A$ of a metric space $S$, the diameter of $A$ is
$$\operatorname{diam}(A) = \sup\{d(x,y) : x, y \in A\},$$
with the convention that $\operatorname{diam}(A) = \infty$ if this set of distances is not bounded above. Also, the diameter of the empty set is considered to be 0.
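For a finite set of points the supremum in the definition of diameter is a maximum over pairs, which makes it easy to compute. The following sketch assumes points of $\mathbb{R}^n$ given as tuples; the function name is illustrative.

```python
import math
from itertools import combinations

def diameter(points):
    """diam(A) = sup{d(x, y) : x, y in A} for a finite set A of points in R^n.
    The default of 0.0 covers the empty set (by convention) and singletons."""
    return max((math.dist(x, y) for x, y in combinations(points, 2)), default=0.0)
```

For example, the three points $(0,0)$, $(3,4)$, $(1,1)$ have diameter 5, attained by the first two.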
17.1. Prove that a set $A$ in a metric space is contained in some open ball if and only if it is contained in some closed ball.
17.2. A set in a metric space is bounded if and only if its diameter is finite.
17.3. For a non-empty set $A$ in a metric space, $\operatorname{diam}(A) = \operatorname{diam}(\operatorname{cl} A)$.
17.4. A set is bounded if and only if its closure is bounded.
17.5. The diameter of the sphere in $\mathbb{R}^n$ of radius $r$, $S(a,r) = \{x \in \mathbb{R}^n : |x - a| = r\}$, is $2r$. The same is true for the closed ball and the open ball of radius $r$. This need not be true in other metric spaces.
17.6. If $I = [a,b] = [a_1,b_1] \times \cdots \times [a_n,b_n]$ is a closed interval of $\mathbb{R}^n$, then $\operatorname{diam}(I) = d(a,b) = |a - b|$, the "length of the diagonal". ($a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$.) What if $I$ is a non-closed interval with these endpoints?
17.7. (Bisection Procedure) Let $I = I_1 \times \cdots \times I_n$, $I_i = [a_i, b_i]$, $c_i = \frac{a_i + b_i}{2}$ for $i = 1, \dots, n$. For each $i$, put $I_i^0 = [a_i, c_i]$ and $I_i^1 = [c_i, b_i]$. Then, $I_i = I_i^0 \cup I_i^1$, so
$$I = I_1 \times \cdots \times I_n = \bigcup_{e \in \{0,1\}^n} I_1^{e_1} \times \cdots \times I_n^{e_n}.$$
Each $e = (e_1, \dots, e_n)$ here is an $n$-tuple of 0's and 1's. There are $2^n$ of them. Each product $I_1^{e_1} \times \cdots \times I_n^{e_n}$ is a closed interval with diameter
$$\Big(\sum_i \big(\operatorname{length} I_i^{e_i}\big)^2\Big)^{1/2} = \Big(\sum_i \Big(\frac{b_i - a_i}{2}\Big)^2\Big)^{1/2} = \frac{1}{2}\, d(a,b).$$
In the case $n = 2$, the representation looks like
$$I = (I_1^0 \times I_2^0) \cup (I_1^0 \times I_2^1) \cup (I_1^1 \times I_2^0) \cup (I_1^1 \times I_2^1).$$
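The bisection procedure of problem 17.7 translates directly into a loop over the tuples $e \in \{0,1\}^n$. This is a sketch, with boxes represented (as an assumption of the illustration) by their lower and upper corners.

```python
import math
from itertools import product

def bisect_box(a, b):
    """Split the closed box [a, b] = [a_1,b_1] x ... x [a_n,b_n] into its 2^n
    closed sub-boxes, each returned as a pair (lower corner, upper corner)."""
    c = [(ai + bi) / 2 for ai, bi in zip(a, b)]  # midpoints c_i
    boxes = []
    for e in product((0, 1), repeat=len(a)):     # e ranges over {0,1}^n
        lo = tuple(ai if ei == 0 else ci for ai, ci, ei in zip(a, c, e))
        hi = tuple(ci if ei == 0 else bi for ci, bi, ei in zip(c, b, e))
        boxes.append((lo, hi))
    return boxes
```

Each sub-box has diagonal `math.dist(lo, hi)` equal to half of `math.dist(a, b)`, in line with the diameter computation in the problem.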
18. ACCUMULATION AND THE BOLZANO-WEIERSTRASS THEOREM (SET FORM)
Accumulation points. For a point $c$ and a set $A$, $c$ is an accumulation point of $A$ if for each $\varepsilon > 0$, $B(c,\varepsilon) \cap (A \setminus \{c\}) \ne \emptyset$. Thus, $c$ is an accumulation point of $A$ iff each neighbourhood of $c$ intersects $A$ in a point other than $c$. The set of all accumulation points of $A$ will be denoted here by $\operatorname{acc} A$. (In some books this is denoted $A'$ and is called the derived set of $A$. We won't use this terminology.)
A neighbourhood of $c$ with $c$ removed is called a deleted neighbourhood of $c$. In particular, the set $B'(c,\varepsilon) = B(c,\varepsilon) \setminus \{c\}$ is called the deleted neighbourhood of $c$ of radius $\varepsilon$. (Here $\varepsilon > 0$ as usual.) Using this language we could say that $c$ is an accumulation point of $A$ if each deleted neighbourhood of $c$ intersects $A$.
18.1 Theorem.
(a) $\operatorname{acc} A \setminus A = \operatorname{bd} A \setminus A = \operatorname{cl} A \setminus A$.
(b) $\operatorname{cl} A = A \cup \operatorname{acc} A \ (= A \cup \operatorname{bd} A)$.
Proof. (a) Let $c \in (\operatorname{acc} A) \setminus A$. Then $c \notin A$ and every neighbourhood of $c$ intersects $A \setminus \{c\}$. Let $U$ be a neighbourhood of $c$. Since $U$ intersects $A \setminus \{c\}$ it intersects $A$, and it also intersects $A^c$, since $c \in U \cap A^c$. This shows that $c \in \operatorname{bd} A$. Since it is also in $A^c$, $c \in \operatorname{bd} A \setminus A$.
Conversely, let $c \in \operatorname{bd} A \setminus A$. Then $c \in \operatorname{bd} A$ and $c \notin A$. By definition of boundary, every neighbourhood $U$ of $c$ intersects $A$ (and $A^c$). And since $c \notin A$, $U \cap A = U \cap (A \setminus \{c\}) \ne \emptyset$, so that $c \in \operatorname{acc} A$.
The second equality is trivial:
$$\operatorname{cl} A \setminus A = (A \cup \operatorname{bd} A) \setminus A = \operatorname{bd} A \setminus A.$$
(b) is of the same level of difficulty:
$$\operatorname{cl} A = A \cup \operatorname{bd} A = A \cup (\operatorname{bd} A \setminus A) = A \cup (\operatorname{acc} A \setminus A) = A \cup (\operatorname{acc} A),$$
by (a).
Many books use the formula $\operatorname{cl} A = A \cup \operatorname{acc} A$ as the definition of closure.
18.2 Examples.
(1) If $A$ is $(2,3]$, then $\operatorname{acc} A = [2,3]$.
(2) No finite set has an accumulation point. Indeed, if $F$ is a finite set with $F \setminus \{c\} \ne \emptyset$ and $\varepsilon = \min\{d(c,a) : a \in F \setminus \{c\}\}$, then $\varepsilon > 0$ and $B(c,\varepsilon) \cap (F \setminus \{c\}) = \emptyset$. (If $F \setminus \{c\}$ is empty, any $\varepsilon > 0$ will do.)
(3) If $A = \{\frac{1}{n} : n \in \mathbb{N}\}$, then $\operatorname{acc} A = \{0\}$.
(4) $\operatorname{acc} \mathbb{N} = \emptyset$.
A point $x$ is called an isolated point of $A$ if it belongs to $A$, but is not an accumulation point of $A$. Thus, in the above examples (2), (3), and (4) all the points of the set are isolated.
The following is one of the reasons for the name "accumulation" point.
18.3 Theorem. A point $c$ is an accumulation point of $A$ iff every neighbourhood of $c$ contains infinitely many points of $A$.
Proof. ($\Leftarrow$) This direction of the proof is immediate. If a neighbourhood contains infinitely many points of $A$, then it contains one other than $c$!
($\Rightarrow$) Let $c$ be an accumulation point of $A$. Let $U = B(c,\varepsilon)$ be a neighbourhood of $c$. Then $U$ has at least one point of $A$ other than $c$. Suppose $U \cap A$ is $F$, finite, and put $\varepsilon' = \min\{d(c,a) : a \in F \setminus \{c\}\}$. Then $0 < \varepsilon' < \varepsilon$, so $B(c,\varepsilon') \subset U$; but $B(c,\varepsilon')$ contains no element of $A$ other than $c$, a contradiction.
18.4 The Bolzano-Weierstrass Theorem (set form). Every bounded infinite set in $\mathbb{R}$ or $\mathbb{R}^n$ has an accumulation point.
We will be giving several proofs of this, because of the various techniques that they teach.
Many of our results are true in general metric spaces. This is not one of them. It depends on the order completeness of the reals.
Proof. Let $A$ be a bounded infinite set in $\mathbb{R}^n$. Since $A$ is bounded, there exists a closed bounded interval $I$ with $A \subset I$. Let $d$ be its diameter. (If $I = [a,b] = [a_1,b_1] \times \cdots \times [a_n,b_n]$, the diameter of $I$ is $|a - b|$.) Now, we may bisect $I$ using the midpoints $p_i = \frac{a_i + b_i}{2}$, obtaining it as the union of $2^n$ closed intervals of diameter $d/2$: see problem 17.7. (Each of these intervals is of the form $[x,y]$, where for each $i$, either $x_i = a_i$ and $y_i = p_i$ or $x_i = p_i$ and $y_i = b_i$.)
At least one of these $2^n$ intervals must contain an infinite number of points of $A$; otherwise, $A$ would be finite. So let $I_1$ be one such closed interval: it is contained in $I$, contains infinitely many points of $A$, and has $\operatorname{diam}(I_1) = d/2$. Continue this recursively: if $I_k$ has been chosen of diameter $d/2^k$ containing infinitely many points of $A$, choose $I_{k+1}$ a closed interval of diameter $d/2^{k+1}$ contained in $I_k$ and also containing infinitely many points of $A$.
By Cantor's Principle of nested intervals, the intersection
$$\bigcap_{k \in \mathbb{N}} I_k$$
contains some point $c$. (In fact it will be a singleton $\{c\}$, but that is not needed here.) We claim $c$ is an accumulation point of $A$. To see this, let $\varepsilon > 0$. Then there exists $k \in \mathbb{N}$ with $d/2^k < \varepsilon$ and we have
$$I_k \subset B(c,\varepsilon).$$
Indeed, $c \in I_k$ and if $x \in I_k$, then $d(x,c) \le \operatorname{diam}(I_k) = d/2^k < \varepsilon$. Finally, since $I_k$ contains infinitely many points of $A$, the neighbourhood $B(c,\varepsilon)$ contains infinitely many points of $A$, so $c$ is an accumulation point of $A$.
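The bisection in this proof can be imitated on a computer only approximately, since a program cannot test whether an interval holds infinitely many points. The one-dimensional sketch below therefore uses a finite sample and keeps the half containing at least as many sample points as the other; that substitution is an assumption of the illustration, not part of the proof, and it stops being meaningful once the intervals are shorter than the spacing of the sample.

```python
def bisect_towards_cluster(points, a, b, steps):
    """Halve [a, b] repeatedly, keeping a half that holds at least as many of the
    sample points as the other half (a finite stand-in for 'infinitely many')."""
    for _ in range(steps):
        m = (a + b) / 2
        left = [p for p in points if a <= p <= m]
        right = [p for p in points if m <= p <= b]
        if len(left) >= len(right):
            b, points = m, left
        else:
            a, points = m, right
    return a, b

# A finite piece of A = {1/n : n in N}, whose only accumulation point is 0.
sample = [1 / n for n in range(1, 10001)]
```

Calling `bisect_towards_cluster(sample, 0.0, 1.0, 10)` returns an interval of length $2^{-10}$ with left end-point 0, mirroring how the nested intervals $I_k$ close in on the accumulation point.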
18.1. If $A \subset B$, then $\operatorname{acc} A \subset \operatorname{acc} B$.
18.2. If $A$ is a non-empty set of real numbers bounded below with no minimum, then $\inf A$ is an accumulation point of $A$.
18.3. In $\mathbb{R}^n$, find the set of accumulation points, the boundary, and the closure of $B(a,r)$, $r > 0$.
18.4. In $\mathbb{R}^n$, find the set of accumulation points, the boundary, and the closure of the sphere $S(a,r) = \{x : |x - a| = r\}$, $r > 0$.
18.5. Find the set of accumulation points of these sets. Give proof.
(a) $\{1 - 1/n : n \in \mathbb{N}\}$
(b) $(3,4] \cup \{7\}$
(c) $\{r \in \mathbb{Q}^c : r \in (3,4]\}$
(d) $\{n^2 : n \in \mathbb{N}\}$
(e) The line segment from $(0,0)$ to $(1,1)$ of $\mathbb{R}^2$.
18.6. In $\mathbb{R}$ or $\mathbb{R}^n$ a set without accumulation points must have empty interior.
18.7. By contrast with problem 18.6: Let the metric space be $\mathbb{Z}$, the integers, with the usual distance. (Thus, it is a subspace of $\mathbb{R}$.) Then, for all $A \subset \mathbb{Z}$, $\operatorname{acc}(A) = \emptyset$, but $\operatorname{int}(A) = A$.
19. COMPACTNESS AND THE HEINE-BOREL THEOREM
The general statements and definitions we make here are valid in $\mathbb{R}$, or $\mathbb{R}^n$, or any metric space. The Heine-Borel Theorem, however, is only true in $\mathbb{R}$ or $\mathbb{R}^n$.
If $\mathcal{U}$ is a family of sets and $A$ is another set, "$\mathcal{U}$ covers $A$" means every element of $A$ is in some member of $\mathcal{U}$. This can be stated in terms of the union of the sets of $\mathcal{U}$:
$$\bigcup \mathcal{U} = \bigcup_{U \in \mathcal{U}} U \supset A.$$
To say a set $K$ is compact means that whenever $\mathcal{U}$ is a family of open sets covering $K$, there is a finite $\mathcal{U}' \subset \mathcal{U}$ which also covers $K$.
19.1 Example. Every finite set is compact.
Proof. Let $K = \{x_1, \dots, x_n\}$. To prove $K$ is compact we must begin with an arbitrary family of open sets which covers $K$ and extract from it a finite subfamily which still covers $K$. So let $\mathcal{U}$ be such a family of open sets:
$$\bigcup_{U \in \mathcal{U}} U \supset K.$$
Then for each $i = 1, \dots, n$,
$$x_i \in \bigcup_{U \in \mathcal{U}} U,$$
hence we may choose a $U_i \in \mathcal{U}$ with $x_i \in U_i$. Put
$$\mathcal{U}' = \{U_1, \dots, U_n\}.$$
Then
$$\bigcup \mathcal{U}' = U_1 \cup \cdots \cup U_n \supset K,$$
and $\mathcal{U}'$ is finite, so $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$ which covers $K$.
We have shown that every family of open sets which covers $K$ has a finite subfamily which also covers $K$. That is, by definition, $K$ is compact.
19.2 Example. Let $A = [1,\infty) = \{x \in \mathbb{R} : x \ge 1\}$. Then $A$ is not compact.
To prove this we must show that there exists $\mathcal{U}$, a family of open sets which covers $A$, such that there does not exist a finite subfamily $\mathcal{U}'$ of $\mathcal{U}$ which also covers $A$.
For each $n \in \mathbb{N}$, put $U_n = (0,n)$, an open interval, hence an open set. Put $\mathcal{U} = \{U_n : n \in \mathbb{N}\}$. Then
$$\bigcup \mathcal{U} = \bigcup_{U \in \mathcal{U}} U = \bigcup_{n \in \mathbb{N}} U_n.$$
This contains $A$. Indeed, $a \in A$ means $a \in \mathbb{R}$ with $a \ge 1$. By the Archimedean Property, there exists $N \in \mathbb{N}$ with $N > a$, hence $0 < a < N$. In other words, $a \in U_N$, so
$$a \in \bigcup_{n \in \mathbb{N}} U_n.$$
Now, suppose $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$. Then, we can write
$$\mathcal{U}' = \{U_{n_1}, \dots, U_{n_k}\},$$
where $k \in \mathbb{N}$. Let $M$ be the maximum of the numbers $n_1, \dots, n_k$. Since the $U_n$ are (increasingly) nested,
$$\bigcup \mathcal{U}' = (0,M) = U_M.$$
This does not contain $A$, since $A$ contains many points $> M$.
Thus, $\mathcal{U}$ is a family of open sets covering $A$, which has no finite subfamily which covers $A$. Therefore, $A$ is not compact.
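The argument of Example 19.2 produces an explicit uncovered point for any finite subfamily. In the sketch below a subfamily $\{U_{n_1}, \dots, U_{n_k}\}$ is represented by its list of indices, and the function names are hypothetical.

```python
def covered(x, indices):
    """True if x lies in some U_n = (0, n) with n among the given indices."""
    return any(0 < x < n for n in indices)

def uncovered_point(indices):
    """Given a finite, non-empty list of indices, return a point of A = [1, oo)
    missed by the union of the chosen U_n, which equals (0, M) for M = max index."""
    M = max(indices)
    return M + 1  # M + 1 >= 1, so it is in A, but it is not in (0, M)
```

For the subfamily with indices 3, 10 and 7 the union is $(0,10)$ and the point 11 of $A$ is left out; whatever finite list is supplied, the same recipe finds a point that escapes.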
19.3 Example. $(2,5]$ is not compact.
To prove this notice that this interval has no minimum. It is the fact that the 2 is missing that will make it not compact. We put $U_n = (2 + \frac{1}{n}, \infty)$, for each $n \in \mathbb{N}$. The family
$$\mathcal{U} = \{U_n : n \in \mathbb{N}\}$$
consists of open sets and it covers $(2,5]$. Indeed, if $x \in (2,5]$ and $n$ is chosen so that $1/n < x - 2$, then $x > 2 + 1/n$, so $x \in U_n$. (Actually, $\bigcup_{n \in \mathbb{N}} U_n = (2,\infty)$, as this method of proof shows.)
Now, suppose $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$, say
$$\mathcal{U}' = \{U_{n_1}, \dots, U_{n_k}\}.$$
If $M$ is the largest of the $n_i$, $i = 1, \dots, k$, then the union of $\mathcal{U}'$ is $U_M = (2 + \frac{1}{M}, \infty)$ and there is a point in
$$(2,5] \setminus U_M = \Big(2,\ 2 + \frac{1}{M}\Big],$$
so that $\mathcal{U}'$ does not cover $(2,5]$. Thus, $(2,5]$ is not compact.
The examples here are indicative of a general theorem.
19.4 Theorem. In any metric space, every compact set is closed and bounded.
Proof. Let $K$ be a compact set in the metric space $S$. To prove $K$ is bounded, let $a$ be any point of $S$; we will find a ball centred at $a$ containing $K$. For each positive $r \in \mathbb{R}$, let $U_r = B(a,r)$. Then, $U_r$ is an open set for each $r > 0$. Put
$$\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\}.$$
Now, each point $x$ is some distance $d(a,x)$ away from $a$. Thus, there exists an $r$ with $r > d(a,x)$, so $x \in B(a,r)$. Since this is true for all $x \in S$, for all $x \in K$ there exists $r > 0$ such that $x \in U_r$. This shows that
$$\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\} \text{ covers } K.$$
Therefore, by compactness, there exists a finite subfamily $\mathcal{U}'$ of $\mathcal{U}$ which also covers $K$. Say,
$$\mathcal{U}' = \{U_{r_1}, \dots, U_{r_k}\}.$$
If $M$ is the maximum of the $r_i$, we have for all $i$, $U_{r_i} \subset U_M$, so every element of $K$ belongs to $U_M = B(a,M)$. Thus, $K$ is contained in the ball $B(a,M)$, so is bounded.
Now, to prove $K$ is closed, let $a \notin K$. We will find a neighbourhood of $a$ that is disjoint from $K$.
This time, for $0 < r \in \mathbb{R}$, we put $U_r = \overline{B}(a,r)^c$ and $\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\}$. Since every closed ball is a closed set and the complement of a closed set is open, $\mathcal{U}$ is a family of open sets.
Now, let $x \in K$. Then, $x \ne a$, so $d(x,a) > 0$. Thus, if $0 < r < d(x,a)$, then $x \notin \overline{B}(a,r)$; in other words, $x \in U_r$. Thus, each $x \in K$ is in some $U \in \mathcal{U}$, so
$$\mathcal{U} \text{ covers } K.$$
Therefore, since $K$ is compact, there is a finite subfamily, say
$$\mathcal{U}' = \{U_{r_1}, \dots, U_{r_n}\},$$
which covers $K$. Thus, for each $x \in K$, there exists $i \in \{1, \dots, n\}$ with $x \in U_{r_i}$; in other words, with $x \notin \overline{B}(a,r_i)$. Let $\varepsilon = \min\{r_1, \dots, r_n\}$, so that for $i = 1, \dots, n$, $\overline{B}(a,\varepsilon) \subset \overline{B}(a,r_i)$. Thus, for all $x \in K$, $x \notin \overline{B}(a,\varepsilon)$; that is,
$$\overline{B}(a,\varepsilon) \cap K = \emptyset.$$
Since $B(a,\varepsilon) \subset \overline{B}(a,\varepsilon)$, every point outside $K$ has a neighbourhood disjoint from $K$, so $K$ is closed.
A family $\mathcal{U}$ of open sets which covers $A$ is often referred to as an open cover of $A$, even though normally "open" refers to a set. A subfamily of a cover of $A$ which still covers $A$ is called a subcover. Sometimes one says 'subcover of $A$', but it is 'sub' of $\mathcal{U}$ and cover of $A$.
In this language the definition of compact set becomes: $K$ is compact if and only if every open cover of $K$ has a finite subcover.
19.5 Theorem. Every closed subset of a compact set is compact.
Proof. Let $K$ be compact. Let $F$ be a closed set with $F \subset K$. Let $\mathcal{U}$ be a family of open sets covering $F$.
The plan is to construct an open cover $\tilde{\mathcal{U}}$ of $K$, extract a finite $\tilde{\mathcal{U}}' \subset \tilde{\mathcal{U}}$ which covers $K$, and then get from it the finite $\mathcal{U}' \subset \mathcal{U}$ which covers $F$. Please do not make the mistake many students make of starting with a cover of $K$.
We put $\tilde{\mathcal{U}} = \mathcal{U} \cup \{F^c\}$. Since $F$ is closed, the elements of $\tilde{\mathcal{U}}$ are still open. Since $\mathcal{U}$ covers $F$, every element of $F$ is in some $U \in \mathcal{U}$ and every other element of $K$ is in $F^c$, so $\tilde{\mathcal{U}}$ is a family of open sets which covers $K$. Thus, there is a finite $\tilde{\mathcal{U}}' \subset \tilde{\mathcal{U}}$ which covers $K$ and hence also covers $F$. Now, $F^c$ may or may not belong to $\tilde{\mathcal{U}}'$, but it contains no points of $F$, so is not really needed. We remove it if it is there: put
$$\mathcal{U}' = \tilde{\mathcal{U}}' \setminus \{F^c\}.$$
Then, $\mathcal{U}'$ still covers $F$, for if $x \in F$, then there exists $U \in \tilde{\mathcal{U}}'$ with $x \in U$; but this $U$ cannot be $F^c$, since $x \notin F^c$, so it must belong to $\mathcal{U}'$.
As we said before, the general results above are actually true in any metric space. The
following, however, is not. The fact that we are dealing with Rm is essential.
19.6 The Heine-Borel Theorem. In Rm , any closed bounded set is compact.
Proof. If a set is bounded, it is contained in a closed interval I D Œa; b. Since each closed
subset of a compact set is compact, we need only prove that I is compact.
For this, let U be a family of open sets that covers of I but no finite subfamily covers
I.
Put $I_1 = I$ and let $d = \operatorname{diam}(I)$. Assume a closed interval $I_n \subset I$ has been chosen
such that $I_n$ is not covered by a finite subfamily of $\mathcal{U}$ and
\[ \operatorname{diam}(I_n) = \frac{1}{2^{n-1}}\, d. \]
Then, $I_n$ is the union of a finite number ($2^m$) of closed intervals of diameter $\frac12 \operatorname{diam}(I_n) = \frac{1}{2^n}\, d$. If each of these could be covered by a finite subfamily of $\mathcal{U}$, then so could $I_n$, so at
least one of them, say $I_{n+1}$, cannot be covered by a finite subfamily of $\mathcal{U}$.
Thus, we have obtained, by recursion, a nested sequence $(I_n)$ of closed intervals, and
by Cantor's Principle, $\bigcap_n I_n \neq \emptyset$. Let $c$ be a point of this intersection.
Now, $\mathcal{U}$ is an open cover of $I$, so there exists $U \in \mathcal{U}$ with $c \in U$. For such a $U$, there
exists $\varepsilon > 0$ with $B(c, \varepsilon) \subset U$. Choose such an $\varepsilon$ and then an $n$ so that
\[ \operatorname{diam}(I_n) = d/2^{n-1} < \varepsilon. \]
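Then, $I_n \subset B(c, \varepsilon) \subset U$, because $c \in I_n$ and $\operatorname{diam}(I_n) < \varepsilon$.
Thus, $\{U\}$ is a finite subfamily of $\mathcal{U}$ which covers $I_n$, contrary to the construction: NO
finite subfamily of $\mathcal{U}$ covers $I_n$.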
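19.7 Theorem. Let $\mathcal{F}$ be a non-empty family of compact sets such that for each finite
subfamily $\mathcal{F}_0$ of $\mathcal{F}$, $\bigcap \mathcal{F}_0 \neq \emptyset$. Then $\bigcap \mathcal{F} \neq \emptyset$.
(Recall that $\bigcap \mathcal{F}$ and $\bigcap_{F \in \mathcal{F}} F$ both mean the same thing.)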
Proof. Let F be a non-empty family of compact sets such that each finite subfamily has
non-empty intersection.
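Let $\mathcal{U} = \{F^c : F \in \mathcal{F}\}$. Now, $F$ compact implies $F$ is closed and hence $F^c$ is open.
Thus, $\mathcal{U}$ is a family of open sets.
Choose any $K \in \mathcal{F}$. If we suppose $\bigcap_{F \in \mathcal{F}} F = \emptyset$, we have
\[ \bigcup \mathcal{U} = \bigcup_{F \in \mathcal{F}} F^c = \Big( \bigcap_{F \in \mathcal{F}} F \Big)^c = \emptyset^c \supset K. \]
Thus, by compactness there exists a finite $\mathcal{U}_0 \subset \mathcal{U}$ which also covers $K$. In other words,
there exists a finite $\mathcal{F}_0 \subset \mathcal{F}$ such that
\[ \bigcup_{F \in \mathcal{F}_0} F^c \supset K. \]
That is,
\[ \bigcap_{F \in \mathcal{F}_0} F \subset K^c, \]
or, equivalently,
\[ \Big( \bigcap_{F \in \mathcal{F}_0} F \Big) \cap K = \emptyset. \]
This shows that the intersection of the finite subfamily $\mathcal{F}_0 \cup \{K\}$ is empty. This is a
contradiction.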
19.8 Example (Practice). To see if you understand, show that Cantor's Principle of Nested
Intervals can be regarded as a special case of the above theorem.
Compactness in subspaces. You will recall (see Theorem 15.8) that, in a subspace X of
a metric space S ,
(1) a set $U$ is open iff there exists a set $G$, open in $S$, such that $U = G \cap X$.
(2) a set $C$ is closed iff there exists a set $F$, closed in $S$, such that $C = F \cap X$.
In the first case $U$ need not be open in $S$. For example, if $X$ is not open in $S$ we can take
$U = X$ and $G = S$. (However, $U$ will be open in $S$ if $X$ is open in $S$. Indeed, $U = G \cap X$
is then the intersection of 2 open sets in $S$.)
Similarly, in case (2), $C$ need not be closed in $S$, though it is so if $X$ is closed in $S$.
But compactness behaves differently.
19.9 Theorem. Let $X$ be a subspace of the metric space $S$. If $K \subset X$, then $K$ is compact
in $X$ iff $K$ is compact in $S$.
Proof. Suppose $K$ is compact in $X$. To prove $K$ is compact in $S$, let $\mathcal{U}$ be a cover of $K$ by
open sets in $S$. Since
\[ \bigcup_{U \in \mathcal{U}} U \supset K, \]
intersecting with $X$ gives
\[ \bigcup_{U \in \mathcal{U}} (U \cap X) \supset K \cap X = K. \]
Thus,
\[ \{U \cap X : U \in \mathcal{U}\} \]
is a cover of $K$ by open sets of $X$. But $K$ is compact in $X$, so there exists a finite subcover
\[ \{U_1 \cap X, \dots, U_n \cap X\}. \]
Thus,
\[ (U_1 \cap X) \cup \dots \cup (U_n \cap X) \supset K \]
and therefore,
\[ U_1 \cup \dots \cup U_n \supset K, \]
so that $\{U_1, \dots, U_n\}$ is the required subcover, showing that $K$ is compact in $S$.
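Now, for the converse, suppose $K$ is compact in $S$. To show $K$ is compact in the
subspace $X$, let $\mathcal{U}$ be a cover of $K$ by open sets of $X$. For each $U \in \mathcal{U}$, there is an open
set $G_U$ of $S$ with
\[ U = G_U \cap X. \]
Then
\[ \bigcup_{U \in \mathcal{U}} G_U \supset \bigcup_{U \in \mathcal{U}} U \supset K, \]
so that
\[ \{G_U : U \in \mathcal{U}\} \]
is a cover of $K$ by open sets of $S$. Thus, there is a finite subcover. Say,
\[ G_{U_1} \cup \dots \cup G_{U_n} \supset K. \]
Intersect these all with $X$ and get
\[ (G_{U_1} \cap X) \cup \dots \cup (G_{U_n} \cap X) \supset K \cap X = K. \]
That is,
\[ U_1 \cup \dots \cup U_n \supset K, \]
so $\{U_1, \dots, U_n\}$ is a subfamily of $\mathcal{U}$ covering $K$, the required subcover. Hence, $K$ is
compact in $X$.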
The conclusion to all of this is that, if a theorem refers to a compact subset of X , a
subspace of S , it doesn’t matter which of these spaces we think of the set being in. This
frequently comes up if we are working with a function defined only on a subset of R, or if
we need to restrict a function to a subset of its domain.
19.1. Let $K$ be compact in a metric space. For each $x \in K$ let $\delta(x)$ be a positive real number. Prove
that there are finitely many elements $x_1, \dots, x_k$ of $K$ such that
$B(x_1, \delta(x_1)) \cup \dots \cup B(x_k, \delta(x_k)) \supset K$.
19.2. Find a metric space and a subset of it which is closed and bounded but not compact.
19.3. The set $A = [2, 4)$ is not compact.
(a) Prove this by explicitly finding a family of open sets which covers A but has no finite subfamily which
also covers A.
(b) Find another family of open sets which covers A and does have a finite subfamily which covers A.
20. CONVERGENCE OF SEQUENCES: DEFINITION AND EXAMPLES
A sequence $(x_n)$ in a space $X$ is a function $n \mapsto x_n$ on $\{n \in \mathbb{Z} : n \geq p\}$, for some fixed
$p$. To make more precise where the indexing starts one can use the notation $(x_n)_{n=p}^{\infty}$. One
usually takes $p = 1$ as typical, which is what we do here. Sometimes it is convenient to
use $p = 0$.
A sequence $(x_n)$ in a metric space is said to converge to the point $a$ provided
for every $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for $n > N$, $d(x_n, a) < \varepsilon$. (22)
In symbols,
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon. \tag{22a} \]
We then call $a$ the limit of the sequence $(x_n)$ and write
\[ \lim_n x_n = a \quad \text{or} \quad x_n \to a. \]
To say a sequence $(x_n)$ diverges simply means it doesn't converge. Thus each sequence
in a metric space is either convergent or divergent.
Examples. Below we give the general term of each of a number of sequences. The object
is to investigate whether the limits of the sequences exist or not, with proof.
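(1) $\frac{1}{n}$
(2) $\frac{1}{\sqrt{n}}$
(3) $1 + \frac{1}{2^n}$
(4) $(-1)^n$
(5) $\frac{n^2}{1 + n^2}$
(6) $\frac{n^2 + 2n}{n^3 - 5}$
(7) $(-1)^n \left( \frac12 - \frac1n \right)$
(8) $n^2$
Remember that $|a + b| \leq |a| + |b|$ and $\big| |a| - |b| \big| \leq |a - b|$, versions of the triangle
inequality.
20.1 Example. $\lim_n \frac1n = 0$. In other words, $\left(\frac1n\right)$ converges to 0 or, briefly, $\frac1n \to 0$. Let
$x_n = \frac1n$.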
Analysis. The definition of convergence involves showing that the distance from the general term $x_n$ to 0 can be made small when $n$ is large. So we look at
\[ d\left(\tfrac1n, 0\right) = \left| \tfrac1n - 0 \right|. \]
We have to make this small when $n$ is large. More precisely, we must show that
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d\left(\tfrac1n, 0\right) < \varepsilon. \]
Now,
\[ d\left(\tfrac1n, 0\right) = \left| \tfrac1n - 0 \right| = \tfrac1n. \]
The Archimedean Property lets us find $N$ with $\frac1N < \varepsilon$, and any $n > N$ will satisfy $\frac1n < \varepsilon$.
So let us put this into a formal proof.
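Proof. Let $\varepsilon > 0$ be given (fixed, but arbitrary). Choose, by the Archimedean Property,
$N \in \mathbb{N}$ with $\frac1N < \varepsilon$. Then, for $n > N$,
\[ \frac1n < \frac1N < \varepsilon; \]
therefore,
\[ d\left(\tfrac1n, 0\right) = \left| \tfrac1n - 0 \right| = \tfrac1n < \varepsilon. \]
Thus,
for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $d\left(\tfrac1n, 0\right) < \varepsilon$.
That is, $\lim_n \frac1n = 0$.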
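The choice of $N$ in this proof can be made concrete. The following Python sketch is only an illustration and is not part of the notes: the helper names `archimedean_N` and `check_tail` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is assumed so that rounding plays no role. A finite spot-check of this kind does not replace the proof; it only shows how $N$ depends on $\varepsilon$.

```python
from fractions import Fraction
import math

def archimedean_N(eps: Fraction) -> int:
    """Return a natural number N with 1/N < eps, for rational eps > 0."""
    # floor(1/eps) + 1 is strictly larger than 1/eps, so 1/N < eps.
    return math.floor(1 / eps) + 1

def check_tail(eps: Fraction, how_many: int = 1000) -> bool:
    """Check d(1/n, 0) < eps for the first `how_many` indices n > N."""
    N = archimedean_N(eps)
    return all(abs(Fraction(1, n) - 0) < eps
               for n in range(N + 1, N + 1 + how_many))
```

For instance, `archimedean_N(Fraction(1, 10))` picks one admissible $N$ for $\varepsilon = 1/10$; any larger natural number would serve equally well.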
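20.2 Example. $\left(\frac{1}{\sqrt n}\right)$ converges to 0.
Analysis. We guess that $\lim_n \frac{1}{\sqrt n} = 0$. The relevant distance is
\[ d\left(\tfrac{1}{\sqrt n}, 0\right) = \left| \tfrac{1}{\sqrt n} - 0 \right| = \tfrac{1}{\sqrt n}. \]
For $\varepsilon > 0$ we need to find $N$ such that $n > N$ implies $\frac{1}{\sqrt n} < \varepsilon$. Now,
\[ \frac{1}{\sqrt n} < \varepsilon \iff \frac1n < \varepsilon^2. \]
So we see how the proof should go.
Proof. Let $\varepsilon > 0$ be given. Choose $N \in \mathbb{N}$ with
\[ \frac1N \leq \varepsilon^2 \quad \text{(possible by the Archimedean Property).} \]
Then for $n > N$,
\[ \frac1n < \frac1N \leq \varepsilon^2; \quad \text{hence} \quad \frac{1}{\sqrt n} < \varepsilon. \]
Thus, for all $n > N$, we have $d\left(\tfrac{1}{\sqrt n}, 0\right) = \left| \tfrac{1}{\sqrt n} - 0 \right| < \varepsilon$. But $\varepsilon$ was arbitrary, so for all
$\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $d\left(\tfrac{1}{\sqrt n}, 0\right) < \varepsilon$; that is, $\lim_n \frac{1}{\sqrt n} = 0$.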
20.3 Example. Let $x_n = 1 + \frac{1}{2^n}$. Then $\lim_n x_n = 1$.
Analysis. We are to show that no matter what $\varepsilon > 0$ we are given, we can find an $N$ so
large that
\[ d(x_n, 1) < \varepsilon, \]
for $n > N$. Now,
\[ d(x_n, 1) = |x_n - 1| = \left| 1 + \frac{1}{2^n} - 1 \right| = \frac{1}{2^n}. \]
Since $2^n > n$, for all $n \in \mathbb{N}$,
\[ \frac{1}{2^n} < \varepsilon \quad \text{provided} \quad \frac1n < \varepsilon. \]
Proof. Let $\varepsilon > 0$ be given. Choose $N \in \mathbb{N}$ with $\frac1N < \varepsilon$, by the Archimedean Property.
Then for all $n \in \mathbb{N}$, $n > N$ implies
\[ d(x_n, 1) = |x_n - 1| = \left| 1 + \frac{1}{2^n} - 1 \right| = \frac{1}{2^n} < \frac1n < \varepsilon. \]
Thus, for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $n > N$ implies $d(x_n, 1) < \varepsilon$; that is,
$x_n \to 1$.
20.4 Example. Let $a_n = (-1)^n$. Then $\lim_n a_n$ does not exist. That is, the sequence $(a_n)$
does not converge.
Notice that
\[ a_1 = -1, \quad a_2 = 1, \quad a_3 = -1, \quad a_4 = 1, \quad \dots, \qquad a_n = \begin{cases} -1, & \text{if $n$ is odd} \\ 1, & \text{if $n$ is even.} \end{cases} \]
The successive values here never get closer together than 2.
Imagine $a_n \to c$. When $n$ is even, $|a_n - c| < \varepsilon$ gives $|1 - c| < \varepsilon$, hence
\[ 1 - \varepsilon < c < 1 + \varepsilon. \]
When $n$ is odd, $|a_n - c| < \varepsilon$ gives $|(-1) - c| < \varepsilon$; hence
\[ -1 - \varepsilon < c < -1 + \varepsilon. \]
We take $\varepsilon = 1$. Then the even case yields $0 < c < 2$ and the odd one gives $-2 < c < 0$.
Putting these things together properly will yield a contradiction.
Claim. $(a_n)$ diverges, that is, does not converge.
Proof. Suppose $(a_n)$ converged. Then there would exist $c$ with $a_n \to c$. Take $\varepsilon = 1$. Then
there exists $N$ such that for all $n > N$, $|a_n - c| < 1$. For such $N$, we take an even $n > N$
and get
\[ 1 - \varepsilon < c < 1 + \varepsilon; \]
that is,
\[ 0 < c < 2. \]
But we may also take an odd $n > N$, and get
\[ -1 - \varepsilon < c < -1 + \varepsilon; \]
that is,
\[ -2 < c < 0. \]
Thus the existence of such an $N$ implies
\[ 0 < c \quad \text{and} \quad c < 0, \]
a contradiction.
Therefore, for $\varepsilon = 1$, no $N$ exists with $|a_n - c| < \varepsilon$, for $n > N$. Therefore
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ |a_n - c| < \varepsilon \]
is false: $(a_n)$ does not converge to $c$. Here $c$ was any proposed limit, so $(a_n)$ does not
converge.
Proof. (Second method.) Here the idea is that if $(a_n)$ converges to some $c$, then the terms
must be getting close together.
Suppose $(a_n)$ converges to $c$. Take $\varepsilon = 1$ in the definition to find $N$ so that $n > N$
implies $|a_n - c| < 1$. Consider any $n > N$. Then $n + 1$ is also $> N$. Thus,
\[ |a_n - a_{n+1}| \leq |a_n - c| + |c - a_{n+1}| < 1 + 1 = 2, \quad \text{that is,} \quad |a_n - a_{n+1}| < 2, \]
but also
\[ |a_n - a_{n+1}| = |(-1)^n - (-1)^{n+1}| = |1 + 1| = 2. \]
These two statements together yield $2 < 2$, a contradiction.
Thus, $(a_n)$ does not converge to $c$. The $c$ here was arbitrary, so $(a_n)$ does not converge
at all.
We will see that the method just used is quite general. It involves the idea of a Cauchy
sequence. See section 24. EXISTENCE: CAUCHY SEQUENCES.
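The key observation of the second method, that consecutive terms of $((-1)^n)$ always stay exactly 2 apart, can be tabulated directly. The snippet below is a small illustrative check over finitely many indices (the names `a` and `gaps` are chosen here for illustration); it is not a substitute for the argument above.

```python
def a(n: int) -> int:
    """General term a_n = (-1)^n."""
    return (-1) ** n

# Distance |a_n - a_{n+1}| between consecutive terms for n = 1, ..., 50.
gaps = [abs(a(n) - a(n + 1)) for n in range(1, 51)]
```

Every entry of `gaps` equals 2, so the terms can never all lie within distance 1 of a single point $c$.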
20.5 Example. We guess $\lim_n \frac{n^2}{1 + n^2} = 1$. One way to see this is to use the calculation:
\[ \frac{n^2}{1 + n^2} = \frac{1}{\frac{1}{n^2} + 1}. \]
The $\frac{1}{n^2}$ is very small when $n$ is large, so should be negligible.
Analysis.
\[ \left| \frac{n^2}{1 + n^2} - 1 \right| = \left| \frac{n^2 - (1 + n^2)}{1 + n^2} \right| = \left| \frac{-1}{1 + n^2} \right| = \frac{1}{1 + n^2}. \]
Now $1 + n^2 > n$, so $\frac{1}{1 + n^2} < \varepsilon$ provided $\frac1n < \varepsilon$. And Archimedes can take care of that.
Proof. Let $x_n = \frac{n^2}{1 + n^2}$. Let $\varepsilon > 0$ be given. By the Archimedean property, there exists $N$
with $\frac1N \leq \varepsilon$. For such an $N$ let $n > N$. Then
\[ \frac{1}{1 + n^2} < \frac1n < \frac1N \leq \varepsilon. \]
Thus,
\[ |x_n - 1| = \left| \frac{n^2}{1 + n^2} - 1 \right| = \left| \frac{n^2 - (1 + n^2)}{1 + n^2} \right| = \left| \frac{-1}{1 + n^2} \right| = \frac{1}{1 + n^2} < \varepsilon. \]
Therefore, for all $n > N$, $|x_n - 1| < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, for all $\varepsilon > 0$ there exists
$N$ such that for all $n > N$, $|x_n - 1| < \varepsilon$. Therefore, $\lim_n x_n = 1$.
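20.6 Example. $\lim_n \frac{n^2 + 2n}{n^3 - 5} = 0$.
Analysis.
\[ \left| \frac{n^2 + 2n}{n^3 - 5} - 0 \right| = \left| \frac{n^2 + 2n}{n^3 - 5} \right|. \]
If $n \geq 2$, we may remove the absolute value signs, because then $n^3 - 5 \geq 2^3 - 5 > 0$.
How much bigger must $n$ be to make $\frac{n^2 + 2n}{n^3 - 5} < \varepsilon$? Do not attempt to solve for $n$. If
$n \geq 2$ we do have
\[ n^2 + 2n \leq n^2 + n^2 = 2n^2 \]
and also
\[ n^3 - 5 \geq \frac{n^3}{2}, \quad \text{whenever} \quad \frac{n^3}{2} \geq 5, \]
which will hold if $n^3 \geq 10$, which will hold if $n \geq 3$.
Now, for $n \geq 3$, we have
\[ \frac{n^2 + 2n}{n^3 - 5} \leq \frac{2n^2}{n^3/2} = \frac4n, \]
which should be made $< \varepsilon$.
We are ready to organize this into a proof.
Proof. Let us call this sequence $(x_n)$. Let $\varepsilon > 0$ be given. Choose $N_1 \in \mathbb{N}$ with $N_1 \geq 4/\varepsilon$.
Let $N = \max\{3, N_1\}$. Then for $n > N$, we have
\[ |x_n - 0| = \left| \frac{n^2 + 2n}{n^3 - 5} - 0 \right| = \frac{n^2 + 2n}{n^3 - 5} \quad (\text{since } n \geq 2) \]
\[ \leq \frac{2n^2}{n^3/2} \quad (\text{because } n \geq 3 \implies n^3 - 5 \geq n^3/2) \]
\[ = \frac4n < \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$, such that for all $n > N$, $|x_n - 0| < \varepsilon$. Thus,
$\lim_n x_n = 0$.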
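The estimate $\frac{n^2 + 2n}{n^3 - 5} \leq \frac4n$ for $n \geq 3$ and the resulting choice $N = \max\{3, N_1\}$ can be spot-checked with exact rational arithmetic. This is a sketch for illustration only: the function names `x`, `choose_N` and `bound_holds` are not from the notes, and $\varepsilon$ is assumed rational so that `Fraction` can represent it exactly.

```python
from fractions import Fraction
import math

def x(n: int) -> Fraction:
    """General term x_n = (n^2 + 2n) / (n^3 - 5)."""
    return Fraction(n * n + 2 * n, n ** 3 - 5)

def choose_N(eps: Fraction) -> int:
    """N = max{3, N1} with N1 >= 4/eps, as in the proof."""
    return max(3, math.ceil(4 / eps))

def bound_holds(n_max: int = 500) -> bool:
    """Check 0 < x_n <= 4/n for 3 <= n <= n_max."""
    return all(0 < x(n) <= Fraction(4, n) for n in range(3, n_max + 1))
```

With $\varepsilon = 1/100$ the recipe gives $N = 400$; the crude bound $4/n$ makes this $N$ larger than strictly necessary, which is harmless because the definition only asks for some $N$.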
20.7 Example. Let $a_n = (-1)^n \left( \frac12 - \frac1n \right)$. When $n$ is large, we see that $a_n$ gets close to
$\frac12$, when $n$ is even, and close to $-\frac12$, when $n$ is odd. We guess that this sequence does not
converge.
Analysis. Suppose $(a_n)$ converges and let $c$ be its limit. For all $n \in \mathbb{N}$
\[ d(a_{n+1}, a_n) \leq d(a_{n+1}, c) + d(c, a_n). \]
This can be made small for $n$ large, while also,
\[ d(a_{n+1}, a_n) = |a_{n+1} - a_n| = \left| (-1)^{n+1} \left( \frac12 - \frac{1}{n+1} \right) - (-1)^n \left( \frac12 - \frac1n \right) \right| \]
\[ = \left| (-1)^{n+1} \left( \frac12 - \frac{1}{n+1} + \frac12 - \frac1n \right) \right| = \left| 1 - \left( \frac{1}{n+1} + \frac1n \right) \right|. \]
If $n \geq 2$, we have
\[ \frac{1}{n+1} + \frac1n \leq \frac13 + \frac12 < 1, \]
so for these $n$ we may remove the absolute value signs:
\[ |a_{n+1} - a_n| = 1 - \left( \frac{1}{n+1} + \frac1n \right) > 1 - \frac2n. \]
If $n \geq 6$, we would get
\[ |a_{n+1} - a_n| > 1 - \frac26 = \frac23. \]
Now we see what to do for a proof. We will use $\varepsilon = \frac13$.
Proof. Suppose $(a_n)$ converges to $c$. Then there exists $N \in \mathbb{N}$, with $|a_n - c| < \frac13$, for all
$n > N$. Fix such an $N$ and choose any $n \in \mathbb{N}$ greater than the maximum of $N$ and 6. Then
$n + 1$ is also $> \max\{N, 6\}$. We have
\[ |a_{n+1} - a_n| \leq |a_{n+1} - c| + |c - a_n| < \frac13 + \frac13 = \frac23. \tag{$*$} \]
On the other hand,
\[ |a_{n+1} - a_n| = \left| (-1)^{n+1} \left( \frac12 - \frac{1}{n+1} \right) - (-1)^n \left( \frac12 - \frac1n \right) \right| = \left| 1 - \left( \frac{1}{n+1} + \frac1n \right) \right|. \]
Since $n > 6$, we may remove the absolute value signs and get
\[ |a_{n+1} - a_n| = 1 - \left( \frac{1}{n+1} + \frac1n \right) > 1 - \frac2n > 1 - \frac26 = \frac23. \]
Combining this with $(*)$ we have $\frac23 < \frac23$, a contradiction.
Thus, $(a_n)$ does not converge.
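The lower bound $|a_{n+1} - a_n| > \frac23$ used in this proof can be verified exactly for a range of indices. The following is an illustrative sketch (the names `a` and `gap` are hypothetical helpers, and only finitely many $n$ are examined), using `Fraction` so that the comparison with $\frac23$ is exact.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    """General term a_n = (-1)^n (1/2 - 1/n)."""
    return (-1) ** n * (Fraction(1, 2) - Fraction(1, n))

def gap(n: int) -> Fraction:
    """Distance |a_{n+1} - a_n| between consecutive terms."""
    return abs(a(n + 1) - a(n))
```

For example, `gap(7)` agrees with the closed form $1 - \left(\frac18 + \frac17\right)$ derived above.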
20.8 Example. The sequence $(n^2)$ does not converge.
The reason this doesn't converge is that it becomes too large (unbounded), so can't get
close to any fixed number. Let us go directly to the proof.
Proof. Suppose $\lim_n n^2 = c \in \mathbb{R}$. Then, taking $\varepsilon = 1$ in the definition, there exists $N$
such that for all $n > N$,
\[ |n^2 - c| < 1. \]
By the triangle inequality, we have
\[ n^2 \leq |n^2 - c| + |c| < 1 + |c|, \]
for all $n > N$. But according to the Archimedean Property, there exists $n$ with $n >
\max\{N, 1 + |c|\}$. For such an $n$ we have
\[ n < n^2 < 1 + |c| < n, \]
which is impossible.
This contradiction shows that the limit $c$ did not actually exist.
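The proof is constructive in the following sense: for any proposed limit $c$ and any $N$, it names an index $n > N$ at which $n^2$ is at least 1 away from $c$. A minimal Python sketch of that choice follows; the function name `witness` is introduced here for illustration and is not notation from the notes.

```python
import math

def witness(c: float, N: int) -> int:
    """Return an index n > max{N, 1 + |c|}; then |n^2 - c| >= 1."""
    # floor(t) + 1 is the least integer strictly greater than t.
    return math.floor(max(N, 1 + abs(c))) + 1
```

Since $n > 1 + |c|$ forces $n^2 > n > 1 + |c|$, the returned index satisfies $n^2 - c > 1$, which is exactly what defeats $\varepsilon = 1$.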
20.1. For the following sequences .xn /, prove they converge or diverge (that is, do not converge) by
using the definition of limit directly.
(a) xn D
(b) xn D
(c)
(d)
(e)
(f)
xn
xn
xn
xn
D
D
D
D
n2 1
.
n2 C2n
2n
4n C1
1C. 1/n .n2 /
.
n2 C1
3n
.
2n 17
. 1/n .1 n/
.
p 1Cn
2
n C 4 C 1=n
p
n2 .
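20.2. Let $S$ be a metric space with distance function $d$. Which of the following are logically equivalent
to the definition of convergence to a point in $S$?
(a) $\exists a \in S,\ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(b) $\forall \varepsilon > 0,\ \exists a \in S,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(c) $\forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \exists a \in S,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(d) $\exists N \in \mathbb{N},\ \forall \varepsilon > 0,\ \exists a \in S,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(e) $\exists a \in S,\ \exists N \in \mathbb{N},\ \forall \varepsilon > 0,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(f) $\exists a \in S,\ \exists N \in \mathbb{N},\ \forall n > N,\ \forall \varepsilon > 0,\ d(x_n, a) < \varepsilon$.
(g) $\exists a \in S,\ \forall \varepsilon > 0,\ \forall n > N,\ \exists N \in \mathbb{N},\ d(x_n, a) < \varepsilon$.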
21. LIMIT THEOREMS FOR SEQUENCES OF REALS
The limit of a sequence of real numbers is unique: If a sequence $(x_n)$ converges to a
number $a$, then that is the only number it converges to. This justifies the notation $a =
\lim_n x_n$.
21.1 Theorem (Uniqueness of limits). Let $(x_n)$ be a sequence converging to $a$ and to $b$.
Then $a = b$.
Proof. Let $\varepsilon > 0$. Since $x_n \to a$, we may choose $N_a$ such that
\[ |x_n - a| < \varepsilon/2, \quad \text{for } n > N_a. \]
Since $x_n \to b$, we may choose $N_b$ such that
\[ |x_n - b| < \varepsilon/2, \quad \text{for } n > N_b. \]
Put $N = \max\{N_a, N_b\}$, and let $n > N$. Then, by the triangle inequality,
\[ |a - b| \leq |a - x_n| + |x_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus, for all $\varepsilon > 0$, $|a - b| < \varepsilon$. Hence, $a - b = 0$, by what we call "the first analysis
result" (Theorem 3.3), so $a = b$.
A set $A \subset \mathbb{R}$ is called bounded if there exists $M \in \mathbb{R}$ with $|a| \leq M$, for all $a \in A$. As we
saw in section 17. BOUNDED SETS, this is the same as "order bounded" (that is, bounded
above and bounded below) or bounded in the metric space sense (that is, contained in some
ball). A sequence $(a_n)$ in $\mathbb{R}$ is called bounded if its range $\{a_n : n \in \mathbb{N}\}$ is bounded; that
is, if there exists $M \in \mathbb{R}$ with $|a_n| \leq M$, for all $n \in \mathbb{N}$.
21.2 Theorem. Every convergent sequence is bounded.
Proof. Let $(x_n)$ be a convergent sequence and let $a$ be its limit. From the definition of
convergence, if we take $\varepsilon = 1$, we obtain an $N \in \mathbb{N}$ with
\[ |x_n - a| < 1, \quad \text{for all } n > N. \]
Then, by the triangle inequality,
\[ |x_n| \leq |x_n - a| + |a| < 1 + |a|, \quad \text{for } n > N. \]
Put $M = \max\{|x_1|, \dots, |x_N|, 1 + |a|\}$. Then,
\[ |x_n| \leq M, \quad \text{for all } n \in \mathbb{N}. \]
Therefore, $(x_n)$ is a bounded sequence.
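To see how the bound $M = \max\{|x_1|, \dots, |x_N|, 1 + |a|\}$ is assembled, here is a small sketch on a sequence chosen purely for illustration (it does not appear in the notes): $x_n = 1 + 5(-1)^n/n$, which converges to $a = 1$. For this example $|x_n - a| = 5/n < 1$ exactly when $n > 5$, so $N = 5$ works for $\varepsilon = 1$.

```python
from fractions import Fraction

def x(n: int) -> Fraction:
    """Hypothetical example: x_n = 1 + 5(-1)^n / n, converging to a = 1."""
    return 1 + Fraction(5 * (-1) ** n, n)

a = 1
N = 5  # |x_n - a| = 5/n < 1 exactly when n > 5
# M = max{|x_1|, ..., |x_N|, 1 + |a|}, as in the proof.
M = max([abs(x(n)) for n in range(1, N + 1)] + [1 + abs(a)])
```

Here the maximum is attained by an early term ($|x_1| = 4$), which shows why the finitely many terms before the tail cannot be ignored when bounding the whole sequence.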
21.3 Theorem (Comparison). Let $(x_n)$ and $(c_n)$ be sequences of real numbers, $a \in \mathbb{R}$.
If $\lim_n c_n = 0$ and there exist $k \geq 0$ and $m \in \mathbb{N}$ such that
\[ |x_n - a| \leq k c_n, \quad \text{for all } n > m, \]
then $\lim_n x_n = a$.
The proof is a good exercise.
21.4 Theorem. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}$, $a, b \in \mathbb{R}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$.
(2) If $x_n \to a$ and $c \in \mathbb{R}$ then $c x_n \to ca$.
(3) If $x_n \to a$ and $y_n \to b$, then $x_n y_n \to ab$.
(4) Let $x_n \to a$ and $y_n \to b$. If $y_n \neq 0$, for all $n \in \mathbb{N}$ and $b \neq 0$, then $x_n / y_n \to a/b$.
Thus,
\[ \lim_n (x_n + y_n) = \lim_n x_n + \lim_n y_n \quad \text{provided the right side exists.} \]
You should write out similar formulations for (2), (3) and (4). Notice that you have the
additional requirement, in the case of quotients, that none of the denominators be 0.
Proof of (1). Let $x_n \to a$, $y_n \to b$. Let $\varepsilon > 0$ be given. By definition, since $x_n \to a$, there
exists $N_1 \in \mathbb{N}$ such that
\[ |x_n - a| < \frac{\varepsilon}{2}, \quad \text{for } n > N_1. \]
Also since $y_n \to b$, there exists $N_2 \in \mathbb{N}$ such that
\[ |y_n - b| < \frac{\varepsilon}{2}, \quad \text{for } n > N_2. \]
Let $N = \max\{N_1, N_2\}$. Then $n > N$ implies both of these hold. Therefore, for $n > N$,
\[ |x_n + y_n - (a + b)| = |x_n - a + y_n - b| \leq |x_n - a| + |y_n - b| \quad \text{(by the triangle inequality)} \]
\[ < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
We have shown that for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $|(x_n +
y_n) - (a + b)| < \varepsilon$; that is, $(x_n + y_n)$ converges to $a + b$.
Proof of (2). Suppose $\lim_n x_n = a$, $c \in \mathbb{R}$. Let $\varepsilon > 0$ be given. Put $\varepsilon' = \frac{\varepsilon}{|c| + 1}$. Then
there exists $N \in \mathbb{N}$ such that
\[ n > N \text{ implies } |x_n - a| < \varepsilon'. \]
Then,
\[ n > N \text{ implies } |c x_n - ca| \leq |c| |x_n - a| \leq |c| \varepsilon' = |c| \frac{\varepsilon}{|c| + 1} < \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, this shows for all $\varepsilon > 0$ there exists $N$ such that for all $n > N$,
$|c x_n - ca| < \varepsilon$. In other words, $c x_n \to ca$.
In the above proof, we used $|c| + 1$ instead of $|c|$ since $c$ could have been 0 and we cannot
divide by 0. An alternative would have been to treat the case $c = 0$ separately. After all, if
$c = 0$, $|c x_n - ca| = |0 - 0| = 0$, for all $n \in \mathbb{N}$.
Analysis of (3). Let $x_n \to a$ and $y_n \to b$. Then
\[ |x_n y_n - ab| = |x_n y_n - x_n b + x_n b - ab| \leq |x_n y_n - x_n b| + |x_n b - ab| = |x_n| |y_n - b| + |x_n - a| |b|. \]
Since $|x_n - a|$ gets small as $n$ gets large and since $|b|$ stays fixed, the second term here will
become small. As for the first term, we see that $|y_n - b|$ gets small as $n$ gets large. Without
some control of $|x_n|$ the product $|x_n| |y_n - b|$ could get large. But since every convergent
sequence is bounded, this won't be a problem: there is an $M$ such that $|x_n| < M$ for all
$n \in \mathbb{N}$, and $M |y_n - b|$ can be made small.
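Proof of (3). Since $(x_n)$ converges, there exists $M$ with $|x_n| < M$ for all $n \in \mathbb{N}$.
Let $\varepsilon > 0$ be given. Since $x_n \to a$, there exists $N_1 \in \mathbb{N}$ such that
\[ |x_n - a| < \frac{\varepsilon}{2(|b| + 1)}, \quad \text{for } n > N_1. \]
Also since $y_n \to b$, there exists $N_2 \in \mathbb{N}$ such that
\[ |y_n - b| < \frac{\varepsilon}{2M}, \quad \text{for } n > N_2. \]
Let $N = \max\{N_1, N_2\}$. Then $n > N$ implies both of these hold, and
\[ |x_n y_n - ab| = |x_n y_n - x_n b + x_n b - ab| \leq |x_n y_n - x_n b| + |x_n b - ab| = |x_n| |y_n - b| + |x_n - a| |b| \]
\[ < M \frac{\varepsilon}{2M} + \frac{\varepsilon}{2(|b| + 1)} |b| \leq \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus, for all $n > N$,
\[ |x_n y_n - ab| < \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, this shows $x_n y_n \to ab$.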
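Proof of (4). Let $x_n \to a$ and $y_n \to b$, with $b \neq 0$ and $y_n \neq 0$ for all $n$. It will be enough
to prove that
\[ \frac{1}{y_n} \to \frac1b, \tag{$*$} \]
because then, by the limit of products theorem (3),
\[ \frac{x_n}{y_n} = x_n \cdot \frac{1}{y_n} \to a \cdot \frac1b = \frac{a}{b}. \]
Accordingly, let $\varepsilon > 0$ be given. For all $n$,
\[ \left| \frac{1}{y_n} - \frac1b \right| = \frac{|b - y_n|}{|y_n| |b|}. \]
Now, since $y_n \to b$, there exists $N_1$ such that
\[ |y_n - b| < |b|/2, \quad \text{for } n > N_1, \]
so, by the triangle inequality, for $n > N_1$,
\[ |b| \leq |b - y_n| + |y_n| < |b|/2 + |y_n|. \]
So
\[ |y_n| \geq \frac{|b|}{2}, \quad \text{for } n > N_1. \]
Also, there exists $N_2$ such that
\[ |y_n - b| < \varepsilon |b|^2 / 2, \quad \text{for } n > N_2. \]
Let $N = \max\{N_1, N_2\}$. Then, for $n > N$,
\[ \left| \frac{1}{y_n} - \frac1b \right| = \frac{|b - y_n|}{|y_n| |b|} < \frac{\varepsilon |b|^2 / 2}{|b|^2 / 2} = \varepsilon. \]
Hence $\frac{1}{y_n} \to \frac1b$, which is $(*)$.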
To understand the next two theorems, it is better to think of convergence in terms of
neighbourhoods. Remember that $|x - a| < \varepsilon$ iff $a - \varepsilon < x < a + \varepsilon$, that is iff $x \in
B(a, \varepsilon) = (a - \varepsilon, a + \varepsilon)$. Thus,
$x_n \to a$ iff for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $x_n \in (a - \varepsilon, a + \varepsilon)$.
The key to the squeeze theorem below is that if $x$ and $z$ belong to an interval $U$, such
as $(a - \varepsilon, a + \varepsilon)$, and if $x \leq y \leq z$, then $y$ also belongs to $U$.
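21.5 Squeeze Theorem.
Let $(x_n)$, $(y_n)$ and $(z_n)$ be sequences of real numbers with $x_n \leq y_n \leq z_n$, for all $n$. If
$(x_n)$ and $(z_n)$ both converge to $a$, then $(y_n)$ converges to $a$.
Proof. Let $\varepsilon > 0$ be given. Since $x_n \to a$, we may choose $N_1$ with
\[ x_n \in (a - \varepsilon, a + \varepsilon), \quad \text{for } n > N_1. \]
Since $z_n \to a$, we may choose $N_2$ with
\[ z_n \in (a - \varepsilon, a + \varepsilon), \quad \text{for } n > N_2. \]
Put $N = \max\{N_1, N_2\}$. Then for $n > N$, both $x_n$ and $z_n$ belong to $(a - \varepsilon, a + \varepsilon)$. But, by
hypothesis,
\[ x_n \leq y_n \leq z_n, \quad \text{for all } n \in \mathbb{N}. \]
Hence,
\[ y_n \in (a - \varepsilon, a + \varepsilon), \quad \text{for } n > N. \]
Explicitly, if $n > N$, then
\[ a - \varepsilon < x_n \leq y_n \leq z_n < a + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we have for all $\varepsilon > 0$ there exists $N$ with $|y_n - a| < \varepsilon$, for all
$n > N$, as required.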
21.6 Theorem (Preservation of inequalities). If $(x_n)$ and $(y_n)$ are sequences of reals
with $x_n \to x$, $y_n \to y$ and $x_n \leq y_n$, for all $n \in \mathbb{N}$, then $x \leq y$.
Proof. Assume the hypothesis and suppose $x > y$. Put $\varepsilon = (x - y)/2$. Then, $x - \varepsilon =
y + \varepsilon = (x + y)/2$. Now, since $x_n \to x$, we may choose $N_1$ such that $x_n \in (x - \varepsilon, x + \varepsilon)$
for $n > N_1$, and since $y_n \to y$, we may choose $N_2$ such that $y_n \in (y - \varepsilon, y + \varepsilon)$ for
$n > N_2$. Let $n$ be any natural number $> \max\{N_1, N_2\}$. Then,
\[ y_n < y + \varepsilon = x - \varepsilon < x_n. \]
This contradicts the fact that $x_n \leq y_n$ for all $n \in \mathbb{N}$.
WARNING: This theorem does not say that the strict inequality is preserved. If $x_n <
y_n$, for all $n$ and the limits exist, it is not true in general that $\lim_n x_n < \lim_n y_n$. We still only have
$\lim_n x_n \leq \lim_n y_n$.
For an example, take $x_n = 1 - \frac1n$ and $y_n = 1 + \frac1n$. Then $x_n < y_n$ but in the limit both
become 1.
21.7 Basic limit examples.
(1) If $p > 0$, $\lim_n \frac{1}{n^p} = 0$.
(2) If $|a| < 1$, then $\lim_n a^n = 0$.
(3) $\lim_n n^{1/n} = 1$.
(4) If $a > 0$, $\lim_n a^{1/n} = 1$.
For example (1) we don't require that $p$ be an integer. But, we need the fact that for all
$x > 0$ and all $p \in \mathbb{R}$, $x^p$ is defined and the usual rules of exponents hold. (See Theorem
13.2.)
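Proof of example (1).
Let $\varepsilon > 0$. By the Archimedean property, there exists $N \in \mathbb{N}$ with $N \geq 1/(\varepsilon^{1/p})$. Then
for $n > N$,
\[ \frac{1}{n^p} < \left( \varepsilon^{1/p} \right)^p = \varepsilon. \]
Thus,
\[ \left| \frac{1}{n^p} - 0 \right| < \varepsilon, \quad \text{for } n > N, \]
and $\frac{1}{n^p} \to 0$.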
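Proof of example (2).
Let $|a| < 1$. If $a = 0$ the result is trivial, so assume $a \neq 0$. Then $\frac{1}{|a|} > 1$, so $\frac{1}{|a|} = 1 + b$, where $b > 0$, and for $n \in \mathbb{N}$,
\[ |a|^n = \frac{1}{(1 + b)^n}. \]
By the Bernoulli inequality $(1 + b)^n \geq 1 + nb$. (This is proved by induction or from the
Binomial Theorem.) Hence,
\[ |a^n| = |a|^n \leq \frac{1}{1 + nb} < \frac{1}{nb}. \]
Let $\varepsilon > 0$. Then there exists $N$ such that $\frac1N < \varepsilon b$. Then for $n > N$, $|a|^n < \frac{1}{nb} < \frac{\varepsilon b}{b} = \varepsilon$.
Thus, for all $\varepsilon > 0$, there exists $N$ such that for all $n > N$, $|a^n - 0| < \varepsilon$. Hence
$a^n \to 0$.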
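The Bernoulli estimate turns into an explicit recipe for $N$: compute $b = \frac{1}{|a|} - 1$ and take any $N$ with $\frac1N < \varepsilon b$. The sketch below illustrates this under the assumptions that $a$ and $\varepsilon$ are rational and $0 < |a| < 1$; the names `choose_N` and `tail_ok` are hypothetical.

```python
from fractions import Fraction
import math

def choose_N(a: Fraction, eps: Fraction) -> int:
    """Return N with 1/N < eps*b, where 1/|a| = 1 + b (requires 0 < |a| < 1)."""
    b = 1 / abs(a) - 1
    return math.floor(1 / (eps * b)) + 1

def tail_ok(a: Fraction, eps: Fraction, how_many: int = 50) -> bool:
    """Check |a^n - 0| < eps for the first `how_many` indices n > N."""
    N = choose_N(a, eps)
    return all(abs(a ** n) < eps for n in range(N + 1, N + 1 + how_many))
```

The $N$ produced this way is typically far larger than needed, since $\frac{1}{nb}$ decays much more slowly than $|a|^n$; the proof trades sharpness for an inequality that is easy to solve.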
Proof of example (3).
Let $x_n = n^{1/n} - 1$. Then, $x_n \geq 0$, for all $n$. It will be enough to prove $x_n \to 0$.
By the Binomial Theorem, for $n > 1$,
\[ n = (1 + x_n)^n = 1 + n x_n + \frac{n(n-1)}{2} x_n^2 + \dots + x_n^n > \frac{n(n-1)}{2} x_n^2; \]
hence,
\[ x_n^2 < \frac{n}{n(n-1)/2} = \frac{2}{n-1}, \quad \text{for } n > 1. \]
Now let $\varepsilon > 0$ be given. Choose any $N \geq \frac{2}{\varepsilon^2} + 1$. Then, for all $n > N$, $|x_n| = x_n < \varepsilon$.
Hence, $x_n \to 0$.
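The inequality $x_n^2 < \frac{2}{n-1}$ and the choice $N \geq \frac{2}{\varepsilon^2} + 1$ can be checked numerically. The sketch below uses ordinary floating point, so it is an approximate illustration rather than an exact verification; the function names are introduced here and are not from the notes.

```python
import math

def x(n: int) -> float:
    """x_n = n^(1/n) - 1."""
    return n ** (1.0 / n) - 1.0

def choose_N(eps: float) -> int:
    """Any N >= 2/eps^2 + 1 works in the proof; take the ceiling."""
    return math.ceil(2.0 / eps ** 2 + 1.0)

def bound_holds(n_max: int = 10_000) -> bool:
    """Check 0 <= x_n and x_n^2 < 2/(n-1) for 2 <= n <= n_max."""
    return all(0.0 <= x(n) and x(n) ** 2 < 2.0 / (n - 1)
               for n in range(2, n_max + 1))
```

Dropping all but one term of the binomial expansion is wasteful, so the $N$ returned by `choose_N` is again much larger than the smallest index that would work.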
Proof of (4).
First assume $a \geq 1$. Then $1 \leq a^{1/n} \leq n^{1/n}$, for $n \geq a$. So
\[ |a^{1/n} - 1| \leq n^{1/n} - 1, \quad \text{for } n > a. \]
By example (3), $(n^{1/n} - 1)$ converges to 0, so $a^{1/n} \to 1$, by comparison. (Alternatively we
could have used the squeeze theorem.)
In case $0 < a < 1$, we apply the above to the reciprocal: $1/a > 1$, so $(1/a)^{1/n} \to 1$ and
hence $a^{1/n} = 1/(1/a)^{1/n} \to 1$ by the limit of quotients theorem.
21.8 Theorem. Let $(a_n)$ be a sequence of positive real numbers such that $\lim_n \frac{a_{n+1}}{a_n}$ exists
and is $< 1$. Then $a_n \to 0$.
Proof. Let $r = \lim_n \frac{a_{n+1}}{a_n}$ and assume $r < 1$. Choose $c$ with $r < c < 1$ and put $\varepsilon = c - r$.
Then, there exists $N \in \mathbb{N}$ such that, for all $n \geq N$, $\left| \frac{a_{n+1}}{a_n} - r \right| < \varepsilon$. Thus, for $n \geq N$,
\[ \frac{a_{n+1}}{a_n} < r + \varepsilon = c. \]
In other words $a_{n+1} < c\, a_n$ for $n \geq N$. Thus,
\[ a_{N+1} \leq a_N c, \qquad a_{N+2} \leq a_{N+1} c \leq a_N c^2, \qquad \dots \]
By a simple induction $a_{N+k} \leq a_N c^k$, for all $k \in \mathbb{N}$ or what is the same,
\[ a_n \leq a_N c^{n-N}, \quad \text{for } n > N. \]
Put $K = a_N c^{-N}$ and get
\[ a_n \leq K c^n, \quad \text{for } n > N. \]
Since $a_n \geq 0$ and $c^n \to 0$, it follows that $\lim_n a_n = 0$.
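As an illustration of the constants in this proof, consider the sequence $a_n = n/2^n$, which is an example chosen here and not taken from the notes. Its ratio $\frac{a_{n+1}}{a_n} = \frac{n+1}{2n}$ tends to $r = \frac12$; with $c = \frac34$ one can take $N = 3$, since $\frac{n+1}{2n} < \frac34$ exactly when $n > 2$. The sketch checks the domination $a_n \leq K c^n$ exactly.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    """Hypothetical example: a_n = n / 2^n, with a_{n+1}/a_n -> r = 1/2."""
    return Fraction(n, 2 ** n)

c = Fraction(3, 4)   # any c with r < c < 1
N = 3                # a_{n+1}/a_n = (n+1)/(2n) < c for all n >= 3
K = a(N) / c ** N    # K = a_N c^{-N}

def dominated(n_max: int = 200) -> bool:
    """Check a_{n+1} < c a_n and a_n <= K c^n for N <= n <= n_max."""
    return all(a(n + 1) < c * a(n) and a(n) <= K * c ** n
               for n in range(N, n_max + 1))
```

Because $K c^n \to 0$ by example (2) of 21.7, the comparison theorem then gives $a_n \to 0$ for this example.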
It would be a good idea to write out the proof of uniqueness of limits in terms of the
distance function d , to see that the result is also valid in general metric space. (See problem
21.1.)
Another way to look at uniqueness of limits is via
21.9 Theorem (The Hausdorff Property). In any metric space, if $a \neq b$, there exist
neighbourhoods $U_a$ of $a$ and $U_b$ of $b$ with $U_a \cap U_b = \emptyset$.
Proof. Since $a \neq b$, $d(a, b) > 0$. Take $\delta$ any positive number $\leq d(a, b)/2$,
$U_a = B(a, \delta)$ and $U_b = B(b, \delta)$.
Then $U_a$ and $U_b$ are neighbourhoods of $a$ and $b$, respectively, and $U_a \cap U_b = \emptyset$. Indeed,
if $z \in U_a \cap U_b$, then $d(a, b) \leq d(a, z) + d(z, b) < \delta + \delta \leq d(a, b)$, an impossibility.
To prove uniqueness of limits, suppose $(x_n)$ converged to both $a$ and $b$ with $a \neq b$. By the Hausdorff property, there exist neighbourhoods $U_a$ of $a$ and $U_b$ of $b$ with $U_a \cap U_b = \emptyset$.
Since $x_n \to a$, we may choose $N_a$ such that
\[ x_n \in U_a, \quad \text{for } n > N_a. \]
Since $x_n \to b$, we may choose $N_b$ such that
\[ x_n \in U_b, \quad \text{for } n > N_b. \]
Now, take a particular $n > \max\{N_a, N_b\}$. Then,
\[ x_n \in U_a \cap U_b, \]
which is impossible.
Convergence to $+\infty$ and $-\infty$. For a sequence $(x_n)$ of real numbers, $\lim_n x_n = +\infty$
means for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for all $n > N$, $x_n > M$. Some
people refer to this as $(x_n)$ converges to $+\infty$ (in the extended real number system). Others
say $(x_n)$ diverges to $+\infty$, to emphasize that $(x_n)$ does not converge in the real number
system. Similarly, $\lim_n x_n = -\infty$ means for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that,
for all $n > N$, $x_n < M$. Here one says $(x_n)$ converges to $-\infty$ (in the extended real
number system), or $(x_n)$ diverges to $-\infty$.
21.10 Theorem. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}$, $a, b \in \overline{\mathbb{R}}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$, provided $a + b$ is defined.
(2) If $x_n \to a$ and $y_n \to b$, then $x_n y_n \to ab$, unless one of $a$ and $b$ is 0 and the other
is infinite.
(3) Let $x_n \to a$ and $y_n \to b$. If $y_n \neq 0$, for all $n \in \mathbb{N}$ and $b \neq 0$, then $x_n / y_n \to a/b$,
provided the latter is defined.
These results are left as exercises. It is best to write out explicit cases, for example if
$a > 0$ is real and $b = +\infty$, then $x_n y_n \to +\infty$, etc.
21.1. Write out the proof of uniqueness of limits in terms of the distance function d , to prove the result
in a general metric space.
21.2. The theorem on limits of sum, product, and quotients of sequences extend (with no change in
proof) to sequences of complex numbers.
21.3. A sequence $(a_n)$ of (non-zero) real numbers converges to 0 if and only if $1/|a_n| \to +\infty$.
21.4. Let .xn / and .yn / be sequences in R. Let xn ! 1 and yn ! c, where 0 < c 2 R. Prove
that xn yn ! 1. Find an improvement of this result,in which .yn / need not converge.
21.5. Let $(x_n)$ be a sequence of elements of $\mathbb{R}^m$; $x_n = (x_{n1}, \dots, x_{nm})$. Then, $(x_n)$ converges to
some $a = (a_1, \dots, a_m)$ if and only if $x_{nj} \to a_j$, for each $j = 1, \dots, m$.
21.6. The theorem on limits of sum, product, and quotients of sequences extends to $\mathbb{R}^m$ as follows:
Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}^m$, $(c_n)$ a sequence of reals. Let $a, b \in \mathbb{R}^m$ and $c \in \mathbb{R}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$.
(2) If $x_n \to a$ and $c_n \to c$ then $c_n x_n \to ca$ (product of a vector by a scalar).
(3) If $x_n \to a$ and $y_n \to b$, then $x_n \cdot y_n \to a \cdot b$ (dot product).
(4) Let $x_n \to a$ and $c_n \to c$. If $c_n \neq 0$, for all $n \in \mathbb{N}$ and $c \neq 0$, then $x_n / c_n \to a/c$.
21.7. A set $A$ in a metric space is bounded if there exists a ball $B(a, r)$ (about some $a$) which contains
$A$: $B(a, r) \supset A$. A sequence is bounded if its range is. Modify the proof for real numbers to
prove every convergent sequence in a metric space is bounded.
21.8. By definition, for a sequence $(x_n)$ in a metric space, $x_n \to a$ if and only if
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon. \]
Prove that $x_n \to a$ if and only if
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n \geq N,\ d(x_n, a) < 2\varepsilon. \]
Notice the changes to weak inequality ($n \geq N$).
21.9. Let $G$ be an open set and let $(x_n)$ be a sequence converging to $a \in G$. Prove that there exists
$N \in \mathbb{N}$ with $x_n \in G$, for all $n \geq N$.
21.10. Let $(a_n)$ and $(b_n)$ be sequences in a metric space $S$ such that $d(a_n, b_n) \to 0$. Then $(a_n)$
converges iff $(b_n)$ converges, and if they converge, they have the same limit.
21.11.
(a)
(b)
(c)
Prove or disprove, for sequences of real numbers:
If .xn / converges and .yn / diverges, then .xn C yn / diverges.
If .xn / diverges and .yn / diverges, then .xn yn / diverges.
If .xn / diverges and .yn / converges, then .xn yn / diverges.
21.12. Let $(x_n)$ be a sequence in $\mathbb{R}$ converging to $c$ and let $a_n = \frac{x_1 + \dots + x_n}{n}$. Then $a_n \to c$, also.
Give an example where $(a_n)$ converges, but $(x_n)$ does not.
21.13. (Connection with Linear Algebra) The set $c$ of convergent sequences $x = (x_n)$ in $\mathbb{R}$ is a
vector space and the map $T : c \to \mathbb{R}$ defined by $T(x) = \lim_n x_n$ is a linear functional. The
space $\mathbb{R}$ may be replaced by $\mathbb{C}$ or $\mathbb{R}^m$ or $\mathbb{C}^m$ in this statement.
22. EXISTENCE: MONOTONE SEQUENCES
In the definition and examples, we needed to know (or guess) the limit of a sequence in
order to prove it converged. Here and in a subsequent section, we will find conditions that
guarantee the existence of a limit, without knowing it in advance.
The first is the idea of a monotone sequence. A sequence $(x_n)$ of real numbers is called
increasing if
\[ x_n \leq x_{n+1}, \quad \text{for all } n \in \mathbb{N}; \]
it is called strictly increasing if
\[ x_n < x_{n+1}, \quad \text{for all } n \in \mathbb{N}; \]
it is called decreasing if
\[ x_n \geq x_{n+1}, \quad \text{for all } n \in \mathbb{N}; \]
and strictly decreasing if
\[ x_n > x_{n+1}, \quad \text{for all } n \in \mathbb{N}. \]
A sequence is called monotone if it is either increasing or decreasing and I guess you can
figure out what "strictly monotone" means.
Alternate terminology: Some people use 'increasing' for 'strictly increasing' and say 'non-decreasing'
for what is here called increasing. Be careful: a sequence which is not decreasing need not be non-decreasing! (E.g. our friend $((-1)^n)$.) Some authors use monotone increasing for increasing and
monotone decreasing for decreasing. Sometimes isotone and antitone are used.
Be sure which terminology a book you are looking at is using.
22.1 Monotone Convergence Theorem. Every bounded monotone sequence of real numbers converges.
Proof. We prove the increasing case and leave the decreasing case as an exercise.
Suppose $(x_n)$ is an increasing sequence which is bounded. In particular, $\{x_n : n \in \mathbb{N}\}$
is bounded above, so has a supremum. Let $a = \sup_{n \in \mathbb{N}} x_n$. We claim $(x_n)$ converges to $a$.
Indeed, fix $\varepsilon > 0$. By definition of supremum,
\[ x_n \leq a < a + \varepsilon, \quad \text{for all } n \in \mathbb{N}, \]
and since $a - \varepsilon < a$, there exists, and we choose, $N \in \mathbb{N}$ with
\[ a - \varepsilon < x_N. \]
But $(x_n)$ is increasing, so $x_n \geq x_N$, for all $n \geq N$. Thus, for all $n \geq N$,
\[ a - \varepsilon < x_n < a + \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ with $x_n$ in the ball $B(a, \varepsilon)$ for all $n > N$; in other
words, $x_n \to a$.
You will notice that what one really proves is
22.2 Corollary.
(1) each increasing sequence bounded above converges to its supremum.
(2) each decreasing sequence bounded below converges to its infimum.
Of course, an increasing sequence is bounded iff it is bounded above, and a decreasing
sequence is bounded iff it is bounded below. (why?)
22.3 Example. Let $(x_n)$ be the sequence defined by recursion by
\[ x_1 = 1, \qquad x_{n+1} = \sqrt{1 + x_n}, \quad \text{for } n \in \mathbb{N}. \]
We will see that this is a bounded increasing sequence. If we try a few values, it appears
that $x_n \leq 2$ for all $n$. So we try to prove that by induction. First, $x_1 = 1 < 2$. Now
suppose $x_n < 2$. Then
\[ x_{n+1} = \sqrt{1 + x_n} < \sqrt{1 + 2} = \sqrt3 < 2, \]
so by induction $x_n < 2$ for all $n \in \mathbb{N}$. To prove that $(x_n)$ is increasing, we also use
induction. We have to show
\[ x_n \leq x_{n+1}, \quad \text{for all } n \in \mathbb{N}. \]
This is true for $n = 1$, since $x_1 = 1 < \sqrt{1 + 1} = \sqrt{1 + x_1} = x_2$. Now supposing it true
for $n$ we have
\[ x_{n+1} = \sqrt{1 + x_n} \leq \sqrt{1 + x_{n+1}} = x_{n+1+1}. \]
Thus the result holds for all $n \in \mathbb{N}$.
Since $(x_n)$ is bounded and increasing we can apply the Monotone Convergence Theorem to yield that $(x_n)$ converges to some $a \in \mathbb{R}$.
To decide what the limit $a$ is we need that $\lim_n x_{n+1}$ is also $a$. (Proof?) Now, $x_{n+1} =
\sqrt{1 + x_n}$, so
\[ x_{n+1}^2 = 1 + x_n, \quad \text{for all } n. \]
Hence, using the limit theorems for sums and products, we have
\[ a^2 = \lim_n x_{n+1}^2 = 1 + \lim_n x_n = 1 + a. \]
Thus $a^2 = 1 + a$, and hence $a = (1 \pm \sqrt5)/2$. Also, $x_n \geq 0$ for all $n$, so in the
limit $a \geq 0$. (Inequalities are preserved under limits.) This excludes the possibility that
$a = (1 - \sqrt5)/2$; therefore $a = (1 + \sqrt5)/2$.
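The behaviour established in this example (increasing, bounded by 2, limit $(1 + \sqrt5)/2$) is easy to observe numerically. The following floating-point sketch is an illustration only; the helper name `iterate` and the choice of 40 terms are assumptions made here.

```python
import math

def iterate(steps: int) -> list:
    """Return [x_1, ..., x_steps] for x_1 = 1, x_{n+1} = sqrt(1 + x_n)."""
    xs = [1.0]
    for _ in range(steps - 1):
        xs.append(math.sqrt(1.0 + xs[-1]))
    return xs

xs = iterate(40)
golden = (1.0 + math.sqrt(5.0)) / 2.0   # the limit found above
```

The computed terms increase towards `golden` and never exceed 2, in line with the two induction arguments; the numerical evidence suggests the limit but the Monotone Convergence Theorem is what guarantees it exists.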
The above proof could have been shortened using the following result.
22.4 Lemma. If $(y_n)$ is a sequence of positive real numbers with $y_n \to b$, then $\sqrt{y_n} \to \sqrt{b}$.
The proof is an exercise, based on $\sqrt{y} - \sqrt{b} = \frac{y - b}{\sqrt{y} + \sqrt{b}}$.
22.5 Example (Finding 2). Notice that x 2 D 2 ” 2x 2 p
D x2 C 2 ” x D
2
.x C 2/=2x. Let us use this formula as motivation to try to find 2. Let x1 D 2, and for
each natural number n, let
x2 C 2
xnC1 D n
:
()
2xn
We will show that .xn / is a bounded monotone sequence.
For all a; b, a2 C b 2 2ab, since a2 2ab C b 2 D .a b/2 0. Thus, for x > 0,
p
x 2 C . 2/2 p
2;
()
2x
p
p
p
so that, for all n, xnC1 p2. Since also x1 D 2 > 2 this shows xn 2, for all n, so
.xn / is bounded below by 2.
INTRODUCTION TO ANALYSIS
Now, for all n, we have
91
xn2 C 2
xn
2xn
”
xn2 2xn2
xnC1 D
”
xn2 2;
which we have just checked is true.
p We have shown that .xn / is a decreasing sequence, which by ./ is bounded below by
2. Thus xn converges to some x. Then taking a limit on both sides of ./ yields
x2 C 2
:
2x
p
p
The only positive solution to this equation is x D 2, so that xn ! 2.
Note that, if we had started p
with x1 any other positive number, we would still reach the
same conclusion. If 0 < x1 < 2, the sequence is not monotone, but becomes so after the
first step and changing a finite number of terms does not affect convergence or divergence
(or the value of the limit). What happens if x1 < 0?
xD
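The iteration of Example 22.5 can be tried out directly. The sketch below is an illustration under assumptions made here (the helper name `sqrt2_iteration`, the number of steps, and the second starting value 0.5 are not from the notes); it runs the recursion in floating point from $x_1 = 2$ and from a starting value below $\sqrt{2}$.

```python
import math

def sqrt2_iteration(x1, steps):
    """Return [x_1, ..., x_steps] for x_{n+1} = (x_n^2 + 2) / (2 x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        x = xs[-1]
        xs.append((x * x + 2.0) / (2.0 * x))
    return xs

from_two = sqrt2_iteration(2.0, 8)   # the starting value used in the example
from_half = sqrt2_iteration(0.5, 8)  # a start with 0 < x_1 < sqrt(2)

print(from_two)
print(from_half)
print(math.sqrt(2.0))
```

With the start 0.5 the second term jumps above $\sqrt{2}$ and the sequence decreases from there on, which is the behaviour described in the closing remark of the example.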
The Monotone Convergence Theorem can be extended to the unbounded case using the concept of infinite limits.
22.6 Theorem.
(a) If $(x_n)$ is unbounded and increasing, then $\lim_n x_n = +\infty$.
(b) If $(x_n)$ is unbounded and decreasing, then $\lim_n x_n = -\infty$.
Proof. (a) Let $(x_n)$ be increasing and unbounded. Recall that, by definition, $\lim_n x_n = +\infty$ means for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for all $n > N$, $x_n > M$.
So let $M \in \mathbb{R}$ be given. Since $(x_n)$ is increasing, it is bounded below by $x_1$. But $(x_n)$ is unbounded, so it can't be bounded above. Thus, there must be an $N \in \mathbb{N}$ with $x_N > M$. Now, as in the case of finite limits, for $n \ge N$, we have $x_n \ge x_N$, since the sequence is increasing. Thus, for all $n \ge N$, $x_n > M$, as required.
The proof of (b) is similar.
In the extended real number system $\overline{\mathbb{R}}$, $+\infty$ is considered the supremum of any set which is not bounded above, and $-\infty$ is the infimum of any set which is not bounded below. Thus, in general,
22.7 Corollary.
(1) Each increasing sequence of real numbers converges to its supremum, possibly $+\infty$.
(2) Each decreasing sequence of real numbers converges to its infimum, possibly $-\infty$.
22.1. For the following sequences $(x_n)$ of reals, use the monotone convergence theorem to decide whether they converge and if so find the limit.
(a) $x_n = 5^{1/n}$.
(b) $x_1 = 1$ and $x_{n+1} = \frac{1}{4}(x_n + 5)$, for all $n \in \mathbb{N}$.
(c) $x_1 = k > 0$, $x_{n+1} = k/(1 + x_n)$, for all $n$. (Careful. This requires a little more.)
22.2. Let $x_1 = 1$ and for all $n$, $x_{n+1} = x_n + 1/x_n$. Prove $(x_n)$ is unbounded.
22.3. Let $(x_n)$ be a sequence of real numbers and $a_n = \max\{x_1, \dots, x_n\}$, for all $n$. Prove $(a_n)$ converges to $\sup\{x_n : n \in \mathbb{N}\}$. The convergence is in $\mathbb{R}$ if the sequence is bounded and in $\overline{\mathbb{R}}$ otherwise.
23. CLUSTER POINTS AND SUBSEQUENCES: THE BOLZANO-WEIERSTRASS THEOREM (SEQUENCE FORM)
A point $c$ is called a cluster point of $(x_n)$ if for each $\varepsilon > 0$, for all $n \in \mathbb{N}$, there exists $m > n$ such that $x_m \in B(c, \varepsilon)$. Thus, the set $\{n \in \mathbb{N} : x_n \in B(c, \varepsilon)\}$ is infinite. In other words, $c$ is a cluster point of $(x_n)$ if each neighbourhood of $c$ contains $x_n$ for infinitely many $n$. We also say $(x_n)$ clusters at $c$.
23.1 Lemma. If $(x_n)$ converges to $c$ then $(x_n)$ clusters at $c$.
23.2 Examples.
(a) Let $x_n = (-1)^n$. Then, $1$ and $-1$ are cluster points of $(x_n)$. Are there any others?
(b) Let $x_n = 1 + (-1)^n + 1/n$. Then, $2$ and $0$ are the only two cluster points of $(x_n)$.
(c) Let $x_n = n + (-1)^n n$. Then $(x_n)$ has only one cluster point, yet does not converge.
Subsequences. The reader will have noticed that in examples where there is more than one cluster point, the sequence seems to 'converge' to that point over a subset of the indices.
If $(x_n)$ is a sequence and $(n_k)$ is a strictly increasing sequence of natural numbers
\[ n_1 < n_2 < \dots < n_k < n_{k+1} < \cdots, \]
then the sequence $(y_k)$ defined by $y_k = x_{n_k}$, for all $k \in \mathbb{N}$, is called a subsequence of $(x_n)$.
In the above examples, if we take $n_k = 2k$, we obtain a subsequence $(y_k) = (x_{2k})$. In particular, for example (b), $x_{2k} = 1 + (-1)^{2k} + 1/(2k) = 2 + 1/(2k)$, so the subsequence $(y_k)$ is just $(2 + 1/(2k))$, which clearly converges to the cluster point $2$. Another subsequence, $(x_{2k-1})$, picks out the terms of $(x_n)$ which have odd indices. We see that $(x_{2k-1}) = (1/(2k-1))$, which converges to the other cluster point $0$.
It is important to realize that a sequence $(x_n)$ really stands for a function $f : n \in \mathbb{N} \mapsto x_n$. To construct a subsequence, we compose this with a map $g : k \in \mathbb{N} \mapsto n_k$, obtaining a new function whose value at $k$ is $f(g(k)) = x_{n_k}$. If the function $g$ is $k \mapsto 2k$, the resulting function (subsequence) becomes $(x_{2k})$. Since $k$ is just a "dummy variable" that runs through all the natural numbers, it can be replaced by any other letter, so $(x_{2n})$ denotes the same subsequence.
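The view of a subsequence as a composition $f \circ g$ can be made concrete. The following sketch is only an illustration (the names `x`, `even_terms` and `odd_terms` are chosen here); it uses exact rational arithmetic for the sequence $x_n = 1 + (-1)^n + 1/n$ of Example 23.2(b) and extracts the two subsequences discussed above.

```python
from fractions import Fraction

def x(n):
    """The sequence x_n = 1 + (-1)^n + 1/n, as an exact rational number."""
    return 1 + (-1) ** n + Fraction(1, n)

# Composing x with k -> 2k and k -> 2k - 1 gives the two subsequences.
even_terms = [x(2 * k) for k in range(1, 6)]      # equals 2 + 1/(2k)
odd_terms = [x(2 * k - 1) for k in range(1, 6)]   # equals 1/(2k - 1)

print([str(t) for t in even_terms])
print([str(t) for t in odd_terms])
```

The even-indexed terms approach the cluster point 2 and the odd-indexed terms approach the cluster point 0.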
The following lemma is convenient to keep in mind when working with subsequences. The proof is an easy induction.
23.3 Lemma. If $(n_k)$ is a strictly increasing sequence of natural numbers, then $n_k \ge k$, for all $k$.
23.4 Theorem. A sequence $(x_n)$ clusters at $a$ if and only if it has a subsequence which converges to $a$.
Proof. Let $(x_n)$ cluster at $a$. From the definition, for each $n \in \mathbb{N}$, and for each $\varepsilon > 0$, there exists $m > n$ (which depends on both $n$ and $\varepsilon$), with $x_m \in B(a, \varepsilon)$. We define a subsequence recursively as follows.
First take $n = 1$ and $\varepsilon = 1$ and obtain an $n_1 > 1$ with
\[ x_{n_1} \in B(a, 1). \]
Then take $n = n_1$ and $\varepsilon = 1/2$ and obtain an $n_2 > n_1$ with
\[ x_{n_2} \in B(a, 1/2). \]
In general, if $n_k$ has been chosen, we choose $n_{k+1} > n_k$ with
\[ x_{n_{k+1}} \in B\left(a, \tfrac{1}{k+1}\right). \]
We then have $x_{n_k} \in B(a, 1/k)$ for all $k \in \mathbb{N}$. The resulting subsequence $(x_{n_k})$ converges to $a$.
Indeed, let $\varepsilon > 0$. Then there exists $K$ with $\frac{1}{K} < \varepsilon$. For $k > K$ we have $\frac{1}{k} < \varepsilon$ and $x_{n_k} \in B(a, 1/k) \subset B(a, \varepsilon)$, as required.
For the converse, let $(x_{n_k})$ be a subsequence of $(x_n)$ which converges to $a$. Then $(n_k)$ is a strictly increasing sequence of natural numbers, so that $n_k \ge k$, for all $k$. Fix $\varepsilon > 0$ and $N \in \mathbb{N}$. Since $x_{n_k} \to a$, there exists $K$ such that for $k > K$, $x_{n_k} \in B(a, \varepsilon)$. Choose $k$ to be any natural number $> \max\{N, K\}$. Then, $n_k \ge k > N$ and $x_{n_k} \in B(a, \varepsilon)$. Thus,
\[ \forall \varepsilon > 0, \ \forall N \in \mathbb{N}, \ \exists m > N \text{ with } x_m \in B(a, \varepsilon). \]
Thus we have seen that the notions of cluster point and subsequential limit coincide
in metric spaces.
Now, let us restrict to the real number system.
23.5 Theorem. Every sequence of real numbers has a monotone subsequence.
Proof. Let $(x_n)$ be a sequence of real numbers. Call $n$ a dominant index if $x_n \ge x_m$, for all $m \ge n$. There are 2 cases. Either the set $D$ of dominant indices is infinite or it is finite.
If $D$ is infinite, choose a sequence $(n_k)$ in $D$ with $n_k < n_{k+1}$, for all $k \in \mathbb{N}$. Then, $x_{n_k} \ge x_{n_{k+1}}$, for all $k$, so the subsequence $(x_{n_k})$ is decreasing.
If $D$ is finite, then for all $n > \max D$ (any $n$ at all, if $D$ is empty), there exists $m > n$ with $x_m > x_n$. In this case, we can let $n_1 > \max D$ be arbitrary, and for each $k$ choose $n_{k+1} > n_k$ with $x_{n_{k+1}} > x_{n_k}$, obtaining a strictly increasing subsequence $(x_{n_k})$.
23.6 Bolzano-Weierstrass Theorem (Sequence form).
(1) Every bounded sequence of real numbers has a cluster point.
(2) Equivalently, every bounded sequence of real numbers has a convergent subsequence.
Proof. Let $(x_n)$ be a bounded sequence in $\mathbb{R}$. Then, $(x_n)$ has a monotone subsequence. But every bounded monotone sequence converges, so $(x_n)$ has a convergent subsequence.
This result extends to Euclidean space $\mathbb{R}^m$.
23.7 Bolzano-Weierstrass Theorem (Sequence form in $\mathbb{R}^m$).
(1) Every bounded sequence in $\mathbb{R}^m$ has a cluster point.
(2) Equivalently, every bounded sequence in $\mathbb{R}^m$ has a convergent subsequence.
Proof. This can be deduced from the corresponding set form, but we prove it here by repeatedly applying the real case.
Let $(x_n)$ be a bounded sequence in $\mathbb{R}^m$. Each $x_n$ is an $m$-tuple, $x_n = (x_{n,1}, x_{n,2}, \dots, x_{n,m})$. Since $(x_n)$ is bounded, so is each of the real sequences $(x_{n,i})_n$. First, the sequence $(x_n)$ has a subsequence $(x^{(1)}_k) = (x_{n_k})$ for which the sequence of first coordinates converges. Indeed, $(x_{n,1})_n$ is a bounded sequence of real numbers, so has a subsequence which converges. In this same way, the new sequence $(x^{(1)}_k)$ has a subsequence $(x^{(2)}_j)$ for which the sequence of second coordinates converges. But that sequence $(x^{(2)}_j)$ is also a subsequence of $(x^{(1)}_k)$, so its first coordinates also converge. If we continue in this way, in $m$ steps we reach a subsequence of $(x_n)$ which converges in all coordinates, so converges in $\mathbb{R}^m$.
23.1. If a sequence $(x_n)$ in a metric space does not converge to a point $a$, then there is a neighbourhood $U$ of $a$ and a subsequence $(x_{n_k})$, none of whose terms are in $U$, so that it does not cluster at $a$.
23.2. A bounded sequence in $\mathbb{R}^m$ converges if and only if it has exactly one cluster point. (Use problem 23.1.)
23.3. Prove the sequence form of the Bolzano-Weierstrass theorem in $\mathbb{R}$ by proving directly that for a bounded sequence $(x_n)$, $\sup\{t : x_n \ge t \text{ for infinitely many } n\}$ exists in $\mathbb{R}$ and is a cluster point of $(x_n)$. (This is actually $\limsup_n x_n$, but we need not refer to that concept.)
23.4. Mimic the proof of the set form of the Bolzano-Weierstrass theorem to prove the sequence form.
23.5. From the set form of the Bolzano-Weierstrass theorem, deduce the sequence form. (Be careful: the range of a sequence could be finite.)
23.6. From the sequence form of the Bolzano-Weierstrass theorem, deduce the set form.
23.7. Create an example of a sequence of real numbers that:
(a) is not convergent, but has exactly one cluster point;
(b) has exactly 5 cluster points, but all terms are distinct;
(c) has an infinite number of cluster points;
(d) is not monotone, yet has limit $+\infty$.
23.8. Let $(a_n)$ and $(b_n)$ be sequences in a metric space $S$ such that $d(a_n, b_n) \to 0$. Then $(a_n)$ and $(b_n)$ have the same cluster points.
24. EXISTENCE: CAUCHY SEQUENCES
A sequence $(x_n)$ in a metric space $(S, d)$ is called a Cauchy sequence if for every $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n, m > N$, $d(x_n, x_m) < \varepsilon$.
We see immediately that
24.1 Lemma. Every convergent sequence is Cauchy.
Proof. Suppose $(x_n)$ converges to $a$. Let $\varepsilon > 0$. By definition, there exists $N$ such that for all $n > N$, $d(x_n, a) < \varepsilon/2$. But then for $n, m > N$,
\[ d(x_n, x_m) \le d(x_n, a) + d(a, x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
as required.
The remarkable thing is that we will be able to prove that in R and Rm every Cauchy
sequence converges, which will give us a second way of getting the existence of a limit
without knowing it in advance.
24.2 Theorem. If a Cauchy sequence clusters at $a$, then it converges to $a$.
Proof. Let $(x_n)$ be a Cauchy sequence with cluster point $a$. We will show that $(x_n)$ also converges to $a$.
Let $\varepsilon > 0$. Since $(x_n)$ is Cauchy, there exists $N$ such that
\[ \text{for all } n, m > N, \quad d(x_n, x_m) < \varepsilon/2. \]
Fix such an $N$ and let $n > N$. Since $(x_n)$ clusters at $a$, we may choose $m > N$ such that $d(x_m, a) < \varepsilon/2$. Thus, $d(x_n, a) \le d(x_n, x_m) + d(x_m, a) < \varepsilon$.
We have shown, then, that for all $n > N$, $d(x_n, a) < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $x_n \to a$.
24.3 Theorem. Every Cauchy sequence in a metric space is bounded.
Proof. Let $(x_n)$ be a Cauchy sequence. Let $a$ be a fixed point of the space. Apply the definition with $\varepsilon = 1$ to obtain $N$ such that for $n, m > N$,
\[ d(x_n, x_m) < 1. \]
In particular, taking $m = N + 1$,
\[ d(x_n, a) \le d(x_n, x_{N+1}) + d(x_{N+1}, a) < 1 + d(x_{N+1}, a), \quad \text{for all } n > N + 1. \]
So we put $M = \max\{d(x_1, a), \dots, d(x_N, a), 1 + d(x_{N+1}, a)\}$. Then,
\[ d(x_n, a) \le M, \quad \text{for all } n \in \mathbb{N}. \]
Therefore, $(x_n)$ is a bounded sequence.
24.4 Theorem (Cauchy Criterion). Every Cauchy sequence in $\mathbb{R}$ or $\mathbb{R}^m$ converges.
Proof. Let $(x_n)$ be a Cauchy sequence in $\mathbb{R}^m$. Then $(x_n)$ is bounded, by the above theorem, so has a cluster point by the Bolzano-Weierstrass theorem. Thus, $(x_n)$ converges to that cluster point.
You will have noticed that the completeness of the reals was the essential ingredient in this proof. (This was what caused cluster points to exist.) The fact that the Cauchy criterion holds for sequences in $\mathbb{R}^m$ is called metric completeness of $\mathbb{R}^m$.
A metric space $(S, d)$ is called a complete metric space if every Cauchy sequence converges.
The concept of Cauchy sequence is at the heart of the theory of convergence of series. See section 27, SERIES OF NUMBERS.
24.1. If $(x_n)$ is a sequence in a metric space such that for every $\varepsilon > 0$, there exists $N$ such that for all $n > N$, $d(x_n, x_N) < \varepsilon$, then $(x_n)$ is Cauchy.
24.2. A sequence $(x_n)$ in a metric space is called contractive if there exists $k \in (0, 1)$ such that $d(x_{n+2}, x_{n+1}) \le k\, d(x_{n+1}, x_n)$, for all $n$. Every contractive sequence is Cauchy.
24.3. Let $a_1 = 1$, and for each $n$, put $a_{n+1} = 1 + \dfrac{1}{1 + a_n}$. Prove that $(a_n)$ is a Cauchy sequence, so converges. What is the limit? (Suggestion: prove the sequence is contractive.)
24.4. Let $(a_n)$ be a sequence in a metric space and for each $n \in \mathbb{N}$, let $A_n = \{a_k : k \ge n\}$. Then $(a_n)$ is a Cauchy sequence iff $\operatorname{diam} A_n \to 0$.
24.5. A metric space is complete iff whenever $(K_n)$ is a sequence of non-empty closed sets with $K_n \supset K_{n+1}$, for all $n \in \mathbb{N}$, with $\operatorname{diam}(K_n) \to 0$, then $\bigcap_{n \in \mathbb{N}} K_n$ consists of a single point.
25. THE NUMBER e, AN APPLICATION OF MONOTONE CONVERGENCE
Here we use the Monotone Convergence Theorem for the real numbers to prove that
\[ \lim_n \left(1 + \frac{1}{n}\right)^n \]
exists. You know this from Calculus as the number $e$. Once we show it exists, we will make this the definition of $e$.
Let
\[ a_n = \left(1 + \frac{1}{n}\right)^n. \]
Of course, $a_n \ge 0$. We will show that $(a_n)$ is bounded above and is increasing, so that it will converge by the Monotone Convergence Theorem.
First, use the Binomial Theorem to obtain
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{n}\right)^k. \]
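Boundedness. Now,
\[ \binom{n}{k} \left(\frac{1}{n}\right)^k = 1 \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \frac{1}{k!}. \tag{$*$} \]
Since each of the $\frac{i}{n} \ge 0$, this is $\le \frac{1}{k!}$, hence
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{n}\right)^k \le \sum_{k=0}^{n} \frac{1}{k!}. \]
Now, $a_1 = 2$ and for $n \ge 2$, using $k! \ge k(k-1)$ for $k \ge 2$,
\[ a_n \le 1 + 1 + \sum_{k=2}^{n} \frac{1}{k(k-1)} = 2 + \sum_{k=2}^{n} \left(\frac{1}{k-1} - \frac{1}{k}\right) = 2 + 1 - \frac{1}{n} < 1 + 2 = 3. \]
Thus, $(a_n)$ is bounded above by 3 (and below by 0).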
Increasing. To show $(a_n)$ is actually increasing, look at the expressions for $a_n$ and $a_{n+1}$:
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{n}\right)^k, \]
\[ a_{n+1} = \sum_{k=0}^{n} \binom{n+1}{k} \left(\frac{1}{n+1}\right)^k + \frac{1}{(n+1)^{n+1}}. \]
So $a_n \le a_{n+1}$, provided for each $k = 0, \dots, n$,
\[ \binom{n}{k} \left(\frac{1}{n}\right)^k \le \binom{n+1}{k} \left(\frac{1}{n+1}\right)^k. \]
But, the left side here is
\[ 1 \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \frac{1}{k!} \]
and the right side is obtained from it by replacing $n$ by $n + 1$, which makes it larger, since $\left(1 - \frac{i}{n}\right) < \left(1 - \frac{i}{n+1}\right)$.
This proves $(a_n)$ is a bounded monotone sequence of real numbers, so it converges. As we said, the limit is called $e$.
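The two properties just proved can be checked on finitely many terms with exact rational arithmetic. This is a small sketch under assumptions made here (the function name `a` mirrors the notation $a_n$, and the range of 30 terms is arbitrary); it is not a substitute for the proof.

```python
from fractions import Fraction

def a(n):
    """a_n = (1 + 1/n)^n as an exact rational number."""
    return (1 + Fraction(1, n)) ** n

terms = [a(n) for n in range(1, 31)]

# Exact comparisons, so no rounding error enters the checks.
strictly_increasing = all(s < t for s, t in zip(terms, terms[1:]))
bounded_by_three = all(0 <= t < 3 for t in terms)

print(strictly_increasing, bounded_by_three)
print(float(terms[0]), float(terms[-1]))
```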
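Connection with the series $\sum_k \frac{1}{k!}$. We noticed above that
\[ a_n = \left(1 + \frac{1}{n}\right)^n \le \sum_{k=0}^{n} \frac{1}{k!}. \]
And, if we let $b_n$ stand for the right side of this inequality,
\[ b_n \le 3. \]
Clearly, the sequence $(b_n)$ is increasing and bounded, so it too has a limit, denoted $\sum_{k=0}^{\infty} \frac{1}{k!}$. Let us temporarily call this limit $E$. Since
\[ a_n \le b_n, \]
in the limit we have
\[ e \le E. \]
But look at $(*)$ again:
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{n}\right)^k = \sum_{k=0}^{n} 1 \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \frac{1}{k!}. \]
Fix $m \in \mathbb{N}$. Then, for $n \ge m$,
\[ a_n \ge \sum_{k=0}^{m} 1 \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \frac{1}{k!}. \]
But, this is a finite sum, and each of the $\frac{i}{n}$ converges to 0 as $n$ runs, so taking limits of both sides gives
\[ e \ge \sum_{k=0}^{m} \frac{1}{k!} = b_m. \]
Again, inequalities are preserved in the limit, so
\[ e \ge \lim_m b_m = E. \]
Since we had $e \le E$ before, this gives $e = E$. That is,
\[ \lim_n \left(1 + \frac{1}{n}\right)^n = \sum_{k=0}^{\infty} \frac{1}{k!}. \]
(See section 27, SERIES OF NUMBERS.)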
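The relation $a_n \le b_n \le e$ and the very different speeds at which the two sequences approach $e$ can be seen numerically. The sketch below is an illustration only; the helper names `a` and `b` follow the notation of this section, and comparing against `math.e` assumes that Python's built-in constant agrees with the number $e$ defined here.

```python
import math
from fractions import Fraction

def a(n):
    """a_n = (1 + 1/n)^n, computed exactly."""
    return (1 + Fraction(1, n)) ** n

def b(n):
    """b_n = sum_{k=0}^{n} 1/k!, computed exactly."""
    return sum(Fraction(1, math.factorial(k)) for k in range(n + 1))

a_below_b = all(a(n) <= b(n) for n in range(1, 21))
gap_a = math.e - float(a(1000))  # still visible after 1000 terms
gap_b = math.e - float(b(15))    # already at rounding level after 15 terms

print(a_below_b, gap_a, gap_b)
```

The partial sums $b_n$ are far better suited to computing $e$ than the defining sequence $a_n$.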
Convergence of $((1 + 2/n)^n)$.
A slight modification of the proof that $a_n = (1 + 1/n)^n$ increases with $n$ shows that $(1 + 2/n)^n$ also increases with $n$, so converges in $\mathbb{R}$, provided this sequence is bounded above. But,
\[ \left(1 + \frac{2}{n}\right)^n \le \left(1 + \frac{2}{2n}\right)^{2n} = a_n^2 \le e^2, \]
so it is indeed bounded above. Let $c_n = \left(1 + \frac{2}{n}\right)^n$ and let $w = \lim_n c_n$. Since $(c_{2n})$ is a subsequence of $(c_n)$ it must have the same limit. Thus
\[ w = \lim_n \left(1 + \frac{2}{2n}\right)^{2n} = \lim_n a_n^2 = e^2. \]
More about the number $e$ will be found in section 27, SERIES OF NUMBERS.
25.1. For each $p \in \mathbb{N}$, the sequence $(y_n)$, where $y_n = (1 + p/n)^n$, is increasing, bounded above, and converges to $e^p$.
25.2. Let $r > 0$ be rational; say $r = p/q$ with $p, q \in \mathbb{N}$. Let $w_n = (1 + r/n)^n$. Then $(w_n)$ is increasing. By looking at $w_n^q$, one can see that $(w_n)$ is bounded above by $e^r$. If $z$ is its limit, $z^q = \lim_n w_n^q = e^p$, so $z = e^{p/q} = e^r$.
25.3. Using the fact that $e^x$ is the unique number such that $e^r \le e^x \le e^s$, for all rational $r, s$ with $r < x < s$, show that for $0 < x \in \mathbb{R}$, $(1 + x/n)^n \to e^x$.
25.4. For $p \in \mathbb{N}$, $(1 - p/n)^n = \left(\frac{n-p}{n}\right)^n = 1/\left(1 + \frac{p}{n-p}\right)^n$, for $n > p$. Here the denominator decreases in $n$, and the quotient has limit $1/e^p = e^{-p}$. Arguing with subsequences, obtain for rational $r < 0$, $(1 + r/n)^n \to e^r$ and finally that $(1 + x/n)^n \to e^x$, for all negative real $x$, so that in fact this holds for all $x \in \mathbb{R}$.
25.5. Let $a_n = (1 + 1/n)^n$, $b_n = a_n(1 + 1/n) = (1 + 1/n)^{n+1}$. Then $a_n \le b_n$, $(a_n)$ is increasing and $(b_n)$ is decreasing, so both converge and to the same limit, namely $e$.
Outline: Using Bernoulli's inequality prove that for $n > 1$,
\[ \left(\frac{1 + 1/n}{1 + 1/(n-1)}\right)^n = \left(\frac{n^2 - 1}{n^2}\right)^n \]
is between $1 - 1/n$ and $1/(1 + 1/n)$, from which $a_n/a_{n-1} \ge 1$ and $b_n/b_{n-1} \le 1$.
26. LIMIT INFERIOR AND LIMIT SUPERIOR
If $(x_n)$ is a bounded sequence there are two important monotone sequences of real numbers associated with it. If we put, for each $n \in \mathbb{N}$,
\[ a_n = \inf_{k \ge n} x_k \quad \text{and} \quad b_n = \sup_{k \ge n} x_k, \]
then $(a_n)$ is an increasing sequence and $(b_n)$ is a decreasing sequence. Indeed, for each $n \in \mathbb{N}$,
\[ \{x_k : k \ge n\} \supset \{x_k : k \ge n + 1\}, \]
so
\[ a_n = \inf\{x_k : k \ge n\} \le \inf\{x_k : k \ge n + 1\} = a_{n+1} \]
and
\[ b_n = \sup\{x_k : k \ge n\} \ge \sup\{x_k : k \ge n + 1\} = b_{n+1}. \]
(The infima and suprema here exist in $\mathbb{R}$ because the sequences are bounded both below and above.)
Now, if $(x_n)$ is bounded, we see that $(a_n)$ is also bounded, and since it is increasing it converges to some $a$. In fact, it converges to $a = \sup_n a_n$.
This $a$ is called the limit inferior of $(x_n)$, written $\liminf_n x_n$. Thus,
\[ \liminf_n x_n = \lim_n \inf_{k \ge n} x_k = \sup_n \inf_{k \ge n} x_k. \]
Similarly, $(b_n)$ converges to $b = \inf_n b_n$, which is called the limit superior of $(x_n)$, written $b = \limsup_n x_n$:
\[ \limsup_n x_n = \lim_n \sup_{k \ge n} x_k = \inf_n \sup_{k \ge n} x_k. \]
Other notation for these are: $\underline{\lim}_n\, x_n$ for $\liminf_n x_n$ and $\overline{\lim}_n\, x_n$ for $\limsup_n x_n$.
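26.1 Example. Let $x_n = 1 + (-1)^n + \dfrac{1}{2^n}$. Then for each $n$,
\[ x_n = 2 + \frac{1}{2^n} \ \text{ if } n \text{ is even, and } \quad x_n = \frac{1}{2^n} \ \text{ if } n \text{ is odd.} \]
Thus, if $b_n = \sup_{k \ge n} x_k$, then
\[ b_n = \begin{cases} 2 + \dfrac{1}{2^n}, & \text{if } n \text{ is even,} \\[1ex] 2 + \dfrac{1}{2^{n+1}}, & \text{if } n \text{ is odd;} \end{cases} \]
hence,
\[ \limsup_n x_n = \lim_n b_n = 2. \]
On the other hand, we find that $\inf_{k \ge n} x_k = 0$, for all $n$, so $\liminf_n x_n = 0$.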
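The formula for $b_n$ in Example 26.1 can be checked on a truncated tail. The sketch below rests on an assumption stated here: the supremum over all $k \ge n$ is replaced by a maximum over $n \le k \le$ `HORIZON`, a hypothetical cutoff. For this particular sequence the truncation does not change the supremum, because the largest term of each tail is its first even-indexed term; it does change the infimum, which is never attained.

```python
from fractions import Fraction

HORIZON = 60  # hypothetical cutoff standing in for "all k >= n"

def x(n):
    """x_n = 1 + (-1)^n + 1/2^n as an exact rational number."""
    return 1 + (-1) ** n + Fraction(1, 2 ** n)

def tail_sup(n):
    """max of x_k over n <= k <= HORIZON (equals sup over k >= n here)."""
    return max(x(k) for k in range(n, HORIZON + 1))

def tail_min(n):
    """min of x_k over n <= k <= HORIZON (only an upper bound for the inf)."""
    return min(x(k) for k in range(n, HORIZON + 1))

for n in range(1, 7):
    print(n, tail_sup(n), float(tail_min(n)))
```

The truncated minima are positive but tiny, consistent with $\inf_{k \ge n} x_k = 0$ not being attained by any term.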
The following theorem gives characterizations of limit superior and limit inferior. These
are often taken as the definitions of the concepts.
26.2 Theorem. For a bounded sequence $(x_n)$ of real numbers,
(1) $\liminf_n x_n = a$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k > a - \varepsilon$, for all $k \ge n$, and
(ii) for all $n$, there exists $k \ge n$ with $x_k < a + \varepsilon$.
(2) $\limsup_n x_n = b$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k < b + \varepsilon$, for all $k \ge n$, and
(ii) for all $n$ there exists $k \ge n$ with $x_k > b - \varepsilon$.
Proof. We do the case of lim sup and leave the lim inf case as an exercise. Let $b = \limsup_n x_n$. Then $b = \inf_n b_n$, where $b_n = \sup_{k \ge n} x_k$.
(i) Let $\varepsilon > 0$. Since $b = \inf_n b_n$, we may choose $n$ with
\[ b_n < b + \varepsilon. \]
That is,
\[ \sup_{k \ge n} x_k < b + \varepsilon. \]
But if the supremum of a set is $<$ some number, each member of the set is also $<$ that number, so
\[ x_k < b + \varepsilon, \quad \text{for all } k \ge n. \]
Thus, for each $\varepsilon > 0$, there exists $n$, such that $x_k < b + \varepsilon$, for all $k \ge n$.
(ii) Again let $\varepsilon > 0$. Then $b - \varepsilon < b = \inf_n b_n$, so if we fix an arbitrary $n$,
\[ b - \varepsilon < b_n. \]
But, $b_n = \sup_{k \ge n} x_k$, and a number less than a least upper bound is no longer an upper bound, so there exists $k \ge n$ with
\[ b - \varepsilon < x_k. \]
Thus, we have shown that for each $\varepsilon > 0$, and each $n \in \mathbb{N}$, there exists $k \ge n$ with $x_k > b - \varepsilon$, completing the proof that the limit superior satisfies the two properties.
Conversely, suppose (i) and (ii) hold. Let $\varepsilon > 0$. Then, by (i), we may choose $n$ such that
\[ x_k < b + \varepsilon, \quad \text{for all } k \ge n. \]
Taking supremum over all the $k \ge n$, we obtain
\[ \sup_{k \ge n} x_k \le b + \varepsilon. \]
(Be careful, this is $\le$, not $<$!)
Thus, there exists $n$ such that
\[ \sup_{k \ge n} x_k \le b + \varepsilon. \]
Hence, taking infimum (or limit),
\[ \limsup_n x_n = \inf_n \sup_{k \ge n} x_k \le b + \varepsilon. \]
Since this inequality is true for arbitrary $\varepsilon > 0$,
\[ \limsup_n x_n \le b. \]
Again, let $\varepsilon > 0$ and fix $n \in \mathbb{N}$. Then, by (ii), there exists $k \ge n$ with $x_k > b - \varepsilon$, so that
\[ \sup_{k \ge n} x_k > b - \varepsilon. \]
But then, since $n$ was arbitrary, we may take infimum and get
\[ \limsup_n x_n = \inf_n \sup_{k \ge n} x_k \ge b - \varepsilon. \]
Finally, since $\varepsilon$ was arbitrary, we get
\[ \limsup_n x_n \ge b. \]
Thus, $\limsup_n x_n \le b$ and $\limsup_n x_n \ge b$, so we have equality.
26.3 Note. Remember that the variables $k$ and $n$ in the definitions of limit inferior and limit superior and in the above theorem are dummy variables, so may be replaced by any others. Thus, $\limsup_n x_n$ is also $\lim_N \sup_{n \ge N} x_n = \inf_N \sup_{n \ge N} x_n$. And, if $b = \limsup_n x_n$ and $\varepsilon > 0$, then
(i) there exists $N$ such that $x_n < b + \varepsilon$, for all $n \ge N$;
(ii) for all $N$, there exists $n \ge N$ with $x_n > b - \varepsilon$.
Here, (i) can be summarized by saying $x_n < b + \varepsilon$ for all except a finite number of indices $n$, and (ii) by saying that $x_n > b - \varepsilon$ for an infinite number of $n$.
"All but finitely many terms are $< b + \varepsilon$ and infinitely many terms are $> b - \varepsilon$."
26.4 Theorem. Let $(x_n)$ be a bounded sequence of real numbers. Then $\liminf_n x_n$ and $\limsup_n x_n$ are each cluster points of $(x_n)$.
Proof. (liminf case) This follows from the characterization in theorem 26.2:
$\liminf_n x_n = a$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k > a - \varepsilon$, for all $k \ge n$, and
(ii) for all $n$, there exists $k \ge n$ with $x_k < a + \varepsilon$.
To see this, let $a = \liminf_n x_n$. By condition (i) all but finitely many terms $x_k$ are $> a - \varepsilon$ and by (ii), infinitely many are $< a + \varepsilon$. So, together, infinitely many terms are in $(a - \varepsilon, a + \varepsilon)$, showing that $a$ is a cluster point of $(x_n)$.
Here is a detailed argument:
Let $\varepsilon > 0$ and $n \in \mathbb{N}$ and choose by (i) an $N \in \mathbb{N}$ such that
\[ \text{for all } k \ge N, \quad x_k > a - \varepsilon. \]
Now, by (ii), we can find $m \ge \max\{n + 1, N\}$, with $x_m < a + \varepsilon$. Thus, $m > n$ and satisfies
\[ x_m > a - \varepsilon \quad \text{and} \quad x_m < a + \varepsilon; \]
that is
\[ a - \varepsilon < x_m < a + \varepsilon. \]
Thus, for all $\varepsilon > 0$, and all $n \in \mathbb{N}$, there exists $m > n$ with $x_m \in B(a, \varepsilon)$; that is, $a$ is a cluster point of $(x_n)$.
The proof for $\limsup_n x_n$ is similar.
Unbounded sequences. If a real sequence $(x_n)$ is unbounded, we can still define $\limsup_n x_n$ and $\liminf_n x_n$ in the same way, provided we use the conventions for supremum and infimum in $\overline{\mathbb{R}}$, namely that a (non-empty) set which is not bounded above has supremum $+\infty$ and one which is not bounded below has infimum $-\infty$.
We find that, if $(x_n)$ is not bounded above, then $\limsup_n x_n = +\infty$ and $\liminf_n x_n$ may or may not be infinite; if $(x_n)$ is not bounded below, then $\liminf_n x_n = -\infty$, and $\limsup_n x_n$ may or may not be infinite.
26.1. For a bounded sequence $(x_n)$, $\liminf_n x_n$ is the smallest cluster point of $(x_n)$ and $\limsup_n x_n$ is the largest.
26.2. If a bounded sequence $(x_n)$ has $\liminf_n x_n = \limsup_n x_n$, then the sequence converges to this common value.
26.3. If an unbounded sequence $(x_n)$ has $\liminf_n x_n = \limsup_n x_n$, then the sequence converges to this common value in $\overline{\mathbb{R}}$.
26.4. For a bounded sequence $(x_n)$ in $\mathbb{R}$, $\limsup_n x_n = \sup\{t \in \mathbb{R} : x_n \ge t \text{ for infinitely many } n\}$; $\liminf_n x_n = \inf\{t \in \mathbb{R} : x_n \le t \text{ for infinitely many } n\}$.
26.5. For a sequence in a metric space, $x_n \to a$ iff $\limsup_n d(x_n, a) = 0$.
27. SERIES OF NUMBERS
Associated with a sequence $(a_n) = (a_1, a_2, \dots)$ of real (or complex) numbers is the series $\sum_n a_n = \sum_{n=1}^{\infty} a_n$.[2] For each $n \in \mathbb{N}$, the sum $s_n = \sum_{k=1}^{n} a_k$ is called the $n$th partial sum of the series and $a_n$ is called its $n$th term.
The series $\sum_{n=1}^{\infty} a_n$ is said to converge if the sequence $(s_n)$ of partial sums converges and to be Cauchy if $(s_n)$ is Cauchy. If $s$ is the limit of $(s_n)$ we call $s$ the sum of the series and write
\[ \sum_{n=1}^{\infty} a_n = s. \]
If $(s_n)$ diverges, the series is said to diverge.
27.1 Example (Geometric Series). The series $\sum_{n=1}^{\infty} x^{n-1}$ converges iff $|x| < 1$, in which case
\[ \sum_{n=1}^{\infty} x^{n-1} = \frac{1}{1 - x}. \]
Proof. If $x \ne 1$, the $n$th partial sum is
\[ s_n = \sum_{k=1}^{n} x^{k-1} = \frac{1 - x^n}{1 - x}. \]
Since $x^n \to 0$ if and only if $|x| < 1$, this converges to $\frac{1}{1 - x}$ if $|x| < 1$ and diverges if $|x| > 1$.
If $x = 1$, we have $s_n = n$ so the series diverges (to $+\infty$) and if $x = -1$, the sequence $(s_n)$ is $(1, 0, 1, 0, \dots)$ which also diverges.
If $x$ is a complex number with $|x| = 1$, the series still diverges, since $s_{n+1} - s_n = x^n \not\to 0$. (This is a special case of the "trivial test", also known as the "$n$th-term test", theorem 27.11 below.)
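The closed form for the partial sums of the geometric series can be compared with a direct summation. The following sketch is illustrative only; the function name `geometric_partial_sum` and the sample values of $x$ and $n$ are assumptions made here.

```python
def geometric_partial_sum(x, n):
    """s_n = sum_{k=1}^{n} x^(k-1), accumulated term by term."""
    total = 0.0
    term = 1.0  # x^0
    for _ in range(n):
        total += term
        term *= x
    return total

x = 0.5
closed_form = (1 - x ** 10) / (1 - x)      # (1 - x^n) / (1 - x) with n = 10
print(geometric_partial_sum(x, 10), closed_form)
print(geometric_partial_sum(x, 60), 1 / (1 - x))

# For x = -1 the partial sums oscillate and do not converge.
oscillating = [geometric_partial_sum(-1.0, n) for n in range(1, 5)]
print(oscillating)
```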
For convenience of notation one also works with series of the form
\[ \sum_{n=p}^{\infty} a_n, \]
whose terms form a sequence $(a_p, a_{p+1}, \dots)$ indexed on $\{p, p + 1, \dots\}$ and whose partial sums are of the form
\[ s_n = \sum_{k=p}^{n} a_k. \]
For example, the geometric series above is often considered to be $\sum_{n=0}^{\infty} x^n$. We have (if $x \ne 1$)
\[ \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}, \]
and if $|x| < 1$,
\[ \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}. \]
[2] Technically, the series is the pair $((a_n), (s_n))$ consisting of the sequence of terms and the sequence of partial sums.
27.2 Theorem. (Linearity)
(a) If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series then $\sum_{n=1}^{\infty} (a_n + b_n)$ is convergent with
\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \]
(b) If $\sum_{n=1}^{\infty} a_n$ is convergent and $c \in \mathbb{R}$, then $\sum_{n=1}^{\infty} c a_n$ is convergent with
\[ \sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n. \]
Proof. This follows from the corresponding facts for sequences. The details are left as an exercise.
27.3 Example (The harmonic series). The series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.
Proof. Suppose this series were to converge with sum $s$, and let $s_n$ be the $n$th partial sum. Then $s - s_n \to 0$. But, for all $n$,
\[ s - s_n \ge s_{2n} - s_n = \sum_{k=n+1}^{2n} \frac{1}{k} \ge \sum_{k=n+1}^{2n} \frac{1}{2n} = \frac{n}{2n} = \frac{1}{2}, \]
which does not converge to 0, a contradiction.
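The key inequality $s_{2n} - s_n \ge \frac{1}{2}$ can be confirmed exactly for small $n$. This is a sketch with names chosen here (`harmonic`, `gaps`) and an arbitrary range of 50 values; exact rational arithmetic avoids any rounding question.

```python
from fractions import Fraction

def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n as an exact rational number."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# s_{2n} - s_n is the block of terms 1/(n+1) + ... + 1/(2n).
gaps = [harmonic(2 * n) - harmonic(n) for n in range(1, 51)]

print(float(min(gaps)), float(max(gaps)))
```

The gaps never drop below $\frac{1}{2}$, so the tails $s - s_n$ of a hypothetical sum could not tend to 0.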
27.4 Example (The number $e$). In section 25, we defined the number $e$ as the limit of the sequence whose terms are $(1 + 1/n)^n$. But we also proved that $e$ is the limit of the sums $s_n = \sum_{k=0}^{n} \frac{1}{k!}$; that is, that $e$ is the sum of the infinite series $\sum_{k=0}^{\infty} \frac{1}{k!}$.
The convergence of the partial sums $s_n$ to $e$ is very rapid. Indeed, the error in using $s_n$ to approximate $e$ is
\[ e - s_n = \sum_{k=n+1}^{\infty} \frac{1}{k!} < \frac{1}{(n+1)!} \sum_{k=0}^{\infty} \frac{1}{(n+1)^k} = \frac{1}{(n+1)!} \cdot \frac{1}{1 - \frac{1}{n+1}} = \frac{1}{n!\,n}. \]
Thus, $0 < e - s_n < \frac{1}{n!\,n}$. For $n = 10$, the error is less than $0.2755731922 \times 10^{-7}$, so the partial sum yields $e$ correct to 7 decimal places.
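The error estimate $0 < e - s_n < \frac{1}{n!\,n}$ can be evaluated for $n = 10$. The sketch below is an illustration; the helper names `partial_sum_e` and `error_bound` are chosen here, and the comparison with `math.e` assumes that Python's floating-point constant is accurate to well beyond the size of the bound.

```python
import math
from fractions import Fraction

def partial_sum_e(n):
    """s_n = sum_{k=0}^{n} 1/k! as an exact rational number."""
    return sum(Fraction(1, math.factorial(k)) for k in range(n + 1))

def error_bound(n):
    """The bound 1/(n! n) on e - s_n derived in the text."""
    return Fraction(1, math.factorial(n) * n)

n = 10
actual_error = math.e - float(partial_sum_e(n))
bound = float(error_bound(n))
print(actual_error, bound)
```

The actual error is positive and slightly smaller than the bound, as the estimate predicts.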
27.5 Theorem. The number $e$ is irrational.
Proof. Suppose otherwise that $e = m/n$, with $m, n \in \mathbb{N}$. By the estimate above,
\[ 0 < n!\,(e - s_n) < 1/n. \]
By our assumption, $n!\,e$ is an integer, and
\[ n!\,s_n = n! \left(1 + 1 + \frac{1}{2!} + \dots + \frac{1}{n!}\right) \]
is an integer, so $n!\,(e - s_n)$ is an integer strictly between 0 and 1, which is impossible.
Actually, $e$ is transcendental, that is, it is not the root of any polynomial with rational coefficients, but that we don't prove here.
The monotone convergence theorem for convergence of sequences of reals becomes, in terms of series:
27.6 Theorem. A series of non-negative terms converges iff its partial sums are bounded.
Proof. Let $\sum_{n=1}^{\infty} a_n$ be a series with $a_n \ge 0$, for all $n$, and with partial sums $s_n = \sum_{k=1}^{n} a_k$. Then, for all $n$,
\[ s_n \le s_n + a_{n+1} = s_{n+1}. \]
This shows that $(s_n)$ is an increasing sequence of real numbers, so if it is bounded above, it is convergent; if not, it diverges to $+\infty$.
27.7 Note. We can say more: For a series of non-negative terms
\[ \sum_{k=1}^{n} a_k \le \sum_{k=1}^{\infty} a_k. \]
Indeed, using the notation of the previous proof, we know that $\lim_n s_n = \sup_n s_n$.
In case $(s_n)$ is not bounded above, we know that $s_n \to +\infty$ in $\overline{\mathbb{R}}$ and we write
\[ \sum_{n=1}^{\infty} a_n = +\infty. \]
However, we still say the series diverges (to infinity).
27.8 Theorem (Comparison test). For series $\sum_n a_n$ and $\sum_n b_n$ of non-negative terms, if $N_0 \in \mathbb{N}$ and $a_n \le b_n$, for all $n \ge N_0$, then
(a) if $\sum_n b_n$ converges then so does $\sum_n a_n$, and
(b) if $\sum_n a_n$ diverges, then so does $\sum_n b_n$.
Proof. (a) and (b) are contrapositives of each other, so we prove only the first. Changing a finite number of terms does not affect convergence (although it does affect the sum), so we may assume $a_n \le b_n$ for all $n$.
Let $\sum_n b_n$ converge and let $B = \sum_{n=1}^{\infty} b_n$. The partial sums $B_n = \sum_{k=1}^{n} b_k$ form an increasing sequence, bounded above by $B$. But then,
\[ 0 \le \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le B. \]
This shows that the partial sums of $\sum_n a_n$ are also bounded above by $B$, so converge, and
\[ \sum_{n=1}^{\infty} a_n \le B = \sum_{n=1}^{\infty} b_n. \]
The Cauchy condition can be restated in terms of series as:
27.9 Theorem. A series $\sum_n a_n$ is Cauchy (hence converges) iff for each $\varepsilon > 0$, there exists $N$ such that for $n \ge m \ge N$,
\[ \left| \sum_{k=m}^{n} a_k \right| < \varepsilon. \]
Proof. Let $s_n = \sum_{k=1}^{n} a_k$. Then, by definition, $(s_n)$ is Cauchy iff for each $\varepsilon > 0$, there exists $N$ such that for $n, m > N$,
\[ |s_n - s_m| < \varepsilon. \]
We notice that if $n \ge m$, then
\[ |s_n - s_m| = \left| \sum_{k=1}^{n} a_k - \sum_{k=1}^{m} a_k \right| = \left| \sum_{k=m+1}^{n} a_k \right|. \]
Suppose $(s_n)$ is Cauchy. Let $\varepsilon > 0$. Choose $N_1$ so that for $m, n > N_1$, $|s_n - s_m| < \varepsilon$. Let $N = N_1 + 2$ and $n \ge m \ge N$. Then, $n, m - 1 > N_1$, so
\[ |s_n - s_{m-1}| = \left| \sum_{k=m}^{n} a_k \right| < \varepsilon. \]
Thus, for each $\varepsilon > 0$, there exists $N$ such that $n \ge m \ge N$ implies $\left| \sum_{k=m}^{n} a_k \right| < \varepsilon$.
Conversely, suppose $\varepsilon > 0$ and $N$ is chosen so that $n \ge m \ge N$ implies $\left| \sum_{k=m}^{n} a_k \right| < \varepsilon$. Let $n, m > N$.
In case $n > m$, we have $n \ge m + 1 \ge N$ and $|s_n - s_m| = \left| \sum_{k=m+1}^{n} a_k \right| < \varepsilon$.
The case $n < m$ is proved similarly, by interchanging the roles of $n$ and $m$; and in case $n = m$, $|s_n - s_m| = 0 < \varepsilon$.
Thus, in all cases $n, m > N$ implies $|s_n - s_m| < \varepsilon$, so the sequence $(s_n)$ is Cauchy.
27.10 Corollary. If $\sum_{n=1}^{\infty} a_n$ converges, then the sequence of "tails" or "remainders"
\[ R_n := \sum_{n=1}^{\infty} a_n - \sum_{k=1}^{n} a_k = \sum_{k=n+1}^{\infty} a_k \]
converges to 0.
Here is what some people call the $n$th-term test.
27.11 Theorem (trivial test for divergence).
(1) If $\sum_n a_n$ converges then $a_n \to 0$. Equivalently,
(2) if $(a_n)$ does not converge to 0, then the series $\sum_n a_n$ diverges.
Proof. If the series converges it is Cauchy, so if $\varepsilon > 0$ is given, we can find an $N$ such that for $n \ge m \ge N$, $\left| \sum_{k=m}^{n} a_k \right| < \varepsilon$. Taking $n = m \ge N$ gives $|a_n| < \varepsilon$. This proves $a_n \to 0$.
We emphasize that the above result does not say that $a_n \to 0$ implies the series converges. Indeed, the harmonic series $\sum_n \frac{1}{n}$ diverges, yet $1/n \to 0$.
However, there is a case where this does hold.
27.12 Theorem (Alternating series test).
If the sequence $(a_n)$ decreases to 0, then the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges. Moreover, the remainder
\[ R_n = \sum_{k=1}^{\infty} (-1)^{k+1} a_k - \sum_{k=1}^{n} (-1)^{k+1} a_k = \sum_{k=n+1}^{\infty} (-1)^{k+1} a_k \]
satisfies $|R_n| \le a_{n+1}$ and $R_n$ has the same sign as the $(n+1)$st term $(-1)^{n+2} a_{n+1}$.
(Corresponding statements hold for $\sum_{n=1}^{\infty} (-1)^n a_n$.)
Thus, if the terms of a series have alternating signs and have absolute values which decrease with limit 0, then the series converges and the $n$th "remainder", that is, the error in using the $n$th partial sum to approximate the sum, is bounded by the size of the first term omitted (the $(n+1)$st term) and is of the same sign.
Another way you could state this result for both cases (without explicitly writing the $(-1)^n$ or $(-1)^{n+1}$) is: Suppose $c_n \to 0$, $|c_n| \ge |c_{n+1}|$, for all $n$, and $c_n c_{n+1} \le 0$, for all $n$. Then, $\sum_n c_n$ converges and
\[ R_n = \sum_{n=1}^{\infty} c_n - \sum_{k=1}^{n} c_k = \sum_{k=n+1}^{\infty} c_k \]
satisfies $|R_n| \le |c_{n+1}|$ and $c_{n+1} R_n \ge 0$, for all $n$.
The $c_n c_{n+1} \le 0$ indicates that the terms alternate in sign. The $c_{n+1} R_n \ge 0$ indicates that the remainder is of the same sign as $c_{n+1}$.
Proof. Let m and n be natural numbers with m > n. Then
m
X
. 1/kC1 ak D . 1/nC1 ..an
anC1 / C .anC2
anC3 / C .anC4
anC5 / C : : : /
kDn
Since the sequence .an / decreases, the terms .an
and hence the sum
.an
anC1 / C .anC2
anC3 / C .anC4
is also 0. (Whether this sum ends with am or am
even or odd, but it is still 0.)
ˇ
ˇ m
ˇ
ˇX
ˇ
ˇ
kC1
. 1/
ak ˇ D .an
ˇ
ˇ
ˇ
anC1 /, .anC2
anC1 / C .anC2
am
anC3 /, : : : are all 0,
anC5 / C : : : /
1
depends on whether m
anC3 / C .anC4
n is
anC5 / C : : : /
kDn
D an
Œ.anC1
anC2 / C .anC3
anC4 / C : : : / an ;
since again the terms .anC1 anC2 /, .anC3 anC4 /; : : : are 0.
Now, let " > 0 and use the fact that an ! 0 to find N such that n N implies an < ".
Then for m > n N , we have
ˇ
ˇ m
ˇ
ˇX
ˇ
ˇ
. 1/kC1 ak ˇ < ";
ˇ
ˇ
ˇ
kDn
so the series is Cauchy hence converges.
We have still to check the sign of the remainder and the estimate $|R_n| \le a_{n+1}$. We saw
above that
$$\sum_{k=n}^{m} (-1)^{k+1} a_k$$
is $(-1)^{n+1}$ times a non-negative quantity, hence has the same sign as $(-1)^{n+1} a_n$, the $n$th
term, and has absolute value at most $a_n$. When we let $m$ tend to infinity we obtain
$$R_{n-1} = \sum_{k=n}^{\infty} (-1)^{k+1} a_k,$$
which still has the sign of the $n$th term and satisfies $|R_{n-1}| \le a_n$, as required. (Replace $n$ by $n + 1$ to obtain the result
in the form stated.)
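As a numerical sanity check of the remainder estimate, the following sketch uses the alternating harmonic series $\sum_k (-1)^{k+1}/k$, whose sum is $\ln 2$. The choice of series and the function names are ours, for illustration only; floating-point arithmetic can suggest the estimate but does not prove it.

```python
import math

def partial_sum(n):
    """n-th partial sum of the alternating harmonic series sum_{k>=1} (-1)^(k+1)/k."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

def remainder(n):
    """R_n = (sum of the series) - (n-th partial sum); the sum is ln 2."""
    return math.log(2) - partial_sum(n)

def check(n):
    """True if |R_n| <= a_{n+1} and R_n has the sign of the (n+1)st term."""
    r = remainder(n)
    next_term = (-1) ** (n + 2) / (n + 1)
    return abs(r) <= abs(next_term) and r * next_term > 0

if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, remainder(n), check(n))
```

Here `check(n)` tests both conclusions of the theorem at once: the size bound and the sign of the remainder.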
A series $\sum_n a_n$ is said to converge absolutely if $\sum_n |a_n|$ converges.
27.13 Theorem. If a series of real (or complex) numbers converges absolutely, then it
converges.
Proof. Let $\sum_n a_n$ converge absolutely. Then, by definition, $\sum_n |a_n|$ converges.
Thus, by the Cauchy criterion, if $\varepsilon > 0$ is fixed, we can choose $N$ such that $n \ge m \ge N$
implies
$$\sum_{k=m}^{n} |a_k| < \varepsilon.$$
Hence,
$$\Bigl|\sum_{k=m}^{n} a_k\Bigr| \le \sum_{k=m}^{n} |a_k| < \varepsilon.$$
Thus, the series $\sum_n a_n$ is also Cauchy, so converges.
The reason we emphasize that the terms of the sequence are real or complex numbers
is that the metric completeness of these spaces is responsible for the result. Remember
“metric completeness” refers to the fact that Cauchy sequences converge. (That absolute
convergence implies convergence can actually be used to characterize completeness.)
27.14 Corollary (Absolute comparison test). If there exists $N_0$ such that $|a_n| \le |b_n|$ for
all $n \ge N_0$, then $\sum_n a_n$ converges absolutely if $\sum_n b_n$ does.
Notice that, in this form, there was no test for divergence.
27.15 Theorem (Limit comparison test).³
Let $\sum_n a_n$ and $\sum_n b_n$ be series of non-negative terms.
(a) If $\lim_n \frac{a_n}{b_n} = L < \infty$ and $\sum_n b_n$ converges, then $\sum_n a_n$ converges also.
(b) If $\lim_n \frac{a_n}{b_n} = L > 0$ and $\sum_n a_n$ converges, then so does $\sum_n b_n$.
In part (b), $L$ is allowed to be $+\infty$. In case $0 < L < \infty$, the result says $\sum_n a_n$ and
$\sum_n b_n$ converge or diverge together. That is, both converge or both diverge.
Proof. (a) Suppose $a_n/b_n \to L < \infty$ and fix $K$ with $L < K < \infty$. Then there exists $N$
such that for $n \ge N$, $a_n/b_n < K$. Thus, $a_n \le K b_n$, for all $n \ge N$. Thus, if $\sum_n b_n$ converges,
so does $\sum_n K b_n$, and hence so does $\sum_n a_n$, by the usual comparison test.
(b) Suppose $\lim_n a_n/b_n = L > 0$, and let $0 < c < L$. Then, we may choose $N$ so that
for $n \ge N$, $a_n/b_n > c$. Thus, $c b_n < a_n$, for all $n \ge N$, and so convergence of $\sum_n a_n$
implies that of $\sum_n c b_n$, and hence of $\sum_n b_n$, since $c$ is not 0.
You will notice that limit could have been replaced by lim sup in (a) and by lim inf in
(b). As the proof shows, what is really involved is that the ratios an =bn be bounded above
in (a) and be bounded below by a number c > 0 in (b).
The doubling idea used to show that the harmonic series diverges can be refined to give
a surprisingly useful test. For a series whose terms are non-negative and decrease, a rather
“thin” subsequence determines convergence.
27.16 Theorem (Cauchy’s condensation test). Let $a_n \ge a_{n+1} \ge 0$, for all $n$. Then,
$$\sum_{n=1}^{\infty} a_n \text{ converges} \iff \sum_{k=0}^{\infty} 2^k a_{2^k} \text{ converges.}$$
³ Some people call this the “ratio comparison test”. See the exercises for another result with that name.
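Proof. If $2^m > n$, we have
$$\sum_{i=1}^{n} a_i \le a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + \cdots + (a_{2^m} + \cdots + a_{2^{m+1}-1})$$
$$\le a_1 + (a_2 + a_2) + (a_4 + a_4 + a_4 + a_4) + \cdots + (a_{2^m} + \cdots + a_{2^m})$$
$$= 1a_1 + 2a_2 + 4a_4 + \cdots + 2^m a_{2^m} \le \sum_{k=0}^{\infty} 2^k a_{2^k},$$
so if $\sum_{k=0}^{\infty} 2^k a_{2^k}$ converges, so does $\sum_{n=1}^{\infty} a_n$, since its partial sums are bounded above.
Similarly,
$$\sum_{i=1}^{\infty} a_i \ge a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + (a_9 + \cdots + a_{16}) + \cdots + (a_{2^{m-1}+1} + \cdots + a_{2^m})$$
$$\ge a_1 + a_2 + (a_4 + a_4) + (a_8 + a_8 + a_8 + a_8) + (a_{16} + \cdots + a_{16}) + \cdots + (a_{2^m} + \cdots + a_{2^m})$$
$$\ge \tfrac{1}{2} a_1 + a_2 + 2a_4 + 4a_8 + \cdots + 2^{m-1} a_{2^m}$$
$$= \tfrac{1}{2}\,(a_1 + 2a_2 + 4a_4 + \cdots + 2^m a_{2^m}) = \frac{1}{2} \sum_{k=0}^{m} 2^k a_{2^k},$$
so that if $\sum_{n=1}^{\infty} a_n$ converges, so does $\sum_{k=0}^{\infty} 2^k a_{2^k}$.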
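Let us apply this to the so-called $p$-hyperharmonic series, also called “$p$-series”.
27.17 Theorem ($p$-series). For a real number $p$, $\sum_n \frac{1}{n^p}$ converges iff $p > 1$.
Proof. If $p \le 0$, the terms $1/n^p$ do not tend to 0, so the series diverges. If $p > 0$, the terms decrease, so by the Cauchy condensation test the series $\sum_n \frac{1}{n^p}$ converges iff
$$\sum_{k=0}^{\infty} 2^k \frac{1}{(2^k)^p} = \sum_{k=0}^{\infty} \Bigl(\frac{1}{2^{p-1}}\Bigr)^k$$
does. But this is a geometric series; it converges iff $\frac{1}{2^{p-1}} < 1$, that is, iff $p > 1$.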
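The two inequalities used in the proof of the condensation test can be checked numerically for the $p$-series. The sketch below is only an illustration under floating-point arithmetic; the function names and the tolerance `tol` are our own choices. It compares the partial sums $S_n$ of $\sum 1/n^p$ with the partial sums $C_m = \sum_{k=0}^{m} 2^k/(2^k)^p$ of the condensed series.

```python
def p_partial_sum(p, n):
    """Partial sum 1/1^p + 1/2^p + ... + 1/n^p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))

def condensed_partial_sum(p, m):
    """Partial sum of the condensed series: sum over k = 0..m of 2^k / (2^k)^p."""
    return sum(2.0 ** k / (2.0 ** k) ** p for k in range(m + 1))

def bounds_hold(p, m):
    """Check (1/2) C_m <= S_{2^m} and S_{2^(m+1) - 1} <= C_m, as in the proof."""
    c = condensed_partial_sum(p, m)
    tol = 1e-9  # slack for floating-point rounding
    lower_ok = 0.5 * c <= p_partial_sum(p, 2 ** m) + tol
    upper_ok = p_partial_sum(p, 2 ** (m + 1) - 1) <= c + tol
    return lower_ok and upper_ok

if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, [bounds_hold(p, m) for m in range(8)])
```

For $p = 1$ the condensed partial sums are $C_m = m + 1$, so the lower bound recovers the familiar estimate $S_{2^m} \ge (m+1)/2$ for the harmonic series.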
27.18 Theorem (Cauchy’s root test). The series $\sum_n a_n$
(a) converges absolutely if $\limsup_n |a_n|^{1/n} < 1$;
(b) diverges if $\limsup_n |a_n|^{1/n} > 1$.
If $\limsup_n |a_n|^{1/n} = 1$ the series could converge or diverge.
Proof. (a) Let $\alpha = \limsup_n |a_n|^{1/n} < 1$. Choose $r$ with $\alpha < r < 1$. Then, there exists $N$
such that for $n > N$, $|a_n|^{1/n} < r$. Thus,
$$|a_n| \le r^n \text{ for } n > N.$$
Hence $\sum_n a_n$ converges absolutely by comparison with the geometric series $\sum_n r^n$, $0 \le r < 1$.
(b) If $\limsup_n |a_n|^{1/n} > 1$, then for infinitely many $n$, $|a_n| > 1$, hence $(a_n)$ could not
tend to 0. Hence, the series $\sum_n a_n$ diverges.
Notice that $\lim_n (1/n)^{1/n} = 1$ and $\sum_n \frac{1}{n}$ diverges, while $\lim_n (1/n^2)^{1/n}$ is also 1 and
$\sum_n \frac{1}{n^2}$ converges.
27.19 Theorem (D’Alembert’s ratio test). The series $\sum_n a_n$, with $a_n \ne 0$,
(a) converges absolutely if $\limsup_n |a_{n+1}|/|a_n| < 1$ and
(b) diverges if $\liminf_n |a_{n+1}|/|a_n| > 1$ (more generally, if there exists $N$ such that
for $n \ge N$, $|a_{n+1}|/|a_n| \ge 1$).
Proof. (a) Replacing $a_n$ by $|a_n|$, we may assume $a_n > 0$, for all $n$. Let $\alpha = \limsup_n \frac{a_{n+1}}{a_n} < 1$ and let $1 > r > \alpha$. Then, there exists $N$ such that for $n \ge N$,
$$\frac{a_{n+1}}{a_n} < r.$$
Thus, for all $n \ge N$, $a_{n+1} < a_n r$; so that
$$a_{N+1} < a_N r, \qquad a_{N+2} < a_{N+1} r < a_N r^2,$$
and by induction
$$a_{N+k} < a_N r^k.$$
Thus, for all $n \ge N$,
$$a_n \le a_N r^{n-N} = a_N r^{-N} r^n.$$
Writing $K$ for the constant $a_N r^{-N}$, we have
$$a_n \le K r^n, \text{ for } n \ge N.$$
Since $0 < r < 1$, the series $\sum_n a_n$ converges, by comparison with a convergent geometric series.
(b) If $\liminf_n a_{n+1}/a_n > 1$, then there exists an $N$ such that for all $n \ge N$, $a_{n+1}/a_n > 1$.
Now if we have even $a_{n+1}/a_n \ge 1$, for all $n \ge N$, we see that $a_n \le a_{n+1}$ for all $n \ge N$,
so $a_n$ cannot converge to 0, hence the series $\sum_n a_n$ cannot converge.
Again, $\sum_n \frac{1}{n}$ diverges and $\sum_n \frac{1}{n^2}$ converges, yet in both cases $\lim_n \frac{a_{n+1}}{a_n} = 1$.
Notice that both the ratio test and the root test deduce divergence only from the lack of
convergence to 0 of the terms.
The relationship between the ratio and the root tests is brought out by the following
result. It shows that whenever the ratio test shows convergence, the root test will also. If
the lim inf version of the ratio test shows divergence, so will the root test.
There are series, however, for which the root test indicates convergence, but the ratio
test does not apply.
27.20 Theorem. If $a_n > 0$ for all $n \in \mathbb{N}$,
$$\liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n} \le \limsup_n \frac{a_{n+1}}{a_n}.$$
Proof. Recall that, for any sequence $(x_n)$,
$$\limsup_n x_n = \lim_n \sup_{k \ge n} x_k \quad\text{and}\quad \liminf_n x_n = \lim_n \inf_{k \ge n} x_k,$$
and that if
$$\beta > \limsup_n x_n, \text{ then there exists } N \text{ such that for } n \ge N,\ x_n < \beta. \tag{$*$}$$
(“All but finitely many terms are $< \beta$.”)
Now, for each $n$,
$$\inf_{k \ge n} x_k \le \sup_{k \ge n} x_k,$$
so in the limit
$$\liminf_n x_n \le \limsup_n x_n.$$
This is a general result which applies here to give
$$\liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n}.$$
The interesting part is the comparison with the ratios.
Let $\alpha = \limsup_n \frac{a_{n+1}}{a_n}$ and let $\beta > \alpha$. Then, by ($*$) there exists $N$ such that for $n \ge N$,
$$\frac{a_{n+1}}{a_n} < \beta.$$
Thus, for all $n \ge N$, $a_{n+1} < a_n \beta$; so that
$$a_{N+1} < a_N \beta, \qquad a_{N+2} < a_{N+1} \beta < a_N \beta^2,$$
and by induction
$$a_{N+k} < a_N \beta^k.$$
Thus, for all $n \ge N$,
$$a_n \le a_N \beta^{n-N} = a_N \beta^{-N} \beta^n.$$
Writing $K$ for the constant $a_N \beta^{-N}$, we have
$$a_n \le K \beta^n, \text{ for } n \ge N.$$
Thus,
$$a_n^{1/n} \le K^{1/n} \beta, \text{ for } n \ge N.$$
Taking the lim sup of both sides (recalling that the limit superior does not change if we
change a finite number of terms) we have
$$\limsup_n a_n^{1/n} \le \limsup_n K^{1/n} \beta.$$
But if a limit exists, it is also the limit superior, so
$$\limsup_n K^{1/n} = \lim_n K^{1/n} = 1,$$
and hence,
$$\limsup_n a_n^{1/n} \le \beta.$$
But $\beta$ was arbitrary $> \alpha$. Hence,
$$\limsup_n a_n^{1/n} \le \alpha = \limsup_n \frac{a_{n+1}}{a_n}.$$
The inequality involving limit inferior is proved the same way.
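To illustrate how the root test can succeed where the ratio test does not apply, consider $a_n = 2^{-n + (-1)^n}$. This example is our own choice and does not appear in the text. The ratios $a_{n+1}/a_n$ alternate between $1/8$ and $2$, so neither part of the ratio test applies, while $a_n^{1/n} \to 1/2$. The sketch below only inspects finitely many terms, so it suggests rather than proves these limits.

```python
def a(n):
    """Illustrative term a_n = 2^(-n + (-1)^n); not taken from the text."""
    return 2.0 ** (-n + (-1) ** n)

N = 200
ratios = [a(n + 1) / a(n) for n in range(1, N)]   # a_{n+1} / a_n
roots = [a(n) ** (1.0 / n) for n in range(1, N)]  # a_n^(1/n)

tail = slice(N // 2, None)  # look only at the later terms

if __name__ == "__main__":
    print("ratios over the tail:", min(ratios[tail]), max(ratios[tail]))
    print("roots over the tail: ", min(roots[tail]), max(roots[tail]))
```

The ratios stay pinned at the two values $1/8$ and $2$, consistent with the chain of inequalities in Theorem 27.20: the $n$th roots settle between the lim inf and lim sup of the ratios.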
27.1. (Ratio Comparison Test) For series $\sum_n a_n$ and $\sum_n b_n$ of positive terms, if $\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}$
for all $n$ sufficiently large, then convergence of $\sum_n b_n$ implies that of $\sum_n a_n$.
27.2. Use the corresponding results about sequences of real numbers to prove:
(Linearity) Let $a_n, b_n \in \mathbb{R}$, for all $n \in \mathbb{N}$.
(a) If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series then $\sum_{n=1}^{\infty} (a_n + b_n)$ is convergent with
$$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n.$$
(b) If $\sum_{n=1}^{\infty} a_n$ is convergent and $c \in \mathbb{R}$, then $\sum_{n=1}^{\infty} c a_n$ is convergent with
$$\sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n.$$
27.3. For the following series, determine whether the series converges or diverges. If it is convergent,
find its sum.
(a) $\sum_{n=1}^{\infty} \frac{1}{n^{1/n}}$.
(b) $\sum_{n=1}^{\infty} \frac{3}{n(n+1)}$.
(c) $\sum_{k=1}^{\infty} \frac{k^2 + 5}{3k^3 + 2k - 1}$.
27.4. For the following series, determine whether the series converges or diverges. If it is convergent,
find its sum.
1
X
1
(a)
.
2n
(b)
(c)
(d)
nD1
1
X
nD1
1
X
nD1
1
X
nD1
.3n
1
.
2/.3n C 1/
n
p
.
1 C n2
3n C 2n
.
6n
27.5. Test the following for convergence or divergence.
1
X
2n
(a)
.
n
3 C1
(b)
(c)
(d)
nD1
1
X
. 1/n
nD1
1
X
. 1/n
nD1
1
X
nD1
n2
.
n2 C 1
p
n
1
.
nC4
1
cos n
.
n4=3
27.6. For a real number $x$, $x^+ = \max\{x, 0\}$ and $x^- = \max\{-x, 0\}$, so that $x = x^+ - x^-$ and
$|x| = x^+ + x^-$.
(a) If $\sum_n a_n$ converges absolutely, then both $\sum_n a_n^+$ and $\sum_n a_n^-$ converge.
(b) If $\sum_n a_n$ converges, but not absolutely, then both $\sum_n a_n^+$ and $\sum_n a_n^-$ diverge.
27.7. A rearrangement of a series $\sum_{n=1}^{\infty} a_n$ is a series $\sum_{n=1}^{\infty} a_{k_n}$, where $n \mapsto k_n$ is a
bijective map of $\mathbb{N}$ onto $\mathbb{N}$.
(a) If the series $\sum_{n=1}^{\infty} a_n$ converges absolutely with sum $s$, then every rearrangement of this series converges with the same sum.
(b) If a series of real numbers $\sum_{n=1}^{\infty} a_n$ converges, but not absolutely, then for each $x \in \mathbb{R}$, there is a
rearrangement which converges to $x$. In fact, for each $\alpha$, $\beta$ with $-\infty \le \alpha \le \beta \le +\infty$, there is
a rearrangement $\sum_n a'_n$ with partial sums $s'_n$ such that
$$\liminf_n s'_n = \alpha \quad\text{and}\quad \limsup_n s'_n = \beta.$$
28. LIMITS OF FUNCTIONS
Let $X$ be a set in a metric space $S$ and let $Y$ be another metric space and $f : X \to Y$.
If $c \in S$, we say $f(x)$ converges to $L$ as $x$ tends to $c$, and write $f(x) \to L$ as $x \to c$ iff
for each $\varepsilon > 0$, there exists $\delta > 0$ such that for $x \in X$,
$$0 < d_S(x, c) < \delta \implies d_Y(f(x), L) < \varepsilon.$$
In other words, for all $\varepsilon > 0$ there exists $\delta > 0$ such that
$$f\bigl(B(c, \delta) \setminus \{c\}\bigr) \subset B(L, \varepsilon). \tag{$*$}$$
One also writes $\lim_{x \to c} f(x) = L$. (This notation has a slight flaw, as we shall see shortly.)
Here, $d_S$ denotes the distance in $S$ and $d_Y$ denotes the distance in $Y$. Often we may just
use the letter $d$ in both places. By the way, in the expression ($*$), since $f$ is only defined
on $X$ and $c$ is excluded, it doesn’t matter whether we use the ball in $S$, $B(c, \delta) = \{x \in S : d_S(c, x) < \delta\}$, or the corresponding ball in $X$, namely $B_X(c, \delta) = X \cap B(c, \delta)$.
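28.1 Example. Let $f(x) = x^2 + 2x + 6$, for $x \in \mathbb{R}$. Then, $\lim_{x \to 3} f(x) = 21$.
Proof. Here, $X = S = Y = \mathbb{R}$: distances are given in terms of absolute values.
Let $\varepsilon > 0$. Then,
$$|f(x) - 21| < \varepsilon \iff |x^2 + 2x + 6 - 21| < \varepsilon \iff |x^2 + 2x - 15| < \varepsilon \iff |x - 3||x + 5| < \varepsilon.$$
Now, if $|x - 3| < 1$, we will have
$$|x + 5| \le |x - 3| + |3 + 5| \le 1 + 8 = 9,$$
so that
$$|f(x) - 21| \le |x - 3| \cdot 9.$$
Thus, $|f(x) - 21|$ will be $< \varepsilon$ provided
$$|x - 3| \cdot 9 < \varepsilon$$
and $|x - 3| < 1$. Thus, we take $\delta = \min\{1, \varepsilon/9\}$. Then, $|x - 3| < \delta$ implies
$$|f(x) - 21| = |x - 3||x + 5| \le |x - 3| \cdot 9 < \frac{\varepsilon}{9} \cdot 9 = \varepsilon,$$
as required to prove $\lim_{x \to 3} f(x) = 21$.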
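The choice $\delta = \min\{1, \varepsilon/9\}$ can be spot-checked numerically. The sketch below samples points with $0 < |x - 3| < \delta$ and tests $|f(x) - 21| < \varepsilon$. The helper names and the sampling scheme are our own, and a finite sample in floating point is only a sanity check, not a substitute for the proof.

```python
def f(x):
    return x * x + 2 * x + 6

def delta(eps):
    """The delta chosen in the proof: min{1, eps/9}."""
    return min(1.0, eps / 9.0)

def check(eps, samples=1000):
    """Sample points x with 0 < |x - 3| < delta(eps); test |f(x) - 21| < eps."""
    d = delta(eps)
    for i in range(1, samples):
        t = d * i / samples  # 0 < t < d
        for x in (3 - t, 3 + t):
            if not abs(f(x) - 21) < eps:
                return False
    return True

if __name__ == "__main__":
    for eps in (100.0, 9.0, 1.0, 0.1, 1e-3):
        print(eps, delta(eps), check(eps))
```

Note that for large $\varepsilon$ the cap $\delta \le 1$ is what keeps the bound $|x + 5| \le 9$ valid.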
As you can see, the techniques we have developed for limits of sequences seem to apply
for limits of functions. Actually, there is a close connection, as we will see, known as the
sequential criterion for convergence.
The set $B'(c, \delta) = B(c, \delta) \setminus \{c\}$ is often called the deleted neighbourhood of $c$ of radius
$\delta$, or the deleted open ball about $c$ of radius $\delta$. The definition of convergence of the function
$f$ to $L$ as $x \to c$ becomes:
For each neighbourhood $V$ of $L$
there exists a deleted neighbourhood $U$ of $c$
with $f(U) \subset V$.
Recall that, by definition (section 18), a point $c$ is an accumulation point of a set $A$ iff, for every $\delta > 0$,
$B(c, \delta) \cap (A \setminus \{c\}) \ne \emptyset$, and this can be rewritten as $B'(c, \delta) \cap A \ne \emptyset$. (A point of $A$
which is not an accumulation point is called an isolated point of $A$.)
So points of $A$ come in 2 kinds: accumulation and isolated. But be careful, accumulation
points need not be points of $A$; $A \cup \operatorname{acc}(A) = \operatorname{cl} A$.
Uniqueness. If c is an isolated point of the domain of the function f , the limit of f
at c has little meaning, as we will describe shortly, but if c is an accumulation point of
the domain, the limit is unique. To see this, we recall a concept mentioned briefly in the
discussion of convergence of sequences:
28.2 Theorem (The Hausdorff Property). In any metric space, if $x \ne y$, there exist
neighbourhoods $U_x$ of $x$ and $U_y$ of $y$ with $U_x \cap U_y = \emptyset$.
The proof of this result, which we gave in theorem 21.9, is a triviality, but the result is
enormously important, so let's go through it again.
Proof. Since $x \ne y$, $d(x, y) > 0$. Take $\delta$ any positive number $\le d(x, y)/2$,
$U_x = B(x, \delta)$ and $U_y = B(y, \delta)$.
Then $U_x$ and $U_y$ are neighbourhoods of $x$ and $y$, respectively, and $U_x \cap U_y = \emptyset$. Indeed,
if $z \in U_x \cap U_y$, then $d(x, y) \le d(x, z) + d(z, y) < \delta + \delta \le d(x, y)$, an impossibility.
28.3 Theorem (Uniqueness of limits). Let $S$ and $Y$ be metric spaces. Suppose $c$ is an
accumulation point of $X \subset S$, $f : X \to Y$. If $f(x) \to L_1$ as $x \to c$ and $f(x) \to L_2$ as
$x \to c$, then $L_1 = L_2$.
Proof. Suppose $L_1 \ne L_2$. By the Hausdorff property, there exist neighbourhoods $V_1$ of
$L_1$ and $V_2$ of $L_2$ such that $V_1 \cap V_2 = \emptyset$.
By definition, there exist deleted neighbourhoods $U_1$ of $c$ and $U_2$ of $c$ with $f(U_1) \subset V_1$ and $f(U_2) \subset V_2$. Now $U_1 \cap U_2$ is still a deleted neighbourhood of $c$. (Take out your
$\delta$’s and check.) Thus, since $c$ is an accumulation point of $X$, $U_1 \cap U_2 \cap X \ne \emptyset$. But if $x \in U_1 \cap U_2 \cap X$,
$$f(x) \in V_1 \cap V_2,$$
which is impossible.
The theorem we have just proved is the justification of the notation
$$\lim_{x \to c} f(x) = L.$$
However, limits of functions can be non-unique in a certain situation.
28.4 Theorem. If $c$ is not an accumulation point of $X$, the domain of the function $f : X \to Y$, then $f(x) \to y$ as $x \to c$, for all $y \in Y$.
Proof. $c$ is not an accumulation point of $X$ means there exists a deleted neighbourhood $U$
of $c$ which does not intersect $X$. Thus, if $V$ is a neighbourhood of $y$,
$$f(U) = f(X \cap U) = f(\emptyset) = \emptyset \subset V.$$
This is so disturbing to some people that they refuse to talk about limits at an isolated
point of X . They just define this behaviour away. We prefer to allow c to be a non accumulation point, so that a function continuous at c will converge at c to f .c/ even in this
case. See 30.1
If we use a formula for something, it should have a unique definition. In case c is an
accumulation point of the domain of f , the limit is unique, so this is fine. If c is not an
accumulation point, it is not. Nevertheless, it is somewhat common to use this notation
even in the latter case.
One more thing. . . you will notice that it is not necessary for c to be a member of the
domain of the function to define a limit at c.
28.5 Theorem (The sequential criterion for convergence). Let $S$, $Y$ be metric spaces,
$c \in S$, $X \subset S$ and $f : X \to Y$. Then
$$f(x) \to L \text{ as } x \to c \iff f(x_n) \to L \text{ for each sequence } (x_n) \text{ in } X \setminus \{c\} \text{ converging to } c.$$
We sometimes abbreviate “sequential criterion” as “SC”.
Proof. ($\Longrightarrow$) Suppose $f(x) \to L$ as $x \to c$. Let $(x_n)$ be a sequence in $X \setminus \{c\}$ with
$x_n \to c$. We are to show $f(x_n) \to L$.
Let $\varepsilon > 0$. Then, there exists $\delta > 0$ such that $f(x) \in B(L, \varepsilon)$, whenever $x \in X \cap B(c, \delta) \setminus \{c\}$. But $x_n \to c$ and $x_n \ne c$ for all $n$, so there exists $N$ such that $n > N$ implies
$x_n \in B(c, \delta) \setminus \{c\}$. For this $N$, and for $n > N$, we therefore have $f(x_n) \in B(L, \varepsilon)$ as
required.
($\Longleftarrow$) We prove this by establishing the contrapositive. Assume $f(x)$ does not converge to $L$ as $x \to c$. That is, there exists $\varepsilon > 0$ such that for all $\delta > 0$, there exists
$x \in B(c, \delta) \cap X$, with $x \ne c$, and $f(x) \notin B(L, \varepsilon)$.
Fix such an $\varepsilon > 0$.
(Today it is delta’s turn to take on a lot of identities.)
For each $n \in \mathbb{N}$, take $\delta = 1/n$ in the above statement and choose
$$x_n \in B(c, 1/n) \cap X, \text{ with } x_n \ne c, \text{ but } f(x_n) \notin B(L, \varepsilon).$$
Since $1/n \to 0$, $(x_n)$ is a sequence in $X \setminus \{c\}$ for which $x_n \to c$, yet there is no $n$ for which
$f(x_n)$ belongs to $B(L, \varepsilon)$; the sequence $(f(x_n))$ certainly doesn’t converge to $L$.
For those who prefer distances to neighbourhoods: we constructed $(x_n)$ with $x_n \in X$,
$$0 < d(x_n, c) < \frac{1}{n}, \text{ but } d(f(x_n), L) \ge \varepsilon, \text{ for all } n \in \mathbb{N}.$$
28.6 Example. Let $f(x) = \sin(1/x)$, for $x \ne 0$, so the domain of $f$ is $X = \mathbb{R} \setminus \{0\}$.
Then $f(x)$ does not converge to anything as $x \to 0$.
[Figure: graph of $f(x) = \sin(1/x)$ for $x$ between $-1.5$ and $1.5$.]
Proof. Let $x_n = 1/(n\pi/2)$, for each $n$. Then $f(x_1) = 1$, $f(x_2) = 0$, $f(x_3) = -1$,
$f(x_4) = 0$, $\ldots$. We see that the sequence $(f(x_n))$ does not converge. But $x_n \ne 0$
and $x_n \to 0$, so the sequential criterion isn’t satisfied. Thus, $f(x)$ does not converge as
$x \to 0$.
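The sequence used in this proof is easy to tabulate. The following sketch evaluates $f$ along $x_n = 1/(n\pi/2)$ and rounds the results, since floating-point values of $\sin$ at multiples of $\pi/2$ are only approximately $0$ or $\pm 1$. The variable names and the rounding precision are our own choices.

```python
import math

def f(x):
    """f(x) = sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)

# x_n = 1/(n*pi/2) for n = 1, ..., 12: nonzero points tending to 0.
xs = [1.0 / (n * math.pi / 2) for n in range(1, 13)]

# Rounded values of f along the sequence; they cycle through 1, 0, -1, 0.
values = [round(f(x), 9) for x in xs]

if __name__ == "__main__":
    for x, v in zip(xs, values):
        print(f"{x:.6f}  {v:+.1f}")
```

Because the values keep cycling through $1, 0, -1, 0$ while $x_n \to 0$, no single $L$ can satisfy the sequential criterion.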
Let f; g be functions on a set X to R. Then the functions f C g, fg, f =g are “defined
pointwise” as follows.
(1) $f + g$ is defined on $X$ by $(f + g)(x) = f(x) + g(x)$.
(2) $fg$ is defined on $X$ by $(fg)(x) = f(x)g(x)$.
(3) $f/g$ is defined on $\{x \in X : g(x) \ne 0\}$ by $(f/g)(x) = f(x)/g(x)$.
28.7 Theorem. Let $X$ be a subset of a metric space, $f : X \to \mathbb{R}$, $g : X \to \mathbb{R}$, $c \in X$,
with $f(x) \to L$ and $g(x) \to M$ as $x \to c$. Then
(1) $(f + g)(x) \to L + M$ as $x \to c$ (sum law)
(2) $(fg)(x) \to LM$ as $x \to c$ (product law)
(3) $(f/g)(x) \to L/M$ as $x \to c$ (quotient law), provided $M \ne 0$.
We will prove (1) using the definition of limit, and (2) using the sequential criterion (SC).
(3) is left as an exercise. I think you will find it easiest using the SC, but it is instructive to
try it both ways.
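Proof. (1) Let $\varepsilon > 0$. Since $f(x) \to L$ as $x \to c$, there exists $\delta_1 > 0$ such that
$$x \in X \text{ and } 0 < d(x, c) < \delta_1 \text{ imply } |f(x) - L| < \varepsilon/2,$$
and similarly there exists $\delta_2 > 0$ such that
$$x \in X \text{ and } 0 < d(x, c) < \delta_2 \text{ imply } |g(x) - M| < \varepsilon/2.$$
Put $\delta = \min\{\delta_1, \delta_2\}$. Then, $x \in X$ and $0 < d(x, c) < \delta$ imply
$$|(f + g)(x) - (L + M)| \le |f(x) - L| + |g(x) - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary,
$$\forall \varepsilon > 0,\ \exists \delta > 0,\ \forall x \in X,\quad 0 < d(x, c) < \delta \implies |(f + g)(x) - (L + M)| < \varepsilon.$$
That is, $(f + g)(x) \to L + M$ as $x \to c$.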
(2) Let $(x_n)$ be an arbitrary sequence in $X \setminus \{c\}$ converging to $c$. Then $f(x_n) \to L$ and
$g(x_n) \to M$, by the SC, and $(fg)(x_n) = f(x_n)g(x_n)$ by definition. Therefore, by the
product law for sequences,
$$(fg)(x_n) \to LM.$$
Since $(x_n)$ was arbitrary, $fg$ satisfies the SC for convergence and $(fg)(x) \to LM$ as
$x \to c$.
28.1. The sequential criterion has a stronger form, without reference to a particular limit: If for each
sequence $(x_n)$ in $X \setminus \{c\}$ converging to $c$, $(f(x_n))$ converges, then $f(x)$ converges as $x \to c$. (The point here is
that at first sight, the limit of $f(x_n)$ could be different for different sequences $(x_n)$, yet it is part
of the conclusion that there is only one such limit, and then $f(x)$ converges to it as $x \to c$.)
Left and right limits. For a function $f$ defined on a subset $X$ of the reals, and $c \in \mathbb{R}$, $f(x)$
tends to $L$ as $x \to c$ from the left means the restriction of $f$ to $X \cap (-\infty, c]$ converges to
$L$. Then $L$ is called the left limit or left-hand limit of $f$ at $c$, denoted
$$f(c-) = \lim_{x \to c-} f(x) \quad\text{or}\quad \lim_{\substack{x \to c \\ x < c}} f(x).$$
Be very careful using this notation. There is no point $c-$. $f(c-)$ makes sense, but $c-$ does
not. (Also if $c$ is not an accumulation point of $X \cap (-\infty, c]$, one shouldn’t use the limit
notation, since the limit is not unique.)
Similarly, $f(x)$ converges to $L$ as $x \to c$ from the right means the restriction of $f$ to
$X \cap [c, \infty)$ does. The corresponding notation is
$$f(c+) = \lim_{x \to c+} f(x) \quad\text{or}\quad \lim_{\substack{x \to c \\ x > c}} f(x).$$
You can prove that $\lim_{x \to c} f(x) = L$ if and only if $f(c-) = f(c+) = L$. (See
exercise 28.2 below.)
A real-valued function $f$ is called increasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) \le f(x_2)$ and decreasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) \ge f(x_2)$.
A real-valued function $f$ is called strictly increasing on $X$ if $x_1 < x_2$ in $X$ implies
$f(x_1) < f(x_2)$ and strictly decreasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) > f(x_2)$.
A function is called monotone if it is either increasing or decreasing and strictly monotone if it is either strictly increasing or strictly decreasing.
You should prove as an exercise:
28.8 Theorem (Convergence of monotone functions). If $f$ is a monotone function
defined on an interval $I$ of the reals, then for each $c \in I$, the limits from the right and from
the left exist.
In fact, if $f$ is increasing, $f(c-) = \sup_{x < c} f(x) \le f(c)$ if $c$ is not the left endpoint of
$I$ and $f(c+) = \inf_{x > c} f(x) \ge f(c)$ if $c$ is not the right endpoint of $I$.
If $f$ is decreasing, we have the same result with supremum and infimum interchanged, and the inequalities reversed.
The proofs follow the pattern of the corresponding results for monotone sequences.
28.2. Let $X$, $Y$ be metric spaces and $A$ and $B$ be subsets of $X$ with union $X$, $f : X \to Y$, $c \in X$. Prove
that $\lim_{x \to c} f(x) = L$ iff $\lim_{x \to c} (f|A)(x) = L$ and $\lim_{x \to c} (f|B)(x) = L$.
Often, in application, $X$ is an interval of $\mathbb{R}$ and $A = X \cap (-\infty, c]$, $B = X \cap [c, +\infty)$.
So, we deduce $\lim_{x \to c} f(x) = L$ if and only if $f(c-) = f(c+) = L$.
28.3. Formulate and prove sum, product, and quotient laws, as far as possible, for functions
from subsets of metric spaces to $\mathbb{R}^m$. Remember there is no quotient of vectors, and there is a
multiplication by a scalar and a dot product of vectors available, depending on the case.
28.4. Prove the result on convergence of monotone functions. What changes would have to be made
if c does not belong to the interval?
28.5.
(a) Use the definition to prove $\lim_{x \to 5} (x^2 - 3x + 1) = 11$.
(b) Use the definition to prove $\lim_{x \to 2} x^3 = 8$.
28.6. Let $f : X \to \mathbb{R}$, $c \in \mathbb{R}$ and suppose $\lim_{x \to c} f(x) = p < 0$. Prove there is a deleted
neighbourhood $U$ of $c$ and an $r < 0$ such that $f(x) \le r$ for all $x \in U \cap X$.
28.7. Disprove: If $f(x) \to L$ as $x \to a$ and $g(x) \to a$ as $x \to c$, then $f(g(x)) \to L$ as $x \to c$.
(See also problems 30.1 and 30.2.)
28.8. The problem with 28.7 goes away if there exists a deleted neighbourhood $U$ of $c$ with $g(x) \ne a$, for all $x \in U$. A particular case of this, of great use, is the case $g$ is a one-to-one function.
28.9. Let $f : X \to \mathbb{R}$, $c \in \mathbb{R}$ and suppose $\lim_{x \to c} f(x) = p$. Prove $f$ is bounded on some
neighbourhood of $c$; that is, there exists a neighbourhood $U$ of $c$ and $M > 0$ such that $|f(x)| \le M$, for all $x \in U \cap X$.
28.10. Let $f : X \subset S \to \mathbb{R}^n$, where $S$ is a metric space. Find an analogue of the Cauchy condition
for convergence of $f$ at a point $c \in S$. Using your definition, prove that if $f$ is Cauchy at $c$, then
$f$ converges at $c$.
29. INFINITE LIMITS OF FUNCTIONS AND LIMITS AT $\pm\infty$
Limits of real-valued functions that involve infinities are not much different from those
involving only real numbers, if we keep in mind the meanings of neighbourhoods of
$+\infty$ and of $-\infty$. For a real-valued function defined on a set $X \subset \mathbb{R}$ and $c, L$ in $\overline{\mathbb{R}} = [-\infty, +\infty]$, we say $f(x)$ converges to $L$ as $x$ tends to $c$, if for each neighbourhood $V$ of
$L$, there exists a deleted neighbourhood $U$ of $c$ such that for $x \in X \cap U$, $f(x) \in V$.
Let’s write out what that means in some special cases. The neighbourhoods of $+\infty$
are of the form $(M, +\infty]$ where $M$ is any real number; the neighbourhoods of $-\infty$ are
of the form $[-\infty, M)$, where $M$ is any real number.
The deleted neighbourhoods of $+\infty$ are of the form $(M, +\infty)$ and the deleted neighbourhoods of $-\infty$ are of the form $(-\infty, M)$.
Thus, if $c \in \mathbb{R}$, $f(x) \to +\infty$ as $x \to c$, or
$$\lim_{x \to c} f(x) = +\infty,$$
means for all $M \in \mathbb{R}$, there exists $\delta > 0$ such that, for all $x \in X \cap B(c, \delta) \setminus \{c\}$, $f(x) > M$:
$$x \in X \text{ and } 0 \ne |x - c| < \delta \implies f(x) > M.$$
Similarly, if $L \in \mathbb{R}$, $f(x) \to L$ as $x \to -\infty$,
$$\lim_{x \to -\infty} f(x) = L,$$
means for all $\varepsilon > 0$ there exists $r$ such that, for all $x \in X \cap (-\infty, r)$, $f(x) \in B(L, \varepsilon)$.
Most of the algebraic operations are preserved for infinite limits, provided the quantities
are defined. (See the section THE EXTENDED REAL NUMBER SYSTEM, for the definition
of the operations.) The only exceptions involve 0 times $+\infty$ or 0 times $-\infty$. Of course,
$+\infty - \infty$, and $+\infty/+\infty$ and the like are not defined, so there is no corresponding theorem
for such a situation.
29.1 Example. Let $f$ and $g$ be real-valued functions defined on $X$. If $\lim_{x \to c} f(x) = p \in \mathbb{R}$ and $\lim_{x \to c} g(x) = +\infty$, then $\lim_{x \to c} (f + g)(x) = +\infty$.
Proof. Let $M \in \mathbb{R}$.
Since $f(x) \to p$, there is a deleted neighbourhood $U_1$ of $c$ such that
$$f(x) > p - 1 \text{ for } x \in U_1 \cap X.$$
Since $g(x) \to +\infty$, there is a deleted neighbourhood $U_2$ of $c$ such that
$$g(x) > M - (p - 1) \text{ for } x \in U_2 \cap X.$$
Put $U = U_1 \cap U_2$. Then, for $x \in U \cap X$,
$$(f + g)(x) = f(x) + g(x) > p - 1 + M - (p - 1) = M.$$
Thus, for each $M \in \mathbb{R}$, there exists a deleted neighbourhood $U$ of $c$ such that for all
$x \in U \cap X$, $(f + g)(x) > M$, as required to prove $(f + g)(x) \to +\infty$ as $x \to c$.
The above proof has been written so that it works whether $c$ is real or infinite. If $c \in \mathbb{R}$,
you could use $B'(c, \delta_1)$ for $U_1$ and $B'(c, \delta_2)$ for $U_2$. Then, $U$ would become $B'(c, \delta)$,
where $\delta = \min\{\delta_1, \delta_2\}$. If $c = -\infty$, you could use $U_1 = (-\infty, r_1)$ and $U_2 = (-\infty, r_2)$.
Then $U = (-\infty, r)$, where $r = \min\{r_1, r_2\}$.
The reader can check that in the definitions of convergence to $+\infty$, it is sufficient to use
neighbourhoods of the form $(M, +\infty]$, with $M > 0$, and for convergence to $-\infty$, one can
use $[-\infty, M)$, with $M < 0$, so that it is easier to work with inequalities.
29.1. Formulate and prove the various limit theorems indicated in this discussion.
29.2. State and prove sequential criteria for infinite limits and limits at an infinity.
29.3. If $h$ is defined (at least) in $[a, \infty)$, $a > 0$, and $H(t) = h(1/t)$ for $0 < t < 1/a$, then
$$\lim_{x \to \infty} h(x) = \lim_{t \to 0+} H(t),$$
whenever one side exists in $\overline{\mathbb{R}}$.
29.4. Every unbounded monotone function on an interval of R has an infinite limit. Make more precise
statements of this principle and prove them.
29.5. Give an example, with proof, of a function, defined on an interval, which is not monotone but
converges to $+\infty$.
30. CONTINUITY OF FUNCTIONS
Let $X$ be a subset of a metric space $S$ and $f : X \to Y$, another metric space. Then $f$ is
called continuous at $a \in X$ iff for every $\varepsilon > 0$, there exists $\delta > 0$ such that $d(f(x), f(a)) < \varepsilon$,
whenever $x \in X$ and $d(x, a) < \delta$.
In terms of neighbourhoods, $f : X \to Y$ is continuous at $a \in X$ if for each neighbourhood
$V$ of $f(a)$, there exists a neighbourhood $U$ of $a$ with $f(U) \subset V$.
Here we remember that a $\delta$-neighbourhood in $X$ is of the form $B_X(a, \delta) = X \cap B_S(a, \delta)$.
If $A \subset X$, $f$ is called continuous on $A$ if it is continuous at each point of $A$; $f$ is
called simply continuous if it is continuous at each point of its domain. We say $f$ is
discontinuous at the point $a$, if $a \in X$ and $f$ is not continuous at $a$.
If $a$ is not a point of the domain of $f$, $f$ is neither continuous nor discontinuous at $a$; it is
not “at $a$” at all.
If $a$ is not an accumulation point of the domain $X$ of $f$ (that is, if $a$ is an isolated point
of $X$), then we see that $f$ is automatically continuous at $a$. Indeed, if $a$ is isolated, then
there exists $\delta > 0$ such that $B_X(a, \delta) = B_S(a, \delta) \cap X = \{a\}$: thus for $x \in B_X(a, \delta)$,
$$d(f(x), f(a)) = d(f(a), f(a)) = 0 < \varepsilon,$$
no matter what the given $\varepsilon > 0$ was. Looked at another way, if $V$ is a neighbourhood of
$f(a)$, then $f(B_X(a, \delta)) = \{f(a)\} \subset V$.
30.1 Theorem. Let $S$, $Y$ be metric spaces, $X \subset S$, $f : X \to Y$ and $a \in S$. Then $f$
is continuous at $a$ iff $\lim_{x \to a} f(x) = f(a)$.
This is an immediate consequence of the definitions. We emphasize that the condition
means 3 things: (1) the limit $\lim_{x \to a} f(x)$ exists, (2) $f(a)$ exists (that is, $a \in X$) and (3)
the two sides are equal.
If S is a metric space and X is a subset of S , we know that X becomes a metric space
with the induced metric. Since continuity of a function f doesn’t involve points outside
the domain of f , often we might as well state theorems, assuming the whole space as the
domain.
30.2 Theorem (Sequential criterion for continuity). Let $X$ and $Y$ be metric spaces,
$f : X \to Y$, and $a \in X$. Then $f$ is continuous at $a$ iff for each sequence $(x_n)$ in
$X \setminus \{a\}$ converging to $a$, $f(x_n) \to f(a)$, or alternatively, iff for each sequence $(x_n)$ in $X$
converging to $a$, $f(x_n) \to f(a)$.
Proof. Since $f$ is continuous at $a$ iff $\lim_{x \to a} f(x) = f(a)$, the first version is an immediate consequence of the sequential criterion for limits; namely,
$$\lim_{x \to a} f(x) = L \quad\text{iff}\quad \text{for each } (x_n) \text{ in } X \setminus \{a\} \text{ converging to } a,\ f(x_n) \to L.$$
The proof of the second form of this is essentially the same as the proof of the limit version;
the difference is just that we don’t have to pay special attention to $a$.
Here is the detail: Assume $f$ is continuous at $a$. Let $x_n \to a$, $x_n \in X$ for all $n \in \mathbb{N}$.
Let $V$ be a neighbourhood of $f(a)$. Then there is a neighbourhood $U$ of $a$ such that
$f(U) \subset V$. But since $x_n \to a$, there exists $N \in \mathbb{N}$ such that for $n > N$, $x_n \in U$
and hence $f(x_n) \in V$. Thus, for every neighbourhood $V$ of $f(a)$, there exists $N$ with
$f(x_n) \in V$ for $n > N$. That is, $f(x_n) \to f(a)$.
For the converse, we assume $f$ is not continuous at $a$. Then, there exists a neighbourhood
$V$ of $f(a)$, such that there is no neighbourhood $U$ of $a$ with $f(U) \subset V$. For each $n \in \mathbb{N}$,
then, there exists $x_n \in B_X(a, \frac{1}{n})$ such that $f(x_n) \notin V$. Since $\frac{1}{n} \to 0$ we have $x_n \to a$.
Thus, $(x_n)$ is a sequence in $X$ converging to $a$, yet $f(x_n)$ does not converge to $f(a)$.
30.3 Example. Let $f : \mathbb{R} \to \mathbb{R}$ be the indicator function of the rationals $1_{\mathbb{Q}}$:
$$f(x) = 1_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \notin \mathbb{Q} \end{cases}$$
Then $f$ is discontinuous at each point of $\mathbb{R}$.
One way to prove this is by the sequential criterion. If $a \in \mathbb{Q}$, we use the fact that $\mathbb{Q}^c$
is dense in $\mathbb{R}$ to obtain a sequence $(x_n)$ of irrationals converging to $a$, but then $f(x_n) = 0 \to 0$ and $f(a) = 1$, so $(f(x_n))$ does not converge to $f(a)$. So $f$ is not continuous at $a$.
On the other hand, if $a \notin \mathbb{Q}$ then (since $\mathbb{Q}$ is dense in $\mathbb{R}$) there is a sequence $(x_n)$ in
$\mathbb{Q}$ with $x_n \to a$ but $f(x_n) = 1$ which doesn’t converge to $0 = f(a)$ and again, by the
sequential criterion, $f$ is not continuous at $a$.
Actually, a slight change in wording of the proof we have given shows that f does not
even have a limit at any point of Q.
30.4 Example (Dirichlet’s function). We now give an example of a function on $[0, 1]$ which
is continuous exactly on the irrationals of this interval. We put, for $x \in [0, 1]$,
$$f(x) = \begin{cases} \dfrac{1}{n} & \text{if } x = \dfrac{m}{n} \text{ rational in lowest terms} \\ 0 & \text{if } x \text{ is irrational.} \end{cases}$$
(Lowest terms are necessary to make the function “well-defined”, that is, to have a unique
value for each $x$.)
[Figure: graph of the Dirichlet function on $[0, 1]$, with labelled points $f(0) = 1$, $f(1) = 1$, $f(1/2) = 1/2$, $f(1/3) = 1/3$, $f(2/3) = 1/3$, and two points at height $1/4$.]
We will prove that for each $a \in [0, 1]$, $\lim_{x \to a} f(x) = 0$. Since $f(a) = 0$, if $a$ is
irrational, and $f(a) \ne 0$, if $a$ is rational, this shows $f$ is continuous at each irrational and
discontinuous at each rational.
Fix $\varepsilon > 0$. By the Archimedean property, there exists $k$ such that $1/k < \varepsilon$. Now, let $F$
be the set of rationals in $[0, 1]$ with denominators $< k$. This set is finite, so there exists a
$\delta > 0$ such that $B'(a, \delta) \cap F = \emptyset$. Thus, for each rational $\frac{m}{n}$ (in lowest terms) in $B'(a, \delta) \cap [0, 1]$, $n \ge k$
and hence,
$$\Bigl| f\Bigl(\frac{m}{n}\Bigr) - 0 \Bigr| = \Bigl| \frac{1}{n} - 0 \Bigr| = \frac{1}{n} \le \frac{1}{k} < \varepsilon.$$
For each irrational $x$ in $B(a, \delta) \cap [0, 1]$, $f(x) = 0$, so again $|f(x) - 0| < \varepsilon$. Thus, for all
$x \in [0, 1]$, $0 < |x - a| < \delta$ implies $|f(x) - 0| < \varepsilon$, so that $\lim_{x \to a} f(x) = 0$.
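The construction of $\delta$ in this argument can be carried out explicitly. The sketch below is restricted to rational inputs, represented exactly with Python's `fractions.Fraction`, because floating-point numbers cannot distinguish rationals from irrationals; at irrational points the function is simply 0. The helper names `f` and `delta_for` are ours, not part of the text.

```python
from fractions import Fraction

def f(x):
    """Value of the function at a rational x = m/n in lowest terms: 1/n."""
    return Fraction(1, Fraction(x).denominator)

def delta_for(a, eps):
    """Return a delta > 0 such that f(q) < eps for every rational q in [0, 1]
    with 0 < |q - a| < delta, following the proof in the text."""
    a, eps = Fraction(a), Fraction(eps)
    k = int(1 / eps) + 1  # smallest integer k with 1/k < eps
    # F: rationals in [0, 1] with denominator < k (a finite set).
    F = {Fraction(m, n) for n in range(1, k) for m in range(n + 1)}
    # Distance from a to the nearest point of F other than a itself;
    # if there is no such point, any delta works and we return 1.
    return min((abs(q - a) for q in F if q != a), default=Fraction(1))

if __name__ == "__main__":
    a, eps = Fraction(1, 3), Fraction(1, 10)
    print("delta =", delta_for(a, eps))
```

With this $\delta$, every rational strictly within distance $\delta$ of $a$ (other than $a$ itself) has denominator at least $k$, which is exactly the step the proof relies on.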
The sum, product, and quotient of continuous functions are continuous:
30.5 Theorem. Let $f, g$ be functions on $X$ to $\mathbb{R}$. If $f, g$ are continuous at $a$, then
(a) the functions $f + g$ and $fg$ are continuous at $a$;
(b) $f/g$ is continuous at $a$, provided $g(a) \ne 0$.
We recall that $f/g$ is defined on the set of those $x \in X$ for which $g(x) \ne 0$.
Proof. This follows from the corresponding theorem about limits. For example,
$$\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = f(a) + g(a) = (f + g)(a).$$
30.6 Examples.
(1) Each constant function defined by $f(x) = c$ for all $x \in \mathbb{R}$ is continuous on $\mathbb{R}$.
Indeed, for each $\varepsilon > 0$, we can take any $\delta > 0$, and get $|f(x) - f(a)| = 0 < \varepsilon$ for
all $x$, $a$, in particular when $|x - a| < \delta$.
(2) The identity function $i(x) = x$ is continuous. For a given $\varepsilon > 0$, the choice $\delta = \varepsilon$
satisfies the definition:
$$|x - a| < \delta \implies |i(x) - i(a)| = |x - a| < \varepsilon.$$
(3) Let $p$ be a polynomial function. Then there exist $n \in \mathbb{N}$ and constants $a_0, a_1, \ldots, a_n$
such that
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$
Thus, using (1) and (2) and induction with the fact that the sum and product of
continuous functions is continuous, we see that $p$ is continuous on $\mathbb{R}$.
(4) A rational function is by definition the quotient of two polynomial functions, and
is thus continuous on its domain.
The composition of two continuous functions is continuous.
30.7 Theorem. Let $X, Y, Z$ be metric spaces. Let $f : X \to Y$ and $g : Y \to Z$. If $f$ is
continuous at $a \in X$ and $g$ is continuous at $f(a)$ then $g \circ f$ is continuous at $a$.
Proof. Let $W$ be a neighbourhood of $g \circ f(a)$. Since $g$ is continuous at $f(a)$, there is a
neighbourhood $V$ of $f(a)$ such that
$$g(V) \subset W.$$
But $f$ is continuous at $a$, so there exists a neighbourhood $U$ of $a$ such that
$$f(U) \subset V.$$
Hence,
$$g \circ f(U) = g(f(U)) \subset g(V) \subset W,$$
as required.
A function is continuous iff the inverse image of an open set is open.
30.8 Theorem. Let $X$ and $Y$ be metric spaces and $f : X \to Y$. Then $f$ is continuous iff
for each $G$ open in $Y$, $f^{-1}(G)$ is open in $X$.
Proof. Suppose $f$ is continuous and let $G$ be open in $Y$. Let $a \in f^{-1}(G)$. Then $f(a) \in G$,
so there is an open ball $V$ centered at $f(a)$ contained in $G$. But then, since $f$ is
continuous at $a$, there is a neighbourhood $U$ of $a$ such that $f(U) \subset V$. Thus, $f(U) \subset G$,
and hence $U \subset f^{-1}(G)$. Thus each point of $f^{-1}(G)$ is an interior point, so $f^{-1}(G)$ is open.
Conversely, if the condition is satisfied and $V$ is an open ball around $f(a)$, then $f^{-1}(V)$
is an open set containing $a$, so there is an open ball $U$ about $a$ with $U \subset f^{-1}(V)$, in other
words, with $f(U) \subset V$, showing $f$ is continuous at $a$.
30.9 Corollary. For $f : X \to Y$, $f$ is continuous iff the inverse image of each closed set
in $Y$ is closed in $X$.
Discontinuities of a monotone function. A real function $f$ defined on an open interval
$I$ is said to have a jump discontinuity at $c$ if $f(c+)$ and $f(c-)$ exist in $\mathbb{R}$ and $f(c-) \ne f(c+)$.
The difference $f(c+) - f(c-)$ is called the jump of $f$ at $c$. The only kind of
discontinuity that a monotone function on an open interval can have is a jump discontinuity.
(Why?)
If the interval $I$ is not open, the jump at an interior point is defined in the same way, but
if $c$ is an endpoint of $I$, one takes the jump to be $f(c+) - f(c)$ if $c$ is the left endpoint of
$I$, and $f(c) - f(c-)$ if $c$ is the right endpoint.
30.10 Theorem. If $I$ is an interval of $\mathbb{R}$ and $\varphi : I \to \mathbb{R}$ is monotone, then $\varphi$ has at most a
countable number of discontinuities.
Proof. We may assume $\varphi$ is increasing. Then, for all $x \in \operatorname{int}(I)$, $\varphi(x-)$ and $\varphi(x+)$ exist
and
$$\varphi(x-) \le \varphi(x) \le \varphi(x+).$$
For $x$ an endpoint, we can only say $\varphi(x) \le \varphi(x+)$ or $\varphi(x-) \le \varphi(x)$.
Now, if $x \in \operatorname{int}(I)$ is a point of discontinuity, put $I_x = (\varphi(x-), \varphi(x+))$. If $x$ is an
endpoint of $I$, use $I_x = (\varphi(x), \varphi(x+))$ or $(\varphi(x-), \varphi(x))$. Then $I_x \ne \emptyset$, so we may
choose a rational number $r(x) \in I_x$, by density. If $x_1 < x_2$, then $\varphi(x_1+) \le \varphi(x_2-)$, so
$I_{x_1} \cap I_{x_2} = \emptyset$, and hence $r(x_1) \ne r(x_2)$. This shows that the set of discontinuities of $\varphi$
in $I$ is in one-to-one correspondence with a subset of $\mathbb{Q}$, so is countable.
30.1. Discuss how the composition of continuous functions theorem could lead to a formal substitution: if $f(x) \to b$ as $x \to a$, then
$$\lim_{x\to a} g(f(x)) = \lim_{y\to b} g(y).$$
30.2. Let $X, Y, Z$ be metric spaces, $A \subset X$, $B \subset Y$. Let $f : A \to B$ and $g : B \to Z$. If $f(x) \to b$
as $x \to a$ and $g(y) \to c$ as $y \to b$, then $\lim_{x\to a} g(f(x)) = \lim_{y\to b} g(y) = c$ if and
only if either (1) $g(b) = c$ (so $g$ is continuous at $b$) or (2) there exists a deleted neighbourhood
$U'$ of $a$ such that $f(x) \ne b$ for $x \in U'$. Note that (2) holds in particular if $f$ is one-to-one on
$A$ onto $B$.
30.3. The function $g : \mathbb{R} \to \mathbb{R}$ with
$$g(x) = \begin{cases} x, & x \in \mathbb{Q} \\ x^2, & x \notin \mathbb{Q} \end{cases}$$
is continuous at exactly 2 points.
30.4. Give an example, with proof, of a function $f : \mathbb{R} \to \mathbb{R}$ which is continuous at exactly 3 points.
30.5. Prove that the function defined by
$$f(x) = \begin{cases} x \cos(1/x), & x \ne 0 \\ 0, & x = 0 \end{cases}$$
is continuous everywhere.
30.6. Let $f : X \subset S \to \mathbb{R}^n$, where $S$ is a metric space. Find an analogue of the Cauchy condition
for convergence of $f$ at a point $c \in S$. Using your definition, prove that if $f$ is Cauchy at $c$, then
$f$ converges at $c$.
30.7. (Removable discontinuity) Let $f : X \to Y$, $a \in X$ and $\lim_{x\to a} f(x) = b \ne f(a)$, so
that $f$ is discontinuous at $a$. Then the function $\bar{f}$ defined by $\bar{f}(x) = f(x)$ if $x \ne a$, and
$\bar{f}(a) = b$, is continuous at $a$. (The discontinuity has been “removed”.)
If $f$ is only defined on $X \setminus \{a\}$, the same construction extends $f$ to $X$ in such a way that it
is continuous at $a$. (In this case no discontinuity has been removed, but some authors still call $a$
a removable discontinuity.)
30.8. Let $\pi_i : \mathbb{R}^m \to \mathbb{R}$ be defined by $\pi_i(x) = \pi_i(x_1, \dots, x_m) = x_i$. (This is called the $i$-th
coordinate map, or the projection onto the $i$-th coordinate.) This map is continuous because
$|\pi_i(x) - \pi_i(y)| \le |x - y|$. A function $f = (f_1, \dots, f_m)$ from a metric space to $\mathbb{R}^m$
is continuous at $c$ if and only if the composites $f_i = \pi_i \circ f$ are continuous at $c$, for each
$i = 1, \dots, m$.
30.9. Assume known that the functions $x \mapsto e^x$ and $\sin$ are continuous. Prove the map $g$ on $\mathbb{R}^2$ to $\mathbb{R}$,
defined by $g(x_1, x_2) = e^{x_1^2} \sin(1/(x_2^2 + 1)) + x_2$, is continuous everywhere. [Use continuity
of the coordinate maps.]
30.10. Prove the function f W R2 ! R2 defined by
2
f .x1 ; x2 / D .x12 sin.x1 x22 /; e x2 /
is continuous at each point of R2 .
30.11. Discuss the continuity at $(0, 0)$ of
$$g : x = (x_1, x_2) \in \mathbb{R}^2 \mapsto \begin{cases} \dfrac{\sin(x_1^3 + x_2^2)}{x_1^2 + x_2^2}, & x \ne 0 \\ 1, & x = 0. \end{cases}$$
31. CONTINUITY AND COMPACTNESS
The continuous image of a compact set is compact:
31.1 Theorem. Let $f : X \to Y$ be a continuous function from one metric space $X$ to
another one $Y$. If $K$ is a compact subset of $X$, then $f(K)$ is a compact subset of $Y$.
Proof. Let $\mathcal{U}$ be a family of open sets covering $f(K)$. For each $U \in \mathcal{U}$, $f^{-1}(U)$ is open in
$X$. Moreover, the family $\{f^{-1}(U) : U \in \mathcal{U}\}$ covers $K$. Indeed,
$$\bigcup_{U \in \mathcal{U}} U \supset f(K),$$
so
$$\bigcup_{U \in \mathcal{U}} f^{-1}(U) = f^{-1}\Bigl(\bigcup_{U \in \mathcal{U}} U\Bigr) \supset K.$$
Since $K$ is compact, there is a finite subfamily $\{f^{-1}(U_1), \dots, f^{-1}(U_n)\}$ which also covers
$K$:
$$f^{-1}(U_1) \cup \cdots \cup f^{-1}(U_n) \supset K.$$
That is,
$$f^{-1}(U_1 \cup \cdots \cup U_n) \supset K,$$
and so
$$U_1 \cup \cdots \cup U_n \supset f(K).$$
Thus, each family of open sets covering $f(K)$ has a finite subfamily which also covers
$f(K)$, as required.
31.2 Extreme Value Theorem. Each continuous real-valued function on a non-empty
compact set assumes a maximum and a minimum value.
Notice that this result needs no reference to derivatives.
Proof. Let $K$ be compact and non-empty and let $f$ be continuous on $K$ to $\mathbb{R}$. Then $f(K)$
is compact and non-empty, hence it has a minimum and a maximum.
This is actually all there is to the theorem, but let’s put it in familiar terms: if $y_1 = \min f(K)$
and $y_2 = \max f(K)$, then $y_1 \in f(K)$, so there exists $x_1 \in K$ with $f(x_1) = y_1$,
and similarly there exists $x_2 \in K$ with $f(x_2) = y_2$. Finally, for all $x \in K$, we obtain
$f(x) \in f(K)$, so $f(x_1) \le f(x) \le f(x_2)$.
By the way, avoid saying “$x_1$ is the minimum” here. (It is $f(x_1)$ that is the minimum.)
Instead, say “$x_1$ is a minimizer of $f$”, or “$f$ assumes a minimum at $x_1$”. Similarly, $x_2$
is a maximizer of $f$.
31.1. If $X$ is a compact metric space and $f : X \to Y$ is a continuous bijection, then $f^{-1}$ is also
continuous.
31.2. Let $[a, b]$ be a closed interval of $\mathbb{R}$ and $\gamma : [a, b] \to \mathbb{R}^n$ be one-to-one and continuous. Then
$\gamma^{-1}$ is also continuous (on $\gamma([a, b])$).
31.3. The previous result also holds for open intervals $(a, b)$, but not half open ones $(a, b]$.
These problems 31.1, 31.2, 31.3 are important for the study of curves and their length.
32. THE INTERMEDIATE VALUE THEOREM
Here is a result on which much of the application of Calculus is based.
32.1 Intermediate Value Theorem. Let $I$ be an interval, $f : I \to \mathbb{R}$ be continuous and
let $a, b \in I$, with $a < b$. If $y$ is a point strictly between $f(a)$ and $f(b)$, then there exists
$c \in (a, b)$ such that $f(c) = y$.
Proof. We may assume $I = [a, b]$, since the restriction of a continuous function is continuous. We may assume $f(a) < y < f(b)$. (For, if $f(a) > y > f(b)$, we may replace $f$
by $-f$ and $y$ by $-y$.)
Let $A = f^{-1}((-\infty, y))$ and $C = f^{-1}((-\infty, y])$. Then $A$ is open in $[a, b]$ and $C$ is
closed in $[a, b]$, since $(-\infty, y)$ is open and $(-\infty, y]$ is closed and $f$ is continuous. Now
$$A = \{x : f(x) < y\}, \qquad C = \{x : f(x) \le y\}.$$
Since $f(a) < y$, $A$ is not empty. Since $A \subset [a, b]$, $A$ is bounded above. Thus we may set
$c = \sup A$. Then $c \in \operatorname{cl} A$, $A \subset C$ and $C$ is closed, so $c \in C$. That is,
$$f(c) \le y.$$
We know $c < b$, since $f(b) > y$. Now, suppose $f(c) \ne y$. Then $f(c) < y$, so $c \in A$, which is
open in $[a, b]$. Thus, there exists $\delta > 0$ such that $[a, b] \cap B(c; \delta) \subset A$. We know $c + \delta \le b$,
since $f(b) > y$, so that
$$[c, c + \delta) \subset A.$$
If we let $d$ be any point of $(c, c + \delta)$ we have $c < d \in A$, contradicting the fact that $c$ is
an upper bound of $A$ and establishing that $f(c) = y$.
32.2 Remark. Of course, if in the hypothesis of the above theorem, we just have $y$ between
$f(a)$ and $f(b)$ instead of strictly between, then there still is $c \in [a, b]$ with $f(c) = y$.
Indeed, if $y$ is between them but not strictly between them, then either $f(a) = y$ or
$f(b) = y$.
32.3 Example. Every 5th degree real polynomial has at least one real root.
We use the limit behaviour of such polynomials at the infinities.
If $P(x) = a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$, there are 2 cases, $a_5 > 0$ and
$a_5 < 0$. Without loss of generality we may assume $a_5 > 0$, since $P(x) = 0$ if and only if
$-P(x) = 0$.
Then,
$$P(x) = a_5 x^5 \left( 1 + \frac{a_4}{a_5 x} + \frac{a_3}{a_5 x^2} + \frac{a_2}{a_5 x^3} + \frac{a_1}{a_5 x^4} + \frac{a_0}{a_5 x^5} \right),$$
which converges to $+\infty$ as $x \to \infty$ and converges to $-\infty$ as $x \to -\infty$. In particular,
there exists $s$ with $P(s) > 0$ and $t$ with $P(t) < 0$. But then, since $P$ is continuous,
by the Intermediate Value Theorem, there exists $c$ between $s$ and $t$ with $P(c) = 0$, as
required.
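The argument of Example 32.3 can be tried numerically. The following is a minimal sketch, not part of the original notes: the quintic `P`, the helper names `find_bracket` and `bisect`, and the tolerance are all assumptions chosen for illustration. It first locates points $t < s$ with $P(t) < 0 < P(s)$, as the limit behaviour guarantees when $a_5 > 0$, and then halves the interval repeatedly while keeping a sign change inside it.

```python
def P(x):
    # Illustrative quintic with leading coefficient a5 = 1 > 0 (an assumed example).
    return x**5 + x - 1


def find_bracket(p):
    # Double r until p(r) > 0 and p(-r) < 0. For a5 > 0 this terminates,
    # because p(x) -> +infinity as x -> +infinity and p(x) -> -infinity as x -> -infinity.
    r = 1.0
    while not (p(r) > 0 and p(-r) < 0):
        r *= 2
    return -r, r


def bisect(p, t, s, tol=1e-12):
    # Invariant: t < s and p(t) < 0 < p(s), so a zero lies between t and s.
    while s - t > tol:
        mid = (t + s) / 2
        value = p(mid)
        if value == 0:
            return mid
        if value < 0:
            t = mid
        else:
            s = mid
    return (t + s) / 2


t, s = find_bracket(P)
root = bisect(P, t, s)
print(root, P(root))
```

Bisection only uses continuity and the sign change, which is exactly the information the Intermediate Value Theorem needs; it says nothing about how many roots there are.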
Of course, a slight modification will show that:
32.4 Theorem. Every odd degree real polynomial has a real root.
Here is a common application of the Intermediate Value Theorem.
32.5 Corollary. Let $f : [a, b] \to [a, b]$ be continuous. Then $f$ has a fixed point; that is,
there exists $x \in [a, b]$ with $f(x) = x$.
Proof. Let $g(x) = f(x) - x$ for all $x \in [a, b]$. We have to show there exists $c$ with
$g(c) = 0$. But,
$$f(a) \in [a, b] \implies g(a) = f(a) - a \ge 0$$
and
$$f(b) \in [a, b] \implies g(b) = f(b) - b \le 0.$$
Thus, by the Intermediate Value Theorem, there exists $c \in [a, b]$ with $g(c) = 0$, that is, $f(c) = c$. (See the
above remark.)
32.6 Note. This proof indicates a general technique. Given 2 functions, $f$ and $g$, to prove
there exists a point $x$ with $f(x) = g(x)$, define a new function $h = f - g$ and prove there
exists $x$ with $h(x) = 0$.
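As a small illustration of this technique, here is a sketch under stated assumptions: the choice of $\cos$ on $[0, 1]$ and the helper name `fixed_point_by_sign_change` are not from the notes. Since $\cos$ maps $[0, 1]$ into itself, the difference $g(x) = \cos x - x$ is $\ge 0$ at the left endpoint and $\le 0$ at the right endpoint, and a zero of $g$ is a fixed point of $\cos$.

```python
import math


def fixed_point_by_sign_change(f, a, b, tol=1e-12):
    # g(x) = f(x) - x satisfies g(a) >= 0 >= g(b) whenever f maps [a, b] into itself.
    def g(x):
        return f(x) - x

    if g(a) < 0 or g(b) > 0:
        raise ValueError("f does not appear to map [a, b] into itself")
    if g(a) == 0:
        return a
    if g(b) == 0:
        return b
    lo, hi = a, b  # invariant: g(lo) > 0 > g(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        gm = g(mid)
        if gm == 0:
            return mid
        if gm > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = fixed_point_by_sign_change(math.cos, 0.0, 1.0)
print(c, math.cos(c))
```

The same helper works for any continuous self-map of a closed bounded interval; only the sign information at the endpoints is used.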
We recall that J is an interval if and only if it contains all numbers between any 2 of its
points.
Theorem (12.14. Characterization of interval). A set J in R is an interval if and only if
whenever y1 and y2 belong to J and y1 < y < y2 , then y also belongs to J .
Proof. The proof is an exercise, which you have probably already done. The left endpoint
of J is inf J and the right endpoint is sup J .
A continuous image of an interval is an interval.
32.7 Corollary. If I is an interval of R and f is a continuous real-valued function whose
domain contains I , then f .I / is an interval.
Proof. Because of the characterization of intervals, this is almost a restatement of the Intermediate Value Theorem.
Indeed, let $I$ be an interval contained in the domain of $f$, $J = f(I)$, and let $y_1, y_2 \in J$
with $y_1 < y < y_2$. Then, there exist $x_1, x_2 \in I$ with $y_1 = f(x_1)$, $y_2 = f(x_2)$ and
$f(x_1) < y < f(x_2)$, so by the Intermediate Value Theorem, there exists $x$ between $x_1$
and $x_2$ with $y = f(x)$.
Every continuous image of a compact interval is a compact interval.
32.8 Corollary. If I is a compact interval of R and f is a continuous real-valued function
whose domain contains I , then f .I / is a compact interval.
Proof. The continuous image of a compact set is compact and the continuous image of an
interval is an interval, so if I is a compact interval, and f is continuous on I , then f .I / is
a compact interval.
To visualize this, notice that $f(I) = [m_1, m_2]$, where $m_1$ is the minimum value and $m_2$
is the maximum value of $f$ on $I$.
32.9 Remark (Reminder). A compact interval is the same as a closed bounded interval
$[a, b]$. This language is a way of emphasizing its properties. As you know, one includes
sets of the form $[a, \infty)$, $(a, \infty)$, $(-\infty, b)$, $(-\infty, b]$, $(-\infty, \infty)$ as intervals. Thus, $[a, \infty)$
is a closed set, which is an interval, but is not compact. You have to look at the context
to understand what someone means when he or she talks about a “closed interval”. But a
compact interval must be closed and bounded, so there is no ambiguity: it must be of the
form $[a, b]$, where $a \le b$ are real numbers.
A function which satisfies the conclusion of the Intermediate Value Theorem is said to
have the Intermediate Value Property or Darboux Property, because Darboux proved
derivatives have this property. A function with the Darboux property is also said to be a
Darboux function. A function can have this property without being continuous.
32.10 Example (A discontinuous function with the Intermediate Value Property). The
function defined on $\mathbb{R}$ by $f(x) = \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$ is not continuous at 0,
but attains all values between any two of its values. Indeed, if $[a, b]$ with $a < b$ contains
0, then $f$ takes on all values between $-1$ and $1$, and if it does not contain 0, then $f$ takes
all values between $f(a)$ and $f(b)$ by the IVT, because $f$ is continuous on that interval.
Notice that the graph of the function in the above example wiggles a lot. If it doesn’t
wiggle too much, a function with the IVP must be continuous. The simplest case is that of
a monotone function.
32.11 Theorem. If $f : I \to \mathbb{R}$ is monotone on the interval $I$ and if $f(I)$ is an interval,
then $f$ is continuous on $I$.
Proof. The easiest way to see this is to use left and right limits. We may assume $f$ is increasing. Let $c \in I$. If $c$ is not the
left endpoint of $I$, we know $f(c-) = \lim_{x\to c-} f(x) = \sup_{x<c} f(x)$. But, since $f(I)$ is
an interval, $a \in I$ with $a < c$ implies $f$ takes on all values between $f(a)$ and $f(c)$. Thus
$\{f(x) : x < c\}$ is an interval with right endpoint $f(c)$, so its supremum is $f(c)$. Thus,
$f(c-) = f(c)$. Similarly, if $c$ is not the right endpoint of $I$, $f(c+) = f(c)$. Thus, $f$ is
both right and left continuous, so is continuous at $c$.
Note, by the way, that the full force of the Intermediate Value Property was not assumed
here. But, because of the monotonicity it turns out to hold anyway. It is easy to construct a
function f such that both I and f .I / are intervals of R, but f doesn’t have the IVP (and,
of course, is not continuous).
32.12 Corollary. If $I$ is an interval and $f$ is continuous and strictly increasing on $I$ then
$f^{-1}$ is continuous and strictly increasing on $f(I)$.
Proof. If $I$ is an interval and $f$ is continuous, then $f(I)$ is also an interval. If $f$ is also
strictly increasing, then so is $f^{-1}$. Since $f^{-1}$ is defined on an interval and has an interval
as its range, it must be continuous, by the previous theorem.
32.13 Example. Existence of roots again. Let $f(x) = x^n$, defined for $x \in [0, \infty)$. Then
$f$ is continuous and strictly increasing. Thus, $f$ has an inverse, which therefore must also
be strictly increasing and continuous. By the IVT, the range of $f$ is an interval. Since
$x^n \to \infty$ as $x \to \infty$, the interval must be $[0, \infty)$. Thus the inverse of $f$ is defined on $[0, \infty)$. This is
the function $y \mapsto y^{1/n}$.
It is interesting that the only way a continuous function on an interval can have an
inverse is if it is strictly monotone:
32.14 Theorem. If $f$ is continuous and injective on an interval $I$, then $f$ is either strictly
increasing or strictly decreasing on $I$.
Proof. Suppose $f$ is continuous and injective on the interval $I$.
Claim. If $a_1 < a_2 < a_3$ then $f$ is strictly monotone on $\{a_1, a_2, a_3\}$; that is, either
$f(a_1) < f(a_2) < f(a_3)$ or $f(a_1) > f(a_2) > f(a_3)$.
Indeed, suppose $a_1 < a_2 < a_3$ and $f(a_1) < f(a_2)$. Then we must also have $f(a_2) < f(a_3)$.
Otherwise, $f(a_2) > f(a_3)$ and if we choose any $y$ strictly between $\max\{f(a_1), f(a_3)\}$
and $f(a_2)$, the Intermediate Value Theorem yields points $c_1 \in (a_1, a_2)$ and $c_2 \in (a_2, a_3)$
with $f(c_1) = f(c_2) = y$, contradicting injectivity. The case $f(a_1) > f(a_2)$ is similar,
establishing the claim.
Now suppose $x_1 < x_2$ with $f(x_1) < f(x_2)$ and $x < x'$ are any two points of $I$. We
have to show that $f(x) < f(x')$ as well. Let $A$ be the set $\{x_1, x_2, x, x'\}$.
It is possible that $A$ consists of 2 or 3 points (for example we could have $x_1 = x$). The
first of these cases is trivial. The second case we have handled by our claim.
The final case is that $A$ has 4 distinct points. If so, order them in increasing order
$a_1 < a_2 < a_3 < a_4$. Since $f$ is monotone on sets of 3 elements, and since $\{a_1, a_2, a_3\}$
and $\{a_2, a_3, a_4\}$ have the pair $\{a_2, a_3\}$ in common, $f$ is monotone on the whole of $A$.
Since $f(x_1) < f(x_2)$, $f$ is increasing on $A$ and we have $f(x) < f(x')$ as required.
32.1. Prove that the equation $\sqrt{x} - 5 = 1/(x + 3)$ has at least one real solution.
32.2. Prove that there is a real number $x$ with $e^x = x^2$.
32.3. Prove there exists a point $x$ such that $\sin(x - 1) = x$. (You are allowed the usual properties of
the sine function, including continuity.)
32.4. Find intervals I and J of R and a function f on I onto J , but which doesn’t have the IVP.
32.5. Find a bijection of Œ0; 1 onto itself which has no points of continuity or prove this impossible.
32.6. Prove the Intermediate Value Theorem using a bisection argument. Hint: If y is between f .a/
and f .b/ and c is the point .a C b/=2, then y is either between f .a/ and f .c/ or f .c/ and
f .b/. This allows one to construct a decreasing sequence of closed intervals. The one point x in
the intersection of these satisfies f .x/ D y.
The following two notes give the true content of the Intermediate Value Theorem.
32.7. Two subsets $A$ and $B$ of a metric space $S$ are called separated if $\operatorname{cl}(A) \cap B$ and $A \cap \operatorname{cl}(B)$ are
both empty; that is, $A$ contains no closure points of $B$ and $B$ contains no closure points of $A$. A
set $E \subset S$ is called connected if there are no non-empty separated sets $A$, $B$ with $E = A \cup B$.
A subset of $\mathbb{R}$ is connected if and only if it is an interval.
32.8. If $f$ is a continuous function from one metric space $X$ to another, $Y$, then the image of any
connected set in $X$ is connected.
32.9. A set $E$ in a metric space is called pathwise connected if for every $p$ and $q$ in $E$, there exists a
continuous function $f$ defined on an interval $[a, b]$ with $f(a) = p$, $f(b) = q$ and $f(x) \in E$
for all $x \in [a, b]$. This is interpreted as saying “we can draw a continuous path in $E$ from any
point to any other”.
Every pathwise connected set is connected.
32.10. A set $A$ in $\mathbb{R}^n$ (or other vector space) is called convex if $a, b \in A$ implies $A$ contains the line
segment $\{(1 - t)a + tb : 0 \le t \le 1\}$. Each convex set in $\mathbb{R}^n$ is (pathwise connected, hence)
connected.
32.11. In R a set is convex iff it is an interval.
33. BANACH’S CONTRACTION MAPPING THEOREM
The result of the title, also known as Banach’s Fixed Point Theorem, is valid in a general complete metric space, such as Rn . It provides conditions under which a mapping
has a fixed point; its proof provides an algorithm for finding that point. Thus, the result
complements the Intermediate Value Theorem in obtaining solutions to equations.
If $X$ is a metric space and $f : X \to X$, $f$ is called a contraction mapping if there is
a number $c < 1$ with $d(f(x), f(y)) \le c\, d(x, y)$, for all $x, y \in X$.
33.1 Banach’s Fixed Point Theorem. Let $X$ be a complete metric space and let $f : X \to X$
be a contraction mapping. Then, there exists exactly one point $x \in X$ such that
$f(x) = x$.
Proof. Define a sequence of points $(x_n)$ as follows. Let $x_0$ be any point of $X$ and for
$n = 0, 1, 2, \dots$, let $x_{n+1} = f(x_n)$. Choose $c < 1$ so that $d(f(x), f(y)) \le c\, d(x, y)$, for
all $x, y \in X$. For $n \ge 1$, we have
$$d(x_n, x_{n+1}) = d(f(x_{n-1}), f(x_n)) \le c\, d(x_{n-1}, x_n).$$
By induction, we obtain
$$d(x_n, x_{n+1}) \le c^n d(x_0, x_1).$$
If $n < m$, this yields
$$d(x_n, x_m) \le \sum_{i=n}^{m-1} d(x_i, x_{i+1}) \le (c^n + c^{n+1} + \cdots + c^{m-1})\, d(x_0, x_1) \le \frac{c^n}{1 - c}\, d(x_0, x_1).$$
Since $c^n \to 0$, $(x_n)$ is a Cauchy sequence. Since $X$ is complete, $(x_n)$ converges to some
$x \in X$. Since $f$ is a contraction mapping, it is (uniformly) continuous. Hence,
$$f(x) = \lim_n f(x_n) = \lim_n x_{n+1} = x,$$
as required.
The uniqueness is immediate: if $x$ and $y$ are distinct fixed points, then
$$d(x, y) = d(f(x), f(y)) \le c\, d(x, y) < d(x, y),$$
a contradiction.
33.2 Example. The map $f : [1, +\infty) \to [1, +\infty)$ defined by $f(x) = 1 + 1/(1 + x)$ is a
contraction. What is its fixed point?
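The iteration $x_{n+1} = f(x_n)$ from the proof of the fixed point theorem can be run on this map. The sketch below is an illustration only: the helper name `iterate`, the starting point $x_0 = 1$ and the tolerance are assumptions. It uses the contraction constant $c = 1/4$, which is valid here because $|f(x) - f(y)| = |x - y| / ((1 + x)(1 + y)) \le |x - y| / 4$ for $x, y \ge 1$, and it stops once the bound $\frac{c^n}{1-c}\, d(x_0, x_1)$ on the distance from $x_n$ to the fixed point (obtained by letting $m \to \infty$ in the estimate of the proof) drops below the tolerance.

```python
def f(x):
    # The map of Example 33.2 on [1, +infinity).
    return 1 + 1 / (1 + x)


def iterate(f, x0, c, tol=1e-12, max_steps=1000):
    # Returns (x_n, n, bound), where bound = c**n / (1 - c) * d(x0, x1)
    # dominates the distance from x_n to the fixed point.
    x = x0
    d01 = abs(f(x0) - x0)
    for n in range(max_steps):
        bound = c**n / (1 - c) * d01
        if bound < tol:
            return x, n, bound
        x = f(x)
    return x, max_steps, c**max_steps / (1 - c) * d01


x, steps, bound = iterate(f, x0=1.0, c=0.25)
print(x, steps, bound, abs(f(x) - x))
```

Because the bound depends only on $c$ and the first step $d(x_0, x_1)$, the number of iterations needed for a given accuracy can be estimated before iterating at all.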
34. UNIFORM CONTINUITY
Let X and Y be metric spaces, with distance functions dX and dY . (We often write
simply d , in both cases, if there is no danger of confusion.)
Let $f$ be a function on $X$ to $Y$. Then $f$ is continuous on $X$ iff it is continuous at each
point $a$ of $X$, that is:
$$\forall a \in X,\ \forall \varepsilon > 0,\ \exists \delta > 0 \text{ such that } \forall x \in X,\ d(x, a) < \delta \implies d(f(x), f(a)) < \varepsilon.$$
For example, let $X = \mathbb{R}$ and $f(x) = x^2$, for $x \in X$. To show that $f$ is continuous, one
takes $a \in X$, lets $\varepsilon > 0$ and calculates for $x \in X$:
$$d(f(x), f(a)) = |x^2 - a^2| = |x - a|\,|x + a|.$$
If we assume $|x - a| < 1$ we will have $|x + a| < 1 + 2|a|$, so
$$|f(x) - f(a)| \le |x - a|\,(1 + 2|a|),$$
which is $< \varepsilon$, provided
$$|x - a| < \frac{\varepsilon}{1 + 2|a|},$$
so we can take $\delta_a = \min\{1, \varepsilon/(1 + 2|a|)\}$, and get
$$|x - a| < \delta_a \text{ implies } |f(x) - f(a)| < \varepsilon.$$
Notice that $\delta = \delta_a$ depends on $a$: for larger $|a|$ we need a smaller $\delta_a$.
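The dependence of $\delta_a$ on $a$ can be checked numerically. The following is a sketch only: the helper names `delta_a` and `check`, the value $\varepsilon = 0.1$ and the random sampling are assumptions made for illustration, and a finite sample is evidence, not a proof. It samples points with $|x - a| < \delta_a$ and confirms $|x^2 - a^2| < \varepsilon$ for each of them.

```python
import random


def delta_a(a, eps):
    # The choice from the text: delta_a = min{1, eps / (1 + 2|a|)}.
    return min(1.0, eps / (1 + 2 * abs(a)))


def check(a, eps, trials=10000, seed=0):
    # Sample x with |x - a| < delta_a and test |x^2 - a^2| < eps.
    rng = random.Random(seed)
    d = delta_a(a, eps)
    for _ in range(trials):
        x = a + rng.uniform(-d, d)
        if abs(x - a) < d and abs(x * x - a * a) >= eps:
            return False
    return True


for a in (0.0, 1.0, 10.0, 100.0):
    print(a, delta_a(a, 0.1), check(a, 0.1))
```

The printed values of `delta_a` shrink as `a` grows, which is the point of the remark: no single $\delta$ chosen this way serves every $a$ at once.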
Uniform continuity is different. The distances involved don’t depend on where in the
domain of the function we are. The order of quantifiers is changed. A function $f : X \to Y$
is said to be uniformly continuous on $X$ if
$$\forall \varepsilon > 0,\ \exists \delta > 0 \text{ such that } \forall x, y \in X,\ d(x, y) < \delta \implies d(f(x), f(y)) < \varepsilon.$$
34.1 Example. Let $f(x) = 3x$, for $x \in \mathbb{R}$. Fix $\varepsilon > 0$. Then for all $x, y$,
$$|f(x) - f(y)| < \varepsilon \text{ iff } 3|x - y| < \varepsilon.$$
So if we take $\delta = \varepsilon/3$, we have for all $x, y \in \mathbb{R}$,
$$|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon.$$
Thus, $f$ is uniformly continuous on $\mathbb{R}$.
At this point one should check that the function defined on $\mathbb{R}$ by $f(x) = x^2$ is not
uniformly continuous on $\mathbb{R}$.
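One way to see the failure, sketched here numerically rather than proved, is to look at the pairs $x_n = n + 1/n$, $y_n = n$ (a choice made for this illustration, not taken from the notes). Their distance $1/n$ tends to 0, while $|x_n^2 - y_n^2| = 2 + 1/n^2$ stays above 2, so no single $\delta$ can work for $\varepsilon = 2$.

```python
def f(x):
    return x * x


def gap(n):
    # Returns (|x_n - y_n|, |f(x_n) - f(y_n)|) for x_n = n + 1/n, y_n = n.
    x, y = n + 1 / n, float(n)
    return abs(x - y), abs(f(x) - f(y))


for n in (1, 10, 100, 1000):
    print(n, gap(n))
```

The first component of each printed pair shrinks while the second does not, which is the pattern that the definition of uniform continuity rules out.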
If one says f W X ! Y is uniformly continuous, one means f is uniformly continuous
on its domain, X .
34.2 Theorem. Every continuous function on a compact set is uniformly continuous. That
is, if $f : K \to Y$ is continuous and $K$ is compact, then $f$ is uniformly continuous on $K$.
Proof. Let $\varepsilon > 0$. Since $f$ is continuous on $K$, for each $a \in K$, there exists $\delta_a > 0$ such that
$$x \in B(a; \delta_a) \cap K \implies d(f(x), f(a)) < \varepsilon/2. \qquad (*)$$
Now, $\{B(a; \delta_a/2) : a \in K\}$ is a family of open sets covering $K$, so there exists a finite
subfamily which also covers $K$; that is, there exist $a_1, \dots, a_n \in K$ such that
$$\bigcup_{i=1}^{n} B(a_i; \delta_{a_i}/2) \supset K. \qquad (**)$$
Let $\delta = \frac{1}{2} \min\{\delta_{a_1}, \dots, \delta_{a_n}\}$. Suppose $x, y \in K$ with $d(x, y) < \delta$. By $(**)$ there exists $i$ such that
$$d(x, a_i) < \delta_{a_i}/2.$$
Then we also have
$$d(y, a_i) \le d(y, x) + d(x, a_i) < \delta + \delta_{a_i}/2 \le 2\delta_{a_i}/2 = \delta_{a_i}.$$
Thus, by $(*)$, we have
$$d(f(x), f(y)) \le d(f(x), f(a_i)) + d(f(a_i), f(y)) < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
We have shown that for $x, y \in K$, $d(x, y) < \delta$ implies $d(f(x), f(y)) < \varepsilon$.
(Notice that the dependence on $i$ disappeared.)
Thus, $f$ is uniformly continuous on $K$.
34.3 Theorem. Let $f$ be uniformly continuous on $X$. If $(x_n)$ is a Cauchy sequence in $X$,
then $(f(x_n))$ is also a Cauchy sequence.
Informally, “uniformly continuous functions map Cauchy sequences to Cauchy sequences”.
Proof. Let $(x_n)$ be a Cauchy sequence in $X$.
Fix $\varepsilon > 0$. Then, by uniform continuity, there exists a $\delta > 0$ such that for $x, x' \in X$,
$d(x, x') < \delta$ implies $d(f(x), f(x')) < \varepsilon$.
From the definition of Cauchy sequence we obtain $N$ such that for $n, m > N$, $d(x_n, x_m) < \delta$.
Thus,
$$\text{for } n, m > N,\quad d(f(x_n), f(x_m)) < \varepsilon.$$
Since $\varepsilon$ was arbitrary, this shows the sequence $(f(x_n))$ is Cauchy.
Recall that if $f$ is a function with domain $A$, $A \subset B$ and $\bar{f}$ is defined on $B$ with
$\bar{f}(x) = f(x)$ for $x \in A$, we say that $\bar{f}$ is an extension of $f$ (to $B$) and $f$ is the
restriction of $\bar{f}$ to $A$. To say that $f$ has an extension to a continuous function on $B$ means
there exists an extension $\bar{f}$ of $f$ to $B$ which is continuous. (Similarly for “uniformly
continuous”.)
34.4 Theorem. Let $A \subset B \subset X$. If $f : A \to Y$ and $\bar{f} : B \to Y$ is an extension of $f$
which is uniformly continuous on $B$, then $f$ itself is uniformly continuous on $A$.
Proof. Let $\varepsilon > 0$. Since $\bar{f}$ is uniformly continuous on $B$, we may choose $\delta > 0$ such that
for all $x, x' \in B$, $d(x, x') < \delta$ implies $d(\bar{f}(x), \bar{f}(x')) < \varepsilon$.
Now, let $x, x' \in A$ with $d(x, x') < \delta$. Then $x, x' \in B$, so
$$d(\bar{f}(x), \bar{f}(x')) < \varepsilon.$$
But $\bar{f}$ extends $f$, so $f(x) = \bar{f}(x)$ and $f(x') = \bar{f}(x')$. Hence,
$$d(f(x), f(x')) < \varepsilon.$$
Thus, for all $x, x' \in A$ with $d(x, x') < \delta$, $d(f(x), f(x')) < \varepsilon$. Since $\varepsilon$ was arbitrary, this
shows $f$ is uniformly continuous on its domain, $A$.
34.5 Corollary. Let $f$ be defined on a subset $A$ of $[a, b]$ with values in $\mathbb{R}$. If $f$ can be
extended to a continuous function on $[a, b]$, then $f$ is uniformly continuous.
Proof. If $\bar{f}$ is a continuous function on $[a, b]$ which extends $f$, then $\bar{f}$ is a continuous
function on the compact set $[a, b]$, so is uniformly continuous. Thus, its restriction $f$ is
also uniformly continuous, by the previous result.
34.6 Example. The function defined on $(-1, 0) \cup (0, 2)$ by
$$f(x) := \frac{\sin x}{x}$$
is uniformly continuous.
Proof. Define the function $\bar{f}$ on $[-1, 2]$ by
$$\bar{f}(x) = \begin{cases} \dfrac{\sin x}{x}, & x \in [-1, 2] \setminus \{0\} \\ 1, & x = 0. \end{cases}$$
Since $\sin$ is continuous on all of $\mathbb{R}$ and $x \mapsto x$ is continuous everywhere, the map $x \mapsto \frac{\sin x}{x}$
is continuous on all of $\mathbb{R} \setminus \{0\}$. Also $\lim_{x\to 0} \frac{\sin x}{x} = 1$. So the function $\bar{f}$ is continuous
on the compact set $[-1, 2]$, hence uniformly continuous. Furthermore, $\bar{f}(x) = f(x)$ on
$(-1, 0) \cup (0, 2)$, so $f$ is also uniformly continuous.
We are now going to prove a generalization of the converse of this theorem.
34.7 Theorem (Extending a u.c. function). Let $X, Y$ be metric spaces, with $Y$ complete
(say $\mathbb{R}$ or $\mathbb{R}^m$). Let $f$ be uniformly continuous on the set $A \subset X$. Then $f$ can be extended to a
uniformly continuous function on $\operatorname{cl} A$.
The conclusion says there exists a function $\bar{f} : \operatorname{cl} A \to Y$ such that $\bar{f}$ is uniformly
continuous on $\operatorname{cl} A$ and $\bar{f}$ extends $f$.
Proof. Let $f$ be uniformly continuous on $A$. Fix $x \in \operatorname{cl} A$. If there is a continuous function
on $\operatorname{cl} A$ which extends $f$ then its formula must be
$$\bar{f}(x) = \lim_{a\to x} f(a), \qquad (23)$$
so we check that the limit on the right side exists.
Let $(a_n)$ be any sequence in $A$ converging to $x$. Then, $(a_n)$ is Cauchy. Since $f$ is uniformly continuous on $A$, $(f(a_n))$ is also Cauchy. Since $Y$ is complete, $(f(a_n))$ converges.
Thus, for every sequence $(a_n)$ in $A$ converging to $x$, $(f(a_n))$ converges; hence, by the
sequential criterion, $\lim_{a\to x} f(a)$ exists.
Define $\bar{f}$ on $\operatorname{cl}(A)$ by (23). First, $\bar{f}$ extends $f$, because if $x \in A$,
$$\bar{f}(x) = \lim_{a\to x} f(a) = f(x), \text{ since } f \text{ is continuous on } A.$$
We now prove $\bar{f}$ is uniformly continuous. Let $\varepsilon > 0$. Use the definition of the uniform
continuity of $f$ to choose $\delta > 0$ such that
$$a, a' \in A,\ d(a, a') < \delta \implies d(f(a), f(a')) < \varepsilon. \qquad (24)$$
Let $x, x' \in \operatorname{cl} A$ with $d(x, x') < \delta$. Choose sequences $(a_n)$ and $(a_n')$ in $A$ such that $a_n \to x$
and $a_n' \to x'$. Then,
$$d(a_n, a_n') \to d(x, x') < \delta.$$
Thus, there exists $N$ so large that for all $n \ge N$,
$$d(a_n, a_n') < \delta,$$
so that by (24)
$$d(f(a_n), f(a_n')) < \varepsilon \quad \text{for all } n \ge N.$$
But $f(a_n) \to \bar{f}(x)$ and $f(a_n') \to \bar{f}(x')$, so in the limit
$$d(\bar{f}(x), \bar{f}(x')) \le \varepsilon.$$
Thus, for $x, x' \in \operatorname{cl} A$, $d(x, x') < \delta$ implies $d(\bar{f}(x), \bar{f}(x')) \le \varepsilon$, and hence $\bar{f}$ is uniformly
continuous on $\operatorname{cl} A$.
34.8 Example. The function defined on $[-1, 1] \setminus \{0\}$ by
$$f(x) = \sin(1/x)$$
is continuous, but not uniformly continuous.
Proof. The closure of $[-1, 1] \setminus \{0\}$ is just the interval $[-1, 1]$. If $f$ had a continuous
extension $\bar{f}$ to $[-1, 1]$, then $\bar{f}(0)$ would have to be $\lim_{x\to 0} \sin(1/x)$. But this limit doesn’t
exist, since every neighbourhood of 0 contains points where $\sin(1/x)$ is $1$ and points where
it is $-1$. Thus, there is no such extension and according to the theorem, $f$ is not uniformly
continuous.
34.9 Corollary. If $f$ is uniformly continuous on an interval $(a, b)$ of $\mathbb{R}$, with $a < b$ real, then
$f$ can be extended to a uniformly continuous function on $[a, b]$.
The proof of this is immediate from the previous theorem, since $\operatorname{cl}(a, b) = [a, b]$.
34.1. Prove the sequential criterion for uniform continuity:
For $f : X \to Y$, $f$ is uniformly continuous if and only if, for each pair of sequences $(x_n)$ and
$(x_n')$ in $X$ with $d(x_n, x_n') \to 0$, $d(f(x_n), f(x_n')) \to 0$.
34.2. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be periodic if there exists a number $k$ (called the period of $f$)
such that $f(x + k) = f(x)$ for all $x \in \mathbb{R}$. Prove that a continuous periodic function is bounded
and uniformly continuous on all of $\mathbb{R}$.
34.3. Let $f$ be continuous on an interval $(a, b)$, with one or both of $a, b$ infinite. If $\lim_{x\to a} f(x)$ and
$\lim_{x\to b} f(x)$ exist in $\mathbb{R}$, then $f$ is uniformly continuous.
34.4. Let $A \subset X$, let $f : A \to Y$ be continuous and let $C$ be the set of all points $x$ of $X$ for which
the limit $\bar{f}(x) = \lim_{a\to x} f(a)$ exists in $Y$. Then the function $\bar{f}$, defined on $C$ by this formula,
is a continuous extension of $f$.
34.5. Prove that each function below is uniformly continuous on the given set by directly verifying the
definition.
x
(a) f W Œ0; 3 ! R, defined by f .x/ D 1Cx
.
(b) f W Œ0; 10/ ! R, defined by f .x/ D 2x 2 x.
34.6. Which of the following are uniformly continuous? Use the theorems to justify your answers.
(a) g W .0; 7/ n f3g defined by g.x/ D
ex
.
.x 3/2
(b) h W .0; 7/ n f3g ! R, defined by h.x/ D e
1
.x 3/2
.
34.7. Let $f : X \to \mathbb{R}$ and $g : X \to \mathbb{R}$. If $f$ and $g$ are uniformly continuous, so is $f + g$, but $fg$
need not be.
34.8. Use the definition to prove that the function f W Œ0; 1/ ! R, defined by f .x/ D
uniformly continuous.
x 1
xC1
is
34.9. Let $A, B \subset X \subset \mathbb{R}$ and $f : X \to Y$. If $f$ is uniformly continuous on $A$ and on
$B$, does $f$ have to be uniformly continuous on $A \cup B$?
34.10. Is the function $f : [0, +\infty) \to \mathbb{R}$ defined by $f(x) = \sqrt{x}$ uniformly continuous?
35. DIFFERENTIATION
Let $f$ be a real-valued function defined on an interval $I$ of $\mathbb{R}$ containing the point $c$. We
say $f$ is differentiable at $c$ if the limit
$$\lim_{x\to c} \frac{f(x) - f(c)}{x - c}$$
exists. If so, this limit is called the derivative of $f$ at $c$. The function $f'$, with domain the
set of points where $f$ is differentiable, defined by
$$f'(c) = \lim_{x\to c} \frac{f(x) - f(c)}{x - c},$$
is called the derivative of $f$.
35.1 Theorem. For a function $f : I \to \mathbb{R}$ and $c \in I$, $f$ is differentiable at $c$ iff there
exist a number $m$ and a function $\varepsilon : I \to \mathbb{R}$ with $\lim_{x\to c} \varepsilon(x) = \varepsilon(c) = 0$ such that
for all $x \in I$,
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c).$$
In this case, $f'(c) = m$.
The graph of the function $x \mapsto f(c) + m(x - c)$ is tangent to the graph of $f$ at $(c, f(c))$,
and we will sometimes call this result the tangent characterisation of differentiability.
The slope of the tangent line is $m = f'(c)$, and one also calls this the slope of $f$ at $c$.
Proof. If such a number $m$ and a function $\varepsilon$ exist, we have
$$\lim_{x\to c} \frac{f(x) - f(c)}{x - c} = \lim_{x\to c} (m + \varepsilon(x)) = m + \lim_{x\to c} \varepsilon(x) = m,$$
so $f$ is differentiable at $c$ with derivative $m$.
Conversely, suppose $f$ is differentiable at $c$ with $f'(c) = m$. Then we have
$$\lim_{x\to c} \left( \frac{f(x) - f(c)}{x - c} - m \right) = 0.$$
Put $\varepsilon(c) = 0$ and for all other $x$,
$$\varepsilon(x) = \frac{f(x) - f(c)}{x - c} - m.$$
Then $\lim_{x\to c} \varepsilon(x) = 0 = \varepsilon(c)$, and
$$\frac{f(x) - f(c)}{x - c} = m + \varepsilon(x),$$
so
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c),$$
as required.
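The tangent characterisation can be made concrete with a small numerical sketch. The function $f(x) = x^3$, the point $c = 2$ and the helper name `eps` are assumptions chosen for illustration. Here $m = f'(2) = 12$, and for $x = 2 + h$ the error function works out algebraically to $\varepsilon(x) = 6h + h^2$, which tends to 0 with $h$.

```python
def f(x):
    return x**3


c, m = 2.0, 12.0  # m is the derivative of x**3 at c = 2


def eps(x):
    # The error function from the proof: eps(c) = 0 and, for x != c,
    # eps(x) = (f(x) - f(c)) / (x - c) - m.
    if x == c:
        return 0.0
    return (f(x) - f(c)) / (x - c) - m


for h in (1e-1, 1e-2, 1e-3, 1e-4):
    x = c + h
    # The residual checks f(x) = f(c) + m (x - c) + eps(x) (x - c) up to rounding.
    residual = f(c) + m * (x - c) + eps(x) * (x - c) - f(x)
    print(h, eps(x), residual)
```

The second printed column shrinks roughly like $6h$, matching the algebraic form of $\varepsilon$, while the residual stays at rounding-error size.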
35.2 Theorem. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.
Proof. If $f$ is differentiable at $c$ then it can be written as
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c),$$
where $\lim_{x\to c} \varepsilon(x) = \varepsilon(c) = 0$. This last condition says $\varepsilon$ is continuous at $c$, so the
right-hand side is a sum of products of functions continuous at $c$, so is continuous at $c$.
Hence $f$ is continuous at $c$.
The most important rule for calculation of derivatives is the chain rule, so we give it
immediately.
35.3 The chain rule. Let I be an interval and g W I ! R be differentiable at x0 . Let
g.I / J , f W J ! R, and let f be differentiable at u0 D g.x0 /. Then, f ı g is
differentiable at x0 , with .f ı g/0 .xo / D f 0 .uo /g 0 .x0 / D f 0 .g.x0 //g 0 .x0 /.
Proof. Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon$, continuous and $0$ at $u_0 = g(x_0)$, with
\[ f(u) = f(u_0) + f'(u_0)(u-u_0) + \varepsilon(u)(u-u_0), \tag{25} \]
for all $u \in J$. Replacing $u$ by $g(x)$ and $u_0$ by $g(x_0)$ in (25) yields
\[ f(g(x)) = f(g(x_0)) + f'(u_0)\bigl(g(x)-g(x_0)\bigr) + \varepsilon(g(x))\bigl(g(x)-g(x_0)\bigr). \]
If $x \ne x_0$, we may rearrange and divide by $x - x_0$, obtaining:
\[ \frac{f(g(x)) - f(g(x_0))}{x-x_0} = f'(u_0)\,\frac{g(x)-g(x_0)}{x-x_0} + \varepsilon(g(x))\,\frac{g(x)-g(x_0)}{x-x_0}. \]
Now, since $g$ is differentiable at $x_0$, it is continuous there, and $\varepsilon$ is continuous at $u_0 = g(x_0)$, so the composite $\varepsilon\circ g$ is continuous at $x_0$. Thus, as $x \to x_0$, $\frac{g(x)-g(x_0)}{x-x_0}$ converges to $g'(x_0)$ and $\varepsilon(g(x))$ converges to $\varepsilon(g(x_0)) = \varepsilon(u_0) = 0$, so
\[ \lim_{x\to x_0}\frac{f(g(x)) - f(g(x_0))}{x-x_0} = f'(u_0)g'(x_0) + 0\cdot g'(x_0) = f'(g(x_0))g'(x_0). \]
That is, $(f\circ g)'(x_0) = f'(g(x_0))g'(x_0)$, as required.
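The chain rule can be sanity-checked numerically. The following is only a sketch under stated assumptions: the functions $g(x) = x^2 + 1$ and $f(u) = \sin u$, the point $x_0 = 0.5$, and the helper name `diff_quotient` are chosen for illustration and do not come from the notes.

```python
import math


def diff_quotient(func, x0, dx):
    """Forward difference quotient (func(x0 + dx) - func(x0)) / dx."""
    return (func(x0 + dx) - func(x0)) / dx


def g(x):
    # Inner function (assumed example): g(x) = x^2 + 1, so g'(x) = 2x.
    return x * x + 1.0


def f(u):
    # Outer function (assumed example): f(u) = sin u, so f'(u) = cos u.
    return math.sin(u)


def composite(x):
    return f(g(x))


x0 = 0.5
# Value predicted by the chain rule: f'(g(x0)) * g'(x0).
chain_rule_value = math.cos(g(x0)) * 2.0 * x0

for dx in (1e-2, 1e-4, 1e-6):
    print(dx, diff_quotient(composite, x0, dx), chain_rule_value)
```

As `dx` shrinks, the difference quotient of the composite approaches the product $f'(g(x_0))g'(x_0)$, in line with the limit computed in the proof.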
The following version of the above proof, using only the tangent characterization of
differentiability is a little messier, but is more directly generalizable to higher dimensions,
where division is not defined.
2nd proof. Since $g$ is differentiable at $x_0$, there exists a function $\varepsilon_1$, continuous and $0$ at $x_0$, with
\[ g(x) = g(x_0) + g'(x_0)(x-x_0) + \varepsilon_1(x)(x-x_0), \tag{26} \]
for all $x \in I$. Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon_2$, continuous and $0$ at $u_0 = g(x_0)$, with
\[ f(u) = f(u_0) + f'(u_0)(u-u_0) + \varepsilon_2(u)(u-u_0), \tag{27} \]
for all $u \in J$. Replacing $u$ by $g(x)$ in (27) gives
\[ f(g(x)) = f(u_0) + f'(u_0)\bigl(g(x)-u_0\bigr) + \varepsilon_2(g(x))\bigl(g(x)-u_0\bigr). \]
But $u_0 = g(x_0)$, so by (26) we may replace $g(x) - u_0$ by $g'(x_0)(x-x_0) + \varepsilon_1(x)(x-x_0)$, yielding
\begin{align*}
f(g(x)) &= f(g(x_0)) + f'(u_0)\bigl[g'(x_0)(x-x_0) + \varepsilon_1(x)(x-x_0)\bigr] \\
&\qquad + \varepsilon_2(g(x))\bigl[g'(x_0)(x-x_0) + \varepsilon_1(x)(x-x_0)\bigr] \\
&= f(g(x_0)) + f'(u_0)g'(x_0)(x-x_0) + \bigl[f'(u_0)\varepsilon_1(x) + \varepsilon_2(g(x))\bigl(g'(x_0) + \varepsilon_1(x)\bigr)\bigr](x-x_0).
\end{align*}
Since $f'(u_0)\varepsilon_1(x) + \varepsilon_2(g(x))\bigl(g'(x_0) + \varepsilon_1(x)\bigr)$ converges to $0$ as $x \to x_0$, this shows $f\circ g$ is differentiable at $x_0$ with derivative $f'(u_0)g'(x_0) = f'(g(x_0))g'(x_0)$.
35.4 Theorem. If $f$ is constant on the interval $I$, then $f$ is differentiable on $I$ with $f'(x) = 0$ for all $x \in I$.
To prove this, just do the calculation from the definition, or note that $m = 0$ and the function $\varepsilon = 0$ satisfy the tangent characterization of differentiability.
35.5 Theorem. Let $f : I \to \mathbb{R}$, $g : I \to \mathbb{R}$ be differentiable at $c \in I$ and let $k \in \mathbb{R}$. Then:
(a) $kf$ is differentiable at $c$ with $(kf)'(c) = kf'(c)$ (constant multiple rule).
(b) $f + g$ is differentiable at $c$ and $(f+g)'(c) = f'(c) + g'(c)$ (sum rule).
(c) $fg$ is differentiable at $c$ and $(fg)'(c) = f'(c)g(c) + f(c)g'(c)$ (product rule).
(d) If $g(c) \ne 0$, $\dfrac{f}{g}$ is differentiable at $c$ and
\[ \left(\frac{f}{g}\right)'(c) = \frac{g(c)f'(c) - f(c)g'(c)}{g(c)^2} \quad\text{(quotient rule).} \]
Proof. (a) and (b) are left as exercises.
For (c), we merely calculate, for each $x \in I$ with $x \ne c$,
\begin{align*}
\frac{(fg)(x) - (fg)(c)}{x-c} &= \frac{f(x)g(x) - f(c)g(c)}{x-c} \\
&= \frac{f(x)g(x) - f(c)g(x) + f(c)g(x) - f(c)g(c)}{x-c} \\
&= \frac{f(x)-f(c)}{x-c}\,g(x) + f(c)\,\frac{g(x)-g(c)}{x-c}.
\end{align*}
Now, since $f$ is differentiable at $c$, $\frac{f(x)-f(c)}{x-c}$ converges to $f'(c)$ as $x \to c$. Similarly, $\frac{g(x)-g(c)}{x-c}$ converges to $g'(c)$. Moreover, since $g$ is differentiable at $c$, it is continuous there, so $g(x) \to g(c)$ as $x \to c$. This shows $fg$ is differentiable at $c$ with derivative $f'(c)g(c) + f(c)g'(c)$.
(d) The proof of the quotient rule is similar. But before doing the calculation we recall that $f/g$ has as domain the set of points $x$ for which $g(x) \ne 0$. Moreover, since $g(c) \ne 0$ and $g$ is continuous at $c$, there is a neighbourhood of $c$ on which $g \ne 0$.
Now, for $x$ in this neighbourhood, we calculate
\[ \frac{f(x)}{g(x)} - \frac{f(c)}{g(c)} = \frac{g(c)f(x) - g(x)f(c)}{g(x)g(c)} = \frac{g(c)\bigl(f(x)-f(c)\bigr) - \bigl(g(x)-g(c)\bigr)f(c)}{g(x)g(c)}, \]
which when divided by $x - c$ gives
\[ \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(c)}{g(c)}}{x-c} = \frac{g(c)\,\dfrac{f(x)-f(c)}{x-c} - \dfrac{g(x)-g(c)}{x-c}\,f(c)}{g(x)g(c)}. \]
Now, let $x \to c$, and use the fact that $f$ is differentiable at $c$, $g$ is differentiable at $c$ and $g$ is continuous at $c$ to get the required result.
For a function $f : I \to \mathbb{R}$, a point $c$ is called a critical point of $f$ if one of the following three conditions is satisfied:
(1) $c$ is an endpoint of $I$,
(2) $f'(c)$ does not exist, or
(3) $f'(c) = 0$.
35.6 Interior Extremum Theorem. Let $I$ be an interval of $\mathbb{R}$. If $f : I \to \mathbb{R}$ is differentiable at $c \in \operatorname{int}(I)$ and $f$ has a maximum or minimum at $c$, then $f'(c) = 0$.
Proof. We assume $f$ has a maximum at $c$; the case of minimum is similar. For elements $x \in I$ with $x < c$, we have $f(x) \le f(c)$ and $x - c < 0$, so
\[ \frac{f(x)-f(c)}{x-c} \ge 0. \]
Thus,
\[ f'(c) = \lim_{\substack{x\to c \\ x<c}} \frac{f(x)-f(c)}{x-c} \ge 0. \]
Similarly, for $x \in I$ with $x > c$,
\[ \frac{f(x)-f(c)}{x-c} \le 0, \]
and again
\[ f'(c) = \lim_{\substack{x\to c \\ x>c}} \frac{f(x)-f(c)}{x-c} \le 0. \]
Thus $f'(c) \ge 0$ and $f'(c) \le 0$, so $f'(c) = 0$.
In the proof of the Interior Extremum Theorem, we restricted the “difference quotient” $q(x) = \frac{f(x)-f(c)}{x-c}$ to $I \cap (-\infty, c)$, and then took a limit, the so-called limit from the left. Since the limit of $q$ is $f'(c)$, so is the limit from the left. Then we similarly used the limit from the right. The hypothesis that $c$ be an interior point of $I$ was needed. Where?
The limit from the left used above is called the left derivative of $f$ at $c$, denoted $f'_-(c)$. Similarly the limit from the right is called the right derivative, $f'_+(c)$. Of course, if $f$ is differentiable at a point $c \in \operatorname{int} I$, then $f'_+(c) = f'_-(c) = f'(c)$.
If c is the left endpoint of I , then the derivative at c and the right derivative at c are the
same thing. A similar statement holds for the right endpoint and left derivative. A closer
look at the proof of the theorem shows:
35.7 Endpoint Extremum Theorem. If $f : I \to \mathbb{R}$ is differentiable at the left endpoint $c$ of $I$ and $f$ has a minimum at $c$, then $f'(c) \ge 0$; if it has a maximum there, then $f'(c) \le 0$.
The inequalities reverse if the word “left” is replaced by “right” in this statement.
The function $f : I \to \mathbb{R}$ has a local maximum (or local minimum) at $c$ if there is a neighbourhood $U$ of $c$ in $I$ for which $f|_U$ has a maximum (minimum) at $c$. The point $c$ is then called a local maximizer (minimizer).
35.8 Local Extremum. Let $I$ be an interval and let $f : I \to \mathbb{R}$ have a local maximum or local minimum at a point $c \in I$. Then $c$ is a critical point of $f$.
Proof. We may assume there is a local maximum at $c$.
There are 3 cases in the definition of critical point; if $c$ is not an endpoint and not a point where $f$ is non-differentiable, then $c$ must be an interior point of $I$ and $f'(c)$ exists. Thus there is a ball $B(c,\varepsilon) = (c-\varepsilon, c+\varepsilon) \subset I$, and there is a neighbourhood $U$ of $c$ such that the restriction of $f$ to $U$ has a maximum at $c$. By shrinking the radius, then, we find an open interval $(a,b)$ such that $c \in (a,b)$, $f$ is differentiable at $c$ and the restriction of $f$ to $(a,b)$ has a maximum at $c$; thus, we may assume without loss of generality that $I = (a,b)$ and $f$ has a maximum at $c$. Then $f'(c) = 0$ by the Interior Extremum Theorem.
35.1. Recall that a function $f$ is left differentiable at $x \in I$ if $x$ is not a left endpoint of $I$ and its left derivative, defined by $f'_-(x) = \lim_{u\to x^-} \frac{f(u)-f(x)}{u-x}$, exists (in $\mathbb{R}$), and $f$ is right differentiable at $x$ if $x$ is not a right endpoint of $I$ and $f'_+(x) = \lim_{u\to x^+} \frac{f(u)-f(x)}{u-x}$ exists.
If $f$ is both left and right differentiable at a point $x \in \operatorname{int}(I)$, then $f$ is continuous at $x$.
35.2. The function $f : I \to \mathbb{R}$ is differentiable at $c$ with derivative $m$ if and only if for all $\varepsilon > 0$, there exists $\delta > 0$ such that for all $x \in I$ with $|x-c| < \delta$, $|f(x) - f(c) - m(x-c)| \le \varepsilon|x-c|$.
36. MEAN VALUE THEOREMS
36.1 Theorem. Rolle’s. Let $f : [a,b] \to \mathbb{R}$ be continuous on $[a,b]$ and differentiable on $(a,b)$ with $f(a) = f(b)$. Then, there exists $c \in (a,b)$ with $f'(c) = 0$.
Proof. Since $f$ is continuous on the compact set $[a,b]$, it has a maximum and a minimum at some points of $[a,b]$. If one of these is in the interior, that is, at some $c \in (a,b)$, then $f'(c) = 0$ by the Interior Extremum Theorem. If, on the other hand, both the maximum and the minimum are at the endpoints, then they are equal by hypothesis. Hence, in this case, $f$ is a constant and thus has derivative $0$ at all points of $(a,b)$. Thus, any $c$ in $(a,b)$ produces the desired conclusion.
The following theorem is purportedly due to Lagrange. We will also study a generalization due to Cauchy.
36.2 Mean Value Theorem (MVT). Let $f : [a,b] \to \mathbb{R}$ be continuous on $[a,b]$ and differentiable on $(a,b)$. Then, there exists $c \in (a,b)$ with $f'(c) = \frac{f(b)-f(a)}{b-a}$. Equivalently, $f(b) - f(a) = f'(c)(b-a)$.
Proof. This is a generalization of Rolle’s Theorem. The method of proof is to create a new function $h$ which satisfies the hypotheses of Rolle’s Theorem and for which $h'(c) = 0$ gives the desired equality.
For all $x \in [a,b]$, let
\[ g(x) = f(a) + \frac{f(b)-f(a)}{b-a}\,(x-a). \]
Then
\[ g(a) = f(a), \qquad g(b) = f(b), \qquad g'(x) = \frac{f(b)-f(a)}{b-a} \ \text{ for all } x \in [a,b]. \]
Thus, we let $h(x) = f(x) - g(x)$, for all $x \in [a,b]$, so that $h(a) = h(b) = 0$, $h$ is continuous on $[a,b]$ and $h' = f' - g'$ on $(a,b)$. By Rolle’s Theorem, there is a point $c \in (a,b)$ where $h'(c) = 0$, that is, where $f'(c) = \frac{f(b)-f(a)}{b-a}$, as required.
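To see the Mean Value Theorem at work on a concrete function, here is a minimal numerical sketch. It assumes, beyond what the theorem requires, that $h' = f' - (\text{secant slope})$ is continuous and changes sign on $[a,b]$, so that bisection can locate a point $c$; the function name `mvt_point` and the example $f(x) = x^3$ on $[0,2]$ are hypothetical choices for illustration.

```python
import math


def mvt_point(f, fprime, a, b, tol=1e-12):
    """Locate c in (a, b) with fprime(c) equal to the secant slope of f on [a, b].

    Assumption (stronger than the MVT needs): fprime is continuous and
    fprime - slope has opposite signs at a and b, so bisection applies.
    """
    slope = (f(b) - f(a)) / (b - a)

    def hprime(x):
        # Derivative of h = f - g from the proof, where g is the secant line.
        return fprime(x) - slope

    lo, hi = a, b
    if hprime(lo) * hprime(hi) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hprime(lo) * hprime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def cube(x):
    return x ** 3


def cube_prime(x):
    return 3.0 * x * x


# On [0, 2] the secant slope of x^3 is 4, so 3c^2 = 4 and c = 2/sqrt(3).
c = mvt_point(cube, cube_prime, 0.0, 2.0)
print(c, 2.0 / math.sqrt(3.0))
```

The theorem itself only asserts that such a $c$ exists; bisection is used here merely because, for this example, $h'$ happens to be continuous and monotone.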
The Mean Value Theorem can be rewritten in the useful form:
36.3 Mean Value Theorem (MVT, utility version). If $I$ is an interval of the reals and $f$ is continuous on $I$ and differentiable on its interior, then for each distinct $x_1, x_2 \in I$,
\[ f(x_2) = f(x_1) + f'(c)(x_2 - x_1), \]
for some $c$ strictly between $x_1$ and $x_2$.
This is because the quotient $\frac{f(x_1)-f(x_2)}{x_1-x_2}$ does not change when you interchange $x_1$ and $x_2$. Of course, if $f$ is differentiable everywhere, the result is true also for $x_1 = x_2$, except that $c$ is then no longer strictly between $x_1$ and $x_2$.
It is an immediate consequence that only constant functions have derivative zero.
36.4 Theorem. If $f$ is continuous on an interval $I$, differentiable on its interior, with $f' = 0$ there, then $f$ is constant on $I$.
Proof. If $x_1, x_2$ belong to $I$, with $x_1 < x_2$, then there exists $c \in (x_1, x_2)$ with $f(x_2) - f(x_1) = f'(c)(x_2 - x_1) = 0$. Thus, $f(x_1) = f(x_2)$. This shows all the values of $f$ are the same. That is, $f$ is a constant function.
From this it follows that if two functions have the same derivative, they differ by a
constant:
36.5 Corollary. Let $F, G$ be continuous on an interval $I$, differentiable on the interior with $F' = G'$. Then there exists a constant $C$ such that $G = F + C$ on $I$.
If $f$ is defined on the interior of an interval $I$ and $F$ is defined and continuous on $I$, with derivative $f$ in the interior, then $F$ is called a primitive, antiderivative or indefinite integral of $f$, and one writes $F(x) = \int f(x)\,dx$. Thus, if $G$ is another antiderivative of $f$, then $F = G + C$, where $C$ is a constant function.
36.6 Theorem. Let $f$ be continuous on the interval $I$, differentiable on its interior.
(a) If $f' > 0$ on $\operatorname{int} I$, then $f$ is strictly increasing on $I$.
(b) If $f' < 0$ on $\operatorname{int} I$, then $f$ is strictly decreasing on $I$.
(c) $f' \ge 0$ on $\operatorname{int} I$ iff $f$ is increasing on $I$.
(d) $f' \le 0$ on $\operatorname{int} I$ iff $f$ is decreasing on $I$.
Proof. (a) If $x_1 < x_2$ in $I$, then by the Mean Value Theorem 36.3, we may choose $c$ between $x_1$ and $x_2$ with
\[ f(x_2) = f(x_1) + f'(c)(x_2 - x_1). \]
Since $f'(c) > 0$, this says $f(x_1) < f(x_2)$, as required.
The proof of (b) is similar, as is one direction of (c) and (d).
We emphasise that in (a) and (b), the implication cannot be reversed. For example, if $f$ is defined on $\mathbb{R}$ by $f(x) = x^3$, then $f'(x) = 3x^2$, which is $0$ at $x = 0$, yet $f$ is everywhere strictly increasing.
The following theorem, proved by Gaston Darboux in 1875 is often called The Intermediate Value Theorem for derivatives. It will lead to results about continuity of
derivatives and differentiation of inverse functions (Inverse Function Theorem).
36.7 The Darboux Property of derivatives. Let $f$ be differentiable on the interval $I$. If $a, b \in I$ and $m$ is any number strictly between $f'(a)$ and $f'(b)$, then there exists $c$ between $a$ and $b$ with $f'(c) = m$.
Proof. Say $f'(a) < m < f'(b)$, $a < b$. Let $h(x) = f(x) - mx$, for all $x \in I$. Since $h'(a) < 0$ and $h'(b) > 0$, $h$ cannot be monotone on $[a,b]$, so, being continuous, it must have the same value at 2 different points; hence, by Rolle’s Theorem, there is a point $c$ between them where $h'(c) = 0$; that is, $f'(c) = m$.
Before Darboux published his paper, it was believed that if a function $f$ is differentiable on an interval, the derivative $f'$ must be continuous. Darboux gave examples showing that this need not be the case. However, his theorem clarifies the situation: in the usual examples, $I$ can be broken into intervals on which $f'$ is monotone.
36.8 Corollary. Let $f$ be differentiable on an interval $I$, with a monotone derivative $f'$. Then $f'$ is continuous on $I$.
Proof. We recall that whenever a monotone function defined on an interval has the Darboux property (that is, satisfies the conclusion of the Intermediate Value Theorem), it is continuous. We have just proved that $f'$ is such a function.
36.9 Example. (A function, differentiable on $\mathbb{R}$, whose derivative is not continuous.) For $x \in \mathbb{R}$, put
\[ f(x) = \begin{cases} x^2\sin(1/x), & x \ne 0, \\ 0, & x = 0. \end{cases} \]
The reader can check, using the usual rules, that $f$ is differentiable for $x \ne 0$ with $f'(x) = 2x\sin(1/x) - \cos(1/x)$. Notice that $\lim_{x\to 0} f'(x)$ does not exist. Nevertheless, $f$ is differentiable at $0$.
To prove that $f$ is differentiable at $0$ one must go back to the definition:
\[ f'(0) = \lim_{x\to 0}\frac{f(x)-f(0)}{x-0} = \lim_{x\to 0} x\sin(1/x) = 0, \]
since the sine function is bounded.
Thus, $f'$ is defined on all of $\mathbb{R}$, yet $f'$ is not continuous at $0$, since it doesn’t even have a limit there.
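The oscillation of $f'$ near $0$ can be observed numerically. This is only an illustrative sketch: the sample points $x_n = 1/(n\pi)$ and the function names `f` and `fprime` are choices made here, with `fprime` coded directly from the formulas of the example.

```python
import math


def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def fprime(x):
    """f'(x) = 2x sin(1/x) - cos(1/x) for x != 0, and f'(0) = 0."""
    if x == 0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


# Along x_n = 1/(n*pi) we get sin(1/x_n) ~ 0 and cos(1/x_n) = (-1)^n,
# so f'(x_n) alternates between values close to +1 and -1 while x_n -> 0.
samples = [fprime(1.0 / (n * math.pi)) for n in range(1, 7)]
print(samples)

# The difference quotient at 0 is h*sin(1/h), which is bounded by |h|.
for h in (1e-2, 1e-4, 1e-6):
    print(h, f(h) / h)
```

The first list alternates in sign without settling down, while the difference quotients at $0$ shrink with $h$, matching the two claims of the example.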
In your study of uniform continuity, you will recall that one often tries to get an inequality of the form
\[ |f(x) - f(y)| \le M|x-y|, \]
for then $|x-y| < \varepsilon/M$ implies $|f(x) - f(y)| < \varepsilon$, so that $f$ is uniformly continuous. A function with this property is called Lipschitz (of order 1).
36.10 Theorem. Let $f$ be continuous on an interval $I$, differentiable in the interior. If $f'$ is bounded on the interior of $I$, then $f$ is Lipschitz on $I$, hence $f$ is uniformly continuous.
Proof. Let $|f'(x)| \le M$, for all $x$ in the interior of $I$. Let $x_1, x_2 \in I$. If these are distinct, there exists $c$ between $x_1$ and $x_2$ with
\[ \frac{f(x_1)-f(x_2)}{x_1-x_2} = f'(c), \]
by the Mean Value Theorem. Thus,
\[ \left|\frac{f(x_1)-f(x_2)}{x_1-x_2}\right| = |f'(c)| \le M, \]
so $|f(x_1) - f(x_2)| \le M|x_1 - x_2|$. In case $x_1 = x_2$, this argument fails, but the inequality is still true, since both sides are then $0$.
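A quick randomized check of the Lipschitz inequality for a function with bounded derivative can be sketched as follows. The example $f = \sin$ with $M = 1$ (since $|\cos| \le 1$), the helper name `lipschitz_violations`, and the small floating-point tolerance are all assumptions of this illustration; a sampling test of this kind can only fail to find counterexamples, it proves nothing.

```python
import math
import random


def lipschitz_violations(f, M, a, b, trials=10000, seed=0):
    """Count sampled pairs (x, y) in [a, b] with |f(x) - f(y)| > M*|x - y|.

    A tolerance of 1e-12 absorbs floating-point rounding.
    """
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        x = rng.uniform(a, b)
        y = rng.uniform(a, b)
        if abs(f(x) - f(y)) > M * abs(x - y) + 1e-12:
            count += 1
    return count


# sin has derivative cos, bounded by M = 1, so no violations are expected.
print(lipschitz_violations(math.sin, 1.0, -10.0, 10.0))
```

With a constant smaller than the true bound on the derivative, for instance `M = 0.5`, the same routine would be expected to report violations for pairs near points where $|\cos|$ is close to $1$.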
Here is the generalized mean value theorem mentioned earlier.
36.11 Cauchy’s Mean Value Theorem. Let $f, g$ be real-valued functions continuous on $[a,b]$, differentiable on $(a,b)$. Then, there exists a point $c \in (a,b)$ such that
\[ \bigl(f(b)-f(a)\bigr)g'(c) = \bigl(g(b)-g(a)\bigr)f'(c). \]
Proof. Let $h(x) = \bigl(f(b)-f(a)\bigr)g(x) - \bigl(g(b)-g(a)\bigr)f(x)$, for all $x \in [a,b]$. Then
\[ h(a) = \bigl(f(b)-f(a)\bigr)g(a) - \bigl(g(b)-g(a)\bigr)f(a) = f(b)g(a) - g(b)f(a) \]
and
\[ h(b) = \bigl(f(b)-f(a)\bigr)g(b) - \bigl(g(b)-g(a)\bigr)f(b) = -f(a)g(b) + g(a)f(b), \]
which is the same thing. So by Rolle’s theorem (or the MVT) there exists a point $c$ in $(a,b)$ with $h'(c) = 0$, that is, where
\[ \bigl(f(b)-f(a)\bigr)g'(c) - \bigl(g(b)-g(a)\bigr)f'(c) = 0. \]
This is what is required.
A major consequence of Cauchy’s MVT is L’Hôpital’s rule, in section 38.
36.1. Let $f$ be continuous on the interval $I$ and differentiable on the interior with $f' > 0$ except at finitely many points in each bounded interval. Then $f$ is strictly increasing on $I$. (Note: At the finitely many points, we don’t even need to assume the differentiability.)
36.2. If $f' \ge 0$ on $I$ and there is no interval $(a,b)$ with $a < b$ on which $f' = 0$, then $f$ is strictly increasing on $[a,b]$.
36.3. Use the Mean Value Theorem or Rolle’s Theorem to prove that a real polynomial of degree $n$ can have at most $n$ real roots.
36.4. Let $f$ be differentiable on the interval $[1,7]$ with $f(1) = 3$ and $f'(x) \le 4$, for all $x \in (1,7)$. What is the largest $f(7)$ could be?
36.5. Prove the following inequalities using the Mean Value Theorem.
(a) $e^x > x + 1$, for $x > 0$.
(b) If $0 < p < 1$ and $t \ge 1$, then $(1+t)^p < 1 + t^p$.
36.6. Prove the “first derivative test” for local maximum: Let $f$ be continuous on $(a,b)$, differentiable, except maybe at $c$, with $f'(x) > 0$, for $a < x < c$, and $f'(x) < 0$, for $c < x < b$; then $f$ has a maximum at $c$.
36.7. Consider the function defined on $\mathbb{R}$ by $f(x) = x^2\sin^2(1/x^2)$ for $x \ne 0$ and $0$ for $x = 0$. Prove that $f$ has a minimum at $0$ and is differentiable there, but the first derivative test does not apply: the derivative does not simply change sign at $0$. (It changes sign infinitely often in each neighbourhood of $0$.)
36.8. Let $f, g$ be differentiable on $[a, +\infty)$ with $f'(x) \le g'(x)$, for all $x \ge a$. Suppose $f(a) \le g(a)$. Prove that $f(x) \le g(x)$, for all $x \ge a$.
36.9. Let f be differentiable on R with f . 1/ D 0, f .0/ D 2, and f .1/ D
is a point c where f 0 .c/ D 7.
50. Prove that there
36.10. Alternate proofs of the Darboux Property of Derivatives.
Proof. (Lars Olsen, American Mathematical Monthly 2004; I later found that this is also in Tom Apostol, 2nd Ed. page 112.)
Assume without loss of generality that $a < b$ and $f'(a) < m < f'(b)$. For $x \in I$ put
\[ q_a(x) = \begin{cases} \dfrac{f(x)-f(a)}{x-a}, & x \ne a, \\ f'(a), & x = a, \end{cases} \qquad\text{and}\qquad q_b(x) = \begin{cases} \dfrac{f(b)-f(x)}{b-x}, & x \ne b, \\ f'(b), & x = b. \end{cases} \]
Since $f'(a) = \lim_{x\to a} q_a(x)$ and $f'(b) = \lim_{x\to b} q_b(x)$, these functions are continuous on $I$. Notice also that $q_a(b) = q_b(a) = \frac{f(b)-f(a)}{b-a}$.
In case $m \le q_a(b)$, the Intermediate Value Theorem gives us a point $x \in (a,b]$ with
\[ m = q_a(x) = \frac{f(x)-f(a)}{x-a}, \]
and then the Mean Value Theorem gives us a point $c \in (a,x]$ with
\[ m = f'(c). \]
In case $m > q_a(b) = q_b(a)$, the analogous argument with $q_b$ gives us a point $x \in (a,b)$ and a point $c \in (x,b)$ with
\[ m = \frac{f(b)-f(x)}{b-x} = f'(c). \]
This completes the proof.
Usual proof. Assume without loss of generality that $a < b$ and $f'(a) < m < f'(b)$. As before, let $g(x) = f(x) - mx$. Then, $g'(a) = f'(a) - m < 0$ and $g'(b) = f'(b) - m > 0$. Since the restriction of $g$ to $[a,b]$ is continuous, it must have a minimum at some point $c$ of $[a,b]$.
Now, $c$ cannot be $a$; otherwise, by the Endpoint Extremum Theorem, $g'(a) \ge 0$. Also, $c$ can’t be $b$; otherwise, $g'(b) \le 0$. But then $c$ is an interior point, so $g'(c) = 0$ by the Interior Extremum Theorem; that is, $f'(c) = m$.
37. THE REAL INVERSE FUNCTION THEOREM
The following application of the Darboux Property of Derivatives is the key to differentiating the inverse of functions.
37.1 Theorem. Inverse Function. Let $f$ be differentiable on the interval $I$ and $f'(x) \ne 0$, for all $x \in I$. Then,
(1) $f$ is injective, strictly increasing or strictly decreasing,
(2) $f^{-1}$ is differentiable on $J = f(I)$, and
(3) at each point $y \in J$,
\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}. \]
Note that if $x$ is the point of $I$ with $y = f(x)$, the final formula reads
\[ (f^{-1})'(y) = \frac{1}{f'(x)}. \]
In calculus courses one often writes
\[ \frac{dx}{dy} = \frac{1}{\,\dfrac{dy}{dx}\,}, \]
but one has to remember that on the left side, the $x$ represents the function $f^{-1}$, and $\frac{dx}{dy}$ stands for the derivative of that function at the point $y$, whereas on the right side $y$ is representing the function $f$ and $\frac{dy}{dx}$ stands for the derivative of $f$ at $x$, yet we are still assuming that the $y$ at which we evaluate the left side is related to the $x$ at which we evaluate the right side by $y = f(x)$. The notation is a mess, though very convenient in calculations.
Proof. By the Darboux property of derivatives (Theorem 36.7), if $f' \ne 0$ on $I$, then there cannot exist $a, b \in I$ with $f'(a) > 0$ and $f'(b) < 0$. Thus $f' > 0$ on $I$ or $f' < 0$ on $I$. Thus, by Theorem 36.6, $f$ is strictly increasing on $I$ or strictly decreasing. In particular, $f$ is injective.
Say $f$ is strictly increasing on $I$. Then $f^{-1}$ is also strictly increasing. Since $f$ is continuous, it too has the Darboux property (intermediate value property), which makes $J = f(I)$ an interval, and $f^{-1}$ is continuous there. Fix $y \in J$, $y = f(x)$ where $x \in I$.
We will use the sequential criterion to show $f^{-1}$ is differentiable at $y$. Let $(y_n)$ be a sequence in $J \setminus \{y\}$ with $y_n \to y$ and put $x_n = f^{-1}(y_n)$. Since $f^{-1}$ is continuous, $x_n \to x$. Also $x_n \ne x$, for all $n$, since $f^{-1}$ is injective. Thus, we calculate
\[ \frac{f^{-1}(y_n) - f^{-1}(y)}{y_n - y} = \frac{x_n - x}{f(x_n) - f(x)} = \frac{1}{\dfrac{f(x_n)-f(x)}{x_n - x}}. \]
Since $f$ is differentiable at $x$, this converges to $\frac{1}{f'(x)}$. Since the sequence $(y_n)$ was arbitrary,
\[ \lim_{t\to y}\frac{f^{-1}(t) - f^{-1}(y)}{t-y} = \frac{1}{f'(x)}. \]
That is, $(f^{-1})'(y)$ exists and is $1/f'(x)$.
37.2 Example. The function $\tan$ is defined on the set of real numbers where $\cos$ is not $0$, namely on $A := \mathbb{R}\setminus\{\pi/2 + n\pi : n \in \mathbb{Z}\}$, by $\tan x = \frac{\sin x}{\cos x}$. Since $\sin$ and $\cos$ are continuous, so is $\tan$ on $A$. The denominator is positive in the interval $(-\pi/2, \pi/2)$. As $x \to \frac{\pi}{2}^-$, $\sin x \to 1$ and $\cos x \to 0$, so $\lim_{x\to\frac{\pi}{2}^-}\tan x = +\infty$; similarly, $\lim_{x\to-\frac{\pi}{2}^+}\tan x = -\infty$. From this, for any $y \in \mathbb{R}$, there exist $a, b$ with $\tan a < y$ and $\tan b > y$. Thus, by the Intermediate Value Theorem, there exists $x$ with $\tan x = y$. This shows that $\tan$ maps $(-\pi/2, \pi/2)$ onto $\mathbb{R}$.
Since $\sin' = \cos$ and $\cos' = -\sin$, the quotient rule gives, for $x \in A$,
\[ \tan' x = \frac{\cos x\,\sin' x - \sin x\,\cos' x}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = 1 + \tan^2 x. \]
(We could say $\frac{1}{\cos^2 x} = \sec^2 x$, but we are anticipating the use below.) In any case, this shows $\tan' x > 0$ in $(-\pi/2, \pi/2)$, so $\tan$ is strictly increasing there, hence injective. The restriction of $\tan$ to $(-\pi/2, \pi/2)$ is thus invertible. Its inverse is defined to be $\arctan$.
Since $\tan$ maps $(-\pi/2, \pi/2)$ onto $\mathbb{R}$, $\arctan$ maps $\mathbb{R}$ onto $(-\pi/2, \pi/2)$. By the Inverse Function Theorem for real functions, $\arctan$ is differentiable with
\[ \arctan' y = \frac{1}{\tan'(\arctan y)}. \]
If $x = \arctan y$, then $\tan x = y$ and $\tan' x = 1 + \tan^2 x = 1 + y^2$. Hence,
\[ \arctan' y = \frac{1}{1+y^2}, \]
for all $y$.
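The formula $\arctan' y = 1/(1+y^2)$ can be compared with a finite-difference estimate. This is a minimal sketch: the helper `central_difference`, the step size `h = 1e-5`, and the sample points are assumptions of the illustration, and `math.atan` is the standard-library arctangent.

```python
import math


def central_difference(func, y, h=1e-5):
    """Symmetric difference quotient (func(y + h) - func(y - h)) / (2h)."""
    return (func(y + h) - func(y - h)) / (2.0 * h)


for y in (-2.0, 0.0, 0.5, 3.0):
    numeric = central_difference(math.atan, y)
    exact = 1.0 / (1.0 + y * y)  # value given by the Inverse Function Theorem
    print(y, numeric, exact)
```

The two columns agree to many digits, which is consistent with, though of course no substitute for, the derivation above.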
37.1. You are given that the exponential function $\exp(x) = e^x$ has $\exp' = \exp$. Prove that $\ln = \log_e$ is also differentiable, with the usual formula.
37.2. Let $\arcsin$ denote the inverse of the restriction of $\sin$ to the interval $[-\pi/2, \pi/2]$. Prove that $\arcsin$ is defined on $[-1, 1]$, that it is strictly increasing and continuous there, and that it is differentiable in $(-1, 1)$, and determine its derivative. Justify all steps (including that such an inverse exists).
37.3. Formulate and work out a similar result for $\cos$ restricted to $[0, \pi]$. Could any larger interval be used?
37.4. If $f$ is continuously differentiable on $I$ and $f'(a) \ne 0$, then $f$ is locally invertible, that is, there exists a neighbourhood $U$ of $a$ for which $f|_U$ is invertible, and if $g$ is the inverse of $f|_U$, then $g'(f(x)) = 1/f'(g(f(x)))$, for $x \in U$.
38. L’HÔPITAL’S RULE
This actually refers to several related results about calculating the limit of a quotient $f/g$ of two functions $f, g$ by calculating the limit of the quotient $f'/g'$ of their derivatives.
38.1 L’Hôpital’s rule (0/0 form at a real). Let $I$ be an interval of $\mathbb{R}$, $c \in I$; let $f, g$ be defined on $I$, except possibly at $c$, differentiable with $g'$ not $0$ in a deleted neighbourhood of $c$, and $\lim_{x\to c} f(x) = \lim_{x\to c} g(x) = 0$.
\[ \text{If}\quad \lim_{x\to c}\frac{f'(x)}{g'(x)} = L \in \overline{\mathbb{R}} \quad\text{then}\quad \lim_{x\to c}\frac{f(x)}{g(x)} = L. \tag{28} \]
Proof. We may assume $f$ and $g$ are continuous with value $0$ at $c$. Indeed, if they are not we may define $\bar f$ and $\bar g$ on $I$ by
\[ \bar f(x) = \begin{cases} f(x), & x \ne c, \\ 0, & x = c, \end{cases} \qquad \bar g(x) = \begin{cases} g(x), & x \ne c, \\ 0, & x = c. \end{cases} \]
Then, $\bar f$ and $\bar g$ are continuous with value $0$ at $c$, and $\bar f' = f'$ and $\bar g' = g' \ne 0$ in a deleted neighbourhood $U$ of $c$. Since for $x \in U$,
\[ \frac{\bar f'(x)}{\bar g'(x)} = \frac{f'(x)}{g'(x)} \qquad\text{and}\qquad \frac{f(x)}{g(x)} = \frac{\bar f(x)}{\bar g(x)}, \]
it would be enough to prove the result for these new functions.
Assume then, that $f$ and $g$ are defined and continuous with value $0$ at $c$. Let $U = (a,c) \cup (c,b)$ be the deleted neighbourhood of $c$ in $I$ such that $f'$ and $g'$ exist at all points of $U$ and $g'$ is not zero in $U$. Since $g'(x) \ne 0$ for $x \in U$, and $g(c) = 0$, Rolle’s Theorem shows $g \ne 0$ in $U$.
We use the sequential criterion for convergence. We must show that for each sequence $(x_n)$ in $U$ with $x_n \to c$, $\frac{f(x_n)}{g(x_n)} \to L$.
Fix $(x_n)$ a sequence in $U$ converging to $c$. For each $x_n$, either $x_n < c$ or $c < x_n$. Since $f$ and $g$ are continuous on $[x_n, c]$ (or $[c, x_n]$), and differentiable on the interior, Cauchy’s Mean Value Theorem allows us to choose $c_n$ between $x_n$ and $c$ with
\[ \bigl(f(x_n) - f(c)\bigr)g'(c_n) = \bigl(g(x_n) - g(c)\bigr)f'(c_n). \]
In other words, since $f(c) = g(c) = 0$,
\[ \frac{f(x_n)}{g(x_n)} = \frac{f(x_n)-f(c)}{g(x_n)-g(c)} = \frac{f'(c_n)}{g'(c_n)}. \]
Now, since $c_n$ is between $x_n$ and $c$, $|c_n - c| \le |x_n - c|$, so $c_n \to c$, and by hypothesis,
\[ \frac{f'(c_n)}{g'(c_n)} \to L, \]
so
\[ \frac{f(x_n)}{g(x_n)} \to L. \]
Since $(x_n)$ was an arbitrary sequence in $U$ converging to $c$,
\[ \lim_{x\to c}\frac{f(x)}{g(x)} = L. \]
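To illustrate the 0/0 form numerically, the sketch below compares $f/g$ with $f'/g'$ near $c = 0$. The pair $f(x) = 1 - \cos x$, $g(x) = x^2$ (for which both quotients tend to $1/2$) and the function names `ratio` and `derivative_ratio` are assumptions chosen for this example only.

```python
import math


def ratio(x):
    """f(x)/g(x) with f(x) = 1 - cos x and g(x) = x^2 (assumed example)."""
    return (1.0 - math.cos(x)) / (x * x)


def derivative_ratio(x):
    """f'(x)/g'(x) = sin x / (2x); note g'(x) = 2x is nonzero for x != 0."""
    return math.sin(x) / (2.0 * x)


for x in (1e-1, 1e-2, 1e-3):
    print(x, ratio(x), derivative_ratio(x))
```

Both columns approach $0.5$ as $x \to 0$. Much smaller values of `x` are avoided here because the subtraction `1.0 - math.cos(x)` loses accuracy in floating point, an issue of the numerics rather than of the theorem.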
Recall the practice problem (Exercise 29.3).
38.2 Lemma. Let $h$ be defined (at least) in $[a, \infty)$, $a > 0$, and let $H(t) = h(1/t)$ for $0 < t < 1/a$. Then
\[ \lim_{x\to\infty} h(x) = \lim_{t\to 0^+} H(t), \]
whenever one side exists in $\overline{\mathbb{R}}$.
38.3 L’Hôpital’s rule (0/0 form at $\infty$ or $-\infty$). Let $f, g$ be defined (at least) on an interval $[a, \infty)$ of $\mathbb{R}$, differentiable there with $g'$ never $0$ and $\lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = 0$. Then the implication (28) holds with $c = \infty$.
A similar statement holds for limits at $-\infty$.
Proof. We may assume $a > 0$. Put $F(t) = f(1/t)$, $G(t) = g(1/t)$, for each $t$ with $1/t \in [a, \infty)$. Then $F$ and $G$ are differentiable in $(0, 1/a)$ and
\[ \lim_{t\to 0^+}\frac{F'(t)}{G'(t)} = \lim_{t\to 0^+}\frac{f'(1/t)(-1/t^2)}{g'(1/t)(-1/t^2)} = \lim_{t\to 0^+}\frac{f'(1/t)}{g'(1/t)} = \lim_{x\to\infty}\frac{f'(x)}{g'(x)} = L. \]
Moreover, $F$ and $G$ converge to $0$ as $t \to 0^+$, so by the previous theorem,
\[ \lim_{x\to\infty}\frac{f(x)}{g(x)} = \lim_{t\to 0^+}\frac{F(t)}{G(t)} = L. \]
Here is a version where the denominator converges to $\infty$. Note that no assumption is made on the behaviour of the numerator.
38.4 L’Hôpital’s rule ($?/\infty$ form). Let $f, g$ be defined and differentiable (at least) in a (possibly infinite) interval $(a,b) \subset \mathbb{R}$, except at some $c$, with $g'$ never $0$ there and $\lim_{x\to c} g(x) = \infty$ (or $-\infty$). Then the implication (28) holds.
Proof. We may assume $b = c$, since the case $c = a$ is similar and the general case follows by considering right and left limits.
We do the case $L$ is finite, but write the proof in such a way that it is easily changed to give the infinite cases.
Let $u < L < v$. We need only show that there is an interval $U = (r, c)$ with $u < \frac{f(x)}{g(x)} < v$, for all $x \in U$. Choose $u', v' \in \mathbb{R}$ with $u < u' < L < v' < v$. Since
\[ \lim_{x\to c}\frac{f'(x)}{g'(x)} = L, \]
there exists an interval $U_1 = (r_1, c)$ with
\[ u' < \frac{f'(t)}{g'(t)} < v', \tag{$*$} \]
for all $t \in U_1$.
Fix $x, y \in U_1$ (distinct) and apply Cauchy’s MVT to get a $t$ between $x$ and $y$ with
\[ \frac{f(x)-f(y)}{g(x)-g(y)} = \frac{f'(t)}{g'(t)}. \]
There is no problem about dividing by zero here, since $g'$ is not $0$ anywhere on $(a,c)$, and hence $g(x)$ is never $g(y)$. Since $t$ is between $x$ and $y$, it also belongs to $U_1$, hence by $(*)$,
\[ u' < \frac{f(x)-f(y)}{g(x)-g(y)} < v'. \tag{$**$} \]
We need not make reference to $t$ any longer. This statement $(**)$ holds for all distinct $x$ and $y$ in $U_1$. Hold $y$ fixed throughout the remainder of the proof.
Since $g(x) \to \infty$, we may choose a smaller neighbourhood $U_2 = (r_2, c)$ with $g(x) > g(y)$ and $g(x) > 0$, for $x \in U_2$.
Divide the numerator and denominator of the quotient in $(**)$ by $g(x)$ to obtain
\[ u' < \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} < v'. \]
The denominator is $> 0$ here so we may multiply through by it, preserving the inequality. After adding $f(y)/g(x)$, this gives
\[ u'\left(1 - \frac{g(y)}{g(x)}\right) + \frac{f(y)}{g(x)} < \frac{f(x)}{g(x)} < v'\left(1 - \frac{g(y)}{g(x)}\right) + \frac{f(y)}{g(x)}. \]
Now, as $x \to c$, since $g(x) \to \infty$, the left side here converges to $u'$ and the right side converges to $v'$. Thus, since $u < u'$ and $v > v'$, there is a smaller deleted neighbourhood $U_3 = (r_3, c)$ such that the left side is $> u$ and the right side is $< v$, for $x \in U_3$. Hence
\[ u < \frac{f(x)}{g(x)} < v. \]
Thus, $U_3$ is the required $U$.
In the case the limit $L$ is $+\infty$, a proof is obtained by deleting the side of the inequalities above containing $v$ and $v'$, since the neighbourhoods of $+\infty$ are of the form $(u, +\infty)$. Similarly, the proof for the case $L = -\infty$ is obtained by deleting the side involving $u$ and $u'$.
Alternate Proof of both of the 0/0 forms. We may assume the interval $I$ has right endpoint $c$, where $c$ could be $+\infty$. The case that $c$ is a left endpoint is similar and the general case follows by considering right and left limits. Let $V$ be a neighbourhood of $L$ and let $W$ be a neighbourhood of $L$ with $\operatorname{cl} W \subset V$. (For example, if $L$ is finite and $V = B(L, \varepsilon)$, we could take $W = B(L, \varepsilon/2)$.) Since
\[ \lim_{x\to c}\frac{f'(x)}{g'(x)} = L, \]
there exists an interval $U = (r, c)$ with
\[ \frac{f'(t)}{g'(t)} \in W \]
for all $t \in U$. By hypothesis we can take $U$ small enough that $g'$ is never $0$ on $U$.
Fix $x, y \in U$ (distinct) and apply Cauchy’s MVT to get a $t$ between $x$ and $y$ with
\[ \frac{f(x)-f(y)}{g(x)-g(y)} = \frac{f'(t)}{g'(t)}. \tag{$\dagger$} \]
Since $g'$ is not $0$ anywhere on $U$, $g(x)$ is never $g(y)$. Since $t$ is between $x$ and $y$, it also belongs to $U$; hence, by $(\dagger)$,
\[ \frac{f(x)-f(y)}{g(x)-g(y)} \in W. \]
Now, let $y \to c$ and get
\[ \frac{f(x)}{g(x)} = \lim_{y\to c}\frac{f(x)-f(y)}{g(x)-g(y)} \in \operatorname{cl} W \subset V. \]
Thus, for each neighbourhood $V$ of $L$, we have found a deleted neighbourhood $U$ of $c$ such that for all $x \in U$, $\frac{f(x)}{g(x)} \in V$, which shows that $\lim_{x\to c}\frac{f(x)}{g(x)} = L$, as required.
38.1. Let $f(x) = 2x + \sin 2x$ and $g(x) = (x + \sin x\cos x)e^{2\sin x}$. Investigate $\lim_{x\to+\infty}\frac{f(x)}{g(x)}$ and $\lim_{x\to+\infty}\frac{f'(x)}{g'(x)}$. Explain the behaviour you discover.
39. TAYLOR’S THEOREM
The following theorem, named after Brook Taylor (1685–1731), is an extension of the Mean Value Theorem to include higher derivatives. Let $f$ be a function defined on an interval $I$. If $f$ is differentiable in a neighbourhood of $x$ and the derivative $f'$ is differentiable at $x$, then the second derivative of $f$ at $x$ is $f''(x) = (f')'(x)$. Similarly, $f'''(x)$ is the derivative of $f''$ at $x$, provided $f''$ exists in a neighbourhood of $x$, and $f''$ is differentiable at $x$. Other notation for these are $f^{(2)}(x)$ and $f^{(3)}(x)$. This notation can be extended recursively. We put $f^{(0)}(x) = f(x)$, and supposing $f^{(n)}$ exists in a neighbourhood of $x$ and is differentiable at $x$, we put $f^{(n+1)}(x) = (f^{(n)})'(x)$.
39.1 Taylor’s Theorem (Taylor’s formula with Lagrange’s form of remainder). Let $I$ be an interval of the reals, $f : I \to \mathbb{R}$ and its first $n$ derivatives be continuous on $I$ and differentiable on the interior of $I$. Let $x_0 \in I$. Then for all $x \in I$,
\[ f(x) = \sum_{k=0}^{n}\frac{f^{(k)}(x_0)}{k!}(x-x_0)^k + R_n(x), \]
\[ \text{where } R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-x_0)^{n+1}, \text{ for some } c \text{ between } x \text{ and } x_0. \]
We emphasise that $f^{(n+1)}$ is assumed to exist on the interior of $I$, but need not be continuous, and it is not assumed to exist at the endpoints of $I$.
Remember that $f^{(0)} = f$ and $0! = 1$, $(x-x_0)^0 = 1$, even when $x = x_0$.
Proof. For each $x \in I$, put
\[ P_n(x) = \sum_{k=0}^{n}\frac{f^{(k)}(x_0)}{k!}(x-x_0)^k. \]
This is called the Taylor polynomial of order $n$ about $x_0$; it is defined for all $x$ in $I$, since $f$ is $n$-times differentiable. Define the remainder $R_n(x)$ by subtraction:
\[ R_n(x) = f(x) - P_n(x). \]
From the way it is defined, the remainder automatically satisfies
\[ f(x) = P_n(x) + R_n(x). \]
We are to show it can be written in the stated form.
Keep $x, x_0 \in I$ fixed, with $x \ne x_0$, and define a function
\[ g(t) = \sum_{k=0}^{n}\frac{f^{(k)}(t)}{k!}(x-t)^k + R_n(x)\,\frac{(x-t)^{n+1}}{(x-x_0)^{n+1}}. \]
Then, since $(x - x) = 0$ and $\frac{(x-x_0)^{n+1}}{(x-x_0)^{n+1}} = 1$,
\[ g(x) = f(x) \quad\text{and}\quad g(x_0) = P_n(x) + R_n(x) = f(x). \]
Thus, by Rolle’s Theorem (or the Mean Value Theorem), there exists a $c$ between $x$ and $x_0$ with $g'(c) = 0$. Using the rules of differentiation, we calculate:
\begin{align*}
g'(t) &= f'(t) + \sum_{k=1}^{n}\left[\frac{f^{(k+1)}(t)}{k!}(x-t)^k - \frac{f^{(k)}(t)}{k!}\,k(x-t)^{k-1}\right] - R_n(x)\,\frac{(n+1)(x-t)^n}{(x-x_0)^{n+1}} \\
&= f'(t) + \sum_{k=1}^{n}\left[\frac{f^{(k+1)}(t)}{k!}(x-t)^k - \frac{f^{(k)}(t)}{(k-1)!}(x-t)^{k-1}\right] - R_n(x)\,\frac{(n+1)(x-t)^n}{(x-x_0)^{n+1}} \\
&= f'(t) + \frac{f^{(n+1)}(t)}{n!}(x-t)^n - \frac{f'(t)}{0!}(x-t)^0 - R_n(x)\,\frac{(n+1)(x-t)^n}{(x-x_0)^{n+1}} \\
&= \frac{f^{(n+1)}(t)}{n!}(x-t)^n - R_n(x)\,\frac{(n+1)(x-t)^n}{(x-x_0)^{n+1}}.
\end{align*}
Now, at $t = c$ the left side is $0$, so when we solve for $R_n(x)$, the $(x-c)^n$ cancel and
\[ R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)\,n!}(x-x_0)^{n+1}, \]
which is equal to the required form.
The form given to the remainder here is generally attributed to Lagrange. Taylor didn’t
actually prove this theorem, but gave the infinite series expansion, known as Taylor’s series,
without discussing questions of convergence.
39.2 Example. If $f(x) = e^x$, for $x \in \mathbb{R}$, then $f'(x) = e^x$, for all $x \in \mathbb{R}$. Thus, for all $n$, $f^{(n)}(0) = e^0 = 1$ and the Taylor polynomial of order $n$ about $0$ is
\[ P_n(x) = \sum_{k=0}^{n}\frac{1}{k!}x^k. \]
Taylor’s Theorem says
\[ f(x) = P_n(x) + \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1} = P_n(x) + \frac{e^c}{(n+1)!}x^{n+1}, \]
for some $c$ between $0$ and $x$.
If, for example, $x > 0$ and $n = 4$, we obtain (since $e^t$ increases with $t$) $1 < e^c < e^x$ and
\[ P_4(x) + \frac{x^5}{5!} < e^x < P_4(x) + e^x\,\frac{x^5}{5!}. \]
For more applications of such approximation methods, see your calculus book.
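The two-sided estimate for $e^x$ can be checked numerically. The sketch below is illustrative only: the function names `taylor_poly_exp` and `lagrange_bounds_exp` are hypothetical, and the upper bound uses `math.exp(x)` itself, exactly as in the inequality of the example, so it is a consistency check rather than an independent way of computing $e^x$.

```python
import math


def taylor_poly_exp(x, n):
    """Taylor polynomial P_n(x) of exp about 0: sum of x^k / k! for k = 0..n."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def lagrange_bounds_exp(x, n=4):
    """Lower and upper bounds for e^x (x > 0) from the Lagrange remainder.

    Uses 1 < e^c < e^x for 0 < c < x, as in the example.
    """
    p = taylor_poly_exp(x, n)
    term = x ** (n + 1) / math.factorial(n + 1)
    return p + term, p + math.exp(x) * term


lower, upper = lagrange_bounds_exp(1.0)
print(lower, math.e, upper)
```

For $x = 1$ and $n = 4$ the bounds bracket $e$ between roughly $2.7167$ and $2.7310$.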
39.1. The remainder in Lagrange’s form of Taylor’s theorem could be written as
Rn .x/ D
f .nC1/ ..1 /x C x0 /
.x
.n C 1/Š
x0 /nC1 ; for some t 2 .0; 1/:
39.2. (Taylor’s formula with Cauchy form of remainder.) Let $I$ be an interval of the reals, $f : I \to \mathbb{R}$ and its first $n$ derivatives be continuous on $I$ and differentiable on the interior of $I$. Let $x_0 \in I$. Then for all $x \in I$,
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x-x_0)^k + R_n(x), \]
where
\[ R_n(x) = \frac{f^{(n+1)}(c)}{n!}\,(x-c)^n (x-x_0), \quad \text{for some } c \text{ between } x \text{ and } x_0. \]
Hint: use a slightly different form of auxiliary function.
39.3. (Taylor’s formula with Schlömilch form of remainder.) Generalize both Cauchy’s form and Lagrange’s form of Taylor’s Theorem, obtaining one for which the remainder involves $(x - x_0)$ and $(x - c)$ to different powers summing to $n + 1$. This general result was proved by Oskar Schlömilch in 1847.
39.4. (Young’s form of Taylor’s Theorem.) Let $f$ be $n$-times differentiable at a point $a$ of the interval $I$ and
\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^k. \]
Then, for all $x \in I$, $f(x) = P_n(x) + R_n(x)$, where $R_n(x)$ is of the form $\varepsilon(x)(x-a)^n$ where $\varepsilon(x) \to 0$ as $x \to a$. (One says $R_n(x)$ is $o((x-a)^n)$ as $x \to a$.)
Hint: One simply calculates the limit
\[ \lim_{x \to a} \frac{f(x) - P_n(x)}{(x-a)^n}. \]
Use L’Hôpital’s rule $n-1$ times, possible since the hypothesis that $f$ is $n$-times differentiable at $a$ includes that it is $(n-1)$-times differentiable on a neighbourhood of $a$ in $I$. CAREFUL: $f$ is only $n$-times differentiable, and $f^{(n)}$ is not assumed continuous.
39.5. Let $f$ be $n$-times differentiable on the open interval $I$ containing the point $a$ and $f^{(k)}(a) = 0$ for $k = 1, \dots, n-1$, but $f^{(n)}(a) \ne 0$.
(a) If $n$ is even and $f^{(n)}(a) < 0$, then $f$ has a local maximum at $a$.
(b) If $n$ is even and $f^{(n)}(a) > 0$, then $f$ has a local minimum at $a$.
(c) If $n$ is odd, then $f$ has neither a local maximum, nor a local minimum at $a$.
39.6. Show that if $x \in [0,1]$, then
\[ x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \le \log(1+x) \le x - \frac{x^2}{2} + \frac{x^3}{3}. \]
Here $\log$ stands for $\log_e = \ln$.
39.7. Prove that the function
\[ f : x \mapsto \begin{cases} \exp(-1/x^2), & x \ne 0 \\ 0, & x = 0 \end{cases} \]
has $f^{(n)}(0) = 0$ for all $n = 0, 1, 2, \dots$, so that the remainder in Taylor’s theorem about 0 is the function $f$ itself.
40. CONVEX FUNCTIONS
Let $I$ be an interval of $\mathbb{R}$ and $f$ a real function whose domain contains $I$. Then $f$ is called convex (on $I$) if for all $a, b \in I$ and $0 < t < 1$,
\[ f((1-t)a + tb) \le (1-t)f(a) + t f(b). \tag{C0} \]
The function $f$ is called strictly convex if this holds with “$\le$” replaced by “$<$”. If we reverse the inequalities, we obtain the notions of concave and strictly concave function, respectively. Since the points strictly between $a$ and $b$ are those of the form $x = (1-t)a + tb$, where
\[ t = \frac{x-a}{b-a}, \qquad 1 - t = \frac{b-x}{b-a}, \]
(C0) becomes
\[ f(x) \le \frac{b-x}{b-a}\, f(a) + \frac{x-a}{b-a}\, f(b), \tag{C1} \]
for all $x$ between $a$ and $b$. Let $\ell(x)$ denote the right side of this inequality. Then, $\ell$ is a linear function whose value at $a$ is $f(a)$ and whose value at $b$ is $f(b)$, so it could also be written
\begin{align*}
\ell(x) &= f(a) + \frac{f(b)-f(a)}{b-a}\,(x-a) \tag{L1} \\
&= f(b) + \frac{f(b)-f(a)}{b-a}\,(x-b). \tag{L2}
\end{align*}
Thus, $f$ is convex if and only if for each $a, b \in I$, the line segment joining the two points $(a, f(a))$ and $(b, f(b))$ is never below the graph of $f$.
Using the two representations (L1) and (L2) of the point $\ell(x)$, we see that the inequality (C1) can also be written as
\[ \frac{f(x)-f(a)}{x-a} \le \frac{f(b)-f(a)}{b-a} \tag{C2} \]
or as
\[ \frac{f(b)-f(a)}{b-a} \le \frac{f(b)-f(x)}{b-x}. \tag{C3} \]
Combining (C2) and (C3) gives also
\[ \frac{f(x)-f(a)}{x-a} \le \frac{f(b)-f(x)}{b-x}. \tag{C4} \]
But (C4) can also be rearranged to give (C1) back. Indeed, (C4) yields
\[ f(x) = \frac{b-x}{b-a}\, f(x) + \frac{x-a}{b-a}\, f(x) \le \frac{b-x}{b-a}\, f(a) + \frac{x-a}{b-a}\, f(b). \]
Hence, we obtain the following characterization.
40.1 Lemma. Let $I$ be an interval of $\mathbb{R}$ and $f : I \to \mathbb{R}$. Then $f$ is convex if and only if, for all $a, x, b \in I$ with $a < x < b$, one of the equivalent inequalities (C1), (C2), (C3), or (C4) holds.
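The slope inequalities in Lemma 40.1 are easy to test numerically. The following Python sketch is only an illustration: the helper names `slope` and `three_chord_holds`, the sample points, and the tolerance are our own choices. It checks that the chord slopes over $[a,x]$, $[a,b]$, $[x,b]$ increase in that order, which is (C2) and (C3) together and hence also (C4).

```python
import itertools
import math

def slope(f, p, q):
    """Slope of the chord of f joining (p, f(p)) and (q, f(q))."""
    return (f(q) - f(p)) / (q - p)

def three_chord_holds(f, a, x, b, tol=1e-12):
    """Check slope(a, x) <= slope(a, b) <= slope(x, b) for a < x < b, up to tol."""
    s_ax, s_ab, s_xb = slope(f, a, x), slope(f, a, b), slope(f, x, b)
    return s_ax <= s_ab + tol and s_ab <= s_xb + tol

points = [-2.0, -1.0, -0.25, 0.5, 1.5, 3.0]          # already sorted
triples = list(itertools.combinations(points, 3))    # every triple is increasing

# exp is convex on R; sin is not convex on [-2, 3].
print(all(three_chord_holds(math.exp, a, x, b) for a, x, b in triples))
print(all(three_chord_holds(math.sin, a, x, b) for a, x, b in triples))
```

A finite check of this kind can refute convexity but can never prove it; it is meant only to make the geometric content of (C2)–(C4) concrete.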
40.2 Corollary. Every convex function on an open interval $I$ is continuous. In fact, for each closed interval $[a,b] \subset I$, $f$ is Lipschitz on $[a,b]$.
Proof. Let $I$ be an open interval, $f$ convex on $I$. We actually show $f$ is Lipschitz on any closed interval $[a,b] \subset I$ with $a < b$. Choose $a', b' \in I$ such that $a' < a < b < b'$. Then, for distinct $x, y \in [a,b]$, we have, applying (C1)–(C4) several times,
\[ \frac{f(a')-f(a)}{a'-a} \le \frac{f(x)-f(y)}{x-y} \le \frac{f(b)-f(b')}{b-b'}. \]
Thus, putting $K = \max\left\{ \left| \frac{f(a')-f(a)}{a'-a} \right|, \left| \frac{f(b)-f(b')}{b-b'} \right| \right\}$ yields
\[ -K \le \frac{f(x)-f(y)}{x-y} \le K, \]
so that $|f(x) - f(y)| \le K|x-y|$.
40.3 Corollary. Let $f$ be convex on $I$ and $a < x < b$, with $a, b \in I$.
(1) If $f(a) \le f(x)$, then $f(x) \le f(b)$.
(2) If $f(x) \ge f(b)$, then $f(a) \ge f(x)$.
(3) If $f(a) < f(x)$, then $f(x) < f(b)$.
(4) If $f(x) > f(b)$, then $f(a) > f(x)$.
In any case, $f(x) \le \max\{f(a), f(b)\}$.
Proof. By (C4), if $f$ is convex and $a < x < b$, then $f(x) - f(a) \ge 0$ implies $f(b) - f(x) \ge 0$, which yields (1), and $f(b) - f(x) \le 0$ implies $f(x) - f(a) \le 0$, which yields (2). (3) and (4) are proved in the same way.
We will now see that every non-monotone convex function on an open interval decreases
to a minimum and increases thereafter. Notice that there is no assumption involving differentiation here.
40.4 Theorem. Let $I$ be an open interval and $f$ be a convex function on $I$. Then, either $f$ is monotone, or there exists $x_0 \in I$ such that $f$ is decreasing on $\{x \in I : x \le x_0\}$ and $f$ is increasing on $\{x \in I : x \ge x_0\}$.
Proof. Assume $f$ is not monotone. Then, we can find $a, z, b \in I$ with $a < z < b$ and either $f(a) < f(z)$ and $f(z) > f(b)$ or $f(a) > f(z)$ and $f(z) < f(b)$. The first possibility is excluded by the corollary; hence, the second must hold. But $f$ is continuous on the compact interval $[a,b]$, so has a minimum value at some $x_0$:
\[ f(x_0) = \min\{f(x) : x \in [a,b]\}. \]
Since $f(x_0) \le f(z)$, we have
\[ f(a) > f(x_0) \quad \text{and} \quad f(x_0) < f(b). \]
From this, we actually have
\[ f(x_0) = \min\{f(x) : x \in I\}. \]
For example, if $x < a$, we have $x < a < x_0$, and $f(a) > f(x_0)$, so by the corollary, $f(x) \ge f(a) > f(x_0)$. A similar argument works for $x > b$.
Now, let $x_1 < x_2 \le x_0$. Then $f(x_2) \ge f(x_0)$, so $f(x_1) \ge f(x_2)$; hence, $f$ is decreasing on $\{x \in I : x \le x_0\}$. In the same way, $f$ is increasing on $\{x \in I : x \ge x_0\}$.
40.5 Theorem. If $I$ is an open interval and $f : I \to \mathbb{R}$ is differentiable on $I$, with $f'$ increasing on $I$, in particular, if $f'' \ge 0$ on $I$, then $f$ is convex.
Proof. Let $f'$ be increasing and let $a, x, b \in I$ with $a < x < b$. Then the Mean Value Theorem yields $c_1 \in (a,x)$ and $c_2 \in (x,b)$ with
\[ f'(c_1) = \frac{f(x)-f(a)}{x-a} \quad \text{and} \quad f'(c_2) = \frac{f(x)-f(b)}{x-b}. \]
Since $f'$ is increasing, $f'(c_1) \le f'(c_2)$, so that inequality (C4) holds, for all such $a < x < b$. Thus, $f$ is convex.
Since the condition $f'' \ge 0$ on $I$ entails $f'$ is increasing on $I$, we are done.
Recall that a function $f$ is left differentiable at $x \in I$ if $x$ is not a left endpoint of $I$ and its left derivative, defined by $f'_-(x) = \lim_{u \to x-} \frac{f(u)-f(x)}{u-x}$, exists (in $\mathbb{R}$), and $f$ is right differentiable at $x$ if $x$ is not a right endpoint of $I$ and $f'_+(x) = \lim_{u \to x+} \frac{f(u)-f(x)}{u-x}$ exists.
40.6 Theorem. Let $I$ be an open interval of $\mathbb{R}$ and let $f : I \to \mathbb{R}$ be convex. Then,
(1) $f$ is both left and right differentiable at each point $x \in I$, with
(2) $f'_-(x) \le f'_+(x)$;
(3) each of $f'_-$ and $f'_+$ is an increasing function on $I$;
(4) $f$ is differentiable except at a countable number of points of $I$;
(5) if $[a,b] \subset I$, then for all $x, y \in [a,b]$,
\[ |f(x) - f(y)| \le M|x-y|, \quad \text{where } M = \max\{|f'_+(a)|, |f'_-(b)|\}. \]
Proof. Fix $x \in I$. Define for $u \ne x$,
\[ \varphi(u) = \frac{f(u)-f(x)}{u-x}. \]
Then, $\varphi$ is an increasing function on $I \setminus \{x\}$. There are 3 cases to check, but they all come from (C2)–(C4) with different choices of the variables. For example, if $u_1 < u_2 < x$, replace $a, x, b$ by $u_1, u_2, x$ in (C3) to obtain
\[ \varphi(u_1) = \frac{f(u_1)-f(x)}{u_1-x} \le \frac{f(u_2)-f(x)}{u_2-x} = \varphi(u_2). \]
If $u_1 < x < u_2$, replace $a, x, b$ by $u_1, x, u_2$ in (C4) to get
\[ \varphi(u_1) = \frac{f(u_1)-f(x)}{u_1-x} \le \frac{f(x)-f(u_2)}{x-u_2} = \varphi(u_2). \]
We’ll leave the 3rd case for the reader. Now, if $u < x < v$, we have
\[ \varphi(u) \le \varphi(v), \]
so letting $u \to x-$ and $v \to x+$, we have
\[ \lim_{u \to x-} \varphi(u) = \sup_{u < x} \varphi(u) \le \inf_{v > x} \varphi(v) = \lim_{v \to x+} \varphi(v); \]
that is, $f'_-(x)$ and $f'_+(x)$ exist with
\[ f'_-(x) \le f'_+(x), \]
proving (1) and (2).
We now prove (3) in a strong form. Let $x_1 < x_2$ and choose $u$ with $x_1 < u < x_2$, obtaining, using (C4),
\[ f'_+(x_1) \le \frac{f(u)-f(x_1)}{u-x_1} \le \frac{f(u)-f(x_2)}{u-x_2} \le f'_-(x_2), \]
so
\[ f'_-(x_1) \le f'_+(x_1) \le f'_-(x_2) \le f'_+(x_2). \tag{$*$} \]
Thus, both $f'_-$ and $f'_+$ are increasing.
Now, increasing functions on an interval have at most a countable number of discontinuities (Theorem 30.10). So to prove (4), we need only show that $f$ is differentiable at each point $x$ where $f'_-$ is continuous. But at such a point, ($*$) yields for $u > x$,
\[ f'_-(x) \le f'_+(x) \le f'_-(u) \]
and letting $u \to x+$ gives
\[ f'_-(x) \le f'_+(x) \le f'_-(x), \]
so $f'_-(x) = f'_+(x)$, as required.
Now, to prove (5) let $a \le x, y \le b$ with $x \ne y$. Then, using consequences of (C1)–(C4) again,
\[ f'_+(a) \le \frac{f(x)-f(y)}{x-y} \le f'_-(b), \]
so putting $M = \max\{|f'_+(a)|, |f'_-(b)|\}$, we have
\[ -M \le \frac{f(x)-f(y)}{x-y} \le M; \]
that is, $\left| \frac{f(x)-f(y)}{x-y} \right| \le M$, so that (5) follows.
Finally, as we have noted, (5) says $f$ is Lipschitz on every compact subinterval of $I$, so is continuous in a strong way.
40.7 Warning. If the interval $I$ is not open, $f$ can be convex on $I$, without being continuous. For example, the function $f$ defined on $(-1, 1]$ by
\[ f(x) = \begin{cases} x^2, & x < 1 \\ 7, & x = 1 \end{cases} \]
is convex, with a jump discontinuity at $x = 1$.
Putting together the above results, we obtain the following commonly used characterizations.
40.8 Corollary. Let $I$ be an open interval.
(1) If $f$ is differentiable on $I$, then $f$ is convex iff $f'$ is increasing on $I$.
(2) If $f$ is twice differentiable on $I$, then $f$ is convex iff $f'' \ge 0$ on $I$.
(3) If $f$ is twice differentiable and $f''$ is never 0, then $f$ is (strictly) convex iff there exists $c \in I$ with $f''(c) > 0$.
Proof. (1) Assuming $f$ is differentiable on $I$, $f' = f'_+ = f'_-$, so $f'$ is increasing by (3) of the theorem. We already proved that if $f'$ is increasing, then $f$ is convex.
(2) If $f$ is twice differentiable, then $f'$ is increasing on $I$ iff $f'' \ge 0$ on $I$, so (1) applies.
(3) By the Darboux property of the derivative $(f')'$, since $f''$ is never 0, the positivity of $f''$ at one point implies positivity in the whole interval and (2) applies.
The one-sided derivatives of a convex function determine tangent lines. The graph of
the function never falls below these lines.
40.9 Theorem. Let $I$ be an open interval of $\mathbb{R}$, $x_0 \in I$, and let $f : I \to \mathbb{R}$ be convex. Let $m$ be any real number with $f'_-(x_0) \le m \le f'_+(x_0)$. Then, the linear function $T : x \mapsto f(x_0) + m(x - x_0)$ satisfies
\[ T(x_0) = f(x_0), \qquad T(x) \le f(x) \text{ for all } x \in I. \]
Proof. Of course for $x = x_0$, $T(x) = f(x_0)$. We have seen that for $x > x_0$,
\[ \frac{f(x)-f(x_0)}{x-x_0} \ge f'_+(x_0) \ge m. \]
So, $f(x) \ge f(x_0) + m(x - x_0) = T(x)$. Similarly, for $x < x_0$,
\[ \frac{f(x)-f(x_0)}{x-x_0} \le f'_-(x_0) \le m, \]
which again yields $f(x) \ge f(x_0) + m(x - x_0) = T(x)$, since $x - x_0 < 0$.
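To see Theorem 40.9 at work on a function with a corner, take $f(x) = |x|$ and $x_0 = 0$, where the one-sided derivatives are $-1$ and $+1$ (an easy hand computation). The Python sketch below is only an illustration; `supporting_line` is a hypothetical helper of ours, and the sample grid is arbitrary.

```python
def supporting_line(f, x0, m):
    """Return the function T(x) = f(x0) + m * (x - x0)."""
    return lambda x: f(x0) + m * (x - x0)

# Sample points in [-3, 3]; f(x) = |x| has one-sided derivatives -1 and +1 at 0.
xs = [k / 10 for k in range(-30, 31)]

# Any slope m with -1 <= m <= 1 gives a line lying on or below the graph.
for m in (-1.0, -0.3, 0.0, 0.7, 1.0):
    T = supporting_line(abs, 0.0, m)
    assert all(T(x) <= abs(x) for x in xs)

# A slope outside [-1, 1] rises above the graph somewhere.
T_bad = supporting_line(abs, 0.0, 1.5)
print(any(T_bad(x) > abs(x) for x in xs))
```

Every slope between the two one-sided derivatives works, which is why a convex function can have many supporting lines at a corner but exactly one at a point of differentiability.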
40.10 Corollary. Every convex function $f$ on an open interval $I$ is the pointwise supremum of linear functions:
\[ f(x_0) = \sup\{T(x_0) : T \text{ linear and } T \le f \text{ on } I\}. \]
Proof. If $T \le f$ on $I$ and $x_0 \in I$, then $T(x_0) \le f(x_0)$. So
\[ \sup\{T(x_0) : T \text{ linear and } T \le f \text{ on } I\} \le f(x_0). \]
On the other hand, the theorem shows that there exists a linear $T = T_{x_0}$ with $T \le f$ on $I$ and $T(x_0) = f(x_0)$, so we have equality.
Conversely, you can prove:
40.11 Theorem. If the real function $f$ is the pointwise supremum of a family of convex functions on the interval $I$, then $f$ is also convex.
40.1. Let $f$ be convex on $I$ and $a < x < b$, with $a, b \in I$. If $f(a) < f(x)$, then $f(x) < f(b)$. If $f(x) > f(b)$, then $f(a) > f(x)$.
40.2. Let $f$ be a convex function on the interval $I$. If $f$ has a local minimum at $x_0$, it is also a global minimum.
40.3. Give an example of a non-monotone convex function on an interval which has no minimum value.
40.4. Let $f$ be strictly convex on an interval $I$. Then $f$ has at most one minimizer. If $f$ is not monotone on the interior of $I$, then $f$ has exactly one minimizer.
41. THE RIEMANN INTEGRAL
Here we introduce the Riemann integral using the original formulation of Riemann. The later one due to Darboux will arise naturally in the section on existence. You will have heard of Riemann sums from calculus. Throughout, unless we say otherwise, $f$ is a real-valued function defined (at least) on a bounded interval $[a,b]$.
A Riemann partition of $[a,b]$ is a finite set of non-overlapping intervals
\[ P = \{I_1, I_2, \dots, I_n\} = \{[x_0,x_1], [x_1,x_2], \dots, [x_{n-1},x_n]\} \]
obtained by subdividing the interval with points
\[ a = x_0 \le x_1 \le x_2 \le \dots \le x_n = b. \]
A tagged partition $\pi$ of $[a,b]$ is obtained from a partition $P = \{I_1, I_2, \dots, I_n\}$ by choosing points $t_i \in I_i$ (called tags):
\[ \pi = \{(t_1,I_1), (t_2,I_2), \dots, (t_n,I_n)\}. \]
The length of an interval $I = [u,v]$ is $\lambda(I) = v - u$. In this setting, if $I_i = [x_{i-1},x_i]$ it is traditional to write $\Delta x_i$ for its length. The mesh of the partition $P$ or of the tagged partition $\pi$ is the length of the largest interval in it:
\[ \|\pi\| = \|P\| = \max_i \lambda(I_i) = \max_i (x_i - x_{i-1}) = \max_i \Delta x_i. \]
Notice that the sum of the lengths $\sum_{i=1}^{n} \lambda(I_i) = \sum_i (x_i - x_{i-1}) = b - a$, the length of $[a,b]$. Corresponding to each such tagged partition $\pi$ is a Riemann sum
\[ R(f,\pi) = \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}) = \sum_{(t,I) \in \pi} f(t)\lambda(I). \]
The function $f$ is called Riemann integrable over $[a,b]$ if a real number $r$ exists such that
\[ \lim_{\|\pi\| \to 0} R(f,\pi) = r, \]
in the sense that for all $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ |R(f,\pi) - r| < \varepsilon, \]
whenever $\pi$ is a tagged partition of $[a,b]$ with $\|\pi\| < \delta$.
Of course, $r$ is then referred to as the Riemann integral of $f$ over $[a,b]$; it is denoted $\int_a^b f = \int_a^b f(x)\,dx$.
One can show that if $f$ is integrable on $[a,b]$ then it is bounded on $[a,b]$ (see a sketch in the exercises). At first, I suggest you just accept it as part of the definition.
41.1 Example. To see that you understand the definition, check that if $f$ is constantly $k$, then $\int_a^b f = \int_a^b f(x)\,dx = \int_a^b k\,dx = k(b-a)$, as expected.
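The definition translates almost word for word into code. The sketch below is ours and purely illustrative: a tagged partition is represented by a list of division points and a list of tags, and `random_tagged_partition` is a hypothetical helper that produces one at random. It then confirms Example 41.1 for a constant function.

```python
import random

def riemann_sum(f, points, tags):
    """R(f, pi) for division points x_0 <= ... <= x_n and tags t_i in [x_{i-1}, x_i]."""
    assert len(tags) == len(points) - 1
    total = 0.0
    for i, t in enumerate(tags, start=1):
        assert points[i - 1] <= t <= points[i], "each tag must lie in its interval"
        total += f(t) * (points[i] - points[i - 1])
    return total

def mesh(points):
    """Length of the longest interval of the partition."""
    return max(q - p for p, q in zip(points, points[1:]))

def random_tagged_partition(a, b, n, rng):
    """Hypothetical helper: n intervals with random division points and random tags."""
    points = [a] + sorted(rng.uniform(a, b) for _ in range(n - 1)) + [b]
    tags = [rng.uniform(p, q) for p, q in zip(points, points[1:])]
    return points, tags

rng = random.Random(0)
points, tags = random_tagged_partition(1.0, 4.0, 50, rng)
k = 2.5
# For a constant function every Riemann sum equals k * (b - a), up to rounding.
print(riemann_sum(lambda t: k, points, tags), k * (4.0 - 1.0), mesh(points))
```

For a constant function the tags and the mesh play no role at all, which is exactly why the $\varepsilon$–$\delta$ condition holds trivially in Example 41.1.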
As in the case of the other limit processes we have studied, the limit of the sum is the
sum of the limits and order is preserved.
41.2 Theorem (linearity). The set of Riemann integrable functions on $[a,b]$ is a vector space on which the integral is a linear functional. Thus,
(1) If $f$ is integrable on $[a,b]$ and $k \in \mathbb{R}$, then $kf$ is integrable and $\int_a^b kf = k \int_a^b f$.
(2) If $f$ and $g$ are integrable on $[a,b]$ then so is $f + g$ and $\int_a^b (f+g) = \int_a^b f + \int_a^b g$.
Proof. Let us prove (2) and leave (1) to the reader.
Let $f$ and $g$ be Riemann integrable on $[a,b]$. Let $\varepsilon > 0$. Choose $\delta_1$ so that for each tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta_1$, $|R(f,\pi) - \int_a^b f| < \varepsilon/2$. Choose $\delta_2$ so that for each tagged partition $\pi$ with $\|\pi\| < \delta_2$, $|R(g,\pi) - \int_a^b g| < \varepsilon/2$. Let $\delta = \min\{\delta_1, \delta_2\}$. If $\pi$ is a tagged partition of $[a,b]$ with mesh $\|\pi\| < \delta$, then
\[ R(f+g,\pi) = \sum_{(t,I) \in \pi} (f(t) + g(t))\lambda(I) = R(f,\pi) + R(g,\pi). \]
Thus,
\[ \left| R(f+g,\pi) - \left( \int_a^b f + \int_a^b g \right) \right| \le \left| R(f,\pi) - \int_a^b f \right| + \left| R(g,\pi) - \int_a^b g \right| < \varepsilon. \]
This shows that $f + g$ is integrable on $[a,b]$ with $\int_a^b (f+g) = \int_a^b f + \int_a^b g$.
41.3 Theorem (monotonicity). If $f$ and $g$ are integrable on $[a,b]$ and $f \le g$, then $\int_a^b f \le \int_a^b g$.
Proof. This amounts to a “preservation of inequalities in limits” result. The method is the same as usual. Let $f \le g$ on $[a,b]$ with $\int_a^b f = r$ and $\int_a^b g = s$. Suppose $r > s$. Choose $\varepsilon = (r - s)/2$.
Using the definition, choose $\delta > 0$ so that for $\|\pi\| < \delta$, both $|R(f,\pi) - r| < \varepsilon$ and $|R(g,\pi) - s| < \varepsilon$. Then, choose any tagged partition $\pi$ of $[a,b]$ with mesh less than $\delta$. Then, $R(g,\pi) < s + \varepsilon = r - \varepsilon < R(f,\pi)$, yet
\[ R(f,\pi) = \sum_{(t,I) \in \pi} f(t)\lambda(I) \le \sum_{(t,I) \in \pi} g(t)\lambda(I) = R(g,\pi), \]
a contradiction.
To find such a $\pi$, just choose $n$ with $(b-a)/n < \delta$ and create equally spaced division points: $x_i = a + (b-a)i/n$, for each $i$. Any choice of the tags will do.
41.4 Theorem (additivity over intervals). If $a < c < b$ and $f$ is integrable on $[a,c]$ and on $[c,b]$, then $f$ is integrable on $[a,b]$ with $\int_a^b f = \int_a^c f + \int_c^b f$.
(We will soon show also that if $f$ is integrable over $[a,b]$ then it is integrable over each of $[a,c]$ and $[c,b]$, so the result applies. See Theorem 42.6.)
Proof. Since $f$ is integrable on $[a,c]$ and on $[c,b]$, it is bounded on each, hence is bounded on $[a,b]$. Say $|f(t)| < K < +\infty$, for all $t \in [a,b]$. Let $\varepsilon > 0$. Since $f$ is integrable on $[a,c]$ we can choose $\delta_1 > 0$ such that for each tagged partition $\pi_1$ of $[a,c]$ with $\|\pi_1\| < \delta_1$, $|R(f,\pi_1) - \int_a^c f| < \varepsilon/3$. Similarly, we can choose $\delta_2 > 0$ such that for each tagged partition $\pi_2$ of $[c,b]$ with $\|\pi_2\| < \delta_2$, $|R(f,\pi_2) - \int_c^b f| < \varepsilon/3$. Choose $\delta = \min\{\delta_1, \delta_2, \varepsilon/(6K)\}$.
Now, let $\pi = \{(t_1,I_1), \dots, (t_n,I_n)\}$ be a tagged partition of $[a,b]$ with $\|\pi\| < \delta$ where the subdivision points are $x_0, \dots, x_n$ arranged in increasing order. We create tagged partitions of $[a,c]$ and $[c,b]$ as follows. Choose $k$ such that $c \in I_k = [x_{k-1},x_k]$ and put
\[ \pi_1 = \{(t_1,I_1), \dots, (t_{k-1},I_{k-1}), (c,[x_{k-1},c])\} \]
and
\[ \pi_2 = \{(c,[c,x_k]), (t_{k+1},I_{k+1}), \dots, (t_n,I_n)\}. \]
Thus, we have taken the tagged intervals $(t_i,I_i)$ to the left of $c$ with one more obtained by splitting one with $c$ in it, and did a similar thing to the other side. The point $c$ becomes
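a tag for 2 intervals. Notice that $\|\pi_1\| < \delta_1$ and $\|\pi_2\| < \delta_2$ and since $f(c)(c - x_{k-1}) + f(c)(x_k - c) = f(c)(x_k - x_{k-1}) = f(c)\lambda(I_k)$,
\begin{align*}
R(f,\pi_1) + R(f,\pi_2) &= \sum_{i<k} f(t_i)\lambda(I_i) + f(c)(c - x_{k-1}) + f(c)(x_k - c) + \sum_{i>k} f(t_i)\lambda(I_i) \\
&= \sum_{i=1}^{n} f(t_i)\lambda(I_i) + f(c)\lambda(I_k) - f(t_k)\lambda(I_k) \\
&= R(f,\pi) + (f(c) - f(t_k))\lambda(I_k).
\end{align*}
Moreover, $|f(t_k) - f(c)| < 2K$ and $\lambda(I_k) < \delta \le \varepsilon/(6K)$, so
\begin{align*}
\left| R(f,\pi) - \left( \int_a^c f + \int_c^b f \right) \right| &\le \left| R(f,\pi_1) - \int_a^c f \right| + \left| R(f,\pi_2) - \int_c^b f \right| + |(f(t_k) - f(c))\lambda(I_k)| \\
&< \varepsilon/3 + \varepsilon/3 + 2K\varepsilon/(6K) = \varepsilon.
\end{align*}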
41.5 Lemma (Sequential Criterion). A function $f$ is integrable on $[a,b]$, with $\int_a^b f = r$, iff for any choice of a sequence $(\pi_m)$ of tagged partitions with $\|\pi_m\| \to 0$, we have $\lim_m R(f,\pi_m) = r$.
Proof. This should be a routine exercise by now.
The sequential criterion can be used to prove some of the earlier results a little more
quickly.
Just as for limits of sequences, proving that a function is integrable directly from Riemann’s definition can be tricky. What one needs is a good guess at the value of the integral.
41.6 Example. Let $f(x) = x$, for $x \in [a,b]$. We expect $\int_a^b f = \int_a^b x\,dx = (b^2 - a^2)/2$. Let’s try to prove it.
For a tagged partition $\pi = \{(t_1,[x_0,x_1]), \dots, (t_n,[x_{n-1},x_n])\}$, the Riemann sum is
\[ R(f,\pi) = \sum_{i=1}^{n} t_i (x_i - x_{i-1}), \]
but it is difficult to see how this will converge to the expected limit as $\|\pi\| \to 0$. However, if $\|\pi\| < \delta$, and we create a new tagged partition $\pi^*$ by changing the tags $t_i$ to some other $t_i^*$, we obtain
\begin{align*}
|R(f,\pi) - R(f,\pi^*)| &= \left| \sum_{i=1}^{n} t_i (x_i - x_{i-1}) - \sum_{i=1}^{n} t_i^* (x_i - x_{i-1}) \right| \\
&\le \sum_{i=1}^{n} |t_i - t_i^*| (x_i - x_{i-1}) \le \sum_{i=1}^{n} \delta (x_i - x_{i-1}) = \delta(b-a).
\end{align*}
Take $t_i^*$ to be the midpoint of the interval $[x_{i-1},x_i]$: $t_i^* = (x_{i-1} + x_i)/2$. Then,
\[ R(f,\pi^*) = \sum_{i=1}^{n} \frac{1}{2}(x_{i-1} + x_i)(x_i - x_{i-1}) = \frac{1}{2} \sum_{i=1}^{n} (x_i^2 - x_{i-1}^2) = \frac{1}{2}(b^2 - a^2). \]
Thus, $\|\pi\| < \delta$ implies
\[ \left| R(f,\pi) - \tfrac{1}{2}(b^2 - a^2) \right| \le \delta(b-a), \]
from which we see that $\int_a^b f = \tfrac{1}{2}(b^2 - a^2)$.
What made this proof work? It was the fact that when the mesh of $\pi$ is small enough, all the Riemann sums $R(f,\pi^*)$, obtained by changing the tags of $\pi$, are close to $R(f,\pi)$ and that one can always choose the new tags to give the same value. Let us apply the same method to prove a special case of part of the Fundamental Theorem of Calculus.
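The two facts that drive Example 41.6 can be observed numerically. In the Python sketch below (our own illustration, with an arbitrary random partition of $[0,2]$), midpoint tags reproduce $(b^2 - a^2)/2$ up to rounding, while arbitrary tags stay within $\delta(b-a)$ of it, where $\delta$ is taken to be the mesh.

```python
import random

def riemann_sum_identity(points, tags):
    """Riemann sum of f(x) = x for the given division points and tags."""
    return sum(t * (q - p) for t, p, q in zip(tags, points, points[1:]))

a, b, n = 0.0, 2.0, 200
rng = random.Random(1)
points = [a] + sorted(rng.uniform(a, b) for _ in range(n - 1)) + [b]
delta = max(q - p for p, q in zip(points, points[1:]))  # the mesh of the partition

midpoint_tags = [(p + q) / 2 for p, q in zip(points, points[1:])]
random_tags = [rng.uniform(p, q) for p, q in zip(points, points[1:])]

exact = (b**2 - a**2) / 2
# Midpoint tags: the sum telescopes, so only rounding error remains.
print(abs(riemann_sum_identity(points, midpoint_tags) - exact))
# Arbitrary tags: the error is bounded by delta * (b - a).
print(abs(riemann_sum_identity(points, random_tags) - exact), delta * (b - a))
```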
41.7 Theorem (integrating a continuous derivative). Let $f$ be a continuous real function on the interval $[a,b]$. If there exists a function $F$ continuous on $[a,b]$ and differentiable on $(a,b)$ with $F' = f$ there, then $f$ is integrable with $\int_a^b f = F(b) - F(a)$.
Proof. Let $\varepsilon > 0$. Since $f$ is continuous on $[a,b]$, it is uniformly continuous, so we can choose $\delta > 0$ so that $|t - t^*| < \delta$ implies $|f(t) - f(t^*)| < \varepsilon$. Let $\pi = \{(t_1,[x_0,x_1]), \dots, (t_n,[x_{n-1},x_n])\}$ be a tagged partition with mesh $\|\pi\| < \delta$. If we create a new partition $\pi^*$ by choosing new tags $t_i^*$ in $(x_{i-1},x_i)$, we have
\begin{align*}
|R(f,\pi) - R(f,\pi^*)| &= \left| \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}) - \sum_{i=1}^{n} f(t_i^*)(x_i - x_{i-1}) \right| \\
&\le \sum_{i=1}^{n} |f(t_i) - f(t_i^*)|(x_i - x_{i-1}) < \sum_{i=1}^{n} \varepsilon (x_i - x_{i-1}) = \varepsilon(b-a).
\end{align*}
By the Mean Value Theorem, we could choose the tags $t_i^*$ with
\[ f(t_i^*)(x_i - x_{i-1}) = F'(t_i^*)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}). \]
Sum this over $i$, using the telescoping property, and get
\[ R(f,\pi^*) = \sum_{i=1}^{n} f(t_i^*)(x_i - x_{i-1}) = F(b) - F(a). \]
Thus, $\|\pi\| < \delta$ implies
\[ |R(f,\pi) - (F(b) - F(a))| < \varepsilon(b-a). \]
Since $\varepsilon$ was arbitrary, this tells us $\int_a^b f = F(b) - F(a)$.
41.8 Remark. In the previous result, we actually only need assume f is integrable rather
than continuous. We are in a position to make the required minor modification of the proof
just given, but we have chosen to postpone it to after the existence criteria of the next
section so that it may be naturally grouped with related results. See Theorem 43.2
41.1. If $S$ is a finite subset of $[a,b]$, $f : [a,b] \to \mathbb{R}$, and $f(x) = 0$ if $x \notin S$, then $f$ is integrable with $\int_a^b f = 0$.
41.2. If $f = g$ except at a finite number of points in $[a,b]$, then $f$ is integrable on $[a,b]$ iff $g$ is so, and then $\int_a^b f = \int_a^b g$.
41.3. Use the Riemann definition of integral to prove that the function $f : [0,1] \to \mathbb{R}$, defined by
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q} \end{cases} \]
is not integrable. [Every interval of positive length contains both rationals and irrationals that can be used as tags.]
1 ; xn /g.
INTRODUCTION TO ANALYSIS
41.4. Use the sequential criterion to deduce linearity of the integral from the usual limit theorems for
sequences of reals.
41.5. Use the sequential criterion to deduce monotonicity of the integral from preservation of order in
limits of sequences of reals.
41.6. If $f$ is Riemann integrable on $[a,b]$, then $f$ is bounded on $[a,b]$. Sketch: first check that there exists $\delta > 0$ such that for tagged partitions $\pi$ with $\|\pi\| < \delta$, $|R(f,\pi)| < 1 + |\int_a^b f| = M$. Fix one particular partition $P = \{I_1, \dots, I_n\}$ with $\|P\| < \delta$. If $f$ were unbounded, it would be unbounded on one of the intervals $I_i$, say for $i = k$. Fix the tags on all the other intervals, then choose the tag for the $k$th so that $f(t_k)\lambda(I_k)$ overpowers the rest of the sum, making $|R(f,\pi)| > M$.
42. EXISTENCE OF THE RIEMANN INTEGRAL
The upper and lower Darboux sums for $f$ with respect to the partition $P = \{I_1, \dots, I_n\}$ are
\[ U(f,P) = \sum_{i=1}^{n} \sup f(I_i)\,\lambda(I_i) \quad \text{and} \quad L(f,P) = \sum_{i=1}^{n} \inf f(I_i)\,\lambda(I_i). \]
Notice that $f(I_i) = \{f(t) : t \in I_i\}$ is a set of numbers and the upper sum is defined in terms of the sup of this set.
It is worth noting immediately a connection between the Riemann sums and the Darboux sums: Let us write $\pi \sim P$ if $\pi$ is obtained from $P$ by tagging its intervals. Since, for each tag $t_i \in I_i$,
\[ \inf f(I_i) \le f(t_i) \le \sup f(I_i), \]
we see that each Riemann sum $R(f,\pi)$ satisfies
\[ L(f,P) \le R(f,\pi) \le U(f,P) \]
and since the tags $t_1, \dots, t_n$ move independently, you can check that in fact, for each partition $P$ of $[a,b]$,
\[ L(f,P) = \inf_{\pi \sim P} R(f,\pi) \quad \text{and} \quad U(f,P) = \sup_{\pi \sim P} R(f,\pi). \tag{$*$} \]
If $P$ and $Q$ are partitions of the same interval, one says that $Q$ is finer than $P$, or $Q$ is a refinement of $P$, if each interval of $Q$ comes from subdividing an interval of $P$. The supremum norm of the function $f$ on $[a,b]$ is $\|f\| = \sup_{x \in [a,b]} |f(x)|$ and we recall that $\|P\|$ denotes the mesh of the partition $P$ (the maximum length of the intervals of $P$).
When we refine a partition, the lower sums go up and the upper ones come down, but
not by much.
42.1 Lemma (Comparison of Darboux sums). Let $f : [a,b] \to \mathbb{R}$, and let $P$ and $Q$ be partitions of $[a,b]$, with $Q$ finer than $P$.
(a) Then,
\[ L(f,P) \le L(f,Q) \le U(f,Q) \le U(f,P). \]
(b) If $Q$ is obtained from $P$ by inserting $N$ new division points, then
\[ U(f,P) - U(f,Q) \le 2N\|f\|\|P\| \quad \text{and} \quad L(f,Q) - L(f,P) \le 2N\|f\|\|P\|. \]
Proof. We may assume that $P = \{I_1, \dots, I_n\}$ and that $Q$ is obtained by dividing one interval $I_k$ using $x^*$, where $x^*$ is not one of the subdivision points of $P$. The general case follows by a simple induction.
We work with the lower sums. The upper sums behave similarly.
Say, $x^* \in I_k = [x_{k-1},x_k]$ and $J_1 = [x_{k-1},x^*]$ and $J_2 = [x^*,x_k]$ are the new intervals produced. Then,
\[ L(f,P) = \sum_{i=1}^{n} \inf f(I_i)\,\lambda(I_i) = \sum_{i \ne k} \inf f(I_i)\,\lambda(I_i) + \inf f(I_k)\,\lambda(I_k) \]
and the lower sum for $Q$ is the same, except that the $I_k$ term is replaced by two new ones
\[ \inf f(J_1)\,\lambda(J_1) + \inf f(J_2)\,\lambda(J_2). \]
Now, $f(J_1) \subset f(I_k)$, so $\inf f(J_1) \ge \inf f(I_k)$, and similarly, $\inf f(J_2) \ge \inf f(I_k)$, so
\begin{align*}
\inf f(J_1)\,\lambda(J_1) + \inf f(J_2)\,\lambda(J_2) &\ge \inf f(I_k)\,\lambda(J_1) + \inf f(I_k)\,\lambda(J_2) \\
&= \inf f(I_k)\,(\lambda(J_1) + \lambda(J_2)) \\
&= \inf f(I_k)\,\lambda(I_k).
\end{align*}
When we add in the terms with $i \ne k$, we obtain
\[ L(f,Q) \ge L(f,P), \]
which proves (a).
To prove (b), just notice that all the values of $f$ are between $-\|f\|$ and $\|f\|$, so
\begin{align*}
L(f,Q) - L(f,P) &= \inf f(J_1)\,\lambda(J_1) + \inf f(J_2)\,\lambda(J_2) - \inf f(I_k)\,\lambda(I_k) \\
&\le \|f\|\lambda(J_1) + \|f\|\lambda(J_2) + \|f\|\lambda(I_k) = 2\|f\|\lambda(I_k) \le 2\|f\|\|P\|,
\end{align*}
as required.
42.2 Corollary. If $P$ and $Q$ are partitions of $[a,b]$ and $f : [a,b] \to \mathbb{R}$, then $L(f,P) \le U(f,Q)$.
Proof. If $P$ and $Q$ are any partitions of $[a,b]$, then they have a common refinement $P^*$ obtained by using the division points of both: $P^* = \{I \cap J : I \in P, J \in Q\}$. Then,
\[ L(f,P) \le L(f,P^*) \le U(f,P^*) \le U(f,Q). \]
Thus,
\[ L(f,P) \le U(f,Q), \]
for all partitions $P$ and $Q$ of $[a,b]$.
Now, the Corollary says that all the lower sums are less than or equal to all the upper sums. By completeness of the reals, there must be at least one number in between. The smallest and largest such numbers are
\[ \underline{\int_a^b} f = \sup_P L(f,P) \quad \text{and} \quad \overline{\int_a^b} f = \inf_P U(f,P), \]
where it is understood that $P$ is running over all possible partitions of $[a,b]$. These are known as the lower integral and upper integral, respectively. We will see in the next result that a bounded function $f$ is Riemann integrable if and only if these are equal, that is, if and only if there is only one number between all the lower sums and the upper sums, and that number is the Riemann integral.
If $I$ is a subinterval of $[a,b]$, the oscillation of $f$ on $I$ is
\[ \omega_f(I) = \sup_{s,t \in I} |f(s) - f(t)| = \sup f(I) - \inf f(I). \]
If $P$ is a partition of $[a,b]$, then
\[ U(f,P) - L(f,P) = \sum_{I \in P} \omega_f(I)\,\lambda(I). \]
Let us denote this by $\Omega(f,P)$. Notice that if $Q$ is a refinement of $P$, then upper sums come down and lower sums go up and
\[ \Omega(f,Q) = U(f,Q) - L(f,Q) \le U(f,P) - L(f,P) = \Omega(f,P). \]
42.3 Theorem. Let $f$ be a bounded function on $[a,b]$ to $\mathbb{R}$. Then equivalent are:
(1) $f$ is Riemann integrable;
(2) for each $\varepsilon > 0$, there exists a partition $P$ of $[a,b]$ with $\Omega(f,P) < \varepsilon$;
(3) there is a unique number $D$ such that for every partition $P$ of $[a,b]$,
\[ L(f,P) \le D \le U(f,P); \]
namely, $\int_a^b f = D = \underline{\int_a^b} f = \overline{\int_a^b} f$.
We will refer to (2) as the Basic Integrability Criterion (BIC) and to (3) as Darboux’s characterization.
Proof. Assume first that BIC holds. Let $D$ be a fixed number such that for every partition $P$ of $[a,b]$,
\[ L(f,P) \le D \le U(f,P). \]
For example, take $D$ to be $\underline{\int_a^b} f$ or $\overline{\int_a^b} f$. We will prove that $\int_a^b f = D$, establishing both (1) and (3).
Let $\varepsilon > 0$ and choose a partition $P_\varepsilon$ of $[a,b]$ such that
\[ \Omega(f,P_\varepsilon) < \varepsilon. \]
Let $N$ be the number of intervals in $P_\varepsilon$. Choose $\delta > 0$ so that
\[ 4N\|f\|\delta < \varepsilon. \]
Let $\pi$ be any tagged partition with $\|\pi\| < \delta$. Let $P = \{I_1, \dots, I_n\}$ be the corresponding partition without the tags and let $Q$ be the common refinement of $P_\varepsilon$ and $P$, obtained by using the division points of both. Then,
\[ L(f,Q) \le D \le U(f,Q) \]
and $\Omega(f,Q) \le \Omega(f,P_\varepsilon)$, so
\[ \Omega(f,Q) < \varepsilon. \]
Moreover, fewer than $N$ points are inserted in $P$ to obtain $Q$, so
\[ U(f,P) - U(f,Q) \le 2N\|f\|\|P\| \le 2N\|f\|\delta < \varepsilon/2 \quad \text{and} \quad L(f,Q) - L(f,P) \le 2N\|f\|\|P\| \le 2N\|f\|\delta < \varepsilon/2. \]
Putting this together with the fact that $L(f,P) \le R(f,\pi) \le U(f,P)$, we see that
\[ L(f,Q) - \varepsilon/2 \le R(f,\pi) \le U(f,Q) + \varepsilon/2. \]
Thus, both $R(f,\pi)$ and $D$ are in the interval $[L(f,Q) - \varepsilon/2,\ U(f,Q) + \varepsilon/2]$. This interval has length $U(f,Q) - L(f,Q) + \varepsilon = \Omega(f,Q) + \varepsilon < 2\varepsilon$, so
\[ |R(f,\pi) - D| < 2\varepsilon. \]
We have shown that for all $\varepsilon > 0$, there exists $\delta > 0$ so that for every tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta$,
\[ |R(f,\pi) - D| < 2\varepsilon, \]
so $f$ is Riemann integrable with $\int_a^b f = D$.
Now, suppose that $f$ is Riemann integrable and let $\varepsilon > 0$. Choose $\delta > 0$ so that for every tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta$,
\[ \left| R(f,\pi) - \int_a^b f \right| \le \varepsilon/4. \]
Fix any partition $P$ with $\|P\| < \delta$. Each Riemann sum made with tags in $P$ is thus in the same closed interval of length $\varepsilon/2$. Since the upper sum $U(f,P)$ is the supremum of such Riemann sums and $L(f,P)$ is the infimum of them, $U(f,P)$ and $L(f,P)$ are also in that interval, so $\Omega(f,P) = U(f,P) - L(f,P) \le \varepsilon/2 < \varepsilon$ and (2) holds.
We have left to prove that (3) implies (2). So suppose (3) holds; that is, suppose $D = \underline{\int_a^b} f = \overline{\int_a^b} f$. Let $\varepsilon > 0$. Since $\overline{\int_a^b} f = \inf_P U(f,P)$, we can choose a partition $P_1$ of $[a,b]$ with $U(f,P_1) < D + \varepsilon/2$. Similarly, we choose a partition $P_2$ with $L(f,P_2) > D - \varepsilon/2$. Take $P$ to be a common refinement of $P_1$ and $P_2$ so that
\[ U(f,P) < D + \varepsilon/2 \quad \text{and} \quad L(f,P) > D - \varepsilon/2 \]
and hence
\[ \Omega(f,P) = U(f,P) - L(f,P) < \varepsilon, \]
as required.
42.4 Example. Let $g$ be the indicator function of the rationals on $[0,1]$:
\[ g(x) = \begin{cases} 1, & \text{if } x \text{ is rational} \\ 0, & \text{if } x \text{ is irrational.} \end{cases} \]
We will see that $g$ is not Riemann integrable.
Indeed, let $P = \{I_1, \dots, I_n\}$ be a partition of $[0,1]$. We may assume that each interval $I_i$ is of positive length, since the others do not contribute to the relevant sums. Each $I_i$ contains both a rational and an irrational, so the oscillation of $g$ on $I_i$ is $\omega_g(I_i) = \sup_{s,t \in I_i} |g(s) - g(t)| = 1 - 0 = 1$. Thus,
\[ \Omega(g,P) = \sum_{i=1}^{n} 1 \cdot \lambda(I_i) = 1. \]
Since $P$ was arbitrary, this shows the Basic Integrability Criterion is not satisfied; hence, $g$ is not integrable.
We can also look at this in terms of upper and lower integrals.
\begin{align*}
U(g,P) &= \sum_{i=1}^{n} 1 \cdot (x_i - x_{i-1}) = x_n - x_0 = 1, \\
L(g,P) &= \sum_{i=1}^{n} 0 \cdot (x_i - x_{i-1}) = 0.
\end{align*}
Since $P$ was an arbitrary partition of $[0,1]$ we have $\overline{\int_0^1} g = 1$ and $\underline{\int_0^1} g = 0$, so $g$ is not integrable.
42.5 Example. Let $f(x) = x^2$, defined for $x \in [0,1]$. Consider the partition $P$ determined by $\{x_0, x_1, \dots, x_n\}$ where $x_i = i/n$, for $i = 0, \dots, n$. Then, for each $i$,
\[ \sup f([x_{i-1},x_i]) = x_i^2 = \frac{i^2}{n^2}, \]
so
\[ U(f,P) = \sum_{i=1}^{n} \sup f([x_{i-1},x_i])\,(x_i - x_{i-1}) = \sum_{i=1}^{n} \frac{i^2}{n^2} \cdot \frac{1}{n} = \frac{n(n+1)(2n+1)}{6n^3} \]
and hence, since this is one of the sums in the definition of $\overline{\int_0^1} f$,
\[ \overline{\int_0^1} f \le \frac{n(n+1)(2n+1)}{6n^3}. \]
(The infimum of a set is a lower bound for it.) Since the left side does not depend on $n$, we may take a limit and get
\[ \overline{\int_0^1} f \le \frac{2}{6} = \frac{1}{3}. \]
Similarly, we find
\[ L(f,P) = \sum_{i=1}^{n} \inf f([x_{i-1},x_i])\,(x_i - x_{i-1}) = \sum_{i=1}^{n} \frac{(i-1)^2}{n^2} \cdot \frac{1}{n} = \frac{(n-1)(n)(2(n-1)+1)}{6n^3}, \]
so
\[ \underline{\int_0^1} f \ge \frac{(n-1)(n)(2(n-1)+1)}{6n^3} \]
and in the limit
\[ \underline{\int_0^1} f \ge \frac{1}{3}. \]
Thus,
\[ \frac{1}{3} \le \underline{\int_0^1} f \le \overline{\int_0^1} f \le \frac{1}{3}, \]
so $f$ is Riemann integrable with $\int_0^1 f = \frac{1}{3}$ by Darboux’s characterization.
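The closed forms in Example 42.5 can be confirmed with exact rational arithmetic. The Python sketch below is our own illustration (the function name `darboux_sums_square` is hypothetical); it relies on the fact that $x^2$ is increasing on $[0,1]$, so the sup and inf on each interval are attained at the right and left endpoints.

```python
from fractions import Fraction

def darboux_sums_square(n):
    """Upper and lower Darboux sums of f(x) = x**2 on [0, 1] for n equal intervals.

    f is increasing on [0, 1], so sup and inf over [x_{i-1}, x_i] are the
    values at the right and left endpoints respectively.
    """
    xs = [Fraction(i, n) for i in range(n + 1)]
    upper = sum(xs[i] ** 2 * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    lower = sum(xs[i - 1] ** 2 * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return upper, lower

for n in (1, 10, 100, 1000):
    upper, lower = darboux_sums_square(n)
    # Compare with the closed forms derived in the example.
    assert upper == Fraction(n * (n + 1) * (2 * n + 1), 6 * n**3)
    assert lower == Fraction((n - 1) * n * (2 * n - 1), 6 * n**3)
    print(n, float(lower), float(upper), float(upper - lower))
```

The gap between the two sums is exactly $1/n$ here, so the Basic Integrability Criterion is visibly satisfied as $n$ grows.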
Here is a general example of the use of the Basic Integrability Criterion to show integrability, when we don’t know the value of the integral.
42.6 Theorem (integrability over subintervals). If $f$ is integrable over $[a,b]$ and $[c,d] \subset [a,b]$ then $f$ is integrable over $[c,d]$.
Proof. We use the Basic Integrability Criterion.
Notice that $J \subset I$ implies $\omega_f(J) \le \omega_f(I)$, since this amounts to taking supremum over a smaller set.
Let $\varepsilon > 0$. Use the fact that $f$ is integrable on $[a,b]$ to choose a partition $P_1$ of $[a,b]$ with $\Omega(f,P_1) < \varepsilon$. Let $P$ be the partition of $[c,d]$ obtained by intersecting each $I \in P_1$ with $[c,d]$. Then
\[ \Omega(f,P) = \sum_{I \in P_1} \omega_f(I \cap [c,d])\,\lambda(I \cap [c,d]) \le \sum_{I \in P_1} \omega_f(I)\,\lambda(I) = \Omega(f,P_1) < \varepsilon. \]
Thus, $f$ also satisfies the BIC on $[c,d]$, as required.
42.7 Theorem. If $f$ is integrable on $[a,b]$ then so is $|f|$ and $\left| \int_a^b f \right| \le \int_a^b |f|$.
Proof. Let $f$ be integrable. If we can show $|f|$ integrable, we can use monotonicity:
\[ -|f| \le f \le |f|, \]
so noting that $-\int_a^b |f| = \int_a^b (-|f|)$, we obtain
\[ -\int_a^b |f| \le \int_a^b f \le \int_a^b |f|, \]
or in other words
\[ \left| \int_a^b f \right| \le \int_a^b |f|. \]
To prove that $|f|$ is integrable, we use the Basic Criterion. Let $\varepsilon > 0$. Suppose $P$ is a partition of $[a,b]$ with
\[ \sum_{I \in P} \omega_f(I)\,\lambda(I) < \varepsilon. \]
For each $I$, we have for $s, t \in I$
\[ \bigl| |f(t)| - |f(s)| \bigr| \le |f(t) - f(s)|. \]
Then
\[ \omega_{|f|}(I) = \sup_{s,t} \bigl| |f(t)| - |f(s)| \bigr| \le \sup_{s,t} |f(t) - f(s)| = \omega_f(I). \]
So multiplying by $\lambda(I)$ and summing gives
\[ \sum_{I} \omega_{|f|}(I)\,\lambda(I) \le \sum_{I} \omega_f(I)\,\lambda(I) < \varepsilon; \]
hence $|f|$ satisfies the Basic Criterion for integrability, as required.
In the above result we have shown that integrability implies absolute integrability. The converse fails, as the reader can show.
Now we will apply the Basic Integrability Criterion to establish integrability of two of
our favourite kinds of functions.
42.8 Theorem. Let $f$ be a real function on $[a,b]$.
(1) If $f$ is monotone, then $f$ is integrable on $[a,b]$.
(2) If $f$ is continuous, then $f$ is integrable on $[a,b]$.
Proof. We show in each case that the Basic Integrability Criterion is satisfied. We see that for any partition $P$ of $[a,b]$ determined by $x_0, x_1, \dots, x_n$,
\[ \Omega(f,P) = \sum_{i=1}^{n} \omega_f([x_{i-1},x_i])\,(x_i - x_{i-1}), \]
and we will use the hypotheses to make this small. In the first case this will be done by making the $\Delta x_i = x_i - x_{i-1}$ small enough to overpower the total change in $f$, and in the second by making the $\omega_f([x_{i-1},x_i])$ small enough to overpower the total change in $x$.
(1) We assume without loss of generality that $f$ is increasing. The decreasing case is similar.
Let $\varepsilon > 0$. Choose $\delta > 0$ such that $(f(b) - f(a))\delta < \varepsilon$ and then a partition $P$ determined by $\{x_0, \dots, x_n\}$ with $\|P\| < \delta$. Since $f$ is increasing, for each $i$, $f(x_i)$ is the maximum value of $f$ on $I_i = [x_{i-1},x_i]$ and $f(x_{i-1})$ is the minimum value. Hence, $\omega_f([x_{i-1},x_i]) = f(x_i) - f(x_{i-1})$ and
\[ \Omega(f,P) = \sum_{i=1}^{n} (f(x_i) - f(x_{i-1}))(x_i - x_{i-1}) \le \sum_{i=1}^{n} (f(x_i) - f(x_{i-1}))\,\delta = (f(b) - f(a))\,\delta < \varepsilon. \]
Thus, the Basic Integrability Criterion is satisfied and $f$ is integrable.
(2) Suppose $f$ is continuous on $[a,b]$. Then $f$ is uniformly continuous. Let $\varepsilon > 0$. Then there exists $\delta > 0$ such that $|s - t| < \delta$ implies $|f(s) - f(t)| < \varepsilon/(b-a)$. Choose such a $\delta$ and then choose any partition $P$ with $\|P\| < \delta$. Then, for all $I \in P$, $s, t \in I$ implies $|f(s) - f(t)| < \varepsilon/(b-a)$, so $\omega f(I) \le \varepsilon/(b-a)$ and
\[ \Omega(f,P) = \sum_{i=1}^{n} \omega f([x_{i-1}, x_i])\,(x_i - x_{i-1}) \le \sum_{i=1}^{n} \frac{\varepsilon}{b-a}\,(x_i - x_{i-1}) = \varepsilon. \]
Again this shows the Basic Integrability Criterion is satisfied, so $f$ is integrable.
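The estimate in part (1) can be checked on a concrete case. The following is a minimal sketch, assuming the increasing function $x \mapsto x^3$ on $[0,2]$ and a non-uniform partition chosen purely for illustration (both are hypothetical choices, not from the notes). It computes the oscillation sum exactly with rational arithmetic and compares it with the bound $(f(b) - f(a))\,\|P\|$.

```python
from fractions import Fraction


def oscillation_sum_increasing(f, points):
    """Oscillation sum of an increasing f over the partition given by `points`.

    For increasing f the oscillation on [x0, x1] is f(x1) - f(x0).
    """
    return sum((f(x1) - f(x0)) * (x1 - x0) for x0, x1 in zip(points, points[1:]))


def mesh(points):
    """Length of the longest subinterval of the partition."""
    return max(x1 - x0 for x0, x1 in zip(points, points[1:]))


def cube(x):
    # Illustrative increasing function (an assumption for this sketch).
    return x ** 3


def squares_partition(n):
    # Illustrative non-uniform partition of [0, 2]: x_i = 2 * (i/n)^2.
    return [2 * Fraction(i, n) ** 2 for i in range(n + 1)]


if __name__ == "__main__":
    for n in (5, 50, 500):
        P = squares_partition(n)
        bound = (cube(P[-1]) - cube(P[0])) * mesh(P)
        print(n, float(oscillation_sum_increasing(cube, P)), float(bound))
```

As the mesh shrinks, both the oscillation sum and the bound tend to 0, which is exactly what the Basic Integrability Criterion asks for.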
Continuous and monotone functions are by no means the only ones that are integrable.
Changing the value of a function on a finite number of points, for example, destroys both
properties, but leaves the function integrable with the same integral. Right now, if you
haven’t already done so, you should prove the special case:
42.9 Theorem. If $S$ is a finite subset of $[a,b]$, $f : [a,b] \to \mathbb{R}$, and $f(x) = 0$ if $x \notin S$, then $f$ is integrable with $\int_a^b f = 0$.
Here is a more dramatic example.
42.10 Example. The Dirichlet function, defined on $[0,1]$ by
\[ f(x) = \begin{cases} \dfrac{1}{q}, & \text{if } x = \dfrac{p}{q} \text{ is rational in lowest terms} \\ 0, & \text{otherwise} \end{cases} \]
is integrable with integral 0.
The point is that we have shown elsewhere in the course (Example 30.4) that this function is discontinuous at all rationals (and continuous at all irrationals) of $[0,1]$ and it is certainly not monotone. Nevertheless, it is integrable.
Proof. Since $f \ge 0$, the lower integral satisfies $\underline{\int_0^1} f \ge 0$, so it is the upper integral that is of interest.
Now, let $\varepsilon > 0$. Since there are only finitely many $q$ with $1/q > \varepsilon$, there are only finitely many rationals in $[0,1]$ of the form $p/q$ with $f(p/q) = 1/q > \varepsilon$. But there are no irrationals with $f(x) > \varepsilon$, since $f(x)$ is 0 for $x$ irrational. Thus, the set
\[ F := \{x : f(x) > \varepsilon\} \]
is finite. Say it has $N$ elements. Choose any partition $P$ of $[0,1]$ of mesh $\|P\| < \dfrac{\varepsilon}{2N}$.
At most $2N$ of the intervals of $P$ can contain points of $F$. All values of $f$ are $\le 1$ and if $I \cap F = \emptyset$, $\sup f(I) \le \varepsilon$, so
\[ U(f,P) = \sum_{I} \sup f(I)\,\lambda(I) = \sum_{I \cap F \ne \emptyset} \sup f(I)\,\lambda(I) + \sum_{I \cap F = \emptyset} \sup f(I)\,\lambda(I) \le \sum_{I \cap F \ne \emptyset} 1 \cdot \lambda(I) + \sum_{I \cap F = \emptyset} \varepsilon\,\lambda(I). \]
The first term here is at most $2N\|P\| < \varepsilon$ and the second is at most $\varepsilon$ times the total length of all the intervals of $P$, which is $1 - 0 = 1$. Thus,
\[ U(f,P) \le 2\varepsilon; \]
hence,
\[ \overline{\int_0^1} f \le 2\varepsilon, \quad \text{for all } \varepsilon > 0, \]
and therefore $\overline{\int_0^1} f \le 0$. Since $\underline{\int_0^1} f \ge 0$, we have $\int_0^1 f = 0$.
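To see the upper sums of this function actually shrinking, here is a small sketch that evaluates $U(f,P)$ on the uniform partition of $[0,1]$ into $n$ intervals. The helper names are hypothetical. The key observation used is that the supremum of $f$ on $[k/n, (k+1)/n]$ equals $1/q$ for the smallest denominator $q$ such that some fraction $p/q$ lies in the interval.

```python
def sup_on_interval(k, n):
    """sup of f on [k/n, (k+1)/n], where f(p/q) = 1/q (lowest terms), f = 0 elsewhere.

    Some integer p satisfies k/n <= p/q <= (k+1)/n exactly when
    ceil(k*q/n) <= floor((k+1)*q/n). The smallest such q gives the sup,
    and p/q is then automatically in lowest terms.
    """
    q = 1
    while -(-k * q // n) > (k + 1) * q // n:  # ceil(k*q/n) > floor((k+1)*q/n)
        q += 1
    return 1.0 / q


def upper_sum(n):
    """Upper sum U(f, P) for the uniform partition of [0, 1] into n intervals."""
    return sum(sup_on_interval(k, n) for k in range(n)) / n


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, upper_sum(n))
```

The printed values decrease toward 0, consistent with the upper integral being 0, although the convergence is slow.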
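42.11 Theorem (Integral of products). If $f, g$ are integrable on $[a,b]$, then so is $fg$.
Proof. Let $f$ and $g$ be integrable on $[a,b]$. Then, $f$ and $g$ are both bounded. Say both $\|f\|$ and $\|g\|$ are less than $K$. Let $\varepsilon > 0$. Choose, by the BIC, partitions $P_1$ and $P_2$ so that $\Omega(f,P_1) < \varepsilon/2K$ and $\Omega(g,P_2) < \varepsilon/2K$. Take $P$ to be a common refinement of these; by Exercise 42.1, $\Omega(f,P) \le \Omega(f,P_1)$ and $\Omega(g,P) \le \Omega(g,P_2)$.
Then, for each $I \in P$, $s, t \in I$ implies
\[ |f(s)g(s) - f(t)g(t)| \le |f(s) - f(t)|\,|g(s)| + |f(t)|\,|g(s) - g(t)| \le K\,\omega f(I) + K\,\omega g(I). \]
Thus,
\[ \omega(fg)(I) \le K\,\omega f(I) + K\,\omega g(I), \]
so
\[ \Omega(fg,P) = \sum_{I \in P} \omega(fg)(I)\,\lambda(I) \le K\,\Omega(f,P) + K\,\Omega(g,P) < K\frac{\varepsilon}{2K} + K\frac{\varepsilon}{2K} = \varepsilon. \]
Thus, $fg$ satisfies the BIC, so is integrable.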
42.12 Theorem (Integral of composites). Let $f : [a,b] \to [c,d]$ and $g : [c,d] \to \mathbb{R}$.
(a) If $f$ is integrable and $g$ is continuous then $g \circ f$ is integrable on $[a,b]$.
(b) If $f$ and $g$ are both integrable, the composite need not be integrable.
Proof. (a) Since $g$ is continuous on the compact interval $[c,d]$, it is bounded: $\|g\| = \sup_{x \in [c,d]} |g(x)| < \infty$.
Let $\varepsilon > 0$. Choose $\varepsilon' > 0$ so that $\varepsilon'(b-a) + 2\|g\|\varepsilon' < \varepsilon$.
Since $g$ is continuous on $[c,d]$, $g$ is uniformly continuous, so we may choose $\delta > 0$ such that
\[ |g(s) - g(t)| \le \varepsilon' \quad \text{when } |s - t| < \delta. \]
Since $f$ is integrable, there exists a partition $P$ of $[a,b]$ with
\[ \Omega(f,P) < \delta\varepsilon'. \]
Fix such a $P = \{I_1, \dots, I_n\}$. Divide the index set into two parts,
\[ A = \{i : \omega f(I_i) < \delta\}, \qquad B = \{i : \omega f(I_i) \ge \delta\}. \]
Then, for $i$ in $A$, $x, y \in I_i$ implies $|f(x) - f(y)| \le \omega f(I_i) < \delta$, so
\[ |g \circ f(x) - g \circ f(y)| = |g(f(x)) - g(f(y))| \le \varepsilon'. \]
Hence,
\[ \omega(g \circ f)(I_i)\,\lambda(I_i) \le \varepsilon'\,\lambda(I_i), \]
for $i \in A$, and
\[ \sum_{i \in A} \omega(g \circ f)(I_i)\,\lambda(I_i) \le \sum_{i \in A} \varepsilon'\,\lambda(I_i) \le \varepsilon'(b-a). \]
For $i \in B$, $\omega f(I_i) \ge \delta$, so
\[ \sum_{i \in B} \delta\,\lambda(I_i) \le \sum_{i \in B} \omega f(I_i)\,\lambda(I_i) \le \Omega(f,P) < \delta\varepsilon', \]
which shows that the total length of the intervals indexed in $B$ is less than $\varepsilon'$ and hence, since $\omega(g \circ f)(I_i) \le 2\|g\|$,
\[ \sum_{i \in B} \omega(g \circ f)(I_i)\,\lambda(I_i) \le \sum_{i \in B} 2\|g\|\,\lambda(I_i) \le 2\|g\|\varepsilon'. \]
Thus, altogether,
\[ \Omega(g \circ f, P) \le \varepsilon'(b-a) + 2\|g\|\varepsilon' < \varepsilon. \]
This shows $g \circ f$ is integrable by the Basic Integrability Criterion.
(b) The composite need not be integrable if $f$ and $g$ are both integrable, but $g$ is not continuous. Let $f$ be the Dirichlet function on $[0,1]$ ($f(x) = 1/q$, if $x = p/q$ in lowest terms, $f(x) = 0$, if $x$ is irrational). Let $g$ on $[0,1]$ be given by $g(u) = 1$, if $u \ne 0$, and $g(0) = 0$. Then, $f$ is integrable and $g$ is integrable, but the composite $g \circ f$ is the indicator function of the rationals on $[0,1]$, which is the basic example of a non-integrable function.
One can also prove that $g \circ f$ need not be integrable if $g$ is integrable and $f$ is continuous. This requires a more sophisticated argument.
42.13 Examples. Here are some applications of the above results.
(1) If $f$ is integrable on $[a,b]$, then the function $\sin(f)$, which really means $\sin \circ f$, is also integrable on $[a,b]$, since the sine function is continuous everywhere.
(2) If $f$ is integrable on $[a,b]$ so is $|f|$. This has been proved earlier, but it is also a consequence of the present result since the absolute value map is continuous.
(3) If $f$ is integrable, so is $f^2$. This follows from the integral of products result, but it is also a consequence of the composite result since the function $g : u \mapsto u^2$ is continuous and $g \circ f = f^2$.
(4) (Integral of Products again.) If $f, g$ are integrable on $[a,b]$, then so is $fg$. Indeed, $f + g$ is integrable, so $f^2$, $g^2$ and $(f+g)^2$ are integrable, and hence
\[ fg = \frac{1}{2}\left[(f+g)^2 - f^2 - g^2\right] \]
is so also.
(5) If $f, g$ are integrable on $[a,b]$, so are
\[ f \vee g := \max\{f, g\} \quad \text{and} \quad f \wedge g := \min\{f, g\}. \]
Indeed, for real numbers $u, v$,
\[ u + v + |u - v| = \begin{cases} 2u, & \text{if } u \ge v \\ 2v, & \text{if } u < v \end{cases} \;=\; 2\max\{u, v\}. \]
Thus,
\[ f \vee g = \frac{1}{2}\left(f + g + |f - g|\right), \]
which is integrable since $f$, $g$, $|f - g|$ are. Similarly
\[ f \wedge g = \frac{1}{2}\left(f + g - |f - g|\right) \]
is integrable.
42.1. If $P$ is a partition of $[a,b]$ and $Q$ is a finer one, then for each $f : [a,b] \to \mathbb{R}$, $\Omega(f,Q) \le \Omega(f,P)$.
42.2. If $f$ is bounded on $[a,b]$ and integrable on $[c,b]$, for all $c \in (a,b)$, then $f$ is integrable on $[a,b]$.
42.3. A function $f : [a,b] \to \mathbb{R}$ is called of bounded variation on $[a,b]$ if there exists a finite $K$ such that $a = x_0 \le x_1 \le x_2 \le \dots \le x_n = b$ implies $\sum_{i=1}^{n} |f(x_{i-1}) - f(x_i)| < K$. Prove that such a function is Riemann integrable.
42.4. Consider the function $f : [0,1] \to \mathbb{R}$, defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x \text{ is rational} \\ 0, & \text{otherwise.} \end{cases} \]
Decide, with proof, whether $f$ is Riemann integrable.
42.5. Let $f$ be continuous on $[a,b]$ with $f(x) \ge 0$ for all $x$ and $\int_a^b f = 0$. Prove that $f$ is constantly 0. Hint: Contrapositive. What can you say about values of $f$ near a point $p$ with $f(p) > 0$?
42.6. Give an example of a function whose absolute value is integrable, but the function is not.
42.7. Let $f$ be bounded on $[a,b]$. Suppose there exists a sequence $(P_n)$ of partitions of $[a,b]$ such that $\lim_{n \to \infty} \bigl(U(f,P_n) - L(f,P_n)\bigr) = 0$. Then
(a) $f$ is integrable on $[a,b]$ and
(b) $\int_a^b f = \lim_n U(f,P_n) = \lim_n L(f,P_n)$.
42.8. Let $f$ be continuous on $[a,b]$ with $f(x) \ge 0$ for all $x$ and $\int_a^b f = 0$. Prove that $f$ is constantly 0. Hint: Contrapositive. What can you say about values of $f$ near a point $p$ with $f(p) > 0$?
42.9. If $f$ is integrable on $[a,b]$ and there exists a number $m > 0$ such that $f(x) > m$, for all $x \in [a,b]$, then $1/f$ is integrable. (The same would hold if $m < 0$ and $f(x) < m$ for all $x \in [a,b]$.)
42.10. For each partition $P$ of $[a,b]$,
\[ U(f,P) = \sup R(f,\mathcal{P}) \quad \text{and} \quad L(f,P) = \inf R(f,\mathcal{P}), \]
where the supremum and infimum are taken over all tagged partitions $\mathcal{P}$ whose underlying partition is $P$.
43. THE FUNDAMENTAL THEOREM OF CALCULUS
If $f$ is a Riemann integrable function on an interval $[a,b]$, it is integrable over subintervals. This was proved in the section EXISTENCE OF THE RIEMANN INTEGRAL. Thus, we may define a new function $F$ on $[a,b]$ by integrating over subintervals. The result is a Lipschitz, hence (uniformly) continuous function.
43.1 Theorem. Let $f$ be Riemann integrable on $[a,b]$. If $F$ is defined on $[a,b]$ by
\[ F(x) = \int_a^x f(t)\,dt \]
then $F$ is Lipschitz, hence uniformly continuous on $[a,b]$.
(Of course, continuity on $[a,b]$ always implies uniform continuity, but the Lipschitz property yields it in a simple way.)
Proof. Recall that $f$ being integrable includes that $f$ is bounded. Say $|f(x)| \le K$, for all $x \in [a,b]$. Since $f$ is integrable on $[a,b]$, it is integrable on subintervals. Thus, if $x, y \in [a,b]$ with $x < y$,
\[ F(y) - F(x) = \int_a^y f - \int_a^x f = \int_x^y f, \]
so
\[ |F(y) - F(x)| \le \int_x^y |f| \le \int_x^y K = K(y - x) = K|x - y|. \]
If $y < x$, interchange the roles of $x$ and $y$ here; in either case, we get
\[ |F(x) - F(y)| \le K|x - y|, \]
for all $x, y \in [a,b]$. This is the statement that $F$ is Lipschitz. From this uniform continuity follows, as we know.
The smallest possible $K$ in the above argument is the supremum norm of $f$ defined by $\|f\| = \sup_{x \in [a,b]} |f(x)|$. What we proved is that
\[ |F(x) - F(y)| \le \|f\|\,|x - y|. \]
43.2 The Fundamental Theorem of Calculus.
(1) (differentiating an integral) Let $f$ be Riemann integrable on $[a,b]$. If $F$ is defined on $[a,b]$ by
\[ F(x) = \int_a^x f(t)\,dt \]
and $f$ is continuous at $c \in [a,b]$, then $F'(c) = f(c)$.
(2) (integrating a derivative) Let $f$ be Riemann integrable on $[a,b]$. If there exists a continuous function $F$ on $[a,b]$ with $F' = f$ on $(a,b)$, then
\[ \int_a^b f = F(b) - F(a). \]
Notice that in both cases we are assuming integrability of $f$. The second part is saying that if $F'$ is integrable, then $\int_a^b F'(t)\,dt = F(b) - F(a)$. Since this also applies for those $x$ between $a$ and $b$, we get $\int_a^x F' = F(x) - F(a)$.
Proof. (1) Suppose $f$ is continuous at $c$. Given $\varepsilon > 0$, choose $\delta > 0$ such that $|x - c| < \delta$ implies $|f(x) - f(c)| < \varepsilon$. Then since $f(c)$ is constant
\[ F(x) - F(c) - f(c)(x - c) = \int_c^x f - \int_c^x f(c) = \int_c^x \bigl(f(t) - f(c)\bigr)\,dt. \]
But for $|x - c| < \delta$, the absolute value of the right side is at most $\left|\int_c^x |f - f(c)|\right| \le \varepsilon|x - c|$. Thus,
\[ \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| \le \varepsilon, \]
for $0 < |x - c| < \delta$. This says
\[ F'(c) = \lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c), \]
as claimed.
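(2) Now, instead assume that there exists a continuous $F$ on $[a,b]$ such that $F' = f$ on $(a,b)$. Recall that $f$ is supposed integrable. So let $\varepsilon > 0$. Choose $\delta > 0$ so that for every tagged partition $\mathcal{P}$ with $\|\mathcal{P}\| < \delta$, $\left|R(f,\mathcal{P}) - \int_a^b f\right| < \varepsilon$.
Create a tagged partition $\mathcal{P}$ of $[a,b]$ determined by $\{x_0, \dots, x_n\}$ with $\|\mathcal{P}\| < \delta$, using the mean value theorem to find tags $t_i$ in $(x_{i-1}, x_i)$ with
\[ f(t_i)(x_i - x_{i-1}) = F'(t_i)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}). \]
Sum this over $i$, using the telescoping property, and get
\[ R(f,\mathcal{P}) = \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}) = F(b) - F(a). \]
Thus,
\[ \left| F(b) - F(a) - \int_a^b f \right| < \varepsilon. \]
Since $\varepsilon$ was arbitrary, this tells us $\int_a^b f = F(b) - F(a)$.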
43.3 Warning. It is quite possible that the map $F : x \mapsto \int_a^x f$ is differentiable at a point $c$, without $f$ being continuous at $c$. For example, if $f(x) = 1$, for $x \in [0,2] \setminus \{1\}$, and $f(1) = 14$, then $\int_0^x f = x$, for all $x \in [0,2]$. (Changing a value at one point does not change the integral.) Hence $F$ will then be differentiable everywhere, with $F'(x) = 1$.
The Dirichlet function is a more extreme example. It is discontinuous at every rational in $[0,1]$, but the integrated function is 0 everywhere, so is differentiable everywhere.
43.4 Example. Let $G(x) = \int_0^{x^3} \sin(\cos(t))\,dt$. Find $G'(x)$ if possible.
Soln. Since sin and cos are continuous on $\mathbb{R}$, their composite is also, so for all $u \in \mathbb{R}$, the integral $\int_0^u \sin(\cos(t))\,dt$ exists and the function $F : u \mapsto \int_0^u \sin(\cos(t))\,dt$ is differentiable by the Fundamental Theorem of Calculus, with $F'(u) = \sin(\cos(u))$, for all $u$. The function in question is $G = F \circ g$, where $g(x) = x^3$. Thus, by the chain rule,
\[ G'(x) = F'(g(x))\,g'(x) = \sin(\cos(x^3)) \cdot 3x^2. \]
A similar question with both limits of integration “variable” is handled by subtraction. For example $\int_{x^2}^{x^3} \sin(\cos(t))\,dt = \int_0^{x^3} \sin(\cos(t))\,dt - \int_0^{x^2} \sin(\cos(t))\,dt$, and each term can be treated separately.
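The formula for $G'$ in Example 43.4 can be sanity-checked numerically. The sketch below is only an illustration: it approximates $G$ with a composite Simpson rule (a hypothetical helper, not part of the notes) and compares a central difference quotient of $G$ with $\sin(\cos(x^3)) \cdot 3x^2$.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def G(x):
    """Numerical approximation of the integral of sin(cos t) from 0 to x^3."""
    return simpson(lambda t: math.sin(math.cos(t)), 0.0, x ** 3)


def G_prime_formula(x):
    """Derivative predicted by the Fundamental Theorem plus the chain rule."""
    return math.sin(math.cos(x ** 3)) * 3 * x ** 2


def G_prime_numeric(x, h=1e-4):
    """Central difference quotient of G at x."""
    return (G(x + h) - G(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (-1.2, 0.0, 0.7, 1.5):
        print(x, G_prime_formula(x), G_prime_numeric(x))
```

The two columns agree to several decimal places. This does not prove anything, but it is a quick way to catch a dropped factor such as the $3x^2$ from the chain rule.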
43.5 Corollary (Integration by parts). Suppose that $f$ and $g$ are differentiable on $[a,b]$ with integrable derivatives. Then,
\[ \int_a^b f g' = f(b)g(b) - f(a)g(a) - \int_a^b f' g. \]
Proof. Since $f$ and $g$ are differentiable, they are continuous, hence integrable. Therefore both $fg'$ and $f'g$ are also integrable.
By the product formula for differentiation, we have
\[ (fg)'(x) = f'(x)g(x) + f(x)g'(x) \]
for all $x$ in $[a,b]$. Thus, by the Fundamental Theorem of Calculus,
\[ \int_a^b f'g + \int_a^b g'f = \int_a^b (f'g + g'f) = \int_a^b (fg)' = f(b)g(b) - f(a)g(a). \]
The following is also known as integration by substitution. For examples of its use,
look in the “Techniques of Integration” section of almost any Calculus book.
43.6 Corollary (Change of Variable). Let $g$ be a real function differentiable on the interval $[c,d]$, with integrable derivative. Let $f$ be a real function which is continuous on the range of $g$. Put $a = g(c)$ and $b = g(d)$. Then,
\[ \int_c^d (f \circ g)\,g' = \int_a^b f. \]
Written with dummy variables, the formula of this Corollary looks like
\[ \int_c^d f(g(x))\,g'(x)\,dx = \int_a^b f(u)\,du. \]
The right side can be formally obtained from the left side with the substitutions:
$u = g(x)$
$du = g'(x)\,dx$
$x$ runs from $c$ to $d$
$u$ runs from $a = g(c)$ to $b = g(d)$
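As a quick numerical illustration of the substitution formula, the sketch below takes the hypothetical choices $f = \cos$, $g(x) = x^2$ on $[c,d] = [0, 1.5]$, so that $a = 0$ and $b = 2.25$. Both sides are approximated with a composite Simpson rule (an assumed helper for this sketch) and compared with the exact value $\sin(2.25)$.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def g(x):
    return x ** 2        # illustrative change of variable


def g_prime(x):
    return 2 * x


def left_side(c, d):
    """Integral over [c, d] of f(g(x)) g'(x) with f = cos."""
    return simpson(lambda x: math.cos(g(x)) * g_prime(x), c, d)


def right_side(c, d):
    """Integral of f = cos from a = g(c) to b = g(d)."""
    return simpson(math.cos, g(c), g(d))


if __name__ == "__main__":
    c, d = 0.0, 1.5
    print(left_side(c, d), right_side(c, d), math.sin(g(d)) - math.sin(g(c)))
```

All three printed numbers agree up to the accuracy of the quadrature rule.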
Proof. Under the hypotheses, $g$ is differentiable, hence continuous, so the range of $g$ is a closed interval $[u_0, u_1]$. Let $F(u) = \int_{u_0}^u f$, for all $u \in [u_0, u_1]$ and let $G = F \circ g$; that is, $G(x) = F(g(x))$ for $x \in [c,d]$. Since $f$ is continuous on $[u_0, u_1]$, $F$ is differentiable there with $F' = f$, by the Fundamental Theorem of Calculus (differentiating an integral). By the Chain Rule $G' = (F' \circ g)\,g' = (f \circ g)\,g'$. To use the other half of the Fundamental Theorem, we need to know that $G'$ is integrable. But, $f$ was given continuous, and $g$ is continuous, so $f \circ g$ is continuous, hence integrable. Since $g'$ was also assumed integrable, the product $(f \circ g)\,g'$ is also integrable. Thus,
\[ \int_c^d (f \circ g)\,g' = \int_c^d G' = G(d) - G(c) = F(g(d)) - F(g(c)) = F(b) - F(a) = \int_a^b f. \]
43.7 Remark. Actually the change of variable theorem is true without the hypothesis of
continuity. One just needs f integrable, but this method of proof fails. A reasonably
easy proof is available if g is monotone. It is much more difficult in general, relying on
Lebesgue’s Criterion for Riemann Integrability (Theorem 44.3) and a result known as the
Bounded Convergence Theorem.
43.1. Let $f$ be integrable on $[a,b]$ and for each $x \in [a,b]$, put $H(x) = \int_x^b f$. Prove that, if $f$ is continuous at $c$, then $H$ is differentiable at $c$ with $H'(c) = -f(c)$. This shows that if one uses the convention $\int_b^x f = -\int_x^b f$, when $x < b$, we still get $\frac{d}{dx} \int_b^x f = f(x)$, when $f$ is continuous at $x$.
43.2. (Linear Change of Variable) Let $f : [a,b] \to \mathbb{R}$ and let $g : [c,d] \to [a,b]$ be the function whose graph is a straight line with $g(c) = a$, $g(d) = b$. Then $f$ is integrable if and only if $f \circ g$ is integrable. Moreover $\int_a^b f(u)\,du = m \int_c^d f(g(x))\,dx$, where $m$ is the slope of the line. (Suggestion: Use Riemann’s definition of integral directly. Note: $f$ is not assumed continuous here.)
43.3. (Monotone Change of Variable.) Let $g$ be a monotone real function differentiable on the interval $[c,d]$, with integrable derivative. Let $f$ be a real function which is integrable on the range of $g$. Put $a = g(c)$ and $b = g(d)$. Then,
\[ \int_c^d (f \circ g)\,g' = \int_a^b f. \]
43.4. Find a formula for the derivatives of the functions defined by
(a) $F(x) = \int_0^x \sqrt{1 + t^2}\,dt$.
(b) $F(x) = \int_0^{\sin x} \ln(5 + t)\,dt$.
(c) $F(x) = \int_x^{x^2} \sqrt{1 + t^2}\,dt$.
43.5. Let $f(t) = t$ for $0 \le t \le 2$ and $f(t) = -t$, for $2 < t \le 4$ and let $F(x) = \int_0^x f(t)\,dt$, for $x \in [0,4]$.
(a) Find an explicit expression for $F$.
(b) Sketch $F$. Determine where $F$ is differentiable and where not.
(c) Find a formula for $F'$ where $F$ is differentiable.
43.6. Let
\[ f(x) = \begin{cases} x^2 \sin(1/x^2), & x \ne 0 \\ 0, & x = 0. \end{cases} \]
Prove that $f$ is differentiable everywhere, but $f'$ is not integrable on $[-1,1]$.
43.7. Let $f$ be Riemann integrable on $[a,b]$ and $F(x) = \int_a^x f$, for all $x \in [a,b]$. If $f$ has a jump discontinuity at $c$, then $F$ cannot be differentiable at $c$.
43.8. Let $f$ be Riemann integrable on $[a,b]$. If there exists a continuous function $F$ on $[a,b]$ with $F' = f$ except at a finite number of points, then
\[ \int_a^b f = F(b) - F(a). \]
43.9. For a real function $f$ defined (at least) on $[a,b)$ the improper integral $\int_a^b f$ is defined to be $\lim_{t \to b} \int_a^t f$, whenever this exists in $\overline{\mathbb{R}}$. It is said to converge if this is finite. Prove that if $f$ is defined and Riemann integrable on all of $[a,b]$, then the improper integral, defined this way, converges and has the same value as the Riemann integral. (It is for this reason that one usually just writes $\int_a^b f$ for the improper integral.)
44. LEBESGUE’S CRITERION FOR RIEMANN INTEGRABILITY (OPTIONAL)
Here we give Henri Lebesgue’s characterization of those functions which are Riemann integrable. Recall the example of the Dirichlet function, defined on $[0,1]$ by
\[ f(x) = \begin{cases} \dfrac{1}{q}, & \text{if } x = \dfrac{p}{q} \text{ is rational in lowest terms} \\ 0, & \text{otherwise.} \end{cases} \]
This function is continuous at all irrational numbers and discontinuous at the rational numbers. It is also Riemann-integrable (with integral 0). It turns out that there is a connection here. It is the nature of the set of discontinuities that determines integrability.
For a real-valued function $f$ defined on a set $X$, and $I \subset X$, let $\omega f(I) = \sup_{s,t \in I} |f(s) - f(t)|$, the oscillation of $f$ on $I$, as usual. The oscillation of $f$ at a point $x$ is defined as
\[ \omega f(x) = \inf\{\omega f(B(x,\delta)) : \delta > 0\}. \]
It is easy to prove that $f$ is continuous at $x$ if and only if $\omega f(x) = 0$.
44.1 Lemma. Let $f : [a,b] \to \mathbb{R}$. Then, for every $\alpha > 0$, $\{x : \omega f(x) < \alpha\}$ is open in $[a,b]$ and $\{x : \omega f(x) \ge \alpha\}$ is a closed set (in $\mathbb{R}$).
Proof. Let $G = \{x \in [a,b] : \omega f(x) < \alpha\}$. Let $c \in G$. Then, $\omega f(c) < \alpha$ and by definition, there is a $\delta > 0$ such that $\omega f(B(c,\delta) \cap [a,b]) < \alpha$. If $x \in B(c,\delta) \cap [a,b]$, and $U$ is a neighbourhood of $x$ contained in $B(c,\delta)$, then $\omega f(U) < \alpha$, so $\omega f(x) \le \omega f(U) < \alpha$, also. Thus, $G$ is open in $[a,b]$.
Since $[a,b]$ is closed and $G$ is open in $[a,b]$, $\{x : \omega f(x) \ge \alpha\} = [a,b] \setminus G$ is closed in $[a,b]$ and in $\mathbb{R}$.
Let $\lambda(I)$ denote the length of the interval $I$. A subset $N$ of $\mathbb{R}$ is said to have measure 0, if for each $\varepsilon > 0$, there exists a countable family $\mathcal{H} = \{I_1, I_2, \dots\}$ of intervals covering $N$, with total length $\sum_k \lambda(I_k) < \varepsilon$.
44.2 Lemma.
(1) Every countable set of reals has measure 0.
(2) If $B$ has measure 0 and $A \subset B$, then $A$ also has measure 0.
(3) If $A_k$ has measure 0, for all $k \in \mathbb{N}$, then $\bigcup_{k \in \mathbb{N}} A_k$ also has measure 0.
Proof. (1) Let $A = \{a_1, a_2, \dots\}$ be countable, $\varepsilon > 0$, and for every $k$, let $I_k$ be the interval $(a_k - \varepsilon/2^{k+1},\ a_k + \varepsilon/2^{k+1})$. Then, $A \subset \bigcup_k I_k$; that is, these intervals cover $A$. For each $k$, the length of $I_k$ is $\varepsilon/2^k$, and the total length is $\sum_k \lambda(I_k) \le \sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$. Running the same construction with $\varepsilon/2$ in place of $\varepsilon$ gives total length less than $\varepsilon$. Thus, $A$ has measure 0.
(2) is obvious, because a family of intervals that covers $B$ also covers $A$.
To prove (3), one uses a modification of the proof of (1). Let $\varepsilon > 0$. For each $k$, let $\mathcal{H}_k$ be a countable family of intervals covering $A_k$ whose total length is less than $\varepsilon/2^k$. Then, $\bigcup_k \mathcal{H}_k$ is still a countable family of intervals, it covers $\bigcup_k A_k$, and its total length is less than $\sum_k \varepsilon/2^k = \varepsilon$.
Of course, we could have proved singletons have measure 0 and then deduced (1) from (3).
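The covering used in the proof of (1) is easy to build explicitly. The sketch below, with hypothetical helper names, enumerates the first few rationals of $[0,1]$ by increasing denominator, attaches to the $k$-th point the interval of length $\varepsilon/2^k$ centred on it, and checks with exact rational arithmetic that the total length stays below $\varepsilon$ however many points are covered.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` distinct rationals of [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Intervals I_k = (a_k - eps/2^(k+1), a_k + eps/2^(k+1)) for k = 1, 2, ..."""
    return [(a - eps / 2 ** (k + 1), a + eps / 2 ** (k + 1))
            for k, a in enumerate(points, start=1)]


def total_length(intervals):
    return sum(right - left for left, right in intervals)


if __name__ == "__main__":
    eps = Fraction(1, 100)
    pts = rationals_in_unit_interval(200)
    print(float(total_length(cover(pts, eps))), "<", float(eps))
```

With $M$ points the total length is exactly $\varepsilon(1 - 2^{-M})$, so it never reaches $\varepsilon$, even though the points covered become dense in $[0,1]$.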
44.3 Theorem. (Lebesgue’s Criterion for integrability) Let $f : [a,b] \to \mathbb{R}$. Then, $f$ is Riemann integrable if and only if $f$ is bounded and the set of discontinuities of $f$ has measure 0.
Notice that the Dirichlet function satisfies this criterion, since the set of discontinuities is the set of rationals in $[0,1]$, which is countable.
Proof. Let $f$ be Riemann integrable on $[a,b]$. Then, $f$ is certainly bounded. Let $D$ be the set of points of discontinuity of $f$. Then $D = \{x : \omega f(x) > 0\}$. We are to show that $D$ has measure 0. For each $\alpha > 0$, let $N(\alpha) = \{x \in [a,b] : \omega f(x) \ge \alpha\}$. Then, $D = \bigcup_{k=1}^{\infty} N(1/k)$. Thus, we need only prove that each $N(\alpha)$ has measure 0.
Fix such an $\alpha$ and let $\varepsilon > 0$. By the Basic Integrability Criterion, we can choose a partition $P$ of $[a,b]$ determined by the set of division points $\{x_0, x_1, \dots, x_n\}$ with
\[ \sum_{i=1}^{n} \omega f([x_{i-1}, x_i])\,(x_i - x_{i-1}) < \alpha\varepsilon/2. \]
Assume, as we may, that the $x_i$ are distinct. Let $F$ be the set of all $i$ for which $(x_{i-1}, x_i)$ intersects $N(\alpha)$. Then for each $i \in F$, $\omega f([x_{i-1}, x_i]) \ge \alpha$. Thus,
\[ \alpha \sum_{i \in F} (x_i - x_{i-1}) \le \sum_{i \in F} \omega f([x_{i-1}, x_i])\,(x_i - x_{i-1}) < \alpha\varepsilon/2, \]
so that the sum of the lengths of the intervals $(x_{i-1}, x_i)$, $i \in F$, is less than $\varepsilon/2$. These cover $N(\alpha)$ except for the elements of $\{x_0, x_1, \dots, x_n\}$. But these can be covered by intervals whose lengths total less than $\varepsilon/2$, so that $N(\alpha)$ can be covered with open intervals of total length less than $\varepsilon$, as required.
For the converse, let $f$ be bounded and suppose that the set $D$ of discontinuities of $f$ is of measure 0.
Fix $\varepsilon > 0$ and let $E = \{x : \omega f(x) \ge \varepsilon\}$. Since $E \subset D$, $E$ has measure 0. Thus, $E$ can be covered by a countable family of open intervals, whose total length is less than $\varepsilon$. Since $E$ is closed and bounded, it is compact, so a finite family of such intervals will do, say $E \subset \bigcup_{i=1}^{m} U_i$. For each $i$, let $I_i$ be the closure of $U_i$. For simplicity, by replacing pairs that intersect by their union, we may assume that no two $I_i$ intersect. Let $\mathcal{D} = \{I_1, \dots, I_m\}$.
The set $K = [a,b] \setminus \bigcup_{i=1}^{m} U_i$ is compact (in fact, is the union of a finite number of disjoint closed intervals) and consists of points where $\omega f(x) < \varepsilon$. For each $x \in K$, there is a closed interval $J$ with $x \in \operatorname{int} J$ and $\omega f(J) < \varepsilon$. By compactness, a finite number of such intervals covers $K$. By intersecting with $K$, we can assume that they are all subsets of $K$. Thus, let $\mathcal{C} = \{J_1, \dots, J_k\}$ be closed intervals whose union is $K$ and such that $\omega f(J_j) < \varepsilon$, for all $j$. We can (and do) assume that the intervals $J_j$ do not overlap.
The family $\mathcal{D} \cup \mathcal{C} = \{[x_0, x_1], [x_1, x_2], \dots, [x_{n-1}, x_n]\}$ partitions $[a,b]$ and
\[ \sum_{i=1}^{n} \omega f([x_{i-1}, x_i])\,(x_i - x_{i-1}) = \sum_{i=1}^{m} \omega f(I_i)\,\lambda(I_i) + \sum_{j=1}^{k} \omega f(J_j)\,\lambda(J_j) \]
\[ \le \sum_{i} 2\|f\|\,\lambda(I_i) + \sum_{j=1}^{k} \varepsilon\,\lambda(J_j) \le 2\|f\| \sum_{i} \lambda(I_i) + \varepsilon(b - a) \le 2\|f\|\varepsilon + \varepsilon(b - a), \]
which is arbitrarily small. Thus, the Basic Integrability Criterion is satisfied and $f$ is integrable.
You may have noticed that part of this argument is similar to that in the proof that the composition $g \circ f$ of a continuous function $g$ with an integrable function $f$ is integrable. We see now that the composition result is an immediate consequence of Lebesgue’s criterion.
44.4 Lemma. Let $f : [a,b] \to [c,d]$ be integrable and $g : [c,d] \to \mathbb{R}$ be continuous. Then, $g \circ f$ is integrable.
Proof. The set of points of discontinuity of $f$ has measure 0, since $f$ is integrable. But $g \circ f$ is continuous wherever $f$ is, so the set of discontinuities of $g \circ f$ is contained in that of $f$, so has measure 0 also. Since $g$ is bounded on the compact interval $[c,d]$, $g \circ f$ is bounded, and Lebesgue’s criterion applies.
The Cantor set.
We know that countable sets are of measure zero, but are there any others? Yes, indeed; the following is an example of an uncountable set of measure 0.
Let $I$ be the unit interval $[0,1]$. Let $G_{11} = (\frac13, \frac23)$, the “open middle third” of $I$. $I \setminus G_{11}$ is the disjoint union of the two compact intervals $K_{11} = [0, \frac13]$, $K_{12} = [\frac23, 1]$. Let $G_{21}$ and $G_{22}$ be the “open middle thirds” of $K_{11}$ and $K_{12}$, respectively, and $K_{21}$, $K_{22}$, $K_{23}$, $K_{24}$ be the 4 compact intervals obtained by removing these middle thirds, etc.
$K_{01} = [0,1]$
$G_{11} = (\frac13, \frac23)$
$K_{11} = [0, \frac13]$, $K_{12} = [\frac23, 1]$
$G_{21} = (\frac19, \frac29)$, $G_{22} = (\frac79, \frac89)$
$K_{21} = [0, \frac19]$, $K_{22} = [\frac29, \frac39]$, $K_{23} = [\frac69, \frac79]$, $K_{24} = [\frac89, 1]$
In general, $G_{ij}$ is the open interval of length $1/3^i$ concentric with $K_{i-1,j}$, while $K_{i,2j-1}$ and $K_{i,2j}$ are the two component intervals of $K_{i-1,j} \setminus G_{ij}$.
The Cantor (ternary) set is defined to be
\[ C = \bigcap_{i \in \mathbb{N}} \left( \bigcup_{j=1}^{2^i} K_{ij} \right). \]
(The reader may check that $C$ is the set of those points of $[0,1]$ which have a ternary expansion using only the digits 0 and 2.)
Now, for a fixed $i$, the total length of the intervals $K_{ij}$, $j = 1, \dots, 2^i$ is
\[ \sum_{j=1}^{2^i} \lambda(K_{ij}) = \sum_{j=1}^{2^i} \frac{1}{3^i} = \left(\frac{2}{3}\right)^i. \]
Since $(\frac23)^i \to 0$, $C$ has measure 0.
To see that $C$ is uncountable, suppose $C = \{x_1, x_2, x_3, \dots\}$. Then $x_1 \in K_{11} \cup K_{12}$, a union of two disjoint sets. Let $K_{1j_1}$ be the one of these for which $x_1 \notin K_{1j_1}$. Now take $K_{2j_2}$ to be such that $x_2 \notin K_{2j_2} \subset K_{1j_1}$, and so on. In this way obtain a decreasing sequence $K_{ij_i}$ of compact intervals. The intersection $\bigcap_i K_{ij_i}$ is non-empty, since it is the intersection of a decreasing sequence of non-empty compact sets, yet contains none of the points $x_i$ of $C$. This is a contradiction, since $C \supset \bigcap_i K_{ij_i}$.
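The construction can be carried out mechanically. The following sketch (the function name is a hypothetical choice) builds the $2^i$ closed intervals of level $i$ with exact rational endpoints, so that the list for $i = 2$ and the total length $(2/3)^i$ can be compared with the values stated above.

```python
from fractions import Fraction


def cantor_level(i):
    """Closed intervals K_{i1}, ..., K_{i,2^i} as (left, right) pairs, left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(i):
        next_level = []
        for a, b in intervals:
            third = (b - a) / 3
            next_level.append((a, a + third))   # left piece after removing the middle third
            next_level.append((b - third, b))   # right piece
        intervals = next_level
    return intervals


def total_length(intervals):
    return sum(b - a for a, b in intervals)


if __name__ == "__main__":
    for i in range(6):
        level = cantor_level(i)
        print(i, len(level), total_length(level))
```

Each level doubles the number of intervals and multiplies the total length by $2/3$, which is the computation behind the measure-0 claim.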
The indicator function of the Cantor set, defined by
\[ f(x) = 1_C(x) = \begin{cases} 1, & x \in C \\ 0, & x \notin C, \end{cases} \]
is continuous on $[0,1] \setminus C$ and discontinuous on the set $C$ of measure 0. It is integrable with $\int_0^1 f = 0$.
The set $C$ happens to be compact, with empty interior and all its points accumulation points. Such a set is called “perfect”. By removing smaller intervals one can modify this example to obtain a Cantor-type set $D$ with the same properties but not of measure 0. Moreover, this can be done in such a way that there is a continuous, strictly increasing bijection $g$ of $[0,1]$ onto itself such that $g(D) = C$. Then $f \circ g$ is the indicator function of $D$, which is discontinuous on $D$. Thus, this gives an integrable function $f$ and a continuous function $g$ such that $f \circ g$ is not integrable. We omit the details here.
45. POINTWISE AND UNIFORM CONVERGENCE
Let $(f_n)$ be a sequence of real functions defined on a set $X$. Then $(f_n)$ is said to converge pointwise to $f$ on $X$ if for all $x \in X$, $f_n(x)$ converges to $f(x)$. Notation: $f_n \to f$ pointwise on $X$, or simply $\lim_n f_n = f$.
45.1 Questions. Suppose $(f_n)$ converges pointwise to $f$. If $f_n$ is continuous at $x$ for each $n$, is $f$ continuous at $x$? If $f_n$ is differentiable at $x$, is $f$ differentiable? If $f_n$ is integrable is $f$ integrable?
Continuity at $x$ means (for example) $\lim_{t \to x} f(t) = f(x)$, so the first question is, does
\[ \lim_{t \to x} \lim_{n} f_n(t) = \lim_{n} \lim_{t \to x} f_n(t)? \]
We will see that the answer is, in general, no.
For integrals, what we want to know really is whether
\[ \lim_{n} \int_a^b f_n(x)\,dx = \int_a^b \Bigl(\lim_{n} f_n\Bigr) = \int_a^b f(x)\,dx. \]
45.2 Example (A limit of continuous functions which is not continuous). For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by $f_n(x) = x^n$. Then for each $x \in [0,1]$,
\[ \lim_{n} f_n(x) = \lim_{n} x^n = \begin{cases} 0, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x = 1. \end{cases} \]
Thus $f_n \to f$ pointwise, where $f(x) = 0$ for $x \in [0,1)$ and $f(1) = 1$. This is certainly not a continuous function.
45.3 Example (A limit of integrable functions with the “wrong integral”). For each $n$ let $f_n$ be the function defined on $[0,2]$ which is 0 at 0, $n$ at $\frac1n$, 0 again at $\frac2n$, linear in between, and 0 on the rest of the interval:
\[ f_n(x) = \begin{cases} n^2 x, & \text{if } 0 \le x \le \frac1n \\ -n^2\left(x - \frac2n\right), & \text{if } \frac1n < x \le \frac2n \\ 0, & \text{otherwise.} \end{cases} \]
Then $\int_0^2 f_n = 1$, for all $n$, but $\lim_n f_n(x) = 0$, for all $x \in [0,2]$. So
\[ \lim_{n} \int_0^2 f_n(x)\,dx \ne \int_0^2 \lim_{n} f_n(x)\,dx. \]
For another example, let
\[ f_n(x) = n^2 x (1 - x^2)^n, \quad x \in [0,1]. \]
Then $f_n(x) \to 0$, for all $x$, yet
\[ \lim_{n} \int_0^1 f_n(x)\,dx = \lim_{n} \frac{n^2}{2(n+1)} = +\infty, \]
which is certainly not $\int_0^1 0 = 0$.
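The first family in Example 45.3 can be checked exactly. The sketch below (hypothetical helper names) implements the tent functions with rational arithmetic. Because each $f_n$ is piecewise linear, the trapezoid rule on its breakpoints gives the exact integral, which stays equal to 1 while the value at any fixed point eventually becomes 0.

```python
from fractions import Fraction


def tent(n, x):
    """The function f_n of the example: peak n at 1/n, zero at 0 and from 2/n on."""
    if 0 <= x <= Fraction(1, n):
        return n * n * x
    if Fraction(1, n) < x <= Fraction(2, n):
        return -n * n * (x - Fraction(2, n))
    return Fraction(0)


def tent_integral(n):
    """Exact integral of f_n over [0, 2] via the trapezoid rule on the breakpoints."""
    pts = [Fraction(0), Fraction(1, n), Fraction(2, n), Fraction(2)]
    return sum((tent(n, a) + tent(n, b)) * (b - a) / 2 for a, b in zip(pts, pts[1:]))


if __name__ == "__main__":
    x = Fraction(1, 10)
    for n in (1, 5, 10, 20, 50):
        print(n, tent_integral(n), tent(n, x))
```

For the fixed point $x = 1/10$ the values are 0 as soon as $2/n \le 1/10$, yet every integral printed is 1: the mass of $f_n$ escapes into an ever narrower spike near 0.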
The problems here will be rectified by using a stronger kind of convergence of sequences of functions: Let $(f_n)$ be a sequence of functions defined (at least) on a set $X$. Then $(f_n)$ is said to converge uniformly to $f$ on $X$ if for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and all $x \in X$, $|f_n(x) - f(x)| < \varepsilon$. Notation: $f_n \to f$ uniformly (on $X$).
45.4 Example. For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by
\[ f_n(x) = 1 - \frac{x}{n}. \]
Then $(f_n)$ converges uniformly on $[0,1]$.
Proof. Our first job is to identify the limit. For each fixed $x \in [0,1]$, the sequence $(\frac{x}{n})$ converges to 0, so we put $f(x) = 1$, for all $x \in [0,1]$. For each $x \in [0,1]$ we have
\[ |f_n(x) - f(x)| = \left| 1 - \frac{x}{n} - 1 \right| = \frac{x}{n} \le \frac{1}{n}. \]
Now if $\varepsilon > 0$, then by the Archimedean property, there exists $N \in \mathbb{N}$ such that $\frac1N < \varepsilon$, thus for all $n \ge N$, and all $x \in [0,1]$,
\[ |f_n(x) - f(x)| \le \frac{1}{n} \le \frac{1}{N} < \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and all $x \in [0,1]$, $|f_n(x) - f(x)| < \varepsilon$; that is, $(f_n)$ converges to $f$ uniformly on $[0,1]$.
45.5 Example. For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by $f_n(x) = x^n$. Then (as we have seen) $f_n$ converges pointwise on $[0,1]$ to $f$ given by
\[ f(x) = \begin{cases} 0, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x = 1. \end{cases} \]
We claim that this convergence is not uniform.
Indeed, choose $\varepsilon = \frac12$. Let $N \in \mathbb{N}$ and choose $n = N$. Put $x = (\frac34)^{1/n}$. Since $0 < \frac34 < 1$, $0 < x < 1$, and hence $x \in [0,1]$, yet $d(f_n(x), f(x)) = |x^n - 0| = ((\frac34)^{1/n})^n = \frac34 > \frac12 = \varepsilon$. We have shown that there exists an $\varepsilon > 0$ such that for all $N \in \mathbb{N}$, there exists $n \ge N$ and an $x \in [0,1]$ with $|f_n(x) - f(x)| \ge \varepsilon$. This is the negation of the definition of uniform convergence on $[0,1]$.
Note that in the positive example of uniform convergence, we majorized the distance to the limit by a number $\frac1n$ which went to 0, independently of $x$. This is the key to proving uniform convergence to a known function:
45.6 Theorem. Let $(f_n)$ be a sequence of functions on $X$, $f$ another one. Then $f_n \to f$ uniformly iff there exists a sequence $(K_n)$ of extended real numbers converging to 0, such that for all $n \in \mathbb{N}$,
\[ |f_n(x) - f(x)| \le K_n, \quad \text{for all } x \in X. \]
In this setting, $\|f_n - f\|$ (supremum norm) always serves as a suitable $K_n$.
Proof. Assume $f_n \to f$ uniformly on $X$. Then, for all $\varepsilon > 0$, there exists $N$ such that for all $n \ge N$ and all $x \in X$,
\[ |f_n(x) - f(x)| \le \varepsilon. \]
But then by the definition of supremum, for all $n \ge N$,
\[ \|f_n - f\| = \sup_{x \in X} |f_n(x) - f(x)| \le \varepsilon. \]
Thus, if we put
\[ K_n = \|f_n - f\|, \]
then, for all $\varepsilon > 0$, there exists $N$ such that for $n \ge N$, $|K_n - 0| \le \varepsilon$; that is, $K_n \to 0$.
Conversely, suppose for each $n$, $K_n$ is an extended real number such that for all $x \in X$, $|f_n(x) - f(x)| \le K_n$, and $K_n \to 0$. Then, for all $\varepsilon > 0$, there exists $N$ such that for $n \ge N$,
\[ |f_n(x) - f(x)| \le K_n < \varepsilon, \]
for all $x \in X$. Thus, $f_n \to f$, uniformly on $X$, by definition.
45.7 Example. If $f_n(x) = x^n$, for $x \in [0,1)$, $f_n \to f$ pointwise, where $f(x) = 0$, for $x \in [0,1)$. Then, $|f_n(x) - f(x)| = |x^n|$, for all $x \in [0,1)$. Thus, $\|f_n - f\| = \sup_{x \in [0,1)} |x^n| = 1$. Since this does not converge to 0, $(f_n)$ does not converge uniformly.
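The contrast between Examples 45.4 and 45.7 shows up clearly if one estimates the supremum distance $\|f_n - f\|$ on a fine grid. This is only a sketch under stated assumptions: the grid of $10^4$ points in $[0,1)$ and the helper names are illustrative choices, and a maximum over a grid is merely a lower bound for the true supremum.

```python
def sup_distance(fn, f, grid):
    """Largest value of |fn(x) - f(x)| over the grid (a lower bound for the sup norm)."""
    return max(abs(fn(x) - f(x)) for x in grid)


# Illustrative grid of points of [0, 1); not part of the notes.
GRID = [k / 10_000 for k in range(10_000)]


def distance_example_45_4(n):
    """f_n(x) = 1 - x/n against the limit f(x) = 1."""
    return sup_distance(lambda x: 1 - x / n, lambda x: 1.0, GRID)


def distance_example_45_7(n):
    """f_n(x) = x^n against the limit f(x) = 0 on [0, 1)."""
    return sup_distance(lambda x: x ** n, lambda x: 0.0, GRID)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, distance_example_45_4(n), distance_example_45_7(n))
```

The first column of distances shrinks like $1/n$, as in Theorem 45.6 with $K_n = 1/n$, while the second stays close to 1 for every $n$.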
45.8 Theorem (Cauchy criterion for uniform convergence). Let .fn / be a sequence of
functions on X to R. Then there exists a function f such that fn ! f uniformly iff for all
" > 0 there exists N such that for n; m N ,
jfn .x/
fm .x/j "; for all x 2 X .
In terms of supremum norm, (C) is saying kfn
(C)
fm k ".
Proof. The proof of one direction is almost the same as the one for sequences of real
numbers. Suppose .fn / converges uniformly to f . Let " > 0. Then we can choose N such
that for all n N and all x 2 X , jfn .x/ f .x/j < "=2. Let n; m N . Then, for each
x 2 X,
jfn .x/
fm .x/j jfn .x/
f .x/j C jf .x/
fm .x/j < "=2 C "=2 D ":
Thus, the condition is satisfied.
Conversely, suppose the condition is satisfied. Namely,
$$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} \text{ such that for } n, m \ge N, \ |f_n(x) - f_m(x)| \le \varepsilon, \text{ for all } x \in X.$$
Now, fix a particular $x \in X$. Then,
$$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} \text{ such that for } n, m \ge N, \ |f_n(x) - f_m(x)| \le \varepsilon.$$
This means that the sequence $(f_n(x))_n$ of real numbers is Cauchy. Since the set of real numbers is complete in the usual metric, this sequence converges. Since this held for an arbitrary $x \in X$, we define a function $f$ on $X$ by
$$f(x) = \lim_n f_n(x), \quad \text{for all } x \in X.$$
So far we have $f_n \to f$ pointwise. Now, going back to the Cauchy condition, fix $\varepsilon > 0$ and choose $N$ such that
$$\text{for all } n, m \ge N \text{ and all } x \in X, \ |f_n(x) - f_m(x)| \le \varepsilon.$$
Let $x \in X$ and $n \ge N$. Then
$$\text{for all } m \ge N, \ |f_n(x) - f_m(x)| \le \varepsilon.$$
Now let $m \to \infty$. Since $f_m(x) \to f(x)$, this yields
$$|f_n(x) - f(x)| \le \varepsilon.$$
Since this was true for arbitrary $n \ge N$ and arbitrary $x \in X$, we can say
$$\text{for all } n \ge N \text{ and all } x \in X, \ |f_n(x) - f(x)| \le \varepsilon.$$
Thus, for all $\varepsilon > 0$ there exists $N$ such that for $n \ge N$,
$$|f_n(x) - f(x)| \le \varepsilon, \quad \text{for all } x \in X.$$
That is, $f_n \to f$ uniformly.
Distance interpretation. We saw above that using the idea of supremum norm
$$\|g\| = \sup_{x \in X} |g(x)|,$$
a sequence of functions $(f_n)$ converges uniformly to $f$ iff $\|f_n - f\| \to 0$, or in other words,
$$\text{for all } \varepsilon > 0, \text{ there exists } N \in \mathbb{N} \text{ with } \|f_n - f\| \le \varepsilon, \text{ for all } n \ge N.$$
So we could define a distance (the uniform distance) on the set of functions on $X$ by
$$d(f, g) = \|f - g\|.$$
This satisfies the metric properties:
(1) $d(f, g) \ge 0$,
(2) $d(f, g) = 0$ iff $f = g$,
(3) $d(f, g) = d(g, f)$,
(4) $d(f, g) \le d(f, h) + d(h, g)$,
and we see that $f_n \to f$ uniformly iff $(f_n)$ converges to $f$ in the uniform distance. Similarly, we see that the Cauchy criterion for uniform convergence is just the Cauchy criterion for this distance.
The only thing stopping $d$ from being a metric is the fact that $d(f, g)$ could be $+\infty$; this happens whenever $f - g$ is not a bounded function. We could call $d$ an extended metric, as some authors do.
Uniform convergence of series of functions. As with sequences and series of numbers, if $(f_n)_{n=1}^\infty$ is a sequence of functions, the corresponding series $\sum_{n=1}^\infty f_n$ refers to the sequence $(s_n)$ of partial sums
$$s_n = \sum_{k=1}^n f_k.$$
(And there is a similar notation for sequences and series indexed on, say, $\{0, 1, 2, \dots\}$.)
The series is said to converge (pointwise) if $(s_n)$ converges pointwise.
We say that the series $\sum_n f_n$ converges absolutely if the series $\sum_n |f_n|$ converges. (Just as for series of numbers, absolute convergence implies convergence but not conversely.)
We say that the series $\sum_n f_n$ converges uniformly if $(s_n)$ does, and the limit is called the sum of the series.
The Cauchy criterion for uniform convergence for series becomes
45.9 Theorem. For a series $\sum_n f_n$ of functions on a set $X$, $\sum_n f_n$ converges uniformly on $X$ iff for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for $n \ge m \ge N$,
$$\left| \sum_{k=m}^n f_k(x) \right| \le \varepsilon, \quad \text{for all } x \in X.$$
As a corollary we have
45.10 The Weierstrass M-test. A series $\sum_n f_n$ of functions $f_n$ on $X$ is uniformly and absolutely convergent provided there exists a sequence $(M_n)$ of real numbers with $|f_n(x)| \le M_n$, for all $x \in X$ and all $n$, such that $\sum_n M_n$ converges.
The proof should now be an easy exercise.
Notice also that if any $M_n$ works in the Weierstrass M-test, then $\|f_n\|$ must also work, since $|f_n(x)| \le M_n$ for all $x \in X$ implies
$$|f_n(x)| \le \|f_n\| \le M_n \quad \text{for all } x \in X.$$
However, other choices of $M_n$ are often easier to work with.
45.11 Example. Let $f_n(x) = \dfrac{\sin(\cos x)}{2^n + x^2}$, for $x \in \mathbb{R}$. Then $\sum_n f_n$ converges uniformly on $\mathbb{R}$, by the Weierstrass M-test, since $|f_n(x)| \le \frac{1}{2^n}$ and $\sum_n \frac{1}{2^n}$ converges. ($M_n = \frac{1}{2^n}$.) It is not at all clear what the sum is.
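Although the sum in Example 45.11 has no evident closed form, the uniform tail estimate supplied by the M-test can be checked numerically. The sketch below assumes the reading $f_n(x) = \sin(\cos x)/(2^n + x^2)$ of the example; the function names and the sample grid are illustrative. Since $\sum_{k > n} 2^{-k} = 2^{-n}$, any two partial sums beyond index $n$ differ by at most $2^{-n}$ at every $x$.

```python
import math

def f(n, x):
    """f_n(x) = sin(cos x) / (2**n + x**2), as read from Example 45.11."""
    return math.sin(math.cos(x)) / (2 ** n + x * x)

def partial_sum(n, x):
    """s_n(x) = f_1(x) + ... + f_n(x)."""
    return sum(f(k, x) for k in range(1, n + 1))

def tail_bound(n):
    """sum_{k > n} M_k with M_k = 2**-k, which equals 2**-n."""
    return 2.0 ** (-n)

grid = [j / 4 for j in range(-40, 41)]  # sample points in [-10, 10]
for n in (1, 5, 10):
    worst = max(abs(partial_sum(40, x) - partial_sum(n, x)) for x in grid)
    print(n, worst, tail_bound(n))
```

The bound does not depend on $x$, which is exactly what uniform convergence requires.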
45.12 Example (A series converging uniformly for which the Weierstrass M-test does not apply). Let $f_n(x) = (-1)^{n+1} \dfrac{x^2}{n}$ for $x \in [0, 5]$.
Here the Weierstrass M-test does not apply, since if we put
$$M_n = \|f_n\| = \sup_{x \in [0,5]} \left| \frac{(-1)^{n+1} x^2}{n} \right| = \frac{5^2}{n},$$
then $\sum_{n=1}^\infty M_n$ diverges.
However, since, for each $x \in [0, 5]$, the sequence $(x^2/n)$ is decreasing to 0, the series $\sum_{n=1}^\infty \frac{(-1)^{n+1} x^2}{n}$ converges by the alternating series test to some $f(x)$, with
$$\left| f(x) - \sum_{k=1}^n (-1)^{k+1} \frac{x^2}{k} \right| \le \frac{x^2}{n+1} \le \frac{5^2}{n+1}.$$
Thus,
$$\left\| f - \sum_{k=1}^n f_k \right\| = \sup_{x \in [0,5]} \left| f(x) - \sum_{k=1}^n (-1)^{k+1} \frac{x^2}{k} \right| \le \frac{5^2}{n+1} \to 0.$$
Hence, the series $\sum_{n=1}^\infty f_n$ converges uniformly on $[0, 5]$.
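The error bound of Example 45.12 can be spot-checked numerically. This sketch assumes the value $\ln 2$ for the alternating harmonic series (obtained in Example 48.8), so that $f(x) = x^2 \ln 2$; the function names and the grid are illustrative.

```python
import math

def partial_sum(n, x):
    """Sum of (-1)**(k+1) * x**2 / k for k = 1..n."""
    return sum((-1) ** (k + 1) * x * x / k for k in range(1, n + 1))

def limit(x):
    """Assumed sum of the series: x**2 * ln 2."""
    return x * x * math.log(2.0)

def sup_error(n, grid):
    """Largest gap between the limit and the n-th partial sum over the grid."""
    return max(abs(limit(x) - partial_sum(n, x)) for x in grid)

grid = [5 * j / 50 for j in range(51)]  # sample points in [0, 5]
for n in (1, 10, 100, 1000):
    # The observed error stays below the alternating-series bound 25/(n+1).
    print(n, sup_error(n, grid), 25 / (n + 1))
```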
45.1. Let $f_n(x) = nx^2/(1 + nx^2)$, for all $x \ge 0$.
(a) Prove that for each $t > 0$, $(f_n)$ converges uniformly on $[t, \infty)$ but
(b) Prove that $(f_n)$ does not converge uniformly on $[0, \infty)$.
45.2. For a sequence $(f_n)$ of real functions on $X$ and $f$ another one, $f_n \to f$ uniformly on $X$ iff for each sequence $(x_n)$ in $X$, $f_n(x_n) - f(x_n) \to 0$.
45.3. Let $(f_n)$ be a sequence of functions on $X$ which converges pointwise on $X$. Let $F \subset X$ be finite. Prove $(f_n)$ converges uniformly on $F$.
45.4. Decide whether the following sequences of functions are uniformly convergent on their domains.
(a) $(f_n)$, where $f_n : [0, 1] \to \mathbb{R}$ is defined by $f_n(x) = nx(1 - x^2)^n$.
(b) $(g_n)$, where $g_n : [0, 1] \to \mathbb{R}$ is defined by $g_n(x) = x(1 - x)^n$.
45.5. Prove that the sum of two uniformly convergent sequences of functions converges uniformly.
45.6. Prove or disprove that the product of two uniformly convergent sequences of functions is also
uniformly convergent.
45.7. If the series $\sum_{n=1}^\infty f_n$ converges uniformly on $X$, then $f_n \to 0$ uniformly on $X$. The converse doesn't hold.
45.8. Discuss pointwise and uniform convergence of the following series.
(a) $\sum_{n=1}^\infty \dfrac{1}{1 + n^2 x^2}$, for $x \in [1, +\infty)$.
(b) $\sum_{n=1}^\infty \dfrac{1}{1 + n^2 x^2}$, for $x \in \mathbb{R}$.
(c) $\sum_{n=1}^\infty \dfrac{\sin(nx)}{n^2}$, for $x \in \mathbb{R}$.
45.9. Formulate versions of the definitions and results on uniform convergence of sequences of functions for the metric space valued case. What changes have to be made?
46. UNIFORM CONVERGENCE: CONTINUITY, INTEGRAL, DERIVATIVE.
We know that under pointwise convergence, continuity can be lost in the limit. The
same is true of integrability and the value of the integral of a limit function, and since
integration is connected to differentiation by the Fundamental Theorem of Calculus, there
are problems there as well. But, as promised earlier, the difficulties are to a large extent
rectified by uniform convergence.
Continuity. The uniform limit of continuous functions is continuous.
46.1 Theorem. Let $(f_n)$ be a sequence of real functions on a metric space $X$, $a \in X$, and $f_n$ continuous at $a$ for all $n \in \mathbb{N}$. If $f_n \to f$ uniformly on $X$, then $f$ is also continuous at $a$.
Proof. Let $\varepsilon > 0$. By uniform convergence there exists $N$ such that for all $n \ge N$ and for all $x \in X$,
$$|f_n(x) - f(x)| \le \frac{\varepsilon}{3}.$$
Fix such an $N$. Since $f_N$ is continuous at $a$, we can choose a neighbourhood $U$ of $a$ such that
$$|f_N(x) - f_N(a)| < \frac{\varepsilon}{3}, \quad \text{for all } x \in U.$$
Thus, for all $x \in U$,
$$|f(x) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \varepsilon,$$
which shows that $f$ is continuous at $a$.
Applying the above statement to each $a \in X$, we have:
46.2 Theorem. Let $(f_n)$ be a sequence of continuous functions on $X$ converging uniformly to $f$. Then, $f$ is also continuous.
46.3 Example. For each $n \in \mathbb{N}$ let $f_n(x) = \tan^{-1}(nx)$ for all $x \in \mathbb{R}$. Here $\tan^{-1}$ refers to arctan, the inverse of the tangent function restricted to its principal domain $(-\pi/2, \pi/2)$. Now, $\tan^{-1}(0) = 0$; as $y \to \infty$, $\tan^{-1}(y) \to \pi/2$; and as $y \to -\infty$, $\tan^{-1}(y) \to -\pi/2$. Thus,
$$\lim_n f_n(x) = \lim_n \tan^{-1}(nx) = \begin{cases} \pi/2, & \text{if } x > 0 \\ 0, & \text{if } x = 0 \\ -\pi/2, & \text{if } x < 0 \end{cases} \; = \frac{\pi}{2} \operatorname{sgn}(x),$$
where sgn is the function known as signum, which gives the "sign of $x$" ($+1$ if $x > 0$, $-1$ if $x < 0$, and 0 otherwise). Thus, $f_n \to \frac{\pi}{2} \operatorname{sgn}$ pointwise, but the convergence cannot be uniform, since the $f_n$ are all continuous, but the limit function is discontinuous at 0.
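The non-uniformity in Example 46.3 can also be seen directly, without appealing to continuity. In the sketch below (helper names are illustrative), the gap between $f_n$ and the limit is evaluated at the moving point $x = 1/n$, where $\tan^{-1}(n \cdot 1/n) = \pi/4$, so the gap is $\pi/4$ for every $n$.

```python
import math

def sgn(x):
    """Signum: +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def gap(n, x):
    """|arctan(n x) - (pi/2) sgn(x)|, the distance from f_n(x) to the pointwise limit."""
    return abs(math.atan(n * x) - (math.pi / 2) * sgn(x))

for n in (1, 10, 1000, 10 ** 6):
    # At x = 1/n the gap stays near pi/4, so sup_x gap(n, x) cannot tend to 0.
    print(n, gap(n, 1.0 / n), gap(n, 1.0))
```

At a fixed point such as $x = 1$ the gap does shrink, which is the pointwise convergence.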
46.4 Corollary. Let $(f_n)$ be a sequence of continuous functions on $X$ such that the series $\sum_{n=1}^\infty f_n$ converges uniformly with sum $f$. Then, $f$ is also continuous.
Proof. This follows because the convergence of $\sum_{n=1}^\infty f_n$ uniformly to $f$ really means that the sequence of sums $s_n = \sum_{k=1}^n f_k$ converges uniformly to $f$.
Since the sum of a finite number of continuous functions is continuous, each $s_n$ is continuous, and since $s_n \to f$ uniformly, $f$ is also continuous.
A slight modification of the result about uniform limit of continuous functions gives a
corresponding result for a sequence of functions, each of whose limits exist.
46.5 Theorem. Let $(f_n)$ be a sequence of functions on $X$ and let $a$ be an accumulation point of $X$. Suppose $(f_n)$ converges uniformly to $f$ on $X \setminus \{a\}$ and for each $n \in \mathbb{N}$, $\lim_{x \to a} f_n(x) = y_n$. Then $\lim_n y_n$ exists and
$$\lim_{x \to a} f(x) = \lim_n y_n.$$
In other words
$$\lim_{x \to a} \lim_n f_n(x) = \lim_n \lim_{x \to a} f_n(x).$$
Can you see why $a$ is assumed to be an accumulation point of $X$? If $a$ is not an accumulation point of $X$, then any function on $X$ converges to anything as $x \to a$, so the result would be meaningless. Officially, the notation lim should not even be used in that case, since the limits would not be unique.
Proof. Let $\varepsilon > 0$. By the Cauchy condition for uniform convergence there exists $N$ such that for all $n, m \ge N$ and for all $x \in X \setminus \{a\}$,
$$|f_n(x) - f_m(x)| \le \varepsilon.$$
Fix such an $N$. Fix $n, m \ge N$ for a moment and take the limit as $x \to a$. This yields
$$|y_n - y_m| \le \varepsilon.$$
(Here we used the fact that $a \in \operatorname{acc} X$, so that the limits $y_n$ and $y_m$ are unique.) Thus, for all $n, m \ge N$, $|y_n - y_m| \le \varepsilon$; so the sequence $(y_n)$ is Cauchy, hence converges to some number $y$.
Now, we start again. Let $\varepsilon > 0$. Since $(f_n)$ converges uniformly on $X \setminus \{a\}$, and $y_n \to y$, there exists $N$ so large that $n \ge N$ implies
$$|f_n(x) - f(x)| \le \varepsilon/3 \quad \text{for all } x \in X \setminus \{a\}$$
and
$$|y_n - y| \le \varepsilon/3.$$
Choose $n = N$. Since $\lim_{x \to a} f_N(x) = y_N$, there is a neighbourhood $U$ of $a$ such that
$$|f_N(x) - y_N| < \varepsilon/3, \quad \text{for } x \in U \setminus \{a\}.$$
Thus, for $x \in U \setminus \{a\}$,
$$|f(x) - y| \le |f(x) - f_N(x)| + |f_N(x) - y_N| + |y_N - y| < \varepsilon,$$
which shows that
$$\lim_{x \to a} f(x) = y = \lim_n y_n,$$
as promised.
The reader should prove the corresponding result for series of functions:
46.6 Corollary (Limit past the summation sign). Let $(f_n)$ be a sequence of functions on $X$ and let $a$ be an accumulation point of $X$. Suppose the series $\sum_{n=1}^\infty f_n$ converges uniformly on $X$ with sum $f$, and for each $n \in \mathbb{N}$, $\lim_{x \to a} f_n(x) = y_n$. Then $\sum_{n=1}^\infty y_n$ converges and
$$\lim_{x \to a} f(x) = \sum_{n=1}^\infty y_n.$$
In other words
$$\lim_{x \to a} \sum_{n=1}^\infty f_n(x) = \sum_{n=1}^\infty \lim_{x \to a} f_n(x).$$
This is also referred to as “interchanging limit and sum” under uniform convergence.
Integration. The uniform limit of integrable functions is also integrable, and the integral
of the limit is the limit of the integrals. More precisely:
46.7 Theorem (Interchanging limit and integral). If for each $n \in \mathbb{N}$, $f_n$ is Riemann integrable on $[a, b]$ and the sequence $(f_n)$ converges uniformly to $f$ on $[a, b]$, then $f$ is Riemann integrable on $[a, b]$ and
$$\lim_n \int_a^b f_n = \int_a^b \lim_n f_n = \int_a^b f.$$
(Use of this result is also referred to as "taking a limit under the integral sign".)
Before starting on the proof, recall the Basic Integrability Criterion 42.3(2): $f$ is integrable if and only if for each $\varepsilon > 0$, there exists a partition $P = \{I_1, \dots, I_n\}$ of $[a, b]$, with
$$\sum_{I \in P} \omega_f(I) \, \lambda(I) < \varepsilon,$$
where $\omega_f(I) = \sup_{s, t \in I} |f(s) - f(t)|$ is the oscillation of $f$ on $I$ and $\lambda(I)$ is the length of the interval $I$.
Proof. For each $n$, let $K_n = \|f_n - f\|$. Since $f_n \to f$ uniformly, $K_n \to 0$. Let $\varepsilon > 0$ and choose $N$ so large that for $m \ge N$, $K_m (b - a) < \varepsilon$. Since $f_N$ is integrable, we can choose a partition $P = \{I_1, \dots, I_n\}$ of $[a, b]$ with
$$\sum_{I \in P} \omega_{f_N}(I) \, \lambda(I) < \varepsilon.$$
But, for all $s, t$,
$$|f(s) - f(t)| \le |f(s) - f_N(s)| + |f_N(s) - f_N(t)| + |f_N(t) - f(t)| \le |f_N(s) - f_N(t)| + 2K_N,$$
so
$$\omega_f(I) \le \omega_{f_N}(I) + 2K_N.$$
Hence,
$$\sum_{I \in P} \omega_f(I) \, \lambda(I) \le \sum_{I \in P} \omega_{f_N}(I) \, \lambda(I) + 2K_N (b - a) < 3\varepsilon.$$
Thus, $f$ also satisfies the Basic Integrability Condition, so is integrable.
Finally,
$$\left| \int_a^b f_n - \int_a^b f \right| = \left| \int_a^b (f_n - f) \right| \le \int_a^b |f_n - f| \le \int_a^b K_n = K_n (b - a) \to 0.$$
This shows
$$\lim_n \int_a^b f_n = \int_a^b f,$$
as required.
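The final estimate of the proof, $\left| \int_a^b f_n - \int_a^b f \right| \le K_n (b - a)$, can be illustrated on a case where everything is explicit. The sketch below takes $f_n(x) = x^n$ on $[0, 1/2]$ (an interval chosen here for illustration, on which the convergence to $f = 0$ is uniform with $K_n = 2^{-n}$) and approximates the integral by a midpoint Riemann sum. The helper names are hypothetical.

```python
def midpoint_integral(g, a, b, pieces=10_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / pieces
    return h * sum(g(a + (j + 0.5) * h) for j in range(pieces))

def integral_gap_and_bound(n, a=0.0, b=0.5):
    """Return (|int f_n - int f|, K_n * (b - a)) for f_n(x) = x**n and f = 0.

    Assumes 0 <= a < b, so that sup |x**n| on [a, b] is b**n.
    """
    K_n = b ** n
    gap = abs(midpoint_integral(lambda x: x ** n, a, b))
    return gap, K_n * (b - a)

for n in (1, 5, 10, 20):
    print(n, *integral_gap_and_bound(n))
```

Both columns tend to 0, with the integral gap always below the bound.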
46.8 Corollary (Integration term by term). For each $n \in \mathbb{N}$, let $f_n$ be Riemann integrable on $[a, b]$ and let the series $\sum_{n=1}^\infty f_n$ be uniformly convergent with sum $f$. Then, $f$ is also Riemann integrable and
$$\int_a^b f = \sum_{n=1}^\infty \int_a^b f_n.$$
Proof. This follows because the convergence of $\sum_{n=1}^\infty f_n$ uniformly to $f$ really means that the sequence of sums $s_n = \sum_{k=1}^n f_k$ converges uniformly to $f$. Thus,
$$\int_a^b f = \int_a^b \lim_n s_n = \lim_n \int_a^b s_n = \lim_n \int_a^b \sum_{k=1}^n f_k = \lim_n \sum_{k=1}^n \int_a^b f_k = \sum_{n=1}^\infty \int_a^b f_n,$$
as required.
Differentiation. The uniform limit of derivatives is a derivative, and we may interchange
the limit operation with that of differentiation, provided one more obviously necessary
condition is satisfied.
46.9 Theorem (Interchanging limit and derivative). Let $(f_n)$ be a sequence of differentiable functions on $[a, b]$. If
(1) the sequence $(f_n')$ converges uniformly on $[a, b]$ and
(2) there is one point $c \in [a, b]$ such that $(f_n(c))$ converges,
then $(f_n)$ converges uniformly to some $f$, $f$ is differentiable on $[a, b]$ and
$$\lim_n f_n' = f'.$$
46.10 Note. If we use $D$ to stand for the differentiation operator, we can write this as
$$Df = D(\lim_n f_n) = \lim_n Df_n.$$
If we were to assume each $f_n$ had a continuous derivative, we could deduce this from the corresponding result for integrals, using the Fundamental Theorem of Calculus. The version given here doesn't even assume that the derivatives are integrable.
Notice also that the uniform convergence of the sequence $(f_n)$ is part of the conclusion, not the hypothesis. It is the sequence of derivatives that is assumed to converge uniformly.
Proof. Let $n, m \in \mathbb{N}$ and $t, x \in [a, b]$. The Mean Value Theorem, applied to $f_n - f_m$, yields a point $s$ such that
$$f_n(t) - f_m(t) - (f_n(x) - f_m(x)) = (f_n'(s) - f_m'(s))(t - x).$$
Since $|f_n'(s) - f_m'(s)| \le \|f_n' - f_m'\|$, this yields
$$|f_n(t) - f_m(t) - (f_n(x) - f_m(x))| \le \|f_n' - f_m'\| \, |t - x|. \tag{$*$}$$
Taking $x = c$, we obtain for all $t \in [a, b]$,
$$|f_n(t) - f_m(t)| \le |f_n(c) - f_m(c)| + \|f_n' - f_m'\| \, |t - c|.$$
Since $(f_n(c))$ converges, $(f_n')$ converges uniformly, and $|t - c| \le |b - a|$, for each $\varepsilon > 0$, there exists $N$ such that $|f_n(t) - f_m(t)| < \varepsilon$, for $n, m \ge N$ and all $t \in [a, b]$. Thus, $(f_n)$ is uniformly Cauchy, so converges uniformly.
Now, ($*$) can be written, for $t \ne x$,
$$\left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f_m(t) - f_m(x)}{t - x} \right| \le \|f_n' - f_m'\|.$$
Let $f$ be the limit of $(f_n)$ and $g$ the limit of $(f_n')$. Fix $x \in [a, b]$ and fix $n$ for a moment. Then, letting $m \to \infty$ yields
$$\left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f(t) - f(x)}{t - x} \right| \le \|f_n' - g\|.$$
Since $f_n' \to g$ uniformly, this shows that the sequence of functions $\varphi_n$ defined by
$$\varphi_n(t) = \frac{f_n(t) - f_n(x)}{t - x}$$
converges uniformly, and we can interchange the order of limits, obtaining
$$\lim_{t \to x} \frac{f(t) - f(x)}{t - x} = \lim_n \lim_{t \to x} \frac{f_n(t) - f_n(x)}{t - x} = \lim_n f_n'(x) = g(x).$$
Thus, $f' = g$ and by hypothesis,
$$f_n' \to f' \text{ uniformly},$$
as promised.
As in the case of integration, the result just proved yields a corresponding result about
differentiating series.
46.11 Corollary (Differentiation term-by-term). Let $\sum_{n=1}^\infty f_n$ be a series of differentiable real functions defined on $[a, b]$ such that the series $\sum_{n=1}^\infty f_n'$ converges uniformly and $\sum_{n=1}^\infty f_n(c)$ converges for some point $c$. Then $\sum_{n=1}^\infty f_n$ converges uniformly and if $f$ denotes the sum, then
$$f' = \sum_{n=1}^\infty f_n'.$$
The proof is a straightforward exercise.
46.1. Let $f$ be the Dirichlet function defined on $[0, 1]$ by
$$f(x) = \begin{cases} \frac{1}{n}, & \text{if } x = \frac{m}{n} \text{ is rational in lowest terms} \\ 0, & \text{otherwise.} \end{cases}$$
For each $k$ let $g_k$ be the maximum of $f$ and $1/k$. Use uniform convergence to prove that $\int_0^1 f = 0$.
46.2. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by
$$f_n(x) = \begin{cases} \dfrac{\sin(nx)}{nx}, & x \ne 0 \\ 1, & x = 0. \end{cases}$$
Prove $(f_n)$ does not converge uniformly, using limit interchange theorems, that is, the theorems of this section.
46.3. Prove that if .fn / is a sequence of bounded functions on X converging uniformly to the function
f , then f is also bounded.
47. A CONTINUOUS NOWHERE DIFFERENTIABLE FUNCTION ON $\mathbb{R}$.
This is an example of the use of the theorem on sums of a uniformly convergent series
of functions to prove a remarkable result. It is a modification of an example of Weierstrass.
He built his example out of trigonometric functions, but we will use the technically simpler
“sawtooth” functions.
Start with a function $g : \mathbb{R} \to \mathbb{R}$ defined by setting $g(x) = |x|$, for $x \in [-2, 2]$, and extend this periodically by putting $g(x + 4) = g(x)$, for all $x \in \mathbb{R}$. You can check that $g(x)$ is the minimum distance to the set $4\mathbb{Z} = \{4m : m \in \mathbb{Z}\}$, namely
$$g(x) = \inf\{|x - y| : y \in 4\mathbb{Z}\} = \inf\{|x - 4m| : m \in \mathbb{Z}\}.$$
For each $n \in \mathbb{N}$, let $f_n(x) = \dfrac{g(4^n x)}{4^n}$ and define $f = \sum_{n=1}^\infty f_n$. Since $|f_n(x)| \le \frac{2}{4^n}$, for all $x$, and $\sum_{n=1}^\infty \frac{2}{4^n}$ converges, we know the series converges uniformly. Since each $f_n$ is continuous, the sum $f$ is also continuous. We will prove that $f$ is nowhere differentiable.
Graphs of $g$, $f_1$, $f_2$, $f_3$ appear here and graphs of the partial sums $s_2 = f_1 + f_2$ and $s_3 = f_1 + f_2 + f_3$ appear on the page following the proof (for technical typesetting reasons).
[Figure: graphs of $g$, $f_1$ and $f_2$ for $-2 \le x \le 2$.]
Notice that if an open interval $(a, b)$ contains no even integer, then $|g(a) - g(b)| = |a - b|$.
Let $a \in \mathbb{R}$. We will prove that $f$ is not differentiable at $a$. Suppose otherwise.
The interval $(4^k a - 1, 4^k a + 1)$ can contain at most one even integer. Put
$$\delta_k = \begin{cases} 1, & \text{if there is no even integer in } (4^k a, 4^k a + 1) \\ -1, & \text{otherwise (so that there is no even integer in } (4^k a - 1, 4^k a)\text{)} \end{cases}$$
and
$$h_k = \delta_k / 4^k.$$
Since $h_k \to 0$, the difference quotient
$$q_k := \frac{f(a + h_k) - f(a)}{h_k} \to f'(a).$$
And hence, $q_k - q_{k-1} \to 0$. However, for each $n$, $f_n$ has period $4^{1-n}$, so for $n > k$, $h_k = \delta_k / 4^k$ is a multiple of this, so $f_n(a + h_k) - f_n(a) = 0$. Therefore
$$q_k = \frac{f(a + h_k) - f(a)}{h_k} = \sum_{n=1}^k \frac{f_n(a + h_k) - f_n(a)}{h_k}.$$
But there is no even integer between $4^k a$ and $4^k a + \delta_k$, so for $n \le k$, there is no even integer between $4^n a$ and $4^n a + 4^n h_k$, hence
$$|f_n(a + h_k) - f_n(a)| = 4^{-n} |g(4^n a + 4^n h_k) - g(4^n a)| = |h_k|.$$
Thus, $\dfrac{f_n(a + h_k) - f_n(a)}{h_k}$ is either $1$ or $-1$ and
$$q_k = \sum_{n=1}^k \pm 1$$
is odd if $k$ is odd and even if $k$ is even. Hence $|q_k - q_{k-1}| \ge 1$, which doesn't converge to 0, a contradiction.
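The parity argument can be traced in exact arithmetic. The following sketch uses Python's `fractions.Fraction` so that no rounding interferes; the sample point $a = 1/3$ and all function names are illustrative choices, and $g$ is computed as the distance to the nearest multiple of 4 via a floor.

```python
from fractions import Fraction
from math import floor

def g(x):
    """Distance from x to the nearest multiple of 4 (the sawtooth g)."""
    m = floor((x + 2) / 4)
    return abs(x - 4 * m)

def f_n(n, x):
    """f_n(x) = g(4**n x) / 4**n."""
    return g(4 ** n * x) / 4 ** n

def delta(k, a):
    """+1 if (4^k a, 4^k a + 1) contains no even integer, else -1."""
    y = 4 ** k * a
    next_even = 2 * (floor(y / 2) + 1)  # smallest even integer strictly above y
    return 1 if next_even >= y + 1 else -1

def q(k, a):
    """Difference quotient of f at a with step h_k; terms with n > k vanish."""
    h = Fraction(delta(k, a), 4 ** k)
    return sum((f_n(n, a + h) - f_n(n, a)) / h for n in range(1, k + 1))

a = Fraction(1, 3)
for k in range(1, 9):
    # q_k is an integer with the same parity as k, so consecutive values differ by at least 1.
    print(k, q(k, a))
```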
Another formula for the function $g$ which gives the minimum distance to $4\mathbb{Z}$ is
$$g(x) = \left| x - 4 \left\lfloor \frac{x + 2}{4} \right\rfloor \right|,$$
where, as usual, $\lfloor a \rfloor$ is the greatest integer less than or equal to $a$. To see this, let $m = \lfloor \frac{x+2}{4} \rfloor$. Then,
$$m \le \frac{x + 2}{4} < m + 1,$$
so that
$$-2 \le x - 4m < 2.$$
If $n \in \mathbb{Z}$ with $n \ge m + 1$, we have
$$x - 4n \le x - 4m - 4 < 2 - 4 = -2 \le x - 4m,$$
and if $n \le m - 1$, then
$$x - 4n \ge x - 4m + 4 \ge -2 + 4 = 2 > x - 4m.$$
In both cases $|x - 4n| \ge 2 \ge |x - 4m|$, so $|x - 4m|$ is indeed the minimum distance.
47.1. Let $f = \sum_{n=0}^\infty f_n$, where $f_n(t) = \dfrac{\sin((n!)^2 t)}{n!}$. Then $f$ is continuous and nowhere differentiable.
P
1
2
The proof uses: (1) 1
:
(2)
j
sin
x
sin yj jx yj. (3) for every integer k 3,
kDn kŠ
nŠ
3
there exists y with jx
yj
and
j
sin
kx
sin kyj 1.
k
k
The original example of Weierstrass was $\sum_n \dfrac{\sin b^n x}{a^n}$, with $b$ an integer and $b/a$ and $a$ sufficiently large.
[Figure: graphs of the partial sums $s_2$ and $s_3$ for $-2 \le x \le 2$.]
48. POWER SERIES
A series of the form
$$\sum_{n=0}^\infty a_n (x - c)^n$$
is called a power series about $c$. (The same name is given also to series where the summation starts at $n = 1$ (or larger).) This can be considered a series of functions, $\sum_n f_n$, where $f_n(x) = a_n (x - c)^n$. On what domains does this define a series which converges pointwise? On what domains does it converge uniformly? Can we integrate this series, or differentiate it, by integrating or differentiating the nice polynomials $a_n (x - c)^n$? The answer is surprisingly often YES. For the moment we consider power series of real numbers (and "real variables"). We will make some remarks about the complex case afterward.
The radius of convergence of the power series $\sum_{n=0}^\infty a_n (x - c)^n$ is
$$R = \sup\Bigl\{ r : \sum_n a_n r^n \text{ converges} \Bigr\} \quad \text{(possibly } +\infty\text{)}.$$
The value $+\infty$ is given when the set $\{ r : \sum_n a_n r^n \text{ converges} \}$ is not bounded above. One should note that $R \ge 0$, since the series $\sum_{n=0}^\infty a_n r^n$ converges to $a_0$ when $r = 0$.
48.1 Lemma. For the power series $\sum_n a_n (x - c)^n$, put $\alpha = \limsup_n |a_n|^{1/n}$. Then the radius of convergence is $R = \frac{1}{\alpha}$ (interpreted as $+\infty$, if $\alpha = 0$, and as 0, if $\alpha = +\infty$).
Proof. This is an application of the root test, which says that a series $\sum_n x_n$ converges absolutely if $\limsup_n |x_n|^{1/n} < 1$ and diverges if $\limsup_n |x_n|^{1/n} > 1$.
Let $\alpha$ be as given in the statement. If $\alpha$ is finite, and $r \ge 0$, then
$$\limsup_n |a_n r^n|^{1/n} = \limsup_n \bigl( |a_n|^{1/n} r \bigr) = r \bigl( \limsup_n |a_n|^{1/n} \bigr) = \alpha r.$$
Thus the series $\sum_n a_n r^n$ converges absolutely for $r < 1/\alpha$ and diverges for $r > 1/\alpha$. This shows that the radius of convergence is $1/\alpha$.
In case $\alpha = 0$, $r\alpha = 0 < 1$, so the series converges for all $r$ and hence the radius of convergence is $+\infty$.
Finally, in case $\alpha = +\infty$, $\limsup_n |a_n r^n|^{1/n} = +\infty > 1$, unless $r = 0$, in which case $\limsup_n |a_n r^n|^{1/n} = \limsup_n 0 = 0$. Thus, the radius of convergence is 0.
Notice also that this proof actually showed that, in the definition of radius of convergence, one could use $|a_n|$ in place of $a_n$, obtaining absolute convergence.
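As a rough numerical illustration of Lemma 48.1, the sketch below estimates $\alpha$ for the hypothetical coefficients $a_n = n 2^n$ (chosen here, not taken from the text), for which $|a_n|^{1/n} = 2 n^{1/n} \to 2$ and so $R = 1/2$. Because this particular sequence converges, a single large index stands in for the lim sup; in general one would have to look at the supremum over a tail. Logarithms are used to avoid overflow.

```python
import math

def root_estimate(log_abs_coeff, n):
    """|a_n|**(1/n), computed from log|a_n|."""
    return math.exp(log_abs_coeff(n) / n)

def log_abs_a(n):
    """log|a_n| for the illustrative coefficients a_n = n * 2**n."""
    return math.log(n) + n * math.log(2.0)

for n in (10, 1000, 10 ** 6):
    alpha_n = root_estimate(log_abs_a, n)
    # The second column approaches alpha = 2, the third approaches R = 1/2.
    print(n, alpha_n, 1.0 / alpha_n)
```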
48.2 Theorem. Let $R$ be the radius of convergence of the power series $\sum_n a_n (x - c)^n$.
(1) Then the series converges pointwise on $(c - R, c + R)$.
(2) If $0 \le r < R$, then the series $\sum_n a_n (x - c)^n$ converges uniformly on $[c - r, c + r]$.
(3) Put $f(x) = \sum_{n=0}^\infty a_n (x - c)^n$, for $x \in (c - R, c + R)$. Then:
(a) If $a, b \in (c - R, c + R)$ with $a \le b$, then $f$ is integrable on $[a, b]$ with
$$\int_a^b f(x) \, dx = \sum_{n=0}^\infty \int_a^b a_n (x - c)^n \, dx = \sum_{n=0}^\infty a_n \left. \frac{(x - c)^{n+1}}{n + 1} \right|_{x=a}^{b}.$$
(b) $f$ is differentiable with
$$f'(x) = \sum_{n=0}^\infty n a_n (x - c)^{n-1}, \quad \text{for } x \in (c - R, c + R).$$
The interval of convergence of the series is the largest interval on which the series converges. It consists of $(c - R, c + R)$ together with those endpoints at which it converges.
Proof. (The statements are vacuously satisfied if $R = 0$.)
((1) and (2)) Convergence pointwise on $(c - R, c + R)$ means convergence for each fixed $x \in (c - R, c + R)$. But if $x$ belongs to this interval, equivalently if $|x - c| < R$, then there exists $r$ with $|x - c| < r < R$; hence, if we can show the series converges uniformly on $[c - r, c + r]$, it has to converge at $x$. Thus it is enough to prove (2).
So, suppose $0 \le r < R$. Then $\sum_n |a_n| r^n$ converges. But then, for all $n$ and all $x \in [c - r, c + r]$, $|a_n (x - c)^n| \le |a_n| r^n$, so by the Weierstrass M-test, the series converges absolutely, uniformly for $x \in [c - r, c + r]$.
(3) Let $f_n(x) = a_n (x - c)^n$ defined for $x \in (c - R, c + R)$. Let $a, b \in (c - R, c + R)$ and choose $0 \le r < R$ so that $[a, b] \subset [c - r, c + r]$. Then $\sum_n f_n$ converges to $f$ uniformly on $[c - r, c + r]$, hence on $[a, b]$, so by the theorem on integrating uniformly convergent series,
$$\int_a^b f = \sum_{n=0}^\infty \int_a^b f_n = \sum_{n=0}^\infty a_n \left. \frac{(x - c)^{n+1}}{n + 1} \right|_{x=a}^{b}.$$
This proves part (a).
To prove part (b), we apply the result on differentiating the sum of a series. However, for that result, we need uniformity of the convergence of the series
$$\sum_{n=0}^\infty f_n'$$
of derivatives in a neighbourhood of the point where we intend to differentiate. Now,
$$\sum_{n=0}^\infty f_n'(x) = \sum_{n=0}^\infty a_n n (x - c)^{n-1}. \tag{D}$$
The radius of convergence of this "differentiated" series is determined by
$$\limsup_n |n a_n|^{1/n} = \limsup_n |n|^{1/n} |a_n|^{1/n} = \limsup_n |a_n|^{1/n},$$
since $n^{1/n} \to 1$. Thus, the series (D) has radius of convergence $R$.
Since also
$$\sum_{n=0}^\infty f_n(c) \text{ converges to } a_0,$$
we can differentiate term by term and get
$$f'(x) = \sum_{n=0}^\infty f_n'(x),$$
for all $x \in [c - r, c + r]$. If we are given a particular point $x$ in $(c - R, c + R)$, we simply choose $r$ so that $x$ belongs to $[c - r, c + r]$ to complete the proof.
48.3 Note. We see from the above arguments that the radii of convergence of the three series
$$\sum_n a_n (x - c)^n, \qquad \sum_n a_n n (x - c)^{n-1} \qquad \text{and} \qquad \sum_n \frac{a_n}{n + 1} (x - c)^{n+1}$$
are the same.
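Term-by-term differentiation and integration can be checked on the geometric series, where closed forms are known: with $a_n = 1$ and $c = 0$, the three series of Note 48.3 sum to $1/(1-x)$, $1/(1-x)^2$ and $-\ln(1-x)$ for $|x| < 1$. The sketch below compares truncated sums with these values at the illustrative point $x = 0.5$; the helper `series` and the truncation length are assumptions of the sketch.

```python
import math

def series(term, x, terms=400):
    """Truncated sum of term(n, x) for n = 0 .. terms - 1."""
    return sum(term(n, x) for n in range(terms))

x = 0.5
original = series(lambda n, x: x ** n, x)
# The n = 0 term of the differentiated series is zero.
derivative = series(lambda n, x: n * x ** (n - 1) if n > 0 else 0.0, x)
integral = series(lambda n, x: x ** (n + 1) / (n + 1), x)

print(original, 1 / (1 - x))
print(derivative, 1 / (1 - x) ** 2)
print(integral, -math.log(1 - x))
```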
48.4 Corollary. Suppose $f(x) = \sum_{n=0}^\infty a_n (x - c)^n$, for $x \in (c - R, c + R)$. Then for each $k \in \mathbb{N}$, $f$ is $k$-times differentiable in this interval, with derivative
$$f^{(k)}(x) = \sum_{n=k}^\infty \frac{n!}{(n - k)!} a_n (x - c)^{n-k},$$
and in particular, $f^{(k)}(c) = k! \, a_k$; so
$$a_k = \frac{f^{(k)}(c)}{k!}.$$
Thus, we obtain the representation of $f$ in terms of its so-called Taylor's series:
$$f(x) = \sum_{n=0}^\infty a_n (x - c)^n = \sum_{n=0}^\infty \frac{f^{(n)}(c)}{n!} (x - c)^n, \quad \text{for } x \in (c - R, c + R).$$
The proof is a simple exercise.
48.5 Warning. The above theorem does not say that an infinitely differentiable function is always the sum of its Taylor's series. It only applies to functions which are already known to be expandable as the sum of a power series. The function defined on $\mathbb{R}$ by
$$f(x) = \begin{cases} e^{-1/x^2}, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0 \end{cases}$$
has $f^{(n)}(0) = 0$, for all $n \in \mathbb{N}$, so its Taylor's series about 0 is the series
$$\sum_{n=0}^\infty \frac{0}{n!} x^n,$$
the "0 series". Its sum is 0 for all $x$, which is certainly different from $f$. The remainder term in Taylor's Theorem is actually the entire function $f$.
We know that, if $R$ is the radius of convergence of the series $\sum_{n=0}^\infty a_n (x - c)^n$, then it converges pointwise on $(c - R, c + R)$ and uniformly on any closed interval contained in this, but not necessarily on the whole interval. (The geometric series $\sum_n x^n$ shows this.) It diverges at each point outside the closed interval $[c - R, c + R]$. What happens to the uniformity if there is convergence at an endpoint?
48.6 Abel's Theorem. If the power series $\sum_{n=0}^\infty a_n (x - c)^n$ converges at $c + R$, then it converges uniformly on $[c, c + R]$. Similarly, if it converges at $c - R$, then it converges uniformly on $[c - R, c]$. Consequently, $\sum_{n=0}^\infty a_n (x - c)^n$ converges uniformly on each closed interval contained in its interval of convergence.
Proof. Without loss of generality, we may (and do) assume $c = 0$ and $R = 1$. (Why is this?) Let $\varepsilon > 0$. Since $\sum_{n=0}^\infty a_n$ converges, it is Cauchy, so we may find $N$ such that
$$\left| \sum_{i=N}^m a_i \right| < \varepsilon, \quad \text{for all } m \ge N,$$
or what is the same
$$\left| \sum_{i=0}^k a_{N+i} \right| < \varepsilon, \quad \text{for all } k \ge 0.$$
Put $A_k = \sum_{i=0}^k a_{N+i}$, so $|A_k| < \varepsilon$ for all $k$. Then, for $x \in [0, 1]$,
$$\begin{aligned}
\sum_{i=0}^k a_{N+i} x^{N+i} &= a_N x^N + a_{N+1} x^{N+1} + \dots + a_{N+k} x^{N+k} \\
&= A_0 x^N + (A_1 - A_0) x^{N+1} + (A_2 - A_1) x^{N+2} + \dots + (A_k - A_{k-1}) x^{N+k} \\
&= x^N \bigl[ A_0 + (A_1 - A_0) x + (A_2 - A_1) x^2 + \dots + (A_k - A_{k-1}) x^k \bigr] \\
&= x^N \bigl[ A_0 (1 - x) + A_1 (x - x^2) + \dots + A_{k-1} (x^{k-1} - x^k) + A_k x^k \bigr].
\end{aligned}$$
Now the absolute value of this is at most
$$\begin{aligned}
& x^N \bigl[ |A_0| (1 - x) + |A_1| (x - x^2) + \dots + |A_{k-1}| (x^{k-1} - x^k) + |A_k| x^k \bigr] \\
&\le x^N \bigl[ \varepsilon (1 - x) + \varepsilon (x - x^2) + \dots + \varepsilon (x^{k-1} - x^k) + \varepsilon x^k \bigr] \\
&= x^N \varepsilon, \quad \text{since the sum telescopes.}
\end{aligned}$$
Thus,
$$\left| \sum_{i=N}^m a_i x^i \right| \le \varepsilon, \quad \text{for all } m \ge N \text{ and } x \in [0, 1],$$
which shows that the series satisfies the Cauchy condition for uniform convergence on $[0, 1]$.
The proof for the case of convergence at $c - R$ is similar. Alternatively, it can be deduced from the previous case by replacing $x - c$ by $-(x - c)$; that is, by considering the series $\sum_n (-1)^n a_n (x - c)^n$.
48.7 Corollary. If $\sum_{n=0}^\infty a_n$ converges, then $\lim_{x \to 1} \sum_{n=0}^\infty a_n x^n = \sum_{n=0}^\infty a_n$.
48.8 Example. Let $f(x) = 1/(1 + x)$, for $x \ne -1$. Since $1/(1 - x) = \sum_{n=0}^\infty x^n$, for $|x| < 1$, we also have $f(x) = \sum_{n=0}^\infty (-1)^n x^n$, for $|x| < 1$. The interval of convergence of this series is just $(-1, 1)$. The Taylor's series for $f$ about 0 is just this, but $f$ is not the sum of its Taylor's series, except on that interval.
Now, for each $x \in [0, 1)$, we may integrate term by term, yielding
$$\int_0^x \frac{1}{1 + t} \, dt = \sum_{n=0}^\infty (-1)^n \frac{x^{n+1}}{n + 1} = \sum_{n=1}^\infty (-1)^{n-1} \frac{x^n}{n}.$$
This latter series also converges at $x = 1$, because it is an alternating series of decreasing terms:
$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots.$$
Thus, by Abel's Theorem,
$$\lim_{x \to 1} \sum_{n=1}^\infty (-1)^{n-1} \frac{x^n}{n} = \sum_{n=1}^\infty (-1)^{n-1} \frac{1}{n}.$$
Assuming the properties of natural logarithms, we are now able to sum this series, because
$$\lim_{x \to 1} \int_0^x \frac{1}{1 + t} \, dt = \int_0^1 \frac{1}{1 + t} \, dt = \ln(1 + 1) - \ln(1 + 0) = \ln 2.$$
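The conclusion of Example 48.8 can be checked numerically. The sketch below (function name illustrative) compares partial sums of $\sum_{n \ge 1} (-1)^{n-1} x^n / n$ with $\ln(1 + x)$, both inside the interval and at the endpoint $x = 1$, where convergence is slow and governed by the alternating-series bound $1/(N + 1)$.

```python
import math

def log_series(x, terms):
    """Partial sum of sum_{n>=1} (-1)**(n-1) * x**n / n."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, terms + 1))

for x in (0.5, 0.9, 0.99):
    # Inside (-1, 1) the partial sums approach ln(1 + x) quickly.
    print(x, log_series(x, 5000), math.log(1 + x))

for terms in (10, 100, 1000):
    # At the endpoint x = 1 the error is at most 1/(terms + 1).
    print(terms, abs(log_series(1.0, terms) - math.log(2.0)), 1 / (terms + 1))
```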
The complex case. The definition of power series remains unchanged if the coefficients $a_n$ and the center $c$ are replaced by complex numbers. Traditionally, the variable $z$ is used instead of $x$. Again, using the same proofs, we find that the series $\sum_{n=0}^\infty a_n (z - c)^n$ converges absolutely in the ball $\{ z : |z - c| < R \}$, uniformly on any smaller ball, and diverges for $|z - c| > R$, where again $R = 1/\alpha$, $\alpha = \limsup_n |a_n|^{1/n}$. The statement about integrating the series term by term doesn't make sense with the definition of integral we are using, since we have no concept of integration with respect to a complex variable.
The complex derivative of a function $f : Z \to \mathbb{C}$, where $Z \subset \mathbb{C}$, is defined as in the real case by $f'(z_0) = \lim_{z \to z_0} \dfrac{f(z) - f(z_0)}{z - z_0}$. If $f(z)$ is given by a power series, the theorem on differentiation term-by-term is still true, but our proof doesn't apply, because it depends on the Mean Value Theorem, which is not valid in the complex case. (See Example 50.6.)
The product of two power series. Given series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$, the convolution or (Cauchy) product of these two series is $\sum_{n=0}^\infty c_n$, where $c_n = \sum_{k=0}^n a_k b_{n-k}$.
Power series give the motivation for this definition. If one formally multiplies the two power series $\sum_n a_n z^n$ and $\sum_n b_n z^n$ term by term, collecting terms containing the same power of $z$ (as if they were polynomials), one obtains
$$\begin{aligned}
\Bigl( \sum_n a_n z^n \Bigr) \Bigl( \sum_n b_n z^n \Bigr) &= (a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \dots)(b_0 + b_1 z + b_2 z^2 + \dots) \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + \dots \\
&= c_0 + c_1 z + c_2 z^2 + \dots.
\end{aligned}$$
Taking $z = 1$ gives the above definition.
48.9 Theorem. Suppose
(a) $\sum_{n=0}^\infty a_n$ converges absolutely with sum $A$ and
(b) $\sum_{n=0}^\infty b_n$ converges with sum $B$.
Then, the series $\sum_{n=0}^\infty c_n$, with $c_n = \sum_{k=0}^n a_k b_{n-k}$, converges with sum $AB$.
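Proof. Put $A_n = \sum_{k \le n} a_k$, $B_n = \sum_{k \le n} b_k$, and $C_n = \sum_{k \le n} c_k$. Then,
$$\begin{aligned}
C_n &= a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + \dots + (a_0 b_n + a_1 b_{n-1} + \dots + a_n b_0) \\
&= a_0 B_n + a_1 B_{n-1} + a_2 B_{n-2} + \dots + a_n B_0 \\
&= (a_0 + a_1 + \dots + a_n) B + a_0 (B_n - B) + a_1 (B_{n-1} - B) + a_2 (B_{n-2} - B) + \dots + a_n (B_0 - B) \\
&= A_n B + W_n,
\end{aligned}$$
where
$$W_n = a_0 (B_n - B) + a_1 (B_{n-1} - B) + \dots + a_n (B_0 - B).$$
Since $A_n B \to AB$, our job is to show that $W_n \to 0$. Let $\bar{A} = \sum_{n=0}^\infty |a_n|$, which was assumed finite by (a).
Let $\varepsilon > 0$. Choose $N$ so large that $|B_n - B| \le \varepsilon$, for $n \ge N$. Then, for $n \ge N$,
$$\begin{aligned}
|W_n| &\le |a_0| \varepsilon + |a_1| \varepsilon + |a_2| \varepsilon + \dots + |a_{n-N}| \varepsilon \\
&\qquad + |a_{n-N+1} (B_{N-1} - B) + a_{n-N+2} (B_{N-2} - B) + \dots + a_n (B_0 - B)| \\
&\le \bar{A} \varepsilon + |a_{n-N+1} (B_{N-1} - B) + a_{n-N+2} (B_{N-2} - B) + \dots + a_n (B_0 - B)|.
\end{aligned}$$
Here, there are $N$ terms, each of which tends to 0 as $n \to \infty$, since $a_k \to 0$. Thus, this is less than $\bar{A} \varepsilon + \varepsilon$ for all $n$ large enough. This shows that $W_n \to 0$, and we are done.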
The same conclusion holds also with the absolute convergence in (a) replaced by convergence, provided the product series is known to converge.
48.10 Theorem. If $\sum_{n=0}^\infty a_n$ converges with sum $A$, $\sum_{n=0}^\infty b_n$ converges with sum $B$ and $\sum_{n=0}^\infty c_n$ converges with sum $C$, where $c_n = \sum_{k=0}^n a_k b_{n-k}$, then $AB = C$.
Proof. Define $f(x) = \sum_{n=0}^\infty a_n x^n$, $g(x) = \sum_{n=0}^\infty b_n x^n$ and $h(x) = \sum_{n=0}^\infty c_n x^n$, for $x \in [0, 1]$. For $x < 1$, the series converge absolutely and hence may be multiplied using the Cauchy product, so that
$$f(x) g(x) = h(x) \qquad (0 \le x < 1).$$
Because of Abel's Theorem, these functions are continuous at 1: as $x \to 1$,
$$f(x) \to A, \qquad g(x) \to B, \qquad h(x) \to C,$$
so that $AB = C$, as required.
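The Cauchy product is easy to compute for finitely many coefficients. The sketch below is one possible implementation; `cauchy_product` is a name introduced here. It checks Theorem 48.9 on two geometric series chosen for illustration, $a_n = 2^{-n}$ with $A = 2$ and $b_n = 3^{-n}$ with $B = 3/2$, so the product series should sum to $AB = 3$.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k} for n below min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

a = [0.5 ** n for n in range(60)]        # partial sums approach A = 2
b = [(1.0 / 3.0) ** n for n in range(60)]  # partial sums approach B = 3/2
c = cauchy_product(a, b)

print(sum(a), sum(b), sum(c))  # the last value should be close to A * B = 3
```

Only the first 60 coefficients are used, so `sum(c)` is the partial sum $C_{59}$ rather than the full sum.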
48.1. If $\lim_n \left| \dfrac{a_{n+1}}{a_n} \right| = \alpha$, the radius of convergence of $\sum_n a_n (x - c)^n$ is $\frac{1}{\alpha}$ (again interpreted as $+\infty$, if $\alpha = 0$; 0, if $\alpha = +\infty$).
48.2. Abel's theorem states: If the power series $\sum_{n=0}^\infty a_n (x - c)^n$ converges at $c + R$, then it converges uniformly on $[c, c + R]$. Similarly, if it converges at $c - R$, then it converges uniformly on $[c - R, c]$. In our proof, we stated it is enough to prove this for the special case $R = 1$ and $c = 0$. Prove that the result is indeed deducible from this case.
48.3. Prove that there is exactly one function F on R to R such that F 00 D F , F .0/ D 1,
F 0 .0/ D 0.
48.4. If $f(x) = \sum_{n=1}^\infty x^n / n$ on $[0, 1)$, find (the improper integral) $\int_0^1 f$. (With proof, of course.)
49. THE EXPONENTIAL AND TRIGONOMETRIC FUNCTIONS
Here we define the complex exponential function, establish its main properties, and use
it to obtain other elementary functions. We begin in outline form and give details afterward.
(1) We define the exponential function for every complex number $z$ by
\[ \exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \]
The series converges absolutely for every $z$ and uniformly on each bounded subset of $\mathbb{C}$.
(2) $\exp(a + b) = \exp(a)\exp(b)$, for all $a, b \in \mathbb{C}$.
(3) $\exp(0) = 1$, $\exp(1) = e$, $\exp(-z) = \exp(z)^{-1}$.
(4) $\exp(z) \neq 0$, for all $z$.
(5) $\exp'(z) = \exp(z)$. (complex differentiation)
(6) The restriction of exp to $\mathbb{R}$ is strictly increasing, continuous and positive. It agrees with the map $x \mapsto e^x$, defined earlier, so one also writes $e^z$ for $\exp(z)$.
(7) $\lim_{x \to -\infty} \exp(x) = 0$, $\lim_{x \to +\infty} \exp(x) = +\infty$ and hence, exp maps $\mathbb{R}$ onto $(0, +\infty)$.
(8) The map $t \mapsto \exp(it)$ maps $\mathbb{R}$ into [actually onto as we see in step (11)] the unit circle and we define the cosine and sine functions as the real and imaginary parts of this map. Thus,
\[ \cos t = \operatorname{Re}(\exp(it)), \qquad \sin t = \operatorname{Im}(\exp(it)), \]
or what is the same
\[ e^{it} = \cos t + i \sin t \qquad \text{("the Euler identity")} \]
(9) sin and cos are differentiable with
\[ \sin' = \cos, \qquad \cos' = -\sin. \]
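(10) The functions cos and sin have power series representations
\[ \cos t = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!}, \]
\[ \sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k+1)!}. \]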
(11) There is a smallest positive number $\pi$ such that $\exp(i\pi/2) = i$. Then, the interval $[0, 2\pi)$ is mapped by $t \mapsto \exp(it)$ onto the unit circle and $\exp(z) = 1$ if and only if $z = (2\pi i)k$, for some integer $k$.
(12) exp maps $\mathbb{C}$ onto $\mathbb{C} \setminus \{0\}$. Thus, for every complex number $w$ other than 0, there exists $z$ such that $e^z = w$.
Details (proofs).
(1) Since
\[ \left| \frac{z^{n+1}}{(n+1)!} \Big/ \frac{z^n}{n!} \right| = \frac{|z|}{n+1} \to 0, \]
the defining series
\[ \sum_{n=0}^{\infty} \frac{z^n}{n!} \]
converges, by the ratio test, for all $z \in \mathbb{C}$. Thus, the definition is valid, the radius of convergence is $R = +\infty$, and hence on each disk $\{z : |z| \le r\}$, the series converges uniformly and absolutely.
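(2) $\exp(a + b) = \exp(a)\exp(b)$, for all $a, b \in \mathbb{C}$.
To see this, we use the Cauchy product (convolution) of the series. Since $\sum_n \frac{a^n}{n!}$ converges absolutely and $\sum_n \frac{b^n}{n!}$ converges,
\[
\sum_{k=0}^{\infty} \frac{a^k}{k!} \sum_{m=0}^{\infty} \frac{b^m}{m!}
= \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{a^k b^{n-k}}{k!\,(n-k)!}
= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!}\, a^k b^{n-k}
= \sum_{n=0}^{\infty} \frac{1}{n!} (a+b)^n
\]
by the binomial theorem.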
(3) As usual for power series, $0^0$ is defined to be 1 and since all the other terms of the series for $\exp(0)$ vanish, $\exp(0) = 1$. Long ago we proved that $\exp(1) = \lim_n (1 + 1/n)^n = e$.
From (2), $\exp(z)\exp(-z) = \exp(z - z) = \exp(0) = 1$, so that $\exp(-z) = \exp(z)^{-1}$.
(4) That $\exp(z) \neq 0$, for all $z$, is an immediate consequence.
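(5) Using complex differentiation, $\exp'(z)$ means
\[ \lim_{w \to z} \frac{\exp(w) - \exp(z)}{w - z} = \lim_{h \to 0} \frac{\exp(z + h) - \exp(z)}{h}. \]
By (2) this is the same as
\[ \exp(z) \lim_{h \to 0} \frac{\exp(h) - \exp(0)}{h}. \]
But, using the series, we see that
\[ \frac{\exp(h) - \exp(0)}{h} = \frac{\sum_{n=0}^{\infty} \frac{h^n}{n!} - 1}{h} = \sum_{n=1}^{\infty} \frac{h^{n-1}}{n!} = 1 + h \sum_{n=2}^{\infty} \frac{h^{n-2}}{n!}, \]
which converges to 1 as $h \to 0$. Thus, $\exp'(z) = \exp(z)$, as claimed.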
(6) It follows from (5) that the restriction of exp to $\mathbb{R}$ also has $\exp'(x) = \exp(x)$. So we see that this function is differentiable, hence continuous.
Since $\exp(0) = 1$ and since $\exp(x) \neq 0$, for all $x$, we know by the Intermediate Value Theorem that $\exp(x) > 0$, for all real $x$.
But then, $\exp'(x) > 0$, so exp is strictly increasing on $\mathbb{R}$. Applying (2) by induction one finds that for $n$ a natural number, $\exp(n) = e^n$, and by similar arguments (see section 13) $\exp(r) = e^r$, for rational $r$. By continuity, we find that, for real $x$,
\[ \exp(x) = \sup\{e^r : r < x,\ r \in \mathbb{Q}\} = \inf\{e^s : s > x,\ s \in \mathbb{Q}\}. \]
This is the way we defined $e^x$. Thus, $\exp(x) = e^x$, for all real $x$, and one also writes $e^z$ for $\exp(z)$.
(7) Since, for positive real $x$, $\exp(x) = \sum_n x^n/n! > x$, $\lim_{x \to +\infty} \exp(x) = +\infty$, and $\lim_{x \to -\infty} \exp(x) = \lim_{x \to +\infty} \exp(-x) = \lim_{x \to +\infty} 1/\exp(x) = 0$. By the Intermediate Value Theorem, this entails that exp maps $\mathbb{R}$ onto $(0, +\infty)$.
(8) Recall that for a complex number $z = x + iy$, the complex conjugate is $\bar{z} = x - iy$ and $|z|^2 = z\bar{z}$. Now, let $t$ be real. From the series,
\[ \exp(it) = \sum_n \frac{(it)^n}{n!} \]
we see that $\exp(-it)$ is the complex conjugate of $\exp(it)$. Indeed, those terms with even $n$ are real and don't change when we replace $i$ by $-i$. Those terms with odd $n$ are imaginary, and are negated when we replace $i$ by $-i$. Consequently
\[ |\exp(it)|^2 = \exp(it)\exp(-it) = 1. \]
This shows that for all $t \in \mathbb{R}$, $|\exp(it)| = 1$; that is, the map $t \mapsto \exp(it)$ maps $\mathbb{R}$ into the unit circle. We define the cosine and sine functions by
\[ \cos t = \operatorname{Re}(\exp(it)), \qquad \sin t = \operatorname{Im}(\exp(it)); \]
that is,
\[ e^{it} = \cos t + i \sin t \qquad \text{("the Euler identity")} \]
(9) Differentiating $\exp(it)$ with respect to $t$ gives
\[ \cos' t + i \sin' t = i \exp(it) = i(\cos t + i \sin t) = -\sin t + i \cos t, \]
so that sin and cos are differentiable with
\[ \cos' = -\sin, \qquad \sin' = \cos. \]
Note: we should be careful here. We are comparing differentiation with respect to a complex variable with differentiation with respect to a real variable. We know for each $z$,
\[ \lim_{w \to z} \frac{\exp(w) - \exp(z)}{w - z} = \exp z, \]
so
\[ \lim_{w \to it} \frac{\exp(w) - \exp(it)}{w - it} = \exp it. \]
Hence, if we restrict $w$ to only run through values of the form $w = is$, for $s \in \mathbb{R}$, then
\[ \lim_{s \to t} \frac{\exp(is) - \exp(it)}{is - it} = \exp it. \]
This gives
\[ \lim_{s \to t} \left( \frac{\cos s - \cos t}{s - t} + i\, \frac{\sin s - \sin t}{s - t} \right) = i \exp(it), \]
justifying the calculation at the beginning of this paragraph.
(10) As we noted above, the terms of the series for $\exp(it)$ with even powers of $it$ are real and those with odd powers of $it$ are imaginary. That is
\[ \cos t + i \sin t = \exp(it) = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k+1)!}. \]
In other words,
\[ \cos t = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!}, \qquad \sin t = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k+1)!}. \]
(11) Now we are going to define $\pi$. We know that $\cos 0 + i \sin 0 = \exp(i0) = 1$, so $\cos 0 = 1$ and from the series representation of cos,
\[ \cos 2 = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \frac{2^6}{6!} + \cdots < 1 - \frac{2^2}{2!} + \frac{2^4}{4!} = -\frac{1}{3}. \]
Thus, by the Intermediate Value Theorem, there exists a $t > 0$ with $\cos t = 0$. Because of the continuity of cos, we can take $t_0$ to be the smallest such $t$. Define $\pi$ to be $2t_0$; thus, $\pi$ is the smallest positive number such that $\cos(\pi/2) = 0$.
Now, for $t \in [0, \pi/2)$, $\cos t > 0$. But $\sin' t = \cos t$, so sin is strictly increasing on $[0, \pi/2)$ and hence, since $\sin 0 = 0$, $\sin > 0$ on $(0, \pi/2]$.
Since $\cos^2(\pi/2) + \sin^2(\pi/2) = 1$, $\sin^2(\pi/2) = 1$, and hence $\sin(\pi/2) = 1$. It follows that
\[
\begin{aligned}
e^{i\pi/2} &= \cos(\pi/2) + i \sin(\pi/2) = i \\
e^{i\pi} &= e^{i\pi/2} e^{i\pi/2} = i^2 = -1 \\
e^{i3\pi/2} &= -i \\
e^{i2\pi} &= 1
\end{aligned}
\qquad (*)
\]
Now, let $z = u + iv$ be on the unit circle.
If $u > 0$, $v \ge 0$, there exists $t \in [0, \pi/2)$ with $\cos t = u$, by the Intermediate Value Theorem. But then $v^2 = \sin^2 t$, so that $v = \sin t$, since both are non-negative.
If $u \le 0$, $v > 0$, then $(u + iv)(-i) = v - iu = e^{it}$, for some $t$, and hence
\[ z = e^{i(t + \pi/2)}. \]
Finally, if $v < 0$, then $-z = e^{it}$, for some $t$, by the previous 2 cases, so $z = e^{i(t + \pi)}$.
We now prove that $\exp(z) = 1$ if and only if $z = (2\pi i)k$, for some integer $k$. Certainly from $(*)$, for each $k \in \mathbb{Z}$, $e^{i2\pi k} = 1$. To prove the converse, first observe that on $(0, \pi)$, $\cos' = -\sin < 0$, so cos is strictly decreasing there and on $(\pi, 2\pi)$, $\cos' > 0$, so cos is strictly increasing there.
Hence, 0 is the only $y \in [0, 2\pi)$, with $e^{iy} = 1$. If $z = x + iy$ then $\exp(z) = e^x e^{iy}$. If $e^z = 1$, then $|e^z| = e^x = 1$, so $x = 0$ and $e^{iy} = 1$. Let $k = \lfloor y/2\pi \rfloor$, the greatest integer less than or equal to $y/2\pi$. Then, $\exp(i(y - 2\pi k)) = 1$, so $y - 2\pi k = 0$, and thus $y = 2\pi k$. Thus, $z = (2\pi i)k$, as required.
(12) Let $w$ be a complex number other than 0. Then, $|w| = e^x$, for some real $x$ and $w/|w|$ is on the unit circle, so is of the form $e^{iy}$, for some real $y$. Hence $w = e^{x + iy}$, as required to establish all the properties listed.
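Several of the properties just proved can be checked numerically from the defining series alone. The following is a minimal sketch, not part of the original development: the names `exp_series` and `smallest_zero_of_cos` are hypothetical, the series is truncated at 60 terms (an assumption that is adequate only for moderate |z|), and the bisection relies on the fact, used in step (11), that cos 0 > 0 > cos 2.

```python
import cmath
import math


def exp_series(z, terms=60):
    """Partial sum of sum_{n>=0} z^n / n!, built term by term."""
    total = 0j
    term = 1 + 0j  # the n = 0 term, z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z^n/n! into z^(n+1)/(n+1)!
    return total


def smallest_zero_of_cos(tol=1e-12):
    """Bisection for a zero of cos t = Re exp(it) on [0, 2]."""
    lo, hi = 0.0, 2.0  # cos 0 = 1 > 0 and cos 2 < -1/3 < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if exp_series(1j * mid).real > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    a, b = 0.3 + 1.2j, -0.7 + 0.5j
    # (2): exp(a + b) and exp(a) exp(b) should agree up to rounding
    print(abs(exp_series(a + b) - exp_series(a) * exp_series(b)))
    # (8): exp(it) should lie on the unit circle
    print(abs(exp_series(2.5j)))
    # (11): twice the zero of cos should be close to math.pi
    print(2 * smallest_zero_of_cos(), math.pi)
    # comparison with the library exponential
    print(abs(exp_series(a) - cmath.exp(a)))
```

The bisection only locates a sign change of cos on [0, 2]; that this zero is the smallest positive one is the content of the argument in step (11), not something the code establishes.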
The connection with the angles of geometry. An angle $\theta$, in radian measure, is identified with the length of an arc: namely, the arc traced out on the unit circle as one rotates the point $(1, 0)$ through the angle $\theta$. If
\[ \gamma(t) = e^{it}, \qquad t \in [0, 2\pi], \]
the curve $\gamma$ has range the unit circle and the length of this arc is, by Theorem 51.3,
\[ \int_0^{2\pi} |\gamma'(t)|\, dt = \int_0^{2\pi} 1\, dt = 2\pi, \]
and more generally, if $t$ moves from 0 to $\theta$, $\gamma$ traces out an arc of length $\int_0^{\theta} 1\, dt = \theta$. So the approach we have taken is consistent with the angle interpretation, and we find the cosine and sine of an angle $\theta$, as defined in terms of right triangles, is also consistent with the definitions of cos and sin here.
50. DIFFERENTIATION OF VECTOR-VALUED AND COMPLEX-VALUED FUNCTIONS
The definition of derivative given for real-valued functions applies without change to
those with values in $\mathbb{C}$ or $\mathbb{R}^n$. Thus, let $f$ be a function defined on an interval $I$ of $\mathbb{R}$ containing the point $c$ with values in $\mathbb{C}$ or in $\mathbb{R}^n$. We say $f$ is differentiable at $c$ if the limit
\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
exists. If so, this limit is called the derivative of $f$ at $c$. The function $f'$ with domain the set of points where $f$ is differentiable defined by
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
is called the derivative of $f$.
In the vector case, if $f = (f_1, \dots, f_n)$, that is, $f_i(x)$ is the $i$th component of $f(x)$, for each $i$, we see that $f$ is differentiable at $c$ if and only if $f_i$ is differentiable at $c$, for all $i$, and we have
\[ f'(c) = (f_1'(c), \dots, f_n'(c)). \]
As a vector space, $\mathbb{C}$ is identified with $\mathbb{R}^2$ through the correspondence
\[ x + iy \longleftrightarrow (x, y) \]
and we see that for a function $f : I \to \mathbb{C}$, $f = f_1 + i f_2$, $f$ is differentiable at $c$ iff each of $f_1$ and $f_2$ is differentiable at $c$, and then
\[ f'(c) = f_1'(c) + i f_2'(c). \]
The Tangent Characterization (35.1) holds true again in this setting, with the identical proof. The only change is that some numbers become vectors:
50.1 Theorem. For a function $f : I \to \mathbb{R}^n$ and $x \in I$, $f$ is differentiable at $x$ iff there exist a $v \in \mathbb{R}^n$ and a function $\varepsilon : I \to \mathbb{R}^n$ such that $\lim_{t \to x} \varepsilon(t) = \varepsilon(x) = 0$ and, for all $t \in I$,
\[ f(t) = f(x) + v(t - x) + \varepsilon(t)(t - x). \]
In this case, $f'(x) = v$.
The product $v(t - x)$ here means multiplication of the vector $v$ by the scalar $t - x$. If $f'(x) \neq 0$, as $t$ moves along the real line, $f(x) + f'(x)(t - x)$ traces out a straight line tangent to the image of $f$ at $f(x)$. Since $\varepsilon(t) \to 0$, as $t$ gets close to $x$, the error in using the point on the tangent line to approximate the point $f(t)$ becomes small, even compared to $t - x$. [This is actually the meaning assigned to the word "tangent" in this setting.]
The formula in the tangent characterization can also be written:
\[ f(x + h) = f(x) + vh + \varepsilon_0(h)h, \]
where $\varepsilon_0(h) \to 0$ as $h \to 0$. Of course this is meaningful only for those $h$ for which $x + h \in I$.
The tangent characterization of derivative again immediately gives continuity.
50.2 Theorem. If f is differentiable at c, then f is continuous at c.
To obtain a chain rule for this notion of derivative, we have to be careful about the
ranges of the two functions.
50.3 The chain rule. Let $I$ be an interval and $g : I \to \mathbb{R}$ be differentiable at $x_0$. Let $g(I) \subset J$, another interval of $\mathbb{R}$, $f : J \to \mathbb{R}^n$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.
This is proved just as before. It will also be deducible from the version given elsewhere for functions of a vector variable. Notice that here if $g$ had its values in $\mathbb{R}^n$, $f$ would have to be defined on a set of elements of $\mathbb{R}^n$ and the present definition of derivative would not apply.
The simple algebraic results now become:
50.4 Theorem. If $f$ is constant on the interval $I$, then $f$ is differentiable on $I$ with $f'(x) = 0$ for all $x \in I$.
50.5 Theorem. Let $f : I \to \mathbb{R}^n$, $g : I \to \mathbb{R}^n$ be differentiable at $c \in I$ and let $k \in \mathbb{R}$, then:
(a) $kf$ is differentiable at $c$ with $(kf)'(c) = k f'(c)$.
(b) $f + g$ is differentiable at $c$ and $(f + g)'(c) = f'(c) + g'(c)$ (sum rule).
(c) $f \cdot g$ is differentiable at $c$ and $(f \cdot g)'(c) = f'(c) \cdot g(c) + f(c) \cdot g'(c)$ (product rule for dot product).
Of course, there is also a version of the product rule for one of the functions $f$ and $g$ scalar valued and the other vector valued, and a corresponding quotient rule.
The Mean Value Theorem does not hold for vector (or complex) valued functions defined on a real interval.
50.6 Example (Complex MVT fails). For each $t \in [0, 2\pi]$, let $f(t) = e^{it}$. Then,
\[ f(2\pi) - f(0) = 1 - 1 = 0; \]
but
\[ f'(t) = i e^{it} \]
has absolute value 1, for all $t$. Hence there is no $t$ with $f(2\pi) - f(0) = f'(t)(2\pi - 0)$.
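The failure in this example is easy to observe numerically. The sketch below is illustrative only: the helper name `increment_and_speeds` and the grid of 1000 sample points are assumptions, and the library function `cmath.exp` stands in for the series definition of exp.

```python
import cmath
import math


def f(t):
    """The curve f(t) = e^{it} from the example."""
    return cmath.exp(1j * t)


def f_prime(t):
    """Its derivative f'(t) = i e^{it}."""
    return 1j * cmath.exp(1j * t)


def increment_and_speeds(samples=1000):
    """Return f(2*pi) - f(0) and the values |f'(t)| on a uniform grid of [0, 2*pi]."""
    increment = f(2 * math.pi) - f(0.0)
    speeds = [abs(f_prime(2 * math.pi * k / samples)) for k in range(samples + 1)]
    return increment, speeds


if __name__ == "__main__":
    increment, speeds = increment_and_speeds()
    print(abs(increment))            # essentially 0, up to rounding
    print(min(speeds), max(speeds))  # both essentially 1
```

Since |f'(t)| stays at 1 while the increment vanishes, no sampled t can satisfy the Mean Value Theorem equation; the grid only illustrates what the example proves for every t.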
But there is a consequence of the Mean Value Theorem that does hold; it is the one that
we used to show that a function with a bounded derivative is Lipschitz. (Theorem 36.10)
50.7 Mean Value Inequality. Let $f : [a, b] \to \mathbb{R}^n$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then, there exists $x \in (a, b)$ such that
\[ |f(b) - f(a)| \le |f'(x)|\,|b - a|. \]
Proof. Put $z = f(b) - f(a)$ and define for $t \in [a, b]$,
\[ \varphi(t) = z \cdot f(t). \]
Then, $\varphi$ is a real-valued function, continuous on $[a, b]$, differentiable in $(a, b)$. Therefore, by the Mean Value Theorem, there exists $x \in (a, b)$ with
\[ \varphi(b) - \varphi(a) = \varphi'(x)(b - a). \]
But $z$ is constant, so by the product rule,
\[ \varphi'(x) = 0 \cdot f(x) + z \cdot f'(x) = z \cdot f'(x). \]
As for the other side of the equation,
\[ \varphi(b) - \varphi(a) = z \cdot f(b) - z \cdot f(a) = z \cdot (f(b) - f(a)) = |z|^2. \]
Thus,
\[ |z|^2 = z \cdot f'(x)(b - a) \le |z|\,|f'(x)|(b - a), \]
by the Cauchy-Schwarz inequality, and the result follows by cancelling the $|z|$ (if $z = 0$ the inequality is trivial).
We will have a further generalization of this result when we turn to functions of a vector
variable. (Theorem 53.4.)
50.1. Prove the chain rule for vector functions of a real variable: Let $I$ be an interval and $g : I \to \mathbb{R}$ be differentiable at $x_0$. Let $g(I) \subset J$, another interval of $\mathbb{R}$, $f : J \to \mathbb{R}^n$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.
50.2. Let $\gamma : [a, b] \to \mathbb{R}^n$ be differentiable with constant norm; that is, $|\gamma(t)|$ is the same for all $t$. Prove $\gamma'(t)$ is orthogonal to $\gamma(t)$, for all $t$.
51. INTEGRATION OF VECTOR-VALUED FUNCTIONS
Let $f_1, \dots, f_n$ be functions on $[a, b] \subset \mathbb{R}$ to $\mathbb{R}$, and let $f = (f_1, \dots, f_n)$ be the corresponding function on $[a, b]$ to $\mathbb{R}^n$. The Riemann definition of integral as the limit of Riemann sums as the mesh of the tagged partitions tends to 0 still makes sense here, and one finds that $f$ is integrable iff each of $f_1, \dots, f_n$ is integrable and
\[ \int_a^b f = \left( \int_a^b f_1, \dots, \int_a^b f_n \right). \]
When we need (or want) to show the variable of integration this becomes
\[ \int_a^b f(t)\, dt = \left( \int_a^b f_1(t)\, dt, \dots, \int_a^b f_n(t)\, dt \right). \]
It is clear this integral is still linear and additive on intervals, by just applying the real
case to each coordinate. The same is true for the Fundamental Theorem of Calculus. Let
us state the “integrating a derivative” form.
51.1 Theorem. If $F$ maps $[a, b]$ into $\mathbb{R}^n$, $F' = f$ on $[a, b]$, and $f$ is Riemann integrable, then
\[ \int_a^b f(t)\, dt = F(b) - F(a). \]
The result about the integral of the absolute value of an integrable function $f$ is also true, but the proof is a little trickier, since $f$ is vector-valued and $|f|$ is real-valued. (By the way, remember that $|f|$ means the function whose value at $x$ is $|f(x)|$, the norm of the vector $f(x)$, not really absolute value. We are reserving $\|f\|$ for the supremum norm of the function.)
51.2 Theorem. Let $f : [a, b] \to \mathbb{R}^n$ be Riemann integrable. Then $|f|$ is Riemann integrable and
\[ \left| \int_a^b f \right| \le \int_a^b |f|. \]
Proof. The hypothesis implies that each of the components $f_1, \dots, f_n$ is integrable, so each of the squares $f_1^2, \dots, f_n^2$ is integrable and so is their sum. The square root function is continuous, so $|f| = (\sum_i f_i^2)^{1/2}$ is also integrable.
Put $w_i = \int_a^b f_i$, so that $w = (w_1, \dots, w_n)$ is $\int_a^b f$. Then,
\[ |w|^2 = \sum_i w_i^2 = \sum_i w_i \int_a^b f_i = \int_a^b \sum_i w_i f_i. \]
By the Cauchy-Schwarz inequality, for all $t \in [a, b]$,
\[ \sum_i w_i f_i(t) \le |w|\,|f(t)|. \]
Thus,
\[ |w|^2 \le \int_a^b |w|\,|f(t)|\, dt = |w| \int_a^b |f|, \]
and the result follows by cancelling the $|w|$ (if $w = 0$ there is nothing to prove).
Rectifiable curves: arc length. A continuous mapping $\gamma$ of an interval $[a, b]$ into $\mathbb{R}^n$ is called a (parametrized) curve, because its range $C = \{\gamma(t) : t \in [a, b]\}$ can be considered a geometric curve, traced out by the point $\gamma(t)$ as $t$ moves from $a$ to $b$. The "distance traveled" by the point $\gamma(t)$ is thought of as the length of the curve. But one should keep in mind that the same set $C$ corresponds to many different maps $\gamma$ and these then may have different lengths. (For example, the curve could wrap around a circle several times.) If $\gamma$ is one-to-one, it is often called an arc. If $\gamma(a) = \gamma(b)$, $\gamma$ is called a closed curve.
For a firm definition of the length of a curve, associate to each partition $P$ of $[a, b]$ determined by points $\{a = x_0 \le x_1 \le \dots \le x_k = b\}$ the number
\[ \ell(\gamma, P) = \sum_{i=1}^{k} |\gamma(x_i) - \gamma(x_{i-1})|. \]
This is the sum of the distances between the points $\gamma(x_i)$ and $\gamma(x_{i-1})$, so is the length of a polygonal path with vertices $\gamma(x_0), \gamma(x_1), \dots, \gamma(x_k)$. We define the length of $\gamma$ to be
\[ \ell(\gamma) = \sup\{\ell(\gamma, P) : P \text{ a partition of } [a, b]\}. \]
It is easy to see that the approximations $\ell(\gamma, P)$ increase as the partition $P$ gets finer. If $\ell(\gamma)$ is finite, we call the curve rectifiable. Notice that, although a process similar to integration is used, integration and differentiation are not involved in the definition.
A continuously differentiable curve $\gamma$ is one for which $\gamma'$ is continuous on the parameter interval. Sometimes these are called "smooth" curves, but the geometric curve traced out by such a curve could have sharp points, so some authors reserve this word for continuously differentiable $\gamma$ for which $\gamma'(t) \neq 0$, for all $t$. (Those who call continuously differentiable curves "smooth" often call those which have no zero derivatives regular.)
For continuously differentiable curves, we do have a formula in terms of integration and differentiation.
51.3 Theorem. If $\gamma : [a, b] \to \mathbb{R}^n$ is a continuously differentiable curve, then $\gamma$ is rectifiable and
\[ \ell(\gamma) = \int_a^b |\gamma'(t)|\, dt. \]
Proof. Let $P = \{x_0, x_1, \dots, x_k\}$ be a partition of $[a, b]$. For each $i$ we have by the Fundamental Theorem of Calculus,
\[ |\gamma(x_i) - \gamma(x_{i-1})| = \left| \int_{x_{i-1}}^{x_i} \gamma'(t)\, dt \right| \le \int_{x_{i-1}}^{x_i} |\gamma'(t)|\, dt. \]
Summing over $i$ gives
\[ \ell(\gamma, P) \le \int_a^b |\gamma'(t)|\, dt \]
and taking the supremum over all such partitions yields
\[ \ell(\gamma) \le \int_a^b |\gamma'(t)|\, dt. \]
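For the reverse inequality, let $\varepsilon > 0$, and use the fact that $\gamma'$ is uniformly continuous to choose $\delta > 0$ such that $|\gamma'(s) - \gamma'(t)| < \varepsilon$, whenever $|s - t| < \delta$. Then, choose any partition $P = \{x_0, \dots, x_k\}$ of $[a, b]$ with mesh $\|P\| < \delta$, and write $\Delta x_i = x_i - x_{i-1}$.
For each $t \in [x_{i-1}, x_i]$, we have
\[ |\gamma'(t)| \le |\gamma'(x_i)| + \varepsilon \]
and integrating gives
\[
\begin{aligned}
\int_{x_{i-1}}^{x_i} |\gamma'(t)|\, dt
&\le |\gamma'(x_i)|\,\Delta x_i + \varepsilon\,\Delta x_i
= |\gamma'(x_i)\,\Delta x_i| + \varepsilon\,\Delta x_i
= \left| \int_{x_{i-1}}^{x_i} \gamma'(x_i)\, dt \right| + \varepsilon\,\Delta x_i \\
&= \left| \int_{x_{i-1}}^{x_i} \bigl[ \gamma'(t) + (\gamma'(x_i) - \gamma'(t)) \bigr]\, dt \right| + \varepsilon\,\Delta x_i \\
&\le \left| \int_{x_{i-1}}^{x_i} \gamma'(t)\, dt \right| + \int_{x_{i-1}}^{x_i} |\gamma'(x_i) - \gamma'(t)|\, dt + \varepsilon\,\Delta x_i \\
&\le |\gamma(x_i) - \gamma(x_{i-1})| + 2\varepsilon\,\Delta x_i.
\end{aligned}
\]
Summing over $i$ yields
\[ \int_a^b |\gamma'(t)|\, dt \le \ell(\gamma, P) + 2\varepsilon(b - a) \le \ell(\gamma) + 2\varepsilon(b - a). \]
Since $\varepsilon$ is arbitrary,
\[ \int_a^b |\gamma'(t)|\, dt \le \ell(\gamma), \]
which is all that was left to prove.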
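The two descriptions of length in Theorem 51.3 can be compared numerically on the unit circle. This is a minimal sketch under stated assumptions: the function names are hypothetical, only uniform partitions are used (so the code produces lower estimates of the supremum, not the supremum itself), and the integral is approximated by a midpoint rule.

```python
import math


def polygonal_length(gamma, a, b, k):
    """l(gamma, P) for the uniform partition of [a, b] into k pieces."""
    points = [gamma(a + (b - a) * i / k) for i in range(k + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def speed_integral(dgamma, a, b, k=20000):
    """Midpoint-rule approximation of the integral of |gamma'(t)| over [a, b]."""
    h = (b - a) / k
    return h * sum(math.hypot(*dgamma(a + (i + 0.5) * h)) for i in range(k))


def circle(t):
    """gamma(t) = (cos t, sin t), the unit circle."""
    return (math.cos(t), math.sin(t))


def dcircle(t):
    """gamma'(t) = (-sin t, cos t)."""
    return (-math.sin(t), math.cos(t))


if __name__ == "__main__":
    for k in (4, 8, 16, 32, 1024):
        # polygonal lengths increase with refinement and stay below 2*pi
        print(k, polygonal_length(circle, 0.0, 2 * math.pi, k))
    print(speed_integral(dcircle, 0.0, 2 * math.pi), 2 * math.pi)
```

Doubling k refines the partition, so the printed polygonal lengths increase, as noted after the definition of the length; they approach the value of the integral from below.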
51.1. Let $I$ be an interval of $\mathbb{R}$ with $0 \in \operatorname{int}(I)$ and $\gamma(t) = (1 + t^3, t^2)$, for $t \in I$. Then, $\gamma$ is continuously differentiable, but $\gamma'(0) = 0$ and the geometric curve $\gamma(I)$ has a sharp point at $\gamma(0)$.
51.2. Let $\gamma_1$ be a rectifiable curve in $\mathbb{R}^n$ defined on $[a, b]$ and $\varphi$ a continuous 1-1 mapping of $[c, d]$ onto $[a, b]$ and $\gamma_2(s) = \gamma_1(\varphi(s))$, for all $s \in [c, d]$. Prove $\gamma_1$ and $\gamma_2$ have the same length.
51.3. Let $\gamma : [a, b] \to \mathbb{R}^n$ be a one-to-one curve. Let $\sigma : [c, d] \to \mathbb{R}^n$ be another one such that $\gamma([a, b]) = \sigma([c, d])$. Prove that the two curves have the same length. [Hint: these curves have continuous inverses; see CONTINUITY AND COMPACTNESS, Exercise 31.2.]
Note: Differentiation and integration are not involved.
51.4. Prove that if $\gamma_1 : [a, b] \to \mathbb{R}^n$ and $\gamma_2 : [b, c] \to \mathbb{R}^n$ are curves and $\gamma$ is the curve on $[a, c]$ that extends both of these (we call it $\gamma_1 + \gamma_2$) then $\ell(\gamma) = \ell(\gamma_1) + \ell(\gamma_2)$.
52. DIFFERENTIATION OF VECTOR FUNCTIONS OF A VECTOR VARIABLE
We now study the general case of mappings from a subset of $\mathbb{R}^n$ into $\mathbb{R}^m$. These are often called transformations or vector valued functions of several variables.
We recall that for real functions defined on an interval of $\mathbb{R}$, $f$ was differentiable at a point $c$ if and only if there was a straight line approximating $f$ closely at $c$, and a similar result held for vector valued functions of a real variable. This is the clue for the definition in the vector-to-vector case.
A mapping $T$ of a vector space $E_1$ into another vector space $E$ is called a linear transformation if, for all $x, y \in E_1$, and all scalars $t$,
\[ T(x + y) = Tx + Ty, \qquad T(tx) = t\,Tx. \]
Note that for linear maps, one often writes $Tx$ instead of $T(x)$.
Definition. Let $G$ be an open set of $\mathbb{R}^n$ and let $f$ be a mapping into $\mathbb{R}^m$, defined (at least) on $G$. We say that $f$ is differentiable at $x_0 \in G$, provided there exists a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ and a function $\varepsilon : G \to \mathbb{R}^m$ such that $\lim_{x \to x_0} \varepsilon(x) = \varepsilon(x_0) = 0$ and for all $x \in G$,
\[ f(x) = f(x_0) + T(x - x_0) + \varepsilon(x)\,|x - x_0|. \]
If this is satisfied, $T$ is denoted $f'(x_0)$ or $Df(x_0)$, and called the derivative of $f$ at $x_0$.
Of course the function $\varepsilon$ is given by
\[ \varepsilon(x) = \frac{f(x) - f(x_0) - T(x - x_0)}{|x - x_0|}, \]
for $x \neq x_0$, (and 0, if $x = x_0$).
It should be noticed that we had to use $|x - x_0|$, instead of $x - x_0$, because we cannot divide by a vector.
Another way we could write the defining condition is
\[ f(x) = f(x_0) + Df(x_0)(x - x_0) + R(x), \]
where the remainder $R(x)$ satisfies
\[ \lim_{x \to x_0} \frac{|R(x)|}{|x - x_0|} = 0. \]
Such a term $R(x)$ is said to be $o(|x - x_0|)$ as $x \to x_0$.
At the risk of causing boredom, we also note that this could be written
\[ f(x_0 + h) = f(x_0) + Df(x_0)h + R_1(h), \]
where
\[ \lim_{h \to 0} \frac{|R_1(h)|}{|h|} = 0. \]
As we shall reconfirm shortly, linear transformations on $\mathbb{R}^n$ are continuous, so it follows, as in the real setting, that differentiability implies continuity.
52.1 Theorem. If $G$ is an open subset of $\mathbb{R}^n$ and $f : G \to \mathbb{R}^m$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.
Connection with the usual definition for functions of a real variable. If $T : \mathbb{R} \to \mathbb{R}$ is a linear mapping, then its value at $h \in \mathbb{R}$ is $Th = T(1)h$. That is, if $a = T(1)$, then $Th = ah$, ordinary multiplication. Conversely, every function of that form is linear. So, when we write
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + R(x), \]
it doesn't matter whether we think of $f'(x_0)$ as a number which is multiplied by $x - x_0$, or as a linear mapping, evaluated at $x - x_0$.
Similarly, if $v$ is a vector of $\mathbb{R}^m$, then the mapping $h \in \mathbb{R} \mapsto vh$ (multiplication of $v$ by the scalar $h$) is a linear mapping. Conversely, if $T : \mathbb{R} \to \mathbb{R}^m$ is a linear transformation, and $v$ is the vector $T(1)$, $T$ at $h$ is again $h(T(1)) = vh$.
Once more, then, it doesn't matter whether we think of $f'(x_0)(x - x_0)$ as a vector multiplied by the scalar $x - x_0$ or as a linear transformation acting on $x - x_0$.
The case of real-valued functions of a vector variable. From elementary linear algebra, we learn:
52.2 Theorem. If $T : \mathbb{R}^n \to \mathbb{R}$, then $T$ is a linear transformation if and only if there exists $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ such that $Tx = a_1 x_1 + a_2 x_2 + \dots + a_n x_n = a \cdot x$, for all $x \in \mathbb{R}^n$.
The linearity of such a map follows immediately from the properties of dot product. For the converse, let $e_1, e_2, \dots, e_n$ be the standard basis vectors of $\mathbb{R}^n$,
\[
\begin{aligned}
e_1 &= (1, 0, 0, \dots, 0) \\
e_2 &= (0, 1, 0, \dots, 0) \\
&\ \ \vdots \\
e_n &= (0, 0, 0, \dots, 1),
\end{aligned}
\]
so $e_{ij} = 1$, if $i = j$, and 0 otherwise. Then, each $x \in \mathbb{R}^n$ is of the form
\[ x = (x_1, \dots, x_n) = \sum_{i=1}^{n} x_i e_i, \]
and we see that $Tx = \sum_i x_i\, T e_i = (a_1, \dots, a_n) \cdot (x_1, \dots, x_n) = a \cdot x$, where $a_i = T(e_i)$, for all $i$.
Thus, if a function $f$ on an open set $G$ of $\mathbb{R}^n$ with values in $\mathbb{R}$ is differentiable at $x_0$, then there is a vector $a \in \mathbb{R}^n$ such that $f'(x_0)h = a \cdot h$, for all $h \in \mathbb{R}^n$. The vector $a$ here will turn out to be what is called the gradient of $f$ at $x_0$.
Often we think of the graph of the mapping $f : G \subset \mathbb{R}^n \to \mathbb{R}$, namely: $S = \{(x, f(x)) : x \in G\}$, as an $n$-dimensional surface in $\mathbb{R}^{n+1}$. The graph of the map $x \mapsto f(x_0) + f'(x_0)(x - x_0)$ then becomes a (hyper)plane tangent to $S$ at $(x_0, f(x_0))$.
The space $L(\mathbb{R}^n, \mathbb{R}^m)$. The identification of the derivative as a real number or as a vector with the derivative as a linear transformation becomes clearer when we notice that the set $L(\mathbb{R}^n, \mathbb{R}^m)$ of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is itself a vector space under the usual operations
\[ (S + T)(x) = S(x) + T(x), \qquad (cT)(x) = c(T(x)). \]
The correspondences mentioned above are actually vector space isomorphisms. Thus, as vector spaces,
(1) $\mathbb{R}$ can be identified with $L(\mathbb{R}, \mathbb{R})$ under the isomorphism $a \mapsto T_a$, where $T_a h = ah$, for all real $h$;
(2) $\mathbb{R}^m$ can be identified with $L(\mathbb{R}, \mathbb{R}^m)$ under the correspondence $v \mapsto T_v$, where $T_v(h) = vh$, multiplication by a scalar;
(3) $\mathbb{R}^n$ can be identified with $L(\mathbb{R}^n, \mathbb{R})$ under the correspondence $a \mapsto T_a$, where $T_a(x) = a \cdot x$, for all $x \in \mathbb{R}^n$.
Of course, we are not only interested in the algebraic properties of derivatives, but also in related distances. To handle that, we introduce the norm of a linear transformation $T$,
\[ \|T\| = \sup_{|x| \le 1} |Tx|. \]
Notice that if $x \neq 0$, then $\left| \frac{x}{|x|} \right| = 1$, so $\left| T\!\left( \frac{x}{|x|} \right) \right| = \frac{|Tx|}{|x|} \le \|T\|$ and hence,
\[ |Tx| \le \|T\|\,|x|. \]
This evidently holds also when $x = 0$. On the other hand, if $K \ge 0$ satisfies $|Tx| \le K|x|$, for all $x \in \mathbb{R}^n$, we have by the definition of supremum as a least upper bound that $\|T\| \le K$. Thus,
52.3 Lemma. For $T \in L(\mathbb{R}^n, \mathbb{R}^m)$, $\|T\|$ is the least $K \ge 0$ such that $|Tx| \le K|x|$, for all $x \in \mathbb{R}^n$.
Now, let $e_1, \dots, e_n$ be the standard basis vectors of $\mathbb{R}^n$, and $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then, $x = \sum_{j=1}^{n} x_j e_j$, so
\[ |Tx| = \left| \sum_{j=1}^{n} x_j\, T e_j \right| \le \sum_{j=1}^{n} |x_j|\,|T e_j| \le |x| \left( \sum_{j=1}^{n} |T e_j|^2 \right)^{1/2}, \]
by the Cauchy-Schwarz inequality. This shows that
\[ \|T\| \le \left( \sum_{j=1}^{n} |T e_j|^2 \right)^{1/2} < +\infty. \]
It follows that $T$ is Lipschitz, hence uniformly continuous, because
\[ |Tx - Ty| = |T(x - y)| \le \|T\|\,|x - y|, \quad \text{for all } x, y \in \mathbb{R}^n. \]
52.4 Theorem. The map $T \mapsto \|T\|$ turns $L(\mathbb{R}^n, \mathbb{R}^m)$ into a normed space; that is, for all $S, T \in L(\mathbb{R}^n, \mathbb{R}^m)$ and all scalars $c$,
(1) $0 \le \|T\| < \infty$;
(2) $\|S + T\| \le \|S\| + \|T\|$; and
(3) $\|cT\| = |c|\,\|T\|$.
One can check that in the 3 special cases above, the correspondences also preserve the norm, as follows.
52.5 Theorem.
(1) If $a \in \mathbb{R}$ and $T_a h = ah$, for all real $h$, then $\|T_a\| = |a|$.
(2) If $v \in \mathbb{R}^m$ and $T_v(h) = vh$, multiplication by the scalar $h$, then $\|T_v\| = |v|$.
(3) If $a \in \mathbb{R}^n$ and $T_a(x) = a \cdot x$, for all $x \in \mathbb{R}^n$, then $\|T_a\| = |a|$.
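The bound on the norm and case (3) of Theorem 52.5 can be explored numerically. The sketch below is only illustrative: a linear map is represented by its matrix as a list of rows, the domain is restricted to $\mathbb{R}^2$ so that unit vectors can be sampled by angle, and the names `sampled_operator_norm` and `column_bound` are hypothetical. Sampling gives a lower estimate of the supremum, not its exact value.

```python
import math


def apply(T, x):
    """Value of the linear map with matrix T (list of rows) at the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in T]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def sampled_operator_norm(T, samples=3600):
    """Largest |T x| over sampled unit vectors x in R^2 (a lower estimate of ||T||)."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        best = max(best, norm(apply(T, [math.cos(theta), math.sin(theta)])))
    return best


def column_bound(T):
    """(sum_j |T e_j|^2)^(1/2); T e_j is the j-th column of the matrix."""
    columns = [[row[j] for row in T] for j in range(len(T[0]))]
    return math.sqrt(sum(norm(c) ** 2 for c in columns))


if __name__ == "__main__":
    T = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # a map from R^2 to R^3
    print(sampled_operator_norm(T), column_bound(T))  # estimate <= bound
    Ta = [[3.0, 4.0]]  # x -> a . x with a = (3, 4), so |a| = 5
    print(sampled_operator_norm(Ta))  # close to 5
```

For the map $x \mapsto a \cdot x$ the column bound equals $|a|$ exactly, which is consistent with case (3); for a general $T$ the bound is usually strict.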
Rules of differentiation. We first consider the case where $f = T$ is already a linear mapping. Then,
\[ f(x) = f(x_0) + T(x - x_0) + 0, \]
so $T'(x_0) = f'(x_0) = T$. Thus,
52.6 Theorem. Every linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at each point $x$ of $\mathbb{R}^n$, with $T'(x) = T$.
We emphasize that we are not saying $T' = T$. In general if $f : G \to \mathbb{R}^m$, $f' = Df$ is not a transformation of $\mathbb{R}^n$ to $\mathbb{R}^m$, but rather a mapping that associates to each $x \in G$ a linear transformation $f'(x) \in L(\mathbb{R}^n, \mathbb{R}^m)$. In the case where $f = T$ is linear, the value of $f'(x)$ is $T$, for all $x$.
Derivatives of constant maps are 0.
52.7 Theorem. Let $f : G \subset \mathbb{R}^n \to \mathbb{R}^m$ be defined by $f(x) = v$, for all $x \in G$. Then, $f'(x) = 0$, for all $x \in G$.
This is immediate from the definition and is left as an exercise.
52.8 The chain rule. Let $G$ be an open subset of $\mathbb{R}^n$ and $g : G \to \mathbb{R}^m$ be differentiable at $x_0$. Let $f$ map an open set $U$ containing $g(G)$ into $\mathbb{R}^k$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.
Here if $T = g'(x_0)$, $S = f'(g(x_0))$, we are talking about the composite mapping $S \circ T = ST$.
Proof. Since $g$ is differentiable at $x_0$, there exists a function $\varepsilon_1$, continuous and 0 at $x_0$ with
\[ g(x) = g(x_0) + g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0|, \tag{29} \]
for all $x \in G$. Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon_2$ which is continuous and 0 at $u_0 = g(x_0)$ with
\[ f(u) = f(u_0) + f'(u_0)(u - u_0) + \varepsilon_2(u)\,|u - u_0|, \tag{30} \]
for all $u$ in $U$. Replacing $u$ by $g(x)$ in (30) gives
\[ f(g(x)) = f(u_0) + f'(u_0)(g(x) - u_0) + \varepsilon_2(g(x))\,|g(x) - u_0|. \tag{31} \]
But $u_0 = g(x_0)$, so by (29) we may replace $g(x) - u_0$ by $g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0|$, yielding
\[
\begin{aligned}
f(g(x)) &= f(g(x_0)) + f'(u_0)\bigl[ g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr] \\
&\qquad + \varepsilon_2(g(x))\,\bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr| \\
&= f(g(x_0)) + f'(u_0) g'(x_0)(x - x_0) + R(x),
\end{aligned}
\tag{32}
\]
where
\[ R(x) = f'(u_0)\varepsilon_1(x)\,|x - x_0| + \varepsilon_2(g(x))\,\bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr|. \]
Thus, we need only show that the remainder $R(x)$ satisfies $|R(x)|/|x - x_0| \to 0$, as $x \to x_0$. Well, $\bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr| \le \|g'(x_0)\|\,|x - x_0| + |\varepsilon_1(x)|\,|x - x_0|$, so
\[ \frac{|R(x)|}{|x - x_0|} \le \|f'(u_0)\|\,|\varepsilon_1(x)| + |\varepsilon_2(g(x))|\,\bigl( \|g'(x_0)\| + |\varepsilon_1(x)| \bigr) \to 0, \]
as required. Here we used the fact that $g$ is continuous at $x_0$, and $\varepsilon_2$ is continuous and 0 at $g(x_0) = u_0$.
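In coordinates, the chain rule says that the matrix of $(f \circ g)'(x_0)$ is the product of the matrices of $f'(g(x_0))$ and $g'(x_0)$. The following is a minimal numerical sketch of that statement, not a proof: the maps `f` and `g` are hypothetical examples chosen for illustration, and the derivative matrices are approximated by central differences with an assumed step `h`.

```python
import math


def jacobian(F, x, h=1e-6):
    """Central-difference approximation of the matrix of F'(x)."""
    m = len(F(x))
    columns = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        # j-th column: approximate partial derivatives with respect to x_j
        columns.append([(p - q) / (2 * h) for p, q in zip(F(xp), F(xm))])
    return [[columns[j][i] for j in range(len(x))] for i in range(m)]


def matmul(A, B):
    """Matrix product, i.e. the matrix of the composite linear map."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def g(x):
    """A sample map from R^2 to R^3."""
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]


def f(u):
    """A sample map from R^3 to R^2."""
    return [u[0] + u[1] * u[2], math.exp(u[0]) - u[2]]


def compose(x):
    """The composite f o g, from R^2 to R^2."""
    return f(g(x))


if __name__ == "__main__":
    x0 = [0.5, -1.2]
    print(jacobian(compose, x0))
    print(matmul(jacobian(f, g(x0)), jacobian(g, x0)))  # should nearly agree
```

The two printed matrices agree only up to the finite-difference error, which is why any comparison should use a tolerance rather than exact equality.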
52.9 Remark. The fact that for linear transformations we write $Tx$ instead of $T(x)$ and $ST$ instead of $S \circ T$ hides some subtleties in the computation above. For example, if $S$ denotes $f'(u_0)$, $T$ denotes $g'(x_0)$, and $h$ denotes $x - x_0$, going from the first to the second line of (32), we calculate
\[ S\bigl( T(h) + \varepsilon_1(x)|h| \bigr) = S(T(h)) + S(\varepsilon_1(x)|h|) = (S \circ T)(h) + S(\varepsilon_1(x))\,|h|, \]
where we have "taken out" the scalar $|h|$.
52.10 Theorem. Let $f$ and $g$ be functions defined and differentiable on an open set containing $x_0 \in \mathbb{R}^n$ with values in $\mathbb{R}^m$ and let $c \in \mathbb{R}$. Then $f + g$ and $cf$ are differentiable at $x_0$ and $(f + g)'(x_0) = f'(x_0) + g'(x_0)$ and $(cf)'(x_0) = c f'(x_0)$.
52.11 Theorem. Let $f = (f_1, \dots, f_m)$ map an open set $G$ of $\mathbb{R}^n$ to $\mathbb{R}^m$. Then $f$ is differentiable at $x$ iff each $f_i$ is differentiable at $x$. In this case $f'(x) = (f_1'(x), \dots, f_m'(x))$.
Proof. Note that for variety, we have fixed $x$ in the domain of $f$.
Assume $f_i$ is differentiable at $x$ for each $i = 1, \dots, m$ and let $T_i = f_i'(x)$. Then, there exist $\varepsilon_1, \dots, \varepsilon_m$ such that $\varepsilon_i(h) \to 0$ as $h \to 0$ and $f_i(x + h) - f_i(x) = T_i h + \varepsilon_i(h)|h|$. Let $e_1, \dots, e_m$ be the standard basis vectors of $\mathbb{R}^m$. Then,
\[ f(x + h) - f(x) = \sum_{i=1}^{m} (f_i(x + h) - f_i(x))\, e_i = \sum_{i=1}^{m} (T_i h)\, e_i + \sum_{i=1}^{m} \varepsilon_i(h)|h|\, e_i. \]
Thus, $f(x + h) = f(x) + (T_1 h, \dots, T_m h) + \varepsilon(h)|h|$, where $\varepsilon(h) = (\varepsilon_1(h), \dots, \varepsilon_m(h)) \to 0$ as $h \to 0$. Thus, $f$ is differentiable at $x$ and $f'(x)$ is the linear transformation $T$ whose components are $T_1, \dots, T_m$, as required.
The converse can be simply established by direct calculation, but it is fun to see that it follows by the chain rule from other results. Indeed, if $f$ is differentiable at $x$, $f_i = \pi_i \circ f$, where $\pi_i$ is the projection of $\mathbb{R}^m$ onto its $i$th coordinate. Thus,
\[ f_i'(x) = \pi_i'(f(x)) \circ f'(x). \]
But $\pi_i$ is linear, so $\pi_i'(f(x)) = \pi_i$, and we get $f_i'(x) = \pi_i \circ f'(x)$, the $i$th coordinate of $f'(x)$.
Directional derivatives and partial derivatives. A lot of information about derivatives of
a vector function f of a vector variable can be obtained from functions of a real variable,
by looking at the behaviour of f along a straight line.
Let $u$ be a non-zero element of $\mathbb{R}^n$, $f$ a function defined in a neighbourhood of $x \in \mathbb{R}^n$ with values in $\mathbb{R}^m$. The $u$-directional derivative of $f$ at $x$, or derivative of $f$ at $x$ in the direction $u$, is
\[ D_u f(x) = \lim_{t \to 0} \frac{f(x + tu) - f(x)}{t}, \]
provided this limit exists. Notice that this is the derivative at 0 of the function of a real variable, $f \circ \ell$, where $\ell(t) = x + tu$, for all $t \in \mathbb{R}$, which parametrizes a straight line through the point $x$ in the direction $u$. In the special case $u = e_j$, the $j$th basis vector, $D_{e_j} f(x)$ is called the $j$th partial derivative at $x$ and denoted $D_j f(x)$ or $\frac{\partial f(x)}{\partial x_j}$. Notice that in this case,
\[ D_j f(x) = \lim_{t \to 0} \frac{f(x_1, \dots, x_{j-1}, x_j + t, x_{j+1}, \dots, x_n) - f(x_1, \dots, x_n)}{t}, \]
the derivative at $x_j$ of the function $s \mapsto g(s)$ obtained from $f$ by fixing all the $x_k$, for $k \neq j$, and replacing $x_j$ by the variable $s$.
52.12 Theorem. Let $u$ be a non-zero element of $\mathbb{R}^n$, $f$ a function defined in a neighbourhood of $x \in \mathbb{R}^n$ with values in $\mathbb{R}^m$. If $f$ is differentiable at $x$, then the $u$-directional derivative of $f$ at $x$ is $f'(x)u$, the value of the linear transformation $f'(x)$ at $u$.
Proof. Let $\ell(t) = x + tu$, for all $t \in \mathbb{R}$. Then, by the chain rule for differentiation of transformations,
\[ D_u f(x)\,1 = (f \circ \ell)'(0)\,1 = (f'(\ell(0)) \circ \ell'(0))(1) = f'(\ell(0))(\ell'(0)\,1). \]
But $\ell(0) = x$, and $\ell'(0)\,1 = u$, so
\[ D_u f(x) = f'(x)u. \]
The reason we put the 1 in the computation in the above proof is that we were using the chain rule for transformations (i.e. vector functions of a vector variable), and the directional derivative was defined as a vector, not as a linear transformation. Multiplying by 1 amounts to evaluating the corresponding linear transformation at 1.
Proof using direct calculation.
$$\frac{f(x + tu) - f(x)}{t} = \frac{f'(x)(tu) + \varepsilon_1(tu)\,|tu|}{t} = f'(x)u + \varepsilon_1(tu)\,\frac{|tu|}{t},$$
where $\varepsilon_1(tu) \to 0$ as $t \to 0$. Since the absolute value of the scalar $\frac{|tu|}{t}$ is constantly $|u|$,
we have
$$D_u f(x) = \lim_{t \to 0} \frac{f(x + tu) - f(x)}{t} = f'(x)u.$$
52.13 Remark. It follows that if $s$ is a non-zero scalar, $D_{su} f(x) = s D_u f(x)$. Many people require
that $u$ be a unit vector and leave directional derivatives undefined otherwise; others define
the directional derivative in terms of $u/|u|$. Of course, this means that formulas must be
adjusted by the factor $1/|u|$.
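Theorem 52.12 and the scaling rule of the remark can be checked numerically. The following is only a sketch under stated assumptions: the sample map `f` and its hand-computed derivative `jacobian_times` are illustrative choices, not taken from the text, and the directional derivative is approximated by a symmetric difference quotient.

```python
import math

def f(x):
    """Sample map R^2 -> R^2 (an assumption, chosen only for illustration)."""
    x1, x2 = x
    return (x1 * x1 * x2, math.sin(x1) + x2)

def jacobian_times(x, u):
    """f'(x)u, computed from the hand-derived partial derivatives of the sample f."""
    x1, x2 = x
    u1, u2 = u
    return (2 * x1 * x2 * u1 + x1 * x1 * u2, math.cos(x1) * u1 + u2)

def directional_derivative(func, x, u, t=1e-6):
    """Symmetric difference quotient approximating D_u func(x)."""
    plus = func(tuple(xi + t * ui for xi, ui in zip(x, u)))
    minus = func(tuple(xi - t * ui for xi, ui in zip(x, u)))
    return tuple((p - m) / (2 * t) for p, m in zip(plus, minus))

x = (1.0, 2.0)
u = (3.0, -1.0)          # deliberately not a unit vector
s = 2.5
su = tuple(s * ui for ui in u)

d_u = directional_derivative(f, x, u)
d_su = directional_derivative(f, x, su)
exact = jacobian_times(x, u)

print(d_u, exact)                          # D_u f(x) against f'(x)u
print(d_su, tuple(s * c for c in d_u))     # D_{su} f(x) against s D_u f(x)
```

The two printed pairs should agree up to the error of the difference quotient; agreement at one point is of course evidence, not a proof.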
The gradient. In the case of a real valued function $f$ of a vector variable, we saw that
there exists a vector $a \in \mathbb{R}^n$ such that $f'(x)h = a \cdot h$, for all $h \in \mathbb{R}^n$. We will now identify
that $a$. If $a = (a_1, \dots, a_n)$, then the $j$th coordinate of $a$ is simply
$$a_j = a \cdot e_j,$$
where $e_j$ is the $j$th standard basis vector. Thus, in the present case,
$$a_j = f'(x)e_j = D_j f(x) = \frac{\partial f(x)}{\partial x_j},$$
and $a = (D_1 f(x), \dots, D_n f(x))$.
The vector $(D_1 f(x), \dots, D_n f(x))$, defined whenever each of the partials $D_1 f(x), \dots, D_n f(x)$
exists, is called the gradient of $f$ at $x$ and denoted $\nabla f(x)$ or $\operatorname{grad} f(x)$. Thus, if a real
valued $f$ is differentiable at $x$, then $f'(x)h = \operatorname{grad} f(x) \cdot h$, for all $h \in \mathbb{R}^n$. Often, in this
setting, one uses the notation $dx = (dx_1, \dots, dx_n)$ instead of $h$ and this becomes
$$f'(x)\,dx = \operatorname{grad} f(x) \cdot dx = \frac{\partial f(x)}{\partial x_1}\,dx_1 + \dots + \frac{\partial f(x)}{\partial x_n}\,dx_n.$$
52.14 Example. The existence of the gradient does not imply differentiability, not even
continuity. Let
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2}, & \text{if } (x_1, x_2) \neq 0 \\ 0, & \text{if } (x_1, x_2) = 0. \end{cases}$$
Then, along the line $x_2 = 0$, $f(x)$ is constantly $0$, so $D_1 f(0) = 0$, and similarly
$D_2 f(0) = 0$, so $\nabla f(0) = (0, 0)$, but as $x \to 0$ along the line $x_1 = x_2$, $f(x) \to 1/2$, so $f$
is not continuous at $0$. You can check this using $\varepsilon, \delta$ arguments or you can look at the same
phenomenon by using a composition with the function $\ell \colon t \mapsto t(1, 1)$ which describes the
line $x_1 = x_2$:
$$\lim_{t \to 0} f(t, t) = \lim_{t \to 0} \frac{t^2}{t^2 + t^2} = 1/2.$$
In this example, we see that the derivative in the direction of $(1, 1)$ does not exist. One
might be tempted to believe that if all directional derivatives existed and were equal, then
$f$ would be differentiable, but this fails also.
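The behaviour described in Example 52.14 is easy to observe numerically. The sketch below evaluates the difference quotients at the origin along the two coordinate axes and along the diagonal; the list of step sizes `steps` is an arbitrary choice made for illustration.

```python
def f(x1, x2):
    """The function of Example 52.14."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 * x1 + x2 * x2)

def quotient(u1, u2, t):
    """Difference quotient (f(0 + t u) - f(0)) / t at the origin."""
    return (f(t * u1, t * u2) - f(0.0, 0.0)) / t

steps = [10.0 ** (-k) for k in range(1, 7)]

partial_1 = [quotient(1.0, 0.0, t) for t in steps]           # quotients for D_1 f(0)
partial_2 = [quotient(0.0, 1.0, t) for t in steps]           # quotients for D_2 f(0)
on_diagonal = [f(t, t) for t in steps]                       # values of f along x1 = x2
diagonal_quotients = [quotient(1.0, 1.0, t) for t in steps]  # behaves like 1/(2t)

print(partial_1)
print(partial_2)
print(on_diagonal)
print(diagonal_quotients)
```

The coordinate quotients vanish identically, while the values on the diagonal stay at $1/2$ and the diagonal quotients blow up, in line with the claim that the $(1, 1)$-directional derivative does not exist.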
52.15 Example (The existence of all directional derivatives does not imply continuity).
Let $f \colon \mathbb{R}^2 \to \mathbb{R}$ be given by
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1^2 x_2}{x_1^4 + x_2^2}, & \text{if } (x_1, x_2) \neq 0 \\ 0, & \text{if } (x_1, x_2) = 0. \end{cases}$$
Fix $u = (u_1, u_2) \neq 0$. If $u_2 = 0$, we get
$$D_u f(0, 0) = \lim_{t \to 0} \frac{t^2 u_1^2 \cdot 0}{t^4 u_1^4 + 0} \Big/ t = 0,$$
while if $u_2 \neq 0$,
$$D_u f(0, 0) = \lim_{t \to 0} \frac{t^3 u_1^2 u_2}{t\,(t^4 u_1^4 + t^2 u_2^2)} = \frac{u_1^2}{u_2}.$$
Thus, all directional derivatives exist at $(0, 0)$. This implies that the restriction of $f$ to each
straight line through the origin is continuous at $0$. Nevertheless, $f$ is not continuous at $0$,
for if we follow the curve $\gamma \colon t \mapsto (t, t^2)$, we have
$$\lim_{t \to 0} f(\gamma(t)) = \lim_{t \to 0} \frac{t^4}{t^4 + t^4} = 1/2 \neq 0.$$
The matrix of the derivative. The Jacobian. We recall from Linear Algebra that every
vector $x \in \mathbb{R}^n$ has a column matrix representation
$$[x] = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},$$
where the $x_j$ are the coordinates of $x$ with respect to the standard basis: $x = (x_1, \dots, x_n) = \sum_{j=1}^n x_j e_j$. The
mapping $x \mapsto [x]$ is a vector space isomorphism, and often one identifies $x$ with $[x]$.
Moreover, every linear transformation $T \colon \mathbb{R}^n \to \mathbb{R}^m$ has a matrix
$$[T] = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix}$$
which satisfies
$$[Tx] = [T][x] \quad \text{(matrix multiplication)}.$$
The columns of $[T]$ are the column representations of the image vectors $Te_1, \dots, Te_n$.
For the reader who has not seen that development, or would like a review, here is a brief
version. Let $e_1, \dots, e_n$ denote the standard basis vectors in $\mathbb{R}^n$, and let $\bar e_1, \dots, \bar e_m$ be the
standard basis vectors in $\mathbb{R}^m$. Then, $x = \sum_{j=1}^n x_j e_j$, and
$$Tx = \sum_{j=1}^n x_j\,T e_j.$$
For each $j$, $T e_j = \sum_{i=1}^m a_{ij} \bar e_i$, for some numbers $a_{ij}$, $i = 1, \dots, m$. Thus,
$$Tx = \sum_{j=1}^n x_j \sum_{i=1}^m a_{ij} \bar e_i = \sum_{i=1}^m \Bigl( \sum_{j=1}^n a_{ij} x_j \Bigr) \bar e_i.$$
Thus, the coordinates of $Tx$ are $\sum_{j=1}^n a_{ij} x_j$, $i = 1, \dots, m$. In terms of matrix multiplication this says
$$[Tx] = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$
Now, in case $T$ is $f'(x)$, the derivative at $x$ of a transformation taking a neighbourhood
of $x \in \mathbb{R}^n$ to $\mathbb{R}^m$, for each $j$, $f'(x)e_j$ is the $e_j$-directional derivative; that is, the partial
derivative $D_j f(x)$. Its coordinates with respect to $\bar e_1, \dots, \bar e_m$ are just obtained by differentiating the coordinates of $f$ at $x$. Thus, the $i$th coordinate of $f'(x)e_j$ is $D_j f_i(x)$, and
the matrix $[f'(x)]$ which represents the linear transformation $f'(x)$ is
$$[f'(x)] = \begin{bmatrix} D_1 f_1(x) & \dots & D_n f_1(x) \\ \vdots & & \vdots \\ D_1 f_m(x) & \dots & D_n f_m(x) \end{bmatrix}.$$
This matrix is often called the Jacobian matrix of $f$ at $x$. If $m = n$, the matrix is
invertible if and only if its determinant is non-zero. This determinant $\det[f'(x)]$ is called
the Jacobian of $f$ at $x$, sometimes denoted $\frac{\partial(f_1, \dots, f_n)}{\partial(x_1, \dots, x_n)}$.
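As an illustration of the Jacobian matrix, here is a minimal sketch that approximates each entry $D_j f_i(x)$ by a central difference. The polar-coordinate map `polar` is an assumed example (it does not appear in the text); for it the Jacobian determinant is known to equal $r$.

```python
import math

def polar(p):
    """Sample map (r, theta) -> (r cos theta, r sin theta); an illustrative choice."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(func, x, t=1e-6):
    """Approximate [f'(x)]: entry (i, j) is a central difference for D_j f_i(x)."""
    n = len(x)
    columns = []
    for j in range(n):
        plus = list(x)
        minus = list(x)
        plus[j] += t
        minus[j] -= t
        fp, fm = func(tuple(plus)), func(tuple(minus))
        # column j holds the coordinates of the j-th partial derivative D_j f(x)
        columns.append([(a - b) / (2 * t) for a, b in zip(fp, fm)])
    m = len(columns[0])
    # transpose so that rows are indexed by the coordinate i of f
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def det2(a):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

point = (2.0, math.pi / 6)
J = jacobian(polar, point)
print(J)
print(det2(J))   # should be close to r = 2.0
```

The columns of the computed matrix are the partial derivatives $D_j f(x)$, matching the description of the columns of $[f'(x)]$ above.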
Continuous differentiability. A function $f \colon G \subset \mathbb{R}^n \to \mathbb{R}^m$, $G$ open, is continuously differentiable at $a$ if $f'$ exists in a neighbourhood of $a$ and $f'$ is continuous at $a$ as a map from that neighbourhood to $L(\mathbb{R}^n, \mathbb{R}^m)$, with the distance given by the operator norm:
$d(T, S) = \|T - S\|$.
Recall that we showed that if $e_1, \dots, e_n$ are the standard basis vectors in $\mathbb{R}^n$, then
$\|T\|^2 \le \sum_j |T e_j|^2$. If $[T] = (a_{ij})$, the right-hand side here is $\sum_j \sum_i a_{ij}^2$. Thus, if $T$ and
$S$ are linear transformations with matrices $A = (a_{ij})$ and $B = (b_{ij})$, then
$$\|T - S\| \le \Bigl( \sum_{ij} (a_{ij} - b_{ij})^2 \Bigr)^{1/2}. \tag{33}$$
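Inequality (33) can be probed by random sampling. The sketch below is not a proof: it draws two random matrices (sizes and seed are arbitrary assumptions) and checks that the ratio $|(T - S)x| / |x|$, which is bounded above by the operator norm $\|T - S\|$, never exceeds the right-hand side of (33).

```python
import math
import random

def apply(matrix, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def frobenius_distance(A, B):
    """Right-hand side of (33): square root of the sum of (a_ij - b_ij)^2."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B)
                         for a, b in zip(ra, rb)))

random.seed(0)
m, n = 3, 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
bound = frobenius_distance(A, B)

worst_ratio = 0.0
for _ in range(10000):
    x = [random.gauss(0, 1) for _ in range(n)]
    diff = [p - q for p, q in zip(apply(A, x), apply(B, x))]
    worst_ratio = max(worst_ratio, norm(diff) / norm(x))

print(worst_ratio, bound)   # the first number should not exceed the second
```

The sampled ratio only gives a lower estimate of $\|T - S\|$, so the check is one-sided; it illustrates the bound rather than computing the operator norm.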
Even though existence of all the partials doesn’t imply differentiability in general, it does
so if the partials are continuous.
52.16 Theorem. For a function $f$ defined (at least) in an open set $G$ of $\mathbb{R}^n$, with values
in $\mathbb{R}^m$, $f$ is continuously differentiable in $G$ if and only if for all $i, j$ the partial derivative
$D_j f_i$ exists and is continuous in $G$.
Proof. ($\Longrightarrow$) Let $f$ be continuously differentiable in $G$ and let $a \in G$. Then, for all $x$ in $G$,
each partial derivative exists with $D_j f_i(x) = (f'(x)e_j) \cdot \bar e_i$, where $\bar e_i$ denotes the $i$th basis
vector in $\mathbb{R}^m$. Thus,
$$|D_j f_i(x) - D_j f_i(a)| = |f'(x)e_j \cdot \bar e_i - f'(a)e_j \cdot \bar e_i| = |(f'(x) - f'(a))e_j \cdot \bar e_i| \le \|f'(x) - f'(a)\|\,|e_j|\,|\bar e_i|.$$
As $x \to a$, $f'(x) \to f'(a)$, by continuity of $f'$, so $D_j f_i$ is continuous at $a$.
($\Longleftarrow$) Conversely, assume the partial derivatives are continuous in $G$. Once we show
differentiability, the continuity of the derivative will follow from
$$\|f'(x) - f'(a)\| \le \Bigl( \sum_{ij} (D_j f_i(x) - D_j f_i(a))^2 \Bigr)^{1/2},$$
which is the application of (33) to this situation.
Without loss of generality, assume $m = 1$. Let $a \in G$. Let $\varepsilon > 0$. Choose $\delta > 0$ so
small that $B(a, \delta) \subset G$ and for all $j$,
$$|D_j f(x) - D_j f(a)| < \varepsilon/n, \quad \text{for } |x - a| < \delta.$$
Take $h \in \mathbb{R}^n$ with $|h| < \delta$. Put $x_0 = a$, $x_k = a + \sum_{j=1}^k h_j e_j$ (so $x_n = a + \sum_{j=1}^n h_j e_j = a + h$). Then,
$$f(a + h) - f(a) = \sum_{j=1}^n \bigl( f(x_j) - f(x_{j-1}) \bigr).$$
By the Mean Value Theorem (for real functions of a real variable),
$$f(x_j) - f(x_{j-1}) = D_j f(c_j)\,h_j,$$
for some $c_j$ on the line from $x_{j-1}$ to $x_j$. Thus,
$$f(a + h) - f(a) = \sum_{j=1}^n D_j f(c_j)\,h_j,$$
so
$$\Bigl| f(a + h) - f(a) - \sum_{j=1}^n D_j f(a)\,h_j \Bigr| = \Bigl| \sum_{j=1}^n D_j f(c_j)\,h_j - \sum_{j=1}^n D_j f(a)\,h_j \Bigr| \le \sum_{j=1}^n |D_j f(c_j) - D_j f(a)|\,|h_j| \le \sum_{j=1}^n \frac{\varepsilon}{n}\,|h_j| \le \varepsilon |h|.$$
Thus, by definition, $f$ is differentiable at $a$ with $f'(a)h = \sum_j D_j f(a)\,h_j = \nabla f(a) \cdot h$.
52.1. Let $f$ be defined in a neighbourhood of $x$ in $\mathbb{R}^n$ with values in $\mathbb{R}^m$. If $\gamma \colon t \mapsto a + tu$ and
$\gamma(t_0) = x$, then $D_u f(x)$ is the derivative of $f \circ \gamma$ at $t_0$.
52.2. Directly from the definition, prove that the function $f \colon \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x_1, x_2) = \sin x_1$
is differentiable at each point $a = (a_1, a_2)$ of $\mathbb{R}^2$.
52.3. Let $f \colon \mathbb{R}^2 \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} \dfrac{x_1 |x_2|}{\sqrt{x_1^2 + x_2^2}}, & x \neq 0 \\ 0, & x = 0. \end{cases}$$
Prove that $f$ has all directional derivatives at $0$, but is not differentiable at $0$.
52.4. Let $\ast$ be a bilinear product from $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^k$. Prove there exists $M \in \mathbb{R}$ with $|u \ast v| \le M|u||v|$, for all $u \in \mathbb{R}^n$, $v \in \mathbb{R}^m$.
52.5. Prove the product rule for differentiation of vector functions of a vector variable: Let $G$ be
an open subset of $\mathbb{R}^p$, $x_0 \in G$, $f \colon G \to \mathbb{R}^n$, $g \colon G \to \mathbb{R}^m$, $\ast$ a bilinear product from
$\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^k$. If $f$ and $g$ are differentiable at $x_0$, so is $f \ast g$, with
$D(f \ast g)(x_0)h = (Df(x_0)h) \ast g(x_0) + f(x_0) \ast (Dg(x_0)h)$.
52.6. Let $f$ be a real valued function on an open set $G$ of $\mathbb{R}^2$ such that the partial derivatives $D_1 f$
and $D_2 f$ exist and are bounded in $G$. Prove that $f$ is continuous. (Suggestion: mimic the proof
that continuous partials imply differentiability.)
53. THE INVERSE FUNCTION THEOREM
For real functions of a real variable, we learned that if $f'(x) \neq 0$, for all $x$ in an
open interval $I$, then $f$ is strictly monotone, hence injective, so the inverse function $f^{-1}$
is defined everywhere on $f(I)$, and this inverse function itself is also differentiable. If
we knew only that $f'(a)$ were non-zero at one point $a$, and if we assumed that $f'$ were
continuous, then we could still find a neighbourhood $U$ of $a$ such that $f'(u) \neq 0$ for all
$u \in U$, so the result would still hold for $f$ restricted to $U$. This is the version of the result
we will develop for the higher dimensional case.
The general idea of the theorem is that a derivative gives a local approximation to a
function at a point; if the derivative at the point is invertible, then the function is also invertible (near that point). The significance of the condition $f'(a) \neq 0$ is that it is equivalent to
the invertibility of $f'(a)$. That is the condition we will have to impose on the general case.
53.1 The Inverse Function Theorem. Let $f$ be a continuously differentiable mapping of
an open subset $G$ of $\mathbb{R}^n$ to $\mathbb{R}^n$. If $a \in G$ with $f'(a)$ invertible and $b = f(a)$, then
(1) there exist open sets $U$ and $V$ such that $a \in U$, $b \in V$, and $f$ maps $U$ one-to-one
onto $V$;
(2) if $g$ is the inverse of the restriction of $f$ to $U$, then $g$ is continuously differentiable
on $V$ and
$$g'(y) = [f'(g(y))]^{-1}.$$
53.2 Note. Unlike the case of functions on $\mathbb{R}$ to $\mathbb{R}$, there is no hope for invertibility on
all of $G$, even if $f'(x)$ is invertible for all $x \in G$. You can check this by looking at the
mapping $f \colon \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$$f(x_1, x_2) = (e^{x_1} \cos x_2, e^{x_1} \sin x_2).$$
This is actually the exponential function, viewed as a map on $\mathbb{R}^2$ to $\mathbb{R}^2$ instead of $\mathbb{C}$ to $\mathbb{C}$.
It is continuously differentiable at every point and its derivative (in the sense of transformation) is invertible, but $f$ is far from one-to-one. Its inverse, “the complex logarithm”, has
infinitely many “branches”.
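The failure of global injectivity in Note 53.2 can be seen directly. In this small sketch the sample points `p` and `q` are arbitrary choices differing by $2\pi$ in the second coordinate; the determinant is computed from the partial derivatives of $f$ and simplifies to $e^{2x_1}$, which is never zero.

```python
import math

def f(x1, x2):
    """The map of Note 53.2: the complex exponential viewed on R^2."""
    return (math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2))

def jacobian_det(x1, x2):
    """det[f'(x)] from the partial derivatives; it simplifies to e^{2 x1}."""
    e = math.exp(x1)
    a, b = e * math.cos(x2), -e * math.sin(x2)   # first row of [f'(x)]
    c, d = e * math.sin(x2), e * math.cos(x2)    # second row of [f'(x)]
    return a * d - b * c

p = (0.3, 1.0)
q = (0.3, 1.0 + 2 * math.pi)   # a different point with the same image

print(f(*p), f(*q))
print(jacobian_det(*p), jacobian_det(*q))
```

Both determinants are positive, so $f'$ is invertible at both points, yet $f(p)$ and $f(q)$ coincide up to rounding.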
53.3 Theorem. Let $\Omega$ be the set of all invertible linear transformations of $\mathbb{R}^n$ to itself.
(1) If $T \in \Omega$ and $S \in L(\mathbb{R}^n, \mathbb{R}^n)$ with
$$\|S - T\| < 1/\|T^{-1}\|,$$
then $S$ is also invertible, so $\Omega$ is open.
(2) The map $T \mapsto T^{-1}$ on $\Omega$ onto itself is continuous.
Proof. (1) Let $T$ be invertible and $\alpha = 1/\|T^{-1}\|$. Then
$$|x| = |T^{-1}Tx| \le \|T^{-1}\|\,|Tx| \le \|T^{-1}\|\,|(T - S)x + Sx| \le \|T^{-1}\|\,\bigl(\|T - S\|\,|x| + |Sx|\bigr).$$
Thus,
$$(\alpha - \|T - S\|)\,|x| \le |Sx|. \tag{$\ast$}$$
Now, for all $S$ with $\|S - T\| < \alpha$, we deduce from ($\ast$) that $Sx = 0$ implies $x = 0$, so $S$
is invertible. Thus, $\Omega$ is open.
(2) Replacing $x$ by $S^{-1}y$ in ($\ast$) gives
$$|S^{-1}y| \le \frac{1}{\alpha - \|T - S\|}\,|y|,$$
so that $\|S^{-1}\| \le 1/(\alpha - \|T - S\|) < 2/\alpha = 2\|T^{-1}\|$, provided $\|T - S\| < \alpha/2$.
Also,
$$\|S^{-1} - T^{-1}\| = \|S^{-1}TT^{-1} - S^{-1}ST^{-1}\| \le \|S^{-1}\|\,\|T - S\|\,\|T^{-1}\| \le 2\|T^{-1}\|^2\,\|T - S\|,$$
when $\|T - S\| < \alpha/2$. It follows that the inversion map is continuous at $T$.
53.4 Mean Value Inequality. Let $U$ be an open subset of $\mathbb{R}^n$ and let $f \colon U \to \mathbb{R}^m$ be
differentiable. If $U$ contains the line segment from $a$ to $b$, then there is a point $c$ on that
segment with
$$|f(b) - f(a)| \le \|f'(c)\|\,|b - a|.$$
If $\|f'(x)\| \le K < +\infty$ for $x \in U$ and $U$ is convex, that is, contains the line segment
joining each pair of its points, then the result entails $f$ is Lipschitz.
Proof. We have proved elsewhere the corresponding result for vector functions of a real
variable (see Theorem 50.7). We will reduce the present situation to that case.
Let $\gamma(t) = (1 - t)a + tb$, so that as $t$ traverses the interval $[0, 1]$, $\gamma(t)$ traverses the line
segment from $a$ to $b$. Let $g = f \circ \gamma$. According to the chain rule,
$$g'(t) = f'(\gamma(t))\,\gamma'(t) = f'(\gamma(t))(b - a).$$
Then, $|g'(t)| \le \|f'(\gamma(t))\|\,|b - a|$, so according to the real variable Mean Value Inequality,
for some $t \in (0, 1)$,
$$|g(1) - g(0)| \le |g'(t)|\,|1 - 0|;$$
hence, putting $c = \gamma(t)$,
$$|f(b) - f(a)| \le \|f'(c)\|\,|b - a|,$$
as required.
We are now ready for the proof of the inverse function theorem.
Proof of the Inverse Function Theorem. Let $f'(a)$ be invertible and denote it by $T$. Since
$f'$ is continuous at $a$, there exists an open ball $U$ centred at $a$ with
$$\|f'(x) - T\| \le \frac{1}{2\|T^{-1}\|} \quad \text{for all } x \in U.$$
Then, we see that $f'(x)$ is also invertible, though we won’t make use of that yet.
For a fixed $y \in \mathbb{R}^n$, define for all $x \in U$,
$$\varphi(x) = \varphi_y(x) = x + T^{-1}(y - f(x)).$$
Notice that $y = f(x)$ if and only if $\varphi(x) = x$; that is, if and only if $x$ is a fixed point of $\varphi$.
Differentiating gives
$$\varphi'(x) = I + T^{-1}(-f'(x)) = T^{-1}(T - f'(x));$$
hence,
$$\|\varphi'(x)\| \le \|T^{-1}\|\,\|T - f'(x)\| \le 1/2.$$
Since $U$ is convex, we can use the Mean Value Inequality, obtaining
$$|\varphi(x_1) - \varphi(x_2)| \le \tfrac12 |x_1 - x_2|, \tag{34}$$
for all $x_1, x_2 \in U$, showing that $\varphi$ is a contraction mapping. It follows that $\varphi$ can have at
most one fixed point. Thus, there is at most one point $x$ with $y = f(x)$. This shows that
$f$ is one-to-one on $U$.
Let $V = f(U)$. To show that $V$ is also open, let $y_0 \in V$. Let $x_0$ be such that $f(x_0) = y_0$.
Choose $r > 0$ so small that the closed ball $\bar B(x_0, r)$ is contained in $U$. We will show
that $V$ contains the ball centred at $y_0$, radius $r/(2\|T^{-1}\|)$.
So, let $|y - y_0| \le r/(2\|T^{-1}\|)$. Using the contraction mapping $\varphi = \varphi_y$, defined above,
we compute
$$|\varphi(x_0) - x_0| = |T^{-1}(y - f(x_0))| \le \|T^{-1}\|\,|y - f(x_0)| \le r/2,$$
so that for $x \in \bar B(x_0, r)$,
$$|\varphi(x) - x_0| \le |\varphi(x) - \varphi(x_0)| + |\varphi(x_0) - x_0| \le \tfrac12 |x - x_0| + \frac{r}{2} \le r.$$
Thus $\varphi$ maps $\bar B(x_0, r)$ into itself.
The closed ball $\bar B(x_0, r)$ is a complete metric space, since it is closed and $\mathbb{R}^n$ is complete.
Thus, by the Contraction Mapping Theorem (Banach’s fixed point theorem), $\varphi$ has a fixed
point $x$. Thus, $f(x) = y$, so that $y \in V$. This completes the proof that $V$ is open.
Now, let $g$ be the inverse of the restriction of $f$ to $U$. Our job is to show that $g$ is
continuously differentiable in $V$, with the expected formula for its derivative.
The inequality (34) doesn’t actually depend on the particular $y$. Indeed,
$$\varphi(x_1) - \varphi(x_2) = x_1 - x_2 - T^{-1}(f(x_1) - f(x_2)),$$
so (34) becomes
$$|x_1 - x_2 - T^{-1}(f(x_1) - f(x_2))| \le \tfrac12 |x_1 - x_2|,$$
and hence
$$\tfrac12 |x_1 - x_2| \le |T^{-1}(f(x_1) - f(x_2))|.$$
For $y_1, y_2 \in V$, we may replace $x_1$ and $x_2$ by $g(y_1)$, $g(y_2)$, obtaining
$$|g(y_1) - g(y_2)| \le 2|T^{-1}(y_1 - y_2)| \le 2\|T^{-1}\|\,|y_1 - y_2|.$$
Now, to show $g$ is differentiable at any point $y_0 \in V$, let $y \in V$ also, and put $x_0 = g(y_0)$,
$x = g(y)$, $S = [f'(x_0)]^{-1}$. Then, there is $\varepsilon(x) \to \varepsilon(x_0) = 0$ with
$$\begin{aligned} g(y) - g(y_0) - S(y - y_0) &= x - x_0 - S(y - y_0) \\ &= -S\bigl(y - y_0 - f'(x_0)(x - x_0)\bigr) \\ &= -S\bigl(f(x) - f(x_0) - f'(x_0)(x - x_0)\bigr) \\ &= -S\bigl(\varepsilon(x)\,|x - x_0|\bigr) \\ &= -S\bigl(\varepsilon(g(y))\bigr)\,|g(y) - g(y_0)|. \end{aligned}$$
As $y \to y_0$, $S(\varepsilon(g(y))) \to 0$ and $|g(y) - g(y_0)| \le 2\|T^{-1}\|\,|y - y_0|$, so this shows $g$ is
differentiable at $y_0$, with derivative
$$g'(y_0) = S = [f'(g(y_0))]^{-1}.$$
Finally, $g$ is continuous, $f'$ is continuous by hypothesis, and the inversion map $T \mapsto T^{-1}$
is continuous on the set $\Omega$ of all invertible linear operators on $\mathbb{R}^n$, so $g'$, which is
the composite of these 3, is continuous on $V$, so $g$ is also continuously differentiable,
and we are done.
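The proof is constructive: the inverse is obtained as the fixed point of the contraction $\varphi_y(x) = x + T^{-1}(y - f(x))$, and the Contraction Mapping Theorem says that iterating $\varphi_y$ converges to it. The sketch below runs that iteration for the exponential map of Note 53.2 near $a = (0, 0)$, where $T = f'(a)$ is the identity. The choice of map, the starting point, the target `y` and the iteration count are all assumptions made for illustration.

```python
import math

def f(x):
    """The map of Note 53.2, used here as a sample f with f'(0, 0) = I."""
    x1, x2 = x
    return (math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2))

def local_inverse(y, a=(0.0, 0.0), iterations=60):
    """Iterate phi_y(x) = x + T^{-1}(y - f(x)), starting from a.

    For this f and a = (0, 0), T = f'(a) is the identity matrix,
    so T^{-1} is omitted from the update.
    """
    x = a
    for _ in range(iterations):
        fx = f(x)
        x = (x[0] + (y[0] - fx[0]), x[1] + (y[1] - fx[1]))
    return x

y = (1.1, 0.1)          # a point near b = f(a) = (1, 0)
x = local_inverse(y)
print(x, f(x))          # f(x) should reproduce y
```

For this particular $f$ the local inverse is the principal branch of the complex logarithm, so the result can be compared with $(\log|y|, \arg y)$. The iteration is only guaranteed to converge for $y$ close enough to $b$, as in the proof.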
53.5 Corollary. If $f$ is a continuously differentiable mapping of an open set $G$ of $\mathbb{R}^n$ into
$\mathbb{R}^n$, with $f'(x)$ invertible for all $x$, then $f$ is an open mapping; that is, $f(W)$ is open, for
each open subset $W$ of $G$.
The proof is an exercise.
54. THE IMPLICIT FUNCTION THEOREM
The Inverse Function Theorem gives conditions under which we can solve an equation
of the form $y = f(x)$ “for $x$ in terms of $y$”, where $x, y$ are variables in $\mathbb{R}^n$. Equivalently,
for systems of $n$ equations
$$\begin{aligned} y_1 &= f_1(x_1, \dots, x_n) \\ &\;\;\vdots \\ y_n &= f_n(x_1, \dots, x_n), \end{aligned}$$
involving real variables $x_1, \dots, x_n, y_1, \dots, y_n$, it gives conditions under which we can
solve for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$.
We now look for conditions under which we can do similarly for more general systems
$$\begin{aligned} f_1(x_1, \dots, x_n, y_1, \dots, y_m) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \dots, x_n, y_1, \dots, y_m) &= 0. \end{aligned} \tag{$\ast$}$$
For a moment, consider the case $n = m = 1$. If $f$ is a continuously differentiable real-valued function in the plane, $f(x, y) = 0$ can be solved for $x$ in terms of $y$ in some neighbourhood
of a point $(a, b)$ such that $f(a, b) = 0$, provided $\frac{\partial f}{\partial x}(a, b) \neq 0$. A simple
familiar example consists of the equation
$$x^2 + y^2 - 1 = 0. \tag{$\ast\ast$}$$
Here $f(x, y) = x^2 + y^2 - 1$. Attempts to solve for $x$ in terms of $y$ yield
$$x = \pm\sqrt{1 - y^2}. \tag{$\ast\ast\ast$}$$
The partial derivative $\frac{\partial f}{\partial x}(x, y)$ is $2x$. This is $0$ if and only if $x = 0$. The corresponding points satisfying ($\ast\ast$) are $(0, 1)$ and $(0, -1)$. As long as we stay away from these
points, we can choose one of the 2 formulas indicated by ($\ast\ast\ast$) to obtain a valid function
compatible with ($\ast\ast$). For example, if $a > 0$ and $a^2 + b^2 - 1 = 0$, then the function
$g \colon (-1, 1) \to \mathbb{R}$ defined by $g(y) = \sqrt{1 - y^2}$ is such that $a = g(b)$ and for an element
$(x, y)$ in the open set $U = \{(x, y) : x > 0\}$,
$$x^2 + y^2 - 1 = 0 \iff x = g(y).$$
Further we notice that at such a point, $g'(y) = -y(1 - y^2)^{-1/2} = -y/x$. This is the
same as one gets by a formal calculation called “implicit differentiation”:
$$f'(x, y)(dx, dy) = \frac{\partial f(x, y)}{\partial x}\,dx + \frac{\partial f(x, y)}{\partial y}\,dy = 0,$$
$$2x\,dx + 2y\,dy = 0,$$
$$\frac{dx}{dy} = -\frac{2y}{2x} = -\frac{y}{x}.$$
Assuming we knew $g$ were differentiable at $y$, this actually reflects a correct calculation,
based on the chain rule. Indeed, we know in the open set $U$, $f(x, y) = 0$ if and only if
$g(y) = x$, so $f(g(y), y) = 0$. Let $\gamma(y) = (g(y), y)$. Since $f'(u, v)(h, k) = 2uh + 2vk$,
by the chain rule,
$$(f \circ \gamma)'(y) = f'(g(y), y)\,\gamma'(y) = 2g(y)g'(y) + 2y \cdot 1.$$
Since $f \circ \gamma = 0$, we can solve for $g'(y)$, obtaining $g'(y) = -y/g(y)$.
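The formula $g'(y) = -y/g(y)$ from implicit differentiation can be compared with a direct numerical derivative of the explicit branch $g(y) = \sqrt{1 - y^2}$. This is a minimal sketch; the sample values of $y$ and the step size `t` are arbitrary assumptions.

```python
import math

def g(y):
    """The branch x = sqrt(1 - y^2) of the circle, valid for -1 < y < 1."""
    return math.sqrt(1.0 - y * y)

def numeric_derivative(func, y, t=1e-6):
    """Central difference approximation to func'(y)."""
    return (func(y + t) - func(y - t)) / (2 * t)

for y in (-0.8, -0.3, 0.0, 0.5, 0.9):
    x = g(y)
    # left: difference quotient for g'(y); right: -y/x from implicit differentiation
    print(y, numeric_derivative(g, y), -y / x)
```

The two columns should agree to several decimal places at every sample point away from $y = \pm 1$, where $x = 0$ and the formula breaks down.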
Now, let’s turn to the general case. Treat the system of equations $f_i(x_1, \dots, x_n, y_1, \dots, y_m) = 0$, $i = 1, \dots, n$, as one equation
$$f(x, y) = 0,$$
where $f$ is defined on an open subset of $\mathbb{R}^n \times \mathbb{R}^m$. The idea is once again that the derivative
of $f$ at a point $(a, b)$ is a linear transformation which yields a local approximation to
$f$ at that point. If $T$ is a linear transformation from $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^n$, it induces linear
transformations $T_x \colon \mathbb{R}^n \to \mathbb{R}^n$ and $T_y \colon \mathbb{R}^m \to \mathbb{R}^n$ by
$$T_x(h) = T(h, 0), \qquad T_y(k) = T(0, k).$$
These satisfy $T(h, k) = T_x h + T_y k$. If $T_x$ is invertible, then for each $k \in \mathbb{R}^m$, there exists
a unique $h \in \mathbb{R}^n$ with $T(h, k) = 0$, given by $h = -T_x^{-1} T_y k$.
If $T = Df(a, b) = f'(a, b)$, then $T_x$ is the derivative at $a$ of the map $x \mapsto f(x, b)$ and $T_y$
is the derivative at $b$ of the map $y \mapsto f(a, y)$. We denote these (transformation-valued) partial
derivatives by $D_1 f(a, b)$ and $D_2 f(a, b)$.
54.1 Implicit Function Theorem. Let $f$ be a continuously differentiable map on an open
subset $G$ of $\mathbb{R}^n \times \mathbb{R}^m$ into $\mathbb{R}^n$, $(a, b) \in G$, and with $f(a, b) = 0$. If $D_1 f(a, b)$ is invertible,
then there exist open sets $U \subset \mathbb{R}^n \times \mathbb{R}^m$ and $W \subset \mathbb{R}^m$ with $(a, b) \in U$, $b \in W$ and a
map $g \colon W \to \mathbb{R}^n$ such that for $(x, y) \in U$,
$$f(x, y) = 0 \iff x = g(y).$$
Moreover, $g$ is continuously differentiable in $W$ and for all $y \in W$,
$$Dg(y) = -(D_1 f(g(y), y))^{-1} \circ D_2 f(g(y), y).$$
Proof. Let $T = Df(a, b)$. Define a map $F \colon G \to \mathbb{R}^n \times \mathbb{R}^m$ by $F(x, y) = (f(x, y), y)$.
Then $F$ is continuously differentiable, since both $f$ and $\pi \colon (x, y) \mapsto y$ are so. In
fact, $DF(x, y) = (Df(x, y), D\pi(x, y)) = (Df(x, y), \pi)$, since we can differentiate each
coordinate separately. In particular, $DF(a, b) = (T, \pi)$, the linear operator that maps
$(h, k)$ to $(T(h, k), k)$.
Now, $DF(a, b)(h, k) = 0$ implies $(T(h, k), k) = 0$, so $k = 0$, and thus $T(h, 0) = 0$.
In other words, $T_x h = 0$. But by hypothesis, $T_x = D_1 f(a, b)$ is invertible, so $h$ is also
$0$. From $DF(a, b)(h, k) = 0$, we have deduced $(h, k) = 0$; hence, $DF(a, b)$ is invertible.
This shows that $F$ satisfies the hypotheses of the Inverse Function Theorem. Therefore,
there exist $U$, an open subset of $G$ containing $(a, b)$, and $V$, an open subset of $\mathbb{R}^n \times \mathbb{R}^m$, with $F$ a bijection of
$U$ onto $V$.
Put $W = \{y \in \mathbb{R}^m : (0, y) \in V\}$. This is open, since $V$ is open. Now, $y \in W$ iff
$f(x, y) = 0$ for some $x$ with $(x, y) \in U$. Since $F$ is one-to-one on $U$, this $x$ is unique. Thus,
we define a map $g \colon W \to \mathbb{R}^n$ by letting $g(y)$ be the unique $x$ with $(x, y) \in U$ and
$f(x, y) = 0$.
Let $H$ be the inverse of the restriction of $F$ to $U$. Then, we know from the Inverse
Function Theorem that $H$ is also continuously differentiable and
$$(g(y), y) = H(0, y),$$
so $g$ is also continuously differentiable. To compute $Dg(y)$, put $\gamma(y) = (g(y), y)$, for all
$y \in W$. Then $D\gamma(y)k = (Dg(y)k, k)$, for all $k \in \mathbb{R}^m$. Since $f(\gamma(y)) = 0$, for $y \in W$,
the chain rule gives at a point $(x, y)$ where $x = g(y)$,
$$Df(x, y)(Dg(y)k, k) = 0.$$
Thus, $D_1 f(x, y)(Dg(y)k) + D_2 f(x, y)k = 0$. Solving this for the transformation $Dg(y)$
we have
$$Dg(y)k = -(D_1 f(x, y))^{-1} D_2 f(x, y)k,$$
for all $k \in \mathbb{R}^m$. In other words,
$$Dg(y) = -(D_1 f(x, y))^{-1} D_2 f(x, y).$$
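54.2 Example. Let $f$ be given by $f(x, y) = (f_1(x_1, x_2, y_1, y_2), f_2(x_1, x_2, y_1, y_2))$, where
$$\begin{cases} f_1(x_1, x_2, y_1, y_2) = x_1^2 - x_2^2 + y_1^2 y_2 + y_1 y_2^2 \\ f_2(x_1, x_2, y_1, y_2) = e^{y_1 + y_2} - x_2. \end{cases}$$
Then, $f(1, 1, 0, 0) = (0, 0)$. The derivative of $f$ at $(x, y)$ has matrix
$$\left[\begin{array}{cc|cc} D_1 f_1(x, y) & D_2 f_1(x, y) & D_3 f_1(x, y) & D_4 f_1(x, y) \\ D_1 f_2(x, y) & D_2 f_2(x, y) & D_3 f_2(x, y) & D_4 f_2(x, y) \end{array}\right] = \left[\begin{array}{cc|cc} 2x_1 & -2x_2 & 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ 0 & -1 & e^{y_1 + y_2} & e^{y_1 + y_2} \end{array}\right].$$
The matrix has been partitioned to emphasize
$$D_1 f(x, y) = \begin{bmatrix} 2x_1 & -2x_2 \\ 0 & -1 \end{bmatrix} \quad \text{and} \quad D_2 f(x, y) = \begin{bmatrix} 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ e^{y_1 + y_2} & e^{y_1 + y_2} \end{bmatrix}.$$
At the point $(a, b) = (1, 1, 0, 0)$,
$$D_1 f(a, b) = \begin{bmatrix} 2 & -2 \\ 0 & -1 \end{bmatrix},$$
which is invertible, since its columns are independent (or since its determinant is not 0).
Thus, in a neighbourhood $W$ of $b$ there is a continuously differentiable function $g$ such
that $f(g(y), y) = 0$, for all $y \in W$ and the matrix of $Dg(y)$ is
$$[Dg(y)] = -[D_1 f(x, y)]^{-1}[D_2 f(x, y)] = -\begin{bmatrix} 2x_1 & -2x_2 \\ 0 & -1 \end{bmatrix}^{-1} \begin{bmatrix} 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ e^{y_1 + y_2} & e^{y_1 + y_2} \end{bmatrix},$$
where $x = g(y)$.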
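In this example the system can actually be solved by hand, which gives a way to test the formula of the Implicit Function Theorem. The sketch below assumes the reading $f_1 = x_1^2 - x_2^2 + y_1^2 y_2 + y_1 y_2^2$, $f_2 = e^{y_1 + y_2} - x_2$ of the example, solves $f_2 = 0$ for $x_2$ and then $f_1 = 0$ for the branch $x_1 > 0$, and compares $-[D_1 f]^{-1}[D_2 f]$ with a finite-difference derivative of that explicit $g$. The helper names are hypothetical.

```python
import math

def g(y):
    """Explicit solution x = g(y) of f(x, y) = 0 near (a, b) = (1, 1, 0, 0)."""
    y1, y2 = y
    x2 = math.exp(y1 + y2)                                   # from f2 = 0
    x1 = math.sqrt(x2 * x2 - y1 * y1 * y2 - y1 * y2 * y2)    # from f1 = 0, branch x1 > 0
    return (x1, x2)

def dg_formula(y):
    """-(D1 f)^{-1} D2 f evaluated at (g(y), y), using the 2 x 2 inverse formula."""
    y1, y2 = y
    x1, x2 = g(y)
    a, b, c, d = 2 * x1, -2 * x2, 0.0, -1.0                  # entries of D1 f
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    e = math.exp(y1 + y2)
    d2 = [[2 * y1 * y2 + y2 * y2, y1 * y1 + 2 * y1 * y2], [e, e]]   # D2 f
    return [[-sum(inv[i][k] * d2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dg_numeric(y, t=1e-6):
    """Central-difference approximation to the matrix of Dg(y)."""
    cols = []
    for j in range(2):
        plus = list(y)
        minus = list(y)
        plus[j] += t
        minus[j] -= t
        gp, gm = g(tuple(plus)), g(tuple(minus))
        cols.append([(p - q) / (2 * t) for p, q in zip(gp, gm)])
    return [[cols[j][i] for j in range(2)] for i in range(2)]

print(dg_formula((0.0, 0.0)))    # matrix of Dg at b = (0, 0)
print(dg_formula((0.1, -0.2)))
print(dg_numeric((0.1, -0.2)))
```

At $b = (0, 0)$ the formula gives the matrix with all four entries equal to $1$, and at nearby points the two computations should agree up to the finite-difference error.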
54.1. Show that the continuity of $f'$ is needed in the inverse function theorem, even in the case $n = 1$:
Let
$$f(t) = \begin{cases} t + 2t^2 \sin(1/t), & t \neq 0 \\ 0, & t = 0. \end{cases}$$
Then, $f'(0) = 1$, $f'$ is bounded in $(-1, 1)$, but $f$ is not 1 to 1 in any neighbourhood of $0$.
54.2. Let $f$ be a real valued function on an open set $G$ of $\mathbb{R}^2$ such that the partial derivatives $D_1 f$
and $D_2 f$ exist and are bounded in $G$. Prove that $f$ is continuous.
54.3. Investigate how the inverse function theorem applies to the function $f \colon \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$f(x_1, x_2) = (x_1^2 + x_2^2, 2x_1 x_2).$$
55. APPENDIX: COUNTABILITY
Recall that a function $f \colon A \to B$ is called one-to-one or an injection if $f(a) = f(a')$
implies $a = a'$. It is called onto or a surjection if, for each $b \in B$ there exists $a \in A$ with
$f(a) = b$, and is called a bijection if it is both an injection and a surjection (one-to-one
and onto).
2 Definition. Two sets $A$ and $B$ are called equinumerous if there exists a bijection
$f \colon A \to B$. Notation: $A \leftrightarrow B$. We also say $A$ is equinumerous with $B$, or that $A$ and $B$
are in one-to-one correspondence.
55.1 Example. The set $\mathbb{N} = \{1, 2, \dots\}$ is equinumerous with $\{m \in \mathbb{N} : m \ge 2\} = \{2, 3, 4, \dots\}$.
The idea of this example is basic and IMPORTANT; it is used over and
over again!
Let us put $A = \{m \in \mathbb{N} : m \ge 2\}$. Since we have no theorems to use yet, to show
$\mathbb{N} \leftrightarrow A$, we must find a bijection from $\mathbb{N}$ onto $A$.
We simply let $f(n) = n + 1$, for $n \in \mathbb{N}$. Certainly $f \colon \mathbb{N} \to A$, because if $n \in \mathbb{N}$ then
$n + 1 \ge 1 + 1 = 2$.
To show that $f$ is injective (i.e. one-to-one) let
$$f(n) = f(n').$$
Then,
$$n + 1 = n' + 1,$$
so
$$n = n'.$$
To prove $f$ is surjective, that is, $f$ maps onto $A$, let $m \in A$. Then $m \in \mathbb{N}$ and $m > 1$, so
$m - 1 \in \mathbb{N}$ (one of our first theorems of natural numbers) and
$$f(m - 1) = m - 1 + 1 = m.$$
55.2 Example. The set of natural numbers is equinumerous with the set of even natural
numbers.
To see this, recall that a natural number $m$ is called even if it is divisible by 2. That, in
turn, means that there exists $n \in \mathbb{N}$ such that $m = 2n$. Thus, the set of even numbers is
exactly $\{2n : n \in \mathbb{N}\} = 2\mathbb{N}$. A bijection on $\mathbb{N}$ to $2\mathbb{N}$ is given by
$$f(n) = 2n, \quad \text{for } n \in \mathbb{N}.$$
Certainly for all $n \in \mathbb{N}$, $f(n)$ is even and we have just checked that $f$ is onto. To see that
$f$ is one-to-one, put $f(n_1) = f(n_2)$. Then, $2n_1 = 2n_2$, so $n_1 = n_2$ as required.
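The two bijections of Examples 55.1 and 55.2 can be spot-checked on a finite window of $\mathbb{N}$. This is only a sanity check on the first `N` natural numbers (the bound `N` is an arbitrary assumption); a finite check says nothing about all of $\mathbb{N}$, which is what the proofs above handle.

```python
def shift(n):
    """The map n -> n + 1 of Example 55.1."""
    return n + 1

def double(n):
    """The map n -> 2n of Example 55.2."""
    return 2 * n

N = 1000
window = range(1, N + 1)

shift_image = [shift(n) for n in window]
double_image = [double(n) for n in window]

# Injectivity on the window: no value is hit twice.
print(len(set(shift_image)) == N, len(set(double_image)) == N)

# Surjectivity onto the expected targets: {2, ..., N + 1} and the even numbers up to 2N.
print(set(shift_image) == set(range(2, N + 2)))
print(set(double_image) == {k for k in range(1, 2 * N + 1) if k % 2 == 0})
```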
Equinumerosity is an equivalence relation on the class of all sets. That is:
55.3 Lemma. For sets $A, B, C$:
(a) $A \leftrightarrow A$.
(b) $A \leftrightarrow B$ implies $B \leftrightarrow A$.
(c) $A \leftrightarrow B$ and $B \leftrightarrow C$ imply $A \leftrightarrow C$.
Proof. (a) For any set $A$, the identity mapping defined by
$$f(x) = x, \quad \text{for all } x \in A,$$
is a bijection.
Indeed, if $f(x) = f(x')$ then $x = x'$, just by definition (so $f$ is an injection) and if $x \in A$ then
$x = f(x)$ (so $f$ is a surjection).
(b) Suppose $f \colon A \to B$ is a bijection. Then $f^{-1} \colon B \to A$ is also a bijection.
Let us review the proof of this. From the definitions, $f(x) = y$ iff $x = f^{-1}(y)$.
[For a given $y$ in the range of $f$, the fact that $f$ is one-to-one gives us exactly one $x$ such that
$f(x) = y$, and this $x$ is defined to be $f^{-1}(y)$.]
Now if $y, y' \in B$ and $f^{-1}(y) = f^{-1}(y')$, then $f(f^{-1}(y)) = f(f^{-1}(y'))$, so $y = y'$.
Thus, $f^{-1}$ is injective. To see that it is surjective, let $x \in A$. Then $f(x)$ is some point $y$
of $B$ and thus $x = f^{-1}(y)$.
(c) is left as an exercise.
55.4 Lemma. The only set equinumerous with $\emptyset$ is itself.
Proof. Suppose $A \leftrightarrow \emptyset$. Then there is a bijection $f \colon A \to \emptyset$. If $A \neq \emptyset$, choose $a \in A$;
then $f(a) \in \emptyset$, which is impossible.
55.5 Lemma (Pigeonhole Principle). For natural numbers $m$ and $n$, if $m < n$, then there
is no injection $f \colon \{1, \dots, n\} \to \{1, \dots, m\}$.
Proof. We prove this by induction on $m$. Thus we let $P(m)$ denote “For all $n \in \mathbb{N}$, if
$m < n$ then there is no injection $f \colon \{1, \dots, n\} \to \{1, \dots, m\}$”.
This is true for $m = 1$, because if $n \in \mathbb{N}$ and $f \colon \{1, \dots, n\} \to \{1\}$ is one-to-one, then $f(n) = f(1)$,
which implies $n = 1$.
Assume it is true for $m$. Let $n \in \mathbb{N}$ and $n > m + 1$. Suppose there were an injection
$f \colon \{1, \dots, n\} \to \{1, \dots, m + 1\}$. Since there is no injection of $\{1, \dots, n\}$ into $\{1, \dots, m\}$,
there must be a $p \in \{1, \dots, n\}$ with $f(p) = m + 1$. Fix such a $p$ and put $g(k) = f(k)$, for
$k < p$ and $g(k) = f(k + 1)$ for $k = p, \dots, n - 1$. Then $g$ injects $\{1, \dots, n - 1\}$ into $\{1, \dots, m\}$,
which contradicts $P(m)$, since $m < n - 1$. Thus, no such $f$ exists: the statement is true for $m + 1$.
By the PMI, $P(m)$ holds for all $m \in \mathbb{N}$. That is, for all $m, n \in \mathbb{N}$, if $m < n$, then there
is no injection $f \colon \{1, \dots, n\} \to \{1, \dots, m\}$.
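For small $m$ and $n$ the Pigeonhole Principle can be confirmed by exhaustive search. The sketch below enumerates every map $\{1, \dots, n\} \to \{1, \dots, m\}$ as a tuple of values and keeps the injective ones; the helper name `injections` and the ranges searched are illustrative assumptions, and the enumeration is exponential, so it is only practical for tiny cases.

```python
from itertools import product

def injections(n, m):
    """All injective maps {1..n} -> {1..m}, each given as a tuple of values."""
    return [values for values in product(range(1, m + 1), repeat=n)
            if len(set(values)) == n]

# No injection exists whenever m < n (checked here for m up to 4, n up to 5).
for m in range(1, 5):
    for n in range(m + 1, 6):
        assert injections(n, m) == []

print(len(injections(3, 3)))   # prints 6: the injections of {1, 2, 3} into itself
```

When $n \le m$ injections do exist; for $n = m = 3$ they are exactly the $3! = 6$ permutations.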
55.6 Corollary. There is no injection of $\{1, \dots, n\}$ onto a proper subset of itself.
Proof. Suppose $h$ were an injection on $\{1, \dots, n\}$ to itself which is not surjective. Since $h$
cannot map $\{1, \dots, n\}$ to $\{1, \dots, n - 1\}$, there is some $i$ with $h(i) = n$ and since $h$ is not
surjective, there is some other $k \in \{1, \dots, n\} \setminus (\text{range } h)$. Define
$$f(j) = \begin{cases} h(j), & j \neq i \\ k, & j = i. \end{cases}$$
Then $f$ injects $\{1, \dots, n\}$ into $\{1, \dots, n - 1\}$, which is impossible.
3 Definition. The empty set is said to have 0 elements. For $n \in \mathbb{N}$, $A$ is said to have $n$
elements if $A \leftrightarrow \{1, \dots, n\}$.
We prove now that this $n$ is unique. It is called the cardinality of $A$ (or the number of
elements in $A$), denoted $\operatorname{card}(A)$ or $\#(A)$.
55.7 Theorem. If $n, m \in \mathbb{N}$ and $A \leftrightarrow \{1, \dots, n\}$ and $A \leftrightarrow \{1, \dots, m\}$ then $m = n$.
In other words, if $A$ has $m$ elements and $A$ has $n$ elements, then $m = n$.
Proof. Let $n, m \in \mathbb{N}$, and suppose $A \leftrightarrow \{1, \dots, n\}$ and $A \leftrightarrow \{1, \dots, m\}$. Then, from the
equivalence relation properties of $\leftrightarrow$,
$$\{1, \dots, n\} \leftrightarrow \{1, \dots, m\}.$$
If $m \neq n$, one must be smaller. Say $m < n$. But, by the Pigeon-hole Principle, there can be
no injection of $\{1, \dots, n\}$ into $\{1, \dots, m\}$, so no bijection. This is a contradiction, yielding
$m = n$.
4 Definition. A set is called
finite, if it has $n$ elements, for some $n \in W = \mathbb{N} \cup \{0\}$;
infinite, if it is not finite;
denumerable, if it is equinumerous with $\mathbb{N}$;
countable, if it is finite or denumerable.
55.8 Theorem (Also referred to as the Pigeon-hole Principle). If $m, n \in \mathbb{N}$, $A$ has $n$
elements, $B$ has $m$ elements and $m < n$, then there is no injection of $A$ into $B$.
Proof. Exercise.
We have shown that the set $\{2, 3, \dots\}$ is denumerable, as is the set of even numbers.
The reader should check that:
The set of odd numbers is denumerable and
so is the set $\mathbb{Z}$ of integers $\{\dots, -2, -1, 0, 1, 2, \dots\}$.
55.9 Theorem.
(1) The set $\mathbb{N}$ is infinite; hence, all denumerable sets are infinite.
(2) Equivalent for a set $A$ are:
(a) $A$ is infinite
(b) $A$ contains a denumerable subset
(c) $A$ is equinumerous with a proper subset of itself.
Proof. (1) If $\mathbb{N}$ were not infinite, then since it is not $\emptyset$, there would be a bijection
$f \colon \mathbb{N} \to \{1, \dots, n\}$, for some $n \in \mathbb{N}$. But then, if we let $g$ be the restriction of $f$ to $\{1, \dots, n + 1\}$:
$$g(x) = f(x), \quad \text{for } x \in \{1, \dots, n + 1\},$$
then $g$ is an injection of $\{1, \dots, n + 1\}$ into $\{1, \dots, n\}$, which violates the Pigeon-hole
Principle.
For the second statement, suppose $A$ is denumerable, but finite. Then $\mathbb{N} \leftrightarrow A$. If $A$
were empty we would have $\mathbb{N} = \emptyset$, and if $n \in \mathbb{N}$ with $A \leftrightarrow \{1, \dots, n\}$, we would have
$\mathbb{N} \leftrightarrow \{1, \dots, n\}$, which cannot happen since $\mathbb{N}$ is infinite.
(2) ((a) $\Longrightarrow$ (b)) Assume (a) holds; that is, $A$ is infinite.
Then, $A$ is not empty, so we may choose a point $x_1 \in A$.
Suppose $x_1, \dots, x_k$ have been chosen as distinct (that is, with $x_i \neq x_j$ if $i \neq j$)
elements of $A$. Then
$$\{x_1, \dots, x_k\} \subset A, \quad \text{and} \quad A \neq \{x_1, \dots, x_k\},$$
otherwise $A$ would be finite. Thus
$$A \setminus \{x_1, \dots, x_k\} \neq \emptyset,$$
and we may choose another $x_{k+1} \in A \setminus \{x_1, \dots, x_k\}$. Thus, $x_1, \dots, x_{k+1}$ are distinct
elements of $A$.
By recursion (definition by induction), we thus have proved the existence of a sequence
of elements xi 2 A; i 2 N, such that xi ¤ xj if i ¤ j . (Indeed, if i < j , all the elements
x1 ; : : : ; xj are distinct.) This says that the map
g W i 7! xi
is one-to-one on N into A. Let B be the range of g, that is B D fxi W i 2 Ng. Then g is
onto B so g is a bijection of N onto B. That is B is a denumerable subset of A.
((b) H) (c)) Suppose B is a denumerable subset of A, say
i 7! xi
is an injection on N onto B. We now define a map f on A by
(
x;
if x 2 A n B
f .x/ D
xiC1 ; if x D xi ; for some i 2 N.
This f is 1-1 on A onto A n fx1 g, a proper subset of A.
Indeed, first, notice that if x 2 A; then either x 2 A n B, in which case, f .x/ D x
cannot be x1 , or x is some xi 2 B, and f .x/ D xi C1 . But xiC1 cannot be xi ,
otherwise i C 1 D 1, since i 7! xi is one-to-one. This is impossible, since each
natural number is 1. This proves the range of f is contained in A n fx1 g (which is
all we really need to say that f maps to a proper subset of A. But we go on anyway,
since the technique of proof is of use elswhere).
If a 2 A n fx1 g, either a 2 A n B, so f .a/ D a 2 A n fx1 g or a D xiC1 , for some
i and then a D f .xi /. This shows the range of f is exactly A n fx1 g.
To show f is injective, we let f .x/ D f .x 0 / and show x D x 0 There are three
cases, both x; x 0 2 A n B, both are in B, or one is in A n B and the other is in B.
In the first case, f .x/ D x and f .x 0 / D x 0 and f .x/ D f .x 0 / so x D x 0 .
In the second case, x D xi , and x 0 D xj , for some i; j 2 N. Thus, f .x/ D xiC1
and f .x 0 / D xj C1 , so
xi C1 D xj C1 :
But the various xk are distinct, so i C 1 D j C 1, and therefore i D j .
The third case cannot occur, since then (say) f .x/ 2 A n B and f .x 0 / 2 B.
((c) $\Longrightarrow$ (a))
Let $A$ be equinumerous with a proper subset $B$ of itself. (Thus $B \subset A$, $A \setminus B \ne \emptyset$, and
$A \leftrightarrow B$.)
Suppose $A$ were finite. Then $A \ne \emptyset$, since $A \setminus B \ne \emptyset$, so $A \leftrightarrow \{1, \dots, n\}$ for some
$n \in \mathbb{N}$. Fix such an $n$. Let $f : A \to B$ be a bijection and $g : \{1, \dots, n\} \to A$ a bijection.
Then $g^{-1} \circ f \circ g$ is still one-to-one (why?). Moreover, it maps $\{1, \dots, n\}$ onto a proper
subset of itself, which is impossible by (the corollary to) the Pigeon-hole principle. Hence,
$A$ is infinite.
In detail, since $B$ is a proper subset of $A$, there exists an $a \in A \setminus B$. Then there is an $i$ with
$g(i) = a$. If there existed a $j$ with
$$g^{-1} \circ f \circ g(j) = i,$$
we would have
$$f(g(j)) = g(i) = a,$$
which contradicts the fact that $f$ maps to $B$. Thus the range of $g^{-1} \circ f \circ g$ is not all of $\{1, \dots, n\}$.
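The Pigeon-hole fact used here (a one-to-one map of $\{1, \dots, n\}$ into itself cannot miss a point) can be confirmed by exhaustive search for small $n$. The sketch below is a sanity check only, not a proof; the function names are made up for the illustration.

```python
from itertools import product

def injective_self_maps(n: int):
    """Yield every one-to-one map of {1, ..., n} into itself, as a tuple of values."""
    domain = range(1, n + 1)
    for values in product(domain, repeat=n):
        if len(set(values)) == n:       # no value repeated, so the map is injective
            yield values

def every_injection_is_onto(n: int) -> bool:
    """Check that no injective self-map of {1, ..., n} has a proper subset as range."""
    full = set(range(1, n + 1))
    return all(set(values) == full for values in injective_self_maps(n))

# Exhaustive check for n = 1, ..., 5 (at most 5**5 = 3125 candidate maps each).
results = {n: every_injection_is_onto(n) for n in range(1, 6)}
```

The search space grows like $n^n$, so this is only practical for very small $n$.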
55.10 Example. If $a < b$, then $[a, b]$ is infinite.
One proof of this would use (b). For each $n \in \mathbb{N}$, let $x_n = a + \frac{b - a}{n}$. Then the set
$\{x_n : n \in \mathbb{N}\}$ is denumerable and contained in $[a, b]$.
Another could use (c): Let $c = \frac{a + b}{2}$. Then $[a, c]$ is a proper subset of $[a, b]$, yet the
function on $[a, c]$ defined by $f(x) = 2(x - a) + a$ maps $[a, c]$ bijectively onto $[a, b]$.
For convenience of reference, we now list the facts about countability most commonly
used in Analysis, in the form of 4 theorems, and give the proofs afterward.
55.11 A.
(1) Every subset of a finite set is finite.
(2) Every subset of a countable set is countable.
(3) The image under any map of a finite set is finite.
(4) The image under any map of a countable set is countable.
55.12 B. For a non-empty set $A$, equivalent are:
(a) $A$ is countable.
(b) There is a surjection $f : \mathbb{N} \to A$.
(c) There is an injection $g : A \to \mathbb{N}$.
55.13 C. The following are countable:
$\mathbb{N} \times \mathbb{N}$, hence
the product of two countable sets.
the union of a countable number of countable sets.
the integers.
the rationals.
the algebraic numbers.
55.14 D (Theorem 11.3 of text). Each non-degenerate interval of $\mathbb{R}$ is uncountable, including $\mathbb{R}$
itself.
Proof of A(1). Let $A$ be finite and $B \subset A$. If $B$ were infinite, then it would contain a denumerable
subset $D$. But then $D \subset A$ also, so $A$ would also be infinite, a contradiction.
Proof of A(2). (Every subset of a countable set is countable.)
Let $A$ be countable and $B \subset A$. If $A$ is finite, then so is $B$, by (1). So we may assume
$A$ is denumerable. Also, if $B$ is finite we are done, so we also assume $B$ is infinite.
Let $A = \{a_1, a_2, \dots\}$, where the $a_n$ are distinct.
Now $B$ is infinite, so the set $M := \{n \in \mathbb{N} : a_n \in B\}$ is not empty; in fact it is infinite
(why?).
By the Well-ordering Property of $\mathbb{N}$, $M$ has a least element. So let $n_1$ be the first $n$ in
$M$. Thus
$$i < n_1 \text{ implies } i \notin M. \tag{$*$}$$
Continuing recursively, suppose $n_1, \dots, n_k$ have been chosen in $M$; we choose $n_{k+1}$ to be
the first element $n$ of $M \setminus \{n_1, \dots, n_k\}$. This defines a map $k \mapsto n_k$ of $\mathbb{N}$ to $M$.
At each stage,
$$\text{if } n \le n_k \text{ and } n \in M, \text{ then } n \in \{n_1, \dots, n_k\}. \tag{$**$}$$
Thus $n_{k+1} > n_k$ and we have $k \mapsto n_k$ is one-to-one on $\mathbb{N}$ onto $M$; hence, $k \mapsto a_{n_k}$ is
one-to-one on $\mathbb{N}$ onto $B$.
Here are some details of the last paragraph. The statement $(**)$ is proved by induction on $k$:
It is true for $k = 1$, since $n_1$ is the first element of $M$ (see $(*)$). So suppose the statement is true for
$k$. Then the elements of $M \setminus \{n_1, \dots, n_k\}$ are all $> n_k$. Since $n_{k+1}$ is one of these, $n_{k+1} > n_k$, and
since it is the smallest one, if $n \in M$ with $n \le n_{k+1}$, it has to be $n_{k+1}$ or else $n \le n_k$. In the second
case, $n \in \{n_1, \dots, n_k\}$, by the inductive hypothesis. Thus, in both cases,
$$\text{if } n \le n_{k+1} \text{ and } n \in M, \text{ then } n \in \{n_1, \dots, n_{k+1}\},$$
which is the statement for $k + 1$. Thus, by the PMI, $(**)$ is true for all $k$.
Now, the map $i \mapsto n_i$ is one-to-one, for if $i \ne k$, say $i < k$, then $n_i \in \{n_1, \dots, n_{k-1}\}$ and $n_k$ is
not in this set.
Finally, to prove the map $i \mapsto n_i$ is onto $M$, let $m \in M$. Since $n_k < n_{k+1}$ for all $k$, we have $n_k \ge k$
for all $k \in \mathbb{N}$. (This is a simple induction exercise.) Thus $m \in M$ and $m \le n_m$, so $m \in \{n_1, \dots, n_m\}$, by
$(**)$.
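The recursive choice of $n_1 < n_2 < \cdots$ amounts to scanning the enumeration of $A$ in order and keeping the terms that land in $B$. Here is a minimal sketch of that idea; the particular enumeration `a` (the even naturals) and the subset test `in_B` (multiples of 6) are hypothetical choices made only for the illustration.

```python
from itertools import count, islice
from typing import Callable, Iterator

def enumerate_subset(a: Callable[[int], int],
                     in_B: Callable[[int], bool]) -> Iterator[int]:
    """Yield a_{n_1}, a_{n_2}, ... where n_1 < n_2 < ... lists M = {n : a_n in B}."""
    for n in count(1):      # scanning upward picks the least unused element of M each time
        if in_B(a(n)):
            yield a(n)

def a(n: int) -> int:
    """Hypothetical enumeration of A: the even natural numbers, a_n = 2n."""
    return 2 * n

def in_B(x: int) -> bool:
    """Hypothetical infinite subset B of A: the multiples of 6."""
    return x % 6 == 0

first_five = list(islice(enumerate_subset(a, in_B), 5))
```

As in the proof, this relies on $B$ being infinite: if $B$ were finite, the generator would search forever once $B$ is exhausted.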
As we said, the results A, B, C, and D above are listed together for ease of reference, but
that is not necessarily the best order in which to prove them. We will come back to the rest of Theorem
A later, and for now turn to Theorem B.
Proof of Th. B. Let $A \ne \emptyset$.
(a) $\Longrightarrow$ (b).
Let $A$ be countable. Then either $A$ is denumerable, or $A$ is finite. If $A$ is denumerable,
then there is a bijection $f : \mathbb{N} \to A$, and this is certainly a surjection.
So, we are left with the case $A$ is finite. Then, there is an $n \in \mathbb{N}$ and a bijection $i \mapsto a_i$
on $\{1, \dots, n\}$ onto $A$. Thus
$$A = \{a_1, \dots, a_n\}.$$
We simply define $f : \mathbb{N} \to A$ by
$$f(j) = \begin{cases} a_j, & \text{if } j \le n, \\ a_1, & \text{if } j > n. \end{cases}$$
Clearly this map is onto since each of the elements of $A$ was already one of the $a_i$ for
$i \le n$.
(b) $\Longrightarrow$ (c).
Assume $f : \mathbb{N} \to A$ is surjective. For each $a \in A$, there is an element $n \in \mathbb{N}$ with
$f(n) = a$. Choose one such $n$ (say the first one) and call it $n_a$. Then the map
$$a \mapsto n_a$$
is on $A$ into $\mathbb{N}$. This map is injective because if
$$n_a = n_{a'},$$
then
$$f(n_a) = f(n_{a'}),$$
that is, $a = a'$.
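The two constructions just given can be put side by side in a small sketch: the padded surjection of the finite case of (a) $\Longrightarrow$ (b), and the "first preimage" injection of (b) $\Longrightarrow$ (c). The set `A` and the helper names below are hypothetical and serve only to illustrate the argument.

```python
from itertools import count

def padded_surjection(elements: list):
    """(a) => (b), finite case: f(j) = a_j for j <= n and f(j) = a_1 for j > n."""
    n = len(elements)
    def f(j: int):
        return elements[j - 1] if j <= n else elements[0]
    return f

def first_preimage(f, a) -> int:
    """(b) => (c): n_a, the first n with f(n) = a (searches forever if a is not a value of f)."""
    for n in count(1):
        if f(n) == a:
            return n

A = ["x", "y", "z"]         # hypothetical finite set, listed without repeats
f = padded_surjection(A)
index = {a: first_preimage(f, a) for a in A}   # the map a -> n_a
```

Distinct elements receive distinct indices because $f(n_a) = a$ recovers $a$ from $n_a$, which is exactly the injectivity argument above.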
(c) $\Longrightarrow$ (a).
Now let $g : A \to \mathbb{N}$ be injective. Then $g[A] \subset \mathbb{N}$ and
$$g : A \to g[A]$$
is one-to-one and onto. Thus $A$ is equinumerous with $g[A]$, and $g[A]$ is countable, since it is a
subset of a countable set. Hence $A$ is countable.
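Proof of A(4). Suppose $A$ is countable, and let $g$ be any mapping. We have to show that $g[A]$
is countable. If $A = \emptyset$, then $g[A] = \emptyset$ and there is nothing to prove, so assume $A \ne \emptyset$.
By Theorem B(b) there is a surjection $f$ of $\mathbb{N}$ onto $A$, and then the composite map $g \circ f$
maps $\mathbb{N}$ onto $g[A]$, so $g[A]$ is also countable.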
It is left as an exercise to state and prove an analogue of Theorem B for finite sets and
use it to prove A(3) that the image of any finite set is finite.
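Proof that $\mathbb{N} \times \mathbb{N}$ is countable.
Method 1: The map $f : (m, n) \mapsto m + (m + n - 2)(m + n - 1)/2$ is 1-1 on $\mathbb{N} \times \mathbb{N}$ onto
$\mathbb{N}$. This is “diagonal enumeration”.
Onto is proved by induction: $f(1, 1) = 1$. Supposing $f(m, n) = k$, either $n = 1$, in
which case $f(1, m + 1) = k + 1$, or otherwise $f(m + 1, n - 1) = k + 1$.
To prove 1-1: suppose $f(m, n) = f(m', n')$. If $m + n = m' + n'$, then $m = m'$
and then $n = n'$; if $m + n > m' + n' = N$, then
$$f(m, n) \ge m + \frac{(N + 1 - 2)(N + 1 - 1)}{2} = m + \frac{(N - 1)N}{2} = m + (N - 1) + \frac{(N - 2)(N - 1)}{2} > f(m', n'),$$
where the last inequality holds because $m' \le N - 1 < m + (N - 1)$. This contradicts
$f(m, n) = f(m', n')$; the case $m + n < m' + n'$ is symmetric.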
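A quick numerical check of Method 1: on the diagonal $m + n = s$ the map $f$ should take consecutive values, so the first $D$ diagonals should fill $\{1, \dots, D(D+1)/2\}$ exactly once each. The sketch below tests this for a finite number of diagonals (the bound `D = 30` is an arbitrary choice); it illustrates the bijection but does not replace the proof.

```python
def f(m: int, n: int) -> int:
    """Diagonal enumeration f(m, n) = m + (m + n - 2)(m + n - 1)/2 from Method 1."""
    s = m + n
    return m + (s - 2) * (s - 1) // 2   # (s - 2)(s - 1) is even, so // is exact

D = 30      # number of diagonals m + n = 2, ..., D + 1 to check (arbitrary)
values = [f(m, s - m) for s in range(2, D + 2) for m in range(1, s)]

# The first D diagonals hold 1 + 2 + ... + D pairs and fill {1, ..., D(D+1)/2} exactly.
assert sorted(values) == list(range(1, D * (D + 1) // 2 + 1))
```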
Method 2: By B(c), it is enough to show that there is an injection $g$ of $\mathbb{N} \times \mathbb{N}$ into $\mathbb{N}$. One
such is given by
$$g : (m, n) \mapsto 2^{m-1}(2n - 1), \quad \text{for } (m, n) \in \mathbb{N} \times \mathbb{N}.$$
If
$$g(m, n) = g(m', n')$$
then
$$2^{m-1}(2n - 1) = 2^{m'-1}(2n' - 1).$$
From very basic number theory, this implies
$$2^{m-1} = 2^{m'-1} \quad \text{and} \quad 2n - 1 = 2n' - 1$$
(otherwise you could prove that an even number was equal to an odd number) and then
$$m = m' \text{ and } n = n';$$
that is,
$$(m, n) = (m', n').$$
The mapping $g$ is actually also onto. For a given $k \in \mathbb{N}$, $k = g(m, n)$ where $m$ is the greatest
natural number for which $2^{m-1}$ divides $k$ (leaving the odd number $2n - 1$ as a quotient).
It is interesting to draw a picture numbering pairs $(m, n)$ this way. Try it.
Another common choice for this method is to use $h(m, n) = 2^m 3^n$. It is still injective, but certainly not
onto $\mathbb{N}$.
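The remarks on Method 2 can be checked numerically. The sketch below implements $g$, recovers $(m, n)$ from $k$ by stripping factors of 2 (the name `g_inverse` is introduced here for illustration), and shows that $h$ misses small numbers. The ranges tested are arbitrary finite samples.

```python
from typing import Tuple

def g(m: int, n: int) -> int:
    """g(m, n) = 2^(m-1) (2n - 1) from Method 2."""
    return 2 ** (m - 1) * (2 * n - 1)

def g_inverse(k: int) -> Tuple[int, int]:
    """Recover (m, n) from k >= 1: 2^(m-1) is the largest power of 2 dividing k."""
    m = 1
    while k % 2 == 0:
        k //= 2
        m += 1
    return m, (k + 1) // 2      # the remaining odd quotient equals 2n - 1

def h(m: int, n: int) -> int:
    """The alternative h(m, n) = 2^m 3^n: injective, but not onto N."""
    return 2 ** m * 3 ** n

# Every k in the sample has a preimage under g, and g maps it back to k.
round_trip_ok = all(g(*g_inverse(k)) == k for k in range(1, 1001))
# h is injective on the sample (49 distinct values) but never takes the value 1.
h_values = {h(m, n) for m in range(1, 8) for n in range(1, 8)}
```

For instance, `g_inverse(12)` gives `(3, 2)`, since $12 = 2^{2} \cdot 3$ and $3 = 2 \cdot 2 - 1$.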
The reader can easily deduce from the countability of $\mathbb{N} \times \mathbb{N}$ that the cartesian product
of two countable sets is countable, that is, if $A$ and $B$ are countable, so is $A \times B$.
Proof that a countable union of countable sets is countable. If $I$ is countable ($\ne \emptyset$) and
for each $i \in I$, $A_i$ is countable (and $\ne \emptyset$), then there is a map $g$ on $\mathbb{N}$ onto $I$ and, for each
$i \in I$, there is a map $h_i$ on $\mathbb{N}$ onto $A_i$.
Then the map $f$ defined by $f(m, n) = h_{g(m)}(n)$ maps the countable set $\mathbb{N} \times \mathbb{N}$ onto
$\bigcup_{i \in I} A_i$, hence the latter image is also countable.
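To see the map $f(m, n) = h_{g(m)}(n)$ at work, here is a sketch with hypothetical choices that are not from the text: the index set is $I = \mathbb{N}$ with $g$ the identity, and $A_i = \{i, 2i, 3i, \dots\}$ with $h_i(n) = i \cdot n$. Walking $\mathbb{N} \times \mathbb{N}$ along its diagonals then lists the union, skipping repeats.

```python
def g(m: int) -> int:
    """Hypothetical surjection of N onto the index set I = N (the identity)."""
    return m

def h(i: int, n: int) -> int:
    """Hypothetical surjection n -> h_i(n) of N onto A_i = {i, 2i, 3i, ...}."""
    return i * n

def f(m: int, n: int) -> int:
    """The map f(m, n) = h_{g(m)}(n) from the proof."""
    return h(g(m), n)

# Walk N x N along the diagonals m + n = 2, 3, ..., 11 and record new elements.
listing = []
for s in range(2, 12):
    for m in range(1, s):
        x = f(m, s - m)
        if x not in listing:    # the sets A_i overlap, so f is onto but not one-to-one
            listing.append(x)
```

The overlap between the sets $A_i$ is why the proof only needs $f$ to be a surjection: countability of the image then follows from A(4).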
Proof that the set of rationals is countable. The set $\mathbb{Z}$ of integers is the union of the nonnegative and the negative integers, so is countable.
The set of rationals is the image of the countable set $\mathbb{Z} \times \mathbb{N}$ under the map $(m, n) \mapsto
m/n$, so is countable by A(4).
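The argument can be turned into an explicit listing of the rationals. The sketch below is one possible arrangement, not the one in the text: it lists $\mathbb{Z}$ as $0, 1, -1, 2, -2, \dots$, walks $\mathbb{Z} \times \mathbb{N}$ along diagonals, and applies $(m, n) \mapsto m/n$, discarding values already seen.

```python
from fractions import Fraction
from itertools import count, islice

def integers():
    """List Z as 0, 1, -1, 2, -2, ...: nonnegative and negative integers interleaved."""
    yield 0
    for k in count(1):
        yield k
        yield -k

def rationals():
    """Yield each rational once, as the image of Z x N under (m, n) -> m/n."""
    seen = set()
    for s in count(2):                          # diagonal i + n = s over positions (i, n)
        zs = list(islice(integers(), s - 1))    # the first s - 1 integers in the listing
        for i in range(1, s):
            q = Fraction(zs[i - 1], s - i)      # m is the i-th integer, n = s - i
            if q not in seen:                   # skip repeats such as 2/2 = 1/1
                seen.add(q)
                yield q

first_ten = list(islice(rationals(), 10))
```

Discarding repeats turns the surjection into a one-to-one listing; the proof itself does not need this step.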