Real Analysis

ANALYSIS — AN INTRODUCTORY COURSE
Ivan F Wilde
Mathematics Department
King’s College London
iwilde@mth.kcl.ac.uk
Contents
1 Sets
2 The Real Numbers
3 Sequences
4 Series
5 Functions
6 Power Series
7 The elementary functions
Chapter 1
Sets
It is very convenient to introduce some notation and terminology from set
theory. A set is just a collection of objects — which will usually be certain
mathematical objects, such as numbers, points in the plane, functions or
some such. If A denotes some given set and x denotes an object belonging
to A, then this fact is indicated by the expression
x∈A
to be read as “x belongs to A”, or “x is a member of A”, or “x is an element
of A”. If x denotes some object which does not belong to the set A, then
this is indicated by the symbolism
x ∉ A
and is read as “x does not belong to A”, or “x is not a member of A”, or
“x is not an element of A”.
To say that the sets A and B are equal is to say that they have the same
elements. In other words, to say that A = B is to say both that if x ∈ A
then also x ∈ B and if y ∈ B then also y ∈ A. We can write this as
A = B is the same as:   x ∈ A =⇒ x ∈ B   and   y ∈ B =⇒ y ∈ A.
The verification that given sets A and B are equal is made up of two
parts. The first is the verification that every element of A is also an element
of B and the second part is the verification that every element of B is also
an element of A.
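For finite sets this two-part verification can even be carried out mechanically. The following short Python sketch is an added illustration, not part of the original notes (the function name is invented here): it checks equality of two finite sets by testing both inclusions, exactly as described above.

    def sets_equal(A, B):
        # A = B exactly when every element of A is in B and every element of B is in A
        return all(x in B for x in A) and all(y in A for y in B)

    print(sets_equal({2, 3, 4}, {4, 3, 2}))   # True: the same elements, order irrelevant
    print(sets_equal({2, 3}, {2, 3, 4}))      # False: 4 belongs to the second set only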
We list a few examples of sets and also introduce some notation.
Examples 1.1.
1. The set consisting of the three integers 2, 3, 4. We write this as { 2, 3, 4 }.
2. The set of natural numbers { 1, 2, 3, 4, 5, 6, . . . } (i.e., all strictly positive
integers). This set is denoted by N. Notice that 0 ∉ N.
3. The set of all real numbers, denoted by R. For example, 8, −11, 0, √5, −1/2, 1/3, π are elements of R.
4. The set of complex numbers is denoted by C.
5. The set of all integers (positive, negative and including zero) is denoted
by Z.
6. The set of all rational numbers (all real numbers of the form m/n for integers m, n with n ≠ 0) is denoted by Q. For example, the real numbers 3/4, −17/9, 0, 78, −3 belong to Q, but √2 ∉ Q.
7. The set of even natural numbers { 2, 4, 6, 8, . . . }. This could also be
written as { n ∈ N : n = 2m for some m ∈ N }. (The colon ‘:’ stands for
“such that” (or “with the property that”), so this can be read as “the
set of all n in N such that n = 2m for some m in N”.)
8. The set { x ∈ R : x > 1 } is the set of all those real numbers strictly
greater than 1.
9. The set { z ∈ C : |z| = 1 } is the set of complex numbers with absolute
value equal to 1. This is the “unit circle” in C (the circle with centre at the origin and with radius equal to 1).
Certain sets of real numbers, so-called “intervals”, are given a special
notation with the use of round and square brackets. Let a ∈ R and b ∈ R
and suppose that a < b.
{ x ∈ R : a ≤ x ≤ b } is denoted [a, b]   (closed interval)
{ x ∈ R : a < x < b } is denoted (a, b)   (open interval)
{ x ∈ R : a ≤ x < b } is denoted [a, b)   (closed-open interval)
{ x ∈ R : a < x ≤ b } is denoted (a, b]   (open-closed interval)
{ x ∈ R : x ≤ a } is denoted (−∞, a]
{ x ∈ R : x < a } is denoted (−∞, a)
{ x ∈ R : a ≤ x } is denoted [a, ∞)
{ x ∈ R : a < x } is denoted (a, ∞).
It is important to realize that all this is just notation — a useful visual
short-hand. In particular, the symbol ∞ is used in four of the cases. This in
no way is meant to imply that ∞ represents a real number — it positively,
absolutely, certainly is not.
∞ is not a real number.
There is no such real number as ∞.
Given sets A and B, we say that A is a subset of B if every element of
A is also an element of B, i.e., x ∈ A =⇒ x ∈ B. If this is the case, we
write
A⊆B
— read “A is a subset of B”. By virtue of our earlier discussion of the
equality A = B, we can say that
A = B ⇐⇒ both A ⊆ B and B ⊆ A,
where the symbol ⇐⇒ is read “is equivalent to” or “if and only if”.
We have N ⊆ R, Q ⊆ R, N ⊆ Z.
Definition 1.2. Suppose that A and B are given sets. The union of A and B,
denoted by A ∪ B, is the set with elements which belong to either A or B
(or both);
A ∪ B = { x : x ∈ A or x ∈ B }
— read “A union B equals . . . ”. Note that the usage of the word “or”
allows “both”.
In non-mathematical language, the union A ∪ B is obtained by bundling
together everything in A and everything in B. Clearly, by construction,
A ⊆ A ∪ B and also B ⊆ A ∪ B.
Example 1.3. Suppose that A = { 1, 2, 3 } and B = { 3, 6, 8 }. Then we find
that A ∪ B = { 1, 2, 3, 6, 8 }.
Definition 1.4. The intersection of A and B, denoted by A ∩ B, is the set
with elements which belong to both A and B;
A ∩ B = { x : x ∈ A and x ∈ B }
— read “A intersect B equals . . . ”.
In non-mathematical language, the intersection A ∩ B is got by selecting
everything which belongs to both A and B. Clearly, by construction, we see
that A ∩ B ⊆ A and also A ∩ B ⊆ B.
Example 1.5. With A = { 1, 2, 3 } and B = { 3, 6, 8 }, as in the example
above, we see that A ∩ B = { 3 }.
If A and B have no elements in common then their intersection A∩B has
no elements at all. It is convenient to provide a symbol for this situation.
We let ∅ denote the set with no elements. ∅ is called the “empty set”.
Then A ∩ B = ∅ if A and B have no common elements. In such a situation,
we say that A and B are disjoint.
Example 1.6. Let A and B be the intervals in R given as A = (1, 4] and
B = (4, 6). Then A ∩ B = ∅ and A ∪ B = (1, 6).
Remark 1.7. Let A and B be given sets and consider the truth, or otherwise,
of the statement “ A ⊆ B ”. This fails to be true precisely when A possesses
an element which is not a member of B.
Now suppose that A = ∅. The statement “ ∅ ⊆ B ” is false provided
that there is some “nuisance” element of ∅ which is not an element of B.
However, ∅ has no elements at all, so there can be no such “nuisance”
element. In other words, the statement “ ∅ ⊆ B ” cannot be false and
consequently must be true; ∅ obeys ∅ ⊆ B for any set B. This might seem
a bit odd, but is just a logical consequence of the formalism.
Theorem 1.8. For sets A, B and C, we have
(1) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(2) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Proof. (1) We must show that lhs ⊆ rhs and that rhs ⊆ lhs. First, we shall
show that lhs ⊆ rhs. If lhs = ∅, then we are done, because ∅ is a subset of
any set. So now suppose that lhs ≠ ∅ and let x ∈ lhs = A ∪ (B ∩ C). Then
x ∈ A or x ∈ (B ∩ C) (or both).
(i) Suppose x ∈ A. Then x ∈ A ∪ B and also x ∈ A ∪ C and therefore
x ∈ (A ∪ B) ∩ (A ∪ C), that is, x ∈ rhs.
(ii) Suppose that x ∈ (B ∩ C). Then x ∈ B and x ∈ C and so x ∈ A ∪ B
and also x ∈ A ∪ C. Therefore x ∈ (A ∪ B) ∩ (A ∪ C), that is, x ∈ rhs.
So in either case (i) or (ii) (and at least one of these must be true), we find
that x ∈ rhs. Since x ∈ lhs is arbitrary, we deduce that every element of the
left hand side is also an element of the right hand side, that is, lhs ⊆ rhs.
Now we shall show that rhs ⊆ lhs. If rhs = ∅, then there is no more
to prove. So suppose that rhs ≠ ∅. Let x ∈ rhs. Then x ∈ (A ∪ B) and
x ∈ (A ∪ C).
Case (i): suppose x ∈ A. Then certainly x ∈ A ∪ (B ∩ C) and so x ∈ lhs.
Case (ii): suppose x ∉ A. Then since x ∈ (A ∪ B), it follows that x ∈ B.
Also x ∈ (A ∪ C) and so it follows that x ∈ C. Hence x ∈ B ∩ C and so
x ∈ A ∪ (B ∩ C) which tells us that x ∈ lhs.
We have seen that every element of the right hand side also belongs to the
left hand side, that is, rhs ⊆ lhs.
Combining these two parts, we have lhs ⊆ rhs and also rhs ⊆ lhs and so
it follows that lhs = rhs, as required.
(2) This is left as an exercise.
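For concrete finite sets the two identities of Theorem 1.8 are easy to check by machine. The short Python sketch below is an added illustration, not part of the notes and no substitute for the proof; the particular sets are chosen arbitrarily.

    A, B, C = {1, 2, 3}, {3, 6, 8}, {2, 8, 9}

    # (1) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
    print(A | (B & C) == (A | B) & (A | C))   # True

    # (2) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
    print(A & (B | C) == (A & B) | (A & C))   # True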
The notions of union and intersection extend to the situation with more
than just two sets. For example,
A1 ∪ A2 ∪ A3 = { x : x ∈ A1 or x ∈ A2 or x ∈ A3 }
= { x : x belongs to at least one of the sets A1 , A2 , A3 }
= { x : x ∈ Ai for some i = 1 or 2 or 3 }
= { x : x ∈ Ai for some i ∈ { 1, 2, 3 } }.
More generally, for n sets A1 , A2 , . . . , An , we have
A1 ∪ A2 ∪ · · · ∪ An = { x : x ∈ Ai for some i ∈ { 1, 2, . . . , n } }.
This union is often denoted by ⋃_{i=1}^{n} Ai , which is somewhat more concise than the alternative A1 ∪ A2 ∪ · · · ∪ An . Let Λ denote the “index set” { 1, 2, . . . , n }.
This is just the set of labels for the collection of sets we are considering. Then
the above can be conveniently written as
⋃_{i∈Λ} Ai = { x : x ∈ Ai for some i ∈ Λ }.
This all makes sense for any non-empty index set. Indeed, suppose that we
have some collection of sets indexed (that is, labelled,) by a set Λ. Suppose
the set with label λ ∈ Λ is denoted by Aλ . The union of all the Aλ s is
defined to be
⋃_{λ∈Λ} Aλ = { x : x ∈ Aλ for some λ ∈ Λ }.
If Λ = N, one often writes ⋃_{i=1}^{∞} Ai for ⋃_{λ∈Λ} Aλ .
Examples 1.9.
1. Suppose that Λ = { 1, 2, 3, . . . , 57, 58 } and Aj = [j, j + 1] for each j ∈ Λ.
(So, for example, with j = 7, A7 = [7, 7 + 1] = [7, 8].) Then
⋃_{j=1}^{58} Aj = [1, 59].
2. Suppose that Λ = N and Aj = [1, j + 1] for j ∈ N. Then
⋃_{j=1}^{∞} Aj = [1, ∞).
To see this, suppose that x ∈ ⋃_{j=1}^{∞} Aj . Then x is an element of at least
one of the Aj s, that is, there is some j0 , say, in N such that x ∈ Aj0 .
This means that x ∈ [1, j0 + 1], that is, 1 ≤ x ≤ j0 + 1 and so certainly
x ∈ [1, ∞). It follows that lhs ⊆ rhs.
Now suppose that x ∈ [1, ∞). Then, in particular, x ≥ 1. Let N
be any natural number satisfying N > x. Then certainly x satisfies 1 ≤ x ≤ N + 1 which means that x ∈ AN and so x ∈ ⋃_{j=1}^{∞} Aj . Hence rhs ⊆ lhs and the equality lhs = rhs follows.
3. Suppose that Λ is the interval (0, 1) and, for each λ ∈ (0, 1), Aλ is given
by Aλ = { (x, y) ∈ R2 : x = λ }. In other words, Aλ is the vertical line
x = λ in the plane R2 . Then
⋃_{λ∈Λ} Aλ = { (x, y) ∈ R2 : 0 < x < 1 }
which is the vertical strip in R2 with boundary edges given by the lines
with x = 0 and x = 1, respectively. Note that these lines (boundary
edges) are not part of the union of the Aλ s.
4. Let Λ be the interval [3, 5] and for each λ ∈ [3, 5] let Aλ = { λ }. In other
words, Aλ consists of just one point, the real number λ. Then
⋃_{λ∈Λ} Aλ = [3, 5]
which just says that the interval [3, 5] is the union of all its points (as it
should be).
A similar discussion can be made regarding intersections.
A1 ∩ A2 ∩ A3 = { x : x ∈ A1 and x ∈ A2 and x ∈ A3 }
= { x : x belongs to every one of the sets A1 , A2 , A3 }
= { x : x ∈ Ai for each of i = 1, 2, 3 }
= { x : x ∈ Ai for all i ∈ { 1, 2, 3 } }.
In general, if { Aλ }_{λ∈Λ} is any collection of sets indexed by the (non-empty)
set Λ, then the intersection of the Aλ s is
⋂_{λ∈Λ} Aλ = { x : x ∈ Aλ for all λ ∈ Λ }.
If Λ = { 1, 2, . . . , n }, we usually write ⋂_{i=1}^{n} Ai for ⋂_{λ∈Λ} Aλ and if Λ = N, then we usually write ⋂_{i=1}^{∞} Ai for ⋂_{λ∈Λ} Aλ .
Examples 1.10.
1. Suppose that Λ = N and for each j ∈ Λ = N, let Aj = [0, j]. Then
⋂_{j∈N} Aj = [0, 1].
2. Let Λ = N and set Aj = [j, j + 1] for j ∈ N. Then
⋂_{j=1}^{∞} Aj = ∅.
3. Let Λ = N and set Aj = [j, ∞) for j ∈ N. Then
⋂_{j=1}^{∞} Aj = ∅.
To see this, note that x ∈ ⋂_{j=1}^{∞} Aj provided that x belongs to every Aj .
This means that x satisfies x ≥ j for all j ∈ N. But clearly this
fails whenever j is a natural number strictly greater than x. In other
words, there are no real numbers which satisfy this criterion.
4. Suppose that Λ = N and for each k ∈ N let Ak be the interval given by
Ak = [0, 1/k). Then, in this case,
⋂_{k=1}^{∞} Ak = { 0 }.
This follows because the only non-negative real number which is smaller
than every 1/k (where k ∈ N) is zero.
5. Let Λ = N and let Ak = [0, 1 + 1/k] for k ∈ N. Then
⋂_{k=1}^{∞} Ak = [0, 1].
Indeed, [0, 1] ⊆ Ak for every k and if x ∉ [0, 1] then x must fail to belong to some Ak .
Theorem 1.11. Suppose that A and Bλ , for λ ∈ Λ, are given sets. Then
(1) A ∪ (⋂_{λ∈Λ} Bλ) = ⋂_{λ∈Λ} (A ∪ Bλ).
(2) A ∩ (⋃_{λ∈Λ} Bλ) = ⋃_{λ∈Λ} (A ∩ Bλ).
Proof. (1) Suppose that x ∈ A ∪ (⋂_{λ∈Λ} Bλ). If x ∈ A, then x ∈ A ∪ Bλ for all λ and so x ∈ ⋂_{λ∈Λ} (A ∪ Bλ). If x ∉ A, then it must be the case that x ∈ ⋂_{λ∈Λ} Bλ , in which case x ∈ Bλ for all λ and so x ∈ ⋂_{λ∈Λ} (A ∪ Bλ). We have shown that A ∪ (⋂_{λ∈Λ} Bλ) ⊆ ⋂_{λ∈Λ} (A ∪ Bλ).
To establish the reverse inclusion, suppose that x ∈ ⋂_{λ∈Λ} (A ∪ Bλ). Then x ∈ A ∪ Bλ for every λ ∈ Λ. If x ∈ A, then certainly x ∈ A ∪ (⋂_{λ∈Λ} Bλ). If x ∉ A, then we must have that x ∈ Bλ for every λ, that is, x ∈ ⋂_{λ∈Λ} Bλ . But then it follows that x ∈ A ∪ (⋂_{λ∈Λ} Bλ).
Hence ⋂_{λ∈Λ} (A ∪ Bλ) ⊆ A ∪ (⋂_{λ∈Λ} Bλ) and so the equality

A ∪ (⋂_{λ∈Λ} Bλ) = ⋂_{λ∈Λ} (A ∪ Bλ)

follows.
(2) The proof of this proceeds along similar lines to part (1).
Chapter 2
The Real Numbers
In this chapter, we will discuss the properties of R, the real number system.
It might well be appropriate to ask: exactly what is a real number? It is
the job of mathematics to set out clear descriptions of the objects within its
scope, so it is not at all unreasonable to expect an answer to this. One must
start somewhere. For example, in geometry, one might take the concept
of “point” as a basic undefined object. Lines are then specified by pairs
of points — the line passing through them. Beginning with the natural
numbers, N, one can construct Z and from Z one constructs the rationals, Q.
Finally from Q it is possible to construct the real numbers R. We will not
do this here, but rather we will take a close look at the structure and special
properties of R. Of course, everybody knows that numbers can be added
and multiplied and even subtracted and it makes sense to divide one number
by another (as long as the latter, the denominator, is not zero). We can also
compare two numbers and discuss which is the larger. It is precisely these
properties (or axioms) that we wish to isolate and highlight.
Arithmetic
To each pair of real numbers a, b ∈ R, there corresponds a third, denoted
a + b. This “pairing”, denoted ‘ + ’ and called “addition”, obeys
(A1) a + (b + c) = (a + b) + c, for all a, b, c ∈ R.
(A2) a + b = b + a, for all a, b ∈ R.
(A3) There is a unique element, denoted 0, in R such that a + 0 = a, for
any a ∈ R.
(A4) For any a ∈ R, there is a unique element (denoted −a) in R such that
a + (−a) = 0.
The properties (A1) – (A4) say that R is an abelian group with respect to
the binary operation ‘ + ’.
Next, we consider multiplication. To each pair a, b ∈ R, there is a
third, denoted a.b, the “product” of a and b. The operation ‘ . ’, called
multiplication, obeys
(A5) a.(b.c) = (a.b).c, for any a, b, c ∈ R.
(A6) a.b = b.a, for any a, b ∈ R.
(A7) There is a unique element, denoted 1, in R, with 1 ≠ 0 and such that
a.1 = a, for any a ∈ R.
(A8) For any a ∈ R with a ≠ 0, there is a unique element in R, written a⁻¹ or 1/a, such that a.a⁻¹ = 1. The element a⁻¹ is called the (multiplicative) inverse, or reciprocal, of a.
(A9) a.(b + c) = a.b + a.c, for all a, b, c ∈ R.
Remarks 2.1.
1. 0⁻¹ is not defined. The element 0 has no reciprocal. Such an object
simply does not exist in R. 1/0 has no meaning.
2. Subtraction is given by a − b = a + (−b), for a, b ∈ R.
3. Division is defined via a ÷ b = a.(b⁻¹) ( = a.(1/b) = a/b) provided b ≠ 0. If
it should happen that b = 0, then the expression a/b has no meaning.
4. It is usual to omit the dot and write just ab for the product a.b. There
is almost never any confusion from this.
All the familiar arithmetic results are consequences of the above properties
(A1) – (A9).
Examples 2.2.
1. For any x ∈ R, x.0 = 0.
Proof. By (A3), 0 + 0 = 0 and so x.(0 + 0) = x.0. Hence, by (A9),
x.0 + x.0 = x.0. Adding −(x.0) to both sides gives
(x.0 + x.0) + (−(x.0)) = x.0 + (−(x.0))
= 0,
by (A4).
Hence, by (A1), x.0 + (x.0 + (−(x.0))) = 0 and so, using property (A4)
again, we get x.0 + 0 = 0. However, by (A3), x.0 + 0 = x.0 and so by
equating these last two expressions for x.0 + 0 we obtain x.0 = 0, as
required.
2. For any x, y ∈ R, x.(−y) = −(x.y).
Proof.
(x.y) + x.(−y) = x.(y + (−y)),   by (A9),
= x.0,   by (A4),
= 0,   by the previous result.
By (A4) (uniqueness), −(x.y) must be the same as x.(−y).
3. For any x ∈ R, −(−x) = x.
Proof. We have
x = x + 0,   by (A3),
= x + ((−x) + (−(−x))),   by (A4),
= (x + (−x)) + (−(−x)),   by (A1),
= 0 + (−(−x)),   by (A4),
= −(−x),   by (A3),
as required.
4. For any x, y ∈ R, x.y = (−x).(−y).
Proof. By example 2, above, α.(−β) = −(α.β) for any α, β ∈ R. If we now choose α = −x and β = y, we get

(−x).(−y) = −((−x).y)
= −(y.(−x)),   by (A6),
= −(−(y.x)),   by example 2, above,
= −(−(x.y)),   by (A6),
= x.y,   by example 3, above,
and we are done.
Order properties
Here we formalize the idea of one number being greater than another. We
can “order” two numbers by thinking of the larger as being the higher in
order. More precisely, there is a relation < (read “less than”) between
elements of R satisfying the following:
(A10) For any a, b ∈ R, exactly one of the following is true:
a < b,   b < a   or   a = b   (trichotomy).
The notation u > a (read “u is greater than a”) means that a < u.
(A11) If a < b and b < c, then a < c.
(A12) If a < b, then a + c < b + c, for any c ∈ R.
(A13) If a < b and γ > 0, then aγ < bγ.
Notation We write a ≤ b to signify that either a < b is true or else a = b
is true. In view of (A10), we can say that a ≤ b means that it is false that
a > b. The notation x ≥ w is used to mean that w ≤ x and as already noted
above, x > w is used to mean w < x.
By (A10), if x 6= 0, then either x > 0 or else x < 0. If x > 0, then x is
said to be (strictly) positive and if x < 0, we say that x is (strictly) negative.
Thus, if x is not zero, then it is either positive or else it is negative. It is
quite common to call a number x positive if it obeys x ≥ 0 or negative if it
obeys x ≤ 0. Should it be necessary to indicate that x is not zero, then one
adds the adjective ‘strictly’.
Examples 2.3.
1. For any x ∈ R, we have x > 0 ⇐⇒ (−x) < 0.
Proof. Using (A12), we have
0 < x =⇒ 0 + (−x) < x + (−x) (adding (−x) to both sides),
=⇒ (−x) < 0 since rhs = 0, by (A4).
Conversely, again from (A12),
(−x) < 0 =⇒ (−x) + x < 0 + x (adding x to both sides),
=⇒ 0 < x by (A2), (A3) and (A4)
and the result follows.
2. For any x ≠ 0, we have x² > 0.
Proof. Since x ≠ 0, we must have either x > 0 or x < 0, by (A10). If
x > 0, then by (A13) we have x² > 0 (take a = 0, b = γ = x). On the
other hand, if x < 0, then −x > 0 by the example above. Hence, by
(A13) (with a = 0, b = γ = (−x)), it follows that (−x)(−x) > 0.
But we know (from the arithmetic properties) that (−α)(−β) = α β, for
any α, β ∈ R and so we have
x² = x.x = (−x)(−x) > 0
as required.
The number 1 was introduced in (A7). If we set x = 1 here, then we
see that 1 = 1² > 0, i.e., 1 > 0. We have deduced that the number 1 is
positive. Nobody would doubt this, but we see explicitly that this is a
consequence of our set-up. Note that it follows from this, by (A12), that
a < a + 1, for any a ∈ R.
3. If a, b ∈ R with a ≤ b, then −a ≥ −b.
Proof. If a = b then certainly −a = −b, so we need only consider the
case when a < b.
a < b =⇒ a + (−a) < b + (−a),   by (A12),
=⇒ 0 < b + (−a),
=⇒ −b < (−b) + b + (−a),   by (A12) and (A4),
=⇒ −b < −a,   by (A1), (A2) and (A4),
and the result follows.
From now on, we will work with real numbers and inequalities just as we
normally would — and will not follow through a succession of steps invoking
the various listed properties as required as we go. Suffice it to say that we
could do so if we wished.
Next, we introduce a very important function, the modulus or absolute
value.
Definition 2.4. For any x ∈ R, the modulus (or absolute value) of x is the
number |x| defined according to the rule
|x| = x,   if x ≥ 0,
|x| = −x,   if x < 0.

For example, |5| = 5, |0| = 0, |−3| = 3 and |−1/2| = 1/2. Note that |x| is never
negative. We also see that |x| = max{ x, −x }.
Let f (x) = x and g(x) = −x. Then |x| = f (x) when x ≥ 0 and
|x| = g(x) when x < 0. Now, we know what the graphs of y = f (x) = x and
y = g(x) = −x look like and so we can sketch the graph of the function |x|.
It is made up of two straight lines, meeting at the origin.
Figure 2.1: The absolute value function |x|.
The basic properties of the absolute value are contained in the following
two propositions. They are used time and time again in analysis and it is
absolutely essential to be fluent in their use.
Proposition 2.5.
(i) For any a, b ∈ R, we have |ab| = |a| |b|.
(ii) For any a ∈ R and r > 0, the inequality |a| < r is equivalent to the
pair of inequalities −r < a < r.
Proof. (i) We just consider the various possibilities. If either a or b is zero,
then so is the product ab. Hence |ab| = 0 and at least one of |a| or |b| is also
zero. Therefore |ab| = 0 = |a| |b|. If both a > 0 and b > 0, then ab > 0 and
we have |ab| = ab, |a| = a and |b| = b and so |ab| = |a| |b| in this case.
Now, if a > 0 but b < 0, then ab < 0 so we have |a| = a, |b| = −b and
|ab| = −ab = |a| |b|. The case a < 0 and b > 0 is similar.
Finally, suppose that both a < 0 and b < 0. Then ab > 0 and we have
|ab| = ab, |a| = −a and |b| = −b. Hence, |ab| = ab = (−a)(−b) = |a| |b|.
(ii) Suppose that |a| < r. Then max{ a, −a } < r and so both a < r
and −a < r. In other words, a < r and −r < a which can be written as
−r < a < r.
On the other hand, if −r < a < r, then both a < r and −a < r so that
max{ a, −a } < r. That is, |a| < r, as required.
Remark 2.6. Putting b = −1 in (i), above, and using the fact that |−1| = 1,
we see that |−a| = |a|.
Proposition 2.7. For any real numbers a and b,
(i) |a + b| ≤ |a| + |b|.
(ii) |a − b| ≤ |a| + |b|.
(iii) ||a| − |b|| ≤ |a − b|.
Proof. (i) We have a + b ≤ |a| + |b| and −(a + b) = −a − b ≤ |a| + |b|. Hence
|a + b| = max{ a + b, −(a + b) } ≤ |a| + |b| .
(ii) Let c = −b and apply (i) to the real numbers a and c to get the
inequality |a + c| ≤ |a| + |c|. But then this means that |a − b| ≤ |a| + |b|.
(iii) We have
|a| = |(a − b) + b| ≤ |a − b| + |b|
by part (i) (with (a − b) replacing a). This implies that |a| − |b| ≤ |a − b|.
Swapping around a and b, we have −(|a| − |b|) = |b| − |a| ≤ |b − a| = |a − b|
and therefore
| |a| − |b| | = max{ |a| − |b| , −(|a| − |b|) } ≤ |a − b|
as required.
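The statements of Propositions 2.5 and 2.7 are also easy to test numerically. The Python sketch below is an added aside, not part of the notes: it samples random pairs of real numbers and checks |ab| = |a| |b|, the triangle inequality and the reverse triangle inequality, allowing a tiny tolerance for floating-point rounding.

    import random

    for _ in range(10_000):
        a = random.uniform(-100.0, 100.0)
        b = random.uniform(-100.0, 100.0)
        assert abs(abs(a * b) - abs(a) * abs(b)) <= 1e-9       # Proposition 2.5 (i)
        assert abs(a + b) <= abs(a) + abs(b) + 1e-12            # Proposition 2.7 (i)
        assert abs(abs(a) - abs(b)) <= abs(a - b) + 1e-12       # Proposition 2.7 (iii)
    print("all sampled pairs satisfy the stated (in)equalities")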
If a and b are real numbers, how far apart are they? For example, if a = 7
and b = 11 then we might say that the distance between a and b is 4. If,
on the other hand, a = 10 and b = −6, then we would say that the distance
between them is 16. In either case, we notice that the distance is given by
|a − b|. It is extremely useful to view |a − b| as the distance between the
numbers a and b. For example, to say that |a − b| is “very small” is to say
that a and b are “close” to each other.
Proposition 2.8. Let a, b ∈ R be given and suppose that for any given ε > 0,
a and b obey the inequality a < b + ε. Then a ≤ b. In particular, if x < ε
for all ε > 0, then x ≤ 0.
Proof. We know that either a ≤ b or else a > b. Suppose the latter were
true, namely, a > b. Set ε1 = a − b. Then ε1 > 0 and a = b + ε1 . Taking
ε = ε1 , we see that this conflicts with the hypothesis that a < b + ε for every
ε > 0 (it fails for the choice ε = ε1 ). We conclude that a > b must be false
and so a ≤ b.
For the last part, simply set a = x and b = 0 to get the desired conclusion.
We have listed a number of properties obeyed by the real numbers:
(A1) . . . (A9) — arithmetic
(A10) . . . (A13) — order.
Is this it? Are there any more to be included? We notice that all of these
properties are satisfied by the rational numbers, Q. Are all real numbers
rational, i.e., is it true that Q = R? Or do we need to consider yet further
properties which distinguish between Q and R? Consider an apparently
unrelated question. Do all numbers have square roots? Since a2 is positive
for any a ∈ R, it is clear that no negative number can have a square root in
R. (Indeed, it is the consideration of C, the complex numbers, which allows
for square roots of negative numbers.) So we ask, does every positive real
number have a square root? Does every natural number n have a square
root in R? In particular, is there such a real number as the square root
of 2? It would be nice to think that there is such a real number. In fact,
according to Pythagoras’ Theorem, this should be the length of the diagonal
of a square whose sides have unit length. The following proposition tells us
that there is certainly no such rational number.
Proposition 2.9. There are no integers m, n ∈ N satisfying m² = 2n². In particular, √2 is not a rational number.

Proof. To say that √2 is rational is to say that there are integers m and n (with n ≠ 0) such that m/n = √2. This means that m²/n² = 2 and so m² = 2n² for m, n ∈ N. (By replacing m or n by −m or −n, if necessary, we may assume that m and n in this last equality are both positive.) So the fact that √2 ∉ Q is a consequence of the first part of the proposition.
Consider the equality

m² = 2n²   (∗)

To show that m² = 2n² is impossible for any m, n ∈ N, suppose the contrary,
namely that there are numbers m and n in N obeying (∗). We will show
that this leads to a contradiction.
Indeed, if m² = 2n², then m² is even. The square of an odd number is odd and so it follows that m must also be even. This means that we can express m as m = 2k for some suitable k ∈ N. But then

(2k)² = m² = 2n²

which means that 2k² = n² and so n² is even. Arguing as above, we deduce that n can be expressed as n = 2j for some j ∈ N. Substituting, we see that k and j also obey (∗), namely, k² = 2j². This tells us that m/2 and n/2 are integers also obeying (∗).
Repeating this whole argument with m′ = m/2 and n′ = n/2, we find that both m′/2 and n′/2 belong to N and also satisfy (∗). In other words, m/2² and n/2² belong to N and obey (∗). We can keep repeating this argument to deduce that m/2ʲ and n/2ʲ are integers obeying (∗). In particular, m/2ʲ ∈ N implies that m/2ʲ ≥ 1 and so m ≥ 2ʲ. But this holds for any j ∈ N and we can take j as large as we wish. We can take j so large that 2ʲ > m. This leads to a contradiction, as we wanted it to. We finally conclude that there are no natural numbers m and n obeying (∗) and, as a consequence, there is no element of Q whose square is equal to 2, that is, √2 is not a rational number.
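The first statement of Proposition 2.9 can at least be probed by a brute-force search; of course a finite search proves nothing by itself, and it is the argument above that rules out all m and n. The Python sketch below is an added illustration with an arbitrarily chosen search bound.

    LIMIT = 1000   # arbitrary search bound for the illustration

    solutions = [(m, n)
                 for m in range(1, LIMIT + 1)
                 for n in range(1, LIMIT + 1)
                 if m * m == 2 * n * n]
    print(solutions)   # prints []: no pair m, n <= 1000 satisfies m² = 2n²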
Remark 2.10. A somewhat similar argument can be used to show that many other numbers do not have square roots in Q. For example, √3 ∉ Q. In fact, one can show that if n ∈ N, then either √n ∈ N, that is, n is a perfect square, or else √n ∉ Q. For example, √16 = 4 ∈ N but √17 ∉ Q.
Returning to the discussion of the defining properties of R, we still have
to pinpoint the extra property that R has which is not shared by Q. First
we need some terminology.
Definition 2.11. A non-empty subset S of R is said to be bounded from above
if there is some M ∈ R such that
a≤M
for all a ∈ S. Any such number M is called an upper bound for the set S.
Evidently, if M is an upper bound for S, then so is any number greater
than M .
We say that a non-empty subset S of R is bounded from below if there
is some m ∈ R such that
m≤a
for all a ∈ S. Any such number m is called a lower bound for the set S. If
m is a lower bound for S, then so is any number less than m.
If S is both bounded from above and from below, then S is said to be
bounded.
Example 2.12. Consider the set A = (−6, 4] . Then A is bounded because
any x ∈ A obeys −6 ≤ x ≤ 4. (In fact, any x ∈ A obeys the inequalities
−6 < x ≤ 4.) Any real number greater than or equal to 4 is an upper bound
for A and any real number less than or equal to −6 is a lower bound for A.
The set A has a maximal element, namely 4, but A does not have a minimal
element.
Let

B = (1, 3/2) ∪ (2, 5/2) ∪ (3, 7/2) ∪ · · · = ⋃_{k=1}^{∞} (k, k + 1/2).
Then the set B is bounded from below (the number 1 is clearly a lower
bound for B). However, B contains k + 1/4 for every k ∈ N and so B is not
bounded from above (so B is not bounded). We also see that B does not
have a minimal element.
Remark 2.13. What does it mean to say that a set S is not bounded from
above? Consider the inequality
x ≤ M.   (∗)
Now, given S and some particular real number M , the inequality (∗) may
hold for some elements x in S but may fail for other elements of S. To say
that S is bounded from above is to say that there is some M such that (∗)
holds for all elements x ∈ S. If S is not bounded from above, then it must
be the case that whatever M we try, there will always be some x in S for
which (∗) fails, that is, for any given M there will be some x ∈ S such that
x > M . In particular, if we try M = 1, then there will be some element
(many, in fact) in S greater than 1. Let us pick any such element and label
it as x1 . Then we have x1 ∈ S and x1 > 1.
We can now try M = 2. Again, (∗) must fail for at least one element
in S and it could even happen that x1 > 2. To ensure that we get a new
element from S, let M = max{ 2, x1 }. Then there must be at least one
element of S greater than this M . Let x2 denote any such element. Then
we have x2 ∈ S and x2 > 2 and x2 ≠ x1 .
Now setting M = max{ 3, x1 , x2 }, we may say that there is some element
in S, which we choose to denote by x3 , such that x3 > 3 and x3 ≠ x1 and x3 ≠ x2 . We can continue to do this and so we see that if S is not bounded
from above, then there exist elements x1 , x2 , x3 , . . . , xn , . . . (which are all
different) such that xn > n for each n ∈ N.
The following concepts play an essential rôle.
Definition 2.14. Suppose that S is a non-empty subset of R which is bounded
from above. The number M is the least upper bound (lub) of S if
(i) a ≤ M for all a ∈ S (i.e., M is an upper bound for S).
(ii) If M′ is any upper bound for S, then M ≤ M′.
If S is a non-empty subset of R which is bounded from below, then the
number m is the greatest lower bound (glb) of S if
(i) m ≤ a for all a ∈ S (i.e., m is a lower bound for S).
(ii) If m′ is any lower bound for S, then m′ ≤ m.
Note that the least upper bound and the greatest lower bound of a set S
need not themselves belong to S. They may or they may not. The least
upper bound is also called the supremum (sup) and the greatest lower bound
is also called the infimum (inf). The ideas are illustrated by some examples.
Examples 2.15.
1. Let S be the following set consisting of 4 elements, S = { −3, 1, 2, 5 }.
Then clearly S is bounded from above and from below. The least upper
bound is 5 and the greatest lower bound is −3.
2. Let S be the interval S = (−6, 4]. Then lub S = 4 and glb S = −6. Note
that 4 ∈ S whereas −6 ∉ S.
3. Let S = (1, ∞). S is not bounded from above and so has no least upper
bound. S is bounded from below and we see that glb S = 1. Note that
glb S ∉ S in this case.
Remark 2.16. Suppose that M is the lub for a set S. Let δ > 0. Then
M − δ < M and since any upper bound M′ for S has to obey M ≤ M′, we
see that M − δ cannot be an upper bound for S. But this means that it
is false that a ≤ M − δ for all a ∈ S. In other words, there must be some
a ∈ S which satisfies M − δ < a. Since M is an upper bound for S, we also
have a ≤ M and so a obeys
M − δ < a ≤ M.
So no matter how small δ may be, there will always be some element a ∈ S
(possibly depending on δ and there may be many) such that M −δ < a ≤ M ,
where M = lub S.
For any δ > 0, there is a ∈ S such that lub S − δ < a ≤ lub S .
Now suppose that m = glb S. Then for any δ > 0 (however small), we
note that m < m + δ and so m + δ cannot be a lower bound for S (because
all lower bounds for S must be less than or equal to m). Hence, there is
some a ∈ S such that a < m + δ, which means that
m ≤ a < m + δ.
For any δ > 0, there is a ∈ S such that glb S ≤ a < glb S + δ .
Remark 2.17. As already noted above, lub S and glb S may or may not
belong to the set S. If it should happen that lub S ∈ S, then in this case
lub S (or sup S) is the maximum element of S, denoted max S. If glb S ∈ S,
then glb S (or inf S) is the minimum element of S, denoted min S.
For example, the interval S = (−2, 5] is bounded and, by inspection, we
see that sup S = 5 and inf S = −2. Since sup S = 5 ∈ S, the set S does
indeed have a maximum element, namely, 5 = sup S. However, inf S ∉ S and so S has no minimum element.
We are now in a position to discuss the final property satisfied by R and
it is precisely this last property which distinguishes R from Q.
(A14) (The completeness property of R)
Any non-empty subset of R which is bounded from above possesses a
least upper bound.
Any non-empty subset of R which is bounded from below possesses a
greatest lower bound.
These statements might appear self-evident, but as we will see, they have
far-reaching consequences. We note here that these two statements are not
independent, in fact, each implies the other, that is, they are equivalent.
Remark 2.18. It is very convenient to think of R as the set of points on a line
(the real line). Indeed, this is standard procedure when sketching graphs of
functions where the coordinate axes represent the real numbers.
Imagine now the following situation.
Figure 2.2: The real line has no gaps.
The set A consists of all points on the line (real numbers) to the left
of the arrow and B comprises all those points to the right. Numbers are
bigger the more they are to the right. The arrow points to the least upper
bound of A (which is also the greatest lower bound of B). The completeness
property (A14) ensures the existence of the real number in R that the arrow
supposedly points to. There are no “gaps” or “missing points” on the real
line. We can think of the integers Z or even the rationals Q as collections
of dots on a line, but it is property (A14) which allows us to visualize R as
the whole “unbroken” line itself.
The next result is so obvious that it seems hardly worth noting. However,
it is very important and follows from property (A14).
Theorem 2.19 (Archimedean Property). For any given x ∈ R, there is some
n ∈ N such that n > x.
Proof. Let x ∈ R be given. We use the method of “proof by contradiction”
— so suppose that there is no n ∈ N obeying n > x. This means that n ≤ x
for all n ∈ N, that is, x is an upper bound for N in R. By the completeness
property, (A14), N has a least upper bound, α, say. Then α is an upper
bound for N so that
n ≤ α   (∗)
for all n ∈ N. Since α is the least upper bound, α − 1 cannot be an upper
bound for N and so there must be some k ∈ N such that α − 1 < k. But
we can rewrite this as α < k + 1 which contradicts (∗) since k + 1 ∈ N. We
conclude that there is some n ∈ N obeying n > x, as claimed.
Corollary 2.20.
(i) For any given δ > 0, there is some n ∈ N such that 1/n < δ.
(ii) For any α > 0, β > 0, there is n ∈ N such that α/n < β.

Proof. (i) Let δ > 0 be given. By the Archimedean Property, there is some n ∈ N such that n > 1/δ. But then this gives 1/n < δ, as required.
(ii) For given α > 0 and β > 0, set δ = β/α. By (i), there is n ∈ N such that 1/n < δ = β/α and so α/n < β.
The next result is no surprise either.
Theorem 2.21. For any a ∈ R, there is a unique integer n ∈ Z such that
n ≤ a < n + 1.
Proof. Let S = { k ∈ Z : k > a }. By theorem 2.19, S is not empty and is
bounded below (by a). Hence, by the completeness property (A14), S has
a greatest lower bound α, say, in R. We have
a≤α≤k
for all k ∈ S. (The inequality a ≤ α follows because a is a lower bound and
α is the greatest lower bound and the inequality α ≤ k follows because α is
a lower bound of S.) Since α is the greatest lower bound, α + 1 cannot be
a lower bound of S and so there is some m ∈ S such that m < α + 1, that
is, m − 1 < α.
Figure 2.3: The integer part of a (on the real line, n = m − 1 ≤ a < m = n + 1).
Now, α is a lower bound for S and m − 1 < α and so m − 1 ∉ S. But
then, by the defining property of S, this means that it is false that m−1 > a.
In other words, we have m − 1 ≤ a. But m ∈ S and so m > a and so m
satisfies m − 1 ≤ a < m. Putting n = m − 1, we get n ∈ Z and n satisfies
the required inequalities n ≤ a < n + 1.
To show the uniqueness of such n ∈ Z, suppose that also n′ ∈ Z obeys n′ ≤ a < n′ + 1. Suppose that n < n′. Then n + 1 ≤ n′ and so the inequalities n′ ≤ a and a < n + 1 give

n′ ≤ a < n + 1 ≤ n′

giving n′ < n′ which is impossible. Similarly, the assumption that n′ < n would lead to the impossible inequality n < n. We conclude that n = n′ which is to say that n is unique.
Remark 2.22. For x ∈ R, let n ∈ Z be the unique integer obeying the
inequalities n ≤ x < n+1. Set r = x−n. Then we see that 0 ≤ x−n = r < 1
and so x = n + r with n ∈ Z and where 0 ≤ r < 1. The unique integer n
here is called the integer part of the real number x and is denoted by [x] (or
sometimes by ⌊x⌋).
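In most programming languages the integer part of Remark 2.22 is the floor function. The small Python sketch below is an added aside illustrating the decomposition x = n + r with n = [x] and 0 ≤ r < 1; note the behaviour for negative x.

    import math

    for x in (3.7, 5.0, -2.3):
        n = math.floor(x)   # the unique integer with n <= x < n + 1
        r = x - n           # the fractional part, satisfying 0 <= r < 1
        print(x, n, r)
    # 3.7  -> n = 3,  r ≈ 0.7
    # 5.0  -> n = 5,  r = 0.0
    # -2.3 -> n = -3, r ≈ 0.7  (the integer part of -2.3 is -3, not -2)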
Theorem 2.23. Between any pair of real numbers a < b, there are infinitely-many rational numbers and also infinitely-many irrational numbers.
Proof. First, we shall show that there is at least one such rational, that is,
we shall show that for any given a < b in R, there is some q ∈ Q such that
a < q < b. The idea of the proof is as follows. If there is an integer between
a and b, then we are done. In any case, we note that since the integers are
spread one unit apart, there should certainly be at least one integer between
a and b if the distance between a and b is greater than 1. If the distance
between a and b is less than 1, then we can “open up the gap” between them
by multiplying both by a sufficiently large (positive) integer, n, say. The
gap between na and nb is n(b − a). Clearly, if n is large enough, this value is
greater than 1. Then there will be some integer m, say, between na and nb,
i.e., na < m < nb. But then we see (since n is positive) that a < m/n < b
and q = m/n is a rational number which does the job.
We shall now write this argument out formally. Let n ∈ N be sufficiently
large that n(b − a) > 1 so that na + 1 < nb and let m = [na] + 1. Since
[na] ≤ na < [na] + 1, it follows that
[na] ≤ na < m = [na] + 1 ≤ na + 1 < nb
and so na < m < nb and hence a < m/n < b. (Note that n > 0, so this last
step is valid.) Setting q = m/n, we have that q ∈ Q and q obeys a < q < b,
as required.
To see that there are infinitely-many rationals between a and b, we just
repeat the above argument but with, say, q and b instead of a and b. This
tells us that there is a rational, q2 , say, obeying q < q2 < b. Once again,
repeating this argument, there is a rational, q3 , say, obeying q2 < q3 < b.
Continuing in this way, we see that for any n ∈ N, there are n rationals,
q, q2 , . . . , qn obeying
a < q < q2 < q3 < · · · < qn < b .
Hence it follows that there are infinitely-many rationals between a and b.
To show that there are infinitely-many irrational numbers between a and b, we use a trick together with the observation that if r is rational, then r/√2 is irrational. The trick is simply to apply the first part to the numbers a√2 and b√2 to deduce that for any n ∈ N there are rational numbers r1 , r2 , . . . , rn obeying

a√2 < r1 < r2 < · · · < rn < b√2 .

Now let µj = rj /√2 for j = 1, 2, . . . , n. Then each µj is irrational and we have

a < µ1 < µ2 < · · · < µn < b

and the result follows.
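The first part of the proof is completely constructive, and the construction can be carried out numerically. The Python sketch below is an added illustration (the function name is invented here): it chooses n with n(b − a) > 1 and returns the rational m/n with m = [na] + 1, which then lies strictly between a and b. Floating-point rounding could, in principle, disturb the comparisons when a and b are extremely close.

    import math
    from fractions import Fraction

    def rational_between(a, b):
        # choose n so large that n*(b - a) > 1, then take m = [na] + 1
        n = math.floor(1 / (b - a)) + 1
        m = math.floor(n * a) + 1
        return Fraction(m, n)          # na < m < nb, hence a < m/n < b

    q = rational_between(1.41, 1.5)
    print(q, 1.41 < q < 1.5)           # 17/12 True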
As a further application of the Completeness Property of R, we shall
show that any positive real number has a positive nth root.
Theorem 2.24. Let x ≥ 0 and n ∈ N be given. Then there is a unique s ≥ 0 such that sⁿ = x. The real number s is called the (positive) nth root of x and is denoted by x^{1/n}.
Proof. If x = 0, then we can take s = 0, so suppose that x > 0.
Let A be the set A = { t ≥ 0 : tⁿ < x }. Then 0 ∈ A and so A is not empty and, by the Archimedean Property, there is some integer K with K > x. But then every t ∈ A must obey t < K because otherwise we would have t ≥ K and therefore tⁿ ≥ Kⁿ ≥ K > x, which is not possible for any t ∈ A. This means that A is bounded from above. By the Completeness Property of R, A has a least upper bound, lub A = s, say. Note now that, since x > 0, by the Archimedean Property there is some m ∈ N such that m > 1/x. Hence mⁿ ≥ m > 1/x which implies that 1/mⁿ < x so that 1/mⁿ ∈ A. This means that s ≥ 1/mⁿ. In particular, s > 0.
Now, exactly one of the statements sⁿ = x, sⁿ < x or sⁿ > x is true. We claim that sⁿ = x and to show this we shall show that the last two statements must be false.
Indeed, suppose that sⁿ < x. For k ∈ N, let sk = s(1 + 1/k). Then evidently sk > s and we will show that skⁿ < x for suitably large k.
Let d = x − sⁿ. Then d > 0 and

x − skⁿ = x − sⁿ + sⁿ − skⁿ = d − (skⁿ − sⁿ) = d − sⁿ((1 + 1/k)ⁿ − 1).

Now, writing α = (1 + 1/k) and noting that 1 < α ≤ 2, we estimate

(1 + 1/k)ⁿ − 1 = αⁿ − 1
= (α − 1)(αⁿ⁻¹ + αⁿ⁻² + · · · + 1)
≤ (α − 1)(2ⁿ⁻¹ + 2ⁿ⁻² + · · · + 1)
≤ (α − 1) n 2ⁿ
= (1/k) n 2ⁿ.

Hence

skⁿ − sⁿ = sⁿ((1 + 1/k)ⁿ − 1) ≤ sⁿ n 2ⁿ / k.

For sufficiently large k, the right hand side of this inequality is less than d and so

x − skⁿ = x − sⁿ + sⁿ − skⁿ = d − (skⁿ − sⁿ) > 0.

It follows that if k is large enough, then sk ∈ A. But sk > s which means that s cannot be the least upper bound of A and we have a contradiction. Hence it must be false that sⁿ < x.
Suppose now that sⁿ > x and let δ = sⁿ − x. For given k ∈ N, let tk = s(1 − 1/k). Writing β = 1 − 1/k and noting that 0 ≤ β ≤ 1, we estimate that

1 − (1 − 1/k)ⁿ = 1 − βⁿ
= −(βⁿ − 1)
= −(β − 1)(βⁿ⁻¹ + βⁿ⁻² + · · · + 1)
= (1 − β)(βⁿ⁻¹ + βⁿ⁻² + · · · + 1)
≤ (1 − β) n
= (1/k) n.

It follows that

sⁿ − tkⁿ = sⁿ(1 − (1 − 1/k)ⁿ) ≤ (1/k) sⁿ n < δ

for sufficiently large k. But then this means that

tkⁿ − x = tkⁿ − sⁿ + sⁿ − x = δ − (sⁿ − tkⁿ) > 0

for large k. However, tk < s and since s = lub A, it follows that tk is not an upper bound for A. In other words, there is some τ ∈ A such that τ > tk and therefore τⁿ − x > tkⁿ − x > 0. However, τ ∈ A means that τⁿ < x which is a contradiction and so it is false that sⁿ > x.
We have now shown that sⁿ < x is false and also that sⁿ > x is false and so we conclude that it must be true that sⁿ = x, as required.
We have established the existence of some s ≥ 0 such that sⁿ = x and so, finally, we must prove that such an s is unique. If x = 0, then s = 0 obeys sⁿ = 0 = x. No s ≠ 0 can obey sⁿ = 0 because sⁿ(1/s)ⁿ = 1 ≠ 0, so s = 0 is the only solution to sⁿ = 0.
Now let s > 0 and t > 0. If s > t, then s/t > 1 so that (s/t)ⁿ > 1 and we find that sⁿ > tⁿ. Interchanging the rôles of s and t, it follows that if s < t, then sⁿ < tⁿ. We conclude that if sⁿ = x = tⁿ then both s < t and s > t are impossible and so s = t.
The proof is complete.
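The set A = { t ≥ 0 : tⁿ < x } in the proof also suggests a practical way of approximating the nth root: its least upper bound can be approached by repeated bisection. The Python sketch below is an added illustration, not part of the proof; it keeps the lower end of the interval inside A and the upper end an upper bound for A.

    def nth_root(x, n, iterations=60):
        # bisection, assuming x > 0: lo always lies in A = {t >= 0 : t**n < x},
        # while hi is always an upper bound for A
        lo, hi = 0.0, max(1.0, x)
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if mid ** n < x:
                lo = mid
            else:
                hi = mid
        return hi

    print(nth_root(2, 2))   # approximately 1.4142135..., the square root of 2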
Principle of induction
Suppose that, for each n ∈ N, P (n) is a statement about the number n such
that
(i) P (1) is true.
(ii) For any k ∈ N, the truth of P (k) implies the truth of P (k + 1).
Then P (n) is true for all n.
Example 2.25. For any n ∈ N,

1² + 2² + 3² + · · · + n² = n(n + 1)(2n + 1)/6 .

Proof. For n ∈ N, let P (n) be the statement that

1² + 2² + 3² + · · · + n² = n(n + 1)(2n + 1)/6 .

Then P (1) is the statement that

1² = 1(1 + 1)(2 + 1)/6

which is true.
Now suppose that k ∈ N and that P (k) is true. We wish to show that
P (k + 1) is also true. Since we are assuming that P (k) is true, we see that
1² + 2² + 3² + · · · + k² + (k + 1)² = k(k + 1)(2k + 1)/6 + (k + 1)²,   using the truth of P (k),
= ( k(k + 1)(2k + 1) + (k + 1)(6k + 6) ) / 6
= (k + 1)(2k² + k + 6k + 6) / 6
= (k + 1)(k + 2)(2k + 3) / 6

which is to say that P (k + 1) is true. By the principle of induction, we conclude that P (n) is true for all n ∈ N.
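Induction establishes the formula for every n; a quick numerical check of the first few cases is nevertheless reassuring. The following Python sketch is an added aside.

    for n in range(1, 1001):
        lhs = sum(k * k for k in range(1, n + 1))
        rhs = n * (n + 1) * (2 * n + 1) // 6
        assert lhs == rhs
    print("1² + 2² + · · · + n² = n(n+1)(2n+1)/6 holds for n = 1, ..., 1000")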
We can rephrase the principle of induction as follows. Let T be the set
given by T = { k ∈ N : P (k) is true }, so k ∈ T if and only if P (k) is true. In
particular, P (1) is true if and only if 1 ∈ T . Hence the principle of induction
may be rephrased as follows.
“ Let T be a set of natural numbers such that 1 ∈ T and such that
if T contains k then it also contains k + 1. Then T = N. ”
Principle of induction (2nd form)
Suppose that Q(n) is a statement about the natural number n such that
(i) Q(1) is true.
(ii) For any k ∈ N, the truth of all Q(1), Q(2), . . . , Q(k) implies the truth
of Q(k + 1).
Then Q(n) is true for all n.
In a nutshell, suppose that Q(1) is true and that

Q(1) true, Q(2) true, Q(3) true, . . . , Q(k) true   =⇒   Q(k + 1) true.

Conclusion: Q(n) is true for all n ∈ N.
This follows from the usual form of the principle.
To see this, let S = { m ∈ N : Q(m) is true }. We shall use the usual form
of induction to show that the hypotheses above imply that S = N.
For any n ∈ N, let P (n) be the statement “{ 1, 2, . . . , n } ⊆ S ”.
Now, by hypothesis, Q(1) is true and so 1 ∈ S. Hence { 1 } ⊆ S which
is to say that P (1) is true.
Next, suppose that the truth of P (k) implies that of P (k +1) and assume
that P (k) is true. This means that { 1, 2, . . . , k } ⊆ S, that is, each of Q(1),
Q(2), . . . , Q(k) is true. But then by the 2nd part of the hypothesis above,
Q(k + 1) is true, that is to say, k + 1 ∈ S. Hence { 1, 2, . . . k, k + 1 } ⊆ S.
But this just tells us that P (k + 1) is true. By induction (usual form), it
follows that P (n) is true for all n ∈ N. This means that { 1, 2, . . . , n } ⊆ S
for all n. In particular, n ∈ S for every n ∈ N, that is, Q(n) is true for all
n ∈ N which is the content of the 2nd form of the principle.
Chapter 3
Sequences
A sequence of real numbers is just a “listing” a1 , a2 , a3 , . . . of real numbers
labelled by N, the set of natural numbers. Thus, to each n ∈ N, there
corresponds a real number an . Not surprisingly, an is called the nth term of
the sequence.
a1 , a2 , a3 , . . . , ak , ak+1 , . . .
Figure 3.1: The sequence (an)n∈N (the terms are labelled by N; ak is the kth term).
Whilst it may seem a trivial comment, it is important to note that the
essential thing about a sequence is that it has a notion of “direction” — it
makes sense to talk about one term being further down the sequence than
another. For example, a101 is further down the sequence than, say, a45 .
It is convenient to denote the above sequence by (an )n∈N or even simply
by (an ). Note that there is no requirement that the terms be different. It is
quite permissible for aj to be the same as an for different j and n. Indeed, one
could have an = α, say, for all n. This is just a sequence with constant terms
(all equal to α) — a somewhat trivial sequence, but a sequence nonetheless.
Remark 3.1. On a more formal level, one can think of a sequence of real
numbers to be nothing but a function from N into R. Indeed, we can define
such a function f : N → R by setting f (n) = an for n ∈ N. Conversely,
any f : N → R will determine a sequence of real numbers, as above, via the
assignment an = f (n).
One might wish to consider a finite sequence such as, say, the four term
sequence a1 , a2 , a3 , a4 . We will use the word sequence to mean an infinite
sequence and simply include the adjective “finite” when this is meant.
Examples 3.2.
1.
1, 4, 9, 16, . . .
Here the general term an is given by the simple formula an = n2 .
2.
2, 3/2, 4/3, 5/4, 6/5, . . .
The general term is an = (n + 1)/n.
3.
2, 0, 2, 0, 2, 0, . . .
Here an = 0 if n is even and an = 2 if n is odd.
This can also be expressed as an = 1 − (−1)ⁿ.
4. Let an be defined by the prescription a1 = a2 = 1 and an = an−1 + an−2
for n ≥ 3. The sequence (an ) is then
1, 1, 2, 3, 5, 8, 13, . . .
These are known as the Fibonacci numbers.
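Sequences given by a formula or by a recurrence, as in these examples, are easily generated by a short program. The Python sketch below is an added aside listing the first few terms of examples 2 and 4.

    # example 2: an = (n + 1)/n
    print([(n + 1) / n for n in range(1, 8)])
    # [2.0, 1.5, 1.333..., 1.25, 1.2, 1.166..., 1.142...]

    # example 4: the Fibonacci numbers, a1 = a2 = 1 and an = a(n-1) + a(n-2)
    fib = [1, 1]
    while len(fib) < 10:
        fib.append(fib[-1] + fib[-2])
    print(fib)   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]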
We are usually interested in the “long-term” behaviour of sequences,
that is, what happens as we look further and further down the sequence.
What happens to an when n gets very large? Do the terms “settle down”
or do they get sometimes big, sometimes small, . . . , or what?
In examples 3.2.1 and 3.2.4, the terms just get huge.
In example 3.2.2, we see, for example, that a99 = 100/99, a10000 = 10001/10000, a_{10²⁰} = (10²⁰ + 1)/10²⁰, . . . , so it looks as though the terms become close to 1.
In example 3.2.3, the terms just keep oscillating between the two values 0
and 2.
In example 3.2.2, we would like to say that the sequence approaches 1
as we go further and further down it. Indeed, for example, the difference
between a_{10¹⁰} and 1 is that between (10¹⁰ + 1)/10¹⁰ and 1, that is, 10⁻¹⁰.
How can we formulate this idea of “convergence of a sequence” precisely?
We might picture a sequence in two ways, as follows. The first is as the graph
of the function n ↦ an . (Notice that we do not join up the dots.)
Figure 3.2: A sequence as a graph.
The second way is just to indicate the values of the sequence on the real
line.
Figure 3.3: Plot the values of the sequence on the real line.
The example 3.2.3 above, would then be pictured either as
Figure 3.4: Graph with values 0 or 2.
or as
Figure 3.5: The values of an are either 0 or 2.
Returning to the general situation now, how should we formulate the
idea that a sequence (an ) “converges to α”? According to our first pictorial
description, we would want the plotted points of the sequence (the graph)
to eventually become very close to the line y = α.
Figure 3.6: The graph gets close to the line y = α.
In terms of the second pictorial description, we would simply demand
that the values of the sequence eventually cluster around the value x = α.
Figure 3.7: The values of (an) cluster around x = α.
If we think of the index n as representing time, then we can think of an
as the value of the sequence at the time n. The sequence can be considered
to have some property “eventually” provided we are prepared to wait long
enough for it to become established. It is very convenient to use this word
“eventually”, so we shall indicate precisely what we mean by it.
We say that a sequence eventually has some particular property if there
is some N ∈ N such that all the terms an after aN (i.e., all an with n > N )
have the property under consideration. (The number N can be thought of
as some offered time after which we are guaranteed that the property under
consideration will hold and will continue to hold.)
As an example of this usage, let (an ) be the sequence given by the prescription an = 100 − n, for n ∈ N. Then a1 = 99, a2 = 98, . . . etc. It is clear
that an is negative whenever n is greater than 100. Thus, we can say that
this sequence (an ) is eventually negative.
Now we can formulate the notion of convergence of a sequence. The idea
is that (an ) converges to the number α if eventually it is as “close” to α
as desired. That is to say, given some preassigned tolerance ε, no matter
how small, we demand that eventually (an ) is close to within ε of α. In
other words, the distance between an and α (as points on the real line) is
eventually smaller than ε.
Definition 3.3. We say that the sequence (an )n∈N of real numbers converges
to the real number α if for any given ε > 0, there is some natural number
N ∈ N such that |an − α| < ε whenever n > N .
α is called the limit of the sequence. In such a situation, we write an → α
as n → ∞ or alternatively limn→∞ an = α.
The use of the symbol ∞ is just as part of a phrase and it has no meaning
in isolation. There is no real number ∞.
Remark 3.4. The positive number ε is the assigned tolerance demanded.
Typically, the smaller ε is, so the larger we should expect N to have to
be. For example, consider the sequence (an ) where an = 1/n. We would
expect that an → 0 as n → ∞. To see this, let ε > 0 be given. (We are
not able to choose this. It is given to us and its actual value is beyond our
control.) It will be true that |an − 0| < ε provided n > 1/ε. So after some
contemplation, we proceed as follows. We are unable to influence the choice
of ε given to us, but once it is given then we can (and must) base our tactics
on it. So let N be any natural number larger than 1/ε. If n > N , then
n > N > 1/ε and so 1/n < ε. That is, if n > N , then |an − 0| = 1/n < ε
and so, according to our definition, we have shown that an → 0 as n → ∞.
Notice that the smaller ε is, the larger N has to be.
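The dependence of N on ε can be made quite concrete. The Python sketch below is an added illustration for the sequence an = 1/n: for each given ε it chooses N to be the first natural number exceeding 1/ε and then checks the defining inequality for a range of n > N.

    import math

    def N_for(eps):
        return math.floor(1 / eps) + 1     # any natural number larger than 1/eps will do

    for eps in (0.1, 0.01, 0.001):
        N = N_for(eps)
        ok = all(abs(1 / n - 0) < eps for n in range(N + 1, N + 1000))
        print(eps, N, ok)
    # the smaller eps, the larger N: here N = 11, 101 and 1001 respectively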
Note that the statement
if n > N then |an − α| < ε
can also be written as
|an − α| < ε whenever n > N
or also as
n > N =⇒ |an − α| < ε.
Also, we should note that the inequality |an − α| < ε telling that the distance
between the real numbers an and α is less than ε can also be expressed by
the pair of inequalities
−ε < an − α < ε
or equivalently by the pair
α − ε < an < α + ε.
This simply means that an lies on the real line somewhere between the two
values α − ε and α + ε. This must happen eventually if the sequence is to
be convergent (to α).
Figure 3.8: The value of an lies within ε of α (an lies in the interval (α − ε, α + ε)).
Example 3.5. Let (an)n∈N be the sequence with

an = (2n + 5)/n

for n ∈ N. Does (an) converge? Looking at the first few terms, we find

(an) = (7, 9/2, 11/3, 13/4, 15/5, 17/6, 19/7, . . . , 205/100, . . . ).

It seems that an → 2 as n → ∞, but we must prove it.
Let ε > 0 be given. We have to show that eventually (an ) is within ε of 2.
We have |an − 2| = |(2n + 5)/n − 2| = 5/n. Now, the inequality 5/n < ε is
the same as n > 5/ε. Let N be any natural number which obeys N > 5/ε.
Then if n > N , we have
n > N > 5/ε
and so 5/n < ε. This means that if n > N then |an − 2| < ε and we have
succeeded in proving that an → 2 as n → ∞.
Example 3.6. Let (an)n∈N be the sequence an = 1/n². We shall show that an → 0 as n → ∞.
Let ε > 0 be given.
We wish to show that there is N ∈ N such that if n > N then |an − 0| < ε, that is, |an − 0| = 1/n² < ε. Now,

1/n² < ε ⇐⇒ n² > 1/ε ⇐⇒ n > 1/√ε

so take N ∈ N to be any natural number satisfying N > 1/√ε. Then if n > N , it follows that n > 1/√ε and so n² > 1/ε which in turn implies that 1/n² < ε and the proof is complete.
Alternatively, we note that 1/n² ≤ 1/n and so if 1/n < ε then it follows that 1/n² ≤ 1/n < ε. So let N ∈ N be any natural number such that N > 1/ε. Then 1/N < ε and so if n > N we have

1/n < 1/N < ε =⇒ 1/n² ≤ 1/n < 1/N < ε.
Example 3.7. Let (an)n∈N be the sequence an = 4/n³ + 1/√n. We shall show that an → 0 as n → ∞.
Let ε > 0 be given.
We must show that

|an − 0| = 4/n³ + 1/√n < ε

whenever n is large enough. To see this, we note that

4/n³ + 1/√n ≤ 4/n + 1/√n ≤ 4/√n + 1/√n = 5/√n .

If the right hand side is less than ε, then so is the left hand side. Let N ∈ N satisfy 5/√N < ε, that is, 25/N < ε² or N > 25/ε². Then if n > N , we may say that 25/n < 25/N < ε² and so

4/n³ + 1/√n ≤ 5/√n < 5/√N < ε

that is, |an − 0| < ε whenever n > N .
Example 3.8. Let |x| < 1 and for n ∈ N, let an = xⁿ. Does (an) converge?
Since |x| < 1, |xⁿ| = |x|ⁿ gets smaller and smaller as n increases, so we might guess that xⁿ → 0 as n → ∞.
Let ε > 0 be given.
We must show that eventually |xⁿ − 0| < ε which is the same as showing that eventually |x|ⁿ < ε. Set d = |x|. Then we wish to show that eventually dⁿ < ε. Notice that d ≥ 0 and so we no longer have to worry about whether x is positive or negative. We have transferred the problem from one about x to one about d.
Consider first the case x = 0. Then also d = 0 and dⁿ = 0 for all n. In particular, if we go through the motions by choosing N = 1, then certainly dⁿ < ε whenever n > N (because dⁿ = 0), which tells us (trivially) that eventually dⁿ < ε and so therefore xⁿ → 0 as n → ∞.
Now suppose that x ≠ 0. Then 0 < |x| < 1, so that 0 < d < 1. Define t by d = 1/(1 + t), that is t = (1 − d)/d. Then t > 0. By the binomial theorem, we have

(1 + t)ⁿ = 1 + nt + (n(n − 1)/2) t² + · · · + tⁿ > nt

for any n ∈ N. Hence

dⁿ = 1/(1 + t)ⁿ < 1/(nt).
We shall use this to estimate dⁿ. If the right hand side is less than ε, then so is the left hand side. To carry this through, let N be any natural number obeying N > 1/(εt). Then this means that 1/(N t) < ε. For any n > N , we therefore have the inequality 1/n < 1/N and (since t > 0) we also have

dⁿ < 1/(nt) < 1/(N t) < ε.

In other words, we have shown that eventually dⁿ is less than ε. In terms of x and an , we have

|an − 0| = |x|ⁿ = 1/(1 + t)ⁿ < 1/(nt) < 1/(N t) < ε

whenever n > N . Hence if |x| < 1 then xⁿ → 0 as n → ∞.
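The argument in Example 3.8 produces an explicit, if very generous, choice of N for given x and ε. The Python sketch below is an added illustration: it computes some N > 1/(εt) with t = (1 − d)/d and d = |x|, and confirms that |x|ⁿ < ε for a sample of n beyond it. The N obtained this way is far larger than strictly necessary, because the estimate (1 + t)ⁿ > nt is crude, but it is a valid choice.

    import math

    def N_for(x, eps):
        d = abs(x)                  # assumes 0 < |x| < 1
        t = (1 - d) / d
        return math.floor(1 / (eps * t)) + 1    # any N > 1/(eps*t) works

    x, eps = 0.9, 1e-3
    N = N_for(x, eps)
    print(N, all(abs(x) ** n < eps for n in range(N + 1, N + 200)))   # 9001 True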
Is it possible for a sequence to converge to two different limits? To
convince ourselves that this is not possible, suppose the contrary. That is,
suppose that (an ) is some sequence which has the property that it converges
both to α and β, say, with α 6= β. Let ε > 0 be given. Then by definition
of convergence, (an ) is eventually within distance ε of α and also (an ) is
eventually within distance ε of β.
Figure 3.9: The sequence (an ) is eventually within ε of both α and β.
As one can see from the figure, if ε is small enough, then the two intervals
(α − ε, α + ε) and (β − ε, β + ε) will not overlap and it will not be possible
for any terms of the sequence (an ) to belong to both of these intervals
simultaneously. We can turn this into a rigorous argument as follows.
Theorem 3.9. Suppose that (an )n∈N is a sequence such that an → α and also
an → β as n → ∞. Then α = β, that is, a convergent sequence has a unique
limit.
Proof. Let ε > 0 be given.
Since we know that an → α, then we are assured that eventually (an ) is
within ε of α. Thus, there is some N1 ∈ N such that if n > N1 then the
distance between an and α is less than ε, i.e., if n > N1 then |an − α| < ε.
Similarly, we know that an → β as n → ∞ and so eventually (an ) is
within ε of β. Thus, there is some N2 ∈ N such that if n > N2 then the
distance between an and β is less than ε, i.e., if n > N2 then |an − β| < ε.
So far so good. What next? To get both of these happening simultaneously,
we let N = max{ N1 , N2 }. Then n > N means that both n > N1 and also
n > N2 . Hence we can say that if n > N then both |an − α| < ε and also
|an − β| < ε.
Now what? We expand out these sets of inequalities. Pick and fix any
n > N (for example n = N + 1 would do). Then
α − ε < an < α + ε
β − ε < an < β + ε.
The left hand side of the first pair together with the right hand side of the
second pair of inequalities gives α − ε < an < β + ε and so
α − ε < β + ε.
Similarly, the left hand side of the second pair together with the right hand
side of the first pair of inequalities gives β − ε < an < α + ε and so
β − ε < α + ε.
Combining these we see that
−2ε < α − β < 2ε
which is to say that |α − β| < 2ε. This happens for any given ε > 0 and
so the non-negative number |α − β| must actually be zero. But this means
that α = β and the proof is complete.
Definition 3.10. We say that the sequence (an )n∈N is bounded from above if
there is some M ∈ R such that
an ≤ M
for all n ∈ N. The sequence (an )n∈N is said to be bounded from below if
there is some m ∈ R such that
m ≤ an
for all n ∈ N. If (an ) is bounded both from above and from below, then we
say that (an ) is bounded.
Examples 3.11.
1. Let an = n + (−1)n n for n ∈ N. Then we see that (an ) is the sequence
given by (an ) = (0, 4, 0, 8, 0, 12, 0, . . . ). Evidently (an ) is bounded from
below (in fact, an ≥ 0) but (an ) is not bounded from above. (There is no
M for which an ≤ M holds for all n. Indeed, for any fixed M whatsoever,
if n is any even natural number greater than M , then an = 2n > n > M .)
2. Let an = 1/n, n ∈ N. It is clear that an obeys 0 ≤ an ≤ 2 for all n
and so (an ) is bounded both from above and from below, that is, (an ) is
bounded.
Proposition 3.12. The sequence (an ) is bounded if and only if there is some
K ≥ 0 such that |an | ≤ K for all n.
Proof. Suppose first that (an ) is bounded. Then there is m and M such
that
m ≤ an ≤ M
for all n. We do not know whether m or M are positive or negative. However,
we can introduce |m| and |M | as follows. For any x ∈ R, it is true that
− |x| ≤ x ≤ |x|. Applying this to m and M in the above inequalities, we see
that
− |m| ≤ m ≤ an ≤ M ≤ |M | .
Let K = max{ |m| , |M | }. Then clearly,
−K ≤ − |m| ≤ m ≤ an ≤ M ≤ |M | ≤ K
which gives the inequalities
−K ≤ an ≤ K
so that |an | ≤ K, for all n, as required.
For the converse, suppose that there is K ≥ 0 so that |an | ≤ K for all n.
Then this can be expressed as
−K ≤ an ≤ K
for all n and therefore (an ) is bounded (taking m = −K and M = K in the
definition).
Theorem 3.13. If a sequence converges then it is bounded.
Proof. Suppose that (an ) is a convergent sequence, an → α, say, as n → ∞.
Then, in particular, (an ) is eventually within distance 1, say, of α. This
means that there is some N ∈ N such that if n > N then the distance
between an and α is less than 1, i.e., if n > N then |an − α| < 1. We can
rewrite this as
−1 ≤ an − α ≤ 1
or
α − 1 ≤ an ≤ α + 1
whenever n > N . This tells us that the tail (an for n > N ) of the sequence is
bounded but what about the whole sequence? This is now easy — we know
about an when n > N so we only still need to take into account the beginning
of the sequence up to the N th term, that is, the terms a1 , a2 , . . . , aN . Let
M = max{ a1 , a2 , . . . , aN , α + 1 } and let m = min{ a1 , a2 , . . . , aN , α − 1 }.
Then certainly α + 1 ≤ M and m ≤ α − 1. Hence if n > N , then
m ≤ an ≤ M.
But by construction of m and M , we also have the inequalities m ≤ an ≤ M
for any 1 ≤ n ≤ N . Piecing together these two parts of the argument, we
conclude that
m ≤ an ≤ M
for any n and we have shown that (an ) is bounded, as required.
Remark 3.14. The converse of this is false. For example, let (an ) be the
sequence with an = (−1)n . Then (an ) = (−1, 1, −1, 1, −1, . . . ) which is
bounded (for example, −1 ≤ an ≤ 1 for all n) but does not converge.
Definition 3.15. A sequence (an ) of real numbers is said to be
(i) increasing if an+1 ≥ an for all n;
(ii) strictly increasing if an+1 > an for all n;
(iii) decreasing if an+1 ≤ an for all n;
(iv) strictly decreasing if an+1 < an for all n.
A sequence satisfying any of these conditions is said to be monotonic or
monotone. It is strictly monotonic if it satisfies either (ii) or (iv).
One reason for an interest in monotonic sequences is the following.
Theorem 3.16. If (an ) is an increasing sequence of real numbers and is
bounded from above, then it converges.
Proof. Suppose then that an ≤ an+1 and that an ≤ M for all n. Let
K = lub{ an : n ∈ N }, so that K is well-defined with K ≤ M . We claim
that an → K as n → ∞.
Let ε > 0 be given. We must show that eventually (an ) is within distance
ε of K. Now, K is an upper bound for { an : n ∈ N } and so an ≤ K for
all n. It is enough then to show that K − ε < an eventually. However, this
is true for the following reason. K − ε < K and K is the least upper bound
of { an : n ∈ N } and so K − ε is not an upper bound for { an : n ∈ N }. This
means that there is some aj , say, with aj > K − ε. But the sequence (an )
is increasing and so an ≥ aj for all n > j. Hence an > K − ε for all n > j.
We have shown that
K − ε < an ≤ K < K + ε
for all n > j. This means that eventually |an − K| < ε and so the proof is
complete.
Remark 3.17. Note that in the course of the proof of the above result, we
have not only shown that (an ) converges but we have actually established
what the limit is — it is the least upper bound of the set of real numbers
{ an : n ∈ N }. Of course, this does not necessarily provide us with the
numerical value of the limit.
It is also worth noting that from this result and the fact that a convergent
sequence is bounded, we can say that an increasing sequence converges if and
only if it is bounded. The sequence (an ) with an = n is clearly increasing.
It is not bounded and so we can say immediately that it does not converge
(which is no surprise, in this case).
Corollary 3.18. Any sequence which is decreasing and bounded from below
must converge.
Proof. Suppose that (bn ) is a sequence which is decreasing and bounded from
below. Then bn+1 ≤ bn for all n and there is some k such that bn ≥ k for
all n. Set an = −bn and K = −k. Then these inequalities become an ≤ an+1
and an ≤ K for all n, that is, (an ) is increasing and is also bounded from
above. By the theorem, we deduce that (an ) converges. Denote its limit by
α and let β = −α. We will show that bn → β as n → ∞ (as one might well
expect). Let ε > 0 be given. Then there is some N ∈ N such that if n > N
then
|an − α| < ε.
In terms of bn and β, the left hand side becomes |−bn + β| which is equal
to |bn − β| and so we have established that
|bn − β| < ε
whenever n > N , which completes the proof.
Example 3.19. Let (an ) be the sequence given by
a1 = 1,
a2 = 1 + 1,
a3 = 1 + 1 + 1/2! ,
a4 = 1 + 1 + 1/2! + 1/3! ,
...
an = 1 + 1 + 1/2! + · · · + 1/(n − 1)! ,
...
This can be written more succinctly as a1 = 1 and an = an−1 + 1/(n − 1)!
for n ≥ 2. Does (an ) converge? It is clear that an+1 > an and so (an )
is increasing (in fact, strictly increasing). If we can show that it is also
bounded then we conclude that it must converge. Can we find K such that
an ≤ K for all n? We have a1 = 1 and for any n ≥ 1
an+1 = 1 + 1 + 1/2 + 1/(2·3) + 1/(2·3·4) + · · · + 1/(2·3 · · · n)
≤ 1 + 1 + 1/2 + 1/2^2 + 1/2^3 + · · · + 1/2^(n−1)
= 1 + (1 − (1/2)^n)/(1 − 1/2) , summing the GP,
= 1 + 2(1 − (1/2)^n)
< 1 + 2 = 3.
Hence the increasing sequence (an ) is bounded above, by 3. We conclude
that (an ) converges. Because it is increasing, we know that its limit is equal
to lub{ an : n ∈ N } = α, say. But an obeys an ≤ 3 and so 3 is an upper
bound for { an : n ∈ N } and therefore lub{ an : n ∈ N } ≤ 3, that is, α ≤ 3.
Of course, α = lub{ an : n ∈ N } ≥ ak for any particular k. Taking k = 3,
we get that α ≥ a3 > 2 and so we can say that 2 < α ≤ 3. In fact, α is just
e (and e = 2.71828 . . . ).
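For illustration only (this computation is not part of the argument), the partial sums an = 1 + 1 + 1/2! + · · · + 1/(n − 1)! can be evaluated directly; they increase, stay below 3, and rapidly approach e ≈ 2.71828.

```python
import math

terms = [1.0]                                  # a1 = 1
for n in range(2, 16):
    terms.append(terms[-1] + 1.0 / math.factorial(n - 1))   # an = a_{n-1} + 1/(n-1)!

# The sequence is increasing and bounded above by 3, in line with Example 3.19.
assert all(terms[i] < terms[i + 1] for i in range(len(terms) - 1))
assert all(t < 3 for t in terms)
print(f"a15 = {terms[-1]:.10f},  e = {math.e:.10f}")
```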
If an → α and bn → β, then we might expect it to be the case that
an + bn → α + β. After all, if (an ) is eventually close to α and (bn ) is
eventually close to β, then it seems quite reasonable to guess that (an + bn )
is eventually close to α + β. This is true, but we must take care with the
details.
Theorem 3.20. Suppose that (an ) and (bn ) are sequences in R.
(i) If an → α as n → ∞, then λ an → λ α as n → ∞, for any λ ∈ R.
(ii) If an → α as n → ∞ and bn → β as n → ∞, then an + bn → α + β
as n → ∞ and also an bn → αβ as n → ∞.
(iii) If an → α as n → ∞ and if bn → β as n → ∞ and if bn ≠ 0 for
all n and if β ≠ 0, then an /bn → α/β as n → ∞.
Proof. (i) Fix λ ∈ R. Let ε > 0 be given. We must show that |λan − λα| < ε
eventually. If λ = 0, then λan = 0 for all n and so it is clear that in this
case λan = 0 → 0 = λα as n → ∞.
So now suppose that λ ≠ 0. Let ε′ > 0. (We will specify ε′ in a moment.)
Then since we know that an → α, it follows that there is some N ∈ N such
that n > N implies that
|an − α| < ε′ .
Now,
|λan − λα| = |λ| |an − α| < |λ| ε′
whenever n > N . If we choose ε′ = ε/ |λ| then we see that
|λan − λα| < |λ| ε′ = ε
whenever n > N . Hence λan → λα, as required.
(ii) Let ε > 0 be given. Suppose ε′ > 0. We will specify the value of ε′ in
a moment. There is N1 ∈ N such that n > N1 implies that |an − α| < ε′ .
Also, there is N2 ∈ N such that n > N2 implies that |bn − β| < ε′ . Set
N = max{ N1 , N2 }. Then if n > N , we see that
|an + bn − (α + β)| = |an − α + bn − β| ≤ |an − α| + |bn − β| < ε′ + ε′ = 2ε′ .
Setting ε′ = ε/2, it follows that if n > N , then
|an + bn − (α + β)| < 2 ε′ = ε,
that is, an + bn → α + β as n → ∞.
To show that an bn → α β, consider first the case xn → 0 and yn → 0.
We shall show that xn yn → 0.
Let ε > 0 be given.
Then we know that there is N1 ∈ N such that if n > N1 then |xn | < √ε.
Similarly, we know that there is N2 ∈ N such that if n > N2 then |yn | < √ε.
Let N = N1 + N2 . Then if n > N , it follows that
|xn yn | < √ε √ε = ε
that is, xn yn → 0 as n → ∞.
Now, in the general case, we simply use previous results to note that
an bn = (an − α)(bn − β) + αbn + an β − αβ
→ 0 + αβ + αβ − αβ
= αβ
as required.
(iii) Now suppose that an → α, bn → β and suppose that bn ≠ 0 for all n
and that β ≠ 0. Let γn = 1/bn and let γ = 1/β. Then an /bn = an γn . To
show that an /bn → α/β, we shall show that γn → γ as n → ∞. The desired
conclusion will then follow from the second part of (ii), above.
We have
|γn − γ| = |1/bn − 1/β| = |β − bn | / (|bn | |β|) .
For large enough n, the numerator is small and the denominator is close to
|β|^2 , so we might hope that the whole expression is small. (Note that it is
imperative here that β ≠ 0.) We shall show that 1/ |bn | is bounded from
above. Indeed, |β| > 0 and so, taking |β|/2 as our “ε”, we can say that there
is some N′ such that n > N′ implies that
|bn − β| < |β|/2 .
Hence, if n > N′ , we have
|β| = |β − bn + bn | ≤ |β − bn | + |bn | < |β|/2 + |bn |
and so |β|/2 < |bn |. If we set K = min{ |b1 | , |b2 | , . . . , |bN′ | , |β|/2 }, then it is
true that K > 0 and |bn | ≥ K for all n. Hence 1/ |bn | ≤ 1/K for all n.
Let ε > 0 be given. Let ε′ = ε K |β|. Since bn → β, there is N such that
n > N implies that
|bn − β| < ε′ .
But then, for any n > N , we have
|γn − γ| = |β − bn | / (|bn | |β|) < ε′ / (K |β|) = ε
and the proof is complete.
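As a numerical illustration of Theorem 3.20 (not a proof, and not in the original notes), one can watch an + bn, an bn and an/bn settle down to α + β, αβ and α/β for a concrete pair of sequences.

```python
# Illustrative check of the algebra of limits for an = 2 + 1/n (alpha = 2)
# and bn = 3 - 1/n**2 (beta = 3).
for n in (10, 100, 1000, 10000):
    a_n = 2 + 1 / n
    b_n = 3 - 1 / n**2
    print(f"n={n:>6}: sum={a_n + b_n:.6f}  product={a_n * b_n:.6f}  quotient={a_n / b_n:.6f}")
# The columns approach 5, 6 and 2/3 respectively, as parts (i)-(iii) predict.
```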
Examples 3.21.
1. Taking an = 1/n, it follows that λ/n → 0 as n → ∞ for any λ ∈ R.
2. Suppose that an → α as n → ∞. Then it follows immediately that
an − α → 0 as n → ∞. Indeed, for any given ε > 0, there is some
N ∈ N such that n > N implies that |an − α| < ε. But |an − α| =
|(an − α) − 0|, so to say that an → 0 is just to say that an − α → 0 as
n → ∞.
3. With an = bn , we see that if an → α, then an^2 → α^2 as n → ∞. Now
with bn = an^2 , it follows that an^3 → α^3 as n → ∞. Repeating this (i.e., by
induction), we see that if an → α as n → ∞, then an^k → α^k as n → ∞
for any given k ∈ N.
4. Let an = (3n^2 − 4)/(2n^2 + 1) for n ∈ N. We can rewrite an as
an = (3 − 4/n^2)/(2 + 1/n^2) .
Then we note that −4/n^2 → 0 and 1/n^2 → 0, so that 3 − 4/n^2 → 3 and
2 + 1/n^2 → 2 as n → ∞. Finally, it follows that an = (3 − 4/n^2)/(2 + 1/n^2) → 3/2
as n → ∞.
5. Let an = (7n^3 − 5n^2 + 3n − 9)/(3n^3 + 4n^2 − 8n + 2). The first thing we do is to divide through
by the highest power of n occurring in the numerator or denominator,
i.e., in this case, by n^3 . So, an can be rewritten as
an = (7 − 5/n + 3/n^2 − 9/n^3)/(3 + 4/n − 8/n^2 + 2/n^3) .
Then we see that the numerator converges to 7 and the denominator
converges to 3 as n → ∞. Hence an → 7/3 as n → ∞.
6. Let an = (n^4 − 8)/(n^7 + 3). Then we have
an = (1/n^3 − 8/n^7)/(1 + 3/n^7)
and so it follows that an → 0/1 = 0 as n → ∞.
7. Let an = (2n^5 + 4)/(n^3 + 6). Then
an = (2 + 4/n^5)/(1/n^2 + 6/n^5) .
Now, the numerator converges to 2 whilst the denominator converges to 0 as n → ∞. The
above theorem about the convergence of an /bn says nothing about the
case when bn or β = lim bn are zero. In this example, we back up and
note that, by inspection, we have
an = (2n^5 + 4)/(n^3 + 6) > 2n^5/(n^3 + 6) ≥ 2n^5/(n^3 + 6n^3) = 2n^2/7 .
It follows that an is not bounded from above and so cannot converge.
8. Suppose that |x| < 1 and consider the sequence an = x^n , for n ∈ N. Then
the sequence bn = |an | = |x|^n is monotone decreasing and is bounded
below (by 0) and so therefore it converges, to ℓ, say: bn → ℓ as n → ∞.
Hence the sequence (b2n ) also converges to ℓ. However,
b2n = |a2n | = |x^(2n)| = |x|^n |x|^n = bn bn → ℓ^2
and so we see that ℓ = ℓ^2 . Therefore either ℓ = 0 or else ℓ = 1. The value
ℓ = 1 is not possible because (bn ) converges to its greatest lower bound
and the value 1 is not a lower bound. Hence ℓ = 0 and we conclude that
|an | → 0 as n → ∞.
Let ε > 0 be given.
Then there is some N such that n > N implies that
||an | − 0| = |an | = |an − 0| < ε
which shows that x^n = an → 0 as n → ∞.
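A rough numerical sketch of Examples 4–8 above (illustrative only): dividing through by the highest power of n makes the limiting behaviour visible, and the unbounded case shows up immediately.

```python
def terms(f, ns=(10, 100, 1000, 10000)):
    return [round(f(n), 6) for n in ns]

# Example 4: (3n^2 - 4)/(2n^2 + 1) -> 3/2
print(terms(lambda n: (3 * n**2 - 4) / (2 * n**2 + 1)))
# Example 5: (7n^3 - 5n^2 + 3n - 9)/(3n^3 + 4n^2 - 8n + 2) -> 7/3
print(terms(lambda n: (7 * n**3 - 5 * n**2 + 3 * n - 9) / (3 * n**3 + 4 * n**2 - 8 * n + 2)))
# Example 6: (n^4 - 8)/(n^7 + 3) -> 0
print(terms(lambda n: (n**4 - 8) / (n**7 + 3)))
# Example 7: (2n^5 + 4)/(n^3 + 6) is unbounded (it grows like 2n^2/7)
print(terms(lambda n: (2 * n**5 + 4) / (n**3 + 6)))
# Example 8: x^n -> 0 for |x| < 1, here with x = -0.8
print(terms(lambda n: (-0.8) ** n))
```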
The next result is very useful.
Proposition 3.22. Suppose that (cn ) is a sequence in R with cn ≥ 0 for all
n ∈ N and such that cn → γ as n → ∞. Then γ ≥ 0. In other words,
the limit of a convergent positive sequence is positive. (Note that we are
using the term positive to mean not strictly negative, so that the value zero
is allowed.)
Proof. Exactly one of γ < 0, γ = 0 or γ > 0 is true. We wish to show that
the first is impossible. To do this, suppose the contrary, that is, suppose
that γ < 0. We will obtain a contradiction from this.
Let ε = −γ. Then according to our hypothesis, ε > 0. We know that
cn → γ as n → ∞ and so we can say that there is some N in N such that
n > N implies that |cn − γ| < ε. How can we use this? Fix any n > N , for
example we could take n = N + 1. The inequality |cn − γ| < ε is equivalent
to the pair of inequalities
−ε < cn − γ < ε .
Recalling that ε = −γ, we find that
cn − γ < ε = −γ .
This tells us that cn < 0 which is false. We have obtained our contradiction
and so we can conclude that, as claimed, it is true that γ ≥ 0.
It is natural to ask whether strict positivity of every cn implies that of
the limit γ, that is, if cn > 0 for all n, can we deduce that necessarily γ > 0?
The answer is no. To show this, we just need to exhibit an explicit example.
Such an example is provided by the sequence cn = 1/n. It is true here that
cn = 1/n > 0 for every n. The sequence (cn ) converges, but its limit is
γ = 0. So we have cn > 0 for all n, cn → γ as n → ∞ but γ = 0.
The following theorem provides a useful technique for exhibiting convergence of a sequence even under circumstances where we do not know
explicitly the values of the terms of the sequence.
Theorem 3.23 (Sandwich Principle). Suppose that (an ), (bn ) and (xn ) are
sequences in R such that
(i) an ≤ xn ≤ bn for all n ∈ N and
(ii) both an → µ and bn → µ as n → ∞.
Then (xn ) converges and its limit is µ.
Proof. Let ε > 0 be given.
The inequalities an ≤ xn ≤ bn can be rewritten as
0 ≤ xn − an ≤ bn − an ,
where we write yn = xn − an and zn = bn − an .
Since both (an ) and (bn ) converge to µ, it follows that zn = bn − an →
µ − µ = 0 as n → ∞. Hence there is some N in N such that n > N implies
that |zn | < ε. But since yn = xn − an ≥ 0, we have |yn | = yn and so n > N
implies that
|yn | = yn ≤ zn = |zn | < ε
which means that yn → 0 as n → ∞. To finish the proof, we observe that
xn = yn + an → 0 + µ as n → ∞ and we are done.
We illustrate this with a proof that any real number can be approximated
by rationals.
Theorem 3.24. Any real number is the limit of some sequence of rational
numbers.
Proof. Let a be any given real number. For each n ∈ N, we know that there
is a rational number qn , say, lying between the numbers a and a + 1/n. That
is, qn satisfies
a ≤ qn ≤ a + 1/n .
Since 1/n → 0, an application of the Sandwich Principle tells us immediately
that qn → a as n → ∞, as required.
Note that a similar proof shows that any real number is the limit of
a sequence of irrational numbers (just replace the adjective “rational” by
“irrational”.) The point though is that even though one might think of the
irrational numbers as somewhat weird, they can nevertheless be approximated as closely as desired by rational numbers.
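The construction in the proof is easy to mimic numerically. The following illustrative sketch uses the explicit (hypothetical but valid) choice qn = ⌈na⌉/n, which does lie in [a, a + 1/n], and watches qn → a.

```python
import math
from fractions import Fraction

a = math.sqrt(2)            # the real number to approximate

for n in (1, 10, 100, 10_000, 1_000_000):
    qn = Fraction(math.ceil(n * a), n)   # a rational with a <= qn <= a + 1/n
    print(f"n={n:>8}: qn = {qn}   error = {float(qn) - a:.2e}")
# The errors are at most 1/n, so qn -> a, as the Sandwich Principle shows.
```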
Subsequences
Consider the sequence (an ) given by an = sin(nπ/2) for n ∈ N. Evidently,
an = 0 if n is even and alternates between ±1 for n odd. For example, the
first 5 terms are a1 = 1, a2 = 0, a3 = −1, a4 = 0, a5 = 1.
Next, consider the sequence
(bn ) = (1, 23 , 13 , 54 , 51 , . . . ) .
This is given by
(
bn =
1
n,
n
n+1
for n odd
, for n even.
We notice that the odd terms approach 0 whereas the even terms approach 1.
These two examples suggest that we might well be interested in considering
certain terms of a sequence in isolation from the original sequence. This idea
is formalized in the concept of a subsequence of a sequence. Roughly speaking, a subsequence of a sequence is simply any sequence obtained by leaving
out particular terms from the original sequence. For example, the even terms
a2 , a4 , a6 , . . . form a subsequence of the sequence (an ). Another subsequence
of (an ) is obtained by considering, say, every tenth term, a10 , a20 , a30 , . . . .
Definition 3.25. Let (an ) be a given sequence. A subsequence of (an )n∈N is
any sequence of the form (an1 , an2 , an3 , . . . ) where n1 < n2 < n3 < . . . is
any (strictly increasing) sequence of natural numbers.
We can express this somewhat more formally as follows. A sequence
(bk )k∈N is a subsequence of the sequence (an )n∈N if there is some mapping
ϕ : N → N such that i < j implies that ϕ(i) < ϕ(j) (i.e., ϕ is strictly
increasing) and such that bk = aϕ(k) for each k ∈ N. This agrees with the
above formulation if we simply set ϕ(k) = nk and put bk = aϕ(k) = ank . (It
really just amounts to a matter of notation.)
Of course, (bk )k∈N is a sequence in its own right and so one can consider
subsequences of (bk ). Evidently, a subsequence of (bk ) is also a subsequence
of (an )n∈N . This is intuitively clear. We get a subsequence of (bk ) by leaving
out some of its terms. However, (bk ) itself was obtained from (an ) by leaving
out various terms of (an ), so if we leave out both lots in one step, we get our
subsequence of (bk ) directly from (an ). To see this more formally, suppose
that (cj )j∈N is a subsequence of (bk )k∈N . Then there is a strictly increasing
map ψ : N → N such that cj = bψ(j) for all j ∈ N. However, since (bk ) is a
subsequence of (an ), there is a strictly increasing map ϕ : N → N such that
bk = aϕ(k) for all k ∈ N. This means that we can write cj as
cj = bψ(j) = aϕ(ψ(j))
for j ∈ N. Let γ : N → N be the map γ(j) = ϕ(ψ(j)). Evidently γ is strictly
increasing and cj = aγ(j) for j ∈ N. This shows that (cj )j∈N is a subsequence
of (an )n∈N .
Remark 3.26. Let (anj ) be a subsequence of (an ). It is intuitively clear that,
say, the 20th term of (anj ) has to be at least the 20th term of (an ). In
general, the term anj is at least as far along the (an ) sequence as the j th or,
in other words, nj ≥ j.
We will verify this by induction. For j ∈ N, let P (j) be the statement
“ nj ≥ j ”. Now, nj ∈ N and so, in particular, n1 ≥ 1, which means that
P (1) is true. Fix j ∈ N and suppose that P (j) is true. We will show that
this implies that P (j + 1) is also true. Indeed, nj is strictly increasing in j
and so we have
nj+1 > nj ≥ j ,
by the induction hypothesis that P (j) is true.
Since all quantities under consideration are integer-valued, we deduce that
nj+1 ≥ j + 1, i.e., P (j + 1) is true. It follows, by induction, that P (j) is true
for all j ∈ N.
Proposition 3.27. Suppose that (an ) converges to α. Then so does every
subsequence of (an ).
Proof. Let (ank )k∈N be any subsequence of (an ) whatsoever. We wish to
show that ank → α as k → ∞.
Let ε > 0 be given.
Now, we know that an → α as n → ∞. Therefore, we are assured that
there is some N ∈ N such that n > N implies that |an − α| < ε. But (ank )
is a subsequence of (an ) and so we know that nk ≥ k for all k ∈ N. It
follows that if k > N , then certainly nk > N . Hence, k > N implies that
|ank − α| < ε and the proof is complete.
Remark 3.28. The proposition tells us that if (an ) converges, then so does
any subsequence, and to the same limit.
Consider the sequence an = (−1)n . Then we see that a2n = 1 for
all n, whereas a2n−1 = −1 for all n, so that (an ) certainly possesses two
subsequences which both converge but to different limits. Consequently, the
original sequence cannot possibly converge. (If it did, every subsequence
would have to converge to the same limit, namely, the limit of the original
sequence.)
Bolzano-Weierstrass Theorem
Before we launch into one of the most important results of real analysis, let
us make one or two observations regarding upper and lower bounds.
Suppose that A and B are subsets of R with A ⊆ B. If M is such that
b ≤ M for all b ∈ B, then certainly, in particular, a ≤ M for all a ∈ A. In
other words, an upper bound for B is also (a fortiori) an upper bound for
any subset A of B. Now, lub B is an upper bound for B and so lub B is
certainly an upper bound for A. It follows that
lub A ≤ lub B .
It is possible for the inequality here to be strict. For example, if A is the
interval A = [1, 2] and B is the interval B = [0, 3], then A ⊂ B and evidently
lub A = 2 whereas lub B = 3, so that lub A < lub B in this case.
Similarly, we note that if m is a lower bound for B, then m is also a
lower bound for A and so
glb B ≤ glb A .
With the example A = [1, 2] and B = [0, 3], as above, we see that glb B = 0
and glb A = 1.
Theorem 3.29 (Bolzano-Weierstrass Theorem). Any bounded sequence of
real numbers possesses a convergent subsequence.
(In other words, if (an )n∈N is a bounded sequence in R, then there is a
strictly increasing sequence (nk )k∈N of natural numbers such that (ank )k∈N
converges.)
Proof. Suppose that M and m are upper and lower bounds for (an ),
m ≤ an ≤ M
(∗)
The idea of the proof is to construct a certain bounded monotone decreasing
sequence and use the fact that this converges to its greatest lower bound
and to drag a suitable subsequence of (an ) along with this.
We construct the first element of the auxiliary monotone sequence. Let
M1 = lub{ an : n ∈ N }. Then M1 −1 is not an upper bound for { an : n ∈ N }
and so there must be some n1 , say, in N such that
M1 − 1 < an1 ≤ M1 .
(The value 1 subtracted here (from M1 ) is not important. We could have
chosen any positive number. However, we shall repeat this process and we
require a sequence of positive numbers which converge to 0. The numbers
1, 1/2, 1/3, . . . suit our purpose.) We note that M1 is an upper bound for (an )
and so, in particular, it is an upper bound for the set { an : n > n1 }.
Next, we construct M2 as follows. Let M2 = lub{ an : n > n1 }. Then
M2 ≤ M1 and moreover, M2 − 1/2 is not an upper bound for { an : n > n1 }
and so there is some n2 > n1 such that
M2 − 1/2 < an2 ≤ M2 .
The way ahead is now clear. Let M3 = lub{ an : n > n2 }. Then
M3 ≤ M2 and since M3 − 1/3 is not an upper bound for { an : n > n2 } there
must be some n3 > n2 such that
M3 − 1/3 < an3 ≤ M3 .
Continuing in this way, we construct a sequence (Mj )j∈N and a sequence
(nj )j∈N of natural numbers such that Mj+1 ≤ Mj , nj+1 > nj , and
Mj − 1/j < anj ≤ Mj
for all j ∈ N.
Now we note that m ≤ anj ≤ Mj and so (Mj ) is a decreasing sequence
which is bounded from below. It follows that Mj → µ as j → ∞, where
µ = glb{ Mj : j ∈ N }. We are not interested in the value of this limit µ.
All we need to know is that the sequence (Mj )j∈N converges to something.
However, by our very construction,
Mj − 1/j < anj ≤ Mj
and so, by the Sandwich Principle, Theorem 3.23, anj → µ as j → ∞.
We have succeeded in exhibiting a convergent subsequence, namely the
subsequence (anj )j∈N and the proof is complete.
Remark 3.30. Note that the theorem does not tell us anything about the
subsequence or its limit. Indeed, it cannot, because we know nothing about
our original sequence other than the fact that it is bounded. It can also
happen that there are many convergent subsequences with different limits.
It is easy to construct such examples. For example, let (un ), (vn ) and (wn )
be any three given convergent sequences, say, un → u, vn → v and wn → w.
We construct the sequence (an ) as follows:
(a1 , a2 , a3 , a4 , . . . , ) = (u1 , v1 , w1 , u2 , v2 , w2 , u3 , . . . ) .
In other words, the three sequences (un ), (vn ) and (wn ) are dovetailed to
form (an ). Explicitly, for n ∈ N,
an = uk if n = 3k − 2 for some k ∈ N,
an = vk if n = 3k − 1 for some k ∈ N,
an = wk if n = 3k for some k ∈ N.
Evidently, if u, v and w are different, then the sequences (a3j−2 )j∈N =
(uj )j∈N , (a3j−1 )j∈N = (vj )j∈N and (a3j )j∈N = (wj )j∈N are three convergent
subsequences of (an )n∈N with different limits.
Let us say that a real number µ is a limit point of a given sequence if it
is the limit of some convergent subsequence. Then in this terminology, the
real numbers u, v and w are limit points of the sequence (an ).
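As an illustration (not from the notes), the dovetailing construction is easy to carry out concretely: interleaving three convergent sequences produces a single sequence with three limit points, recovered by the subsequences of indices 3k − 2, 3k − 1 and 3k.

```python
# Dovetail uk = 1/k, vk = 2 + 1/k, wk = 5 - 1/k into one sequence (an).
K = 2000
u = [1 / k for k in range(1, K + 1)]
v = [2 + 1 / k for k in range(1, K + 1)]
w = [5 - 1 / k for k in range(1, K + 1)]

a = []
for k in range(K):
    a.extend([u[k], v[k], w[k]])   # a_{3k-2}=u_k, a_{3k-1}=v_k, a_{3k}=w_k

# The three subsequences converge to the three limit points 0, 2 and 5.
print(a[::3][-1], a[1::3][-1], a[2::3][-1])
```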
Next, we need a little more terminology.
Definition 3.31. A sequence (an )n∈N is said to be a Cauchy sequence (also
known as a fundamental sequence) if it has the property that for any given
ε > 0 there is some N ∈ N such that both n > N and m > N imply that
|an − am | < ε .
In other words, eventually the distance between any two terms of the sequence is less than ε.
Proposition 3.32. Every Cauchy sequence is bounded.
Proof. Suppose that (an ) is a Cauchy sequence. Then we know that there
is some N ∈ N such that both n > N and m > N imply that
|an − am | < 1 .
(The value 1 on the right hand side here is not at all critical. We could have
selected any positive real number instead, with obvious modifications to the
following reasoning.) In particular, for any j > N ,
|aj | ≤ |aj − aN +1 | + |aN +1 | < 1 + |aN +1 | .
It follows that if we let M = 1 + max{ |ai | : 1 ≤ i ≤ N + 1 }, then we have
|ak | ≤ M
for all k ∈ N. This shows that (an ) is bounded.
We have seen that a bounded monotone sequence must converge. The
next theorem is very important as it gives us a necessary and sufficient
condition for convergence of a sequence.
Theorem 3.33. A sequence converges in R if and only if it is a Cauchy
sequence.
Proof. We must show that any Cauchy sequence has to converge and, conversely, that any convergent sequence is a Cauchy sequence.
So suppose that (an )n∈N is a Cauchy sequence. We must show that there
is some α such that an → α as n → ∞. At first, this might seem impossible
because there is no way of knowing what α might be. However, it turns out
that we do not need to know the actual value of α but rather just that it
does exist. Indeed, we have seen that a Cauchy sequence is bounded and the
Bolzano-Weierstrass Theorem tells us that a bounded sequence possesses a
convergent subsequence. We shall show that this is enough to guarantee
that the sequence itself converges.
Let ε > 0 be given.
As noted above, we know that (an ) has some convergent subsequence, say
ank → α as k → ∞. We shall show that an → α by an ε/2-argument. Since
we know that ank → α as k → ∞, we can say that there is k0 ∈ N such that
k > k0 implies that
|ank − α| < ε/2 .
Since (an ) is a Cauchy sequence, there is N0 such that both n > N0 and
m > N0 imply that
|an − am | < ε/2 .
Let N = max{ k0 , N0 }. Now, if k > N it follows that also nk > N (since
nk ≥ k) and so if k > N then
|ak − α| ≤ |ak − ank | + |ank − α| < ε/2 + ε/2 = ε .
Thus ak → α as k → ∞ as required.
Next, suppose that (an ) converges. We must show that (an ) is a Cauchy
sequence.
Let ε > 0 be given.
We use an ε/2-argument. Let α denote limn→∞ an . Then there is N ∈ N
such that n > N implies that
|an − α| < ε/2 .
But then if both n > N and m > N , we have
|an − am | ≤ |an − α| + |α − am | < ε/2 + ε/2 = ε
which verifies that (an ) is indeed a Cauchy sequence, as claimed.
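A numerical illustration of the Cauchy property (illustrative only, using material already at hand): for the convergent sequence an = 1 + 1/2^2 + · · · + 1/n^2 the gaps between far-out terms shrink, whereas for the bounded sequence bn = (−1)^n consecutive terms always differ by 2, so it is not Cauchy and does not converge.

```python
def partial_sums_inv_sq(N):
    s, out = 0.0, []
    for n in range(1, N + 1):
        s += 1 / n**2
        out.append(s)
    return out

a = partial_sums_inv_sq(200_000)             # convergent, hence Cauchy
b = [(-1) ** n for n in range(1, 200_001)]   # bounded but not Cauchy

for N in (100, 1_000, 10_000, 100_000):
    print(f"N={N:>6}:  |a_2N - a_N| = {a[2*N - 1] - a[N - 1]:.2e}   "
          f"|b_(N+1) - b_(N+2)| = {abs(b[N] - b[N + 1])}")
# The first column shrinks to 0; the second column is always 2.
```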
Some special sequences
Example 3.34. What happens to c^(1/n) as n → ∞ for given fixed c > 0 ?
To investigate this, let c > 0 and consider the sequence given by (c^(1/n)) =
(c, c^(1/2), c^(1/3), c^(1/4), . . . ). Suppose first that c > 1. Then c^(1/n) > 1. For n ∈ N,
let dn be given by dn = c^(1/n) − 1, so that dn > 0 and c^(1/n) = 1 + dn . Hence,
by the binomial theorem,
c = (1 + dn)^n = 1 + n dn + (n(n − 1)/2) dn^2 + · · · + dn^n ≥ 1 + n dn .
It follows that c − 1 ≥ n dn and so we have
0 < dn ≤ (c − 1)/n .
It follows from the Sandwich Principle that dn → 0 as n → ∞. Hence, for
any c > 1, c^(1/n) = 1 + dn → 1 as n → ∞.
If c = 1, then evidently c^(1/n) = 1^(1/n) = 1 → 1 as n → ∞.
Now suppose that 0 < c < 1. Set γ = 1/c so that γ > 1. Then from the
above, c^(1/n) = (1/γ)^(1/n) = 1/(γ^(1/n)) → 1 as n → ∞.
We conclude that c^(1/n) → 1 as n → ∞ for any fixed c > 0.
c^(1/n) → 1 as n → ∞ for any fixed c > 0
Example 3.35. What happens to n^(1/n) as n → ∞ ? There is conflicting
behaviour here. Taking the nth root would tend to make things smaller, but
one is taking the nth root of n which itself gets larger. It is not immediately
clear what will happen.
Define kn by n^(1/n) = 1 + kn (so that kn = n^(1/n) − 1). Then kn > 0 for all
n > 1. We shall show that kn → 0 as n → ∞. To see this, notice that for
any n > 1
n = (1 + kn)^n = 1 + n kn + (n(n − 1)/2) kn^2 + · · · + kn^n > (n(n − 1)/2) kn^2 .
Hence, for n > 1,
0 < kn < √2 / √(n − 1)
and by the Sandwich Principle, we deduce that kn → 0 as n → ∞. Hence
n^(1/n) = 1 + kn → 1 as n → ∞.
n^(1/n) → 1 as n → ∞
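Both limits are easy to watch numerically. This is an illustration, not a proof:

```python
for n in (10, 100, 1_000, 100_000):
    print(f"n={n:>7}:  5^(1/n) = {5 ** (1/n):.6f}   "
          f"0.2^(1/n) = {0.2 ** (1/n):.6f}   n^(1/n) = {n ** (1/n):.6f}")
# All three columns approach 1, in line with Examples 3.34 and 3.35.
```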
Example 3.36. What happens to c^n/n! as n → ∞ for fixed c ∈ R ? If c > 1,
then c^n gets large as n grows but so does the denominator n!. There is
conflicting behaviour here, so it is not obvious what does happen.
For any c ∈ R, choose an integer k ∈ N such that k > |c|. The (k + m)th
term of the sequence is
c^(k+m)/(k + m)! = (c^k/k!) · c^m/((k + 1)(k + 2) · · · (k + m)) .
We have
0 ≤ |c^(k+m)/(k + m)!| = (|c^k|/k!) · |c|^m/((k + 1)(k + 2) · · · (k + m))
≤ (|c^k|/k!) · |c|^m/k^m = (|c^k|/k!) γ^m
where γ = |c|/k < 1. Now let aj = |c^j|/j! for 1 ≤ j ≤ k and let aj = (|c^k|/k!) γ^m
for j = k + m with m ≥ 1. Then evidently aj → 0 as j → ∞ and
0 ≤ |c^j/j!| ≤ aj .
By the Sandwich Principle, it follows that |c^j/j!| → 0 and hence we also
have c^j/j! → 0 as j → ∞.
c^j/j! → 0 as j → ∞ for any fixed c ∈ R
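A quick numerical look at c^n/n! (illustrative only), including a negative value of c:

```python
from math import factorial

for c in (5.0, -7.5):
    vals = [c ** n / factorial(n) for n in (1, 5, 10, 20, 40)]
    print(c, [f"{v:.3e}" for v in vals])
# In each case the factorial eventually overwhelms c^n and the terms tend to 0.
```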
Example 3.37. What happens to √(n + 1) − √n as n → ∞ ? Each of the
two terms becomes large but what about their difference? To see what does
happen, we use a trick and write
0 < √(n + 1) − √n = (√(n + 1) − √n)(√(n + 1) + √n) / (√(n + 1) + √n)
= ((n + 1) − n) / (√(n + 1) + √n)
= 1/(√(n + 1) + √n) < 1/√n
and so by the Sandwich Principle, we deduce that √(n + 1) − √n → 0 as
n → ∞.
√(n + 1) − √n → 0 as n → ∞
Example 3.38. Let 0 < a < 1 and let k ∈ N be fixed. What happens to n^k a^n
as n → ∞ ? The term n^k gets large but the term a^n becomes small as n
grows. We have conflicting behaviour.
To investigate this, first let us note that n^k = (n^(k/n))^n and also that
n^(k/n) = (n^(1/n))^k → 1^k = 1 as n → ∞. It follows that n^(k/n) a → a as n → ∞.
Let r obey a < r < 1. Then eventually n^(k/n) a < r (because with ε = r − a,
eventually n^(k/n) a − a < ε = r − a). It follows that eventually
0 < n^k a^n = (n^(k/n) a)^n < r^n .
But r^n → 0 and so by the Sandwich Principle we conclude that n^k a^n → 0
as n → ∞. (There is N ∈ N such that n > N implies that
0 < n^k a^n = (n^(k/n) a)^n < r^n .
For 1 ≤ n ≤ N , set bn = n^k a^n and for n > N set bn = r^n . Then bn → 0 as
n → ∞ and we have
0 < n^k a^n ≤ bn
and so the Sandwich Principle tells us that n^k a^n → 0 as n → ∞.)
n^k a^n → 0 as n → ∞ for any fixed 0 < a < 1
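As a numerical illustration of Example 3.38 (not a proof), with a = 0.9 and k = 3 the terms n^k a^n rise at first and then collapse towards 0:

```python
a, k = 0.9, 3
for n in (1, 10, 30, 100, 300, 1000):
    print(f"n={n:>5}:  n^{k} * a^n = {n**k * a**n:.6e}")
# The terms peak (around n = 28 or so for these values) and then tend to 0.
```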
Sequences of functions
Just as one can have a sequence of real numbers, so one can have a sequence
of functions. By this is simply meant a family of functions labelled by the
natural numbers N. Consider, then, a sequence (fn )n∈N of functions. For
each given x, the sequence (fn (x))n∈N is just a sequence of real numbers,
as considered already. Here, as always, fn (x) is the notation for the value
taken by the function fn at the real number x. In this way, we get many
sequences — one for each x. Now, for some particular values of x the
sequence (fn (x))n∈N may converge whereas for other values of x it may not.
Even when it does converge, its limit will, in general, depend on the value
of x. These various values of the limit themselves determine a function of x.
This leads to the following notion of convergence of a sequence of functions.
Definition 3.39. Suppose that (fn )n∈N is a sequence of functions each defined
on a particular subset S in R. We say that the sequence (fn )n∈N converges
pointwise on S to the function f if for each x ∈ S the sequence (fn (x))n∈N of
real numbers converges to the real number f (x). We write fn → f pointwise
on S as n → ∞.
Some examples will illustrate this important idea.
Examples 3.40.
1. Let fn (x) = xn and let S be the open interval S = (−1, 1). We have
seen that xn → 0 as n → ∞ for any x with |x| < 1. This simply says
that (fn ) converges pointwise to f = 0, the function identically zero on
the set (−1, 1).
2. Let fn (x) = x^n as above, but now let S = (−1, 1]. Then for |x| < 1,
we know that fn (x) = x^n → 0 as n → ∞. Furthermore, with x = 1,
we have fn (1) = 1^n = 1, so that fn (1) → 1 as n → ∞. Let f be the
function on S = (−1, 1] given by
f (x) = 0 for −1 < x < 1,   f (x) = 1 for x = 1.
Then we can say that fn → f pointwise on (−1, 1].
3. Once again, let fn (x) = xn but now let S be the interval S = [−1, 1].
We know that for each x ∈ (−1, 1], the sequence (fn (x)) of real numbers
converges. We must investigate what happens for x = −1. We see that
fn (−1) = (−1)n , so that the sequence (fn (−1))n∈N of real numbers does
not converge. This means that there does not exist a function f on
[−1, 1] with the property that fn (−1) → f (−1). The conclusion is that
in this case (fn ) does not converge pointwise on [−1, 1] to any function
at all.
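For illustration (not part of the notes), the pointwise behaviour of fn(x) = x^n on [−1, 1] can be tabulated directly: the limit is 0 on (−1, 1), it is 1 at x = 1, and at x = −1 the values keep oscillating.

```python
xs = [-1.0, -0.5, 0.0, 0.5, 0.9, 1.0]
for n in (1, 5, 25, 100, 101):
    row = "  ".join(f"{x**n:+.4f}" for x in xs)
    print(f"n={n:>3}: {row}")
# Columns with |x| < 1 tend to 0, the x = 1 column is constantly 1,
# and the x = -1 column alternates between -1 and +1 (no limit).
```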
These examples illustrate the obvious but nevertheless crucial point that
pointwise convergence of a sequence of functions involves not only a particular sequence of functions but also the set on which the pointwise convergence
is to be considered to take place. The notion of “pointwise convergence” only
makes sense when used together with the set to which it refers.
Chapter 4
Series
Given a sequence a1 , a2 , . . . we wish to discuss the “infinite sum”
a1 + a2 + a3 + . . .
Such an expression is called an infinite series and is denoted by Σ_{k=1}^∞ ak .
We shall attempt to interpret such a series as a suitable limiting object. To
this end, let sn be the so-called nth partial sum
sn = Σ_{k=1}^n ak = a1 + · · · + an .
Then as n becomes larger, so sn looks more like the series Σ_{k=1}^∞ ak . Of
course, there is the matter of convergence to be considered. The point is
that one can always write down the expression Σ_{k=1}^∞ ak but without some
extra discussion it is not at all clear what it actually means. It is certainly a
combination of symbols, but does it have any reasonable interpretation as
a real number? For example, if it happens to be the case that ak = 1 for
every k, then Σ_{k=1}^∞ ak = Σ_{k=1}^∞ 1. What does this mean? We see that in this
special case, sn = n which gets large. The answer is that Σ_{k=1}^∞ ak simply
has no meaning in this case. We say that the series Σ_{k=1}^∞ ak diverges.
As another example, suppose that ak = (−1)^(k+1) . Then ak = 1 for odd k
and is otherwise equal to −1. Then
Σ_{k=1}^∞ ak = 1 − 1 + 1 − 1 + 1 − 1 + . . .
— which means what exactly? In this example, we see that sn = 1 if n is
odd but is zero if n is even. The partial sums flip interminably between the
two values 1 and 0.
Definition 4.1. The series Σ_{k=1}^∞ ak is said to be convergent if the sequence
of partial sums (sn )n∈N converges.
If sn → α as n → ∞, then α is said to be the sum of the series and the
expression Σ_{k=1}^∞ ak is defined to be this limit α.
A series which is not convergent is said to be divergent.
Example 4.2. Let ak = (1/3)^k , so that
Σ_{k=1}^∞ ak = Σ_{k=1}^∞ 1/3^k .
We see that the partial sums are given by
sn = 1/3 + · · · + (1/3)^n = (1/3 − (1/3)^(n+1))/(1 − 1/3) = 1/2 − (1/2)(1/3)^n → 1/2
as n → ∞. Hence
Σ_{k=1}^∞ 1/3^k = 1/2 .
Note that the same argument shows that
Σ_{k=1}^∞ x^k = x/(1 − x)   (∗)
for any x with |x| < 1. (The requirement that |x| < 1 ensures that x^n → 0
as n → ∞.)
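The partial sums of the geometric series can be checked directly against the formula (∗). This is an illustrative sketch only:

```python
def geometric_partial_sum(x, n):
    # s_n = x + x^2 + ... + x^n
    return sum(x ** k for k in range(1, n + 1))

for x in (1/3, -0.5, 0.9):
    sn = geometric_partial_sum(x, 200)
    print(f"x={x:+.4f}:  s_200 = {sn:.10f}   x/(1-x) = {x/(1-x):.10f}")
# For |x| < 1 the partial sums approach x/(1-x); the smaller |x| is, the faster.
```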
Note that if we were to ignore the fact that it was not valid but go ahead
anyway and simply set x = 1 in the above formula (∗), then we would have
Σ_{k=1}^∞ 1 on the left hand side and 1/0 on the right hand side — neither of
which have meaning as real numbers. Again, if we ignore the fact that it
is invalid but anyway set x = −1 in (∗), then the left hand side becomes
−(1 − 1 + 1 − 1 + . . . ) and the right hand side becomes −1/2 which might lead
one to suggest that 1 − 1 + 1 − 1 + . . . is in some sense equal to 1/2. The fact
is that 1 − 1 + 1 − 1 + . . . has no sensible interpretation as a real number.
Returning to the series itself and setting x = 5, say, we see that the
partial sum sn = 5 + · · · + 5^n ≥ 5^n so that the sequence (sn ) does not
converge (it is not bounded) and therefore Σ_{k=1}^∞ 5^k is divergent. That’s it
— nothing more to say. The expression Σ_{k=1}^∞ 5^k does not represent a real
number and it cannot be manipulated as if it did. (It is tempting to say
that Σ_{k=1}^∞ 5^k has no meaning at all. However, it does implicitly carry with
it the discussion here to the effect that the sequence of partial sums does
not converge.)
The following divergence test allows us to immediately spot certain series
as being divergent.
Proposition 4.3 (Test for divergence). Suppose that the sequence (an ) fails
to converge to 0 as n → ∞. Then the series Σ_{k=1}^∞ ak diverges.
Proof. We must show that if Σ_{k=1}^∞ ak is convergent then an → 0. So suppose
that Σ_{k=1}^∞ ak is convergent with sum α, say. This means that sn → α as
n → ∞, where sn = Σ_{k=1}^n ak .
Let ε > 0 be given.
We need an ε/2-argument. Since sn → α, we are assured that there is some
N′ ∈ N such that
|sn − α| < ε/2
whenever n > N′ . But then
|an | = |sn − sn−1 | = |sn − α + α − sn−1 | ≤ |sn − α| + |α − sn−1 | < ε/2 + ε/2 = ε
provided n > N′ and n − 1 > N′ . So if we set N = N′ + 1, then if n > N
we can be sure that
|an | < ε
which establishes that an → 0 as n → ∞ and the proof is complete.
Remark 4.4. It is very important to understand what this proposition says
and what it does not say. It says that if the terms of a series fail to converge
to zero, then the series itself is divergent.
It is quite possible to find a series whose terms do converge to zero but
nevertheless, the series is divergent. Such an example is provided by the
series Σ_{k=1}^∞ ak with ak = 1/k.
1 + 1/2 + 1/3 + 1/4 + 1/5 + . . .   is a divergent series
Indeed, the sequence of partial sums (sn ) is not bounded. One can see this
as follows. That portion of the graph of the function y = 1/x between the
values x = k and x = k + 1 lies below the line y = 1/k. Let Rk denote the
rectangle with height 1/k and with base on the interval [k, k + 1]. Then the
area of Rk is greater than the area under the graph of y = 1/x between the
values x = k and x = k + 1, that is,
Z k+1
1
1
dx = ln(k + 1) − ln k .
area Rk = >
k
x
k
Summing from k = 1 to k = n, we get
sn = 1 + 1/2 + · · · + 1/n > Σ_{k=1}^n (ln(k + 1) − ln k) = ln(n + 1) − ln 1 = ln(n + 1).
But ln(n + 1) > ln n which becomes arbitrarily large for large enough n.
So we conclude that (sn ) is unbounded and so Σ_{k=1}^∞ ak , with ak = 1/k, is
divergent — despite the fact that an = 1/n → 0 as n → ∞.
An alternative argument is as follows. One notices that
1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + (1/9 + · · · + 1/16) + (1/17 + · · · + 1/32) + . . .
where each bracketed group exceeds 1/2:
1/3 + 1/4 > 1/2,   1/5 + 1/6 + 1/7 + 1/8 > 4 × 1/8 = 1/2,
1/9 + · · · + 1/16 > 8 × 1/16 = 1/2,   1/17 + · · · + 1/32 > 16 × 1/32 = 1/2,
and so we see that
s1 = 1,   s2 = s1 + 1/2,   s4 > s2 + 1/2,   s8 > s4 + 1/2,   s16 > s8 + 1/2,   . . .
and so it follows that s_(2^j) ≥ (j + 2)/2 for j ∈ N. (This inequality is strict
for j > 1, but this is of no consequence here.) So the sequence (sn ) is not
bounded.
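Both arguments are easy to see numerically (illustration only): the partial sums of Σ 1/k track ln(n + 1) from above, and each doubling block contributes at least 1/2.

```python
import math

s, n = 0.0, 0
for target in (10, 100, 10_000, 1_000_000):
    while n < target:
        n += 1
        s += 1 / n
    print(f"n={n:>8}:  s_n = {s:.4f}   ln(n+1) = {math.log(n + 1):.4f}")

# Each block 1/(2^j + 1) + ... + 1/2^(j+1) exceeds 1/2:
for j in range(1, 6):
    block = sum(1 / k for k in range(2**j + 1, 2**(j + 1) + 1))
    print(f"block ({2**j}+1 .. {2**(j+1)}): sum = {block:.4f}")
```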
The next result tells us that we can do arithmetic with convergent series
just as we would expect.
Proposition 4.5. Suppose that Σ_{k=1}^∞ ak and Σ_{k=1}^∞ bk are convergent series.
Then the series Σ_{k=1}^∞ λ ak , for any λ ∈ R, and Σ_{k=1}^∞ (ak + bk ) are also
convergent and have sums such that
Σ_{k=1}^∞ λ ak = λ Σ_{k=1}^∞ ak   and   Σ_{k=1}^∞ (ak + bk ) = Σ_{k=1}^∞ ak + Σ_{k=1}^∞ bk .
Proof. We just need to look at the partial sums and their limits. So let
sn = Σ_{k=1}^n ak and let tn = Σ_{k=1}^n bk and let α = Σ_{k=1}^∞ ak and β = Σ_{k=1}^∞ bk .
By hypothesis, we know that sn → α and tn → β as n → ∞. But
Σ_{k=1}^n λ ak = λ sn → λ α
and
Σ_{k=1}^n (ak + bk ) = sn + tn → α + β
as n → ∞. It follows that Σ_{k=1}^∞ λ ak is convergent with sum λ α = λ Σ_{k=1}^∞ ak
and also that Σ_{k=1}^∞ (ak + bk ) is convergent with sum given by α + β =
Σ_{k=1}^∞ ak + Σ_{k=1}^∞ bk , as required.
Example 4.6. For k ∈ N, let ak = 9/10^k . Then
Σ_{k=1}^∞ ak = 9/10 + 9/10^2 + 9/10^3 + . . .
which is usually referred to as 0.9 . . . recurring. Is this series convergent and
if so, what is its sum? We see that
Σ_{k=1}^∞ ak = Σ_{k=1}^∞ 9/10^k = 9 Σ_{k=1}^∞ 1/10^k .
But we have seen earlier that Σ_{k=1}^∞ 1/10^k is convergent with sum equal to
(1/10)/(1 − 1/10) = 1/9. It follows that
Σ_{k=1}^∞ 9/10^k = 9 × 1/9 = 1.
Σ_{k=1}^∞ 9/10^k = 0.9999 · · · = 1
Continuing with this theme, we have the following.
Example 4.7. Let (ak )k∈N be any sequence of integers taking values in the
set { 0, 1, 2, . . . , 9 }. Then Σ_{k=1}^∞ ak/10^k is convergent with sum lying in the
interval [0, 1].
To see this, let sn = Σ_{k=1}^n ak/10^k denote the nth partial sum of the series
Σ_{k=1}^∞ ak/10^k . We note that sn+1 − sn = an+1/10^(n+1) ≥ 0 and so the sequence
of partial sums (sn ) is monotone increasing. Furthermore, since each ak is
an integer in the range 0 to 9, it follows that ak/10 ≤ 9/10 and so we can
say that ak/10^k ≤ 9/10^k . Hence, for any n ∈ N,
sn = Σ_{k=1}^n ak/10^k ≤ Σ_{k=1}^n 9/10^k = 9 (1/10 − (1/10)^(n+1))/(1 − 1/10) < 9 (1/10)/(1 − 1/10) = 1.
We have shown that the sequence (sn ) is monotone increasing and bounded
and therefore it converges. Hence, by definition, the series Σ_{k=1}^∞ ak/10^k is
convergent.
We must now show that Σ_{k=1}^∞ ak/10^k = α, say, lies between 0 and 1.
However, each sn ≥ 0 and since sn → α as n → ∞, it follows that it must
also be true that α ≥ 0. Furthermore, we have seen that sn < 1 and so we
have 1 − sn > 0. But 1 − sn → 1 − α and so it also follows that 1 − α ≥ 0.
Hence 0 ≤ α ≤ 1, as claimed.
We can interpret this as saying that every infinite decimal represents
some real number x lying in the range 0 ≤ x ≤ 1. The converse is also true.
Example 4.8. Let x be a real number satisfying 0 ≤ x ≤ 1. Then there is
a sequence of integers (ak )k∈N with values in { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } such
that the series Σ_{k=1}^∞ ak/10^k is convergent with sum equal to x.
To show this, we must construct the sequence (ak ). We know how to do
this for x = 1, (take ak = 9 for all k) so let us suppose now that 0 ≤ x < 1.
If we are told that
x = a1/10 + a2/10^2 + a3/10^3 + . . .
then evidently,
10 x = a1 + a2/10 + a3/10^2 + . . .
so that a1 is the integer part of 10x, a1 = [10x]. Similarly,
100 x = 10a1 + a2 + a3/10 + . . .
so that a2 is the integer part of 100x − 10a1 , a2 = [100x − 10[10x]]. In this
way, we can write any ak in terms of x. We simply use this idea to construct
the ak s.
We isolate the following fact: if u is any real number with 0 ≤ u < 1, then
there is an integer a in the set { 0, 1, 2, . . . , 8, 9 } and a real number α obeying
0 ≤ α < 1 such that 10u = a+α. To see this, we first note that 0 ≤ 10u < 10
and so [10u], the integer part of 10u, lies in the set { 0, 1, 2, . . . , 8, 9 }. Let
a = [10u] and let α = 10u − [10u]. Since 0 ≤ w − [w] < 1 for any real
number w, we see that 0 ≤ α < 1 and 10u = a + α as required.
Since 0 ≤ x < 1, as noted above, we can write 10x as 10x = a1 + α1
where a1 is an integer in the set { 0, 1, 2, . . . , 8, 9 } and 0 ≤ α1 < 1. Then
x = a1/10 + α1/10 .
Now, with α1 instead of x, we can say that we can write α1 as
α1 = a2/10 + α2/10
for some integer a2 in the set { 0, 1, 2, . . . , 8, 9 } and some real number α2
obeying 0 ≤ α2 < 1. Then
x = a1/10 + α1/10 = a1/10 + a2/10^2 + α2/10^2 .
Repeating this for α2 , we get
x = a1/10 + a2/10^2 + a3/10^3 + α3/10^3
with a3 ∈ { 0, 1, 2, . . . , 8, 9 } and 0 ≤ α3 < 1.
Continuing in this way, we construct integers an in the range 0 to 9 and
real numbers αn obeying 0 ≤ αn < 1 such that
x = a1/10 + a2/10^2 + a3/10^3 + · · · + an/10^n + αn/10^n ,
where sn = a1/10 + a2/10^2 + · · · + an/10^n denotes the nth partial sum.
Finally, we note that
|x − sn | = αn/10^n ≤ 9/10^n → 0
as n → ∞, that is, sn → x as n → ∞ and so it follows that the series
Σ_{k=1}^∞ ak/10^k converges with sum equal to x, and the proof is complete.
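The construction in Example 4.8 is effectively an algorithm: repeatedly multiply by 10, take the integer part as the next digit, and keep the fractional part. Here is a short illustrative sketch (using exact fractions to avoid rounding):

```python
from fractions import Fraction

def decimal_digits(x, n):
    """First n digits a1, a2, ... of x in [0, 1): ak = [10*alpha], alpha -> remainder."""
    alpha, digits = Fraction(x), []
    for _ in range(n):
        alpha *= 10
        a = int(alpha)          # the integer part, an element of {0, 1, ..., 9}
        digits.append(a)
        alpha -= a              # the new remainder, still in [0, 1)
    return digits

digits = decimal_digits(Fraction(3, 7), 12)
print(digits)                                                           # 4, 2, 8, 5, 7, 1, ...
print(sum(Fraction(d, 10**k) for k, d in enumerate(digits, start=1)))   # the partial sum s_12
# |x - s_n| <= 9/10^n, so the partial sums converge back to x = 3/7.
```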
This provides another proof that any given real number is the limit of
a sequence of rationals. Indeed, for b ∈ R, write b = [b] + x where [b] is
the integer part of b and 0 ≤ x < 1. As discussed above, x = lim sn where
each sn is the partial sum of a series with rational terms of the form ak/10^k
for suitable ak ∈ { 0, 1, 2, . . . , 9 }. In particular, each sn is rational and so is
[b] + sn . However, [b] + sn → [b] + x = b and the result follows.
Since 0.99 · · · = 1 = 1.00 . . . it is clear that the decimal expansion of a
real number need not be unique. Indeed, further examples are provided by
0.5 = 0.499 . . . or 0.63 = 0.6299 . . . and so on. However, this is the only
possible kind of ambiguity as the next theorem shows.
Theorem 4.9. Suppose that 0 ≤ x < 1 and that
x = 0.a1 a2 · · · = 0.b1 b2 . . . ,
that is,
x = Σ_{k=1}^∞ ak/10^k = Σ_{k=1}^∞ bk/10^k
where each ak and bk belong to { 0, 1, 2, . . . , 8, 9 }. Then either ak = bk for
all k ∈ N or else there is some N ∈ N such that ak = bk for 1 ≤ k < N and
either aN = bN + 1 and ak = 0 and bk = 9 for all k > N or bN = aN + 1
and bk = 0 and ak = 9 for all k > N .
Proof. We will use the following result.
Lemma 4.10. Suppose that 0 ≤ αk ≤ 9 and that Σ_{k=1}^∞ αk/10^k = 0. Then
αk = 0 for all k ∈ N.
Proof of Lemma. Let sn = Σ_{k=1}^n αk/10^k denote the partial sums of the
series Σ_{k=1}^∞ αk/10^k . Then sn+1 − sn = αn+1/10^(n+1) ≥ 0 so that (sn ) is a
positive increasing sequence. Moreover, each sn obeys sn ≤ Σ_{k=1}^n 9/10^k < 1
so that (sn ) converges, that is, Σ_{k=1}^∞ αk/10^k is a convergent series. Its value
obeys sn ≤ Σ_{k=1}^∞ αk/10^k . Hence, for any m ∈ N,
0 ≤ αm/10^m ≤ sm ≤ Σ_{k=1}^∞ αk/10^k = 0
so that it follows that αm = 0 as claimed and the proof of the lemma is
complete.
We turn now to the proof of the theorem.
Case 1: x = 0.
In this case, we have 0 = x = Σ_{k=1}^∞ ak/10^k with ak ∈ { 0, 1, . . . , 9 }. By
the Lemma, we conclude that ak = 0 for all k ∈ N. Hence ak = bk = 0 for
k ∈ N.
Case 2: 0 < x < 1.
Suppose that
x = Σ_{k=1}^∞ ak/10^k = Σ_{k=1}^∞ bk/10^k
and it is false that ak = bk for all k ∈ N. Let N be the smallest integer for
which ak ≠ bk , so that ak = bk for 1 ≤ k < N but aN ≠ bN .
Suppose that aN > bN . Then 0 ≤ bN < aN ≤ 9 and aN ≥ 1. We have
0 = x − x = Σ_{k=1}^∞ ak/10^k − Σ_{k=1}^∞ bk/10^k = Σ_{k=1}^∞ (ak − bk)/10^k = Σ_{k=N}^∞ (ak − bk)/10^k
= (aN − bN)/10^N + (aN+1 − bN+1)/10^(N+1) + . . .
Multiplying by 10^N , we see that
0 = (aN − bN ) + (aN+1 − bN+1)/10 + . . . = (aN − bN ) + Σ_{n=1}^∞ cn/10^n
where cn = aN+n − bN+n for all n ∈ N. Now, we can write (aN − bN ) as
aN − bN = c + 1
where c is an integer with c ≥ 0. We also note that each cn belongs to the
set { −9, −8, . . . , 8, 9 }. Hence, writing 1 = Σ_{n=1}^∞ 9/10^n , we get
0 = c + Σ_{n=1}^∞ 9/10^n + Σ_{n=1}^∞ cn/10^n ,
that is,
0 = c + Σ_{n=1}^∞ (9 + cn)/10^n .
Now, 9 + cn ≥ 0 and c ≥ 0 so that both terms on the right hand side above
are non-negative. It must be the case that c = 0 and also Σ_{n=1}^∞ (9 + cn)/10^n = 0.
But then cn = −9 for all n ∈ N, by the Lemma.
Hence aN = bN + 1 and aN+n − bN+n = −9 which implies that aN+n = 0
and bN+n = 9 for all n ∈ N and the result follows.
Returning now to the general theory, it is clear that the convergence of
a series will not be affected by changing the values of a few terms, although
of course, this will change the value of its sum. This is confirmed formally
in the next proposition.
Proposition 4.11. Suppose that Σ_{k=1}^∞ ak is a convergent series and Σ_{k=1}^∞ bk
is any series such that bk = ak except for at most finitely-many k. Then
Σ_{k=1}^∞ bk is also convergent.
Proof. As always, we look at the partial sums, so let sn = Σ_{k=1}^n ak and let
tn = Σ_{k=1}^n bk . Evidently,
tn = Σ_{k=1}^n bk = Σ_{k=1}^n (bk − ak) + Σ_{k=1}^n ak = Σ_{k=1}^n ck + sn ,
where we write ck = bk − ak .
Next, let un = Σ_{k=1}^n ck . Now, by hypothesis, ck = 0 except for at most
finitely-many k. In other words, there is some N ∈ N such that ck = 0 for
all k > N . This means that un is eventually constant,
un = Σ_{k=1}^n ck = Σ_{k=1}^N ck = uN ,
whenever n > N , and so (un ) converges (to the value uN ). But
tn = un + sn
and since the right hand side converges, so does the left hand side and the
result follows.
Theorem 4.12 (Comparison Test for positive series). Suppose 0 ≤ ak ≤ bk
for all k ∈ N and that Σ_{k=1}^∞ bk converges. Then Σ_{k=1}^∞ ak also converges.
Proof. Let sn = Σ_{k=1}^n ak and tn = Σ_{k=1}^n bk . By hypothesis, (tn ) converges
and so (tn ) is a bounded sequence. Therefore there is some M > 0 such that
tn ≤ M
for all n ∈ N. But since 0 ≤ ak ≤ bk , it follows that sn ≤ tn and so
the sequence (sn ) of partial sums is bounded above (by M ). Furthermore,
sn+1 − sn = an+1 ≥ 0 and so (sn ) is monotone increasing. However, we
know that a monotone increasing sequence which is bounded above must
converge. Hence result.
Example 4.13. Consider the series Σ_{k=1}^∞ 1/k^2 = 1 + 1/2^2 + 1/3^2 + 1/4^2 + . . . .
Let sn = Σ_{k=1}^n 1/k^2 denote the nth partial sum. Then (sn ) is increasing.
Furthermore, for n > 1, we see that
sn = 1 + 1/2^2 + 1/3^2 + 1/4^2 + · · · + 1/n^2
< 1 + 1/(1·2) + 1/(2·3) + 1/(3·4) + · · · + 1/((n − 1)n)
= 1 + (1 − 1/2) + (1/2 − 1/3) + (1/3 − 1/4) + · · · + (1/(n − 1) − 1/n)
= 2 − 1/n
< 2
and so (sn ) is bounded from above and therefore must converge. Hence,
by definition, Σ_{k=1}^∞ 1/k^2 is convergent. Note, however, that this discussion
gives us no hint as to the value of its sum. This is an example where
the convergence of a series can quite sensibly be discussed without actually
knowing what its sum is.
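A numerical look at the bound just obtained (illustrative only): the partial sums of Σ 1/k^2 stay below 2 − 1/n and converge (their limit happens to be π^2/6, a fact not needed, and not provided, by the argument above).

```python
s = 0.0
for n in range(1, 100_001):
    s += 1 / n**2
    if n in (10, 100, 10_000, 100_000):
        print(f"n={n:>6}:  s_n = {s:.6f}   bound 2 - 1/n = {2 - 1/n:.6f}")
# The partial sums are increasing and stay below the bound 2 - 1/n < 2.
```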
Example 4.14. What about the series Σ_{k=1}^∞ 1/k^4 ?
Since 1/k^4 ≤ 1/k^2 for all k ∈ N, we can apply the Comparison Test for positive
series to deduce that Σ_{k=1}^∞ 1/k^4 is convergent. Indeed, since 1/k^α ≤ 1/k^2 for all
k ∈ N for any α ≥ 2, we can say that the series Σ_{k=1}^∞ 1/k^α is convergent
whenever α ≥ 2.
Σ_{k=1}^∞ 1/k^(2+β) is convergent for any β ≥ 0
Example 4.15. What about the series Σ_{k=1}^∞ 1/√k ?
If this series were convergent, then we could use the inequality
1/k ≤ 1/√k ,
for every k ∈ N, together with the Comparison Test for positive series to
conclude that the series Σ_{k=1}^∞ 1/k were convergent. However, we know this
not to be the case. It follows that the series Σ_{k=1}^∞ 1/√k is not convergent.
Indeed, we can apply this reasoning to the series Σ_{k=1}^∞ 1/k^µ for any µ ≤ 1.
Σ_{k=1}^∞ 1/k^µ is not convergent for any µ ≤ 1
Example 4.16. We have seen above that the series Σ_{k=1}^∞ 1/k^ν is convergent for
ν ≥ 2 but not convergent for ν ≤ 1. It is natural to ask what happens for
values of ν lying in the range 1 < ν < 2. We shall see that the series is
convergent for all ν > 1.
Write ν = 1 + ε, where ε > 0 and let sn = Σ_{k=1}^n 1/k^ν . Evidently (sn )
is an increasing sequence so if we can show that it is bounded, then we will
be able to conclude that it converges. The idea is to compare the terms
1/k^ν with the integral of the function y = 1/x^ν over unit intervals. In fact,
over the range k ≤ x ≤ k + 1, the function y = 1/x^(1+ε) is greater than
1/(k + 1)^(1+ε) and so
1/(k + 1)^(1+ε) ≤ ∫_k^(k+1) dx/x^(1+ε) .
Summing over k, we find that
sn = 1 + 1/2^ν + 1/3^ν + · · · + 1/n^ν
≤ 1 + ∫_1^2 dx/x^(1+ε) + ∫_2^3 dx/x^(1+ε) + · · · + ∫_(n−1)^n dx/x^(1+ε)
= 1 + ∫_1^n dx/x^(1+ε)
= 1 + [ −1/(ε x^ε) ]_1^n
= 1 + 1/ε − 1/(ε n^ε)
≤ 1 + 1/ε .
We see that the sequence (sn ) is bounded from above and since it is also
increasing, it must converge.
Σ_{k=1}^∞ 1/k^ν is convergent for all ν > 1 and divergent for all ν ≤ 1
This technique of comparing terms of a series with integrals can be quite
useful. The general idea is contained in the following theorem.
Theorem 4.17 (Integral Test). Suppose that ψ : [1, ∞) → R is a positive
decreasing function such that the sequence of integrals (∫_1^n ψ(x) dx)n∈N
converges as n → ∞. Then Σ_{n=1}^∞ ψ(n) is convergent.
Proof. Since ψ(x) ≥ 0, the sequence of partial sums sn = Σ_{k=1}^n ψ(k) is
increasing. Now, because ψ is decreasing, it follows that ψ(k) ≤ ψ(x) for
all x ∈ [k − 1, k] for all k ≥ 2. Hence ∫_(k−1)^k (ψ(k) − ψ(x)) dx ≤ 0, that is,
ψ(k) ≤ ∫_(k−1)^k ψ(x) dx. Therefore
sn = ψ(1) + ψ(2) + ψ(3) + · · · + ψ(n)
≤ ψ(1) + ∫_1^2 ψ(x) dx + ∫_2^3 ψ(x) dx + · · · + ∫_(n−1)^n ψ(x) dx
= ψ(1) + ∫_1^n ψ(x) dx .
By hypothesis, the sequence of integrals (∫_1^n ψ(x) dx) converges and so is
bounded. It follows that the sequence of partial sums (sn ) is bounded from
above and therefore converges. The result follows.
This theorem can be rephrased in a slightly more general form, as follows.
Theorem 4.18 (Integral Test). Let (an ) be a sequence of positive real numbers
and suppose that there is some positive function ϕ such that the sequence
of integrals (∫_1^n ϕ(x) dx)n∈N converges as n → ∞ and such that, for each
k ≥ 2,
ak ≤ ϕ(x)
for all (k − 1) ≤ x ≤ k. Then Σ_{k=1}^∞ ak is convergent.
Proof. As usual, let sn = Σ_{k=1}^n ak . Then (sn )n∈N is an increasing sequence
(because each ak ≥ 0). We need only show that (sn ) is bounded. To see
this, note that for k ≥ 2,
ak = ∫_(k−1)^k ak dx ≤ ∫_(k−1)^k ϕ(x) dx
and so
sn = a1 + a2 + · · · + an
≤ a1 + ∫_1^2 ϕ(x) dx + ∫_2^3 ϕ(x) dx + · · · + ∫_(n−1)^n ϕ(x) dx
= a1 + ∫_1^n ϕ(x) dx .
Department of Mathematics
71
Series
Rn
Now, the sequence ( 1 ϕ(x) dx) converges, by hypothesis,
and so it is bounded
Rn
and therefore there is a constant C such that 1 ϕ(x) dx ≤ C for all n ∈ N.
Hence, for any n,
Z n
ϕ(x) dx ≤ a1 + C
0 ≤ sn ≤ a1 +
1
which shows that (sn ) is a bounded sequence and the result follows.
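The inequality s_n ≤ a_1 + ∫_1^n ϕ(x) dx at the heart of both versions of the Integral Test can be checked on a concrete case. The sketch below (our own illustration, not part of the notes) takes a_k = 1/k² and ϕ(x) = 1/x², for which the integral is available in closed form as 1 − 1/n, and verifies the bound for a few values of n.

```python
# Check s_n = sum_{k=1}^{n} 1/k**2  <=  a_1 + integral_1^n dx/x**2 = 1 + (1 - 1/n).
for n in (10, 100, 1000):
    s_n = sum(1.0 / k**2 for k in range(1, n + 1))
    bound = 1 + (1 - 1.0 / n)      # a_1 plus the closed-form value of the integral
    print(f"n={n}: s_n = {s_n:.6f} <= bound = {bound:.6f}: {s_n <= bound}")
```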
The following test for convergence of positive series is very useful.
Theorem 4.19 (D’Alembert’s Ratio Test for positive series).
Suppose that a_n > 0 for all n.
(i) Suppose that there is some 0 < ρ < 1 and some N ∈ N such that if n > N then a_{n+1}/a_n < ρ. Then the series ∑_{n=1}^∞ a_n is convergent.
(ii) If there is N′ ∈ N such that a_{n+1}/a_n ≥ 1 for all n > N′, then the series ∑_{n=1}^∞ a_n is divergent.
Proof. (i) Suppose that 0 < ρ < 1 and that a_{n+1}/a_n < ρ for n > N. Then
    a_{N+2} < a_{N+1} ρ
    a_{N+3} < a_{N+2} ρ < a_{N+1} ρ²
    a_{N+4} < a_{N+3} ρ < a_{N+1} ρ³
    ...
    a_{N+k+1} < a_{N+1} ρ^k.
Hence a_{N+k+1} < K ρ^{N+k+1} for all k ≥ 1, where we have let K = a_{N+1}/ρ^{N+1}. In other words, we have a_n < K ρ^n for all n > N + 1. Now we construct a new sequence (u_n) by setting
    u_n = 0 for n ≤ N + 1,  and  u_n = a_n for n > N + 1.
Then certainly u_n < K ρ^n for all n. Now, we know that ∑_{n=1}^∞ K ρ^n is convergent (with sum K ρ/(1 − ρ)) because 0 < ρ < 1. By the Comparison Test, it follows that ∑_{n=1}^∞ u_n is convergent. However, a_n = u_n eventually and so it follows that ∑_{n=1}^∞ a_n is also convergent and the proof of (i) is complete.
(ii) Suppose now that a_{n+1}/a_n ≥ 1 for all n > N′. Then for any k ∈ N,
    a_{N′+k} ≥ a_{N′+k−1} ≥ ··· ≥ a_{N′+1} > 0.
This means that it is impossible for a_n → 0 as n → ∞ (every term after the (N′+1)th is at least a_{N′+1} > 0). We conclude that ∑_{n=1}^∞ a_n must be divergent.
There is another (weaker) but also very useful version of this theorem.
Theorem 4.20 (D’Alembert’s Ratio Test for positive series (2nd version)). Suppose that a_n > 0 for all n and that a_{n+1}/a_n → L as n → ∞.
(i) If L < 1, then ∑_{n=1}^∞ a_n is convergent.
(ii) If L > 1, then ∑_{n=1}^∞ a_n is divergent.
(There is no claim as to what happens when L = 1.)
Proof. (i) Suppose that a_{n+1}/a_n → L where 0 ≤ L < 1. Then for any ε > 0, we may say that eventually a_{n+1}/a_n ∈ (L − ε, L + ε). In particular, a_{n+1}/a_n < L + ε eventually. Let ε be so small that L + ε < 1; then eventually a_{n+1}/a_n < ρ where ρ = L + ε < 1. By the previous version of the Theorem, it follows that ∑_{n=1}^∞ a_n is convergent.
(ii) Now suppose that L > 1. Then for any ε > 0, a_{n+1}/a_n eventually belongs to the interval (L − ε, L + ε). In particular, eventually a_{n+1}/a_n > L − ε. But L > 1, so if ε > 0 is chosen so small that L − ε > 1, then we may say that eventually a_{n+1}/a_n > L − ε > 1 and so ∑_{n=1}^∞ a_n is divergent, by the previous version of the Theorem.
Example 4.21. What can be said when L = 1? Without further analysis, the answer is “nothing”. Indeed, there are examples of series which converge when L = 1 and other examples of series which diverge when L = 1.
For example, we know that ∑_{k=1}^∞ 1/k² is convergent and we see that a_{n+1}/a_n = n²/(n+1)² → 1 as n → ∞, so that L = 1 in this case. However, we also know that ∑_{k=1}^∞ 1/k is divergent, but here again, we see that a_{n+1}/a_n = n/(n+1) → 1 = L as n → ∞.

    When L = 1, the Ratio Test tells us nothing.
Example 4.22. For fixed 0 ≤ c < 1, the series ∑_{k=1}^∞ k c^k is convergent.
If c = 0, there is nothing to prove, so suppose that 0 < c < 1. Setting a_n = n c^n, we see that
    a_{n+1}/a_n = (n + 1) c^{n+1} / (n c^n) = (n + 1) c / n → c
as n → ∞. Since a_n > 0 for all n and since L = c < 1, we can apply the Ratio Test to conclude that ∑_{k=1}^∞ k c^k is convergent.
The same argument shows that for any power p, the series ∑_{k=1}^∞ k^p c^k is convergent (provided 0 ≤ c < 1).

    For any 0 ≤ c < 1 and any p ∈ N, the series ∑_{k=1}^∞ k^p c^k is convergent.
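It may help to see the ratios a_{n+1}/a_n settling down to c for a concrete choice of p and c. The short sketch below is only an illustration we have added (the values p = 3 and c = 0.9 are arbitrary choices of ours).

```python
# Ratios a_{n+1}/a_n for a_n = n**p * c**n tend to c as n -> infinity.
def ratio(n, p, c):
    return ((n + 1)**p * c**(n + 1)) / (n**p * c**n)

p, c = 3, 0.9
for n in (1, 10, 100, 1000):
    print(f"n={n}: a_(n+1)/a_n = {ratio(n, p, c):.6f}   (limit is c = {c})")
```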
Theorem 4.23 (nth Root Test). Suppose that a_n > 0 for all n ∈ N and that (a_n)^{1/n} → ℓ as n → ∞.
(i) If ℓ < 1, the series ∑_{k=1}^∞ a_k is convergent.
(ii) If ℓ > 1, then the series ∑_{k=1}^∞ a_k is divergent.
(There is no conclusion when ℓ = 1.)
Proof. Suppose that ℓ < 1. Choose ρ such that ℓ < ρ < 1 and set ε = ρ − ℓ. Then ε > 0 and so there is some N ∈ N such that |a_n^{1/n} − ℓ| < ε whenever n > N. In particular,
    a_n^{1/n} − ℓ < ε = ρ − ℓ,
i.e., a_n < ρ^n, whenever n > N. We must show that s_n = ∑_{k=1}^n a_k converges. Since a_n > 0, the sequence (s_n) is monotone increasing, so it is enough to show that (s_n) is bounded from above. But for any n > N,
    s_n = a_1 + a_2 + ··· + a_n
        = s_N + a_{N+1} + ··· + a_n
        < s_N + ρ^{N+1} + ρ^{N+2} + ··· + ρ^n
        = s_N + (ρ^{N+1} − ρ^{n+1})/(1 − ρ)
        < s_N + ρ^{N+1}/(1 − ρ).
Hence, for any j,
    s_j ≤ s_{N+j} < s_N + ρ^{N+1}/(1 − ρ)
which shows that the sequence (s_n) is bounded from above and therefore converges, as claimed.
Next, suppose that ℓ > 1. Choose d such that 1 < d < ℓ and let ε = ℓ − d. Then ε > 0 and there is N ∈ N such that
    a_n^{1/n} ∈ (ℓ − ε, ℓ + ε)
whenever n > N. In particular, for n > N,
    ℓ − ε < a_n^{1/n},
which means that a_n > d^n. It follows that, for any n > N,
    s_n = s_N + a_{N+1} + ··· + a_n > a_n > d^n > 1.
From this, we see that it is false that a_n → 0 as n → ∞ and so, by the Test for Divergence, ∑_{k=1}^∞ a_k is divergent.
We have considered tests applicable only to positive series. The following
is a convergence test for the case when the terms alternate between positive
and negative values.
Theorem 4.24 (Alternating Series Test). Suppose that (a_n) is a positive, decreasing sequence such that a_n → 0 as n → ∞. Then the (alternating) series
    a_1 − a_2 + a_3 − a_4 + ··· = ∑_{n=1}^∞ (−1)^{n+1} a_n
is convergent.
Proof. By hypothesis, an ≥ 0, an+1 ≤ an and an → 0 as n → ∞.
Let sn = a1 − a2 + a3 − a4 + · · · + (−1)n+1 an denote the nth partial sum
of the series, as usual. We shall consider the two cases when n is even and
when n is odd.
Suppose that n is even, say n = 2m. Then
    s_{2m+2} = s_{2m} + (a_{2m+1} − a_{2m+2})
and a_{2m+1} − a_{2m+2} ≥ 0, so s_{2m+2} ≥ s_{2m}.
Next, we note that
    s_{2m} = a_1 − a_2 + a_3 − a_4 + a_5 − ··· − a_{2m}
           = a_1 − (a_2 − a_3) − (a_4 − a_5) − ··· − (a_{2m−2} − a_{2m−1}) − a_{2m}
           ≤ a_1,
since each bracketed term and a_{2m} are non-negative. For notational convenience, let x_m = s_{2m}. Then we have shown that (x_m) is increasing and bounded from above (by a_1). It follows that (x_m) converges, say x_m → α as m → ∞.
Claim: sn → α as n → ∞.
Let ε > 0 be given.
Then there is N_1 ∈ N such that if m > N_1 then |x_m − α| < ε/2. Also, there is N_2 ∈ N such that if n > N_2 then |a_n| < ε/2. Let N = 2(N_1 + N_2).
Let n > N and consider |sn − α|. If n is even, say, n = 2m, then
n = 2m > N =⇒ 2m > 2(N1 + N2 ) =⇒ m > N1
and so
    |s_n − α| = |s_{2m} − α| = |x_m − α| < ε/2 < ε.
If n is odd, say n = 2k + 1, then
n = 2k + 1 > N =⇒ 2k ≥ N = 2(N1 + N2 ) =⇒ k > N1 .
Moreover, since N > N2 , we have n > N =⇒ n > N2 and so we see that
n = 2k + 1 > N =⇒ both k > N1 and n > N2 .
Hence
    |s_n − α| = |s_{2k+1} − α| = |s_{2k} + a_{2k+1} − α| = |x_k + a_n − α| ≤ |x_k − α| + |a_n| < ε/2 + ε/2 = ε.
So regardless of whether n is even or odd, if n > N then |s_n − α| < ε. Hence s_n → α as n → ∞, as claimed, and we conclude that the alternating series ∑_{n=1}^∞ (−1)^{n+1} a_n is convergent.
Example 4.25. The series 1 − 1/2 + 1/3 − 1/4 + 1/5 − 1/6 + ··· converges.
This follows immediately from the Alternating Series Test.
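The mechanism in the proof of the Alternating Series Test is visible numerically: the even-indexed partial sums increase, the odd-indexed ones decrease, and they squeeze together. Here is a small sketch (our own addition, not part of the notes) for the series above.

```python
# Partial sums s_n of 1 - 1/2 + 1/3 - 1/4 + ... for even and odd n.
def s(n):
    return sum((-1)**(k + 1) / k for k in range(1, n + 1))

for n in (10, 11, 100, 101, 10_000, 10_001):
    print(f"s_{n} = {s(n):.6f}")
# The even partial sums creep up, the odd ones creep down, and both
# approach the same limit (identified later, in Example 5.33, as ln 2).
```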
Definition 4.26. The series ∑_{n=1}^∞ a_n is said to converge absolutely if the series ∑_{n=1}^∞ |a_n| is convergent.
The series ∑_{n=1}^∞ a_n is said to converge conditionally if it converges but does not converge absolutely, i.e., it converges but the series ∑_{n=1}^∞ |a_n| is not convergent.
Example 4.27. We have seen that the series
    ∑_{n=1}^∞ (−1)^{n+1} (1/n) = 1 − 1/2 + 1/3 − 1/4 + 1/5 − 1/6 + ···
converges. However, we know that ∑_{n=1}^∞ 1/n does not converge and so the series ∑_{n=1}^∞ (−1)^{n+1} (1/n) is an example of a conditionally convergent series.
Theorem 4.28. Every absolutely convergent series is convergent.
Proof. Suppose that ∑_{n=1}^∞ a_n is absolutely convergent. Let t_n = ∑_{k=1}^n |a_k| and s_n = ∑_{k=1}^n a_k. Then we know that (t_n) converges (since ∑_{n=1}^∞ a_n converges absolutely). It follows that (t_n) is a Cauchy sequence. We shall show that (s_n) is also a Cauchy sequence.
Let ε > 0 be given. Then there is N such that n, m > N imply that
    |t_n − t_m| < ε.
However, for n > m,
    |s_n − s_m| = |a_{m+1} + ··· + a_n| ≤ |a_{m+1}| + ··· + |a_n| = |t_n − t_m|
and so it follows that |s_n − s_m| < ε whenever n, m > N, which shows that (s_n) is a Cauchy sequence. But any Cauchy sequence in R converges and the result follows.
We know that if a and b are real numbers, then a + b = b + a. More
generally, if a1 , . . . , am is a collection of m real numbers, then their sum
a_1 + ··· + a_m is the same irrespective of the order in which we choose to add them together. Now, a series ∑_{n=1}^∞ a_n is the result of adding together real numbers, so it is natural to guess that the order of the addition does not matter. To discuss this, we shall need the notion of a rearrangement.
Definition 4.29. The series ∑_{n=1}^∞ b_n is a rearrangement of the series ∑_{n=1}^∞ a_n if there is some one-one map ϕ of N onto N such that b_n = a_{ϕ(n)} for each n ∈ N. In other words, every b is one of the a's and every a appears as some b.
Theorem 4.30. Suppose that the series ∑_{n=1}^∞ a_n converges absolutely. Then every rearrangement also converges, with the same sum.
Proof. Let ∑_{n=1}^∞ b_n be a rearrangement of ∑_{n=1}^∞ a_n. Then there is some one-one map ϕ of N onto N such that b_n = a_{ϕ(n)} for every n. Let s_n = ∑_{k=1}^n a_k, t_n = ∑_{k=1}^n |a_k|, r_n = ∑_{k=1}^n b_k and let s = ∑_{k=1}^∞ a_k = lim_{n→∞} s_n. We must show that (r_n) converges and that its limit is equal to s.
Let ε > 0 be given. Since s_n → s and (t_n) is a Cauchy sequence, there is some N ∈ N such that n, m ≥ N imply that both
    |s_n − s| < ε/2  and  |t_n − t_m| < ε/2.
Now, the sequence of b's is a relabelling of the a's and so for each j there is some k_j such that a_j = b_{k_j}. Let N′ = max{ k_j : 1 ≤ j ≤ N } so that the collection a_1, a_2, ..., a_N is included in the collection b_1, b_2, ..., b_{N′}. Then for any n > N′,
    b_1 + b_2 + ··· + b_n = a_1 + a_2 + ··· + a_N + σ_n
where σ_n = a_{ℓ_1} + ··· + a_{ℓ_r} for some integers ℓ_1, ..., ℓ_r with N < ℓ_1 < ··· < ℓ_r. Now
    |σ_n| ≤ |a_{ℓ_1}| + ··· + |a_{ℓ_r}| ≤ ∑_{k=N+1}^{ℓ_r} |a_k| = t_{ℓ_r} − t_N < ε/2
and so if n > N′,
    |r_n − s| = |s_N + σ_n − s| ≤ |s_N − s| + |σ_n| ≤ ε/2 + ε/2 = ε
and the proof is complete.
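Theorem 4.30 can be illustrated numerically with the absolutely convergent series ∑ (−1)^{n+1}/n², summed once in its natural order and once in a rearranged order (two positive terms, then one negative term, and so on). The sketch below is our own illustration; the quoted limit π²/12 is the standard value of this particular sum and is not derived in these notes.

```python
# Rearranging an absolutely convergent series leaves its sum unchanged.
def term(n):
    return (-1)**(n + 1) / n**2     # sum of |term| converges, so convergence is absolute

N = 300_000
natural = sum(term(n) for n in range(1, N + 1))

# Rearrangement: two odd-indexed (positive) terms, then one even-indexed
# (negative) term, repeated.  Each term of the series is used exactly once.
rearranged, odd, even = 0.0, 1, 2
for _ in range(N // 3):
    rearranged += term(odd) + term(odd + 2) + term(even)
    odd += 4
    even += 2

print(f"natural order    : {natural:.6f}")
print(f"rearranged order : {rearranged:.6f}")   # both approach pi**2/12 = 0.822467...
```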
Theorem 4.31 (Cauchy’s Condensation Test). Suppose (a_n)_{n∈N} is a positive, decreasing sequence of real numbers (that is, a_n ≥ 0 and a_{n+1} ≤ a_n). For each k ∈ N, let b_k = 2^k a_{2^k}. Then the series ∑_{n=1}^∞ a_n is convergent if and only if the series ∑_{k=1}^∞ b_k is convergent.
(In other words, either both series converge or neither does.)
Proof. Let s_n = ∑_{m=1}^n a_m and t_k = ∑_{i=1}^k b_i be the partial sums of the series under consideration. Since a_n ≥ 0 and b_k ≥ 0, the sequences (s_n)_{n∈N} and (t_k)_{k∈N} of partial sums are increasing. Now, we know that if an increasing sequence in R is bounded from above, then it must converge, so our strategy is to show that the sequences of partial sums are bounded (from above).
The idea is to estimate the partial sums of the two series in terms of each other by bracketing the a_n terms into groups of size 2, 4, 8, 16, ... and using the fact that a_n ≥ a_{n+1}. We note that
    2a_4 ≤ a_3 + a_4 ≤ 2a_2
    4a_8 ≤ a_5 + a_6 + a_7 + a_8 ≤ 4a_4
    8a_{16} ≤ a_9 + a_{10} + ··· + a_{15} + a_{16} ≤ 8a_8
    ...
Summing, we find that (for k > 1)
    2a_4 + 4a_8 + ··· + 2^{k−1} a_{2^k} ≤ a_3 + a_4 + ··· + a_{2^k} ≤ 2a_2 + 4a_4 + ··· + 2^{k−1} a_{2^{k−1}}.
In terms of the b_k's, this becomes
    (1/2)(b_2 + ··· + b_k) ≤ a_3 + a_4 + ··· + a_{2^k} ≤ b_1 + b_2 + ··· + b_{k−1},
giving the pair of inequalities
    (1/2) t_k − b_1 ≤ s_{2^k} − (a_1 + a_2) ≤ t_{k−1}.   (∗)
Suppose now that the series ∑_{n=1}^∞ a_n converges and, for clarity, let us write s = ∑_{n=1}^∞ a_n = lim_{j→∞} s_j. Since (s_j) is increasing, it follows that s_j ≤ s for all j ∈ N. From (∗), it follows that
    (1/2) t_k − b_1 ≤ s_{2^k} − (a_1 + a_2) ≤ s − (a_1 + a_2)
for all k ∈ N. Hence (t_k) is a bounded, increasing sequence and so converges. But, by definition, this means that ∑_{i=1}^∞ b_i is convergent.
Next, suppose that the series ∑_{i=1}^∞ b_i converges and write t for its sum. Then t_k ≤ t for all k ∈ N. Now, for any n ∈ N, it is true that 2^n > n (as can be seen by the Binomial Theorem, as follows: 2^n = (1 + 1)^n = 1 + n + (n choose 2) + ··· + 1 > n). Hence s_n ≤ s_{2^n} and so, using (∗), we get
    s_n ≤ s_{2^n} ≤ t_{n−1} + (a_1 + a_2) ≤ t + (a_1 + a_2)
for all n ≥ 2. Therefore (s_n) is a bounded, increasing sequence and so converges. Therefore ∑_{j=1}^∞ a_j is convergent.
Example 4.32. We already know that the series ∑_{n=1}^∞ 1/n diverges, but let us consider it again via the Condensation Test. First, we note that a_n = 1/n satisfies the hypotheses required to apply the Condensation Test. Now,
    b_k = 2^k a_{2^k} = 2^k / 2^k = 1
and so it is clear that ∑_k b_k = ∑_k 1 diverges. Applying the Condensation Test, we conclude that ∑_{n=1}^∞ 1/n diverges. (In fact, we have already shown this from first principles using this method of grouping.)
Next, consider ∑_{n=1}^∞ 1/n^{1+δ} for given δ > 0. Once again, a_n = 1/n^{1+δ} satisfies the hypotheses required to apply the Condensation Test. In this case, we have
    b_k = 2^k a_{2^k} = 2^k / 2^{k(1+δ)} = 2^k / (2^k 2^{kδ}) = 1/2^{kδ}
so that ∑_k b_k is a geometric series with common ratio 1/2^δ. This series therefore converges (because 1/2^δ is smaller than 1).
We might say that the series ∑_n 1/n diverges presumably because the terms 1/n do not become “small enough quickly enough”. Increasing the power of n from 1 to 1 + δ is sufficient to “speed things up” so that ∑_n 1/n^{1+δ} does converge, no matter how small δ may be. Consider the series ∑_{n=2}^∞ 1/(n ln n) (we cannot start this series with n = 1 because ln 1 = 0). Is the change from a_n = 1/n to a_n = 1/(n ln n) enough to give convergence of the series?
To investigate its convergence or otherwise, let a_n = 1/(n ln n) for n ≥ 2 and set a_1 = 5, say, or any value greater than 1/(2 ln 2). This choice of a_1 is not quite arbitrary but is chosen so that (a_n)_{n∈N} satisfies the hypotheses required to apply the Condensation Test. The series ∑_{n=2}^∞ a_n converges if and only if the series ∑_{n=1}^∞ a_n does, regardless of our choice for a_1. Applying the Condensation Test, we may say that the series ∑_{n=2}^∞ 1/(n ln n) converges if (and only if) ∑_{n=1}^∞ 2^n a_{2^n} does. But
    2^n a_{2^n} = 2^n / (2^n ln(2^n)) = (1/ln 2)(1/n)
and we know that the series ∑_n 1/n does not converge. We can conclude, then, that the series ∑_{n=2}^∞ 1/(n ln n) is divergent.

    The series ∑_{n=2}^∞ 1/(n ln n) is divergent.
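Numerically, the divergence of ∑ 1/(n ln n) is essentially invisible, because the partial sums grow only very slowly (roughly like ln ln n); this is exactly why a structural test such as condensation is valuable. The sketch below (our own illustration) prints the condensed terms 2^n a_{2^n} = 1/(n ln 2), which form a multiple of the harmonic series, alongside some painfully slow partial sums.

```python
import math

# Condensed terms for a_n = 1/(n*ln n):  2**k * a_{2**k} = 1/(k*ln 2).
for k in (1, 5, 10, 20):
    print(f"k={k:2d}: 2^k * a_(2^k) = {1.0 / (k * math.log(2)):.6f}")

# Partial sums of sum_{n>=2} 1/(n*ln n): divergent, but growing extremely slowly.
s = 0.0
for n in range(2, 1_000_001):
    s += 1.0 / (n * math.log(n))
    if n in (100, 10_000, 1_000_000):
        print(f"n={n}: partial sum = {s:.4f}")
```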
Chapter 5
Functions
Suppose that x represents the value of the length of a side of a square. Then
its area depends on x and, in fact, is given by the formula: area = x2 . The
area is a function of x. In general, if S is some given subset of R, then a
real-valued function f on S is a rule or assignment by which to each element
x ∈ S is associated some real number, denoted by f (x). We write f : S → R
which is read as “f maps S into R”. One also writes x 7→ f (x) which is
read as “x is mapped to the value f (x)”. The set S is called the domain
(of definition) of the function f. If x ∉ S, then f(x) has not been given a
meaning.
More generally, if A and B are given sets, then a mapping g : A → B is an association a ↦ g(a) of each element a of A to some element g(a) ∈ B. For example, for each 0 ≤ t ≤ 1, let g(t) be the 2 × 2 matrix with first row (t, 1) and second row (0, t³). Then t ↦ g(t) is an example of a mapping from the interval [0, 1] into the set of 2 × 2 real matrices.
In general, if B is equal to either R or C, then the mapping is often
referred to as a function. Note that a function may be given by a “pretty
formula” but it does not have to be. For example, the function f : R → R
with f (x) = 1 + x2 is given by a formula. To get f (x), we just substitute
the value of x into the formula. However, the function
    x ↦ x² for x < −1,  1 for −1 ≤ x ≤ 0,  x³ + 1 for x > 0
is a perfectly good function, but is not given by a formula in the same way as the previous example. In fact, this function seems to be a concoction constructed from the functions x², 1 and x³ + 1. A slightly more involved example is
    x ↦ 0 if x ∉ Q,  and  x ↦ 1/n if x ∈ Q and x = k/n (with k ∈ Z, n ∈ N and where k and n have no common divisors).
For a function to be well-defined, there must be specified
(i) its domain of definition,
(ii) some assignment giving the value it takes at each point of its domain.
It is often very useful to consider the visual representation of f given by
plotting the points (x, f (x)) in R2 . This is the graph of f .
Examples 5.1.
1. Linear functions: x 7→ f (x) = mx + c for constants m, c ∈ R and x ∈ S.
2. Polynomials: x ↦ f(x) = a_0 + a_1 x + a_2 x² + ··· + a_n x^n for x ∈ S, where the coefficients a_0, a_1, ..., a_n are constants in R and a_n ≠ 0; n is the degree of such a polynomial.
3. Rational functions: x ↦ f(x) = p(x)/q(x) for x ∈ S, where p and q are polynomials. Note that the right hand side is not defined for any values of x for which q(x) = 0.
4. S = R, f(x) = 1/x for x ≠ 0, and f(0) = 3.
5. S = [−1, 1], f(x) = x² for x ≠ 0, and f(0) = 2.
6. S = R, f(x) = 0 for x ∉ Q, and f(x) = 1 for x ∈ Q.
(A thought: let g(x) = 1 for x ∉ Q and g(x) = 0 for x ∈ Q, and let h = f + g. Then we see that h(x) = f(x) + g(x) = 1 for all x ∈ R. Certainly ∫_0^1 h(x) dx = 1, but what are the values of ∫_0^1 f(x) dx and ∫_0^1 g(x) dx, and is it true that
    1 = ∫_0^1 h(x) dx = ∫_0^1 f(x) dx + ∫_0^1 g(x) dx ?)
7. S = R,
    f(x) = 0 for x < 0,  1/4 for 0 ≤ x < 1,  1/2 for 1 ≤ x < 6,  1 for x ≥ 6.
This kind of step-function is familiar from probability theory: it is the cumulative distribution function f(x) = Prob{ X ≤ x } for a random variable X taking the values 0, 1 and 6 with probabilities 1/4, 1/4 and 1/2, respectively.
Let f : S → R be a given function and let A ⊆ S.
We say that f is bounded from above on A if there is some M such that
f (x) ≤ M for all x ∈ A.
Analogously, f is said to be bounded from below on A if there is some m
such that f (x) ≥ m for all x ∈ A.
If f is both bounded from above and from below on A, then we say that f
is bounded on A.
We say that f is increasing (respectively, decreasing) on A if f(x_1) ≤ f(x_2) (respectively, f(x_1) ≥ f(x_2)) for any x_1, x_2 ∈ A with x_1 < x_2.
We say that f is strictly increasing (respectively, strictly decreasing) on A if f(x_1) < f(x_2) (respectively, f(x_1) > f(x_2)) for any x_1, x_2 ∈ A with x_1 < x_2.
Examples 5.2.
1. S = R, f (x) = x2 . Then f is bounded from below on R (by m = 0)
but f is not bounded from above on R. f is strictly increasing on [0, ∞)
and f is strictly decreasing on (−∞, 0]. f is bounded on any bounded
interval [a, b]. (We see that 0 ≤ f (x) ≤ max{ a2 , b2 } on [a, b].)
2. S = [1, ∞), f(x) = 1 − 1/x for x ∈ S. Then f is increasing and bounded on S. We see that f attains its glb, namely 0, but does not attain its lub, 1.
Definition 5.3. Let f : S → R and let x0 ∈ S. We say that f is continuous
at the point x0 if for any given ε > 0 there is some δ > 0 such that
x ∈ S and |x − x0 | < δ =⇒ |f (x) − f (x0 )| < ε.
We say that f is continuous on some given set A if f is continuous at each
point of A.
What’s going on? Note that continuity is defined at some point x0 . The idea is
that one is first given a margin of error, this is the ε > 0. For f to be continuous
at the specified point x0 , we demand that f (x) be within distance ε of f (x0 ) as
long as x is suitably close to x0 , i.e., x is within some suitable distance δ of x0 .
It must be possible to find such δ no matter how small ε is. In general, one must
expect that the smaller ε is, then the smaller δ will need to be. The requirement
that x ∈ S ensures that f (x) actually makes sense in the first place.
If we set h = x − x0 , then we demand that f (x0 + h) be within distance ε of f (x0 )
whenever |h| < δ (provided that x0 + h ∈ S). The point x0 and the error value ε
must be given first. Then one must be able to find a suitable δ as indicated.
King’s College London
84
Chapter 5
We shall illustrate the idea with a simple example (so no surprises here).
Example 5.4. Let S = R and set f (x) = x2 . Let x0 be arbitrary (but fixed).
We shall show that f is continuous at x0 . The procedure is as follows.
Let ε > 0 be given.
We must find some δ > 0 such that |f (x) − f (x0 )| < ε whenever |x − x0 | < δ.
For convenience, write x = x_0 + h. We see that
    |f(x) − f(x_0)| = |x² − x_0²| = |(x_0 + h)² − x_0²| = |2x_0 h + h²|.   (∗)
How small must h be in order for this to be smaller than ε? We do not need an optimal estimate, any will do. One idea would be to notice that |2x_0 h + h²| ≤ |2x_0 h| + h² and then try to make each of these two terms smaller than ε/2, that is, we try to make sure that both |2x_0 h| < ε/2 and h² < ε/2. This suggests the two requirements that |h| < ε/(4|x_0|) and |h| < √(ε/2). We must be careful here because it might happen that x_0 = 0, in which case we cannot divide by |x_0|. To side-step this nuisance, we shall consider the two cases x_0 = 0 and x_0 ≠ 0 separately.
So first suppose that x_0 ≠ 0. Then we simply choose δ to be the minimum of the two terms ε/(4|x_0|) and √(ε/2). This will ensure that if |h| < δ then |f(x) − f(x_0)| < ε.
Next, suppose that x_0 = 0. Then the right hand side of (∗) is simply equal to h². If we choose δ = √ε, then |h| < δ implies that h² < ε and so, by (∗), we have |f(x) − f(x_0)| < ε.
In either of the cases x_0 = 0 or x_0 ≠ 0, we have exhibited a suitable δ so that |x − x_0| < δ implies that |f(x) − f(x_0)| < ε. We have shown that f(x) = x² is continuous at any given point x_0 ∈ R and the proof is complete.
Notice that the δ depends on both ε and x_0. We must always expect this to happen (even though in some trivial situations it might not).
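The δ exhibited in Example 5.4 can be exercised numerically: pick ε and x_0, compute the δ from the argument above, and sample points within δ of x_0 to confirm that |f(x) − f(x_0)| stays below ε. A finite sample is of course only a sanity check, not a proof; the sketch below is our own addition.

```python
import math
import random

def delta_for(x0, eps):
    """The delta chosen in Example 5.4 for f(x) = x**2."""
    if x0 == 0:
        return math.sqrt(eps)
    return min(eps / (4 * abs(x0)), math.sqrt(eps / 2))

f = lambda x: x**2
for x0, eps in ((3.0, 0.01), (0.0, 0.01), (-5.0, 1e-4)):
    d = delta_for(x0, eps)
    worst = max(abs(f(x0 + random.uniform(-d, d)) - f(x0)) for _ in range(10_000))
    print(f"x0={x0}, eps={eps}, delta={d:.3e}, worst sampled error={worst:.3e}")
```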
The following theorem gives us an extremely useful characterization of
continuity.
Theorem 5.5. Let f : S → R and let x0 ∈ S. The following two statements
are equivalent:
(i) f is continuous at x0 ;
(ii) if (an )n∈N is any sequence in S such that an → x0 as n → ∞, then
the sequence (f (an ))n∈N converges to f (x0 ).
Proof. Suppose that statement (i) holds. To show that (ii) is also true, let
(an ) be any sequence in S with the property that an → x0 as n → ∞. We
must show that f (an ) → f (x0 ) as n → ∞.
Let ε > 0 be given.
By hypothesis, f is continuous at x0 and so there is some δ > 0 such
that
|x − x0 | < δ =⇒ |f (x) − f (x0 )| < ε.
(∗)
But an → x0 as n → ∞ and so there exists N ∈ N such that
n > N =⇒ |an − x0 | < δ.
(∗∗)
Evidently, (∗) and (∗∗) together (with x = an ) tell us that
n > N =⇒ |f (an ) − f (x0 )| < ε
which means that (f (an )) converges to f (x0 ) as n → ∞, as required.
Now suppose that (ii) holds. We must show that this implies that f is
continuous at any x0 ∈ S. Suppose that this were not true, that is, let us
suppose that f is not continuous at the point x0 ∈ S. What does this mean?
It means that there is some ε0 > 0 such that it is false that there is some
δ > 0 so that
x ∈ S and |x − x0 | < δ =⇒ |f (x) − f (x0 )| < ε0 .
That is, there is some ε0 > 0 such that no matter what δ > 0 we choose, it
will be false that
x ∈ S and |x − x0 | < δ =⇒ |f (x) − f (x0 )| < ε0 .
That is, there is some ε0 > 0 such that for any δ > 0 there is some x ∈ S
with |x − x0 | < δ such that it is false that |f (x) − f (x0 )| < ε0 .
That is, there is some ε0 > 0 such that for any δ > 0 there is some
x ∈ S with |x − x0 | < δ such that |f (x) − f (x0 )| ≥ ε0 . Note that x may
well depend on δ.
How does this help? For given n ∈ N, set δ = 1/n. Then, according to the discussion above, there is some point x ∈ S with |x − x_0| < 1/n but such that |f(x) − f(x_0)| ≥ ε_0. The number x could depend on n, so let us relabel it and call it a_n. Then
    |a_n − x_0| < 1/n  but  |f(a_n) − f(x_0)| ≥ ε_0.
If we do this for each n ∈ N we get a sequence (a_n)_{n∈N} in S which clearly converges to x_0. However, because |f(a_n) − f(x_0)| ≥ ε_0 for all n ∈ N, the sequence (f(a_n))_{n∈N} does not converge to f(x_0). This is a contradiction (we started with the hypothesis that (ii) was true). Therefore our assumption that f was not continuous at x_0 is wrong and we conclude that f is indeed continuous at x_0. This completes the proof that the truth of statement (ii) implies that of statement (i).
We can now apply this theorem, together with various known results
about sequences, to establish some (not very surprising but) basic properties
of continuous functions.
Theorem 5.6. Suppose that f : S → R, g : S → R and that α ∈ R. Suppose
that x0 ∈ S and that f and g are continuous at x0 . Then
(i) The sum f + g is continuous at x0 .
(ii) αf is continuous at x0 .
(iii) The product f g is continuous at x0 .
(iv) If g does not vanish on S, then the quotient f /g is defined on S
and is continuous at x0 .
Proof. Suppose that (an ) is any sequence in S with the property that an →
x0 as n → ∞. Then we know from the previous theorem that f (an ) → f (x0 )
and also that g(an ) → g(x0 ) as n → ∞. It follows that
(i) The sum (f + g)(an ) = f (an ) + g(an ) → f (x0 ) + g(x0 ) = (f + g)(x0 )
as n → ∞.
(ii) αf (an ) → αf (x0 ) as n → ∞.
(iii) The product (f g)(an ) = f (an )g(an ) → f (x0 )g(x0 ) as n → ∞.
(iv) Since g does not vanish on S, the quotient f/g is well-defined on S. Moreover, g(a_n) ≠ 0 for any n ∈ N and so (f/g)(a_n) = f(a_n)/g(a_n) → f(x_0)/g(x_0) as n → ∞.
Now applying the previous theorem once again proves (i)—(iv).
Remark 5.7. We could also have proved the above facts directly from the
definition of continuity. For example, a proof that f + g is continuous at x0
is as follows.
Let ε > 0 be given. Then there is some δ′ > 0 such that
    |x − x_0| < δ′ (and x ∈ S) =⇒ |f(x) − f(x_0)| < ε/2.   (∗)
The reason for using ε/2 rather than ε will become clear below. Similarly, there is some δ″ > 0 such that
    |x − x_0| < δ″ (and x ∈ S) =⇒ |g(x) − g(x_0)| < ε/2.   (∗∗)
Now, let δ = min{ δ′, δ″ }. Then, from (∗) and (∗∗),
    |(f + g)(x) − (f + g)(x_0)| = |f(x) − f(x_0) + g(x) − g(x_0)| ≤ |f(x) − f(x_0)| + |g(x) − g(x_0)| < ε/2 + ε/2 = ε
whenever |x − x_0| < δ (and x ∈ S) and so, by definition, it follows that f + g is continuous at the point x_0 in S.
Remark 5.8. The function f (x) = x is continuous on R and so with g = f , we
deduce from the theorem that f 2 (x) is also continuous on R. This is just the
statement that the function x2 is continuous. By induction, we can deduce
from the theorem that products of continuous functions and also finite linear
combinations of continuous functions are continuous, i.e., if f1 , . . . , fk are
each continuous at x0 , then so is the product function f1 f2 . . . fk as well
as the linear combination α1 f1 + · · · + αk fk , for any α1 , . . . , αk ∈ R. In
particular, any power of a continuous function is continuous and taking
f (x) = x, we see that any polynomial a0 + a1 x + · · · + an xn is continuous
on R.
Example 5.9. The function x ↦ √x is continuous on [0, ∞). This can be shown as follows.
Let x_0 ∈ [0, ∞) be fixed and let ε > 0 be given. Suppose first that x_0 > 0. For any x ≥ 0, we have
    |√x − √x_0| = |√x − √x_0| · (√x + √x_0)/(√x + √x_0) = |x − x_0|/(√x + √x_0) ≤ |x − x_0|/√x_0 < ε
provided |x − x_0| < δ, where we have chosen δ = ε √x_0.
To conclude, consider the case x_0 = 0. Then we simply observe that
    |√x − √x_0| = √x < ε
whenever |x − 0| < δ with δ chosen to be ε².
Example 5.10. The function x 7→ 1/x, for x > 0, is continuous on (0, ∞).
Let f (x) = 1/x for x > 0. To show that f is continuous on (0, ∞), let
x0 ∈ (0, ∞) be given and suppose that (an )n∈N is any sequence in (0, ∞)
such that an → x0 as n → ∞. We know that this means that 1/an → 1/x0 ,
that is, f (an ) → f (x0 ) as n → ∞. But this implies that f is continuous at
x0 , as required.
Note that f is bounded from below (by 0) but f is not bounded from
above on (0, ∞). For any M > 0, there is k ∈ N such that k > M , by the
Archimedean Property. Hence, if 0 < x < 1/k, then f (x) = 1/x > k > M .
It follows that there is no constant M such that f (x) < M for all x ∈ (0, ∞),
that is, f is not bounded from above on (0, ∞).
From the example above, we see that if f (x) = 1/x for x in any interval
of the form (0, b), say, then f is continuous on (0, b) but is not bounded
there. This situation cannot happen on closed intervals. This is the content
of the following important theorem.
Theorem 5.11. Suppose that the function f : [a, b] → R is continuous on the
closed interval [a, b]. Then f is bounded on [a, b].
Proof. We argue by contradiction. Suppose that f is continuous on [a, b] but
is not bounded. Suppose that f is not bounded from above. This means
that for any given M whatsoever, there will be some x ∈ [a, b] such that
f (x) > M . In particular, for each n ∈ N (taking M = n) we know that
there is some point an , say, in the interval [a, b] such that f (an ) > n.
Consider the sequence (an )n∈N . This sequence lies in the bounded interval [a, b] and so, by the Bolzano-Weierstrass Theorem, it has a convergent
subsequence (ank )k∈N , say; ank → α as k → ∞. Since a ≤ ank ≤ b for all
k, it follows that a ≤ α ≤ b. (The limit of a convergent sequence belonging
to a closed interval also belongs to the same closed interval.) But, by hypothesis, f is continuous at α and so ank → α implies that f (ank ) → f (α).
It is this that will provide our sought after contradiction. By construction,
f (ank ) > nk and so it looks rather unlikely that (f (ank )) could converge.
To see that this is the situation, we observe that there is some K ∈ N such
that
|f (ank ) − f (α)| < 1
for all k > K (because f (ank ) → f (α)). But then
f (ank ) = f (ank ) − f (α) + f (α)
≤ |f (ank ) − f (α)| + f (α)
< 1 + f (α)
for all k > K. However, f(a_{n_k}) > n_k ≥ k, so 1 + f(α) > k for all k > K, which is impossible.
This is a contradiction and we conclude that f is bounded from above.
To show that f is also bounded from below, we consider g = −f . Then
g is continuous because f is. The argument just presented, applied to g,
shows that g is bounded from above. But this just means that f is bounded
from below and the proof is complete.
Remark 5.12. The two essential ingredients are that f is continuous and that
the interval is both closed and bounded. The boundedness was required so
that we could invoke the Bolzano-Weierstrass Theorem and the fact that it
was closed ensured that α, the limit of the Bolzano-Weierstrass convergent
subsequence actually also belonged to the interval. This in turn guaranteed
that f was not only defined at α but was continuous there.
If we try to relax these requirements, we see that the conclusion of the
theorem need no longer be true. For example, we must insist that f be
continuous. Indeed, consider the function f on the closed interval [0, 1]
given by f(x) = 0 for x = 0, and f(x) = 1/x for 0 < x ≤ 1.
Evidently f is not bounded on [0, 1] but then f is not continuous at the
point x = 0.
Taking f (x) = 1/x for x ∈ (0, 1], we see that again f is not bounded on
(0, 1], but then (0, 1] is not a closed interval.
Let f (x) = x for x ∈ [0, ∞). Again, f is not bounded on the interval [0, ∞)
but this interval is not bounded.
We have seen that a continuous function on a closed interval is bounded.
The next theorem tells us that it attains its bounds.
Theorem 5.13. Suppose that f is continuous on the closed interval [a, b].
Then there is some α ∈ [a, b] and β ∈ [a, b] such that f (α) ≤ f (x) ≤ f (β)
for all x ∈ [a, b]. In other words, if ran f = { f (x) : x ∈ [a, b] } is the range
of f , then f (α) = inf ran f = min ran f and f (β) = sup ran f = max ran f .
Proof. We have seen that f is bounded. Let m = inf ran f and M =
sup ran f . By definition of the supremum, there is some sequence (yn ) in
ran f such that yn → M as n → ∞. Since yn ∈ ran f , there is some
xn ∈ [a, b] such that yn = f (xn ). By the Bolzano-Weierstrass Theorem, (xn )
has a convergent subsequence (xnk )k∈N . Let β = limk xnk . Then β ∈ [a, b].
Since f is continuous on [a, b], it follows that f (xnk ) → f (β) as k → ∞.
But f (xnk ) = ynk and (ynk )k∈N is a subsequence of the convergent sequence
(yn ). Therefore (ynk )k∈N converges to the same limit, that is, ynk → M as
k → ∞. Since ynk = f (xnk ) → f (β) as k → ∞, we deduce that M = f (β).
That is, sup ran f = f (β) and so f (x) ≤ f (β) for all x ∈ [a, b].
We can argue in a similar way to show that there is some α ∈ [a, b] such
that m = f (α). However, we can draw the same conclusion using the above
result as follows. Note that if g = −f , then g is continuous on the interval
[a, b] and sup ran g = −m. By the argument above, there is some α ∈ [a, b]
such that −m = g(α). This gives the desired result that m = f (α).
Alternative Proof. We know that f is bounded. Let M = sup ran f . To
show that f achieves its least upper bound M , we suppose not and obtain a
contradiction. Since M is an upper bound and is not achieved by f , we must
have that f (x) < M for all x ∈ [a, b]. In particular, M − f is continuous and
strictly positive on [a, b]. It follows that h = 1/(M − f ) is also continuous
and positive on [a, b]. But then h is bounded on [a, b] and so there is some
constant K such that 0 < h ≤ K on [a, b], that is,
    0 < 1/(M − f) ≤ K.
Hence f ≤ M − 1/K which says that M − 1/K is an upper bound for f on
[a, b]. But then this contradicts the fact that M is the least upper bound for
f on [a, b]. We conclude that f achieves this bound, i.e., there is β ∈ [a, b]
such that f (β) = M = sup ran f .
In a similar way, if f does not achieve its greatest lower bound, m, then
f − m is continuous and strictly positive on [a, b]. Hence there is L such
that
    0 < 1/(f − m) ≤ L
on [a, b]. Hence m + 1/L ≤ f and m + 1/L is a lower bound for f on [a, b].
This contradicts the fact that m is the greatest lower bound for f on [a, b]
and we can conclude that f does achieve its greatest lower bound, that is,
there is α ∈ [a, b] such that f (α) = m.
Theorem 5.14 (Intermediate-Value Theorem). Any real-valued function f
continuous on the interval [a, b] assumes all values between f (a) and f (b).
In other words, if ζ lies between the values f (a) and f (b), then there is
some s with a ≤ s ≤ b such that f (s) = ζ.
Proof. Suppose f is continuous on [a, b] and let ζ be any value between f (a)
and f (b). If ζ = f (a), take s = a and if ζ = f (b) take s = b.
Suppose that f (a) < f (b) and let f (a) < ζ < f (b). Let A be the set
A = { x ∈ [a, b] : f (x) < ζ }. Then a ∈ A and so A is a non-empty subset
of the bounded interval [a, b]. Hence A is bounded and so has a least upper
bound, s, say. We shall show that f (s) = ζ.
Since s = lub A, there is some sequence (an ) in A such that an ↑ s. But
A ⊆ [a, b] and so a ≤ an ≤ b and it follows that a ≤ s ≤ b. Furthermore,
by the continuity of f at s, it follows that f (an ) → f (s). However, an ∈ A
and so f (an ) < ζ for each n and it follows that f (s) ≤ ζ. Since, in addition,
ζ < f (b), we see that s 6= b and so we must have a ≤ s < b.
Let (t_n) be any sequence in (s, b) such that t_n → s. Since t_n ∈ [a, b] and t_n > s, it must be the case that t_n ∉ A, that is, f(t_n) ≥ ζ. Now, f
is continuous at s and so f (tn ) → f (s) which implies that f (s) ≥ ζ. We
deduce that f (s) = ζ, as required.
Now suppose that f (a) > ζ > f (b). Set g(x) = −f (x). Then we have
that g(a) < −ζ < g(b) and applying the above result to g, we can say that
there is s ∈ [a, b] such that g(s) = −ζ, that is f (s) = ζ and the proof is
complete.
Corollary 5.15. Suppose that f is continuous on [a, b]. Then ran f , the range
of f , is a closed interval [m, M ].
Proof. We know that f is bounded and that f achieves its bounds, that is,
there is α ∈ [a, b] and β ∈ [a, b] such that
m = inf ran f = f (α) ≤ f (x) ≤ M = sup ran f = f (β)
for all x ∈ [a, b]. Evidently, ran f ⊆ [m, M ].
Let c obey m ≤ c ≤ M . By the Intermediate-Value Theorem, there is
some s between α and β such that f (s) = c. In particular, c ∈ ran f and so
we conclude that ran f = [m, M ].
Example 5.16. f (x) = x6 + 3x2 − 1 has a zero inside the interval [0, 1].
To see this, we simply notice that f (0) = −1 and f (1) = 3. Since f is
continuous on R, it is continuous on [0, 1] and so, by the Intermediate-Value
Theorem, f assumes every value between −1 and 3 over the interval [0, 1].
In particular, there is some s ∈ [0, 1] such that f (s) = 0, as claimed. Of
course, this argument does not tell us whether such s is unique or not. (In
fact, it is because f is strictly increasing on [0, ∞) and so cannot take any
value twice on [0, ∞). A moment’s reflection reveals that f (x) ≥ f (0) = −1,
f is not bounded from above and f (−x) = f (x). Therefore f assumes every
value in the range (−1, ∞) exactly twice and assumes the value −1 at the
single point x = 0.)
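The Intermediate-Value Theorem is also the principle behind the bisection method: if f changes sign on an interval, halving the interval repeatedly traps a zero. A minimal sketch (our own, not part of the notes) locating the zero of f(x) = x⁶ + 3x² − 1 in [0, 1]:

```python
def f(x):
    return x**6 + 3 * x**2 - 1

# Bisection: maintain [a, b] with f(a) < 0 < f(b); by the Intermediate-Value
# Theorem the interval always contains a zero of f.
a, b = 0.0, 1.0          # f(0) = -1 < 0 and f(1) = 3 > 0
for _ in range(60):
    mid = 0.5 * (a + b)
    if f(mid) < 0:
        a = mid
    else:
        b = mid

root = 0.5 * (a + b)
print(f"zero near x = {root:.10f}, f(root) = {f(root):.3e}")
```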
Example 5.17 (Thomae’s function). We wish to exhibit a function which is
continuous at each irrational point in [0, 1] but is not continuous at any
rational point in [0, 1]. Such a function was constructed by Thomae in 1875.
Any rational number x may be written as x = p/q where we may assume
that p and q are coprime and that p ∈ Z and q ∈ N. This done, we define
ϕ : Q → R by setting ϕ(x) = 1/q where x = p/q. For example,
    ϕ(x) = 1 for x = 1,
    ϕ(x) = 1/2 for x = 1/2,
    ϕ(x) = 1/3 for x = 1/3, 2/3,
    ϕ(x) = 1/4 for x = 1/4, 3/4,
    ...
    ϕ(x) = 1/11 for x = 1/11, 2/11, ..., 10/11, and so on.
Suppose x ∈ Q obeys 0 < x < 1 and that ϕ(x) = 1/q. Then x must be of the form x = p/q for some p ∈ N with 1 ≤ p ≤ q − 1. In particular, for any given q ∈ N, { x ∈ Q : 0 < x < 1 and ϕ(x) = 1/q } is a finite set of rational numbers.
Next, we define f : [0, 1] → R with the help of ϕ as follows:
    f(x) = 1 if x = 0,  f(x) = ϕ(x) if x ∈ Q ∩ [0, 1],  f(x) = 0 if x ∈ [0, 1] and x ∉ Q.
Claim: f is discontinuous at every rational in [0, 1].
Proof. First we note that f(0) = 1 and that f(x) = 1/q when x has the form p/q (with p, q coprime). In any event, f(x) > 0 for any given rational x in [0, 1]. Now let r ∈ Q ∩ [0, 1] be given and let (x_n) be any sequence of irrationals in [0, 1] which converge to r. (For example, if r ≠ 0 we could let x_n = r(1 − 1/(n√2)) but otherwise let x_n = 1/(n√2).) Then f(x_n) = 0 for every n, so it cannot be true that f(x_n) → f(r) (because f(r) > 0), that is, f fails to be continuous at r, as claimed.
Claim: f is continuous at every irrational in [0, 1].
Proof. Let x_0 be any given irrational number with 0 < x_0 < 1. Then
f (x0 ) = 0.
Let ε > 0 be given. We must show that there is some δ > 0 such that
    x ∈ [0, 1] and |x − x_0| < δ =⇒ |f(x) − f(x_0)| < ε.   (∗)
Since f(x_0) = 0, the quantity |f(x) − f(x_0)| here is just |f(x)|.
Now f (0) = f (1) = 1 and so (∗) must fail for x = 0 or x = 1 if ε < 1.
Furthermore, (∗) fails if x = p/q (p, q coprime) and ϕ(x) = 1/q ≥ ε, that is,
q ≤ 1/ε. In other words, (∗) will fail if x = 0, x = 1 or else x ∈ Q ∩ [0, 1] and
ϕ(x) = 1/q where q ≤ 1/ε. However, there are only finitely-many numbers
q ∈ N obeying q ≤ 1/ε and so the set
A = { r ∈ Q ∩ [0, 1] : r = 0 or ϕ(r) ≥ ε }
is finite. Write A = { r1 , . . . , rm }.
Since x_0 ∉ Q, it follows that x_0 ≠ r_j for any 1 ≤ j ≤ m. For each 1 ≤ j ≤ m, let δ_j = |x_0 − r_j| and let δ = min{ δ_j : 1 ≤ j ≤ m }. Then δ > 0 and if x obeys |x − x_0| < δ it must be the case that x ≠ r_j for any 1 ≤ j ≤ m. It follows that if x ∈ [0, 1] and obeys |x − x_0| < δ, then either x ∉ Q and so f(x) = 0, or else x ∈ Q but x ∉ A and so f(x) = ϕ(x) < ε. In any event, (∗) holds and so f is continuous at x_0, as required.
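Thomae's function is easy to evaluate exactly at rational points using Python's `fractions` module, and doing so makes the key point of the proof concrete: for a given ε, only finitely many rationals in [0, 1] produce a value of size ε or more. The sketch below is our own illustration (it only handles rational inputs; at irrationals the function is 0 by definition).

```python
from fractions import Fraction

def thomae(x):
    """Value of f from Example 5.17 at a rational x in [0, 1], given as a Fraction."""
    if x == 0:
        return Fraction(1)
    return Fraction(1, x.denominator)   # a Fraction is stored in lowest terms

# For eps = 1/5, the rationals in [0, 1] with f(x) >= eps are exactly those with
# reduced denominator at most 5 (together with 0): a finite set.
eps = Fraction(1, 5)
big = sorted({Fraction(p, q) for q in range(1, 50) for p in range(0, q + 1)
              if thomae(Fraction(p, q)) >= eps})
print(big)
```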
Differentiability
We know from calculus that the slope of the tangent to the graph of a
function f at some point is given by the so-called derivative at the point
in question. To find this slope, one considers the limiting behaviour of the
Newton quotient
f (a + h) − f (a)
h
as h approaches 0. We wish to set this up formally.
Definition 5.18. We say that the function f is differentiable at the point a if lim_{h→0, h≠0} (f(a + h) − f(a))/h exists, that is, if there is some ξ ∈ R such that for any ε > 0 there is some δ > 0 such that
    0 < |h| < δ =⇒ | (f(a + h) − f(a))/h − ξ | < ε.
The real number ξ is called the derivative of f at a and is usually written as f′(a) or as (df/dx)(a).
Remarks 5.19.
1. Note that the Newton quotient (f(a + h) − f(a))/h is not defined for h = 0 and clearly, it will only make any sense if both f(a) and f(a + h) are defined. We shall take it to be part of the definition that this is true, at least for suitably small values of h. That is, we assume that there is some (possibly very small) open interval around a of the form (a − ρ, a + ρ) on which f is defined. This means that if a function is defined only on the integers Z, say, then it will not make any sense to discuss its differentiability.
2. We see immediately that if f is constant, then f(a + h) = f(a) for any h and so the Newton quotient is zero for all h ≠ 0 and therefore f is indeed differentiable at a with derivative f′(a) = 0.
3. Suppose that f is differentiable at a with derivative f′(a). Let Φ_{f,a} be the function given by
    Φ_{f,a}(x) = (f(x) − f(a))/(x − a) for x ≠ a,  and  Φ_{f,a}(a) = f′(a).
Then
    Φ_{f,a}(a + h) = (f(a + h) − f(a))/h for h ≠ 0,  and  Φ_{f,a}(a + h) = f′(a) for h = 0.
By definition of differentiability, for any given ε > 0 there is some δ > 0 such that
    0 < |h| < δ =⇒ |Φ_{f,a}(a + h) − f′(a)| < ε,
that is,
    0 < |h| < δ =⇒ |Φ_{f,a}(a + h) − Φ_{f,a}(a)| < ε.   (∗)
Now, (∗) is still valid if we allow h = 0 and so (with x = a + h), we see that
    |x − a| < δ =⇒ |Φ_{f,a}(x) − Φ_{f,a}(a)| < ε.
In other words, the differentiability of f implies that Φ_{f,a} is continuous at x = a.
Example 5.20. Let f(x) = x³ for x ≥ 0, and f(x) = x² for x < 0. What is f′(x)?
Consider the region x > 0. Here, f(x) = x³ and so f is differentiable with derivative 3x² for any x > 0. In the region x < 0, f(x) = x² and so f′(x) = 2x for any x < 0. What about x = 0? We must argue from first principles. The Newton quotient (with a = 0) is
    (f(0 + h) − f(0))/h = (h³ − 0)/h = h² for h > 0,  and  (h² − 0)/h = h for h < 0,
and in either case this tends to 0 as h → 0. Hence f is differentiable at x = 0 with derivative f′(0) = 0.
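The first-principles computation in Example 5.20 can be mirrored numerically by evaluating the Newton quotient at 0 for shrinking h of either sign; this is only an illustration we have added, with arbitrary step sizes.

```python
def f(x):
    return x**3 if x >= 0 else x**2

# Newton quotient (f(0 + h) - f(0)) / h for small h of either sign.
for h in (0.1, 0.01, 0.001, -0.1, -0.01, -0.001):
    q = (f(h) - f(0)) / h
    print(f"h = {h:+.3f}: quotient = {q:+.6f}")
# Both one-sided quotients tend to 0, in agreement with f'(0) = 0.
```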
Proposition 5.21. If f is differentiable at a, then f is continuous at a.
Proof. The idea is straightforward. For h ≠ 0, we can write
    f(a + h) − f(a) = ((f(a + h) − f(a))/h) · h.
The first factor on the right hand side approaches f′(a) as h → 0 and so the whole right hand side should approach zero as h → 0. Looking at the left hand side, this means that f(a + h) approaches f(a) as h → 0. Formally, we have
    f(x) = f(a) + Φ_{f,a}(x)(x − a).
The right hand side involves the product of the two functions Φ_{f,a}(x) and (x − a), each being continuous at x = a, and so the same is true of their product. Therefore the left hand side is continuous at x = a, as required.
Example 5.22. The converse to Proposition 5.21 is false. As an example,
consider f (x) = |x| for x ∈ R. Then f is continuous at every x ∈ R.
However, f is not differentiable at x = 0. Indeed,
    (f(0 + h) − f(0))/h = (|0 + h| − |0|)/h = |h|/h = 1 if h > 0, and −1 if h < 0,
so the Newton quotient does not have a limit as h → 0 (with h ≠ 0) and consequently f is not differentiable at x = 0.
The following are familiar and very important rules.
Proposition 5.23. Suppose that f and g are differentiable at x_0.
(i) For any α ∈ R, αf is differentiable at x_0 and (αf)′(x_0) = α f′(x_0).
(ii) The sum f + g is differentiable at x_0 and (f + g)′(x_0) = f′(x_0) + g′(x_0).
(iii) The product fg is differentiable at x_0 and (fg)′(x_0) = f′(x_0) g(x_0) + f(x_0) g′(x_0).
(iv) Suppose that f ≠ 0. Then 1/f is differentiable at x_0 and (1/f)′(x_0) = −f′(x_0)/(f(x_0))².
Proof. In the following, h is small but h ≠ 0.
(i) We have
    ((αf)(x_0 + h) − (αf)(x_0))/h = (α f(x_0 + h) − α f(x_0))/h = α Φ_{f,x_0}(x_0 + h) → α f′(x_0)
as h → 0.
(ii) We have
    ((f + g)(x_0 + h) − (f + g)(x_0))/h = (f(x_0 + h) + g(x_0 + h) − f(x_0) − g(x_0))/h
        = (f(x_0 + h) − f(x_0))/h + (g(x_0 + h) − g(x_0))/h
        → f′(x_0) + g′(x_0)
as h → 0.
(iii) We have
    ((fg)(x_0 + h) − (fg)(x_0))/h = (f(x_0 + h) g(x_0 + h) − f(x_0) g(x_0))/h
        = ((f(x_0 + h) − f(x_0))/h) g(x_0 + h) + ((g(x_0 + h) − g(x_0))/h) f(x_0)
        → f′(x_0) g(x_0) + g′(x_0) f(x_0)
as h → 0, since g is continuous at x_0.
(iv) We have
    (1/f(x_0 + h) − 1/f(x_0))/h = (1/h)(f(x_0) − f(x_0 + h))/(f(x_0 + h) f(x_0))
        = −Φ_{f,x_0}(x_0 + h)/(f(x_0 + h) f(x_0))
        → −f′(x_0)/(f(x_0))²
as h → 0, since f is continuous at x_0.
Recall that f ◦ g denotes the composition x 7→ f (g(x)) (function of a
function). Of course, for this to be well-defined the range of g must be
contained in the domain of definition of f . In the following, we assume that
this is satisfied.
Theorem 5.24 (Chain Rule). Suppose that g is differentiable at x0 and that
f is differentiable at v0 = g(x0 ). Then the composition f ◦ g is differentiable
at x0 and
(f ◦ g)0 (x0 ) = f 0 (g(x0 )) g 0 (x0 ) .
Proof. Suppose that h is small and that h ≠ 0. Let v_0 = g(x_0) and put λ = g(x_0 + h) − g(x_0) so that g(x_0 + h) = v_0 + λ. Then
    ((f ∘ g)(x_0 + h) − (f ∘ g)(x_0))/h = (f(g(x_0 + h)) − f(g(x_0)))/h
        = (f(v_0 + λ) − f(v_0))/h
        = Φ_{f,v_0}(v_0 + λ) · λ/h   (even if λ = 0)
        = Φ_{f,v_0}(v_0 + λ) · ((g(x_0 + h) − g(x_0))/h)
        = Φ_{f,v_0}(v_0 + λ) Φ_{g,x_0}(x_0 + h).
Now, g(x0 + h) → g(x0 ) as h → 0 because g is continuous at x0 . In other
words, λ = g(x0 + h) − g(x0 ) → 0 as h → 0. It follows that
    Φ_{f,v_0}(v_0 + λ) Φ_{g,x_0}(x_0 + h) → Φ_{f,v_0}(v_0) Φ_{g,x_0}(x_0) = f′(v_0) g′(x_0) = f′(g(x_0)) g′(x_0)
as h → 0 and the result follows.
Imagine a function f (x) on the interval [0, 1], say, which has the property
that f (0) = f (1). Can we draw any conclusions about the behaviour of f (x)
for x between 0 and 1? It seems clear that either f is constant on [0, 1] or
else “goes up and/or down” but in any event must have a “turning point”. We know from calculus that this should demand that f′ be zero somewhere. However, it is clear that f cannot be entirely arbitrary for this to be true. For example, suppose that f(0) = 0 = f(1) and that f(x) = 5x for 0 < x < 1. Evidently f′ is never zero. In fact, f′(x) = 5 for 0 < x < 1. We note that f is not continuous at x = 1.
As another example, consider f(x) = 1 − |x| for x ∈ [−1, 1]. We see that f(−1) = 0 = f(1), but is it true that f′ is zero for some x between −1 and 1? No, it is not. We see that f′(x) = 1 for −1 < x < 0 and that f′(x) = −1 for 0 < x < 1, and f is not differentiable at x = 0. In this example, f is
continuous on [−1, 1] but fails to be differentiable on (−1, 1).
If we impose suitable continuity and differentiability hypotheses, then
what we want will be true.
Theorem 5.25 (Rolle’s Theorem). Suppose that f is continuous on the closed interval [a, b] and is differentiable in the open interval (a, b). Suppose further that f(a) = f(b). Then there is some ξ ∈ (a, b) such that f′(ξ) = 0. (Note that ξ need not be unique.)
Proof. Since f is continuous on [a, b], it follows that f is bounded and attains its bounds, by Theorem 5.13. Let m = inf{ f(x) : x ∈ [a, b] } and let M = sup{ f(x) : x ∈ [a, b] }, so that
    m ≤ f(x) ≤ M  for all x ∈ [a, b].
If m = M, then f is constant on [a, b] and this means that f′(x) = 0 for all x ∈ (a, b). In this case, any ξ ∈ (a, b) will do.
Suppose now that m ≠ M, so that m < M. Since f(a) = f(b), at least one of m or M must be different from this common value f(a) = f(b).
Suppose that M ≠ f(a) (= f(b)). As noted above, by Theorem 5.13, there is some ξ ∈ [a, b] such that f(ξ) = M. Now, M ≠ f(a) and M ≠ f(b) and so ξ ≠ a and ξ ≠ b. It follows that ξ belongs to the open interval (a, b).
We shall show that f′(ξ) = 0. To see this, we note that f(x) ≤ M = f(ξ) for any x ∈ [a, b] and so (putting x = ξ + h) it follows that f(ξ + h) − f(ξ) ≤ 0 provided |h| is small enough to ensure that ξ + h ∈ [a, b]. Hence
    (f(ξ + h) − f(ξ))/h ≤ 0 for h > 0 and small   (∗)
and
    (f(ξ + h) − f(ξ))/h ≥ 0 for h < 0 and small.   (∗∗)
But (∗) approaches f′(ξ) as h ↓ 0, which implies that f′(ξ) ≤ 0. On the other hand, (∗∗) approaches f′(ξ) as h ↑ 0 and so f′(ξ) ≥ 0. Putting these two results together, we see that it must be the case that f′(ξ) = 0, as required.
It remains to consider the case when M = f(a). This must require that m < f(a) (= f(b)). We proceed now just as before to deduce that there is some ξ ∈ (a, b) such that f(ξ) = m and so (∗) and (∗∗) hold but with the inequalities reversed. However, the conclusion is the same, namely that f′(ξ) = 0.
Theorem 5.26 (Mean Value Theorem). Suppose that f is continuous on the closed interval [a, b] and differentiable on the open interval (a, b). Then there is some ξ ∈ (a, b) such that
    f′(ξ) = (f(b) − f(a))/(b − a).
Proof. Let y = ℓ(x) = mx + c be the straight line passing through the pair of points (a, f(a)) and (b, f(b)). Then the slope m is equal to the ratio (f(b) − f(a))/(b − a).
Let g(x) = f(x) − ℓ(x). Evidently, g is continuous on [a, b] and differentiable on (a, b) (because ℓ is). Furthermore, since ℓ(a) = f(a) and ℓ(b) = f(b), by construction, we find that g(a) = 0 = g(b). By Rolle's Theorem, Theorem 5.25, applied to g, there is some ξ ∈ (a, b) such that g′(ξ) = 0. However, g′(x) = f′(x) − m for any x ∈ (a, b) and so
    f′(ξ) = m = (f(b) − f(a))/(b − a)
and the proof is complete.
We know that a function which is constant on an open interval is differentiable and that its derivative is zero. The converse is true (so no surprise
there then).
Corollary 5.27. Suppose that f is differentiable on the open interval (a, b) and that f′(x) = 0 for all x ∈ (a, b). Then f is constant on (a, b).
Proof. Let α and β be any pair of points in (a, b). We shall show that f(α) = f(β). By relabelling, if necessary, we may suppose that α < β. By hypothesis, f is differentiable at each point in the closed interval [α, β] and so is also continuous there, by Proposition 5.21. Thus f obeys the hypotheses of the Mean Value Theorem on [α, β] and so we can say that there is some ξ ∈ (α, β) such that
    f′(ξ) = (f(β) − f(α))/(β − α).
However, f′ vanishes on (a, b) and so f′(ξ) = 0, which means that we must have f(α) = f(β) and the result follows.
Remark 5.28. The Mean Value Theorem can sometimes be useful for obtaining inequalities. For example, setting f(x) = sin x and assuming standard properties of the trigonometric functions, we can apply the Mean Value Theorem to f on the interval [0, x] for x > 0 to find that
    f′(ξ) = (f(x) − f(0))/(x − 0),  that is,  cos ξ = (sin x)/x
for some ξ ∈ (0, x). However, cos θ ≤ 1 for all θ and so we find that sin x ≤ x for all x > 0.
Similarly, applying the Mean Value Theorem to f(x) = ln(1 + x) on the interval [0, x], we find that
    f′(ξ) = (f(x) − f(0))/(x − 0),  that is,  1/(1 + ξ) = ln(1 + x)/x
for some ξ ∈ (0, x). But then 1/(1 + ξ) < 1 and we find that ln(1 + x) < x for any x > 0.
These inequalities could also have easily been obtained from the fact that the integral of a positive function is positive. Indeed,
    x − sin x = ∫_0^x (1 − cos t) dt ≥ 0.
In the same way,
    x − ln(1 + x) = ∫_0^x (1 − 1/(1 + t)) dt ≥ 0.
In fact, one can show that both integrals are strictly positive if x > 0, so this last method gives the strict inequalities sin x < x and ln(1 + x) < x for all x > 0. (In this connection, note that if ln(1 + x) = x, then 1 + x = e^x. This is not possible for any x > 0, as is seen from the series expansion for e^x.)
Suppose that f and g are continuous on [a, b], differentiable on (a, b) and that g′ is never zero on (a, b). The Mean Value Theorem applied to f and g tells us that there is some ξ and η in (a, b) such that
    (f(b) − f(a))/(b − a) = f′(ξ)  and  (g(b) − g(a))/(b − a) = g′(η).
Dividing (and noting that g(b) − g(a) ≠ 0 since g′(η) ≠ 0, by hypothesis) gives
    (f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(η).
It is possible to do a little better.
Theorem 5.29 (Cauchy’s Mean Value Theorem). Suppose that f and g are continuous on [a, b] and differentiable on (a, b). Suppose further that g′ is never zero on (a, b). Then there is some ξ ∈ (a, b) such that
    (f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(ξ).
Proof. First, we observe that if g(a) = g(b) then Rolle's Theorem tells us that g′(η) = 0 for some η ∈ (a, b). However, g′ has no zeros on (a, b), by hypothesis, and so it follows, as noted above, that g(a) ≠ g(b). Set
    ϕ(x) = (g(b) − g(a)) f(x) − (f(b) − f(a)) g(x).
Then
    ϕ(a) = g(b) f(a) − f(b) g(a) = ϕ(b)
and ϕ satisfies the hypotheses of Rolle's Theorem. Hence there is some ξ ∈ (a, b) such that ϕ′(ξ) = 0, that is,
    (g(b) − g(a)) f′(ξ) − (f(b) − f(a)) g′(ξ) = 0
or
    (f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(ξ),
as required.
Remark 5.30. Notice that interchanging a and b does not affect the left hand side of the above equality. This means that we can slightly rephrase Cauchy's Mean Value Theorem to say that for any a ≠ b there is some ξ between a and b such that
    (f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(ξ),
regardless of whether a < b or a > b.
Taylor’s Theorem
It is convenient to let f^{(k)} denote the k-th derivative of f (whenever it exists).
Now, if k > j, then d^k(x^j)/dx^k = 0, whereas if k ≤ j, then we see that d^k(x^j)/dx^k = j(j − 1)···(j − (k − 1)) x^{j−k}. For k < j this vanishes when x = 0, and so we see that
    d^k(x^j)/dx^k evaluated at x = 0 equals 0 whenever k ≠ j (k, j ∈ N).
Consider the polynomial p(x) = α_0 + α_1 x + α_2 x² + ··· + α_m x^m. Taking derivatives and setting x = 0, we find p(0) = α_0, p′(0) = α_1, p^{(2)}(0) = 2α_2, p^{(3)}(0) = 3! α_3. In general,
    p^{(k)}(0) = k! α_k.
Now consider some general function f(x) and define
    a_0 = f(0),  a_1 = f′(0),  a_2 = (1/2) f^{(2)}(0),  ...,  a_k = (1/k!) f^{(k)}(0),  ..., etc.
Let
    P_{n−1}(x) = a_0 + a_1 x + a_2 x² + ··· + a_{n−1} x^{n−1},
    R_n(x) = f(x) − P_{n−1}(x).
If f(x) is a polynomial of degree n − 1, then f(x) = P_{n−1}(x) and R_n(x) = 0. So, in general, we can think of P_{n−1}(x) as a polynomial approximation to f(x) and R_n(x) as the remainder; the smaller R_n(x) is, the closer f(x) is to a polynomial. The question is, what can be said about R_n(x)? This is the content of Taylor's Theorem.
To begin with, we notice that for k ≤ n − 1,
    R_n^{(k)}(0) = f^{(k)}(0) − P_{n−1}^{(k)}(0) = f^{(k)}(0) − k! a_k = 0,
by our construction of the a_k's. We will use this in the following discussion.
Now, for x ≠ 0, we apply Cauchy's Mean Value Theorem to the pair of functions R_n(t) and g_n(t) = t^n, to write
    (R_n(x) − R_n(0))/(g_n(x) − g_n(0)) = R_n′(ζ)/g_n′(ζ)
for some ζ lying between 0 and x. (It does not matter whether x > 0 or x < 0.) Now, any such ζ can be expressed in the form ζ = θ_1 x for some 0 < θ_1 < 1. Hence
    R_n(x)/g_n(x) = (R_n(x) − R_n(0))/(g_n(x) − g_n(0)) = R_n′(θ_1 x)/g_n′(θ_1 x)
for some 0 < θ_1 < 1, since both R_n(0) = 0 and g_n(0) = 0.
We repeat this argument, applied successively to R_n^{(k)}(t) and g_n^{(k)}(t), and use the facts that R_n^{(k)}(0) = 0 and g_n^{(k)}(0) = 0 for k ≤ n − 1, to deduce that
    R_n(x)/g_n(x) = (R_n(x) − R_n(0))/(g_n(x) − g_n(0)) = R_n′(θ_1 x)/g_n′(θ_1 x)   for some 0 < θ_1 < 1,
        = (R_n′(θ_1 x) − R_n′(0))/(g_n′(θ_1 x) − g_n′(0)) = R_n″(θ_2 θ_1 x)/g_n″(θ_2 θ_1 x)   for some 0 < θ_2 < 1,
        = (R_n″(θ_2 θ_1 x) − R_n″(0))/(g_n″(θ_2 θ_1 x) − g_n″(0)) = R_n^{(3)}(θ_3 θ_2 θ_1 x)/g_n^{(3)}(θ_3 θ_2 θ_1 x)   for some 0 < θ_3 < 1,
        ...
        = (R_n^{(n−1)}(θ_{n−1}···θ_1 x) − R_n^{(n−1)}(0))/(g_n^{(n−1)}(θ_{n−1}···θ_1 x) − g_n^{(n−1)}(0)) = R_n^{(n)}(θ_n···θ_1 x)/g_n^{(n)}(θ_n···θ_1 x)   for some 0 < θ_n < 1.
However, R_n^{(n)}(s) = f^{(n)}(s) − P_{n−1}^{(n)}(s) = f^{(n)}(s), since P_{n−1}^{(n)}(s) = 0, and g_n^{(n)}(s) = n!. Let τ = θ_1 θ_2 ··· θ_n. Then 0 < τ < 1 and we get that
    R_n(x)/g_n(x) = (f(x) − P_{n−1}(x))/x^n = f^{(n)}(τx)/n!.
We can rewrite this to give
    f(x) = P_{n−1}(x) + (x^n/n!) f^{(n)}(τx)
for some 0 < τ < 1. We have established the following theorem.
Theorem 5.31 (Taylor’s Theorem). Suppose f is defined on some interval (α, β) and has derivatives up to order n at all points in (α, β). Suppose also that 0 ∈ (α, β) and x ∈ (α, β). Then
    f(x) = f(0) + x f′(0) + (x²/2!) f″(0) + ··· + (x^{n−1}/(n−1)!) f^{(n−1)}(0) + R_n(x)
where R_n(x) = (x^n/n!) f^{(n)}(ξ) for some ξ between 0 and x.
Remark 5.32. Note that ξ will generally depend on f , x and also n.
Example 5.33. Let f(x) = ln(1 + x) on, say, (−1, 3). The derivatives of f are given by
    f^{(k)}(x) = (−1)^{k+1} (k − 1)! / (1 + x)^k  for k ∈ N.
For any x ∈ (−1, 3), by Taylor’s Theorem (up to remainder order n + 1), we
may say that
ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + · · · + R_{n+1}(x)

where

R_{n+1}(x) = (x^{n+1}/(n+1)!) · (−1)^{n+2} n!/(1 + ξ)^{n+1} = (−1)^{n+2} x^{n+1} / ((n + 1)(1 + ξ)^{n+1})
for some ξ between 0 and x. Now let x = 1. Then f (1) = ln 2 and so
ln 2 − ( 1 − 1/2 + 1/3 − 1/4 + · · · + (−1)^{n+1}/n ) = R_{n+1}(1)

where R_{n+1}(1) = (−1)^{n+2} / ((n + 1)(1 + ξ)^{n+1}) for some 0 < ξ < 1. But |R_{n+1}(1)| < 1/(n + 1), which means that R_{n+1}(1) → 0 as n → ∞. It follows that

ln 2 = 1 − 1/2 + 1/3 − 1/4 + · · · = ∑_{n=1}^∞ (−1)^{n+1}/n.
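As a quick numerical illustration (a sketch added here, not part of the argument above), the following Python snippet computes the partial sums of the alternating series and checks that the error against ln 2 is indeed below the bound 1/(n + 1) coming from R_{n+1}(1).

```python
import math

# Partial sums 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n of the series for ln 2,
# together with the bound |R_{n+1}(1)| < 1/(n+1) from Taylor's Theorem.
for n in (10, 100, 1000):
    s_n = sum((-1) ** (k + 1) / k for k in range(1, n + 1))
    error = abs(math.log(2) - s_n)
    assert error < 1 / (n + 1)
    print(f"n={n:5d}  partial sum={s_n:.6f}  error={error:.2e}  bound={1 / (n + 1):.2e}")
```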
There is a further, more general formulation. For fixed a, let g(s) = f(s + a) and apply Taylor’s Theorem to g(s) to get

g(s) = g(0) + s g′(0) + (1/2) s^2 g″(0) + · · · + (s^{n−1}/(n−1)!) g^(n−1)(0) + (s^n/n!) g^(n)(ξ)
for some ξ between 0 and s. Now, g(s) = f (s + a) and g(0) = f (a).
Furthermore, by the chain rule, we find that g (k) (0) = f (k) (a) and g (n) (ξ) =
f (n) (ξ + a). But if ξ lies between 0 and s then ξ + a lies between a and s + a.
Putting x = s + a, we have s = x − a and so η = ξ + a lies between a and x.
We arrive at the following version of Taylor’s Theorem.
Theorem 5.34 (Taylor’s Theorem for f about a). Suppose f is defined on
some interval (α, β) and has derivatives up to order n at all points in (α, β).
Suppose also that a ∈ (α, β) and x ∈ (α, β). Then
f(x) = f(a) + (x − a) f′(a) + ((x − a)^2/2!) f″(a) + · · · + ((x − a)^{n−1}/(n−1)!) f^(n−1)(a) + R_n(x)

where R_n(x) = ((x − a)^n/n!) f^(n)(η) for some η between a and x.
Chapter 6
Power Series
Definition 6.1. A series of the form ∑_{n=0}^∞ a_n (x − α)^n, where the a_n are constants, is called a power series (about x = α).
We notice immediately that such a power series always converges for
x = α (in this case, all terms, except possibly for the a0 term, are zero).
What can be said about the convergence of power series? The following
results explain the situation. By setting w = x − α, it is often sufficient to
consider the case α = 0, so that the powers are simply powers of x and we
will usually do this.
Proposition 6.2. Suppose that the power series ∑_{n=0}^∞ a_n x^n converges for some value x = x_0 with x_0 ≠ 0. Then it converges absolutely for every x satisfying |x| < |x_0|.
Proof. Let S_n(x) = ∑_{k=0}^n a_k x^k. By hypothesis, (S_n(x_0))_{n∈N∪{0}} converges. In particular, (a_k x_0^k) converges (to zero) and so is a bounded sequence; that is, there is some M > 0 such that |a_k x_0^k| < M for all k.
We wish to show that ∑_{k=0}^n |a_k x^k| converges for every x with |x| < |x_0|. Suppose, then, that x obeys |x| < |x_0| and set ρ = |x/x_0|. Evidently, 0 ≤ ρ < 1 and so ∑_{k=0}^∞ ρ^k converges. But then

|a_k x^k| = |a_k x_0^k| |x/x_0|^k ≤ M ρ^k

and so ∑_{k=0}^n |a_k x^k| converges by the Comparison Test.
Radius of Convergence of a Power Series
Consider a given power series ∑_{n=0}^∞ a_n x^n and let

J = { x ∈ R : ∑_{n=0}^∞ a_n x^n converges }.
What can be said about J? Certainly, 0 ∈ J and it could happen that this is the only element of J. For example, if a_n = n^n, then a_n x^n = (nx)^n and so no matter how small x is, eventually |nx| > 1 provided x ≠ 0. This means
that for any given x ≠ 0, it is false that a_n x^n → 0 as n → ∞ and so the power series cannot converge. In this case J = { 0 }.
Suppose that x_0 ∈ J, so that ∑_{n=0}^∞ a_n x_0^n is convergent. Then we know that ∑_{n=0}^∞ a_n x^n also converges (absolutely) for every x obeying |x| < |x_0|.
In other words, if x0 ∈ J, then every point in the interval (− |x0 | , |x0 |) also
belongs to J. What does this mean for J? There are 3 distinct (mutually
exclusive) possibilities.
(i) J = { 0 }.
(ii) J is bounded but there is some t ≠ 0 with t ∈ J
(that is, J ≠ { 0 } but is bounded).
(iii) J is unbounded.
We can immediately deduce that if J is not bounded, case (iii), then it must
be the whole of R. Indeed, to say that J is not bounded is to say that for
any r > 0, there is some x ∈ J with |x| > r. Hence [−r, r] ⊆ J for all r > 0
and so J = R.
Now consider case (ii) and let

A = { r > 0 : ∑_{n=0}^∞ a_n x^n converges for x ∈ (−r, r) }.

Evidently, if t ∈ J, then |t| ∈ A and so A is bounded because J is. Let R = lub A. Then R > 0 otherwise we are in case (i).
Suppose 0 < ρ < R. Then, by definition of lub, there is r ∈ A such that ρ < r ≤ R. But then the series ∑_{n=0}^∞ a_n x^n converges (absolutely) for x ∈ (−r, r) and, in particular, for x with |x| = ρ.
Next, suppose that x ∈ R with |x| = ρ > R. If ∑_{n=0}^∞ a_n x^n were to converge, then we could deduce that (−ρ, ρ) ⊆ J which would mean that ρ ∈ A. This contradicts the fact that R is an upper bound for A and so ∑_{n=0}^∞ a_n x^n cannot converge for any such x.
Case (ii) means then that there is some R > 0 such that ∑_{n=0}^∞ a_n x^n
converges (absolutely) for all x with |x| < R but diverges for any x with
|x| > R. The behaviour of the power series when |x| = R (i.e., x = ±R)
requires separate extra discussion and will depend on the particular power
series. Anything is possible.
This discussion is summarized in the following very important theorem.
Theorem 6.3 (Radius of Convergence Theorem for Power Series). For any given power series ∑_{n=0}^∞ a_n (x − α)^n, exactly one of the following three possibilities applies.
(i) ∑_{n=0}^∞ a_n (x − α)^n converges only for x = α.
(ii) There is R > 0 such that ∑_{n=0}^∞ a_n (x − α)^n converges (absolutely) for all |x − α| < R but diverges for any x with |x − α| > R.
(iii) ∑_{n=0}^∞ a_n (x − α)^n converges (absolutely) for all x.
Definition 6.4. The value R above is called the radius of convergence of the
power series. In case (iii), one says that the series has an infinite radius of
convergence.
Examples 6.5.
1. Consider ∑_{n=0}^∞ x^n. This series converges if |x| < 1 (by the Ratio Test) and otherwise diverges, so R = 1. Note that the series diverges at both of the boundary values x = ±1.
2. Consider

∑_{n=0}^∞ a_n x^n = 1 + x + x^2/2 + x^3/3 + · · ·

The series converges if |x| < 1 (by Comparison with 1 + x + x^2 + · · ·). If x = 1, then it becomes 1 + 1 + 1/2 + 1/3 + · · · which we know diverges. It follows that it cannot converge for any x with |x| > 1. When x = −1, it becomes 1 − 1 + 1/2 − 1/3 + · · · which converges. So 1 + x + x^2/2 + x^3/3 + · · · converges at x = −1 but diverges at x = 1.
Replacing x by −x, we see that the series

1 − x + x^2/2 − x^3/3 + x^4/4 − x^5/5 + · · ·

converges for |x| < 1 and for x = 1 but diverges when x = −1.
3. Formally adding together the two series above suggests the power series

2 + 2x^2/2 + 2x^4/4 + 2x^6/6 + · · · = 1 + 1 + x^2 + x^4/2 + x^6/3 + · · ·

which converges for |x| < 1 = R but diverges when x = ±1.
4. The series

1 − x^2/2 + x^6/3 − x^8/4 + · · ·

converges for |x| < 1 = R and also converges for both x = ±1.
5. The series

∑_{n=0}^∞ x^n/n! = 1 + x + x^2/2! + x^3/3! + · · ·

converges absolutely for all x ∈ R, by the Ratio Test.
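The different behaviours in Examples 6.5 can be glimpsed numerically. The short Python sketch below (an illustration only; the helper names are ad hoc) watches the partial sums of the geometric series ∑ x^n, whose radius of convergence is 1, and of ∑ x^n/n!, whose radius of convergence is infinite.

```python
import math

# A rough look at two of the series in Examples 6.5: the geometric series
# sum x^n (radius of convergence 1) and sum x^n/n! (infinite radius).
def geometric_partial(x, terms):
    return sum(x ** n for n in range(terms))

def exp_series_partial(x, terms):
    return sum(x ** n / math.factorial(n) for n in range(terms))

for x in (0.5, 0.9, 1.5):
    print(f"x={x}:  sum x^n with 50 terms = {geometric_partial(x, 50):.6g}, "
          f"with 100 terms = {geometric_partial(x, 100):.6g}")

for x in (0.5, 5.0, 20.0):
    print(f"x={x}:  sum x^n/n! with 100 terms = {exp_series_partial(x, 100):.6g}, "
          f"exp(x) = {math.exp(x):.6g}")
```

For |x| < 1 the geometric partial sums settle down as more terms are added, while for x = 1.5 they blow up; the second series settles down at every sample point.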
If the power series ∑_{n=0}^∞ a_n x^n is differentiated term by term, then the resulting power series is ∑_{n=1}^∞ n a_n x^{n−1}. This is called the associated derived series. The next theorem tells us that this makes sense.
Theorem 6.6. Suppose that ∑_{n=0}^∞ a_n x^n has radius of convergence R > 0. Then the series ∑_{n=1}^∞ n a_n x^{n−1} also has radius of convergence equal to R. (The possibility of an infinite radius of convergence is included.)
Proof. Suppose that 0 < |u| < R. Let r > 0 obey 0 < |u| < r < R. Then ∑_{n=0}^∞ |a_n| r^n converges. Since n^{1/n} → 1 as n → ∞, it follows that there is some N ∈ N such that n^{1/n} < r/|u| for all n > N. Therefore

n |a_n| |u|^n = |a_n| (n^{1/n} |u|)^n < |a_n| r^n

for all n > N. By Comparison, it follows that

∑_{n=1}^∞ n a_n u^{n−1} = (1/u) ∑_{n=1}^∞ n a_n u^n

converges absolutely. It follows that the power series ∑_{n=1}^∞ n a_n x^{n−1} has radius of convergence at least equal to R.
On the other hand, if the derived series ∑_{n=1}^∞ n a_n x^{n−1} converges absolutely, then the inequality

|a_n| |x|^n ≤ |x| n |a_n| |x|^{n−1}

for n ≥ 1 implies that ∑_{n=0}^∞ a_n x^n converges absolutely, by Comparison. The result follows.
Remark 6.7. By applying the theorem once again, we see that the power series ∑_{n=2}^∞ n(n − 1) a_n x^{n−2} also has radius of convergence equal to R. Of course, we can now apply the theorem again . . .
The big question is whether the derived series is indeed the derivative of
the original power series. We shall now show that this is true.
We recall that Taylor’s Theorem, with 2nd order remainder for a function f about x_0, gives

f(x) = f(x_0) + (x − x_0) f′(x_0) + (f^(2)(c)/2!) (x − x_0)^2

for some c between x and x_0. Setting f(x) = x^k gives the equality

x^k − x_0^k = k (x − x_0) x_0^{k−1} + (1/2) k(k − 1) c_k^{k−2} (x − x_0)^2
for some c_k between x_0 and x. Note that c_k may depend on k (as well as x_0 and x). If x = x_0 + h, then this becomes

(x_0 + h)^k − x_0^k = h k x_0^{k−1} + (1/2) k(k − 1) c_k^{k−2} h^2        (∗)

for some c_k between x_0 and x_0 + h.
We can use this to find the derivative of a power series inside its disc of convergence. Indeed, suppose that the power series f(x) = ∑_{n=0}^∞ a_n x^n has radius of convergence R > 0. Let |x_0| < R be given and let r > 0 obey 0 < |x_0| < r < R. Let h ≠ 0 be so small that |x_0| + |h| < r. This means that −r < x_0 + h < r so that ∑_{n=0}^∞ a_n (x_0 + h)^n converges (absolutely). Using (∗), we find that

(f(x_0 + h) − f(x_0))/h − ∑_{n=1}^∞ n a_n x_0^{n−1} = (1/2) h ∑_{n=2}^∞ a_n n(n − 1) c_n^{n−2}.

Now c_n is between x_0 and x_0 + h and both of these points lie in the interval (−r, r) and so it follows that c_n ∈ (−r, r), that is |c_n| < r. But then, by Comparison with the series ∑_{n=2}^∞ n(n − 1) |a_n| r^{n−2}, the series on the right hand side is (absolutely) convergent. Letting h → 0 gives the desired result that

f′(x_0) = lim_{h→0} (f(x_0 + h) − f(x_0))/h = ∑_{n=1}^∞ n a_n x_0^{n−1}.
We have proved the following important theorem.
Theorem 6.8 (Differentiation of Power Series). The power series ∑_{n=0}^∞ a_n x^n is differentiable at each point x_0 inside its radius of convergence. Moreover, its derivative is given by the derived series ∑_{n=1}^∞ n a_n x_0^{n−1}.
Example 6.9. We shall show that
ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + · · ·
for any x ∈ (−1, 1). The radius of convergence, R, of the power series on
the right hand side is R = 1.
Let us begin by guessing that ln(1 + x) = a_0 + a_1 x + a_2 x^2 + · · ·. If this is to be true, then putting x = 0, we should have ln 1 = a_0 + 0, that is, a_0 = 0. Differentiating term by term and then setting x = 0, we might guess that (d/dx) ln(1 + x)|_{x=0} = a_1. This gives 1 = a_1. Differentiating twice (term by term) and setting x = 0, we might guess that (d^2/dx^2) ln(1 + x)|_{x=0} = 2a_2, that is, a_2 = −1/2. Repeating this, we guess that a_k = (−1)^{k+1}/k. So much for the guessing, now let us justify our reasoning.
Let g(x) be the power series

g(x) = x − x^2/2 + x^3/3 − x^4/4 + · · · .
We see that this power series converges for x = 1 and so it must converge absolutely for |x| < 1. (This can also be seen directly by the Ratio Test.) The series does not converge when x = −1 and so we deduce that its radius of convergence is R = 1. For any x with |x| < R = 1, the power series can be differentiated and the derivative is that obtained by term by term differentiation. Hence

g′(x) = 1 − x + x^2 − x^3 + · · ·
for any x with |x| < 1. However, we know that

1/(1 + x) = 1 − x + x^2 − x^3 + · · ·

for |x| < 1 and so g′(x) = 1/(1 + x) for x ∈ (−1, 1). But (d/dx) ln(1 + x) = 1/(1 + x) for x ∈ (−1, 1) and so ln(1 + x) − g(x) has zero derivative on (−1, 1).
It follows that ln(1 + x) − g(x) is constant on (−1, 1). Setting x = 0, we see
that this constant must be ln 1 − g(0) = 0 and so ln(1 + x) = g(x) on the
interval (−1, 1), as required.
Note that we have shown that ln(1 + x) = x − x^2/2 + x^3/3 − · · · for
any x ∈ (−1, 1). We have already seen (thanks to Taylor’s Theorem) that
ln 2 = 1 − 12 + 13 − 14 + . . . which means that this expansion is also valid for
x = 1.
When x = −1, the left hand side becomes ln 0, which is not defined and
the right hand side becomes the divergent series −1 − 1/2 − 1/3 − 1/4 − · · · .
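As a sanity check on Theorem 6.8 and Example 6.9 (again a sketch, not part of the notes), one can sum the derived series of g at a point inside the interval of convergence and compare it with 1/(1 + x), and also with a crude difference quotient of the partial sums of g.

```python
# Term-by-term differentiation check for g(x) = x - x^2/2 + x^3/3 - ... :
# the derived series is 1 - x + x^2 - ..., which should equal 1/(1 + x) for |x| < 1.
x = 0.3
N = 200  # number of terms; ample for |x| = 0.3

derived = sum((-1) ** n * x ** n for n in range(N))
print(derived, 1 / (1 + x))                              # these agree closely

# Compare with a crude difference quotient of the partial sums of g.
def g_partial(t, terms=N):
    return sum((-1) ** (n + 1) * t ** n / n for n in range(1, terms + 1))

h = 1e-6
print((g_partial(x + h) - g_partial(x - h)) / (2 * h))   # also close to 1/(1 + x)
```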
Chapter 7
The elementary functions
We have already used the elementary functions (the trigonometric functions,
exponential function and the logarithm) as examples to illustrate various
aspects of the theory. Now is the time to give their formal definitions.
The trigonometric functions sin x and cos x and the exponential function
exp x are defined as follows.
Definition 7.1. For any x ∈ R,

sin x = ∑_{n=0}^∞ (−1)^n x^{2n+1}/(2n + 1)! = x − x^3/3! + x^5/5! − x^7/7! + · · ·

cos x = ∑_{n=0}^∞ (−1)^n x^{2n}/(2n)! = 1 − x^2/2! + x^4/4! − x^6/6! + · · ·

exp x = ∑_{n=0}^∞ x^n/n! = 1 + x + x^2/2! + x^3/3! + · · · .
Each of these power series converges absolutely for all x ∈ R (by the Ratio
Test) so they have an infinite radius of convergence.
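Since these power series are the definitions, it is reassuring to evaluate their truncations numerically. The following Python sketch (an illustration only; 40 terms is an arbitrary but ample choice for the sample points used) compares the truncated series with the library functions.

```python
import math

# Evaluate the defining power series by truncation and compare with the
# library functions; 40 terms is far more than needed at these sample points.
def sin_series(x, terms=40):
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

def cos_series(x, terms=40):
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

def exp_series(x, terms=40):
    return sum(x ** n / math.factorial(n) for n in range(terms))

for x in (0.1, 1.0, 3.0):
    print(x,
          sin_series(x) - math.sin(x),
          cos_series(x) - math.cos(x),
          exp_series(x) - math.exp(x))
```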
Remark 7.2. These are the definitions and so each and every property that
these functions possess must be obtainable from these definitions.
We can see immediately that sin 0 = 0, cos 0 = 1 and exp 0 = 1. We also
note that sin(−x) = − sin x (so sin x is an odd function) and cos(−x) = cos x
(so cos x is an even function). Furthermore, by the basic differentiation of
power series theorem, Theorem 6.8, we see that these functions are differentiable at every x ∈ R with derivatives given by term by term differentiation
so that

(d/dx) sin x = (d/dx) ( x − x^3/3! + x^5/5! − · · · ) = 1 − x^2/2! + x^4/4! − · · · = cos x

(d/dx) cos x = (d/dx) ( 1 − x^2/2! + x^4/4! − · · · ) = −x + x^3/3! − x^5/5! + · · · = − sin x

(d/dx) exp x = (d/dx) ( 1 + x + x^2/2! + x^3/3! + · · · ) = 0 + 1 + x + x^2/2! + · · · = exp x.
We shall establish further familiar properties.
Theorem 7.3. For any x ∈ R, sin^2 x + cos^2 x = 1.
Proof. Let ϕ(x) = sin^2 x + cos^2 x. Then we calculate the derivative

ϕ′(x) = 2 sin x cos x − 2 cos x sin x = 0.

It follows that ϕ(x) is constant on R. In particular,

ϕ(x) = ϕ(0) = sin^2 0 + cos^2 0 = 0 + 1 = 1,

that is, sin^2 x + cos^2 x = 1, as required.
Remark 7.4. Since both terms sin^2 x and cos^2 x are non-negative, we can say
that −1 ≤ sin x ≤ 1 and also −1 ≤ cos x ≤ 1 for all x ∈ R. The functions
sin x and cos x are bounded (by ±1). This is not at all obvious just by
looking at the power series in their definitions.
Theorem 7.5 (Addition Formulae). For any a, b ∈ R, we have
sin(a + b) = sin a cos b + cos a sin b
cos(a + b) = cos a cos b − sin a sin b .
Proof. Let ψ(x) = sin(α − x) cos x + cos(α − x) sin x. Then we see that
ψ′(x) = − cos(α − x) cos x − sin(α − x) sin x + sin(α − x) sin x + cos(α − x) cos x = 0.
It follows that ψ(x) is constant on R and so ψ(x) = ψ(0), that is,
sin(α − x) cos x + cos(α − x) sin x = sin α .
Putting α = a + b and x = b, we obtain the desired formula
sin(a + b) = sin a cos b + cos a sin b .
The other formula can be obtained similarly. Indeed, let
µ(x) = cos(α − x) cos x − sin(α − x) sin x .
Then we find that µ′(x) = 0 so that µ(x) is constant on R. Hence µ(x) =
µ(0) = cos α. Again setting α = a + b and x = b, we find that
cos(a + b) = cos a cos b − sin a sin b
and the proof is complete.
Remark 7.6. The formulae
sin(a − b) = sin a cos b − cos a sin b
cos(a − b) = cos a cos b + sin a sin b
follow by replacing b by −b and using the facts that sin(−b) = − sin b whereas
cos(−b) = cos b. Notice further that if we set a = x and b = x in this last
formula, then we get
cos(x − x) = cos^2 x + sin^2 x

that is, we recover the formula sin^2 x + cos^2 x = 1.
The number π
The elementary geometric approach to the trigonometric functions is by
means of triangles and circles. The number π makes its appearance in the
formula relating the circumference and the radius of a circle (or giving the
area A = πr^2 of a circle of radius r). For us here, we must always proceed via
the power series definitions of the trigonometric functions. The identification
of π begins with some preliminary properties of the functions sin x and cos x.
Lemma 7.7.
(i) sin x > 0 for all x ∈ (0, 2) .
(ii) cos 2 < 0 .
Proof. (i) Taylor’s Theorem (up to order 2) says that

f(x) = f(0) + x f′(0) + (x^2/2!) f″(c)

for some c between 0 and x. With f(x) = sin x, we obtain

sin x = 0 + x − (x^2/2) sin(c) ≥ x − x^2/2

for some c between 0 and x. We have used the facts that sin 0 = 0, cos 0 = 1 and − sin(c) ≥ −1. Hence

sin x ≥ x − (1/2) x^2 = (1/2) x (2 − x) > 0

if 0 < x < 2, as claimed.
(ii) Applying Taylor’s Theorem (up to order 4), we may say that there is some λ between 0 and x such that

cos x = 1 − 0 − x^2/2! + 0 + (x^4/4!) cos λ.

But cos λ ≤ 1 and so

cos x ≤ 1 − x^2/2 + x^4/4!.

Putting x = 2 gives

cos 2 ≤ 1 − 4/2 + 16/24 = −1 + 2/3 = −1/3

which implies that cos 2 ≤ −1/3 < 0, as required.
Now we come to the crucial part.
Theorem 7.8. There is a unique 0 < µ < 2 such that cos µ = 0.
Proof. We know that cos 0 = 1 and we have just seen that cos 2 < 0. It
follows by the Intermediate Value Theorem (applied to the function cos x
on the interval [0, 2]) that there is some µ ∈ (0, 2) such that cos µ = 0.
We must now show that there is only one such µ. To see this, suppose that cos β = 0 for some β ∈ (0, 2) with β ≠ µ. Then, by Rolle’s Theorem, there is some ξ between µ and β such that (d/dx) cos x |_{x=ξ} = 0, that is, sin ξ = 0. But we have shown that sin x > 0 on (0, 2). This gives a contradiction and so we conclude that there can be no such β. In other words, there is a unique µ with 0 < µ < 2 such that cos µ = 0.
Definition 7.9. The real number π is defined to be π = 2µ, where µ is the
unique solution in (0, 2) to cos µ = 0.
All we can say at the moment is that 0 < π < 4. It is known that π is
irrational and its decimal expansion is known to some two million decimal
places. Curiously enough, it seems that each of the digits 0, 1, . . . , 9 appears
with about the same frequency in this expansion.
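Definition 7.9 can be made concrete numerically. The sketch below (not part of the notes) applies bisection to a truncated cosine series on [0, 2], using only the facts established above, namely cos 0 = 1 > 0 and cos 2 < 0; doubling the resulting root µ gives a value agreeing with the familiar π.

```python
import math

def cos_series(x, terms=30):
    # Truncated defining power series of cos; 30 terms is ample on [0, 2].
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n) for n in range(terms))

# Bisection on [0, 2]: cos_series(0) = 1 > 0 while cos_series(2) < 0.
lo, hi = 0.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cos_series(mid) > 0:
        lo = mid
    else:
        hi = mid

mu = (lo + hi) / 2
print(2 * mu, math.pi)   # both approximately 3.141592653589793
```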
Theorem 7.10. The number π is such that sin(π/2) = 1, cos(2π) = 1 and sin(2π) = 0. Furthermore, for any x ∈ R

sin(x + 2π) = sin x        cos(x + 2π) = cos x.

Proof. By its very definition, we know that cos(π/2) = 0. But since we have the identity sin^2 x + cos^2 x = 1, it follows that sin(π/2) = ±1. However, we have seen that sin x > 0 on (0, 2) and so it follows that sin(π/2) = 1.
By the addition formulae, sin π = 2 sin(π/2) cos(π/2) = 0. This then implies that sin(2π) = 2 sin π cos π = 0. To show that cos(2π) = 1, we use the addition formula again to find that

cos(2x) = cos^2 x − sin^2 x = 1 − 2 sin^2 x.
Setting x = π, we get cos(2π) = 1 because sin π = 0.
Finally, using the above results together with the addition formulae, we
calculate
sin(x + 2π) = sin x cos(2π) + cos x sin(2π) = sin x
cos(x + 2π) = cos x cos(2π) − sin x sin(2π) = cos x
for any x ∈ R and the proof is complete.
Properties of the exponential function
We now turn to a discussion of the exponential function.
Proposition 7.11. The function exp x enjoys the following properties.
(i) (d/dx) exp x = exp x for all x ∈ R.
(ii) exp 0 = 1.
(iii) For any a, b ∈ R, exp(a + b) = exp a exp b.
(iv) exp(−x) = 1/exp x for all x ∈ R.
(v) exp x > 0 for all x ∈ R.
Proof. (i) As already noted, this follows because the derivative of the power
series is that power series got by differentiating term by term.
(ii) Putting x = 0 in the power series gives exp 0 = 1.
(iii) Fix u ∈ R and set ϕ(x) = exp x exp(u − x). Then
ϕ′(x) = exp x exp(u − x) − exp x exp(u − x) = 0
for all x ∈ R. It follows that ϕ(x) is constant, so that ϕ(x) = ϕ(0). But
ϕ(0) = exp u and so ϕ(x) = exp u. Letting u = a + b and x = a, we find
that exp a exp b = exp(a + b), as required.
(iv) From the above, we find that exp x exp(−x) = exp 0 = 1 and so
exp(−x) = 1/ exp x.
(v) Since exp x exp(−x) = 1 it follows that exp x ≠ 0 for any x ∈ R. However, it is clear from the power series that exp x > 0 if x > 0 and so the formula exp x exp(−x) = 1 implies that exp(−x) > 0 too.
(Alternatively, one can note that exp x = exp(x/2) exp(x/2) = (exp(x/2))^2 which is positive.)
Because of the property exp(a + b) = exp a exp b, one often writes e^x for exp x, so this reads e^{a+b} = e^a e^b. However, this notation needs some further discussion. The point is that the symbols e^2, say, now appear to have two interpretations. Firstly as exp(2) and secondly as the square of the number e. The real number e is defined as exp(1) and we see that

e^2 = exp(1)^2 = exp(1) exp(1) = exp(1 + 1) = exp(2)

so the two interpretations actually agree. What about, say, e^{1/2}? This is interpreted as either exp(1/2) or as the square root of e. But

exp(1/2) exp(1/2) = exp(1/2 + 1/2) = exp(1) = e

so exp(1/2) is the square root of e. This extends to any rational power.
Theorem 7.12. For any r ∈ Q, exp(r) = e^r, where e = exp(1).
Proof. If r = 0, then exp(0) = 1 = e^0, by definition of the power e^0. Now suppose that r > 0 and write r = p/q for p and q ∈ N. We have

(exp r)^q = exp r × · · · × exp r  (q factors) = exp(rq) = exp p = exp(1 + 1 + · · · + 1)  (p terms) = exp 1 × · · · × exp 1  (p factors) = e^p

and so exp(r) = e^{p/q} = e^r.
Now let r = −s where s ∈ Q and s > 0. The above discussion tells us that exp(s) = e^s so that

exp(r) = exp(−s) = 1/exp(s) = 1/e^s = e^{−s} = e^r

and we are done.
Remark 7.13. This result clarifies the symbolism e^x. This can always be considered as shorthand notation for exp x, but if x is rational, then it can also mean the x-th power of the real number e. In this (rational) case, the
values are the same, as the theorem shows, so there is no ambiguity.
Remark 7.14. We have seen that the power series expression for exp x tells us that (d/dx) exp x = exp x and exp 0 = 1. These properties completely determine exp x. In fact, if ψ(x) is the power series ψ(x) = a_0 + a_1 x + a_2 x^2 + · · ·, then the requirement that ψ′(x) = ψ(x) demands that

a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + · · · = a_0 + a_1 x + a_2 x^2 + · · ·

This holds if k a_k = a_{k−1} for all k = 1, 2, . . . , which means that a_1 = a_0, a_2 = a_1/2 = a_0/2, . . . , a_k = a_{k−1}/k = a_{k−2}/(k(k − 1)) = · · · = a_0/k!. If ψ(0) = 1, then a_0 = 1 and a_k = 1/k! so we find that ψ(x) = exp x.
This holds without assuming that we begin with a power series. Indeed, suppose that ϕ(x) is differentiable on R and that ϕ′(x) = ϕ(x) and ϕ(0) = 1. We shall show that ϕ(x) = exp x.
Let g(x) = ϕ(x) exp(−x). Then g is differentiable on R and

g′(x) = ϕ′(x) exp(−x) − ϕ(x) exp(−x) = 0

since ϕ′(x) = ϕ(x). Fix u ∈ R and let (a, b) be any interval in R such that both u ∈ (a, b) and 0 ∈ (a, b). Then g′ is zero on the interval (a, b) and so g is constant there. In particular, g(u) = g(0). However, by construction, g(0) = ϕ(0) exp 0 = 1 and so g(u) = g(0) = 1. Hence ϕ(u) exp(−u) = 1 and we finally arrive at the required result that ϕ(u) = exp u.
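The recursion k a_k = a_{k−1} in Remark 7.14 is easy to run exactly. The sketch below (an illustration only, not part of the notes) uses exact rational arithmetic to confirm that, starting from a_0 = 1, the recursion reproduces the coefficients 1/k! of exp.

```python
from fractions import Fraction
from math import factorial

# psi'(x) = psi(x) forces k*a_k = a_{k-1}; starting from a_0 = psi(0) = 1
# this recursion reproduces the coefficients 1/k! of the exponential series.
a = Fraction(1)                       # a_0 = 1
for k in range(1, 11):
    a = a / k                         # a_k = a_{k-1} / k
    assert a == Fraction(1, factorial(k))
    print(k, a)
```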
The function exp x has further interesting properties.
Theorem 7.15. The function exp x obeys the following.
(i) The map x ↦ exp x is one-one from R onto (0, ∞). In fact, exp x is strictly increasing on R.
(ii) For any k ∈ N, x^k/exp x → 0 as x → ∞.
Proof. (i) From the power series expression for exp x, we see that if x > 0
then exp x > 1 + x > 1. Suppose that a, b ∈ R and that a < b. Then
b − a > 0 so that exp(b − a) > 1. Multiplying by exp a (which is positive),
we see that exp(b − a) exp a > exp a, that is exp b > exp a. It follows that
exp x is strictly increasing and so exp a = exp b is only possible if a = b, that
is, x ↦ exp x is one-one.
(Alternatively, the Mean Value Theorem tells us that (exp b − exp a)/(b − a)
is equal to the derivative of exp x evaluated at some point between a and b.
This derivative is always positive and so (exp b − exp a) and (b − a) always
have the same sign. In particular, exp a = exp b only if a = b.)
We still have to show that exp x maps R onto (0, ∞). To see this, let
µ ∈ (0, ∞). We must show that there is some u ∈ R such that exp u = µ. Let
α > µ and let β > 1/µ. Then exp α > 1 + α > µ and exp β > 1 + β > 1/µ,
so that exp(−β) = 1/ exp β < µ. So we have
exp(−β) < µ < exp α .
Now exp x is continuous on R and so in particular is continuous on the
closed interval [−β, α]. By the Intermediate Value Theorem, there is some
u between −β and α such that exp u = µ, as required.
(ii) For x > 0, the power series expression for exp x tells us that

exp x = ∑_{n=0}^∞ x^n/n! > x^{k+1}/(k + 1)!.

Hence 0 < x^k/exp x < (k + 1)!/x for x > 0 and so x^k/exp x → 0 as x → ∞.
Remark 7.16. This last result can be written as x^k exp(−x) → 0 as x → ∞ or as (exp x)/x^k → ∞ as x → ∞, and it implies that x^k exp x → 0 as x → −∞.
It is clear from the power series definition (with x = 1) that e > 1 + 1 = 2. We can easily obtain an upper bound for e via Taylor’s Theorem. Indeed, exp^(k)(x) = exp(x) for any k ∈ N and exp(0) = 1, so by Taylor’s Theorem up to remainder of order 3, we have

exp(x) = 1 + x + x^2/2! + (x^3/3!) exp(c_x)

for some c_x between 0 and x. If x = 1, then c_1 < 1 and so e^{c_1} < e and we get

e − (1 + 1 + 1/2) = (1/6) e^{c_1} < (1/6) e,

that is, e < 3.
We can profitably pursue this method of estimation. Taylor’s Theorem up to remainder of order m + 1 gives

e^x = 1 + x + x^2/2! + · · · + x^m/m! + R_{m+1}

where R_{m+1} = (x^{m+1}/(m+1)!) e^{c_x}. Now setting x = 1 and noting the inequalities 0 < e^{c_1} < e < 3, we see that 0 < R_{m+1} < 3/(m+1)!.
However, if m ≥ 3, then 3/(m+1)! < 1/m! and we deduce that

0 < e − ( 1 + 1 + 1/2! + · · · + 1/m! ) < 1/m!.        (∗)
This can be rewritten as

1 + 1 + 1/2! + · · · + 1/m! < e < 1 + 1 + 1/2! + · · · + 1/m! + 1/m!

for any m ≥ 3. These estimates allow us to prove the following interesting fact.
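These two-sided estimates are easy to check numerically; the following sketch (an illustration, not part of the notes) verifies s_m < e < s_m + 1/m! for a few values of m ≥ 3, where s_m = 1 + 1 + 1/2! + · · · + 1/m!.

```python
import math

# s_m = 1 + 1 + 1/2! + ... + 1/m!  and the estimate  s_m < e < s_m + 1/m!  for m >= 3.
for m in (3, 5, 10, 15):
    s_m = sum(1 / math.factorial(k) for k in range(m + 1))
    upper = s_m + 1 / math.factorial(m)
    assert s_m < math.e < upper
    print(f"m={m:2d}:  {s_m:.12f} < e < {upper:.12f}")
```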
Theorem 7.17. The real number e is irrational.
Proof. The proof is by contradiction. Suppose it were the case that e ∈ Q
and let e = p/q where p, q ∈ N. Let m ∈ N obey m > q + 3 (so that m is
greater than both q and 3). Using the estimate (∗) and multiplying through
by m!, we see that

0 < m! ( p/q − (1 + 1 + 1/2! + · · · + 1/m!) ) < 1.

However,

m! ( p/q − (1 + 1 + 1/2! + · · · + 1/m!) ) = m! p/q − ( m! + m! + m!/2! + · · · + m!/m! )

which is an integer because each term is an integer (note that m! p/q ∈ Z since m > q, so q divides m!). This gives us our
contradiction since there is no integer lying strictly between 0 and 1. The
proof is complete.
In fact, we can prove more, namely that all powers and roots of powers are irrational. That is, e^{p/q} is irrational for any p, q ∈ Z with p ≠ 0 and q ≠ 0. (If q = 0, then p/q does not make any sense. If q ≠ 0 but p = 0, then we have e^{p/q} = e^0 = 1 which is rational.) In order to show this, we need a few preliminary results.
Lemma 7.18. For given n ∈ N, let f(x) = x^n (1 − x)^n / n!. Then
(i) f(x) = (1/n!) ∑_{m=n}^{2n} c_m x^m, with c_m ∈ Z.
(ii) If 0 < x < 1, then 0 < f(x) < 1/n!.
(iii) f^(k)(0) ∈ Z and f^(k)(1) ∈ Z for all k ≥ 0.
Proof. (i) The Binomial Theorem tells us that (1 − x)^n can be written as

(1 − x)^n = a_0 + a_1 x + a_2 x^2 + · · · + a_n x^n

for suitable integers a_0, a_1, . . . , a_n. In fact, a_0 = 1, a_1 = −n, a_2 = n(n − 1)/2 and so on. In general, a_m = (−1)^m n!/((n − m)! m!).
Alternatively, this can be proved by induction. Indeed, let P (n) be the
statement that (1 − x)n = a0 + a1 x + a2 x2 + · · · + an xn for coefficients
a0 , a1 , . . . , an ∈ Z. Then with n = 1, we have (1 − x)1 = 1 − x and we see
that P (1) is true.
Now suppose that n ∈ N and P (n) is true. Then
(1 − x)n+1 = (1 − x) (1 − x)n = (1 − x)(a0 + a1 x + a2 x2 + · · · + an xn )
for coefficients a0 , a1 , . . . , an ∈ Z. Expanding the right hand side gives
(1 − x)n+1 = a0 + a1 x + a2 x2 + · · · + an xn
− x(a0 + a1 x + a2 x2 + · · · + an xn )
= a0 + (a1 − a0 )x + (a2 − a1 )x2 + · · ·
· · · + (an − an−1 )xn − an xn+1 .
Evidently, the coefficients all belong to Z and so P (n + 1) is true. By
induction, it follows that P (n) is true for all n ∈ N.
(ii) If 0 < x < 1, then also 0 < (1−x) < 1 and therefore both 0 < xn < 1
and 0 < (1 − x)n < 1. Hence 0 < f (x) < 1/n! .
(iii) We first note that differentiating k times the power x^m and then setting x = 0 gives

d^k x^m/dx^k |_{x=0} = 0 if m ≠ k, and = k! if m = k.

It follows directly from (i) that

f^(k)(0) = k! c_k/n! if n ≤ k ≤ 2n, and f^(k)(0) = 0 otherwise.

Furthermore, if k ≥ n, then k!/n! ∈ Z and so we see that f^(k)(0) ∈ Z for any k ≥ 0.
Next, we use the relation f(x) = f(1 − x) together with the chain rule to find f^(k)(1). Let u = 1 − x. Then du/dx = −1 so that

d f(1 − x)/dx = (df(u)/du)(du/dx) = (df(u)/du) × (−1).

Differentiating k times gives

d^k f(1 − x)/dx^k = (d^k f(u)/du^k) × (−1)^k.

Hence, using the equality f(x) = f(1 − x) = f(u), we get

f^(k)(x) = d^k f(x)/dx^k = d^k f(1 − x)/dx^k = (−1)^k d^k f(u)/du^k

for all x. Putting x = 1 gives u = 0 and so f^(k)(1) = (−1)^k f^(k)(0). However, we know that f^(k)(0) ∈ Z and so (−1)^k f^(k)(0) ∈ Z. That is, f^(k)(1) ∈ Z, as claimed.
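Lemma 7.18(iii) can be checked exactly for a small n with rational arithmetic. The sketch below (an illustration, not part of the proof) expands x^n(1 − x)^n/n! using the binomial coefficients from part (i) and confirms that every derivative at 0 is an integer; by the relation f^(k)(1) = (−1)^k f^(k)(0) the derivatives at 1 are then integers too.

```python
from fractions import Fraction
from math import comb, factorial

# For f(x) = x^n (1 - x)^n / n!  the coefficient of x^(n+m) is (-1)^m C(n, m) / n!.
# Check that f^(k)(0) = k! * (coefficient of x^k) is an integer for every k.
n = 4
coeffs = {n + m: Fraction((-1) ** m * comb(n, m), factorial(n)) for m in range(n + 1)}

for k in range(2 * n + 1):
    deriv_at_0 = factorial(k) * coeffs.get(k, Fraction(0))
    assert deriv_at_0.denominator == 1       # i.e. an integer
    print(k, deriv_at_0)
```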
We are now in a position to prove the result we are interested in, concerning the irrationality of various powers of e.
Theorem 7.19. e^r is irrational for every r ∈ Q \ { 0 }.
Proof. We first show that e^s ∉ Q for every s ∈ N. The proof is by contradiction, so suppose the contrary, namely that there is some s ∈ N such that e^s ∈ Q. Then we can write e^s = p/q for p, q ∈ N.
Choose and fix n ∈ N obeying the inequality n! > p s^{2n+1} and let f(x) = x^n (1 − x)^n/n! as in the previous lemma, Lemma 7.18. We introduce the following function F(x) defined to be

F(x) = s^{2n} f(x) − s^{2n−1} f′(x) + s^{2n−2} f″(x) − s^{2n−3} f^(3)(x) + · · · − s f^(2n−1)(x) + f^(2n)(x).
Now, by part (i) of Lemma 7.18, f(x) has degree 2n and so f^(k)(x) = 0 for all k > 2n. Hence, differentiating the formula above, we find that

F′(x) = s^{2n} f′(x) − s^{2n−1} f″(x) + s^{2n−2} f^(3)(x) − s^{2n−3} f^(4)(x) + · · · − s f^(2n)(x) + f^(2n+1)(x),

where the last term f^(2n+1)(x) = 0. Hence (after many cancellations)

F′(x) + s F(x) = s^{2n} f′(x) − s^{2n−1} f″(x) + s^{2n−2} f^(3)(x) − s^{2n−3} f^(4)(x) + · · · − s f^(2n)(x)
  + s^{2n+1} f(x) − s^{2n} f′(x) + s^{2n−1} f″(x) − s^{2n−2} f^(3)(x) + · · · − s^2 f^(2n−1)(x) + s f^(2n)(x)
  = s^{2n+1} f(x).

It follows that

(d/dx) ( e^{sx} F(x) ) = s e^{sx} F(x) + e^{sx} F′(x) = e^{sx} ( s F(x) + F′(x) ) = s^{2n+1} e^{sx} f(x).

Hence

I = ∫_0^1 s^{2n+1} e^{sx} f(x) dx = [ e^{sx} F(x) ]_0^1 = e^s F(1) − F(0) = (p/q) F(1) − F(0)

since e^s = p/q. Therefore

q I = p F(1) − q F(0).
Now, s ∈ N and by Lemma 7.18 we know that f^(k)(0) ∈ Z and f^(k)(1) ∈ Z for all k ≥ 0. It follows from the expression for F(x) that both F(0) ∈ Z and F(1) ∈ Z. Hence q I ∈ Z. Furthermore, the integrand in the formula for I is positive on (0, 1) and so I > 0. It follows that q I ∈ N.
Now, by Lemma 7.18 again, 0 < f(x) < 1/n! for 0 < x < 1 and e^{sx} < e^s for x < 1 and so

0 < q I = q ∫_0^1 s^{2n+1} e^{sx} f(x) dx < q s^{2n+1} ∫_0^1 e^{sx} (1/n!) dx < q s^{2n+1} e^s (1/n!) = p s^{2n+1}/n! < 1

by our very choice of n at the start. However, there are no integers strictly between 0 and 1 so we have finally arrived at our contradiction and we conclude that e^s is irrational for every s ∈ N.
Let m ∈ N. Since e^m = 1/e^{−m} and we have just shown that e^m ∉ Q, it follows that e^{−m} ∉ Q and so we may conclude that e^s ∉ Q for all s ∈ Z \ { 0 }.
Now let r ∈ Q \ { 0 }. Write r = m/n for m ∈ Z \ { 0 } and n ∈ N. If e^r were rational, it would follow that e^m = (e^{m/n})^n = (e^r)^n is also rational. But we know that e^m is irrational for every m ∈ Z \ { 0 } and so it follows that e^r is irrational and the proof is complete.
Compound Interest
If one pound is invested for one year at an annual interest rate of 100r% and compounded at n regular intervals, the compound interest formula states that its value on maturity is (1 + r/n)^n pounds. This value is approximately equal to e^r. We shall see why this is so.
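Before proving this, here is a small numerical illustration (not part of the notes) of the compound interest formula: for a fixed rate r, the maturity values (1 + r/n)^n creep up towards e^r as the compounding frequency n grows.

```python
import math

# Value of one pound after one year at rate 100r%, compounded n times: (1 + r/n)^n.
# As n grows this approaches e^r (Theorem 7.21 below).
r = 0.05
for n in (1, 4, 12, 365, 10_000, 1_000_000):
    print(f"n = {n:>9}:  (1 + r/n)^n = {(1 + r / n) ** n:.10f}")
print(f"               exp(r)      = {math.exp(r):.10f}")
```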
Proposition 7.20. For fixed r > 0, let x_n = (1 + r/n)^n, n ∈ N. Then (x_n) is a bounded increasing sequence and so converges.
Proof. Using the Binomial Theorem, we write

x_n = (1 + r/n)^n = ∑_{k=0}^n (n choose k) (r/n)^k = ∑_{k=0}^n [ n(n − 1) · · · (n − (k − 1)) / k! ] (r/n)^k = ∑_{k=0}^n b_k(n) r^k

where b_k(n) = n(n − 1) · · · (n − (k − 1)) / (k! n^k) = (1/k!) (1 − 1/n)(1 − 2/n) · · · (1 − (k − 1)/n).
Now, as n increases, j/n decreases and so (1 − j/n) increases. In other words, for each fixed k ≤ n, b_k(n) ≤ b_k(n + 1). It follows that

x_{n+1} = ∑_{k=0}^{n+1} b_k(n + 1) r^k = ∑_{k=0}^{n} b_k(n + 1) r^k + b_{n+1}(n + 1) r^{n+1} > ∑_{k=0}^n b_k(n) r^k = x_n

which shows that (x_n) is an increasing sequence. Moreover, it is clear that b_k(n) ≤ 1/k!, so we see that

x_n = ∑_{k=0}^n b_k(n) r^k ≤ ∑_{k=0}^n r^k/k! < e^r
and the proof is complete.
We can now establish the result we are interested in.
Theorem 7.21. For any fixed r > 0, (1 + r/n)^n → e^r as n → ∞.
Proof. Let ε > 0 be given. Using the notation established above, we know that x_n = (1 + r/n)^n → α for some α ∈ R. We must show that α = e^r. Let N_1 ∈ N be such that if n > N_1 then

|x_n − α| < (1/5) ε.

Next, let s_n = ∑_{k=0}^n r^k/k! and let N_2 ∈ N be such that if n > N_2 then

|s_n − e^r| < (1/5) ε.

(We know that s_n → e^r.)
Now we note that for each fixed k, b_k(n) → 1/k! as n → ∞. Fix N > N_1 + N_2 and let N_3 ∈ N be such that N_3 > N and if n > N_3 then

| ∑_{k=0}^N b_k(n) r^k − ∑_{k=0}^N (1/k!) r^k | < (1/5) ε.
For any n > N_3, we have

|x_n − s_n| = | ∑_{k=0}^n b_k(n) r^k − ∑_{k=0}^n (1/k!) r^k |
 = | ∑_{k=0}^N b_k(n) r^k − ∑_{k=0}^N (1/k!) r^k + ∑_{k=N+1}^n b_k(n) r^k − ∑_{k=N+1}^n (1/k!) r^k |
 ≤ | ∑_{k=0}^N b_k(n) r^k − ∑_{k=0}^N (1/k!) r^k | + 2 ∑_{k=N+1}^n (1/k!) r^k
 < (1/5) ε + 2 ∑_{k=N+1}^∞ (1/k!) r^k
 = (1/5) ε + 2 (e^r − s_N)
 < (1/5) ε + (2/5) ε = (3/5) ε.

But then, for n > N_3,

|α − e^r| ≤ |α − x_n| + |x_n − s_n| + |s_n − e^r| < (1/5) ε + (3/5) ε + (1/5) ε = ε

and we conclude that α = e^r.
Corollary 7.22. For any r > 0, (1 − r/n)^n → e^{−r} as n → ∞.
Proof. Let y_n = (1 − r^2/n^2)^n and suppose that n is so large that r/n < 1. For such n, we see that

0 < 1 − y_n = 1 − (1 − r^2/n^2)^n = 1 − ∑_{k=0}^n b_k(n) (−r^2/n)^k = − ∑_{k=1}^n b_k(n) (−r^2/n)^k ≤ ∑_{k=1}^n b_k(n) (r^2/n)^k < e^{r^2/n} − 1.

Now e^{r^2/n} → 1 and so by the Sandwich Principle, we see that y_n → 1 as n → ∞. However, we then find that

(1 − r/n)^n = (1 − r^2/n^2)^n / (1 + r/n)^n → 1/e^r = e^{−r}

as required.
The logarithm
The logarithm is defined via the exponential function. We know that exp x
maps R one-one onto (0, ∞). This means that to each x ∈ (0, ∞) there is
one and only one v ∈ R such that exp v = x.
Definition 7.23. For x ∈ (0, ∞), log x is the value v ∈ R such that exp v = x.
It follows that x ↦ log x maps (0, ∞) onto R.
In other words, log x is defined by the formula e^{log x} = x for x > 0.
Remark 7.24. The notation ln x is also used for the function log x here. The
notation ln emphasizes the fact that this is the “logarithm to base e”, the
so-called “natural” logarithm.
Proposition 7.25. The function log x has the following properties.
(i) log 1 = 0 and log e = 1.
(ii) For any s, t > 0, log(st) = log s + log t.
(iii) For any x > 0, log(1/x) = − log x.
(iv) log x is strictly increasing and log x → ∞ as x → ∞.
(v) (log x)/x → 0 as x → ∞.
Proof. We shall make use of the identity log(e^s) = s for s ∈ R.
(i) We have log 1 = log(e^0) = 0. Also log e = log(e^1) = 1.
(ii) For any s, t > 0, we have

log(s t) = log(e^{log s} e^{log t}) = log(e^{log s + log t}) = log s + log t.

(iii) We have

log(1/x) = log(1/e^{log x}) = log(e^{− log x}) = − log x.

(iv) Suppose that a < b. Then e^{log a} = a < b = e^{log b} and so we have log a < log b because exp x is strictly increasing.
Now let M > 0 be given. Set m = e^M. Then if x > m, it follows that log x > log m, that is, log x > log(e^M) = M.
(v) Let v = log x. Then x = e^v and

(log x)/x = v/e^v.

Now, if x → ∞ then also log x → ∞, that is, v → ∞. However, we already know that v/e^v → 0 as v → ∞ and so (log x)/x → 0 as x → ∞.
The proof is complete.
Theorem 7.26. The function log x is continuous at each point in (0, ∞).
Proof. Let s ∈ (0, ∞) be given and let ε > 0 be given. We know that the function t ↦ e^t is strictly increasing, that is, α < β if and only if e^α < e^β. This means that

α < t < β ⟺ e^α < e^t < e^β.

In particular, if α = log s − ε, t = log x and β = log s + ε, this becomes

log s − ε < log x < log s + ε ⟺ s e^{−ε} < x < s e^{ε}.

Let δ = min{ s e^{ε} − s, s − s e^{−ε} }. Then

|x − s| < δ ⟹ s − δ < x < s + δ ⟹ s e^{−ε} < x < s e^{ε}.

Therefore log s − ε < log x < log s + ε, that is, |log x − log s| < ε, and it follows that log x is continuous at s.
The next theorem tells us what the derivative of log x is.
Theorem 7.27. The function log x is differentiable at every s > 0 and its derivative at s is 1/s. (In other words, (d/dx) log x = 1/x on (0, ∞).)
Proof. Let s ∈ (0, ∞) be given and let h be small but h ≠ 0. We must show that the Newton quotient (1/h)(log(s + h) − log(s)) approaches 1/s as h → 0. To see this, let v = log s and let k = log(s + h) − log s, so that log(s + h) = v + k. The continuity of log x at s implies that log(s + h) → log(s) as h → 0, that is, k → 0 as h → 0. In terms of v and k, we have

h = (s + h) − s = e^{log(s+h)} − e^{log s} = e^{v+k} − e^v = exp(v + k) − exp(v).

Note also that since h ≠ 0, it follows that s + h ≠ s and so log(s + h) ≠ log s
which means that k ≠ 0. Using these remarks, we get

(log(s + h) − log(s))/h = k / (exp(v + k) − exp(v)) = ( (exp(v + k) − exp(v)) / k )^{−1}
 → ( exp′(v) )^{−1}    as h → 0 (since also k → 0)
 = 1/exp(v) = 1/s

as required.
For any positive real number a, we know what the power a^k means for any k ∈ N. We also know what a^{p/q} means for p, q ∈ N: it is the real number whose q-th power is equal to a^p. However, it is not at all clear what a power such as 3^{√2} means. We would like to set up a reasonable definition for powers such as this. We need some preliminary results.
Proposition 7.28. For any a > 0 and m, n ∈ N,

log(a^{m/n}) = (m/n) log(a).
Proof. First note that log(s^k) = k log(s) for any s > 0 and k ∈ N. We shall verify this by induction. For k ∈ N, let P(k) be the statement “log(s^k) = k log(s)”. Evidently, P(1) is true. Using the previous proposition, we see that

log(s^{k+1}) = log(s^k s) = log(s^k) + log(s) = k log(s) + log(s) = (k + 1) log(s)

if P(k) is true. Hence the truth of P(k + 1) follows from that of P(k) and so, by induction, we conclude that P(k) is true for all k ∈ N.
Let t = a^{1/n} so that t^n = a and t^m = a^{m/n}. We have

n log(a^{m/n}) = n log(t^m) = nm log(t) = m log(t^n) = m log(a)

and it follows that log(a^{m/n}) = (m/n) log(a).
From the above, we see that a^{m/n} = exp((m/n) log(a)). Moreover, a^{−m/n} = 1/a^{m/n} = 1/exp((m/n) log(a)) = exp(−(m/n) log(a)). Hence a^r = e^{r log a} for any a > 0 and any r ∈ Q. Now, the left hand side, a^r, makes no sense unless r is rational but the right hand side, namely, e^{r log a} (which is short-hand notation for exp(r log a)), is well-defined for any real number r. This suggests the following definition of the power a^s for any s ∈ R.
Definition 7.29. For a > 0 and s ∈ R, the power a^s is defined to be

a^s = e^{s log a}.
A further remark is in order here. By setting a = e, the real number exp(1), we have a formula for the power e^s. But this is just

e^s = exp(s log e) = exp s

since log e = 1. This is in agreement with our penchant for using the short-hand notation e^s = exp s, so everything works out alright; that is, we can think of the expression e^s as being the power, as defined above, or as an abbreviation for the exponential, exp s. These are the same thing. The next proposition tells us that the expected power laws hold.
Proposition 7.30. For any a ∈ (0, ∞) and any s, t ∈ R, we have

a^{s+t} = a^s a^t    and    (a^s)^t = a^{st}.
Proof. From the definition, we have

a^{s+t} = exp((s + t) log a) = exp(s log a) exp(t log a) = a^s a^t.

Similarly,

(a^s)^t = exp(t log(a^s)) = exp(t log(e^{s log a})) = exp(t s log a) = a^{st},

as required.
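Definition 7.29 and Proposition 7.30 can be exercised numerically; the following sketch (an illustration only) checks that exp(s log a) agrees with Python's built-in power for a > 0 and that the two power laws hold up to rounding error.

```python
import math

# a^s defined as exp(s log a): compare with Python's ** for a > 0, and check
# the power laws a^(s+t) = a^s a^t and (a^s)^t = a^(st) numerically.
def power(a, s):
    return math.exp(s * math.log(a))

a, s, t = 3.0, math.sqrt(2), -0.7
print(power(a, s), a ** s)                          # same up to rounding
print(power(a, s + t), power(a, s) * power(a, t))   # addition law
print(power(power(a, s), t), power(a, s * t))       # iterated power law
```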
Proposition 7.31. For any a ∈ R, the function f(x) = x^a is differentiable on (0, ∞) and

f′(x) = a x^{a−1}.

Proof. From the definition, f(x) = x^a = e^{a log x} and so the standard rules of differentiation imply that

f′(x) = (a/x) e^{a log x} = a x^a/x = a x^{a−1}

as claimed.