Michael Field - Essential Real Analysis (2017, Springer)

Michael Field
Essential Real Analysis
Michael Field
Engineering Mathematics Department
Merchant Venturers School of Engineering
Bristol University
UK
Department of Mathematics
Rice University
Houston, Texas
USA
ISSN 1615-2085
ISSN 2197-4144 (electronic)
Springer Undergraduate Mathematics Series
ISBN 978-3-319-67545-9
ISBN 978-3-319-67546-6 (eBook)
https://doi.org/10.1007/978-3-319-67546-6
Library of Congress Control Number: 2017955015
Mathematics Subject Classification: 26-01, 40-01, 26Axx, 26Bxx (especially 26B05, 26B10), 26Exx
(especially 26E05, 26E10), 33Bxx (especially 33B15), 34A12, 40Axx, 42Axx (especially 42A10),
54Exx (especially 54-01, 54E35)
© Springer International Publishing AG 2017
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This book is an introduction to real analysis: the foundations, the nature of the
subject, and some of the big results. It is based on long personal experience of
teaching undergraduate and graduate level analysis to a diverse range of classes
in England (Warwick), Australia (Sydney) and the United States (Houston) and is
intended to appeal to mathematicians with either a pure or applied focus.
Topics in the book are drawn from seventeenth century to late twentieth century
analysis. While the techniques of analysis naturally form an important part of the
book, the emphasis is on presenting a broad spectrum of some of the powerful and
beautiful results that can be proved using analytic methods. Many of the results
are important in applications—for example, Fourier series and asymptotics—and
should help provide a sound foundation for work in Applied Mathematics.
At the conclusion of the preface, we give a detailed description of the contents
of the book, together with advice for the reader and comments about what is not
included and why. For now, we mention a few of the highlights. We include a
wide range of results on Fourier series. Applications include the infinite product
formula for sin(x), used in the analysis of the Gamma function, and results on
Bernoulli polynomials leading to explicit formulas for the sum of a number of well-known infinite series. We give an introduction to the theory of smooth (infinitely
differentiable, non-analytic) functions with an emphasis on how to construct a
smooth function with specific properties. There is an extensive development of
the theory of metric spaces (including applications of both the Arzelà–Ascoli and
contraction mapping theorems to differential equations). This is followed by a
short chapter on the Hausdorff metric, which includes a pretty application of the
contraction mapping lemma to fractal geometry and iterated function systems. The
final chapter contains a systematic development of the differential calculus on finite-dimensional vector spaces and includes results ranging from the equivalence of
norms on a finite-dimensional vector space, through the implicit function and rank
theorems to a proof of the strong version of the existence and regularity theorem
for differential equations. The guiding principle throughout is that the generality of
concepts and language introduced in the text has to be justified by the application.
We have more to say about this below but, as an example, we emphasize sequential
compactness rather than compactness (open cover definition) simply because it is
difficult to find good applications at this level that need the more abstract definition
of compactness. In this book, abstraction and generality have to be justified by
application and/or transparency. We abstract as much as we can but no more than
we need to.
In the next few paragraphs, I sketch some of the reasons and concerns that led to
the writing of the book. I hope these comments will help both student and instructor
to make the best use of the text.
Analysis is a beautiful, powerful and central part of mathematics. Yet I feel
that the way the subject is sometimes presented in undergraduate courses can be
deceptive, misleading and uninspiring.
Many of the insights, ideas and practices of mathematics are coded into the
language and notation of mathematics. As Alexandre Borovik remarks in his
illuminating book: Mathematics under the Microscope. Notes on Cognitive Aspects
of Mathematical Practice
‘Mathematical languages unstoppably develop towards an ever increasing degree of compression of information’ [4, page 68].
How does one learn a foreign language? It depends on the context. If conversational skills are what is required, then the best way to proceed is to be immersed
in the culture; to be surrounded by people who are native speakers. On the other
hand, if the goal is to read classic Roman or Greek literature, then it is a painstaking
process of learning the language and grammar step-by-step with the eventual aim of
reading Virgil or Sophocles in the original. Immersion in the culture can be effective
and pleasurable whereas learning from the book can be a slow and painful process
(my own experience translating Caesar’s Gallic War III from the Latin).
Spoken languages evolve relatively slowly; even though the seventeenth century
plays of Shakespeare can be difficult to read, a production of a Shakespeare play is
usually easy to understand—though some of the puns and word plays can be missed.
Learning and reading mathematics is a trickier proposition.
In contrast to spoken language, mathematical languages can evolve and change
rapidly and, as Borovik notes, contain in compressed form much information about
the ideas and practice. The modern abstract language of analysis would be hard for
a seventeenth or eighteenth century mathematician to grasp. They might reasonably
ask, ‘why on earth are you going to so much trouble?’ It is not just the
compressed way definitions are handled; there is all the infrastructure of logic
and set theory that we now take for granted but which was originally regarded
as controversial, even at the beginning of the twentieth century. How then does
language mesh with mathematical content in contemporary undergraduate classes
in calculus and analysis? In substance, as opposed to language, much contemporary
undergraduate analysis barely goes beyond eighteenth century mathematics. For
example, the one-variable part of a calculus sequence is largely stuck in the
seventeenth century and multivariable calculus rarely gets far into the nineteenth
century. The mathematical language used in these courses can often be a fractured
mix of the old and new which does not resonate well with the content. It may be
helpful to discuss one example in detail.
The ε, δ-definition of continuity. This is invariably given in calculus texts, usually
as part of a discussion on limits. Subsequently, the topic is largely ignored—with
proofs or arguments deferred as being too difficult. Versions of the ε, δ-definition
of continuity were originally proposed by Bernard Bolzano (1817) and Augustin-Louis Cauchy (about 1820) and then put on a more sound and, to our eyes, familiar
footing by Karl Weierstrass about 20 years later. Bolzano and Cauchy were among
the first mathematicians to stress the importance of rigour in analysis. Even so,
Cauchy’s original use of the continuity definition was incorrect: he argued that a
pointwise limit of a sequence of continuous functions was continuous.¹ Although
questions about the meaning and nature of limits, especially associated to tangent
lines, have played a role in analysis since the time of Leibniz and Newton, the idea
of a continuous function is relatively recent, spurred in part by the early nineteenth
century development of Fourier series. Continuity plays a peripheral role in most
contemporary calculus classes and even in introductory analysis classes it is often
poorly motivated—it is quite possible to complete a mathematics degree and have
little, if any, contact with Fourier series or functions outside of combinations of
polynomials, trigonometric and exponential functions.
The ε, δ-definition of continuity is difficult: a continuous function is not quite
what you think it is (a typical continuous function is nowhere differentiable). As in
much of analysis, the definition is framed in terms of inequality (tricky) rather than
the easier equality seen in elementary algebra. Frankly, the continuity definition is
ugly and uses too many quantifiers (for all x, for all ε > 0, there exists . . . ). In
the twentieth century, a more natural topological definition of continuity appeared
which is based on preservation of structure (open sets). The definition is far reaching
and applies to many areas of mathematics, including algebra, but unfortunately the
definition is even more remote from the content of a standard calculus sequence.
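For reference, the quantifiers in the definition discussed above can be written out explicitly. This is a sketch in our own notation (a function f on an interval I, points x and y), not a quotation from the main text, where the precise formulation may differ. The second statement records the topological version mentioned above.

```latex
% epsilon-delta definition: f : I \to \mathbb{R} is continuous on I if
\[
\forall x \in I \;\; \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall y \in I :
\quad |y - x| < \delta \implies |f(y) - f(x)| < \varepsilon .
\]

% topological definition: preimages of open sets are open
\[
f^{-1}(U) \text{ is open in } I \text{ for every open set } U \subseteq \mathbb{R} .
\]
```

Counting the nested quantifiers (four in the first statement) makes the contrast with the single condition in the second statement visible.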
Can continuity be motivated in a calculus or introductory analysis course? In
particular, are there significant results or examples? One possible approach is to
prove, or at least discuss, the uniqueness and existence of the integral of a continuous
function.
In a calculus class, the word ‘integral’ is, in practice, synonymous with ‘anti-derivative’. It is easy to show rigorously that if the integral of a continuous bounded
function exists, then it must be given by the anti-derivative (this is uniqueness of the
integral). Conversely, the proof of existence of the anti-derivative for a continuous
bounded function is not hard (the proof does not require uniform continuity). While
existence does depend on completeness properties of the real numbers (as is so for
most results in real analysis), it does not depend on general properties of continuous
functions (for example, that a continuous function on a closed interval attains its
bounds). In spite of the simplicity of proving the existence of the integral of a
continuous bounded function, calculus texts persist in giving a development based
on upper and lower Riemann sums (so presenting the theory as an extension of
the method of exhaustion developed by Eudoxus and later Archimedes) and almost
universally claim the proof of existence is too difficult to include. It is not. (We refer
to Chap. 2, Sect. 2.8 for a simple presentation of the existence and uniqueness of
the integral of a continuous bounded function.) The whole point about the integral
of a continuous bounded function is that one does not need to do approximation by
Riemann sums if the function has an anti-derivative. This is part of the magic of the
integral and differential calculus.²
¹ The correct result was obtained later by Weierstrass using the idea of uniform convergence. This,
and all the other results and definitions we mention, can be found in the main body of the text.
In practice, calculus and introductory analysis courses only consider functions
constructed from polynomials, trigonometric, exponential and logarithmic functions. For these functions, it is usually easy to prove boundedness on a closed and
bounded interval. The differential calculus can then be used to find the bounds
and where they are attained. There is no real connection between the abstract
theory of continuous functions on a closed and bounded interval and the functions
considered in a calculus class which are invariably analytic (in particular, infinitely
differentiable). What is interesting and remarkable is the existence theorem that
every continuous function, such as cos(x^2), does have an anti-derivative even though
it cannot always be given in closed form (in terms of combinations of known
functions). These issues are often not discussed in modern textbooks. In summary,
highly sophisticated definitions and language are introduced for the solution of
problems that are never mentioned. It is not surprising that students can be baffled
by calculus classes: there is a disconnect between the material and the language.
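The existence theorem mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the text: it approximates F(x), the integral of cos(t^2) from 0 to x, by midpoint sums and checks with a finite difference that F'(1) is close to cos(1). The names `midpoint_integral` and `F`, the number of subintervals and the step size are illustrative assumptions.

```python
import math


def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F(x):
    """Numerical anti-derivative of cos(t^2) that vanishes at 0."""
    return midpoint_integral(lambda t: math.cos(t * t), 0.0, x)


# A central difference of F at x = 1 should be close to the integrand cos(1^2).
step = 1e-3
derivative_estimate = (F(1.0 + step) - F(1.0 - step)) / (2 * step)
print(derivative_estimate, math.cos(1.0))
```

The sums here only approximate values of F. That F exists at all is the content of the theorem, and no closed form is needed.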
These problems can be reinforced when students take an introductory class
on analysis. These courses are often presented³ as ‘an introduction to proof’ or
‘calculus done properly’. Aside from the poor psychology (‘what you did before
was wrong and so a waste of time’), the premise is wrong: this abstraction is not
needed to do calculus or much of classical analysis (as is testified to by the work of
Euler and other eighteenth century mathematicians).
Amongst mathematicians there are strongly held views about teaching students
how to write proofs. My personal view is that it is inadvisable to overemphasize
the nuances of logic, truth tables, existential quantifiers etc. Often this leads to
overuse of symbols combined with a lack of understanding of the underlying
mathematics. Symbols seem powerful. Even though (perhaps because) I was
brought up on Axiomatic Euclidean Geometry (the mathematical equivalent of
learning Latin), I am sceptical about doing serious mathematics in the axiomatic-didactic style. It confuses the process of doing mathematics with the activity of
writing mathematics. Figuring out a proof is an intuitive process that results from
increasing understanding of the mathematical structure; formalism comes in when
one writes the proof down (in a terse coded form) so that others can understand
and use the result. Introductory analysis presents rich problems that can naturally
lead to the logical style one needs for writing proofs (for example, the proof that a
sequentially continuous function is continuous). An essential part of this process
is developing the skill to construct good examples and counterexamples. In my
view, the way to learn logic and mathematical expression is through application not
through a formal course. Emphasis solely on correctness is antithetical to developing
the intuitive understanding that one needs to do mathematics.
² The interest of the Riemann sum approach to the integral is that the construction works for
bounded functions which have countably many discontinuities. In that context, upper and lower
sums are needed. This was highly relevant for nineteenth century mathematics.
³ In the United States.
There can also be disconnects between language and application in more
advanced undergraduate classes on analysis. Metric spaces are a terrific subject to
learn at any level. There is geometric intuition combined with many powerful results
(notably the contraction mapping lemma) and good applications. The subject also
provides a beautiful and easily accessible abstraction, generalization and clarification of much foundational one variable real analysis. Definitions and results can be
given using elementary sequence-based definitions such as sequential continuity or
sequential compactness. These definitions lead to simple and transparent proofs. In
contrast, if one emphasizes the topological approach, for example the open cover
definition of compactness, proofs are not so transparent and often harder.⁴ Most
function spaces encountered in elementary analysis are separable metric spaces
where sequence-based methods work very well. At undergraduate level, it is not
so easy to give interesting examples of non-separable or non-metrizable spaces.⁵
Quoting Borovik again
‘Always test a mathematical theory on the simplest possible example—and explore the
example to the utmost limits’ [4, Page 3].
As the late Christopher Zeeman said, ‘a good example is worth 10 theorems’.
Counterexamples are important too—as a way of testing limits of the theory (as
well as the need for the theory).⁶ In summary, and paraphrasing Albert Einstein,
‘Everything should be made as simple as possible, but no simpler.’ If there is a good
reason to consider the topology of non-separable or non-metrizable spaces in
the course, then give the topological definition of compactness. Else, keep it simple.
One reason given to stress form over application in a final year analysis course
is that the course should be preparation for graduate classes in analysis. I feel
this approach is mistaken. As a professional mathematician, I should be inspiring
undergraduate students about the nature and power of mathematics and not covering
the preliminaries for a possible future graduate class. A final year undergraduate
course should be complete in itself and contain interesting and exciting applications.
If this is not done, it is like learning Latin to read Virgil but never actually reading
any Virgil (‘we never had the time to reach that part of the course’). Pure form, no
content. This can be a problem with introducing the Lebesgue integral at the end of
a mathematics degree without giving any applications in probability, ergodic theory
or Fourier analysis.
⁴ A simple example is given by the proof of uniform continuity of a continuous function on a closed
and bounded interval.
⁵ Spaces of functions of bounded variation are metrizable but not separable. Spaces of smooth
functions on R with the Whitney C^∞ topology are neither separable nor metrizable.
⁶ Examples showing that results on boundedness of continuous functions on closed intervals fail if
we work over the rational numbers.
On occasions I advise students in my analysis classes not to spend too much time
reading mathematics texts. That view is based on my own experience—an effective
way to learn mathematics is to do it, play with it but generally avoid spending too
much time reading books about it. Reading a mathematics book can give a veneer
of superficial understanding that dissolves the moment one tries to use the theory
described in the book. An analogy might be learning carpentry, plumbing or a
foreign language—knowing the theory is important but not that helpful; knowing
how to use the tools is crucial. That takes time, practice and serious effort. As an
example, think about hiring a personal trainer at the gym. You pay him or her rather
a lot of money and sit back two or three times a week and watch them exercise, lift
weights and generally work out and suffer. As a result you lose weight and gain a
svelte figure. . . . It is the same with mathematics and learning mathematics. Much
more is required than finding the ultimate book (or teacher).
So how does one approach a book on mathematics? Certainly not like a novel, to
be read breathlessly from cover to cover. Although there are classics of mathematics
literature, ranging from Euclid’s Elements and Newton’s Principia to the collected
works of Euler or Poincaré, rather few mathematicians have read these works cover
to cover. Dipping into these books is another matter. So perhaps one should regard
a mathematics text as similar to a computer or software manual? Not quite. A good
software manual should explain how to do standard tasks and have lots of good
examples (they often do not). Although all this is required of a serious mathematics
text, more is needed: why do we need this hypothesis, can we relax this condition,
why do we have to go to all this trouble to prove this result? Not just operational
skill but understanding and insight is required. The language and theory also need
to be motivated throughout by good and significant applications.
It is time to say a few words about the book, the contents and how to proceed.
Chapters 1 and 2 play multiple roles. There is a review of basic set theory
(Chap. 1) and the introduction of terminology and notation used throughout the text.
The main item in Chap. 1 is an elementary rigorous construction of the real numbers
using decimal expansions. This material should make for good classroom or small
group discussion. The approach is old and originally due to the Flemish (Dutch)
mathematician Simon Stevin in the sixteenth century. It predates the more abstract
nineteenth century approach to real numbers developed by Weierstrass, Dedekind
and others. It has the merit of a direct practical construction, done in the familiar
context of decimal expansions, and the approach fits naturally with the methods used
in Chap. 2 (for example, in the proof of existence of least upper bounds). Although
Stevin’s approach is currently unfashionable (or unknown), it does in my opinion
have one outstanding merit over the more abstract approach—every irrational has a
natural sequence of rational approximations given by decimal truncations.⁷
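The sequence of decimal truncations is easy to compute exactly. The sketch below is our own illustration rather than the construction in Chap. 1: it produces the truncations of the square root of 2 as exact rationals, using the observation that the integer part of sqrt(2) * 10^n is the integer square root of 2 * 10^(2n). The function name `sqrt2_truncation` is hypothetical.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(n):
    """Return the decimal expansion of sqrt(2) truncated to n places, as an exact rational."""
    # floor(sqrt(2) * 10^n) equals the integer square root of 2 * 10^(2n).
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


truncations = [sqrt2_truncation(n) for n in range(6)]
for n, t in enumerate(truncations):
    # Each truncation t satisfies t^2 < 2 < (t + 10^-n)^2.
    print(n, t, float(t))
```

The truncations form an increasing sequence of rationals, and the n-th one is within 10^-n of the irrational it approximates.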
Chapter 2 reviews completeness properties of the real line and basic analysis
of continuous functions. Key results, such as the Bolzano–Weierstrass theorem,
are proved using natural constructions based on the representation of real numbers
as decimal expansions. We include an appendix giving a simple presentation of
the existence and uniqueness of the integral for continuous functions. Discussion
topics for extension and exploration of this approach to multiple integrals are
given in the exercises. There is also an appendix on the more abstract approach
to the construction of the real numbers based on Cauchy sequences. There are also
review sections in Chap. 2 on complex numbers, a little calculus, and the log and
exponential functions. Most readers will know this material already—indeed much
of what is in Chap. 2—but my guess is that everyone at some point will get a queasy
feeling that there is a detail they need to check and so the details are provided.
Note that in both Chaps. 1 and 2, the definition and elementary properties of limits
are assumed known—we do not replicate uninteresting proofs about sums, products
and quotients of limits often given in elementary calculus texts. One exception is that
we do indicate a careful proof of convergence of geometric series over the rational
numbers (Lemma 1.5.9).
Chapters 3 and 4 are about infinite series, infinite products and uniform convergence. In Chap. 3, we consider infinite series and infinite products of real
and complex numbers (mostly real rather than complex). With a view to later
applications to Fourier series and power series, we include Dirichlet and Abel’s
tests. We also give the statement and proof of Tannery’s theorem—used in several
applications, notably the first proof of the infinite product formula for sin(x). In
Chap. 4 we investigate characteristic problems in analysis involving interchange of
limit operations in infinite sums (and products) of functions. A highlight of Chap. 4
is the construction of a continuous nowhere differentiable function.
In Chap. 5 we get to the heart of our subject: functions. Using Bernstein polynomials, we give a constructive proof of the Weierstrass approximation theorem:
every continuous function on a closed and bounded interval can be uniformly
approximated by polynomials. After giving applications of the Weierstrass theorem
we turn to smooth (infinitely differentiable) and real analytic functions. We start by
emphasizing examples and give elementary methods for the construction of smooth
(non-analytic) functions with specified properties. In so doing, we make our first
brief contact with twentieth century mathematics. We also develop a little of the
theory of real analytic functions, including results on analytic differential equations
(the methods we give apply equally to complex analytic functions). In the remainder
of Chap. 5 we develop the foundational theory of Fourier series. The main result
we prove is that the Fourier series of a continuous piecewise C^1 periodic function
converges uniformly to the function. Using Fourier series we give a second proof of
the infinite product formula for sin(x). We use Fourier series methods to compute
the sums of several infinite series.
⁷ That merit leads to the natural question of ‘best possible’ rational approximations: Diophantine
approximation.
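The constructive nature of the Bernstein polynomial proof of the Weierstrass approximation theorem can be seen in a few lines of code. The following is a minimal sketch under our own assumptions: the test function |x - 1/2|, the grid used to estimate the uniform error, and the names `bernstein` and `sup_error` are illustrative and not taken from Chap. 5.

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate the n-th Bernstein polynomial of f at a point x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(n + 1))


def sup_error(f, n, grid_size=201):
    """Largest error |B_n f(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)


def f(x):
    # A continuous function that is not differentiable at x = 1/2.
    return abs(x - 0.5)


errors = {n: sup_error(f, n) for n in (10, 50, 200)}
print(errors)
```

The error on the grid decreases as n grows, although slowly. Bernstein polynomials give uniform convergence for every continuous function, but they are not an efficient approximation scheme.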
In Chap. 6 we discuss two topics from eighteenth century analysis. We start
with the Gamma-function and verify most of the standard properties. Along the
way we introduce important techniques from analysis such as differentiation under
the integral sign. In the remainder of the chapter we discuss Bernoulli polynomials
and the Euler–Maclaurin formula. We use quite elementary mathematics to obtain
remarkably powerful results. For example, we use the Euler–Maclaurin formula to
give sharp estimates on the sums of several standard infinite series and also prove
versions of Stirling’s formula estimating n!.
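As a quick numerical illustration of the kind of estimate involved, the sketch below compares n! with the standard leading-order Stirling approximation sqrt(2πn) (n/e)^n. This is the textbook form of the formula and an assumption on our part. The versions proved in Chap. 6 may be stated differently, and the function name `stirling` is hypothetical.

```python
import math


def stirling(n):
    """Leading-order Stirling approximation sqrt(2*pi*n) * (n/e)^n to n!."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


for n in (5, 10, 50):
    ratio = math.factorial(n) / stirling(n)
    # The ratio tends to 1; the standard first correction is 1 + 1/(12n).
    print(n, ratio, 1 + 1 / (12 * n))
```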
In Chap. 7 we give an introduction to metric spaces. This is a chapter about
constructing the infrastructure needed for doing analysis on spaces more general
than domains in Euclidean space. We emphasize the metric structure and geometric
intuition. For example, a proper subset U of the metric space (X, d) is defined
to be open if d(x, X ∖ U) > 0 for all x ∈ U (the alternative is to use an ε, δ
definition in terms of balls or disks). Major results proved in this chapter include the
Arzelà–Ascoli theorem (the Bolzano–Weierstrass theorem for spaces of continuous
functions, uniform metric) and the contraction mapping lemma. We conclude the
chapter with some simple yet powerful applications of the contraction mapping
lemma to differential equations and the inverse function theorem (this is developed
further in Chap. 9).
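The contraction mapping lemma is constructive: iterating the map from any starting point converges to the unique fixed point. The sketch below is our own illustration and not an example from Chap. 7. It uses the cosine function on [0, 1], and the function name `fixed_point`, the tolerance and the starting value are assumptions.

```python
import math


def fixed_point(T, x0, tol=1e-12, max_iter=1000):
    """Iterate x -> T(x) from x0 until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")


# cos maps [0, 1] into itself and |cos'(x)| = |sin(x)| <= sin(1) < 1 there,
# so cos is a contraction of the complete metric space [0, 1].
x_star = fixed_point(math.cos, 0.5)
print(x_star, math.cos(x_star))
```

The same iteration, carried out in a space of functions rather than on an interval, is what underlies the applications to differential equations.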
In Chap. 8 we give a non-trivial application of the contraction mapping lemma to
the theory of iterated function systems. The results in this chapter give a beautiful
illustration of the power of the abstract methods developed in the chapter on metric
spaces. We show how to construct a complete metric on the (non-linear) space
H(R^n) of compact subsets of R^n. We show that an iterated function system on R^n
naturally defines a contraction operator on H(R^n) and thereby deduce that there is a
unique fractal defined by the iterated function system. The result is not difficult:
the problem lies in organizing the concepts and this is dealt with elegantly and
efficiently when we use the language of metric spaces.
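A small experiment makes the contraction property concrete. The sketch below is an illustration under our own assumptions, not an example from Chap. 8: it takes the two maps x -> x/3 and x -> x/3 + 2/3 on the real line (whose fractal is the middle-thirds Cantor set), applies the induced set operator to a finite set of points, and measures the Hausdorff distance between successive iterates using exact rational arithmetic. The names `hutchinson` and `hausdorff` are hypothetical.

```python
from fractions import Fraction


def hutchinson(points):
    """Apply the two Cantor maps x -> x/3 and x -> x/3 + 2/3 to a finite set."""
    third = Fraction(1, 3)
    return {third * x for x in points} | {third * x + 2 * third for x in points}


def hausdorff(A, B):
    """Hausdorff distance between two finite non-empty subsets of the real line."""
    far_a = max(min(abs(a - b) for b in B) for a in A)
    far_b = max(min(abs(a - b) for a in A) for b in B)
    return max(far_a, far_b)


sets = [{Fraction(0)}]
for _ in range(5):
    sets.append(hutchinson(sets[-1]))

# Each distance is at most one third of the previous one.
distances = [hausdorff(A, B) for A, B in zip(sets, sets[1:])]
print(distances)
```

Because successive distances shrink geometrically, the iterates form a Cauchy sequence for the Hausdorff metric, which is the situation the contraction mapping lemma addresses.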
Finally, in Chap. 9, we give a systematic account of the modern theory of
differential calculus on normed vector spaces. Apart from providing proofs and
statements of many standard results, such as the mean value theorem and Taylor’s
theorem, there are versions of Leibniz’s rule and Faà di Bruno’s formula for the
higher derivatives of a composite of vector-valued maps. We include applications
of the contraction mapping lemma to several versions of the implicit function
theorem, including the rank theorem. Also proved is the C^r existence theorem for
ordinary differential equations—the proof, based on the equation of variations, uses
the contraction mapping lemma and uniform approximation by smooth functions.
This result is fundamental in the development of the modern theory of dynamical
systems.
Although the book makes some use of complex numbers, we have not developed
the techniques and results of complex analysis based on Cauchy’s theorem and the
Cauchy–Riemann equations. The main reason for this omission is the current practice of offering a first self-contained course on complex analysis, including Cauchy’s
theorem and applications, followed perhaps by a more advanced course including
topics such as the Riemann mapping theorem or the Weierstrass and Mittag-Leffler
theorems. On integration, we have included a simple exposition of the integral
and indicated extensions to functions with countably many discontinuities in the
exercises. We have also included, mainly in exercises, results on monotone functions
and functions of bounded variation. We have not, however, developed the Riemann–
Stieltjes integral—it seemed difficult to give good applications appropriate to the
general style and content of the book (for example, applications in probability or
Riesz’s theorem on the dual space of C^0([a, b]), uniform norm). We do not develop
the general theory of multiple integrals. Our feeling here is that the key result—
the change of variables formula for multiple integrals—is hard to prove (correctly)
using Riemann sums and is better done in the context of Lebesgue integration, a
topic that lies outside the scope of this text. At a few points in the text we make use of
elementary results on multiple integrals (with one exception, always on rectangular
domains).
In Sydney, Australia, we gave a year long second year course in analysis
approximately based on chapters two through six. In Houston, I have given senior
level two semester courses that cover most of the topics from the first eight chapters
and sometimes a little from Chap. 9, depending on the background and knowledge
of the class (for an undergraduate class, one needs to be fairly selective in the choice
of material from Chap. 9).
The exercises: there are approximately 570, which range in difficulty from
routine practice to serious challenges. Some of the exercises are suitable for class
or group discussion and projects.
Acknowledgements are due to Don Cartwright and John McMullen who collaborated with me in the design of a second year honours analysis course given at
Sydney University from 1977 and which is the foundation for substantial parts of
Chaps. 3 through 6. Senior undergraduate and graduate students at the University
of Houston have taken analysis courses based on material from all chapters of
the book and I would like to record my appreciation for all the many helpful
comments and good questions I received from those classes. Last, but not least,
many thanks to Springer—most especially Anne-Kathrin Birchley-Brun, Remi
Lodh and Angela Schulze-Thomim—the anonymous copy-editor, who did great
work on the manuscript, and the production team at Spi for their fine work.
Houston, TX, USA
Michael Field
Contents
1
Sets, Functions and the Real Numbers . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.2 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.3 Functions .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.4 Countable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.5 The Real Numbers .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1.6 The Structure of the Real Numbers .. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
1
1
2
6
8
13
19
2 Basic Properties of Real Numbers, Sequences and Continuous
Functions .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.2 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.3 Bounded Subsets of R and the Supremum and Infimum . . . . . . . . . . . .
2.4 The Bolzano–Weierstrass Theorem .. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.5 lim sup and lim inf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.6 Complex Numbers .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.7 Appendix: Results from the Differential Calculus . . . . . . . . . . . . . . . . . . .
2.8 Appendix: The Riemann Integral . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
2.9 Appendix: The Log and Exponential Functions... . . . . . . . . . . . . . . . . . . .
2.10 Appendix: Construction of R Revisited . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .
31
31
32
37
48
58
63
67
71
82
86
3 Infinite Series
3.1 Introduction
3.2 Generalities
3.3 Series of Eventually Positive Terms
3.4 General Principle of Convergence
3.5 Absolute Convergence
3.6 Conditionally Convergent Series
3.7 Abel’s and Dirichlet’s Tests
3.8 Double Series
3.9 Infinite Products
3.10 Appendix: Trigonometric Identities
4 Uniform Convergence
4.1 Introduction
4.2 Pointwise Convergence
4.3 Uniform Convergence of Sequences
4.4 Uniform Convergence of Infinite Series
4.5 Power Series
4.6 Abel and Dirichlet’s Test for Uniform Convergence
4.7 Integrating and Differentiating Term-by-Term
4.8 A Continuous Nowhere Differentiable Function
5 Functions
5.1 Introduction
5.2 Smooth Functions
5.3 The Weierstrass Approximation Theorem
5.4 Analytic Functions
5.5 Trigonometric and Fourier Series
5.6 Mean Square Convergence
5.7 Appendix: Second Weierstrass Approximation Theorem
6 Topics from Classical Analysis: The Gamma-Function and the Euler–Maclaurin Formula
6.1 The Gamma-Function
6.2 Bernoulli Numbers and Bernoulli Polynomials
6.3 The Euler–Maclaurin Formula
7 Metric Spaces
7.1 Basic Definitions and Examples
7.2 Distance from a Subset
7.3 Open and Closed Subsets of a Metric Space: Intuition
7.4 Open and Closed Sets
7.5 Interior and Closure
7.6 Open and Closed Subsets of a Subspace
7.7 Dense Subsets and the Boundary of a Set
7.8 Neighbourhoods
7.9 Summary and Discussion
7.10 Sequences and Limit Points
7.11 Continuous Functions
7.12 Construction and Extension of Continuous Functions
7.13 Sequential Compactness
7.14 Compact Subsets of $\mathbb{R}$: The Middle Thirds Cantor Set
7.15 Complete Metric Spaces
7.16 Equicontinuity and the Arzelà–Ascoli Theorem
7.17 The Contraction Mapping Lemma
7.18 Connectedness
8 Fractals and Iterated Function Systems
8.1 The Space $H(\mathbb{R}^n)$
8.2 Iterated Function Systems
8.3 Examples of Iterated Function Systems
8.4 Concluding Remarks
9 Differential Calculus on $\mathbb{R}^m$
9.1 Normed Vector Spaces
9.2 Linear Maps
9.3 The Derivative
9.4 Properties of the Derivative
9.5 Maps to and from Products
9.6 Inverse and Implicit Function Theorems
9.7 Local Existence and Uniqueness Theorem for Ordinary Differential Equations
9.8 Higher Derivatives as Approximations
9.9 Multi-Linear Maps and Polynomials
9.10 Higher-Order Derivatives
9.11 Extension of Results from $C^1$ to $C^r$-Maps
9.12 Taylor’s Theorem
9.13 The Leibniz Rule and Faà di Bruno’s Formula
9.14 Smooth Functions and Uniform Approximation
9.15 The Local $C^r$ Existence Theorem for ODEs
9.16 Diffeomorphisms and Flows
9.17 Concluding Comments
9.18 Appendix: Finite-Dimensional Normed Vector Spaces
References
Index
Chapter 1
Sets, Functions and the Real Numbers
1.1 Introduction
We start by reviewing some of the basic definitions, notations and properties of sets
and functions. Although much of this material should be familiar, notations vary
and readers should at least skim through the sections on sets and functions so as to
familiarize themselves with the notational conventions used throughout the book.
The remainder of the chapter is devoted to a leisurely but careful discussion of
one approach to defining the real number system. Roughly speaking, we think of
a real number as defined by its decimal approximations. This will prove useful in
Chap. 2 when we prove general results on convergence (when we do not know the
limit). Overall, the section on real numbers is intended to motivate group discussion
and investigation. (What are the problems? How might we solve them?) At the
conclusion of Chap. 2, we return to the problem of the construction of the real
numbers and give an elegant, though more abstract, construction.
We assume some familiarity with proof by induction and recursive or inductive
definitions. We briefly recall the ideas; first, proof by induction. If for each natural
number $n$, we are given a statement $S(n)$, then $S(n)$ will be true for all $n$ if $S(1)$ is
true and the truth of $S(n)$ implies the truth of $S(n+1)$ for all $n \ge 1$. For a recursive or
inductive definition, we aim to give definitions or mathematical statements $S(n)$ for
$n \ge 1$. We can do this if $S(1)$ is given, and $S(n+1)$ is uniquely determined by $S(n)$
for all $n \ge 1$. We often use recursive definitions to define sequences. For example,
$x_1 = 1$, $x_{n+1} = \frac{1}{2}\left(x_n + \frac{2}{x_n}\right)$, $n \ge 1$. The rule used to define $x_{n+1}$ in terms of $x_n$
may involve logical statements and not be given in terms of a simple mathematical
formula.
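A recursive definition of this kind can be iterated directly. The following is a minimal sketch, assuming the rule reads $x_{n+1} = \frac{1}{2}(x_n + 2/x_n)$ with $x_1 = 1$; the function name `iterate` is ours and is only illustrative. Exact rational arithmetic is used so that each term is computed without rounding.

```python
from fractions import Fraction


def iterate(n_terms):
    """Return [x_1, ..., x_{n_terms}] for x_1 = 1, x_{n+1} = (x_n + 2/x_n)/2."""
    xs = [Fraction(1)]
    while len(xs) < n_terms:
        x = xs[-1]
        # Each new term is uniquely determined by the previous one.
        xs.append((x + 2 / x) / 2)
    return xs


terms = iterate(5)
for n, x in enumerate(terms, start=1):
    print(n, x, float(x))
```

Under this assumption the squares of the terms approach 2 quickly, which is in keeping with the idea of describing a real number by better and better rational approximations.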
1.2 Sets
Roughly speaking a set is a collection of ‘objects’. Each object in the set is regarded
as a member of the set. If we have a set $X$ and $x$ is an object, then we say $x$ is a
member of $X$ if $x$ is one of the objects comprising $X$. We write this symbolically as
“$x \in X$”.
Examples 1.2.1
(1) Let $X = \{1, 2, 3\}$ be the set with the members $1, 2, 3$. We have $1 \in X$, $2 \in X$,
$3 \in X$, $4 \notin X$, where the last notation means that ‘4 is not a member of $X$’.
(2) Let $\mathbb{N} = \{1, 2, \dots\}$ denote the set of strictly positive integers, the natural
numbers. Note the use of the dots to signify that $\mathbb{N}$ consists of all positive
integers. We have $10 \in \mathbb{N}$ but $-1, 0, \frac{1}{2} \notin \mathbb{N}$.
(3) Let $\mathbb{Z}$ denote the set of integers: $\mathbb{Z} = \{0, \pm 1, \pm 2, \dots\}$.
(4) Let $\mathbb{Z}_+$ denote the set of non-negative integers: $\mathbb{Z}_+ = \{0, 1, 2, \dots\}$.
(5) Let $\mathbb{Q} = \{\frac{r}{s} \mid r, s \in \mathbb{Z},\ s \ne 0\}$ denote the set of all rational numbers. We usually
assume $s > 0$ and $(r, s) = 1$ (the notation $(r, s) = 1$ signifies that $r, s$ have no
common factors).
(6) We let $\mathbb{R}$ denote the set of all real numbers. For the present, we will be imprecise
about the exact nature of the members of $\mathbb{R}$. However, if $x \in \mathbb{Z}$ or $x \in \mathbb{Q}$, then
$x \in \mathbb{R}$.
(7) Let $[0, 1] = \{x \in \mathbb{R} \mid 0 \le x \le 1\}$. We refer to $[0, 1]$ as the ‘closed unit interval.’
Observe the logical definition of $[0, 1]$: we impose a condition, $0 \le x \le 1$, on
the set of real numbers. The logical condition follows the $\mid$ symbol. In words:
$[0, 1]$ is the set of all real numbers $x$ satisfying the condition $0 \le x \le 1$.
Let $\emptyset$ denote the empty set. The empty set is the set with no members.
Example 1.2.2 Consider the sets $\emptyset$, $\{\emptyset\}$, $\{\emptyset, \{\emptyset\}\}$. The second set $\{\emptyset\}$ is not empty:
it has one member, the empty set $\emptyset$. Similarly, the third set has two members: $\emptyset$
and $\{\emptyset\}$.
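The distinction between $\emptyset$ and $\{\emptyset\}$ can be checked mechanically. The sketch below models the three sets of Example 1.2.2 with Python's immutable `frozenset` (a modelling choice of ours; the variable names are illustrative only) and counts their members.

```python
# Model the sets of Example 1.2.2 with immutable sets, so that a set
# can itself be a member of another set.
empty = frozenset()                 # the empty set
second = frozenset({empty})         # the set whose only member is the empty set
third = frozenset({empty, second})  # the set with members: empty set, second

print(len(empty), len(second), len(third))
print(empty in second, second in third, empty == second)
```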
1.2.1 Subsets
Let $A, B$ be sets. We say that $A$ is a subset of $B$, written symbolically as $A \subset B$,
if every member of $A$ is a member of $B$. In terms of the implication symbol $\Longrightarrow$,
$A \subset B$ if
$$a \in A \Longrightarrow a \in B.$$
Note that in this definition we allow $A = B$. If we can find $a \in A$ such that $a \notin B$,
then $A$ is not a subset of $B$. We write this as $A \not\subset B$. A consequence of our definitions
is that we regard the empty set as a subset of $B$ since we cannot find any member of
$\emptyset$ which is not a member of $B$ (and so $\emptyset \not\subset B$ is false). If $A \subset B$ but $A \ne B$ we write
$A \subsetneq B$. If, in addition, $A \ne \emptyset$, we refer to $A$ as a proper subset of $B$. We also allow
the notation $A \supset B$ which means that $B$ is a subset of $A$ (or $A$ is a superset of $B$).
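Example 1.2.3
$$[0, 1] \subset \mathbb{R}, \quad [0, 1] \not\subset \mathbb{N}, \quad \mathbb{N} \subset \mathbb{Z}_+ \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}.$$
As we shall see, $\mathbb{Q}$ is a proper subset of $\mathbb{R}$.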
1.2.2 Operations on Sets
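Unions and Intersections Let $A, B$ be sets. The union $A \cup B$ of $A$ and $B$ is
defined by
$$A \cup B = \{x \mid x \in A \text{ or } x \in B\}.$$
Observe that $A \subset A \cup B$, $B \subset A \cup B$ and $A \cup B = B \cup A$.
The intersection $A \cap B$ of $A$ and $B$ is defined by
$$A \cap B = \{x \mid x \in A \text{ and } x \in B\}.$$
We have $A \cap B \subset A, B \subset A \cup B$ and $A \cap B = B \cap A$.
Example 1.2.4 $A \cup A = A \cap A = A \cup \emptyset = A$ and $A \cap \emptyset = \emptyset$.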
Later we need to look at unions and intersections of families of sets. Suppose
then that we are given a set $\mathcal{A} = \{A_i\}$ of sets indexed by a non-empty set $I$. That
is, $\mathcal{A} = \{A_i \mid i \in I\}$ (or, in abbreviated form, $\{A_i\}_{i \in I}$). If the index set $I = \mathbb{N}$, then
we have a sequence of sets $A_1, A_2, \dots$. We define unions and intersections for the
family $\{A_i\}_{i \in I}$ by
$$\bigcup_{i \in I} A_i = \{x \mid \exists i \in I \text{ with } x \in A_i\},$$
$$\bigcap_{i \in I} A_i = \{x \mid x \in A_i \ \forall i \in I\}.$$
(Here we have made use of the shorthand symbols $\exists$ (‘there exists’) and $\forall$ (‘for
all’).)
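Examples 1.2.5
(1) $\bigcup_{n \in \mathbb{N}} \{n\} = \mathbb{N}$, $\bigcup_{n \in \mathbb{Z}_+} \{\pm n\} = \mathbb{Z}$.
(2) For $x \in \mathbb{R}$, define $\delta_x = \frac{1}{1 + |x|}$, $A_x = [x - \delta_x, x + \delta_x]$. Then $\bigcup_{x \in \mathbb{R}} A_x = \mathbb{R}$,
$\bigcap_{x \in \mathbb{R}} A_x = \emptyset$. If instead we take $\delta_x = |x|$, $A_x = [x - \delta_x, x + \delta_x]$, then $\bigcup_{x \in \mathbb{R}} A_x = \mathbb{R}$,
$\bigcap_{x \in \mathbb{R}} A_x = \{0\}$.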
Complements Fix a non-empty set $X$. If $A$ is a subset of $X$, we define the
complement $X \setminus A$ of $A$ (in $X$) by
$$X \setminus A = \{x \in X \mid x \notin A\}.$$
Remark 1.2.6 There are several other notations commonly in use for the complement $X \setminus A$. For example, $X - A$, $A'$, $A^c$ and $\complement A$. An advantage of our notation is that
it enables us to easily write sets built by iterated complementation, for example
$A \setminus (B \setminus C)$.
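Example 1.2.7 We have $X \setminus \emptyset = X$ and $X \setminus X = \emptyset$.
Lemma 1.2.8 For all subsets $A$ of $X$ we have
$$X \setminus (X \setminus A) = A.$$
Proof If $x \in A$, then $x \notin X \setminus A$. Hence $x$ lies in the complement of $X \setminus A$. That is,
$x \in X \setminus (X \setminus A)$. We have shown that $A \subset X \setminus (X \setminus A)$. Reversing the argument
shows that $X \setminus (X \setminus A) \subset A$. Hence $X \setminus (X \setminus A) = A$. □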
The next result will prove useful when we investigate open and closed sets of
metric spaces (Chap. 7).
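Proposition 1.2.9 Let $\{A_i\}_{i \in I}$ be a family of subsets of $X$. We have
(1) $X \setminus \bigcap_{i \in I} A_i = \bigcup_{i \in I} (X \setminus A_i)$.
(2) $X \setminus \bigcup_{i \in I} A_i = \bigcap_{i \in I} (X \setminus A_i)$.
Proof The proof is left to the exercises. □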
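Before attempting the proof, it can help to test the two identities of Proposition 1.2.9 on a small finite example. The sketch below does this for one hypothetical family of three subsets of $X = \{0, \dots, 9\}$; the particular sets are our own choice, and a check of this kind is of course no substitute for the proof.

```python
X = set(range(10))
# A hypothetical finite family {A_i} of subsets of X.
family = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 9}]


def complement(A):
    """Complement of A in X."""
    return X - A


intersection = set.intersection(*family)
union = set.union(*family)

# (1) complement of the intersection = union of the complements
lhs1 = complement(intersection)
rhs1 = set.union(*[complement(A) for A in family])

# (2) complement of the union = intersection of the complements
lhs2 = complement(union)
rhs2 = set.intersection(*[complement(A) for A in family])

print(lhs1 == rhs1, lhs2 == rhs2)
```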
Example 1.2.10 We define a subset $A$ of $\mathbb{R}$ to be of type C if it is either finite, or
empty or equal to $\mathbb{R}$. A subset of $\mathbb{R}$ is of type O if it is the complement of a subset of
type C. Since $\mathbb{R}$ and $\emptyset$ are of type C it follows (by taking complements) that $\mathbb{R}$ and $\emptyset$
are also of type O. These are the only subsets of $\mathbb{R}$ that are of type C and type O.
All other subsets of type O are the complement of a finite set and so must be infinite.
It follows from Proposition 1.2.9 that the intersection (respectively, union) of any
collection of sets of type C (respectively, type O) is a set of type C (respectively,
type O). On the other hand, only finite unions (respectively, intersections) of sets of type C
(respectively, type O) will always be of type C (respectively, type O).
The Power Set Let $X$ be a set. We define the power set of $X$, $\mathcal{P}(X)$, to be the set
of all subsets of $X$. That is, $\mathcal{P}(X) = \{A \mid A \subset X\}$.
Examples 1.2.11
(1) $\mathcal{P}(\emptyset) = \{\emptyset\} \ne \emptyset$.
(2) If $X = \{1, 2\}$, then $\mathcal{P}(X) = \{\{1\}, \{2\}, \{1, 2\}, \emptyset\}$.
Note that $\mathcal{P}(X)$ always contains $\emptyset$ and $X$. Hence, provided $X \ne \emptyset$, $\mathcal{P}(X)$ must
contain at least two members. It is easy to see that if $X$ is finite and contains $N$
members, then $\mathcal{P}(X)$ contains exactly $2^N$ members.
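For a finite set the power set can be listed explicitly, which gives a quick check of the count $2^N$. The following is a small sketch; the helper name `power_set` is ours, and subsets are represented as `frozenset` objects so that they can be collected in a set.

```python
from itertools import combinations


def power_set(X):
    """Return the set of all subsets of the finite set X."""
    items = list(X)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    }


for X in (set(), {1, 2}, {1, 2, 3, 4, 5}):
    P = power_set(X)
    print(len(X), len(P), len(P) == 2 ** len(X))
```

Subsets are grouped by size here, so the count $2^N$ appears as the sum of the binomial coefficients $\binom{N}{0} + \cdots + \binom{N}{N}$.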
Products of Sets Let $X$, $Y$ be non-empty sets. We define the (Cartesian) product
$X \times Y$ to be the set of ordered pairs of elements of $X$ and $Y$. That is,
$$X \times Y = \{(x, y) \mid x \in X,\ y \in Y\}.$$
It is straightforward to extend this definition to finite products. For example, given
sets $X_1, \dots, X_N$, we define
$$\prod_{i=1}^{N} X_i = X_1 \times \cdots \times X_N = \{(x_1, \dots, x_N) \mid x_i \in X_i,\ 1 \le i \le N\}.$$
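Finite products of finite sets are easy to enumerate. The sketch below uses the standard library function `itertools.product` on two small sets of our own choosing; the last line illustrates that a product with an empty factor has no elements.

```python
from itertools import product

X = {1, 2, 3}
Y = {"a", "b"}

# X x Y as a set of ordered pairs (x, y).
XY = set(product(X, Y))

# A triple product X x Y x X as a set of ordered triples.
triples = set(product(X, Y, X))

# A product with an empty factor is empty.
empty_product = set(product(X, set()))

print(len(XY), len(triples), len(empty_product))
```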
Remark 1.2.12 If either $X$ or $Y$ is empty, then $X \times Y = \emptyset$. If $X$ and $Y$ are non-empty,
then $X \times Y \ne \emptyset$ since we can pick at least one element $x_0$ from $X$, and one element
$y_0$ from $Y$. Hence, $(x_0, y_0) \in X \times Y$ and $X \times Y \ne \emptyset$. This argument becomes a little
dangerous if we try to define the product $\prod_{i \in I} X_i$ over an arbitrary indexing set $I$.
Formally, we can define
$$\prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \ \Big|\ f(i) \in X_i,\ \forall i \in I \Big\},$$
where $f$ is a function with domain $I$ and range $\bigcup_{i \in I} X_i$ which satisfies $f(i) \in X_i$ for all
$i \in I$ (see the next section for more on functions). However, with this definition
of product, it is not clear that the product is non-empty since, without further
assumptions, there seems no obvious way of constructing a function $f$ satisfying
$f(i) \in X_i$ for all $i \in I$ (this is not a problem if $I = \mathbb{N}$: use induction). In practice, it
is usually assumed that $\prod_{i \in I} X_i \ne \emptyset$ if $X_i \ne \emptyset$ for all $i \in I$, whatever the indexing
set $I$. This assumption is known as the Axiom of Choice. The Axiom of Choice,
and an equivalent statement called Zorn’s Lemma, play an important role in many
parts of mathematics; in particular, when we require statements that apply in great
generality. The need for care was seen early on in the development of set theory
because of the appearance of contradictions. Best known is Russell’s paradox: if we
let $\mathcal{X}$ denote the set of all sets and define $Z \subset \mathcal{X}$ by $Z = \{X \in \mathcal{X} \mid X \notin X\}$, then
$Z \in Z$ iff (shorthand for “if and only if”) $Z \notin Z$. Russell’s and other paradoxes can
be avoided by developing axiomatic versions of set theory. Most mathematicians
now assume a version of Zermelo–Fraenkel axiomatic set theory (ZF). For more
information, we refer the reader to one of the many books on the foundations and
history of set theory (personal favourites are [12, 13]).
EXERCISES 1.2.13
(1) Prove that $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ (Distributive law).
(2) True or false for all sets $A, B, C$? In each case, either prove the statement or
provide a counterexample.
(a) $A = B$ iff $A \subset B$ and $B \subset A$.
(b) $A \cup B = (A \setminus B) \cup (B \setminus A) \cup (A \cap B)$.
(c) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
(d) $A \subset B$ iff $x \notin B \Longrightarrow x \notin A$.
(3) Let $A, B$ be subsets of $X$. Prove that $A \setminus B = (X \setminus B) \cap A$. Deduce that if we use
the notation $A^c$ for $X \setminus A$, then every expression involving $A \setminus B$ can be written
in terms of $A$ and $B^c$. If $A, B, C$ are subsets of $X$, find the simplest expression
you can for $((A \setminus B) \setminus C) \cap (A \setminus C)$ in terms of $A, B, C, A^c, B^c, C^c$.
(4) Complete the proof of Proposition 1.2.9. (To show that $X \setminus \bigcap_{i \in I} A_i = \bigcup_{i \in I} (X \setminus A_i)$,
prove that the left-hand side is a subset of the right-hand side and
conversely.)
(5) Let $X$ be a non-empty set. Suppose that $A, B$ are proper subsets of $X$ ($A, B \ne
\emptyset, X$). Show that we can generate at most four different subsets of $X$ from $A$
and $B$ using the operations of intersection and union. What about if we allow
complements? Consider the same question, but now suppose we are given three
subsets $A, B, C$ of $X$. (Hint: Use the result of Q2 and Proposition 1.2.9 for
the extension to complements. We remark that the number of subsets we can
generate by intersection and union from $n$ subsets of $X$ increases rapidly as a
function of $n$ and is closely related to the Dedekind number $M(n)$ of $n$.)
(6) Let $A, B, C$ be subsets of $X$. Define the symmetric difference $A \triangle B$ by
$$A \triangle B = (A \setminus B) \cup (B \setminus A).$$
Complete the sentence ‘$x \in A \triangle B$ iff $x \in A$ and $\dots$ or $\dots$ and $\dots \notin \dots$’. Prove
(a) $A \triangle B = \emptyset$ iff $A = B$.
(b) $A \setminus B = A \cap (A \triangle B)$.
(c) $A \cap (B \triangle C) = (A \cap B) \triangle (A \cap C)$.
(d) $(A \triangle B) \triangle C = A \triangle (B \triangle C)$ (associativity of symmetric difference).
(e) $A \triangle B = (A \triangle C) \triangle (C \triangle B)$.
(f) Show that $A \triangle B = C \triangle D$ iff $A \triangle C = B \triangle D$.
(7) Let $\{A_i \mid i \in I\}$ and $\{B_j \mid j \in J\}$ be families of subsets of $X$. Prove that
$$\Big(\bigcap_{i \in I} A_i\Big) \cup \Big(\bigcap_{j \in J} B_j\Big) = \bigcap_{i \in I,\, j \in J} (A_i \cup B_j).$$
(The indexing sets $I, J$ are non-empty and may be infinite.)
1.3 Functions
Let $X, Y$ be non-empty sets. A function, or map, $f$ from $X$ to $Y$ assigns to each $x \in X$
a unique point $f(x)$ in $Y$. We denote this assignment symbolically by $f : X \to Y$ (“$f$
maps $X$ to $Y$”). We call $f(x)$ the value of $f$ at $x$. Every function $f : X \to Y$ has a
graph $\Gamma_f \subset X \times Y$ defined by
$$\Gamma_f = \{(x, f(x)) \mid x \in X\}.$$
Conversely, if $G \subset X \times Y$ has the property that for every $x \in X$, there exists a unique
point $y \in Y$ such that $(x, y) \in G$, then $G$ is the graph $\Gamma_f$ of a function $f$, where the
value $f(x)$ of $f$ at $x$ is $y$, the unique point in $Y$ such that $(x, y) \in G$.
If $f : X \to Y$, then the range or image of $f$ is the subset $f(X)$ of $Y$ defined by
$$f(X) = \{f(x) \mid x \in X\}.$$
More generally, if $A \subset X$, then $f(A)$ is the subset of $Y$ defined by $f(A) = \{f(a) \mid
a \in A\}$.
If $B$ is a subset of $Y$, the inverse image (by $f$) of $B$ is the subset $f^{-1}(B)$ of $X$
defined by
$$f^{-1}(B) = \{x \in X \mid f(x) \in B\}.$$
Example 1.3.1 If $B \subset Y \setminus f(X)$, then $f^{-1}(B) = \emptyset$. Conversely, if $B \subset Y$ and
$f^{-1}(B) = \emptyset$, then $B \cap f(X) = \emptyset$.
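For a function on a finite set, images and inverse images can be computed directly from the definitions. The sketch below is illustrative only: the helper names `image` and `preimage` and the choice $f(x) = x^2$ on a small set $X$ are ours. Note that the inverse image is defined for every $f$, whether or not $f$ has an inverse map.

```python
def image(f, A):
    """f(A) = {f(a) | a in A}."""
    return {f(a) for a in A}


def preimage(f, X, B):
    """f^{-1}(B) = {x in X | f(x) in B}, where X is the domain of f."""
    return {x for x in X if f(x) in B}


def f(x):
    return x * x


X = {-2, -1, 0, 1, 2}
Y = set(range(-1, 6))

print(image(f, X))                     # the range f(X)
print(preimage(f, X, {4}))             # two points map to 4
print(preimage(f, X, Y - image(f, X))) # empty, as in Example 1.3.1
```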
If $f : X \to Y$, $g : Y \to Z$, then the composite $gf$ of $f$ and $g$ is the map $gf : X \to Z$
defined by
$$(gf)(x) = g(f(x)), \quad (x \in X).$$
Remark 1.3.2 The composite $gf$ of $f$ and $g$ is not the multiplicative product of $f$ and
$g$. Of course, if (say) $f, g : X \to \mathbb{R}$, we can form the multiplicative product $f \cdot g$,
defined by $(f \cdot g)(x) = f(x)g(x)$. As it is natural to abbreviate $f \cdot g$ as $fg$ (especially
if $f, g : \mathbb{R} \to \mathbb{R}$), it is sometimes useful to use a notation like $g \circ f$ for composites
so as to make it clear that we are not dealing with $f \cdot g$.
Example 1.3.3 If $C \subset Z$, then $(gf)^{-1}(C) = f^{-1}(g^{-1}(C))$ (note the reverse order).
The proof is left to the exercises.
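The order reversal in Example 1.3.3 is easy to observe on a finite example. The sketch below uses hypothetical maps $f(x) = x^2 + 1$ and $g(y) = y \bmod 3$ on small finite sets of our own choosing, and compares the two sides of the identity; it is a sanity check and not a proof.

```python
def preimage(h, domain, target):
    """h^{-1}(target) = {x in domain | h(x) in target}."""
    return {x for x in domain if h(x) in target}


def f(x):          # f : X -> Y
    return x * x + 1


def g(y):          # g : Y -> Z
    return y % 3


def gf(x):         # the composite gf : X -> Z
    return g(f(x))


X = set(range(-3, 4))
Y = {f(x) for x in X} | {0, 3}   # a set containing f(X)
C = {2}                          # a subset of Z = {0, 1, 2}

lhs = preimage(gf, X, C)                    # (gf)^{-1}(C)
rhs = preimage(f, X, preimage(g, Y, C))     # f^{-1}(g^{-1}(C))
print(lhs, rhs, lhs == rhs)
```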
Definition 1.3.4 Let $f : X \to Y$.
(1) $f$ is onto (or surjective) if $f(X) = Y$.
(2) $f$ is 1:1 (or injective) if $f(x) = f(x')$ iff $x = x'$.
(3) $f$ is 1:1 onto (or bijective) if $f$ is 1:1 and onto.
If $f : X \to Y$ is a bijection, then we may define the inverse map $f^{-1} : Y \to X$
by defining the value of $f^{-1}$ at $y \in Y$ to be the unique point $f^{-1}(y) \in X$ such that
$f(f^{-1}(y)) = y$. Since $f$ is onto, there always exists at least one point $x \in X$ such
that $f(x) = y$. Since $f$ is 1:1, the point $x$ is unique. We define $f^{-1}(y) = x$.
Remark 1.3.5 The reader should be aware of the ambiguity caused by using the
symbol $f^{-1}$ for inverses of sets and inverse maps. In particular, if $f : X \to Y$, and
$b \in Y$, then $f^{-1}(\{b\})$ is a subset of $X$. If $f$ has an inverse, then $f^{-1}(b)$ is a point of
$X$. In practice, one writes $f^{-1}(b)$ to cover both situations. That is, $f^{-1}(b) = \{x \in
X \mid f(x) = b\}$. If $f$ is a bijection, we identify the point $f^{-1}(b) \in X$ with the subset
$\{f^{-1}(b)\}$ of $X$. If $f$ has inverse function $f^{-1}$, then the inverse image set $f^{-1}(B)$ is
equal to the image of $B$ by the map $f^{-1}$.
Example 1.3.6 If $f : X \to Y$, $g : Y \to Z$ are bijections, then $gf : X \to Z$ is a bijection
and $(gf)^{-1} = f^{-1} g^{-1}$.
EXERCISES 1.3.7
(1) Let $f : X \to Y$ and $\{U_i \mid i \in I\}$ be a family of subsets of $X$. Show that
$f(\bigcup_{i \in I} U_i) = \bigcup_{i \in I} f(U_i)$. What about $f(\bigcap_{i \in I} U_i)$?
(2) Let $f : X \to Y$ and $U, V$ be subsets of $X$. Show, by finding an explicit example,
that in general, $f(U) \setminus f(V) \ne f(U \setminus V)$.
(3) Let $f : X \to Y$ and $\{U_i \mid i \in I\}$ be a family of subsets of $Y$. Prove that
(a) $f^{-1}(\bigcap_{i \in I} U_i) = \bigcap_{i \in I} f^{-1}(U_i)$.
(b) $f^{-1}(\bigcup_{i \in I} U_i) = \bigcup_{i \in I} f^{-1}(U_i)$.
(4) Show that if $f : X \to Y$, $g : Y \to Z$ and $C$ is a subset of $Z$, then $(gf)^{-1}(C) =
f^{-1}(g^{-1}(C))$.
(5) Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x^2 + 1$. Find
(a) $f([0, 2])$.
(b) $f^{-1}((-1, 1))$.
(c) $f^{-1}([2, 3])$.
If $g : \mathbb{R} \to \mathbb{R}$ is the map $g(x) = x^3$, find (a) $(gf)^{-1}([0, 2])$, (b) $gf([0, 2])$,
(c) $(fg)^{-1}([0, 2])$, (d) $fg([0, 2])$.
1.4 Countable Sets
1.4.1 Equivalence of Sets
Definition 1.4.1 The sets $X$, $Y$ are equivalent if there exists a bijection $f : X \to Y$.
We write this symbolically as “$X \sim Y$”.
Remark 1.4.2 The relation $\sim$ is reflexive ($X \sim X$), symmetric ($X \sim Y \Longrightarrow Y \sim X$),
and transitive ($X \sim Y$, $Y \sim Z \Longrightarrow X \sim Z$). A relation satisfying these properties is
called an equivalence relation.
We give a general result on inequivalence that has an intriguing proof discovered
by the creator of set theory, Georg Cantor. The proof is reminiscent of the argument
used in Russell’s paradox (see Remark 1.2.12).
Proposition 1.4.3 Let $X$ be a set. Then $X \not\sim \mathcal{P}(X)$.
The interest in this result lies in the case when $X$ is not finite (see below). The result
is trivially true if $X = \emptyset$ ($\mathcal{P}(\emptyset) = \{\emptyset\} \ne \emptyset$) so we assume $X \ne \emptyset$.
Proof We prove that there is no surjection from $X$ to $\mathcal{P}(X)$. Our proof goes by
contradiction. Suppose that $f : X \to \mathcal{P}(X)$ is surjective. Since $f(x)$ is a subset of
$X$ for all $x \in X$, we may define the following subset $B$ of $X$:
$$B = \{x \in X \mid x \notin f(x)\}.$$
Observe that the definition makes sense even if $f(x) = \emptyset$, a possibility since
$\emptyset \in \mathcal{P}(X)$. Since we assume $f$ is onto, $\exists b \in X$ such that $f(b) = B$. There are
exactly two possibilities: $b \in B$, or $b \notin B$. If $b \in B$ then, by definition of $B$,
$b \notin f(b) = B$. Contradiction. Similarly, if $b \notin B = f(b)$, then $b \in B$, by definition
of $B$. Contradiction. Either assumption leads to an absurd conclusion and therefore
our original assumption that $f$ is onto must be false. □
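The set $B$ used in the proof can be computed explicitly when $X$ is finite. The sketch below is an illustration under our own choices ($X = \{0, 1, 2\}$, maps stored as dictionaries, the helper name `diagonal_set`): it runs through every map $f : X \to \mathcal{P}(X)$ and records whether the diagonal set $B$ is missed by $f$. The proof shows that it always is.

```python
from itertools import combinations, product

X = (0, 1, 2)
# All subsets of X, i.e. the members of the power set P(X).
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def diagonal_set(f):
    """B = {x in X | x not in f(x)} for a map f given as a dict x -> subset."""
    return frozenset(x for x in X if x not in f[x])


total = 0
missed = 0
for values in product(subsets, repeat=len(X)):
    f = dict(zip(X, values))          # one map f : X -> P(X)
    total += 1
    if diagonal_set(f) not in f.values():
        missed += 1                   # B is not of the form f(b)

print(total, missed)
```

For a three-element set there are $8^3$ such maps, so the exhaustive search is cheap; the value of the proof is that the same argument works for every set $X$, finite or not.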
Remark 1.4.4 If $X$ is infinite, then Proposition 1.4.3 implies that $X$ is not equivalent
to the infinite set $\mathcal{P}(X)$. In particular, infinite sets need not be equivalent. When
Cantor proved this (and more) in 1874, the result was highly controversial and was
attacked by many mathematicians, philosophers and theologians.
1.4.2 Finite and Countable Sets
For this section only we use the notation $\mathbf{n}$ for the set $\{1, \dots, n\}$ of the first $n$ natural
numbers. If $m < n \in \mathbb{N}$, then $\mathbf{n - m}$ will be the set $\{1, \dots, n - m\}$.
Definition 1.4.5 A set $X$ is finite if either $X = \emptyset$ or there exists an $n \in \mathbb{N}$ such that
$$X \sim \mathbf{n}.$$
A set is infinite if it is not finite.
We need to check that the n in our definition of finite is uniquely determined by
X. For this we need a preliminary result.
Lemma 1.4.6 Let $X, Y$ be equivalent sets containing at least two members. If
$x_0 \in X$ and $y_0 \in Y$, then $X \setminus \{x_0\} \sim Y \setminus \{y_0\}$.
Proof Let $f : X \to Y$ be a bijection. If $f(x_0) = y_0$, then $f$ restricts to a bijection
$f : X \setminus \{x_0\} \to Y \setminus \{y_0\}$ and so $X \setminus \{x_0\} \sim Y \setminus \{y_0\}$. If $f(x_0) \ne y_0$, set
$z_0 = f(x_0)$, so that $z_0 \in Y$ and $z_0 \ne y_0$. Now define $g : Y \to Y$ by
$g(y) = y$ if $y \notin \{y_0, z_0\}$, $g(z_0) = y_0$, $g(y_0) = z_0$. The map $g$ is a bijection (it swaps $y_0$ and $z_0$),
so the composite $gf : X \to Y$ is a bijection and $(gf)(x_0) = g(z_0) = y_0$. We proceed as before. □
Lemma 1.4.7 Let $n, m \in \mathbb{N}$. Then $\mathbf{n} \sim \mathbf{m}$ iff $n = m$.
Proof Obviously, if $n = m$, we have $\mathbf{n} \sim \mathbf{m}$. So suppose $\mathbf{n} \sim \mathbf{m}$. Since the result
is trivial if $n = 1$ or $m = 1$, we may assume $n, m > 1$. Without loss of generality
suppose $m \le n$. Apply the previous lemma $m - 1$ times to get $\mathbf{1} \sim \mathbf{n - m + 1}$. It
follows that $1 = n - m + 1$. That is, $m = n$. □
As an immediate corollary of Lemma 1.4.7 we have
Corollary 1.4.8 If $X$ is finite and non-empty, then there exists a unique $n \in \mathbb{N}$ such
that $X \sim \mathbf{n}$. The integer $n$ is called the cardinality of $X$. (If $X = \emptyset$, we regard the
cardinality of $X$ as being zero.)
Example 1.4.9 If the cardinality of $X$ is $n$, then the cardinality of $\mathcal{P}(X)$ is $2^n$. Since
$2^n > n$, for all $n \in \mathbb{N}$, we see that the cardinality of the power set of a finite
set is always greater than the cardinality of the set (this holds true if $X = \emptyset$ as
$\mathcal{P}(\emptyset) = \{\emptyset\}$).
Definition 1.4.10 A set $X$ is countable if either $X$ is finite or $X \sim \mathbb{N}$. If $X \sim \mathbb{N}$, we
sometimes say $X$ is countably infinite. If $X$ is not countable, we say $X$ is uncountable.
Examples 1.4.11
(1) The set $\mathbb{Z}$ of all integers is countable. For this it is enough to note that the map
$f : \mathbb{N} \to \mathbb{Z}$ defined by
$$f(n) = \begin{cases} n/2, & \text{if } n \text{ is even}, \\ (-n + 1)/2, & \text{if } n \text{ is odd}, \end{cases}$$
is a bijection.
(2) The set of prime numbers is countably infinite (see the exercises at the end of
the section).
(3) Not all sets are countable. For example, $\mathcal{P}(\mathbb{N}) \not\sim \mathbb{N}$ (Proposition 1.4.3) and so,
since $\mathcal{P}(\mathbb{N})$ is infinite ($\mathcal{P}(\mathbb{N}) \supset \{\{1\}, \{2\}, \dots\}$), $\mathcal{P}(\mathbb{N})$ cannot be countable.
Proposition 1.4.12 A subset of a countable set is countable.
Proof Let $Y$ be a subset of the countable set $X$. The result is immediate if $X$ is finite
so we shall assume that $X$ is countably infinite. We may write $X = \{x_1, x_2, \dots\}$.
More precisely, there exists a bijection $f : \mathbb{N} \to X$, and so we may define $x_n = f(n)$,
$n \in \mathbb{N}$. Let $n_1 \ge 1$ be the smallest integer such that $x_{n_1} \in Y$. Assume we have
defined $n_1 < n_2 < \cdots < n_k$ such that $x_{n_j} \in Y$, $1 \le j \le k$ and $Y \cap \{x_1, x_2, \dots, x_{n_k}\} =
\{x_{n_1}, \dots, x_{n_k}\}$. Then either $Y \cap \{x_1, \dots, x_{n_k}\} = Y$ and so $Y$ is finite or there exists a
smallest $n_{k+1} > n_k$ such that $x_{n_{k+1}} \in Y$. It follows that either $Y$ is finite (the process
terminates) or we may write $Y = \{x_{n_1}, x_{n_2}, \dots\}$ and so $Y$ is countable (a bijection
$g : \mathbb{N} \to Y$ is defined by $g(k) = x_{n_k}$, $k \ge 1$). □
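The inductive construction in this proof is, in effect, a filtering procedure: run through $x_1, x_2, \dots$ in order and keep the terms that lie in $Y$. The sketch below illustrates it with $X = \mathbb{Z}$, enumerated by the bijection $f$ of Examples 1.4.11(1), taking the odd case to be $f(n) = (1 - n)/2$ (our reading of the formula). The subset $Y$ of multiples of 3 and the helper name `enumerate_subset` are our own choices.

```python
from itertools import count, islice


def f(n):
    """Bijection N -> Z of Examples 1.4.11(1), with N = {1, 2, 3, ...}."""
    return n // 2 if n % 2 == 0 else (1 - n) // 2


def enumerate_subset(in_Y):
    """Yield x_{n_1}, x_{n_2}, ... where n_1 < n_2 < ... are exactly the
    indices n with x_n = f(n) in Y (Y is given by a membership test)."""
    for n in count(1):
        x = f(n)
        if in_Y(x):
            yield x


first_values = [f(n) for n in range(1, 8)]
multiples_of_three = list(islice(enumerate_subset(lambda x: x % 3 == 0), 7))
print(first_values)
print(multiples_of_three)
```

If $Y$ is finite the generator simply stops producing new values (it keeps searching forever), which corresponds to the case in the proof where the process terminates; `islice` is used here to take only finitely many terms.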
Theorem 1.4.13 Every infinite set $X$ contains a countably infinite subset.
Proof The proof is similar to that of Proposition 1.4.12 and uses an inductive
construction. We construct a 1:1 map $g : \mathbb{N} \to X$. Since $X \ne \emptyset$, we can pick
an element $x_1 \in X$. After $n$ steps, suppose we have picked $n$ distinct elements
$\{x_1, \dots, x_n\}$ of $X$. Since $X$ is not finite, $X \setminus \{x_1, \dots, x_n\} \ne \emptyset$ and so we can
pick $x_{n+1} \in X \setminus \{x_1, \dots, x_n\}$. The construction defines a 1:1 map $g : \mathbb{N} \to X$
by $g(n) = x_n$, $n \in \mathbb{N}$, and the image $g(\mathbb{N})$ is the required subset. □
Lemma 1.4.14 Let $f : X \to Y$.
(1) If $f$ is injective and $Y$ is countable, then $X$ is countable.
(2) If $f$ is surjective and $X$ is countable, then $Y$ is countable.
Proof We prove (1) and leave (2) to the exercises. Since $f(X) \subset Y$ and $Y$ is
countable, $f(X)$ is countable by Proposition 1.4.12. The result follows since $X \sim
f(X)$ (since $f$ is injective, $f$ defines a bijection of $X$ onto $f(X)$). □
Theorem 1.4.15
(1) A finite product of countable sets is countable.
(2) A countable union of countable sets is countable.
Proof (1) We start by proving that $\mathbb{N}^m \sim \mathbb{N}$, for all $m \ge 2$. Our proof is by induction
on $m$. Suppose $m = 2$. Since $\mathbb{N}^2 = \mathbb{N} \times \mathbb{N}$, we may write the points of $\mathbb{N}^2$ as the
infinite array
$$\begin{array}{cccc}
(1, 1) & (1, 2) & (1, 3) & \cdots \\
(2, 1) & (2, 2) & (2, 3) & \cdots \\
(3, 1) & (3, 2) & (3, 3) & \cdots \\
\vdots & \vdots & \vdots &
\end{array}$$
We give an inductive definition of $F : \mathbb{N} \to \mathbb{N}^2$. We define $F(1) = (1, 1)$. Suppose
we have defined $F(1), \dots, F(n - 1)$, $n > 1$. We define $F(n)$. Suppose $F(n - 1) =
(i, j)$. If $j = 1$, we take $F(n) = (1, i + 1)$, else we take $F(n) = (i + 1, j - 1)$. This
defines a path through the array which follows the diagonals: $(1, 1) \to (1, 2) \to
(2, 1) \to (1, 3) \to (2, 2) \to \cdots$. The map $F$ defines the required bijection between
$\mathbb{N}$ and $\mathbb{N}^2$. Suppose that $m > 2$ and that we have shown $\mathbb{N}^{m-1} \sim \mathbb{N}$. We have
$$\mathbb{N}^m = \mathbb{N}^{m-1} \times \mathbb{N} \sim \mathbb{N} \times \mathbb{N} \sim \mathbb{N},$$
where the first equivalence follows by the inductive hypothesis and the second
equivalence is the case $m = 2$. Now suppose $A_1, \dots, A_m$ are countable sets. For
each $j \in \mathbf{m}$, there exists an injection $\alpha_j : A_j \to \mathbb{N}$. Hence we may define the injection
$(\alpha_1, \dots, \alpha_m) : \prod_{i=1}^{m} A_i \to \mathbb{N}^m$. Hence $\prod_{i=1}^{m} A_i$ is countable by Lemma 1.4.14(1).
It remains to prove (2). Let $\{A_i\}_{i \in I}$ be a countable family of countable sets. We
assume that $I$ is infinite and take $I = \mathbb{N}$ (if $I \sim \mathbf{m}$, define $A_j = A_m$, $j > m$). Since
$A_j$ is countable, there exists a surjection $\beta_j : \mathbb{N} \to A_j$ (if $A_j$ is finite with $n$ elements
$a_1, \dots, a_n$, define $\beta_j(m) = a_n$, $m > n$). Define the surjection $\beta : \mathbb{N}^2 \to \bigcup_{n \in \mathbb{N}} A_n$ by
$\beta(i, j) = \beta_j(i)$. The countability of $\bigcup_{n \in \mathbb{N}} A_n$ follows by Lemma 1.4.14(2). □
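The inductive rule defining $F$ in part (1) of the proof translates directly into a short program. The sketch below (the function name `diagonal_path` is ours) lists the first values of $F$ and checks, on an initial segment, that the path visits each point of the array at most once and sweeps out complete diagonals.

```python
def diagonal_path(n_terms):
    """Return [F(1), ..., F(n_terms)] where F(1) = (1, 1) and, if
    F(n - 1) = (i, j), then F(n) = (1, i + 1) when j = 1 and
    F(n) = (i + 1, j - 1) otherwise."""
    path = [(1, 1)]
    while len(path) < n_terms:
        i, j = path[-1]
        path.append((1, i + 1) if j == 1 else (i + 1, j - 1))
    return path


first_ten = diagonal_path(10)
print(first_ten)

# The first 1 + 2 + 3 + 4 = 10 values fill the first four diagonals.
first_four_diagonals = {(i, j) for i in range(1, 5) for j in range(1, 5) if i + j <= 5}
print(set(first_ten) == first_four_diagonals)
```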
Examples 1.4.16
(1) For $m \ge 1$, $\mathbb{N}^m$, $\mathbb{Z}^m$ are countable.
(2) The set $\mathbb{Q}$ of rational numbers is countable. Every element of $\mathbb{Q}$ can be
represented uniquely in the form $r/s$, where $s > 0$ and $(r, s) = 1$ (we write
$0 = \frac{0}{1}$). Hence we may represent $\mathbb{Q}$ as a subset of $\mathbb{Z}^2 \sim \mathbb{N}^2 \sim \mathbb{N}$. Apply
Proposition 1.4.12.
(3) Let $A = \{a \in \mathbb{R} \mid \exists n,\ p_0 \ne 0, \dots, p_n \in \mathbb{Z} \text{ with } p_0 a^n + \cdots + p_{n-1} a + p_n = 0\}$.
Then $A$, the set of algebraic numbers, is countable. This is a simple consequence of Theorem 1.4.15 and we leave the details to the reader as an exercise.
In particular, $\{m^{1/n} \mid m, n \ge 1\} \subset A$ is countable.
EXERCISES 1.4.17
(1) Prove Lemma 1.4.14(2). (Hint: Show, using an inductive construction, that if
$f : X \to Y$ is onto and $X$ is countable, then there exists a subset $X'$ of $X$ such
that $f$ maps $X'$ bijectively onto $Y$. This shows that there exists an injective map
$g : Y \to X$; $g$ is the inverse of $f : X' \to Y$.)
(2) Prove that the following sets are countably infinite.
(a) The set of positive odd integers. (Construct a bijection between the set
and $\mathbb{N}$.)
(b) The set of prime numbers. (Euclid’s argument. Suppose the contrary and
let $p_1 < \cdots < p_N$ be the set of prime numbers. Derive a contradiction by
showing that the prime factorization of $p_1 \cdots p_N + 1$ must have a prime
factor bigger than $p_N$.)
(c) The set of all real numbers which are roots of an equation of the form
$p_0 x^n + \cdots + p_{n-1} x + p_n = 0$, where $n \ge 1$ and $p_0, \dots, p_n \in \mathbb{Z}$.
(d) The subset $A$ of $\mathbb{R}$ defined by $a \in A$ iff there exist $n \in \mathbb{N}$ and $p_1, \dots, p_{n-1} \in \mathbb{Q}$
such that $a^n + p_1 a^{n-1} + \cdots + p_{n-1} a + \sqrt{2} = 0$.
(For (c,d) you need to verify that the set can be represented as an infinite subset
of a (known) countable set.)
(3) Let $p \in \mathbb{N}$, $p > 1$. Define $D_p = \{x \in \mathbb{R} \mid x = \frac{m}{p^n}, \text{ where } m \in \mathbb{Z},\ n \in \mathbb{Z}_+\}$.
Show that $D_p$ and $\mathbb{Q} \setminus D_p$ are both countably infinite.
(4) Show that if $X$ contains an uncountable subset, then $X$ is uncountable.
(5) Show that if $X$ is an infinite set, then we can find a proper subset $Y$ of $X$ ($Y \ne X$)
such that $Y \sim X$. (Hint: use Theorem 1.4.13.)
(6) Let $A_0, B_0$ be non-empty sets and suppose that $f : A_0 \to B_0$ and $g : B_0 \to A_0$.
For $j \ge 0$ define inductively $B_{j+1} = f(A_j)$, $A_{j+1} = g(B_j)$. Show that
(a) $A_0 \supset A_1 \supset \cdots$, $B_0 \supset B_1 \supset \cdots$.
(b) If $E = \cap_{n \ge 0} A_n$, $F = \cap_{n \ge 0} B_n$ and $p \ge 0$, then
$$A_p = \cup_{n \ge p} (A_n \setminus A_{n+1}) \cup E, \qquad B_p = \cup_{n \ge p} (B_n \setminus B_{n+1}) \cup F$$
as unions of disjoint subsets.
(c) We may write $A_0$, $A_1$ as unions of disjoint subsets $A_0 = X \cup Y \cup E$,
$A_1 = X \cup Y_1 \cup E$, where
$$X = \cup_{n \ge 1} (A_{2n-1} \setminus A_{2n}), \qquad Y = \cup_{n \ge 0} (A_{2n} \setminus A_{2n+1}), \qquad Y_1 = \cup_{n \ge 1} (A_{2n} \setminus A_{2n+1}).$$
Similarly for $B_0, B_1$.
(d) If $f, g$ are 1:1, then $g \circ f : Y \to Y_1$ is a bijection.
Deduce that if $f, g$ are 1:1, then $A_0 \sim A_1$ and so, since $A_1 \sim B_0$ (by $g^{-1}$),
$A_0 \sim B_0$. (The Cantor–Bernstein theorem: if $A$ is equivalent to a subset of $B$ and
$B$ is equivalent to a subset of $A$, then $A$ is equivalent to $B$.)
1.5 The Real Numbers
1.5.1 Not All Real Numbers Are Rational
The original formulation of geometry by the Pythagorean school was based on ideas
of proportion and tacitly assumed that all numbers were rational.$^1$ An advantage of
this approach was that numbers could, in theory, all be constructed geometrically
using ruler and compass. It came as a shock to the Pythagorean school when it
was discovered that some numbers that arose geometrically were not rational. The
easiest example comes from Pythagoras’s theorem: the hypotenuse of an isosceles
right angled triangle with side length 1 is $\sqrt{2} \notin \mathbb{Q}$. In most cases, the square root
of a positive integer is not rational. Indeed, the only time it is rational is when the
integer is the square of another integer.
$^1$ Strictly speaking, strictly positive numbers; the concepts of negative and zero numbers were
developed later in Indian and Arabian mathematics.
Proposition 1.5.1 Let $n \in \mathbb{N}$. Then $\sqrt{n} \in \mathbb{Q}$ iff $\sqrt{n} \in \mathbb{N}$. That is, the set of natural
numbers with rational square root is precisely $\{1^2 = 1,\ 2^2 = 4,\ 3^2 = 9, \dots\}$.
Proof We prove a special case of this result and leave the general case (and
extensions) to the exercises. We show that if $p > 1$ is prime then $\sqrt{p} \notin \mathbb{Q}$. Our
proof goes by contradiction. Suppose that $\sqrt{p} \in \mathbb{Q}$, then we may write $\sqrt{p} = \frac{r}{s}$,
where $r, s \in \mathbb{N}$ and $(r,s) = 1$ (recall that $(r,s) = 1$ means no common factors;
the unique factorization of an integer into a product of primes that allows for this
representation is used again in the proof). Since we assume $\sqrt{p} = \frac{r}{s}$ we have, on
squaring and multiplying by $s^2$,
$$p s^2 = r^2.$$
It follows that $p$ is a factor of $r^2$ and so, since $p > 1$ is prime, $p$ must be a factor
of $r$ (use the prime factorization of $r$). Hence we may write $r = pR$, where $R \in \mathbb{N}$.
Substituting for $r$, we get $p s^2 = p^2 R^2$ and so, after cancelling $p$,
$$s^2 = p R^2.$$
Just as before, it follows that $p$ is a factor of $s$. But we have shown that $p$ is a factor
of both $r$ and $s$. This contradicts our assumption that $(r,s) = 1$. Hence $\sqrt{p}$ cannot
be rational. □
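Proposition 1.5.1 can be probed numerically. The sketch below is our own illustration, not the author's: the function names and the denominator bound max_den are assumptions made for the example. It searches by brute force for a fraction $r/s$ with $r^2 = n s^2$ and compares the outcome with a direct perfect-square test.

```python
from math import isqrt

def has_rational_sqrt(n, max_den=50):
    """Search for r/s with r*r == n*s*s, 1 <= s <= max_den (brute force)."""
    for s in range(1, max_den + 1):
        target = n * s * s
        r = isqrt(target)          # largest integer r with r*r <= target
        if r * r == target:
            return True
    return False

def is_perfect_square(n):
    r = isqrt(n)
    return r * r == n

# The two tests should agree for every n, as the proposition predicts.
agree = all(has_rational_sqrt(n) == is_perfect_square(n) for n in range(1, 200))
print(agree)
```

A bounded search can only fail to find a counterexample; the proof by contradiction is what rules out every denominator at once.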
Remark 1.5.2 As remarked above, the discovery that mathematics could not be
done within the (countable) framework of rational numbers was of profound
significance. It is no coincidence that numbers that are not rational are called
irrational or that there is the word play between surd (root of number) and absurd.
Irrational numbers cannot be expressed in finite terms—indeed, most irrational
numbers correspond (in a sense that can be made very precise) to an infinite
sequence of random numbers and so cannot be represented in any finite form.
Allowing irrational numbers means the acceptance that randomness can and does
play a pivotal role in mathematics—even in a precise and quantitative subject like
real analysis.
EXERCISES 1.5.3
(1) Show that if $p_1, p_2, \dots, p_n$ are distinct primes then $\sqrt{p_1 p_2 \cdots p_n}$ is irrational.
(2) Complete the proof of Proposition 1.5.1.
(3) Show that $\sqrt[n]{5}$ is irrational for all $n \in \mathbb{N}$, $n \ge 2$.
(4) Show that if $p$ is prime $\sqrt[n]{p}$ is irrational for all $n \in \mathbb{N}$, $n \ge 2$.
(5) Show that if $n, m \ge 2$ then $\sqrt[n]{m}$ is rational iff there exists an $\ell \in \mathbb{N}$ such that
$m = \ell^n$. (Hints: extend the method of (1) or show directly that if $\sqrt[n]{m} = p/q$,
then $p/q$ must be an integer; use prime factorization and raise to the $n$th
power.)
1.5.2 Construction of the Real Numbers
We present an approach to the construction of the real numbers using decimal
expansions and approximation. The methods we use originate from the work
of the sixteenth century Dutch mathematician Simon Stevin who developed the
foundations of decimal arithmetic and real numbers in a 35-page booklet De
Thiende (‘The art of tenths’) published in 1585 (a very readable historical survey on
the influence of Stevin’s work can be found in the article by Błaszczyk et al. [3, §2]).
Aside from familiarity, the main advantage of our approach is that it leads to an
elementary constructive proof of the existence of the least upper bound or supremum
(see Chap. 2) and that the methods we use lead naturally to the ‘subdivide and
conquer’ techniques we repeatedly use in Chap. 2. On the other hand there are
some technical difficulties to be overcome related to the non-uniqueness of the
decimal expansions of some rational numbers (irrational numbers always have a
unique decimal expansion). At the end of Chap. 2, we give a more abstract approach
in terms of equivalence relations and Cauchy sequences of rational numbers. Even
with the general approach there are still many details to be checked.
The reader should view the material in the remainder of the chapter as being
for exploration and discussion: starting with the relatively simple idea of a real
number as being a decimal expansion, how can one define familiar concepts like
order, absolute value, addition and subtraction? How do we represent the rational
numbers as a subset of the real numbers and what do we mean by the approximation
of a real number by a rational number or truncated decimal? As we shall see, these
questions lead naturally to the idea of ‘limit’.
1.5.3 Decimal Expansions and Rational Numbers
For us a decimal expansion will be a formal expression
$$x = \pm x_0.x_1 x_2 \cdots x_n \cdots,$$
where $x_0 \in \mathbb{Z}_+$ (not $\mathbb{Z}$) and $x_n \in \{0, \dots, 9\}$, $n \in \mathbb{N}$.
We regard expansions prefixed by a $+$ as positive, and those prefixed by a $-$ as negative.
If we drop the $\pm$ (we often do), we regard the expansion as unsigned
(could be either prefixed by $+$ or $-$) or positive (implicit $+$).
A decimal expansion is an infinite string of integers together with a sign. The
problem is to give the expansion a useful interpretation. Our first task will be to
show that certain types of decimal expansions naturally define rational numbers
and, conversely, that rational numbers have a special type of decimal expansion.
Our goal is to identify the set of decimal expansions with the set of real numbers.
Indeed, we will define the real numbers to be the set of all decimal expansions.
Let $N \in \mathbb{N}$. A decimal expansion $x = \pm x_0.x_1 x_2 \cdots$ is terminating of length $N$ if
$x_N \ne 0$ and $x_n = 0$, $n > N$. That is,
$$x = \pm x_0.x_1 \cdots x_N \overline{0},$$
where $\overline{0}$ is shorthand for 0 repeated infinitely often. A decimal expansion is
terminating if it is terminating of length $N$ for some $N \in \mathbb{N}$.
In future we regard the finite expansion $x_0.x_1 \cdots x_N$ as being identical to
$x_0.x_1 \cdots x_N \overline{0}$ and write $x_0.x_1 \cdots x_N = x_0.x_1 \cdots x_N \overline{0}$. Similarly for negative decimal
expansions. We also take $+0.\overline{0}$ as identical to $-0.\overline{0}$ and set $\pm 0.\overline{0} = 0$ (see
Example 1.5.8 below).
Definition 1.5.4 (Truncations) If $x = x_0.x_1 x_2 \cdots x_n \cdots$ is an (unsigned) decimal
expansion and $N \in \mathbb{N}$, we set
$$x^N = x_0.x_1 x_2 \cdots x_N,$$
and call $x^N$ the (decimal) truncation of $x$ to $N$-terms.
Remark 1.5.5 We always use a capital N superscript to label the truncation of x to
N-terms and generally reserve lower case subscripts to label general terms in the
decimal expansion.
It is easy to identify terminating decimals with rational numbers.
Lemma 1.5.6 If $x = \pm x_0.x_1 x_2 \cdots x_n \cdots$ is a decimal expansion, then the truncation
$x^N$ of $x$ to $N$-terms defines a unique rational number $x^N$ according to the rule
$$x^N = \pm\left(x_0 + \sum_{n=1}^{N} \frac{x_n}{10^n}\right).$$
Remark 1.5.7 If $x = -x_0.x_1 x_2 \cdots x_n \cdots$, then $x^N = -\left(x_0 + \sum_{n=1}^{N} \frac{x_n}{10^n}\right)$. If instead we
had defined decimal expansions to be of the form $x_0.x_1 x_2 \cdots$, where $x_0 \in \mathbb{Z}$ (rather
than $\mathbb{Z}_+$), then the truncation would fail badly for negative decimal expansions. For
example, the truncation $x^1$ of $x = -1.1$ would give the rational number
$x_0 + \frac{x_1}{10} = -1 + \frac{1}{10} = -9/10$.
Example 1.5.8 For all $N \in \mathbb{N}$, $(+0.\overline{0})^N = (-0.\overline{0})^N = 0$. This justifies the identification
of $+0.\overline{0}$ and $-0.\overline{0}$.
We need an elementary result on geometric series for our study of infinite decimal
expansions of rational numbers.
Lemma 1.5.9 Let $a, r \in \mathbb{Q}$ and suppose $|r| < 1$. Then the geometric series
$\sum_{n=0}^{\infty} a r^n$ converges to $a/(1-r) \in \mathbb{Q}$.
Proof We have $\sum_{n=0}^{m} a r^n = \frac{a(1 - r^{m+1})}{1 - r}$ and so
$$\left| \sum_{n=0}^{m} a r^n - a/(1-r) \right| = |a||r|^{m+1}/(1-r).$$
Letting $m \to \infty$, the result follows (see Exercise 1.5.17(1) for a proof, not
depending on properties of real numbers, that $\lim_{m \to \infty} r^m = 0$). □
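The error formula in the proof of Lemma 1.5.9 can be checked in exact rational arithmetic. The following Python sketch is our own illustration; the sample values $a = 3/10$, $r = 1/10$ and the name partial_sum are assumptions chosen for the example. It compares the gap between each partial sum and $a/(1-r)$ with the predicted value $|a||r|^{m+1}/(1-r)$.

```python
from fractions import Fraction

def partial_sum(a, r, m):
    """Exact value of a + a*r + ... + a*r**m."""
    return sum(a * r**n for n in range(m + 1))

a, r = Fraction(3, 10), Fraction(1, 10)
limit = a / (1 - r)                      # the claimed sum a/(1 - r)

# Distance from the m-th partial sum to the limit, for m = 0, ..., 5.
gaps = [abs(partial_sum(a, r, m) - limit) for m in range(6)]
# The value predicted by the identity in the proof.
predicted = [abs(a) * abs(r)**(m + 1) / (1 - r) for m in range(6)]
print(limit, gaps == predicted)
```

With these sample values the series is $0.3 + 0.03 + 0.003 + \cdots$, so the limit is the rational number $1/3$, and every quantity stays inside $\mathbb{Q}$ as the lemma requires.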
Remark 1.5.10 The formula for the sum of an infinite geometric series is well-known.
From our perspective, what is interesting is that provided the constant term
$a$ and multiplier $r$ are rational, the infinite sum is always rational. This rarely holds
for general infinite series. For example, $\sum_{n=1}^{\infty} 1/n^{2k} \notin \mathbb{Q}$, for all $k \in \mathbb{N}$ (see
Chap. 5). Without real numbers, the theory of infinite series is effectively restricted
to geometric series. In particular, we cannot yet give a meaning to $x_0 + \sum_{n=1}^{\infty} \frac{x_n}{10^n}$
for a general decimal expansion $x_0.x_1 \cdots$ unless we can show the series converges
to a rational number.
An unsigned decimal expansion is eventually periodic of period $p \ge 1$, if we can
find $k \in \mathbb{N}$ and $a_1, \dots, a_p \in \{0, \dots, 9\}$ such that
$$x = x_0.x_1 \cdots x_k\, a_1 a_2 \cdots a_p\, a_1 a_2 \cdots a_p \cdots$$
We usually write this in abbreviated form as $x = x_0.x_1 \cdots x_k \overline{a_1 a_2 \cdots a_p}$. We also
require that $p$ is minimal and that $a_1 \ne 0$ if $p = 1$.
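Lemma 1.5.11 If $x = x_0.x_1 \cdots x_k \overline{a_1 a_2 \cdots a_p}$ is eventually periodic, then
$$x_0 + \sum_{n=1}^{\infty} \frac{x_n}{10^n} \in \mathbb{Q}.$$
That is, the infinite series $\sum_{n=1}^{\infty} \frac{x_n}{10^n}$ converges to a rational number.
Proof For $m \ge 1$ we may write
$$x_0 + \sum_{n=1}^{k+pm} \frac{x_n}{10^n} = x_0 + \sum_{n=1}^{k} \frac{x_n}{10^n} + \frac{A}{10^k} \sum_{j=0}^{m-1} 10^{-pj}, \tag{1.1}$$
where $A = \sum_{i=1}^{p} \frac{a_i}{10^i}$. Clearly, $x_0 + \sum_{n=1}^{k} \frac{x_n}{10^n} \in \mathbb{Q}$. Since $A, 10^{-p} \in \mathbb{Q}$,
Lemma 1.5.9 implies that $\frac{A}{10^k} \sum_{j=0}^{\infty} 10^{-pj}$ converges and is rational. Letting $m \to \infty$
in (1.1), we see that $\sum_{n=1}^{\infty} \frac{x_n}{10^n}$ converges and
$$x_0 + \sum_{n=1}^{\infty} \frac{x_n}{10^n} = x_0 + \sum_{n=1}^{k} \frac{x_n}{10^n} + \frac{A}{10^k} \sum_{j=0}^{\infty} 10^{-pj}$$
is a rational number. □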
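The proof of Lemma 1.5.11 is effectively a recipe for computing the rational number. The Python sketch below is our own illustration; the function name and the representation of digits as lists are assumptions. It evaluates the right-hand side of (1.1) in the limit, using Lemma 1.5.9 with ratio $10^{-p}$ for the periodic tail.

```python
from fractions import Fraction

def periodic_to_fraction(x0, pre, block):
    """x0: integer part; pre: digits x_1..x_k; block: repeating digits a_1..a_p."""
    k, p = len(pre), len(block)
    # Finite part x_0 + sum_{n=1}^{k} x_n / 10^n.
    head = x0 + sum(Fraction(d, 10**n) for n, d in enumerate(pre, start=1))
    # A = sum_{i=1}^{p} a_i / 10^i, as in the proof.
    A = sum(Fraction(a, 10**i) for i, a in enumerate(block, start=1))
    # Lemma 1.5.9 with ratio r = 10**(-p): sum_{j>=0} r**j = 1/(1 - r).
    tail = A / 10**k / (1 - Fraction(1, 10**p))
    return head + tail

print(periodic_to_fraction(0, [3, 4], [9]))   # the expansion 0.34999...
```

For the expansion $0.34\overline{9}$ the function returns $7/20$, the same rational number as the terminating decimal $0.35$.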
Proposition 1.5.12 Suppose that $x = p/q \in \mathbb{Q}$ where $(p,q) = 1$ and $p, q > 0$.
There are two mutually exclusive possibilities.
(1) If the prime factorization of $q$ is $2^r 5^s$, $r + s \ge 0$, then the decimal expansion
of $x$ is not unique and can be written in precisely two ways: as a terminating
decimal $x = x_0.x_1 \cdots x_N$ or as an infinite decimal $x = x_0.x_1 \cdots (x_N - 1)\overline{9}$, where
$x_N \in \{1, \dots, 9\}$.
(2) If $q \ne 1$ and the prime factorization of $q$ contains primes other than 2 or 5,
then the decimal expansion of $x$ is unique and eventually periodic with period
at most $q$.
A similar result holds if $p < 0$. If $x = 0$, then $x$ has the unique decimal expansion
$0.\overline{0}$ (we regard $\pm 0.\overline{0}$ as being identified).
Proof We leave the proof to the exercises. □
Remark 1.5.13 Note that in (2) of Proposition 1.5.12, the decimal expansion of $x$
cannot be of period 1 with $a_1 \in \{0, 9\}$.
Examples 1.5.14
(1) If $p \in \mathbb{N}$, $q = 1$, then part (1) of the proposition applies. We may write
$p = (p-1).\overline{9}$.
(2) Computing, we find that
$$\frac{3}{34} = 0.08823529411764705882352941\cdots = 0.0\overline{8823529411764705}.$$
The decimal expansion is unique and eventually periodic of period $16 \le 34$. On
the other hand, $7/20 = 7/(2^2 \cdot 5) = 0.35 = 0.34\overline{9}$ and the decimal expansion is
not unique.
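The expansions in Examples 1.5.14 can be reproduced by long division, tracking remainders until one repeats. This is a minimal Python sketch of that standard procedure; the function name and the return format are our own choices, not taken from the text.

```python
def decimal_expansion(p, q):
    """Return (integer part, pre-period digits, repeating block) of p/q, p >= 0, q > 0."""
    x0, rem = divmod(p, q)
    digits, seen = [], {}
    while rem not in seen:
        seen[rem] = len(digits)       # position at which this remainder first occurs
        d, rem = divmod(10 * rem, q)  # next digit and next remainder
        digits.append(d)
    start = seen[rem]                 # the digits from here on repeat forever
    return x0, digits[:start], digits[start:]

x0, pre, block = decimal_expansion(3, 34)
print(x0, pre, block, len(block))
```

For $3/34$ the repeating block has length 16, matching the example. A terminating decimal such as $7/20$ is reported with repeating block [0], which corresponds to the expansion $0.35\overline{0}$; long division never produces the alternative form ending in recurring 9's. Since there are only $q$ possible remainders, the loop stops after at most $q$ steps, which is one way to see the bound on the period in Proposition 1.5.12(2).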
1.5.4 Decimal Expansions and Real Numbers
We define the real numbers $\mathbb{R}$ to be the set of all signed decimal expansions:
$$\mathbb{R} = \{\pm x_0.x_1 x_2 \cdots x_n \cdots \mid x_0 \in \mathbb{Z}_+,\ x_n \in \{0, 1, \dots, 9\},\ n \in \mathbb{N}\}.$$
Let $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x = +x_0.x_1 x_2 \cdots\}$ (the positive real numbers), and
$\mathbb{R}_- = \{x \in \mathbb{R} \mid x = -x_0.x_1 x_2 \cdots\}$ (the negative real numbers). So as to simplify notation, we
almost always drop the $+$-prefix from decimal expansions in $\mathbb{R}_+$ and write
$\mathbb{R}_+ = \{x \in \mathbb{R} \mid x = x_0.x_1 x_2 \cdots\}$. We also identify the zero expansions $\pm 0.\overline{0}$ and denote
either expansion by 0. With this convention, $\mathbb{R}_+ \cap \mathbb{R}_- = \{0\}$.
Our first step is to identify the set of rational numbers $\mathbb{Q}$ with a proper subset of
$\mathbb{R}$. The only difficulty here is that not all rational numbers have a unique decimal
expansion (Proposition 1.5.12). We deal with this the same way we dealt with
the two decimal representations $\pm 0.\overline{0}$ of zero. We regard a terminating decimal
expansion $\pm x_0.x_1 x_2 \cdots x_N$ (or $\pm x_0.x_1 x_2 \cdots x_N \overline{0}$), with $x_N \in \{1, \dots, 9\}$, as identified
with the decimal expansion $\pm x_0.x_1 x_2 \cdots (x_N - 1)\overline{9}$. If a decimal expansion $x$ does
not end with recurring 0’s or 9’s, then we regard $x$ as uniquely defined by its
decimal expansion. It follows from Proposition 1.5.12 that we may regard $\mathbb{Q}$ as
identified with a proper subset of $\mathbb{R}$. Specifically, we identify $p/q \in \mathbb{Q}$ with its
decimal expansion with the understanding that the non-zero decimal expansion
$\pm x_0.x_1 x_2 \cdots x_N$ is identified with $\pm x_0.x_1 x_2 \cdots (x_N - 1)\overline{9}$.
Remark 1.5.15 We can enforce uniqueness of decimal expansions if we insist either
that $\mathbb{R}$ contains no terminating decimal expansions (other than $0 = 0.\overline{0}$) or that $\mathbb{R}$
contains no decimal expansions ending in recurring nines. In practice, it is useful to
allow both types of expansion.
It follows from Lemmas 1.5.6, 1.5.11 that if $x \in \mathbb{R}$ is either eventually periodic
or terminating, then $x$ is rational. If $x \in \mathbb{Q}$ is eventually periodic and not of
the form $\pm x_0.x_1 \cdots x_N \overline{0}$, $x_N \ne 0$, then the decimal expansion of $x$ is unique. If
$x$ is neither eventually periodic nor terminating, we say $x$ is irrational. Irrational
numbers have unique decimal expansions. This is by definition: we only identify
decimal expansions corresponding to rational numbers of the form $p/(2^r 5^s)$, where
$r + s \ge 0$, $p \in \mathbb{Z}$.
Now that we have a minimal description of the real numbers it is easy to prove
Cantor’s result that R is uncountable.
Theorem 1.5.16 The set $\mathbb{R}$ is uncountable.
Proof We give the proof discovered by Cantor and based on his diagonal method.
If $\mathbb{R}$ is countable then certainly the half-open interval $[0,1) = \{x \in \mathbb{R}_+ \mid x_0 = 0,\ x \ne 0.\overline{9}\}$
is countable (every subset of a countable set is countable). It is therefore
enough to show that $[0,1)$ is uncountable. Suppose the contrary. Then we may write
$[0,1) = \{x_n \mid n \in \mathbb{N}\}$, where each $x_n$ has decimal expansion not ending in recurring
9’s for all $n \in \mathbb{N}$. In terms of decimal expansions, we have
$$x_n = 0.x_{n1} x_{n2} \cdots x_{nn} \cdots,$$
where the sequence $x_{n1}, x_{n2}, \dots$ does not end in recurring 9’s. We define
$z = 0.z_1 z_2 \cdots \in \mathbb{R}_+$ by
$$z_n = \begin{cases} 4, & \text{if } x_{nn} = 5, \\ 5, & \text{if } x_{nn} \ne 5. \end{cases}$$
Clearly $z \in [0,1)$ since the decimal expansion of $z$ cannot end in recurring 9’s. On
the other hand, $z \notin \{x_1, x_2, \dots\}$ since $z_n \ne x_{nn}$, all $n \ge 1$ (decimal expansions are
unique granted our condition on recurring 9’s). Contradiction. Hence $\mathbb{R}$ cannot be
countable. □
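The diagonal rule for $z$ is easy to apply to any finite portion of a purported list. The Python sketch below is our own illustration; the sample rows are made-up digit strings, not data from the text. It builds the first few digits of $z$ and confirms that $z$ differs from the $n$th listed expansion in the $n$th digit.

```python
def diagonal_digit(d):
    """The rule from the proof: 4 if the diagonal digit is 5, otherwise 5."""
    return 4 if d == 5 else 5

def diagonal_prefix(rows):
    """rows[n] holds at least n+1 digits of the (n+1)-th listed expansion."""
    return [diagonal_digit(row[n]) for n, row in enumerate(rows)]

# Hypothetical first four digits of four listed expansions.
rows = [
    [5, 0, 0, 0],
    [3, 3, 3, 3],
    [1, 4, 1, 5],
    [9, 9, 5, 5],
]
z = diagonal_prefix(rows)
print(z)
```

Because every digit of $z$ is 4 or 5, the expansion of $z$ cannot end in recurring 9's or 0's, which is why uniqueness of expansions applies to it.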
EXERCISES 1.5.17
(1) Find an elementary argument to prove that if $r$ is rational and $|r| < 1$ then
$\lim_{n \to \infty} r^n = 0$. (Hints and comments: It is enough to assume $r \in (0,1)$. We
may write $r = 1 - s$, where $1 > s > 0$. Observe that $(1 - s) \le 1/(1 + s)$. By the
binomial theorem $(1 + s)^n \ge 1 + ns > ns$, for all $n \in \mathbb{N}$. Hence $(1 - s)^n \le a n^{-1}$,
$s = 1/a$. Now argue using $\lim_{n \to \infty} 1/n = 0$. This argument is elementary and
does not rely on using properties of monotone decreasing sequences or the log
and exponential functions.)
(2) Show that if $x_N \in \{1, \dots, 9\}$, then the decimals $x = x_0.x_1 \cdots x_N \overline{0}$ and
$x' = x_0.x_1 \cdots (x_N - 1)\overline{9}$ define the same rational number.
(3) Prove Proposition 1.5.12.
1.6 The Structure of the Real Numbers
For the remainder of the chapter we look at the problem of extending order, absolute
value, addition and subtraction from the rationals to the real numbers. Order and
absolute value are easy and natural to define for real numbers and allow us to make
a start on approximating real numbers by rational numbers. Addition and subtraction
of infinite decimals is trickier as we have to work with rational approximations
and be careful about ‘bookkeeping’. However, it is easy to add and subtract two
decimals if one of the decimals is terminating. As a result, we can approximate a
real number by its decimal truncations: for all $x \in \mathbb{R}$, we have $|x - x^N| \le 10^{-N}$,
$N \in \mathbb{N}$. We conclude by defining multiplication and division of real numbers by
rational numbers (we defer general multiplication and division of real numbers to
Chap. 2).
We suggest a careful reading of the definitions and results on order, absolute
value and approximation of real numbers by rational numbers and then skim
through the elementary but longer arguments on addition and subtraction of infinite
decimals. The alternative more abstract approach we give at the end of Chap. 2
handles the arithmetic properties of real numbers straightforwardly but there are
still many details to be checked.
1.6.1 Order on R
Provided we require unique decimal expansions (we deny either recurring 0’s or
9’s), it is easy to define an order $<$ on $\mathbb{R}$ that extends the usual order on $\mathbb{Q}$. Suppose
we restrict to decimal expansions that do not end with recurring 9’s. If $x, y \in \mathbb{R}_+$,
we write $x < y$ (equivalently, $y > x$) if there exists an $N \ge 0$ such that $x_n = y_n$,
$n < N$, and $x_N < y_N$. Necessarily $x \ne y$ by uniqueness of decimal expansions! If
$x \in \mathbb{R}_-$, and $y \in \mathbb{R}_+$ and $x \ne y$ (so $x, y$ are not both equal to zero) we declare that
$x < y$, and if $x, y \in \mathbb{R}_-$ then $x < y$ iff $-y < -x$. Since we have unique decimal
expansions, this restricts to the usual order on $\mathbb{Q}$ (if we did not have this restriction,
then $x = 0.1$, $y = 0.0\overline{9}$ would cause a problem). We extend the notation in the usual
way to $\le$, $\ge$. With these conventions we have
$$\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}, \qquad \mathbb{R}_- = \{x \in \mathbb{R} \mid x \le 0\}.$$
Remark 1.6.1 If instead we had restricted to decimal expansions of non-zero
numbers that do not end with recurring 0’s, we would have ended with the same
order structure on $\mathbb{R}$. Thus $0.1 < 0.2$ (deny recurring 9’s) and $0.0\overline{9} < 0.1\overline{9}$ (deny
recurring 0’s). Note, however, that if we deny recurring 0’s, then we need a special
argument for $0 = 0.\overline{0}$. Hence our preference for denying recurring 9’s.
Example 1.6.2 If $x \in \mathbb{R}_+$, there exists an $N \in \mathbb{N}$ such that $x < N$. The proof is
immediate: if $x = x_0.x_1 \cdots$ and we deny recurring 9’s, define $N = (x_0 + 1).\overline{0}$. This
property is known as the Archimedean property of real numbers.
1.6.2 The Absolute Value
Definition 1.6.3 If $x \in \mathbb{R}$, the absolute value $|x|$ of $x$ is defined to be $x$, if $x \ge 0$,
and $-x$ if $x < 0$. That is $|x|$ is $x$ ‘unsigned’.
Remark 1.6.4 The absolute value restricted to $\mathbb{Q}$ gives the usual absolute value on
rationals (same definition).
Lemma 1.6.5 If $x \in \mathbb{R}$ and $x_0, \dots, x_N = 0$, $x_{N+1} \ne 0$, then $10^{-N-1} \le |x| \le 10^{-N}$.
Proof If we replace $x_n$ by 9 for $n > N$, we have
$$|x| = 0.0 \cdots 0 x_{N+1} \cdots \le 0.0 \cdots 0\overline{9} = 10^{-N-1} \sum_{m=0}^{\infty} \frac{9}{10^m} = 10^{-N}.$$
This shows $|x| \le 10^{-N}$. On the other hand if $x_{N+1} \ne 0$, we can replace $x_{N+1}$ by 1
and set $x_n = 0$, $n > N + 1$ to obtain
$$|x| = 0.0 \cdots 0 x_{N+1} \cdots \ge 0.0 \cdots 01\overline{0} = 10^{-N-1}.$$
Hence $10^{-N-1} \le |x|$. □
Remark 1.6.6 If $x$ is irrational, we always have strict inequality in Lemma 1.6.5:
$10^{-N-1} < |x| < 10^{-N}$.
Example 1.6.7 We claim that if $x \in \mathbb{R}_+$ is non-zero, then there exists a $z \in \mathbb{R}$
such that $0 < z < x$. Since $x \ne 0$, there exists a least $N \ge 0$ such that $x_N \ne 0$. By
Lemma 1.6.5, $x \ge 10^{-N-1}$. Since $10^{-N-1} > 10^{-N-2} > 0$, we may take $z = 10^{-N-2}$.
In this case we constructed a rational $z$. We can find an irrational $z$ by choosing any
non-rational decimal expansion, for example define $b = 1\,0\,1^2\,0\,1^3\,0\,1^4\,0 \cdots 0\,1^n\,0 \cdots$
(where $1^n$ is shorthand for $n$ repeated 1’s). If we define $z = 0.0^{N+1} b$, then $z$ is
positive, irrational and $z < 10^{-N-1} \le x$ by Lemma 1.6.5. Hence $0 < z < x$.
If we assume more structure on the reals (addition and division), it is easy to
deduce this result from the Archimedean property of $\mathbb{R}$ (Example 1.6.2). See also
Proposition 1.6.20.
Remark 1.6.8 So far we have made no use of addition and subtraction of real
numbers.
1.6.3 Addition and Subtraction: Terminating Decimals
In this section we review the addition and subtraction of terminating decimals.
Necessarily our definition should give the same result as addition or subtraction
of rational numbers according to the rule $\frac{p}{q} \pm \frac{r}{s} = \frac{ps \pm rq}{qs}$.
First, some notational conventions. If $n \in \mathbb{N}$, we usually denote the terminating
decimals $\pm n.\overline{0}$ by $\pm n$. Note that it follows from our conventions that, as real
numbers, we have $n = n.\overline{0} = (n-1).\overline{9}$ and $-n = -n.\overline{0} = -(n-1).\overline{9}$. Let
$\mathbb{R}_T \subset \mathbb{Q}$ denote the set of terminating decimals.
Addition of terminating decimals of the same sign follows the standard ‘add and
carry’ rules; we define $x + y = -((-x) + (-y))$ if $x, y \le 0$. Suppose $x > 0 > y$. Then
$x + y = x - (-y)$. If $-y < x$, we compute using standard subtract and carry (if
$-y > x$, write $x + y = -((-y) - x)$).
Example 1.6.9
$$2.35 + (-1.46) = 2.35 - 1.46 = 0.89,$$
where we carry 1. Note this is correct since $1.46 + 0.89 = 2.35$ (add and carry or
compute as a sum of rational numbers). On the other hand, $1.46 - 2.35$ computes to
$-1.11$ if we use subtract and carry and this is incorrect.
These rules give the correct definition of addition for terminating decimals: that
is, they give the same result we get using the standard rule for addition of rational
numbers.
We define subtraction using addition:
$$x - y \overset{\text{def}}{=} x + (-y), \qquad x, y \in \mathbb{R}_T.$$
Remark 1.6.10 Let $x = 0.x_1 \cdots x_{N-1} x_N$ and suppose $x_N \ne 0$. Define $\bar{x} \in \mathbb{R}_T$
by $\bar{x} = 0.\bar{x}_1 \cdots \bar{x}_{N-1}(\bar{x}_N + 1)$, where $\bar{a} = 9 - a$, $a \in \{0, 1, \dots, 9\}$. Observe
that $\bar{x}_N + 1 \in \{1, \dots, 9\}$, since $x_N \ne 0$, and $x + \bar{x} = 1$ (add and carry). The
difficulties of adding finite decimals of opposite sign occur because if $n \in \mathbb{Z}_+$, and
$x = -0.x_1 \cdots x_{N-1} x_N$, then $n + x = (n-1).\bar{x}_1 \cdots \bar{x}_{N-1}(\bar{x}_N + 1)$ and this is only
equal to $(n-1).x_1 \cdots x_{N-1} x_N$ if $N = 1$ and $x_1 = 5$. Of course, there is no problem
for $n + x$ if $n \le 0$.
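The complement construction of Remark 1.6.10 can be checked with exact rational arithmetic. The following Python sketch is our own illustration; the function names and the list-of-digits representation are assumptions. It forms the digits $9 - x_n$, adds 1 to the last one, and verifies that the two terminating decimals sum to 1.

```python
from fractions import Fraction

def to_fraction(digits):
    """Exact value of the terminating decimal 0.d_1 d_2 ... d_N."""
    return sum(Fraction(d, 10**n) for n, d in enumerate(digits, start=1))

def complement(digits):
    """Digits of the complement of x = 0.x_1...x_N, assuming x_N != 0."""
    if not digits or digits[-1] == 0:
        raise ValueError("last digit must be non-zero")
    bar = [9 - d for d in digits]
    bar[-1] += 1            # last digit is (9 - x_N) + 1, which lies in 1..9
    return bar

x = [4, 6]                  # the decimal 0.46
print(complement(x), to_fraction(x) + to_fraction(complement(x)))
```

For $x = 0.46$ the complement is $0.54$, which is the fractional part that appears when computing $1 - 0.46$ (and, with the integer part restored, $2.35 - 1.46 = 0.89$ in Example 1.6.9).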
1.6.4 A Special Case of Addition for Infinite Decimals
When we come to the problem of addition and subtraction of infinite decimals, we
cannot avoid looking at limits, in this case of sequences of rational approximations.
Roughly speaking, if we are given infinite decimals $x = x_0.x_1 x_2 \cdots$, $y = y_0.y_1 y_2 \cdots$,
we want to define $x + y$ to be the limit as $N \to \infty$ of $x^N + y^N$, where
$x^N = x_0.x_1 \cdots x_N$, $y^N = y_0.y_1 \cdots y_N$ are the truncations of $x, y$ to $N$-terms. The difficulty is
that generally we do not know what the limit is (as a decimal expansion) and ‘add
and carry’ does not work for infinite decimals. However, when we know what the
limit is, it is usually easy to prove the convergence of $x^N + y^N$ to the limit as $N \to \infty$.
If this sounds tautological it is: the definition of convergence of a sequence assumes
we know the limit. Later, in Chap. 2, we introduce the idea of a Cauchy sequence,
which gives an intrinsic definition of convergence without having to know the limit.
In this section we look at the problem of solving the equation $x + y = n$, where
$x$ is a given infinite decimal and $n \in \mathbb{Z}$. We show that there is a unique infinite
decimal $y$ satisfying the equation and that the decimal expansion of $y$ can be given
explicitly in terms of that of $x$. Once we have this result, it is easy to extend our
definition of addition and subtraction to sums and differences when one (not both)
of the terms may be an infinite decimal. Although this seems a small step, it allows
us to view the truncations $x^N$, $N \in \mathbb{N}$, as (rational) approximations to an infinite
decimal. Specifically, we can easily show that
$$|x - x^N| \le 10^{-N}, \qquad N \in \mathbb{N}.$$
It is natural to think of a real number $x$ as the set $\{x^N \mid N \in \mathbb{N}\}$ of all its truncations:
this is the way we do computations with irrational numbers in practice. That is,
rather than attempting to define $x$ by evaluating the infinite sum $\sum_{n=0}^{\infty} \frac{x_n}{10^n}$ (we cannot
at this point unless $x$ is rational), we think of $x$ as defined by its set of truncations
and then do ‘approximate arithmetic’ (which will be exact in the limit). Observe that
we cannot write down an irrational number in exact form, since that requires an infinite
string of integers; instead we write down a ‘good enough’ rational approximation
to the number. In order to make this process work and keep control of the errors, we
need to introduce ideas based on limits.
Let $x = 0.x_1 x_2 \cdots$ be an infinite (not terminating) decimal and define
$\bar{x} = 0.\bar{x}_1 \bar{x}_2 \cdots$ (recall $\bar{a} = 9 - a$ for $a \in \{0, \dots, 9\}$, see Remark 1.6.10). If we define
$z = x + \bar{x}$ by $z_n = x_n + \bar{x}_n$, $n \ge 0$, then $z = 0.\overline{9} = 1$. Unlike what happens for
terminating decimals, we will not be able to avoid recurring 9’s when we consider
addition of infinite decimals.
In terms of the truncations $z^N, x^N, \bar{x}^N$, we have $z^N = x^N + \bar{x}^N = (0.\overline{9})^N$ and so
$1 - (x^N + \bar{x}^N) = 10^{-N}$. That is, $\lim_{N \to \infty} (x^N + \bar{x}^N) = 1$ (this is a statement about
rational numbers). In other words, if we define the sum $x + \bar{x}$ by addition of like
terms, then the resulting decimal expansion is the limit of the sum of the truncations.
This is the key property we need when we come to the sum of general decimal
expansions. Observe there is no problem here with the addition as there are no terms
to be carried.
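Now suppose $x = x_0.x_1 x_2 \cdots$ is an infinite decimal and $n \in \mathbb{Z}$. The general
solution of the equation $x + y = n$ is given as follows
$$y = \begin{cases} (n - x_0 - 1).\bar{x}_1 \bar{x}_2 \cdots, & \text{if } x_0 < n, \\ -(x_0 - n).x_1 x_2 \cdots, & \text{if } x_0 \ge n. \end{cases} \tag{1.2}$$
In the first case the integer part is $n - x_0 - 1$ because $n - x = (n - x_0 - 1) + (1 - 0.x_1 x_2 \cdots)$
and $1 - 0.x_1 x_2 \cdots = 0.\bar{x}_1 \bar{x}_2 \cdots$; for example, $x = 0.\overline{3}$ and $n = 1$ give $y = 0.\overline{6}$.
We may now easily extend our rules to define $x \pm y$ when one of $x, y$ is an
infinite decimal. Addition when $x, y$ are of the same sign follows the pattern given
for terminating decimals. For example, if $y = y_0.y_1 \cdots y_M$, and $x, y \ge 0$, then $x + y$
is defined by adding $x^M$ and $y^M$ and appending $x_{M+1} x_{M+2} \cdots$. If $x \ge y$ and $x$ is
not terminating, then $x - y$ is defined exactly as we did when both $x$ and $y$ are
terminating. If $x < y$, then we may write $x = (x + n) - n$, where $n \in \mathbb{Z}$ is chosen
so that $x + n \ge y$. The decimal $z = (x + n) - y$ is well defined ($x + n > y$) and we
have $x - y = z - n = -(n - z)$, which is given by (1.2).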
1.6.5 Decimal Approximation of Real Numbers
The results in the previous section allow us to estimate real numbers in terms of
rational approximations by finite decimals.
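Lemma 1.6.11 Let $x \in \mathbb{R}$ have decimal expansion $x = x_0.x_1 \cdots$. For all $N \in \mathbb{N}$
we have
$$0 \le |x - x^N| \le 1/10^N.$$
In particular, $\lim_{N \to \infty} x^N = x$.
Proof Without loss of generality suppose $x \ge 0$. Then
$$0 \le x - x^N = 0.0^N x_{N+1} \cdots \le 0.0^N \overline{9} = 10^{-N}.$$
The result follows since $\lim_{N \to \infty} x^N = x$ iff $\lim_{N \to \infty} |x - x^N| = 0$. □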
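For a rational $x$ the bound of Lemma 1.6.11 can be verified exactly. The Python sketch below is our own illustration; the function name and the sample value $x = 3/34$ are assumptions made for the example. It computes the truncations of the expansion that does not end in recurring 9's, using integer floor division.

```python
from fractions import Fraction

def truncation(x, N):
    """x^N for a non-negative rational x: keep N digits after the decimal point."""
    return Fraction((x.numerator * 10**N) // x.denominator, 10**N)

x = Fraction(3, 34)
# Errors x - x^N for N = 1, ..., 29.
errors = [x - truncation(x, N) for N in range(1, 30)]
bound_holds = all(0 <= e <= Fraction(1, 10**N)
                  for N, e in zip(range(1, 30), errors))
print(truncation(x, 4), bound_holds)
```

Here `truncation(x, 4)` is $0.0882$, stored in lowest terms as $441/5000$. The check covers only finitely many $N$ and one rational $x$; the lemma itself holds for every real $x$.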
Remarks 1.6.12
(1) The proof of Lemma 1.6.11 uses only the order structure on $\mathbb{R}$ (elementary)
together with arguments involving rational numbers and geometric series (cf.
Lemma 1.5.9).
(2) Previously, we have only discussed the convergence of $\sum_{n=0}^{\infty} \frac{x_n}{10^n}$ for eventually
periodic decimal expansions. Since $\lim_{N \to \infty} x^N = x$, and $x^N = \sum_{n=0}^{N} \frac{x_n}{10^n}$, we
now have the result that
$$\sum_{n=0}^{\infty} \frac{x_n}{10^n} \overset{\text{def}}{=} \lim_{N \to \infty} \sum_{n=0}^{N} \frac{x_n}{10^n} = x.$$
That is, the sequence of partial sums converges to $x$.
Example 1.6.13 Let $x$ be an infinite decimal and $y \in \mathbb{R}_T$. Set $z = x + y$. Then
$\lim_{N \to \infty} (z - (x^N + y^N)) = 0$. If $x, y$ are of the same sign, and $y$ is of length $M$, then
$x^N + y^M = z^N$ for $N \ge M$ and the result follows from Lemma 1.6.11. If $x \ge 0 \ge y$
and $-y > x$, choose $n \in \mathbb{Z}$ so that $n + x > -y$. Set $u = (n + x) + y$. We have
$n + x^N + y^N = u^N \to u$ and $u^N - n \to z$.
Remark 1.6.14 If $x \in \mathbb{R}$ is eventually periodic and $y$ is a terminating decimal, then
$x \pm y$ is eventually periodic and so rational. We leave it to the exercises for the reader
to check that our definition of addition and subtraction gives the same result as when
we add/subtract $p/q$, $r/s$ using $p/q \pm r/s = (ps \pm qr)/qs$.
1.6.6 Addition and Subtraction of Real Numbers
It remains to define addition and subtraction of infinite decimals. Suppose $x, y \in \mathbb{R}$
are infinite decimals. We define $x + y$ to be the limit as $N \to \infty$ of $x^N + y^N$. In
order to prove the limit exists, we have to prove that the initial terms of the decimal
expansion of $x^N + y^N$ ‘stabilize’ as $N \to \infty$. To capture this property precisely, we
define a new limit operation.
Definition 1.6.15 Let $x_0.x_1 x_2 \cdots$ be the decimal expansion of $x \in \mathbb{R}$. Suppose that
$(x^n)_{n \ge 1}$ is a sequence of decimal expansions. We write $\lim_{n \to \infty} x^n = x$ if, for every
$N \in \mathbb{N}$, we can find $M \in \mathbb{N}$ such that
$$x^n_i = x_i, \quad \text{for all } i \le N \text{ and } n \ge M.$$
In words, we say that $(x^n)$ converges to $x$ iff for any $N$, we can find $M$ such that the
truncation of $x^n$ to $N$ terms is equal to the truncation of $x$ to $N$ terms for all $n \ge M$.
Note the purely symbolic character of this definition. There is no use of subtraction
or absolute value.
Examples 1.6.16
(1) Suppose that $x = \pm x_0.x_1 \cdots \in \mathbb{R}$ and let $x^N = \pm x_0.x_1 \cdots x_N = \pm x_0.x_1 \cdots x_N \overline{0}$
denote the truncation of $x$ to $N$ terms. Then $\lim_{N \to \infty} x^N = x$. In this case, given
$N \in \mathbb{N}$, we may take $M = N$ in Definition 1.6.15.
(2) Let $x = 1.\overline{0}$, $y = 0.\overline{9}$. Let $(x^N)$ and $(y^N)$ be the sequences of truncations defined
by taking $x^N = (1.\overline{0})^N$, $y^N = (0.\overline{9})^N$. Observe that
$\lim_{N \to \infty} x^N = 1.\overline{0} \ne 0.\overline{9} = \lim_{N \to \infty} y^N$, even though the two limits define the same real number. This is
only an issue for rational numbers which have a finite decimal expansion.
(3) Suppose $x = 0.1234516\cdots$, $y = 0.3765484\cdots$. For $N \in \mathbb{N}$, let $z^N = x^N + y^N$.
We have
$$z^6 = 0.499999 = 0.49^5, \qquad z^7 = 0.5000000 = 0.50^6.$$
If $N > 7$, it is easy to see that $z^N_n = z^7_n$, if $n \le 6$, and that $z^N_7 \in \{0, 1\}$, whatever
the higher-order terms are in the decimal expansions of $x$ and $y$. For example,
consider the ‘worst’ case $x_n = y_n = 9$, for all $n > 7$. Computing we find that
$z^N = 0.50^5 1 9^{N-8} 8$, for all $N \ge 8$. We see that the initial term of $z^N$ is $0.500000$,
for $N > 7$, and that $\lim_{N \to \infty} z^N$ exists and is equal to $0.50^5 1\overline{9}$.
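The stabilization in Example 1.6.16(3) can be observed directly. The Python sketch below is our own illustration; the function names are assumptions, and the digit lists encode the 'worst' case in which every digit of $x$ and $y$ beyond the seventh equals 9. It adds the truncations exactly and lists the digits of $z^N$.

```python
from fractions import Fraction

def truncation_value(digits, N):
    """Exact value of 0.d_1...d_N."""
    return sum(Fraction(d, 10**n) for n, d in enumerate(digits[:N], start=1))

def sum_digits(x, y, N):
    """Digits z^N_1..z^N_N of z^N = x^N + y^N (assumes the sum is below 1)."""
    z = truncation_value(x, N) + truncation_value(y, N)
    scaled = z * 10**N                 # an integer because z has N decimal places
    assert scaled.denominator == 1 and scaled < 10**N
    return [int(c) for c in str(scaled.numerator).zfill(N)]

x = [1, 2, 3, 4, 5, 1, 6] + [9] * 13   # worst case: all later digits are 9
y = [3, 7, 6, 5, 4, 8, 4] + [9] * 13

for N in (6, 7, 8, 10):
    print(N, sum_digits(x, y, N))
```

The first six digits change between $N = 6$ and $N = 7$ because of a carry, and are then fixed at 5, 0, 0, 0, 0, 0 for every larger $N$ tried, in line with the formula for $z^N$ in the example.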
Lemma 1.6.17 (Stability Lemma) Let $x, y \in \mathbb{R}_+$ and set $z^N = x^N + y^N$, $N \in \mathbb{N}$.
Given $N_0 > 1$, suppose there exists an $m < N_0$ such that $z^{N_0}_m \le 8$. Then for $N > N_0$
we have
$$z^N_n = z^{N_0}_n, \quad \text{for all } n < m.$$
Proof Let N > N0 > m 1 and suppose that zNm0 8. Assume first that xn D
yn D 9, N n > N0 . Adding, we find that zNm0 D 9 and zNn D zNn 0 for all n < m. If
we vary xn ; yn , n > N0 , we only make zN smaller. Since zN zN0 , the terms zNn are
unchanged for all n < m.
t
u
Proposition 1.6.18 Let $x, y \ge 0$. Then $\lim_{N \to \infty} (x^N + y^N)$ exists and defines a
unique point in $\mathbb{R}_+$.
Proof The result is trivial if either $x$ or $y$ is zero so suppose $x, y > 0$. If either $x$ or $y$
is a terminating decimal, the result is easy (see Sect. 1.6.4). Hence we may assume
that $x, y$ have unique infinite decimal expansions, $x = x_0.x_1 x_2 \cdots$, $y = y_0.y_1 y_2 \cdots$.
Set $z^N = x^N + y^N$. Then $z^N = z^N_0.z^N_1 \cdots z^N_N$ where the integers $z^N_0, \dots, z^N_N$ may depend
on $N$.
Given $N \ge 2$, we let $m = m(N)$ be the largest value of $m \in \mathbb{N}$ such that $m < N$
and $z^N_m \le 8$. If $z^N_m = 9$ for $1 \le m < N$, we set $m(N) = 0$. It follows from
Lemma 1.6.17 that
$$z^P_n = z^Q_n, \qquad P, Q \ge N,\ n < m(N).$$
Set $s_N = m(N) - 1$. Our construction defines an increasing sequence $(s_N) \subset \mathbb{Z}_+$.
There are two possibilities: either $s_N \to \infty$ as $N \to \infty$ or there exists a $P \in \mathbb{Z}$
such that $s_N = P$ for all sufficiently large $N$. If the second condition holds then
$\lim_{N \to \infty} z^N = \lim_{N \to \infty} (x^N + y^N)$ exists and the corresponding decimal expansion
ends with recurring 9’s. If the first condition holds, then by the definition of lim,
$\lim_{N \to \infty} z^N = \lim_{N \to \infty} (x^N + y^N)$ exists (given $N$, take $M = s_N$ in Definition 1.6.15). □
Granted Proposition 1.6.18 it is now easy to define addition and subtraction of
general real numbers.
If $x, y \ge 0$, we define $x + y = \lim_{N\to\infty} (x^N + y^N)$. If both $x$ and $y$ are negative, define $x + y = -((-x) + (-y))$. If $x \ge 0 \ge y$, choose $n \in \mathbb{N}$ so that $n + y \ge 0$ and then define $x + y = (x + (n + y)) - n$ (using (1.2) as needed). For subtraction, define $x - y = x + (-y)$.
It follows immediately from our constructions and Lemma 1.6.11 that if $x, y \in \mathbb{R}$ and we set $z = x \pm y$ then
$$\lim_{N\to\infty} |z - (x^N \pm y^N)| = 0. \tag{1.3}$$
That is, $\lim_{N\to\infty} (x^N \pm y^N) = x \pm y$.
Example 1.6.19 Using (1.3), the usual rules for absolute value, such as the triangle inequality, follow immediately from the corresponding rules for rational numbers: if $x, y \in \mathbb{R}$ then
$$|x + y| = \lim_{N\to\infty} |x^N + y^N| \le \lim_{N\to\infty} (|x^N| + |y^N|) = |x| + |y|.$$
Proposition 1.6.20 Let $x, y \in \mathbb{R}$ and suppose $x < y$. Then there exists a $z \in \mathbb{R}$ such that $x < z < y$. We may require $z$ to be either rational or irrational.
Proof Observe that $x < y$ iff $y - x > 0$. By Example 1.6.7, there exists an $a \in \mathbb{R}$ such that $0 < a < y - x$. Hence $x < x + a < y$. If we want $z$ to be irrational, observe that either $x + a$ is irrational (and we are done) or $x + a$ is rational. In the latter case we can choose an irrational $b$ satisfying $0 < b < y - x - a$ (for example, $b = 0.0^M 1\,0^1 1\,0^2 1\,0^3 1 \cdots 0^n 1 \cdots$ for large enough $M$; see Example 1.6.7) and then define $z = b + x + a < y$. If we require $z$ to be rational, take a high enough order truncation of $z$. Finally, note that we cannot (yet!) take $z = (x + y)/2$ as we have not defined multiplication and division of real numbers. □
Remarks 1.6.21
(1) All the standard rules of addition and subtraction (commutativity, associativity,
etc.) are easily seen to hold for real numbers. Indeed they are all inherited
through the limit operation from the corresponding properties for rational
numbers.
(2) Let $x \in \mathbb{R}$. We have shown that we can view $x$ as the limit of the sequence $(x^N)$ of truncations of $x$: $x = \lim_{N\to\infty} x^N$. In this sense, we can think of a real number as defined by its set of truncations, all of which are rational. While the rational numbers $\mathbb{Q}$ are naturally defined (in terms of the integers), the restriction to (decimal) truncations depends on the choice of base 10. However, it is not hard to show that changing base does not change the set of real numbers. The problem is avoided by defining a real number in terms of all of its rational approximations; see section "Appendix: Construction of R Revisited" at the end of Chap. 2.
(3) It is worth emphasizing again the conceptual leap that is required in going from rational to irrational numbers. Rational numbers are given finitely; the specification of an irrational number depends on a limiting process and irrational numbers cannot be described finitely. About the best one can do is define an irrational number by a recursive process. For example, if we take $x_0 = 1$ and define $x_{n+1} = \frac{1}{2}\left(x_n + \frac{2}{x_n}\right)$, $n \ge 0$, then $(x_n) \subset \mathbb{Q}$ and $\lim_{n\to\infty} x_n = \sqrt{2}$. However, it can be shown that most irrational numbers cannot be specified in this way: there are uncountably many real numbers, but only countably many recursion formulas with rational coefficients!
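The recursion in Remark (3) can be run with exact rational arithmetic, which makes it visible that every iterate lies in $\mathbb{Q}$ while the squares approach 2. This is a small illustrative sketch; the function name `sqrt2_iterates` is an assumption introduced here, not notation from the text.

```python
from fractions import Fraction

def sqrt2_iterates(count):
    """Return x_0, ..., x_count for x_{n+1} = (x_n + 2/x_n)/2 with x_0 = 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(count):
        x = (x + 2 / x) / 2      # stays an exact Fraction at every step
        terms.append(x)
    return terms

terms = sqrt2_iterates(5)
for n, x in enumerate(terms):
    # x is rational; x*x - 2 measures how far x is from a square root of 2.
    print(n, x, float(x * x - 2))
```

The first few iterates are $1, 3/2, 17/12, 577/408$, and the error $x_n^2 - 2$ shrinks very quickly, although no iterate can ever equal $\sqrt{2}$.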
1.6.7 Multiplication of Real Numbers by Rationals
Let $x, y \in \mathbb{R}$. One way of defining the product $xy$ is to show that $\lim_{N\to\infty} x^N y^N$ exists. However, even more than was the case in the proof of Proposition 1.6.18, bookkeeping is a problem. We give a much simpler and more elegant proof of convergence in Chap. 2 based on results on bounded monotone sequences. On the other hand, if one of $x, y$ is rational, it is easy to define the product by making use of our results on addition of real numbers.
Suppose then that $x \in \mathbb{R}$, $p/q \in \mathbb{Q}$. Without loss of generality, assume that $x, p, q > 0$. We outline the main steps in defining the product $\frac{p}{q}x$, leaving the details to the exercises.
(1) Show $px \in \mathbb{R}$. (Since $px$ should be the sum of $p$ copies of $x$, we define $px$ using addition of real numbers.)
(2) If $q \in \mathbb{N}$ then $\lim_{N\to\infty} \frac{x^N}{q}$ exists. There is no problem with division of the decimal expansion $x$ by $q$: we start with division by $q$ of the initial term $x_0$ of the decimal expansion and carry to the right. In particular,
$$\left(\frac{x^N}{q}\right)_n = \left(\frac{x^M}{q}\right)_n \quad \text{for all } n \le \min\{M, N\}.$$
Hence $\lim_{N\to\infty} \frac{x^N}{q}$ exists. Applying this with $px$ in place of $x$, we define the resulting limit to be $\frac{p}{q}x$.
Remark 1.6.22 The method gives division of real numbers by rationals: $x/\left(\frac{p}{q}\right) \overset{\text{def}}{=} \frac{q}{p}x$, provided $p \ne 0$.
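The "divide the initial term and carry to the right" step in (2) is ordinary long division on the digit sequence. The sketch below is only an illustration under stated assumptions: the function name `divide_digits` and the representation of $x^N$ as an integer part plus a list of digits are choices made here, and $x \ge 0$, $q \in \mathbb{N}$ are assumed.

```python
def divide_digits(x0, digits, q):
    """Long division of x0.d1 d2 ... dN by q.

    Returns (integer part, first N quotient digits, final remainder).
    """
    integer_part, remainder = divmod(x0, q)
    quotient_digits = []
    for d in digits:
        # Carry the remainder to the right and bring down the next digit.
        digit, remainder = divmod(10 * remainder + d, q)
        quotient_digits.append(digit)
    return integer_part, quotient_digits, remainder

# Truncations of 1.4142135... divided by 3: a longer truncation only
# appends digits; it never changes the digits already produced.
print(divide_digits(1, [4, 1, 4], 3))
print(divide_digits(1, [4, 1, 4, 2, 1, 3, 5], 3))
```

Because each output digit depends only on the digits of the dividend read so far, the quotient digits of $x^N/q$ and $x^M/q$ agree up to place $\min\{M, N\}$, which is why the limit exists.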
EXERCISES 1.6.23
(1) Let $x, y$ be two decimal expansions. Show that $x - y = 0$ (as real numbers) iff either $x = y$ (as decimal expansions) or $x$ and $y$ represent the same rational number which has a terminating decimal expansion. A consequence is that if $x, y$ are real numbers and $x$ is irrational then $x = y$ (as decimal expansions) iff $x - y = 0$.
(2) Show that $\sum_{n=0}^{\infty} 10^{-n^2}$ is irrational. More generally, show that if $p(x) = x^m + a_1x^{m-1} + \cdots + a_m$ is a polynomial of degree $m \ge 1$ with integer coefficients, then $\sum_{n=0}^{\infty} 10^{-p(n)}$ is rational if and only if $m = 1$.
(3) Verify that we get the same order structure on R if (a) we deny decimal
expansions ending in recurring zeros, or (b) we deny decimal expansions
ending in recurring nines.
(4) Let $A > 0$. Show that for $n \ge 0$, it is possible to choose a unique finite decimal expansion $X_n = x_0.x_1 \ldots x_n$ such that
(a) $X_n^2 \le A$.
(b) $(X_n + 10^{-n})^2 > A$.
Show also that the terms $x_0, x_1, \ldots, x_n$ do not depend on $n$ (that is, if $m > n$, then the first $n + 1$ terms of $X_n$, $X_m$ are the same). Deduce that $\lim_{n\to\infty} X_n = X$ exists. Show that $X = \sqrt{A}$. (For the last part, define $Y_n = X_n + 10^{-n}$ and observe that $X_n^2 \le A < Y_n^2$. Now let $n \to \infty$.)
(5) Extend the method of the previous exercise to show that if $A > 0$ and $p \ge 2$, then the positive $p$th root of $A$ exists. (We give alternative constructions for rapidly computing roots in terms of rational sequences in Chap. 2.)
(6) Fill in the details for the construction of $\frac{p}{q}x$ in Sect. 1.6.7. (Start by giving an inductive definition of $px$, $x \in \mathbb{R}_+$, $p \ge 2$: define $px = (p - 1)x + x$, $p \ge 2$. Verify that $rx + sx = px$ if $r, s \in \mathbb{N}$, $r + s = p$.)
(7) Show that if $r_1, r_2 \in \mathbb{Q}$, $x \in \mathbb{R}$ then $r_1(r_2x) = (r_1r_2)x$ (associative law of multiplication). Deduce that if $r \in \mathbb{Q}$, $r \ne 0$, and $x \in \mathbb{R}$, then $rx \in \mathbb{Q}$ iff $x \in \mathbb{Q}$.
(8) Let $a < b$, $a, b \in \mathbb{R}$. Prove that $(a, b) \cap \mathbb{Q}$ is countably infinite. (Show that all but finitely many members of $\{a + 10^{-n} \mid n \in \mathbb{N}\}$ lie in $(a, b)$.)
(9) Show that
(a) $(0, 1) \sim \mathbb{R}$. (Look for a bijection of the form $g(x) = A/x + B/(x - 1)$.)
(b) $(0, 1) \cup C \sim (0, 1)$ for any countable set $C$ disjoint from $(0, 1)$. (Hint: Choose a countable infinite subset $K$ of $(0, 1)$ and observe that $K \cup C \sim K$.) Deduce that $(0, 1) \sim [0, 1]$ (this can never be realized by a continuous map).
(c) $P(\mathbb{N}) \sim (0, 1)$. (Hint: Let $B$ be the set of all binary expansions $0.b_1b_2\cdots$, $b_i \in \{0, 1\}$. Show that every $X \in P(\mathbb{N})$ determines a unique $b \in B$ by $b_n = 1$ iff $n \in X$ and hence show $P(\mathbb{N}) \sim B$. Using (2) show that $B \sim (0, 1)$; you will have to address the non-uniqueness of binary expansions.)
(d) $P(\mathbb{N}) \sim \mathbb{R}$.
(10) Let $F$ be the set of all functions $f : [0, 1] \to \mathbb{R}$. Show that $F \not\sim [0, 1]$. (Hint: use the diagonal method. If we restrict to continuous functions, then we do have equivalence: a continuous function on $[0, 1]$ is uniquely determined by its values at the rational points of $[0, 1]$.)
(11) Let $X$ be a non-empty set and $F$ denote the set of all functions $f : X \to \mathbb{R}$. Show that $F \not\sim X$.
(12) Let $X$, $Y$ be non-empty sets and suppose that $Y$ contains at least two points. Let $F$ denote the set of all functions $f : X \to Y$. Show that $F \not\sim X$. What happens if $Y$ consists of a single point?
(13) Using decimal expansions, find an onto map $F : [0, 1] \to [0, 1]^2$. Is the map $F$ you have constructed 1:1? If not (most likely), show that it is possible to define a bijection $G : [0, 1] \to [0, 1]^2$. (Hints and comments for the second part. The new map $G$ is closely related to $F$. The problem lies with non-uniqueness of decimal expansions. Let $D$ denote the set of all decimal expansions $0.x_1x_2\cdots$. Show there is a bijection between $D$ and $D^2$ (easy!). Then verify $D \sim [0, 1]$, $D^2 \sim [0, 1]^2$; this will require handling countable sets of 'bad' points (use the result of Q9). The maps $F, G$ will not be continuous. Although it is possible to construct continuous maps of $[0, 1]$ onto $[0, 1]^2$ (Peano curves), there are no continuous bijections between $[0, 1]$ and $[0, 1]^2$.)
Chapter 2
Basic Properties of Real Numbers, Sequences
and Continuous Functions
2.1 Introduction
In this chapter we prove a number of foundational results about real numbers,
sequences and continuous functions. Sequences will play a major role throughout.
We start by proving key results on the convergence of bounded monotone sequences
using methods that develop naturally from our real number constructions in Chap. 1.
As an application, we give the general definitions for multiplication and division of
real numbers. We then prove the Bolzano–Weierstrass theorem and its important
corollary that every bounded sequence has at least one convergent subsequence. We
use both results repeatedly in the sequel. Turning next to functions, we verify the
equivalence of continuity and sequential continuity, and then use relatively simple
sequence-based methods to prove standard results about continuous functions on a
closed and bounded interval (boundedness, attainment of bounds, the intermediate
value theorem and uniform continuity). Next we define Cauchy sequences and prove
the fundamental result that a sequence is convergent if and only if it is Cauchy.
As a consequence we obtain an intrinsic definition of convergence that does not
explicitly depend on the limit. We devote a section to the definitions and properties
of the operations of lim sup and lim inf and show how we may use these concepts
to provide alternative proofs of some of our results. After a section reviewing the
definition of complex numbers and properties of complex sequences, we conclude
the chapter with four appendices. In the first appendix, we review some standard
results of the differential calculus. In the second appendix we provide a simple proof
that every continuous function on a closed interval has a unique Riemann integral
(the proof does not use uniform continuity). In the third appendix, we develop from
scratch the theory of the exponential and natural logarithm and prove important and
much used growth estimates for $\log x$ and $e^x$, as $x \to +\infty$, and for $\log x$ as $x \to 0+$.
In the final appendix, we outline an approach to the construction of the real number
system that is based on Cauchy sequences of rational numbers.
2.2 Sequences
Let $Z$ be a non-empty set. Formally, a sequence of points of $Z$ is a function $x : \mathbb{N} \to Z$ (sometimes $x : \mathbb{Z}_+ \to Z$). We invariably set $x(n) = x_n$, $n \ge 1$, denote the sequence by $(x_n)$, or $(x_n)_{n \ge 1}$, and regard $(x_n)$ as being an ordered subset of $Z$, the order being given by $\mathbb{N}$.
Example 2.2.1 If $x \in Z$ and we define $x_n = x$, for all $n \ge 1$, then $(x_n)$ is a constant sequence. In particular, we do not require that the map $x : \mathbb{N} \to Z$ be 1:1 and $(x_n)$ is not the same as the set $\{x_n \mid n \in \mathbb{N}\}$ which, for this example, is the singleton $\{x\}$.
2.2.1 Sequences of Real Numbers and Convergence
In this chapter we will be mainly interested in sequences $(x_n)$ of real numbers: $x_n \in \mathbb{R}$, $n \in \mathbb{N}$. We sometimes write $(x_n) \subset \mathbb{R}$ to signify that $(x_n)$ is a sequence of real numbers. That is, $\{x_n \mid n \in \mathbb{N}\} \subset \mathbb{R}$. Similarly, if we write $(x_n) \subset \mathbb{Q}$, then $(x_n)$ will be a sequence of rational numbers.
Example 2.2.2 Since $\mathbb{Q}$ is countable, there is a surjective map $x : \mathbb{N} \to \mathbb{Q}$. The associated sequence $(x_n)$ has the property that $\bigcup_{n=1}^{\infty} \{x_n\} = \mathbb{Q}$. We can require that every rational number occurs infinitely often in the sequence $(x_n)$. Indeed, this follows since the set of all pairs $(r, s) \in \mathbb{Z}^2$, with $s \ne 0$, is countable and so if $q = r/s$, then $(nr)/(ns) = q$ for all $n \in \mathbb{N}$. Infinity is very elastic.
Definition 2.2.3 The sequence $(x_n) \subset \mathbb{R}$ is convergent with limit $x \in \mathbb{R}$ if, for all $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
$$|x - x_n| < \varepsilon, \quad n \ge N.$$
We write this as $\lim_{n\to\infty} x_n = x$. We say that $x$ is the limit of the sequence $(x_n)$ or that the sequence $(x_n)$ converges to $x$.
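To make the quantifiers in the definition concrete, here is a hedged sketch for the single sequence $x_n = 1/n$ with limit $0$: given $\varepsilon$, it produces an $N$ that works. The function name `threshold_for_reciprocals` is an assumption made for illustration, and the final check only samples finitely many $n$, so it is a sanity check rather than a proof.

```python
from fractions import Fraction
import math

def threshold_for_reciprocals(eps):
    """Return an N in N with |0 - 1/n| < eps for all n >= N."""
    # Any integer N > 1/eps works; floor(1/eps) + 1 is the smallest such N.
    return math.floor(1 / eps) + 1

eps = Fraction(1, 250)
N = threshold_for_reciprocals(eps)
# Spot-check the defining inequality on a finite range of indices n >= N.
assert all(abs(0 - Fraction(1, n)) < eps for n in range(N, N + 1000))
print(N)
```

The point of the definition is that such an $N$ must exist for every $\varepsilon > 0$; the smaller $\varepsilon$ is, the larger $N$ generally has to be.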
Remarks 2.2.4
(1) In limit definitions, we generally use capitals $M, N, \ldots$ for the bounds. This is in contrast to the way we used capitals in the previous chapter for decimal truncations. Nevertheless, we continue to use the notation $(x^N)$ for the sequence of decimal truncations of a real number $x$.
(2) If the sequence $(x_n)$ is convergent, then the limit is unique. Intuitively this is clear: it is not possible to be arbitrarily close to distinct points. We leave the formal details as an exercise for the reader.
(3) The definition of convergence works perfectly well within the framework of the rational numbers. In this case, we require $(x_n) \subset \mathbb{Q}$ and $x, \varepsilon \in \mathbb{Q}$. We showed in Chap. 1 that the geometric series $\sum ar^n$ always converges in $\mathbb{Q}$ if $a, r \in \mathbb{Q}$ and $|r| < 1$. As we shall see, this is quite exceptional. In general, infinite sequences or series of rational numbers will not converge in $\mathbb{Q}$ even though they converge in $\mathbb{R}$.
(4) The definition of convergent sequence suffers from the defect that it includes the limit $x$. Later we shall see that providing we work with the real numbers $\mathbb{R}$ (as opposed to the rationals $\mathbb{Q}$), it is possible to give an intrinsic definition of a convergent sequence that does not depend explicitly on the limit $x$. This is significant as in many cases it is possible to prove convergence without knowing the limit.
We give some equivalent ways of formulating the limit definition in the next
lemma.
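Lemma 2.2.5 Let $(x_n)$ be a sequence of real numbers and $x \in \mathbb{R}$. The following statements are equivalent.
(1) $\forall \varepsilon > 0$, $\exists N \in \mathbb{N}$ such that $|x - x_n| < \varepsilon$ for all $n \ge N$.
(2) $\forall \varepsilon > 0$, $\exists N \in \mathbb{N}$ such that $|x - x_n| \le \varepsilon$ for all $n \ge N$.
(3) $\forall m \in \mathbb{N}$, $\exists N \in \mathbb{N}$ such that $|x - x_n| < 10^{-m}$ for all $n \ge N$.
(4) There exists a sequence $(\varepsilon_m)$ of strictly positive numbers converging to zero such that $\forall m \in \mathbb{N}$, $\exists N \in \mathbb{N}$ such that $|x - x_n| < \varepsilon_m$ for all $n \ge N$.
(5) For every sequence $(\varepsilon_m)$ of strictly positive numbers converging to zero, $\forall m \in \mathbb{N}$, $\exists N \in \mathbb{N}$ such that $|x - x_n| < \varepsilon_m$ for all $n \ge N$.
(In statements (3,4,5), we can replace $<$ by $\le$ as in (2).)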
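Proof We need to show that if $p, q \in \{1, \ldots, 5\}$, $p \ne q$, then $(p) \Longrightarrow (q)$. That is, if the sequence converges according to $(p)$, then it converges according to $(q)$.
We start by proving the equivalence of (1) and (2). (1) $\Longrightarrow$ (2) is obvious since $|x - x_n| < \varepsilon$ implies $|x - x_n| \le \varepsilon$. For the converse, suppose convergence according to (2). Given $\varepsilon > 0$, we have $\varepsilon/2 > 0$ and so we can choose $N \in \mathbb{N}$ such that $|x - x_n| \le \varepsilon/2$ for all $n \ge N$. Since $\varepsilon/2 < \varepsilon$, we have $|x - x_n| < \varepsilon$ for all $n \ge N$ and so we have convergence according to (1).
Turning to the remaining statements, we have (5) $\Longrightarrow$ (3,4) and (3) $\Longrightarrow$ (4) ((5) is the strongest statement, (4) the weakest). Hence, it suffices to show that (1) $\Longrightarrow$ (4) $\Longrightarrow$ (5) $\Longrightarrow$ (1). For (1) $\Longrightarrow$ (4), take $\varepsilon_m = m^{-1}$ and apply (1) with $\varepsilon = \varepsilon_m$. Next suppose (4) holds with the sequence $(\delta_\ell)$. Let $(\varepsilon_m)$ be any sequence of strictly positive numbers converging to zero. Given $m \in \mathbb{N}$, $\varepsilon_m > 0$ and so, since $(\delta_\ell)$ converges to zero, there exists an $\ell_0 \in \mathbb{N}$ such that $0 < \delta_{\ell_0} \le \varepsilon_m$. Hence, by (4), there exists an $N \in \mathbb{N}$ such that $|x - x_n| < \delta_{\ell_0}$ for all $n \ge N$. Since $\delta_{\ell_0} \le \varepsilon_m$, $|x - x_n| < \varepsilon_m$ for all $n \ge N$, proving that (5) holds. Finally, we show that (5) $\Longrightarrow$ (1). For this, it is enough to define $\varepsilon_m = \varepsilon/m$ and apply (5) with $m = 1$. □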
Remarks 2.2.6
(1) Statements (3,4) of the lemma are the easiest to work with as they only require
verification of a countable number of conditions. On the other hand, (1,2,5)
require verification of an uncountable number of conditions.
(2) There is no loss of generality in requiring in (1,2) that $\varepsilon \in \mathbb{Q}$ and in (4,5) that $(\varepsilon_m) \subset \mathbb{Q}$.
Example 2.2.7 Let $x \in \mathbb{R}$ have decimal expansion $x = x_0.x_1x_2\cdots$. For $N \ge 1$, define $x^N = x_0.x_1\cdots x_N$. Then $(x^N)$ is convergent and $\lim_{N\to\infty} x^N = x$. This follows since
$$|x - x^N| \le 10^{-N}, \quad N \ge 1,$$
and so we may use the convergence statement (3) of Lemma 2.2.5 (with $<$ replaced by $\le$).
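The estimate $|x - x^N| \le 10^{-N}$ can be observed directly when $x$ is rational, since truncations are then computable exactly. The sketch below assumes a non-negative rational $x$ and uses a helper `truncate` whose name is introduced only for this illustration.

```python
from fractions import Fraction
import math

def truncate(x, n):
    """Decimal truncation x^n of a non-negative rational x to n places."""
    scale = 10 ** n
    return Fraction(math.floor(x * scale), scale)

x = Fraction(22, 7)  # 3.142857142857...
for n in (1, 2, 5, 10):
    gap = x - truncate(x, n)
    # The gap is non-negative and bounded by 10^-n, as in the example.
    print(n, truncate(x, n), gap <= Fraction(1, 10 ** n))
```

For an irrational $x$ the same bound holds, but the truncations can no longer be produced from a closed-form fraction in this way.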
In subsequent sections, we often make use of the well-known squeezing lemma.
We give the statement for reference and leave the straightforward proof to the
exercises.
Lemma 2.2.8 If $(a_n), (b_n), (x_n)$ are sequences of real numbers which satisfy
(1) $a_n \le x_n \le b_n$ for all $n \ge 1$ (for large enough $n$ suffices);
(2) $\lim_{n\to\infty} a_n$, $\lim_{n\to\infty} b_n$ exist and have the same limit, say $x^\star$,
then $(x_n)$ is convergent and has limit $x^\star$.
Examples 2.2.9 We give some examples of convergence and applications of the
squeezing lemma. Most of the examples require multiplication of real numbers and
a very limited knowledge of rational exponents and roots. In every case, the limit
will be rational. The examples will not be used in the theoretical developments in
this chapter. Indeed, the gaps will be filled in subsequent sections and exercises.
(1) Let $p, q \in \mathbb{N}$, $(p, q) = 1$, and set $\alpha = p/q > 0$. We claim that $(n^{-\alpha})$ converges to zero. Suppose first that $p = q = 1$. Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ such that $N > 1/\varepsilon$ (this uses the Archimedean property of $\mathbb{R}$). We have $|0 - n^{-1}| < \varepsilon$, $n \ge N$, and so $\lim_{n\to\infty} n^{-1} = 0$. Suppose $p > 1$, $q = 1$. Since $n^p \ge n$, $1/n \ge 1/n^p$, for all $n \in \mathbb{N}$. Hence, taking $a_n = 0$, $x_n = n^{-p}$, and $b_n = 1/n$ in the statement of the squeezing lemma, we have $\lim_{n\to\infty} n^{-p} = 0$. Next take $q > 1$, $p = 1$, and recall that $n^{-1/q}$ is the positive $q$th root of $1/n$. If $n = m^q$, then $n^{-1/q} = 1/m$. Taking $\varepsilon_m = 1/m$ in Lemma 2.2.5(4), with $<$ replaced by $\le$, we see that $|0 - n^{-1/q}| \le 1/m$, for all $n \ge N = m^q$. Hence $\lim_{n\to\infty} n^{-1/q} = 0$. Finally, $\lim_{n\to\infty} n^{-p/q} = \lim_{n\to\infty} (n^{-1/q})^p = (\lim_{n\to\infty} n^{-1/q})^p = 0$ by standard properties of limits and multiplication of real numbers. (The result extends to strictly positive exponents $\alpha \in \mathbb{R}$. However, for this we need properties of the log and exponential function; see section "Appendix: The Log and Exponential Functions".)
(2) We claim that if $x > 0$, then $(x^{1/n})$ converges to 1. Suppose first that $x > 1$ so that $x^{1/n} > 1$. Set $x_n = x^{1/n} - 1 > 0$. By the binomial theorem $x = (1 + x_n)^n \ge 1 + nx_n$, $n \ge 1$, and so
$$0 < x_n \le \frac{x - 1}{n}.$$
The result follows by (1) and Lemma 2.2.8. If $0 < x < 1$, apply the previous argument to $y = 1/x$.
(3) The sequence $(n^{1/n})$ converges to 1. Set $n^{1/n} = 1 + x_n$. Clearly $x_n \ge 0$. Applying the binomial theorem we have
$$n = (1 + x_n)^n \ge 1 + \frac{n(n - 1)}{2} x_n^2,$$
and so
$$0 \le x_n \le \sqrt{\frac{2}{n}}.$$
The result follows by (1) and the squeezing lemma.
(4) If $r \in (-1, 1)$, the geometric sequence $(r^n)$ converges and has limit zero. Suppose that $r \in (0, 1)$. Define $x > 0$ by $r = 1/(1 + x)$. Then $r^{-n} = (1 + x)^n \ge nx$ and so $0 \le r^n \le x^{-1}n^{-1}$. The result follows by (1) and the squeezing lemma. If $r \in (-1, 0)$, then the same argument shows that $|r|^n \to 0$ and hence $\lim_{n\to\infty} r^n = 0$ (by the definition of the limit).
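The three squeezing bounds obtained in examples (2), (3) and (4) can be sanity-checked numerically. The sketch below uses floating-point arithmetic, so it is only a spot check under that approximation and not a proof; the function name `bounds_hold` and the sample values $x = 5$, $r = 0.9$ are assumptions chosen for illustration.

```python
import math

def bounds_hold(x, r, n):
    """Check the squeezing bounds from examples (2), (3), (4) at index n.

    Assumes x > 1 and 0 < r < 1; floating point, so this is a spot check only.
    """
    root_of_x = x ** (1.0 / n) - 1.0   # x_n in (2)
    root_of_n = n ** (1.0 / n) - 1.0   # x_n in (3)
    t = 1.0 / r - 1.0                  # r = 1/(1 + t) in (4)
    return (
        0.0 < root_of_x <= (x - 1.0) / n
        and 0.0 <= root_of_n <= math.sqrt(2.0 / n)
        and 0.0 <= r ** n <= 1.0 / (t * n)
    )

print(all(bounds_hold(5.0, 0.9, n) for n in range(1, 2001)))
```

In each case the upper bound is a constant multiple of $n^{-1}$ or $n^{-1/2}$, so example (1) together with the squeezing lemma gives the limit.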
2.2.2 Subsequences
Definition 2.2.10 Let $(x_n)$ be a sequence of real numbers. A subsequence $(x_{n_j})$ of $(x_n)$ is a sequence of the form $x_{n_1}, x_{n_2}, \cdots$ where $1 \le n_1 < n_2 < \cdots$. That is, it is a sequence $(z_j)$ where $z_j = x_{n_j}$, $j \ge 1$.
Remark 2.2.11 If $(x_n) \subset \mathbb{R}$ is a sequence then every countably infinite subset $K$ of $\mathbb{N}$ uniquely determines a subsequence. Indeed, if $K$ is a countably infinite subset of $\mathbb{N}$, we may write $K$ uniquely as $K = \{n_j \mid j \in \mathbb{N}\}$, where $1 \le n_1 < n_2 < \cdots$. We define the sequence $(z_j)$ by $z_j = x_{n_j}$, $j \in \mathbb{N}$.
We leave the proof of the next lemma as an exercise.
Lemma 2.2.12 Let $(x_n)$ be a convergent sequence with limit $x$. Every subsequence $(x_{n_j})$ of $(x_n)$ is convergent with limit $x$.
Examples 2.2.13
(1) Define the sequence $(x_n)$ by $x_n = n$, $n \ge 1$. As a subsequence we could take $(x_{n_p})$ where $x_{n_p}$ denotes the $p$th prime number (so $(x_{n_p})$ is the sequence $2, 3, 5, 7, 11, \cdots$). It is clear that $(x_n)$ has no convergent subsequences.
(2) Let $(x_n)$ be a sequence such that $\{x_n \mid n \ge 1\} = \mathbb{Q}$. Obviously, $(x_n)$ is not convergent. However, for every $x \in \mathbb{R}$, we can construct a subsequence $(x_{n_j})$ of $(x_n)$ which converges to $x$ (this does not contradict Lemma 2.2.12: a nonconvergent sequence may have many convergent subsequences). Suppose then that $x \in \mathbb{R}$. We give an inductive construction for a subsequence converging to $x$ which will repeatedly use that $\mathbb{Q} \cap (a, b)$ is countably infinite for all open intervals $(a, b)$, $a < b$ (Exercises 1.6.23(8)). We define $x_{n_1}$ by taking $n_1 \ge 1$ to be the smallest integer such that $x_{n_1} \in (x - 10^{-1}, x + 10^{-1})$. Suppose we have constructed $x_{n_1}, \cdots, x_{n_{m-1}}$ so that $1 \le n_1 < \cdots < n_{m-1}$ and
$$x_{n_j} \in (x - 10^{-j}, x + 10^{-j}), \quad j = 1, \cdots, m - 1.$$
Choose $n_m$ to be the smallest integer greater than $n_{m-1}$ such that $x_{n_m} \in (x - 10^{-m}, x + 10^{-m})$. That we can choose $n_m$ follows since $(x - 10^{-m}, x + 10^{-m}) \cap \mathbb{Q}$ contains $(x - 10^{-m}, x + 10^{-m}) \cap (\mathbb{Q} \setminus \{x_1, \cdots, x_{n_{m-1}}\})$, which is countably infinite. This completes the inductive construction of $(x_{n_j})$. Since $|x - x_{n_m}| < 10^{-m}$, $m \ge 1$, it follows from Lemma 2.2.5 (statement (4) this time) that $(x_{n_m})$ converges to $x$.
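The inductive construction in (2) can be carried out for a concrete enumeration of $\mathbb{Q}$. The sketch below is an illustration under assumptions: it enumerates the rationals by height $|p| + q$ (one possible choice of $(x_n)$, not one fixed by the text), uses a floating-point stand-in for the target $x$, and the names `rationals` and `subsequence_converging_to` are hypothetical.

```python
from fractions import Fraction
from itertools import count
from math import gcd

def rationals():
    """Enumerate Q without repetition, ordered by height |p| + q."""
    yield Fraction(0)
    for height in count(2):
        for q in range(1, height):
            p = height - q
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)

def subsequence_converging_to(x, terms):
    """Return [(n_1, x_{n_1}), ...] with n_1 < n_2 < ... and |x - x_{n_m}| < 10^-m."""
    chosen = []
    m = 1
    for n, q in enumerate(rationals(), start=1):
        # n_m is the smallest index beyond n_{m-1} landing in the m-th interval.
        if abs(q - x) < 10.0 ** (-m):
            chosen.append((n, q))
            m += 1
            if m > terms:
                break
    return chosen

for n, q in subsequence_converging_to(2 ** 0.5, 4):
    print(n, q, float(q))
```

The search always terminates because each interval $(x - 10^{-m}, x + 10^{-m})$ contains infinitely many rationals, so infinitely many of them appear after any given index.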
EXERCISES 2.2.14
(1) Show that if a sequence is convergent, then the limit is unique.
(2) Prove the squeezing lemma (Lemma 2.2.8). Is the result true if we work over
the rational numbers?
(3) Using the squeezing lemma, and a little algebra, show that the sequence $(x_n)$, $x_n = \sqrt{n + 1} - \sqrt{n}$, is convergent. What is the limit?
(4) Suppose that the sequences $(a_n)$ and $(b_n)$ are convergent with respective limits $a^\star$ and $b^\star$. Show that if $a_n \le b_n$ for all $n \in \mathbb{N}$, then $a^\star \le b^\star$. Can any more be said if $a_n < b_n$ for all $n \in \mathbb{N}$? What about if $a_n \le b_n$ for all sufficiently large $n$?
(5) Prove Lemma 2.2.12.
(6) Find a countable infinite subset $X$ of $\mathbb{R}$ such that if $(x_n) \subset X$ is convergent, then $(x_n)$ is eventually constant and the limit of $(x_n)$ lies in $X$ ($(x_n)$ is eventually constant if $\exists x$, $\exists N \in \mathbb{N}$ such that $x_n = x$, $n \ge N$).
(7) Let $X$ be a non-empty subset of $\mathbb{R}$. A point $x \in \mathbb{R}$ is a closure point of $X$ if we can find a sequence $(x_n) \subset X$ which converges to $x$. Denote the set of closure points of $X$ by $\overline{X}$. Why is it true that $X \subset \overline{X}$?
(a) Find an example of a countably infinite unbounded set $X$ of $\mathbb{R}$ such that $\overline{X} = X$.
(b) Find an example of a countably infinite bounded subset $X$ of $\mathbb{R}$ such that $\overline{X} = X$.
(c) Find an example of a countably infinite bounded subset of $[0, 1]$ such that $\overline{X} \setminus X = \{0, \frac{1}{2}, 1\}$.
(d) Find an example of a countably infinite subset $X$ of $\mathbb{R}$ such that $\overline{X} = \mathbb{R}$.
(8) Suppose that $(x_n)$ is convergent. Let $\sigma : \mathbb{N} \to \mathbb{N}$ be a bijection. Prove that $(x_{\sigma(n)})$ is convergent and has the same limit as $(x_n)$.
(9) Write the set $\mathbb{Q}$ of all rational numbers as a sequence $(q_n)_{n \ge 1}$ where we assume $q_n \ne q_m$ if $n \ne m$. Given $\varepsilon > 0$, define $I_n = (q_n - \varepsilon 2^{-(n+1)}, q_n + \varepsilon 2^{-(n+1)})$, $n \ge 1$, and set $I = \bigcup_{n \in \mathbb{N}} I_n$.
(a) If $|I_n|$ denotes the length of $I_n$, show that $\sum_{n=1}^{\infty} |I_n| = \varepsilon$.
(b) Show that the set $X = \mathbb{R} \setminus I$ contains no proper subintervals even though the 'length' of the complement $I$ is at most $\varepsilon$.
(c) The set $X$ consists of irrational numbers. Does $X$ contain all the irrational numbers? Why/Why not?
2.3 Bounded Subsets of R and the Supremum and Infimum
Definition 2.3.1 Let $A$ be a subset of $\mathbb{R}$.
(1) $A$ is bounded above if $A$ is nonempty and $\exists M \in \mathbb{R}$ such that $M \ge x$ for all $x \in A$. We call $M$ an upper bound for $A$.
(2) $A$ is bounded below if $A$ is nonempty and $\exists m \in \mathbb{R}$ such that $m \le x$ for all $x \in A$. We call $m$ a lower bound for $A$.
(3) $A$ is bounded if $A$ is bounded above and below.
Examples 2.3.2
(1) $\mathbb{N}$ is bounded below ($m \le 1$ works) but not bounded above.
(2) $\mathbb{Z}$ is unbounded.
(3) If $a < b$ are real numbers, then $(a, b)$, $[a, b]$ are bounded. In both cases we can take as upper bound any $M \ge b$ and as lower bound any $m \le a$.
(4) A non-empty subset $A$ of $\mathbb{R}$ is bounded iff $\exists R \ge 0$ such that $A \subset [-R, R]$.
The next two lemmas turn out to be very useful in our discussion of upper and
lower bounds and convergence of bounded sequences.
Lemma 2.3.3 Let $A$ be a nonempty subset of $\mathbb{R}$. If $M + \varepsilon$ is an upper bound for $A$ for all $\varepsilon > 0$, then $M$ is an upper bound for $A$. An analogous result holds for lower bounds.
Proof We prove by contradiction. If $M$ is not an upper bound, there exists an $a \in A$ such that $M < a$. Take $\varepsilon = (a - M)/2 > 0$ and observe that $M + \varepsilon = (M + a)/2 < a$, contradicting our assumption that $M + \varepsilon$ is an upper bound for $A$ for all $\varepsilon > 0$. □
Remark 2.3.4 The proof uses Sect. 1.6.7 on multiplication of real numbers by rational numbers, in this case by $\frac{1}{2}$. We can avoid multiplication by choosing $z \in \mathbb{R}$ satisfying $M < z < a$ (Example 1.6.7) and taking $\varepsilon = z - M$.
Lemma 2.3.5 Let $(x_n)$ be a sequence of real numbers and suppose that $x_n \le M$ for all $n \in \mathbb{N}$. If $\lim_{n\to\infty} x_n = x^\star$, then $x^\star \le M$. An analogous result holds for lower bounds.
Proof We prove by contradiction. Suppose $x^\star > M$. Take $\varepsilon = (x^\star - M)/2 > 0$. Since $(x_n)$ converges to $x^\star$, there exists an $N \in \mathbb{N}$ such that $|x^\star - x_N| < \varepsilon$. Hence $x_N \ge x^\star - \varepsilon > M$. Contradiction. □
Remark 2.3.6 If $x_n < M$ in the statement of Lemma 2.3.5, then we can only infer that $x^\star \le M$. Typically limits preserve '$\le$' but not strict inequality.
Definition 2.3.7 Let $A$ be a nonempty subset of $\mathbb{R}$.
(1) Suppose $A$ is bounded above. A least upper bound for $A$, or supremum for $A$, is a real number $M$ such that
(a) $M$ is an upper bound for $A$.
(b) If $M'$ is any upper bound for $A$, then $M \le M'$.
(2) Suppose $A$ is bounded below. A greatest lower bound for $A$, or infimum for $A$, is a real number $m$ such that
(a) $m$ is a lower bound for $A$.
(b) If $m'$ is any lower bound for $A$, then $m' \le m$.
Lemma 2.3.8 Suppose $A$ is bounded above. If the supremum of $A$ exists, it is unique. Similarly for the infimum of $A$, if $A$ is bounded below.
Proof Suppose that $M, M'$ are supremums of $A$. Then by property (a), both $M$ and $M'$ are upper bounds of $A$. Since $M$ is a supremum of $A$, it follows by (b) that $M \le M'$. Applying the same argument with the roles of $M, M'$ interchanged, we get $M' \le M$. Hence $M = M'$. We may apply a similar argument to prove the uniqueness of the infimum. □
Remarks 2.3.9
(1) If they exist, we denote the supremum and infimum of A by sup.A/ and inf.A/
respectively. Alternative and commonly used notations are lub.A/ for sup.A/,
and glb.A/ for inf.A/.
(2) In analysis, we only define the maximum of A, max.A/, and minimum of A,
min.A/, if A is a finite subset of R. This is a little confusing as we do refer to
the maximum and minimum values of a function.
Lemma 2.3.10 Suppose $\sup(A)$ exists. Define $-A = \{-a \mid a \in A\}$ and, for $x \in \mathbb{R}$, $A + x = \{a + x \mid a \in A\}$.
(1) $\inf(-A)$ exists and equals $-\sup(A)$.
(2) $\sup(A + x)$ exists and equals $\sup(A) + x$.
Similarly, if $\inf(A)$ exists, $\sup(-A) = -\inf(A)$, $\inf(A + x) = \inf(A) + x$, $x \in \mathbb{R}$.
Proof We leave this as an exercise. □
We have the following necessary and sufficient condition for the existence of
sup.A/.
Lemma 2.3.11 Let $A$ be a subset of $\mathbb{R}$ which is bounded above and let $M \in \mathbb{R}$. Suppose that
(1) $M$ is an upper bound for $A$.
(2) For every $\varepsilon > 0$, there exists an $x \in A$ such that $x > M - \varepsilon$.
Then $\sup(A) = M$. Conversely, if $M = \sup(A)$, then (1,2) are satisfied. A similar criterion holds for the infimum of $A$.
Proof If (2) fails then there exists an $\varepsilon > 0$ such that $M - \varepsilon \ge x$ for all $x \in A$. Hence $M - \varepsilon$ is an upper bound for $A$ and $M$ cannot be the supremum of $A$. Hence conditions (1,2) are necessary for $M$ to be the supremum of $A$. Conversely, suppose (1,2) hold. Since (2) holds, $M - \varepsilon$ is not an upper bound of $A$ for any $\varepsilon > 0$. Since, by (1), $M$ is an upper bound of $A$, $M$ must be the least upper bound of $A$. □
Theorem 2.3.12 Let $A \subset \mathbb{R}$ be bounded above. Then $\sup(A)$ exists. Similarly, if $A$ is bounded below, $\inf(A)$ exists.
Proof Let $A$ be bounded above. It is enough by Lemma 2.3.10 to prove the existence of the supremum of $A + x$ for some $x \in \mathbb{R}$ since $\sup(A) = \sup(A + x) - x$. It follows that there is no loss of generality in requiring that $A$ contains a point $a > 0$ and so we may and shall assume that 0 is not an upper bound of $A$. This assumption simplifies the proof as we avoid having to make a separate argument to handle negative upper bounds.
We use an inductive technique to construct the decimal expansion $\alpha = \alpha_0.\alpha_1\alpha_2\cdots$ of $\sup(A)$. Specifically, we construct a positive sequence $\alpha^P = \alpha_0.\alpha_1\alpha_2\cdots\alpha_P$, $P \ge 0$, of decimal truncations of $\alpha$. The truncations will satisfy
(a) $\alpha^P$ is not an upper bound of $A$, $P \ge 0$.
(b) $\alpha^P + 10^{-P}$ is an upper bound of $A$, $P \ge 0$.
(c) If $0 \le P < Q$, then $\alpha^P$ and $\alpha^Q$ agree to the first $P$ decimal places.
(d) The sequence $(\alpha^P)_{P \ge 0}$ is increasing: $\alpha^P \le \alpha^Q$, if $P < Q$.
Let $L$ be the smallest integer which is an upper bound for $A$. Since 0 is not an upper bound, $L \in \mathbb{N}$. Define $\alpha^0 = \alpha_0 = L - 1 \ge 0$.
Proceeding inductively, suppose that we have constructed $\alpha^j = \alpha_0.\alpha_1\alpha_2\cdots\alpha_j$ satisfying (a–d) for $j < P$. Consider $Z_p = \alpha^{P-1} + p10^{-P}$, $0 \le p \le 10$. We have that $Z_{10}$ is an upper bound of $A$ (using (b) for $\alpha^{P-1}$), but $Z_0 = \alpha^{P-1}$ is not an upper bound (by (a)). Choose $p \in \{0, \cdots, 9\}$ so that $Z_{p+1}$ is an upper bound but $Z_p$ is not. Define $\alpha_P = p$, $\alpha^P = \alpha_0.\alpha_1\alpha_2\cdots\alpha_P = \alpha^{P-1} + p10^{-P}$. This completes the inductive step.
It is immediate from (c) that $(\alpha^P)$ converges to the real number $\alpha = \alpha_0.\alpha_1\alpha_2\cdots$. We claim that $\alpha = \sup(A)$.
By property (b), $\alpha^P + 10^{-P}$ is an upper bound for $A$ for all $P \ge 0$. Since $(\alpha^P)$ is an increasing sequence and $\alpha^P \le \alpha$ for all $P$, $\alpha + 10^{-P}$ is an upper bound for $A$ for all $P \ge 0$. Hence $\alpha$ is an upper bound of $A$ (Lemma 2.3.3). We need to show $\alpha$ is the least upper bound. Suppose $\beta$ is an upper bound of $A$. Then $\beta > \alpha^P$, for all $P \ge 0$ (property (a)). Hence, by Lemma 2.3.5, $\beta \ge \lim_{P\to\infty} \alpha^P = \alpha$ and so $\sup(A) = \alpha$. The result for infimums can be proved along the same lines or, more simply, by using Lemma 2.3.10. □
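The digit-by-digit construction in the proof is effectively an algorithm, provided one can decide whether a given rational is an upper bound of $A$. The following sketch assumes such a decision procedure is supplied as a function; the names `supremum_truncations` and `bounds_squares_below_two` are introduced here for illustration, and the example set $A = \{x \in \mathbb{Q} \mid x^2 < 2\}$ is chosen because its upper bounds are easy to test.

```python
from fractions import Fraction

def supremum_truncations(is_upper_bound, places):
    """Return [alpha^0, ..., alpha^places] following the inductive construction.

    is_upper_bound(M) must decide whether the rational M is an upper bound
    of A; A is assumed bounded above with 0 not an upper bound.
    """
    # L is the smallest integer upper bound; alpha^0 = L - 1.
    L = 1
    while not is_upper_bound(Fraction(L)):
        L += 1
    alpha = Fraction(L - 1)
    truncations = [alpha]
    for P in range(1, places + 1):
        step = Fraction(1, 10 ** P)
        # Find p in {0,...,9} with Z_p not an upper bound but Z_{p+1} one,
        # where Z_p = alpha^{P-1} + p * 10^-P.
        p = 0
        while not is_upper_bound(alpha + (p + 1) * step):
            p += 1
        alpha += p * step
        truncations.append(alpha)
    return truncations

# Example set: A = {x in Q : x*x < 2}. A rational M bounds A above
# exactly when M > 0 and M*M >= 2.
def bounds_squares_below_two(M):
    return M > 0 and M * M >= 2

for P, a in enumerate(supremum_truncations(bounds_squares_below_two, 6)):
    print(P, a, float(a))
```

Each $\alpha^P$ fails to be an upper bound while $\alpha^P + 10^{-P}$ is one, which are exactly properties (a) and (b); for this choice of $A$ the truncations are those of $\sqrt{2}$.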
Remarks 2.3.13
(1) The proof of Theorem 2.3.12, which depends on a 'subdivide and conquer' technique, is carefully constructed so as to make transparent the convergence of the sequence $(\alpha^P)$ to the supremum of $A$. We do this in two ways. First, the sequence $(\alpha^P)$ is an increasing sequence which obviously converges to $\alpha$. Secondly, $\alpha^P$ is not an upper bound but $\alpha^P + 10^{-P}$ is an upper bound.
(2) The proof does not work over the rational numbers. There is no reason why the sequence $(\alpha^P)$ should converge to a rational number.
(3) The existence of the supremum is sometimes taken as an Axiom for the real numbers. The point of the proof is that if one thinks of real numbers as being decimal expansions, then it is straightforward to construct the supremum directly. In particular, we construct the supremum as a sequence of rational approximations.
Examples 2.3.14
(1) Theorem 2.3.12 fails if we work over the rational numbers. The easiest example is found by defining $A = \{x_0.x_1\cdots x_n \mid n \ge 1\}$, where $\sqrt{2} = x_0.x_1\cdots$. If $\sup(A)$ existed it would have to be $\sqrt{2}$ but $\sqrt{2} \notin \mathbb{Q}$. Other examples can be found by constructing (according to some simple rule) a non-periodic non-terminating decimal expansion and then taking the set of decimal truncations. For example, define $x^\star = 1.0^1 1\,0^2 1\,0^3 1 \cdots 0^n 1 \cdots$ (here $0^p$ signifies a string of $p$ zeros).
(2) Let $\{I_j = (a_j, b_j) \mid j \in J\}$ be a set of open intervals of $\mathbb{R}$. Suppose that $\bigcap_{j \in J} I_j \ne \emptyset$; that is, the intervals $I_j$ share at least one common point, say $x_0$. We claim that $I = \bigcup_{j \in J} I_j$ is an open interval (we regard $(-\infty, b)$, $(a, \infty)$ and $\mathbb{R}$ as open intervals). Suppose the sets $\{a_j \mid j \in J\}$, $\{b_j \mid j \in J\}$ are bounded subsets of $\mathbb{R}$ (we leave the case where one or both of these sets is unbounded to the exercises). Set $a^\star = \inf\{a_j \mid j \in J\}$, $b^\star = \sup\{b_j \mid j \in J\}$. It suffices to show that $I = (a^\star, b^\star)$. By definition of the supremum and infimum, given $\varepsilon > 0$, there exist $\ell, m \in J$ such that $a^\star \ge a_\ell - \varepsilon$, $b^\star \le b_m + \varepsilon$. Since $I_\ell, I_m$ share the common point $x_0$, and $a_\ell < x_0 < b_m$, $I_\ell \cup I_m \subset I$ and so $(a^\star + \varepsilon, b^\star - \varepsilon) \subset I$. Since this is true for all $\varepsilon > 0$, we see that $(a^\star, b^\star) \subset I$. On the other hand, $a^\star, b^\star \notin I$. Indeed, if $a^\star \in I$, this would imply that there exists an $m \in J$ such that $a^\star \in I_m$. But then $a_m < a^\star$ and so $a^\star$ could not be a lower bound for $\{a_j \mid j \in J\}$. A similar argument applies to $b^\star$. We have shown that $I = (a^\star, b^\star)$. This result is false if we work over the rational numbers (the open interval $(a, b)_{\mathbb{Q}}$ of $\mathbb{Q}$ is defined to be the set $\{x \in \mathbb{Q} \mid a < x < b\}$, where $a, b \in \mathbb{Q}$). We leave the construction of an explicit counterexample to the exercises.
Remark 2.3.15 It is useful to extend the definition of sup and inf to allow for
unbounded sets. Thus, if $A$ is not bounded above, we set $\sup(A) = +\infty$ and if
$A$ is not bounded below, we set $\inf(A) = -\infty$.
2.3.1 Applications to Sequences and Series
Let $(a_n)$ be a sequence of real numbers. Recall that the sequence $(a_n)$ diverges to
$+\infty$ if for every $M \in \mathbb{R}$, there exists an $N \in \mathbb{N}$ such that $a_n \ge M$ for all $n \ge N$. We
often write this as $\lim_{n\to\infty} a_n = +\infty$. We may similarly define divergence to $-\infty$.
We have already made use of increasing sequences in the proof of Theorem 2.3.12. For completeness, we give some formal definitions.
Definition 2.3.16 The sequence $(a_n)$ of real numbers is increasing if $a_1 \le a_2 \le a_3 \le \cdots$. That is, $a_n \le a_m$ whenever $n < m$. The sequence is strictly increasing if
$a_n < a_m$ whenever $n < m$ and is eventually increasing if there exists an $N \in \mathbb{N}$ such
that $a_n < a_m$ whenever $N \le n < m$. We similarly may define decreasing, strictly
decreasing and eventually decreasing sequences.
Definition 2.3.17 The sequence $(a_n)$ of real numbers is bounded above if $\{a_n \mid n \in \mathbb{N}\} \subset \mathbb{R}$
is bounded above. The sequence is bounded below if $\{a_n \mid n \in \mathbb{N}\}$ is
bounded below.
The next result is both simple to state and very special to the real numbers. It
provides a gateway into the study of convergence of sequences and series of real
numbers. The result is false for sequences of rational numbers.
Theorem 2.3.18 Let $(a_n)$ be an increasing sequence of real numbers.
(1) If $(a_n)$ is not bounded above, then $\lim_{n\to\infty} a_n = +\infty$.
(2) If $(a_n)$ is bounded above, then $(a_n)$ is convergent and $\lim_{n\to\infty} a_n = \sup\{a_n \mid n \in \mathbb{N}\}$.
A similar result holds for decreasing sequences. The results also hold for eventually
increasing (or decreasing) sequences provided that we take the supremum (or
infimum) over the increasing (or decreasing) part of the sequence.
Proof Set $A = \{a_n \mid n \ge 1\}$. Suppose first that $A$ is not bounded above. Then for every
$M \in \mathbb{R}$, there exists an $N \in \mathbb{N}$ such that $a_N \ge M$. Since $(a_n)$ is increasing, $a_n \ge M$ for all $n \ge N$. Hence $\lim_{n\to\infty} a_n = +\infty$. If $A = \{a_n \mid n \ge 1\}$ is bounded above,
Theorem 2.3.12 applies and we can define $a^\star = \sup(A)$. We claim $(a_n)$ is convergent
with limit $a^\star$. Certainly $a_n \le a^\star$ for all $n \ge 1$ ($a^\star$ is an upper bound). Further, for
every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $a_N > a^\star - \varepsilon$ (otherwise $a^\star$ would not be
the least upper bound). Since $(a_n)$ is increasing, and bounded above by $a^\star$, we have
$a^\star \ge a_n > a^\star - \varepsilon$ for all $n \ge N$. That is, $|a^\star - a_n| < \varepsilon$, $n \ge N$. Hence $\lim_{n\to\infty} a_n$
exists and equals $a^\star$. We leave the proofs of the remaining parts of the theorem to
the reader. □
Theorem 2.3.18 has the following important and useful corollary.
Theorem 2.3.19 Let $\sum_{i=1}^{\infty} a_i$ be a series of (eventually) positive terms. Then either
$\sum_{i=1}^{\infty} a_i$ diverges to $+\infty$ or $\sum_{i=1}^{\infty} a_i$ converges. In particular, $\sum_{i=1}^{\infty} a_i$ converges iff
the sequence $(S_n = \sum_{i=1}^{n} a_i)$ of partial sums is bounded.
Proof For $n \ge 1$, define the $n$th partial sum $S_n = \sum_{i=1}^{n} a_i$. We recall that, by
definition, $\sum_{i=1}^{\infty} a_i$ converges iff the sequence $(S_n)$ of partial sums converges. Since
it is assumed that the terms in the series are (eventually) positive, it follows that the
sequence $(S_n)$ is (eventually) increasing. The result follows by Theorem 2.3.18. □
Remark 2.3.20 The significance of Theorems 2.3.18 and 2.3.19 is that they give a
criterion for convergence that does not require us to know the limit. For example,
Theorem 2.3.19 implies that an infinite series of positive terms either converges or
diverges to $+\infty$. Aside from the implicit upper bound, nothing is stated or needed
about the actual value of the limit.
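As a small numerical illustration of Theorem 2.3.19 (a sketch only, not part of the text's argument), the following Python fragment computes exact partial sums of the series with terms $a_i = 1/i^2$. The choice of series, the helper name `partial_sums` and the cut-off `N` are assumptions made for the example. The bound $S_n < 2$ comes from the elementary estimate $1/i^2 \le 1/(i-1) - 1/i$ for $i \ge 2$, so convergence is established without knowing the value of the limit.

```python
from fractions import Fraction

def partial_sums(terms):
    """Return the list of partial sums S_1, S_2, ... of the given terms."""
    sums, total = [], Fraction(0)
    for a in terms:
        total += a
        sums.append(total)
    return sums

# Terms a_i = 1/i^2 are positive, so the partial sums (S_n) are increasing.
N = 200  # illustrative cut-off
S = partial_sums(Fraction(1, i * i) for i in range(1, N + 1))

# Upper bound found without knowing the limit:
# 1/i^2 <= 1/(i(i-1)) = 1/(i-1) - 1/i for i >= 2, hence S_n <= 2 - 1/n < 2.
is_increasing = all(S[k] < S[k + 1] for k in range(len(S) - 1))
is_bounded = all(s < 2 for s in S)
```

Both flags come out true, which is all that Theorem 2.3.18 needs in order to conclude that the series converges.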
2.3.2 Multiplication and Division of Real Numbers
With Theorem 2.3.18, we have all the necessary tools to define multiplication and
division of real numbers.
Multiplication Suppose that $x, y \in \mathbb{R}_+$. For $N \ge 1$, let $x^N$ and $y^N$ denote the
truncations of $x$ and $y$ to $N$ terms. Set $z^N = x^N y^N$. Then $(z^N)_{N \ge 1}$ is an increasing
sequence of rational numbers bounded above by $(x_0 + 1)(y_0 + 1)$. Hence,
$\lim_{N\to\infty} z^N$ exists by Theorem 2.3.18(2). We define the product $xy$ to be $\lim_{N\to\infty} z^N$.
We extend the definition of product to negative numbers in a way consistent with
multiplication of negative rationals: if $x, y < 0$, define $xy = (-x)(-y)$; if $x$
and $y$ are of opposite sign, define $xy = -\bigl((-x)y\bigr)$ when $x < 0 < y$ and $xy = -\bigl(x(-y)\bigr)$ when $y < 0 < x$. In all cases, we have
$xy = \lim_{N\to\infty} x^N y^N$.
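The definition of the product can be tried out in exact arithmetic. The sketch below is an illustration under assumptions: the inputs $x = 1/3$ and $y = 2/3$ are chosen because their decimal expansions are known, and the helper `truncation` is a hypothetical name. It forms the products $z^N = x^N y^N$ of the decimal truncations and checks that they increase and stay below $(x_0 + 1)(y_0 + 1)$.

```python
from fractions import Fraction
from math import floor

def truncation(x, N):
    """Truncate the positive rational x to N decimal places: x_0.x_1...x_N."""
    scale = 10 ** N
    return Fraction(floor(x * scale), scale)

x, y = Fraction(1, 3), Fraction(2, 3)   # 0.333... and 0.666...
z = [truncation(x, N) * truncation(y, N) for N in range(1, 13)]

increasing = all(z[k] <= z[k + 1] for k in range(len(z) - 1))
bound = (floor(x) + 1) * (floor(y) + 1)          # (x_0 + 1)(y_0 + 1)
bounded = all(zN <= bound for zN in z)
gap = x * y - z[-1]                               # distance from the product 2/9
```

Here `gap` is positive and smaller than $10^{-12}$, consistent with the products of truncations increasing to the product $xy$.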
With multiplication defined in terms of rational approximation, it is straightforward to extend standard results on rationals to real numbers.
Examples 2.3.21
(1) For all $x, y, z \in \mathbb{R}$, $x(y + z) = xy + xz$ (distributive law). Indeed,
\[ x(y + z) = \lim_{N\to\infty} x^N (y^N + z^N) = \lim_{N\to\infty} x^N y^N + \lim_{N\to\infty} x^N z^N = xy + xz, \]
where we have used the distributive law for rationals, $x^N (y^N + z^N) = x^N y^N + x^N z^N$.
(2) If $x, y \in \mathbb{R}$ and $xy = 0$, then either $x = 0$ or $y = 0$. It is enough to show that if
$x, y \ne 0$, then $xy \ne 0$. Without loss of generality take $x, y > 0$. Choose $p, q \in \mathbb{N}$
so that the digits $x_p, y_q \ne 0$. Then $x^N \ge 10^{-p}$, all $N \ge p$, and $y^N \ge 10^{-q}$, all $N \ge q$.
Therefore $x^N y^N \ge 10^{-(p+q)}$ for all $N \ge p + q$. Hence $xy = \lim_{N\to\infty} x^N y^N \ge 10^{-(p+q)} > 0$.
Division Since multiplication of real numbers is now defined, it suffices to define
the reciprocal $x^{-1}$ for $x > 0$. The sequence $(x^N)$ is increasing and bounded above
by $x_0 + 1$, and so $(1/x^N)$ is a decreasing sequence bounded below by $1/(x_0 + 1)$ (the
sequence is defined provided $x^N \ne 0$, which is true for large enough $N$). Hence, by
Theorem 2.3.18, we may define $x^{-1} = \lim_{N\to\infty} 1/x^N$.
Example 2.3.22 Equipped with division, Examples 2.3.21(2) follows by multiplying the equation $xy = 0$ by $x^{-1}$ if $x \ne 0$.
Remark 2.3.23 We refer to the exercises for effective ways of computing the
reciprocal and the roots $x^{1/p}$ of $x \in \mathbb{R}_+$, $p \ge 2$.
2.3.3 Examples of Convergent Sequences
For the remainder of this section, we show how we can use Theorem 2.3.18 to give
simple proofs of convergence for some basic geometric sequences.
Lemma 2.3.24 Let $x \in \mathbb{R}$ and consider the sequence $(x^n)$.
(1) If $x \in (-1, 1)$, $(x^n)$ converges to 0.
(2) If $x = 1$, $(x^n)$ converges to 1.
(3) If $x > 1$, $(x^n)$ diverges to $+\infty$.
(4) If $x \le -1$, $(x^n)$ is divergent.
Proof Statements (2,4) are obvious; we prove (1,3) using Theorem 2.3.18 together
with standard facts about limits.
If $x = 0$, (1) is immediate so suppose $x \in (0, 1)$. Then $(x^n)$ is a (strictly) decreasing sequence bounded below by 0. Hence, by Theorem 2.3.18, $(x^n)$ converges with
limit $x^\star \ge 0$. We have
\[ x x^\star = x \lim_{n\to\infty} x^n = \lim_{n\to\infty} x^{n+1} = x^\star \]
and so $x x^\star = x^\star$. Since $0 < x < 1$, $x^\star = 0$. If $x \in (-1, 0)$, then $|x|^n \to 0$ since
$|x| \in (0, 1)$. Hence, by the definition of the limit, $\lim_{n\to\infty} x^n = 0$. We prove (3) by
observing that if $x > 1$, then $y = x^{-1} \in (0, 1)$ and so by (1), $\lim_{n\to\infty} y^n = 0$. Hence
for any $M > 0$, there exists an $N \in \mathbb{N}$ such that $y^n \le 1/M$, for all $n \ge N$. That is,
$x^n \ge M$, $n \ge N$, and so $(x^n)$ diverges to $+\infty$. □
As an immediate corollary of Lemmas 2.3.24 and 2.2.8, we have
Lemma 2.3.25 Let $(a_n)$ be a sequence. Suppose that there exist $C \ge 0$ and $r \in (0, 1)$ such that
\[ 0 \le |a_n| \le C r^n, \]
for all sufficiently large $n$. Then $(a_n)$ converges and $\lim_{n\to\infty} a_n = 0$.
Example 2.3.26 If $\alpha \in \mathbb{Q}$ and $r \in (-1, 1)$, then the sequence $(n^\alpha r^n)$ is convergent
with limit zero. We may assume that $r \ne 0$ and $\alpha > 0$ (since $0 < n^\alpha \le 1$ if $\alpha \le 0$).
Set $x_n = n^\alpha r^n$. We have
\[ \left|\frac{x_{n+1}}{x_n}\right| = \left(1 + \frac{1}{n}\right)^{\alpha} |r|, \quad n \ge 1. \]
Choose $N \in \mathbb{N}$ so that $(1 + \frac{1}{N})^{\alpha} |r| \le (|r| + 1)/2 < 1$. Since $((1 + \frac{1}{n})^{\alpha})$ is a decreasing
sequence, we have
\[ \left(1 + \frac{1}{n}\right)^{\alpha} |r| \le (|r| + 1)/2, \quad n \ge N. \]
Therefore for $n \ge N$ we have
\[ |x_n| \le |x_N| \left(\frac{|r| + 1}{2}\right)^{n-N} = |x_N|\, 2^N (|r| + 1)^{-N} \left(\frac{|r| + 1}{2}\right)^{n} = C \left(\frac{|r| + 1}{2}\right)^{n}. \]
The result follows from Lemma 2.3.24 since $|r| + 1 < 2$.
The examples we have given so far have a rational limit (zero or one) and follow
from relatively simple arguments. We end this section with examples where the limit
is not rational and elementary arguments no longer suffice to prove convergence.
Examples 2.3.27
(1) Let $A > 0$. Define the sequence $(x_n)$ inductively by $x_1 = 1$, $x_{n+1} = \frac{1}{2}\bigl(x_n + \frac{A}{x_n}\bigr)$,
$n \ge 1$. Clearly $(x_n)$ is a sequence of strictly positive real numbers and if $A \in \mathbb{Q}$,
then $x_n \in \mathbb{Q}$ for all $n \in \mathbb{N}$. Recall that if $a, b > 0$, then $(a + b)^2 \ge 4ab$ with
equality iff $a = b$. Applying this inequality with $a = x_n$, $b = A/x_n$, we get
\[ x_{n+1}^2 \ge A, \quad n \ge 1, \]
with equality iff $x_n^2 = A$. Since we assumed $x_1 = 1$, we have $x_n^2 \ge A$ for all
$n \ge 2$. A simple computation shows that
\[ x_{n+1} - x_n = \frac{x_n - x_{n-1}}{2} \left(1 - \frac{A}{x_n x_{n-1}}\right), \quad n \ge 3. \]
Since $x_n^2, x_{n-1}^2 \ge A$, $n \ge 3$, we have $x_n x_{n-1} \ge A$ and so $1 - \frac{A}{x_n x_{n-1}} \ge 0$, for all
$n \ge 3$. Hence, the sign of $x_{n+1} - x_n$ is the same as that of $x_n - x_{n-1}$ for all $n \ge 3$.
Starting with $x_1 = 1$, we have $x_2 = (1 + A)/2$ and we compute that
\[ x_3 - x_2 = \frac{A - x_2^2}{2 x_2} = -\frac{(A - 1)^2}{4(1 + A)} \le 0. \]
Hence $(x_n)_{n \ge 3}$ is a decreasing sequence bounded below by 0. It follows from
Theorem 2.3.18 that $(x_n)$ converges. If we set $\lim_{n\to\infty} x_n = z$, we have $z^2 \ge A$
(since $(x_n^2)$ is bounded below by $A$). Since $\lim_{n\to\infty} x_{n+1} = \lim_{n\to\infty} x_n = z$, we
have
\[ z = \lim_{n\to\infty} \frac{1}{2}\left(x_n + \frac{A}{x_n}\right) = \frac{1}{2}\left(z + \frac{A}{z}\right). \]
That is, $z^2 = A$. Hence $z = \sqrt{A}$, the positive square root of $A$.
Notice that if $A \in \mathbb{Q}$, then $(x_n) \subset \mathbb{Q}$ but the limit is typically irrational; for
example, this is so if $A \in \mathbb{N}$ is not a perfect square.
The convergence to $\sqrt{A}$ is fast: the iteration is based on Newton's method
applied to the function $f(x) = x^2 - A$. In the exercises we give similar
methods for constructing $x^{1/p}$, $x > 0$, and the reciprocal $x^{-1}$. Not only do these
methods give the existence of general roots of positive reals, they also give rapid
numerical computation of the roots and division by real numbers.
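(2) Suppose $x_n = (1 + \frac{1}{n})^n$, $n \ge 1$. By the binomial theorem
\[ \left(1 + \frac{1}{n}\right)^{n} = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{2!} \frac{1}{n^2} + \cdots + \frac{1}{n^n} = 1 + 1 + \sum_{j=2}^{n} K_n(j) \ge 2, \]
where
\[ K_n(j) = \frac{1}{j!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{j-1}{n}\right). \]
For fixed $j$, $K_n(j)$ increases with $n$ as do the number of terms in the expansion
of $(1 + \frac{1}{n})^n$. Hence $(1 + \frac{1}{n})^n$ is an increasing sequence and must either converge
or diverge to $+\infty$. But since $K_n(j) < \frac{1}{j!}$, we have, for $n \ge 2$,
\[ \left(1 + \frac{1}{n}\right)^{n} < 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!} < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n} < 3, \]
proving that $(1 + \frac{1}{n})^n$ is bounded and therefore converges with limit in
$(2.5, 3]$. In Chap. 3 (Proposition 3.5.7), we show that the limit is $e$. See also
Exercises 2.9.10(5) where we give a less elementary proof that uses properties
of the logarithm.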
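Both examples can be checked in exact rational arithmetic. The sketch below is illustrative only: the choice $A = 2$, the number of iterates, and the function names `newton_sqrt_iterates` and `e_sequence` are assumptions. It verifies the inequalities used above for the first few terms: $x_n^2 \ge A$ for $n \ge 2$, the tail of $(x_n)$ decreasing, and $(1 + \frac{1}{n})^n$ increasing and below 3.

```python
from fractions import Fraction

def newton_sqrt_iterates(A, count):
    """x_1 = 1, x_{n+1} = (x_n + A/x_n)/2, computed exactly in Q."""
    xs = [Fraction(1)]
    for _ in range(count - 1):
        xs.append((xs[-1] + A / xs[-1]) / 2)
    return xs

A = Fraction(2)
xs = newton_sqrt_iterates(A, 7)          # xs[0] is x_1, xs[1] is x_2, ...

squares_above_A = all(x * x >= A for x in xs[1:])      # x_n^2 >= A for n >= 2
decreasing_tail = all(xs[k] >= xs[k + 1] for k in range(1, len(xs) - 1))
error = xs[-1] ** 2 - A                  # exact value of x_7^2 - A

def e_sequence(n):
    """(1 + 1/n)^n as an exact rational."""
    return (1 + Fraction(1, n)) ** n

es = [e_sequence(n) for n in range(1, 60)]
increasing = all(es[k] < es[k + 1] for k in range(len(es) - 1))
below_three = all(v < 3 for v in es)
```

Every iterate is rational, as noted in the text, and after seven terms `error` is already positive but below $10^{-40}$, which gives a feel for the speed of Newton's method.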
EXERCISES 2.3.28
(1) Prove Lemma 2.3.10.
(2) Construct a countable set of open intervals $(a_j, b_j)_{\mathbb{Q}}$ of rational numbers, $a_j < b_j \in \mathbb{Q}$,
with a common point $x_0$, such that $\bigcup_{j \ge 1} (a_j, b_j)_{\mathbb{Q}}$ is not an open interval
in $\mathbb{Q}$ (see Examples 2.3.14(2) for the definition of $(a, b)_{\mathbb{Q}}$).
(3) Find an example of a countably infinite bounded subset $A$ of $\mathbb{R}$ such that
(a) $\sup(A) \in A$.
(b) For every $\varepsilon > 0$, $\exists x \in A$ such that $x > \sup(A) - \varepsilon$.
(c) $\inf(A) \notin A$.
(d) For every $\varepsilon > 0$, $\exists x \in A$ such that $x < \inf(A) + \varepsilon$.
Which of (a,b,c,d) could hold if $A$ were finite? (Hint: construct $(a_n)$ as the
union of two sequences $(a_{2n})$, $(a_{2n-1})$. You should give explicit definitions for
$a_{2n}$ and $a_{2n-1}$.)
(4) Find an explicit example of a countably infinite subset $A = \{a_n \mid n \in \mathbb{N}\}$ of $\mathbb{R}$
such that the following four properties hold:
(a) $\sup(A) = +\infty$.
(b) $\inf(A) = 0 \notin A$.
(c) For every $\varepsilon > 0$, $\exists x \in A$ such that $x < \inf(A) + \varepsilon$.
(d) If $(a_{n_k})$ is a subsequence of $(a_n)$, then either $\lim_{k\to\infty} a_{n_k} = 0$ or
$\lim_{k\to\infty} a_{n_k} = +\infty$ or $(a_{n_k})$ is not convergent.
(You should construct $A$ to be the union of two sequences, one diverging to
$+\infty$, the other converging to 0. See also the hint for the previous question.)
(5) Construct an explicit example of a countably infinite subset $A = \{a_n \mid n \in \mathbb{N}\}$
of $\mathbb{R}$ such that the following four properties hold:
(a) $\inf(A) = -\infty$.
(b) $\sup(A) = 1 \notin A$.
(c) For every $\varepsilon > 0$, $\exists x \in A$ such that $x > \sup(A) - \varepsilon$.
(d) If $(a_{n_k})$ is a subsequence of $(a_n)$, then either $\lim_{k\to\infty} a_{n_k} = 1$ or
$\lim_{k\to\infty} a_{n_k} = -\infty$ or $(a_{n_k})$ is not convergent. Construct explicit subsequences to show that each of these possibilities can occur.
(You should construct $A$ to be the union of three sequences.)
(6) Construct an explicit example of a countably infinite subset $A = \{a_n \mid n \in \mathbb{N}\}$
of $\mathbb{R}$ such that the following four properties hold:
(a) $\sup(A) = +\infty$.
(b) $\inf(A) = -\infty$.
(c) If $p \in \mathbb{Z}$, there exists a sequence of distinct points $(a_n)$ of $A$ such that
$\lim_{n\to\infty} a_n = p$.
(d) If $(a_n)$ is a convergent sequence of distinct points of $A$, then $(a_n)$ converges
to an integer.
(You should construct $A$ as a countable union of countable sets. Note the use
of the word ‘distinct’. Without that we could just take $A = \mathbb{Z}$. Why?)
(7) Let $y > 0$ and $a > 0$ be such that $ay < 2$. Define the sequence $(x_n)$ by $x_1 = a$,
$x_{n+1} = 2x_n - x_n^2 y$, $n \in \mathbb{N}$. Show that $(x_n)$ converges to $y^{-1}$. (Hints and
comments: The maximum value of $f(x) = 2x - x^2 y$ occurs at $x = y^{-1}$ and
$f(0) = f(2y^{-1}) = 0$. Show that $x_n \le y^{-1}$, for all $n \ge 2$, and that $(x_n)_{n \ge 2}$ is an
increasing sequence. The construction is based on Newton's method and gives
a rapidly convergent method for approximating the reciprocal of a real number
and so division by real numbers.)
(8) Let $p \in \mathbb{N}$, $p \ge 2$ and $y \in \mathbb{R}$, $y > 0$. This exercise proves the existence of
the positive $p$th root $y^{1/p}$ of $y$ by giving an explicit method for the computation
of $y^{1/p}$. Choose $a > 0$ satisfying $a^p < (p + 1)y$. Define the sequence $(x_n)$
by $x_1 = a$, $x_{n+1} = x_n \dfrac{(p+1) - x_n^p y^{-1}}{p}$, $n \in \mathbb{N}$. Show that $(x_n)$ converges and that
the limit is $y^{1/p}$ (that is, $(\lim_{n\to\infty} x_n)^p = y$). Show also that this is the unique
positive $p$th root of $y$.
(9) Show that if the sequence $(x_n)$ converges to $a$, then the sequence $\bigl(\frac{x_1 + \cdots + x_n}{n}\bigr)$ of
arithmetic means converges to $a$. Show also that if $(x_n)$ is a sequence of strictly
positive numbers with limit $a > 0$, then we have a similar result for geometric
means:
\[ \lim_{n\to\infty} \sqrt[n]{x_1 x_2 \cdots x_n} = a. \]
What can you say if $a = 0$? (Hint for the first part: start by proving the case
$a = 0$. Hint for the second part: use logarithms.)
(10) Using the results of the previous question, show that
(a) $\lim_{n\to\infty} \dfrac{1 + \frac{1}{2} + \cdots + \frac{1}{n}}{n} = 0$.
(b) $\lim_{n\to\infty} \sqrt[n]{n} = 1$.
(c) $\lim_{n\to\infty} \dfrac{1 + \sqrt{2} + \cdots + \sqrt[n]{n}}{n} = 1$.
(11) Define the sequence $(x_n)$ by $x_0 = 1$,
\[ x_{n+1} = 1 + \frac{1}{x_n}, \quad n \ge 0. \]
Show that
(a) $(x_n)$ is a sequence of rational numbers.
(b) $1 < x_n < 2$ for all $n > 1$ (note that $x_1 = 2$).
(c) $x_0 < x_2 < x_4 < \cdots < x_{2n} < \cdots < x_{2m+1} < \cdots < x_5 < x_3 < x_1$,
($n, m \ge 3$).
Verify $\lim_{n\to\infty} x_n$ exists and is irrational. (Hints for (a,b,c): Induction, induction, induction. The limit is known as the golden mean or golden ratio. The
denominators of $x_n$ give the Fibonacci sequence.)
(12) The examples and exercises we have given so far may suggest that sequences
$(x_n)_{n \ge 0}$ defined recursively ($x_{n+1} = f(x_n)$, $n \ge 0$) typically converge. This is
far from the case. As an example, suitable for computer experimentation, the
reader may investigate the sequence $x_{n+1} = L_\lambda(x_n) \stackrel{\mathrm{def}}{=} \lambda x_n (1 - x_n)$, where
$x_0 \in [0, 1]$ ($L_\lambda$ is called the logistic map). Provided $\lambda \in [0, 4]$, $(x_n) \subset [0, 1]$.
The sequence $(x_n)$ converges provided $\lambda \in [0, 3]$. However, for $\lambda \in (3, 4]$,
the sequence $(x_n)$ typically does not converge and exhibits ever more complex
behaviour, including randomness (or ‘chaos’), as $\lambda$ approaches 4 (for more
details and references we refer to the text by Strogatz [28, Chap. 10]).
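For the computer experimentation suggested in Exercise (12), a minimal starting point might look as follows. This is a sketch under assumptions: the parameter values 2.5, 3.2 and 3.9, the starting point $x_0 = 0.2$, the number of steps, and the name `logistic_orbit` are all illustrative choices, and floating-point arithmetic is used rather than exact arithmetic.

```python
def logistic_orbit(lam, x0, steps):
    """Iterate x_{n+1} = lam * x_n * (1 - x_n) and return [x_0, ..., x_steps]."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(lam * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

settled = logistic_orbit(2.5, 0.2, 2000)   # lam in [0, 3]: the orbit converges
cycling = logistic_orbit(3.2, 0.2, 2000)   # lam = 3.2: the orbit alternates between two values
chaotic = logistic_orbit(3.9, 0.2, 2000)   # lam near 4: irregular behaviour

print(settled[-3:])
print(cycling[-4:])
print(chaotic[-6:])
```

For $\lambda = 2.5$ the orbit settles at the fixed point $1 - 1/\lambda = 0.6$; for $\lambda = 3.2$ the even and odd terms approach two different values, so the sequence does not converge. Varying $\lambda$ between 3 and 4 is a natural next experiment.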
2.4 The Bolzano–Weierstrass Theorem
Theorem 2.4.1 (Bolzano–Weierstrass Theorem) If $X$ is an infinite bounded subset of $\mathbb{R}$, then there exists a convergent sequence $(x_n)$ consisting of distinct points
of $X$.
Proof Since $X$ is bounded, there exists a closed interval $I_0 = [a_0, b_0]$ containing $X$.
We construct a sequence of closed intervals $I_n = [a_n, b_n]$, $n \ge 0$, with the following
properties
(1) $I_{n+1} \subset I_n$, $n \ge 0$.
(2) $(a_n)$ is an increasing sequence, $(b_n)$ is a decreasing sequence.
(3) $a_n < b_n$, all $n \ge 0$.
(4) $|I_n| \stackrel{\mathrm{def}}{=} |b_n - a_n| = 2^{-n} |b_0 - a_0|$, $n \ge 0$.
(5) $X \cap I_n$ is infinite, $n \ge 0$.
Our construction of $(I_n)$ is inductive. When $n = 0$, conditions (3,4,5) are
automatically satisfied (conditions (1,2) are empty). So suppose we have constructed
intervals $I_0, \ldots, I_n$ satisfying (1–5). Let $J = [a_n, \frac{a_n + b_n}{2}]$, $K = [\frac{a_n + b_n}{2}, b_n]$. Note that
$|J| = |K| = \frac{1}{2} |I_n| = 2^{-(n+1)} |b_0 - a_0|$ by (4). Since $J \cup K = I_n$ and $I_n \cap X$ is
infinite, one (at least) of $J \cap X$, $K \cap X$ must be infinite. Choose one of $J, K$ so that the
intersection is infinite. Denote the corresponding interval by $I_{n+1} = [a_{n+1}, b_{n+1}]$.
Since $[a_{n+1}, b_{n+1}] \subset [a_n, b_n]$, we have $a_n \le a_{n+1} < b_{n+1} \le b_n$. This completes the
inductive step and the construction of the intervals $I_n$.
Since $(a_n)$ is bounded above by $b_m$, $m \ge 0$, and $(b_n)$ is bounded below by $a_m$, $m \ge 0$, both sequences $(a_n)$, $(b_n)$ converge by Theorem 2.3.18 and $a_m \le \lim_{n\to\infty} a_n \le \lim_{n\to\infty} b_n \le b_m$, for all $m \ge 0$. Applying (4), we see that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n$.
It remains to construct a convergent sequence $(x_n)$ of distinct points of $X$. Since
$I_0 \cap X$ is infinite, we can choose $x_0 \in I_0 \cap X$. Proceeding inductively, suppose we have
constructed distinct points $x_j \in I_j \cap X$, $0 \le j \le n$. Since $I_{n+1} \cap X$ is infinite, we can
choose $x_{n+1} \in (I_{n+1} \cap X) \setminus \{x_0, \ldots, x_n\}$. This completes the inductive construction
of $(x_n)$. Since $x_n \in I_n$, $n \ge 0$, we have
\[ a_n \le x_n \le b_n, \quad n \ge 0, \]
and so, by the squeezing lemma, $(x_n)$ is convergent. □
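The subdivision argument can be imitated on a computer, with one important caveat: a program can only inspect finitely many points, so "the half containing infinitely many points of $X$" has to be replaced by a stand-in. The sketch below assumes the stand-in "the half containing at least as many sample points"; the sample $X = \{\sin n\}$, the depth, and the name `bisect_accumulate` are likewise assumptions for illustration, not part of the proof.

```python
import math

def bisect_accumulate(points, depth):
    """Finite stand-in for the proof: halve [a, b], keep a half holding the most sample points."""
    a, b = min(points), max(points)
    live = list(points)                   # sample points still inside the current interval
    chosen, intervals = [], []
    for _ in range(depth):
        mid = (a + b) / 2
        left = [p for p in live if p <= mid]
        right = [p for p in live if p >= mid]
        if len(left) >= len(right):       # the proof keeps a half meeting X in infinitely many points
            b, live = mid, left
        else:
            a, live = mid, right
        intervals.append((a, b))
        fresh = [p for p in live if p not in chosen]
        if not fresh:                     # the finite sample is exhausted; an infinite X never is
            break
        chosen.append(fresh[0])           # a new point of X inside the current interval
    return intervals, chosen

X = [math.sin(n) for n in range(1, 20001)]    # finite sample of a bounded infinite set
intervals, xs = bisect_accumulate(X, 10)
```

The returned intervals are nested with lengths halving at each step, and the chosen points are distinct with the $k$th point lying in the $k$th interval, mirroring properties (1), (4) and the final construction in the proof.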
Remark 2.4.2 Theorem 2.4.1 fails if we work over the rational numbers. The
condition that X is bounded is also necessary. A simple counterexample is given
by X D N.
We have a very useful application of Theorem 2.4.1 to sequences.
Proposition 2.4.3 Let $(x_n)$ be a bounded sequence. Then there exists a convergent
subsequence $(x_{n_k})$ of $(x_n)$.
Proof An instructive direct proof of the result can be given along the lines of the
proof of Theorem 2.4.1 (see the exercises at the end of the section). Here we present
a proof using Theorem 2.4.1. Set $X = \{x_n \mid n \in \mathbb{N}\}$. First observe that the result is
easy if $X$ is finite: we can choose $(x_{n_k})$ to be a constant sequence. Suppose that $X$ is
infinite. By Theorem 2.4.1, we can pick a convergent sequence $(z_j)$ of distinct points
of $X$. For each $j \in \mathbb{N}$ there exists a unique smallest $m_j \in \mathbb{N}$ such that $z_j = x_{m_j}$. If $m_1 <
m_2 < \cdots$ we are done. If not, set $M = \{m_j \mid j \in \mathbb{N}\}$. Take $n_1 = m_{j(1)} = \min M$ and
proceed inductively. Assume we have defined $n_1 = m_{j(1)} < \cdots < n_k = m_{j(k)}$, where
$j(1) < \cdots < j(k)$. Define $n_{k+1} = m_{j(k+1)} = \min\{m_j \in M \mid m_j > n_1, \ldots, n_k,\ j > j(k)\}$.
This defines a subsequence $(x_{n_k})$ of $(x_n)$ which is also a subsequence of $(z_j)$.
Hence, by Lemma 2.2.12, $(x_{n_k})$ converges (with limit $\lim_{n\to\infty} z_n$). □
2.4.1 Continuous Functions
We start by recalling the standard definition of a continuous function.
Definition 2.4.4 If $x_0 \in X \subset \mathbb{R}$ and $f : X \to \mathbb{R}$, then $f$ is continuous at $x_0$ if for
every $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ |f(x) - f(x_0)| < \varepsilon, \quad \text{whenever } x \in X \text{ and } |x - x_0| < \delta. \]
We say $f$ is continuous on $X$, or just continuous, if $f$ is continuous at every point
of $X$.
Remarks 2.4.5
(1) For most of our initial applications, X will either be an open or closed interval
of R. Later we will need to work with more general subsets of R.
(2) The definition of continuity has some unpleasant and subtle features. For
example, it appears to require the verification of uncountably many conditions
(that is, for each $\varepsilon > 0$). However, as in Lemma 2.2.5, we can easily
show that it suffices to verify the conditions just for $\varepsilon = 10^{-n}$, $n \ge 1$
(indeed, any sequence converging to zero will do). We give below an alternative,
but equivalent, formulation of continuity known as sequential continuity that
is, in many cases, much easier to work with. In spite of the simplifications
obtained either by working with a countable set of conditions or with sequential
continuity, the fact remains that the concept of continuity is highly non-intuitive.
Contrary to the often made suggestion that the graph of a continuous function is
what one gets by ‘drawing a line without breaks’, the reality is that the graph of
a ‘typical’ continuous function is very jagged on all scales and the function
is nowhere differentiable. In practice, the functions usually encountered in
analysis and its applications have more structure than just continuity. Finally,
there is a far more elegant and natural definition of continuity that applies in
many contexts (including algebra) and which avoids the arid and uninformative
$\varepsilon, \delta$ notation. This definition does, however, require another significant layer of
abstraction. We revisit this issue later when we discuss metric spaces in Chap. 7.
(3) As in Lemma 2.2.5, we can replace $< \varepsilon$ in the definition by $\le \varepsilon$ (of course we
cannot replace the condition that $\delta$ is strictly positive).
Definition 2.4.6 If $x_0 \in X \subset \mathbb{R}$ and $f : X \to \mathbb{R}$, then $f$ is sequentially continuous at
$x_0$ if for every sequence $(x_n) \subset X$ converging to $x_0$ we have
\[ \lim_{n\to\infty} f(x_n) = f(x_0). \]
We say $f$ is sequentially continuous on $X$ if $f$ is sequentially continuous at every
point of $X$.
Remark 2.4.7 At first sight the definition of sequential continuity requires even
more to be checked than does the definition of continuity. However, the power of
the definition lies in the application to convergent sequences. If $f$ is continuous and
$x_n \to x_0$ then $f(x_n) \to f(x_0)$. As we shall see, this property is very useful, especially
in a context where we can apply the Bolzano–Weierstrass theorem (or its corollary
Proposition 2.4.3).
Example 2.4.8 Let $X = \mathbb{Z} \subset \mathbb{R}$. Every function $f : \mathbb{Z} \to \mathbb{R}$ is sequentially
continuous. This follows since if $(x_n) \subset \mathbb{Z}$ is convergent to $x_0 \in \mathbb{Z}$ then $(x_n)$ is
eventually constant (that is, there exists an $N \in \mathbb{N}$ such that $x_n = x_N$, for all $n \ge N$).
Of course, it is easy to give a direct proof that $f : \mathbb{Z} \to \mathbb{R}$ is continuous: take $\delta < 1$
in Definition 2.4.4.
Theorem 2.4.9 If $x_0 \in X \subset \mathbb{R}$ and $f : X \to \mathbb{R}$, then $f$ is continuous at $x_0$ iff $f$ is
sequentially continuous at $x_0$.
Proof We start by proving that if $f$ is continuous at $x_0$ then $f$ is sequentially
continuous at $x_0$. We have to show that given a sequence $(x_n)$ converging to $x_0$ and
$\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $|f(x_n) - f(x_0)| < \varepsilon$, for all $n \ge N$. Since
$f$ is continuous at $x_0$, there exists a $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever
$|x - x_0| < \delta$ (here, and below, we always assume without further comment that
$x \in X$). Since $(x_n)$ converges to $x_0$, there exists an $N \in \mathbb{N}$ such that $|x_n - x_0| < \delta$, for
all $n \ge N$. But then $|f(x_n) - f(x_0)| < \varepsilon$, for all $n \ge N$. Hence $\lim_{n\to\infty} f(x_n) = f(x_0)$.
It remains to prove the trickier converse that the sequential continuity of $f$ at $x_0$
implies the continuity of $f$ at $x_0$. We prove this by contradiction. Suppose that $f$ is
not continuous at $x_0$. This means that there must be an $\varepsilon_0 > 0$ for which we cannot
find any $\delta > 0$ satisfying the conditions of the continuity definition. Hence, taking
$\delta = 1/n$, we can find an $x_n \in X$ such that $|x_n - x_0| < 1/n$ and $|f(x_0) - f(x_n)| \ge \varepsilon_0$.
By construction $\lim_{n\to\infty} x_n = x_0$ and so, by sequential continuity, $\lim_{n\to\infty} f(x_n) = f(x_0)$.
But this implies there exists an $N \in \mathbb{N}$ such that $|f(x_0) - f(x_n)| < \varepsilon_0$, for all
$n \ge N$, and so contradicts our assumption that $|f(x_0) - f(x_n)| \ge \varepsilon_0$, for all $n \in \mathbb{N}$.
Hence $f$ must be continuous at $x_0$. □
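The second half of the proof produces, for a discontinuous function, a sequence $x_n \to x_0$ along which $f(x_n)$ stays at distance at least $\varepsilon_0$ from $f(x_0)$. The sketch below makes this concrete for a step function; the functions `step` and `square`, the point $x_0 = 0$ and the sequence $x_n = 1/n$ are assumptions chosen for illustration. A finite computation of this kind can exhibit a failure of continuity, but it cannot prove continuity.

```python
def step(x):
    """f(x) = 0 for x <= 0 and 1 for x > 0; discontinuous at x_0 = 0."""
    return 1.0 if x > 0 else 0.0

def square(x):
    """g(x) = x^2; continuous everywhere."""
    return x * x

x0 = 0.0
xs = [1.0 / n for n in range(1, 2001)]                  # x_n = 1/n converges to x_0 = 0

step_gaps = [abs(step(x) - step(x0)) for x in xs]        # stays equal to 1: take eps_0 = 1
square_gaps = [abs(square(x) - square(x0)) for x in xs]  # tends to 0 along the sequence
```

Along this sequence the gaps for `step` never drop below 1, so `step` is not sequentially continuous at 0, while the gaps for `square` shrink as expected.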
With these preliminaries out of the way, we can now prove a result that gives the
basic properties of a continuous function defined on a closed interval.
Theorem 2.4.10 Let $f : [a, b] \to \mathbb{R}$ be continuous ($-\infty < a \le b < \infty$). Then
(1) $f([a, b])$ is a bounded subset of $\mathbb{R}$ (“continuous functions are bounded on closed
bounded intervals”).
(2) If $m = \inf(f([a, b]))$, $M = \sup(f([a, b]))$, then there exist $x_m, x_M \in [a, b]$ such
that $f(x_m) = m$, $f(x_M) = M$ (“a continuous function on a closed and bounded
interval attains its bounds”).
(3) $f([x_m, x_M]) = [m, M]$. In particular, $f([a, b]) \supset [f(a), f(b)]$ (the intermediate
value theorem).
Proof
(1) Suppose that $f$ is not bounded above on $[a, b]$. Then for each $n \in \mathbb{N}$, there
exists an $x_n \in [a, b]$ such that $f(x_n) \ge n$. Applying Proposition 2.4.3, we can
choose a convergent subsequence $(x_{n_k})$ of $(x_n)$. Let $\lim_{k\to\infty} x_{n_k} = x^\star \in [a, b]$.
By sequential continuity, $\lim_{k\to\infty} f(x_{n_k}) = f(x^\star)$. But the sequence $(f(x_{n_k}))$
is unbounded by construction and so cannot converge. Contradiction. Hence $f$
must be bounded above on $[a, b]$. Applying this result to $-f$ shows that $f$ is
bounded below on $[a, b]$.
(2) Set $M = \sup(f([a, b]))$. For each $n \in \mathbb{N}$, there exists an $x_n \in [a, b]$ such that
$f(x_n) > M - 1/n$ (definition of the supremum). Using Proposition 2.4.3 again,
we can pick a convergent subsequence $(x_{n_k})$ of $(x_n)$. If $\lim_{k\to\infty} x_{n_k} = x^\star$, then
by sequential continuity we have $f(x^\star) = M$. The result for the infimum is
obtained by applying the result to $-f$.
(3) We have to prove that $f([x_m, x_M]) = [m, M]$. Without loss of generality assume
$x_m < x_M$: if $x_m = x_M$, then $m = M$ and $f$ is constant; if $x_m > x_M$, replace $f$ by
$-f$. We will show that for every $z \in (m, M)$ we can find a (least) $x^\star \in (x_m, x_M)$
such that $f(x^\star) = z$.
The basic idea is to look at the set $X$ of points $x \in [x_m, x_M]$ such that $f < z$
on $[x_m, x]$ (see Fig. 2.1). Clearly, $X \ne [x_m, x_M]$ (otherwise $f(x) < z < M$ for
all $x \in [x_m, x_M]$, contradicting $f(x_M) = M$). So we expect there is a first point $x^\star \in [x_m, x_M]$ for which
$f(x^\star) \not< z$. Since $f(x) < z$ for $x < x^\star$, we expect continuity to give us $f(x^\star) = z$.
Now for the details.
[Fig. 2.1 Proving the intermediate value theorem: the graph of $f$ over $[a, b]$ and the horizontal line $y = z$, with $m$, $f(a)$, $z$, $f(b) = M$ marked on the vertical axis and $a$, $x_m$, $x$, $b = x_M$ on the horizontal axis]
Given $z \in (m, M)$, define
\[ X = \{x \in [x_m, x_M] \mid f(\xi) < z, \text{ all } \xi \in [x_m, x]\}. \]
Clearly, $X \ne \emptyset$ since $f(x_m) = m < z$ and so $x_m \in X$. Since $x_M$ is an upper
bound of $X$, $x^\star = \sup(X)$ exists and $x^\star \le x_M$. Choose a sequence $(x_n) \subset X$
such that $x_n \to x^\star$. By sequential continuity, $\lim_{n\to\infty} f(x_n) = f(x^\star)$. Since
$f(x_n) < z$ for all $n$, we have $f(x^\star) \le z$. If $f(x^\star) = z$ we are done. If not, then
$f(x^\star) < z$, and so $x^\star < x_M$. By the continuity of $f$, we can find $\delta > 0$ such
that $[x^\star - \delta, x^\star + \delta] \subset [x_m, x_M]$ and $f(x) < z$ for all $x \in [x^\star - \delta, x^\star + \delta]$. Since
$x^\star - \delta \in X$, $f(x) < z$ on $[x_m, x^\star - \delta]$ and so $f < z$ on $[x_m, x^\star + \delta]$. Therefore,
$x^\star + \delta \in X$, contradicting the definition of $x^\star$ as the supremum of $X$. Hence
$f(x^\star) = z$.
Finally, we need to show that $f([a, b]) \supset [f(a), f(b)]$. This is obvious since
$f(a), f(b) \in [m, M]$ and $f([a, b]) = [m, M]$. □
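The intermediate value theorem guarantees a solution of $f(x) = z$; to approximate one numerically, the usual tool is bisection. Note that this is a different technique from the supremum argument used in the proof above: it is the repeated-halving idea from the proof of Theorem 2.4.1. The following is a minimal sketch, assuming $f$ is continuous with $f(a) < 0 < f(b)$; the name `bisect_root`, the tolerance and the test function $2x^2 - 1$ are illustrative choices.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Assumes f is continuous on [a, b] with f(a) < 0 < f(b); returns x with f(x) close to 0."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < 0:
            a = mid          # invariant: f(a) < 0
        else:
            b = mid          # invariant: f(b) >= 0
    return (a + b) / 2

root = bisect_root(lambda x: 2 * x * x - 1, 0.0, 1.0)
```

Here `root` approximates $1/\sqrt{2}$, which is irrational. The loop maintains a sign change across $[a, b]$, so the nested intervals shrink onto a point where, by continuity, $f$ vanishes.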
Remarks 2.4.11
(1) The proofs of (1,2) are a little different (and easier) than the proofs given in
many texts. We indicate alternative proofs of these results in the exercises.
(2) Note that Theorem 2.4.10 implies that a continuous function f W R ! R maps
closed bounded intervals to closed bounded intervals. In general, a continuous
function f W R ! R maps intervals to intervals but does not necessarily
map open intervals to open intervals or unbounded closed intervals to closed
intervals. See the exercises.
Examples 2.4.12
(1) All three parts of Theorem 2.4.10 fail if we work over the rational numbers
or consider real-valued functions defined on intervals of rational numbers. For
example, if we set $[0, 1]_{\mathbb{Q}} = [0, 1] \cap \mathbb{Q}$ and define $f : [0, 1]_{\mathbb{Q}} \to \mathbb{Q}$ (or $\mathbb{R}$) by
$f(x) = 2x^2 - 1$, then $0 \notin f([0, 1]_{\mathbb{Q}})$ even though $f(0) = -1 < 0 < 1 = f(1)$.
(2) Let $f : [a, b] \to \mathbb{R}$ be continuous and satisfy either $f([a, b]) \subset [a, b]$ or
$f([a, b]) \supset [a, b]$. Then $f$ has a fixed point. That is, there exists an $x^\star \in [a, b]$
such that $f(x^\star) = x^\star$. To see this, define $g(x) = f(x) - x$. Suppose that
$f([a, b]) \subset [a, b]$. Then $f(a) \ge a$ and so $g(a) \ge 0$. Similarly, $f(b) \le b$ and so $g(b) \le 0$. Hence,
by the intermediate value theorem $0 \in g([a, b])$ and there exists an $x^\star \in [a, b]$
such that $g(x^\star) = f(x^\star) - x^\star = 0$. We leave the proof of the second statement
to the exercises.
We conclude this review of continuous functions with a definition and result that
shows the utility of working with sequential continuity.
Definition 2.4.13 If $X$ is a non-empty subset of $\mathbb{R}$ and $f : X \to \mathbb{R}$, then $f$ is
uniformly continuous if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ |f(x) - f(y)| < \varepsilon, \quad \text{whenever } x, y \in X \text{ and } |x - y| < \delta. \]
Remark 2.4.14 A uniformly continuous function on $X$ is continuous. In terms of the
definition of continuity, uniform continuity implies that $\delta > 0$ can be chosen to be
independent of $x_0 \in X$.
Theorem 2.4.15 Every continuous real-valued function defined on a closed and
bounded interval $[a, b]$ is uniformly continuous.
Proof Suppose $f : [a, b] \to \mathbb{R}$ is continuous but not uniformly continuous. If $f$ is
not uniformly continuous, there exists $\varepsilon_0 > 0$ such that for every $\delta > 0$, there is a
pair $x, y \in [a, b]$, with $|x - y| < \delta$ and $|f(x) - f(y)| \ge \varepsilon_0$. Choose $\delta = 1/n$, $n \in \mathbb{N}$.
Then for each $n \in \mathbb{N}$, we can find points $x_n, y_n \in [a, b]$ such that
\[ |f(x_n) - f(y_n)| \ge \varepsilon_0, \quad \text{and} \quad |x_n - y_n| < \frac{1}{n}. \]
By Proposition 2.4.3, $(x_n) \subset [a, b]$ has a convergent subsequence, say $(x_{n_k})$. Let
$\lim_{k\to\infty} x_{n_k} = x^\star \in [a, b]$. Since $|x_{n_k} - y_{n_k}| < 1/n_k$, we have
\[ |x^\star - y_{n_k}| = |(x^\star - x_{n_k}) + (x_{n_k} - y_{n_k})| \le |x^\star - x_{n_k}| + |x_{n_k} - y_{n_k}| < |x^\star - x_{n_k}| + \frac{1}{n_k}. \]
Letting $k \to \infty$, we see that $(y_{n_k})$ is convergent with limit $x^\star$. By the sequential
continuity of $f$, we have $\lim_{k\to\infty} f(x_{n_k}) = \lim_{k\to\infty} f(y_{n_k}) = f(x^\star)$ and hence
$\lim_{k\to\infty} |f(x_{n_k}) - f(y_{n_k})| = 0$, contradicting our assumption that $|f(x_n) - f(y_n)| \ge \varepsilon_0$, all $n \in \mathbb{N}$. Hence $f$ must be uniformly continuous. □
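The pairs $(x_n, y_n)$ in the proof cannot exist on a closed bounded interval, but they do exist once the hypothesis is dropped. As a hedged illustration (the function $1/x$ on the open interval $(0, 1)$ and the witnesses $x_n = 1/(n+1)$, $y_n = 1/(n+2)$ are assumptions chosen for the example, and floating-point arithmetic is used), the sketch below tabulates $|x_n - y_n|$ and $|f(x_n) - f(y_n)|$.

```python
def f(x):
    return 1.0 / x

# Hypothetical witnesses in (0, 1): x_n = 1/(n+1), y_n = 1/(n+2).
pairs = [(1.0 / (n + 1), 1.0 / (n + 2)) for n in range(1, 1001)]

input_gaps = [abs(x - y) for x, y in pairs]          # equals 1/((n+1)(n+2)), tends to 0
output_gaps = [abs(f(x) - f(y)) for x, y in pairs]   # equals 1 up to rounding, for every n
```

The input gaps satisfy $|x_n - y_n| < 1/n$ while the output gaps stay at 1, so with $\varepsilon_0 = 1$ no single $\delta$ works on $(0, 1)$. On $[a, b]$ the Bolzano–Weierstrass argument above rules such pairs out; here the sequences converge to 0, which lies outside the domain.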
EXERCISES 2.4.16
(1) Let $(x_n)$ be a bounded sequence of real numbers. Give a proof based on the
subdivision method used in the proof of the Bolzano–Weierstrass theorem to
show that $(x_n)$ has a convergent subsequence. (For your proof you should not
need to distinguish the cases where $\{x_n \mid n \in \mathbb{N}\}$ is finite or infinite as a
subset of $\mathbb{R}$.)
(2) Find a countable infinite subset $X$ of $\mathbb{R}$ such that if $(x_n) \subset X$ is convergent,
then $(x_n)$ is eventually constant and the limit of $(x_n)$ lies in $X$. ($(x_n)$ is eventually
constant if $\exists x$, $\exists N \in \mathbb{N}$ such that $x_n = x$, $n \ge N$. Eventually constant sequences
always converge.)
(3) Let $X$ be a non-empty subset of $\mathbb{R}$. We say that $x \in \mathbb{R}$ is a closure point of $X$
if we can find a sequence $(x_n) \subset X$ which converges to $x$. Denote the set of
closure points of $X$ by $\overline{X}$. Why is it true that $X \subset \overline{X}$?
(a) Find an example of a countably infinite unbounded set $X$ of $\mathbb{R}$ such that
$\overline{X} = X$.
(b) Find an example of a countably infinite bounded subset $X$ of $\mathbb{R}$ such that
$\overline{X} = X$.
(c) Find an example of a countably infinite bounded subset $X$ of $[0, 1]$ such that
$\overline{X} \setminus X = \{0, \frac{1}{2}, 1\}$.
(d) Find an example of a countably infinite subset $X$ of $\mathbb{R}$ such that $\overline{X} = \mathbb{R}$.
(4) By Theorem 2.4.10, if f is a continuous R-valued map on a closed and
bounded interval, then f is bounded and attains its bounds. Show by means
of examples that each of the conditions continuous, closed, and bounded is
necessary for either of the conclusions bounded, attains its bounds to hold.
(5) Show that a continuous function $f : \mathbb{R} \to \mathbb{R}$ maps intervals to intervals (an
interval may be half-open, open or closed and bounded or unbounded). Find an
example where a bounded open interval is mapped to a closed interval. Does
the closed interval have to be bounded?
(6) Deduce part (2) of Theorem 2.4.10 from part (1) by assuming that the upper
bound $M$ of $f$ is not attained and considering the function $1/(M - f(x))$.
(7) Suppose $f : [a,b] \to \mathbb{R}$ is continuous and $f(a) < 0 < f(b)$. Show, using the
subdivide and rule method of the proof of Theorem 2.3.12, that there exists a
solution of $f(x) = 0$. (Hint and comments. Replacing $f(x)$ by $g(x) = f((b-a)x + a)$,
there is no loss of generality in assuming $a = 0$, $b = 1$. Now construct
a sequence $(x^N = 0.x_1 \cdots x_N)$ of decimal truncations such that
$f(x^N) < 0 \le f(x^N + 10^{-N})$. Simon Stevin used a similar method to show that a polynomial
which changed sign had a root.)
(8) Suppose that $I, J$ are non-empty closed bounded intervals of $\mathbb{R}$, $f : I \to \mathbb{R}$ is
continuous and $f(I) \supset J$. Show that there exists a closed interval $I^\star \subset I$ such
that $f(I^\star) = J$. Is the result true if $I$ or $J$ is not bounded? Prove it or find
counterexample(s).
(9) Is Theorem 2.4.9 true if we work over the rational numbers?
(10) Suppose that $f : \mathbb{R} \to \mathbb{R}$ and that for every sequence $(x_n)$ of real numbers
diverging to $+\infty$, we have $\lim_{n\to\infty} f(x_n) = a$. Prove that $\lim_{x\to\infty} f(x) = a$.
Is the converse true?
(11) Complete the analysis for the second case in Examples 2.4.12(2).
(12) Show by means of examples that a continuous map $f : [0,1]_{\mathbb{Q}} \to \mathbb{R}$ need not
be uniformly continuous.
(13) Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) = \begin{cases} 10^{-s}, & \text{if } x = \tfrac{r}{s} \in \mathbb{Q},\ (r,s) = 1,\ s > 0, \text{ and } s = 1 \text{ if } r = 0; \\ 0, & \text{if } x \notin \mathbb{Q}. \end{cases}$$
Prove that $f$ is continuous at $x$ iff $x$ is irrational. (Hint: You may assume that
every rational $r$ can be approximated by irrationals, for example $r + 10^{-n}\sqrt{2}$; that will
help you prove that $f$ is not continuous at rational points.)
(14) True or false? In each case either prove the result or provide a simple explicit
counterexample.
(a) If $f : [0,1]_{\mathbb{Q}} \to \mathbb{Q}$ is continuous, then $f$ is bounded.
(b) If $f : [0,1]_{\mathbb{Q}} \to \mathbb{Q}$ is continuous and bounded, then $f$ attains its bounds.
(c) The intermediate value theorem holds for continuous functions
$f : [a,b]_{\mathbb{Q}} \to \mathbb{Q}$.
How would your answers change if instead we looked at continuous maps
$f : [0,1] \to \mathbb{Q}$? (Be advised: the answers change! Hint: since $\mathbb{Q} \subset \mathbb{R}$, every
continuous $f : [0,1] \to \mathbb{Q}$ determines a continuous $\mathbb{R}$-valued map $F : [0,1] \to \mathbb{R}$
with image $f([0,1])$ consisting of rational numbers.)
(15) Show that $f(x) = 1/x$ is not uniformly continuous on $(0,1)$ but is uniformly
continuous on $(1,2)$.
(16) Find examples of functions $f : \mathbb{R} \to \mathbb{R}$ which are (a) uniformly continuous,
(b) not uniformly continuous.
(17) A common proof of Theorem 2.4.10(1) proceeds along the following lines. Let
$X = \{x \in [a,b] \mid f \text{ bounded on } [a,x]\}$. Show that (a) $X \ne \emptyset$, (b) $\sup(X) \in X$,
(c) $\sup(X) = b$. Fill in the details and use similar methods to prove part (2)
of Theorem 2.4.10. (Comment: one defect of this approach is that it does not
extend well to functions defined on more general sets, for example, subsets of
$\mathbb{R}^n$, since it makes use of the order structure on $\mathbb{R}$.)
2.4.2 Cauchy Sequences
Equipped with the Bolzano–Weierstrass theorem we can now give a satisfactory
intrinsic definition of a convergent sequence which does not depend on knowing the
limit.
Definition 2.4.17 A sequence $(x_n)$ of real numbers is a Cauchy sequence if for
every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
$$|x_m - x_n| < \varepsilon, \quad \text{for all } m, n \ge N.$$
If $(x_n)$ is Cauchy, we write $\lim_{m,n\to\infty} |x_m - x_n| = 0$.
Remarks 2.4.18
(1) Roughly speaking, a Cauchy sequence has the property that terms in the
sequence eventually all get arbitrarily close to one another.
(2) As in Lemma 2.2.5, we can replace $< \varepsilon$ by $\le \varepsilon$ and it is enough to test the truth
of the definition for any sequence $(\varepsilon_m)$ of strictly positive numbers converging
to zero.
Example 2.4.19 Let $x \in \mathbb{R}$. The sequence of decimal truncations $(x^N = x_0.x_1 \cdots x_N)$
to $x$ defines a Cauchy sequence: $|x^M - x^N| \le 10^{-M}$, $M \le N$.
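The bound in Example 2.4.19 can be checked with exact rational arithmetic. The following is a minimal sketch, assuming we take $x = \sqrt{2}$ and compute its truncations with an integer square root; the function names are illustrative and not part of the text. Each truncation is rational, yet none of them squares to 2, which is the situation described later for Cauchy sequences in $\mathbb{Q}$ with no rational limit.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(n: int) -> Fraction:
    """Decimal truncation of sqrt(2) to n places, as an exact rational.

    floor(sqrt(2) * 10**n) equals isqrt(2 * 10**(2n)), so no floats are used.
    """
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


def truncation_bound_holds(max_n: int) -> bool:
    """Check |x^M - x^N| <= 10^(-M) for all 0 <= M <= N <= max_n."""
    truncs = [sqrt2_truncation(n) for n in range(max_n + 1)]
    return all(
        abs(truncs[M] - truncs[N]) <= Fraction(1, 10 ** M)
        for M in range(max_n + 1)
        for N in range(M, max_n + 1)
    )


if __name__ == "__main__":
    print([str(sqrt2_truncation(n)) for n in range(5)])
    print(truncation_bound_holds(30))
```

A finite check of this kind only illustrates the estimate; the inequality for all $M \le N$ follows from the definition of a truncation.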
We need the following elementary lemma about Cauchy sequences (this result is
also true if we work over Q).
Lemma 2.4.20 Let $(x_n)$ be a sequence of real numbers.
(1) If $(x_n)$ is Cauchy, then $\{x_n \mid n \in \mathbb{N}\}$ is a bounded subset of $\mathbb{R}$.
(2) If $(x_n)$ is convergent, then $(x_n)$ is Cauchy.
(3) If $(x_n)$ is Cauchy and $(x_n)$ has a convergent subsequence, then $(x_n)$ is convergent.
Proof
(1) Take $\varepsilon = 1$ in Definition 2.4.17. Then there exists an $N \in \mathbb{N}$ so that
$|x_m - x_n| \le 1$, for all $m, n \ge N$. Taking $m = N$, we see that $|x_n - x_N| \le 1$ for all $n \ge N$ and
so $|x_n| \le |x_N| + 1$, $n \ge N$. Hence $|x_n| \le \max\{|x_1|, \ldots, |x_{N-1}|, |x_N| + 1\}$ for all
$n \ge 1$, proving that $\{x_n \mid n \in \mathbb{N}\}$ is a bounded subset of $\mathbb{R}$.
(2) Suppose $\lim_{n\to\infty} x_n = x^\star$. Let $\varepsilon > 0$. Since $(x_n)$ converges to $x^\star$ we can choose
$N \in \mathbb{N}$ such that $|x^\star - x_n| < \varepsilon/2$, $n \ge N$. We have
$$|x_m - x_n| = |x^\star - x_m + x_n - x^\star| \le |x^\star - x_m| + |x^\star - x_n| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \quad \text{if } m, n \ge N.$$
(3) Finally, suppose that $(x_{n_k})$ is a convergent subsequence of $(x_n)$ with limit $x^\star$.
Given $\varepsilon > 0$, we can choose $N_1 \in \mathbb{N}$ such that $|x^\star - x_{n_k}| < \varepsilon/2$, provided
$n_k \ge N_1$ (it is easier to work with $n_k$ here as opposed to the index $k$). Since
$(x_n)$ is Cauchy, we can choose $N_2 \in \mathbb{N}$ so that $|x_m - x_n| < \varepsilon/2$, $m, n \ge N_2$. Set
$N = \max\{N_1, N_2\}$. For all $n, n_k \in \mathbb{N}$ we have
$$|x^\star - x_n| = |x^\star - x_{n_k} + x_{n_k} - x_n| \le |x^\star - x_{n_k}| + |x_{n_k} - x_n|.$$
Fix $n_k \ge N$. Then for all $n \ge N$, we have $|x^\star - x_{n_k}| + |x_{n_k} - x_n| < \varepsilon/2 + \varepsilon/2 = \varepsilon$,
proving that $(x_n)$ converges to $x^\star$. □
We can now state and prove our main result on Cauchy sequences.
Theorem 2.4.21 A sequence $(x_n)$ of real numbers is convergent iff $(x_n)$ is Cauchy.
Proof By Lemma 2.4.20(2), if $(x_n)$ is convergent, then $(x_n)$ is Cauchy. Conversely,
if $(x_n)$ is Cauchy then by Lemma 2.4.20(1), $(x_n)$ is bounded and so, by Proposition 2.4.3,
$(x_n)$ has a convergent subsequence. Apply Lemma 2.4.20(3). □
Remarks 2.4.22
(1) Theorem 2.4.21 fails over $\mathbb{Q}$. Indeed, the sequence of finite decimal approximations
to an irrational number provides an example of a non-convergent Cauchy
sequence in $\mathbb{Q}$.
(2) We can use Theorem 2.4.21 as the basis for a more intrinsic (though perhaps
less transparent) definition of the real numbers that does not depend on working
to a particular base. Specifically, consider the set $\mathcal{C}$ of all Cauchy sequences of
rational numbers. We define an equivalence relation $\sim$ on $\mathcal{C}$ by $(x_n) \sim (y_n)$
iff $\lim_{n\to\infty} |x_n - y_n| = 0$. In particular, if one or other sequence converges,
then both do with the same limit. We define the set of real numbers as the
set of equivalence classes and then prove Theorem 2.4.21 directly without
recourse to the Bolzano–Weierstrass theorem. Modulo the abstraction of using
equivalence classes, what we are doing with this general construction is defining
real numbers by (all of) their rational approximations. For more details, we refer
to the appendix at the end of the chapter.
Multiplication and Division Revisited Once we know Cauchy sequences of real
numbers converge it is easy to define the operations of multiplication and division
on R.
Suppose $x, y \in \mathbb{R}$. Let $(x_n)$, $(y_n)$ be sequences of rational numbers converging to
$x$, $y$ respectively. We want to define $xy = \lim_{n\to\infty} x_n y_n$. For this to work we need to
check that (a) $(x_n y_n)$ is convergent and (b) the limit of $(x_n y_n)$ is independent of the
choice of sequences $(x_n)$, $(y_n)$ converging to $x$, $y$. We verify (a) and leave (b) to the
exercises. For (a) we prove that $(x_n y_n)$ is a Cauchy sequence. For this, observe that
$$|x_m y_m - x_n y_n| \le |x_m y_m - x_m y_n| + |x_m y_n - x_n y_n| = |x_m||y_m - y_n| + |y_n||x_m - x_n|.$$
By Lemma 2.4.20(1), there exists an $M > 0$ such that $|x_m|, |y_n| \le M$, for all $n, m \in \mathbb{N}$.
Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ such that $|y_m - y_n|, |x_m - x_n| < \frac{\varepsilon}{2M}$, $m, n \ge N$. We
have
$$|x_m y_m - x_n y_n| \le |x_m||y_m - y_n| + |y_n||x_m - x_n| < M\frac{\varepsilon}{2M} + M\frac{\varepsilon}{2M} = \varepsilon, \quad \text{for } m, n \ge N.$$
Hence $(x_n y_n)$ is Cauchy.
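The estimate above can be tried out on concrete rational sequences. The sketch below is an illustration under stated assumptions: it takes $(x_n)$ and $(y_n)$ to be the decimal truncations of $\sqrt{2}$ and $\sqrt{3}$ (computed exactly with an integer square root), uses the bound $M = 2$, and checks that for $m, n \ge N$ the product gap is at most $4 \cdot 10^{-N}$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def sqrt_truncation(a: int, n: int) -> Fraction:
    """n-place decimal truncation of sqrt(a), as an exact rational."""
    return Fraction(isqrt(a * 10 ** (2 * n)), 10 ** n)


def product_gap(m: int, n: int):
    """Return (|x_m y_m - x_n y_n|, |x_m||y_m - y_n| + |y_n||x_m - x_n|)."""
    xm, xn = sqrt_truncation(2, m), sqrt_truncation(2, n)
    ym, yn = sqrt_truncation(3, m), sqrt_truncation(3, n)
    lhs = abs(xm * ym - xn * yn)
    rhs = abs(xm) * abs(ym - yn) + abs(yn) * abs(xm - xn)
    return lhs, rhs


def cauchy_estimate_holds(N: int, span: int) -> bool:
    """Check lhs <= rhs <= 4 * 10^(-N) for all m, n in [N, N + span].

    Here |x_n|, |y_n| <= M = 2 and truncations from index N on differ by
    at most 10^(-N), so rhs <= 2 * M * 10^(-N).
    """
    eps = Fraction(4, 10 ** N)
    for m in range(N, N + span + 1):
        for n in range(N, N + span + 1):
            lhs, rhs = product_gap(m, n)
            if not (lhs <= rhs <= eps):
                return False
    return True


if __name__ == "__main__":
    print(cauchy_estimate_holds(5, 10))
    p = sqrt_truncation(2, 12) * sqrt_truncation(3, 12)
    print(float(6 - p ** 2))  # small and positive: the products approach sqrt(6)
```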
We use exactly the same process to define division of real numbers. Finally,
we may deduce all the standard laws of arithmetic for real numbers from the
corresponding laws for rational numbers. For example, the distributive law
$x(y + z) = xy + xz$ follows from
$$x(y + z) = \lim_{n\to\infty} x_n(y_n + z_n) = \lim_{n\to\infty} x_n y_n + \lim_{n\to\infty} x_n z_n = xy + xz.$$
EXERCISES 2.4.23
(1) Verify that the geometric sequence $(r^n)$ is Cauchy if $|r| < 1$.
(2) Find examples of sequences $(x_n)$ such that
(a) for every $p \in \mathbb{N}$, $\lim_{n\to\infty} |x_{n+p} - x_n| = 0$, (b) $(x_n)$ is not Cauchy. (Hint:
try $x_n = \log(n + 1)$.)
(3) Suppose that $(x_n)$ is a sequence of real numbers and there exists a $k \in (0,1)$
such that $|x_{n+1} - x_n| < k|x_n - x_{n-1}|$ for all $n \ge 2$. Show that $(x_n)$ is a Cauchy
sequence.
Show by means of an example that if we allow $k \in (0,1)$ to depend on $n$, then
this result may fail and $(x_n)$ may not converge. (Hint: Look for an increasing
sequence $(x_n)$ which diverges to $+\infty$ but for which $x_n/n \to 0$.)
(4) Complete the proof of the definition of multiplication on $\mathbb{R}$ by showing that the
limit of $(x_n y_n)$ is the same for all rational sequences $(x_n)$ converging to $x$ and
$(y_n)$ converging to $y$.
(5) Show how we define division by non-zero real numbers.
(6) For $n \ge 1$, define $S_n = \sum_{j=1}^{n} (-1)^{j+1} j^{-2}$. Prove that $(S_n)$ is a Cauchy sequence
and hence that the infinite series $\sum_{n=1}^{\infty} (-1)^{n+1} n^{-2}$ converges.
(7) Suppose that $f : \mathbb{Q} \to \mathbb{R}$ is uniformly continuous. Show that there exists a
unique continuous function $F : \mathbb{R} \to \mathbb{R}$ such that $F(x) = f(x)$ for all $x \in \mathbb{Q}$ (we
say “$F$ is a continuous extension of $f$ to $\mathbb{R}$”). Find an example of a continuous
(but not uniformly continuous) function $g : \mathbb{Q} \to \mathbb{R}$ which does not extend to $\mathbb{R}$.
(Hint for first part: Show that every Cauchy sequence $(q_n) \subset \mathbb{Q}$ is mapped by $f$
to a Cauchy sequence $(f(q_n)) \subset \mathbb{R}$. Remember to verify that $F$ is well-defined
and does not depend on the particular choice of Cauchy sequence.)
2.5 lim sup and lim inf
2.5.1 Sequences
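Suppose that $(x_n) \subset \mathbb{R}$ is a bounded sequence. For $n \ge 1$, let $X_n = \{x_m \mid m \ge n\}$.
We define
$$\alpha_n = \inf(X_n), \quad \beta_n = \sup(X_n).$$
Since $(x_n)$ is a bounded sequence, we have
$$-\infty < \alpha_n \le \beta_n < +\infty,$$
for all $n \ge 1$. Moreover, since $X_1 \supset X_2 \supset \cdots$, we have
$$\alpha_1 \le \alpha_2 \le \cdots \le \alpha_n \le \cdots \le \beta_n \le \cdots \le \beta_2 \le \beta_1.$$
It follows by Theorem 2.3.18 that $\lim_{n\to\infty} \alpha_n$ and $\lim_{n\to\infty} \beta_n$ exist and that
$$\lim_{n\to\infty} \alpha_n \le \lim_{n\to\infty} \beta_n. \tag{2.1}$$
We define $\liminf x_n = \lim_{n\to\infty} \alpha_n$ and $\limsup x_n = \lim_{n\to\infty} \beta_n$. Alternative (and
commonly used) notations are $\overline{\lim}\, x_n$ for $\limsup x_n$ and $\underline{\lim}\, x_n$ for $\liminf x_n$.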
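To make the monotone sequences $(\alpha_n)$ and $(\beta_n)$ concrete, here is a small sketch for the illustrative sequence $x_n = (-1)^n (1 + 1/n)$, which is an assumption of this example and not taken from the text. The infimum and supremum of each tail are computed over a finite window; for this particular sequence both are attained within the first two terms of the tail, so the window values are exact, but for a general sequence a finite window only gives approximations.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """Illustrative bounded sequence x_n = (-1)^n (1 + 1/n), n >= 1."""
    return (-1) ** n * (1 + Fraction(1, n))


def tail_bounds(n: int, horizon: int):
    """Return (min, max) of x_m over n <= m <= horizon.

    These stand in for alpha_n = inf X_n and beta_n = sup X_n; they are
    exact here because the even terms decrease to 1 and the odd terms
    increase to -1, so the extremes occur at the start of the tail.
    """
    tail = [x(m) for m in range(n, horizon + 1)]
    return min(tail), max(tail)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        alpha, beta = tail_bounds(n, 1000)
        print(n, float(alpha), float(beta))
```

For this sequence the values $\alpha_n$ increase towards $-1$ and the values $\beta_n$ decrease towards $1$, so $\liminf x_n = -1$ and $\limsup x_n = 1$.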
Lemma 2.5.1 If $(x_n)$ is a bounded sequence of real numbers, then
(1) $\liminf x_n \le \limsup x_n$.
(2) $\liminf x_n = \limsup x_n$ iff $(x_n)$ is convergent. If $(x_n)$ is convergent then the limit
of $(x_n)$ must be the common value of $\liminf x_n$ and $\limsup x_n$.
(3) There exists a subsequence of $(x_n)$ converging to $\liminf x_n$. Similarly for
$\limsup x_n$.
(4) If $(x_{n_k})$ is a convergent subsequence of $(x_n)$, then
$$\lim_{k\to\infty} x_{n_k} \in [\liminf x_n, \limsup x_n].$$
Proof (1) is immediate from (2.1). The remainder of the proof is left to the
exercises. □
Remarks 2.5.2
(1) Lemma 2.5.1(3) gives an alternative proof of Proposition 2.4.3.
(2) We may define lim sup and lim inf for unbounded sequences if we give
$\limsup x_n = +\infty$ and $\liminf x_n = -\infty$ the obvious meanings.
Example 2.5.3 Suppose that $(x_n)$ is a Cauchy sequence. By Lemma 2.4.20, $(x_n)$ is
bounded. It follows from the definition of Cauchy sequence that for every $\varepsilon > 0$,
we can choose $N \in \mathbb{N}$ so that $|\inf\{x_n \mid n \ge N\} - \sup\{x_n \mid n \ge N\}| < \varepsilon$. Hence
$\liminf x_n = \limsup x_n$ and $(x_n)$ is convergent by Lemma 2.5.1(2).
2.5.2 Functions, Continuity, lim sup and lim inf
We start with a review of one-sided limits. Most of this material should be familiar
to the reader. For simplicity we usually assume the domain is a closed interval
but everything we say extends to general intervals: open or closed, bounded or
unbounded. Some of the results we prove can be used to extend the range of
applicability of the Riemann integral as well as to develop the theory of the
Riemann–Stieltjes integral (see also the exercises at the end of the section and the
exercises in section “Appendix: The Riemann Integral”).
Let $f : [a,b] \to \mathbb{R}$ be bounded. Given $x_0 \in [a,b]$, let $\lim_{x\to x_0-}$ signify the limit as
$x$ approaches $x_0$ from the left and $\lim_{x\to x_0+}$ denote the limit as $x$ approaches $x_0$ from
the right. If $\lim_{x\to x_0-} f(x)$ exists we set $\lim_{x\to x_0-} f(x) = f(x_0-)$ and similarly define
$f(x_0+)$. If $x_0 = a$ (respectively $b$), we only consider the limit $\lim_{x\to a+}$ (respectively,
$\lim_{x\to b-}$).
Lemma 2.5.4 (Notation and Assumptions as Above) The function $f$ is continuous
at $x_0$ iff the one-sided limits $\lim_{x\to x_0\pm} f(x) = f(x_0\pm)$ both exist and
$$f(x_0-) = f(x_0) = f(x_0+).$$
(The statement is modified in the obvious way at the end-points of $[a,b]$.)
Proof A standard argument, left to the exercises. □
Examples 2.5.5
(1) Define $f : [-1,+1] \to \mathbb{R}$ by
$$f(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0. \end{cases}$$
The map $f$ is continuous except at $x = 0$. We have $f(0-) = -1$, $f(0+) = 1$
and $f(0) = 0$. We refer to the discontinuity at $x = 0$ as a jump discontinuity of
$f$: the limits $\lim_{x\to x_0\pm} f(x)$ exist but are not equal. Whatever the value of $f(x_0)$,
$f$ is not continuous at $x_0$.
(2) Define $f : [-1,+1] \to \mathbb{R}$ by
$$f(x) = \begin{cases} \sin(1/x), & x \ne 0, \\ 0, & x = 0. \end{cases}$$
The map $f$ is continuous except at $x = 0$. Neither of the limits $\lim_{x\to 0\pm} f(x)$
exists.
(3) Suppose $f(x) = x^2$, $x \ne 0$ and $f(0) = 1$. In this case $f(0\pm) = 0 \ne f(0)$.
We refer to $x = 0$ as a removable discontinuity of $f$: if we redefine $f$ so that
$f(0) = 0$, then $f$ will be continuous. Note that neither of the discontinuities in
the previous examples is removable.
As example (2) above shows, not all discontinuities of a function need be jump
discontinuities. In particular, the limits $\lim_{x\to x_0\pm} f(x)$ need not exist. However, since
we are assuming $f$ is bounded, we may use the operations of lim sup and lim inf to
define quantities that reflect the variation in $f$ near a discontinuity point $x_0$. More
precisely, let $x_0 \in [a,b)$. Since $f$ is bounded, we may define
$$\overline{f}(x_0+) \overset{\mathrm{def}}{=} \limsup_{x\to x_0+} f(x) = \lim_{h\to 0+} \sup f((x_0, x_0+h]),$$
$$\underline{f}(x_0+) \overset{\mathrm{def}}{=} \liminf_{x\to x_0+} f(x) = \lim_{h\to 0+} \inf f((x_0, x_0+h]).$$
Since $f$ is bounded, we have $-\infty < \underline{f}(x_0+) \le \overline{f}(x_0+) < +\infty$. If $x_0 \in (a,b]$, we
may similarly define $\overline{f}(x_0-)$, $\underline{f}(x_0-)$, and if $x_0 \in (a,b)$
$$\underline{f}(x_0) = \liminf_{x\to x_0} f(x), \quad \overline{f}(x_0) = \limsup_{x\to x_0} f(x).$$
It follows easily from the definitions that we have the following relations between
these limits.
$$\underline{f}(x_0+) \le \overline{f}(x_0+),$$
$$\underline{f}(x_0-) \le \overline{f}(x_0-),$$
$$\underline{f}(x_0) \le \overline{f}(x_0),$$
$$\overline{f}(x_0) = \max\{f(x_0), \overline{f}(x_0+), \overline{f}(x_0-)\},$$
$$\underline{f}(x_0) = \min\{f(x_0), \underline{f}(x_0+), \underline{f}(x_0-)\}.$$
Example 2.5.6 Define $f : [-1,+1] \to \mathbb{R}$ by
$$f(x) = \begin{cases} 3\max\{0, \sin(1/x)\}, & x > 0, \\ 2 + 5\min\{0, \sin(1/x)\}, & x < 0, \\ 7, & x = 0. \end{cases}$$
In this case we have $\overline{f}(0+) = 3$, $\underline{f}(0+) = 0$, $\overline{f}(0-) = 2$, $\underline{f}(0-) = -3$, $\overline{f}(0) = 7$,
$\underline{f}(0) = -3$. In general, there are no further relationships we can expect between the
various limits.
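The one-sided values in Example 2.5.6 can be estimated numerically. The sketch below is only an approximation under stated assumptions: it samples $f$ at points $x = \pm 1/u$ with $u$ running over one full period $[1/h, 1/h + 2\pi]$ of the sine, so that every sample lies within distance $h$ of $0$, and reports the smallest and largest sampled values. The function and parameter names are hypothetical.

```python
import math


def f(x: float) -> float:
    """The function of Example 2.5.6."""
    if x > 0:
        return 3 * max(0.0, math.sin(1 / x))
    if x < 0:
        return 2 + 5 * min(0.0, math.sin(1 / x))
    return 7.0


def one_sided_extremes(h: float, side: int, samples: int = 10_000):
    """Estimate (inf, sup) of f on (0, h] if side = +1, or on [-h, 0) if side = -1.

    Samples x = side / u for u in [1/h, 1/h + 2*pi]; this covers one full
    period of sin(1/x), so the sampled extremes approximate the true ones.
    """
    u0 = 1 / h
    values = [
        f(side / (u0 + 2 * math.pi * k / samples)) for k in range(samples + 1)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        print(h, one_sided_extremes(h, +1), one_sided_extremes(h, -1))
```

The estimates do not depend on $h$ in any visible way, which reflects the fact that $f$ takes its full range of values on every one-sided neighbourhood of $0$.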
We define three terms which quantify the ‘fluctuation’ or ‘oscillation’ of $f$ at $x_0$:
$$\omega_f(x_0) = \overline{f}(x_0) - \underline{f}(x_0),$$
$$\omega_f(x_0+) = \overline{f}(x_0+) - \underline{f}(x_0+),$$
$$\omega_f(x_0-) = \overline{f}(x_0-) - \underline{f}(x_0-).$$
Lemma 2.5.7 (Notation and assumptions as above.)
(1) If $\omega_f(x_0) = 0$, then $f$ is continuous at $x_0$.
(2) If $\omega_f(x_0+) = 0$, then $f(x_0+)$ exists.
(3) If $\omega_f(x_0-) = 0$, then $f(x_0-)$ exists.
(4) If $\omega_f(x_0\pm) = 0$ and $f(x_0\pm) = f(x_0)$, then $f$ is continuous at $x_0$.
Proof Straightforward and left to the exercises. □
Remarks 2.5.8
(1) We briefly mention the important concepts of upper and lower semi-continuity,
which play a role in many parts of analysis. In our context, a bounded map
$f : [a,b] \to \mathbb{R}$ is upper semi-continuous at $x_0$ if $\overline{f}(x_0) = \limsup_{x\to x_0} f(x) \le f(x_0)$
and lower semi-continuous at $x_0$ if $\underline{f}(x_0) = \liminf_{x\to x_0} f(x) \ge f(x_0)$. It follows
from Lemma 2.5.7(1) that if $f$ is upper and lower semi-continuous at $x_0$, then
$f$ is continuous at $x_0$. If $f$ has a jump discontinuity at $x_0$, then $f$ will be upper
semi-continuous at $x_0$ if $f(x_0) \ge \max\{f(x_0+), f(x_0-)\}$. It may be shown that if
$f : [a,b] \to \mathbb{R}$ is bounded, then $\omega_f$ is upper semi-continuous.
(2) Given $f : [a,b] \to \mathbb{R}$, it was shown by Young [31] that the subset of points of
$[a,b]$ where either $\overline{f}(x_0-) \ne \overline{f}(x_0+)$ or $\underline{f}(x_0-) \ne \underline{f}(x_0+)$ is countable. (See
Exercises 7.11.10(16,17) for an outline proof of Young’s theorem.)
Theorem 2.5.9 Let $I \subset \mathbb{R}$ be an interval and $f : I \to \mathbb{R}$ be monotone and bounded.
Then
(1) For all $x_0 \in I$, $f(x_0+)$ and $f(x_0-)$ exist. In particular, all discontinuities of $f$ are
jump discontinuities.
(2) The set of points where $f$ is discontinuous is a countable subset of $I$.
Proof Without loss of generality assume $f$ is monotone increasing. Since $f$
is increasing, we have $\liminf_{x\to x_0-} f(x) = \lim_{x\to x_0-} f(x) = f(x_0-)$ and
$\limsup_{x\to x_0+} f(x) = \lim_{x\to x_0+} f(x) = f(x_0+)$, proving (1). Let $D_f \subset I$ denote
the set of points of discontinuity of $f$. Since $f$ is increasing, it follows from (1) and
Lemma 2.5.7 that $D_f = \{x \in I \mid \omega_f(x) = f(x+) - f(x-) > 0\}$. Given $n \in \mathbb{N}$, define
$$D_n = \{x \in I \mid \omega_f(x) \ge 1/n\}.$$
Since $f$ is bounded and increasing, $D_n$ is finite for all $n \in \mathbb{N}$
($f(b) - f(a) \ge \sum_{x \in D_n} \omega_f(x)$). The result follows since $D_f = \bigcup_{n \ge 1} D_n$. □
EXERCISES 2.5.10
(1) Complete the proof of Lemma 2.5.1.
(2) Provide the proof of Lemma 2.5.4.
(3) Provide the details of the proof of Lemma 2.5.7 and verify that the conditions
are all necessary. (For example, find a function $f$ which satisfies $\omega_f(x_0\pm) = 0$
and $f(x_0+) = f(x_0-)$, but is not continuous at $x_0$.)
(4) Show that $f : [a,b] \to \mathbb{R}$ is upper semi-continuous at $x_0$ iff $-f : [a,b] \to \mathbb{R}$ is
lower semi-continuous at $x_0$.
(5) Show that $f : [a,b] \to \mathbb{R}$ is upper semi-continuous at $x_0$ iff for all sequences
$(x_n) \subset [a,b]$ converging to $x_0$ we have $\limsup_{n\to\infty} f(x_n) \le f(x_0)$. Formulate
and prove the analogous statements for lower semi-continuity.
(6) Show that
(a) the floor function $f(x) = \lfloor x \rfloor$, which returns the greatest integer $\le x$, is
upper semi-continuous,
(b) the ceiling function $f(x) = \lceil x \rceil$, which returns the smallest integer $\ge x$, is
lower semi-continuous.
(7) Show that if $f : [a,b] \to \mathbb{R}$ is upper semi-continuous, then $f$ is bounded above
on $[a,b]$ and attains its upper bound. Formulate and prove a corresponding
result for lower semi-continuous functions. (Hint: Use the sequence method
used in the proof of Theorem 2.4.10.)
(8) A function $f : [a,b] \to \mathbb{R}$ is of bounded variation if there exists an $M \ge 0$
such that $\sum_{j=0}^{n-1} |f(x_j) - f(x_{j+1})| \le M$ for all finite partitions
$P = \{x_j \mid a = x_0 \le x_1 \le \cdots \le x_n = b\}$ of $[a,b]$. Show that
(a) If $f$ is monotone or continuously differentiable, then $f$ is of bounded
variation.
(b) If $f(x) = \sin(1/x)$, $x \in (0,1]$, $f(0) = 0$, then $f$ is not of bounded
variation. Show also that $xf(x)$ is not of bounded variation (note that $xf(x)$
is differentiable but not continuously differentiable).
(c) If $f$ is of bounded variation, $f(x_0\pm)$ exist for all $x_0 \in [a,b]$ and the set of
discontinuities is countable.
(9) Show that if $f, g : [a,b] \to \mathbb{R}$ are of bounded variation so are $f \pm g$, $fg$.
(10) Let $f : [a,b] \to \mathbb{R}$ be of bounded variation and $x \in [a,b]$. If $P$ is a finite
partition of $[a,x]$, let $V(P) = \sum_{j=0}^{n-1} |f(x_j) - f(x_{j+1})|$ and define
$V_a^x(f) = \sup_P V(P)$ (taken over all finite partitions of $[a,x]$). Show that
(a) $V(x) = V_a^x(f)$ is monotone increasing on $[a,b]$.
(b) $W = V - f$ is monotone increasing on $[a,b]$.
(c) $f$ is of bounded variation if and only if $f$ can be written as the difference
of two monotone functions (either both strictly increasing or both strictly
decreasing).
Can you prove (c) more geometrically if $f$ is $C^1$?
2.6 Complex Numbers
We recall the definitions and elementary properties of complex numbers and extend
results on sequences of real numbers to complex numbers.
2.6.1 Review of Complex Numbers
A complex number $z = x + iy$ may be identified with the point $(x,y) \in \mathbb{R}^2$. Addition
and subtraction of the complex numbers $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ correspond
to vector addition and subtraction in $\mathbb{R}^2$:
$$z_1 \pm z_2 = (x_1 \pm x_2) + i(y_1 \pm y_2).$$
Multiplication of complex numbers is defined by
$$z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + x_2 y_1).$$
On $\mathbb{R}^2$, multiplication is given by $(x_1, y_1)(x_2, y_2) = (x_1 x_2 - y_1 y_2, x_1 y_2 + x_2 y_1)$.
It is easy to check that multiplication is commutative ($z_1 z_2 = z_2 z_1$), associative
($z_1(z_2 z_3) = (z_1 z_2) z_3$) and that the distributive law holds
$$z_1(z_2 + z_3) = z_1 z_2 + z_1 z_3.$$
If we let $\mathbb{C}$ denote the set of complex numbers and identify $\mathbb{C}$ with $\mathbb{R}^2$ by
$z = x + iy \leftrightarrow (x,y)$, then the real numbers $\mathbb{R}$ are naturally defined as the subset
$\{(x,0) \mid x \in \mathbb{R}\}$ of $\mathbb{C}$. We say the complex number $z$ is real if $z = x + i0$. We define
$1 = (1,0) = 1 + i0$, and $0 = (0,0)$. We then have $1z = z1 = z$, and $z + 0 = 0 + z = z$ for all
$z \in \mathbb{C}$.
Since $i^2 = (0,1) \cdot (0,1) = (-1,0) = -1$, we have $i^2 = -1$. A complex number
$z$ is imaginary if $z = iy$ for some $y \in \mathbb{R}$. The square of every imaginary number is
negative.
We define the modulus $|z|$ of $z = x + iy \in \mathbb{C}$ by
$$|z| = \sqrt{x^2 + y^2}.$$
Of course, $|z|$ is the Euclidean length of the vector $(x,y) \in \mathbb{R}^2$. It is straightforward
to verify that $|z_1 z_2| = |z_1||z_2|$, for all $z_1, z_2 \in \mathbb{C}$. If $z \in \mathbb{C}$ is real, then $|z|$ is the
absolute value of $z$. We have the important triangle inequality
$$|z_1 + z_2| \le |z_1| + |z_2|, \quad \text{for all } z_1, z_2 \in \mathbb{C}.$$
Let $c : \mathbb{C} \to \mathbb{C}$ be the real linear map defined by
$$c(x + iy) = x - iy.$$
We refer to $c$ as complex conjugation and write $c(z) = \bar{z}$. Observe that $\bar{z} = z$ iff $z$ is
real and $\bar{z} = -z$ iff $z$ is imaginary. Since $z\bar{z} = (x + iy)(x - iy) = x^2 - i^2 y^2 = x^2 + y^2$,
we have
$$|z|^2 = z\bar{z}.$$
If $|z| = 1$, then $z$ is a point on the unit circle $x^2 + y^2 = 1$.
Referring to Fig. 2.2, $z$ defines a unique $\theta \in [0, 2\pi)$ such that $z = \cos\theta + i\sin\theta$
(that is, the Cartesian coordinates of $z$ are $(\cos\theta, \sin\theta)$). If we define $e^{i\theta} = \cos\theta + i\sin\theta$,
then $e^{-i\theta} = \cos\theta - i\sin\theta$ and so
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$
[Fig. 2.2 Complex number of unit modulus: the point $z = (\cos\theta, \sin\theta)$ on the unit circle $r = 1$ in the $(x,y)$-plane]
Remark 2.6.1 If we substitute $x = i\theta$ in the exponential series $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, then
we obtain the well-known infinite series for $\cos\theta$ and $\sin\theta$ (see Chap. 5).
We may use standard trigonometric identities to verify
$$e^{i0} = 1, \quad \overline{e^{i\theta}} = e^{-i\theta}, \quad e^{i\theta} e^{i\phi} = e^{i(\theta + \phi)}.$$
In particular, for $n \in \mathbb{Z}$ we have De Moivre’s formula
$$(e^{i\theta})^n = e^{in\theta}.$$
We leave to the exercises the proof that if $a \in \mathbb{C}$ and $ae^{i\theta} \ne 1$, then
$$\sum_{p=0}^{n} a^p e^{ip\theta} = \frac{1 - a^{n+1} e^{i(n+1)\theta}}{1 - ae^{i\theta}}.$$
If $z \ne 0$, there exists a unique $\theta \in [0, 2\pi)$ such that
$$z = |z|e^{i\theta}.$$
For this we observe that $u = z/|z|$ lies on the unit circle and so defines a unique
$\theta \in [0, 2\pi)$ as described above. We call $z = |z|e^{i\theta}$ the modulus and argument form
of $z$. If $z = x + iy$, then $r = |z|$, $\theta$ are the polar coordinates of $(x,y)$.
Multiplication takes a particularly simple form if we use the modulus and
argument representation of complex numbers. If $z_1 = |z_1|e^{i\theta_1}$ and $z_2 = |z_2|e^{i\theta_2}$,
then
$$z_1 z_2 = |z_1||z_2| e^{i\theta_1} e^{i\theta_2} = |z_1 z_2| e^{i(\theta_1 + \theta_2)}.$$
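The identities of this section are easy to test numerically. The following sketch uses Python's built-in complex type and the standard `cmath` module; the helper names and the sample values are assumptions made for illustration. It compares the pair formula for multiplication with built-in complex multiplication, and the finite geometric sum with its closed form.

```python
import cmath


def multiply_pairs(p, q):
    """Multiply (x1, y1) and (x2, y2) using the rule on R^2 given above."""
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)


def geometric_sum(a: complex, theta: float, n: int) -> complex:
    """Sum of a^p e^{i p theta} for p = 0, ..., n, computed term by term."""
    return sum(a ** p * cmath.exp(1j * p * theta) for p in range(n + 1))


def geometric_closed_form(a: complex, theta: float, n: int) -> complex:
    """Closed form (1 - w^(n+1)) / (1 - w) with w = a e^{i theta}, w != 1."""
    w = a * cmath.exp(1j * theta)
    return (1 - w ** (n + 1)) / (1 - w)


if __name__ == "__main__":
    print(multiply_pairs((1, 2), (3, -1)), complex(1, 2) * complex(3, -1))
    a, theta, n = 0.5 + 0.25j, 0.7, 10
    print(abs(geometric_sum(a, theta, n) - geometric_closed_form(a, theta, n)))
    # De Moivre: (e^{i theta})^5 and e^{5 i theta} agree up to rounding.
    print(abs(cmath.exp(1j * theta) ** 5 - cmath.exp(5j * theta)))
```

Floating-point arithmetic only confirms these identities up to rounding error, so the differences printed are small but need not be exactly zero.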
2.6.2 Sequences of Complex Numbers
Definition 2.6.2 A sequence $(z_n)$ of complex numbers is convergent if there exists
a $z \in \mathbb{C}$ such that
$$\lim_{n\to\infty} |z - z_n| = 0.$$
We call $z$ the limit of the sequence $(z_n)$ and write $\lim_{n\to\infty} z_n = z$.
Remark 2.6.3 The definition is formally identical to that of convergence of a real
sequence with the proviso that we replace the absolute value by the modulus.
The next lemma allows us to switch easily between real and complex sequences.
Lemma 2.6.4 Let $(z_n)$ be a sequence of complex numbers. If we write $z_n = x_n + iy_n$,
then $(z_n)$ is convergent iff both the real sequences $(x_n)$ and $(y_n)$ are convergent.
Proof Observe that if $z = x + iy$, then
$$|x|, |y| \le |z| \le |x| + |y|. \tag{2.2}$$
Suppose that $(z_n)$ is convergent with limit $z = x + iy$. By definition,
$\lim_{n\to\infty} |z - z_n| = 0$. By the left-hand inequality of (2.2), we have $|x - x_n|, |y - y_n| \le |z - z_n|$,
for all $n \in \mathbb{N}$. Hence, by the squeezing lemma, $\lim_{n\to\infty} |x - x_n| = \lim_{n\to\infty} |y - y_n| = 0$ and
the sequences $(x_n)$, $(y_n)$ converge with respective limits $x$ and $y$. The converse is
equally simple using the right-hand inequality of (2.2). □
Definition 2.6.5 A sequence $(z_n)$ of complex numbers is a Cauchy sequence if
$\lim_{m,n\to\infty} |z_m - z_n| = 0$.
Theorem 2.6.6 A sequence $(z_n)$ is Cauchy iff it is convergent.
Proof We leave the proof, which is an easy consequence of Theorem 2.4.21 and
Lemma 2.6.4, to the exercises. □
Remark 2.6.7 Lemma 2.6.4 and Theorem 2.4.21 allow us to extend many results
on real sequences and series to complex sequences and series. Subsequently, we
usually indicate these extensions in remarks rather than developing the complex
theory separately.
EXERCISES 2.6.8
(1) Verify that $|z_1 z_2| = |z_1||z_2|$ and $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$ for all $z_1, z_2 \in \mathbb{C}$.
(2) Complete the proof of Theorem 2.6.6.
(3) Verify the formula for the sum of a geometric series. (Hint: multiply both sides
by $1 - ae^{i\theta}$.)
(4) A subset $A$ of $\mathbb{C}$ is bounded if there exists a $C \ge 0$ such that $|z| \le C$ for all
$z \in A$. Show that if $A$ is an infinite bounded subset of $\mathbb{C}$ then there exists a
convergent subsequence $(z_n)$ consisting of distinct points of $A$. (Hint: Use the
Bolzano–Weierstrass theorem twice and Lemma 2.6.4.)
(5) Show that every bounded sequence of complex numbers has a convergent
subsequence.
(6) Show that a continuous function $f : [a,b] \to \mathbb{C}$ is bounded and attains its
bounds.
2.7 Appendix: Results from the Differential Calculus
We review some definitions and results from the differential calculus of functions of
one variable. For the results on Taylor’s theorem, we only use an elementary result
from the theory of Riemann integration. Namely that if $f$ has an anti-derivative $F$
($F' = f$), then $\int_a^b f(t)\,dt = F(b) - F(a)$ (see the second appendix for additional
comments).
Definition 2.7.1 Let $I$ be an interval (open or closed, bounded or unbounded). If
$f : I \to \mathbb{R}$ is continuous and $x_0 \in I$, then $f$ is differentiable at $x_0$ if
$\lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}$
exists. We denote the value of the limit by $f'(x_0)$ and call $f'(x_0)$ the derivative of $f$
at $x_0$. If $f$ is differentiable at every point of $I$, we say $f$ is differentiable on $I$.
Remarks 2.7.2
(1) If $x_0 \in I$ is an end-point of $I$, then we take the appropriate one-sided limit. For
example, if $I = [a,b]$, then $f'(a) = \lim_{h\to 0+} \frac{f(a + h) - f(a)}{h}$.
(2) Continuity of $f$ at $x_0$ is implied by the existence of the limit $\lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}$.
The verification is routine.
Easily the most important foundational theorem in the differential calculus is the
mean value theorem. The mean value theorem follows simply from Rolle’s theorem
which we state and prove first.
Theorem 2.7.3 (Bhaskara (1114–1185), Rolle 1691) Let $f : [a,b] \to \mathbb{R}$ be
continuous on $[a,b]$ and differentiable on $(a,b)$. If $f(a) = f(b) = 0$, there exists
a $z \in (a,b)$ such that $f'(z) = 0$.
Proof Either $f$ is constant, in which case $f' \equiv 0$ and we may take $z = (a + b)/2$, or
not. If not, then by the continuity of $f$, $f(x)$ attains minimum and maximum values
$m < M$ on $[a,b]$. Since $f$ is not constant, at least one of $m, M$ is non-zero. Without
loss of generality, suppose $M > 0$ and that $M = f(z)$ where necessarily $z \in (a,b)$. It
is a simple consequence of the definition of derivative that $f'(z) = 0$ (if not, $f$ would
have to take values greater than $M$ close to $z$). □
Theorem 2.7.4 (Mean Value Theorem) Let $f : [a,b] \to \mathbb{R}$ be continuous on $[a,b]$
and differentiable on $(a,b)$. Then there exists a $z \in (a,b)$ such that
$$f(b) - f(a) = f'(z)(b - a).$$
Proof Define $G(x) = (f(b) - f(a))(x - a) - (f(x) - f(a))(b - a)$, $x \in [a,b]$. Then
$G(a) = G(b) = 0$ and $G$ satisfies the conditions of Rolle’s theorem. Therefore
there exists a $z \in (a,b)$ such that $G'(z) = 0$. Observe that
$G'(z) = (f(b) - f(a)) - f'(z)(b - a)$. □
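The mean value theorem asserts that a suitable $z$ exists but does not say how to find it. As an illustration only, the sketch below locates such a point by bisection, under the extra assumptions (not needed for the theorem) that the derivative is supplied as a continuous function `df` and that `df(z) - slope` has opposite signs at the two end-points. The function name and the test function $x^3$ on $[0,2]$ are hypothetical choices.

```python
def mean_value_point(f, df, a: float, b: float, tol: float = 1e-12) -> float:
    """Find z in [a, b] with df(z) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes df is continuous and df - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(z: float) -> float:
        return df(z) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("df - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # a sign change (or a zero) lies in [lo, mid]
        else:
            lo = mid  # otherwise it lies in [mid, hi]
    return (lo + hi) / 2


if __name__ == "__main__":
    # For f(x) = x^3 on [0, 2] the slope is 4, so 3 z^2 = 4.
    z = mean_value_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 2.0)
    print(z, 3 * z ** 2)
```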
Remark 2.7.5 The precise value of $z$ given by the mean value theorem is rarely of
interest. If we set $M = \sup_{x \in (a,b)} |f'(x)|$, then provided $M < \infty$, we obtain the very
useful estimate
$$|f(b) - f(a)| \le M|b - a|. \tag{2.3}$$
This is the form in which we make most use of the mean value theorem and it is also
the form in which it generalizes to functions of several variables (see Chap. 9).
Corollary 2.7.6 Suppose that $f : [a,b] \to \mathbb{R}$ is continuous and differentiable on
$(a,b)$. If $f' = 0$ on $(a,b)$, then $f$ is constant.
Proof If $x \in [a,b]$, then $|f(x) - f(a)| \le 0$, by (2.3). □
2.7.1 Higher Derivatives and Taylor’s Theorem
Let $I$ be an interval. If $f : I \to \mathbb{R}$ is differentiable, let $f' : I \to \mathbb{R}$ denote the derivative
map. We say $f$ is continuously differentiable, or $C^1$, if $f$ is differentiable and
$f' : I \to \mathbb{R}$ is continuous. Proceeding inductively, $f : I \to \mathbb{R}$ is $r$-times continuously
differentiable, or just $C^r$, if the derivative maps $f', \ldots, f^{(r-1)}$ exist and are continuous
on $I$ and the derivative map $f^{(r-1)} : I \to \mathbb{R}$ is differentiable with derivative map
$(f^{(r-1)})' : I \to \mathbb{R}$ continuous. Set $f^{(r)} = (f^{(r-1)})'$. If $f$ is $r$-times continuously
differentiable for all $r \ge 1$, $f$ is said to be smooth or $C^\infty$. We make a special study
of smooth functions later in Chap. 5.
Theorem 2.7.7 (Taylor’s Theorem: Integral Remainder) Let $I$ be a non-empty
open or closed interval, $r \in \mathbb{N}$ and $f : I \to \mathbb{R}$ be $(r + 1)$-times continuously
differentiable. Given $a, x \in I$, we have
$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(r)}(a)}{r!}(x - a)^r + R_r(a,x),$$
where the remainder term $R_r(a,x)$ is given explicitly by
$$R_r(a,x) = \frac{1}{r!} \int_a^x f^{(r+1)}(t)(x - t)^r\,dt = \frac{(x - a)^{r+1}}{r!} \int_0^1 (1 - s)^r f^{(r+1)}(a + s(x - a))\,ds.$$
Proof The proof is by induction. The result is trivially true when $r = 0$ (the integral
is defined in terms of the anti-derivative $f$ of $f'$). So suppose we have shown
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(a)}{i!}(x - a)^i + \frac{1}{k!} \int_a^x f^{(k+1)}(t)(x - t)^k\,dt,$$
where $0 < k < r$. Integrating by parts we have
$$\begin{aligned} \int_a^x f^{(k+1)}(t)(x - t)^k\,dt &= -\frac{1}{k+1} f^{(k+1)}(t)(x - t)^{k+1} \Big|_{t=a}^{x} + \frac{1}{k+1} \int_a^x f^{(k+2)}(t)(x - t)^{k+1}\,dt \\ &= \frac{1}{k+1} f^{(k+1)}(a)(x - a)^{k+1} + \frac{1}{k+1} \int_a^x f^{(k+2)}(t)(x - t)^{k+1}\,dt. \end{aligned}$$
Dividing by $k!$, we see that $R_{k+1}(a,x) = \frac{1}{(k+1)!} \int_a^x f^{(k+2)}(t)(x - t)^{k+1}\,dt$, completing
the inductive step. It remains to prove that the two versions of the remainder term
are equal. This is easily done by means of the substitution $t = a + s(x - a)$ and we
leave the details to the reader. □
Taylor’s theorem with integral remainder will suffice for our later applications.
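We remark that if $[a - \delta, a + \delta] \subset I$, and we set $M_{r+1} = \sup_{t \in [a-\delta, a+\delta]} |f^{(r+1)}(t)|$,
then we have the estimate
$$\left| f(x) - \sum_{i=0}^{r} \frac{f^{(i)}(a)}{i!}(x - a)^i \right| \le M_{r+1} \frac{|x - a|^{r+1}}{r!}, \quad |x - a| \le \delta. \tag{2.4}$$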
The estimate follows easily from the second form for the remainder.
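As a numerical illustration of estimate (2.4), the sketch below takes $f = \exp$, $a = 0$ and $\delta = 1$, which are choices made for this example only. Every derivative of $\exp$ is $\exp$, so $M_{r+1} = e^{\delta}$ on $[-\delta, \delta]$. The function names are hypothetical.

```python
import math


def taylor_poly_exp(x: float, r: int) -> float:
    """Taylor polynomial of exp at a = 0 of degree r, evaluated at x."""
    return sum(x ** i / math.factorial(i) for i in range(r + 1))


def taylor_error_exp(x: float, r: int) -> float:
    """Absolute error |exp(x) - T_r(x)| of the degree-r Taylor polynomial."""
    return abs(math.exp(x) - taylor_poly_exp(x, r))


def remainder_bound(x: float, r: int, delta: float = 1.0) -> float:
    """Right-hand side of (2.4): M_{r+1} |x - a|^(r+1) / r! with a = 0.

    For f = exp, M_{r+1} = sup of exp on [-delta, delta] = exp(delta).
    """
    if abs(x) > delta:
        raise ValueError("the estimate is only claimed for |x - a| <= delta")
    return math.exp(delta) * abs(x) ** (r + 1) / math.factorial(r)


if __name__ == "__main__":
    for r in (1, 3, 5, 8):
        print(r, taylor_error_exp(0.5, r), remainder_bound(0.5, r))
```

In each row the actual error sits below the bound, and both shrink rapidly as $r$ grows.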
Definition 2.7.8 If $f : I \to \mathbb{R}$ is $C^\infty$, then the Taylor series $T_a f$ of $f$ at $a$ is defined
by
$$T_a f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n.$$
Remark 2.7.9 This is a formal definition and we caution the reader that the Taylor
series, even if it converges, may bear little, if any, relation to the values of $f(x)$, when
$x \ne a$ (see Chap. 5). However, the Taylor series does encode information about $f$:
all the derivatives of $f$ at $x = a$.
2.7.2 Other Forms of the Remainder in Taylor’s Theorem
It is possible to express the remainder in a form that assumes weaker conditions on
$f$. In this section we state some characteristic results.
Given that $f : I \to \mathbb{R}$ is $r$-times differentiable at $a$, recall that if $x \in I$, then the
remainder $R_r(a,x)$ is defined by
$$R_r(a,x) = f(x) - \sum_{i=0}^{r} \frac{f^{(i)}(a)}{i!}(x - a)^i.$$
We start by giving a variant of the remainder estimate (2.4).
Theorem 2.7.10 If $f : I \to \mathbb{R}$ is $r$ times differentiable at $a$, then
$$\lim_{x\to a} \frac{|R_r(a,x)|}{|x - a|^r} = 0.$$
(Limit through points of $I$.)
Proof The proof is by induction on $r$ and uses Rolle’s theorem (see Exercise 2.7.12(5)).
Note that the case $r = 1$ is the definition of differentiability at
$x = a$. □
Theorem 2.7.11 (Classical Taylor’s Theorem) Suppose that $f : I \to \mathbb{R}$.
(a) If $f$ is $(r + 1)$-times differentiable on $I$, then for each $a < x \in I$, $\exists\, \xi \in (a,x)$ such
that
$$R_r(a,x) = \frac{f^{(r+1)}(\xi)}{(r + 1)!}(x - a)^{r+1}.$$
(Lagrange form of the remainder.)
(b) If $f$ is $C^{r+1}$, then for each $x \in I$, $\exists\, \xi \in (a,x)$ such that
$$R_r(a,x) = \frac{f^{(r+1)}(\xi)}{r!}(x - \xi)^r (x - a).$$
(Cauchy form of the remainder.)
Proof We give the proof of the Cauchy remainder leaving the Lagrange form to the
exercises. Define $g(y) = \frac{1}{r!} \int_a^y f^{(r+1)}(t)(x - t)^r\,dt$, $y \in [a,x]$. Since $g$ is differentiable
(fundamental theorem of calculus), the mean value theorem implies there exists a
$\xi \in (a,x)$ such that $R_r(a,x) = g(x) - g(a) = (x - a)g'(\xi) = (x - a)\frac{1}{r!} f^{(r+1)}(\xi)(x - \xi)^r$. □
EXERCISES 2.7.12
(1) Suppose that $h : [a,x] \to \mathbb{R}$ is $(r + 1)$ times differentiable and $h^{(j)}(a) = h(x) = 0$,
$0 \le j \le r$. Using induction and Rolle’s theorem, show that there exists
$\xi \in (a,x)$ such that $h^{(r+1)}(\xi) = 0$.
(2) Assume the conditions of Theorem 2.7.11(a). Set $T(t) = \sum_{j=0}^{r} \frac{f^{(j)}(a)}{j!}(t - a)^j$
and define $h : [a,x] \to \mathbb{R}$ by
$$h(t) = f(t) - T(t) - \frac{f(x) - T(x)}{(x - a)^{r+1}}(t - a)^{r+1}, \quad t \in [a,x].$$
Show that $h$ satisfies the conditions of (1) and deduce the Lagrange form of the
remainder in Taylor’s theorem.
(3) Using the Lagrange form of the remainder, show that if $n \in \mathbb{N}$, then
$e = \sum_{j=0}^{n} \frac{1}{j!} + \frac{1}{(n+1)!} e^{\theta}$, where $0 < \theta < 1$. Deduce that $e$ is irrational. (Hint: suppose
$e = p/q$. Choose $n > q$.)
(4) Show that Theorem 2.7.10 follows from Theorem 2.7.11 if $f$ is $C^{r+1}$.
(5) Prove Theorem 2.7.10 (assume only that $f$ is $C^r$).
(Hint. The result is true for $r = 1$. Let $a < b \in I$. Given $x \in [a,b]$, define
$$G(x) = R_r(a,b)(x - a)^r - R_r(a,x)(b - a)^r.$$
Verify $G(a) = G(b) = 0$ and use Rolle’s Theorem and induction on $r$.)
2.8 Appendix: The Riemann Integral
Suppose that f is a continuous function defined on a closed and bounded interval. In
this appendix we show how to define, construct and compute the Riemann integral
of f . We make use of two results: Theorem 2.4.10(1,2) ( f is bounded on a closed
bounded interval and attains its bounds) and the mean value theorem.
Rather than defining the integral of f by approximating upper and lower sums, we
instead state two simple properties that the integral should possess. We show these
properties are reasonable by verifying that they give the correct areas under the
graph of a constant function (area of a rectangle) and under the graph of $y = x$ (area
of a triangle). We then prove that these properties uniquely determine the integral
if it exists. It is then almost a triviality to observe that if $f$ has an anti-derivative $F$
($F' = f$) then the integral from $a$ to $x$ of $f$ exists and is equal to $F(x) - F(a)$ (the
fundamental theorem of calculus). We conclude with an elementary proof of the
main theoretical result that every continuous function defined on a closed interval
has an anti-derivative. This result amounts to an existence theorem for solutions of
the ordinary differential equation $dy/dx = f(x)$. We briefly indicate how to extend our
definition of the integral to include bounded functions with at most countably many
discontinuities.
If f is everywhere positive then we think of the integral of f as the area under
the graph of f . If f is everywhere negative, then the corresponding integral will be
negative and equal to minus the area under the graph of f .
2.8.1 Two Basic Properties Required of the Integral
Let $f$ be a real-valued function with domain $D \subset \mathbb{R}$, where $D$ is an interval which
may be open, closed, half-open or unbounded. We assume $f$ is bounded on all
closed intervals $[a,b] \subset D$. Every continuous function $f$ satisfies this condition
by Theorem 2.4.10.
Definition 2.8.1 A function $I(x,y)$, with domain $D \times D$, is a (definite) integral for
$f$ if
(1) Given $a \le b \le c$, $a, b, c \in D$, we have
$$I(a,c) = I(a,b) + I(b,c). \tag{2.5}$$
(2) Given $a < b$, $a, b \in D$,
$$m(b-a) \le I(a,b) \le M(b-a), \tag{2.6}$$
where $m$ is any lower bound for $f$ on $[a,b]$ and $M$ is any upper bound for $f$ on
$[a,b]$.
Remark 2.8.2 We should emphasize that $I(x,y)$ depends on $f$; we could have
written $I_f$ rather than $I$ but we prefer the simpler notation with the understanding
that the function $f$ remains fixed.
In Fig. 2.3 we show the meaning of (2.5). The condition implies that if we choose
any finite sequence $a = x_0 < x_1 < \dots < x_N = b$, then
$$I(a,b) = \sum_{n=0}^{N-1} I(x_n, x_{n+1}). \tag{2.7}$$
Fig. 2.3 Condition (2.5): $I(a,c) = I(a,b) + I(b,c)$
Turning to the second condition, assume for the moment that $f$ is positive (as in
Fig. 2.4). The first inequality $m(b-a) \le I(a,b)$ of (2.6) says that whatever $I(a,b)$
is, it cannot be smaller than the area of the largest rectangle with base $[a,b]$ that we
can fit under the graph of $f$. Similarly, the second inequality $I(a,b) \le M(b-a)$
implies that $I(a,b)$ can be no larger than the area of the smallest rectangle with base
$[a,b]$ that contains the graph of $f$.
Examples 2.8.3
(1) Let $f(x) = C$ be a constant function. Then $I(a,b) = C(b-a)$: take $m = M = C$
and the result is immediate from (2.6). Notice that this gives the (signed) area
of the rectangle with base $[a,b]$ and height $C$. Of course, if $C > 0$ we get the
usual unsigned area. If $f$ is negative, then $I(a,b)$ is 'signed' and negative.
(2) If $f$ takes positive and negative values there can be cancellation between the
positive and negative parts of $I(a,b)$ and so the inequality (2.6) is weaker; see
Fig. 2.5 where $m = -1$, $M = +1$ and $I(a,b) = 0$ (using (2.5) and the previous
example).
Fig. 2.4 Condition (2.6): the rectangles of area $m(b-a)$ and $M(b-a)$ on the base $[a,b]$
Fig. 2.5 A case where $I(a,b) = 0$: $I(a,b) = I(a,(a+b)/2) + I((a+b)/2,b) = 0$
As we shall soon see, (2.5), (2.6) uniquely characterize the function $I(x,y)$ when
$f$ is continuous.
In practice, it is useful to extend (2.5) to allow for arbitrary triples $a, b, c \in D$.
First note that if $a = b = c$ we get
$$I(a,a) = I(a,a) + I(a,a)$$
and so $I(a,a) = 0$. It follows that if we want to define $I(a,b)$ when $b < a$ we must
take $I(a,b) = -I(b,a)$ since for (2.5) to hold (for $a, b, a$) we need
$$0 = I(a,a) = I(a,b) + I(b,a).$$
With these conventions, it is easy to check that if (2.5) holds then we have
$$I(a,c) = I(a,b) + I(b,c),$$
for all $a, b, c \in D$.
Summarizing, we henceforth suppose that $I(x,y)$ satisfies
(I) For all $a, b, c \in D$ we have
$$I(a,c) = I(a,b) + I(b,c).$$
(II) Given $a < b$, $a, b \in D$,
$$m(b-a) \le I(a,b) \le M(b-a),$$
where $m$ is any lower bound for $f$ on $[a,b]$ and $M$ is any upper bound for $f$ on
$[a,b]$.
Example 2.8.4 Conditions (I, II) allow us to do some simple computations. For
example, suppose $f(x) = x$ and we take $a = 0$, $b = 1$. We compute $I(0,1)$. Take
the subdivision $0, 1/N, 2/N, \dots, (N-1)/N, 1$ of $[0,1]$. Applying (2.7), we have
$$I(0,1) = \sum_{n=0}^{N-1} I\left(\frac{n}{N}, \frac{n+1}{N}\right).$$
On the interval $[\frac{n}{N}, \frac{n+1}{N}]$ we have the bounds $\frac{n}{N} \le x \le \frac{n+1}{N}$. Hence
$n/N^2 \le I(\frac{n}{N}, \frac{n+1}{N}) \le (n+1)/N^2$. Summing from $n = 0$ to $N-1$, we obtain the estimate
$$\frac{1}{N^2}\sum_{n=0}^{N-1} n \;\le\; I(0,1) \;\le\; \frac{1}{N^2}\sum_{n=0}^{N-1} (n+1).$$
The arithmetic progressions $0, 1, \dots, N-1$ and $1, 2, \dots, N$ have respective sums
$N(N-1)/2$ and $N(N+1)/2$ and so
$$\frac{N-1}{2N} \le I(0,1) \le \frac{N+1}{2N}.$$
This estimate holds for all $N \ge 1$. Letting $N \to \infty$, the squeezing lemma implies
that $I(0,1) = \frac{1}{2}$ (the area of the triangle of base 1 and height 1).
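The squeeze in this example can be checked with exact rational arithmetic. The sketch below is only an illustration; the function name `bounds_for_identity` is made up for this example. It sums the per-interval bounds $n/N^2$ and $(n+1)/N^2$ and prints them for a few values of $N$.

```python
from fractions import Fraction

def bounds_for_identity(N):
    """Lower and upper estimates for I(0, 1) when f(x) = x, using N equal subintervals."""
    width = Fraction(1, N)
    lower = sum(Fraction(n, N) * width for n in range(N))       # lower bound m = n/N on each piece
    upper = sum(Fraction(n + 1, N) * width for n in range(N))   # upper bound M = (n+1)/N on each piece
    return lower, upper

for N in (1, 10, 1000):
    lower, upper = bounds_for_identity(N)
    print(N, lower, upper, float(upper - lower))
```

The gap between the two bounds is exactly $1/N$, so both are forced to the common limit $1/2$.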
2.8.2 Existence of $I(x,y)$, Part 1
We now assume that $f$ is continuous. Recall that $f : D \to \mathbb{R}$ has an anti-derivative
$F$ if there exists a differentiable function $F : D \to \mathbb{R}$ such that $F' = f$ on $D$.
Lemma 2.8.5 If $f$ is continuous and has an anti-derivative $F$, then $I(a,b) =
F(b) - F(a)$ satisfies properties I, II.
Proof Define
$$I(a,b) = F(b) - F(a), \quad a, b \in D.$$
Since $(F(b) - F(a)) + (F(c) - F(b)) = (F(c) - F(a))$ (for all $a, b, c \in D$), it
is obvious that $I(a,b)$ satisfies I. It remains to show that $I$ satisfies II. Suppose
$a, b \in D$, $a < b$. Let $M, m$ be upper and lower bounds for $f$ on $[a,b]$. By the mean
value theorem, we can find $z \in (a,b)$ so that
$$I(a,b) = F(b) - F(a) = F'(z)(b-a) = f(z)(b-a).$$
Since $m \le f(x) \le M$ on $[a,b]$, it follows immediately that
$$m(b-a) \le f(z)(b-a) = I(a,b) \le M(b-a),$$
proving II. □
2.8.3 Uniqueness of $I(x,y)$, $f$ Continuous
Theorem 2.8.6 Let $f : D \to \mathbb{R}$ be continuous. If we can find $I$ satisfying I, II,
then
(1) For all $a \in D$, $F_a(x) = I(a,x)$ is an anti-derivative of $f$.
(2) $I$ is unique.
Proof Fix $a \in D$ and set $F(x) = I(a,x)$, $x \in D$. We claim that $F$ is an anti-derivative of $f$: $F'(x) = f(x)$, $x \in D$.
Fix $x \in D$ and choose $h \in \mathbb{R}$ so that $x + h \in D$. (If $x$ is not an end-point of $D$, we
have $x + h \in D$ for sufficiently small $h$. Otherwise we restrict to positive or negative
values of $h$ as appropriate.) By I, we have
$$I(a,x) + I(x,x+h) = I(a,x+h).$$
Hence
$$I(a,x+h) - I(a,x) = I(x,x+h).$$
Therefore, if $h \ne 0$,
$$\frac{I(a,x+h) - I(a,x)}{h} = \frac{I(x,x+h)}{h}.$$
Let $m_h$ and $M_h$ respectively denote the infimum and supremum of $f$ on $[x,x+h]$.
Since $f$ is continuous, $-\infty < m_h \le M_h < \infty$ and $\lim_{h \to 0} M_h = \lim_{h \to 0} m_h = f(x)$. Suppose
first that $h > 0$. From II, we have
$$m_h h \le I(x,x+h) \le M_h h,$$
and so
$$m_h \le I(x,x+h)/h \le M_h.$$
Letting $h \to 0+$, we see
$$\lim_{h \to 0+} I(x,x+h)/h = f(x).$$
If $h < 0$, then $\frac{I(x,x+h)}{h} = \frac{I(x+h,x)}{-h}$ (since $I(x,x+h) = -I(x+h,x)$). The argument
now proceeds as before, using II applied to the interval $[x+h,x]$ (note $x + h < x$)
to give
$$\lim_{h \to 0-} I(x,x+h)/h = f(x).$$
Hence we have shown that
$$\lim_{h \to 0} \frac{I(a,x+h) - I(a,x)}{h} = f(x)$$
and $F(x) = I(a,x)$ is an anti-derivative of $f$, proving (1).
Since any two anti-derivatives of $f$ differ by a constant,$^1$ (2) follows from (1). □
$^1$ By the mean value theorem, $F' - G' = 0$ implies $G = F + c$ if the common domain is an interval.
Remark 2.8.7 Note that Lemma 2.8.5 and Theorem 2.8.6 suffice for all the standard
applications and examples in a first calculus course: all the functions considered
invariably have an anti-derivative and so the definite (or indefinite) integral is given
by the anti-derivative. No arguments involving approximating sums are needed.
In future we adopt the usual notation and set $I(a,b) = \int_a^b f(t)\,dt$. We refer to
$\int_a^b f(t)\,dt$ as the Riemann or definite integral of $f$ from $a$ to $b$. If $x \in [a,b]$, then
Theorem 2.8.6 implies the fundamental theorem of calculus
$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x). \tag{2.8}$$
2.8.4 Existence of the Integral, Part 2
In this section we prove
Theorem 2.8.8 Every continuous function f W D ! R has an anti-derivative.
Our proof of Theorem 2.8.8 proceeds by constructing a function $L(x,y)$ that
satisfies conditions I, II. This construction is quite straightforward and uses only
parts (1,2) of Theorem 2.4.10 (in particular, no use is made of results on uniform
continuity).
Proof of Theorem 2.8.8 Fix an interval $[a,b]$ and suppose that $f$ is continuous on
$[a,b]$. A partition $P$ of $[a,b]$ consists of a finite number of points $t_0, \dots, t_N$ satisfying
$$a = t_0 \le t_1 \le \dots \le t_N = b.$$
Given a partition $P$, set $m_j = \inf\{f(s) \mid s \in [t_j, t_{j+1}]\}$, $0 \le j < N$. Define the (lower)
sum $L(P,f)$ by
$$L(P,f) = \sum_{j=0}^{N-1} m_j (t_{j+1} - t_j).$$
If $m, M$ denote lower and upper bounds for $f$ on $[a,b]$ then $m \le m_j \le M$ and so
$$m(b-a) \le L(P,f) \le M(b-a). \tag{2.9}$$
If we add new points to $P$, say to form $P'$, then the reader may easily check that
$L(P',f) \ge L(P,f)$. It follows from (2.9) that $m(b-a)$ and $M(b-a)$ are lower
and upper bounds respectively for $L(P,f)$ for all partitions $P$ of $[a,b]$. Hence if we
define
$$L(a,b) = \sup\{L(P,f) \mid \text{all partitions } P \text{ of } [a,b]\},$$
we have
$$m(b-a) \le L(a,b) \le M(b-a).$$
Hence $L(a,b)$ satisfies II. Since we can always add a point $b$ to a partition of $[a,c]$
($a \le b \le c$), it is easy to see that
$$L(a,b) + L(b,c) = L(a,c),$$
and so $L(a,b)$ satisfies I. It follows by Theorem 2.8.6 that $L(a,x)$ is an anti-derivative of $f$. □
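The two facts used in this proof, that lower sums are bounded as in (2.9) and that adding points to a partition cannot decrease $L(P,f)$, can be seen in a small computation. The sketch below is illustrative only and assumes $f(x) = x^2$ on $[0,1]$; since this $f$ is increasing, the infimum on each subinterval is its value at the left endpoint. The helper names are made up for the example.

```python
def lower_sum(f_inf, partition):
    """L(P, f): sum of m_j (t_{j+1} - t_j), where f_inf(u, v) returns the infimum of f on [u, v]."""
    return sum(f_inf(u, v) * (v - u) for u, v in zip(partition, partition[1:]))

# Assumption for the illustration: f(x) = x^2 on [0, 1]. It is increasing there,
# so its infimum on [u, v] is attained at the left endpoint u.
def inf_square(u, v):
    return u * u

P = [0.0, 0.5, 1.0]
P_refined = [0.0, 0.25, 0.5, 0.75, 1.0]        # P with two extra points added
P_uniform = [j / 1000 for j in range(1001)]    # a much finer partition

print(lower_sum(inf_square, P))
print(lower_sum(inf_square, P_refined))
print(lower_sum(inf_square, P_uniform))
```

The printed values increase with refinement and stay below $1/3$, the value of the integral given by the anti-derivative $x^3/3$.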
Remarks 2.8.9
(1) We could have done the construction using 'upper' sums $U(P,f)$. Since
we have proved already that the integral is unique, we get for free that
$\sup_P L(P,f) = \inf_P U(P,f) = \int_a^b f(x)\,dx$. We can similarly use approximating
sums that lie between $L(P,f)$ and $U(P,f)$. For example, $\sum_{j=0}^{N-1} f(t_j)(t_{j+1} - t_j)$
(see the exercises below).
(2) If we use uniform partitions $P$ ($t_{j+1} - t_j = (b-a)/N$ is independent
of $j$), then we need the uniform continuity of $f$ in order to prove that
$\lim_{N \to \infty} U(P,f) = \lim_{N \to \infty} L(P,f) = \int_a^b f(x)\,dx$. But this assumption is not needed to
prove the existence of the integral. Indeed, a virtue of the Riemann integral is
that once you know the integrand $f$ has an anti-derivative $F$, then you can write
down the integral in terms of $F$. Nothing is needed about approximating sums.
The only technical difficulty is proving the existence of the Riemann integral
for general continuous functions. This may be regarded as an existence theorem
for ordinary differential equations: given a continuous function $f$, the ordinary
differential equation
$$\frac{dy}{dx} = f(x)$$
has a $C^1$ solution $y = F(x)$. (Later we address the existence theorem when $f$ is
a function of $y$, rather than $x$.)
2.8.5 Methods of Integration
Once we know the integral can be given in terms of an anti-derivative, all the
standard results from calculus follow more or less immediately. For example,
suppose that $f$ is continuous (and so has an anti-derivative) and $g$ is any differentiable
function with continuous derivative. We have
$$\int_{g(a)}^{g(b)} f(s)\,ds = \int_a^b f(g(t))g'(t)\,dt.$$
Indeed, since $f$ is continuous, there exists an $F$ such that $F' = f$. Hence $(F \circ g)'(t) =
F'(g(t))g'(t) = f(g(t))g'(t)$ and the result follows. If $g$ is invertible, we obtain the
integration by substitution formula:
$$\int_a^b f(s)\,ds = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(t))g'(t)\,dt.$$
(Substitute $s = g(t)$ and $ds = g'(t)\,dt$.)
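A quick numerical sanity check of the change of variables identity, under assumptions chosen for the example: $f = \cos$ (with anti-derivative $F = \sin$), $g(t) = t^2$ and $[a,b] = [0,1]$. The midpoint rule used here is only a stand-in for the exact integral and is not the construction used in the text.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

f = math.cos                 # continuous, with anti-derivative F = sin
g = lambda t: t * t          # differentiable with continuous derivative g'(t) = 2t
dg = lambda t: 2.0 * t
a, b = 0.0, 1.0

lhs = midpoint_integral(f, g(a), g(b))                      # integral of f(s) ds from g(a) to g(b)
rhs = midpoint_integral(lambda t: f(g(t)) * dg(t), a, b)    # integral of f(g(t)) g'(t) dt from a to b
exact = math.sin(g(b)) - math.sin(g(a))                     # F(g(b)) - F(g(a))
print(lhs, rhs, exact)
```

Both approximations agree with $F(g(b)) - F(g(a))$ up to the quadrature error, which is what the chain rule argument predicts.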
2.8.6 Extensions
We can prove the existence of an integral satisfying I, II for any bounded function
on a domain D which has at most countably many discontinuities. This is easy to
do when there are finitely many discontinuities of f and more challenging when
there are countably infinitely many discontinuities (see the exercises). We can also
weaken the boundedness condition to allow for functions that grow slowly enough
near singular points (for example $1/\sqrt{|x|}$ near zero) as well as to allow for the
definition of the integral on unbounded domains. We address these issues in more
detail as and when they arise in the text (see, in particular, the first section of
Chap. 6).
EXERCISES 2.8.10
(1) Suppose that the function $I(x,y)$, $x, y \in D$, satisfies condition I whenever
$a \le b \le c$. Show that if we define $I(a,b) = -I(b,a)$ for $b < a$, then I holds
for all $a, b, c \in D$.
(2) Show that if $f : D \to \mathbb{R}$ is bounded and $I(a,b)$ exists and satisfies I, II, then
$I(a,x)$ is a continuous function of $x \in D$, $a$ a fixed point of $D$.
(3) Let $P, P'$ be two (finite) partitions of $[a,b]$. Show that if we define $Q = P \cup P'$,
then $L(Q,f) \ge \max\{L(P,f), L(P',f)\}$.
(4) Let $n \in \mathbb{N}$ and let $P_n$ be the partition of $[a,b]$ defined by $t_j = a + \frac{j(b-a)}{n}$,
$0 \le j \le n$. Using the uniform continuity of $f$ (Theorem 2.4.15), verify that
$\lim_{n \to \infty} L(P_n,f) = \int_a^b f(t)\,dt$. (Hint: Show $\lim_{n \to \infty} (U(P_n,f) - L(P_n,f)) = 0$.)
(5) Suppose $f : [a,b] \to \mathbb{R}$ has finitely many discontinuities. Show how we may
define the definite integral $\int_a^b f(t)\,dt$ and verify that the derivative of $\int_a^x f(t)\,dt$
exists and equals $f(x)$ at all points $x$ where $f$ is continuous.
(6) Let $f, g : [a,b] \to \mathbb{R}$ be continuous and suppose that $g$ is of constant sign (that
is, either positive or negative). Show that there exists an $x \in (a,b)$ such that
$$\int_a^b f(t)g(t)\,dt = f(x)\int_a^b g(t)\,dt.$$
Hint: Show that $\left(\int_a^b f(t)g(t)\,dt \big/ \int_a^b g(t)\,dt\right) \in [m,M]$, where $m, M$ respectively
denote the infimum and supremum of $f$ on $[a,b]$. (The result is known as the
first mean value theorem for integrals. Note that if $g \equiv 1$, then we obtain
$\int_a^b f(t)\,dt = f(x)(b-a)$, often called the mean value theorem for integrals.)
(7) Let $f, g : [a,b] \to \mathbb{R}$ be continuous and suppose that $g$ is positive and
monotone decreasing. There exists an $x \in (a,b]$ such that
$$\int_a^b f(t)g(t)\,dt = g(a)\int_a^x f(t)\,dt.$$
(This result is known as the second mean value theorem for integrals.) Find an
example to show that we may have $x = b$ and so deduce that we cannot use
the condition $x \in (a,b)$.
(8) Let $P = \{t_j \mid a = t_0 < t_1 < \dots < t_N = b\}$ be a partition of $[a,b]$ and
define $\delta(P) = \max\{t_{j+1} - t_j \mid j = 0, \dots, N-1\}$. If $f : [a,b] \to \mathbb{R}$ is
continuous, show that $L(P,f), U(P,f) \to \int_a^b f(t)\,dt$ as $\delta(P) \to 0$. Deduce
that if we define $V(P,f) = \sum_{j=0}^{N-1} \xi_j (t_{j+1} - t_j)$, where $\xi_j \in [m_j, M_j]$, and $m_j, M_j$
are respectively the infimum and supremum of $f|[t_j, t_{j+1}]$, then $V(P,f) \to
\int_a^b f(t)\,dt$ as $\delta(P) \to 0$. (Hint: use the uniform continuity of $f$. Note that you
are expected to show that if $\varepsilon > 0$, then there exists a $\delta_0 > 0$ such that if
$\delta(P) < \delta_0$, then $|V(P,f) - \int_a^b f(t)\,dt| < \varepsilon$.)
(9) In this extended exercise we consider the problem of defining $\int_a^b f(t)\,dt$ when
$f$ is bounded and the discontinuity set $D$ of $f$ is countable. When there are
countably infinitely many discontinuities, we use the definition of integrability
given by equality of integrals defined by upper and lower sums: $\overline{\int_a^b} f(t)\,dt =
\underline{\int_a^b} f(t)\,dt$. We use the results and notation from Sect. 2.5.2.
(a) Given $\ell > 0$, let $G_\ell = \{x \mid \omega_f(x) \ge \ell\}$. Show that given $\delta > 0$, we can
choose a finite set of open intervals $\{J_i \mid i \in \mathbf{k}\}$ of total length at most $\delta$
such that $\cup_{i \in \mathbf{k}} J_i \supset G_\ell$. (Hints: Since the discontinuity set $D$ is countable
and $D \supset G_\ell$, we can choose a countable set of open intervals $I_i$ such that
$\cup_i I_i \supset G_\ell$. Now show, using the result of Exercises 2.5.10(7), that we can
pick a finite number of the intervals $I_i$ with union containing $G_\ell$.)
(b) Let $\varepsilon > 0$. Show that by choosing $\delta, \ell > 0$ sufficiently small we have
$\overline{\int_a^b} f(t)\,dt - \underline{\int_a^b} f(t)\,dt < \varepsilon$. (Hint: Use the result of (a) together with the
definition of $\omega_f(x)$.)
(c) Deduce that $\overline{\int_a^b} f(t)\,dt = \underline{\int_a^b} f(t)\,dt$.
(d) Prove that if $f$ is continuous at $x \in [a,b]$ then $\int_a^x f(t)\,dt$ is differentiable at
$x$ with derivative $f(x)$.
We remark that this result requires more serious analysis than what is required
if $f$ is continuous or has an anti-derivative.
(10) This is an extended exercise that may be used for discussion and projects. The
aim is to define integrals of continuous functions on rectangular domains in
$\mathbb{R}^2$. Recall that a (bounded) rectangle $R$ is a subset of $\mathbb{R}^2$ that can be written
as a product of closed and bounded intervals: $R = [a,b] \times [c,d] = \{(x,y) \in
\mathbb{R}^2 \mid x \in [a,b], y \in [c,d]\}$, where $-\infty < a \le b < \infty$, $-\infty < c \le d < \infty$. If
$f : \mathbb{R}^2 \to \mathbb{R}$, then $f$ is continuous at $(x_0,y_0)$ if, given $\varepsilon > 0$, there exists a $\delta > 0$
such that $|f(x,y) - f(x_0,y_0)| < \varepsilon$, whenever $|x - x_0|, |y - y_0| < \delta$. We assume
here that continuous functions on a rectangle are bounded (see (a) below and
also Chap. 7). In what follows $f$ will be fixed, continuous and have domain $\mathbb{R}^2$
and we consider the problem of defining the integral $\int_R f$ of $f$ over rectangles
$R = [a,b] \times [c,d]$. Just as in the 1-variable case, we approach the problem by
stating the properties we require of the integral and then prove existence and
uniqueness.
(a) (Preliminaries on continuity.) If $f : [a,b] \times [c,d] \to \mathbb{R}$ is continuous,
show that $f$ is (a) sequentially continuous and (b) uniformly continuous.
(Hint and comments: use the same method as in the 1-variable case. Given
sequential continuity, it is easy to show that $f$ is bounded and attains its
bounds.)
(b) Generalize conditions I, II so that they apply to bounded functions on a
rectangle $R = [a,b] \times [c,d]$. (Let $I((a,b),(c,d))$ denote the candidate
for $\int_R f$. For II, we allow the rectangle $R$ to be written as a finite union
of rectangles which only meet along their boundaries.) In (c–f) below the
aim is to prove that I, II uniquely characterize the integral of $f$. We then
give two solutions to finding $I((a,b),(c,d))$, which must be equal, by
uniqueness.
(c) Assume $f : R \to \mathbb{R}$ is independent of $y$. Show that if I, II hold, then
$I((a,b),(c,d)) = (d-c)\int_a^b f(x)\,dx$ for all $a < b$, $c < d$. Verify the
similar statement if $f$ is independent of $x$. (Hint: fix $c, d$ and show that
$I(a,b) = (d-c)^{-1} I((a,b),(c,d))$ satisfies conditions I, II for functions
of one variable.)
(d) Let $(x,y) \in R = [a,b] \times [c,d]$ and define $V(x,y) = I((a,x),(c,y))$.
Verify that (i) $V$ is continuous, (ii) $\frac{\partial V}{\partial x}(x,y) = \int_c^y f(x,y)\,dy$, (iii) $\frac{\partial V}{\partial y}(x,y) =
\int_a^x f(x,y)\,dx$. (Hint: for the proof of (ii, iii) use uniform continuity; see (a)
above.)
(e) Using (d), show that if $I((a,x),(c,y))$ exists, $x \in [a,b]$, $y \in [c,d]$, then it
is unique.
(f) Show that $I((a,b),(c,d)) = \int_c^d \int_a^b f(x,y)\,dx\,dy = \int_a^b \int_c^d f(x,y)\,dy\,dx$. This
not only gives the existence of the integral but also gives Fubini's theorem.
(g) What is $\frac{\partial^2 V}{\partial x \partial y}(x,y)$?
The arguments above allow us to give an elementary definition of double
integrals on rectangles. The results suffice for all but one of our applications
of double integrals in Chap. 6 as well as our construction of uniform approximations in Chap. 9. The arguments easily generalize to unbounded rectangles
(for example, quadrants of $\mathbb{R}^2$) as well as integrals over rectangular regions in
$\mathbb{R}^n$, $n > 2$. However, much extra work has to be done to rigorously establish
integrals on general non-rectangular domains and, in particular, to prove the
change of variables formula for multiple integrals. This is best done in the
framework of Lebesgue integration though the one application of a linear
change of variables we make in Chap. 6 can be done fairly easily using direct
arguments.
2.9 Appendix: The Log and Exponential Functions
In this section we give a summary, with proofs, of the main properties of the
logarithm and exponential functions. Particularly important for us will be the result
that as $x \to +\infty$, $\log x$ grows more slowly than $x^a$, for any $a > 0$, and $e^x$ grows
faster than any power of $x$.
2.9.1 The Logarithm
For $x > 0$, we define the natural or Napierian logarithm by
$$\log x = \int_1^x \frac{dt}{t}.$$
Remark 2.9.1 We avoid the alternative notation ln x for the logarithm of x on the
grounds that the base 10 logarithm is rarely used these days and so there is no
longer a good reason to use an unpronounceable notation for the natural logarithm.
It is immediate from the fundamental theorem of calculus (2.8) that $\log$ is continuously differentiable on $(0,\infty)$ with derivative given by
$$\frac{d}{dx}\log(x) = \frac{1}{x}.$$
From this it follows that $\log : (0,\infty) \to \mathbb{R}$ is $C^\infty$ (infinitely differentiable). Since
$\log'(x) = 1/x > 0$ for all $x > 0$, $\log$ is a strictly increasing function of $x$.
Proposition 2.9.2 We have
(1) $\log xy = \log x + \log y$, for all $x, y > 0$.
(2) $\log 1 = 0$.
(3) $\log x^{-1} = -\log x$, for all $x > 0$.
(4) $\log x^{p/q} = \frac{p}{q}\log x$, for all $\frac{p}{q} \in \mathbb{Q}$.
Proof We have
$$\log xy = \int_1^{xy} \frac{dt}{t} = \int_1^{x} \frac{dt}{t} + \int_x^{xy} \frac{dt}{t} = \log x + \int_x^{xy} \frac{dt}{t}.$$
Hence to prove (1), it suffices to show that $\int_x^{xy} \frac{dt}{t} = \log y$. For this, we make the
substitution $t = ux$, to obtain $\int_x^{xy} \frac{dt}{t} = \int_1^{y} \frac{du}{u} = \log y$.
Statement (2) follows from (1) by taking $x = y = 1$. Alternatively, take $x = 1$ in
the definition of $\log x$. Statement (3) follows from (2) by taking $y = x^{-1}$ in (1).
Finally, (1) and (3) imply that $\log x^n = n \log x$, $n \in \mathbb{Z}$. Therefore, for $q \in \mathbb{N}$,
we have $\log (x^{1/q})^q = q \log x^{1/q}$, and so $\log x^{1/q} = \frac{1}{q}\log x$. Hence $\log x^{p/q} = \log (x^{1/q})^p =
\frac{p}{q}\log x$. □
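The identities of Proposition 2.9.2 can be observed directly from the integral definition. The sketch below approximates $\int_1^x dt/t$ by a midpoint rule (an assumption of the example, not the construction used in the text) and compares both sides of (1) and (3); the name `log_by_integral` and the sample values are chosen only for illustration.

```python
import math

def log_by_integral(x, n=100000):
    """Midpoint-rule approximation of the integral of dt/t from 1 to x, for x > 0."""
    h = (x - 1.0) / n    # h is negative when x < 1, which gives the integral its correct sign
    return h * sum(1.0 / (1.0 + (k + 0.5) * h) for k in range(n))

x, y = 2.5, 3.0
print(log_by_integral(x * y), log_by_integral(x) + log_by_integral(y))   # property (1)
print(log_by_integral(1.0 / x), -log_by_integral(x))                     # property (3)
print(log_by_integral(x), math.log(x))                                   # comparison with the library log
```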
Proposition 2.9.3 The logarithm maps $(0,\infty)$ bijectively onto $\mathbb{R}$. In particular,
$\lim_{x \to 0+} \log x = -\infty$, $\lim_{x \to +\infty} \log x = +\infty$.
Proof Since $\log$ is strictly increasing, $\log$ is a bijection onto its image. Since $\log 2 >
0$ ($\log 1 = 0$ and $\log$ is strictly increasing), $\lim_{n \to \infty} \log 2^n = \lim_{n \to \infty} n \log 2 = +\infty$. Hence
$\lim_{x \to +\infty} \log x = +\infty$. On the other hand $\lim_{x \to 0+} \log x = \lim_{y \to +\infty} \log y^{-1} =
-\lim_{y \to +\infty} \log y = -\infty$. It remains to show that $\log$ maps $(0,\infty)$ onto $\mathbb{R}$. Let
$y \in \mathbb{R}$. Choose $n \in \mathbb{N}$ so that $-n \log 2 \le y \le n \log 2$. Since $-n \log 2 = \log 2^{-n}$,
$n \log 2 = \log 2^n$, the intermediate value theorem implies there exists an $x \in [2^{-n}, 2^n]$ such that $\log x = y$. □
Remarks 2.9.4
(1) By Proposition 2.9.3, we may define $e > 1$ to be the unique real number such
that $\log e = 1$.
(2) We may use Proposition 2.9.3 to define $x^a$ for all $x > 0$, $a \in \mathbb{R}$. Thus we
define $x^a$ to be the unique positive real number with logarithm $a \log x$. Granted
Proposition 2.9.2(4), this definition of $x^a$ coincides with the usual one when $a$ is
rational. We also have the obvious extension of Proposition 2.9.2(4): $\log x^a =
a \log x$ for all $x > 0$, $a \in \mathbb{R}$. For further properties of $a^x$, see the exercises at the
end of the section.
2.9.2 The Exponential Function
We define the exponential function $\exp : \mathbb{R} \to (0,\infty)$ to be the inverse of $\log :
(0,\infty) \to \mathbb{R}$. As is customary, we often use the notation $e^x$ for $\exp(x)$. This is
justified by (2,3,4) of the next proposition.
Proposition 2.9.5 We have
(1) $e^{\log x} = x$, for all $x > 0$, $\log(e^x) = x$, for all $x \in \mathbb{R}$.
(2) $e^{x+y} = e^x e^y$, for all $x, y \in \mathbb{R}$.
(3) $e^0 = 1$.
(4) $e^{-x} = 1/e^x$, for all $x > 0$.
Proof (1) follows since $\exp$ is the inverse of $\log$. The remaining properties follow
easily from (1) and Proposition 2.9.2. For example, $x + y = \log(e^x) + \log(e^y) =
\log(e^x e^y)$. Exponentiate to get (2). □
Proposition 2.9.6
(1) $\exp : \mathbb{R} \to (0,\infty)$ is strictly increasing.
(2) $\exp$ is continuous.
(3) $\exp$ is $C^\infty$ and $\exp'(x) = \exp(x)$, for all $x \in \mathbb{R}$.
Proof Since $\log : (0,\infty) \to \mathbb{R}$ is strictly increasing, $\exp : \mathbb{R} \to (0,\infty)$ is strictly
increasing.
We prove the continuity of $\exp$. Let $\varepsilon > 0$, and $x_0 \in \mathbb{R}$. We must find $\delta > 0$
such that $|\exp(x) - \exp(x_0)| < \varepsilon$, if $|x - x_0| < \delta$. Set $y_0 = \exp(x_0)$ and suppose
$0 < \varepsilon < y_0$. Set $a = \log(y_0 - \varepsilon)$, $b = \log(y_0 + \varepsilon)$ and note that $a < x_0 < b$. We
have $\exp(a,b) = (y_0 - \varepsilon, y_0 + \varepsilon)$ and so if we take $\delta = \min\{x_0 - a, b - x_0\}$, we
have $|\exp(x) - \exp(x_0)| < \varepsilon$ if $|x - x_0| < \delta$.
It remains to prove that $\exp$ is $C^\infty$. We start by proving that $\exp$ is differentiable
at $x = 0$ with derivative 1. That is, we claim $\lim_{h \to 0} (e^h - 1)/h = 1$. Setting
$h = \log x$, we have
$$\lim_{h \to 0} \frac{e^h - 1}{h} = \lim_{x \to 1} \frac{e^{\log x} - e^{\log 1}}{\log x} = \lim_{x \to 1} \frac{x - 1}{\log x - \log 1} = 1,$$
since $\lim_{x \to 1} \frac{\log x - \log 1}{x - 1} = \log'(1) = 1$. The derivative of $\exp$ at $x$ is defined by
$$\exp'(x) = \lim_{h \to 0} \frac{e^{x+h} - e^x}{h} = \exp(x) \lim_{h \to 0} \frac{e^h - 1}{h} = \exp(x),$$
where we have used $\exp'(0) = 1$. Hence for all $x \in \mathbb{R}$, $\exp'(x) = \exp(x)$. Since $\exp$
is continuous, $\exp' = \exp$ is continuous and so $\exp$ is $C^1$. Proceeding inductively,
we have for all $n \in \mathbb{N}$, $\exp^{(n)} = \exp$ and so $\exp$ is $C^\infty$. □
Remark 2.9.7 As a corollary of Proposition 2.9.6, we see that $x^a = \exp(a \log x)$ is
differentiable with derivative $a x^{a-1}$.
2.9.3 Estimates
Proposition 2.9.8 Let $a, b > 0$.
(1) $\lim_{x \to \infty} x^{-a}(\log x)^b = 0$.
(2) $\lim_{x \to 0+} x^{a}(\log x)^b = 0$.
Proof We prove (1) ((2) follows from (1) by replacing $x$ by $x^{-1}$). Since
$x^{-a}(\log x)^b = (x^{-a/b}(\log x))^b$, there is no loss of generality in taking $b = 1$
and verifying that $\lim_{x \to \infty} x^{-a}\log x = 0$ for all $a > 0$. Computing the derivative of
$f(x) = x^{-a}\log x$, we find that $f'(x) < 0$ if $a \log x > 1$. Hence, $x^{-a}\log x$ is monotone
decreasing for sufficiently large $x$ and so $\lim_{x \to \infty} x^{-a}\log x$ exists and is greater than
or equal to zero. Now $(2^n)^{-a}\log 2^n = (2^{-a})^n n \log 2 \to 0$ as $n \to \infty$, since $2^{-a} < 1$
(Example 2.3.26). Hence $\lim_{x \to \infty} x^{-a}\log x = 0$. □
Proposition 2.9.9 Let $a \in \mathbb{R}$, $c > 0$. Then $\lim_{x \to +\infty} x^a e^{-cx} = 0$.
Proof We leave this as an exercise, using Proposition 2.9.8. □
EXERCISES 2.9.10
(1) For $a > 0$, show that $\lim_{x \to \infty} (\log x)^{-a}\log\log x = 0$. State and prove an
analogous result that applies as $x \to 0+$.
(2) Provide the proof of Proposition 2.9.9.
(3) Show that $\log 3 > 1 > \log 2$ and deduce that $e \in (2,3)$.
(4) Using calculus, show that
(a) $x - \frac{x^2}{2} \le \log(1 + x) \le x$, for all $x \ge 0$.
(b) $-x \ge \log(1 - x) \ge -x - x^2$, for all $x \in [0, 1/2]$.
(5) Using the results of the previous exercise show that for all $x \in \mathbb{R}$
$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x.$$
(6) Show that
2n2 1
2
2n3nC1n
limn!1 3n1
(a) limn!1
(b)
n2
D e1 .
p
D 3 e.
n
(7) Let p; q 2 N. Find limn!1 qnCp
.
qn
(8) For $a > 0$, $x \in \mathbb{R}$, define $a^x = \exp(x \log a)$. Verify that
(a) $a^x a^y = a^{x+y}$, all $x, y \in \mathbb{R}$.
(b) $a^0 = 1$, $a^1 = a$.
(c) $a^{-x} = 1/a^x$.
(d) $a^x$ is infinitely differentiable and the derivative of $a^x$ is $(\log a)a^x$.
Show also that if $a > 1$ (respectively, $a < 1$) then $a^x$ defines a monotone
strictly increasing (respectively, decreasing) bijection of $\mathbb{R}$ onto $(0,\infty)$.
(9) Let $a \in (0,1)$. Verify that $\lim_{x \to \infty} a^x x^b = 0$ for all $b \in \mathbb{R}$.
˛
(10) Let ˛ > 0. Find limn!1 nn .
2.10 Appendix: Construction of R Revisited
We look at the construction of the real numbers using Cauchy sequences of rational
numbers rather than decimal expansions. Most of this appendix should be regarded
as being for group discussion—at most we give brief proofs, preferring instead to
make precise the results that need to be proved.
Let $\mathcal{C}$ denote the set of all Cauchy sequences of rational numbers. Our aim is to
show that there is a natural way to partition $\mathcal{C}$ as $\{C_\alpha \mid \alpha \in \mathbb{R}\}$. Rather than thinking
of a real number as a single 'point', we view the real number $\alpha$ as the set of all
possible rational approximations to $\alpha$. That is, we think of each partition set $C_\alpha$ as
defining a real number. Practically speaking, this is the way we handle irrational
numbers—we compute using rational approximations. The devil is in the details—
though nothing is hard, there are many points to be checked. One advantage of the
approach is that we avoid the problems of addition, subtraction and multiplication
of decimal expansions as well as issues about whether or not a rational number has
more than one decimal expansion. This time we just use the standard and simple
arithmetic properties of rational numbers: $\frac{p}{q} \pm \frac{r}{s} = \frac{ps \pm rq}{qs}$, $\frac{p}{q} \cdot \frac{r}{s} = \frac{pr}{qs}$. Disadvantages
are that we work at a more abstract level and that the arguments verifying the
existence of an order on the real numbers are a little harder than what we sketched
in Chap. 1. Also the methods used in Chap. 1 lead to natural and constructive proofs
of, for example, the existence of the supremum of a bounded set.
If $s = (x_n) \in \mathcal{C}$, we define
$$C_s = \{t = (y_n) \in \mathcal{C} \mid \lim_{n \to \infty} |x_n - y_n| = 0\}.$$
Since $s \in C_s$, $C_s \ne \emptyset$. The next lemma gives a natural partition of $\mathcal{C}$.
Lemma 2.10.1 If $s, t \in \mathcal{C}$, then either $C_s = C_t$ or $C_s \cap C_t = \emptyset$. In particular, $C_s = C_t$
iff $t \in C_s$ and $\{C_s \mid s \in \mathcal{C}\}$ defines a partition of $\mathcal{C}$.
Let $\mathbb{R} = \{C_s \mid s \in \mathcal{C}\}$ denote the partition of $\mathcal{C}$ given by the lemma. There is a
natural way to embed the rational numbers $\mathbb{Q}$ in $\mathbb{R}$. Given $q \in \mathbb{Q}$, let $C_q \in \mathbb{R}$ be
defined by the constant Cauchy sequence $q = (q)$. By Lemma 2.10.1, if $q, r \in \mathbb{Q}$,
then $C_q = C_r$ iff $q = r$.
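To make the idea of a partition set concrete, here is a small sketch with exact rational arithmetic. It builds two different Cauchy sequences of rationals, both approximating $\sqrt{2}$, and prints $|x_n - y_n|$ for the first few $n$. The two sequences (a Newton iteration and decimal truncations) are choices made for this illustration and do not appear in the text; a finite computation can only suggest, not prove, that the gaps tend to zero.

```python
from fractions import Fraction
from math import isqrt

def newton_sqrt2(n):
    """n-th term of the rational iteration x -> (x + 2/x)/2, started at x = 1."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def truncation_sqrt2(n):
    """Largest fraction k / 10^n whose square is at most 2."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

# |x_n - y_n| for n = 1, ..., 7; the two sequences lie in the same set C_s when this tends to 0.
gaps = [abs(newton_sqrt2(n) - truncation_sqrt2(n)) for n in range(1, 8)]
for n, gap in enumerate(gaps, start=1):
    print(n, float(gap))
```

In the language of Lemma 2.10.1, the two sequences are different elements of $\mathcal{C}$ that define the same real number.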
2.10.1 Arithmetic
Let $s = (x_n), t = (y_n) \in \mathcal{C}$. We define $s \pm t = (x_n \pm y_n)$, $st = s \cdot t = (x_n y_n)$. We also
let $0 = (0)$ denote the Cauchy sequence all of whose terms are zero and $1 = (1)$
denote the Cauchy sequence all of whose terms are one.
Lemma 2.10.2 Let $s, t \in \mathcal{C}$.
(1) $s \pm t, st \in \mathcal{C}$.
(2) $s \pm 0 = s$, $1s = s$, $0s = 0$.
We need to be careful when it comes to division; more precisely, the definition
of the reciprocal. Let $\mathcal{C}^\star$ denote the subset of $\mathcal{C}$ consisting of Cauchy sequences $(x_n)$
for which $x_n \ne 0$, all $n \in \mathbb{N}$. Given $s \in \mathcal{C}$, set $C_s^\star = \mathcal{C}^\star \cap C_s$. We remark that $C_s^\star \ne \emptyset$
(if $s = (x_n) \in \mathcal{C}$ replace every term $x_n$ which is zero by $1/n^2$ to get a sequence
$s' \in C_s^\star$).
Lemma 2.10.3 Suppose that $s = (x_n) \in \mathcal{C}^\star$ and that $s \notin C_0$. Then $s^{-1} = (x_n^{-1}) \in \mathcal{C}$.
Now the idea is to extend Lemmas 2.10.2, 2.10.3 to $\mathbb{R}$. Suppose $C_s, C_t \in \mathbb{R}$. We
define
$$C_s \pm C_t = C_{s \pm t},$$
$$C_s C_t = C_{st},$$
$$C_s^{-1} = C_{\bar{s}^{-1}}, \text{ where } \bar{s} \in C_s^\star, \text{ and } s \notin C_0.$$
Lemma 2.10.4 Our definitions of $\pm$, $\times$ and the reciprocal on $\mathbb{R}$ are well defined
and are compatible with the usual definitions of $\pm$, $\times$, and the reciprocal on $\mathbb{Q} \subset \mathbb{R}$.
Proof To verify that the definition of $\pm$ on $\mathbb{R}$ is well defined, we have to show that
$C_{s \pm t}$ depends only on the partition sets $C_s$, $C_t$ and not on the particular choices of $s$
and $t$. That is, if $s' \in C_s$ and $t' \in C_t$, we have to show $C_{s \pm t} = C_{s' \pm t'}$. This follows
from Lemma 2.10.1. Similar arguments hold for multiplication and the reciprocal. □
Proposition 2.10.5 With our definition of $\pm$, $\times$, and reciprocal, $\mathbb{R}$ inherits all of
the standard laws of arithmetic from $\mathbb{Q}$. In particular, zero is represented by $C_0$, 1
by $C_1$ and we have
(1) $C_s + C_t = C_t + C_s$, $C_s C_t = C_t C_s$ (commutativity).
(2) $C_s + C_0 = C_s$, $C_s C_0 = C_0$, $C_s C_1 = C_s$.
(3) $(C_s + C_t) + C_u = C_s + (C_t + C_u)$, $(C_s C_t)C_u = C_s(C_t C_u)$ (associativity).
(4) $C_s(C_t + C_u) = C_s C_t + C_s C_u$ (distributivity).
The additive inverse of $C_s$ is defined to be $C_{-s}$ and the multiplicative inverse $C_s^{-1}$
of $C_s$ is defined for $s \notin C_0$ by $C_{\bar{s}^{-1}}$, where $\bar{s} \in C_s^\star$.
Remark 2.10.6 Setting up the basic arithmetic is easier when we work with Cauchy
sequences of rational numbers as opposed to the decimal expansions used in
Chap. 1.
2.10.2 Order Structure on R
Definition 2.10.7 Given $C_s \in \mathbb{R}$, we write $C_s > C_0$ if there exists $t = (t_n) \in C_s$,
$\delta \in \mathbb{Q}$, with $\delta > 0$, and $N \in \mathbb{N}$ such that
$$t_n \ge \delta, \text{ for all } n \ge N.$$
We write $C_s < C_0$ if $C_{-s} > C_0$.
Lemma 2.10.8 Let $C_s > C_0$. Then for every $u = (u_n) \in C_s$, there exists a $\delta \in \mathbb{Q}$,
$\delta > 0$, and $N \in \mathbb{N}$ such that
$$u_n \ge \delta, \text{ for all } n \ge N.$$
(Both $\delta$ and $N$ will depend on the choice of $u \in C_s$.)
Remark 2.10.9 It follows from Lemma 2.10.8 that if $q \in \mathbb{Q}$ then $C_q > C_0$ iff $q > 0$
(standard order on $\mathbb{Q}$).
Lemma 2.10.10 Let $C_s \in \mathbb{R}$. Then exactly one of the following statements holds:
$$C_s = C_0; \quad C_s > C_0; \quad C_s < C_0.$$
In particular, $<$ is well defined.
Remark 2.10.11 The issue here is to show that if $C_s \ne C_0$, then either $C_s > C_0$
or $C_s < C_0$. Note that the theory here is harder than what we did in Chap. 1
using decimal expansion. The reason is that when we used decimal expansions,
the sequences of approximating rationals were monotone increasing (and naturally
defined).
Using Lemma 2.10.10 and our results on the arithmetic on $\mathbb{R}$, we can define an
order on $\mathbb{R}$ by $C_s > C_t$ if $C_s - C_t = C_{s-t} > C_0$.
Proposition 2.10.12 With our definition of $<$, $\mathbb{R}$ inherits all of the standard properties of $<$ holding on $\mathbb{Q}$. In particular,
(1) If $C_s < C_t$ and $C_t < C_u$, then $C_s < C_u$.
(2) If $C_s < C_t$, then $C_s + C_u < C_t + C_u$ for all $C_u \in \mathbb{R}$.
(3) If $C_s < C_t$ and $C_u > C_0$, then $C_s C_u < C_t C_u$.
(4) $C_s < C_t$ iff $-C_s > -C_t$.
Remark 2.10.13 A consequence of Lemma 2.10.10 and the definition of order is that if $C_s > C_0$ then there exists a $q \in \mathbb{Q}$ such that $C_s > C_q > 0$. This property is equivalent to the Archimedean property of $\mathbb{R}$: if $C_s > C_0$, there exists an $n \in \mathbb{N}$ such that $C_n > C_s$.
2.10.3 Absolute Value
Just as for decimal expansions, it is easy to define the absolute value once we have
the order structure on R.
Definition 2.10.14 Given $C_s \in \mathbb{R}$ we define $|C_s| = C_s$ if $C_s \ge C_0$ and $|C_s| = -C_s$ if $C_s < C_0$.
It is clear that $|\cdot|$ is compatible with the absolute value defined on $\mathbb{Q}$. That is, if $q \in \mathbb{Q}$, we have $|C_q| = C_{|q|}$.
Lemma 2.10.15 (The Triangle Inequality) For all $C_s, C_t \in \mathbb{R}$, we have
\[ |C_s + C_t| \le |C_s| + |C_t|. \]
Proof Given $s = (s_n) \in \mathcal{C}$, define $|s| = (|s_n|)$ and note that $|C_s| = C_{|s|}$. In order to prove the triangle inequality we must show
\[ C_{|s+t|} \le C_{|s|} + C_{|t|}. \]
If $s = (s_n)$, $t = (t_n)$, then $s, t$ are Cauchy sequences of rational numbers and so, by the triangle inequality on $\mathbb{Q}$, we have
\[ |s_n + t_n| \le |s_n| + |t_n|, \quad n \in \mathbb{N}. \tag{2.10} \]
Now argue by contradiction: suppose $C_{|s+t|} > C_{|s|} + C_{|t|}$. Then $|s_n + t_n| > |s_n| + |t_n|$ for sufficiently large $n$ (Lemma 2.10.8), contradicting (2.10). $\square$
2.10.4 Limits, Density and Completeness
Definition 2.10.16 A sequence $(C_{s^n})$ in $\mathbb{R}$ is convergent if there exists a $C_s \in \mathbb{R}$ such that $|C_{s^n} - C_s| \to C_0$ as $n \to \infty$. That is, given $C_\varepsilon > C_0$, there exists an $N \in \mathbb{N}$ such that $C_0 \le |C_{s^n} - C_s| < C_\varepsilon$ for all $n \ge N$.
Theorem 2.10.17 (Density of Rational Numbers) Every $C_s \in \mathcal{C}$ is the limit of a sequence of rational numbers. That is, given $C_s \in \mathcal{C}$, there exists a sequence $(q_n) \subset \mathbb{Q}$ such that
\[ \lim_{n \to \infty} C_{q_n} = C_s. \]
(For $n \ge 1$, $C_{q_n}$ is defined using the constant Cauchy sequence, all terms of which equal $q_n$.)
Proof Suppose $s = (s_n)$. We define $q_n = s_n$, $n \ge 1$. $\square$
Theorem 2.10.18 (Cauchy Sequences Converge in $\mathbb{R}$) Every Cauchy sequence $(C_{s^n})$ in $\mathbb{R}$ is convergent.
Proof Define the sequence $(S_n) \subset \mathbb{Q}$ by
\[ S_n = s^n_n, \quad n \in \mathbb{N}. \]
Then $(S_n)$ is Cauchy and if we set $S = (S_n)$, $\lim_{n \to \infty} C_{s^n} = C_S$. $\square$
Chapter 3
Infinite Series
3.1 Introduction
In this chapter we undertake a detailed study of the convergence of infinite series.
This work forms an essential foundation for the construction and analysis of
functions that we give in Chaps. 4–6.
We start by looking at general infinite series, then specialize to series of positive
terms and give a number of criteria for convergence. Next, using our results on
Cauchy sequences, we consider absolutely and conditionally convergent series and
find conditions for convergence. As an illustration of the care that needs to be taken,
we prove Riemann’s rearrangement theorem: if an infinite series is convergent but
not absolutely convergent, then we can add the terms in a different order so as to
make the series converge to any preassigned number or not converge at all. We
conclude with definitions and results on doubly infinite series and infinite products
and prove the infinite product formula for sin x.
3.2 Generalities
First, we recall some definitions and results from Chap. 2. Let $(a_n)$ be a sequence of real numbers. For $n \in \mathbb{N}$, we define the partial sum $S_n = \sum_{i=1}^n a_i$.
Definition 3.2.1 The infinite series $\sum_{n=1}^\infty a_n$ is convergent if the sequence $(S_n)$ of partial sums is convergent. If $(S_n)$ is convergent, we define $\sum_{n=1}^\infty a_n$ to be equal to $\lim_{n \to \infty} S_n$.
Remarks 3.2.2
(1) The infinite series $\sum_{n=1}^\infty a_n$ should be thought of symbolically, as shorthand for the sequence of partial sums. When (and only when) the sequence is known to be convergent, we identify $\sum_{n=1}^\infty a_n$ with the limit of the corresponding sequence of partial sums. Of course, this is what we did previously in our description of real numbers. If $x$ has decimal expansion $\pm x_0.x_1 x_2 \cdots$, then we identify $x$ with the infinite sum $\pm \sum_{n=0}^\infty x_n 10^{-n}$.
(2) We sometimes write "$\sum_{n=1}^\infty a_n < \infty$" to signify that the infinite series $\sum_{n=1}^\infty a_n$ is convergent. A statement like "$\sum_{n=1}^\infty a_n = 5$" should be interpreted as saying that the infinite series $\sum_{n=1}^\infty a_n$ is convergent and that the limit (of the sequence of partial sums) is equal to 5. We say "the sum of the infinite series is 5".
(3) Although we shall not spell out the details, all the usual limit laws for sequences carry over to infinite series. For example, if¹ the infinite series $\sum_{n=1}^\infty a_n$, $\sum_{n=1}^\infty b_n$ are both convergent, then so is the infinite series $\sum_{n=1}^\infty (a_n + b_n)$ and $\sum_{n=1}^\infty (a_n + b_n) = \sum_{n=1}^\infty a_n + \sum_{n=1}^\infty b_n$.
There is precisely one general necessary condition for convergence of an infinite series.
Lemma 3.2.3 If the infinite series $\sum_{n=1}^\infty a_n$ is convergent, then $\lim_{n \to \infty} a_n = 0$. (No restrictions on the signs of the $a_n$.)
Proof If $\sum_{n=1}^\infty a_n$ is convergent, then $\lim_{n \to \infty} S_n = \lim_{n \to \infty} S_{n-1}$. Hence $\lim_{n \to \infty} a_n = \lim_{n \to \infty} (S_n - S_{n-1}) = 0$. $\square$
Remark 3.2.4 Everything above extends immediately to infinite series of complex terms. In the next section we study series of positive terms. The results we obtain have no analogue for complex series (there is no natural order relation on the complex numbers).
3.3 Series of Eventually Positive Terms
If the terms of $(a_n)$ are all positive (respectively, eventually positive), then $(S_n)$ is increasing (respectively, eventually increasing). As a consequence of Theorem 2.3.18, we see that the infinite series $\sum_{n=1}^\infty a_n$ is convergent iff the sequence $(S_n)$ of partial sums is bounded.
¹ An essential ‘if’.
Examples 3.3.1
(1) The infinite series $\sum_{n=2}^\infty \frac{1}{n(n-1)}$ is convergent and $\sum_{n=2}^\infty \frac{1}{n(n-1)} = 1$. For $n \ge 2$ we have $\frac{1}{n(n-1)} = \frac{1}{n-1} - \frac{1}{n} > 0$. Hence
\[ \sum_{n=2}^N \frac{1}{n(n-1)} = \sum_{n=2}^N \left( \frac{1}{n-1} - \frac{1}{n} \right) = 1 - \frac{1}{N}. \]
Letting $N \to \infty$, we see that $\sum_{n=2}^\infty \frac{1}{n(n-1)}$ converges to 1.
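(2) The infinite series $\sum_{n=1}^\infty \frac{1}{n}$ diverges to $+\infty$. For $n \ge 1$, set $N = 2^n = 1 + \sum_{i=0}^{n-1} 2^i$. We have
\[ \sum_{i=1}^N \frac{1}{i} = 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \cdots + \frac{1}{8} \right) + \cdots + \left( \frac{1}{2^{n-1}+1} + \cdots + \frac{1}{2^n} \right) \]
\[ \ge 1 + \frac{1}{2} + 2 \cdot \frac{1}{2^2} + \cdots + 2^j \cdot \frac{1}{2^{j+1}} + \cdots + 2^{n-1} \cdot \frac{1}{2^n} \]
\[ = 1 + n \cdot \frac{1}{2} = \frac{n+2}{2}. \]
This estimate shows that the increasing sequence $(S_N)$ is not bounded above and so $\sum_{n=1}^\infty \frac{1}{n}$ diverges to $+\infty$.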
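The two examples above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the function names are illustrative assumptions. It confirms the closed form $1 - 1/N$ for the telescoping sum and the lower bound $(n+2)/2$ for the harmonic partial sum with $N = 2^n$ terms.

```python
from fractions import Fraction


def telescoping_partial_sum(N):
    """Exact value of sum_{n=2}^{N} 1/(n(n-1))."""
    return sum(Fraction(1, n * (n - 1)) for n in range(2, N + 1))


def harmonic_partial_sum(N):
    """Exact value of sum_{i=1}^{N} 1/i."""
    return sum(Fraction(1, i) for i in range(1, N + 1))


# Example (1): the partial sum equals 1 - 1/N exactly.
for N in (2, 10, 100):
    assert telescoping_partial_sum(N) == 1 - Fraction(1, N)

# Example (2): the partial sum with 2^n terms is at least (n + 2)/2.
for n in range(1, 11):
    assert harmonic_partial_sum(2 ** n) >= Fraction(n + 2, 2)
```

Exact fractions are used so that the identities are verified without rounding error; the check is only a finite illustration and proves nothing about the limit.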
Our aim in the remainder of this section is to develop some convergence tests
for infinite series of (eventually) positive terms. These tests range from the highly
practical (comparison, ratio and Cauchy integral tests) to the more theoretical
D’Alembert and Cauchy tests. For most practical examples, readers are advised not
to use the theoretical tests—at least until simpler tests have been tried. They rarely
work better and it is easy to make errors when applying them.
3.3.1 The Comparison Test
Proposition 3.3.2 (The Comparison Test) Let $(u_n), (v_n)$ be sequences of real numbers satisfying $0 \le u_n \le v_n$, for all $n \in \mathbb{N}$.
(1) If $\sum_{n=1}^\infty v_n$ is convergent, then (a) $\sum_{n=1}^\infty u_n$ is convergent, and (b) $0 \le \sum_{n=1}^\infty u_n \le \sum_{n=1}^\infty v_n$.
(2) If $\sum_{n=1}^\infty u_n$ is divergent, then $\sum_{n=1}^\infty v_n$ is divergent (in either case to $+\infty$).
(The result applies with minor changes in the statements if $(u_n)$ and $(v_n)$ are eventually positive and $u_n \le v_n$ for all sufficiently large $n$.)
Proof For $n \in \mathbb{N}$, define $S_n = \sum_{i=1}^n u_i$, $T_n = \sum_{i=1}^n v_i$. Since $0 \le u_i \le v_i$, we have
\[ 0 \le S_n \le T_n, \quad \text{for all } n \in \mathbb{N}. \]
Suppose that $\sum_{n=1}^\infty v_n$ is convergent, with limit $T$. Then $0 \le S_n \le T_n \le T$, for all $n \in \mathbb{N}$. Hence the increasing sequence $(S_n)$ is bounded above by $T$ and so $\sum_{n=1}^\infty u_n$ is convergent and $\sum_{n=1}^\infty u_n \le \sum_{n=1}^\infty v_n$, proving (1).
If $\sum_{n=1}^\infty u_n$ is divergent, then the series must diverge to $+\infty$ (Theorem 2.3.19). Hence for all $K \ge 0$, there exists an $N \in \mathbb{N}$ such that $\sum_{n=1}^m u_n \ge K$, $m \ge N$. Hence $\sum_{n=1}^m v_n \ge \sum_{n=1}^m u_n \ge K$, for all $m \ge N$, and so $\sum_{n=1}^\infty v_n$ diverges to $+\infty$. $\square$
Examples 3.3.3
(1) Using the comparison test, we show that $\sum_{n=1}^\infty n^{-p}$ is convergent for $p \ge 2$. If $p > 2$, then $n^{-p} \le n^{-2}$ for all $n \in \mathbb{N}$ and so, by (1) of the comparison test, it suffices to show that $\sum_{n=1}^\infty n^{-2}$ is convergent. Observe that for $n \ge 2$ we have
\[ \frac{1}{n^2} < \frac{1}{n(n-1)}. \]
The series $\sum_{n=2}^\infty \frac{1}{n(n-1)}$ is convergent, with sum 1, by Examples 3.3.1(1). Hence, by the comparison test, $\sum_{n=1}^\infty n^{-2} = 1 + \sum_{n=2}^\infty n^{-2}$ is convergent with sum at most 2.
(2) Using the comparison test, we show that $\sum_{n=1}^\infty n^{-p}$ is divergent for $p \le 1$. If $p \le 1$, we have $n^{-p} \ge n^{-1}$, for all $n \in \mathbb{N}$. Since $\sum_{n=1}^\infty n^{-1}$ is divergent by Examples 3.3.1(2), the divergence of $\sum_{n=1}^\infty n^{-p}$ is immediate from (2) of the comparison test.
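As a small illustration of Examples 3.3.3(1), the comparison with $\sum 1/(n(n-1))$ gives the explicit bound $\sum_{n=1}^N n^{-2} \le 1 + (1 - 1/N) = 2 - 1/N$ for every $N$. The sketch below (an assumption-labelled illustration, with a hypothetical helper name and integer $p$ only) checks this bound exactly.

```python
from fractions import Fraction


def p_series_partial_sum(p, N):
    """Exact value of sum_{n=1}^{N} n^(-p) for a positive integer p."""
    return sum(Fraction(1, n ** p) for n in range(1, N + 1))


# Comparison with the telescoping series bounds every partial sum by 2 - 1/N.
for N in (1, 5, 50, 200):
    assert p_series_partial_sum(2, N) <= 2 - Fraction(1, N)

# For p > 2 the terms are smaller still, so the same bound applies.
assert p_series_partial_sum(3, 200) <= p_series_partial_sum(2, 200)
```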
3.3.2 The Ratio Test
Proposition 3.3.4 (The Ratio Test) Let $(a_n)$ be a sequence of positive real numbers and suppose that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n}$ exists.
(1) If $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} < 1$, the series $\sum_{n=1}^\infty a_n$ is convergent.
(2) If $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} > 1$, the series $\sum_{n=1}^\infty a_n$ is divergent.
Proof We prove (1) and leave (2) to the exercises. If $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = s < 1$, then there exists an $N \in \mathbb{N}$ such that $\frac{a_{n+1}}{a_n} \le r = (s+1)/2 < 1$ for all $n \ge N$. Consequently, $a_{N+p} \le r a_{N+p-1} \le \cdots \le r^p a_N$ for all $p \in \mathbb{N}$. The series $\sum_{p=0}^\infty a_{N+p}$ therefore converges by comparison with the geometric series $\sum_{p=0}^\infty a_N r^p$. If $\sum_{p=0}^\infty a_{N+p}$ converges, then obviously $\sum_{n=1}^\infty a_n$ converges. $\square$
Remark 3.3.5 We emphasize that for the ratio test to apply, it is necessary that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n}$ exists.
Examples 3.3.6
(1) Convergence does not follow if $a_{n+1}/a_n < 1$ for all $n \in \mathbb{N}$. It is essential to compute the limit (if it exists). As a simple example, take $a_n = 1/n$. Then $a_{n+1}/a_n = n/(n+1) < 1$ for all $n \in \mathbb{N}$, yet $\sum_{n=1}^\infty 1/n$ diverges (Examples 3.3.1(2)).
(2) The classic area of application of the ratio test is to power series. As an example, consider the exponential series $\sum_{n=0}^\infty \frac{x^n}{n!}$, where $x \in \mathbb{R}^+$. If $x = 0$, $\sum_{n=0}^\infty \frac{x^n}{n!} = 1$ and there is nothing to prove. If we fix $x > 0$, and define $a_n = x^n/n!$, we have
\[ \frac{a_{n+1}}{a_n} = \frac{x^{n+1}}{x^n} \cdot \frac{n!}{(n+1)!} = \frac{x}{n+1}. \]
Since $\lim_{n \to \infty} \frac{x}{n+1} = 0 < 1$, the ratio test applies and so $\sum_{n=0}^\infty \frac{x^n}{n!}$ is convergent.
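The contrast between the two examples can be seen numerically. In this sketch (illustrative helper names, floating-point arithmetic, not part of the original text), the ratios for $a_n = 1/n$ stay below 1 but approach 1, so the ratio test gives no information, while the ratios for $a_n = x^n/n!$ equal $x/(n+1)$ and tend to 0.

```python
import math


def ratios(term, n_values):
    """Return a_{n+1}/a_n for each n in n_values, where term(n) computes a_n."""
    return [term(n + 1) / term(n) for n in n_values]


def harmonic_term(n):
    return 1.0 / n


def exp_term(x):
    """Return the function n -> x^n / n! for a fixed x > 0."""
    def term(n):
        return x ** n / math.factorial(n)
    return term


ns = [1, 10, 100]
print(ratios(harmonic_term, ns))   # each ratio is n/(n+1): below 1, tending to 1
print(ratios(exp_term(3.0), ns))   # each ratio is 3/(n+1): tending to 0
```

The values of `n` are kept small here because `math.factorial(n)` must be converted to a float in the division.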
3.3.3 D’Alembert’s Test
Proposition 3.3.7 (D’Alembert’s Test) Let $(a_n)$ be a sequence of positive real numbers.
(1) If $\limsup_{n \to \infty} \frac{a_{n+1}}{a_n} < 1$, the series $\sum_{n=1}^\infty a_n$ is convergent.
(2) If $\liminf_{n \to \infty} \frac{a_{n+1}}{a_n} > 1$, the series $\sum_{n=1}^\infty a_n$ is divergent.
Proof The proof is almost identical to that of the ratio test. For example, the condition $\limsup_{n \to \infty} \frac{a_{n+1}}{a_n} < 1$ implies that there exist $0 < r < 1$, $N \in \mathbb{N}$ such that $\frac{a_{n+1}}{a_n} \le r < 1$ for all $n \ge N$. The proof then proceeds exactly as in the ratio test. $\square$
Example 3.3.8 It is quite difficult to find interesting examples of series where the ratio test fails to apply but D’Alembert’s test is applicable. As a somewhat contrived example, define $(a_n)$ by
\[ a_{2n} = \left( \frac{1}{2} \right)^{n} \left( \frac{1}{3} \right)^{n-1}, \quad n \ge 1, \]
\[ a_{2n-1} = \left( \frac{1}{2} \right)^{n-1} \left( \frac{1}{3} \right)^{n-1}, \quad n \ge 1. \]
We have $a_{2n}/a_{2n-1} = 1/2 \ne 1/3 = a_{2n+1}/a_{2n}$ and so the limit as $n \to \infty$ of $a_{n+1}/a_n$ does not exist. However, D’Alembert’s test applies, since $\limsup_{n \to \infty} \frac{a_{n+1}}{a_n} = 1/2$, and so the infinite series $\sum_{n=1}^\infty a_n$ converges. Of course, the convergence is easily seen by comparison with the geometric series $\sum_{n=1}^\infty 2^{-n}$.
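The alternating ratios in Example 3.3.8 can be confirmed directly. This is a minimal sketch with exact fractions (the function name `a` simply mirrors the sequence in the example; it is not code from the text).

```python
from fractions import Fraction


def a(n):
    """n-th term (n >= 1) of the sequence defined in Example 3.3.8."""
    half, third = Fraction(1, 2), Fraction(1, 3)
    if n % 2 == 0:
        k = n // 2                      # n = 2k
        return half ** k * third ** (k - 1)
    k = (n + 1) // 2                    # n = 2k - 1
    return half ** (k - 1) * third ** (k - 1)


# Ratios a_{n+1}/a_n for n = 1, ..., 12.
ratios = [a(n + 1) / a(n) for n in range(1, 13)]
print(ratios)
```

The ratios alternate between 1/2 (odd $n$) and 1/3 (even $n$), so no limit exists, while the largest ratio, and hence the lim sup, is 1/2.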
3.3.4 Cauchy’s Test
Proposition 3.3.9 (Cauchy’s Test) Let $(a_n)$ be a sequence of positive real numbers.
(1) If $\limsup a_n^{1/n} < 1$, the series $\sum_{n=1}^\infty a_n$ is convergent.
(2) If $\limsup a_n^{1/n} > 1$, the series $\sum_{n=1}^\infty a_n$ is divergent.
Proof We prove (1) and leave (2) to the exercises. If $\limsup a_n^{1/n} = s < 1$, then there exists an $N \in \mathbb{N}$ such that $\sup\{a_n^{1/n} \mid n \ge N\} \le r = (s+1)/2 < 1$. We therefore have $a_n \le r^n$, all $n \ge N$. Hence $\sum_{n=1}^\infty a_n$ converges by comparison with the geometric series $\sum_{n=1}^\infty r^n$. $\square$
Remarks 3.3.10
(1) The Cauchy test is of great theoretical importance, as we see when we look at power series. However, in most practical applications it is usually best to start by trying the ratio test.
(2) If $\lim a_n^{1/n} = \rho$ exists, then we have the simpler form of Cauchy’s test: the series converges if $\rho < 1$ and diverges if $\rho > 1$. This is the form that is used in most of the exercises at the end of the section.
(3) The divergence condition for Cauchy’s test uses the lim sup, not (the weaker) lim inf.
Examples 3.3.11
(1) We examine the convergence of the series $\sum_{n=1}^\infty n x^n$, $x \in \mathbb{R}^+$. We have $\lim_{n \to \infty} (n x^n)^{1/n} = \lim_{n \to \infty} n^{1/n} x = x$, by Examples 2.2.9(3). Hence, by Cauchy’s test, $\sum_{n=1}^\infty n x^n$ converges if $x \in [0, 1)$. We see by inspection that the series diverges if $x \ge 1$ (Lemma 3.2.3). These results follow more easily using the ratio test.
(2) Consider the series $\sum_{n=1}^\infty 2^{-n} \left( 1 + \frac{1}{n} \right)^{n^2}$. We have
\[ \lim_{n \to \infty} \left( 2^{-n} \left( 1 + \frac{1}{n} \right)^{n^2} \right)^{1/n} = 2^{-1} \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^{n} > 1.25, \]
where the last inequality follows from Examples 2.3.27(2). Hence the series diverges. The result is not so easy if we try the ratio test.
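The $n$th roots in both examples can be evaluated numerically. The sketch below is illustrative only (hypothetical function names); the roots are simplified algebraically before evaluation so that very small or very large terms are never formed.

```python
import math


def root_nx(n, x):
    """n-th root of n * x^n, computed as n^(1/n) * x."""
    return n ** (1.0 / n) * x


def root_example2(n):
    """n-th root of 2^(-n) * (1 + 1/n)^(n^2), which simplifies to (1 + 1/n)^n / 2."""
    return 0.5 * (1.0 + 1.0 / n) ** n


for n in (10, 1000, 10**6):
    print(n, root_nx(n, 0.5), root_example2(n))

# The second column tends to x = 0.5; the third increases toward e/2.
print(math.e / 2)
```

Since $e/2 \approx 1.359$, the limit in Examples 3.3.11(2) is indeed greater than 1.25, consistent with divergence.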
3.3.5 Cauchy’s Integral Test
If $f : [a, \infty) \to \mathbb{R}$ is continuous, then we define the infinite integral $\int_a^\infty f(t)\,dt$ to be $\lim_{x \to +\infty} \int_a^x f(t)\,dt$ if the limit exists (and is finite). If $f$ is monotone, a necessary (but not sufficient) condition for the existence of $\int_a^\infty f(t)\,dt$ is that $\lim_{x \to +\infty} f(x) = 0$. We refer to $\int_a^\infty f(t)\,dt$ as an improper integral (see the exercises for more definitions and examples). When $\int_a^\infty f(t)\,dt$ exists, we say the integral converges and write $\int_a^\infty f(t)\,dt < \infty$.
Proposition 3.3.12 (Cauchy’s Integral Test) Let $f : [1, \infty) \to \mathbb{R}$ be a positive, continuous and monotone decreasing function. A necessary and sufficient condition for the convergence of $\sum_{n=1}^\infty f(n)$ is the convergence of the improper integral $\int_1^\infty f(t)\,dt$. For all $n \in \mathbb{N}$ we have the estimate
\[ f(n) + \int_1^n f(t)\,dt \le \sum_{j=1}^n f(j) \le f(1) + \int_1^n f(t)\,dt, \quad n > 1. \tag{3.1} \]
If either the series or the integral converges, then we have the estimate
\[ \int_1^\infty f(t)\,dt \le \sum_{j=1}^\infty f(j) \le f(1) + \int_1^\infty f(t)\,dt. \tag{3.2} \]
Proof For $n > 1$, we have (property I of the integral)
\[ \int_1^n f(t)\,dt = \sum_{j=1}^{n-1} \int_j^{j+1} f(t)\,dt. \]
Since $f$ is monotone decreasing, we have (property II of the integral)
\[ f(j) \ge \int_j^{j+1} f(t)\,dt \ge f(j+1). \]
Using these estimates we easily verify (3.1). Since $f \ge 0$, the sequence $\left( \sum_{j=1}^n f(j) \right)$ of partial sums is increasing and so, by Theorem 2.3.18, $\sum_{j=1}^\infty f(j)$ converges iff $\int_1^\infty f(t)\,dt$ converges. Finally, (3.2) follows by letting $n \to \infty$ in (3.1) and using $\lim_{n \to \infty} f(n) = 0$ (Lemma 3.2.3). $\square$
Remark 3.3.13 Cauchy’s integral test was originally found by Maclaurin in 1742
and rediscovered later by Cauchy. Early versions of the test were used in the
fourteenth century by the Kerala school of mathematics in India.
Example 3.3.14 We consider the convergence of $\sum_{n=1}^\infty \frac{1}{n^p}$, where $p \in \mathbb{R}^+$. Define the continuous function $f(x) = 1/x^p$, $x \in [1, \infty)$. Since $p \ge 0$, $f$ is monotone decreasing and so Cauchy’s integral test applies. If $p \ne 1$, we have
\[ \int_1^n \frac{1}{t^p}\,dt = \frac{1}{p-1} \left( 1 - \frac{1}{n^{p-1}} \right). \]
If $p > 1$, $\lim_{n \to \infty} \int_1^n \frac{1}{t^p}\,dt = (p-1)^{-1}$ and so the improper integral converges. Hence $\sum_{n=1}^\infty \frac{1}{n^p}$ converges by Cauchy’s integral test and
\[ (p-1)^{-1} \le \sum_{n=1}^\infty \frac{1}{n^p} \le 1 + (p-1)^{-1}. \]
On the other hand, if $p < 1$, the improper integral diverges and so $\sum_{n=1}^\infty \frac{1}{n^p}$ diverges by Cauchy’s integral test. There remains the case $p = 1$. We have
\[ \int_1^n \frac{1}{t}\,dt = \log n, \quad n \ge 1. \]
Since $\lim_{n \to \infty} \log n = +\infty$, the improper integral diverges and so $\sum_{n=1}^\infty \frac{1}{n}$ diverges by Cauchy’s integral test. We note for future reference the useful estimate
\[ \frac{1}{n} + \log n \le \sum_{j=1}^n \frac{1}{j} \le 1 + \log n. \tag{3.3} \]
We provide a number of other examples of applications of Cauchy’s integral test in the exercises.
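Estimate (3.3) and the bounds for $p > 1$ are easy to test numerically. The following sketch (illustrative function names, floating-point arithmetic with `math.fsum` to limit rounding error) checks (3.3) for a few values of $n$ and the bounds $(p-1)^{-1} \le \sum n^{-p} \le 1 + (p-1)^{-1}$ for $p = 2$ using a long partial sum as a stand-in for the infinite sum.

```python
import math


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / j for j in range(1, n + 1))


def p_series(p, N):
    """Partial sum of n^(-p) for n = 1, ..., N."""
    return math.fsum(n ** (-p) for n in range(1, N + 1))


# Estimate (3.3): 1/n + log n <= H_n <= 1 + log n.
for n in (1, 2, 10, 1000):
    assert 1.0 / n + math.log(n) <= harmonic(n) <= 1.0 + math.log(n)

# For p = 2 the sum lies between 1 and 2.
print(p_series(2.0, 10**5))
```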
EXERCISES 3.3.15
(1) Complete the proofs of D’Alembert’s and Cauchy’s test; take particular care with the divergence statement (2) in Cauchy’s test.
(2) Let $T_n = \sum_{j=1}^n 1/(2j-1)$ and $S_n = \sum_{j=1}^n 1/j$, $n \ge 1$. Show that $T_n > S_n/2$ and deduce that the series $\sum_{j=1}^\infty 1/(2j-1)$ is divergent. Show also how this result can be derived from Cauchy’s integral test.
(3) Cauchy’s test is stronger than D’Alembert’s test, which is stronger than the ratio test. For each of the following series, determine the weakest test that proves convergence (implicit in the question is showing why the weaker tests fail; you do not have to prove that tests stronger than the weakest test that works also work.)
(a) $1 + \frac{a+1}{b+1} + \frac{(a+1)(2a+1)}{(b+1)(2b+1)} + \cdots + \frac{(a+1) \cdots (na+1)}{(b+1) \cdots (nb+1)} + \cdots$, where $b > a > 0$.
(b) $1 + \alpha + \beta^2 + \alpha^3 + \beta^4 + \cdots$, where $0 < \alpha < \beta < 1$.
(4) Show that $\sum_{n=1}^\infty q^{n^2} x^n$ converges for all positive values of $x$ if $0 < q < 1$. What happens if $q > 1$?
(5) Show that $\sum_{n=2}^\infty \frac{1}{(\log n)^n}$ is convergent.
(6) Determine whether or not the following series converge
(a) $\sum_{n=1}^\infty n^{-1/2} (\sqrt{n+1} - \sqrt{n})$.
(b) $\sum_{n=1}^\infty n^{-2/3} (\sqrt{n+1} - \sqrt{n})$.
(c) $\sum_{n=1}^\infty \left( \frac{n+1}{n} \right)^{n^2}$.
(d) $\sum_{n=1}^\infty \left( \frac{n+1}{n} \right)^{n^2} 5^{-n}$.
(7) Show that
(a) $\sum_{n=2}^\infty \frac{1}{n \log n}$ is divergent. (Start at $n = 2$ as $\log 1 = 0$.)
(b) $\sum_{n=2}^\infty \frac{1}{n (\log n)^p}$ is convergent if $p > 1$. (Start at $n = 2$ as $\log 1 = 0$.)
(c) $\sum_{n=3}^\infty \frac{1}{n \log n \log \log n}$ is divergent.
($\log n$ is the logarithm to base $e$, also denoted by $\ln n$.) Show also that
\[ \frac{1}{\log 2} \le \sum_{n=2}^\infty \frac{1}{n (\log n)^2} \le \frac{1}{\log 2} + \frac{1}{2 (\log 2)^2}. \]
(8) Show that $\sum_{n=2}^\infty \frac{1}{(\log n)^{\log n}}$ is convergent but $\sum_{n=2}^\infty \frac{1}{(\log n)^{\log \log n}}$ is divergent.
(9) For what values of $x \ge 0$ are the following series convergent?
(a) $\sum_{n=1}^\infty \frac{\log n}{n^p} x^n$.
(b) $\sum_{n=1}^\infty n^n x^n$.
(c) $\sum_{n=1}^\infty n^n x^{n^2}$.
(10) Show that if $\sum_{n=1}^\infty a_n$ is convergent, then $\sum_{n=1}^\infty \sqrt{a_n a_{n+1}}$ converges (it is always assumed that $(a_n)$ is a sequence of positive numbers).
(a) Show that if $(a_n)$ is a decreasing sequence, then the convergence of $\sum_{n=1}^\infty \sqrt{a_n a_{n+1}}$ implies the convergence of $\sum_{n=1}^\infty a_n$.
(b) Find an example where $\sum_{n=1}^\infty \sqrt{a_n a_{n+1}}$ converges but $\sum_{n=1}^\infty a_n$ diverges (by (a), $(a_n)$ cannot be decreasing).
(11) Suppose that $\sum_{n=1}^\infty a_n$ and $\sum_{n=1}^\infty b_n$ are series of positive terms.
(a) Show that if $\sum_{n=1}^\infty a_n$ and $\sum_{n=1}^\infty b_n$ are convergent, then $\sum_{n=1}^\infty \sqrt{a_n b_n}$ converges.
(b) Show that if $\sum_{n=1}^\infty a_n^2$ converges then so does $\sum_{n=1}^\infty a_n/n$.
(c) Show, by means of an example, that the converse of (b) is false, even if $(a_n)$ is decreasing.
(12) Show that
(a) $S_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \in [0, 1]$, $n \ge 1$.
(b) $(S_n)$ is a decreasing sequence.
Deduce that $\lim_{n \to \infty} \left( \sum_{j=1}^n 1/j - \log n \right)$ exists and lies in $[0, 1]$. (The limit is usually denoted by $\gamma$ and referred to as Euler’s constant. The value of $\gamma$ is approximately 0.5772 (see Chap. 6, Sect. 6.3.5). It is not yet (2017) known whether $\gamma$ is rational or irrational.)
(13) Show that $\sum_{n=1}^\infty \frac{1}{n} \log\left( \frac{n+1}{n} \right)$ is convergent.
(14) Suppose $a_n > 0$ and $\sum a_n$ diverges (to $+\infty$). What can be said about the convergence or divergence of $\sum \frac{a_n}{1 + n a_n}$, $\sum \frac{a_n}{1 + n^2 a_n}$?
3.4 General Principle of Convergence
In the next four sections, we study series where the terms are not necessarily all of
the same sign. The sequence of partial sums will no longer be monotone and so we
will not be able to apply Theorems 2.3.18, 2.3.19. Instead, we will need to use the
result that if the sequence of partial sums is Cauchy, then it converges.
Theorem 3.4.1 (General Principle of Convergence) Let $(a_n)$ be a sequence of real numbers. Then $\sum_{n=1}^\infty a_n$ is convergent iff for every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[ |a_m + a_{m+1} + \cdots + a_n| < \varepsilon, \quad \text{for all } n \ge m \ge N. \]
Proof The sequence $(S_n)$ of partial sums is convergent iff it is a Cauchy sequence. That is, given $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $|S_n - S_{m-1}| = |a_m + \cdots + a_n| < \varepsilon$, for all $n \ge m \ge N$. $\square$
Remark 3.4.2 Theorem 3.4.1 extends to infinite series of complex numbers. The proof is formally the same as that of Theorem 3.4.1 but using Theorem 2.6.6.
3.5 Absolute Convergence
Definition 3.5.1 Let $(a_n) \subset \mathbb{R}$. The infinite series $\sum_{n=1}^\infty a_n$ is absolutely convergent if $\sum_{n=1}^\infty |a_n|$ is convergent.
Remark 3.5.2 The results for sums and differences of convergent series extend to absolutely convergent series using the corresponding results for series of positive terms. For example, if $\sum a_n$ and $\sum b_n$ are absolutely convergent then so are the series $\sum (a_n \pm b_n)$.
Theorem 3.5.3 Every absolutely convergent series is convergent.
Proof Our proof makes essential use of Theorem 3.4.1 (general principle of convergence). Suppose that $\sum_{n=1}^\infty |a_n| < \infty$. Then, by Theorem 3.4.1, given $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[ |a_m| + |a_{m+1}| + \cdots + |a_n| < \varepsilon, \quad \text{for all } n \ge m \ge N. \]
But $|a_m + a_{m+1} + \cdots + a_n| \le |a_m| + |a_{m+1}| + \cdots + |a_n|$ and so
\[ |a_m + a_{m+1} + \cdots + a_n| < \varepsilon, \quad \text{for all } n \ge m \ge N. \]
Hence, $\sum_{n=1}^\infty a_n$ is convergent by Theorem 3.4.1. $\square$
Remarks 3.5.4
(1) The reader is cautioned that the converse to Theorem 3.5.3 is false: a convergent series need not be absolutely convergent. We give examples shortly.
(2) If $\sum_{n=1}^\infty a_n$ is an infinite series of complex numbers, then the series is absolutely convergent if $\sum_{n=1}^\infty |a_n| < \infty$ (where $|\cdot|$ now denotes the modulus of a complex number). Without exception, all of our results on absolutely convergent real series extend to absolutely convergent complex series.
Theorem 3.5.3 allows us to translate results on convergent series of positive terms
to absolutely convergent series.
Example 3.5.5 The series $\sum_{n=1}^\infty (-1)^{n+1} n^{-p}$ is absolutely convergent iff $p > 1$, Example 3.3.14. Hence $\sum_{n=1}^\infty (-1)^{n+1} n^{-p}$ is convergent, $p > 1$. Later in this chapter we prove $\sum_{n=1}^\infty (-1)^{n+1} n^{-p}$ is convergent for $p > 0$ even though absolute convergence fails if $p \le 1$.
We conclude this section with an important and practical result called Tannery’s theorem that allows us to interchange limit operations in a countable set of absolutely convergent series. Specifically, given sequences $(a_n(p))$, for $p = 1, 2, \ldots$, Tannery’s theorem gives easily verifiable conditions under which
\[ \lim_{p \to \infty} \sum_{n=1}^\infty a_n(p) = \sum_{n=1}^\infty \lim_{p \to \infty} a_n(p). \]
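Theorem 3.5.6 (Tannery’s Theorem) Suppose we are given sequences $(a_n(p)) \subset \mathbb{R}$ depending on $p \in \mathbb{N}$. Assume
(1) $\lim_{p \to \infty} a_n(p) = a_n$, $n \in \mathbb{N}$.
(2) $|a_n(p)| \le M_n$, for all $n, p \in \mathbb{N}$, where $\sum_{n=1}^\infty M_n < \infty$.
Then
\[ \lim_{p \to \infty} \sum_{n=1}^{p} a_n(p) = \lim_{p \to \infty} \sum_{n=1}^\infty a_n(p) = \sum_{n=1}^\infty a_n. \]
The result continues to hold if $(a_n(p)) \subset \mathbb{C}$.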
Proof We have to show that given $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that if $p \ge N$ then
\[ \left| \sum_{n=1}^{p} a_n(p) - \sum_{n=1}^\infty a_n \right| < \varepsilon. \]
It follows from (1,2) that $|a_n(p)|, |a_n| \le M_n$, for all $p, n \in \mathbb{N}$ and so $\sum_{n=1}^\infty a_n(p)$ and $\sum_{n=1}^\infty a_n$ are absolutely convergent. Moreover, we may choose $N_1 \in \mathbb{N}$ such that $\sum_{n=N_1}^m |a_n|, \sum_{n=N_1}^m |a_n(p)| < \varepsilon/3$, for $\infty \ge m \ge N_1$ and $p \in \mathbb{N}$. Hence for $m \ge N_1$ we have
\[ \left| \sum_{n=1}^{m} a_n(p) - \sum_{n=1}^\infty a_n \right| \le \sum_{n=1}^{N_1} |a_n(p) - a_n| + \sum_{n=N_1+1}^{\infty} |a_n| + \sum_{n=N_1+1}^{m} |a_n(p)| < \sum_{n=1}^{N_1} |a_n(p) - a_n| + 2\varepsilon/3. \]
By (1), we can choose $N \ge N_1$ such that $|a_n(p) - a_n| < \varepsilon/(3N_1)$, for all $p \ge N$ and $1 \le n \le N_1$. Hence, taking $m = p$,
\[ \left| \sum_{n=1}^{p} a_n(p) - \sum_{n=1}^\infty a_n \right| < N_1 \frac{\varepsilon}{3N_1} + \frac{2\varepsilon}{3} = \varepsilon, \quad \text{for all } p \ge N. \]
The proof extends immediately to the case of complex series: absolute value is replaced by the modulus of a complex number. $\square$
3.5.1 The Exponential Series
The exponential series $\sum_{n=0}^\infty \frac{x^n}{n!}$ is absolutely convergent for all $x \in \mathbb{R}$ by Examples 3.3.6(2). Hence, $\sum_{n=0}^\infty \frac{x^n}{n!}$ converges for all $x \in \mathbb{R}$. We may give an infinite series definition of the exponential function $e^x$ or $\exp(x)$ by
\[ e^x = \sum_{n=0}^\infty \frac{x^n}{n!}, \quad x \in \mathbb{R}. \]
The next result shows that the series definition of exp gives the same function as
that defined by the inverse of the logarithm in section “Appendix: The Log and
Exponential Functions”.
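Proposition 3.5.7 For all $x \in \mathbb{R}$,
\[ \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = \sum_{n=0}^\infty \frac{x^n}{n!} = e^x. \]
Proof We refer to Exercises 2.9.10(5) for the proof that $e^x$ equals $\lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n$, where $e^x$ is defined as the inverse of $\log x$. To simplify notation, we define $e^x = \sum_{n=0}^\infty \frac{x^n}{n!}$ and show that $\lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = e^x$.
Fix $x \in \mathbb{R}$. By the binomial theorem, we have
\[ \left( 1 + \frac{x}{p} \right)^p = 1 + p \frac{x}{p} + \frac{p(p-1)}{2} \frac{x^2}{p^2} + \cdots + \frac{x^p}{p^p} = 1 + \sum_{n=1}^{p} \frac{x^n}{n!} K_p(n), \]
where $K_p(1) = 1$ and $K_p(n) = \left( 1 - \frac{1}{p} \right) \left( 1 - \frac{2}{p} \right) \cdots \left( 1 - \frac{n-1}{p} \right) < 1$, $p \ge n > 1$. We use Tannery’s theorem to complete the proof. Following the notation of Theorem 3.5.6, define $a_n = x^n/n!$ and
\[ a_n(p) = \begin{cases} 1, & n = 0, \\ \frac{x^n}{n!} K_p(n), & p \ge n \ge 1, \\ 0, & p < n. \end{cases} \]
We have $\lim_{p \to \infty} a_n(p) = \frac{x^n}{n!}$, verifying (1) of Theorem 3.5.6. Condition (2) holds since $\left| \frac{x^n}{n!} K_p(n) \right| \le \frac{|x|^n}{n!}$ and $\sum_{n=0}^\infty \frac{x^n}{n!}$ is absolutely convergent. Applying Tannery’s theorem we have
\[ \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = \lim_{n \to \infty} \sum_{j=0}^{n} a_j(n) = \sum_{n=0}^\infty \frac{x^n}{n!}. \]
$\square$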
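The binomial expansion with the coefficients $K_p(n)$ and the resulting limit can be illustrated numerically. This is a hedged sketch, not part of the original text: the function names are illustrative, the series is truncated after a fixed number of terms, and floating-point arithmetic is used throughout.

```python
import math


def exp_series(x, terms=60):
    """Truncated exponential series sum_{n < terms} x^n / n!."""
    return math.fsum(x ** n / math.factorial(n) for n in range(terms))


def compound(x, n):
    """The quantity (1 + x/n)^n."""
    return (1.0 + x / n) ** n


def K(p, n):
    """K_p(n) = (1 - 1/p)(1 - 2/p)...(1 - (n-1)/p), with K_p(1) = 1."""
    prod = 1.0
    for i in range(1, n):
        prod *= 1.0 - i / p
    return prod


def binomial_form(x, p):
    """Right-hand side 1 + sum_{n=1}^{p} (x^n / n!) K_p(n) of the expansion."""
    return 1.0 + math.fsum(x ** n / math.factorial(n) * K(p, n) for n in range(1, p + 1))


x = 1.5
print(compound(x, 10), binomial_form(x, 10))    # the two forms agree for fixed p
print(compound(x, 10**6), exp_series(x))        # both are close to e^1.5
```

Each $K_p(n)$ lies in $(0, 1]$ and tends to 1 as $p \to \infty$ for fixed $n$, which is what conditions (1) and (2) of Tannery’s theorem require here.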
Remarks 3.5.8
(1) This result continues to hold if we allow for complex variables: $\lim_{n \to \infty} \left( 1 + \frac{z}{n} \right)^n = e^z$, for all $z \in \mathbb{C}$. The proof is exactly the same with ‘absolute value’ replaced everywhere by ‘modulus’. (It is not straightforward to define $\exp(z)$ in terms of the inverse of $\log z$, if $z$ is complex.)
(2) A proof of Proposition 3.5.7 can be based on the method of Examples 2.3.27(2), avoiding Tannery’s theorem. But the argument still has to address the issues that arise in the proof of Tannery’s theorem.
3.5.2 Tests for Absolutely Convergent Series
We present versions of the comparison, ratio and Cauchy test appropriate for
proving absolute convergence. Proofs all use Theorem 3.5.3 together with the
corresponding result for series of positive terms. The results also apply to complex
series with ‘absolute value’ replaced by the ‘modulus’.
Proposition 3.5.9 (The Comparison Test) Let $(u_n), (v_n)$ be sequences of real numbers satisfying $|u_n| \le |v_n|$, for all $n \in \mathbb{N}$.
(1) If $\sum_{n=1}^\infty v_n$ is absolutely convergent, then $\sum_{n=1}^\infty u_n$ is absolutely convergent (and so convergent).
(2) If $\sum_{n=1}^\infty u_n$ is not absolutely convergent then $\sum_{n=1}^\infty v_n$ is not absolutely convergent.
Remark 3.5.10 For the second statement, $\sum_{n=1}^\infty v_n$ may converge even if $\sum_{n=1}^\infty u_n$ is divergent.
Proposition 3.5.11 (The Ratio Test) Let $(a_n)$ be a sequence of real numbers and suppose that $\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \ell$.
(1) If $\ell < 1$, the series $\sum_{n=1}^\infty a_n$ is absolutely convergent (and so convergent).
(2) If $\ell > 1$, the series $\sum_{n=1}^\infty a_n$ is divergent.
Proposition 3.5.12 (Cauchy’s Test) Let $(a_n)$ be a sequence of real numbers.
(1) If $\limsup |a_n|^{1/n} < 1$, the series $\sum_{n=1}^\infty a_n$ is absolutely convergent (and so convergent).
(2) If $\limsup |a_n|^{1/n} > 1$, the series $\sum_{n=1}^\infty a_n$ is divergent.
Definition 3.5.13 Let $(a_n)$ be a sequence of real numbers. The series $\sum_{n=1}^\infty b_n$ is a rearrangement of $\sum_{n=1}^\infty a_n$ if there exists a bijection $\sigma : \mathbb{N} \to \mathbb{N}$ such that $b_n = a_{\sigma(n)}$, for all $n \in \mathbb{N}$. That is, a rearrangement of $\sum_{n=1}^\infty a_n$ is a series $\sum_{n=1}^\infty a_{\sigma(n)}$ where $\sigma$ is a “permutation” of $\mathbb{N}$.
Example 3.5.14 The series $x + x^3 - x^2 + x^5 + x^7 - x^4 + \cdots$ is a rearrangement of $\sum_{n=1}^\infty (-1)^{n+1} x^n$.
We end this section with an important result that fails dramatically when the
series is convergent but not absolutely convergent.
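Theorem 3.5.15 Every rearrangement $\sum_{n=1}^\infty a_{\sigma(n)}$ of an absolutely convergent series $\sum_{n=1}^\infty a_n$ is convergent and
\[ \sum_{n=1}^\infty a_{\sigma(n)} = \sum_{n=1}^\infty a_n. \]
Proof If $\sum_{n=1}^\infty a_n$ is absolutely convergent, it follows from the general principle of convergence that given $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $|a_m| + \cdots + |a_n| < \varepsilon$, for all $n > m \ge N$. Let $M = \max\{\sigma^{-1}(1), \ldots, \sigma^{-1}(N-1)\} \in \mathbb{N}$. Observe that if $n > M$, then $\sigma(n) \ge N$. Let $n > m > M$ and set $n' = \max\{\sigma(m), \ldots, \sigma(n)\}$ and $m' = \min\{\sigma(m), \ldots, \sigma(n)\}$. We have $n' > m' \ge N$ and so
\[ |a_{\sigma(m)}| + \cdots + |a_{\sigma(n)}| \le |a_{m'}| + |a_{m'+1}| + \cdots + |a_{n'}| < \varepsilon. \]
Hence by the general principle of convergence $\sum_{n=1}^\infty a_{\sigma(n)}$ is absolutely convergent.
It remains to prove that the series have the same sum. Let $\sum_{n=1}^\infty a_n = \ell$. With the same notation used above, suppose $p > M$ and set $q = \max\{\sigma(1), \ldots, \sigma(p)\}$. Since $p > M$, each of $a_1, \ldots, a_{N-1}$ occurs among $a_{\sigma(1)}, \ldots, a_{\sigma(p)}$, and the remaining terms have indices between $N$ and $q$. We have $\left| \sum_{n=1}^{N-1} a_n - \ell \right| \le \varepsilon$ and so
\[ \left| \sum_{n=1}^{p} a_{\sigma(n)} - \ell \right| \le \left| \sum_{n=1}^{N-1} a_n - \ell \right| + |a_N| + \cdots + |a_q| < 2\varepsilon. \]
This estimate holds for all $p > M$ and so $\sum_{n=1}^\infty a_{\sigma(n)} = \ell$. $\square$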
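A numerical illustration of the theorem, under stated assumptions: the series $\sum (-1)^{n+1}/n^2$ is absolutely convergent with sum $\pi^2/12$ (a standard value, not derived in this section), and the rearrangement chosen here, two positive terms followed by one negative term, is just one convenient example. The generator names are hypothetical.

```python
import math
from itertools import islice


def original_terms():
    """Yield the terms (-1)^(n+1) / n^2 in their natural order."""
    n = 1
    while True:
        yield (-1.0) ** (n + 1) / n ** 2
        n += 1


def rearranged_terms():
    """Yield the same terms, taking two positive terms and then one negative term."""
    odd, even = 1, 2
    while True:
        yield 1.0 / odd ** 2
        odd += 2
        yield 1.0 / odd ** 2
        odd += 2
        yield -1.0 / even ** 2
        even += 2


def partial_sum(terms, count):
    """Sum of the first `count` terms of an iterator."""
    return math.fsum(islice(terms, count))


print(partial_sum(original_terms(), 30000))
print(partial_sum(rearranged_terms(), 30000))
print(math.pi ** 2 / 12)
```

Both partial sums approach the same value, in contrast with the behaviour of conditionally convergent series discussed in Sect. 3.6.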
EXERCISES 3.5.16
(1) Suppose that $(a_n)$ is a sequence of real numbers such that (a) $\sum_{n=1}^\infty a_n$ is convergent and (b) the $a_n$ are eventually of the same sign. Show that $\sum_{n=1}^\infty a_n$ is absolutely convergent.
(2) For each of the following series find $R > 0$ such that the series converges if $|x| < R$ and diverges if $|x| > R$.
P1 2 n
nx .
(a)
PnD1
2 n2
(b) P1
nD1 nn x .
1 n n
(c)
nŠ x .
PnD1
1 .2n/Š n
(d)
nD0 .nŠ/2 x .
P1 .2n/Š2 2n
x .
(e)
nD0
p
P1 .4n/Š
n n
x .
(f)
nD1 n
(3) For what values of $x$ are
(a) $\sum_{n=1}^\infty \left( 1 + \frac{x^2}{2n} \right)^{n^2} e^{-2n}$,
(b) $\sum_{n=1}^\infty \left( 1 + \frac{x^2}{n} \right)^{n^2} x^{2n}$,
convergent?
(4) Suppose that $p : \mathbb{N} \to \mathbb{N}$ satisfies $a \le \frac{p(n)}{n} \le A$, where $A \ge a > 0$. Show that if we assume conditions (1,2) of Tannery’s theorem (Theorem 3.5.6) then $\lim_{n \to \infty} \sum_{j=1}^{n} a_j(p(n)) = \sum_{m=1}^\infty a_m$.
(5) Using Tannery’s theorem, prove that
\[ \frac{e}{e-1} = \lim_{n \to \infty} \left[ \sum_{j=0}^{n-1} \left( \frac{n-j}{n} \right)^n \right]. \]
(Hint: Let $a_n(p) = \left( \frac{p-n}{p} \right)^p$ if $n < p$ and be zero otherwise.)
(6) Show that the rearrangement theorem also holds for absolutely convergent series of complex terms.
3.6 Conditionally Convergent Series
In this section we look at convergent real series that are not absolutely convergent.
The results we obtain do not have (simple) extensions to complex series.
Definition 3.6.1 Let $(a_n)$ be a sequence of real numbers. The series $\sum_{n=1}^\infty a_n$ is conditionally convergent if it is convergent but not absolutely convergent.
Remark 3.6.2 If $\sum_{n=1}^\infty a_n$ is conditionally convergent, then the series must have infinitely many positive and negative terms. Else the terms would either be eventually positive or eventually negative and the series would be absolutely convergent.
Example 3.6.3 The series $\sum_{n=1}^\infty (-1)^{n+1}/n$ is conditionally convergent. Since
\[ S_{2n} = \left( 1 - \frac{1}{2} \right) + \cdots + \left( \frac{1}{2n-1} - \frac{1}{2n} \right) \ge 0, \quad n \ge 1, \]
the sequence $(S_{2n})$ of partial sums is an increasing sequence of positive numbers. On the other hand,
\[ S_{2n+1} = 1 - \left( \frac{1}{2} - \frac{1}{3} \right) - \cdots - \left( \frac{1}{2n} - \frac{1}{2n+1} \right) = S_{2n} + \frac{1}{2n+1} \]
is a decreasing sequence bounded above by 1. Now $S_{2n} = S_{2n+1} - \frac{1}{2n+1} < 1$, $n \ge 1$, and so $(S_{2n})$ is an increasing sequence of positive numbers bounded above by 1. Hence $(S_{2n})$ converges to $\ell \in (0, 1]$. Since $S_{2n+1} = S_{2n} + \frac{1}{2n+1}$, $(S_{2n+1})$ also converges to $\ell$. Therefore, $\lim_{n \to \infty} S_n = \ell \in (0, 1]$. (In the next chapter we show that $\ell = \log 2$.)
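The monotonicity of the even and odd partial sums in Example 3.6.3 can be checked with exact fractions. This is a small illustrative sketch (the function name `S` mirrors the notation of the example); the value $\log 2$ for the limit is quoted from the example, not established here.

```python
import math
from fractions import Fraction


def S(n):
    """Exact partial sum of (-1)^(k+1)/k for k = 1, ..., n."""
    return sum(Fraction((-1) ** (k + 1), k) for k in range(1, n + 1))


even = [S(2 * n) for n in range(1, 21)]        # S_2, S_4, ..., S_40
odd = [S(2 * n + 1) for n in range(1, 21)]     # S_3, S_5, ..., S_41

assert all(x < y for x, y in zip(even, even[1:]))   # (S_2n) is increasing
assert all(x > y for x, y in zip(odd, odd[1:]))     # (S_2n+1) is decreasing
print(float(even[-1]), math.log(2), float(odd[-1]))
```

The even partial sums approach the limit from below and the odd ones from above, so each pair $S_{2n}, S_{2n+1}$ brackets the sum.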
Let $\sum_{n=1}^\infty a_n$ be a conditionally convergent series. Define sequences $(u_n)$, $(v_n)$ of positive real numbers by
\[ u_n = \max\{0, a_n\}, \quad v_n = -\min\{0, a_n\}. \]
Observe that for all $n \in \mathbb{N}$ we have
\[ a_n = u_n - v_n, \quad |a_n| = u_n + v_n. \]
The next result will be useful when we prove Riemann’s theorem on rearrangements
of a conditionally convergent series.
Proposition 3.6.4 If $\sum_{n=1}^\infty a_n$ is a conditionally convergent series and we define the sequences $(u_n)$, $(v_n)$ as above, then $\sum_{n=1}^\infty u_n$, $\sum_{n=1}^\infty v_n$ both diverge to $+\infty$.
Proof If $\sum_{n=1}^\infty v_n$ converges then $\sum_{n=1}^\infty (a_n + v_n) = \sum_{n=1}^\infty u_n$ converges. Therefore, $\sum_{n=1}^\infty (u_n + v_n) = \sum_{n=1}^\infty |a_n|$ converges, contradicting the conditional convergence of $\sum_{n=1}^\infty a_n$. Hence $\sum_{n=1}^\infty v_n = +\infty$. A similar argument shows that $\sum_{n=1}^\infty u_n = +\infty$. $\square$
Example 3.6.5 The series $1 - \frac{1}{2^2} + \frac{1}{3} - \frac{1}{4^2} + \frac{1}{5} - \cdots$ is divergent. For this series, $\sum_{n=1}^\infty u_n$ is divergent (Exercises 3.3.15(2)) but $\sum_{n=1}^\infty v_n$ is convergent (by comparison with $\sum 1/n^2$). Hence the series cannot be absolutely or conditionally convergent (Proposition 3.6.4) and therefore must diverge (in this case to $+\infty$).
3.6.1 Alternating Series
Definition 3.6.6 The series $\sum_{n=1}^\infty (-1)^{n+1} a_n$ is called an alternating series if $(a_n)$ is a sequence of positive real numbers.
Proposition 3.6.7 (Leibniz Alternating Series Test) Let $(a_n)$ be a sequence of positive numbers. The alternating series $\sum_{n=1}^\infty (-1)^{n+1} a_n$ converges if
(a) $(a_n)$ is a decreasing sequence.
(b) $\lim_{n \to \infty} a_n = 0$.
If (a,b) hold then $\sum_{n=1}^\infty (-1)^{n+1} a_n \in [0, a_1]$.
Proof The proof is formally identical to the argument used in Example 3.6.3 and we leave the details to the reader. $\square$
Example 3.6.8 The alternating series $\sum_{n=2}^\infty \frac{(-1)^n}{\log n}$, $\sum_{n=3}^\infty \frac{(-1)^n}{\log \log n}$ are convergent. Note that the convergence of these series is very slow. For example, we need to take $n$ greater than $10^{64}$ to ensure that the $n$th term of the second series is less than $1/5$.
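The figure $10^{64}$ in Example 3.6.8 can be recovered with a one-line computation: $1/\log\log n < 1/5$ exactly when $n > e^{e^5}$. The sketch below is only an arithmetic check in floating point; the variable name is illustrative.

```python
import math

# The n-th term of the second series has size 1 / log(log n); it is below 1/5
# exactly when log(log n) > 5, that is, when n > exp(exp(5)).
threshold = math.exp(math.exp(5))
print(threshold)               # a number between 10^64 and 10^65
print(math.log10(threshold))   # its base-10 logarithm
```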
3.6.2 Riemann’s Theorem
In this section we look at rearrangements of a conditionally convergent series. We
start with a simple example.
Example 3.6.9 Consider the conditionally convergent series $1 - 1 + \frac{1}{2} - \frac{1}{2} + \cdots + \frac{1}{n} - \frac{1}{n} + \cdots$. The series trivially converges to zero. Take the rearrangement
\[ 1 + \frac{1}{2} - 1 + \frac{1}{3} + \frac{1}{4} - \frac{1}{2} + \cdots \]
It is easy to see that $S_{3n} = \sum_{j=1}^{n} \frac{1}{2j(2j-1)}$. Using the identity $\frac{1}{2j(2j-1)} = \frac{1}{2j-1} - \frac{1}{2j}$, we find that $S_{3n} = 1 - \frac{1}{2} + \frac{1}{3} - \cdots - \frac{1}{2n}$. We deduce easily that the rearranged series converges with sum equal to $\log 2 > 0$. This example shows the failure of the rearrangement theorem when the series is not absolutely convergent.
108
3 Infinite Series
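A quick numerical check of Example 3.6.9, offered as a sketch: the hypothetical helper `rearranged_partial_sum` adds the rearranged series in blocks of three terms, $\frac{1}{2j-1} + \frac{1}{2j} - \frac{1}{j}$, so that its value after `n_blocks` blocks is $S_{3n}$.

```python
import math


def rearranged_partial_sum(n_blocks):
    """S_{3n} for the rearrangement 1 + 1/2 - 1 + 1/3 + 1/4 - 1/2 + ..."""
    total = 0.0
    for j in range(1, n_blocks + 1):
        # j-th block: two new positive terms followed by one negative term
        total += 1.0 / (2 * j - 1) + 1.0 / (2 * j) - 1.0 / j
    return total


print(rearranged_partial_sum(100_000), math.log(2))
```

The original series sums to zero, while the rearranged partial sums approach $\log 2$.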
Theorem 3.6.10 (Riemann’s Rearrangement Theorem) Let $\sum_{n=1}^\infty a_n$ be a conditionally convergent series.
(a) For every $x \in \mathbb{R} \cup \{-\infty, +\infty\}$ there exists a rearrangement such that
$$\sum_{n=1}^\infty a_{\sigma(n)} = x.$$
(b) There exist rearrangements such that $\sum_{n=1}^\infty a_{\sigma(n)}$ does not converge (even to $\pm\infty$).
Proof We prove (a) and leave (b) to the exercises. Our first step is to define sequences $(p_k)$ and $(q_k)$ by requiring that $(p_k)$ is the subsequence of $(a_n)$ defined by the non-negative terms and $(q_k)$ is the subsequence of $(a_n)$ defined by the strictly negative terms. Note that the sequence $(p_k)$ may contain zeros and that for each $k \in \mathbb{N}$, there exist unique $n_k, m_k \in \mathbb{N}$ such that $p_k = a_{n_k}$, $q_k = a_{m_k}$.
Let $x \in \mathbb{R}$. For simplicity, suppose $x \ge 0$. Since $\sum_{k=1}^\infty p_k$ diverges to $+\infty$, there exists a unique $k_1 \ge 1$ such that
$$p_1 + \cdots + p_{k_1 - 1} \le x < p_1 + \cdots + p_{k_1} = P_1.$$
Since $\sum_{k=1}^\infty q_k$ diverges to $-\infty$, there exists a unique $\ell_1 \ge 1$ such that
$$Q_1 = P_1 + q_1 + \cdots + q_{\ell_1} < x \le P_1 + q_1 + \cdots + q_{\ell_1 - 1} = Q_1 - q_{\ell_1}.$$
In the obvious way, we may inductively define increasing sequences $(k_n)$, $(\ell_n)$ and sequences $(P_n)$, $(Q_n)$ so that
$$P_n = \sum_{j=k_{n-1}+1}^{k_n} p_j + Q_{n-1}, \qquad Q_n = P_n + \sum_{j=\ell_{n-1}+1}^{\ell_n} q_j,$$
and $P_n - p_{k_n} \le x < P_n$, $Q_n < x \le Q_n - q_{\ell_n}$, for all $n \ge 1$ (with $k_0 = \ell_0 = 0$ and $Q_0 = 0$). Since $k_j, \ell_j \ge 1$, there are at least $2n - 1$ terms from the series $\sum a_n$ in $P_n$, and at least $2n$ terms in $Q_n$. This construction defines the rearrangement
$$\sum_{n=1}^\infty a_{\sigma(n)} = a_{n_1} + \cdots + a_{n_{k_1}} + a_{m_1} + \cdots + a_{m_{\ell_1}} + a_{n_{k_1+1}} + \cdots$$
We claim the rearranged series converges to $x$. Let $\varepsilon > 0$. Since $\sum a_n$ is convergent, $\lim_{n\to\infty} a_n = 0$ and so $\lim_{n\to\infty} p_n = \lim_{n\to\infty} q_n = 0$. Hence there exists an $N \ge 1$ so that $|p_n|, |q_n| < \varepsilon$ for all $n \ge N$. Choose $M \in \mathbb{N}$ so that if $n \ge M$, then $a_{\sigma(n)}$ is either $p_k$ with $k \ge N$ or $q_k$ with $k \ge N$. It follows from the construction of the rearrangement that $\left|\sum_{n=1}^m a_{\sigma(n)} - x\right| < \varepsilon$ for all $m \ge M$. Hence $\sum_{n=1}^\infty a_{\sigma(n)} = x$.
If $x = +\infty$, we modify the construction by requiring that $P_n - p_{k_n} \le n < P_n$ and $Q_n < n - 1 \le Q_n - q_{\ell_n}$. The argument when $x = -\infty$ is similar. □
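The construction in the proof can be sketched in Python for one concrete conditionally convergent series. The sketch below assumes the alternating harmonic series $\sum (-1)^{n+1}/n$ and a finite target $x$; the function name `riemann_rearrangement` is ours. It adds unused positive terms while the running sum is at most the target and unused negative terms otherwise, which agrees with the proof's scheme except in the handling of exact ties.

```python
def riemann_rearrangement(target, n_terms):
    """Greedy rearrangement of sum (-1)^(n+1)/n aimed at `target`.

    Returns the partial sum after n_terms terms and the terms used, in order.
    """
    pos = 1  # next unused odd denominator: positive terms 1, 1/3, 1/5, ...
    neg = 2  # next unused even denominator: negative terms -1/2, -1/4, ...
    total, terms = 0.0, []
    for _ in range(n_terms):
        if total <= target:
            term = 1.0 / pos   # overshoot phase: add positive terms
            pos += 2
        else:
            term = -1.0 / neg  # undershoot phase: add negative terms
            neg += 2
        total += term
        terms.append(term)
    return total, terms


total, terms = riemann_rearrangement(1.5, 100_000)
print(total)
```

Each term of the original series is used at most once, and because the terms tend to zero the overshoot at each crossing shrinks, which is the heart of the convergence argument.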
Remark 3.6.11 The statement of Riemann’s rearrangement theorem can be strengthened along the following lines: let $-\infty \le x_1 < x_2 < \cdots < x_N \le +\infty$. Then there exists a rearrangement of $\sum_{n=1}^\infty a_n$ such that for each $x_j$ there exists a subsequence $(S_{n_k})$ of $(S_n)$ which converges to $x_j$. (See also the exercises.)
EXERCISES 3.6.12
(1) Determine the convergence of the series
(a) $\sum_{n=1}^\infty \frac{(-1)^n n}{n+1}$.
(b) $\sum_{n=1}^\infty \frac{\cos n\pi}{\sqrt{n}}$.
PnD1
1
n n
3
e
.
(c)
(d) $\sum_{n=2}^\infty \frac{(-1)^n}{\log n}$.
(2) Suppose $\sum_{n=1}^\infty a_n$ and $\sum_{n=1}^\infty b_n$ both converge. Must $\sum_{n=1}^\infty a_n b_n$ converge? Either prove it or find a counterexample.
(3) Show that the rearrangement $1 - \frac12 - \frac14 + \frac13 - \frac16 - \frac18 + \frac15 - \frac1{10} - \cdots$ of $\sum_{n=1}^\infty (-1)^{n+1}/n$ converges to $\frac12 \log 2$. (Hint: Look at the partial sum to $3n$ terms of the series.)
(4) Show that the rearrangement $1 + \frac13 - \frac12 + \frac15 + \frac17 - \frac14 + \frac19 + \frac1{11} - \cdots$ of $\sum_{n=1}^\infty (-1)^{n+1}/n$ converges to $\frac32 \log 2$. (Hint: Work with the partial sum to $3n$ terms. At some point you will need to show that $\frac{1}{n+1} + \cdots + \frac{1}{2n}$ is equal to the partial sum to $2n$ terms of the series $\sum_{n=1}^\infty (-1)^{n+1}/n$.)
(5) Show that $1 - \frac12 - \frac14 + \frac15 + \frac17 - \frac18 - \frac1{10} + + - - \cdots = \frac23 \log 2$.
(6) Show that $1 + \frac13 + \frac15 - \frac12 - \frac14 - \frac16 + + + - - - \cdots = \log 2$.
(7) Prove part (b) of Riemann’s rearrangement theorem.
(8) Prove the result indicated in Remark 3.6.11.
3.7 Abel’s and Dirichlet’s Tests
In this section we state and prove two powerful tests that can be used to determine
the convergence of non-absolutely convergent series. Both results depend on a
simple but subtle inequality due to Abel.
Lemma 3.7.1 (Abel’s Lemma) If the sequence $(S_n)$ of partial sums of the infinite series $\sum_{n=1}^\infty a_n$ satisfies the bounds
$$m \le S_n \le M, \quad n \in \mathbb{N},$$
then for any decreasing sequence $(u_n)$ of positive real numbers we have the bounds
$$m u_1 \le \sum_{j=1}^n a_j u_j \le M u_1, \quad n \in \mathbb{N}.$$
Proof Since $a_n = S_n - S_{n-1}$, $n \ge 2$, we have
$$\sum_{j=1}^n a_j u_j = S_1 u_1 + (S_2 - S_1)u_2 + \cdots + (S_n - S_{n-1})u_n = S_1(u_1 - u_2) + \cdots + S_{n-1}(u_{n-1} - u_n) + S_n u_n. \tag{3.4}$$
Since $(u_n)$ is a decreasing sequence of positive numbers, we have $u_1 - u_2, \ldots, u_{n-1} - u_n, u_n \ge 0$. Hence, upper and lower bounds for $\sum_{j=1}^n a_j u_j$ are given respectively by
$$M((u_1 - u_2) + (u_2 - u_3) + \cdots + (u_{n-1} - u_n) + u_n) = M u_1,$$
$$m((u_1 - u_2) + (u_2 - u_3) + \cdots + (u_{n-1} - u_n) + u_n) = m u_1,$$
and so $m u_1 \le \sum_{j=1}^n a_j u_j \le M u_1$. □
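The bounds in Abel’s lemma are easy to test numerically. The following Python sketch is only an illustration under stated assumptions: `abel_check` is a hypothetical helper, `u` is assumed to be a decreasing list of positive numbers, and $m$, $M$ are taken to be the minimum and maximum of the finitely many partial sums available, which is all the proof uses for a fixed $n$.

```python
import math


def abel_check(a, u, tol=1e-12):
    """Check m*u[0] <= sum_{j<=n} a[j]*u[j] <= M*u[0] for every prefix length n.

    m and M are the min and max of the partial sums of `a`;
    `u` is assumed decreasing and positive.
    """
    partial_sums, s = [], 0.0
    for x in a:
        s += x
        partial_sums.append(s)
    m, M = min(partial_sums), max(partial_sums)
    weighted = 0.0
    for x, w in zip(a, u):
        weighted += x * w
        if not (m * u[0] - tol <= weighted <= M * u[0] + tol):
            return False
    return True


# Illustrative data: a_j = cos(j) has bounded partial sums; u_j = 1/j decreases.
a = [math.cos(j) for j in range(1, 201)]
u = [1.0 / j for j in range(1, 201)]
print(abel_check(a, u))
```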
Proposition 3.7.2 (Abel’s Test) If the infinite series $\sum_{n=1}^\infty a_n$ is convergent and $(v_n)$ is a bounded monotone sequence of real numbers, then $\sum_{n=1}^\infty a_n v_n$ is convergent.
Proof Since $(v_n)$ is monotone and bounded, $(v_n)$ is convergent, say to $v$. If $(v_n)$ is increasing, set $u_n = v - v_n$ and if $(v_n)$ is decreasing set $u_n = v_n - v$. In both cases $(u_n)$ is a monotone decreasing sequence of positive numbers. Since $a_n v_n = a_n v - a_n u_n$ or $a_n v_n = a_n u_n + a_n v$ and $\sum a_n v = v \sum a_n$ is convergent, it is enough to prove that $\sum_{n=1}^\infty a_n u_n$ converges.
Fix $m \ge 2$ and define $K_m = \sup\{|S_n - S_{m-1}| : n \ge m\}$. By the general principle of convergence, $\lim_{m\to\infty} K_m = 0$. Applying Abel’s lemma to $\sum_{j=m}^n a_j u_j$ gives
$$\left|\sum_{j=m}^n a_j u_j\right| \le K_m u_m \le K_m u_1, \quad n \ge m.$$
Let $\varepsilon > 0$. Since $\lim_{m\to\infty} K_m = 0$, there exists an $N \in \mathbb{N}$ such that $|K_m| < \varepsilon/u_1$ for all $m \ge N$. Therefore,
$$\left|\sum_{j=m}^n a_j u_j\right| \le K_m u_1 < \varepsilon, \quad \text{if } n \ge m \ge N.$$
The result follows by the general principle of convergence. □
Examples 3.7.3
(1) Consider the series $1 - 1 + \frac12 - \frac12 + \frac13 - \cdots$, which trivially converges to zero (see Example 3.6.9). Take the bounded sequence $1, \frac12, \frac12, \frac23, \frac23, \frac34, \ldots$, which is increasing from its second term onwards. Multiplying the terms of the series by the corresponding terms of this sequence yields the infinite series
$$1 - \frac12 + \frac{1}{2^2} - \frac13 + \frac{2}{3^2} - \frac14 + \frac{3}{4^2} - \cdots$$
Abel’s test implies that this series converges. Note that the alternating series test does not apply to this series as the terms in the series do not define a decreasing sequence. (It is not too difficult to show that the series converges to $2 - \sum_{n=1}^\infty n^{-2}$.)
(2) If $\sum_{n=1}^\infty a_n$ converges, then $\sum_{n=1}^\infty \frac{a_n}{n^x}$ converges if $x \ge 0$.
Proposition 3.7.4 (Dirichlet’s Test) Suppose that the sequence $(S_n)$ of partial sums of the infinite series $\sum_{n=1}^\infty a_n$ is bounded and $(u_n)$ is a decreasing sequence of positive numbers. If $\lim_{n\to\infty} u_n = 0$, then $\sum_{n=1}^\infty a_n u_n$ is convergent.
Proof Suppose that $\alpha \le S_n \le \beta$ for all $n \in \mathbb{N}$. If we set $K = \max\{|\alpha|, |\beta|\}$, then $|S_n| \le K$ for all $n \ge 1$. We have $|S_n - S_m| \le |S_n| + |S_m|$ and so $|S_n - S_m| \le 2K$ for all $n \ge m \ge 1$. Applying Abel’s lemma gives the estimate
$$\left|\sum_{j=m}^n a_j u_j\right| \le 2K u_m, \quad \text{for all } n \ge m \ge 1.$$
Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ such that $u_m < \varepsilon/2K$ for all $m \ge N$. We have
$$\left|\sum_{j=m}^n a_j u_j\right| \le 2K u_m < \varepsilon, \quad n \ge m \ge N.$$
It follows from the general principle of convergence that $\sum_{n=1}^\infty a_n u_n$ is convergent. □
Examples 3.7.5
(1) The series $\sum_{n=1}^\infty (-1)^{n+1}$ is not convergent but the partial sums are bounded. If $(u_n)$ is a decreasing sequence of positive numbers converging to zero then Dirichlet’s test implies that $\sum_{n=1}^\infty (-1)^{n+1} u_n$ converges (Leibniz test).
(2) If the partial sums of $\sum_{n=1}^\infty a_n$ are bounded then $\sum_{n=1}^\infty \frac{a_n}{n^x}$ converges if $x > 0$.
(3) Dirichlet’s test is often useful for the study of trigonometric series. For example, consider the series $\sum_{n=1}^\infty \frac{\cos n\theta}{n}$. Using Dirichlet’s test we show that the series converges provided that $\theta$ is not an integer multiple of $2\pi$. We start by noting that if $\theta$ is an integer multiple of $2\pi$ then the series is the harmonic series $\sum_{n=1}^\infty \frac1n$ which is divergent. If $\theta$ is an odd multiple of $\pi$, then the series is $\sum_{n=1}^\infty \frac{(-1)^n}{n}$ which converges by the alternating series test. For other values of $\theta$, in particular irrational multiples of $\pi$, the issue of convergence is quite subtle.
The main ingredient in the proof of convergence of $\sum_{n=1}^\infty \frac{\cos n\theta}{n}$ is the trigonometric identity
$$\sum_{j=1}^n \cos(j\theta) = \frac{\cos\left(\frac{(n+1)\theta}{2}\right) \sin\left(\frac{n\theta}{2}\right)}{\sin\left(\frac{\theta}{2}\right)}, \quad \theta \ne 2k\pi,\ k \in \mathbb{Z}. \tag{3.5}$$
(We give the proof of this identity in an appendix at the end of the chapter.) Provided $\theta$ is not an integral multiple of $2\pi$, (3.5) gives the estimate
$$\left|\sum_{j=1}^n \cos(j\theta)\right| \le \frac{\left|\cos\left(\frac{(n+1)\theta}{2}\right) \sin\left(\frac{n\theta}{2}\right)\right|}{\left|\sin\left(\frac{\theta}{2}\right)\right|} \le \frac{1}{\left|\sin\left(\frac{\theta}{2}\right)\right|}, \quad n \ge 1.$$
Take $a_n = \cos(n\theta)$, $u_n = 1/n$ in Dirichlet’s test.
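The boundedness of the partial sums of $\sum \cos(j\theta)$ can be checked numerically. The Python sketch below compares the direct sum with the closed form of identity (3.5) and with the bound $1/|\sin(\theta/2)|$; the helper names and the sample value $\theta = 1$ are assumptions made for illustration.

```python
import math


def cos_partial_sum(theta, n):
    """Direct evaluation of sum_{j=1}^n cos(j*theta)."""
    return sum(math.cos(j * theta) for j in range(1, n + 1))


def cos_closed_form(theta, n):
    """Right-hand side of (3.5); requires theta not a multiple of 2*pi."""
    return (math.cos((n + 1) * theta / 2) * math.sin(n * theta / 2)
            / math.sin(theta / 2))


theta = 1.0
bound = 1.0 / abs(math.sin(theta / 2))
for n in (1, 10, 100):
    print(n, cos_partial_sum(theta, n), cos_closed_form(theta, n), bound)
```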
EXERCISES 3.7.6
(1) Suppose that $\sum_{n=1}^\infty n a_n$ converges. Show that $\sum_{n=1}^\infty a_n$ converges. What about $\sum_{n=1}^\infty \sqrt{n}\, a_n$? $\sum_{n=1}^\infty (-1)^{n+1} a_n$? (You may not assume all the terms are of the same sign. Either prove it or find a counterexample.)
(2) Suppose that $\sum_{n=1}^\infty a_n$ converges. Show that $\sum_{n=1}^\infty \sqrt[n]{n}\, a_n$ converges.
(3) For what values of $\theta \in \mathbb{R}$ is $\sum_{n=1}^\infty \frac{\sin(n\theta)}{n}$ convergent?
(4) Prove that $\sum_{n=0}^\infty \frac{\sin((2n+1)\theta)}{2n+1}$ converges for all $\theta \in \mathbb{R}$.
(5) For what values of $\theta \in \mathbb{R}$ is the series with $n$th term
$$\left(1 + \frac12 + \cdots + \frac1n\right) \frac{\sin(n\theta)}{n}$$
convergent? (You may assume $(1 + \frac12 + \cdots + \frac1n)/n \to 0$.)
(6) Prove the following extensions of Abel’s and Dirichlet’s tests (due to Dedekind).
(a) Let $\sum_{n=1}^\infty a_n$ be convergent and suppose $(v_n)$ is a bounded sequence such that $\sum_{n=1}^\infty |v_{n+1} - v_n|$ converges. Then $\sum_{n=1}^\infty v_n a_n$ converges.
(b) Let $(a_n)$ and $(v_n)$ be such that the sequences of partial sums $\left(\sum_{j=1}^n a_j\right)$ and $\left(\sum_{j=1}^n |v_{j+1} - v_j|\right)$ are bounded and $\lim_{n\to\infty} v_n = 0$. Then $\sum_{n=1}^\infty v_n a_n$ converges.
(7) Prove that $\sum_{n=1}^\infty (-1)^n \frac{\sin(\log n)}{n^a}$ is convergent if $a > 0$. (Hint: Use the result of the previous question with $v_n = \frac{\sin(\log n)}{n^a}$. Estimate $|v_{n+1} - v_n|$ using the mean value theorem.)
3.8 Double Series
A double series is an infinite series of the form
$$\sum_{m,n=1}^\infty a_{m,n},$$
where $a_{m,n} \in \mathbb{R}$ (or $\mathbb{C}$ for a complex double series). Given $m, n \in \mathbb{N}$, we define the partial sum $S_{m,n}$ by
$$S_{m,n} = \sum_{i \le m,\, j \le n} a_{i,j} = \sum_{i=1}^m \sum_{j=1}^n a_{i,j}.$$
Definition 3.8.1 The double series $\sum_{m,n=1}^\infty a_{m,n}$ is convergent, with sum $S$, if there exists an $S \in \mathbb{R}$ such that for every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
$$|S_{m,n} - S| < \varepsilon, \quad m, n \ge N.$$
That is, if $\lim_{m,n\to\infty} S_{m,n} = S$.
Examples 3.8.2
(1) The double series $\sum_{m,n=1}^\infty 2^{-m} 3^{-n}$ is convergent with sum $1/2$. Since $S_{m,n} = \left(\sum_{i=1}^m 2^{-i}\right)\left(\sum_{j=1}^n 3^{-j}\right)$, we easily compute that
$$S_{m,n} = (1 - 2^{-m})(1 - 3^{-n})/2.$$
The result follows since $\lim_{m\to\infty} (1 - 2^{-m}) = 1$ and $\lim_{n\to\infty} (1 - 3^{-n}) = 1$.
(2) It is not enough in Definition 3.8.1 to require that $\lim_{n\to\infty} S_{n,n} = S$. For example, define $a_{i,i} = i^{-2}$, $i \in \mathbb{N}$, and $a_{i,j} = -a_{j,i} = 1$, if $i < j$. Then $S_{n,n} = \sum_{i=1}^n i^{-2}$ but $a_{m,n} \not\to 0$ as $n, m \to \infty$! (cf. Lemma 3.2.3). The definition we have given is simple and leads quickly to results on repeated series, but alternative definitions are possible that define the partial sums on non-rectangular regions.
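Both examples can be explored with exact rational arithmetic. The Python sketch below is illustrative only; the function names `S_geometric`, `a_skew` and `S_skew` are ours. It evaluates the rectangular partial sums for $2^{-m}3^{-n}$ and for the antisymmetric array of example (2), where square partial sums behave well but rectangular ones do not.

```python
from fractions import Fraction


def S_geometric(m, n):
    """Rectangular partial sum S_{m,n} of the double series 2^(-i) * 3^(-j)."""
    return sum(Fraction(1, 2 ** i * 3 ** j)
               for i in range(1, m + 1) for j in range(1, n + 1))


def a_skew(i, j):
    """Array of example (2): 1/i^2 on the diagonal, +1 above it, -1 below it."""
    if i == j:
        return Fraction(1, i * i)
    return Fraction(1) if i < j else Fraction(-1)


def S_skew(m, n):
    """Rectangular partial sum S_{m,n} for the array a_skew."""
    return sum(a_skew(i, j) for i in range(1, m + 1) for j in range(1, n + 1))


print(S_geometric(5, 4), S_skew(3, 3), S_skew(6, 3))
```

For the second array, the off-diagonal entries cancel in pairs on squares, but a $2n \times n$ rectangle picks up $n^2$ uncancelled entries equal to $-1$.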
Definition 3.8.3 The double series $\sum_{m,n=1}^\infty a_{m,n}$ is absolutely convergent if $\sum_{m,n=1}^\infty |a_{m,n}|$ is convergent.
Proposition 3.8.4 An absolutely convergent double series is convergent.
Proof Suppose that $\sum_{m,n=1}^\infty a_{m,n}$ is absolutely convergent and that $\sum_{m,n=1}^\infty |a_{m,n}| = \hat{S}$. Let $\hat{S}_{m,n}$ and $S_{m,n}$ denote the partial sums for $\sum_{m,n=1}^\infty |a_{m,n}|$ and $\sum_{m,n=1}^\infty a_{m,n}$, respectively. Since $\sum_{m,n=1}^\infty a_{m,n}$ is absolutely convergent, $(\hat{S}_{n,n})$ is a Cauchy sequence. Just as in the proof of Theorem 3.5.3, it follows easily that $(S_{n,n})$ is Cauchy and so $(S_{n,n})$ converges, say to $S$. Suppose $m \ge n$. We have
$$|S - S_{m,n}| \le |S - S_{n,n}| + |S_{n,n} - S_{m,n}| \le |S - S_{n,n}| + |\hat{S}_{m,n} - \hat{S}_{n,n}|.$$
Letting $m \ge n \to \infty$, we see that $\lim_{m,n\to\infty} S_{m,n} = S$. The same argument applies if $m \le n$. □
Proposition 3.8.5 Let $\sigma : \mathbb{N}^2 \to \mathbb{N}^2$ be a bijection. If $\sum_{m,n=1}^\infty a_{m,n}$ is absolutely convergent then so is $\sum_{m,n=1}^\infty a_{\sigma(m,n)}$ and the two sums are equal.
Proof The proof is similar to that of Theorem 3.5.15; we leave the details to the exercises. □
3.8.1 Repeated Series
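A repeated series is an infinite series which is of either of the forms
$$\sum_{m=1}^\infty \left(\sum_{n=1}^\infty a_{m,n}\right) \quad \text{or} \quad \sum_{n=1}^\infty \left(\sum_{m=1}^\infty a_{m,n}\right).$$
If we think of the terms $a_{m,n}$ as defining an infinite matrix $[a_{m,n}]$, then the first repeated sum is naturally called the sum by rows of the double series $\sum_{m,n=1}^\infty a_{m,n}$ and the second repeated sum is called the sum by columns.
Proposition 3.8.6 Suppose that $\sum_{m,n=1}^\infty a_{m,n}$ converges and that for all $m, n \in \mathbb{N}$, the series $\sum_{n=1}^\infty a_{m,n}$ and $\sum_{m=1}^\infty a_{m,n}$ converge, then the repeated series both converge and we have
$$\sum_{m,n=1}^\infty a_{m,n} = \sum_{m=1}^\infty \left(\sum_{n=1}^\infty a_{m,n}\right) = \sum_{n=1}^\infty \left(\sum_{m=1}^\infty a_{m,n}\right).$$
Proof Set $\sum_{m,n=1}^\infty a_{m,n} = S$. Given $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
$$|S_{m,n} - S| < \varepsilon, \quad m, n \ge N.$$
Hence $|\lim_{n\to\infty} S_{m,n} - S| \le \varepsilon$ for all $m \ge N$ since $\sum_{n=1}^\infty a_{m,n}$ converges for all $m \in \mathbb{N}$. It follows that $\lim_{m\to\infty} (\lim_{n\to\infty} S_{m,n}) = S$. The same argument, with the roles of $m$ and $n$ interchanged, proves that $\lim_{n\to\infty} (\lim_{m\to\infty} S_{m,n}) = S$. □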
Example 3.8.7 If both repeated series converge but the double series is divergent then the repeated sums may be different. We give an example due to Arndt [5, Chap. V]. If we define
$$a_{m,n} = \frac{1}{m+1}\left(\frac{m}{m+1}\right)^n - \frac{1}{m+2}\left(\frac{m+1}{m+2}\right)^n,$$
then after some computation we find that
$$S_{m,n} = \frac12 - \frac{1}{2^{n+1}} - \left[\frac{m+1}{m+2} - \left(\frac{m+1}{m+2}\right)^{n+1}\right].$$
It follows that
$$\sum_{m=1}^\infty \left(\sum_{n=1}^\infty a_{m,n}\right) = \lim_{m\to\infty} (\lim_{n\to\infty} S_{m,n}) = -\frac12,$$
$$\sum_{n=1}^\infty \left(\sum_{m=1}^\infty a_{m,n}\right) = \lim_{n\to\infty} (\lim_{m\to\infty} S_{m,n}) = \frac12.$$
It is clear that $\lim_{m,n\to\infty} S_{m,n}$ does not exist.
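The closed form for $S_{m,n}$ in Arndt’s example can be checked against a direct summation, and the two iterated limits can be approximated. The Python sketch below uses exact fractions; the function names `a`, `S_direct` and `S_closed`, and the particular index values, are assumptions made for illustration.

```python
from fractions import Fraction


def a(m, n):
    """Term a_{m,n} of Arndt's example."""
    return (Fraction(1, m + 1) * Fraction(m, m + 1) ** n
            - Fraction(1, m + 2) * Fraction(m + 1, m + 2) ** n)


def S_direct(m, n):
    """Rectangular partial sum computed term by term."""
    return sum(a(i, j) for i in range(1, m + 1) for j in range(1, n + 1))


def S_closed(m, n):
    """Closed form 1/2 - 2^-(n+1) - [r - r^(n+1)] with r = (m+1)/(m+2)."""
    r = Fraction(m + 1, m + 2)
    return Fraction(1, 2) - Fraction(1, 2) ** (n + 1) - (r - r ** (n + 1))


print(S_direct(4, 5) == S_closed(4, 5))
print(float(S_closed(50, 2000)))   # n large first: close to 1/2 - 51/52
print(float(S_closed(2000, 5)))    # m large first: close to 1/2 - 2^-6
```

Letting $n \to \infty$ first and then $m \to \infty$ drives the values towards $-\frac12$, while the opposite order drives them towards $\frac12$.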
Proposition 3.8.8 Suppose that one of
$$\sum_{m,n=1}^\infty |a_{m,n}|, \quad \sum_{m=1}^\infty \left(\sum_{n=1}^\infty |a_{m,n}|\right), \quad \sum_{n=1}^\infty \left(\sum_{m=1}^\infty |a_{m,n}|\right)$$
is convergent, then all three series are convergent with the same sum and we have
$$\sum_{m,n=1}^\infty a_{m,n} = \sum_{m=1}^\infty \left(\sum_{n=1}^\infty a_{m,n}\right) = \sum_{n=1}^\infty \left(\sum_{m=1}^\infty a_{m,n}\right).$$
Proof The result follows straightforwardly from Propositions 3.8.6 and 3.8.4 and we leave the details to the exercises. □
EXERCISES 3.8.9
(1) Let $a_{m,n} = 1/(\alpha^n + \beta^m)$. Show that if $\alpha, \beta > 1$, then the double series $\sum_{m,n=1}^\infty a_{m,n}$ is convergent.
(2) State and prove a version of the comparison test that is applicable to double series of positive terms.
(3) Prove Proposition 3.8.5.
(4) Prove Proposition 3.8.8.
(5) Given the series $\sum_{n=0}^\infty a_n$, $\sum_{m=0}^\infty b_m$, define $c_{m,n} = a_m b_n$, $m, n \ge 0$. Show that if $\sum_{n=0}^\infty a_n$, $\sum_{m=0}^\infty b_m$ are convergent, then the double series $\sum_{m,n=1}^\infty a_m b_n$ is convergent with sum equal to $\left(\sum_{n=1}^\infty a_n\right)\left(\sum_{m=1}^\infty b_m\right)$.
(6) Suppose that $f : [1, \infty)^2 \to \mathbb{R}$ is continuous, monotone decreasing (in the sense that $f(x', y') \le f(x, y)$, $x' \ge x$, $y' \ge y$) and $\lim_{(x,y)\to+\infty} f(x, y) = 0$. Show that the double series $\sum_{m,n=1}^\infty f(m, n)$ converges if and only if $\int_1^\infty \int_1^\infty f(x, y)\, dx\, dy$ converges.
(7) Let $f : [0, \infty)^2 \to \mathbb{R}$ be continuous and positive. For $s \ge 0$, define $g(s) = \inf_{x \in [0,s]} f(x, s - x)$, $G(s) = \sup_{x \in [0,s]} f(x, s - x)$. Suppose that $s g(s)$ and $s G(s)$ are both monotone decreasing and converge to zero as $s \to \infty$. Show that the double series $\sum_{m,n=1}^\infty f(m, n)$ converges if $\int_1^\infty s G(s)\, ds$ converges and diverges if $\int_1^\infty s g(s)\, ds$ diverges. (Hint: Given $n$, estimate the sum of terms on the diagonal $x + y = n$.)
(8) Show that
(a) $\sum_{m,n=1}^\infty (m + n)^{-a}$ converges iff $a > 2$.
(b) $\sum_{m,n=1}^\infty (Am^2 + 2Bmn + Cn^2)^{-a}$ converges if $a > 2$, $A, C > 0$ and $AC > B^2$ (if $B < 0$).
(c) If $f : [1, \infty) \to \mathbb{R}$ is continuous and monotone decreasing to zero, then $\sum_{m,n=1}^\infty f(Am^2 + 2Bmn + Cn^2)$ converges iff $\int_1^\infty f(x)\, dx$ converges (assume the coefficients $A, B, C$ satisfy the conditions of the previous question).
(9) Show that if $a_{nm} = (-1)^{m+n}/mn$, then the double series and associated repeated series are convergent with common sum $(\log 2)^2$ but the series is not absolutely convergent.
(10) Show that $\sum_{m,n=1}^\infty x^{mn} = \sum_{n=1}^\infty \frac{x^n}{1 - x^n}$, $|x| < 1$ (Lambert’s series).
(11) Let $a_{m,n} = (-1)^{m+n} mn/(m + n)^2$. Show that
$$\sum_{m=1}^\infty \left(\sum_{n=1}^\infty a_{m,n}\right) = \sum_{n=1}^\infty \left(\sum_{m=1}^\infty a_{m,n}\right) = \frac16\left(\log 2 - \frac14\right) = L$$
but that the double series $\sum_{m,n=1}^\infty a_{m,n}$ is not convergent (it oscillates between $L - \frac{1}{16}$ and $L + \frac{1}{16}$).
3.9 Infinite Products
Suppose that $(a_n)$ is a sequence of real numbers. The infinite product $\prod_{n=1}^\infty a_n$ is defined to be the sequence $(P_n)$ of partial products where $P_n = a_1 a_2 \cdots a_n$, $n \in \mathbb{N}$. Roughly speaking, the infinite product $\prod_{n=1}^\infty a_n$ converges if the sequence $(P_n)$ converges. In practice, it is useful to avoid situations where $\lim_{n\to\infty} P_n = 0$. If $\lim_{n\to\infty} P_n$ exists and is either $0$ or $\pm\infty$, the infinite product is said to diverge (we refine this definition later). As we shall soon see, there is a close connection between the theories of infinite series and infinite products. This relationship is best seen by working with infinite products of the form $\prod_{n=1}^\infty (1 + a_n)$ rather than $\prod_{n=1}^\infty a_n$.
Definition 3.9.1 Let $(a_n)$ be a sequence of real numbers. The infinite product $\prod_{n=1}^\infty (1 + a_n)$ is convergent if the sequence $(P_n)$ of partial products is convergent and does not converge to either $0$ or $+\infty$. If $\prod_{n=1}^\infty (1 + a_n)$ is not convergent, it is divergent.
Remarks 3.9.2
(1) Provided that the terms $1 + a_n$ are all positive, the infinite product $\prod_{n=1}^\infty (1 + a_n)$ converges iff the infinite sum $\sum_{n=1}^\infty \log(1 + a_n)$ converges.
(2) We can define infinite products $\prod_{n=1}^\infty (1 + a_n)$ with $a_n \in \mathbb{C}$. The definition of convergence is the same though we can no longer relate the convergence of the infinite product with the convergence of $\sum_{n=1}^\infty \log(1 + a_n)$. With the exception of Proposition 3.9.10, many of the results and tests we give below do not apply to the complex case.
(3) In practice, it is prudent to slightly modify the definition of convergence. We say that the infinite product $\prod_{n=1}^\infty (1 + a_n)$ is convergent if there exists an $N \in \mathbb{N}$ such that $a_n \ne -1$, $n \ge N$, and $\prod_{n=N}^\infty (1 + a_n)$ converges in the sense of Definition 3.9.1. Otherwise we say the product diverges. The reason for this variation will be clearer when we look at the infinite product formula for the sine function.
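Examples 3.9.3
(1) The infinite product $\prod_{n=1}^\infty (1 + \frac1n)$ diverges. We have $(1 + 1)(1 + 1/2) \cdots (1 + 1/n) \ge 1 + 1/2 + \cdots + 1/n$. Since the series $\sum_{n=1}^\infty 1/n$ is divergent to $+\infty$, $\prod_{n=1}^\infty (1 + \frac1n)$ diverges to $+\infty$.
(2) The infinite product $\prod_{n=2}^\infty (1 - \frac{1}{n^2})$ is convergent. Observe that $1 - \frac{1}{n^2} = \frac{(n-1)(n+1)}{n^2}$. Consequently, $P_n = \prod_{j=2}^n \frac{(j-1)(j+1)}{j^2} = \frac12 \cdot \frac{n+1}{n}$. Hence $\prod_{n=2}^\infty (1 - \frac{1}{n^2})$ is convergent and $\prod_{n=2}^\infty (1 - \frac{1}{n^2}) = \frac12$.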
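The telescoping in example (2) can be confirmed with exact arithmetic. This Python sketch, with the hypothetical helper `partial_product`, multiplies the factors $1 - 1/j^2$ for $j = 2, \ldots, n$ and compares the result with $\frac{n+1}{2n}$.

```python
from fractions import Fraction


def partial_product(n):
    """P_n = prod_{j=2}^n (1 - 1/j^2), computed exactly."""
    p = Fraction(1)
    for j in range(2, n + 1):
        p *= 1 - Fraction(1, j * j)
    return p


for n in (2, 10, 1000):
    print(n, partial_product(n), Fraction(n + 1, 2 * n))
```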
If we assume all the terms $a_n$ are positive then it is easy to give necessary and sufficient conditions for convergence of $\prod_{n=1}^\infty (1 + a_n)$.
Lemma 3.9.4 Assume $a_n \ge 0$ for all $n \in \mathbb{N}$. Then $\prod_{n=1}^\infty (1 + a_n)$ converges iff $\sum_{n=1}^\infty a_n < \infty$ and then
$$\sum_{n=1}^\infty a_n \le \prod_{n=1}^\infty (1 + a_n) \le \exp\left(\sum_{n=1}^\infty a_n\right).$$
Proof For $n \in \mathbb{N}$,
$$\sum_{i=1}^n a_i \le \prod_{i=1}^n (1 + a_i) \le e^{\sum_{i=1}^n a_i},$$
where the last inequality follows from $1 + a_i \le e^{a_i}$. The result follows from Theorem 2.3.18. □
Example 3.9.5 As an immediate consequence of Lemma 3.9.4 and our results on series, we have $\prod_{n=1}^\infty (1 + \frac{1}{n^p})$ converges iff $p > 1$.
We have a useful variation of Lemma 3.9.4 that allows for all the terms $a_i$ to be negative.
Lemma 3.9.6 Assume $a_n \in [0, 1)$ for all $n \in \mathbb{N}$. Then $\prod_{n=1}^\infty (1 - a_n)$ converges iff $\sum_{n=1}^\infty a_n < \infty$.
Proof Since $(1 - a) \le (1 + a)^{-1}$ if $a \in [0, 1)$, we have $\prod_{n=1}^m (1 - a_n) \le \left(\prod_{n=1}^m (1 + a_n)\right)^{-1}$ for all $m \in \mathbb{N}$. By Lemma 3.9.4 it follows that if $\sum_{n=1}^\infty a_n$ diverges to $+\infty$, then $\prod_{n=1}^\infty (1 - a_n)$ diverges to zero. Conversely, suppose that $\sum_{n=1}^\infty a_n$ is convergent. An easy induction on $n$ verifies that for $n \ge m \ge 1$ we have
$$\prod_{j=m}^n (1 - a_j) \ge 1 - \sum_{j=m}^n a_j.$$
Since $\sum_{n=1}^\infty a_n$ is convergent, there exists an $N \in \mathbb{N}$ such that $\sum_{j=m}^n a_j < 1/2$ for all $n \ge m \ge N$. Hence
$$\prod_{j=m}^n (1 - a_j) > \frac12, \quad n \ge m \ge N.$$
Consequently, $P_n = \prod_{j=1}^n (1 - a_j) \ge \frac12 \prod_{j=1}^N (1 - a_j) = C > 0$, for all $n \in \mathbb{N}$. Therefore the decreasing sequence $(P_n)$ is bounded below by $C > 0$ and therefore $\prod_{n=1}^\infty (1 - a_n)$ converges. □
Remark 3.9.7 We refer the reader to Exercises 3.9.19(4) for the Weierstrass inequalities which we have made use of in the proofs of Lemmas 3.9.4, 3.9.6.
Lemmas 3.9.4, 3.9.6 will suffice for most of our intended applications to infinite
products (in particular, our Fourier series proof in Chap. 5 of the infinite product
formula for the sine function). In the remainder of the section, we develop some
more advanced topics from the theory of infinite products that parallels our previous
work on conditional and absolute convergence for infinite series.
3.9.1 Tests for Convergence of an Infinite Product
We start with the definition of absolute convergence for infinite products.
Definition 3.9.8 The infinite product $\prod_{n=1}^\infty (1 + a_n)$ is absolutely convergent if $\prod_{n=1}^\infty (1 + |a_n|)$ is convergent.
Before giving our main result, we prove a lemma that is useful for estimating products.
Lemma 3.9.9 Let $(a_n)$ be a sequence of real or complex numbers. Then
(1) $\left|\prod_{j=1}^n (1 + a_j)\right| \le e^{\sum_{j=1}^n |a_j|}$.
(2) $\left|\prod_{j=1}^n (1 + a_j) - 1\right| \le e^{\sum_{j=1}^n |a_j|} - 1$.
Proof Estimate (1) follows easily from $|1 + x| \le 1 + |x| \le e^{|x|}$ ($x \in \mathbb{R}$ or $x \in \mathbb{C}$). For estimate (2), observe that $\left|\prod_{j=1}^n (1 + a_j) - 1\right| \le \left|\prod_{j=1}^n (1 + |a_j|) - 1\right|$ and then use (1). □
Proposition 3.9.10 Let $(a_n)$ be a sequence of real (or complex) numbers none of which equals $-1$.
(a) $\prod_{n=1}^\infty (1 + a_n)$ is absolutely convergent iff $\sum_{n=1}^\infty a_n$ is absolutely convergent.
(b) If $\prod_{n=1}^\infty (1 + a_n)$ is absolutely convergent, then $\prod_{n=1}^\infty (1 + a_n)$ is convergent.
Proof (a) Lemma 3.9.4 implies that $\prod_{n=1}^\infty (1 + |a_n|)$ is convergent iff $\sum_{n=1}^\infty |a_n|$ is convergent. It remains to prove (b). Set $P_n = \prod_{j=1}^n (1 + a_j)$. We prove that the sequence $(P_n)$ of partial products is a Cauchy sequence. If $n > m$, then
$$P_n - P_m = P_m \left(\prod_{j=m+1}^n (1 + a_j) - 1\right).$$
In order to estimate $|P_n - P_m|$, we make use of the inequalities (a) $|1 + x| \le 1 + |x|$, $x \in \mathbb{R}$ (or $\mathbb{C}$), and (b) $1 + x \le e^x$ for all $x \ge 0$. We have
$$|P_n - P_m| = |P_m| \left|\prod_{j=m+1}^n (1 + a_j) - 1\right| \le e^{\sum_{j=1}^m |a_j|} \left(e^{\sum_{j=m+1}^n |a_j|} - 1\right) = e^{\sum_{j=1}^n |a_j|} - e^{\sum_{j=1}^m |a_j|},$$
where the inequality is by Lemma 3.9.9. Since $\sum_{n=1}^\infty |a_n| < \infty$, $\left(\sum_{j=1}^n |a_j|\right)$ is a Cauchy sequence and so therefore is $\left(e^{\sum_{j=1}^n |a_j|}\right)$. Hence $|P_n - P_m| \to 0$ as $m, n \to \infty$ and $(P_n)$ is a Cauchy sequence. This proves that $\prod_{n=1}^\infty (1 + a_n)$ exists. It remains to show that $\prod_{n=1}^\infty (1 + a_n) = L \ne 0$ if $a_n \ne -1$ for all $n \in \mathbb{N}$. Now if $a_n \ne -1$, for all $n \in \mathbb{N}$, then $P_m \ne 0$ for all $m \ge 1$. From our previous estimates for $|P_n - P_m|$ we see that
$$|P_n - P_m| \le |P_m| \left(e^{\sum_{j=m+1}^n |a_j|} - 1\right).$$
Since $\sum_{n=1}^\infty |a_n| < \infty$, there exists an $N \in \mathbb{N}$ such that $e^{\sum_{j=m+1}^n |a_j|} - 1 \le \frac12$ for all $n > m \ge N$ and so
$$|P_n - P_m| \le |P_m|/2, \quad n > m \ge N.$$
Letting $n \to \infty$, we get $|L - P_m| \le |P_m|/2$, $m \ge N$. In particular, if $L = 0$, this implies $|P_N| \le |P_N|/2$ and so $P_N = 0$. Contradiction. □
Example 3.9.11 The infinite products $\prod_{n=2}^\infty (1 \pm n^{-p})$ converge if $p > 1$.
Theorem 3.9.12 (General Principle of Convergence for Products) Let $(a_n)$ be a sequence of real (or complex) numbers none of which equals $-1$. The infinite product $\prod_{n=1}^\infty (1 + a_n)$ converges iff for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that
$$\left|\prod_{j=m}^n (1 + a_j) - 1\right| < \varepsilon, \quad \text{for all } n \ge m \ge N.$$
Proof We leave the proof to the exercises. □
We conclude with some additional tests for convergence which only apply to
infinite products of real numbers and give necessary and sufficient conditions for
convergence in terms of infinite sums.
Lemma 3.9.13 Let $(a_n)$ be a sequence of real numbers none of which equals $-1$. If $\sum_{n=1}^\infty a_n^2 < \infty$, then
(a) $\prod_{n=1}^\infty (1 + a_n)$ converges if $\sum_{n=1}^\infty a_n$ converges.
(b) $\prod_{n=1}^\infty (1 + a_n)$ diverges to $+\infty$ if $\sum_{n=1}^\infty a_n$ diverges to $+\infty$.
(c) $\prod_{n=1}^\infty (1 + a_n)$ diverges to $0$ if $\sum_{n=1}^\infty a_n$ diverges to $-\infty$.
If $\sum_{n=1}^\infty a_n^2$ diverges and $\sum_{n=1}^\infty a_n$ converges, then $\prod_{n=1}^\infty (1 + a_n)$ diverges to zero.
Proof A straightforward application of the calculus shows that
$$\frac{u^2}{4} \le u - \log(1 + u) \le \begin{cases} u^2, & \text{if } 0 \le u \le 1, \\ \frac12 u^2/(1 + u), & \text{if } 0 > u > -1. \end{cases}$$
Since one of $\sum_{n=1}^\infty a_n$, $\sum_{n=1}^\infty a_n^2$ converges, $a_n \to 0$. Hence we may choose $N \in \mathbb{N}$ such that $|a_n| \le 1/2$, $n \ge N$. We have $|1 + a_n| \ge 1 - |a_n| \ge 1/2$, for all $n \ge N$. Using the inequalities above, we have for $n > m \ge N$
$$\frac14 \sum_{i=m+1}^n a_i^2 \le \sum_{i=m+1}^n a_i - \log\left[\prod_{i=m+1}^n (1 + a_i)\right] \le \sum_{i=m+1}^n a_i^2.$$
Hence if $\sum a_n^2$ is convergent then $(a_{m+1} + \cdots + a_n) - \log \prod_{i=m+1}^n (1 + a_i)$ converges to zero as $n \ge m \to \infty$. Statements (a,b,c) now follow by the general principle of convergence. If $\sum_{n=1}^\infty a_n^2$ diverges and $\sum_{n=1}^\infty a_n$ converges, then $\log\left[\prod_{i=m+1}^n (1 + a_i)\right]$ must diverge to $-\infty$ proving the final statement. □
Example 3.9.14 Lemma 3.9.13 implies that $\prod_{n=1}^\infty (1 + \frac1n)$ is divergent while $\prod_{n=1}^\infty (1 + \frac{(-1)^n}{n})$ is convergent.
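The contrast between these two products shows up clearly in the partial products. In the Python sketch below (helper names are ours), the alternating product is started at $n = 2$, an assumption made to skip the zero factor at $n = 1$ in the spirit of Remarks 3.9.2(3).

```python
from fractions import Fraction


def product_plus(N):
    """prod_{n=1}^N (1 + 1/n), which telescopes to N + 1."""
    p = Fraction(1)
    for n in range(1, N + 1):
        p *= 1 + Fraction(1, n)
    return p


def product_alternating(N):
    """prod_{n=2}^N (1 + (-1)^n / n), starting at n = 2 to avoid the zero factor."""
    p = Fraction(1)
    for n in range(2, N + 1):
        p *= 1 + Fraction((-1) ** n, n)
    return p


print(product_plus(50), product_alternating(50), product_alternating(51))
```

The first partial products grow without bound, while the second oscillate between $1$ and $(N+1)/N$ and so converge to $1$.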
3.9.2 Tannery’s Theorem and an Infinite Product for sin x
We have a version of Tannery’s theorem for infinite products.
Theorem 3.9.15 Suppose we are given a sequence $(a_n(p)) \subset \mathbb{R}$ depending on $p \in \mathbb{N}$. Assume
(1) $\lim_{p\to\infty} a_n(p) = a_n$, $n \in \mathbb{N}$.
(2) $|a_n(p)| \le M_n$, for all $n, p \in \mathbb{N}$, where $\sum_{n=1}^\infty M_n < \infty$.
Then
$$\lim_{p\to\infty} \prod_{n=1}^p (1 + a_n(p)) = \lim_{p\to\infty} \prod_{n=1}^\infty (1 + a_n(p)) = \prod_{n=1}^\infty (1 + a_n).$$
The result continues to hold if $(a_n(p)) \subset \mathbb{C}$.
Proof Suppose first that $(a_n(p)) \subset \mathbb{R}$. For sufficiently large $N$, we can assume $|a_n(p)| \le 1/2$, $n, p \ge N$. We have
$$\prod_{j=N}^n (1 + a_j(p)) = \exp\left(\sum_{j=N}^n \log(1 + a_j(p))\right).$$
Now $A_j(p) = \log(1 + a_j(p))$ satisfies the conditions of Tannery’s theorem for series since if $|a_j(p)| \le 1/2$, we have $|A_j(p)| = |\log(1 + a_j(p))| \le 2|a_j(p)|$ (use Exercise 2.9.9(4)). Now apply Tannery’s theorem for series (Theorem 3.5.6). If $(a_n(p)) \subset \mathbb{C}$, we reduce to the series case using the method of Proposition 3.9.10 (see Remark 3.9.16 below). □
Remark 3.9.16 In the complex case we take as definitions $\exp(z) = e^z = \sum_{n=0}^\infty \frac{z^n}{n!}$, $z \in \mathbb{C}$, and $\log(1 + z) = \sum_{n=1}^\infty (-1)^{n+1} \frac{z^n}{n}$, for $|z| < 1$. It is not hard to show (using absolute convergence) that $\exp(\log(1 + z)) = 1 + z$, if $|z| < 1$, which is needed for the proof of Tannery’s theorem for complex infinite products.
Proposition 3.9.17 For all $z \in \mathbb{C}$
$$\sin z = z \prod_{n=1}^\infty \left(1 - \frac{z^2}{n^2\pi^2}\right).$$
Proof We give a proof of Proposition 3.9.17 that uses Tannery’s theorem for infinite products and a minimal amount of complex variable theory. (We give an alternative and simpler real variable proof based on Fourier series in Chap. 5.)
For $z \in \mathbb{C}$ we have (by definition)
$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}. \tag{3.6}$$
Applying the complex version of Proposition 3.5.7 to $e^{\pm iz}$ gives
$$\sin z = \lim_{n\to\infty} \frac{\left(1 + \frac{iz}{n}\right)^n - \left(1 - \frac{iz}{n}\right)^n}{2i} = \lim_{n\to\infty} P_n(z),$$
where $P_n(z)$ is a polynomial of degree (at most) $n$. Our approach will be to factorize $P_n(z)$ and for this we need to find the solutions of $P_n(z) = 0$. Observe that
$$P_n(z) = 0 \iff \left(1 + \frac{iz}{n}\right)^n = \left(1 - \frac{iz}{n}\right)^n \iff 1 + \frac{iz}{n} = u\left(1 - \frac{iz}{n}\right), \text{ where } u^n = 1, \iff z = \frac{n}{i} \cdot \frac{u - 1}{u + 1}.$$
From now on we assume that $n$ is odd and so $u \ne -1$. The solutions of $u^n = 1$ are given by
$$u = e^{2k\pi i/n}, \quad k = -\frac{n-1}{2}, \ldots, -1, 0, 1, \ldots, \frac{n-1}{2}.$$
For $k \in \{-\frac{n-1}{2}, \ldots, -1, 0, 1, \ldots, \frac{n-1}{2}\}$ we have $P_n(z_k) = 0$, where
$$z_k = \frac{n}{i}\left(\frac{e^{2k\pi i/n} - 1}{e^{2k\pi i/n} + 1}\right) = n\, \frac{(e^{k\pi i/n} - e^{-k\pi i/n})/2i}{(e^{k\pi i/n} + e^{-k\pi i/n})/2} = n \tan\left(\frac{k\pi}{n}\right).$$
Since $\tan x$ is an odd function, the roots of $P_n(z) = 0$ are
$$0,\ \pm n \tan\frac{\pi}{n},\ \pm n \tan\frac{2\pi}{n},\ \ldots,\ \pm n \tan\left(\frac{n-1}{2} \cdot \frac{\pi}{n}\right),$$
and so (for $n$ odd) we have $P_n(z) = Cz \prod_{j=1}^{(n-1)/2} \left(1 - \frac{z^2}{n^2 \tan^2(j\pi/n)}\right)$. The coefficient of $z$ in $P_n(z)$ is easily verified to be $1$ and so $C = 1$. Hence
$$P_n(z) = z \prod_{j=1}^{(n-1)/2} \left(1 - \frac{z^2}{n^2 \tan^2(j\pi/n)}\right).$$
For fixed $j$, we have $\lim_{n\to\infty} n^2 \tan^2(j\pi/n) = j^2\pi^2$. Now we apply Tannery’s theorem for infinite products with
$$a_n(p) = \begin{cases} -\dfrac{z^2}{p^2 \tan^2(n\pi/p)}, & n \le \frac{p-1}{2}, \\ 0, & n > \frac{p-1}{2}. \end{cases}$$
Noting that $\tan x \ge x$ for $x \in [0, \pi/2)$, we see that $\left|\frac{z^2}{p^2 \tan^2(n\pi/p)}\right| \le \frac{|z|^2}{n^2\pi^2}$, $1 \le n \le (p-1)/2$. It follows by Proposition 3.9.10 that condition (2) of Tannery’s theorem for infinite products is satisfied. Condition (1) is immediate since $\lim_{p\to\infty} a_n(p) = -\frac{z^2}{n^2\pi^2}$. □
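Two steps of this argument lend themselves to a numerical sanity check, sketched here in Python with hypothetical helper names: that $n\tan(k\pi/n)$ is a root of $P_n$ for odd $n$, and that the truncated product approaches $\sin z$. The truncation level and the sample points are arbitrary choices; convergence of the product is slow, with relative error of order $|z|^2/(\pi^2 N)$ after $N$ factors.

```python
import cmath
import math


def P(n, z):
    """P_n(z) = ((1 + iz/n)^n - (1 - iz/n)^n) / (2i)."""
    return ((1 + 1j * z / n) ** n - (1 - 1j * z / n) ** n) / 2j


def sine_product(z, n_factors):
    """Truncated product z * prod_{n=1}^{N} (1 - z^2 / (n^2 pi^2))."""
    result = complex(z)
    for n in range(1, n_factors + 1):
        result *= 1 - (z * z) / (n * n * math.pi ** 2)
    return result


# Roots of P_7: z_k = 7 * tan(k*pi/7), k = 1, 2, 3 (and their negatives, and 0).
for k in (1, 2, 3):
    print(k, abs(P(7, 7 * math.tan(k * math.pi / 7))))

print(sine_product(1.0, 100_000), cmath.sin(1.0))
print(sine_product(1 + 2j, 100_000), cmath.sin(1 + 2j))
```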
Remark 3.9.18 The most famous infinite product formula is that found by Euler for
the Riemann zeta-function (see Exercises 3.9.19(6c)). For many other examples of
infinite products we refer the reader to
www-elsa.physik.uni-bonn.de/~dieckman/InfProd/InfProd.html
for the encyclopedic list compiled by Andreas Dieckmann.
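EXERCISES 3.9.19
(1) Show that
(a) $\prod_{n=2}^\infty \left(1 - \frac{2}{n(n+1)}\right) = \frac13$.
(b) $\prod_{n=2}^\infty \left(1 + \frac{(-1)^n}{n}\right) = 1$.
(c) $\prod_{n=2}^\infty \left(1 - \frac1n\right) = 0$ (in particular, the product diverges).
(d) $\prod_{n=3}^\infty \left(1 + \frac{4}{n^2 - 4}\right) = 6$.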
QnD3
2nC1
1
(e) 1
nD2 1 C n2 1 D 3 .
(2) Prove Theorem 3.9.12.
(3) Evaluate $\prod_{n=1}^N (1 + x^{2^{n-1}})$ and hence show that $\prod_{n=1}^\infty (1 + x^{2^{n-1}})$ converges to $(1 - x)^{-1}$, for $x \in (-1, 1)$.
(4) Suppose that $(a_n) \subset (0, 1)$. Show that for $n > N \ge 1$ we have the Weierstrass inequalities
(a)
$$\prod_{j=N}^n (1 + a_j) \ge 1 + \sum_{j=N}^n a_j, \qquad \prod_{j=N}^n (1 - a_j) \ge 1 - \sum_{j=N}^n a_j.$$
(b)
$$\prod_{j=N}^n (1 + a_j) \le \left(1 - \sum_{j=N}^n a_j\right)^{-1}, \qquad \prod_{j=N}^n (1 - a_j) \le \left(1 + \sum_{j=N}^n a_j\right)^{-1},$$
provided $\sum_{j=N}^n a_j < 1$.
Deduce that provided $\sum_{j=N}^n a_j < 1$ we have for all $n > N$ the estimates
$$\left(1 - \sum_{j=N}^n a_j\right)^{-1} \ge \prod_{j=N}^n (1 + a_j) \ge 1 + \sum_{j=N}^n a_j,$$
$$\left(1 + \sum_{j=N}^n a_j\right)^{-1} \ge \prod_{j=N}^n (1 - a_j) \ge 1 - \sum_{j=N}^n a_j.$$
As a corollary, show that $\sum_{n=1}^\infty a_n$ converges iff either the infinite product $\prod_{n=1}^\infty (1 + a_n)$ converges or the infinite product $\prod_{n=1}^\infty (1 - a_n)$ converges.
(5) Show that
Q
.1/n1
p
(a) 1
nD1 1 C log n n converges.
Q1
n1
p
(b) nD1 1 C .1/
diverges to zero.
n
(6) Let $(p_n)$ denote the sequence of prime numbers $> 1$ written in ascending order: $2 = p_2 < p_3 < \cdots$. Show that
(a) $\sum_{n=2}^\infty \frac{1}{p_n^x}$ converges for $x > 1$.
(b) $\prod_{n=2}^\infty \left(1 - \frac{1}{p_n^x}\right)$ converges for $x > 1$ and, in particular, is non-zero.
(c) $\prod_{n=2}^\infty \left(1 - \frac{1}{p_n^x}\right)^{-1} = \sum_{n=1}^\infty \frac{1}{n^x}$, $x > 1$ (Euler product for the zeta-function).
(d) $\sum_{n=2}^\infty \frac{1}{p_n}$ is divergent.
(Hints for parts (c,d): Every $n \in \mathbb{N}$ can be written uniquely as a product of primes. Given $N \ge 2$, let $P(N) \subset \mathbb{N}$ be the subset of all positive integers whose prime factors are $p_2, \ldots, p_N$. We regard $1 \in P(N)$. Verify that for $x > 0$, $\prod_{n=2}^N \left(1 - \frac{1}{p_n^x}\right)^{-1} = \sum_{k \in P(N)} \frac{1}{k^x}$. Use Lemma 3.9.6 for (d).)
(7) Show that if there are only finitely many primes $p_1, \ldots, p_N$, then we would have $\sum_{n=1}^\infty 1/n = \prod_{n=1}^N \left(1 - \frac{1}{p_n}\right)^{-1} < \infty$, and so deduce that there are infinitely many primes.
(8) Prove that if $\prod (1 + a_n)$ is absolutely convergent, then the value of the product is independent of the order of the factors.
(9) State and prove an analogue of Riemann’s rearrangement theorem for infinite products that are not absolutely convergent.
(10) Taking $z = i\pi$ in the product formula for $\sin z$ verify that
$$\frac{e^{\pi} - e^{-\pi}}{2\pi} = \prod_{n=1}^\infty \left(1 + \frac{1}{n^2}\right).$$
(Assume $\sin z$ is defined as in (3.6).) Hence, using Examples 3.9.3(2), find $\prod_{n=2}^\infty \left(1 - \frac{1}{n^4}\right)$.
(11) Show that $\frac{e^{\pi/2} - e^{-\pi/2}}{\pi} = \prod_{n=1}^\infty \left(1 + \frac{1}{4n^2}\right)$.
(12) Prove the infinite product formula for $\cos z$
$$\cos z = \prod_{n=1}^\infty \left(1 - \frac{4z^2}{(2n-1)^2\pi^2}\right).$$
(Hint: Use the product formula for $\sin z$ together with the trigonometric identity $\sin 2z = 2 \sin z \cos z$.)
3.10 Appendix: Trigonometric Identities
In this appendix we prove some very useful trigonometric identities.
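Theorem 3.10.1 Let $\alpha, \beta \in \mathbb{R}$ and suppose that $\beta$ is not an integer multiple of $2\pi$. For $n \ge 0$ we have
$$\sum_{k=0}^n \cos(\alpha + k\beta) = \frac{\sin\left(\frac{(n+1)\beta}{2}\right) \cos\left(\alpha + \frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)},$$
$$\sum_{k=0}^n \sin(\alpha + k\beta) = \frac{\sin\left(\frac{(n+1)\beta}{2}\right) \sin\left(\alpha + \frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)},$$
$$\sum_{k=1}^n \cos(k\beta) = \frac{\cos\left(\frac{(n+1)\beta}{2}\right) \sin\left(\frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)},$$
$$\sum_{k=1}^n \sin(k\beta) = \frac{\sin\left(\frac{(n+1)\beta}{2}\right) \sin\left(\frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)}.$$
Proof By DeMoivre’s theorem we have
$$\cos(\alpha + k\beta) + i \sin(\alpha + k\beta) = e^{i\alpha + ik\beta} = e^{i\alpha} e^{ik\beta}.$$
Therefore
$$\sum_{k=0}^n \left[\cos(\alpha + k\beta) + i \sin(\alpha + k\beta)\right] = e^{i\alpha} \sum_{k=0}^n e^{ik\beta}.$$
Provided that $\beta$ is not an integer multiple of $2\pi$, we have
$$\sum_{k=0}^n e^{ik\beta} = \frac{1 - e^{i(n+1)\beta}}{1 - e^{i\beta}}.$$
(This is most easily verified by multiplying both sides by $1 - e^{i\beta}$. Alternatively, divide.) Taking real and imaginary parts gives us
$$\sum_{k=0}^n \cos(\alpha + k\beta) = \mathrm{Re}\left(e^{i\alpha}\, \frac{1 - e^{i(n+1)\beta}}{1 - e^{i\beta}}\right), \qquad \sum_{k=0}^n \sin(\alpha + k\beta) = \mathrm{Im}\left(e^{i\alpha}\, \frac{1 - e^{i(n+1)\beta}}{1 - e^{i\beta}}\right).$$
We have
$$e^{i\alpha}\, \frac{1 - e^{i(n+1)\beta}}{1 - e^{i\beta}} = e^{i\alpha}\, \frac{(1 - e^{i(n+1)\beta})(1 - e^{-i\beta})}{2 - e^{i\beta} - e^{-i\beta}} = e^{i\alpha}\, \frac{(1 - e^{i(n+1)\beta})(1 - e^{-i\beta})}{2 - 2\cos\beta} = \frac{A + iB}{4 \sin^2\left(\frac{\beta}{2}\right)},$$
where
$$A = \cos\alpha + \cos(n\beta + \alpha) - \cos((n+1)\beta + \alpha) - \cos(\alpha - \beta),$$
$$B = \sin\alpha + \sin(n\beta + \alpha) - \sin((n+1)\beta + \alpha) - \sin(\alpha - \beta).$$
Using the trigonometric identities $\cos a + \cos b = 2 \cos\left(\frac{a+b}{2}\right) \cos\left(\frac{a-b}{2}\right)$ and $\cos a - \cos b = 2 \sin\left(\frac{a+b}{2}\right) \sin\left(\frac{b-a}{2}\right)$, it is straightforward to show that
$$A = 4 \cos\left(\alpha + \frac{n\beta}{2}\right) \sin\left(\frac{(n+1)\beta}{2}\right) \sin(\beta/2).$$
Hence $\sum_{k=0}^n \cos(\alpha + k\beta) = A/4\sin^2\left(\frac{\beta}{2}\right) = \frac{\sin\left(\frac{(n+1)\beta}{2}\right) \cos\left(\alpha + \frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)}$. A similar analysis using the identities $\sin a \pm \sin b = 2 \sin\left(\frac{a \pm b}{2}\right) \cos\left(\frac{a \mp b}{2}\right)$ gives the result for the sum of sines. Alternatively, replace $\alpha$ by $\alpha - \pi/2$ in the cosine sum formula.
Finally, we need to show $\sum_{k=1}^n \cos(k\beta) = \frac{\cos\left(\frac{(n+1)\beta}{2}\right) \sin\left(\frac{n\beta}{2}\right)}{\sin\left(\frac{\beta}{2}\right)}$. This follows from the expression for the sum from $k = 0$ to $n$ (with $\alpha = 0$) if we subtract the initial term $1$ ($\cos 0$) and then use the formula $\sin\left(\frac{(n+1)\beta}{2}\right) \cos\left(\frac{n\beta}{2}\right) - \cos\left(\frac{(n+1)\beta}{2}\right) \sin\left(\frac{n\beta}{2}\right) = \sin\left(\frac{\beta}{2}\right)$. □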
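The four identities of Theorem 3.10.1 can be spot-checked numerically. The Python sketch below (the function name `max_identity_error` and the sample parameters are ours) returns the largest discrepancy between each direct sum and its closed form.

```python
import math


def max_identity_error(alpha, beta, n):
    """Largest |direct sum - closed form| over the four identities of Theorem 3.10.1."""
    s = math.sin(beta / 2)  # nonzero when beta is not a multiple of 2*pi
    half = (n + 1) * beta / 2
    direct = [
        sum(math.cos(alpha + k * beta) for k in range(0, n + 1)),
        sum(math.sin(alpha + k * beta) for k in range(0, n + 1)),
        sum(math.cos(k * beta) for k in range(1, n + 1)),
        sum(math.sin(k * beta) for k in range(1, n + 1)),
    ]
    closed = [
        math.sin(half) * math.cos(alpha + n * beta / 2) / s,
        math.sin(half) * math.sin(alpha + n * beta / 2) / s,
        math.cos(half) * math.sin(n * beta / 2) / s,
        math.sin(half) * math.sin(n * beta / 2) / s,
    ]
    return max(abs(d - c) for d, c in zip(direct, closed))


print(max_identity_error(0.3, 1.1, 25))
```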
EXERCISES 3.10.2
(1) Show that provided $x$ is not an odd multiple of $\pi$ we have
$$\sum_{k=1}^n (-1)^{k+1} \cos(kx) = \frac{\cos\left(\frac{x}{2}\right) + (-1)^{n+1} \cos\left((n + \frac12)x\right)}{2 \cos\left(\frac{x}{2}\right)}.$$
(2) Find formulas for $\sum_{k=0}^n (-1)^{k+1} \cos(kx)$ and $\sum_{k=1}^n (-1)^{k+1} \sin(kx)$. (Hint: To get the alternating sum, replace $x$ by $x + \pi$ in the original formulas.)
(3) Find formulas for $\sum_{k=0}^n (-1)^{k+1} \cos(\alpha + kx)$ and $\sum_{k=0}^n (-1)^{k+1} \sin(\alpha + kx)$, $\alpha \in \mathbb{R}$.
Chapter 4
Uniform Convergence
4.1 Introduction
In this chapter we begin our study of continuous and differentiable functions. We
focus on construction and properties. Our main strategy will be to build functions
as infinite series (or products) of elementary functions such as $x^n$ or $\sin nx$ and
$\cos nx$. For example, we develop techniques that enable us to give conditions for a
power series $\sum_{n=0}^{\infty} a_n x^n$ to converge to an infinitely differentiable function. We also
investigate continuity properties of trigonometric or Fourier series such as the sine
series $\sum_{n=1}^{\infty} b_n \sin(nx)$. We conclude the chapter with an example of a trigonometric
series that converges to a continuous function on R that is nowhere differentiable.
Overall, the aim in this chapter is to develop the tools—which are largely based
on the concept of uniform convergence. In the next chapter, we use these tools to
study several important classes of functions. Although in this and the following
chapter we work almost exclusively with real-valued functions defined on subsets,
usually subintervals of the real line, the ideas and methods we develop have general
applicability and most of the results apply to complex or vector valued functions.
We start by looking at convergence of sequences of functions. We then apply
our results to the partial sums of infinite series of functions. All of this is along
the lines developed in the previous chapter and indeed much of our work will be
making the translation from sequences/series of real numbers to sequences/series
of functions. A new and important issue will be the validity of term-by-term
integration and differentiation of infinite series. For example, when can we find the
integral of a function defined as an infinite series by integrating term-by-term? Many
foundational theorems in analysis are about precisely this problem of interchanging
the order of limiting operations.
4.2 Pointwise Convergence
We always assume that $I$ is a non-empty subset of $\mathbb{R}$. Typically, $I$ might be an
interval, possibly unbounded, which may be open, closed, or half-open. However,
all of what we say works perfectly well if $I$ is any non-empty subset of $\mathbb{R}$. Suppose
that we are given a sequence $(u_n)$ of real-valued functions on $I$. That is, for each
$n \in \mathbb{N}$, $u_n : I \to \mathbb{R}$. At this point we do not assume any additional properties of the
functions $u_n$ (such as continuity). Observe that for each $x \in I$, $(u_n(x))$ is a sequence
of real numbers. The next definition gives a natural notion of convergence of the
sequence of functions $(u_n)$ in terms of the sequences $(u_n(x))$, $x \in I$.
Definition 4.2.1 (Notation and Assumptions as Above) The sequence $(u_n)$ of
functions on $I$ is pointwise convergent (on $I$) if there exists a function $u : I \to \mathbb{R}$
such that for every $x \in I$ we have
\[
\lim_{n\to\infty} u_n(x) = u(x).
\]
We refer to $u$ as the pointwise limit of the sequence $(u_n)$.
Examples 4.2.2
(1) Take $I = [0, 1]$, let $f : I \to \mathbb{R}$ be any function and define $u_n = f/n$, $n \in \mathbb{N}$.
That is, for each $x \in I$, $n \in \mathbb{N}$, $u_n(x) = f(x)/n$. Although $f$ may not be
bounded on $I$ (we are not assuming $f$ is continuous), it is true that for every
(fixed) $x \in I$, $f(x) \in \mathbb{R}$, and so $\lim_{n\to\infty} u_n(x) = \lim_{n\to\infty} f(x)/n = 0$. Hence
$(u_n)$ is pointwise convergent on $I$ with pointwise limit the zero function. In this
case, the pointwise limit is continuous even though the terms in the sequence
might be discontinuous at every point of $I$.
(2) Take $I = [0, 1]$ and let $u_n(x) = x^n$, $x \in I$, $n \in \mathbb{N}$. If $0 \le x < 1$, we have
$\lim_{n\to\infty} u_n(x) = \lim_{n\to\infty} x^n = 0$. On the other hand, $\lim_{n\to\infty} u_n(1) = 1$.
The pointwise limit $u$ is continuous on $[0, 1)$ but has a discontinuity at $x = 1$:
without further conditions, the pointwise limit of continuous functions need not
be continuous. A feature of this example is that as $x$ gets close to 1, convergence
to $u(x)$ is slow. More specifically, given $x \in [0, 1)$, $1 > \varepsilon > 0$, let $N(x) \in \mathbb{N}$
be the smallest integer $n$ such that $u_n(x) = x^n < \varepsilon$. Clearly $N(0) = 1$ and
if $0 < x < 1$, $N(x)$ is the smallest integer bigger than $\frac{\log \varepsilon}{\log x}$. Consequently,
$\lim_{x\to 1^-} N(x) = +\infty$ and convergence is slow when $x$ is close to 1.
(3) Even if the pointwise limit of a sequence of continuous functions is continuous,
the convergence can have unpleasant features. For example, take $I = [0, 1]$,
$p \in \mathbb{R}$ and define
\[
u_n(x) = n^p x^n (1 - x), \quad x \in [0, 1],\ n \in \mathbb{N}.
\]
Since $\lim_{n\to\infty} n^p x^n (1 - x) = 0$, if $x \in [0, 1]$, we see that $(u_n)$ is pointwise
convergent on $I$ with pointwise limit the zero function (note that $u_n(1) = 0$, all $n$).
A straightforward application of the differential calculus shows
that the maximum value of $u_n$ on $I$ is $n^{p-1}(n/(n+1))^{n+1}$ and is attained
when $x = n/(n+1)$. We see that if $p < 1$, then $\lim_{n\to\infty} \sup_{x\in I} u_n(x) =
\lim_{n\to\infty} n^{p-1}(n/(n+1))^{n+1} = 0$. If $p = 1$, then $\lim_{n\to\infty} \sup_{x\in I} u_n(x) = e^{-1}$
(where we have used $\lim_{n\to\infty} (n/(n+1))^n = \lim_{n\to\infty} 1/(1 + 1/n)^n = e^{-1}$). If $p > 1$,
then $\lim_{n\to\infty} \sup_{x\in I} u_n(x) = +\infty$. If $p \ge 1$, then even though $(u_n)$ converges
pointwise to the zero function, the graph of $u_n$ does not approach that of the
zero function. It is also natural to consider the area under the graph of $u_n$. We
have
\[
\int_0^1 u_n(x)\,dx = \int_0^1 n^p x^n (1 - x)\,dx = \frac{n^p}{(n+1)(n+2)}, \quad n \ge 1.
\]
Clearly $\lim_{n\to\infty} \int_0^1 u_n(x)\,dx = 0 = \int_0^1 \lim_{n\to\infty} u_n(x)\,dx$ iff $p < 2$. If $p = 2$,
$\lim_{n\to\infty} \int_0^1 u_n(x)\,dx = 1$ and if $p > 2$, then $\lim_{n\to\infty} \int_0^1 u_n(x)\,dx = +\infty$. This shows
that without further conditions on the convergence of functions we cannot
interchange the order of limit and integration.
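The dependence on $p$ in Examples 4.2.2(3) can be explored numerically. The sketch below simply evaluates the two closed-form expressions from the example, $n^{p-1}(n/(n+1))^{n+1}$ for the maximum and $n^p/((n+1)(n+2))$ for the integral, and adds a brute-force grid check; the function names and the grid size are assumptions made for illustration.

```python
def sup_u(n, p):
    """Maximum of u_n(x) = n**p * x**n * (1 - x) on [0, 1], attained at x = n/(n+1)."""
    return n ** (p - 1) * (n / (n + 1)) ** (n + 1)

def integral_u(n, p):
    """Integral of u_n over [0, 1], using the closed form from the example."""
    return n ** p / ((n + 1) * (n + 2))

def grid_max_u(n, p, points=10001):
    """Brute-force maximum of u_n over an equally spaced grid in [0, 1]."""
    xs = (i / (points - 1) for i in range(points))
    return max(n ** p * x ** n * (1 - x) for x in xs)

# p = 1: the maxima approach 1/e, so the graphs do not flatten out.
peak_p1 = sup_u(10 ** 6, 1)
# p = 2: the integrals approach 1 although the pointwise limit is zero.
area_p2 = integral_u(10 ** 6, 2)
```

The grid maximum can never exceed the exact maximum, which gives a quick consistency check on the calculus computation.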
4.3 Uniform Convergence of Sequences
The examples in the previous section show that pointwise convergence of functions
does not handle continuity well and can lead to some nasty pathology (as shown in
Examples 4.2.2(3)). We seek a definition of convergence of functions that behaves
well with respect to continuity and basic operations of analysis such as integration
and differentiation.
Suppose that $f, g : I \subset \mathbb{R} \to \mathbb{R}$. What does it mean for $f$ and $g$ to be 'close'?
One natural approach is to require that $|f(x) - g(x)|$ is small for all $x \in I$. That is,
we are asking that the graphs of $f$ and $g$ are close as subsets of $\mathbb{R}^2$. More formally,
given $\varepsilon > 0$, let $T(f, \varepsilon) \subset \mathbb{R}^2$ denote the tube of width $2\varepsilon$ centred on the graph of $f$.
That is,
\[
T(f, \varepsilon) = \{(x, y) \mid x \in I,\ |f(x) - y| < \varepsilon\}.
\]
See Fig. 4.1. In order that $f$ and $g$ are $\varepsilon$-close, we require that $\mathrm{graph}(g) \subset T(f, \varepsilon)$.
Obviously, $\mathrm{graph}(g) \subset T(f, \varepsilon)$ iff $|f(x) - g(x)| < \varepsilon$ for all $x \in I$. Hence
$\mathrm{graph}(g) \subset T(f, \varepsilon)$ iff $\mathrm{graph}(f) \subset T(g, \varepsilon)$ and so the condition is symmetric in $f$ and $g$.
In the remainder of this section we formalize this idea of closeness or uniform
approximation. We do this by first restricting to the class of bounded functions
(defined on any subset of $\mathbb{R}$) and then giving a precise definition of what we mean by
the distance between two functions $f, g$ such that $f - g$ is bounded. This will enable
us to give a good definition of convergence for sequences of continuous functions.
We develop these ideas further in the next chapter where we show how we can
approximate a continuous function, which may be nowhere differentiable, by more
regular functions, such as polynomials.
Fig. 4.1 Graphs of $f$ and $g$ that are $\varepsilon$-close to each other
4.3.1 Spaces of Bounded Functions
We continue to assume that $I$ is a non-empty subset of $\mathbb{R}$. A function $f : I \to \mathbb{R}$ is
bounded if there exists an $M \ge 0$ such that
\[
|f(x)| \le M, \quad \text{for all } x \in I.
\]
We do not assume yet that $f$ is continuous. If $f : I \to \mathbb{R}$ is bounded, we define
\[
\|f\| = \sup\{|f(x)| \mid x \in I\} < \infty.
\]
Remark 4.3.1 The number $\|f\|$ is often called the uniform norm of $f$ (also the $C^0$-norm,
$\infty$-norm or supremum norm). It is commonly denoted by $\|f\|_\infty$.
Definition 4.3.2 Let $B(I)$ denote the set of all bounded functions $f : I \to \mathbb{R}$.
Example 4.3.3 Constant functions are bounded and so $B(I)$ contains all the constant
functions, including the zero function.
Lemma 4.3.4 (Notation as Above) Let $f, g \in B(I)$.
(1) For all $c \in \mathbb{R}$, we have $f + cg \in B(I)$.
(2) $\|f + g\| \le \|f\| + \|g\|$, and $\|cf\| = |c|\,\|f\|$, all $c \in \mathbb{R}$.
(3) $\|f\| = 0$ iff $f \equiv 0$.
In particular, $B(I)$ is a vector space: for all $f, g \in B(I)$, $c, d \in \mathbb{R}$, we have $cf + dg \in B(I)$.
Proof We start by showing that if $f \in B(I)$, $c \in \mathbb{R}$, then $cf \in B(I)$ and $\|cf\| = |c|\,\|f\|$.
Since $|f(x)| \le \|f\|$, we have $|cf(x)| \le |c|\,\|f\|$, all $x \in I$, and so $|c|\,\|f\|$ is
an upper bound for $|cf|$. Hence $cf \in B(I)$. We claim that $\|cf\| = |c|\,\|f\|$. This is clear
if $c = 0$, so assume $c \ne 0$. If the claim fails, there
exists an $M < |c|\,\|f\|$ such that $M$ is an upper bound for $|cf|$. But then $M/|c| < \|f\|$
would be an upper bound for $|f|$, contradicting the definition of $\|f\|$.
Next we prove that if $f, g \in B(I)$ then $f + g \in B(I)$ and $\|f + g\| \le \|f\| + \|g\|$
(this will complete the proof of (1,2)). For all $x \in I$, we have
\[
|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\| + \|g\|.
\]
Therefore $\|f\| + \|g\|$ is an upper bound for $\{|f(x) + g(x)| \mid x \in I\}$ and so $f + g \in B(I)$
and $\|f + g\| \le \|f\| + \|g\|$.
Finally, suppose $\|f\| = 0$. Then $\sup\{|f(x)| \mid x \in I\} = 0$. Hence $f(x) = 0$, for all
$x \in I$, and so $f \equiv 0$. The converse is trivial.
Definition 4.3.5 Suppose that $f, g : I \to \mathbb{R}$ and $f - g \in B(I)$. We define the distance
between $f$ and $g$, $\rho(f, g)$, by
\[
\rho(f, g) = \|f - g\|.
\]
Lemma 4.3.6 (Notation as Above) Suppose that $f, g, h \in B(I)$. We have
(1) $\rho(f, g) \ge 0$ and $\rho(f, g) = 0$ iff $f = g$.
(2) $\rho(f, g) = \rho(g, f)$.
(3) $\rho(f, h) \le \rho(f, g) + \rho(g, h)$ (triangle inequality).
Proof The result is immediate from Lemma 4.3.4.
Remark 4.3.7 For the previous lemma to hold it suffices that $f - g$, $g - h$,
$f - h \in B(I)$.
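The distance between two functions can be approximated on a computer by sampling. The sketch below is an illustration under stated assumptions: it replaces $I = [0, 1]$ by a finite grid, so the computed value is only a lower bound for the true supremum, and the helper names `grid` and `sup_dist` and the three sample functions are hypothetical choices, not taken from the text.

```python
import math

def grid(a, b, points=1001):
    """Equally spaced sample points in [a, b]: a finite stand-in for the set I."""
    return [a + (b - a) * i / (points - 1) for i in range(points)]

def sup_dist(f, g, xs):
    """Maximum of |f(x) - g(x)| over the samples xs (a lower bound for the distance)."""
    return max(abs(f(x) - g(x)) for x in xs)

xs = grid(0.0, 1.0)
f = math.sin
g = math.cos
h = lambda x: x * x

# The three properties of Lemma 4.3.6, checked on the sampled distances.
d_fg, d_gh, d_fh = sup_dist(f, g, xs), sup_dist(g, h, xs), sup_dist(f, h, xs)
triangle_ok = d_fh <= d_fg + d_gh + 1e-12
```

The triangle inequality holds for the sampled maxima for the same reason it holds for the supremum: it holds pointwise before the maximum is taken.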
4.3.2 Spaces of Continuous Functions
Let $C^0(I)$ denote the space of continuous real-valued functions on $I \subset \mathbb{R}$. In general,
$C^0(I) \not\subset B(I)$ (take $I = (0, 1)$ and $f(x) = x^{-1}$). However, there is a large class of
subsets $I$ of $\mathbb{R}$ for which $C^0(I) \subset B(I)$. We concentrate on the best known case.
Theorem 4.3.8 If $I$ is a closed and bounded interval, then $C^0(I) \subset B(I)$.
Proof This is a restatement of Theorem 2.4.10(1): continuous functions on a closed
and bounded interval are bounded.
4.3.3 Convergence of Functions
Definition 4.3.9 Let $I \subset \mathbb{R}$. If $(u_n)$ is a sequence of functions on $I$, then $(u_n)$
converges uniformly to $u : I \to \mathbb{R}$ if
\[
\lim_{n\to\infty} \rho(u, u_n) = 0.
\]
Remarks 4.3.10
(1) If $(u_n)$ converges uniformly to $u : I \to \mathbb{R}$, then we must have $u - u_n \in B(I)$,
at least for large enough $n$. In particular, if $(u_n) \subset B(I)$ then $u \in B(I)$ since
$\rho(u, u_n) < \infty$ implies that $u - u_n \in B(I)$ and so $u = (u - u_n) + u_n \in B(I)$
(Lemma 4.3.4).
(2) The use of the term 'uniform' in the definition should be clear. The sequence
$(u_n)$ converges uniformly to $u$ if for every $\varepsilon > 0$, we can find an $N \in \mathbb{N}$ such
that if $n \ge N$ then $|u_n(x) - u(x)| < \varepsilon$ for all $x \in I$. This is a much stronger
condition than pointwise convergence, where $N$ may depend strongly on $x$
(see Examples 4.2.2).
Proposition 4.3.11 (Notation as Above) If $(u_n)$ converges uniformly to $u$, then $(u_n)$
converges pointwise to $u$.
Proof We must prove that for each $x \in I$, $\lim_{n\to\infty} u_n(x) = u(x)$. Let $\varepsilon > 0$. Since
$\lim_{n\to\infty} \rho(u_n, u) = 0$, there exists an $N \in \mathbb{N}$ such that $\rho(u_n, u) < \varepsilon$, for all $n \ge N$.
That is,
\[
\rho(u_n, u) = \sup\{|u_n(y) - u(y)| \mid y \in I\} < \varepsilon, \quad n \ge N.
\]
Since $|u_n(x) - u(x)| \le \rho(u_n, u)$, we have $|u_n(x) - u(x)| < \varepsilon$ for all $n \ge N$ and so
$\lim_{n\to\infty} u_n(x) = u(x)$.
The next result shows that uniform convergence behaves well with respect to
both continuity and boundedness and so avoids the problems we have seen with
pointwise convergence.
Theorem 4.3.12 Let $I \subset \mathbb{R}$ and $(u_n)$ be a sequence of continuous (respectively,
bounded) functions on $I$ which converges uniformly to $u$. Then $u$ is continuous
(respectively, bounded).
Proof Suppose that $(u_n) \subset C^0(I)$ converges uniformly to $u$. We are required to
prove that if $x_0 \in I$ and $\varepsilon > 0$, then there exists a $\delta > 0$ such that $|u(x_0) - u(x)| < \varepsilon$,
for all $x \in I$ such that $|x_0 - x| < \delta$. The idea of the proof is to approximate
$u$ sufficiently closely by a continuous function $u_N$ (how large we need to take $N$
depends on $\varepsilon$) and then use the continuity of $u_N$ to deduce the estimate we require
on $u$. In more detail, choose $N \in \mathbb{N}$ such that $\rho(u_N, u) < \varepsilon/3$. By definition of
$\rho(u_N, u)$, we have
\[
|u_N(y) - u(y)| < \varepsilon/3, \quad \text{for all } y \in I.
\]
Since $u_N$ is continuous on $I$, there exists a $\delta > 0$ such that
\[
|u_N(x_0) - u_N(x)| < \varepsilon/3, \quad \text{for all } x \in I \text{ such that } |x_0 - x| < \delta.
\]
Now we use the triangle inequality. Suppose $x \in I$, then
\[
\begin{aligned}
|u(x_0) - u(x)| &= |u(x_0) - u_N(x_0) + u_N(x_0) - u_N(x) + u_N(x) - u(x)| \\
&\le |u(x_0) - u_N(x_0)| + |u_N(x_0) - u_N(x)| + |u_N(x) - u(x)| \\
&< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon,
\end{aligned}
\]
where the last inequality holds provided $|x_0 - x| < \delta$.
The final statement follows from Remarks 4.3.10(1).
Corollary 4.3.13 Let $(u_n)$ be a sequence of continuous functions on the closed and
bounded interval $I = [a, b]$. Suppose that $(u_n)$ converges uniformly to $u$. Then $u$ is
continuous and bounded.
Proof An immediate corollary of Theorem 4.3.12 since every continuous function
on $[a, b]$ is bounded.
Examples 4.3.14
(1) Take $I = [0, 1]$ and let $u_n(x) = x^n$, $x \in I$ (as in Examples 4.2.2(2)). Recall that
the pointwise limit $u$ of $(u_n)$ is the function which is equal to zero on $[0, 1)$ and
1 at $x = 1$. We claim that $\rho(u, u_n) = 1$ for all $n \in \mathbb{N}$ and so the convergence is
not uniform. It suffices to show that for every $\varepsilon > 0$, there exists an $x \in [0, 1)$
such that $|u_n(x) - u(x)| = |u_n(x)| = x^n > 1 - \varepsilon$. This is immediate from the
continuity of $u_n$ at $x = 1$. (Of course, since $u$ is not continuous, we can deduce
that $(u_n)$ does not converge uniformly to $u$ using Corollary 4.3.13.)
(2) Take $I = [0, 1]$, $p \in \mathbb{R}$ and let $u_n(x) = n^p x^n (1 - x)$, $x \in I$ (as in
Examples 4.2.2(3)). Recall that the pointwise limit $u$ of $(u_n)$ is identically zero.
We have (see Examples 4.2.2(3)), $\rho(u, u_n) = n^{p-1}(n/(n+1))^{n+1}$, $n \in \mathbb{N}$. Hence
$\rho(u, u_n) \to 0$ iff $p < 1$. If $p = 1$, $\rho(u, u_n) \to e^{-1}$, and if $p > 1$, $\rho(u, u_n) \to +\infty$.
Hence we only have uniform convergence when $p < 1$.
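For the sequence $u_n(x) = x^n$ the failure of uniform convergence can be made concrete. The sketch below is an illustration only (the function names are assumptions): `witness` produces a point of $[0, 1)$ at which $u_n$ is within $\varepsilon$ of 1, and `steps_needed` computes the smallest $n$ with $x^n < \varepsilon$, the quantity called $N(x)$ in Examples 4.2.2(2).

```python
def witness(n, eps):
    """A point x in [0, 1) with x**n = 1 - eps, so |u_n(x) - u(x)| = 1 - eps."""
    return (1.0 - eps) ** (1.0 / n)

def steps_needed(x, eps):
    """Smallest n >= 1 with x**n < eps, assuming 0 <= x < 1 and 0 < eps < 1."""
    n, power = 1, x
    while power >= eps:
        power *= x
        n += 1
    return n

# However large n is, some x < 1 still has x**n close to 1.
x50 = witness(50, 0.01)
# The number of steps needed for a fixed tolerance grows as x approaches 1.
counts = [steps_needed(x, 0.1) for x in (0.5, 0.9, 0.99)]
```

No single $N$ works for every $x$ in $[0, 1)$, which is exactly what uniform convergence would require.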
4.3.4 General Principle of Convergence
Just as we did for sequences of real numbers we may define Cauchy sequences of
functions. The Cauchy sequence definition has the merit of not requiring knowledge
of the actual limit.
Definition 4.3.15 If $(u_n)$ is a sequence of functions on the non-empty subset $I$ of
$\mathbb{R}$, then $(u_n)$ is a Cauchy sequence if $\rho(u_m, u_n) \to 0$ as $m, n \to \infty$. That is, if for
every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[
\rho(u_m, u_n) < \varepsilon, \quad \text{for all } m, n \ge N.
\]
Theorem 4.3.16 (General Principle of Uniform Convergence) Let $I \subset \mathbb{R}$ and
$(u_n)$ be a sequence of functions on $I$. Then $(u_n)$ is uniformly convergent on $I$ iff $(u_n)$
is a Cauchy sequence. If either condition holds and the limit function is $u$, then $u$ will
be bounded (respectively, continuous) if $(u_n) \subset B(I)$ (respectively $(u_n) \subset C^0(I)$).
Proof Suppose that $(u_n)$ is a Cauchy sequence. We start by verifying that $(u_n)$ is
pointwise convergent. Let $\varepsilon > 0$ and choose $N \in \mathbb{N}$ so that $\rho(u_m, u_n) < \varepsilon$ for all
$m, n \ge N$. If $x \in I$, we have $|u_m(x) - u_n(x)| \le \rho(u_m, u_n) < \varepsilon$, for all $m, n \ge N$.
Hence $(u_n(x))$ is a Cauchy sequence and by the general principle of convergence for
sequences of real numbers, there exists a $u(x) \in \mathbb{R}$ such that $\lim_{n\to\infty} u_n(x) = u(x)$.
This construction defines a function $u : I \to \mathbb{R}$. Observe that $u$ is the pointwise limit
of the sequence $(u_n)$. The estimate
\[
|u_m(x) - u_n(x)| < \varepsilon, \quad m, n \ge N, \tag{4.1}
\]
holds for all $x \in I$. That is, the integer $N$ does not depend on the choice of $x \in I$.
Letting $m \to \infty$ in (4.1) gives
\[
|u(x) - u_n(x)| \le \varepsilon, \quad n \ge N, \text{ for all } x \in I,
\]
and so $\rho(u, u_n) \le \varepsilon$, for all $n \ge N$. Hence $(u_n)$ converges uniformly to $u$. We leave
the proof that a uniformly convergent sequence is Cauchy to the exercises. The final
statements follow from Remarks 4.3.10(1) and Theorem 4.3.12.
Our main applications of the general principle of uniform convergence will be to
infinite series and are described in the next section.
EXERCISES 4.3.17
(1) Complete the proof of Theorem 4.3.16 by showing that a uniformly convergent
sequence of functions is a Cauchy sequence.
(2) Show that if $(f_n)$ converges uniformly to $f$ on $I \subset \mathbb{R}$, and $(f_n)$ converges
uniformly to $g$ on $J \subset \mathbb{R}$, then (a) $f|I \cap J = g|I \cap J$ ("$f|I \cap J$" means $f$
restricted to $I \cap J$), (b) $(f_n)$ converges uniformly to a function $F : I \cup J \to \mathbb{R}$
where $F|I = f$, $F|J = g$.
(3) Show that if $(f_n)$, $(g_n)$ respectively converge uniformly to $f$, $g$ on $I$, then
$(f_n \pm g_n)$ converges uniformly to $f \pm g$ on $I$. Show that if $(f_n), (g_n) \subset B(I)$, then
uniform convergence of $(f_n)$, $(g_n)$ implies $(f_n g_n)$ converges uniformly to $fg$.
Show by means of examples that this result may fail if either $(f_n)$ or $(g_n)$
consists of unbounded functions.
(4) Find the pointwise limit of the following sequences of functions on the
specified domain. In each case describe the continuity properties of the limit
function.
(a) $f_n(x) = \tan^{-1}(nx)$, $x \ge 0$.
(b) $f_n(x) = \frac{nx}{1 + n^2 x^2}$, $x \in \mathbb{R}$.
Is the convergence for either of these sequences uniform? Why/Why not?
(5) Suppose $u_n(x) = x^n(1 - x^n)$, $x \in [0, 1]$. Is $(u_n)$ pointwise convergent on $[0, 1]$?
uniformly convergent on $[0, 1]$?
(6) Let $p, q \in \mathbb{Z}^+$. Let $v_n(x) = x^{pn}(1 - x^{qn})$, $x \in [0, 1]$. Show that $(v_n)$ is pointwise
convergent on $[0, 1]$. Can we choose $p, q$ so that $(v_n)$ is uniformly convergent
on $[0, 1]$?
(7) Let $u_n(x) = x^n(1 - x^{n^2})$, $x \in [0, 1]$. Investigate the pointwise and uniform
convergence of the sequence $(u_n)$ on $[0, 1]$.
(8) Let $u_n(x) = x^{n^2}(1 - x^n)$. Show that the sequence $(u_n)$ converges uniformly on
$[0, 1]$. What is the limit?
(9) Suppose $v_n(x) = nxe^{-nx}$, $x \in [0, 1]$. Is $(v_n)$ pointwise convergent on $[0, 1]$?
uniformly convergent on $[0, 1]$? Would the answer change if we took
$v_n(x) = \sqrt{n}\,xe^{-nx}$? $v_n(x) = nxe^{-n^2 x}$?
(10) Let $f_n(x) = n^p x^n (1 - x)^2$. Show that $(f_n)$ is uniformly convergent on $[0, 1]$ iff
$p < 2$. (Hint: Proposition 3.5.7.) What about if $f_n(x) = n^p x^n (1 - x)^q$, $q > 2$?
(11) Suppose that $(f_n)$ is a sequence of continuous functions which is pointwise
convergent to $f$ on the open interval $(a, b)$. Suppose that the convergence
of $(f_n)$ to $f$ is uniform on every closed subinterval of $(a, b)$. Prove that $f$ is
continuous on $(a, b)$.
(12) The sequences $(f_n)$ on $[0, 1]$, $(g_n)$ on $[0, 100]$, and $(h_n)$ on $\mathbb{R}$ are defined by
(a) $f_n(x) = x^n(1 - x)$.
(b) $g_n(x) = \frac{nx^3}{1 + nx}$.
(c) $h_n(x) = \frac{nx^4}{1 + nx^2}$.
Find the pointwise limits of these sequences and prove that the convergence is
uniform.
(13) Determine whether or not the following sequences converge uniformly on the
specified domains. It is a good idea to start by finding pointwise limits.
(a) $f_n(x) = \frac{1}{n + x}$, $x \ge 0$, $n \ge 1$.
(b) $f_n(x) = \frac{x^n}{1 + x^n}$, $x \in [0, 1]$, $n \ge 1$.
(14) Let $(q_n)$ be a sequence consisting of all the rational numbers with $q_n \ne q_m$,
$n \ne m$. Let $C > 0$ and $(a_n)$ be any sequence of real numbers such that
$|a_n| \ge C$ for all $n \in \mathbb{N}$. For $n \in \mathbb{N}$ define
\[
f_n(x) = \begin{cases} a_n, & \text{if } x = q_n, \\ 0, & \text{otherwise.} \end{cases}
\]
Show that (a) $(f_n)$ is pointwise convergent, (b) $(f_n)$ is not uniformly convergent
on any closed interval $[a, b]$, $a \ne b$.
(15) Suppose that $(u_n)$ converges uniformly to $u$ on $I$. Show that if $I^\star \subset I$ is such
that $u_n$ is continuous on $I^\star$ for all $n$, then $u$ is continuous on $I^\star$ (this is a slight
extension of Theorem 4.3.12).
(16) Following Exercises 2.4.16(6), define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = 0$ if $x \notin \mathbb{Q}$ and
$f(x) = 10^{-s}$ if $x = r/s$, where $(r, s) = 1$ and $s > 0$ and we take $s = 1$
if $x = 0$. Let $(q_n)$ be a sequence consisting of all the rational numbers and
suppose $q_n \ne q_m$, $n \ne m$. For $n \in \mathbb{N}$ define
\[
f_n(x) = \begin{cases} 10^{-s}, & \text{if } x = r/s \in \{q_1, \dots, q_n\}, \\ 0, & \text{otherwise.} \end{cases}
\]
Show that $(f_n)$ converges uniformly to $f$ on $\mathbb{R}$. Deduce, using the previous
exercise, that $f$ is continuous on $\mathbb{R} \setminus \mathbb{Q}$.
(17) Let $(f_n)$ be a uniformly bounded sequence of continuous functions on $[a, b]$ such
that for each $x \in [a, b]$, $(f_n(x))$ is monotone.
(a) Show that $(f_n)$ converges pointwise to a function $f : [a, b] \to \mathbb{R}$.
(b) Show that $f$ need not be continuous (construct an example).
(c) Show that if $f$ is continuous, then the convergence of $(f_n)$ to $f$ is uniform.
(Remark and hints. Result (c) is Dini's theorem. In order to prove (c), fix $\varepsilon > 0$
and define $\Omega_n = \{x \mid |f(x) - f_n(x)| \ge \varepsilon\}$. Since $(f_n(x))$ is monotone, we have
$\Omega_1 \supset \Omega_2 \supset \cdots \supset \Omega_n \supset \cdots$. It suffices to prove that there exists an $N \in \mathbb{N}$ such that
$\Omega_N = \emptyset$. Prove by contradiction. Useful observations are (1) $\bigcap_{n \ge 1} \Omega_n = \emptyset$;
(2) if $x \notin \Omega_n$, then $\exists \delta > 0$ such that $(x - \delta, x + \delta) \cap \Omega_m = \emptyset$, all $m \ge n$.)
4.4 Uniform Convergence of Infinite Series
In this section, $I$ always denotes a subinterval of $\mathbb{R}$ (open, closed, half-open,
bounded or unbounded). However, the results we give easily extend to functions
defined on an arbitrary non-empty subset of $\mathbb{R}$.
Let $(u_n)$ be a sequence of functions defined on $I$. For $n \ge 1$, we define the
sequence $(S_n)$ of partial sums by
\[
S_n(x) = \sum_{j=1}^{n} u_j(x), \quad x \in I.
\]
Note that $(S_n)$ is a sequence of functions defined on $I$.
Definition 4.4.1 (Notation as Above)
(a) The infinite series $\sum_{n=1}^{\infty} u_n$ is pointwise convergent (on $I$) to the function
$S : I \to \mathbb{R}$ if the sequence of partial sums $(S_n)$ is pointwise convergent to $S$. (That
is, $\lim_{n\to\infty} S_n(x) = S(x)$, for all $x \in I$.)
(b) The infinite series $\sum_{n=1}^{\infty} u_n$ is uniformly convergent (on $I$) to the function
$S : I \to \mathbb{R}$ if the sequence of partial sums $(S_n)$ is uniformly convergent to $S$. (That
is, $\lim_{n\to\infty} \rho(S, S_n) = 0$.)
Example 4.4.2 We claim that the series $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ is uniformly convergent on
$[-1, 1]$. For $x \in [-1, 1]$, $m < n \in \mathbb{N}$, we have
\[
|S_n(x) - S_m(x)| = \Bigl| \sum_{j=m+1}^{n} \frac{x^j}{j^2} \Bigr| \le \sum_{j=m+1}^{n} \frac{|x|^j}{j^2} \le \sum_{j=m+1}^{n} \frac{1}{j^2},
\]
with equality if $x = 1$. Hence $\rho(S_n, S_m) = \sum_{j=m+1}^{n} \frac{1}{j^2} \to 0$, as $m, n \to \infty$. It
follows by the general principle of uniform convergence that $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ is uniformly
convergent on $[-1, 1]$.
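The Cauchy estimate in Example 4.4.2 is easy to test numerically. The following sketch (function names are illustrative assumptions) compares the gap between two partial sums of $\sum x^j/j^2$ with the bounding tail $\sum_{j=m+1}^{n} 1/j^2$, which does not depend on $x$.

```python
def partial_sum(x, n):
    """S_n(x) = sum of x**j / j**2 for j = 1..n."""
    return sum(x ** j / j ** 2 for j in range(1, n + 1))

def tail(m, n):
    """Sum of 1 / j**2 for j = m+1..n: the x-independent bound on |S_n - S_m|."""
    return sum(1.0 / j ** 2 for j in range(m + 1, n + 1))

# At x = 1 the bound is attained; elsewhere in [-1, 1] the gap is smaller.
gap_at_one = partial_sum(1.0, 20) - partial_sum(1.0, 10)
gap_inside = abs(partial_sum(-0.5, 20) - partial_sum(-0.5, 10))
bound = tail(10, 20)
```

Because the bound is the tail of a convergent series of numbers, it can be made small by choosing $m$ large, independently of $x$; this is the content of the Cauchy condition.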
As an immediate consequence of our results on the uniform convergence of
sequences, we have the first of our main results on uniform convergence of infinite
series.
Theorem 4.4.3 Let $(u_n)$ be a sequence of continuous (respectively, bounded)
functions on $I$. If the infinite series $\sum_{n=1}^{\infty} u_n$ is uniformly convergent to the function
$S : I \to \mathbb{R}$, then $S$ is continuous (respectively, bounded).
For applications, it is useful to have a slightly stronger version of the continuity
statement in Theorem 4.4.3.
Theorem 4.4.4 Let $(u_n)$ be a sequence of continuous functions on $I$. If the infinite
series $\sum_{n=1}^{\infty} u_n$ is uniformly convergent on every closed and bounded subinterval of
$I$, then $\sum_{n=1}^{\infty} u_n$ converges to a continuous function on $I$.
Proof The result is immediate from the previous theorem if $I$ is a closed and
bounded interval. So assume that $I$ is not a closed and bounded interval. The
hypotheses of the theorem imply that $\sum_{n=1}^{\infty} u_n$ converges pointwise on $I$ to a
function $S : I \to \mathbb{R}$. Indeed, given $x \in I$, apply the uniform convergence hypothesis
of the theorem to the closed interval $[x, x]$. In order to show that $S$ is continuous it
is enough to prove that $S$ is continuous on every closed and bounded subinterval
of $I$. But this follows from the hypotheses of the theorem and Theorem 4.4.3. (If
$I = [a, +\infty)$ then we prove $S$ continuous on any bounded interval $[a, b]$, $b > a$, and
that suffices for continuity at $a$. For all other points $x \in I$, choose $a < x < b$ so that
$[a, b] \subset I$. Then $S$ is continuous on $[a, b]$ and certainly continuous at $x$.)
Remarks 4.4.5
(1) We do not claim in Theorem 4.4.4 that $S$ is bounded.
(2) If $I$ is an arbitrary non-empty subset of $\mathbb{R}$, then Theorem 4.4.4 continues to
apply provided that $\sum_{n=1}^{\infty} u_n$ converges uniformly on $[a, b] \cap I$ for all
$-\infty < a \le b < +\infty$.
One last, but key, result before we give some examples.
Theorem 4.4.6 (General Principle of Uniform Convergence for Series) Let $(u_n)$
be a sequence of functions on $I$. The infinite series $\sum_{n=1}^{\infty} u_n$ is uniformly convergent
on $I$ iff the sequence of partial sums $(S_n)$ is Cauchy. More formally, if for every
$\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[
\|u_m + \cdots + u_n\| < \varepsilon, \quad \text{for all } n \ge m \ge N,
\]
then there exists a function $S : I \to \mathbb{R}$ such that $\sum_{n=1}^{\infty} u_n$ converges uniformly to $S$.
If the sequence $(u_n)$ consists of continuous functions, then $S$ is continuous.
Proof Apply Theorem 4.3.16 to the sequence of partial sums.
Examples 4.4.7
(1) Let $u_n(x) = \sin nx / n^2$, $n \ge 1$. We claim that $\sum_{n=1}^{\infty} \frac{\sin nx}{n^2}$ is uniformly convergent
on $\mathbb{R}$ and $S(x) = \sum_{n=1}^{\infty} \frac{\sin nx}{n^2}$ is continuous on $\mathbb{R}$. To see this, observe that for
all $x \in \mathbb{R}$, we have $|u_n(x)| \le 1/n^2$. We know that $\sum_{n=1}^{\infty} 1/n^2 < \infty$ and so, by
the general principle of convergence for series (of real numbers), given $\varepsilon > 0$,
there exists an $N \in \mathbb{N}$ such that $|\frac{1}{m^2} + \cdots + \frac{1}{n^2}| < \varepsilon$, for all $n \ge m \ge N$. Hence
\[
\Bigl| \sum_{j=m}^{n} \frac{\sin jx}{j^2} \Bigr| \le \sum_{j=m}^{n} \frac{1}{j^2} < \varepsilon, \quad \text{for all } n \ge m \ge N.
\]
Therefore $\rho(S_n, S_{m-1}) = \|\sum_{j=m}^{n} u_j\| < \varepsilon$, for all $n \ge m \ge N$, and the sequence
of partial sums is Cauchy. The result follows from Theorem 4.4.6. The reader
should note how this proof is a mix of the theory of series of real numbers (using
in this case the convergence of $\sum 1/n^2$) and results on uniform convergence.
The method we used to deduce uniform convergence by comparing with a
'known' series of real numbers is very powerful and due to Weierstrass (it is
a special case of the Weierstrass M-test; see below).
(2) Define $u_n(x) = x^n$, $n \ge 0$. Let $a \in (0, 1)$ and regard $u_n$ as defined on $[-a, a]$.
For all $x \in [-a, a]$, $n \ge m$, we have
\[
|x^m + \cdots + x^n| = |x|^m\,|1 + \cdots + x^{n-m}| \le a^m \sum_{j=0}^{\infty} a^j = a^m/(1 - a).
\]
Hence $\|u_m + \cdots + u_n\| \le a^m/(1 - a) \to 0$ as $n \ge m \to \infty$. By the
general principle of uniform convergence for series (Theorem 4.4.6), $\sum_{n=0}^{\infty} x^n$
is uniformly convergent to a continuous function on $[-a, a]$. This argument is
valid for all $a \in (0, 1)$ and so $\sum_{n=0}^{\infty} x^n$ converges to a continuous function $S$ on
$(-1, 1)$ (Theorem 4.4.4). In this case we know that $S(x) = 1/(1 - x)$. The reader
should note that $\sum_{n=0}^{\infty} x^n$ is not uniformly convergent to $1/(1 - x)$ on $(-1, 1)$.
This is easily seen since given any $n \ge m \ge 1$, we can make $a^m + \cdots + a^n > 1/2$
by taking $a$ sufficiently close to 1. In particular, the sequence of partial sums
cannot be Cauchy on $(-1, 1)$.
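The contrast between $[-a, a]$ and $(-1, 1)$ for the geometric series can be seen from the exact remainder. Since $1/(1 - x) - \sum_{k=0}^{n} x^k = x^{n+1}/(1 - x)$, the largest error on $[-a, a]$ is $a^{n+1}/(1 - a)$, attained at $x = a$. The sketch below evaluates this quantity; the function names are assumptions made for illustration.

```python
def geometric_partial(x, n):
    """Partial sum 1 + x + ... + x**n."""
    return sum(x ** k for k in range(n + 1))

def remainder_sup(a, n):
    """Largest value of |1/(1 - x) - geometric_partial(x, n)| on [-a, a], 0 < a < 1."""
    return a ** (n + 1) / (1.0 - a)

# For fixed a < 1 the worst-case error tends to 0 as n grows ...
small = remainder_sup(0.5, 10)
# ... but for fixed n it becomes arbitrarily large as a approaches 1.
large = remainder_sup(0.999, 10)
```

For each fixed $n$ the worst-case error is unbounded as $a \to 1$, which is another way of seeing that the convergence is not uniform on $(-1, 1)$.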
Theorem 4.4.8 (Weierstrass M-Test) Suppose that $(u_n)$ is a sequence of functions
defined on $I$ and that there exists a sequence $(M_n)$ of positive real numbers such
that
(a) $|u_n(x)| \le M_n$ for all $x \in I$, $n \in \mathbb{N}$.
(b) $\sum_{n=1}^{\infty} M_n < \infty$.
Then $\sum_{n=1}^{\infty} u_n$ is uniformly convergent on $I$. If the $(u_n)$ are all continuous, so is
$S = \sum_{n=1}^{\infty} u_n$.
Proof The proof is the same as that used in Examples 4.4.7(1). We prove that the
sequence $(S_n)$ of partial sums is Cauchy. Let $\varepsilon > 0$. Since $\sum_{n=1}^{\infty} M_n < \infty$, there
exists an $N \in \mathbb{N}$ such that $M_m + \cdots + M_n < \varepsilon$ for all $n \ge m \ge N$. It follows from
assumption (a) that for $n \ge m \ge N$ and $x \in I$ we have
\[
\Bigl| \sum_{j=m}^{n} u_j(x) \Bigr| \le \sum_{j=m}^{n} M_j < \varepsilon.
\]
Hence $\rho(S_n, S_{m-1}) = \|\sum_{j=m}^{n} u_j\| \le \sum_{j=m}^{n} M_j < \varepsilon$, for all $n \ge m \ge N$, and so $(S_n)$
is a Cauchy sequence. Now apply Theorem 4.4.6.
We give some characteristic applications of the M-test in the next set of examples
(more examples appear in the following section).
Examples 4.4.9
(1) Consider the series $\sum_{n=1}^{\infty} \frac{n}{1 + x^2 + n^3}$. We have $0 < \frac{n}{1 + x^2 + n^3} \le \frac{n}{1 + n^3} < \frac{1}{n^2}$ for
all $x \in \mathbb{R}$, $n \in \mathbb{N}$. Taking $M_n = n^{-2}$ in the M-test, we see that $\sum_{n=1}^{\infty} \frac{n}{1 + x^2 + n^3}$
converges uniformly to a continuous function on $\mathbb{R}$.
(2) Let $(a_n)$ be any sequence of real numbers and $p > 1$. The infinite series
$\sum_{n=1}^{\infty} \frac{\sin(a_n x)}{n^p}$ converges uniformly to a continuous function on $\mathbb{R}$. For this, we
note that $|\frac{\sin(a_n x)}{n^p}| \le n^{-p}$ and take $M_n = n^{-p}$ in the M-test (since $p > 1$,
$\sum n^{-p} < \infty$).
(3) Consider the exponential series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$. This series does not converge uniformly
on $\mathbb{R}$. We show that the series converges uniformly on every closed and
bounded interval $[-R, R]$, $R \ge 0$. Certainly $|\frac{x^n}{n!}| \le \frac{R^n}{n!}$ for all $x \in [-R, R]$. We
take $M_n = \frac{R^n}{n!}$ in the M-test. Since $\sum_{n=0}^{\infty} \frac{R^n}{n!} < \infty$, it follows by the M-test
that $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges uniformly on $[-R, R]$ for all $R \ge 0$. As a consequence
$\exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$ defines a continuous function on $\mathbb{R}$.
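The M-test bound for the exponential series on $[-R, R]$ can be checked numerically. In the sketch below, which is illustrative only, `m_tail` sums the majorants $M_k = R^k/k!$ over a finite range (a truncation chosen as an assumption, since the omitted terms are negligibly small for the values used), and `grid_sup_error` samples the error of a partial sum against Python's `math.exp` on a grid.

```python
import math

def exp_partial(x, n):
    """Partial sum of x**k / k! for k = 0..n."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def m_tail(R, m, n):
    """Sum of M_k = R**k / k! for k = m+1..n (a truncated tail of the M-series)."""
    return sum(R ** k / math.factorial(k) for k in range(m + 1, n + 1))

def grid_sup_error(R, n, points=601):
    """Largest sampled value of |exp(x) - exp_partial(x, n)| on a grid in [-R, R]."""
    xs = [-R + 2 * R * i / (points - 1) for i in range(points)]
    return max(abs(math.exp(x) - exp_partial(x, n)) for x in xs)

# On [-3, 3], the error after 11 terms is controlled by the tail of the M-series.
error = grid_sup_error(3.0, 10)
bound = m_tail(3.0, 10, 60)
```

The same bound works for every $x$ in $[-R, R]$ at once, which is what makes the convergence uniform there; no such bound is available on all of $\mathbb{R}$.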
EXERCISES 4.4.10
(1) Consider the infinite series $\sum_{n=1}^{\infty} \frac{1}{1 + n^2 x}$.
(a) For what values of $x$ is the series convergent?
(b) On what closed intervals does it converge uniformly?
(c) Is $f(x) = \sum_{n=1}^{\infty} \frac{1}{1 + n^2 x}$ continuous on the set of points where the series
converges?
(2) Show that the series $\sum_{n=1}^{\infty} \frac{x}{n(1 + nx^2)}$ is uniformly convergent on $\mathbb{R}$.
(3) Show that the series $\sum_{n=1}^{\infty} \frac{1}{n^3 + n^4 x^2}$ is uniformly convergent on $\mathbb{R}$.
(4) Show that the series $\sum_{n=1}^{\infty} x^n(1 - x^n)$ converges pointwise but not uniformly
on $[0, 1]$.
(5) Show that the series $\sum_{n=1}^{\infty} x^{n^2}(1 - x^n)/n$ converges uniformly on $[0, 1]$. What
about $\sum_{n=1}^{\infty} x^n(1 - x^{n^2})/n$?
(6) Show that the infinite series $\sum_{n=0}^{\infty} \frac{x^2}{(1 + x^2)^n}$ converges pointwise on $\mathbb{R}$ and find
the sum. Show that the series converges uniformly on any closed and bounded
interval $I$ not containing $x = 0$. What happens if $0 \in I$?
(7) Show that $\sum_{n=1}^{\infty} (-1)^{n+1}/(n + x)$ is uniformly convergent on $[0, \infty)$ but that
the M-test does not apply to give uniform convergence on any closed interval
$J \subset [0, \infty)$.
(8) Let $u$ be defined by the geometric series $u(x) = \sum_{n=0}^{\infty} \frac{x}{(1 + x)^n}$, $x \ge 0$.
(a) Find $u(x)$, $x \ge 0$. Is $u$ continuous?
(b) Is the convergence uniform on $[0, \infty)$? Why/Why not?
(c) Show that the convergence of $\sum_{n=0}^{\infty} \frac{x}{(1 + x)^n}$ is uniform on $[X, \infty)$, provided
$X > 0$.
(9) We say that $[a, b]$ is a proper closed interval if $a \ne b$. Prove the following
extension of Theorem 4.4.4: if $\{I_i \mid i \in \mathcal{I}\}$ is a family of proper closed
subintervals of $I$ such that (a) $\sum_{n=1}^{\infty} u_n$ is uniformly convergent on $I_i$ for all
$i \in \mathcal{I}$, and (b) $\bigcup_{i \in \mathcal{I}} I_i = I$, then $\sum_{n=1}^{\infty} u_n$ converges to a continuous function on
$I$.
(10) Show that $\sum_{n=1}^{\infty} a_n \sin nx$ converges uniformly on $\mathbb{R}$ if $\sum a_n$ is absolutely
convergent.
4.5 Power Series
In this section we consider the convergence properties of series of the form
$\sum_{n=0}^{\infty} a_n x^n$ (more generally, $\sum_{n=0}^{\infty} a_n (x - x_0)^n$). This type of series is called a power
series.
Lemma 4.5.1 Suppose that the power series $\sum_{n=0}^{\infty} a_n x^n$ converges for either $x = r$
or $x = -r$. Then there exists a $C = C(r) > 0$ such that
\[
|a_n| \le C r^{-n}, \quad n \ge 0.
\]
Proof Suppose that $|x| = r$ and $\sum_{n=0}^{\infty} a_n x^n$ converges. Now $\lim_{n\to\infty} a_n x^n = 0$
(Lemma 3.2.3) and so the sequence $(|a_n x^n|) = (|a_n| r^n)$ is bounded. Hence there
exists a $C > 0$ such that $|a_n| r^n \le C$ for all $n \ge 0$.
Lemma 4.5.2 Suppose that the power series $\sum_{n=0}^{\infty} a_n x^n$ converges if $x = z \ne 0$.
Then $\sum_{n=0}^{\infty} a_n x^n$ is absolutely convergent for all $x \in (-|z|, |z|)$. If $\sum_{n=0}^{\infty} a_n x^n$
diverges for $x = z$, then $\sum_{n=0}^{\infty} a_n x^n$ diverges if $|x| > |z|$.
Proof Suppose that $\sum_{n=0}^{\infty} a_n z^n$ is convergent. By Lemma 4.5.1, there exists a $C \ge 0$
such that
\[
|a_n| \le C |z|^{-n}, \quad n \ge 0.
\]
Therefore
\[
|a_n x^n| \le C \Bigl( \frac{|x|}{|z|} \Bigr)^n, \quad n \ge 0,
\]
and $\sum_{n=0}^{\infty} a_n x^n$ is absolutely convergent if $|x| < |z|$ by comparison with the
geometric series $C \sum_{n=0}^{\infty} \bigl( \frac{|x|}{|z|} \bigr)^n$.
Suppose that $\sum_{n=0}^{\infty} a_n z^n$ is divergent and that there exists an $x$, $|x| > |z|$, such
that $\sum_{n=0}^{\infty} a_n x^n$ is convergent. Then by the first part of the lemma, $\sum_{n=0}^{\infty} a_n y^n$ will
be convergent if $|y| < |x|$, contradicting the divergence of the series at $z$. Therefore,
the series is divergent for all $x$ satisfying $|x| > |z|$.
Definition 4.5.3 The radius of convergence $R$ of $\sum_{n=0}^{\infty} a_n x^n$ is defined by
\[
R = \sup\Bigl\{ |x| \ \Bigm|\ \sum_{n=0}^{\infty} a_n x^n \text{ converges} \Bigr\}.
\]
Examples 4.5.4
(1) The exponential series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ has radius of convergence $R = +\infty$.
(2) Using the ratio test, the series $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{x^n}{n}$ converges if $|x| < 1$ and diverges
if $|x| > 1$. Hence the radius of convergence $R = 1$. The series converges if $x = 1$
and diverges if $x = -1$.
Proposition 4.5.5 Suppose that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Then
$S(x) = \sum_{n=0}^{\infty} a_n x^n$ defines a continuous function on $(-R, R)$.
Proof Let $a \in (0, R)$ and choose $b$, $a < b < R$. The series converges at $x = b$ (by
Lemma 4.5.2, since $b < R$), so by Lemma 4.5.1 there exists a
$C \ge 0$ such that $|a_n| \le C b^{-n}$. Therefore $|a_n a^n| \le C(\frac{a}{b})^n$, for all $n \ge 0$. Since
$0 < a/b < 1$, $\sum_{n=0}^{\infty} C(\frac{a}{b})^n < \infty$. Take $M_n = C(\frac{a}{b})^n$. Then $|a_n x^n| \le M_n$ for all
$x \in [-a, a]$ and so, by the M-test, $\sum_{n=0}^{\infty} a_n x^n$ is uniformly convergent on $[-a, a]$.
Since the functions $a_n x^n$ are continuous, it follows that $\sum_{n=0}^{\infty} a_n x^n$ converges to a
continuous function on $[-a, a]$. This holds for all $a \in (0, R)$, and so $\sum_{n=0}^{\infty} a_n x^n$ is
continuous on $(-R, R)$ (see Theorem 4.4.4).
Example 4.5.6 The series $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{x^n}{n}$ defines a continuous function on $(-1, 1)$
since the radius of convergence is 1 by Examples 4.5.4(2). This series converges at
$x = 1$ and we shall show in the next section that the series converges uniformly on
$[0, 1]$ (this uses Abel's test for uniformly convergent series).
Using Cauchy's test we can give an explicit formula for the radius of convergence of a power series.
Proposition 4.5.7 The radius of convergence of $\sum_{n=0}^\infty a_n x^n$ is equal to $1/(\limsup |a_n|^{1/n})$.
Proof Set $\ell = \limsup |a_n|^{1/n}$. Then $\limsup |a_n x^n|^{1/n} = \ell|x|$, and so $\sum_{n=0}^\infty a_n x^n$ converges if $|x|\ell < 1$ and diverges if $|x|\ell > 1$ by Cauchy's test. Hence $R = 1/(\limsup |a_n|^{1/n})$.
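The formula of Proposition 4.5.7 can be explored numerically. The following is a minimal sketch, not part of the original text: it evaluates $|a_n|^{-1/n}$ at a single large $n$, which only suggests the value of $R$ when the sequence $|a_n|^{1/n}$ actually converges (the proposition itself uses the lim sup). The function names are illustrative assumptions.

```python
import math

def radius_estimate(log_abs_coeff, n):
    """Return |a_n|^(-1/n), computed from log|a_n| to avoid overflow.

    This is a heuristic stand-in for 1 / limsup |a_n|^(1/n); it is only
    meaningful when |a_n|^(1/n) converges as n grows.
    """
    return math.exp(-log_abs_coeff(n) / n)

def log_abs_pow2(n):
    # a_n = 2^n, so |a_n|^(1/n) = 2 for every n and R = 1/2.
    return n * math.log(2.0)

def log_abs_inv_factorial(n):
    # a_n = 1/n! (the exponential series); |a_n|^(1/n) -> 0, so R = +infinity.
    return -math.lgamma(n + 1)

for n in (10, 100, 1000):
    print(n, radius_estimate(log_abs_pow2, n), radius_estimate(log_abs_inv_factorial, n))
```

For $a_n = 2^n$ the estimate stays at $1/2$, while for $a_n = 1/n!$ it grows without bound, in line with Examples 4.5.4(1).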
4.5.1 Sums, Products and Quotients of Power Series
We define the sum $\sum_{n=0}^\infty a_n x^n + \sum_{n=0}^\infty b_n x^n$ of two power series to be the power series
\[ \sum_{n=0}^\infty (a_n + b_n)x^n. \]
We may similarly define the difference of two power series.
Proposition 4.5.8 If $\sum_{n=0}^\infty a_n x^n$ has radius of convergence $R$ and $\sum_{n=0}^\infty b_n x^n$ has radius of convergence $S$, then the radius of convergence of $\sum_{n=0}^\infty (a_n \pm b_n)x^n$ is at least $\min\{R, S\}$.
Proof We may assume $\min\{R, S\} > 0$, else the result is trivial. Let $0 < |x| < \min\{R, S\}$; then $\sum_{n=0}^\infty a_n x^n$ and $\sum_{n=0}^\infty b_n x^n$ both converge and so $\sum_{n=0}^\infty (a_n \pm b_n)x^n$ converges by standard properties of convergent series. Since this is so for all $|x| < \min\{R, S\}$, the radius of convergence of $\sum_{n=0}^\infty (a_n \pm b_n)x^n$ is at least $\min\{R, S\}$.
Definition 4.5.9 The product $\big(\sum_{n=0}^\infty a_n x^n\big)\big(\sum_{n=0}^\infty b_n x^n\big)$ of two power series is defined to be the power series $\sum_{n=0}^\infty c_n x^n$, where
\[ c_n = \sum_{j=0}^n a_j b_{n-j}, \quad n \ge 0. \]
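As an illustration of Definition 4.5.9, here is a small sketch (not from the original text) that computes the first few coefficients $c_n$ from truncated coefficient lists. The helper name `cauchy_product` and the use of exact rationals are assumptions made for the example.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients c_0..c_{N-1} of the product series, c_n = sum_{j=0}^n a_j b_{n-j}.

    a and b are truncated coefficient lists; only the first
    N = min(len(a), len(b)) coefficients of the product are determined.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(n_terms)]

# Geometric series 1 + x + x^2 + ... squared: the coefficients are c_n = n + 1.
geometric = [Fraction(1)] * 6
print(cauchy_product(geometric, geometric))
```

Squaring the geometric series gives $c_n = n + 1$, the coefficients of $(1-x)^{-2}$ on $(-1, 1)$.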
Proposition 4.5.10 If the power series $\sum_{n=0}^\infty a_n x^n$ and $\sum_{n=0}^\infty b_n x^n$ have radii of convergence $R$ and $S$ respectively, then the product of the series has radius of convergence at least $\min\{R, S\}$. In particular, if we set $f(x) = \sum_{n=0}^\infty a_n x^n$, $x \in (-R, R)$, and $g(x) = \sum_{n=0}^\infty b_n x^n$, $x \in (-S, S)$, then
\[ f(x)g(x) = \sum_{n=0}^\infty c_n x^n, \quad x \in (-\min\{R, S\}, +\min\{R, S\}). \]
Proof We assume $\min\{R, S\} > 0$, otherwise the result is trivial. Fix $0 < r < \min\{R, S\}$ and choose $s \in (r, \min\{R, S\})$. By Lemma 4.5.1, there exists a $C > 0$ such that $|a_n|, |b_n| \le Cs^{-n}$ for all $n \ge 0$. It follows that $|c_n| \le \sum_{j=0}^n |a_j||b_{n-j}| \le C^2 s^{-n}(n+1)$, $n \ge 0$. Hence $|c_n x^n| \le C^2(n+1)(r/s)^n = M_n$, if $|x| \le r$. Since $\sum M_n < \infty$, it follows by the M-test that $\sum_{n=0}^\infty c_n x^n$ converges uniformly on $[-r, r]$. This holds for all $0 < r < \min\{R, S\}$ and so the radius of convergence of the product is at least $\min\{R, S\}$.
Let $\sum_{n=0}^\infty a_n x^n$ have radius of convergence $R > 0$ and let $f(x) = \sum_{n=0}^\infty a_n x^n$, $x \in (-R, R)$. We recall that $f$ is continuous on $(-R, R)$ by Proposition 4.5.5. Suppose that $f(0) \ne 0$, that is, $a_0 \ne 0$. Since $f$ is continuous, $f$ will be non-vanishing near $x = 0$. We define the sequence $(d_n) \subset \mathbb{R}$ recursively by $d_0 = a_0^{-1}$ and
\[ d_n = -\frac{1}{a_0}\left(\sum_{j=0}^{n-1} a_{n-j} d_j\right), \quad n \ge 1. \]
We refer to $\sum_{n=0}^\infty d_n x^n$ as the reciprocal power series of $\sum_{n=0}^\infty a_n x^n$.
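The recursion $d_0 = a_0^{-1}$, $d_n = -a_0^{-1}\sum_{j=0}^{n-1} a_{n-j}d_j$ is easy to run on truncated coefficient lists. The sketch below is an illustration only (the function name and the choice of the exponential series as test input are assumptions): for $a_n = 1/n!$ the recursion should return $d_n = (-1)^n/n!$, the coefficients of $e^{-x}$.

```python
from fractions import Fraction
from math import factorial

def reciprocal_coefficients(a, n_terms):
    """First n_terms coefficients d_n of the reciprocal power series.

    Requires a[0] != 0 and len(a) >= n_terms. Implements
    d_0 = 1/a_0 and d_n = -(1/a_0) * sum_{j=0}^{n-1} a_{n-j} d_j.
    """
    if a[0] == 0:
        raise ValueError("a_0 must be non-zero")
    d = [Fraction(1) / a[0]]
    for n in range(1, n_terms):
        s = sum(a[n - j] * d[j] for j in range(n))
        d.append(-s / a[0])
    return d

# Exponential series a_n = 1/n!; exact rational arithmetic avoids rounding.
exp_coeffs = [Fraction(1, factorial(n)) for n in range(8)]
d = reciprocal_coefficients(exp_coeffs, 8)
print(d)

# The Cauchy product of the two series has coefficients 1, 0, 0, ...
product = [sum(exp_coeffs[j] * d[n - j] for j in range(n + 1)) for n in range(8)]
print(product)
```

Exact rationals are used so that the identity $\sum_{j=0}^n a_j d_{n-j} = 0$ for $n > 0$ can be checked without floating-point error.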
Proposition 4.5.11 (Notation and Assumptions as Above) If we denote the radius of convergence of $\sum_{n=0}^\infty d_n x^n$ by $S$, then $S > 0$ and
\[ \sum_{n=0}^\infty d_n x^n = 1/f(x), \quad x \in (-\min\{R, S\}, +\min\{R, S\}). \]
Proof If $\sum_{n=0}^\infty d_n x^n$ has non-zero radius of convergence $S$, then the radius of convergence of $\big(\sum_{n=0}^\infty d_n x^n\big)\big(\sum_{n=0}^\infty a_n x^n\big)$ is at least $r = \min\{R, S\}$ by Proposition 4.5.10. Since $\sum_{j=0}^n a_j d_{n-j} = 0$, $n > 0$, and $a_0 d_0 = 1$, the product is identically one on $(-r, r)$.
It remains to show that $S > 0$. Let $r \in (0, R)$. By Lemma 4.5.1, there exists a $C = C(r) > 0$ such that $|a_n| \le Cr^{-n}$, $n \ge 0$. Define
\[ s = \frac{r|a_0|}{C + |a_0|}. \]
Observe that $s \in (0, r)$ and for future reference note that
\[ \frac{C}{|a_0|}\,\frac{s}{r - s} = 1. \tag{4.2} \]
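Set $D = |a_0|^{-1}$. We claim that $|d_n| \le Ds^{-n}$, all $n \ge 0$. This shows that the radius of convergence of $\sum_{n=0}^\infty d_n x^n$ is at least $s > 0$. Our proof is by induction. The result is trivial if $n = 0$. Suppose we have proved the estimate for $j = 0, \dots, n-1$. Since $d_n = -a_0^{-1}\sum_{j=1}^n a_j d_{n-j}$, we have
\begin{align*}
|d_n| &\le \frac{DC}{|a_0|}\left(\sum_{j=1}^n r^{-j}s^{-n+j}\right) \\
&= \frac{DC}{|a_0|}\,s^{-n}\sum_{j=1}^n \left(\frac{s}{r}\right)^j \\
&= \frac{DC}{|a_0|}\,s^{-n}\,\frac{s}{r}\sum_{j=0}^{n-1} \left(\frac{s}{r}\right)^j \\
&\le \frac{DC}{|a_0|}\,s^{-n}\,\frac{s}{r}\,\frac{1}{1 - s/r} \\
&= \frac{DC}{|a_0|}\,s^{-n}\,\frac{s}{r - s} \\
&= Ds^{-n},
\end{align*}
where the last statement follows by (4.2). Since $D = |a_0|^{-1}$ works for $|d_0| = D$, the induction shows that this value of $D$ works for all $n \ge 0$, granted our choice of $s$.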
Suppose that $\sum_{n=0}^\infty a_n x^n$, $\sum_{n=0}^\infty b_n x^n$ are power series with non-zero radius of convergence $R$ and $S$, respectively. Let $f(x) = \sum_{n=0}^\infty a_n x^n$, $x \in (-R, R)$, and $g(x) = \sum_{n=0}^\infty b_n x^n$, $x \in (-S, S)$, be the continuous functions defined by the power series. If $g(0) \ne 0$, then it follows from Propositions 4.5.10, 4.5.11 that there exists an $s > 0$ such that the quotient $f(x)/g(x)$ has a power series representation on $(-s, s)$.
Let $\mathbb{R}\{x\}$ denote the set of all power series with strictly positive radius of convergence. Our results show that $\mathbb{R}\{x\}$ is closed under addition and multiplication ($\mathbb{R}\{x\}$ is a ring). Let $\mathbb{R}^\star\{x\} \subset \mathbb{R}\{x\}$ be the set of all power series with non-vanishing constant coefficient. If $u \in \mathbb{R}^\star\{x\}$, then the reciprocal $u^{-1} \in \mathbb{R}^\star\{x\}$ and $\mathbb{R}^\star\{x\}$ is closed under multiplication and division ($\mathbb{R}^\star\{x\}$ is a group, the group of units of $\mathbb{R}\{x\}$). In Chap. 5, we shall use these properties of power series as part of a study of real analytic functions, that is, functions which have a power series representation at every point in their domain.
EXERCISES 4.5.12
(1) Find the radius of convergence of $\sum_{n=0}^\infty \frac{n!(2n)!}{(3n)!}x^{2n}$.
(2) Find the radius of convergence of $\sum_{n=0}^\infty \frac{(2n)!(3n)!}{(5n)!}x^n$. More generally, suppose $p + q = N$, where $p, q \in \mathbb{N}$. Find the radius of convergence of $\sum_{n=0}^\infty \frac{(pn)!(qn)!}{(Nn)!}x^n$. How would the result change if you replaced $x^n$ by $x^{rn}$ for some fixed $r \in \mathbb{N}$?
(3) Find the radius of convergence of $\sum_{n=1}^\infty n^n x^{n^2}$. (Hint: $R \ne 0$.)
(4) For $\alpha \in \mathbb{R}$, let $\binom{\alpha}{n}$ denote the generalized binomial coefficient $\frac{1}{n!}\prod_{j=0}^{n-1}(\alpha - j)$. Find the radius of convergence of the binomial series $\sum_{n=0}^\infty \binom{\alpha}{n}x^n$. (A special argument is needed if $\alpha \in \mathbb{Z}_+$.) Using Taylor's theorem with integral remainder (Theorem 2.7.7), show that $\sum_{n=0}^\infty \binom{\alpha}{n}x^n = (1 + x)^\alpha$ if $|x| < 1$.
(5) In Proposition 4.5.10, it is stated that the radius of convergence is at least $\min\{R, S\}$. Show by means of an example that the radius of convergence may be strictly greater than $\min\{R, S\}$.
(6) Use Proposition 4.5.11 to find a power series for $(1 + x^2)^{-1}$. What is the radius of convergence?
(7) Find the first four terms in the power series expansion of $(2 + x + x^2)^{-1}$. Using the result of (1), find the first four terms in the power series expansion of $(1 + x^2)^{-1}(2 + x + x^2)^{-1}$.
(8) Using Proposition 4.5.10, prove that $e^x e^y = e^{x+y}$, where $e^x$ is assumed to be defined by the power series $\sum_{n=0}^\infty \frac{x^n}{n!}$.
4.6 Abel and Dirichlet’s Test for Uniform Convergence
Proposition 4.6.1 (Abel's Test for Uniform Convergence) Given sequences $(a_n)$, $(u_n)$ of functions defined on $I \subset \mathbb{R}$, the series $\sum_{n=1}^\infty a_n u_n$ is uniformly convergent on $I$ if
(1) The series $\sum_{n=1}^\infty a_n$ is uniformly convergent on $I$.
(2) $\exists K \ge 0$ such that $\|u_n\| \le K$, for all $n \in \mathbb{N}$.
(3) $(u_n(x))$ is either decreasing for all $x \in I$ or increasing for all $x \in I$.
In particular, if $a_n, u_n$ are continuous, $n \ge 1$, then $U = \sum_{n=1}^\infty a_n u_n$ is continuous on $I$.
Proof The proof is obtained by using the argument of the proof of Abel’s test for
infinite series. We leave the proof to the exercises (see also the proof of the Dirichlet
test for uniform convergence which we give below).
Proposition 4.6.2 (Dirichlet's Test for Uniform Convergence) Given sequences $(a_n)$, $(u_n)$ of functions defined on $I \subset \mathbb{R}$, the series $\sum_{n=1}^\infty a_n u_n$ is uniformly convergent on $I$ if
(1) $\exists K \ge 0$ such that $\|a_1 + \dots + a_n\| \le K$ for all $n \ge 1$.
(2) $(u_n(x))$ is decreasing for all $x \in I$.
(3) $(u_n)$ is uniformly convergent to the zero function on $I$.
In particular, if $a_n, u_n$ are continuous, $n \ge 1$, then $S = \sum_{n=1}^\infty a_n u_n$ is continuous on $I$.
Proof We apply the argument of the proof of Dirichlet's test for infinite series pointwise to the infinite series $\sum_{n=1}^\infty a_n u_n$. Thus, using Abel's lemma, we have the estimate
\[ \Big|\sum_{j=m}^n a_j(x)u_j(x)\Big| \le 2Ku_m(x), \quad \text{for all } n \ge m \ge 1,\ x \in I. \]
Hence $\big|\sum_{j=m}^n a_j(x)u_j(x)\big| \le 2K\|u_m\|$, for all $n \ge m \ge 1$, $x \in I$, and so
\[ \Big\|\sum_{j=m}^n a_j u_j\Big\| \le 2K\|u_m\|, \quad \text{for all } n \ge m \ge 1. \]
Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ such that $\|u_m\| < \varepsilon/2K$, all $m \ge N$. We have
\[ \Big\|\sum_{j=m}^n a_j u_j\Big\| \le \varepsilon, \quad \text{for all } n \ge m \ge N. \]
It follows from the general principle of uniform convergence (Theorem 4.4.6) that $\sum_{n=1}^\infty a_n u_n$ is uniformly convergent on $I$.
Examples 4.6.3
(1) $\sum_{n=1}^\infty \frac{(-1)^{n+1}x^n}{n}$ is uniformly convergent on $[0, 1]$. In particular, $\lim_{x \to 1^-}\sum_{n=1}^\infty \frac{(-1)^{n+1}x^n}{n} = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n}$. This is an application of Abel's test with $a_n = (-1)^{n+1}/n$ and $u_n(x) = x^n$.
(2) We claim that for all $\varepsilon \in (0, \pi)$ and $m \in \mathbb{Z}$, the series $\sum_{n=1}^\infty \frac{\cos(nx)}{n}$ is uniformly convergent on $[2m\pi + \varepsilon, 2(m+1)\pi - \varepsilon]$. In particular, $\sum_{n=1}^\infty \frac{\cos(nx)}{n}$ defines a continuous function on $(2m\pi, 2(m+1)\pi)$ for all $m \in \mathbb{Z}$. Similar results hold for $\sum_{n=1}^\infty \frac{\sin(nx)}{n}$. To prove the claim, suppose that $x$ is not an integer multiple of $2\pi$. We have
\[ \sum_{k=1}^n \cos(kx) = \frac{\cos\big(\frac{(n+1)x}{2}\big)\sin\big(\frac{nx}{2}\big)}{\sin\big(\frac{x}{2}\big)}. \]
If we suppose $x \in [2m\pi + \varepsilon, 2(m+1)\pi - \varepsilon]$, where $\varepsilon \in (0, \pi)$ and $m \in \mathbb{Z}$, then the minimum value of $|\sin(\frac{x}{2})|$ is taken at $x = 2m\pi + \varepsilon$ (or $2(m+1)\pi - \varepsilon$). Hence we have the estimate
\[ \Big|\sum_{k=1}^n \cos(kx)\Big| \le 1/|\sin(\tfrac{\varepsilon}{2})|, \quad \text{for all } x \in [2m\pi + \varepsilon, 2(m+1)\pi - \varepsilon]. \]
We now apply Dirichlet's test with $K = 1/|\sin(\frac{\varepsilon}{2})|$ and $u_n = 1/n$.
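The closed form for $\sum_{k=1}^n \cos(kx)$ and the resulting bound $1/|\sin(\varepsilon/2)|$ can be sanity-checked numerically. The following sketch is illustrative only; the function names, the sampling grid, and the choice $m = 0$ are assumptions, and a finite sample does not replace the proof.

```python
import math

def cosine_partial_sum(x, n):
    """Direct evaluation of sum_{k=1}^n cos(kx)."""
    return sum(math.cos(k * x) for k in range(1, n + 1))

def cosine_closed_form(x, n):
    """cos((n+1)x/2) sin(nx/2) / sin(x/2); x must not be a multiple of 2*pi."""
    return math.cos((n + 1) * x / 2) * math.sin(n * x / 2) / math.sin(x / 2)

def max_partial_sum(eps, n_max=50, samples=400):
    """Largest |sum_{k=1}^n cos(kx)| over sampled x in [eps, 2*pi - eps], n <= n_max."""
    worst = 0.0
    for i in range(samples + 1):
        x = eps + (2 * math.pi - 2 * eps) * i / samples
        for n in range(1, n_max + 1):
            worst = max(worst, abs(cosine_partial_sum(x, n)))
    return worst

eps = 0.5
print(cosine_partial_sum(1.0, 10), cosine_closed_form(1.0, 10))
print(max_partial_sum(eps), 1 / math.sin(eps / 2))
```

On the sampled points the partial sums stay below $K = 1/\sin(\varepsilon/2)$, which is the uniform bound required by condition (1) of Dirichlet's test.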
Examples 4.6.3(1) is a special case of Abel’s theorem, which we now state and
prove.
Theorem 4.6.4 (Abel's Theorem) If the infinite series $\sum_{n=0}^\infty a_n$ is convergent, then $\sum_{n=0}^\infty a_n x^n$ is uniformly convergent on $[0, 1]$ and
\[ \lim_{x \to 1^-}\sum_{n=0}^\infty a_n x^n = \sum_{n=0}^\infty a_n. \]
Proof The method is the same as that used for Examples 4.6.3(1) and depends on
Abel’s test.
EXERCISES 4.6.5
(1) Show that if $\sum_{n=1}^\infty a_n$ converges then
(a) $\sum_{n=1}^\infty a_n \frac{x^n}{1 + x^n}$ converges uniformly on $[0, 1]$,
(b) $\sum_{n=1}^\infty a_n \frac{nx^n(1 - x)}{1 - x^n}$ converges uniformly on $[0, 1]$
(for (b), the $x$-dependent terms are defined to be equal to 1 at $x = 1$).
Deduce that these infinite series define continuous functions on $[0, 1]$.
(2) Show that if the partial sums of $\sum_{n=1}^\infty a_n$ are bounded then $\sum_{n=1}^\infty \frac{a_n}{n^x}$ defines a continuous function on $(0, \infty)$.
(3) Show that if $\sum_{n=1}^\infty a_n$ converges, then $\sum_{n=1}^\infty e^{-nx^2}a_n$ is uniformly convergent on $\mathbb{R}$. Why is the result easier if $a_n \ge 0$ for all $n \in \mathbb{N}$?
(4) Show that if $\sum_{n=1}^\infty a_n$ is convergent then $\sum_{n=1}^\infty \frac{a_n}{n^x}$ defines a continuous function on $[0, \infty)$.
(5) Show that $\sum_{n=1}^\infty \frac{\sin(nx)}{\sqrt{n}}$ defines a continuous function on $(2m\pi, 2(m+1)\pi)$ for all $m \in \mathbb{Z}$.
(6) Show that $\sum_{n=0}^\infty \frac{\sin((2n+1)x)}{2n+1}$ defines a continuous function on $(m\pi, (m+1)\pi)$ for all $m \in \mathbb{Z}$.
(7) Show that $\sum_{n=1}^\infty (-1)^{n+1}\frac{\cos(nx)}{n}$ defines a continuous function on $((2m-1)\pi, (2m+1)\pi)$ for all $m \in \mathbb{Z}$. (Hint: use Exercises 3.10.2(1).)
(8) Suppose that $(a_n)$ is a monotone decreasing sequence of positive numbers. Show that $\sum_{n=1}^\infty a_n \sin nx$ converges uniformly on $\mathbb{R}$ only if $\lim_{n\to\infty} na_n = 0$.
p
(Hint: take x D .2p C
1/
and
show
that
j
nDm an sin nxj > 0:4pap if
p
p > 2m 1. Note that 2= > 0:4. It can be shown that the condition is
also sufficient.)
4.7 Integrating and Differentiating Term-by-Term
In this section we address the question of when we can interchange the order of
integration or differentiation with summation. We start with a special case of our
main result on interchanging the order of summation and integration.
Proposition 4.7.1 Let $(u_n)$ be a sequence of continuous functions defined on the closed and bounded interval $I$ and suppose that $\sum_{n=1}^\infty u_n$ converges uniformly on $I$ to $U : I \to \mathbb{R}$. Given $a \in I$, we have for all $x \in I$,
\[ \int_a^x U(t)\,dt = \int_a^x \left(\sum_{n=1}^\infty u_n(t)\right)dt = \sum_{n=1}^\infty \int_a^x u_n(t)\,dt, \tag{4.3} \]
and the series $\sum_{n=1}^\infty \int_a^x u_n(t)\,dt$ is uniformly convergent on $I$.
Remark 4.7.2 Since $\sum_{n=1}^\infty u_n$ is uniformly convergent on $I$, and the terms $u_n$ are all continuous, $U = \sum_{n=1}^\infty u_n$ is continuous on $I$ (Theorem 4.4.3) and therefore integrable.
Proof of Proposition 4.7.1 Let $a, x \in I$. Since $I$ is an interval, $[a, x] \subset I$ (abusing notation, we allow $x < a$). Set $S_n = \sum_{j=1}^n u_j$, $n \ge 1$. In order to prove (4.3), it suffices to show that given $\varepsilon > 0$, we can find $N \in \mathbb{N}$ such that
\[ \left|\int_a^x U(t)\,dt - \sum_{j=1}^n \int_a^x u_j(t)\,dt\right| = \left|\int_a^x U(t)\,dt - \int_a^x S_n(t)\,dt\right| < \varepsilon, \]
for all $n \ge N$. Since the result is trivial if $a = x$, we may assume $x \ne a$. Denote the length of the interval $I$ by $|I|$. Since $\sum_{n=1}^\infty u_n$ is uniformly convergent on $I$, we can choose $N \ge 1$ such that $\|S_n - U\| < \varepsilon/|I|$ for all $n \ge N$. Integrating from $a$ to $x$ we have, for $n \ge N$,
\begin{align*}
\left|\int_a^x U(t)\,dt - \int_a^x S_n(t)\,dt\right| &\le \left|\int_a^x |U(t) - S_n(t)|\,dt\right| \\
&\le \left|\int_a^x \|S_n - U\|\,dt\right| \\
&= |a - x|\,\|S_n - U\| \\
&\le |I|\,\|S_n - U\| \\
&< \varepsilon.
\end{align*}
Hence $\int_a^x \sum_{n=1}^\infty u_n(t)\,dt = \sum_{n=1}^\infty \int_a^x u_n(t)\,dt$. The uniform convergence of $\sum_{n=1}^\infty \int_a^x u_n(t)\,dt$ is immediate from the estimate $\left|\int_a^x U(t)\,dt - \int_a^x S_n(t)\,dt\right| \le |I|\,\|S_n - U\|$, $x \in I$. □
Remark 4.7.3 Proposition 4.7.1 holds for uniformly convergent sequences of continuous functions—indeed, that is exactly how the proposition was proved.
For applications, we need a stronger version of Proposition 4.7.1 that applies
when I is not necessarily closed or bounded.
Theorem 4.7.4 Given a sequence $(u_n)$ of continuous functions defined on the interval $I \subset \mathbb{R}$, suppose that $\sum_{n=1}^\infty u_n$ converges uniformly on closed and bounded subintervals of $I$ to $U : I \to \mathbb{R}$. Given $a \in I$, we have for all $x \in I$,
\[ \int_a^x U(t)\,dt = \int_a^x \left(\sum_{n=1}^\infty u_n(t)\right)dt = \sum_{n=1}^\infty \int_a^x u_n(t)\,dt, \]
and the series $\sum_{n=1}^\infty \int_a^x u_n(t)\,dt$ is uniformly convergent on every closed and bounded subinterval of $I$.
Proof Since $\sum_{n=1}^\infty u_n$ is uniformly convergent on all closed and bounded subintervals of $I$ and the terms $u_n$ are all continuous, $U = \sum_{n=1}^\infty u_n$ is continuous on $I$ (Theorem 4.4.4) and therefore integrable. The first part of the theorem follows from Proposition 4.7.1 with $I = [a, x]$. For the uniform convergence statement observe that every closed and bounded subinterval $J$ of $I$ is contained in an interval $[x, a] \cup [a, y]$ where $x \le a \le y$. By Proposition 4.7.1, we have uniform convergence on $[x, a]$ and $[a, y]$ and hence on $[x, a] \cup [a, y]$ and therefore on $J$.
Examples 4.7.5
(1) We have $(1 + x)^{-1} = \sum_{n=0}^\infty (-1)^n x^n$, $x \in (-1, 1)$. Convergence is uniform on every closed subinterval $[-x, x] \subset (-1, 1)$. Take $a = 0$ in the statement of Theorem 4.7.4. We have $U(x) = 1/(1 + x)$, $x \in (-1, 1)$, and
\begin{align*}
\log(1 + x) = \int_0^x \frac{1}{1 + t}\,dt &= \int_0^x \left(\sum_{n=0}^\infty (-1)^n t^n\right)dt \\
&= \sum_{n=0}^\infty \int_0^x (-1)^n t^n\,dt \\
&= \sum_{n=0}^\infty (-1)^n \frac{x^{n+1}}{n+1}.
\end{align*}
It follows from Abel's theorem (see also Examples 4.6.3(1)) that convergence of $\sum_{n=0}^\infty (-1)^n \frac{x^{n+1}}{n+1}$ is uniform on $[0, 1]$ and so $\log(1 + x) = \sum_{n=0}^\infty (-1)^n \frac{x^{n+1}}{n+1}$ for $x \in (-1, 1]$. Taking $x = 1$ (strictly, taking the limit of both sides as $x \to 1^-$) we get the series formula for $\log 2$. (Note that the power series defines a continuous function on $(-1, 1]$ while $\log(1 + x)$ is continuous for all $x > -1$.)
(2) We have $(1 + x^2)^{-1} = \sum_{n=0}^\infty (-1)^n x^{2n}$ on $(-1, 1)$ and convergence is uniform on $[-x, x]$, for all $x \in [0, 1)$. Applying Theorem 4.7.4, we have for $x \in (-1, 1)$,
\[ \tan^{-1}(x) = \sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}. \]
It follows from Abel's theorem that convergence of $\sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}$ is uniform on $[0, 1]$ and so, as in the previous example, we may take $x = 1$ to get
\[ \frac{\pi}{4} = \tan^{-1}(1) = \sum_{n=0}^\infty (-1)^n \frac{1}{2n+1}. \]
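The two series obtained by term-by-term integration can be compared against library values. This is a minimal numerical sketch, with function names chosen for illustration; the error bounds used at $x = 1$ are the standard alternating series estimate (the error is at most the first omitted term), which is an added assumption rather than something proved in this section.

```python
import math

def log_series(x, n_terms):
    """Partial sum of sum_{n=0}^{N-1} (-1)^n x^(n+1) / (n+1)."""
    return sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(n_terms))

def arctan_series(x, n_terms):
    """Partial sum of sum_{n=0}^{N-1} (-1)^n x^(2n+1) / (2n+1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(n_terms))

# Inside the interval of convergence the partial sums converge quickly.
print(log_series(0.5, 60), math.log1p(0.5))
print(arctan_series(0.5, 60), math.atan(0.5))

# At the endpoint x = 1 convergence is slow: the error after N terms is
# bounded by the first omitted term, 1/(N+1) and 1/(2N+1) respectively.
N = 1000
print(abs(log_series(1.0, N) - math.log(2.0)), 1 / (N + 1))
print(abs(arctan_series(1.0, N) - math.pi / 4), 1 / (2 * N + 1))
```

The slow convergence at $x = 1$ is consistent with the fact that the endpoint lies on the boundary of the interval of convergence, where only Abel's theorem guarantees the limit.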
Definition 4.7.6 Let $I$ be an interval. A function $u : I \to \mathbb{R}$ is $C^1$, or once continuously differentiable, if $u$ is continuous, differentiable and $u' : I \to \mathbb{R}$ is continuous. More generally, if $r \in \mathbb{N}$, $u$ is $C^r$, or $r$-times continuously differentiable, if the first $r$ derivatives of $u$ all exist and are continuous on $I$. If $u$ is $C^r$ for all $r \in \mathbb{N}$, we say $u$ is $C^\infty$ or infinitely differentiable.
Remarks 4.7.7
(1) We allow the interval $I$ to be open, closed, half-open and unbounded. When $I$ is closed, we interpret continuity and differentiability in the usual way. For example, $u : [a, b] \to \mathbb{R}$ is differentiable at $x = a$ if $\lim_{h \to 0^+}(u(a + h) - u(a))/h$ exists and the value of the limit is defined to be $u'(a)$.
(2) If $r \in \mathbb{N}$, denote the $r$th derivative map of $u$ by $u^{(r)}$. If $r = 1, 2$, we write $u', u''$. Note that if $u$ is $C^r$, $r > 1$, we require $u^{(r-1)}$ to exist and be continuous and define $u^{(r)}$ to be the derivative of $u^{(r-1)}$.
Theorem 4.7.8 Given a sequence $(u_n)$ of $C^1$ functions defined on the interval $I \subset \mathbb{R}$, suppose that
(a) $\sum_{n=1}^\infty u_n'$ is uniformly convergent on all closed and bounded subintervals of $I$.
(b) There exists an $a \in I$ such that $\sum_{n=1}^\infty u_n(a)$ is convergent.
Then $\sum_{n=1}^\infty u_n$ converges pointwise on $I$ to a $C^1$ function $U : I \to \mathbb{R}$ and
\[ U' = \sum_{n=1}^\infty u_n'. \]
The convergence of $\sum_{n=1}^\infty u_n$ is uniform on all closed and bounded subintervals of $I$.
Remark 4.7.9 Condition (b) is clearly necessary. For example, take $u_n \equiv 1$ for all $n \in \mathbb{N}$.
Proof of Theorem 4.7.8 Since $u_n$ is $C^1$, $u_n'$ is continuous for all $n \in \mathbb{N}$. Therefore $V = \sum_{n=1}^\infty u_n'$ defines a continuous function on $I$. Apply Theorem 4.7.4 to get for all $x \in I$
\begin{align*}
\int_a^x V(t)\,dt &= \int_a^x \sum_{n=1}^\infty u_n'(t)\,dt \\
&= \sum_{n=1}^\infty \int_a^x u_n'(t)\,dt \\
&= \sum_{n=1}^\infty (u_n(x) - u_n(a)) \\
&= \sum_{n=1}^\infty u_n(x) - \sum_{n=1}^\infty u_n(a),
\end{align*}
where the final step follows by condition (b) and the convergence of $\sum_{n=1}^\infty (u_n(x) - u_n(a))$. Hence $\sum_{n=1}^\infty u_n$ converges pointwise on $I$ and $\sum_{n=1}^\infty u_n(x)$ converges uniformly on all closed and bounded subintervals of $I$ by Theorem 4.7.4. Our argument shows that for all $x \in I$ we have
\[ \sum_{n=1}^\infty u_n(x) = \sum_{n=1}^\infty u_n(a) + \int_a^x V(t)\,dt. \]
Since $V$ is continuous, it follows by the fundamental theorem of calculus that the right-hand side of this equation is differentiable at $x$ with derivative $V(x)$. Hence
\[ \left(\sum_{n=1}^\infty u_n\right)'(x) = V(x) = \sum_{n=1}^\infty u_n'(x), \]
and so we obtain the derivative by term-by-term differentiation. □
Remark 4.7.10 We leave the $C^r$ version of Theorem 4.7.8 to the exercises.
We end this section with an important application of our results on term-by-term integration and differentiation to power series.
Theorem 4.7.11 Suppose that the power series $U(x) = \sum_{n=0}^\infty a_n x^n$ has radius of convergence $R > 0$ (as usual we allow $R = +\infty$). Then $U$ is a $C^\infty$-function on $(-R, R)$ and the derivatives and definite integrals of $U$ may be computed by term-by-term differentiation and integration. Furthermore, the power series giving the derivatives and integrals of $U$ all have radius of convergence $R$.
Proof It suffices to show that $\sum_{n=0}^\infty a_n x^n$ has radius of convergence $R$ iff $\sum_{n=1}^\infty na_n x^{n-1}$ has radius of convergence $R$ (note that the first series is obtained, up to a constant, by term-by-term integration of the second series). We may prove this either by estimating $|a_n|$ along the lines of the proof of Lemma 4.5.2 or, more simply, by using the root test: $R^{-1} = \limsup |a_n|^{1/n} = \limsup |na_n|^{1/n}$ since $\lim_{n\to\infty} n^{1/n} = 1$.
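A quick numerical illustration of Theorem 4.7.11, offered as a sketch rather than as part of the text: differentiating the geometric series $\sum_{n=0}^\infty x^n = 1/(1-x)$ term by term gives $\sum_{n=1}^\infty nx^{n-1}$, which should equal $1/(1-x)^2$ on $(-1, 1)$. The function names and sample points are assumptions.

```python
def derivative_series(x, n_terms):
    """Partial sum of the term-by-term derivative: sum_{n=1}^{N} n x^(n-1)."""
    return sum(n * x ** (n - 1) for n in range(1, n_terms + 1))

def nth_root_of_n(n):
    """n^(1/n), the factor that tends to 1 in the root test argument."""
    return n ** (1.0 / n)

for x in (0.5, -0.3):
    print(x, derivative_series(x, 200), 1.0 / (1.0 - x) ** 2)

for n in (10, 1000, 10 ** 6):
    print(n, nth_root_of_n(n))
```

The second loop shows $n^{1/n}$ approaching 1, which is the reason the factor $n$ does not change the radius of convergence.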
EXERCISES 4.7.12
(1) We gave results on term-by-term differentiation and integration for infinite series. State and prove the corresponding results for sequences of functions.
(2) True or False? If true, explain why; if false give a counterexample.
(a) Suppose that the sequence $(u_n)$ consists of functions which are $C^1$ on $\mathbb{R}$. If $\sum_{n=1}^\infty u_n'(x)$ is uniformly convergent on $\mathbb{R}$, then $\sum_{n=1}^\infty u_n(x)$ converges to a $C^1$ function on $\mathbb{R}$.
(b) If $\sum_{n=0}^\infty a_n x^n$ has radius of convergence $R = 1$, then $\sum_{n=0}^\infty \frac{a_n x^n}{n+1}$ converges when $x = 1$.
(3) Suppose that the sequence $(u_n)$ consists of functions which are $C^r$ on the interval $I$, where $1 < r < \infty$. Show that if
(a) $\sum_{n=1}^\infty u_n^{(r)}$ is uniformly convergent on all closed and bounded subintervals of $I$,
(b) there exists an $a \in I$ such that $\sum_{n=1}^\infty u_n^{(s)}(a)$ is convergent for $0 \le s < r$,
then $\sum_{n=1}^\infty u_n$ converges to a $C^r$-function $V$ on $I$ and
\[ V^{(s)} = \sum_{n=1}^\infty u_n^{(s)}, \quad 0 < s \le r. \]
(4) Find an explicit example of a power series $U_0(x) = \sum_{n=0}^\infty a_n x^n$ with radius of convergence 1 such that if we define the power series $(U_m)$ inductively by $U_{m+1}(x) = \int_0^x U_m(t)\,dt$, $m \ge 0$, then the power series $U_m$ diverges at $x = 1$ for all $m \in \mathbb{Z}_+$.
(5) Let $(I_n)_{n=1}^\infty$ be a sequence of non-empty, mutually disjoint open intervals and $(\alpha_n)$ be a sequence of strictly positive real numbers. Suppose that $(f_n)$ is a sequence of positive continuous functions on $\mathbb{R}$ such that for $n \ge 1$, (a) $f_n$ is non-zero precisely on $I_n$ and (b) the maximum value of $f_n$ is $\alpha_n$.
(a) Is $\sum_{n=1}^\infty f_n$ always pointwise convergent on $\mathbb{R}$?
(b) Find a necessary and sufficient condition on the sequence $(\alpha_n)$ that allows the M-test to be applied to $\sum_{n=1}^\infty f_n$.
(c) Is it true that if the M-test does not apply, then $\sum_{n=1}^\infty f_n$ cannot be uniformly convergent on $\mathbb{R}$? If false, provide an example.
(d) Denote the length of the interval $I_n$ by $\ell_n$. Show that if there exists an $A > 0$ such that $\ell_n \ge A$ for all $n \ge 1$, then $\sum_{n=1}^\infty f_n$ is always continuous on $\mathbb{R}$.
(6) Let $u_n(x) = \frac{x^n}{2^n n^3}$, $n \ge 1$.
(a) Find the radius of convergence $R$ of $\sum_{n=1}^\infty u_n$.
(b) Is the convergence of $\sum_{n=1}^\infty u_n$ uniform on $[-R, R]$? Why?/Why not?
(c) Is $u = \sum_{n=1}^\infty u_n$ differentiable on $(-R, R)$? If yes, what is the derivative and does the limit $\lim_{x \to R^-} u'(x)$ exist?
(7) Show that the series $U(x) = \sum_{n=1}^\infty \frac{1}{n^3 + n^4 x^2}$ defines a $C^1$ function on $\mathbb{R}$ and find a series representation for $U'(x)$.
(8) Show that $V(x) = \sum_{n=1}^\infty \frac{1}{n^2 + n^4 x^2}$ defines a continuous $C^1$ function on $x \ne 0$ but $V$ is not differentiable at $x = 0$. (Hints for the second part: observe that $V$ is an even function and so if $V$ is differentiable at $x = 0$, we must have $V'(0) = 0$. Now show $\lim_{x \to 0^+}(V(x) - V(0))/x$ exists and is non-zero; you might find the estimates of Cauchy's integral test useful.)
(9) For $x > 1$, let $\zeta(x) = \sum_{n=1}^\infty n^{-x}$. Prove that $\zeta$ is $C^\infty$ and find an infinite series for $\zeta^{(n)}(x)$, $n \in \mathbb{N}$.
(10) Define $F(x) = \sum_{n=1}^\infty \frac{\sin nx}{n^n}$, $x \in \mathbb{R}$. Prove that $F$ defines a $C^\infty$-function on $\mathbb{R}$. What can we say about $G(x) = \sum_{n=1}^\infty \frac{\sin nx}{n^p}$ if $p$ is an integer strictly greater than 1?
(11) Let $(I_n)_{n=1}^\infty$ be a sequence of non-empty, mutually disjoint open intervals (for example: $I_n = (n, n+1)$ or $I_n = (\frac{1}{n+1}, \frac{1}{n})$, $n \ge 1$). Suppose that $(f_n)$ is a sequence of positive continuous functions on $\mathbb{R}$ such that for $n \ge 1$, (a) $f_n$ is non-zero precisely on $I_n$ and (b) the maximum value of $f_n$ is 1.
(a) Show that $\sum_{n=1}^\infty f_n$ is pointwise convergent on $\mathbb{R}$.
(b) Show that the M-test can never be applied to $\sum_{n=1}^\infty f_n$.
(c) Show that $\sum_{n=1}^\infty f_n$ is never uniformly convergent but there exist choices of $(I_n)$ for which $\sum_{n=1}^\infty f_n$ is always continuous on $\mathbb{R}$ (you choose the $(I_n)$; the $(f_n)$ satisfy conditions (a,b) listed above).
(d) Find a choice of $(I_n)$ for which $\sum_{n=1}^\infty f_n$ never converges to a continuous function on $\mathbb{R}$ (you choose the $(I_n)$; the $(f_n)$ satisfy conditions (a,b) listed above).
4.8 A Continuous Nowhere Differentiable Function
If we consider the infinite series $\sum_{n=1}^\infty \frac{\sin nx}{n}$ we can show, using Dirichlet's test, that the series defines a function $U : (-\pi, \pi) \to \mathbb{R}$ which is continuous except at $x = 0$. Term-by-term differentiation of the series for $U$ leads to the infinite series $\sum_{n=1}^\infty \cos(nx)$ which, using the partial sum formula, can be shown to diverge at every point of $(-\pi, \pi)$. It is natural to guess that $U$ might not be differentiable on $(-\pi, \pi)$. However, as we see later in Chap. 5, this is false. Indeed, $U$ is infinitely differentiable on $(-\pi, \pi)$ except at $x = 0$!
Arising out of his work on the zeta-function $\zeta(z) = \sum_{n=1}^\infty n^{-z}$, Riemann suggested in 1861 that the continuous function $\sum_{n=1}^\infty \sin(n^2 x)/n^2$ might be nowhere differentiable. While this turned out not (quite) to be the case,¹ Riemann's question prompted further work by Weierstrass who investigated the function defined by the series $\sum_{n=0}^\infty a^n \sin(b^n \pi x)$, where $0 < a < 1$ and $b > 1$ is an odd positive integer. Since $|a^n \sin(b^n \pi x)| \le a^n$, for all $x \in \mathbb{R}$, it follows from the M-test that $\sum_{n=0}^\infty a^n \sin(b^n \pi x)$ converges uniformly to a continuous function $U$ on $\mathbb{R}$. If we differentiate term-by-term, we obtain the infinite series $\sum_{n=0}^\infty \pi(ab)^n \cos(b^n \pi x)$. If $ab > 1$, it again looks unlikely that the series converges pointwise on $\mathbb{R}$, let alone uniformly. This suggests that $U$ may not be differentiable anywhere on $\mathbb{R}$, at least if $ab$ is large enough. Weierstrass showed in 1872 that if $ab > 1 + \frac{3\pi}{2} \approx 5.7$, then $U$ is indeed nowhere differentiable on $\mathbb{R}$.² Although Weierstrass' proof is not hard, we prefer a simpler example, due to van der Waerden (1930), of a nowhere differentiable continuous function. Like Weierstrass' example, van der Waerden's function is defined using a uniformly convergent series of continuous functions.
Define
\[ u_0(x) = \begin{cases} x, & \text{if } x \in [0, \tfrac{1}{2}], \\ 1 - x, & \text{if } x \in [\tfrac{1}{2}, 1]. \end{cases} \]
Extend $u_0$ to $\mathbb{R}$ as a 1-periodic function. That is, if $x \in [n, n+1]$, $n \in \mathbb{Z}$, then $u_0(x) = u_0(x - n)$ (note $x - n \in [0, 1]$). For all $m \in \mathbb{Z}$, $x \in \mathbb{R}$ we have
\[ u_0(x + m) = u_0(x). \]
We show the graph of $u_0$ in Fig. 4.2.
For $n \ge 1$, define
\[ u_n(x) = \frac{1}{10^n}u_0(10^n x), \quad x \in \mathbb{R}. \]
Fig. 4.2 The graph of $u_0$
¹ More than 100 years later Gerver showed in 1969 that $\sum_{n=1}^\infty \sin(n^2 x)/n^2$ is differentiable iff $x = p\pi/q$, where $p, q$ are odd integers.
² In 1916, G.H. Hardy improved this result to $ab > 1$.
The function $u_n$ is $\frac{1}{10^n}$-periodic. Indeed, for all $m \in \mathbb{Z}$, $x \in \mathbb{R}$,
\begin{align*}
u_n\Big(x + \frac{m}{10^n}\Big) &= \frac{1}{10^n}u_0\Big(10^n\Big(x + \frac{m}{10^n}\Big)\Big) \\
&= \frac{1}{10^n}u_0(10^n x + m) \\
&= \frac{1}{10^n}u_0(10^n x) \\
&= u_n(x).
\end{align*}
Given $p \in \mathbb{N}$, $p > 1$, define $\frac{1}{p}\mathbb{Z} = \{\frac{m}{p} \mid m \in \mathbb{Z}\}$. For example, $\frac{1}{2}\mathbb{Z} = \{0, \pm\frac{1}{2}, \pm 1, \pm\frac{3}{2}, \dots\}$. Observe that the set of points where $u_0$ is not differentiable is precisely $\frac{1}{2}\mathbb{Z}$. Elsewhere the derivative of $u_0$ is $\pm 1$. The set of points where $u_n$ is not differentiable is $\frac{1}{2 \cdot 10^n}\mathbb{Z} = \{\frac{m}{2 \cdot 10^n} \mid m \in \mathbb{Z}\}$. Elsewhere the derivative of $u_n$ is $\pm 1$.
Define $U : \mathbb{R} \to \mathbb{R}$ by
\[ U(x) = \sum_{n=0}^\infty u_n(x). \]
Since $|u_n(x)| \le \frac{1}{10^n}$, it follows by the M-test that $\sum_{n=0}^\infty u_n$ is uniformly convergent on $\mathbb{R}$ and $U$ is continuous. We claim that $U$ is nowhere differentiable. We prove the nowhere differentiability of $U$ by showing that for each $x_0 \in \mathbb{R}$, there exists a sequence $(x_N)$ converging to $x_0$ such that the limit as $N \to \infty$ of $\frac{U(x_N) - U(x_0)}{x_N - x_0}$ does not exist (if $U$ is differentiable at $x_0$, then $\lim_{N \to \infty}\frac{U(x_N) - U(x_0)}{x_N - x_0} = U'(x_0)$ since $x_N \to x_0$).
x0
Let x0 2 R and N 0. Then there exists a unique m 2 12 Z such that x0 2
mC 12
10N
.NC1/
Π10mN ;
10
mC 12
10N
/ is 12 101N . Certainly either x0 10mN
mC 12
2
1
or 10N x0 > 10.NC1/ (as 12 101N > 10NC1
). Define xN D x0 ˙ 10NC1
1
mC
Π10mN ; 10N2 . This completes the construction of the sequence .xN /.
/. The length of the interval Π10mN ;
>
so
that xN 2
For n N, we have
un .xN / un .x0 /
D ˙1:
xN x0
(The set of points where un is not differentiable is a proper subset of the set of
points where uN is not differentiable if n < N.) On the other hand, if n > N then
1
1
un .xN / D un .x0 ˙ 10NC1
/ D un .x0 / by the 101n periodicity of un ( 10NC1
is an integer
1
multiple of 10n if n > N). Hence if n > N,
un .xN / un .x0 /
D 0:
xN x0
158
4 Uniform Convergence
We have
1
X un .xN / un .x0 /
U.xN / U.x0 /
D
xN x0
xN x0
nD0
D
N
X
un .xN / un .x0 /
xN x0
nD0
D
N
X
˙1
nD0
D QN ;
where QN must be an odd integer if N is even and an even integer if N is odd. Hence
0/
the limit of U.xxNN/U.x
as N ! 1 does not exist and so U cannot be differentiable
x0
at x0 .
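The parity argument can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the helper names are invented for the example, $x_0$ is taken to be a rational number so that `fractions.Fraction` gives exact values, and the sign in $x_N = x_0 \pm 10^{-(N+1)}$ is chosen by testing whether $x_0 + 10^{-(N+1)}$ stays in the same linear piece of $u_N$.

```python
from fractions import Fraction
import math

def u0(x):
    """Sawtooth u_0: distance from x to the nearest integer (exact for Fractions)."""
    frac = x - math.floor(x)
    return min(frac, 1 - frac)

def u(n, x):
    """u_n(x) = 10^(-n) u_0(10^n x)."""
    return u0(10 ** n * x) / 10 ** n

def choose_xN(x0, N):
    """Pick x_N = x0 + h or x0 - h, h = 10^-(N+1), inside the piece of u_N containing x0."""
    h = Fraction(1, 10 ** (N + 1))
    scale = 2 * 10 ** N                  # pieces of u_N have length 1/scale = 5h
    k = math.floor(scale * x0)           # x0 lies in [k/scale, (k+1)/scale)
    right_end = Fraction(k + 1, scale)
    return x0 + h if x0 + h <= right_end else x0 - h

def quotient_terms(x0, N):
    """The difference quotients of u_0, ..., u_N between x_N and x0 (terms n > N vanish)."""
    xN = choose_xN(x0, N)
    return [(u(n, xN) - u(n, x0)) / (xN - x0) for n in range(N + 1)]

x0 = Fraction(1, 3)
for N in range(6):
    terms = quotient_terms(x0, N)
    print(N, [int(t) for t in terms], int(sum(terms)))
```

Each printed term is $+1$ or $-1$, so the sum $Q_N$ of $N + 1$ such terms has the parity of $N + 1$, and the sequence $(Q_N)$ cannot converge.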
Remark 4.8.1 Are these examples of nowhere differentiable continuous functions exceptional and pathological? Pathological perhaps, but certainly not exceptional. 'Most', in a sense that can be made precise, continuous functions $f : [a, b] \to \mathbb{R}$ are nowhere differentiable.
EXERCISES 4.8.2
(1) Let $u_0$ be the sawtooth function defined in Sect. 4.8. For $n \in \mathbb{Z}_+$, define $v_n(x) = 2^{-2^n}u_0(2^{2^n}x)$. Show that $\sum_{n=0}^\infty v_n(x)$ is continuous and nowhere differentiable on $\mathbb{R}$.
(2) In this question, we address the nowhere differentiability of the Weierstrass function. Let $0 < a < b$ and suppose that $b$ is an odd integer. Fix $x_0 \in \mathbb{R}$.
(a) Show that given $m \in \mathbb{N}$, there exists an $N \in \mathbb{Z}$ such that $b^m x_0 - N - 1 \in [-\frac{1}{2}, +\frac{1}{2})$.
(b) Show that if we set $x_m = N/b^m$ and $n \ge m$, then
\begin{align*}
\cos(b^n \pi x_m) - \cos(b^n \pi x_0) &= (-1)^N\big(1 + \cos(b^{n-m}\pi(b^m x_0 - N - 1))\big) \\
&= (-1)^N I(m, n),
\end{align*}
where $I(m, m) \ge 1$ and $I(m, n) \ge 0$, all $n > m$.
(c) Show that
\[ \left|\sum_{n=m}^\infty a^n\frac{\cos(b^n \pi x_m) - \cos(b^n \pi x_0)}{x_m - x_0}\right| \ge \frac{a^m}{|x_m - x_0|} \ge \frac{2}{3}(ab)^m, \]
where we have used $|x_m - x_0| \le 3/(2b^m)$.
(d) Using the mean value theorem, show that for all $n \in \mathbb{Z}_+$,
\[ \left|a^n\frac{\cos(b^n \pi x_m) - \cos(b^n \pi x_0)}{x_m - x_0}\right| \le \pi(ab)^n. \]
(e) Show that
\[ \left|\sum_{n=0}^{m-1} a^n\frac{\cos(b^n \pi x_m) - \cos(b^n \pi x_0)}{x_m - x_0}\right| < \frac{\pi(ab)^m}{ab - 1}. \]
(f) Show that for all $m \in \mathbb{N}$
\[ \left|\sum_{n=0}^\infty a^n\frac{\cos(b^n \pi x_m) - \cos(b^n \pi x_0)}{x_m - x_0}\right| > (ab)^m\left(\frac{2}{3} - \frac{\pi}{ab - 1}\right), \]
and hence deduce that $\sum_{n=0}^\infty a^n\cos(b^n \pi x)$ cannot be differentiable at $x_0$ if $ab > 1 + 3\pi/2$.
(3) Let $f : [a, b] \to \mathbb{R}$ be $C^1$. Show we can construct a sequence $(u_n) \subset C^0([a, b])$ such that (a) $(u_n)$ converges uniformly to $f$, and (b) for all $n$, $u_n$ is nowhere differentiable.
(4) Let $f : \mathbb{R} \to \mathbb{R}$ be the continuous 2-periodic function defined by
\[ f(x) = \begin{cases} 0, & \text{if } x \in [0, 1/2], \\ 6x - 3, & \text{if } x \in [1/2, 2/3], \\ 1, & \text{if } x \in [2/3, 1], \\ 2 - x, & \text{if } x \in [1, 2]. \end{cases} \]
Define $E = (X, Y) : \mathbb{R} \to \mathbb{R}^2$ by
\[ E(t) = \left(\sum_{n=1}^\infty 2^{-n}f(3^{2n-1}t),\ \sum_{n=1}^\infty 2^{-n}f(3^{2n}t)\right). \]
(a) Show that $E$ is continuous and $E$ maps the unit interval $[0, 1]$ onto $[0, 1] \times [0, 1]$. (Hint: Given $(x, y) \in [0, 1] \times [0, 1]$, write $x, y$ in binary form as $x = \sum_{n=1}^\infty 2^{-n}a_{2n-1}$, $y = \sum_{n=1}^\infty 2^{-n}a_{2n}$, where $a_i \in \{0, 1\}$, $i \ge 1$. Show that if $t = 2\sum_{i=1}^\infty 3^{-1-i}a_i$, then $f(3^k t) = a_k$ and so $E(t) = (x, y)$.)
(b) Show that the result of (a) does not depend on the values of $f$ on $(1, 2)$, subject to $f(1) = 1$, $f(2) = 0$.
(c) Modifying $f$ on $(1, 2)$ as needed, show that $E$ can be nowhere differentiable.
(This elementary example of a space filling curve was given by I.J. Schoenberg in 1938. No such examples can exist if $E$ is differentiable.)
Chapter 5
Functions
5.1 Introduction
In this chapter we investigate and compare several natural classes of functions that
play an important role in analysis. We begin with a general overview and then, in
subsequent sections, study specific classes of functions using the tools developed in
the previous chapter.
The most regular, and familiar, class of functions on the real line is the space $P(\mathbb{R})$ of polynomials on $\mathbb{R}$. Recall that if $p \in P(\mathbb{R})$, then either $p \equiv 0$ or we may write $p(x) = \sum_{j=0}^n a_j x^{n-j}$ where $a_0 \ne 0$, $n$ is the degree of $p$, and the expression for $p$ is unique. If $p \in P(\mathbb{R})$ then $p$ is smooth (that is, infinitely differentiable or $C^\infty$) and the derivatives and integrals of $p$ are obtained by term-by-term differentiation and integration of $p$. At the other extreme we have the space $C^0(\mathbb{R})$ of continuous functions on $\mathbb{R}$. As we indicated at the end of Chap. 4, typical functions in $C^0(\mathbb{R})$ may have unpleasant properties such as nowhere differentiability. We can interpolate between continuous and polynomial functions using spaces of differentiable functions. To this end, if $1 \le r \le \infty$, let $C^r(\mathbb{R})$ denote the space of $C^r$-functions on $\mathbb{R}$. We have the sequence of strict inclusions
\[ C^0(\mathbb{R}) \supset C^1(\mathbb{R}) \supset \dots \supset C^r(\mathbb{R}) \supset \dots \supset C^\infty(\mathbb{R}) \supset P(\mathbb{R}). \]
There is another class of functions, intermediate between polynomials and $C^\infty$ functions, that play an important historic role in analysis (especially complex analysis). Recall that if $f \in C^\infty(\mathbb{R})$, then the Taylor series $Tf_{x_0}$ of $f$ at $x_0 \in \mathbb{R}$ is defined by
\[ Tf_{x_0}(x) = \sum_{n=0}^\infty \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n. \]
In general, the Taylor series of $f$ at $x_0$ may have zero radius of convergence and even if it converges it may not converge to $f$ (we give examples shortly). However, for many classical functions of analysis (such as $e^x$, $\sin x$, and $\cos x$), the Taylor series at $x_0$ does converge to $f$ if $x$ is close enough to $x_0$. We encode this property in a definition.
Definition 5.1.1 A $C^\infty$-function $f : \mathbb{R} \to \mathbb{R}$ is (real) analytic if for every $x_0 \in \mathbb{R}$, there exists a $\delta = \delta(x_0) > 0$ such that
\[ f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \quad \text{for all } x \text{ satisfying } |x - x_0| < \delta. \]
Remark 5.1.2 We give the definition of a real analytic function defined on an open interval $(a, b) \subset \mathbb{R}$ in Sect. 5.4.
Let $C^\omega(\mathbb{R})$ denote the space of all real analytic functions on $\mathbb{R}$. Evidently we have
\[ C^0(\mathbb{R}) \supset C^\infty(\mathbb{R}) \supset C^\omega(\mathbb{R}) \supset P(\mathbb{R}). \]
We start by developing the theory of $C^\infty$-functions and, in particular, show that a $C^\infty$-function need not be analytic (hence all the inclusions above are strict). Next we show that even though a continuous function $f$ may be nowhere differentiable, we can uniformly approximate $f$ as close as we wish on closed bounded intervals by polynomials (the "Weierstrass approximation theorem"). Next we develop some of the classical theory of analytic functions and show, for example, that $e^x$, $\sin x$ and $\cos x$ all define analytic functions on $\mathbb{R}$. Finally, we conclude the chapter with two sections on Fourier series. For this we will use many results on pointwise and uniform convergence from the previous chapter as well as a version of the Weierstrass approximation theorem.
5.2 Smooth Functions
We start by constructing a smooth (that is, $C^\infty$) non-analytic function. Specifically,
we construct a smooth bounded function $\Phi : \mathbb{R} \to \mathbb{R}$ that is strictly positive on $x > 0$
and zero on $x \le 0$. Subsequently, we use $\Phi$ as a building block for the construction
of a wide range of smooth non-analytic functions satisfying various properties and
thereby illustrate how to construct a smooth function with specified properties.
The function $\Phi$ cannot be built by piecing together simple functions.
Example 5.2.1 Define $F : \mathbb{R} \to \mathbb{R}$ by $F(x) = x^4$, $x \ge 0$, and $F(x) = 0$, $x < 0$. The
graph of $F$ looks 'smooth' near $x = 0$; see Fig. 5.1. However, although it is easily
checked that $F$ is $C^3$ (we have $F'(0) = F''(0) = F'''(0) = 0$), $F$ is not 4-times
differentiable at $x = 0$. Indeed, for $x > 0$, $F'''(x) = 24x$, and if $x < 0$, $F'''(x) = 0$.
Fig. 5.1 Graph of a $C^3$ but not four times differentiable function (the curve shown is $y = x^4$ for $x \ge 0$)
Therefore
$$\lim_{h \to 0+} \frac{F'''(h) - F'''(0)}{h} = 24 \ne 0 = \lim_{h \to 0-} \frac{F'''(h) - F'''(0)}{h}$$
and so $F'''$ is not differentiable at $x = 0$. The moral of this example is that if we
want to construct a smooth function, we cannot just piece together bits of standard
functions like polynomials and trigonometric functions.
Before we construct our example of a smooth non-analytic function we need a
technical lemma.
Lemma 5.2.2 If $q \in P(\mathbb{R})$ and $p \in \mathbb{Z}$, then
$$\lim_{x \to 0+} \frac{q(x)}{x^p} e^{-1/x} = 0.$$
Proof If $q$ is of degree $m$, then $q(x) = \sum_{j=0}^{m} b_j x^{m-j}$, and so $x^{-p} q(x) = \sum_{j=0}^{m} b_j x^{m-j-p}$. Since the limit of a finite sum is the sum of the limits of the
terms in the sum, it is enough to show that $\lim_{x \to 0+} x^k e^{-1/x} = 0$, for all $k \in \mathbb{Z}$.
Setting $y = 1/x$, it suffices to show
$$\lim_{y \to +\infty} y^k e^{-y} = 0,$$
for all $k \in \mathbb{Z}$. A proof of this standard result about the growth of the exponential
function is given in Sect. 2.9.3 (Proposition 2.9.9). □
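The limit in Lemma 5.2.2 can be illustrated numerically. The sketch below is only an illustration, not part of the proof: the polynomial $q(x) = 3 - x + 2x^2$, the exponent $p = 10$ and the helper name `damped_ratio` are hypothetical choices made for this example.

```python
import math

def damped_ratio(coeffs, p, x):
    """Evaluate q(x) / x**p * exp(-1/x) for x > 0.

    coeffs lists the coefficients of q, lowest degree first.
    """
    q = sum(c * x**k for k, c in enumerate(coeffs))
    return q / x**p * math.exp(-1.0 / x)

# Hypothetical test case: q(x) = 3 - x + 2x^2 and p = 10.
coeffs, p = [3.0, -1.0, 2.0], 10
values = [damped_ratio(coeffs, p, 10.0 ** (-k)) for k in range(1, 4)]
for k, v in zip(range(1, 4), values):
    print(f"x = 1e-{k}: {v:.3e}")
```

The value at $x = 0.1$ is still large because of the factor $x^{-10}$, but the exponential factor takes over quickly as $x$ decreases; in floating point the value at $x = 10^{-3}$ underflows to zero.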
Proposition 5.2.3 Define $\Phi : \mathbb{R} \to \mathbb{R}$ by
$$\Phi(x) = \begin{cases} 0, & x \le 0, \\ e^{-1/x}, & x > 0. \end{cases}$$
Then
(1) $\Phi \in C^\infty(\mathbb{R})$.
(2) $\Phi^{(j)}(0) = 0$, for all $j \ge 0$.
Proof It is clear that $\Phi$ restricted to either $(-\infty, 0)$ or $(0, \infty)$ is $C^\infty$. We have to
show that $\Phi$ is infinitely differentiable with all derivatives continuous at $x = 0$. We
start by finding expressions for the derivatives of $\Phi$ at non-zero points of $\mathbb{R}$. Let
$j \ge 1$. We claim that
$$\Phi^{(j)}(x) = \begin{cases} 0, & x < 0, \\ \dfrac{q_j(x)}{x^{2j}} e^{-1/x}, & x > 0, \end{cases}$$
where $q_j$ is a polynomial in $x$ of degree $j - 1$ with constant term $+1$. To see this, note
that $\Phi^{(j)}(x) = 0$ if $x < 0$ since $\Phi$ vanishes identically on $(-\infty, 0)$. The expression
for $x > 0$ is an easy inductive argument that we leave to the reader. We prove that for
$j \ge 0$, $\Phi^{(j)}(0)$ exists and is equal to zero and $\Phi^{(j)}$ is continuous at $x = 0$. If $j = 0$,
$\Phi^{(0)}(0) = \Phi(0) = 0$ (by definition of $\Phi$) and $\Phi$ will be continuous at $x = 0$ since
$\lim_{x \to 0+} e^{-1/x} = 0$ by Lemma 5.2.2. Proceeding inductively, suppose that we have
shown for $j < n$ that $\Phi^{(j)}(0)$ exists and is equal to zero and that $\Phi^{(j)}$ is continuous
at $x = 0$. First we show that $\Phi^{(n-1)}$ is differentiable at $x = 0$ with zero derivative.
We have
$$\lim_{x \to 0-} \frac{\Phi^{(n-1)}(x) - \Phi^{(n-1)}(0)}{x} = \lim_{x \to 0-} \frac{0 - 0}{x} = 0.$$
It remains to consider the limit as $x \to 0+$. We have
$$\lim_{x \to 0+} \frac{\Phi^{(n-1)}(x) - \Phi^{(n-1)}(0)}{x} = \lim_{x \to 0+} \frac{\frac{q_{n-1}(x)}{x^{2n-2}} e^{-1/x} - 0}{x} = \lim_{x \to 0+} \frac{q_{n-1}(x)}{x^{2n-1}} e^{-1/x} = 0,$$
by Lemma 5.2.2, with $p = 2n - 1$. Hence $\Phi^{(n-1)}$ is differentiable at $x = 0$ with
zero derivative and so $\Phi$ is $n$ times differentiable at $x = 0$ with $\Phi^{(n)}(0) = 0$. To
complete the inductive step, we must show that $\Phi^{(n)}$ is continuous at $x = 0$; that is,
$\lim_{x \to 0} \Phi^{(n)}(x) = 0$. Obviously, $\lim_{x \to 0-} \Phi^{(n)}(x) = 0$. Since $\Phi^{(n)}(x) = \frac{q_n(x)}{x^{2n}} e^{-1/x}$ if
$x > 0$, we have $\lim_{x \to 0+} \Phi^{(n)}(x) = 0$ by Lemma 5.2.2. □
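The inductive claim about the polynomials $q_j$ can be explored with a short computation. Differentiating $q_j(x) x^{-2j} e^{-1/x}$ by the product rule gives the recurrence $q_{j+1}(x) = x^2 q_j'(x) + (1 - 2jx) q_j(x)$ with $q_1 = 1$; this recurrence is a routine calculation that the reader should verify, and it is not stated in the text. The function names below are hypothetical.

```python
def next_q(q, j):
    """Coefficients (lowest degree first) of
    q_{j+1}(x) = x^2 q_j'(x) + (1 - 2 j x) q_j(x), given those of q_j."""
    out = [0] * (len(q) + 1)
    for k, c in enumerate(q):
        if k >= 1:
            out[k + 1] += k * c      # x^2 * d/dx (c x^k) = k c x^(k+1)
        out[k] += c                  # 1 * c x^k
        out[k + 1] -= 2 * j * c      # -2 j x * c x^k
    while len(out) > 1 and out[-1] == 0:
        out.pop()                    # drop trailing zero coefficients
    return out

def q_polys(n):
    """Return {j: coefficients of q_j} for j = 1, ..., n."""
    qs = {1: [1]}
    for j in range(1, n):
        qs[j + 1] = next_q(qs[j], j)
    return qs

qs = q_polys(8)
for j in range(1, 5):
    print(j, qs[j])
```

For each $j$ computed here, the list has length $j$ (degree $j - 1$) and first entry $1$ (constant term $+1$), in line with the claim in the proof.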
Example 5.2.4 The $C^\infty$-function $\Phi : \mathbb{R} \to \mathbb{R}$ defined in Proposition 5.2.3 is not
analytic. Indeed, $\Phi$ is strictly positive on $x > 0$ and so is non-zero on $x > 0$. On the
other hand the Taylor series $T\Phi_0$ of $\Phi$ at the origin is $\sum_{n=0}^{\infty} \frac{\Phi^{(n)}(0)}{n!} x^n = \sum_{n=0}^{\infty} 0 \cdot x^n = 0$. Hence the Taylor series of $\Phi$ at the origin does not converge to $\Phi$ on any interval
$(-a, a)$, $a > 0$, and therefore $\Phi$ cannot be analytic.
Remarks 5.2.5
(1) In general, the Taylor series of a smooth function bears little relation to the
function. There is a classical result of E. Borel (1895) that shows that given
any sequence $(a_n)_{n \ge 0}$ of real numbers, there exists a $C^\infty$-function $f : \mathbb{R} \to \mathbb{R}$
with Maclaurin series (the Taylor series at zero) given by $Tf_0 = \sum_{n=0}^{\infty} \frac{a_n}{n!} x^n$ (so
$f^{(n)}(0) = a_n$). If we choose a rapidly increasing sequence such as $a_n = n^{n^n}$,
the radius of convergence of the Maclaurin series will be zero even though $f$ is
defined on all of $\mathbb{R}$. See also the exercises at the end of the section.
(2) A necessary condition for a non-constant function $f : \mathbb{R} \to \mathbb{R}$ to be analytic is that $f^{-1}(c)$
must be a countable subset of $\mathbb{R}$ for all $c \in \mathbb{R}$. In particular, if $f$ is non-constant and $f^{-1}(c)$
contains an open interval, $f$ cannot be analytic (see Sect. 5.4 for properties of
analytic functions).
5.2.1 Constructing Smooth Functions
We use the smooth function $\Phi$ constructed in Proposition 5.2.3 as a building block
to construct many other smooth non-analytic functions.
Examples 5.2.6
(1) Given $a \in \mathbb{R}$, we construct a smooth function $f : \mathbb{R} \to \mathbb{R}$ such that $f(x) < 0$,
for $x < a$ and $f(x) = 0$ for $x \ge a$ by
$$f(x) = -\Phi(a - x).$$
Observe that $f^{(n)}(a) = 0$, $n \in \mathbb{Z}_+$. The obvious variations can be made on
this function by considering $\pm\Phi(\pm(x - a))$. More generally, observe that if
$g : \mathbb{R} \to \mathbb{R}$ is any smooth function then $f(x) = \Phi(g(x))$ is smooth and
$$f^{-1}(0) = \{x \in \mathbb{R} \mid g(x) \le 0\}.$$
We have $f^{(n)}(x) = 0$ for all $n \ge 0$ at every point $x \in f^{-1}(0)$. In particular,
$f(x) = \Phi(x^2)$ is a smooth positive non-analytic function with zero set $f^{-1}(0) = \{0\}$.
(2) Given $a < b \in \mathbb{R}$, we find a smooth function $\Psi_{a,b} : \mathbb{R} \to \mathbb{R}$ satisfying
$$\Psi_{a,b}(x) = 0, \text{ if } x \notin (a, b); \qquad \Psi_{a,b}(x) > 0, \text{ if } x \in (a, b).$$
To this end we define
$$\Psi_{a,b}(x) = \Phi(b - x)\Phi(x - a), \quad x \in \mathbb{R}.$$
Observe that $\Phi(x - a) = 0$ iff $x \le a$ and $\Phi(b - x) = 0$ iff $x \ge b$. Hence
$\Psi_{a,b}(x) = 0$ iff $x \notin (a, b)$. Since $\Phi(x) > 0$ if $x > 0$, we have $\Psi_{a,b}(x) > 0$ if
$x \in (a, b)$. Since $\Psi_{a,b}$ is the product of $C^\infty$-functions, $\Psi_{a,b}$ is $C^\infty$. Note that if
$z \notin (a, b)$, then $\Psi_{a,b}^{(j)}(z) = 0$, $j \ge 0$. We show the graph of $\Psi_{a,b}$ in Fig. 5.2a (the
graph is symmetric about the mid-point $(a + b)/2$).
Fig. 5.2 Smooth positive bump functions on $\mathbb{R}$. (a) Graph of $\Psi_{a,b}$: smooth positive bump function which is non-zero on $(a, b)$. (b) Graph of $\Theta_{a,b}$: tabletop function which is non-zero on $(-b, b)$ and equal to 1 on $[-a, a]$
(3) Given $a, b \in \mathbb{R}$, with $0 < a < b < \infty$, we construct a smooth function $\Theta_{a,b}$
satisfying
$$\Theta_{a,b}(x) = 0, \text{ if } |x| \ge b; \qquad \Theta_{a,b}(x) = 1, \text{ if } |x| \le a; \qquad \Theta_{a,b}(x) \in (0, 1), \text{ if } |x| \in (a, b).$$
For this we define
$$\Theta_{a,b}(x) = \frac{\Phi(b^2 - x^2)}{\Phi(b^2 - x^2) + \Phi(x^2 - a^2)}, \quad x \in \mathbb{R}.$$
Since $0 < a < b$, the denominator is never zero and so $\Theta_{a,b}$ is well defined
and $C^\infty$. If $|x| \ge b$, then the numerator is zero; if $|x| \le a$, the denominator is
equal to the numerator and so $\Theta_{a,b}(x) = 1$. If $|x| \in (a, b)$, then the numerator
is strictly less than the denominator and so $\Theta_{a,b}(x) \in (0, 1)$. We remark that
all the derivatives of $\Theta_{a,b}$ at $x$ are zero if $|x| \notin (a, b)$. In particular, $\Theta_{a,b}$ is not
analytic. We show the graph of $\Theta_{a,b}$ in Fig. 5.2b.
Remark 5.2.7 The two functions constructed in the previous examples are often
called “bump” functions. Granted the map $\Phi$, their construction depends more on
simple logic than difficult analysis.
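The bump and tabletop constructions translate directly into code. The following is a minimal floating-point sketch; the function names `phi`, `psi` and `theta` are hypothetical stand-ins for $\Phi$, $\Psi_{a,b}$ and $\Theta_{a,b}$, and the sample points are chosen only for illustration.

```python
import math

def phi(x):
    """Phi(x) = exp(-1/x) for x > 0 and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def psi(a, b, x):
    """Bump function Psi_{a,b}: positive on (a, b), zero elsewhere."""
    return phi(b - x) * phi(x - a)

def theta(a, b, x):
    """Tabletop function Theta_{a,b} for 0 < a < b:
    1 on |x| <= a, 0 on |x| >= b, strictly between 0 and 1 otherwise."""
    num = phi(b * b - x * x)
    return num / (num + phi(x * x - a * a))

print(psi(0.0, 1.0, 0.5), psi(0.0, 1.0, 1.5))
print(theta(1.0, 2.0, 0.5), theta(1.0, 2.0, 1.5), theta(1.0, 2.0, 2.5))
```

Note that the case analysis in the text is mirrored exactly: no branching on $a$ and $b$ is needed beyond the single test inside `phi`.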
Examples 5.2.8
(1) We construct a smooth function with zero set equal to $\{0\} \cup \{\pm 1/n \mid n \ge 1\}$. As
a first try, we might consider $f(x) = x \sin(\pi/x)$, $x \ne 0$, $f(0) = 0$. This function
is continuous and has the specified zero set but it is not differentiable at $x = 0$ as
$\lim_{x \to 0} (f(x) - f(0))/x = \lim_{x \to 0} \sin(\pi/x)$, which does not exist. If we instead
try $f(x) = x^2 \sin(\pi/x)$, $x \ne 0$, $f(0) = 0$, we find that $f$ is differentiable at $x = 0$
but not $C^1$. More generally, if we define $f(x) = x^{2n+1} \sin(\pi/x)$, $x \ne 0$, and
$f(0) = 0$, then $f$ can be shown to be $C^n$, $n \ge 1$ (we leave this to the exercises).
In order to find a $C^\infty$-function with the correct properties, we try
$$f(x) = \begin{cases} \Phi(x^2) \sin(\pi/x), & x \ne 0, \\ 0, & x = 0. \end{cases}$$
Just as in the proof of Proposition 5.2.3, we may use Lemma 5.2.2 to show
that $f$ is $C^\infty$ and all the derivatives of $f$ vanish at zero. In particular, $f$ is not
analytic. Notice the way we use $\Phi$ to 'smooth' out the irregularities near $x = 0$
of $\sin(\pi/x)$.
(2) We show how to construct a $C^\infty$-function $F : \mathbb{R} \to \mathbb{R}$ satisfying
(a) $F(x) = 2$, $x \le -2$.
(b) $F(x) \in (0, 2)$ for $x \in (-2, -1)$.
(c) $F(x) = 0$, for $x \in [-1, 0]$.
(d) $F(x) \ge 0$ on $[0, 1]$ and $F(x) = 0$ iff $x = 1/n$ or $1 - 1/n$ for some $n \in \mathbb{N}$.
(e) $F(x) \in (-1, 0)$, for $x \in (1, 5)$.
(f) $F(x) = -1$, for $x \ge 5$.
We express $F$ as a sum of functions $F_1 + F_2 + F_3$, where
$$F_1(x) = 2\,\frac{\Phi(-x - 1)}{\Phi(-x - 1) + \Phi(x + 2)},$$
$$F_2(x) = \begin{cases} \Phi(x)\Phi(1 - x) \sin^2\!\left(\frac{\pi}{x}\right) \sin^2\!\left(\frac{\pi}{1 - x}\right), & x \in (0, 1), \\ 0, & x \notin (0, 1), \end{cases}$$
$$F_3(x) = -\frac{\Phi(x - 1)}{\Phi(5 - x) + \Phi(x - 1)}.$$
The denominator of $F_1$ is never zero and so $F_1$ defines a smooth function on $\mathbb{R}$
which satisfies (a,b). Further, $F_1(x) = 0$, for all $x \ge -1$. The function $F_2$ is zero
outside $[0, 1]$ and is non-negative on $[0, 1]$, with zeros in $(0, 1)$ at $1/2, 1/3, 2/3, 1/4, 3/4, \ldots$.
The factors $\Phi(x)$, $\Phi(1 - x)$ ensure that $F_2$ is smooth at $x = 0, 1$. Finally, $F_3$
vanishes for $x \le 1$ and satisfies (e,f). Since the denominator of $F_3$ is never zero,
$F_3$ defines a smooth function on $\mathbb{R}$. The function $F = F_1 + F_2 + F_3$ is a sum of
smooth functions and therefore defines a smooth function on $\mathbb{R}$ which satisfies
(a–f).
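As a sanity check on the construction of $F = F_1 + F_2 + F_3$, the three pieces can be evaluated numerically at sample points from each of the regions (a) to (f). This is a floating-point sketch only; the Python names mirror the symbols in the example and the sample points are arbitrary choices.

```python
import math

def phi(x):
    """Phi(x) = exp(-1/x) for x > 0 and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def F1(x):
    num = phi(-x - 1)
    return 2 * num / (num + phi(x + 2))

def F2(x):
    if not 0 < x < 1:
        return 0.0
    return (phi(x) * phi(1 - x)
            * math.sin(math.pi / x) ** 2
            * math.sin(math.pi / (1 - x)) ** 2)

def F3(x):
    num = phi(x - 1)
    return -num / (phi(5 - x) + num)

def F(x):
    return F1(x) + F2(x) + F3(x)

for x in (-3.0, -1.5, -0.5, 0.3, 0.5, 3.0, 6.0):
    print(f"F({x}) = {F(x):.6g}")
```

At $x = 1/2$ the computed value is not exactly zero, because $\sin(2\pi)$ is only zero up to rounding error in floating point.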
EXERCISES 5.2.9
(1) Define $f(x) = x^3 \sin(\frac{\pi}{x})$, $x \ne 0$, $f(0) = 0$. Show that
(a) $f$ is continuous on $\mathbb{R}$ (you may assume that $f$ is $C^\infty$ on $x \ne 0$).
(b) $f$ is differentiable at $x = 0$ and $f'(0) = 0$ (you will need to work from the
definition of the derivative as a limit).
(c) $f'$ is continuous on $\mathbb{R}$. (You will need to find $\lim_{x \to 0} f'(x)$.)
(d) Is $f'$ differentiable at $x = 0$?
More generally, show that if $f(x) = x^{2n} \sin(\frac{\pi}{x})$, $x \ne 0$, and $f(0) = 0$, then $f$ is
$C^{n-1}$ and $n$-times differentiable but not $C^n$ ($f^{(n)}$ is not continuous at $x = 0$).
What about if $f(x) = x^{2n+1} \sin(\frac{\pi}{x})$, $x \ne 0$, and $f(0) = 0$?
(2) Define
$$f(x) = \begin{cases} x^2 \sin\!\left(\frac{\pi}{\sqrt{x}}\right), & x > 0, \\ 0, & x \le 0. \end{cases}$$
You may assume $f$ is smooth on $x \ne 0$. Show that
(a) $f$ is continuous on $\mathbb{R}$.
(b) $f$ is differentiable at $x = 0$ and $f'(0) = 0$.
(c) $f'$ is continuous on $\mathbb{R}$.
(d) $f$ is not twice differentiable at $x = 0$.
What is the zero set ($f^{-1}(0)$) of $f$?
(3) Find (explicit) smooth ($C^\infty$) functions $f, g : \mathbb{R} \to \mathbb{R}$ such that
(a) $f(0) = 0$ and $f(\frac{1}{n}) = 0$, $n \ge 1$. Elsewhere $f > 0$.
(b) $g(x) \in (0, 1)$, for all $x \in (0, 1) \cup (2, 3) \cup (3, 4) \cup (5, 6)$, $g = 1$ on
$[1, 2] \cup [4, 5]$, elsewhere $g = 0$.
(4) Let $a, b \in \mathbb{R}$, $a < b$. Find a smooth function $f : \mathbb{R} \to \mathbb{R}$ such that
$$f(x) = 0, \; x \le a; \qquad f(x) \in (0, 1), \; x \in (a, b); \qquad f(x) = 1, \; x \ge b.$$
(5) Let $-\infty < a < b < c < d < +\infty$. Using the function $\Phi$ find a function $h \in C^\infty(\mathbb{R})$ such that
(a) $h(x) = 0$ if $x \le a$ or $x \ge d$.
(b) $h(x) = 1$ if $x \in [b, c]$.
(c) For all other $x \in \mathbb{R}$, $h(x) \in (0, 1)$.
Extend the definition of $h$ as far as you can so as to remove the strict
inequalities in $-\infty < a < b < c < d < +\infty$ (for example, $-\infty \le a < b \le c < d < +\infty$).
(6) Using the $C^\infty$-function $\Phi$, find a $C^\infty$-function $G : \mathbb{R} \to \mathbb{R}$ which satisfies all
of the following conditions:
(a) $G(x) = 2$ if $x \le -1$.
(b) $G(x) \in (0, 2)$ if $x \in (-1, 0)$.
(c) $G(x) \ge 0$ on $[0, 1]$ and equals zero iff $x = \frac{1}{n^2}$ or $1 - \frac{1}{n}$ for some $n \in \mathbb{N}$.
(d) $G(x) \in (-3, 0)$ if $x \in (1, 5)$.
(e) $G(x) = -3$ if $x \ge 5$.
Indicate briefly why your function $G$ is smooth at $x = 0, 1$.
(7) Using the function $\Phi$
(a) Find a $C^\infty$-function $e$ such that $e > 0$ on $(-\infty, 0) \cup (1, \infty)$ and $e \equiv 0$ on
$[0, 1]$.
(b) Find a $C^\infty$-function $f$ such that $f(0) = 0$, elsewhere $f < 0$ and $f^{(j)}(0) = 0$,
$j \ge 0$.
(c) Find a $C^\infty$-function $g$ such that the zero set of $g$ is $\{\pm n^3 \mid n \in \mathbb{Z}\}$, elsewhere
$g < 0$.
(d) Find a $C^\infty$-function $h$ such that $h(x) = 0$, $x \le 0$, and
(a) $h(x) = n + 1$, if $x \in [2n + 1, 2n + 2]$, $n \ge 0$,
(b) $h(x) \in (n, n + 1)$, if $x \in (2n, 2n + 1)$, $n \ge 0$.
(You are advised to draw the graph first. One step at a time.)
(8) Using the function $\Phi$
(a) Find a $C^\infty$-function $e$ such that (a) $e > 0$ on $(-\infty, 0)$, (b) $e \equiv 0$ on $[0, \infty)$.
(b) Find a $C^\infty$-function $f$ such that (a) $f > 0$ on $(-\infty, 1)$, (b) $f \equiv 0$ on $[1, \infty)$.
(c) Find a $C^\infty$-function $g$ such that (a) $g > 0$ on $(0, 1)$, (b) $g(x) = 0$ if
$x \notin (0, 1)$, (c) $g$ has a unique maximum value at $x = \frac{1}{2}$. (In particular, $g$ is
not a tabletop function; it is simpler.) What are $g^{(n)}(1)$, $g^{(n)}(0)$, $n \ge 0$?
(d) Find a $C^\infty$-function $F(x)$ such that $F(\frac{1}{n}) = F(1 - \frac{1}{n}) = 0$, $n \ge 1$;
elsewhere $F$ is strictly positive. What are $F^{(n)}(0)$, $F^{(n)}(1)$, $n \ge 0$? (For
this problem it suffices to give a brief indication of why your function $F$
is infinitely differentiable at the points $0, 1$.)
(9) Using the $C^\infty$-function $\Phi$, find a $C^\infty$-function $G : \mathbb{R} \to \mathbb{R}$ which satisfies all
of the following conditions:
(a) $G(x) = 2$ if either $x \le -1$ or $x \ge 2$.
(b) $G(x) \in (0, 2)$ if either $x \in (-1, 0)$ or $x \in (1, 2)$.
(c) $G(x) \ge 0$ on $[0, 1]$ and equals zero iff $x = \frac{1}{n}$ or $1 - \frac{1}{n}$ for some $n \in \mathbb{N}$.
Indicate briefly why your function $G$ is smooth at $x = 0, 1$.
(10) (E. Borel's theorem.) For $b > 0$, define $\Xi_b(x) = \Theta_{1,1/2}(x/b)$, where $\Theta_{1,1/2}$ is
the tabletop function defined in Examples 5.2.6(3) with $a = 1$, $b = 1/2$. Given
a sequence $(a_n)_{n \ge 0}$, show that it is possible to choose a sequence $(b_n)_{n \ge 0} \subset \mathbb{R}(> 0)$ such that the series
$$f(x) = \sum_{n=0}^{\infty} \Xi_{b_n}(x) \frac{a_n x^n}{n!}$$
defines a smooth function on $\mathbb{R}$ with Taylor series $Tf_0 = \sum_{n=0}^{\infty} \frac{a_n x^n}{n!}$. (Hints.
Choose $(b_n)$ monotone decreasing to zero so that $b_{n+1} \le b_n/2$, $n \ge 0$. Observe
that $\sum_{n=0}^{N} \Xi_{b_n}(x) \frac{a_n x^n}{n!} = \sum_{n=0}^{N} \frac{a_n x^n}{n!}$ on $[-b_{N+1}, b_{N+1}]$, $N \ge 0$. Show that if
$(b_n)$ decreases fast enough, then $f$ is smooth. Note that $\|\Xi_{b_n}^{(j)}\|$ is bounded by
$C_j b_n^{-j}$, where $C_j$ depends only on $\|\Theta_{1,1/2}^{(j)}\|$.)
5.3 The Weierstrass Approximation Theorem
In this section we consider uniform approximation of continuous functions by
polynomials. For general results we need restrictions on the domain of $f$. For
example, as $x \to \infty$, $e^x$ increases much faster than any polynomial and so it
is unreasonable to expect to be able to approximate $e^x$ on $\mathbb{R}$ by a polynomial.
Similarly, we cannot expect to approximate $f(x) = \frac{1}{x}$ on $(0, 1)$ by a polynomial
(every polynomial is bounded on $(0, 1)$). Instead, we consider approximation of
continuous functions on closed and bounded intervals. We prove the Weierstrass
approximation theorem: every continuous function on a closed and bounded interval
can be uniformly approximated by a polynomial. Our proof is relatively elementary
and uses Bernstein polynomials.
All the work lies in proving a special case of the theorem that applies to
continuous functions on the closed unit interval.
Theorem 5.3.1 Every continuous function on $I = [0, 1]$ can be uniformly approximated by polynomials. That is, if $f \in C^0(I)$ and $\varepsilon > 0$, then there exists a polynomial
$p$ such that
$$\rho(f, p) = \sup_{x \in I} |f(x) - p(x)| < \varepsilon.$$
Our proof of Theorem 5.3.1 will be constructive: given a continuous function
$f : [0, 1] \to \mathbb{R}$, we construct an explicit sequence of polynomials that converge
uniformly to $f$.
5.3.1 Bernstein Polynomials
Set $I = [0, 1]$. Let $f \in C^0(I)$ and $n \ge 1$. The $n$th Bernstein polynomial $B_n(f)$ of $f$ is
the polynomial of degree at most $n$ defined by
$$B_n(f)(x) = \sum_{p=0}^{n} \binom{n}{p} f\!\left(\frac{p}{n}\right) x^p (1 - x)^{n-p}.$$
Lemma 5.3.2 We have for $n \ge 1$
(1) $B_n(cf) = cB_n(f)$, $f \in C^0(I)$, $c \in \mathbb{R}$.
(2) $B_n(f + g) = B_n(f) + B_n(g)$, $f, g \in C^0(I)$.
(3) $B_n(f) > 0$ on $I$ if $f > 0$ on $I$.
(4) $B_n(1) = 1$.
(5) $B_n(t)(x) = x$ (here $f(t) = t$).
(6) $B_n(t^2)(x) = x^2 + \frac{x - x^2}{n}$ (here $f(t) = t^2$).
Remarks 5.3.3
(1) Statements (1,2) imply $B_n : C^0(I) \to C^0(I)$ is linear.
(2) Statement (3) implies that if $f > g$ then $B_n(f) > B_n(g)$ (on $I$).
(3) When we replace $f$ by an actual function the variable for $f$ will always be $t$, as
in $f(t)$. The Bernstein polynomial will always be a function of $x \in I$. Thus, in
statements (5,6) the variable $t$ is a 'dummy' variable which just indicates the
functional form of $f$.
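Proof of Lemma 5.3.2 (1,2,3) are obvious (for (3) observe that $x^p (1 - x)^{n-p} > 0$ on
$(0, 1)$).
(4) $B_n(1)(x) = \sum_{p=0}^{n} \binom{n}{p} 1 \cdot x^p (1 - x)^{n-p} = (x + (1 - x))^n = 1$.
(5) We assume $n \ge 2$; the result is easy if $n = 1$. We have
$$\begin{aligned}
B_n(t)(x) &= \sum_{p=0}^{n} \binom{n}{p} \frac{p}{n} x^p (1 - x)^{n-p} \\
&= \sum_{p=1}^{n} \frac{n!}{p!(n - p)!} \frac{p}{n} x^p (1 - x)^{n-p} \\
&= \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!(n - p)!} x^p (1 - x)^{n-p} \\
&= \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!((n - 1) - (p - 1))!} \, x \, x^{p-1} (1 - x)^{(n-1)-(p-1)} \\
&= x \sum_{q=0}^{n-1} \binom{n-1}{q} x^q (1 - x)^{(n-1)-q}, \quad (q = p - 1) \\
&= x B_{n-1}(1)(x) = x.
\end{aligned}$$
(6) Again we assume $n \ge 3$; see below for the case $n = 2$. We have
$$\begin{aligned}
B_n(t^2)(x) &= \sum_{p=0}^{n} \binom{n}{p} \left(\frac{p}{n}\right)^2 x^p (1 - x)^{n-p} \\
&= \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!(n - p)!} \frac{p}{n} x^p (1 - x)^{n-p}, \quad \text{as in (5)} \\
&= \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!(n - p)!} \left(\frac{p - 1}{n} + \frac{1}{n}\right) x^p (1 - x)^{n-p} \\
&= A + B,
\end{aligned}$$
where
$$A = \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!(n - p)!} \frac{p - 1}{n} x^p (1 - x)^{n-p}, \qquad B = \sum_{p=1}^{n} \frac{(n - 1)!}{(p - 1)!(n - p)!} \frac{1}{n} x^p (1 - x)^{n-p}.$$
Checking the proof of (5), we see that $B = \frac{1}{n} B_{n-1}(t)(x) = \frac{x}{n}$. It remains to
evaluate $A$. Cancelling the factor $(p - 1)$ and taking out factors $(n - 1)/n$ and
$x^2$ we have (just as in the proof of (5))
$$\begin{aligned}
A &= x^2 \frac{(n - 1)}{n} \sum_{p=2}^{n} \frac{(n - 2)!}{(p - 2)!((n - 2) - (p - 2))!} x^{p-2} (1 - x)^{(n-2)-(p-2)} \\
&= x^2 \frac{(n - 1)}{n} B_{n-2}(1)(x) \\
&= x^2 \frac{(n - 1)}{n}.
\end{aligned}$$
(Note that if $n = 2$, $\sum_{p=2}^{2} \frac{(2-2)!}{(p-2)!((2-2)-(p-2))!} x^{p-2} (1 - x)^{(2-2)-(p-2)} = 1$.)
Finally,
$$A + B = x^2 \frac{(n - 1)}{n} + \frac{x}{n} = x^2 + \frac{x - x^2}{n},$$
and so $B_n(t^2)(x) = x^2 + \frac{x - x^2}{n}$. □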
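Properties (4), (5) and (6) of Lemma 5.3.2 can be checked in exact rational arithmetic. The sketch below evaluates $B_n(f)(x)$ straight from the definition using Python's `fractions.Fraction`; the function name `bernstein` and the test values $n = 5$, $x = 1/3$ are hypothetical choices for illustration, and a check at one point is of course not a proof.

```python
from fractions import Fraction
from math import comb

def bernstein(f, n, x):
    """Evaluate B_n(f)(x) = sum_{p=0}^n C(n,p) f(p/n) x^p (1-x)^(n-p)."""
    return sum(comb(n, p) * f(Fraction(p, n)) * x**p * (1 - x) ** (n - p)
               for p in range(n + 1))

n, x = 5, Fraction(1, 3)
b_one = bernstein(lambda t: 1, n, x)          # property (4)
b_t = bernstein(lambda t: t, n, x)            # property (5)
b_t2 = bernstein(lambda t: t * t, n, x)       # property (6)
print(b_one, b_t, b_t2, x**2 + (x - x**2) / n)
```

Because all arithmetic is exact, the identities can be compared with `==` rather than with a floating-point tolerance.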
Proof of Theorem 5.3.1 Let $f \in C^0(I)$ and $\varepsilon > 0$. Since $I$ is closed and bounded,
$f : I \to \mathbb{R}$ is uniformly continuous (Theorem 2.4.15) and so $\exists \delta > 0$ such that for all
$t, x \in I$ satisfying $|x - t| < \delta$ we have
$$-\varepsilon/2 < f(t) - f(x) < \varepsilon/2 \quad (\text{that is, } |f(t) - f(x)| < \varepsilon/2). \tag{5.1}$$
Since $f$ is continuous on $I$, $M = \sup_{s \in I} |f(s)| < \infty$. The next inequality follows
from the triangle inequality
$$-2M \le f(t) - f(x) \le 2M, \quad \text{for all } t, x \in I. \tag{5.2}$$
Observe that the function $\frac{2M}{\delta^2}(t - x)^2$ is greater than or equal to $2M$ provided that
$|t - x| \ge \delta$. It follows from (5.1), (5.2) that for all $t, x \in I$ we have
$$-\varepsilon/2 - \frac{2M}{\delta^2}(t - x)^2 < f(t) - f(x) < \frac{2M}{\delta^2}(t - x)^2 + \varepsilon/2. \tag{5.3}$$
Regard each term in this inequality as a function of $t$ (so $x$ is fixed). Noting property
(3) of Bernstein polynomials we have for all $n \ge 1$ the inequality between functions
(of $x$)
$$B_n\!\left(-\varepsilon/2 - \frac{2M}{\delta^2}(t - x)^2\right) < B_n(f) - B_n(f(x)) < B_n\!\left(\frac{2M}{\delta^2}(t - x)^2 + \varepsilon/2\right).$$
(What are we doing? We fix $x$, set $t = \frac{p}{n}$ in (5.3), multiply by $\binom{n}{p} x^p (1 - x)^{n-p}$ and
sum from $p = 0$ to $p = n$. In particular, $B_n(f(x)) = f(x)B_n(1) = f(x)$, using
properties (1,4).)
Using properties (1,2,4), we have
$$B_n\!\left(\frac{2M}{\delta^2}(t - x)^2 + \varepsilon/2\right) = \frac{2M}{\delta^2} B_n((t - x)^2) + \varepsilon/2,$$
$$B_n\!\left(-\frac{2M}{\delta^2}(t - x)^2 - \varepsilon/2\right) = -\frac{2M}{\delta^2} B_n((t - x)^2) - \varepsilon/2.$$
Hence for all $x \in I$ we have
$$-\varepsilon/2 - \frac{2M}{\delta^2} B_n((t - x)^2)(x) < B_n(f)(x) - f(x) < \frac{2M}{\delta^2} B_n((t - x)^2)(x) + \varepsilon/2. \tag{5.4}$$
We claim that $\exists N$ such that for $n \ge N$, $\left|\frac{2M}{\delta^2} B_n((t - x)^2)(x)\right| < \varepsilon/2$ for all $x \in I$. It
then follows from (5.4) that for $n \ge N$, $x \in I$, $|B_n(f)(x) - f(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon$
and we are done.
In order to prove the claim we evaluate $B_n((t - x)^2)$. Since $(t - x)^2 = t^2 - 2tx + x^2$,
we have
$$B_n((t - x)^2) = B_n(t^2) - 2xB_n(t) + x^2 B_n(1).$$
Evaluating at $x$, this gives us (using (4,5,6))
$$\begin{aligned}
B_n((t - x)^2)(x) &= B_n(t^2)(x) - 2xB_n(t)(x) + x^2 B_n(1)(x) \\
&= \left(x^2 + \frac{x - x^2}{n}\right) - 2x \cdot x + x^2 \cdot 1 \\
&= \frac{x - x^2}{n}.
\end{aligned}$$
The maximum value of $x - x^2$ on $[0, 1]$ is $1/4$ and so $0 \le B_n((t - x)^2)(x) \le 1/(4n)$.
Hence for $x \in I$,
$$0 < \frac{2M}{\delta^2} B_n((t - x)^2)(x) + \varepsilon/2 \le \frac{M}{2n\delta^2} + \varepsilon/2.$$
Now choose $N$ so that $\frac{M}{2n\delta^2} < \varepsilon/2$, $n \ge N$. □
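The convergence $B_n(f) \to f$ can be observed numerically. The sketch below uses the test function $f(x) = |x - \frac{1}{2}|$ and estimates the uniform error on a finite grid, so it only gives a lower bound for the true supremum; the names `bernstein`, `f` and `grid_error` and the grid size are hypothetical choices.

```python
from fractions import Fraction
from math import comb

def bernstein(f, n, x):
    """Evaluate B_n(f)(x) from the definition; f is sampled at p/n exactly."""
    return sum(comb(n, p) * f(Fraction(p, n)) * x**p * (1 - x) ** (n - p)
               for p in range(n + 1))

def f(t):
    return abs(t - Fraction(1, 2))

def grid_error(n, points=200):
    """max |B_n(f)(x) - f(x)| over the grid x = k/points (floating point)."""
    xs = [k / points for k in range(points + 1)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in xs)

exact_mid = bernstein(f, 8, Fraction(1, 2))   # exact value of B_8(f)(1/2)
errors = {n: grid_error(n) for n in (8, 32, 128)}
print(exact_mid, errors)
```

For this $f$ the error at the kink $x = \frac{1}{2}$ dominates and shrinks slowly as $n$ grows, which is consistent with the estimate in the proof: the bound $\varepsilon/2 + M/(2n\delta^2)$ only forces an error below $\varepsilon$ once $n$ is of order $1/(\varepsilon \delta^2)$.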
Remarks 5.3.4
(1) Bernstein polynomials are named after the Russian mathematician Sergei
Natanovich Bernstein. They were first used by him to give a constructive proof
of the Weierstrass approximation theorem.
(2) The proof of the Weierstrass approximation theorem using Bernstein polynomials may seem slightly magical (especially from Eq. (5.3)). However, there is a
simple probabilistic interpretation of the argument that we now briefly describe
(we refer to Lamperti [22, pages 38–40] for more details and background).
Consider coin tossing where the probability of falling heads is $x \in [0, 1]$ (and
therefore the probability of falling tails is $1 - x$). If we toss the coin $n$ times, the
probability of there being exactly $p$ tosses that result in heads is $\binom{n}{p} x^p (1 - x)^{n-p}$
(this is the binomial distribution). Now suppose that $X$ is the random variable
defined as the number of times the coin falls heads in $n$ coin tosses. Then $X$
has the binomial distribution defined above. By the weak law of large numbers
$\lim_{n \to \infty} P(|X/n - x| > \delta) = 0$, for all $\delta > 0$. Moreover, this estimate is uniform
in $x \in [0, 1]$. Suppose we are given a continuous function $f : [0, 1] \to \mathbb{R}$. We
evaluate $f$ at the points $p/n$, $p = 0, \ldots, n$. The expectation $E_n(f)$ of $f(X/n)$
is then $B_n(f)(x)$. Because $f$ is uniformly continuous on $[0, 1]$, it follows that
$\lim_{n \to \infty} P(|f(X/n) - f(x)| > \delta^\star) = 0$, uniformly in $x$, for all $\delta^\star > 0$. From this
one can show, as in the proof of Theorem 5.3.1, that $\lim_{n \to \infty} E_n(f) = f(x)$
uniformly in $x \in [0, 1]$.
Theorem 5.3.5 (The Weierstrass Approximation Theorem) Let $[a, b]$ be a
closed and bounded interval. Every continuous function on $[a, b]$ can be uniformly
approximated by polynomials.
Proof Let $L : [0, 1] \to [a, b]$ be the linear bijection defined by $L(x) = (b - a)x + a$,
$x \in [0, 1]$. We denote the inverse of $L$ by $K$ and note that $K(y) = (y - a)/(b - a)$,
$y \in [a, b]$.
Let $f : [a, b] \to \mathbb{R}$ be continuous and set $F = f \circ L : [0, 1] \to \mathbb{R}$. Since $F$ is
continuous, Theorem 5.3.1 implies that there exists a sequence $(p_n)$ of polynomials
such that $\lim_{n \to \infty} \sup_{x \in [0,1]} |F(x) - p_n(x)| = 0$. Set $P_n(y) = p_n(K(y))$, $y \in [a, b]$,
and note that since $K$ is linear, $P_n$ is a polynomial. Now $|F(x) - p_n(x)| = |f(y) - P_n(y)|$, where $L(x) = y$. Since $K$ is 1:1 onto, we have
$$\sup_{x \in [0,1]} |F(x) - p_n(x)| = \sup_{y \in [a,b]} |f(y) - P_n(y)|,$$
and so $\lim_{n \to \infty} \sup_{y \in [a,b]} |f(y) - P_n(y)| = 0$. □
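The change of variable in the proof of Theorem 5.3.5 is easy to carry out in practice. The following sketch, under the assumption that we take $p_n = B_n(F)$ as in the proof of Theorem 5.3.1, builds $P_n = p_n \circ K$ for a continuous $f$ on $[a, b]$; the function names, the test function $\cos$ on $[-1, 3]$ and the grid are hypothetical choices.

```python
from math import comb, cos

def bernstein(F, n, x):
    """Evaluate B_n(F)(x) on [0, 1] from the definition (floating point)."""
    return sum(comb(n, p) * F(p / n) * x**p * (1 - x) ** (n - p)
               for p in range(n + 1))

def weierstrass_on_interval(f, a, b, n):
    """Return P_n = B_n(f o L) o K, a polynomial approximation of f on [a, b]."""
    L = lambda x: (b - a) * x + a          # L maps [0, 1] onto [a, b]
    K = lambda y: (y - a) / (b - a)        # K is the inverse of L
    F = lambda x: f(L(x))
    return lambda y: bernstein(F, n, K(y))

a, b = -1.0, 3.0
P = weierstrass_on_interval(cos, a, b, 200)
ys = [a + (b - a) * k / 100 for k in range(101)]
max_err = max(abs(P(y) - cos(y)) for y in ys)
print(max_err)
```

The error on $[a, b]$ equals the error of $B_n(F)$ on $[0, 1]$ at the corresponding points, exactly as in the displayed identity of suprema.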
5.3.2 An Application of the Weierstrass Approximation
Theorem
Proposition 5.3.6 Let $f : [a, b] \to \mathbb{R}$ be continuous and suppose that
$\int_a^b f(x) x^n \, dx = 0$, for all $n \ge 0$. Then $f \equiv 0$.
Proof Since $f$ is continuous it suffices to prove $\int_a^b f(x)^2 \, dx = 0$.
We start by observing that if $p(x) = a_n x^n + \cdots + a_1 x + a_0$, then
$$\int_a^b f(x)p(x) \, dx = \sum_{j=0}^{n} a_j \int_a^b f(x) x^j \, dx = 0,$$
by our assumption.
Let $\varepsilon > 0$ and let $M = 1 + \sup_{x \in [a,b]} |f(x)| \ge 1$. By the Weierstrass approximation theorem,
we can find a polynomial $p$ such that $\sup_{x \in [a,b]} |f(x) - p(x)| < \varepsilon/(M(b - a))$. We
have
$$\begin{aligned}
\int_a^b f(x)^2 \, dx &= \int_a^b f(x)(f(x) - p(x)) \, dx + \int_a^b f(x)p(x) \, dx \\
&= \int_a^b f(x)(f(x) - p(x)) \, dx.
\end{aligned}$$
Now
$$\left|\int_a^b f(x)(f(x) - p(x)) \, dx\right| \le \int_a^b |f(x)||f(x) - p(x)| \, dx < (b - a)M \frac{\varepsilon}{M(b - a)} = \varepsilon.$$
Our argument shows that for all $\varepsilon > 0$, $\int_a^b f(x)^2 \, dx < \varepsilon$. Hence $\int_a^b f(x)^2 \, dx = 0$. □
5.3.3 Uniform Approximation of a Family
For our applications to Fourier series we will need a slightly stronger version
of the Weierstrass approximation theorem that applies to continuous families
of continuous functions. For this we need one or two elementary results about
continuous functions defined on rectangles in $\mathbb{R}^2$. We give elementary proofs of
these results in this section but remark that everything we say is an easy consequence
of the general theory we develop later in Chap. 7.
Let $[a, b]$, $[c, d]$ be closed bounded intervals. Suppose that $f : [a, b] \times [c, d] \to \mathbb{R}$.
Given $\lambda \in [c, d]$, define $f_\lambda : [a, b] \to \mathbb{R}$ by
$$f_\lambda(x) = f(x, \lambda), \quad x \in [a, b].$$
We may regard $\{f_\lambda \mid \lambda \in [c, d]\}$ as defining a family of functions $f_\lambda : [a, b] \to \mathbb{R}$
parameterized by $\lambda \in [c, d]$.
The map $f : [a, b] \times [c, d] \to \mathbb{R}$ is continuous if for every $\varepsilon > 0$, there exists a
$\delta > 0$ such that
$$|f(x_1, \lambda_1) - f(x_2, \lambda_2)| < \varepsilon, \quad \text{if } |x_1 - x_2|, |\lambda_1 - \lambda_2| < \delta.$$
If this condition holds, then $f_\lambda : [a, b] \to \mathbb{R}$ is continuous for all $\lambda \in [c, d]$. We refer
to $\{f_\lambda \mid \lambda \in [c, d]\}$ as a continuous family of continuous functions on $[a, b]$.
We start by showing that a continuous family satisfies a (weak) version of
uniform continuity.
Lemma 5.3.7 Let $\{f_\lambda \mid \lambda \in [c, d]\}$ be a continuous family of continuous functions
on $[a, b]$. Given $\varepsilon > 0$, there exists a $\delta > 0$ such that for all $\lambda \in [c, d]$ we have
$$|f_\lambda(x) - f_\lambda(y)| < \varepsilon, \quad |x - y| < \delta.$$
Proof Suppose the contrary. Then there exists an $\varepsilon > 0$ such that for every $n \in \mathbb{N}$
there exist $x_n, y_n \in [a, b]$ and $\lambda_n \in [c, d]$ such that
$$|x_n - y_n| < 1/n \quad \text{and} \quad |f_{\lambda_n}(x_n) - f_{\lambda_n}(y_n)| \ge \varepsilon.$$
As in the proof of Theorem 2.4.15, the Bolzano–Weierstrass theorem implies that
the bounded sequences $(x_n), (y_n) \subset [a, b]$ have a convergent subsequence. A second
application of the Bolzano–Weierstrass theorem to the corresponding subsequence
of $(\lambda_n)$ yields convergent subsequences of $(x_n)$, $(y_n)$ and $(\lambda_n)$, say $(x_{n_k})$, $(y_{n_k})$ and
$(\lambda_{n_k})$, with $\lim_{k \to \infty} x_{n_k} = \lim_{k \to \infty} y_{n_k}$. We derive the required contradiction by
letting $k \to \infty$ in $|f_{\lambda_{n_k}}(x_{n_k}) - f_{\lambda_{n_k}}(y_{n_k})|$. □
Lemma 5.3.8 Let $\{f_\lambda \mid \lambda \in [c, d]\}$ be a continuous family of continuous functions
on $[a, b]$. There exists an $M \ge 0$ such that for all $\lambda \in [c, d]$ we have
$$\sup_{x \in [a,b]} |f_\lambda(x)| \le M.$$
Proof The function $g : [c, d] \to \mathbb{R}$ defined by $g(\lambda) = f(a, \lambda)$ is continuous and
so there exists an $N \ge 0$ such that $|f(a, \lambda)| \le N$ for all $\lambda \in [c, d]$. Take $\varepsilon = 1$ in
Lemma 5.3.7 to obtain $\delta > 0$ such that $|f_\lambda(x) - f_\lambda(y)| < 1$, whenever $|x - y| < \delta$.
Observe that $|f_\lambda(x)| \le |f_\lambda(a)| + (b - a)\delta^{-1} + 1$, $x \in [a, b]$. Hence $|f_\lambda(x)| \le N + (b - a)\delta^{-1} + 1$ for all $x \in [a, b]$, $\lambda \in [c, d]$. Take $M = N + (b - a)\delta^{-1} + 1$. □
Remarks 5.3.9
(1) Lemma 5.3.8 shows that a continuous function on a bounded closed rectangle
is bounded. This is a natural generalization of our earlier result on continuous
functions defined on a closed and bounded interval. The proof of Lemma 5.3.7
can easily be extended to prove uniform continuity on a bounded closed
rectangle. As we shall see in Chap. 7 we can prove far more general results
that apply to functions defined on arbitrary bounded and 'closed' subsets of $\mathbb{R}^n$.
(2) For our continuous families of continuous functions we require joint continuity
of $f$ in $(x, \lambda)$. Everything we have said breaks down badly if we only assume
separate continuity. That is, for fixed $x$, $f(x, \lambda)$ is continuous on $[c, d]$, and for
fixed $\lambda$, $f(x, \lambda)$ is continuous on $[a, b]$.
Theorem 5.3.10 Let $\{f_\lambda \mid \lambda \in [c, d]\}$ be a continuous family of continuous
functions on $[a, b]$. There exists a sequence $(p_n)$ of continuous polynomial families
$\{p_n^\lambda : [a, b] \to \mathbb{R} \mid \lambda \in [c, d]\}$ converging uniformly to the family $\{f_\lambda \mid \lambda \in [c, d]\}$.
That is, for each $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that for each $\lambda \in [c, d]$,
$$\sup_{x \in [a,b]} |f_\lambda(x) - p_n^\lambda(x)| < \varepsilon, \quad n \ge N.$$
Proof Without loss of generality, assume $[a, b] = [0, 1]$. For $\lambda \in [c, d]$, define
$$p_n^\lambda = B_n(f_\lambda), \quad n \in \mathbb{N}.$$
We now just repeat the proof of Theorem 5.3.1: using Lemmas 5.3.7, 5.3.8, we
choose the constants $\delta$, $M$ that occur in the proof of Theorem 5.3.1 to be independent
of $\lambda \in [c, d]$. □
EXERCISES 5.3.11
(1) Let $f(x) = |x - \frac{1}{2}|$, $x \in [0, 1]$. Compute $B_n(f)$, $n = 1, 2, 3$.
(a) Sketch the graph of $f$, together with the graphs of the approximations $B_n(f)$,
$n = 1, 2, 3$.
(b) Where is the approximation poor?
(c) Compute $B_8(f)(1/2)$ and hence show $\rho(f, B_8(f)) > 0.13$.
(d) Suppose we take $\varepsilon = 1/10$. Find a value of $N$ for which $\rho(f, B_n(f)) < 1/10$, for all $n \ge N$. (Note: Do not strive for the best estimate of $N$. Just get
a value, even if it is quite large. You may want to look back over the proof
of the Weierstrass approximation theorem.)
(2) Let $C^r(I)$ denote the space of $r$-times continuously differentiable functions on
$I = [0, 1]$, $0 \le r < \infty$. Show that given $f \in C^r(I)$ and $\varepsilon > 0$, there exists a polynomial $p$ such
that
$$\rho(f^{(s)}, p^{(s)}) < \varepsilon, \quad 0 \le s \le r.$$
(Uniform approximation of a function and its first $r$ derivatives.) Hint: Start by
approximating $f^{(r)}$ and then work back to $f$.
x2
(3) For > 0, define D p21 exp. 2
/. Show that if f 2 C0 .R/ is bounded and
R1
we define f .x/ D 1 f .t/ .x t/ dt, then
(a) f is C1 , > 0. (You will need results on differentiation under the integral
sign—see Lemma 6.1.6.)
(b) f converges uniformly to f on all closed bounded subintervals of R (that
is, given Œa; b and " > 0, there exists an 0 > 0 such that supx2Œa;b j f .x/ f .x/j < ", for all 2 .0; 0 /).
R1
(For part (b) you will need (A) 1 .t/ dt D 1 for all > 0, and (B) if
Rı
ı; " > 0, there exists an 0 > 0 such that ı .t/ dt > 1 " for all 2 .0; 0 .)
Show how this result can be used to prove that we can uniformly approximate
continuous functions on Œa; b by smooth functions.
(4) Show how to extend the proof of Lemma 5.3.7 to obtain uniform continuity
of a continuous family of continuous functions. Show that uniform continuity
implies boundedness (we assume the domain is a bounded rectangle).
(5) Show that Lemma 5.3.7 and Lemma 5.3.8 both fail if f is only separately
continuous (see Remarks 5.3.9).
5.4 Analytic Functions
Definition 5.4.1 A $C^\infty$-function $f : (a, b) \to \mathbb{R}$ is (real) analytic if for every $x_0 \in (a, b)$, there exists an $r > 0$ such that
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n, \quad x \in (x_0 - r, x_0 + r) \cap (a, b).$$
That is, $f$ is analytic if for every point $x_0$ in the domain of $f$, $f$ is equal to the Taylor
series of $f$ at $x_0$ on some open interval containing $x_0$.
Examples 5.4.2
(1) If $f : (a, b) \to \mathbb{R}$, $g : (c, d) \to \mathbb{R}$ are analytic then $f \pm g$ is an analytic function
on $(a, b) \cap (c, d)$.
(2) Every polynomial is an analytic function on $\mathbb{R}$. This requires us to show
that if $p(x) = a_0 x^n + \cdots + a_n$, then for every $x_0 \in \mathbb{R}$, we may find
constants $A_0, \ldots, A_n \in \mathbb{R}$ such that $p(x) = \sum_{j=0}^{n} A_j (x - x_0)^{n-j}$, for all
$x \in \mathbb{R}$. We leave this as an easy exercise for the reader (see also the proof of
Proposition 5.4.3).
The result given by the next proposition is certainly what one would expect, but
the proof requires some work.
Proposition 5.4.3 Suppose that the power series $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Then $f(x) = \sum_{n=0}^{\infty} a_n x^n$ defines an analytic function on $(-R, R)$.
More generally, if $c \in \mathbb{R}$, then $\sum_{n=0}^{\infty} a_n (x - c)^n$ defines an analytic function on
$(c - R, c + R)$.
Proof We are required to show that if $x_0 \in (-R, R)$, then there exists an $r > 0$
such that the Taylor series $Tf_{x_0}(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$ converges to $f(x)$, for all
$x \in (x_0 - r, x_0 + r)$.
Since the derivatives of $f$ on $(-R, R)$ are obtained by term-by-term differentiation
of the power series $\sum_{n=0}^{\infty} a_n x^n$, we have
$$\frac{f^{(n)}(x_0)}{n!} = \sum_{m=n}^{\infty} \binom{m}{n} a_m x_0^{m-n}, \quad n \ge 0.$$
We start by noting a special case of the result. If $f$ is a polynomial of degree $p$, then
$$\sum_{n=0}^{p} a_n x^n = \sum_{n=0}^{p} \left(\sum_{m=n}^{p} \binom{m}{n} a_m x_0^{m-n}\right)(x - x_0)^n,$$
since it is easy to check that both sides of the equation are polynomials of degree at
most $p$ and have the same derivatives of order $\le p$ at $x = x_0$.
The proof of the general case has two parts. First, we estimate $\left|\frac{f^{(n)}(x_0)}{n!}\right|$ so as to
show that $Tf_{x_0}$ has a non-zero radius of convergence. Then we prove that the partial
sums of $\sum_{n=0}^{\infty} a_n x^n$ converge to $Tf_{x_0}(x)$; this will use the special case together with
estimates on remainders.
Fix $b \in (|x_0|, R)$. By Lemma 4.5.1, there exists a $C \ge 0$ such that
$$|a_n| \le C b^{-n}, \quad n \ge 0.$$
Using this estimate it is easy to show that $\sum_{m=n}^{\infty} \binom{m}{n} a_m x_0^{m-n}$ is absolutely convergent
and that we have the estimate
$$\frac{|f^{(n)}(x_0)|}{n!} \le C b^{-n} \sum_{m=n}^{\infty} \binom{m}{n} \left(\frac{|x_0|}{b}\right)^{m-n} = C b^{-n} \left(1 - \frac{|x_0|}{b}\right)^{-(n+1)},$$
where the last equality follows from the binomial theorem. Choose $r > 0$ so that
$$\frac{b^{-1} r}{1 - \frac{|x_0|}{b}} < 1 \quad \text{and} \quad [x_0 - r, x_0 + r] \subset (-R, R).$$
We claim that the Taylor series $Tf_{x_0}$ converges on $[x_0 - r, x_0 + r]$. We have
$$\begin{aligned}
\left|\frac{f^{(n)}(x_0)}{n!}(x - x_0)^n\right| &\le C b^{-n} \left(1 - \frac{|x_0|}{b}\right)^{-(n+1)} |x - x_0|^n \\
&\le C b^{-n} \left(1 - \frac{|x_0|}{b}\right)^{-(n+1)} r^n, \quad \text{if } x \in [x_0 - r, x_0 + r] \\
&= D \left(\frac{r b^{-1}}{1 - \frac{|x_0|}{b}}\right)^n,
\end{aligned}$$
where $D = C/(1 - \frac{|x_0|}{b})$. Since $\sum_{n=0}^{\infty} \left(\frac{r b^{-1}}{1 - |x_0|/b}\right)^n$ is convergent (by our choice of $r$),
the Taylor series converges for all $x \in [x_0 - r, x_0 + r]$.
Finally, we need to show that $Tf_{x_0}(x)$ converges to $f(x)$ for all $x \in [x_0 - r, x_0 + r]$.
For this it suffices to show that if $\varepsilon > 0$ then there exists an $N \in \mathbb{N}$ such that
$|Tf_{x_0}(x) - \sum_{n=0}^{p} a_n x^n| < \varepsilon$, for all $p \ge N$ and $x \in [x_0 - r, x_0 + r]$.
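Let $\varepsilon > 0$. Fix $x \in [x_0 - r, x_0 + r]$ and choose $N \in \mathbb{N}$ so that for all $p \ge N$ we
have
$$\sum_{n=p+1}^{\infty} \left(\sum_{m=n}^{\infty} \binom{m}{n} |a_m| |x_0|^{m-n}\right) r^n, \quad \sum_{n=0}^{p} \left(\sum_{m=p+1}^{\infty} \binom{m}{n} |a_m| |x_0|^{m-n}\right) r^n < \varepsilon/2. \tag{5.5}$$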
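For $p \ge N$ define
$$I_1 = \sum_{n=0}^{p} \left(\sum_{m=n}^{p} \binom{m}{n} a_m x_0^{m-n}\right)(x - x_0)^n,$$
$$I_2 = \sum_{n=p+1}^{\infty} \left(\sum_{m=n}^{\infty} \binom{m}{n} a_m x_0^{m-n}\right)(x - x_0)^n,$$
$$I_3 = \sum_{n=0}^{p} \left(\sum_{m=p+1}^{\infty} \binom{m}{n} a_m x_0^{m-n}\right)(x - x_0)^n.$$
For $x \in [x_0 - r, x_0 + r]$, we have (by absolute convergence)
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n = I_1 + I_2 + I_3.$$
Now $I_1 = \sum_{n=0}^{p} a_n x^n$ (special case: $\sum_{n=0}^{p} a_n x^n$ is a polynomial of degree $p$). Since
$x \in [x_0 - r, x_0 + r]$, we have by (5.5), $|I_2|, |I_3| < \varepsilon/2$, if $p \ge N$. Hence
$$\left|Tf_{x_0}(x) - \sum_{n=0}^{p} a_n x^n\right| = |I_2 + I_3| < \varepsilon, \quad \text{for all } p \ge N.$$
Hence the sequence $\left(\sum_{n=0}^{p} a_n x^n\right)$ of partial sums converges pointwise to $Tf_{x_0}(x)$ on
$[x_0 - r, x_0 + r]$. Since the partial sums also converge to $f(x)$, we have $Tf_{x_0}(x) = f(x)$ on $[x_0 - r, x_0 + r]$. □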
Remark 5.4.4 As we show in the next examples, the radius of convergence of $Tf_{x_0}$ may be strictly bigger than $R$. It is straightforward to show that it is always at least $\min\{R - x_0, x_0 + R\}$.
Examples 5.4.5
(1) The exponential series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ defines an analytic function $\exp(x)$ on $\mathbb{R}$. We claim that (a) $\exp(0) = 1$, (b) $\exp'(x) = \exp(x)$ for all $x \in \mathbb{R}$, (c) $\exp(x)\exp(-x) = 1$, for all $x \in \mathbb{R}$, and (d) $\exp(x + y) = \exp(x)\exp(y)$, for all $x, y \in \mathbb{R}$. (a) is immediate from the series definition and (b) follows by term-by-term differentiation of the power series defining $\exp(x)$. By the chain rule $\frac{d}{dx}(\exp(-x)) = -\exp(-x)$ and so $\frac{d}{dx}(\exp(x)\exp(-x)) = 0$ for all $x \in \mathbb{R}$. Hence $\exp(x)\exp(-x)$ is constant and, taking $x = 0$, we have $\exp(x)\exp(-x) = 1$ for all $x \in \mathbb{R}$. Finally, using (b) again, we have $\frac{d}{dx}(\exp(x + y)\exp(-x)\exp(-y)) = 0$ and so $\exp(x + y)\exp(-x)\exp(-y)$ is constant as a function of $x$. Take $x = -y$ and use (a,c) to deduce that $\exp(x + y)\exp(-x)\exp(-y) = 1$ for all $x \in \mathbb{R}$. Hence, applying (c) again, we deduce that $\exp(x + y) = \exp(x)\exp(y)$. If we set $\exp(1) = e \approx 2.718\ldots$, then (c,d) imply that we may write $\exp(x) = e^x$ where $e^x$ satisfies the exponent laws for a power.
(2) The power series $\sum_{n=0}^{\infty} x^n$ has radius of convergence 1 and converges to $f(x) = (1 - x)^{-1}$ on $(-1, 1)$. Given $a \in (-1, 1)$, we have $f^{(n)}(a) = n!\,(1 - a)^{-(n+1)}$ and so
\[ Tf_a(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n = \frac{1}{1 - a} \sum_{n=0}^{\infty} \left( \frac{x - a}{1 - a} \right)^{n}. \]
The radius of convergence of this series is $1 - a$. Observe that $1 - a > 1$ if $a < 0$ and so the radius of convergence of the Taylor series of a power series can be strictly bigger than the radius of convergence of the power series. In this example, the analytic function defined by $\sum_{n=0}^{\infty} x^n$ on $(-1, 1)$ is equal to $(1 - x)^{-1}$ and the latter function is naturally defined on $(-\infty, 1)$ as an analytic function.
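The re-expansion in Example (2) can be checked numerically. The following is a minimal sketch, using only the Python standard library, that sums the Taylor series of $1/(1-x)$ about $a = -0.5$ at the point $x = -1.8$. This point lies outside $(-1, 1)$ but inside the interval of radius $1 - a = 1.5$ about $a$. The function name `taylor_partial_sum` and the sample values are our own illustrative choices.

```python
def taylor_partial_sum(a, x, terms):
    """Partial sum of (1/(1-a)) * sum_{n < terms} ((x-a)/(1-a))**n."""
    ratio = (x - a) / (1.0 - a)  # geometric ratio of the re-expanded series
    return sum(ratio ** n for n in range(terms)) / (1.0 - a)

a, x = -0.5, -1.8                # |x| > 1, but |x - a| = 1.3 < 1 - a = 1.5
approx = taylor_partial_sum(a, x, 400)
exact = 1.0 / (1.0 - x)          # value of the analytic continuation 1/(1-x)
print(approx, exact)
```

The original series $\sum x^n$ diverges at $x = -1.8$, while the series centred at $a$ still converges there, in line with the observation that the radius $1 - a$ exceeds 1 when $a < 0$.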
Proposition 5.4.6 Suppose that $f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$ are analytic.
(1) The product $fg : I \cap J \to \mathbb{R}$ is analytic.
(2) If $g$ is non-zero on $I \cap J$ then the quotient $f/g : I \cap J \to \mathbb{R}$ is analytic.
(3) If $f(I) \subset J$ then the composite $g \circ f : I \to \mathbb{R}$ is analytic.
Proof Statement (1) can be proved using Proposition 5.4.3 and the result on products of power series (Proposition 4.5.10). Similarly (2) follows from Proposition 4.5.11. We omit the proof of (3); see the remarks below. □
Remark 5.4.7 The easiest way of proving analyticity of the composite of analytic
functions is to complexify and use complex analytic methods based on Cauchy’s
integral theorem. For a proof of analyticity using real power series methods, we
refer to Krantz and Parks [20, §1.3]. See also the exercises at the end of Sect. 9.13.1
on Faà di Bruno’s formula in Chap. 9.
Proposition 5.4.8 Suppose that $f : (a, b) \to \mathbb{R}$ is analytic and not identically zero. Then
(1) The zeros of $f$ are isolated: if $f(x_0) = 0$, then there exists an $s > 0$ such that the only zero of $f$ on $(x_0 - s, x_0 + s)$ is $x_0$.
(2) If $f(x_0) = 0$ then there exists a unique $p \in \mathbb{N}$ and analytic function $g$ on $(a, b)$ such that $g(x_0) \ne 0$ and
\[ f(x) = (x - x_0)^p g(x). \]
Proof Suppose that $f(x_0) = 0$. Without loss of generality, take $x_0 = 0$. For some $r > 0$ we may write
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n, \quad x \in (-r, r). \]
Since $f(0) = 0$, we must have $a_0 = 0$. Let $p$ be the smallest integer for which $a_p \ne 0$. Then
\[ f(x) = \sum_{n=p}^{\infty} a_n x^n = x^p \sum_{n=0}^{\infty} a_{n+p} x^n = x^p g(x), \]
where $g(x) = \sum_{n=0}^{\infty} a_{n+p} x^n$, $x \in (-r, r)$. Since $a_p \ne 0$, $g(0) \ne 0$. Moreover, the radius of convergence of the power series defining $g$ is at least $r$ and so $g$ is analytic on $(-r, r)$. In particular, $g$ is continuous on $(-r, r)$ and non-zero at $x = 0$. Hence there exists an $s > 0$ such that $g \ne 0$ on $(-s, s)$. Therefore the only zero of $f$ on $(-s, s)$ is at $x = 0$, proving (1). We define $g : (a, b) \to \mathbb{R}$ by $g(0) = a_p$ and $g(x) = x^{-p} f(x)$, $x \ne 0$. We leave it to the exercises for the reader to verify that $g$ is analytic. □
Remark 5.4.9 It follows from Proposition 5.4.8 that if $f : (a, b) \to \mathbb{R}$ is analytic and not constant, then for all $c \in \mathbb{R}$, $f^{-1}(c)$ is a countable subset of $(a, b)$ consisting of isolated points.
5.4.1 Analytic Continuation
The next result is very special to analytic functions: it fails completely for $C^\infty$ functions.
Proposition 5.4.10 Let $f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$ be analytic functions defined on the open intervals $I, J$. If there exists an $x_0 \in I \cap J$ such that
\[ f^{(n)}(x_0) = g^{(n)}(x_0), \quad \text{for all } n \ge 0, \]
then $f = g$ on $I \cap J$. Otherwise said, if the analytic functions $f$ and $g$ have the same Taylor series at some point then $f = g$ on their common domain.
Proof Let $X = \{x \in I \cap J \mid f^{(n)}(x) = g^{(n)}(x) \text{ for all } n \ge 0\}$. It suffices to prove $X = I \cap J$. Since $x_0 \in X$, $X \ne \emptyset$. Moreover, if $x \in X$, then $f$ and $g$ have the same power series representation on an open interval $K \subset I \cap J$ containing $x$ and therefore $K \subset X$. Suppose $X \ne I \cap J$. Without loss of generality suppose there exists a $z \in I \cap J$, $z < x_0$, with $z \notin X$. Let $z_0 = \sup\{z < x_0 \mid z \notin X\}$. Clearly, $(z_0, x_0) \subset X$. Choose a sequence $(y_j) \subset (z_0, x_0)$ such that $\lim_{j \to \infty} y_j = z_0$. By sequential continuity of $f^{(n)}, g^{(n)}$, we have $\lim_{j \to \infty} f^{(n)}(y_j) = f^{(n)}(z_0)$, $\lim_{j \to \infty} g^{(n)}(y_j) = g^{(n)}(z_0)$ for all $n \ge 0$. But since $(y_j) \subset X$, we have $f^{(n)}(y_j) = g^{(n)}(y_j)$ for all $j$ and all $n \ge 0$, and so $f^{(n)}(z_0) = g^{(n)}(z_0)$, $n \ge 0$. Hence $z_0 \in X$. But if $z_0 \in X$, then there is an open interval $(z_0 - r, z_0 + r) \subset X \cap (I \cap J)$, contradicting the definition of $z_0$ as the supremum of points $z < x_0$ not in $X$. Hence $X = I \cap J$. □
The next result is an immediate corollary of Proposition 5.4.10.
Corollary 5.4.11 If $f, g : I \to \mathbb{R}$ are analytic functions which are equal on a non-empty open subinterval of $I$, then $f = g$ on $I$.
Definition 5.4.12 Let $I, J$ be open intervals and $f : I \to \mathbb{R}$, $g : J \to \mathbb{R}$ be analytic functions. We call $g$ an analytic continuation of $f$ if (a) $J \supset I$ and (b) $g = f$ on $I$.
Proposition 5.4.13 Every analytic function $f : I \to \mathbb{R}$ has a unique maximal analytic continuation $F : J \to \mathbb{R}$.
Proof Let $\mathcal{A} = \{g_\lambda : J_\lambda \to \mathbb{R} \mid \lambda \in \Lambda\}$ denote the set of all analytic continuations of $f$. Define $J = \bigcup_{\lambda \in \Lambda} J_\lambda$. Given $x \in J$, there exists a $\lambda \in \Lambda$ such that $x \in J_\lambda$ and we define $F(x) = g_\lambda(x)$. As an immediate consequence of Corollary 5.4.11, the value $F(x)$ is independent of the choice of $\lambda \in \Lambda$ such that $x \in J_\lambda$ (if $x \in J_\lambda, J_\mu$, then $x \in J_\lambda \cap J_\mu \supset I$). The map $F$ is analytic (since $F = g_\lambda$ on each $J_\lambda$) and obviously $F$ is the maximal analytic continuation of $f$. □
Example 5.4.14 The analytic function $f(x) = \sum_{n=0}^{\infty} (-1)^n x^n$, $|x| < 1$, has maximal analytic continuation $F(x) = 1/(1 + x)$ defined on $(-1, \infty)$.
5.4.2 Analytic Functions and Ordinary Differential Equations
A natural way of constructing analytic functions is as solutions to linear ordinary
differential equations.
Example 5.4.15 Consider the linear differential equation $y' = ay$, where $a \in \mathbb{R}$ and $y' = \frac{dy}{dx}$. We search for a solution $y(x)$ which satisfies the initial condition $y(0) = y_0$ (the analysis is the same if we specify $y(x_0)$, $x_0 \ne 0$).
We start by observing that if $y : \mathbb{R} \to \mathbb{R}$ is a $C^\infty$ solution to $y' = ay$, then the derivatives $y^{(n)}(0)$ are all uniquely determined by the initial condition. Indeed, since $y' = ay$ we have $y'(0) = ay(0) = ay_0$. Differentiating once, $y$ must satisfy $y'' = ay'$ and so $y''(0) = ay'(0) = a^2 y_0$. Proceeding inductively, it is clear that for $n \ge 0$ we have
\[ y^{(n)}(0) = a^n y_0. \]
Assume that $y$ is analytic. Then $y(x) = \sum_{n=0}^{\infty} \frac{y^{(n)}(0)}{n!} x^n$ for $x \in (-r, r)$, where $r > 0$. Using our computed values of $y^{(n)}(0)$ we see that
\[ y(x) = y_0 \sum_{n=0}^{\infty} \frac{(ax)^n}{n!}. \]
This power series has radius of convergence $R = \infty$. Using our results on term-by-term differentiation of a power series, we see easily that $y(x) = y_0 \sum_{n=0}^{\infty} \frac{(ax)^n}{n!} = y_0 e^{ax}$ is a solution of $y' = ay$ which is defined for all $x \in \mathbb{R}$ and satisfies the initial condition $y(0) = y_0$. Moreover, the solution is unique. To see this, suppose that $u(x)$ is a differentiable function defined on an open interval $I$ containing $x = 0$ which satisfies $u' = au$ on $I$ and $u(0) = y_0$. Define $v(x) = e^{-ax} u(x)$. For $x \in I$ we have
\[ v'(x) = -a e^{-ax} u(x) + e^{-ax} u'(x) = -a e^{-ax} u(x) + a e^{-ax} u(x) = 0. \]
Therefore, $v$ is constant on $I$. We have $v(0) = u(0) = y_0$ and so $v(x) = y_0$ for all $x \in I$. That is, $u(x) = y_0 e^{ax}$, $x \in I$.
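The construction in Example 5.4.15 can be illustrated numerically. The sketch below, in Python with the standard library only, builds the partial sums of $y_0 \sum (ax)^n / n!$ from the derivative values $y^{(n)}(0) = a^n y_0$ and compares them with $y_0 e^{ax}$. The function name `series_solution` and the sample parameters are hypothetical, chosen only for illustration.

```python
import math

def series_solution(a, y0, x, terms=60):
    """Partial sum of y0 * sum_{n < terms} (a*x)**n / n!."""
    total = 0.0
    term = y0                    # n = 0 term: y0 * (a*x)**0 / 0!
    for n in range(terms):
        total += term
        term *= a * x / (n + 1)  # pass from the n-th term to the (n+1)-th
    return total

a, y0, x = 0.7, 2.0, 1.5
print(series_solution(a, y0, x), y0 * math.exp(a * x))
```

Updating each term from the previous one avoids computing large factorials and mirrors the recursion $y^{(n+1)}(0) = a\,y^{(n)}(0)$ used in the example.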
Remark 5.4.16 It is worth summarizing the method used in the previous example.
Given the initial condition, all the higher derivatives of a solution are uniquely
determined. As a result the Taylor series of the solution at the origin is uniquely
determined. We show that the Taylor series has non-zero radius of convergence
and observe, using term-by-term differentiation, that the Taylor series defines a
solution to the differential equation with the correct initial condition. Finally, we
compare a solution with the right initial condition to the constructed solution
and so verify uniqueness. In practice, for higher-order linear constant coefficient
differential equations it is usually best to work over the complex numbers, though
in some cases it is possible to work using just real numbers—see the exercises at the
end of the section.
The next proposition gives a general result on the existence of analytic solutions to a linear ordinary differential equation. We omit the proof, which is most easily done using complex variable methods.
Proposition 5.4.17 Consider the second-order linear differential equation
\[ y'' + a(x) y' + b(x) y = 0, \tag{5.6} \]
where $a, b \in C^\omega(\mathbb{R})$. Given $y_0, y_0' \in \mathbb{R}$, there exist solutions $y_1, y_2 \in C^\omega(\mathbb{R})$ to (5.6) satisfying
(a) $y_1(0) = y_0$, $y_1'(0) = 0$; $y_2(0) = 0$, $y_2'(0) = y_0'$.
(b) If $y : I \to \mathbb{R}$ is a solution to (5.6) such that $y(0) = y_0$, $y'(0) = y_0'$ (so $0 \in I$), then $y = y_1 + y_2$ on $I$ (in particular, solutions are uniquely specified by their initial conditions).
EXERCISES 5.4.18
(1) Consider the ordinary differential equation $y'' = -y$. Suppose that $y(x) = \sum_{n=0}^{\infty} a_n x^n$ is a power series solution of the equation (assume a non-zero radius of convergence; you will verify this assumption later). Show that $y(x)$ is uniquely determined by $y(0)$ and $y'(0)$.
(a) If $y(0) = 1$, $y'(0) = 0$ denote the solution by $c(x)$. Verify that the power series you get has radius of convergence $R = \infty$.
(b) If $y(0) = 0$, $y'(0) = 1$ denote the solution by $s(x)$. Verify that the power series you get has radius of convergence $R = \infty$.
(c) Verify that $s' = c$, $c' = -s$ and hence that $s^2 + c^2 \equiv 1$ on $\mathbb{R}$.
(2) Let $\alpha \in \mathbb{R}$. Consider the analytic differential equation
\[ y'(x) = \frac{\alpha}{1 + x}\, y(x) \]
on $(-1, \infty)$.
(a) Verify that the unique solution with initial condition $y(0) = 1$ is given by $y(x) = (1 + x)^\alpha$. (Assume the uniqueness theorem for solutions of ordinary differential equations; see Chap. 7, Theorem 7.17.12.)
(b) Verify that the binomial series
\[ y(x) = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n \]
has radius of convergence $R = 1$ for all $\alpha \in \mathbb{R}$, $\alpha \notin \mathbb{Z}_+$, and is a solution to the differential equation with $y(0) = 1$.
(c) Deduce the binomial series $(1 + x)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$, $|x| < 1$.
(For an alternative proof, using Taylor's theorem, see Exercises 4.5.12(4). It can also be shown that the complex binomial series converges to $(1 + z)^\alpha$ if $\alpha, z \in \mathbb{C}$, $|z| < 1$.)
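As a numerical sanity check on Exercise (2), and not a substitute for the proof, the following Python sketch sums the binomial series for real $\alpha$ and compares it with $(1+x)^\alpha$ for $|x| < 1$. The function name `binomial_series` is our own; the generalized binomial coefficients are generated from the recurrence $\binom{\alpha}{n+1} = \binom{\alpha}{n}\,\frac{\alpha - n}{n + 1}$.

```python
def binomial_series(alpha, x, terms=200):
    """Partial sum of sum_{n < terms} C(alpha, n) * x**n for real alpha."""
    total = 0.0
    coeff = 1.0                         # C(alpha, 0) = 1
    power = 1.0                         # x**0
    for n in range(terms):
        total += coeff * power
        coeff *= (alpha - n) / (n + 1)  # C(alpha, n+1) from C(alpha, n)
        power *= x
    return total

print(binomial_series(0.5, 0.5), 1.5 ** 0.5)
print(binomial_series(-1.5, -0.3), 0.7 ** -1.5)
```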
5.5 Trigonometric and Fourier Series
In this section we consider the problems of approximating periodic functions
by trigonometric polynomials and the representation of periodic functions by a
trigonometric or Fourier series.
We start by giving the definition of a periodic function.
Definition 5.5.1 A function $f : \mathbb{R} \to \mathbb{R}$ is periodic with period $\tau > 0$ if
\[ f(x + \tau) = f(x), \quad \text{for all } x \in \mathbb{R}. \]
We say $f$ is $\tau$-periodic.
Remarks 5.5.2
(1) We generally assume that the period $\tau$ is the smallest strictly positive real number such that $f(x + \tau) = f(x)$ for all $x \in \mathbb{R}$ ($\tau$ is then called the prime period of $f$). Of course, if $f(x + \tau) = f(x)$ for all $x \in \mathbb{R}$ and all $\tau > 0$, then $f$ is constant.
(2) If $f : \mathbb{R} \to \mathbb{R}$ is $\tau$-periodic then $f(x + m\tau) = f(x)$ for all $m \in \mathbb{Z}$.
(3) We may require that the period $\tau = 2\pi$; if not, define $\bar{f}(x) = f(\tau x / 2\pi)$ and note that $\bar{f}$ has period $2\pi$.
Example 5.5.3 Let $\omega > 0$. The functions $\sin(\omega x)$, $\cos(\omega x)$ both have period $2\pi/\omega$. The function $\sin^2(\omega x)$ has period $\pi/\omega$.
Definition 5.5.4 A function $T : \mathbb{R} \to \mathbb{R}$ is a trigonometric polynomial of degree $N \ge 1$ if $T$ can be written in the form
\[ T(x) = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx), \]
where $a_N^2 + b_N^2 \ne 0$. Note that the period of $T$ is $2\pi$.
Example 5.5.5 Let $T(x) = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx)$ be a trigonometric polynomial of degree $N$. If we set $z = e^{\imath x}$, $\bar{z} = e^{-\imath x}$ then we may write $T$ as a polynomial $p_T(z, \bar{z}) = \sum_{i=0}^{N} (c_i z^i + \bar{c}_i \bar{z}^i)$, where $c_0, \ldots, c_N \in \mathbb{C}$ are uniquely determined by the coefficients $a_i, b_i$. This observation explains the use of the term 'polynomial' in the definition of trigonometric polynomial.
We used the Weierstrass approximation theorem to uniformly approximate continuous functions on a closed bounded interval by polynomials. We may also use the Weierstrass approximation theorem to show that continuous periodic functions can be uniformly approximated by trigonometric polynomials. We give the proof of the next result in the appendix to this chapter.
Theorem 5.5.6 (Second Weierstrass Approximation Theorem) Every continuous $2\pi$-periodic function on $\mathbb{R}$ can be uniformly approximated by trigonometric polynomials.
In practice, it turns out to be much more interesting to represent periodic functions by trigonometric series.
Definition 5.5.7 A trigonometric series is a series of the form
\[ a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx). \]
We will mainly be interested in the classes of piecewise continuous and piecewise differentiable functions. These are functions which have only jump discontinuities. We give the precise definition we use (see also Sect. 2.5.2).
Definition 5.5.8 A function $f : [a, b] \to \mathbb{R}$ is piecewise continuous if there exists a finite subset $\{d_j \mid j = 1, \ldots, N\}$ of $[a, b]$ such that
(a) $a \le d_1 < \cdots < d_N \le b$.
(b) $f$ is continuous, except at $x = d_1, \ldots, d_N$.
(c) For each $j$, $\lim_{x \to d_j-} f(x) = f(d_j-)$ and $\lim_{x \to d_j+} f(x) = f(d_j+)$ exist and are finite (we make the obvious variations if either $d_1 = a$ or $d_N = b$).
If $f$ is defined on $\mathbb{R}$, then $f$ is piecewise continuous if it is piecewise continuous restricted to every bounded closed interval $[a, b]$. A function $f$ is piecewise $C^1$ if both $f$ and $f'$ are piecewise continuous.
Remarks 5.5.9
(1) We refer to the type of discontinuity described in Definition 5.5.8 as a jump discontinuity. The jump at a jump discontinuity $d$ of $f$ is defined to be $f(d+) - f(d-)$.
(2) Let $f : [a, b] \to \mathbb{R}$ be piecewise continuous with discontinuity points $a < d_1 < \cdots < d_N < b$. Set $d_0 = a$, $d_{N+1} = b$. It is sometimes useful to regard $f$ as defining a continuous function $f_j$ on each subinterval $[d_j, d_{j+1}]$, $j \in \{0, \ldots, N\}$. For this, we define $f_j(d_j) = f(d_j+)$ and $f_j(d_{j+1}) = f(d_{j+1}-)$.
Examples 5.5.10
(1) If $f(x) = \sin(1/x)$, $x \ne 0$, and $f(0) = 0$, then $f$ is not piecewise continuous.
(2) If we define $S(x) = 1$, $x \in [2n\pi, (2n+1)\pi)$ and $S(x) = -1$, $x \in [(2n+1)\pi, (2n+2)\pi)$, $n \in \mathbb{Z}$, then $S$ is piecewise continuous (indeed, piecewise smooth) and $2\pi$-periodic. The function $S$ defines a square wave.
(3) The function $f(x) = |x|$ is continuous and piecewise $C^1$. We have $\lim_{x \to 0+} f'(x) = +1$, $\lim_{x \to 0-} f'(x) = -1$.
If the trigonometric series $a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ converges for all $x \in \mathbb{R}$, then $U(x) = a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ is $2\pi$-periodic: $U(x + 2\pi) = U(x)$ for all $x \in \mathbb{R}$. In this section, we will be interested in representing $2\pi$-periodic functions as trigonometric series. Initially, we obtain results on pointwise convergence. Later we obtain results on uniform convergence. However, uniform convergence is not the most natural form of convergence to use when studying trigonometric series (unlike power series). Much better is the concept of mean square convergence, which we address later in the section.
We start by showing how every continuous (or piecewise continuous) $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{R}$ naturally determines a trigonometric series which we call the Fourier series of $f$. The problem will be to relate the Fourier series to the original function $f$.
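Definition 5.5.11 Let $f$ be a $2\pi$-periodic function on $\mathbb{R}$ and assume that $f$ is piecewise continuous (so $f$ has finitely many jump discontinuities on $[0, 2\pi]$). The Fourier series $\mathcal{F}(f)$ of $f$ is defined to be the infinite series
\[ a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \]
where
\[ a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(x)\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \]
\[ a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx, \quad n \ge 1, \]
\[ b_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin nx\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx, \quad n \ge 1. \]
We refer to $a_n, b_n$ as the Fourier coefficients of $f$.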
Remarks 5.5.12
(1) It is common in the literature to take the first term in the Fourier series to be $a_0/2$. With this convention, $a_0$ is twice the average of $f$ on $[-\pi, \pi]$, rather than the average as we have defined it. One way or another one has to deal with an anomalous factor or divisor of 2.
(2) If $f$ is an even function ($f(-x) = f(x)$), then $b_n = 0$ for all $n \in \mathbb{N}$ since the integrand $f(x) \sin nx$ will be odd. Similarly if $f$ is an odd function ($f(-x) = -f(x)$) then $a_n = 0$ for all $n \in \mathbb{Z}_+$.
Example 5.5.13 Let $S$ be the $2\pi$-periodic square wave function defined in Examples 5.5.10(2). Since $S$ is odd, we have $a_n = 0$, $n \ge 0$ (see the previous remarks). On the other hand
\[ b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} S(x) \sin nx\,dx = \frac{2}{\pi} \int_0^{\pi} \sin nx\,dx. \]
It follows easily that
\[ b_{2n} = 0 \ (n \ge 1), \qquad b_{2n+1} = \frac{4}{(2n+1)\pi} \ (n \ge 0). \]
Hence the Fourier series is $\mathcal{F}(S) = \frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin((2n+1)x)}{2n+1}$.
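The coefficients in Example 5.5.13 can be checked numerically from the integral formulas of Definition 5.5.11. The sketch below is a minimal illustration in Python (standard library only); the names `square_wave` and `fourier_b` and the choice of a midpoint rule with an even number of samples are our own assumptions, made so that the jump at $x = 0$ falls on a cell boundary.

```python
import math

def square_wave(x):
    """S(x) = 1 on [2k*pi, (2k+1)*pi) and -1 on [(2k+1)*pi, (2k+2)*pi)."""
    return 1.0 if math.floor(x / math.pi) % 2 == 0 else -1.0

def fourier_b(f, n, samples=20000):
    """Midpoint-rule approximation of (1/pi) * integral_{-pi}^{pi} f(x) sin(nx) dx."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h   # midpoint of the k-th cell
        total += f(x) * math.sin(n * x)
    return total * h / math.pi

for n in range(1, 6):
    expected = 4.0 / (n * math.pi) if n % 2 == 1 else 0.0
    print(n, fourier_b(square_wave, n), expected)
```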
Remark 5.5.14 Although most of the time we assume functions are $2\pi$-periodic, in Chap. 6 we consider Fourier series of 1-periodic functions. For future reference, the Fourier coefficients of a 1-periodic function $f$ are defined by
\[ a_0 = \int_0^1 f(x)\,dx, \]
\[ a_n = 2 \int_0^1 f(x) \cos 2n\pi x\,dx, \quad n \ge 1, \]
\[ b_n = 2 \int_0^1 f(x) \sin 2n\pi x\,dx, \quad n \ge 1, \]
and the corresponding Fourier series $\mathcal{F}(f)$ is
\[ a_0 + \sum_{n=1}^{\infty} (a_n \cos 2\pi nx + b_n \sin 2\pi nx). \]
5.5.1 The Orthogonality Relations
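We compute the Fourier coefficients of $\cos px$, $\sin px$, $p \ge 0$.
Lemma 5.5.15
\[ \frac{1}{\pi} \int_0^{2\pi} \cos px \cos nx\,dx = \begin{cases} 1 & \text{if } p = n,\ p, n \ge 1, \\ 0 & \text{if } p \ne n, \\ 2 & \text{if } p = n = 0, \end{cases} \]
\[ \frac{1}{\pi} \int_0^{2\pi} \sin px \cos nx\,dx = 0, \quad \text{if } p \ge 1,\ n \ge 0, \]
\[ \frac{1}{\pi} \int_0^{2\pi} \sin px \sin nx\,dx = \begin{cases} 0 & \text{if } p \ne n,\ p, n \ge 1, \\ 1 & \text{if } p = n \ge 1. \end{cases} \]
Proof All of the statements follow using standard trigonometric identities such as $\cos A \cos B = \frac{1}{2}(\cos(A + B) + \cos(A - B))$ and we leave the details to the reader. □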
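The orthogonality relations of Lemma 5.5.15 are easy to confirm numerically. The following Python sketch (standard library only) approximates $\frac{1}{\pi}\int_0^{2\pi} u(x) v(x)\,dx$ by a midpoint rule on equally spaced points, which integrates low-degree trigonometric polynomials over a full period essentially exactly. The helper names `inner`, `cos_n` and `sin_n` are hypothetical and introduced only for this illustration.

```python
import math

def inner(u, v, samples=256):
    """Midpoint-rule value of (1/pi) * integral_0^{2*pi} u(x) * v(x) dx."""
    h = 2.0 * math.pi / samples
    total = sum(u((k + 0.5) * h) * v((k + 0.5) * h) for k in range(samples))
    return total * h / math.pi

def cos_n(n):
    return lambda x: math.cos(n * x)

def sin_n(n):
    return lambda x: math.sin(n * x)

print(inner(cos_n(3), cos_n(3)))   # case p = n >= 1
print(inner(cos_n(0), cos_n(0)))   # case p = n = 0
print(inner(cos_n(2), cos_n(5)))   # case p != n
print(inner(sin_n(4), cos_n(4)))   # mixed sine and cosine
print(inner(sin_n(2), sin_n(2)))   # case p = n >= 1 for sines
```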
As an important corollary of the second Weierstrass approximation theorem (Theorem 5.5.6) we have
Theorem 5.5.16 Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and $2\pi$-periodic. If all the Fourier coefficients of $f$ are zero, then $f \equiv 0$. In particular, if continuous and $2\pi$-periodic functions $f, g : \mathbb{R} \to \mathbb{R}$ have the same Fourier series, then $f = g$.
Proof We leave it to the exercises at the end of the section. □
Remarks 5.5.17
(1) The reader should note the similarity of this result to Proposition 5.3.6.
(2) Theorem 5.5.16 shows that the Fourier coefficients are important invariants of the function $f$, notwithstanding any issues about convergence of the Fourier series.
(3) Theorem 5.5.16 is true if we only assume piecewise continuity.
We can give additional justification for our definition of the Fourier coefficients $a_n, b_n$ when the Fourier series converges uniformly.
Proposition 5.5.18 Suppose that $a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ converges uniformly on $[0, 2\pi]$ to the function $f$. Then the $a_n$ and $b_n$ must be the Fourier coefficients of $f$.
Proof If $a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ converges uniformly on $[0, 2\pi]$ to $f$ then $f$ is continuous on $[0, 2\pi]$ since the partial sums $S_N = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx)$ are continuous. If $(S_N)$ converges uniformly to $f$ then $S_N(x) \cos nx$ converges uniformly to $f(x) \cos nx$ on $[0, 2\pi]$ for all $n \ge 0$. Hence
\[ \frac{1}{\pi} \int_0^{2\pi} S_N(x) \cos nx\,dx \to \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx\,dx. \]
Using the orthogonality relations (Lemma 5.5.15), we have
\[ \frac{1}{\pi} \int_0^{2\pi} S_N(x) \cos nx\,dx = \begin{cases} a_n, & \text{if } N \ge n > 0, \\ 2a_0, & \text{if } n = 0. \end{cases} \]
Hence $a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(x)\,dx$ and $a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx\,dx$, $n \ge 1$. A similar analysis applies to the coefficients $b_n$, $n \ge 1$. □
Remark 5.5.19 Theorem 5.5.16 implies that a continuous $2\pi$-periodic function is uniquely determined by its Fourier coefficients. Proposition 5.5.18 gives conditions under which we can reconstruct $f$ given the Fourier coefficients. A general resolution of this 'inverse' problem motivates much of the more advanced work on Fourier series. For example, a very natural question to ask is which trigonometric series are the Fourier series of a continuous or smooth function.
5.5.2 The Riemann–Lebesgue Lemma
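Lemma 5.5.20 (The Riemann–Lebesgue Lemma) Let $f : [a, b] \to \mathbb{R}$ be piecewise continuous. Then
\[ \lim_{\lambda \to \infty} \int_a^b f(x) \cos \lambda x\,dx = \lim_{\lambda \to \infty} \int_a^b f(x) \sin \lambda x\,dx = 0. \]
Proof We prove that $\lim_{\lambda \to \infty} \int_a^b f(x) \cos \lambda x\,dx = 0$; the analysis for the second integral is similar. We start by assuming $f$ is $C^1$ on $[a, b]$. Integrating by parts, we have
\[ \int_a^b f(x) \cos \lambda x\,dx = \frac{1}{\lambda} \bigl( f(b) \sin \lambda b - f(a) \sin \lambda a \bigr) - \frac{1}{\lambda} \int_a^b f'(x) \sin \lambda x\,dx. \]
Let $C$ be an upper bound for $|f|$ and $|f'|$ on $[a, b]$ (this uses the continuity of $f$, $f'$). We have the estimate
\[ \left| \int_a^b f(x) \cos \lambda x\,dx \right| \le 2C/\lambda + C(b - a)/\lambda. \]
Letting $\lambda \to \infty$ we have $\lim_{\lambda \to \infty} \int_a^b f(x) \cos \lambda x\,dx = 0$. Next, assume only that $f$ is continuous. Given $\varepsilon > 0$, it suffices to show that there exists a $\lambda_0 > 0$ such that $|\int_a^b f(x) \cos \lambda x\,dx| < \varepsilon$, for all $\lambda \ge \lambda_0$. By the Weierstrass approximation theorem, we can find a polynomial $p : [a, b] \to \mathbb{R}$ such that
\[ \|f - p\| = \sup_{x \in [a, b]} |f(x) - p(x)| < \frac{\varepsilon}{2(b - a)}. \]
Since $p$ is $C^1$, we can find a $\lambda_0 > 0$ such that
\[ \left| \int_a^b p(x) \cos \lambda x\,dx \right| \le \frac{\varepsilon}{2}, \quad \text{for all } \lambda \ge \lambda_0. \]
We have
\begin{align*}
\left| \int_a^b f(x) \cos \lambda x\,dx \right| &= \left| \int_a^b (f - p)(x) \cos \lambda x\,dx + \int_a^b p(x) \cos \lambda x\,dx \right| \\
&\le \left| \int_a^b (f - p)(x) \cos \lambda x\,dx \right| + \left| \int_a^b p(x) \cos \lambda x\,dx \right| \\
&\le \int_a^b |f(x) - p(x)|\,dx + \left| \int_a^b p(x) \cos \lambda x\,dx \right| \\
&< (b - a) \frac{\varepsilon}{2(b - a)} + \frac{\varepsilon}{2} = \varepsilon, \quad \text{for all } \lambda \ge \lambda_0.
\end{align*}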
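It remains to prove the case when $f$ is only piecewise continuous. Suppose that $f$ has jump discontinuities at $d_1 < d_2 < \cdots < d_{N-1}$, where $a < d_1$ and $d_{N-1} < b$. We give two proofs.
Method 1. Set $d_0 = a$, $d_N = b$. We have
\[ \int_a^b f(x) \cos \lambda x\,dx = \sum_{n=0}^{N-1} \int_{d_n}^{d_{n+1}} f_n(x) \cos \lambda x\,dx, \]
where $f_n$ is defined as in Remarks 5.5.9(2). Since $f_n$ is continuous on $[d_n, d_{n+1}]$, we have $\lim_{\lambda \to \infty} \int_{d_n}^{d_{n+1}} f_n(x) \cos \lambda x\,dx = 0$, $n = 0, \ldots, N - 1$. The result follows.
Method 2. Given $\varepsilon > 0$, we can approximate $f$ by a continuous function $f_\varepsilon$ so that $\int_a^b |f(x) - f_\varepsilon(x)|\,dx < \varepsilon$ (see Fig. 5.3). We have
\[ \int_a^b f(x) \cos \lambda x\,dx = \int_a^b f_\varepsilon(x) \cos \lambda x\,dx + \int_a^b (f(x) - f_\varepsilon(x)) \cos \lambda x\,dx. \]
Therefore $|\int_a^b f(x) \cos \lambda x\,dx| \le |\int_a^b f_\varepsilon(x) \cos \lambda x\,dx| + \varepsilon$ and, since the lemma holds for the continuous function $f_\varepsilon$,
\[ \limsup_{\lambda \to \infty} \left| \int_a^b f(x) \cos \lambda x\,dx \right| \le \varepsilon. \]
Since this holds for all $\varepsilon > 0$, the result follows. □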
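The $C^1$ estimate in the proof, $|\int_a^b f(x)\cos\lambda x\,dx| \le 2C/\lambda + C(b-a)/\lambda$, can be observed numerically. The sketch below is an illustration only: it takes the sample function $f(x) = x$ on $[0, 1]$, for which $C = 1$ bounds both $|f|$ and $|f'|$, and uses a midpoint rule. The function name `oscillatory_integral`, the sample function and the quadrature choice are all our own assumptions.

```python
import math

def oscillatory_integral(f, a, b, lam, samples=100000):
    """Midpoint-rule approximation of integral_a^b f(x) * cos(lam * x) dx."""
    h = (b - a) / samples
    total = 0.0
    for k in range(samples):
        x = a + (k + 0.5) * h
        total += f(x) * math.cos(lam * x)
    return total * h

def identity(x):
    return x

C, a, b = 1.0, 0.0, 1.0
for lam in (10.0, 100.0, 1000.0):
    value = oscillatory_integral(identity, a, b, lam)
    bound = 2.0 * C / lam + C * (b - a) / lam   # bound from the proof
    print(lam, value, bound)
```

For this particular $f$ the integral has the closed form $\sin\lambda/\lambda + (\cos\lambda - 1)/\lambda^2$, which decays like $1/\lambda$ and stays within the bound.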
Remarks 5.5.21
(1) It is easy to extend the Riemann–Lebesgue Lemma to bounded functions on $[a, b]$ with finite or countably many discontinuities. The proof follows Method 2 of the proof of Lemma 5.5.20. Stronger versions hold if we use the Lebesgue integral rather than the Riemann integral.
(2) It is useful to have a slightly stronger version of the Riemann–Lebesgue lemma that holds for continuous families of continuous functions. Suppose $f : [a, b] \times [c, d] \to \mathbb{R}$ is continuous and set $f_\mu(x) = f(x, \mu)$, $x \in [a, b]$, $\mu \in [c, d]$. If we define $C(\lambda, \mu) = \int_a^b f_\mu(x) \cos \lambda x\,dx$, $S(\lambda, \mu) = \int_a^b f_\mu(x) \sin \lambda x\,dx$, then given $\varepsilon > 0$, there exists a $\lambda_0$ such that
\[ |C(\lambda, \mu)|,\ |S(\lambda, \mu)| < \varepsilon, \]
for all $\lambda \ge \lambda_0$ and $\mu \in [c, d]$. The proof is the same as that given above except that we use the Weierstrass approximation theorem for continuous families, Theorem 5.3.10. We can even allow for jump discontinuities at points $d_1, d_2, \ldots, d_{N-1} \in (a, b)$ provided we assume (say) that the discontinuity points do not depend on the parameter $\mu$.
[Fig. 5.3 Approximating a piecewise continuous function by a continuous function. The graphs of $f$ and $f_\varepsilon$ are shown over $[a, b]$, with a jump of $f$ at $d_1$.]
5.5.3 Integral Formula for Partial Sums of a Fourier Series
Definition 5.5.22 Let $n \ge 0$. The $n$th Dirichlet kernel $D_n(x)$ is defined by
\[ D_n(x) = 1 + 2 \sum_{j=1}^{n} \cos jx, \quad x \in \mathbb{R}. \]
The collection $\{D_n \mid n \ge 1\}$ is called the Dirichlet kernel.
The next lemma gives two elementary but useful properties of the Dirichlet kernel.
Lemma 5.5.23 If $n \ge 0$, then
(1) $\int_0^{2\pi} D_n(x)\,dx = \int_{-\pi}^{\pi} D_n(x)\,dx = 2\pi$,
(2) $D_n(x) = \dfrac{\sin((n + \frac{1}{2})x)}{\sin \frac{x}{2}}$, if $x$ is not an integer multiple of $2\pi$.
Proof The first statement is an immediate consequence of the orthogonality relations. Next, from the trigonometric identities in the appendix to Chap. 3, we have
\[ 1 + 2 \sum_{j=1}^{n} \cos jx = 1 + \frac{2 \cos(\frac{n+1}{2} x) \sin(\frac{nx}{2})}{\sin(\frac{x}{2})}. \]
Since $2 \cos A \sin B = \sin(A + B) - \sin(A - B)$, it is easy to verify that the right-hand side is equal to $\dfrac{\sin((n + \frac{1}{2})x)}{\sin \frac{x}{2}}$. □
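Both parts of Lemma 5.5.23 can be verified numerically. The Python sketch below (standard library only) compares the defining cosine sum with the closed form and integrates $D_n$ over $[-\pi, \pi]$ by a midpoint rule. The function names `dirichlet_sum`, `dirichlet_closed` and `integral_over_period` are illustrative choices of ours.

```python
import math

def dirichlet_sum(n, x):
    """D_n(x) from the definition 1 + 2 * sum_{j=1}^{n} cos(j*x)."""
    return 1.0 + 2.0 * sum(math.cos(j * x) for j in range(1, n + 1))

def dirichlet_closed(n, x):
    """Closed form sin((n + 1/2) x) / sin(x / 2); x not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2.0)

def integral_over_period(n, samples=512):
    """Midpoint-rule approximation of integral_{-pi}^{pi} D_n(x) dx."""
    h = 2.0 * math.pi / samples
    return h * sum(dirichlet_sum(n, -math.pi + (k + 0.5) * h)
                   for k in range(samples))

n = 5
for x in (0.3, 1.0, 2.5, -2.0):
    print(x, dirichlet_sum(n, x), dirichlet_closed(n, x))
print(integral_over_period(n), 2.0 * math.pi)
```

Evaluating `dirichlet_closed(n, x)` for small non-zero `x` also illustrates the limit $D_n(x) \to 2n + 1$ as $x \to 0$.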
Remarks 5.5.24
(1) By Lemma 5.5.23(2), or the definition of $D_n(x)$, we have $\lim_{x \to 0} D_n(x) = 2n + 1$.
(2) As we shall soon see the function $D_n(x)$ plays an important role in the convergence theory of Fourier series. The integral $\int_{-\pi}^{\pi} |D_n(x)|\,dx$ grows like $\log n$ and this lack of convergence is reflected in the fact that the Fourier series of a continuous function may not converge pointwise at every point. It can be shown that the Fourier series of a continuous function does converge at 'most' points. However, the proof of this result, due to Carleson (1966), is hard. As we shall see, adding a little regularity to the function improves the convergence properties of the Fourier series.
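Suppose $f : \mathbb{R} \to \mathbb{R}$ is a piecewise continuous $2\pi$-periodic function. For $n \ge 0$ define the partial sums
\[ S_n(f)(x) = a_0 + \sum_{j=1}^{n} (a_j \cos jx + b_j \sin jx), \]
where $a_j, b_j$ are the Fourier coefficients of $f$.
Lemma 5.5.25 (Partial Sum Formula) (Notation as above.) For $n \ge 0$ we have
\begin{align*}
S_n(f)(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(t - x)\,dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) D_n(t)\,dt \tag{5.7} \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x + t) D_n(t)\,dt.
\end{align*}
Proof We have
\begin{align*}
S_n(f)(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,dt + \sum_{j=1}^{n} \frac{1}{\pi} \left[ \int_{-\pi}^{\pi} f(t) \cos jt\,dt \right] \cos jx + \sum_{j=1}^{n} \frac{1}{\pi} \left[ \int_{-\pi}^{\pi} f(t) \sin jt\,dt \right] \sin jx \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \Bigl[ 1 + 2 \sum_{j=1}^{n} \cos(j(x - t)) \Bigr]\,dt,
\end{align*}
where we have used the trigonometric identity $\cos(A - B) = \cos A \cos B + \sin A \sin B$. Hence, by definition of $D_n$, we have
\[ S_n(f)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(t - x)\,dt, \]
since $D_n$ is even. For the second formula (the only formula we use in the sequel), we make the substitution $u = x - t$ to obtain
\begin{align*}
S_n(f)(x) &= -\frac{1}{2\pi} \int_{x + \pi}^{x - \pi} f(x - u) D_n(u)\,du \\
&= \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} f(x - u) D_n(u)\,du \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - u) D_n(u)\,du,
\end{align*}
where the last step uses the $2\pi$-periodicity of the integrand. The proof of the third formula is similar and left as an exercise. □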
Theorem 5.5.26 Let $f : \mathbb{R} \to \mathbb{R}$ be a $2\pi$-periodic piecewise continuous function and let $x_0 \in \mathbb{R}$. Set $f(x_0+) = \lim_{x \to x_0+} f(x)$, $f(x_0-) = \lim_{x \to x_0-} f(x)$ and assume that
\[ D_R = \lim_{t \to 0+} \frac{f(x_0 + t) - f(x_0+)}{t}, \qquad D_L = \lim_{t \to 0-} \frac{f(x_0 + t) - f(x_0-)}{t} \]
exist. Then the Fourier series of $f$ is convergent at $x_0$ and
\[ \mathcal{F}(f)(x_0) = \frac{1}{2} \bigl[ f(x_0-) + f(x_0+) \bigr]. \]
In particular, if $f$ is continuous and piecewise differentiable on $\mathbb{R}$ then $\mathcal{F}(f)(x)$ converges to $f(x)$ for all $x \in \mathbb{R}$.
Before we start the proof of the theorem we need a technical lemma that allows us to use the differentiability properties of $f$ at $x_0$.
Lemma 5.5.27 Assume $f$ satisfies the conditions of Theorem 5.5.26 and define $g : [-\pi, \pi] \to \mathbb{R}$ by
\[ g(t) = \begin{cases} \dfrac{f(x_0 - t) - f(x_0-)}{\sin \frac{t}{2}}, & t > 0, \\[2mm] \dfrac{f(x_0 - t) - f(x_0+)}{\sin \frac{t}{2}}, & t < 0, \\[2mm] -(D_L + D_R), & t = 0, \end{cases} \]
then $\lim_{t \to 0+} g(t) = -2D_L$, $\lim_{t \to 0-} g(t) = -2D_R$. In particular,
(a) $g$ is piecewise continuous on $[-\pi, \pi]$ with a discontinuity at $t = 0$ if $D_L \ne D_R$.
(b) If $f$ is differentiable at $x_0$, then $g$ is continuous at zero and $g(0) = -2f'(x_0)$.
Proof For $t > 0$, we have
\[ \frac{f(x_0 - t) - f(x_0-)}{\sin \frac{t}{2}} = 2\, \frac{f(x_0 - t) - f(x_0-)}{t} \cdot \frac{\frac{t}{2}}{\sin \frac{t}{2}} \to -2D_L, \quad \text{as } t \to 0+. \]
The same argument shows that $\lim_{t \to 0-} g(t) = -2D_R$. □
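Proof of Theorem 5.5.26 We start by observing that since $D_n(t)$ is even we have by Lemma 5.5.23(1)
\[ \frac{1}{2} \bigl[ f(x_0-) + f(x_0+) \bigr] = \frac{1}{2\pi} \left[ \int_0^{\pi} D_n(t) f(x_0-)\,dt + \int_{-\pi}^{0} D_n(t) f(x_0+)\,dt \right]. \]
By the partial sum formula (5.7), we see easily that
\[ S_n(f)(x_0) - \frac{1}{2} \bigl[ f(x_0-) + f(x_0+) \bigr] = I_- + I_+, \]
where
\[ I_- = \frac{1}{2\pi} \int_0^{\pi} \frac{f(x_0 - t) - f(x_0-)}{\sin(\frac{t}{2})} \sin\Bigl( \bigl( n + \tfrac{1}{2} \bigr) t \Bigr)\,dt, \]
\[ I_+ = \frac{1}{2\pi} \int_{-\pi}^{0} \frac{f(x_0 - t) - f(x_0+)}{\sin(\frac{t}{2})} \sin\Bigl( \bigl( n + \tfrac{1}{2} \bigr) t \Bigr)\,dt. \]
Hence
\[ I_- + I_+ = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t) \sin\Bigl( \bigl( n + \tfrac{1}{2} \bigr) t \Bigr)\,dt, \]
where $g$ is piecewise continuous on $[-\pi, \pi]$ by Lemma 5.5.27.
We have $\lim_{n \to \infty} \int_{-\pi}^{\pi} g(t) \sin((n + \frac{1}{2})t)\,dt = 0$ (by the Riemann–Lebesgue lemma) and so $\lim_{n \to \infty} S_n(f)(x_0) = (f(x_0-) + f(x_0+))/2$. □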
Example 5.5.28 Let $S$ be the $2\pi$-periodic square wave function with Fourier series $\frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin((2n+1)x)}{2n+1}$ (Example 5.5.13). As a result of Theorem 5.5.26, we see that
\[ \frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin((2n+1)x)}{2n+1} = \begin{cases} 1, & \text{if } x \in \bigcup_{n \in \mathbb{Z}} (2n\pi, (2n+1)\pi), \\ -1, & \text{if } x \in \bigcup_{n \in \mathbb{Z}} ((2n+1)\pi, (2n+2)\pi), \\ 0, & \text{if } x \text{ is an integer multiple of } \pi. \end{cases} \]
Theorem 5.5.29 If $f : \mathbb{R} \to \mathbb{R}$ is continuous, $2\pi$-periodic and piecewise $C^1$, then the Fourier series of $f$ converges uniformly to $f$.
Proof Suppose first that $f$ is $C^1$. Then, as in the proof of Theorem 5.5.26, we may write $S_n(f)(x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x, t) \sin((n + \frac{1}{2})t)\,dt$, where $g(x, t)$ is continuous. We regard $g(x, t) = g_x(t)$ as a continuous family of continuous functions and apply the Riemann–Lebesgue lemma for families (Remark 5.5.21(2)) to get the required estimate for uniform convergence. If we assume that $f$ is continuous and piecewise $C^1$, then the same argument gives the uniform estimates needed for the Riemann–Lebesgue lemma on any closed interval not containing a discontinuity of $f'$ as an interior point. □
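Theorem 5.5.29 can be illustrated with the continuous, piecewise $C^1$ function that equals $|x|$ on $[-\pi, \pi]$ and is extended $2\pi$-periodically. The sketch below assumes the standard expansion $|x| = \frac{\pi}{2} - \frac{4}{\pi} \sum_{k \ge 0} \frac{\cos((2k+1)x)}{(2k+1)^2}$ on $[-\pi, \pi]$, which is not derived in the text, and estimates the sup-norm error of its partial sums on a grid. The function names `abs_partial_sum` and `sup_error` are hypothetical.

```python
import math

def abs_partial_sum(terms, x):
    """Partial sum pi/2 - (4/pi) * sum_{k < terms} cos((2k+1)x) / (2k+1)**2."""
    s = sum(math.cos((2 * k + 1) * x) / (2 * k + 1) ** 2 for k in range(terms))
    return math.pi / 2.0 - (4.0 / math.pi) * s

def sup_error(terms, grid=400):
    """Largest error |abs(x) - partial sum| over a uniform grid on [-pi, pi]."""
    xs = [-math.pi + 2.0 * math.pi * i / grid for i in range(grid + 1)]
    return max(abs(abs(x) - abs_partial_sum(terms, x)) for x in xs)

for terms in (5, 20, 200):
    print(terms, sup_error(terms))
```

The grid maximum shrinks as the number of terms grows, consistent with uniform convergence; a finite grid only gives a lower estimate of the true supremum.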
5.5.4 Failure of Uniform Convergence: Gibbs Phenomenon
The Gibbs phenomenon is the appearance of quite large oscillations in the partial sums $S_n(x)$ to the left and right of a jump discontinuity. The resulting 'overshoot' in the partial sums does not die out as $n \to \infty$. We illustrate the phenomenon with an investigation of the convergence properties of the Fourier series of the square wave function.
The $2\pi$-periodic square wave $S(x)$ defined in Examples 5.5.10(2) does not satisfy the conditions of Theorem 5.5.29 as $S$ has discontinuities at integer multiples of $\pi$. We showed in Example 5.5.13 that $S$ has Fourier series
\[ \mathcal{F}(S) = \frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin((2n+1)x)}{2n+1}, \]
and in Example 5.5.28 that the series converges pointwise to $S(x)$ except if $x$ is an integer multiple of $\pi$. Since the pointwise limit of $\mathcal{F}(S)$ is not continuous, convergence of $\mathcal{F}(S)$ cannot be uniform. On the other hand, a straightforward application of Dirichlet's test shows that convergence of $\mathcal{F}(S)$ is uniform on every closed interval $[a, b]$ which does not contain an integer multiple of $\pi$ (see the section on Dirichlet and Abel's tests in Chap. 4, especially Examples 4.6.3(2)).
We have $S_n(S)(x) = \frac{4}{\pi} \sum_{j=1}^{n} \frac{1}{2j-1} \sin((2j-1)x)$. Taking $x = \frac{\pi}{2n}$, we compute that
\[ S_n(S)\Bigl( \frac{\pi}{2n} \Bigr) = \frac{4}{\pi} \sum_{j=1}^{n} \frac{1}{2j-1} \sin\Bigl( \frac{(2j-1)\pi}{2n} \Bigr) = 2 \sum_{j=1}^{n} \frac{1}{n}\, G\Bigl( \frac{2j-1}{2n} \Bigr), \]
where $G(0) = 1$ and $G(x) = \frac{\sin \pi x}{\pi x}$, $x \ne 0$. Now $\sum_{j=1}^{n} \frac{1}{n} G(\frac{2j-1}{2n})$ is an approximating Riemann sum to
\[ \int_0^1 G(x)\,dx = \int_0^1 \frac{\sin \pi x}{\pi x}\,dx = \frac{1}{\pi} \int_0^{\pi} \frac{\sin u}{u}\,du. \]
($\sum_{j=1}^{n} \frac{1}{n} G(\frac{2j-1}{2n})$ is the sum from $j = 0, \ldots, n - 1$ of the value of $G$ at the mid-point of $[j/n, (j+1)/n]$ times the length of the interval, $1/n$.) Since $G$ is continuous, we therefore have
\[ \lim_{n \to \infty} S_n(S)\Bigl( \frac{\pi}{2n} \Bigr) = \frac{2}{\pi} \int_0^{\pi} \frac{\sin u}{u}\,du. \]
The integral $\int_0^{\pi} \frac{\sin u}{u}\,du$ may be computed numerically and has approximate value 1.8519. Hence $\lim_{n \to \infty} S_n(S)(\frac{\pi}{2n}) \approx 3.7038/\pi \approx 1.179$. The jump in $S$ at $x = 0$ is equal to 2 and so we see that
\[ \lim_{n \to \infty} S_n(S)\Bigl( \frac{\pi}{2n} \Bigr) \approx 1.179 \approx 1 + 0.0895 \times 2. \]
Hence the overshoot is approximately 8.95% of the jump at $x = 0$.
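The limiting value of the overshoot can be reproduced directly from the partial sums. The following Python sketch (standard library only) evaluates $S_n(S)(\pi/2n) = \frac{4}{\pi}\sum_{j=1}^{n} \frac{\sin((2j-1)x)}{2j-1}$ at $x = \pi/2n$ for increasing $n$. The function name `square_wave_partial_sum` is our own choice.

```python
import math

def square_wave_partial_sum(n, x):
    """(4/pi) * sum_{j=1}^{n} sin((2j-1) x) / (2j-1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * j - 1) * x) / (2 * j - 1) for j in range(1, n + 1)
    )

for n in (10, 100, 1000):
    peak = square_wave_partial_sum(n, math.pi / (2 * n))
    overshoot = (peak - 1.0) / 2.0   # as a fraction of the jump, which is 2
    print(n, peak, overshoot)
```

The printed peak values settle near 1.179 rather than approaching 1, which is the non-vanishing overshoot described above.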
Remarks 5.5.30
(1) The overshoot described above is a universal phenomenon: whenever there is a jump discontinuity in a piecewise $C^1$ function, there will be an overshoot in the partial sums near the discontinuity and this overshoot in the limit is approximately 8.9490% of the jump at the discontinuity. The phenomenon was originally described (partly incorrectly) by Gibbs in 1848 and later corrected by him in 1898.
(2) It is worth remarking that the rate at which the Fourier coefficients of a function $f$ converge to zero depends on the smoothness of $f$. If $f$ is $C^\infty$ (or analytic) the coefficients decay very rapidly (see Exercises 5.5.33(12)). If the function is only piecewise continuous then the series of coefficients is never absolutely convergent (else the M-test would imply convergence to a continuous function).
5.5.5 The Infinite Product Formula for sin x
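We show how to derive the infinite product formula $\sin x = x\prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2\pi^2}\right)$, $x \in \mathbb{R}$,
using methods based on Fourier series.
We start by finding the Fourier series of the $2\pi$-periodic continuous piecewise
$C^1$ function $f$ on $\mathbb{R}$ defined by
$$f(x) = \cos\frac{\lambda x}{\pi}, \quad x \in [-\pi,\pi],$$
where we shall assume $\lambda$ is not an integer multiple of $\pi$. Since $\cos\frac{\lambda x}{\pi}$ is even, $f$ is an
even function of $x$ and so all the Fourier sine coefficients $b_n = 0$. We have
$$a_0 = \frac{1}{\pi}\int_0^{\pi}\cos\frac{\lambda x}{\pi}\,dx = \frac{\sin\lambda}{\lambda}$$
and
$$\begin{aligned}
a_n &= \frac{2}{\pi}\int_0^{\pi}\cos\frac{\lambda x}{\pi}\cos nx\,dx \\
&= \frac{1}{\pi}\int_0^{\pi}\left[\cos\left(\frac{\lambda}{\pi}+n\right)x + \cos\left(\frac{\lambda}{\pi}-n\right)x\right]dx \\
&= \frac{1}{\pi}\left[\frac{\sin\left(\frac{\lambda}{\pi}+n\right)x}{\frac{\lambda}{\pi}+n} + \frac{\sin\left(\frac{\lambda}{\pi}-n\right)x}{\frac{\lambda}{\pi}-n}\right]_{x=0}^{x=\pi} \\
&= \frac{1}{\pi}\left[\frac{\sin(\lambda+n\pi)}{\frac{\lambda}{\pi}+n} + \frac{\sin(\lambda-n\pi)}{\frac{\lambda}{\pi}-n}\right] \\
&= \frac{(-1)^n\sin\lambda}{\lambda+n\pi} + \frac{(-1)^n\sin\lambda}{\lambda-n\pi} \\
&= (-1)^n\,\frac{2\lambda\sin\lambda}{\lambda^2-n^2\pi^2}.
\end{aligned}$$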
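Since $f$ is continuous and piecewise differentiable, the Fourier series of $f$ converges
pointwise to $f$ and so for all $x \in [-\pi,\pi]$ we have
$$\cos\frac{\lambda x}{\pi} = \frac{\sin\lambda}{\lambda} + \sum_{n=1}^{\infty}(-1)^n\,\frac{2\lambda\sin\lambda}{\lambda^2-n^2\pi^2}\cos nx.$$
If we take $x = \pi$ and divide both sides by $\sin\lambda$ we get the partial fraction expansion
for $\cot\lambda$:
$$\cot\lambda = \frac{1}{\lambda} + \sum_{n=1}^{\infty}\frac{2\lambda}{\lambda^2-n^2\pi^2}, \quad \text{for all } \lambda \in \mathbb{R}\setminus\pi\mathbb{Z}.$$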
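The partial fraction expansion of the cotangent can be sanity-checked by summing finitely many terms. This is only a sketch: the function name and the truncation level are assumptions made for illustration, and the truncated tail is of size roughly $2|\lambda|/(\pi^2 N)$ after $N$ terms.

```python
import math


def cot_partial_fractions(lam: float, terms: int = 100_000) -> float:
    """Truncated sum 1/lam + sum_{n=1}^{terms} 2*lam / (lam^2 - n^2 pi^2)."""
    total = 1.0 / lam
    for n in range(1, terms + 1):
        total += 2.0 * lam / (lam * lam - (n * math.pi) ** 2)
    return total


for lam in (1.0, 2.5):
    exact = math.cos(lam) / math.sin(lam)
    print(lam, cot_partial_fractions(lam), exact)
```

The series converges slowly (the terms decay like $1/n^2$), which is why a large number of terms is used.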
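Let $0 < \varepsilon \le x < \pi$. We have
$$\int_{\varepsilon}^{x}\left(\cot\lambda - \frac{1}{\lambda}\right)d\lambda = \int_{\varepsilon}^{x}\sum_{n=1}^{\infty}\frac{2\lambda}{\lambda^2-n^2\pi^2}\,d\lambda = \sum_{n=1}^{\infty}\int_{\varepsilon}^{x}\frac{2\lambda}{\lambda^2-n^2\pi^2}\,d\lambda,$$
since it follows easily from the M-test that the series $\sum_{n=1}^{\infty}\frac{2\lambda}{\lambda^2-n^2\pi^2}$ is uniformly
convergent on $[\varepsilon,x]$ (this requires $\lambda^2 - n^2\pi^2 \neq 0$ on $[\varepsilon,x]$, which is so since
$0 < \varepsilon \le x < \pi$). Integrating, we see that
$$\Big[\log(\sin\lambda) - \log\lambda\Big]_{\lambda=\varepsilon}^{\lambda=x} = \sum_{n=1}^{\infty}\Big[\log(n^2\pi^2-\lambda^2)\Big]_{\lambda=\varepsilon}^{\lambda=x}.$$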
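Evaluating, we obtain the identity
$$\log\frac{\sin x}{x} - \log\frac{\sin\varepsilon}{\varepsilon} = \sum_{n=1}^{\infty}\log\frac{n^2\pi^2-x^2}{n^2\pi^2-\varepsilon^2}.$$
Letting $\varepsilon \to 0+$, we get
$$\log\frac{\sin x}{x} = \sum_{n=1}^{\infty}\log\left(1-\frac{x^2}{n^2\pi^2}\right).$$
Exponentiating this expression and multiplying both sides by $x$ gives
$$\sin x = x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right), \quad x \in [0,\pi).$$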
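Since both $\sin x$ and $x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right)$ are odd functions and vanish at $x = \pm\pi$, we
have $\sin x = x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right)$ on $[-\pi,\pi]$. Finally, $\sin x$ and $x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right)$
are both $2\pi$-periodic (if $G(x)$ denotes the infinite product, it is enough to show
$G(x+\pi) = -G(x)$; see the exercises at the end of the section). Hence $\sin x =
x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right)$ for all $x \in \mathbb{R}$.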
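As a quick numerical illustration of the product formula, one can compare truncated products with $\sin x$, including at points outside $[-\pi,\pi]$. The sketch below is illustrative only; the function name and the number of factors are assumptions, and the relative error after $N$ factors is of order $x^2/(\pi^2 N)$.

```python
import math


def sine_product(x: float, terms: int = 100_000) -> float:
    """Truncated product x * prod_{n=1}^{terms} (1 - x^2 / (n^2 pi^2))."""
    value = x
    for n in range(1, terms + 1):
        value *= 1.0 - (x * x) / ((n * math.pi) ** 2)
    return value


for x in (0.5, 2.0, 7.0):
    print(x, sine_product(x), math.sin(x))
```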
Remark 5.5.31 This proof only applies when $x \in \mathbb{R}$. The proof we gave in Chap. 3
holds for $x \in \mathbb{C}$. Note that the infinite product converges by Lemmas 3.9.4, 3.9.6 (or
Proposition 3.9.10).
Example 5.5.32 (Wallis’ Formula for $\pi$) Dividing the infinite product formula for
$\sin x$ by $x$ and taking $x = \pi/2$ gives the identity
$$\frac{2}{\pi} = \prod_{n=1}^{\infty}\left(1-\frac{1}{4n^2}\right) = \prod_{n=1}^{\infty}\frac{4n^2-1}{4n^2}.$$
Taking the reciprocal of both sides and noting that $\frac{4n^2}{4n^2-1} = \frac{(2n)(2n)}{(2n-1)(2n+1)}$ gives Wallis’
formula for $\pi/2$:
$$\frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{(2n)(2n)}{(2n-1)(2n+1)} = \frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6\cdot 8\cdot 8\cdots}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7\cdot 7\cdot 9\cdots}.$$
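Wallis’ formula converges slowly, which is easy to see by computing partial products. The following is a small sketch; the function name is hypothetical, and the observation that the error behaves like a constant times $1/n$ is a standard fact rather than something stated in the text.

```python
import math


def wallis_partial_product(n: int) -> float:
    """Product of (2k)(2k) / ((2k-1)(2k+1)) for k = 1, ..., n."""
    value = 1.0
    for k in range(1, n + 1):
        value *= (2 * k) * (2 * k) / ((2 * k - 1) * (2 * k + 1))
    return value


for n in (10, 1000, 100_000):
    print(n, wallis_partial_product(n), math.pi / 2)
```

Every factor exceeds 1, so the partial products increase towards $\pi/2$ from below.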
EXERCISES 5.5.33
(1) Show that every $2\pi$-periodic polynomial $p : \mathbb{R} \to \mathbb{R}$ is constant.
(2) Let $f$ be continuous and $2\pi$-periodic. Using the second Weierstrass approximation
theorem, show that if all the Fourier coefficients of $f$ are zero then
$f \equiv 0$ (Theorem 5.5.16).
(3) Show that the Riemann–Lebesgue lemma holds on $[0,b]$ if $f(x) = 1/\sqrt{x}$.
(4) Let $f$ be a piecewise continuous $2\pi$-periodic function. Show that the Fourier
coefficients $a_n, b_n$ of $f$ converge to zero as $n \to \infty$.
(5) Extend Theorem 5.5.16 and Proposition 5.3.6 to piecewise continuous functions.
(Hint: use the second method given at the end of the proof of the
Riemann–Lebesgue Lemma 5.5.20.) Show that the same method shows that
these results also hold if we only assume $f$ is (a) bounded, and (b) has finitely
many discontinuities.
(6) Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and $2\pi$-periodic with Fourier series $\mathcal{F}(f) =
a_0 + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx)$. Suppose that (A) $\mathcal{F}(f)$ converges at one point
of $\mathbb{R}$, (B) the series $\sum_{n=1}^{\infty}(-na_n\sin nx + nb_n\cos nx)$ is uniformly convergent
on $\mathbb{R}$. Show that $f$ is $C^1$ and that $\mathcal{F}(f)$ converges uniformly to $f$ on $\mathbb{R}$. (Hints:
$\mathcal{F}(\mathcal{F}(f)) = \mathcal{F}(f)$ and the result of (2) above).
(7) Define the continuous $2\pi$-periodic function $T : \mathbb{R} \to \mathbb{R}$ by
$$T(x) = |x|, \quad x \in [-\pi,\pi].$$
(a) Sketch the graph of $T$.
(b) Find the Fourier series of $T$.
(c) Does the Fourier series converge pointwise to $T$? Uniformly on $\mathbb{R}$ to $T$?
Why/Why not?
(8) Define the piecewise continuous $2\pi$-periodic function $S : \mathbb{R} \to \mathbb{R}$ on $[-\pi,\pi]$ by
$$S(x) = \begin{cases} x, & x \in (-\pi,\pi),\\ 0, & x = \pm\pi. \end{cases}$$
(a) Sketch the graph of $S$.
(b) Find the Fourier series of $S$.
(c) Does the Fourier series converge pointwise to $S$? Uniformly on $\mathbb{R}$ to $S$?
If you have access to a program like Maple or Matlab, plot the graphs of
the partial sums $S_{20}(S)$ and $S_{50}(S)$ over the range $[-3\pi,3\pi]$ and estimate the
overshoot as a percentage of the jump $2\pi$.
(9) Show that the Fourier series of the $2\pi$-periodic sawtooth function defined by
$$S(x) = \begin{cases} \dfrac{\pi - x}{2}, & x \in (0,2\pi),\\ 0, & x = 0, 2\pi \end{cases}$$
is given by
$$\mathcal{F}(S) = \sum_{n=1}^{\infty}\frac{\sin nx}{n}.$$
(a) Show the partial sums $S_n$ of the Fourier series of $S$ satisfy
$$S_n(x) = \frac{1}{2}\int_0^x D_n(t)\,dt - \frac{x}{2}.$$
(b) Using the approximation $\sin t \approx t$, for $t$ small, deduce that there exists a
$C \ge 0$ (independent of $n$) such that
$$S_n(x) = \int_0^x \frac{\sin\left(n+\frac{1}{2}\right)t}{t}\,dt + e_n(x),$$
where $|e_n(x)| \le Cx$, $x \in (0,2\pi)$.
(c) Take $x = \pi/(n+\frac{1}{2})$ in (b) and deduce that
$$S_n\left(\frac{\pi}{n+\frac{1}{2}}\right) = \int_0^{\pi}\frac{\sin u}{u}\,du + \alpha_n,$$
where $|\alpha_n| \le c/n$. Using the approximate value 1.852 for $\int_0^{\pi}\frac{\sin u}{u}\,du$,
deduce that $S_n(\pi/(n+\frac{1}{2})) \approx \pi/2 + 0.09\pi$ for $n$ large. That is, for large
$n$ the overshoot is (at least) 9% of the size of the jump at the discontinuity.
(10) Let $F : \mathbb{R} \to \mathbb{R}$ be the piecewise continuous $2\pi$-periodic function defined on
$[-\pi,\pi]$ by
$$F(x) = \begin{cases} 0, & \text{if } x \in [-\pi,-\pi/2] \cup [\pi/2,\pi],\\ \pi, & \text{if } x \in (-\pi/2,\pi/2). \end{cases}$$
(a) Find the Fourier series of $F$.
(b) At what points of $[-\pi,\pi]$ does $\mathcal{F}(F)$ converge pointwise to $F$? If $\mathcal{F}(F)$
does not converge pointwise to $F$ at $x_0$, what is $\mathcal{F}(F)(x_0)$? Are your
answers consistent with Theorem 5.5.26?
(c) Using (a,b), find $\sum_{n=1}^{\infty}(-1)^{n+1}\frac{1}{2n-1}$.
(11) Suppose that the $2\pi$-periodic continuous function $f : \mathbb{R} \to \mathbb{R}$ has Fourier
series $\mathcal{F}(f) = a_0 + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx)$. Assuming that (a) $\mathcal{F}(f)$
converges at at least one point, and (b) $\sum_{n=1}^{\infty}(-na_n\sin nx + nb_n\cos nx)$ is
uniformly convergent, explain why the series $\mathcal{F}(f)$ converges (uniformly) to $f$
on $\mathbb{R}$.
(12) Suppose $f : \mathbb{R} \to \mathbb{R}$ is $2\pi$-periodic and $C^\infty$. Show that the Fourier coefficients
of $f$ decay faster than any power of $1/n$. Specifically, show that for each $m \ge 1$,
there exists a $C_m \ge 0$ such that $|a_n|, |b_n| \le C_m n^{-m}$, for all $n \ge 1$. Conversely,
show that if $f$ is continuous and this condition holds then $f$ is $C^\infty$. (The decay
is exponentially fast if $f$ is analytic.)
(13) Show that $G(x) = x\prod_{n=1}^{\infty}\left(1-\frac{x^2}{n^2\pi^2}\right)$ is $2\pi$-periodic. (Hint: We know that
$G(x) = \sin(x)$ on $[-\pi,\pi]$ and so it suffices to prove $G$ is $2\pi$-periodic.
Show that $2\pi$-periodicity follows from $G(x+\pi) = -G(x)$. Let $G_k(x) =
x\prod_{n=1}^{k}\left(1-\frac{x^2}{n^2\pi^2}\right)$. Find a simple expression for $G_k(x+\pi)/G_k(x)$, $x \neq n\pi$,
and let $k \to \infty$.)
(14) Show that Wallis’ formula implies that $\lim_{n\to\infty}\frac{4^{2n}(n!)^4}{[(2n)!]^2(2n+1)} = \frac{\pi}{2}$. Deduce
that $\sqrt{\pi} = \lim_{n\to\infty}\frac{(n!)^2\,2^{2n}}{(2n)!\sqrt{n}}$. (These formulas can also be found by evaluating
$I_p = \int_0^{\pi/2}\sin^p x\,dx$ and then finding $\lim_{n\to\infty} I_{2n}/I_{2n+1}$.)
(15) There is an infinite product for cosine: $\cos x = \prod_{n=1}^{\infty}\left(1-\frac{4x^2}{(2n-1)^2\pi^2}\right)$. Take
logs and differentiate to find a partial fraction series for $\tan x$. Can you derive the
cosine product from the sine product using the identity $\cos x = \sin(\frac{\pi}{2} - x)$?
(See the previous exercise for hints.)
(16) Derive the infinite product for $\cos x$ using the trigonometric identity $\sin 2x =
2\sin x\cos x$ and the infinite product for $\sin x$.
(17) Assuming that the product formula for $\sin x$ is valid for all $x \in \mathbb{C}$ (it is, see
Chap. 3), find an infinite product formula for $\sinh x$.
(18) Let $(x_n)$ be a sequence of points in $[0,1]$. Given $N \in \mathbb{N}$ and $0 \le a < b \le 1$,
define $A_N(a,b)$ to be the cardinality of the set $\{j \in [1,N] \mid x_j \in [a,b]\}$. The
sequence $(x_n)$ is uniformly distributed if for all $0 \le a < b \le 1$ we have
$$\lim_{N\to\infty}\frac{A_N(a,b)}{N} = b - a.$$
A sequence $(x_n) \subset \mathbb{R}$ is uniformly distributed mod 1 if the fractional parts of
$x_n$ are uniformly distributed in $[0,1]$.
(a) Let $(x_n) \subset [0,1]$. Show that $(x_n)$ is uniformly distributed iff, for every
continuous $f : [0,1] \to \mathbb{R}$,
$$\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N} f(x_j) = \int_0^1 f(s)\,ds.$$
(b) (Weyl criterion.) Show that the sequence $(x_n) \subset \mathbb{R}$ is uniformly distributed
mod 1 iff
$$\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N} e^{2\pi m i x_j} = 0,$$
for all non-zero $m \in \mathbb{Z}$.
(Hints for (b): for necessity, use (a). For sufficiency, use the second Weierstrass
approximation theorem to show that (a) holds for all continuous $f : \mathbb{R} \to \mathbb{R}$ of
period 1.)
(19) Show that
(a) If $\alpha$ is irrational, then $(n\alpha)$ is uniformly distributed mod 1.
(b) $(\log n)$ is not uniformly distributed mod 1.
(Hint for (b): use the Euler–Maclaurin formula. See also [21, Chap. 1].)
5.6 Mean Square Convergence
So far we have focused on the question of whether or not the Fourier series of a
continuous function $f$ converges pointwise to $f$. A more natural notion of convergence
for Fourier series is mean-square or $L^2$-convergence:
$\lim_{n\to\infty}\int_{-\pi}^{\pi}|f(x) - S_n(f)(x)|^2\,dx = 0$. Although the full development of this theory depends on using
a more sophisticated version of integration such as the Lebesgue integral, we can at
least indicate why mean-square convergence is a natural concept for Fourier series.
Definition 5.6.1 Given continuous $2\pi$-periodic functions $f, g : \mathbb{R} \to \mathbb{R}$, the scalar
or inner product of $f$ and $g$ is defined by
$$\langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx.$$
We leave the proof of the next lemma to the exercises.
Lemma 5.6.2 Let $f, g, h : \mathbb{R} \to \mathbb{R}$ be continuous and $2\pi$-periodic.
(1) $\langle af + bg, h\rangle = a\langle f, h\rangle + b\langle g, h\rangle$ for all $a, b \in \mathbb{R}$.
(2) $\langle f, g\rangle = \langle g, f\rangle$.
(3) $\langle f, f\rangle \ge 0$ and $\langle f, f\rangle = 0$ iff $f = 0$.
Definition 5.6.3 Given a continuous $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{R}$, define the
$L^2$-norm of $f$ by
$$|f|_2 = \sqrt{\langle f, f\rangle}.$$
Lemma 5.6.4 Let $f, g : \mathbb{R} \to \mathbb{R}$ be continuous and $2\pi$-periodic. We have
(1) $|f|_2 \ge 0$ and $|f|_2 = 0$ iff $f = 0$.
(2) $|af|_2 = |a||f|_2$ for all $a \in \mathbb{R}$.
(3) $|\langle f, g\rangle| \le |f|_2|g|_2$ (Cauchy–Schwarz inequality).
(4) $|f + g|_2 \le |f|_2 + |g|_2$ (triangle inequality).
Proof (1,2) are immediate from Lemma 5.6.2. In order to prove (3) we shall use
the necessary and sufficient condition $A, C \ge 0$ and $B^2 \le AC$ for a quadratic form
$Ax^2 + 2Bxy + Cy^2$ to be positive semi-definite. Let $x, y \in \mathbb{R}$. By Lemma 5.6.2,
$$\langle xf + yg, xf + yg\rangle = x^2\langle f, f\rangle + 2xy\langle f, g\rangle + y^2\langle g, g\rangle = x^2|f|_2^2 + 2xy\langle f, g\rangle + y^2|g|_2^2.$$
Since $\langle xf + yg, xf + yg\rangle \ge 0$, the quadratic form $x^2|f|_2^2 + 2xy\langle f, g\rangle + y^2|g|_2^2$ is
positive semi-definite, and so we must have $\langle f, g\rangle^2 \le |f|_2^2|g|_2^2$, proving (3).
Finally we have $|f + g|_2^2 = |f|_2^2 + 2\langle f, g\rangle + |g|_2^2 \le |f|_2^2 + 2|f|_2|g|_2 + |g|_2^2$ by (3).
That is, $|f + g|_2^2 \le (|f|_2 + |g|_2)^2$, proving (4). □
Definition 5.6.5 The continuous non-zero $2\pi$-periodic functions $f, g : \mathbb{R} \to \mathbb{R}$ are
orthogonal if $\langle f, g\rangle = 0$.
Lemma 5.6.6 The $2\pi$-periodic functions
$1, \cos x, \cos 2x, \dots, \sin x, \sin 2x, \dots$ are pairwise orthogonal.
Proof Use the orthogonality relations (Lemma 5.5.15). □
Lemma 5.6.7 (Pythagoras’ Theorem) Suppose that $f_1, \dots, f_n$ are pairwise
orthogonal (that is $\langle f_i, f_j\rangle = 0$, $i \neq j$). Then
$$|f_1 + \dots + f_n|_2^2 = |f_1|_2^2 + \dots + |f_n|_2^2.$$
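Proof We have
$$\left\langle \sum_{i=1}^{n} f_i, \sum_{j=1}^{n} f_j\right\rangle = \sum_{i,j=1}^{n}\langle f_i, f_j\rangle = \sum_{i=1}^{n}\langle f_i, f_i\rangle = \sum_{i=1}^{n}|f_i|_2^2.$$
Since $\left\langle \sum_{i=1}^{n} f_i, \sum_{j=1}^{n} f_j\right\rangle = \left|\sum_{i=1}^{n} f_i\right|_2^2$, the result follows. □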
We define a new distance function $\rho_2(f,g)$ on piecewise continuous $2\pi$-periodic
functions by
$$\rho_2(f,g) = |f - g|_2 = \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x) - g(x)|^2\,dx}.$$
It follows from Lemma 5.6.4 that $\rho_2(f,g)$ satisfies the usual properties of a distance
function; in particular, the triangle inequality: $\rho_2(f,h) \le \rho_2(f,g) + \rho_2(g,h)$.
Recalling the uniform metric $\rho(f,g) = \sup_{x\in[-\pi,\pi]}|f(x) - g(x)|$, we have
$$\rho_2(f,g) = \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x) - g(x)|^2\,dx} \le \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi}\rho(f,g)^2\,dx} = \rho(f,g),$$
for all piecewise continuous $2\pi$-periodic functions $f, g$.
Example 5.6.8 In general, $\rho_2(f,g)$ may be much smaller than $\rho(f,g)$. For example,
if we define
$$f_N(x) = \begin{cases} 1, & x \in [-1/N, 1/N],\\ 0, & x \in (-\pi,\pi] \setminus [-1/N, 1/N], \end{cases}$$
then $\rho(f_N, 0) = 1$ for all $N \ge 1$ but $\rho_2(f_N, 0) = 1/\sqrt{\pi N}$.
Proposition 5.6.9 Let $f$ be a piecewise continuous $2\pi$-periodic function. For $n \ge 0$,
let $S_n$ denote the partial sum $a_0 + \sum_{j=1}^{n}(a_j\cos jx + b_j\sin jx)$ of the Fourier series of
$f$. The infimum of $\rho_2(f,T)$ over all trigonometric polynomials $T$ of degree less than
or equal to $n$ is given by $\rho_2(f,S_n)$ and is attained only when $T = S_n$. Moreover,
$$|f - S_n|_2^2 = |f|_2^2 - \left(a_0^2 + \frac{1}{2}\sum_{j=1}^{n}(a_j^2 + b_j^2)\right).$$
Proof Let $T(x) = A_0 + \sum_{j=1}^{n}(A_j\cos jx + B_j\sin jx)$ be any trigonometric polynomial
of degree at most $n$. We have
$$\rho_2(f,T)^2 = |f - T|_2^2 = |(f - S_n) + (S_n - T)|_2^2.$$
Now $f - S_n$ is orthogonal to $\cos jx$, $\sin jx$, $0 \le j \le n$ since, for example, if $1 \le j \le n$,
$$\langle f - S_n, \cos jx\rangle = \langle f, \cos jx\rangle - \langle S_n, \cos jx\rangle = \tfrac{1}{2}a_j - \tfrac{1}{2}a_j = 0,$$
by the orthogonality relations and the definition of $a_j$. Since $S_n - T$ is a linear
combination of these functions, it follows by Pythagoras’ theorem (Lemma 5.6.7) that
$$|f - T|_2^2 = |f - S_n|_2^2 + |S_n - T|_2^2,$$
and so $\rho_2(f,T)^2 = \rho_2(f,S_n)^2 + \rho_2(S_n,T)^2 \ge \rho_2(f,S_n)^2$ with equality iff $T = S_n$.
The final statement follows taking $T = 0$. □
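The error formula of Proposition 5.6.9 can be tested numerically on a concrete function. The sketch below assumes the square wave ($S = 1$ on $(0,\pi)$, $S = -1$ on $(-\pi,0)$) with Fourier coefficients $b_j = 4/(\pi j)$ for odd $j$ and all other coefficients zero, as in Example 5.5.13; the helper names and the midpoint-rule quadrature are illustrative assumptions.

```python
import math


def square_wave(x: float) -> float:
    """Square wave on (-pi, pi): +1 for x > 0, -1 for x < 0."""
    return 1.0 if x > 0 else -1.0


def partial_sum(n: int, x: float) -> float:
    """Fourier partial sum S_n: only odd harmonics j <= n contribute."""
    return (4.0 / math.pi) * sum(math.sin(j * x) / j for j in range(1, n + 1, 2))


def mean_square_error(n: int, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of (1/2pi) * integral of |S - S_n|^2 over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h  # midpoints never hit the jump at 0
        total += (square_wave(x) - partial_sum(n, x)) ** 2
    return total * h / (2.0 * math.pi)


def predicted_error(n: int) -> float:
    """|S|_2^2 - (1/2) * sum of b_j^2, with |S|_2^2 = 1 for the square wave."""
    return 1.0 - 0.5 * sum((4.0 / (math.pi * j)) ** 2 for j in range(1, n + 1, 2))


for n in (1, 5, 9):
    print(n, mean_square_error(n), predicted_error(n))
```

The two columns should agree closely, and both decrease as $n$ increases, consistent with mean-square convergence.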
Lemma 5.6.10 Let $f$ be a piecewise continuous $2\pi$-periodic function. Given $\varepsilon > 0$,
there exists a trigonometric polynomial $T$ such that
$$\rho_2(f,T) < \varepsilon.$$
Proof If $f$ is continuous, then by the second Weierstrass approximation theorem
we can choose a trigonometric polynomial $T$ such that $\rho(f,T) < \varepsilon$. But $\rho_2(f,T) \le \rho(f,T)$
and so the result is proved if $f$ is continuous. If $f$ is piecewise continuous, we
may choose a continuous $2\pi$-periodic function $g$ such that $\rho_2(f,g) < \varepsilon/2$ (we may
require $f = g$ outside of small intervals containing the discontinuity points). As we
did above, we may choose a trigonometric polynomial $T$ such that $\rho_2(g,T) < \varepsilon/2$.
Now $\rho_2(f,T) \le \rho_2(f,g) + \rho_2(g,T) < \varepsilon/2 + \varepsilon/2 = \varepsilon$. □
Theorem 5.6.11 Let $f$ be a piecewise continuous $2\pi$-periodic function with Fourier
coefficients $a_n, b_n$. Then
(a) $\rho_2(f,S_n) \to 0$ as $n \to \infty$.
(b) $|f|_2^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x)|^2\,dx = a_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(a_n^2 + b_n^2)$.
Proof Immediate from Proposition 5.6.9 and Lemma 5.6.10. □
Remarks 5.6.12
(1) Statement (b) of Theorem 5.6.11 is known as Parseval’s identity.
(2) Theorem 5.6.11 suggests a natural inverse problem: Given sequences $(a_n), (b_n)$
such that $a_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(a_n^2 + b_n^2) < \infty$, does there exist a function $f$ with Fourier
coefficients $a_n, b_n$ and which satisfies Parseval’s identity? In order to give a
satisfactory answer to the problem we have to expand the class of functions to
allow for functions which may not be continuous anywhere on $\mathbb{R}$ but which are
nevertheless square integrable. For this to make sense we need to work with a
more powerful version of the integral that allows for functions which may have
no points of continuity. All of this can be, and has been, done but lies beyond
the scope of this text.
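Example 5.6.13 We recall that the Fourier series of the square wave function $S$
(Example 5.5.13) is given by
$$S(x) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{\sin((2n+1)x)}{2n+1}.$$
Applying Parseval’s identity, we see that
$$1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} S(x)^2\,dx = \frac{1}{2}\left(\frac{4}{\pi}\right)^2\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}.$$
Hence $\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} = \pi^2/8$.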
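The value obtained from Parseval’s identity is easy to confirm by direct summation. A minimal sketch follows; the function names are illustrative and the truncation error after $N$ terms is about $1/(4N)$.

```python
import math


def odd_reciprocal_squares(terms: int) -> float:
    """Partial sum of 1/(2n+1)^2 for n = 0, ..., terms-1."""
    return sum(1.0 / (2 * n + 1) ** 2 for n in range(terms))


def parseval_right_side(terms: int) -> float:
    """(1/2) * (4/pi)^2 * partial sum; should approach |S|_2^2 = 1."""
    return 0.5 * (4.0 / math.pi) ** 2 * odd_reciprocal_squares(terms)


print(odd_reciprocal_squares(100_000), math.pi ** 2 / 8)
print(parseval_right_side(100_000))
```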
EXERCISES 5.6.14
(1) Verify the statements of Lemma 5.6.2.
(2) The Fourier sine series of the saw-tooth function $S(x) = (\pi - x)/2$, $x \in (0,2\pi)$,
$S(0) = S(2\pi) = 0$, is $\sum_{n=1}^{\infty}\sin(nx)/n$. Using Parseval’s identity, deduce that
$\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$.
(3) Show that the Fourier sine series of the $2\pi$-periodic odd function defined
on $[0,\pi]$ by $f(x) = x(\pi - x)$ is $\frac{8}{\pi}\sum_{n=0}^{\infty}\frac{\sin((2n+1)x)}{(2n+1)^3}$. Hence show that
$\sum_{n=0}^{\infty}\frac{1}{(2n+1)^6} = \frac{\pi^6}{960}$.
(4) For $n \ge 0$, define the Legendre polynomials by
$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n.$$
(a) Show that $P_n(x)$ is a polynomial of degree $n$ and find the coefficient of $x^n$.
Deduce that every polynomial $p(x)$ of degree $n$ can be written as $p(x) =
\sum_{k=0}^{n} c_k P_k(x)$, where $c_0, \dots, c_n \in \mathbb{R}$ are unique.
(b) Show that $\{P_n \mid n \ge 0\}$ define an orthogonal family of polynomials on
$[-1,1]$. Specifically, show that
$$\int_{-1}^{1} P_n(x)P_m(x)\,dx = \begin{cases} 0, & \text{if } n \neq m,\\ \dfrac{2}{2n+1}, & \text{if } n = m. \end{cases}$$
(c) Show that if $f : [-1,1] \to \mathbb{R}$ is continuous and $\int_{-1}^{1} f(x)P_n(x)\,dx = 0$ for all
$n \ge 0$, then $f = 0$.
5.7 Appendix: Second Weierstrass Approximation Theorem
In this appendix we prove Theorem 5.5.6: every continuous $2\pi$-periodic function
$f : \mathbb{R} \to \mathbb{R}$ can be uniformly approximated by trigonometric polynomials (the
second Weierstrass approximation theorem).
Since $f$ is $2\pi$-periodic, it is enough to show that we can uniformly approximate
$f$ by trigonometric polynomials on $[-\pi,\pi]$. We break the proof into a number of
lemmas.
Lemma 5.7.1 If $f : \mathbb{R} \to \mathbb{R}$ is even ($f(-x) = f(x)$, for all $x \in \mathbb{R}$), then we can
uniformly approximate $f$ by trigonometric polynomials.
Proof Since $f$ is even the values of $f$ on $[-\pi,\pi]$ are uniquely determined by the
values of $f$ on $[0,\pi]$. Therefore it suffices to uniformly approximate $f$ on $[0,\pi]$ by
even trigonometric polynomials.
Define $g(t) = f(\cos^{-1} t)$, $t \in [-1,1]$. Since $\cos^{-1} : [-1,1] \to [0,\pi]$ is
continuous, $g$ is continuous on $[-1,1]$. By the Weierstrass approximation theorem,
we may uniformly approximate $g$ on $[-1,1]$ by polynomials. That is, given $\varepsilon > 0$,
there exists a $p \in P(\mathbb{R})$ such that
$$\sup_{t\in[-1,1]}|g(t) - p(t)| < \varepsilon. \qquad (5.8)$$
Set $t = \cos x$, $x \in [0,\pi]$. We can rewrite (5.8) as $\sup_{x\in[0,\pi]}|g(\cos x) - p(\cos x)| < \varepsilon$.
Since $g(\cos x) = f(\cos^{-1}(\cos x)) = f(x)$, we have
$$\sup_{x\in[0,\pi]}|f(x) - p(\cos x)| < \varepsilon.$$
Using standard trigonometric identities it is well known (and easy to show) that
every power of $\cos x$ can be written as a linear combination of $1$ and $\cos jx$, $j \in \mathbb{N}$. Hence
$p(\cos x)$ can be written as a trigonometric polynomial with no sine terms:
$$p(\cos x) = a_0 + \sum_{j=1}^{n} a_j\cos jx.$$
This function is even and so we have uniformly approximated $f$ on $[0,\pi]$ by an even
trigonometric polynomial. □
Lemma 5.7.2 If $f$ is even, then $f(x)\sin^2 x$ can be uniformly approximated by
trigonometric polynomials.
Proof Using Lemma 5.7.1, we first uniformly approximate $f$ by trigonometric
polynomials then we use standard trigonometric identities to obtain the required
uniform approximations of $f(x)\sin^2 x$ by trigonometric polynomials. □
Lemma 5.7.3 If $f$ is odd ($f(-x) = -f(x)$) then $f(x)\sin x$ can be uniformly
approximated by trigonometric polynomials.
Proof Since $f$ is odd, $g(x) = f(x)\sin x$ is even and so we may apply Lemma 5.7.1. □
Lemma 5.7.4 Every continuous function $f : \mathbb{R} \to \mathbb{R}$ may be written uniquely as a
sum $f_e + f_o$ of even and odd continuous functions. If $f$ is $2\pi$-periodic, so are $f_e, f_o$.
Proof Define $f_e(x) = \frac{f(x)+f(-x)}{2}$, $f_o(x) = \frac{f(x)-f(-x)}{2}$. □
Lemma 5.7.5 If $f$ is $2\pi$-periodic, then we can uniformly approximate $f(x)\sin^2 x$ by
trigonometric polynomials.
Proof Using Lemmas 5.7.4 and 5.7.2, we reduce to the case when $f$ is odd. Now
apply Lemma 5.7.3 to $f(x)\sin x$ and finally multiply the approximating trigonometric
polynomials by $\sin x$ and apply the trigonometric identities $\sin x\cos jx =
\frac{1}{2}(\sin(j+1)x - \sin(j-1)x)$ to obtain the required uniform approximations to
$f(x)\sin^2 x$. □
Lemma 5.7.6 If $f$ is $2\pi$-periodic then we can uniformly approximate $f(\frac{\pi}{2} - x)\sin^2 x$
by trigonometric polynomials.
Proof Apply Lemma 5.7.5 to $\tilde{f}(x) = f(\frac{\pi}{2} - x)$. □
Proof of Theorem 5.5.6 Taking $y = \frac{\pi}{2} - x$ in Lemma 5.7.6, we see that
$f(x)\cos^2 x$ can be uniformly approximated by trigonometric polynomials. Hence,
by Lemma 5.7.5, $f(x)\sin^2 x + f(x)\cos^2 x = f(x)$ can be uniformly approximated by
trigonometric polynomials. □
Chapter 6
Topics from Classical Analysis: The Gamma-Function and the Euler–Maclaurin Formula
In this chapter we look at two topics from classical analysis: the Gamma-function
and the Euler–Maclaurin formula. Our investigation of the Gamma-function will
require many of the ideas we have developed on convergence and involves infinite
products, improper integrals and other techniques and results from analysis such as
differentiation under the integral sign and multiple integrals. We also need some
standard results on multiple integrals (in our situation these results are elementary
as we almost always assume rectangular domains and continuous, even smooth,
integrands—see Exercises 2.8.10(10)). The Euler–Maclaurin formula is easy to
prove but has powerful applications to estimation and asymptotics. For example,
using the Euler–Maclaurin formula, we prove Stirling’s formula (estimating $n!$) and
also estimate Euler’s constant and the sums of various infinite series.
6.1 The Gamma-Function
The Gamma-function gives an extension of the factorial $n!$ to all positive real
numbers. We start by giving the definition (due to Euler) which involves a doubly
improper integral. Once we have checked that the integral converges, it is relatively
straightforward to derive the basic properties of the Gamma-function. Along the
way we encounter a number of fairly standard techniques often seen in applications
of analysis: estimates yielding convergence of infinite integrals and conditions
that allow us to differentiate under the integral sign (yet another instance of
interchanging limits—in this case involving a triple limit).
Definition 6.1.1 The Gamma-function is defined for $x > 0$ by
$$\Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\,dt.$$
Since the definition of the Gamma-function involves an infinite integral and $t^{x-1}$
blows up at $t = 0$ if $x < 1$, we need to take some care with this definition. We start
by giving conditions for the convergence of improper integrals.
Lemma 6.1.2 Let $f : [a,\infty) \to \mathbb{R}$ be a continuous function which is either positive
or negative. The integral $\int_a^{\infty} f(t)\,dt$ converges iff there exists an $M \ge 0$ such that
$$\left|\int_a^b f(t)\,dt\right| \le M, \quad \text{for all } b \ge a.$$
Proof Without loss of generality suppose $f \ge 0$ (if not, replace $f$ by $-f$). Then
$G(b) = \int_a^b f(t)\,dt$ is a monotone increasing function of $b$. If the condition of the
lemma holds, then $G$ is bounded above since $|G(b)| \le M$ for all $b \ge a$. Since $G$ is
increasing, we have $\lim_{b\to\infty}\int_a^b f(t)\,dt = \sup_{b\ge a} G(b) < \infty$, proving convergence.
The converse is obvious. □
Lemma 6.1.3 Let $f : (a,A] \to \mathbb{R}$ be continuous and either positive or negative.
The integral $\int_a^A f(t)\,dt$ converges iff there exists an $M \ge 0$ such that
$$\left|\int_{\alpha}^{A} f(t)\,dt\right| \le M, \quad \text{for all } \alpha \in (a,A].$$
Proof We use the same method of proof as that of Lemma 6.1.2. We leave the details
to the reader. □
Remark 6.1.4 We leave to the exercises versions of Lemmas 6.1.2, 6.1.3 that hold
without the assumption that $f$ is of constant sign.
Using these lemmas, it is easy to show the Gamma-function is well defined.
Lemma 6.1.5 The improper integral $\int_0^{\infty} t^{x-1}e^{-t}\,dt$ converges for $x > 0$.
Proof We start by showing that for every $x > 0$, there exists an $M \ge 0$ such that
$\int_1^b t^{x-1}e^{-t}\,dt \le M$, for all $b \ge 1$. Fix $x > 0$. Since $\lim_{t\to\infty} t^{x-1}e^{-t/2} = 0$, there
exists a $C \ge 0$ such that $t^{x-1}e^{-t} < Ce^{-t/2}$, for all $t \ge 1$. Hence
$$\int_1^b t^{x-1}e^{-t}\,dt \le C\int_1^b e^{-t/2}\,dt \le 2Ce^{-1/2},$$
and so we satisfy the conditions of Lemma 6.1.2 with $M = 2Ce^{-1/2}$. It remains to
show that $\int_0^1 t^{x-1}e^{-t}\,dt$ converges if $x \in (0,1)$. Since $t^{x-1}e^{-t} \le t^{x-1}$, $t > 0$, we have
for all $a \in (0,1]$,
$$\int_a^1 t^{x-1}e^{-t}\,dt \le \int_a^1 t^{x-1}\,dt = \frac{1}{x}(1 - a^x) \le 1/x.$$
Hence $\int_0^1 t^{x-1}e^{-t}\,dt$ converges by Lemma 6.1.3. □
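The two-part structure of this proof (a piece near $t = 0$ and a tail on $[1,\infty)$) also suggests a simple way to evaluate $\Gamma(x)$ numerically. The sketch below is an illustration under stated assumptions: the substitution $u = t^x$ on $[0,1]$, the cutoff at $t = 50$ and the midpoint rule are choices made here, not taken from the text, and the helper names are hypothetical.

```python
import math


def midpoint_rule(func, lower: float, upper: float, steps: int) -> float:
    """Composite midpoint rule for the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (k + 0.5) * h) for k in range(steps))


def gamma_integral(x: float, cutoff: float = 50.0, steps: int = 200_000) -> float:
    """Approximate Gamma(x), x > 0, by splitting the integral at t = 1."""
    if x <= 0:
        raise ValueError("x must be positive")
    # On [0, 1] substitute u = t^x, so t^(x-1) dt = du / x and the
    # integrand exp(-u^(1/x)) is bounded even when x < 1.
    head = midpoint_rule(lambda u: math.exp(-(u ** (1.0 / x))), 0.0, 1.0, steps) / x
    # On [1, cutoff] integrate directly; the remaining tail is below
    # cutoff^(x-1) * exp(-cutoff) up to a constant, negligible for moderate x.
    tail = midpoint_rule(lambda t: t ** (x - 1.0) * math.exp(-t), 1.0, cutoff, steps)
    return head + tail


print(gamma_integral(0.5), math.sqrt(math.pi))
print(gamma_integral(5.0), math.factorial(4))
```

The comparison values use the classical facts $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(5) = 4!$.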
We need a result on differentiation under the integral sign before we establish the
main properties of the Gamma-function.
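Lemma 6.1.6 Let $I$ be an open interval (bounded or unbounded) and $g : I \times
(a,\infty) \to \mathbb{R}$ be continuous. Assume that
(1) $\frac{\partial g}{\partial x}(x,t)$, $\frac{\partial^2 g}{\partial x^2}(x,t)$ exist and are continuous on $I \times (a,\infty)$.
(2) The integrals $\int_a^{\infty} g(x,t)\,dt$, $\int_a^{\infty}\frac{\partial g}{\partial x}(x,t)\,dt$, $\int_a^{\infty}\frac{\partial^2 g}{\partial x^2}(x,t)\,dt$
exist for all $x \in I$.
(3) If $x \in I$, $\delta > 0$ and $[x-\delta, x+\delta] \subset I$, then there exists an $M \ge 0$ such that
$\left|\int_{\alpha}^{\beta}\frac{\partial g}{\partial x}(y,t)\,dt\right|, \left|\int_{\alpha}^{\beta}\frac{\partial^2 g}{\partial x^2}(y,t)\,dt\right| \le M$ for all $\alpha, \beta \ge a$ and $y \in [x-\delta, x+\delta]$.
If we define $F(x) = \int_a^{\infty} g(x,t)\,dt$, $x \in I$, then $F$ is $C^1$ on $I$ and
$$F'(x) = \int_a^{\infty}\frac{\partial g}{\partial x}(x,t)\,dt, \quad x \in I.$$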
Remark 6.1.7 The conditions of Lemma 6.1.6 are not intended to be optimal:
indeed they are not! However, not only do the conditions lead to a simple proof of the
result on the validity of differentiating under the integral sign but the conditions are
easy to verify for our intended application. Note that condition (3) of the lemma is
only really needed because we are dealing with an improper integral. If we assume
that $g(x,t)$ is continuous on a product of closed and bounded intervals then the
estimates (3) follow using uniform continuity arguments.
Proof of Lemma 6.1.6 Fix $x \in I$ and choose $\delta > 0$ so that $[x-\delta, x+\delta] \subset I$. Let $h \in
[-\delta,\delta]$. Applying (the trivial case of) Taylor’s theorem with integral remainder
(Theorem 2.7.7 with $r = 1$) we get
$$g(x+h,t) = g(x,t) + h\int_0^1\frac{\partial g}{\partial x}(x+sh,t)\,ds.$$
Integrating from $a$ to $\infty$ with respect to $t$, we obtain
$$F(x+h) - F(x) = h\int_a^{\infty}\int_0^1\frac{\partial g}{\partial x}(x+sh,t)\,ds\,dt.$$
Interchanging the order of integration (use Exercises 2.8.10(10)(f)) gives
$$F(x+h) - F(x) = h\int_0^1\int_a^{\infty}\frac{\partial g}{\partial x}(x+sh,t)\,dt\,ds.$$
Now by (3), there exists an $M \ge 0$ such that $\left|\int_a^{\infty}\frac{\partial g}{\partial x}(x+sh,t)\,dt\right| \le M$ for all
$h \in [-\delta,\delta]$, $s \in [0,1]$. Hence $|F(x+h) - F(x)| \le |h|M$ for all $h \in [-\delta,\delta]$ and so $F$ is
continuous at $x$.
Next we consider the differentiability of $F$ at $x$. For this we again use Taylor’s
theorem with integral remainder applied to $g(x,t)$, regarded as a function of $x$. For
$h \in [-\delta,\delta]$, $h \neq 0$, we have
$$g(x+h,t) = g(x,t) + h\frac{\partial g}{\partial x}(x,t) + h^2\int_0^1(1-s)\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,ds.$$
Hence
$$\frac{g(x+h,t) - g(x,t)}{h} - \frac{\partial g}{\partial x}(x,t) = h\int_0^1(1-s)\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,ds.$$
Integrating from $a$ to $\infty$ with respect to $t$, we obtain
$$\frac{F(x+h) - F(x)}{h} - \int_a^{\infty}\frac{\partial g}{\partial x}(x,t)\,dt = R(x,t;h),$$
where
$$R(x,t;h) = h\int_a^{\infty}\int_0^1(1-s)\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,ds\,dt = h\int_0^1(1-s)\int_a^{\infty}\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,dt\,ds,$$
and the interchange of order of integration follows by Fubini’s theorem
(Exercises 2.8.10(10)(f) again). By condition (3), there exists an $M \ge 0$ such that
$\left|\int_a^{\infty}\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,dt\right| \le M$ for all $h \in [-\delta,\delta]$, $s \in [0,1]$. Hence
$$\left|h\int_0^1(1-s)\int_a^{\infty}\frac{\partial^2 g}{\partial x^2}(x+sh,t)\,dt\,ds\right| \le M|h|\int_0^1(1-s)\,ds = M|h|/2.$$
This estimate implies that for $h \in [-\delta,\delta]$, $h \neq 0$, we have
$$\left|\frac{F(x+h) - F(x)}{h} - \int_a^{\infty}\frac{\partial g}{\partial x}(x,t)\,dt\right| \le M|h|/2.$$
Now let $h \to 0$ to obtain the differentiability of $F$ at $x$. Finally, in order to prove
that $F$ is $C^1$ we use the same argument used to prove $F$ is continuous. We omit the
details. □
6.1.1 Properties of $\Gamma(x)$
Theorem 6.1.8
(1) $\Gamma(x+1) = x\Gamma(x)$, for all $x > 0$.
(2) $\Gamma(n+1) = n!$, $n \ge 1$.
(3) $\Gamma$ is a smooth function on $(0,\infty)$ and
$$\Gamma^{(n)}(x) = \int_0^{\infty}(\log t)^n t^{x-1}e^{-t}\,dt, \quad n \ge 1.$$
Proof Let $0 < a < R < \infty$. Integrating by parts, we have
$$\int_a^R t^x e^{-t}\,dt = \left[-e^{-t}t^x\right]_a^R + x\int_a^R t^{x-1}e^{-t}\,dt = e^{-a}a^x - e^{-R}R^x + x\int_a^R t^{x-1}e^{-t}\,dt.$$
If $x > 0$, we may take limits as $a \to 0+$ and $R \to \infty$ to get $\Gamma(x+1) = x\Gamma(x)$,
proving (1). For (2), observe that by (1) we get $\Gamma(n+1) = n\Gamma(n) = \dots = n!\,\Gamma(1)$.
We have $\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1$. It remains to prove (3). Set $k(x,t) = t^{x-1}e^{-t}$.
Differentiating with respect to $x$ we have
$$\frac{\partial^n k}{\partial x^n} = (\log t)^n k(x,t), \quad n \ge 0.$$
For all $n \ge 0$, $(\log t)^n k(x,t)$ is continuous on $(0,\infty) \times (0,\infty)$. Choose $\delta > 0$ so
that $[x-\delta, x+\delta] \subset (0,\infty)$. Just as in the proof of Lemma 6.1.5, the integrals
$\int_0^{\infty}(\log t)^n k(x,t)\,dt$ converge for all $n \ge 0$ (see the exercises for details). Similar
arguments also show that for each $n \ge 0$, there exists an $M_n \ge 0$ such that
$\left|\int_0^{\infty}(\log t)^m k(y,t)\,dt\right| \le M_n$ for all $y \in [x-\delta, x+\delta]$ and $0 \le m \le n$. Now we
are in a position to apply Lemma 6.1.6. Since conditions (1,2,3) of Lemma 6.1.6
hold with $g(x,t) = t^{x-1}e^{-t}$, $\Gamma$ is $C^1$. Proceeding inductively, suppose we have
proved $\Gamma$ is $C^n$ and that $\Gamma^{(n)}(x) = \int_0^{\infty}(\log t)^n t^{x-1}e^{-t}\,dt$. We apply Lemma 6.1.6 with
$g(x,t) = (\log t)^n t^{x-1}e^{-t}$ to obtain $\Gamma^{(n)}$ is $C^1$ (and so $\Gamma$ is $C^{n+1}$) and $\Gamma^{(n+1)}(x) =
(\Gamma^{(n)})'(x) = \int_0^{\infty}(\log t)^{n+1}t^{x-1}e^{-t}\,dt$. □
6.1.2 Convexity of $\log\Gamma$
Recall that a $C^2$ function $f : [a,b] \to \mathbb{R}$ is convex if $f'' \ge 0$ on $[a,b]$ and strictly
convex if $f'' > 0$ on $(a,b)$. If $f$ is defined on an open or half-open interval, then $f$ is
defined to be convex if $f$ is convex on all closed subintervals.
Fig. 6.1 Chord triples for a convex function (chords $L$, $M$, $N$ over the points $u < v < w$)
Example 6.1.9 The function $-\log : (0,\infty) \to \mathbb{R}$ is strictly convex since
$(-\log)''(x) = 1/x^2 > 0$ for all $x \in (0,\infty)$.
Everything we need about convex functions is contained in the next result.
Lemma 6.1.10 Suppose that the $C^2$ function $f : [a,b] \to \mathbb{R}$ is convex. Let $a \le u <
v < w \le b$. Then
$$\frac{f(v) - f(u)}{v - u} \le \frac{f(w) - f(u)}{w - u} \le \frac{f(w) - f(v)}{w - v}.$$
If the inequalities are strict then $f$ is strictly convex.
In Fig. 6.1 we show the geometrically transparent relationship between the slopes
of the chords $L, M, N$ given by Lemma 6.1.10.
Proof Since $f$ is convex, $f'$ is increasing on $[a,b]$. For $x \ge v$, define $g(x) =
\frac{f(x)-f(u)}{x-u} - \frac{f(v)-f(u)}{v-u}$. We have $g(v) = 0$ and $g'(x) = \frac{1}{(x-u)^2}\big((x-u)f'(x) - (f(x)-f(u))\big)$.
By the mean value theorem, there exists a $\xi \in (u,x)$ such that $f(x) - f(u) =
(x-u)f'(\xi)$ and so $g'(x) = \frac{1}{x-u}\big(f'(x) - f'(\xi)\big)$. Since $f'$ is increasing on $[a,b]$,
it follows that $g' \ge 0$ on $[v,w]$. Hence, since $g(v) = 0$, $g(w) \ge 0$. This proves
the first inequality. The proof of the second inequality is similar. The case of strict
inequality follows easily using the same arguments. □
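Theorem 6.1.11 $\log\Gamma$ is strictly convex on $(0,\infty)$.
Proof We show that $(\log\Gamma)'' > 0$ on $(0,\infty)$. Computing, we find that $(\log\Gamma)'' =
(\Gamma\Gamma'' - (\Gamma')^2)/\Gamma^2$. It suffices therefore to show that $\Gamma\Gamma'' > (\Gamma')^2$. We may write
$$\begin{aligned}
\Gamma(x)\Gamma''(x) &= \left(\int_0^{\infty} t^{x-1}e^{-t}\,dt\right)\left(\int_0^{\infty}(\log s)^2 s^{x-1}e^{-s}\,ds\right) \\
&= \int_0^{\infty}\int_0^{\infty}(\log s)^2 t^{x-1}s^{x-1}e^{-t}e^{-s}\,dt\,ds \\
&= \int_0^{\infty}\int_0^{\infty}(\log s)^2 (ts)^{x-1}e^{-t-s}\,dt\,ds, \\
\Gamma'(x)^2 &= \int_0^{\infty}\int_0^{\infty}\log t\,\log s\,(ts)^{x-1}e^{-t-s}\,dt\,ds.
\end{aligned}$$
Observe that
$$\Gamma(x)\Gamma''(x) = \int_0^{\infty}\int_0^{\infty}(\log s)^2 (ts)^{x-1}e^{-t-s}\,dt\,ds = \int_0^{\infty}\int_0^{\infty}(\log t)^2 (ts)^{x-1}e^{-t-s}\,dt\,ds,$$
using the symmetry of the integrand $(ts)^{x-1}e^{-t-s}$ and the region of integration
(the positive quadrant) in $t, s$. Hence
$$\begin{aligned}
2\big(\Gamma(x)\Gamma''(x) - \Gamma'(x)^2\big) &= \int_0^{\infty}\int_0^{\infty}\big[(\log t)^2 + (\log s)^2 - 2\log t\,\log s\big](ts)^{x-1}e^{-t-s}\,dt\,ds \\
&= \int_0^{\infty}\int_0^{\infty}(\log t - \log s)^2 (ts)^{x-1}e^{-t-s}\,dt\,ds \\
&> 0.
\end{aligned}$$ □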
Theorem 6.1.12 Suppose that $f : (0,\infty) \to \mathbb{R}$ is $C^2$ and satisfies
(a) $f(x+1) = xf(x)$, $x > 0$.
(b) $f(1) = 1$.
(c) $\log f$ is convex.
Then $f(x) = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}$. In particular, since $\Gamma$ satisfies (a,b,c),
$$\Gamma(x) = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}.$$
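The limit formula can be explored numerically. The sketch below evaluates the quotient $\frac{n!\,n^x}{x(x+1)\cdots(x+n)}$ for $x > 0$ through logarithms to avoid overflow and compares it with Python’s built-in `math.gamma`; the function name is hypothetical, and the remark that the relative error is of order $x(x+1)/(2n)$ is a standard asymptotic fact, not a claim made in the text.

```python
import math


def euler_limit_quotient(x: float, n: int) -> float:
    """n! * n^x / (x (x+1) ... (x+n)) for x > 0, computed via logarithms."""
    if x <= 0:
        raise ValueError("this sketch assumes x > 0")
    # n! / (x (x+1) ... (x+n)) = (1/x) * prod_{k=1}^{n} k / (x + k)
    log_value = x * math.log(n) - math.log(x)
    for k in range(1, n + 1):
        log_value += math.log(k / (x + k))
    return math.exp(log_value)


for n in (10, 1000, 100_000):
    print(n, euler_limit_quotient(0.5, n), math.gamma(0.5))
```

Convergence is slow (first order in $1/n$), so the formula is more useful theoretically than as a numerical method.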
Remarks 6.1.13
(1) It is enough to assume $\varphi = \log f$ is convex in the sense that $\varphi(\lambda x + (1-\lambda)y) \le
\lambda\varphi(x) + (1-\lambda)\varphi(y)$, $\lambda \in [0,1]$, without requiring that $\varphi$ is differentiable or
$C^2$ (for a proof, see Rudin [27]).
(2) The limit $\lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}$ exists provided $x \neq 0, -1, -2, \dots$ and so the
theorem gives an extension of the Gamma-function to all of the real line except
zero and the negative integers.
Proof of Theorem 6.1.12 Let $\varphi(x) = \log f(x)$, $x > 0$. It follows from (a,b) that
$$\varphi(n+1) = \log(n!), \quad n \ge 0. \qquad (6.1)$$
In general, we have $\varphi(x+1) = \varphi(x) + \log x$ and so for $n \ge 0$ we have
$$\varphi(n+1+x) = \varphi(x) + \log(x+n) + \dots + \log x = \varphi(x) + \log[x(x+1)\cdots(x+n)]. \qquad (6.2)$$
In view of (a,b), it is enough to show that $f(x) = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}$ for $x \in
(0,1)$. Consider the quotients
$$\frac{\varphi(n+1) - \varphi(n)}{1}, \quad \frac{\varphi(n+1+x) - \varphi(n+1)}{x}, \quad \frac{\varphi(n+2) - \varphi(n+1)}{1}.$$
Noting that $\varphi(m+1) - \varphi(m) = \log m$ and applying Lemma 6.1.10 to the convex
function $\varphi$ we have
$$\log n \le \frac{\varphi(n+1+x) - \varphi(n+1)}{x} \le \log(n+1).$$
Substituting for $\varphi(n+1+x)$, using (6.1), (6.2) and multiplying through by $x$ gives
$$x\log n \le \varphi(x) + \log[x(x+1)\cdots(x+n)] - \log(n!) \le x\log(n+1).$$
Hence,
$$0 \le \varphi(x) - \log(n!) - \log(n^x) + \log[x(x+1)\cdots(x+n)] \le x\log(n+1) - x\log n.$$
That is,
$$0 \le \varphi(x) - \log\left[\frac{n!\,n^x}{x(x+1)\cdots(x+n)}\right] \le x\log\left(1 + \frac{1}{n}\right).$$
Since $\lim_{n\to\infty} x\log(1 + \frac{1}{n}) = 0$, it follows from the squeezing lemma that $\varphi(x) =
\lim_{n\to\infty}\log\left[\frac{n!\,n^x}{x(x+1)\cdots(x+n)}\right]$ and so
$$f(x) = e^{\varphi(x)} = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}.$$ □
6.1.3 The Gamma-Product
In this section we obtain an infinite product formula for $\Gamma(x)$, involving Euler’s
constant, and which easily leads to a relation between the Gamma and sine
functions.
Let $n \in \mathbb{N}$. For $x \in \mathbb{R}$, we have
$$n^{-x}\,\frac{x(x+1)\cdots(x+n)}{n!} = x\,n^{-x}\prod_{j=1}^{n}\left(1 + \frac{x}{j}\right).$$
The infinite product $\prod_{n=1}^{\infty}(1 + \frac{x}{n})$ is not convergent (for example, by Lemma 3.9.13).
However, there is a powerful trick due to Weierstrass that enables us to manufacture
a convergent infinite product from $\prod_{n=1}^{\infty}(1 + \frac{x}{n})$.
Lemma 6.1.14 The infinite product $\prod_{n=1}^{\infty}(1 + \frac{x}{n})e^{-x/n}$ converges for all $x \in \mathbb{R}$
and is only zero if $x = -1, -2, \dots$.
Proof We give a proof that works for $x \in \mathbb{C}$. For a simpler argument, valid only for
$x \in \mathbb{R}$, use Exercises 6.1.23(6). Set $a_n = (1 + \frac{x}{n})e^{-x/n} - 1$, $n \ge 1$. We have
$$a_n = e^{-x/n}\left(1 + \frac{x}{n} - e^{x/n}\right)$$
and so
$$\begin{aligned}
|a_n| &= |e^{-x/n}|\left|\sum_{j=2}^{\infty}\frac{1}{j!}\left(\frac{x}{n}\right)^j\right| \\
&\le e^{|x|/n}\left[\sum_{j=2}^{\infty}\frac{1}{j!}\left(\frac{|x|}{n}\right)^j\right] \\
&= e^{|x|/n}\,\frac{|x|^2}{2n^2}\left(1 + \frac{1}{3}\frac{|x|}{n} + \cdots\right) \\
&\le e^{|x|/n}\,\frac{|x|^2}{2n^2}\left[\sum_{j=0}^{\infty}\frac{1}{j!}\left(\frac{|x|}{n}\right)^j\right] \\
&= e^{2|x|/n}\,\frac{|x|^2}{2n^2} \\
&\le \frac{1}{2n^2}\,e^{2|x|}|x|^2 = cn^{-2},
\end{aligned}$$
where $c$ is independent of $n$. Therefore $\sum_{n=1}^{\infty} a_n$ is absolutely convergent and so by
Lemmas 3.9.4, 3.9.6, $\prod_{n=1}^{\infty}(1 + \frac{x}{n})e^{-x/n}$ is convergent for all $x \in \mathbb{R}$ and is zero iff
$x$ is a negative integer. □
Remark 6.1.15 Lemma 6.1.14 is valid for x 2 C, x not a negative integer.
The proof is exactly the same except that absolute value is everywhere replaced
by modulus of a complex number and we use Proposition 3.9.10 rather than
Lemmas 3.9.4, 3.9.6.
z
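Theorem 6.1.16 For $x > 0$, we have
\[ \Gamma(x) = \frac{1}{x e^{\gamma x}\prod_{n=1}^{\infty}(1+\frac{x}{n})e^{-x/n}}, \]
where $\gamma$ denotes Euler's constant. Moreover, the expression on the right is defined and finite for all $x \in \mathbb{C}$ provided only that $x$ is not zero or a negative integer.
Proof We have
\[ \frac{n!\,n^x}{x(x+1)\cdots(x+n)} = \frac{e^{x\log n}\,e^{-x\sum_{j=1}^{n}\frac{1}{j}}}{x\prod_{j=1}^{n}\left[(1+\frac{x}{j})e^{-x/j}\right]}. \]
By Theorem 6.1.12, $\lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)} = \Gamma(x)$. On the other hand, by Lemma 6.1.14, $\prod_{j=1}^{n}\left[(1+\frac{x}{j})e^{-x/j}\right]$ is convergent for all $x \in \mathbb{R}$. Finally, since
\[ \lim_{n\to\infty} e^{x\log n}\,e^{-x\sum_{j=1}^{n}\frac{1}{j}} = \lim_{n\to\infty} e^{x(\log n - \sum_{j=1}^{n}\frac{1}{j})} = e^{-\gamma x}, \]
we have
\[ \lim_{n\to\infty}\frac{e^{x\log n}\,e^{-x\sum_{j=1}^{n}\frac{1}{j}}}{x\prod_{j=1}^{n}\left[(1+\frac{x}{j})e^{-x/j}\right]} = \frac{1}{x e^{\gamma x}\prod_{n=1}^{\infty}(1+\frac{x}{n})e^{-x/n}}. \]
□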
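As an illustration of the product formula, the sketch below truncates the infinite product after a fixed number of factors and compares the result with the standard-library `math.gamma`, both for $x > 0$ and for a negative non-integer. The function name `weierstrass_gamma`, the truncation length and the hard-coded value of Euler's constant are assumptions made for this example only; the truncation error is roughly of relative size $x^2/(2N)$ for $N$ factors.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the sketch


def weierstrass_gamma(x: float, terms: int = 200_000) -> float:
    """Approximate Gamma(x) by 1 / (x e^{gamma x} prod_{n<=terms} (1 + x/n) e^{-x/n})."""
    product = 1.0
    for n in range(1, terms + 1):
        # Each factor is 1 - x^2/(2 n^2) + ..., so the running product stays bounded.
        product *= (1.0 + x / n) * math.exp(-x / n)
    return 1.0 / (x * math.exp(EULER_GAMMA * x) * product)


if __name__ == "__main__":
    for x in (0.5, 2.5, -0.5, -1.5):
        print(x, weierstrass_gamma(x), math.gamma(x))
```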
As a result of Theorem 6.1.16, we may regard the Gamma-function as defined on all of $\mathbb{R}$ (or $\mathbb{C}$), except for the non-positive integers, by the formula
\[ \Gamma(x) = \frac{1}{x e^{\gamma x}\prod_{n=1}^{\infty}(1+\frac{x}{n})e^{-x/n}}. \]
Lemma 6.1.17 With the extended definition of $\Gamma$, we have
\[ \Gamma(x+1) = x\Gamma(x), \quad x \notin \{0, -1, -2, \cdots\}. \]
Proof We leave this to the exercises. □
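Theorem 6.1.18
\[ \Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin(\pi x)}, \quad x \notin \mathbb{Z}. \]
Proof By Lemma 6.1.17, we have $\Gamma(1-x) = -x\Gamma(-x)$ and so, provided $x \notin \mathbb{Z}$, we have
\begin{align*}
\Gamma(x)\Gamma(1-x) &= -x\,\Gamma(x)\Gamma(-x) \\
&= \lim_{n\to\infty}\frac{-x}{x e^{\gamma x}\prod_{j=1}^{n}\left[(1+\frac{x}{j})e^{-x/j}\right]\cdot(-x)e^{-\gamma x}\prod_{j=1}^{n}\left[(1-\frac{x}{j})e^{x/j}\right]} \\
&= \lim_{n\to\infty}\frac{1}{x\prod_{j=1}^{n}\left(1-\frac{x^2}{j^2}\right)} \\
&= \frac{\pi}{\sin(\pi x)},
\end{align*}
where the last statement follows by the infinite product formula for $\sin x$ (Sect. 5.5.5). □
Remark 6.1.19 Theorem 6.1.18 holds for $x \in \mathbb{C}$, $x \notin \mathbb{Z}$: use the infinite product formula for $\sin z$, $z \in \mathbb{C}$, Proposition 3.9.17.
Example 6.1.20 Taking $x = \frac{1}{2}$ in Theorem 6.1.18, we obtain $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. An alternative proof of this result, which does not use the product formula, can be based on Exercises 6.1.23(9).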
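The reflection formula $\Gamma(x)\Gamma(1-x) = \pi/\sin(\pi x)$ and the truncated sine product used in its proof are easy to test numerically. This is only an illustrative sketch: the helper names `reflection_pair` and `sine_product` are hypothetical, and `math.gamma` is used as the reference implementation of $\Gamma$ (it accepts negative non-integers).

```python
import math


def reflection_pair(x: float) -> tuple[float, float]:
    """Return (Gamma(x) Gamma(1 - x), pi / sin(pi x)) for non-integer x."""
    return math.gamma(x) * math.gamma(1.0 - x), math.pi / math.sin(math.pi * x)


def sine_product(x: float, n: int) -> float:
    """Truncated product x * prod_{j<=n} (1 - x^2/j^2), which tends to sin(pi x)/pi."""
    product = x
    for j in range(1, n + 1):
        product *= 1.0 - (x * x) / (j * j)
    return product


if __name__ == "__main__":
    for x in (0.5, 0.3, -1.5, 2.25):
        print(x, *reflection_pair(x))
    # x = 1/2 recovers Gamma(1/2)^2 = pi.
    print(math.gamma(0.5) ** 2, math.pi)
```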
6.1.4 An Integral Formula for the Beta Function
We conclude this section on the Gamma function with a useful integral formula.
Theorem 6.1.21 For $x, y > 0$ we have
\[ \int_0^1 (1-t)^{x-1}t^{y-1}\,dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}. \]
Proof We have
\begin{align*}
\Gamma(x)\Gamma(y) &= \left(\int_0^{\infty} t^{x-1}e^{-t}\,dt\right)\left(\int_0^{\infty} s^{y-1}e^{-s}\,ds\right) \\
&= \int_0^{\infty}\!\!\int_0^{\infty} t^{x-1}s^{y-1}e^{-(t+s)}\,dt\,ds.
\end{align*}
Making the change of variables $u = s + t$, $v = s$, we find
\[ \int_0^{\infty}\!\!\int_0^{\infty} t^{x-1}s^{y-1}e^{-(t+s)}\,dt\,ds = \iint_R (u-v)^{x-1}v^{y-1}e^{-u}\,du\,dv, \]
where $R = \{(u,v) \mid 0 \le v \le u\}$. Now
\begin{align*}
\iint_R (u-v)^{x-1}v^{y-1}e^{-u}\,du\,dv &= \int_0^{\infty}\left[\int_0^{u}(u-v)^{x-1}v^{y-1}\,dv\right]e^{-u}\,du \\
&= \int_0^{\infty}\left[\int_0^{1}u^{x-1}(1-t)^{x-1}u^{y-1}t^{y-1}u\,dt\right]e^{-u}\,du,
\end{align*}
where the second integral is obtained from the first by the change of variable $t = v/u$. But now the second integral is equal to
\[ \left(\int_0^{\infty}e^{-u}u^{x+y-1}\,du\right)\left(\int_0^1 (1-t)^{x-1}t^{y-1}\,dt\right), \]
which is the product of $\Gamma(x+y)$ with the integral we want. □
Remark 6.1.22 The integral $\int_0^1 (1-t)^{x-1}t^{y-1}\,dt$ is usually called the beta function of $x, y$ and denoted by $B(x,y)$.
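A quick numerical sanity check of the beta integral is sketched below. It assumes $x, y \ge 1$ so that the integrand is bounded on $[0,1]$ and a plain midpoint rule suffices; the function names and the step count are choices made for this example, not something taken from the text.

```python
import math


def beta_integral(x: float, y: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of int_0^1 (1-t)^(x-1) t^(y-1) dt (assumes x, y >= 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                       # midpoint of the i-th subinterval
        total += (1.0 - t) ** (x - 1.0) * t ** (y - 1.0)
    return h * total


def beta_from_gamma(x: float, y: float) -> float:
    """Right-hand side of Theorem 6.1.21: Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


if __name__ == "__main__":
    for x, y in ((3.0, 4.0), (2.5, 1.5)):
        print((x, y), beta_integral(x, y), beta_from_gamma(x, y))
```

For $x$ or $y$ below 1 the integrand is singular at an endpoint and a different quadrature (or the trigonometric form of Exercises 6.1.23(10)) would be a better choice.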
EXERCISES 6.1.23
(1) Let $f : [a,\infty) \to \mathbb{R}$ be a continuous function. The integral $\int_a^{\infty} f(t)\,dt$ converges iff for every $\varepsilon > 0$, there exists an $N \in [a,\infty)$ such that $|\int_{\alpha}^{\beta} f(t)\,dt| < \varepsilon$, for all $\alpha < \beta \in [N,\infty)$.
(2) Let $f : (a,A] \to \mathbb{R}$ be continuous on $(a,A]$. The integral $\int_a^A f(t)\,dt$ converges iff for every $\varepsilon > 0$, there exists a $b \in (a,A]$ such that $|\int_{\alpha}^{\beta} f(t)\,dt| < \varepsilon$, for all $\alpha < \beta \in (a,b]$.
(3) Complete the details of the proof of Lemma 6.1.10.
(4) Suppose that $f : [a,b] \to \mathbb{R}$ is $C^2$. Show that $f$ is convex iff $f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y)$, for all $x < y \in [a,b]$ and $\lambda \in [0,1]$. (Hint: use Lemma 6.1.10.) What about strict convexity?
(5) Verify Lemma 6.1.17.
(6) Show that for all $x \in \mathbb{R}$, $e^x \ge 1 + x$. Deduce that for all $x \ge -1$, $1 \ge (1+x)e^{-x} \ge 1 - x^2$. Using this estimate, obtain a simple proof of Lemma 6.1.14, valid only for real values of $x$.
(7) Verify the duplication formula of Legendre:
\[ \Gamma(x)\,\Gamma\left(x+\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2^{2x-1}}\,\Gamma(2x). \]
(Hint: Use the product representation of $\Gamma(x)$ given in Theorem 6.1.12; note that $\Gamma(x) = \lim_{n\to\infty}\left[\frac{n!\,n^{x-1}}{x(x+1)\cdots(x+n-1)}\right]$.)
(8) Show that
\[ (1-x)\left(1+\frac{x}{2}\right)\left(1-\frac{x}{3}\right)\left(1+\frac{x}{4}\right)\cdots = \frac{\sqrt{\pi}}{\Gamma(1+\frac{x}{2})\,\Gamma(\frac{1}{2}-\frac{x}{2})}. \]
Show also that $(1-x)(1-\frac{x}{3})(1+\frac{x}{2})(1-\frac{x}{5})(-)(+)(-)(-)(+)\cdots$ converges and find the limit. This provides an example of rearrangement to a different limit for a conditionally convergent infinite product.
(9) Show that the substitution $t = s^2$ in the defining integral for $\Gamma(x)$ leads to the formula
\[ \Gamma(x) = 2\int_0^{\infty} s^{2x-1}e^{-s^2}\,ds, \quad x > 0. \]
Deduce that $\int_{-\infty}^{\infty} e^{-s^2}\,ds = \Gamma(\frac{1}{2}) = \sqrt{\pi}$.
(10) Show that
\[ \int_0^{\pi/2} (\sin\theta)^{2x-1}(\cos\theta)^{2y-1}\,d\theta = \frac{\Gamma(x)\Gamma(y)}{2\Gamma(x+y)}, \]
where $x, y > 0$. (Hint: use Theorem 6.1.21.)
(11) Show that
\[ \int_0^{\pi/2}\frac{d\theta}{\sqrt{1+\sin^2\theta}} = \frac{\Gamma(\frac{1}{4})^2}{4\sqrt{2\pi}}. \]
6.2 Bernoulli Numbers and Bernoulli Polynomials
We define the Bernoulli numbers and Bernoulli polynomials and establish some
of their basic properties. We will make much use of these ideas in our subsequent
development and applications of the Euler–Maclaurin formula.
Proposition 6.2.1 There is a unique sequence $(B_n)_{n\ge 0}$ of real numbers characterized by
(a) $B_0 = 1$.
(b) $B_n = \sum_{k=0}^{n}\binom{n}{k}B_k$, $n > 1$.
Proof Condition (b) implies that if $n > 1$, then
\[ B_n = B_0 + \binom{n}{1}B_1 + \cdots + nB_{n-1} + B_n, \]
and so for $n \ge 2$,
\[ B_{n-1} = -\frac{1}{n}\left[\binom{n}{0}B_0 + \binom{n}{1}B_1 + \cdots + \binom{n}{n-2}B_{n-2}\right]. \tag{6.3} \]
If $B_0 = 1$, we can use (6.3) to inductively define $B_n$ for all $n \ge 1$. □
Definition 6.2.2 The numbers $B_0, B_1, \cdots$ given by Proposition 6.2.1 are called the Bernoulli numbers.
We may use (6.3) to compute the first few Bernoulli numbers. We find
\[ B_0 = 1,\ B_1 = -\frac{1}{2},\ B_2 = \frac{1}{6},\ B_3 = 0,\ B_4 = -\frac{1}{30},\ B_5 = 0,\ B_6 = \frac{1}{42}. \]
Appearances to the contrary, the sequence $(B_n)$ is unbounded since $|B_{2n}| \to \infty$ as $n \to \infty$. Sometimes, $B_1$ is taken to be zero. However, for us it is convenient to take $B_1 = -\frac{1}{2}$.
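The recursion (6.3) translates directly into exact rational arithmetic. The sketch below is illustrative only (the function name `bernoulli_numbers` is an assumption); it uses `fractions.Fraction` and `math.comb` from the standard library and follows the convention $B_1 = -\frac{1}{2}$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count: int) -> list[Fraction]:
    """Return [B_0, ..., B_{count-1}] using the recursion (6.3)."""
    numbers = [Fraction(1)]                                  # B_0 = 1
    for m in range(1, count):
        # (6.3) with n = m + 1:  B_m = -(1/(m+1)) * sum_{k=0}^{m-1} C(m+1, k) B_k
        partial = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-partial / (m + 1))
    return numbers


if __name__ == "__main__":
    for index, value in enumerate(bernoulli_numbers(13)):
        print(f"B_{index} = {value}")
```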
In the next lemma we give a rather devious proof that $B_{2n+1} = 0$, $n \ge 1$. We give an alternative proof later in the section.
Lemma 6.2.3 $B_{2n+1} = 0$, $n \ge 1$.
Proof Define $g(x) = \frac{x}{e^x - 1}$, $x \ne 0$, and $g(0) = 1$. Since $e^x - 1 = xf(x)$, where $f(0) = 1$ and $f(x)$ is analytic, it follows from the results of Chap. 4 that $g(x)$ is an analytic function on some interval $(-\delta,\delta)$ containing the origin and so we may write
\[ g(x) = \sum_{n=0}^{\infty} b_n\frac{x^n}{n!}, \quad x \in (-\delta,\delta). \]
Since $f(0) = 1$, $b_0 = 1$. Multiplying both sides by $(e^x - 1)$ gives $(e^x - 1)\left(\sum_{n=0}^{\infty} b_n\frac{x^n}{n!}\right) = x$. That is, $\left(\sum_{m=1}^{\infty}\frac{x^m}{m!}\right)\left(\sum_{n=0}^{\infty}\frac{b_n}{n!}x^n\right) = x$. Computing the coefficient of $x^n$ we find that
\[ \sum_{p=0}^{n-1} b_p\frac{1}{p!\,(n-p)!} = 0, \quad n > 1. \]
Multiplying by $n!$ and adding $b_n$ to both sides yields
\[ b_n = \sum_{p=0}^{n}\binom{n}{p}b_p. \]
Hence, since $b_0 = 1$, it follows by Proposition 6.2.1 that $b_n = B_n$, all $n \ge 0$. Now $B_1 = b_1 = -1/2$. We prove that $B_{2n+1} = 0$, $n \ge 1$, by showing that $g(x) + x/2$ is an even function of $x$. We have $\frac{x}{e^x - 1} + \frac{x}{2} = \frac{-x}{e^{-x} - 1} - \frac{x}{2}$ iff
\[ x(e^{-x} - 1) + \frac{x}{2}(e^x - 1)(e^{-x} - 1) = -x(e^x - 1) - \frac{x}{2}(e^x - 1)(e^{-x} - 1). \]
Computing, we find that both sides are equal to $\frac{x}{2}(e^{-x} - e^{x})$. Hence $g(x) + x/2$ is even and $B_{2n+1} = b_{2n+1} = 0$, $n \ge 1$. □
The Bernoulli polynomials $B_n(x)$ (not to be confused with the Bernstein polynomials) are defined for $n \ge 0$ by
\[ B_n(x) = \sum_{k=0}^{n}\binom{n}{k}B_{n-k}x^k. \]
Computing the first few polynomials, we find that
\[ B_0(x) = 1,\quad B_1(x) = x - \frac{1}{2},\quad B_2(x) = x^2 - x + \frac{1}{6},\quad B_3(x) = x^3 - \frac{3}{2}x^2 + \frac{1}{2}x. \]
Lemma 6.2.4 For $n \ne 1$,
\[ B_n(0) = B_n(1) = B_n. \]
Proof Left to the exercises. □
Remark 6.2.5 Note that $B_1(0) = -\frac{1}{2} \ne B_1(1) = \frac{1}{2}$.
Lemma 6.2.6
\[ B_n'(x) = nB_{n-1}(x), \quad n \ge 1. \]
Proof Since $B_n(x) = \sum_{k=0}^{n}\binom{n}{k}B_{n-k}x^k$, we have
\begin{align*}
B_n'(x) &= \sum_{k=1}^{n}\binom{n}{k}kB_{n-k}x^{k-1} \\
&= \sum_{k=1}^{n} n\binom{n-1}{k-1}B_{n-k}x^{k-1} \\
&= n\sum_{k-1=0}^{n-1}\binom{n-1}{k-1}B_{(n-1)-(k-1)}x^{k-1} \\
&= n\sum_{j=0}^{n-1}\binom{n-1}{j}B_{(n-1)-j}x^{j} \\
&= nB_{n-1}(x).
\end{align*}
□
As a useful corollary to Lemmas 6.2.4 and 6.2.6 we have
Proposition 6.2.7 $B_n(x) = n\int_0^x B_{n-1}(t)\,dt + B_n$, $n \ge 1$.
Remark 6.2.8 We can use Proposition 6.2.7 to recursively compute the Bernoulli polynomials.
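The definition of $B_n(x)$ and the identities of Lemmas 6.2.4 and 6.2.6 can be checked with exact coefficients. In this sketch a polynomial is stored as its list of coefficients (index $k$ holds the coefficient of $x^k$); all function names are hypothetical choices for the example, and the Bernoulli numbers are generated from the recursion (6.3).

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count: int) -> list[Fraction]:
    """[B_0, ..., B_{count-1}] from the recursion (6.3), with B_1 = -1/2."""
    numbers = [Fraction(1)]
    for m in range(1, count):
        numbers.append(-sum(comb(m + 1, k) * numbers[k] for k in range(m)) / (m + 1))
    return numbers


def bernoulli_poly(n: int, numbers: list[Fraction]) -> list[Fraction]:
    """Coefficients of B_n(x): the coefficient of x^k is C(n, k) * B_{n-k}."""
    return [comb(n, k) * numbers[n - k] for k in range(n + 1)]


def evaluate(coeffs: list[Fraction], x: Fraction) -> Fraction:
    """Evaluate the polynomial with the given coefficient list at x."""
    return sum((c * x**k for k, c in enumerate(coeffs)), Fraction(0))


def derivative(coeffs: list[Fraction]) -> list[Fraction]:
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]


if __name__ == "__main__":
    B = bernoulli_numbers(8)
    for n in range(5):
        print(n, bernoulli_poly(n, B))
```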
6.2.1 The 1-Periodic Functions $\tilde{B}_n$
Let $\tilde{B}_n$ denote the 1-periodic extension of $B_n$ restricted to $[0,1]$ to $\mathbb{R}$. That is, if $x \in \mathbb{R}$, choose $p \in \mathbb{Z}$ such that $x - p \in [0,1]$ and define
\[ \tilde{B}_n(x) = B_n(x - p). \]
Since $B_n(0) = B_n(1)$ when $n \ne 1$, it is immediate that $\tilde{B}_n$ is uniquely determined and continuous provided $n \ne 1$. When $n = 1$, we need to be careful as $B_1(0) \ne B_1(1)$. What we do is take $\tilde{B}_1(x) = B_1(x - p)$ if $x \notin \mathbb{Z}$ and define $\tilde{B}_1(x) = 0$ if $x$ is an integer. The resulting function will then have a jump discontinuity at integer points. See Fig. 6.2.
For all $x \in \mathbb{R}$, $p \in \mathbb{Z}$, and $n \ge 0$, we have $\tilde{B}_n(x + p) = \tilde{B}_n(x)$ (that is, the functions $\tilde{B}_n$ are all 1-periodic). If $x \in (j, j+1)$, then
\[ \tilde{B}_1(x) = x - j - \frac{1}{2}. \]
Fig. 6.2 Graph of $\tilde{B}_1$
Lemma 6.2.9 For all $A \ge 0$,
\[ \left|\int_0^A \tilde{B}_1(x)\,dx\right| \le \frac{1}{8}. \]
Proof We have $\int_j^{j+1}\tilde{B}_1(x)\,dx = 0$, for all $j \in \mathbb{Z}$. It is clear from Fig. 6.2 that if $j + y \in [j, j+1]$, then we maximize $|\int_j^{j+y}\tilde{B}_1(x)\,dx|$ when $y = \frac{1}{2}$. Obviously, $\int_j^{j+\frac{1}{2}}\tilde{B}_1(x)\,dx = -\frac{1}{8}$. The result follows since we can write $A$ uniquely as $j + y$, where $y \in [0,1)$, $j \in \mathbb{Z}$. □
We now compute the Fourier series of $\tilde{B}_n$. This will give new and remarkable expressions for the original Bernoulli polynomials $B_n(x)$.
Theorem 6.2.10
(1) If $n \ge 2$ is even,
\[ \tilde{B}_n(x) = \frac{2(-1)^{\frac{n}{2}+1}\,n!}{(2\pi)^n}\sum_{k=1}^{\infty}\frac{\cos(2\pi kx)}{k^n}, \quad x \in \mathbb{R}. \]
(2) If $n \ge 1$ is odd,
\[ \tilde{B}_n(x) = \frac{2(-1)^{\frac{n+1}{2}}\,n!}{(2\pi)^n}\sum_{k=1}^{\infty}\frac{\sin(2\pi kx)}{k^n}, \quad x \in \mathbb{R}. \]
Remark 6.2.11 Theorem 6.2.10 gives expressions for the Bernoulli polynomials $B_n(x)$: restrict $\tilde{B}_n(x)$ to $[0,1]$. There is one proviso: the infinite series formula we get for $B_1(x)$ is only valid for $x \in (0,1)$.
Before we prove Theorem 6.2.10, we give several corollaries.
Corollary 6.2.12
\[ \sum_{k=1}^{\infty}\frac{1}{k^n} = \frac{(-1)^{\frac{n}{2}+1}B_n(2\pi)^n}{2(n!)}, \quad n = 2, 4, 6, \cdots \]
Proof Take $x = 0$ in (1) of Theorem 6.2.10. □
Example 6.2.13 $\sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{\pi^2}{6}$, $\sum_{k=1}^{\infty}\frac{1}{k^4} = \frac{\pi^4}{90}$.
Remark 6.2.14 The problem of computing $\sum_{k=1}^{\infty}\frac{1}{k^n}$ when $n$ is odd is much less well understood. It was only in 1978 that it was shown that $\sum_{k=1}^{\infty}\frac{1}{k^3} \notin \mathbb{Q}$.
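Corollary 6.2.12 can be illustrated by comparing its closed form with partial sums of the series. The sketch below hard-codes the Bernoulli numbers $B_2, B_4, B_6$ listed earlier in the section; the function names and the number of terms are assumptions made for the example.

```python
import math
from fractions import Fraction

# Even-index Bernoulli numbers quoted in the text.
BERNOULLI_EVEN = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def zeta_even(n: int) -> float:
    """Closed form of sum_{k>=1} 1/k^n for even n, as in Corollary 6.2.12."""
    sign = (-1) ** (n // 2 + 1)
    return sign * float(BERNOULLI_EVEN[n]) * (2.0 * math.pi) ** n / (2 * math.factorial(n))


def partial_sum(n: int, terms: int) -> float:
    """Partial sum of sum_{k>=1} 1/k^n with the given number of terms."""
    return sum(1.0 / k**n for k in range(1, terms + 1))


if __name__ == "__main__":
    for n in (2, 4, 6):
        print(n, zeta_even(n), partial_sum(n, 100_000))
```

The slow agreement for $n = 2$ (the tail after $N$ terms is about $1/N$) is one motivation for the Euler–Maclaurin estimates of Sect. 6.3.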
Corollary 6.2.15 The Bernoulli number $B_{2n}$ is strictly positive iff $n$ is odd, else $B_{2n}$ is strictly negative.
Proof Left to the exercises. □
Corollary 6.2.16 $B_{2n+1} = 0$, $n \ge 1$.
Proof Take $x = 0$ in (2) of Theorem 6.2.10 and use $B_n(0) = B_n$, Lemma 6.2.4. □
Corollary 6.2.17 $B_{2n} \sim \frac{2(-1)^{n+1}(2n)!}{(2\pi)^{2n}}$. That is,
\[ \lim_{n\to\infty} B_{2n}\Big/\frac{2(-1)^{n+1}(2n)!}{(2\pi)^{2n}} = 1. \]
Proof $B_{2n}\big/\left(\frac{2(-1)^{n+1}(2n)!}{(2\pi)^{2n}}\right) = \sum_{k=1}^{\infty}\frac{1}{k^{2n}}$. We leave it to the exercises to show $\lim_{n\to\infty}\sum_{k=1}^{\infty}\frac{1}{k^{2n}} = 1$. □
Corollary 6.2.18 $\lim_{n\to\infty}|B_{2n}| = \infty$.
Proof This is a simple consequence of the previous corollary: it suffices to show that $\lim_{n\to\infty}|B_{2n+2}/B_{2n}| = \infty$. We leave the details to the exercises. □
Remark 6.2.19 For a more precise estimate on the growth of $B_{2n}$, see Exercises 6.3.9(4).
Corollary 6.2.20
\[ |\tilde{B}_{2n}(x)| \le |B_{2n}|, \quad n \in \mathbb{N},\ x \in \mathbb{R}. \]
Proof By (1) of Theorem 6.2.10 the maximum value of $|\tilde{B}_{2n}(x)|$ is attained at $x = 0$. But $|B_{2n}(0)| = |B_{2n}|$ (Lemma 6.2.4). □
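Lemma 6.2.21 For all $n \in \mathbb{N}$,
(1) $\int_j^{j+1}\tilde{B}_n(x)\,dx = 0$, $j \in \mathbb{Z}$.
(2) $|\int_1^A \tilde{B}_{2n}(x)\,dx| \le |B_{2n}|$, all $A \ge 1$, $n \in \mathbb{N}$.
Proof We have
\begin{align*}
\int_j^{j+1}\tilde{B}_n(x)\,dx &= \int_j^{j+1}\frac{d}{dx}\left(\frac{\tilde{B}_{n+1}(x)}{n+1}\right)dx \\
&= (\tilde{B}_{n+1}(j+1) - \tilde{B}_{n+1}(j))/(n+1) \\
&= 0,
\end{align*}
by 1-periodicity of $\tilde{B}_{n+1}$. The second statement follows from Corollary 6.2.20 and (1). □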
Proof of Theorem 6.2.10 In order to compute the Fourier coefficients, we use integration by parts together with Lemmas 6.2.4, 6.2.6. Suppose that $n \ge 1$ and
\[ a_0 + \sum_{k=1}^{\infty}\left(a_k\cos(2\pi kx) + b_k\sin(2\pi kx)\right) \]
is the Fourier series of $\tilde{B}_n$. Since $\tilde{B}_n$ is piecewise smooth for $n > 1$, the Fourier series of $\tilde{B}_n$ converges uniformly to $\tilde{B}_n$ if $n > 1$, by Theorem 5.5.29. In case $n = 1$, our definition of the value of $\tilde{B}_1$ at integer points guarantees by Theorem 5.5.26 that the Fourier series converges pointwise to $\tilde{B}_1$.
We can combine the computation of the cosine and sine coefficients by making use of complex numbers. More precisely, given $n \ge 1$, we define $c_k^{(n)} \in \mathbb{C}$ by
\[ c_0^{(n)} = a_0, \qquad 2c_k^{(n)} = a_k - i b_k, \quad k \ge 1. \]
The choice of the factor 2 and the minus sign is purely to optimize the computations. With these conventions we have
\[ a_k = 2\,\mathrm{Re}(c_k^{(n)}), \qquad b_k = -2\,\mathrm{Im}(c_k^{(n)}), \quad k \ge 1. \]
Using Remark 5.5.14, we have
\[ c_k^{(n)} = \int_0^1 B_n(t)e^{-i2\pi kt}\,dt, \quad k \ge 0. \]
Taking $k = 0$, we have
\begin{align*}
c_0^{(n)} &= \int_0^1 B_n(t)\,dt \\
&= \int_0^1\frac{B_{n+1}'(t)}{n+1}\,dt, \quad\text{by Lemma 6.2.6} \\
&= \frac{B_{n+1}(1) - B_{n+1}(0)}{n+1} \\
&= 0, \quad\text{by Lemma 6.2.4.}
\end{align*}
Next suppose $k \ge 1$, $n \ge 2$. We have
\begin{align*}
c_k^{(n)} &= \int_0^1 B_n(t)e^{-i2\pi kt}\,dt \\
&= \left[-\frac{1}{i2\pi k}B_n(t)e^{-i2\pi kt}\right]_0^1 + \frac{1}{i2\pi k}\int_0^1 B_n'(t)e^{-i2\pi kt}\,dt \\
&= \frac{n}{i2\pi k}\int_0^1 B_{n-1}(t)e^{-i2\pi kt}\,dt, \quad\text{by Lemmas 6.2.4, 6.2.6} \\
&= \frac{n}{i2\pi k}\,c_k^{(n-1)}.
\end{align*}
Iterating, we obtain
\[ c_k^{(n)} = \frac{n!}{(i2\pi k)^{n-1}}\,c_k^{(1)}, \quad n \ge 2. \]
It remains to compute $c_k^{(1)}$:
\begin{align*}
c_k^{(1)} &= \int_0^1 B_1(t)e^{-i2\pi kt}\,dt \\
&= \int_0^1\left(t - \frac{1}{2}\right)e^{-i2\pi kt}\,dt, \quad\text{definition of } B_1 \\
&= \left[-\frac{1}{i2\pi k}\left(t - \frac{1}{2}\right)e^{-i2\pi kt}\right]_0^1 + \frac{1}{i2\pi k}\int_0^1 e^{-i2\pi kt}\,dt \\
&= -\frac{1}{i2\pi k}\left(\frac{1}{2} + \frac{1}{2}\right) \\
&= -\frac{1}{i2\pi k}.
\end{align*}
Therefore,
\[ c_k^{(n)} = -\frac{n!}{(i2\pi k)^n}, \quad n \ge 1,\ k \ge 1. \]
Since $a_k = 2\,\mathrm{Re}(c_k^{(n)})$ and $b_k = -2\,\mathrm{Im}(c_k^{(n)})$, we have
\begin{align*}
a_k &= \frac{2(n!)(-1)^{\frac{n}{2}+1}}{(2\pi k)^n}, & b_k &= 0, & &n \text{ even}, \\
a_k &= 0, & b_k &= \frac{2(n!)(-1)^{\frac{n+1}{2}}}{(2\pi k)^n}, & &n \text{ odd}.
\end{align*}
This completes the proof of Theorem 6.2.10. □
EXERCISES 6.2.22
(1) Prove Lemma 6.2.4.
(2) Verify that $B_3(x) = x^3 - \frac{3}{2}x^2 + \frac{1}{2}x$ and $B_4(x) = x^4 - 2x^3 + x^2 - \frac{1}{30}$.
(3) Prove Corollary 6.2.15.
(4) Show that $\lim_{n\to\infty}\sum_{k=1}^{\infty}\frac{1}{k^n} = 1$.
(5) Show that $\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)^3} = \frac{\pi^3}{32}$. More generally, for all $p \in \mathbb{N}$ find a formula for $\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)^{2p-1}}$.
(6) Complete the proof of Corollary 6.2.18.
(7) Prove that for all $n \in \mathbb{N}$, $\int_n^{\infty}\frac{\tilde{B}_3(x)}{x^3}\,dx \ge 0$ (we use this result in the next section).
(8) (Generating function for Bernoulli polynomials.) Prove that $\frac{xe^{xy}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(y)\frac{x^n}{n!}$ (for this note the proof of Lemma 6.2.3 and that $\frac{x}{e^x - 1}$ is the generating function for the Bernoulli numbers).
6.3 The Euler–Maclaurin Formula
Let $f : [1,\infty) \to \mathbb{R}$ be a smooth or analytic function. In this section we develop a formula, known as the Euler–Maclaurin (summation) formula, that allows us to estimate a sum $\sum_{k=1}^{n} f(k)$ in terms of $\int_1^n f(x)\,dx$ and various expressions, involving Bernoulli numbers, together with a remainder term. More precisely, given an integer $r \ge 0$, we have
\[ \sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx + \frac{f(1) + f(n)}{2} - \sum_{k=1}^{r}\frac{B_{2k}}{(2k)!}\left[f^{(2k-1)}(1) - f^{(2k-1)}(n)\right] + R(r,n). \]
Typically, the infinite series $\sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left[f^{(2k-1)}(1) - f^{(2k-1)}(n)\right]$ does not converge and so the remainder term $R(r,n)$ may diverge as $r \to \infty$. However, $R(r,n)$ is given explicitly as an integral and, with a careful choice of $r$ (preferably not too large but typically depending on $n$), we can often make $R(r,n)$ very small and thereby get a good estimate on $\sum_{k=1}^{n} f(k)$ (see also below where we discuss the strategy for applying the Euler–Maclaurin formula).
The formula was discovered independently by Euler and Maclaurin in about 1735. Euler applied the formula to compute $\sum_{n=1}^{\infty} n^{-2}$ to 20 decimal places and likely used the result to conjecture that the sum was $\pi^2/6$, a result he proved later that year (1735). As an indication of the power of the Euler–Maclaurin formula, a direct computation of $\sum_{n=1}^{\infty} n^{-2}$ requires about $10^{20}$ terms to get 20 decimal places of accuracy (over three trillion years work at one calculated term per second).
There are many applications of the Euler–Maclaurin formula including Stirling's formula (estimating $n!$) as well as good estimates for Euler's constant and sums of infinite series such as $\sum_{n=1}^{\infty} n^{-3}$.
We start by proving a simple special case of the formula and then, after giving an application to Stirling's formula, we proceed to state and prove the general case.
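6.3.1 The Euler–Maclaurin Formula for $r = 0$
Proposition 6.3.1 Let $f : [1,\infty) \to \mathbb{R}$ be $C^1$. For $n \in \mathbb{N}$ we have
\[ \sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx + \frac{f(1) + f(n)}{2} + \int_1^n \tilde{B}_1(x)f'(x)\,dx. \]
Proof We have $\int_1^n f(x)\,dx = \sum_{k=1}^{n-1}\int_k^{k+1} f(x)\,dx$. Now
\begin{align*}
\int_k^{k+1} f(x)\,dx &= \int_k^{k+1} f(x)\frac{d}{dx}\left(x - k - \frac{1}{2}\right)dx \\
&= \left[\left(x - k - \frac{1}{2}\right)f(x)\right]_k^{k+1} - \int_k^{k+1}\left(x - k - \frac{1}{2}\right)f'(x)\,dx \\
&= \frac{f(k+1) + f(k)}{2} - \int_k^{k+1} f'(x)\tilde{B}_1(x)\,dx,
\end{align*}
where we have used the observation that the 1-periodic extension of $B_1(x) = x - \frac{1}{2}$ to $\mathbb{R}$ is equal to $x - k - \frac{1}{2}$ on $(k, k+1)$. Summing from $k = 1$ to $n - 1$, we get
\[ \int_1^n f(x)\,dx = \sum_{k=1}^{n} f(k) - \frac{f(1) + f(n)}{2} - \int_1^n \tilde{B}_1(x)f'(x)\,dx, \]
and rearranging we obtain the required result. □
Example 6.3.2 (Stirling's Formula, Version 1) We show that $n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{-\delta_n}$, where $\lim_{n\to\infty}\delta_n = 0$.
Taking $f(x) = \log x$ and applying Proposition 6.3.1, we find that
\[ \log(n!) = \sum_{k=1}^{n}\log k = n\log n - n + 1 + \frac{\log n}{2} + \int_1^n\frac{\tilde{B}_1(x)}{x}\,dx, \]
where we have used $\int_1^n \log x\,dx = n\log n - n + 1$. Noting that $n\log n - n + \frac{1}{2}\log n = \log(n^{n+\frac{1}{2}}e^{-n})$, we have
\[ \log(n!) = \log(n^{n+\frac{1}{2}}e^{-n}) + 1 + \int_1^{\infty}\frac{\tilde{B}_1(x)}{x}\,dx - \delta_n, \]
where $\delta_n = \int_n^{\infty}\frac{\tilde{B}_1(x)}{x}\,dx$. Set $C = 1 + \int_1^{\infty}\frac{\tilde{B}_1(x)}{x}\,dx$ so that $\log n! = \log(n^{n+\frac{1}{2}}e^{-n}) + C - \delta_n$. Exponentiating, we obtain
\[ n! = e^{C}n^{n+\frac{1}{2}}e^{-n}e^{-\delta_n}. \]
It remains to prove that (a) $\delta_n \to 0$ as $n \to \infty$, and (b) $e^C = \sqrt{2\pi}$.
(a) Let $A \ge n$. By Lemma 6.2.9 and the 1-periodicity of $\tilde{B}_1$, we have
\[ \left|\int_n^A \tilde{B}_1(x)\,dx\right| = \left|\int_0^{A-n}\tilde{B}_1(x)\,dx\right| \le 1/8. \]
Set $F(x) = \int_n^x \tilde{B}_1(t)\,dt$. Integrating by parts, we have
\begin{align*}
\int_n^A\frac{\tilde{B}_1(x)}{x}\,dx &= F(x)/x\,\big|_{x=n}^{A} + \int_n^A F(x)/x^2\,dx \\
&= F(A)/A + \int_n^A F(x)/x^2\,dx.
\end{align*}
Therefore
\begin{align*}
\left|\int_n^A\frac{\tilde{B}_1(x)}{x}\,dx\right| &\le \frac{1}{8A} + \int_n^A |F(x)|/x^2\,dx \\
&\le \frac{1}{8A} + \frac{1}{8}\int_n^A\frac{1}{x^2}\,dx \\
&= \frac{1}{8A} + \frac{1}{8}\left[-\frac{1}{x}\right]_{x=n}^{A} \\
&= \frac{1}{8n}.
\end{align*}
Since this estimate holds for all $A \ge n$, we have shown that $|\delta_n| \le \frac{1}{8n}$ and so $\lim_{n\to\infty}\delta_n = 0$, proving (a).
(b) We recall Wallis's formula from Example 5.5.32: $\lim_{n\to\infty}\frac{4^{2n}(n!)^4}{[(2n)!]^2(2n+1)} = \frac{\pi}{2}$. Taking square roots gives
\[ \lim_{n\to\infty}\frac{4^n(n!)^2}{(2n)!\sqrt{2n+1}} = \sqrt{\frac{\pi}{2}}. \]
Substituting our expressions for $n!$ and $(2n)!$ in $\frac{4^n(n!)^2}{(2n)!\sqrt{2n+1}}$, we have
\begin{align*}
\frac{4^n(n!)^2}{(2n)!\sqrt{2n+1}} &= \frac{2^{2n}\left[e^{C}n^{n+\frac{1}{2}}e^{-n}e^{-\delta_n}\right]^2}{e^{C}(2n)^{2n+\frac{1}{2}}e^{-2n}e^{-\delta_{2n}}\sqrt{2n+1}} \\
&= e^{C}\frac{\sqrt{n}}{\sqrt{2(2n+1)}}\,e^{\delta_{2n}-2\delta_n}.
\end{align*}
Since $\lim_{n\to\infty}\delta_n = 0$,
\[ \sqrt{\frac{\pi}{2}} = \lim_{n\to\infty} e^{C}\frac{\sqrt{n}}{\sqrt{2(2n+1)}}\,e^{\delta_{2n}-2\delta_n} = e^{C}/2. \]
Hence $e^C = \sqrt{2\pi}$.
Remark 6.3.3 Using the explicit form $\tilde{B}_1(x) = x - j - \frac{1}{2}$, $x \in (j, j+1)$, together with the strict monotonicity of $1/x$, we may show easily that $\delta_n < 0$. Consequently, $e^{-\delta_n} > 1$ and, since $|\delta_n| \le \frac{1}{8n}$, we have the estimate
\[ n! \in \left[\sqrt{2\pi n}\left(\frac{n}{e}\right)^n,\ \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{1/8n}\right], \quad n \in \mathbb{N}. \]
We improve on this estimate shortly.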
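The two-sided estimate of Remark 6.3.3 is equivalent to $0 \le \log(n!) - \log\left(\sqrt{2\pi n}\,(n/e)^n\right) \le \frac{1}{8n}$, which is convenient to test in logarithmic form. The sketch below is only a numerical illustration; the function name `stirling_log_gap` is hypothetical and `math.lgamma(n + 1)` is used for $\log(n!)$.

```python
import math


def stirling_log_gap(n: int) -> float:
    """Return log(n!) - log(sqrt(2 pi n) (n/e)^n), i.e. -delta_n in Example 6.3.2."""
    stirling_log = 0.5 * math.log(2.0 * math.pi * n) + n * math.log(n) - n
    return math.lgamma(n + 1) - stirling_log


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100, 1000):
        # The gap should lie between 0 and 1/(8n).
        print(n, stirling_log_gap(n), 1.0 / (8 * n))
```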
6.3.2 General Version of the Euler–Maclaurin Formula
Theorem 6.3.4 Let $n, r$ be non-negative integers with $n > 0$ and let $f : [1,\infty) \to \mathbb{R}$ be at least $C^{2r+1}$. Then
\[ \int_1^n f(x)\,dx = \sum_{k=1}^{n} f(k) - \frac{f(1) + f(n)}{2} - \sum_{j=1}^{r}\frac{B_{2j}}{(2j)!}\left[f^{(2j-1)}(n) - f^{(2j-1)}(1)\right] + \frac{1}{(2r)!}\int_1^n \tilde{B}_{2r}(x)f^{(2r)}(x)\,dx. \]
Moreover,
\[ \frac{1}{(2r)!}\int_1^n \tilde{B}_{2r}(x)f^{(2r)}(x)\,dx = -\frac{1}{(2r+1)!}\int_1^n \tilde{B}_{2r+1}(x)f^{(2r+1)}(x)\,dx. \]
Remarks 6.3.5
(1) The utility of this result depends on being able to show that the remainder or error term $\frac{1}{(2r)!}\int_1^n \tilde{B}_{2r}(x)f^{(2r)}(x)\,dx$ converges rapidly as $n \to \infty$. This is typically the case provided that the higher derivatives $f^{(2r)}(x)$ or $f^{(2r+1)}(x)$ converge rapidly to zero as $x \to \infty$.
(2) Observe that the error term will vanish if $f$ is a polynomial of degree less than or equal to $2r$. Hence the theorem gives an explicit formula for $\sum_{k=1}^{n} p(k)$, when $p$ is a polynomial.
(3) The proof of the Euler–Maclaurin formula is rather easy, quite formal and similar to the proofs of Taylor's theorem with integral remainder (see Sect. 2.7 reviewing results from the differential calculus). Matters get more interesting when one starts to estimate.
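Remark (2) can be made concrete: for a polynomial $p$ of degree at most $2r$ the remainder vanishes, so the formula gives $\sum_{k=1}^{n} p(k)$ exactly. The sketch below works with exact rationals and stores $p$ as a coefficient list (index $k$ holds the coefficient of $x^k$). The function names are hypothetical, and only $B_2, B_4, B_6$ are tabulated, so this example is limited to degree at most 6.

```python
from fractions import Fraction
from math import factorial

# Even-index Bernoulli numbers quoted in Sect. 6.2 (enough for degree <= 6).
BERNOULLI_EVEN = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficient list at x."""
    return sum((Fraction(c) * x**k for k, c in enumerate(coeffs)), Fraction(0))


def derivative(coeffs):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def sum_by_euler_maclaurin(coeffs, n):
    """Sum p(1) + ... + p(n) via Theorem 6.3.4 with r chosen so that 2r >= deg p."""
    degree = len(coeffs) - 1
    r = (degree + 1) // 2
    # Integral of p over [1, n], term by term.
    total = sum(Fraction(c) * (n ** (k + 1) - 1) / (k + 1) for k, c in enumerate(coeffs))
    total += (evaluate(coeffs, 1) + evaluate(coeffs, n)) / 2
    odd_derivative = coeffs
    for j in range(1, r + 1):
        # Move from p^(2j-3) to p^(2j-1) (a single derivative when j = 1).
        odd_derivative = derivative(odd_derivative)
        if j > 1:
            odd_derivative = derivative(odd_derivative)
        jump = evaluate(odd_derivative, n) - evaluate(odd_derivative, 1)
        total += BERNOULLI_EVEN[2 * j] / factorial(2 * j) * jump
    return total


if __name__ == "__main__":
    print(sum_by_euler_maclaurin([0, 0, 0, 1], 10))   # sum of k^3 for k = 1..10
```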
Proof of Theorem 6.3.4 We proceed by induction on $r$. We have already proved the result for $r = 0$ (Proposition 6.3.1). Suppose the theorem is proved for $r \le R$. We prove it for $R + 1$. By Lemma 6.2.6, we have
\[ \frac{1}{(2R)!}\int_1^n \tilde{B}_{2R}(x)f^{(2R)}(x)\,dx = \frac{1}{(2R)!}\int_1^n\frac{1}{2R+1}\tilde{B}_{2R+1}'(x)f^{(2R)}(x)\,dx. \]
Integrating by parts,
\begin{align*}
\frac{1}{(2R)!}\int_1^n\frac{1}{2R+1}\tilde{B}_{2R+1}'(x)f^{(2R)}(x)\,dx &= \left[\frac{1}{(2R+1)!}\tilde{B}_{2R+1}(x)f^{(2R)}(x)\right]_{x=1}^{x=n} - \frac{1}{(2R+1)!}\int_1^n \tilde{B}_{2R+1}(x)f^{(2R+1)}(x)\,dx \\
&= -\frac{1}{(2R+1)!}\int_1^n \tilde{B}_{2R+1}(x)f^{(2R+1)}(x)\,dx,
\end{align*}
since $B_{2R+1} = 0$, if $R > 0$ by Corollary 6.2.16.
Now
\begin{align*}
-\frac{1}{(2R+1)!}\int_1^n \tilde{B}_{2R+1}(x)f^{(2R+1)}(x)\,dx &= -\frac{1}{(2R+2)!}\int_1^n \tilde{B}_{2R+2}'(x)f^{(2R+1)}(x)\,dx \\
&= -\left[\frac{1}{(2R+2)!}\tilde{B}_{2R+2}(x)f^{(2R+1)}(x)\right]_{x=1}^{x=n} + \frac{1}{(2R+2)!}\int_1^n \tilde{B}_{2R+2}(x)f^{(2R+2)}(x)\,dx \\
&= -\frac{B_{2R+2}}{(2R+2)!}\left[f^{(2R+1)}(n) - f^{(2R+1)}(1)\right] + \frac{1}{(2R+2)!}\int_1^n \tilde{B}_{2R+2}(x)f^{(2R+2)}(x)\,dx. \tag{6.4}
\end{align*}
This proves the Euler–Maclaurin formula for $r = R + 1$ and completes the inductive step. The final statement of the theorem is explicit in the second step of our computation of $\frac{1}{(2R)!}\int_1^n \tilde{B}_{2R}(x)f^{(2R)}(x)\,dx$. □
6.3.3 The Strategy
The Euler–Maclaurin formula states that
\[ \sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx + \frac{f(1) + f(n)}{2} + \sum_{j=1}^{r}\frac{B_{2j}}{(2j)!}\left[f^{(2j-1)}(n) - f^{(2j-1)}(1)\right] - \frac{1}{(2r)!}\int_1^n \tilde{B}_{2r}(x)f^{(2r)}(x)\,dx. \]
One way we can use the result is to fix $n$ and estimate the remainder $\frac{1}{(2r)!}\int_1^n \tilde{B}_{2r}(x)f^{(2r)}(x)\,dx$. This is exactly what we did in Example 6.3.2. However, suppose that $\lim_{n\to\infty} f^{(s)}(n) = 0$ for all $s \le 2r$, then letting $n \to \infty$ we find that
\[ \sum_{k=1}^{\infty} f(k) = \int_1^{\infty} f(x)\,dx + \frac{f(1)}{2} - \sum_{j=1}^{r}\frac{B_{2j}}{(2j)!}f^{(2j-1)}(1) - \frac{1}{(2r)!}\int_1^{\infty}\tilde{B}_{2r}(x)f^{(2r)}(x)\,dx. \]
Not only does this give an expression for $\sum_{k=1}^{\infty} f(k)$ but we can subtract the original finite sum formula to get
\[ \sum_{k=1}^{\infty} f(k) = \sum_{k=1}^{n} f(k) - \frac{f(n)}{2} - \sum_{j=1}^{r}\frac{B_{2j}}{(2j)!}f^{(2j-1)}(n) - \frac{1}{(2r)!}\int_n^{\infty}\tilde{B}_{2r}(x)f^{(2r)}(x)\,dx + \int_n^{\infty} f(x)\,dx. \]
Now choose a small value of $n$, say $n = 10$. Provided that we can integrate $f$, we can often easily compute all the terms on the right-hand side except the integral involving $\tilde{B}_{2r}(x)f^{(2r)}(x)$. This we have to estimate. In many cases a judicious choice of $n$ and $r$ will make this term very small; we give one or two examples shortly.
6.3.4 Application to Stirling’s Formula
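If we apply the Euler–Maclaurin formula to $f(x) = \log x$ with $r > 0$, we can obtain better estimates of $n!$.
Theorem 6.3.6 (Stirling's Formula, Version 2)
\[ \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \le n! \le \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\frac{1}{12n}}, \quad n \ge 1. \]
Proof It follows from the Euler–Maclaurin formula with $f(x) = \log x$ that
\[ \log\left(\frac{n!}{n^{n+\frac{1}{2}}e^{-n}}\right) = \sum_{j=1}^{r}\frac{B_{2j}}{2j(2j-1)}\left[\frac{1}{n^{2j-1}} - 1\right] + 1 + \frac{1}{2r}\int_1^n\frac{\tilde{B}_{2r}(x)}{x^{2r}}\,dx. \]
Let $n \to \infty$ and we get, using the version of Stirling's formula proved in Example 6.3.2,
\[ \log(\sqrt{2\pi}) = -\sum_{j=1}^{r}\frac{B_{2j}}{2j(2j-1)} + 1 + \frac{1}{2r}\int_1^{\infty}\frac{\tilde{B}_{2r}(x)}{x^{2r}}\,dx. \]
Subtract this from our expression for $\log\left(\frac{n!}{n^{n+\frac{1}{2}}e^{-n}}\right)$ to get
\[ \log\left(\frac{n!}{\sqrt{2\pi}\,n^{n+\frac{1}{2}}e^{-n}}\right) = \sum_{j=1}^{r}\frac{B_{2j}}{2j(2j-1)}\frac{1}{n^{2j-1}} - \frac{1}{2r}\int_n^{\infty}\frac{\tilde{B}_{2r}(x)}{x^{2r}}\,dx. \tag{6.5} \]
Take $r = 1$ in (6.5). The right-hand side equals $\frac{1}{12n} - \frac{1}{2}\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^2}\,dx$. We have
\begin{align*}
\frac{1}{2}\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^2}\,dx &= \frac{1}{2}\int_n^{\infty}\frac{1}{x^2}\frac{d}{dx}\left(\frac{\tilde{B}_3(x)}{3}\right)dx \\
&= -\frac{1}{6}\int_n^{\infty}\tilde{B}_3(x)\frac{d}{dx}\left(\frac{1}{x^2}\right)dx \\
&= \frac{1}{3}\int_n^{\infty}\frac{\tilde{B}_3(x)}{x^3}\,dx.
\end{align*}
Using Exercises 6.2.22(7), we have $\int_n^{\infty}\frac{\tilde{B}_3(x)}{x^3}\,dx \ge 0$. Using the explicit formula for $B_3(x)$ (see Exercises 6.2.22(2)), we find that $|\tilde{B}_3(x)| \le 1/20$ (the maximum of $|B_3|$ on $[0,1]$ is $\sqrt{3}/36$) and so $|\frac{1}{3}\int_n^{\infty}\frac{\tilde{B}_3(x)}{x^3}\,dx| \le \frac{1}{60}\int_n^{\infty}\frac{1}{x^3}\,dx = \frac{1}{120n^2} < \frac{1}{12n}$, $n \ge 1$. Hence $0 < \frac{1}{12n} - \frac{1}{2}\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^2}\,dx \le \frac{1}{12n}$ and
\[ 0 < \log\left(\frac{n!}{\sqrt{2\pi}\,n^{n+\frac{1}{2}}e^{-n}}\right) \le \frac{1}{12n}, \quad n \ge 1. \]
Exponentiating, we get the result. □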
Remark 6.3.7 If we take $r > 1$, we can find sharper estimates on the error. Indeed the result we proved above gives the second term in Stirling's series:
\[ n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n\left(1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3} - \frac{571}{2488320n^4} + \cdots\right). \]
This series does not converge. It is an example of an asymptotic expansion. For any given $n$ there are only so many initial terms that give a good approximation. Taking more terms makes the approximation worse.
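The effect of the first few terms of Stirling's series can be seen numerically. This sketch (the function name and the choice $n = 10$ are illustrative assumptions) compares truncations of the series with the exact value from `math.factorial`; with only the five coefficients quoted above it does not probe the eventual divergence of the series.

```python
import math

# Coefficients of 1, 1/n, 1/n^2, 1/n^3, 1/n^4 in Stirling's series (Remark 6.3.7).
STIRLING_COEFFS = [1.0, 1 / 12, 1 / 288, -139 / 51840, -571 / 2488320]


def stirling_series(n: int, terms: int) -> float:
    """sqrt(2 pi n) (n/e)^n times the first `terms` terms of Stirling's series."""
    base = math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n
    correction = sum(c / n**k for k, c in enumerate(STIRLING_COEFFS[:terms]))
    return base * correction


if __name__ == "__main__":
    exact = math.factorial(10)
    for terms in range(1, 6):
        approx = stirling_series(10, terms)
        print(terms, approx, abs(approx / exact - 1.0))   # relative error
```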
6.3.5 Computing Euler’s Constant
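Take $f(x) = \frac{1}{x}$ and $r = 2$ in the Euler–Maclaurin formula. We have $f'(x) = -\frac{1}{x^2}$, $f''(x) = \frac{2}{x^3}$, $f^{(3)}(x) = -\frac{6}{x^4}$, $f^{(4)}(x) = \frac{24}{x^5}$, and $\int_1^n\frac{dx}{x} = \log n$. Substituting, we get
\[ \log n = \sum_{k=1}^{n}\frac{1}{k} - \frac{1}{2}\left(1 + \frac{1}{n}\right) - \frac{1}{2}\cdot\frac{1}{6}\left(-\frac{1}{n^2} + 1\right) - \frac{1}{4!}\left(-\frac{1}{30}\right)\left(-\frac{6}{n^4} + 6\right) + \frac{1}{4!}\int_1^n \tilde{B}_4(x)\frac{24}{x^5}\,dx. \]
After some simplifying, this gives
\[ \log n = \sum_{k=1}^{n}\frac{1}{k} - \frac{1}{2} - \frac{1}{12} + \frac{1}{120} - \frac{1}{2n} + \frac{1}{12n^2} - \frac{1}{120n^4} + \int_1^n\frac{\tilde{B}_4(x)}{x^5}\,dx, \]
and so
\[ \sum_{k=1}^{n}\frac{1}{k} - \log n = \frac{1}{2} + \frac{1}{12} - \frac{1}{120} + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \int_1^n\frac{\tilde{B}_4(x)}{x^5}\,dx. \]
Letting $n \to \infty$, we get
\[ \gamma = \frac{1}{2} + \frac{1}{12} - \frac{1}{120} - \int_1^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx. \]
Since $\int_1^n = \int_1^{\infty} - \int_n^{\infty}$, this gives us
\[ \sum_{k=1}^{n}\frac{1}{k} - \log n = \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} + \int_n^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx, \]
and so we obtain an asymptotic formula for Euler's constant:
\[ \gamma = \sum_{k=1}^{n}\frac{1}{k} - \log n - \frac{1}{2n} + \frac{1}{12n^2} - \frac{1}{120n^4} - E_n, \]
where the error term $E_n = \int_n^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx$. Ignoring the error term for the moment, we find that if we take $n = 10$ then
\begin{align*}
\log 10 &= 2.302585092994\cdots \\
1/20 &= 0.05 \\
1/1200 &= 0.0008\dot{3} \\
1/1200000 &= 0.0000008\dot{3}
\end{align*}
From this, we get
\[ \sum_{k=1}^{10}\frac{1}{k} - \log 10 - \frac{1}{20} + \frac{1}{1200} - \frac{1}{1200000} = 0.577215660974\cdots \]
The true value of $\gamma$ is $\gamma = 0.577215664901\ldots$ and so our estimate is accurate to 8 decimal places. We can verify this by estimating the error term $E_{10} = \int_{10}^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx$.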
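The arithmetic for $n = 10$ is easy to reproduce. The following sketch evaluates the asymptotic formula for Euler's constant with the error term $E_n$ dropped; the function name is hypothetical and the reference value of $\gamma$ is hard-coded for comparison.

```python
import math

EULER_GAMMA = 0.5772156649015329  # reference value, hard-coded for comparison


def euler_constant_estimate(n: int) -> float:
    """H_n - log n - 1/(2n) + 1/(12 n^2) - 1/(120 n^4), i.e. the formula without E_n."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n) - 1.0 / (2 * n) + 1.0 / (12 * n**2) - 1.0 / (120 * n**4)


if __name__ == "__main__":
    for n in (5, 10, 20):
        estimate = euler_constant_estimate(n)
        print(n, estimate, abs(estimate - EULER_GAMMA))
```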
6.3.6 Estimating $E_{10}$
We have $|\tilde{B}_4(x)| \le |B_4|$, $x \in \mathbb{R}$, by Corollary 6.2.20. Therefore
\begin{align*}
\left|\int_{10}^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx\right| &\le \int_{10}^{\infty}\frac{|B_4|}{x^5}\,dx \\
&= \frac{1}{30}\int_{10}^{\infty}\frac{1}{x^5}\,dx \\
&= \frac{1}{30}\left[-\frac{x^{-4}}{4}\right]_{10}^{\infty} \\
&= \frac{1}{120}\times 10^{-4} \\
&< 10^{-6}.
\end{align*}
Hence $|E_{10}| < 10^{-6}$. We can do better by using the second form of the error term in the Euler–Maclaurin formula. We have
\[ \int_{10}^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx = \int_{10}^{\infty}\frac{\tilde{B}_5(x)}{x^6}\,dx. \]
This time we have to deal with an odd Bernoulli polynomial and we can no longer use Corollary 6.2.20. What we shall do is use an estimate on improper integrals (really an integral version of Dirichlet's test) and then reduce to a problem that involves estimating $\int_{10}^{\infty}\tilde{B}_6(x)\,dx$, where we can again use Corollary 6.2.20.
First we prove a powerful lemma on improper integrals.
Lemma 6.3.8 Suppose that $g, h : [a,\infty) \to \mathbb{R}$ and $g$ is continuous and $h$ is $C^1$. Assume further that
(a) $h(x)$ is decreasing and converges to zero as $x \to \infty$,
(b) there exists an $M \ge 0$ such that $|\int_a^A g(x)\,dx| \le M$ for all $A \ge a$.
Then $\int_a^{\infty} g(x)h(x)\,dx$ exists and
\[ \left|\int_a^{\infty} g(x)h(x)\,dx\right| \le Mh(a). \]
Proof We verify the estimate and leave the proof of convergence to the exercises (the proof uses Exercises 6.1.23(1), but the crucial step is done below). Let $A \ge a$. Set $G(x) = \int_a^x g(t)\,dt$. Integrating by parts we have
\begin{align*}
\int_a^A g(x)h(x)\,dx &= [G(x)h(x)]_{x=a}^{x=A} - \int_a^A G(x)h'(x)\,dx \\
&= G(A)h(A) - \int_a^A G(x)h'(x)\,dx.
\end{align*}
We have to estimate both terms in this equation. Obviously,
\[ |G(A)h(A)| \le Mh(A). \]
Since $h'(x) \le 0$, we have
\begin{align*}
\left|\int_a^A G(x)h'(x)\,dx\right| &= \left|\int_a^A G(x)(-h'(x))\,dx\right| \\
&\le \int_a^A |G(x)|(-h'(x))\,dx \\
&\le -M\int_a^A h'(x)\,dx \\
&= M(h(a) - h(A)).
\end{align*}
Therefore
\[ \left|\int_a^A g(x)h(x)\,dx\right| \le Mh(A) + M(h(a) - h(A)) = Mh(a). \]
Letting $A \to \infty$ the result follows. □
If we take $g(x) = \tilde{B}_5(x)$ then $g$ satisfies (b) of Lemma 6.3.8 by Lemma 6.2.21 and the 1-periodicity of $\tilde{B}_5$. Since $h(x) = x^{-6}$ obviously satisfies (a) of Lemma 6.3.8, we have
\[ \left|\int_{10}^{\infty}\frac{\tilde{B}_5(x)}{x^6}\,dx\right| \le 10^{-6}\sup_{A\ge 1}\left|\int_1^A \tilde{B}_5(x)\,dx\right|. \]
But
\begin{align*}
\left|\int_1^A \tilde{B}_5(x)\,dx\right| &= \left|\int_1^A\frac{d}{dx}\frac{\tilde{B}_6(x)}{6}\,dx\right| \\
&= \left|\frac{\tilde{B}_6(A) - B_6}{6}\right| \\
&\le \frac{2B_6}{6}, \quad\text{Corollary 6.2.20} \\
&= \frac{1}{3}B_6 = \frac{1}{3}\cdot\frac{1}{42} = \frac{1}{126}.
\end{align*}
Hence $|\int_{10}^{\infty}\frac{\tilde{B}_4(x)}{x^5}\,dx| \le \frac{1}{126}\times 10^{-6} < 10^{-8}$. Hence the error $|E_{10}|$ in our computation of Euler's constant is less than $10^{-8}$.
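6.3.7 Estimating $\sum_{k=1}^{\infty}\frac{1}{k^2}$
This time we apply Euler–Maclaurin to $f(x) = 1/x^2$ and take $r = 1$. We have $f'(x) = -\frac{2}{x^3}$, $f''(x) = \frac{6}{x^4}$ and $\int_1^n\frac{dx}{x^2} = 1 - \frac{1}{n}$. Substituting, we get
\[ 1 - \frac{1}{n} = \sum_{k=1}^{n}\frac{1}{k^2} - \frac{1}{2}\left(1 + \frac{1}{n^2}\right) - \frac{B_2}{2!}\left(-\frac{2}{n^3} + 2\right) + 3\int_1^n\frac{\tilde{B}_2(x)}{x^4}\,dx. \]
Taking $B_2 = 1/6$, we get
\[ \sum_{k=1}^{n}\frac{1}{k^2} = \frac{5}{3} - \frac{1}{n} + \frac{1}{2n^2} - \frac{1}{6n^3} - 3\int_1^n\frac{\tilde{B}_2(x)}{x^4}\,dx. \]
Letting $n \to \infty$ gives
\[ \sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{5}{3} - 3\int_1^{\infty}\frac{\tilde{B}_2(x)}{x^4}\,dx. \]
Writing $\int_1^n = \int_1^{\infty} - \int_n^{\infty}$ and substituting, we get an asymptotic formula for $\sum_{k=1}^{\infty}\frac{1}{k^2}$:
\[ \sum_{k=1}^{\infty}\frac{1}{k^2} = \sum_{k=1}^{n}\frac{1}{k^2} + \frac{1}{n} - \frac{1}{2n^2} + \frac{1}{6n^3} - 3\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^4}\,dx. \]
Using the known value $\sum_{k=1}^{\infty}\frac{1}{k^2} = \pi^2/6$, this gives an estimate accurate to 6 decimal places if we take $n = 10$ and ignore the remainder. We can do much better if we take $r = 2$.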
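The accuracy of the $r = 1$ estimate can be checked directly. The sketch below evaluates $\sum_{k=1}^{n} k^{-2} + \frac{1}{n} - \frac{1}{2n^2} + \frac{1}{6n^3}$, where the last correction is $-\frac{B_2}{2!}f'(n)$ with $B_2 = \frac{1}{6}$ and $f'(n) = -2/n^3$, and compares it with $\pi^2/6$. The function name is a hypothetical choice for the example.

```python
import math


def zeta2_estimate(n: int) -> float:
    """Partial sum of 1/k^2 plus the r = 1 Euler-Maclaurin tail corrections."""
    partial = sum(1.0 / k**2 for k in range(1, n + 1))
    return partial + 1.0 / n - 1.0 / (2 * n**2) + 1.0 / (6 * n**3)


if __name__ == "__main__":
    target = math.pi**2 / 6
    for n in (5, 10, 20):
        estimate = zeta2_estimate(n)
        print(n, estimate, abs(estimate - target))
```

For comparison, the uncorrected partial sum with $n = 10$ is off by about $0.095$.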
EXERCISES 6.3.9
(1) Using the Euler–Maclaurin formula with $r = 1$, show that
\[ \sum_{k=1}^{n}\frac{1}{k^3} = \frac{5}{4} - \frac{1}{2n^2} + \frac{1}{2n^3} - \frac{1}{4n^4} - 6\int_1^n\frac{\tilde{B}_2(x)}{x^5}\,dx. \]
Hence find a formula for $\sum_{k=1}^{\infty}\frac{1}{k^3}$ in terms of $\int_1^{\infty}\frac{\tilde{B}_2(x)}{x^5}\,dx$ and deduce the asymptotic formula
\[ \sum_{k=1}^{\infty}\frac{1}{k^3} = \sum_{k=1}^{n}\frac{1}{k^3} + \frac{1}{2n^2} - \frac{1}{2n^3} + \frac{1}{4n^4} - 6\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^5}\,dx. \]
Verify that $|6\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^5}\,dx| \le \frac{1}{n^5}$ and hence find the smallest value of $n$ that you can take in the formula above to estimate $\sum_{k=1}^{\infty}\frac{1}{k^3}$ to within $10^{-3}$.
(2) Show that the estimate used in Q1 can be improved to $|6\int_n^{\infty}\frac{\tilde{B}_2(x)}{x^5}\,dx| \le \frac{1}{n^6}$.
(3) Using Stirling's formula, show that
\[ \lim_{n\to\infty}\frac{\sqrt{3}\,[3^n(n!)]^3}{n\,(3n)!} \]
exists and equals $2\pi$. What is $\lim_{n\to\infty}\frac{\sqrt{5}\,[5^n(n!)]^5}{n^2\,(5n)!}$?
(4) Show that if $\alpha = 1 + \frac{p}{n}$, where $0 \le p < n$, then $B_{2n}$ grows at least as fast as $(n\alpha)!$ (that is, there exists a $C = C(p) > 0$ such that for all sufficiently large $n$, $B_{2n} \ge C(n\alpha)!$). Hint: Stirling's formula and Corollary 6.2.17.
(5) Prove the existence of the infinite integral in Lemma 6.3.8 (you will need Exercises 6.1.23(1)).
(6) Prove that the integral $\int_1^{\infty}\frac{\sin x}{x^{\alpha}}\,dx$ converges provided that $\alpha > 0$. What can you say about $\int_0^{\infty}\frac{\sin x}{x^{\alpha}}\,dx$?
(7) Define $F(\alpha) = \int_0^{\infty}\frac{\sin x}{x^{\alpha}}\,dx$. Verify that $F$ is a $C^{\infty}$-function on $(0,2)$. (You may assume or easily show that given $p \in \mathbb{N}$, $\alpha > 0$, then $(\log x)^p x^{-\alpha}$ is decreasing for all sufficiently large $x$ and if $\alpha > 0$, there exists $0 < \beta < \alpha$, $C > 0$, such that $|(\log x)^p x^{\alpha}| \le Cx^{\beta}$ on $(0,1]$.)
Chapter 7
Metric Spaces
In this chapter we develop the theory of metric spaces. The idea of a metric or
distance on a set is simple, intuitive and powerful. The concept is natural for many
important mathematical structures including vector spaces, geometric objects such
as surfaces or manifolds, and spaces of continuous and differentiable functions.
Although metric spaces need not have any vector space structure, they provide
an ideal abstract framework for studying sequences, convergence and continuous
functions. In the first half of the chapter we focus on foundations and examples. In
particular, we define open and closed sets and show that a metric space has a natural
topology of open sets. Using this idea we will be able to give a natural ‘preservation
of structure’ definition of continuity that avoids the surfeit of quantifiers in the $\varepsilon, \delta$-definition. In the remainder of the chapter we develop theory and give results, such
as the Arzelà–Ascoli theorem, that generalize the Bolzano–Weierstrass theorem
to spaces of continuous functions. We also prove the simple, yet very powerful,
contraction mapping lemma. We use this result in Chap. 8 to prove a fundamental
result on the existence of fractals and in the final chapter to prove results including
the implicit and inverse function theorems and the existence and uniqueness theorem
for ordinary differential equations.
7.1 Basic Definitions and Examples
Definition 7.1.1 A metric space $(X, d)$ consists of a (non-empty) set $X$ together with a real-valued function $d(x,y)$ on $X^2$ which satisfies
(1) $d(x,y) \ge 0$ for all $x, y \in X$.
(2) $d(x,y) = 0$ iff $x = y$.
(3) $d(x,y) = d(y,x)$ for all $x, y \in X$.
(4) $d(x,z) \le d(x,y) + d(y,z)$ for all $x, y, z \in X$.
We call d a metric (or distance) on X and often refer to the metric space X if the
metric d is clear from the context.
Remark 7.1.2 We refer to (4) of Definition 7.1.1 as the triangle inequality—see
examples (1,2) below for justification.
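The four conditions of Definition 7.1.1 can be tested mechanically on a finite sample of points. Such a test can only refute the metric property, never prove it, but it is a useful sanity check. The sketch below is illustrative: all function names are hypothetical, and the candidate metrics (absolute value on $\mathbb{R}$, the Euclidean distance on $\mathbb{R}^2$, a 0/1-valued distance, and the squared difference, which fails the triangle inequality) are chosen only as examples.

```python
import math
from itertools import product


def is_metric_on_sample(d, points, tol=1e-12):
    """Check conditions (1)-(4) of Definition 7.1.1 for d on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                        # (1) non-negativity
            return False
        if (d(x, y) == 0) != (x == y):         # (2) d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # (3) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # (4) triangle inequality
            return False
    return True


def absolute_value(x, y):
    return abs(x - y)


def euclidean(x, y):
    return math.dist(x, y)


def zero_one(x, y):
    return 0 if x == y else 1


def squared_difference(x, y):
    return (x - y) ** 2


if __name__ == "__main__":
    reals = [-2.0, 0.0, 0.5, 1.0, 3.0]
    plane = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, -1.0)]
    print(is_metric_on_sample(absolute_value, reals))
    print(is_metric_on_sample(euclidean, plane))
    print(is_metric_on_sample(zero_one, ["a", "b", "c"]))
    print(is_metric_on_sample(squared_difference, [0.0, 1.0, 2.0]))
```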
Examples 7.1.3
(1) $(\mathbb{R}, |\cdot|)$, where $|\cdot|$ denotes absolute value; that is, $d(x, y) = |x - y|$. Properties
(1,2,3) are immediate. The triangle inequality $|x + y| \le |x| + |y|$ implies (4) if
we replace $x$ by $x - y$ and $y$ by $y - z$.
(2) The Euclidean metric $d$ (or $d_2$) on $\mathbb{R}^n$ is defined by
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$
where $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ (if $n = 1$, $d(x, y) = |x - y|$,
the metric defined in (1)). Properties (1,2,3) are easily verified. It remains to
verify the triangle inequality. If we let $(u, v)$ denote the inner (or dot) product
of vectors $u, v$ and $\|u\| = (u, u)^{1/2}$ denote the corresponding Euclidean norm,
then it suffices to show that $\|u + v\| \le \|u\| + \|v\|$ for all $u, v \in \mathbb{R}^n$. Squaring, it
is easy to see that this inequality follows from the Cauchy–Schwarz inequality
$|(u, v)| \le \|u\|\|v\|$. We prove the Cauchy–Schwarz inequality by observing that
the quadratic form
$$(xu + yv, xu + yv) = x^2 (u, u) + 2xy (u, v) + y^2 (v, v)$$
is non-negative for all $x, y \in \mathbb{R}$. The result follows since a quadratic form $Ax^2 + 2Bxy + Cy^2$ is non-negative for all $x, y \in \mathbb{R}$ iff $A, C, AC - B^2 \ge 0$.
(3) Let $X$ be any non-empty set. We define the discrete metric on $X$ by
$$d(x, y) = \begin{cases} 1, & \text{if } x \ne y, \\ 0, & \text{if } x = y. \end{cases}$$
We leave it to the exercises for the reader to verify that $d$ satisfies the conditions
for a metric.
(4) Let $C^0([a, b])$ denote the vector space of real-valued continuous functions on the
closed and bounded interval $[a, b]$. Define the uniform metric $\rho$ on $C^0([a, b])$ by
$$\rho(f, g) = \sup \{ |f(x) - g(x)| \mid x \in [a, b] \}.$$
Obviously $\rho(f, g) \ge 0$ for all $f, g \in C^0([a, b])$. If $\rho(f, g) = 0$, then $|f(x) - g(x)| = 0$ for all $x \in [a, b]$ and so $f = g$, proving (2). Since $|f(x) - g(x)| = |g(x) - f(x)|$, we obviously have $\rho(f, g) = \rho(g, f)$. Finally, if $f, g, h \in C^0([a, b])$
and $x \in [a, b]$, we have by the triangle inequality
$$|f(x) - g(x)| + |g(x) - h(x)| \ge |f(x) - h(x)|.$$
Since $|f(x) - g(x)| \le \rho(f, g)$, $|g(x) - h(x)| \le \rho(g, h)$ for all $x \in [a, b]$ (by
definition of the supremum), we have $\rho(f, g) + \rho(g, h) \ge |f(x) - h(x)|$. Since
this estimate holds for all $x \in [a, b]$, $\rho(f, g) + \rho(g, h)$ is an upper bound
for $\{ |f(x) - h(x)| \mid x \in [a, b] \}$ and so $\rho(f, g) + \rho(g, h) \ge \rho(f, h)$. Not
surprisingly, the uniform metric is particularly well adapted for the study of
uniform convergence. In our final example (6), we define another metric on
$C^0([a, b])$ that is particularly appropriate for the study of Fourier series.
(5) Let $B([a, b])$ denote the vector space of real-valued bounded functions on the
interval $[a, b]$. We define the uniform metric $\rho$ on $B([a, b])$ exactly as we did for
$C^0([a, b])$ and the proof that $\rho$ defines a metric on $B([a, b])$ is unchanged. Note
that we can replace $[a, b]$ by any non-empty subset of $\mathbb{R}$.
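(6) Recall from Sect. 5.6 that the $L^2$-metric on $C^0([a, b])$ is defined by
$$\rho_2(f, g) = \left( \int_a^b |f(x) - g(x)|^2 \, dx \right)^{1/2}, \quad f, g \in C^0([a, b]).$$
It follows from the results of Sect. 5.6, notably Lemma 5.6.4, that $\rho_2$ is a metric.
With $\rho_2$ defined as above we also have $\rho_2(f, g) \le \sqrt{b - a}\, \rho(f, g)$ for all $f, g \in C^0([a, b])$, since $|f(x) - g(x)| \le \rho(f, g)$ for every $x \in [a, b]$.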
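A small numerical sketch may help compare the uniform metric of example (4) with the $L^2$-metric of example (6). The code below approximates both distances for $f(x) = x$ and $g(x) = x^2$ on $[0, 1]$. The grid size, the use of the midpoint rule and the function names are assumptions of this illustration, and the computed values are approximations rather than exact distances.

```python
import math


def uniform_distance(f, g, a, b, n=10_000):
    """Approximate sup |f - g| on [a, b] by the maximum over n + 1 grid points."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return max(abs(f(x) - g(x)) for x in xs)


def l2_distance(f, g, a, b, n=10_000):
    """Approximate (integral of |f - g|^2 over [a, b])^(1/2) by the midpoint rule."""
    h = (b - a) / n
    total = sum((f(a + (k + 0.5) * h) - g(a + (k + 0.5) * h)) ** 2 for k in range(n))
    return math.sqrt(total * h)


def f(x):
    return x


def g(x):
    return x * x


rho = uniform_distance(f, g, 0.0, 1.0)  # exact value is 1/4, attained at x = 1/2
rho2 = l2_distance(f, g, 0.0, 1.0)      # exact value is sqrt(1/30)
print(rho, rho2)
```

Here $b - a = 1$, so the two numbers illustrate that the $L^2$-distance does not exceed the uniform distance on an interval of length one.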
The following variant of the triangle inequality is very useful.
Lemma 7.1.4 Let $(X, d)$ be a metric space. Then for all $x, y, z \in X$ we have
$$d(x, z) \ge |d(x, y) - d(z, y)|.$$
Proof By the triangle inequality, we have $d(x, z) + d(z, y) \ge d(x, y)$ and so
$$d(x, z) \ge d(x, y) - d(z, y).$$
Interchanging $x$ and $z$ in this inequality we obtain
$$d(x, z) = d(z, x) \ge d(z, y) - d(x, y).$$
Hence $d(x, z) \ge |d(x, y) - d(z, y)|$. □
Definition 7.1.5 Let $Y$ be a non-empty subset of the metric space $(X, d)$. The
induced metric $d_Y$ on $Y$ is defined by
$$d_Y(y_1, y_2) = d(y_1, y_2), \quad y_1, y_2 \in Y.$$
Lemma 7.1.6 The induced metric is a metric on $Y$.
Proof Immediate, since $d_Y$ inherits all the properties of $d$. □
Fig. 7.1 Metrics on the unit circle (labels: points $x$, $y$; lengths $a$, $\ell$)
Remark 7.1.7 If we take the induced metric on a subset $Y$ of $(X, d)$, we often drop
the subscript $Y$ and generally refer to $Y$ as a subspace (rather than subset) of $X$.
Examples 7.1.8
(1) The metric induced on the subset $\mathbb{Z}$ of $\mathbb{R}$ by the standard metric is given by
$d_{\mathbb{Z}}(m, n) = |m - n|$, $m, n \in \mathbb{Z}$.
(2) Take the standard Euclidean metric $d$ on $\mathbb{R}^2$ and let $S^1$ denote the unit circle,
centre the origin, in $\mathbb{R}^2$. Referring to Fig. 7.1, $d_{S^1}(x, y)$ equals the length of the
chord $xy$. The induced metric $d_{S^1}$ is obviously different from the more natural
metric on $S^1$ defined by arc length (referring to Fig. 7.1, $\ell < a$ if $x \ne y$).
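The difference between the two metrics on the unit circle can be seen numerically. The sketch below computes the chord length (the induced Euclidean metric) and the arc length for points given by their angles; the function names `chord` and `arc` and the sample angles are assumptions made for this illustration.

```python
import math


def chord(theta, phi):
    """Euclidean distance between the points of the unit circle at angles theta, phi."""
    return math.hypot(math.cos(theta) - math.cos(phi), math.sin(theta) - math.sin(phi))


def arc(theta, phi):
    """Length of the shorter arc of the unit circle joining the two points."""
    gap = abs(theta - phi) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)


for phi in (0.1, 1.0, 2.0, math.pi):
    print(f"angle {phi:.3f}: chord {chord(0.0, phi):.6f}, arc {arc(0.0, phi):.6f}")
```

For each pair of distinct points the chord is strictly shorter than the arc; for antipodal points the chord has length 2 while the arc has length $\pi$.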
EXERCISES 7.1.9
(1) Verify that the discrete metric (Examples 7.1.3(3)) is a metric.
(2) Regard the unit circle $S^1 \subset \mathbb{R}^2$ as parametrized by angle $\theta \in [0, 2\pi)$. Define
$d(\theta, \phi) = \min\{|\theta - \phi|, 2\pi - |\theta - \phi|\}$ (arc length) and verify that $d$ defines a
metric on $S^1$.
(3) Let $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^2$ and define
$$d_1(x, y) = |x_1 - y_1| + |x_2 - y_2|,$$
$$d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\},$$
$$d_{1/2}(x, y) = \left( \sqrt{|x_1 - y_1|} + \sqrt{|x_2 - y_2|} \right)^2.$$
Show that $d_1, d_\infty$ define metrics on $\mathbb{R}^2$ but that $d_{1/2}$ is not a metric on $\mathbb{R}^2$. (For
$d_{1/2}$ you will need to find a triple of points in $\mathbb{R}^2$ for which the triangle inequality
fails.)
(4) Define the appropriate extensions of $d_1$ and $d_\infty$ to $\mathbb{R}^n$, $n > 2$, and verify that
they are metrics on $\mathbb{R}^n$.
(5) (Product metric.) Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces. Let $X \times Y = \{(x, y) \mid x \in X, y \in Y\}$. Define the product metric $d$ on $X \times Y$ by
$d((x_1, y_1), (x_2, y_2)) = \max\{d_X(x_1, x_2), d_Y(y_1, y_2)\}$. Verify that $d$ is a metric
on $X \times Y$.
(6) Let $X$ be a set and suppose that $d(x, y)$ satisfies $d(x, y) = 0$ iff $x = y$, and
$d(x, z) \le d(x, y) + d(z, y)$, for all $x, y, z \in X$. Show that $d$ is a metric on $X$ (that
is, $d$ satisfies (1–4) of Definition 7.1.1).
(7) Let $(X, d)$ be a metric space and suppose that $x_1, \dots, x_n \in X$, $n \ge 3$. Show that
$d(x_1, x_n) \le \sum_{i=1}^{n-1} d(x_i, x_{i+1})$.
(8) Let $X$ be a non-empty set and $B(X, \mathbb{R})$ denote the set of all bounded functions
$f : X \to \mathbb{R}$. That is, $f \in B(X, \mathbb{R})$ iff there exists an $M \ge 0$ such that $|f(x)| \le M$
for all $x \in X$. Show that
(a) $B(X, \mathbb{R})$ is a vector space (if $f, g \in B(X, \mathbb{R})$, $\lambda \in \mathbb{R}$, then $f + \lambda g \in B(X, \mathbb{R})$).
(b) $\rho(f, g) = \sup_{x \in X} |f(x) - g(x)| < \infty$ for all $f, g \in B(X, \mathbb{R})$.
(c) $\rho$ defines a metric on $B(X, \mathbb{R})$.
(Notes: $X$ is a general set and $f : X \to \mathbb{R}$ is not assumed to be continuous. $\rho$ is
called the uniform metric on $B(X, \mathbb{R})$.)
(9) Let $(X, d)$ be a metric space and let $D = \sup_{x, y \in X} d(x, y)$. We refer to $D$ as the
diameter of $X$. Show by means of examples that we can have (a) $D < \infty$, (b)
$D = \infty$. Show that if we define $\rho(x, y) = \min\{1, d(x, y)\}$, then $\rho$ defines a
metric on $X$ and that the diameter of $X$ with respect to the metric $\rho$ is at
most one.
(10) Let $\langle \cdot, \cdot \rangle$ be an inner product on the vector space $V$. Define $\|X\| = \sqrt{\langle X, X \rangle}$,
$X \in V$. Show that the Cauchy–Schwarz inequality holds:
$$|\langle X, Y \rangle| \le \|X\| \|Y\|, \quad \text{for all } X, Y \in V.$$
(Use the method of Examples 7.1.3(2).) Deduce that if we define $d(X, Y) = \|X - Y\|$, then $(V, d)$ is a metric space.
(11) Let $X$ be a set. Metrics $d, \rho$ on $X$ are said to be equivalent if there exist
constants $C, c > 0$ such that for all $x, y \in X$, $c\, d(x, y) \le \rho(x, y) \le C\, d(x, y)$.
(a) Show that if $d, \rho$ are equivalent, then we can find constants $c', C' > 0$ such
that for all $x, y \in X$, $c' \rho(x, y) \le d(x, y) \le C' \rho(x, y)$.
(b) Verify that if $d, \rho$ are equivalent and $\rho, \sigma$ are equivalent, then $d, \sigma$ are
equivalent.
(c) Show that every metric on a finite set $X$ is equivalent to the discrete metric
on $X$.
(d) Show that the induced metric on $\mathbb{Z}$ (Examples 7.1.8(1)) is not equivalent
to the discrete metric on $\mathbb{Z}$.
(e) Show that the induced metric on the unit circle $S^1$ (Examples 7.1.8(2)) is
equivalent to the metric defined by arc length (see exercise (2)).
(f) Show that the metrics $d_1$, $d_\infty$ on $\mathbb{R}^2$ are equivalent to the Euclidean metric
on $\mathbb{R}^2$. Generalize to $\mathbb{R}^n$.
(h) Show that the uniform and $L^2$ metrics on $C^0([a, b])$ are not equivalent.
(Hint: for $n \in \mathbb{N}$, construct $f_n \in C^0([a, b])$ such that $\rho(f_n, 0) = 1$,
$\rho_2(f_n, 0) = 1/n$. See also Sect. 5.6.)
7.2 Distance from a Subset
Suppose that $A$ is a non-empty subset of the metric space $(X, d)$. Given $x \in X$, it is
natural to define the distance $d(x, A)$ from $x$ to $A$. Roughly speaking, this should be
the shortest distance from $x$ to $A$. More formally, we define
$$d(x, A) = \inf_{a \in A} d(x, a).$$
Notice that $0 \le d(x, A) \le d(x, a)$ for all $a \in A$. As we shall see shortly, it may not
be possible to pick a point $a_0 \in A$ such that $d(x, A)$ is equal to $d(x, a_0)$. The main
properties of distance to a subset are given in the next proposition.
Proposition 7.2.1 Let $A$ be a non-empty subset of the metric space $(X, d)$.
(1) $d(x, A) \ge 0$ for all $x \in X$.
(2) If $x \in A$, then $d(x, A) = 0$.
(3) If $A \subset B \subset X$, then $d(x, A) \ge d(x, B)$ for all $x \in X$.
(4) If $x, x' \in X$, then
$$|d(x, A) - d(x', A)| \le d(x, x').$$
Proof Statements (1) and (2) are obvious. For statement (3), observe that for every
$a \in A$, we can choose $b = a \in B$ such that $d(x, a) = d(x, b)$. Hence $\inf_{b \in B} d(x, b) \le \inf_{a \in A} d(x, a)$. It remains to prove (4). Given $a \in A$, we have
$$d(x, x') + d(x', a) \ge d(x, a) \ge d(x, A).$$
Hence $d(x, A)$ is a lower bound for $d(x, x') + d(x', a)$, $a \in A$, and so
$$d(x, A) \le \inf_{a \in A} \left( d(x, x') + d(x', a) \right) = d(x, x') + d(x', A).$$
Therefore, $d(x, x') \ge d(x, A) - d(x', A)$. Interchanging $x$ and $x'$, we have $d(x, x') = d(x', x) \ge d(x', A) - d(x, A)$. Combining the two inequalities gives $|d(x, A) - d(x', A)| \le d(x, x')$. □
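Statements (3) and (4) of Proposition 7.2.1 are easy to test numerically when $A$ is a finite subset of the real line, where the infimum is a minimum. The sketch below does this for randomly chosen points; the function name `dist_to_set`, the sets `A` and `B` and the random sample are assumptions made for this illustration, and a passing check is of course not a proof.

```python
import random


def dist_to_set(x, A):
    """d(x, A) = inf over a in A of |x - a|; for a finite set the inf is a min."""
    if not A:
        return float("inf")  # the convention for the empty set, see Remarks 7.2.4(1)
    return min(abs(x - a) for a in A)


A = [0.0, 1.0, 4.0]
B = A + [2.5]  # A is a subset of B

random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-10, 10), random.uniform(-10, 10)
    # statement (3): enlarging the set can only decrease the distance
    assert dist_to_set(x, A) >= dist_to_set(x, B)
    # statement (4): the distance to A changes by at most d(x, y)
    assert abs(dist_to_set(x, A) - dist_to_set(y, A)) <= abs(x - y) + 1e-12

print(dist_to_set(2.0, A), dist_to_set(2.0, B))  # 1.0 0.5
```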
Remark 7.2.2 Proposition 7.2.1(4) generalizes Lemma 7.1.4.
Examples 7.2.3
(1) Let $X = \mathbb{R}$ (standard metric) and $A = (a, b)$, $0 < a < b < 1$. Observe that
$d(a, A) = 0$ but $a \notin A$ and so the converse of Proposition 7.2.1(2) is false.
(2) If $A \subsetneq B \subset X$, it is possible to have $d(x, A) = d(x, B)$ for all $x \in X$. For
example, take $A = (a, b)$, $B = [a, b]$, $X = \mathbb{R}$.
Remarks 7.2.4
(1) It is useful to extend the definition of $d(x, A)$ to allow for $A$ to be the empty set.
We define $d(x, \emptyset) = +\infty$, where the symbol $+\infty$ satisfies $+\infty > x$ for all
$x \in \mathbb{R}$. This convention is compatible with Proposition 7.2.1(1,2,3).
(2) Later, in Chap. 8, we define the distance between non-empty subsets $A$ and $B$
of a metric space. The distance we define will be a metric, though for this we
need to restrict the class of subsets we work with. The definition of the distance
between a point and a subset will suffice for our discussion of open and closed
subsets of a metric space, the topic of the next few sections.
EXERCISES 7.2.5
(1) Suppose that $\{A_i \mid i \in I\}$ is a family of subsets of $X$. Let $x \in X$ and define
$a_i = d(x, A_i)$, $i \in I$. Show that $d(x, \bigcup_{i \in I} A_i) = \inf_{i \in I} a_i$. Verify this result remains
true if we allow some of the sets $A_i$ to be empty. (See Remarks 7.2.4(1).)
(2) Verify that (1–3) of Proposition 7.2.1 are compatible with our definition of
$d(x, A)$ if $A = \emptyset$.
7.3 Open and Closed Subsets of a Metric Space: Intuition
Suppose that $A$ is a non-empty subset of the metric space $X$ (see Fig. 7.2).
Fig. 7.2 Subset $A$ of metric space $X$
We discuss some simple features of $A$ that relate to the metric structure. First
of all, the outside of $A$ is defined as the complement $X \setminus A$. It is natural to define
the boundary of $A$, usually denoted by $\partial A$, as the set of points in $X$ which are
at zero distance from both $A$ and $X \setminus A$. This leads naturally to the question of
characterizing the points of $A$ which do not lie on the boundary of $A$. It turns out
that this set, the interior of $A$, is an example of an open set. As we shall see, open
sets play a central role in the theory of metric spaces and our investigations will
start by giving a careful, but very simple and natural, definition of an open set. Later
we shall see that continuity can be formulated entirely in terms of open sets. We
might also consider all the points in $X$ which are at zero distance from $A$. This set
of points, called the closure of $A$, gives an example of a closed set. As we shall
see, a set $F \subset X$ is closed iff $X \setminus F$ is open. Closed sets are important because,
for example, a subset of $X$ is closed iff it can be represented as the zero set of a
continuous real-valued function $f$ on $X$ (that is, as the solution set of $f(x) = 0$). After we have established the basic properties
of open and closed sets, we then develop the theory of sequences and continuous
functions in metric spaces. Much of what we do is formally very similar to what we
have previously done on the real line. In many ways it will be simpler as we will
not get distracted by any extraneous structure of the real line (such as arithmetical
properties). Basically, we work with a set $X$ and metric $d$: a very simple yet, as we
shall see, very rich structure.
7.4 Open and Closed Sets
We start with the definition of an open set.
Definition 7.4.1 A subset $U$ of the metric space $X$ is open if
$$d(u, X \setminus U) > 0 \quad \text{for all } u \in U.$$
Examples 7.4.2
(1) An open interval $(a, b) \subset \mathbb{R}$ is an open set: if $x \in (a, b)$, then $d(x, (-\infty, a] \cup [b, \infty)) = \min\{x - a, b - x\} > 0$.
(2) If we give the set $X$ the discrete metric, then every subset of $X$ is open: if $Z \subsetneq X$,
we have $d(u, X \setminus Z) = 1 > 0$ for all $u \in Z$.
Lemma 7.4.3 The empty set $\emptyset$ and $X$ are open subsets of $X$.
Proof If we take the negation of the definition of an open set, we see that a subset
$Z$ is not open if there exists a $u \in Z$ such that $d(u, X \setminus Z) = 0$. Since $\emptyset$ contains
no points, it cannot satisfy the ‘not open’ condition and so $\emptyset$ must be open. The
set $X$ is an open subset of $X$ since $d(x, \emptyset) > 0$ by our convention on distance
(Remarks 7.2.4(1)). □
We can find more examples of open sets by generalizing the definition of an open
interval to a general metric space.
If $(X, d)$ is a metric space, $x_0 \in X$ and $r > 0$, we define
$$D_r(x_0) = \{x \in X \mid d(x, x_0) < r\}.$$
We call $D_r(x_0)$ the open disk or open ball of centre $x_0$ and radius $r$.
Fig. 7.3 Round (Euclidean) and square disks in $\mathbb{R}^2$. (a) A round Euclidean disk in $\mathbb{R}^2$, and (b) A
square disk in $\mathbb{R}^2$ for the $d_\infty$-metric
Examples 7.4.4
(1) If $X = \mathbb{R}$, standard metric, then $D_r(x_0) = (x_0 - r, x_0 + r)$ (the open interval,
centre $x_0$, length $2r$).
(2) If $X = \mathbb{R}^2$ and we take the Euclidean metric, then $D_r(x_0)$ is the Euclidean disk
of radius $r$ and centre $x_0$, see Fig. 7.3a. On the other hand, if we use the metric
$d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$, $D_r(x_0)$ will be the square centred at $x_0$
with side-length $2r$, see Fig. 7.3b.
(3) If $d$ is the discrete metric on the set $X$, then $D_r(x_0) = \{x_0\}$, if $r \le 1$, and
$D_r(x_0) = X$ if $r > 1$.
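The dependence of the open disk on the metric can be made concrete with a membership test. The sketch below compares the Euclidean, maximum and discrete metrics on $\mathbb{R}^2$ at a single test point; the function names and the choice of test point are assumptions made for this illustration.

```python
import math


def euclidean(x, y):
    """The Euclidean metric on R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def d_max(x, y):
    """The metric max{|x1 - y1|, |x2 - y2|} on R^2."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def discrete(x, y):
    """The discrete metric."""
    return 0 if x == y else 1


def in_open_disk(d, centre, r, x):
    """Membership test for the open disk {x : d(x, centre) < r}."""
    return d(x, centre) < r


centre = (0.0, 0.0)
corner = (0.9, 0.9)  # inside the square disk of radius 1, outside the round one

print(in_open_disk(euclidean, centre, 1.0, corner))  # False: sqrt(1.62) > 1
print(in_open_disk(d_max, centre, 1.0, corner))      # True: max{0.9, 0.9} < 1
print(in_open_disk(discrete, centre, 1.0, corner))   # False: the disk is {centre}
print(in_open_disk(discrete, centre, 1.5, corner))   # True: the disk is all of X
```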
Lemma 7.4.5 Let $X$ be a metric space. An open disk $D_r(x_0) \subset X$ is an open set.
Proof We may assume that $D_r(x_0) \ne X$. We have to show that for all $u \in D_r(x_0)$,
$d(u, X \setminus D_r(x_0)) > 0$. Let $d(u, x_0) = s < r$. For every $v \in X \setminus D_r(x_0)$ we have by
Lemma 7.1.4
$$d(u, v) \ge |d(u, x_0) - d(v, x_0)| \ge r - s > 0,$$
since $d(v, x_0) \ge r$. Hence $d(u, X \setminus D_r(x_0)) \ge r - s > 0$. □
We make frequent use of the next result.
Lemma 7.4.6 If $A \subset X$ and $r > 0$, then $D_r(a) \subset A$ iff $d(a, X \setminus A) \ge r$.
Proof Suppose that $D_r(a) \subset A$. If $v \in X \setminus A$, then, since $v \notin D_r(a)$, $d(a, v) \ge r$
and so $d(a, X \setminus A) \ge r$. We prove the converse by contradiction. Suppose that
$d(a, X \setminus A) \ge r > 0$. If $D_r(a) \not\subset A$, there exists a $u \in D_r(a) \cap (X \setminus A)$. Therefore
$d(a, X \setminus A) \le d(a, u) < r$, contradicting our assumption that $d(a, X \setminus A) \ge r$. □
Proposition 7.4.7 Let $(X, d)$ be a metric space. A subset $U$ of $X$ is open iff for every
$u \in U$, there exists an $r = r(u) > 0$ such that $D_r(u) \subset U$.
Proof Suppose first that $U$ is open. Let $u \in U$ and set $d(u, X \setminus U) = r(u) = r > 0$.
By Lemma 7.4.6, $D_r(u) \subset U$. Conversely, suppose that $u \in U$ and there exists an
$r = r(u) > 0$ such that $D_r(u) \subset U$. By Lemma 7.4.6, $d(u, X \setminus U) \ge r > 0$. □
Remark 7.4.8 It is common in the literature to define an open subset of a metric
space $(X, d)$ by requiring that for every $u \in U$, there exists an $r = r(u) > 0$
such that $D_r(u) \subset U$. We prefer our definition because it (a) is simpler and more
natural, and (b) uses only one quantifier rather than the two demanded by the disk
definition. However, notice that the disk definition of an open set automatically gives
$X$ as an open subset of $X$. In particular, we do not need the distance convention in
Remarks 7.2.4(1). We give a number of exercises at the end of the section that use
the disk definition.
Theorem 7.4.9 Let $\mathcal{U}$ denote the set of all open subsets of the metric space $X$. We
have
(1) $X \in \mathcal{U}$.
(2) $\emptyset \in \mathcal{U}$.
(3) If $\{U_i \mid i \in I\}$ is a family of open subsets of $X$ (not necessarily countable), then
$\bigcup_{i \in I} U_i \in \mathcal{U}$. (Arbitrary unions of open sets are open.)
(4) If $U_1, \dots, U_n \in \mathcal{U}$, then $\bigcap_{i=1}^{n} U_i \in \mathcal{U}$. (Finite intersections of open sets are
open.)
Proof (1,2) are just Lemma 7.4.3. (3) We may assume that at least one of the sets $U_i$
is non-empty (else the result follows from (2)). It suffices to show that if $u \in \bigcup_{i \in I} U_i$,
then $d(u, X \setminus \bigcup_{i \in I} U_i) > 0$. So suppose $u \in U_{i_0}$. We have $X \setminus \bigcup_{i \in I} U_i \subset X \setminus U_{i_0}$ and
so by Proposition 7.2.1(3)
$$d(u, X \setminus \bigcup_{i \in I} U_i) \ge d(u, X \setminus U_{i_0}).$$
Since $u \in U_{i_0}$ and $U_{i_0}$ is open, $d(u, X \setminus U_{i_0}) > 0$. (4) Assume $\bigcap_{i=1}^{n} U_i \ne \emptyset$. If
$u \in \bigcap_{i=1}^{n} U_i$, then $u \in U_i$, $i \in \{1, \dots, n\}$, and so $d_i = d(u, X \setminus U_i) > 0$, $i \in \{1, \dots, n\}$.
If we set $d = \min_i \{d_i\}$, then $d > 0$ and, using Exercises 7.2.5(1),
$$d(u, X \setminus \bigcap_{i=1}^{n} U_i) = d(u, \bigcup_{i=1}^{n} (X \setminus U_i)) = \min_i \{d(u, X \setminus U_i)\} = d > 0.$$
Hence $\bigcap_{i=1}^{n} U_i$ is open. □
Remark 7.4.10 Given a set $X$, a collection $\mathcal{U}$ of subsets of $X$ satisfying (1–4) of
Theorem 7.4.9 is said to define a topology on $X$. Members of $\mathcal{U}$ are called open
sets and $X$, together with the topology $\mathcal{U}$, is called a topological space. For a metric
space $X$, we call the associated topology the metric topology of $X$.
Examples 7.4.11
(1) Infinite intersections of open sets will generally not be open. As a simple
example, take $U_i = (-1/i, 1 + 1/i) \subset \mathbb{R}$, $i \ge 1$. Then $\bigcap_{i=1}^{\infty} U_i = [0, 1]$.
(2) Let the set $X$ be given the discrete metric. Then the topology of $X$ consists of all
subsets of $X$. This topology is the largest topology one can define on $X$.
(3) Given a non-empty set $X$, define $\mathcal{U} = \{\emptyset, X\}$. Then $\mathcal{U}$ is a topology on $X$ and is
the smallest topology one can define on a set $X$. If $X$ has more than one element,
this topology cannot be defined by a metric on $X$. (For more examples, see the
exercises at the end of the section.)
Proposition 7.4.12 Let $X$ be a metric space. Every non-empty open subset of $X$ can
be written as a union of open disks.
Proof Let $U$ be a non-empty open subset of $X$. Given $x \in U$, we may choose $r(x) > 0$ so that $D_{r(x)}(x) \subset U$ (Proposition 7.4.7). We have $\bigcup_{x \in U} D_{r(x)}(x) = U$. □
Remark 7.4.13 Every open subset of $(\mathbb{R}, |\cdot|)$ is a countable union of disjoint open
intervals. We leave the proof to the exercises at the end of the section. Later in the
chapter we extend the countable union part of this result to a large and important
class of metric spaces.
For the remainder of this section, we consider closed subsets of a metric space.
Again, the definition is most simply given in terms of the distance function to a
subset.
Definition 7.4.14 A subset $F$ of the metric space $X$ is closed if $d(u, F) = 0$ implies
$u \in F$.
Examples 7.4.15
(1) The sets $\emptyset$ and $X$ are closed subsets of $X$. Indeed, $\emptyset$ is closed since $d(u, \emptyset)$ is
never zero (distance convention, Remarks 7.2.4(1)). On the other hand, $X$ is
closed since $d(x, X) = 0$ for all $x \in X$.
(2) A closed interval $[a, b] \subset \mathbb{R}$ is a closed set: if $x \notin [a, b]$, then $d(x, [a, b]) = \max\{\max\{0, a - x\}, \max\{0, x - b\}\} > 0$.
(3) If $x \in X$, then $\{x\}$ is a closed subset of $X$: $d(y, \{x\}) = d(x, y) = 0$ iff $x = y$.
(4) If we give the set $X$ the discrete metric, then every non-empty subset $Z$ of $X$ is closed:
$d(u, Z) = 1$ iff $u \notin Z$.
Proposition 7.4.16 A subset $F$ of the metric space $X$ is closed iff $X \setminus F$ is open.
Proof Suppose $F$ is a closed subset of $X$. If $u \in X \setminus F$, then $d(u, X \setminus (X \setminus F)) = d(u, F) > 0$, since otherwise $d(u, F) = 0$ and so $u \in F$, because $F$ is
closed, contradicting $u \in X \setminus F$. For the converse, reverse the previous argument. □
Remark 7.4.17 Proposition 7.4.16 does not say that a subset of $X$ must be either
open or closed. Although every subset of a metric space with the discrete metric is
open and closed, it is usually the case that most subsets of a metric space are neither
open nor closed. For a simple example of a subset $A$ of $\mathbb{R}$ which is neither open nor
closed, take $A = (0, 1]$.
Theorem 7.4.18 Let $\mathcal{F}$ denote the set of all closed subsets of the metric space $X$.
Then
(1) $X \in \mathcal{F}$.
(2) $\emptyset \in \mathcal{F}$.
(3) If $\{F_i \mid i \in I\}$ is a family of closed subsets of $X$, then $\bigcap_{i \in I} F_i \in \mathcal{F}$ (an arbitrary
intersection of closed sets is closed).
(4) If $F_1, \dots, F_n \in \mathcal{F}$, then $\bigcup_{i=1}^{n} F_i \in \mathcal{F}$ (a finite union of closed sets is closed).
Proof We can prove this in two ways. Either use Theorem 7.4.9 and Proposition 7.4.16 or work directly from the definition. We use the direct approach
and prove (3) and leave the remaining cases to the exercises. Suppose then that
$d(x, \bigcap_{i \in I} F_i) = 0$. Since $\bigcap_{i \in I} F_i \subset F_j$ for all $j \in I$ we have, by Proposition 7.2.1(3),
$d(x, F_j) \le d(x, \bigcap_{i \in I} F_i) = 0$ for all $j \in I$. Hence $d(x, F_j) = 0$ and so $x \in F_j$ since $F_j$
is closed. Since this holds for all $j \in I$, $x \in \bigcap_{i \in I} F_i$. □
Examples 7.4.19
(1) An infinite union of closed sets need not be closed. For example, $\bigcup_{i \ge 2} [\frac{1}{i}, 1 - \frac{1}{i}] = (0, 1)$, which is not closed (since $d(0, (0, 1)) = 0$).
(2) Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and set $F = f^{-1}(0) = \{x \in \mathbb{R} \mid f(x) = 0\}$ ($F$ is
the solution set of $f(x) = 0$). Then $F$ is a closed set. Suppose that $d(x, F) = 0$.
Choose a sequence $(x_n) \subset F$ converging to $x$. We have $\lim_{n \to \infty} d(x_n, x) = 0$.
By the sequential continuity of $f$, we have $0 = \lim_{n \to \infty} f(x_n) = f(x)$. Therefore
$x \in F$. Alternatively, we can use Proposition 7.4.16 and prove that $\mathbb{R} \setminus F$ is open.
To do this, observe that $\mathbb{R} \setminus F = \{x \in \mathbb{R} \mid f(x) \ne 0\}$. Let $z \in \mathbb{R} \setminus F$ and set
$r = |f(z)| \ne 0$. By the continuity of $f$, and therefore $|f|$, there exists a $\delta > 0$
such that $|f(x)| > r/2$ for all $x \in (z - \delta, z + \delta)$. Hence $(z - \delta, z + \delta) \subset \mathbb{R} \setminus F$
and so $\mathbb{R} \setminus F$ must be open and $F$ closed. As we shall see later, this result holds
in great generality. Moreover, $F$ is closed iff $F$ is the zero set of a continuous
function: closed sets are precisely the zero sets of continuous functions.
Let $\bar{D}_r(x_0)$ denote the closed disk of radius $r > 0$, centre $x_0$, in the metric space
$(X, d)$. That is
$$\bar{D}_r(x_0) = \{x \in X \mid d(x_0, x) \le r\}.$$
Lemma 7.4.20 A closed disk is closed.
Proof If $x \notin \bar{D}_r(x_0)$, then $d(x, x_0) > r$. By Lemma 7.1.4, $d(x, y) \ge d(x, x_0) - d(y, x_0) \ge d(x, x_0) - r$ for every $y \in \bar{D}_r(x_0)$. Hence $d(x, \bar{D}_r(x_0)) \ge d(x, x_0) - r > 0$. □
Example 7.4.21 If $D_r(x) = \bar{D}_r(x)$ then $D_r(x)$ is open and closed. As a simple
example where this can happen, let $Y$ be the metric space defined to be the union of
the open intervals $(1, 2)$ and $(3, 4)$. Take the induced metric on $Y$ (that is, the metric
induced on $Y$ by the standard metric on $\mathbb{R}$). Both $(1, 2)$, $(3, 4)$ are open subsets of $Y$
and therefore, taking complements (in $Y$!), are also closed subsets of $Y$. Of course,
it is easy to check directly that if $(a, b)$ is either of these intervals and $x \in Y$, then $d_Y(x, (a, b)) = 0$ iff $x \in (a, b)$.
We give a simple proposition that relates open and closed sets. This result is quite
useful when we look at open and closed sets in the induced metric on a subset.
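Proposition 7.4.22 Let $U \subset X$ be open and $F \subset X$ be closed. Then
(1) $U \setminus F$ is open.
(2) $F \setminus U$ is closed.
Proof Let $A, B$ be subsets of $X$. We claim that $X \setminus (A \setminus B) = (X \setminus A) \cup B$. Indeed,
$x \in X \setminus (A \setminus B)$ iff $x \notin A$ or $x \in B$. But $x \notin A$ or $x \in B$ iff $x \in (X \setminus A) \cup B$. Apply
this result with $A = U$, $B = F$ to obtain (1): the complement $(X \setminus U) \cup F$ of $U \setminus F$ is a union of two closed sets. Interchange $U$ and $F$ to get (2). □
Example 7.4.23 Take $X = \mathbb{R}$. Then $(a, b) \setminus [c, d]$ is always an open set
(possibly empty) and $[a, b] \setminus (c, d)$ is always a closed set (possibly empty). Note that these sets need not be intervals: for example, $(0, 3) \setminus [1, 2] = (0, 1) \cup (2, 3)$.
We conclude with the definition of an isolated point.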
Definition 7.4.24 A point $x$ in the metric space $X$ is isolated if $d(x, X \setminus \{x\}) > 0$.
Lemma 7.4.25 The following conditions are equivalent.
(1) The point $x$ is an isolated point of $X$.
(2) $\{x\}$ is an open subset of $X$.
(3) $\{x\}$ is an open and closed subset of $X$.
(4) $X \setminus \{x\}$ is closed.
Proof (1) $\iff$ (2) by the definition of open set. (2) $\iff$ (3) since $\{x\}$ is always a
closed subset. (3) $\iff$ (4) by Proposition 7.4.16. □
Examples 7.4.26
(1) The metric space $(\mathbb{R}, |\cdot|)$ contains no isolated points.
(2) The metric space $(\mathbb{Z}, |\cdot|)$ consists of isolated points.
(3) If we give the set $X$ the discrete metric, then $X$ consists of isolated points.
EXERCISES 7.4.27
(1) Describe (draw a figure) the open disk, centre $x_0$, radius $r$, for the metric $d_1$ on
$\mathbb{R}^2$ (see Exercises 7.1.9(3) for the definition of $d_1$).
(2) Let $U$ be a non-empty open subset of $\mathbb{R}$. Show that $U$ can be written as a
countable disjoint union of open intervals. (Hint: if $x \in U$, let $I_x$ denote the
union of all open intervals $I \subset U$ which contain $x$. Verify that $I_x$ is an open
interval.)
(3) Prove Theorem 7.4.9 using the disk definition of open set.
(4) Prove Proposition 7.4.16 using the disk definition of open set.
(5) Complete the proof of Theorem 7.4.18.
(6) Suppose that $h_1$ and $h_2$ are equivalent metrics on $X$ (see Exercises 7.1.9(11)
for the definition of equivalent metric). Show that $(X, h_1)$, $(X, h_2)$ have the
same open sets. By looking at $\mathbb{Z}$ with the discrete metric and metric induced
from $(\mathbb{R}, |\cdot|)$, show that the converse is false: having the same open sets does not imply
that the metrics are equivalent.
(7) Show that $x_0$ is an isolated point of $X$ iff $\{x_0\}$ and $X \setminus \{x_0\}$ are both open and
closed.
(8) Show that the diagonal $\Delta(X) = \{(x, x) \mid x \in X\}$ is a closed subset of $X^2$ if we
take the product metric on $X^2 = X \times X$ (Exercises 7.1.9(5)).
(9) Let $(X, d)$ be a metric space. Suppose that arbitrary intersections of open
subsets of $X$ are open. Show that every point of $X$ is isolated (so $X$ has the
topology given by the discrete metric).
(10) Let $\mathcal{U}$ consist of the empty set together with all subsets of $\mathbb{R}$ which are of the
form $\mathbb{R} \setminus F$, where $F$ is a finite subset of $\mathbb{R}$. Verify that $\mathcal{U}$ defines a topology on
$\mathbb{R}$ (this topology is known as the Zariski topology on $\mathbb{R}$ and is used in algebraic
geometry for the study of zero sets of polynomials).
7.5 Interior and Closure
In this section we show that there is a natural way of associating an open set and a
closed set to every subset of a metric space.
Definition 7.5.1 Let $(X, d)$ be a metric space with topology of open sets $\mathcal{U}$ and
closed sets $\mathcal{F}$. Let $A \subset X$.
(1) The interior $A^\circ$ of $A$ is the largest open subset of $A$:
$$A^\circ = \bigcup_{U \in \mathcal{U},\, U \subset A} U.$$
(2) The closure $\bar{A}$ of $A$ is the smallest closed superset of $A$:
$$\bar{A} = \bigcap_{F \in \mathcal{F},\, F \supset A} F.$$
Remarks 7.5.2
(1) Since a union of open sets is open, by Theorem 7.4.9(3), $\bigcup_{U \in \mathcal{U},\, U \subset A} U$ is an
open subset of $A$. Since the union contains all open subsets of $A$, it is the
largest open subset of $A$. Similarly, using Theorem 7.4.18(3), $\bigcap_{F \in \mathcal{F},\, F \supset A} F$ is the
smallest closed set containing $A$.
(2) Observe that the definition of interior and closure only uses properties of
the topology and does not directly use the metric structure on $X$. Hence the
definition extends to general topological spaces.
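Examples 7.5.3
(1) Take $X = \mathbb{R}$ and suppose $-\infty < c < d < +\infty$. We have $[c, d]^\circ = (c, d)$ and
$\overline{(c, d)} = [c, d]$.
(2) For all metric spaces $(X, d)$ we have $X^\circ = X$, $\emptyset^\circ = \emptyset$, $\bar{X} = X$, $\bar{\emptyset} = \emptyset$.
(3) If $X$ has the discrete metric, then $A^\circ = \bar{A} = A$ for all subsets $A$ of $X$.
Proposition 7.5.4 If $A \subset X$, then
(a) $A^\circ \subset A \subset \bar{A}$.
(b) $A$ is open iff $A = A^\circ$.
(c) $A$ is closed iff $A = \bar{A}$.
(d) If $A \subset B \subset X$, then $A^\circ \subset B^\circ$ and $\bar{A} \subset \bar{B}$.
Proof (a) is immediate by the definition of interior and closure. (b,c) If $A$ is open,
then $A^\circ \supset A$, since $A^\circ$ contains every open subset of $A$ and in particular the open set $A$; together with (a) this gives $A = A^\circ$. Conversely, if $A = A^\circ$, then $A$ is open since $A^\circ$ is open. The proof of (c) is similar. Finally, if
$A \subset B$, then every open subset of $A$ is an open subset of $B$. Hence $A^\circ \subset B^\circ$. The proof
that $\bar{A} \subset \bar{B}$ is similar. □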
The next result characterizes the interior and closure of a set using metric
properties.
Proposition 7.5.5 Let $A \subset X$. Then
(1) $A^\circ = \{x \in A \mid d(x, X \setminus A) > 0\}$.
(2) $\bar{A} = \{x \in X \mid d(x, A) = 0\}$.
Proof We give the proof of (1); the proof of (2) is similar. Suppose first that $a \in A$
and $d(a, X \setminus A) = r > 0$. By Lemma 7.4.6, $D_r(a) \subset A$. But $D_r(a)$ is an open subset of $A$
and so, by definition of the interior, $D_r(a) \subset A^\circ$. Hence $a \in A^\circ$. Conversely, suppose
that $a \in A^\circ$. Since $A^\circ$ is open, there exists an $r > 0$ such that $D_r(a) \subset A^\circ \subset A$. Hence,
again by Lemma 7.4.6, $d(a, X \setminus A) \ge r > 0$. □
Remark 7.5.6 We say $a \in A$ is an interior point of $A$ if $d(a, X \setminus A) > 0$ and that
$x \in X$ is a closure point of $A$ if $d(x, A) = 0$. Proposition 7.5.5 shows that the interior
(respectively, closure) of $A$ is the set of all interior (respectively, closure) points
of $A$.
We may easily give a characterization of the interior and closure of a subset in
terms of open disks.
Lemma 7.5.7 Let $A$ be a subset of $X$.
(1) $x \in A^\circ$ iff there exists an $r > 0$ such that $D_r(x) \subset A$.
(2) $x \in \bar{A}$ iff $D_r(x) \cap A \ne \emptyset$ for all $r > 0$.
Proof Left to the exercises. □
EXERCISES 7.5.8
(1) Take the standard metric on $\mathbb{R}$. Find (a) $\mathbb{Q}^\circ$, (b) $\bar{\mathbb{Q}}$. How would your answer
change if we took the discrete metric on $\mathbb{R}$?
(2) Let $D_r(x)$ be the open $r$-disk in $\mathbb{R}^2$, Euclidean metric. Show that $\overline{D_r(x)} = \bar{D}_r(x)$,
for all $r > 0$, $x \in \mathbb{R}^2$ (here $\overline{D_r(x)}$ denotes the closure of the open disk and $\bar{D}_r(x)$ the closed disk). Find an example of a metric space $(X, d)$ for which
$\overline{D_r(x)} \subsetneq \bar{D}_r(x)$. Show that we always have $\overline{D_r(x)} \subset \bar{D}_r(x)$. Similarly, investigate
the relation between $D_r(x)$ and the interior of $\bar{D}_r(x)$ and show that, in general,
the interior of a closed disk of radius $r$ is not equal to the open disk of radius $r$.
(3) If $f \in C^0([a, b])$ and $F = \{g \in C^0([a, b]) \mid \rho(f, g) < r\}$, show that $\bar{F} = \{g \in C^0([a, b]) \mid \rho(f, g) \le r\}$.
(4) Provide the proof of Proposition 7.5.5(2).
(5) (Proof of Lemma 7.5.7.) Let $A \subset X$. Show that
(a) $x \in A^\circ$ iff there exists an $r > 0$ such that $D_r(x) \subset A$.
(b) $x \in \bar{A}$ iff for all $r > 0$, $D_r(x) \cap A \ne \emptyset$.
((a) and (b) are commonly used to define the interior and closure of a set.)
(6) If $E_1, \dots, E_n$ is a finite collection of subsets of the metric space $X$, show that
the interior of $\bigcap_{i=1}^{n} E_i$ equals $\bigcap_{i=1}^{n} E_i^\circ$. Show that the result is false if we allow
arbitrary intersections. What, if anything, can be said relating the interior of a
union of sets to the union of the interiors?
7.6 Open and Closed Subsets of a Subspace
Let $Y$ be a non-empty subset of the metric space $(X, d)$ and let $d_Y$ denote the induced
metric on $Y$ (see Definition 7.1.5). There is a simple relationship between the open
and closed sets of $(Y, d_Y)$ and $(X, d)$.
Proposition 7.6.1 Let $Y$ be a subset of the metric space $(X, d)$.
(1) A subset $U$ of $Y$ is open in $(Y, d_Y)$ iff there exists an open set $V$ of $(X, d)$ such
that $U = Y \cap V$.
(2) A subset $F$ of $Y$ is closed in $(Y, d_Y)$ iff there exists a closed set $Z$ of $(X, d)$ such
that $F = Y \cap Z$.
In particular, if we denote the topology of $(X, d)$ by $\mathcal{U}$ and that of $(Y, d_Y)$ by $\mathcal{U}^Y$, we
have $\mathcal{U}^Y = \mathcal{U} \cap Y \stackrel{\text{def}}{=} \{U \cap Y \mid U \in \mathcal{U}\}$.
Proof Suppose first that $U$ is an open subset of $X$. We show $U \cap Y$ is an open subset
of $Y$ (relative to the induced metric). Let $y \in U \cap Y$. Since $U$ is an open subset of $X$,
we have $d(y, X \setminus U) > 0$. But $Y \setminus U \subset X \setminus U$ and so $d(y, Y \setminus U) \ge d(y, X \setminus U) > 0$.
Hence $U \cap Y$ is an open subset of $Y$. If $F \subset X$ is closed, then $Y \cap (X \setminus F)$ is an
open subset of $Y$. But $Y \cap F = Y \setminus (X \setminus F)$ and so $Y \cap F$ is a closed subset of $Y$.
Now suppose that $F$ is a closed subset of $Y$ (induced metric) and let $\bar{F}$ denote the
closure of $F$ in $(X, d)$. We have $F = Y \cap \bar{F}$ since $\bar{F} = \{x \in X \mid d(x, F) = 0\}$ and
so $Y \cap \bar{F} = \{x \in Y \mid d(x, F) = 0\} = F$, completing the proof of (2). Finally, the
converse to (1) follows from (2) by taking complements. That is, if $U \subset Y$ is open
then $U = Y \cap (X \setminus \overline{Y \setminus U})$ (closure taken in $X$). □
EXERCISES 7.6.2
(1) Provide an alternative proof of Proposition 7.6.1(1) that uses the disk definition
of open set together with the result that a union of open disks is open.
(2) Suppose that $Y \subset X$ is an open set (relative to the metric $d$ on $X$). Show that
$U \subset Y$ is open (in the induced metric) iff $U$ is an open subset of $X$ and that
$F \subset Y$ is closed iff there exists an open subset $W$ of $X$ such that $F = Y \setminus W$
(note Proposition 7.4.16). Formulate and prove the corresponding results when
$Y$ is a closed subset of $X$.
7.7 Dense Subsets and the Boundary of a Set
In this section we give some useful definitions based on closure.
7.7.1 Dense Subsets and Separable Metric Spaces
Definition 7.7.1 A subset $A$ of the metric space $X$ is dense in $X$ if $\bar{A} = X$.
Lemma 7.7.2 Let $A$ be a subset of $X$. The following conditions are equivalent.
(1) $A$ is a dense subset of $X$.
(2) $X \setminus A$ has no interior points: $(X \setminus A)^\circ = \emptyset$.
(3) $d(x, A) = 0$ for all $x \in X$.
(4) For every $x \in X$, and every $r > 0$, $D_r(x) \cap A \ne \emptyset$.
Proof We leave the proof to the exercises. □
Definition 7.7.3 A metric space $(X, d)$ is separable if $X$ has a countable dense
subset.
Examples 7.7.4
(1) .R; j j/ is separable: the rational numbers Q are a dense subset of R. More
generally, Rn is separable since Qn is a dense subset of Rn , n 1 (here we
may take the Euclidean metric or either of the metrics d1 ; d1 on Rn ). The
simplest proof of density uses (3) of Lemma 7.7.2 and the metric d1 . Since
Qn is countable, Rn is separable, n 1.
(2) Let Œa; b R be a bounded closed interval and let P C0 .Œa; b/ be the set of
all polynomial maps p W Œa; b ! R. Then P is a dense subset of .C0 .Œa; b/; /
(uniform metric). This is precisely the Weierstrass approximation theorem.
Indeed, the Weierstrass approximation theorem states that for all f 2 C0 .Œa; b/
and all r > 0 we have Dr . f /\P ¤ ;. Hence P D C0 .Œa; b/ by Lemma 7.7.2(3).
If we let PQ denote the space of polynomial maps p W Œa; b ! R with rational
coefficients, then PQ is countable (PQ can be written as a countable union of
countable sets) and so since PQ D P D C0 .Œa; b/, we see that .C0 .Œa; b/; /
is separable. Similar results are true if we use the L2 -metric on C0 .Œa; b/ (see
Theorem 5.6.11).
(3) If we give X the discrete metric then the only subset of X which is dense
is X itself (use (2) of Lemma 7.7.2). In particular, X is separable iff X is
countable.
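Condition (4) of Lemma 7.7.2 can be checked concretely for $A = \mathbb{Q}$ at the irrational point $x = \sqrt{2}$. The following Python sketch is an illustration only (the function name `rational_near_sqrt2` is ours, not the book's). It uses exact integer arithmetic: for a positive integer $n$, the rational $q = \lfloor n\sqrt{2} \rfloor / n$ satisfies $0 < \sqrt{2} - q < 1/n$, so $q \in D_{1/n}(\sqrt{2}) \cap \mathbb{Q}$.

```python
from fractions import Fraction
from math import isqrt

def rational_near_sqrt2(n: int) -> Fraction:
    """Return a rational q with 0 < sqrt(2) - q < 1/n (n a positive integer)."""
    # isqrt(2*n*n) equals floor(n*sqrt(2)), computed exactly on integers.
    return Fraction(isqrt(2 * n * n), n)

for n in (1, 10, 1000):
    q = rational_near_sqrt2(n)
    # q < sqrt(2) < q + 1/n, verified by squaring (all quantities are positive).
    assert q * q < 2 < (q + Fraction(1, n)) ** 2
    print(n, q)
```

Since $n$ is arbitrary, every disk $D_r(\sqrt{2})$ meets $\mathbb{Q}$; the same floor construction works at any real point once its integer parts can be computed.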
Proposition 7.7.5 Suppose that the metric space $(X, d)$ is separable. Then there
exists a countable family $\mathcal{B}$ of open subsets of $X$ such that every open subset of $X$ is
a union of sets from $\mathcal{B}$.
Proof If $X$ is separable then there exists a countable dense subset $\{q_n \mid n \in \mathbb{N}\}$ of $X$.
Associated to each $n \in \mathbb{N}$, we define $\mathcal{B}_n$ to be the set of all open disks centred at $q_n$
and with positive radius $r \in \mathbb{Q}$. Since $\mathbb{Q}$ is countable, $\mathcal{B}_n$ is countable and so $\mathcal{B} = \bigcup_{n \in \mathbb{N}} \mathcal{B}_n$
is a countable set of open subsets of $X$. Given any open subset $U$ of $X$, we may
write $U$ as a union of open sets from $\mathcal{B}$. Indeed, if $x \in U$, choose $m \in \mathbb{N}$ so that
$D_{2/m}(x) \subset U$. Now choose $q_n \in D_{1/m}(x)$. Then $x \in U_x = D_{1/m}(q_n) \subset D_{2/m}(x) \subset U$.
We have $U = \bigcup_{x \in U} U_x$ and so we have expressed $U$ as a union of open sets
from $\mathcal{B}$. $\square$
Remark 7.7.6 Any metric space $(X, d)$ with the property that there exists a countable
collection $\mathcal{B}$ of open sets such that every open set can be written as a union
of sets from $\mathcal{B}$ is called second countable and $\mathcal{B}$ is called a basis for the topology
of $(X, d)$.
Example 7.7.7 If $X = \mathbb{R}$, then every open subset $U$ of $\mathbb{R}$ is a countable union of
disjoint open intervals (see Exercises 7.4.27(2)). Each of these open intervals can be
written as a countable union of open intervals with rational endpoints. However,
we cannot generally write $U$ as a disjoint union of open intervals with rational
endpoints. If $U$ is an open subset of $\mathbb{R}^n$, $n \ge 1$, then $U$ is a countable union of
(generally non-disjoint) open disks (with rational radius and rational centre).
7.7.2 Boundary of a Subset
Definition 7.7.8 The boundary (also called frontier) $\partial A$ of a subset $A$ of the metric
space $X$ is defined by
$\partial A = \overline{A} \cap \overline{X \setminus A}$.
Lemma 7.7.9 Let $A$ be a subset of the metric space $X$. Then
(1) $\partial A$ is a closed subset of $X$.
(2) $\partial A = \{x \in X \mid d(x, A) = d(x, X \setminus A) = 0\}$.
(3) $\partial A = \overline{A} \setminus A^\circ$.
(4) $x \in \partial A$ iff $D_r(x) \cap A \neq \emptyset$ and $D_r(x) \cap (X \setminus A) \neq \emptyset$ for all $r > 0$.
Proof (1,2) are immediate from the definitions. For (3) observe that if $x \notin A^\circ$ then
$x \in \overline{X \setminus A}$. For (4) use Exercises 7.5.8(5b). $\square$
Examples 7.7.10
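(1) If $[a, b] \subset \mathbb{R}$, $\partial[a, b] = \{a, b\}$. Similarly $\partial(a, b) = \{a, b\}$.
(2) $\partial X = \partial\emptyset = \emptyset$.
(3) $\partial D_r(x)$, $\partial\overline{D}_r(x)$ are subsets of $S_r(x) \overset{\text{def}}{=} \{y \in X \mid d(x, y) = r\}$ ($S_r(x)$ is
the 'sphere' of radius $r$ centred at $x$). In general, $\partial D_r(x) \neq \partial\overline{D}_r(x)$ and
$\partial D_r(x)$, $\partial\overline{D}_r(x)$ may be proper subsets of $S_r(x)$.
(4) If $X$ has the discrete metric, then $\partial Y = \emptyset$ for all subsets $Y$ of $X$.
(5) If $A$ is dense in $X$ then $\partial A = X$ if $X \setminus A$ is dense in $X$.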
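Example (3) can be made concrete in a small finite metric space. The Python sketch below is an illustration under stated assumptions: the space $X = \{-3, \dots, 3\} \subset \mathbb{Z}$ with $d(x, y) = |x - y|$ and all helper names are ours, not the book's. Closures are computed from $\overline{A} = \{x \mid d(x, A) = 0\}$ and boundaries from Definition 7.7.8.

```python
X = frozenset(range(-3, 4))          # assumed finite metric space, d(x, y) = |x - y|

def d(x, y):
    return abs(x - y)

def dist_to_set(x, A):
    """d(x, A) = inf of d(x, a) over a in A; infinity for the empty set."""
    return min((d(x, a) for a in A), default=float("inf"))

def closure(A):
    return frozenset(x for x in X if dist_to_set(x, A) == 0)

def boundary(A):
    # Definition 7.7.8: closure of A intersected with closure of the complement.
    return closure(A) & closure(X - frozenset(A))

open_disk = frozenset(x for x in X if d(0, x) < 1)      # D_1(0)
closed_disk = frozenset(x for x in X if d(0, x) <= 1)   # closed disk of radius 1
sphere = frozenset(x for x in X if d(0, x) == 1)        # S_1(0)

print(sorted(open_disk), sorted(closed_disk), sorted(sphere))
print(sorted(boundary(open_disk)), sorted(boundary(closed_disk)))
```

In this space every point is isolated, so both boundaries are empty while $S_1(0) = \{-1, 1\}$: each boundary is a proper subset of the sphere.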
EXERCISES 7.7.11
(1) Prove Lemma 7.7.2.
(2) Let $D_r(x)$ be a disk in the metric space $(X, d)$. Show that $\partial D_r(x)$ may be empty,
even if $D_r(x) \neq X$.
(3) Let $A$ be a subset of $X$. Show that $\partial A \cap \partial(X \setminus A) = X$ iff $A$ and $X \setminus A$ are dense
in $X$.
7.8 Neighbourhoods
Definition 7.8.1 A subset $N$ of $(X, d)$ is a neighbourhood of $x \in X$ if $x \in N^\circ$. If $N$ is
open, we say $N$ is an open neighbourhood of $x$.
Lemma 7.8.2 A subset $N$ of $(X, d)$ is a neighbourhood of $x$ iff there exists an $r > 0$
such that $D_r(x) \subset N$.
Proof By Lemma 7.4.6, if $d(x, X \setminus N) = r > 0$, then $D_r(x) \subset N$ and conversely. $\square$
Examples 7.8.3
(1) Let $X = \mathbb{R}$, standard metric. The closed interval $[a, b]$ is a neighbourhood of
every point $x \in (a, b)$. It is not a neighbourhood of $a$ or $b$. The open interval
$(a, b)$ is an open neighbourhood of every point in $(a, b)$.
(2) If $X$ is a metric space and $N \subset X$, then $N$ is a neighbourhood of $x$ iff $x \in N^\circ$.
(3) The open disk $D_r(x)$ and closed disk $\overline{D}_r(x)$ are neighbourhoods of $x$.
Remark 7.8.4 If $N$ is an open neighbourhood of $x$, then $N$ is an open neighbourhood
of every point in $N$.
We may characterize the interior and closure of a set using neighbourhoods.
Lemma 7.8.5 Let $A$ be a subset of $X$.
(1) $x \in A^\circ$ iff there exists a neighbourhood $N$ of $x$ such that $N \subset A$.
(2) $x \in \overline{A}$ iff $N \cap A \neq \emptyset$ for all neighbourhoods $N$ of $x$.
Proof The result is immediate from Lemmas 7.5.7, 7.8.2. $\square$
EXERCISES 7.8.6
(1) Show that distinct points of a metric space have disjoint open neighbourhoods.
(2) Let $(X, d)$ be a metric space and suppose that $x \in X$. Show that if $x$ has a
neighbourhood containing finitely many points, then $x$ is isolated. Conversely,
show that if every neighbourhood of $x$ contains infinitely many points, then $x$ is
not isolated.
(3) Let $A$ be a subset of the metric space $(X, d)$. Show that $a \in \overline{A}$ iff $N \cap A \neq \emptyset$
for every neighbourhood $N$ of $a$. Reformulate the definition of the interior of a
set in terms of neighbourhoods and verify that your definition does define the
interior.
7.9 Summary and Discussion
Let $A$ be any subset of the metric space $X$. We have shown there is a maximal open
set $A^\circ$ and a minimal closed set $\overline{A}$ such that
$A^\circ \subset A \subset \overline{A}$.
Moreover,
(1) $x \in A^\circ$ iff $d(x, X \setminus A) > 0$.
(2) $x \in \overline{A}$ iff $d(x, A) = 0$.
(3) $A$ is open iff $A = A^\circ$.
(4) $A$ is closed iff $A = \overline{A}$.
Notwithstanding that a closed set is just the complement of an open set, open and
closed sets have rather different properties. For example, every open subset of $\mathbb{R}$ is a
countable union of disjoint open intervals. However, it is not true that a closed subset
of $\mathbb{R}$ can be expressed in such a simple way; for example, as a countable union of
disjoint closed intervals. Indeed, closed subsets of $\mathbb{R}$ can be extremely complex and
pathological (matters are worse in $\mathbb{R}^n$, $n > 1$). If we can write an open set $U$ as the
disjoint union $\bigcup_{n=1}^{\infty} (a_n, b_n)$, where $b_n < a_{n+1}$, $n \ge 1$, then the closed set $\mathbb{R} \setminus U$
is the countable union of the disjoint closed intervals $[b_n, a_{n+1}]$. However, although
we can write $U$ as the disjoint union of open intervals $(a_n, b_n)$, we cannot usually
require that $b_n < a_{n+1}$ for all $n$. The situation is similar to that of the rational
numbers: although the rationals are countable, we cannot write them as a sequence
$(r_n)$ satisfying $r_n < r_{n+1}$, $n \ge 1$. In Exercises 2.2.14(9), a construction was given of
an open subset $I$ of $\mathbb{R}$ which contained every rational number and was such that the
total length $|I|$ of the open intervals comprising $I$ was less than some preassigned
number $\varepsilon > 0$. Let $A$ denote the complement of $I$ in $\mathbb{R}$. Even though $|I| < \varepsilon$, $A$
can have no interior points, since every open interval contains a rational number
and every rational number lies in $I$. Consequently the structure of $A$ is hard to
visualize: arbitrarily close to every point of $A$ is a hole where we have removed an
interval containing a rational number. At non-isolated points $a \in A$, there is a
sequence of holes converging to $a$. Granted this complexity, it is perhaps surprising
that every closed subset of $\mathbb{R}$ can be represented as the zero set of a continuous
(indeed $C^\infty$) function $f : \mathbb{R} \to \mathbb{R}$.
Needless to say, the construction of $f$ depends on defining $f$ to be non-zero on the
complement of the closed set (see the exercises for an example).
EXERCISES 7.9.1
(1) Let $(X, d)$ be a metric space. Prove that for all $x \in X$, $r > 0$, $S_r(x) = \{y \in
X \mid d(x, y) = r\}$ is a closed set.
(2) Let $A$ be a subset of the metric space $X$. Prove that the diameter of $A$ equals
the diameter of the closure of $A$. Does the same result hold if instead of the
closure we take the interior of $A$?
(3) Let $A_1, \dots, A_n$ be subsets of the metric space $X$. Prove that $\overline{\bigcup_{i=1}^n A_i} = \bigcup_{i=1}^n \overline{A_i}$.
Find an example to show this result is generally false for infinite unions.
Investigate what happens for intersections.
(4) Let $A$ be a non-empty subset of the metric space $X$. Show that $A^\circ = X \setminus \overline{X \setminus A}$
and $\overline{A} = X \setminus (X \setminus A)^\circ$.
(5) Show that in general $\partial D_r(x) \neq \partial\overline{D}_r(x)$. Also find examples where we have
$\partial D_r(x), \partial\overline{D}_r(x) \neq S_r(x)$ ($S_r(x)$ is the sphere of radius $r$ and centre $x$; see also
(1)).
(6) True or false: in each case either prove it or provide a counterexample.
ı
ı
(a) E D E?
ı
(b) E D E?
(E is a subset of the metric space X.)
(7) If $A$ is a subset of the metric space $X$ show that $A^\circ \cup \partial A \cup (X \setminus A)^\circ = X$.
(8) Let $B([a, b])$ denote the space of bounded functions $f : [a, b] \to \mathbb{R}$
with uniform metric $\rho(f, g) = \sup_{x \in [a, b]} |f(x) - g(x)|$, $f, g \in B([a, b])$ (see
Exercises 7.1.9). Show (a) if $f \in B([a, b])$ is not continuous, then there exists
an $r > 0$ such that every $g \in D_r(f)$ is not continuous, (b) the space $C^0([a, b])$ is
a closed subset of $B([a, b])$, and (c) the space $P$ of polynomials $p : [a, b] \to \mathbb{R}$
is not dense in $B([a, b])$. (For (c) you should prove it in two ways: either
use (b) or construct a bounded function which cannot be approximated by
polynomials in the metric $\rho$.)
(9) Suppose that every subset of the metric space $X$ is either open or closed. Show
that at most one point of $X$ is not isolated. What about the converse?
(10) A subset $A$ of the metric space $X$ is nowhere dense if the interior of $\overline{A}$ is empty.
Show that every finite subset of $\mathbb{R}$ is nowhere dense and construct an example
of a countable subset of $[0, 1]$ which is nowhere dense. (Later we give an
example of a non-countable subset of $[0, 1]$ which is nowhere dense.)
(11) Let $(q_n)$ be the set of all rational numbers, indexed by the positive integers,
and let $\varepsilon > 0$. For $n \ge 1$, set $I_n = (q_n - 2^{-(n+1)}\varepsilon, q_n + 2^{-(n+1)}\varepsilon)$ and define
$I = \bigcup_{n \ge 1} I_n$ (see Exercises 2.2.14(9) and the discussion above).
(a) Show that $\overline{I} = \mathbb{R}$.
(b) Set $A = \mathbb{R} \setminus I$. Show that $A^\circ = \emptyset$.
(c) For $n \ge 1$, construct a continuous function $\phi_n : \mathbb{R} \to \mathbb{R}$ which is non-zero
precisely on $I_n$ and has maximum value $2^{-(n+1)}\varepsilon$. Using the M-test
show that $\sum_{n=1}^{\infty} \phi_n$ converges to a continuous function $\phi$ on $\mathbb{R}$ which is zero
precisely on the set $A$.
(d) Using the bump function $\Psi_{a,b}$ of Examples 5.2.6(2), construct a smooth
function $\psi_n : \mathbb{R} \to \mathbb{R}$ which is non-zero precisely on the interval $I_n$. For
each $n \ge 1$, choose $\alpha_n > 0$ so that
$\alpha_n \max\{\sup_{x \in I_n} |\psi_n(x)|, \sup_{x \in I_n} |\psi_n'(x)|, \dots, \sup_{x \in I_n} |\psi_n^{(n)}(x)|\} \le \varepsilon 2^{-(n+1)}$.
Show that $\sum_{n=1}^{\infty} \alpha_n \psi_n$ converges to a $C^\infty$-function $\psi : \mathbb{R} \to \mathbb{R}$ which is
zero precisely on the set $A$. (You will need both the M-test and results of
Chap. 5 that give conditions for an infinite series of functions to converge
to a differentiable function.)
(The methods used in this example to construct $\phi$ and $\psi$ extend to general
closed subsets of $\mathbb{R}$. See also Sect. 7.12.)
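For orientation, the total length of the intervals $I_n$ of Exercise (11) can be computed directly. The following is only a sketch of the arithmetic: each $I_n$ has length $2 \cdot 2^{-(n+1)}\varepsilon = 2^{-n}\varepsilon$, and the bound on $|I|$ assumes nothing beyond countable subadditivity of length.

```latex
\[
  \sum_{n=1}^{\infty} |I_n|
  = \sum_{n=1}^{\infty} 2 \cdot 2^{-(n+1)} \varepsilon
  = \varepsilon \sum_{n=1}^{\infty} 2^{-n}
  = \varepsilon ,
  \qquad\text{so}\qquad
  |I| \le \sum_{n=1}^{\infty} |I_n| = \varepsilon .
\]
```

Since the intervals $I_n$ overlap, the inequality is in fact strict, in agreement with the discussion preceding these exercises.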
7.10 Sequences and Limit Points
In this section we develop the theory of convergent sequences in metric spaces. We
show how results about convergence are related to closed sets and prove a very
useful characterization of a closed subset of a metric space: a subset A of X is closed
iff the limit of every convergent sequence $(x_n) \subset A$ lies in $A$.
Definition 7.10.1 Let $(X, d)$ be a metric space. A sequence of points in $X$ consists
of an ordered subset $(x_n)$ of $X$ indexed by the positive or strictly positive integers.
Examples 7.10.2
(1) A constant sequence $(x_n)$ in the metric space $X$ has the property that for all $n$,
$x_n = x_0$ for some fixed $x_0 \in X$.
(2) If we define $x_n = (\frac{1}{n}, \frac{1}{n})$, $n \ge 1$, then $(x_n)$ is a sequence in $\mathbb{R}^2$.
(3) Let $T : X \to X$. Given $x_0 \in X$, define the sequence $(x_n)$ recursively by $x_{n+1} =
T(x_n)$, $n \ge 0$. For example, suppose $X = C^0([a, b])$, $F : \mathbb{R} \to \mathbb{R}$ is continuous
and $C \in \mathbb{R}$. Define $T : X \to X$ by
$T(f)(x) = C + \int_a^x F(f(t))\, dt$, $f \in C^0([a, b])$, $x \in [a, b]$.
If we choose an initial function $f_0 \in C^0([a, b])$, then we obtain the sequence
$(f_n)$ in $C^0([a, b])$ by the rule $f_{n+1} = T(f_n)$. As we shall see, this iteration turns
out to be useful in constructing the solution $y = y(x)$ to the differential equation
$\frac{dy}{dx} = F(y)$ with initial condition $y(a) = C$.
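The iteration in Example (3) can be tried out on the simplest case. The Python sketch below is an illustration under assumptions of our own choosing: $F(y) = y$, $a = 0$, $C = 1$ and $f_0 = 1$, for which the solution of the differential equation is $e^x$. Polynomials are stored as coefficient lists, and the name `picard_step` is hypothetical.

```python
from fractions import Fraction

def picard_step(coeffs, C):
    """One step f -> T(f), where T(f)(x) = C + integral_0^x f(t) dt (case F(y) = y).

    A polynomial is stored as its coefficient list [c0, c1, c2, ...].
    """
    # Integrating c_k t^k from 0 to x gives c_k x^(k+1) / (k+1).
    integral = [Fraction(c, k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(C)] + integral

f = [Fraction(1)]            # initial function f_0 = 1 (constant)
for _ in range(3):
    f = picard_step(f, 1)
print(f)                     # coefficients of f_3
```

Under these assumptions each iterate $f_n$ is the degree $n$ Taylor polynomial of $e^x$ at $0$, which gives a first indication of why the sequence $(f_n)$ is relevant to solving the differential equation.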
Definition 7.10.3 The sequence $(x_n)$ of points in $(X, d)$ is convergent if there exists
an $x \in X$ such that $\lim_{n \to \infty} d(x_n, x) = 0$. We call $x$ the limit of the sequence and
write $\lim_{n \to \infty} x_n = x$.
Lemma 7.10.4 Let $(x_n)$ be a sequence of points in $(X, d)$. The following statements
are equivalent.
(1) $(x_n)$ converges to $x$.
(2) For all $r > 0$, there exists an $m = m(r) \in \mathbb{N}$ such that $x_n \in D_r(x)$, for all $n \ge m$.
(3) For all $r > 0$, there exists an $m = m(r) \in \mathbb{N}$ such that $x_n \in \overline{D}_r(x)$, for all $n \ge m$.
(4) For every neighbourhood $N$ of $x$, there exists an $m = m(N) \in \mathbb{N}$ such that
$x_n \in N$, for all $n \ge m$.
Proof (1) $\Longrightarrow$ (2) Assume (1) holds. Since $\lim_{n \to \infty} d(x_n, x) = 0$, given $r > 0$,
there exists $m \in \mathbb{N}$ such that $d(x, x_n) < r$, $n \ge m$. That is, $x_n \in D_r(x)$, $n \ge m$.
(2) $\Longrightarrow$ (4) Assume (2) holds. Given a neighbourhood $N$ of $x$, there exists an
$r > 0$ such that $D_r(x) \subset N$. Now apply (2).
(4) $\Longrightarrow$ (3) $\overline{D}_r(x)$ is a neighbourhood of $x$, $r > 0$.
(3) $\Longrightarrow$ (1) Assume (3) holds. Given $\varepsilon > 0$, there exists an $m \in \mathbb{N}$ such that
$x_n \in \overline{D}_{\varepsilon/2}(x) \subset D_\varepsilon(x)$, $n \ge m$. That is, $d(x, x_n) < \varepsilon$, $n \ge m$. Hence $(x_n)$ converges
to $x$. $\square$
Remarks 7.10.5
(1) Note formulation (4) of convergence, framed in terms of neighbourhoods. This
is part of our move away from the $\varepsilon, \delta$ style of definitions to more general and
natural definitions given in terms of open sets (or neighbourhoods) and one less
quantifier.
(2) Just as for sequences of real numbers, we need a criterion for convergence that
does not depend on knowing the limit. However, before we develop that aspect
of the theory we introduce some new ideas that relate limits of sequences to
closed sets.
7.10.1 Limit Points of a Set
We recall that a point $x \in X$ is isolated if $\{x\}$ is a neighbourhood of $x$. More
generally, if $A$ is a non-empty subset of $X$, a point $a \in A$ is isolated (in $A$) if there
exists a neighbourhood $N$ of $a$ such that $N \cap (A \setminus \{a\}) = \emptyset$. In terms of the induced
topology, $a$ is isolated in $A$ iff $a$ is an isolated point of $(A, d_A)$ (we leave the formal
verification to the exercises).
Example 7.10.6 Let $X = \mathbb{R}$, standard metric, and take $A = \{0\} \cup [1, 2]$. Then $A$ has
one isolated point: $0$. As a less trivial example, take $A = \{0\} \cup \{1/n \mid n \ge 1\}$.
Then every point of $A$ is isolated except $0$.
In our next definition we aim to capture points which are not isolated relative to a
subset A of X.
Definition 7.10.7 If $A$ is a non-empty subset of the metric space $X$, then a point
$x \in X$ is called a limit point (or accumulation point) of $A$ if
(1) $d(x, A) = 0$ (equivalently, $x \in \overline{A}$) and
(2) $x$ is not an isolated point of $A$.
We denote the set of limit points of $A$ by $A'$.
Examples 7.10.8
(1) Let $A = (a, b) \subset \mathbb{R}$. Every point of $[a, b]$ is a limit point of $A$ and so $[a, b] \subset A'$.
Since the closure of $(a, b)$ is $[a, b]$, we have $A' = [a, b]$.
(2) Let $X = \mathbb{R}$ and define $A = \{0\} \cup \{\frac{1}{n} \mid n \ge 1\}$. The subset $A$ has exactly one
limit point: $0$ ($0$ is the only point in $A$ which is not isolated). Hence $A' = \{0\}$.
(3) Let $X = \mathbb{R}$ and define $A = \{\frac{1}{n} \mid n \ge 1\}$. The subset $A$ has exactly one limit
point: $0$ and so $A' = \{0\}$. In this case the limit point does not lie in $A$.
(4) Let $(X, d)$ be a metric space and suppose $x_0$ is an isolated point of $X$. If $A$ is any
non-empty subset of $X$, then $x_0 \notin A'$. In particular, if $X$ has the discrete metric
and $A \subset X$, then $A' = \emptyset$.
(5) Take $\mathbb{Q} \subset \mathbb{R}$. Then $\mathbb{Q}' = \mathbb{R}$: we have $\overline{\mathbb{Q}} = \mathbb{R}$ and since $\mathbb{Q}$ has no isolated points,
$\mathbb{Q}' = \mathbb{R}$.
(6) Let $P \subset C^0([a, b])$ denote the set of polynomial maps and take the uniform
metric on $C^0([a, b])$. Then $P' = C^0([a, b])$ (since $P$ contains no isolated points
and $\overline{P} = C^0([a, b])$ by the Weierstrass approximation theorem).
(7) Let $B([a, b])$ denote the space of bounded real-valued functions on $[a, b]$ and
take the uniform metric $\rho$ on $B([a, b])$. Suppose that $(x_n)$ is a sequence of distinct
points of $[a, b]$. For $n \ge 1$ define $\phi_n \in B([a, b])$ by
$\phi_n(x) = \begin{cases} 1, & \text{if } x = x_n, \\ 0, & \text{if } x \neq x_n. \end{cases}$
We have $\rho(\phi_n, \phi_m) = 1$ for all $n \neq m$ and so the set $\{\phi_n \mid n \in \mathbb{N}\}$ consists of
isolated points and has empty limit point set. Note that $(\phi_n)$ has no
convergent subsequences and is a bounded subset of $B([a, b])$ (it is a subset of
$D_2(0)$). Consequently, the Bolzano–Weierstrass theorem does not generalize to
$B([a, b])$ (a similar remark holds for $(C^0([a, b]), \rho)$, see the exercises at the end
of the section).
We give equivalent formulations of the definition of a limit point.
Lemma 7.10.9 Let $A$ be a non-empty subset of the metric space $(X, d)$ and suppose
$x \in X$. The following statements are equivalent.
(1) $x \in A'$.
(2) For all $r > 0$, $D_r(x) \cap (A \setminus \{x\}) \neq \emptyset$.
(3) For every neighbourhood $N$ of $x$, $N \cap (A \setminus \{x\}) \neq \emptyset$.
(4) There exists a sequence $(x_n) \subset A \setminus \{x\}$ which converges to $x$.
Proof (1) $\Longrightarrow$ (2) Suppose that $x \in A'$. Given $r > 0$, $D_r(x) \cap (A \setminus \{x\}) \neq \emptyset$ (else
either $x$ is an isolated point of $A$, if $x \in A$, or $x \notin A$ and $d(x, A) > 0$).
(2) $\Longleftrightarrow$ (3). For this observe that if $N$ is a neighbourhood of $x$, then there exists
an $n \in \mathbb{N}$ such that $D_{1/n}(x) \subset N$ and so $N \cap (A \setminus \{x\}) \neq \emptyset$. The converse implication
is trivial.
(2) $\Longrightarrow$ (4). For all $n \in \mathbb{N}$, $D_{1/n}(x) \cap (A \setminus \{x\}) \neq \emptyset$. Choose $x_n \in D_{1/n}(x) \cap (A \setminus
\{x\})$, $n \in \mathbb{N}$. Then $d(x, x_n) < 1/n$, for all $n \in \mathbb{N}$, and so $\lim_{n \to \infty} x_n = x$.
(4) $\Longrightarrow$ (1). Since $\lim_{n \to \infty} d(x_n, x) = 0$ and $d(x, A) \le d(x, x_n)$, we have
$d(x, A) = 0$. Since $(x_n) \subset A \setminus \{x\}$, $x$ cannot be an isolated point of $A$. Hence
$x \in A'$. $\square$
We collect together some properties of the limit point set in the next result.
Proposition 7.10.10 Let $A$ be a non-empty subset of the metric space $(X, d)$.
(1) $x \in A'$ iff $x \in \overline{A \setminus \{x\}}$.
(2) $A'$ is a closed subset of $X$.
(3) $\overline{A} = A \cup A'$.
(4) $A$ is closed iff $A' \subset A$ (“$A$ is closed iff $A$ contains all its limit points”).
Proof
(1) If $x \in \overline{A \setminus \{x\}}$ then $d(x, A \setminus \{x\}) = 0$ and so $x \in \overline{A}$ and $x$ is not an isolated point
of $A$. The converse is equally simple.
(2) It suffices to show that if $x \notin A'$, then $d(x, A') > 0$. If $x \notin A'$, there exists an $r >
0$ such that $D_r(x) \cap (A \setminus \{x\}) = \emptyset$. Hence $D_r(x) \subset X \setminus A'$ and $d(x, A') \ge r > 0$.
(3) Suppose $x \in \overline{A}$. Either $x \in A$ or not. If not, then $d(x, A) = d(x, A \setminus \{x\}) = 0$ and
so $x \in A'$ by (1). Hence $\overline{A} \subset A \cup A'$. Conversely, if $x \in A \cup A'$, then $d(x, A) = 0$
(using (1) again) and so $x \in \overline{A}$.
Finally, (4) is immediate from (3) since $A$ is closed iff $A = \overline{A}$. $\square$
7.10.2 Limit Points of a Sequence
We next investigate limit points of sequences. Our definition will need to take
account of the order implicit in the definition of a sequence.
Definition 7.10.11 A limit point (or cluster point) of the sequence $(x_n) \subset X$ is a
point $x \in X$ such that there exists a subsequence $(x_{n_k})$ of $(x_n)$ converging to $x$.
Remark 7.10.12 A sequence $(x_n) \subset X$ defines a subset $\{x_n \mid n \in \mathbb{N}\}$. A limit point
of the sequence $(x_n)$ may not be a limit point of the subset $\{x_n \mid n \in \mathbb{N}\}$. For
example, a constant sequence $(x_n = x_0)$ has the limit point $x_0$ but the set $\{x_0\}$ has no
limit points.
Proposition 7.10.13 Let $A$ be a subset of the metric space $X$ and suppose that
$(x_n) \subset A$. Every limit point of the sequence $(x_n)$ lies in $\overline{A}$. In particular, $A$ is closed
iff $A$ contains the limit of every convergent sequence of points of $A$.
Proof Suppose that $(x_{n_k})$ is a subsequence of $(x_n)$ converging to $x^\star$. We have
$d(x^\star, A) \le d(x^\star, x_{n_k})$, $k \in \mathbb{N}$, and so letting $k \to \infty$, we see $d(x^\star, A) = 0$ and
$x^\star \in \overline{A}$. Alternatively, one can base the proof on disk neighbourhoods of $x^\star$ and
use Lemma 7.5.7. For the final statement, suppose first that $A$ is closed. If $(x_n) \subset A$
converges to $x^\star$, then $x^\star \in \overline{A} = A$. Conversely, if there exists $(x_n) \subset A$ converging
to $x^\star \notin A$, then $x^\star \in A'$ and so $A$ cannot be closed by Proposition 7.10.10(4). $\square$
Remark 7.10.14 The last part of Proposition 7.10.13 will be very useful when we
investigate properties of continuous functions and compact sets. It is worth giving
a self-contained proof that uses only the definition of a closed set. Let $A$ be closed
and suppose $(x_n) \subset A$ converges to $x^\star$. Since $\lim_{n \to \infty} d(x_n, x^\star) = 0$ and
$d(x^\star, A) \le d(x_n, x^\star)$, for all $n \in \mathbb{N}$, $d(x^\star, A) = 0$. Hence $x^\star \in A$. Conversely, suppose $A$ is not
closed. Then there exists an $x^\star \in X \setminus A$ such that $d(x^\star, A) = 0$. For $n \in \mathbb{N}$, choose
$x_n \in D_{1/n}(x^\star) \cap A$. Then the sequence $(x_n) \subset A$ converges to $x^\star \notin A$.
We now give a far-reaching generalization of Proposition 7.10.13.
Theorem 7.10.15 Let $(x_n)$ be a sequence in $(X, d)$. Define
$S = \bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}}$.
Then $x \in S$ iff there exists a subsequence $(x_{n_k})$ of $(x_n)$ converging to $x$. In particular,
$(x_n)$ has a convergent subsequence iff $S \neq \emptyset$.
This theorem allows us to capture all possible limits of convergent subsequences
of a sequence by a process of intersection and closure. We remark that if $(x_n) \subset A$,
then $\overline{\{x_m \mid m \ge n\}} \subset \overline{A}$ so Theorem 7.10.15 implies Proposition 7.10.13.
Before we prove the theorem, we give a number of examples to illustrate the
ideas.
Examples 7.10.16
(1) Let $(x_n) \subset \mathbb{R}$ be defined by $x_n = n$, $n \ge 1$. For $n \ge 1$ we have
$\overline{\{x_m \mid m \ge n\}} = \overline{\{m \mid m \ge n\}} = \{m \mid m \ge n\}$.
Hence $\bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}} = \bigcap_{n \ge 1} \{m \mid m \ge n\} = \emptyset$. In this case, $(x_n)$ has no
convergent subsequences.
(2) Let $(q_n) \subset \mathbb{Q} \subset \mathbb{R}$ be a sequence which contains every rational number. Then
for $n \ge 1$,
$\{q_m \mid m \ge n\} = \mathbb{Q} \setminus \text{finite set}$,
and so $\overline{\{q_m \mid m \ge n\}} = \mathbb{R}$. Therefore, $\bigcap_{n \ge 1} \overline{\{q_m \mid m \ge n\}} = \mathbb{R}$: every real
number is the limit of a sequence of distinct rational numbers.
(3) Let $x_n = (-1)^{n+1}$, $n \in \mathbb{N}$. We have $\overline{\{x_m \mid m \ge n\}} = \{-1, +1\}$ for all $n \in
\mathbb{N}$. Hence $\bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}} = \{-1, +1\}$, reflecting the fact that a convergent
subsequence of $(x_n)$ must converge to $\pm 1$.
(4) Let $(x_n) \subset \mathbb{R}$ be a bounded sequence of real numbers. We know by
Proposition 2.4.3 (corollary to the Bolzano–Weierstrass theorem) that $(x_n)$ has at least
one convergent subsequence. Consequently, it follows from Theorem 7.10.15
that $\bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}} \neq \emptyset$. This property does not hold for bounded sequences
in a general metric space. Rather than working with bounded sequences, we
instead require that sequences are subsets of compact subsets of the metric
space. We then always have at least one convergent subsequence (this is
essentially our definition of the term “compact”). A compact set can be thought
of as a far reaching generalization of a closed and bounded interval. The main
problem will be to find good characterizations of compactness.
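Proof of Theorem 7.10.15 If $x^\star \in S = \bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}}$, then $x^\star \in \overline{\{x_m \mid m \ge n\}}$
for all $n \in \mathbb{N}$. Hence, for every $r > 0$ and every $n \in \mathbb{N}$,
$D_r(x^\star) \cap \{x_m \mid m \ge n\} \neq \emptyset$. (7.1)
Using (7.1), we construct inductively a subsequence $(x_{n_k})$ of $(x_n)$ converging to
$x^\star$. Taking $r = 1$, there exists an $n_1 \in \mathbb{N}$ such that $x_{n_1} \in D_1(x^\star)$. Suppose we
have constructed $x_{n_1}, \dots, x_{n_k}$ such that $n_1 < n_2 < \dots < n_k$ and $x_{n_j} \in D_{1/j}(x^\star)$,
$j = 1, \dots, k$. We claim we can pick $n_{k+1} > n_k$ so that $x_{n_{k+1}} \in D_{1/(k+1)}(x^\star)$.
If not, this would imply $D_{1/(k+1)}(x^\star) \cap \{x_m \mid m \ge n\} = \emptyset$, for all $n > n_k$,
contradicting (7.1). This completes our construction of $(x_{n_k})$. Since $d(x^\star, x_{n_k}) < 1/k$,
we have $\lim_{k \to \infty} x_{n_k} = x^\star$.
Conversely, suppose that $(x_{n_k})$ is a subsequence of $(x_n)$ converging to $x^\star$. We
claim that $x^\star \in S$. It suffices to show that for every $r > 0$, $D_r(x^\star) \cap \{x_m \mid m \ge n\} \neq \emptyset$
for all $n \in \mathbb{N}$. Since $\lim_{k \to \infty} d(x_{n_k}, x^\star) = 0$, there exists a $k(r) \in \mathbb{N}$
such that $d(x_{n_k}, x^\star) < r$ for all $k \ge k(r)$. Pick $n_k \ge n$ with $k \ge k(r)$. Then $x_{n_k} \in
D_r(x^\star) \cap \{x_m \mid m \ge n\}$ and so $D_r(x^\star) \cap \{x_m \mid m \ge n\} \neq \emptyset$. $\square$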
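The inductive choice of indices in the first half of the proof can be imitated on a computer. The Python sketch below is only an illustration: the sequence $x_n = (-1)^n (1 + 1/n)$, the target point $1$, and the names `subsequence_indices` and `search_limit` are our own assumptions. Because the search is cut off after finitely many steps, it can exhibit a subsequence but cannot decide whether a given point belongs to $S$.

```python
from fractions import Fraction

def x(n):
    """Assumed example sequence x_n = (-1)^n (1 + 1/n), n >= 1."""
    return (-1) ** n * (1 + Fraction(1, n))

def subsequence_indices(x, target, dist, k_max, search_limit=10_000):
    """Pick n_1 < n_2 < ... with dist(x(n_k), target) < 1/k, as in the proof.

    The search stops at search_limit, so this only illustrates the
    construction; it does not prove that target is a limit point.
    """
    indices, last = [], 0
    for k in range(1, k_max + 1):
        n = last + 1
        while dist(x(n), target) >= Fraction(1, k):
            n += 1
            if n > search_limit:
                raise ValueError("no suitable index found within search_limit")
        indices.append(n)
        last = n
    return indices

print(subsequence_indices(x, 1, lambda a, b: abs(a - b), k_max=4))
```

For this sequence the odd-indexed terms approach $-1$ and the even-indexed terms approach $+1$, so the search with target $1$ selects even indices only.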
Example 7.10.17 Let $(x_n) \subset \mathbb{R}$ be a bounded sequence and set $S =
\bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}}$. It follows by the definition of lim inf, lim sup and Lemma 2.5.1(3)
that $\inf S = \liminf x_n$, $\sup S = \limsup x_n$. Therefore the smallest closed
interval containing $S$ is $[\liminf x_n, \limsup x_n]$. Viewed this way, we may think of
Theorem 7.10.15 as a generalization of lim sup, lim inf to general metric spaces.
EXERCISES 7.10.18
(1) Suppose that $A \subset X$. Show that $a \in A$ is isolated in $A$ iff $a$ is an isolated point
of $(A, d_A)$.
(2) Find countable infinite subsets $A$ of $\mathbb{R}$ such that
(a) $A$ has no limit points.
(b) $A$ has exactly three limit points.
(c) $A$ is bounded and $A' = \{0\} \cup \{1/n \mid n \ge 1\}$ (so $A'$ is countable).
(d) $A$ has non-countably many limit points.
In which of the cases (a–d) must (1) $A$ have isolated points? (2) $A$ have
infinitely many isolated points?
(3) Let $(X, d)$ be a metric space with the property that every convergent sequence
is eventually constant. Prove that every point in $X$ is isolated. (The topology
on $X$ is therefore the same as the topology given by the discrete metric.)
(4) Find an example of a non-empty open subset $A$ of $\mathbb{R}$ for which $(A')^\circ \neq A$.
(5) Let $A \subset X$. Show that if $a$ is an isolated point of $A$ then $d(a, A') > 0$ and
deduce that $A'$ is a closed subset of $X$.
(6) Show by means of an example that in general $(A')' \neq A'$. Is $(A')'$ a subset or
superset of $A'$?
(7) Find an example of a subset $A$ of $\mathbb{R}$ such that $A'$ is countably infinite but $(A')' =
\emptyset$.
(8) Let $A, B$ be subsets of the metric space $(X, d)$. Prove that $(A \cup B)' = A' \cup B'$.
Is it true that $(A \cap B)'$ is equal to the intersection of $A'$ and $B'$?
(9) Show that $A' = (\overline{A})'$ for all subsets $A$ of a metric space $X$.
(10) Prove that $A \subset X$ is closed iff for every convergent sequence $(x_n) \subset A$, the
limit of the sequence lies in $A$.
(11) Show that a sequence $(x_n) \subset X$ is convergent iff $\bigcap_{n \ge 1} \overline{\{x_m \mid m \ge n\}}$ consists of
a single point.
(12) Construct a bounded countable subset $(\phi_n)$ of $(C^0([a, b]), \rho)$ with no limit
points. (Hint: choose $\phi_n$ so that $\rho(\phi_n, \phi_m) \ge 1$ for all $n > m \ge 1$, see
Examples 7.10.8(7)).
(13) Show that every closed nonempty subset $F$ of a closed interval can be written
as a union $F = E \cup P$, where $E$ is a countable subset of $F$ and $P$ is closed and
contains no isolated points ($P$ is an example of a perfect set). Is this result true
for general metric spaces? Proof or counterexample. (Hint for the first part:
Suppose $F$ is uncountable. Let $P \subset F$ be the set of all points $x \in F$ such that
every neighbourhood of $x$ contains uncountably many points of $F$. Note that
taking $P = F'$ does not work in general.)
(14) Suppose that $F$ is a closed nonempty subset of the interval $[a, b]$ and that $F$
is uncountable. Show that we can find a subset $H$ of $F$ such that (a) $F \setminus H$ is
countable, (b) for every point $z \in H$, we can find sequences $(x_n), (y_n) \subset H$
converging to $z$ such that $x_n < z < y_n$ for all $n \in \mathbb{N}$. (Hint: Use the previous
exercise and construct $H$ as a subset of $P$.)
7.11 Continuous Functions
We start with a general definition of continuity (that works for all topological
spaces). We show later that our definition is equivalent to the familiar $\varepsilon, \delta$-definition.
Definition 7.11.1 Let $(X, d)$ and $(Y, \rho)$ be metric spaces and $f : X \to Y$. The map $f$
is continuous at the point $x_0 \in X$ if for every neighbourhood $N$ of $f(x_0)$, $f^{-1}(N)$ is a
neighbourhood of $x_0$. If $f$ is continuous at every point of $X$, we say $f$ is continuous.
We give a simple application of our definition that shows an advantage of framing
continuity in terms of neighbourhoods.
Lemma 7.11.2 Let $X, Y, Z$ be metric spaces and suppose $f : X \to Y$ is continuous
at $x_0 \in X$, $g : Y \to Z$ is continuous at $f(x_0) = y_0 \in Y$. Then the composite
$g \circ f : X \to Z$ is continuous at $x_0$.
Proof It suffices to show that if $Q$ is a neighbourhood of $g(y_0)$, then $(g \circ f)^{-1}(Q)$
is a neighbourhood of $x_0$. Since $Q$ is a neighbourhood of $g(y_0)$ and $g$ is continuous
at $y_0$, $g^{-1}(Q)$ is a neighbourhood of $y_0$. Since $f$ is continuous at $x_0$, $f^{-1}(g^{-1}(Q)) =
(g \circ f)^{-1}(Q)$ is a neighbourhood of $x_0$. $\square$
If we work with continuous maps from X to Y, we can give an elegant
characterization of continuity in terms of open or closed sets.
Theorem 7.11.3 Let $X$ and $Y$ be metric spaces and $f : X \to Y$. The following
statements are equivalent.
(1) $f$ is continuous.
(2) For every open subset $U$ of $Y$, $f^{-1}(U)$ is an open subset of $X$.
(3) For every closed subset $F$ of $Y$, $f^{-1}(F)$ is a closed subset of $X$.
Proof We start by noting that (2) and (3) are equivalent since
$f^{-1}(Y \setminus U) = f^{-1}(Y) \setminus f^{-1}(U) = X \setminus f^{-1}(U)$, $f^{-1}(Y \setminus F) = X \setminus f^{-1}(F)$.
It suffices to prove (1) and (2) are equivalent. Suppose (2) holds. Let $x_0 \in X$ and
$N$ be a neighbourhood of $f(x_0)$. It suffices to show $f^{-1}(N)$ is a neighbourhood of
$x_0$. Certainly, $N^\circ$ is an open neighbourhood of $f(x_0)$ and so $f^{-1}(N^\circ)$ is an open subset
of $X$. Since $x_0 \in f^{-1}(N^\circ) \subset f^{-1}(N)$, $f^{-1}(N)$ is a neighbourhood of $x_0$. Conversely,
suppose (1) holds. Let $U$ be an open subset of $Y$. Then $U$ is a neighbourhood of
every point $y \in U$. Since $f$ is continuous, $f^{-1}(U)$ will be a neighbourhood of every
point $x \in X$ such that $f(x) \in U$. In other words, the interior of $f^{-1}(U)$ is precisely
$f^{-1}(U)$ and so $f^{-1}(U)$ is open. $\square$
Examples 7.11.4
(1) The identity map $I : X \to X$ of a metric space $X$ is continuous: for every open
subset $U$ of $X$, $I^{-1}(U) = U$, which is open.
(2) Let $f : X \to \mathbb{R}$ be continuous. Then $f^{-1}(0)$ is a closed subset of $X$: solution
sets of continuous functions are closed. Generally, if $f : X \to Y$ is continuous
and $y_0 \in Y$, then $f^{-1}(y_0) = \{x \in X \mid f(x) = y_0\}$ is a closed subset of $X$. If
we work with strict inequality, we obtain open sets. For example, if $a \in \mathbb{R}$ and
$f : X \to \mathbb{R}$ is continuous, then $\{x \mid f(x) > a\}$ is an open subset of $X$.
In the next lemma, we show that continuity as we have defined it is equivalent to
the usual $\varepsilon, \delta$ definition.
Lemma 7.11.5 Let $(X, d)$ and $(Y, \rho)$ be metric spaces, $f : X \to Y$ and $x_0 \in X$. The
following statements are equivalent.
(1) $f$ is continuous at $x_0$.
(2) For every $\varepsilon > 0$, there exists a $\delta > 0$ such that $f^{-1}(D_\varepsilon(f(x_0))) \supset D_\delta(x_0)$.
(3) For every $\varepsilon > 0$, there exists a $\delta > 0$ such that $\rho(f(x), f(x_0)) < \varepsilon$ if $d(x, x_0) < \delta$.
Proof (1) $\Longrightarrow$ (2) Taking $N = D_\varepsilon(f(x_0))$, we see that $f^{-1}(N)$ is a neighbourhood
of $x_0$. Hence there exists a $\delta > 0$ such that $D_\delta(x_0) \subset f^{-1}(N)$ and
$f^{-1}(D_\varepsilon(f(x_0))) \supset D_\delta(x_0)$. (2) $\Longleftrightarrow$ (3) If $f^{-1}(D_\varepsilon(f(x_0))) \supset D_\delta(x_0)$ then $D_\varepsilon(f(x_0)) \supset f(D_\delta(x_0))$.
Obviously this implies the equivalence of (2) and (3). Finally, we show (2) $\Longrightarrow$
(1). Let $N$ be a neighbourhood of $f(x_0)$. Choose $\varepsilon > 0$ such that $D_\varepsilon(f(x_0)) \subset N$.
Now there exists a $\delta > 0$ such that $f^{-1}(D_\varepsilon(f(x_0))) \supset D_\delta(x_0)$ and so $f^{-1}(D_\varepsilon(f(x_0)))$
is a neighbourhood of $x_0$. Therefore, $f^{-1}(N) \supset f^{-1}(D_\varepsilon(f(x_0)))$ is a neighbourhood
of $x_0$. $\square$
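Statement (3) of Lemma 7.11.5 can be illustrated numerically. The Python sketch below treats one assumed example, $f(x) = x^2$ on $\mathbb{R}$ with the standard metric; the function name `delta_for_square` and the choice $\delta = \min(1, \varepsilon / (2|x_0| + 1))$ are ours. Checking finitely many sample points illustrates the estimate but is of course not a proof.

```python
from fractions import Fraction

def delta_for_square(x0, eps):
    """A delta that works for f(x) = x^2 at x0.

    If |x - x0| < delta <= 1 then |x + x0| < 2|x0| + 1, hence
    |x^2 - x0^2| = |x - x0| |x + x0| < delta (2|x0| + 1) <= eps.
    """
    return min(Fraction(1), Fraction(eps) / (2 * abs(Fraction(x0)) + 1))

x0, eps = Fraction(3), Fraction(1, 100)
delta = delta_for_square(x0, eps)
# Sample points of the open disk D_delta(x0) and check they land in D_eps(f(x0)).
samples = [x0 + delta * Fraction(j, 10) for j in range(-9, 10)]
assert all(abs(x * x - x0 * x0) < eps for x in samples)
print(delta)
```

Note that the $\delta$ produced here depends on $x_0$ as well as on $\varepsilon$, which is exactly what the definition of continuity at a point allows.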
Remarks 7.11.6
(1) Our definition of continuity simply says that continuous functions are exactly
those functions whose inverse images preserve open sets. That is, $f^{-1}(U)$ is open
for every open set $U$. The disadvantages of the $\varepsilon, \delta$ definition are firstly that it
requires three quantifiers ('for all $x_0 \in X$', 'for all $\varepsilon > 0$', 'there exists a $\delta > 0$') and
secondly that it uses metrically defined disks which are not preserved by continuous
functions ($f^{-1}(D_r(x))$ is usually not a disk). In this sense the definition is not at
all natural.
(2) Note that the continuity definition uses the inverse image of sets, not the forward
images. This is characteristic of many definitions in mathematics. It is also often
the case that the properties are not preserved under forward images. However,
it is usually interesting when they are; we encounter two important examples
shortly (compactness and connectedness).
Example 7.11.7 If $f : X \to Y$ is continuous, then $f(F)$ is generally not a closed
subset of $Y$ if $F$ is closed in $X$. Similarly, $f(U)$ will generally not be open in $Y$ if $U$
is an open subset of $X$. For example, suppose $f : \mathbb{R} \to \mathbb{R}$ is given by $f(x) = x^2$. Take
$U = (-1, 1)$. Then $f(U) = [0, 1)$, which is not an open subset of $\mathbb{R}$. For an example
where $f$ does not map closed sets to closed sets, let $F \subset \mathbb{R}^2$ be the graph of the
continuous strictly positive function $g(x) = (1 + x^2)^{-1}$. Since $g$ is continuous, $F$ is a
closed subset of $\mathbb{R}^2$: $F$ is the zero set of the continuous function $G(x, y) = y - g(x)$.
Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the projection on the $y$-axis: $f(x, y) = y$. Then $f(F) = (0, 1]$,
which is not closed.
For future reference, we give the definition of uniform continuity in metric spaces.
Definition 7.11.8 Let $(X, d)$, $(Y, \rho)$ be metric spaces. The map $f : X \to Y$ is
uniformly continuous if for each $\varepsilon > 0$, there exists a $\delta > 0$ such that
$\rho(f(x), f(x')) < \varepsilon$, for all $x, x' \in X$ satisfying $d(x, x') < \delta$.
Remark 7.11.9 Unlike continuity, the definition of uniform continuity requires
structure beyond that of open and closed sets.
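To see what the extra uniformity demands, it may help to look at a function that is continuous but not uniformly continuous. The Python sketch below uses the assumed example $f(x) = x^2$ on $\mathbb{R}$ and takes $\varepsilon = 1$; the helper name `witness_pair` is ours. For any proposed $\delta > 0$ it produces two points closer than $\delta$ whose images differ by more than $1$.

```python
from fractions import Fraction

def witness_pair(delta):
    """Two points closer than delta whose squares differ by more than 1."""
    x = 1 / Fraction(delta)
    x_prime = x + Fraction(delta) / 2
    return x, x_prime

for delta in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    x, x_prime = witness_pair(delta)
    assert abs(x - x_prime) < delta
    # x'^2 - x^2 = (delta/2)(2/delta + delta/2) = 1 + delta^2/4 > 1
    assert abs(x * x - x_prime * x_prime) > 1
```

The witnesses move off to infinity as $\delta$ shrinks, so no single $\delta$ serves every point of $\mathbb{R}$ at once, even though a suitable $\delta$ exists at each individual point.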
EXERCISES 7.11.10
(1) Suppose $X, Y, Z$ are metric spaces and $f : X \to Y$, $g : Y \to Z$ are continuous.
Prove that the composite $g \circ f : X \to Z$ is continuous. Show that if $f, g$ are
uniformly continuous, then so is $g \circ f$.
(2) Suppose that $f : X \to \mathbb{R}$ is continuous. Prove that the maps $f_+(x) =
\max(0, f(x))$ and $f_-(x) = \min(0, f(x))$ are continuous.
(3) Let $(X, d)$, $(Y, \rho)$ be metric spaces. An isometry of $X$ and $Y$ is an onto map
$f : X \to Y$ such that $\rho(f(x_1), f(x_2)) = d(x_1, x_2)$ for all $x_1, x_2 \in X$. Prove that
every isometry is 1:1 and continuous (even uniformly continuous). Show that
if $f$ is an isometry then the inverse map $f^{-1} : Y \to X$ is also an isometry.
(4) Suppose that $f : X_1 \to X_2$ and $g : Y_1 \to Y_2$ are continuous maps
of metric spaces. Define metrics $d_i$ on $X_i \times Y_i$ by $d_i((u, v), (a, b)) =
\max\{d_{X_i}(u, a), d_{Y_i}(v, b)\}$, $i = 1, 2$. Show that $f \times g : X_1 \times Y_1 \to X_2 \times Y_2$
is continuous.
(5) Let $A$ be a non-empty subset of the metric space $(X, d)$. Show that the distance
function $d(x, A) = \inf_{a \in A} d(x, a)$ is uniformly continuous.
(6) Let $X$ be a metric space and suppose that every function $f : X \to \mathbb{R}$ is
continuous. Show that every subset of $X$ is open and closed.
(7) Suppose that the metric space $X$ is written as a union $\bigcup_{i \in I} U_i$ of open subsets
of $X$. Given $f : X \to \mathbb{R}$ show that if $f : U_i \to \mathbb{R}$ is continuous for all $i \in I$ then
$f$ is continuous. What about if we write $X$ as a finite or infinite union of closed
sets $F_i$ and we assume $f : F_i \to \mathbb{R}$ is continuous?
(8) We showed that a continuous map $f : X \to Y$ need not map open sets to open
sets. Find examples of maps $f : X \to Y$ which map open sets to open sets but
which are not continuous. (Hints: (a) Let $Y$ have the discrete topology; (b) take
$X = Y = C^0([0, 1])$, $f$ the identity map of $X$ but inequivalent metrics on $X$ and
$Y$.)
(9) Find an example of a map $f : \mathbb{R} \to \mathbb{R}$ which maps closed sets to closed sets
but which is not continuous.
(10) Let $f : X \to Y$, where $(X, d)$, $(Y, \rho)$ are metric spaces. Show if $f(x_0)$ is an
interior point of $f(D_\delta(x_0))$ for all $\delta > 0$ then it does not necessarily follow that
$f$ is continuous at $x_0$. (Hint: Take $X = Y = \mathbb{R}$, $x_0 = f(x_0) = 0$. Choose $f$ so
that $f((-\delta, \delta)) = [-1, 1]$, for all $\delta > 0$! Why do we need something like this?)
(11) Take the Zariski topology on $\mathbb{R}$ (see Exercises 7.5.8). Show that if $p : \mathbb{R} \to \mathbb{R}$
is a polynomial then $p^{-1}(U)$ is Zariski open for every Zariski open subset $U$
of $\mathbb{R}$. Would this result be true if $p : \mathbb{R} \to \mathbb{R}$ was continuous or smooth but not
a polynomial? Why?
(12) A map $f : X \to Y$ between metric spaces is a homeomorphism if $f$ is 1:1 onto
and both $f$ and $f^{-1}$ are continuous. Show that if $f : X \to X$ is 1:1 and onto
then $f$ is a homeomorphism iff $f(U)$, $f^{-1}(U)$ are open subsets of $X$ for all open
subsets $U$ of $X$. Show, by means of examples, that a homeomorphism need not
be uniformly continuous.
(13) Show that every metric space is homeomorphic to a metric space of finite
diameter.
(14) Extend the definitions of f .x˙/, f .x˙/, !f .x/ given in Sect. 2.5.2 to maps
f W X ! R, where X is a metric space.
(15) Let f W Œa; b ! R be bounded and not necessarily continuous. Given ` > 0,
define F ` D fx 2 Œa; b j f .x/ D f .xC/g and F ` D fx 2 Œa; b j f .x/ D
f .xC/g. Show that F ` and F` are closed subsets of Œa; b. (For notation and
terminology, see Sect. 2.5.2.)
(16) Prove Young’s theorem: Suppose f W Œa; b ! R and let F D fx j f .xC/ ¤
f .x/g. Then F is countable (see also Remarks 2.5.8(2)). (Hints. Let `; k > 0.
Following the previous exercise, show that F `;k D fx j f .xC/ f .x/ `; f .x/ kg is a closed subset of Œa; b. If F `;k is not countable, then F `;k
contains an uncountable subset H such that every point of H is a limit from
the left and right of points of H (Exercises 7.10.18(14)). This implies that
f .x/ k, for all x 2 H and so f .xC/ k C `, for all x 2 H. Proceeding
inductively, deduce that f .xC/; f .x/ D C1, for all x 2 H, contradicting our
definition of F`;k . Hence F `;k is countable.)
(17) Improve the previous result to show that outside of a countable subset of Œa; b
we have
f .xC/ D f .x/ f .x/ f .xC/ D f .x/:
(18) Show, by means of examples, that Young’s theorem generally fails for maps
f W X ! R, X a metric space.
7.12 Construction and Extension of Continuous Functions
In the last section we defined and gave various characterizations of continuous
functions on a metric space. However, we avoided the issue of the existence of
non-trivial continuous functions on a general metric space. It is time to address this
question. We consider the simplest case of constructing real-valued functions on a
metric space. Suppose then that $(X, d)$ is a metric space and let $C^0(X)$ denote the
set of continuous functions $f : X \to \mathbb{R}$. Obviously, $C^0(X)$ contains the constant
functions. Is it possible to construct non-constant continuous functions? We are
assuming nothing about the set $X$ except the presence of a metric. At this level
of abstraction, the only way forward appears to be to use the metric to construct
continuous functions on $(X, d)$.
Lemma 7.12.1 Let $a \in X$ and define $d_a : X \to \mathbb{R}$ by $d_a(x) = d(a, x)$. Then $d_a$ is
continuous. Consequently, $\{d_a \mid a \in X\} \subset C^0(X)$.
Proof By Lemma 7.1.4, we have
$$|d_a(x) - d_a(y)| = |d(a, x) - d(a, y)| \le d(x, y), \quad x, y \in X.$$
Hence $d_a$ is continuous at $x$ for all $x \in X$ and so $d_a \in C^0(X)$. □
Remark 7.12.2 The function $d_a$ is never constant if $X$ contains more than one
point.
For our purposes we need a slight generalization of Lemma 7.12.1.
Proposition 7.12.3 Let $A$ be a non-empty subset of the metric space $(X, d)$ and
define $d_A : X \to \mathbb{R}$ by
$$d_A(x) = d(x, A).$$
Then $d_A \in C^0(X)$.
Proof The result follows from Proposition 7.2.1(4) by exactly the same argument
used to prove Lemma 7.12.1. □
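The continuity of $d_a$ and $d_A$ rests on the estimate $|d(x, A) - d(y, A)| \le d(x, y)$. As a sanity check, here is a minimal sketch that tests this estimate on sample points, assuming the Euclidean metric on $\mathbb{R}^2$ and a finite set standing in for $A$; the function names and the data are illustrative choices of ours, not taken from the text.

```python
import itertools
import math

def dist_to_set(x, A):
    """d(x, A) = inf of d(x, a) over a in A, for a finite non-empty list A."""
    return min(math.dist(x, a) for a in A)

def max_lipschitz_violation(points, A):
    """Largest value of |d(x,A) - d(y,A)| - d(x,y) over all sampled pairs.

    The estimate |d(x,A) - d(y,A)| <= d(x,y) says this quantity is never
    positive (up to floating-point rounding).
    """
    worst = -math.inf
    for x, y in itertools.combinations(points, 2):
        gap = abs(dist_to_set(x, A) - dist_to_set(y, A)) - math.dist(x, y)
        worst = max(worst, gap)
    return worst

# Hypothetical data: a three-point set A and a 9 x 9 grid of sample points.
A = [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5)]
grid = [(i / 2.0, j / 2.0) for i in range(-4, 5) for j in range(-4, 5)]
violation = max_lipschitz_violation(grid, A)
```

Taking $A = \{a\}$ recovers the function $d_a$ of Lemma 7.12.1.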
It turns out that the set $\{d_A \mid A \subset X, A \ne \emptyset\}$ is rich enough to allow us to
represent the closed sets of a metric space as the zero sets of continuous functions.
Theorem 7.12.4 (Urysohn’s Lemma) Let $(X, d)$ be a metric space and $A, B$ be
disjoint closed subsets of $X$. There exists a continuous function $f : X \to \mathbb{R}$ such
that
(1) $f^{-1}(0) = A$.
(2) $f^{-1}(1) = B$.
(3) $f(X) \subset [0, 1]$.
Proof In the spirit of our constructions of $C^\infty$-functions given in Chap. 5, define
$$f(x) = \frac{d(x, A)}{(1 + d(x, B))(d(x, B) + d(x, A))}, \quad x \in X.$$
Since $A, B$ are closed and disjoint, $d(x, A) + d(x, B) > 0$ for all $x \in X$ and so $f$ is well
defined. Since $d(x, A)$, $d(x, B)$ are continuous by Proposition 7.12.3, $f$ is continuous.
We leave it to the reader to complete the simple verification that $f$ satisfies
(1–3). □
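The verification of (1–3) can be previewed on a concrete case. The sketch below evaluates the function $f$ from the proof for two disjoint closed intervals of $\mathbb{R}$ with the usual metric; the choice of intervals and the helper names are assumptions made purely for illustration.

```python
def dist_to_interval(x, interval):
    """Distance from the real number x to the closed interval [lo, hi]."""
    lo, hi = interval
    return max(lo - x, 0.0, x - hi)

def urysohn(x, A, B):
    """f(x) = d(x,A) / ((1 + d(x,B)) (d(x,B) + d(x,A))) for disjoint closed A, B."""
    dA = dist_to_interval(x, A)
    dB = dist_to_interval(x, B)
    # dA + dB > 0 because A and B are closed and disjoint.
    return dA / ((1.0 + dB) * (dA + dB))

# Hypothetical choice of disjoint closed sets: A = [-2, -1], B = [1, 2].
A = (-2.0, -1.0)
B = (1.0, 2.0)
samples = [k / 100.0 for k in range(-500, 501)]
values = [urysohn(x, A, B) for x in samples]
```

On $B$ the factor $1 + d(x, B)$ equals 1, so $f = 1$ there; away from $B$ that factor is what pushes $f$ strictly below 1.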
Remarks 7.12.5
(1) We may allow $B$ to be the empty set in Theorem 7.12.4: define $f(x) =
d(x, A)/(1 + d(x, A))$ and note that $f(X) \subset [0, 1)$ and $f^{-1}(0) = A$.
(2) Urysohn’s lemma holds for normal topological spaces which need not be metric
spaces. However, at this level of generality, the best that can be claimed is
$f^{-1}(0) \supset A$, $f^{-1}(1) \supset B$. The metric space proof of the Urysohn lemma often
uses the Tietze extension theorem (see below). The proof we give is elementary
and constructs $f$ so that $A, B$ are level sets of $f$.
Theorem 7.12.6 (Tietze Extension Theorem) Let $A$ be a closed subset of the
metric space $(X, d)$ and suppose $f : A \to \mathbb{R}$ is continuous and bounded. There
exists a continuous map $F : X \to \mathbb{R}$ such that $F(x) = f(x)$, for all $x \in A$. Moreover,
we may construct $F$ so that $F$ is bounded and
$$\inf_{s \in A} f(s) \le F(x) \le \sup_{s \in A} f(s), \quad x \in X.$$
Proof It suffices to prove the result under the assumption $f \ge 0$ since we can write
$f$ as a difference $\max(0, f) - \max(0, -f)$ of positive continuous functions (note
Exercises 7.11.10(2)). Replacing $f$ by $f + 1$, we can further assume $f \ge 1$. Set
$M = \sup_{x \in A} f(x)$. We may assume $M > 1$ (else $f$ is constant and the result is trivial).
We define the extension $F$ by
$$F(x) = \begin{cases} f(x), & x \in A, \\ \bigl(\inf_{y \in A} f(y)\, d(x, y)\bigr) / d(x, A), & x \in X \setminus A. \end{cases}$$
Since $X = \mathring{A} \cup (X \setminus A) \cup \partial A$, it suffices to prove that $F$ is continuous at points of $\mathring{A}$,
$X \setminus A$, and $\partial A$. Since $F = f$ on $\mathring{A}$, the continuity of $F$ at points of $\mathring{A}$ is immediate.
Continuity of $F$ at points of $X \setminus A$.
Let $x \in X \setminus A$. Since $d(x, A) > 0$, it suffices to show that the function $g : X \to \mathbb{R}$
defined by $g(z) = \inf_{y \in A} f(y)\, d(z, y)$ is continuous at $z = x$. Let $\varepsilon > 0$ and set
$\delta = \varepsilon / M$. If $x' \in D_\delta(x)$, we have $d(x, y) \ge d(x', y) - d(x', x)$. Choose $y \in A$ such that
$g(x) > f(y)\, d(x, y) - \varepsilon$. Since $f(y)\, d(x', x) < M\delta = \varepsilon$, we have
$f(y)\, d(x, y) \ge f(y)\, d(x', y) - f(y)\, d(x', x) > g(x') - \varepsilon$
and so $g(x) - g(x') > -2\varepsilon$. Similarly, $g(x') - g(x) > -2\varepsilon$. Hence $|g(x) - g(x')| < 2\varepsilon$
for all $x' \in D_\delta(x)$, proving the continuity of $g$ at $x$.
Continuity of $F$ at points of $\partial A$.
Let $x \in \partial A$ and choose $\varepsilon > 0$. Since $f$ is continuous at $x$, there exists a $\delta > 0$
such that $|f(x) - f(y)| < \varepsilon$ for all $y \in D_\delta(x) \cap A$. Set $\bar\delta = \delta / (M + 1)$. Suppose
$x' \in X \setminus A$ and $d(x, x') < \bar\delta$. If $y \in A \setminus D_\delta(x)$, we have $d(x', y) \ge d(x, y) - d(x, x') >
\delta M / (M + 1) = M \bar\delta$ and so, since $f \ge 1$,
$$\inf_{y \in A \setminus D_\delta(x)} f(y)\, d(x', y) > M \bar\delta.$$
Since $f(x) \le M$, $f(x)\, d(x', x) \le M \bar\delta$ and so
$$\inf_{y \in A} f(y)\, d(x', y) = \inf_{y \in D_\delta(x) \cap A} f(y)\, d(x', y). \tag{7.2}$$
If $y \in D_\delta(x) \cap A$, $f(x) - \varepsilon < f(y) < f(x) + \varepsilon$. Since $\inf_{y \in D_\delta(x) \cap A} d(x', y) = d(x', A)$,
it follows from (7.2) that
$$(f(x) - \varepsilon)\, d(x', A) < \inf_{y \in A} \bigl(f(y)\, d(x', y)\bigr) < (f(x) + \varepsilon)\, d(x', A)$$
and so $|F(x') - f(x)| < \varepsilon$, for $x' \in D_{\bar\delta}(x) \cap (X \setminus A)$. Since $F = f$ on $A$ and $\bar\delta < \delta$,
this gives $|F(x') - f(x)| < \varepsilon$, for $x' \in D_{\bar\delta}(x)$, proving the continuity of $F$ at $x$.
Finally, it is immediate from the definition of $F$ on $X \setminus A$ that $1 \le F(x) \le M$,
$x \in X$. □
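The extension formula used in the proof is explicit enough to evaluate directly. The following sketch does so in the simplest setting we could think of: $X = \mathbb{R}$ with the usual metric, $A$ a finite (hence closed) set, and prescribed values $f \ge 1$ on $A$. The data and the name `tietze_extension` are hypothetical and serve only to illustrate the formula.

```python
def tietze_extension(x, f):
    """Evaluate F(x), where f is a dict mapping each point a of A to f(a) >= 1.

    On A we return f(x); off A we return
    (inf over y in A of f(y) * d(x, y)) / d(x, A), with d(x, y) = |x - y|.
    """
    if x in f:
        return f[x]
    dist_to_A = min(abs(x - a) for a in f)
    weighted = min(value * abs(x - a) for a, value in f.items())
    return weighted / dist_to_A

# Hypothetical data: A = {0, 1, 3}, with 1 <= f <= M = 2 on A.
f = {0.0: 1.0, 1.0: 2.0, 3.0: 1.5}
M = max(f.values())
samples = [k / 50.0 for k in range(-100, 251)]
values = [tietze_extension(x, f) for x in samples]
```

Close to a point $a \in A$ the infimum is attained at $y = a$, so $F(x) = f(a)$ for $x$ near $a$ in this finite example; this is the mechanism behind the continuity argument at points of $\partial A$.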
Remark 7.12.7 The boundedness assumption cannot be avoided in our argument
for the continuity of $F$ on $X \setminus A$ (it is not essential for the continuity on $\partial A$).
In the exercises we indicate the generalization of the Tietze extension theorem to
unbounded functions.
EXERCISES 7.12.8
(1) Show that if $A$ is a closed subset of $\mathbb{R}^n$, then every continuous function $f : A \to
\mathbb{R}$ extends to a continuous function $F : \mathbb{R}^n \to \mathbb{R}$. (Hint: Construct a sequence
$(F_n)$ of continuous functions $F_n : \mathbb{R}^n \to \mathbb{R}$ such that $F_n = f$ on $D_n(0) \cap A$ and
$F_{n+1} = F_n$ on $D_n(0)$, $n \ge 1$.)
(2) Show that the Tietze extension theorem holds if $f$ is unbounded. (Hint: Suppose
$f : A \to \mathbb{R}$ is unbounded. Apply Theorem 7.12.6 to $\tilde f = \alpha \circ f : A \to \mathbb{R}$, where
$\alpha(x) = \tan^{-1}(x)$, $x \in \mathbb{R}$.)
7.12.1 Sequential Continuity
Just as for functions on R, there is a very useful characterization of continuity
of functions on a metric space given in terms of convergent sequences. First, a
definition.
Definition 7.12.9 Let $(X, d)$, $(Y, \rho)$ be metric spaces. A map $f : X \to Y$ is
sequentially continuous if given any convergent sequence $(x_n)$ in $X$, $(f(x_n))$ is a
convergent sequence of points in $Y$ and
$$\lim_{n \to \infty} f(x_n) = f\bigl(\lim_{n \to \infty} x_n\bigr).$$
Remark 7.12.10 We define sequential continuity of $f$ at a point $x_0 \in X$ by restricting
to sequences which converge to $x_0$.
Examples 7.12.11
(1) Let $(X, d)$ be a metric space and fix $a \in X$. Then $f(x) = d(x, a)$ is sequentially
continuous. Indeed, let $\lim_{n \to \infty} x_n = x^\star$. Then $|d(x^\star, a) - d(x_n, a)| \le d(x^\star, x_n)$
by Lemma 7.1.4. The result follows.
(2) If we take the product metric $D((x_1, y_1), (x_2, y_2)) = \max\{d(x_1, x_2), d(y_1, y_2)\}$ on $X \times X$,
then $d : X \times X \to \mathbb{R}$ is sequentially continuous. To see this, observe that if
$(X_n = (x_n, y_n))$ is a sequence in $X \times X$, then $(X_n)$ converges to $(x^\star, y^\star)$ in the
product metric iff $(x_n)$ converges to $x^\star$ and $(y_n)$ converges to $y^\star$. We claim that
if $(x_n, y_n)$ converges to $(x^\star, y^\star)$, then $\lim_{n \to \infty} d(x_n, y_n) = d(x^\star, y^\star)$. We have
$$|d(x_n, y_n) - d(x^\star, y^\star)| \le |d(x_n, y_n) - d(x_n, y^\star)| + |d(x_n, y^\star) - d(x^\star, y^\star)| \le d(y_n, y^\star) + d(x_n, x^\star),$$
where the last inequality follows by Lemma 7.1.4. Now let $n \to \infty$.
Theorem 7.12.12 (Notation as Above) The function $f : X \to Y$ is continuous iff $f$
is sequentially continuous.
Proof The proof is formally identical to that of Theorem 2.4.9 in
Chap. 2, which applied to real-valued functions on $\mathbb{R}$. Throughout, $\rho$ denotes the
metric on $Y$. In detail, suppose first that $f$
is continuous. Let $(x_n) \subset X$ be a convergent sequence with limit $x_0$. Since $f$ is
continuous at $x_0$, given $\varepsilon > 0$, there exists an $r > 0$ such that $\rho(f(x), f(x_0)) < \varepsilon$ if
$x \in D_r(x_0)$. Since $(x_n)$ converges to $x_0$, there exists an $m \in \mathbb{N}$ such that $x_n \in D_r(x_0)$,
$n \ge m$, and so $\rho(f(x_n), f(x_0)) < \varepsilon$ for $n \ge m$. Therefore, $(f(x_n))$ converges to
$f(x_0)$. Conversely, suppose that $(f(x_n))$ converges to $f(x_0)$ for every sequence $(x_n)$
converging to $x_0$. We claim $f$ is continuous at $x_0$. Suppose the contrary. If $f$ is not
continuous at $x_0$, there exists an $\varepsilon > 0$ such that for every $n \in \mathbb{N}$, there exists an
$x_n \in X$ such that $x_n \in D_{1/n}(x_0)$ and $f(x_n) \notin D_\varepsilon(f(x_0))$. Obviously, $(x_n)$ converges to
$x_0$. Since $f(x_n) \notin D_\varepsilon(f(x_0))$, $\rho(f(x_n), f(x_0)) \ge \varepsilon$ for all $n \in \mathbb{N}$ and so $(f(x_n))$ cannot
converge to $f(x_0)$, contradicting the assumption that $f$ is sequentially continuous.
Hence $f$ must be continuous at $x_0$. □
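The converse half of the proof produces a witness sequence $x_n \in D_{1/n}(x_0)$ whose images stay at distance at least $\varepsilon$ from $f(x_0)$. The sketch below makes this concrete on $\mathbb{R}$ for a step function of our own choosing (not an example from the text) and contrasts it with the continuous function $x \mapsto |x|$.

```python
def step(x):
    """A function on R that is discontinuous at 0 (illustrative choice)."""
    return 1.0 if x > 0 else 0.0

def image_gaps(f, x0, sequence):
    """Return the distances |f(x_n) - f(x0)| along the given sequence."""
    return [abs(f(x) - f(x0)) for x in sequence]

# The sequence x_n = 1/n converges to x0 = 0 and satisfies |x_n - 0| <= 1/n.
xs = [1.0 / n for n in range(1, 2001)]

step_gaps = image_gaps(step, 0.0, xs)   # stays at 1: f(x_n) does not tend to f(0)
abs_gaps = image_gaps(abs, 0.0, xs)     # equals 1/n: tends to 0
```

Here any $\varepsilon \le 1$ exhibits the failure of continuity of the step function at 0, exactly as in the contradiction argument above.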
Remark 7.12.13 Theorem 7.12.12 is very much a metric space theorem. It does
not extend to general topological spaces. It is, however, a powerful result and, as
in Chap. 2, leads to simple and transparent proofs of many foundational results for
continuous functions.
Example 7.12.14 Let $X, Y_1, Y_2$ be metric spaces and $f_i : X \to Y_i$, $i = 1, 2$, be
continuous. Then $(f_1, f_2) : X \to Y_1 \times Y_2$ is continuous, where we take the product
metric on $Y_1 \times Y_2$. By Theorem 7.12.12, it suffices to show $(f_1, f_2)$ is sequentially
continuous. Let $(x_n) \subset X$ converge to $x^\star$. By sequential continuity we have
$\lim_{n \to \infty} f_i(x_n) = f_i(x^\star)$, $i = 1, 2$. Hence $\lim_{n \to \infty} (f_1(x_n), f_2(x_n)) = (f_1(x^\star), f_2(x^\star))$
and so $(f_1, f_2)$ is sequentially continuous.
We conclude with an application of Theorem 7.12.12.
Proposition 7.12.15 Let $f, g : X \to Y$ be continuous. Then $S = \{x \in X \mid f(x) =
g(x)\}$ is a closed subset of $X$.
Proof We give a proof based on sequential continuity. In order to prove that $S$ is
closed, it suffices to show that if $(x_n) \subset S$ converges to $x_0$, then $x_0 \in S$. By sequential
continuity of $f$ and $g$, $\lim_{n \to \infty} f(x_n) = f(x_0)$, $\lim_{n \to \infty} g(x_n) = g(x_0)$. Since $f(x_n) =
g(x_n)$ for all $n$, we have $f(x_0) = g(x_0)$ and so $x_0 \in S$. □
Remark 7.12.16 Here is a sketch of an alternative proof of the previous proposition
which uses Theorem 7.11.3 and Example 7.12.14. Let $(f, g) : X \to Y \times Y$ be the
map defined by $(f, g)(x) = (f(x), g(x))$. By Example 7.12.14, $(f, g)$ is continuous.
If we define the diagonal $\Delta = \{(y, y) \mid y \in Y\}$, then $\Delta$ is a closed subset of $Y \times Y$
(see Exercises 7.4.27(8)). Now use $S = (f, g)^{-1}(\Delta)$ and Theorem 7.11.3. It is worth
noting that even though there is a natural way of defining the product topology on
$Y \times Y$, once we move away from the setting of metric spaces the diagonal $\Delta$ may
not be closed in $Y \times Y$.
EXERCISES 7.12.17
(1) Let $f : X \to Y$ be continuous and $e$ be a limit point of the set $E \subset X$. Show that
if $f$ is 1:1 then $f(e)$ is a limit point of $f(E) \subset Y$. True or false if $f$ is not 1:1?
(2) Let $f : X \to Y$ be continuous. Show that if $E \subset X$, then $f(\overline{E}) \subset \overline{f(E)}$. What
about the reverse inclusion: $f(\overline{E}) \supset \overline{f(E)}$? (Prove or give a counterexample.)
(3) Construct a function $f : \mathbb{R} \to \mathbb{R}$ such that $f$ is discontinuous at all points of a
dense subset $Q$ of $\mathbb{R}$ but is such that the restriction of $f$ to $Q$ is continuous.
(4) Let $f = (f_1, \dots, f_n) : X \to \mathbb{R}^n$. Prove that $f$ is continuous iff every component
function $f_i : X \to \mathbb{R}$ is continuous. (Do this in two ways: an $\varepsilon,\delta$-proof and a
proof based on neighbourhoods or closed sets.)
(5) Suppose that $f, g : X \to Y$ are continuous functions and that $f = g$ on a dense
subset $E$ of $X$. Show that $f = g$. (Hint: Proposition 7.12.15.)
7.13 Sequential Compactness
In this section our aim is to generalize to metric spaces the result that every
continuous real-valued function on a closed and bounded interval is bounded
and attains its bounds. More specifically, we want to characterize those subsets
of a metric space for which every continuous function defined on the subset is
bounded and attains its bounds. We do this by focusing on one property of a closed
and bounded interval that follows from the Bolzano–Weierstrass theorem: every
sequence contained in a closed and bounded interval has a subsequence converging
to a point of the interval. We call sets that satisfy this condition (sequentially)
compact. We provide some interesting classes of sets which are compact and finally
show that continuous functions preserve compactness.
Definition 7.13.1 Let $(X, d)$ be a metric space. A subset $A$ of $X$ is sequentially
compact if every sequence $(x_n) \subset A$ has a subsequence converging to a point of $A$.
Remark 7.13.2 To avoid discussion of uninteresting special cases, we generally
assume that the set $A$ of Definition 7.13.1 is not empty.
Example 7.13.3 The closed and bounded interval $[a, b]$ is sequentially compact.
Indeed, if $(x_n) \subset [a, b]$ is a sequence, then by Proposition 2.4.3, $(x_n)$ has a
convergent subsequence which must converge to a point of $[a, b]$ since $[a, b]$ is
closed.
Proposition 7.13.4 Let $A$ be a sequentially compact subset of the metric space $X$.
Then $A$ is a closed and bounded subset of $X$.
Proof Let $x \in A'$. There exists a sequence $(x_n) \subset A \setminus \{x\}$ which converges to
$x$. By sequential compactness, $(x_n)$ has a subsequence converging to a point of $A$.
Since every convergent subsequence of $(x_n)$ converges to $x$, we have $x \in A$. Hence $A$ is closed.
It remains to prove that $A$ is bounded. That is, there exists an $M \ge 0$ such that
$d(x, y) \le M$ for all $x, y \in A$. Fix $a \in A$ and observe that $A$ is bounded if and only if
there exists an $M' \ge 0$ such that $d(x, a) \le M'$ for $x \in A$. ($d(x, a) \le M'$ for all $x \in A$
implies $d(x, y) \le d(x, a) + d(y, a) \le 2M'$ for all $x, y \in A$. The converse is obvious
taking $M' = M$.)
Suppose $A$ is not bounded. Then for every $n \in \mathbb{N}$, there exists $x_n \in A$ such that
$$d(x_n, a) > n.$$
Since $A$ is sequentially compact, we can find a subsequence $(x_{n_k})$ of $(x_n)$ converging
to a point $x^\star \in A$. We have $d(x_{n_k}, a) > n_k$, for all $k \ge 1$. Since $f(x) = d(x, a)$ is
sequentially continuous,
$$\lim_{k \to \infty} d(x_{n_k}, a) = d(x^\star, a) < \infty.$$
This is a contradiction since $d(x_{n_k}, a) > n_k$ and so $(d(x_{n_k}, a))$ diverges to $+\infty$. □
Example 7.13.5 Although a necessary condition for sequential compactness is
boundedness, it is not a sufficient condition. For example, every set $X$ with the
discrete metric is bounded (with $M = 1$) but a general sequence $(x_n) \subset X$ is
only assured of having a convergent subsequence if $X$ is finite. In particular, if
$(x_n)$ consists of distinct points then $(x_n)$ has no convergent subsequence. Somewhat
less trivially, if $(X, d)$ is any metric space, we can define a new metric $D$ on $X$ by
$D(x, y) = \min\{1, d(x, y)\}$. Every subset of $X$ is bounded with respect to the metric
$D$. For example, if we replace the Euclidean metric on $\mathbb{R}^n$ by the metric $D$, then
every closed subset of $\mathbb{R}^n$ is bounded. Obviously, the closed sets $\mathbb{Z}$ or $\mathbb{R}$ are not
sequentially compact.
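The truncated metric $D = \min\{1, d\}$ is easy to experiment with. In the sketch below (the helper `truncated` and the sample data are our own, for illustration only) we take $d(x, y) = |x - y|$ on $\mathbb{R}$ and check that distinct integers are always at $D$-distance exactly 1, so no subsequence of $0, 1, 2, \dots$ can converge even though the set is $D$-bounded.

```python
import itertools

def truncated(d):
    """Return the bounded metric D(x, y) = min{1, d(x, y)} built from d."""
    def D(x, y):
        return min(1.0, d(x, y))
    return D

def usual(x, y):
    """The usual metric on R."""
    return abs(x - y)

D = truncated(usual)

# Distinct integers are at D-distance exactly 1 from one another.
integers = list(range(50))
pairwise = {D(m, n) for m, n in itertools.combinations(integers, 2)}

# Spot-check the triangle inequality for D on a few real sample points.
points = [-3.0, -0.4, 0.0, 0.3, 0.9, 2.5]
triangle_ok = all(
    D(x, z) <= D(x, y) + D(y, z) + 1e-12
    for x, y, z in itertools.product(points, repeat=3)
)
```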
Notwithstanding the previous examples, there is one important case where
sequential compactness is equivalent to being closed and bounded.
Theorem 7.13.6 Let $m \in \mathbb{N}$. A subset $A$ of $\mathbb{R}^m$ is sequentially compact iff $A$ is
closed and bounded. (The metric may be the Euclidean metric, $d_1$ or $d_\infty$ or any
metric equivalent to these metrics.)
Proof We know by Proposition 7.13.4 that every sequentially compact subset of $\mathbb{R}^m$ is closed
and bounded. It remains to prove the converse. The proof is by induction on $m$.
Suppose $m = 1$. If $(x_n)$ is a sequence of points in $A$, then there exists a convergent
subsequence by Proposition 2.4.3 and the limit must lie in $A$ since $A$ is closed.
Assume the result has been proved for $m - 1$, $m > 1$. Observe that the product
metric $d_\infty$ on $\mathbb{R}^m$ restricts to $d_\infty$ on $\mathbb{R}^p$ where we identify $\mathbb{R}^p$ with the subspace
$\{(x_1, \dots, x_p, 0, \dots, 0) \mid x_1, \dots, x_p \in \mathbb{R}\}$ of $\mathbb{R}^m$, $1 \le p < m$. The same is true for
the metrics $d_2$ and $d_1$. We make a choice of one of these metrics and denote it
by $d$. Suppose $(x_n) \subset A$. Write $x_n = (y_n, z_n)$ where $y_n \in \mathbb{R}^{m-1}$, $z_n \in \mathbb{R}$. Since $A$
is bounded, $(y_n)$ is a bounded sequence in $\mathbb{R}^{m-1}$, $(z_n)$ is a bounded sequence in $\mathbb{R}$
(since $d((y, z), 0) \ge d(y, 0), d(z, 0)$). Since $(y_n)$ is a bounded sequence in $\mathbb{R}^{m-1}$
it follows by the inductive hypothesis that there is a convergent subsequence, say
$(y_{n_k})$. Let $\lim_{k \to \infty} y_{n_k} = y^\star$. Now $(z_{n_k})$ is a bounded sequence in $\mathbb{R}$ and so by
the result for $m = 1$, there is a convergent subsequence, which we may denote
by $(z_{m_k})$ (where $m_1 < m_2 < \cdots$ and $\{m_i \mid i \ge 1\} \subset \{n_k \mid k \ge 1\}$). Set
$\lim_{k \to \infty} z_{m_k} = z^\star$. Since $(y_{m_k})$ is a subsequence of the convergent sequence $(y_{n_k})$,
$(y_{m_k})$ is convergent and $\lim_{k \to \infty} y_{m_k} = y^\star$. Hence $(y_{m_k}, z_{m_k})$ is convergent in $\mathbb{R}^m$
with limit $x^\star = (y^\star, z^\star)$. Since $A$ is closed, $x^\star \in A$. □
Corollary 7.13.7 Every bounded sequence in $\mathbb{R}^m$ has a convergent subsequence.
Proof Let $(x_n) \subset \mathbb{R}^m$ be bounded. Then $A = \overline{\{x_n \mid n \ge 1\}}$ is a closed and bounded
subset of $\mathbb{R}^m$. By Theorem 7.13.6, $A$ is sequentially compact and so $(x_n) \subset A$ has a
convergent subsequence. □
Theorem 7.13.8 Let $(X, d)$, $(Y, \rho)$ be metric spaces. If $f : X \to Y$ is continuous
and $A$ is a sequentially compact subset of $X$, then
(1) $f(A)$ is a sequentially compact subset of $Y$,
(2) $f : A \to Y$ is uniformly continuous (Definition 7.11.8).
Proof
(1) We have to show that if $(y_n) \subset f(A)$ is a sequence, then there exists a convergent
subsequence with limit in $f(A)$. Since $(y_n) \subset f(A)$, we can find a sequence
$(x_n) \subset A$ such that $f(x_n) = y_n$, $n \in \mathbb{N}$. Since $A$ is sequentially compact,
there exists a convergent subsequence $(x_{n_k})$ of $(x_n)$ with limit $x^\star \in A$. By
sequential continuity, $\lim_{k \to \infty} f(x_{n_k}) = f(x^\star)$. Therefore $(y_{n_k})$ is a convergent
subsequence of $(y_n)$ with limit equal to $f(x^\star) \in f(A)$.
(2) The proof is formally identical to that of Theorem 2.4.15 and we leave the
details to the exercises. □
Theorem 7.13.9 Let $(X, d)$ be a metric space, $A$ be a sequentially compact subset
of $X$ and $f : X \to \mathbb{R}$ be continuous. Then $f : A \to \mathbb{R}$ is bounded and attains its
bounds: there exist $a_m, a_M \in A$ such that
$$-\infty < \inf f(A) = f(a_m) \le f(x) \le f(a_M) = \sup f(A) < +\infty,$$
for all $x \in A$.
Proof By Theorem 7.13.8, $f(A)$ is a sequentially compact subset of $\mathbb{R}$. Therefore, by
Proposition 7.13.4, $f(A)$ is a closed and bounded subset of $\mathbb{R}$. Hence $\sup f(A), \inf f(A) \in
f(A)$. Pick $a_m, a_M \in A$ such that $f(a_m) = \inf f(A)$, $f(a_M) = \sup f(A)$. □
7.13.1 Additional Properties of Compactness
Proposition 7.13.10 If $A$ is a sequentially compact subset of the metric space $X$,
then every closed subset of $A$ is sequentially compact.
Proof Let $Z$ be a closed subset of $A$. It suffices to prove that every sequence $(x_n)$
of points of $Z$ has a subsequence converging to a point of $Z$. Since $A$ is
sequentially compact and $(x_n) \subset Z \subset A$, there exists a convergent subsequence
$(x_{n_k})$ of $(x_n)$. Since $Z$ is closed and $(x_{n_k})$ is a convergent sequence of points of $Z$,
$\lim_{k \to \infty} x_{n_k} \in Z$. □
Proposition 7.13.11 Let $A$ be a subset of the metric space $(X, d)$. Then $A$ is a
sequentially compact subset of $X$ iff $(A, d_A)$ ($A$ with the induced metric) is a
sequentially compact metric space.
Proof Suppose $(x_n)$ is a sequence of points of $A$. By sequential compactness of
$A$ as a subset of $X$, there exists a convergent subsequence $(x_{n_k})$ of $(x_n)$ such that
$\lim_{k \to \infty} x_{n_k} = x^\star \in A$. Now $d(x_{n_k}, x^\star) = d_A(x_{n_k}, x^\star)$ and so clearly $(x_{n_k})$ is a
convergent subsequence in $(A, d_A)$. This argument shows that if $A$ is a sequentially
compact subset of $X$ then $(A, d_A)$ is a sequentially compact metric space. The
converse is obtained by reversing the argument. □
Remark 7.13.12 Proposition 7.13.11 shows that sequential compactness is an
absolute or intrinsic property of a set. By contrast, properties like open and closed
are relative properties. For example, if $A$ is a proper open subset of $X$ which is not
closed (in $X$), then $A$ will always be a closed subset of the metric space $(A, d_A)$. If $Z$
is a subset of $A$ which does not contain all its limit points in $X$ (and so is not closed
in $X$), then $Z$ may contain all of its limit points if viewed as a subset of $(A, d_A)$. This
happens, for example, if $Z = A \cap F$, where $F$ is a closed subset of $X$ and $A$ is open.
We now work towards giving some more topological properties of compactness.
With the exception of the relatively elementary Theorem 7.13.21 (used in the proof
of the Arzelà–Ascoli theorem), no use is made of these results in the remainder of
the book.
Theorem 7.13.13 Let $F_1 \supset F_2 \supset \cdots$ be a decreasing sequence of non-empty
sequentially compact subsets of $X$. Then $\bigcap_{n=1}^{\infty} F_n \ne \emptyset$. Conversely, if it is true that
the intersection of every decreasing sequence of non-empty closed subsets is non-empty, then $X$
is sequentially compact.
Proof For each $n \in \mathbb{N}$, pick $x_n \in F_n$. Then $(x_n)$ is a sequence of points in $F_1$ and so
has a convergent subsequence $(x_{n_k})$ with limit $x^\star \in F_1$. We claim $x^\star \in \bigcap_{n=1}^{\infty} F_n$. It
suffices to show $x^\star \in F_m$ for all $m \ge 1$. But $x_{n_k} \in F_m$ for $k \ge m$ ($n_k \ge k$) and so,
since $F_m$ is sequentially compact and therefore closed, $x^\star \in F_m$.
Conversely, suppose that the intersection of every decreasing sequence $(F_n)$ of
non-empty closed subsets of $X$ is non-empty. Let $(x_n)$ be a sequence of points in $X$. Set $F_n =
\overline{\{x_m \mid m \ge n\}}$. Then $(F_n)$ is a decreasing sequence of non-empty closed subsets of $X$. Since
$\bigcap_{n \ge 1} F_n \ne \emptyset$ it follows by Theorem 7.10.15 that $(x_n)$ has a convergent subsequence.
Hence $X$ is sequentially compact. □
Remark 7.13.14 The property described in Theorem 7.13.13 is exactly the property
we used to prove the Bolzano–Weierstrass theorem (Theorem 2.4.1). In that case we
looked at a decreasing sequence of closed and bounded intervals.
Corollary 7.13.15 Suppose that $\mathcal{U} = \{U_i \mid i \in \mathbb{N}\}$ is a countable collection of open
subsets of a sequentially compact metric space $X$ such that $\bigcup_{n=1}^{\infty} U_n = X$. Then there
exists a finite subset $\{U_{i_1}, \dots, U_{i_k}\}$ of $\mathcal{U}$ such that $\bigcup_{j=1}^{k} U_{i_j} = X$.
Proof Suppose the contrary. Then $V_n = \bigcup_{i=1}^{n} U_i \ne X$, for all $n \in \mathbb{N}$. For $n \ge 1$,
define $F_n = X \setminus V_n$. Then $(F_n)$ is a decreasing sequence of closed subsets of $X$.
Since each $F_n$ is a closed subset of a sequentially compact space, $F_n$ is sequentially compact
(Proposition 7.13.10). By our hypothesis, $F_n \ne \emptyset$ for all $n \ge 1$. Therefore, by
Theorem 7.13.13, $\bigcap_{n=1}^{\infty} F_n \ne \emptyset$. This contradicts our assumption that $\bigcup_{n=1}^{\infty} U_n = X$
since $X \setminus \bigcup_{n=1}^{\infty} U_n = X \setminus \bigcup_{n=1}^{\infty} V_n = \bigcap_{n=1}^{\infty} (X \setminus V_n) = \bigcap_{n=1}^{\infty} F_n$. □
Definition 7.13.16 Let $A$ be a subset of the metric space $X$. If $\mathcal{U} = \{U_i \mid i \in I\}$ is a
collection of open subsets of $X$, we say $\mathcal{U}$ is an open cover of $A$ if
$$A \subset \bigcup_{i \in I} U_i.$$
If $I$ is finite (respectively, countable), $\mathcal{U}$ is a finite (respectively, countable) open
cover of $A$.
If $\mathcal{V} \subset \mathcal{U}$ is also an open cover of $A$, then $\mathcal{V}$ is a subcover of $A$.
Remark 7.13.17 Corollary 7.13.15 states that every countable open cover of a
sequentially compact metric space has a finite subcover.
A much stronger version of Corollary 7.13.15 is true and the result—stated below—
is used to define compactness for general topological spaces.
Theorem 7.13.18 Let $A$ be a sequentially compact subset of the metric space $X$ and suppose
that $\mathcal{U} = \{U_i \mid i \in I\}$ is an open cover of $A$. Then there exists a finite subcover of $A$.
That is, there exist $U_{i_1}, \dots, U_{i_k} \in \mathcal{U}$ such that
$$\bigcup_{j=1}^{k} U_{i_j} \supset A.$$
We break the proof of Theorem 7.13.18 into a number of steps, each interesting
in its own right. First, we remark that it follows from Proposition 7.13.11 that there
is no loss of generality in assuming $A = X$ (else, replace $(X, d)$ by $(A, d_A)$ and then
$\mathcal{U}_A = \{U_i \cap A \mid i \in I\}$ will be an open cover of $A$).
We recall that a metric space is separable if it has a countable dense subset. We
showed earlier (Proposition 7.7.5) that if $X$ is a separable metric space, then $X$ is
second countable: there exists a countable collection $\mathcal{B}$ of open subsets of $X$ such
that every open subset of $X$ can be written as a union of open sets from $\mathcal{B}$.
Remark 7.13.19 It is not hard to show that $X$ is a separable metric space iff $X$ is
second countable. See the exercises.
Proposition 7.13.20 Every open cover of a separable metric space has a countable
subcover.
Proof Let $\mathcal{B} = \{B_n \mid n \in \mathbb{N}\}$ be the countable collection of open sets given by
Proposition 7.7.5. Let $\mathcal{U} = \{U_i \mid i \in I\}$ be an open cover of $X$. Let $B_n \in \mathcal{B}$. If there
exists a $U_i \in \mathcal{U}$ such that $B_n \subset U_i$, then choose one such $U_i$ and label it as $U_{i(n)}$.
In this way, we choose a countable collection $\{U_{i(n)} \mid n \in Q\}$, where $Q$ will be a
subset of $\mathbb{N}$ (if there is no $U_i$ such that $B_n \subset U_i$, we make no choice). We claim
that $\{U_{i(n)} \mid n \in Q\}$ is an open cover of $X$. Pick $x \in X$. Then $x$ lies in some $U_k$ and
$U_k$ is a union of $B_n$’s. The point $x$ lies in at least one of these $B_n$’s, say $B_m$. Since
$B_m \subset U_k$, one of the $U_i$’s containing $B_m$ must equal $U_{i(m)}$. But $x \in U_{i(m)}$. Therefore,
$\{U_{i(n)} \mid n \in Q\}$ is an open cover of $X$. □
Theorem 7.13.21 A sequentially compact metric space is separable.
Proof For each $n \in \mathbb{N}$, we construct a finite subset $E_n$ of $X$ such that $d(x, E_n) < 1/n$
for all $x \in X$. Let $n \in \mathbb{N}$. Suppose we have chosen $z_1, \dots, z_m \in X$ such that
$d(z_i, z_j) \ge 1/n$, $i \ne j$. If $\min_{1 \le i \le m} d(x, z_i) < 1/n$ for all $x \in X$, take $E_n =
\{z_1, \dots, z_m\}$. Else, pick $z_{m+1} \in X$ such that $d(z_{m+1}, z_i) \ge 1/n$, $1 \le i \le m$. The
process eventually terminates since otherwise we construct an infinite sequence
$(z_i) \subset X$ such that $d(z_i, z_j) \ge 1/n$ for all $i \ne j$. Such a sequence can have
no convergent subsequence, contradicting the assumption that $X$ is sequentially
compact. If we define $E = \bigcup_{n=1}^{\infty} E_n$, $E$ is a countable dense subset of $X$ and so
$X$ is separable. □
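The selection of the points $z_1, z_2, \dots$ in the proof is a greedy procedure, and it can be imitated on a finite sample. The sketch below is an illustration under our own assumptions: the space is replaced by a finite grid in $[0, 1]^2$ with the Euclidean metric, so termination is automatic and says nothing about compactness; only the two defining properties of $E_n$ are being demonstrated.

```python
import itertools
import math

def greedy_net(points, eps):
    """Select centers z_1, z_2, ... from `points` greedily.

    A point is added whenever it is at distance >= eps from every center
    chosen so far, mirroring the choice of z_{m+1} in the proof.
    """
    centers = []
    for p in points:
        if all(math.dist(p, z) >= eps for z in centers):
            centers.append(p)
    return centers

# Hypothetical sample: a 21 x 21 grid in the unit square, with eps = 1/4.
sample = [(i / 20.0, j / 20.0) for i in range(21) for j in range(21)]
eps = 0.25
net = greedy_net(sample, eps)

# Centers are pairwise eps-separated, and every sample point is within eps of one.
separated = all(math.dist(z, w) >= eps for z, w in itertools.combinations(net, 2))
covered = all(min(math.dist(p, z) for z in net) < eps for p in sample)
```

A point that is not selected must lie within `eps` of an earlier center, which is why the covering property holds as soon as the pass over the sample is complete.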
Proof of Theorem 7.13.18 As indicated previously, we may assume A D X. Since
X is sequentially compact, X is separable by Theorem 7.13.21. Therefore, by
Proposition 7.13.20, an open cover of X has a countable subcover. The result follows
from Corollary 7.13.15. □
Remark 7.13.22 It follows from Theorem 7.13.18 and the Bolzano–Weierstrass
theorem that every open cover of a closed and bounded subset of $\mathbb{R}^n$ has a finite
subcover. This result is known as the Heine–Borel theorem. It is possible to use this
result to give alternative proofs of many of our results on continuous functions on
closed and bounded sets or sequentially compact sets. We give some illustrations
in the exercises. However, there is no application presented in this book where a
proof using the Heine–Borel theorem is simpler than a proof based on sequential
compactness. For this reason, we have preferred to use sequence-based arguments
in most of our proofs.
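In one dimension a finite subcover can be extracted by a simple left-to-right sweep. The following sketch is not the argument used in the text (which goes through separability and countable subcovers); it is an illustration, under the assumption that we start from a finite list of candidate open intervals, of how a small subfamily covering $[a, b]$ can be selected. The function name and the sample cover are hypothetical.

```python
def finite_subcover(intervals, a, b):
    """Greedily select open intervals (lo, hi) whose union contains [a, b].

    Starting at a, repeatedly take an interval containing the current point
    that reaches furthest to the right, then move to its right endpoint.
    """
    chosen = []
    point = a
    while True:
        containing = [I for I in intervals if I[0] < point < I[1]]
        if not containing:
            raise ValueError("the intervals do not cover [a, b]")
        best = max(containing, key=lambda I: I[1])
        chosen.append(best)
        if best[1] > b:
            return chosen
        point = best[1]  # strictly larger than the previous point

# Hypothetical cover of [0, 1]: intervals of radius 0.08 centred at k/20.
cover = [(k / 20.0 - 0.08, k / 20.0 + 0.08) for k in range(21)]
sub = finite_subcover(cover, 0.0, 1.0)
```

Each step moves the current point strictly to the right, so with finitely many candidate intervals the loop must stop, either with a subcover or with an error when some point of $[a, b]$ is not covered.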
Using Theorem 7.13.18, we may prove a generalization of Theorem 7.13.13 that
is important for the study of compactness in general topological spaces.
Theorem 7.13.23 Let $\mathcal{F} = \{F_i \mid i \in I\}$ be a collection of non-empty closed subsets
of the sequentially compact metric space $X$. Suppose that every finite intersection
$\bigcap_{j=1}^{k} F_{i_j}$ of sets from $\mathcal{F}$ is non-empty. Then $\bigcap_{i \in I} F_i \ne \emptyset$.
Proof We leave the proof to the exercises. □
EXERCISES 7.13.24
(1) Complete the proof of Theorem 7.13.8 by showing that every continuous
function on a sequentially compact set is uniformly continuous.
(2) Suppose that $f : X \to Y$ is a continuous 1:1 onto map and that $X$ is sequentially
compact. Prove that $f$ is a homeomorphism. (Hints: see Exercises 7.11.10(12)
for the definition of homeomorphism and use Theorem 7.13.8,
Proposition 7.13.10 and Theorem 7.11.3.)
(3) Let $f : X \to Y$ be continuous. Show that if $E \subset X$ and $\overline{E}$ is sequentially
compact, then $f(\overline{E}) \supset \overline{f(E)}$. Do we have equality?
(4) Prove Theorem 7.13.23.
(5) Provide an alternative proof of Corollary 7.13.15 along the following lines:
Let $\{U_i\}$ be a countable open cover of the compact space $X$. If there is no finite
open subcover, then for each $n \in \mathbb{N}$, there exists an $x_n \in X \setminus \bigcup_{i=1}^{n} U_i$. Complete
the proof by obtaining a contradiction.
(6) Let $E_1, \dots, E_n$ be sequentially compact subsets of the metric space $(X, d)$.
Prove that $\bigcup_{i=1}^{n} E_i$ is sequentially compact.
(7) Suppose that $(X_1, d_1)$, $(X_2, d_2)$ are sequentially compact metric spaces. Show
that $X_1 \times X_2$ is sequentially compact if we take the product metric on $X_1 \times X_2$.
Generalize to the product of $n$ sequentially compact metric spaces.
(8) Suppose that .X; d/ is sequentially compact and let X1 denote the space of all
sequences .xn / X. Define
d1 ..xn /; .x0n // D
1
X
2n d.xn ; yn /:
nD1
Show that
(a) d1 is a metric on X1 .
(b) .X1 ; d1 / is sequentially compact.
We remark that it can be shown that an arbitrary product of compact
topological spaces is compact—Tychonoff’s theorem. We refer to books on
general topology (for example, [18, 30]) for the definition of the product
topology and the proof of Tychonoff’s theorem, which depends on the Axiom
of Choice from set theory.
(9) Let $X = \{0, 1\}$ and take the discrete metric on $X$. Define $(X^\infty, d_\infty)$ as in the
previous question. Show that $X^\infty$ is homeomorphic to the middle-thirds Cantor
set $C$. (Hints: use the ternary expansion for points in $C$ to define a continuous
bijection $h: X^\infty \to C$. Use exercise (2) above.)
(10) Let $f: X \to Y$ be a continuous map between metric spaces. Suppose that $A$
is a compact subset of $Y$. Find an example to show that $f^{-1}(A)$ need not be
compact.
(11) Let $f: X \to Y$ be a continuous map between metric spaces. Suppose that
(a) For all $y \in Y$, $f^{-1}(y)$ is either empty or a compact subset of $X$.
(b) $f$ is closed: $f$ maps closed subsets of $X$ to closed subsets of $Y$.
Show that if (a) and (b) hold, then $f^{-1}(A)$ is compact for all compact subsets $A$ of $Y$.
Show, by means of examples, that conditions (a) and (b) are both necessary.
(Maps for which inverse images of compact sets are compact are called proper
maps.)
(12) Let $A, B$ be non-empty subsets of the metric space $(X, d)$. Define
$D(A, B) = \inf_{a \in A,\, b \in B} d(a, b)$.
(a) Show that if $A$ and $B$ are sequentially compact, then there exist $a_0 \in A$,
$b_0 \in B$ such that $D(A, B) = d(a_0, b_0)$.
(b) Show that if $A$ is sequentially compact and $B$ is a closed subset of $X$ then
$D(A, B) > 0$ iff $A \cap B = \emptyset$.
(c) Show that if $A$ and $B$ are subsets of $\mathbb{R}^n$ (standard metric) and $A$ is
sequentially compact, $B$ is closed, then there exist $a_0 \in A$, $b_0 \in B$ such
that $D(A, B) = d(a_0, b_0)$. Show that this result does not hold for subsets
of general metric spaces. (Hints for second part: One approach can be
based on Examples 7.10.8(3). Take $X = \mathbb{R}^2 \setminus \{(0, 0)\}$ and observe that
$\{(0, \frac{1}{n}) \mid n \in \mathbb{N}\}$ is a closed subset of $X$. Alternatively, an example can be
constructed based on Examples 7.10.8(7): suppose $x_n \to x_0 \notin \{x_n\}$.)
(d) Find an example of disjoint closed subsets $A, B$ with $D(A, B) = 0$.
(13) Suppose $(X, d)$, $(Y, \bar{d})$ are metric spaces and $(X, d)$ is sequentially compact.
Given continuous functions $f, g: X \to Y$ define
$$\rho(f, g) = \sup\{\bar{d}(f(x), g(x)) \mid x \in X\} \quad \text{(the uniform metric)}.$$
Verify
(a) $\rho$ is well defined (that is, $\rho(f, g) < \infty$).
(b) $\exists x_0 \in X$ such that $\rho(f, g) = \bar{d}(f(x_0), g(x_0))$.
(c) If $C^0(X, Y)$ denotes the space of all continuous functions from $X$ to $Y$, then
$\rho$ defines a metric on $C^0(X, Y)$.
Suppose we allow $X$ to be non-compact and let $B(X, Y)$ denote the space of
all continuous functions $f$ from $X$ to $Y$ such that $f$ is bounded (that is, $f(X)$
is a bounded subset of $Y$: $\exists R = R_f > 0$ such that $f(X) \subset D_R(y)$ for some
$y \in Y$). Show that $\rho$ defines a metric on $B(X, Y)$. Is statement (b) above still
valid? (Prove or give a counterexample.) (Hint: Define $G(x) = \bar{d}(f(x), g(x))$
and use Lemma 7.1.4 to prove the estimate $|G(x) - G(x')| \le \bar{d}(f(x), f(x')) + \bar{d}(g(x), g(x'))$.)
(14) Show that if there exists a countable collection B of open subsets of X such
that every open subset of X can be written as a union of open sets from B, then
X is separable. (Hint: Proposition 7.13.20 and cover by open disks.)
(15) Let $(X, d)$ be a metric space and $f: \mathbb{R} \to X$ be continuous. Define $\omega(f) =
\bigcap_{T \ge 0} \overline{\{f(t) \mid t \ge T\}}$. Show that $x \in \omega(f)$ if and only if there exists a monotone
increasing sequence $(t_n)$, $\lim_{n \to \infty} t_n = +\infty$, such that $\lim_{n \to \infty} f(t_n) = x$. If $X$
is compact (or $f(\mathbb{R})$ is compact) show that $\omega(f) \ne \emptyset$. Show, by means of an
example, that if these conditions are not satisfied, $\omega(f)$ may be empty.
7.14 Compact Subsets of R: The Middle Thirds Cantor Set
The structure of open subsets of R is relatively simple: every open subset of R can be
written as a countable union of disjoint open intervals (Exercises 7.4.27(2)). Closed
sets, even of the real line, can have a highly complex structure. In this section we
describe the construction and properties of the (middle-thirds or ternary) Cantor set.
The Cantor set is a compact subset of the unit interval $[0, 1]$ which (a) is uncountable,
(b) has no interior points, and (c) has no isolated points. It is obtained by removing
a countable set of open disjoint intervals from $[0, 1]$ of total length equal to one. A
very interesting feature of the middle-thirds Cantor set is that it looks the same at
all scales: self-similarity. Cantor-like sets play a very important role in the modern
theory of dynamics and we briefly investigate that aspect in the exercises. At the end
of the section we give a general definition of a Cantor set. However, when we say
the Cantor set, we always mean the middle thirds (or ternary) Cantor set.
7.14.1 Construction of the Cantor Set
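We give a construction which is based on ideas from dynamics. We define the
continuous map $T: \mathbb{R} \to \mathbb{R}$ by
$$T(x) = \begin{cases} 3x, & \text{if } x \le \frac{1}{2}, \\ 3 - 3x, & \text{if } x \ge \frac{1}{2}. \end{cases}$$
Observe that $T: (-\infty, \frac{1}{2}] \to (-\infty, \frac{3}{2}]$ and $T: [\frac{1}{2}, \infty) \to (-\infty, \frac{3}{2}]$ are 1:1 onto
linear maps and that
$$T((-\infty, 0) \cup (1, +\infty)) \subset (-\infty, 0). \tag{7.3}$$
Given $x_0 \in \mathbb{R}$, we define the sequence $(x_n) \subset \mathbb{R}$ inductively by
$$x_{n+1} = T(x_n), \quad n \ge 0.$$
We usually write $x_n = T^n(x_0)$. If $x_0 < 0$, then $x_1 = T(x_0) = 3x_0 < x_0 < 0$ and
clearly $x_n = 3^n x_0 < 0$, $n \ge 1$. Hence, using (7.3), we see that
$$\lim_{n \to \infty} x_n = -\infty, \quad \text{if } x_0 \in (-\infty, 0) \cup (1, +\infty).$$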
[Fig. 7.4 Graph of the map $T$]
If $T(x) \in [0, 1]$, then $x \in [0, 1]$. Consequently, if $x_0 \in [0, 1]$, then one of two
things happens: either there exists an $n \ge 0$ such that $x_0, \dots, x_n \in [0, 1]$ but $x_{n+1} =
T(x_n) = T^{n+1}(x_0) > 1$ (see Fig. 7.4), or $x_n = T^n(x_0) \in [0, 1]$ for all $n \ge 0$. In the
first case $\lim_{n \to \infty} x_n = -\infty$. In the second case $(x_n) \subset [0, 1]$. Certainly there exist
points $x_0 \in [0, 1]$ for which $(x_n) \not\subset [0, 1]$. For example, every point in $(\frac{1}{3}, \frac{2}{3})$ exits
$[0, 1]$ under just one application of $T$. On the other hand there exist points $x_0 \in [0, 1]$
for which $(x_n) \subset [0, 1]$. For example, if we take $x_0 = 0$, then $x_n = 0$, for all $n \ge 0$.
Another example is given by taking $x_0 = \frac{1}{3}$. We have $x_1 = 1$, $x_n = 0$, $n \ge 2$. We
define the Cantor set to be the subset $C$ of $[0, 1]$ consisting of all points $x$ such that
$T^n(x) \in [0, 1]$ for all $n \ge 0$:
$$C = \{x \in [0, 1] \mid T^n(x) \in [0, 1] \text{ for all } n \ge 0\}.$$
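The definition of $C$ can be explored numerically. The following is a minimal sketch in Python using exact rational arithmetic; the names `T`, `stays_in_unit_interval` and the cutoff `steps` are illustrative assumptions, not notation from the text. A finite number of iterations can only show that a point has not yet left $[0, 1]$, so a `True` result is evidence of membership in $C$, not a proof.

```python
from fractions import Fraction

def T(x):
    """The map from the text: 3x for x <= 1/2, and 3 - 3x for x >= 1/2."""
    return 3 * x if x <= Fraction(1, 2) else 3 - 3 * x

def stays_in_unit_interval(x0, steps=50):
    """Return True if x0, T(x0), ..., T^steps(x0) all lie in [0, 1]."""
    x = Fraction(x0)
    for _ in range(steps + 1):
        if not (0 <= x <= 1):
            return False  # the orbit has left [0, 1] and never returns
        x = T(x)
    return True

# Orbits described in the text: 0 is fixed, 1/3 -> 1 -> 0, and 1/2 exits at once.
print(stays_in_unit_interval(Fraction(0)))
print(stays_in_unit_interval(Fraction(1, 3)))
print(stays_in_unit_interval(Fraction(1, 2)))
```

Exact fractions are used so that points such as $\frac{1}{3}$ are not pushed out of $[0, 1]$ by rounding error.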
7.14.2 Properties of the Cantor Set
We are going to give a precise geometric description of the Cantor set. In order to do
this, we need some new notation. Denote the unit interval $[0, 1]$ by $I_0$ and for $n > 0$
define
$$I_n = \{x \in I_0 \mid T^n(x) \in I_0\}.$$
Note that $I_n = \{x \in I_0 \mid T^j(x) \in I_0,\ 0 \le j \le n\}$, since once a point has exited $I_0$ it
never returns, and
$$I_0 \supset I_1 \supset \cdots \supset I_n \supset I_{n+1} \supset \cdots \tag{7.4}$$
We have
$$C = \bigcap_{n \ge 0} I_n = \bigcap_{n \ge m} I_n, \quad \text{for all } m \in \mathbb{N}, \tag{7.5}$$
where the last equality follows from (7.4).
Lemma 7.14.1 For $n \ge m \ge 0$, we have $T^m(I_n) = I_{n-m}$. In particular,
(1) for $n \ge m \ge 0$, $(T^n)^{-1}(I_m) = I_{m+n}$,
(2) $T(C) = C$.
Proof We claim that for $k \ge 0$ we have $T(I_{k+1}) = I_k$. Granted the claim, a simple
induction verifies that for $n \ge m \ge 0$, we have $T^m(I_n) = I_{n-m}$. In order to verify
the claim, observe that if $x \in I_{k+1}$, then $T(x) \in I_k$ and so $T(I_{k+1}) \subset I_k$. Conversely,
let $x \in I_k$. Since $T(I_0) \supset I_k$, $k \ge 0$, there exists a $y \in I_0$ such that $T(y) = x$. Since
$T^k(x) \in I_0$, $T^{k+1}(y) \in I_0$ and so $y \in I_{k+1}$ and $x \in T(I_{k+1})$. Hence $T(I_{k+1}) \supset I_k$.
It remains to prove (1,2). For (1), observe that if $x \in (T^n)^{-1}(I_m)$, then $T^n(x) \in I_m$.
Since $T^n(x) \in I_m$ implies that $x \in I_{m+n}$, and conversely $T^n(I_{m+n}) = I_m$ by the first part, we have $(T^n)^{-1}(I_m) = I_{m+n}$. (2) Since
$\bigcap_{n \ge 0} I_n = \bigcap_{n \ge 1} I_n$, we have
$$T(C) = T\Big(\bigcap_{n \ge 1} I_n\Big) = \bigcap_{n \ge 1} T(I_n) = \bigcap_{n \ge 1} I_{n-1} = \bigcap_{n \ge 0} I_n = C,$$
where the last line follows since $T(I_n) = I_{n-1}$, $n \ge 1$. □
Remark 7.14.2 Although $T$ maps $C$ onto $C$, $T$ is not 1:1 (for example, $T(0) =
T(1) = 0$ and $0, 1 \in C$).
Lemma 7.14.3 $C$ is a compact subset of $I_0$.
Proof Since $T^n$ is continuous and $I_n = (T^n)^{-1}(I_0)$, $I_n$ is a closed subset of $I_0$. Hence
$C = \bigcap_{n \ge 0} I_n$ is a closed subset of $[0, 1]$ and therefore $C$ is compact. □
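Example 7.14.4 We have $I_1 = I_0 \setminus (\frac{1}{3}, \frac{2}{3}) = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$. Now $T: [0, \frac{1}{3}] \to [0, 1]$
is given by $T(x) = 3x$ and so
$$\begin{aligned}
I_2 \cap \left[0, \tfrac{1}{3}\right] &= \left[0, \tfrac{1}{3}\right] \setminus T^{-1}\left(\left(\tfrac{1}{3}, \tfrac{2}{3}\right)\right) \\
&= \left[0, \tfrac{1}{3}\right] \setminus \left(\tfrac{1}{3^2}, \tfrac{2}{3^2}\right) \\
&= \left[0, \tfrac{1}{3^2}\right] \cup \left[\tfrac{2}{3^2}, \tfrac{1}{3}\right].
\end{aligned}$$
Similarly $I_2 \cap [\frac{2}{3}, 1] = [\frac{2}{3}, \frac{7}{3^2}] \cup [\frac{8}{3^2}, 1]$. In other words, we obtain $I_1$ by removing the
middle third of $I_0$ and we obtain $I_2$ by removing the middle thirds of the two closed
intervals that comprise $I_1$.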
The next lemma, although elementary, will prove useful in unravelling the
structure of the sets In .
Lemma 7.14.5 Let $f(x) = mx + c$, where $m, c \in \mathbb{R}$ and $m \ne 0$. Suppose that
$f([\alpha, \beta]) = [0, 1]$, where $\alpha < \beta$. We have
(1) $f$ maps $[\alpha, \beta]$ 1:1 onto $[0, 1]$.
(2) If $f$ preserves orientation (that is, $m > 0$) then $f(\alpha) = 0$, $f(\beta) = 1$. If $f$ reverses
orientation, then $f(\alpha) = 1$, $f(\beta) = 0$.
(3) $f^{-1}((\frac{1}{3}, \frac{2}{3})) = (\alpha + \frac{\beta - \alpha}{3}, \beta + \frac{\alpha - \beta}{3})$. ($f^{-1}$ maps the middle third open interval
of $[0, 1]$ to the middle third open interval of $[\alpha, \beta]$.)
Proof The result is geometrically obvious (see Fig. 7.5), but for completeness we
provide an analytic/algebraic proof.
(1) Since $m \ne 0$, $f: \mathbb{R} \to \mathbb{R}$ is 1:1 onto. Given that $f([\alpha, \beta]) = [0, 1]$, it is immediate
that $f$ restricts to a 1:1 map of $[\alpha, \beta]$ onto $[0, 1]$. (2) Suppose that $m > 0$. Then $f$ is
an increasing function of $x$. If $f(\alpha) > 0$, then $f(x) \ge f(\alpha) > 0$ for all $x \in [\alpha, \beta]$
and so $0 \notin f([\alpha, \beta])$, contradicting the assumption that $f([\alpha, \beta]) = [0, 1]$. Hence
$f(\alpha) = 0$. Similarly, $f(\beta) = 1$. If $m < 0$, then $f$ is decreasing and we apply the
same arguments to show that $f(\alpha) = 1$, $f(\beta) = 0$. (3) Suppose that $m > 0$ (the
argument is similar if $m < 0$). We have $f(\alpha) = 0$, $f(\beta) = 1$ and so, by linearity,
$$f\left(\alpha + \frac{\beta - \alpha}{3}\right) = m\alpha + m\,\frac{\beta - \alpha}{3} + c = (m\alpha + c) + \frac{1}{3}\big(m\beta + c - (m\alpha + c)\big) = \frac{1}{3}.$$
The same argument shows that $f(\beta + \frac{\alpha - \beta}{3}) = \frac{2}{3}$. □
[Fig. 7.5 Removing middle thirds, case $m > 0$]
Proposition 7.14.6 For $n \ge 0$ we have
(1) $I_n$ is the disjoint union of $2^n$ closed intervals $I_{nj}$, $j = 1, \dots, 2^n$, each of length
$3^{-n}$.
(2) For $1 \le j \le 2^n$, $T^n: I_{nj} \to I_0$ is a linear 1:1 onto map and there exists a
$b_{nj} \in 3\mathbb{Z}$ such that $T^n(x) = \pm 3^n x + b_{nj}$, for all $x \in I_{nj}$. ($3\mathbb{Z}$ is the set of all
integers divisible by 3.)
(3) $I_{n+1} \cap I_{nj} = I_{nj} \setminus T^{-n}((\frac{1}{3}, \frac{2}{3}))$. That is, we obtain $I_{n+1}$ from $I_n$ by removing the
middle third open interval from each closed interval $I_{nj}$ comprising $I_n$.
Proof The proof is by induction on $n$. In the previous example, we verified the result
in case $n = 1$. So suppose the result has been shown for $n = 0, \dots, m$. We prove
it for $n = m + 1$. Let $J = I_{mj}$ be one of the closed intervals comprising $I_m$. By
the inductive hypothesis, $T^m: I_{mj} \to I_0$ is 1:1 onto and we may write $T^m(x) =
\pm 3^m x + b$, where $b \in 3\mathbb{Z}$. Suppose that $T^m(x) = 3^m x + b$ (the argument when $T^m$
reverses orientation is similar). Then by Lemma 7.14.5, $T^m$ maps the open middle-thirds
interval of $J$ onto $(\frac{1}{3}, \frac{2}{3})$. Hence $I_{m+1} \cap J = J \setminus (T^m)^{-1}((\frac{1}{3}, \frac{2}{3}))$. Therefore,
$I_{m+1} \cap J$ consists of two closed intervals $J_1, J_2$, each of length one third the length
of $J$, that is $3^{-(m+1)}$. Now $T^m: J_1 \to [0, \frac{1}{3}]$ and $T^m: J_2 \to [\frac{2}{3}, 1]$ (we assumed
$T^m$ preserved orientation). Hence $T^{m+1}: J_1 \to [0, 1]$, $T^{m+1}: J_2 \to [0, 1]$ are 1:1
onto maps. For $x \in J_1$, $T^{m+1}(x) = 3(3^m x + b) = 3^{m+1} x + 3b$ and if $x \in J_2$,
$T^{m+1}(x) = 3 - 3(3^m x + b) = -3^{m+1} x + 3(1 - b)$. In both cases, the constant term
lies in $3\mathbb{Z}$. Applying this argument to each of the closed subintervals comprising $I_m$,
we see that $I_{m+1}$ is the disjoint union of $2 \cdot 2^m = 2^{m+1}$ closed intervals each of
length $3^{-(m+1)}$. This completes the inductive step. □
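The geometric description of $I_n$ can be reproduced directly. The sketch below is an illustration under assumed names (`remove_middle_thirds` and `cantor_level` are not from the text): it builds the closed intervals comprising $I_n$ by repeatedly removing open middle thirds, using exact fractions.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its left and right closed thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))   # left closed third
        result.append((b - third, b))   # right closed third
    return result

def cantor_level(n):
    """Return the list of closed intervals comprising I_n, in increasing order."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals

level = cantor_level(3)
print(len(level))                        # number of intervals in I_3
print({b - a for a, b in level})         # the common interval length
```

For $n = 3$ this produces $2^3$ intervals, each of length $3^{-3}$, in agreement with part (1) of the proposition.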
We now give a number of corollaries of Proposition 7.14.6.
Corollary 7.14.7 The total length of all the middle thirds intervals removed in the
construction of $C$ is 1.
Proof At step one, we remove one interval of length $1/3$. At step two we remove
two intervals, each of length $1/3^2$. At step $n + 1$, we remove $2^n$ intervals each of
length $3^{-(n+1)}$. Hence the total length of the intervals removed is
$$\sum_{n=0}^{\infty} 2^n 3^{-(n+1)} = \frac{1}{3} \sum_{n=0}^{\infty} \left(\frac{2}{3}\right)^n.$$
Since $\sum_{n=0}^{\infty} \left(\frac{2}{3}\right)^n = 1/(1 - \frac{2}{3}) = 3$, the result follows. □
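As a quick numerical check of this geometric series, the following sketch (the function name `removed_length` is an assumption made for illustration) computes the exact total length removed in the first few steps of the construction, where the step indexed by $n = 0, 1, 2, \dots$ removes $2^n$ intervals of length $3^{-(n+1)}$.

```python
from fractions import Fraction

def removed_length(steps):
    """Total length removed by the first `steps` steps of the construction."""
    return sum(Fraction(2**n, 3**(n + 1)) for n in range(steps))

for steps in (1, 2, 5, 20):
    total = removed_length(steps)
    # What is left after `steps` steps has total length (2/3)^steps.
    print(steps, total, float(1 - total))
```

The partial sums equal $1 - (2/3)^N$ after $N$ steps, so they increase to 1 while the remaining length tends to 0.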
Corollary 7.14.8 If we let $E = \bigcup_{n \ge 0} \bigcup_{1 \le j \le 2^n} \partial I_{nj}$ denote the set of end-points of all
the closed intervals comprising $I_n$, $n \ge 0$, then $E$ is a countable subset of $C$.
Proof Since each set $\bigcup_{1 \le j \le 2^n} \partial I_{nj}$ is finite, $E$ is countable (a countable union of finite
sets is countable). Since each $I_n$ is obtained from $I_{n-1}$ by removing middle third intervals,
we never remove end-points of the intervals $I_{nj}$. Hence $E \subset \bigcap_{n \ge 0} I_n = C$. □
Remark 7.14.9 It is natural to guess that the Cantor set C is equal to E. However,
as we shall soon see, this is false. Indeed, C is an uncountable subset of I0 .
Corollary 7.14.10 Let $n \in \mathbb{N}$ and choose $j$, $1 \le j \le 2^n$. Then $T^n$ maps $C \cap I_{nj}$ 1:1
onto $C$.
Proof We have $T^n(C \cap I_{nj}) = T^n(C) \cap T^n(I_{nj}) = C \cap I_0 = C$, by Lemma 7.14.1(2) and
Proposition 7.14.6(2). By Proposition 7.14.6(2), $T^n: I_{nj} \to I_0$ is 1:1. □
Remark 7.14.11 The property of C described by the previous corollary implies that
the Cantor set is ‘self-similar’ on all scales. That is, given any of the closed intervals
Inj , we find a copy of the Cantor set within Inj . Sets of this type are examples of
fractals and we give more examples and constructions in the next chapter.
We have already remarked that the Cantor set C is a compact set. We now verify
some other metric and topological properties of C.
Definition 7.14.12 A non-empty subset $E$ of the metric space $X$ is perfect if $E = E'$.
Remark 7.14.13 A set is perfect iff it is closed and has no isolated points.
Lemma 7.14.14 The Cantor set is perfect: $C = C'$.
Proof We already know that $C$ is a closed subset of $\mathbb{R}$ and so $C' \subset C$. It suffices to
show that $C$ has no isolated points. Suppose the contrary and let $x \in C$ be isolated.
Then there exists a $\delta > 0$ such that $(x - \delta, x + \delta) \cap C = \{x\}$. Since $C = \bigcap_{n \ge 0} I_n$,
$C \subset I_n$ for all $n \ge 0$. Consequently, $x \in I_n$, for all $n \ge 0$. Each closed interval $I_{nj}$
comprising $I_n$ has length $3^{-n}$. Choose $n$ so that $3^{-n} < \delta$ and suppose that $x \in I_{nj}$.
Then $(x - \delta, x + \delta) \cap I_{nj} \supset \partial I_{nj}$. Since $\partial I_{nj} \subset C$ (Corollary 7.14.8), we see that
$(x - \delta, x + \delta) \cap C$ contains at least two points. Contradiction. Hence $x$ cannot be an
isolated point. □
Definition 7.14.15 A non-empty subset $E$ of $\mathbb{R}$ is totally disconnected if $E$ contains
no (non-empty) open intervals.
Remark 7.14.16 Later we will define totally disconnected for general metric
spaces.
Example 7.14.17 If $E$ is a subset of $\mathbb{R}$ then $E$ is totally disconnected iff $E$ has no
interior points. To see this, observe that $x$ is an interior point of $E$ iff there exists a
non-empty open interval $I \subset E$ which contains $x$.
Proposition 7.14.18 The Cantor set is compact, perfect and totally disconnected.
Proof We have already shown that $C$ is compact and perfect. It remains to prove
that $C$ is totally disconnected. We give two proofs. The first proof makes essential
use of the structure of open subsets of the real line; the second proof uses arguments
from dynamics and extends to more general spaces. In what follows $|I|$ denotes the
length of the interval $I$.
Method I. Suppose that $I \subset C$ is a closed interval. It suffices to show that $|I| = 0$.
Since $C = \bigcap_{n \ge 0} I_n$, we have $I \subset I_n$ for all $n \ge 0$. Therefore for each $n$, there exists
a $j$ such that $I \subset I_{nj}$. Hence $|I| \le 3^{-n}$ for all $n \ge 0$ and so $|I| = 0$.
Method II. Let $I = [\alpha, \beta] \subset C$, where $\alpha \le \beta$. Since $I \subset C$ and $T(C) = C$, we
have $T^n(I) \subset C$ for all $n \ge 0$. Since $C \subset [0, 1]$, it follows that the closed interval
$T^n(I)$ must be a subset of $[0, 1]$ for all $n \ge 0$. But $|T^n(I)| = 3^n |I|$, for all $n \ge 0$. If
$|I| > 0$, we eventually get $|T^n(I)| > 1$, contradicting $C \subset I_0$. Hence $|I| = 0$. □
Definition 7.14.19 A compact metric space is a Cantor set if it is perfect and totally
disconnected.
Remark 7.14.20 It can be shown [30, Theorem 30.7] that every Cantor set is
homeomorphic to the middle thirds Cantor set C (see Exercises 7.11.10(12) for the
definition of a homeomorphism).
7.14.3 Ternary Expansions and the Uncountability of C
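The ternary expansion of $x \in \mathbb{R}$ is the expansion of $x$ to base 3. That is, $x =
\pm x_0.x_1 x_2 \cdots$ will be the ternary expansion of $x$ if $x_i \in \{0, 1, 2\}$ for all $i \ge 1$ and
$$x = \operatorname{sign}(x) \left( x_0 + \sum_{n=1}^{\infty} \frac{x_n}{3^n} \right),$$
where $\operatorname{sign}(x) = +1$ if $x \ge 0$ and $\operatorname{sign}(x) = -1$ if $x < 0$.
Example 7.14.21 Just as for decimal expansions, rational numbers may have more
than one ternary expansion. For example, $1 = 0.\overline{2} = 1.\overline{0}$ and $\frac{1}{3} = 0.1\overline{0} = 0.0\overline{2}$ (an overline on a digit of an expansion marks a repeating digit).
Let $\Sigma \subset [0, 1]$ denote the set of points which have a ternary expansion $x =
0.x_1 x_2 \cdots$ such that $x_n \in \{0, 2\}$ for all $n$. If $x \in \Sigma$, we always regard the ternary
expansion as infinite. That is, we write $0.2\overline{0}$ rather than $0.2$.
Example 7.14.22 $1 \in \Sigma$ (since $1 = 0.\overline{2}$) and $\frac{1}{3} \in \Sigma$ (since $\frac{1}{3} = 0.0\overline{2}$). On the other
hand, $\frac{1}{2} \notin \Sigma$ as the (unique) ternary expansion of $\frac{1}{2}$ is $0.\overline{1}$.
If $x \in \{0, 1, 2\}$, let $\bar{x} = 2 - x$.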
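Lemma 7.14.23 If $x = 0.x_1 x_2 \cdots x_n \cdots \in C$ and $x_1 \ne 1$, then
$$T(x) = \begin{cases} 0.x_2 x_3 \cdots x_n \cdots, & \text{if } x_1 = 0, \\ 0.\bar{x}_2 \bar{x}_3 \cdots \bar{x}_n \cdots, & \text{if } x_1 = 2. \end{cases}$$
In particular, we have $T(\Sigma) = \Sigma$. Conversely, if $x \in [0, 1]$ does not have a ternary
expansion in $\Sigma$, then there exists an $N \in \mathbb{N}$ such that $T^N(x) \notin [0, 1]$.
Proof If $x_1 = 0$, then $x \in [0, \frac{1}{3}]$ and $T(x) = 3x = 0.x_2 x_3 \cdots x_n \cdots$. If $x_1 = 2$, then
$x \in [\frac{2}{3}, 1]$ and so
$$T(x) = 3 - 2.x_2 x_3 \cdots x_n \cdots = 1 - 0.x_2 x_3 \cdots x_n \cdots = 0.22 \cdots 2 \cdots - 0.x_2 x_3 \cdots x_n \cdots = 0.\bar{x}_2 \bar{x}_3 \cdots \bar{x}_n \cdots.$$
Since $\bar{x} \in \{0, 2\}$ if $x \in \{0, 2\}$, we see that if $x \in \Sigma$, then $T(x) \in \Sigma$. Hence $T(\Sigma) \subset \Sigma$.
On the other hand, if $x = 0.x_1 x_2 \cdots \in \Sigma$, then $T(0.0 x_1 x_2 \cdots) = x$. Hence $T(\Sigma) =
\Sigma$. Finally, suppose that $x \in [0, 1]$ does not have a ternary expansion consisting of
0's and 2's. Let $x = 0.x_1 x_2 \cdots$. If $x_1 = 1$, then $x \ne 0.x_1 \overline{a}$, $a \in \{0, 2\}$ (else $x \in \Sigma$).
Hence $x \in (\frac{1}{3}, \frac{2}{3})$ and $T(x) \notin I_0$. More generally, if $x_j$, $j > 1$, is the first term in the
ternary expansion of $x$ which is equal to 1, then $T^{j-1}(x) = 0.x_j \hat{x}_{j+1} \cdots \hat{x}_n \cdots$, where
$\hat{x}_n \in \{x_n, \bar{x}_n\}$. It follows just as before that $T^{j-1}(x) \in (\frac{1}{3}, \frac{2}{3})$ and so $T^j(x) \notin I_0$. □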
Theorem 7.14.24 We have
(1) $C = \Sigma$,
(2) $C$ is uncountable.
Proof
(1) Since $T(\Sigma) = \Sigma$, points in $\Sigma$ never leave $I_0$ under iteration by $T$. Hence $\Sigma \subset C$.
On the other hand, if $x \notin \Sigma$, then there exists an $N \in \mathbb{N}$ such that $T^N(x) \notin [0, 1]$
and so $x \notin C$. Therefore, $C = \Sigma$.
(2) It suffices to prove $\Sigma$ is uncountable. Define $B: \Sigma \to [0, 1]$ by
$B(0.x_1 \cdots x_n \cdots) = 0.y_1 \cdots y_n \cdots$, where $y_n = 0$ if $x_n = 0$ and $y_n = 1$ if $x_n = 2$,
$n \ge 1$. Observe that $B(\Sigma)$ is the set of all binary expansions $0.b_1 \cdots b_n \cdots$ of
points in $[0, 1]$ and so $B$ is certainly onto. Since $[0, 1]$ is uncountable so therefore
is $\Sigma$. □
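The map $B$ can be illustrated on finite digit strings. The sketch below is only an approximation of the construction in the proof: it works with finitely many ternary digits (an assumption made so that the code terminates), and the helper names `from_ternary` and `B` applied to digit lists are illustrative choices.

```python
from fractions import Fraction

def from_ternary(digits):
    """Value of the finite ternary expansion 0.d1 d2 ... dk."""
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))

def B(digits):
    """Send ternary digits in {0, 2} to binary digits {0, 1} and return the
    value of the resulting finite binary expansion."""
    if any(d not in (0, 2) for d in digits):
        raise ValueError("digits must all be 0 or 2")
    return sum(Fraction(d // 2, 2 ** (i + 1)) for i, d in enumerate(digits))

digits = [2, 0, 2]
print(from_ternary(digits))  # a point of the Cantor set with expansion 0.202
print(B(digits))             # the corresponding binary value 0.101
```

Note that $B$ is onto but not 1:1: the expansions $0.0222\cdots$ and $0.2000\cdots$ are distinct points of $\Sigma$ (namely $\frac{1}{3}$ and $\frac{2}{3}$) that are both sent to $\frac{1}{2}$.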
Example 7.14.25 Theorem 7.14.24 shows that $C$ contains many more points than
those in the (countable) interval end point set $E$. For example, $\frac{1}{4} \notin E$ is a point of
the Cantor set. To see this observe that $T(\frac{1}{4}) = \frac{3}{4}$ and $T(\frac{3}{4}) = 3 - 3 \cdot \frac{3}{4} = \frac{3}{4}$. Since $\frac{3}{4}$
is fixed by $T$, $T^n(\frac{1}{4}) = \frac{3}{4} \in I_0$ for all $n \ge 1$ and so $\frac{1}{4} \in C$.
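The same conclusion can be seen from ternary digits. The following sketch (the function `ternary_digits` is a hypothetical helper, and it assumes $0 \le x < 1$) computes the first digits of the ternary expansion of a rational number by repeated multiplication by 3. It always returns the terminating expansion when two expansions exist, so for instance it reports $\frac{1}{3}$ as $0.1000\cdots$ rather than $0.0222\cdots$.

```python
from fractions import Fraction

def ternary_digits(x, n):
    """First n ternary digits of x, assuming 0 <= x < 1."""
    x = Fraction(x)
    digits = []
    for _ in range(n):
        x *= 3
        d = int(x)      # integer part, one of 0, 1, 2
        digits.append(d)
        x -= d
    return digits

print(ternary_digits(Fraction(1, 4), 8))  # only digits 0 and 2 appear
print(ternary_digits(Fraction(1, 2), 8))  # every digit is 1
```

The expansion of $\frac{1}{4}$ alternates between 0 and 2, which places it in $\Sigma = C$, while $\frac{1}{2}$ has the digit 1 in every position.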
EXERCISES 7.14.26
(1) Show that $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$ are totally disconnected subsets of $\mathbb{R}$.
(2) Find examples of subsets of $\mathbb{R}$ which are (a) compact, perfect, not totally
disconnected, (b) compact, not perfect, totally disconnected, (c) not compact,
perfect, totally disconnected.
(3) Show that a perfect subset $E$ of $\mathbb{R}^n$ is uncountable. (Hint: suppose the contrary
and set $E = \{x_n \mid n \in \mathbb{N}\}$. Construct a decreasing sequence $D_k$ of closed disks
such that (a) $x_1 \in D_1$, (b) $D_k \cap E \ne \emptyset$, $k \in \mathbb{N}$, (c) $x_k \notin D_{k+1}$, $k \ge 1$. Now use
Theorem 7.13.13 applied to $F_n = D_n \cap E$ to obtain a contradiction.)
(4) A metric space $X$ is locally compact if every point in $X$ has a compact
neighbourhood. Show that every perfect set in a locally compact metric space
is uncountable.
(5) Find all the points $x \in C$ such that $T(x) = x$ ($x$ is a fixed point of $T$). Find a
point $x \in C$ such that $T^2(x) = x$, but $T(x) \ne x$ (we call $x$ a point of prime
period two for $T$). Can you find a point $x \in C$ which is of prime period three
for $T$? (Hint: use ternary expansions; alternatively, graph $T^3$ and find the points
of intersection of the graph of $T^3$ with the diagonal $y = x$.)
(6) Show that there exists a continuous map $f: C \to [0, 1]$ which maps the Cantor
set onto $[0, 1]$. (Hint: use the ternary expansion of points in $C$. It can be shown
that every compact metric space is the continuous image of the Cantor set.)
(7) Let $b: [0, 1] \to \mathbb{R}$ be the asymmetric "Baker's" transformation defined by
$$b(x) = \begin{cases} 3x, & 0 \le x < 2/3, \\ 3x - 2, & 2/3 \le x \le 1. \end{cases}$$
Verify that the set of points $X$ in $[0, 1]$ that never exit $[0, 1]$ under iteration by
$b$ is the middle thirds Cantor set.
(8) If you construct a middle fifths Cantor set $\subset [0, 1]$ (the middle fifth interval of
each closed subinterval is removed), what is the total length of all middle fifth
intervals that are removed? Prove that the resulting set is compact, perfect
and totally disconnected. More generally, define a Cantor set by removing
middle $x$ths, starting with $[0, 1]$, $x \in (0, 1)$. Show that the total length of all the
intervals removed is one. (Hint: calculate what is removed and left at each step
rather than counting the number and lengths of intervals created at each step.)
(9) Can you construct a 'fat' Cantor subset of $[0, 1]$ such that the total length of
intervals removed is less than one? (Hint: follow the counting strategy of the
previous example. You will have to vary the proportion $x_n \in (0, 1)$ removed
at the $n$th step. Exercises 3.9.19(4) will be useful in showing that you get a fat
Cantor set iff $\sum_{n=1}^{\infty} x_n < \infty$.)
(10) Let $f: \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = \mu x(1 - x)$. Show that if $\mu > 2 + \sqrt{5}$ then
$X = \{x \in [0, 1] \mid f^n(x) \in [0, 1], \text{ for all } n \in \mathbb{N}\}$ is a compact, perfect, totally
disconnected subset of $[0, 1]$. (Hint: the condition on $\mu$ implies that there exists
an $a > 1$ such that $|f'(x)| \ge a$ for all $x \in [0, 1]$ such that $f(x) \in [0, 1]$. Use
the second method of proof of Proposition 7.14.18 to show that $X$ is totally
disconnected.)
7.15 Complete Metric Spaces
One of the most important ideas in our study of convergence of sequences and series
of real numbers was that of a Cauchy sequence. The definition of a Cauchy sequence
naturally generalizes to metric spaces. In this section we develop the theory of
Cauchy sequences in metric spaces and show, for example, how results on uniform
convergence in Chap. 4 can be naturally reformulated in metric space terms.
Definition 7.15.1 A sequence $(x_n)$ in the metric space $(X, d)$ is a Cauchy sequence
if $\lim_{m,n \to \infty} d(x_n, x_m) = 0$. That is, if for every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such
that
$$d(x_n, x_m) < \varepsilon, \quad \text{for all } m, n \ge N.$$
The next lemma is a metric space version of Lemma 2.4.20.
Lemma 7.15.2 Let $(x_n)$ be a sequence in the metric space $(X, d)$.
(1) If $(x_n)$ is Cauchy, then $\{x_n \mid n \in \mathbb{N}\}$ is a bounded subset of $X$.
(2) If $(x_n)$ is convergent, then $(x_n)$ is Cauchy.
(3) If $(x_n)$ is Cauchy and $(x_n)$ has a convergent subsequence, then $(x_n)$ is convergent.
Proof The proof, modulo changes of notation, is formally identical to that of
Lemma 2.4.20. We prove (3) and leave (1,2) to the exercises. Suppose that the
subsequence $(x_{n_k})$ of $(x_n)$ is convergent with limit $x^\star$. Given $\varepsilon > 0$, choose $N \in \mathbb{N}$
such that $d(x_n, x_m) < \varepsilon/2$, for all $n, m \ge N$, and $d(x_{n_k}, x^\star) < \varepsilon/2$, for all $n_k \ge N$.
Choose $n_k \ge N$. Then for all $n \ge N$ we have
$$d(x_n, x^\star) \le d(x_n, x_{n_k}) + d(x_{n_k}, x^\star) < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
Hence $(x_n)$ is convergent with limit $x^\star$. □
Definition 7.15.3 A metric space X is complete if every Cauchy sequence in X
converges.
Examples 7.15.4
(1) R is complete (in the standard metric).
(2) $\mathbb{R}^m$ is complete in the Euclidean metric $d_2$ (or in either of the metrics $d_1$ or
$d_\infty$). If $x, y \in \mathbb{R}^m$, then $d_2(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2} \ge |x_i - y_i|$, $1 \le i \le m$.
Hence, if $(x^n)$ is a Cauchy sequence in $\mathbb{R}^m$, then $(x_i^n)$ is a Cauchy sequence
in $\mathbb{R}$, $1 \le i \le m$. Let $\lim_{n \to \infty} x_i^n = x_i^\star$, $1 \le i \le m$. We claim that $(x^n)$
converges with limit $x^\star = (x_1^\star, \dots, x_m^\star)$. This follows easily from the estimate
$d_2(x^\star, x^n) \le \sqrt{m} \max_{1 \le i \le m} |x_i^\star - x_i^n|$. Similar arguments apply for the metrics
$d_1, d_\infty$.
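The two estimates used in this example sandwich the Euclidean distance between the largest coordinate difference and $\sqrt{m}$ times that difference. A small sketch follows; the function names `d2` and `d_max` and the sample vectors are assumptions made for illustration.

```python
import math

def d2(x, y):
    """Euclidean distance between two points of R^m given as sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_max(x, y):
    """Largest coordinate difference, max_i |x_i - y_i|."""
    return max(abs(a - b) for a, b in zip(x, y))

x = (1.0, 2.0, 3.0)
y = (4.0, 6.0, 3.0)
m = len(x)

lower = d_max(x, y)
middle = d2(x, y)
upper = math.sqrt(m) * d_max(x, y)
print(lower <= middle <= upper)
```

The lower bound shows that each coordinate sequence of a Cauchy sequence is Cauchy; the upper bound shows that coordinatewise convergence implies convergence in $d_2$.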
The next proposition gives more examples of complete metric spaces.
Proposition 7.15.5 Let $(X, d)$ be a metric space.
(1) If $X$ is sequentially compact, then $X$ is complete.
(2) If $X$ is complete then every closed subset of $X$ is complete (in the induced
metric).
Proof (1) If $X$ is sequentially compact, then every sequence $(x_n)$ in $X$ has a
convergent subsequence. Now apply Lemma 7.15.2(3). (2) Suppose that $E$ is a
closed subset of the complete metric space $X$. If $(x_n)$ is a Cauchy sequence of points
of $(E, d_E)$, then certainly $(x_n)$ is Cauchy in $(X, d)$. Hence $(x_n)$ converges to a point
$x^\star \in X$. Since $E$ is closed, $x^\star \in E$. □
Example 7.15.6 Every closed and bounded subset of $(\mathbb{R}^n, d_2)$ is complete in the
induced metric.
7.15.1 Completeness of Spaces of Functions
Let $(X, d)$ be a metric space. Let $C^0(X, \mathbb{R})$, $B(X, \mathbb{R})$ and $B^0(X, \mathbb{R})$ respectively
denote the spaces of continuous, bounded and bounded continuous real-valued
functions on $X$. Obviously,
$$C^0(X, \mathbb{R}) \supset B^0(X, \mathbb{R}) \subset B(X, \mathbb{R}).$$
If $X$ is compact then $C^0(X, \mathbb{R}) = B^0(X, \mathbb{R})$. Given $f, g \in B(X, \mathbb{R})$, we define the
uniform metric $\rho$ on $B(X, \mathbb{R})$ by
$$\rho(f, g) = \sup_{x \in X} |f(x) - g(x)|.$$
(See also Exercises 7.1.9(8).) Let $\rho$ also denote the induced metrics $\rho_{B^0(X, \mathbb{R})}$ on
$B^0(X, \mathbb{R})$ and $\rho_{C^0(X, \mathbb{R})}$ on $C^0(X, \mathbb{R})$ (when $X$ is compact).
Theorem 7.15.7 Let $(X, d)$ be a metric space.
(1) $(B(X, \mathbb{R}), \rho)$ and $(B^0(X, \mathbb{R}), \rho)$ are complete.
(2) If $X$ is compact, then $(C^0(X, \mathbb{R}), \rho)$ is complete.
Proof The proof of this result is similar to that of Theorem 4.3.16. We prove that
$(B(X, \mathbb{R}), \rho)$ and $(B^0(X, \mathbb{R}), \rho)$ are complete (which also accounts for (2) since
$(C^0(X, \mathbb{R}), \rho) = (B^0(X, \mathbb{R}), \rho)$ if $X$ is compact). Suppose then that $(f_n)$ is a Cauchy
sequence of functions in $B(X, \mathbb{R})$. Given $x \in X$, we have $\rho(f_n, f_m) \ge |f_n(x) - f_m(x)|$
for all $n, m \in \mathbb{N}$. Hence $(f_n(x))$ is a Cauchy sequence in $\mathbb{R}$. Since $(\mathbb{R}, |\cdot|)$ is complete,
$(f_n(x))$ is convergent. Set $\lim_{n \to \infty} f_n(x) = \hat{f}(x)$. Since $x$ was an arbitrary point of
$X$, this defines the function $\hat{f}: X \to \mathbb{R}$. Let $\varepsilon > 0$. Since $(f_n)$ is Cauchy, there
exists an $N \in \mathbb{N}$ such that $\rho(f_n, f_m) \le \varepsilon$ for all $n, m \ge N$. Letting $m \to \infty$, this
gives $\rho(f_n, \hat{f}) \le \varepsilon$ for all $n \ge N$. Hence $(f_n)$ converges to $\hat{f}$ in $(B(X, \mathbb{R}), \rho)$ (since
$\hat{f} - f_n \in B(X, \mathbb{R})$ for all $n \ge N$, $\hat{f}$ is bounded). Now suppose $(f_n) \subset B^0(X, \mathbb{R})$. We
prove that $\hat{f}$ is continuous. Given $x \in X$ and $\varepsilon > 0$, it suffices to find $r > 0$ such that
$|\hat{f}(x) - \hat{f}(y)| < \varepsilon$, for all $y \in D_r(x)$. Since $(f_n)$ is Cauchy, there exists an $N \in \mathbb{N}$
such that $\rho(f_n, f_m) < \varepsilon/3$ for all $n, m \ge N$. Since $f_N$ is continuous, there exists an
$r > 0$ such that $|f_N(x) - f_N(y)| < \varepsilon/3$ for all $y \in D_r(x)$. We have
$$\begin{aligned}
|\hat{f}(x) - \hat{f}(y)| &= |\hat{f}(x) - f_N(x) + f_N(x) - f_N(y) + f_N(y) - \hat{f}(y)| \\
&\le |\hat{f}(x) - f_N(x)| + |f_N(x) - f_N(y)| + |f_N(y) - \hat{f}(y)| \\
&< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon, \quad \text{if } y \in D_r(x).
\end{aligned}$$
Hence $\hat{f}$ is continuous at $x$. □
Examples 7.15.8
(1) If $I$ is a closed interval in $\mathbb{R}$ or more generally a closed and bounded subset of
$\mathbb{R}$, Theorem 7.15.7 implies that a Cauchy sequence of continuous real-valued
functions on $I$, uniform metric, is uniformly convergent (general principle of
convergence, Theorem 4.3.16).
(2) If we choose a different metric on $B^0(X, \mathbb{R})$ or $C^0(X, \mathbb{R})$, then the corresponding
function space may not be complete. As an example, recall that the $L^2$-metric
$\rho_2$ on $C^0([0, 1], \mathbb{R})$ is defined by
$$\rho_2(f, g) = \left( \int_0^1 |f(x) - g(x)|^2 \, dx \right)^{1/2}, \quad f, g \in C^0([0, 1]).$$
The metric space $(C^0([0, 1], \mathbb{R}), \rho_2)$ is not complete. It suffices to find a non-convergent
Cauchy sequence in $(C^0([0, 1], \mathbb{R}), \rho_2)$. If we define the sequence
$(f_n)$ by
$$f_n(x) = \begin{cases} 0, & \text{if } 0 \le x \le \frac{1}{2} - \frac{1}{n}, \\ \frac{1}{2}\left(1 + n\left(x - \frac{1}{2}\right)\right), & \text{if } \frac{1}{2} - \frac{1}{n} \le x \le \frac{1}{2} + \frac{1}{n}, \\ 1, & \text{if } \frac{1}{2} + \frac{1}{n} \le x \le 1, \end{cases}$$
then it is straightforward to check that $(f_n)$ is a Cauchy sequence in
$(C^0([0, 1], \mathbb{R}), \rho_2)$ that does not converge in the $L^2$-metric to a continuous
function. Note that $(f_n)$ is not Cauchy in $(C^0([0, 1], \mathbb{R}), \rho)$.
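The contrast between the two metrics in this example can be checked numerically. The sketch below is an approximation, not a proof: the integral is estimated by a midpoint rule and the supremum by a maximum over a grid, and the names `f`, `l2_distance`, `sup_distance` and the parameter `samples` are assumptions made for illustration. It takes $n \ge 2$ so that the three cases in the definition of $f_n$ fit inside $[0, 1]$.

```python
def f(n, x):
    """The piecewise linear ramp f_n: 0 on the left, 1 on the right."""
    if x <= 0.5 - 1.0 / n:
        return 0.0
    if x >= 0.5 + 1.0 / n:
        return 1.0
    return 0.5 * (1.0 + n * (x - 0.5))

def l2_distance(n, m, samples=20000):
    """Midpoint-rule estimate of the L^2 distance between f_n and f_m on [0, 1]."""
    h = 1.0 / samples
    total = sum((f(n, (k + 0.5) * h) - f(m, (k + 0.5) * h)) ** 2
                for k in range(samples))
    return (total * h) ** 0.5

def sup_distance(n, m, samples=20000):
    """Grid estimate of the uniform distance between f_n and f_m on [0, 1]."""
    return max(abs(f(n, k / samples) - f(m, k / samples))
               for k in range(samples + 1))

for n in (10, 100):
    print(n, l2_distance(n, 2 * n), sup_distance(n, 2 * n))
```

In this sketch the estimated $L^2$ distance between $f_n$ and $f_{2n}$ shrinks as $n$ grows, while the estimated uniform distance stays near $\frac{1}{4}$, consistent with $(f_n)$ being Cauchy for $\rho_2$ but not for $\rho$.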
It is easy to extend Theorem 7.15.7 to spaces of vector-valued functions. Given
$p \in \mathbb{N}$, let $B(X, \mathbb{R}^p)$ denote the space of bounded $\mathbb{R}^p$-valued functions on $X$,
where bounded is relative to one of the standard (equivalent) metrics on $\mathbb{R}^p$: $d_2$, $d_1$
or $d_\infty$ (because the metrics are equivalent, $B(X, \mathbb{R}^p)$ does not depend on which
particular metric from $d_2, d_1, d_\infty$ we choose). We may similarly define $B^0(X, \mathbb{R}^p)$
and $C^0(X, \mathbb{R}^p)$. Define the uniform metric in the usual way by $\rho(f, g) =
\sup_{x \in X} d(f(x), g(x))$, where $d$ is the metric on $\mathbb{R}^p$ and if $f, g \in C^0(X, \mathbb{R}^p)$, we assume
$X$ is compact. We have the following useful corollary of Theorem 7.15.7.
Theorem 7.15.9 Let $(X, d)$ be a metric space and $p \in \mathbb{N}$.
(1) $(B(X, \mathbb{R}^p), \rho)$ and $(B^0(X, \mathbb{R}^p), \rho)$ are complete.
(2) If $X$ is compact, then $(C^0(X, \mathbb{R}^p), \rho)$ is complete.
Proof Choose the metric $d_\infty$ on $\mathbb{R}^p$. Suppose $(f^n) \subset B(X, \mathbb{R}^p)$ is a Cauchy
sequence. If we write $f^n$ in component form as $(f_1^n, \dots, f_p^n)$, then $\rho(f^n, f^m) =
\sup_{x \in X} \max_{1 \le i \le p} |f_i^n(x) - f_i^m(x)|$ and so $(f_i^n) \subset B(X, \mathbb{R})$ is Cauchy, $1 \le i \le p$.
Now apply Theorem 7.15.7 to deduce that there exists an $f^\star = (f_1^\star, \dots, f_p^\star)$ such
that $\lim_{n \to \infty} f_i^n = f_i^\star$, $1 \le i \le p$. Since each component of $f^\star$ is bounded, it
is immediate that $f^\star$ is bounded in the $d_\infty$ metric (the bound is the maximum of
the bounds of the components) and that $(f^n)$ converges to $f^\star$ in $(B(X, \mathbb{R}^p), \rho)$. The
continuity statements are immediate since $f^\star$ is continuous iff each component of $f^\star$
is continuous. □
7.15.2 Completion of a Metric Space
Definition 7.15.10 Let $(X, d)$ be a metric space. A completion of $(X, d)$ consists of
a metric space $(\hat{X}, \hat{d})$ such that
(1) $(\hat{X}, \hat{d})$ is a complete metric space.
(2) $X$ is a dense subset of $\hat{X}$ (closure is relative to the metric $\hat{d}$).
(3) $\hat{d}_X = d$ (the metric $\hat{d}_X$ induced by $\hat{d}$ on $X$ equals $d$).
Example 7.15.11 A completion of $(\mathbb{Q}, |\cdot|)$ is $(\mathbb{R}, |\cdot|)$.
First we prove that if $(X, d)$ has a completion then the completion is essentially
unique. Then we shall show that every metric space has a completion. Our proof
will depend on the completeness of $(\mathbb{R}, |\cdot|)$.
Before we prove the uniqueness of a completion, we need to review the definition
of an isometry. Let $(X, d)$, $(Y, \bar{d})$ be metric spaces. Recall (see Exercises 7.11.10(3))
that an isometry of $X$ and $Y$ is a 1:1 onto map $F: X \to Y$ such that
$$\bar{d}(F(x), F(x')) = d(x, x'), \quad \text{for all } x, x' \in X.$$
We remark that if $F: X \to Y$ is an isometry then so is $F^{-1}: Y \to X$. Both $F$
and $F^{-1}$ are obviously continuous. If $F: X \to Y$ is an isometry, then $X$ and $Y$ are
indistinguishable as metric spaces and we say $X$ and $Y$ are isometric. We show that
any two completions of a metric space are isometric. More precisely, we prove
Proposition 7.15.12 Let $(\hat{X}_1, \hat{d}_1)$, $(\hat{X}_2, \hat{d}_2)$ be completions of the metric space
$(X, d)$. Then there exists a unique isometry $F: \hat{X}_1 \to \hat{X}_2$ such that $F$ restricts to
the identity map on the subspace $X$ of $\hat{X}_1$.
Proof Let $x^\star \in \hat{X}_1$. Since $X$ is dense in $\hat{X}_1$, there exists a sequence $(x_n) \subset X$ which
converges to $x^\star$ in $(\hat{X}_1, \hat{d}_1)$. Necessarily $(x_n)$ is a Cauchy sequence (with respect
to $d$) and so $(x_n)$ is a Cauchy sequence in $(\hat{X}_2, \hat{d}_2)$ (since $\hat{d}_2$ induces the metric $d$
on $X$). Since $(\hat{X}_2, \hat{d}_2)$ is complete, $(x_n)$ converges in $(\hat{X}_2, \hat{d}_2)$, say to $y^\star$. We claim
that $y^\star$ depends only on $x^\star$ and not on the particular choice of sequence $(x_n) \subset X$
converging to $x^\star$. This is clear since if $(x'_n) \subset X$ converges to $x^\star$ then $d(x_n, x'_n) \to 0$
as $n \to \infty$ and so $\lim_{n \to \infty} \hat{d}_2(x_n, x'_n) = 0$. Hence $(x_n)$ and $(x'_n)$ have the same limit
in $(\hat{X}_2, \hat{d}_2)$. We define $F: \hat{X}_1 \to \hat{X}_2$ by $F(x^\star) = y^\star$. If $x^\star \in X$, then $x^\star = y^\star$ and so
$F$ restricts to the identity map on $X$. If we reverse the construction, to define a map
$G: \hat{X}_2 \to \hat{X}_1$, then it is easy to check that $G \circ F$ is the identity on $\hat{X}_1$ and $F \circ G$
is the identity on $\hat{X}_2$. Therefore $F$ is 1:1 onto. We must check that $F$ is an isometry.
Suppose $(x_n), (z_n) \subset X$ are convergent sequences with respective limits $x^\star, z^\star \in \hat{X}_1$.
We have
$$\begin{aligned}
\hat{d}_2(F(x^\star), F(z^\star)) &= \lim_{n \to \infty} \hat{d}_2(F(x_n), F(z_n)) \\
&= \lim_{n \to \infty} d(x_n, z_n) \\
&= \lim_{n \to \infty} \hat{d}_1(x_n, z_n) \\
&= \hat{d}_1(x^\star, z^\star).
\end{aligned}$$
Finally, we must show that $F$ is unique. Suppose that $F, F': \hat{X}_1 \to \hat{X}_2$ satisfy the
conditions of the proposition. Then $F = F'$ on $X$. Since $X$ is a dense subset of $\hat{X}_1$
and $F, F'$ are continuous, $F = F'$ by Proposition 7.12.15. □
Corollary 7.15.13 If $(X, d)$ is a complete metric space and $(\hat{X}, \hat{d})$ is a completion
of $X$, then $\hat{X} = X$, $\hat{d} = d$.
Theorem 7.15.14 Let $(X, d)$ be a metric space. Then $(X, d)$ has a completion
$(\hat{X}, \hat{d})$.
Proof The metric space $(B^0(X, \mathbb{R}), \rho)$ is complete (Theorem 7.15.7; as usual $\rho$ denotes the uniform metric). We construct a completion of $X$ by defining an isometry $\Theta$ of $X$ onto a subspace $Z$ of $B^0(X, \mathbb{R})$. Identifying $X$ with $Z$, we define the completion of $X$ to be $(\overline{Z}, \rho)$. In order to define $\Theta$, fix $a \in X$. For $x \in X$, define $\Theta(x)$ to be the map $f_x : X \to \mathbb{R}$ where
\[ f_x(y) = d(x, y) - d(y, a), \quad y \in X. \]
Lemma 7.1.4 implies that $|f_x(y)| = |d(x, y) - d(y, a)| \le d(x, a)$ and so $f_x$ is bounded. Since $f_x$ is continuous, $f_x \in B^0(X, \mathbb{R})$ and so $\Theta : X \to B^0(X, \mathbb{R})$. It remains to prove that $\Theta$ is an isometry onto $\Theta(X) \subset (B^0(X, \mathbb{R}), \rho)$. For $x, x' \in X$, we have
\[
\begin{aligned}
\rho(f_x, f_{x'}) &= \sup_{y \in X} |f_x(y) - f_{x'}(y)| \\
&= \sup_{y \in X} |d(x, y) - d(y, a) - (d(x', y) - d(y, a))| \\
&= \sup_{y \in X} |d(x, y) - d(x', y)| \\
&= d(x, x'),
\end{aligned}
\]
where the last equality follows by taking $y = x'$ and using Lemma 7.1.4. □
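The embedding $x \mapsto f_x$ used in this proof can be checked numerically on a finite metric space. The following is a minimal sketch, assuming a hypothetical five-point sample of the plane with the Euclidean metric; the names `points`, `theta` and `rho` are illustrative and do not come from the text.

```python
import math

# Hypothetical finite sample of the plane (an assumption for illustration).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (-1.0, 1.5)]

def d(p, q):
    """Euclidean metric on the sample."""
    return math.dist(p, q)

a = points[0]  # the fixed base point a

def theta(x):
    """The function f_x, stored as the tuple of values d(x, y) - d(y, a), y in points."""
    return tuple(d(x, y) - d(y, a) for y in points)

def rho(f, g):
    """Uniform metric; the supremum is a maximum because the domain is finite."""
    return max(abs(u - v) for u, v in zip(f, g))

# Largest deviation of rho(f_x, f_x') from d(x, x') over all pairs.
max_defect = max(
    abs(rho(theta(x), theta(xp)) - d(x, xp)) for x in points for xp in points
)

# Boundedness estimate |f_x(y)| <= d(x, a), up to rounding.
sup_bound_ok = all(
    max(abs(v) for v in theta(x)) <= d(x, a) + 1e-12 for x in points
)
```

Here `max_defect` measures how far the embedding is from being an isometry on the sample; it should vanish up to floating point rounding.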
Remark 7.15.15 Theorem 7.15.14 highlights the pivotal role of the completeness of
the real numbers—the completion of Q.
7.15.3 Category
Recall that a subset $E$ of a metric space $X$ is dense if $\overline{E} = X$. At the opposite extreme we can formalize the idea of a subset which is ‘nowhere dense’.
Definition 7.15.16 A subset $E$ of the metric space $X$ is nowhere dense if the closure of $E$ has empty interior: $(\overline{E})^{\circ} = \emptyset$.
Examples 7.15.17
(1) Suppose that X contains no isolated points. Any finite subset of X is nowhere
dense.
(2) If $(x_n)$ is a convergent sequence in $\mathbb{R}$, then $E = \{x_n \mid n \in \mathbb{N}\}$ is nowhere dense.
(3) The middle thirds Cantor set C is nowhere dense (even though C is compact
and uncountable).
Definition 7.15.18 A subset E of the metric space X is of the first category if E
can be written as a countable union of nowhere dense sets; if E is not of the first
category, it is of the second category.
Lemma 7.15.19 Let $E$ be a nowhere dense subset of the metric space $X$. If $U \subset X$ is open and non-empty, then there exists a non-empty open subset $V$ of $U$ such that $V \cap E = \emptyset$.
Proof Suppose the contrary. Then given $x \in U$, every open neighbourhood of $x$ meets $E$ and so $x \in \overline{E}$. Since this holds for all $x \in U$, $U \subset \overline{E}$ and so $(\overline{E})^{\circ} \ne \emptyset$, contradicting the assumption that $E$ is nowhere dense. □
Theorem 7.15.20 (Baire Category Theorem) A complete metric space is of the second category.
Proof Suppose the contrary and that the metric space $X$ can be written as $X = \bigcup_{n=1}^{\infty} A_n$, where the $A_n$ are nowhere dense subsets of $X$. Apply Lemma 7.15.19 to choose a closed disk $\overline{D}_{r_1}(x_1)$ such that $\overline{D}_{r_1}(x_1) \cap A_1 = \emptyset$. Apply Lemma 7.15.19 again with $U = D_{r_1}(x_1)$ and $E = A_2$, to find a closed disk $\overline{D}_{r_2}(x_2) \subset \overline{D}_{r_1}(x_1)$ with $\overline{D}_{r_2}(x_2) \cap A_2 = \emptyset$ and $r_2 \le 2^{-1} r_1$. Proceeding inductively, we obtain a decreasing sequence $D_n = \overline{D}_{r_n}(x_n)$ of closed disks such that $D_n \cap A_n = \emptyset$ and $r_n \le 2^{-(n-1)} r_1$. We claim that $\bigcap_{n \ge 1} D_n \ne \emptyset$. This follows since $(x_n)$ is a Cauchy sequence in $X$ (as $r_n \to 0$) and so, by the completeness of $X$, $\lim_{n\to\infty} x_n = x^\star$ exists. Since each $D_n$ is closed and $D_n \supset \{x_m \mid m \ge n\}$, $x^\star \in D_n$ for all $n \ge 1$ and so $x^\star \in \bigcap_{n \ge 1} D_n$. But by construction $x^\star \notin A_n$ for all $n$, contradicting the assumption that $X = \bigcup_{n=1}^{\infty} A_n$. □
Corollary 7.15.21 Let $Z$ be a complete metric space. If $V_n$ is an open and dense subset of $Z$, $n \in \mathbb{N}$, then $\bigcap_{n \ge 1} V_n \ne \emptyset$.
Proof If $V_n$ is an open and dense subset of $Z$, then the closed set $F_n = Z \setminus V_n$ is nowhere dense in $Z$. Now $Z \setminus \bigcap_{n \ge 1} V_n = \bigcup_{n \ge 1} F_n$ and since $Z$ is second category, $\bigcup_{n \ge 1} F_n \ne Z$. Hence $\bigcap_{n \ge 1} V_n \ne \emptyset$. □
Corollary 7.15.22 Let $X$ be a complete metric space. Suppose that for $n \ge 1$, $U_n$ is an open and dense subset of $X$. Then $\bigcap_{n \ge 1} U_n$ is a dense subset of $X$.
Proof It suffices to show that
\[ \Big(\bigcap_{n \ge 1} U_n\Big) \cap \overline{D}_r(x) = \bigcap_{n \ge 1} \big(U_n \cap \overline{D}_r(x)\big) \ne \emptyset \]
for all $x \in X$, $r > 0$. Take $Z = \overline{D}_r(x)$ and $V_n = U_n \cap \overline{D}_r(x)$. Since $Z = \overline{D}_r(x)$ is a closed subset of a complete metric space, $Z$ is complete in the induced metric and so Corollary 7.15.21 applies. □
EXERCISES 7.15.23
(1) Prove parts (1) and (2) of Lemma 7.15.2.
(2) Show that $(B^0(X, \mathbb{Q}), \rho)$ is not complete ($\rho$ as usual is the uniform metric). What is the completion of $(B^0(X, \mathbb{Q}), \rho)$? (Hint for the last part: does $B^0(X, \mathbb{Q}) = C^0(X, \mathbb{Q})$?)
(3) Let $C^1([a, b])$ denote the space of $C^1$ functions on $[a, b]$. Show that $(C^1([a, b]), \rho)$ is not complete ($\rho$ denotes the uniform metric). Show that if we define $\rho_1(f, g) = \rho(f, g) + \rho(f', g')$, then $(C^1([a, b]), \rho_1)$ is complete.
(4) For $x, y \in \mathbb{R}$, define $\rho(x, y) = \left| \frac{e^x}{1 + e^x} - \frac{e^y}{1 + e^y} \right|$. Verify that $\rho$ is a metric on $\mathbb{R}$ that defines the same open sets as the standard metric on $\mathbb{R}$. Show that $(\mathbb{R}, \rho)$ is not complete. What is the completion of $(\mathbb{R}, \rho)$? (Specifically, what is $\hat{\mathbb{R}} \setminus \mathbb{R}$?)
(5) Let $(X, d)$, $(Y, \bar{d})$ be metric spaces. Define the uniform metric $\rho$ on $B^0(X, Y)$ by $\rho(f, g) = \sup_{x \in X} \bar{d}(f(x), g(x))$ (see Exercises 7.13.24(13) when $X$ is compact). Show that $B^0(X, Y)$ is complete iff $Y$ is complete. Prove a similar result for $C^0(X, Y)$ in case $X$ is compact.
(6) Show that the metric space $(X, d)$ is complete if $d$ is the discrete metric.
(7) Let $Y$ be a subspace of the metric space $(X, d)$. Show that if $Y$ is complete in the induced metric then $Y$ is a closed subset of $X$. If $Y$ is a closed subset of $(X, d)$, need $Y$ be complete in the induced metric?
(8) Let $(X, d)$, $(Y, \bar{d})$ be metric spaces and $F : X \to Y$ be an isometry. Show that $(X, d)$ is complete iff $(Y, \bar{d})$ is complete.
(9) Let $d, \bar{d}$ be equivalent metrics on $X$ (see Exercises 7.1.9(11) for the definition of equivalent metric). Show that $(X, d)$ is complete iff $(X, \bar{d})$ is complete.
(10) Let $(X, d)$ be a metric space. Construct a completion $(\hat{X}, \hat{d})$ along the lines of section “Appendix: Construction of R Revisited”. That is, let $\mathcal{C}$ denote the set of all Cauchy sequences of points of $X$ and partition $\mathcal{C}$ by the equivalence relation $(x_n) \sim (y_n)$ iff $\lim_{n\to\infty} d(x_n, y_n) = 0$. Let $\hat{X}$ denote the set of equivalence classes. Show that there is a natural way to define a metric $\hat{d}$ on $\hat{X}$ so that $(\hat{X}, \hat{d})$ is a completion of $(X, d)$.
(11) Show that a nowhere dense set has no isolated points.
(12) Show that it is not possible to find a countable subset $\{a_n \mid n \in \mathbb{N}\}$ of $\mathbb{R}$ such that $\bigcup_{n \ge 1} (a_n + C) \supset [0, 1]$. Here $a_n + C = \{a_n + x \mid x \in C\}$ is $C$ translated by $a_n$. (See also Exercises 8.3.2(8).)
(13) Let $X$ be a metric space. Suppose that $F_1 \supset F_2 \supset \cdots$ is a decreasing sequence of non-empty closed subsets of $X$ such that $\lim_{n\to\infty} D(F_n) = 0$, where $D(F_n)$ denotes the diameter of $F_n$. Show that if $X$ is complete then $\bigcap_{n \ge 1} F_n$ is nonempty and consists of precisely one point. Show, by means of examples, that if we omit either of the conditions (a) $F_n$ closed, or (b) $(X, d)$ complete, then the intersection may be empty. (Hint for the first part: see the proof of Theorem 7.15.20.)
7.16 Equicontinuity and the Arzelà–Ascoli Theorem
Let $(X, d)$ be a compact$^1$ metric space and let $(C^0(X), \rho)$ denote the space of continuous real-valued functions on $X$ with the uniform metric
\[ \rho(f, g) = \sup_{x \in X} |f(x) - g(x)|, \quad f, g \in C^0(X). \]
As we showed in the previous section, $(C^0(X), \rho)$ is a complete metric space. In this section we give a characterization of the compact subsets of $C^0(X)$. Although we restrict to real-valued continuous maps on $X$, what we say generalizes easily to the metric space of continuous maps $f : X \to Y$, where $Y$ is a complete metric space (see Exercises 7.13.24(13) for the definition and properties of the uniform metric on $C^0(X, Y)$ and note that if $Y$ is complete so is $C^0(X, Y)$). The methods we use are a synthesis of many of the ideas and results we have developed for the study and description of metric spaces. We conclude the section with an application of our main result (the Arzelà–Ascoli theorem) to the existence theory of ordinary differential equations.
We start by recalling Theorem 7.13.6, the generalization of the Bolzano–Weierstrass theorem to $\mathbb{R}^n$: a subset $Z$ of $\mathbb{R}^n$ is compact iff $Z$ is closed and bounded (relative to the Euclidean norm on $\mathbb{R}^n$). Our aim in this section is to obtain an analogous characterization of compact subsets of $C^0(X)$. First, however, we remark that it is easy to see that a closed and bounded subset of $C^0(X)$ need not be sequentially compact.
Example 7.16.1 Let $X = [0, 1] \subset \mathbb{R}$ (induced metric) and define $E = \overline{D}_1(0)$. Certainly $E$ is a closed and bounded subset of $C^0(X)$. If we define $f_n(x) = x^n$, $n \in \mathbb{N}$, then $(f_n) \subset E$. Since the pointwise limit of $(f_n)$ is discontinuous, there are no convergent subsequences of $(f_n)$ (in the uniform metric).
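The failure of compactness in Example 7.16.1 can also be seen directly from distances: writing $u = x^n$, we have $|x^n - x^{2n}| = u - u^2$, whose maximum over $u \in [0, 1]$ is $1/4$, so $\rho(f_n, f_{2n}) = 1/4$ for every $n$. The sketch below approximates this on a grid; the grid size and the helper names `rho` and `power` are assumptions made for illustration.

```python
def rho(f, g, grid):
    """Grid approximation of the uniform metric sup |f(x) - g(x)|."""
    return max(abs(f(x) - g(x)) for x in grid)

# Uniform grid on [0, 1]; the resolution is an arbitrary choice.
grid = [i / 10000 for i in range(10001)]

def power(n):
    """The function f_n(x) = x**n."""
    return lambda x: x ** n

# Approximate rho(f_n, f_2n) for a few values of n.
distances = [rho(power(n), power(2 * n), grid) for n in (1, 5, 25, 125)]
```

Each entry of `distances` should be close to $1/4$, which is consistent with $(f_n)$ having no uniformly Cauchy subsequence of the form considered.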
Definition 7.16.2 Let $E$ be a subset of $C^0(X)$.
(1) $E$ is pointwise bounded if for every $x \in X$, there exists an $M_x \ge 0$ such that $|f(x)| \le M_x$ for all $f \in E$.
(2) $E$ is uniformly bounded if there exists an $R \ge 0$ such that $f \in \overline{D}_R(0)$ for all $f \in E$.
Obviously if $E$ is uniformly bounded then $E$ is pointwise bounded. The converse is easily seen to be false.
$^1$ Throughout this section, ‘compact’ is to be understood as sequentially compact.
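Example 7.16.3 Take $X = [0, 1]$ and for $n \in \mathbb{N}$ define
\[
f_n(x) =
\begin{cases}
0, & 0 \le x \le \frac{1}{n+1}, \\
2n^2(n+1)\left(x - \frac{1}{n+1}\right), & \frac{1}{n+1} \le x \le \frac{1}{2}\left(\frac{1}{n} + \frac{1}{n+1}\right), \\
2n - 2n^2(n+1)\left(x - \frac{1}{n+1}\right), & \frac{1}{2}\left(\frac{1}{n} + \frac{1}{n+1}\right) \le x \le \frac{1}{n}, \\
0, & x \ge \frac{1}{n}.
\end{cases}
\]
The function $f_n$ takes its maximum value $n$ at the midpoint of $[\frac{1}{n+1}, \frac{1}{n}]$ and is zero on the complement of $[\frac{1}{n+1}, \frac{1}{n}]$. Hence $(f_n) \subset C^0(X)$ is pointwise bounded but not uniformly bounded.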
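The tent functions of Example 7.16.3 are easy to evaluate exactly with rational arithmetic. The following sketch (the function names `f` and `peak` are illustrative) checks that the peak of $f_n$ equals $n$, while at a fixed point such as $x = 2/5$ only finitely many $f_n$ are non-zero.

```python
from fractions import Fraction

def f(n, x):
    """Tent function f_n of Example 7.16.3, evaluated at a rational x in [0, 1]."""
    left = Fraction(1, n + 1)
    right = Fraction(1, n)
    mid = (left + right) / 2
    slope = 2 * n * n * (n + 1)
    if x <= left or x >= right:
        return Fraction(0)
    if x <= mid:
        return slope * (x - left)          # rising edge
    return 2 * n - slope * (x - left)      # falling edge

def peak(n):
    """Value of f_n at the midpoint of [1/(n+1), 1/n]."""
    mid = (Fraction(1, n + 1) + Fraction(1, n)) / 2
    return f(n, mid)

# Pointwise bound at x = 2/5 over the first 199 functions.
bound_at_two_fifths = max(f(n, Fraction(2, 5)) for n in range(1, 200))
```

Only $f_2$ is non-zero at $x = 2/5$, so the pointwise bound there is $f_2(2/5) = 8/5$, whereas `peak(n)` grows without bound.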
Next we introduce a definition that plays a crucial role in our description of compact subsets of $C^0(X)$.
Definition 7.16.4 A subset $E$ of $C^0(X)$ is equicontinuous (on $X$) if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that if $d(x, y) < \delta$ then
\[ |f(x) - f(y)| < \varepsilon, \quad \text{for all } f \in E. \]
Examples 7.16.5
(1) If $E$ consists of a single function $f$, equicontinuity is automatic by the uniform continuity of $f$ (Theorem 7.13.8(2)). Consequently, any finite subset of $C^0(X)$ is automatically equicontinuous. Viewed in this way, equicontinuity is the natural generalization of uniform continuity to an infinite family of continuous functions.
(2) An equicontinuous set $E$ need not be pointwise bounded. For example, choose $f \in C^0(X)$ and let $E = \{f + n \mid n \in \mathbb{N}\}$.
(3) Let $\alpha > 0$. A function $f : X \to \mathbb{R}$ is Hölder continuous with exponent $\alpha$ if there exists a $K \ge 0$ such that $|f(x) - f(y)| \le K d(x, y)^{\alpha}$ for all $x, y \in X$. If $\alpha = 1$, $f$ is Lipschitz. Suppose that $E$ is a subset of $C^0(X)$ consisting of Hölder continuous functions all with the same exponent $\alpha$ and same bound $|f(x) - f(y)| \le K d(x, y)^{\alpha}$ (that is, $K > 0$ independent of $f \in E$). Then $E$ is equicontinuous. Indeed, given $\varepsilon > 0$, take $\delta = (\varepsilon / K)^{1/\alpha}$. Since $C^1$ functions are Lipschitz on compact subsets of $\mathbb{R}$ (and $\mathbb{R}^n$) by the mean value theorem, we see that adding a little regularity to our functions can result in big equicontinuous sets.
Lemma 7.16.6 Let $E$ be a subset of $C^0(X)$.
(1) If $E$ is uniformly bounded, then $\overline{E}$ is uniformly bounded.
(2) If $E$ is equicontinuous, then $\overline{E}$ is equicontinuous.
Proof Left to the exercises. □
We now state the main theorem of this section.
Theorem 7.16.7 (Arzelà–Ascoli) A subset $E$ of $(C^0(X), \rho)$ is compact iff
(1) E is pointwise bounded,
(2) E is equicontinuous,
(3) E is closed.
We start by proving a number of preliminary results.
Lemma 7.16.8 Let $E$ be an equicontinuous subset of $C^0(X)$. Then $E$ is pointwise bounded iff $E$ is uniformly bounded.
Proof Suppose that $E$ is pointwise bounded (the converse is trivial). Since $E$ is equicontinuous, given $\varepsilon = 1$, we can choose $\delta > 0$ such that for all $f \in E$ and all $x, y \in X$ with $d(x, y) < \delta$ we have
\[ |f(x) - f(y)| < 1. \tag{7.6} \]
Since $X$ is compact, we can choose a finite subset $P$ of $X$ such that for every $x \in X$, there exists a $p \in P$ such that $d(x, p) < \delta$. Since $E$ is pointwise bounded, for each $p \in P$ there exists an $M_p \ge 0$ such that $|f(p)| \le M_p$ for all $f \in E$. Set $M = \max_{p \in P} M_p$. Given $x \in X$, choose $p \in P$ such that $d(x, p) < \delta$. By (7.6), $|f(x)| \le 1 + |f(p)| \le 1 + M$. Hence $E$ is uniformly bounded. □
Lemma 7.16.9 If $(f_n)$ is a uniformly convergent sequence in $C^0(X)$, then $\{f_n \mid n \in \mathbb{N}\}$ is equicontinuous on $X$.
Proof Let $\varepsilon > 0$. We must find $\delta > 0$ so that $|f_n(x) - f_n(y)| < \varepsilon$ for all $n$ whenever $d(x, y) < \delta$. Since $(f_n)$ is uniformly convergent, $(f_n)$ is a Cauchy sequence in $C^0(X)$ and so there exists an $N \in \mathbb{N}$ such that
\[ |f_n(x) - f_N(x)| < \varepsilon/3, \quad n \ge N,\ x \in X. \tag{7.7} \]
Since continuous functions on a compact metric space are uniformly continuous, there exists a $\delta > 0$ such that for $1 \le i \le N$ we have
\[ |f_i(x) - f_i(y)| < \varepsilon/3 < \varepsilon, \quad \forall x, y \in X \text{ satisfying } d(x, y) < \delta. \tag{7.8} \]
If $n > N$ and $d(x, y) < \delta$, we have by (7.7) and (7.8) with $i = N$,
\[ |f_n(x) - f_n(y)| \le |f_n(x) - f_N(x)| + |f_N(x) - f_N(y)| + |f_N(y) - f_n(y)| < \varepsilon. \]
Together with (7.8), this completes the proof that $\{f_n \mid n \in \mathbb{N}\}$ is equicontinuous on $X$. □
Lemma 7.16.10 Let $Q$ be a dense subset of $X$ and $(f_n) \subset C^0(X)$ be a sequence satisfying
(1) $\{f_n \mid n \in \mathbb{N}\}$ is equicontinuous on $X$.
(2) $(f_n(q))$ is convergent for all $q \in Q$.
Then $(f_n)$ is uniformly convergent.
Proof It suffices to prove that $(f_n)$ is a Cauchy sequence in $C^0(X)$. That is, given $\varepsilon > 0$, we claim there exists an $N \in \mathbb{N}$ such that $\rho(f_m, f_n) < \varepsilon$ for all $m, n \ge N$. Since $(f_n)$ is equicontinuous on $X$, we may choose $\delta > 0$ so that for all $n$
\[ |f_n(x) - f_n(y)| < \varepsilon/3, \quad \text{for all } x, y \in X \text{ such that } d(x, y) < \delta. \tag{7.9} \]
Since $Q$ is a dense subset of the compact space $X$, we can choose a finite subset $\bar{Q}$ of $Q$ satisfying $d(x, \bar{Q}) < \delta$ for all $x \in X$. Since $(f_n(q))$ is convergent for all $q \in \bar{Q}$ and $\bar{Q}$ is finite, we may choose $N \in \mathbb{N}$ so that for all $m, n \ge N$ we have
\[ |f_m(q) - f_n(q)| < \varepsilon/3, \quad \text{for all } q \in \bar{Q}. \tag{7.10} \]
Given $x \in X$, choose $q \in \bar{Q}$ so that $d(x, q) < \delta$. We have
\[
\begin{aligned}
|f_m(x) - f_n(x)| &\le |f_m(x) - f_m(q)| + |f_m(q) - f_n(q)| + |f_n(q) - f_n(x)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \quad \text{if } n, m \ge N,
\end{aligned}
\]
where the second inequality follows from (7.9), (7.10). □
The final result we need before we prove the Arzelà–Ascoli theorem shows that we can always construct a subsequence satisfying the second condition of Lemma 7.16.10. This result uses neither continuity nor a metric.
Lemma 7.16.11 Let $(f_n)$ be a pointwise bounded sequence of real-valued functions defined on a countable set $Q$. Then there exists a subsequence $(f_{n_k})$ of $(f_n)$ which is pointwise convergent on $Q$.
Proof Since $Q$ is countable, we may write $Q = \{q_1, q_2, \dots\}$. For $k \ge 1$, we construct subsequences $(f^k_n)$ of $(f_n)$ satisfying
(1) $(f^{\ell}_n)$ is a subsequence of $(f^k_n)$ if $\ell > k$.
(2) $(f^k_n(q_i))$ is convergent for $1 \le i \le k$.
The construction is inductive. We start by constructing $(f^1_n)$. Since $(f_n(q_1)) \subset \mathbb{R}$ is bounded, there exists a subsequence $(f^1_n)$ of $(f_n)$ such that $(f^1_n(q_1))$ is convergent. Suppose we have constructed $(f^j_n)$ satisfying (1,2) above for $1 \le j < k$. In order to construct $(f^k_n)$ we repeat the construction we gave for $(f^1_n)$ but with $(f^{k-1}_n)$ replacing $(f_n)$ and $q_k$ replacing $q_1$. Finally, we construct the required subsequence $(f_{n_k})$ of $(f_n)$ by taking the diagonal terms $f_{n_k} = f^k_k$, $k \in \mathbb{N}$. For each fixed $k$, with the exception of at most $k - 1$ terms, the diagonal sequence $(f^j_j)_{j \ge 1}$ is a subsequence of $(f^k_n)$ and so, for all $i \in \mathbb{N}$, $(f^j_j(q_i))_{j \ge 1}$ is convergent. □
Proof of Theorem 7.16.7 Suppose that $E \subset C^0(X)$ satisfies conditions (1,2,3) of Theorem 7.16.7. We must show that every sequence $(f_n) \subset E$ has a subsequence converging to a point of $E$. Since $X$ is compact, $X$ is separable (Theorem 7.13.21) and so we may pick a countable dense subset $Q$ of $X$. If $(f_n) \subset E$, it follows by Lemma 7.16.11 that there exists a subsequence $(f_{n_k})$ of $(f_n)$ which is pointwise convergent on $Q$. Applying Lemma 7.16.10, $(f_{n_k})$ is uniformly convergent on $X$. Since $E$ is closed, $\lim_{k\to\infty} f_{n_k} \in E$ and so $E$ is compact. For the converse, suppose that $E$ is a compact subset of $C^0(X)$. Necessarily, $E$ is closed and bounded (Proposition 7.13.4). Noting Lemma 7.16.8, it suffices to prove that $E$ is equicontinuous. Suppose the contrary. Then there exists an $\varepsilon > 0$ such that for every $n \in \mathbb{N}$, there exist $f_n \in E$, $x_n, y_n \in X$, satisfying
\[ |f_n(x_n) - f_n(y_n)| \ge \varepsilon, \quad d(x_n, y_n) < 1/n. \]
Since $X$ is compact, there exist convergent subsequences $(x_{n_k}), (y_{n_k})$ of $(x_n), (y_n)$ with common limit $x^\star$. Since we assume $E$ is compact, there exists a convergent subsequence $(f_{m_k})$ of $(f_{n_k})$. Necessarily we have $\lim_{k\to\infty} f_{m_k}(x_{m_k}) = \lim_{k\to\infty} f_{m_k}(y_{m_k})$, contradicting the estimate $|f_{m_k}(x_{m_k}) - f_{m_k}(y_{m_k})| \ge \varepsilon$. Hence $E$ is equicontinuous. □
Corollary 7.16.12 Let $(f_n) \subset C^0(X)$. Then $(f_n)$ has a (uniformly) convergent subsequence if $\{f_n \mid n \in \mathbb{N}\}$ is pointwise bounded and equicontinuous.
Proof Suppose that $E = \{f_n \mid n \in \mathbb{N}\}$ is pointwise bounded and equicontinuous. By Lemma 7.16.8, $E$ is uniformly bounded, and so by Lemma 7.16.6, $\overline{E}$ is uniformly bounded (hence pointwise bounded) and equicontinuous. Therefore $\overline{E}$ is compact and $(f_n)$ has a convergent subsequence by Theorem 7.16.7. □
7.16.1 An Application to Differential Equations
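Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and $x_0 \in \mathbb{R}$. If $f$ is bounded (so $f \in B^0(\mathbb{R})$) we show that the ordinary differential equation
\[ \frac{dx}{dt} = f(x) \]
has a $C^1$ solution $\phi : [0, 1] \to \mathbb{R}$ satisfying the initial condition $\phi(0) = x_0$. That is,
\[ \phi'(t) = f(\phi(t)),\ t \in [0, 1], \quad \text{and} \quad \phi(0) = x_0. \]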
Remarks 7.16.13
(1) We need the boundedness condition on $f$, else the solution may escape to infinity in finite time (see the exercises for a simple example). Of course, we can always multiply $f$ by a bump function to obtain a bounded function which is equal to $f$ on some preassigned interval $[-R, R]$. Thus, our result gives solutions defined on some interval $[0, \delta]$ even if $f$ is not bounded.
(2) The choice of the interval $[0, 1]$ is only for convenience. With minor modifications, the proof works for any closed interval $[a, b]$ containing the origin.
(3) The solution we obtain may not be unique (see the exercises). In the next section we show how, if we assume more regularity on $f$, we obtain uniqueness of solutions.
(4) The existence of solutions to the non-autonomous ordinary differential equation $x' = f(t)$ ($f : \mathbb{R} \to \mathbb{R}$ continuous) is immediate from the fundamental theorem of calculus: $\phi(t) = x_0 + \int_0^t f(s)\, ds$. Solutions are unique and exist for all $t \in \mathbb{R}$.
(5) In the exercises we indicate the extension of the existence result to non-autonomous ordinary differential equations on $\mathbb{R}^n$, $n \ge 1$.
We give a proof of the existence of solutions to $\frac{dx}{dt} = f(x)$ based on the Euler method and equicontinuity. The basic idea is to define a sequence of continuous piecewise linear approximations to a solution, prove equicontinuity of the sequence and then apply Corollary 7.16.12 to get a subsequence converging to a solution of the differential equation. (In the next section we use the easier Picard method to construct a sequence which converges to a solution. However, for this method to work we need greater regularity of the function $f(x)$.)
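Let $n \in \mathbb{N}$. For $0 \le i \le n$, let $t_i = i/n$. Define the continuous piecewise linear map $\phi_n : [0, 1] \to \mathbb{R}$ by
\[
\begin{aligned}
\phi_n(0) &= x_0, \\
\phi_n'(t) &= f(\phi_n(t_i)) \quad \text{if } t \in (t_i, t_{i+1}), \\
\phi_n'(t+) &= f(\phi_n(t_i)) \quad \text{if } t = t_i, \\
\phi_n'(t-) &= f(\phi_n(t_i)) \quad \text{if } t = t_{i+1}.
\end{aligned}
\]
We remark that $\phi_n$ is piecewise $C^1$ with jumps in the derivative at $t = t_i$, $0 < i < n$. Define
\[
\delta_n(t) =
\begin{cases}
\phi_n'(t) - f(\phi_n(t)), & t \notin \{t_1, \dots, t_{n-1}\}, \\
0, & t \in \{t_1, \dots, t_n\}.
\end{cases}
\]
Note that $\delta_n(0) = 0$. We have
\[ \phi_n(t) = x_0 + \int_0^t \big(f(\phi_n(s)) + \delta_n(s)\big)\, ds, \quad t \in [0, 1]. \tag{7.11} \]
Since $f : \mathbb{R} \to \mathbb{R}$ is bounded, we may choose $M \ge 0$ such that $|f(x)| \le M$ for all $x \in \mathbb{R}$. Let $\|\cdot\|$ denote the uniform norm on $C^0([0, 1])$ ($\|f\| = \rho(f, 0)$). We have the following estimates
(a) $|\phi_n(t) - \phi_n(t')| \le M|t - t'|$, $t, t' \in [0, 1]$ (by (7.11) and $|\phi_n'(t\pm)| \le M$ for all $t \in [0, 1]$).
(b) $\|\delta_n\| \le 2M$ (by (a) and the definition of $\delta_n$).
(c) $\|\phi_n\| \le |x_0| + M$ (by (7.11) and $|\phi_n'(t\pm)| \le M$).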
Statement (a) implies that $(\phi_n)$ is equicontinuous on $[0, 1]$ (see Examples 7.16.5(3)). Statement (c) implies $(\phi_n)$ is uniformly bounded. Applying Corollary 7.16.12, there is a uniformly convergent subsequence $(\phi_{n_k})$ of $(\phi_n)$, with limit $\phi$. The map $f$ is uniformly continuous on $[-M - |x_0|, M + |x_0|]$ and so $(f \circ \phi_{n_k})$ converges uniformly (exercise) to $f \circ \phi$ on $[0, 1]$. Moreover, $(\delta_n)$ converges uniformly to $0$ on $[0, 1]$ using the definition of $\delta_n$ and estimate (a). Now let $k \to \infty$ in
\[ \phi_{n_k}(t) = x_0 + \int_0^t \big(f(\phi_{n_k}(s)) + \delta_{n_k}(s)\big)\, ds \]
to obtain $\phi(t) = x_0 + \int_0^t f(\phi(s))\, ds$, $t \in [0, 1]$. By the fundamental theorem of calculus, $\phi(t)$ is a solution of $x' = f(x)$ with initial condition $\phi(0) = x_0$.
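The Euler polygons $\phi_n$ are simple to compute. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the bounded function $f = \cos$ (so $M = 1$) and $x_0 = 0$, for which the exact solution is $2\arctan(\tanh(t/2))$, and measures the sup distance between $\phi_n$ and that solution on a sample of $[0, 1]$. The helper names are hypothetical.

```python
import math

def euler_nodes(f, x0, n):
    """Values phi_n(t_i), t_i = i/n, of the Euler polygon on [0, 1]."""
    h = 1.0 / n
    nodes = [x0]
    for _ in range(n):
        nodes.append(nodes[-1] + h * f(nodes[-1]))
    return nodes

def polygon(nodes, t):
    """Evaluate the piecewise linear map phi_n at t in [0, 1]."""
    n = len(nodes) - 1
    i = min(int(t * n), n - 1)
    return nodes[i] + (t * n - i) * (nodes[i + 1] - nodes[i])

def exact(t):
    """Solution of x' = cos(x), x(0) = 0."""
    return 2.0 * math.atan(math.tanh(t / 2.0))

def sup_error(n, samples=200):
    """Sampled sup distance between phi_n and the exact solution."""
    nodes = euler_nodes(math.cos, 0.0, n)
    return max(
        abs(polygon(nodes, j / samples) - exact(j / samples))
        for j in range(samples + 1)
    )

M = 1.0  # bound on |cos|
errors = {n: sup_error(n) for n in (10, 100, 1000)}

# Estimate (a) at the nodes: consecutive values differ by at most M/n.
nodes_100 = euler_nodes(math.cos, 0.0, 100)
max_jump = max(abs(b - a) for a, b in zip(nodes_100, nodes_100[1:]))
```

For this smooth $f$ the whole sequence converges, so no passage to a subsequence is visible; the compactness argument in the text is what handles merely continuous $f$.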
EXERCISES 7.16.14
(1) Prove Lemma 7.16.6.
(2) Suppose that $(f_n) \subset C^0(X)$ is equicontinuous and pointwise convergent. Prove that $(f_n)$ is uniformly convergent.
(3) Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and define $f_n(t) = f(nt)$, $t \in \mathbb{R}$. Show that if $(f_n)$ is equicontinuous on $\mathbb{R}$, then $f$ is constant.
(4) Let $\alpha \in (0, 1)$. Show that $x \mapsto x^{\alpha}$ is Hölder continuous, exponent $\alpha$, on $[0, a]$ for all $a > 0$. (Hint: prove that $0 \le y^{\alpha} - x^{\alpha} \le (y - x)^{\alpha}$ for all $y \ge x \ge 0$.) Suppose that $f : [0, a] \to \mathbb{R}$ is Hölder continuous with exponent $\alpha > 1$. Show that $f$ is constant.
(5) Let $(X, d)$ be a compact metric space and $(Y, \bar{d})$ be a complete metric space. Let $\rho$ denote the uniform metric on $C^0(X, Y)$. A subset $E$ of $C^0(X, Y)$ is pointwise bounded if for every $x \in X$, there exist $R = R(x) > 0$, $y = y(x) \in Y$ such that $\{f(x) \mid f \in E\} \subset D_R(y)$ and $E$ is uniformly bounded if we can choose $R > 0$, $y \in Y$ such that $\{f(x) \mid f \in E\} \subset D_R(y)$ for all $x \in X$. Verify that with these definitions, Lemmas 7.16.6, 7.16.8 and 7.16.9 all extend to $C^0(X, Y)$. Hence generalize the Arzelà–Ascoli theorem and Corollary 7.16.12 to $(C^0(X, Y), \rho)$.
(6) Find the solution $\phi(t)$ to $x' = x^2$ which has initial condition $\phi(0) = x_0 > 0$. Hence show that if $x_0 > 1$, the solution escapes to $+\infty$ in time $t_c = x_0^{-1} < 1$.
(7) Show that the differential equation $x' = x^{1/3}$ does not have unique solutions $\phi(t)$ with $\phi(0) = 0$. (Hint: one solution is $\phi(t) \equiv 0$; find others.)
(8) Generalize the existence theorem for ordinary differential equations to $x' = f(x)$, $x \in \mathbb{R}^m$, where $f : \mathbb{R}^m \to \mathbb{R}^m$ is continuous and bounded. Extend to non-autonomous equations $x' = f(x, t)$ by defining the new variable $x_{m+1}$ satisfying $x_{m+1}' = 1$, $x_{m+1}(0) = 0$.
7.17 The Contraction Mapping Lemma
In this section we prove one of the most interesting results about self maps of
a complete metric space: the contraction mapping lemma. We start with some
definitions and preliminary results and, after giving a proof of the contraction
mapping lemma, describe two applications.
Definition 7.17.1 A map $f : X \to X$ of the metric space $(X, d)$ is called a contraction mapping or contraction if there exists a $k$, $0 \le k < 1$, such that
\[ d(f(x), f(x')) \le k\, d(x, x'), \quad \text{for all } x, x' \in X. \]
We call $k$ a contraction constant for $f$. The infimum of the set of all contraction constants for $f$ is called the contraction constant of $f$.
Example 7.17.2 Let $f : \mathbb{R} \to \mathbb{R}$ be a $C^1$ map and assume that $\sup_{x \in \mathbb{R}} |f'(x)| = k < 1$. Then $f$ is a contraction mapping with contraction constant $k$. Indeed, by the mean value theorem, for all $x < y \in \mathbb{R}$, $|f(x) - f(y)| = |f'(z)||x - y|$ for some $z \in [x, y]$ and so $|f(x) - f(y)| \le k|x - y|$. Note that $k$ is the contraction constant of $f$.
Lemma 7.17.3 A contraction mapping is uniformly continuous.
Proof Let $f : X \to X$ be a contraction mapping with contraction constant $k > 0$. For all $\varepsilon > 0$, we have $d(f(x), f(x')) < \varepsilon$ if $d(x, x') < \delta = \varepsilon / k$. Hence $f$ is uniformly continuous. □
Remark 7.17.4 Lemma 7.17.3 holds if $d(f(x), f(x')) \le k\, d(x, x')$ where $\infty > k \ge 0$. Maps which satisfy this condition are called Lipschitz and the smallest value of $k$ for which the inequality holds is called the Lipschitz constant of $f$. In particular, a Lipschitz map is a contraction iff it has Lipschitz constant strictly less than 1.
Definition 7.17.5 Let $f : X \to X$. A point $x^\star \in X$ is a fixed point of $f$ if
\[ f(x^\star) = x^\star. \]
Theorem 7.17.6 (Contraction Mapping Lemma) If $f : X \to X$ is a contraction mapping of the complete metric space $(X, d)$, then
(1) the map $f$ has a unique fixed point $x^\star$,
(2) given any point $x_0 \in X$, if we define the sequence $(x_n)$ by $x_{n+1} = f(x_n)$, $n \ge 0$, then $\lim_{n\to\infty} x_n = x^\star$.
Proof Suppose $f$ has contraction constant $k$. We start by showing that if $f$ has a fixed point, then the fixed point is unique. Suppose then that $x^\star, y^\star$ are fixed points of $f$. Since $f$ is a contraction, and $f(x^\star) = x^\star$, $f(y^\star) = y^\star$, we have
\[ d(x^\star, y^\star) = d(f(x^\star), f(y^\star)) \le k\, d(x^\star, y^\star). \]
Since $k < 1$, the only way we can satisfy $d(x^\star, y^\star) \le k\, d(x^\star, y^\star)$ is if $d(x^\star, y^\star) = 0$. That is, $x^\star = y^\star$.
In order to prove the existence of a fixed point, it suffices to prove (2). Fix $x_0 \in X$ and define $(x_n)$ by $x_{n+1} = f(x_n)$, $n \ge 0$. We prove that $(x_n)$ is a Cauchy sequence. Let $n > m$. Then
\[ d(x_m, x_n) \le d(x_m, x_{m+1}) + d(x_{m+1}, x_{m+2}) + \cdots + d(x_{n-1}, x_n). \tag{7.12} \]
Now given $r \in \mathbb{N}$, we have
\[
\begin{aligned}
d(x_r, x_{r+1}) &= d(f(x_{r-1}), f(x_r)) \\
&\le k\, d(x_{r-1}, x_r) = k\, d(f(x_{r-2}), f(x_{r-1})) \\
&\le \cdots \le k^r d(x_0, x_1).
\end{aligned}
\]
Substituting this estimate in (7.12), we get
\[
\begin{aligned}
d(x_m, x_n) &\le (k^m + k^{m+1} + \cdots + k^{n-1})\, d(x_0, x_1) \\
&= k^m (1 + \cdots + k^{n-m-1})\, d(x_0, x_1) \\
&\le k^m \Big(\sum_{j=0}^{\infty} k^j\Big) d(x_0, x_1) \\
&= \frac{k^m}{1 - k}\, d(x_0, x_1).
\end{aligned}
\]
Therefore, $d(x_m, x_n) \le \frac{k^m}{1-k} d(x_0, x_1)$, $n > m$, and so, since $k < 1$, $\lim_{n,m\to\infty} d(x_m, x_n) = 0$, proving that $(x_n)$ is a Cauchy sequence. Since $(X, d)$ is complete, $(x_n)$ converges. Denote the limit by $x^\star$. We have $\lim_{n\to\infty} d(x_{n+1}, x_n) = d(x^\star, x^\star) = 0$. But $d(x_{n+1}, x_n) = d(f(x_n), x_n)$ and so, since $f$ is (sequentially) continuous (Lemma 7.17.3), we have $0 = \lim_{n\to\infty} d(f(x_n), x_n) = d(f(x^\star), x^\star)$, proving that $f(x^\star) = x^\star$. □
Remark 7.17.7 An attractive feature of Theorem 7.17.6 is that the result is constructive: it gives a simple way of finding the fixed point. Take any initial point, $x_0 \in X$, and iterate by the map $f$. The resulting sequence is Cauchy (even if the space is not complete) and so the iteration works well on a computer. More formally, if $X$ is not complete, let $\hat{X}$ be the completion of $X$. The sequence $(x_n) \subset X$ will converge to a point $x^\star \in \hat{X}$. Even though the sequence may not converge in $X$, the terms will get arbitrarily close to the point $x^\star$ and so the sequence will appear to converge on a computer where one works with finite precision. This situation should be contrasted with the elementary result that every continuous map $f : [0, 1] \to [0, 1]$ has a fixed point. The proof, based on the intermediate value theorem, gives little help in finding
a fixed point. Iteration generally will not work. As a simple example, take the map $f : [0, 1] \to [0, 1]$ defined by $f(x) = 4x(1 - x)$, $x \in [0, 1]$. This map has two fixed points, $0$ and $3/4$, which are easily found by solving $4x(1 - x) = x$. However, if we try to find the fixed points by taking a general point $x_0 \in [0, 1]$ and iterating, the resulting sequence will typically not converge.
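A short experiment illustrates why iteration fails for $f(x) = 4x(1 - x)$. Since $f'(x) = 4 - 8x$, we have $|f'(3/4)| = 2 > 1$, so the fixed point $3/4$ repels nearby points: a small displacement $e$ from $3/4$ is mapped to $-2e - 4e^2$. The sketch below starts at a distance `eps` from $3/4$ (the value of `eps` and the orbit length are arbitrary choices for illustration) and records how far the orbit moves away.

```python
def logistic(x):
    """The map f(x) = 4x(1 - x) on [0, 1]."""
    return 4.0 * x * (1.0 - x)

eps = 1e-6
start = 0.75 + eps

# One step roughly doubles the distance to the fixed point 3/4.
first_step = abs(logistic(start) - 0.75)

# Follow the orbit for a while and record the distances to 3/4.
orbit = [start]
for _ in range(60):
    orbit.append(logistic(orbit[-1]))
distances = [abs(x - 0.75) for x in orbit]
```

The early entries of `distances` grow roughly geometrically, so the orbit leaves any small neighbourhood of $3/4$ rather than converging to it.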
Examples 7.17.8
(1) Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = \frac{1}{2}\cos x + \frac{1}{4}\tan^{-1}(x)$. We show that $f$ has a unique fixed point. We have $f'(x) = -\frac{1}{2}\sin x + \frac{1}{4(1 + x^2)}$. Since $|f'(x)| \le \frac{3}{4}$ for all $x \in \mathbb{R}$, $\sup_{x \in \mathbb{R}} |f'(x)| \le \frac{3}{4}$. Since $f$ is $C^1$, it follows by the mean value theorem that
\[ |f(x) - f(y)| \le \frac{3}{4}|x - y|, \quad \text{for all } x, y \in \mathbb{R}. \]
Hence $f$ is a contraction mapping with contraction constant at most $\frac{3}{4}$. Therefore, $f$ has a unique fixed point $x^\star \in \mathbb{R}$.
(2) Let $F(x) = \frac{1}{2}\cos x + \frac{1}{4}\tan^{-1}(x) - x$, $x \in \mathbb{R}$. We claim that there is a unique solution to the equation $F(x) = 0$. We have $F(x) = 0$ if and only if $\frac{1}{2}\cos x + \frac{1}{4}\tan^{-1}(x) = x$. Now apply the previous result.
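The fixed point of Examples 7.17.8(1) can be approximated by the iteration of Theorem 7.17.6(2). The following minimal sketch uses the contraction constant bound $k = 3/4$ from the example and compares the iterates with the a priori estimate $d(x_m, x^\star) \le \frac{k^m}{1-k} d(x_0, x_1)$, obtained by letting $n \to \infty$ in the estimate from the proof. The starting point $5$ and the number of steps are arbitrary choices.

```python
import math

def f(x):
    """The map of Examples 7.17.8(1)."""
    return 0.5 * math.cos(x) + 0.25 * math.atan(x)

def iterate(g, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = g(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

k = 0.75                      # contraction constant bound from the example
xs = iterate(f, 5.0, 200)
x_star = xs[-1]               # numerical approximation of the fixed point

def a_priori_bound(m):
    """Right-hand side k^m / (1 - k) * d(x_0, x_1) of the error estimate."""
    return k ** m / (1.0 - k) * abs(xs[1] - xs[0])

# The limit does not depend on the starting point.
x_star_other = iterate(f, -10.0, 200)[-1]
```

The bound is typically pessimistic, since $3/4$ only bounds the contraction constant from above; the observed convergence may be considerably faster.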
7.17.1 Contraction Mapping Lemma with Parameters
Before we give some applications of the contraction mapping lemma, we prove that the fixed point given by Theorem 7.17.6 depends ‘continuously’ on the map $f$. Suppose then that $(X, d)$ and $(\Lambda, \bar{d})$ are metric spaces and $F : X \times \Lambda \to X$. Given $\lambda \in \Lambda$, we define the map $f_\lambda : X \to X$ by $f_\lambda(x) = F(x, \lambda)$. We regard $f_\lambda : X \to X$ as a family of maps parametrized by $\lambda \in \Lambda$.
Theorem 7.17.9 (Contraction Mapping Lemma with Parameters) Let $(X, d)$ be a complete metric space and $(\Lambda, \bar{d})$ be a metric space. Suppose that $F : X \times \Lambda \to X$ and that $0 \le k < 1$. Assume that
(1) For every $\lambda \in \Lambda$, $f_\lambda : X \to X$ is a contraction map with contraction constant $k_\lambda \le k$.
(2) For every $x \in X$, the map $F(x, \cdot) : \Lambda \to X;\ \lambda \mapsto F(x, \lambda)$ is continuous.
There exists a continuous map $x : \Lambda \to X$ such that $x(\lambda)$ is the unique fixed point of $f_\lambda$ for all $\lambda \in \Lambda$.
Proof We know from Theorem 7.17.6 that, for each $\lambda \in \Lambda$, $f_\lambda : X \to X$ has a unique fixed point which we denote by $x(\lambda)$. This defines a map $x : \Lambda \to X$. It remains to prove that $x : \Lambda \to X$ is continuous. Fix $\lambda \in \Lambda$. We must show that given $\varepsilon > 0$, there exists a $\delta > 0$ such that $d(x(\lambda), x(\lambda')) < \varepsilon$, if $\bar{d}(\lambda, \lambda') < \delta$. For $\lambda' \in \Lambda$,
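we have
\[
\begin{aligned}
d(x(\lambda), x(\lambda')) &= d(f_\lambda(x(\lambda)), f_{\lambda'}(x(\lambda'))) \\
&\le d(f_\lambda(x(\lambda)), f_{\lambda'}(x(\lambda))) + d(f_{\lambda'}(x(\lambda)), f_{\lambda'}(x(\lambda'))) \\
&\le d(f_\lambda(x(\lambda)), f_{\lambda'}(x(\lambda))) + k\, d(x(\lambda), x(\lambda')).
\end{aligned}
\]
Hence
\[ (1 - k)\, d(x(\lambda), x(\lambda')) \le d(f_\lambda(x(\lambda)), f_{\lambda'}(x(\lambda))). \tag{7.13} \]
Since, for fixed $x$, $F(x, \cdot) : \Lambda \to X;\ \lambda' \mapsto F(x, \lambda')$ is continuous, we may take $x = x(\lambda)$ and choose $\delta > 0$ such that if $\bar{d}(\lambda, \lambda') < \delta$, then $d(f_\lambda(x(\lambda)), f_{\lambda'}(x(\lambda))) < (1 - k)\varepsilon$. Substituting in (7.13), we see that if $\bar{d}(\lambda, \lambda') < \delta$, then $d(x(\lambda), x(\lambda')) < \varepsilon$. □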
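Estimate (7.13) gives a quantitative form of the continuity: the fixed point moves by at most $\frac{1}{1-k}$ times the amount by which the map moves at that point. As a hedged illustration, take the hypothetical family $F(x, \lambda) = \frac{1}{2}\cos x + \lambda$ on $X = \Lambda = \mathbb{R}$ (not an example from the text), for which $k = 1/2$ and $d(f_\lambda(x), f_{\lambda'}(x)) = |\lambda - \lambda'|$, so that $|x(\lambda) - x(\lambda')| \le 2|\lambda - \lambda'|$.

```python
import math

K = 0.5  # uniform contraction constant: |d/dx (0.5 cos x + lam)| <= 0.5

def F(x, lam):
    """Hypothetical parametrized family of contractions of the real line."""
    return 0.5 * math.cos(x) + lam

def fixed_point(lam, steps=200):
    """Approximate x(lam) by iterating f_lam from 0."""
    x = 0.0
    for _ in range(steps):
        x = F(x, lam)
    return x

lams = [i / 10 for i in range(-20, 21)]
fixed = [fixed_point(lam) for lam in lams]

# Difference quotients |x(lam') - x(lam)| / |lam' - lam| for neighbouring parameters.
ratios = [
    abs(fixed[i + 1] - fixed[i]) / abs(lams[i + 1] - lams[i])
    for i in range(len(lams) - 1)
]
worst_ratio = max(ratios)
```

By (7.13), `worst_ratio` cannot exceed $1/(1 - K) = 2$, apart from rounding effects.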
7.17.2 An Application of the Contraction Mapping Lemma to
Ordinary Differential Equations
We show how we can use the contraction mapping lemma to prove the local existence and uniqueness theorem for autonomous ordinary differential equations. For simplicity, we consider differential equations on the real line. Later, in Chap. 9, we indicate the straightforward generalization to ordinary differential equations on $\mathbb{R}^n$.
We start by considering the non-autonomous differential equation $x' = g(t)$, where $g : \mathbb{R} \to \mathbb{R}$ is continuous. A solution of this equation with initial condition $x_0$ will be a differentiable map $x : \mathbb{R} \to \mathbb{R}$ such that
\[ x'(t) = g(t), \quad t \in \mathbb{R}, \]
and $x(0) = x_0$. An application of the fundamental theorem of calculus easily gives the explicit solution
\[ x(t) = x_0 + \int_0^t g(s)\, ds, \quad t \in \mathbb{R}. \]
We remark that if $g$ is $C^r$, then $x : \mathbb{R} \to \mathbb{R}$ is $C^{r+1}$ (since $x'(t) = g(t)$, $x'$ is $C^r$).
If $x' = f(x)$, where $f : \mathbb{R} \to \mathbb{R}$ is $C^1$, then we cannot use the fundamental theorem of calculus to construct solutions with specified initial conditions unless $f$ is never zero (and then we only obtain solutions implicitly by $\int_{x_0}^{x(t)} \frac{dx}{f(x)} = \int_0^t ds = t$). Moreover, even if we can construct solutions they may not be defined for all time.
Example 7.17.10 Consider the ordinary differential equation $x' = x^2$. The solution $\phi_{x_0}$ to $x' = x^2$ with initial condition $x_0$ is given by
\[ \phi_{x_0}(t) = \frac{x_0}{1 - t x_0}, \]
where $t \in (-\infty, x_0^{-1})$ if $x_0 > 0$, and $t \in (x_0^{-1}, \infty)$ if $x_0 < 0$. If $x_0 \ne 0$, then the solution “blows up” in finite time.
From now on suppose that $f : \mathbb{R} \to \mathbb{R}$ is $C^1$ (Lipschitz suffices). We consider the ordinary differential equation
\[ x' = f(x). \]
A solution of this equation with initial condition $x_0$ is a differentiable map $x : [-\delta, \delta] \to \mathbb{R}$ such that
\[ x'(t) = f(x(t)), \quad \text{for all } t \in [-\delta, \delta], \qquad x(0) = x_0. \]
Remark 7.17.11 A solution curve to $x' = f(x)$ is also called an integral curve or a trajectory (of the differential equation).
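Theorem 7.17.12 Let $f : \mathbb{R} \to \mathbb{R}$ be $C^r$, $r \ge 1$. Let $I = [a, b]$ be a bounded closed interval and suppose $C = \sup_{x \in \mathbb{R}} |f'(x)| < \infty$. Choose $\delta > 0$ such that $\delta C < 1$. Then there exists a continuous map
\[ \phi : I \to C^0([-\delta, \delta], \mathbb{R});\ x \mapsto \phi_x \]
satisfying
(1) for all $x_0 \in I$, $\phi_{x_0}$ is a solution of $x' = f(x)$ with initial condition $x_0$,
(2) $\phi_x : [-\delta, \delta] \to \mathbb{R}$ is $C^{r+1}$, for all $x \in I$,
(3) $\phi_{x_0}$ is unique in the sense that if $\psi : (\alpha, \beta) \to \mathbb{R}$, where $(\alpha, \beta) \supset [-\delta, \delta]$, is a solution of $x' = f(x)$ with $\psi(0) = x_0$, then $\psi = \phi_{x_0}$ on $[-\delta, \delta]$.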
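Proof Take the uniform metric $\rho$ on $C^0([-\delta, \delta], \mathbb{R})$. Define $T : C^0([-\delta, \delta], \mathbb{R}) \times I \to C^0([-\delta, \delta], \mathbb{R})$ by
\[ T(\phi, x)(t) = x + \int_0^t f(\phi(s))\, ds, \quad (\phi, x) \in C^0([-\delta, \delta], \mathbb{R}) \times I,\ t \in [-\delta, \delta]. \]
Observe that $T(\phi, x)(0) = x$, for all $\phi \in C^0([-\delta, \delta], \mathbb{R})$. Now $T(\phi, x) : [-\delta, \delta] \to \mathbb{R}$ is $C^1$ (fundamental theorem of calculus) and so certainly $T(\phi, x) \in C^0([-\delta, \delta], \mathbb{R})$. We claim that $T$ satisfies the conditions of Theorem 7.17.9 with $k = \delta C$. For fixed $\phi \in C^0([-\delta, \delta], \mathbb{R})$, the map $T(\phi, \cdot) : I \to C^0([-\delta, \delta], \mathbb{R})$, $x \mapsto x + \int_0^t f(\phi(s))\, ds$, is obviously continuous and so condition (2) of Theorem 7.17.9 is satisfied. In order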
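to verify the contraction property, let $\phi, \psi \in C^0([-\delta, \delta], \mathbb{R})$. For (fixed) $x \in I$, we have
\[
\begin{aligned}
|T(\phi, x)(t) - T(\psi, x)(t)| &= \left| \int_0^t \big(f(\phi(s)) - f(\psi(s))\big)\, ds \right| \\
&\le \left| \int_0^t |f(\phi(s)) - f(\psi(s))|\, ds \right|.
\end{aligned}
\]
Now $|f(\phi(s)) - f(\psi(s))| \le C|\phi(s) - \psi(s)|$ by the mean value theorem and so
\[
\begin{aligned}
|T(\phi, x)(t) - T(\psi, x)(t)| &\le \left| \int_0^t C|\phi(s) - \psi(s)|\, ds \right| \\
&\le C \left| \int_0^t \rho(\phi, \psi)\, ds \right| \\
&\le C\delta\, \rho(\phi, \psi), \quad \text{if } t \in [-\delta, \delta].
\end{aligned}
\]
Since we assumed that $C\delta < 1$, $T$ satisfies condition (1) of Theorem 7.17.9 with $k = C\delta$. Let $\phi_x$ denote the unique fixed point of $T_x = T(\cdot, x)$. We have
\[ \phi_x(t) = x + \int_0^t f(\phi_x(s))\, ds, \quad t \in [-\delta, \delta]. \]
Differentiating with respect to $t$, it follows from the fundamental theorem of calculus that
\[ \phi_x'(t) = f(\phi_x(t)), \quad t \in [-\delta, \delta]. \]
We have $\phi_x(0) = x$ and so we have proved part (1) of the theorem. Since $\phi_x'(t) = f(\phi_x(t))$, we see that if $\phi_x$ is $C^s$, $s \le r$, then $\phi_x'$ must be $C^s$ (chain rule) and so $\phi_x$ is $C^{s+1}$. Hence $\phi_x$ is $C^{r+1}$. Finally, suppose that $\psi : (\alpha, \beta) \to \mathbb{R}$, where $(\alpha, \beta) \supset [-\delta, \delta]$, is a solution of $x' = f(x)$ with $\psi(0) = x_0$. If we restrict $\psi$ to $[-\delta, \delta]$ then the restriction lies in $C^0([-\delta, \delta], \mathbb{R})$ and is a fixed point of $T_{x_0}$, and so equals $\phi_{x_0}$ by the uniqueness part of the contraction mapping lemma. □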
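The operator $T$ of this proof can be iterated explicitly when $f$ is a polynomial. The sketch below does this for the linear equation $x' = x - 1$ (the equation treated in Example 7.17.14), for which $C = 1$, so any $\delta < 1$ is admissible. Polynomials are stored as coefficient lists, and the names `picard_step` and `evaluate`, the initial condition $x_0 = 3$ and the number of iterations are assumptions made for illustration.

```python
import math

def picard_step(coeffs, x0):
    """Apply T(phi, x0)(t) = x0 + int_0^t (phi(s) - 1) ds for f(x) = x - 1.

    phi is the polynomial c0 + c1 t + c2 t^2 + ... given by coeffs.
    """
    integrand = [coeffs[0] - 1.0] + list(coeffs[1:])
    # Integrate term by term from 0 to t; the constant term is x0.
    return [x0] + [c / (j + 1) for j, c in enumerate(integrand)]

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(c * t ** j for j, c in enumerate(coeffs))

x0 = 3.0
phi = [x0]            # phi_0 is the constant function x0
iterates = [phi]
for _ in range(12):
    phi = picard_step(phi, x0)
    iterates.append(phi)

# Compare with the exact solution 1 + (x0 - 1) e^t at t = 0.5, inside [-delta, delta].
gap = abs(evaluate(iterates[12], 0.5) - (1.0 + (x0 - 1.0) * math.exp(0.5)))
```

The iterates are the Taylor partial sums of $1 + (x_0 - 1)e^t$, so `gap` is the tail of the exponential series at $t = 0.5$.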
Remarks 7.17.13
(1) Theorem 7.17.12 goes beyond the existence of solutions. The result shows that the solutions depend continuously on the initial condition. With more work (see Chap. 9) it can be shown that solutions depend $C^r$ on the initial condition $x$.
(2) Observe that the strong condition $\sup_{x \in \mathbb{R}} |f'(x)| < \infty$ can always be satisfied by multiplying $f$ by a tabletop function which is equal to one on a big interval containing $I$. In more detail, if $x' = f(x)$, where $\sup_{x \in \mathbb{R}} |f'(x)| = +\infty$, and $[a, b] \subset \mathbb{R}$, we may choose an interval $[A, B] \supset [a, b]$ with $A < a$, $B > b$,
318
7 Metric Spaces
and a tabletop function
which is equal to 1 on ŒA; B and is zero outside
ŒA 1; B C 1. The differential equation x0 D .x/f .x/ will then satisfy the
conditions of Theorem 7.17.12. It follows that we can choose ı > 0 such that
for all x 2 Œa; b, there is a unique solution x W Œı; ı ! R to x0 D .x/f .x/.
Choosing ı smaller if necessary, we can require that x .t/ 2 ŒA; B for all x 2
Œa; b, t 2 Œı; ı. Hence x W Œı; ı ! R will solve x0 D f .x/, for all x 2 Œa; b.
(3) In Theorem 7.17.12 the right-hand side of the differential equation was
independent of time. If instead we consider the non-autonomous equation
@f
x0 D f .x; t/, where f W R2 ! R and sup.x;t/2R2 j @x
.x; t/j D C < 1, then
the proof of Theorem 7.17.12 extends immediately to give unique solutions
0
z
x W Œı; ı ! R to x D f .x; t/ with initial condition x 2 Œa; b.
Example 7.17.14 We can use the second statement of the contraction mapping lemma to construct approximate solutions to ordinary differential equations. For example, consider the equation
$$x' = x - 1, \quad x \in \mathbb{R}.$$
Given $x_0 \in \mathbb{R}$, we can construct a sequence $(\varphi_n)$ of approximate solutions with initial condition $x_0$. We start by defining $\varphi_0$ by
$$\varphi_0(t) = x_0, \quad t \in \mathbb{R}.$$
This is the 'simplest' approximate solution with the required initial condition. We define $\varphi_1, \varphi_2$ by
$$\varphi_1(t) = x_0 + \int_0^t (\varphi_0(s) - 1)\,ds = x_0 + \int_0^t (x_0 - 1)\,ds = x_0 + t(x_0 - 1) = 1 + (x_0 - 1)(1 + t),$$
$$\varphi_2(t) = x_0 + \int_0^t \bigl(x_0 + s(x_0 - 1) - 1\bigr)\,ds = 1 + (x_0 - 1)\left(1 + t + \frac{t^2}{2}\right).$$
The solution to $x' = x - 1$ with initial condition $x_0$ is $\varphi(t) = 1 + (x_0 - 1)e^t$. Hence $\varphi_2$ gives an approximate solution agreeing with the first 3 terms in the Taylor series of $1 + (x_0 - 1)e^t$ at $t = 0$.
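The iteration in Example 7.17.14 can be carried out mechanically. The following Python sketch is an illustration only: the function names `picard_step` and `picard_iterates` are hypothetical and not from the text. It stores each approximate solution as a list of exact polynomial coefficients in $t$ and repeatedly applies the map sending $\varphi$ to $x_0 + \int_0^t (\varphi(s) - 1)\,ds$.

```python
from fractions import Fraction

def picard_step(coeffs, x0):
    """Map phi to x0 + integral_0^t (phi(s) - 1) ds for the ODE x' = x - 1.

    coeffs[k] is the coefficient of t**k in phi.
    """
    integrand = [Fraction(c) for c in coeffs]
    integrand[0] -= 1                      # phi(s) - 1
    # Integrate term by term: t**k -> t**(k+1) / (k+1).
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    integral[0] = Fraction(x0)             # add the initial condition
    return integral

def picard_iterates(x0, n):
    """Return [phi_0, ..., phi_n] as coefficient lists."""
    phi = [Fraction(x0)]
    iterates = [phi]
    for _ in range(n):
        phi = picard_step(phi, x0)
        iterates.append(phi)
    return iterates

if __name__ == "__main__":
    for k, phi in enumerate(picard_iterates(3, 3)):
        print(k, [str(c) for c in phi])
```

With $x_0 = 3$ the second iterate has coefficients $3, 2, 1$, that is $3 + 2t + t^2 = 1 + 2(1 + t + t^2/2)$, in agreement with the formula for the second iterate in the example.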
7.17.3 An Application of the Contraction Mapping Lemma to
the Inverse Function Theorem
Our second application of the contraction mapping lemma will be to the inverse function theorem. We look at the case of maps $f : \mathbb{R} \to \mathbb{R}$ and note that there are far simpler proofs making use of the natural order on $\mathbb{R}$. The proof we give, however, has the merit that it generalizes easily to maps $f : \mathbb{R}^n \to \mathbb{R}^n$ (see Chap. 9).
Suppose that $f : \mathbb{R} \to \mathbb{R}$ is $C^r$, $r \ge 1$, and $f'(0) \ne 0$. The inverse function theorem states that we can choose closed intervals $I, J$, with $0 \in I^\circ$, $f(0) \in J^\circ$, such that $f : I \to J$ is 1:1 onto and $f^{-1} : J \to I$ is $C^r$. What we shall do here is construct $I, J$ and verify only that the inverse map $f^{-1} : J \to I$ is continuous. (We address the differentiability of $f^{-1}$ in Chap. 9.)
We start by observing that if we replace $f$ by $f - f(0)$, then we may assume $f(0) = 0$. Replacing $f$ by $f/f'(0)$, we may further assume that $f'(0) = 1$.
Theorem 7.17.15 Suppose that $f : \mathbb{R} \to \mathbb{R}$ is $C^1$ and satisfies
(1) $f(0) = 0$.
(2) $f'(0) = 1$.
Then there exist a closed interval $I$, containing 0 as an interior point, and $s > 0$, such that $f$ maps $I$ 1:1 onto $[-s,s]$ and the inverse map $f^{-1} : [-s,s] \to I$ is continuous.
Proof For $y \in \mathbb{R}$, define $\Phi_y : \mathbb{R} \to \mathbb{R}$ by
$$\Phi_y(x) = x - f(x) + y.$$
Observe that $x$ is a fixed point of $\Phi_y$ iff $f(x) = y$. This suggests finding fixed points $x = x(y)$ of $\Phi_y$ using the contraction mapping lemma with parameters. The map $y \mapsto x(y)$ will then be a continuous inverse to $f$ (since $f(x(y)) = y$).
Set $\psi(x) = x - f(x)$. We have $\psi(0) = \psi'(0) = 0$. Since $\psi$ is at least $C^1$, there exists an $r > 0$ such that $|\psi'(x)| \le \frac{1}{2}$ for all $x \in [-r,r]$. Hence, by the mean value theorem, we have
$$|\psi(x) - \psi(x')| \le \frac{1}{2}|x - x'|, \quad \text{for all } x, x' \in [-r,r]. \qquad (7.14)$$
We claim that for all $y \in [-\frac{r}{2},\frac{r}{2}]$
(a) $\Phi_y : [-r,r] \to [-r,r]$,
(b) $\Phi_y$ is a contraction map with contraction constant $\frac{1}{2}$.
In order to verify (a), observe that for $|x| \le r$ we have
$$|\Phi_y(x)| = |\psi(x) + y| \le |\psi(x)| + |y| \le \frac{r}{2} + \frac{r}{2} = r,$$
where the last inequality follows from (7.14) with $x' = 0$. The proof that $\Phi_y$ is a contraction is even simpler. If $x, x' \in [-r,r]$, we have
$$|\Phi_y(x) - \Phi_y(x')| = |\psi(x) - \psi(x')| \le \frac{1}{2}|x - x'|,$$
where the last inequality again uses (7.14).
The map $\Phi : [-r,r] \times [-\frac{r}{2},\frac{r}{2}] \to [-r,r]$ satisfies the conditions of Theorem 7.17.9 (for fixed $x$, $\Phi(x,\cdot)$ is obviously continuous) and so, since $[-r,r]$ is a complete metric space, it follows by Theorem 7.17.9 that there is a continuous map $x : [-\frac{r}{2},\frac{r}{2}] \to [-r,r]$ such that
$$\Phi_y(x(y)) = x(y), \quad \text{for all } y \in \left[-\frac{r}{2},\frac{r}{2}\right].$$
In particular, $f(x(y)) = y$, for all $y \in [-\frac{r}{2},\frac{r}{2}]$. Set $\frac{r}{2} = s$ and $x(y) = g(y)$, $y \in [-s,s]$, so that $g : [-s,s] \to [-r,r]$. Since $g$ is continuous and $g(0) = 0$ (by uniqueness of fixed points), $g([-s,s]) = I$ is a closed interval containing 0. Since $f \circ g$ is the identity map on $[-s,s]$, $f : I \to [-s,s]$ is onto. Further, by the uniqueness of fixed points, $f$ is 1:1. Hence $g = f^{-1}$. Finally, 0 is an interior point of $I$ since $f : I \to \mathbb{R}$ is continuous and so $f^{-1}(-s,s)$ is an open subset of $\mathbb{R}$ containing 0. □
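The fixed point argument in the proof of Theorem 7.17.15 is constructive: iterating the map $x \mapsto x - f(x) + y$ from $x = 0$ converges to the solution of $f(x) = y$. The Python sketch below illustrates this under assumptions that are not in the text: the test function $f(x) = x + x^2$ is chosen for illustration (so $x - f(x) = -x^2$, whose derivative is at most $\frac{1}{2}$ in modulus on $[-\frac{1}{4},\frac{1}{4}]$, giving $r = \frac{1}{4}$ and $s = \frac{1}{8}$), and the name `inverse_by_contraction` is hypothetical.

```python
def inverse_by_contraction(f, y, iterations=60):
    """Approximate the solution x of f(x) = y by iterating x -> x - f(x) + y.

    Assumes f(0) = 0, f'(0) = 1 and that y is small enough for the
    iteration to be a contraction with constant 1/2 (see the proof above).
    """
    x = 0.0
    for _ in range(iterations):
        x = x - f(x) + y
    return x

def f(x):
    # Illustrative choice: f(0) = 0 and f'(0) = 1.
    return x + x * x

if __name__ == "__main__":
    for y in (-0.1, 0.0, 0.05, 0.1):      # all lie in [-1/8, 1/8]
        x = inverse_by_contraction(f, y)
        print(y, x, f(x))
```

Since the contraction constant is $\frac{1}{2}$, sixty iterations reduce the initial error by a factor of about $2^{-60}$, which is below double precision.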
EXERCISES 7.17.16
(1) Find a contraction map $f : X \to X$, where $X = \mathbb{R} \setminus \{0\}$, which does not have a fixed point.
(2) Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be linear ($A$ can be written as an $n \times n$ matrix) and $b \in \mathbb{R}^n$. Show that a necessary condition for the affine linear map $L(x) = Ax + b$ to be a contraction is that every eigenvalue of $A$ is of modulus less than 1. Show that if we take the Euclidean metric on $\mathbb{R}^n$, then this condition is not sufficient.
(3) Consider the ODE $x' = x$. Fix $x \in \mathbb{R}$. Taking $\varphi_0 : [-a,a] \to \mathbb{R}$ to be the constant map $\varphi_0(t) = x$, compute the first three iterates of $T\varphi(t) = x + \int_0^t f(\varphi(s))\,ds$, starting with $\varphi = \varphi_0$. Make a conjecture as to the form of $\varphi_n(t)$, $n > 3$, and prove your conjecture.
(4) Consider the ODE $x' = x^2$. Taking $\varphi_0 : [-a,a] \to \mathbb{R}$ to be the constant map $\varphi_0(t) = x$, compute the first three iterates of $T\varphi(t) = x + \int_0^t f(\varphi(s))\,ds$, starting with $\varphi = \varphi_0$. In particular, verify that the terms in $\varphi_3$ of degree 1, 2 and 3 in $x$ match those in the binomial expansion of $x/(1 - tx)$ (the solution of $x' = x^2$ with initial condition $x$).
(5) Provide the justification for Remarks 7.17.13(2) and show that there exists a $\delta > 0$ such that for all $x \in [a,b]$, the solution $\varphi_x : [-\delta,\delta] \to \mathbb{R}$ of $x' = \theta(x)f(x)$ satisfies $\varphi_x(t) \in [A,B]$, $t \in [-\delta,\delta]$.
(6) Find an example of a map $f : \mathbb{R} \to \mathbb{R}$ such that (a) $|f(x) - f(y)| < |x - y|$ for all $x, y \in \mathbb{R}$, $x \ne y$, and (b) $f$ has no fixed point. Why does your example not contradict the contraction mapping lemma? (Hint: Look for an example of the form $f(x) = x + \varphi(x)$ where $\varphi(x) > 0$ for all $x \in \mathbb{R}$ and $-1 < \varphi'(x) < 0$ for all $x \in \mathbb{R}$.)
(7) Suppose that $(X,d)$ is complete and $f : X \to X$. Show that if there exists a $p \in \mathbb{N}$ such that $f^p$ is a contraction, then $f$ has a unique fixed point. ($f^p$ denotes the composition of $f$ with itself $p$ times.)
(8) Let $F : [0,1] \to [0,1]$ be differentiable and $M = \sup_{x \in [0,1]} |F'(x)| < \infty$. Define $T : C^0([0,1]) \to C^0([0,1])$ by $T(\varphi)(x) = \int_0^x F(\varphi(s))\,ds$, $\varphi \in C^0([0,1])$. Show that if $p! > M$, then $T^p$ is a contraction and hence $T$ has a unique fixed point.
(9) Show that if $(X,d)$ is compact and $f : X \to X$ satisfies $d(f(x),f(y)) < d(x,y)$, for all $x \ne y \in X$, then $f$ has a unique fixed point. (Hint: For every $r > 0$, there exists a $k \in [0,1)$ such that if $d(x,y) \ge r$, then $d(f(x),f(y)) \le k\,d(x,y)$.)
(10) Let $\varphi : \mathbb{R} \to \mathbb{R}$ be Lipschitz with Lipschitz constant $k$ ($|\varphi(x) - \varphi(y)| \le k|x - y|$ for all $x, y \in \mathbb{R}$). Let $f(x) = ax + \varphi(x)$, where $a \ne 0$. Show that if $k < |a|$, then $f : \mathbb{R} \to \mathbb{R}$ is 1:1 onto and $f^{-1}$ is Lipschitz with Lipschitz constant $1/(1 - k/|a|)$.
(11) Let $(X,d)$ be a metric space. A map $f : X \to X$ is an expansion if there exists a $k > 1$ such that $d(f(x),f(y)) \ge k\,d(x,y)$ for all $x, y \in X$. Show
(a) If $f$ is an expansion, then $f$ is 1:1.
(b) If $f : X \to X$ is an expansion and $f$ has a fixed point, then the fixed point is unique.
(c) If $X$ is compact and contains at least two points, then there are no expansions of $X$.
(d) If $X$ is complete and $f : X \to X$ is a continuous surjective expansion, then $f$ has a unique fixed point.
Find examples of (a) a continuous expansion of $\mathbb{R}$, (b) a discontinuous expansion of $\mathbb{R}$, (c) an expansion of a complete metric space which is not onto and has no fixed point.
(12) Let $(X,d)$ be a compact metric space and suppose that $f : X \to X$ satisfies $d(f(x),f(y)) \ge d(x,y)$ for all $x, y \in X$. Prove that $f$ is an isometry. Show by means of an example that $f$ need not have a fixed point. (Hint: Given $u, v \in X$ and $\varepsilon > 0$, show there exists an $n \in \mathbb{N}$ such that $d(u,u_n), d(v,v_n) < \varepsilon$, where $u_n = f^n(u)$, $v_n = f^n(v)$.)
7.18 Connectedness
We conclude this chapter with a discussion of connectedness—a property implicitly
used in the proof of the intermediate value theorem.
Definition 7.18.1 A metric space $(X,d)$ is connected if the only open and closed subsets of $X$ are $X$ and the empty set. If $X$ is not connected, $X$ is disconnected.
We have a useful characterization of disconnected spaces.
Lemma 7.18.2 A metric space $(X,d)$ is disconnected iff we can write $X = U \cup V$ where $U$ and $V$ are open, non-empty disjoint subsets of $X$.
Proof Immediate since the hypotheses imply that $V = X \setminus U$ is open and closed. □
We may extend the definition of connectedness to subsets of a metric space.
Definition 7.18.3 The non-empty subset $Y$ of the metric space $(X,d)$ is connected iff the only open and closed subsets of $(Y,d_Y)$ are $Y$ and the empty set.
Remark 7.18.4 We emphasize that connectedness and disconnectedness are only defined for non-empty subsets of a metric space.
Lemma 7.18.5 A subset $Y$ of the metric space $(X,d)$ is disconnected iff there exist open subsets $U, V$ of $X$ such that
(1) $U \cap Y$ and $V \cap Y$ are disjoint non-empty subsets of $Y$.
(2) $(U \cap Y) \cup (V \cap Y) = Y$.
In particular, $Y$ is a connected subset of $X$ iff $(Y,d_Y)$ is a connected metric space.
Proof Every open subset $A$ of $(Y,d_Y)$ may be written $A = U \cap Y$, where $U$ is an open subset of $X$, by Proposition 7.6.1. The result follows by Lemma 7.18.2 applied to $(Y,d_Y)$. □
Remark 7.18.6 Just as was the case for compact subsets (Proposition 7.13.10), connectedness is an intrinsic property of a subset.
Examples 7.18.7
(1) A metric space consisting of a single point is connected.
(2) If $(X,d)$ has the discrete metric, then $(X,d)$ is disconnected if $X$ contains more than one point.
(3) A metric space containing more than one point is totally disconnected if every connected subset of $X$ consists of a single point. A space with the discrete metric is totally disconnected. So also is the Cantor set $C$. We leave the verification of this to the exercises.
(4) The metric space $(\mathbb{Q},|\cdot|)$ is totally disconnected. Indeed, suppose that $E \subset \mathbb{Q}$ is connected. Let $x, y \in E$, $x \ne y$. Suppose that $x < y$. Regard $\mathbb{Q}$ as a subspace of $\mathbb{R}$ and choose an irrational number $z \in (x,y)$. Take $U = \mathbb{Q} \cap (-\infty,z)$, $V = \mathbb{Q} \cap (z,\infty)$. Then $U, V$ are open disjoint non-empty subsets of $\mathbb{Q}$ such that $U \cap E, V \cap E \ne \emptyset$. Since $U \cup V = \mathbb{Q} \supset E$, this contradicts the connectedness of $E$, and so $E$ consists of a single point.
As the previous examples show, it is not hard to find disconnected sets. Finding nontrivial connected sets is a little trickier. It turns out that the key to finding connected
sets is to classify the connected subsets of R. Not surprisingly a subset of R is
connected iff it is an interval. Once we have this result, we can prove some simple
results that allow us to combine connected sets and in this way find many examples
of connected sets.
We start by giving a characterization of an interval.
Lemma 7.18.8 A non-empty subset $A$ of $\mathbb{R}$ is an interval iff for every $x, y \in A$, $x \le y$, we have $[x,y] \subset A$.
Proof Obviously every interval $I \subset \mathbb{R}$ satisfies the condition. Conversely, if $A$ satisfies the condition, let $x_\star = \inf\{x \mid x \in A\}$, $y_\star = \sup\{x \mid x \in A\}$. We have $-\infty \le x_\star \le y_\star \le +\infty$. Observe that if $x \in (x_\star,y_\star)$, then $x \in A$ and so $A \supset (x_\star,y_\star)$. If $x_\star = -\infty$, then $A \supset (-\infty,y_\star)$. If $y_\star < \infty$, then $A = (-\infty,y_\star]$, if $y_\star \in A$, else $A = (-\infty,y_\star)$. The other cases are handled similarly. □
Theorem 7.18.9 A subset $A$ of $\mathbb{R}$ is connected iff $A$ is an interval.
Proof Suppose $A \subset \mathbb{R}$ is an interval and let $U, V$ be open subsets of $\mathbb{R}$ such that $U \cup V \supset A$ and $(U \cap A) \cap (V \cap A) = \emptyset$. Suppose that $U \cap A, V \cap A \ne \emptyset$. Pick $x \in U \cap A$, $y \in V \cap A$. Without loss of generality assume $x < y$. Since $A$ is an interval, $[x,y] \subset A$. Let $z = \sup\{x' \in [x,y] \mid x' \in U\}$. Since $[x,y] \subset A$, and $x \in U$, we must have $z > x$ as $U$ is open. Similarly $z < y$ since $y \in V$ and $V$ is open. If $z \in V$, then $z$ must lie in the interior of $[x,y] \cap V$, contradicting the definition of $z$ as the supremum of points in $[x,y]$ lying in $U$. On the other hand if $z \in U$, then $z$ must lie in the interior of $[x,y] \cap U$ and so $z$ is not the supremum of points in $[x,y]$ lying in $U$. Contradiction. Therefore one of the sets $U \cap A, V \cap A$ must be empty.
Conversely, suppose that $A$ is a connected subset of $\mathbb{R}$. Let $x < y \in A$. Suppose there exists a $z \in [x,y]$ that does not lie in $A$. Take $U = (-\infty,z)$, $V = (z,\infty)$. Then $U \cup V \supset A$, $U \cap A, V \cap A \ne \emptyset$, and $(U \cap A) \cap (V \cap A) = \emptyset$, contradicting the assumption that $A$ is connected. Hence $[x,y] \subset A$ for all $x < y \in A$ and so $A$ is an interval. □
Remark 7.18.10 The use of the supremum in the proof of Theorem 7.18.9 is not surprising in view of Examples 7.18.7(4).
Theorem 7.18.11 Let $f : X \to Y$ be a continuous map between metric spaces. If $X$ is connected, then $f(X)$ is a connected subset of $Y$. More generally, if $E$ is a connected subset of $X$, then $f(E)$ is a connected subset of $Y$.
Proof In order to prove that $f(X)$ is connected, it suffices to show that if $U, V$ are open subsets of $Y$ such that $U \cup V \supset f(X)$ and $(U \cap f(X)) \cap (V \cap f(X)) = \emptyset$, then either $U \cap f(X)$ or $V \cap f(X)$ is the empty set. Since $f$ is continuous, $f^{-1}(U)$, $f^{-1}(V)$ are open subsets of $X$. Since $U \cup V \supset f(X)$, $f^{-1}(U) \cup f^{-1}(V) = X$. Since $(U \cap f(X)) \cap (V \cap f(X)) = \emptyset$, $f^{-1}(U) \cap f^{-1}(V) = \emptyset$. Therefore, one of $f^{-1}(U)$, $f^{-1}(V)$ is empty since $X$ is connected and so either $U \cap f(X)$ or $V \cap f(X)$ is the empty set. The result when $E$ is a connected subset follows by replacing $(X,d)$ by $(E,d_E)$ and $f$ by the restriction of $f$ to $E$. □
Example 7.18.12 If $\gamma : (a,b) \to \mathbb{R}^n$ is continuous (so $\gamma$ defines a continuous curve in $\mathbb{R}^n$) then $\gamma(a,b)$ is a connected subset of $\mathbb{R}^n$.
Proposition 7.18.13 The closure of a connected subset of a metric space is connected.
Proof Let $E$ be a connected subset of the metric space $X$. It suffices to show that if $U, V$ are open subsets of $X$ such that $U \cup V \supset \overline{E}$ and $(U \cap \overline{E}) \cap (V \cap \overline{E}) = \emptyset$, then either $U \cap \overline{E} = \emptyset$ or $V \cap \overline{E} = \emptyset$. If these conditions hold then $U \cup V \supset E$, since $\overline{E} \supset E$, and $(U \cap E) \cap (V \cap E) = \emptyset$. Therefore, since $E$ is assumed connected, one of $U \cap E, V \cap E$ must be the empty set. Without loss of generality, suppose that $U \cap E = \emptyset$. This implies that $U \cap \overline{E} = \emptyset$ (for every $x \in U$, there exists an $r > 0$ such that $D_r(x) \cap E = \emptyset$). □
Example 7.18.14 Let $E = \{(x,\sin(1/x)) \mid x > 0\} \subset \mathbb{R}^2$. Since $\{x \mid x > 0\}$ is connected and $x \mapsto (x,\sin(1/x))$ is continuous for $x > 0$, $E$ is a connected subset of $\mathbb{R}^2$. We have $\overline{E} = \{(0,y) \mid -1 \le y \le 1\} \cup E$ and this set is connected by the previous proposition. Later we give a simple example to show that the interior of a connected set need not be connected.
Theorem 7.18.15 Let $\{E_i \mid i \in I\}$ be a family of connected subsets of $X$. If $E_i \cap E_j \ne \emptyset$ for all $i, j \in I$, then $\bigcup_{i \in I} E_i$ is a connected subset of $X$.
Proof Set $E = \bigcup_{i \in I} E_i$. Let $U, V$ be open subsets of $X$ such that $U \cup V \supset E$ and $(E \cap U) \cap (E \cap V) = \emptyset$. It suffices to prove one of $E \cap U, E \cap V$ is the empty set. Observe that
$$E \cap U = \bigcup_{i \in I} (E_i \cap U).$$
Suppose that for some $i \in I$, $E_i \cap U \ne \emptyset$. Then $E_i \cap U = E_i$, since $E_i$ is connected. Therefore since $E_i \cap E_j \ne \emptyset$ for all $j \in I$, we must have $E_j \cap U \ne \emptyset$ for all $j \in I$ and so $E_j \subset U$ for all $j \in I$. Since $E_j \cap V = \emptyset$ for all $j \in I$, we must have $V \cap E = \emptyset$. □
Definition 7.18.16 A metric space $(X,d)$ is path-connected if for all $x, y \in X$, there exists a continuous curve $\gamma : [0,1] \to X$ such that $\gamma(0) = x$, $\gamma(1) = y$.
Proposition 7.18.17 A path-connected metric space is connected.
Proof Let $X$ be path-connected. Fix $x \in X$. For each $y \in X$, there exists a continuous curve $\gamma_y : [0,1] \to X$ such that $\gamma_y(0) = x$, $\gamma_y(1) = y$. Set $E_y = \gamma_y([0,1])$. Since $\gamma_y$ is continuous, $E_y$ is a connected subset of $X$. Since $x \in E_y \cap E_z$ for all $y, z \in X$, $\bigcup_{y \in X} E_y = X$ is connected by Theorem 7.18.15. □
Examples 7.18.18
(1) For $m \ge 1$, $\mathbb{R}^m$ is path-connected and therefore connected.
(2) For all $x \in \mathbb{R}^m$, the open and closed disks $D_r(x)$, $\overline{D}_r(x)$ are path-connected since every point $y \in D_r(x)$ (or $\overline{D}_r(x)$) can be joined to the centre $x$ by the continuous path $\gamma(t) = ty + (1 - t)x$, $t \in [0,1]$.
(3) The unit sphere $S^2$ in $\mathbb{R}^3$ is connected. For this, observe that $S^2$ is the image of the continuous map $P : \mathbb{R}^2 \to \mathbb{R}^3$ defined by $P(\theta,\phi) = (\cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi)$ (spherical polar coordinate map with $r = 1$). Alternatively, one can show that $S^2 \setminus \{x\}$ is the continuous image of $\mathbb{R}^2$ ($x$ any point of $S^2$). Then by Proposition 7.18.13, $\overline{S^2 \setminus \{x\}} = S^2$ is connected. This approach has the merit that it generalizes to prove that the $n$-sphere is connected for all $n \ge 1$ (see the exercises).
(4) A connected set need not be path connected. For example, although the graph $E$ of $\sin(1/x)$, $x > 0$, is path connected, $\overline{E}$ is not path connected though it is connected (see Example 7.18.14).
Example 7.18.19 The interior of a connected subset of a metric space need not be connected. Take the Euclidean metric on $\mathbb{R}^2$, and define $E = \overline{D}_1(0,0) \cup \overline{D}_1(2,0) \subset \mathbb{R}^2$. We have $E^\circ = D_1(0,0) \cup D_1(2,0)$, which is not connected (take $U = D_1(0,0)$, $V = D_1(2,0)$ in the definition of disconnected).
EXERCISES 7.18.20
(1) Suppose that $E, F$ are connected subsets of $X$. If $E \cap F \ne \emptyset$, need $E \cap F$ be connected?
(2) Show that the intermediate value theorem follows from the connectedness of an interval and Theorem 7.18.11.
(3) Suppose $f : X \to Y$ is continuous and $E$ is a connected (respectively, path connected) subset of $Y$. Must $f^{-1}(E)$ be a connected (respectively, path connected) subset of $X$?
(4) Suppose that the metric space $(X,d)$ is connected. Show that if $f : X \to \mathbb{R}$ is continuous then $f(X)$ is an interval. In particular, if $a, b \in f(X)$, then $f$ takes every value between $a$ and $b$. Show that if additionally $X$ is compact, then there exist $m < M \in \mathbb{R}$ such that $f(X) = [m,M]$ (version of Theorem 2.4.10 for metric spaces).
(5) Let $(X,d)$ be a metric space. Show that if $X$ is countable then $X$ is connected if and only if $X$ consists of a single point. (Hint: Let $x_0 \in X$ and consider $f : X \to \mathbb{R}$ defined by $f(x) = d(x,x_0)$. Note the result is false for general topological spaces.)
(6) Show that a metric space $(X,d)$ is connected iff for every proper non-empty subset $E$ of $X$, $\partial E \ne \emptyset$.
(7) Prove that the middle thirds Cantor set is totally disconnected (Examples 7.18.7(3)).
(8) Suppose that $E$ is a connected subset of the metric space $(X,d)$. Show that $E'$ is connected. (Hint: reduce to the case where $E$ is closed.)
(9) Suppose that $E$ is a connected subset of the metric space $X$ and that $E$ does not consist of a single point. Show that $E \cup \{z\}$ is connected iff $z \in E'$.
(10) Non-empty subsets $A, B$ of $(X,d)$ are separated if $\overline{A} \cap B, A \cap \overline{B} = \emptyset$.
(a) Suppose that $X = A \cup B$ where $A, B$ are separated. Show that $A, B$ are open and closed in $X$ and $X$ is disconnected.
(b) Show that $X$ is disconnected iff we can write $X = A \cup B$, where $A, B$ are separated.
(c) Show that a subset $E$ of $X$ is disconnected iff we can write $E = A \cup B$, where $A, B$ are separated.
(d) Show that disjoint subsets $A, B$ of $X$ are separated iff no point of $A$ is a limit point of $B$ and no point of $B$ is a limit point of $A$.
(Connectedness is often defined in terms of separated sets (for example, see [18, 27]). The definition in terms of separation is equivalent to our definition by (a). That connectedness is an intrinsic property follows from (d). Although the definition of connectedness in terms of separated sets is a little more complicated it works well when considering connectedness of subsets.)
(11) Prove that $\overline{E}$, where $E = \{(x,\sin(1/x)) \mid x > 0\} \subset \mathbb{R}^2$, is not path connected (see Example 7.18.14). (Hint: If $\gamma : [0,1] \to \overline{E}$ is continuous, then $\gamma$ is uniformly continuous since $[0,1]$ is compact.)
(12) Which of the following sets are connected? Why?
(a) The unit circle $S^1 = \{(x,y) \mid x^2 + y^2 = 1\} \subset \mathbb{R}^2$.
(b) The paraboloid $P = \{(x,y,z) \mid x^2 + y^2 = z\} \subset \mathbb{R}^3$.
(c) The surface $H = \{(x,y,z) \mid x^2 - y^2 = 1\} \subset \mathbb{R}^3$.
(d) The cone on the middle thirds Cantor set $C$: $\{tX + (1 - t)(0,1) \mid t \in [0,1], X \in C\} \subset \mathbb{R}^2$, where $C \subset \mathbb{R}$ is the Cantor set and we regard $\mathbb{R} \subset \mathbb{R}^2$ as the $x$-axis.
(13) Show that if $f : X \to Y$ is continuous and $X$ is path connected, then $f(X)$ is path connected.
(14) Suppose that $E_i$ is a path connected subset of the metric space $(X_i,d_i)$, $i = 1, 2$. Show that $E_1 \times E_2$ is a path connected subset of $(X_1 \times X_2, D)$ where $D$ is the product metric on $X_1 \times X_2$ ($D((x_1,x_2),(y_1,y_2)) = \max\{d_1(x_1,y_1), d_2(x_2,y_2)\}$).
(15) Let $E$ be a connected subset of $(X,d)$ and for $r > 0$, let $E(r) = \{x \in X \mid d(x,E) \le r\}$. Must $E(r)$ be connected? Would your answer change if $X = \mathbb{R}^n$?
(16) Suppose that $E_n$ are connected subsets of a metric space such that $E_n \cap E_{n+1} \ne \emptyset$, $n \ge 1$. Prove that $\bigcup_{n \ge 1} E_n$ is connected.
(17) Let $(X,d)$ be a metric space and let $x \in X$. Let $C_x$ denote the union of all connected subsets of $X$ which contain $x$. Show that
(a) $C_x$ is a closed connected subset of $X$.
(b) If $A \supset C_x$ is connected then $A = C_x$.
(c) $\{C_x \mid x \in X\}$ defines a partition of $X$ (that is, if $x, x' \in X$, either $C_x = C_{x'}$ or $C_x \cap C_{x'} = \emptyset$).
We call the sets $C_x$ the connected components of $X$. Show that we can also define the connected path-components of a metric space and that we obtain a partition of $X$ into path-components. What are the path-components for the set $\overline{E}$, where $E$ is defined in Example 7.18.14? Need path-components be closed?
(18) Let $E$ be a non-empty closed subset of the metric space $X$. Show that the connected components of $E$ are closed subsets of $X$. Are the connected components of an open subset $A$ of $X$ always open?
(19) Define the 1:1 map $F : \mathbb{R}^m \to \mathbb{R}^{m+1} = \mathbb{R}^m \times \mathbb{R}$ by
$$F(x) = \left(\frac{2x}{\|x\|^2 + 1}, \frac{\|x\|^2 - 1}{\|x\|^2 + 1}\right),$$
where $x = (x_1,\dots,x_m) \in \mathbb{R}^m$. Verify that $F(\mathbb{R}^m) = S^m \setminus \{(0,\dots,0,1)\}$ and deduce that the unit sphere in $\mathbb{R}^{m+1}$ is connected. (When $m = 2$, the inverse map $F^{-1} : S^2 \setminus \{(0,0,1)\} \to \mathbb{R}^2$ is stereographic projection: if $X \in S^2 \setminus \{(0,0,1)\}$, $F^{-1}(X)$ is the unique point of intersection of the $x,y$-plane with the line through $(0,0,1)$ and $X$.)
(20) Let $E$ be a compact subset of the metric space $(X,d)$. Show that
(a) $E$ is disconnected if and only if $E$ can be written as the union of two disjoint (non-empty) compact subsets of $E$.
(b) $E$ is disconnected iff we can find disjoint open subsets $U, V$ of $X$ such that $U \cap E, V \cap E \ne \emptyset$ and $E \subset U \cup V$.
(Hint for (b): if $A, B$ are disjoint compact subsets of the metric space $X$, then $\inf_{a \in A, b \in B} d(a,b) > 0$.)
(21) Suppose that $E_1 \supset E_2 \supset E_3 \supset \cdots$ is a decreasing sequence of (sequentially) compact connected sets. Show that $E = \bigcap_{n=1}^\infty E_n$ is connected. (Hint: use part (b) of the previous question.)
(22) Show, by looking for an example in $\mathbb{R}^2$, that the previous result may fail if the $E_n$ are closed but not compact (assume $\bigcap_{n=1}^\infty E_n \ne \emptyset$). What happens if $E_n \subset \mathbb{R}$?
(23) Suppose that $E_1 \supset E_2 \supset E_3 \supset \cdots$ is a decreasing sequence of compact path connected sets. Is $E = \bigcap_{n=1}^\infty E_n$ path connected? (Hint: Look for a sequence $(E_n)$ which has intersection the set $\overline{E}$ of Example 7.18.14.)
(24) Let $Y = \{0,1\}$ with the discrete metric. Show that $(X,d)$ is connected iff every continuous function $f : X \to Y$ is constant. Use this result to show that if $(X_1,d_1)$, $(X_2,d_2)$ are connected metric spaces then the product $(X_1 \times X_2, d)$ is connected (where we take the product metric $d((x_1,x_2),(y_1,y_2)) = \max\{d_1(x_1,y_1), d_2(x_2,y_2)\}$ on $X_1 \times X_2$).
(25) If $X$ has topology of open sets $\mathcal{U}$, we say $X$ is connected iff the only open and closed subsets of $X$ are $X$ and $\emptyset$. More generally, if $E$ is a non-empty subset of $X$, we define a topology $\mathcal{U}_E$ of open sets of $E$ by $\mathcal{U}_E = \{U \cap E \mid U \in \mathcal{U}\}$. We say $E$ is connected iff the only open and closed subsets of $E$ are $E$ and the empty set. Take the Zariski topology on $\mathbb{R}$: the open subsets of $\mathbb{R}$ are either $\mathbb{R}$, $\emptyset$ or $\mathbb{R} \setminus F$ where $F$ is finite.
(a) Show that $\mathbb{R}$ is connected (in the Zariski topology).
(b) Show that $\mathbb{Z} \subset \mathbb{R}$ is a connected subset of $\mathbb{R}$ (in the Zariski topology).
(c) Classify the connected subsets of $\mathbb{R}$ (in the Zariski topology). In particular, show that every non-empty open subset of $\mathbb{R}$ is connected, as is the Cantor set, and that a finite set is connected iff it consists of just one point.
(d) Is $\mathbb{Z}$ a path connected subset of $\mathbb{R}$? What about $[a,b]$?
(Moral: Connectedness appears to be relatively intuitive for metric spaces; for general topological spaces what connectedness detects can seem un-geometric and non-intuitive, though there is usually a mathematically significant interpretation of connectedness.)
(26) If $f : \mathbb{R} \to \mathbb{R}$ is continuous, then the graph of $f$, $\{(x,f(x)) \mid x \in \mathbb{R}\}$, is a closed connected subset of $\mathbb{R}^2$. Suppose that the graph of $f : \mathbb{R} \to \mathbb{R}$ is a closed subset of $\mathbb{R}^2$. Does it follow that $f$ is continuous? What about if the graph of $f : \mathbb{R} \to \mathbb{R}$ is connected? Prove that if the graph of $f : \mathbb{R} \to \mathbb{R}$ is closed and connected, then $f$ is continuous.
(27) Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous. If $f^{-1}(0)$ is connected, does it follow that $f^{-1}(0)$ is path connected? (Hint: Theorem 7.12.4. If $f$ is a polynomial it can be shown that $f^{-1}(0)$ is connected iff $f^{-1}(0)$ is path connected.)
Chapter 8
Fractals and Iterated Function Systems
As motivation for what we do in this chapter, we start by taking another look at the middle thirds Cantor set $C$ constructed in Chap. 7. Define affine linear maps $R_1, R_2 : \mathbb{R} \to \mathbb{R}$ by
$$R_1(x) = \frac{x}{3}, \qquad R_2(x) = 1 - \frac{x}{3}.$$
Following the same notation we used in our discussion of the Cantor set in Sect. 7.14, observe that $R_1(I_0) = [0,\frac{1}{3}] = I_{11}$ and $R_2(I_0) = [\frac{2}{3},1] = I_{12}$. Hence $R_1(I_0) \cup R_2(I_0) = I_1$. Similarly, $R_1(I_1) \cup R_2(I_1) = I_2$ and, in general, $R_1(I_n) \cup R_2(I_n) = I_{n+1}$, for all $n \ge 0$.
We can abstract this process in the following way. Let $H(\mathbb{R})$ denote the set of all compact subsets of $\mathbb{R}$. Define an operator $R : H(\mathbb{R}) \to H(\mathbb{R})$ by
$$R(X) = R_1(X) \cup R_2(X), \quad X \in H(\mathbb{R}).$$
Observe that if $X \in H(\mathbb{R})$, then $R(X) \in H(\mathbb{R})$ since $R_1(X)$ and $R_2(X)$ are compact subsets of $\mathbb{R}$ ($R_1, R_2$ are continuous) and so $R_1(X) \cup R_2(X)$ is compact (either by Exercises 7.13.24(6) or the Bolzano–Weierstrass theorem).
The Cantor set $C$ is a fixed point of the operator $R$. This is a consequence of the self-similarity of the Cantor set since $T(C \cap [0,\frac{1}{3}]) = C$, and $T(C \cap [\frac{2}{3},1]) = C$, by Corollary 7.14.10 (see Sect. 7.14 for the definition of $T : [0,1] \to [0,1]$). Hence $R_1(C) \cup R_2(C) = C$ ($R_1$, $R_2$ are the inverse maps of $T$ on $[0,\frac{1}{3}]$ and $[\frac{2}{3},1]$, respectively).
These observations suggest the natural question as to whether we can find a complete metric on $H(\mathbb{R})$ with respect to which $R$ is a contraction mapping. If we can do this, then $\lim_{n \to \infty} R^n(X) = C$ for every compact subset $X$ of $\mathbb{R}$. For example, take $X$ to be a single point and then iterate by $R$ to get the Cantor set!
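The iteration of $R$ on a single point is easy to carry out by hand or by machine. The Python sketch below is an illustration only (the names `apply_R` and `iterate_R` are hypothetical, not from the text): it applies $R(X) = R_1(X) \cup R_2(X)$ with $R_1(x) = x/3$ and $R_2(x) = 1 - x/3$ to finite sets of exact rationals. Starting from $X = \{0\}$, every iterate is a finite subset of the Cantor set, since $0 \in C$ and $R(C) = C$.

```python
from fractions import Fraction

def apply_R(points):
    """Return R(X) = R1(X) | R2(X) for a finite set X of rationals."""
    third = Fraction(1, 3)
    r1_image = {x * third for x in points}        # R1(x) = x / 3
    r2_image = {1 - x * third for x in points}    # R2(x) = 1 - x / 3
    return r1_image | r2_image

def iterate_R(points, n):
    """Apply R to the finite set `points` n times."""
    current = {Fraction(p) for p in points}
    for _ in range(n):
        current = apply_R(current)
    return current

if __name__ == "__main__":
    for n in range(4):
        print(n, [str(p) for p in sorted(iterate_R({0}, n))])
```

Two steps from $\{0\}$ give $\{0, \frac{1}{3}, \frac{2}{3}, 1\}$, and each further step doubles the number of points.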
In this chapter we develop these ideas in the setting of the space of compact subsets of $\mathbb{R}^n$. There are two issues. First we need to find a complete metric on the space $H(\mathbb{R}^n)$ of all compact subsets of $\mathbb{R}^n$. Then we need to identify an interesting class of maps which lead naturally to contraction mappings on $H(\mathbb{R}^n)$.
8.1 The Space $H(\mathbb{R}^n)$
Let $H(\mathbb{R}^n)$ denote the set of all (non-empty) compact subsets of $\mathbb{R}^n$. Since a set $X \subset \mathbb{R}^n$ is compact iff $X$ is closed and bounded, $H(\mathbb{R}^n)$ is the set of all closed and bounded subsets of $\mathbb{R}^n$. Note that $H(\mathbb{R}^n)$ contains all finite subsets of $\mathbb{R}^n$. In particular, we can regard $\mathbb{R}^n$ as a subset of $H(\mathbb{R}^n)$ by the map $(x_1,\dots,x_n) \mapsto \{(x_1,\dots,x_n)\}$.
If $X, Y \in H(\mathbb{R}^n)$ then $X \cup Y \in H(\mathbb{R}^n)$ and, if $X \cap Y \ne \emptyset$, we also have $X \cap Y \in H(\mathbb{R}^n)$.
Our aim in this section is to define a metric $h$ on $H(\mathbb{R}^n)$ for which $(H(\mathbb{R}^n),h)$ is a complete metric space. The metric we construct is known as the Hausdorff metric (named after Felix Hausdorff, 1868–1942, one of the founders of topology). The main issues are finding a natural definition for $h$ and the verification of the metric properties. In order to define $h$, we start by defining a positive function $\rho(A,B)$, $A, B \in H(\mathbb{R}^n)$, such that $\rho(A,B) = 0$ iff $A \subset B$. Since $\rho$ detects inclusion one way, it is natural to define $h(A,B) = \max\{\rho(A,B), \rho(B,A)\}$. We see that $h(A,B) = 0$ iff $A \subset B$ and $B \subset A$. That is, iff $A = B$. Roughly speaking, $\rho(A,B)$ will measure the greatest distance of points of $A$ from the set $B$. This will be zero if $A \subset B$. See Fig. 8.1 where we illustrate the situation $B \subsetneq A$, where $\rho(A,B) > 0$ and $\rho(B,A) = 0$.
Now for the details. Let $d$ denote the Euclidean metric on $\mathbb{R}^n$. We recall from Sect. 7.2 that if $B$ is a non-empty subset of $\mathbb{R}^n$ and $a \in \mathbb{R}^n$, then the distance $d(a,B)$ from $a$ to $B$ is defined by
$$d(a,B) = \inf_{x \in B} d(a,x).$$
[Fig. 8.1 Measuring the distance between sets: sets $A$ and $B$ with the distance $\rho(A,B)$ marked]
Since $\inf_{x \in B} d(a,x) = \inf\{d(a,x) \mid x \in B\}$ is bounded below by 0, $d(a,B)$ is defined and finite for every (non-empty) subset $B$ of $\mathbb{R}^n$. Moreover, $d(a,B)$ is continuous as a function of $a$ (Proposition 7.12.3).
Now suppose $B \in H(\mathbb{R}^n)$. Since $d(a,\cdot) : B \to \mathbb{R}$, $x \mapsto d(a,x)$, is continuous and $B$ is compact, it follows from Theorem 7.13.9 that there exists an $x_0 \in B$ such that
$$d(a,B) = d(a,x_0).$$
In general, $x_0$ will not be unique. For future reference, note that
$$d(a,B) = d(a,x_0) \le d(a,x), \quad \text{for all } x \in B. \qquad (8.1)$$
Lemma 8.1.1 Let $B \in H(\mathbb{R}^n)$ and $a \in \mathbb{R}^n$. Then $d(a,B) = 0$ iff $a \in B$.
Proof Since $B$ is closed, $a \in B$ iff $d(a,B) = 0$. □
Lemma 8.1.2 Let $B \in H(\mathbb{R}^n)$. Then $d(x,B)$ is a uniformly continuous function of $x \in \mathbb{R}^n$.
Proof It follows from Proposition 7.2.1(4) that for all $x, \bar{x} \in \mathbb{R}^n$, we have
$$|d(x,B) - d(\bar{x},B)| \le d(\bar{x},x).$$
Hence $d(x,B)$ is uniformly continuous (given $\varepsilon > 0$, take $\delta = \varepsilon$). □
Given $A, B \in H(\mathbb{R}^n)$, define
$$\rho(A,B) = \sup_{a \in A} d(a,B) = \sup_{a \in A} \inf_{b \in B} d(a,b).$$
(For the finiteness of $\rho$ it suffices that $A$ is compact, $B$ is closed.)
We can use $\rho$ to detect inclusion of compact sets.
Lemma 8.1.3 Let $A, B \in H(\mathbb{R}^n)$. Then $\rho(A,B) = 0$ iff $A \subset B$.
Proof Suppose $A \subset B$. Then $d(a,B) = 0$ for all $a \in A$ and so $\rho(A,B) = 0$. Conversely, suppose $\rho(A,B) = 0$. Then $\sup_{a \in A} d(a,B) = 0$. Hence $d(a,B) = 0$ for all $a \in A$. Hence $A \subset B$ (Lemma 8.1.1). □
Lemma 8.1.3 holds if $A$ and $B$ are closed (not necessarily compact). The next lemma makes essential use of compactness and fails if $A$ and $B$ are not compact.
Lemma 8.1.4 Let $A, B \in H(\mathbb{R}^n)$. Then there exist $a_0 \in A$, $b_0 \in B$ such that
$$\rho(A,B) = d(a_0,b_0).$$
Moreover,
(a) $\rho(A,B) = d(a_0,b_0) \le d(a_0,b)$, for all $b \in B$,
(b) $\rho(A,B) = d(a_0,b_0) \ge d(a,b_0)$, for all $a \in A$.
[Fig. 8.2 Lemma 8.1.4, parts (a) and (b): sets $A$ and $B$ with points $a, a_0 \in A$ and $b_0, b \in B$]
Proof Since $d(x,B)$ is continuous on $A$ and $A$ is compact, there exists an $a_0 \in A$ such that
$$\rho(A,B) = d(a_0,B).$$
By (8.1), there exists a $b_0 \in B$ such that
$$d(a_0,B) = d(a_0,b_0).$$
Hence $\rho(A,B) = d(a_0,b_0)$. The remaining statements are immediate from the definitions (see Fig. 8.2). □
Lemma 8.1.5 Let $A, X, Y \in H(\mathbb{R}^n)$. If $X \subset Y$, then
$$\rho(A,X) \ge \rho(A,Y).$$
Proof For all $a \in A$, $d(a,X) \ge d(a,Y)$ (since $X \subset Y$). Therefore, for all $a \in A$, we have
$$d(a,Y) \le d(a,X) \le \sup_{\bar{a} \in A} d(\bar{a},X) = \rho(A,X).$$
That is, $\rho(A,X)$ is an upper bound for $d(a,Y)$, $a \in A$, and so
$$\rho(A,X) \ge \sup_{a \in A} d(a,Y) = \rho(A,Y).$$
□
The next result will be crucial for our main applications.
Lemma 8.1.6 If $A, B, C, D \in H(\mathbb{R}^n)$, then
$$\rho(A \cup B, C \cup D) \le \max\{\rho(A,C), \rho(B,D)\}.$$
Proof We claim that
$$\rho(A \cup B, C \cup D) = \max\{\rho(A, C \cup D), \rho(B, C \cup D)\}.$$
For this observe that
$$\rho(A \cup B, C \cup D) = \sup_{x \in A \cup B} d(x, C \cup D) = d(x_0, C \cup D) \ \text{for some } x_0 \in A \cup B,$$
and this common value equals $\rho(A, C \cup D)$ if $x_0 \in A$ and $\rho(B, C \cup D)$ if $x_0 \in B$. In either case it equals $\max\{\rho(A, C \cup D), \rho(B, C \cup D)\}$.
Now by Lemma 8.1.5,
$$\rho(A, C \cup D) \le \rho(A,C), \qquad \rho(B, C \cup D) \le \rho(B,D).$$
Hence $\rho(A \cup B, C \cup D) \le \max\{\rho(A,C), \rho(B,D)\}$. □
Remark 8.1.7 For a more symmetric version of the lemma, see Exercises 8.1.13(2).
Lemma 8.1.8 Given $A, B, C \in H(\mathbb{R}^n)$, we have
$$\rho(A,B) + \rho(B,C) \ge \rho(A,C).$$
Proof By Lemma 8.1.4, there exist $a_0, a_1 \in A$, $b_0, b_1 \in B$ and $c_0, c_1 \in C$ such that
$$\rho(A,B) = d(a_1,b_0), \quad \rho(B,C) = d(b_1,c_1), \quad \rho(A,C) = d(a_0,c_0).$$
By Lemma 8.1.4(a), we have
$$\rho(A,C) = d(a_0,c_0) \le d(a_0,c_1).$$
By the triangle inequality for $d$, we have
$$d(a_0,c_1) \le d(a_0,b_0) + d(b_0,c_1).$$
By Lemma 8.1.4(b), we have
$$\rho(A,B) = d(a_1,b_0) \ge d(a_0,b_0), \qquad \rho(B,C) = d(b_1,c_1) \ge d(b_0,c_1).$$
Hence
$$\rho(A,C) \le d(a_0,c_1) \le d(a_0,b_0) + d(b_0,c_1) \le d(a_1,b_0) + d(b_1,c_1) = \rho(A,B) + \rho(B,C).$$
□
We define the Hausdorff metric $h$ on $H(\mathbb{R}^n)$ by
$$h(X, Y) = \max\{\rho(X, Y), \rho(Y, X)\}, \quad X, Y \in H(\mathbb{R}^n).$$
Theorem 8.1.9 $h$ defines a metric on $H(\mathbb{R}^n)$. Moreover, for all $A, B, C, D \in H(\mathbb{R}^n)$
we have
$$h(A \cup B, C \cup D) \le \max\{h(A, C), h(B, D)\}.$$
Proof Obviously $h(X, Y) \ge 0$ for all $X, Y \in H(\mathbb{R}^n)$.
If $h(X, Y) = 0$, then $\rho(X, Y) = \rho(Y, X) = 0$. If $\rho(X, Y) = 0$ then $X \subset Y$
(Lemma 8.1.3). Similarly, if $\rho(Y, X) = 0$, then $Y \subset X$. Hence if $h(X, Y) = 0$, then
$X \subset Y \subset X$ and so $X = Y$.
It is immediate from the definition of $h$ that $h(X, Y) = h(Y, X)$. It remains to
prove the triangle inequality. For $X, Y, Z \in H(\mathbb{R}^n)$, we have by Lemma 8.1.8,
$$\begin{aligned}
\rho(X, Z) &\le \rho(X, Y) + \rho(Y, Z), \\
\rho(Z, X) &\le \rho(Z, Y) + \rho(Y, X) = \rho(Y, X) + \rho(Z, Y).
\end{aligned}$$
Hence
$$\rho(X, Z),\ \rho(Z, X) \le \max\{\rho(X, Y), \rho(Y, X)\} + \max\{\rho(Y, Z), \rho(Z, Y)\},$$
and so
$$\begin{aligned}
h(X, Z) &= \max\{\rho(X, Z), \rho(Z, X)\} \\
&\le \max\{\rho(X, Y), \rho(Y, X)\} + \max\{\rho(Y, Z), \rho(Z, Y)\} \\
&= h(X, Y) + h(Y, Z).
\end{aligned}$$
The estimate for $h(A \cup B, C \cup D)$ follows from the corresponding result for $\rho$ (Lemma 8.1.6). □
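For finite subsets of the plane the suprema and infima in the definitions are maxima and minima, so the one-sided distance and the Hausdorff metric can be computed directly. The following is a minimal sketch under that assumption; the function names `dist_to_set`, `rho` and `hausdorff` and the sample sets are illustrative choices, not notation from the text.

```python
import math

def dist_to_set(x, B):
    """d(x, B): smallest Euclidean distance from the point x to the finite set B."""
    return min(math.dist(x, b) for b in B)

def rho(A, B):
    """One-sided distance: the largest value of d(a, B) over a in A."""
    return max(dist_to_set(a, B) for a in A)

def hausdorff(X, Y):
    """Hausdorff metric h(X, Y) = max{rho(X, Y), rho(Y, X)}."""
    return max(rho(X, Y), rho(Y, X))

# Illustrative finite sets (lists of points in the plane).
A = [(0.0, 0.0)]
B = [(0.0, 0.0), (3.0, 4.0)]
C = [(1.0, 0.0)]
D = [(3.0, 3.0)]

# A is a subset of B, so rho(A, B) vanishes, while rho(B, A) does not:
# the one-sided distance is not symmetric, which is why h takes a maximum.
print(rho(A, B), rho(B, A), hausdorff(A, B))

# Numerical check of the union estimate of Theorem 8.1.9 on this example.
print(hausdorff(A + B, C + D) <= max(hausdorff(A, C), hausdorff(B, D)))
```

List concatenation stands in for the union here; repeated points do not affect any of the maxima or minima.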
8.1.1 Completeness of $(H(\mathbb{R}^n), h)$
We start by defining a useful family of closed compact neighbourhoods (in $\mathbb{R}^n$) of a
point $X \in H(\mathbb{R}^n)$. Suppose then that $X \in H(\mathbb{R}^n)$ and let $r > 0$. Define
$X(r) = \{x \in \mathbb{R}^n \mid d(x, X) \le r\}$. The set $X(r)$ is a closed neighbourhood of $X$ regarded as a subset
of $\mathbb{R}^n$. In particular, if $X = \{x_0\}$, $X(r) = D_r(x_0)$ (closed $r$-disk, centre $x_0$).
Lemma 8.1.10 Let $X, Y \in H(\mathbb{R}^n)$ and $r > 0$. Then $h(X, Y) \le r$ iff $Y \subset X(r)$ and
$X \subset Y(r)$.
Proof Left to the exercises. □
Remark 8.1.11 If we define $D_r(X) = \{Y \in H(\mathbb{R}^n) \mid h(X, Y) \le r\}$, then
Lemma 8.1.10 implies that
$$D_r(X) = \{Y \in H(\mathbb{R}^n) \mid Y \subset X(r),\ X \subset Y(r)\}.$$
We can use Lemma 8.1.10 to get a better understanding of convergence in $H(\mathbb{R}^n)$.
Suppose that $(X_n) \subset H(\mathbb{R}^n)$ converges to $X$. This means that, given $\varepsilon > 0$, there
exists an $N \in \mathbb{N}$ such that
$$h(X_n, X) < \varepsilon, \quad n \ge N.$$
By Lemma 8.1.10, $X_n \subset X(\varepsilon)$, for all $n \ge N$. In particular, for $n \ge N$ large, $X_n$ will
be an $\varepsilon$-approximation to $X$: if you can only resolve detail to within $\varepsilon$, $X_n$ will be
indistinguishable from $X$ for $n \ge N$.
Theorem 8.1.12 $(H(\mathbb{R}^n), h)$ is a complete metric space.
Proof Suppose that $(X_n)$ is a Cauchy sequence in $H(\mathbb{R}^n)$. Since $(X_n)$ is Cauchy,
there exists an $N \in \mathbb{N}$ such that $h(X_n, X_m) \le 1$, $n, m \ge N$. By Lemma 8.1.10,
we have
$$X_n \subset X_N(1), \quad \text{for all } n \ge N.$$
This means that we can assume all the $X_n$ are subsets of some fixed compact subset
$Z$ of $\mathbb{R}^n$. Specifically $Z = X_1 \cup \dots \cup X_{N-1} \cup X_N(1)$.
We now follow the same strategy we used for Cauchy sequences in $\mathbb{R}^n$. We
know the sequence $(X_n)$ is bounded, so we look at the set of limit points. The next
definition should look familiar. Define
$$\Lambda = \bigcap_{n \ge 1} \overline{\bigcup_{m \ge n} X_m}.$$
Each of the sets $\overline{\bigcup_{m \ge n} X_m}$ is compact, since every $X_m \subset Z$ and so for all $n \ge 1$,
$\overline{\bigcup_{m \ge n} X_m} \subset Z$. Since $Z$ is bounded and $\overline{\bigcup_{m \ge n} X_m}$ is closed, it follows by
Bolzano–Weierstrass that $\overline{\bigcup_{m \ge n} X_m}$ is compact.
Now $\overline{\bigcup_{m \ge 1} X_m} \supset \overline{\bigcup_{m \ge 2} X_m} \supset \cdots$ is a decreasing sequence of non-empty compact
subsets of $\mathbb{R}^n$ and so $\Lambda$ is a non-empty compact subset of $\mathbb{R}^n$. It suffices to show that
$$\lim_{n \to \infty} X_n = \Lambda.$$
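Choose $\varepsilon > 0$. Since $(X_n)$ is Cauchy, there exists an $N_1 \in \mathbb{N}$ such that $h(X_n, X_m) \le \varepsilon$,
for all $n, m \ge N_1$. Since $X_m \subset X_n(\varepsilon)$, for all $n, m \ge N_1$, and
$\Lambda = \bigcap_{n \ge p} \overline{\bigcup_{m \ge n} X_m}$ for all $p \ge 1$, we certainly have $\Lambda \subset X_n(\varepsilon)$, $n \ge N_1$.
We claim that we can find $N_2 \in \mathbb{N}$
such that $X_n \subset \Lambda(\varepsilon)$, $n \ge N_2$. Assuming the claim, we then have $X_n \subset \Lambda(\varepsilon)$,
$\Lambda \subset X_n(\varepsilon)$, for all $n \ge N = \max\{N_1, N_2\}$ and so, by Lemma 8.1.10, $h(X_n, \Lambda) \le \varepsilon$,
$n \ge N$, proving the convergence of $(X_n)$ to $\Lambda$.
It remains to prove the claim. Suppose the contrary. Then for each $p \in \mathbb{N}$,
there exists an $n \ge p$ such that $X_n \not\subset \Lambda(\varepsilon)$. Hence there exists an $x_n \in X_n$ such
that $d(x_n, \Lambda) > \varepsilon$. Using this observation, we may construct a sequence $(x_{n_k})$
such that $x_{n_k} \in X_{n_k}$ and $d(x_{n_k}, \Lambda) > \varepsilon$, $k \in \mathbb{N}$. Since $(x_{n_k}) \subset Z$, it follows
that $(x_{n_k})$ has a convergent subsequence $(x_{m_k})$ with limit $z \in Z$. By construction,
$d(z, \Lambda) = \lim_{k \to \infty} d(x_{m_k}, \Lambda) \ge \varepsilon$. But $z \in \bigcap_{n \ge 1} \overline{\bigcup_{m \ge n} X_m} = \Lambda$ and so $d(z, \Lambda) = 0$.
Contradiction. □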
EXERCISES 8.1.13
(1) Let $A, B \in H(\mathbb{R}^n)$. Recall that
$$\rho(A, B) = \sup_{a \in A} \inf_{b \in B} d(a, b).$$
(a) Show, by means of a (simple) example, that $\inf_{b \in B} \sup_{a \in A} d(a, b)$ does not
generally equal $\rho(A, B)$.
(b) Suppose $\inf_{b \in B} \sup_{a \in A} d(a, b) = 0$. What does this say about $A$ and $B$?
(2) Show that if $A, B, C, D \in H(\mathbb{R}^n)$, then
$$\rho(A \cup B, C \cup D) \le \max\{\min\{\rho(A, C), \rho(A, D)\}, \min\{\rho(B, C), \rho(B, D)\}\}.$$
(Hint: use the argument of the proof of Lemma 8.1.6.)
(3) Complete the proof of Lemma 8.1.10 by showing that if $A, B \in H(\mathbb{R}^n)$, then
$h(A, B) = \inf\{r \mid A(r) \supset B \text{ and } B(r) \supset A\}$.
(4) Prove that $(H(\mathbb{R}^n), h)$ is a separable metric space. (Hint: define a countable
dense subset $E$ of $(H(\mathbb{R}^n), h)$ which consists of finite sets; for example,
$E = \{X \subset \mathbb{Q}^n \mid X \text{ finite}\}$.)
(5) Let $(X, d)$ be a complete metric space and $H(X)$ denote the set of compact
subsets of $X$. Show how to define the Hausdorff metric $h_X$ on $H(X)$ and verify
that $(H(X), h_X)$ is complete.
(6) Let $(X, d)$ be a metric space. Show that $(H(X), h_X)$ is compact iff $(X, d)$ is
compact. (Caution: this needs the open cover definition of compactness, that every
open cover has a finite subcover, as there is no assumption that $X$ is separable.)
(7) Let $(X, d)$ be a metric space and suppose that $(x_n) \subset X$ is a Cauchy sequence.
For $n \ge 1$, set $A_n = \{x_i \mid 1 \le i \le n\} \subset X$. Show that $(A_n)$ is a Cauchy
sequence in $(H(X), h_X)$ (notation of previous exercise). Deduce that $(H(X), h_X)$
is complete if and only if $X$ is complete.
8.2 Iterated Function Systems
Recall from Chap. 7 that a map $f : \mathbb{R}^n \to \mathbb{R}^n$ is a contraction if there exists a
$k \in [0, 1)$ such that
$$d(f(x), f(y)) \le k\, d(x, y), \quad \text{for all } x, y \in \mathbb{R}^n,$$
and that we call the smallest value of $k$ for which this estimate holds the contraction
constant of $f$.
Lemma 8.2.1 Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a contraction map with contraction constant
$k$. If we define $F : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ by $F(X) = f(X)$, then $F$ is a contraction
mapping with contraction constant $k$.
Proof We have to show $h(f(X), f(Y)) \le k\, h(X, Y)$ for all $X, Y \in H(\mathbb{R}^n)$. Since
$h(X, Y) = \max\{\rho(X, Y), \rho(Y, X)\}$, it suffices to show $\rho(f(X), f(Y)) \le k\, \rho(X, Y)$ for
all $X, Y \in H(\mathbb{R}^n)$. We have
$$\begin{aligned}
\rho(f(X), f(Y)) &= \sup_{x \in X} \inf_{y \in Y} d(f(x), f(y)) \\
&\le \sup_{x \in X} \inf_{y \in Y} k\, d(x, y) \\
&= k \sup_{x \in X} \inf_{y \in Y} d(x, y) \\
&= k\, \rho(X, Y).
\end{aligned}$$
If we take $X, Y$ to be the point sets $\{x\}$, $\{y\}$, we see that $k$ is the contraction constant
of $F$. □
Suppose that we are given continuous functions $f_1, \dots, f_p : \mathbb{R}^n \to \mathbb{R}^n$. We define
the operator¹ $F : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ by
$$F(X) = f_1(X) \cup \dots \cup f_p(X), \quad X \in H(\mathbb{R}^n).$$
Note that $F$ does take values in $H(\mathbb{R}^n)$. Indeed, since the $f_i$ are assumed continuous,
each $f_i(X)$ is a compact subset of $\mathbb{R}^n$. Since we have a finite union of compact sets,
$F(X)$ is compact and so $F(X) \in H(\mathbb{R}^n)$ for all $X \in H(\mathbb{R}^n)$ (Exercises 7.13.24(6)).
Now assume that each $f_i$ is a contraction. If $f_i$ has a contraction constant $k_i < 1$,
then taking $k = \max_i k_i < 1$, we can assume the $f_i$ have a common contraction
constant $k$.
¹ We prefer to use the term 'operator' rather than 'map'.
Proposition 8.2.2 (Notation and Assumptions as Above) The operator
$F : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ is a contraction map.
Proof Let $X, Y \in H(\mathbb{R}^n)$. We have
$$h(F(X), F(Y)) = h(f_1(X) \cup \dots \cup f_p(X),\ f_1(Y) \cup \dots \cup f_p(Y)).$$
We have $h(A \cup B, C \cup D) \le \max\{h(A, C), h(B, D)\}$ for all $A, B, C, D \in H(\mathbb{R}^n)$.
Applying this result repeatedly to the right-hand side of the expression for
$h(F(X), F(Y))$ gives
$$h(F(X), F(Y)) \le \max_{1 \le i \le p} h(f_i(X), f_i(Y)).$$
By Lemma 8.2.1, $h(f_i(X), f_i(Y)) \le k\, h(X, Y)$, $1 \le i \le p$, and so we have shown
$h(F(X), F(Y)) \le k\, h(X, Y)$. □
Corollary 8.2.3 (Notation and Assumptions as Above) The operator
$F : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ has a unique fixed point $X^\star \in H(\mathbb{R}^n)$. Moreover, $F(X^\star) = X^\star$ iff
$$X^\star = f_1(X^\star) \cup \dots \cup f_p(X^\star). \qquad (8.2)$$
Proof Apply the contraction mapping lemma. □
Remarks 8.2.4
(1) Suppose we have a finite set of contraction maps $f_1, \dots, f_p$ of $\mathbb{R}^n$. Start with
any compact subset $X$ of $\mathbb{R}^n$ (for example a single point). Iterate $F$ and define
$X_n = F^n(X)$. Then $X_n$ always converges to the same compact subset of $\mathbb{R}^n$,
independent of the initial set $X$.
(2) Equation (8.2) shows that the fixed point $X^\star$ of the operator $F$ has the property
that it is the union of scaled-down copies of itself. This property is a form of
self-similarity. We have already seen self-similarity in the Cantor set $C$ and we
shall shortly give some striking visual examples of self-similarity. We remark
that sets that exhibit self-similarity at all scales are often called fractals.
(3) Proposition 8.2.2 applies equally well to the set $H(X)$ of compact subsets of any
complete metric space $X$. We refer to a finite set $\{f_i\}$ of contractions of $X$ as an
iterated function system or IFS. John Hutchinson showed in 1981 [15] that the
operator associated to an IFS had a unique fixed point. Subsequently, iterated
function systems, and their associated fractals, were popularized in Michael
Barnsley's book Fractals Everywhere [2].
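The independence from the initial set described in Remark (1) can be observed numerically by iterating the operator on two different one-point sets and watching the Hausdorff distance between the iterates shrink. The sketch below is illustrative only: it assumes three maps that each move a point halfway to a vertex of an equilateral triangle (contraction constant 1/2), and the helper names `make_halfway_map`, `apply_operator` and `hausdorff` are hypothetical.

```python
import math

def dist_to_set(x, B):
    """Smallest Euclidean distance from the point x to the finite set B."""
    return min(math.dist(x, b) for b in B)

def hausdorff(X, Y):
    """Hausdorff distance between two finite sets of points."""
    rho_xy = max(dist_to_set(x, Y) for x in X)
    rho_yx = max(dist_to_set(y, X) for y in Y)
    return max(rho_xy, rho_yx)

def make_halfway_map(v):
    """Contraction x -> (x + v)/2: constant 1/2, fixed point v."""
    return lambda x: ((x[0] + v[0]) / 2, (x[1] + v[1]) / 2)

# Assumed example: vertices of an equilateral triangle centred at the origin.
VERTICES = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]
MAPS = [make_halfway_map(v) for v in VERTICES]

def apply_operator(maps, X):
    """F(X): the union of the images f(X) over all maps f, for a finite set X."""
    return {f(x) for f in maps for x in X}

X = {(0.0, 0.0)}      # first initial set: a single point
Y = {(5.0, -3.0)}     # second initial set: a different single point
h0 = hausdorff(X, Y)
distances = []
for n in range(1, 6):
    X, Y = apply_operator(MAPS, X), apply_operator(MAPS, Y)
    distances.append(hausdorff(X, Y))
    # The contraction estimate gives h(F^n(X), F^n(Y)) <= h0 / 2^n.
    print(n, len(X), distances[-1], h0 / 2 ** n)
```

Since both sequences of iterates converge to the same fixed point, the distance between them must tend to zero, and the printed bound shows the geometric rate.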
EXERCISES 8.2.5
(1) Suppose that $f_1, \dots, f_p$ are contractions of $\mathbb{R}^n$. Let $f_j$ have fixed point $x_j^\star \in \mathbb{R}^n$,
$j \in \mathbf{p}$. Show that if $X^\star \in H(\mathbb{R}^n)$ is the fixed point of $F$ given by Corollary 8.2.3,
then $x_j^\star \in X^\star$, for all $j \in \mathbf{p}$. Deduce that if $a_1, \dots, a_n \in \mathbf{p}$, then for all $j \in \mathbf{p}$,
$f_{a_n} \circ \dots \circ f_{a_1}(x_j^\star) \in X^\star$.
(2) (Notation of previous question). Show that if $p = 2$, then it is possible to choose
$f_1, f_2$ so that $X^\star$ consists of exactly two points. Let $m > 2$. Can we choose $f_1, f_2$ so
that $X^\star$ consists of $m$ points? What about if $f_1, f_2$ are affine linear contractions?
8.3 Examples of Iterated Function Systems
An affine linear map of $\mathbb{R}^n$ is a mapping $L : \mathbb{R}^n \to \mathbb{R}^n$ which can be written in
the form
$$Lx = Ax + b, \quad x \in \mathbb{R}^n,$$
where $A$ is a linear mapping of $\mathbb{R}^n$ ($n \times n$ matrix) and $b \in \mathbb{R}^n$.
Lemma 8.3.1
(1) An affine linear map $Lx = ax + b$ of $\mathbb{R}$ is a contraction iff $|a| < 1$.
(2) The affine linear map of $(\mathbb{R}^2, d_2)$ given by
$$L(x, y) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix}$$
is a contraction iff
$$a^2 + c^2 < 1, \qquad b^2 + d^2 < 1, \qquad a^2 + b^2 + c^2 + d^2 < 1 + (ad - bc)^2.$$
Proof
(1) For all $x, y \in \mathbb{R}$, we have $|Lx - Ly| = |a(x - y)| = |a||x - y|$. Hence $L$ is a
contraction iff $|a| < 1$.
(2) Since $d(Lx_1, Lx_2) = \|A(x_1 - x_2)\|$, $x_1, x_2 \in \mathbb{R}^2$, $L$ is a contraction iff $A$ is a
contraction. The linear map $A$ is a contraction iff given $(x, y) \ne (0, 0)$ we have
$$\|A(x, y)\|^2 = (ax + by)^2 + (cx + dy)^2 < x^2 + y^2.$$
The contraction constant of $A$ will then be $\sup\{\|A(x, y)\| \mid x^2 + y^2 = 1\}$. Now
$(ax + by)^2 + (cx + dy)^2 < x^2 + y^2$ iff
$$x^2(a^2 + c^2 - 1) + 2xy(ab + cd) + y^2(b^2 + d^2 - 1) < 0.$$
This condition holds for all $(x, y) \ne (0, 0)$ iff $a^2 + c^2 < 1$, $b^2 + d^2 < 1$ and
$(a^2 + c^2 - 1)(b^2 + d^2 - 1) - (ab + cd)^2 > 0$. The last condition simplifies to
$a^2 + b^2 + c^2 + d^2 < 1 + (ad - bc)^2$. □
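The criterion of Lemma 8.3.1(2) can be compared numerically with the contraction constant $\sup\{\|A(x, y)\| \mid x^2 + y^2 = 1\}$, which for a $2 \times 2$ matrix is the largest singular value and has a closed form in terms of $a^2 + b^2 + c^2 + d^2$ and $ad - bc$. The following is a sketch for illustration; the function names and the sample matrices are assumptions, not taken from the text.

```python
import math

def is_contraction_lemma(a, b, c, d):
    """The three inequalities of Lemma 8.3.1(2) for the matrix [[a, b], [c, d]]."""
    det = a * d - b * c
    return (a * a + c * c < 1
            and b * b + d * d < 1
            and a * a + b * b + c * c + d * d < 1 + det * det)

def operator_norm(a, b, c, d):
    """Largest singular value of [[a, b], [c, d]], i.e. sup of ||A(x, y)|| on the unit circle."""
    t = a * a + b * b + c * c + d * d        # trace of A^T A
    det = a * d - b * c                      # det(A^T A) equals det^2
    disc = max(t * t - 4 * det * det, 0.0)   # guard against rounding below zero
    return math.sqrt((t + math.sqrt(disc)) / 2)

# Illustrative matrices: a scaling, a scaled rotation, and a shear-like map whose
# columns are both shorter than 1 although the map is not a contraction.
samples = [
    (0.5, 0.0, 0.0, 0.5),
    (0.0, -0.8, 0.8, 0.0),
    (0.9, 0.9, 0.0, 0.1),
]
for m in samples:
    print(m, is_contraction_lemma(*m), operator_norm(*m))
```

The third sample satisfies the first two inequalities but fails the third, which illustrates why the condition involving $(ad - bc)^2$ cannot be dropped.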
Fig. 8.3 The IFS $\{\rho_A, \rho_B, \rho_C\}$
8.3.1 The Sierpiński Triangle (or Gasket)
Fix the equilateral triangle $\triangle ABC$ in the plane with vertices $A = (1, 0)$,
$B = \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$, $C = \left(-\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)$, and note that the centre of this triangle is the origin
of $\mathbb{R}^2$ (see Fig. 8.3).
We define affine linear contractions $\rho_A$, $\rho_B$ and $\rho_C$ of $\mathbb{R}^2$. Let $X = (x, y) \in \mathbb{R}^2$.
Define
$$\rho_A(X) = \frac{X}{2} + \left(\frac{1}{2}, 0\right).$$
Observe that $\rho_A$ is a contraction with contraction constant $k_A = \frac{1}{2}$ and unique fixed
point $A$. Similarly, define
$$\rho_B(X) = \frac{X}{2} + \left(-\frac{1}{4}, \frac{\sqrt{3}}{4}\right),$$
which has fixed point $B$, and
$$\rho_C(X) = \frac{X}{2} - \left(\frac{1}{4}, \frac{\sqrt{3}}{4}\right),$$
which has fixed point $C$. Each of these maps moves $X$ exactly halfway to the
corresponding vertex (see Fig. 8.3) and all the maps have the same contraction
constant $\frac{1}{2}$.
Let $S : H(\mathbb{R}^2) \to H(\mathbb{R}^2)$ be the operator defined by the IFS $\{\rho_A, \rho_B, \rho_C\}$. It
follows from Corollary 8.2.3 that $S$ has a unique fixed point. In Fig. 8.4 we show
the first two iterates of the map $S$ where as our initial point we have taken the (filled-in) triangle $\triangle ABC$.
Fig. 8.4 The first two iterates of $S$
Fig. 8.5 Sierpiński triangle (or gasket)
In Fig. 8.5 we show a visualization of the fixed point. This compact subset of
R2 is known as the Sierpiński triangle or Sierpiński gasket. Just as for the Cantor
set, the Sierpiński triangle is self-similar. Each of the little triangles making up the
Sierpiński triangle is a scaled down copy of the Sierpiński triangle.
8.3.2 Four Variations on the Sierpiński Triangle
In Fig. 8.6a, we show the effect of increasing the contraction constant from 0.5
to 0.55. Observe that there is now an overlap occurring if we iterate the filled-in
triangle $\triangle ABC$. On the other hand if we decrease the contraction constant from 0.5
to 0.45, we get the effect shown in Fig. 8.6b. Finally, in Fig. 8.7, we show the effect
of increasing the number of elements in the IFS to five. For both images shown in
Fig. 8.7, we have taken five contractions, one for each vertex of a regular pentagon.
In Fig. 8.7a, the contraction constants were all 0.5; in Fig. 8.7b, the contraction
constants were all 0.45. We explain the "grey scale" colouring used in Fig. 8.7 in
the paragraph on random iteration in Sect. 8.4.1.
Fig. 8.6 Varying the contraction constant in the IFS used for the Sierpiński triangle. (a) Contraction constant 0.55, (b) contraction constant 0.45.
Fig. 8.7 Sierpiński pentagons: (a) contraction constant 0.5, (b) contraction constant 0.45.
EXERCISES 8.3.2
(1) Define
$$L_1(x) = \frac{x}{3}, \qquad L_2(x) = \frac{x}{3} + \frac{2}{3}, \qquad x \in \mathbb{R}.$$
Show that $\{L_1, L_2\}$ is an iterated function system with fixed point the middle
thirds Cantor set $C$. (This IFS contracts by $1/3$ from the points $0, 1$; compare
with the IFS giving the Sierpiński triangle.)
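(2) Let $\{f_1, f_2\}$ be the IFS given by
$$f_1(x, y) = \begin{pmatrix} 0.4000 & 0.3733 \\ 0.0600 & 0.6000 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0.3533 \\ 0.0000 \end{pmatrix},$$
$$f_2(x, y) = \begin{pmatrix} 0.8000 & 0.1867 \\ 0.1371 & 0.8000 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 1.1000 \\ 0.1000 \end{pmatrix}.$$
Verify that $f_1$ and $f_2$ are contractions. If you have access to a computer with
Matlab or Mathematica, plot the resulting image you get with this IFS. (Use
random iteration; see the next section.)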
(3) Prove that the Sierpiński triangle is connected. What about the fractals in
Figs. 8.6, 8.7? (Hint: Exercises 7.18.20(21).)
(4) Show that the Sierpiński triangle is path connected.
(5) Suppose that instead of the Euclidean metric on $\mathbb{R}^n$, we use the metric
$$d_\infty(x, y) = \max_{1 \le i \le n} |x_i - y_i|, \quad x = (x_1, \dots, x_n),\ y = (y_1, \dots, y_n).$$
Show that the affine linear map $Lx = Ax + b$ is a contraction (with respect to
$d_\infty$) iff
$$\max_{1 \le i \le n} \left( \sum_{j=1}^{n} |a_{ij}| \right) < 1.$$
Find an example of an affine linear map of $\mathbb{R}^2$ which is a contraction with
respect to $d_\infty$ but not the Euclidean metric $d_2$.
(6) Suppose that $\{f_1, \dots, f_p\}$ is a set of affine linear maps of $\mathbb{R}^n$ which are
contractions of $(\mathbb{R}^n, d_\infty)$. Show that the operator $F : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ defined
by $F(X) = \bigcup_{i=1}^{p} f_i(X)$ has a unique fixed point even though $F$ may not be a
contraction of $(H(\mathbb{R}^n), h)$. (Hint: change the metric to '$h_\infty$'.)
(7) A necessary condition for an affine linear map $Lx = Ax + b$ of $\mathbb{R}^n$ to be
a contraction is that all the eigenvalues of $A$ have modulus less than 1 (see
Exercises 7.17.16(2)). Conversely, if this condition holds it can be shown (using
Jordan normal form) that there exists a norm on $\mathbb{R}^n$ with respect to which $L$ is
a contraction. This suggests that if we have a finite set of affine linear maps
$L_i x = A_i x + b_i$ such that each $A_i$ has all eigenvalues of modulus less than 1,
then the corresponding IFS has a unique fixed point that can be obtained by
iteration. Find an example in $\mathbb{R}^2$ with just two maps that shows this conclusion
is false. (Hint: take $A_1$ to be the composition of rotation through $\pi/2$ with the
diagonal matrix $[d_1, d_2]$ where $d_1 d_2 \in (0, 1)$ and $d_1 < 1 < d_2$ and $A_2$ to be
the composition of the rotation through $\pi/2$ with the diagonal matrix $[d_2, d_1]$.
These ideas have implications in control theory; see [23, Chap. 1].)
(8) Show that the product $C^2 = \{(c_1, c_2) \mid c_1, c_2 \in C\}$ of two middle-thirds Cantor
sets can be represented as the unique fixed point of the iterated function system
$I = \{L_{ij} \mid i, j \in \{0, 1\}\}$, where $L_{ij}(x) = \frac{1}{3}(x - v_{ij}) + v_{ij}$, $x \in \mathbb{R}^2$, and $v_{ij} = (i, j)$
are vertices of the unit square $[0, 1]^2$ in $\mathbb{R}^2$. Let $F : H(\mathbb{R}^2) \to H(\mathbb{R}^2)$ denote the
operator determined by $I$. Let $a \in [0, 2]$ and $\ell_a$ denote the line $x + y = a$. Show
that $\ell_a \cap F^n([0, 1]^2) \ne \emptyset$ for all $a \in [0, 2]$, $n \ge 1$. Deduce that $C + C = [0, 2]$.
(Hints: use exercise (1) and show that it is enough to prove $\ell_a \cap F([0, 1]^2) \ne \emptyset$
for all $a \in [0, 2]$.)
(9) Let $r \in (0, \frac{1}{2}]$ and $C_r$ denote the set defined by the iterated function system
$\{L_i^r \mid i \in \{0, 1\}\}$, where $L_i^r(x) = r(x - i) + i$, $i \in \{0, 1\}$. Show that
(a) $C_r$ is a Cantor set (Definition 7.14.19) if and only if $r < \frac{1}{2}$. What is $C_{1/2}$?
(b) $C_r + C_r \subset [0, 2]$ with equality if and only if $r \in [\frac{1}{3}, \frac{1}{2}]$.
(c) If $r < \frac{1}{3}$, then $C_r + C_r \subset [0, 2]$ is a Cantor set and $[0, 2] \setminus (C_r + C_r)$ is a
disjoint union of open intervals of total length 2.
8.4 Concluding Remarks
8.4.1 Computing the Fixed Point of an IFS
Suppose we are given an IFS $\{f_i \mid i = 1, \dots, p\}$, where each $f_i : \mathbb{R}^2 \to \mathbb{R}^2$ is an
affine linear contraction with contraction constant $k_i$. Set $k = \max_i k_i$ and let
$F : H(\mathbb{R}^2) \to H(\mathbb{R}^2)$ be the contraction induced by the IFS. It follows from the second
part of the contraction mapping lemma that in order to compute the fixed point $X^\star$
of $F$, we can start with any initial $X_0 \in H(\mathbb{R}^2)$ and iterate by $F$. If $h(X_0, X^\star) = C$,
then after $n$ iterations we have the estimate $h(F^n(X_0), X^\star) \le k^n C$. In particular, we
can take $X_0 = \{x_0\}$, a point in $\mathbb{R}^2$. We have $h(\{x_0\}, X^\star) = \sup_{x \in X^\star} d(x_0, x)$, where
$d$ is the Euclidean metric on $\mathbb{R}^2$. Now $X_1 = F(X_0)$ consists of (at most) $p$ points,
$X_2 = F^2(X_0)$ at most $p^2$ points, and so on. After $n$ iterations, we get a compact
subset $X_n$ of $\mathbb{R}^2$ containing at most $p^n$ points. This process works reasonably well
for the Sierpiński triangle $S$ where the associated IFS has $p = 3$ and $k = 1/2$. If
we want to approximate the triangle to within $10^{-4}$ and we start with $X_0 = \{x_0\}$,
where $h(\{x_0\}, S) = 1$, we need to choose $n$ so that $(1/2)^n < 10^{-4}$. The set $X_n$ will
then consist of at most $3^n$ points. Computing, we find that it suffices to take $n = 14$
and then the number of points in $X_{14}$ will be at most 4,782,969. Although this is not
hard to work out on a computer, note how the size of the array of numbers we need
to store triples at every step. On the other hand, suppose that $p$ is larger, say $p = 10$
and the contraction rate $k$ is bigger, say $k = 0.9$. To get an approximation within
$10^{-4}$, we need to choose $n$ so that $(9/10)^n < 10^{-4}$ (that is, $9^n < 10^{n-4}$) and the
number of points at the $n$th step will then be at most $10^n$. Computing, we find that $n = 88$
and the number of points at step $n$ is $10^{88}$. This is now completely unrealistic to
simulate on a computer. While it is possible to refine this technique for computing
the limit set, we prefer to emphasize an alternative approach based on the idea of
random iteration.
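The iteration counts in estimates of this kind are easy to check by machine. The sketch below finds the smallest $n$ for which $k^n C$ drops below a given tolerance and reports the bound $p^n$ on the number of points stored; the function name `iterations_needed` and its parameters are illustrative assumptions.

```python
def iterations_needed(k, tol, initial_distance=1.0):
    """Smallest n such that initial_distance * k**n < tol, for 0 <= k < 1."""
    n = 0
    error = initial_distance
    while error >= tol:
        error *= k   # one application of the operator shrinks the bound by k
        n += 1
    return n

# Case p = 3 maps with contraction constant 1/2, tolerance 10^-4.
n_small = iterations_needed(0.5, 1e-4)
print(n_small, 3 ** n_small)

# Case p = 10 maps with contraction constant 0.9, same tolerance.
n_large = iterations_needed(0.9, 1e-4)
print(n_large, 10 ** n_large)
```

The second count of points is far beyond anything that can be stored, which is the motivation for replacing deterministic iteration by a random one.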
Random Iteration There is another way to compute the fixed point of an IFS
that is computationally economical and fast. What we do is perform a random
iteration. Fix an initial point $x_0 \in \mathbb{R}^2$. Suppose there are $p$ functions in the IFS.
Successively pick elements of the IFS with equal probability $\frac{1}{p}$. Supposing we
get the random sequence $f_{i_1}, f_{i_2}, \dots, f_{i_k}, \dots$ of functions, we define the sequence
$(x_n) \subset \mathbb{R}^2$ by $x_n = f_{i_n}(x_{n-1})$, $n \ge 1$. It may be proved that, with probability 1, the
set of limit points of the sequence $(x_n)$ is equal to the fixed point set of the IFS.
In practice, this scheme often converges very rapidly. All we do is throw away the
first few points as transient, and then keep iterating and plotting until the image
has stabilized. We can sometimes improve the rate of convergence by choosing $f_i$
with probability $p_i \in (0, 1)$, where $\sum_i p_i = 1$ and we do not necessarily assume
$p_i = 1/p$. All the images of fractals shown thus far in this chapter were computed
using random iteration. The grey scale colouring of the fractals in Fig. 8.7 gives a
representation of the frequency with which points of the iteration visit regions of
the fractal. For example, in Fig. 8.7a, the dark interior region is frequently visited,
while the boundary of the fractal is infrequently visited. We refer the reader to the
references at the end of the chapter for more information and examples.
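A minimal sketch of random iteration, assuming the three "move halfway to a vertex" maps of an equilateral triangle with vertices $(1, 0)$ and $(-\frac{1}{2}, \pm\frac{\sqrt{3}}{2})$, each chosen with probability $1/3$. The function name `random_iteration`, the initial point, the seed and the number of discarded transient points are illustrative choices rather than anything prescribed in the text.

```python
import math
import random

# Assumed example: vertices of an equilateral triangle centred at the origin.
VERTICES = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]

def random_iteration(vertices, n_points, n_transient=20, seed=0):
    """Return n_points of the orbit x_n = f_{i_n}(x_{n-1}) after a transient."""
    rng = random.Random(seed)      # fixed seed so that runs are repeatable
    x, y = 0.3, 0.2                # arbitrary initial point x_0
    points = []
    for step in range(n_transient + n_points):
        vx, vy = rng.choice(vertices)          # each map has probability 1/p
        x, y = (x + vx) / 2, (y + vy) / 2      # move halfway to that vertex
        if step >= n_transient:                # discard the first few points
            points.append((x, y))
    return points

points = random_iteration(VERTICES, 5000)
print(len(points), points[:3])
```

Only the current point is stored at each step, so the memory cost does not grow with the number of iterations; plotting `points` with any graphics package should give an approximation to the fixed point of this IFS.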
8.4.2 The Collage Theorem
The collage theorem gives a constructive scheme for approximating a compact
subset of $\mathbb{R}^2$ (more generally, $\mathbb{R}^n$) arbitrarily closely by the fixed point of an IFS
consisting of affine linear contractions. More precisely, given $X \in H(\mathbb{R}^2)$ and $\varepsilon > 0$,
there exists an IFS $\{f_1, \dots, f_N\}$ such that $h(X, X^\star) < \varepsilon$, where $X^\star$ is the fixed point
of the IFS. Since it is computationally cheap to generate the fixed point of an IFS
using random iteration, these ideas have been used in image compression. We refer
to Barnsley's book [2] for details on the mathematical theory. In Fig. 8.8 we show a
'fractal fern', which was computed using the IFS $\{f_1, f_2, f_3, f_4\}$ where
$$f_1(x, y) = \begin{pmatrix} 0.7 & 0 \\ 0 & 0.7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0.1496 \\ 0.2962 \end{pmatrix},$$
$$f_2(x, y) = \begin{pmatrix} 0.1 & 0.433 \\ 0.1732 & 0.25 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0.4478 \\ 0.0014 \end{pmatrix},$$
$$f_3(x, y) = \begin{pmatrix} 0.1 & 0.433 \\ 0.1732 & 0.25 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0.4445 \\ 0.1559 \end{pmatrix},$$
$$f_4(x, y) = \begin{pmatrix} 0 & 0 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0.4987 \\ 0.007 \end{pmatrix}.$$
Fig. 8.8 A fractal fern
Other sources on fractals include the classic book by Benoit Mandelbrot, The
Fractal Geometry of Nature [24], and the book by Heinz-Otto Peitgen and Peter H.
Richter, The Beauty of Fractals [26]. These books show some of the potential for
fractal-based artwork. Techniques used for making fractal landscapes and images
have been used to create special effects scenes in a number of Hollywood movies,
most notably in the Star Wars series; Star Trek: The Wrath of Khan; and in
the Lord of the Rings trilogy. For a mix of fractals and symmetry, and some
mathematics, we refer to Symmetry in Chaos [11, Chap. 7]. For an introduction
to the mathematical theory of fractals, we suggest the book by Falconer, Fractal
Geometry: Mathematical Foundations and Applications [8].
8.4.3 The Power of Abstraction and Generalization
In Chap. 2, we gave a proof of the foundational theorem from Calculus that every
continuous function on a closed interval is bounded and takes all values between its
upper and lower bounds.² The proof was tricky: it depended crucially on properties
of the real numbers. In the previous chapter on metric spaces, we developed an
abstract framework for the study of results of this type and introduced a range of
new concepts, such as compactness and connectedness, which abstracted the key
properties of the closed interval and real numbers needed for the proof of the
foundational theorem. These concepts defined the precise structure needed for the
proof of general results. The power of this approach can be seen in the present
chapter. We have progressed from the relatively mundane study of real-valued
functions on the real line to the analysis of operators defined on spaces whose points
are compact subsets of Rn . Fixed points are now compact sets rather than points on
the real line. The spaces we deal with—such as spaces of compact sets or spaces of
functions—may be infinite-dimensional and beyond simple visualization.
The moral is that problems in mathematics (and science) that are simple to state
often require methods and concepts that are of great generality and abstraction for
their solution.³ This is the nature and power of mathematics: finding the underlying
structure (the crucial ideas) and then developing an abstract framework which
includes the essential and excludes the inessential.
² This result appears in some form, often without proof, in every undergraduate or high school text
on Calculus.
³ "The paradox is now fully established that the utmost abstractions are the true weapons with which
to control our thought of concrete fact." Alfred North Whitehead, from Science and the Modern
World [29].
Chapter 9
Differential Calculus on $\mathbb{R}^m$
In this chapter we develop the differential calculus on $\mathbb{R}^m$. The key concept is that
of the derivative, which we view as the 'best linear approximation' to a function
rather than as the limit of a quotient (as is done in the theory of differentiable maps
$f : \mathbb{R} \to \mathbb{R}$). All of what we do is independent of norm and choice of coordinate
system on $\mathbb{R}^m$. Linear (and multi-linear) maps between normed vector spaces play
a central role in the theory. Consequently, we start by developing and reviewing the
theory of continuous linear maps between finite-dimensional normed vector spaces.
Proofs of some additional properties of finite-dimensional normed vector spaces,
including the equivalence of all norms on a given finite-dimensional vector space,
are given in an appendix at the end of the chapter. With these preliminaries out of the
way, we develop in a coordinate-free way the theory of the derivative. Next, using
the contraction mapping lemma, we prove the $C^1$ versions of the implicit and inverse
function theorems, the rank theorem, and the existence and uniqueness theorem for
ordinary differential equations. So as to simplify the notation, we initially assume
functions are defined on $\mathbb{R}^m$, rather than on an open subset of $\mathbb{R}^m$; all definitions
and results extend without difficulty to functions defined on open subsets of $\mathbb{R}^m$.
In the remainder of the chapter, we develop the theory of higher derivatives and
prove $C^r$ versions of the chain rule, the inverse and implicit function theorems, and
Taylor's theorem. All of this will require some preliminaries on multi-linear maps
and polynomial maps between vector spaces. We conclude with the $C^r$ version of the
existence theorem for ordinary differential equations, including the $C^r$ dependence
on initial conditions.
9.1 Normed Vector Spaces
Suppose that $V$ is a finite-dimensional real vector space. If the dimension of $V$ is $m$,
we set $\dim(V) = m$. The choice of a basis $\{v_1, \dots, v_m\}$ for $V$ uniquely determines
a linear isomorphism $A : V \to \mathbb{R}^m$ by $A v_i = e_i$, $1 \le i \le m$, where $\{e_1, \dots, e_m\}$
denotes the standard basis of $\mathbb{R}^m$ consisting of unit vectors along each coordinate
axis.¹ In terms of coordinates, if $x = \sum_{i=1}^{m} x_i v_i \in V$, then $Ax$ has coordinates
$(x_1, \dots, x_m) \in \mathbb{R}^m$.
We recall the definition of a norm on V.
Definition 9.1.1 Let $V$ be a vector space. A norm on $V$ is a map $\|\cdot\| : V \to \mathbb{R}$
satisfying
(1) $\|v\| \ge 0$ for all $v \in V$.
(2) $\|v\| = 0$ iff $v = 0$.
(3) $\|av\| = |a|\,\|v\|$ for all $a \in \mathbb{R}$ and $v \in V$.
(4) $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$ (triangle inequality).
We call $(V, \|\cdot\|)$ a normed vector space.
If $(V, \|\cdot\|)$ is a normed vector space, we define the associated metric $d$ on $V$ by
$$d(v, w) = \|v - w\|, \quad v, w \in V.$$
It is conceivable that different norms on $V$ could define metrics which have different
topologies. While this certainly can and does happen if $V$ is infinite-dimensional (see
Exercises 7.1.9(11)), it turns out that all norms define the same topology on a finite-dimensional
vector space. Before we state the precise result, we need a definition.
Definition 9.1.2 Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $V$ are equivalent if
there exists a $C \ge 1$ such that
$$C^{-1}\|v\|_1 \le \|v\|_2 \le C\|v\|_1, \quad \text{for all } v \in V.$$
Remarks 9.1.3
(1) Observe that if the condition of the definition holds, then $C^{-1}\|v\|_2 \le \|v\|_1 \le C\|v\|_2$
and so the definition is symmetrical in the two norms. It is also clear that
if we can find $c', C' > 0$ such that $c'\|v\|_1 \le \|v\|_2 \le C'\|v\|_1$ for all $v \in V$, then
the conditions of the definition are satisfied with $C = \max\{C', 1/c'\}$.
(2) Equivalent norms on $V$ define equivalent metrics on $V$. Consequently, equivalent
norms define the same topology of open subsets of $V$ (Exercises 7.4.27(6))
and so have the same continuous functions.
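As a concrete instance of Definition 9.1.2, the maximum norm and the Euclidean norm on $\mathbb{R}^n$ satisfy $\|v\|_\infty \le \|v\|_2 \le \sqrt{n}\,\|v\|_\infty$, so they are equivalent with $C = \sqrt{n}$. The sketch below checks these bounds on randomly generated vectors; the helper names and the sampling scheme are illustrative assumptions.

```python
import math
import random

def norm_2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

def norm_inf(v):
    """Maximum norm."""
    return max(abs(t) for t in v)

def check_equivalence(n, trials=1000, seed=1):
    """Check C^{-1}||v||_inf <= ||v||_2 <= C||v||_inf with C = sqrt(n) on random vectors."""
    rng = random.Random(seed)
    C = math.sqrt(n)
    for _ in range(trials):
        v = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        a, b = norm_inf(v), norm_2(v)
        # Small tolerance for floating point rounding.
        if not (a / C <= b + 1e-12 and b <= C * a + 1e-12):
            return False
    return True

print(check_equivalence(5))
# The upper bound is attained by the vector with all coordinates equal to 1.
print(norm_2([1.0] * 4), math.sqrt(4) * norm_inf([1.0] * 4))
```

A numerical check of this kind is of course not a proof; it only illustrates how the constant $C$ in the definition depends on the dimension in this example.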
We give the proof of the next theorem in the appendix at the end of the chapter.
Theorem 9.1.4 Any two norms on a finite-dimensional vector space $V$ are equivalent. In particular,
(1) all norms define the same topology on $V$,
(2) $(V, \|\cdot\|)$ is complete with respect to any norm on $V$.
¹ We write $Ax$ rather than $A(x)$ when $A$ is a linear map.
Lemma 9.1.5 Let $(V, \|\cdot\|_V)$ be an $m$-dimensional normed vector space and
$A : V \to \mathbb{R}^m$ be a linear isomorphism. If we define $\|x\| = \|A^{-1}x\|_V$, $x \in \mathbb{R}^m$, then
$\|\cdot\|$ is a norm on $\mathbb{R}^m$ and the topology of open sets on $\mathbb{R}^m$ defined by $\|\cdot\|$ is the
same as that defined by the Euclidean norm on $\mathbb{R}^m$. Moreover, $A : V \to \mathbb{R}^m$ is a
norm-preserving linear homeomorphism:
$$\|x\|_V = \|Ax\|, \quad \text{for all } x \in V.$$
Proof We leave the verification that $\|\cdot\|$ defines a norm on $\mathbb{R}^m$ as an exercise
for the reader. The statement about the topology of open sets on $\mathbb{R}^m$ follows from
Theorem 9.1.4. □
Example 9.1.6 Let $V$ have basis $\{v_1, \dots, v_m\}$ and define the linear isomorphism
$A : V \to \mathbb{R}^m$ by $A v_i = e_i$, $1 \le i \le m$. Every norm $\|\cdot\|$ on $\mathbb{R}^m$ uniquely determines
a norm $\|\cdot\|^\star$ on $V$ by
$$\|v\|^\star = \|Av\|, \quad v \in V.$$
Obviously $A : V \to \mathbb{R}^m$ is norm-preserving. If $\|\cdot\|$ is the Euclidean norm on
$\mathbb{R}^m$, then $\|x\|^\star = (Ax, Ax)^{1/2}$, where $(\cdot, \cdot)$ is the Euclidean inner product on $\mathbb{R}^m$.
Consequently, $\|\cdot\|^\star$ is defined by the inner product $(x, y)^\star = (Ax, Ay)$ on $V$.
Given an $m$-dimensional normed vector space $(V, \|\cdot\|_V)$, we can always fix a basis
of $V$ and identify $V$ with $\mathbb{R}^m$ (as in Example 9.1.6). Moreover, Theorem 9.1.4 implies
that the Euclidean norm $\|\cdot\|_2$ on $\mathbb{R}^m$ is equivalent to the norm induced from $\|\cdot\|_V$ on
$\mathbb{R}^m$. The metric topology on $\mathbb{R}^m$ will be the same whether we use the Euclidean norm
$\|\cdot\|_2$ or the induced norm (Theorem 9.1.4(1)). Consequently, as far as continuity
properties are concerned, there is no loss of generality in working with $(\mathbb{R}^m, \|\cdot\|_2)$,
but note that this statement does depend on the non-trivial Theorem 9.1.4. From a
formal point of view it is easier to work at the abstract level of maps $f : V \to W$
between general finite-dimensional normed vector spaces. However, when it comes
to examples, especially computations, we usually have to choose a coordinate
system, and then we are looking at maps $f : \mathbb{R}^m \to \mathbb{R}^n$. We compromise by looking at
maps $f : \mathbb{R}^m \to \mathbb{R}^n$ between spaces with the Euclidean norm but present arguments
that generalize to the abstract setting $f : V \to W$ by simply changing $\mathbb{R}^m$ to $V$
and $\mathbb{R}^n$ to $W$. There is precisely one point in the development of the theory where
we have to choose a coordinate system and implicitly make use of Theorem 9.1.4.
Later, when we come to higher derivatives, we will work at the abstract level of
maps $f : V \to W$. We do this to avoid burying the ideas in the complex notation that
results from using coordinates.
Summary of Conventions Let $\mathbb{R}^m$ denote $m$-dimensional Euclidean space.
Denote vectors in $\mathbb{R}^m$ (or any normed space) using boldface: $\mathbf{x}, \mathbf{y} \in \mathbb{R}^m$. Let
$(x_1, \dots, x_m)$ denote the coordinates of $\mathbf{x} \in \mathbb{R}^m$ (relative to the standard basis
$\{e_1, \dots, e_m\}$ of $\mathbb{R}^m$). Denote the Euclidean norm on $\mathbb{R}^m$ by $\|\cdot\|$ and recall that
$\|\mathbf{x}\|^2 = (\mathbf{x}, \mathbf{x})$ where $(\cdot, \cdot)$ denotes the inner or 'dot' product on $\mathbb{R}^m$ ($(\mathbf{x}, \mathbf{y}) = \mathbf{x} \cdot \mathbf{y}$).
Denote the unit sphere of $\mathbb{R}^m$ by $S^{m-1}$. That is, $S^{m-1} = \{\mathbf{x} \in \mathbb{R}^m \mid \|\mathbf{x}\| = 1\}$. If
$A : \mathbb{R}^m \to \mathbb{R}^n$ is linear and $\mathbf{x} \in \mathbb{R}^m$, we usually write $A\mathbf{x}$, rather than $A(\mathbf{x})$, for the
value of $A$ at $\mathbf{x}$. Using a matrix representation for $A$, it is not hard to verify that
every linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ is continuous (relative to the topology defined by
the Euclidean norm; we give a formal proof shortly).
EXERCISES 9.1.7
(1) Let $(V, \|\cdot\|)$ be a normed vector space and let $d$ denote the associated metric
on $V$. Verify that
(a) $d(x + z, y + z) = d(x, y)$, for all $x, y, z \in V$ ('translation invariance' of $d$).
(b) $d(kx, ky) = |k|\,d(x, y)$ for all $x, y \in V$, $k \in \mathbb{R}$ ('scalar invariance' of $d$).
(2) For $p \ge 1$, define the $p$-norm $\|\cdot\|_p$ on $\mathbb{R}^n$ by
$$\|(x_1, \dots, x_n)\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$
It is easy to verify that $\|\cdot\|_p$ satisfies (1–3) of Definition 9.1.1. The triangle
inequality is Minkowski's inequality:
$$\left( \sum_{i=1}^{n} |x_i + y_i|^p \right)^{1/p} \le \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} + \left( \sum_{i=1}^{n} |y_i|^p \right)^{1/p}.$$
This is easy to prove if $p = 1, 2$ (in case $p = 2$ we have the Euclidean norm).
For the remainder of this exercise we indicate the steps needed to prove the
general case.
(a) Let $f(x, y) = \alpha x + \beta y - x^\alpha y^\beta$, where $x, y \ge 0$, $\alpha, \beta \in (0, 1)$ and $\alpha + \beta = 1$.
By finding the minimum value of $f$ for a fixed value of $y$, show that
$\alpha x + \beta y \ge x^\alpha y^\beta$ for all $x, y \ge 0$.
(b) Let $(a_n), (b_n)$ be real sequences consisting of $n$ terms. Let $p, q > 1$ satisfy
$1/p + 1/q = 1$. Set $A_m = a_m / (\sum_{i=1}^n |a_i|^p)^{1/p}$, $B_m = b_m / (\sum_{i=1}^n |b_i|^q)^{1/q}$,
$1 \le m \le n$. Using (a), show that
$$|A_m B_m| \le |A_m|^p / p + |B_m|^q / q,$$
and hence, by summing over $m$, that
$$\sum_{i=1}^n |A_i B_i| \le 1 = \Big( \sum_{i=1}^n |A_i|^p \Big)^{1/p} \Big( \sum_{i=1}^n |B_i|^q \Big)^{1/q}.$$
Deduce Hölder’s inequality:
$$\sum_{i=1}^n |a_i b_i| \le \Big( \sum_{i=1}^n |a_i|^p \Big)^{1/p} \Big( \sum_{i=1}^n |b_i|^q \Big)^{1/q}.$$
(c) Under the assumptions of (b) show that
$$\sum_{i=1}^n |a_i + b_i|^p \le \sum_{i=1}^n |a_i + b_i|^{p-1} |a_i| + \sum_{i=1}^n |a_i + b_i|^{p-1} |b_i|,$$
and apply Hölder’s inequality to deduce Minkowski’s inequality.
(3) Show that the product norm $\|(x_1, \dots, x_n)\|_\infty = \max_i |x_i|$ may be regarded as
$\lim_{p \to \infty} \|\mathbf{x}\|_p$.
(4) What goes wrong if we try to define $\|\cdot\|_p$ for $p < 1$?
(5) Define $\|\cdot\|_P : \mathbb{R}^{n+1} \to \mathbb{R}$ by
$$\|(x_1, \dots, x_{n+1})\|_P = \max\{x_1, \dots, x_{n+1}\} - \min\{x_1, \dots, x_{n+1}\}.$$
(a) Show that $\|\cdot\|_P$ defines a norm on the hyperplane $x_1 + \cdots + x_{n+1} = 0$.
(b) Let $n = 2$. Show that the unit ‘circle’ defined by $\|\mathbf{x}\|_P = 1$ on the
hyperplane $x_1 + x_2 + x_3 = 0$ is a regular hexagon and find the vertices
of the hexagon.
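The inequalities in Exercise (2) and the limit in Exercise (3) can be checked numerically before attempting the proofs. The following is only a sketch: the helper names `p_norm` and `max_norm`, the random test vectors and the exponents tried are all choices made here for illustration, and a numerical check is of course not a proof.

```python
import random

def p_norm(x, p):
    """Return (sum |x_i|^p)^(1/p) for a sequence x and an exponent p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def max_norm(x):
    """Return max_i |x_i|, the candidate limit of the p-norms as p grows."""
    return max(abs(t) for t in x)

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(5)]
y = [random.uniform(-1, 1) for _ in range(5)]
x_plus_y = [a + b for a, b in zip(x, y)]

# Minkowski: ||x + y||_p <= ||x||_p + ||y||_p for several exponents p >= 1.
minkowski_ok = all(
    p_norm(x_plus_y, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
    for p in (1, 1.5, 2, 3, 10)
)

# Hoelder with the conjugate exponents p = 3, q = 3/2 (1/p + 1/q = 1).
p, q = 3.0, 1.5
holder_ok = sum(abs(a * b) for a, b in zip(x, y)) <= p_norm(x, p) * p_norm(y, q) + 1e-12

# For a large exponent the p-norm is already close to the max norm.
gap = p_norm(x, 200) - max_norm(x)
```

Here `gap` is small and non-negative, consistent with the elementary bound $\|\mathbf{x}\|_\infty \le \|\mathbf{x}\|_p \le n^{1/p} \|\mathbf{x}\|_\infty$.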
9.2 Linear Maps
In this section we cover some elementary results on linear maps that we need for
the development of the differential calculus on $\mathbb{R}^m$. As far as possible we do this in
a ‘coordinate-free’ way.
Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be linear. Although we can represent $A$ as a matrix,
conceptually it is easiest to regard $A$ as a map $A : \mathbb{R}^m \to \mathbb{R}^n$ which is linear. That is,
$$A(\mathbf{x} + \lambda \mathbf{y}) = A\mathbf{x} + \lambda A\mathbf{y}, \quad \text{for all } \mathbf{x}, \mathbf{y} \in \mathbb{R}^m,\ \lambda \in \mathbb{R}.$$
We start by showing that every linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ is continuous. One way of
doing this is by using the matrix representation of $A$ and writing $A\mathbf{x}$ in coordinates.
However, we give a proof that suggests the real issue is the finite-dimensionality of
the vector spaces $\mathbb{R}^m, \mathbb{R}^n$. Indeed, linear maps defined on an infinite-dimensional
normed vector space need not be continuous (see the exercises for an example).
Lemma 9.2.1 Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be linear. If $A$ is continuous at $\mathbf{x} = \mathbf{0}$, then $A$ is
continuous on $\mathbb{R}^m$.
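Proof Let $\mathbf{x}_0 \in \mathbb{R}^m$ and $\varepsilon > 0$. Since $A$ is continuous at $\mathbf{x} = \mathbf{0}$, there exists a
$\delta > 0$ such that $\|A\mathbf{z} - A\mathbf{0}\| = \|A\mathbf{z}\| < \varepsilon$, for all $\mathbf{z}$ such that $\|\mathbf{z}\| < \delta$. Observe
that $\|A\mathbf{x}_0 - A\mathbf{x}\| = \|A(\mathbf{x}_0 - \mathbf{x})\|$ (linearity) and so, taking $\mathbf{z} = \mathbf{x}_0 - \mathbf{x}$, we have
$\|A\mathbf{x}_0 - A\mathbf{x}\| < \varepsilon$, if $\|\mathbf{x}_0 - \mathbf{x}\| < \delta$, proving continuity of $A$ at $\mathbf{x}_0$. □
Remark 9.2.2 A consequence of the proof of Lemma 9.2.1 is that if $A$ is continuous
then $A$ is uniformly continuous: the $\delta$ found in the proof does not depend on $\mathbf{x}_0$. ✠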
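Lemma 9.2.3 Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be linear. Then $A$ is continuous at $\mathbf{0}$ if $A$ is bounded
on the unit sphere $S^{m-1}$ of $\mathbb{R}^m$. That is, if there exists a $C \ge 0$ such that $\|A\mathbf{u}\| \le C$
for all $\mathbf{u} \in S^{m-1}$.
Proof We are given that $\|A\mathbf{u}\| \le C$, for all $\mathbf{u} \in S^{m-1}$. If $\mathbf{x} \in \mathbb{R}^m$ is non-zero, then
$\mathbf{x}/\|\mathbf{x}\| \in S^{m-1}$ and so $\|A(\mathbf{x}/\|\mathbf{x}\|)\| \le C$. By linearity, $A(\mathbf{x}/\|\mathbf{x}\|) = \frac{1}{\|\mathbf{x}\|} A\mathbf{x}$ and so
$\|\frac{1}{\|\mathbf{x}\|} A\mathbf{x}\| = \frac{1}{\|\mathbf{x}\|} \|A\mathbf{x}\|$. Hence
$$\|A\mathbf{x}\| \le C \|\mathbf{x}\|, \quad \text{for all } \mathbf{x} \in \mathbb{R}^m.$$
Let $\varepsilon > 0$ and take $\delta = \varepsilon / \max\{C, 1\}$. Our estimate on $\|A\mathbf{x}\|$ implies that $\|A\mathbf{x}\| < \varepsilon$
whenever $\|\mathbf{x}\| < \delta$ and so $A$ is continuous at $\mathbf{x} = \mathbf{0}$. □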
Lemma 9.2.4 If $A : \mathbb{R}^m \to \mathbb{R}^n$ is linear, then $A$ is bounded on $S^{m-1}$.
Proof Every point $\mathbf{u} \in S^{m-1}$ may be written uniquely as $\mathbf{u} = \sum_{j=1}^m u_j \mathbf{e}_j$, where
$\{\mathbf{e}_1, \dots, \mathbf{e}_m\}$ is the standard basis of $\mathbb{R}^m$ and $\sum_{j=1}^m u_j^2 = 1$. By linearity of $A$, we
have
$$A\mathbf{u} = \sum_{j=1}^m u_j A\mathbf{e}_j,$$
and so, by the triangle inequality
$$\|A\mathbf{u}\| \le \sum_{j=1}^m |u_j| \|A\mathbf{e}_j\| \le M \sum_{j=1}^m |u_j| \le mM,$$
where $M = \max_j \|A\mathbf{e}_j\|$ and we have used $|u_j| \le \|\mathbf{u}\| = 1$, $1 \le j \le m$. Hence
$\|A\mathbf{u}\| \le mM$, for all $\mathbf{u} \in S^{m-1}$. □
Remark 9.2.5 Notice that if $A$ is continuous, then $A$ is bounded on $S^{m-1}$ since $S^{m-1}$
is a compact subset of $\mathbb{R}^m$. ✠
Proposition 9.2.6 Every linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ is continuous.
Proof Immediate from Lemmas 9.2.1, 9.2.3 and 9.2.4. □
Remark 9.2.7 Note the point in the proof of Lemma 9.2.4 where we use the finite-dimensionality of $\mathbb{R}^m$ and the Euclidean norm. Proposition 9.2.6 holds for linear
maps $A : V \to W$ between normed vector spaces provided that $V$ is finite-dimensional. For this we need Theorem 9.1.4. ✠
9.2.1 Normed Vector Spaces of Linear Maps
If $A : \mathbb{R}^m \to \mathbb{R}^n$ is a linear map, define
$$\|A\| = \sup_{\|\mathbf{u}\| = 1} \|A\mathbf{u}\| = \sup_{\mathbf{u} \in S^{m-1}} \|A\mathbf{u}\|.$$
Since $A$ is continuous and $S^{m-1}$ is closed and bounded (therefore compact), we have
$\|A\| < \infty$ (alternatively, use Lemma 9.2.4). We refer to $\|A\|$ as the norm or operator
norm of $A$.
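Examples 9.2.8
(1) Let $I : \mathbb{R}^m \to \mathbb{R}^m$ denote the identity map of $\mathbb{R}^m$. Then $\|I\| = 1$.
(2) If $A : \mathbb{R}^2 \to \mathbb{R}^2$ is the linear map with matrix $[A]$ given by
$$[A] = \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix},$$
then $\|A\| = \sqrt{\alpha^2 + \beta^2}$. The hardest way of seeing this is by using Lagrange
multipliers to find the maximum value of $\|A\mathbf{u}\|$ on the unit circle in $\mathbb{R}^2$. A much
easier way is to identify $\mathbb{R}^2$ with $\mathbb{C}$ ($(x, y) \leftrightarrow x + \imath y$) and observe that $A(x, y)$
corresponds to complex multiplication by $\alpha + \imath\beta$. That is, $Az = (\alpha + \imath\beta) z$ and
so $\|Az\| = |(\alpha + \imath\beta) z|$ (modulus on the right-hand side). The claimed result
follows since $|(\alpha + \imath\beta) z| = |\alpha + \imath\beta| |z| = \sqrt{\alpha^2 + \beta^2}\, |z|$.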
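As a rough numerical illustration of Examples 9.2.8(2), one can approximate the supremum in the definition of the operator norm by sampling the unit circle. This is a sketch under assumptions made here: the helper names, the sample count and the values $\alpha = 3$, $\beta = 4$ are illustrative choices, and sampling only ever gives a lower estimate of a supremum in general.

```python
import math

def apply(matrix, v):
    """Multiply a 2x2 matrix (given as a tuple of rows) by a vector v in R^2."""
    (a, b), (c, d) = matrix
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

def sampled_operator_norm(matrix, samples=3600):
    """Approximate sup ||Au|| over the unit circle by sampling equally spaced angles."""
    angles = (2 * math.pi * k / samples for k in range(samples))
    return max(norm(apply(matrix, (math.cos(t), math.sin(t)))) for t in angles)

alpha, beta = 3.0, 4.0
A = ((alpha, -beta), (beta, alpha))      # multiplication by alpha + i*beta
estimate = sampled_operator_norm(A)
exact = math.hypot(alpha, beta)          # sqrt(alpha^2 + beta^2)
identity_estimate = sampled_operator_norm(((1.0, 0.0), (0.0, 1.0)))
```

For this particular matrix every unit vector is stretched by the same factor, so the sampled value agrees with $\sqrt{\alpha^2 + \beta^2}$ up to rounding.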
Let $L(\mathbb{R}^m, \mathbb{R}^n)$ denote the (vector) space of all linear maps from $\mathbb{R}^m$ to $\mathbb{R}^n$ and let $0$
(or $0_{m,n}$) denote the zero linear map.
Theorem 9.2.9 Let $A, B \in L(\mathbb{R}^m, \mathbb{R}^n)$. We have
(1) $\|A\| \ge 0$ and $\|A\| = 0$ iff $A = 0$.
(2) $\|aA\| = |a| \|A\|$ for all $a \in \mathbb{R}$.
(3) $\|A + B\| \le \|A\| + \|B\|$ for all $A, B \in L(\mathbb{R}^m, \mathbb{R}^n)$.
In particular, $(L(\mathbb{R}^m, \mathbb{R}^n), \|\cdot\|)$ has the structure of a normed vector space.
Proof
(1) Obviously $\|A\| \ge 0$ for all $A \in L(\mathbb{R}^m, \mathbb{R}^n)$. If $\|A\| = 0$, then $A\mathbf{u} = \mathbf{0}$ for all unit
vectors $\mathbf{u} \in \mathbb{R}^m$. Since every non-zero vector in $\mathbb{R}^m$ is a scalar multiple of a unit
vector, it follows by the linearity of $A$ that $A\mathbf{x} = \mathbf{0}$ for all $\mathbf{x} \in \mathbb{R}^m$ and so $A = 0$.
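(2) We have $\sup_{\|\mathbf{u}\| = 1} \|aA\mathbf{u}\| = \sup_{\|\mathbf{u}\| = 1} |a| \|A\mathbf{u}\| = |a| \|A\|$.
(3) Suppose $A, B \in L(\mathbb{R}^m, \mathbb{R}^n)$. We have
$$\begin{aligned}
\|A + B\| &= \sup_{\|\mathbf{u}\| = 1} \|A\mathbf{u} + B\mathbf{u}\| \\
&\le \sup_{\|\mathbf{u}\| = 1} (\|A\mathbf{u}\| + \|B\mathbf{u}\|) \\
&\le \sup_{\|\mathbf{u}\| = 1} \|A\mathbf{u}\| + \sup_{\|\mathbf{u}\| = 1} \|B\mathbf{u}\| \\
&= \|A\| + \|B\|,
\end{aligned}$$
proving the triangle inequality. □
Remark 9.2.10 The operator norm defines the Euclidean norm on $L(\mathbb{R}^m, \mathbb{R}^n) \cong \mathbb{R}^{mn}$
iff $n = 1$ or $m = 1$; see the exercises at the end of the section for the
isomorphism between $L(\mathbb{R}^m, \mathbb{R}^n)$ and $\mathbb{R}^{mn}$. ✠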
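Proposition 9.2.11 (Additional Properties of $\|\cdot\|$)
(a) $\|A\mathbf{x}\| \le \|A\| \|\mathbf{x}\|$, for all $A \in L(\mathbb{R}^m, \mathbb{R}^n)$ and $\mathbf{x} \in \mathbb{R}^m$.
(b) If we define $d(A, B) = \|A - B\|$, $A, B \in L(\mathbb{R}^m, \mathbb{R}^n)$, then $d$ defines a (complete)
metric on $L(\mathbb{R}^m, \mathbb{R}^n)$.
(c) If $L : \mathbb{R}^m \to \mathbb{R}^n$, $M : \mathbb{R}^n \to \mathbb{R}^p$ are linear, then $\|ML\| \le \|M\| \|L\|$.
(d) If $A \in L(\mathbb{R}^m, \mathbb{R}^n)$ is invertible (so $A^{-1}$ exists and $m = n$), then $\|A^{-1}\| \ge 1/\|A\|$.
Proof All the statements are quite elementary. We prove (a,c) and leave (b,d) to the
exercises.
(a) If $\mathbf{x} = \mathbf{0}$, then certainly $0 = \|A\mathbf{x}\| \le \|A\| \|\mathbf{x}\| = 0$. So suppose $\mathbf{x} \ne \mathbf{0}$
and set $\mathbf{u} = \frac{\mathbf{x}}{\|\mathbf{x}\|}$. By definition of $\|A\|$, we have $\|A\mathbf{u}\| \le \|A\|$ ($\mathbf{u}$ is a unit vector).
Since $\mathbf{u} = \frac{\mathbf{x}}{\|\mathbf{x}\|}$, it follows by linearity of $A$ that $A\mathbf{u} = A(\frac{\mathbf{x}}{\|\mathbf{x}\|}) = \frac{1}{\|\mathbf{x}\|} A\mathbf{x}$ and so
$\frac{1}{\|\mathbf{x}\|} \|A\mathbf{x}\| \le \|A\|$. Multiplying through by $\|\mathbf{x}\|$ gives the result.
(c) Suppose $L : \mathbb{R}^m \to \mathbb{R}^n$, $M : \mathbb{R}^n \to \mathbb{R}^p$. Let $\mathbf{u} \in S^{m-1}$. We have
$$\|ML\mathbf{u}\| = \|M(L\mathbf{u})\| \le \|M\| \|L\mathbf{u}\| \le \|M\| \|L\|,$$
where the first inequality follows by (a) and the second inequality either by (a)
or the definition of $\|L\|$. Since this estimate holds for all $\mathbf{u} \in S^{m-1}$, $\|ML\| =
\sup_{\mathbf{u} \in S^{m-1}} \|ML\mathbf{u}\| \le \|M\| \|L\|$. □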
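Properties (c) and (d) of Proposition 9.2.11 can be spot-checked for $2 \times 2$ matrices. The sketch below relies on a standard linear-algebra fact that is not proved in this section and is assumed here: the operator norm of $A$ (for the Euclidean norms) is the square root of the largest eigenvalue of $A^t A$. The helper names and the random sampling are illustrative choices.

```python
import math
import random

def matmul(M, L):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(M[i][k] * L[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(A):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of A^t A."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d  # entries of A^t A
    return math.sqrt((p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q))

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    return ((d / det(A), -b / det(A)), (-c / det(A), a / det(A)))

random.seed(1)

def random_matrix():
    return tuple(tuple(random.uniform(-2, 2) for _ in range(2)) for _ in range(2))

pairs = [(random_matrix(), random_matrix()) for _ in range(100)]

# (c): ||ML|| <= ||M|| ||L||
submultiplicative = all(
    op_norm(matmul(M, L)) <= op_norm(M) * op_norm(L) + 1e-12 for M, L in pairs
)
# (d): ||A^{-1}|| >= 1 / ||A|| for (numerically) invertible A
inverse_bound = all(
    op_norm(inverse(M)) >= 1 / op_norm(M) - 1e-12
    for M, _ in pairs
    if abs(det(M)) > 1e-6
)
```

Statement (d) is (c) applied to $I = A A^{-1}$, which gives $1 = \|I\| \le \|A\| \|A^{-1}\|$.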
Remark 9.2.12 An important consequence of Proposition 9.2.11(a) is that for all
$\mathbf{x}, \mathbf{y} \in \mathbb{R}^m$ we have the estimate
$$\|A\mathbf{x} - A\mathbf{y}\| \le \|A\| \|\mathbf{x} - \mathbf{y}\|.$$
This estimate plays an absolutely crucial role in our analysis of linear maps. The
mean value theorem for differentiable maps $f : \mathbb{R}^m \to \mathbb{R}^n$ is of the same form
and it is this that often enables us to attack problems about non-linear maps using
techniques of linear analysis. ✠
EXERCISES 9.2.13
(1) The space $L(\mathbb{R}^m, \mathbb{R}^n)$ is isomorphic to $\mathbb{R}^{mn}$ (map the matrix $[a_{ij}]$ of $A \in
L(\mathbb{R}^m, \mathbb{R}^n)$ to $(a_{11}, a_{12}, \dots, a_{1n}, \dots, a_{mn}) \in \mathbb{R}^{mn}$). Show that the operator norm
$\|\cdot\|$ induced on $\mathbb{R}^{mn}$ is equal to the Euclidean norm on $\mathbb{R}^{mn}$ iff $n = 1$ or $m = 1$
and that the norms are always equivalent (without recourse to Theorem 9.1.4).
(2) Prove statements (b,d) of Proposition 9.2.11.
(3) Suppose $A \in L(\mathbb{R}^m, \mathbb{R}^n)$. Let $A^t \in L(\mathbb{R}^n, \mathbb{R}^m)$ denote the transpose of $A$ (if
the matrix of $A$ is $[a_{ij}]$ then the matrix of $A^t$ is $[a^t_{ij}] = [a_{ji}]$). Define $|A|^2 =
\operatorname{trace}(AA^t)$ (the trace of a matrix is the sum of the diagonal elements; note that
$AA^t$ is an $n \times n$ matrix).
(a) Show that $\operatorname{trace}(AA^t) = \operatorname{trace}(A^t A)$.
(b) $|A| \ge 0$ for all $A \in L(\mathbb{R}^m, \mathbb{R}^n)$ and $|A| = 0$ iff $A = 0$.
(c) $|\cdot|$ defines a norm on $L(\mathbb{R}^m, \mathbb{R}^n)$ (you will need to ‘verify’ the triangle
inequality.)
(d) Show that there is an inner product $\langle \cdot, \cdot \rangle$ on $L(\mathbb{R}^m, \mathbb{R}^n)$ which defines $|\cdot|$.
(This gives a natural norm on linear maps that depends only on the inner product
structures on $\mathbb{R}^m, \mathbb{R}^n$; inner products are needed to define the transpose in a
coordinate-free way.)
(4) Let $\mathbf{m}$ denote the space of all infinite sequences $\mathbf{x} = (x_i)_{i=1}^\infty$ of real numbers
such that all but finitely many of the $x_i$ are equal to zero. Thus if $\mathbf{x} \in \mathbf{m}$, there
exists an $N \in \mathbb{N}$ such that $x_i = 0$ for all $i \ge N$.
(a) Verify that $\mathbf{m}$ has the structure of a vector space if we define vector space
addition and scalar multiplication coordinate-wise.
(b) Show that if we define $\|\mathbf{x}\| = \max_i |x_i|$, then $\|\cdot\|$ defines a norm on $\mathbf{m}$.
(c) Define $f : \mathbf{m} \to \mathbb{R}$ by $f(\mathbf{x}) = \sum_{n=1}^\infty n x_n$. Verify that $f$ is linear but not
continuous with respect to the topology defined by $\|\cdot\|$. (Hint: It is enough
to show $f$ is not bounded on the closed unit ball of $\mathbf{m}$. Why?)
(It is easy to see that $(\mathbf{m}, \|\cdot\|)$ is not complete. Examples of discontinuous linear
maps can be defined on infinite-dimensional complete normed vector spaces,
such as $C^0([0, 1])$ with the uniform metric, but they are harder to construct.)
(5) Suppose $(V, \|\cdot\|)$ and $(W, \|\cdot\|)$ are normed vector spaces and set $S(V) = \{\mathbf{u} \in
V \mid \|\mathbf{u}\| = 1\}$. Show that if we everywhere replace $(\mathbb{R}^m, \|\cdot\|_2)$ by $(V, \|\cdot\|)$ and
$(\mathbb{R}^n, \|\cdot\|_2)$ by $(W, \|\cdot\|)$, then:
(a) Lemmas 9.2.1, 9.2.3 remain true and $A : V \to W$ is continuous iff $A$ is
bounded on $S(V)$. (No assumption on the finite-dimensionality of $V$ or $W$.)
(b) If we let $L(V, W)$ denote the space of continuous linear maps from $V$
to $W$, then Theorem 9.2.9 and Proposition 9.2.11 are true. (Of course,
Theorem 9.1.4 implies that every linear map $A : V \to W$ is continuous
if $\dim(V) < \infty$.)
9.3 The Derivative
For functions $f : \mathbb{R} \to \mathbb{R}$, differentiability at $x_0 \in \mathbb{R}$ is most easily described in
terms of the existence of a unique tangent line to the graph of $f$ at $x_0$. The tangent
line is constructed using limiting chords to the graph. The derivative, if it exists, is
then defined to be the slope of the tangent line and is a real number. This approach
does not generalize naturally to functions $f : \mathbb{R}^m \to \mathbb{R}^n$, $m > 1$. The difficulty lies
with defining the analog of the tangent line. What we require is a unique tangent
plane to the graph of $f$ at $\mathbf{x}_0$ but there is no obvious analogy of the limiting chords
construction used for functions of one variable. Of course, one can define partial
derivatives but these depend on choosing a coordinate system on $\mathbb{R}^m$. Whatever the
derivative of a function is, it should surely not depend on the choice of a coordinate
system—just as we do not need a coordinate system to define a linear map. Our
goal then is to give a natural coordinate-free definition of differentiability and the
derivative. The way forward is to realize that the tangent line is the graph of an affine
linear function and the tangent plane to the graph, if it exists, will be the graph of
an affine linear map. Instead of thinking of the derivative as a scalar (or vector),
we regard the derivative as a function—more precisely the linear part of the affine
linear map that determines the graph of the tangent plane. To make all this precise
requires ideas of approximation.
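Roughly speaking, a function $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{x}_0 \in \mathbb{R}^m$ if we
have a good affine linear approximation to $f$ near $\mathbf{x}_0$. We need to make precise the
meaning of ‘good approximation’ and ‘affine linear map’. We start with the easier
definition (see also Chap. 8). The map $G : \mathbb{R}^m \to \mathbb{R}^n$ is an affine linear map if we
can write
$$G(\mathbf{x}) = A\mathbf{x} + \mathbf{b},$$
where $A : \mathbb{R}^m \to \mathbb{R}^n$ is a linear map and $\mathbf{b} \in \mathbb{R}^n$ is a constant vector. If $m = n = 1$,
then an affine linear map may be written in the familiar form $y = mx + c$, where
$m, c \in \mathbb{R}$.
The definition of what is meant by a good approximation to $f$ near $\mathbf{x}_0$ is
trickier. Suppose that $G(\mathbf{x}) = A\mathbf{x} + \mathbf{b}$ is an affine linear map. If $G$ is to be
a good approximation to $f$ near $\mathbf{x}_0$, we certainly want $G(\mathbf{x}_0) = f(\mathbf{x}_0)$. Hence
$A\mathbf{x}_0 + \mathbf{b} = f(\mathbf{x}_0)$. However, this condition says nothing about how the values
$A\mathbf{x} + \mathbf{b}$ compare with $f(\mathbf{x})$ at points $\mathbf{x}$ near $\mathbf{x}_0$. For this we need an estimate on
$\|f(\mathbf{x}) - (A\mathbf{x} + \mathbf{b})\|$, when $\mathbf{x}$ is close to $\mathbf{x}_0$. As a first attempt we might ask that
$\lim_{\mathbf{x} \to \mathbf{x}_0} \|f(\mathbf{x}) - (A\mathbf{x} + \mathbf{b})\| = 0$. However, since we have $f(\mathbf{x}_0) = A\mathbf{x}_0 + \mathbf{b}$
(if $G(\mathbf{x}_0) = f(\mathbf{x}_0)$), this condition is equivalent to the continuity of $f$ at $\mathbf{x}_0$. A
stronger condition is needed for differentiability. What we shall require is that
$\|f(\mathbf{x}) - (A\mathbf{x} + \mathbf{b})\|$ goes to zero faster than $\|\mathbf{x} - \mathbf{x}_0\|$ as $\|\mathbf{x} - \mathbf{x}_0\| \to 0$. That is,
$$\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{\|f(\mathbf{x}) - (A\mathbf{x} + \mathbf{b})\|}{\|\mathbf{x} - \mathbf{x}_0\|} = 0. \tag{9.1}$$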
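As we shall see, this condition implies that $A\mathbf{x} + \mathbf{b}$ is the best possible affine linear
approximation to $f$ at $\mathbf{x}_0$. If we write $\mathbf{x} = \mathbf{x}_0 + \mathbf{h}$, then $A(\mathbf{x}_0 + \mathbf{h}) + \mathbf{b} = f(\mathbf{x}_0) + A\mathbf{h}$
and (9.1) is equivalent to
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - A\mathbf{h}\|}{\|\mathbf{h}\|} = 0. \tag{9.2}$$
Equation (9.2) is reminiscent of the definition of the derivative of a map $f : \mathbb{R} \to
\mathbb{R}$. Since we cannot divide by vectors, we take norms of vectors instead. There is
another way of looking at (9.2) that avoids division and explicit mention of limits.
If we define the remainder or error term $r(\mathbf{h})$ by $r(\mathbf{h}) = f(\mathbf{x}_0 + \mathbf{h}) - (f(\mathbf{x}_0) + A\mathbf{h})$,
then (9.2) can be rewritten as
$$f(\mathbf{x}_0 + \mathbf{h}) - (f(\mathbf{x}_0) + A\mathbf{h}) = r(\mathbf{h}),$$
where $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|r(\mathbf{h})\|}{\|\mathbf{h}\|} = 0$. As we shall frequently encounter this condition on $r(\mathbf{h})$,
we introduce the economical ‘small o’ notation: write $r(\mathbf{h}) = o(\mathbf{h})$ if $r(\mathbf{0}) = \mathbf{0}$ and
$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|r(\mathbf{h})\|}{\|\mathbf{h}\|} = 0$. Equivalently, $r(\mathbf{h}) = o(\mathbf{h})$ if for every $\varepsilon > 0$, there exists a $\delta > 0$
such that $\|r(\mathbf{h})\| \le \varepsilon \|\mathbf{h}\|$ whenever $\|\mathbf{h}\| < \delta$.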
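The difference between an error that is $o(h)$ and one that merely tends to zero can be seen numerically in one variable. The following sketch uses choices made here for illustration only: the function $f(x) = x^2$, the base point $x_0 = 1$, the tangent slope $2$ and a competing slope $3$.

```python
def remainder_ratio(f, x0, a, h):
    """Return |f(x0 + h) - f(x0) - a*h| / |h| for a candidate slope a."""
    return abs(f(x0 + h) - f(x0) - a * h) / abs(h)

def square(x):
    return x * x

steps = [10.0 ** (-k) for k in range(1, 7)]   # h = 0.1, 0.01, ..., 1e-6

# Slope 2 is the derivative of x^2 at x0 = 1: here r(h) = h^2, so the ratio is |h|.
tangent_ratios = [remainder_ratio(square, 1.0, 2.0, h) for h in steps]

# Slope 3 gives r(h) = h^2 - h: the error tends to 0, but the ratio tends to 1.
other_ratios = [remainder_ratio(square, 1.0, 3.0, h) for h in steps]
```

Only the first list tends to zero, so only the slope $2$ satisfies the small o condition.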
Example 9.3.1 Let $f : \mathbb{R} \to \mathbb{R}$ be differentiable at $x_0$. The affine linear map $g(h) =
f(x_0) + ah$ is the tangent line to the graph of $f$ at $x_0$ iff $f(x_0 + h) - f(x_0) - ah = o(h)$.
Referring to Fig. 9.1, the error $r(h) = o(h)$ goes to zero faster than $|h|$ for the
tangent line. For the line $L$, the error goes to zero like $|h|$. If $r(h) = o(h)$, then
$g(h) = f(x_0) + f'(x_0) h$ ($a = f'(x_0)$ in the figure).
We can now give a formal definition of what it means for a function to be
differentiable at a point.
[Fig. 9.1 Remainder term $r(h)$ for $f : \mathbb{R} \to \mathbb{R}$. Labels in the figure: the graph $y = f(x)$, the line $L$, the tangent line $g(h) = f(x_0) + ah$, the points $x_0$ and $x_0 + h$, the values $f(x_0)$ and $f(x_0 + h)$, and the remainder $r(h)$.]
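Definition 9.3.2 The map $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{x}_0 \in \mathbb{R}^m$ if there exists
a linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ such that
$$f(\mathbf{x}_0 + \mathbf{h}) = f(\mathbf{x}_0) + A\mathbf{h} + r(\mathbf{h}),$$
where $r(\mathbf{h}) = o(\mathbf{h})$.
Remarks 9.3.3
(1) We shall shortly show that if we can find a linear map $A$ satisfying Definition 9.3.2 then $A$ is unique. Naturally we call $A$ the derivative of $f$ at $\mathbf{x}_0$. We
denote the derivative of $f$ at $\mathbf{x}_0$ either by $Df_{\mathbf{x}_0}$ or $Df(\mathbf{x}_0)$ (we usually use the
first notation). Thus, differentiability at $\mathbf{x}_0$ means that there exists a linear map
$Df_{\mathbf{x}_0} : \mathbb{R}^m \to \mathbb{R}^n$ such that
$$f(\mathbf{x}_0 + \mathbf{h}) = f(\mathbf{x}_0) + Df_{\mathbf{x}_0}(\mathbf{h}) + r(\mathbf{h}),$$
where $r(\mathbf{h}) = o(\mathbf{h})$. Alternatively, if we write $\mathbf{x} = \mathbf{x}_0 + \mathbf{h}$,
$$f(\mathbf{x}) = f(\mathbf{x}_0) + Df_{\mathbf{x}_0}(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x} - \mathbf{x}_0),$$
where $r(\mathbf{x} - \mathbf{x}_0) = o(\mathbf{x} - \mathbf{x}_0)$. We emphasize that the definition implies that $f$
has a good affine linear approximation at $\mathbf{x}_0$. That is, the error $r(\mathbf{x} - \mathbf{x}_0)$ we get
by replacing $f(\mathbf{x})$ near $\mathbf{x}_0$ by the (affine) linear map $f(\mathbf{x}_0) + Df_{\mathbf{x}_0}(\mathbf{x} - \mathbf{x}_0)$ goes
to zero faster than $\|\mathbf{x} - \mathbf{x}_0\|$:
$$f(\mathbf{x}) - (f(\mathbf{x}_0) + Df_{\mathbf{x}_0}(\mathbf{x} - \mathbf{x}_0)) = o(\mathbf{x} - \mathbf{x}_0).$$
(2) We develop properties of the small o notation, and introduce the big O notation,
in the exercises at the end of the section. ✠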
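Lemma 9.3.4 If $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{x}_0$, then $f$ is continuous at $\mathbf{x}_0$.
Proof If $f$ is differentiable at $\mathbf{x}_0$, then there exists a linear map $A : \mathbb{R}^m \to \mathbb{R}^n$
such that $f(\mathbf{x}) = f(\mathbf{x}_0) + A(\mathbf{x} - \mathbf{x}_0) + r(\mathbf{x} - \mathbf{x}_0)$, where $r(\mathbf{x} - \mathbf{x}_0) = o(\mathbf{x} - \mathbf{x}_0)$.
Since $A$ is linear, $A$ is continuous and so $\lim_{\mathbf{x} \to \mathbf{x}_0} A(\mathbf{x} - \mathbf{x}_0) = A(\mathbf{0}) = \mathbf{0}$. Since
$r(\mathbf{x} - \mathbf{x}_0) = o(\mathbf{x} - \mathbf{x}_0)$, we also have $\lim_{\mathbf{x} \to \mathbf{x}_0} r(\mathbf{x} - \mathbf{x}_0) = r(\mathbf{0}) = \mathbf{0}$. Therefore
$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = f(\mathbf{x}_0)$ and $f$ is continuous at $\mathbf{x}_0$. □
Definition 9.3.5 If $f : \mathbb{R} \to \mathbb{R}^n$ is differentiable at $x_0 \in \mathbb{R}$, we define $f'(x_0) \in \mathbb{R}^n$
by
$$f'(x_0) = Df_{x_0}(1).$$
We end with an example that shows the connection between Definition 9.3.5 and the
limit definition for functions of one variable. We continue to assume the derivative
is unique, a result that is well-known and easy in the one variable case.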
Example 9.3.6 Suppose $f : \mathbb{R} \to \mathbb{R}^n$ is differentiable at $x_0 \in \mathbb{R}$ in the sense of
Definition 9.3.2. Following Definition 9.3.5, set $f'(x_0) = Df_{x_0}(1) \in \mathbb{R}^n$. We claim
that
$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0).$$
To see this, observe that
$$\begin{aligned}
f(x_0 + h) &= f(x_0) + Df_{x_0}(h) + r(h) \\
&= f(x_0) + h f'(x_0) + r(h).
\end{aligned}$$
Hence $\lim_{h \to 0} \Big( \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0) \Big) = \lim_{h \to 0} \frac{r(h)}{h} = 0$.
EXERCISES 9.3.7
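(1) Let $\mathbf{x} \in \mathbb{R}^m$. Define $g : \mathbb{R} \to \mathbb{R}^m$ by $g(t) = t\mathbf{x}$. What is $g'(t) = Dg_t(1)$?
(2) Working from Definition 9.3.2, show that the Euclidean norm $\|\cdot\|_2 : \mathbb{R}^n \to \mathbb{R}$
is never differentiable at $\mathbf{x} = \mathbf{0}$. Using Theorem 9.1.4, deduce that every norm
$\|\cdot\|$ on $\mathbb{R}^n$ is not differentiable at $\mathbf{x} = \mathbf{0}$.
(3) Suppose that $r : \mathbb{R}^m \to \mathbb{R}^n$ and that $r(\mathbf{x}) = o(\mathbf{x})$. Show that $r$ is differentiable
at $\mathbf{0}$ and that $Dr_{\mathbf{0}} = 0$.
(4) Suppose that $f, g : \mathbb{R}^m \to \mathbb{R}^n$ and that $f(\mathbf{x}), g(\mathbf{x}) = o(\mathbf{x})$. Verify that $(f \pm
g)(\mathbf{x}) = o(\mathbf{x})$. If $n = 1$, show that $f(\mathbf{x}) g(\mathbf{x}) = o(\mathbf{x})$.
(5) Let $f : \mathbb{R}^m \to \mathbb{R}^n$. We write $f(\mathbf{x}) = O(\mathbf{x})$ and say $f$ is $O(\mathbf{x})$ (‘big zero x’) if there
exist $r > 0$, $C > 0$ such that $\|f(\mathbf{x})\| \le C \|\mathbf{x}\|$ for all $\mathbf{x} \in D_r(\mathbf{0})$. Verify that
(a) If $f$ is $O(\mathbf{x})$ then $f(\mathbf{0}) = \mathbf{0}$ and $f$ is continuous at $\mathbf{x} = \mathbf{0}$.
(b) If $f$ is differentiable at $\mathbf{x} = \mathbf{0}$ and $f(\mathbf{0}) = \mathbf{0}$, then $f$ is $O(\mathbf{x})$. Find an example
to show that if $f(\mathbf{0}) = \mathbf{0}$ and $f$ is $O(\mathbf{x})$, then $f$ may not be differentiable at
$\mathbf{x} = \mathbf{0}$.
(c) Let $f, g : \mathbb{R}^m \to \mathbb{R}^n$. Suppose that $f, g$ are $O(\mathbf{x})$. What can be said about
$f \pm g$? Suppose $n = 1$. What can be said about $fg$? Deduce that if $f, g$ are
$O(\mathbf{x})$, then $fg$ is differentiable at $\mathbf{x} = \mathbf{0}$ and find $D(fg)_{\mathbf{0}}$.
(6) Suppose we follow the assumptions of Exercises 9.2.13(5); in particular,
$L(V, W)$ consists of continuous linear maps. Show that all results and definitions
of the section continue to apply.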
9.4 Properties of the Derivative
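Lemma 9.4.1 If $f$ is differentiable at $\mathbf{x}_0$, the derivative is unique.
Proof Suppose the linear maps $A, B : \mathbb{R}^m \to \mathbb{R}^n$ both satisfy the defining equation
for the derivative
$$\begin{aligned}
f(\mathbf{x}_0 + \mathbf{h}) &= f(\mathbf{x}_0) + A\mathbf{h} + r_1(\mathbf{h}) \\
&= f(\mathbf{x}_0) + B\mathbf{h} + r_2(\mathbf{h}).
\end{aligned}$$
Subtract the second equation from the first to get
$$(A - B)(\mathbf{h}) = r_2(\mathbf{h}) - r_1(\mathbf{h}) = o(\mathbf{h}),$$
since $r_1(\mathbf{h}), r_2(\mathbf{h}) = o(\mathbf{h})$. Hence
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|(A - B)(\mathbf{h})\|}{\|\mathbf{h}\|} = 0.$$
But $\frac{\|(A - B)(\mathbf{h})\|}{\|\mathbf{h}\|} = \|(A - B)(\mathbf{u})\|$ where $\mathbf{u} = \frac{\mathbf{h}}{\|\mathbf{h}\|}$. Since every unit vector $\mathbf{u} \in \mathbb{R}^m$
can be written in the form $\mathbf{u} = \frac{\mathbf{h}}{\|\mathbf{h}\|}$ for arbitrarily small vectors $\mathbf{h}$, it follows that
$\|A - B\| = 0$. Hence, by Theorem 9.2.9(1), $A = B$. □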
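Examples 9.4.2
(1) If $f : \mathbb{R}^m \to \mathbb{R}^n$ is linear, then $f$ is differentiable at all points $\mathbf{x} \in \mathbb{R}^m$ and
$Df_{\mathbf{x}} = f$. (This corresponds to the 1-variable result that if $f(x) = ax$, then $f'$ is
constant, equal to $a$.)
(2) Define $\phi : \mathbb{R}^n \to \mathbb{R}$ by $\phi(\mathbf{x}) = \|\mathbf{x}\|^2 = (\mathbf{x}, \mathbf{x})$. We claim that $\phi$ is differentiable
on $\mathbb{R}^n$ and $D\phi_{\mathbf{x}}(\mathbf{h}) = 2(\mathbf{x}, \mathbf{h})$ for all $\mathbf{x}, \mathbf{h} \in \mathbb{R}^n$. For $\mathbf{x}, \mathbf{h} \in \mathbb{R}^n$ we have
$$\phi(\mathbf{x} + \mathbf{h}) = (\mathbf{x} + \mathbf{h}, \mathbf{x} + \mathbf{h}) = (\mathbf{x}, \mathbf{x}) + 2(\mathbf{x}, \mathbf{h}) + (\mathbf{h}, \mathbf{h}) = \phi(\mathbf{x}) + 2(\mathbf{x}, \mathbf{h}) + \|\mathbf{h}\|^2.$$
Since $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|\mathbf{h}\|^2}{\|\mathbf{h}\|} = 0$, $\phi$ is differentiable and $D\phi_{\mathbf{x}}(\mathbf{h}) = 2(\mathbf{x}, \mathbf{h})$. We remark
that $D\phi_{\mathbf{x}}(\mathbf{h}) = 2(\mathbf{x}, \mathbf{h}) = 0$ iff $(\mathbf{x}, \mathbf{h}) = 0$. That is, $D\phi_{\mathbf{x}}(\mathbf{h}) = 0$ iff $\mathbf{h} \perp \mathbf{x}$. If
$\|\mathbf{h}\| = 1$, then for $\mathbf{x} \ne \mathbf{0}$, $D\phi_{\mathbf{x}}(\mathbf{h})$ takes its maximal value when $\mathbf{h} = \mathbf{x}/\|\mathbf{x}\|$ and
its minimal value when $\mathbf{h} = -\mathbf{x}/\|\mathbf{x}\|$.
(3) Let $A_1, \dots, A_p \in L(\mathbb{R}^m, \mathbb{R})$. Define $F : \mathbb{R}^m \to \mathbb{R}$ by $F(\mathbf{x}) = A_1(\mathbf{x}) \cdots A_p(\mathbf{x})$,
$\mathbf{x} \in \mathbb{R}^m$. We claim that $F$ is differentiable at $\mathbf{x} = \mathbf{0}$ with $DF_{\mathbf{0}} = 0$, if $p > 1$,
and $DF_{\mathbf{0}} = A_1$ if $p = 1$. The statement for $p = 1$ is the first example above so
suppose $p > 1$. We have
$$F(\mathbf{0} + \mathbf{h}) = F(\mathbf{0}) + A_1(\mathbf{h}) \cdots A_p(\mathbf{h}) = A_1(\mathbf{h}) \cdots A_p(\mathbf{h}).$$
Hence $|F(\mathbf{h})| \le \|A_1\| \cdots \|A_p\| \|\mathbf{h}\|^p$. Since $p > 1$, $\lim_{\mathbf{h} \to \mathbf{0}} \frac{|F(\mathbf{h})|}{\|\mathbf{h}\|} = 0$ and
so $F$ is differentiable at $\mathbf{x} = \mathbf{0}$ with $DF_{\mathbf{0}} = 0$. Note that every coordinate
functional $\mathbf{x} \mapsto x_i$ is linear and so this example implies that the monomials
$F(\mathbf{x}) = x_1^{a_1} \cdots x_m^{a_m}$ are differentiable at $\mathbf{x} = \mathbf{0}$ and $DF_{\mathbf{0}} = 0$ if $a_1 + \cdots + a_m > 1$.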
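The computation in Examples 9.4.2(2) says that, for the norm-squared function, the remainder after subtracting $2(\mathbf{x}, \mathbf{h})$ is exactly $\|\mathbf{h}\|^2$. The sketch below checks this at one point; the name `phi` for the norm-squared function and the particular vectors are assumptions made here for illustration.

```python
import math

def dot(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def phi(x):
    """phi(x) = ||x||^2 = (x, x)."""
    return dot(x, x)

def d_phi(x, h):
    """The claimed derivative of phi at x applied to h: 2(x, h)."""
    return 2 * dot(x, h)

x = [1.0, -2.0, 0.5]
h = [1e-3, 2e-3, -3e-3]
shifted = [a + b for a, b in zip(x, h)]

# The remainder r(h) = phi(x + h) - phi(x) - 2(x, h) should equal ||h||^2.
remainder = phi(shifted) - phi(x) - d_phi(x, h)

# Over unit vectors, 2(x, u) is largest at u = x/||x||, where it equals 2||x||.
length = math.sqrt(phi(x))
best_direction = [a / length for a in x]
largest_value = d_phi(x, best_direction)
```

Since $\|\mathbf{h}\|^2 / \|\mathbf{h}\| = \|\mathbf{h}\| \to 0$, the remainder is $o(\mathbf{h})$, as the example claims.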
Definition 9.4.3 Let $f : \mathbb{R}^m \to \mathbb{R}^n$.
(1) $f$ is differentiable if $f$ is differentiable at all points of $\mathbb{R}^m$.
(2) $f$ is continuously differentiable, or $C^1$, if $f$ is differentiable and the derivative
map $Df : \mathbb{R}^m \to L(\mathbb{R}^m, \mathbb{R}^n)$ is continuous.
Remarks 9.4.4
(1) The continuity in (2) is relative to the metric on $L(\mathbb{R}^m, \mathbb{R}^n)$ given in Proposition 9.2.11(b); indeed, the metric associated to any norm on $L(\mathbb{R}^m, \mathbb{R}^n)$ will do
(Theorem 9.1.4).
(2) In $\varepsilon, \delta$ terms, to say that $f$ is $C^1$ means that given $\mathbf{x}_0 \in \mathbb{R}^m$, $\varepsilon > 0$, there exists a
$\delta > 0$ such that $\|Df_{\mathbf{x}} - Df_{\mathbf{x}_0}\| < \varepsilon$ whenever $\|\mathbf{x} - \mathbf{x}_0\| < \delta$. ✠
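Examples 9.4.5
(1) If $f : \mathbb{R}^m \to \mathbb{R}^n$ is linear then $f$ is $C^1$. Indeed, $Df_{\mathbf{x}} = f$ for all $\mathbf{x} \in \mathbb{R}^m$ and so
$Df : \mathbb{R}^m \to L(\mathbb{R}^m, \mathbb{R}^n)$ is constant and obviously continuous.
(2) Define $\phi : \mathbb{R}^n \to \mathbb{R}$ by $\phi(\mathbf{x}) = \|\mathbf{x}\|^2 = (\mathbf{x}, \mathbf{x})$. As we showed in
Examples 9.4.2(2), $\phi$ is differentiable on $\mathbb{R}^n$ and $D\phi_{\mathbf{x}}(\mathbf{h}) = 2(\mathbf{x}, \mathbf{h})$. In order
to prove that $\phi$ is $C^1$, we need to estimate $\|D\phi_{\mathbf{x}} - D\phi_{\mathbf{x}_0}\|$. We have
$$\begin{aligned}
\|D\phi_{\mathbf{x}} - D\phi_{\mathbf{x}_0}\| &= \sup_{\|\mathbf{u}\| = 1} |D\phi_{\mathbf{x}}(\mathbf{u}) - D\phi_{\mathbf{x}_0}(\mathbf{u})| \\
&= \sup_{\|\mathbf{u}\| = 1} |2(\mathbf{x}, \mathbf{u}) - 2(\mathbf{x}_0, \mathbf{u})|.
\end{aligned}$$
Now $|2(\mathbf{x}, \mathbf{u}) - 2(\mathbf{x}_0, \mathbf{u})| = 2|(\mathbf{x} - \mathbf{x}_0, \mathbf{u})| \le 2\|\mathbf{x} - \mathbf{x}_0\| \|\mathbf{u}\|$, by the Cauchy–Schwarz
inequality. Therefore, $\|D\phi_{\mathbf{x}} - D\phi_{\mathbf{x}_0}\| \le 2\|\mathbf{x} - \mathbf{x}_0\|$ and so $D\phi$ is
continuous at $\mathbf{x}_0$ (given $\varepsilon > 0$, take $\delta = \varepsilon/2$).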
9.4.1 Directional Derivative
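Let $f : \mathbb{R}^m \to \mathbb{R}^n$ be differentiable and suppose $\mathbf{u} \in S^{m-1}$ (a unit vector). We
define the directional derivative of $f$ at $\mathbf{x}_0$ in direction $\mathbf{u}$ to be the vector $D_{\mathbf{u}} f_{\mathbf{x}_0} \in \mathbb{R}^n$
defined by
$$D_{\mathbf{u}} f_{\mathbf{x}_0} = Df_{\mathbf{x}_0}(\mathbf{u}).$$
We also denote the directional derivative at $\mathbf{x}_0$ by $\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0)$. In particular, we set
$D_{\mathbf{e}_j} f_{\mathbf{x}_0} = \frac{\partial f}{\partial x_j}(\mathbf{x}_0)$, where $\{\mathbf{e}_1, \dots, \mathbf{e}_m\}$ denotes the standard basis of $\mathbb{R}^m$.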
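Lemma 9.4.6 If $f$ is differentiable at $\mathbf{x}_0 \in \mathbb{R}^m$ and $\mathbf{u} \in S^{m-1}$, then
$$D_{\mathbf{u}} f_{\mathbf{x}_0} = \frac{d}{dt} f(\mathbf{x}_0 + t\mathbf{u}) \Big|_{t=0}.$$
Proof By definition of the derivative of $f$ at $\mathbf{x}_0$ we have
$$f(\mathbf{x}_0 + \mathbf{h}) = f(\mathbf{x}_0) + Df_{\mathbf{x}_0}(\mathbf{h}) + r(\mathbf{h}).$$
Now set $\mathbf{h} = t\mathbf{u}$ to obtain
$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + t Df_{\mathbf{x}_0}(\mathbf{u}) + r(t\mathbf{u}).$$
Dividing by $t \ne 0$ gives
$$\frac{f(\mathbf{x}_0 + t\mathbf{u}) - f(\mathbf{x}_0)}{t} = Df_{\mathbf{x}_0}(\mathbf{u}) + \frac{r(t\mathbf{u})}{t}.$$
Since $r(\mathbf{h}) = o(\mathbf{h})$, we have $r(t\mathbf{u}) = o(t)$. Letting $t \to 0$, we see that $f(\mathbf{x}_0 + t\mathbf{u})$ is
differentiable as a function of $t$ at $t = 0$ with derivative $Df_{\mathbf{x}_0}(\mathbf{u}) \stackrel{\text{def}}{=} D_{\mathbf{u}} f_{\mathbf{x}_0}$. □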
Example 9.4.7 The function $f(\mathbf{x}_0 + t\mathbf{u})$ may be differentiable as a function of $t$
for all $\mathbf{u} \in S^{m-1}$ without $f$ being differentiable at $\mathbf{x}_0$. As a simple example, take
$f(x_1, x_2) = x_1^2 x_2 / (x_1^2 + x_2^2)$, $(x_1, x_2) \ne (0, 0)$, and $f(0, 0) = 0$. Set $(0, 0) = \mathbf{0}$.
It is easy to check that $f$ is continuous, all directional derivatives exist at $\mathbf{0}$, and
$\frac{\partial f}{\partial x_i}(\mathbf{0}) = 0$, $i = 1, 2$. Hence, if $f$ is differentiable at $\mathbf{0}$, then the derivative must be the
zero linear map $0 : \mathbb{R}^2 \to \mathbb{R}$. But this is absurd since $D_{\mathbf{u}} f(\mathbf{0}) \ne 0$ if $\mathbf{u} \notin \{\pm \mathbf{e}_1, \pm \mathbf{e}_2\}$
(alternatively, if $Df_{\mathbf{0}} = 0$, then $f(\mathbf{h}) = r(\mathbf{h})$ and it is easy to verify that $f(\mathbf{h}) \ne o(\mathbf{h})$).
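The failure in Example 9.4.7 is easy to see numerically: the difference quotients along the coordinate axes vanish, but along the diagonal they do not, which is incompatible with a linear derivative. The sketch below is illustrative only; the helper name `difference_quotient` and the step size are choices made here.

```python
import math

def f(x1, x2):
    """f(x1, x2) = x1^2 x2 / (x1^2 + x2^2) away from the origin, f(0, 0) = 0."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x1 * x2 / (x1 * x1 + x2 * x2)

def difference_quotient(u, t):
    """(f(t u) - f(0)) / t, which tends to the directional derivative at 0 as t -> 0."""
    return (f(t * u[0], t * u[1]) - f(0.0, 0.0)) / t

e1, e2 = (1.0, 0.0), (0.0, 1.0)
diagonal = (1 / math.sqrt(2), 1 / math.sqrt(2))

along_e1 = difference_quotient(e1, 1e-6)
along_e2 = difference_quotient(e2, 1e-6)
along_diagonal = difference_quotient(diagonal, 1e-6)
```

For a unit vector $\mathbf{u} = (u_1, u_2)$ one finds $f(t\mathbf{u})/t = u_1^2 u_2$, so the diagonal value is $1/(2\sqrt{2})$, whereas a linear derivative with both partial derivatives zero would force it to be $0$.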
9.4.2 Partial Derivatives
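If $f = (f_1, \dots, f_n) : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable and we take $\mathbf{u} = \mathbf{e}_j \in \mathbb{R}^m$, $1 \le j \le m$, then
$$D_{\mathbf{e}_j} f(\mathbf{x}_0) = \frac{\partial f}{\partial x_j}(\mathbf{x}_0) = \Big( \frac{\partial f_1}{\partial x_j}, \dots, \frac{\partial f_n}{\partial x_j} \Big)(\mathbf{x}_0).$$
The $n \times m$ matrix $[\frac{\partial f_i}{\partial x_j}]$ of partial derivatives of $f$ at $\mathbf{x}_0$ (row index $i$, column index $j$) is then equal to the matrix
of the derivative $Df_{\mathbf{x}_0}$. We refer to $[\frac{\partial f_i}{\partial x_j}]$ as the Jacobian matrix of $f$ at $\mathbf{x}_0$ ($Df_{\mathbf{x}_0}$ is
sometimes called the Jacobian of $f$ at $\mathbf{x}_0$).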
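To make the relation between partial derivatives and the matrix of the derivative concrete, the following sketch approximates the Jacobian matrix column by column with central differences and compares it with partial derivatives computed by hand. The sample map `f`, the helper `jacobian` and the step size are assumptions made here for illustration; finite differences only approximate the derivative.

```python
import math

def f(x):
    """Sample map f : R^2 -> R^3 (chosen only for illustration)."""
    x1, x2 = x
    return [x1 * x2, x1 ** 2, math.sin(x2)]

def jacobian(func, x, step=1e-6):
    """Approximate the n x m matrix of partial derivatives of func at x.

    Column j is a central-difference approximation of the partial derivative
    of func with respect to x_j, that is, of Df_x(e_j).
    """
    n, m = len(func(x)), len(x)
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        forward = list(x)
        backward = list(x)
        forward[j] += step
        backward[j] -= step
        f_plus, f_minus = func(forward), func(backward)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return J

x0 = [1.0, 2.0]
approx = jacobian(f, x0)
# Partial derivatives computed by hand at x0 = (1, 2): rows are f_1, f_2, f_3.
exact = [[2.0, 1.0], [2.0, 0.0], [0.0, math.cos(2.0)]]
```

The result has three rows (one per component of $f$) and two columns (one per coordinate direction in the domain).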
Observe that in order to compute the partial derivatives we need to choose
coordinate systems on $\mathbb{R}^m, \mathbb{R}^n$. Thus, if $f : V \to W$ is differentiable at $\mathbf{x}_0$, the
derivative $Df_{\mathbf{x}_0} \in L(V, W)$ (see Exercises 9.3.7(6)). In order to define the partial
derivatives of $f$ at $\mathbf{x}_0$, we need to choose bases for $V$ and $W$ and thereby identify $V$
with $\mathbb{R}^m$ and $W$ with $\mathbb{R}^n$. The matrix of $Df_{\mathbf{x}_0}$ relative to the coordinate systems on $V$
and $W$ will then be the matrix $[\frac{\partial f_i}{\partial x_j}]$ of partial derivatives of $f$ at $\mathbf{x}_0$.
Provided that $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{x}_0$, we can always compute the
partial derivatives of $f$. As we see later, the converse is more subtle and requires
some continuity of the partial derivatives of $f$. As Example 9.4.7 shows, we cannot
deduce differentiability just from the existence of partial derivatives.
9.4.3 The Chain Rule
The chain rule is one of the most useful results about derivatives. Simply put, the
chain rule asserts that the best affine linear approximation to a composite $g \circ f$ of
differentiable functions is the composite of the best affine linear approximations of
g and f . Viewed in this way, the proof that we give is quite natural: we verify that
the composite of the derivatives does give the best approximation to the composite
of the maps.
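Theorem 9.4.8 (Chain Rule) Let $f : \mathbb{R}^m \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^p$. Suppose $\mathbf{x} \in \mathbb{R}^m$
and set $\mathbf{y} = f(\mathbf{x}) \in \mathbb{R}^n$. If $f$ is differentiable at $\mathbf{x}$ and $g$ is differentiable at $\mathbf{y}$, then
$g \circ f$ is differentiable at $\mathbf{x}$ and
$$D(g \circ f)_{\mathbf{x}} = Dg_{\mathbf{y}} \circ Df_{\mathbf{x}}.$$
Proof It is enough to show that $Dg_{\mathbf{y}} \circ Df_{\mathbf{x}} \in L(\mathbb{R}^m, \mathbb{R}^p)$ satisfies the defining
condition for the differentiability of $g \circ f$ at $\mathbf{x}$. That is,
$$g \circ f(\mathbf{x} + \mathbf{h}) = g \circ f(\mathbf{x}) + Dg_{\mathbf{y}} \circ Df_{\mathbf{x}}(\mathbf{h}) + R(\mathbf{h}),$$
where $R(\mathbf{h}) = o(\mathbf{h})$. We start by writing down the differentiability assumptions we
are given on $f$ and $g$.
$$\begin{aligned}
f(\mathbf{x} + \mathbf{h}) &= f(\mathbf{x}) + Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h}), \\
g(\mathbf{y} + \mathbf{k}) &= g(\mathbf{y}) + Dg_{\mathbf{y}}(\mathbf{k}) + s(\mathbf{k}),
\end{aligned}$$
where $r(\mathbf{h}) = o(\mathbf{h})$ and $s(\mathbf{k}) = o(\mathbf{k})$. Taking $\mathbf{k} = Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})$ and substituting
in the right-hand side of the formula for $g(\mathbf{y} + \mathbf{k}) = g(f(\mathbf{x}) + Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) =
g \circ f(\mathbf{x} + \mathbf{h})$ gives
$$\begin{aligned}
g(f(\mathbf{x} + \mathbf{h})) &= g(f(\mathbf{x})) + Dg_{\mathbf{y}}(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) + s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) \\
&= g \circ f(\mathbf{x}) + Dg_{\mathbf{y}} \circ Df_{\mathbf{x}}(\mathbf{h}) + Dg_{\mathbf{y}}(r(\mathbf{h})) + s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})).
\end{aligned}$$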
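Therefore, $R(\mathbf{h}) = Dg_{\mathbf{y}}(r(\mathbf{h})) + s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h}))$. To complete the proof, we show
that $Dg_{\mathbf{y}}(r(\mathbf{h})) = o(\mathbf{h})$ and $s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) = o(\mathbf{h})$.
(1) $Dg_{\mathbf{y}}(r(\mathbf{h})) = o(\mathbf{h})$. For $\mathbf{h} \ne \mathbf{0}$, we have
$$\frac{\|Dg_{\mathbf{y}}(r(\mathbf{h}))\|}{\|\mathbf{h}\|} \le \frac{\|Dg_{\mathbf{y}}\| \|r(\mathbf{h})\|}{\|\mathbf{h}\|}.$$
Since $r(\mathbf{h}) = o(\mathbf{h})$, $\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|Dg_{\mathbf{y}}\| \|r(\mathbf{h})\|}{\|\mathbf{h}\|} = 0$ and so $Dg_{\mathbf{y}}(r(\mathbf{h})) = o(\mathbf{h})$.
(2) $s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) = o(\mathbf{h})$. Since $r(\mathbf{h}) = o(\mathbf{h})$, we can choose $\delta_1 > 0$ such that
$\|r(\mathbf{h})\| \le \|\mathbf{h}\|$, if $\|\mathbf{h}\| \le \delta_1$. Hence
$$\|Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})\| \le \|Df_{\mathbf{x}}(\mathbf{h})\| + \|r(\mathbf{h})\| \le (\|Df_{\mathbf{x}}\| + 1) \|\mathbf{h}\|, \quad \|\mathbf{h}\| \le \delta_1.$$
Since $s(\mathbf{k}) = o(\mathbf{k})$, given $\varepsilon > 0$, we can choose $\delta_2 > 0$ such that
$$\|s(\mathbf{k})\| \le \frac{\varepsilon}{\|Df_{\mathbf{x}}\| + 1} \|\mathbf{k}\|, \quad \|\mathbf{k}\| \le \delta_2.$$
Set $\delta = \min\{\delta_1, \frac{\delta_2}{\|Df_{\mathbf{x}}\| + 1}\}$. Then if $\|\mathbf{h}\| \le \delta$, we have
$$\|Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})\| \le (\|Df_{\mathbf{x}}\| + 1) \|\mathbf{h}\| \le \delta_2,$$
and so
$$\|s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h}))\| \le \frac{\varepsilon}{\|Df_{\mathbf{x}}\| + 1} (\|Df_{\mathbf{x}}\| + 1) \|\mathbf{h}\| = \varepsilon \|\mathbf{h}\|.$$
Hence $s(Df_{\mathbf{x}}(\mathbf{h}) + r(\mathbf{h})) = o(\mathbf{h})$. □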
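The statement just proved can be illustrated numerically: for concrete maps, the error made by replacing $g \circ f$ near $\mathbf{x}$ with $g(f(\mathbf{x})) + Dg_{\mathbf{y}}(Df_{\mathbf{x}}(\mathbf{h}))$ should shrink faster than $\|\mathbf{h}\|$. In the sketch below the maps `f` and `g`, their hand-computed derivatives, the base point and the direction of $\mathbf{h}$ are all illustrative assumptions.

```python
import math

def f(x):
    """f : R^2 -> R^2, an illustrative choice."""
    return [x[0] * x[1], x[0] + x[1]]

def g(y):
    """g : R^2 -> R, an illustrative choice."""
    return y[0] ** 2 + math.sin(y[1])

def Df(x, h):
    """Derivative of f at x applied to h (computed by hand)."""
    return [x[1] * h[0] + x[0] * h[1], h[0] + h[1]]

def Dg(y, k):
    """Derivative of g at y applied to k (computed by hand)."""
    return 2 * y[0] * k[0] + math.cos(y[1]) * k[1]

def remainder_ratio(x, h):
    """|g(f(x+h)) - g(f(x)) - Dg_y(Df_x(h))| / ||h||, where y = f(x)."""
    shifted = [a + b for a, b in zip(x, h)]
    error = g(f(shifted)) - g(f(x)) - Dg(f(x), Df(x, h))
    return abs(error) / math.hypot(h[0], h[1])

x = [0.5, -1.0]
# Shrink h along a fixed direction and watch the ratio ||R(h)|| / ||h||.
ratios = [remainder_ratio(x, [t, -2 * t]) for t in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The ratios decrease roughly in proportion to $\|\mathbf{h}\|$, as expected when the remainder $R(\mathbf{h})$ is $o(\mathbf{h})$.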
Remarks 9.4.9
(1) Notice how well this proof using approximation avoids the difficulties encountered using the $\lim_{h \to 0} \frac{g(f(x + h)) - g(f(x))}{h}$ definition from the 1-variable theory.
(2) If we assume $f$ and $g$ have respective domains the open subsets $U$ of $\mathbb{R}^m$ and $V$
of $\mathbb{R}^n$, then $g \circ f$ is defined on the open set $U \cap f^{-1}(V) \subset \mathbb{R}^m$ and Theorem 9.4.8
applies with the proviso that $\mathbf{x} \in U \cap f^{-1}(V)$. ✠
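Examples 9.4.10
(1) Let $V : \mathbb{R}^m \to \mathbb{R}$ and $\gamma = (\gamma_1, \dots, \gamma_m) : \mathbb{R} \to \mathbb{R}^m$ be differentiable. We
claim that $V \circ \gamma : \mathbb{R} \to \mathbb{R}$ is differentiable and $(V \circ \gamma)'(t) = DV_{\gamma(t)}(\gamma'(t))$ (see
Definition 9.3.5 for the notation $\gamma', (V \circ \gamma)'$). In terms of partial derivatives,
$$\frac{d}{dt} V(\gamma(t)) = \sum_{i=1}^m \frac{\partial V}{\partial x_i}(\gamma(t))\, \gamma_i'(t).$$
To verify the claim, apply the chain rule to get $D(V \circ \gamma)_t = DV_{\gamma(t)} \circ D\gamma_t$.
Either side of the equation defines a linear map from $\mathbb{R}$ to $\mathbb{R}$. Evaluate at 1 to
get $(V \circ \gamma)'(t) = DV_{\gamma(t)}(\gamma'(t))$. Now $\gamma'(t) = \sum_{i=1}^m \gamma_i'(t) \mathbf{e}_i$ and so we have
$$\begin{aligned}
DV_{\gamma(t)}(\gamma'(t)) &= DV_{\gamma(t)}\Big( \sum_{i=1}^m \gamma_i'(t) \mathbf{e}_i \Big) \\
&= \sum_{i=1}^m \gamma_i'(t)\, DV_{\gamma(t)}(\mathbf{e}_i) \\
&= \sum_{i=1}^m \gamma_i'(t)\, \frac{\partial V}{\partial x_i}(\gamma(t)),
\end{aligned}$$
where the last line follows by definition of the partial derivative.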
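(2) Let $U \subset \mathbb{R}^m$ and $V \subset \mathbb{R}^n$ be open sets and suppose that $f : U \to V$ is 1:1 onto
and both $f$ and $f^{-1} : V \to U$ are differentiable. Then
(a) For all $\mathbf{x} \in U$, $Df_{\mathbf{x}} : \mathbb{R}^m \to \mathbb{R}^n$ is a linear isomorphism and $(Df_{\mathbf{x}})^{-1} =
Df^{-1}_{f(\mathbf{x})}$.
(b) $m = n$.
(a) We have $f^{-1} \circ f = I_U$, where $I_U$ is the identity map of $U$. It follows by the chain
rule that if $\mathbf{x} \in U$, then
$$Df^{-1}_{f(\mathbf{x})} \circ Df_{\mathbf{x}} = I \in L(\mathbb{R}^m, \mathbb{R}^m).$$
Applying the same argument to $f \circ f^{-1} = I_V$ gives $Df_{\mathbf{x}} \circ Df^{-1}_{f(\mathbf{x})} = I \in L(\mathbb{R}^n, \mathbb{R}^n)$.
Hence the linear map $Df_{\mathbf{x}}$ is invertible with inverse $Df^{-1}_{f(\mathbf{x})}$.
(b) If we have a linear isomorphism $A : \mathbb{R}^m \to \mathbb{R}^n$ then (by linear algebra: look at
bases), $m = n$.
Remark 9.4.11 It is natural to ask if $m = n$ when $f : U \to V$ is 1:1 onto but $f$ and
$f^{-1}$ are only continuous, that is, $f$ is a homeomorphism. The answer is yes but the
proof is tricky and depends on results from topology, specifically the “invariance of
domain theorem”. ✠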
9.4.4 The Mean Value Theorem
We recall that the mean value theorem for maps $F : [a, b] \to \mathbb{R}$, continuous on $[a, b]$
and differentiable on $(a, b)$, states that there exists a $c \in (a, b)$ such that
$$F(b) - F(a) = F'(c)(b - a). \tag{9.3}$$
Before we state the version of the mean value theorem appropriate for vector spaces,
we need some notation. Given $\mathbf{x} \ne \mathbf{y} \in \mathbb{R}^m$, let $[\mathbf{x}, \mathbf{y}] \subset \mathbb{R}^m$ be the line segment
joining $\mathbf{x}$ and $\mathbf{y}$. That is,
$$[\mathbf{x}, \mathbf{y}] = \{(1 - t)\mathbf{x} + t\mathbf{y} \mid t \in [0, 1]\}.$$
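Theorem 9.4.12 (The Mean Value Theorem) Let $U \subset \mathbb{R}^m$ be open and $f : U \to
\mathbb{R}^n$ be differentiable. Given $\mathbf{x}, \mathbf{y} \in U$ such that $[\mathbf{x}, \mathbf{y}] \subset U$, we have
$$\|f(\mathbf{x}) - f(\mathbf{y})\| \le \sup_{\mathbf{z} \in (\mathbf{x}, \mathbf{y})} \|Df_{\mathbf{z}}\| \|\mathbf{x} - \mathbf{y}\|.$$
(Here $(\mathbf{x}, \mathbf{y})$ denotes the segment $[\mathbf{x}, \mathbf{y}]$ with its endpoints removed. We allow $\sup_{\mathbf{z} \in (\mathbf{x}, \mathbf{y})} \|Df_{\mathbf{z}}\| = +\infty$.)
Proof We prove the result in two steps. If $n = 1$, we deduce the theorem from the
1-variable version of the mean value theorem. If $n > 1$, we reduce to the $n = 1$
case by projecting $\mathbb{R}^n$ along the line defined by the vector $f(\mathbf{y}) - f(\mathbf{x})$. Now for the
details.
Suppose $n = 1$. Define $F : [0, 1] \to \mathbb{R}$ by
$$F(t) = f((1 - t)\mathbf{x} + t\mathbf{y}), \quad t \in [0, 1].$$
Observe that $F(0) = f(\mathbf{x})$, $F(1) = f(\mathbf{y})$. Apply the 1-dimensional version of the
mean value theorem (9.3) to get
$$f(\mathbf{y}) - f(\mathbf{x}) = F(1) - F(0) = F'(c),$$
for some point $c \in (0, 1)$. Since $F(t) = f((1 - t)\mathbf{x} + t\mathbf{y})$ it follows by the chain rule
that
$$F'(t) = Df_{(1 - t)\mathbf{x} + t\mathbf{y}}(-\mathbf{x} + \mathbf{y}),$$
and so
$$f(\mathbf{y}) - f(\mathbf{x}) = Df_{\tilde{\mathbf{z}}}(\mathbf{y} - \mathbf{x}),$$
where $\tilde{\mathbf{z}} = (1 - c)\mathbf{x} + c\mathbf{y}$. Hence
$$\|f(\mathbf{y}) - f(\mathbf{x})\| = \|Df_{\tilde{\mathbf{z}}}(\mathbf{y} - \mathbf{x})\| \le \|Df_{\tilde{\mathbf{z}}}\| \|\mathbf{y} - \mathbf{x}\| \le \sup_{\mathbf{z} \in (\mathbf{x}, \mathbf{y})} \|Df_{\mathbf{z}}\| \|\mathbf{x} - \mathbf{y}\|.$$
Next suppose $n > 1$. Since the result is obvious if $f(\mathbf{x}) = f(\mathbf{y})$, we may suppose
that $f(\mathbf{x}) \ne f(\mathbf{y})$. Set $\mathbf{u} = \frac{f(\mathbf{y}) - f(\mathbf{x})}{\|f(\mathbf{y}) - f(\mathbf{x})\|} \in S^{n-1}$. Define the linear map $\pi : \mathbb{R}^n \to \mathbb{R}$
by $\pi(\mathbf{w}) = (\mathbf{u}, \mathbf{w})$, $\mathbf{w} \in \mathbb{R}^n$. The map $\pi$ gives the component of the orthogonal
projection of $\mathbb{R}^n$ along the line $\{t\mathbf{u} \mid t \in \mathbb{R}\}$. Since $|\pi(\mathbf{w})| = |(\mathbf{u}, \mathbf{w})| \le \|\mathbf{u}\| \|\mathbf{w}\| =
\|\mathbf{w}\|$, we have (take $\mathbf{w} = \mathbf{u}$)
$$\|\pi\| = 1. \tag{9.4}$$
Define $G : [0, 1] \to \mathbb{R}$ by
$$G(t) = \pi(F(t)) = (\mathbf{u}, F(t)), \quad t \in [0, 1],$$
where, as before, $F(t) = f((1 - t)\mathbf{x} + t\mathbf{y})$.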
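Observe that
$$\begin{aligned}
G(1) - G(0) &= \Big( \frac{f(\mathbf{y}) - f(\mathbf{x})}{\|f(\mathbf{y}) - f(\mathbf{x})\|}, F(1) - F(0) \Big) \\
&= \Big( \frac{f(\mathbf{y}) - f(\mathbf{x})}{\|f(\mathbf{y}) - f(\mathbf{x})\|}, f(\mathbf{y}) - f(\mathbf{x}) \Big) = \|f(\mathbf{y}) - f(\mathbf{x})\|.
\end{aligned}$$
Applying the 1-dimensional version of the mean value theorem (9.3) to $G$ we get
$$\|f(\mathbf{y}) - f(\mathbf{x})\| = G(1) - G(0) = G'(c),$$
for some $c \in (0, 1)$. It remains to compute $G'(c)$. We apply the chain rule to $G =
\pi \circ F$. Since $\pi$ is linear, $D\pi = \pi$ and so
$$G'(t) = \pi(F'(t)) = \pi(Df_{(1 - t)\mathbf{x} + t\mathbf{y}}(-\mathbf{x} + \mathbf{y})).$$
Setting $t = c$, $\tilde{\mathbf{z}} = (1 - c)\mathbf{x} + c\mathbf{y}$, gives
$$\|f(\mathbf{y}) - f(\mathbf{x})\| = G(1) - G(0) = G'(c) = \pi(Df_{\tilde{\mathbf{z}}}(\mathbf{y} - \mathbf{x})).$$
Since $G'(c) = \|f(\mathbf{y}) - f(\mathbf{x})\|$, $G'(c) > 0$ and so
$$\begin{aligned}
\|f(\mathbf{y}) - f(\mathbf{x})\| &= |G'(c)| = |\pi(Df_{\tilde{\mathbf{z}}}(\mathbf{y} - \mathbf{x}))| \\
&\le \|\pi\| \|Df_{\tilde{\mathbf{z}}}\| \|\mathbf{y} - \mathbf{x}\| = \|Df_{\tilde{\mathbf{z}}}\| \|\mathbf{y} - \mathbf{x}\|,
\end{aligned}$$
where we have used (9.4). □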
Remarks 9.4.13
(1) The multivariable form of the mean value theorem is written as an inequality.
For maps into $\mathbb{R}^n$, $n > 1$, it is generally not possible to write $f(\mathbf{y}) - f(\mathbf{x}) =
Df_{\mathbf{z}}(\mathbf{y} - \mathbf{x})$ for some $\mathbf{z} \in [\mathbf{x}, \mathbf{y}]$. It is not hard to construct examples (see the
exercises at the end of the section).
(2) The mean value theorem is easily the most important foundational result in the
differential calculus. It estimates $\|f(\mathbf{y}) - f(\mathbf{x})\|$ as though $f$ were a linear map:
if $f$ is linear, then $\|f(\mathbf{y}) - f(\mathbf{x})\| \le \|f\| \|\mathbf{y} - \mathbf{x}\|$. Having inequality, rather than
equality, is no loss: it is rare (outside of contrived problems) that explicit use is
made of the value $c$ in (9.3). ✠
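One standard illustration of Remark 9.4.13(1), offered here as an assumption rather than as the example intended in the exercises, is the circle map $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$: the increment $f(2\pi) - f(0)$ vanishes, yet $Df_t(2\pi - 0)$ has norm $2\pi$ for every $t$. The sketch below checks this on a grid of sample points.

```python
import math

def f(t):
    """Curve f : R -> R^2 tracing the unit circle."""
    return (math.cos(t), math.sin(t))

def f_prime(t):
    """f'(t) = Df_t(1), computed by hand."""
    return (-math.sin(t), math.cos(t))

a, b = 0.0, 2 * math.pi
increment = math.hypot(f(b)[0] - f(a)[0], f(b)[1] - f(a)[1])

# ||Df_t(b - a)|| = |b - a| * ||f'(t)|| equals 2*pi at every sampled t, so
# Df_t(b - a) can never equal the increment f(b) - f(a), which is zero.
samples = [a + (b - a) * k / 1000 for k in range(1001)]
derivative_sizes = [(b - a) * math.hypot(*f_prime(t)) for t in samples]
```

The inequality of Theorem 9.4.12 still holds, since $0 \le 2\pi$, but no equality of the form (9.3) is available.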
Corollary 9.4.14 Let $U \subset \mathbb{R}^m$ be open and connected. Suppose that $f : U \to \mathbb{R}^n$ is
differentiable and $Df_{\mathbf{x}} = 0$ for all $\mathbf{x} \in U$. Then $f$ is constant.
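Proof Given $\mathbf{x} \in U$, choose $r > 0$ such that $D_r(\mathbf{x}) \subset U$. For every $\mathbf{y} \in D_r(\mathbf{x})$,
$[\mathbf{x}, \mathbf{y}] \subset D_r(\mathbf{x}) \subset U$. Hence if $\mathbf{y} \in D_r(\mathbf{x})$, we can apply the mean value theorem to
get
$$\|f(\mathbf{y}) - f(\mathbf{x})\| \le \sup_{\mathbf{z} \in [\mathbf{x}, \mathbf{y}]} \|Df_{\mathbf{z}}\| \|\mathbf{y} - \mathbf{x}\| = 0,$$
since $Df = 0$. Hence $f(\mathbf{y}) = f(\mathbf{x})$, for all $\mathbf{y} \in D_r(\mathbf{x})$. Fix $\mathbf{x}_0 \in U$ and define
$$W = \{\mathbf{y} \in U \mid f(\mathbf{y}) = f(\mathbf{x}_0)\}.$$
Since $f$ is continuous (because $f$ is differentiable), $W$ is a closed subset of $U$. On
the other hand, it follows from the argument above that $W$ is open. Since $\mathbf{x}_0 \in W$,
$W \ne \emptyset$ and so, by the connectivity of $U$, $W = U$. □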
EXERCISES 9.4.15
(1) Suppose that $f, g : \mathbb{R}^m \to \mathbb{R}^n$ are both differentiable at $x$. Show that $f \pm g : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $x$ with derivative $Df_x \pm Dg_x$.
(2) Let $Q : \mathbb{R}^m \to \mathbb{R}^n$ be differentiable and suppose that $Q(tx) = t^d Q(x)$ for all $t \in \mathbb{R}$, $x \in \mathbb{R}^m$ ($d$ is assumed to be a strictly positive integer).
(a) Show that $Q(0) = 0$.
(b) Show that $DQ_x(x) = dQ(x)$, for all $x \in \mathbb{R}^m$ (Euler's theorem). (Hint: Apply the chain rule to $Q \circ g : \mathbb{R} \to \mathbb{R}^n$ where $g(t) = tx$.)
(3) Let $f : \mathbb{R}^m \to \mathbb{R}^n$ be $C^1$ and $f(0) = 0$. Show that $f(x) = \int_0^1 Df_{tx}(x)\,dt$ and deduce that we can write the components $f_i$ of $f$ in the form
\[ f_i(x) = \sum_{j=1}^{m} x_j\, g_{ji}(x), \quad 1 \le i \le n, \]
where $g_{ji}(x) = \int_0^1 \frac{\partial f_i}{\partial x_j}(tx)\,dt$. (Hint: $f(x) = \int_0^1 \frac{d}{dt} f(tx)\,dt$. The integral of an $\mathbb{R}^n$-valued function is defined component-wise: $\int (f_1, \dots, f_n) = (\int f_1, \dots, \int f_n)$.)
(4) Recall that the derivative of the norm squared function $\nu(x) = \|x\|^2$ is given by $D\nu_x(h) = 2(x,h)$. Using this, together with the chain rule, show that $\eta(x) = \|x\|^{5/2}$ is $C^1$ on $\mathbb{R}^m$ and find $D\eta_x \in L(\mathbb{R}^m, \mathbb{R})$. (You may assume that $g : \mathbb{R}_+ \to \mathbb{R}$ defined by $g(t) = t^a$ is $C^1$ with derivative $a t^{a-1}$, provided that $a \ge 1$. Start by writing $\eta$ as a composition.)
(5) Let $f, g : \mathbb{R}^m \to \mathbb{R}$ be differentiable functions. Find, from first principles (using the definition of differentiable), the derivative $D(fg)_x \in L(\mathbb{R}^m, \mathbb{R})$, $x \in \mathbb{R}^m$, in terms of $Df_x$, $Dg_x$ and the values of $f$ and $g$ at $x$. (Note that $fg = f \cdot g$, the pointwise product.) Deduce a general formula for the derivative at $x$ of a product of $p$ differentiable functions. Hence show that if $f(x) = A_1(x) \cdots A_p(x)$, where each $A_i : \mathbb{R}^n \to \mathbb{R}$ is linear, then $f$ is $C^1$. Deduce that every monomial $M : \mathbb{R}^n \to \mathbb{R}$, $M(x) = x_1^{a_1} \cdots x_n^{a_n}$, $a_1, \dots, a_n \in \mathbb{Z}_+$, is $C^1$ and hence that every real-valued polynomial on $\mathbb{R}^n$ is $C^1$.
(6) Let $f, g : \mathbb{R}^m \to \mathbb{R}$ be $C^1$. Verify the Leibniz law:
\[ D(fg)_x = g(x)\,Df_x + f(x)\,Dg_x, \quad x \in \mathbb{R}^m. \]
($fg : \mathbb{R}^m \to \mathbb{R}$ is defined by $fg(x) = f(x)g(x)$, $x \in \mathbb{R}^m$.)
(7) Define $f : \mathbb{R} \times \mathbb{R} = \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2^3}{x_1^2 + x_2^4}, & (x_1, x_2) \ne (0,0), \\[1ex] 0, & (x_1, x_2) = (0,0). \end{cases} \]
Show that
(a) $f$ is continuous on $\mathbb{R}^2$.
(b) The directional derivatives $D_u f_x$ exist on $\mathbb{R}^2$ for all unit vectors $u$. Compute them at $(x_1, x_2) = (0,0)$.
(c) $f$ is not differentiable at $(0,0)$.
(Hints: for (a) you may find the arithmetic-geometric mean inequality ($A + B \ge 2\sqrt{AB}$, $A, B \ge 0$) helpful. For (c), look at what happens on a curve $(h^2, h)$, $0 < h \le 1$, and note that if the derivative exists it may be found using (b). Now use the chain rule.)
(8) Find an example of a $C^1$ map $f : \mathbb{R} \to \mathbb{R}^2$ such that
\[ f(1) - f(0) \ne f'(t), \quad \text{for all } t \in [0,1]. \]
(Failure of the mean value theorem as an equality for maps into $\mathbb{R}^n$, $n > 1$. Note that if the result were true then it would apply to the components and so for $j = 1, 2$, $f_j(1) - f_j(0) = f_j'(t)$ at the same point $t \in (0,1)$. So it is enough to find two functions $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ where the equality occurs at different points.)
(9) Let $U \subset \mathbb{R}^m$ be open and suppose that $f : U \to \mathbb{R}$. Given $x_0 \in U$, we say $f(x_0)$ is a local maximum value of $f$ if there exists an $r > 0$ such that
\[ f(x) \le f(x_0), \quad \text{whenever } \|x - x_0\| < r. \]
Show that if $f$ is differentiable on $U$ then a necessary condition for $f(x_0)$ to be a local maximum value of $f$ is $Df_{x_0} = 0$ (as a linear map from $\mathbb{R}^m$ to $\mathbb{R}$). You should work from the definition of differentiable map (and there should be no mention of partial derivatives).
(10) Let $U$ be an open subset of $\mathbb{R}^m$ and let $f : U \to \mathbb{R}^n$. Suppose that there is a continuous map $u : U \to L(\mathbb{R}^m, \mathbb{R}^n)$ such that for all $y \in \mathbb{R}^m$ and all $x \in U$, we have
\[ \lim_{t \to 0} \frac{f(x + ty) - f(x)}{t} = u(x)(y). \]
By applying the mean value theorem to the map $g(t) = f(x + ty) - u(x)(ty)$, prove that $f$ is $C^1$ on $U$ and $Df_x = u(x)$ for all $x \in U$. Rephrase this result in terms of directional derivatives of $f$.
9.5 Maps to and from Products
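We start with the easier case of a map to a product. Suppose then that $f = (f_1, f_2) : \mathbb{R}^m \to \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$. If $f_1, f_2$ are continuous so is $f$ and conversely. Before giving the result for differentiability of maps $f : \mathbb{R}^m \to \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$, note the natural linear isomorphism
\[ L(\mathbb{R}^m, \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}) \cong L(\mathbb{R}^m, \mathbb{R}^{n_1}) \times L(\mathbb{R}^m, \mathbb{R}^{n_2}) \]
defined by taking the components $A_1, A_2$ of $A : \mathbb{R}^m \to \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$.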
Proposition 9.5.1 Let $U \subset \mathbb{R}^m$ be open and suppose $f = (f_1, f_2) : U \subset \mathbb{R}^m \to \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$.
(1) If $x \in U$, then $f$ is differentiable at $x$ iff both $f_1$ and $f_2$ are differentiable at $x$, and the derivatives of $f$ and $f_1$, $f_2$ at $x$ are related by
\[ Df_x(h) = (Df_{1,x}, Df_{2,x})(h) = (Df_{1,x}(h), Df_{2,x}(h)), \quad h \in \mathbb{R}^m. \]
(2) $f$ is differentiable (respectively, $C^1$) on $U$ iff both $f_1, f_2$ are differentiable (respectively, $C^1$) on $U$.
Proof We prove (1). If $f_1, f_2$ are differentiable at $x$ and $h \in \mathbb{R}^m$, we have
\[ f_1(x + h) = f_1(x) + Df_{1,x}(h) + r_1(h), \]
\[ f_2(x + h) = f_2(x) + Df_{2,x}(h) + r_2(h), \]
where $r_1(h), r_2(h) = o(h)$. Hence,
\[ f(x + h) = f(x) + (Df_{1,x}, Df_{2,x})(h) + (r_1(h), r_2(h)), \quad h \in \mathbb{R}^m. \]
Taking the norm $\|(v_1, v_2)\| = \|v_1\| + \|v_2\|$ (equivalent to the Euclidean norm) on the product $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \cong \mathbb{R}^{n_1 + n_2}$, we have $\|(r_1(h), r_2(h))\| = \|r_1(h)\| + \|r_2(h)\|$ and so $(r_1(h), r_2(h)) = o(h)$. Hence $f$ is differentiable at $x$ and $Df_x = (Df_{1,x}, Df_{2,x})$. For the converse, reverse the argument. $\square$
Corollary 9.5.2 Let $U \subset \mathbb{R}^m$ be open and $f = (f_1, \dots, f_n) : U \subset \mathbb{R}^m \to \mathbb{R}^n$ be differentiable at $x_0 \in U$. Then
\[ Df_{x_0} = (Df_{1,x_0}, \dots, Df_{n,x_0}) \in \prod_{i=1}^{n} L(\mathbb{R}^m, \mathbb{R}). \]
Proof A straightforward induction using Proposition 9.5.1. $\square$
Next we look at maps from a product. Suppose that $f : \mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}^n$, $p, q \in \mathbb{N}$. It does not follow that separate continuity implies continuity. That is, if both $x \mapsto f(x,y)$ ($y$ fixed) and $y \mapsto f(x,y)$ ($x$ fixed) are continuous, $f$ need not be continuous (for example, define $f(0,0) = 0$ and $f(x,y) = xy/(x^2 + y^2)$, $(x,y) \ne (0,0)$). This suggests that inferring results about the differentiability of $f(x,y)$ from the separate differentiability of $f$ in $x$ and $y$ may not be so straightforward.
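The counterexample $f(x,y) = xy/(x^2+y^2)$ can be explored numerically. The following is a minimal sketch (the sampling points are an arbitrary choice): along either axis $f$ is identically zero, so $f$ is separately continuous at the origin, but along the diagonal $f$ is constantly $1/2$, so $f$ is not continuous there.

```python
def f(x, y):
    """The example from the text: f(0,0) = 0 and f(x,y) = xy/(x^2+y^2) otherwise."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

# Points approaching the origin.
ts = [10.0 ** (-k) for k in range(1, 8)]

along_x_axis = [f(t, 0.0) for t in ts]    # y fixed at 0
along_y_axis = [f(0.0, t) for t in ts]    # x fixed at 0
along_diagonal = [f(t, t) for t in ts]    # approach along x = y

print(along_x_axis[-1], along_y_axis[-1], along_diagonal[-1])
```

Any line $y = cx$ gives the constant value $c/(1+c^2)$, so the limit at the origin depends on the direction of approach.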
In order to relate derivatives in $x$ and $y$ with the derivative at $(x,y)$, we make use of the natural isomorphism
\[ L(\mathbb{R}^p, \mathbb{R}^n) \times L(\mathbb{R}^q, \mathbb{R}^n) \cong L(\mathbb{R}^p \times \mathbb{R}^q, \mathbb{R}^n); \quad (A_1, A_2) \mapsto A, \tag{9.5} \]
where $A(u,v) = A_1(u) + A_2(v)$, $(u,v) \in \mathbb{R}^p \times \mathbb{R}^q$.
Let $X_0 = (x_0, y_0) \in \mathbb{R}^p \times \mathbb{R}^q$. Suppose that the map $x \mapsto f(x, y_0)$ is differentiable at $x = x_0$. We denote the derivative at $x_0$ by $D_1 f_{(x_0, y_0)} = D_1 f_{X_0}$. Note that this notation emphasizes that the derivative in $x$-variables depends on $y$. We similarly let $D_2 f_{X_0}$ denote the derivative of $y \mapsto f(x_0, y)$ at $y_0$.
Since $D_1 f_{X_0} \in L(\mathbb{R}^p, \mathbb{R}^n)$ and $D_2 f_{X_0} \in L(\mathbb{R}^q, \mathbb{R}^n)$, the linear map $A \in L(\mathbb{R}^p \times \mathbb{R}^q, \mathbb{R}^n)$ determined by the natural isomorphism (9.5) is given by
\[ A(u,v) = D_1 f_{X_0}(u) + D_2 f_{X_0}(v), \quad (u,v) \in \mathbb{R}^p \times \mathbb{R}^q. \]
We show that if $f$ is differentiable at $X_0$, then $Df_{X_0} = A$.
Proposition 9.5.3 If $f : \mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}^n$ is differentiable at $X_0$, then $f$ is differentiable with respect to $x$ and $y$ at $X_0$ and
\[ D_1 f_{X_0}(h) = Df_{X_0}(h, 0), \quad h \in \mathbb{R}^p, \]
\[ D_2 f_{X_0}(k) = Df_{X_0}(0, k), \quad k \in \mathbb{R}^q, \]
\[ Df_{X_0}(h, k) = D_1 f_{X_0}(h) + D_2 f_{X_0}(k), \quad (h,k) \in \mathbb{R}^p \times \mathbb{R}^q. \]
Proof Since $f$ is differentiable at $X_0$,
\[ f(x_0 + h, y_0 + k) = f(x_0, y_0) + Df_{X_0}(h, k) + R(h, k), \tag{9.6} \]
where $R(h,k) = o(h,k)$. Taking $k = 0$, we have $\lim_{h \to 0} \|R(h,0)\| / \|(h,0)\| = 0$ and so $r(h) = R(h,0) = o(h)$. Similarly, $s(k) = R(0,k) = o(k)$. Taking $k = 0$ in (9.6) gives
\[ f(x_0 + h, y_0) = f(x_0, y_0) + Df_{X_0}(h, 0) + r(h), \]
and so $x \mapsto f(x, y_0)$ is differentiable at $x_0$ with derivative $D_1 f_{X_0}$ given by $D_1 f_{X_0}(h) = Df_{X_0}(h, 0)$. The result for $D_2 f_{X_0}$ is proved similarly. The final statement is immediate from the linearity of $Df_{X_0}$. $\square$
Now we look at the converse of Proposition 9.5.3.
Theorem 9.5.4 Let $U \subset \mathbb{R}^{m_1} \times \mathbb{R}^{m_2}$ be open and suppose $f : U \subset \mathbb{R}^{m_1} \times \mathbb{R}^{m_2} \to \mathbb{R}^n$. Then $f$ is $C^1$ on $U$ iff $f$ is separately continuously differentiable (that is, iff the maps $D_j f : U \to L(\mathbb{R}^{m_j}, \mathbb{R}^n)$ exist and are continuous, $j = 1, 2$).
Proof If $f$ is $C^1$, then $D_1 f$ and $D_2 f$ exist and are continuous by Proposition 9.5.3, so it suffices to prove the converse. Let $(h,k) \in \mathbb{R}^{m_1} \times \mathbb{R}^{m_2}$, $X = (x,y) \in U$. We need to show that
\[ f(x + h, y + k) - f(x, y) = D_1 f_X(h) + D_2 f_X(k) + R(h,k), \]
where $R(h,k) = o(h,k)$. We have
\[ f(x + h, y + k) - f(x,y) = \big( f(x + h, y + k) - f(x + h, y) \big) + \big( f(x + h, y) - f(x, y) \big). \]
We start by considering the second term on the right-hand side. Since $f$ is assumed differentiable in the first variable at $X$ we have
\[ f(x + h, y) - f(x,y) = D_1 f_X(h) + r(h), \]
where $r(h) = o(h)$. For all $k \in \mathbb{R}^{m_2}$ we have $\|(h,k)\| \ge \|(h,0)\| = \|h\|$ (Euclidean norms). Hence
\[ \lim_{(h,k) \to 0} \frac{\|r(h)\|}{\|(h,k)\|} = 0, \tag{9.7} \]
and so $r(h) = o(h,k)$. Now we turn to the less straightforward analysis of the first term. For fixed $h$, define
\[ g(k) = f(x + h, y + k) - f(x + h, y) - D_2 f_X(k). \]
We need to show that $g(k) = o(h,k)$. Since $f$ is assumed differentiable with respect to the $y$-variable on $U$, $g(k)$ is differentiable for $(h,k) \in D_r(0,0)$, where $X + D_r(0,0) \subset U$. The derivative of $g$ is given by
\[ Dg_k = D_2 f_{X + (h,k)} - D_2 f_X. \]
(Recall the derivative of a linear map is constant, equal to the linear map.) Now $D_2 f$ is continuous at $X$ and so given $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ \|Dg_k\| = \|D_2 f_{X + (h,k)} - D_2 f_X\| \le \varepsilon, \quad \|(h,k)\| \le \delta. \tag{9.8} \]
Next we apply the mean value theorem to $g(k)$ to obtain
\[ \|g(k) - g(0)\| \le \sup_{0 < t < 1} \|Dg_{tk}\|\,\|k\|. \]
Since $g(0) = 0$, this gives us the estimate
\[ \|g(k)\| \le \sup_{0 < t < 1} \|Dg_{tk}\|\,\|k\|. \]
By (9.8) we therefore have
\[ \|g(k)\| \le \varepsilon \|k\|, \quad \|(h,k)\| \le \delta. \]
That is, given $\varepsilon > 0$, we have shown there exists a $\delta > 0$ such that $\|g(k)\| \le \varepsilon \|k\| \le \varepsilon \|(h,k)\|$, whenever $\|(h,k)\| \le \delta$. As a consequence, we have $\lim_{(h,k) \to 0} \|g(k)\| / \|(h,k)\| = 0$. Summarizing, we have shown that
\[ f(x + h, y + k) - f(x,y) - D_1 f_X(h) - D_2 f_X(k) = r(h) + g(k), \]
where $r(h) + g(k) = o(h,k)$. Hence $f$ is differentiable at $X$ with $Df_X(h,k) = D_1 f_X(h) + D_2 f_X(k)$. Since $D_1 f$ and $D_2 f$ are continuous on $U$, so is $Df$, and $f$ is $C^1$. $\square$
Remarks 9.5.5
(1) The result is false without some continuity assumption on the ‘partial’ derivatives Dj f . Inspection of the proof given shows that f will be differentiable at X
provided that (a) D1 fX exists, (b) D2 f exists on an open disk centred at X, and
(c) D2 f is continuous at X.
(2) The proof of the result makes essential use of the mean value theorem.
Corollary 9.5.6 Let $U \subset \mathbb{R}^m$ be open and $f = (f_1, \dots, f_n) : U \to \mathbb{R}^n$. Then $f$ is $C^1$ on $U$ iff all partial derivatives $\partial f_i / \partial x_j : U \to \mathbb{R}$ exist and are continuous on $U$.
Proof If $f$ is $C^1$, then the partial derivatives exist and are continuous on $U$ (Sect. 9.4.2). We prove the converse by induction on $m$. The result is immediate from Theorem 9.5.4 if $m = 2$ (with $p = q = 1$). Suppose the result is proved for $m - 1$, $m > 2$. Take $p = m - 1$, $q = 1$ in Theorem 9.5.4. By the inductive assumption, $D_1 f : U \to L(\mathbb{R}^p, \mathbb{R}^n)$ exists and is continuous and, by assumption, $D_2 f : U \to L(\mathbb{R}^1, \mathbb{R}^n)$ exists and is continuous. By Theorem 9.5.4, $f$ is $C^1$ on $U$. $\square$
9.6 Inverse and Implicit Function Theorems
We start with the definition of the general linear group and then discuss the process
of inverting a linear map.
9.6.1 The General Linear Group
If we let $GL(\mathbb{R}, n)$ denote the set of invertible linear maps $A : \mathbb{R}^n \to \mathbb{R}^n$, then
\[ GL(\mathbb{R}, n) = \{ A \in L(\mathbb{R}^n, \mathbb{R}^n) \mid \det(A) \ne 0 \}, \]
where $\det(A)$ denotes the determinant of $A$. The set $GL(\mathbb{R}, n)$ has the structure of a group since for all $A, B \in GL(\mathbb{R}, n)$ the composition $AB^{-1} \in GL(\mathbb{R}, n)$. The group identity is $I = I_n$, the identity map of $\mathbb{R}^n$. We refer to $GL(\mathbb{R}, n)$ as the general linear group (of degree $n$). We give $GL(\mathbb{R}, n)$ the metric determined by the operator norm on $L(\mathbb{R}^n, \mathbb{R}^n)$. We can define coordinates on $GL(\mathbb{R}, n)$ by identifying $L(\mathbb{R}^n, \mathbb{R}^n)$ with $\mathbb{R}^{n^2}$ ($[a_{ij}] \leftrightarrow (a_{ij})$). The topology on $GL(\mathbb{R}, n)$ determined by the Euclidean metric on $\mathbb{R}^{n^2}$ is identical to that given by the operator norm (Exercises 9.2.13(1)).
Lemma 9.6.1 $GL(\mathbb{R}, n)$ is an open and dense subset of $L(\mathbb{R}^n, \mathbb{R}^n)$.
Proof The determinant map $\det : L(\mathbb{R}^n, \mathbb{R}^n) \to \mathbb{R}$ is continuous, since it is a polynomial map in matrix coordinates. Hence $\det^{-1}(0) \subset L(\mathbb{R}^n, \mathbb{R}^n)$ is closed and so $GL(\mathbb{R}, n) = L(\mathbb{R}^n, \mathbb{R}^n) \setminus \det^{-1}(0)$ is open. For density, recall that $\det(A) = 0$ iff $A$ has a zero eigenvalue. If $A \notin GL(\mathbb{R}, n)$, then $A + \varepsilon I \in GL(\mathbb{R}, n)$ for all sufficiently small $\varepsilon > 0$ since the eigenvalues of $A + \varepsilon I$ are $\varepsilon$-translates of the eigenvalues of $A$ and there are at most $n$ distinct eigenvalues of $A$. $\square$
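Lemma 9.6.2 If we define $\beta : GL(\mathbb{R}, n) \to GL(\mathbb{R}, n)$ by $\beta(A) = A^{-1}$, then $\beta$ is $C^1$.
Proof If we denote the $ij$-component of $[\beta(A)]$ by $\beta(A)_{ij}$, then $\beta(A)_{ij} = A_{ji} / \det(A)$, where $A_{ij}$ is $(-1)^{i+j}$ times the determinant of the $(n-1) \times (n-1)$ matrix defined by removing the $i$th row and $j$th column from $A$ (the $(i,j)$-cofactor of $A$). This is a rational function of the matrix entries $a_{ij}$ whose denominator does not vanish on $GL(\mathbb{R}, n)$, so its partial derivatives exist and are continuous, and $\beta$ is $C^1$ by Corollary 9.5.6. $\square$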
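The cofactor description of the inverse can be made concrete. The sketch below is an illustration only (the helper names `minor`, `det`, `cofactor`, `inverse` and the sample matrix are assumptions, not from the text): it computes $A^{-1}$ entry by entry from cofactors, using the standard fact that the $(i,j)$ entry of $A^{-1}$ is the $(j,i)$-cofactor divided by $\det(A)$, and checks that $A A^{-1}$ is the identity.

```python
def minor(A, i, j):
    """Matrix obtained from A by removing row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by expansion along the first row (fine for small n)."""
    if len(A) == 0:
        return 1.0
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cofactor(A, i, j):
    """The (i, j)-cofactor: signed determinant of the (i, j) minor."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def inverse(A):
    """Inverse via cofactors: entry (i, j) is cofactor(j, i) / det(A)."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is not invertible")
    n = len(A)
    return [[cofactor(A, j, i) / d for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A non-symmetric sample matrix, so the transpose in the formula matters.
A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 4.0]]
product = matmul(A, inverse(A))
print(det(A), product)
```

Each entry of `inverse(A)` is a polynomial in the entries of `A` divided by `det(A)`, which is the structural fact the proof of the lemma relies on.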
9.6.2 Diffeomorphisms and the Inverse Function Theorem
Definition 9.6.3 Let $U \subset \mathbb{R}^m$, $V \subset \mathbb{R}^n$ be non-empty open sets. A map $f : U \to V$ is a $C^1$ diffeomorphism (of $U$ onto $V$) if
(1) $f$ is 1:1 onto.
(2) Both $f$ and $f^{-1}$ are $C^1$.
Remark 9.6.4 As we showed earlier, it follows from the chain rule that if $f : U \to V$ is a $C^1$ diffeomorphism then $Df^{-1}_{f(x)} = (Df_x)^{-1}$ for all $x \in U$. In particular, $Df_x$ is a linear isomorphism and $m = n$.
It is difficult to give sufficient conditions on the derivative of a $C^1$ map $f : U \to V$ for it to be a diffeomorphism unless $n = 1$ (then it is enough that $f'$ is of constant sign and $U$ is connected). However, there is a very useful result that shows when $f$ is a local diffeomorphism. This result, the inverse function theorem, states that if $Df_x$ is invertible, then $f$ will restrict to a diffeomorphism on a sufficiently small open neighbourhood of $x$. In this section we only prove $C^1$ results. However, all our results extend easily to $C^r$-maps and we indicate proofs in Sect. 9.11 after we have defined higher derivatives.
Theorem 9.6.5 (The Inverse Function Theorem) Let $W$ be an open subset of $\mathbb{R}^m$ and $f : W \to \mathbb{R}^m$ be $C^1$. Suppose that $Df_{x_0}$ is invertible at $x_0 \in W$. Then we can find open neighbourhoods $U \subset W$ of $x_0$, and $V$ of $f(x_0)$ such that
(1) $f$ maps $U$ 1:1 onto $V$. In particular, $V = f(U)$ is open.
(2) $f : U \to V$ is a $C^1$ diffeomorphism.
Proof We start by proving a special case of the theorem. Assume that $f : \mathbb{R}^m \to \mathbb{R}^m$ and
\[ f(0) = 0, \quad Df_0 = I. \]
We show that for all $y \in \mathbb{R}^m$ sufficiently close to the origin of $\mathbb{R}^m$ we can find $x = x(y) \in \mathbb{R}^m$ such that $f(x) = y$. The point $x$ will be our candidate for $f^{-1}(y)$. That is, if we set $f^{-1}(y) = x$, then $f(f^{-1}(y)) = y$. We construct the point $x$ using the contraction mapping lemma with parameters (Theorem 7.17.9). A bonus of this approach is that it gives us an iterative scheme for constructing $f^{-1}(y)$.
Given $y \in \mathbb{R}^m$, define $\Psi_y : \mathbb{R}^m \to \mathbb{R}^m$ by
\[ \Psi_y(x) = x - f(x) + y. \]
Observe that $\Psi_y(x) = x$ iff $x - f(x) + y = x$ iff $f(x) = y$. That is, every fixed point $x$ of $\Psi_y$ gives a solution of $f(x) = y$ (and conversely).
If $f = I$ (so $f(x) = x$ for all $x$), then $\Psi_y(x) = y$ and so the (unique) fixed point is $x = y$. In this trivial case the inverse is the identity map. We are assuming that $f(0) = 0$ and $Df_0 = I$ and so although $x - f(x)$ will not generally be zero, it should be small, at least for small $\|x\|$. Our first steps will be to quantify the size of the term $x - f(x)$. To this end, set $\phi(x) = x - f(x)$. We claim there exists an $r > 0$ such that
(a) $\|\phi(x)\| \le \frac{1}{2}\|x\|$, $\|x\| \le r$.
(b) $\|\phi(x_1) - \phi(x_2)\| \le \frac{1}{2}\|x_1 - x_2\|$, $\|x_1\|, \|x_2\| \le r$.
Since $\phi(0) = 0$, (b) $\Longrightarrow$ (a) and so it suffices to prove (b). Since $D\phi_0 = I - Df_0 = 0$ and $f$, therefore $\phi$, is $C^1$, we can choose $r > 0$ such that $\|D\phi_x\| \le \frac{1}{2}$, all $\|x\| \le r$. Let $\overline{D}_r$ denote the closed disk, centre $0$, radius $r$ in $\mathbb{R}^m$. If $x_1, x_2 \in \overline{D}_r$, then $[x_1, x_2] \subset \overline{D}_r$. Hence we may apply the mean value theorem to get
\[ \|\phi(x_1) - \phi(x_2)\| \le \sup_{z \in [x_1, x_2]} \|D\phi_z\|\,\|x_1 - x_2\| \le \tfrac{1}{2}\|x_1 - x_2\|, \quad x_1, x_2 \in \overline{D}_r, \]
proving (b).
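We now show that for all $y \in \overline{D}_{r/2}$,
(1) $\Psi_y : \overline{D}_r \to \overline{D}_r$,
(2) $\Psi_y$ is a contraction mapping, contraction constant $\frac{1}{2}$.
Suppose $\|y\| \le r/2$ and $\|x\| \le r$. We have
\[ \|\Psi_y(x)\| = \|\phi(x) + y\| \le \|\phi(x)\| + \|y\| \le \|\phi(x)\| + \frac{r}{2} \ (\text{since } \|y\| \le r/2) \le \frac{r}{2} + \frac{r}{2} = r \ (\text{by estimate (a) on } \phi). \]
This proves (1). Turning to (2), suppose $\|x_1\|, \|x_2\| \le r$ and $\|y\| \le r/2$. We have
\[ \|\Psi_y(x_1) - \Psi_y(x_2)\| = \|\phi(x_1) - \phi(x_2)\| \ (\text{the } y\text{'s cancel}) \le \tfrac{1}{2}\|x_1 - x_2\| \ (\text{by estimate (b) on } \phi). \]
Hence $\Psi_y : \overline{D}_r \to \overline{D}_r$ is a contraction map with contraction constant $k = \frac{1}{2}$.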
Apply Theorem 7.17.9 with $X = \overline{D}_r$, $\Lambda = \overline{D}_{r/2}$, $F(x,y) = \Psi_y(x)$ and $k = \frac{1}{2}$ to deduce that there is a continuous map $\bar{x} : \overline{D}_{r/2} \to \overline{D}_r$ such that $x = \bar{x}(y)$ is the unique solution in $\overline{D}_r$ to $f(x) = y$, $y \in \overline{D}_{r/2}$. Since $\bar{x}(0) = 0$ and $\bar{x}$ is continuous, we may choose $0 < s \le r/2$ such that $\bar{x}(D_s) \subset D_r$, where $D_s$ and $D_r$ denote the open disks of radius $s$ and $r$. Set $V = D_s$ and $U = f^{-1}(V) \cap D_r$. Since $f$ is continuous, $U$ is an open subset of $\mathbb{R}^m$. We claim that $\bar{x} : V \to U$ and is 1:1 onto. Since $\{\bar{x}(y)\} = f^{-1}(y) \cap D_r$, for all $y \in V$, we have $\bar{x}(V) = U$, proving that $\bar{x}$ maps $V$ onto $U$. Moreover, if $y, y' \in D_{r/2}$, then $\bar{x}(y) = \bar{x}(y') = x \in D_r$ iff $f(x) = y = y'$. Hence $\bar{x}$ is 1:1. It follows that $\bar{x} : V \to U$ is the inverse of $f : U \to V$. Set $f^{-1} = \bar{x} : V \to U$. Since $\bar{x}$ is continuous, it only remains to prove that $f^{-1}$ is $C^1$.
Let $y \in V$ and set $x = f^{-1}(y)$. We know that if $f^{-1}$ is differentiable at $y$, then $Df^{-1}_y = (Df_x)^{-1}$. If we define $s(k)$ by
\[ f^{-1}(y + k) = f^{-1}(y) + (Df_x)^{-1}(k) + s(k), \]
we have to show that $s(k) = o(k)$. Set $f^{-1}(y + k) = x + h$. We have
\[ f(x + h) = f(x) + Df_x(h) + r(h), \]
where $r(h) = o(h)$. Since $f(x + h) = f(x) + k$, it follows that
\[ k = Df_x(h) + r(h), \]
and so $(Df_x)^{-1}(k) = h + (Df_x)^{-1}(r(h))$. Since $r(h) = o(h)$ and $\|(Df_x)^{-1}\| > 0$, we can choose $\delta > 0$ such that
\[ \|k\| \ge \tfrac{1}{2}\|(Df_x)^{-1}\|^{-1}\|h\|, \quad \text{if } \|h\| \le \delta. \tag{9.9} \]
Estimating $\|s(k)\|$ we have
\[ \|s(k)\| = \|f^{-1}(y + k) - f^{-1}(y) - (Df_x)^{-1}(k)\| \]
\[ = \|(Df_x)^{-1}\big(Df_x(f^{-1}(y + k) - f^{-1}(y)) - k\big)\| \]
\[ \le \|(Df_x)^{-1}\|\,\|Df_x(f^{-1}(y + k) - f^{-1}(y)) - k\| \]
\[ = \|(Df_x)^{-1}\|\,\|r(h)\|. \]
It follows from (9.9) that if $\|h\| \le \delta$, then
\[ \frac{\|s(k)\|}{\|k\|} \le \frac{\|(Df_x)^{-1}\|\,\|r(h)\|}{\tfrac{1}{2}\|(Df_x)^{-1}\|^{-1}\|h\|} = 2\|(Df_x)^{-1}\|^2\, \frac{\|r(h)\|}{\|h\|}. \]
Since $f^{-1}$ is continuous, $h = f^{-1}(y + k) - f^{-1}(y) \to 0$ as $k \to 0$. Since also $r(h) = o(h)$, it follows that $s(k) = o(k)$, proving that $f^{-1}$ is differentiable at $y$ with derivative $(Df_x)^{-1}$.
The derivative map $Df^{-1} : V \to L(\mathbb{R}^m, \mathbb{R}^m)$ is the composite
\[ V \xrightarrow{\ f^{-1}\ } U \xrightarrow{\ Df\ } L(\mathbb{R}^m, \mathbb{R}^m) \xrightarrow{\ \beta\ } L(\mathbb{R}^m, \mathbb{R}^m), \]
where $\beta(A) = A^{-1}$. Since these maps are continuous, $Df^{-1}$ is continuous and so $f^{-1}$ is $C^1$.
Finally, we need to address the case of general maps $f$. Suppose $Df_{x_0}$ is a linear isomorphism. Define $\tilde{f}(x) = (Df_{x_0})^{-1}(f(x + x_0) - f(x_0))$. We have $\tilde{f}(0) = 0$ and $D\tilde{f}_0 = I$ and so the previous analysis applies to $\tilde{f}$ to give open neighbourhoods $\tilde{U}$, $\tilde{V}$ of the origin such that $\tilde{f} : \tilde{U} \to \tilde{V}$ is a $C^1$ diffeomorphism. If we set $U = \tilde{U} + x_0$, $V = Df_{x_0}(\tilde{V}) + f(x_0)$, then $f(U) = V$ and $f : U \to V$ is a $C^1$ diffeomorphism. $\square$
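The iterative scheme mentioned in the proof can be sketched in a few lines. The map $f(x_1, x_2) = (x_1 + x_2^2,\ x_2 + x_1^2)$ below is an illustrative assumption (it satisfies $f(0) = 0$, $Df_0 = I$), as are the function names and the target point $y$. Here $\phi(x) = x - f(x) = (-x_2^2, -x_1^2)$, so $\|D\phi_x\| \le 2\|x\| \le \frac{1}{2}$ on the disk of radius $r = \frac{1}{4}$, and the iteration $x \mapsto \Psi_y(x) = x - f(x) + y$ converges for $\|y\| \le \frac{1}{8}$.

```python
import math

def f(x):
    """Illustrative map R^2 -> R^2 with f(0) = 0 and Df_0 = I."""
    x1, x2 = x
    return (x1 + x2 ** 2, x2 + x1 ** 2)

def psi(x, y):
    """The map Psi_y(x) = x - f(x) + y from the proof."""
    fx = f(x)
    return (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])

def local_inverse(y, steps=60):
    """Approximate f^{-1}(y) by iterating Psi_y starting from the origin."""
    x = (0.0, 0.0)
    for _ in range(steps):
        x = psi(x, y)
    return x

y = (0.05, -0.03)                # ||y|| < 1/8, inside the region of convergence
x = local_inverse(y)
fx = f(x)
residual = math.hypot(fx[0] - y[0], fx[1] - y[1])
print(x, residual)
```

With contraction constant $\frac{1}{2}$, the error is at least halved at each step, so 60 steps are far more than needed at double precision.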
9.6.3 The Implicit Function Theorem
We prove next an extension of the inverse function theorem: the implicit function
theorem. The theorem gives sufficient conditions for the local solvability of an
equation. The result follows easily from the inverse function theorem. We give a
C1 version. The Cr version is an immediate consequence of the Cr version of the
inverse function theorem which we prove in the section on higher derivatives.
Theorem 9.6.6 (The Implicit Function Theorem) Let $f : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ be $C^1$. Suppose that
(1) $f(0,0) = 0$.
(2) $D_2 f_0 \in L(\mathbb{R}^n, \mathbb{R}^n)$ is a linear isomorphism.
Then there exist open neighbourhoods $U$ of $0 \in \mathbb{R}^m$, $V$ of $0 \in \mathbb{R}^n$, $W$ of $(0,0) \in \mathbb{R}^m \times \mathbb{R}^n$, and a $C^1$ diffeomorphism $H : U \times V \to W$ such that
(a) $H$ preserves the first coordinate: $H(x,y) = (x, h(x,y))$, all $(x,y) \in U \times V$.
(b) $f(x, h(x,y)) = y$ for all $(x,y) \in U \times V$. In particular, if $y = 0$ and we set $h(x, 0) = u(x)$ then
\[ f(x, u(x)) = 0, \quad \text{for all } x \in U. \]
The derivative of $u$ is given by
\[ Du_x = -\big(D_2 f_{(x, u(x))}\big)^{-1} D_1 f_{(x, u(x))}, \quad x \in U. \]
(c) If $(x, z) \in W$, $y \in V$, then $f(x, z) = y$ iff $z = h(x, y)$.
In Fig. 9.2, we indicate how the map $H^{-1} : W \to U \times V$ 'straightens out' each solution set $f^{-1}(z) \cap W$ onto the open subset $U \times \{z\}$ of $\mathbb{R}^m \times \{z\}$. Condition (a) of the theorem implies that $H$ shears $W$ parallel to the $\mathbb{R}^n$ direction.
Statement (b) of the theorem shows that we can solve $f(x, y) = 0$ near the origin obtaining $y$ as a $C^1$ function of $x$. That is, with $y = u(x)$, $f(x, u(x)) = 0$, $x \in U$. However, the theorem goes far beyond this. Statement (b) implies that for all $y \in V$, we can solve $f(x, z) = y$ for $z$ by the $C^1$ function $u_y : U \to \mathbb{R}^n$ defined by $u_y(x) = h(x, y)$. Moreover, by (c), we find all the solutions to $f(x, z) = y$ in $W$.
Note that the result is obvious if $f : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ is the projection $p$ on the second factor (that is, $f(x, y) = y$). Indeed, we may then take $h(x, y) = y$, $U = \mathbb{R}^m$, $V = \mathbb{R}^n$ and $W = \mathbb{R}^m \times \mathbb{R}^n$. Viewed in this way, statement (b) of the theorem says that we can make a local change of coordinates on $\mathbb{R}^m \times \mathbb{R}^n$ so that in the new coordinates, $f$ is (locally, near the origin) the projection on the second factor. That is, $f \circ H : U \times V \to V$ is the projection on $\mathbb{R}^n$: $f \circ H(x, y) = y$. As always with local results in the differential calculus, we show that subject to certain conditions, maps locally look linear. See Fig. 9.2.
Fig. 9.2 Implicit function theorem
Proof of Theorem 9.6.6 Define $G(x, y) = (x, f(x, y))$. Observe that $G(0, 0) = (0, 0)$ and
\[ DG_{(0,0)} = \begin{pmatrix} I_m & 0 \\ D_1 f_{(0,0)} & D_2 f_{(0,0)} \end{pmatrix} \in L(\mathbb{R}^m \times \mathbb{R}^n, \mathbb{R}^m \times \mathbb{R}^n), \]
where $I_m$ is the identity map of $\mathbb{R}^m$. Since $D_2 f_{(0,0)}$ is a linear isomorphism of $\mathbb{R}^n$, $DG_{(0,0)}$ is a linear isomorphism. Hence we may apply the inverse function theorem to find open neighbourhoods $U, V, W$ of the origins in $\mathbb{R}^m$, $\mathbb{R}^n$ and $\mathbb{R}^m \times \mathbb{R}^n$ respectively such that $G : W \to U \times V$ is a $C^1$ diffeomorphism. Set $H = G^{-1}$ and let $p : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ denote the projection on $\mathbb{R}^n$. Observe that $H$ preserves the first coordinate and so we may write $H(x, y) = (x, h(x, y))$ for some $C^1$ function $h : U \times V \to \mathbb{R}^n$ (write $H(x, y) = (\bar{x}, \bar{y})$ and apply $G$ to both sides). We have
\[ p \circ G = f, \]
as maps of $W$ into $\mathbb{R}^n$. Now compose on the right by $H = G^{-1}$ to obtain
\[ p = f \circ H. \]
Here $p = f \circ H$ maps $U \times V$ onto $V \subset \mathbb{R}^n$. The equation $p = f \circ H$ is equivalent to statement (b) of the theorem. Statement (c) follows because we know all the solutions to $p(x, y) = y$ on $U \times V$: $p(x, y) = y$ iff $f(x, h(x, y)) = y$. The formula for the derivative of $u$ follows by differentiating the identity $f(x, u(x)) = 0$, $x \in U$: by the chain rule, $D_1 f_{(x, u(x))} + D_2 f_{(x, u(x))} \circ Du_x = 0$. $\square$
Remark 9.6.7 Theorem 9.6.6 extends to maps with domain a proper open subset of $\mathbb{R}^m \times \mathbb{R}^n$; the proof is unchanged.
Corollary 9.6.8 Let $f : U \subset \mathbb{R}^p \to \mathbb{R}^q$ be $C^1$ and $q \le p$. Suppose that $x_0 \in U$ and
(1) $f(x_0) = 0$,
(2) $Df_{x_0} \in L(\mathbb{R}^p, \mathbb{R}^q)$ is onto ($Df_{x_0}$ has maximal rank equal to $q$).
Then there exist open neighbourhoods $U$ of $0 \in \mathbb{R}^{p-q}$, $W$ of $x_0 \in \mathbb{R}^p$, and a $C^1$ map $u : U \to W$ such that
(a) $u(0) = x_0$.
(b) $f(u(x)) = 0$, for all $x \in U$.
(c) The only solutions of the equation $f(x) = 0$ in $W$ are those given by (a,b).
Proof Replacing $f(x)$ by $f(x + x_0)$, there is no loss of generality in assuming $x_0 = 0$. If we set $K = \ker(Df_0)$, then $K \cong \mathbb{R}^{p-q}$. Let $F$ be a vector space complement to $K$ in $\mathbb{R}^p$. Then $F \cong \mathbb{R}^q$ and $Df_0 : F \to \mathbb{R}^q$ is an isomorphism. Choose a basis $\{e_i\}$ for $\mathbb{R}^p$ so that $\{e_1, \dots, e_{p-q}\}$ spans $K$ and $\{e_{p-q+1}, \dots, e_p\}$ spans $F$. With these conventions, $\mathbb{R}^p \cong \mathbb{R}^{p-q} \times \mathbb{R}^q$ and $f$ satisfies the hypotheses of Theorem 9.6.6. The result is now immediate from Theorem 9.6.6(b). $\square$
Examples 9.6.9
(1) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[ f(x, y) = x^2 \cos y + \sin y. \]
We have $f(0, 0) = 0$ and $\frac{\partial f}{\partial y}(0, 0) = 1$. Hence Theorem 9.6.6 applies and there exists an open interval $I$ containing $0 \in \mathbb{R}$, an open neighbourhood $W$ of $(0,0) \in \mathbb{R}^2$ and a $C^1$ map $u : I \to \mathbb{R}$ such that $u(0) = 0$ and
\[ f(x, u(x)) = x^2 \cos u(x) + \sin u(x) = 0, \quad x \in I. \]
These are the only solutions to $f(x, y) = 0$ in $W$. We have $u'(0) = 0$ (by the derivative formula in Theorem 9.6.6(b), since $\frac{\partial f}{\partial x}(0,0) = 0$).
(2) Consider the equation
\[ F(x, y, z) = z^3 + (x^4 + y^4) z + 1 = 0. \]
We claim we can find a unique solution $z = f(x, y)$ to this equation which is defined on all of $\mathbb{R}^2$ and such that $f$ is $C^1$. First note that for fixed $x, y$, the equation has at least one real root since the sign of $z^3 + (x^4 + y^4) z + 1$ is that of $z$ for $|z|$ sufficiently large. We have $\frac{\partial F}{\partial z}(x, y, z) = 3z^2 + (x^4 + y^4) = 0$ iff $x = y = z = 0$. Since $F(0, 0, 0) \ne 0$ and elsewhere $\frac{\partial F}{\partial z}(x, y, z) > 0$, it follows that there is exactly one real root of the equation $F(x, y, z) = 0$ for each $(x, y) \in \mathbb{R}^2$. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function giving this root. By Theorem 9.6.6, $f$ is $C^1$. Using the derivative formula of Theorem 9.6.6(b) to compute the partial derivatives of $f$ at $(x, y) \in \mathbb{R}^2$, we find, with $z = f(x, y)$,
\[ \frac{\partial f}{\partial x}(x, y) = -\frac{4x^3 z}{3z^2 + x^4 + y^4}, \qquad \frac{\partial f}{\partial y}(x, y) = -\frac{4y^3 z}{3z^2 + x^4 + y^4}. \]
In particular, $\frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = 0$.
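Both examples can be checked numerically. The sketch below is illustrative (the function names, the bisection solver and the test point are assumptions). For example (1) it uses the explicit solution $u(x) = -\arctan(x^2)$, which follows from $\tan u = -x^2$. For example (2) it finds the root $z = f(x,y)$ by bisection on $[-1, 0]$, where $F(x,y,-1) = -(x^4+y^4) \le 0 < 1 = F(x,y,0)$, and compares a central finite difference with the implicit differentiation formula $\partial f/\partial x = -F_x/F_z$.

```python
import math

def u(x):
    """Explicit solution of x^2 cos(u) + sin(u) = 0 near 0 (example 1)."""
    return -math.atan(x * x)

def F(x, y, z):
    """The cubic of example 2."""
    return z ** 3 + (x ** 4 + y ** 4) * z + 1.0

def root(x, y):
    """Unique real root z = f(x, y), by bisection on [-1, 0] (F increases in z)."""
    lo, hi = -1.0, 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(x, y, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def partial_x_formula(x, y):
    """Implicit differentiation: -F_x / F_z evaluated at z = f(x, y)."""
    z = root(x, y)
    return -4.0 * x ** 3 * z / (3.0 * z ** 2 + x ** 4 + y ** 4)

def partial_x_numeric(x, y, h=1e-5):
    """Central finite-difference approximation to the same partial derivative."""
    return (root(x + h, y) - root(x - h, y)) / (2.0 * h)

x0, y0 = 1.0, 0.5
print(partial_x_formula(x0, y0), partial_x_numeric(x0, y0))
print(0.3 ** 2 * math.cos(u(0.3)) + math.sin(u(0.3)))
```

Bisection is used only because monotonicity in $z$ makes it trivially reliable here; any root finder would serve.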
9.6.4 A Dual Version of the Implicit Function Theorem
The implicit function theorem gives conditions that allow us to show that under a
local change of coordinates a map is a projection and therefore locally onto. We now
look at the case of maps that are locally injective and give conditions that imply a
map is locally an inclusion (after a change of coordinates on the range).
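Theorem 9.6.10 Let $U \subset \mathbb{R}^m$ be an open neighbourhood of $0 \in \mathbb{R}^m$ and $f : U \subset \mathbb{R}^m \to \mathbb{R}^m \times \mathbb{R}^n$ be a $C^1$ map satisfying
(1) $f(0) = (0, 0)$.
(2) $Df_0$ is a linear isomorphism of $\mathbb{R}^m$ onto $\mathbb{R}^m \times \{0\} \subset \mathbb{R}^m \times \mathbb{R}^n$.
There exist open neighbourhoods $V_1 \subset U$ of $0 \in \mathbb{R}^m$, $V_2$ of $0 \in \mathbb{R}^n$, $W$ of $(0, 0) \in \mathbb{R}^m \times \mathbb{R}^n$ and a $C^1$ diffeomorphism $g : W \to V_1 \times V_2$ such that the composite $gf : V_1 \to \mathbb{R}^m \times \mathbb{R}^n$ is the restriction to $V_1$ of the inclusion $i_1$ of $\mathbb{R}^m$ onto the subspace $\mathbb{R}^m \times \{0\}$ of $\mathbb{R}^m \times \mathbb{R}^n$. That is,
\[ gf(x) = i_1(x) = (x, 0), \quad x \in V_1. \]
Remark 9.6.11 The final statement of Theorem 9.6.10 implies that $f : V_1 \to \mathbb{R}^m \times \mathbb{R}^n$ is injective.
We illustrate Theorem 9.6.10 in Fig. 9.3. The map $g$ 'straightens out' the image $f(V_1)$ of $f : V_1 \to \mathbb{R}^m \times \mathbb{R}^n$.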
Proof of Theorem 9.6.10 Define $\Phi : U \times \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n$ by
\[ \Phi(x, y) = f(x) + (0, y), \quad x \in U, \ y \in \mathbb{R}^n. \]
Fig. 9.3 Dual implicit function theorem
Observe that $\Phi = f p_1 + p_2$, where $p_1 : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^m$ is the projection on $\mathbb{R}^m$ and $p_2 : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ is the projection on $\mathbb{R}^n$. It follows from the chain rule that $\Phi$ is $C^1$ and
\[ D\Phi_0 = Df_0\, p_1 + p_2 = \begin{pmatrix} Df_0 & 0 \\ 0 & I_n \end{pmatrix} \in L(\mathbb{R}^m \times \mathbb{R}^n, \mathbb{R}^m \times \mathbb{R}^n). \]
Since $Df_0$ is a linear isomorphism of $\mathbb{R}^m$ onto $\mathbb{R}^m \times \{0\}$, $D\Phi_0$ is a linear isomorphism of $\mathbb{R}^m \times \mathbb{R}^n$. Hence, by the inverse function theorem, we may find an open neighbourhood $V_1 \times V_2$ of $(0, 0) \in \mathbb{R}^m \times \mathbb{R}^n$ such that $\Phi(V_1 \times V_2) = W$ is an open neighbourhood of $(0, 0) \in \mathbb{R}^m \times \mathbb{R}^n$ and $\Phi : V_1 \times V_2 \to \Phi(V_1 \times V_2)$ is a $C^1$ diffeomorphism. Set $\Phi^{-1} = g : W \to V_1 \times V_2$. We claim that $g$ satisfies the conditions of the theorem. Indeed, on $V_1$, we have $\Phi i_1 = f$. Composing on the left by $\Phi^{-1} = g$, we obtain $gf = i_1 | V_1$. $\square$
Remark 9.6.12 If $f$ preserves the first coordinate (that is, $f(x) = (x, f_2(x))$), then so does $\Phi$ and therefore $g$.
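The construction in the proof can be carried out by hand in the simplest case $m = n = 1$. The sketch below assumes the illustrative map $f(x) = (x, x^2)$, a parabola in $\mathbb{R} \times \mathbb{R}$ with $Df_0(h) = (h, 0)$; the names `Phi` and `g` follow the roles of the map defined in the proof and its inverse. Here the map $(x, y) \mapsto f(x) + (0, y) = (x, x^2 + y)$ has the explicit inverse $g(a, b) = (a, b - a^2)$, and $g$ straightens the parabola onto the first axis.

```python
def f(x):
    """Illustrative map R -> R x R; its derivative at 0 is h -> (h, 0)."""
    return (x, x * x)

def Phi(x, y):
    """Phi(x, y) = f(x) + (0, y), as in the proof of the theorem."""
    fx = f(x)
    return (fx[0], fx[1] + y)

def g(a, b):
    """Explicit inverse of Phi: solve (a, b) = (x, x*x + y) for (x, y)."""
    return (a, b - a * a)

points = [-0.5, -0.1, 0.0, 0.2, 0.7]

# g composed with f should be the inclusion x -> (x, 0).
straightened = [g(*f(x)) for x in points]
print(straightened)
```

As Remark 9.6.12 predicts, `f` preserves the first coordinate and so do `Phi` and `g`.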
Example 9.6.13 Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by
\[ f(x, y) = (y e^x \cos y, \ x + x y^2, \ x \sin(x y^2)). \]
We claim we can choose a neighbourhood $V$ of $(0, 0) \in \mathbb{R}^2$ such that $f : V \to \mathbb{R}^3$ is injective and the only solution to $f(x, y) = (0, 0, 0)$ in $V$ is $(x, y) = (0, 0)$. Clearly $(0, 0)$ is a solution of $f(x, y) = (0, 0, 0)$. Computing we find that
\[ Df_{(0,0)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \end{pmatrix}. \]
Hence $Df_{(0,0)}$ defines a linear isomorphism of $\mathbb{R}^2$ onto $\mathbb{R}^2 \times \{0\} \subset \mathbb{R}^2 \times \mathbb{R}$. The claim now follows from Theorem 9.6.10.
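The Jacobian at the origin can be confirmed with central finite differences. This is a minimal sketch (the helper name `jacobian` and the step size are assumptions); it approximates the $3 \times 2$ matrix of partial derivatives and checks that the top $2 \times 2$ block is invertible while the third row vanishes, which is the hypothesis of Theorem 9.6.10.

```python
import math

def f(x, y):
    """The map of Example 9.6.13."""
    return (y * math.exp(x) * math.cos(y),
            x + x * y ** 2,
            x * math.sin(x * y ** 2))

def jacobian(x, y, h=1e-6):
    """Central-difference approximation to the 3 x 2 Jacobian of f at (x, y)."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[i] - fxm[i]) / (2.0 * h), (fyp[i] - fym[i]) / (2.0 * h)]
            for i in range(3)]

J = jacobian(0.0, 0.0)

# Determinant of the top 2 x 2 block; nonzero means Df maps R^2 onto R^2 x {0}.
top_det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J, top_det)
```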
9.6.5 The Rank Theorem
The final result of the section includes the inverse and both implicit function
theorems as special cases.
Throughout we will be especially careful with notation so as to improve the clarity of the exposition. In particular, we often use $0_p$ (rather than $0$) to denote the origin of $\mathbb{R}^p$ and $0_{p,q}$ (rather than $0$) to denote the zero map from $\mathbb{R}^p$ to $\mathbb{R}^q$.
Before we state the rank theorem, we give an outline of the general idea. Suppose then that $f : \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$ and $f(0) = 0$. Let $W$ be an open neighbourhood of $0 \in \mathbb{R}^m$ and suppose that for all $X \in W$, $Df_X$ has constant rank $q \ge 1$. That is, $\dim(Df_X(\mathbb{R}^m)) = q$ for all $X \in W$. The inverse function theorem applies if $m = n = q$, the implicit function theorem if $q = n < m$, and the dual implicit function theorem if $q = m < n$. In these cases, $q \in \{m, n\}$ and we need only assume $\operatorname{rank}(Df_0) = q$ since it can be shown that $\operatorname{rank}(Df_X) = q$ for all $X$ in a neighbourhood of $0_m$. If $q \notin \{m, n\}$, then $\operatorname{rank}(Df_X) \ge q$ on a neighbourhood of $0$ but equality does not usually hold.
If $Df_X$ has constant rank $q$, $X \in W$, then the rank theorem gives the local structure of both image and level sets. Referring to Fig. 9.4, we can find open neighbourhoods $W_1 \subset W$ of $0_m$, $W_2$ of $0_n$, such that $f(W_1)$ is the diffeomorphic image of an open subset of $\mathbb{R}^q$ and the level sets $\{ (f|W_1)^{-1}(z) \mid z \in f(W_1) \}$ form a $C^1$ family of sets which is the diffeomorphic image of a trivial $q$-dimensional family $\{ \{x\} \times V_2 \mid x \in V_1 \}$, where $V_1$ is an open neighbourhood of $0_q$, $V_2$ of $0_{m-q}$. In particular, each level set is $(m - q)$-dimensional.
Fig. 9.4 Geometry of the rank theorem
Now for the formal statement of the rank theorem.
Theorem 9.6.14 (The Rank Theorem) Let $W$ be an open neighbourhood of $0_m \in \mathbb{R}^m$ and $f : W \to \mathbb{R}^n$ be $C^1$ with $f(0_m) = 0_n$. Suppose that $Df_X$ is of constant rank $q$ on $W$. Then there exist open neighbourhoods $W_1 \subset W$ of $0_m$, $W_2$ of $0_n$, open neighbourhoods $V_1 \subset \mathbb{R}^q$ of $0_q$, $V_2 \subset \mathbb{R}^{m-q}$ of $0_{m-q}$, $U_2 \subset \mathbb{R}^{n-q}$ of $0_{n-q}$, and $C^1$ diffeomorphisms $h : V_1 \times V_2 \to W_1$, $g : W_2 \to V_1 \times U_2$ such that
\[ gfh(x, y) = (x, 0_{n-q}), \quad x \in V_1, \ y \in V_2. \]
In particular, $f^{-1}(0) \cap W_1$ is mapped by $h^{-1}$ homeomorphically onto the open set $\{0_q\} \times V_2 \subset \{0_q\} \times \mathbb{R}^{m-q}$.
Although the proof of the rank theorem is not that difficult, it does require careful
preparation.
Notational Conventions and Assumptions Fix bases of $\mathbb{R}^m$ and $\mathbb{R}^n$ so that $\ker(Df_0) = \{0_q\} \times \mathbb{R}^{m-q}$ and $Df_0(\mathbb{R}^q \times \{0_{m-q}\}) = \mathbb{R}^q \times \{0_{n-q}\}$. Denote the subspace $\mathbb{R}^q \times \{0_{m-q}\}$ of $\mathbb{R}^q \times \mathbb{R}^{m-q}$ by $\mathbb{R}^q$ (so that $Df_0(\mathbb{R}^q) = \mathbb{R}^q \times \{0_{n-q}\} \subset \mathbb{R}^n$). Let $p_1 : \mathbb{R}^q \times \mathbb{R}^{m-q} \to \mathbb{R}^q$ and $p_1' : \mathbb{R}^q \times \mathbb{R}^{n-q} \to \mathbb{R}^q$ denote the projections on the first factors and $p_2, p_2'$ denote the corresponding projections on the second factors. Let $i_1 : \mathbb{R}^q \to \mathbb{R}^q \times \mathbb{R}^{m-q}$ denote the inclusion on the first factor and similarly define $i_1'$, $i_2$ and $i_2'$. For reasons of clarity, we adopt for the remainder of the section the convention that composition of linear maps is indicated by $\circ$. Thus the expression $(Df)_g \circ Dg$ arising from the chain rule will be a linear-map-valued function whose value at the point $x$ in the domain of $g$ is the linear map $Df_{g(x)} \circ Dg_x$.
We illustrate the geometric content of the rank theorem in Fig. 9.5. The map $g$ straightens out $f(W_1)$ onto the open subset $V_1 \times \{0_{n-q}\}$ of $\mathbb{R}^q \times \{0_{n-q}\}$ and $h^{-1}$ straightens out the family of inverse images $\{ f^{-1}(z) \cap W_1 \mid z \in f(W_1) \}$ onto the family $\{ \{p_1' g(z)\} \times V_2 \mid z \in f(W_1) \}$. The map $u : V_1 \to W_2$ given by $u(x) = fh(x, 0_{m-q})$ is injective and maps onto $f(W_1)$. In essence, the rank theorem asserts that we can make local differentiable (non-linear) changes of coordinate at $0_m$ and $0_n$ so that in the new coordinates $f = i_1' \circ p_1$.
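Our first result provides a key step in the proof of the rank theorem and shows that, under the conditions of the rank theorem, we can represent $Df_X(\mathbb{R}^m)$ as the graph of a linear map $\lambda_X \in L(\mathbb{R}^q, \mathbb{R}^{n-q})$, for $X$ in some neighbourhood $\widetilde{W}$ of $0$. That is, $Df_X(\mathbb{R}^m) = \operatorname{graph}(\lambda_X)(\mathbb{R}^q)$, where $\operatorname{graph}(\lambda_X)(u) = (u, \lambda_X(u))$, $u \in \mathbb{R}^q$. Indeed, more is shown: if $f = (f_1, f_2)$, then $Df_{2,X} = \lambda_X \circ Df_{1,X}$, $X \in \widetilde{W}$.
Lemma 9.6.15 (Notation and Assumptions as Above) We may choose an open neighbourhood $\widetilde{W} \subset W$ of $0 \in \mathbb{R}^m$ and a continuous map $\Lambda : \widetilde{W} \to L(\mathbb{R}^q, \mathbb{R}^q \times \mathbb{R}^{n-q})$, $X \mapsto \Lambda_X$, such that for all $X \in \widetilde{W}$
(a) $\Lambda_X = (I_q, \lambda_X)$, where $\lambda : \widetilde{W} \to L(\mathbb{R}^q, \mathbb{R}^{n-q})$ and $I_q$ is the identity map of $\mathbb{R}^q$.
(b) $\Lambda_X \circ p_1' \circ Df_X = Df_X$.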
Fig. 9.5 The rank theorem: mappings and geometry
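Proof Define $\alpha : W \to L(\mathbb{R}^q, \mathbb{R}^q)$ by $\alpha_X = p_1' \circ Df_X \circ i_1$. Since $Df_0(\mathbb{R}^q) = \mathbb{R}^q \times \{0\}$, $\alpha_0 \in GL(\mathbb{R}, q)$. Now $GL(\mathbb{R}, q)$ is an open subset of $L(\mathbb{R}^q, \mathbb{R}^q)$ and $\alpha$ is continuous, so we can choose an open neighbourhood $\widetilde{W} \subset W$ of $0_m$ such that $\alpha_X \in GL(\mathbb{R}, q)$ for all $X \in \widetilde{W}$. It follows that $Df_X(\mathbb{R}^q)$ is a $q$-dimensional subspace of $\mathbb{R}^q \times \mathbb{R}^{n-q}$ for all $X \in \widetilde{W}$. We define $\lambda_X$ by requiring that
\[ Df_X \circ i_1 = (I_q, \lambda_X) \circ p_1' \circ Df_X \circ i_1 = (I_q, \lambda_X) \circ \alpha_X, \quad X \in \widetilde{W}. \]
Since $p_1' \circ Df_X \circ i_1 = \alpha_X$, this condition is satisfied iff $\lambda_X = p_2' \circ Df_X \circ i_1 \circ \alpha_X^{-1}$. Now $f$ is $C^1$ and inversion is continuous on $GL(\mathbb{R}, q)$ (Lemma 9.6.2), and so $\lambda : \widetilde{W} \to L(\mathbb{R}^q, \mathbb{R}^{n-q})$ is continuous, proving (a). For (b), observe that since $Df_X$ has rank $q$ for all $X \in \widetilde{W}$ and $Df_X \circ i_1(\mathbb{R}^q)$ is $q$-dimensional, we have $Df_X(\mathbb{R}^m) = Df_X \circ i_1(\mathbb{R}^q)$ for all $X \in \widetilde{W}$. Given $X \in \widetilde{W}$, $e \in \mathbb{R}^m$, we may write $e$ uniquely as $e_q + k$, where $e_q \in \mathbb{R}^q$ and $k \in \ker(Df_X)$. Since $\Lambda_X \circ p_1' \circ Df_X \circ i_1 = Df_X \circ i_1$, it follows that $\Lambda_X \circ p_1' \circ Df_X = Df_X$. $\square$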
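Proof of Theorem 9.6.14 The proof of the rank theorem is a combination of the proofs of the implicit and dual implicit function theorems. Define the $C^1$ map $\Phi : W \to \mathbb{R}^q \times \mathbb{R}^{m-q}$ by
\[ \Phi(X) = p_1' f(X) + p_2(X), \quad X \in W. \]
As we shall see, for a small enough neighbourhood $W' \subset \widetilde{W}$ of $0_m$, $\Phi$ will map each level set $L_z = f^{-1}(z) \cap W'$ of $f|W'$ diffeomorphically onto an open neighbourhood of the origin in $\{x\} \times \mathbb{R}^{m-q}$, where $x = x(z) \in \mathbb{R}^q$ is the unique intersection of $L_z$ with $\mathbb{R}^q \times \{0_{m-q}\}$. That is, $\Phi$ locally straightens out the level sets of $f$. Note this is obvious if $f(x, y) = (x, 0) \in \mathbb{R}^q \times \mathbb{R}^{n-q}$, $(x, y) \in \mathbb{R}^q \times \mathbb{R}^{m-q}$, as then $\Phi = I_m$.
Computing $D\Phi_0$ we find
\[ D\Phi_0 = \begin{pmatrix} p_1' \circ Df_0 \circ i_1 & p_1' \circ Df_0 \circ i_2 \\ 0_{q,m-q} & I_{m-q} \end{pmatrix} \in L(\mathbb{R}^q \times \mathbb{R}^{m-q}, \mathbb{R}^q \times \mathbb{R}^{m-q}). \]
Hence $D\Phi_0 \in GL(\mathbb{R}, m)$, since $p_1' \circ Df_0 \circ i_1 \in GL(\mathbb{R}, q)$. Applying the inverse function theorem, there exist connected open neighbourhoods $\widetilde{V}_1 \times V_2$ of $(0_q, 0_{m-q})$ and $W' \subset \widetilde{W}$ of $0_m$ such that $\Phi : W' \to \widetilde{V}_1 \times V_2$ is a $C^1$ diffeomorphism. Set $h = \Phi^{-1}$ and define $F = fh : \widetilde{V}_1 \times V_2 \subset \mathbb{R}^q \times \mathbb{R}^{m-q} \to \mathbb{R}^n$. We claim that $F$ is independent of the $V_2$ variable. We show this by proving $D_2F = 0$ on $\widetilde{V}_1 \times V_2$. We have $f = F\Phi$. Differentiating this identity we obtain
\[ Df = (DF) \circ D\Phi = (D_1F) \circ p_1' \circ Df + (D_2F) \circ p_2, \]
since $D\Phi = p_1' \circ Df + p_2$. Hence
\[ (D_2F) \circ p_2 = Df - (D_1F) \circ p_1' \circ Df = (\chi - (D_1F)) \circ p_1' \circ Df, \]
where the last line follows by Lemma 9.6.15(b). It follows that
\[ \bigl((\chi_X - D_1F_{\Phi(X)}) \circ p_1' \circ Df_X\bigr)(e) = 0, \quad e \in \mathbb{R}^q, \; X \in W'. \]
But $p_1' \circ Df_X(\mathbb{R}^m) = \mathbb{R}^q$ and so
\[ \chi - (D_1F) \equiv 0. \]
Hence $(D_2F) \circ p_2 = 0$ and $D_2F = 0$ on $\Phi(W') = \widetilde{V}_1 \times V_2$.
Define the $C^1$ map $u = (u_1, u_2) : \widetilde{V}_1 \to \mathbb{R}^q \times \mathbb{R}^{n-q}$ by
\[ F(x, y) = (u_1(x), u_2(x)), \quad (x, y) \in \widetilde{V}_1 \times V_2. \]
We have $u(\widetilde{V}_1) = f(W')$.
Now $Du_{1,0} \in GL(\mathbb{R}, q)$, since $F = fh = u_1$ on $\widetilde{V}_1 \times \{0\}$, and $Du_{2,0} = 0_{q,n-q}$. Hence we may apply the dual implicit function theorem (Theorem 9.6.10) to $u : \widetilde{V}_1 \to \mathbb{R}^q \times \mathbb{R}^{n-q}$ to obtain open neighbourhoods $V_1 \subset \widetilde{V}_1$ of $0_q$, $W_2 \subset \mathbb{R}^n$ of $0_n$, $U_2 \subset \mathbb{R}^{n-q}$ of $0_{n-q}$, and a $C^1$ diffeomorphism $g : W_2 \to V_1 \times U_2$ such that
\[ gu(x) = (x, 0), \quad x \in V_1. \]
Setting $V_2 = \widetilde{V}_2$, $W_1 = h(V_1 \times V_2)$, completes the proof of the rank theorem. □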
EXERCISES 9.6.16
(1) Show that a map $f$ which is $C^1$ and a homeomorphism ($f$ is bijective and $f^{-1}$ is continuous) need not have a differentiable inverse.
(2) Find an example of a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$ such that $Df_x \in GL(\mathbb{R}, 2)$ for all $x \in \mathbb{R}^2$ and $f$ is not 1:1.
(3) Let $f : \mathbb{R}^{n+1} \to \mathbb{R}$ be $C^1$ and suppose that at the point $a = (a_1, \dots, a_n, b) \in \mathbb{R}^{n+1}$, $\frac{\partial f}{\partial x_{n+1}}(a) \ne 0$. Show that there exist an open neighbourhood $V$ of $(a_1, \dots, a_n)$, an open neighbourhood $W$ of $a$ and a $C^1$ map $u : V \to \mathbb{R}$ such that
\[ f(x_1, \dots, x_n, u(x_1, \dots, x_n)) = 0, \quad \text{for all } (x_1, \dots, x_n) \in V, \]
and these are the only solutions to $f(x_1, \dots, x_n, y) = 0$ in $W$. Show that the partial derivatives of $u$ are given by
\[ \frac{\partial u}{\partial x_i}(x) = -\frac{\partial f}{\partial x_i}(x, u(x)) \Big/ \frac{\partial f}{\partial x_{n+1}}(x, u(x)), \quad x \in V, \; 1 \le i \le n. \]
(4) Consider the simultaneous equations in the unknown functions $f$ and $g$
\[ f(x, y)^3 + x\,g(x, y)^2 + y = 0, \]
\[ g(x, y)^3 + y\,g(x, y) + f(x, y)^2 - x = 0. \]
Show that the solution set of these equations is given by the zero set of the function $F : \mathbb{R}^4 \to \mathbb{R}^2$ defined by
\[ F(x, y, u, v) = (u^3 + xv^2 + y, \; v^3 + yv + u^2 - x). \]
Verify that there is an open neighbourhood $V$ of $(1, -1) \in \mathbb{R}^2$ and $C^1$ functions $f, g : V \to \mathbb{R}$ such that $f(1, -1) = 1$, $g(1, -1) = 0$ and
\[ F(x, y, f(x, y), g(x, y)) = 0, \quad (x, y) \in V, \]
and that these are the only solutions to $F = 0$ on some neighbourhood of $(1, -1, 1, 0)$.
(5) Show that there exist $C^1$ functions $f$ and $g$ defined on some neighbourhood of $(0, 0) \in \mathbb{R}^2$ that satisfy the equations
\[ (8 + x^2) f(x, y) - (y + 1) g(x, y)^3 + y^2 = 0, \]
\[ g(x, y)^2 - (y + 1) f(x, y) g(x, y) - 2 = 0, \]
subject to the condition that $f(0, 0) = 1$ and $g(0, 0) = 2$.
(6) Let $n \le m$. Show that the subset of $L(\mathbb{R}^m, \mathbb{R}^n)$ consisting of surjective linear maps is an open dense subset of $L(\mathbb{R}^m, \mathbb{R}^n)$. State and prove an analogous result in case $n \ge m$.
(7) Suppose that $\gamma : \mathbb{R} \to \mathbb{R}^n$ ($n > 1$) is $C^1$ and suppose that (a) $\gamma'(t) \ne 0$ for all $t \in \mathbb{R}$, and (b) $\gamma$ is 1:1. Show, by means of examples, that $\gamma$ need not map $\mathbb{R}$ homeomorphically onto $\gamma(\mathbb{R})$ (induced topology on $\gamma(\mathbb{R})$). (Hint: find examples satisfying (a,b) such that $\gamma(\mathbb{R})$ (1) is, and (2) is not, a closed subset of $\mathbb{R}^n$.)
(8) A map $f : \mathbb{R}^m \to \mathbb{R}^n$ is proper if $f^{-1}(K)$ is compact whenever $K$ is a compact subset of $\mathbb{R}^n$.
(a) If $f$ is proper, show that $f(\mathbb{R}^m)$ must be an unbounded subset of $\mathbb{R}^n$.
(b) If $f$ is proper and continuous, show that $f$ is a closed map: $f$ maps closed sets to closed sets. (Hint: sequential arguments make this easy.)
(c) Suppose that the map $\gamma$ of the previous question satisfies (a,b) and is also proper. Verify that $\gamma$ maps $\mathbb{R}$ homeomorphically onto $\gamma(\mathbb{R})$ (induced topology).
(d) Generalize (c) to maps $f : \mathbb{R}^m \to \mathbb{R}^n$.
(9) Let $n > m$. Suppose that $f : \mathbb{R}^m \to \mathbb{R}^n$ is (a) $C^1$, (b) 1:1, (c) $Df_x : \mathbb{R}^m \to \mathbb{R}^n$ is of rank $m$ (injective) for all $x \in \mathbb{R}^m$, and (d) $f$ is proper. Show that given any $y \in f(\mathbb{R}^m) \subset \mathbb{R}^n$, there exists an open neighbourhood $U$ of $y \in \mathbb{R}^n$ and a $C^1$ diffeomorphism $\psi$ of $U$ onto a product open neighbourhood $V \times W$ of $0 \in \mathbb{R}^m \times \mathbb{R}^{n-m}$ such that $\psi(f(\mathbb{R}^m) \cap U) = V \times \{0\}$. (Hint: dual implicit function theorem and take care with ‘multiple’ intersections of $f(\mathbb{R}^m)$ with $U$.) Show by means of an example that the result may fail if $f$ is not proper. (If conditions (a–d) hold, we say that $f$ is a $C^1$ embedding and $f(\mathbb{R}^m)$ has the structure of a ($C^1$) submanifold of $\mathbb{R}^n$.)
(10) Show, by means of examples, that the constant rank condition in the statement of the rank theorem cannot be weakened.
(11) Show that a $C^1$ map $f : \mathbb{R}^m \to \mathbb{R}^n$ can only be injective if $m \le n$. (Hint: find a non-empty open subset of $\mathbb{R}^m$ on which the rank of $Df_x$ is maximal and use the rank theorem.)
(12) Let $U$ be an open subset of $\mathbb{R}^p$ and suppose $f : U \to L(\mathbb{R}^m, \mathbb{R}^n)$ is $C^1$ and that $f(x)$ is of rank $q$ for all $x \in U$. Given $x_0 \in U$, show that we can choose an open neighbourhood $U_0$ of $x_0$ and $C^1$ maps $\alpha : U_0 \to GL(\mathbb{R}, m)$, $\beta : U_0 \to GL(\mathbb{R}, n)$ such that
\[ \beta \circ f \circ \alpha : U_0 \to L(\mathbb{R}^m, \mathbb{R}^n) \]
is the constant map $\pi_q$ defined by $\pi_q(x_1, \dots, x_m) = (x_1, \dots, x_q, 0, \dots, 0)$. (Hints: Using the rank theorem, study the map
\[ F : U \times (L(\mathbb{R}^m, \mathbb{R}^m) \times L(\mathbb{R}^n, \mathbb{R}^n)) \to U \times L(\mathbb{R}^m, \mathbb{R}^n) \]
defined by $F(x, A, B) = (x, B \circ f(x) \circ A)$. Alternatively, consider the map $\tilde{f} : U \times \mathbb{R}^m \to U \times \mathbb{R}^n$ defined by $\tilde{f}(x, e) = (x, f(x)(e))$.)
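The hypothesis of the implicit function theorem in Exercise (4) can be sanity-checked by direct computation. The following is a minimal sketch, not a solution: it evaluates $F$ at the point $(1, -1, 1, 0)$ and computes the determinant of the partial derivative of $F$ with respect to $(u, v)$ there. The helper names `F`, `jacobian_uv` and `det2` are our own, and the partial derivatives are entered by hand.

```python
# Exercise (4): F(x, y, u, v) = (u^3 + x v^2 + y, v^3 + y v + u^2 - x).

def F(x, y, u, v):
    """The map F : R^4 -> R^2 of Exercise (4)."""
    return (u**3 + x * v**2 + y, v**3 + y * v + u**2 - x)

def jacobian_uv(x, y, u, v):
    """Matrix of the derivative of F in the (u, v) variables (computed by hand)."""
    return ((3 * u**2, 2 * x * v),
            (2 * u, 3 * v**2 + y))

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

point = (1, -1, 1, 0)            # (x, y, u, v) with u = f(1, -1), v = g(1, -1)
value = F(*point)                # should be the zero of R^2
det = det2(jacobian_uv(*point))  # non-zero means the (u, v)-derivative is invertible

print(value, det)
```

A non-zero determinant is exactly the invertibility condition needed to solve for $(u, v)$ as $C^1$ functions of $(x, y)$ near $(1, -1)$.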
9.7 Local Existence and Uniqueness Theorem for Ordinary Differential Equations
Let $f : \mathbb{R}^m \to \mathbb{R}^m$ be a $C^1$ vector field on $\mathbb{R}^m$ (we could just as well assume $f$ is defined on an open subset of $\mathbb{R}^m$ but to keep notation simple, we assume the domain is all of $\mathbb{R}^m$). We consider the ordinary differential equation (or ‘ODE’)
\[ x' = f(x). \tag{9.10} \]
In coordinates, (9.10) corresponds to the system
\[ x_i' = f_i(x_1, \dots, x_m), \quad 1 \le i \le m, \]
of ODEs. We recall that a solution of (9.10) with initial condition $x_0$ consists of a $C^1$ map $\varphi : I \to \mathbb{R}^m$, where $I$ is an open interval in $\mathbb{R}$ containing the origin, such that
\[ \varphi'(t) = f(\varphi(t)), \; t \in I, \quad \text{and} \quad \varphi(0) = x_0. \]
Remarks 9.7.1
(1) Since $\varphi' = f \circ \varphi$ and both $f$ and $\varphi$ are $C^1$, $\varphi'$ must be $C^1$ by the chain rule and so $\varphi$ is actually $C^2$. This is characteristic: solutions of ODEs are one order of differentiability more regular than the vector field defining the ODE.
(2) We assume the equation is autonomous: $f$ does not depend on $t$. However, if $f$ does depend on $t$, $x' = f(x, t)$, we can make the equation autonomous by introducing a new variable $\tau$ and considering the system $x' = f(x, \tau)$, $\tau' = 1$. Alternatively, the proof of the existence and uniqueness theorem continues to work under the assumption that $f$ depends on $t$; the changes required are minimal. For future reference, we give an existence result for a class of linear non-autonomous ODEs at the end of the section (the vector field $f$ will be independent of $x$).
We are going to prove a theorem that gives the existence and uniqueness of local solutions to (9.10) and the continuous dependence of solutions on the initial conditions. Later, in Sect. 9.15, we strengthen this result and show that solutions are $C^1$ in time and space.
Before stating the main result, it is helpful to introduce some new terminology. Suppose that there exists an open neighbourhood $U$ of $x_0 \in \mathbb{R}^m$, an open interval $I = (-\delta, \delta) \subset \mathbb{R}$ and a $C^0$ map
\[ \Phi : U \times I \to \mathbb{R}^m \]
such that if $x \in U$ and we set $\varphi_x(t) = \Phi(x, t)$, $t \in I$, then $\varphi_x$ is a solution to $x' = f(x)$ with initial condition $x$. The requirement that $\varphi_x$ equals $x$ at $t = 0$ implies that $\Phi(x, 0) = x$ for all $x \in U$. If we can find $\Phi : U \times I \to \mathbb{R}^m$ satisfying these conditions we say that $\Phi$ is a $C^0$ local flow for $x' = f(x)$ (on a neighbourhood of $x_0$).
Theorem 9.7.2 (Existence of Local Flows) Suppose that $f$ is a $C^1$ vector field on $\mathbb{R}^m$. Then $x' = f(x)$ has a $C^0$ local flow on a neighbourhood of every point in $\mathbb{R}^m$. Moreover, the solutions to $x' = f(x)$ are unique. That is, if $\varphi_x : I \to \mathbb{R}^m$ is the solution with initial condition $x$ given by the local flow and $\psi : J \to \mathbb{R}^m$ is any other solution with initial condition $x$ (so $0 \in J$), then $\varphi_x = \psi$ on $I \cap J$.
Proof We prove the existence of a local flow on a neighbourhood of the origin of $\mathbb{R}^m$. For $s > 0$, let $D_s$ denote the closed $s$-disk with centre $0$ in $\mathbb{R}^m$. Given $r, a > 0$, let $C^0([-a, a], D_{2r})$ denote the set of all continuous maps $f : [-a, a] \to D_{2r} \subset \mathbb{R}^m$. Recall that if we define the uniform metric $d$ on $C^0([-a, a], \mathbb{R}^m)$ by $d(f, g) = \sup_{t \in [-a, a]} \|f(t) - g(t)\|$, then $(C^0([-a, a], \mathbb{R}^m), d)$ is a complete metric space (Theorem 7.15.9) and $C^0([-a, a], D_{2r})$ is a complete subspace of $(C^0([-a, a], \mathbb{R}^m), d)$ (Exercises 7.15.23(5)).
For $x \in D_r$, $\varphi \in C^0([-a, a], D_{2r})$, define $T_x(\varphi) \in C^0([-a, a], \mathbb{R}^m)$ by
\[ T_x(\varphi)(t) = x + \int_0^t f(\varphi(s))\,ds, \quad t \in [-a, a]. \]
Observe that $T_x(\varphi)(0) = x$ and $T_x(\varphi)$ is differentiable with
\[ T_x(\varphi)'(t) = f(\varphi(t)), \quad t \in [-a, a]. \]
In particular, if $T_x(\varphi) = \varphi$, then $\varphi$ will be a solution of $x' = f(x)$ with initial condition $x$.
Let $M_1 = \sup_{x \in D_{2r}} \|f(x)\| < \infty$ (since $f$ is continuous and $D_{2r}$ is compact). We claim that if $a \le r/M_1$, then $T_x(\varphi) \in C^0([-a, a], D_{2r})$. This follows since if $x \in D_r$, $t \in [-a, a]$, we have
\[ \begin{aligned} \|T_x(\varphi)(t)\| &= \Bigl\| x + \int_0^t f(\varphi(s))\,ds \Bigr\| \\ &\le \|x\| + \Bigl\| \int_0^t f(\varphi(s))\,ds \Bigr\| \\ &\le \|x\| + \Bigl| \int_0^t \|f(\varphi(s))\|\,ds \Bigr| \\ &\le \|x\| + aM_1 \le 2r. \end{aligned} \]
Let $M_2 = \sup_{x \in D_{2r}} \|Df_x\| < \infty$ (since $f$ is $C^1$ and $D_{2r}$ is compact) and set $a = \min\{\frac{r}{M_1}, \frac{1}{2M_2}\}$. We claim that $T : C^0([-a, a], D_{2r}) \times D_r \to C^0([-a, a], D_{2r})$, $(\varphi, x) \mapsto T_x(\varphi)$, is a family of contraction mappings satisfying the hypotheses of the contraction mapping lemma with parameters.
First of all note that $T$ is well defined as a map to $C^0([-a, a], D_{2r})$ since $a \le r/M_1$. Let $x \in D_r$ and $\varphi, \psi \in C^0([-a, a], D_{2r})$. We have
\[ \begin{aligned} d(T_x(\varphi), T_x(\psi)) &= \sup_{t \in [-a, a]} \Bigl\| x + \int_0^t f(\varphi(s))\,ds - x - \int_0^t f(\psi(s))\,ds \Bigr\| \\ &= \sup_{t \in [-a, a]} \Bigl\| \int_0^t [f(\varphi(s)) - f(\psi(s))]\,ds \Bigr\| \\ &\le \sup_{t \in [-a, a]} \Bigl| \int_0^t \|f(\varphi(s)) - f(\psi(s))\|\,ds \Bigr|. \end{aligned} \]
It follows by the mean value theorem that $\|f(\varphi(s)) - f(\psi(s))\| \le M_2 \|\varphi(s) - \psi(s)\|$, $s \in [-a, a]$, and so
\[ \begin{aligned} d(T_x(\varphi), T_x(\psi)) &\le \sup_{t \in [-a, a]} \Bigl| \int_0^t M_2 \|\varphi(s) - \psi(s)\|\,ds \Bigr| \\ &\le aM_2 \sup_{s \in [-a, a]} \|\varphi(s) - \psi(s)\| \\ &= aM_2\, d(\varphi, \psi) \le \tfrac{1}{2}\, d(\varphi, \psi), \end{aligned} \]
where the last line follows by the definition of $a$. This estimate holds for all $x \in D_r$. Since $x \mapsto T_x(\varphi)$ is obviously continuous on $D_r$ for fixed $\varphi$, the conditions of the contraction mapping lemma with parameters hold and we obtain a continuous map
\[ \Phi : U \times (-a, a) \to \mathbb{R}^m, \]
such that for all $x \in U$, $\varphi_x : (-a, a) \to \mathbb{R}^m$ is a solution to $x' = f(x)$ with initial condition $x$.
Observe that the proof continues to work for any closed interval $K \subset [-a, a]$ containing the origin. So if $\psi : J \to \mathbb{R}^m$ is another solution with initial condition $x$, take $K = J \cap (-a, a)$ and use uniqueness of fixed points to get $\psi = \varphi_x$ on the overlap. □
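The fixed point argument in the proof can be watched in action on a simple example. The sketch below is an illustration under our own assumptions, not part of the proof: we take $m = 1$, the linear vector field $f(x) = x$, the initial condition $x = 1$ and $a = 1/2$ (so that $aM_2 = 1/2$ with $M_2 = 1$). Maps are represented as polynomials with exact rational coefficients, and the names `picard_step` and `evaluate` are hypothetical helpers.

```python
from fractions import Fraction

def picard_step(coeffs, x0):
    """Apply T_x(phi)(t) = x0 + integral_0^t f(phi(s)) ds for f(x) = x.

    coeffs[k] is the coefficient of t^k in the polynomial phi.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] = Fraction(x0)   # the constant term is the initial condition
    return integrated

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))

a = Fraction(1, 2)
phi = [Fraction(1)]                # initial guess: the constant map phi(t) = 1
iterates = [phi]
for _ in range(5):
    phi = picard_step(phi, 1)
    iterates.append(phi)

# Successive iterates differ by the single monomial t^(k+1)/(k+1)!, so the
# uniform distance on [-a, a] is attained at t = a.
gaps = [abs(evaluate(iterates[k + 1], a) - evaluate(iterates[k], a))
        for k in range(5)]

print(iterates[5])
print(gaps)
```

The iterates are the Taylor polynomials of $e^t$, and each gap is at most half the previous one, in line with the contraction constant $aM_2 = 1/2$.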
Remark 9.7.3 Theorem 9.7.2 continues to hold if the vector field $f$ is defined on a proper open subset $U$ of $\mathbb{R}^m$ with the minor change that we additionally require $D_{2r} \subset U$.
Lemma 9.7.4 (Uniqueness of Solutions) Let $f$ be $C^1$ and $I_1, I_2$ be open intervals containing $0 \in \mathbb{R}$. If $\varphi_i : I_i \to \mathbb{R}^m$ are solutions of $x' = f(x)$ with the same initial condition, then
\[ \varphi_1(t) = \varphi_2(t) \quad \text{for all } t \in I_1 \cap I_2. \]
Proof Set $I_1 \cap I_2 = (a, b)$ and define $X = \{t \in (a, b) \mid \varphi_1(t) = \varphi_2(t)\}$. Since $0 \in I_1 \cap I_2$ and the initial conditions agree, $0 \in X$ and so $X \ne \emptyset$. As $\varphi_1, \varphi_2$ are continuous, $X$ is a closed subset of $(a, b)$. Suppose $s \in X$ and set $\varphi_1(s) = \varphi_2(s) = y$. Differentiating with respect to $t$, we see that $\psi_i(t) = \varphi_i(s + t)$, $t \in (a - s, b - s)$, are both solutions to $x' = f(x)$ with initial condition $y$. By the uniqueness part of Theorem 9.7.2, $\psi_1 = \psi_2$ on some interval $(-\delta, \delta)$ containing $t = 0$ and so $\varphi_1(t) = \varphi_2(t)$ for $t \in (s - \delta, s + \delta) \cap (a, b)$. Therefore $X$ is open. Since $(a, b)$ is connected and $X \ne \emptyset$, $X = (a, b)$. □
If we assume $f$ depends on $t$ and is independent of $x$, results are much easier to prove. We use the next result later in Sect. 9.15 when we strengthen Theorem 9.7.2 and show $\Phi$ is $C^1$ (in $(x, t)$).
Proposition 9.7.5 Let $M > 0$ and define
\[ \Lambda = \{G \in C^0(\mathbb{R}, L(\mathbb{R}^m, \mathbb{R}^m)) \mid |G| = \sup_{t \in \mathbb{R}} \|G(t)\| \le M\}. \]
Given $G \in \Lambda$, consider the ordinary differential equation
\[ A' = G(t) \circ A(t) \tag{9.11} \]
defined on $L(\mathbb{R}^m, \mathbb{R}^m)$. With $a = 1/(2M)$, there exists a $C^1$ map $\Psi : L(\mathbb{R}^m, \mathbb{R}^m) \times [-a, a] \to L(\mathbb{R}^m, \mathbb{R}^m)$ such that $A(t) = \Psi(A_0, t)$ is the unique solution to (9.11) with initial condition $A_0$. Furthermore, if $\bar{G} \in \Lambda$, and we denote the corresponding family of solutions to (9.11) by $\bar{A}(t)$, then for solutions with the same initial condition we have the estimate
\[ \|A - \bar{A}\| \le 2a\,|G - \bar{G}|\,\|A\|, \]
where $\|A\| = \sup_{t \in [-a, a]} \|A(t)\|$.
Proof Let $X = C^0([-a, a], L(\mathbb{R}^m, \mathbb{R}^m))$ (uniform metric) and define $T : X \times L(\mathbb{R}^m, \mathbb{R}^m) \to X$ by
\[ T(\varphi, A_0)(t) = A_0 + \int_0^t G(s) \circ \varphi(s)\,ds, \quad \varphi \in X, \; t \in [-a, a]. \]
The proof of existence follows that of Theorem 9.7.2. The estimate in $|G - \bar{G}|$ uses the integral form of the solution and $a = 1/(2M)$. Note that the estimate is a quantitative version of continuous dependence on parameters; in this case, the parameter is $G$. We leave the straightforward details to the exercises. □
EXERCISES 9.7.6
(1) Complete the proof of Proposition 9.7.5.
(2) Let $A : \mathbb{R} \to L(\mathbb{R}^m, \mathbb{R}^m)$ be continuous and consider the linear system $x' = A(t)x$. State and prove an existence and uniqueness theorem for this system.
9.8 Higher Derivatives as Approximations
We motivated the idea of derivative in terms of affine linear approximation. It is natural to try to extend this idea to define higher-order derivatives. For example, if $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$, then we might define $f$ to be twice differentiable at $x_0 \in U$ if there exists a homogeneous quadratic polynomial $Q : \mathbb{R}^m \to \mathbb{R}^n$ such that if we define the remainder term $r(h)$ by
\[ f(x_0 + h) = f(x_0) + Df_{x_0}(h) + Q(h) + r(h), \]
then $r(h) = o(\|h\|^2)$. That is, $\|r(h)\| \to 0$ as $h \to 0$ faster than $\|h\|^2$. Ignoring for now the definition of a homogeneous quadratic polynomial, what we are saying is that $f$ is twice differentiable at $x_0$ if $f$ has a ‘good’ quadratic approximation near $x_0$. Exactly this type of condition holds if $f : \mathbb{R} \to \mathbb{R}$ is twice differentiable at $x_0$: $f(x_0 + h) = f(x_0) + f'(x_0)h + f''(x_0)h^2/2 + r(h)$, where $r(h) = o(h^2)$. This follows from Taylor’s theorem (Theorem 2.7.10). Observe that $f(x_0) + f'(x_0)h + f''(x_0)h^2/2$ is the ‘best possible’ quadratic approximation to $f$ near $x_0$. There is, however, another way of approaching the theory of higher derivatives. If $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$, then it is natural to say that $f$ is twice differentiable at $x_0$ if $Df : U \to L(\mathbb{R}^m, \mathbb{R}^n)$ is differentiable at $x_0$. Viewed this way, the second derivative of $f$ at $x_0$ will be a linear map $D^2f_{x_0} : \mathbb{R}^m \to L(\mathbb{R}^m, \mathbb{R}^n)$. That is, $D^2f_{x_0} \in L(\mathbb{R}^m, L(\mathbb{R}^m, \mathbb{R}^n))$. Assuming $f$ is twice continuously differentiable on $U$, the third derivative of $f$ at $x_0$ would be defined as a linear map $D^3f_{x_0} \in L(\mathbb{R}^m, L(\mathbb{R}^m, L(\mathbb{R}^m, \mathbb{R}^n)))$ and so on. This, of course, looks complicated but the situation is saved because there is a natural way of going from elements of $L(\mathbb{R}^m, L(\mathbb{R}^m, \cdots, \mathbb{R}^n) \cdots)$ to polynomial maps $\mathbb{R}^m \to \mathbb{R}^n$ and this will give us the connection between higher derivatives and approximation.
The reality is that the details for higher-order derivatives for vector valued maps defined on a vector space may appear to be complicated but most of the difficulties lie with keeping the notation under control. For example, to specify the matrix of the derivative linear map from $\mathbb{R}^m$ to $\mathbb{R}^n$ we need $nm$ partial derivatives. For the second derivative we need a total of $m^2 n$ partial derivatives; for the $p$th derivative, $m^p n$ partial derivatives. Writing all of this out in coordinates is both daunting and unhelpful. One of our goals is to develop a good ‘language’ so that the results mirror those of the one-variable theory in a transparent way. Turning to the details, we shall start our work on higher derivatives with an extended discussion of polynomial and multi-linear maps between vector spaces. We then define symmetric multi-linear maps and show that there is a natural bijective correspondence between polynomials and symmetric multi-linear maps. With these preliminaries out of the way, we show how higher derivatives define symmetric multi-linear maps and so determine polynomial maps. We also show how the inverse and implicit function theorems generalize easily to $C^r$-maps, $r > 1$. We conclude our work on differentiation with a number of more advanced results about higher-order derivatives of products (Leibniz law) and compositions (Faà di Bruno’s formula).
9.9 Multi-Linear Maps and Polynomials
9.9.1 Preliminaries on Normed Vector Spaces
It is useful to work with general normed vector spaces $(V, \|\cdot\|_V)$ rather than restricting to $(\mathbb{R}^m, \|\cdot\|_2)$. In part this is because when we consider spaces of linear and multi-linear maps, the operator norm will not usually be the Euclidean norm. Another consideration is that working in greater generality simplifies the notation and helps to reveal the relationships between spaces of linear and multi-linear maps. We usually drop the subscript $V$ from the norm symbol and denote the norm on $V$ by $\|\cdot\|$ (if we need to emphasize the space $V$, we write $\|\cdot\|_V$).
We assume all vector spaces are finite-dimensional (and so isomorphic to $\mathbb{R}^n$ for some $n$). Moreover, all norms on a finite-dimensional vector space are equivalent and define the same topology as the Euclidean norm (Theorem 9.1.4).
Let $L(V, W)$ denote the vector space of linear maps from $V$ to $W$. Since we assume $V, W$ are finite-dimensional normed vector spaces, $L(V, W)$ consists of continuous linear maps (see Sect. 9.2 and note this only needs the finite-dimensionality of $V$). As shown in Sect. 9.2.1, continuous linear maps are bounded on the unit disk, centre the origin, and we define the operator norm on $L(V, W)$ by
\[ \|A\| = \sup_{\|v\| = 1} \|Av\|, \quad A \in L(V, W). \]
All the results proved in Sect. 9.2 extend immediately to general finite-dimensional normed vector spaces. In particular, $\|Ax\| \le \|A\|\|x\|$, $x \in V$, $A \in L(V, W)$ (see also Exercises 9.2.13).
9.9.2 Multi-Linear Maps
Let $(V, \|\cdot\|)$ be a normed vector space and $p \in \mathbb{N}$. We define the $p$-fold product $(V^p, \|\cdot\|)$ to be the normed vector space which is the product of $p$ copies of $V$ with product norm defined by
\[ \|(v_1, \dots, v_p)\| = \max\{\|v_i\| \mid 1 \le i \le p\}, \quad (v_1, \dots, v_p) \in V^p. \]
Remarks 9.9.1
(1) In what follows we make no use of the vector space structure on $V^p$ which is defined by coordinate-wise addition and scalar multiplication in the usual way. Note that the topology on $V^p$ is uniquely defined, independently of the choice of norm on $V$ (Theorem 9.1.4).
(2) An alternative notation for $V^p$ is $\times^p V$.
(3) If the dimension of $V$ is $m$, then the dimension of $V^p$ is $pm$.
Definition 9.9.2 Let $V, W$ be normed vector spaces and $p \in \mathbb{N}$. A map $T : V^p \to W$ is called $p$-linear or multi-linear if $T$ is linear in each variable separately. That is, for every $i \in \{1, \dots, p\}$, and all $u, v \in V$, $\lambda \in \mathbb{R}$, and $x_j \in V$, $j \ne i$, we have
\[ T(x_1, \dots, x_{i-1}, \lambda u + v, \dots, x_p) = \lambda T(x_1, \dots, x_{i-1}, u, \dots, x_p) + T(x_1, \dots, x_{i-1}, v, \dots, x_p). \]
For $p \ge 0$, let $L^p(V; W)$ denote the space of all $p$-linear maps $T : V^p \to W$ (define $L^0(V; W) = W$). Clearly, $L^p(V; W)$ inherits the structure of a vector space from $W$: $(\lambda T + S)(X) = \lambda T(X) + S(X)$, $T, S \in L^p(V; W)$, $\lambda \in \mathbb{R}$, $X \in V^p$.
Examples 9.9.3
(1) Since $L^1(V; W) = L(V, W)$, every 1-linear map is linear. If $T \in L^2(V; W)$, $T$ is called bilinear. For example, an inner product $\langle \cdot, \cdot \rangle$ on $V$ defines a bilinear map $\langle \cdot, \cdot \rangle : V^2 \to \mathbb{R}$.
(2) If $p > 1$, a $p$-linear map $T : V^p \to W$ is not linear with respect to the vector space structure on $V^p$ (note Remarks 9.9.1(1)).
(3) Suppose $T : V^2 \to W$ is bilinear. Let $\{e_i\}_{i=1}^m$ and $\{f_\ell\}_{\ell=1}^n$ be bases for $V$ and $W$, respectively. Denote the associated coordinates on $V$ by $(x_1, \dots, x_m)$. Relative to these bases, we may write $T$ in coordinate form as
\[ T((x_1, \dots, x_m), (y_1, \dots, y_m)) = \Bigl( \sum_{i,j=1}^m a^1_{ij} x_i y_j, \; \dots, \; \sum_{i,j=1}^m a^n_{ij} x_i y_j \Bigr), \]
where the coefficients $a^\ell_{ij}$ are uniquely determined by $T$ according to $T(e_i, e_j) = \sum_{\ell=1}^n a^\ell_{ij} f_\ell$, $1 \le i, j \le m$, $1 \le \ell \le n$. It follows that $\dim(L^2(V; W)) = \dim(V)^2 \dim(W)$. Similar expressions and results hold if $p > 2$. In particular, $\dim(L^p(V; W)) = \dim(V)^p \dim(W)$.
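The coordinate description of a bilinear map is easy to experiment with. The sketch below is only an illustration: it builds a bilinear map $T : (\mathbb{R}^2)^2 \to \mathbb{R}^2$ from an arbitrarily chosen array of coefficients $a^\ell_{ij}$ (stored as `a[l][i][j]`; the helper `make_bilinear` is our own name), recovers a coefficient from $T(e_1, e_2)$, and checks additivity in the first variable together with the failure of linearity on $V^2$.

```python
def make_bilinear(a):
    """Return T with T(x, y)_l = sum_{i,j} a[l][i][j] * x[i] * y[j]."""
    def T(x, y):
        return tuple(
            sum(a_l[i][j] * x[i] * y[j]
                for i in range(len(x)) for j in range(len(y)))
            for a_l in a
        )
    return T

# Arbitrary coefficients for m = dim V = 2 and n = dim W = 2: m * m * n = 8 numbers.
a = [[[1, 2], [0, 1]],
     [[0, 1], [-1, 3]]]
T = make_bilinear(a)

e1, e2 = (1, 0), (0, 1)
u, v, y = (1, 2), (3, -1), (2, 5)
u_plus_v = tuple(s + t for s, t in zip(u, v))

coeffs_12 = T(e1, e2)                  # equals (a^1_{12}, a^2_{12})
additive = T(u_plus_v, y) == tuple(s + t for s, t in zip(T(u, y), T(v, y)))
# Scaling both arguments by 2 scales T by 4, so T is not linear on V^2.
quadratic_scaling = T((2, 4), (4, 10)) == tuple(4 * c for c in T(u, y))

print(coeffs_12, additive, quadratic_scaling)
```

The count of free coefficients, $m^2 n = 8$ here, matches the formula $\dim(L^2(V; W)) = \dim(V)^2 \dim(W)$.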
Lemma 9.9.4 A multi-linear map between finite-dimensional normed vector spaces is continuous.
Proof Let $T \in L^p(V; W)$. Choosing bases for $V$ and $W$, we may write $T$ in coordinate form as we did in Examples 9.9.3(3) for the case $p = 2$. Since each component $T_\ell$ of $T = (T_1, \dots, T_n)$ may be written as a finite sum of continuous monomials $a^\ell_{i_1 \cdots i_p} x^1_{i_1} \cdots x^p_{i_p}$, where $1 \le i_1, \dots, i_p \le m$ and $(x^j_1, \dots, x^j_m)$ are the coordinates of a point in the $j$th factor of $V^p$, $T_\ell$ is continuous, $1 \le \ell \le n$. Hence $T$ is continuous. □
Remark 9.9.5 At the cost of some extra work, it is possible to avoid the coordinate computations of the previous lemma. Specifically, given $p \in \mathbb{N}$, it can be shown that there exists a (unique up to isomorphism) finite-dimensional vector space $\otimes^p V$ and natural continuous $p$-linear map $j : V^p \to \otimes^p V$ such that every $p$-linear map $T : V^p \to W$ can be uniquely factored through $\otimes^p V$ as the composite $\hat{T} j$, where $\hat{T} \in L(\otimes^p V, W)$. The space $\otimes^p V$ is the $p$-fold tensor product of $V$ and is of dimension $\dim(V)^p$. Since $L(\otimes^p V, W)$ consists of continuous linear maps and $j : V^p \to \otimes^p V$ is continuous, every $p$-linear map $T : V^p \to W$ is continuous.
Theorem 9.9.6 Let $(V, \|\cdot\|)$, $(W, \|\cdot\|)$ be normed vector spaces and $p \in \mathbb{N}$. Then $L^p(V; W)$ has the structure of a normed vector space with norm defined by
\[ \|T\| = \sup_{\|(x_1, \dots, x_p)\| = 1} \|T(x_1, \dots, x_p)\|. \]
Moreover, for all $T \in L^p(V; W)$, we have
\[ \|T(x_1, \dots, x_p)\| \le \|T\| \|x_1\| \cdots \|x_p\|, \quad (x_1, \dots, x_p) \in V^p. \tag{9.12} \]
Proof Since $V^p$ is a finite-dimensional vector space, the topology on $V^p$ is uniquely defined independently of the choice of norm on $V$ (see Remarks 9.9.1(1)). Hence $S(V^p) = \{u \in V^p \mid \|u\| = 1\}$ is a compact subset of $V^p$ and so $\|T\| < \infty$ for all $T \in L^p(V; W)$. Standard arguments show that $\|\cdot\|$ defines a norm on $L^p(V; W)$.
Finally, suppose that $(x_1, \dots, x_p) \in V^p$. If any one of the vectors $x_i = 0$, then $T(x_1, \dots, x_p) = 0$ (by $p$-linearity) and so (9.12) holds trivially. So suppose $x_i \ne 0$, $1 \le i \le p$. Set $X = \bigl(\frac{x_1}{\|x_1\|}, \dots, \frac{x_p}{\|x_p\|}\bigr)$. By definition of the product norm we have $\|X\| = 1$ and so
\[ \Bigl\| T\Bigl(\frac{x_1}{\|x_1\|}, \dots, \frac{x_p}{\|x_p\|}\Bigr) \Bigr\| \le \|T\|. \]
Now
\[ T\Bigl(\frac{x_1}{\|x_1\|}, \dots, \frac{x_p}{\|x_p\|}\Bigr) = \frac{1}{\|x_1\| \cdots \|x_p\|}\, T(x_1, \dots, x_p), \]
by the $p$-linearity of $T$. Taking the norm of both sides and multiplying by $\|x_1\| \cdots \|x_p\|$ gives (9.12). □
Example 9.9.7 Let $(\cdot, \cdot) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ denote the Euclidean inner product on $\mathbb{R}^n$. We have $\|(\cdot, \cdot)\| = 1$ (by the Cauchy–Schwarz inequality).
Lemma 9.9.8 Let $V, W$ be normed vector spaces. There is a natural norm-preserving linear isomorphism $L(V, L(V, W)) \cong L^2(V; W)$.
Proof Given $T \in L(V, L(V, W))$, define $\hat{T} \in L^2(V; W)$ by
\[ \hat{T}(x_1, x_2) = T(x_1)(x_2), \quad x_1, x_2 \in V. \]
Since $T$ and $T(x_1)$ are linear, $\hat{T}$ is bilinear. We have
\[ \begin{aligned} \|\hat{T}\| &= \sup_{\|x_1\|, \|x_2\| = 1} \|\hat{T}(x_1, x_2)\| \\ &= \sup_{\|x_1\|, \|x_2\| = 1} \|T(x_1)(x_2)\| \\ &= \sup_{\|x_1\| = 1} \|T(x_1)\| \\ &= \|T\|, \end{aligned} \]
where the last two lines follow from the definition of the operator norm. The map $T \mapsto \hat{T}$ is obviously linear: $\widehat{\lambda T + S} = \lambda \hat{T} + \hat{S}$ for all $T, S \in L(V, L(V, W))$, $\lambda \in \mathbb{R}$. Since $\|\hat{T}\| = 0$ iff $\|T\| = 0$, we see that $T \mapsto \hat{T}$ must be 1:1. Since $L(V, L(V, W))$ and $L^2(V; W)$ have the same dimension, $T \mapsto \hat{T}$ must be a linear isomorphism between $L(V, L(V, W))$ and $L^2(V; W)$. Alternatively, we may construct the inverse map: given $S \in L^2(V; W)$, define $S' \in L(V, L(V, W))$ by $S'(x_1)(x_2) = S(x_1, x_2)$, $x_1, x_2 \in V$. We leave it to the reader to verify that this formula defines a linear inverse to the map $T \mapsto \hat{T}$. □
Remark 9.9.9 We use the word ‘natural’ in Lemma 9.9.8 in the sense that the isomorphism we construct does not depend on choosing bases for either $V$ or $W$. Indeed, the construction works just as well if $V, W$ are infinite-dimensional normed vector spaces and we consider spaces of continuous linear and multi-linear maps.
Theorem 9.9.10 Let $V, W$ be normed vector spaces. For $p \ge 1$, there are natural norm-preserving isomorphisms
\[ L(V, L^{p-1}(V; W)) \cong L^p(V; W), \]
\[ L(V, L(V, \cdots, L(V, W) \cdots)) \cong L^p(V; W), \]
where there are $p$ copies of $V$ on the left-hand side of the second isomorphism.
Proof Let $T \in L(V, L^{p-1}(V; W))$ and $x_1, x_2, \dots, x_p \in V$. We define $\hat{T} \in L^p(V; W)$ by
\[ \hat{T}(x_1, x_2, \dots, x_p) = T(x_1)(x_2, \dots, x_p). \]
Exactly as in the proof of Lemma 9.9.8, we verify that $T \mapsto \hat{T}$ is a norm-preserving linear isomorphism. The proof of the second statement is similar and may either be proved directly or by induction. □
EXERCISES 9.9.11
(1) Let $T : V \times V = V^2 \to V$ be defined as vector space addition: $T(x, y) = x + y$. Verify that $T$ is linear (with respect to the linear structure defined on $V^2$, Remarks 9.9.1(1)) and find $\|T\|$.
(2) The product norm on $\mathbb{R}^p = \times^p \mathbb{R}$ is usually denoted by $\|\cdot\|_\infty$. What is $\|(x_1, \dots, x_p)\|_\infty$? (See Exercises 7.1.9(3) for the definition of the corresponding $d_\infty$ metric when $p = 2$.)
(3) Take the norm $\|\cdot\|_\infty$ on $\mathbb{R}^m$, $\mathbb{R}^n$. Let $A = [a_{ij}] \in L(\mathbb{R}^m, \mathbb{R}^n)$. Verify that
\[ \|Ax\|_\infty \le \Bigl( \max_i \Bigl( \sum_{j=1}^m |a_{ij}| \Bigr) \Bigr) \|x\|_\infty, \quad x \in \mathbb{R}^m, \]
and that $\|A\| = \max_i \bigl( \sum_{j=1}^m |a_{ij}| \bigr)$.
(4) Let $(V_1, \|\cdot\|_1), \dots, (V_p, \|\cdot\|_p)$ be (finite-dimensional) normed vector spaces. Define the product norm on $V_1 \times \cdots \times V_p = \times_{i=1}^p V_i$ by $\|(v_1, \dots, v_p)\| = \max_i \|v_i\|$. Verify that $(\times_{i=1}^p V_i, \|\cdot\|)$ has the structure of a normed vector space.
(5) Continuing with the assumptions of the preceding exercise, define the space $L^p(V_1, \dots, V_p; W)$ of $p$-linear maps from $\times_{i=1}^p V_i$ to $W$. Show that $L^p(V_1, \dots, V_p; W)$ has the structure of a normed vector space such that given $T \in L^p(V_1, \dots, V_p; W)$ we have
\[ \|T(v_1, \dots, v_p)\| \le \|T\| \|v_1\| \cdots \|v_p\|, \quad (v_1, \dots, v_p) \in \times_{i=1}^p V_i. \]
(6) Show that scalar multiplication on $V$ defines a bilinear map $S \in L^2(V, \mathbb{R}; V)$. What is $\|S\|$?
(7) Let $U, V, W$ be finite-dimensional normed vector spaces. Show that
(a) The map $C : L(V, W) \times L(U, V) \to L(U, W)$ defined by $C(A, B) = A \circ B$ (composition) is bilinear.
(b) $\|C\| = 1$. (It is easy, by Proposition 9.2.11(c), to show that $\|C\| \le 1$.)
(8) Let $V_1, V_2, W$ be finite-dimensional normed vector spaces. Verify we have a natural norm-preserving linear isomorphism $L(V_1, L(V_2, W)) \cong L(V_1, V_2; W)$. Extend to the case of $p$ normed vector spaces $V_1, \dots, V_p$.
(9) Let $V_1, \dots, V_p$, $W_1, \dots, W_q$ be finite-dimensional normed vector spaces. Show there is a natural linear isomorphism
\[ L(\times_{i=1}^p V_i, \times_{j=1}^q W_j) \cong \times_{i,j} L(V_i, W_j). \]
Verify that if $p = 1$, the isomorphism is norm-preserving. What happens if $p > 1$?
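For Exercise (3), a small numerical experiment can make the claimed formula plausible before proving it. The following sketch is not a proof: it takes one arbitrarily chosen $2 \times 3$ matrix, computes the maximal absolute row sum, and evaluates $\|Ax\|_\infty$ on all sign vectors $x \in \{-1, 1\}^3$ (each of which has $\|x\|_\infty = 1$). The function names are our own.

```python
from itertools import product

def sup_norm(x):
    """The norm ||x||_inf = max_i |x_i|."""
    return max(abs(c) for c in x)

def apply(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def max_row_sum(A):
    """max_i sum_j |a_ij|, the candidate value for the operator norm."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1, -2, 3],
     [-4, 0, 1]]
bound = max_row_sum(A)

# Every vector with entries +1 or -1 lies on the unit sphere for ||.||_inf.
values = [sup_norm(apply(A, x)) for x in product((-1, 1), repeat=3)]

print(bound, max(values))
```

The bound is attained at the sign vector matching the signs of the row with the largest absolute row sum, which is the idea behind the equality $\|A\| = \max_i \sum_j |a_{ij}|$.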
9.9.3 Symmetric Multi-Linear Maps and Polynomials
For $n \in \mathbb{N}$, let $S_n$ denote the group of all permutations of $\{1, \dots, n\}$. We refer to $S_n$ as the symmetric group on $n$ symbols and recall that the order of the group $S_n$ is $n!$.
Definition 9.9.12 Let $V, W$ be normed vector spaces. A $p$-linear map $T \in L^p(V; W)$ is symmetric if for all $(x_1, \dots, x_p) \in V^p$ we have
\[ T(x_{\sigma(1)}, \dots, x_{\sigma(p)}) = T(x_1, \dots, x_p), \quad \text{for all } \sigma \in S_p. \]
We denote the set of all symmetric $p$-linear maps from $V$ to $W$ by $L^p_s(V; W)$; as for $p$-linear maps, we take $L^0_s(V; W) = W$.
Examples 9.9.13
(1) Let $\langle \cdot, \cdot \rangle$ be an inner product on the vector space $V$. Then $\langle \cdot, \cdot \rangle \in L^2_s(V; \mathbb{R})$. This is a consequence of the symmetry property of an inner product: $\langle u, v \rangle = \langle v, u \rangle$, for all $u, v \in V$.
(2) If $V$ is 1-dimensional, then $L^p_s(V; W) = L^p(V; W) \cong W$, for all $p \in \mathbb{N}$. For this observe that if $\{v\}$ is a basis of $V$ and we set $T(v, \dots, v) = w$, then $T(x_1 v, \dots, x_p v) = x_1 \cdots x_p w$. This expression is obviously symmetric in $x_1, \dots, x_p$.
Proposition 9.9.14 If $V, W$ are finite-dimensional normed vector spaces and $p \in \mathbb{Z}_+$, then $L^p_s(V; W)$ is a vector subspace of $L^p(V; W)$. In particular, $L^p_s(V; W)$ inherits the structure of a normed vector space from the norm on $L^p(V; W)$.
Proof Left to the reader. □
Our definition of a homogeneous polynomial on a normed vector space is given in terms of symmetric multi-linear maps.
Definition 9.9.15 Let $V, W$ be finite-dimensional normed vector spaces and $d \in \mathbb{Z}_+$. A map $p : V \to W$ is a homogeneous polynomial of degree $d$ if there exists a $T \in L^d_s(V; W)$ such that
\[ p(x) = T(x, x, \dots, x), \quad x \in V. \]
We denote the set of all homogeneous polynomial maps of degree $d$ from $V$ to $W$ by $P^{(d)}(V, W)$. Note that $P^{(0)}(V, W) = W$.
Remarks 9.9.16
(1) Since multi-linear maps are continuous, polynomials are continuous (vector spaces are assumed finite-dimensional).
(2) Rather than writing $p(x) = T(x, x, \dots, x)$, it is often more convenient and suggestive to write $p(x) = T(x^d)$, where it is understood that $x^d$ is shorthand for $(x, \dots, x) \in V^d$ and does not refer to the product of $x$ with itself $d$ times (this is not defined on a general vector space). We may generalize this notation in the obvious way and define $T(x^a, y^b)$ for $a + b = d$, $x, y \in V$. Since $T$ is symmetric, there is no ambiguity with this notation.
(3) The definition of a homogeneous polynomial $p$ allows the possibility of there being more than one choice of symmetric multi-linear map $T$ defining $p$. In due course, we show that the choice is unique and that there is a linear isomorphism between $P^{(d)}(V, W)$ and $L^d_s(V; W)$.
Examples 9.9.17
(1) Let $p \in P^{(d)}(V, W)$. We have $p(\lambda x) = \lambda^d p(x)$ for all $x \in V$, $\lambda \in \mathbb{R}$. This homogeneity condition does not (quite) imply $p$ is a polynomial. For example, if $V = \mathbb{R}^2$, $W = \mathbb{R}$, and we define $f(x, y) = \frac{x^2 y}{x^2 + y^2}$, $(x, y) \ne (0, 0)$ and $f(0, 0) = 0$, then $f$ is continuous and homogeneous of degree 1 but $f$ is not a polynomial. On the other hand, if $f$ is homogeneous of degree $d$ and $d$ times differentiable at $x = 0$, then $f$ is a homogeneous polynomial of degree $d$. This is not hard to show using the vector-valued version of Taylor’s theorem which we prove later.
(2) We can associate a polynomial to every $T \in L^d(V; W)$ (no symmetry assumed). To do this, define
\[ p(x) = T(x^d), \quad x \in V. \]
Observe that $p$ is a polynomial in the sense of Definition 9.9.15 since if we define $T_s \in L^d_s(V; W)$ by
\[ T_s(x_1, \dots, x_d) = \frac{1}{d!} \sum_{\sigma \in S_d} T(x_{\sigma(1)}, \dots, x_{\sigma(d)}), \tag{9.13} \]
then $T_s$ is symmetric and $T(x^d) = T_s(x^d)$ for all $x \in V$. We refer to $T_s$ as the symmetrization of $T$. Note that the symmetrization map $L^d(V; W) \to L^d_s(V; W)$, $T \mapsto T_s$, is linear and onto.
(3) If $p \in P^{(d)}(V, W)$ and $p(x) = T(x^d)$, where $T \in L^d_s(V; W)$, then $p$ is $C^1$ with derivative given by
\[ Dp_x(e) = dT(x^{d-1}, e), \quad x, e \in V. \]
This follows since $p(x + h) = T((x + h)^d) = T(x^d) + dT(x^{d-1}, h) + R(x, h)$, where we have used the symmetry of $T$ and $R(x, h)$ is a sum of terms of the form $T(x^r, h^s)$, $r + s = d$, $s \ge 2$. We have $\|R(x, h)\| \le C\|h\|^2$, by Theorem 9.9.6, and so $R(x, h) = o(\|h\|)$.
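The symmetrization formula (9.13) translates directly into a short computation. The sketch below is an illustration under our own choices: `symmetrize` is a hypothetical helper that averages a $d$-linear map over all permutations of its arguments, and $T$ is an arbitrarily chosen trilinear map on $\mathbb{R}^2$ which is not symmetric. Integer inputs are assumed so that exact rational arithmetic can be used.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def symmetrize(T, d):
    """Return T_s(x_1, ..., x_d) = (1/d!) * sum over sigma of T(x_sigma(1), ..., x_sigma(d))."""
    def Ts(*xs):
        total = sum(T(*(xs[s[i]] for i in range(d)))
                    for s in permutations(range(d)))
        return Fraction(total, factorial(d))
    return Ts

def T(x, y, z):
    """A trilinear map (R^2)^3 -> R that is not symmetric."""
    return x[0] * y[1] * z[1]

Ts = symmetrize(T, 3)

u, v, w = (1, 0), (0, 1), (0, 1)
x = (2, 3)

print(T(u, v, w), T(v, u, w))    # T changes when two arguments are swapped
print(Ts(u, v, w), Ts(v, u, w))  # T_s does not
print(T(x, x, x), Ts(x, x, x))   # T and T_s define the same polynomial
```

Although $T$ and $T_s$ differ as multi-linear maps, they agree on the diagonal $(x, x, x)$, which is why passing to $T_s$ does not change the associated polynomial.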
For $d \in \mathbb{Z}_+$, $P^{(d)}(V, W)$ has the structure of a vector space. Given $p \in P^{(d)}(V, W)$, define
\[ \|p\| = \sup_{\|x\| = 1} \|p(x)\|. \]
Proposition 9.9.18 (Notation as Above)
(1) For $d \in \mathbb{Z}_+$, $(P^{(d)}(V, W), \|\cdot\|)$ has the structure of a normed vector space.
(2) Given $p \in P^{(d)}(V, W)$, we have
\[ \|p(x)\| \le \|p\| \|x\|^d, \quad x \in V. \]
Proof Exactly the same method used for the proof of the corresponding result for multi-linear maps (Theorem 9.9.6). We leave the details to the reader. □
Remark 9.9.19 If $p(x) = T(x^d)$, $T \in L^d_s(V; W)$, it is not true that the polynomial norm $\|p\|$ equals the multi-linear norm $\|T\|$. The relation between the norms is given in the exercises at the end of the section.
Definition 9.9.20 A map $p : V \to W$ is a polynomial of degree $d$ if there exist homogeneous polynomials $p_j \in P^{(j)}(V, W)$, $0 \le j \le d$, such that
\[ p(x) = \sum_{j=0}^d p_j(x), \quad x \in V. \]
Let $P^d(V, W)$ denote the vector space of all polynomial maps of degree $d$ from $V$ to $W$.
9.9.4 Multi-Index Notation and Coordinate form
for Polynomials
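Before we give the coordinate description of polynomial maps, we need to review multi-index notation. Let $m \in \mathbb{N}$ and $\alpha \in \mathbb{Z}^m_+$. If $\alpha = (\alpha_1,\dots,\alpha_m)$, set $|\alpha| = \sum_{i=1}^m \alpha_i$ and $\alpha! = \alpha_1!\cdots\alpha_m!$. Given $x = (x_1,\dots,x_m) \in \mathbb{R}^m$, define
$$x^\alpha = x_1^{\alpha_1}\cdots x_m^{\alpha_m}.$$
Thus $x^\alpha$ is a monomial of degree $|\alpha|$ and $x^\alpha$ defines a real-valued map on $\mathbb{R}^m$. It is useful to extend this notation. Suppose $W$ is a vector space. If $w \in W$, we call the map $x \mapsto x^\alpha w$ a monomial of degree $|\alpha|$ from $\mathbb{R}^m$ to $W$.
More generally, suppose $x \in V$ and $d \in \mathbb{N}$. We previously defined $x^d = (x,\dots,x) \in V^d$. If $\alpha$ is a multi-index, and $x_1,\dots,x_m \in V$, we define $x^\alpha \in V^{|\alpha|}$ to be $(x_1^{\alpha_1},\dots,x_m^{\alpha_m}) \in V^{\alpha_1}\times\cdots\times V^{\alpha_m}$.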
Lemma 9.9.21 Let $T \in L^d_s(V;W)$. For all $x_1,\dots,x_m \in V$, and $\lambda_1,\dots,\lambda_m \in \mathbb{R}$, we have
$$T((\lambda_1 x_1 + \cdots + \lambda_m x_m)^d) = \sum_{\alpha : |\alpha| = d} \frac{|\alpha|!}{\alpha!}\,\lambda^\alpha\, T(x^\alpha). \tag{9.14}$$
Proof A straightforward computation that uses the symmetry and $d$-linearity of $T$ together with the elementary theory of permutations and combinations. We leave the details to the exercises. □
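As a sanity check on (9.14), here is a small sketch that evaluates both sides in exact integer arithmetic. The symmetric 3-linear form `T`, the vectors `xs` and the scalars `lams` are hypothetical test data, not from the text.

```python
from itertools import product
from math import factorial, prod

def T(u, v, w):
    # Hypothetical symmetric 3-linear form on R^2: T(u, v, w) = sum_i u_i v_i w_i.
    return sum(a * b * c for a, b, c in zip(u, v, w))

def multi_indices(m, d):
    """All alpha in Z_+^m with |alpha| = d."""
    return [a for a in product(range(d + 1), repeat=m) if sum(a) == d]

def rhs(lams, xs, d):
    """Right-hand side of (9.14)."""
    total = 0
    for alpha in multi_indices(len(xs), d):
        coeff = factorial(d) // prod(factorial(a) for a in alpha)
        lam_alpha = prod(l ** a for l, a in zip(lams, alpha))
        # x^alpha = (x_1 repeated alpha_1 times, ..., x_m repeated alpha_m times)
        args = [x for x, a in zip(xs, alpha) for _ in range(a)]
        total += coeff * lam_alpha * T(*args)
    return total

xs = [(1, 2), (3, -1)]
lams = [2, 5]
z = tuple(sum(l * x[i] for l, x in zip(lams, xs)) for i in range(2))
lhs = T(z, z, z)  # left-hand side of (9.14) with d = 3
print(lhs, rhs(lams, xs, 3))
```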
Corollary 9.9.22 Fix a basis $\mathcal{V} = \{v_i\}_{i=1}^m$ of $V$. If $p \in P^{(d)}(V;W)$ is defined by $p(x) = T(x^d)$, where $T \in L^d_s(V;W)$, then
$$p(x) = \sum_{\alpha : |\alpha| = d} \frac{|\alpha|!}{\alpha!}\, a_\alpha x^\alpha,$$
where $(x_1,\dots,x_m)$ are the coordinates of $x \in V$ relative to $\mathcal{V}$ and the coefficients $a_\alpha \in W$ are given by $a_\alpha = T(v^\alpha)$. Conversely, any sum of this type defines a homogeneous polynomial $p : V \to W$.
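Proof The first part of the corollary follows from Lemma 9.9.21. For the converse, it is enough (by linearity) to prove that any monomial $x^\alpha w$, where $w \in W$, is a polynomial. Set $s_0 = 0$ and $s_j = \alpha_1 + \cdots + \alpha_j$, $1 \le j \le m$, so that $s_m = d$. Observe that $x^\alpha w$ may be defined by the asymmetric $d$-linear map
$$S(x_1,\dots,x_d) = \Big(\prod_{i=1}^{s_1} x_{1i} \cdots \prod_{i=s_{m-1}+1}^{s_m} x_{mi}\Big) w,$$
where the coordinates of $x_i$ are $(x_{1i},\dots,x_{mi})$. Each argument $x_i$ occurs in exactly one factor, so $S$ is $d$-linear, and $S(x^d) = x^\alpha w$. Now define $T$ to be the symmetrization $S_s$ of $S$ (see (9.13)). Then $T(x^d) = S(x^d) = x^\alpha w$. □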
9.9.5 The Polarization Lemma
In this section, which is not needed in the remainder of the chapter, we construct an explicit linear isomorphism between $P^{(d)}(V;W)$ and $L^d_s(V;W)$. We start with a simple example.
Example 9.9.23 Suppose that $T \in L^2_s(V;W)$ and define $p(x) = T(x^2)$, $x \in V$. We may recover $T$, knowing $p$, using the identity
$$T(x,y) = \frac{1}{8}\big(p(x+y) - p(x-y) - p(-x+y) + p(-x-y)\big).$$
Indeed, since $p(x) = T(x,x)$ we have, by the bilinearity and symmetry of $T$, $p(x \pm y) = T(x,x) + T(y,y) \pm 2T(x,y)$ and $p(-(x \pm y)) = p(x \pm y)$. Of course, this is not the only expression we can use. For example, $(p(x+y) - p(x) - p(y))/2$ also defines $T(x,y)$. However, it is the first formula that we generalize to homogeneous polynomials of degree $d > 2$.
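Before we state the polarization lemma, we introduce some new notation. Given $d \in \mathbb{N}$, let $S(d)$ denote the set of all $\varepsilon \in \{-1,+1\}^d$. Thus, if $\varepsilon \in S(d)$, we have $\varepsilon = (\varepsilon_1,\dots,\varepsilon_d)$, where $\varepsilon_i = \pm 1$, $1 \le i \le d$.
Lemma 9.9.24 (Polarization Lemma) Let $T \in L^d_s(V;W)$. For all $(x_1,\dots,x_d) \in V^d$, we have
$$T(x_1,\dots,x_d) = \frac{1}{2^d d!}\Big(\sum_{\varepsilon\in S(d)} \varepsilon_1\cdots\varepsilon_d\, T((\varepsilon_1 x_1 + \cdots + \varepsilon_d x_d)^d)\Big).$$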
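Proof We may assume $d > 1$. Expanding $\sum_{\varepsilon\in S(d)} \varepsilon_1\cdots\varepsilon_d\, T((\varepsilon_1 x_1 + \cdots + \varepsilon_d x_d)^d)$ by Lemma 9.9.21, we have to consider the sum over $\varepsilon \in S(d)$ of terms of the form
$$\sum_{\alpha : |\alpha| = d} \frac{d!}{\alpha!}\, \varepsilon_1^{\alpha_1+1}\cdots\varepsilon_d^{\alpha_d+1}\, T(x_1^{\alpha_1},\dots,x_d^{\alpha_d}).$$
Suppose $\alpha \ne (1,\dots,1)$. Then it is straightforward to check that at least two of the indices $\alpha_i + 1$ must be odd. Without loss of generality, suppose $\alpha_1 + 1$, $\alpha_2 + 1$ are odd. Sum over $\varepsilon_1, \varepsilon_2$, keeping the remaining $\varepsilon_i$ fixed. We obtain a contribution from $\varepsilon_1, \varepsilon_2$ of $\sum_{\varepsilon_1 = \pm 1,\, \varepsilon_2 = \pm 1} \varepsilon_1^{\alpha_1+1}\varepsilon_2^{\alpha_2+1}$. This sum is zero since $\alpha_1 + 1$, $\alpha_2 + 1$ are odd. Consequently, if we fix $\alpha$ and sum over $\varepsilon \in S(d)$ we obtain zero. On the other hand, if $\alpha = (1,\dots,1)$, then $\varepsilon_1^{\alpha_1+1}\cdots\varepsilon_d^{\alpha_d+1} = 1$. Summing over $\varepsilon \in S(d)$, we see that the coefficient of $T(x_1,\dots,x_d)$ is $d!\,2^d$, since $\frac{d!}{\alpha!} = d!$ and $|S(d)| = 2^d$. □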
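The polarization formula is easy to test in exact arithmetic. The sketch below is illustrative only: the symmetric 3-linear form `T` and the helper name `polarize` are assumptions made for the example. It recovers $T(x_1,x_2,x_3)$ from the values of $p(x) = T(x^3)$ alone.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def polarize(p, d):
    """Return the d-linear map U(p) given by the polarization formula."""
    def U(*xs):
        n = len(xs[0])
        total = 0
        for eps in product((-1, 1), repeat=d):
            # point = eps_1 x_1 + ... + eps_d x_d
            point = tuple(sum(e * x[i] for e, x in zip(eps, xs)) for i in range(n))
            total += prod(eps) * p(point)
        return Fraction(total, 2 ** d * factorial(d))
    return U

def T(u, v, w):
    # Hypothetical symmetric 3-linear form on R^2: T(u, v, w) = sum_i u_i v_i w_i.
    return sum(a * b * c for a, b, c in zip(u, v, w))

def p(x):
    return T(x, x, x)

U = polarize(p, 3)
x1, x2, x3 = (1, 2), (3, -1), (4, 5)
print(U(x1, x2, x3), T(x1, x2, x3))
```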
Proposition 9.9.25 The map $\Pi : L^d_s(V;W) \to P^{(d)}(V,W)$ defined by $\Pi(T)(x) = T(x^d)$ is a vector space isomorphism with inverse the map $U : P^{(d)}(V,W) \to L^d_s(V;W)$ defined by
$$U(p)(x_1,\dots,x_d) = \frac{1}{2^d d!}\Big(\sum_{\varepsilon\in S(d)} \varepsilon_1\cdots\varepsilon_d\, p(\varepsilon_1 x_1 + \cdots + \varepsilon_d x_d)\Big),$$
where $p \in P^{(d)}(V,W)$ and $x_1,\dots,x_d \in V$.
Proof The map $\Pi : L^d_s(V;W) \to P^{(d)}(V,W)$ is linear and surjective by definition. If $\Pi(T) = 0$ then, by the polarization lemma, $T = 0$. Hence $\Pi$ is injective. We have $U = \Pi^{-1}$ by the polarization lemma. □
EXERCISES 9.9.26
(1) Let $p \in P_d(\mathbb{R}^m, \mathbb{R})$. Show that there exist unique $a_\alpha \in \mathbb{R}$ such that
$$p(x) = \sum_{|\alpha| \le d} \frac{|\alpha|!}{\alpha!}\, a_\alpha x^\alpha, \quad x \in \mathbb{R}^m.$$
(2) Given $p \in P^{(a)}(V,\mathbb{R})$ and $q \in P^{(b)}(V,\mathbb{R})$, define $pq : V \to \mathbb{R}$ by $(pq)(x) = p(x)q(x)$. Show that $pq \in P^{(a+b)}(V,\mathbb{R})$.
(3) Find symmetric multi-linear maps which define the following homogeneous polynomials on $\mathbb{R}^m$:
(a) $p(x) = x_1^2 + \cdots + x_m^2$.
(b) $q(x) = x_i x_j$, where $i \ne j$.
(4) Let $\{e_i\}_{i=1}^m$ denote the standard basis of $\mathbb{R}^m$. Given $T \in L^2_s(\mathbb{R}^m;\mathbb{R})$, define the $m \times m$ matrix $[a_{ij}]$ by $a_{ij} = T(e_i,e_j)$. Show that the matrix $[a_{ij}]$ is symmetric (that is, equal to its transpose).
(5) Let $(V,\|\cdot\|)$ be an $m$-dimensional normed vector space. Let $V^\star = L(V,\mathbb{R})$ denote the dual space of $V$. If $\mathcal{B} = \{e_i\}_{i=1}^m$ is a basis of $V$, let $\mathcal{B}^\star = \{e_j^\star\}_{j=1}^m$ denote the dual basis of $V^\star$ ($e_j^\star(e_i) = 0$, $i \ne j$, $e_i^\star(e_i) = 1$). If $T \in L^2(V;\mathbb{R})$, let $\hat{T} \in L(V,V^\star)$ be the linear map given by Lemma 9.9.8. Show that $T$ is symmetric iff the matrix of $\hat{T}$ (relative to the bases $\mathcal{B}$, $\mathcal{B}^\star$) is symmetric.
(6) For $T \in L^d_s(V;W)$, define the ‘polynomial norm’ of $T$ by
$$\|T\|' = \sup_{\|x\|=1} \|T(x^d)\|.$$
Prove that $\|\cdot\|'$ is related to the norm we defined on $L^d(V;W)$ by
$$\|T\|' \le \|T\| \le \frac{d^d}{d!}\,\|T\|', \quad \text{for all } T \in L^d_s(V;W).$$
Deduce that $\|\cdot\|'$ defines a norm on $L^d_s(V;W)$. Does $\|\cdot\|'$ define a norm on $L^d(V;W)$, if $d > 1$? (Hint for the first part: use the polarization lemma.)
9.10 Higher-Order Derivatives
In this section, we return to our study of maps $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ defined on open subsets of $\mathbb{R}^m$. However, everything we do works perfectly well for general finite-dimensional normed vector spaces. In particular, it is a consequence of the result on equivalence of norms (Theorem 9.1.4) that we may choose any norm on $\mathbb{R}^n$, $\mathbb{R}^m$.
Definition 9.10.1 Let $U$ be an open subset of $\mathbb{R}^m$ and suppose $f : U \to \mathbb{R}^n$ is $C^1$. We say $f$ is twice differentiable at the point $x_0 \in U$ if the map
$$Df : U \to L(\mathbb{R}^m, \mathbb{R}^n)$$
is differentiable at $x_0$. We set $D(Df)_{x_0} = D^2f_{x_0}$ and call $D^2f_{x_0}$ the second derivative of $f$ at $x_0$.
If $f$ is twice differentiable at $x_0$, then $D^2f_{x_0} \in L(\mathbb{R}^m, L(\mathbb{R}^m,\mathbb{R}^n)) \cong L^2(\mathbb{R}^m;\mathbb{R}^n)$, by Lemma 9.9.8. In the sequel, we almost always regard the second derivative $D^2f_{x_0}$ as defining a bilinear map in $L^2(\mathbb{R}^m;\mathbb{R}^n)$.
The differentiability of $Df$ at $x_0$ implies that we have the equation in $L(\mathbb{R}^m,\mathbb{R}^n)$
$$Df_{x_0+h} = Df_{x_0} + D^2f_{x_0}(h) + o(h),$$
where $o(h), D^2f_{x_0}(h) \in L(\mathbb{R}^m,\mathbb{R}^n)$. If we evaluate the equation at $k \in \mathbb{R}^m$, we obtain
$$Df_{x_0+h}(k) = Df_{x_0}(k) + D^2f_{x_0}(h)(k) + o(h)(k) = Df_{x_0}(k) + D^2f_{x_0}(h,k) + o(h)(k),$$
where we have used the natural isomorphism $L(\mathbb{R}^m, L(\mathbb{R}^m,\mathbb{R}^n)) \cong L^2(\mathbb{R}^m;\mathbb{R}^n)$. Since $o(h) \in L(\mathbb{R}^m,\mathbb{R}^n)$, we have $\|o(h)(k)\| \le \|o(h)\|\,\|k\|$ and so $o(h)(k) = o(h,k)$ (note that $\|(h,k)\| = \max\{\|h\|,\|k\|\}$ and so $o(h)(k)/\|(h,k)\| \to 0$ as $\|(h,k)\| \to 0$). As a result, we have the equation in $\mathbb{R}^n$
$$Df_{x_0+h}(k) = Df_{x_0}(k) + D^2f_{x_0}(h,k) + o(h,k).$$
Definition 9.10.2 (Notation as Above) A map $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is twice differentiable on $U$ if $f$ is twice differentiable at every point of $U$. The map $f$ is $C^2$ or twice continuously differentiable (on $U$) if, in addition, $D^2f : U \to L^2(\mathbb{R}^m;\mathbb{R}^n)$ is continuous.
9.10.1 Second-Order Partial Derivatives
Before we give the relationship between the second derivative and second-order partial derivatives, we prove a useful result that allows us to interchange differentiation
with evaluation at a fixed vector.
Lemma 9.10.3 (Evaluation Lemma) Let $d, m, p, q \in \mathbb{N}$. Suppose that $f : U \subset \mathbb{R}^m \to L^d(\mathbb{R}^p;\mathbb{R}^q)$ is differentiable at $x_0 \in U$. If we fix $e_1,\dots,e_d \in \mathbb{R}^p$, then the map $f(e_1,\dots,e_d) : U \to \mathbb{R}^q$ defined by $f(e_1,\dots,e_d)(x) = f(x)(e_1,\dots,e_d)$ is differentiable at $x_0$ with derivative given by
$$D(f(e_1,\dots,e_d))_{x_0} = Df_{x_0}(e_1,\dots,e_d).$$
If $f$ is $C^1$ on $U$ so is $f(e_1,\dots,e_d)$.
Proof Since $f$ is differentiable at $x_0$, we have
$$f(x_0+h) = f(x_0) + Df_{x_0}(h) + o(h), \quad x_0 + h \in U.$$
This is an equation in $L^d(\mathbb{R}^p;\mathbb{R}^q)$. Evaluating at $E = (e_1,\dots,e_d)$, and writing $f(E)$ for the map $f(e_1,\dots,e_d)$, we obtain
$$f(E)(x_0+h) = f(E)(x_0) + Df_{x_0}(h)(E) + o(h)(E) = f(E)(x_0) + (Df_{x_0}(E))(h) + o(h)(E).$$
Since $\|o(h)(E)\| = \|o(h)(e_1,\dots,e_d)\| \le \|o(h)\|\,\|e_1\|\cdots\|e_d\|$, by our results on multi-linear maps, we have $o(h)(e_1,\dots,e_d) = o(h)$. □
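Lemma 9.10.4 Suppose that $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is twice differentiable at $x_0 \in U$. Then all second-order partial derivatives of $f$ exist at $x_0$ and we have
$$\frac{\partial^2 f}{\partial x_j \partial x_k}(x_0) = D^2f_{x_0}(e_j,e_k), \quad 1 \le j,k \le m,$$
where $\{e_j\}$ denotes the standard basis of $\mathbb{R}^m$. If $f$ is $C^2$ on $U$, then all the second partial derivatives exist and are continuous on $U$.
Proof Applying Lemma 9.10.3 to $Df : U \to L(\mathbb{R}^m,\mathbb{R}^n)$, we see that $Df(e_k)$ is differentiable at $x_0$ with derivative given by
$$D(Df(e_k))_{x_0} = D^2f_{x_0}(\cdot\,, e_k).$$
Evaluating at $e_j$, we get $D(Df(e_k))_{x_0}(e_j) = D^2f_{x_0}(e_j,e_k)$. Since $Df(e_k) = \frac{\partial f}{\partial x_k}$, and
$$D(Df(e_k))_{x_0}(e_j) = \frac{\partial}{\partial x_j}\Big(\frac{\partial f}{\partial x_k}\Big)(x_0) = \frac{\partial^2 f}{\partial x_j \partial x_k}(x_0),$$
it follows that $\frac{\partial^2 f}{\partial x_j \partial x_k}(x_0) = D^2f_{x_0}(e_j,e_k)$. □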
We now come to the main theorem of this section: the symmetry of the second
derivative.
Theorem 9.10.5 If $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is twice differentiable at $x_0 \in U$, then
$$D^2f_{x_0} \in L^2_s(\mathbb{R}^m;\mathbb{R}^n).$$
Proof We have to show that $D^2f_{x_0}(h,k) = D^2f_{x_0}(k,h)$ for all $h,k \in \mathbb{R}^m$. Fix $d > 0$ so that $D_{2d}(x_0) \subset U$ and assume in what follows that $h,k \in D_d(0)$. Taking the product norm on $\mathbb{R}^m \times \mathbb{R}^m$, we have $\|(h,k)\| = \max\{\|h\|,\|k\|\}$. Define the map $S : D_d(0) \times D_d(0) \to \mathbb{R}^n$ by
$$S(h,k) = f(x_0+h+k) - f(x_0+h) - f(x_0+k) + f(x_0).$$
Clearly $S$ is symmetric: $S(h,k) = S(k,h)$ for all $h,k \in D_d(0)$. We prove that if
$$\rho(h,k) = S(h,k) - D^2f_{x_0}(h,k),$$
then $\rho(h,k) = o(\|(h,k)\|^2)$. Specifically, we show that if $\varepsilon > 0$, then there exists a $d_1 \in (0,d]$ such that $\|\rho(h,k)\| \le 4\varepsilon\|(h,k)\|^2$, for all $\|(h,k)\| \le d_1$. It will then follow easily from the symmetry of $S$ and the bilinearity of $D^2f_{x_0}$ that $D^2f_{x_0} \in L^2_s(\mathbb{R}^m;\mathbb{R}^n)$.
Define $g : [0,1] \to \mathbb{R}^n$ by
$$g(t) = f(x_0+h+tk) - f(x_0+tk) - tD^2f_{x_0}(h,k), \quad t \in [0,1].$$
Observe that
$$g(1) - g(0) = S(h,k) - D^2f_{x_0}(h,k).$$
Since $g$ is continuous on $[0,1]$ and differentiable on $(0,1)$, it follows by the mean value theorem (Theorem 9.4.12) that
$$\|g(1) - g(0)\| \le \sup_{t\in(0,1)} \|g'(t)\|.$$
Computing $g'(t)$, we find
$$g'(t) = Df_{x_0+tk+h}(k) - Df_{x_0+tk}(k) - D^2f_{x_0}(h,k) = (Df_{x_0+tk+h}(k) - Df_{x_0}(k)) - (Df_{x_0+tk}(k) - Df_{x_0}(k)) - D^2f_{x_0}(h,k).$$
Since $Df$ is differentiable at $x_0$, given $\varepsilon > 0$, there exists a $d_1 \in (0,d]$ such that if $\|u\| \le 2d_1$,
$$Df_{x_0+u} = Df_{x_0} + D^2f_{x_0}(u) + r(u),$$
where $\|r(u)\| \le \varepsilon\|u\|$. Evaluating this equation in $L(\mathbb{R}^m,\mathbb{R}^n)$ at $k \in \mathbb{R}^m$, we have
$$Df_{x_0+u}(k) = Df_{x_0}(k) + D^2f_{x_0}(u,k) + r(u,k),$$
where $\|r(u,k)\| \le \varepsilon\|u\|\,\|k\|$, for all $\|u\| \le 2d_1$ and $k \in \mathbb{R}^m$. Substituting in our expression for $g'(t)$, with $u = tk+h$ and $u = tk$ (both of norm at most $2d_1$ when $\|(h,k)\| \le d_1$), gives
$$g'(t) = D^2f_{x_0}(tk+h,k) - D^2f_{x_0}(tk,k) - D^2f_{x_0}(h,k) + r(tk+h,k) - r(tk,k) = r(tk+h,k) - r(tk,k),$$
where the first three terms cancel using the bilinearity of $D^2f_{x_0}$. Estimating $\|g'(t)\|$ we see that
$$\|g'(t)\| \le \|r(tk+h,k)\| + \|r(tk,k)\| \le \varepsilon\|k\|(\|tk+h\| + \|tk\|) \le 2\varepsilon\|k\|(\|h\| + \|k\|) \le 4\varepsilon\|(h,k)\|^2, \quad \text{if } \|(h,k)\| \le d_1.$$
Since $\|S(h,k) - D^2f_{x_0}(h,k)\| \le \sup_{t\in(0,1)}\|g'(t)\|$, we have the estimate
$$\|S(h,k) - D^2f_{x_0}(h,k)\| \le 4\varepsilon\|(h,k)\|^2, \quad \|(h,k)\| \le d_1.$$
Since $S(h,k) = S(k,h)$, an application of the triangle inequality yields
$$\|D^2f_{x_0}(h,k) - D^2f_{x_0}(k,h)\| \le 8\varepsilon\|(h,k)\|^2, \quad \|(h,k)\| \le d_1.$$
The bilinearity of $D^2f_{x_0}$ implies that this estimate holds for all $(h,k) \in \mathbb{R}^m \times \mathbb{R}^m$. Since $\varepsilon > 0$ was arbitrary, we have $D^2f_{x_0}(h,k) = D^2f_{x_0}(k,h)$ for all $(h,k) \in \mathbb{R}^m \times \mathbb{R}^m$. □
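The second difference $S(h,k)$ used in the proof can be explored numerically. The following sketch uses a hypothetical polynomial test function on $\mathbb{R}^2$ (not from the text) whose Hessian is computed by hand, and shows $S(th,tk)/t^2$ approaching $D^2f_{x_0}(h,k)$ as $t \to 0$.

```python
def second_difference(f, x0, h, k):
    """S(h, k) = f(x0+h+k) - f(x0+h) - f(x0+k) + f(x0), as in the proof of Theorem 9.10.5."""
    add = lambda u, v: tuple(a + b for a, b in zip(u, v))
    return f(add(add(x0, h), k)) - f(add(x0, h)) - f(add(x0, k)) + f(x0)

def f(x):
    # Hypothetical test function on R^2: f(x, y) = x^2 y + y^3.
    return x[0] ** 2 * x[1] + x[1] ** 3

def D2f(x0, h, k):
    """Second derivative of the test function as a bilinear form (Hessian computed by hand)."""
    a, b = x0
    H = ((2 * b, 2 * a), (2 * a, 6 * b))
    return sum(h[i] * H[i][j] * k[j] for i in range(2) for j in range(2))

def ratio(x0, h, k, t):
    """S(th, tk) / t^2, which should approach D2f(x0, h, k) as t -> 0."""
    th = tuple(t * a for a in h)
    tk = tuple(t * a for a in k)
    return second_difference(f, x0, th, tk) / t ** 2

x0, h, k = (1.0, 2.0), (1.0, 0.0), (0.0, 1.0)
for t in (1e-1, 1e-2, 1e-3):
    print(t, ratio(x0, h, k, t), D2f(x0, h, k))
```

The symmetry $S(h,k) = S(k,h)$ is built into the definition; the content of the theorem is that it passes to the limit.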
Remark 9.10.6 The assumptions of Theorem 9.10.5 are both natural and minimal: the map $f$ is twice differentiable at $x_0$. No assumptions about continuity or the existence of the second derivative on a neighbourhood of $x_0$ are required. Note again the central role of the mean value theorem in the proof.
Corollary 9.10.7 If $f = (f_1,\dots,f_n) : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is twice differentiable at $x_0$, then for all $1 \le i,j \le m$, $1 \le \ell \le n$ we have
$$\frac{\partial^2 f_\ell}{\partial x_i \partial x_j}(x_0) = \frac{\partial^2 f_\ell}{\partial x_j \partial x_i}(x_0).$$
(Symmetry of second-order partial derivatives.)
Proof Theorem 9.10.5, Lemma 9.10.4 and Corollary 9.5.2. □
9.10.2 Higher-Order Derivatives: General Case
Let $1 \le p \le \infty$ and suppose $U \subset \mathbb{R}^m$ is open and $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$. Proceeding inductively, suppose that $f$ is $(p-1)$ times continuously differentiable on $U$ with associated $(p-1)$th derivative map $D^{p-1}f : U \to L^{p-1}(\mathbb{R}^m;\mathbb{R}^n)$. The map $f$ is $p$-times differentiable at $x_0 \in U$, if $D^{p-1}f : U \to L^{p-1}(\mathbb{R}^m;\mathbb{R}^n)$ is differentiable at $x_0$. As usual, we regard $D(D^{p-1}f)_{x_0} \in L(\mathbb{R}^m, L^{p-1}(\mathbb{R}^m;\mathbb{R}^n))$ as defining an element of $L^p(\mathbb{R}^m;\mathbb{R}^n)$ via the natural isomorphism $L(\mathbb{R}^m, L^{p-1}(\mathbb{R}^m;\mathbb{R}^n)) \cong L^p(\mathbb{R}^m;\mathbb{R}^n)$ and set $D(D^{p-1}f)_{x_0} = D^pf_{x_0}$. If $f$ is $p$-times differentiable at every point of $U$, then $f$ is $p$ times differentiable on $U$ and if $D^pf : U \to L^p(\mathbb{R}^m;\mathbb{R}^n)$ is continuous we say $f$ is $p$ times continuously differentiable, or $C^p$, on $U$. If $f$ is $C^p$ for all $p \in \mathbb{N}$, we say $f$ is infinitely differentiable or $C^\infty$.
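Examples 9.10.8
(1) If $A \in L(\mathbb{R}^m,\mathbb{R}^n)$, then $A$ is $C^\infty$ and $D^pA = 0$, $p > 1$.
(2) If $T \in L^d_s(\mathbb{R}^m;\mathbb{R}^n)$ and we define $p \in P^{(d)}(\mathbb{R}^m,\mathbb{R}^n)$ by $p(x) = T(x^d)$, then $p$ is $C^\infty$ and
$$D^rp_x(e_1,\dots,e_r) = \begin{cases} \frac{d!}{(d-r)!}\,T(e_1,\dots,e_r,x^{d-r}), & r \le d,\\ 0, & r > d.\end{cases}$$
The proof is a straightforward induction on $r$ (the case $r = 1$ is Examples 9.9.17(3)).
(3) Let $\Psi : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^p$ be a bilinear map. Then $\Psi$ is $C^\infty$ and for $(x,y) \in \mathbb{R}^m \times \mathbb{R}^n$, $(e_1,f_1), (e_2,f_2) \in \mathbb{R}^m \times \mathbb{R}^n$ we have
$$D_1\Psi_{(x,y)}(e_1) = \Psi(e_1,y), \qquad D_2\Psi_{(x,y)}(f_1) = \Psi(x,f_1),$$
$$D^2\Psi_{(x,y)}((e_1,f_1),(e_2,f_2)) = \Psi(e_1,f_2) + \Psi(e_2,f_1), \qquad D^r\Psi_{(x,y)} = 0, \ r > 2.$$
The result is easily proved directly or by using Proposition 9.5.3.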
Lemma 9.10.9 Let $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ be $p$-times differentiable at $x_0 \in U$ and $e_2,\dots,e_p \in \mathbb{R}^m$. If
$$D^{p-1}f(e_2,\dots,e_p) : U \to \mathbb{R}^n$$
is the map defined by
$$(D^{p-1}f(e_2,\dots,e_p))(x) = D^{p-1}f_x(e_2,\dots,e_p), \quad x \in U,$$
then $D^{p-1}f(e_2,\dots,e_p)$ is differentiable at $x_0$ with derivative given by
$$D(D^{p-1}f(e_2,\dots,e_p))_{x_0}(e_1) = D^pf_{x_0}(e_1,\dots,e_p), \quad e_1 \in \mathbb{R}^m.$$
Proof Apply the evaluation lemma (Lemma 9.10.3) to $D^{p-1}f$. □
Theorem 9.10.10 (Symmetry of Higher-Order Derivatives) If $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ is $p$-times differentiable at $x_0$, then $D^pf_{x_0} \in L^p_s(\mathbb{R}^m;\mathbb{R}^n)$.
Proof We prove the result by induction on $p$. The result is true when $p = 2$ (Theorem 9.10.5). Suppose the result is proved for derivatives of order less than or equal to $p-1$. Let $e_1,\dots,e_p \in \mathbb{R}^m$. By Lemma 9.10.9, the map $D^{p-1}f(e_2,\dots,e_p) : U \to \mathbb{R}^n$ is differentiable at $x_0$ with derivative given by
$$D(D^{p-1}f(e_2,\dots,e_p))_{x_0}(e_1) = D^pf_{x_0}(e_1,\dots,e_p).$$
By the inductive hypothesis, $D^pf_{x_0}$ is symmetric in the last $p-1$ variables. By Lemma 9.10.9 again, the map $D^{p-2}f(e_3,\dots,e_p) : U \to \mathbb{R}^n$ is twice differentiable at $x_0$ and
$$D^2(D^{p-2}f(e_3,\dots,e_p))_{x_0}(e_1,e_2) = D^pf_{x_0}(e_1,\dots,e_p).$$
By Theorem 9.10.5, $D^pf_{x_0}$ is symmetric in the first two variables. Combining this with the symmetry in the last $p-1$ variables, it follows that $D^pf_{x_0}$ is symmetric. □
Our final result in this section shows the relationship between higher-order derivatives and higher-order partial derivatives.
Theorem 9.10.11 Let $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$ be $p$-times differentiable at $x_0$ and $\{e_i\}_{i=1}^m$ denote the standard basis of $\mathbb{R}^m$. Given $\alpha \in \mathbb{N}^m$, we have
(1) $\partial^\alpha f(x_0) \overset{\mathrm{def}}{=} \dfrac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1}\cdots\partial x_m^{\alpha_m}}(x_0) = D^{|\alpha|}f_{x_0}(e^\alpha)$.
(2) The higher-order partial derivatives of $f$ are independent of the order of differentiation.
(3) If all the partial derivatives $\partial^\alpha f$ exist and are continuous on $U$, $|\alpha| \le p$, then $f$ is $C^p$.
Proof Part (1) follows by definition of partial derivative, and (2,3) follow by induction on $p$, Corollary 9.5.6 and Theorem 9.10.10. □
9.11 Extension of Results from $C^1$ to $C^r$-Maps
In this section we extend some of the main results proved in Sects. 9.4.3 and 9.6 to $C^r$-maps. We conclude with statements and proofs of a version of Leibniz’ law for the $r$th derivative of a product and a vector-valued version of Faà di Bruno’s formula for the $r$th derivative of a composite of vector-valued functions.
We start with the $C^r$ version of the chain rule.
Theorem 9.11.1 Let $r \ge 1$ and suppose that $f : U \subset \mathbb{R}^m \to \mathbb{R}^n$, $g : V \subset \mathbb{R}^n \to \mathbb{R}^p$ are $C^r$. Then $g \circ f : U \cap f^{-1}(V) \to \mathbb{R}^p$ is $C^r$.
Proof We indicate two proofs.
Method 1: Induction on $r$. The case $r = 1$ is Theorem 9.4.8. Let $\Psi : L(\mathbb{R}^n,\mathbb{R}^p) \times L(\mathbb{R}^m,\mathbb{R}^n) \to L(\mathbb{R}^m,\mathbb{R}^p)$ be the bilinear map defined by $\Psi(A,B) = A \circ B$. Since $\Psi$ is bilinear, $\Psi$ is $C^\infty$ (Examples 9.10.8(3)). Suppose the result has been proved for $r-1$ (where $r \ge 2$). The map $(Dg) \circ f : U \cap f^{-1}(V) \to L(\mathbb{R}^n,\mathbb{R}^p)$ is a composition of $C^{r-1}$ maps and so, by the inductive hypothesis, $(Dg) \circ f$ is $C^{r-1}$. We have $D(g \circ f) = \Psi((Dg) \circ f, Df)$. Since $\Psi$ is $C^\infty$ and $(Dg) \circ f$ and $Df$ are $C^{r-1}$, it follows by the inductive hypothesis that $D(g \circ f)$ is $C^{r-1}$. Hence $g \circ f$ is $C^r$.
Method 2: Coordinates and partial derivatives. An inductive argument shows that an $s$th order partial derivative of the composite $g \circ f$ will be a polynomial of degree $s$ in the partial derivatives of $f$ of order at most $s$ with coefficients depending linearly on partial derivatives of $g$ of order less than or equal to $s$. If $f$ and $g$ are $C^r$ it follows that all partial derivatives of $g \circ f$ of order less than or equal to $r$ exist and are continuous. Now apply Theorem 9.10.11(3). □
9.11.1 The Inverse and Implicit Function Theorems
Let $U \subset \mathbb{R}^m$, $V \subset \mathbb{R}^m$ be open non-empty sets. Suppose that $1 \le r \le \infty$. A map $f : U \to V$ is a $C^r$ diffeomorphism (of $U$ onto $V$) if
(1) $f$ is 1:1 onto.
(2) Both $f$ and $f^{-1}$ are $C^r$.
As we showed earlier, if $f : U \to V$ is a $C^r$ diffeomorphism then $D(f^{-1})_{f(x)} = (Df_x)^{-1}$ for all $x \in U$.
We start with a simple extension of Lemma 9.6.2.
Lemma 9.11.2 The map $\beta : GL(\mathbb{R},n) \to GL(\mathbb{R},n)$, $\beta(A) = A^{-1}$, is $C^\infty$.
Proof The proof of Lemma 9.6.2 already shows that partial derivatives of all orders of $\beta$ with respect to the components $a_{ij}$ of $A$ exist and are continuous. The result follows by Theorem 9.10.11. □
Lemma 9.11.3 Let $U \subset \mathbb{R}^m$, $V \subset \mathbb{R}^m$ be open non-empty sets. Suppose that $f : U \to V$ is a $C^r$-map, $r \ge 1$, which is a $C^1$ diffeomorphism. Then $f$ is a $C^r$ diffeomorphism.
Proof The proof is by induction on $r$. Suppose the result is true for $r-1$. We may write $D(f^{-1})$ as the composite $\beta \circ (Df) \circ f^{-1}$ (that is, $D(f^{-1})_y = (Df_{f^{-1}(y)})^{-1}$ for all $y \in V$). Since $\beta$ is $C^\infty$, $Df$ is $C^{r-1}$, and $f^{-1}$ is $C^{r-1}$ (inductive hypothesis), Theorem 9.11.1 implies that $D(f^{-1})$ is $C^{r-1}$. Hence, $f^{-1}$ is $C^r$. □
Theorem 9.11.4 (The Inverse Function Theorem for $C^r$-Maps) Let $W$ be an open subset of $\mathbb{R}^m$ and $f : W \to \mathbb{R}^m$ be $C^r$, where $1 \le r \le \infty$. If $Df_{x_0}$ is invertible at $x_0 \in W$, then we can find open neighbourhoods $U \subset W$ of $x_0$, and $V$ of $f(x_0)$ such that
(1) $f$ maps $U$ 1:1 onto $V$. In particular, $V = f(U)$ is open.
(2) $f : U \to V$ is a $C^r$ diffeomorphism.
Proof Immediate from the $C^1$ version of the inverse function theorem (Theorem 9.6.5) and Lemma 9.11.3. □
Once we have the $C^r$ version of the inverse function theorem, the proofs of the implicit function theorem, the dual implicit function theorem and the rank theorem all extend immediately to give $C^r$ versions of these results.
9.12 Taylor’s Theorem
In this section we prove versions of Taylor’s theorem for vector-valued maps.
Theorem 9.12.1 (Taylor’s Theorem, Version 1) Let $U \subset \mathbb{R}^m$ be open and suppose $f : U \to \mathbb{R}^n$ is $p$-times differentiable at $x \in U$. If we define the remainder term $r(h)$ by
$$r(h) = f(x+h) - \sum_{r=0}^{p} \frac{1}{r!} D^rf_x(h^r),$$
then $r(h) = o(\|h\|^p)$. That is, the function $T^pf_x(h) = \sum_{r=0}^{p} \frac{1}{r!} D^rf_x(h^r)$ gives an approximation to $f$ at $x$ of order $p$.
Proof The proof is by induction on $p$. The result is true for $p = 1$ by the definition of derivative. Suppose the result is true for $p-1$, $p \ge 2$. Fix $d > 0$ so that $D_d(x) \subset U$. Define $S : D_d(0) \to \mathbb{R}^n$ by
$$S(k) = f(x+k) - f(x).$$
Since $S$ is $p$-times differentiable at $0$ we have
$$D^rS_0 = D^rf_x, \quad 1 \le r \le p.$$
Substituting in the defining equation for $r(h)$, we have
$$S(k) = \sum_{j=1}^{p} \frac{1}{j!} D^jS_0(k^j) + r(k).$$
Differentiating with respect to $k$ and setting $DS_h = g(h)$ gives
$$g(h) = \sum_{j=0}^{p-1} \frac{1}{j!} D^jg_0(h^j) + Dr_h.$$
Hence, by the inductive hypothesis, given $\varepsilon > 0$, there exists a $\delta > 0$ such that if $\|h\| < \delta$, then $\|Dr_h\| \le \varepsilon\|h\|^{p-1}$. By the mean value theorem we have
$$\|r(h)\| = \|r(h) - r(0)\| \le \|h\| \sup_{0<t<1} \|Dr_{th}\|.$$
Since $\|Dr_{th}\| \le \varepsilon\|th\|^{p-1} \le \varepsilon\|h\|^{p-1}$, we have
$$\|r(h)\| \le \varepsilon\|h\|^p, \quad \|h\| < \delta.$$
Since $\varepsilon > 0$ was arbitrary, the result follows. □
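Corollary 9.12.2 Let $U$ be an open subset of $\mathbb{R}^m$ and suppose that $f : U \to \mathbb{R}$ is $p$ times differentiable at $x$. We have
$$f(x+h) = f(x) + \sum_{r=1}^{p} \frac{1}{r!}\Big(\sum_{|\alpha| = r} \frac{|\alpha|!}{\alpha!}\, \partial^\alpha f(x)\, h^\alpha\Big) + r(h),$$
where $x + h \in U$ and $r(h) = o(\|h\|^p)$. (The factor $\frac{1}{r!}$ comes from Theorem 9.12.1 and the inner sum is $D^rf_x(h^r)$ expanded by Corollary 9.9.22; equivalently, the coefficient of $\partial^\alpha f(x)h^\alpha$ is $\frac{1}{\alpha!}$.)
Remark 9.12.3 In classical partial derivative notation, the expression for $f(x+h)$ given by Corollary 9.12.2 is
$$f(x) + \sum_{r=1}^{p} \frac{1}{r!}\Big(\sum_{|\alpha| = r} \frac{|\alpha|!}{\alpha!}\, \frac{\partial^r f}{\partial x_1^{\alpha_1}\cdots\partial x_m^{\alpha_m}}(x)\, h_1^{\alpha_1}\cdots h_m^{\alpha_m}\Big) + r(h).$$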
Example 9.12.4 We find the quadratic approximation at $x = 0$ given by Taylor’s theorem to $f(x_1,x_2,x_3) = e^{x_1}\cos x_2 \sin x_3$. Listing the partial derivatives of order at most 2 at $x = 0$ we have
$$f(0) = 0, \qquad \frac{\partial f}{\partial x_1}(0) = \frac{\partial f}{\partial x_2}(0) = 0, \qquad \frac{\partial f}{\partial x_3}(0) = 1,$$
$$\frac{\partial^2 f}{\partial x_i \partial x_j}(0) = 0, \ \text{if } (i,j) \ne (1,3), (3,1), \qquad \frac{\partial^2 f}{\partial x_1 \partial x_3}(0) = 1.$$
Applying Corollary 9.12.2, we see that
$$Q(x_1,x_2,x_3) = x_3 + x_1x_3$$
is the quadratic approximation at $x = 0$ given by Taylor’s theorem.
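A quick numerical check of this example: since $Q$ is the order-2 Taylor approximation, the error $|f(h) - Q(h)|$ divided by $\|h\|^2$ should tend to zero as $h \to 0$. The sketch below is illustrative; the helper name `relative_error` and the direction $(1,1,1)$ are choices made here, not part of the text.

```python
import math

def f(x1, x2, x3):
    return math.exp(x1) * math.cos(x2) * math.sin(x3)

def Q(x1, x2, x3):
    """Quadratic approximation at 0 from Example 9.12.4."""
    return x3 + x1 * x3

def relative_error(t, direction=(1.0, 1.0, 1.0)):
    """|f(h) - Q(h)| / ||h||^2 for h = t * direction (Euclidean norm)."""
    h = tuple(t * c for c in direction)
    norm_sq = sum(c * c for c in h)
    return abs(f(*h) - Q(*h)) / norm_sq

for t in (1e-1, 1e-2, 1e-3):
    print(t, relative_error(t))
```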
Theorem 9.12.5 (Scalar-Valued Taylor’s Theorem) Let $U \subset \mathbb{R}^m$ be open and suppose $f : U \to \mathbb{R}$ is $p$-times differentiable on the line segment $[x,x+h] \subset U$. Then there exists a $\theta \in (0,1)$ such that
$$f(x+h) = \sum_{r=0}^{p-1} \frac{1}{r!} D^rf_x(h^r) + \frac{1}{p!} D^pf_{x+\theta h}(h^p).$$
Proof Define $\gamma : [0,1] \to U \subset \mathbb{R}^m$ by $\gamma(t) = x + th$, $t \in [0,1]$. If we set $F = f \circ \gamma : [0,1] \to \mathbb{R}$, then $F$ is $p$-times differentiable on $(0,1)$ and
$$F^{(r)}(t) = D^rf_{x+th}(h^r), \quad r = 1,\dots,p.$$
Applying the classical Taylor’s theorem (Theorem 2.7.11(a)) to $F$ gives the result. □
We conclude this section with an explicit, and very useful, form for the remainder term in Taylor’s theorem. In this case we assume $f$ is $C^p$ on a neighbourhood of $x$.
Theorem 9.12.6 (Taylor’s Theorem, Integral Remainder) Let $U \subset \mathbb{R}^m$ be open and $f : U \to \mathbb{R}^n$ be $C^p$. If $[x,x+h] \subset U$, then
$$f(x+h) = \sum_{r=0}^{p-1} \frac{1}{r!} D^rf_x(h^r) + \int_0^1 \frac{(1-s)^{p-1}}{(p-1)!}\, D^pf_{x+sh}(h^p)\, ds.$$
Proof The integral $\int_0^1 \frac{(1-s)^{p-1}}{(p-1)!} D^pf_{x+sh}(h^p)\, ds$ is defined to be the integral of the components of $\frac{(1-s)^{p-1}}{(p-1)!} D^pf_{x+sh}(h^p) \in \mathbb{R}^n$. The result is proved by a simple induction on $p$ and uses integration by parts. We leave the details to the exercises. □
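In the one-variable case $m = n = 1$ we have $D^rf_x(h^r) = f^{(r)}(x)h^r$, and the integral remainder formula can be checked by numerical quadrature. The following is a rough sketch under that assumption; the composite Simpson rule and the helper names are illustrative choices, not part of the text.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3

def taylor_with_integral_remainder(derivs, x, h, p):
    """Right-hand side of Theorem 9.12.6 for m = n = 1; derivs[r] is the r-th derivative of f."""
    poly = sum(derivs[r](x) * h ** r / math.factorial(r) for r in range(p))
    integrand = lambda s: ((1 - s) ** (p - 1) / math.factorial(p - 1)
                           * derivs[p](x + s * h) * h ** p)
    return poly + simpson(integrand, 0.0, 1.0)

# f = exp: every derivative is exp.
exp_value = taylor_with_integral_remainder([math.exp] * 4, 0.0, 1.0, 3)

# f = sin: derivatives cycle through sin, cos, -sin, -cos.
sin_derivs = [math.sin, math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t)]
sin_value = taylor_with_integral_remainder(sin_derivs, 0.3, 0.5, 3)

print(exp_value, math.e)
print(sin_value, math.sin(0.8))
```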
EXERCISES 9.12.7
(1) Find the cubic approximation at x D 0 given by Taylor’s theorem to f .x1 ; x2 / D
2
.sin.x C y2 /; yex ; ex Cy /.
(2) What is the Taylor series at $x = 0$ of $f(x,y) = \Phi(x)\Phi(y)$, where $\Phi$ is the $C^\infty$-function defined in Proposition 5.2.3?
(3) Show that if we assume the conditions of Theorem 9.12.6 and define the remainder
$$r(h) = f(x+h) - \Big(\sum_{r=0}^{p-1} \frac{1}{r!} D^rf_x(h^r)\Big),$$
then $r(h) = o(\|h\|^\alpha)$ for all $\alpha \in (0,p)$.
(4) Complete the proof of Theorem 9.12.6.
9.13 The Leibniz Rule and Faà di Bruno’s Formula
Recall that if $f, g : \mathbb{R} \to \mathbb{R}$ are $C^r$ and we denote the product of $f$ and $g$ by $fg$, then the Leibniz rule states that $fg$ is $C^r$ with $r$th derivative given by
$$(fg)^{(r)}(x) = \sum_{j=0}^{r} \binom{r}{j} f^{(j)}(x)\, g^{(r-j)}(x).$$
Suppose now that $f, g : U \subset \mathbb{R}^m \to \mathbb{R}$ are $C^r$, $r \ge 1$. Using either Exercises 9.4.15(6) or a simple argument based on the chain rule ($(fg)(x) = \Psi(f(x),g(x))$, where $\Psi(x,y) = xy$), we find that $fg$ is differentiable with derivative
$$D(fg)_x = g(x)Df_x + f(x)Dg_x, \quad x \in U.$$
When we come to the second derivative of $fg$, the terms involving first derivatives of $f$ and $g$ ($Df_xDg_x$ and $Dg_xDf_x$) will not be symmetric. Similar problems arise for all higher derivatives. The way we handle this is to symmetrize the terms involving derivatives of both $f$ and $g$.
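For $p, q \ge 0$, we define the symmetrization operator $\star : L^p_s(\mathbb{R}^m;\mathbb{R}) \times L^q_s(\mathbb{R}^m;\mathbb{R}) \to L^{p+q}_s(\mathbb{R}^m;\mathbb{R})$ by
$$(A \star B)(e_1,\dots,e_{p+q}) = \frac{1}{(p+q)!} \sum\nolimits_1 A(e_{\sigma(1)},\dots,e_{\sigma(p)})\,B(e_{\sigma(p+1)},\dots,e_{\sigma(p+q)})$$
$$= \frac{p!\,q!}{(p+q)!} \sum\nolimits_2 A(e_{\sigma(1)},\dots,e_{\sigma(p)})\,B(e_{\sigma(p+1)},\dots,e_{\sigma(p+q)}),$$
where $\sum_1$ is the sum over all permutations $\sigma \in S_{p+q}$ and $\sum_2$ is the sum over all permutations $\sigma \in S_{p+q}$ such that $\sigma(1) < \sigma(2) < \cdots < \sigma(p)$ and $\sigma(p+1) < \cdots < \sigma(p+q)$.
Remarks 9.13.1
(1) $A \star B = B \star A$, for all $A \in L^p_s(\mathbb{R}^m;\mathbb{R})$, $B \in L^q_s(\mathbb{R}^m;\mathbb{R})$.
(2) If $q = 0$, then $B \in \mathbb{R}$ and $A \star B = BA \in L^p_s(\mathbb{R}^m;\mathbb{R})$.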
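The two expressions for $A \star B$ can be compared directly for small $p$ and $q$. The sketch below is illustrative: the linear form `A`, the symmetric bilinear form `B`, the test vectors and both function names are hypothetical. It computes the sum over all of $S_{p+q}$ and the sum over the restricted permutations (encoded as a choice of which $p$ arguments go to $A$), using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial

def star_all_permutations(A, p, B, q, es):
    """(A * B)(e_1..e_{p+q}) as the average over all permutations (the sum labelled 1)."""
    n = p + q
    total = sum(A(*[es[s] for s in sigma[:p]]) * B(*[es[s] for s in sigma[p:]])
                for sigma in permutations(range(n)))
    return Fraction(total, factorial(n))

def star_shuffles(A, p, B, q, es):
    """The same quantity via the sum labelled 2: increasing on the first p and last q slots."""
    n = p + q
    total = 0
    for first in combinations(range(n), p):  # sorted, so sigma(1) < ... < sigma(p)
        rest = [i for i in range(n) if i not in first]  # increasing automatically
        total += A(*[es[i] for i in first]) * B(*[es[i] for i in rest])
    return Fraction(factorial(p) * factorial(q) * total, factorial(n))

# Hypothetical symmetric forms on R^2: a linear form A (p = 1) and a bilinear form B (q = 2).
def A(e):
    return 2 * e[0] + 3 * e[1]

def B(u, v):
    return u[0] * v[0] + 2 * (u[0] * v[1] + u[1] * v[0]) - u[1] * v[1]

es = [(1, 2), (3, -1), (4, 5)]
print(star_all_permutations(A, 1, B, 2, es), star_shuffles(A, 1, B, 2, es))
```

The agreement relies on the symmetry of `A` and `B`; for asymmetric inputs only the first expression is symmetric in general.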
We may now give the general version of the Leibniz rule.
Theorem 9.13.2 (Leibniz Rule) Let $f, g : U \subset \mathbb{R}^m \to \mathbb{R}$ be $C^r$, $r \ge 1$. Then $fg$ is $C^r$ with $r$th derivative given by
$$D^r(fg)_x = \sum_{j=0}^{r} \binom{r}{j} D^jf_x \star D^{(r-j)}g_x, \quad x \in U.$$
Proof The proof is a simple induction on $r$ using Lemma 9.10.9. We leave the details (and generalizations) to the exercises. □
u
9.13.1 Faà di Bruno’s Formula
In this section we give a formula for the $r$th derivative of a composite of vector-valued maps. In the case of real-valued maps, there is the following result attributed to Faà di Bruno [7] (see the historical notes at the end of the section).
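Theorem 9.13.3 (Faà di Bruno’s Formula) Let $f, g : \mathbb{R} \to \mathbb{R}$ be $C^r$, $r \ge 1$. We have
$$(g \circ f)^{(r)}(x) = \sum \frac{r!}{q_1!\cdots q_r!}\, g^{(q)}(f(x)) \prod_{j=1}^{r} \Big(\frac{f^{(j)}(x)}{j!}\Big)^{q_j},$$
where the sum is over all $(q_1,\dots,q_r) \in \mathbb{Z}^r_+$ satisfying $\sum_{j=1}^r jq_j = r$ and $q_1 + \cdots + q_r = q$.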
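The formula can be evaluated mechanically once the tuples $(q_1,\dots,q_r)$ are enumerated. The sketch below is a hedged illustration: the function names and the calling convention (lists indexed from 1, holding integer derivative values at a point) are assumptions made here. Two checks are included: for $f(x) = x^2$, $g(y) = y^3$ the third derivative of $x^6$ at $x = 1$ is $120$, and when all derivative values equal $1$ the formula returns the Bell numbers.

```python
from fractions import Fraction
from math import factorial

def partitions_q(r):
    """Yield all (q_1, ..., q_r) of non-negative integers with sum_j j * q_j = r."""
    def rec(j, remaining):
        if j > r:
            if remaining == 0:
                yield ()
            return
        for qj in range(remaining // j + 1):
            for tail in rec(j + 1, remaining - j * qj):
                yield (qj,) + tail
    yield from rec(1, r)

def faa_di_bruno(g_derivs, f_derivs, r):
    """r-th derivative of g o f at x, given g_derivs[q] = g^(q)(f(x)) and
    f_derivs[j] = f^(j)(x) as integers, for indices 1..r (index 0 is unused)."""
    total = Fraction(0)
    for qs in partitions_q(r):
        q = sum(qs)
        term = Fraction(factorial(r)) * g_derivs[q]
        for j, qj in enumerate(qs, start=1):
            term *= Fraction(f_derivs[j], factorial(j)) ** qj / factorial(qj)
        total += term
    return total

# f(x) = x^2, g(y) = y^3 at x = 1: f' = 2, f'' = 2, f''' = 0; g' = 3, g'' = 6, g''' = 6 at y = 1.
third = faa_di_bruno([None, 3, 6, 6], [None, 2, 2, 0], 3)

# With every derivative value equal to 1 the formula counts set partitions (Bell numbers).
bell = [faa_di_bruno([None] + [1] * r, [None] + [1] * r, r) for r in range(1, 6)]
print(third, bell)
```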
We shall give three versions of Faà di Bruno’s formula for vector-valued maps. Before stating these we need to define some new symmetrization operators. Fix $r \in \mathbb{N}$ and suppose we are given integers $q_1,\dots,q_r \ge 0$ such that $r = q_1 + 2q_2 + \cdots + rq_r$. Set $q = q_1 + \cdots + q_r$ and note that $1 \le q \le r$. Set $M_0 = 0$ and for $1 \le j \le r$, define $M_j = q_1 + 2q_2 + \cdots + jq_j$ and
$$M_{ij} = M_{j-1} + ij, \quad i = 0,\dots,q_j - 1.$$
Note that $M_{i1} = i$, $i = 0,\dots,q_1 - 1$.
Recall that $S_r$ is the symmetric group on $r$ symbols. We define two subsets of $S_r$. First, let $S_r^\star(q)$ be the subset of $S_r$ consisting of permutations $\sigma$ such that
$$\sigma(M_{ij}+1) < \sigma(M_{ij}+2) < \cdots < \sigma(M_{ij}+j), \tag{9.15}$$
for $j = 1,\dots,r$ and $i = 0,\dots,q_j - 1$.
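Second, let $S_r^\circledast(q)$ be the subset of $S_r^\star(q)$ consisting of permutations $\sigma$ which, in addition to (9.15), order the blocks of equal length by their first entries:
$$\sigma(M_{0j}+1) < \sigma(M_{1j}+1) < \cdots < \sigma(M_{q_j-1,j}+1), \tag{9.16}$$
for $j = 1,\dots,r$. Note that $S_r^\circledast(q) \subset S_r^\star(q)$. It is elementary to show that the cardinalities of $S_r^\star(q)$ and $S_r^\circledast(q)$ are given by
$$|S_r^\star(q)| = \frac{r!}{\prod_{j=1}^r (j!)^{q_j}}, \qquad |S_r^\circledast(q)| = \frac{r!}{\prod_{j=1}^r q_j!\, \prod_{j=1}^r (j!)^{q_j}}. \tag{9.17}$$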
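The counts in (9.17) can be confirmed by brute force for small $r$. The sketch below rests on an assumption about the second set: it is read as those permutations satisfying (9.15) for which, in addition, the first entries of blocks of equal length are increasing. This is the reading under which the second count in (9.17) holds. All function names are hypothetical, and positions are 0-based.

```python
from itertools import permutations
from math import factorial, prod

def blocks(qs):
    """Index blocks: for each j, q_j consecutive blocks of length j (0-based positions)."""
    out, start = [], 0
    for j, qj in enumerate(qs, start=1):
        for _ in range(qj):
            out.append(tuple(range(start, start + j)))
            start += j
    return out

def in_star(sigma, qs):
    """Condition (9.15): sigma is increasing on every block."""
    return all(sigma[b[i]] < sigma[b[i + 1]]
               for b in blocks(qs) for i in range(len(b) - 1))

def in_second(sigma, qs):
    """Assumed reading of (9.16): (9.15) holds and first entries of equal-length blocks increase."""
    if not in_star(sigma, qs):
        return False
    bs = blocks(qs)
    return all(sigma[a[0]] < sigma[b[0]] for a, b in zip(bs, bs[1:]) if len(a) == len(b))

def counts(qs):
    """Brute-force cardinalities of the two subsets of S_r."""
    r = sum(j * qj for j, qj in enumerate(qs, start=1))
    perms = list(permutations(range(r)))
    return (sum(in_star(s, qs) for s in perms), sum(in_second(s, qs) for s in perms))

def formula(qs):
    """The two cardinalities predicted by (9.17)."""
    r = sum(j * qj for j, qj in enumerate(qs, start=1))
    star = factorial(r) // prod(factorial(j) ** qj for j, qj in enumerate(qs, start=1))
    return (star, star // prod(factorial(qj) for qj in qs))

for qs in [(0, 2, 0, 0), (2, 1, 0, 0), (3, 0, 0), (0, 0, 2, 0, 0, 0)]:
    print(qs, counts(qs), formula(qs))
```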
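Suppose given multi-linear maps $A_j \in L^j_s(\mathbb{R}^m;\mathbb{R}^n)$, $1 \le j \le r$. For $k \in \mathbb{N}$, let $A_j^k$ denote the $k$-tuple $(A_j,\dots,A_j) \in \times^k L^j_s(\mathbb{R}^m;\mathbb{R}^n) = L^j_s(\mathbb{R}^m;\mathbb{R}^n)^k$. Set $A = A(q) = (A_1^{q_1},\dots,A_r^{q_r}) \in \prod_{j=1}^r L^j_s(\mathbb{R}^m;\mathbb{R}^n)^{q_j}$ (if $q_j = 0$, we omit the corresponding term from the product). We view $A$ as the $r$-linear mapping from $\mathbb{R}^m$ to $\times^q\mathbb{R}^n$ defined by
$$A(E_1,\dots,E_r) = (A_1^{q_1}(E_1), A_2^{q_2}(E_2),\dots,A_r^{q_r}(E_r)),$$
where $E_j = (e_{M_{j-1}+1},\dots,e_{M_j}) \in \times^{jq_j}\mathbb{R}^m$, $1 \le j \le r$.
If $\sigma \in S_r$, define $\sigma : \times^r\mathbb{R}^m \to \times^r\mathbb{R}^m$ by
$$\sigma(e_1,\dots,e_r) = (e_{\sigma(1)},\dots,e_{\sigma(r)}), \quad (e_1,\dots,e_r) \in \times^r\mathbb{R}^m.$$
Given $A$ as above, we define $A\sigma$ to be the $r$-linear mapping from $\mathbb{R}^m$ to $\times^q\mathbb{R}^n$ defined by
$$(A\sigma)(e_1,\dots,e_r) = A(\sigma(e_1,\dots,e_r)), \quad (e_1,\dots,e_r) \in \times^r\mathbb{R}^m.$$
In terms of the $jq_j$-tuples $E_j = (e_{M_{j-1}+1},\dots,e_{M_j})$, if we define
$$E_j^\sigma = (e_{\sigma(M_{j-1}+1)},\dots,e_{\sigma(M_j)}),$$
then $(A\sigma)(E_1,\dots,E_r) = A(E_1^\sigma,\dots,E_r^\sigma)$.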
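Now suppose that $B \in L^q_s(\mathbb{R}^n;\mathbb{R}^p)$. We will combine $A$ and $B$ in three different ways to define $r$-linear symmetric maps from $\mathbb{R}^m$ to $\mathbb{R}^p$. Define the $r$-linear mappings $B \star A$, $B \diamond A$, $B \circledast A$ by
$$B \star A(e_1,\dots,e_r) = \frac{1}{r!} \sum_{\sigma\in S_r} B(A\sigma(e_1,\dots,e_r)),$$
$$B \diamond A(e_1,\dots,e_r) = \frac{1}{q_1!\cdots q_r!} \sum_{\sigma\in S_r^\star(q)} B(A\sigma(e_1,\dots,e_r)),$$
$$B \circledast A(e_1,\dots,e_r) = \sum_{\sigma\in S_r^\circledast(q)} B(A\sigma(e_1,\dots,e_r)).$$
(For the definitions of $S_r^\star(q)$, $S_r^\circledast(q)$, see (9.15), (9.16).)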
Lemma 9.13.4 (Assumptions and Notation as Above)
(1) $B \star A,\ B \diamond A,\ B \circledast A \in L^r_s(\mathbb{R}^m;\mathbb{R}^p)$.
(2) $B \circledast A = B \diamond A = \dfrac{r!}{\prod_{j=1}^r q_j!\, \prod_{j=1}^r (j!)^{q_j}}\, B \star A$.
Proof Clearly $B \star A$ is symmetric without requiring any symmetry conditions on $B$ or $A$. Indeed, if $\tau \in S_r$, $(e_1,\dots,e_r) \in \times^r\mathbb{R}^m$, we have
$$(B \star A)(\tau(e_1,\dots,e_r)) = \frac{1}{r!} \sum_{\sigma\in S_r} B(A\sigma(\tau(e_1,\dots,e_r))) = \frac{1}{r!} \sum_{\eta\in S_r} B(A\eta(e_1,\dots,e_r)) = (B \star A)(e_1,\dots,e_r),$$
since every permutation $\eta \in S_r$ can be written uniquely as $\sigma\tau$ (take $\sigma = \eta\tau^{-1}$). Hence $B \star A \in L^r_s(\mathbb{R}^m;\mathbb{R}^p)$. For $B \diamond A$ to be symmetric, it suffices that each $A_j$ is a symmetric $j$-linear map. For $B \circledast A$ to be symmetric we also need $B$ to be a symmetric $q$-linear map. Finally, the relation between $B \star A$, $B \diamond A$ and $B \circledast A$ follows easily from (9.17). □
Remark 9.13.5 Note that all the combinatorial coefficients in Faà di Bruno’s formula occur in Lemma 9.13.4.
Example 9.13.6 Suppose that $f, g : \mathbb{R} \to \mathbb{R}$ are $C^r$. Suppose given $q_1,\dots,q_r \ge 0$, satisfying $q_1 + \cdots + q_r = q$, $q_1 + 2q_2 + \cdots + rq_r = r$. We have
$$(D^qg) \circ f \circledast ((D^1f)^{q_1},\dots,(D^rf)^{q_r}) = \frac{r!}{\prod_{j=1}^r q_j!\, \prod_{j=1}^r (j!)^{q_j}}\, (D^qg) \circ f \star ((D^1f)^{q_1},\dots,(D^rf)^{q_r}).$$
Evaluating the expression on the right-hand side at $x$ and $1^r$ gives
$$\frac{r!}{\prod_{j=1}^r q_j!\, \prod_{j=1}^r (j!)^{q_j}}\, g^{(q)}(f(x)) \prod_{j=1}^r f^{(j)}(x)^{q_j} = \frac{r!}{q_1!\cdots q_r!}\, g^{(q)}(f(x)) \prod_{j=1}^r \Big(\frac{f^{(j)}(x)}{j!}\Big)^{q_j},$$
which is the general term in Faà di Bruno’s formula.
Theorem 9.13.7 Let $f : \mathbb{R}^m \to \mathbb{R}^n$, $g : \mathbb{R}^n \to \mathbb{R}^p$ be $C^r$-maps. Then
$$D^r(g \circ f) = \sum (D^qg) \circ f \diamond ((D^1f)^{q_1},\dots,(D^rf)^{q_r})$$
$$= \sum (D^qg) \circ f \circledast ((D^1f)^{q_1},\dots,(D^rf)^{q_r})$$
$$= \sum \frac{r!}{\prod_{j=1}^r q_j!}\, (D^qg) \circ f \star \Big(\Big(\frac{D^1f}{1!}\Big)^{q_1},\dots,\Big(\frac{D^rf}{r!}\Big)^{q_r}\Big),$$
where in each case the sum is over all $q_1,\dots,q_r \ge 0$ such that $q_1 + 2q_2 + \cdots + rq_r = r$ and $q_1 + \cdots + q_r = q$.
Remarks 9.13.8
(1) It follows from Example 9.13.6 that the third form for $D^r(g \circ f)$ gives Faà di Bruno’s formula (Theorem 9.13.3).
(2) The second form of Theorem 9.13.7 gives the most economical and natural expression for the terms in $D^r(g \circ f)$. For example, forms 1 and 2 give for $q = 1$ the term $(Dg) \circ f\, D^rf$ while form 3 involves an unnecessary symmetrization of $D^rf$. Turning to the final term $q = r$ in the sum, the second form gives the term $(D^rg) \circ f\, (Df)^r$ while the first and third form give $\frac{1}{r!}\sum_{\sigma\in S_r} (D^rg) \circ f\,(Df(e_{\sigma(1)}),\dots,Df(e_{\sigma(r)}))$.
Proof of Theorem 9.13.7 (sketch). It follows from Lemma 9.13.4 that the three expressions for $D^r(g \circ f)$ are equal.
It suffices to verify the third form. Although proof by induction might appear to be an attractive approach, it is quickly seen that induction does not work well. Instead, we start by observing that we can write
$$D^r(g \circ f) = \sum a_{q_1q_2\cdots q_r}\, (D^qg) \circ f \star \Big(\Big(\frac{D^1f}{1!}\Big)^{q_1},\dots,\Big(\frac{D^rf}{r!}\Big)^{q_r}\Big),$$
where $q_1 + \cdots + q_r = q$, $\sum_{j=1}^r jq_j = r$ and the coefficients $a_{q_1q_2\cdots q_r}$ are rational and independent of $f$, $g$ and $m, n, p$. The problem is to find the coefficients $a_{q_1q_2\cdots q_r}$. This we can do by judicious choice of $g$ and $f$. Noting Example 9.13.6, it suffices to restrict to the case $n = m = p = 1$. We consider the function $H = \exp(\lambda f)$, where $f : \mathbb{R} \to \mathbb{R}$ is $C^\infty$, $\lambda \in \mathbb{R}$ and $g(y) = \exp(\lambda y)$ (so $H = g \circ f$). The Taylor series of $H$ at $x$ is
$$T_xH(h) = \sum_{r=0}^{\infty} (\exp(\lambda f))^{(r)}(x)\, \frac{h^r}{r!}.$$
The Taylor series of $f$ at $x$ is $T_xf(h) = \sum_{j=0}^{\infty} f^{(j)}(x)\frac{h^j}{j!}$. Substituting $T_xf(h)$ for $f(x+h)$ in $\exp(\lambda f(x+h))$ and using the exponent law for $\exp$ gives the formal identity
$$\sum_{r=0}^{\infty} (\exp(\lambda f))^{(r)}(x)\, \frac{h^r}{r!} = \prod_{j=0}^{\infty} \exp\Big(\lambda f^{(j)}(x)\frac{h^j}{j!}\Big). \tag{9.18}$$
It follows from Taylor’s theorem (Theorem 2.7.10) that coefficients of like powers in (9.18) are equal. In other words, we can work within the framework of formal power series and ignore issues of convergence. The right-hand side of (9.18) is equal to $\exp(\lambda f(x)) \prod_{j=1}^{\infty} \exp(\lambda f^{(j)}(x)\frac{h^j}{j!})$. Expanding the terms $\exp(\lambda f^{(j)}(x)\frac{h^j}{j!})$, $j > 0$, and collecting like powers of $h$, we find that the right-hand side of (9.18) is equal to
$$\sum_{r=0}^{\infty} h^r \Bigg[\sum_{q=0}^{r} \lambda^q \exp(\lambda f(x)) \sum \frac{1}{q_1!\cdots q_r!} \prod_{j=1}^{r} \Big(\frac{f^{(j)}(x)}{j!}\Big)^{q_j}\Bigg],$$
where the innermost sum is over $q_1,\dots,q_r \ge 0$, satisfying $q_1 + \cdots + q_r = q$ and $q_1 + 2q_2 + \cdots + rq_r = r$. Comparing the coefficients of $h^r$ in (9.18) and noting that the term in $(\exp(\lambda f))^{(r)}(x)$ associated to the $q$th derivative of $\exp(\lambda y)$ corresponds to the term $\lambda^q \exp(\lambda f(x))$ gives
$$a_{q_1q_2\cdots q_r} = \frac{r!}{q_1!\cdots q_r!},$$
completing the proof of Faà di Bruno’s formula and Theorem 9.13.7. □
Historical Comments Although Francesco Faà di Bruno may have been the first to
publish the formula for the higher derivative of a composite of real-valued functions,
he surely was not the first to discover the formula. We refer the reader to the article
by Johnson [17] for more historical and mathematical details about the formula as
well as the proof and relationships with Bell polynomials. The method we use is
elementary and based on an “anonymous” proof published several years before that
of Faà di Bruno (who only gave the result, not the proof, in his original papers). The
first reference I am aware of for a formula for the pth derivative of a composite of
vector-valued functions appears in Abraham and Robbin’s 1967 research text [1].
The formula they give is recursive and is not quite clear as stated since the terms
appear not to be symmetric (there is a similar issue with their version of Leibniz’s
theorem). Versions of their result appear in [9] and in [10, page 293]. There have
been many publications in recent years proving various versions of Faà di Bruno’s
formula for vector-valued maps. We refer to Krantz’s text on real analysis [19] for
references. From our perspective, we find it remarkable that even though Faà di
Bruno did not deal with vector-valued functions or symmetric multi-linear maps, all
of the difficulties are already present in his formula.
EXERCISES 9.13.9
(1) Provide the details of the proof of Theorem 9.13.2. (Hint: Prove by induction on $r$, start by considering the formula for $D^{r-1}(fg)_x(e_2, \dots, e_r)$ and use Lemma 9.10.9 and standard combinatorial identities.)
(2) Let $\alpha, \beta \in \mathbb{Z}^m_+$. Write $\beta \le \alpha$ if $\alpha_i \ge \beta_i$, $1 \le i \le m$. If $\beta \le \alpha$, define $\alpha - \beta = (\alpha_1 - \beta_1, \dots, \alpha_m - \beta_m)$. Show that if $f, g : U \subset \mathbb{R}^m \to \mathbb{R}$ are $r$-times differentiable at $x \in U$ and $|\alpha| = r$ then
\[ \partial^\alpha (fg)(x) = \sum_{\beta \le \alpha} \binom{\alpha}{\beta}\, \partial^\beta f(x)\, \partial^{\alpha-\beta} g(x), \]
where $\binom{\alpha}{\beta} = \alpha_1! \cdots \alpha_m! / [\beta_1! \cdots \beta_m!\, (\alpha_1 - \beta_1)! \cdots (\alpha_m - \beta_m)!]$.
(3) Suppose $f : \mathbb{R}^m \to \mathbb{R}^{n_1}$, $g : \mathbb{R}^m \to \mathbb{R}^{n_2}$ and we are given a bilinear map $B : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}^n$. Define $f \star g : \mathbb{R}^m \to \mathbb{R}^n$ by $(f \star g)(x) = B(f(x), g(x))$. State and prove a version of the Leibniz law for $f \star g$.
(4) Verify (9.17).
(5) Let $f, g : \mathbb{R}^m \to \mathbb{R}^m$ be smooth. Compute the first four derivatives of $g \circ f$ and compare with the formulas given by Theorem 9.13.7. How many terms involving $D^4 f_x$ and $(Df_x)^2$ are there in the expression $(D^3 g)_{f(x)} \sim ((Df_x)^2, D^4 f_x)$ occurring in the formula for $D^6 (g \circ f)_x(e_1, \dots, e_6)$?
(6) Suppose that $f, g$ are real-valued analytic functions of one variable. Using Faà di Bruno's formula for functions of one variable, show that the composition $f \circ g$ is analytic. (Hint: use (9.17). This provides a proof of Proposition 5.4.6 that does not use complex analysis. See also Krantz and Parks [20, §1.3].)
9.14 Smooth Functions and Uniform Approximation
In this section we give examples of smooth ($C^\infty$) non-polynomial functions on $\mathbb{R}^m$, $m > 1$, generalizing the 'bump' and 'tabletop' functions of Chap. 5. Using these functions, we prove a variant of the Weierstrass approximation theorem that allows us to uniformly approximate $C^r$-functions, and their first $r$ derivatives, by smooth functions. This result is used later in the proof of the existence of $C^r$ local flows for ordinary differential equations defined by a $C^r$ vector field.
Definition 9.14.1 Let $f : \mathbb{R}^m \to \mathbb{R}^n$. The (closed) support of $f$, denoted $\mathrm{supp}(f)$, is defined by
\[ \mathrm{supp}(f) = \overline{\{x \in \mathbb{R}^m \mid f(x) \ne 0\}}. \]
The map $f$ is of compact support if $\mathrm{supp}(f)$ is compact.
Example 9.14.2 If $p : \mathbb{R}^m \to \mathbb{R}^n$ is a homogeneous polynomial of degree $d$, then $\mathrm{supp}(p)$ is compact iff $p \equiv 0$. Indeed, since $p$ is homogeneous, if $p(x) \ne 0$, then $p(\lambda x) = \lambda^d p(x) \ne 0$ for all $\lambda \in \mathbb{R}$, $\lambda \ne 0$. Hence $\mathrm{supp}(p) \supset \mathbb{R}x$ and so $\mathrm{supp}(p)$ cannot be compact. The same result holds without assuming the homogeneity of $p$ (see the exercises).
For $0 \le p \le \infty$, let $C^p_c(\mathbb{R}^m, \mathbb{R}^n)$ be the set of all $C^p$ maps $f : \mathbb{R}^m \to \mathbb{R}^n$ with compact support. Since $\mathrm{supp}(f + g) \subset \mathrm{supp}(f) \cup \mathrm{supp}(g)$, and $\mathrm{supp}(\lambda f) = \mathrm{supp}(f)$ for $\lambda \ne 0$, it follows that $C^p_c(\mathbb{R}^m, \mathbb{R}^n)$ is a vector subspace of $C^p(\mathbb{R}^m, \mathbb{R}^n)$.
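In order to construct $C^\infty$-functions with compact support we make use of the theory developed in Sect. 5.2. Recall that the $C^\infty$ map $\Phi : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \Phi(x) = \begin{cases} \exp(-1/x), & x > 0, \\ 0, & x \le 0, \end{cases} \]
and that $\Phi$ is used to construct $C^\infty$-functions on $\mathbb{R}$ with compact support. In particular, if $-\infty < a < b < +\infty$, we define the 'bump' function
\[ \Psi_{a,b}(x) = \Phi(b - x)\,\Phi(x - a), \quad x \in \mathbb{R}, \]
satisfying $\Psi_{a,b} \ge 0$ and $\mathrm{supp}(\Psi_{a,b}) = [a, b]$. If $0 < r < s < \infty$, we define the 'tabletop' function
\[ \Theta_{r,s}(x) = \frac{\Phi(s^2 - x^2)}{\Phi(s^2 - x^2) + \Phi(x^2 - r^2)}. \]
We have $\mathrm{supp}(\Theta_{r,s}) = [-s, s]$ and $\Theta_{r,s}(x) = 1$, for all $x \in [-r, r]$.
It is straightforward to define bump and tabletop functions on $\mathbb{R}^m$. For example, define the tabletop function
\[ \Theta_{r,s}(x) = \frac{\Phi(s^2 - \|x\|^2)}{\Phi(s^2 - \|x\|^2) + \Phi(\|x\|^2 - r^2)}, \quad x \in \mathbb{R}^m. \]
It follows by the chain rule that $\Theta_{r,s}$ is $C^\infty$ (the square of the Euclidean norm is $C^\infty$). We have $\mathrm{supp}(\Theta_{r,s}) = \overline{D}_s(0)$ and $\Theta_{r,s} \equiv 1$ on $\overline{D}_r(0)$.
Example 9.14.3 If $f : \mathbb{R}^m \to \mathbb{R}^n$ is $C^\infty$, then $\Theta_{r,s} f \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^n)$ and $\mathrm{supp}(\Theta_{r,s} f) \subset \overline{D}_s(0)$. Note that $\Theta_{r,s} f = f$ on $\overline{D}_r(0)$.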
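The bump and tabletop constructions are easy to experiment with numerically. The sketch below is illustrative only: the function names `phi`, `bump` and `tabletop` are our own, and the tabletop is written so that it equals 1 on the closed disk of radius $r$ and vanishes outside the disk of radius $s$, as stated in the text.

```python
import math

def phi(x):
    """Phi(x) = exp(-1/x) for x > 0 and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def bump(a, b, x):
    """Psi_{a,b}(x) = Phi(b - x) * Phi(x - a); non-negative, zero outside (a, b)."""
    return phi(b - x) * phi(x - a)

def tabletop(r, s, x):
    """Theta_{r,s} at x, where x is a number (m = 1) or a sequence of coordinates."""
    coords = [x] if isinstance(x, (int, float)) else list(x)
    norm_sq = sum(t * t for t in coords)
    num = phi(s * s - norm_sq)
    # The denominator is never zero when 0 < r < s: one of the two terms is positive.
    return num / (num + phi(norm_sq - r * r))
```

For instance, `tabletop(1, 2, (0.6, 0.6))` evaluates the two-dimensional tabletop at a point inside the unit disk, where the function is identically 1.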
For maps defined on $\mathbb{R}^m$, it is useful to have a tabletop function with support a hypercube in $\mathbb{R}^m$; this is compatible with the coordinate structure on $\mathbb{R}^m$ and works well with multiple integrals. To this end, suppose $r_1, \dots, r_m > 0$ and define
\[ \psi_{r_1, \dots, r_m}(x) = \prod_{i=1}^{m} \Psi_{-r_i, r_i}(x_i), \quad x \in \mathbb{R}^m. \]
Since partial derivatives of $\psi_{r_1, \dots, r_m}$ of all orders exist and are continuous, $\psi_{r_1, \dots, r_m}$ is $C^\infty$ and we have
\[ \mathrm{supp}(\psi_{r_1, \dots, r_m}) = \prod_{i=1}^{m} [-r_i, r_i]. \]
If we take $r_i = r > 0$, $1 \le i \le m$, and set $\psi_{r, \dots, r} = \psi_r$, then $\mathrm{supp}(\psi_r)$ is the closed hypercube $\overline{C}(r) = [-r, r]^m$. If $f : \mathbb{R}^m \to \mathbb{R}^n$ is $C^\infty$, then $\psi_r f$ is $C^\infty$ and $\mathrm{supp}(\psi_r f) \subset \overline{C}(r)$.
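Let $K$ be a compact subset of $\mathbb{R}^m$. If $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$, define
\[ \|f\|^K_p = \sum_{j=0}^{p} \sup_{x \in K} \|D^j f_x\|. \]
Lemma 9.14.4 For all compact subsets $K$ of $\mathbb{R}^m$, $\|\cdot\|^K_p$ defines a semi-norm on $C^p(\mathbb{R}^m, \mathbb{R}^n)$. That is,
(1) $\|f\|^K_p \ge 0$, for all $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$.
(2) $\|f + g\|^K_p \le \|f\|^K_p + \|g\|^K_p$, for all $f, g \in C^p(\mathbb{R}^m, \mathbb{R}^n)$.
(3) $\|\lambda f\|^K_p = |\lambda| \|f\|^K_p$, for all $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$, $\lambda \in \mathbb{R}$.
Proof Routine and left to the exercises. □
Remark 9.14.5 Note that we can have $\|f\|^K_p = 0$ when $f \ne 0$. For example, if $K \cap \mathrm{supp}(f) = \emptyset$.
If $f \in C^p_c(\mathbb{R}^m, \mathbb{R}^n)$, we may define
\[ \|f\|_p = \|f\|^{\mathrm{supp}(f)}_p = \sum_{j=0}^{p} \sup_{x \in \mathbb{R}^m} \|D^j f_x\|. \]
It follows from Lemma 9.14.4 that, for $p \ge 0$, $\|\cdot\|_p$ defines a norm on $C^p_c(\mathbb{R}^m, \mathbb{R}^n)$, and we refer to $\|\cdot\|_p$ as the $p$-norm on $C^p_c(\mathbb{R}^m, \mathbb{R}^n)$.
The semi-norm $\|\cdot\|^K_p$ defines uniform convergence on $K$.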
Lemma 9.14.6 Let $K$ be a compact subset of $\mathbb{R}^m$ and suppose that $(f_\ell) \subset C^0(\mathbb{R}^m, \mathbb{R}^n)$ is Cauchy with respect to $\|\cdot\|^K_0$. Then $(f_\ell)$ converges uniformly on $K$ to a continuous function $f : K \subset \mathbb{R}^m \to \mathbb{R}^n$.
Proof Set $g_\ell = f_\ell|K$, $\ell \in \mathbb{N}$, and apply Theorem 7.15.9 to $(g_\ell)$. □
Remark 9.14.7 Lemma 9.14.6 says nothing about the convergence of $(f_\ell)$ on $\mathbb{R}^m \setminus K$. Indeed, it is easy to construct examples where $(f_\ell)$ converges on $K$ but does not converge at any point of $\mathbb{R}^m \setminus K$.
We want to extend Lemma 9.14.4 to take account of differentiability. To keep matters simple, we restrict compact sets to the collection of closed hypercubes $\overline{C}(r) = [-r, r]^m$ and set $\|\cdot\|^{\overline{C}(r)}_p = \|\cdot\|^r_p$, $r > 0$. Let $C(r) = (-r, r)^m$ denote the open hypercube.
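Lemma 9.14.8 Let $(f_\ell) \subset C^1(\mathbb{R}^m, \mathbb{R}^n)$ and $r > 0$. If there exist maps $f, F_i : \overline{C}(r) \to \mathbb{R}^n$ satisfying
\[ \|f_\ell - f\|^r_0, \ \Big\| \frac{\partial f_\ell}{\partial x_i} - F_i \Big\|^r_0 \to 0, \quad \text{as } \ell \to \infty, \ 1 \le i \le m, \]
then
(1) $f, F_i : \overline{C}(r) \to \mathbb{R}^n$ are continuous, $1 \le i \le m$.
(2) $f : C(r) \to \mathbb{R}^n$ is $C^1$ with partial derivatives given by $\frac{\partial f}{\partial x_i}(x) = F_i(x)$, $1 \le i \le m$, $x \in C(r)$.
Proof Since $(f_\ell)$, $(\frac{\partial f_\ell}{\partial x_i})$ converge uniformly on $\overline{C}(r)$, it follows that $f$, $F_i$ are continuous on $\overline{C}(r)$, proving (1). Fix $i$, $1 \le i \le m$. For $(x_1, \dots, x_m) \in C(r)$, we have
\[ f_\ell(x_1, \dots, x_m) = \int_0^{x_i} \frac{\partial f_\ell}{\partial x_i}(x_1, \dots, s, \dots, x_m)\, ds + f_\ell(x_1, \dots, 0, \dots, x_m). \]
Since convergence is uniform, we may apply Proposition 4.7.1 for fixed $x_j$, $j \ne i$ (note Remark 4.7.3) and let $\ell \to \infty$ to obtain
\[ f(x_1, \dots, x_m) = \int_0^{x_i} F_i(x_1, \dots, s, \dots, x_m)\, ds + f(x_1, \dots, 0, \dots, x_m). \]
Hence $f$ is continuously partially differentiable on $C(r)$, with $\frac{\partial f}{\partial x_i} = F_i$ on $C(r)$, and so $f$ is $C^1$ on $C(r)$ by Theorem 9.10.11(2). □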
Theorem 9.14.9 Let $r > 0$ and suppose $(f_\ell) \subset C^p(\mathbb{R}^m, \mathbb{R}^n)$ is Cauchy with respect to $\|\cdot\|^r_p$. Then there exists a continuous map $f : \overline{C}(r) \to \mathbb{R}^n$ such that
(1) $(f_\ell)$ converges to $f$ uniformly on $\overline{C}(r)$.
(2) $f : C(r) \to \mathbb{R}^n$ is $C^p$ and $D^j f_\ell$ converges uniformly to $D^j f$ on $C(r)$ for $0 \le j \le p$.
Proof Statement (1) is Lemma 9.14.6. For (2) we have, again by Lemma 9.14.6, that $\partial^\alpha f_\ell$ converges uniformly to a continuous function $F_\alpha : \overline{C}(r) \to \mathbb{R}^n$ for all $\alpha \in \mathbb{Z}^m_+$, $|\alpha| \le p$. We use induction on $|\alpha|$ and Lemma 9.14.8 to show that $\partial^\alpha f = F_\alpha$ on $C(r)$, $|\alpha| \le p$. Hence, by Theorem 9.10.11, $f : C(r) \to \mathbb{R}^n$ is $C^p$. □
Remark 9.14.10 If we replace $\|\cdot\|^r_p$ by $\|\cdot\|^K_p$, Theorem 9.14.9 continues to hold with $C(r)$ replaced by the interior of $K$.
Corollary 9.14.11 Let $(f_\ell) \subset C^p_c(\mathbb{R}^m, \mathbb{R}^n)$ be Cauchy with respect to $\|\cdot\|_p$. Then $(f_\ell)$ converges to $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$:
\[ \lim_{\ell \to \infty} \|f_\ell - f\|_p = 0. \]
Proof Left to the exercises. In general, $f \notin C^p_c(\mathbb{R}^m, \mathbb{R}^n)$. □
The semi-norm $\|\cdot\|^K_p$ is exactly what is needed to define uniform approximation of functions on a compact set $K$. We show that if $K \subset \mathbb{R}^m$ is compact and $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$, then for any $\varepsilon > 0$, we can find $\tilde{f} \in C^\infty(\mathbb{R}^m, \mathbb{R}^n)$ such that
\[ \|f - \tilde{f}\|^K_p < \varepsilon. \]
This is uniform approximation of a function, and its first $p$ derivatives, on $K$ by a $C^\infty$-function. The condition $\|f - \tilde{f}\|^K_p < \varepsilon$ implies that
\[ \|D^j (f - \tilde{f})_x\| < \varepsilon, \quad x \in K, \ j = 0, \dots, p. \]
For our applications, it suffices to approximate by $C^\infty$-functions rather than polynomials. Although the Weierstrass approximation theorem generalizes to $\mathbb{R}^m$, the Bernstein polynomial approach used in Chap. 5 only works well for the $\|\cdot\|_0$ norm. Our methods give uniform approximations of a $C^p$-function of compact support by a $C^\infty$-function of compact support, which simplifies some of our proofs.
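We define the $C^\infty$ positive function $\rho : \mathbb{R}^m \to \mathbb{R}$ by
\[ \rho(x) = \psi_1(x), \quad x \in \mathbb{R}^m, \]
where we recall that $\psi_1(x) = \prod_{i=1}^{m} \Psi_{-1,1}(x_i)$. We have $\mathrm{supp}(\rho) = \overline{C}(1)$ (the unit hypercube, centred at the origin) and
\[ \int_{\mathbb{R}^m} \rho = \int_{\mathbb{R}} \cdots \int_{\mathbb{R}} \Psi_{-1,1}(x_1) \cdots \Psi_{-1,1}(x_m)\, dx_1 \cdots dx_m = \Big( \int_{\mathbb{R}} \Psi_{-1,1}(s)\, ds \Big)^m = c, \]
where $c > 0$, since $\rho > 0$ on $C(1)$. Replacing $\rho$ by $c^{-1}\rho$ we may and shall assume that $\int_{\mathbb{R}^m} \rho = 1$. For $\delta > 0$, define
\[ \rho_\delta(x) = \frac{1}{\delta^m}\, \rho\Big(\frac{x}{\delta}\Big), \quad x \in \mathbb{R}^m. \]
Lemma 9.14.12 For all $\delta > 0$,
(1) $\int_{\mathbb{R}^m} \rho_\delta = 1$.
(2) $\mathrm{supp}(\rho_\delta) = \overline{C}(\delta)$.
Proof The first part is an elementary change of variables argument for $\int_{\mathbb{R}} \Psi_{-1,1}(s)\, ds$ (the non-trivial change of variables formula for multiple integrals is not needed). The second statement is obvious. □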
Let $f \in C^p_c(\mathbb{R}^m, \mathbb{R}^n)$. For $\delta > 0$, define
\[ f_\delta(x) = \int_{\mathbb{R}^m} \rho_\delta(x - s) f(s)\, ds = \int_{\mathbb{R}} \cdots \int_{\mathbb{R}} \rho_\delta(x_1 - s_1, \dots, x_m - s_m) f(s_1, \dots, s_m)\, ds_1 \cdots ds_m. \]
Remarks 9.14.13
(1) The defining integral for $f_\delta$ can be evaluated over any hypercube $\overline{C}(r) \supset \mathrm{supp}(f)$. Hence the integral lies within the elementary class of multiple integrals considered in this text (see Chap. 2, Exercises 2.8.10(10)).
(2) The integral defining $f_\delta$ is called the convolution of $f$ and $\rho_\delta$.
(3) For the definition of $f_\delta$, $f$ was assumed to have compact support. However, the integral clearly converges for any continuous function $f : \mathbb{R}^m \to \mathbb{R}^n$ since, for fixed $x$, $\rho_\delta(x - s) f(s)$ has compact support.
Lemma 9.14.14 If $f \in C^p_c(\mathbb{R}^m, \mathbb{R}^n)$ and $\delta > 0$, then
\[ f_\delta(x) = \int_{\mathbb{R}^m} \rho_\delta(x - s) f(s)\, ds = \int_{\mathbb{R}^m} \rho_\delta(s) f(x - s)\, ds. \]
Proof Change variables from $s_i$ to $x_i - s_i$, $1 \le i \le m$. □
Remark 9.14.15 The second integral of Lemma 9.14.14 can be evaluated over the hypercube $\overline{C}(r + \delta)$, if $\overline{C}(r) \supset \mathrm{supp}(f)$.
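Theorem 9.14.16 (Uniform Approximation by Smooth Functions) Let $p \ge 0$ and $f \in C^p_c(\mathbb{R}^m, \mathbb{R}^n)$. Then
(1) $f_\delta \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^n)$, for all $\delta > 0$.
(2) $\lim_{\delta \to 0+} \|f - f_\delta\|_p = 0$.
Proof (1) Let $\alpha$ be any multi-index. Since
\[ f_\delta(x_1, \dots, x_m) = \int_{\mathbb{R}} \cdots \int_{\mathbb{R}} \rho_\delta(x_1 - s_1, \dots, x_m - s_m) f(s_1, \dots, s_m)\, ds_1 \cdots ds_m, \]
and $\rho_\delta$ is $C^\infty$, we have (by the easy version of Lemma 6.1.6 for integrals of smooth functions over compact intervals) that all partial derivatives of $f_\delta$ exist, are continuous and are given by
\[ \partial^\alpha f_\delta(x) = \int_{\mathbb{R}^m} \partial^\alpha \rho_\delta(x - s) f(s)\, ds. \]
Since $f_\delta(x) = 0$ unless $x - s \in \overline{C}(\delta)$ for some $s \in \mathrm{supp}(f)$, the support of $f_\delta$ is compact. Hence $f_\delta \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^n)$, for all $\delta > 0$.
For (2), we use
\[ f_\delta(x) = \int_{\mathbb{R}^m} \rho_\delta(s) f(x - s)\, ds. \]
We have (by the easy version of Lemma 6.1.6)
\[ \partial^\alpha f_\delta(x) = \int_{\mathbb{R}^m} \rho_\delta(s)\, \partial^\alpha f(x - s)\, ds, \]
for all $\alpha$, with $0 \le |\alpha| \le p$. Since $\int_{\mathbb{R}^m} \rho_\delta = 1$, it follows that
\[ \partial^\alpha (f_\delta - f)(x) = \int_{\mathbb{R}^m} \rho_\delta(s) \big( \partial^\alpha f(x - s) - \partial^\alpha f(x) \big)\, ds. \]
Since $\mathrm{supp}(f)$ is compact, $\partial^\alpha f$ is uniformly continuous on $\mathbb{R}^m$ for all $|\alpha| \le p$. Hence, given $\varepsilon > 0$, we may choose $\bar{\delta} > 0$ so that for all $|\alpha| \le p$, $\|\partial^\alpha f(x - s) - \partial^\alpha f(x)\| \le \varepsilon$ if $\|s\|_\infty \le \bar{\delta}$. Since $\mathrm{supp}(\rho_\delta) = \overline{C}(\delta)$, we have
\[ \|\partial^\alpha f_\delta - \partial^\alpha f\|_0 \le \varepsilon, \quad |\alpha| \le p, \ 0 < \delta \le \bar{\delta}. \]
It follows that $\lim_{\delta \to 0+} \|f - f_\delta\|_p = 0$. □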
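The convergence in statement (2) can be observed numerically in the simplest case $m = n = 1$, $p = 0$. The sketch below is an illustration under stated assumptions, not the construction used in the proof: the integral defining $f_\delta$ is replaced by a midpoint rule, the kernel is normalized so that the discrete weights sum to 1, and the test function `hat` (a piecewise-linear function supported on $[-1, 1]$) and all function names are our own choices.

```python
import math

def phi(x):
    """Phi(x) = exp(-1/x) for x > 0 and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def bump(x):
    """Psi_{-1,1}(x) = Phi(1 - x) * Phi(x + 1), supported on [-1, 1]."""
    return phi(1.0 - x) * phi(x + 1.0)

def mollify(f, delta, n=400):
    """Return x -> f_delta(x), approximating the convolution by a midpoint rule."""
    h = 2.0 * delta / n
    nodes = [-delta + (k + 0.5) * h for k in range(n)]
    weights = [bump(s / delta) for s in nodes]
    total = sum(weights)
    weights = [w / total for w in weights]  # discrete analogue of "integral equals 1"
    return lambda x: sum(w * f(x - s) for w, s in zip(weights, nodes))

def hat(x):
    """A continuous, compactly supported, non-smooth test function."""
    return max(0.0, 1.0 - abs(x))

def sup_error(delta, grid_points=201):
    """Approximate sup |f - f_delta| on a uniform grid in [-2, 2]."""
    f_delta = mollify(hat, delta)
    xs = [-2.0 + 4.0 * k / (grid_points - 1) for k in range(grid_points)]
    return max(abs(hat(x) - f_delta(x)) for x in xs)
```

Because `hat` is Lipschitz with constant 1 and the weights form a convex combination supported in $[-\delta, \delta]$, the computed error cannot exceed $\delta$, which mirrors the uniform continuity estimate in the proof.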
Remark 9.14.17 Theorem 9.14.16 suffices for our main application in the next section. We indicate some extensions in the exercises.
EXERCISES 9.14.18
(1) Let $p : \mathbb{R}^m \to \mathbb{R}^n$ be a polynomial of degree $d$. Show that $\mathrm{supp}(p)$ is compact iff $p \equiv 0$. (Hint: Look at $p|\mathbb{R}^m \setminus D_r(0)$ for large $r$.)
(2) Provide the details of the proof of Lemma 9.14.4.
(3) Provide the proof for Corollary 9.14.11 and give examples to show that $(C^\infty_c(\mathbb{R}^m, \mathbb{R}^n), \|\cdot\|_p)$ is not complete for any $p \ge 0$.
(4) Show that if $K$ is a compact subset of $\mathbb{R}^m$ and $(f_\ell) \subset C^p(\mathbb{R}^m, \mathbb{R}^n)$ is Cauchy with respect to $\|\cdot\|^K_p$, then there exists a $C^p$-function $f : \mathring{K} \to \mathbb{R}^n$ such that $D^j f_\ell$ converges uniformly to $D^j f$ on $\mathring{K}$, $1 \le j \le p$. (Hint: Use tabletop functions and Theorem 9.14.9.)
(5) Suppose that $f \in C^p(\mathbb{R}^m, \mathbb{R}^n)$ and $K$ is a compact subset of $\mathbb{R}^m$. Show that for $\varepsilon > 0$, there exists an $\tilde{f} \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^n)$ such that $\|f - \tilde{f}\|^K_p < \varepsilon$. (Hint: multiply $f$ by a tabletop function which is identically one on a hypercube containing $K$ and use Theorem 9.14.16.)
(6) Let $U$ be an open subset of $\mathbb{R}^m$ and $K$ be a compact subset of $U$. In this exercise we indicate how a $C^p$-function on $U$ can be uniformly approximated on $K$ (in $\|\cdot\|^K_p$) by smooth functions.
(a) Let $a = \inf\{d(x, K) \mid x \in \mathbb{R}^m \setminus U\}$. Verify that $0 < a \le \infty$.
(b) For $r \in (0, a)$, define $K_r = \{x \in \mathbb{R}^m \mid d(x, K) \le r\}$. Verify that $K_r$ is a compact subset of $U$.
(c) For $\delta > 0$, $r \in (0, a)$, define
\[ \chi_{\delta,r}(x) = \int_{\mathbb{R}^m} \rho_\delta(x - s)\, d(s, K_r)\, ds. \]
Verify that $\chi_{\delta,r} \in C^\infty(\mathbb{R}^m, \mathbb{R})$ and that for sufficiently small $\delta > 0$, $\chi_{\delta,r} = 0$ on $K$ and $\chi_{\delta,r} > 0$ on $\mathbb{R}^m \setminus K_r$ ($r \in (0, a)$ is fixed).
(d) Let $2r \in (0, a)$. Using the compactness of $K_r$, choose a finite set $\psi_1, \dots, \psi_N$ of bump functions such that $\sum_{j=1}^{N} \psi_j > 0$ on $K_r$ and $\sum_{j=1}^{N} \psi_j = 0$ outside of $K_{2r}$.
(e) Show that, for sufficiently small $\delta > 0$, the function
\[ \theta = \Big( \sum_{j=1}^{N} \psi_j \Big) \Big/ \Big( \sum_{j=1}^{N} \psi_j + \chi_{\delta,r} \Big) \]
is $C^\infty$ on $\mathbb{R}^m$, equal to 1 on $K$ and equal to 0 outside $K_{2r}$.
(f) Let $f \in C^p(U, \mathbb{R}^n)$ and $\varepsilon > 0$. Using the function $\theta$ and Theorem 9.14.16, show that there exists an $\tilde{f} \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^n)$ such that $\|f - \tilde{f}\|^K_p < \varepsilon$.
(7) In this exercise we indicate the steps in proving a strong version of the Weierstrass approximation theorem for $C^p$-functions on $\mathbb{R}^m$. The aim is to show that if $K$ is a compact subset of $\mathbb{R}^m$ then any $C^p$-function $f : \mathbb{R}^m \to \mathbb{R}$ can be approximated by polynomials in the semi-norm $\|\cdot\|^K_p$.
(a) Show that it suffices to prove the result for $f$ which have compact support in $(0, 1)^m$.
(b) For $p \ge 0$, define
\[ \kappa_p(x) = \begin{cases} c_p (1 - x^2)^p, & |x| \le 1, \\ 0, & |x| > 1, \end{cases} \]
where $c_p > 0$ is chosen so that $\int_{-1}^{1} \kappa_p = 1$. Show that for all $\varepsilon, \delta > 0$, there exists a $P = P(\varepsilon, \delta)$ such that if $p \ge P$ then $|\kappa_p(x)| < \varepsilon$ for all $|x| > \delta$.
(c) If $f : \mathbb{R} \to \mathbb{R}$ is $C^p$ and $\mathrm{supp}(f) \subset (0, 1)$, show that $f_p(x) = \int_{-1}^{1} \kappa_p(x - s) f(s)\, ds$ is a polynomial of degree at most $2p$.
(d) Show that $\|f - f_p\|^{[0,1]}_p \to 0$ as $p \to \infty$.
(e) Extend (c,d) to $C^p$-functions $f : \mathbb{R}^m \to \mathbb{R}$ with $\mathrm{supp}(f) \subset (0, 1)^m$.
9.15 The Local $C^r$ Existence Theorem for ODEs
Let $f : \mathbb{R}^m \to \mathbb{R}^m$ be a $C^r$ vector field on $\mathbb{R}^m$, $r \ge 1$. We recall that a local flow is a continuous map $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ defined on the open subset $U \times (-\delta, \delta)$ of $\mathbb{R}^m \times \mathbb{R}$ such that for each $x \in U$, $\varphi_x : (-\delta, \delta) \to \mathbb{R}^m$ is the solution to $x' = f(x)$ with initial condition $x$. A local flow is $C^r$ if $\varphi$ is $C^r$ (in $(x, t)$).
In this section we show that an ordinary differential equation $x' = f(x)$ has $C^r$ local flows if the vector field $f$ is of class $C^r$. It turns out that it is straightforward to prove the existence of $C^r$ local flows if $f$ is of class $C^{r+1}$. The improvement to $C^r$ local flows if $f$ is $C^r$ requires a more sophisticated argument.
Lemma 9.15.1 Suppose that $f$ is a $C^2$ vector field on $\mathbb{R}^m$. Then $x' = f(x)$ has a $C^1$ local flow on a neighbourhood of every point in $\mathbb{R}^m$.
Proof If $x_0 \in \mathbb{R}^m$, it follows by Theorem 9.7.2 that there is an open neighbourhood $U$ of $x_0$ and $\delta > 0$ such that the local flow $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ is defined and $C^0$. We prove that $\varphi$ is $C^1$, initially for a possibly smaller neighbourhood $V$ of $x_0$ and smaller $\delta > 0$.
Since $\varphi$ is a local flow we have
\[ \varphi'(x, t) = f(\varphi(x, t)), \quad (x, t) \in U \times (-\delta, \delta). \tag{9.19} \]
Start by assuming $\varphi$ is differentiable in $x$. Differentiating (9.19) in the $\mathbb{R}^m$ variable, we obtain
\[ D_1 \varphi'(x, t) = Df_{\varphi(x,t)} \circ D_1 \varphi(x, t), \]
where $D_1 \varphi(x, t)$ denotes the derivative of $\varphi$ in the $x$ variable at $(x, t)$. Consequently, if $\varphi$ is $C^1$, then $D_1 \varphi$ satisfies a linear differential equation in $L(\mathbb{R}^m, \mathbb{R}^m)$
\[ U'(x, t) = Df_{\varphi(x,t)} \circ U(x, t). \]
If we view $x$ as a parameter in this equation and set $Df_{\varphi(x,t)} = G(t)$, where $G : (-\delta, \delta) \to L(\mathbb{R}^m, \mathbb{R}^m)$, then $D_1 \varphi$ satisfies the linear differential equation
\[ U'(t) = G(t) \circ U(t). \tag{9.20} \]
This equation is solvable if $G$ is just continuous (Proposition 9.7.5). Since $\varphi(x, 0) = x$, the initial condition we require for (9.20) is $U(0) = I_m \in L(\mathbb{R}^m, \mathbb{R}^m)$. These observations suggest that we should solve the system on $\mathbb{R}^m \times L(\mathbb{R}^m, \mathbb{R}^m)$ defined by
\[ \varphi'(x, t) = f(\varphi(x, t)), \tag{9.21} \]
\[ U'(x, t) = Df_{\varphi(x,t)} \circ U(x, t), \tag{9.22} \]
subject to the initial conditions $\varphi(x, 0) = x$, $U(x, 0) = I$.
Since we are assuming $f$ is $C^2$, (9.21), (9.22) satisfies the conditions of Theorem 9.7.2 and so there is a local flow $\Phi : V \times (-\delta', \delta') \to \mathbb{R}^m \times L(\mathbb{R}^m, \mathbb{R}^m)$, where $V$ is an open neighbourhood of $x_0$, with $V \subset U$, and $0 < \delta' \le \delta$. By uniqueness of solutions to (9.21), $\Phi(x, t) = (\varphi(x, t), U(x, t))$, $(x, t) \in V \times (-\delta', \delta')$.
It remains to show that $\varphi$ is $C^1$ and $D_1 \varphi$ is equal to $U(x, t)$. This is not easy to infer directly from (9.21), (9.22). However, we can use part of the contraction mapping lemma together with an argument based on uniform convergence.
Specifically, let $X = C^0([-\delta', \delta'] \times V, \mathbb{R}^m \times L(\mathbb{R}^m, \mathbb{R}^m))$ and define $T : X \to X$ in the usual way by
\[ T(\psi, A)(x, t) = (x, I) + \Big( \int_0^t f(\psi(x, s))\, ds, \ \int_0^t Df_{\psi(x,s)} \circ A(x, s)\, ds \Big), \]
where $(\psi, A) \in X$. Using the assumption that $f$ is $C^2$, we may choose $\delta', V$ so that $T$ is a contraction mapping. Define the sequence $(\varphi_n, U_n)$ inductively by $\varphi_0 \equiv x$, $U_0 \equiv I$ and
\[ (\varphi_{n+1}, U_{n+1}) = T(\varphi_n, U_n), \quad n \ge 0. \]
Obviously, $D_1 \varphi_0 = U_0$ and an easy induction shows that $D_1 \varphi_n = U_n$ for all $n \ge 0$. Since $(\varphi_n, U_n)$ converges uniformly to $(\varphi, U)$, it follows by Theorem 9.14.9 that $\varphi$ is $C^1$ and $D_1 \varphi = \lim_{n \to \infty} U_n = U$. □
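The coupled system (9.21), (9.22) can be integrated numerically to see that the solution $U$ of the equation of variations does give the derivative of the flow in $x$. The sketch below is an illustration only and makes several assumptions of our own: it takes $m = 1$ and the vector field $f(x) = x^2$ (whose flow is known in closed form), and it uses the classical fourth-order Runge-Kutta method rather than the Picard iteration of the proof.

```python
def rk4_step(g, y, t, h):
    """One classical Runge-Kutta step for the system y' = g(t, y), y a list of floats."""
    k1 = g(t, y)
    k2 = g(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = g(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = g(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def flow_and_variation(f, df, x, t_end, steps=200):
    """Integrate phi' = f(phi), U' = df(phi) * U with phi(0) = x, U(0) = 1 (case m = 1)."""
    def g(t, y):
        p, u = y
        return [f(p), df(p) * u]
    y, h = [x, 1.0], t_end / steps
    for k in range(steps):
        y = rk4_step(g, y, k * h, h)
    return y[0], y[1]

# Assumed example: f(x) = x^2, for which phi(x, t) = x / (1 - t x)
# and D_1 phi(x, t) = 1 / (1 - t x)^2 whenever t x < 1.
phi_val, u_val = flow_and_variation(lambda x: x * x, lambda x: 2 * x, 0.5, 1.0)
```

With $x = 0.5$ and $t = 1$ the closed-form values are $\varphi = 1$ and $D_1 \varphi = 4$, so `phi_val` and `u_val` should agree with these up to the integration error.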
Remarks 9.15.2
(1) The differential Eq. (9.22) is called the equation of variations.
(2) Even though it is relatively easy to solve the linear equation of variations, we still have to be careful to show that the solution does give the derivative $D_1 \varphi$. If $f$ is only $C^1$, we have to work harder.
Proposition 9.15.3 Suppose that $f$ is a $C^{r+1}$ vector field on $\mathbb{R}^m$. Then $x' = f(x)$ has a $C^r$ local flow on some neighbourhood of every point $x_0 \in \mathbb{R}^m$.
Proof The result follows by induction on $r$ using Lemma 9.15.1. We leave the details to the exercises. □
It is conceivable that as $r$ increases, the domain of the local flow given by Proposition 9.15.3 shrinks to $\{(x_0, 0)\}$ and so, without further work, we cannot deduce that if $f$ is $C^\infty$, then there is a $C^\infty$ local flow.
Suppose $x' = f(x)$, where $f : \mathbb{R}^m \to \mathbb{R}^m$ is $C^{r+1}$. Let $x_0 \in \mathbb{R}^m$. It follows from Theorem 9.7.2 that there is an open neighbourhood $U$ of $x_0$ and $\delta > 0$ such that the local flow $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ is defined and $C^0$. We show that $\varphi$ is $C^r$ (same $U$, same $\delta$).
Lemma 9.15.4 (1-Parameter Group Property for Local Flows) Let $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ be a local flow for $x' = f(x)$. Then for all $x \in U$, and $s, t \in \mathbb{R}$ such that $s, s + t \in (-\delta, \delta)$, we have
\[ \varphi(\varphi(x, s), t) = \varphi(x, s + t). \]
Proof Fix $s \in (-\delta, \delta)$. Since $\varphi'(x, s + t) = f(\varphi(x, s + t))$, $\psi(t) = \varphi(x, s + t)$ is the solution to $x' = f(x)$ with initial condition $\varphi(x, s)$. Note that $\psi$ is defined and unique on $(-\delta - s, \delta - s)$. Differentiating with respect to $t$, we see that
\[ \frac{d}{dt}\big( \varphi(\varphi(x, s), t) \big) = \varphi'(\varphi(x, s), t) = f(\varphi(\varphi(x, s), t)), \quad t \in (-\delta - s, \delta - s). \]
Since $\varphi(\varphi(x, s), 0) = \varphi(x, s)$, the result follows by uniqueness of solutions. □
Lemma 9.15.5 Let $f$ be a $C^{r+1}$ vector field on $\mathbb{R}^m$. If $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ is the $C^0$ local flow given by Theorem 9.7.2, then $\varphi$ is $C^r$.
Proof By Proposition 9.15.3, we have a $C^r$ local flow defined on a neighbourhood of every point $\varphi(x, s)$, $x \in U$, $s \in (-\delta, \delta)$. Applying Lemma 9.15.4, we see that $\varphi$ is $C^r$ on a neighbourhood of every point in $U \times (-\delta, \delta)$ and so $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ is $C^r$. □
Theorem 9.15.6 (Local Flows for Smooth Vector Fields) Suppose that $f$ is a $C^\infty$ vector field on $\mathbb{R}^m$. Then $x' = f(x)$ has a $C^\infty$ local flow on a neighbourhood of every point in $\mathbb{R}^m$.
Proof By Lemma 9.15.5 and Proposition 9.15.3, the $C^0$ local flows $\varphi : U \times (-\delta, \delta) \to \mathbb{R}^m$ given by Theorem 9.7.2 are $C^r$ for all $r \ge 0$. □
Theorem 9.15.7 (Existence of $C^r$ Local Flows) Let $1 \le r \le \infty$. If $f$ is a $C^r$ vector field on $\mathbb{R}^m$, then $x' = f(x)$ has a $C^r$ local flow on a neighbourhood of every point in $\mathbb{R}^m$.
Proof We give the details for $r = 1$. The general case follows by induction on $r$.
Fix $x_0 \in \mathbb{R}^m$ and let $D_1 = D_1(x_0)$.
We may assume that $f$ has compact support (multiply $f$ by a tabletop function which is 1 on the neighbourhood $D_1$). By Theorem 9.14.16, we may choose a sequence $(f_n) \subset C^\infty_c(\mathbb{R}^m, \mathbb{R}^m)$ such that $(f_n)$ converges to $f$ in $\|\cdot\|_1$. By $\|\cdot\|_1$ convergence of $(f_n)$ to $f$, there exist $M_1, M_2 > 0$ such that $\sup_{x \in \mathbb{R}^m} \|g(x)\| \le M_1$, $\sup_{x \in \mathbb{R}^m} \|Dg_x\| \le M_2$, for all $g \in \{f\} \cup \{f_n \mid n \in \mathbb{N}\}$. It follows from the proof of Theorem 9.7.2 that we may choose open neighbourhoods $V, W$ of $x_0$, with $\overline{V} \subset W$, and $\delta > 0$, so that every $g \in \{f\} \cup \{f_n \mid n \in \mathbb{N}\}$ has a local flow $\varphi^g : W \times [-\delta, \delta] \to \mathbb{R}^m$. If $g = f$, set $\varphi^g = \varphi$, and if $g = f_n$, set $\varphi^g = \varphi_n$. Shrinking $V, W$ and $\delta$ if necessary, we may assume that $\varphi(W \times [-\delta, \delta]) \subset D_1(x_0)$ (so that $\varphi$ coincides with the local flow of the unmodified vector field $f$).
It follows from Lemma 9.15.5 that $\varphi_n$ is $C^\infty$ on $W \times [-\delta, \delta]$, all $n \in \mathbb{N}$. Set $K = \overline{V} \times [-\delta, \delta]$. Using continuous dependence on parameters, or direct computation, $(\varphi_n)$ converges uniformly to $\varphi$ on $K$.
Next we consider the equation of variations. For $n \ge 1$, set $U_n(x, t) = D_1 \varphi_n(x, t)$, $(x, t) \in K$. We have
\[ U_n'(x, t) = G_n(t) \circ U_n(x, t), \quad n \in \mathbb{N}, \tag{9.23} \]
\[ U'(x, t) = G(t) \circ U(x, t), \tag{9.24} \]
where $G_n(t) = D(f_n)_{\varphi_n(x,t)}$ and $G(t) = Df_{\varphi(x,t)}$. Now $\varphi_n$ converges uniformly to $\varphi$ on $K$ and $Df_n$ converges uniformly to $Df$ on $\mathbb{R}^m$. We have
\[ \|D(f_n)_{\varphi_n} - Df_{\varphi}\|^K_0 \le \|D(f_n)_{\varphi_n} - D(f_n)_{\varphi}\|^K_0 + \|D(f_n)_{\varphi} - Df_{\varphi}\|^K_0. \]
The second term on the right-hand side converges to zero by the uniform convergence of $Df_n$. For the first term, observe that, since $(Df_n)$ is an equicontinuous family on $\mathbb{R}^m$ (Lemma 7.16.9), given $\varepsilon > 0$, there exists an $\eta > 0$ such that $\|D(f_n)_X - D(f_n)_Y\| < \varepsilon$ whenever $X, Y \in \mathbb{R}^m$ satisfy $\|X - Y\| < \eta$. Since $(\varphi_n)$ converges uniformly to $\varphi$ on $K$, there exists an $N \in \mathbb{N}$ such that $\|\varphi_n - \varphi\|^K_0 < \eta$, for all $n \ge N$. Consequently, $\|D(f_n)_{\varphi_n} - D(f_n)_{\varphi}\|^K_0 \le \varepsilon$ for $n \ge N$. Hence the first term converges to zero as $n \to \infty$. Therefore, $\|D(f_n)_{\varphi_n} - Df_{\varphi}\|^K_0 \to 0$ as $n \to \infty$. It now follows from our earlier result for linear differential equations (Proposition 9.7.5) that $U_n = D_1 \varphi_n$ converges uniformly to the solution $U$ of (9.24), the equation of variations for $x' = f(x)$. Hence $\varphi$ is $C^1$ and $D_1 \varphi = U$. □
Example 9.15.8 We can use Theorem 9.15.6 to give an alternative proof of the inverse function theorem. We sketch the basic idea. Suppose that $f : \mathbb{R}^m \to \mathbb{R}^m$ is $C^\infty$, $f(0) = 0$, and $Df_0$ is a linear isomorphism. We consider the problem of solving $f(x) = y$, for $y$ close to the origin. Fixing $y$, let us try to solve $f(x) = ty$, $t \in (-a, a)$, where $a > 1$. If we could solve the equation, we would get a family of solutions $x(t)$ such that
\[ f(x(t)) = ty, \quad t \in (-a, a). \]
Differentiating we get $Df_{x(t)}(x'(t)) = y$ and so $x(t)$ satisfies the ODE
\[ x'(t) = (Df_{x(t)})^{-1}(y), \quad x(0) = 0. \]
We obtain a $C^\infty$ solution $\varphi(t, y)$. In this case the initial condition is always $\varphi(0, y) = 0$, and the solution depends $C^\infty$ on the parameter $y$ (a slight extension of the previous result to allow dependence on parameters). It is not hard to show that we can choose $r > 0$ such that $\varphi(t, y)$ is defined for $|t| \le 1$, provided that $\|y\| \le r$. We define the inverse map by $f^{-1}(y) = \varphi(1, y)$, $\|y\| < r$. Since $\varphi$ is $C^\infty$, it is immediate that $f^{-1}$ is $C^\infty$.
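The idea of Example 9.15.8 also gives a practical way to invert a map numerically. The following sketch is an illustration under our own assumptions: it takes $m = 1$, uses the classical Runge-Kutta method to integrate $x'(t) = y / f'(x(t))$ from $x(0) = 0$ to $t = 1$, and the function name `inverse_by_flow` and the test map $f(x) = x + x^3$ are hypothetical choices, not taken from the text.

```python
def inverse_by_flow(f_prime, y, steps=200):
    """Approximate f^{-1}(y) for a map with f(0) = 0 and f' != 0 (case m = 1).

    Integrates x'(t) = y / f'(x(t)), x(0) = 0, up to t = 1 with classical RK4,
    so that f(x(t)) = t * y along the solution and x(1) solves f(x) = y.
    """
    def rhs(x):
        return y / f_prime(x)
    x, h = 0.0, 1.0 / steps
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + h / 2 * k1)
        k3 = rhs(x + h / 2 * k2)
        k4 = rhs(x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Assumed example: f(x) = x + x^3, so f(0) = 0 and f'(x) = 1 + 3 x^2 > 0.
f = lambda x: x + x ** 3
f_prime = lambda x: 1.0 + 3.0 * x * x
x_half = inverse_by_flow(f_prime, 0.5)
```

Here `f(x_half)` should be close to 0.5. For this particular $f$ the solution exists for all $y$, whereas in general the text only guarantees it for $\|y\|$ small.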
EXERCISES 9.15.9
(1) Complete the proof of Proposition 9.15.3.
(2) Let $U$ be an open subset of $\mathbb{R}^m$ and $f : U \to \mathbb{R}^m$ be $C^1$. Show that if $x_0 \in U$, the ODE $x' = f(x)$ has a $C^0$ local flow defined on a neighbourhood of $x_0$. Extend to the case of $C^r$ local flows, where $f$ is $C^r$. (Hint: start by multiplying $f$ by a smooth tabletop function $\Psi$ which is equal to one on a neighbourhood $W$ of $x_0$ and equal to zero on a neighbourhood of $\mathbb{R}^m \setminus U$. Apply Theorem 9.7.2 to $x' = \Psi(x) f(x)$ to obtain a local flow $\varphi : V \times [-a, a] \to \mathbb{R}^m$, where $V \subset W$. Show that we can choose $b \in (0, a]$ so that $\varphi : V \times [-b, b] \to U$ defines a local flow for $x' = f(x)$.)
9.16 Diffeomorphisms and Flows
In Sect. 9.6, we gave the formal definition of a diffeomorphism between open subsets of $\mathbb{R}^m$. Yet we have been coy about giving specific examples. For example, what can one say about the group $\mathrm{Diff}^r(\mathbb{R}^m)$ of $C^r$ diffeomorphisms of $\mathbb{R}^m$? The case $m = 1$ is easy: a $C^r$ diffeomorphism of $\mathbb{R}$ is given by a strictly monotone $C^r$ surjection of $\mathbb{R}$ with nowhere-vanishing derivative. The case $m > 1$ is not so transparent. Obviously, any linear map $A \in GL(\mathbb{R}, m)$ defines a $C^\infty$ diffeomorphism of $\mathbb{R}^m$, but this is a trivial example that needs no differential calculus or analysis for its elucidation. Using the method of proof of the inverse function theorem, it is straightforward to show that if $A \in GL(\mathbb{R}, m)$, then $F(x) = Ax + \varphi(x)$ will be a $C^\infty$ diffeomorphism of $\mathbb{R}^m$ for $\varphi \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^m)$ and $\|\varphi\|_1$ sufficiently small. But this seems likely to give a small and unrepresentative class of diffeomorphisms of $\mathbb{R}^m$. If we look at polynomials $p \in P^d(\mathbb{R}^m, \mathbb{R}^m)$, $d > 1$, surprisingly little is known. We recall the Jacobian Conjecture:
Suppose $p \in P^d(\mathbb{R}^m, \mathbb{R}^m)$, $d > 1$, and the Jacobian $\det(Dp)$ is a non-zero constant on $\mathbb{R}^m$. Then $p$ is a diffeomorphism of $\mathbb{R}^m$.
The conjecture was first made in 1939 by Ott-Heinrich Keller. At this time (2017), the conjecture is neither proved nor disproved, even if $m = 2$. Many erroneous proofs have been proposed. Note that when $p$ is a diffeomorphism, the inverse is a polynomial map.
For the remainder of the section we show how we can use the theory of ODEs, in particular the existence of local flows (Theorem 9.15.7), to construct many non-trivial examples of smooth diffeomorphisms of $\mathbb{R}^m$.
9.16.1 Smooth Flows
Definition 9.16.1 A map $\Phi : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$ is a smooth or $C^\infty$ flow on $\mathbb{R}^m$ if $\Phi$ is $C^\infty$ and
\[ \Phi(x, 0) = x, \quad \text{for all } x \in \mathbb{R}^m, \tag{9.25} \]
\[ \Phi(x, t + s) = \Phi(\Phi(x, t), s), \quad \text{for all } x \in \mathbb{R}^m, \ s, t \in \mathbb{R}. \tag{9.26} \]
Remark 9.16.2 The definition, and most of what we do below, generalizes straightforwardly to $C^r$ flows.
Suppose that $\Phi : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$ is a smooth flow. For $t \in \mathbb{R}$, let $\Phi_t : \mathbb{R}^m \to \mathbb{R}^m$ be the $C^\infty$ map defined by $\Phi_t(x) = \Phi(x, t)$, $x \in \mathbb{R}^m$.
Proposition 9.16.3 If $\Phi$ is a smooth flow, then
(1) $\Phi_0 = I_m$.
(2) For all $t, s \in \mathbb{R}$, $\Phi_{t+s} = \Phi_t \circ \Phi_s = \Phi_s \circ \Phi_t$.
(3) For all $t \in \mathbb{R}$, $\Phi_t \in \mathrm{Diff}^\infty(\mathbb{R}^m)$ and has inverse $\Phi_{-t}$.
Proof Statement (1) is immediate from (9.25). Next observe that $\Phi(x, t + s) = \Phi(\Phi(x, t), s)$ for all $x, t, s$ iff $\Phi_{t+s} = \Phi_s \circ \Phi_t$ for all $t, s$. Since $\Phi(x, t + s) = \Phi(x, s + t)$, this proves (2). Finally (3) is immediate from (1,2), since $\Phi_t \circ \Phi_{-t} = \Phi_{-t} \circ \Phi_t = \Phi_0 = I_m$. □
Remark 9.16.4 (1,2) of Proposition 9.16.3 are referred to as the one-parameter group property of a flow: the map $\gamma : \mathbb{R} \to \mathrm{Diff}^\infty(\mathbb{R}^m)$ defined by $\gamma(t) = \Phi_t$ is a group homomorphism.
We continue to assume $\Phi : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$ is a smooth flow. For $x \in \mathbb{R}^m$, define $\Phi_x : \mathbb{R} \to \mathbb{R}^m$ by $\Phi_x(t) = \Phi(x, t)$. We also define the $C^\infty$ vector field $f = f_\Phi$ on $\mathbb{R}^m$ by
\[ f(x) = \frac{\partial \Phi}{\partial t}(x, 0), \quad x \in \mathbb{R}^m. \tag{9.27} \]
Remark 9.16.5 If $\Phi$ is a $C^r$ flow, then $f$ is only $C^{r-1}$. This generates some complications and is the main reason why we restrict to smooth flows.
Proposition 9.16.6 (Notation and Assumptions as Above) For all $x \in \mathbb{R}^m$, $\Phi_x : \mathbb{R} \to \mathbb{R}^m$ is the unique solution to $x' = f(x)$ with initial condition $x$.
Proof Fix $x \in \mathbb{R}^m$. Differentiating the identity $\Phi(\Phi(x, t), s) = \Phi(x, t + s)$ with respect to $s$ and setting $s = 0$, we get
\[ f(\Phi_x(t)) = \frac{d}{ds} \Phi_x(t + s)\Big|_{s=0} = \Phi_x'(t), \quad t \in \mathbb{R}. \]
Since $\Phi_x(0) = x$, it follows that $\Phi_x$ is a solution to $x' = f(x)$ with initial condition $x$. That the solution is unique follows from Lemma 9.7.4. □
Example 9.16.7 Let $A \in L(\mathbb{R}^m, \mathbb{R}^m)$ and consider the linear ODE $x' = Ax$. The solution with initial condition $x$ is given by $\Phi_x(t) = \exp(At)x$, where $\exp(At) = \sum_{n=0}^{\infty} \frac{A^n}{n!} t^n$. Since $\exp(At)\exp(As) = \exp(A(t + s))$, it follows that $\Phi(x, t) = \exp(At)x$ is a smooth (linear) flow.
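The flow property of Example 9.16.7 can be checked numerically for a concrete matrix. The sketch below is illustrative only: it truncates the power series for $\exp(At)$ after a fixed number of terms (adequate for small matrices and moderate $|t|$, but not a robust general-purpose method), and the helper names and the choice of $A$ as the generator of plane rotations are our own assumptions.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=40):
    """exp(At) via the power series sum_n (At)^n / n!, truncated after `terms` terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term (At)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result

def flow(A, x, t):
    """Phi(x, t) = exp(At) x."""
    E = mat_exp(A, t)
    return [sum(E[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

# Assumed example: A generates rotations of the plane, so exp(At) is rotation by angle t.
A = [[0.0, -1.0], [1.0, 0.0]]
x = [1.0, 0.0]
```

For this $A$, `flow(A, x, t)` should agree with $(\cos t, \sin t)$, and composing the flow for times 0.3 and 0.4 should agree with the flow for time 0.7.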
Definition 9.16.8 Let $f$ be a smooth vector field on $\mathbb{R}^m$, $x_0 \in \mathbb{R}^m$, and $-\infty \le a < b \le +\infty$. A solution curve $\varphi : (a, b) \to \mathbb{R}^m$ to $x' = f(x)$ with initial condition $x_0$ is maximal if given any solution curve $\psi : (c, d) \to \mathbb{R}^m$, with initial condition $x_0$, $(c, d) \subset (a, b)$.
Lemma 9.16.9 (Notation and Assumptions as Above) For each initial condition $x_0 \in \mathbb{R}^m$, there is a unique maximal solution curve to $x' = f(x)$ with initial condition $x_0$.
Proof Let $\{\varphi_\lambda : I_\lambda \to \mathbb{R}^m \mid \lambda \in \Lambda\}$ denote the set of all solution curves to $x' = f(x)$ with initial condition $x_0$. Set $I = \cup_{\lambda \in \Lambda} I_\lambda$ and define $\varphi(t) = \varphi_\lambda(t)$, for $t \in I_\lambda \subset I$. The map $\varphi : I \to \mathbb{R}^m$ is well defined, by Lemma 9.7.4, and obviously maximal by construction. Uniqueness follows from Lemma 9.7.4. □
Given a smooth vector field $f : \mathbb{R}^m \to \mathbb{R}^m$, let $\Phi_x : I_x \to \mathbb{R}^m$ denote the maximal solution curve for $x' = f(x)$ with initial condition $x$. Set $D = \cup_{x \in \mathbb{R}^m} \{x\} \times I_x \subset \mathbb{R}^m \times \mathbb{R}$. Define $\Phi(x, t) = \Phi_x(t)$, $(x, t) \in D$.
Proposition 9.16.10 (Local Flows) (Notations and assumptions as above.) We have
(1) $\Phi(x, 0) = x$ for all $x \in \mathbb{R}^m$.
(2) If $x \in \mathbb{R}^m$, $s, t \in \mathbb{R}$, and $(x, t), (x, t + s) \in D$, then $\Phi(x, t + s) = \Phi(\Phi(x, t), s)$.
(3) $D$ is an open subset of $\mathbb{R}^m \times \mathbb{R}$ containing $\mathbb{R}^m \times \{0\}$.
(4) $\Phi : D \to \mathbb{R}^m$ is smooth.
(5) If $D = \mathbb{R}^m \times \mathbb{R}$, $\Phi$ is a smooth flow.
Proof (Sketch) Statement (1) follows by definition and (2) follows by uniqueness of maximal solution curves (if $I_x = (a, b)$, then $I_{\Phi(x,t)} = (a - t, b - t)$). It follows from the previous section that at every point $y = \Phi_x(t)$, $(x, t) \in D$, there is a local $C^\infty$ flow. Together with uniqueness of maximal solutions, it follows that $D$ is an open neighbourhood of $\mathbb{R}^m \times \{0\}$. Statements (4,5) use Lemma 9.15.5. □
Remark 9.16.11 Proposition 9.16.10 holds if 'smooth' is replaced everywhere by '$C^r$', $r \ge 1$. Note that $\Phi$ will be $C^r$ (in $(x, t)$) but $C^{r+1}$ in $t$ ($\Phi_x' = f \circ \Phi_x$, which is $C^r$ in $t$ by the chain rule). As a result the vector field defined by $\Phi$ in (9.27) will be $C^r$, not just $C^{r-1}$.
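Example 9.16.12 The ODE $x' = x^2$, $x \in \mathbb{R}$, gives an example where $D$ is a proper subset. A straightforward computation gives
\[ \Phi(x, t) = \begin{cases} 0, & \text{if } x = 0, \\ \dfrac{x}{1 - tx}, & \text{if } x \ne 0. \end{cases} \]
If $x < 0$, then $I_x = (x^{-1}, \infty)$, if $x > 0$, $I_x = (-\infty, x^{-1})$, and if $x = 0$, $I_x = \mathbb{R}$.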
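For readers who want the "straightforward computation" spelled out, here is one way to carry it out by separation of variables. This is a sketch of a standard argument rather than the author's own derivation; $x_0 \ne 0$ denotes the initial condition and $x(t)$ the solution.

```latex
% Separation of variables for x' = x^2 with x(0) = x_0 \neq 0.
\begin{align*}
\frac{d}{dt}\Big(-\frac{1}{x(t)}\Big) &= \frac{x'(t)}{x(t)^2} = 1
  && \text{(valid while } x(t) \neq 0\text{)} \\
-\frac{1}{x(t)} + \frac{1}{x_0} &= t
  && \text{(integrate from } 0 \text{ to } t\text{)} \\
x(t) &= \frac{x_0}{1 - t x_0}
  && \text{(solve for } x(t)\text{)}
\end{align*}
% The denominator vanishes at t = 1/x_0, so the solution through x_0 > 0
% exists exactly for t < 1/x_0, and the solution through x_0 < 0 exactly for t > 1/x_0.
```

The blow-up time $t = x_0^{-1}$ is what produces the one-sided intervals $I_x$ listed in the example.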
Our final result on smooth flows yields many examples of smooth flows and diffeomorphisms of $\mathbb{R}^m$.
Theorem 9.16.13 Let $f$ be a smooth vector field on $\mathbb{R}^m$. Suppose one of the following conditions holds
(1) $f \in C^\infty_c(\mathbb{R}^m, \mathbb{R}^m)$.
(2) There exist constants $A, B \in \mathbb{R}_+$ such that for all $x \in \mathbb{R}^m$
\[ \|f(x)\| \le A + B\|x\|. \]
Then $f$ has a smooth flow.
Proof Let $x_0 \in \mathbb{R}^m$ and set $r_0 = \|x_0\|$ and $\varphi(t) = \Phi_{x_0}(t)$, $t \in I_{x_0}$. Assume (2) holds ((1) is easier). It follows that $\|\varphi(t)\|$ is bounded by $r(t)$ where $r' = A + Br$, $r(0) = r_0$. That is, $\|\varphi(t)\|$ can grow at most exponentially in $t$. It follows that $\varphi(t)$ cannot go to infinity in finite time. Specifically, if $(a, b) \subset I_{x_0}$, $-\infty < a < b < +\infty$, then there exists an $R > 0$ such that $\varphi((a, b)) \subset D_R$. Hence $\overline{\varphi((a, b))}$ is compact and from this it follows easily that we can choose $a' < a$, $b' > b$ such that $(a', b') \subset I_{x_0}$. Consequently, $I_{x_0} = \mathbb{R}$ (else require $I_{x_0}$, $(a, b)$ to share an end-point). □
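The comparison equation $r' = A + Br$ used in the proof can be solved explicitly, which makes the "at most exponential" growth concrete. The formula below is a standard computation, stated here for $t \ge 0$; it is a supplementary sketch, not part of the original proof.

```latex
% Solution of the linear comparison equation r' = A + B r, r(0) = r_0.
\[
r(t) =
\begin{cases}
\Big(r_0 + \dfrac{A}{B}\Big) e^{Bt} - \dfrac{A}{B}, & B > 0, \\[2ex]
r_0 + A t, & B = 0.
\end{cases}
\]
% Check for B > 0: r'(t) = (B r_0 + A) e^{Bt} = A + B r(t), and r(0) = r_0.
% In either case r(t) is finite for every finite t, so a solution bounded by r(t)
% stays in a bounded set on any bounded time interval.
```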
EXERCISES 9.16.14
(1) Fill in the details for the proof of Proposition 9.16.10.
(2) Show that if $f \in C^\infty(\mathbb{R}^m, \mathbb{R}^m)$ there exists a $C^\infty$ map $\sigma : \mathbb{R}^m \to \mathbb{R}(> 0)$ such that $\sigma f$ satisfies (2) of Theorem 9.16.13. Verify that $f$, $\sigma f$ have the same trajectories, that is, $\Phi_x(I_x) = \Phi^\sigma_x(\mathbb{R})$, where $\Phi_x : I_x \to \mathbb{R}^m$ is the maximal solution curve for $x' = f(x)$ with initial condition $x$ and $\Phi^\sigma_x$ gives the maximal solution curve for $\sigma f$. (Hint: For $n \in \mathbb{N}$, define $A_n = 1 + \sup_{n-1 \le \|x\| \le n} \|f(x)\|$ and define $\sigma(x) = \sum_{n=1}^{\infty} A_n^{-1} \Pi_n(x)$, where the $\Pi_n$ are suitably chosen smooth functions with compact support.)
9.17 Concluding Comments
(a) The definition of derivative only used the existence of a norm on $\mathbb{R}^m$ and extends to general 'normed vector spaces', including infinite-dimensional spaces. For example, one may take the space $C^0(I)$ of all continuous $\mathbb{R}$-valued functions on the closed interval $I = [0, 1]$ with norm defined by $\|f\| = \sup_{t \in [0,1]} |f(t)|$. If the normed space is complete (for example, $C^0(I)$) then the inverse and implicit function theorems apply [6].
(b) The contraction mapping lemma can be extended significantly. The version we
gave involving parameters showed that under mild conditions the fixed point
depends continuously on the parameter. This can be generalized to allow for
the fixed point to depend differentiably on parameters. It is at this point that
the generalizations sketched in (a) come into play and allow direct proofs of
the existence and uniqueness theorem for ODEs as well as other foundational
results in the theory of differential equations. For more details we refer the
reader to the text Smooth Dynamical Systems by Irwin [16].
(c) The theory of smooth flows has far reaching generalizations to smooth compact
manifolds—for example, the unit sphere in Euclidean space—and leads naturally into the subject of differentiable dynamical systems. From an extensive
literature, we suggest John Milnor’s monograph on differential topology [25] for
a concise introduction to differential manifolds and the texts by Morris Hirsch et
al. [14] and Stephen Strogatz [28] for introductions to differentiable dynamical
systems.
9.18 Appendix: Finite-Dimensional Normed Vector Spaces
In Sect. 9.2, we showed how starting with the Euclidean vector spaces $\mathbb{R}^m$, $\mathbb{R}^n$
we arrived at a new normed vector space $L(\mathbb{R}^m, \mathbb{R}^n)$ and that the operator norm
we defined on $L(\mathbb{R}^m, \mathbb{R}^n)$ was not generally the Euclidean norm obtained via the
isomorphism $L(\mathbb{R}^m, \mathbb{R}^n) \cong \mathbb{R}^{mn}$, $[a_{ij}] \mapsto (a_{ij})$. The question arose as to whether
we always get the same topology on $L(\mathbb{R}^m, \mathbb{R}^n)$, that is, might the open sets (and
continuous functions) depend on the choice of norms on $\mathbb{R}^m$, $\mathbb{R}^n$? In this appendix,
we resolve this issue and show that for finite-dimensional vector spaces, all norms
define the same topology and hence the same continuous functions.
Theorem 9.18.1 Any two norms on a finite-dimensional vector space V are
equivalent. In particular,
(1) All norms define the same topology on V.
(2) $(V, \|\cdot\|)$ is complete with respect to any norm $\|\cdot\|$ on $V$.
The key step in the proof of Theorem 9.18.1 is given by the following lemma.
Lemma 9.18.2 Let $(V, \|\cdot\|)$ be a normed vector space and suppose $L : V \to \mathbb{R}$ is
linear. Then $L$ is continuous iff $L^{-1}(0)$ is a closed subspace of $V$.
Proof If $L$ is continuous, then $L^{-1}(0)$ is a closed subspace of $V$ by standard metric
space theory. The proof of the converse is not so simple. We may assume $L$ is not
identically zero. Set $D = D_1(0) \subset V$. Observe that $D$ is balanced: $tD \subset D$ for all
$t \in [-1,1]$. Since $L$ is linear, $L(D) \subset \mathbb{R}$ is also balanced. It suffices to prove that if
$L$ is not continuous then $L^{-1}(0)$ is not a closed subspace of $V$. If $L$ is not continuous
at $x = 0$, then $L(D)$ is not a bounded subset of $\mathbb{R}$ (continuity at $x = 0$ implies there
exists an $r > 0$ such that $L(D_r(0)) \subset [-1,1]$. But then $L(D) \subset [-1/r, 1/r]$). Since
$L(D)$ is balanced, we must therefore have $L(D) = \mathbb{R}$. Let $x \in V$, $\varepsilon > 0$. Since
$L(D) = \mathbb{R}$, there exists a $z \in D$ such that $L(z) = \varepsilon^{-1} L(x)$. That is, $L(x - \varepsilon z) = 0$ and
so $x - \varepsilon z \in L^{-1}(0)$ and $d(x, L^{-1}(0)) < \varepsilon$. Since this is so for all $\varepsilon > 0$, $x \in \overline{L^{-1}(0)}$.
Our argument proves that $L^{-1}(0)$ is dense in $V$. Since $L \neq 0$, $L^{-1}(0) \neq V$ and so
$L^{-1}(0)$ is not closed. □
Proof of Theorem 9.18.1. Our proof is by a double induction. For $n \in \mathbb{N}$, let $E_n$ be
the statement that all norms on an $n$-dimensional vector space are equivalent and $C_n$
be the statement that every $n$-dimensional normed vector space is complete (in the
associated metric). The induction depends on showing that: $E_1$ is true, $E_n \Longrightarrow C_n$,
$C_n \Longrightarrow E_{n+1}$. We leave the verification of $E_1$ to the exercises.
$E_n \Longrightarrow C_n$. Let $V$ be an $n$-dimensional vector space. We start by noting that if
$\|\cdot\|_1$ is equivalent to $\|\cdot\|_2$ then $V$ is complete in the metric defined by $\|\cdot\|_1$ iff $V$ is
complete in the metric defined by $\|\cdot\|_2$ (this is easy to check as both metrics have
the same Cauchy sequences). Consequently, to verify that $E_n \Longrightarrow C_n$ it is enough
to find one norm on $V$ relative to which $V$ is complete. For this, choose a linear
isomorphism $A : V \to \mathbb{R}^n$ and define $\|x\| = \|Ax\|_2$, where $\|\cdot\|_2$ is the Euclidean
norm on $\mathbb{R}^n$. Since $A$ is then an isometry of $V$ onto the complete space $(\mathbb{R}^n, \|\cdot\|_2)$,
$V$ is complete in this norm.
$C_n \Longrightarrow E_{n+1}$. Suppose $V$ is an $(n+1)$-dimensional normed vector space. Fix a
basis $\{v_1, \dots, v_{n+1}\}$ for $V$ and let $A : V \to \mathbb{R}^{n+1}$ be the linear isomorphism defined
by $A(\sum_{i=1}^{n+1} x_i v_i) = (x_1, \dots, x_{n+1})$. Define the norm $\|\cdot\|_\star$ on $V$ by $\|x\|_\star = \|Ax\|_\infty$,
where $\|(x_1, \dots, x_{n+1})\|_\infty = \max_i |x_i|$. Suppose $\|\cdot\|$ is a norm on $V$. It suffices to
prove that $\|\cdot\|$ and $\|\cdot\|_\star$ are equivalent. Denote the components of $A$ by $a_i : V \to \mathbb{R}$,
$i = 1, \dots, n+1$. Fix $i$ and set $a = a_i$. Since $A$ is a linear isomorphism, $a \neq 0$ and
so $E = a^{-1}(0)$ is an $n$-dimensional linear subspace of $V$. Take the induced norm
on $E$. By hypothesis $C_n$, $(E, \|\cdot\|)$ is complete. A subspace of a metric space which
is complete in the induced metric contains all its limit points and hence is closed.
Therefore, $E$ is a proper closed subset of $V$ and $a$ is continuous by Lemma 9.18.2.
Hence
\[
\|x\|_\star = \|Ax\|_\infty = \|(a_1(x), \dots, a_{n+1}(x))\|_\infty \le K\|x\|,
\]
where $K = \sup\{|a_i(x)| \mid 1 \le i \le n+1,\ \|x\| \le 1\} < \infty$. On the other hand,
$A^{-1}(x_1, \dots, x_{n+1}) = \sum_{i=1}^{n+1} x_i v_i$ and so, by the triangle inequality,
\[
\|x\| = \Big\| \sum_{i=1}^{n+1} x_i v_i \Big\| \le \sum_{i=1}^{n+1} |x_i|\,\|v_i\| \le (n+1) \max_i\{\|v_i\|\} \max_i |x_i| = (n+1) \max_i\{\|v_i\|\}\, \|x\|_\star.
\]
Hence the norms $\|\cdot\|$ and $\|\cdot\|_\star$ are equivalent. □
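The two constants in the last step of the proof can be computed by hand for a specific norm and then tested on random vectors. The sketch below is an illustration under our own assumptions: the norm $\|x\| = |x_1 + 2x_2| + |x_2 - x_3| + |x_3|$ on $\mathbb{R}^3$ (the 1-norm composed with an invertible linear map) and the names `norm`, `norm_star`, `K` and `upper` are not from the text. For the standard basis the triangle inequality gives $\|x\| \le d \max_i \|v_i\| \, \|x\|_\star$ with $d = \dim V = 3$, and inverting the linear map by hand gives $\|x\|_\star \le K\|x\|$ with $K = 2$.

```python
import random

def norm(x):
    """An illustrative norm on R^3: the 1-norm of M x for an invertible matrix M."""
    x1, x2, x3 = x
    return abs(x1 + 2 * x2) + abs(x2 - x3) + abs(x3)

def norm_star(x):
    """The max norm with respect to the standard basis."""
    return max(abs(c) for c in x)

basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Upper constant from the triangle inequality: dim V times the largest basis norm.
upper = len(basis) * max(norm(v) for v in basis)

# Lower constant: K = sup{|a_i(x)| : norm(x) <= 1}. For this norm K = 2, the largest
# absolute entry of the inverse of M (worked out by hand, an assumption of the sketch).
K = 2.0

random.seed(0)
samples = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(10_000)]
ratios = [norm(x) / norm_star(x) for x in samples]
print(min(ratios), max(ratios), 1.0 / K, upper)
```

Every sampled ratio $\|x\| / \|x\|_\star$ should lie between $1/K$ and the upper constant. The sampled extremes also show that the upper constant is far from sharp for this norm.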
Corollary 9.18.3 If $(V, \|\cdot\|)$ and $(W, \|\cdot\|)$ are finite-dimensional normed vector
spaces then every linear map $A : V \to W$ is continuous.
Proof If we choose bases for $V$ and $W$ we can always assume by Theorem 9.1.4
that the norms on $V \cong \mathbb{R}^m$ and $W \cong \mathbb{R}^n$ are the Euclidean norms. Apply
Proposition 9.2.6. □
We end with a topological characterization of finite-dimensional normed vector
spaces.
Theorem 9.18.4 (F. Riesz) Let $(E, \|\cdot\|)$ be a normed vector space. Then the closed
unit disk $D_1(0)$ in $E$ is compact iff $E$ is finite-dimensional.
Proof If $E$ is of finite dimension $n$, then we may fix an isomorphism $E \cong \mathbb{R}^n$. Let
$\|\cdot\|_2$ denote the induced Euclidean norm on $E$. Now all closed disks $B_r(0)$ are
compact in $(E, \|\cdot\|_2)$. Since $\|\cdot\|$, $\|\cdot\|_2$ are equivalent norms, we may choose $R > 0$
such that $D_1(0) \subset B_R(0)$. Hence $D_1(0)$, being a closed subset of a compact set, is compact.
Conversely, suppose that $D_1(0)$ is a compact subset of $E$. By compactness (either
sequential compactness or the open cover definition may be used), we
may choose a finite subset $\{f_1, \dots, f_k\} \subset D_1(0)$ such that $d(x, \{f_1, \dots, f_k\}) < 1$ for all
$x \in D_1(0)$. Let $F$ be the finite-dimensional subspace of $E$ spanned by $\{f_1, \dots, f_k\}$.
By Theorem 9.1.4(2), $(F, \|\cdot\|)$ is a closed normed vector subspace of $E$. It suffices to
show $F = E$. If not, we may choose $x \in E \setminus F$ such that $d(x, F) > 0$. Every closed
disk with centre 0 in $F$ is contained in $\lambda D_1(0)$ for some $\lambda > 0$ and so all closed disks
with centre 0 in $F$ are compact. Since $d(x, f) = \|x - f\|$ is a continuous function of
$f \in F$, it follows that the lower bound $d(x, F) = \inf_{f \in F} \|x - f\|$ is attained at some
point $y \in F$. That is, $d(x, F) = \|x - y\|$, where $y \in F$. Set $z = (x - y)/\|x - y\| \in D_1(0)$.
We have
\[
\begin{aligned}
d(z, F) &= d\left(\frac{x - y}{\|x - y\|}, F\right) \\
&= \frac{1}{\|x - y\|}\, d(x - y, F) && \text{(scalar invariance of } d\text{)} \\
&= \frac{1}{\|x - y\|}\, d(x, F) && \text{(translation invariance of } d\text{)} \\
&= 1.
\end{aligned}
\]
But $d(z, F) \le d(z, \{f_1, \dots, f_k\}) < 1$. Contradiction. Hence $E = F$ and $E$ is finite-dimensional. □
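EXERCISES 9.18.5
(1) Show that if $\|\cdot\|_1$ is equivalent to $\|\cdot\|_2$ and $\|\cdot\|_2$ is equivalent to $\|\cdot\|_3$ then
$\|\cdot\|_1$ is equivalent to $\|\cdot\|_3$.
(2) Prove that all norms on $\mathbb{R}$ are equivalent (statement $E_1$ of the proof of
Theorem 9.1.4).
(3) Show directly that the norms $\|\cdot\|_2$, $\|\cdot\|_1$ and $\|\cdot\|_\infty$ on $\mathbb{R}^n$ are all equivalent.
Specifically, prove that for all $x \in \mathbb{R}^n$, we have
\[
\|x\|_\infty \le \|x\|_1 \le n\|x\|_\infty, \qquad \|x\|_\infty \le \|x\|_2 \le \sqrt{n}\,\|x\|_\infty.
\]
Generalize to all $p$-norms $\|\cdot\|_p$, $p \ge 1$.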
(4) Define the norm $\|\cdot\|_1$ on $C^0([0,1])$ by
\[
\|f\|_1 = \int_0^1 |f(t)|\,dt, \quad f \in C^0([0,1]).
\]
(a) Verify that $\|\cdot\|_1$ does define a norm on $C^0([0,1])$.
(b) By considering the sequence of functions $(f_n) \subset C^0([0,1])$ defined by
8
<
0 t 1=n;
n2 t 2 ;
3
2
fn .t/ D n .t 2=n/ ; 1=n t 2=n;
:
0;
t 2=n;
show that the norm $\|\cdot\|_1$ is not equivalent to the $L^2$-norm $|\cdot|_2$ on $C^0([0,1])$
(see Sect. 5.6 for the definition of $|\cdot|_2$).
(c) By considering the sequence $(g_n) \subset C^0([0,1])$ defined by
\[
g_n(t) = \begin{cases} 1, & 0 \le t \le 1/2 - 1/n, \\ (n/2 + 1 - nt)/2, & 1/2 - 1/n \le t \le 1/2 + 1/n, \\ 0, & t \ge 1/2 + 1/n, \end{cases}
\]
show that $(C^0([0,1]), \|\cdot\|_1)$ is not complete.
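Before attempting Exercises (3) and (4)(c) it can help to experiment numerically. The Python sketch below is only an exploratory aid, not a solution: the helper names `p_norm`, `g` and `l1_distance` are ours, and the integral is approximated by a midpoint rule. It tests the inequalities of Exercise (3) on random vectors and estimates $\|g_n - g_m\|_1$ for the functions of Exercise (4)(c), assuming $n, m \ge 2$ so that the three cases of the definition fit inside $[0,1]$.

```python
import math
import random

def p_norm(x, p):
    """The p-norm on R^n; p = math.inf gives the max norm."""
    if p == math.inf:
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def g(n, t):
    """The function g_n of Exercise (4)(c), for n >= 2 and t in [0, 1]."""
    lo, hi = 0.5 - 1.0 / n, 0.5 + 1.0 / n
    if t <= lo:
        return 1.0
    if t >= hi:
        return 0.0
    return (n / 2 + 1 - n * t) / 2

def l1_distance(n, m, cells=20_000):
    """Midpoint-rule approximation of the integral of |g_n - g_m| over [0, 1]."""
    h = 1.0 / cells
    return h * sum(abs(g(n, (k + 0.5) * h) - g(m, (k + 0.5) * h)) for k in range(cells))

random.seed(1)
dim = 5
vectors = [[random.uniform(-3.0, 3.0) for _ in range(dim)] for _ in range(1000)]
ok = all(
    p_norm(x, math.inf) <= p_norm(x, 1) + 1e-12
    and p_norm(x, 1) <= dim * p_norm(x, math.inf) + 1e-12
    and p_norm(x, math.inf) <= p_norm(x, 2) + 1e-12
    and p_norm(x, 2) <= math.sqrt(dim) * p_norm(x, math.inf) + 1e-12
    for x in vectors
)
print(ok, l1_distance(4, 8), l1_distance(8, 16), l1_distance(16, 32))
```

The computed distances shrink as the indices grow, which is consistent with $(g_n)$ being a Cauchy sequence for $\|\cdot\|_1$. Identifying the pointwise limit and explaining why it is not in $C^0([0,1])$ is the content of the exercise.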
References
1. R. Abraham, J.W. Robbin, Transversal Mappings and Flows (Benjamin, New York, 1967)
2. M. Barnsley, Fractals Everywhere (Academic, New York, 1988)
3. P. Błaszczyk, M.G. Katz, D. Sherry, Ten misconceptions from the history of analysis and their
debunking. Found. Sci. 18(1), 43–74 (2013)
4. A.V. Borovik, Mathematics under the Microscope. Notes on Cognitive Aspects of Mathematical Practice (American Mathematical Society, Providence, RI, 2009)
5. T.J.I’A. Bromwich, Theory of Infinite Series, 2nd edn. (Macmillan and Co., London, 1959)
6. J. Dieudonné, Foundations of Modern Analysis (Academic, New York, 1960)
7. M. Faà di Bruno, Note sur une nouvelle formule de calcul differentiel. Q. J. Pure Appl. Math.
1, 359–360 (1857)
8. K. Falconer, Fractal Geometry: Mathematical Foundations and Applications, 2nd edn. (Wiley,
New York, 2003)
9. M. Field, Differential Calculus and Its Applications (Van Nostrand Reinhold, New York, 1976)
10. M.J. Field, Stratification of equivariant varieties. Bull. Aust. Math. Soc. 16, 279–296 (1977)
11. M. Field, M. Golubitsky, Symmetry in Chaos: A Search for Pattern in Mathematics, Art and
Nature, 2nd edn. (Society for Industrial and Applied Mathematics, Philadelphia, 2009)
12. A. Fraenkel, Abstract Set Theory (North Holland, Amsterdam, 1953)
13. A. Fraenkel, Y. Bar-Hillel, A. Levy, Foundations of Set Theory (North Holland, Amsterdam,
1958)
14. M.W. Hirsch, S. Smale, R. Devaney, Differential Equations, Dynamical Systems, and an
Introduction to Chaos, 3rd edn. (Academic, New York, 2013)
15. J.E. Hutchinson, Fractals and self similarity. Indiana Univ. Math. J. 30, 713–747 (1981)
16. M.C. Irwin, Smooth Dynamical Systems. Advanced Series in Nonlinear Dynamics, vol. 17
(World Scientific, Singapore, 2001). The original book, published by Academic Press, appeared
in 1980
17. W.P. Johnson, The curious history of Faà di Bruno’s formula. Am. Math. Mon. 109, 217–234
(2002)
18. J.L. Kelley, General Topology. Graduate Texts in Mathematics, vol. 27 (Springer, New York,
1975). Originally published 1955, Van Nostrand Reinhold
19. S.G. Krantz, Real Analysis and Foundations, 2nd edn. (Chapman and Hall/CRC, Boca Raton,
2004)
20. S.G. Krantz, H.R. Parks, A Primer of Real Analytic Functions. Basler Lehrbücher, vol. 4
(Birkhäuser, Basel, 1992)
21. L. Kuipers, H. Niederreiter, Uniform Distribution of Sequences (Dover, New York, 2006)
22. J.W. Lamperti, Probability, 2nd edn. (Wiley, New York, 1996)
23. D. Liberzon, Switching in Systems and Control. Systems and Control: Foundations and
Applications (Birkhäuser, Basel, 2003)
24. B. Mandelbrot, The Fractal Geometry of Nature (W.H. Freeman and Co., New York, 1982)
25. J.W. Milnor, Topology from the Differentiable Viewpoint (Princeton University Press, Princeton, NJ, 1965)
26. H.-O. Peitgen, P.H. Richter, The Beauty of Fractals (Springer, New York, 1988)
27. W. Rudin, Principles of Mathematical Analysis, 3rd edn. (McGraw-Hill, New York, 1976)
28. S.H. Strogatz, Nonlinear Dynamics and Chaos (Studies in Nonlinearity), 2nd edn. (Westview
Press, Boulder, 2015)
29. A.N. Whitehead, Science and the Modern World, Paperback edn. (Macmillan Company, New
York, 1925; Cambridge University Press, Cambridge, 2011)
30. S. Willard, General Topology (Dover, New York, 2004). Originally published by Addison-Wesley, Reading, MA, 1970
31. W.H. Young, On the distinction of right and left at points of discontinuity. Q. J. Math. 39,
67–83 (1908)