Uploaded by abrar aljeri

[Halsey L

advertisement
H. L. Royden
STANFORD UNIVERSITY
Real Analysis
SECOND
EDITION
The Macmillan Company
Collier-Macmillan Limited, London
Preface
This book is the outgrowth of a course at Stanford entitled "Theory
of Functions of a Real Variable," which I have given from time to time
during the last ten years. This course was designed for first-year graduate
students in mathematics and statistics. It presupposes a general background
in undergraduate mathematics and specific acquaintance with the material
in an undergraduate course on the fundamental concepts of analysis.
I have attempted to cover the basic material that every graduate student
should know in the classical theory of functions of a real variable and in
measure and integration theory, as well as some of the more important and
elementary topics in general topology and normed linear space theory.
The treatment of material given' here is quite standard in graduate courses
of this sort, although Lebesgue measure and Lebesgue integration are
treated in this book before the general theory of measure and integration.
I have found this a happy pedagogical practice, since the student first
becomes familiar with an important concrete case and then sees that much
of what he has learned can be applied in very general situations.
The preparation of a revised edition has given me the opportunity to
expand points that students have found difficult and to improve a number
of awkward proofs. Except for the addition of problems, little change has
been made in the first seven chapters. The remaining chapters in Part Two
have been enlarged by the addition of sections on products of topological
spaces, the Stone- t ech compactification, and topological vector spaces.
Part Three, on the other hand, has been considerably revised. The approach
to the abstract Lebesgue integral in Chapter 11 has been altered slightly
to make the development parallel to the one used in Chapter 13. The
proof of the Radon-Nikodym theorem has been simplified, and a section
on LP spaces has been added. Chapter 12 contains new material on inner
vii
viii
Preface
measure and its role in extension and uniqueness theorems. Chapter 13
on the Daniell integral has been rearranged to correct several erroneous
proofs that made tacit use of things not proved until later. A new chapter
on measure theory in locally compact spaces has been added.
There is considerable independence among chapters, and the chart on
page 4 gives the essential dependencies. The instructor thus has considerable freedom in arranging the material here into a course according to his
taste. Sections that are peripheral to the principal line of argument have
been starred (*). The Prologue to the Student lists some of the notations
and conventions and makes some suggestions.
The material in this book belongs to the common culture of mathematics
and reflects the craftsmanship of many mathematicians. My treatment of it
is particularly indebted to the published works of Constantine Carathéodory, Paul Halmos, and Stanislaw Saks and to the lectures and
conversations of Andrew Gleason, John Herriot, and Lynn Loomis.
Chapter 15 is the result of intensive discussion with John Lamperti.
I also wish to acknowledge my indebtedness for helpful suggestions and
criticism from numerous students and colleagues. Of the former I should
like to mention in particular Peter Loeb, who read the manuscript of the
original edition and whose helpful suggestions improved the clarity of a
number of arguments, and Charles Stanton, who read the manuscript for
this revision, correcting a number of fallacious statements and problems.
Among my colleagues, particular thanks are due Paul Berg, who pointed
out Littlewood's "three principles" to me, to Herman Rubin, who provided counterexamples to many of the theorems the first time I taught the
course, and to John Kelley, who read the manuscript, giving helpful
advice and making me omit my polemical remarks. (A few have reappeared as footnotes, however.) Finally, my thanks go to Margaret Cline
for her patience and skill in transforming illegible copy into a finished
typescript for the original edition, to William Glassmire for reading proofs
of the revised edition, to Valerie Yuchartz for typing the material for this
edition, and to the editors at Macmillan for their forbearance and encouragement during the years in which this book was written.
H. L. R.
Stanford, California
Contents
1
Prologue to the Student
1 Set Theory
5
1
2
3
4
5
6
Introduction, 5
Functions, 8
Unions, intersections, and complements, 11
Algebras of sets, 16
The axiom of choice and infinite direct products, 18
Countable sets, 19
7 Relations and equivalences, 22
8 Partial orderings and the maximal principle, 23
9 Well ordering and the countable ordinals, 24
Part One 0 Theory of Functions of a Real Variable
2
The Real Number System
29
1
2
3
4
5
6
Axioms for the real numbers, 29
The natural and rational numbers as subsets of R, 32
The extended real numbers, 34
Sequences of real numbers, 35
Open and closed sets of real numbers, 38
Continuous functions, 44
7 Borel sets, 50
ix
Contents
3 Lebesgue Measure
52
1 Introduction, 52
2 Outer measure, 54
3 Measurable sets and Lebesgue measure, 56
*4 A nonmeasurable set, 63
5 Measurable functions, 65
6 Littlewood's three principles, 71
4 The Lebesgue Integral
73
1 The Riemann integral, 73
2 The Lebesgue integral of a bounded function over a set of finite
measure, 75
3 The integral of a nonnegative function, 82
4 The general Lebesgue integral, 86
*5 Convergence in measure, 91
5 Differentiation and Integration
94
1 Differentiation of monotone functions, 94
2 Functions of bounded variation, 98
3 Differentiation of an integral, 101
4 Absolute continuity, 104
*5 Convex functions, 108
6 The Classical Banach Spaces
1 The LP spaces,
111
Ill
2 The "'bider and Minkowski inequalities, 112
3 Convergence and completeness, 115
4 Bounded linear functionals on the LP spaces, 119
Part Two o Abstract Spaces
7 Metric Spaces
1
2
3
4
introduction, 127
Open and closed sets, 129
Continuous functions and homeomorphisms, 131
Convergence and completeness, 133
5 Uniform continuity and uniformity, 135
6 Subspaces, 137
7 Boire category, 139
*Indicates sections peripheral to the principal line of argument.
127
Contents
xi
8 Topological Spaces
1
2
3
4
5
*6
142
Fundamental notions, 142
Bases and countability, 145
The separation axioms and continuous real-valued functions, 147
Product spaces, 150
Connectedness, 150
Absolute Vs, 154
*7 Nets, 155
9 Compact Spaces
157
1
2
3
4
5
*6
Basic properties, 157
Countable compactness and the Bolzano-Weierstrass property, 159
Compact metric spaces, 163
Products of compact spaces, 165
Locally compact spaces, 168
The Stone-tech compactification, 170
7 The Stone-Weierstrass theorem, 171
*8 The Ascoli theorem, 177
10 Banach Spaces
181
1
2
3
4
*5
*6
Introduction, 181
Linear operators, 184
Linear functionals and the Hahn-Banach theorem, 186
The closed graph theorem, 193
Topological vector spaces, 197
Weak topologies, 200
*7 Convexity, 203
8 Hilbert space, 210
Part Three o General Measure and Integration Theory
11 Measure and Integration
1 Measure spaces, 217
2
3
*4
5
6
Measurable functions, 223
Integration, 225
General convergence theorems, 231
Signed measures, 232
The Radon-Nikodym theorem, 238
7 The LP spaces, 243
217
Contents
xii
12 Measure and Outer Measure
250
1 Outer measure and measurability, 250
2 The extension theorem, 253
*3 The Lebesgue-Stieltjes integral, 261
4 Product measures, 264
*5 Inner measure, 274
*6 Extension by sets of measure zero, 281
*7 Carathéodory outer measure, 283
13 The Daniell Integral
286
1 Introduction, 286
2 The extension theorem, 288
3 Uniqueness, 294
4 Measurability and measure, 295
14 Measure and Topology
1
2
3
*4
Baire sets and Borel sets, 301
Positive linear functionals and Baire measures, 304
Bounded linear functionals on C(X), 308
The Borel extension of a measure, 313
15 Mappings of Measure Spaces
1
2
3
4
5
301
317
Point mappings and set mappings, 317
Measure algebras, 319
Borel equivalences, 324
Set mappings and point mappings on complete metric spaces, 328
The isometries of U, 331
Epilogue
335
Bibliography
337
Index of Symbols
339
Subject Index
341
Real Analysis
Prologue to the Student
This book covers a portion of the material that every graduate
student in mathematics must know. For want of a better name the
material here is denoted by real analysis, by which I mean those parts
of modern mathematics which have their roots in the classical theory
of functions of a real variable. These include the classical theory
of functions of a real variable itself, measure and integration, pointset topology, and the theory of normed linear spaces. This book is
accordingly divided into three parts. The first part contains the
classical theory of functions, including the classical Banach spaces.
The second is devoted to general topology and to the theory of general Banach spaces, and the third to abstract treatments of measure
and integration.
Prerequisites. It is assumed that the reader already has some
acquaintance with the principal theorems on continuous functions
of a real variable and with Riemann integration. No formal use of
this knowledge is made here, and Chapter 2 provides (formally)
all of the basic theorems required. The material in Chapter 2 is,
however, presented in a rather brief fashion and is intended for
review and as introduction to the succeeding chapters. The reader
to whom this material is not already familiar may find it difficult to
follow the presentation here. We also presuppose some acquaintance
with the elements of modern algebra as taught in the usual undergraduate course. The definitions and elementary properties of groups
and rings are used in some of the peripheral sections, and the basic
notions of linear vector spaces are used in Chapter 10. The theory
of sets underlies all of the material in this book, and I have sketched
in Chapter 1 some of the basic facts from set theory. Since the
remainder of the book is full of applications of set theory, the
student should become adept at set-theoretic arguments as he
1
Prologue to the Student
2
progresses through the book. I recommend that he first read Chapter
1 lightly and then refer back to it as needed. The books by Halmos
[9]' and Suppes [23] contain a more thorough treatment of set
theory and can be profitably read by the student while he reads this
book. Gleason [7] also contains useful material on set theory and
has an excellent discussion of the role of set theory in characterizing
abstract mathematical structure. The reader will also find there a
good discussion and development of the real number system and its
properties.
Logical notation. It is convenient to use some abbreviations for
logical expressions. We use '8E' to mean 'and' so that 'A & B' means
'A and B'; 'y' means 'or' so that 'A y B' means 'A or B (or both)';
— means 'not' or 'it is not the case that', so that ' —A' means 'it is
not the case that A'. Another important notion is the one that we
express by the symbol It has a number of synonyms in English,
so that the statement 'A = B' can be expressed by saying 'if A,
then B', 'A implies B', 'A only if B', 'A is sufficient for B', or '13 is
necessary for A'. The statement 'A = B' is equivalent to each of the
statements `(— A) y B' and — (A & (— B))'. We also use the
notation 'A <=> B' to mean `(A = B) & (B = A)'. English synonyms
for 'A <=> B' are 'A if and only if B', 'A if B', 'A is equivalent to B',
and 'A is necessary and sufficient for B'.
In addition to the preceding symbols we use two further abbreviations: '(x)' to mean 'for all x' or 'for every x', and `OxY to mean
'there is an x' or 'for some x'. Thus the statement (x)(3y)(x < y)
says that for every x there is a y which is larger than x. Similarly,
(3y)(x)(x < y) says that there is a y which is larger than every x.
Note that these two statements are different: As applied to real
numbers, the first is true and the second is false.
Since saying that there is an x such that A (x) means that it is not
the case that for every x we have — A(x), we see that (3x) A(x)
<=> (x) A(x). Similarly, (x) A(x) <=> (ax) A(x). This
rule is often convenient when we wish to express the negation of a
complex statement. Thus
(x)
(x)
(]y) (x < y)} <=>
<=> (]x) (y)
(Y)
(x < Y)
(x < y)
<=> (ix) (y) (y
1
Numbers in brackets refer to the Bibliography, p. 337.
x),
Prologue to the Student
3
where we have used properties of the real numbers to infer that
— (x < y) <=> (y-5__ x).
We will sometimes modify the standard logical notation slightly
and write (e > 0) (. . .), ( 5 > 0) (. . .), and (]x e A) (. . .) to
mean 'for every e greater than 0 (. . .)', 'there is a 5 greater than
0 such that (. . .)', and 'there is an x in the set A such that (. . .)'.
This modification shortens our expressions. For example (e > 0)
(...) would be written in standard notation (e) {(€ > 0) = (. . .)} .
For a thorough discussion of the formal use of logical symbolism
the student should refer to Suppes [22].
Statements and their proofs. Most of the principal statements
(theorems, propositions, etc.) in mathematics have the standard
form 'if A, then B' or in symbols 'A = B'. The contrapositive of
A = B is the statement (— B) =(— A). It is readily seen that a
statement and its contrapositive are equivalent; that is, if one is true,
then so is the other. The direct method of proving a theorem of the
form 'A = B' is to start with A, deduce various consequences from
it, and end with B. It is sometimes easier to prove a theorem by
contraposition, that is, by starting with — B and deriving — A. A
third method of proof is proof by contradiction or reductio ad
absurdum: We begin with A and — B and derive a contradiction. All
students are enjoined in the strongest possible terms to eschew proofs
by contradiction! There are two reasons for this prohibition: First,
such proofs are very often fallacious, the contradiction on the final
page arising from an erroneous deduction on an earlier page, rather
than from the incompatibility of A with — B. Second, even when
correct, such a proof gives little insight into the connection between
A and B, whereas both the direct proof and the proof by contraposition construct a chain of argument connecting A with B.
The principal statements in this book are numbered consecutively
in each chapter and are variously labeled lemma, proposition,
theorem, or corollary. A theorem is a statement of such importance
that it should be remembered since it will be used frequently. A
proposition is a statement of some interest in its own right but which
has less frequent application. A lemma is usually used only for
proving propositions and theorems in the same section. References
to statements in the same chapter are made by giving the statement number, as Theorem 17. References to statements in another
chapter take the form Proposition 3.21, meaning Proposition 21
of Chapter 3. A similar convention is followed with respect to prob-
4
Prologue to the Student
lems. I have tried to restrict the essential use of interchapter references to named theorems such as 'the Lebesgue Convergence
Theorem'; the references to numbered statements are mostly auxiliary references which the student should find it unnecessary to
consult.
The proof of a theorem, proposition, etc., in this book begins
with the word 'proof' and ends with the symbol 'I', which has the
meaning of 'this completes the proof.' If a theorem has the form
'A <= B', the proof is usually divided into two parts, one, the 'only
if' part, proving A = B, the other, the 'if' part, proving B = A.
Interdependence of the chapters. The dependence of one chapter
on the various preceding ones is indicated by the following chart
(except for a few peripheral references). Chapters 14 and 15 depend
on most of the preceding chapters. Chapter 10a indicates Sections
1 through 4 and 8 of Chapter 10; Chapter 10b denotes Sections 5
through 7 of that chapter.
1 Set Theory
2 The Real Number System
3 Lebesgue Measure I
7 Me ric Spaces
8 Topological Spaces'
4 The Lebesgue Integral
5 Differentiation
and Integration
6 Classical
Banach
11 Measure and
9 Compact Spaces
Integration
12 Measure and
Outer Measure
10a Banach Spaces
Spaces
13 The Daniell
Integral
10b Topological Vector
Spaces
I
1
Set Theory
Introduction
One of the most important tools in modern mathematics is the
theory of sets. The study of sets and their use in the foundations of
mathematics was begun just before the turn of the century by Cantor,
Frege, Russell, and others, and it appeared that all mathematics
could be based on set theory alone. It is in fact possible to base
most of mathematics on set theory, but unfortunately this set theory
is not quite as simple and natural as Frege and Russell supposed;
for it was soon discovered that a free and uncritical use of set
theory leads to contradictions and that set theory had to have a
careful development with various devices used to exclude the contradictions. Roughly speaking, the contradictions appear when one
uses sets that are "too big," such as trying to speak of a set which
contains everything. In this book we shall avoid these contradictions
by having some set or space X fixed for a given discussion and considering only sets whose elements are elements of X, or sets (collections) whose elements are subsets of X, or sets (families) whose
elements are collections of subsets of X, and so forth. In the first few
chapters X will usually be the set of real numbers.
In the present chapter we shall describe some of the notions from
set theory which will be useful later. Our purpose is descriptive and
the arguments given are directed toward plausibility and (hopefully)
understanding rather than toward rigorous proof in some fixed basis
for set theory. The descriptions and notations given here are for
6
Set Theory
[Ch. 1
the most part consistent with the set theory described by Halmos
in his book Naive Set Theory [9], although we assume as known, or
primitive, several notions such as the natural numbers, the rational
numbers, functions, and so forth, which can be defined (as in
Halmos) in terms of the more primitive notions of set theory.
For an axiomatic treatment, I recommend Suppes' book Axiomatic Set Theory [23] or the appendix to Kelley's book, General
Topology [14].
The natural numbers (positive integers) play such an important
role in this book that we introduce the special symbol N for the set
of natural numbers. We also shall take for granted the principle of
mathematical induction and the well-ordering principle. The principle of mathematical induction states that if P(n) is a proposition
defined for each n in N, then {P(l) & [P (n) = P (n (n)P (n).
The well-ordering principle asserts that each nonempty subset of N
has a smallest element.
The basic notions of set theory are those of set and the idea of
membership in a set. We express this latter notion by e, and write
'x e A' for the statement 'x is an element (or member) of A'. A
set is completely determined by its members; that is, if two sets A
and B have the property that x E A if and only if x e B, then A = B.
Suppose that each x in a set A is in the set B, that is, x e A x e B;
then we say that A is a subset of B or that A is contained in B and
write A c B. Thus we always have A c A, and if A c B and B c A
then B = A. It is perhaps unfortunate that the English phrase
"contained in" is often used to represent both the notions e and c,
but we shall use it only in the latter context. We write 'x e A' to
mean 'not (x E A)', that is, that x is not an element of A.
Since a set is determined by its elements, one of the commonest
ways of determining a set is by specifying its elements, as in the
definition: The set A is the set of all elements x in X which have
the property P. We abbreviate this by writing
A =, {x e X: P (x)} ,
and where the set X is understood we sometimes write'
A = (x: P(x)}
The presence (explicit or implicit) of the qualifying set X is essential. Otherwise we
are confronted with the Russell paradox (cf. Suppes [23], p. 6).
Sec. 1]
Introduction
7
We usually think of a set as having some members, but it turns out
to be convenient to consider also a set which has no members. Since
a set is determined by its elements, there is only one such set, and
we call it the empty set and denote it by Ø. If A is any set, then each
member of 325 (there are none) is a member of A, and so 0 c A.
Thus the empty set is a subset of every set.
If x, y, z are elements of X, we define the set {x} to be the set
whose only element is x; the set {x, y } to be the set whose elements
are exactly x and y; the set {x, y, z} to be the set whose elements are
x, y, and z; and so forth. The set {x} is called a unit set, or the
singleton of x. One should distinguish carefully between x and {x} .
For example, we always have x e {x} , while we seldom have x e x.
In {x, y} there is no preference given to x over y; that is, {x, y } =
ly, x}. For this reason we call {x, y} an unordered pair. It is often
useful to consider also the ordered pair (x, y), where we distinguish
between the first element x and the second element y. Thus (x, y) =(y, x) if
(a, b) if and only if x = a and y ----- b, so that (x, y)
x
y. Similarly, we shall consider ordered triplets (x, y, z), quadruplets (x, y, z, w), and so forth, where we distinguish between the
first, second, third,. . . , elements. Although it is possible to define
ordered pairs, triplets, and so forth, in terms of unordered pairs
(cf. Halmos), we shall not do so here.
If X and Y are two sets, we define the Cartesian, or direct, product
X X Y to be the set {(x, y)} of all ordered pairs whose first element
belongs to X and whose second element belongs to Y. Similarly,
XX YXZ is the set {(x, y, z)} of all ordered triples such that
x e X, y e Y, and z c Z. If Xis the set of real numbers, then X X X
is the set of ordered pairs of real numbers and, as we know from
analytic geometry, is equivalent to the set of points in the plane.
We sometimes write X 2 for X X X, X 3 for XX XX X, and so
forth.
Problems
1. Show that { x: x x} = Ø.
2. Show that if x c 0, then x is a green-eyed lion.
3. Show that in general the sets X X (Y X Z) and (X X Y) X Z are
different but that there is a natural correspondence between each of them
and XX YX Z.
Set Theory [Ch. 1
8
4. Show that the well-ordering principle implies the principle of
mathematical induction. [Consider the set {n e N: P(n) false} .]
5. Use mathematical induction to establish the well-ordering principle.
[Given a set S of positive integers, let P(n) be the proposition "If n c S,
then S has a least element".]
2
Functions
By a function f from (or on) a set X to (or into) a set Y we mean
a rule which assigns to each x in X a unique element f(x) in Y. The
collection G of pairs of the form (x,f(x)) in X X Y is called the
graph of the function f. A subset G of X X Y is the graph of a
function on X if and only if for each x e X there is a unique pair in
G whose first element is x. Since a function is determined by its
graph, many people like to define a function to be its graph. It is
irrelevant for our purposes whether we do this or consider the notion
of function as primitive. 2
The word 'mapping' is often used as a synonym for 'function'.
We often express the fact that f is a function on X into Y by writing
f: X — > Y.
The set X is called the domain (or domain of definition) of f. The
set of values taken by f, that is, the set {y e Y: (]x)[y = f(x)]}, is
called the range of f The range of a function f will in general be
smaller than Y. If the range off is Y, then we say that f is a function
onto Y. (Another terminology for this case: f is surjective. We shall
not use it here.)
If A is a subset of X, we define the image under f of A as the set
of elements in Y such that y = f(x) for some x in A. We denote
this image by f [21]. Thus
f[A] = {y c Y: (3x) [x c A
and y = f(x)]).
Thus the range off is f[X], and f is onto Y if and only if Y = f [X].
2
The equivalence between functions and their graphs is valid only for functions from a
given set X to a set Y. There are difficulties, for example, in defining the graph of the
identity function i defined by i(x) = x for all x. In a formal treatment which takes
function as a primitive notion, we must have axioms describing the properties of functions, such as f = g) <=> (x)[ f (x) = g(x)), as well as axioms enabling us to construct
functions.
(
Sec. 2]
9
Functions
More important than the notion of image of a set under f is the
notion of an inverse image. If B is a subset of Y, we define the
inverse image f -1 [13] of B to be the set of those x in X for which f(x)
is in B; that is,
f —l [B] = {x e X: f(x) el3}.
It should be noted that f is onto Y if and only if the inverse image
of each nonempty subset of Y is nonempty.
A function f: X p Y is called one-to-one (or univalent, or injective)
iff(xi) = f (x 2) only when x 1 = x2 . Functions which are one-to-one
from X onto Y are often called one-to-one correspondences between
X and Y. (They are also called bijective.) In this case there is a
function g: Y —> X such that for all x and y we have g(f(x)) = x
and f (g(y)) = y. The function g is called the inverse of f and is
sometimes denoted by f -1 .
It should be noted that, if we denote g by f -1 , then f -1 [E] can be
considered to be the inverse image of E under f or the image of E
under f -1 . Fortunately for our notation, these are the same set.
Iff: X
Y and g: Y —> Z, we define a new function h: X —> Z by
setting h(x) = g(f(x)). The function h is called the composition of g
with f and denoted by g of If f: X —> Y and A is a subset of X, we
can define a new function g: A -- Y by defining g(x) = f(x) for
X g A. This new function g is called the restriction of f to A and is
sometimes written f 1 A. In many cases it is important to distinguish
carefully between the functions g and f They have different ranges,
and the inverse images under g are different from those under f
Having mentioned the possibility of defining functions by means
of ordered pairs, it is only fair to point out that, conversely, ordered
pairs may be defined in terms of the notion of a function: An
ordered pair is a function whose domain is the set {1, 2 . Similarly,
a finite sequence, or n-tuple, is a function whose domain is the first n
natural numbers, that is, the set {i F N: i < n}. (We call such a set a
segment of N.) Similarly, an infinite sequence is a function whose
domain is the set N of natural numbers. We use the term 'sequence'
to mean a finite or infinite sequence. If the range of a sequence is in a
set X, we speak of a sequence from (or in) X or of a sequence of
elements of X. It is customary to depart somewhat from the usual
functional notation when dealing with sequences and denote the
value of the function at i by x, and to call this value the i th element
of the sequence. We shall often denote ordered n-tuples by
—
—
}
10
Set Theory
[Ch. 1
and infinite sequences by (x1)7_ 1 . When no misunderstanding is likely
to arise we often write simply (xi). The range of the sequence (x i )
will be denoted by {x}. Thus the range of an ordered n-tuple (x 1)7= 1
is the unordered n-tuple {x } 7_ 1 . This notation is reasonably consistent with our earlier notation concerning ordered and unordered
pairs, triplets, and so forth.
A set A is called countable if it is the range of some sequence and
finite if it is the range of some finite sequence. A set which is not
finite is called infinite. (Many authors restrict the use of the word
'countable' to sets which are infinite and countable, but our definition
includes the finite sets among the countable sets.) We use the term
`countably infinite' for infinite countable sets. We shall return to this
notion again in Section 6.
One of the most useful ways of defining an infinite sequence is
given by the following principle:
Principle of Recursive Definition: Let f be a function from a set X
to itself, and let a be an element of X. Then there is a unique infinite
sequence (x,) from X such that x 1 = a and xi+1 = f(xi) for each i.
The existence of such a sequence is intuitively clear: We define
x 1 = a, x 2 = f(a), x3 = f(f fri)), and so on. A more formal proof
can be given as follows: We first prove by induction on n that for
each natural number n there is a unique finite sequence
(n)
(n)
Xi , X2
5 • • •
,(n)
16 n
5
such that
x (1n) = a and x(in) 1 = f(x7)
for 1 < i < n.
From the uniqueness, it follows that xi( n) = xj( "' for i < n < m.
Thus if we define x„ to be x„(n) , we have xi( n) = x„ and we see that
the sequence (x,) satisfies the requirements of our principle.
A slight extension of this principle is the following: For each
natural number n letfn be a function on Xn to X and let a e X. Then
there is a unique sequence (x i) from X such that x 1 = a and xi+ 1
• • ,
An important notion connected with that of a sequence is the
notion of a subsequence. We say that a mapping g of N into N is
monotone if (i > j) = (g(i) > g(j)). Iff is an infinite sequence (that
is, a function whose domain is N), we say that h is an infinite sub-
Sec. 3]
11
Unions, Intersections, and Complements
sequence of f if there is a monotone mapping g of N into N such
that h = f o g. If we write f as (fi) and g as
denote f 0 g by (A).
(gi),
then we usually
Problems
6. Let f: X --> Y be a mapping of a nonempty space X into Y. Show
that f is one-to-one if and only if there is a mapping g: Y— X such that
g of is the identity map on X, that is, such that g(f(x)) = x for all x e X.
7. Let f: X —4 Y be a mapping of X into Y. Show that f is onto if
there is a mapping g: Y X such that fo g is the identity map on Y,
that is, such that f(g(y)) = y for all y e Y. (For the converse see
Problem 20.)
8. Use mathematical induction to prove the generalized principle of
recursive definition stated in the text.
3
Unions, Intersections, and Complements
Let us fix a given set X and consider the set 6.(X) consisting of all
subsets of X. There are certain set - theoretic operations that we can
perform on subsets of X. If A and B are subsets of X, we define their
intersection A n B to be the set of all elements which belong to both
A and B. Thus
AnB={x:xeA&xeB}.
We note that the definition is symmetric in A and B; that is, A n B =
BnA. Also AnBcA and AnB= A<=>AcB. We have
(AnB)nC= An(BnC) and write this set as AnBnC.
It is the set of all elements which belong to each of the sets A, B,
and C.
We define the union A u B of two sets A and B to be the set of
elements that are in either A or B. Thus
AuB= {x:xeAVxeB}.
We have
AuB= Bu A
Au(BuC)= (AuB)uC= AuBuC
AcAuB
A= AuB<=>BcA.
Set Theory
12
[Ch. 1
We also have relations between unions and intersections which are
called distributive laws:
A n (B u C) = (A n B) u (A n C)
A u (B n C) = (A u B) n (A u C).
The empty set 0 and the space X play special roles:
Au
A,
,4n0=0
,4 u X = X,
,4 n X = ,4.
If A is a subset of X, we define the complement A- of A (relative
to X) as the set of elements not in A. Thus
{x
X: x
A} .
We sometimes write —A instead of A. We have
= x, f{ =
A = A, A u = X, A n = 0
AcB
C A.
Two special laws relating complements to unions and intersections
are De Morgan's laws:
—(A
U
B) = A n
—(A n B) = :4 u .
If A and B are two subsets of X we define the difference B
— A, or
relative complement of A in B, as the set of elements in B which are
not in A. Thus
B — A = {x:xeB&xeA}.
We have B — A ---- B n A.
We shall also use the notation A A B for the symmetric difference
of two sets defined by
A A B = ( A — B) U (B
A).
The symmetric difference of two sets consists of all those points
which belong to one or the other of the two sets but not to both.
Sec. 3]
13
Unions, Intersections, and Complements
If the intersection of two sets is empty we say the sets are disjoint.
A collection e of sets is said to be a disjoint collection of sets or a
collection of pairwise disjoint sets if any two sets in e are disjoint.
The process of taking unions (or intersections) of two sets can be
extended by repetition to give unions (or intersections) of any finite
collection of sets. However, we can give a definition of intersection
for an arbitrary collection e of sets: The intersection of the collection
e is the set of those elements of X which belong to each member
of e. We denote this intersection by
A or fl {A: A e e}. Thus
n
Aee
n A = {xeX:(A)(AeexeA)}.
Ace
Similarly, we define the union of an arbitrary collection of sets by
UA = {xe X:(3A)(A ee&xeA)}.
Ace
De Morgan's laws hold for arbitrary unions and intersections:
-[u+
nA
Ace
Ace
Ace
Ace
U A.
We also have the distributive laws:
B
U
n[Ace
A] =
Bu[ nA]=
Ace
n
Ace
n (Bui ) .
Ace
It follows from our definition that the union of an empty collection
of sets is empty and that the intersection of the empty collection of
sets is X.
By a sequence of subsets of X we mean a sequence from w(X),
that is, a mapping of N(or a segment of N) into cP(X). If (A,) is an
infinite sequence of subsets of X, we write
range of the sequence. Thus
U
z—i
(V)(x
U
A i for the union of the
g Az)}.
14
Set Theory
[Ch. 1
Similarly, if (B,)_, is a finite sequence of subsets of X, we write
n B, for the intersection of the range of the
s =1
B, = B i n
sequence, so that
n • • n Bn.
B2
This notation for sequences of sets is so convenient that we often
generalize it to arbitrary collections of sets by using the notion of an
indexed collection: An indexed subset of X (or collection of subsets of
X) is a function on an index set A to X (or the set of subsets of X).
If A is the set of natural numbers, then the notion of an indexed
set coincides with the notion of a sequence.
In keeping with the notation for sequences, we usually write
xx instead of x(x), and denote the indexed set itself by {xx} or
{xx : X e AL We say that {xx} is indexed by A. We define the union
and intersection of an indexed set to be the union and intersection
of the range of the function defining the indexed set. Thus
U
XeA
Ax = {x e X: (3x)(X en &x e A)}
and
n A x = {x e
X: (X )( X
CA
XEAx».
XsA
In the case when A is the set N of natural numbers we have
fl A, =
zeN
n
1
44%,
and similarly for unions.
If f maps X into Y and 1,1 x, is a collection of subsets of X, then
f[y
Ad=
y f[A x],
but we can conclude only that
f[f?
A xi c
f[A x].
For inverse images we have, for a collection {B x} of subsets of Y,
[y Bd =y f —l [Bx]
f —l [f? Bx ] =
Sec.
Unions, Intersections, and
33
15
Complements
and
f -1 [13] ---- 4-1 [B]
for B c Y. Also
f[f - l[B]] C B
and
ri ff [A]] D A
for A c X and B c Y.
Problems
Show that A c /3.4= A n B = At=>AUB. B.
Prove the distributive laws.
Show that A c B
r3 c ;11- .
Show that:
BLAandAL(BAC)= (A L B)
C.
b.A.6,B.--0<=>A=B.
c. AA13=-- X<A=T3.
d. il.e.52f = AandALX=
e. (A
B) n E = (A n E) (B n E).
13. Prove De Morgan's laws (for arbitrary unions and intersections).
14. Show that
9.
10.
11.
12.
U = U n A).
B
n
[
Aee
Ace
15. Show that if Q and 63 are two collections of sets, then
[U{A: A e a}] n [U{B: B r03} ]
= U {A n B:
13) ra x 631.
16. a. Show f[Urix] = Uf[Ax].
b. Show f [nAd
c
c. Give an example where
f[nAx] nf[Ax].
17. Show that:
a• ri rUBx] U f'[Bx].
b.
r l [nBx] nr i[Bx].
c. f-1
_
I LB, for B C Y.
[]
18. a. Show that if f maps X into Y and A C X, B C Y, then
f[ri[B]] C B
16
Set Theory
[Ch. 1
and
ri[f[A]] j A.
b. Give examples to show that we need not have equality.
c. Show that if f maps X onto Y and B C Y, then
f[r i[ B]l = B.
4 Algebras of Sets
A collection a of subsets of X is called an algebra of sets or a
Boolean algebra if (i) A u B is in a whenever A and B are, and (ii)
is in a whenever A is. It follows from De Morgan's laws that (iii)
A n B is in a whenever A and B are. If a collection a of subsets of X
satisfies (ii) and (iii), then by De Morgan's laws it also satisfies (i)
and is therefore a Boolean algebra. By taking unions two at a time,
we see that if A 1 ,. , A n are sets in a then A 1 U A2 U
u A n is
again in a. Similarly, A 1 n A, n • • n A n is in a.
We shall find several propositions concerning algebras of sets
useful. The first is the following:
' • •
I. Proposition: Given any collection e of subsets of X, there is a
smallest algebra a which contains e; that is, there is an algebra a
containing e and such that if 63 is any algebra containing e, then 63
contains a.
be the family of all algebras (of subsets of X) which
Then e is a subcollection of a,
contain e. Let a =
since each ,33 in v contains e. Moreover, a is an algebra. For if A and
B are in a, then for each 03 e 5 we have A e 63 and B e 63. Since (B
is an algebra, A u B belongs to 83. Since this is true for every 03 e
we have A U B in n{63: e 5}. Similarly, we see that if A c
then ;4- e a. From the definition of a, it follows that if 03 is an algebra
containing e, then 63 D a. I
Proof: Let
2. Proposition: Let a be an algebra of subsets and (A,) a sequence
of sets in a. Then there is a sequence (B,) of sets in a such that
B„n Bm =Ofornmand
U B,
ÛA,.
z=1
Sec. 4]
17
Algebras of Sets
Proof: Since the proposition is trivial when (AO is finite, we assume
(A i) to be an infinite sequence. Set B 1 — A 1 , and for each natural
number n > 1 define
Bn -=--- A n
'' [A1 U
A2
U•••U
An—li
= A n n 21 n 22 n • • • n 2._1.
Since the complements and intersections of sets in a are in a, we
have each B r, 6 a. We also have B r, C A. Let B. and Brn be two such
sets, and suppose 711 < n. Then B. C A., and so
Bm
n B r, c A m n B r,
= A m n A n n • • • n 2. n • • •
= (A m n ;in) n • • •
= 0n•••
=0.
Since B, C Ai, we have
cc,
cc,
U B,, C U A.,
,---).
%—.1
Let x c U A. Then x must belong to at least one of the A,'s. Let n
, _-- 1
be the smallest value of i such that x E A,. Then x E Bn, and so
x e U B. Thus
-
n=1
U B D
7,1
U A n,
n=1.
and we have
U Bn = U A. I
a=1
n=1
An algebra a of sets is called a or-algebra, or a Borel field, if every
union of a countable collection of sets in a is again in a. That is,
co
if (A,) is a sequence of sets, then U A, must again be in a. From
i= 1
De Morgan's laws it follows that the intersection of a countable
collection of sets in a is again in a. A slight modification of the
proof of Proposition 1 gives us the following proposition:
18
Set Theory
[Ch. 1
3. Proposition: Given any collection e of subsets of X, there is a
smallest o--algebra which contains e; that is, there is a o--algebra
containing e such that if ci3 is any o--algebra containing e then a C 63.
Problem
19. Prove Proposition 3.
5 The Axiom of Choice and Infinite
Direct Products
An important axiom in set theory is the so-called axiom of choice.
It is somewhat less elementary than the other axioms used in axiomatic set theory and is known to be independent of them. Many
mathematicians like to be very explicit about the use of the axiom
of choice and its consequences, but we shall be rather informal about
its use. The axiom is the following:
Axiom of Choice: Let e be any collection of nonempty sets. Then
there is a function F defined on e which assigns to each set A e e an
element F(A) in A.
The function F is called a choice function, and its existence may be
thought of as the result of choosing for each of the sets A in e
an element in A. There is, of course, no difficulty in doing this if
there are only a finite number of sets in e, but we need the axiom of
choice in case the collection e is infinite. If the sets in e are disjoint,
we may think of the axiom of choice as asserting the possibility
of selecting a "parliament" consisting of one member from each
of the sets in e.
Let e = MI be a collection of sets indexed by an index set A.
We define the direct product
X Xx
to be the collection of all sets {xx} indexed by A and having the
property that xx e X. If A = {1, 2), we have our earlier definition
of the direct product X1 X X2 of the two sets X1 and X2. If z = {xx}
is an element of X Xx, we call xx the At coordinate of z.
Sec. 6]
Countable Sets
19
If one of the Xx is empty, then X Xx is also empty. The axiom of
X
choice is equivalent to the converse statement: If none of the Xx
are empty, then X Xx is not empty. For this reason Bertrand Russell
X
prefers to call the axiom of choice the multiplicative axiom.
Problem
20. Let f: X -- Y be a mapping onto Y. Then there is a mapping
g: Y --÷ X such tl- At fo g is the identity map on Y. [Apply the axiom of
choice to the collection {A: ey c Y) with A = f— I{y)]).]
6
Countable Sets
In Section 2 we defined a set to be countable if it was the range
of some sequence. If it is the range of a finite sequence, we have
a finite set, but the range of an infinite sequence may also be finite.
In fact, every nonempty finite set is the range of an infinite sequence.
For example, the finite set {x 1 , . . . , x,„} is the range of the infinite
sequence defined by setting x, = x„ for i > n. Thus a set is countably
infinite if it is the range of some infinite sequence but not the range of
any finite sequence. The set N of natural numbers is an example of
a countably infinite set.
Before proceeding further, we had better come to terms with the
empty set. The empty set is not the range of any sequence (unless we
admit sequences with zero terms). It is convenient, however, to define
finite and countable sets so that the empty set is both finite and
countable. Hence:
Definition: A set is called finite if it is either empty or the range
of a finite sequence. A set is called countable (or denumerable) if
it is either empty or the range of a sequence.
It follows at once from this definition that the image of any
countable set is countable, that is, that the range of any function
with a countable domain is itself countable, and similarly for finite
sets.
It is usual in mathematics to give a slightly different but equivalent
definition based on the notion of a one-to-one correspondence. We
Set Theory
20
[Ch. 1
first note that any set that can be put in one-to-one correspondence
with a finite set is finite and that any set that can be put in one-to-one
correspondence with a countable set must be countable. Since the
set N of natural numbers is countable but not finite, any set which
can be put in one-to-one correspondence with N is countably infinite.
It is customary to use this property to define the notion of countably
infinite Thus to show that our definition is equivalent to the customary one, we must show that, if an infinite set E is the range of a
sequence (x„), then E can be put in one-to-one correspondence with
N. To do this we define a function cp from N into N by recursion as
follows: Let yo (1) = 1, and define yo (n + 1) to be the smallest value
of m such that x.,„ o xi for all i < yo(n). Since E is infinite, such an
m always exists, and by the well-ordering principle for N there is
always a least such m. The correspondence n —> x (7, ) is a one-to-one
correspondence between N and E. Thus we have shown that a
set is countably infinite if and only if it can be put into one-to-one
correspondence with N. We are now in a position to prove some
simple propositions about countable sets:
4. Proposition: Every subset of a countable set is countable.
Proof: Let E = { .x.n} be a countable set, and let A be a subset of
E. If A is empty, A is countable by definition. If A is not empty,
choose x in A. Define a new sequence (y n) by setting yn — x„ if
x„ e A and yt, = x if x„ 0 A. Then A is the range of (yn ) and is
therefore countable. I
5. Proposition: Let A be a countable set. Then the set of all finite
sequences from A is also countable.
Proof: Since A is countable, it can be put into one-to-one
correspondence with a subset of the set N of natural numbers. Thus
it suffices to prove that the set S of all finite sequences of natural
numbers is countable. Let (2, 3, 5, 7, 11, . . . , Pi, . . .) be the sequence
of prime numbers. Then each n in N has a unique factorization of
the form3 n = 2x13x2 • • • plk, where x, e N o = Nu {0} and x k >0.
Let f be the function on N which assigns to the natural number n
the finite sequence (x 1 , . . . , x k ) from No. Then S is a subset of the
range of f. Hence S is countable by Proposition 4. I
3
Except 1, we agree to write 1 -- 2 0 .
Sec.
63
Countable Sets
21
6. Proposition: The set of all rational numbers is countable.
7. Proposition: The union of a countable collection of countable
sets is countable.
Proof: Let e be a countable collection of countable sets. If all
the sets in e are empty, the union is empty and thus countable. Thus
we may as well assume that e contains nonempty sets, and since
the empty set contributes nothing to the union of e, we can assume
that the sets in e are nonempty. Thus e is the range of an infinite
sequence (iln ),7= 1 , and each A n is the range of an infinite sequence
(x„„i):= 1 . But the mapping of (n, m) to xn „, is a mapping of the set of
ordered pairs of natural numbers onto the union of e. Since the set
of pairs of natural numbers is countable, the union of the collection
e must also be countable. I
Problems
21. Show that every subset of a finite set is finite.
22. Prove Proposition 6 by using Propositions 4 and 5. [Hint: The
mapping
(p, q, l) > p/q
—
(p,q, 2) -- —p/q
(1, 1, 3) —p 0
is a function whose range is the set of rational numbers and whose domain
is a subset of the set of finite sequences from N.]
23. Show that the set E of infinite sequences from {0, 1} is not countable. [Hint: Let f be a function from N to E. Then f(v) is a sequence
Let by = 1 — a„. Then (bn ) is again a sequence from 10, 11,
and for each v e N we have (bn ) (a,n ). This method of proof is known
as the Cantor diagonal process.]
24. Let f be a function from a set X to the collection 6"(X) of subsets
of X. Then there is a set E C X which is not in the range of f. (Let
E= {x: x g f(x)} .)
25. Use the axiom of choice and the generalized principle of recursive
definition to show that each infinite set X contains a countably infinite
subset.
22
7
Set Theory
[Ch. 1
Relations and Equivalences
Two given entities x and y may be "related" to each other in many
ways, as in x = y, x e y, x c y, or for numbers x < y. In general we
say that R denotes a relation if, given x and y, either x stands in the
relation R to y (written x R y) or x does not stand in the relation R
to y. A relation R is said to be a relation on a set X if x R y implies
x e X and y e X. If R is a relation on a set X, we define the graph of R
to be the set {(x, y): x R . Since we consider two relations R and S
to be the same if (x R y) <=> (x S y), each relation on a set Xis uniquely
determined by its graph, and conversely each subset of X X X is the
graph of some relation on X. Thus we may if we like identify a
relation on X with its graph and define a relation to be a subset of
X X X. In many formalized treatments of set theory a relation is in
general defined simply as a set of ordered pairs. 4
A relation R is said to be transitive on a set X if x R y and y R z
imply x R z for all x, y, and z in X. Thus = and < are transitive
relations on the set of real numbers. A relation R is said to be
symmetric on X if x R y implies y R x for all x and y in X. It is said
to be reflexive on X if for all x e X we have x R x.
A relation which is transitive, reflexive, and symmetric on Xis said
to be an equivalence relation on X or simply an equivalence on X.
Suppose that is an equivalence relation on a set X. For a given
x c X, let Ex be the set of elements equivalent to x, that is Ez =
{y: y =----- x} . If y and z are both in Ex, then y x and z x, and by
symmetry and transitivity we have z y. Thus any two elements of
Ex are equivalent. If y e Ex and z y, then z = y and y x, whence
z x, and so z e E. Thus any element of X which is equivalent to
an element of Ex is itself an element of E. Consequently, for any
two elements x and y of X, the sets Ex and Ey are either identical
(if x y) or disjoint (if x y). The sets in the collection {Ex : x e X}
Thus X is the
are called equivalence sets or classes of X under
Note that x e Ex,
disjoint union of the equivalence classes under
and so no equivalence class is empty.
The collection of equivalence classes under an equivalence
is
called the quotient of X with respect to
and is sometimes denoted
4
Cf Suppes [23], p. 57, or Halmos [9], p 26. It should be pointed out, however, that
there is a difficulty in this approach in that =, c, and c are no longer relations. For this
reason I prefer some treatment such as the one in Kelley [14], p. 260, where relations
are not necessarily sets of ordered pairs.
Sec. 8]
Partial Orderings and the Maximal Principle
23
by X/=- . The mapping x E x is called the natural mapping of X
onto X/..
A binary operation on a set X is a mapping from X X X to X.
We say that an equivalence relation-_, is compatible with a binary
operation ± if x . x' and y y' imply that (x ± y) (x'
y'). In
this case ± defines an operation on the quotient Q = X/ =- as follows:
If E and F belong to Q, choose x c E, y c F and define E F to be
E(x+y). Since is an equivalence, E F is seen to depend only on
E and F and not on the choice of x and y.
For further details the reader may consult Birkhoff and MacLane
[2], pp. 145ff.
Problems
26. Prove that F + G as defined above depends only on F and G.
27. Let X be an Abelian group under +. Then ----- is compatible with
+ if and only if x x' implies x + y = x' + y. The induced operation
then makes the quotient space into a group.
8
Partial Orderings and the
Maximal Principle
A relation R is said to be antisymmetric on a set X if x R y and
y R x imply x = y for all x and y in X. A relation • is said to be a
partial ordering of a set X (or to partially order X) if it is transitive
and antisymmetric on X. Thus < is a partial ordering on the real
numbers and c is a partial ordering on w(X). A partial ordering
< on a set X is said to be a linear ordering (or simple ordering) of
X if for any two elements x and y of X we have either x -< y or
y < x. Thus < linearly orders the set of real numbers, while c is
not a linear ordering on (P (X).
If < is a partial order on X and if a -< b, we often say that a
precedes b or that b follows a. Sometimes we say that a is less than
b or b greater than a. If E c X, we say that an element a c E is
the first element in E or the smallest element in E if, whenever x c E,
and x a, we have a -< x. Similarly for last (or largest) elements.
An element a c E is called a minimal element of E if there is no x c E
with x
a and x < a, and similarly for maximal elements. It should
24
Set Theory
[Ch. 1
be observed that if a set has a smallest element, then that element is a
minimal element. If < is a linear ordering, a minimal element is a
least element, but in general it is possible to have minimal elements
which are not least elements.
Our definition of partial order makes no assertion about the possibility or necessity of x < x. If we have x < x for all x, we call < a
reflexive partial order. If we never have x < x, then < is called a
strict partial order. Thus < is a strict partial order for the real
numbers and < is a reflexive partial order. To any partial order <
there is associated a unique strict partial order and a unique reflexive
partial order that agree with < for all (x, y) with x y. If < is any
partial order we use < for the associated reflexive partial order.
The following principle is equivalent to the axiom of choice and is
often more convenient to apply. For a proof of this equivalence and a
discussion of related principles see Suppes [23], Chapter 8, or Kelley
[14], pp. 31-36.
Hausdorff Maximal Principle: Let < be a partial ordering on a
set X. Then there is a maximal linearly ordered subset S of X, that is,
a subset S of X which is linearly ordered by < and has the property
that, ifScTc X and T is linearly ordered by <, then S = T.
Problems
28. Let < be a partial order on X. Then there is a unique strict partial
order < and a unique reflexive partial order < on X such that for x y
wehavex <y<=>x<y<=>x<y.
29. Give an example of a partially ordered set which has a unique
minimal element but no smallest element.
9 Well Ordering and the Countable Ordinals
A strict linear ordering < on a set X is called a well ordering
for X or is said to well order X if every nonempty subset of X contains a first element. Thus, if we take X = N and < to mean less
than, then N is well ordered by < . On the other hand, the set R of all
real numbers is not well ordered by the relation "less than." The
following principle clearly implies the axiom of choice and can be
shown equivalent to it (cf. Suppes [23], Chapter 8, or Kelley [14],
pp. 31-36).
Sec. 9]
Well Ordering and the Countable Ordinals
25
Well Ordering Principle: Every set X can be well ordered; that is,
there is a relation < which well orders X.
8. Proposition: There is an uncountable set X which is well ordered
by a relation < in such a way that:
i. There is a last element 52 in X.
ii. If x e X and x
0, then the set {y e X: y < x} is countable.
Proof: Let Y be any uncountable set, say the one given by Problem 23. By the well-ordering principle, there is a well ordering < for
Y. If Y does not have a last element, take an element a g Y, replace
Y by Y u {a} , and extend the order < by setting y < a for all
y e Y. This new Y has a last element and is well ordered by < . The
set of y in Y for which the set flx c Y: x < yj is uncountable is a
nonempty set, since it contains the last element of Y. Let St be the
smallest element in this set, and let X = {x c Y: x < 52 or x = S2 } .
Then X is the required set. I
The well-ordered set X given in the proposition will be very useful
for constructing examples. It can be shown to be unique in the
sense that, if Vis any other well-ordered set with the same properties,
then there is a one-to-one order preserving correspondence between
X and Y. The last element st in X is called the first uncountable
ordinal and X is called the set of ordinals less than or equal to the
first uncountable ordinal. The elements x < 52 are called countable
ordinals. If {y: y < x} is finite we call x a finite ordinal. If w is the
first nonfinite ordinal, then {x: x < col is the set of finite ordinals
and is equivalent, as an ordered set, to the set N of positive integers.
Problems
30. a. Show that any subset of a well-ordered set is well ordered.
b. If < is a partial order on X with the property that every nonempty subset of X has a least element, then G is a linear ordering and
consequently a well ordering.
31. Let Y be the set of ordinals less than the first uncountable ordinal;
i.e., Y = {x c X: x < 2} . Show that every countable subset E of Y has
an upper bound in Y and hence a least upper bound. (An element b is an
upper bound for E if x < b for each x c E; it is a least upper bound if
b < b* for each upper bound b*.)
Set Theory [Ch. 1
26
32. A subset S of a well-ordered set Xis called a segment if
S = Ix e X: x < y}
for some y e X or if S = X. Show that a union of segments is again a
segment.
33. Let X and Y be two well-ordered sets. A function f from X to Y
is called successor preserving if for each x e X the element f(x) is the first
element of Y not in f[{z: z < x}].
a. Show that there is at most one successor-preserving map of X
into Y.
b. Show that the range of a successor-preserving map is a segment.
c. Show that if f is successor preserving, then f is a one-to-one
order-preserving map and f ' is also order preserving.
d. Iff is a successor-preserving map of Xinto Y, then the restriction
off to a segment is also successor preserving.
e. If X and Y are well-ordered sets, then there is a successorpreserving map from one of them onto a segment of the other; that is,
either there is a successor-preserving map of X onto a segment of Y or a
successor-preserving map of Y onto a segment of X. [Hint: Consider the
collection of all those segments of X on which there is a successor-preserving map into Y. Show that there is a successor-preserving map f of the
union S of this collection into Y and that either S = X or f[S] = Y.]
f. Show that the well-ordered set X in Proposition 8 is unique up
to isomorphism.
Part One
0
Theory of Functions
of a Real Variable
2
1
The Real Number
System
Axioms for the Real Numbers
We assume that the reader has a familiarity with the set R of real
numbers and those basic properties of real numbers which are
usually treated in an undergraduate course in analysis. The present
chapter is devoted to a review and systematization of those results
which will be useful later.
One approach to the subject of real numbers is to define them as
Dedekind cuts of rational numbers, the rational numbers in turn
being defined in terms of the natural numbers. Such a program gives
an elegant construction of the real numbers out of more primitive
concepts and set theory. We shall not concern ourselves here with
the construction of the real numbers but will think of them as
already given, and list a set of axioms for them. All the properties
we need are consequences of these axioms, and in fact these axioms
completely characterize the real numbers.
We thus assume as given the set R of real numbers, the set P of
positive real numbers, and the functions `+' and `-' on R X R to R
and assume that these satisfy the following axioms, which we list in
three groups. The first group describes the algebraic properties and
the second the order properties. The third comprises the least upper
bound axiom.
A. The Field Axioms: For all real numbers x, y, and z we have:
+ y ----- y + x.
A2. (x + y) + z = x + (y +
Al.
x
z).
29
30
The Real Number System
[Ch. 2
0 = x for all x e R.
A3. 30 e R such that x
A4. For each x e R there is a w ER such that x - F w = O.
A5. xy = yx.
A6. (xy)z = x(yz).
A7. 31 e Rsuch that 1 / 0 and x • 1 = x for all x ER.
A8. For each x in R different from 0 there is w e R such that
xw = 1.
A9. x(y z) = xy xz.
Any set which satisfies these axioms is called a field (under + and .).
It follows from Al that the 0 in A3 is unique, a fact which we have
assumed in the formulation of A4, A7, and A8. The w in A4 is unique
and denoted by ' —x'. We define subtraction x — y as x (—y).
The 1 in A7 is unique. The w in A8 can be shown to be unique and is
denoted by `x— P. If we have a field, that is, any system satisfying
Al through A9, we can perform all the operations of elementary
algebra, including the solution of simultaneous linear equations. We
shall use the various consequences of these axioms without explicit
mention.
The second class of properties possessed by the real numbers have
to do with the fact the real numbers are ordered. We could axiomatize
the notion of a less than b, but it is somewhat more convenient to
use the notion of a positive real number as the primitive one. When
we do this our second group of axioms takes the following form:
B. Axioms of Order: The subset P of positive real numbers satisfies
the following:
B 1. (x, y c P)
x
y P.
xy c P.
B2. (x, y c P)
B3. (x P)
—x P.
B4. (x e R)
(x = 0) or (x e P) or (—x e P).
Any system satisfying the axioms of groups A and B is called an
ordered field. Thus the real numbers are an ordered field. The rational
numbers give another example of an ordered field.
In an ordered field we define the notion x < y to mean y — x c P.
We write `x < y' for `x < y or x = y'. In terms of < Axiom B 1
is equivalent to
(x<y&z<w)xd-z<y+w,
Sec. 1)
31
Axioms for the Real Numbers
and B2 is equivalent to
(0 <x<y&0<z<w)xz<yw.
Axiom B3 asserts that a number cannot be both greater than and
less than another, while B4 states that of any two different numbers
one must be the larger. Since Axiom Bi implies that the relation < is
transitive, we see that the real numbers are linearly ordered by <
Except for the discussion in the beginning of the next Section we
take all the consequences of these two axiom groups for granted
and use them without explicit mention. For further properties of
ordered fields the reader should see Birkhoff and MacLane [2].
The third axiom group consists of a single axiom, and it is this
axiom which distinguishes the real numbers from other ordered
fields. In contrast to our cavalier policy about the consequences of
the first two axiom groups we shall be explicit about the use of this
last axiom. Before stating this final axiom, let us introduce some
terminology: If S is a set of real numbers, we say that b is an upper
bound for S if for each x e S we have x < b. We sometimes express
this by writing S < b. A number c is called a least upper bound for S
if it is an upper bound for S and if c < b for each upper bound b of S.
Clearly the least upper bound of a set S is unique if it exists. Our
final axiom for real numbers simply guarantees its existence for sets
with an upper bound.
C. Completeness Axiom: Every nonempty set S of real numbers
which has an upper bound has a least upper bound.
As a consequence of Axiom C we have the following proposition:
1. Proposition: Let L and U be nonempty subsets of R with
R=LuU and such that for each 1 in L and each u in U we have
1 < u. Then either L has a greatest element or U has a least element.
We shall often denote the least upper bound of S by sup S or by
sup x and occasionally by sup {x: x E S}. We can define lower
xeS
bounds and greatest lower bounds in a similar fashion, and it follows
from Axiom C that every set of real numbers with a lower bound
has a greatest lower bound. We denote the greatest lower bound of a
set S by inf S or by inf x. Note that inf x = —sup —x.
xeS
xeS
xeS
32
The Real Number System
[Ch. 2
Problems
1. Show that 1 e P.
2. Use Axiom C to show that every nonempty set of real numbers with
a lower bound has a greatest lower bound.
3. Prove Proposition 1 using Axiom C.
4. If x and y are two real numbers, we define max (x, y) to be x if x > y
and y if y > x. We often denote max (x, y) by 5c V y'. Similarly, we define
min (x, y) to be the smaller of x and y, and denote it by 'x A y'. Show that
a. (x A y) A z=xA (y A z).
b. xAy+xVy=x+y.
c. (—x) A ( y) = —(x V y).
—
V y+ z= (x + z) V (y + z).
e. z(x V y) = (zx) V (zy) if z _> O.
5. We define lxi to be x if x > 0 and to be — x if x < O. Show that
d. x
a. lxyl = lx1 lyl.
b. ix + yl < 1x1 + 1y1.
c. Ix! =
x
V
(—x).
d. x v y = 1(x +
y + Ix
—
YD.
e. if —y < x < y, then ixi < y.
2 The Natural and Rational Numbers
as Subsets of R
We have adopted the procedure of taking the natural numbers for
granted and of using them as counting numbers. Yet we all regard
such a number as 3 not only as a natural number but also as a real
number. In fact, we have used the symbol 1 not only to denote the
first natural number but also the special real number given by Axiom
A7, and one is tempted to say that we define the real number 3 as
1 + 1 ± 1, and that in a "similar" fashion we can define real numbers corresponding to any natural number. Actually we may use the
tools we have at hand to do this in a more precise fashion.
By the principle of recursive definition there is a function cp from
the natural numbers to the real numbers defined by 441) = 1 and
cp(n + 1) = p(n) ± 1. (Here 1 denotes a real number on the right
side and a natural number on the left.) We shall show that the
Sec. 2]
The Natural and Rational Numbers as Subsets of R
33
mapping yo is a one-to-one mapping of N into R. Let p and q be two
different natural numbers, say p < q. Then q = p n, and we
shall show that ç(p) < ço(q) by induction on n. For n
1 we have
q=p
1 and ç(q) =ic(p) + 1 > cp(p). For general n we have
ço(p n ± 1) = io(p n) + 1 > go(p n),and so (p(p n) > (p(p)
implies ço(p n 1) > cp(p). Thus by induction yo(p n) > 4p(p),
and we see that the mapping yo is one-to-one. We can also prove by
induction that (p(p q) = yo(p) cp(q) and ço(pq) = (p(p)(p(q). Thus
yo gives a one-to-one correspondence between the natural numbers and
a subset of R, and yo preserves sums, products, and the relation <
Strictly speaking, we should distinguish between the natural number
n and its image yo(n) under cp, but we shall not make the distinction
here; we shall consider the set N of natural numbers to be a subset
of R. By taking differences of natural numbers, we obtain the integers
as a subset of R, and taking quotients of integers gives us the rationals.
Since Axiom C was not used in this discussion, the same results hold
for any ordered field. Thus we have shown the following:
2. Proposition: Every ordered field contains (sets isomorphic to)
the natural numbers, the integers, and the rational numbers.
If we make use of Axiom C we can prove several further facts about
the integers and rational numbers as subsets of the reals. One of the
most important is the following theorem, which for historical reasons'
is called the Axiom of Archimedes:
3. Axiom of Archimedes: Given any real number x, there is an
integer n such that x < n.
Proof: Let S be the set of integers k such that k < x. Since S
has the upper bound x, it has a least upper bound y by Axiom C.
Since y is the least upper bound for S, y — I cannot be an upper
But
bound for S, and so there is ak e S such that k > y —
k 1 > y ± >y, and so (k 1) k S. Since k 1 is an integer
1 greater than x by the definition of S. I
not in S, we must have k
4. Corollary: Between any two real numbers is a rational,- that is,
if x < y, then there is a rational r with x < r < y.
1
It was first used by Eudoxus.
34
The Real Number System
[Ch. 2
Proof: Let us first suppose 0 < x. By the Axiom of Archimedes
there is an integer q > (y x) -1 . Then WO <y — x. The set of
integers n such that y < (n1q) is a nonempty (by the Axiom of
Archimedes) set of positive integers, and so has a least element p.
Then (p 1)lq <y _<(plq), and x = y (y x) < (plq) (11q)
(p
1)/q. Thus r = (p 1)/q lies between x and y. If x < 0, we
can find an integer n such that n > x. Then n x > 0, and there
is a rational r with nd-x <r <n+y,and r — n is a rational
between x and y.
—
—
—
—
—
—
—
—
•
3 The Extended Real Numbers
It is often convenient to extend the system of real numbers by the
addition of two elements + Go and — Go . This enlarged set is called
the set of extended real numbers. We extend the definition of < to
the extended real numbers by postulating — < x < 0 0, for each
real number x. We define
00
+ GO = œ
X •
GC
=
x — 00
GC
— Go
=
op
if x > 0
if x > 0
for all real numbers x, and set
00 + 09 = c/3, — C/3 —
GO • (± 00) =
CO
= — 00
— OC • (± 00) =
TOO,
The operation oo — oo is left undefined, but we shall adopt the arbitrary convention that 0 - oc= O.
One use of extended real numbers is in the expression "sup S".
If S is a nonempty set of real numbers with an upper bound, we
define sup S to be the least upper bound of S. If S has no upper
bound, we write sup S = co . Then sup S is defined for all nonempty
sets S, and if we define sup 0 to be — œ, then in all cases sup E is
the smallest extended real number which is greater than or equal to
each element of E. Similar conventions are adopted with respect to
inf S.
A function whose values are in the set of extended real numbers is
called an extended real-valued function.
Sec. 4]
35
Sequences of Real Numbers
4 Sequences of Real Numbers
By a sequence 2 (x„) of real numbers we mean a function which
maps each natural number n into the real number xn. We say that a
real number lis a limit of the sequence (x,„) if for each positive there
is an N such that for all n > N we have Ix„ — /1 < E. It is easily
verified that a sequence can have at most one limit, and we denote
this limit by lirn x„ when it exists. In symbols / = lim x„ if
(E
> 0)(iN)(n
— < e).
A sequence (x,,) of real numbers is called a Cauchy sequence if,
given e > 0, there is an N such that for all n > N and all in > N
we have Ix,, — xin j < E. The Cauchy Criterion states that a sequence
of real numbers converges if and only if it is a Cauchy sequence (see
Problem 10).
We extend this notion of limit of a sequence to include the value
cf.,' as follows: lim x„ = Go if given A there is an N such that for all
n > N we have x„ > A. A sequence is called convergent if it has a
limit. This definition is ambiguous; it depends on whether or not
we mean a limit which is a real number or an extended real number.
In most of analysis it is more usual to use the restricted definition
for convergence, which requires a limit to be a real number, but we
shall find it convenient in the next few chapters to allow ± Go as
limits in good standing. In those cases in which it is important to
distinguish between the two concepts of a limit we shall try to be
explicit by the use of such phrases as 'converges to a real number'
or 'converges in the set of extended real numbers.'
If / = lim x„, we often write x„ L If, in addition, (x„) is monotone, that is, xi, < x„ 1 , we write xi, 1 1.
In the case of a real number we can paraphrase the definition of
limit as follows: 1 is the limit of (x,„) if, given E > 0, all but a finite
number of terms of the sequence (xi,) are within e of I. A weaker
requirement is to have infinitely many terms of the sequence within
E of I. In this case we say that 1 is a cluster point of the sequence
(xi,). Thus I is a cluster point of (x,,) if, given E > 0, and given
N, in > N such that Ix,, — /1 < E. We extend this definition to the
case / = Go by saying that Go is a cluster point of (xi,) if, given A
2
These are infinite sequences in the terminology of Chapter 1. Since we shall be mostly
interested in infinite sequences in the remainder of this book, we drop the adjective
"infinite" and assume all sequences infinite unless otherwise specified.
The Real Number System
36
[Ch. 2
and given N, 3n > N such that xn > A. An obvious modification
applies to — 0. Thus, if a sequence has a limit 1, then 1 is a cluster
point, but the converse is not usually true. For example, the sequence
(x n ) defined by xn = (- 1)n has +1 and —1 as cluster points but
has no limit
If xn) is a sequence, we define its limit superior by
(
lirn xn = inf sup Xk.
n k>n
The symbols lim and lim sup are both used for the limit superior.
A real number l is the limit superior of the sequence (x n ) if and only
if (i) given e > 0, 3n such that xk < 1 + e for all k > n, and (ii) given
e > 0, and given n, 3k > n such that x k > Z — e. The extended real
number 00 is the limit superior of (xn) if and only if, given A and n,
there is a k > n such that x k > A. The extended real number — 00
is the limit superior of (xn ) if and only if — 00 = lim x.
We define the limit inferior by
lim xn = sup inf xk.
n k>n
We have lim — xn = —Ern xn, and lim x, < lirn x n . The sequence
(x,) converges to an extended real number 1 if and only if 1 =
Ern xn = lim xn. If (x n ) and (yn ) are two sequences, we have
lim x, + lim y, < lim (x, + yn) < lim + lim
< lim(x+ Y.) <limx+lim y.
Problems
6. Show that a sequence can have at most one limit.
7. Show that 1 is a cluster point of (xn ) if and only if there is a subsequence (xn,),= 1 which converges to I.
8. a. Show that lim xn and Ern x„ are the largest and smallest cluster
points of the sequence (x).
b. Show that every bounded infinite sequence has a subsequence
which converges to a real number.
9. Show that a sequence (xn ) is convergent if and only if there is
exactly one extended real number which is a cluster point of the sequence.
Is this statement true if we omit the word 'extended'?
Sec. 4]
Sequences of Real Numbers
37
10. a. Show that a sequence (x„) which converges to a real number I
is a Cauchy sequence.
b. Show that each Cauchy sequence is bounded.
c. Show that if a Cauchy sequence has a subsequence which converges to 1, then the original sequence converges to I.
d. Establish the Cauchy Criterion: There is a real number I to
which the sequence (xn ) converges if and only if (xn) is a Cauchy sequence.
11. Show that x = lim x„ if and only if every subsequence of (xn ) has
in turn a subsequence which converges to x.
12. Show that the real number I is the limit superior of the sequence
(x„) if and only if (i) given E > 0, 3n such that xk < I E for all k > n,
and (ii) given c > 0 and n, 3k > n such that xi, > I — e.
13. Show that lim xv, = cc if and only if given A and n, 3k > n with
Xk > A.
14. Show that lim xn < lim x„ and that lim x„ = iii
only if I = lim
15. Prove that
lim x„
= I if and
limy,, < lim (x„
Y.) < gira
provided the right and left sides are not of the form cc — co.
16. Prove that if x„ > 0 and yn > 0, then
lim (xn yn ) < (fim xn)(lim
provided the product on the right is not of the form 0 oc.
17. We shall say that a sequence (or series) (x,) is summable to the real
number s or has a sum s if the sequence (s„) defined by sn =
s as a limit. In this case we write s =
E x„ has
E x,. Show that, if each
xv > 0,
there is always an extended real number s such that
co
s =
18. Show that the series (x,) has a sum if
E lxvi <
v=1
19. Let (x„) be a sequence of real numbers. Show that x = lim xn. if
and only if
X=
E (xv+ 1
—
xv).
38
The Real Number System
20. Let E be a set of positive real numbers. We define
Ex
[Ch. 2
to be
xeE
sup SF , where a is the collection of finite subsets of E and SF is the (finite)
sum of the elements of F.
a. Show that
Ex
<
cc
only if E is countable.
xeE
b. Show that if E is countable and (xn ) is a one-to-one mapping
of N onto E, then
E x --= E x.
xeE
n---- 1
21. Let p be an integer greater than 1, and x a real number, 0 < x < 1.
Show that there is a sequence (an ) of integers with 0 < an < p such that
x =
E
n=1
pn
and that this sequence is unique except when x is of the form q/pn, in
which case there are exactly two such sequences. Show that, conversely,
if (an ) is any sequence of integers with 0 < a < p, the series
E an
pn
converges to a real number x with 0 < x < 1.
If p = 10, this sequence is called the decimal expansion of x. For
p = 2 it is called the binary expansion; and for p = 3, the ternary expansion.
22. Show that R is uncountable. [Use Problem 1.23. Another proof
will be given by Corollary 3.4.]
5 Open and Closed Sets of Real Numbers
The simplest sets of real numbers are the intervals. We define the
open interval (a, b) to be the set ix: a < x < . We always take
a < b but we consider also the infinite intervals (a, co) = fx: a <
oc) for the
and
co, b) = fx: x < . Sometimes we write
set of all real numbers. We define the closed interval [a, b] to be the
set {x: a < x < . For closed intervals we take a and b finite. The
half-open interval (a, b] is defined to be {x: a < x < , and [a, b) =
{x: a < x < . A generalization of the notion of an open interval
is given by that of an open set:
(—
(—
,
Definition: A set 0 of real numbers is called open i f for each x e 0
there is a t > 0 such that each y with Ix — yl < 3 belongs to O.
Sec. 5]
Open and Closed Sets of Real Numbers
.77
Another way of phrasing this definition is to say that a set 0
is open if for every x in 0 there is an open interval I such that x 6 I c
0. The open intervals are examples of open sets, and both the empty
set Ø and the set R of real numbers are open. We establish some
properties of open sets.
5. Proposition: The intersection 0 1 n 0 2 of two open sets 0 1 and
0 2 is open.
Proof: Let X E 01 n 0 2 . Since x c 0 1 and 0 1 is open, there is a
31 > 0 such that all y with {x — yj < 3 1 belong to 0 1 . Similarly,
there is a 32 > 0 such that all y with !x — yj < 3 2 belong to 02.
Take 3 to be the smaller ofô 1 and 32. Then â > 0, and if Ix — yj < 3,
then y belongs to both 0 1 and 0 2, i.e., to 0 1 n 02 . I
6. Corollary: The intersection of any finite collection of open sets
is open.
7. Proposition: The union of any collection e of open sets is open.
Proof: Let U be the union of the collection e, and X E U. Then
there is an 0 E e with X E 0. Since 0 is open, there is an E > 0 such
that all y with Ix — yl < € belong to 0 and hence to U, since
0 c U. Thus U is open. I
It follows from Proposition 5 that the intersection of any finite
collection of open sets is open. It is not true, however, that the intersection of any collection of open sets is open. Take, for example,
n
On to be the open interval (— 71). Then
0. = {0}, and {0} is not
an open set.
Every union of open intervals is an open set by Proposition 7.
A strong form of the converse of this is also true:
8. Proposition: Every open set of real numbers is the union of a
countable collection of disjoint open intervals.
Proof: Since 0 is open, for each X E 0 there is a y > x such that
(x, y) C O. Let b = sup {y: (x, Y) c 0}. Let a = inf {z: (z, x) c 0 } .
Then a < x < b, and Ix = (a, b) is an open interval containing x.
Now Ix c 0, for if w E I, say X < W < b, we have by the definition of b a number y > w such that (x, y) c 0, and so w e 0.
40
The Real Number System
[Ch. 2
Moreover, b o 0, for if b e 0, then for some e > 0 we have
(b — E, b
e) c 0, whence (x, b 6) c 0, contradicting the definition of b. Similarly, a o O. Consider the collection of open intervals
x e O. Since each x in 0 is contained in ./x, and each Ix is contained in 0, we have 0 = UIx . Let (a, b) and (c, d) be two intervals
in this collection with a point in common. Then we must have c < b
and a < d. Since c does not belong to 0, it does not belong to (a, b)
and we have c < a. Since a does not belong to 0 and hence not to
(c, d), we have a < c. Thus a — c. Similarly, b -- d, and (a, b)
(c, d). Thus two different intervals in the collection {/x} must be
disjoint. Thus 0 is the union of the disjoint collection {/x} of open
intervals, and it remains only to show that this collection is countable.
But each open interval contains a rational number by the corollary
to the Axiom of Archimedes. Since we have a collection of disjoint
open intervals, each open interval contains a different rational number, and the collection can be put in one-to-one correspondence with a
subset of the rationals. Thus it is a countable collection. 1
9. Proposition (Lindelof): Let e be a collection of open sets of real
numbers. Then there is a countable subcollection {0, } of e such that
U
Oee
0 =
U o,.
z=1
Proof: Let U = U{O: 0 e e}, and x c U. Then there is an
0 c e with x c O. Since 0 is open, there is an open interval Ix such
that x c I c O. It follows from Corollary 4 that we can find an open
interval ‘Tx with rational endpoints such that x e J C I. Since the
collection of all open intervals with rational endpoints is countable,
the collection {ix}, x e U, is countable, and U = U J. For each
xeU
interval in {4, } choose a set 0 in e which contains it. This gives a
countable subcollection {0, } 7_, of e, and U =
U Or I
We shall also study the notion of a closed set which generalizes the
notion of a closed interval. We begin by defining a point of closure:
Definition: A real number x is called a point of closure of a set
E if for every 5 > 0 there is a y in E such that Ix — yl < 5.
This is equivalent to saying that x is a point of closure of E if
every open interval containing x also contains a point of E. Each
Sec. 5]
Open and Closed Sets of Real Numbers
41
point of E is trivially a point of closure of E. We denote the set of
points of closure of E by T. Thus E c
10. Proposition: If A c B, then 74 c :13. Also (A u
=
Proof: The first part follows immediately from the definition of
points of closure. Since Ac Au B, we have 71c(AuB). Similarly,
c (A u B). Thus AuBc (A u B). Suppose that x A u T3.
Then there is a S i > 0 such that there is no y c A with Ix — yl <
and there is a 5 2 > 0 such that there is no y c B with Ix — yl < 5 2 .
Thus if 5 = min (5 1, 32), there is no yeAuB with fx — yI < 5.
Consequently, x (A u B), and we have (A u B) c 71 u T3. I
Definition: A set F is called closed if F =
Since we always have F c P, a set F is closed if T' c F, that is,
if F contains all of its points of closure. The empty set 0 and the
set R of all real numbers are closed. The closed intervals [a, b] and
[a, 00 ) are closed. It is customary to use the letter F to denote closed
sets (French, fermé).
11. Proposition: For any set E the set
T is closed; that is, r =
Proof: Let x be a point of closure of E. Then, given S > 0, there
is a point y e E with Ix — yl < 612. Since y e E, there is a zeE
with y — zI < 5/2. Thus Ix — zi < 5, and we see that x is a point
of closure of E. I
12. Proposition: The union F1 U F2 of two closed sets F1 and F2 is
closed.
Proof: By Proposition 10 we have
(F1 u F2) P1 u
F2 = F1 U F2 . 1
13. Proposition: The intersection of any collection e of closed sets
is closed.
n
Proof: Let x be a point of closure of
F c C). Then, given
5 > 0, there is ay e fl {F: F e ej such that Ix — yl < &Since such
a y belongs to each F c e, we see that x is a point of closure of each
The Real Number System
42
[Ch. 2
F e e. Since each F is closed, we have x e F for each F in e. Hence
x e fl {F: F e e}. I
14. Proposition: The complement of an open set is closed and the
complement of a closed set is open.
Proof: Let 0 be open. If x e 0, there is a > 0 such that if
< 8, then y e 0. Hence x cannot be a point of closure of
Ix
Ô, since there is no y e Ô with ix
yl < 3. Thus Ô contains all of
its points of closure and is therefore closed.
On the other hand, let F be closed and X E P. Then, since x is not
a point of closure of F, there is a 5 > 0 such that there is no ye F
with Ix — yJ < 5. Hence, if lx — yJ < 8, then y c F. Thus F is
open. I
—
We say that a collection e of sets covers a set F if FC U {0: 0 ee}.
In this case the collection e is called a covering of F. If each 0 e e
is open, we call e an open covering of F. If e contains only a finite
number of sets, we call e a finite covering. This terminology is
inconsistent: In 'open covering' the adjective 'open' refers to the
sets in the covering; in 'finite covering' the adjective 'finite' refers to
the collection and does not imply that the sets in the collection are
finite sets. Thus the term 'open covering' is an abuse of language and
should properly be 'covering by open sets'. Unfortunately, the former
terminology is well established in mathematics. With this terminology we state the following theorem:
15. Theorem (Heine-Borel): Let F be a closed and bounded set of
real numbers. Then each open covering of F has a finite subcovering.
That is, if e is a collection of open sets such that F c U f0: 0 e e},
07J of sets in e such that
then there is a finite collection f0 1 ,
.
,
F C U 0i.
Proof: Let us first consider the case that F is the closed interval
[a, b], where —x <a <b< x. Let E bethesetofnumbersx < b
with the property that the interval [a, x] can be covered by a finite
number of the sets of e. The set E is bounded by b and so has a least
upper bound c. Since c e [a, b], there is an 0 e e which contains c.
Since 0 is open, there is an E > 0 such that the interval (c — E, C
Sec. 5]
Open and Closed Sets of Real Numbers
43
is contained in O. Now c — e is not an upper bound for E, and
so there must be an x e E with x > c — E. Since x e E, there is a
finite collection {0 1 , . , Oi } of sets in e which covers [a, x].
Consequently, the finite collection {0 1 , . . . , 0 k, 0 covers [a, c e).
Thus each point of [c, c
e) would be in E if it were less than or
equal to b. Since no point of [c, c
e) except c can belong to E, we
must have c = b and b e E. Thus [a, b] can be covered by a finite
number of sets from e, proving our special case.
Now let F be any closed and bounded set and e an open covering
of F. Since F is bounded it is contained in some closed bounded
interval [a, b]. Let e* be the collection obtained by adding P to e;
that is, e* = e u {P}. Since F is closed, P is open, and e* is a
collection of open sets. By hypothesis F c U {0: 0 e e}, and so
R=PuFCPuU{O:Oee} = WO: O e e*}. Thus e is an
open covering of R and therefore of [a, b]. By our previous case there
is a finite subcollection of e* which covers [a, b] and hence F. If this
finite subcollection does not contain P, it is a subcollection of e and
the conclusion of our theorem holds. If the subcollection contains
P, denote it by ;0 1 , . , On, P-1. Then F c P u 0 1 u • • • u On.
Since no point of F is contained in P, we have F C 01 U • • ' U On,
and the collection {0 1 , , On is a finite subcollection of e which
covers F. I
}
}
16. Proposition: Let e be a collection of closed sets (of real
numbers) with the property that every finite subcollection of e has a
nonempty intersection, and suppose that one of the sets in e is bounded.
Then
n
Fee
F
Problems
23. Is the set of rational numbers open or closed?
24. What are the sets of real numbers that are both open and closed?
25. Find two sets A and B such that A n B = 0 and 74 n
0.
26. Show that xis a point of closure of Eif and only if there is a sequence
(Yn) with yn e E and x
27. A number x is called an accumulation point of a set E if it is a point
of closure of E fx; . Show that the set E' of accumulation points of E is
a closed set.
44
The Real Number System
[Ch. 2
28. Show that E
EUE'.
29. A set is called isolated if E n E' = Ø. Show that every isolated set
of real numbers is countable.
30. A set D is called dense in R if —D- = R. Show that the set of rational
numbers is dense in R.
31. Prove Propositions 12 and 13 using Propositions 5, 7, and 14.
32. Prove Propositions 5 and 7 using Propositions 12, 13, and 14.
33. A point x is called an interior point of a set A if there is a S > 0
such that the interval (x — S x
(3) is contained in A. The set of interior
points of A is denoted by A°. Show that
a. A is open if and only if A = A°.
,
b. A° =
34. Derive Proposition 16 from the Heine-Borel Theorem using
De Morgan's laws.
35. Let (F„) be a sequence of nonempty closed sets of real numbers
with Fn+ 1 C F. Show that if one of the sets F
bounded, then
fl F
0. Give an example to show that this conclusion may be false
if we do not require one of the sets to be bounded.
36. The Cantor ternary set C consists of all those real numbers in [0, 1]
which have ternary expansion (cf. Problem 21) (an ) for which an is
never I. (If x has two ternary expansions, we put x in the Cantor set if
one of the expansions has no term equal to 1.) Show that C is a closed set,
and that C is obtained by first removing the middle third (*, D from [0, 1],
4) of the remaining intervals,
then removing the middle thirds a-, 4) and
and so on.
37. Show that the Cantor set can be put into a one-to-one correspondence with the interval [0, 1].
38. Show that the set of accumulation points of the Cantor set is the
Cantor set itself.
6 Continuous Functions
Let f be a real-valued function whose domain of definition is a set E
of real numbers. We say that f is continuous at the point x in E if,
given E > 0, there is a 3 > 0 such that for all y in E with lx — y < 3
we have j f(x) — f( y){ < E. The function f is said to be continuous on
a subset A of E if it is continuous at each point of A. If we merely
say that f is continuous, we mean that f is continuous on its domain.
Sec. 6]
45
Continuous Functions
17. Proposition: Let f be a real-valued function defined and continuous on a closed and bounded set F. Then f is bounded on F and
assumes its maximum and minimum on F; that is, there are points
x 1 and x2 in F such that .f (x 1) < f(x) < f (x 2) for all x in F.
Proof: We shall first show that f is bounded on F. Since f is
continuous on F, for each x c Fthere is an open interval 4 containing
x such that (Y) — f (.0 < 1 for y E rx n F. Thus for y E rx n F,
we have f(y)1 < f(x)1 + 1 and so f is bounded in I. The collection
14: x e FI is a collection of open intervals which covers F, and by
the Heine-Borel Theorem there is a finite subcollection {4,, . . , ir.„}
which covers F. Let M = 1 + max [I f (x i )j, , f (x 00. Then
each y in F belongs to some interval I in the finite subcollection,
and hence f (y)i < 1 + < M. This shows that f is bounded
(by M) on F.
To see that f assumes its maximum on F, let m = sup f(x). Since
xcF
f is bounded, m is finite, and our goal is to show that there is an
x1 c F such that f(x 1 ) = m. Suppose not. Then f(x) < m for each
X e F, and by continuity there is an open interval 4 containing x
such that ft.))) < (x)
m) for all y c Ix n F. Again using the
Heine-Borel Theorem, we can find a finite number of these intervals
. , /x„) which cover F. Set a = max [f (xi), • • • , f(x„)]. Then
<
each y e F belongs to one such interval /xi, and f( y) < 32-[f(x k)
(a + m). Thus (a + m) is a bound for f on F. But this is impossible, since -21-(a m) < m. Consequently, there must be an x 1
such that f(x i) = m. Similarly, there is an x2 at which f assumes its
minimum. I
18. Proposition: Let f be a real-valued function defined on
(— o, Go). Then f is continuous if and only if for each open set 0 of
real numbers f-1 [0] is an open set.
Proof: Suppose f"[0] is open for each open set 0, and let x
be an arbitrary real number. Then, given e > 0, the interval / =
(f(x) e, f(x) e) is an open set, and so its inverse image f -1 [1]
[I], there must be some 6 > 0 such that
must be open. Since x e
(x — 6, x
6) c ri [I]. But this implies that if I y — xi < 6, then
e); that is, I .[(x) — f(y) < e. Hence f is
f(y) e (f(x) — E, f(x)
continuous at x. Since x was arbitrary, f is continuous.
Suppose now that f is continuous and that O is an open set. Let
x be a point off '[0]. Thenf(x) c 0, and there is an E > 0 such that
46
The Real Number System
[Ch. 2
(f(x) — E, f(x)
e) c O. Since f is continuous at x, there is a 6 > 0
such that f(x) — f(y)i < E for 1x — yf < 6. Thus, for every
y e (x — 3, x + 3), we havef(y) e (f(x) — e, f(x) e) c O. Hence
(x — 3, x 5) c f -1 [0], and so f —i [0] is open. I
Definition: A real-valued function f defined on a set E is said to be
uniformly continuous (on E) if, given e > 0, there is a â > 0 such that
for all x and y in E with x —yf < 5 we have f(x) — f(y)1 < E.
19. Proposition: If a real-valued function fis defined and continuous
on a closed and bounded set F of real numbers, it is uniformly continuous on F.
Proof: Given E > 0 and x in F, there is a 6, > 0 such that
Ix — y1 < 6, implies I f(x) — f(y)I < 1E. Let 0, be the interval
(x —
x 16 s). Then {Oz : x e F is an open covering of F. By
the fleine-Borel Theorem there is a finite subcollection ft Oz„
o.„}
which covers F. Let 5 --= min . 6,). Then 6 is positive.
Given two points y and z in F such that Iy — zI < 6, the point y must
belong to some Oz,, and hence there is an i such that ly — xzl <
Consequently,
}
lz —
1z — y1 + iy —
I<
+6<
Hence
I f(y) - f
<
I f (z) -
<
and
whence
I f(z) - 1(y)1 <
showing that f is uniformly continuous on F.I
Definition: A sequence (f, i ) of functions defined on a set E is
said to converge pointwise on E to a function f if for every x in E
we have f(x) = lim f n (x); that is, if, given x c E and E > 0, there is an
N such that for all n > N we have f f(x) — f(x)1 G E.
Definition: A sequence (fa) of functions defined on a set E is said to
converge uniformly on E if, given e > 0, there is an N such that for all
x e E and all n > N we have f,1(x) — f(x)[ < E.
47
Sec. 61 Continuous Functions
Problems
39. Let F be a closed set of real numbers and f a real-valued function
which is defined and continuous on F. Show that there is a function g
defined and continuous on (— cc, cc) such that f(x) = g(x) for each x e F.
[Hint: Take g to be linear in each of the intervals of which P is composed.]
40. Let f be a real-valued function with domain E. Prove that f is continuous if and only if for each open set O there is an open set U such that
f -1 [0] = E n U.
41. Let (fn ) be a sequence of continuous functions defined on a set E.
Prove that if (A) converges uniformly to i on E then f is continuous on E.
42. Let f be that function defined by setting
x
1
p sin —
if x irrational
if x = in lowest terms.
At what points is f continuous?
43. a. Show that iff and g are continuous functions, then the functions
f + g and fg are continuous.
b. Show that iff and g are continuous, then so isf. g.
c. Let f V g be the function defined by (f V g)(x) = f (x) V g(x),
and define f A g similarly. Show that if f and g are continuous so are
IV g and f A g.
d. If/ is continuous, then so is
44. A real-valued function f is said to be monotone increasing if
f(x) < f(y) whenever x < y, and strictly monotone increasing if
f(x) < f(y) whenever x < y. It is called monotone (or strictly monotone)
iff or —f is monotone (or strictly monotone) increasing. Let f be a continuous function on the interval [a, b]. Then there is a continuous function
g such that g(f(x)) = x for all x e [a, b] if and only if f is strictly monotone. In this case we also have f(g(y)) = y for each y between f(a) and
f(b). A function f which has a continuous inverse is called a homeomorphism (between its domain and its range).
45. A continuous function y, on [a, b] is called polygonal (or piecewise
linear) if there is a subdivision a = x 0 < x 1 < • • • < x0 = b such that
is linear on each interval [x„ x, +1 ]. Let f be an arbitrary continuous
function on [a, b] and E a positive number. Show that there is a polygonal
function (,0 on [a, b] with f(x) — w(x)I < E for all x e [a, b].
46. Let x be a real number in [0, 1 ] with the ternary expansion (an )
(cf. Problem 21). Let N = Gc, if none of the an are 1, and otherwise let N
48
The Real Number System
[Ch.
2
be the smallest value of n such that an = 1. Let bn = -1-an, for n < N and
bN = 1. Show that
is independent of the ternary expansion of x (if x has two expansions) and
that the function f defined by setting
N
E
f(X)
is a continuous, monotone function on the interval [0, I]. Show that f is
constant on each interval contained in the complement of the Cantor
ternary set (Problem 36), and that maps the Cantor ternary set onto the
interval [0, I]. (This function is called the Cantor ternary function.)
47. Limit superior of a function of a real variable. Let f be a real (or
extended real) valued function defined for all x in an interval containing y.
We define
!
lim f(x) = inf
x -,2/
lim f(x) = inf
x-v+
with similar definitions
a. 1im f(x) < A
x-->y
that for all x with 0 <
b.
f(x) > A
sup
f(x)
8>0 0<lx-yl<5
sup
f(x)
8>0 0<x-y<3
for lim.
if and only if, given
E
> 0, there is a 5 > 0 such
lx — yj < S we have f(x) < A + e.
if and only if, given E > 0 and & > 0, there is an
x such that 0 < ix —y < 5 and f(x) > A — E.
C. liM f(x) < lim f(x) with equality (for limf
± co) if and only
x->y
if lim f(x) exists.
d. If limf(x) = A and (xn ) is a sequence with x„
y such that
y = lim xn, then limf( xn) < A.
e. If lim f(x) = A, then there is a sequence (xn ) with xn
y such
x-1.1/
that y = lim xr, and A = lim f(xn).
f. For a real number I we have I = lim f(x) if and only if l =
y and y = lim xn .
for every sequence (x7,) with xn
48. Semicontinuous functions. An extended real-valued function f
— 00 and f(y) <
is called lower semicontinuous at the point y if f(y)
oo
lim f(x). Similarly, f is called upper semicontinuous at y if f(y)
liMf(x„)
Sec. 6]
49
Continuous Functions
and f(y) > lirn f(x). We say that f is lower (upper) semicontinuous on
an interval if it is lower (upper) semicontinuous at each point of the
interval. The function f is upper semicontinuous if and only if the function —f is lower semicontinuous.
a. Let f(y) be finite. Prove that f is lower semicontinuous at y if
and only if given e > 0, 36 > 0 such that f(y) < f(x)
e for all x
with Ix — y <
b. A function f is continuous (at a point or in an interval) if and
only if it is both upper and lower semicontinuous (at the point or in the
interval).
c. Show that a real-valued function f is lower semicontinuous on
(a, b) if and only if the set fx: f(x) > X; is open for each real number X.
d. Show that if f and g are lower semicontinuous functions, so
are f V g and f g.
e. Let (fii ) be a sequence of lower semicontinuous functions. Show
that the function f defined by f(x) = supf„(x) is also lower semicontinuous.
f. A real-valued function ,r) defined on an interval [a, b] is called a
step function if there is a partition a = xo < x 1 < • • • <
b such
that for each i the function cp assumes only one value in the interval
(x,, x2+1). Show that a step function cp is lower semicontinuous if ,p(x,)
is less than or equal to the smaller of the two values assumed in (x j _ 1 , xi)
and (x„ xi + i).
g. A function f defined on an interval [a, b] is lower semicontinuous
if and only if there is a monotone increasing sequence () of lower semicontinuous step functions on [a, b] such that for each x e [a, b] we have
f(x) = lim
h. Show that the step functions in (g) can be replaced by continuous functions.
i. Prove that a function f which is defined and lower semicontinuous on a closed interval [a, b] is bounded from below and assumes its
minimum on [a, b], that is, that there is a y e [a, b] such that f(y) < f(x)
for all x c [a, b].
49. Upper and lower envelopes of a function. Let f be a real-valued
function defined on [a, b]. We define the lower envelope g off to be the
function g defined by
g(y) = sup inf
f(x),
6<0 1x—yl<8
and the upper envelope h by
h(y) = inf
sup f(x).
6>0 lx—Yl<5
The Real Number System
50
[Ch. 2
a. For each x e [a, b], g(x) f(x) h(x), and g(x) = f (x) if and
only if f is lower semicontinuous at x, while g(x) = h(x) if and only if f
is continuous at X.
b. If f is bounded, the function g is lower semicontinuous, while
h is upper semicontinuous.
c. If ço is any lower semicontinuous function such that ço(x) < f(x)
for all x e [a, b], then 99(x) < g(x) for all x e [a, b].
7
Borel Sets
Although the intersection of any collection of closed sets is closed
and the union of any finite collection of closed sets is closed, the
union of a countable collection of closed sets need not be closed.
For example, the set of rational numbers is the union of a countable
collection of closed sets each of which contains exactly one number.
Thus if we are interested in o--algebras of sets which contain all of
the closed sets we must consider more general types of sets than the
open and closed sets. This leads us to the following definition :
Definition: The collection 63 of Borel sets is the smallest ci-algebra
which contains all of the open sets.
Such a smallest g-algebra exists by Proposition 1.3. It is also the
smallest g-algebra which contains all closed sets and the smallest
g-algebra which contains the open intervals.
A set which is a countable union of closed sets is called an 5,
(5 for closed, g for sum). Thus every countable set is an 5,, as is,
of course, every closed set. A countable union of sets in 5, is again
in
Since
(a, b)
[a -In =1
n
b—
ni
each open interval is an 5,, and hence each open set is an
We say that a set is a ga if it is the intersection of a
collection of open sets (g for open, 5 for durchschnitt).
complement of an 5, is a ga, and conversely.
The 5, and ga are relatively simple types of Borel sets.
also consider sets of type 5,5, which are the intersections of
5,.
countable
Thus the
We could
countable
51
Sec. 7] Borel Sets
collections of sets each of which is an 5,. Similarly, we can construct
the classes g5 „ ff,6 ,, etc. Thus the classes in the two sequences
ff,
cr, 5,
c)-(Sa,
• • • 1
gScr, 9acra, • • •
are all classes of Borel sets. However, not every Borel set belongs to
one of these classes. Further theory of Borel sets can be found in
Kuratowski [16], but we shall need only the properties that follow
directly from the fact that they form the smallest a-algebra containing the open and closed sets.
Problems
50. Let f be a lower semicontinuous function defined for all real numbers. What can you say about the sets {x: f(x) > a}, {x: f(x) > a} ,
{x: f(x) < a} , { x: f(x) _< a} , and {x: f(x) = al?
51. Let f be a real-valued function defined for all real numbers. Prove
that the set of points at which f is continuous is a g5 .
52. Let (A) be a sequence of continuous functions defined on R.
Show that the set C of points where this sequence converges is an gcrs.
3
1
Lebesgue Measure
Introduction
The length /(/) of an interval lis defined, as usual, to be the difference of the endpoints of the interval. Length is an example of a set
function, that is, a function which associates an extended real number
to each set in some collection of sets. In the case of length the domain
is the collection of all intervals. We should like to extend the notion
of length to more complicated sets than intervals. For instance, we
could define the "length" of an open set to be the sum of the lengths
of the open intervals of which it is composed. Since the class of
open sets is still too restricted for our purposes, we would like to
construct a set function m which assigns to each set E in some
collection on of sets of real numbers a nonnegative extended real
number mE called the measure of E. Ideally, we should like m to
have the following properties:
i. mE is defined for each set Eof real numbers; that is, Tri, = 00);
ii. for an interval /, m/ — V);
iii. if (En) is a sequence of disjoint sets (for which m is defined),
m(U E.) = EmEn;
iv. m is translation invariant; that is, if E is a set for which m is
defined and if E ± y is the set {x ± y: x c E}, obtained by
replacing each point x in E by the point x ± y, then
m(E + y) = mE.
52
Sec. 1]
53
Introduction
Unfortunately, as we shall see in Section 4, it is impossible to
construct a set function having all four of these properties, and it is
not known whether there is a set function satisfying the first three
properties.' Consequently, one of these properties must be weakened,
and it is most useful to retain the last three properties and to weaken
the first condition so that mE need not be defined for all sets E of
real numbers. 2 We shall want mE to be defined for as many sets as
possible and will find it convenient to require the family arc of sets
for which m is defined to be a Gr-algebra. Thus we shall say that in is
a countably additive measure if it is a nonnegative extended real-valued
function whose domain of definition is a Gr-algebra Tr, of sets (of real
numbers) and we have m(U En) = EmEn for each sequence (Es) of
disjoint sets in arc. Our goal in the next two sections will be the
construction of a countably additive measure which is translation
invariant and has the property that m/ = /(/) for each interval I.
Problems
Let m be a countably additive measure defined for all sets in a o--algebra
t.
1. If A and B are two sets in art with A c B, then mA < mB. This
property is called monotonicity.
2. Let (En) be any sequence of sets in DM Then ma j En ) < EmEn .
[Hint: Use Proposition 1.2.] This property of a measure is called countable
subadditivity.
3. If there is a set A
in OTC such that mA < co, then m0 = O.
4. Let nE be co for an infinite set E and be equal to the number of
elements in E for a finite set. Show that n is a countably additive set function which is translation invariant and defined for all sets of real numbers.
This measure is called the counting measure.
If we assume the continuum hypothesis (that every noncountable set of real numbers
can be put in one-to-one correspondence with the set of all real numbers), then such a
measure is impossible.
2 Weakening property (i) is not the only approach; it is also possible to replace
property (iii) of countable additivity by the weaker property of finite additivity: for
each finite sequence (En) of disjoint sets we have m(U En) = mEn. (see Problem 10.21). Another possible alternative to property (iii) is countable subadditivity,
which is satisfied by the outer measure we construct in the next section (see
Problem 2).
1
2
Outer Measure
For each set A of real numbers consider the countable collections
{/n) of open intervals which cover A, that is, collections for which
A c U In, and for each such collection consider the sum of the
lengths of the intervals in the collection. Since the lengths are
positive numbers, this sum is uniquely defined independently of the
order of the terms. We define the outer measure 3 m*A of A to be
the infimum of all such sums. In an abbreviated notation
in* A =
inf
ACU.T.
E 1(I„).
It follows immediately from the definition of m* that m*0 ----- 0
and that if A c B, then m*A < m* B. Also each set consisting of a
single point has outer measure zero. We establish two propositions
concerning outer measure:
1. Proposition: The outer measure of an interval is its length.
Proof: We begin with the case in which we have a closed finite
interval, say [a, b]. Since the open interval (a — e, b
e) contains
[a, b] for each positive e, we have mla, b] < 1(a — e, b
e) —
b — a + 2e. Since m*[a, b] < b — a + 2e for each positive e, we
must have m*[a, b] < b — a. Thus we have only to show that
m*[a, b] > b — a. But this is equivalent to showing that if ;In) is
any countable collection of open intervals covering [a, b], then
(1)
E 1(I„) > b — a.
By the Heine-Borel theorem, any collection of open intervals covering
[a, b] contains a finite subcollection which also covers [a, b], and
since the sum of the lengths of the finite subcollection is no greater
than the sum of the lengths of the original collection, it suffices to
prove the inequality (1) for finite collections {in} which cover [a, b].
Since a is contained in U /„, there must be one of the /„'s which
contains a. Let this be the interval (a1, b 1). We have al < a < b1.
If b 1 < b, then b 1 e [a, b], and since b 1 e (a1 , b 1), there must be an
3
In order to distinguish this outer measure from the more general outer measures to
be considered in Chapter 12, we call this outer measure Lebesgue outer measure, after
Henri Lebesgue. Since we consider no other outer measure in this chapter, we refer to
m* simply as outer measure.
Sec. 2]
Outer Measure
55
interval (a2, b 2) in the collection 141 such that b 1 c (a2, b 2); that is,
a 2 < b 1 < b 2 . Continuing in this fashion we obtain a sequence
(a1 , b 1), . . . , (ak,bk) from the collection { in} such that ai <
<
Since (/„) is a finite collection, our process must terminate with
some interval (ak,bk). But it terminates only if b (ak, bk), that is,
if ak <b < bk . Thus
E 1(1n ) ?_ E 1(a1, bi)
= (bk — ak)
...
(bk_i — ak_i)
(b1 — al)
= bk — (ak — bk—i)
— bk-2)
— . . . — (a 2 — b 1 ) — al > bk — al,
since ai < b1_ 1 . But b k > b and a l < a, and so we have bk — a 1 >
b — a, whence E/(/„) > (b — a). This shows that m*[a, b] b — a.
If l is any finite interval, then, given e > 0, there is a closed interval
J c I such that 1(J) > 1(1) — e. Hence
I(J) = m*J m*I _< m*1 = 1(1) = 1(1).
1(1) -
Thus for each e > 0,
— E < in* /
1(/),
and so m*/ = 1(4
If I is an infinite interval, then given any real number A, there is
a closed interval Jc I with I(J) = A. Hence m*/ > m*.1 = kJ) = A.
Since m*/ > A for each A, mI = cc = 1(1).1
2. Proposition: Let {A n) be a countable collection of sets of real
numbers. Then
ne(U A n) < E m*A„.
Proof: If one of the sets A n has infinite outer measure, the inequality holds trivially. If m*A is finite, then, given E > 0, there is a
countable collection {I ,, } 1 of open intervals such that A n, CU
and
U
E
,
<
m*A r,
2—ne. Now the collection fk/ii 1 ----
i is countable, being the union of a countable number of
56
Lebesgue Measure
[Ch. 3
countable collections, and covers U A. Thus
m*(U
E ki„, z)
n,i
EE
E m*An
n
< E (m*A i,
e2 —n)
E•
Since e was an arbitrary positive number,
ne(U
<
E m *An.
3. Corollary: If A is countable, m*A = O.
4. Corollary: The set [0, 1] is not countable.
5. Proposition: Given any set A and any e > 0, there is an open set
0 such that A c 0 and m*0 < m*A e. There is a G e 9,5 such
that A c G and m*A = m*G.
Problems
5. Let A be the set of rational numbers between 0 and 1, and let {4, } be
a finite collection of open intervals covering A. Then E/(4) > L
6. Prove Proposition 5.
7. Prove that m* is translation invariant.
8. Prove that if m* A = 0, then m*(A U B) = m*B.
3
Measurable Sets and Lebesgue Measure
While outer measure has the advantage that it is defined for all sets,
it is not countably additive. It becomes countably additive, however,
if we suitably reduce the family of sets on which it is defined. Perhaps
the best way of doing this is to use the following definition due to
Carathéodory:
Definition: A set E is said to be measurable' if for each set A we
have m*A = m*(A n E) m*(A n t).
Since we always have m*A < m*(A n E) m*(A n P), we see
that E is measurable if (and only if) for each A we have m*A >
4
In the present case m* is Lebesgue outer measure, and we say E is Lebesgue measurable. More general notions of measurable set are considered in Chapters 11 and 12.
Sec. 3]
Measurable Sets and Lebesgue Measure
57
m* (A n E) ± m* (A n E). Since the definition of measurability is
symmetric in E and E, we have E measurable whenever E is. Clearly
0 and the set R of all real numbers are measurable.
6. Lemma: If m*E = 0, then E is measurable.
Proof: Let A be any set. Then A nEc E, and so m*(A n E) <
m*E = O. Also ADA n E, and so
m*A > m* (A n É) = m* (A n L') + m* (A n E),
and therefore E is measurable. 1
7. Lemma: If E 1 and E, are measurable, so is El U E2 •
Proof: Let A be any set. Since E2 is measurable, we have
m*(A n E1 ) = m* (A n El n E2) + m * (A n El n E2),
and since A n (E, u E2) = [A n Ed u [A n E2 n Ed, we have
m* (A n [E, u Ed) < m* (A n E1 ) ± m* (A n E2 n El).
Thus
m*(A n [E1 u E2]) + m*(A n E1 n E2) _ rn*(A n E1)
+ m*(A n E2 n E1 ) + m*(A n E1 n E2)
= m*(A n El) + m*(A n El ) = m*A,
by the measurability of E1 . Since —(E1 u E2) —
that El u E2 is measurable. I
El n --E2, this shows
8. Corollary: The family on of measurable sets is an algebra of
sets.
9. Lemma: Let A be any set, and E1 ,. . . , E,,, a finite sequence of
disjoint measurable sets. Then
n
144 n [ 0 E9]) =
9-1
E
m*(4 n E,).
i- 1
Proof: We prove the lemma by induction on n. It is clearly true
for n = 1, and we assume it is true if we have n — 1 sets E,. Since
58
Lebesgue Measure
(Ch. 3
the Ei are disjoint sets, we have
An[u E i] nEn = AnEn
and
I n-1
Ed•
An[C1
EdnÉn = n[U
i-1
i-1
Hence the measurability of En implies
m* (A n[t) EiD = m*(4 n En) + m* n
J=1
[D'
i=1
Eil)
n-1
= m*(A n En)
E
m*(A n Ei)
by our assumption of the lemma for n — 1 sets. 1
10. Theorem: The collection :Trt of measurable sets is a ci-algebra;
that is, the complement of a measurable set is measurable and the
union (and intersection) of a countable collection of measurable sets is
measurable. Moreover, every set with outer measure zero is measurable.
Proof: We have already observed that on is an algebra of sets,
and so we have only to prove that if a set E is the union of a countable
collection of measurable sets it is measurable. By Proposition 1.2
such an E must be the union of a sequence E'n) of pairwise disjoint
measurable sets. Let A be any set, and let Fn = U E. Then Fn is
measurable, and Pn 9 E. Hence
m *A
= m*(A n F,,) m*(A n Pn)
m*(A n E,,) m*(A n E).
By Lemma 9
m*
E
m*(A n E2).
nE
m*(A n
nF=
i-1
Thus
m*A >
E
m*
Sec. 3]
Measurable Sets and Lebesgue Measure
59
Since the left side of this inequality is independent of n, we have
m* A >
m* (A n E) m*(A n
L')
=
> m* (A n E)
m*(A n E")
by the countable subadditivity of m* . I
11. Lemma: The interval (a, 00 ) is measurable.
Proof: Let A be any set, A 1 = A n (a, GO), A2 = A n (— oc, a].
Then we must show m*A
m*A 2 < m* A. If m*A = oc, then
there is nothing to prove. If m*A < oc, then, given E > 0, there is a
countable collection {/n} of open intervals which cover A and for
which
E 44) _< m*,4 + e.
Let I, = In (a, 00 ) and
intervals (or empty) and
/(4,) = /(K)
=
I
n (— 00, a]. Then I and If are
/(4,") = m*/',
m*C.
Since A i c U K, we have
m*A
and, since A 2 C
U
< m*(U
<
E m*A,
<
E
, we have
m*A 2 < m*(U
rec.
Thus
m*Ai + m*A2 < E (m*/:, + m*K)
< E kin) < m* A ±
E.
But € was an arbitrary positive number, and so we must have m*A i +
m*A 2 < m*A.I
12. Theorem: Every Borel set is measurable. In particular each
open set and each closed set is measurable.
60
Lebesgue Measure
[Ch. 3
Proof: Since the collection OTC of measurable sets is a a-algebra,
oo).
we have (-00, a] measurable for each a since (- 00, a] =
Since
(— 00, b)
U (—
ci
, b — 1/n), we have (— 00, b) measur-
n=1
able. Hence each open interval (a, b) =-- (— cc, b) n (a, co) is measurable. But each open set is the union of a countable number of open
intervals and so must be measurable. Thus arc is a a--algebra containing the open sets and must therefore contain the family 63 of
Borel sets, since 63 is the smallest a--algebra containing the open
sets. [Note: The theorem also follows immediately from the fact
that oir is a a--algebra containing each interval of the form (a, co)
and the fact that cQ, is the smallest a--algebra containing all such
intervals.] I
If E is a measurable set, we define the Lebesgue measure mE to
be the outer measure of E. Thus m is the set function obtained by
restricting the set function m* to the family orc of measurable sets.
Two important properties of Lebesgue measure are summarized by
the following propositions:
13. Proposition: Let (E1) be a sequence of measurable sets. Then
m(U Ei)
E
If the sets En are pairwise disjoint, then
m(U Ei) =
E mEi.
Proof: The inequality is simply a restatement of the subadditivity of m* given by Proposition 2. If (E1) is a finite sequence of
disjoint measurable sets, then Lemma 9 with A = R implies that
m ( U Ei) =
E mEi,
and so m is finitely additive. Let (E1) be an infinite sequence of
pairwise disjoint measurable sets. Then
EiD
U
Sec. 3]
Measurable Sets and Lebesgue Measure
61
and so
Ei)
E)M
=
E
mEi.
Since the left side of this inequality is independent of n, we have
CO
00
in(
U
Ei) ?_
E
mEi.
The reverse inequality follows from countable subadditivity, and
we have
m
(
mE t. I
Û Ei) =
14. Proposition: Let (E„) be an infinite decreasing sequence of
measurable sets, that is, a sequence with En+1 C En for each n. Let mE i
be finite. Then
m(n
Ei) = lim mEn .
Proof: Let E = fl E and let Fi = E
,
E1+1. Then
OD
E = U Fi,
and the sets Fi are pairwise disjoint. Hence
m(E i E) =
E
t=1
mF1 =
E
But mEi = mE
m(Ei E), and mEi mEi ±i In(Ei Ei+i),
since E c E1 and Ei+i c Ei. Since mEi < mEi < 00 , we have
m(Ei E) = mEi — mE and m(Ei Ei+1) = mE, — mEi +i. Thus
mE l — mE = E (mE i — mEi+i)
1
= lirn E (mE i — mEi+i)
1=1
= lim (mEi — mE n )
n —>oo
= mEi — bin mEn.
n —) co
62
Lebesgue Measure
[Ch. 3
Since mE i < oc, we have
mE = lim mEn. I
n —roc,
The following proposition expresses a number of ways in which a
measurable set is very nearly a nice set. The proof is left to the
reader (Problem 13).
15. Proposition: Let E be a given set. Then the following five
statements are equivalent:
i.
ii.
iii.
iv.
v.
E is measurable;
given e > 0, there is an open set 0 D E with m*(0 — E) < e;
given e > 0, there is a closed set F c E with m*(E — F) < c;
there is a G in 93 with E c G, m*(G — E) = 0;
there is an F in g, with F c E, m* (E — F) = 0;
If m*E is finite, the above statements are equivalent to:
vi. given e > 0, there is a finite union U of open intervals such that
m*(U P E) <E.
Problems
9. Show that if E is a measurable set, then each translate E + y of E is
also measurable.
10. Show that if E l and E2 are measurable, then m(E i U E2) +
m(Ei n E2) = mEl + mE2.
11. Show that the condition mE i < Go is necessary in Proposition 14
by giving a decreasing sequence (En) of measurable sets with 0 = fl Ea
and mEn, = oo for each n.
12. Let (E,) be a sequence of disjoint measurable sets and A any set.
.
co
Then m*(A n U Et) = E m*(A n E1).
13. Prove Proposition 15. [Hints:
a. Show that for m* E < 00, (i) = (ii) .4= (vi) (cf. Proposition 5).
b. Use (a) to show that for arbitrary sets E, (i) = (ii) = (iv) == (i).
c. Use (b) to show that (i) = (iii) = (y)
(i).]
Sec. 4]
A Nonmeasurable Set
63
14. a. Show that the Cantor ternary set (Problem 2.36) has measure
zero.
b. Let F be a subset of [0, 1] constructed in the same manner as the
Cantor ternary set except that each of the intervals removed at the nth
step has length a3 with 0 < a < 1. Then F is a closed set, /4 dense in
[0, 1] and mF =-- 1 — a. Such a set F is called a generalized Cantor set.
*4
A Nonmeasurable Set
We are going to show the existence of a nonmeasurable set. If x and
y are real numbers in [0, 1), we define the sum modulo 1 of x and
y to be x + y, if x + y < 1, and to be x ± y — 1 if x ± y > 1.
Let us denote the sum modulo 1 of x and y by x -V y. Then -V is a
commutative and associative operation taking pairs of numbers in
[0, 1) into numbers in [0, 1). If we assign to each x e [0, 1) the angle
27rx, then addition modulo 1 corresponds to the addition of angles.
If E is a subset of [0, 1), we define the translate modulo 1 of E to
be the set E + y = {z: z = x _p y for some x e E} . If we consider
addition modulo 1 as addition of angles, translation modulo 1
by y corresponds to rotation through an angle of 27ry. The following
lemma shows that Lebesgue measure is invariant under translation
modulo 1.
16. Lemma: Let E c [0, 1) be a measurable set. Then for each
y c [0, 1) the set E + y is measurable and m(E + y) = mE.
y) and E2 = E n [1 y, 1). Then
are disjoint measurable sets whose union is E, and so
Proof: Let E1 = E n [0, 1
E1 and
E2
—
—
mE = mEi ± mE2.
Now E1 4- y — Ei + y, and so E1 _p y is measurable and we have
m(Ei 4- y) = mEi , since m is translation invariant. Also E2 4- y =
E2 ± (y — 1), and so E2 -p y is measurable and m(E2 4- y) = rnEz.
But E 4- y = (E1 + y) u (E2 -p y) and the sets (E1 4- y) and
(E2 4- y) are disjoint measurable sets. Hence E 4- y is measurable
and
m(E + y) = m(Ei -V y) + m(E2 4- y)
= mE i + rnE2
= mE. I
64
Lebesgue Measure
[Ch. 3
We are now in a position to define a nonmeasurable set. If x — y
is a rational number, we say that x and y are equivalent and write
x — y. This is an equivalence relation and hence partitions [0, 1) into
equivalence classes, that is, classes such that any two elements of one
class differ by a rational number, while any two elements of different
classes differ by an irrational number. By the axiom of choice there
is a set P which contains exactly one element from each equivalence
class. Let (r i) 0 be an enumeration of the rational numbers in [0, 1)
with ro = 0, and define Pi = P ri. Then Po = P. LetxgP1 n P3 .
Then x = pi + ri = pi + r, with pi and p, belonging to P. But
pi— p, = r5— ri is a rational number, whence p i p,. Since P has
only one element from each equivalence class, we must have i = j.
This implies that if i j, P, n P, = 0, that is, that (Pi) is a pairwise disjoint sequence of sets. On the other hand, each real number x
in [0, 1) is in some equivalence class and so is equivalent to an
element in P. But if x differs from an element in P by the rational
number ri, then X E P,. Thus U P = [0, 1). Since each P, is a translation modulo 1 of P, each Pi will be measurable if P is and will
have the same measure. But if this were the case,
m[0, 1) = E mPi = E mP,
and the right side is either zero or infinite, depending on whether
mP is zero or positive. But this is impossible since m[0, 1) = 1, and
consequently P cannot be measurable.
While the above proof that P is not measurable is a proof by
contradiction, it should be noted that (until the last sentence) we
have made no use of properties of Lebesgue measure other than
translation invariance and countable additivity. Hence the foregoing
argument gives a direct proof of the following theorem:
17. Theorem: If m is a countably additive, translation invariant
measure defined on a a-algebra containing the set P, then m[0, 1) is
either zero or infinite.
The nonmeasurability of P with respect to any translation invariant
countably additive measure m for which m[0, 1) is 1 follows by
contraposition.
Sec.
5]
Measurable Functions
65
Problems
15. Show that if E is measurable and E C P, then mE = O. [Hint:
Let E, = E ri. Then (Ei) is a disjoint sequence of measurable sets and
mE, = mE. Thus EmE, = mU Ez < m[0, 1 ].]
16. Show that, if A is any set with m* A > 0, then there is a nonmeasurable set E C A. [Hint: If A C (0, 1), let E, = A n P. The measurability of E. implies mE, = 0, while Em*E, > m* A > O.]
17. a. Give an example where (E,) is a disjoint sequence of sets and
4-
In* (1.i Ez) < Em* E,.
b. Give an example of a sequence of sets (E2) with E, D E2+ 1,
m*Ei < o, and m*(n E,) < limm* E,.
5
Measurable Functions
all sets are measurable, it is of great importance to know
that sets which arise naturally in certain constructions are measurable.
If we start with a function f the most important sets which arise from
it are those listed in the following proposition:
Since not
18. Proposition: Let f be an extended real-valued function whose
domain is measurable. Then the following statements are equivalent:
i.
ii.
iii.
iv.
For each real number a the set {x: f(x) > a} is measurable.
For each real number a the set {x: f(x) > a} is measurable.
For each real number a the set {x: f(x) < al is measurable.
For each real number a the set {x: f(x) < a} is measurable.
These statements imply
v. For each extended real number a the set {x: f(x) = a} is
measurable.
Proof: Let the domain of f be D. We have (i) = (iv), since
{x: f(x) > a} and the difference of two
{x: f(x) < a) = D
measurable sets is measurable. Similarly, (iv) = (i) and (ii) <. (iii).
—
Now (i) = (ii), since {x: f(x) > a} =
n
{x: f(x) >
a
— 1/n},
n=1
and the intersection of a sequence of measurable sets is measurable.
Similarly, (ii) = (i), since {x: f(x) > a) = U roc: f (x) >
n=1
a
+ 1/n},
66
Lebesgue Measure [Ch. 3
and the union of a sequence of measurable sets is measurable. This
shows that the first four statements are equivalent. If a is a real
number, {x: f(x) = = {x: f(x) >
n {x: f(x) < a}, and so
(ii) and (iv) = (v) for a real. Since
{x: f(x) = 001 =
{x: f (x) ?_ ,
(ii) = (v) for a = oo . Similarly, (iv)
have (ii) & (iv) == (v). 1
(v) for a = — co , and we
Definition: An extended real-valuedfunctionf is said to be (Lebesgue)
measurable if its domain is measurable and if it satisfies one of the
first four statements of Proposition 18.
Thus if we restrict ourselves to measurable functions, the most
important sets connected with them are measurable. It should be
noted that a continuous function (with a measurable domain) is
measurable, and of course each step function is measurable. If f is a
measurable function and E is a measurable subset of the domain of
f, then the function obtained by restricting f to E is also measurable.
The following proposition tells us that certain operations performed
on measurable functions lead again to measurable functions:
19. Proposition: Let c be a constant and f and g two measurable
real-valued functions defined on the saine domain. Then the functions
f c, cf, f + g, g — f, andfg are also measurable.
Proof:
We shall use condition (iii) of Proposition 18. Then
{x: f(x)
c<
a) = fiX :
f(x) <
a—
CI
,
and so f c is measurable when f is. A similar argument shows cf
to be measurable.
If f(x) g(x) < a, then f(x) < a — g(x) and by the corollary
to the axiom of Archimedes there is a rational number r such that
f(x) <r <a— g(x).
Hence
{x: f(x) g(x) <
= U (fx: f(x) < r) n
g(x) <a — 14
Since the rationals are countable, this set is measurable and so f + g
Sec. 5]
67
Measurable Functions
is measurable. Since —g = (-1)g is measurable when g is, we have
f — g measurable.
The function f 2 is measurable, since
=
{x: f 2 (x) >
for
a
f(x) > -\/«1 u fx:f(x) < NAT,}
> 0 and
f 2 (x) > = D
if a
<
O, where D is the domain of f Thus
fg H(}
-
f2
g2]
is measurable. I
We will often want to use Proposition 19 for extended real-valued
functionsf and g. Unfortunately,f g is not defined at points where
it is of the form Go — oo . However, fg is always measurable and
f + g is measurable if we always take the same value for f + g at
points where it is undefined. Also, f + g is measurable no matter
what values we take at the points where it is not defined, provided
these points are a set of measure zero. See Problem 22.
20. Theorem: Let (fn) be a sequence of measurable functions (with
the same domain of definition). Then the functions sup {fi ,
inf suPfn, inffn, limf„, and Rini`, are all measurable.
Proof: If h is defined by h(x) = sup {f i (x),
h(x) >
= U
,fn(x)}, then
fi(x) > «}. Hence the measurability of the
f implies that of h. Similarly, if g is defined by g(x) = sup fn(x), then
g(x) > a}
Ufn,(x) > al, and so g is measurable. A
n=-1
similar argument establishes the corresponding statements for inf.
Since limfn = inf sup fk , we have limfn measurable, and similarly
n k>n
for lim fn. I
A property is said to hold almost everywhere 5 (abbreviated a.e.) if
the set of points where it fails to hold is a set of measure zero. Thus
in particular we say that f = g a.e. if f and g have the same domain
5
French:
presque partout (P•13.).
68
Lebesgue Measure
[Ch. 3
and rn {x: f(x) g(x)} = O. Similarly, we say that ft, converges to g
almost everywhere if there is a set E of measure zero such that f(x)
converges to g(x) for each x not in E. One consequence of equality
a.e. is the following:
21. Proposition: Iff is a measurable function and f — g a.e., then
g is measurable.
Proof: Let E be the set {x: f(x) g(x)} . By hypothesis mE = O.
Now
{x: g(x) > a} = {x: f(x) > a} u {x c E: g(x) > al
— {x c E: g(x) .- a} .
The first set on the right is measurable, since f is a measurable function. The last two sets on the right are measurable since they are
subsets of E and mE = O. Thus {x: g(x) > a} is measurable for
each a, and so g is measurable. 1
The following proposition tells us that a measurable function is
"almost" a continuous function. The proof is left to the reader
(cf. Problem 23).
22. Proposition: Let f be a measurable function defined on an
interval [a, b], and assume that f takes the values ±oo only on a set of
measure zero. Then given E > 0, we can find a step function g and a
continuous function h such that
( f — g- <
E
and if — I?! < €
except on a set of measure less than E; i.e. m{x: If(x) — g(x) I > e} < €
and nifx: If(x) — h(x)! > el < E. If in addition m < f < M, then
we may choose the functions g and h so that m < g < M and m < h <
M.
If A is any set, we define the characteristic function Xa of the set A
to be the function given by
Sec. 5]
69
Measurable Functions
The function XA is measurable if and only if A is measurable. Thus
the existence of a nonmeasurable set implies the existence of a nonmeasurable function.
A real-valued function ço is called simple if it is measurable and
assumes only a finite number of values. If cp is simple and has the
values a l,
, an then cp
E otixA ., where A, = {x: cp(x) = a,}
The sum, product, and difference of two simple functions are simple.
Problems
18. Show that (v) does not imply (iv) in Proposition 18 by constructing
a function f such that {x: f(x) > 0} = E, a given nonmeasurable set, and
such that f assumes each value at most once.
19. Let D be a dense set of real numbers, that is, a set of real numbers
such that every interval contains an element of D. Let f be an extended
real-valued function on R such that fx: f(x) > a} is measurable for each
e D. Then f is measurable.
20. Show that the sum and product of two simple functions are simple.
Show that
XAnB XA ' XB
XAuB = XA
X; =
XB
XA • XB
1 — xA.
21. a. Let D and E be measurable sets and f a function with domain
Show thatf is measurable if and only if its restrictions to D and E
are measurable.
b. Let f be a function with measurable domain D. Show that f is
measurable if the function g defined by g(x) = f(x) for x c D and
g(x) = 0 for x o D is measurable.
22. a. Let f be an extended real-valued function with measurable
domain D, and let D 1 = {x: f(x) = 00}, D2 = fx: f(x) = col. Then
f is measurable if and only if D 1 and D2 are measurable and the restriction
off to D — (D1 U D2) is measurable.
b. Prove that the product of two measurable extended real-valued
functions is measurable.
c. Iff and g are measurable extended real-valued functions and a
a fixed number, then f + g is measurable if we define f + g to be a
cc.
whenever it is of the form Go — Go or —
D u E.
70
Lebesgue Measure
[Ch. 3
d. Let f and g be measurable extended real-valued functions which
are finite almost everywhere. Then f + g is measurable no matter how it is
defined at points where it has the form CXD - 00 .
23. Prove Proposition 22 by establishing the following lemmas:
a. Given a measurable function f on [a, b] which takes the values
œ only on a set of measure zero, and given E > 0, there is an M such
that VI < M except on a set of measure less than 6/3.
b. Let f be a measurable function on [a, b]. Given 6 > 0 and M,
there is a simple function so such that I f(x) — so(x)I < 6 except where
If(x)1 > M. If m < f < M, then we may take so so that m G so < M.
c. Given a simple function so on [a, b], there is a step function g on
[a, b] such that g(x) = sc(x) except on a set of measure less than 6/3.
[Hint: Use Proposition 15.] If m < so < M, then we can take g so that
m < g < M.
d. Given a step function g on [a, b], there is a continuous function
h such that g(x) = h(x) except on a set of measure less than E /3. If
m < g < M, then we may take h so that m < h < M.
24. Let f be measurable and B a Borel set. Then f —1 [B] is a measurable
set. [Hint: The class of sets for whichf-1 [E] is measurable is a (7-algebra.]
25. Show that iff is a meaurable real-valued function and g a continuous
function defined on (— ap , œ), then g of is measurable.
26. Borel measurability. A function f is said to be Borel measurable if
for each a the set {x: f(x) > a} is a Borel set. Verify that Propositions 18
and 19 and Theorem 20 remain valid if we replace "measurable set" by
"Borel set" and "(Lebesgue) measurable" by "Borel measurable." Every
Borel measurable function is Lebesgue measurable. If f is Borel measurable, and B is a Borel set, then f —1 [B] is a Borel set. If f and g are Borel
measurable, so is f o g. Iff is Borel measurable and g is Lebesgue measurable, then fo g is Lebesgue measurable.
27. How much of the preceding problem can be carried out if we
replace the class 03 of Borel sets by an arbitrary (7-algebra a of sets?
28. Let fi be the Cantor ternary function (cf. Problem 2.46), and
define f by f (x) = f i (x) + x.
a. Show that f is a homeomorphism of [0, 1] onto [0, 2].
b. Show that f maps the Cantor set onto a set F of measure 1.
c. Let g = f-1 . Show that there is a measurable set A such that
g-1 [A] is not measurable.
d. Give an example of a continuous function g and a measurable
function h such that h o g is not measurable. Compare with Problems 25
and 26.
e. Show that there is a measurable set which is not a Borel set.
71
Sec. 6] Littlewood's Three Principles
6 Littlewood's Three Principles
Speaking of the theory of functions of a real variable, J. E.
Littlewood says, 6 "The extent of knowledge required is nothing like
so great as is sometimes supposed. There are three principles, roughly
expressible in the following terms: Every (measurable) set is nearly a
finite union of intervals; every [measurable] function is nearly
continuous; every convergent sequence of [measurable] functions is
nearly uniformly convergent. Most of the results of [the theory] are
fairly intuitive applications of these ideas, and the student armed with
them should be equal to most occasions when real variable theory is
called for. If one of the principles would be the obvious means to
settle the problem if it were 'quite' true, it is natural to ask if the
'nearly' is near enough, and for a problem that is actually solvable
it generally is."
We have already met two of Littlewood's principles : Various forms
of the first principle are given by Proposition 15. One version of the
second principle is given by Proposition 22, another version by
Problem 31, and a third is given by Problems 4.15 and 6.14. The
following proposition gives one version of the third principle. A
slightly stronger form is given by Egoroff's theorem (Problem 30),
but you will generally find the weak form adequate.
23. Proposition: Let E be a measurable set of finite measure, and
(1,0 a sequence of measurable functions defined on E. Let f be a measurable real-valued function such that for each x in E we have fn(x) —3
f(x). Then, given e > 0 and 8 > 0, there is a measurable set A C E
with mA < 3 and an integer N such that for all x o A and all n > N,
if(x) — f(X)I < e•
Proof: Let
Gn =
fx 8 E: 1 fn(x) — f(x)1 ?- e } 9
and set
EN
= U Gn = {x e E: If(x) — f(x)I > e for some n _> NI .
n
=N
We have EN+1 c EN, and for each x e E there must be some EN to
which x does not belong, since fn (x) —> f(x). Thus EN --= 0, and
so, by Proposition 14, lim mEN = 0. Hence given 8 > 0, 311T so that
n
6
Lectures on the Theory of Functions, Oxford, 1944, p. 26.
72
MEN G
Lebesgue Measure
[Ch. 3
; that is,
m {x e E: I f(x) — f(x)1 > e for some n > N} < 5.
If we write A for this EN, then mA <ô and
= {x e E: I f„(x) — f(x)i < E for all n >
.I
If, as in the hypothesis of the proposition, we have f,„(x) f(x)
for each x, we say that the sequence (fn) converges pointwise to f on
E. If there is a subset B of E with mB = 0 such that fn fpointwise
on E B, we say that fn f a.e. on E. We have the following trivial
modification of the last proposition:
24. Proposition: Let E be a measurable set of finite measure, and
(f„) a sequence of measurable functions which converge to a real-valued
function f a.e. on E. Then, given E > 0 and 5 > 0, there is a set A c E
with mA < 5, and an N such that for all x 0 A and all n > N,
f n(x) f(4 < E.
Problems
29. Give an example to show that we must require mE < co in Proposition 23.
30. Prove Egoroff's Theorem: If (A) is a sequence of measurable functions which converge to a real-valued function f a.e. on a measurable set
E of finite measure, then, given 7? > 0, there is a subset A c E with
mA < n such that fn converges to f uniformly on E — A. [Hint: Apply
Proposition 24 repeatedly with En = 1/n and 6„ = 2—nn.]
31. Prove Lusin's Theorem: Let f be a measurable real-valued function
on an interval [a, b]. Then given 0 > 0, there is a continuous function
ço on [a, b] such that mIx: f(x) X q,(x)} < 6. Can you do the same on the
interval (— Go , Go)? [Hint: Use Egoroff's theorem, Propositions 15 and
22, and Problem 2.39. 1
32. Show that Proposition 23 need not be true if the integer variable
n is replaced by a real variable t; that is, construct a family (ft ) of
measurable real-valued functions on [0, 1] such that for each x we have
limft(x) = 0, but for some 0> Owe have m* Ix: ft (x) > T1,-; > S. Hint:
Let P, be the sets in Section 4. For 2—' 1 < t < 2' define ft by
if x P, and x = 2i+1 t — 1
ft(x) = {1
0
otherwise.
4
1
The Lebesgue Integral
The Riemann Integral
We recall a few definitions pertaining to the Riemann integral. Let
f be a bounded real-valued function defined on the interval [a, b]
and let
a = Eo < El < • • • < En = b
be a subdivision of [a, M. Then for each subdivision we can define
the sums
E—
S=
and
s
=E
(E, — Ez_oniz,
1
where
M, =
sup
m, = ml
f(x),
f(x).
We then define the upper Riemann integral of f by
b
Rif(x) dx =
a
inf S
with the infimum taken over all possible subdivisions of [a, b].
Similarly, we define the lower integral
Ri
f(x) dx = sup s.
73
74
The Lebesgue Integral
[Ch. 4
The upper integral is always at least as large as the lower integral,
and if the two are equal we say that f is Riemann integrable and call
this common value the Riemann integral of f. We shall denote it by
R ib
f(x) dx
to distinguish it from the Lebesgue integral, which we shall consider
later.
By a step function we mean a function 1,/, which has the form
14x)
<x<
for some subdivision of [a, b] and some set of constants ci. Under
practically anybody's definition of an integral we have
fa
0(x) dx = E ci(EJ —
i==i
With this in mind we see that
R f(x) dx = inf
dx
for all step functions ik(x) > f(x). Similarly,
Ria f(x) dx = sup fa yo(x) dx
for all step functions
f(x).
yo(x)
Problem
1. a.
Show that if
f(x) =
then
R
a
f(x) dx = b —
a
0 x irrational
1 x rational,
and
R f(x) dx = O.
-a
b. Construct a sequence (fn ) of nonnegative, Riemann integrable
functions such that fen, increases monotonically to f. What does this imply
about changing the order of integration and the limiting process?
Sec. 2]
The Lebesgue Integral of a Bounded Function
75
2 The Lebesgue Integral of a Bounded Function
over a Set of Finite Measure
The problem in the preceding section shows some of the shortcomings of the Riemann integral. In particular, we would like a function which is 1 on a measurable set and zero elsewhere to be integrable
and have as its integral the measure of the set.
The function XE defined by
xE(x) =
1
Xe E
xzE
1°
is called the characteristic function of E. A linear combination
ç(x) =
E
a iX E i(x)
i=1
is called a simple function if the sets Ei are measurable. This representation for yo is not unique. However, we note that a function yo is
simple if and only if it is measurable and assumes only a finite number
of values. If yo is a simple function and fa l , . , an} the set of nonzero values of (p, then
= E ai X A i,
where A i = fx: ç(x) = aj. This representation for ço is called the
canonical representation and it is characterized by the fact that the
A i are disjoint and the ai distinct and nonzero.
If yo vanishes outside a set of finite measure, we define the integral
of go by
E
f (p(x)dx =
i—t
aimA i
when yn has the canonical representation yo = E (IA A ,. We somet=i
times abbreviate the expression for this integral to f (p. If E is any
measurable set, we define
f
q)•XE.
It is often convenient to use representations which are not canonical,
and the following lemma is useful:
76
The Lebesgue Integral
1. Lemma: Let („0 =
E aiX E,,
[Ch. 4
with Ei n E, = 0 for i j.
Suppose each set Ei is a measurable set offinite measure. Then
f
=E
aimEi.
Proof: The set Aa = {x: yo(x) =
E
U Ei. Hence a mE. =
a =a
aimEi by the
additivity of m, and so
ai,=a
E amA a
E aimEi. 1
f io(x) dx
2. Proposition: Let (p and 1p be simple functions which vanish outside a set of finite measure. Then
f (ciço +
and, if ço >
= afço +bflp,
a.e., then
f
Proof: Let {A i} and {B,} be the sets which occur in the canonical
representations of go and %G. Let A 0 and Bo be the sets where go and 1'
are zero. Then the sets Ek obtained by taking all the intersections
n B 5 form a finite disjoint collection of measurable sets, and we
may write
E ak x„,
=
E
bkXE0
k=1
and so
acp
E (aak bbOXEk,
=
whence f(cup +
= a I (to
b fl,t follows from Lemma 1. To
prove the second statement, we note that
-
fço -fip = f - o
0,
since the integral of a simple function which is greater than or equal
to zero a.e. is nonnegative by the definition of the integral. I
77
Sec. 2 1 The Lebesgue Integral of a Bounded Function
It follows from this proposition that if (p =
E aiX E,, then f so =
EaimE„ and so the restriction of Lemma 1 that the sets Ei be disjoint
is unnecessary.
Let f be a bounded real-valued function and E a measurable set
of finite measure. By analogy with the Riemann integral we consider
for simple functions so and ip the numbers
inff
f
sup
fE yo,
E
and
v<1*
and ask when these two numbers are equal. The answer is given by
the following proposition:
3.
Proposition: Let f be defined and bounded on a measurable set
E with mE finite. In order that
inf
f
ip(x) dx = sup f (p(x) dx
E
f<IP E
for all simple functions go and
measurable.
it is necessary and sufficient that f be
Proof: Let f be bounded by M and suppose that f is measurable.
Then the sets
kM
Ek
{ X: —
n
f (x) >
(k — 1)M1
—n < k < n,
are measurable, disjoint, and have union E. Thus
E
—n
mEk
=
mE.
The simple functions defined by
m n
Itin(X) =
n
E
kX E,(x)
k=—n
and
gon (x) =
n
(k — 1)X E,(x)
â–ª
78
The Lebesgue Integral [Ch. 4
satisfy
ion (x)
f(x) 5_ ik„(x).
Thus
m
inf f ,P(x) dx
f O n (x) dx =--
sup f (p(x) dx
f ço„(x) dx =
n
E_n krnEk
11 k=
and
M
n
E
—
k =„
(k — 1)inEk,
whence
0 < inftf
p(x)dx
— sup f so(x) dx <
. .E
M mE.
mE k = —
n
Since n is arbitrary we have
inf fE Ip(x) dx
—
sup IE io(x) dx = 0,
and the condition is sufficient.
Suppose now that
inff E
f 11,(x) dx
sup
f ço (x) dx.
<f E
Then given n there are simple functions (pn, and o n such that
,p(x)
f(x)
and
f
n(x)
dx f
n(x)
dx
Then the functions
inf tp„
and
ço* -,--- sup ion
are measurable by Theorem 3.20, and
ço*(x)
II
< I •
f(x) 5_ ip*(x).
Sec. 2]
79
The Lebesgue Integral of a Bounded Function
Now the set
= {x: ic,* (x) < 0*(x)}
A
is the union of the sets
A,
=
if,*(x) < e(x) — -1 1 •
(x) —
But each A, is contained in the set {x: o „(x) < p n
1/0, and
this latter set has measure less than vin. Since n is arbitrary, mA, = 0,
and so mA = O. Thus 4:7*= 4,* except on a set of measure zero,
and io* = f except on a set of measure zero. Thus f is measurable by
Proposition 3.21, and the condition is also necessary. I
Definition: If f is a bounded measurable function defined on a
measurable set E with mE finite, we define the (Lebesgue) integral of
f over E by
f f(x) dx = inf fe It(x) dx
for all simple )(Unctions
> f.
L f. If E = [a, b], we write
fa f instead of f f. If f is a bounded measurable function which
vanishes outside a set E of finite measure, we write f f for L
We sometimes write the integral as
(a,b]
Note that f = ff- XE. The following corollary to Proposition 3
shows that the Lebesgue integral is in fact a generalization of the
Riemann integral.
4. Proposition: Let f be a bounded function defined on [a, b]. If
f is Riemann integrable on [a, b], then it is measurable and
f(x) dx =
fab
Proof: Since every step function is also a simple function we have
-
R fa f(x) dx <_ sup
g, <I
a
(p(x) dx
b
inf
,P7f
a
ik(x) dx R f af(x) dx.
Since f is Riemann integrable, the inequalities are all equalities, and
f is measurable by Proposition 3. I
The lebesgue Integral [Ch. 4
80
5. Proposition: If f and g are bounded measurable functions
defined on a set E of finite measure, then:
bg) =af f b fE g.
JE W
ii. If f = g a.e., then
JE f = JE g.
iff < g a.e., then
./E f JE g.
Hence iffi
iv. If A _< f(x) < B, then
AmE < fE f < BmE.
v. If A and B are disjoint measurable sets of finite measure, then
fAUB f fA f+ JB
Proof: If It is a simple function so is a, and conversely (if a
Hence for a > 0,
fE af = inf
f
(10 = a inf
g
f
1,G =
0).
afo f.
E
If a < O,
fE af =
act) = a sup f = a inf f = af f,
so<f g
using Proposition 3.
If 0 1 is a simple function not less than f and 02 a simple function
not less than g, then ;G I + 02 is a simple function not less than f + g.
Hence
fE f + g JE(' + o2) = fE ih + JE 2.
Since the infimum of the right side is 5f ± fg, we have
fE f + g JE f + JE g.
Sec. 2]
The Lebesgue Integral of a Bounded Function
On the other hand, (p i < fand 2 Lç g imply (p i
tion not greater than f + g. Hence
fE f + g ?_ fE opi +
2) =
81
cp 2
is a simple func-
fE çoi + fE ço,
Since now the supremum of the right side is 5f + fg, we have
fEf+g>- fEf+ fEg,
and statement (i) of the proposition follows.
To prove (ii) it now suffices to show that
JE f — g = O.
Since f — g = 0 a.e., it follows that if 1,G
this it follows that
JE
>f—
g, ik > 0 a.e. From
o,
whence
Similarly,
whence (ii). This proof also serves to establish (iii). Statement (iv)
follows from (iii) and the fact that
mE.
Statement (v) follows from (i) and the fact that X AuB = XA XB.
We next prove a proposition which will be used to prove Theorem
15. The proposition is a special case of Theorem 15.
6. Proposition (Bounded Convergence Theorem): Let (1,, be a
sequence of measurable functions defined on a set E of finite measure,
and suppose that there is a real number M such that If(x) < M for all
n and all x. If f(x) = lim fn (x) for each x in E, then
f lim
f
The Lebesgue Integral
82
[Ch. 4
Proof: The proof of this proposition furnishes a nice illustration
of the use of Littlewood's "three principles." The conclusion of the
proposition would be trivial if (f„) converged to f uniformly. Littlewood's third principle states that, if (f,) converges to f pointwise,
then (f,i ) is "nearly" uniformly convergent to f A precise version of
this principle is given by Proposition 3.23, which states that, given
e > 0, there is an N and a measurable set A C E with mA < €14M
such that for n > Nand x rE— A we have If(x) — f(x)I < e/2mE.
Thus
fE-A Ifn
<!+!=
2 2
fl
E.
Hence
fE
fE f
Problem
2. a. Let f be a bounded function on [a, b], and let h be the upper
fb
h. (If ço f is a step
envelope of f (cf. Problem 2.49). Then R f =
.1 a
Ja
function, then (p > h except at a finite number of points and so
step functions such that
f h < Rib f But there is a sequence (,„0„) of
h. By Proposition 6 we have fa h = lim fa (pn > Ra f.)
b. Use part (a) to prove Lebesgue's Theorem: A bounded function f
on [a, b] is Riemann integrable if and only if the set of points at which f is
discontinuous has measure zero.
SOn
3
The Integral of a Nonnegative Function
If f is a nortnegative measurable function defined on a measurable
set E, we define
JE f
sup
h<f
JE h,
where h is a bounded measurable function such that m {x:
is finite.
h(x)
0}
Sec. 3]
The Integral of a Nonnegative Function
83
7. Proposition: If f and g are nonnegative measurable functions,
then:
cf
iii. If f < g a.e.,
fE, f,
c> O.
fEf+ g= fEf+ fEg.
then
fE f JE g.
Proof: Parts (i) and (iii) follow directly from Proposition 5, and we
prove only (ii) in detail. If h(x) < f(x) and k(x) < g(x), we have
h(x) k(x) f(x) g(x), and so
fE, h fE k L f + g.
Taking suprema, we have
f
./E g ./E f
g'
On the other hand, let 1 be a bounded measurable function which
vanishes outside a set of finite measure and which is not greater than
f g. Then we define the functions h and k by setting
h(x) = min (f(x), 1(x))
and
k(x) = 1(x) — h(x).
We have
h(x) f(x)
and
k(x) < g(x),
while h and k are bounded by the bound for 1 and vanish where 1
vanishes. Hence f / =f h+j- k<ff±fg, and so
E
E
E
EE
fEf
fg
fE f g.
8. Theorem (Fatou's Lemma): If (fn) is a sequence of nonnegative
measurable functions and f(x) f(x) almost everywhere on a set E,
then
84
The Lebesgue Integral
[Ch. 4
Proof: Without loss of generality we may assume that the convergence is everywhere, since integrals over sets of measure zero are
zero. Let h be a bounded measurable function which is not greater
than f and which vanishes outside a set E' of finite measure. Define
a function h„ by setting
h„(x) = min {h(x), f n (x))-
Then hn is bounded by the bound for h and vanishes outside E'.
Now h„(x) h(x) for each x in E'. Thus by Proposition 6 we have
fB, h = fB, h = lim fE,
lim
fE fn.
Taking the supremum over h, we get
f
fE
fn
9. Monotone Convergence Theorem: Let (f„) be an increasing
sequence of nonnegative measurable functions, and let f = lim f„. Then
f lim f fn .
Proof: By Theorem
8 we have
ff
ffn.
But for each n we have f„ < f, and so if,
fiTia
<
ff. But this implies
ff
Hence
ff= 1imf f . I
10. Corollary: Let u n be a sequence of nonnegative measurable
00
functions, and let f =
E un. Then
n=1
ff =
f
u
,,.
11. Proposition: Let f be a nonnegative function and
sequence of measurable sets. Let E = U E. Then
fEf
fEf
a disjoint
Sec. 3]
The Integral of a Nonnegative Function
85
Proof: Let u, = f • Xi.. Then f • XE = Euo and so the proposition
follows from the preceding corollary. I
Definition: A nonnegative measurable function fis called integrable
over the measurable set E if
fEf<
12. Proposition: Let f and g be two nonnegative measurable
functions. If f is integrable over E and g(x) < f(x) on E, then g is also
integrable on E, and
lEf
g = hf - fEg.
Proof: By Proposition 7
fEf= fE(f - g) -F
Since the left side is finite, the terms on the right must also be finite, and
so g is integrable. 1
13. Proposition: Let f be a nonnegative function which is integrable
over a set E. Then given E > 0 there is a 6 > 0 such that for every set
A c E with mA < 6 we have
fAf <E.
Proof: The proposition would be trivial if f were bounded. Set
f7,(x) = f(x) if f(x) _< n and f „(x) = n otherwise. Then each f,
bounded and f„ converges to f at each point. By the monotone
convergence theorem there is an N such that fE, fN > fE f — E/2, and
fE f — f N < e/2.
Choose 3 < — . If mA < 6, we have
2N
fA f= fA (f — fN) + f fN
A
<
(f — fN) NmA
<
"
86
The Lebesgue Integral [Ch. 4
Problems
3. Let f be a nonnegative measurable function. Show that ff = 0
implies f = 0 a.e.
4. Let f be a nonnegative measurable function.
a. Show that there is an increasing sequence (çon ) of nonnegative
simple functions each of which vanishes outside a set of finite measure
such that f = lim
b. Show that ff = sup fço over all simple functions cp < f.
5. Let f be a nonnegative integrable function. Show that the function F
defined by
F(x) = f f
is continuous by using Theorem 9.
6. Let (fii ) be a sequence of nonnegative measurable functions which
converges to f, and suppose fn < f for each n. Then
f = 11m f fn .
7. a. Show that we may have strict inequality in Fatou's Lemma.
[Consider the sequence (fn ) defined by fn (x) = 1 if n<xGn-1- 1, with
f(x) = 0 otherwise.]
b. Show that the Monotone Convergence Theorem need not hold
for decreasing sequence of functions. [Let f(x) = 0 if x G n, fn(x) = 1
for x > n.]
8. Prove the following generalization of Fatou's Lemma: If (fn )
is a sequence of nonnegative functions, then
f lim fn lim f fn .
9. Let (fn) be a sequence of nonnegative functions on (— cc, co) such
that fn
a.e., and suppose that Lien —jf < co. Then for each measurable set E we have fE fn
fE f
4 The General Lebesgue Integral
By the positive part f+ of a function f we mean the function f+
f V 0; that is,
f+(x) — max {f(x), 0} .
Similarly, we define the negative part f — by f —
measurable, so are f+ and f — . We have
f = f+ — f-
( — f) V O. If f is
87
Sec. 4 1 The General Lebesgue Integral
and
f+ +f.
With these notions in mind we make the following definition:
Definition: A measurable function f is said to be integrable over
E if f+ and f — are both integrable over E. In this case we define
JE f
JE f
JE
14. Proposition: Let f and g be integrable over E. Then:
i. The function cf is integrable over E, and cf = c
JE
ii. The function f ± g is integrable over E, and
fEf+
g= lEf+
iii. 1ff 5_ g a.e., then fo f
le g.
iv. If A and B are disjoint measurable sets contained in E, then
fAuB
f = fA f + fB f
Proof: Part (i) follows directly from the definition of the integral
and Proposition 7. To prove part (ii), we first note that if fi and f2
are nonnegative integrable functions with f = fi — f 2, then
f+ + f2 = f- + fi. By Proposition 7,
ff+
+ f f2 = f
+ Jfi,
and so
ff= ff+-
Jr- =
ff2.
But, if f and g are integrable, so are f + g + and f — g— , and
( f + g) = ( f+
g+) — (f + g —). Hence
f +
f (f+ + g+) - f (f + g-)
= f+ +
=ff+
f g.
g+
- ff -
- g-
The Lebesgue Integral
88
[Ch. 4
Part (iii) follows from part (ii) and the fact that the integral of a
nonnegative integrable function is nonnegative. For (iv) we have
f
f XAu
fAuBf
= ffx A + ff x B
= fA f
fE
'
It should be noted that f + g is not defined at points where
co and g = — Go and where f= — Go and g = cc. However,
the set of such points must have measure zero, since f and g are
integrable. Hence the integrability and the value of f (f + g) is
independent of the choice of values in these ambiguous cases.
f
15. Lebesgue Convergence Theorem: Let g be integrable over E
and let (f i ) be a sequence of measurable functions such that I fI < g
on E and for almost all x in E we have f(x) = limf„(x). Then
fE f =lim fE ffl.
Proof: The function g — f i is nonnegative, and so by Theorem 8
f)
fE (g
fE (g
-
Since f < g, f is integrable, and we have
fE g fE
f
fE g
— firn
fE
whence
fE f lim f E fn.
Similarly, considering g + f„, we get
fE f
1i m f f„,
and the theorem follows. I
The above theorem requires that the sequence ( f,,) be dominated
by a fixed integrable function g. However, the proof does not require
quite so much. If we replace the appropriate g's in the above proof
Sec. 4]
The General Lebesgue Integral
89
by gn's, we get the following generalization of the Lebesgue Convergence Theorem :
16. Theorem: Let (g.) be a sequence of integrable functions which
converge a.e. to an Integrable function g. Let (J..) be a sequence of
measurable functions such that fn i < g. and (f0 converges to f a.e. If
fg = Jim fg.,
then
f f = lim f fn.
If f„) is a sequence of measurable functions which converge a.e.
to f, then Fatou's Lemma, the Monotone Convergence Theorem,
and the Lebesgue Convergence Theorem all state that under suitable
hypotheses we can assert something about ff in terms of ffn. Fatou's
Lemma has the weakest hypothesis: We need only have f„ bounded
below by zero (or more generally by an integrable function). Consequently, the conclusion of Fatou's Lemma is weaker than that of
the others : We can only assert ff < lim ffn. The Lebesgue Convergence Theorem requires that fn be bounded from above and below by
fixed integrable functions and then asserts the equality of ff and
lim ffn . The Monotone Convergence Theorem (as generalized in
Problem 6) is something of a hybrid: It requires that thef„ be bounded
from below by zero (or an integrable function) and above by the
limit function f itself. Of course, iff is integrable, this is a special case
of the Lebesgue Convergence Theorem, but the advantage of Fatou's
Lemma and the Monotone Convergence Theorem is that they are
applicable even if f is not integrable and are often a good way of
showing that f is integrable. Fatou's Lemma and the Monotone
Convergence Theorem are very close in the sense that each can be
derived from the other using only the fact that integration is positive
and linear.
Problems
10. a. Show that iff is integrable over E, then so is f and
fEf
Does the integrability of I f imply that off?
90
The Lebesgue Integral
[Ch. 4
b. The improper Riemann integral of a function may exist without
the function being integrable (in the sense of Lebesgue), e.g., if
f(x) = (sin x)/x on [0, oo). If f is integrable, show that the improper
Riemann integral is equal to the Lebesgue integral whenever the former
exists.
11. If 4, is a simple function, we have two definitions for f, that on
page 75 and that on page 87. Show that they are the same.
12. Let g be an integrable function on a set E and suppose (A) is a
sequence of measurable functions such that If(x)1 < g(x) a.e. on E.
Then
lim fn
fn
7
1-1 f 5_
E
JE
7
1-1 fn
JE
13. Let h be an integrable function and (fu ) a sequence of measurable
functions with fn > — h and limf„ = f. Show that f fn, and f f have a
meaning and that f f < lim f fn .
14. Let (fn ) be a sequence of integrable functions such thatfn --> f a.e.
with f integrable. Then fjf — fn i
0 if and only if fifn i
15. a. Let f be integrable over E. Then given
function ço such that
J
E
If
—
> 0,
there is a simple
Ç,91 <
[Apply Problem 4 to the positive and negative parts off.]
b. Under the same hypothesis there is a step function
J If — I <
E
,p such that
e.
[Combine part (a) with Proposition 3.22.]
c. Under the same hypothesis there is a continuous function g
vanishing outside a finite interval such that
fE
16. Establish the Riemann -Lebesgue Theorem: If f is an integrable
function on (— , co), then lim f f(x) cos nx dx = O. [Hint: The
theorem is easy if f is a step function. Use Problem 15.]
17. a. Let f be integrable over (—ce, co). Then
f f(x) dx = f(x t) dx.
Sec. 5]
91
Convergence in Measure
b. Let g be a bounded measurable function. Then
00
lim f Ig(x)[f(x) — f(x ± oil = OE
—0
[Hint: If f is continuous and vanishes outside a finite interval, this result
follows from the uniform continuity off. Apply Problem 15.]
18. Let f be a function of two variables (x, t) which is defined in the
square Q = {(x, t): 0 < x < 1, 0 < t < 1} and which is a measurable
function of x for each fixed value of t. Suppose that lim f(x, t) = f(x) and
t->o
that for all t we have f(x, t)1 < g(x), where g is an integrable function on
[0, 1]. Then
lirn 1f(x, t)dx = f f(x)dx.
t—o
[Problem 2.47(f) is useful here.] Show also that if the function f(x, t) is
continuous in t for each x, then
h(t) = f f(x, t) dx
is a continuous function of t.
19. Let f be a function defined and bounded in the square Q =
{(x, t): 0 < x < 1, 0 < t < 1 } , and suppose that for each fixed t the
function f is a measurable function of x. For each (x, t) e Q let the partial
af
derivative — exist. Suppose that
at
af
—
i s bounded in
at
1
d
Yt f f(x, t)dx —
o
*5
f 3-,1
Q. Then
1
o -t
dx •
Convergence in Measure
Suppose that ( f1) is a sequence of measurable functions such that
SW O. What can we say about the sequence ( fa)? Perhaps the
most important property of such a sequence is that for each positive
n the measure of the sets {x: 1 f(x) 1 > n} must tend to zero. This
leads us to the following definition:
---,
Definition: A sequence ( fa) of measurable functions is said to
converge to fin measure if, given 6 > 0, there is an N such that for all
n > N we have
mIx: I f(x) — f n (x)I > 61 <
E.
The Lebesgue Integral
92
[Ch. 4
An example of a sequence ( f, which converges to zero in measure
on [0, 1] but such that _icii (x)) does not converge for any x in [0, 1]
can be constructed as follows: Let n — k ± 2', 0 < k < 2', and set
fn(x) = 1 if x e [k2', (k + 1)2-1 and f(x) ---- 0 otherwise. Then
2
m{x: Ifn (x)I > €} <—'
n
and so fn —> 0 in measure, although for any x e [0, 1], the sequence
(fn (x)) has the value 1 for arbitrarily large values of n and so does
not converge. We do, however, have the following proposition:
17. Proposition: Let (fn ) be a sequence of measurable functions
which converges in measure to f. Then there is a subsequence ( f„k)
which converges to f almost everywhere.
Proof: Given p, there is an integer n, such that for all n > n,
we have
m{x: I fn(x) — f(x)I >: 2 -1 < 2'.
Let E, = {x: Ifn p (x) — f(x) > 2'1. Then if x e U E„ we have
v=k
1 fnp( X ) — f(X)i < 2' for i i _.> k, and so f„„(x) —4 f(X). Hence
.
,
I
for
any
=
Û
Ep.
But
mA
_<
.
A
x
e
m[
fn,„(x) -->f(x)
Û Ed <
ri
10=1 v=k
v=lc
co
E mE„ --= 2 —k+ 1 . Hence mA ---- 0.1
v -=-k
18. Proposition: Fatou's Lemma and the Monotone and Lebesgue
Convergence Theorems remain valid if 'convergence a.e.' is replaced by
'convergence in measure'.
Problems
20. Show that if (A) is a sequence which converges tofin measure, then
each subsequence (f„k) converges to f in measure.
21. Deduce Proposition 18 from Proposition 17 using Problems 20 and
2.11.
22. Prove that a sequence (A) of measurable functions converges to f in
measure if and only if every subsequence of (fri) has in turn a subsequence
which converges in measure to f
Sec. 5]
Convergence in Measure
93
23. Prove that a sequence (A) of measurable functions on a set E of
finite measure converges to f in measure if and only if every subsequence
of (frt) has in turn a subsequence which converges to f a.e. on E.
24. Use Proposition 13 to prove directly that, if fn f in measure and
if there is an integrable function g such that for all n we have Ifid < g,
then f
-A, O.
25. A sequence (fn ) of measurable functions is said to be a Cauchy
sequence in measure if given t > 0 there is an N such that for all m, n > N
we have
fm(X)1
61 < 6.
171{X: If(x)
Show that if ./.„) is a Cauchy sequence in measure, then there is a functionf
to which the sequence (A) converges in measure. [Choose nv±i > n, so
that m{x: I-np
f
fnu+,1 > 2-P} < 2 -w. Then the series E(f„, — fn ,)
converges almost everywhere to a function g. Let f = g + fni . Then
in measure, and one can show that consequently f, f in
measure.]
5
Differentiation and
Integration
In this chapter we shall consider the sense in which differentiation
is the inverse of integration. In particular we shall be concerned with
the following questions:
When does
lab f(x) dx = f(b) — f(a)?
When does
d fx
f(y ) dy
dx
= f(x)?
From the theory of Riemann integration it is known that the
second relation is true if x is a point of continuity of f. We shall
show that more generally this relation holds almost everywhere.
Thus differentiation is the inverse of Lebesgue integration. The first
question, however, is more difficult, and even using the Lebesgue
integral it is true only for a certain class of functions which we shall
characterize.
1
Differentiation of Monotone Functions
Let g be a collection of intervals. Then we say that g covers a set
E in the sense of Vita/i, if for each E > 0 and any x in E there is an
interval / e g such that x e / and 1(1) < E.
94
Sec. 11 Differentiation of Monotone Functions
95
1. Lemma (Vitali): Let E be a set of finite outer measure and g a
collection of intervals which cover E in the sense of Vitali. Then given
e > 0 there is a finite disjoint collection {II , . . . , IN} of intervals
in g such that
N
m.[E_
u
id < e.
n=1
Proof: It suffices to prove the lemma in the case that each interval
in .4 is closed, for otherwise we replace each interval by its closure
and observe that the set of endpoints of /1 , . . . , IN has measure zero.
Let 0 be an open set of finite measure containing E. Since g is a
Vitali covering of E, we may assume without loss of generality that
each 1 of g is contained in O. We choose a sequence (In ) of disjoint
intervals of .4 by induction as follows: Let 11 be any interval in g,
and suppose Ii , . . . , In have already been chosen. Let k n be the
supremum of the lengths of the intervals of .4 which do not meet
any of the intervals Ii, . . . , In . Since each I is contained in 0, we
n
have k n < m0 < c'a. Unless E C U 1.„ we can find /n+i in .4 with
/(4,+i) > Pn and /,,+i disjoint from Ii ,. . . , Ir,.
Thus we have a sequence JO of disjoint intervals of .4, and since
U in c 0, we have E/(/n) < m0 < Go . Hence we can find an integer
N such that
.
E 1(17,) < e .•
N+1
Let
R=E—
6
In.
n=1
The lemma will be established if we can show that neR < e. Let X
N
be an arbitrary point of R. Since U In is a closed set not containing
X,
n--.1
we can find an interval I in g which contains x and whose length
is so small that /does not meet any of the intervals I I , . . . , IN. If now
I n I, = 0 for i < n, we must have 41) < k„ < 2/(4,±1). Since
lim 1(In) = 0, the interval I must meet at least one of the intervals
In . Let n be the smallest integer such that I meets Ir,. We have n > N,
and 1(1) < kn_ i < 21(I„). Since x is in I, and /has a point in common
with In, it follows that the distance from x to the midpoint of In is at
96
Differentiation and Integration
[Ch. 5
most 1(1) ±
< WO. Thus x belongs to the interval J
the same midpoint as in, and five times the length. Thus we have
shown that
RC
N+1
Hence
m *R <
E kin) <
E 1(J,i) = 5 N+1
I
N+1
In order to talk about the derivatives of a function f, we first
define a set of four quantities called the derivates of f at x as follows:
.:.— f(x ± h) - f(x)
,
D I- f(x) = Jim
h
h-o+
7.-1-1- f(x) - f(x - h) ,
i.
D - f(x) = i1
h
h,o+
D ±f(x) -=
ihnf(x + h) - f(x)
h-0+
D_f(x) --=
h
,
f(x) - f(x - h)
h
h-40+
lim
Clearly we have D+f(x) D +f(x) and D -f(x) D_ f(x). If
(x) ±=
f(x) =D
D -f(x) = D_ f(x) oo , we say that f is
differentiable at x and define f' (x) to be the common value of the
derivates at x.
2. Theorem: Let f be an increasing real-valued function on the
interval [a, b]. Then fis differentiable almost everywhere. The derivative
f' is measurable, and
L b f' (x) dx f(b) - f(a).
Proof: Let us show that the sets where any two derivates are
unequal have measure zero. We consider only the set E where
D 4f(x) > D_ f(x), the sets arising from other combinations of
derivates being similarly handled. Now the set E is the union of the
sets
= {x: D ± f(x) > u > y > D_f(x)}
Sec. 1]
97
Differentiation of Monotone Functions
for all rational u and v. Hence it suffices to prove that m*E„,„ = O.
Let s = m*E„,, and, choosing e > 0, enclose E„,„ in an open set 0
with m0 < s
E. For each point x in E
there is an arbitrarily
small interval [x — h, x] contained in 0 such that
f(x) — f(x —
h) < vh.
By Lemma 1 we can choose a finite collection {I,. , IN} of them
whose interiors cover a subset A of E„,, of outer measure greater
than s — E. Then summing over these intervals we have
[f(x) - f(x. - 1 )1 <v
n=1
E
hn
n=1
< VMO
< V(S
e).
Now each point y e A is the left endpoint of an arbitrarily small
interval (y, y k) which is contained in some In and such that
f(y k) — f(y) > uk. Using Lemma 1 again, we can pick out a
finite collection {J 1 , . . . , hi} of such intervals such that their union
contains a subset of A of outer measure greater than s — 2E. Then
summing over these intervals
E f(y i k,) — f(y,) > u k,
i= 1
> u(s — 2E).
Each interval J, is contained in some interval In, and if we sum over
those i for which Ji c In, we have
E f(y i k,) — f(y) f(x„) — f(x„ — h„),
since f is increasing. Thus
E
Ax.)
-
f(x.
-
hn)
E f(yi + k)
i=1
and so
v(s
E) > u(s — 2E).
Since this is true for each positive e, we have vs > us. But u > v, and
so s must be zero.
98
Differentiation and Integration
[Ch. 5
This shows that
g(x) =
hill
f (. x
±
h) — f(x)
h
is defined almost everywhere and that f is differentiable wherever g is
finite. Let
gn(x) = n[f(x + 1 / n) — f(x)J,
where we set f(x) = f(b) for x > b. Then g„(x) g(x) for almost all
x, and so g is measurable. Since f is increasing, we have gn, > O. Hence
by Fatou's Lemma,
g lim
f
gr, lim
=
[f(x
1+1/n
6
1/n) f(x)] dx
f— n
faa+1./n
= lim[f(b) — n j: +1 in .1]
< f(b) — f(a).
This shows that g is integrable and hence finite almost everywhere.
Thus f is differentiable a.e. and g = f' a.e. I
Problems
1. Let f be the function defined byf(0) = 0 and f(x) x sin (1/x) for
O. Find D+f(0), D+f(0), D7(0), and D_f(0).
2. If the function f assumes its maximum at c, then D+f(c) < 0 and
D_f(c)
O.
3. lff is continuous on [a, b] and one of its derivates (say D+) is everywhere nonnegative on [a, b], then f(b) > f(a). [Hint: First show this
result for a function g for which Dg > E > O. Apply this to g = f Ex.]
x
2
Functions of Bounded Variation
Let f be a real-valued function defined on the interval [a, b], and let
a = x o <x1 < •
< X = b be any subdivision of [a, 14. Define
P=
if ( xi) — f(x,t)i+
Sec. 2]
99
Functions of Bounded Variation
Ic
n=
E if(xi) —
t=n+p=E
2
f(x—i)I,
=1
where we use r+ to denote r, if r > 0 and 0, if r < 0, and set r — =
r+. We have f(b) — f(a) = p — n. Set
—
P = sup p
N = sup n
T = sup t,
where we take the suprema over all possible subdivisions of [a, b].
We clearly have P < T < P ±N. We call P, N, T the positive,
negative, and total variations of f over [a, b]. We sometimes write
1. „ Tab (f), etc., to denote the dependence on the interval [a, b] or on
the function f. If T < oo , we say that f is of bounded variation over
[a, b]. This notion is sometimes abbreviated by writing f e BV.
3. Lemma: If f is of bounded variation on [a, b], then
=
and
f(b) — f(a) = P — N.
Proof: For any subdivision of [a, b]
p = n f(b) — f(a)
< N f(b) — f(a),
and taking suprema over all possible subdivisions, we obtain
P < N f(b) — f( a).
Since N < T < 00,
P — N < f(b) — f(a).
Similarly,
N — P f(a) —
and so
P — N= f(b) — f( a).
100
[Ch. 5
Differentiation and Integration
Thus
T?p+n=p+p— {f(b) — f(a)} = 2p + N — P,
and
T>2P+N—P=P+N.
Since T < P + N, we have T= P+N.I
4. Theorem: A function f is of bounded variation on [a, b] if
and only if f is the difference of two monotone real-valued functions
on [a, b].
Proof: Let f be of bounded variation, and set g(x) ---- n and
h(x) = N:. Then g and h are monotone increasing functions which
are real valued since 0 < P <1 <T < Go and 0 < N: <n<
h(x) + f(a) by Lemma 3. Since h f(a)
T: < oo . But f(x) = g(x)
is a monotone function, we have f expressed as the difference of two
monotone functions.
On the other hand, iff = g h on [a, b] with g and h increasing,
then for any subdivision we have
—
—
—
E if(x) — f(x i_ i )1 .<_. E [g(xi)— g(xi—i)i± E [h(xi) —
= g(b) — g(a) + h(b) — h(a).
Hence
T(f) .< g(b) + h(b) — g(a) — h(a). I
5. Corollary: If f is of bounded variation on [a, b], then f(x)
exists for almost all x in [a, b].
Problems
4. a. Let f be of bounded variation on [a, b]. Show that for each cc (a, b)
the limit off(x) exists as x --> c— and also as x ----> c+. Prove that a monotone function (and hence a function of bounded variation) can have only a
countable number of discontinuities. [Hint: If f is monotone, the number
of points where if(c+) — f(c—)1> 1/n is finite.]
b. Construct a monotone function on [0, 1] which is discontinuous
at each rational point.
5. Show that if a < c < b, then 7',6, — T,c + Tcb and that hence Le < T.
Sec. 3]
Differentiation of an Integral
101
71(f) + T(g), and T(cf) = ct 71(f).
6. Show that Tab (f + g)
7. Let (f,) be a sequence of functions on [a, b] which converges at
each point of [a, 1)] to a function f. Then T(f) < Ern MA).
8. a. Let f be defined by f(0) = 0 and f(x) = x 2 sin (1/x 2) for x
O.
Is f of bounded variation on [— 1, 1 ] ?
b. Let g be defined by g(0) = 0 and g(x) = x 2 sin (1/x) for x O.
Is g of bounded variation on [— 1, 1]?
3 Differentiation of an Integral
In this section we shall show that the derivative of the indefinite
integral of an integrable function is equal to the integrand almost
everywhere. We begin by establishing some lemmas.
6. Lemma: If f is integrable on [a, b], then the function F defined by
f(t) dt
F(x) =
is a continuous function of bounded variation on [a, b].
Proof: The continuity follows from Proposition 4.13. To show
that Fis of bounded variation, let a = xo < x1 < < xk b be
any subdivision of [a, b]. Then
E
k
F(xi) F(x 1 _ i ) 1 =
1=-1
E
1
f
k
f(t) dt
zi--1
f If(01 dt.
Thus
T(F)
L b If(01 dt < oo. 1
7. Lemma: If f is integrable on [a, b] and
LX f(t) dt = 0
for all x e [a, b], then f(t) = 0 a.e. in [a, b].
E
f
1 xt_i
f(t)1 dt
102
Differentiation and Integration
[Ch. 5
Proof: Suppose f(x) > 0 on a set E of positive measure. Then
by Proposition 3.15 there is a closed set F c E with mF > O. Let
0 = (a, b) — F. Then either b f 0, or else
a
o = Lb f = fF f + fo
and
f
f
a
But 0 is the disjoint union of a countable collection {(an, brj} of
open intervals, and so by Proposition 4.11
fa' f
f=
Thus for some n we have
f b„
ja.
f
0,
and so either
or
In any case we see that if f is positive on a set of positive measure,
then for some x e [a, b] we have
fa
x
f
O.
Similarly for f negative on a set of positive measure, and the lemma
follows by contraposition. I
8. Lemma: If f is bounded and measurable on [a, b] and
F(x) =
f(t) dt F(a),
then F'(x) = f(x) for almost all x in [a, b].
Sec. 3]
103
Differentiation of an Integral
Proof: By Lemma 6, F is of bounded variation over [a, 14, and so
F' (x) exists for almost all x in [a, b]. Let If I < K. Then setting
f(x) =
F(x +
h) — F(x)
,
h
with h = 1/n, we have
1 x-i-h
f„(x) = -h- f
f(t) di',
and so
Ifd
K
Since
f(x) —p F' (x) a.e.,
the bounded convergence theorem implies that
f c F'(x) dx = 11m
f
c
f(x) dx = 11m
h—o n
a
c (F(x + h) — F(x)) dx
a
= Ifin [;i f c+h F(x) dx — faa+h F(x) dx]
= F(c) — F(a) = fc f(x) dx,
since F is continuous. Hence
L e {F1(x) — f(x)} dx = 0
for all c
e [a, b],
and so
F'(x) = f(x)
a.e.
by Lemma 7. I
9.
Theorem: Let f be an integrable function on [a, b], and suppose
F(x) = F(a) + azf f(t) dt.
Then F'(x) = f (x) for almost all x in [a, b].
104
Differentiation and Integration
[Ch. 5
Proof: Without loss of generality we may assume f > O. Let fn
be defined by fn(x) = f(x), if f(x) < n, and fn(x) — n if f(x) > n.
Then f — f„ > 0, and so
Gn (x) =
fax f — fn
is an increasing function of x, which must have a derivative almost
everywhere, and this derivative will be nonnegative. Now by Lemma 8
dx fax .f. = f.(x)
—d
a.e.,
and so
d
d z
F'(x) = — Gn ± f f.
dx
dx a
fn(x)
a.e.
Since n is arbitrary,
F'(x) > f(x)
a.e.
Consequently,
b
Lb F'(x) clx Lb
_I. f(x) dx = F(b) — F(a).
ab
Thus by Theorem 2 we have
Lb F'(x) dx = F(b) — F(a) =Lbja f(x) dx,
b
ab
and
Lba (F'(x)
f — f(x)) dx =
O.
Since F'(x) — f(x) > 0, this implies that F'(x) — f(x) = 0 a.e.,
and so F'(x) = f(x) a.e. 1
4 Absolute Continuity
A real-valued function f defined on [a, b] is said to be absolutely
continuous on [a, b] if, given E > 0, there is a 6 > 0 such that
n
E If(x) - f(xi)1 <
2=1
E
Sec. 4]
105
Absolute Continuity
for every finite collection {(x„ x")j of nonoverlapping intervals with
E
— xii <
S.
An absolutely continuous function is continuous, and it follows from
Proposition 4.13 that every indefinite integral is absolutely continuous.
10. Lemma: If f is absolutely continuous on [a, b], then it is of
bounded variation on [a, b].
Proof: Let 5 be the 3 in the definition of absolute continuity which
corresponds to E = 1. Then any subdivision of [a, b] can be split (by
inserting fresh division points, if necessary) into K sets of intervals,
each of total length less than 8, where K is the largest integer less
than 1
(b — a)/5. Thus for any subdivision we have t < K and
so T < K. I
11. Corollary: If f is absolutely continuous, then f has a derivative
almost everywhere.
12. Lemma: If f is absolutely continuous on [a, b] and f'(x) = 0
a.e., then fis constant.
Proof: We wish to show that f(a) = f(c) for any c e [a, b]. Let
E c (a, c) be the set of measure c — a in which f'(x) = 0, and let e
and n be arbitrary positive numbers. To each x in E there is an
arbitrarily small interval [x, x h] contained in [a, c] such that
If(x ± h) — f(x)i < nh. By Lemma 1 we can find a finite collection
{[x k , yk]l of nonoverlapping intervals of this sort which cover all of
E except for a set of measure less than 8, where 8 is the positive
number corresponding to e in the definition of the absolute continuity off. If we label the x k so that x k < x 1 , we have
Yo = a < -Y1 < yi < x2 < • • < y < c --= .x. + 1
and
E
— ykl <
3.
106
Differentiation and Integration
[Ch. 5
Now
n
E If(Yb) —
k=1
f(xk)l
77
<
by the way the intervals
{[xki Yk]}
E (Yk —
xk)
n(c — a)
were constructed, and
Ti
E If(xk + i) — f(yk)i
<e
k=0
by the absolute continuity off. Thus
If(c) — f(a)I = E [f(xk+i) —
k=0
E ±
f(Yk)]
+k=1
i [f(Yk) —
f(xk)]
n(b — a).
Since E and n are arbitrary positive numbers, f(c) — f(a) = O. I
13. Theorem: A function F is an indefinite integral if and only if
it is absolutely continuous.
Proof: If F is an indefinite integral, then F is absolutely continuous by Proposition 4.13. Suppose on the other hand that F is
absolutely continuous on [a, b]. Then F is of bounded variation, and
we may write
F(x) = Fi (x) — F 2(x),
where the functions Fi are monotone increasing. Hence F'(x) exists
almost everywhere and
Ify (x)1< F(x) + F'(x).
Thus
fir(x)i dx Fi (b) + F2(b) — F i (a) — F2(a)
by Theorem
2,
and F'(x) is integrable. Let
G(x) =
f F' (t) dt.
Then G is absolutely continuous and so is the function f = F — G.
It follows from Theorem 9 that f(x) = F'(x) — G'(x) = 0 a.e., and
Sec. 4]
107
Absolute Continuity
so f is constant by Lemma 12. Thus
F(x) = fax F'(t)dt + F(a). I
14. Corollary: Every absolutely continuous function is the indefinite
integral of its derivative.
Problems
9. Let f be absolutely continuous in the interval [E, 1] for each e > O.
Does the continuity of f at 0 imply that f is absolutely continuous on
[0, 1]? What if f is also of bounded variation on [0, 1]?
10. Let f be absolutely continuous on [a, b]. Show that
t(f) = f
b
and P:(f) = L b LfT p.
a
11. The Cantor ternary function (Problem 2.46) is continuous and
monotone but not absolutely continuous.
12. A monotone function f on [a, b] is called singular iff' = 0 a.e.
a. Show that any monotone increasing function is the sum of an
absolutely continuous function and a singular function.
b. Construct a function f which is monotone and continuous on
[0, 1] but which is not absolutely continuous on any subinterval of [0, 1].
c. Show that there is a strictly increasing singular function on [0, 1].
13. Change of variable. Let g be a monotone increasing absolutely
continuous function on [a, b] with g(a) = c, g(b) = d.
a. Show that for any open set 0 c [c, d]
m0 = f
g'(x)dx.
6,-- ' [01
b. Let H = {x: g'(x) x 0} . If E is a subset of [c, d] with mE = 0,
then g-1 (E) n H has measure zero.
c. If E is a measurable subset of [c, d], then F = g[E]n H is
measurable and
b
mE = fF g' = fa XE(g(x))g'(x) dx.
d. Iff is a nonnegative measurable function on [e, d], then (f . g) g'
is measurable on [a, b] and
fo d f (y) dy =
fabf(g(x))g'(x) dx.
108
Differentiation and Integration
[Ch. 5
14. Let g be an absolutely continuous monotone function on [0, 1]
and E a set of measure zero. Then g[E] has measure zero.
15. a. Construct an absolutely continuous strictly monotone function g
on [0, 1] such that g' = 0 on a set of positive measure. [Hint: Let G be the
complement of a generalized Cantor set of positive measure (Problem 3.14),
and let g be the indefinite integral of )(G.]
b. Show that there is a set E of measure zero such that g' [E] is
not measurable. How does this example compare with that of Problem
3.28?
16. A function f is said to satisfy a Lipschitz condition on an interval if
there is a constant M such that I f(x) — f(y)I < Mix — yi for all x and y
in the interval.
a. Show that a function satisfying a Lipschitz condition is absolutely
continuous.
b. Show that an absolutely continuous function f satisfies a
Lipschitz condition if and only if If 1 is bounded.
*5 Convex Functions
A function so defined on an open interval (a, b) is said to be convex
if for each x, y e (a, b) and each X, 0 < X < 1 we have
so(xx
+
( 1 — X)y) XV(x) + ( 1 — X)ç(Y).
If we look at the graph of (12 in R2 , this condition can be formulated
geometrically by saying that each point on the chord between
(x, so(x)) and (y, ço(y)) is above the graph of cp. An important property
of the chords of a convex function is given by the following lemma,
whose proof is left to the reader.
15. Lemma: If ço is convex on (a, b) and if x, y, x', y' are points
of (a, b) with x < x' < y' and x <y < y', then the chord over
(x', y') has larger slope than the chord over (x, y); that is,
40 (y ) — 50(x) < V (y' ) — 4(x')
y—x—
If the upper and lower left-hand derivates D —f and D_ f of a
function f are equal and finite at a point x, we say that f is differentiable on the left at x and call this common value the left-hand
Sec.
5]
109
Convex Functions
derivative at x. Similarly, we say f is differentiable on the right at x
if D±f and D+ f are equal there. Some of the continuity and differentiability properties of convex functions are given by the following
proposition.
16. Proposition: If io is convex on (a, b), then io is absolutely
continuous on each closed subinterval of (a, b). The right- and left-hand
derivatives of io exist at each point of (a, b) and are equal to each other
except on a countable set. The left- and right-hand derivatives are
monotone increasing functions, and at each point the left-hand derivative
is less than or equal to the right-hand derivative.
Proof: Let [c, d] C (a, b). Then by Lemma 15 we have
ç(d)
So(c) — So(a) < Se(y) — (0(x) < (0(b)
c a
y— x
b— d
—
for x, y in [c, d]. Thus so(y) — so(x)I < Mix — yi in [c, d], and so so
is absolutely continuous there.
If xo e (a, b), then [so(x) — so(xo)]/(x — x o) is an increasing
function of x by Lemma 15, and so the limits as x approaches xo
from the right and from the left exist and are finite. Thus so is differentiable on the right and on the left at each point, and the left-hand
derivative is less than or equal to the right-hand derivative. If
xo < yo, x < yo, and xo < y, then
so(x) —
V(xo)
x — xo
<
40(.0 it00)
Y Yo
and either derivative at xo is less than or equal to either derivative at
Yo . Consequently, each derivative is monotone, and they are equal
at a point if one of them is continuous there. Since a monotone
function can have only a countable number of discontinuities, they
are equal except on a countable set. I
Let so be a convex function on (a, b) and x o e (a, b). The line
y = m(x — xo) so(x 0) through (xo, (0 (x 0)) is called a supporting
line at xo if it always lies below the graph of so, that is, if
so(x) > m(x — xo) so(x0). It follows from Lemma 15 that such a
line is a supporting line if and only if its slope m lies between the
left- and right-hand derivatives at xo . Thus, in particular, there is
110
Differentiation and Integration [Ch. 5
always at least one supporting line at each point. This notion enables
us to give a short proof for the following proposition :
17. Proposition (Jensen Inequality): Let cp be a convex function on
(— x, oo) andf an integrable function on [0, 1]. Then
f ço(f(t)) dt > yo[f f(t) dd.
Proof: Let a = ff(t) dt, and let y = m(x —
equation of a supporting line at a. Then
ço(f(t))
m(f(t)
a)
+ io(a) be the
a) + (p(a).
Integrating both sides with respect to t gives the proposition. I
This inequality has a geometric interpretation worth mentioning.
Since the point Xx i + (1 — X)x2 is the centroid of masses X and
(1 — X) at x 1 and x2, we can say that a function is convex if its value
at the centroid of a two-point mass is less than the weighted average
of its values at the two points. The Jensen inequality is a generalization of this fact: If we define a mass distribution on the line by
setting ,u(a, b] = inft: a < 1(t) < b}, then ff(t) dt is the centroid
of this mass and fça(f (t)) dt =f(x) d,u is the weighted average of cp.
An important application of the Jensen inequality is obtained by
taking for our convex function cio the function exp defined by
exp x = e x. The inequality then becomes a generalization of the
inequality between the arithmetic and geometric mean :
18. Corollary: Let f be an integrable function on [0, 1]. Then
f exp (f(t)) dt
exp [f f(t) dt].
Problems
17. Let g be a nonnegative measurable function on [0, 1]. Then
log f g(t) dt > flog (g(t)) dt whenever the right side is defined.
18. When do we have equality in Corollary 18?
19. Let (an ) be a sequence of nonnegative numbers whose sum is 1
and (En ) a sequence of positive numbers. Then
TT
n
E anEn.
n=1
6
1
The Classical
Banach Spaces
The LP Spaces
In this chapter we study some spaces of functions of a real variable.
Let p be a positive real number. A measurable function defined on
[0, I] is said to belong to the space LP = LP[0, I] if fJf I
<
Thus L l consists precisely of the Lebesgue integrable functions on
[0, 1]. Since j f giP 2P((f + le), we see that the sum of two
functions in L P is again in L". Since af is in LP whenever f is, we have
af #g in L P whenever f and g are. A space X of real-valued functions is called a linear space (or vector space) if it has the property
that af #g belongs to X for each pair f and g belonging to X and
for each pair of constants a and i3. Thus the L P spaces are linear
spaces.
For a function f e L ", we define
11f11 = IlfIlp =
We see that f = 0 if and only if f = 0 a.e. If a is a constant, then
Ilaf11 = la? III f Il. In the next section we shall derive two inequalities,
the second of which states that II f
11 f?? +
if P
1.
Henceforth, we shall always assume p > I. A linear space is said
to be a normed linear space if we have assigned a nonnegative real
number fIl to each f such that Ilafil = I al IIf??, III f gll
I fll
11e, and 1 If?? = O <=> 0. Unfortunately, the norm for the LP
111
112
The Classical Banach Spaces
[Ch. 6
spaces does not satisfy the last requirement, for from ifil = 0 we
can only conclude that f = 0 a.e. We shall, however, consider two
measurable functions to be equivalent if they are equal almost everywhere; and, if we do not distinguish between equivalent functions,
then the LP spaces are normed linear spaces.'
It is convenient to denote by L'''' the space of all bounded measurable functions on [0, 1] (or rather all measurable functions which are
bounded except possibly on a subset of measure zero). Again we
identify functions which are equivalent. Then L is a linear space,
and it becomes a normed linear space if we define
11f11 = 11f11. =
ess sup I f(t)j,
where ess supf(t) is the infimum of sup g(t) as g ranges over all
functions which are equal to f almost everywhere. Thus
ess sup f(t) = inf -CM: mIt: f(t) > M1 --,-- 01.
Problems
1. Show that 11f ± grf. < ilfil„ ± HA2. Let f be a bounded measurable function on [0, 1]. Then lim
!I f I —
llfllp -
3. Prove that 11f+ gill
IlfIli + 11g111.
4. If f e L i and g c Dc, then
flfgl
-. 11f111.11g11..
2 The Holder and Minkowski Inequalities
Before proving inequalities about the LP norm, we first establish
the following lemma, which is a generalization of the inequality
between the arithmetic and geometric mean:
1. Lemma: Let a and 0 be nonnegative real numbers, and suppose
0 < X < 1. Then «?'0 1—)` < X« ± (1 — X)0 with equality only if
a =-- (3.
1
To be pedantic we should say the elements of LP are not functions but rather equivalence classes of functions (cf. Problem 10.10).
Sec. 2]
The W6 !der and Minkowski Inequalities
113
Proof: Consider the function
defined for nonnegative real
numbers t by
io(t) = (1
A + At — tx.
Then
—
)
(t) = X(1 —
Since A — 1 < 0, we have cp' (t) < 0 for t < 1 and ço'(t) > 0 for
t > 1. Thus for t
1 we have yo(t) < v(1) = O. Hence
(1 — A) + At > tx ,
with equality only for t = 1. If )3
0, the lemma follows by substituting a/0 for t, while, if /3 = 0, the lemma is trivial. I
2. Holder Inequality: If p and q are nonnegative extended real
numbers such that
= 1, and if f L P and g cLi, then f • g c
and
f I fgl
• ligh.
Equality holds if and only if, for some nonzero constants a and /3,
we have al
olglq a.e.
Go is straightforward and is left to
Proof: The case p = 1, q
the reader. Hence we assume 1 <p < oo and consequently
I < q < 00 . Let us first suppose that f = deg = 1, and apply
Ig(t)j, X = t,, 1 — X
Then
our lemma with a = I
P, /3
(1)
I f(t)g(t)i < f(t)I P + (1 —
and integrating both sides we have
(2)
f
fg-1< x f ifr + (1
—
X) f 'gi g =
1.
The inequality is trivial if II f!( = 0 or l!gli= O. Let f and g be any
O. Then Eilf li p and
elements of L P and Lq with II 0 and
g/ligli g both have norm 1. Substituting them in (2) gives
f
1
fg
l=
f
Ifl
II
Ilf11, rglIg
<1
—
Ig(t)I, and equality can
Equality in (1) can occur only if I f(t)( P
hold in (2) only if it holds almost everywhere in (1). Hence equality
holds only if (Leg f
flIP
a.e. I
rP =
The Classical Banach Spaces
114
[Ch. 6
3. Minkowski Inequality: 1ff and g are in LP, then so is f + g and
Ilf + gllp
IlfIlp +
Proof: The cases p = 1 and p = co are straightforward and are
left to the reader. Hence we assume 1 < p < co.
IgIP), we have f + g in L. Also
Since j f + glP 2P(1 fjP
flf +
gP
iv+
+
By the H61der inequality,
fif
gi
' ifi
f if + g
' igl
gi p_i )112,
<
<
and
= Ulf+ gr'-'1"
Hof+
+ glIp}",
=
since p = q(p — 1). Hence
Ilf+
(11fIl3, + lIglI3,)(11f + g113,)P ig
Ilf + gllp
IlfIlp + lIglIp. 1
or
Problems
5. For 1 < p < co, we denote by IP the space of all sequences (E,Z= 1
P < oc. Show that if (S,) g /P and (ny) e /q with
such that
y=
; + = 1, then
I timid
I I(G)Ilp • 1 1(ny>II,,
where
(11(t)llp)P
Ê ii
and
11(ny)II. = sup Inv'.
Sec. 3]
Convergence and Completeness
115
This is the Hblder inequality for sequences.
6. Prove the Minkowski inequality for sequences:
li(Ep
n9)11p
5- ii(tv)iip + 11(004.
Here we have 1 < p < cc.
3
Convergence and Completeness
The notion of convergence for a sequence of real numbers
generalizes to give us a notion of convergence for sequences in a
normed linear space.
Definition: A sequence (fi ) in a norrned linear space is said to converge to an element f in the space if, given e > 0, there is an N such
that for all n > N we have 11 f — f„It < e. If Jet, converges to f we
write f = Lim f7, or fn f.
Another way of formulating the convergence off, to f is by noting
—f11 —> O. Convergence in the space L P 1 <p < oo,
that f„
is often referred to as convergence in the mean of order p. Thus a
sequence of functions ( f„) is said to converge to f in the mean of
order p if each f„ belongs to LP and 11f — fjp O. Convergence in
is nearly uniform convergence (Problem 8).
Since we are dealing with a linear space X of functions, we must be
careful to distinguish the above notion of convergence in X from
the notion of a sequence of functions which converges at each point.
We shall call this latter type of convergence pointwise convergence,
and say that ( f„) converges pointwise to f if for each x we have
f(x) = lim fi,(x). If there is a set E of measure zero such that for
each x in E
- we have f (x) = lirn f,, (x), then we say that f„ converges
to f almost everywhere.
Just as for the case of sequences of real numbers, we say that a
sequence ( f,,) in a normed linear space is a Cauchy sequence if, given
e > 0, there is an N such that for all n > N and all m > N we have
< E. It is easily verified that each convergent sequence is a
—
Cauchy sequence.
Definition: A normed linear space is called complete if every
Cauchy sequence in the space converges, that is, if for each Cauchy
116
The Classical Banach Spaces
[Ch 6
sequence (f0 in the space there is an element f in the space such that
fn f. A complete normed linear space is called a Banach space.
A series (fn ) in a normed linear space is said to be summable to a
sum s if s is in the space and the sequence of partial sums of the
series converges to s; that is,
s —
E
i=i
If this is the case, we write s =
absolutely summable if
E
ft
Eft.
0.
The series ( fa) is said to be
fnii < 00 .
n=
For a series of real numbers we know that absolute summability
implies that the series is summable. While this is not true in general
for series of elements in a normed linear space, the following proposition shows that this implication holds if the space is complete.
4. Proposition: A normed linear space X is complete if and only
if every absolutely summable series is summable.
Proof:
Let X be complete and ( fn ) an absolutely summable
series of elements of X. Since Ell fnl I = m. < 00, there is, for each
e > 0, an N such that
EfI <
n=1â– 1
E. Let sn =
E j; be the partial
z--=1
sum of the series (fn). Then for n > m > N we have
Ilsn —
sndl =
E
Ilfi II < E.
Hence the sequence (sa) of partial sums is a Cauchy sequence and
must converge to an element s in X, since X is complete.
Let (fn) be a Cauchy sequence in X. For each integer k there
is an integer nk such that 1 fn — fmII < 2—k for all n and m greater
than nk , and we may choose the nk 's so that nk+1 > nk . Then (fr,)17-1
is a subsequence of ( fn ), and if we set g 1 = fni , and g k = fn,—
for k > 1 we obtain a series (gk ) whose k' partial sum is f„,. But we
have 2 —k+ 1 if k > 1. Thus Eiigkl! E 2—k+1
+ 1. Hence the series (g0 is absolutely summable, and so by
our hypothesis there is an element f in X to which the partial sums of
the series converge. Thus the subsequence (fnk) converges to f.
Sec.
3]
117
Convergence and Completeness
We shall now show thatf = lim f„. Since ( fn ) is a Cauchy sequence,
given E > 0, there is an N such that II fn — f.11 < €72 for all n and m
larger than N. Since fnk f, there is a K such that for all k > K we
have 11
— fil < E/2. Let us take k so large that k > K and nk > N.
Then
llf
—
fi!
I fn — fn,11 + Ilfn,
—
Thus for all n > N we have II f„ — fil <
fil
E,
+ =
and so fn
f. I
5. Theorem (Riesz-Fischer): The LP spaces are complete.
Proof: Since the case p = Go is elementary, it is left to the reader,
and we assume 1 < p < Go . By virtue of the preceding proposition
we need only show that each absolutely summable series in 1,9 is
summable in LP to some element of L P.
Let ( f„) be a sequence in LP with Eiifi1 = M < , and define
n=1
functions gn by setting g(x) = E ifk(x)i. From the Minkowski
inequality we have
11g.11
k=1
Ilfkll
M.
Hence
(g,)9
5,_ mP
.1
For each x, (g„(x)) is an increasing sequence of (extended) real
numbers and so must converge to an extended real number g(x).
The function g so defined is measurable, and, since gn > 0, we have
g9
mP
by the Fatou Lemma. Hence gP is integrable, and g(x) is finite for
almost all x.
fk(x) is an absolutely
For each x such that g(x) is finite the series
k=
summable series of real numbers and so must be summable to a real
number s(x). If we set s(x) = 0 for those x where g(x) =oo , we have
defined a function s which is the limit almost everywhere of the partial
118
The Classical Banach Spaces
[Ch. 6
n
SUMS sn = E fk. Hence s is measurable. Since fsn (x)i < g(x), we
lc= 1
have Is(x)1 < g(x). Consequently, s is in LP and we have
isn (x) — s(x)( P < 2' [g(x)]9 .
Since 2Pg9 is integrable and Js(x) — S(x)I P converges to 0 for almost
all x, we have
I is
— sI P --* 0
by the Lebesgue convergence theorem. Thus tIsn — sil P ---> 0, whence
Ifs„ — sll ---* O. Consequently, the series ( f,.) has in D' the sum s. I
Problems
7. Prove that every convergent sequence is a Cauchy sequence.
8. Let (A) be a sequence of functions in L. Prove that (A) converges
to f in LG.' if and only if there is a set E of measure zero such that A converges to f uniformly on E.
9. Prove that Lc° is complete.
10. Prove that P' is complete (1 < p < cc) (see Problem 5).
11. Let C = C[0, 1] be the space of all continuous functions on [0, 1]
and define If fll = max I f( x)f. Show that C is a Banach space.
12. We denote by PD the space of all bounded sequences of real numbers
and define 11(p)11. — sup la Show that /OE' is a Banach space.
13. Show that the space c of all convergent sequences of real numbers
and the space c o of all sequences which converge to zero are Banach spaces
(with the P° norm).
14. Let f be a function in L9, 1 <p < oo . Show that given e > 0 there
is a continuous function yr, and a step function zi, such that 1ff — ioll < e
and If f — Ntiff < E.
15. Let (fn ) be a sequence of functions in LP, 1 < p < oo , which
converge almost everywhere to a function f in LP. Show that (A) converges
to f in L 9 if and only if Ilfill ---* If!!. (For p = 1 this is just Problem 4.14.)
16. Let (f„) be a sequence of functions in LP, 1 < p < oo, which
converge almost everywhere to a function f in V', and suppose that there is
a constant M such that Ilf„11 < M for all n. Then for each function g in
Lq we have
f fg = lim f f„g.
Is the result true for p = I?
119
Sec. 4] Bounded Linear Functionals on the LP Spaces
17. Let fn, —*f in LP, 1 < p < oo , and let (gr,) be a sequence of
measurable functions such that Ign1 < M, all n, and gn --> g a.e. Then
gnf, —* g f in LP.
4
Bounded Linear Functionals on the
LP
Spaces
We define a linear functional on a normed linear space X to be a
mapping F of the space X into the set of real numbers such that
F(af + (3g) = aF(f) + oF(g). We say that the linear functional is
bounded if there is a constant M such that IF( f)I __< M • I If I for all f
in X. The smallest constant M for which this inequality is true is
called the norm of F. Thus
VII = sup
1F(f)I 3
Ilfll
as f ranges over all nonzero elements of X.
If g is a function in Lq, we can define a bounded linear functional
F on LP by setting
F(f) = f fg.
The functional F is clearly linear, and the Holder inequality states
that IIFII < ligh. In fact, we actually have IIFII = de,• To see this
for the case 1 < p < oc, we set'
f = I gl q1P sgn g.
Then I f I P = IgI q = fg. Hence f is in LP and II fli p = (lIgligrP. Now
F(f) = f fg = f le
= (ileq)q = lIglIglIfIlp,
and so I F must be at least as great as II e q. We state this result as a
proposition, the cases p — 1 and p = Go being left to the reader
(see Problem 18).
6. Proposition: Each function g in Lq defines a bounded linear
functional F on LP by
F(f) = f fg.
We have 11F II = ligil q.
2
For any real number x we define sgn x = 1 if x >0, sgn 0 = 0, and sgn x = —1
if x <0.
120
The Classical Banach Spaces
[Ch. 6
The goal of the present section is to show that for 1 < p < 00 the
converse of this proposition holds, that is, that we obtain every
bounded linear functional on D' in this manner. We shall find it
useful to first establish the following lemma.
7. Lemma: Let g be an integrable function on [0, 1], and suppose
that there is a constant M such that
I
1 fg __ m ilf I lp
for all bounded measurable functions f. Then g is in D, and lIgII, < M.
Proof Let us first assume 1 < p < oc. We define a sequence of
bounded measurable functions by setting
g(x) =
Ig(x)
if Ig(x)1 < n
0
if Ig(x)1 > n,
and letting
fn = ign riP sgn gn .
Now iifnllp = (lIgn110q1 P, and igni q --- fn • gn = f. • g. Hence
(l(gn h)q = ffng < Mdfn li p = M(11g7,110q/P .
Since q — q/ p = 1,
Ng. ii, _. Al
and
fignr ..... M.
Since ign iq converges to igiq almost everywhere, we have
f 'gi g < lim f Igni g < Mq
by the Fatou lemma. Thus g 8 D, and ligilq
M.
For the case p = 1, let E = {x: fg(x) I > M ± E} , and set
f = (sgn g)x E. Then II fill = mE, and
MmE = Miifili ...
Thus mE = 0, and ile. < M. I
f fg _>_ (M + f)mE.
Sec. 4]
Bounded Linear Funchonals on the LP Spaces
121
We are now in a position to give the following characterization of
the bounded linear functionals on D' for 1 < p < co :
8. Riesz Representation Theorem: Let F be a bounded linear
functional on D', 1 < p < cc . Then there is a function g in L 9 such that
F(f) = f fg.
We have also WI( = lIgl( g .
Proof: Let X s be the characteristic function of the interval [0, s].
We begin our investigation of F by observing what it does to X s .
For each s the value of F(X 8) is a real number 4,(s), and this defines a
function 4) on [0, 1]. Now I maintain that 4) is absolutely continuous.
For let {(s„ si.)} be any finite collection of nonoverlapping subintervals of [0, 1] of total length less than b. Then
E 14D(s) — cqs,)i =
where
f = E (x„, — x„) sgn (4)(s'i ) — ct(s,)).
,
Since (II f 11) P
< 6, we have
E 14,0'.0 —
,)1 = F(f)
11F1111fIlp
< IIF(IVP.
This shows that the total variation of 4) is less than e over any
finite collection of disjoint intervals of total length 6 if we take
6 --- &'/F'. Thus 4) is absolutely continuous.
By Theorem 5.13 there is an integrable function g on [0, 1] such
that
4,(s) = fos g.
Thus
F(X 3) = foi g • X,.
122
The Classical Banach Spaces
[Ch. 6
Since every step function on [0, 1 ] is (equal except at a finite number
of points to) a suitable linear combination EciX so we must have
F(4) = 101 gtk
for each step function 4", by the linearity of F and of the integral.
Let f be any bounded measurable function on [0, 1]. Then it follows
from Proposition 3.22 that there is a sequence (Ip„) of step-functions
which converge almost everywhere tof. Since the sequenced f — thil P)
is uniformly bounded and tends to zero almost everywhere, the
bounded convergence theorem implies that 11 f — 0 nit i, --> 0. Since F
is bounded, and
IF(f) — F(lin)i = IFU — COI
we must have
liFii
Ilf — 0.11p,
F(f) = lim
Since gly„ is always less than igl times the uniform bound for the
sequence (4/n), we have
f fg = lim f gC,
by the Lebesgue convergence theorem. Consequently, we must have
f fg = F(f)
for each bounded measurable function f. Since IF(f)I —‹ liFil I fll P ,
we have g in Lg and ligl 1 15_ VII by Lemma 7.
Thus we have only to show that F(f) = ffg for each f in LP . Let f
be an arbitrary function in _U. Then by Problem 14 there is for each
e > 0 a step function 0 such that III f — Oli p < e. Since 11, is bounded
we have
F(0) — 1 Og.
Hence
1F(f) — f fg
F(f) — F(0) + fg — f fgl
f( - f)gl
< liFil If — oh, + Il gti g iif — In,
F( /' — 0)1 +
< (iin + ligh)e•
Sec. 4]
Bounded Linear Functionals on the L 2 Spaces
123
Since E is an arbitrary number, we must have
F(f) = f fg.
The equality 11F11 --= ligli q follows from Proposition 6. 1
In the problems the reader is asked to carry out a similar representation for the bounded linear functionals on 17), 1 < p < oo , c, and
c o . In Theorem 14.8, we give a representation for the bounded linear
functionals on C. Unfortunately, the bounded linear functionals
on L (and on r) do not admit of a similar representation.
Problems
18. a. Let g be an integrable function on [0, 1]. Show that there is a
bounded measurable function f such that 11f II 0 and
f fg = lei 11fII..
'
.
b. Let g be a bounded measurable function. Show that for each
e > 0 there is an integrable function f such that
ffg
(11g11. - 011fIli.
[Hint: f may be taken to be a suitable characteristic function.]
19. Find a representation for the bounded linear functionals on 12 ,
1 < p <GO.
20. Find a representation for the bounded linear functionals on c
and on c o . (Caveat: These representations are different.)
21. Show that the element g in V given by Theorem 8 is unique.
Part Two
0
Abstract Spaces
7
1
Metric Spaces
Introduction
The system of real numbers has two types of properties. The first
type consists of the algebraic, dealing with addition, multiplication,
etc. The other type consists of properties having to do with the notion
of distance between two numbers and with the concept of a limit.
These latter properties are called topological or metric, and the object of the present chapter is to study these properties in a general
space in which the notion of distance is defined. We make the
following definition:
Definition: A metric space (X, p) is a nonempty set X of elements
(which we call points) together with a real-valued function p defined on
X X X such that for all x, y, and z in X.
0;
i. p(x, y)
p(x, y) = O f and only if x = y;
= P(y, x);
P(x,
iv. p(x, y) < p(x, z)
p(z, y).
The function p is called a metric.
An obvious example of a metric space is the set R of all numbers
with p(x, y) = ix — yl. A second example is the n-dimensional
Euclidean space Rn whose points are the n-tuples x = (x 1 , . . • , x.)
of real numbers and
p(x,y ) = r (x 1 - 1) 2 + •
127
+ (x. - y.) 2 1 112 .
128
Metric Spaces [Ch. 7
For Rn, property (iv) of the metric is merely the statement that the
length of a side of a triangle is less than the sum of the lengths of
the other two sides. Consequently, (iv) is usually called the triangle
inequality.
Other examples of metric spaces are the normed linear spaces of
the last chapter if we set
P(x, Y) = 11x — Y11,
the triangle inequality being equivalent to 11x + yil —‹ 11x11 + VII.
It should be emphasized that a metric space is not the set X of its
points, since it is in fact the pair (X, p) consisting of the set of its
points together with the metric p. For example, the set of all n-tuples
of real numbers can also be made into a metric space by use of the
metric p* given by
P* (x, Y) = Ix). — Yi1 ± • • • + lx n — Ynii,
and this is not the metric space Rn (if n > I). Often we are interested
in only one metric for a given set of points, and in such cases we
sometimes use the symbol X to denote both the set of points and the
metric space (X, p).
If we have two metric spaces (X, p) and ( Y, (7), we can form a new
metric space called the Cartesian product X X Y whose set of points
is the set X X Y = {(x, y): x e X, y e Y} and whose metric T is
given by
T((xi, y), (x2, Y2)) = [P(xi, x2) 2 -1- 0-(y „ y 2)1 112
It is readily verified that T has all the properties required of a metric
and that Rin X Rn = Rm+n .
Any nonempty subset of a metric space is itself a metric space if
we restrict the metric to it. For example, the space C of the last
chapter is a subspace of L.
Problems
1. a. Show that the set of all n-tuples of real numbers becomes a metric
space under each of the following metrics:
P* (x, Y) = ix i — Yi 1 + • • • + ixn — Y7,1
max {ixi — Yil, • • • , 1xn — Yni} •
P+(x, Y) =
Sec. 2]
129
Open and Closed Sets
b. For n = 2 and n = 3 describe the sets {x: p(x, y) < 1),
{x: p*(x, y) < 11, and Ix: p±(x, y) < 11.
2. By the spheroid (or sphere) centered at x and having radius we
mean the set
Sz,3 = 1Y: XX, Y) < al.
Prove that if 0 < e < — p(x, z), then Sz,, C S.
3. Pseudometrics. A pair (X, p) is called a pseudometric space if p
satisfies all the conditions of a metric except that p(x, y) = 0 need not
imply x = y. Show that p(x, y) = 0 is an equivalence relation and that if
X* is the set of equivalence classes under this relation, then p(x, y) depends
only on the equivalence classes of x and y and defines a metric on X*.
2
Open and Closed Sets
We shall find that a number of the properties of sets of real numbers
apply immediately to sets in a metric space. Throughout the present
section all sets mentioned are subsets of a given metric space (X, p).
The following propositions and definitions correspond to those in
Section 5 of Chapter 2, and the reader is asked to check that the
proofs given there are valid for metric spaces.
Definition: A set 0 is called open if for every x e 0, ô > O such
that all y with p(x, y) < 3 belong to O.
1. Proposition: The sets X and 0 are open; the intersection of
any two open sets is open; and the union of any collection of open sets
is open.
Definition: A point x e X is called a point of closure of the set E
if for every > 0 there is a point y e E such that p(x, y) < a. We
use to denote the set of points of closure of E. Clearly E c E.
2. Proposition: If A c B, then A
and (A n B) c A n B.
C -E.
Definition: A set F is called closed if
Also (Au B) = -21u7B,
= F.
3. Proposition: The closure of any set E is closed; that is, E =
7.
130
Metric Spaces
[Ch. 7
4. Proposition: The sets 0 and X are closed; the union of any
two closed sets is closed; and the intersection of any collection of
closed sets is closed.
5. Proposition: The complement of an open set is closed; the
complement of a closed set is open.
Definition: A metric space X is called separable if it has a subset
D which has a countable number of points and which is dense in X,
that is, for which .D
X.
—
Since the set of rational numbers is a countable dense subset of R,
we see that R is separable. The following proposition shows that
the Lindel6f theorem holds for a metric space if and only if it is
separable.
6. Proposition: A metric space X is separable if and only if there
is a countable family {0,} of open sets such that for any open set
0 c X,
0= U Oi.
oico
Proof: If X is separable, let D be a countable dense set. By the
spheroid at x with radius 5 we mean the set
Sx,S = {Y: P(X3 Y) < 5 } •
Let {0, } consist of those spheroids S x,6 for which x is in D and 5 is
rational. Then f0j is a countable collection of open sets. If 0 is any
open set and y c 0, then we want to show that for some 0, we have
y c 0, C 0. Since 0 is open, there is a spheroid Sy, 6 such that
S y ,6 C 0. By taking 3 even smaller we may assume 5 is rational.
Since y is a point of closure of D, there is a point x E D such that
p(x, y) < 5/2. Hence
Sx0312 C Su o5 C
0.
But Ss ,112 is one of the {0,} , and the "only if" part of the theorem
is proved.
Suppose, on the other hand, we are given the countable collection
{0,}. Let xi be a point of 0„ and let D be the set of all these points
x,. We shall now see that D is dense. Let x be any point of X and S
Sec. 3]
Continuous Functions and Homeomorphisms
131
any spheroid centered at x. Then we must show that S contains a
point of D. But S is an open set (Problem 6), and so we must have
some 0, such that x e 0, c S. Hence x, e S, and we see that x e TD. I
Problems
4. a. Show that C is a closed subset of L (see Chapter 6).
b. Show that the set of all integrable functions which vanish for
0 < t <i-is a closed subset of L l .
c. Show that the set of all measurable functions x(t) with f x < 1
is an open subset of L'.
5. Show that E =
F for closed sets F. The interior E° of a set E
n
ECF
is the set of those y e E for which there is a (5 > 0 such that p(y, z) < (5
z e E.
a. Show that
E°= U o.
OCE
b. Show that
—CE) = (E)°.
6. a. Show that each spheroid is open.
b. Show that the sets {x: p(x, y) < SI are closed.
c. Is the set in (b) always the closure of the spheroid
{x: p(x, y) < (5 } ?
7. Which of the spaces Rn, C, L', L' are separable?
3
Continuous Functions and Homeomorphisms
A function f on a metric space (X, p) into a metric space ( Y, 0"
is a rule which associates to each x e Xa unique y e Y. We also
call f a mapping of X into Y and shall use the terms function and
mapping interchangeably. By a mapping f of X onto Y we mean, as
usual, that for any y e Y there is some x e X such that f(x) = y;
that is to say, that Y is the range off. Then as in Section 6 of Chapter
2 we have the following definition and propositions :
132
Metric Spaces
[Ch. 7
Definition: The function f is said to be continuous at x if, for every
e > 0, there is a 6 > 0 so that if p(x, y) < 6, then cr[f(x), f(y)] < e.
The function f is called continuous if it is continuous at each x e X.
7. Proposition: A function f from a metric space X to a metric
space Y is continuous if and only if for each open set 0 in Y the set
f -1 [0] is an open set in X.
8. Proposition: If f is a continuous mapping from X to Y and if g
is a continuous mapping from Y to Z, then the mapping g o f from
X to Z is also continuous.
A one-to-one mapping f of X onto Y is called a homeomorphism
between X and Yiff is continuous and the mapping f -1 inverse to f
is also continuous. The spaces X and Y are said to be homeomorphic
if there is a homeomorphism between them. The study of topology
is essentially the study of those properties which are unaltered by
homeomorphisms, and such properties are called topological. By
virtue of Proposition 7 a one-to-one correspondence between X and
Y is a homeomovhism if and only if it makes open sets in X correspond to open sets in Y and conversely. Thus the property of being
an open subset of a space is a topological one. Since a closed set is
the complement of an open set, it follows that being a closed subset
of X is a topological property. In fact, every property which can be
defined by means of open sets is a topological one, and hence so is
the property of being a continuous function: that is, if f is a continuous function on X and h: X —> Y is a homeomorphism between
X and Y, then fo 11 -1 is a continuous function on Y.
Not all properties in a metric space are preserved under a homeomorphism, however. For example, the distance between two points
is usually altered by a homeomorphism. A homeomorphism which
leaves distance unchanged, that is, one for which
cr[h( x i ), h x 2 ) ] = P(x 1, x2 )
(
for all x 1 and x2 in X, is called an isometry between X and Y. The
spaces X and Y are called isometric if there is an isometry between
them. From an abstract point of view two isometric metric spaces
are exactly the same, an isometry amounting merely to a relabeling
of the points. A notion which arises naturally in this connection is
that of equivalent metrics: Two metrics p and cr on the same set X
Sec. 4]
Convergence and Completeness
133
of points are called equivalent if the identity mapping of (X, p) onto
(X, a) is a homeomorphism. Thus two metrics are equivalent if and
only if they define the same open sets, that is, if a set is open with
respect to one whenever it is open with respect to the other.
Problems
8. Show that the function h on [0, 1) given by h(x) = x/(1
homeomorphism between [0, 1) and [0, 00).
9.
—
x) is a
Let E be a set and x a point in a metric space. Define
p(x, E) = inf p(x, y).
ycE
a. Show that for a fixed E the functionf given byf(x) -- p(x, E) is
continuous.
b. Show that {x: p(x, E) = 0} = E.
10. a. Prove that two metrics on a set X are equivalent if and only if
given x e X and e > 0 there is a 6 > 0 such that for all y e X
p(x, y) G 6 = a(x, y) < e
and
a(x, y) < 6
p(x, y) < e.
b. Show that the following set of metrics for the set of n-tuples of
real numbers are equivalent:
P(x, Y) =
[(xi — Yi) 2 + • • • + (xn — Yn) 2 i ii 2
P* (x, Y) = ixi — Yii + • • • + ixn — Yni
p +(x, y) = max {ix i — y 1 1, ... , 1x,, —
c. Find a metric for the set of n-tuples which is not equivalent
to these.
11. Show that if p is any metric on a set X, then a = p/(1 + p) is an
equivalent metric for X. Prove that (X, a) is a bounded metric space; that
is, Œ(x, y) < 1 for all x and y in X.
4
Convergence and Completeness
Just as for the case of real numbers we say that a sequence (xn )
from a metric space X converges to the point x in X (or has x as a
limit), if given E > 0, there is an N such that p(x, x n) < e for all
134
Metric Spaces
[Ch.
7
n > N. This definition can be rephrased in geometric terms by saying
that .x„) converges to x if every spheroid about x contains all but a
finite number of terms of the sequence.
We often write x lim x„ or x„ x to mean that x is the limit
of the sequence (x n ). If we have merely the weaker condition that
each spheroid about x contains infinitely many terms of the sequence,
then we say that x is a cluster point of the sequence (x n ). Thus x is a
cluster point of (x n) if given e > 0 and given N there is an n > N
such that p(x, x,) < E. Thus if x is the limit of (x„), then x is a
cluster point of (x„), but the converse is not necessarily true. A
number of properties concerning limits and cluster points are given
in the problems.
A sequence (xn) from a metric space is called a Cauchy sequence
if given e > 0 there is an N such that for all n and m larger than N
we have p(x„, x,„) < E. If (x„) converges to some point x, then given
> 0 we may choose N so large that p(x, x) < 6/2 for n > N.
Hence for all n, m > N we have
P(x., x.)
p(x„, x)
p(x„„ x) < e,
and (x 7,) is a Cauchy sequence. The converse statement that every
Cauchy sequence converges is not true in an arbitrary metric space
(for example, in the space of rational numbers with the usual metric).
If a metric space has the property that every Cauchy sequence
converges (to some point of the space), we say that the space is
complete. The Cauchy criterion for real numbers merely states that
the space R of real numbers is complete. Other examples of complete
spaces are given by the Banach spaces discussed in Chapter 6.
If X is an incomplete metric space, it can always be enlarged to
become complete. A precise statement of this fact is given by the
following theorem. One proof of this theorem is outlined in Problem
17 and another in Problem 10.16.
9. Theorem: If (X, p) is an incomplete metric space it is possible to
find a complete metric space X* in which X is isometrically embedded
as a dense subset. If X is contained in an arbitrary complete space Y,
then X* is isometric with the closure of X in Y.
Problems
12. Show that a sequence (xn ) in a metric space has x as a cluster point
if and only if there is a subsequence (x,,,,) of (xn ) which converges to x.
Sec. 5]
Uniform Continuity and Uniformity
135
13. Show that a sequence (x) in a metric space converges to x if and
only if every subsequence of (x,i) has x as a cluster point. Hence (xn)
converges to x if every subsequence has in turn a subsequence which
converges to X.
14. Let E be a set in a metric space X. If xis a cluster point of a sequence
from E, then x e 7e, while, if x e 7E, there is a sequence from E which
converges to X.
15. If a Cauchy sequence (x,z ) in a metric space has a cluster point x,
then (xn ) converges to x.
16. If X and Y are metric spaces andf a mapping from X to Y, thenf is
continuous at x if and only if for each sequence (xn ) in X converging to x
we have (f(x„)) converging to f(x) in Y.
17. a. If (x) and (yn) are Cauchy sequences from a metric space X, then
p(xn, yn) converges.
b. The set of all Cauchy sequences from a metric space X becomes
a pseudometric space (cf. Problem 3) if p*( ( 4,), (yn)) = lirn p(xn, yn).
c. This pseudometric space becomes (as in Problem 3) a metric
space X* when we identify elements for which p* = 0, and X is isometrically embedded in X.
d. The metric space (X*, p*) is complete. [Hint: If (x„) is a Cauchy
sequence from X we may assume (by taking subsequences) that
P(xn, xn+1) < 2'. If ((xnm)--1):=1 is a sequence of such Cauchy
sequences which represents a Cauchy sequence in X*, then the sequence
is a Cauchy sequence from X which represents the limit of the
Cauchy sequences from X*.]
e. Use Propositions 10, 11, and 12 of the next two sections to
complete the proof of Theorem 9.
18. Prove that the Cartesian product of two complete metric spaces is
complete.
5 Uniform Continuity and Uniformity
Let f be a mapping from the metric space (X, p) to the metric space
( Y, a). We say that f is uniformly continuous if given e > 0 there is a
8 > 0 such that for all x and x' in X with p(x, x') < 3 we have
o-(f(x), f(x')) < e. The function h defined on [0, 1) by h(x) =
x1(1 — x) is continuous but not uniformly continuous. Moreover,
this function h takes the Cauchy sequence given by xii = 1 — 1/n
into the sequence y7, = n — 1 which is not a Cauchy sequence. Thus
the image under a continuous function of a Cauchy sequence need
136
Metric Spaces
[Ch. 7
not be a Cauchy sequence. However, we have the following proposition:
10. Proposition: Let f be a uniformly continuous mapping of the
metric space X into the metric space Y. If (x„) is a Cauchy sequence
in X, then (f(x, i)) is a Cauchy sequence in Y.
A homeomorphism f between metric spaces X and Y is called a
uniform homeomorphism if both f and f -1 are uniformly continuous.
It follows from Proposition 10 that the property of being a Cauchy
sequence is preserved under uniform homeomorphisms, as is the
property of being complete. Properties which are preserved under
uniform homeomorphisms are called uniform properties. In addition
to the properties of being a Cauchy sequence and of completeness we
have also uniform continuity as a uniform property. These three
properties are not topological properties, since the function h defined
by h(x) = x/(1 — x) is a homeomorphism between the incomplete
space [0, 1) and the complete space [0, Do) which takes a Cauchy
sequence into a sequence which is not a Cauchy sequence and its
inverse carries the uniformly continuous function sin back into a
function which is not uniformly continuous on [0, 1).
Two metrics p and a for a set X of points are said to be uniformly
equivalent if the identity map from (X, p) to (X, a.) is a uniform
homeomorphism. Thus a and p are uniformly equivalent if given
E > 0 there is a 3 > 0 so that for all x and y we have p(x, y) < 8
a(x, y) < E and a(x, y) < 8
p(x, y) < e.
We conclude this section by stating the following useful extension
theorem for uniformly continuous mappings. Its proof is outlined in
Problem 20.
11. Proposition: Let (X, p) and (Y, a) be metric spaces with Y
complete. Let f be a uniformly continuous mapping from a subset E of
X into Y. Then there is a unique continuous extension g of f from E to
E; that is, there is a unique continuous mapping g from E into Y such
that g(x) = f(x) for x e E. Moreover, g is uniformly continuous.
Problems
19. Prove Proposition 10.
20. Prove Proposition 11 by the following steps:
Sec. 6]
Subspaces
137
a. If (x„) is a sequence from E which converges to a point x e E,
then (f (x„)) converges to a point y e Y (cf. Proposition 10).
b. The point y in (a) depends only on x and not on the sequence
(x„). Thus if we define y = g(x) we have defined a function g on E which
is an extension of f.
c. The function g is uniformly continuous on E.
d. If h is any continuous function from r to Y which agrees with
J on E, then h = g.
21. a. Show that the metrics in Problem 10b are uniformly equivalent.
b. Find a metric for the set of n-tuples of real numbers which is
equivalent but not uniformly equivalent to the usual metric.
c. If (X, p) is any metric space, the metric cr = p/ (1 p) is uniformly equivalent to p.
22. a. Boundedness is a metric but not a uniform property (cf.
Problem 21c).
b. A metric space X is said to be totally bounded if given e > 0
there are a finite number of spheroids of radius e which cover (that is,
whose union is) X. Show that total boundedness is a uniform property.
c. Show that total boundedness is not a topological property.
[Consider [0, 1) and [0, GO.]
6 Subspaces
If (X, p) is a metric space and S is a subset of X, then S becomes a
metric space if we restrict p to S, that is to say, if we take as the
distance between two points of S their distance as points of X. When
we consider S as a metric space with this metric, we call S a subspace
of X. For example, the rationals are a subspace of R, and the set
{(x, 0) } in R2 is a subspace isometric with R. The space C is a
subspace of L.
If E is a subset of S, then we may consider the closure of E in S
or in X; that is, we may wish to consider the set of all points of X
which are points of closure of E or else the set of all points of S
which are points of closure of E. These sets are in general different.
For example, let X be the space R and S the interval (0, 1). Then if
E is the interval (0, 1], the closure of E in R is the interval [0, i],
while its closure in S is just (0, IL that is, E is closed relative to S.
Thus we see that the closure of a set as well as the properties of a set
being closed or open are all relative to the space containing the set.
However, we do have the following relations between these notions.
138
Metric Spaces
[Ch. 7
12. Proposition: Let X be a metric space and S a subspace of it.
Then the closure of E relative to S is n S, where E denotes the closure
of E in X. A set A c S is closed relative to S f and only if A = SnE
with F closed in X. A set A c S is open relative to S if and only if
A = Sn 0 with 0 open in X.
Proof: If x is a point of closure of E in X, then it is a point of
closure of E in S if it belongs to S. Hence the closure in S of E is
E n S. If A is closed in S we must have A = S n 7, while on the
other hand if F is closed in X, then the closure in S of S n F is
Sn(SnF)cSn(Snl)cSnF,
whence S n F is closed relative to S.
If A is open relative to S, then S — A is closed relative to S and
we haveS—A =SnF, or SnA =Sn(—F) and —Fis open
in X. Similarly, if 0 is open, then SnO is the complement in S of
S n ( - 0), which is closed in S.1
13. Proposition: Every subspace of a separable metric space is
separable.
Proof: Let X be a separable metric space and S a subspace. Then
by Proposition 6 there is a countable collection {o} of open sets in
X such that each open set in X is a union of some subcollection of
{02 } . By Proposition 12 the collection {0, nS} is a countable
collection of open subsets of S such that every open subset of S is a
union of a subcollection of them. Hence S is separable by Proposition
6. I
In contrast to the relative properties discussed above, there are
some properties which are intrinsic. For example, the property of x
being a point of closure of E holds in any subspace of X containing
x and E as soon as it holds in one of them. Another such property
is that of being complete, since the definition of completeness of a
space is given in terms of points in the space. However, the following
proposition gives some relations between complete sets and closed
sets.
14. Proposition: If a subset A of a metric space X is complete,
then it is closed. On the other hand, a closed subset of a complete metric
space is itself complete.
139
Sec. 7] Baire Category
Problem
23. Prove Proposition 14. [Hint: Problem 14 is helpful.]
7 Baire Category
In this section we shall go more deeply into certain aspects of metric
spaces. We begin by defining a set E to be nowhere dense if —E is
dense. This is equivalent to stating that r contains no spheroid. For
example, the integers are nowhere dense in R, and the Cantor
ternary set is nowhere dense in [0, 1]. A set E is said to be of first
category (or meager) if it is the union of a countable collection of
nowhere dense sets. A set which is not of first category is said to be of
second category, and the complement of a set of first category is called
residual. We shall show that a complete metric space is of second
category when considered as a subset of itself. We begin with the
following theorem:
15. Theorem: Let X be a complete metric space and {O n} a
O n is not
countable collection of dense open subsets of X. Then
empty.
n
Proof: Let x1 be a point of 0 1 and Si a spheroid (of radius r i)
which is centered at x i and contained in 0 1 . Since 02 is dense, there
must be a point x2 in 02 n Si . Since 02 is open, there is a spheroid
S2 centered at x2 and contained in 02, and we may take the radius
r2 of S2 to be smaller than r 1 and smaller than r — p(x 1 , x2). Then
c Si . Proceeding inductively, we obtain a sequence S,i) of
spheroids such that 3, c Sn_i and S O n and whose radii (rn )
tend to zero. Let (x,i ) be the sequence of centers of these spheres.
Then for n, m > N we have x n e SN and xn, e SN. Hence p(x., xm) <
2rN , and (xii) is a Cauchy sequence, since r„ —> 0. By the completeness
of X there is a point x such that xi, x. Since x n e SN+1 for n > N,
we have x e EN+1 C SN C ON. Hence x e (1 On . I
16. Corollary (Baire Category Theorem): A complete metric space
is not the union of a countable collection of nowhere dense sets.
Proof: Let {E„} be a countable collection of nowhere dense sets.
Then O = —En is a dense open set, and so there is a point x e
0..
But this means x o U E. I
n
140
Metric Spaces
[Ch. 7
As an application of the Baire Category Theorem we establish the
following theorem, which is known as the uniform boundedness
principle:
17. Theorem: Let v be a family of real-valued continuous functions
on a complete metric space X, and suppose that for each x c X there
is a number Mx such that I f(x)1 < Mx for all f c F. Then there is a
nonempty open set 0 c X and a constant M such that I f(x)1 < M for
all f c and all x c O.
Proof: For each integer m, let E,„,f — {x: 1 f(x)1 < m}, and set
En, =
Emj. Since each f is continuous, E,,,,f is closed, and con-
n
sequently En, is closed. For each x e X, there is an m such that
I f(x)! < m for all f e 5; that is, there is an m such that x e Ent.
Hence
X= 0 Em.
M=1
Since X is a complete metric space, there is a set En, which is not
nowhere dense. Since this En, is a closed set, it must contain some
spheroid O. But for every x c O we have j f (x)1 < m for all f e F. I
Problems
24. a. Prove that a closed set F is nowhere dense if and only if it contains no open set.
b. Prove that E is nowhere dense if and only if for any nonempty
open set 0 there is a spheroid contained in 0 — E.
25. a. Prove that if E is of first category and A c E then A is also of
first category.
b. Prove that if (En ) is a sequence of sets of first category then
En,
is
also of first category.
U
26. If X is a complete metric space and E is a set of first category in E,
then —E is dense in X.
27. a. Show that on [0, 1] there is a nowhere dense closed set having
Lebesgue measure 1 — 1/n.
b. Construct a set of first category on [0, 1] which has measure 1.
141
Sec. 7] Baire Category
28. A point x in a metric space is called isolated if the set {x} is open.
Prove that a complete metric space without isolated points has an uncountable number of points.
29. a. If Ê is dense and F a closed set contained in E, then F is nowhere
dense.
b. If E and Ê are both dense, then at most one of them is an
c. The set of rational numbers in [0, 1] is not a 9,5 .
d. Is there a real-valued function on [0, 1] which is continuous on
the rationals and discontinuous on the irrationals?
30. Let C be the space of continuous functions on [0, I] and set
F,,,
{f: (3x0) with If(x) — f(x 0)1
n(x — x o) for all x, xo
x < 11.
a. Show that F„ is a closed subset of C.
b. Show that F„, is nowhere dense. [Hint: Any g e C can be
approximated to within e/2 by a polygonal function ço, and ço can be
approximated to within E/2 by a polygonal function 1,t whose right-hand
derivative is everywhere greater than n in absolute value.]
c. Show that the set D of continuous functions which have a finite
derivative on the right at at least one point of [0, 1] is a set of first category
in C.
d. There is a nowhere differentiable continuous function on [0, 1].
31. Let (f,,) be a sequence of continuous functions from a metric space
X to a metric space Y such that for each x in X the sequence (f,i (x))
converges to a pointf(x) in Y. Thenf defines a mapping of X into Y. Show
that there is a set E of first category in X such that f is continuous at each
point x e X — E. Hence if X is complete, f is continuous at a dense set of
.
points in X. [Hint: Let = fx: 0- (fa(X), fk(X)) < 1/m for all k
Let Fn°,„, be the interior of F„,,„. Set E = U iFn,m
tt,m
8
1
Topological Spaces
Fundamental Notions
In Chapter 7 we discussed properties of metric spaces and found
that a number of theorems depended only on the properties of open
and closed sets. In the present chapter we study spaces in which the
notion of an open set is fundamental and other notions are defined
in terms of it. Such spaces are called topological spaces and are more
general than metric spaces. Perhaps you ask: Why not stick to metric
spaces? It is true that metric spaces are simpler, but there are many
examples of spaces of functions where certain topological notions
have a natural meaning not consistent with the topological concepts
derived from any metric that might be put on the space. Important
examples are given by the weak topologies in Banach spaces. We
proceed with a formal definition.
Definition: A topological space (X, 3) is a nonempty set X of
points together with a family 5 of subsets (which we shall call open)
possessing the following properties:
i. X E 3, 0 E 3;
ii. 0 1 E 5 and 02 e 3 imply 0 1 n 02 E 3;
iii. O , E 5 implies U Oa E 3.
The family 5 is called a topology for the set X.
The properties in this definition are all satisfied by open sets in a
metric space (X, p), and hence to each metric space (X, p) we can
142
Sec.
11
Fundamental Notions
143
associate a topological space (X, 3), where 3 is the family of open
sets in (X, p). A topological space which is associated in this manner
to some metric space is called metrizable, and the metric p is said
to be a metric for the topological space. From a logical standpoint
the distinction between a metric space and its associated topological
space is essential, since different metric spaces can give rise to the
same topological space. Two such metric spaces are of course equivalent, and we often disregard the distinction between a metric space
and its associated topological space in cases where no confusion is
likely to arise. In some cases we shall not trouble even to distinguish between a topological space ( X, 3) and the set X of its points,
using X to denote both. It is to be remembered, however, that metric
and topological spaces are couples, and in many cases it is necessary
to express this fact explicitly.
Given any set X of points there are always two topologies that can
be defined on X. One is the trivial topology in which the only open
sets are 0 and X. A second possible topology is the discrete topology:
Every subset of X is an open set.
In terms of the notion of open set we may define other topological
properties, for example: A subset F of X is called closed if F is open.
1. Proposition: The sets 0 and X are closed. The union of any
two closed sets is closed. The intersection of any collection of closed
sets is closed.
If A is a subset of a topological space (X, 3), we can define a
topology 8 for A by taking for s those subsets B of A for which
there is a set 0 E 3 such that B = A n O. We call 8 the topology
inherited from 3 and call the topological space (A, s) a subspace
of (X, 3). This is consistent with our usage concerning metric spaces.
A sequence (x,t) in a topological space is said to converge to the
point x, or to have the limit x, if given any open set O containing x
there is an integer N such that xn E O for all n > N. Similarly, a
sequence (xn,) is said to have x as a cluster point if given any open
set 0 containing x and any integer N there is an integer n > N such
that xn, g O. Thus if (x n) has a subsequence which converges to x,
then x is a cluster point of (xn). The converse of this statement is
not always true in an arbitrary topological space.
In view of Proposition 7.7 we may give a definition of a continuous
function on a topological space which agrees with the usual concept
144
Topological Spaces
[Ch. 8
of continuity in the case in which the topological space is also a
metric space.
Definition: A mapping f of a topological space (X, 3) into a topological space (Y, 8) is said to be continuous if the inverse image
of every open set is open, that is, if 0 e s = f'[0] e 3.
It should be noted that if f is a continuous function on a space X,
then the restriction fi of f to a subspace A of X is a continuous
function on A, for f 1-1 [0] =A n f -1 [0] and must be open for open
0 by the continuity off and the definition of open sets in A.
Definition: A homeornorphisrn between two topological spaces is
a one-to-one continuous mapping of X onto Y for which f -1 is continuous. The spaces X and Y are said to be homeornorphic if there is a
homeomorphism between them.
From an abstract point of view two homeomorphic topological
spaces are indistinguishable, the homeomorphism amounting to a
mere relabeling of the points of one set by the points of a second set.
Thus the concept of homeomorphism plays the same role for topological spaces that isometry plays for metric spaces and isomorphism
plays for algebraic systems.
Suppose that 3 and s are two topologies for the same set X.
Then s is said to be stronger than 3 if D 3. Thus 8 is stronger
than 3 if and only if the identity mapping of (X, g) into (X, 3) is
continuous. The trivial topology for a set X is the weakest possible
topology on X, while the discrete topology is the strongest possible
topology.
If 8 and 3 are two topologies for a set X, then s n 3 is also a
topology. In fact, if {3alif is any collection of topologies, then
3a
n
is a topology. Thus if e is any collection of subsets of X, then the
intersection of all topologies containing e is a topology containing e.
This topology is the weakest topology such that all of the sets of e
are open.
2. Proposition: Let X be a nonempty set of points and e any
collection of subsets of X. Then there is a weakest topology 3 which
contains C.
Sec. 21
Bases and Countability
145
Problems
1. a. Given a set X, can you define a metric on X so that the associated
topological space is discrete? Trivial?
b. Let X be a space with a trivial topology. Find all continuous
mappings of X into R.
c. Let X be a space with a discrete topology. Find all continuous
mappings of X into R.
2. We say that x is a point of closure of a set E if for every 0 e 3 with
x c 0 we have 0 n E 0. Prove that the set F of points of closure of
E is the intersection of all closed sets containing E.
3. Prove that a set A C X is open if and only if given x c A there is an
open set 0 such that x c 0 C A.
4. Prove that a mapping of X into Y is continuous if and only if the
inverse image of every closed set is closed.
5. Show that if f is a continuous mapping of X into Y and g a continuous mapping of Y into Z, then g of is a continuous mapping of X
into Z.
6. Prove that the sum and product of two real-valued continuous
functions are themselves continuous.
7. a. Let F be a closed subset of a topological space and (x,) a sequence
of points from F. Show that if x is a cluster point of (x„) then x e F.
b. Show that iff is continuous and x = lim xn then (f(xn)) has the
limit f(x).
c. Show that if f is continuous and x is a cluster point of (xn ) then
f(x) is a cluster point of (f(xn)).
2
Bases and Countability
A collection 43 of open subsets of a topological space X is called a
base for the topology 3 of X if for each open set 0 in X and each
x c 0 there is a set B e 43 such that x eBc O. A collection 63x of
open sets containing a point x is called a base at x if for each open
set 0 containing x there is a B e 43z such that x eBc O. Thus a
collection 43 of open sets is a base if and only if it contains a base at
each point x e X. If X is a metric space, the spheroids form a base,
and the spheroids centered at x form a base at x.
If 43 is a base for the topology 3, then 0 e 3 if and only if for
each X e 0 there is a B e 63 with X eBc O. The "only if" part of
146
Topological Spaces
[Ch. 8
the statement follows from the definition of a base, while if for each
x in 0 we have x eBc 0, then 0 must be the union of those B in
133 with B C 0, and 0 is open since it is a union of open sets.
We often find it convenient to specify a topology for a set X by
specifying a base 63 of open sets and using the preceding criterion to
define the open sets. Conditions on a collection 133 in order that it
be a base for some topology are given by the following proposition:
3. Proposition: A collection 63 of subsets of a set X is a base for
some topology on X if and only if each x in X is contained in some B,
and if x e B1 n B2 then there is a B3 e 63 such that x e B3 C B1 n B2Proof: That these conditions are necessary follows from the
definition of a base and the fact that X and B 1 n B2 must be open.
Suppose now that 63 satisfies these conditions. If we set 3
{0: (x e 0)(V3 e 63)(x c B c 0)1, then 0 e 3, and the union of sets
in 3 will be in 3, and the first condition on 63 implies that X e 3.
To show that the intersection of two sets 0 1 and 02 in 3 is itself in 3,
let x e 0 1 n 02. Then there are sets Bl and B2 in ea such that x e Bi C
0 1 and x e B2 C 02. Let B3 be a set in 63 with x g B3 C B1 n B2Then x e B3 C 01 n 02, and it follows that 0 1 n 02 is open. 1
A topological space is said to satisfy the first axiom of countability
if there is a countable base at each point. Every metric space satisfies
the first axiom of countability, since the spheroids centered at x and
having rational radii are countable in number and form a base at x.
A space is said to satisfy the second axiom of countability if there
is a countable base for the topology. Thus Proposition 7.6 states that
a metric space satisfies the second axiom of countability if and only
if it is separable.
Problems
8. a. Let 63 be a base for the topological space (X, 3). Then x e r if
and only if for every B e 63 with x e B, there is ay eBn E.
b. Let X satisfy the first axiom of countability. Then x e E if and
only if there is a sequence from E which converges to x.
c. Let X satisfy the first axiom of countability. Then x is a cluster
point of a sequence (xn ) from X if and only if (x7,) has a subsequence
which converges to x.
Sec. 3]
The Separation Axioms and Continuous Real-Valued Functions
147
9. A mapping f of a topological space X into a topological space Y is
said to be continuous at x if for any open set 0 containing f(x) there is an
open set U containing x such that f[U] C O.
a. Show that f is continuous if and only if it is continuous at each
point of X.
b. Let
a base at x and ey be a base at y = f(x). Then f is
x
if
and
only if for each C c e y there is a B y 63, such that
continuous at
B Cf 1 [C}.
63x be
10. Let 0 be any collection of subsets of X. Let 63 consist of X and all
finite intersections of sets in e. Show that 133 is a base for the weakest
topology which contains C.
11. Let X be an uncountable set of points, and let 3 consist of the
empty set and all subsets of X whose complement is finite. Show that 3 is
a topology for X and that the space (X, 3) does not satisfy the first axiom
of countability.
12. Let X be the set of real numbers, and let 63 be the set of all intervals
of the form [a, b). Show that 63 is the base of a topology 3 for X. (This
topology is called the half open interval topology.) Show that (X, 3)
satisfies the first but not the second axiom of countability and that the
rationals are dense in X. Is (X, 3) metrizable?
-
3 The Separation Axioms and Continuous
Real-Valued Functions
The properties of topological spaces are in general quite different
from those of metric spaces, and it is often convenient to suppose that
our topological space satisfies some additional conditions which
are true in metric spaces. Consider the following set of conditions
on a topological space:
T1. Given two distinct points x and y, there is an open set which
contains y but not x.
T2. Given two distinct points x and y, there are disjoint open
sets O l and 0 2 such that x c O l and y c 0 2.
T3. In addition to T1 , given a closed set F and a point x not in
F, there are disjoint open sets O l and 02 such that x c O l and
F c 02.
T4. In addition to T1 , given two disjoint closed sets F1 and F25
there are disjoint open sets 0 1 and 0 2 such that F1 c 0 1 and
F2 C 02.
148
Topological Spaces
[Ch. 8
These are called separation axioms, and all are satisfied in a metric
space. A topological space which satisfies T2 is called a Hausdorff
space, one which satisfies T3 is called a regular space, and one which
satisfies T4 is called a normal space. The following proposition tells
us that the condition T 1 is equivalent to the statement that each set
consisting of a single point is closed. With this in mind we see that
T4 T3 T2 = T1.
4. Proposition: A topological space X satisfies T 1 if and only if
every set consisting of a single point is closed.
Proof: If each set {x} is closed, given two distinct points x and y,
we may take 0 = —{x} . Then 0 is an open set containing y but
not x. Suppose that T 1 holds. Each y c {x} is contained in an open
set 0 c {x}. Thus the set — {x} is the union of the open sets contained in it and so must be open. Hence {x } is closed. I
Important consequences of normality are the following propositions, whose proofs are left to the reader (Problems 18 and 19).
5. Urysohn's Lemma: Let A and B be disjoint closed subsets of a
normal space X. Then there is a continuous real-valued function f
defined on X such that 0 <f < Ion X while f . 0 on A andf 1 011 B.
6. Tietze's Extension Theorem: Let X be a normal topological space,
A a closed subset of X, and! a continuous real-valued function on A.
Then there is a continuous real-valued function g on X such that
g(x) = f(x) for x A.
If X is any set of points and a is any collection of real-valued
functions on X, there is always a weakest topology on X such that
every function in a is continuous, for let C = {E: E = .f-1 [0], f c a
and 0 an open subset of RI and apply Proposition 3. This topology
is called the weak topology generated (or induced) by a. If the
functions in a are all continuous in some topology 3, then the weak
topology generated by a is in general weaker than 3. For these
topologies to coincide, there must be enough continuous real-valued
functions. One condition which guarantees this is the following:
Given a closed set F and a point x o F there is a function f c a with
f(x) = 1 and f 0 on F. If this condition is satisfied when a is
Sec. 3]
The Separation Axioms and Continuous Real-Valued Functions
149
the space C(X) of all continuous real-valued functions on X, then we
say that X is completely regular, provided that X satisfies T1 . It
follows from Urysohn's lemma that each normal space is completely
regular, and it is easy to see that complete regularity implies regularity. For this reason the condition defining complete regularity is
sometimes called T31.
The following theorem characterizes separable metric spaces. It can
be proved using the concepts of the next section (see Problem 30).
7. Urysohn Metrization Theorem: Every normal topological space
satisfying the second axiom of countability is metrizable.
Problems
13. a. Show that every metric space is Hausdorff.
b. Show that every metric space is normal. [Hint: If F 1 and F2 are
disjoint closed sets, let 0 1 = {x: p(x, F1) < P(x, F2)} and 02 =
{x: p(x, F 2) < p(x, F1)}
14. Let X consist of the numbers [0, 1] and an element 0', and take
as a basis for a topology the sets (a, 13), [0, 0), (a, 1], and {O' } u (0, 0).
Show that these are a base for a topology on X which is T1 but not Hausdorff.
15. Let f be a real-valued function on a topological space X. Show that
f is continuous if for each real number a the sets {x: f(x) < a} and
{x: f(x) > a} are open. Show thatf is continuous if for each real number
a the set {x: f(x) > a} is open and the set {x: f(x) > a} is closed.
16. If f and g are continuous real-valued functions on a topological
space X, then the functions f + g, fg, f V g, and f A g are continuous.
(Here as usual (f V g)(x)
max f (x), g(x).)
17. Let (A) be a sequence of continuous functions from a topological
space X to a metric space Y. If (fit ) converges uniformly to a function f,
then f is continuous.
18. a. Show that a Hausdorff space is normal if and only if given a
closed set F and an open set 0 containing F there is an open set U such
that F c U and U C O.
b. Let F be a closed subset of a normal space contained in an open
set O. By repeating the result in (a) indefinitely, show that it is possible
to construct a family {Ur} of open sets, one corresponding to each
rational in (0, 1) of the form r =p • 2, such that F C UT C 0 and
V, C U, for r < s.
150
Topological Spaces (Ch.
8
c. Let {Ur} be the family constructed in (b) with U i = X. Let f be
the real-valued function on X defined by f(x) = inf {r: x e Ur} . Then fis
a continuous function, 0 < f < 1, with f m 0 on F and f m 1 on Ô.
d. Let X be a Hausdorff space. Prove that Xis normal if and only if
for every pair of disjoint closed sets A and B on X there is a continuous
real-valued function f on X such that 0 < f < 1, f 0 on A and f -=-- 1
on B.
19. Prove Tietze's Extension Theorem by using the following steps:
a. Let h = f/(1 ± Ifl). Then Ihj < 1.
b. Let B = {x: h(x) < —}, C = {x: h(x) > il . Then by the
Urysohn lemma there is a continuous real-valued function h 1 on X
which is — i on B and on C, while Ih i (x)1 < for all x c X. Clearly
Ih(x) — hi(x)1 < 1 for all x c A.
c. By induction there is a continuous function hn on X such that
2n-1
h(x)1 < 3n for all x e X and Ih(x) —
2n
h(x) I < k, for all x e A.
d. The sequence (ha) is uniformly summable to a continuous
function k on X, lki < 1 and k = h on A. Set g = k/(1 — jkl).
20. Let a be a family of real-valued functions on a set X. Show that a
base for the weak topology on X generated by a is given by the sets of the
form Ix: If(x) — f(y) l < e, for some e > 0, y e X, and some finite set
fi, . . . ,f„ of functions in 5} . Show that this topology is Hausdorff if and
only if given any pair {x, y} of distinct points in X there is an f e a with
f(x) X f( y).
21. Let 5 be a family of real-valued continuous functions on a topological space (X, 3). Show that the weak topology generated by 5 is 3
if for each closed set F and each x 0 F there is an f e 5 with f(x) = 1
and f m. 0 on F.
22. Show that every completely regular space is regular.
23. Prove that every subset of a Hausdorff space is Hausdorff.
4
Product Spaces
and ( Y, 8) are two topological spaces, we define a topology on the product X X Y by taking as a base the collection of all
sets of the form 0 1 X 02, where 0 1 e 3 and 02 e S. This is called
the product topology for X X Y. If X and Y are metric spaces, then
the product topology agrees with the topology given by the product
metric. If (XOE, 5OE ) is any indexed family of topological spaces, we
If (X, 5)
Sec. 4] Product Spaces
151
define the product topology on X Xa by taking as a base all sets of the
«
form X Oa where Oa e aa and Oa = Xa except for a finite number
«
of a. If the X. are all the same space X and indexed by an index set
A, we write X" for X X.
«
If (XŒ, 3a ) is a collection of topological spaces and Y their product,
we define for each a a mapping Ira (called a projection) of Y into Xa
by letting 71-Π(x) be the a-th coordinate of x. Each 7ra is continuous,
and the product topology in Y is the weakest topology such that each
ra is continuous.
If A is countable and X is metrizable, then X" is metrizable. Since
only the number of elements in A is important in determining X A ,
we usually write X' (or X N) for a countable product. If we denote the
discrete space with two elements by 2, then 2° is homeomorphic to
the Cantor set. If we use to to denote not only a countable set but
also a countable set with discrete topology, then cow is a topological
space which is homeomorphic to the set of irrational numbers.
If I = [0, 1], then /A is called a cube. The cube lw is metrizable
and is called the Hilbert cube. Let X be any set and a family of
functions f on X with 0 < f < 1 and such that for any two distinct
points x and y in X there is an f c iI with f(x) f(y). Then I can be
identified with a subset of Ix if we let each f correspond to the
element whose x-th coordinate is f(x). The mapping of g into /which
takes f into f (x) is simply the projection 7rx restricted to the image of
g. The topology that g inherits as a subspace of /x is called the
topology of pointwise convergence.
On the other hand, X can be identified with a subset of P by
letting each x correspond to the element whose f-th coordinate is f (x).
The topology of X as a subspace of /5 is the weak topology generated
by . If X is a topological space and each f in a is continuous, then
the mapping of X onto its image in P is continuous, and, if has the
property that for each closed subset F c X and each x 0 F there
is an f c with f (x) --= 1 and f F---- 0 on F, then X is homeomorphic
with its image in Jr
Problems
24,
space.
Prove that the direct product of Hausdorff spaces is a Hausdorff
152
Topological Spaces
[Ch. 8
25. Prove that the collections taken for bases in defining product
topologies satisfy the conditions of Proposition 3. Show that if (X, p) and
( Y, o. ) are two metric spaces then the product topology on X X Y is the
same as the topology induced by the product metric.
26. Show that X A is the set of all functions mapping A into X with a
base for its topology given by open sets of the form If: f(a i ) e 0 1 ,
f(a2) e 02, • • • , f(an) E On}, where {a 1 , . . , an } is some finite subset of
A and {O 1 ,
, On} a finite collection of open subsets of X. Prove that a
sequence (fn ) converges to f in XA if and only iffn (a) converges to f( a) for
each a in A.
27. Show that, if X is metrizable and A is countable, then XA is
metrizable. [Hint: X can always be metrized by a bounded metric p.
Define a metric a on XA by o- (x, y) =
2 p(x a, ya).]
E
cxeA
28. Show that each projection re, is continuous and that the product
topology on X A is the weakest topology such that each Ira is continuous.
29. Show that r is homeomorphic to the Cantor ternary set.
30. a. Show that the correspondence described in the text between a
topological space X and its image in F1 is a homeomorphism if g has the
property that, given a closed F and x g F, there is an f e g with f[F] = 0
and f(x) = 1.
b. Show that, if X is a normal space satisfying the second axiom
of countability, then we can find a countable family 5 of continuous functions with the property in (a).
c. Prove the Urysohn Metrization Theorem.
5 Connectedness
A topological space X is said to be connected if there do not exist
two nonempty disjoint open sets 0 1 and 02 such that X = 0 1 u 0 2 .
Such a pair of open sets is called a separation of X. Since each set is
the complement of the other, they are closed sets as well as open sets.
Any pair of disjoint nonempty closed sets whose union is X is a
separation for X, since each of these sets must also be open. A space
X is connected if and only if the only subsets of X which are both
open and closed are the sets 0 and X. A subset E of X is said to be
connected when it is a connected space in the topology it inherits
from X: Thus E is connected if there do not exist open sets 0 1 and
02 in X, both meeting E such that Ec 0 1 u 02 and E n 0 1 n 0 2 = 0.
153
Sec. 5] Connectedness
8. Proposition: Let f be a continuous mapping of a connected space
X onto a topological space Y. Then Y is connected.
Proof: Let 0 1 and 0 2 be a separation of Y. Then f -1 [ 0 1 ] and
f -1 [02 ] are disjoint open sets in X whose union is X. Since f is onto,
neither f [0 1 ] nor f —1 [0 2] is empty, and so this pair is a separation
of X. Thus if Y is not connected, Xis not connected, and the proposition follows by contraposition. 1
9. Proposition: Let f be a real-valued continuous function on a
connected space X. Let x and y be two points in X and c a real number
such that f(x) < c < f(y). Then there is az e X such that f(z) = c.
Proof: If f does not assume the value c, then f—'[(— oo, c)] and
f -1 [(c, co)] are disjoint open sets whose union is X. They are nonempty, since x belongs to the first and y to the second. Thus X is not
connected. 1
10. Proposition: A subset E of R is connected if and only if it is
either an interval or a single point.
Problems
31. Let A be a connected subset of a topological space, and suppose
Then B is connected.
32. a. Let E be a connected subset of R having more than one point.
Prove that E is an interval. [If x and y are in E, x < y, then [x, y] C E.
Let a = inf E, b = sup E. Then (a, b) CE c [a, M.]
b. Prove that an interval in R is connected. [Let I = (a, b) and let
0 be a subset of I which is both open and closed in I. Show that
sup {y: (x, y) C 0} = b, and use Problem 31 to take care of nonopen
ACBCA.
intervals.]
33. A space X is said to be arcwise connected if given two points x and
y in X there is a continuous map f of [0, 1] into X such that f(0) = x and
f(I)
Y
a. Show that an arcwise connected space is connected.
b. In the plane R 2 consider the subspace
X
{(x, y): x
0, — 1 < y < I} u
{(x, y): y = sin 1/x,0 < x < 1 } .
Show that X is connected but not arcwise connected.
154
Topological Spaces
[Ch. 8
c. Show that each connected open set G in Rn is arcwise connected.
[Let x e G and let H be the set of points in G which can be connected
to x by a polygonal arc. Then H is open and closed in G.]
34. Let X be a topological space and {E,2} a collection of connected
subsets of X each of which contains a fixed point x. Then U Ea, is connected.
35. Show that the direct product of connected topological spaces is
itself connected.
*6 Absolute gs's
The purpose of this section is to characterize those topological
spaces which are homeomorphic to complete metric spaces: They are
the metrizable spaces whose dense homeomorphic images in every
Hausdorff space are ga's. We state this property as a theorem. Some
suggestions for its proof are given in Problems 37 and 38 and an
application is given by Problem 39. This characterization is utilized
in Chapter 15 to show that in a sense the complete metric spaces are
absolutely measurable.
11. Theorem: Let X be a complete metric space and if, a homeomorphism of X onto a dense subset E of a Hausdorff space F. Then E
is a g6. Conversely, if E is a 96 in a complete metric space, then E is
homeomorphic to a complete metric space.
Problems
36. Show that if the Hausdorff space F of Theorem II is metrizable,
then we do not need to require that E be dense.
37. Let E be a dense subset of a Hausdorff space F and g a homeomorphism of E onto a complete metric space X. Then E is a Ç. (For
each positive integer n, let On be the subset consisting of ally c F such that
some open set containing y has an image under g whose diameter is less
than 1/n. Then On is an open set, and the homeomorphism g can be
00
extended to a continuous mapping h of G = fl O n into X, since X is
n-.=1
complete. But g-1 o h must be the identity, whence E = G.)
Sec.
7]
Nets
155
n
38. Let ( Y, a) be a complete metric space, and let E =
O„ be
a 95. For each n construct a continuous real-valued function ça,, on Y with
0 < ça,, < 1 such that 0„ = {x: ço„(x) > 0}. For x and y in E let
P(x, Y) = o(x, Y) ±
E 2-4 1,pn(x) —
X [1,,,n(x) — my) + kp.(x)'.(y)11 — i.
Then p is a metric for E which is equivalent to a- and makes E a complete
metric space.
39. Let E be a dense go in [0, 1], and suppose that E is also dense. Then
E is homeomorphic to the product space co‘'). [Hint: Show that E =
On,
where 0, 1_ 1 c On, each component of 0„ containing infinitely many
components of 0„+1, and each component of On having a diameter in the
complete metric for E which tends to zero. Then the natural correspondence between elements of Low and decreasing sequence (4,) of components
of On is a homeomorphism onto E.]
n
*7 Nets
By a directed system we mean a set A together with a relation <
satisfying the following conditions:
i. If a < fi and [3 -< i, then a -< 'Y.
ii. If ce, j c A, there is ay eA with a -< Y and fi < -Y.
One example of a directed system is the set N of positive integers
with -< replaced by Another commonly used directed set is the
set of all open sets containing a point x, with 0 1 < 0 2 defined to
mean 01 D 02.
A net is a mapping of a directed system into a topological space X.
If the directed system is the integers, we have a sequence, and nets
may be thought of as generalizations of sequences. We usually write
xa for the value of the net at a and (.xa for the net itself. A point
x e X is said to be the limit of a net (xa if for each open set 0
containing x there is an a o c A such that xa e 0 for all a > a o. A
point x is called a cluster point of (xa ) if given 0 containing x and
given a e A there is a fi > a such that x0 e 0. For sequences these
notions coincide with our earlier notions of limit and cluster point.
12. Proposition: A point x is a point of closure of a set E if and
only if it is the limit of a net (xOE ) from E.
156
Topological Spaces
[Ch. 8
Proof: The "if" part follows directly from the definitions of
limit and point of closure. Hence we assume x is a point of closure
of E. We take as our directed system A the collection of open sets
which contain x and set 0 1 -< 0 2 if 01 D 02. Since x is a point of
closure of E, for each 0 c A there is a point xo in 0 n E. Then (x0 )
is a net from E, and it converges to x, since, given 0 containing x, we
have x0 e 0 for all 0' > 0. I
Problems
40. Prove that X is Hausdorff if and only if every net in X has at most
one limit. [To prove the "if" part, let x and y be two points which cannot
be separated and let the directed system be the collection of all pairs
(A, B) of open sets with x 6 A, y 6 B. Choose x(A,B) to be in A n B and
show that both x and y are limits of this net.]
41. Prove that a function f from X to Y is continuous at x if and only if
for each net (xa ) which converges to x the net (f(x 0)) converges to f(x).
42. Let X be any set and f a real-valued function on X. Let A be the
system consisting of all finite subsets of X, with F < G meaning F c G.
For each F A, let yF = E f(x). Prove that the net (yF) has a limit if and
xeF
only iff(x) = 0 except for x in a countable subset { xn} and Elf xn)j < oo .
(
In this case, lirn YF
=
E f(xn).
n
43. Let X X X„, Then a net (xs) in X converges to x if and only if
each coordinate of x(i converges to the corresponding coordinate of x.
9
Compact Spaces
I Basic Properties
Many of the important properties of the interval [0, 1] follow from
the Heine-Borel theorem. We introduce a class of topological spaces
in which the conclusion of the Heine-Borel theorem is valid and show
that many properties of [0, 1] are also true for these spaces. These
spaces are called compact spaces. To give an exact definition, we
say that a collection cu of open sets in a topological space is an open
covering for a set K if K is contained in the union of the sets in cu.
A topological space X is said to be compact if every open covering
cu. of X has a finite subcovering, that is, if there is a finite collection
such that X = U Oz. A subset K of a topot—
logical space is called compact if it is compact as a subspace of X.
In view of the definition of the topology of a subspace, this is equivalent to saying that a subset K of Xis compact if every covering ctt of K
by open sets of X has a finite subcovering. The Heine-Borel theorem
states that every closed and bounded subset of real numbers is compact.
If cu is an open covering of a space X, then the collection g• of
complements of sets in 'it is a collection of closed sets whose intersection is empty, and conversely. Thus a space X is compact if and
only if every collection of closed sets with an empty intersection has a
finite subcollection whose intersection is empty. A collection of
{0 1, 02,
ON} C 91
157
158
Compact Spaces
[Ch. 9
sets in X is said to have the finite intersection property if any finite
subcollection of g has a nonempty intersection. Hence we have the
following proposition:
1. Proposition: A topological space X is compact if and only if
every collection g of closed sets with the finite intersection property
has a nonempty intersection.
The notion of compactness is intimately connected with that of
being closed, as the following proposition shows. Thus compactness
may be viewed as an absolute type of closedness.
2. Proposition: A closed subset of a compact space is compact.
A compact subset of a Hausdorff space is closed.
Proof: Let X be compact, F a closed subset of X, and cu. an open
covering for F. Then cu, u {P} is an open covering for X and so must
have a finite subcovering {P, 0 f , . , ON }. Then the sets 0 1 , 02, . • • ,
ON cover F, and so cu has a finite subcovering.
Suppose now that X is a Hausdorff space and K a compact subset
of X. We shall show that k is open. Let y c K. Since X is Hausdorff,
for each x c K there are disjoint open sets Ox and Nx such that
x e Ox and y c N. The sets (0x : x c K) form an open covering of K,
and so there is a finite subcollection fO xi , Ox2 ,
, Ox) which
covers K. Let
N=
n
Nx.
Then N is an open set containing y and not meeting any of the sets
O. Since KcU O, N does not meet K and so is contained in K.
Thus k is open and K closed. I
3. Corollary: Every compact set of real numbers is closed and
bounded.
Proof: Since R is Hausdorff, a compact subset K of R must be
closed. Moreover, the intervals In = (—n, n) form an open covering
of K and so a finite number of them must cover K. Hence K must be
bounded. I
4. Proposition: The continuous image of a compact set is compact.
Sec. 2]
Countable Compactness and the Bolzano-Weierstrass Property
159
Proof: Let f be a continuous function which maps the compact
set K onto a topological space Y. If 'It is an open covering for Y, then
the collection of sets f -1 [0] for all 0 e cu is an open covering of K.
By the compactness of K, there are a finite number 0 1 , , On of
sets of Cu.such that the sets f [Os] cover K. Since f is onto, the sets
0 1 , , 0„ cover Y. I
5. Proposition: A one-to-one continuous mapping of a compact
space onto a Hausdorff space is a homeomorphism.
Proof: Let X be compact, Y Hausdorff, and f a one-to-one
continuous mapping onto Y. In order to show that f is a homeomorphism it is only necessary to show that it carries open sets into
open sets or equivalently closed sets into closed sets. But if F is a
closed subset of X it is compact by Proposition 2, and so f[F] is
compact by Proposition 4 and so must be closed by Proposition 2. I
Problems
1. a. Prove that a compact Hausdorff space is regular.
b. Prove that a compact Hausdorff space is normal.
2. Let f be a continuous mapping of the compact space X onto the
Hausdorff space Y. Then any mapping g of Y into Z for which g 0 f is
continuous must itself be continuous.
3. a. Prove that if (X, 3) is a compact space then (X, 5 1 ) is compact
for all 31 weaker than 3.
b. Show that if (X, 5) is a Hausdorff space then (X, 3 2) is a
Hausdorff space for all 3 2 stronger than 3.
c. Show that if (X, 5) is a compact Hausdorff space then any weaker
topology is not Hausdorff and any stronger topology is not compact.
2 Countable Compactness and the
Bolzano-Weierstrass Property
A weaker notion than compactness is countable compactness:
A space X is said to be countably compact if every countable open
covering has a finite subcovering. Since every space satisfying the
second axiom of countability has the property that every open
160
Compact Spaces
[Ch. 9
covering of it has a countable subcovering, it follows that countable
compactness is equivalent to compactness in the presence of the
second axiom of countability. The proof of Proposition 4 applies in
the countably compact case to give the following proposition:
6. Proposition: The continuous image of a countably compact
space is countably compact.
A topological space X is said to have the Bolzano-Weierstrass
property if every sequence (x„) in X has at least one cluster point,
that is, if there is an x e X such that for each open set 0 containing x
and for each N there is an n > N with xn c O.
7. Proposition: A topological space has the Bolzano-Weierstrass
property if and only if it is countably compact.
Proof: We first observe that X is countably compact if and only
if every countable family of closed sets with the finite intersection
property has a nonempty intersection. Suppose now that X has the
Bolzano-Weierstrass property and that g• = {Fij is a countable
family of closed sets with the finite intersection property. Since the
intersection H„ =
n 1,, is empty for no n, we may choose for each
k=1
n an element x n c H. By the Bolzano-Weierstrass property, the
sequence (x,a ) has a cluster point x. But x„ e F for all n > i, and so
x must belong to F, since F, is closed. Thus x belongs to every F and
so to their intersection.
Suppose, on the other hand, that X is countably compact and that
(xi) is a sequence from X. Let B„ be the set fx„, x.±1, , 1. Then
{} is a countable collection of closed sets with the finite intersection
property, and so there is a point x which belongs to
B. The point
x is a cluster point of the sequence, since given N and any open set 0
containing x we have x c RN, and so there must be an x n e 0 with
n > N.
n
A somewhat old-fashioned concept which resembles the BolzanoWeierstrass property is that of sequential compactness. A space X is
said to be sequentially compact if every infinite sequence from X has
a convergent subsequence. The relationship of sequential compactness to countable compactness is given by the following proposition.
Sec. 2]
Countable Compactness and the Bolzano-Weierstrass Property
161
Problem 6 gives an example of a space which is sequentially compact
but not compact, and Problem 27 an example of a compact space
which is not sequentially compact. Problem 7 shows that a space
may be compact without being either separable or first countable.
8. Proposition: A sequentially compact space is countably compact. A countably compact space satisfying the first countability axiom
is sequentially compact.
Proof: Sequential compactness implies the Bolzano-Weierstrass
property, which is equivalent to countable compactness. The second
part is an immediate consequence of Problem 8.8c. I
9. Proposition: Let f be a continuous real-valued function on a
countably compact space X. Then f is bounded and assumes its maximum and minimum.
This proposition can be proved by using Proposition 6 and the
fact that every countably compact subset of R is closed and bounded.
However, we can give a direct proof which proves more. A real-valued
function f on a topological space is called upper semicontinuous if
for each real number a the set fx: f(x) < cej is open. If f is continuous, both f and —f are upper semicontinuous. This implies that
Proposition 9 is a corollary of the following proposition:
10. Proposition: Let f be an upper semicontinuous real-valued
function on a countably compact space X. Then f is bounded (from
above) and assumes its maximum.
Proof: The sets O n = {x: f(x) < n} form a countable open
covering for X and so there must be a finite subcovering
{O,. . , O} . But this implies X c ON. Hence f(x) < N for all
x, and f is bounded from above. Let = sup ( f(x): x e X]. Then
the sets
= {x: f(x)
—
form a countable collection of closed sets with the finite intersection
property. Hence there is a y belonging to every F. Then f(y) = 0, and
f assumes its maximum at y. I
162
Compact Spaces
[Ch. 9
11. Proposition (Dini): Let (fn) be a sequence of upper semicontinuous real-valued functions on a countably compact space X,
and suppose that for each x e X the sequence (f„(x)) decreases monotonically to zero. Then (fn) converges to zero uniformly.
Proof: Choose e > 0, and let On = {x: fn (x) < e} . Since fn is
upper semicontinuous, 0„ is open. Since fn (x) --> 0 for each x, we
have X c U O n . By the countable compactness of X, there are a
finite number of open sets {o, . — , 0 } whose union contains X.
But this implies that ON = X, and hence fN (x) < E for all x. If
n > N, we have 0 < fn (x) < fN (x) < E, and the sequence ( f,.)
converges to 0 uniformly. I
Problems
4. a. A real-valued function f is called lower semicontinuous if —fis
semicontinuous. Show that a real-valued function on a space X uper
is continuous if and only if it is both upper and lower semicontinuous.
b. Show that if f and g are upper semicontinuous, so is f + g.
e. Let (fn) be a decreasing sequence of upper semicontinuous functions which converge pointwise to a real-valued function f Then f is upper
semicontinuous.
d. Let (fn) be a decreasing sequence of upper semicontinuous functions on a countably compact space, and suppose that lim fn(x) = f(x)
where fis a lower semicontinuous real-valued function. Then f is continuous and (fn ) converges to f uniformly.
e. Show that if a sequence (f„) of upper semicontinuous functions
converges uniformly to a function f, then f is also upper semicontinuous.
5. Let X be a normal topological space. Then the following are equivalent:
i. X is countably compact.
ii. Every continuous real-valued function on X is bounded.
iii. Every bounded continuous real-valued function on X assumes its
maximum.
6. Let X be the set of ordinals less than the first uncountable ordinal,
and let 43 be the collection of sets of the form {x: x < a}, {x: a < x < b } ,
and {x: a < x} .
a. Show that 03 is a base for a topology for X.
Sec. 3]
Compact Metric Spaces
163
b. Show that X is sequentially compact but not compact. [Hint:
Use the well ordering of the ordinals.]
c. Show that if f is a continuous real-valued function on X, then
there is an x 0 such that f( x) = f(x) for all x > xo. [Hint: Show that the
set of x for which f(x) < Inn f is countable.]
7. Let Y be the set of ordinals less than or equal to the first uncountable
ordinal 52, and let 63 be the collection of sets of the form {x: x < a},
{x: a < x < b } , and {x: a < .
a. Show that 63 is a base for a topology for X.
b. Show that X is compact but neither separable nor first countable.
Compact Metric Spaces
3
In this section we investigate special properties of compactness that
are true in metric spaces. We shall show that in a metric space the
notions of compactness, countable compactness, and sequential compactness coincide.
12. Lemma: Let X be a countably compact metric space. Then
given E > 0 there are a finite number of points x l , . . . , xAr of X such
that for each x e X there is an x k with p(x, x k) < E.
Proof: Suppose that no such finite collection of points exists.
Then we can choose an infinite sequence (x n ) of points in X so that
p(x„, xn,) > e for m n. Since each spheroid of radius e/3 can
contain at most one term of this sequence, the Bolzano-Weierstrass
property does not hold, and X is not countably compact. I
13. Proposition: A countably compact metric space is separable.
Proof: For each positive integer n, let Fn, be a finite set of points
such that for each x e X there is a y c Fr, with p(x, y) < 1/n. Then
U F
a countable dense subset of X. I
n= 1
14. Corollary: For a metric space the notions of compactness,
countable compactness, and sequential compactness are equivalent.
Proof: Since a metric space satisfies the first axiom of countability,
the notions of countable compactness and sequential compactness
164
Compact Spaces
[Ch. 9
coincide by Proposition 8. Compactness trivially implies countable
compactness. If X is countably compact, it must satisfy the second
axiom of countability by Proposition 13, and so be compact. I
A metric space X is said to be totally bounded if for each e > 0
there is a finite collection of spheroids of radius e which covers X.
Every compact space is totally bounded, and every subset of a
totally bounded space is totally bounded. The following proposition
characterizes compactness in terms of completeness and total boundedness :
15. Proposition: A metric space X is compact if and only if it
is both complete and totally bounded.
Proof: If X is compact it is trivially totally bounded. If (x„) is a
Cauchy sequence in X, then (x„) must have a cluster point. But a
Cauchy sequence which has a cluster point converges to the cluster
point. Thus X is complete.
Suppose that X is complete and totally bounded. To show that
X is compact, it suffices to show that each infinite sequence (x,i)
has a convergent subsequence. Since X is totally bounded, we may
cover X by a finite number of spheres of radius 1. Among these
spheres there must be a sphere S i which contains infinitely many
terms of the sequence (xn). Covering X by a finite number of spheres
of radius 1, we can find among them a sphere S2 such that S1 n S2
contains infinitely many terms of the sequence (xn). Continuing, we
obtain a sequence (Sk ) of spheres, Sk having radius 1/k, such that
S i n • • • n Sk contains infinitely many terms of the sequence. Since
there are infinitely many terms of the sequence in Si n • • • n Sk,
we may choose nk so that nk > nk _ 1 and x n, c Si n • • • n Sk.
Then (x„,) is a subsequence of (x„), and it must be a Cauchy
sequence, since p(x„,, x„,) < 2/N for k, I > N. Since X is complete,
this subsequence converges. I
16. Proposition: Let f be a continuous mapping of a compact
metric space X into a metric space Y. Then f is uniformly continuous.
Proof: Given e > 0 and x e X, there is a Sz > 0 such that
P(x, < az implies o- (f(x), f()) < e/2. Let Oz be the spheroid
(y: p(x, y) < 115z1. Then {Ox : x e X} is an open covering of X and so
Sec. 4]
165
Products of Compact Spaces
has a finite subcovering {0 x1 ,
. Let ô = min {(5 . 1 , • • , xj •
,
Then ô > O. Given two points y and z in X such that p(y, < ô, the
point y must belong to some O xo and hence p(Y, xi) < la x,. Consequently, p(z, xi) p(z, y) + p(y, xi) < }ax,
< ô . Thus we
have 0- ( f (y), f( xi)) < e/2 and 0- ( f(z), f (x i)) < e/ 2. This implies that
(f(z), f( y)) < e, showing that f is uniformly continuous on X. I
-
,
Problems
8. Let X be a metric space, K a compact subset, and F a closed subset.
Then FnK = Ø if and only if p(F, K) > 0, i.e., if there is a .5> 0
such that p(x, y) > for all x c F and all y e K. [Consider the function
p(x, F) = inf p(x, y).]
yeF
9. Let X be a compact metric space and clt an open covering of X. Then
there is a 5 > 0 such that every spheroid of radius is contained in some
element of cll.
4 Products of Compact Spaces
In this section we prove the theorem of Tychonoff that a product
of compact spaces is compact. It is probably the most important
theorem in general topology. Most applications in analysis need only
the special case of a product of (closed) intervals, but this special
case does not seem to be easier to prove than the general case.
We begin with two lemmas concerning the finite intersection property.
17. Lemma: Let a be a collection of subsets of a set X, and suppose
that a has the finite intersection property. Then there is a collection
63 D a such that 63 has the finite intersection property and is maximal
with respect to this property; that is, no collection properly containing
63 has the finite intersection property.
Proof: Consider the family of all collections containing a and
having the finite intersection property. This family is partially
ordered by inclusion. By the Hausdorff maximal principle there is a
maximal linearly ordered subfamily a. Let 61 be the union of the
166
Compact Spaces
[Ch. 9
collections in a. If B 1 , . , B,, are in (33, then each B, belongs to
some e, e a. Since a is linearly ordered by inclusion, one of the
collections ek contains the others, and so all B, belong to ek. Since
B, Ø. Thus 03 has the
ek has the finite intersection property,
finite intersection property. If 63' D ed and 63' has the finite intersection property, then 63' contains every e in a and so must belong
to a by the maximality of a. Thus 63 is a union of collections one
of which is 63', and so 63' C 63. This shows that 63 is maximal with
respect to the finite intersection property. I
n
18. Lemma: Let 63. be a collection of subsets of X which is maximal
with respect to the finite intersection property. Then each intersection
of a finite number of sets in 03 is again in 63, and each set which meets
each set in 63 is itself in (B.
Proof: Let 631 be the collection of all sets which are finite intersections of sets in 63. Then da' is a collection having the finite intersection property and containing 68. Thus 63' = o3 by the maximality
of 63.
Suppose that a set C meets each element of 63. Since da contains
each finite intersection of sets in 63, it follows that da u {C} has
the finite intersection property. By maximality 63 U {C} = ea, and
SO C 8 63. I
19. Theorem (Tychonoff): Let (Xa) be an indexed family of
compact topological spaces. Then the product space X X„ is compact
in the product topology.
Proof: Let ra be the mapping of X to X,, which assigns to each
its a-th coordinate. Then the sets obtained by taking finite
intersections of sets of the form 7,7 1 [0 with Oa open in X,, form a
base 97, for the topology of X.
Let a be any collection of closed subsets of X with the finite intersection property, and let 63 be a collection of (not necessarily closed)
sets which contains a and is maximal with respect to the finite
intersection property. Let 63„ be the collection of subsets of X,, of
Then (Bo, has the finite intersection
the form a (B) with B E
property, and by the compactness of X,, it is possible to choose a
point xa belonging to n [ir a (B)J, that is, a point xa which is a point of
63
x e X
Sec. 4] Products of Compact Spaces
167
closure of each set 7r a [13]. Let x be that point in X whose a-th coordinate is x,.
Consider a set S which is of the form 7r,7 1 [0a] for some a and some
open set 0a in Xa with xa e Oa. Since xa is a point of closure of 7i-a[B]
for each B in 63, the set S must intersect each B in 63. By Lemma 18
we must have S e 63. Each set containing x in the base 97, for the
topology of X is a finite intersection of sets of this form and so must
be in (33 by Lemma 18. Let F be a closed set in 63. Then F meets each
N e 07, with x e N. Consequently, x is a point of closure of F and so is
in F. Hence x belongs to each set in a, and a has nonempty intersection. I
Problems
10. Each closed and bounded set in Rn is compact.
11. Prove without using the axiom of choice that if X is compact and I
is a closed interval, then X X / is compact. [Hint: Let 91 be an open
covering of X X I, and consider the smallest value of t e / such that for
each t' < t the set X X [0, /1 can be covered by a finite number of sets in
Use the compactness of X to show that X X [0, t] can also be covered
by a finite number of sets in cit and that if t < 1, then for some t" > t,
X X [0, t"] can be covered by a finite number of sets in cud
12. Prove that the product of a countable number of sequentially
compact spaces is sequentially compact. [If (xn) is a sequence in the
product, choose a subsequence (x71) whose first coordinate converges,
choose a subsequence (4) of this whose second coordinate converges, etc.
Then the "diagonal" sequence (4) converges in the product space.]
13. A product Pi of unit intervals is called a (generalized) cube. Prove
that every compact Hausdorff space Xis homeomorphic to a closed subset
of some cube. [Let be the family of continuous real-valued functions on
X with values in [0, 1 . Let Q = X If. Then the mapping g of X into Q
]
fE5
which takes x into the point whosef-th coordinate is f(x) is one-to-one into
Q and continuous.]
14. Let Q
be a cube, and letf be a continuous real-valued function
on Q. Then, given c > 0, there is a continuous real-valued function g on Q
such that If — gl < E and g is a function of only a finite number of coordinates. [Hint: Cover the range off by a finite number of intervals of
length E and look at the inverse images of these intervals.]
Compact Spaces
168
5
[Ch. 9
Locally Compact Spaces
A topological space X is called locally compact if for each x e X
there is an open set 0 containing x such that 0 is compact. Thus X
is locally compact if the collection of open sets with compact
closures form a base for the topology of X. Every compact space is
locally compact, while the Euclidean spaces Rn are examples of
spaces which are locally compact but not compact.
If X is a locally compact Hausdorff space, we can form a new space
X* by adding to X a single point co not in X and taking a set in
X* to be open if it is either an open subset of X or the complement
of a compact subset in X. Then X* is a compact Hausdorff space, and
the identity mapping of X into X* is a homeomorphism of X and
X* — {w} . The space X* is called the Alexandroff one-point compactification of X, and w is often referred to as the point at infinity in X*.
Some useful properties of compact subsets in a locally compact
Hausdorff space are given by the following propositions, whose proofs
are left to the reader (see Problems 16, 17, and 18).
20. Proposition: Let K be a compact subset of a locally compact
Hausdorff space X. Then there is an open set 0 containing K with0
compact. Given such a set 0, there is a continuous nonnegative function
f on X which vanishes outside 0 and is identically 1 on K. If K is also
a gs, we may take f < 1 in R.
21. Proposition: Let K be a compact subset of a locally compact
Hausdorff space and {0,1 an open covering for K. Then there are a
finite number of nonnegative continuous functions f 1 , . . ,f. on X,
that
each f, vanishing outside a compact set and outside some O
is
identically
1
on
K.
f2 + • • • + fn
Problems
15.
a. Prove that the subsets of X* which are either open subsets of X
or the complements of compact subsets of X form a topology for X*,
that is, that the intersection of two such sets is such a set and the union of
any collection of such sets is such a set.
b. Show that the identity mapping from X to the subspace
X* — {0.,} is a homeomorphism.
c. Show that X* is compact and Hausdorff.
Sec.
169
5] Locally Compact Spaces
16. Let X be a locally compact space and K a compact subset of it.
Show that there is an open set 0 D K such that 5 is compact. [Hint: For
each point x e K there is an Or containing x with 5x compact. Let 0 be the
union of a finite number of these Or which cover K.]
17. Let X be a locally compact Hausdorff space and K a compact set.
Then there is a continuous real-valued function on X which is identically
1 on K and for which the set 0 = {x: f(x)
0} has compact closure.
[Hint: Use Problem 16 and the Urysohn Lemma.] Prove Proposition 20.
18. Prove Proposition 21. [For each x in K there is a nonnegative continuous function g which is positive at x and vanishes identically outside
some Oa . Choose a finite number g l , . . . , gn of such functions so that
g = g1 + • • - ± g„,is positive on K. Let h be a continuous function on X
which is equal to 1/g on K, and set f, = hg,.]
19. Show that the Alexandroff one-point compactification of R is
homeomorphic to the boundary of a sphere in R+ 1 •
20. Show that the one-point compactification of the space X in Problem
6 is the space Y in Problem 7.
21. a. Let 0 be an open subset of a compact Hausdorff space. Then
0 is locally compact.
b. Let 0 be an open set in a compact Hausdorff space X. Then
the mapping of X to the one-point compactification of 0 which is the
identity on 0 and takes each point in X — 0 into co is continuous.
22. A continuous mappingf from a topological space X to a topological
space Y is called proper if the inverse image of every compact set is compact. Let X and Y be locally compact Hausdorff spaces, andf a continuous
mapping of X into Y. Let X* and Y* be the one-point compactifications of
X and Y, and f * the mapping of X* into Y* whose restriction to X is f and
which takes the point at infinity in X* into the point at infinity in Y. Then
f is proper if and only if f* is continuous.
23. a. Let X be a locally compact space. A subset F of Xis closed if and
only if F n K is closed for each closed compact set K.
b. The above conclusion is true if X is a Hausdorff space satisfying
the first axiom of countability instead of being locally compact.
24. The Moore Manifold. Let X be the set whose elements are the
points of the open right half-plane (i.e., {(x, y): x > 0)) and the lines
in the plane with nonnegative slope. Write m(1) and b(1) for the slope
and y-intercept of the line 1. We define a base for a topology for X by
taking the open disks in the half-plane and the sets
V, = {l: Im(1) — mol < e, b(l) = b0 )
u {(x,Y): KY — boVx ±
mol < e, x <
f},
170
Compact Spaces
[Ch. 9
a. Show that X is a connected Hausdorff space and that each point
of X is contained in an open set which is homeomorphic to an open subset
of R 2. (Such a space is called a two-dimensional manifold or surface.)
b. Show that X (or any manifold) is locally compact, completely
regular, and satisfies the first axiom of countability.
c. Show that X has a countable dense subset but does not satisfy
the second axiom of countability.
d. Show that X is not normal.
*6
The Stone-tech Compactification
Let X be a completely regular topological space and 5 the family
of continuous real-valued functions f on X with fj < 1. If we let
/ = [ I, 1], then it follows from Problem 8.30 that X is homeomorphic with a set E c I. Let F = E. Since r is a compact Hausdorff
space, the set F is a compact Hausdorff space, and, identifying X with
E, we have X a dense open subset of F. The space F is called the
Stone-tech compactification of X and is denoted by 0(X). We summarize some of its properties in the following proposition.
22. Proposition: Let X be a completely regular topological space.
Then there is a unique compact Hausdorff space 0(X) with the following
properties:
i. The space X is a dense open subset of 0(X).
ii. Each bounded continuous real-valued function on X extends to
a continuous function on 0(X).
iii. If X is a dense open subset of a compact Hausdorff space Y,
then there is a unique continuous mapping io of 0(X) onto Y such
that (c(x) = x for all x c X.
Problems
Prove Proposition 22:
a. If f is a bounded continuous real-valued function on X with
I fi < 1, then f is the restriction to X of 71-f, and rf is continuous on 0(X).
b. Use the fact that Y is a subset of Is, where 9 is the space of
continuous g on Y with igi < 1, to show
25.
Sec. 7]
The Stone-Weierstrass Theorem
171
c. Show that 13(X) is unique in the sense that if Z is another space
with the same properties, then there is a homeomorphism so of Z with
0(X) such that yo(x) = x for all x c X.
26. Let X and Y be the spaces in Problems 6 and 7. Show that
0(X) = Y.
27. Let N be the set of natural numbers. Discuss 0(N). Show that a
sequence from N converges in 13(N) if and only if it converges in N. Hence
f3(N) is compact but not sequentially compact.
7 The Stone Weierstrass Theorem
-
Let X be a compact Hausdorff space. We denote by C(X) the set
of all continuous real-valued functions on X. Since X is normal, it
follows from Urysohn's Lemma that there are enough functions in
C(X) to separate points; that is, given two distinct points x and y in
X, we can find an f in C(X) such that f (x) f (y). The set C(X) is a
linear space, since any constant multiple of a continuous real-valued
function is continuous and the sum of two continuous functions is
continuous. The space C(X) becomes a normed linear space if
we define H fH = max 1 f(x)i, and a metric space if we set p(f, g) =
H f — gH. As a metric space C(X) is complete.
The space C(X) has also a ring structure: The product fg of two
functions f and g in C(X) is again in C(X). A linear space A of
functions in C(X) is called an algebra if the product of any two
elements in A is again in A. Thus A is an algebra if for any two
functions f and g in A and any real numbers a and b we have af ± bg
in A and fg in A. A family A of functions on X is said to separate
points if given distinct points x and y of X there is an f in A such that
f(x)
f(y). In the present section we study the closed subalgebras
of C(X) and prove that if A is a subalgebra of C(X) which separates
points, contains the constant functions, and is closed, then A = C(X).
The space C(X) also has a lattice structure: If f and g are in C(X),
so is the function f A g defined by ( f A g)(x) = min [ f(x), g(x)]
and the function f V g defined by (f v g)(x) = max [ f(x), g(x)J.
A subset L of C(X) is called a lattice if for every pair of functions f
and g in L we also have f v g and f A g in L. It is convenient to
investigate subalgebras of C(X) by first investigating lattices of functions. The following proposition can be thought of as a generalization
of the Dini theorem:
172
Compact Spaces
[Ch. 9
23. Proposition: Let L be a lattice of continuous real-valued
functions on a compact space X, and suppose that the function h
defined by
h(x) = inf f(x)
feL
is continuous. Then given E > 0, there is a g in L such that
0
g(x) — h(x) < E for all x in X.
Proof: For each x in X there is a function fx in L such that
fz (x) < h(x)
€73. Since fx and h are continuous, there is an open
set 0 z containing x such that
IM.Y) — f(x)1 <
and
ih(Y) — h(x)1 <
for all y c O. HenceL(Y) — h(Y) < E for all y in O. Now the sets
0, cover X, and by compactness there are a finite number of them,
say {0,,,
, 0,), which cover X. Let g = f 1 A fx, A • • ' A L..
Then g e L, and given y in X we may choose i so that y c O whence
,
g(y) — h(y) < f(y) — h(y) <
E.
I
24. Proposition: Let X be a compact space and L a lattice of
continuous real-valued functions on X with the following properties:
i. L separates points; that is, if x
y, there is an f e L with
f(x) f(Y).
ii. If f e L, and c is any real number, then cf and c f also belong
to L.
Then given any continuous real-valued function h on X and any E > 0,
there is a function g e L such that for all x e X
0 < g(x) — h(x) < e.
Before proving the proposition, we first establish two lemmas.
25. Lemma: Let L be a family of real-valued functions on a
compact space X which satisfies properties (i) and (ii) of Proposition 24.
Then given any two real numbers a and b and any pair x and y of
distinct points of X, there is an f e L such that f(x) = a andf(y) = b.
Sec. 7]
The Stone-Weierstrass Theorem
173
Proof: Let g be a function in L such that g(x) g(y). Let
a—b
bg(x) — ag(y)
f = g(x) — g(y) g + g(x) — g(y)
Then f e L, by property (ii), and f (x) = a, f (y) = b. I
26. Lemma: Let L be as in Proposition 24, a and b real numbers
with a < b, Fa closed subset of X, and p a point not in F. Then there
is a function f in L such that f > a, f(p) = a, and f(x) > b for all
x e F.
Proof: By Lemma 25 we can choose, for each x e F, a function fx
such that fx(p) = a and f(x) = b ± 1. Let O,, — {y: fx (y) > b} .
Then the sets 10 x1 cover F, and since F is compact, there are a finite
number {O . . . , Ox j which cover F. Let f = fx, V - • • V f. n.
Thenf e L, f(p) = a, and f > b on F. If we replace f by f Va, then
we also have f> a on X.1
Proof of Proposition 24: Since L is nonempty, it follows from (ii)
that the constant functions belong to L. Given g e C(X), let
L' = {f:feL and f> g}. Proposition 24 will follow from Proposition 23 if we can show that for each p e X we have g(p) = inff(p),
f e L'. Choose a positive real number 77 . Since g is continuous, the set
F = Ix: g(x) > g(p) + 77)
is closed. Since X is compact, g is bounded on X, say by M. By
Lemma 26 we can find a function f e L such that f > g(p) ± n,
f(p) = g(p) + n, and f (x) > M on F. Since g < g(p) + n on P, we
have g <f on X. Thus f e L', and f(p) < g(p) + n. Since n was
an arbitrary positive number, we have g(p) = inff( p), f e L'. I
27. Lemma: Given e > 0, there is a polynomial P in one variable
such that for ails e [ —1, 1] we havelP(s) — Is!! < e.
.3
Proof: Let
E can
be the binomial series for (1 — t) 1 / 2 . This
series converges uniformly for t in the interval [0, 1]. Hence,
given E > 0, we can choose N so that for all t e [0, 1] we have
174
Compact Spaces
[Ch. 9
N
1( 1 — 0 112 — QN(t)I < E, where QN =
QN( 1 — S 2 ).
S c [— I, I]. I
E
n=0
can. Let P(s)
=
Then P is a polynomial in s, and I151 — P(s)I < E for
28. Theorem (Stone-Weierstrass): Let X be a compact space and
A an algebra of continuous real-valued functions on X which separates
the points of X and which contains the constant functions. Then given
any continuous real-valued function f on X and any E > 0 there is a
function g in A such that for all x in X we have Ig(x) — f(x)1 < E. In
other words, A is a dense subset of C(X).
Proof: Let :4 denote the closure of A considered as a subset of
C(X). Thus A consists of those functions on X which are uniform
limits of sequences of functions from A. It is easy to verify that A
is itself an algebra of continuous real-valued functions on X. The
theorem is equivalent to the statement that A = c(x). This will
follow from Proposition 24 if we can show that Al is a lattice. Let
f e A, and 11/.11 _- 1. Then given E > 0, I fi — P ( f)11 < E where P
is the polynomial given in Lemma 27. Since A is an algebra containing
the constants, P( f) E AI, and since :4 is a closed subset of C(X), we
have ifl e A. If now f is any function in A, then pi fll has norm 1,
and so If1/11f11 and hence also 1 f I belong to A. Thus A contains
the absolute value of each function which is in A. But
f y g = (f+ g) + if- gl
and
f
A
g= (f+ g) - If - gl.
Thus Al is a lattice and must be C(X) by Proposition 24. I
29. Corollary: Every continuous function on a closed bounded
set X in Rn can be uniformly approximated on X by a polynomial (in
the coordinates).
Proof: The set of all polynomials in the coordinate functions
forms an algebra containing the constants. It separates points, since
given two distinct points in Rn, one of the coordinate functions takes
different values on these points. Hence Theorem 28 applies. I
Sec. 7]
175
The Stone-Weierstrass Theorem
Problems
28. Let f be a continuous periodic real-valued function on R with
period 27; that is, f(x -I- 27) = f(x). Show that, given e > 0, there is a
N
finite Fourier series
(p,
given by 99(x) ---- ao ±
E
(an, cos nx + br, sin nx),
such that I99(x) — f(x) I < E for all x. [Hint: Note that periodic functions
are really functions on the circumference of the unit circle, and that
cos mx cos nx = 1.[cos (m ± n)x — cos (m — n)x], etc.]
29. Let A be an algebra of continuous real-valued functions on a compact space X, and assume that A separates the points of X. Then either
A = C(X) or there is a point p e X and A = ff: f e C(X), f(p) = 01.
30. Let a be a family of continuous real-valued functions on a compact
Hausdorff space X, and suppose that 5 separates the points of X. Then
every continuous real-valued function on Xcan be uniformly approximated
by a polynomial in a finite number of functions of 5.
31. a. Let X be a topological space and A a set of real-valued continuous functions on X. Define x y if f(x) = f(y) for all f e A. Show
that ----- is an equivalence relation.
b. Let )7' be the set of equivalence classes of = and 99 the natural
map of X into X. Show that for each f e A there is a unique real-valued
function J on fi' such that f = Jo (p.
c. Let X' have the weak topology generated by these f. Then 99 is
continuous.
d. If X is compact, then so is jt, and the functions f in (b) are continuous.
e. Let X be a compact space and A a closed subalgebra of C(X)
containing the constant functions. Then there is a compact Hausdorff
space )7 and a mapping 99 of X onto X' such that A is the set of all functions
f of the form Jo 99 with J e C(X ).
32. Let X and Y be compact spaces. Then for each continuous realvalued function f on X X Y and each E > 0, we can find continuous realvalued functions g 1 , . . . , gr, on X and h i , . . . , hn on Y such that for each
(x, y) e X X Y we have
I f(x, y) — Ê gr(x)MY)I <
E.
i=1
The Weierstrass theorem, which states that a continuous function
on a cube in R° can be uniformly approximated by a polynomial, can be
proved directly by giving an integral formula for the approximating
33.
176
Compact Spaces
[Ch. 9
polynomials. 1 Show that this special case implies the general StoneWeierstrass Theorem by showing that the functions of norm 1 in the
algebra a give a mapping of X into the infinite-dimensional cube
X {If :fe a, if M = 1 } . Use the Tietze Extension Theorem and Problem 14 to show that each continuous function on the image of X can be
approximated by a polynomial in (a finite number of) the coordinate
functions.
The Stone-Weierstrass Theorem gives precise information about approximation by functions in an algebra of continuous functions. A natural
question is the nature of functions that can be approximated by a ring of
real-valued continuous functions, that is to say, when we no longer postulate the possibility of multiplying by arbitrary real numbers. The next three
problems give some results in this direction.
34. Let I be the interval [— 1, 1] in R and f a continuous real-valued
function on /such thatf(— 1),f(0), andf(1) are integers andf(1) f( — 1)
mod 2. Then given c > 0, there is a polynomial P with integral coefficients
such that If(x) — P(x)I < e for all x e I. Hints:
a. Let ço be the polynomial defined by ço(x) = x x(1 — 2x)(1 — x).
Then ço is a monotone increasing function whose fixed points are 0, and 1.
b. Choose e > O. Then some iterate son of ço is a polynomial with
integral coefficients which is monotone increasing on [0, 1] and such
that kon (x) —< e for x e [e, 1 — e].
c. Given a number a, 0 < a < 1, and any e > 0, then there is a
polynomial 4/ with integral coefficients (and no constant term) such that
0 < 4/(x) < 1 in [0, 1] and 14/(x) — ai < E for all x in [e, 1 — e].
d. Let P be a polynomial with integral coefficients, and suppose that
P(-1) = P(0) = P(1) =O. Let 0 be any real number. Then for each
E > 0, 13P can be uniformly approximated to within E on [ 1, 1 1 by a
polynomial having integral coefficients and no constant term.
e. Reduce the statement of the problem to (d) and the StoneWeierstrass Theorem.
—
35. a. Let X be a set and R a ring of real-valued functions on X. Let R
be the ring of all real-valued functions which can be uniformly approximated by functions in R. If Je R, and sup If(x)1 < 1, then cf c R for
each real number c.
b. Let X be a compact Hausdorff space and R a ring of continuous
real-valued functions on X such that 1 e R, and for each pair of distinct
1
See, for example, R. Courant and D. Hilbert,
Springer, Berlin, 1931.
Mathemattsche Phystk,
Bd. 1, pp. 55-57,
Sec. 8]
177
The Ascoli Theorem
points x and y there is a function f in R such that f(x) f(y), and
if(z)( < 1 for all z e X. Then every continuous real-valued function on X
can be uniformly approximated by functions in R.
36. a. The statement of Problem 34 can be improved slightly. Show,
for example, that we may take the interval I to be any closed interval
contained in (— Y2). [The polynomial x 2 — 1 has absolute value at
most 1 on I. Apply Problem 35b.]
b. Show that we cannot take I in Problem 34 to be [ —2, 2]. [Hint:
If P is a polynomial with integral coefficients,
1
ir
*8
f
2 p( x ) (4 x 2)112 dx
-2
is an integer.]
The Ascoli Theorem
It is often useful in analysis to know conditions under which a
sequence of functions has a subsequence which is convergent in some
sense. The following notion plays a central role in such questions:
A family of functions from a topological space X to a metric
space ( Y, o-) is called equicontinuous at the point x c X if given E > 0
there is an open set 0 containing x such that c[ f (x), f (y)] < E for
all y in 0 and all f e a. The family is said to be equicontinuous on X
if it is equicontinuous at each point x in X. We are going to show
that if (fn ) is a sequence from an equicontinuous family of functions,
and if at each point x of X there is a convergent subsequence of
( f„(x)), then (fn ) has a subsequence which converges uniformly on
each compact subset of X. We begin with several lemmas.
30. Lemma: Let (fn ) be a sequence of mappings of a countable set
D into a topological space Y such that for each x c D the closure of the
set {f„(x): 0 < n < 3.0} is sequentially compact. Then there is a
subsequence (fnk ) which converges for each x in D.
Proof: Let D = fxkl . By the sequential compactness of the
closure of { fjx 1 ): 0 < n < Do} we can pick a subsequence (AO of
(fn ) such that (fi (x)) converges. We can now pick a subsequence
(f2n ) of ( fi,,) such that (f2 ,(x2)) converges. Continuing in this
fashion we obtain a subsequence ( f,) convergent on x,. Consider the
"diagonal" sequence (fnn ). We have (fnn )77, a subsequence of
( f,7,), and so ( fn„(x,)) converges. 1
178
Compact Spaces
[Ch.
9
31. Lemma: Let (f„) be an equicontinuous sequence of mappings
from a topological space X to a complete metric space Y. If the
sequences (f„(x)) converge for each point x of a dense subset D of X,
then ( .1,0 converges at each point of X, and the limit function is continuous.
Proof: Given x in X and E > 0, we can find an open set 0 containing x such that oi f7,(x), f, i(y)] < e/3 for all y in O. Since D
is dense, there must be a point y eDn 0, and since (f.(Y)) converges, it must be a Cauchy sequence, and we may choose N so large
that olf,(y), f m (y)] < E/3 for all m, n N. Then
offn(x),
Mx
offn(X), fix(Y)] offn(Y), fni(01
)]
offm(X), fm(Y)1
<E
for all m, n > N. Thus (fn (x)) is a Cauchy sequence, and converges
by the completeness of Y.
Let f(x) = lim f, j (x). To see that f is continuous at x, let e > 0 be
given. By equicontinuity there is an open set 0 containing x such
that cr[f,z (x), ffl (y)] < e for all n and all y in O. Hence for all y in 0
we have
off(x), f(Y)J = lim o[/(x), fn(Y)]
and f is continuous at x. 1
32. Lemma: Let K be a compact topological space and (fn ) an
equicontinuous sequence of functions to a metric space Y which converge at each point of K to a function f. Then (fa) converges to f
uniformly on K
Proof: Choose e > O. By equicontinuity each x in K is contained
in an open set 0, such that cr[fn(x),fn(Al < e/3 for ally in 0, and all
n. From this it follows that also olf(x), f(Al < e/3 for ally in O.
By the compactness of K there is a finite collection {0„ 1 , . . , O z,}
of these sets which covers K. Choose N so large that for all n > N
we have cr[f„(xi), f(x)] < e/3 for each xi corresponding to this
finite collection. Then for any y in K there is an i < k such that
J' e Os .. Hence
offn(Y), .f(y)]
offn(y), fn(xi)]
off,i(x i), f(x)]
off(y), f(x i)]
<E
for n > N. Thus (fa) converges to f uniformly on K. I
Sec. 8]
179
The Ascoli Theorem
These three lemmas taken together imply the following theorem:
33. Theorem (Ascoli): Let 5 be an equicontinuous family of
functions from a separable space X to a metric space Y. Let (f,)
be a sequence in g such that for each x in X the closure of the set
{f„(x): 0 < n < 00} is compact. Then there is a subsequence (f,,,)
which converges pointwise to a continuous function f, and the convergence is uniform on each compact subset of X.
34. Corollary: Let g be an equicontinuous family of real-valued
functions on a separable space X. Then each sequence (fn ) in g
which is bounded at each point (of a dense subset) has a subsequence
(f„,) which converges pointwise to a continuous function, the convergence being uniform on each compact subset of X.
Problems
37. The set of all functions from X to Y is precisely the product of X
copies of Y and is often denoted by Yx • The product topology in Yx is
called the topology of pointwise convergence.
a. Show that a sequence (fa) of mappings of X into Y converges to
fin the topology of pointwise convergence if and only if (f,(x)) converges
to f(x) for each x in X.
b. Let be an equicontinuous family of functions from a topological
space X to a metric space Y. Then the closure of in the topology of
pointwise convergence is also equicontinuous.
38. Let X be a space which is either locally compact, or Hausdorff and
satisfies the first axiom of countability. Let (fn) be a sequence of continuous functions from X to a metric space Y which converge to a function
f uniformly on each compact subset K of X. Then fis continuous. [Hint:
Use Problem 23.]
39. Let X be a separable, locally compact metric space, and ( Y, a) any
metric space. Show that:
a. There is a countable collection {On} of open subsets of X such
that Dn is compact and X = IvI O.
b. The set of functions from X into Y becomes a metric space if we
define
o-* (f, g) ----
E 2—ncr,V, g),
where
sup , cr[f(x), g(x)]
cr (f, g) = o„ i 4- a[f(x), g(x)]
:
180
Compact Spaces
[Ch.
9
c. The sequence (fn ) converges to f uniformly on compact subsets
of X if and only if o- *(f, fn) -- O.
40. Let X and Y be metric spaces. A sequence (fn ) of mappings from X
to Y is said to converge continuously to f at x if for each sequence (xn)
such that x = lim xn we have f(x) = limfn(xa). We say (fa) converges
continuously to f if it converges continuously at each x in X.
a. Show that if each fn is continuous on X and if (A) converges
continuously to fat each x e X, then f is continuous.
b. Show that (fa ) converges continuously to f if and only if (A)
converges to f uniformly on each compact subset of X.
41. A real-valued function f on [0, 1] is said to be Holder continuous
of order a if there is a constant C such that If(x) — f(y)I < Clx — yl'.
Define Ilfll,„ = max If(x)I + sup 1 f(x) — MI
Ix — yla
Show that for 0 < a < 1, the set of functions with 1 fil a < 1 is a compact subset of C[0, 1].
Il 0 Banach Spaces
1
Introduction
We are going to study a class of spaces which are endowed with
both a topological and an algebraic structure. A set X of elements is
called a vector space (or linear space, or linear vector space) over the
reals if we have a function + on X X X to X and a function • on
R x X to X which satisfy the following conditions:
i. x + y — y + X.
ii. (x + y) + z = x + (y + z).
iii. There is a vector 0 in X such that x + 0 = x for all x in X.
iv. X (x + y) = Xx + Xy; X c R, x, y e X.
v. (X + ,u)x = Xx + px; X, p c R, x e X.
vi. X(,ux) = (Xp)x; X, p e R, x c X.
vii. 0 • x = 0, 1 • x = x.
We call + addition and • multiplication by scalars. It should be
noted that the element 0 defined in (iii) is unique, for if Ce also has this
property, then 0 = 9 ± 0' = 0' + 0 = 0'. The element ( —1)x is
called the negative of x and written —x. We have x + (— x) — 1 - x ±
(-1)x = (1 — 1)x = 0 • x = O.
A nonnegative real-valued function
is called a norm if
i. 114 = ID <=> x =-- O.
ii. Ilx + Yll 114 + Ilyll.
iii. ilaxII = la! 11x11.
181
11 11 defined on a vector space
182
Banach Spaces
[Ch. 10
A normed vector space becomes a metric space if we define a metric
p by p(x, y) = Ilx — yll. When we speak about metric properties in a
normed space we are referring to this metric.
If a normed vector space is complete in this metric, it is called a
Banach space. Examples of Banach spaces were given in Chapter 6.
Another example is C(X), the space of all continuous real-valued
functions on a compact space X. We restate here Proposition 6.4,
and note that the proof given in Chapter 6 is valid in any normed
vector space.
1. Proposition: A normed vector space is complete if and only if
every absolutely summable sequence is summable.
A nonempty subset S of a vector space X is a subspace or linear
manifold if X i x i + X 2x2 belongs to S whenever x 1 and x2 do. If S is
also closed as a subset of X, then it is called a closed linear manifold.
The intersection of any family of linear manifolds is a linear manifold. Hence, given a set A in X, there is always a smallest linear
manifold containing A. We often denote this manifold by {A}.
If A is any set in X, we use A + x to denote the set of all elements
z of the form z — x + y, y e A. The set A + x is called the translate
of A by x. The set XA is the set of all elements of the form Xx with
x e A, and A + B is the set of all elements of the form x + y with
x in A and y in B.
Problems
1. Show that if x,,, —4 x, then 11xn11 --, 11x11.
2. Two norms 11 11 1 and 11 112 are called equivalent if there is a positive
constant K such that K-1 11xli 1 < 11x 112 < K 114 1 . If the norms are
equivalent, then the metrics derived from them are uniformly equivalent.
Show that the metrics introduced in Problem 7.10b for IV are derived
from norms for Rn, and these norms are all equivalent.
3. Show that ± is a continuous function from X X X into X and
that • is a continuous function from R X X into X.
4. Show that a nonempty set M is a linear manifold if and only if
M ± M --= M and XM ----- M for each X.
5. a. Prove that the intersection of a family of linear manifolds is a
manifold.
Sec.
1]
183
Introduction
b. Prove that there is a smallest linear manifold {A} containing a
given set A.
c. Show that {A} consists of all finite linear combinations of the
form X ix
• • • -I- Xnx,, with x, e A.
6. a. If M and N are linear manifolds, so is M N, and M N
{M U N}.
b. If M is a linear manifold, so is M.
7. Show that the set P of all polynomials on [0, 1] is a linear manifold
in C[0, 1]. Is it closed? Give an example of a closed linear manifold in
C[0, 1].
8. A linear manifold M is said to be finite-dimensional if there are a
finite number of elements x1 ,. . . , x,„ such that M ={ x 1 . • • , x,i}. Prove
that every finite-dimensional linear manifold in a normed vector space X
must be closed.
9. Let S be the spheroid of radius 1 centered at O; that is, S =
Ix: lx < 1 1. Prove that S is open and that
= {x: 11x11
1) .
We call S the open unit sphere (or ball) and 3 the unit sphere (or ball).
10. A nonnegative real-valued function 11 11 defined on a vector space X
is called a pseudonorm if Ilx -1- yll 114 11y11 and 11ax11 = l ai 11x11. Show
that the relation x = y defined by 11x — yll = 0 is an equivalence relation
compatible with addition and multiplication by scalars, and that, if
x y, then 11x11 = 11Y1I. Let X' be the set of equivalence classes of X
under Then X' becomes a normed vector space if we define ax' 01
as the (unique) equivalence class which contains ax -1- 13y for x e x' and
y e y' and define 11x'11 = 11x11 for x e x'. The mapping go of X onto X'
which takes each element of X into the equivalence class to which it
belongs is a homomorphism (called the natural homomorphism) of X
onto X'. What is the kernel of go? Illustrate this procedure with the LP
spaces on [0, 1].
11. Let X be a normed linear space (with norm 11 11) and M a linear
manifold in X. Show that lxlj i
inf 11x — m11 defines a pseudonorm
meM
on X. Let X' be the normed linear space derived from X and the pseudonorm 11 11 using the process described in Problem 10. The natural map
go of X onto X' has kernel M. Prove that go takes open sets into open sets.
The space X' is usually denoted by X/M and called the quotient space of
X modulo M.
12. Show that, if X is complete and M is a closed linear manifold of
X, then X/M is also complete. [Hint: Use Proposition 1.]
184
2
Banach Spaces
[Ch. 10
Linear Operators
A mapping A of a vector space X into a vector space Y is called a
linear mapping, a linear operator, or a linear transformation if
A(a1X1
+ a2X2) =-- a1AX1 + a2AX2
for all xl , x2 in X and all real a l , a2. If X and Y are normed vector
spaces, we call a linear operator A bounded if there is a constant M
such that for all x we have liAxli < M lix11. We call the least such M
the norm of A and denote it by IA Thus
3 .21jj = sup
iiAxii
xgx JfxJf
x*8
Since A(ax) = «Ax, we have also
11.211 -- sup 11A4 = sup II AxII.
ilxil< 1
114=i
A bounded linear transformation A from X to Y is called an isomorphism between X and Y if there is a bounded linear transformation B
from Y to X such that AB is the identity on Y and BA the identity
on X. The following proposition relates the notions of boundedness
and continuity for linear operators:
2. Proposition: A bounded linear operator is uniformly continuous.
If a linear operator is continuous at one point, it is bounded.
Proof: Suppose A is bounded. Then
1}Ax i — AX211 G 1.2111 • 11X1 — X2
E
for all x 1 and x2 in X with 11x 1 — x211 < E/11A11. Thus A is uniformly
continuous.
Suppose now that A is a linear operator which is continuous at xo.
Then there is a 6 > 0 such that liAx — Axd < 1 for all x such that
I Ix — xod < 3. For any z in X, set w = nz/lIzji, where 0 < n < 3.
Then
n—Az = Aw = A(w + x 0 ) — A(x0)
and
iizli
±1 iiAzii = ilA(w ± x 0) — A(xo)i
<1,
Uz(i
Sec. 2]
185
Linear Operators
since ilw + xo — xoli = *11 = n < 6. Consequently, IlAzil < n -1 114,
and A is bounded. 1
3. Proposition: The space 63 of all bounded linear operators from
a normed vector space X to a Banach space Y is itself a Banach space.
Proof: If A and B are in 63, we define «A ± OB by (aA + f3B)x =
aAx + OBx; it is easily seen to be a linear operator. Now
lixAll = sup (IxAxil = Ixi sup IlAxli =
411=1
4—..1
NAN
and
IIA ± AI = sup 1lAx ± Bxli < sup (f(Axil ± IIBXII) 511x11=1
114=i
IA! ± IA.
Thus any linear combination of two bounded linear operators is
again a bounded linear operator. If liAli = 0, then
!!Ax 1! _<_ I I A II • 1! xl! -= 0,
and so Ax = O. Thus iiAl = 0 only for the operator 0, which maps
every x into O. Thus (( (( satisfies all the requirements of a norm
and we have only to show that if Y is complete so is 63.
Let (A n ) be a Cauchy sequence from 53. For each x e X we have
11,4„x — A„,,x11 _.< 1121„ — A mli • lixil, and so (A ux) is a Cauchy
sequence in Y and must converge to an element y in Y. Call this
element Ax. It follows from the definition of Ax that A(xx) = xAx
and A(x i + x2) = Ax i + Ax 2 .
To show that the linear operator A is bounded, we observe
that, given e > 0, there is an N such that for all ni, n > N we have
10„ — /4„,H < e. Hence liA„ii < AN + e for all n ? N, and so
iiAxii = lim iiAnxii _•_ GAid ± OW.
Thus A is bounded. For each x in X we have
HArix — Ax 11 = lim 11/1„x — Anixli
< lim iii.,,, — A nd( ((.4
m-100
5_ 6114
186
Banach Spaces
[Ch. 10
for all n > N. Thus for n > N,
IIA. — AI! = sup 11(A. — A)x11 <
E.
11.11-1.
Thus A n --3 A and aa is complete.
1
Problems
13. Show that if A n —> A and x,„ x, then A nxn, —± Ax.
14. The kernel of an operator A is the set {x: Ax = 0 }. Prove that the
kernel of a linear operator is a linear manifold and that the kernel of a
continuous operator is closed.
15. a. Let X be a normed linear space and M a closed linear manifold.
Then the natural homomorphism ço of X onto X/M has norm 1.
b. Let X and Y be normed linear spaces and A a bounded linear
operator from X into Y whose kernel is M. Then there is a unique bounded
linear operator B from X/M into Y such that A = B 0 47. Moreover,
HAI! = IA16. Let X be a metric space, and let Y be the space consisting of those
real-valued functions f on X which vanish at a fixed point xo e X and
satisfy if(x) — f(y)1 < Mp(x, y) for some M (depending on f). Define
lifi 1 = sup i f(x) — AO i . Then Y is a normed linear space. For each
P(x, Y)
x c X, the functional Fx defined by F(f) = f(x) is a bounded linear functional on Y, and 1lFx — FYI! = p(x, y). Thus X is isometric to a subset of
the space Y* of bounded linear operators from Y to R. Since Y* is complete by Proposition 3, the closure of this subset gives a completion of Y,
and we have another proof of Theorem 7.9.
3 Linear Functionals and the
Hahn-Banach Theorem
A linear functional on a vector space X is a linear operator from X
to the space R of real numbers. Thus a linear functional is a realvalued function f on X such that f(ax + Oy) = of (x) + i3f(Y). The
first question with which we shall be concerned is that of extending a
linear functional from a subspace to the whole space X in such a
manner that various properties of the functional are preserved. The
principal result in this direction is the following:
Sec. 3]
Linear FunctionsIs and the Hahn-Banach Theorem
187
4. Theorem (Hahn-Banach): Let p be a real-valued function
defined on the vector space X satisfying p(x + y) < p(x) + p(y) and
p(ax)---- ap(x) for each a > O. Suppose that f is a linear functional
defined on a subspace S and that f(s) < p(s) for all s in S. Then there
is a linear functional F defined on X such that F(x) < p(x) for all x,
and F(s) --- f(s) for all s in S.
Proof: Consider all linear functionals g defined on a subspace
of X and satisfying g(x) < p(x) whenever g(x) is defined. This set is
partially ordered by setting g 1 -< g2 if g2 is an extension of g l , that is,
if the domain of g 1 is contained in the domain of g2 and g 1 --- g2 on
the domain of gl .
By the Hausdorff Maximal Principle there is a maximal linearly
ordered subfamily {ga } which contains the given functional f We
define a functional F on the union of the domains of the ga by
setting F(x) = g(x) if x is in the domain of ga. Since {g a} is linearly
ordered, this definition is independent of the ga chosen. The domain
of F is a subspace and F is a linear functional, for if x and y are in the
domain of F, then x e domain ga and y E domain go for some a, 13.
By the linear ordering of {ga}, we have either ga ..< go or gfl -‹ ga,
say the former. Then x and y are in the domain of go, and so xx ± py
is in the domain of go and so in the domain of F, and F(xx ± uy) =
go (Xx + uy) = Xg o (x) ± ,ugo (y) = XF(x) + ,uF(y). Thus F is an
extension of f Moreover, F is a maximal extension. For if G is any
extension of F, gc, < F < G implies that G must belong to {ga} by
the maximality of {g al. Hence G < F, and so G = F.
It remains only to show that F is defined for all x e X. Since F is
maximal, this will follow if we can show that each g which is defined
on a proper subspace T of X and satisfies g(t) < p(t) has a proper
extension h.
Let y be an element in X — T. We shall show that g may be
extended to the subspace U spanned by T and y, that is, to the
subspace consisting of elements of the form Xy + t with t c T. If h
is an extension of g, we must have
h(Xy ± t) = xh(y) ± h(t) = xh(y) + g(t),
and so h is defined as soon as we specify h(y).
For t h t2 e T we have
g(ti) + g(t2) = g(ti ± t2) P(t1 + t2)
P(II. - y) + P(t2 ±
Y).
188
Banach Spaces
[Ch. 10
Hence
—p(t i — y) + g(t1 )
p(t2 + y) — g(t 2),
and so
inf [p(t ± y) — g(t)].
sup [—p(t — y) + g(t)]
teT
teT
Define h(y) = «, where
a
is a real number such that
sup [—p(t — y) ± g(t)]
inf [p(t ± y) — g(t)].
a
We must now show that
h(Xy
t) = X«
g(t)
p(Xy
t).
If x > 0, then
Xa
g(t) = X[a
g(t/ X)]
< X[{p(t/X ± y) — g(t1X)}
= Xp(t/ X + y) = p(t
If x =
g(t1X)]
Xy).
ju < O, then
—
g(t) =
a
g(t/ A))
< 1.4({p(t/ — y) — g(t/ p,))
=
Thus h(Xy
of g. 1
t) p(Xy
g(t1,u))
— Y) = P(t
t) for all x, and h is a proper extension
The Hahn-Banach Theorem has a wide range of applications, many
of them involving a clever choice of the subadditive function p.
Propositions 6 and 7 and Theorem 20 are applications of this sort.
The following proposition is a generalization of the Hahn-Banach
Theorem which is useful in certain applications (cf. Problems 20 and
21). By an Abelian semigroup of linear operators on a vector space
X we mean a collection G of linear operators from X to X such that
if A and B are in G, then AB = BA and AB is in G. We also assume
that the identity operator belongs to G.
5. Proposition: Let X, S, p, and f be as in Theorem 4, and let G
be an Abelian semigroup of linear operators on X such that for every
A in G we have p(Ax) < p(x) for all x in X, while for each s in S we
have As in S and f(As) = f(s). Then there is an extension F off to a
Sec. 33
189
Linear Functionals and the Hahn-Banach Theorem
linear functional on X such that F(x) < p(x) and F(Ax) = F(x) for
all x in X.
Proof: Define a function q on X by setting
q(x)
. 1
mf - p(A i x + • • • + A ux),
n
where the inf is taken over all finite sequences (A 1 ,
, A n) from
G. We clearly have q(x) p(x) and q(ax) = aq(x) for a O.
For any x and y in X and any e > 0, we can choose (A1, . . . , An)
and (B 1 ,. , Bm), so that
1
- p(A ix + • • • + Anx) < q(x) +
n
and
1
PAY + • • • + AnY) <q) +
E.
Then
q(x + y) <
nm
i=1 1=1
A iB,(x + y))
5_1 p(E B ; (E
nm
-
n
+
1
nm
p (E A i (E BiY))
J-1
m
p (E A ,x) + p (E B.iy)
m
< q(x) + q(y) + 26.
Since e is arbitrary,
q(x + y) < q(x) + q(y).
For s in S,
1
./(5 ) = -n-
M
1
+ • • • + Ans) 5_ - p(A is + • • • + Ans).
n
Hence f(s) < q(s), and we may apply Theorem 4 with p replaced by
q to obtain an extension F of f to all X such that F(x) < q(x) <
p(x). It remains only to show that F(Ax) = F(x). Now
q(x - Ax)
1
- p((x - Ax) + A(x - Ax) + • • • + A n (x - Ax))
n
11
n +1
= - p(x - A
X) 5_ -[p(x) + p(-x)].
190
Banach Spaces
[Ch. 10
Thus g(x Ax) < O. Since
F(x) — F(Ax) = F(x — Ax) g(x — Ax) <
0,
we have F(x) < F(x), and applying this to —x, we get F(x) =
F(Ax). I
6. Proposition: Let x be an element in a normed vector space X.
Then there is a bounded linear functional f on X such that f(x) =
f I 11x11.
Proof: Let S be the subspace consisting of all multiples of x, and
define f on S by f(Xx) = X 114, and set p(Y) = Then by the
Hahn-Banach Theorem there is an extension of f to be a linear
we have
functional on X such that f(y)
!(iii. Since f(—y)
f I 5_ 1. Also f(x) = 11x11
f • 1411. Thus Il fII - 1 and f(x) 11f 1 . 11x11. I
7. Proposition: Let T be a linear subspace of a normed linear
space X and y an element of X whose distance to T is at least 6, that is,
an element such that Ily — Ill > 5 for all t e T. Then there is a bounded
linear functionalf on X with Il fil < 1, f(y) = 6, and such thatf(t) = 0
for all t in T.
Proof: Let S be the subspace spanned by T and y, that is, the
subspace consisting of all elements of the form ay t with t e T.
Define f( ay t) = ab. Then f is a linear functional on S, and, since
//all ab, we have f(s) Hsll on S. By the
= jal •
Hay
Hahn-Banach Theorem we may extend f to all of X so that
f(x)
IX. But this implies 11f II < 1. By the definition of f on S,
we have f(t) = 0 for t e T and f(y) = 6.1
The space of bounded linear functionals on a normed space X is
called the dual (or conjugate) of X and is denoted by X*. Since R
is complete, the dual X* of any normed space Xis a Banach space by
Proposition 3. Two normed vector spaces are said to be isometrically
isomorphic if there is a one-to-one linear mapping of one of them
onto the other which preserves norms. From an abstract point of
view, isometrically isomorphic spaces are identical, the isomorphism
merely amounting to a renaming of the elements. In Chapter 6 we
saw that the dual of LP was (isometrically isomorphic to) Lq for
Sec. 3] Linear Functionals and the Hahn-Banach Theorem
191
1 < p < 00 and that there was a natural representation of the
bounded linear functionals on LP by elements of L.
We are now in a position to show that a similar representation
does not hold for bounded linear functionals on L[0, 1]. We note
that C[0, 1] is a closed subspace of L"[O, 1]. Let f be that linear
functional on C[0, 1] which assigns to each x in C[0, 1] the value
x(0) of x at O. It has norm 1 on C, and so can be extended to a
bounded linear functional F on L'[0, 1]. Now there is no y in L 1 [0, 1]
such that F(x) =
xy dt for all x in C, for let (x n ) be a sequence
of continuous functions on [0, 1] which are bounded by 1, have
x„(0) = 1, and are such that x(t)
0 for all t
O. Then for each
y e
x„y —> 0, while F(x) = I.
If we consider the dual X** of X*, then to each x in X there
corresponds an element <lox in X** defined by (iox)( f) --= f(x). We
have Ikpx11 = sup f(x). Since f (x) _< 11 f 11 11x11, we have 11(pxll 11-4
while by Proposition 6 we have an f of norm 1 with f(x) = 114
Hence Ikpxli = ilx11. Since (,0 is clearly a linear mapping, yo is an
isometric isomorphism of X onto some linear subspace (p[X] of X**.
The mapping ço is called the natural isomorphism of X into X**, and
if (p[X] = X** we say that X is reflexive.
Thus D' is reflexive if 1 < p < cc. Since there are function ais on
L.' which are not given by integration with respect to a function in L',
D'
it follows that L l is not reflexive. This and Problem 22 show that L
is not reflexive. It should be observed that X may be isometric with
X** without being reflexive.
By Proposition 3, the space X** is complete, and so the closure
yo J1 in X** must be complete by Proposition 7.14. Thus each
normed vector space is isometrically isomorphic to a dense subset of
a Banach space.
Before closing this section, we add a word about the Hahn-Banach
Theorem for complex vector spaces. A complex vector space is a
vector space in which we allow multiplication by complex scalars.
The following extension of the Hahn-Banach Theorem for complex
spaces is due to Bohnenblust and Sobczyk :
8. Theorem: Let X be a complex vector space, S a linear subspace, p a real-valued function on X such that p(x + y) < p(x) p(y),
and p(ax) = lalp(x). Let f be a (complex) linear functional on S such
that 1 f(s) ; < p(s) for all s in S. Then there is a linear functional
192
Banach Spaces
[Ch. 10
F defined on X such that F(s) = f(s) for s in S and 1F(x)1 p(x) for
all x in X.
Proof: We first note that X can be considered as a real vector
space if we simply ignore the possibility of multiplying by complex
constants. A mapping F from X to the complex numbers which is
linear in the real sense is linear in the complex sense if and only if
F(ix) = iF(x) for each x. On S define g and h by taking g(s) to be
the real part of f(s) and h(s) the imaginary part. Then g and h are
linear in the real sense and f = g + ih. Sincefis linear in the complex
sense, g(is) ih(is) = f(is) = if(s) = ig(s) — h(s), and we see that
h(s) = —g(is).
Since g(s) < if(s)1
p(s), we can extend g to a functional G on
X which is linear in the real sense and satisfies G(x) < p(x). Let
F(x) = G(x) — iG(ix). Then F(s) = f(s) for s in S. Since F(ix) =
G(ix) — iG(—x) = i[G(x) — iG(ix)], we have F linear in the complex sense. For any x, choose co to be a complex number such that
= 1 and coF(x) = IF(x)I• Then IF(x)I = coF(x) = F(wx) =
G(cox)
p(cox) = p(x). I
Problems
17. Show that a linear functionalf on a normed linear space is bounded
if its kernel is closed. (The kernel off is {x: f(x) = 0}.)
18. Let T be a linear subspace of a normed linear space X and y a
given element of X. Show that inf ilY
tii = sup ff(y): 11f11 =
teT
f(t) = 0 all te T1.
19. Prove Proposition 7 by taking S to be the subspace consisting of
multiples of y, f(Xy) = XS, and p(x) = inf Ilx — 11.
teT
20. Let r be the space of all bounded sequences. Use Proposition 5 to
show that there is a linear functional F on l with the following properties:
i. lim
Mtn
lim En .
F[(E.)1
n.)] F[(En)] F[(77 71 )].
iii. F[(42 )] =
iv. If ?hi = E+ i, then F[()] =
The functional F is called a Banach limit and is often denoted by LIM.
Sec. 4]
The Closed Graph Theorem
193
21. Use Proposition 5 to show that there is a set function g defined for
all bounded sets of R with the following properties:
i. If A n B = 0,
AB.
U B) = gA
t) =
iii. If A C B, MA < AB.
iv. If A is Lebesgue measurable, then AA is the Lebesgue measure of A.
[Hint: It is easier to work with integrals than sets.]
22. Show that a Banach space X is reflexive if X* is reflexive. [Hint:
If ço[X] is not all of X**, then there is a nonzero functional y e X***
such that y(x) 0 for all x e ço[X].]
23. If S is a linear subspace of a Banach space X, we define the
annihilator S ° of S to be the subset S° = ly e X*: y(s) = 0 for all s e SI.
If T is a subspace of X*, we define T° = {x e X: t(x) = 0 for all t e
a. Show that S° is a closed linear subspace of X.
b. Show that S" =
c. If S is a closed subspace of X, then S* is isomorphic to X'/S° .
d. If S is a closed subspace of a reflexive Banach space X, then
S is reflexive.
24. Let X be a vector space and P a subset of X such that x, y e P
implies x ± y 6 P and ax e P for a > O. Define a partial order in X by
defining x < y to mean y — x e P. A linear functional f on X is said to
be positive (with respect to P) if f(x) > 0 for all x e P. Let S be any subspace of X with the property that for each x e X there is an s e S with
x < s. Then each positive linear functional on S can be extended to a
positive linear functional on X. [Hint: The transfinite part of the proof is
the same as that for the Hahn-Banach Theorem. The possibility of extending a functional to a space containing one more element is even simpler
than for the Hahn-Banach Theorem.]
25. Let f be a mapping of the unit sphere S = {x: 11,4 < 1} into R
such that f(ax + 0y) = af(x) )3f(y) whenever x, y and ax )33/ are in
S. Show that fmay be extended to all of X so that it is a linear functional.
4 The Closed Graph Theorem
A mapping from one topological space to another is called an open
mapping if the image of each open set is open. Thus a one-to-one
continuous open mapping is a homeomorphism. We shall show that a
continuous linear transformation of one Banach space onto another
194
Banach Spaces
[Ch. 10
is always an open mapping, and use this to give criteria for the continuity of a linear transformation. We begin with a lemma.
9. Lemma: Let A be a continuous linear transformation of the
Banach space X onto the Banach space Y. Then the image by A of the
unit sphere in X contains a sphere about the origin in Y.
Proof: Let S = {x: 11x11 < l/2). Since A is onto and
U
X
k=1
we have
Y = U kA(9 1 ).
k=1
But Y is a complete metric space, and so is not of first category in
itself. Consequently, A(S i) cannot be nowhere dense, and A(S i)
contains some sphere, say
{Y: 11Y — P il < 71}
Then A(S i) — p contains the sphere
{_I': !IA <
But
A(S 1 ) — p c A(S 1 ) — A(S 1) c 2A(S 1 )
A(S0).
Thus A(S0) contains a sphere about the origin of radius n snd so by
the linearity of A, A(S) contains a sphere about the origin of radius
n/2n.
We now proceed to show that A(S0) contains a sphere of radius
n/2 about the origin. Let y be an arbitrary point of Y with yjj < n/2.
Since y e A(S i), we can choose .X1 E Si such that
A(x1)11 <
Similarly, we may choose x2 e
S2
•
such that
Ify — A(x j ) — A(x2 )11 <
Sec. 4]
The Closed Graph Theorem
195
and we continue so that we choose x r, e Sr, with
A(Xk)
-E
k-1
Since ilXkii < 1/2k,
2n+1
E x k is absolutely convergent and
x=
E Xk
k=i
belongs to S o . Moreover,
A(x) = A(ix) =
A(Xk) =
17.
Thus y e A(S 0) and so
1
Y:
<
C A(S 0 ). I
10. Proposition: A continuous linear transformation A of a Banach
space X onto a Banach space Y is an open mapping. Thus in particular,
if A is one-to-one, it is an isomorphism.
Proof: Let 0 be an arbitrary open set of X and y any point of
A[0]. Then there is some x e 0 such that y = A(x). Since 0 is open
there is a sphere S containing x and contained in O. But Lemma 9
states that A[S
x] must contain a sphere about the origin or that
A[S] must contain a sphere about y. Thus y is contained in a sphere
which is contained in A[0], and so A[0] is open. I
—
11. Proposition: Let X be a linear vector space which is complete
in each of the norms
and Hi Ili, and suppose there is a constant C such
that
11.x11 <
lx111
for all x e X. Then the norms are equivalent. That is to say, there is a
second constant C' such that
11141 .5_ c'llxII
for all x e X.
Proof: The identity map of (X, Ili Ili) onto (X,II II) is a one-to-one
continuous linear transformation and so must be an isomorphism
by Proposition 10. Therefore the inverse mapping must be bounded. I
196
Banach Spaces
[Ch. 10
12. Closed Graph Theorem: Let A be a linear transformation on a
Banach space X to a Banach space Y. Suppose that A has the property
that, whenever (x) is a sequence in X which converges to some point
x and (Ax„) converges in Y to a point y, then y = Ax. Then A is
continuous.
Proof: Define a new norm in X by
111x111 = 114 + 111x11.
Then X is complete in the norm I I H. For if iiixp
xqiii —> 0, then
i lx p — x,11 —+ 0 and IlAx, — Ax,11 —> O. Hence by the completeness
of X and Y we have x c X and y c Y such that Ix — x11 0 and
Ax,
—> O. By the hypothesis of our theorem y = Ax. Hence
0, and X is complete with respect toll' HI. But now
llIx — xl
Proposition 11 applies, and there is a C' such that
11xII
IlAx11
ClIx11.
Thus
IlAx 1
c llx 11,
and A is bounded. 1
The graph of a mapping of X into Y is just the set of all (x, Ax)
in X X Y. The hypothesis of Theorem 12 merely states that the graph
of A is closed.
Another consequence of the theory of category is the following
proposition, which is known as the principle of uniform boundedness.
13. Proposition: Let X be a Banach space and
a
a family of
bounded linear operators from X to a norrned space Y. Suppose that
for each x in X there is a constant Ms such that liTx11 < Mr for all T
in a. Then the operators in a are uniformly bounded; that is, there is
a constant M such that ITII < M for all T in a.
Proof: For each T the function f defined by f(x) = IITxII is a
real-valued continuous function on X. Since the family of these
functions is bounded at each x in X and X is complete, there is by
Theorem 7.17 an open set 0 in X on which these functions are uniformly bounded. Thus there is a constant M' such that lTxlI< M'
Sec. 5]
Topological Vector Spaces
197
for all x e O. Let y be a point in O. Since 0 is open, there is a
sphere S ---- {x: 11x — yli < 6} of some radius 6 centered at y and
contained in O. If IA _._. 5, then Tz — T(y ± z) — Ty with y + z
in S c O. Hence 117'4 _< ilT(y ± z)jj + IITyll < M' ± M. ConM' ± My
sequently VII __<
for all T in ff". I
a
Problems
26. Let (Ty ) be a sequence of continuous linear operators on a Banach
space X to a normed vector space Y, and suppose that for each x in X the
sequence (7,0c) converges to a value Tx. Then T is a bounded linear
operator.
27. Let A be a bounded linear transformation from a Banach space
X to a Banach space Y, and let M be the kernel and S the range of A.
Then S is isomorphic to X/M if and only if S is closed.
28. Let S be a linear subspace of C[0, 1] which is closed as a subspace
of L 2 [0, 1].
a. Show that S is a closed subspace of C[0, 1].
b. Show that there is a constant M such that for all f c S we have
ilf112 -• 11f11. and 11f11. _._ milf112.
c. Show that for each y c [0, 1] there is a function Icy in L 2 such
that for each f e S we have f(y) = f k(x)f(x) dx.
The space S can be shown to be finite dimensional (See Problem 41 or
55).
29. a. Give an example of a discontinuous operator A from a normed
linear space X to a Banach space Y such that A has a closed graph.
b. Give an example of a discontinuous operator A from a Banach
space X to a normed linear space Y such that A has a closed graph.
*5 Topological Vector Spaces
Just as the notion of a metric space generalizes to that of a topological space, so the notion of a normed linear space generalizes to
that of a topological vector space: A linear vector space X with a
topology 5 on it is called a topological vector space if addition is a
continuous function from X X X into X and multiplication by
scalars is a continuous function from R X X into X. It follows from
Banach Spaces
198
[Ch. 10
the continuity of addition that translation by an element x is a homeomorphism and that the translate x + 0 of an open set 0 is open.
Any topology with this property on a vector space is said to be
translation invariant. If 3 is a translation invariant topology on X
and 63 a base for 3 at 8, then the sets of the form x + U, U e 63,
form a base for 3 at x. Thus it suffices to give a base at 0 in order to
determine a translation invariant topology. A base at 0 is often called
a local base. The following proposition gives conditions on a set
63 which guarantee that it will be a base for a topology for a topological vector space and states that we can always find a base satisfying
these conditions.
14. Proposition: Let X be a topological vector space. Then we can
find a base 63 at 0 which satisfies the following:
i.
ii.
iii.
iv.
v.
If
If
If
If
If
U, V e OE, then there is a WeOE with Wc Un V.
U e OE and x e U, there is a V e 63 such that x + V c U.
U e OE, there is aVeOE such that V + V c U.
U e (33 and x e X, there is an a e R such that x e a U.
U e (33 and 0 < lai < 1, then aU c U and aU e 63.
Conversely, given a collection (33 of subsets containing 0 and satisfying
the above conditions, there is a topology for X making X a topological
vector space and having 63 as a base at O. The topology will be Hausdorff
if and only if
vi.n {U e 63} = {0}.
The proof of the proposition is left to the reader. We note that if X
is a normed linear space we may take 63 to be the set of spheres
about 0, and the proposition gives us a base for the general case
which has many of the properties possessed by the collection of
spheres.
In a topological vector space we can compare the neighborhoods
at one point with those at another by translation. Thus it is possible
to speak of uniform properties: A mapping f from a topological
vector space X into a topological vector space Y is said to be uniformly continuous if for any open set O containing the origin in Y
there is an open set U containing the origin in X such that for each
x e X we have f[x + U] c f (x) + O. It is readily seen that a linear
transformation from X to Y is uniformly continuous if it is con-
Sec. 5]
Topological Vector Spaces
199
tinuous at one point. A linear mapping ço of X onto Y is called a
(topological) isomorphism if cp and io-1 are both continuous. From
an abstract point of view isomorphic spaces are the same. The
following theorem tells us that the only topology on a finite-dimensional vector space which makes it into a topological vector space is
the usual one.
15. Proposition (Tychonoff): Let X be a finite-dimensional
topological vector space. Then X is topologically isomorphic to Rn
for some n.
Some suggestions for the proof of this proposition are given in
Problem 33. Useful corollaries are given in Problems 34, 35, and 36.
Problems
30. Prove Proposition 14:
a. A collection CB of subsets containing 0 is a base at 0 for a
translation invariant topology if and only if (i) and (ii) hold.
b. Addition is continuous from X X X to X if and only if (iii).
c. If multiplication by scalars is continuous (at (0, 0)) from R X X
to X, then (iv).
d. If X is a topological vector space, then the family 63 of all open
sets U which contain 0 and such that aU C U for all a with l al < 1 is a
local base for the topology and satisfies (v). [If 0 is any open set containing 0, the continuity of multiplication implies that there is an open set V
containing 0 and an e > 0 such that XV c 0 for all I XI < e. Then
U = U XV is open, Be Uc 0, and OEU c U for each a with l al < 1.]
ixl<
e. If da satisfies the conditions of the proposition, then it generates
a topology in which multiplication by scalars is continuous from R X X to
X. [Show that (iv) and (v) imply continuity at (0, x) and (a, 0), and use
(iii).]
f. If X is T1 , then (vi) holds. If (vi) and (iii) hold, then X is
Hausdorff.
31. a. Show that a linear transformation from one topological vector
space to another is uniformly continuous if it is continuous at one point.
b. Show that a linear functional f on X is continuous if there is
an open set 0 such that f[0] R. [Hint: You can take 0 to satisfy
property (v) of Proposition 14.]
200
Banach Spaces
[Ch. 10
32. Let X be a topological vector space and M a closed linear subspace.
Let so be the natural homomorphism of X onto X/M, and define a topology
on X/M by taking 0 to be open if and only if so-1 [0] is open in X. Then
this topology makes X/M a topological vector space and so a continuous
open map. When we speak of X/M as a topological vector space, we always
mean X/M with this topology.
33. Prove Proposition 15:
a. If X is n-dimensional, there is a continuous one-to-one linear
map so of Ir onto X.
b. Let S and B be the subsets of RI' defined by S = {y: 1111 = 1 1,
B = fy: 113,11 < 11. Then go[S] is closed and X — ço[S1 open.
e. There is an open set U of X containing 0 such that aU C U
for each lal < land U C X — go[S].
d. The set U in (c) is contained in q,[B], and so so-1 is continuous.
34. Show that a finite-dimensional subspace M of a topological vector
space X is closed. [Hint: If x y M, let N be the finite-dimensional subspace
spanned by x and M. Then N has the usual topology, and so xis not a point
of closure of M.]
35. Let A be a linear mapping of a finite-dimensional topological
vector space X into a topological vector space Y. Then A is continuous.
[Hint: The range of A is finite-dimensional and hence has the usual
topology.]
36. A linear mapping A from a topological vector space X to a finitedimensional topological space Y is continuous if its kernel M is closed.
[Hint: A = B 0 ço, where so is the natural mapping of X on X/M and B
is continuous by Problem 35.]
37. Prove that every locally compact topological vector space X is
finite-dimensional. [Hint: Let V be a neighborhood of 0 with T7 compact
and aVC V for each a, lal G 1. If we cover T7 by a finite number of
translates x i + V, . . . , xr, + V, then x l, . . . , xt, are a basis for X.]
*6
Weak Topologies
If X is any vector space and a a collection of linear functionals
on X, we define the weak topology generated by a to be the weakest
topology such that each f in 5 is continuous (cf. Problem 8.20). This
topology is easily seen to be translation invariant, and a base at 0
for this topology is given by the sets {x: 1 fi(x)I < e, i = 1, . . . , n} ,
Sec. 6]
Weak Topologies
201
where e > 0 and {fi , . . JO is a finite subset of if. Since the family
of all such sets satisfy the conditions of Proposition 14, this topology
makes X a topological vector space. A sequence (or net) (xn ) converges to x in this topology if and only if f(x) f(x) for each
f e T.
If X is a normed vector space and the functionals in F are all
continuous (that is, if a c X*), then the weak topology generated
by if is weaker (has fewer open sets) than the norm topology of X.
We usually call the metric topology generated by the norm the strong
topology of X and the weak topology on X generated by X* the
weak topology of X. Thus we speak of strongly closed and strongly
open sets when referring to the strong topology and weakly open and
weakly closed sets for the weak topology. Every weakly closed set is
strongly closed but not conversely. Every strongly convergent
sequence (or net) is weakly convergent. While not every strongly
closed set is weakly closed, we do have the following proposition, a
generalization of which is given by Corollary 23.
16. Proposition: A linear manifold M is weakly closed if and only
if it is strongly closed.
Proof: Since every weakly closed set is strongly closed, we have
only to show that if M is strongly closed it is also weakly closed.
Suppose M is strongly closed and x is a point not on M. We must
show that x is not a point of closure of M in the weak topology.
Since x is not a point of closure of M in the strong (metric) topology
we have inf lix — sil > 6 > O. Hence by Proposition 7 there is a
seM
continuous linear functional f which vanishes on M and does not
vanish at x. Now {y: f(y) 0} is an open set in the weak topology
which contains x but does not meet M. Hence x is not a weak point
of closure of M. I
If we apply the notion of weak topology to the dual X* of a
normed space X we see that the weak topology of X* is the weakest
topology for X* such that all of the functionals in X** are continuous.
The weak topology for X* turns out to be less useful than the weak
topology for X* generated by X (or more precisely, by yo[X] where so
is the natural embedding of X into X**). This topology is called the
weak* topology for X* and is even weaker than the weak topology.
202
Banach Spaces
[Ch. 10
Thus a weak* closed subset of X* is weakly closed, and weak convergence implies weak* convergence. A base at 0 for the weak*
topology is given by sets of the form If: f f WI < e, i = 1, . . . , nl
where {x1 , . . . , x } is a finite subset of X. If X is reflexive, then the
weak and weak* topologies of X* coincide. Some of the importance
of the weak* topology stems from the following theorem:
17. Theorem (Alaoglu): The unit sphere S* = { f: IIIf It _. 11 of
X* is compact in the weak* topology.
Proof: If f c S*, then I f(x)I <_ 11x11, and so f(x) e [ - 11x11, 114.
Let Iz = [ 11x11, 114]. Then each f c S* corresponds to a point
in P = XL, since the latter is by definition the set of all functions
ze.x
f on X such that f(x) e I. Thus we may think of S* as a subset of P,
and the definition of the topology for P shows that the topology S*
inherits as a subspace of P is the weak* topology of S*. Since P is
compact by the Tychonoff theorem, S* will be compact if it is a
closed subset of P. Let f be a point of closure of S* in P. Then f is a
mapping of X into R. Since Ig(x)1 < 11x11 for g e S*, and evaluation
at x is a continuous function on P, we have I f(x)1 _
< 114. Let x, Y,
and z be three points of X such that z --= ax + 0y. For each e > 0
the set N = Ig e P: Ig(x) — f(x)I < e, rg(y) — f(y)I < e, and
ig(z) — f(z)I < el is an open subset of P containing f. Since f is a
point of closure of S*, we can find a g in S* n N. Since this g is
linear (being in S*) we have g(z) = ag(x) + ,3g(y). Hence it follows
that I f(z) — a f(x) — 0f(y)1 < e(1 + la! ± I01). This inequality
holding for each positive e, we have f(z) = af(x) ± g(y), and f is
linear on X. Thus f is in S*, and so S* is closed. I
-
Problems
38. a. Show that if x„ -- x weakly, then (11xn ii) is bounded.
b. Let (x„) be a sequence in P, 1 < p < oc, and let xn =
(tnt,n):=1. Show that (xn) converges weakly to x = (E„,,) if and only if
(ilx,ill) is bounded and for each m we have t,,,,n ---- Em.
c. Let ()en) be a sequence in LP[O, 1], 1 < p < oc. Show that
(xn) converges weakly to x if (11x.„11) is bounded and (xn) converges to x
in measure (cf. Problem 6.16).
Sec. 7] Convexity
203
d. In P', 1 < p < oo, let xr, be that sequence whose nth term is one
and whose remaining terms are zero. Then (xn ) does not converge in
the strong topology, but x„, —+ 0 in the weak topology.
e. Let (x„) be as in (d), and define Yn m =
nx„,. Then the
set F =
m > n) is strongly closed. [Hint: The distance between
any two points of F is at least 1. Hence F contains no nonconstant sequences which converge in the strong topology.]
f. The set F in (e) has 6 as a weak closure point. However, there is
no sequence (z,„) from F which converges weakly to zero.
39. a. Let S be a bounded subset of a normed space X. Let 5 be a set of
functionals in X* and let 5.0 be a dense subset of ff• (dense in the sense
of the norm topology on X*). Then g• and ff•c, may generate different weak
topologies for X, but these topologies are the same on S; that is, S inherits
the same topology from each.
b. Let S* be the unit sphere in the dual X* of a separable Banach
space X. Then the weak* topology on S* is metrizable. (Caution: This
does not mean the weak* topology on X* is metrizable!)
40. Show that every weakly compact set is bounded in the norm topology.
41. Let S be the linear subspace of C[0, 1] given in Problem 28.
a. Show that if fi, —* f weakly in L 2, then fn(y) f(y) for each
y c [0, 1].
b. If fin
weakly in L 2, then
is bounded, and hence
f7, ---> f strongly in L 2 by the Lebesgue convergence theorem.
c. The space S is a locally compact subspace of L 2 and hence finitedimensional.
,
unto
*7
Convexity
A subset K of a vector space X is said to be convex if, whenever it
contains x and y, it also contains Xx + (1 — N)y for 0 < X < 1.
The set {z: z = Xx + (1 — X)y for 0 < X < 11 is called the line
segment joining x and y. The points x and y are its endpoints, and
a z for which 0 < X < 1 is called an interior point of the segment.
Thus a set K is convex if whenever it contains x and y it contains the
line segment joining x and y. Every linear manifold is convex, and
the unit ball in a normed space is convex. The following lemma gives
some basic properties of convex sets. Further properties are given in
204
Banach Spaces
[Ch. 10
Problems 42, 43, and 44. The proof of the lemma is straightforward
and is omitted.
18. Lemma: If K1 and K2 are convex sets, so also are the sets
K 1 n K2, XKI, and Ki + K2.
A point xo is said to be an internal point of a set K if the intersection with K of each line through xo contains an open interval
about xo. Thus xo is internal to K if, given x e X, there is an E > 0
such that xo Xx e K for all X with IX! < E. Let K be a convex set
which contains 0 as an internal point. Then we define the support
function p of K (with respect to o) by p(x) = inf {x: X -1 x e K, X > .
We have the following properties for this support function:
19. Lemma: If K is a convex set containing 0 as an internal point,
then the support function p has the following properties:
j.
p(Xx) = Xp(x) for X _> O.
P(4 + P(Y).
P(x
{x: p(x) < 1} c KC {x: p(x) < 1}
Proof: The first and third properties follow immediately from
the definition of p. To prove the second, suppose that x —ix and if ly
belong to K Then
(X + 12) -1 (x + Y) = X(X
A) -1 (X -1x)
m(X
A) -1 ( 2-1Y)
belongs to K, since K is convex. Thus p(x + y) < X + A, and
taking infima over all admissible X and A we obtain p(x + y) <
p(x)
IAA I
Two convex sets K1 and K2 are said to be separated by a linear
functional f if there is a real number a such that f(x) < a on K1
and f (x) > a on K2.
20. Theorem: Let K 1 and K2 be two disjoint convex sets in a
vector space X, and suppose that one of them has an internal point.
Then there is a nonzero linear functional f which separates K1 and K2.
Proof: Let x1 be an internal point of K1. Then K1 — K2 is convex,
and the point xo = x1 — x2 is an internal point of K1 — K2 for any
Sec. 7] Convexity
205
x2 in K2. Let K K 1 — K2 — x0. Then K is a convex set containing
O as an internal point. Since K1 and K2 are disjoint, 19 e K1 — K2, and
so —xo e K.
Let p be the support function of K (with respect to 6). Then
p(— x 0) > 1. Let S be the one-dimensional subspace of X consisting
of all multiples of x0. On S define f by f(ax 0) = —a. Then f(s) p(s)
and p satisfies the condition of the Hahn-Banach Theorem by Lemma
19. Thus we may extend f to a linear functional defined on all of X
so that f(x) < p(x) for all x. Thus if x e K, we have f(x) < 1.
Let x e K1 and y e K2. Then x — y — xo e K, and we have
f(x) — AY)
—
f(x0) = f(x — y — xo) < I.
Since f(x 0) = —1, we have f(x) < f(y). This being true for each
x e K1 and each y in K2, we have sup f (x) < inf f(y). Thus f
xeKi
yeK2
separates K1 and K2, andfis a nonzero functional sincef(x0) = —1.1
A topological vector space is called locally convex if we can find a
base for the topology consisting of convex sets. The following
proposition gives a convenient criterion for ensuring that a given
topology 5 for a vector space X will make X into a locally convex
topological vector space. Suggestions for its proof are given in Problem 46.
21. Proposition: Let oz, be a family of convex sets in a vector
space X. Then the following conditions are sufficient for the translates
of sets in t to form a base for a topology which makes X into a locally
convex topological vector space:
i. If N e at, each point of N is internal.
ii. If N 1 and N2 are in at, there is an N3 in at with N3 C N1 n N2.
If N is in at, then for each a with 0 < lai < 1 we have aN c 9t.
Moreover, in each locally convex topological vector space there is a
base it at 0 which satisfies these conditions.
It follows from this proposition that the weak topology on a vector
space X which is generated by a family of linear functionals makes X
into a locally convex topological vector space. Also a normed vector
space is a locally convex topological vector space.
206
Banach Spaces
[Ch. 10
22. Proposition: Let X be a locally convex topological vector
space and F a closed convex subset. Let xo be a point of X not in F.
Then there is a continuous linear functional f on X such that
f(x0) < inf AX).
xe
Proof: By translating by —x 0, we reduce the proposition to the
case that x o = O. Since o is not a point of closure of F, there is a
convex open set N which contains 0 but does not meet F. Let 0 —
N n
N). Then 0 is an open convex set containing 0 and disjoint
from F, and —0 = O. Since o is an internal point of 0 (see Problem 44a), there is by Theorem 20 a nonzero linear functional f such
that sup f(x) < inf f(y) = a. Thus f(x) < a for x in 0, and since
(—
xe0
ye?
x e 0 implies —x e 0, we have — f (x) < a on O, whence I f(x)1 < a
on O. Thus for each e > 0, we have I f(x)i < e on the set 0' =
(€a -1 )0. But 0' is an open set containing 0, and so f is continuous at
O. Since f is linear and continuous at 0, it is continuous everywhere.
It remains only to show that a > 0. Sincef is a nonzero functional,
there is an x such that f(x) > 0. Since 0 is an internal point of 0, we
can choose X > 0 so that Xx is in O. Then
0 < Xf(x) = f(Xx) <
a.
I
23. Corollary: Let K be a convex set in a locally convex topological
space. Then K is strongly closed if and only if it is weakly closed.
24. Corollary: Let x and y be distinct points of a locally convex
topological vector space X. Then there is a continuous linear functional
f such that f(x) f(y).
Let K be a convex subset of a vector space X. A point x in K is
called an extreme point if it is not an interior point of any line
segment lying in K. Thus x is extreme if whenever x = Xy ± (1 — X)z
with 0 < X < 1, we have y K or z K. We are going to show that
every compact convex set has extreme points, but we first consider
some preliminary notions. A subset S of a convex set K is called a
supporting set of K if it is closed and convex and has the property
that if an interior point of a line segment in K belongs to S then the
entire line segment belongs to S. Thus an extreme point is a supporting set consisting of exactly one point.
Sec. 7]
207
Convexity
25. Lemma: Let f be a continuous linear functional on a closed
convex set K. Then the set S of points where f assumes its maximum
on K is a supporting set of K.
Proof: The set S is convex, for if f(x) = m and f(y) = m, then
f(Xx + (1 — X)y) = m. If the line segment joining x to y is in K
and f assumes its maximum m at the point z = Xx + (1 — X)y,
then m = Xf(x) + (1 — X) f(y), and since f(x) and f(y) are not
greater than in, we must have f(x) =[(y) = m. But this implies that
f(Ax + (1 — p.)y) = m, and so the entire line segment joining x to
y is in S. I
The intersection of all convex sets containing a set E is a convex set which contains E and which is contained in every convex
set containing E. This set is called the convex hull of E. The
intersection of all closed convex sets containing E is a closed
convex set which contains E and which is contained in every closed
convex set containing E. This set is called the closed convex hull
of E.
26. Theorem (Krein Milman): Let K be a compact convex set
in a locally convex topological vector space X. Then K is the closed
convex hull of its extreme points.
-
Proof (Kelley): We assume that K is not empty. It follows from
the definition of supporting sets that the intersection of a collection
of supporting sets of K is a supporting set of K and that if S is a supporting set of K and T a supporting set of S then T is a supporting
set of K.
Given any nonempty supporting set S of K, the family of all nonempty supporting sets of K is partially ordered by inclusion, and by
the Hausdorff maximal principle there is a maximal linearly ordered
family 8 of nonempty supporting sets with S in 8. Since K is
compact, the intersection T of all members of s is nonempty and
hence is itself a nonempty supporting set of K. It is, moreover, a
minimal nonempty supporting set, for if T properly contained a supporting set the family s would not be maximal. Thus any supporting
set contains a minimal nonempty supporting set. Now a minimal
nonempty supporting set can contain only one point. For if a supporting set S contains the distinct points x and y, there is a continu-
208
Banach Spaces
[Ch. 10
ous linear functional f such that f(x) > f(y). Then the subset of S
where f attains its maximum is by Lemma 25 a supporting subset of
S and hence of K. Since K is compact, it is a nonempty supporting
subset of K which does not contain y.
If a supporting set consists of just one point, that point must be
extreme. Hence we have shown that every nonempty supporting set
contains an extreme point.
Since the subset of K where a linear functional assumes its maximum is a nonernpty supporting set, we conclude that the maximum
of a continuous linear functional on K is equal to its maximum on the
set E of extreme points of K. Let C be the closed convex hull of the
extreme points of K, and suppose x 0 C. By Proposition 22 there is a
continuous linear functionalf such that f(x) > max f(y) = max f(y).
yee
Thus x
yeK
K, and we have K c C. Hence K = C. I
Problems
42. Let A be a linear operator from the vector space X to the vector
space Y. Then the image of each convex set (or linear manifold) in X is a
convex set (or linear manifold) in Y and the inverse image of a convex set
(or linear manifold) in Y is a convex set (or linear manifold) in X. Give an
example to show that a nonconvex set may have a convex image.
43. Show that the closure of a convex set K in a topological vector space
is convex.
44. a. Show that each interior point of a convex subset of a topological
vector space is internal. [Hint: Use the continuity of multiplication.]
b. Show that in R" each internal point of a convex set is an interior
point.
c. Give an example of a set in the plane which has an internal point
that is not an interior point.
45. Let K be a convex set containing 6, and suppose that x is an internal
point of K. Then for some X > 0 the set x
XK is contained in K. [Hint:
Choose X > 0 so that (1 — X) lx is in K.]
46. Prove Proposition 21. [Hint: If N is convex, then IN + N C N.
Use Proposition 14 and its proof.]
47. Strongest locally convex topology. Let X be a vector space and 63
the collection of all convex sets V containing 0 such that for each x e X
Sec. 7]
209
Convexity
there is an a > 0 with ax e V. Then IT. is a local base for a locally convex
topology on X, and this topology is stronger than any other locally convex
topology on X.
48. a. In LP[O, 1], 1 < p < oo , every x with IX = 1 is an extreme
point of the unit sphere S =
114 < 11.
b. In L'[0, lithe extreme points of the unit sphere are those x such
that lx(i)1 = 1 a.e.
c. The unit sphere in L 1 [0, 1] has no extreme points.
d. L 1 [0, 1] is not the dual of any normed linear space.
e. What are the extreme points of the unit sphere in P?
f. What are the extreme points of the unit sphere in C(X), X a
compact Hausdorff space? Show that C[0, 1] is not the dual of any normed
linear space.
49. Let X be the vector space of all measurable real-valued functions on
[0, 1] with addition and multiplication by scalars defined in the usual way.
i x(t)1
Define a(x) =
dt.
fl 1 ± l x(t)I
a. We have o-(x ± y) < r(x)
a(x — y), p is a metric for X.
cr(y). Hence if we define p(x, y) =
b. In this metric xn
x if and only if xn --> x in measure.
c. X is a complete metric space (cf. Problem 4.25).
d. Addition is a continuous mapping of X X X into X.
e. Multiplication is a continuous mapping of R X X into X. [Since
X is a metric space, it suffices to prove that if xn
X then
x and X„
Xnxn Xx. This follows from the Bounded Convergence Theorem for
convergence in measure.]
f. Show that the set of step functions is dense in X.
g. There is no nonzero continuous linear functional on X. [Show
that there is an n such that f(x) = 0 whenever x is the characteristic
function of an interval of length less than 1/n. Hence f(x) = 0 for all
step functions x.]
h. The space X is a topological vector space which is not locally
convex.
i. Let s be the space of all sequences of real numbers, and define
2- 161 Prove the analogues of (a), (c), (d), and (e). What
1 + Itv i
is the most general continuous linear functional on s?
.
210
Banach
Spaces
[Ch.
10
8 Hilbert Space
By a Hilbert space we mean a Banach space H in which there is defined a function (x, y) on H X H to R with the following properties':
i. (ceiXi
a2X29 Y) = al(X1, Y)
(x, y) = (y, x).
(x, x) = 11x11 2 .
ct2(X29 Y).
We call (x, y) the inner product of x and y. Two examples are
immediate: One is the space Rn with
The other is the space L2 with
(x, y) =
f x(t)y(t) dt.
Since lxii > 0 with equality only for x = 0, we have
0
lix — XYII 2 = (x — XY, x — XY)
= (x, x) — 2X(x, y)
X 2(y, y).
If X > 0, we have
2(x, y)
Setting X =
XIIY112.
we obtain
(x, .Y)
11 ,4 • VII,
and we see that equality can only occur when y = 0 or x = Xy for
some x > O. This inequality is variously known as the Schwarz,
Cauchy-Schwarz, or Cauchy-Buniakowsky-Schwarz inequality. A
consequence of this inequality is that the linear functional f defined
and from this it follows that
by f(x) = (x, y) is bounded by
(x, y) is a continuous function from H X H to R.
We say that two elements x and y of H are orthogonal if (x, y) = O.
We write x 1 y to mean x and y are orthogonal. A set 8 in H is
1
have defined here a real Hilbert space. In analysis it is generally more convenient
to deal with a complex Hilbert space, that is, one in which the scalars are complex
numbers, the inner product is complex-valued, and (ii)is replaced by(ii'): (x,y) =
We
Sec. 8]
211
Hilbert Space
called an orthogonal system if any two different elements ço and 1,t, of
s are orthogonal, that is, (ço, 1,0 = 0. An orthogonal system s is
called orthonormal if 11(1011 = 1 for each ço in S. Any two elements of
an orthonormal system are at distance -0 from each other. Hence if
H is separable, every orthonormal system in H must be countable.
Henceforth we shall deal only with separable Hilbert spaces.
Thus each orthonormal system may be expressed as a sequence
((pp ), which may be finite or infinite We define the Fourier coefficients
(with respect to ((p p )) of an element x in H to be ap = (x, ço). For any
n we have 2
n
2
0<
X—
E
ario9
X11 2 — 2
=
E
av (x,
v=1
v=1
= 114 2 —
n
(I:0+
v=1 A=1
E ce.
v=1
Thus
E 442 ^ 11x11
v=1
and, since
n
was arbitrary, we have Bessel's inequality:
E
p=1
On the other hand, let
(ap )
be any sequence of real numbers with
E ap2 < oo . Then the sequence
Zn =
is a Cauchy sequence, since for m >
z„, — Zn =
n
E
av<Pv,
n+1
and we have
Ilzm — znII 2 = n+1
E a92,
2
If there are only a finite number N of elements in (), we adopt the convention
that
p=i
means
E
v=1
for N < < 00.
212
Banach Spaces
[Ch. 10
which must tend to zero since Ea v2 converges. By the completeness
of H there is an element y such that y = lim z n, and we write
oo
y =
E aviop.
v=1.
Since the inner product is continuous, we have
(y, cpp) = lim (z 7i, iop) = ay.
We have thus shown that for any x there is a y of the form
y = E apiop which has the same Fourier coefficients as x.
v 1
—
When does this y equal x? If we look at y — x, we see that all
its Fourier coefficients are zero. Hence y = x if the orthonormal
system (cp„) has the property that, if (z, (pr) = 0 for all y, then z = O.
An orthonormal system with this property is called complete (or
total). A complete orthonormal system is clearly a maximal one,
while, if (4op ) is a maximal orthonormal system, it must be complete.
For, if (z, io„) = 0 all y and z o 0, then zazil can be added to (40p).
The Hausdorff Maximal Principle implies the existence of a maximal
orthonormal system. We have thus established the following proposition:
27. Proposition: In a separable Hilbert space every orthonormal
system is countable and there is a complete orthonormal system. If
(cp„) is any complete orthonormal system and x any element of H,
we have
x
= E apio„
v=- 1
where a„ = (x,‘,0,,). Moreover, PO =
E a.
In a separable Hilbert space there are two alternatives: Either every
complete orthonormal system has an infinite number of elements or
else there is a complete orthonormal system with a finite number N
of elements. In the latter case such a system is a basis (in the vectorspace sense) of H by Proposition 27. Hence H is a finite-dimensional
vector space, and any system of N + 1 elements is linearly dependent.
Consequently, every orthonormal system can have at most N elements in it. From this it follows that every complete orthonormal
Sec. 8]
213
Hilbert Space
system must have N elements. We have thus proved that in a separable Hilbert space H every complete orthonormal system has the
same number of elements. We call this number the dimension of H.
(Thus if H has a countably infinite complete orthonormal system we
say dim H = N o.)
An isomorphism (P of a Hilbert space H onto a Hilbert space H'
is a linear mapping of H onto H' such that ((f,x, (Py) = (x, y). Thus
an isomorphism between Hilbert spaces is an isometry. Each ndimensional Hilbert space is isomorphic to Rn, since the mapping
defined by 4)(x) = (a1 ,. , an), where a, = (x, ço„) is an isomorphism. Similarly, every N o-dimensional Hilbert space is isomorphic
to 12 . Since L 2 [0, r] is separable and {cos vt} is an infinite orthogonal
system, we see that the dimension of L 2 is No, and so L 2 is isomorphic
to /2
28. Proposition: Let f be a bounded linear functional on the Hilbert
space H. Then there is a y eH such that f(x) = (x, y) for all x.
Moreover, IfM =
Proof: (We consider only the case of separable H. For nonseparable H see Problem 52.) Let (pi,) be a complete orthonormal system
for H, and set bp = f (cpp). Then for each n we have
E bi! =
v=i
f(
v=1
11f11 •v—iE bAc'v
bvçov)
11f11
bv2r 2.
11f11 2, and so E b,2, 11f11 2 < Go. Hence there is an
element y = E /Jo,. We have VII
11f11.
v= 1
x, and so
Let x be any element of H. Then E av ,p,
Thus E b,2,
p=
f(x) = lim f (E a v(p) =
V= I
By the Schwarz inequality 11 f II
E ab.
1
ii.Y11. I
214
Banach Spaces
[Ch. 10
Problems
50. Show that the inner product is continuous; that is, if xi, --> x and
Yn --* y, then (xn, yr,) --)• (x, Y).
51. a. Show that {cos vi, sin pi} is (when suitably normalized) a complete orthonormal system for L 2 [0, 27r] (cf. Problems 6.14 and 9.28).
b. Every function in L 2 [0, 27r] is the limit in mean (of order 2) of its
Fourier series (cf. Section 6.3).
52. a. Show that for each x in a nonseparable Hilbert space there are
only a countable number of nonzero Fourier coefficients (with respect to a
fixed orthonormal system).
b. Show that Proposition 27 is still valid in a nonseparable Hilbert
space except that every complete orthonormal system is uncountable.
c. Show that Proposition 28 is still valid in a nonseparable Hilbert
space.
d. Show that if His any infinite-dimensional Hilbert space, then the
number n of elements in a complete orthonormal system in H is the
smallest cardinal n such that there is a dense subset of H with n elements.
Hence every complete orthonormal system in H has the same number of
elements. We call this number the dimension of H.
e. Show that two Hilbert spaces are isomorphic if and only if they
have the same dimension.
f. Show that there is a Hilbert space of each dimension.
53. Let P be a subset of H. By the orthogonal complement P1 of P we
mean the set {y: y I. x all x a P}.
a. Show that P1 is always a closed linear manifold.
b. Show that P±± is the smallest closed linear manifold containing P.
c. Let M be a closed linear manifold. Then each x e H can be
written uniquely in the form x = y + z with y e M and z a MI. Moreover, 11x11 2 = 111'11 2 + 1iz11 2 .
54. Let (x,,) be a bounded sequence of elements in a separable Hilbert
space. Then (x.) contains a subsequence which converges weakly.
55. Let S be a subspace of L 2 [0, 1], and suppose that there is a constant
K such that 1f(x)1 < Klifil for all x 6 [0, 1]. Then the dimension of S is at
most K2 . [Hint: If (f 1 , . . . ,f,,) is any finite orthonormal sequence in
S,
then t
r. 1fi(x)1 2 _< K2.]
t=1
Part Three
0
General Measure
and Integration Theory
11
1
Measure and Integration
Measure Spaces
The purpose of the present chapter is to abstract the most important
properties of Lebesgue measure and Lebesgue integration. We shall
do this by giving certain axioms which Lebesgue measure satisfies and
base our integration theory on these axioms. As a consequence our
theory will be valid for every system satisfying the given axioms.
We begin by recalling that a c-algebra 63 is a family of subsets of a
given set X which contains 0 and is closed with respect to complements and with respect to countable unions. By a set function we
mean a function which assigns an extended real number to certain
sets. With this in mind we make the following definitions:
Definition: By a measurable space we mean a couple (X, 63 )
consisting of a set X and a cr-algebra 63 of subsets of X. A subset A of
X is called measurable (or measurable with respect to 63) if A e
Definition: By a measure u on a measurable space (X, d3) we
mean a nonnegative set function defined for all sets of 1s3 and satisfying
A(0) = 0 and
12, (ID E)z —
J=1
,uE
for any sequence E, of disjoint measurable sets. By a measure space
(X, 6-3 , )1) we mean a measurable space (X, G3) together with a measure
)1 defined on (B.
217
218
Measure and Integration
[Ch. 11
This second property of /A is often referred to by saying that A is
countably additive. We also have that A is "finitely additive"; that is,
for disjoint sets Ei belonging to CA, since we may set Ei = Ø for i > N.'
One example of a measure space is (R, OR, m), where R is the set
of real numbers, Mt the Lebesgue measurable sets of real numbers,
and m Lebesgue measure. Another measure space results if we replace R by the interval [0, 1] and Mt by the measurable subsets of
[0, 1]. A third example is (R, CA, m), where (id is the class of Borel sets
and m is again Lebesgue measure. Another example is the counting
measure (see Problem 3.4). A slightly bizarre example is the following. Let X be any uncountable set, (id the family of those subsets
which are either countable or the complement of a countable set.
Then (B is a o--algebra and we can define a measure on it by setting
AA = 0 for each countable set and AB = 1 for each set whose com-
plement is countable.
Two further properties of measures are given by the following
propositions:
1. Proposition: If A c 63, B c
(id,
and A C B, then
ALA <
Proof: Since
B = A U [B
A]
is a disjoint union, we have
,uB = AA
2. Proposition: If Ei e
A) > A. I
,u(B
(id, mEi < ck
Ei) =
and Ei D Ei+1 , then
lim
A (z1
=--
A set function A defined on an algebra of sets and satisfying A(23) = 0 and
A(A U B) = 1.1A /213 for disjoint sets A and B in the algebra is called a finitely additive measure. Since our definition (and the usual usage) requires a measure to be
countably additive, it follows that a finitely additive measure is not in general a measure, although every measure is, of course, a finitely additive measure.
Sec. 1]
Measure Spaces
Proof: Set E =
219
n E. Then
i=1
E l = E u 1.1 (E1
i= 1
and this is a disjoint union. Hence
= pE
E
E1-3-1).
i=1
Since
E, — E, +1 U (E,
is a disjoint union, we have
y(E, E, +1) = yE, —
Hence
y E,=
yE+ E (yEi —
n-1
==
li
E
7,-00 i=i
cuEi — yEi+i)
= pE pE1 — lim yEn
,
whence the proposition follows. I
3. Proposition: If Ei r 63, then
< E yEi.
Ei)
In-1
Proof: Let Gn = E.
are disjoint. Hence
U
yGn
. Then G„ C E,, and the sets G.
<
ttEn,
while
pG.
tt(U E,) =
n=1
E /2E,,. I
n=1
220
Measure and Integration
[Ch. 11
A measure is called finite if pt(X) < Go. It is called or finite if there
is a sequence (Xn) of sets in 63 such that
-
X
= U Xn
n=i
and AX„ < 00. By virtue of Proposition 1.2 we may always take
(Xn ) to be a disjoint sequence of sets. Lebesgue measure on [0, 1]
is an example of a finite measure, while Lebesgue measure on
(— cr, , cr..) is an example of a a--finite measure. The counting measure
on an uncountable set is a measure which is not a-finite. Other
examples are given in Problem 46.
A set E is said to be of finite measure if E e 03 and AE < c4. A
set E is said to be of cr finite measure if E is the union of a countable
collection of measurable sets of finite measure. Any measurable set
contained in a set of o--finite measure is itself of a-finite measure, and
the union of a countable collection of sets of a--finite measure is
again of a--finite measure. If is a--finite, then every measurable set
is of a--finite measure.
Roughly speaking, nearly all the familiar properties of Lebesgue
measure and Lebesgue integration hold for arbitrary a.-finite measures, and many treatments of abstract measure theory limit themselves to a--finite measures. However, many parts of the general
theory do not require the assumption of 0-finiteness, and it seems
undesirable to have a development that is unnecessarily restrictive.
A notion weaker than that of a--finiteness is semifiniteness. A measure
pt is said to be semifinite if each measurable set of infinite measure
contains measurable sets of arbitrary large finite measure. Thus
every a--finite measure is semifinite, while the measure which assigns 0
to countable subsets of an uncountable set X and coc to the uncountable sets is not semifinite. Measures which are not semifinite tend
to be somewhat bizarre, and there is a temptation to restrict oneself
to the consideration of semifinite measures. Unfortunately, many
of the natural operations on measures, such as the extension procedures of the next chapter, take us out of the class of semifinite
measures.
A measure space (X, 63, pt) is said to be complete if 03 contains all
subsets of sets of measure zero, that is, if B e 63, cB = 0, and A c B
imply A e 83. Thus Lebesgue measure is complete, while Lebesgue
measure restricted to the a--algebra of Borel sets is not complete.
The following proposition, whose proof is left to the reader (Problem
-
221
Sec. 1] Measure Spaces
shows that each measure space can be completed by the addition
of subsets of sets of measure zero. The measure space (X, a o , /.Lo)
given in the proposition is called the completion of (X, 63, /.4).
2),
4. Proposition: If (X, 63, /2) is a measure space, then we can find
a complete measure space (X, 63 0, A o) such that
i. (33 c (S3 0 .
E c (33= E = 110E.
E e (33 0 <=>E = Au BwhereB eG3andAcC,C e,uC = O.
is a measure space, we say that a subset E of X is
locally measurable if En B e ci3 for each B e ci3 with AB < go. The
collection e of all locally measurable sets is a a--algebra containing
The measure is called saturated if every locally measurable set is
measurable, i.e., is in (B. Every a--finite measure is saturated. A measure
can always be extended to a saturated measure, but, unlike the process
of completion, the process of saturation is not uniquely determined:
There is no difficulty with the a--algebra, for we simply take the
a--algebra e of all locally measurable sets, but the measure can often
be extended to e in different ways. Some details are given in Problem 8, and we shall return to this topic in the next chapter.
If (X, 63,
A)
Problems
1. Let {A n}
be a countable collection of measurable sets. Then
C1
(
=
k
U
n—,00
Ah)
k=1
2. Let {(X«,
A.) } be a collection of measure spaces, and suppose
that the sets {X a} are disjoint. Then we can form a new measure space
(called their union) by letting X = U Xa, G3
{B: (0)[B n Xa e
and defining
kt(B) = E,ua(B
n
a. Show that 63 is a a--algebra.
is a measure.
c. Show that y is a--finite if and only if all but a countable number of
b. Show that A
the
are zero and the remainder are o--finite.
3. a. Show that ,u(Ei A ED = 0 implies AE I = AE2 provided that Ei
and E2 e (B.
0
b. Show that if y is complete, then E l e 63 and ,u(E i A E2)
imply E2 e (B.
222
Measure and Integration
[Ch. 11
4. Let (X, 03, p) be a measure space and Y c 03. Let 03y consist of
those sets of 03 which are contained in Y. Set p.yE = pE if E c (B y . Then
( Y, 03y, gy) is a measure space. The measure gy is called the restriction
of p to Y.
5. Let (X, 03) be a measurable space.
a. If p. and u are measures defined on 03, then the set function X
defined on OB by XE = aE ± vE is also a measure. We denote X by ,u ± v.
b. If p, and u are measures on 03 and p. > u, then there is a measure X
on 03 such that p = v + X.
c. If 7/ is o--finite, the measure X in (b) is unique.
d. Show that in general the measure X need not be unique but that
there is always a smallest such X.
6. a. Show that each o--finite measure is semifinite.
b. Show that each measure p is the sum pi ± p2 of a semifinite
measure p i and a measure p 2 which assumes only the values 0 and GO
The measure p i is always unique, and there is a smallest p 2 .
7. Prove Proposition 4. [First show that the family 63 0 defined by (iii)
is a o--algebra. If E e 630, show that AA is the same for all sets A c 03 such
that E= A uB with B a subset of a set of measure zero. Use this fact to
define p o and show I.L 0 is a measure.]
8. a. Show that each o--finite measure is saturated.
b. Show that the collection of locally measurable sets is a o--algebra.
c. Let (X, 133,0 be a measure space and e the o--algebra of locally
measurable sets. For E e e set rcE = I.LE if E e 63 and gE = cc if E Y 63.
Show that (X, e, FL) is a saturated measure space.
d. If p is semifinite and E e e, set pE — sup {AB: B c 03, B C E} .
Show that (X, 6, p) is a saturated measure space and that A is an extension
of A.
e. Use Problem 5d to show that, even if pc is not semifinite, there is a
smallest extension p of A to 0, and that (X, 6, 1.0 is saturated.
f. Give an example to show that 7.4- and p._ may be different.
9. o--Rings and o--Algebras. Some authors prefer to define a measure
on a o--ring 01, that is, a collection of subsets of X such that if A and B
are in CA so is A — B, and if (An ) is a sequence from Oi then U A n is in 01.
Thus a cr-ring a is a o--algebra if and only if X e R. Some of the relations
between o--rings and o--algebras are given below:
a. Let a be a o--ring which is not a o--algebra, let 03 be the smallest
o--algebra containing GI, and set Cir = {E: E-- ' e 61 } . Then 63 = GI U 61' and
.
a n a' = Ø.
Sec.
2]
Measurable Functions
223
b. If A is a measure on 61, define A on 63 by TtE = I.LE if E e ca and
61'. Then g is a measure on 63.
c. Define A on 63 by AE = ,LLE if E e 61 and
AE = oo if E e
AE =
sup {AA: A C E, A e 01}
61'. Then A is also a measure on B.
_
d. More generally, for any nonnegative extended real number 0
define ,up on CB by ,upE = AE if E e 61 and AgE = AE + 13 if E e 61'. Then
/213 is a measure on 63.
e. If v is any measure on 63 which agrees with A on Ca, then v = ,uo
for some 3> O.
if E e
2
Measurable Functions
The concept of a measurable function on an abstract measurable
space is almost identical with that for functions of a real variable.
Consequently, those propositions and theorems whose proofs are
essentially the same as those in Section 5 of Chapter 3 are stated
without proof. The reader should verify that the proofs in Chapter 3
can be carried out in the abstract case. Throughout this section we
assume that a fixed measurable space (X, 63) is given.
5. Proposition: Let f be an extended real-valued function defined
on X. Then the following statements are equivalent:
i. {x: f(x) < a} e 63 for each a.
11. {X: f(x) _.<. a} e 63 for each a.
ill. {X: f(x) > al e 63 for each a.
iv. {x: f (x)
a} e 63 for each a.
(See Proposition 3.18.)
Definition: The extended real-valued function f defined on X is
called measurable (or measurable with respect to (33) if any one of
the statements of Proposition 5 holds.
6. Theorem: If c is a constant and the functions f and g are measurable, then so are the functions f + c, cf,f + g, f • g, and f v g. Moreover, if (f7i ) is a sequence of measurable functions, then sup f,„ inf,
f. ,
lim f„, and lim fn are all measurable.
(See Theorems 3.19 and 3.20.)
224
Measure and Integration
[Ch. 11
By a simple function we mean as before a finite linear combination
,p(x) = E c zx E,(x)
=1
of characteristic functions of measurable sets E.
7. Proposition: Let f be a nonnegative measurable function. Then
there is a sequence ((,o7,) of simple functions with ço„±i > (ion such that
f = lim ço„ at each point of X. If f is defined on a a--finite measure
space, then we may choose the functions (p, so that each vanishes
outside a set of finite measure.
8. Proposition: If bt is a complete measure and f is a measurable
function, then f = g a.e. implies g is measurable.
(See Proposition 3.21.)
The sets {x: f(x) < a} are sometimes called ordinate sets for f.
They increase with a. The following two lemmas will be useful later.
They state that, given a countable collection [Ba} of measurable sets
which increase with a, we can find a measurable function f which
nearly has these for ordinate sets in the sense that
{x: f(x) <
c Ba c {x: f(x)
.
9. Lemma: Suppose that to each a in a countable set D of real
numbers there is assigned a set Ba e 63 such that B a c Bo for a < #.
Then there is a measurable extended real-valued function f on X such
that f < a on B a and f > a on X — B a.
Proof: For each x c X, let f(x) = inf [a: x c Ba}, where as usual
inf 0 = c/o . If x c Ba, f(x) < a. If x Ba, x Bo for each # < a and
so f (x) > a. Thus we have only to show that f is measurable. But for
any real a, {x: f(x) < a} = U B. For if f(x) < a, then x c Bg
(3<a
for some 13 < a by the definition of f. On the other hand, if x c Bp
for some 13 < a, then f(x) < # < a. I
10. Lemma: Suppose that for each a in a countable set D of real
numbers there is assigned a set Ba c 63 such that A(B, — Bo) = 0 for
a G #. Then there is a measurable function f such that f < a a.e. on
B a and f > a a.e. on X — Ba.
Sec. 3]
225
Integration
Proof: Let C = U B a — B o . Then
oi<0
/I
C = 0. Let K, = B a U C.
If a G 0, we have BL B'o (B a — Bo) C = Ø. Thus by
Lemma 9 there is a measurable f such that f < a on K and f > a
on X BÇ . Thus f < a on Ba and f > a on X B a except for
X e C. I
Problems
10. Prove Proposition 7. [For each pair (n, k) of integers let
22n
En
{x: k 2— n < f(x) G (k
1)21, and set (pn,= 2—n
E0 kx.E.
11. Prove Proposition 8, and show that it is false if the word 'complete'
is omitted.
12. Let (X, 03, ,u) be a measure space and (X, (B o, ,u 0) its completion.
Then a function f is measurable with respect to (g o if and only if there is a
function g measurable with respect to 03 such thatf -= g almost everywhere
in the sense that there is a set E e 63 with AE = 0 and f = g on X — E.
g(x)} to be in e3.
Note that this does not require the set {x: f(x)
13. Show that the requirement that D be countable in Lemmas 9 and 10
is unnecessary. [Replace an uncountable D by a countable subset E which
is dense in D.]
14. Show that if D is dense in R, then the function f in Lemma 9 is
unique and the function f in Lemma 10 is unique to within equivalence;
that is, any other function satisfying the conclusion of the lemma is equal
to f almost everywhere.
15. Let D be the rationals and f the function in Lemma 9. Express the
sets {x: f(x) G a}, {x: f(x) <
{x: f(x) =
in terms of the sets {B}.
3
Integration
Many definitions and proofs of Chapter 4 depend on only those
properties of Lebesgue measure which are also true for an arbitrary
measure in an abstract measure space and carry over to this case.
If E is a measurable set and cp a nonnegative simple function, we
define
E
1-1
n E),
226
Measure and Integration
[Ch. 11
where
E
iO(X) =
cix E,(x).
1
It is easily seen that the value of this integral is independent of the
representation of io which we use. If a and b are positive numbers
and io and tp nonnegative simple functions, then f aio btp =
a f b ftP.
In Chapter 4 we first defined the integral for bounded measurable
functions and then defined the integral of a nonnegative measurable
function f to be the supremum of fg dj as g ranged over all bounded
measurable functions which vanished outside a set of finite measure.
Unfortunately, this definition is not quite suitable in the case of
general measures, for, if (B =- {X, 0 } and AO = 0, AX = 00 , then
we certainly want f i d = CO But the only measurable function g
which vanishes outside a set of finite measure is g 0, and hence
sup fg d = O. To avoid this difficulty, we define the integral of a
nonnegative measurable function directly in terms of integrals of
nonnegative simple functions.
.
Definition: Let f be a nonnegative extended real-valued measurable
Then f f cl A is the supremum
function on the measure space (X, 63,
of the integrals f çe di.4 as go ranges over all simple functions with
0 < <f.
).
It follows immediately from this definition that f < g implies
ff< fg and that f cf = c ff. In other respects, however, this
definition is somewhat awkward to apply, since we are taking a
supremum over a large collection of simple functions, and it is not
apparent from the definition that f (f + g) = f f + fg. Consequently, we begin our treatment of the integral by first establishing
the convergence theorems. These then enable us to determine the
value of f f by taking the limit of f ion for any increasing sequence
of simple functions which converge to f. We begin with Fatou's
Lemma:
11. Theorem (Fatou's Lemma):
be a sequence of nonnegative measurable functions which converge almost everywhere on a
set E to a function f Then
Let
Sec. 3]
227
Integration
Proof: Without loss of generality we may assume that f,,(x) —f (x)
for each X F E. From the definition of ff it suffices to show that, if io
< lim fn.
is any nonnegative simple function with (to f, then
L
If= o, then there is a measurable set A c E with i.LA = oo
such that cp > a > 0 on A. Set A n = {x e E: fk (x) > a for all
k > n} . Then (A n) is an increasing sequence of measurable sets
= 00 .
whose union contains A, since cp < lim f„,. Thus Jim
Since f
a,uA n, we have lirn f f. = 00 = f 0
B
If JE cp < oo , then there is a measurable set A c E with 12,4 < 00
such that cp vanishes identically on E — A. Let M be the maximum
of cp, let e be a given positive number, and set
J
E
.
--= Ix e E: fft(x) > (1 — e)io(x) for all k
Then (A n) is an increasing sequence of sets whose union contains A,
and so (A — A n ) is a decreasing sequence of sets whose intersection is
empty. By Proposition 2 lim /.4(A — A n) = 0, and so we can find an n
such that ,u(A A k) < E for all k > n. Thus for k > n
f
E fk
k fk
(1 - E
>(1-
fA k
E
)
JE
-
LA:,0
IL, — fE so + m] •
Hence
-
E
UE (P M
Since E is arbitrary,
lim
.I.
E`P'
12. Monotone Convergence Theorem: Let (fn ) be a sequence of
nonnegative measurable functions which converge almost everywhere
to a function f and suppose that fn < f for all n. Then
f = lim f fn.
228
Measure and Integration
[Ch. 11
Proof: Sincef, < f, we have ff, < ff. Thus by Fatou's Lemma we
have
<ff. '
We are now able to establish some of the standard properties of
the integral.
13. Proposition: If f and g are nonnegative measurable functions
and a and b nonnegative constants, then
faf+bg= aff+big.
We have
If
0
with equality only if f = 0 a.e.
Proof: To prove the first statement, let (io n) and (1,1,„) be increasing
sequences of simple functions which converge to f and g. Then
(a(pn, -I- bon ) is an increasing sequence of simple functions which
converge to af bg. By the Monotone Convergence Theorem
f af bg = lim f &lion
Ern (a Icp„ b f On )
= a
ff+ b f g.
Clearly f f O. If ff = 0, let A„ = {x: f(x) > 1/n} . Then
f > X, and so I.LA„ = f XA. = O. Since the set where f> 0 is the
union of the sets A„, it has measure zero. 1
Taking this proposition together with the Monotone Convergence
Theorem gives us the following corollary:
14. Corollary: Let (fn ) be a sequence of nonnegative measurable
functions. Then
f
j
f
n=1
=
71=1
j
A nonnegative function f is called integrable (over a measurable
set E with respect to j.i) if it is measurable and
f <
Sec. 3]
229
Integration
An arbitrary function f is said to be integrable if both f+ and
integrable. In this case we define
JE f =
r are
JEfJE
Some of the properties of the integral are contained in the following proposition, whose proof is left to the reader.
15. Proposition: lf f and g are integrable functions and E is a
measurable set, then
J (cif + c2g) =
E
ii. if' !hi <
iii. if f
cif f+ c2f
g;
if!
and h is measurable then h is integrable;
g a.e., then if
g.
16. Lebesgue Convergence Theorem: Let g be integrable over E,
and suppose that (fa) is a sequence of measurable functions such that
on E
1.f.(x)i
g(x)
and such that almost everywhere on E
f(x) f(x).
Then
fE f = lim fE
Proof: Apply Fatou's Lemma to the sequences g + f„ and g — f„. 1
Problems
16. Prove Proposition 15.
17. Suppose that 1.1, is not complete, but that we define a bounded
function f to be integrable over a set E of finite measure if
sup f çdp = inf f
to9f
E
E
for all simple functions 40 and %,G. Show thatf is integrable if and only if it is
measurable with respect to the completion of A.
230
Measure and Integration
[Ch. 11
18. Let f be an integrable function on the measure space (X, G3, ,u).
Show that given e > 0, there is a .5 > 0 such that for each measurable set
E with ,uE < we have
fE f l <
19. Let 0'0 be a sequence of measurable functions. We say that
converges in measure to f if given e > 0, there is an N such that
Afx: if(x) — f(x)I >
<e
for n > N. We say that (fn ) is a Cauchy sequence in measure if given
0, there is an N such that
E>
Afx: if(x) — f rn(x)I >
4<e
for in, n > N.
Show that a necessary and sufficient condition that a sequence (fn ) of
measurable functions converge in measure to some function f is that it be a
Cauchy sequence in measure. If (fn) converges tof in measure, then there is
a subsequence (fnp ) which converges to f almost everywhere, and hence
almost everywhere convergence may be replaced by convergence in measure in the convergence theorems.
20. Show that if f is integrable then the set {x: f(x)
0) is of a-finite
measure.
21. a. Let (X, 03, i.L) be a measure space and g a nonnegative measurable function on X. Set pE f g du. Show that y is a measure on 63.
b. Let f be a nonnegative measurable function on X. Then
f f dy--= f fg d,u.
[Hint: First establish this for the case that g is simple and then use the
Monotone Convergence Theorem.]
22. A function f on a measure space (X, 03, ,u) is called locally measurable if the restriction off to each E in 03 with uE < 00 is measurable, that
is, if f xE is measurable.
a. Show that f is locally measurable if and only if it is measurable
with respect to the a-algebra of locally measurable sets.
b. If we define integration for nonnegative locally measurable
functions f by taking ff to be the supremum of f go as ço ranges over all
simple functions less than f, then ff = ff du, where u is the extension of /2
given in Problem 8d.
Sec. 4]
*4
231
General Convergence Theorems
General Convergence Theorems
In the preceding section we discussed the behavior of the integrals
of a convergent sequence of functions. These were all integrals with
respect to a fixed measure ,u. We can generalize by allowing the
measure to vary also. Let (X, 63) be a measurable space and (un) a
sequence of set functions defined on 63. We say that
converges
setwise to the set function AL if for each E e 63 we have 1.4E = lim
With this notion we have the following two propositions, which
generalize Fatou's Lemma and the Lebesgue Convergence Theorem.
17. Proposition: Let (X, 63) be a measurable space, (pa) a sequence
of measures which converge setwise to a measure u, and ( fn ) a sequence
of nonnegative measurable functions which converge pointwise to the
function f. Then
ffd iu <11m f
Proof: Setwise convergence of An, to
f
g
dp, = lim
implies that
dti,„
for any simple function so. From the definition of ff du it suffices to
prove that Fp tip < Jim ff„ 4, for any simple function so < f.
Suppose fyo dp, < ci. Then cp vanishes outside a set E of finite
measure. Let E be a positive number, and set
E
(1 — E)(p(x) for all k > .
{x: fk (x)
Then (En) is an increasing sequence of sets whose union contains E,
and so (E E„) is a decreasing sequence of measurable sets whose
intersection is empty. Thus by Proposition 2 there is an m such that
,u(E Em) < E. Since ,u(E En) = lirn p k(E Em), we may choose
n > m so that pk (E Em) < E for k > n. Since E Ek C E
we have uk (E Ek) < E for k > n. Thus
f fk diak
(1 — e) Ek cp dp.k
L k fk dpk
(1
?_.
e) f
— e) fE
dAk
f E-E k S° dAk
Me,
232
Measure and Integration [Ch. 11
where M is the maximum of so. Thus
lim f fk d,uk _> fs so d,u — e[M + f so d]•
Since E was arbitrary,
fs so d,u 5_ lim f fk
The case when
f so gu =
(Auk.
c/o is similarly handled. I
18. Proposition: Let (X, 63) be a measurable space and (du n) a
sequence of measures on 63 which converge setwise to a measure du.
Let (fn) and (gn) be two sequences of measurable functions which
converge pointwise to f and g. Suppose I fnl <
_ g n and that
lim f g n d,u.„, -= fg di < c. 0.
Then
Ern ff n 47, --= f f dp.
Proof:
Apply Proposition 17 to the sequences gn + fn and
gn — fn. I
Problems
23. Let (X, 63) be a measurable space and (du n) a sequence of measures
on 63 such that for each E c 63, du n+ IE > E. Let ,uE = firn ME. Then A
is a measure on 63.
24. Give an example of a decreasing sequence (A n ) of measures on a
measurable space such that the set function ,u defined by AE = lim du nE is
not a measure.
25. Let (X, 63) be a measurable space and (An ) a sequence of measures
on 63 which converge setwise to a set function ,u. If du X < wo, then /.4 is a
measure.
5 Signed Measures
In this section we consider some of the possibilities which may arise
if a measure is allowed to take on both positive and negative values.
233
Sec. 5] Signed Measures
We first note that if A i and I/2 are two measures defined on the same
measurable space (X, 63), then we may define a new measure /23 on
(X, 63) by setting
123 (E)
= c l /2 i (E)
c2 > O.
c2/22(E)
What happens if we try to define a measure by
vE
= A.t i
E
— A2
E?
The first thing that may occur is that P is not always nonnegative, and
this leads to the consideration of signed measures which we shall
define later. A more serious difficulty comes from the fact that p is
not defined when 1/ 1 E = 1/ 2 E = cr.,. For this reason we should have
either /.4. 1 or 12 2 finite. With these considerations in mind we make the
following definition:
Definition: By a signed measure on the measurable space (X, 03)
we mean an extended real-valued set function 11 defined for the sets of
63 and satisfying the following conditions:
assumes at most one of the values
ii. v(0) = O.
j. y
P U
2=1
+ cc , —
Go.
E PE, for any sequence E, of disjoint measurable
2=1
sets, the equality taken to mean that the series on the right
converges absolutely if P(U
is finite and that it properly
diverges otherwise.
Thus a measure is a special case of a signed measure, but a signed
measure is not in general a measure. We say that a set A is a positive
set with respect to a signed measure y if A is measurable and for
every measurable subset E of A we have vE > O. Every measurable
subset of a positive set is again positive, and if we take the restriction
of 11 to a positive set we obtain a measure. Similarly, a set B is called
a negative set if it is measurable and every measurable subset of it
has nonpositive p measure. A set which is both positive and negative
with respect to i is called a null set. A measurable set is a null set if
and only if every measurable subset of it has u measure zero. The
reader should carefully note the distinction between a null set and a
set of measure zero: while every null set must have measure zero,
a set of measure zero may well be a union of two sets whose measures
234
Measure and Integration
[Ch. 11
are not zero but are negatives of each other. Similarly, a positive set
is not to be confused with a set which merely has positive measure.
We have the following lemmas concerning positive sets. Similar
statements hold, of course, for negative sets.
19. Lemma: Every measurable subset of a positive set is itself
positive. The union of a countable collection of positive sets is positive.
Proof: The first statement is trivially true by the definition of a
positive set. To prove the second statement, let A be the union
of a sequence (A n ) of positive sets. If E is any measurable subset
of A, set
En = E n An n A-in • • • n A 1 .
Then En is a measurable subset of A n and so vEn > O. Since the
E we have
En are disjoint and E = H
.... —n)
vE -=
E
vEn > O.
n=1
Thus A is a positive set. I
20. Lemma: Let E be a measurable set such that 0 < vE < Go.
Then there is a positive set A contained in E with vA > O.
Proof: Either E itself is a positive set or it contains sets of negative
measure. In the latter case let n i be the smallest positive integer such
that there is a measurable set Ei c E with
1
vE i < — — ni
Proceeding inductively, let nk be the smallest positive integer for
which there is a measurable set Ek such that
k-1 ]
Ek C
E —[t.i E,
i=1
and
If we set
1
vEk < — •
nk
.
A=E—UEk,
k=1
Sec. 5]
235
Signed Measures
then
E=
ALTO
k=1
Since this is a disjoint union, we have
OD
vE =
PEk
k=1
with the series on the right absolutely convergent, since
1
Thus E nk
vE
is finite.
converges, and we have nk —* 00. Since pEk < 0 and
we must have PA > O.
To show that A is a positive set, let > 0 be given. Since nk —* 00,
we may choose k so large that (nk —
< E. Since
vE > 0,
[k
Ac E— U
j=1
A can contain no measurable sets with measure less than
— (n k 1) — ', which is greater than —e. Thus A contains no measurable sets of measure less than — e. Since E is an arbitrary positive
number, it follows that A can contain no sets of negative measure
and so must be a positive set. I
21. Proposition (Hahn Decomposition Theorem): Let 11 be a
signed measure on the measurable space (X, 03). Then there is a
positive set A and a negative set B such that X = Au B and A nB = 0.
Proof: Without loss of generality we may assume that - IL 00 is
the infinite value omitted by v. Let x be the supremum of v A over
all sets A which are positive with respect to v. Since the empty set
is positive„X > O. Let (A t ) be a sequence of positive sets such that
X = lim vA„
and set
A= ÛA,.
z=.1
By Lemma 18 the set A is itself a positive set, and so X >
A — A, C A and so p(A A,) > O. Thus
vA = vA, v(A A,) >
Hence
vA >
X, and so vA
=
X, and X <
vA. But
236
Measure and Integration [Ch. 11
Let B = —A, and suppose E is a positive subset of B. Then E
and A are disjoint and E u A is a positive set. Hence
X>
v(E u A) = vE vA = vE +
X,
whence vE = 0, since 0 < X < oo. Thus B contains no positive
subsets of positive measure and hence no subsets of positive measure
by Lemma 20. Consequently, B is a negative set. I
A decomposition of X into two disjoint sets A and B such that A
is positive for y and B negative is called a Hahn decomposition for v.
Proposition 21 states the existence of a Hahn decomposition for each
signed measure. Unfortunately, a Hahn decomposition is not unique.
If {A, B} is a Hahn decomposition for y, then we may define two
measures v+ and y — with v = v+ — v — by setting
v ±(E) = v(E n A)
and
v — (E) = —v(E n B).
Two measures v 1 and v2 on (X, 63) are said to be mutually singular
(in symbols v 1 _L v 2) if there are disjoint measurable sets A and B
with X= A uB such that v i (A) = v2(B) = 0. Thus the measures v+
and v — defined above are mutually singular. We have thus etablished
the existence part of the following proposition. The uniqueness part
is left to the reader.
22. Proposition: Let v be a signed measure on the measurable
space (X, go. Then there are two mutually singular measures VI - and v on (X, 63) such that v = v+ — v — . Moreover, there is only one such
pair of mutually singular measures.
The decomposition of y given by the proposition is called the
Jordan decomposition of v. The measures v+ and v — are called the
positive and negative parts (or variations) of v. Since y assumes at
most one of the values + co and — oo, either v+ or v — must be finite.
If they are both finite, we call y a finite signed measure.
The measure tyl defined by
114E) = v +E v — E
is called the absolute value or total variation of v. A set E is positive
for y if v —E = 0. It is a null set if 'PRE) = 0.
Sec.
5]
237
Signed Measures
Problems
26. a. Give an example to show that the Hahn decomposition need not
be unique.
b. Show that the Hahn decomposition is unique except for null sets.
27. Show that there is only one pair of mutually singular measures
y+ and v — such that v = v+ — v —. [Hint: Show that any such pair determines a Hahn decomposition and apply the results of Problem 26b.]
28. Show that if E is any measurable set, then
—v — E < vE < v+E
and
IPEI
IPRE).
29. Show that if v 1 and y2 are any two finite signed measures, then so is
av i ± Ov 2 , where a and 13 are real numbers. Show that
laP I = lal
IPI
and
Iiii + v 21
Iv 1 l + Iv 2 1,
where v < /.4 means vE < 1.4E for all measurable sets E.
30. We define integration with respect to a signed measure y by defining
1 f dv = f f dv+ — f f dv — .
If If I < M,
JE
f dv '. MM(E).
Moreover, there is a measurable function f with I fi < 1 such that
JE"
dv = M(E).
31. a. Let ir. and v be finite signed measures. Show that there is a signed
measure A A I/ which is smaller than 1.1 and v but larger than any other
signed measure which is smaller than A and v. [Hint: m. A y = 102 + y
lil — PD.]
b. Show that there is a measure A V v which is larger than A and
v but smaller than any other measure which is larger than A and v. Also
A V v + /J. A I/ = A 4- v.
c. If /I and v are positive measures, then they are mutually singular
if and only if A A y = O.
—
238
Measure and Integration
[Ch. 11
6 The Radon-Nikodym Theorem
Let (X, (33) be a fixed measurable space. If A and v are two measures
defined on (X, (33), we say that A and v are mutually singular (and
write p. 1 v) if there are disjoint sets A and B in 63 such that X =
A u B and PA = AB = O. Despite the fact that the notion of singularity is symmetric in v and 12, we sometimes say that v is singular
with respect to p.. The notion antithetical to singularity is absolute
continuity. A measure P is said to be absolutely continuous with
respect to the measure p. if v A = 0 for each set A for which AA = O.
We use the symbolism v << su for v absolutely continuous with
respect to p.
In the case of signed measures A and v, we say v << ii. if Ivl << 1A1
and v 1 A if 1 vi 1 kul.
Whenever we are dealing with more than one measure on a
measurable space (X, (33), the term 'almost everywhere' becomes ambiguous, and we must specify almost everywhere with respect to p or
almost everywhere with respect to v, etc. These are abbreviated
a.e. [A] and a.e. [v]. If v << p and a property holds a.e. [A], then it
holds a.e. [v].
Let p be a measure and f a nonnegative measurable function on
X. For E in 63, set
vE =
JE f dy.
Then P is a set function defined on 63, and it follows from Corollary
14 that P is countably additive and hence a measure. The measure v
will be finite if and only iff is integrable. Since the integral over a set
of p.-measure zero is zero, we have v absolutely continuous with
respect to A. The next theorem shows that, subject to o--finiteness
restrictions, every absolutely continuous measure v is obtained in
this fashion.
23. Theorem (Radon-Nikodym): Let (X, 63, ii,) be a g finite
measure space, and let v be a measure defined on 63 which is absolutely
continuous with respect to p. Then there is a nonnegative measurable
function f such that for each set E in 63 we have
-
vE — f B f di.4.
The function fis unique in the sense that i f g is any measurable function
with this property then g = f a.e. [A].
Sec. 6]
239
The Radon-Nikodym Theorem
Proof: The extension from the finite to the o-finite case is not
difficult and is left to the reader. Thus we shall assume that 1.‘ is
finite. Then y — aA is a signed measure for each rational number a.
Let (A a, Ba) be a Hahn decomposition for y — ail, and take A o = X,
B o = 0.
= Ba n A o. Thus (y ap)(Ba
< 0 and
Now Ba
(y — 01.0(B,, — B o)
0. If 3 > a, these imply 4.13„ = 0,
and so by Lemma 10 there is a measurable function f such that for
each rational a we have f> a a.e. on A a and f a a.e. on Ba.
Since Bo = 0, we may take f to be nonnegative.
Let E be an arbitrary set in 0A, and set
Ek =
E n (Bk +1 B k ),
N
Then E = E. u U
Ek,
E. = E U Bk
N
and this is a disjoint union. Thus
k=0
PE
Since
Ek C Ek±i
n
Ak,
= PE. +
k=0
VEk•
k
k+ 1
on
we have — f
N — N
— yE k < f f <
N
Rk
Ek
and so
k + 1 r,
k+ 1
k
Since — p.Ek < vEk < N ihEk, we have
N
1
vEk — — yEk < f f
N
F
vEk
1
— PEk.
N
On E. we have f = oo a.e. If ihEc, > 0, we must have vEc, = oo,
since (y — a/. E. is positive for each a. If jiE,0 = 0, we must have
vE. = 0, since P A. In either case
)
vE,, =f
f
E c.,
Adding this equality and our previous inequalities gives
PE
—
N
1
f f d < vE + — E.
240
Measure and Integration
[Ch. 11
Since ALE is finite and N arbitrary, we must have
vE = f f d,u. I
The function f given by Theorem 23 is called the Radon-Nikodym
c .
derivative of v with respect to A. It is sometimes denoted by [l
diA
24. Proposition (Lebesgue Decomposition): Let (X, 63, /1) be a
o--finite measure space and I, a o--finite measure defined on 63. Then we
can find a measure v 0 which is singular with respect to and a measure
v which is absolutely continuous with respect to ,1.1, such that
= v o ± v 1 . The measures v 0 and v are unique.
Proof: Since j and r are o--finite measures, so is the measure
v. Since both A and v are absolutely continuous with
X= A
respect to x, the Radon-Nikodym Theorem asserts the existence of
nonnegative measurable functions f and g such that for each E e
= fE f dX,
vE = fE g dX.
Let A
f(x) > 01 and B = {x: f(x) = 0} . Then X is the
disjoint union of A and B, while AB = O. If we define v o by
p oE = v(E n B),
we have v o(A) = 0 and so v o i A. Let
v i (E) = v(E n A) = fEn.4
g dX '
Then v = ro
and we have only to show that v 1 << A. Let E
be a set of A-measure zero. Then
O = AE =
JE f dX,
and f= O a.e. [X] on E. Since f> 0 on A n E, we must have
X(A n E) = O. Hence v(A n E) = 0, and so v i (E) = v(A n E) = O.
This establishes the proposition except for the uniqueness, which is
left to the reader. I
Sec. 6]
241
The Radon-Nikodym Theorem
Problems
32. a. Show that the Radon-Nikodym Theorem for a finite measure y
implies the theorem for a a-finite measure y. [Hint: Decompose X into a
countable union of sets X, of finite y-measure and apply the RadonNikodym Theorem to each X, to obtain f. Show f to have the required
properties.]
b. Show the uniqueness of the function f in the Radon-Nikodym
Theorem.
33.
Radon-Wilcodym derivatives. Show that the Radon-Nikodym deriv-
ative [—
du] has the following properties:
a. If v << and f is a nonnegative measurable function, then
dv
f f dv = f f[]di.t.
rd(vi + P2)1
b.L
rdvil
Ldp.] '
f dv21
L-dAd •
c. If v «Li «X, then
d. If v «ii and y
r du]
rdvirdui
[fix]
LdAd [dxf
v,
then
[di)]
c-/Tc
-
[dy
did
34. a. Show that if v is a signed measure such that v I p and v
y,
then I) = O.
b. Show that if v 1 and v 2 are singular with respect to 11, then so is
c ul
c2I)2.
c. Show that if 1i 1 and v 2 are absolutely continuous with respect to
p. SO is CiVi +
C2 1) 2-
d. Prove the uniqueness assertion in the Lesbesgue decomposition.
35. Extend the Radon-Nikodym Theorem to the case of signed
measures.
36. Complex measures. A set function v which assigns a complex
number vE to each E in a a-algebra 63 is called a complex measure if
242
Measure and Integration [Ch. 11
v,(2S = 0 and for each countable disjoint union U E, of sets in 63 we have
E1 )
=E
vE,
with absolute convergence on the right.
a. Show that each complex measure may be expressed as v =
Mi — 1.4 2 ± 44 3 — 1g4 where g 12 29 1.1 31 and /24 are finite measures.
b. Show that for each complex measure v there is a measure A and
a complex-valued measurable function (p with j(pi = 1 such that for each
set E in 63,
vE =
[Hint: Apply the Radon-Nikodym Theorem to the measures gi with
respect to the measure pi + /2 2
/23
/24 .]
c. Show that the measure g in (b) is unique and that cp is uniquely
determined to within sets of p measure zero.
d. The measure A in (b) is called the total variation or absolute
value of v and is sometimes denoted by iv!. Show that the results of
Problem 30 hold for complex measures.
e. If v is a complex measure with M(X) = 1 and v(X) = 1, then
v is a positive real measure.
37. Alternate proof of the Radon-Nikodym Theorem. We can give a
proof of the Radon-Nikodym Theorem which is independent of the Hahn
decomposition theorem by using Proposition 10.28, which states that for
each bounded linear functional F on a Hilbert space H there is a g in H
such that F(f) = (f, g) for all ! in H. This proof is due to von Neumann.
The details are outlined below:
a. Let g and v be finite measures on a measurable space (X, 03)
and set X = g + v. Define F(f) ff du. Then F is a bounded linear
functional on L 2(X).
b. The function g e L 2(X) such that F(f) = (f, g) has the property
that 0 < g < 1, and
g(E) = E g dX
v(E) = fE (1 — g) dX.
c. If v
g, then X << g, and g = 0 only on a set of g-measure
zero. In this case
X(E) = E
[Hint: Consider Problem 21.]
g
Sec. 7]
The V' Spaces
d. If v
243
/2, then (1 — g)g-1 is integrable with respect to A and
v(E) = (1 — g)g -1
JE
38. Use the following example to show that the hypothesis in the
Radon-Nikodym Theorem that A is if-finite cannot be omitted. Let
X --= [0, I], Oa the class of Lebesgue measurable subsets of [0, 1], and take
7, to be Lebesgue measure and A to be the counting measure on 63. Then iâ–  is
finite and absolutely continuous with respect to A, but there is no function
f such that vE = fE f du for all E c 03. At what point does the proof of
Theorem 23 break down for this example?
39. Decomposable measures. Let (X, 63, u) be a measure space. A
collection {Xa} of disjoint measurable subsets of X is called a decomposition for is if tiXa < oo for each a and if piE — 0 for every measurable set E
such that ,u(E n Xa) = 0 for all a. A measure ti is called decomposable if
it has a decomposition.
a. If {,Va} is a decomposition for A, and if E is any measurable set,
n E), where E is meant in the sense of Problem 2.20.
b. If {X„} is a decomposition for a complete measure A, then f is
if and only if the restriction off to Xa is measurable locaymesurb
for each a. Iff is a nonnegative locally measurable function of X, then
then pE =
Jfdis= E
a
Xa
f
c. Let i; be absolutely continuous with respect to A, and suppose
that there is a collection {Xa} which is a decomposition for both A and v.
Then there is a nonnegative, locally measurable real-valued function f
such that for every measurable set E we have
vE =
d. The conclusion of (c) is still valid if, instead of assuming {lIa}
to be a decomposition for
we merely assume that if E e 63 and
v(E n Xa) = 0 for all a then pE = 0.
7 The LP Spaces
If (X, 03, pt) is a measure space, we denote by LP(A) the space of
all measurable functions on X for which f f1P cl/.4 < 00, considering
two functions in LP to be equivalent if they are equal almost everywhere. As in Chapter 6 we define L'(1) to be the space of bounded
244
Measure and Integration
[Ch. 11
measurable functions. For 1 < p < co we set
Ilfllp = {f If I' dil } '',
and for p = 00 we set
If
= ess sup f i.
Note that the space L' (y) depends on the choice of p, to determine the
norm and the classes of equivalent functions but that this only requires knowing what the sets of measure zero are.
The Holder and Minkowski inequalities and the Riesz-Fischer
Theorem follow just as in Chapter 6, and we summarize them in the
following theorem:
25. Theorem: For 1<p
_< oo
e Lq(,(2),
with !
and if f c LP(m), g
the spaces LP 04 are Banach spaces,
1
- = 1, then fg c D(/.4) and
flfgldm
The following proposition, whose proof is left to the reader, is a
version of Littlewood's second principle:
26. Proposition: Let f e LP(A), 1 < p < 00. Then, given e > 0,
there is a simple function („o vanishing outside a set of finite measure
such that {jf — cp{1, < c.
It follows from the Wilder inequality that each g e D defines a
linear functional F on LP by setting
F(f) = f fg
and it is not difficult to show that IFII = Ile. Our goal in the remainder of this section is to establish the converse (Riesz Representation Theorem), which states that each linear functional on LP(/.1.) for
1 < p < co is of this form and, if p, is a-finite, each linear functional
on L1 (p.) is of this form. We begin by establishing a lemma.
Sec. 7] The
LP
Spaces
245
27. Lemma: Let (X, 08, /1) be a finite measure space and g an
integrable function such that for some constant M,
f gcp dy,
for all simple functions (p. Then g
E
L.
Proof: Assume p > 1, and let (V.,i ) be a sequence of nonnegative
simple functions which increase to le. Set
= (11.,)9 sgn g.
Then çon is a simple function, and
=fr. d„} P.
4.112,=
1+1
Since q;„g- > koni 10.1 9 = /nr q = On, we have
f g
11/1 11çCn 11P
.41
1
itn C1/4
1
1
Since 1 — — = — ,
P
If
ifrn chilq
or
1 V/i,
<
Mq,
and by the Monotone Convergence Theorem
J. I g
d
W.
The case p = 1 is left to the reader. I
We shall also use the following lemma, whose proof is left to the
reader.
246
Measure and Integration
[Ch. 11
28. Lemma: Let (En ) be a sequence of disjoint measurable sets, and
for each n let fn be a function in LP(1 <p < co) which vanishes out-
side E. Set f = Ê fn . Then f e LP if and only
n=1
this case f =
EL,
if E iimP < ..
In
in LP; that is,
n
f
—
E
—> 0
fi
P
and
Il flIP = n=i
i IlfnlIP.
29. Riesz Representation Theorem: Let F be a bounded linear
functional on LP(g) with 1 <p < co and g a o--finite measure. Then
1
1
there is a unique element g in V, where - + - = 1, such that
q
P
F(f) = f fg dg.
We have also 114 = lig.
Proof: Let us first consider the case of a finite measure ,u. Then
every bounded measurable function is in LP 0.4). Define a set function
y on the measurable sets by setting
vE = F(X E ).
If E is the union of a sequence (En ) of disjoint measurable sets, let
)-- an...4.
x
Then by Lemma 28 and
a, = sgn F(X4), and set f = _a
the boundedness of F we have
i 11)En] = F(f) < co
n=1
and
E
n=1
PE. = F(X E ) = vE.
Thus y is a signed measure, and so by the Radon-Nikodym Theorem
there is a measurable function g such that for each measurable set E
we have vE = f g dy. Since y is always finite, g is integrable.
E
Sec. 7]
The LP Spaces
247
If ye, is a simple function, the linearity of F and of the integral
imply that
F(ç) = f sog dtt.
Since the left side is bounded by VII
we have g c D by
Lemma 27. Let G be the bounded linear functional defined on LP by
G(f) = f fg
Then G — F is a bounded linear functional which vanishes on the
subspace of simple functions. By Proposition 26 the simple functions
are dense in LP, and so G — F = O. Thus for all f c LP
F(f) =- f fg
and it is readily verified that In = uGh =
The function g must determine a unique element of Lg, for if g1
and g2 determine the same functional F, then g l — g 2 must give the
zero functional, and so Ilgi - g2 11, = O. Thus g 1 = g2 a.e.
To extend the theorem to the a-finite case, let (X„) be an increasing
sequence of measurable sets of finite measure whose union is X. The
theorem for finite measure spaces implies that for each n there is a
function g„ in Lq which vanishes outside X„ such that
F(f) =f fg„
for all f e LP which vanish outside Xn . Moreover, g,, < ilf11.
Since any function g„ with this property is uniquely determined on
X„ except for changes on sets of measure zero and since gn+1 also
has this property, we may assume g„ +i = g„ on Xn . For x e Xn set
g(x) = g„(x). Then g is a well-defined measurable function and ign i
increases pointwise to g Thus by the Monotone Convergence
Theorem
d,u =-- limf g,. r d
and g e L.
248
Measure and Integration
[Ch. 11
If f e LP, let fn = f on Xn, and f. = 0 otherwise. Then f. --> f
pointwise and in L. Since j fgl is integrable and I f,igl < I fgl, the
Lebesgue Convergence Theorem gives
ffgd = 1imff n gdi.4
= lim f fi gn
= lim F(f„,)
= F(f). I
If p = I, the requirement that
be a-finite is necessary. Some
extensions and counterexamples are given in Problems 45 and 46. If
p > I, then o--finiteness is not required:
30. Theorem: Let F be a bounded linear functional on LP() with
1 <p < co. Then there is a unique element g e D such that
F(f) = ffgd.
We have VII
Proof: It follows from the previous theorem that if E is any measurable set of o--finite measure then there is a unique gE e Lq, vanishing
outside E, such that
F(f) = fgEfdil
for each f e L P which vanishes outside E. The uniqueness of gE
implies that if A c E then gA = gE a.e. on A. For each E of o--finite
measure set X(E) = f Ig E lq di.t. Then for A c E, we have x(A) <
x (E) < Let
be a sequence of sets of o--finite measure such
that x(E) tends to the maximum value m of X. Then H = U En
is a set of a-finite measure, and by the monotonicity of X we have
X(H) = m.
Let g be defined to be gH on H and 0 elsewhere. Then g e L.
If E is any set of a-finite measure containing H, then gE = gH a.e.
on H. But f ig = x(E) < x(H) = f ig'j ig , and so gE = 0 a.e. on
E — H. Thus gE = g almost everywhere on E.
Sec. 7]
249
The LP Spaces
If f e D', then the set N =
f(x) 0}
measure, and so is the set E = N u H. Thus
F(f) = f fgE d
is a set of a-finite
= f fg d.
The equality of ilFil and lie, follows as in the previous theorem. I
Problems
40. Prove Proposition 26.
41. Prove Lemma 27 for the case p = 1.
42. Prove Lemma 28.
43. For g e V, let F be the linear functional on Ly defined by
F(f) = f fg
Show that VII =
44. a. Let ,u be the counting measure on a countable set X. Show that
(p.) = 12) .
b. Discuss the spaces P(X) =
Cu.), where ,u is the counting measure on a not necessarily countable set X.
45. Let (X, 03, 1.0 be a decomposable measure space (cf. Problem 39).
a. Show that for each bounded linear functional F on L 1 (2) there
is a bounded locally measurable function g such that F(f) = ffg
b. Show that the dual space to L 1 (j2) is the space L(M), where pt is
the saturation of j given in Problem 8e.
46. Let A and B be uncountable sets with different numbers of elements,
x = a} is called a vertical
and let X = A X B. A set of the form {(x,
line and a set of the form {(x, y): y = b} a horizontal line. Let 61 be the
collection of subsets E of X such that for every horizontal or vertical line L
either E n L or E n L is countable. Then 63 is a a-algebra. Let I.LE be the
number of horizontal and vertical lines L for which E n L is countable and
vE the number of horizontal lines with E n L countable. Then p, and y are
measures on 03, and we can define a bounded linear functional F on L 1 (t)
by setting F(f) = ff dv. There is no locally measurable function g such that
F(f) = ffg
12
Measure and Outer
Measure
In this chapter we first consider some of the ways in which a
measure can be defined on a a-algebra. In the case of Lebesgue
measure we defined measure for open sets and used this to define
outer measure, from which we obtain the notion of measurable set
and Lebesgue measure. Such a procedure is feasible in general. In
the first section we discuss the process of deriving a measure from an
outer measure, and in the second section we derive an outer measure
from a measure which is defined only on an algebra of sets. The
remainder of the chapter is devoted to some applications of this
process.
1
Outer Measure and Measurability
By an outer measure ti.* we mean an extended real-valued set
function defined on all subsets of a space X and having the following
properties:
i. i-L * 0
= O.
ii. A C B = 1.1*A < ju*B.
co
C.
iii. E CU E, 12*E < E ,*Ei.
z=i
The second property is called monotonicity and the third countable
subadditivity. In view of (i) finite subadditivity follows from (iii).
250
Sec. 1]
251
Outer Measure and Measurability
Because of (ii), property (iii) can be replaced by
u
E=
E„ E, disjoint
teE <
-1
2--
The outer measure y* is called finite if 12*X < 00.
By analogy with the case of Lebesgue measure we define a set E
to be measurable with respect to A* if for every set A we have
I.L*(A) = ,e (A n E)
,*(A n
Since /.4* is subadditive, it is only necessary to show that
e(A) ,4*( 4 n E) A*(A
n
for every A in order to prove that E is measurable. This inequality is
trivially true when 1.1*A = 00, and so we need only establish it for
sets A with eA finite.
1. Theorem: The class icB of A*-measurable sets is a a--algebra.
If P. is p.* restricted to GI, then Z is a complete measure on CB.
Proof: Trivially, the empty set is measurable. The symmetry of the
definition of measurability in E and E shows that P
- . is measurable
whenever E is.
Let E l and E2 be measurable sets. From the measurability of E2,
12 * ( A)
11* (A
n E2)
n E2)
and
A* 0) = 1.1*(A
n
E2)
+ Al*(A n E 2 n E1) + ih*(A n E1 n E 2)
by the measurability of E l . Since
' A n [E 1 u
= A n
[
u [A n E1 n E2],
we have
A* (A
n
[E1
u E2]) < I.L*(A n E2) + /.1*(A n E 2 n E1)
by subadditivity, and so
1.4*A > 412*(A n [E1 u E2]) +
n E1 n E2).
This means that E 1 u E2 is measurable, since
—(E 1 u E2) = E1 n
E 2.
252
Measure and Outer Measure
[Ch. 12
Thus the union of two measurable sets is measurable, and by induction the union of any finite number of measurable sets is measurable,
showing that 63 is an algebra of sets.
H E where EZ is a disjoint sequence of
Assume that E
measurable sets, and set
Gn
= t-i
U Et-
Then G„ is measurable, and
A * (A) = 1.4* (A
n
bt*(A
since E c O
- n . Now Gn n En =
the measurability of En we have
/I
n
?_
n
+ t.t*(A n
t.t*(A n
En and G„ nE„ = G„_1, and
= t.t*(A n E„)
A*(A
by
n Gn _ 1).
By induction
/.4 * (A
n Gn ) = t tt*(A n Ei),
t=i
and so
tt*(A)
A*(A n
i.t*(A
n E1)
tt*(A n E),
tt*(A n E)
-
since
AnEc
U
t=i
(AnE1).
Thus E is measurable. Since the union of any sequence of sets in
an algebra can be replaced by a disjoint union of sets in the algebra,
it follows that 63 is a a-algebra.
We next demonstrate the finite additivity of A. Let El and E2
be disjoint measurable sets. Then the measurability of E2 implies
that
A(E 1 u E2) = tt*(Ei u E2)
u
il*E2 +
1-4* ([-E1
n E2)
Finite additivity follows by induction.
11*([Ei
u
n E2)
Sec. 2]
253
The Extension Theorem
If E is the disjoint union of the measurable sets {Ei} , then
rt(E)
(
=
i=1
ti=1
and so
T4(E)
i=1
rt(E1).
But
A(E)
i=1
FL(Ei)
by the subadditivity of a*. Hence g is countably additive and thus
a measure since it is nonnegative and AO = A*0 = O. I
Problems
1. Prove the completeness of g.
2. Show that g is saturated.
3. Assume that (Ei) is a sequence of disjoint measurable sets and
E = U E. Then for any set A we have
ii*(A n E) =
ii*(A n Ei).
2 The Extension Theorem
By a measure on an algebra we mean a nonnegative extended realvalued set function defined on an algebra et of sets such that:
i. A(0) = 0;
ii. if (A i ) is a disjoint sequence of sets in et whose union is also
in a, then
(Û
A
=
Thus a measure on an algebra et is a measure if and only if et is a
a-algebra. The purpose of this section is to show that, if we start with
a measure on an algebra et of sets, we may extend it to be a measure
defined on a a-algebra 63 containing a. We shall do this by using
254
Measure and Outer Measure
[Ch. 12
the measure on the algebra to construct an outer measure ,u* and
show that the measure r4 induced by u* is the desired extension of u.
The process by which we construct u* from 1.2 is analogous to that
by which we constructed Lebesgue outer measure from the lengths
of intervals: We define
= inf
E juA,
(1)
where (A,) ranges over all sequences from a such that E C U A.
2= 1
,u*.
Wefirstablhom cnerig
2. Lemma: If A e a and if (41,) is any sequence of sets in a such
that A C U A
,
then ,uA <
E
Br, --- A
A„
Proof: Set
n
n 2,i_i n • • • n 71 1.
Then Br, e a and Bfl, C An . But A is the disjoint union of the sequence
(BO, and so by countable additivity
cc
=
E
03,„<
n=1
E
n =1
3. Corollary: If A e a ,u*A = A.
,
4. Lemma: The set function ,u* is an outer measure.
Proof: Since A * is clearly a monotone nonnegative set function
defined for all sets and p. * 325 = CI, we have only to show that it is
countably subadditive. Let E C U E,. If u*Ei = 00 for any i, we
have
E <E,u*E,= 00. If not, given E > 0, there is for each
i a sequence (A„);°_, of sets in a such that E, c U A„ and
,=1
E R A„
< ,u*E,± E/2'.
=1
Then
teE
< E RA„ < E /2*Ei ±
2)
2=1
E.
Sec. 2]
255
The Extension Theorem
Since E was an arbitrary positive number,
and
i.t*
is
I
subadditive.
5. Lemma: If A c a then A is measurable with respect to
e.
Proof: Let E be an arbitrary set of finite outer measure and e a
positive number. Then there is a sequence (A i) from a such that
E c U A, and
E IA i < ,u* E + e.
By the
additivity
of 1.1 on a we have
4,4 2) = il(Az
n A) + ,u(A, n M.
Hence
co
CO
,a*E + € >
E t (A , n A)
-1-
E
A(Ai n ,i)
> ,.,*(E n A) -1- u*(E n M,
since
EnA C U (A, n A)
and
EnA- CU (A i n M.
Since e was an arbitrary positive number,
,u*E > ,21, ( E n A) -1- A*(E n A),
and A is measurable.
1
The outer measure ii* which we have defined is called the outer
measure induced by IA. For a given algebra a of sets we use a, to
denote those sets which are countable unions of sets of a and use
a,s to denote those sets which are countable intersections of sets in
a,.
256
Measure and Outer Measure
[Ch. 12
6. Proposition: Let ,u be a measure on an algebra a, ,u* the outer
measure induced by IA, and E any set. Then for e > 0, there is a set
A e a, with E C A and
,u*A < *E +E
There is also a set B e a,3 with E c B and ,*E =
Proof: By the definition of
such that E c U A, and
E
A*
there is a sequence (A i ) from a
A. < pc* E
E.
s =1
Set A = U A. Then pc*A < Ept*A i EgA i.
To prove the second statement, we note that for each positive
integer n there is a set A, in a, such that E C A„ and ,u*A. <
,u*E ,•;. Let B = fl A1. Then B ct,s and E c B. Since B C A.,
,u* B < ,u* A „ < ,u* E
Since n is arbitrary, *B < ,u* E. But
E c B, and so ,u*B > A *E by monotonicity. Hence bz * B = 12*E.1
If we apply this proposition in the case that E is a measurable set of
finite measure, we see that E must be the difference of a set B in a,3
and a set of measure zero. This gives us the structure of the measurable sets of finite measure, and the next proposition extends this to
the a-finite case. It can be considered a generalization of the first
principle of Littlewood.
7. Proposition: Let
be a a-finite measure on an algebra a,
and let A * be the outer measure generated by pt. A set E is ,u* measurable
if and only if E is the proper difference A — B of a set A in a,3 and
a set B with A*B = O. Each set B with 1.4*B = 0 is contained in a set
C in cto.6 with ,u*C = O.
Proof: The "if" part of the proposition follows from the fact that
each set in a,3 must be measurable, since the measurable sets form a
a-algebra, while each set of A *measure zero must be measurable,
since Ti is complete.
To prove the "only if" part of the proposition, let IX11 be a
countable disjoint collection of sets in a with 12X, finite and
X = U X. If E is measurable, then E is the disjoint union of the
Sec. 21
257
The Extension Theorem
measurable sets E, = X, n E. By Proposition 6 we can find for each
positive integer n, a set A„„ in a, such that E, C A„, and
1
n2z
Set
= U An,
Then E C A n, and A„ E C U
[An,
—
E7]. Hence
t=1
g.(A„ E)
u(A ni E,)
7=
E
Since An e
a,, the set A
=
fl
n=1
1
1
n2'
n
A,, is in a, and for each n
A — E c A n — E.
Hence
(A,
<1
Since this holds for each positive integer n, we must have
,u(A E) = O. I
We summarize the results of this section in the following theorem.
04,
8. Theorem (Carathéodory): Let y be a measure on an algebra
and A * the outer measure induced by m. Then the restriction rt of
to the y*-measurable sets is an extension of A to a o--algebra containing a. If u is finite (or cr-finite) so is u. If y is-finite, then Ti
is the only measure on the smallest a-algebra containing a which is an
extension of A.
1.1 *
Proof: The fact that g is an extension of y from a to be a measure
on a a-algebra containing a follows directly from Corollary 3,
Lemma 5, and Theorem 1, and it is readily verified that r.k is finite or
o--finite whenever 12 is.
258
Measure and Outer Measure
[Ch. 12
To show the unicity of z when I.4 is o--finite, we let CB be the smallest
o--algebra containing a and ki some measure on G3 which agrees
with ,u on a.
Since each set in a, can be expressed as a disjoint countable union
of sets in a, the measure 1.1: must agree with 1- 7 on a,. Let B be any set
in CB with finite outer measure. Then by Proposition 6 there is an A
in a, such that B C A and
i.t.*A < ,u*B +
E.
Since B c A,
jo < ra = ,.,*A < ,u*B +
E.
Since E is an arbitrary positive number, we have
03 < 12* B
for each B
Since the class of sets measurable with respect to ,L* is a o--algebra
containing a, each B in eii must be measurable. If B is measurable and
A is in a, with B c A and 1.4*A < I.4*B + E, then
,u* A = ju*B + , L * (A ,-.., B),
and so
B) <
E,
if 1.4*B < oo. Hence
A* B < ,u* A = AA
- _. A B + IAA — B)
< AT +
E.
Since E is arbitrary, we have
and so
i.t*B = it- B.
If 1.4 is a o--finite measure, let { X,} be a countable disjoint collection
of sets in a with X — U X., and ,uX, finite. If B is any set in CB, then
B= U (X, n B)
Sec. 2]
259
The Extension Theorem
and this is a countable disjoint union of sets in 63, and so we have
JIB =
E p(X, n B)
pB =
E Tu- (X, n B).
and
Since e(X, n B) <
co,
we have
Ta(X, n B) = p(X, n B). I
This extension procedure not only extends p, to a measure on the
smallest o--algebra 63 containing a, but also completes and saturates
the measure. If p. is o--finite, the extension to 63 is already saturated,
and the extension to the e-measurable sets is merely the completion
of A. If p is not a-finite, then the extension to e-measurable sets also
saturates i. It should be observed that in this case the extension of
p. to (33 need not be unique (Problem 4), although any extension JI
must agree with for each set B of 63 for which 1.7B < 00, and we
always have JIB < A* B. We shall return to the question of extension
and unicity in Sections 5 and 6.
It is often convenient to start with a set function on a collection
e of sets having less structure than an algebra of sets. We say that a
collection e of subsets of X is a semialgebra of sets if the intersection
of any two sets in e is again in e and the complement of any set
in e is a finite disjoint union of sets in e. If e is any semialgebra of
sets, then the collection a consisting of the empty set and all finite
disjoint unions of sets in e is an algebra of sets which is called
the algebra generated by e. If p is a set function defined on e, it is
natural to attempt to define a finitely additive set function on a by
setting
=
E
whenever A is the disjoint union of the sets Ei in e. Since a set A
in a may possibly be represented in several ways as a disjoint union
of sets in e, we must be certain that such a procedure leads to a
unique value for IA. The following proposition gives conditions
under which this procedure can be carried out and will give a
measure on the algebra a.
260
Measure and Outer Measure
[Ch. 12
9. Proposition: Let e be a semialgebra of sets and u a nonnegative
set function defined on e with bk = 0 (if 0 E e). Then bt has a unique
extension to a measure on the algebra a generated by e if the following
conditions are satisfied:
i. If a set C in e is the union of a finite disjoint collection {C il
of sets in e, then AC =
ii. If a set C in e is the union of a countable disjoint collection
{C,} of sets in e, then 1.4C <
Problems
4. Let X be the set of rational numbers and a the algebra of finite
unions of intervals of the form (a, b] with i.t(a, b] = oo and /.425 = O.
The extension of p. to the smallest a-algebra containing a is not unique.
5. Prove Proposition 9 by showing:
a. Condition (i) implies that, if A is the union of each of two finite
disjoint collections {C,) and {D,} of sets in e, then EpC, = A DJ.
[Hint: pC, =
Ace n Din
E/
b. Condition (ii) implies that p, is countably additive on a [for
finite additivity and monotonicity already imply the reverse inequality].
6. Let a be a collection of sets which is closed under finite unions and
finite intersections; an algebra of sets, for example.
a. Show that a, is closed under countable unions and finite intersections.
b. Show that each set in a„,5 is the intersection of a decreasing
sequence of sets in a,.
7. Let g be a finite measure on an algebra a, and m* the induced outer
measure. Show that a set E is measurable if and only if for each e > 0
there is a set A c as A C E, such that p*(E — A) < e.
8. If we start with an outer measure ,u* on X and form the induced
measure /-.7, on the p*-measurable sets, we can use 72 to induce an outer
measure m,+.
a. Show that for each set E we have p+E > p*E.
b. For a given set E we have p+E = p*E if and only if there is a
p*-measurable set A D E with p*A =
c. An outer measure which satisfies the criterion in (b) for each set
E is called a regular outer measure. Show that every outer measure induced
by a measure on an algebra is regular.
d. Let X be a set consisting of two points. Construct an outer
measure on X which is not regular.
,
Sec. 3]
261.
The Lebesgue-Stieltjes Integral
*3 The Lebesgue-Stieltjes Integral
Let X be the set of real numbers and 63 the class of all Borel sets. A
measure y defined on 6a and finite for bounded sets is called a Borel
measure (on the real line). To each finite Borel measure we associate a
function F by setting
F(x) = m( — 00 , x].
The function F is called the cumulative distribution function of j and
is real-valued and monotone increasing. We have
p,(a, b] = F(b) — F(a).
1
— ], Proposition 11.2
Since (a, b] is the intersection of the sets (a, b
implies that
(a, b] = hm
1
(a, b
and so
F(b) = Ern F
1
= F(b+).
Thus a cumulative distribution function is continuous on the right.
Similarly,
A{b} = lim
n
co
(I) — , bl
n
lim F(b) F (I)
7Z
1
—
n
= F(b) — F(b—).
Hence F is continuous at b if and only if the set {b} consisting of b
-n], we have
alone has measure zero. Since 0 =
n
lim F(n) = 0,
n
—co
and hence
lim F(x) = 0,
—
because of the monotonicity of F. We summarize these properties in
the following lemma:
262
Measure and Outer Measure
[Ch. 12
10. Lemma: If A is a finite Borel measure on the real line, then its
cumulative distribution function F is a monotone increasing bounded
function which is continuous on the right. Moreover, lim F(x) = O.
00
Suppose that we begin with a monotone increasing function F
which is continuous on the right. Then we shall show that there is a
unique Borel measure A such that
(2)
p.(a, b] = F(b) — F(a)
for all intervals of the form (a, b], where we define F(00) = lirn F(x)
2 -000
and F(— Go) = lim F(x). We begin with the following lemma, whose
proof is left to the reader (Problem 9):
11. Lemma: Let F be a monotone increasing function continuous
on the right. If (a, b]
C
U
(a„ b0], then
0=1
F(b) — F(a) <
F(b0) — F(a,).
1=-1
If we let e be the semialgebra consisting of all intervals of the form
(a, b] and set 14(a, b] = F(b) — F(a), then A is easily seen to satisfy
condition (i) of Proposition 9, and since Lemma 11 is precisely the
second condition, we see that A admits a unique extension to a
measure on the algebra generated by e. By Theorem 8 this A can be
extended to a a--algebra containing e. Since the class 63 of Borel sets
is the smallest a--algebra containing e, we have an extension of A to a
Borel measure. The measure A is a--finite, since X is the union of the
intervals (n, n 1] and each has finite measure. Thus the extension
of A to 03 is unique, and we have the following proposition:
12. Proposition: Let F be a monotone increasing function which
is continuous on the right. Then there is a unique Borel measure A
such that for all a and b we have
p.(a, b] = F(b) — F(a).
13. Corollary: Each bounded monotone function which is continuous on the right is the cumulative distribution function of a unique
finite Borel measure provided F(— oo) = O.
Sec. 3]
The Lebesgue-Stieltjes Integral
263
If ça is a nonnegative Borel measurable function and F is a monotone increasing function which is continuous on the right, we define
the Lebesgue-Stieltjes integral of ça with respect to F to be
f cp dF = f cp dp,
where A is the Borel measure having F as its cumulative distribution
function. If cp is both positive and negative, we say that it is integrable
with respect to F if it is integrable with respect to A.
If F is any monotone increasing function, then there is a unique
function F* which is monotone increasing, continuous on the right,
and agrees with F wherever F is continuous on the right (Problem 10),
and we define the Lebesgue-Stieltjes integral of cp with respect to F by
f (p dF = f cp dF*.
b
If F is a monotone function, continuous on the right, then f (p dF
L
agrees with the Riemann-Stieltjes integral whenever the latter is
defined. The Lebesgue-Stieltjes integral is only defined when F is
monotone (or more generally of bounded variation as in Problem 11c), while the Riemann-Stieltjes integral can exist when F is
not of bounded variation, say when F is continuous and (p is of
bounded variation.
Problems
9. Prove Lemma 11. [Choose e > 0. By the continuity on the right of
F, choose ?i t > 0 so that F(b, + ni) < F(b) I € 2—i , and choose 3 > 0
so that F(a + (5) < F(a) I e. Then the open intervals (a„ b, + ni) cover
the closed interval [a ± (5, b], and the proof proceeds like that of Proposition 3.1. A little extra care must be taken when (a, b] is infinite.]
10. Let F be a monotone increasing function, and define
- -
- -
F*(x) = Jim F(y).
Then F* F*is a monotone increasing function which is continuous on the
right and agrees with F wherever F is continuous on the right. We have
(F*)* = F*, and if F and G are monotone increasing functions which
agree wherever they are both continuous, then F* = G*.
Measure and Outer Measure
264
[Ch. 12
11. a. Show that each bounded function F of bounded variation gives
rise to a finite signed Borel measure y such that
v(a, b] = F(b+) — F(a+).
b. Compare Theorem 5.4 with the Jordan decomposition of v.
c. Extend the definition of the Lebesgue-Stieltjes integral fclo dF to
functions F of bounded variation.
d. Show that if kpj < M and if the total variation of F is T, then
if/PI _< MT.
12. a. Let F be the cumulative distribution function of the Borel
measure y, and assume that F is continuous. Then for any Borel set E
contained in the range of F, we have mE = v[F-1 (E)], with m Lebesgue
measure. [Hint: This is true for intervals, and the uniqueness part of
Theorem 8 can be used to derive its truth in general.]
b. Generalize to the case of discontinuous cumulative distribution
functions.
13. Let F be a continuous increasing function on [a, b] with F(a) = c,
F(b) = d, and let ço be a nonnegative Borel measurable function on [c, d].
d
Then f cp(F(x)) dF(x)
a
f ,,o(y) dy. [Hint: Use Problem 12a to take
care of the case when (F is a characteristic function and generalize first to
simple ço and then to general ço.]
14. a. Show that a measure ih is absolutely continuous with respect to
Lebesgue measure if and only if its cumulative distribution function is
absolutely continuous.
b. If u is absolutely continuous with respect to Lebesgue measure,
then its Radon-Nikodym derivative is the derivative of its cumulative
distribution function.
c. If F is absolutely continuous, then
f f dF = f fF' dx.
4
Product Measures
Let (X, a, ,u) and ( Y, 63, v) be two complete measure spaces, and
consider the direct product X X Y of X and Y. If A c X and B c Y,
we call A X Ba rectangle. If A e a and B e 63, we call A XBa
Sec. 4]
265
Product Measures
measurable rectangle. The collection
semialgebra, since
61,
of measurable rectangles is a
(A x B) n (C X D) = (A n C) X (B n D)
and
—(A
x
B)
(Al
B)u (A X b) u (21 X h).
If A X B is a measurable rectangle, we set
X(A X B) = ,uA • vB.
14. Lemma: Let {(A, X .81) } be a countable disjoint collection of
measurable rectangles whose union is a measurable rectangle A x B.
Then
X(A X B) =
x(Az X B 1).
E
Proof: Fix a point x e A. Then for each y e B, the point (x, y)
belongs to exactly one rectangle A, X B,. Thus B is the disjoint union
of those B, such that x is in the corresponding A,. Hence
E vB,- X
1 (x) = vB • X A (x),
since I/ is countably additive. Thus by the corollary of the Monotone Convergence Theorem (11.14), we have
L
fB 1 x 41 d = f u(B) • XA d,u
or
E vB, • ALA,
= vB • ,uA. I
The lemma implies that X satisfies the conditions of Proposition 9
and hence has a unique extension to a measure on the algebra at'
consisting of all finite disjoint unions of sets in act. Theorem 8 allows
us to extend X to be a complete measure on a a--algebra 8 containing
at. This extended measure is called the product measure of y and y
and is denoted by X v. If and if are finite (or a--finite), so is X y.
If X and Y are the real line and and y are both Lebesgue measure,
then X y is called two-dimensional Lebesgue measure for the plane.
The purpose of the next few lemmas is to describe the structure of
the sets which are measurable with respect to the product measure
266
Measure and Outer Measure (Ch. 12
X v. If E is any subset of X X Y and x a point of X, we define the
x cross-section Ex by
= (y: (x, y) EE},
and similarly for the y cross-section for y in Y. The characteristic
function of Ex is related to that of E by
X E.(y) = X E(x, y).
We also have
collection {Ea}.
(f) x
= — (Ex) and (U Ea)x = U (Ea)x for any
15. Lemma: Let x be a point of X and E a set in 61,8. Then Ex is
a measurable subset of Y.
Proof: The lemma is trivially true if E is in the class 61 of mea-
surable rectangles. We next show it to be true for
E = U E„
where each
E,
E
in
131,.
Let
is a measurable rectangle. Then
z=1
xE(y) = xE(x, y)
= sup x E,(x, y)
= sup X ( E0 (y).
Since each E, is a measurable rectangle, X (E,4 (y) is a measurable
function of y, and so XE z must also be measurable, whence Ex is
measurable.
Suppose now that E =
E, with E, e CR.,. Then
n
z=i
xE(y) = XE(x, Y)
= inf x Ei (x, y)
= inf X(E i)(Y),
and we see that
XE.
is measurable. Thus Ex is measurable for any
E e (Bo a. I
16. Lemma: Let E be a set in
function g defined by
61,73
with
g(x) = vEx
/.4
X v(E) < Go. Then the
Sec. 4)
267
Product Measures
is a measurable function of x and
fgdj = /.‘ X y(E).
Proof: The lemma is trivially true if E is a measurable rectangle.
We first note that any set in 61,7 is a disjoint union of measurable
rectangles. Let (E,) be a disjoint sequence of measurable rectangles,
and let E = U E,. Set
gi (x) = vr(E0x].
Then each g, is a nonnegative measurable function, and
g=
Thus g is measurable, and by the corollary of the Monotone Convergence Theorem (11.14), we have
f
f g, d/.4
g
=EAXP(E,)
= pi X v(E).
Consequently, the lemma holds for E e
Let E be a set of finite measure in (R,s . Then there is a sequence
(E,) of sets in di, such that E,+1 C E, and E =
E,. It follows from
Proposition 6 that we may take 1.4 X v(E 1) < oo. Let gt (x) =
Then g(x) = lim g,(x), and so g is measurable. Since
n
f gi d1 =
X v(E 1 ) < 00,
we have g 1 (x) < oo for almost all x. For an x with g 1 (x) < oo, we
have ((E,)x ) a decreasing sequence of measurable sets of finite
measure whose intersection is E.
Thus by Proposition 11.2 we have
g(x) = y(Ex ) = lim v[(E0x ]
= lim
Hence
g, --> g a.e.,
268
Measure and Outer Measure
[Ch.
12
and so g is measurable. Since 0 < g, < gli the Lebesgue Convergence
Theorem implies that
Jgdp lim
diu
= lim 1.t X v(E 2)
= IA X v(E),
the last equality following from Proposition 11.2. 1
17. Lemma: Let E be a set for which
almost all x we have v(E z) = O.
1.1,
X v(E) =
O. Then for
Proof: By Proposition 6 there is a set F in 61,8 such that E c F
and ,u X v(F) = O. It follows from Lemma 16 that for almost all x
we have v(F) = O. But Ex c Fx, and so vEr = 0 for almost all x
since v is complete.1
18. Proposition: Let E be a measurable subset of X X Y such that
X v(E) is finite. Then for almost all x the set Ez is a measurable
subset of Y. The function g defined by
/.4
g(x) =v(Ex)
is a measurable function defined for almost all x and
f g diu = 1.4 X v(E).
Proof: By Proposition 6 there is a set F in Gi,a such that E c F
,u X v(F) = A X v(E). Let G = F — E. Since E and F are
measurable, so is G, and
and
A X 70) = A
X v(E)
A
X v(G).
Since A X v(E) is finite and equal to j2 X v(F), we have ,u X v(G) =
O. Thus by Lemma 17 we have vGz = 0 for almost all x. Hence
g(x) = vEz = vF, a.e.,
and so g is a measurable function by Lemma 16. Again by Lemma 16
X v(F)
fgd
= ti
X v(E). 1
Sec.
41
269
Product Measures
The following two theorems enable us to interchange the order of
integration and to calculate integrals with respect to product measures
by iteration.
19. Theorem (Fubini): Let (X, a, /2) and (Y, CB, y) be two complete
measure spaces andf an integrable function on X X Y. Then
i. for almost all x the function jez defined by f(y) = f(x, y) is an
integrable function on Y;
i'. for almost all y the function fy defined by f(x) -= f(x, y) is an
integrable function on X;
f(x, y) dv(y) is an integrable function on X;
fy
fx f(x, y) d(x)
u.
is an integrable function on Y;
fx [ fy f dv]di.1 = fxxy f d(i.1 X y) = f y [fx f diddy.
Proof: Because of the symmetry between x and y it suffices to
prove (i), (ii), and the first half of (iii). If the conclusion of the theorem
holds for each of two functions, it also holds for their difference, and
hence it is sufficient to consider the case when f is nonnegative.
Proposition 18 asserts that the theorem is true if f is the characteristic function of a measurable set of finite measure, and hence the
theorem must be true iff is a simple function which vanishes outside
a set of finite measure. Proposition 11.7 asserts that each nonnegative
integrable function f is the limit of an increasing sequence ((ion ) of
nonnegative simple functions, and, since each (p i, is integrable and
simple, it must vanish outside a set of finite measure. Thus fx is the
limit of the increasing sequence ((gon),) and is measurable. By the
Monotone Convergence Theorem
fy f(x, y) dv(y) = lim fy cp,i (x, y) dv(y),
and so this integral is a measurable function of x. Again by the Monotone Convergence Theorem
fx [fi, f di)] d = lim fx [
cp„ dv]d/2
.f.xy(Pn Cl( bi
X y)
= xxy f d(u X y). I
270
Measure and Outer Measure
[Ch. 12
In order to apply the Fubini Theorem, one must first verify that f
is integrable with respect to A X y; that is, one must show that f is a
measurable function on X X Y and that 11 f I d().t X y) < 00. The
measurability of f on X X Y is sometimes difficult to establish, but
in many cases we can establish it by topological considerations (cf.
Problem 17). In the case when p, and y are a--finite, the integrability
of f can be determined by iterated integration using the following
theorem:
20. Theorem (Tonal»: Let (X, a, p,) and (Y, 03, y) be two a--finite
measure spaces, and let f be a nonnegative measurable function on
X X Y. Then
fx defined by fx (y) = f(x, y) is a
measurable function on Y.
i'. for almost all y the function L defined by f(x) = f(x, y) is a
measurable function on X.
f(x, y) cly(y) is a measurable function on X.
i. for almost all x the function
fy
f
X
f(x, y) dp,(x) is a measurable function on Y.
[ f f did d p, f
f=
X Y
XXY
f C102 X P)
f [f f dA]dy.
Y X
Proof: For a nonnegative measurable function f the only point in
the proof of Theorem 19 where the integrability of f was used was to
infer the existence of an increasing sequence ((p„) of simple functions
each vanishing outside a set of finite measure such that f = fim yon .
But if p. and y are a--finite, then so is y X 1), and any nonnegative
measurable function on X X Y can be so approximated by
Proposition 11.7. I
If a and 03 are a--algebras on X and Y, then the smallest a--algebra
containing the measurable rectangles is denoted by a X 03. Thus
the product measure is defined on a a--algebra containing a x 63,
and since p. X r is obtained by the Carathéodory extension process,
it is both complete and saturated. If y and P are both a--finite, then
the product measure on a X 03 is already saturated and the measurable sets for y X 7) are those which differ from sets in a x o3 by
sets of measure zero.
Many authors prefer to define product measure to be the restriction
of 11 X r to a X 63. The advantage of taking y X 1, to be complete,
271
Sec. 4] Product Measures
as we have done here, is that this does what we want it to for Lebesgue
measure: The product of n-dimensional Lebesgue measure with
m-dimensional Lebesgue measure is (n + m)-dimensional Lebesgue
measure. Since our hypotheses for the Fubini and Tonelli theorems
require only measurability with respect to the complete product
measure, they are weaker than requiring measurability with respect
to a X (33. The price for using these weaker hypotheses is the necessity of including the "almost all" phrases in the conclusion of the
theorems. This has to be expected, since changing f arbitrarily for x
in a set of measure zero doesn't change the measurability or
integrability of f, but fs can be arbitrary for those x. If however,
f is measurable with respect to a X (fa, then f,, is measurable for each
X.
We have also used the completeness of I.,. to show that if (x, y) dv(y)
is measurable, for if p, were not complete we could only conclude that
this was a function which differed on a subset of a set of measure zero
from a measurable function. If, however, f is measurable with respect
to a X 63, then it turns out that ff(x, y) dv(y) is measurable with
respect to a even if 14 is not complete (provided f is integrable), but
the proof of this is surprisingly intricate. For a proof see Halmos
[8], p. 143.
The examples in the problems show that we cannot omit the
hypothesis of the integrability of f from the Fubini Theorem or the
hypotheses of o--finiteness and nonnegativity from the Tonelli
Theorem. Problem 22 shows the essential role played by the measurability of f in these theorems: If we omit this assumption, even for
bounded functions and finite measures, we may have the iterated
integrals f[f f 4] dp, and f[f f 4] dv well defined but unequal.
Problems
15. Let X — Y be the set of positive integers, a OE — 6,(X), and let
v = g be the measure defined by setting p,(E) equal to the number of points
in E if E is finite and 00 if E is an infinite set. (This measure is called the
counting measure.) State the Fubini and Tonelli Theorems explicitly for
this case.
16. Let (X, a, g) be any 0-finite measure space and Y the set of positive
integers with v the counting measure (Problem 15). Then Theorem 20 and
Corollary 11.14 state the same conclusion. However, Corollary 11.14 is
----
Measure and Outer Measure
272
[Ch. 12
valid even if y is not a-finite, and hence the Tonelli Theorem is true without
a-finiteness if ( Y, 63, y) is this special measure space.
17. Let X = Y = [0, It and let A = y be Lebesgue measure. Show
that each open set in X X Y is measurable, and hence each Borel set in
X X Y is measurable.
18. Let h and g be integrable functions on X and Y, and definef(x, y) =
h(x)g(y). Then f is integrable on X X Y and
fxxy fd(y. X v) = f h dy f g dy.
x
Y
We do not need to assume that y and v are a-finite.)
19. Show that Tonelli's Theorem is still true if, instead of assuming
ia and v to be o--finite, we merely assume that {(x, y): f(x, y)
0)- is a
set of a-finite measure.
20. The following example shows that we cannot remove the hypothesis
that f be nonnegative from the Tonelli Theorem or that f be integrable
from the Fubini Theorem. Let X = Y be the positive integers and A = v
be the counting measure. Let
(Note:
{ 2 — 2 —x
f(x, y) = — 2 + 2—z
0
if x = y
if x = y + 1
otherwise.
21. The following example shows that we cannot remove the hypothesis
that f be integrable from the Fubini Theorem or that A and y are a-finite
from the Tonelli Theorem: Let X = Y be the interval [0, 1], with a = (B
the class of Borel sets. Let ,u. be Lebesgue measure and y the counting
measure. Then the diagonal A = {(x, y) e X X Y: x = yl is measurable
(is an 61,3, in fact), but its characteristic function fails to satisfy any of the
equalities in condition (iii) of the Fubini and Tonelli Theorems.
22. The following example shows that the hypothesis that f be measurable with respect to the product measure cannot be omitted from the
Fubini and Tonelli Theorems even if we assume the measurability of fy
and f, and the integrability of ff(x, y) d(y) and ff(x, y) di.t(x). Let
X = Y = the set of ordinals less than or equal to the first uncountable
ordinal St. Let a = (B be the a-algebra consisting of all countable sets and
their complements. Define A = y by letting yE = 0 if E countable,
yE = 1 otherwise. Define a subset S of X X Y by S = {(x, y): x < y} .
Then Sz and Sy are measurable for each x and y, but iff is the characteristic
function of S we have
f[ff(x,
y) di.t(x)]dv(y)
f [ f f(x, y) dv(y)]d,u(x).
Sec. 4] Product Measures
273
If we assume the continuum hypothesis, that is, that X can be put in
one-to-one correspondence with [0, 1], then we can take f to be a function
on the unit square such that f, and fy are bounded and measurable for
each x and y but such that the conclusion of the Fubini and Tonelli
Theorems do not hold.
23. Show that if (X, a, y) and ( Y, (B , v) are two g-finite measure spaces,
then M X I/ is the only measure on a x 6,3 which assigns the value MA vB
to each measurable rectangle A X B. Show that a measure on a x
with this property need not be unique, if we do not have a-finiteness.
24. a. Show that if Ega X
then Ex cΠfor each x.
b. If f is measurable with respect to a x ea, then I:, is measurable
with respect to OE for each x.
25. Let X = Y = R and let i.c=v= Lebesgue measure. Then
ith X v is two-dimensional Lebesgue measure on X X Y = R 2. We often
write dx dy for d(y X v).
a. For each measurable subset E of R, let
u(E) = {(x,
x — y e E}.
Show that a(E) is a measurable subset of R 2 . [Hint: Consider first the
cases when E open, E a 96, E of measure zero, and E measurable.]
b. If f is a measurable function on R, the function F defined by
y)
F(x, = f(x — y) is a measurable function on R 2 .
c. If f and g are integrable functions on R, then for almost all x
the function ço given by ço(y) = f(x — y)g(y) is integrable. If we denote its
integral by h(x), then h is integrable and
f
f fi
f igi.
26. Let f and g be functions in L 1 (—x, co), and define f* g to be the
function h defined by h(y) = ff(y — x)g(x) dx.
a. Show that f* g = g *
b. Show that (f* g) * h = f* (g * h).
c. For fe L 1 , define f by f(s) = fetstf(t) dt. Then f is a bounded
complex function and
f*g
f
27. Let f be a nonnegative integrable function on (— Go, Go), and let
m 2 be two-dimensional Lebesgue measure on R 2 . Then
m2 {(x, y):
y
f(x)} = m 2 {(x, y): O <y < f(x)} = f f(x) dx.
Measure and Outer Measure (Ch. 12
274
Let
. Then
9(t) = m (x: f(x)
is a decreasing function and
I: 9(0 di = f(x) dx.
*5
Inner Measure
Let A be a measure on an algebra a and A * the induced outer
measure. Then A*.E may be thought of as the largest possible measure
for E compatible with A. We can also define an inner measure ,u*
which assigns to a given set E the smallest measure compatible
with A:
Definition: Let A be a measure on an algebra a and A * the induced
outer measure. We define the inner measure A* induced by A by setting
A*E = sup AA — p.* (A — E),
where the supremum is taken over all sets A e a for which
1,1,* (A — , < 00.
Inner measure was important historically because the measurability of a set was originally characterized using both inner and
outer measure. A set E of finite outer measure was defined to be
measurable if 12*E = pc* E, and the measurability of sets of infinite
outer measure was defined in terms of intersections with sets of finite
measure. Even in the case of a finite measure A this procedure is more
cumbersome than the elegant approach due to Carathéodory, which
we have followed in this chapter. Apart from this historical importance, inner measure is useful for the extension of A from a to an
algebra containing a and a given set E (which need not be measurable)
and for determining the freedom we have in extending A to a
or-algebra containing a. The purpose of this section is to do this and
to develop the basic properties of inner measure.
21. Lemma: We have u*E < ,u*E. If E c a, ,u*E =
Proof: Since
MA < ,u*(A n E)
itt*(A
n E),
ME.
Sec. 5]
275
Inner Measure
we have
MA — y*(A n < y*(A n
E) < th*E.
Consequently, y *E < y* E.
If E e a, take A = E in the definition of /2*. Then y*E >
ME =
y*E. 1
22. Lemma: If E c F, then 1.4E <
Proof: If y*(A
< oc, then y*(A F) < oc, and so
F) ?_ MA — y*(A E).
/LW > MA — y* (A
Taking suprema over
A e
a with 1.t*(A E) <
co,
we have
y*F > p*E. I
One of the difficulties of using the definition of inner measure is
that we must take the supremum of MA — y* (A E) over all A with
y*(A E) < Go. The next lemma shows that this expression is
monotone in A and enables us to calculate y *E more easily.
23. Lemma: Let A and B be two sets in a with A*(A E) < oc
and A*(B E) < oc. If A c B, we have ILA — p,*(A E)
MB — ti*(B E). If also E c A, we have equality, and hence p*E —
AA — p,*(A E).
Proof: Since B E C (B
A) U (A
E), we have 1.t*(B E) <
y(B A) + y*(A E). If E C A, this union is a disjoint one and
the measurability of B A gives us equality. Since MB = MA +
y(B
A), we have
MB — y*(B E)
MA — y*(A E)
with equality if E C A.1
24. Corollary: If A g a, then
n E)
m*(1 n
Proof: If 1.t*(A n E) = 00, then AA = oc and there is nothing to
prove. Otherwise, set F= AnE. Then A—F= AnB, and
I.4*F =MA — I.1*(A n L'), since F C A and y*(A F) < 00.1
276
Measure and Outer Measure
[Ch. 12
The following lemma expresses the additivity of 1.4* on decompositions given by disjoint sequences of sets in a. The corresponding
result for outer measure is true even if the sets A, are only assumed to
be measurable (see Problems 3 and 32e).
25. Lemma: Let (A,) be a disjoint sequence of sets In a. Then
,u* (E n
z=i
A1)
=
z=i
*(E n
Co
oo
Proof: Replacing E by E n U A„ we may suppose that E C U A,.
0=
1
For any B
6
a with „,*(B
uB — u *(BE) =
1
E) < oo, we have
1-1
u(A, n B) — /2 * (A, n B n f)
z—i
E 1.1*(A, n E).
z=i
Taking the supremum over all such B gives
i.k *E
E y*(A,
n E).
Let B, be any set in a with e(B, (E n Az)) < oo.
Then
— /.1 * (Bi (E n A,)) = PL*(B2 n A, n E)
Pt(B)
/11 * (.13 n É),
where B, = B, n A,. Since the sets B,' are disjoint and measurable,
E
n
B, —
1.1 * ( B,
(E n A,)) = Ly i Yi] —
< 12*E.
Taking suprema over all choices of B, c a with
(E n A,)) <
ri n
Sec.
5]
277
Inner Measure
we have
(E n A,) < y* E.
Thus
m*(E n Ai) < ,u*E. I
26. Theorem: Let 12 be a measure on an algebra a of subsets of
X and E any subset of X. If 63 is the algebra generated by a and
E and if ri is any extension of 1 2 to 63, then
A * E < 1.1E < A*E.
Moreover, there are extensions z and A of At to 63 (and hence also to the
a-algebra generated by 63) such that pE = ki*E and tzE = 1.4E.
Proof: If (A i) is any disjoint sequence of sets from a with
E c U A„ then E = U(A2. n E), and so
=E
12(A,
n E)
E AzA,.
Thus ,OEE < /2*E.
If A is any set in a with ,u *(A — E) < oo, then /2(A
<
12*(A E), and
<
Hence u*E <
The sets B in â– 33 are the sets of the form B = (A n E)u (A' n f)
with A and A' in a, since the collection of all sets of this form is an
algebra contained in 63 and containing a and E. For each B e
define ri and A by
i.t*(B n E)
,u* (B
n L')
and
1213 = /2*(B n E)
ti*(B n te ).
Then and A are monotone, nonnegative functions defined on 63, and
it follows from Corollary 24 that /IA = 2A = AA for A e a. Thus
278
Measure and Outer Measure
[Ch. 12
the theorem will be proved if we show that 17 and ,u are countably
additive on 03.
Let B, = (A, n E) U (A', n f) be a disjoint sequence of sets whose
union B is also in 63. Then B = (A n E) U (A' n f), where A
u A,
and A' = U A. If i j, B. n B,= 0, and hence A, n A, n E =
Bz n B, n E = 0 and A: n A; nE = B, n B, n E = 0. Let C„ —
A n nA 1 n - n
and G = A n A'1 n • • • n
Then C,
and Ç are in a,, and C, n C, = 0, Ç n C,' = 0 for i j. Also
C,i nE= A n n A- i n•••nfin_inE=AnnE, since we have
A„ n A n E = (A,, n A, n E) u (A. n A, n E) = A„ n E. Similarly,
Cn1" A:i nE, and so B„ = (CnnE)u (C:,nE). Now B n E =
n È). Since C„ and C are disjoint
U (G. n E) and Bnt =U
sequences of sets in a, we have
,
A*(B n E) =
E 1 *(( E) = E
ti*(B n
E
n E)
and
=
n
=
E 1.1*(B,„ n L')
by Problem 3, and
i.t*(B n E) =
E 12* (Cn, n E) = E A *(B„ n E)
and
p.*(B n t) =
E 1.4*(q, n E)
E
n
t).
Thus /7,B = LuB„ and /2/3 = EAB„, showing that rt and ,u are
countably additive and hence measures on 01. 1
Proposition 6 states that every set with finite outer measure is
contained in an a,s with the same outer measure. The following
proposition is the analogue for inner measure.
27. Proposition: Let E be a set with p. *E < 00. Then there is a
set H e as cr such that H C E and /all =
be a set in a with ,u*(A„ E) < 00 and iA,, —
By Proposition 6 there is a G,, e av3 such
E) > ,u*E —
that G„ D A„ E and AG„ = A*(A„ E). Let H„ = A — Gn. Then
H„ e as„ and H„ c E. Moreover, glin = PA. — G,, > /2*E.
Thus the proposition is proved if we take H = U 11„. 1
Proof: Let A n
,,
Sec. 5]
279
Inner Measure
28. Proposition: Suppose A*E < oo. Then E is measurable if and
only if 12*E = m*E.
Proof: Suppose m *E = m*E < GO. Then Propositions 6 and 27
give us measurable sets G and H with GcEcH and Till= TiG.
Thus E differs from a measurable set by a set of measure zero and
hence is measurable.
Suppose that E is a measurable set of finite outer measure. Then
E is contained in the union of a disjoint sequence (A n ) of sets of a
of finite measure. By Proposition 25 and the measurability of E we
have
E= E m*(E n A n)
= E M An — A*(A 7, n L')
n E)
=E
= E m*E. I
29. Theorem: Let E and F be disjoint sets. Then
i-t*E m*F < ii*(E u
< p* E p*F < p.*(E u
< p*E p*F.
Proof: If m *E or i.i* F is infinite, the first inequality follows from
the monotonicity of y*. If m *E and i.i*F are both finite, let G and H
be measurable sets with G C E and H C F such that AG = y *E
and H = 1./*F. Then G U H is a measurable set of finite outer
measure contained in E u F. Thus
m*(E u F) ii*(G u H) = 7.7(G u H)
=
rtH
proving the first inequality.
If ti*F = Go, the second inequality is trivial. If p*F < cc, let A
be any set in a with m*(A (E u F)) < cc. Since A — E C
[A — (E u F)] U F, we have
,u*(A
E) < /.1* (A — (E
U
F))
M* F.
Thus 1.1*(A E) < oc, and
MA — ,u*(A (E U F))
MA
—
m*E m*F.
E) u*F
280
Measure anti Outer Measure
[Ch. 12
Taking the supremum over A, we get
y*(E U
< y*E y*F.
To prove the third inequality, we choose a measurable set G C E
with AG = 1.4E. Then the measurability of G implies that
pi*E ± 4u* F = TAG ± g*F
= 11,*(G u
< y* (E U F).
The last inequality is just the subadditivity of outer measure. I
Problems
28. a. If /IX < 00, then p*E = /1,X —
b. If a is a o--algebra, then
inf {gA: E C A, A c a}
and
y*E = sup IgA: A c E, A e
.
c. If y is Lebesgue measure on R, then 1.4*E = sup tuF: F C E,
F closed).
29. Let E be any set. A measurable set G C E is called a measurable
kernel for E if y*(E G) = O. A measurable set H D E is called a
measurable cover for E if y*(H E) = O.
a. Show that any two measurable kernels (or covers) for E differ
by a null set.
b. Show that if E is a set of o--finite outer measure, then E has a
measurable kernel and a measurable cover.
30. Let g be a measure on an algebra a and E a set with y*E < o.
If j3 is any real number with g* E < o < 1.4*E, then there is an extension
A of y to a o--algebra containing a and E such that AE =
31. a. If p is a semifinite measure on an algebra a and if G3 is the
smallest cr-algebra containing a, then y* restricted to 63 is a semifinite
measure and is the smallest extension of y to Gi.
b. There may be more than one semifinite extension to G3 (for
example, let X = N U {co} and a the algebra generated by the finite
subsets of N with y the counting measure).
32. Let a be the algebra of finite unions of half-open intervals of R
Sec. 6]
Extension by Sets of Measure Zero
281
tl 0 = 0 and pA = oc for A
O. The class (B of Borel sets is the
smallest a-algebra containing a.
a. Show that p*E = Go if E Ø.
b. Show that p*E = 0 if E contains no interval and that p*E = 00
if E contains an interval.
c. The restriction of p * to (53 is not a measure. Hence there is no
smallest extension of p, to OB.
d. The counting measure on 63 is an extension of p to (B.
e. Show that Lemma 25 is no longer true if we replace "sets in a"
by "measurable sets".
33. If p* is not a regular outer measure (that is, if it does not come from
a measure on an algebra), then we do not get a reasonable theory of
inner measure by setting ,u*E = 1.4* X — ti*(E). Let X — la, b, cl, and
set p*X = 2, p,*0 = 0, and i.t* E = I if E is not X or 0.
a. Calculate ,u*.
b. What are the measurable subsets of X?
c. Show that there is a nonmeasurable set E with p*E = p*E.
d. Show that the first and third inequalities of Theorem 29 fail.
34. Let X — R 2 and a the algebra consisting of all disjoint unions of
vertical intervals of the form I = gx, y): a < y < bl. Let pA be the
sum of the lengths of the intervals of which A is composed. Then p, is a
measure on a. Let E = {(x, y): y = 0}.
a. Show that p*E = oc, i.t*E = O.
b. Show that every subset of E is an as.
c. Assuming that there is no finite nonatomic measure defined on
(5)(R), show that each extension of p to a measure on a a-algebra must
assign to E the value 0 or co.
and let
*6
Extension by Sets of Measure Zero
The results of Section 2 allow us to extend a measure A on an
algebra a to a g-algebra containing a and those of Section 5 provide
for the extension from a to a a-algebra containing a and one additional set. It is sometimes useful to be able to extend to a cr-algebra
containing a and some collection arc of subsets of X and to extend
in such a way that each of the sets in arc has measure zero. A necessary
condition for this to be possible is that, whenever we have a set
A c a such that A c Me art, then /221 = O. This condition is not in
general sufficient, since a countable union of sets in mt may contain
282
Measure arid Outer Measure
an A with positive measure, but, if we assume that
countable unions, then the condition is sufficient.
OTC
[Ch. 12
is closed under
30. Proposition: Let /.4 be a measure on a or-algebra a of subsets
of X, and let arc be a collection of subsets of X which is closed under
countable unions and which has the property that for each A e a with
A c Me grc we have AA = O. Then there is an extension A of A to the
smallest a-algebra G3 containing a and 5rt such that AM = 0for each
M c OTC.
Proof: Since the collection of sets which are subsets of a set in
OTC
satisfies the same hypothesis as 5E, we may assume that each subset
of a set in art is itself in art. With this assumption the collection
133 = { B: B = A A M, A c a, M c art} is a o--algebra containing a
and 01Z, and since each o--algebra containing a and art contains 03,
63 is the smallest a--algebra containing a and arc.
If B = A 1 A M1 = A 2 A M2, then A 1 A A 2 = M1 A M2, and
so il(A i A A 2) = O. Thus ilA 1 -= mA 2, and, if we define igB to be
AA 1, then g is well defined on G3 and is an extension of A. It remains
only to show that rt is countably additive.
Let B = U B,, B, n B, = Ø. If B, = A, A M„ then A, A A, e arc.
Setting A n' = An n 21 n • • • n Ân_i, we have .44; n A; = 0, and
A,„ P A n' e arc. Thus B, = A,' P M f , and B = A P M, where
A = U A,' and M cU M,'. Thus FLB = MA = Ei.zA: = EAB,. I
We observe that the condition that MA = 0 for each A E a
with A c M simply states that A *M = O. Thus the proposition states
that we can extend the domain of A to include any collection art of
sets of inner measure zero provided that art is closed under countable
unions. Note that on the or-algebra generated by a and arc we have
ri. = M. Thus this proposition gives a generalization of the process
of completion which extends the domain of a measure by adding sets
of outer measure zero.
Problem
35. Let a be a a-algebra on X and art a collection of subsets of X which
is closed under countable unions and which has the property that each
subset of a set in grt is in Nt. Show that the collection
ff3 = {B:B= APM,AcaandMeart}
is a a-algebra.
283
Sec. 7] Carathéodory Outer Measure
36. Show that Proposition 30 need not be true if
a is only an algebra
of sets.
*7 Carathéodory Outer Measure
Let X be a set of points and r a set of real-valued functions on X.
It is often of interest to know conditions under which an outer
measure I.4* will have the property that every function in r will be
measurable, and it is the purpose of the present section to prove the
sufficiency of one criterion for this.
Two sets are said to be separated by the function it) if there are
numbers a and b with a > b such that ço is greater than a on one and
less than b on the other. An outer measure /2 * is called a Carathéodory
outer measure (with respect to r) if it satisfies the following axiom:
iv. If A and B are two sets which are separated by some function
in r, then 11*(A u B) = 1.4*A + i.t*B.
31. Proposition: If i.t* is a Carathéodory outer measure, then
every function in r is 1.4*-measurab1e.
Proof: Given the real number a and the function ic e r, we must
show that the set
E = {x: ço(x) > a}
is 1.4*-measurable or, equivalently, that given any set A,
i.t*(A n E) + 1.4*(il n E).
i.t*A
Since this inequality is trivial if 1.1*./1 = Do, we suppose l.t*A < oo.
We begin by setting B = E n A, C = É n A, and
B„ =
Defining Rn =
B
Ix:
(x e
1
B) & (o(x) > a + -n-)} •
Rn_i, we have
B = B,, U[ U Rd•
k.---n+1
Now on B„_ 2 we have so > a -I- 11(n
2), while on Rn we have
9 < a + 1/(n — 1). Thus so separates R, and Bn_2 and hence
—
284
Measure and Outer Measure
[Ch. 12
k-1
separates
R2k
and U R 2i, since this latter set is contained in
R2k-2.
J=1
Consequently,
k —1
A * [U R 2 1]= -
R 2k + A* [
=E
U R2 11
1=1
y*R2i,
j-=1
by induction. Since
U R25
i=1
CB c
A,
we have
E
/./ *
R2j
/1 *fi
j=1
and so the series
E
eR2 1
converges. Similarly, the series
E
eR2j +1
j=-1
converges, and therefore also the series
E
x A * Rk•
k=1
From this it follows that, given e > 0, we can choose n so large that
E teR,, <
E.
k=n
Then by the subadditivity of A* ,
E
1.4*B < y*Bn
k=n+1
< A* Rn
e
or
ex>
/A
*B — e.
1.1* -Rk
Sec. 7] Carathéodory Outer Measure
285
Now
At*A
1.4*(Bn U C)
= *B
m*C
since (to separates B„ and C. Consequently,
i.L*A
> I.L*B + 1.4*C
—
e.
Since e is an arbitrary positive quantity,
At*A
> 1.4*B + m*C. 1
32. Proposition: Let (X, p) be a metric space, and let i.t* be an
outer measure on X with the property that I.4*(A u B) = 1.4.*A + i.t*B
whenever p(A, B) > O. Then every closed set (and hence every Borel
set) is measurable with respect to 41.4*.
Problem
37. Prove Proposition 32. [Let r be the set of functions (p of the form
= p(x, E). Show that pt* satisfies (iv) with respect to r, and note that
for a closed set F we have F = {x: p(x, F) < 0}]
9(x)
13
1
The Daniell Integral
Introduction
It is sometimes convenient to introduce integration directly without
using the concept of measure. This happens when we have an elementary integral I defined on some class L of 'elementary functions'
and want to enlarge the class L and extend I so that it has all the
properties of the abstract Lebesgue integral, including the convergence theorems. If we take, for example, our class L to consist of the
continuous real-valued functions on (— c co) each of which vanishes
outside a finite interval and take I to be the Riemann integral, we
might expect to obtain the class of Lebesgue integrable functions and
the Lebesgue integral by some sort of extension process. Such a process has been carried out by Daniell for this case and generalized by
Stone, who also clarified the structure of the extended integral. It is
the purpose of the present chapter to describe this extension procedure
and to show its connection with measure theory.
Let L be a family of real-valued functions on some set X, and
suppose that L is a vector lattice, that is, that whenever the functions
f and g are in L so also are the functions of ± fig, f V g, and f A g.
g) v 0 g,
Since fAg=f+g
(f
(fvg)andfVg
we see that a vector space L of functions is a vector lattice if for each h
in L we have h V 0 in L. Thus a vector space of functions is a vector
lattice if it is closed under the operationf —>f+ f V O. Since i fi
(—f)+, each vector lattice contains the absolute value of each
function in the lattice. Conversely, if L is a vector space such that
,
—
—
286
—
Sec. 1] I ntroduction
287
jf1 is in L for each f in L, then L is a vector lattice, for f+ =
+ I fl).
A linear functional I on L is said to be positive' if I(ç) > 0 for
each nonnegative function ya in L. If I is positive and io < 1,t, then
k,o) < 104 A positive linear functional I on L is called a Daniell
functional or a Daniell integral if the following condition is satisfied:
D. If (io n ) is a sequence of functions in L which decrease to zero
at each point, then lim 1(ic„) = O.
This condition is clearly equivalent to each of the following conditions:
D'. If (cp„) is an increasing sequence of functions in L, and if ye,
is a function in L such that 2 ç < lim (p then l(p) < lim
D". If (u„) is a sequence of nonnegative functions in L, and if io
is a function in L such that io < Eun, then 1((p)
One example of a Daniell integral is obtained by taking L to be
the set of continuous functions on (— 00, 00) each vanishing outside a
finite interval and taking I to be the Riemann integral. Another is
obtained from a measure g on an algebra a of subsets of X by letting
L be the class of simple functions and I the natural integration with
respect to /.4 . A third example is given by taking L to be the class of all
functions integrable with respect to a measure A.
This last example does not quite satisfy the definition we have given,
since integrable functions may be extended real-valued. We want to
extend our definition to cover this case but must be careful about the
fact that f + g is not defined at points where it is of the form 00 — 00
or — 00 + 00 . Thus we shall say that L is a vector lattice of extended
real-valued functions on X provided that, whenever f and g are in L,
the functions f V g, f A g, and of are in L as well as each function h
such that h(x) = f (x) + g(x) whenever the right side is defined.
Thus we require all possible choices for the sum f + g to be in L,
and we shall write h f g whenever h is any possible choice. By
a linear functional on L we mean a mapping I of L into R such that
1(af) = al(f) and 1(h) = l(f) + 1(g) whenever h = f + g. This
Properly speaking, we should call / "nonnegative," but the use of "positive" in this
connection seems to be standard.
2 Here, of course, lim means pointwise limit.
1
288
The Daniell Integral
[Ch. 13
implies that if h l and h 2 are two functions with h l = f + g and
h 2 = f+ g then 1(h 1) = 1(h2). A positive linear functional I on L
is called a Daniell integral if it satisfies condition D.
In general the limits of functions in L will not be in L, and the
procedure of Daniell and Stone, which we are going to develop in
this chapter, consists in extending I to a class L 1 containing L and
closed under suitable limiting operations. In many ways this extension
procedure parallels that of Carathéodory for measures, and in Section 3 we show that under mild restrictions this extended integral is
in fact integration with respect to a suitable measure.
Problems
1. Show that the conditions D, D', and D" are equivalent.
2. Let g be a measure on an algebra a, L the class of simple functions
on a, and I integration with respect to pt. Show that L is a vector lattice
and that I is a Daniell integral.
3. Let L be the family of continuous real-valued functions on (— co, co)
each of which vanishes outside a finite interval, and let I be Riemann integration on L. Show that lis a Daniell integral. [Hint: The Dini Theorem is
useful.]
4. Let L be a vector lattice of extended real-valued functions on X and
I a positive linear functional on L, as defined in this section. Let f c L,
and let E = {x: f(x) = co}. Show that any extended real-valued function
h which vanishes outside E is in L and I(h)
0.
2 The Extension Theorem
We begin by considering the class L. consisting of all those extended real-valued functions on X each of which is a limit of a
monotone increasing sequence of functions in L. It is clear that if f
Ag where a and are
and g are in L,„ then so is the function of
nonnegative constants. If ((p,i ) is an increasing sequence of functions
from L, then (/(yo n)) must be an increasing sequence of real numbers
and so must have a limit (which may be co). It is tempting to try to
define lim /((p n) as the value of I for the function f which is the
pointwise limit of the sequence (cp„. In order to do this we need to
know that this value depends only on the function f and not on the
Sec. 2]
The Extension Theorem
289
choice of the increasing sequence (ion) whose limit is f. That this is so
is guaranteed by the following lemma:
1. Lemma: If (v.) and (1,G,.) are increasing sequences from L
and if lim ço. < lirn ihn, then lim Ayo n) < lim /01/70Proof: For fixed n, we have („on < lim çon < lim
/((p„)
lim /(1,t,„) by (D'). Hence lim /((p„)
lim /(V,„). I
and so
Thus we can extend the functional I to be an extended real-valued
functional on L„ with the property that I(f) < I(g) for f < g and
I(af + (3g) = aI(f) 0I(g) for positive constants a and 0 and
functions f and g in L. It is clear that L„ is a lattice, for if (p. Tf
and ;/,,, g, then (ion A %t. if A g and (p„ V 11,„ f V g.
2. Lemma: A nonnegative function f belongs to L„ if and only if
there is a sequence (Ik v ) of nonnegative functions in L such that f =
E ik v . In this case I(f) = E
Y= 1
Y=1
Proof: The "if" part is trivial. On the other hand, let f be nonnegative and (p„ 1f with (p„ e L. By replacing (,0„ by (pi, V 0, we may
assume that each (p ,, is nonnegative. Set
= = (p. —
CO
for n > 1. Then f =
E
v=1
and
I(f) = Ern /WO
v—i tp)
lirn
E
E
P
=1
3. Lemma: Let (L) be a sequence of nonnegative functions in
00
L. Then the function f =
CO
E
n=1
f,, is in L. and I(f) =
E AL).
n=1
Proof: For each n there is a sequence (1,/,) of nonnegative func-
tions in L such that f. =
E
Hence f =
E
v,n
Since the
290
The Daniell Integral
[Ch. 13
set of pairs of integers is countable, f is the sum of a sequence of nonnegative functions in L and so must be in L. Also
1(f) =
E
op
E
n=1
I(fn).
For an arbitrary function f on X we define the upper integral 7(f)
by setting
I(f) = inf I(g),
f
ge L„
where we adopt the convention that the infimum of the empty set is
+ Go . We define the lower integral I by setting l(f) = —7(—f).
Elementary properties of these upper and lower integrals are given
by the following lemmas. The properties in Lemma 4 follow directly
from the definition of I.
4. Lemma: Let h = f + g. Then 7(h) < 7(f) + 7(g), provided the
right-hand side is defined. If c > 0, 7(cf) = ci(f). if f < g, then
7(f) < 7(g) and 1(f) < I(g).
5. Lemma: We have
I(f).
I (f) <
7(f). tf f e Lu, then l(f) = i(f)=
Proof: We have 0 = 1(0) = I(f f)
7(—f). Hence
l(f) =
To prove the second statement, we note that if f e L u then
I(f) < I(f) by the definition of I. If g c L u and f < g, then I(f)
1(g) and so I(f) < 7(f), whence RD = l(f). If (p e L, then
e L C L, and so 1 ( — (P) = 1( — ça) = 1(0 , and hence
=
1(4 But each f e L u is the limit of an increasing sequence çr,„) of
functions in L. Since f > ço„ we have I(f) > Aq),) = IGO. Hence
l(f) > Jim I(go) = I(f).
Since I(f) < I(f) = I(f), we have AD = l(f). I
6. Lemma: Let (f) be a sequence of nonnegative functions, and
00
let f =
00
E f„. Then I(f) < E i(f).
v=1
Sec. 2]
291
The Extension Theorem
Proof: If r(fp) = co for some v, we are done. If not, given E > 0,
there is a function g, e L u such thatf, < g, and 1(g,) < (f„) + E 2—'.
Since each g, is nonnegative, Lemma 3 implies that the function
g = Egy is in L u and that I(g) = EI(g Er(f) -I- E. Since g f,
we have
)
Ï(f)
E
+
p—i
and the lemma follows since E was an arbitrary positive number. I
We shall call a function f on X integrable with respect to I (or Iintegrable) if i(f) = I(f) and this value is finite. We denote the class
of functions integrable with respect to I by L 1 . For f in L 1 we shall
write I(f) for i(f). Thus we have an extension of our original I to all
of L 1 . Properties of L 1 and this extended I are given by the following
proposition:
7. Proposition: The set L 1 is a vector lattice of functions containing
L, and I is a positive linear functional on L 1 , which extends the functional I on L.
Proof: If f is in L 1 , so is cf, since I(cf) = cI(f) = c1(f) = I(cf) for
c > 0, and I(cf) = ci(f) = cI(f) 1(cf) for c
O. Iff and g are in
L 1 , then
Î(f+ g) _< l(f) + 1(g)
and
—1(f + g) = I(—f — g) < —I(f) — I(g),
that is
.1.(f + g)
I(f)
I(g).
Thus
+ g) = I(f + g) = I(f) I(g),
and so f + g is in L 1 . Consequently, L 1 is a linear space, and I is a
linear functional on L 1 . Lemma 5 implies that L 1 D L and that our
definition of i on L 1 gives us a positive linear functional which agrees
with our original I on LI.
To prove that L 1 is a lattice, it suffices to show that iff e LI, then
f+ e Li . Ley g L 1 . Then for each E > O there are functions g and h in
292
The Daniell Integral
[Ch. 13
L u such that —h < f < g, while I(g) < I(f)
E < co and I(h) <
—I(f)
E < 09. Since g = (g V 0)
(g A 0) and (g A 0) e
we have I(g A 0) > oo and I(g V 0) < I(g) — I(g A 0) < oo .
Thus the function g i = g V 0 is in L. and AO < oo. Let h i = h AO.
Then h 1 E L . , and h 1 < f + < g i . Since g > —h, gi + h i < g -I- h.
Consequently, I(g 1 ) I(h 1) < I(g) I(h) < 2E. Since —1(h 1 ) <
.1(f+) < 1
- (f+) < 1(g1), we have I (f +) — l(f+) < 2E, and so
I(f+) = l(f+), since E was arbitrary. Since 0 < I(f+) < I(gi ) <
we have f-+ c L I . Thus L 1 is a lattice. 1
—
The following proposition is the analogue for L 1 of the monotone
convergence theorem. It also shows that L 1 and I satisfy condition
D' and hence D.
8. Proposition: Let (fa ) be an increasing sequence of functions in
L i , and let f = lim fn . Then f c L 1 if and only if Ern I(fn) < oo.
In this case I(f) = lim I(f).
Proof: Since f > fn , i(f) I(f n). Thus if lim I(fn) = oo, then
I(f)= Go, and f L i .
Then g > 0, and g =
Suppose lim 1(fn) < oo. Set g = f
E (f;i+i — fn). Hence by Lemma 6,
n=1
I(g) < E i(f„±, — fn )
n=1
= E 1(f.+1) — I(fn)
n =1
= lim l(fr,) —
Thus
1 (f) =
1(g)
I(fi)
Since fn < f, we have Rf)
liM kfn).
AA), and so
I(f) ?: Ern I(f).
Thus f(f) = I(f) = Em I(fn). I
Sec. 23
293
The Extension Theorem
9. Corollary: The functional 1 is a Daniell integral on the vector
lattice Li.
The following two propositions are the analogues for the integral
1 of Fatou's lemma and the Lebesgue convergence theorem.
10. Proposition: Let (f„) be a sequence of nonnegative functions
in L 1 . Then the function inff, is in L i , and the function Jim f, is in Li
if lim I(f,) < Go. In this case
I(lim f,) < lim l(f„).
Proof: Let gn = fi A f2 A • • • A fn. Then (gn ) is a sequence of
nonnegative functions in L I, which decrease to g = inff,. Thus
—g n t — g, and since I(—g n) < 0, we must have g e L 1 by Proposition 8.
To prove the rest of the proposition, let h8 = inf f„. Then (10 is a
p>8
sequence of nonnegative functions in L 1 which increases to limfp.
Since h8 < f, for n <
lim 1(178) < hm Afp) < oc. Hence
11m f, e L 1 and /(lim f„) < lim l(fp) by Proposition 8. I
11. Proposition: Let (fn) be a sequence of functions in L 1 and
suppose that there is a function g in L 1 such that for all n we have
Ifid g. Then if f = limf n, we have
Af) = larn AL).
Proof: The functions fn + g are nonnegative, and I(f„ + g) <
21(g). Hence by Proposition 10 we have f + g in L 1 and
1(f+ g)
lim 1(f8 + g) = I(g) + lim I(fn).
Hence
< Jim 1(f8).
Since the functions g — fn are also nonnegative, we have
1(g
—
f) < lim I(g —
= 1(g)
—
111m 1(fn).
294
The Daniell Integral
[Ch. 13
Hence
lirn l(f„) < l(f),
and so Jim l(f„) exists and is equal to l(f).
Problems
5. A function f belongs to L u if and only if f = g ± io with g e Lu,
g > 0, and ç e L.
6. Prove Lemma 4.
7. Show that the ambiguous case in Lemma 4 does not cause difficulties
in Lemma 5.
3
Uniqueness
In the present section we show that the extension to L 1 of a Daniell
integral 1 on L is unique. We begin by proving a proposition of
some interest in its own right which describes the structure of functions in L 1 . It is the analogue for I of Proposition 12.7.
Let us denote by L u/ the class of those functions on X which are the
limit of a decreasing sequence f„) of functions in Lu with 1(f 1) < oo
and lim l(f„) > oo. It follows from Proposition 8 applied to
(-1;,) that Lu / C L 1 . Iff is any function on X such that 7(f) is finite,
then given n we can find h„ E L,. such that
f < h„ and
1(10 < ir(f)
•
Setting g. --= h 1 A h2 A • • • A h„, we have f < g,. < h„, and so
(g.) is a decreasing sequence of functions in L u with 7(f) < Ag.) <
7(f) ±
Hence the function g = lim gu is in L.1, while f < g and
7(f) = 1(g). We have thus established the following lemma:
12. Lemma: If f is any function on X with 7(f) finite, then there
is a g c L ui such that f
g and 7(f) = I(g).
A function f on X is called a null function iff e L 1 and /(1 f l) = O.
I(f) = O.
If f is a null function and
f, 0 < 7( gl)
Hence g e L 1 , and g is a null function.
13. Proposition: A function f on X is in L 1 if and only if f is the
difference g — h of a function g in Lul and a nonnegative null function
Sec. 4]
295
Measurability and Measure
h. A function h is a null function if and only if there is a null function
k in L u1 such that ihl < k.
Proof: Iff = g
h, then f is the difference of two functions in L 1
so must itself be in L 1 . If Ih < k with k null, then h is a nulland
function.
If f is in L 1 , then Lemma 12 asserts the existence of g e L ia such
that f < g and 1( f) = 1(g). Hence h = g f is a nonnegative function and 1(h) = 0, making h a null function. If h is a null function,
then by Lemma 12 there is a function k e L. 1 with 1,/i < k and
1(k) = 1(1111) = O. 1
—
—
14. Proposition: Let I be a Daniell integral on a vector lattice L
of functions on X and let J be a Daniell integral on a vector lattice
A D L. If 1(f) = .1(f) for all f 6 L, then A 1 D L i and I(f) = J(f)
for all f e L i .
Proof: By applying Proposition 8 twice we see that L ia c A 1 and
that 1(f) = f(f) forf e L ia. Hence by the second part of Proposition
13 each function which is null with respect to I must also be null
with respect to J. By the first part of Proposition 13 every function
f in L 1 must be in A 1 , and l(f) = J(f). I
4
Measurability and Measure
We say that a nonnegative function f on X is measurable (with
respect toi) if g A f is in L 1 for each g in Li.
15. Lemma: If f and g are nonnegative measurable functions, so
are f A g and f v g. If (fn ) is a sequence of nonnegative measurable
functions which converge pointwise to a function f, then fis measurable.
Proof: If f and g are nonnegative measurable functions and
h is in L 1 , then h A (f A g) = (h A f) A (h A g) and h A (f V g) —
(h A f) V (h A g). Hence the measurability of f A g and f V g
follows from the fact that L 1 is a lattice. If (fn ) is a sequence of nonnegative measurable functions converging to f and g a function in L l ,
then (f„ A g) is a sequence of functions in L 1 converging to f A g.
Since If, A gl < Igl, we have f A g in L 1 by Proposition 11. 1
16. Lemma: A nonnegative function f on X is measurable with
respect to I if (p A f is in L 1 for each (p in L.
296
The Daniell Integral
[Ch.
13
Proof: If cp e L, then cif, A f is in L 1 , and Proposition 8 implies that
g Afe L1 for g e Lu and I(g) < oc . It follows from Proposition il
that g Afe L 1 for all g L.1. If h is any function in L I , Proposition 13 tells us that h = g — k, where g e L u l and k is a nonnegative
null function. Since 0 < g Af— h A f < k, the function h A f
differs from the integrable function g A f by a null function. Thus
h A f is integrable, proving that f is measurable. I
We say that a set A in X is measurable with respect to I if its
characteristic function XA is measurable. We say that A is integrable
if its characteristic function XA is integrable. Note that a measurable
subset of an integrable set is itself integrable.
17. Lemma: If A and B are measurable sets, so are the sets A U B,
A n B, and A — B. If (A n ) is a sequence of measurable sets, then the
sets
A„ and U A„ are measurable. If the function 1 is measurable,
then the class a of measurable sets is a o--algebra.
n
Proof: The measurability of A n B and A u B follows from the
fact that XAnE = XA A XE and XALJE -= XA V XE• If g is in L 1 , we
have g A XA-B = g A XA — g A XAns + g A 0, and the measurability of A — B follows from that of A and B. If A = U A n , then
X A = liM (X A i V • • • V XA„)
and the measurability of A follows from Lemma 15. A similar arguA. If 1 is a measurable function, then the set
ment holds for
X is a measurable set, and the complement of a measurable set is
measurable. I
n
18. Lemma: If 1 is a measurable function and f a nonnegative
integrable function, then for each real number a the set
E = fx:f(x) >
is measurable.
Proof: If a is negative, E = X and is measurable, since XE = 1
and 1 is measurable. Hence we assume a > O. If a = 0, set g = f,
while if a > 0, let g = (01-1f) — [(or lf) A 1]. Since g is the
difference of two functions in L I , g is in L 1 . In either case we have
Sec. 4]
Measurability and Measure
297
g(x) > 0 for x e E, and g(x) = 0 for x e É. Let cp„ = 1 A (ng).
Then (p. e L i , and (p. XE. Hence XE is measurable, and so E is
measurable. a
19. Lemma: Let the function 1 be measurable, and define a set
function kt, on the class a of measurable sets by
= .1(XE)
if x E is integrable, and AE
= c
otherwise. Then
1.1
is a measure.
Proof: We have AO = 1(0) = O. If A and B are integrable sets
with A C B, we have XA < XB, and so AA < AB. Thus /2 is monotone
for integrable sets and consequently for measurable sets.
Let (E,) be a disjoint sequence of measurable sets, and let
E = U E,. If one of the E, is not integrable, then E is not integrable,
and
/.4.E =
=
E /2E2 .
If each E, is integrable, E will be integrable if and only if EAE, <
by Proposition 8, since XE = EX E, In either case AE = Ei2E„ and
the measure /1, is countably additive. I
This measure has the property that the integrable sets are precisely the measurable sets of finite measure. The following theorem
tells us that the Daniell integral I on L I is equivalent to the integral
with respect to this measure
20. Theorem (Stone): Let L be a vector lattice of functions on X
with the property that iff e L then 1 Afe L, and let I be a Daniell
integral on L. Then there is a a-algebra a of subsets of X and a measure
u on a such that each function f on X is integrable with respect to I
if and only if it is integrable with respect to A. Moreover,
l(f) = f f
Proof: Let a be the class of sets which are measurable with respect
to I. It follows from Lemma 16 that 1 is measurable. Lemma 17 then
asserts that a is a a-algebra, and Lemma 18 asserts that each nonnegative 1-integrable function is measurable with respect to a. Since
298
The Daniell Integral
[Ch. 13
each 1-integrable function is the difference of two nonnegative
/-integrable functions, every /-integrable function must be measurable
with respect to a.
Let ki be the measure given in Lemma 19, and let f be a nonnegative
function which is integrable with respect to I. For each pair (k, n) of
positive integers let
Ek,n = {X: f(x) >
Then
Ek, n
,
is measurable, and since
XEk,„ -= XEk.,. A (k-1 27f),
we have X Ek.. e L i , and ii(Ek , n) < 00. Set
22
= 2—n
.
E
XEk n •
k=1
Then ion e L 1 and (p. I f Hence 1(f) = lim 1(w,a). But
/(çon) = 2—n
E I(XE k „)
k-1
22n
= 2—n
= fion
k=1
dp.
Since
f f dp = lim
fp,
dp
by the Monotone Convergence Theorem, we have
l(f) = f f dp,
and f is integrable with respect to kt. Since an arbitrary f which is
1-integrable is the difference of two nonnegative /-integrable functions, it follows that such an f must also be integrable with respect
to and that
l(f) = f f dp.
Sec. 4]
Measurability and Measure
299
Iff is a nonnegative function on X which is integrable with respect
to /.4, we construct Ek ot and ion as before. Since ff dj2 < co, each
Ek , n has finite measure, and so XEk,. and hence ça,, belong to L I . Since
yon If and lim /(ion) = ffdit < co, we have f e L 1 by Proposition 8.
Thus each f which is integrable with respect to is also integrable
with respect to L I
21. Proposition: Let L be a vector lattice of functions on a set
X, and suppose that 1 e L. Let 63 be the smallest o--algebra of subsets of
X such that each function in L is measurable with respect to (B. Then
for each Daniell integral I there is a unique measure A on (B such that
for every f c L
1(f) = ffdA.
Proof: The existence of u is a special case of Theorem 20 and we
have only to prove the uniqueness of A on (B. Let a be the a-algebra
of measurable sets given by Lemma 17. Lemma 18 asserts that each
f in L is measurable with respect to a, and so we must have 63 c a.
Since 1 e L, the functions XB are in L 1 for each B in a and hence for
each B in B. The uniqueness of A on 133 will be established if we can
show that A(B) = /( X B ) for each B in (B.
If we let A be the set of functions on X which are measurable with
respect to (B and integrable with respect to A and set
J(f) = f f
for f 6 A, then Proposition 14 implies that ./(f) = 1(f) for feL l n A.
But if B e (B, then XB E Li n A, and so
= J(XB)
= I(XB).
Thus
/A
is uniquely determined on
133
by I. I
We can still establish the uniqueness of the measure kt in this proposition if instead of assuming 1 e L, we make the weaker assumption
that there is an everywhere positive function in L 1 (Problem 10).
Without some such assumption the measure need not be unique
on CB (Problem 11).
300
The Daniell Integral
[Ch. 13
Problems
8. Let u be a measure on an algebra a of sets, and let L be the family
consisting of those functions which are finite linear combinations of characteristic functions of sets in a with finite measure, and let I be integration
with respect to A. Discuss the extension of I to L I , and compare this
process with the Carathéodory extension for A.
9. Prove directly from the definition that, if fi and f2 are two nonnegative measurable functions, then fi f2 is measurable.
10. Prove that the conclusion of Proposition 21 still holds when the
hypothesis that 1 e L is replaced by the hypothesis that there is an everywhere positive function e in L 1 . [Hint: X = U {x: e(x) > 1/n} , and the
proof given can be modified to show that A is unique on sets of 63 which
are contained in {x: e(x) > 1/n}.]
11. Let X = (— co, cc) U Icol, and let L consist of all functions on X
which are Lebesgue integrable on (— oc, oc) and vanish at co. Then L is a
vector lattice, and the smallest a-algebra with respect to which each function in L is measurable is the family 133 consisting of all sets B such that
B n (-00, oc) is Lebesgue measurable. Let / be defined on L by I(f)
(x) dx. Then I is a Daniell integral on L. What is the measure y constructed in Theorem 20? Show that there is another measure v defined
on 03 such that for each f in L, I(f) = f f
12. a. Define a measure y on the /-measurable sets by yE = /(x E) if
E integrable, and /(xE) = sup II(xA): A C E, A integrable) otherwise.
Show that p. is a measure and is the smallest measure such that I(f) =
ff dp.
b. Show that for this measure At(X) = jjIj = sup {I(f): f e L,
f< 11.
c. Show that if I III! G oc, then this measure is the unique measure
such that I(f) = ff and AL(X)
II/11-
14
Measure
and Topology
We are often concerned with measures on a set X which is also a
topological space, and it is natural to consider conditions on the
measure so that it is connected with the topological structure. There
seem to be two classes of topological spaces for which it is possible to
carry out a reasonable theory. One is the class of locally compact
Hausdorff spaces, and the present chapter develops the theory for
this class. The other is the class of complete metric spaces, and some
of the significance of this class for measure theory is discussed in
Chapter 15.
1 Baire Sets and Borel Sets
Let X be a locally compact Hausdorff space. From the point of
view of integration theory the most useful family of functions on X
is the family Cc(X) consisting of all continuous real-valued functions
which vanish outside a compact subset of X. If f is a real-valued
function, we define the support of f to be the closure of the set
Ix: f(x) 01. Thus Cc(X) is the class of all continuous real-valued
functions on X which have compact support. The class of Baire sets
is defined to be the smallest a-algebra 33 of subsets of X such that
each function in Ce(X) is measurable with respect to .33. Thus 33 is
the o--algebra generated by the sets Ix: f(x) > al with Je Ce(X).
If a > 0, these sets are compact Vs. It follows from Proposi301
302
Measure and Topology
[Ch. 14
tion 9.20 that each compact 9 3 is a Baire set. Consequently, 63 is the
a--algebra generated by the compact Vs.
If X is any topological space, the smallest a--algebra containing the
closed sets is called the class of Borel sets. Thus, if X is locally
compact, every Baire set is a Bord set. The converse is true when X is
a locally compact metric space, but there are compact spaces where
the class of Bord sets is larger than the class of Baire sets
(Problem 5).
The reader should be warned that not everyone uses the same
definitions for Baire and Borel sets. When X is compact (or o-compact), everyone agrees that the Baire sets are the smallest
a--algebra such that each f in C(X) is measurable and that the Borel
sets are the smallest a--algebra containing the closed sets, but definitions differ when X is allowed to be locally compact. Some authors
like to take the Baire sets to be the smallest a--ring containing the
compact 9 3 's ; others the smallest a--algebra such that each f in
C(X) is measurable. Some authors prefer to take the class of Borel
sets to be the smallest a--algebra (or a--ring) containing the compact
sets. The definitions we have made here seem to me the most useful.
A set is called a--compact if it is the union of a countable collection
of compact sets. A set which is contained in a compact set is called
bounded, and one which is contained in a a--compact set is called
a--bounded. The following lemma is useful.
1. Lemma: If B is a Baire set, then either B or 13 is o--bounded.
Proof: The collection of sets B such that either B or 13 is a--bounded
is a a--algebra. Since it contains all compact sets, it must contain all
Baire sets. I
A measure p defined on the a--algebra of Baire sets is called a Baire
measure if it is finite for each compact Baire set. A measure defined
on the Bord sets is called a Borel measure if it is finite for each compact set. In the next two sections we shall show that each positive
linear functional on Cc (X) is given by integrating with respect to a
suitable Baire measure and that for compact spaces X each bounded
linear functional on C(X) is given by a finite signed Baire measure.
In Section 4 we show that each Baire measure can be extended to a
Bord measure.
Sec. 1] Baire Sets and Borel Sets
303
Problems
1. Let X be a separable locally compact metric space. Show that
the class of Baire sets is the same as the class of Borel sets.
2. Show that the collection of compact 93 's is closed with respect to
finite unions and intersections.
3. a. Show that each compact set in a locally compact space is contained in an open a--compact set 0 with O compact.
b. Show that each a--bounded set is contained in an open a--compact
set.
4. Let X be a locally compact Hausdorff space, and let Co(X) be the
space of all uniform limits of functions in Ce(X).
a. Show that a continuous real-valued function f on X belongs to
Co(X) if and only if for each a > 0 the set {x: I f (x)1 > a} is compact.
b. Let X* be the one-point compactification of X. Then Co(X)
consists precisely of the restrictions to X of those functions in C(X*)
which vanish at co.
c. If B is a Baire set in X*, then B n Xis a Baire set in X.
5. Let X be an uncountable set with the discrete topology.
a. What are Ce (X) and Co(X)?
b. What are the Baire sets in X?
c. Let X* be the one-point compactification of X. What is C(X*)?
d. What are the Baire subsets of X*? Show that X* has a compact
subset which is not a Baire set.
e. Show that there is a Baire measure /2 on X such that i.t(X) = I
and ff dm, — 0 for each f in Co(X).
6. Let X and Y be two locally compact Hausdorff spaces.
a. Show that each f e CAX X Y) is the limit of sums of the form
n
E so,(x)I,L i(y).
t=i
[Hint: The Stone-Weierstrass Theorem is useful.]
b. If a and 63 denote the Baire subsets of X and Y, respectively,
then the class of Baire subsets of X X Y is a x 63.
c. Let X and Y each be the Alexandroff one-point compactification
of a discrete uncountable set. Then the algebra of Borel subsets of
X X Y is larger than the a--algebra which is the product of the Borel sets
of X and the Borel sets of Y.
304
2
Measure and Topology
[Ch. 14
Positive Linear Functionals and Baire Measures
In the present section we show that each positive linear functional
on Cc(X) is given by integration with respect to a Baire measure and
that this measure is essentially unique. We begin with two lemmas
about compact 96 's and Baire measures.
2. Lemma: If K is a compact 95, there is a sequence 4,j ) of functions in Cc (X) such that yon, 1 XK.
Proof: By Proposition 9.20 there is a nonnegative function
ço e Cc (X) with („0 1 on K and 0 < (p < 1 on K. Let ço,„ = ion. I
3. Lemma: Let p i and /22 be Baire measures on a compact Hausdorff
space X. If yiK = /.1 2K whenever K is a compact 96 , then /h i = /2 2.
Proof: The semialgebra e generated by the compact 96 's is the
collection of all sets of the form C = K 1 n k2 , where K1 and K2 are
compact 96 's with K2 C K 1. Since ,u,C = u1K1 — ,u,K2, we see that
and /22 agree on e. Propositions 12.9 and 12.8 imply that their
extensions to the smallest a-algebra containing e are identical,
since /2 1 X = ,u 2X < 00. 1
4. Corollary: Let u i and ,u2 be Baire measures on a locally compact
Hausdorff space such that ,u1K = ,u2 K whenever K is a compact 96 .
Then /2 1 E = I.L2E for each a-bounded Baire set E.
This corollary shows that so far as the a-bounded sets are concerned the value of a Baire measure is determined by its values on
compact 96's. It is possible, unfortunately, for two Baire measures
to agree on the cr-bounded Baire sets and yet differ on sets which are
not a-bounded. I (See Problem 10.)
5. Theorem: Let X be a locally compact Hausdorff space and I a
positive linear functional on Cc (X). Then there is a Baire measure /.1
such that for each f in Ce(X) we have
l(f) = 1 f di.t.
If X is compact, the measure pt is unique. In general /.1 is uniquely
determined on the a-bounded Baire sets.
1 This is one of the principal reasons many authors prefer to do measure on if-rings
rather than if-algebras, since the smallest if-ring containing the compact 93's contains
only if-bounded sets.
Sec. 2]
305
Positive Linear Functionals and Baire Measures
Proof: The space Ce (X) is a vector lattice, and I will be a Daniell
integral if it satisfies the Daniell condition D. Let (<1 ) be a sequence
of functions in Ce (X) which decrease to zero and let K be the support
of (p i . Let 4/ be a nonnegative function in C(X) which is positive on
K. Then for a given E > 0 the sets F = {x: <p(x) > Elp(x)} are a
decreasing family of closed subsets of K whose intersection is empty.
Thus for some N we have FN = 0, and cp„ < el,b for n > N. Thus
I((p„) < €411/) for n > N, and I(cp„) —> 0, since e was arbitrary. The
existence of a Baire measure so that I(f) = ff d,a for each
f e Cc (X) now follows from the Stone Theorem (13.20).
To see the uniqueness of ,u, we note that, if K is a compact go,
then XK is the limit of a decreasing sequence 47,) of functions in
Cc (X). Since p is integrable, the Lebesgue Convergence Theorem
asserts that
AK = lim f yo n dp = urn i(<Pn).
Thus ,aK is uniquely determined by I. By Corollary 4 the measure p is
uniquely determined on the o--bounded Baire sets and hence on all
Baire sets if X is compact. I
The following proposition is the analogue for Baire measures of
Proposition 3.15.
6. Proposition: Let p be a &tire measure on a locally compact
Hausdorff space X. Then given a o--bounded &tire set E,
,uE = inf i,u0: E C 0, 0 a a-compact open set)
and
pE = sup {AK: K c E, K a compact 961.
Proof: To prove the first statement, we note that, since E is
a-bounded, it is contained in a 0-compact open set O. If pE = Go,
then AO = o and there is nothing to prove. Hence we shall assume
,uE <
If E is a compact ga , then there is a function (10 c Cc (X) which
is 1 on E and 0 < (10 < 1 on E. Let 0„ = {x: yo(x) > 1 — 1/n}.
on.
Then each On is a 0-compact open set, On D 07,+1, and E =
n
306
Measure and Topofogy [Ch. 14
Since D i is compact, A01 < 00 and A E = lim AO„. Thus for some
On we have AO, < ALE + E.
Let E = E l n E2 where E l and E2 are compact 96's with E2 C El,
and let U be a o--compact open set with ri compact such that E 1 c U
and AU < AE1 + e. Set 0 = U n E 2 . Then 0 is the intersection of
two 5„'s and so is an 56 . Since 0 is contained in the compact set
U, 0 must be o--compact. Since 0 • E = U El, we have
= /2(U E1 ) = /Ai — /2E1
e. Thus /10 < AE + e,
and the first statement in the proposition holds for sets E in the semialgebra e generated by the compact 91's.
The algebra a generated by the compact 9 6's is the collection of
all finite unions of sets from e, and a, consists of all countable
unions of sets from e. By Proposition 12.6 each set E in the o--algebra
generated by a (that is, each Baire set E) with AE < 00 is contained in
< AE + - . Let A be the disjoint union of the
2
sequence (C,) from e, and choose o--compact open sets 0, such that
an A c a, with
AA
C, c 0, and AO, < AC, +
.Then 0 = U 0, is a o--compact
2'+'
open set, A c0, and A(0
A)
< ;. Thus 0
E and AO < AE + e.
This proves the first statement of the proposition.
If E is a o--bounded Baire set, then E C U K, with K, compact.
Since each compact set is contained in a compact 98, we may assume
each K,
is a Compact 93
and hence a Baire set. Thus En = E n LI K,
z= 1
is a bounded Baire set, and E is the union of the increasing sequence
(E„) of bounded Baire sets. Hence ALE = lim AtEn , and so ALE —
sup {ALB: B C E, B a bounded Baire set}.
Let B be a bounded Baire set and C a compact 93 containing B.
Since C B is a bounded Baire set, we can find a o--compact open
set 0 containing C B such that /AO < AL(C ± E. Let
K = C n Ô. Then K C B, and K is a 96 since it is the intersection of
two 93 ' s, and compact since it is a closed subset of a compact set.
Since Cc Ku 0, we have
ALC <
ALK < At(C B) ,uK +
E.
307
Sec. 2] Positive Linear Functionals and Baire Measures
Hence
AB < IIK ± e
and
AB = sup
WC: K C B, K a compact 951.
I
A Baire measure ,u is called regular if for each Baire set E we have
ALE = sup ,uK as K ranges over the compact 9 6's contained in E.
Thus Proposition 6 states that each Baire measure on a o--compact
space is regular. Some further properties of regular Baire measures
are given in Problem 10.
We have used throughout this section the fact that if a compact
set is a 96 it is a Baire set. The converse is also true: If a compact
set is a Baire set, it is a 96 . Since we do not need this here, we refer the
interested reader to Halmos [8], p. 221, for a proof.
Problems
7. Let 3C be a collection of subsets of X with the property that the
union of any two sets in ac is in x and the intersection of any two sets
in 3C is in N. Show that the algebra generated by 3C consists of the sets which
are finite disjoint unions of sets of the form K 1 n k2 where K 1 and K2
are in 3C and K2 C K1.
8. Show that each compact set in a locally compact Hausdorff space
is contained in a compact 96 .
9. An alternate proof of Theorem 5 can be given using the results
of Section 123 instead of those from Chapter 13:
a. Define an outer measure ,u* on X by setting u*E = inf EA,p,)
for all sequences (45,,,) from C(X) with E(p,, > 1 on E. Show that A* is a
Carathéodory outer measure with respect to Cc (X) and that hence each
Baire set is A* measurable.
b. Let ,u, be ,u* restricted to the Baire sets. If f is a nonnegative
function in Ce(X) and f > 1 on a Baire set E, then u(E) < l(f). Iff < 1
and the support off is contained in a compact Baire set K, then AK > l(f).
c. Show that for any Baire measure with the properties in (b) and
any f c C c(X) we have
,
l(f) = f f d,u.
308
Measure and Topology
[Ch. 14
10. a. Show that a finite Baire measure on a locally compact Hausdorff
space X is regular if 1.cX = sup 1K: K a compact w. [Problem 11.9 is
useful.]
b. If bi is a regular Baire measure on X, then for each Baire set E
we have iE = inf {110: 0 D E, 0 an open Baire set} .
c. Show that the measure i of Theorem 5 can be chosen so that
it is regular, and that it is then unique. [See Problem 13.12.]
d. Let X be an uncountable set with the discrete topology. Construct a nonregular Baire measure on X.
e. Let ,u be a Baire measure on X. Then there is a unique regular
Baire measure on X which agrees with ,u on the a-bounded Baire sets.
11. Let kt be a Baire measure on a locally compact space X. Let U be
the union of all open Baire sets 0 with 1.4,0 = O. The complement F =
of U is a closed set called the support (or carrier) of 12.
a. If 0 is an open Baire set with 0 n F 0, then AO > O.
b. If K is a compact Baire set with K n F = 0, then AK = O.
[Each point of K is contained in an open set of measure zero. Thus by
compactness K is contained in an open set of measure zero.]
c. If E is a a-bounded Baire set with E n F = 0, then ,I.LE = O.
d. If f C c(X) and f > 0, then f f = 0 if and only iff 0 on
F. [Hint: The set ri x:f(x) > 0; is a o--bounded Baire set.]
e. Give an example to show that F need not be a Baire set.
f. It follows from (c) that if X is compact (or o--compact), then
= 0 for each Baire set with E n F = Ø. Construct an example to
show that this need not be true if X is not o--compact (see Problem 5e).
3
Bounded Linear Functionals on C(X)
Let X be a compact Hausdorff space and C(X) the space of realvalued continuous functions on X. In Section 2 we described the
positive linear functionals on C(X), and in this section we consider
the bounded linear functionals on C(X). We first note that if F is a
positive linear functional on C(X) and if I fl < 1, then
IF(f)! 5_ F(! fl) < F(1).
Hence it follows that
I I FII = F (l ).
Sec. 3]
Bounded Linear Functionals on C(X)
309
The next proposition shows that every bounded linear functional
on C(X) is the difference of two positive ones. Since the proof makes
no use of particular properties of C(X) other than that it is a vector
lattice of bounded functions containing 1, we state the proposition
in this generality. If L is a vector lattice of bounded real-valued
functions on a set X, L becomes a normed linear space when we
define Ilf = sup If(x)I. A linear functional is bounded if there is an
M such that
and we define as usual
IlF11 = sup F(f).
11f11<1
7. Proposition: Let L be a vector lattice of bounded functions on
a set X, and suppose 1 e L. Then for each bounded linear functional F
on L, there are two positive linear functionals F+ and F_ such that
F = F+ — F_ and 11F11 = F+(1)
F41).
Proof: For each nonnegative f in L define
F+(f) = sup F().
0<ia5
Then F+(f) 0, and F+(f) F(f). Moreover, F+(cf) =
cF+(f) for c > O. Let f and g be two nonnegative functions in L. If
0 < < f and 0 < < g, then 0 <
< f ± g, and so
F ±( f + g)
F
F(i).
Taking suprema over all such ço and 1,t, we obtain
F+(f ± g) ?_ F+(f) + F(g).
On the other hand, if 0 < < f + g, then 0 <
A f) < g, and so
0 < — (itt
F
F(' A f) + FOP
F+(f) + F(g).
—
[1P A
A)
Taking the supremum over all such 1,t, we get
F±(f + g)
F±(f) F(g),
A f < f and
Measure and Topology
310
[Ch. 14
and so
F+(f + g) = F+(f) + F(g).
Let f be an arbitrary function in L, and let M and N be two nonnegative constants such that f + M and f + N are nonnegative.
Then
F+(f + M + N) = F+(f + M) + F+(N)
F-4-(f
Hence
F+(f + M) — F ±(M) = F+(f + N) — F(N).
Thus the value of F+(f M) — F +(M)is independent of the choice
of M, and we define F+(f) to be this value. The functional F+ is now
defined on all of L, and we have easily F+(f + g) = F+(f) F+(g)
and F+(cf) = cF+(f) for c
O. Since F+(—f) + F+(f) --F+(0) = 0, we have F+(—f) = —F +(f), and F+ is a linear functional on L.
Since 0 < F+(f) and F(f) < F+(f) for f > 0, both F+ and the
linear functional F_ = F+ — F are positive linear functionals, and
F = F+ — F_.
We always have 11F
F41)
F_(1). To
!IF +11
establish the inequality in the opposite direction, let io be any function in L such that 0 < ç < 1. Then I2 — 11 < 1, and
IIF ?_ F(Ip — 1) = 2F(p) — F(1).
Taking the supremum over all such ç
,
we have
IlF11 > 2F+(1) — F(1)
= F+(1) F_(1).
Hence IlF11 = F+(l)
F_(1). I
8. Riesz Representation Theorem: Let X be a compact Hausdorff
space and C(X) the space of continuous real-valued functions on X.
Then to each bounded linear functional F on C(X) there corresponds a
unique finite signed Baire measure p on X such that
F(f) = f Jed!)
for each f in C(X). Moreover, VII = Iv (X).
Sec. 3]
Bounded Linear Functionals on C(X)
311
Proof: Let F = F± — F_ as in Proposition 7. Then by Theorem 5
there are finite Baire measures I.L i and /2 2 such that
F±(f) = f f dil l
and
F_(f) = f f (111 2.
If we set y = A i —
142,
then y is a finite signed Baire measure, and
F(f) = f f dp.
Now
IF(f)i ._ f If divi
l If!! Iv1(x).
Hence 11F11 __ 1v1(X). But
PI(X)
121(X) + /2 2(X)
= F± (1) + F_(l) = lift
Thus VI = y (X).
To show the uniqueness of y, we note that if y 1 and y 2 were both
finite signed Baire measures such that
f f di', = F(f)
for i = 1,2 and f e C(X), then X = y 1 — y 2 would be a finite signed
Baire measure such that
f f dX = 0
for allf e C(X). Let X = X+ — X — be the Jordan decomposition of X.
Then integration with respect to X+ gives the same positive linear
functional on C(X) as that given by X —, and so by Theorem 5 we
must have X+ = X. Hence X = 0, and 1, 1 = y 2 . 1
9. Corollary: Let X be a compact Hausdorff space. Then the dual
of C(X) is (isometrically isomorphic to) the space of all finite signed
Baire measures on X with norm defined by MPH = iPi(X)-
The fact that the space of finite signed Baire measures on X is the
dual of C(X) enables us to conclude a number of things about this
312
Measure and Topology [Ch. 14
space. For example, it follows from Proposition 10.3 that the space
of Baire measures is complete, and it follows from Theorem 10.17
that the set of Baire measures with !H (X) < 1 is compact in the weak*
topology. Some consequences of this are explored in the problems.
Problems
12. Let L and F be as in Proposition 7. Show that if G and H are two
positive linear functionals on L such that F = G — H and G(1) + H(1) <
11E11, then G = F+and H = F. [Hint: Use the definition of F+ to show
that G — F+ is a positive linear functional.]
13. Let X be a compact Hausdorff space if = {fa } a family of continuous real-valued functions on X and {C6,} a corresponding family of
constants . Suppose that for each finite set {fai , . . . ,A.} there is a signed
Baire measure y with IH(X) < 1 such that
f fdv = ca,.
Then there is a finite signed Baire measure y with I p1(X) < 1 such that for
every fa,
fa dp = cc,.
f
14. a. Let X be a compact Hausdorff space and g, fi, • • • „A continuous real-valued functions on X. Suppose that there is a signed Baire
measure y on X with 10(X) < 1 such that for each i we have f f, di, = ci.
Then there is a signed Baire measure y on X with lyi(X) < 1 such that
and
f fi dm = ci
fgd,u15_ fg dX
for any signed Baire measure X with I XI(X) < 1 and such that f ft dX ----- c i.
b. Suppose that there is a Baire measure y on X with y(X) = 1 and
5f2 dp = c2. Then there is a Baire measure AL on X with y(X) — 1 and
5 f, dy = ci which minimizes fg di.t among all Baire measures which
satisfy these conditions.
c. Let G, F1 ,. . . , Fn be continuous functions on In--Euclidean
m-dimensional space), and let fl , . . . , f,,, be continuous functions on X.
Show that if there is a Baire measure y with y(X) ---- 1 such that
Fi (f f i dp, . . . , f fni di') --- c„
Sec. 4]
313
The Borel Extension of a Measure
then there is a Baire measure on X which minimizes
G (f dv,
,f
under these restrictions.
15. Let B be the Banach space of signed Baire measures on a compact
Hausdorff space X. What are the extreme points of the unit sphere of B?
16. Alternate proof of the Stone - Weierstrass Theorem. We can use the
techniques of this section, together with results of Chapter 10, to give a
proof of the Stone-Weierstrass Theorem which does not depend on
Lemma 9.27. This proof is due to deBranges.
Let a be an algebra of real-valued continuous functions on a compact
space X which separates points and contains the constants. Let al be the
set of signed Baire measures on X such that
< 1 and if = 0 for
all f e
a. Use the Hahn-Banach Theorem and Corollary 9 to show that
if al contains only the zero measure, then -.6, = C(X).
b. Use the Krem-Milman Theorem and the compactness of the
unit ball in C*(X) to show that if the zero measure is the only extreme
point of al, then a-L contains only the zero measure.
c. Let p. be an extreme point of al. If f e 0 < f < 1, the measures p. and /12 given by dp. j = f cli.z and 4.2 = (I — f) dbl. are in al,
bi,H, and /I + 11 2 = pt. Since p. is an extreme point,
with lig i d
= cp. for some constant c.
d. Then f— c = 0 on the support of m. (see Problem lid).
e. Since f separates points, the support of p. can contain at most
one point. Since f 1 dp, = 0, the support of p. is empty, and p. is the zero
measure.
*4
The Borel Extension of a Measure
Previously we have considered only Baire measures on a locally
compact Hausdorff space, but it is sometimes convenient to extend a
given Baire measure to be a Borel measure. In this section we show
that, if the Baire measure is regular, then this can be done in such a
way that the analogue of Proposition 6 holds and so that the Borel
measure really lives on the Baire sets in the sense that each Borel set
of finite measure differs from a Baire set by a set of measure zero.
We begin with a lemma.
314
Measure and Topology
[Ch. 14
10. Lemma: Let K be a compact g3, and suppose K c U O i with
O. open. Then K = LI Ki where the K, are compact 96's and K, c O.
1 =1
Proof: By Proposition 9.21 there are continuous nonnegative
functions fi with support in 0, such that fi • ± = 1 on K.
Let Ci = {x: f(x) > -71,} . Then Ci is a compact 96 contained in Oi
and K c tj G. Set K, = Ci n K I
11. Theorem: Let 12, be a regular Baire measure on a locally compact Hausdorff space X. Then there is a unique Borel measure g on X
which extends js and for which:
i. For each open set 0,
/70 = sup {AK: K c 0, K a compact g6
}
For each Borel set E,
= inf {170: E c 0, 0 an open set} .
The measure rt also satisfies:
iii. For each Borel set E with ).-LE < 00,
= sup {17K: K c E, K compact}.
iv. For each Borel set E with i7E < 00, there is a Baire set H and
a Borel set N with riN = O and E = H P N.
Proof: Such a measure Ti is clearly unique, since (i) specifies its
value on open sets and that together with (ii) determines its value for
all Borel sets.
Define a set function /1* on open sets by setting
p,*0 = sup {ilK: K c 0, K a compact ç}.
Since it is regular, it follows that, if 0 is an open Baire set, then
i.z*0 = O. Let 0 = U 0i, and let K be a compact gs contained in
j=1
O. Then K = U Ki with Ki c Oi by Lemma 10, and so
< EtzK i < Ep*Oi. Taking the supremum over K, we have
A*0 < Etz*Oi, and
is countably subadditive on open sets.
CO
Sec. 4)
The Borel Extension of a Measure
315
Let aa be the collection of subsets M of X with the property
that for each e > 0 there is an open set 0 such that Mc0 and
12*0 < e. Any subset of a set in 01Z is again in gc, and the countable
subadditivity of tt* implies that the union of a countable collection
of sets in on is in NE. If E is a Baire set belonging to NE, then
= sup K: K c E, K a compact 9,3}. But for each K C E and
each E > 0 there is an open set 0 D K with 12*0 < E. Thus 12K < E
by the definition of kt * O. Since E was arbitrary, AK = 0. Consequently,
= 0, and the hypotheses of Proposition 12.30 are satisfied.
Thus there is an extension g of A to a o--algebra 03 containing art and
the Baire sets such that 7.-LM = 0 for M c
Each B c GI is of the form H A M with H a Baire set, M c
and 72B = AH. Since M is contained in an open set of arbitrarily
small measure, it follows from Problem 10 b. that if 13 < co, then
there is an open set 0 with B C 0 and 12*0 <
Let 0 be an open set with 11*c0 < oc. If K is a compact 93 contained in 0, then 72*(0
= tz* 0 — 12 K. Let WO a sequence of
compact 93's in 0 such that 1.4*0 = lim 12Kn . Then H = UKK7, isa
n
Baire set contained in 0, and, since 0 —Hc0
and
H e (317,.
Thus
we have 0
I2*0, and 0 c 03.
Let a be the a-algebra consisting of all sets which are locally
measurable with respect to 72, that is, of all sets E such that
EnBe@ for each Be@ with gB < co. Since each subset of 03
of finite measure is contained in an open set of finite measure and
each open set of finite measure is in it follows that E is in a if and
only if the intersection of E with each open set of finite measure is in
63. Consequently, a contains all open sets and hence all Borel sets.
Extend g to a by setting TX = co if E e a
63.
If we restrict g to the Borel sets, we have a Borel measure which is
an extension of A and satisfies properties (i) and (iv) of the theorem.
If E is a Borel set not in 03, then every open set 0 containing E has
= c,0 and TX = co. Thus (ii) holds for such E. Let E be a
set in og. Then E = H A M, where H is a Baire set and M c
Since H is contained in an open Baire set 0 1 with 110 1 <
E
and M in an open set 0 2 with /1 02 < E, we have Ec 0 1 u 0 2 with
2e. This proves (ii).
401 u 0 2) < pE
To prove (iii), let TLE < co. Then there is an open set 0 D E
{ A
—
with 72(0 —
< . By (i) there is a compact set C C 0 with
Measure and Topology
316
E
[Ch. 14
E
C) < — and by (ii) an open set U D (0 — E) with A. 0 <
2
2
Set K = C n U. Then K c E, and 17K > TiC — > TO — E >
2
Problem
17. a. Let f be a Borel measurable function on a locally compact
Hausdorff space X and 1.4. a (y-finite Borel measure on X satisfying conditions (i) and (ii) of Theorem 11. Then there is a Baire measurable
function g such that f = g a.e. [A.
b. What if i.t. is not (7-finite?
c. If f > 0, can we always choose g with 0 < g < f?
15
1
Mappings of Measure
Spaces
Point Mappings and Set Mappings
Let X and Y be any two spaces and cp a mapping of X into Y.
Associated with cp are several mappings of objects associated with Y
into corresponding objects associated with X. For example, the set
mapping 43 defined by cb(E) = yo-1 [E] is a mapping of the subsets of
Y into the subsets of X. This mapping preserves unions, intersections,
and complements. It is called the set mapping induced by or adjoint
to go. We refer to (if) as a point mapping. If (X, a) and ( Y, 03) are
measurable spaces, the point mapping yc, of X into Y is called measurable if ço — 1 [E] c a for each E c 63. Thus cp is measurable if 4, maps
03 into a.
If f is a real-valued function on (X, a), then f is a mapping of X
into R, and f is measurable with respect to a if and only if f is a
measurable mapping of (X, a) into (R, 133) where ild is the a-algebra of
Borel sets. A real-valued function of a real variable is Lebesgue
measurable if it is a measurable mapping of (R, Mt) into (R, 63),
where fflz is the class of Lebesgue measurable sets. It is Borel measurable iff it is a measurable mapping of (R, ea) into (R, (O.
Also associated with ço is the mapping (do* of the space of realvalued functions on Y into the space of real-valued functions on X
defined by v*(f) = fo ço. The mapping ço* is often called the adjoint
of (to, and it preserves sums, products, maxima, etc. If ço is measurable,
then ço* takes measurable functions into measurable functions.
317
318
Mappings of Measure Spaces
[Ch. 15
Let a be an algebra of subsets of X and 03 an algebra of subsets
of Y. Then a mapping cP of 03 into a such that ci.( Y) = X, cD(É) =
—43(E), and c1, (A U B) = cD(A) U cl.(B) is called a (lattice) homomorphism. If a and 03 are a-algebras and cl) has the property that
cf. (
E1) =J cP(E1), then cP is called a cr-homomorphism. Every
set mapping induced by a point mapping of X into Y is a a--homomorphism, but we can have cr-homomorphisms which are not induced
by any point mapping (Problem 2).
Let (X, a) and ( Y, 03) be measurable spaces and 4) a 0--homomorphism of 03 into a. Then induces a mapping ci,* of measures
on (X, a) into measures on ( Y, 03) if we define ci, */.1 by (4)* )(E) =
y(t(E)). The following proposition may be thought of as a changeof-variable formula.
1. Proposition: Let (p be a measurable point mapping of the
measure space (X, a, I.4) into the measurable space (Y, 03). Let 4' be
the induced set mapping of 03 into a. Then for each nonnegative measurable function f on Y we have
fy f g =
Proof: The proposition is clearly true if f is a characteristic function. From this it follows for f a simple function. Since f f is the
supremum of the integrals of all nonnegative simple functions less
than f, the proposition follows. I
Problems
1. Let 03 be a family of subsets of Y which contains Y, 0, and each set
{y} consisting of a single element. Then each mapping 4) of 03 into the subsets of X which preserves finite intersections and arbitrary unions and for
which 4)(Y) = X and 4)(0) = 0 is induced by a point mapping. [The sets
4'({y}) are disjoint and their union is X. Let ço(x) = y for x in Ey .]
2. Let X = Y = [0, 1], and let a be the collection of all subsets of
[0, I] which are either countable or the complements of countable sets.
Then a is a a-algebra. Let 63 = { Y , Ø}. For E r a, define 43(E) = 0 if E
is countable, 4,(E) = Y if E is the complement of a countable set. Then
4) is a a-homomorphism and it is not induced by any point mapping of Y
into X.
3. Show that the adjoint mapping ço* can be extended to map the
Sec. 2] Measure Algebras
319
extended real-valued functions on Y into the space of such functions on X.
Generalize Proposition 1 to this case.
4. Let (X, a), ( Y, Ca), and (Z, e) be measure spaces and yo: X—* Y
and V.: Y —f Z measurable mappings. Show that ti. . io is a measurable
mapping of (X, a) into (Z, e). What has this to do with the fact that, if f
is a Lebesgue measurable and g a Borel measurable real-valued function,
then g of is Lebesgue measurable, but f 0 g need not be?
5. Prove that cD*1.4 is a measure.
6. a. Let so be a measurable point mapping of (X, a, 12) into ( Y, 63, y)
and cl) the induced set mapping of 63 into a. Suppose VIA is absolutely
continuous with respect to v and that y is a finite (or a-finite) measure.
dtz
Define [— to be the Radon-Nikodym derivative of VA with respect
dv
to v. Then for each nonnegative measurable function f on Y we have
(f o io)dtt = f
y
clv
b. Let f be a nonnegative measurable function on [0, 1] and g a
monotone absolutely continuous function on [0, 1] with g(0) = 0,
g(1) = 1. Then
1
fol f[g(t)]g'(t) dt = f f(t) di.
7. a. Let (X, a) and (Y, Co3) be measurable spaces and 4 a cr-homomorphism from 63 to a. Show that there is a unique linear mapping T4,
of the measurable real-valued functions on Y into the measurable realvalued functions on X, which takes nonnegative functions into nonnegative
functions, such that for characteristic functions xE we have
T,p(xE) = XcE)•
b. Let z be a measure on (X, a) and f a nonnegative measurable
function on Y. Then
Ix TD(f) diu = fy f de (A).
2
Measure Algebras
A Boolean algebra is a set of elements on which there are defined
two binary operations V and A and a unary operation which
satisfy the following rules:
i. A VA = A,
. A A A = A,
320
Mappings of Measure Spaces
[Ch. 15
ii.AVB=BV A,
W.AAB=BAA,
iii. (A v B) vC=Av (B V C),
iii' . (A A B) A C=A A (B A C),
iv. A v (B A C) = (A V B) A (A VC),
v (A A C),
iv' . A A (B v C) = (A A
v. (A A
= A' v B',
vi. (A')' = A,
vii. 30 such that A A 0 = 0 and A v 0 = A,
viii. A' A A = 0.
One example of a Boolean algebra is an algebra a of subsets of some
set X with V, A, ' interpreted to mean u, n, —. We shall sometimes
call V, A, 'union', 'intersection', and 'complementation'.
A Boolean algebra a becomes partially ordered if we define A < B
to mean A A B = A. Then 0 is the smallest element, while X = 0'
is the largest element. Moreover, A v B is the smallest element of a
which is larger than both A and B.
A Boolean algebra a is called a Boolean o--algebra if for each
sequence (A n ) of elements of a there is a smallest element B such
that A n < B for all n. This element B is denoted by V A. In a
n=1
Boolean a-algebra the element C — (V
c.' An' )' is the largest element
n= 1
such that C < A n for all n. We write C =
A A.
One example of a
n=1
Boolean a--algebra is given by a a-algebra of subsets of a set X.
Another is the following: Let (X, a, kt) be a measure space and 9z,
the family of sets of measure zero. Two sets of a are said to be
equivalent modulo at if their symmetric difference is in N. Finite or
countable unions and intersections of equivalent sets are equivalent,
and so the classes of equivalent sets form a Boolean a-algebra. We
denote it by a/9z. In fact, the only properties we need of 91, are (i) if
A E 9Z, and B e a with B c A, then B e gt, and (ii) if A n e 9i, then
U A r,
91. A subset of a Boolean a-algebra a with these properties
n=1
is called a e-ideal, and we may define the Boolean a-algebra a/9z of
equivalence classes of a mod N.
By a measure algebra, we mean a Boolean cr-algebra a together
with a nonnegative real-valued function g defined on a such that
Sec.
2]
321
Measure Algebras
p(A) = 0 if and only if A = 0 and 1.4 (V Az) =
E
/Lit
if
A A A, = 0 for i j. We call ,u a measure on a. If (X, 63, 11) is a
finite measure space and oz is the collection of sets of measure zero,
then we obtain a measure algebra if we consider kt on the Boolean
cr-algebra a = ca/a, that is, if we fail to distinguish between sets of
153 which differ by a set of measure zero.
In a Boolean algebra we define the symmetric difference A A B
of two elements by A A B = (A A B') V (A' A B). A measure
algebra becomes a metric space if we define p(A, B) = I.L(A A B).
A',
This metric space is always complete, and the mappings A
(A, B) ---> A v B, and (A, B)
A A B are continuous. A measure
algebra is called separable if it is separable as a metric space.
A mapping (D of a measure algebra (a, /1) into a measure algebra
y) is called an isomorphism into if cD(A') = [43(A )]' , «A 1 y A 2) =
«A 1 ) V «A 2) and I.L(A) = y((D(A)). The mapping (D is called an
isomorphism if it is onto. If the measure algebras are considered as
metric spaces, an isomorphism is an isometry which preserves
complements and finite unions. It follows that an isomorphism also
preserves countable unions and intersections (Problem 10).
An element A
0 in a measure algebra a is called an atom if
B < A can occur only for B = A and B = 0. Let ar6 be the class of
measurable subsets of [0, 1], oz the class of subsets of measure zero,
and m Lebesgue measure. Then (91 6/91, m) is a separable measure
algebra without atoms. The following theorem asserts that, apart
from isomorphism, it is the only such measure algebra.
2. Theorem (Carathéodory): Let (a, ,u) be a separable measure
algebra with p(X) =- 1. Then there is an isomorphism ct. of (a, /1) into
the measure algebra (fflz/91, m) induced by Lebesgue measure m on
[0, 1]. The isomorphism 49 is onto jf a has no atoms.
Proof: Since (a, A) is separable, there is a sequence (A n ) of
elements which are dense in a. Let an be the Boolean algebra obtained by taking all unions of intersections of the sets A 1 , . . • , A.
and their complements, and let a. = U an- Then a. is again a
Boolean algebra. For, given any A and B in a„, they belong to an
and am, respectively. But if m < n, we have am c an, and so A',
A y B, and A A B are all contained in a c a...
71=1
322
Mappings of Measure Spaces
[Ch. 15
We shall define by induction a mapping I. of a. into the algebra
g of all finite unions of half open subintervals of [0, 1). The algebra
al consists of the four sets 0, A1, Ai, X, and we have ,uA l =
uX = 1. Let «A 1) = [0, A 1), = 1),cl)(0) = 0, 1.(X) =
[0, 1). Then 4) preserves unions, intersections, complements, and
measure. Suppose now that l) has been defined on an _ i so that it
maps an _ i onto the algebra generated by the half open intervals
[0, x 1), [x 1 , x2), • • • , [xk, 1) and that 4) is measure preserving and
preserves unions and complements. We wish to extend the mapping
4) to an . Let Bo, • • • , Bk be the sets in an_i which are mapped onto the
intervals [0, x 1), . . , [xk , 1). Then an consists of all finite unions
of the sets Bo, . . • , Bk, and an consists of all finite unions of
A n A B o, ,A n A Bk, A n' A B0, . . . ,A, A Bk. For those intersections which are 0, set 4) equal to Ø. For those which are not
0, let 4)(An A B3) = [x3, xi + 1.1 (An A B3 )) and 43(A n' A B =
[x; m(A„ A B1), xi+1). Since u(An A B) + 12(A'n A B,) = 1.1(13;) =
x 11 — x j, we see that these are properly defined intervals, 4. is
measure preserving, and «A n A Bi) U «A A B1) = x1+1) =
.t(B1). From this it follows that we can extend I. to all of an so that it
preserves unions, complements, and measures.
Thus we have defined by induction the mapping I. from a,, to
urc/gt so that it is measure preserving. Hence it is an isometry. Since
a. is dense in a and the metric space arz/gt is complete, we can
extend l) to be an isometry from a into art/a.
To see that l) preserves complements, let E be any element in a
and choose A e a. so that ,u(E A A) < E. Then A' e a., and
,u(E' A A') = ,u(E A A) < e. Since 43 is an isometry and 43(A') =
we have m(4)(E') A 43(A)) < e and m(4)(E) A 43(A)) =
m(4)(E) A 4)(A)) < 6. Hence m(4)(E') A 44E)) < 2€ for all e > 0.
Thus 43(E') = —43(E) in the algebra arz/91. A similar argument shows
43(E V F) = 4)(E) u 4)(F).
Since 4) is an isometry, it is one-to-one into. To identify the range
of 4), let E be the set of endpoints of those intervals which were used
in defining the mapping 43 on the algebras an . Then maps a.
onto the algebra of finite unions of half-open intervals with endpoints
in E. Suppose that r is not all of [0, 1], and let I be one of the open
is composed, that is, an interval conintervals of which [0, 1)
tained in —1-7 whose endpoints lie in E . Since the endpoints of /
lie in .E, I is a limit of intervals in (DN.], and so lies in 1.[a]. Let
—
Sec. 2]
323
Measure Algebras
A e a be such that (A) = I, and let B be an element of a with
B < A. Then ci (B) c I. Since B can be approximated by elements
in ao,„ ck(B) can be approximated by elements in (14a,,,]. But these
latter are sets which either contain I or do not meet I. Hence
4)(B) = I or (D(B) = Ø, and we have B = A or B = Ø, since ci
is one-to-one. Consequently, A is an atom.
We have thus proved that if a has no atoms, then E' = [0, 1].
But if E = [0, 1], then every half-open interval is in [a], and hence
ci)[a] contains every Borel set. Since every measurable subset of
[0, 1] is the union of a Borel set and a set of measure zero, we have
in this case
cD[a] = TC/91. I
This theorem states that, if (X, 61, ,u) is a separable measure space
without atoms for which A(X) = 1, then the corresponding measure
algebra is isomorphic to the measure algebra induced by Lebesgue
measure on [0, 1 1 . It does not assert the existence of a point mapping
between [0, 1] and X, or even of a set mapping of 61 into the measurable sets of [0, 1], but only of a correspondence of sets of modulo
null sets with measurable sets modulo sets of measure zero. Thus if
we have a set B c 61, we do not assign to it a particular measurable
set but only an equivalence class of measurable sets in [0, 1]. In the
next sections we shall develop criteria for asserting the existence of
point mappings which induce given mappings of measure algebras.
Problems
8. Prove that in a Boolean a-algebra we have
B
9. Let
A (V
A„) =
(B
A
A n).
a be a Boolean a-algebra and at a Boolean a-ideal. Show that
then A'
B' c D-C, and that if (A n A B„) EN. then
if A ABe 01,
(V AO A (V Bn)
10. a. Let (a, ,u) be a measure algebra and (A n ) a sequence of elements
such that A n A A, = 0 for n X
m.
Then V A = lirn V A. (Here
n= 1
k—,00 n=
lim means limit in the metric space defined by 1.t.)
Let (a, 1.4) be a measure algebra and
oe,
elements in a. Then V A = lim V A.
k—*co n=1
n=1
b.
(A n )
any sequence of
324
Mappings of Measure Spaces
[Ch. 15
c. Show that if c is an isomorphism of a measure algebra (a, A)
CO
oo
into a measure algebra (CB, y), then cD ( ci An) = V 49(11 .
)
11. Prove that a measure algebra is complete as a metric space. [If
(A s ) is a Cauchy sequence we may assume that u(A n A A m) < 2—N for
n, m > N. Then if B, = 11
V A„ we have 1.4(A, A Bn) < 2—n+ 1 . Now
v=n
A
n=1
Br, = lim 13, = lim An.]
12. Show that in a measure algebra the operations ', A, and V are
continuous.
13. Show that a measure algebra (as we have defined it with /L(X) < co)
can have only a countable number of atoms. Hence any complete separable
measure algebra is isomorphic either to an interval (with Lebesgue
measure), to a measure space consisting of a countable number of atoms
(discrete measure space), or to a measure space which is the union of the
preceding two.
14. Discuss measure algebras in which we allow /J. to be an extended
real-valued function.
3
Borel Equivalences
If (X, a) and ( Y, 03) are measurable spaces, we may ask for conditions under which they are equivalent in the sense that there is a
one-to-one mapping yo of X onto Y such that yo and cp -1 are measurable, that is, such that (p[A] 6 03 for each A c a and yo —l [B] c a for
each B c G. In the present section we shall always assume that X and
Y are metric spaces and that a and 03. are the algebras of Borel sets.
A one-to-one mapping of X onto Y which takes Borel sets into Borel
sets and whose inverse takes Borel sets into Borel sets will be called a
Borel equivalence. Thus any homeomorphism is a Borel equivalence,
but we may have Borel equivalences which are not homeomorphisms.
If X is Borel equivalent to Y and Y Borel equivalent to Z, then X is
Borel equivalent to Z. We begin by establishing some lemmas:
3. Lemma: Let X be a metric space and E a Borel subset of X.
Then the Borel subsets of E are those Borel subsets of X which are
contained in E.
Proof: Since E is a Borel subset of X, 0 n E is a Borel subset of
X for each open subset 0 of X. Hence the Borel subsets of X which
Sec. 3] Borel Equivalences
325
are contained in E form a a-algebra which contains the (relatively)
open subsets of E and so must contain every Borel subset of E. On
the other hand, the collection of those subsets of X each of which is
the union of a Borel subset of E and a Borel subset of X — E is a
a-algebra which contains the open subsets of X. Hence it must contain all Borel subsets of X, and we see that a Borel subset of X
which is contained in E must be a Borel subset of E.I
4. Lemma: Let X and Y be metric spaces, and let X0 and Yo be
countably infinite subsets of X and Y. Then each Borel equivalence
between X — X 0 and Y — Yo can be extended to a Borel equivalence
between X and Y.
Proof: Since X0 and Y0 are countably infinite, we can extend the
Borel equivalence between X — X 0 and Y — Yo to be a one-to-one
correspondence between X and Y. Since any countable subset of a
metric space is an Fo., the union and difference of a Borel set and a
countable set are again Borel sets. Thus a subset of X [or Y] is
a Borel set if and only if its intersection with X — X0 [or Y — Yo] is
a Borel set. Hence the extended correspondence between X and Y
is a Borel equivalence. I
If X is a topological space and A any set, we use X A to denote the
product space X X, where each X, = X. The set X A is the set of all
aeA
mappings of A into X. If X is metric and A is countable, then X A is
metrizable (cf. Problem 8.27). We shall use w to denote the set of
integers. The following lemma is an immediate consequence of the
fact that co X co is countable.
5. Lemma: If X is a topological space, Xc° and (XI" are homeomorphic.
One of the simplest topological spaces is the space whose elements
are 0 and 1 and whose topology is discrete. We often denote this
space by 2. The product space r is a compact metric space.
6. Proposition: The unit interval [0, 1] is Borel equivalent to 2".
Proof: Let X 0 be that subset of 24' consisting of those elements
which have only a finite number of coordinates equal to 0 and of
326
Mappings of Measure Spaces [Ch. 15
those which have only a finite number of coordinates equal to 1.
Then X 0 is countably infinite. On 26) — X0 define yo as that mapping
E E,2 -1, where E, is the ith coordinate of x.
2=
Then ye, maps 2 — X0 in a one-to-one manner onto those elements
of [0, 1] which are not of the form p 2 —n with p and n integral, and
cp is readily seen to be a homeomorphism between r — X 0 and these
elements of [0, 1]. Since the set of elements of [0, 1] of the form
p 2 -n is countably infinite, the proposition follows from Lemma 4. I
which sends x into
7. Proposition: Let I be the interval [0, 1 ]. Then Iw and I are
Borel equivalent.
Proof: By Lemma 5 the spaces 2w and (r)''' are homeomorphic.
Thus by Proposition 6 we have I Borel equivalent to 2w, which is
Borel equivalent to (216', which in turn is Borel equivalent to
I
8. Theorem: Each complete separable metric space is Borel
equivalent to a Borel subset of [0, 1].
Proof: Since I = [0, 1] and I" are Borel equivalent, it suffices
to show that each complete separable metric space (X, p) is homeomorphic to a Borel subset of P. We may assume without loss of
generality that p is bounded by one. Let (r„) be a dense sequence of
points, and let f be that mapping of X into P which assigns to x the
point whose ith coordinate is p(x, r„). Then f is a one-to-one mapping
of X onto a subset E of P, for, if x
y, there is an r, such that
p(x, r,) < p(y, r2). Since X and ./w are metric, the mapping f will be
continuous if x„ x implies f(x) —f(x). But if x,„ x, then
p(x„, r,)
p(x, r,) for each i, and so each coordinate of f(x) converges to the corresponding coordinate of f(x). Thus f(x) f (x).
(cf. Problem 8.26). The mapping f —1 will be a continuous mapping
of E onto X if f(x) f(x) implies xn
x. But if f(x)
f (x),
p(x, r,) for each i. Given E > 0, choose r, so that
then p(x„, r,)
p(x, r,) < E/3 and choose N so that Ip(x„, r) — p(x, r i)i < e/3 for
n > N. Then, for n > N, we have p(x„, r) < 2E/3, and so
< E. Consequently, f is a homeomorphism of X onto E.
P(xn,
It remains only to show that E is a Borel set. But E is dense in
F = E. By Theorem 8.11 the completeness of X implies that E is a
Sec.
3]
Borel Equivalences
327
gs relative to F and hence a Borel set relative to F. Thus E is a Borel
subset of P. I
Actually, a somewhat stronger statement is true: Each complete
separable metric space is Borel equivalent to [0, 1] (see Kuratowski
[16], p. 227). We shall only make use of the weaker statement given
in Theorem 8.
9. Theorem: Let X be a complete separable metric space and A
a Borel measure on X such that AX = 1 and such that ,u{x} = 0 for
each set fx; consisting of a single point. Then there is a Borel set
X 0 c X with ,uX 0 = 0, and a subset Yo of [0, 1] of Lebesgue
measure zero, such that there is a Borel equivalence 1,t of X - X 0 with
[0, 1] - Y o having the property that p.(1,t -1 [A]) = mA, where m is
Lebesgue measure on [0, 1].
Proof: Let (p be a Borel equivalence of X with a Borel subset E of
[0, 1], and define the Borel measure v on [0, 1] by vA Let f be the cumulative distribution function of v, that is, f(x) =
v[0, x]. Then f is a monotone nondecreasing function on [0, 1]
which is continuous on the right. Since f(x) - lim f(t) = v(xl =
t--u-14p -i [x]} = 0, we see that f is continuous. Since f(0) = 0 and
f(1) = v[0, 1] = AX = 1, f is a continuous mapping of [0, 1] onto
[0, 1]. For each y e [0, 1], the set f -l [fyl ] is a closed set, and since
f is nondecreasing it is either a closed interval or a single point. Let M
be the set of y such that f -1 [{y1 ] is a nondegenerate intèrval. Since
these intervals are disjoint, M is countable. Now mE = v(f -l [E])
by Problem 12.12, and so the set N = f-l [M] has v-measure zero.
Now f is a homeomorphism of [0, 1] - N onto [0, 1] - M. Since
N is a Borel set, the set X 0 = cp -1 [N] is a Borel set, and AX0 =
vN - O. Thus the mapping ly = fo (p, is a Borel equivalence of
X - X 0 with 1,G[X - X 0]. Since (p is a Borel equivalence of X onto a
Borel subset of [0, 1], ço[X - X0] is a Borel subset of [0, 1] contained
in [0, 1] - N and hence a Borel subset of [0, 1] - N. Thus
X0] = f [ço[X - X 0]] is a Borel subset of [0, 1] - M
and hence of [0, 1]. Thus 11, is a Borel equivalence of X - X 0
with a Borel subset of [0, 1]. Since mA = ,u( 1 [A]), we have
m(Iti[X - X0]) - 1. ( X - X0) = 1, and we see that the set Y o =
[0, 1] - %G[X - X0] is a Borel subset of Lebesgue measure zero. I
Mappings of Measure Spaces
328
[Ch. 15
Problems
15. Let X be a complete separable metric space and y a Borel measure
on X. Let E be a Borel subset of X with the property that if A is a Borel
subset of E, then AA = 0 or i.t(E — A) = O. Then there is a point x
contained in E such that ,u(E {X}) = 0. [Use Theorem 9.]
16. Let y be a Borel measure on a space X. Then A=A0-1-Ai where
= 0 for each x c X and ,u 0E =
E
.
xeE
4 Set Mappings and Point Mappings
on Complete Metric Spaces
We say that (X, a, 9-c) is a measurable space with null sets if a is a
a--algebra of subsets of X and 91 is a a--ideal in a. Each measurable
point mapping go of X into [0, 1] induces a cr-homomorphism 43 of
the Borel subsets of [0, 1] into the a-algebra a/91 by letting 1.(A) be
the equivalence class containing cp -1 [A]. The following theorem
states that, conversely, every a-homomorphism of the Borel subsets of
[0, I] into a/m is induced in this way by a point mapping go of X into
[0, 1]. It is essentially Lemma 11.10 in another guise.
10. Theorem (Sikorski): Let (X, a, 91) be a measurable space with
null sets, and let 43 be a a—homomorphism of the family 133 of Borel sets
in [0, 1] into the a--algebra a/91 with .1,([0, 1]) = X. Then there is a
measurable mapping cp of X into [0, 1] such that for each B c (B we
have yo -1 [B] in the equivalence class 4,(B). If p is any other point
mapping with this property, then ‘1. = ia except on a subset in N.
Proof: For each rational number a in [0, 1] let A, be a set in the
equivalence class ,t.([0, a]). We may take A = X. If a < 0, then
I.([0, a]) < 4)([0, 0]), and so the set Ea o = A, — A o is in 9Z. Let
E = U Eafi , where a and 13 run through the rationals. Since this
,
a<is
is a countable union and at is a a-ideal, we have E c N. Set B,
A a u E. Then Ba is in the equivalence class .t.([0, a]), and Ba C B o
for a < 3.
For each x e X, let go(x) = inf (a: x e B al-. Since B 1 = X, ça(x)
is defined for each x, and 0 < go(x) < 1. Thus ça is a mapping
of X into [0, 1]. For each t we have {x: ça(x) < t} = U B a.
a<1
Sec.
4]
Set Mappings and Point Mappings on Complete Metric Spaces
329
Hence ço is measurable, and ço -1 [[0, a]] = Ba. If we let NI, be the
u-homomorphism of 63 into a/oz induced by cp, then NI/ agrees with
ci) on the closed intervals with rational endpoints. Since
and ci)
are u-homomorphisms, the family of sets on which they agree is a
if-algebra and so must be the family 63 of Borel sets.
If o is any other mapping of X into [0, 1] which also induces the
u-homomorphism cic then for each pair a and 3 of rational numbers
the set {x: ç(x) < a < < ti.(x)j =
a]] —
IA] and
Thus we see that the set tx: ç(x) < o(x)1 =
so must be in
(10(x) < a < < V/(x)} is in M. Also ix: (p(x) > 0(x); c oz,
U
a,
and so (x: go(x)
o(x); c
M. I
11. Theorem: Let (X, a, M.) be a measurable space with null
sets, Y a complete separable metric space, and cla a Œ-homomorphism
of the Borel sets 63 of Y into a/oz with 4)(Y) = X. Then there is a set
X0 e T. and a point mapping cp of X — X 0 into Y such that for each
B e 63 we have (p —i [B] in the equivalence class cla(B).
Proof: By Theorem 8 there is a Borel equivalence X of Y with a
Borel subset E of [0, I]. For each Borel subset B of [0, 1] let
if(B) = ci)(X -1 [B n E]). Then there is a mapping o of X into [0, 1]
which induces Nir. Let X0 = 0 -1 [É]. Since O -1 [É] = 43(0), we have
X0 e Dz. On X — X 0, the mapping X -1 . o is defined and induces
the u-homomorphism F. I
A u-homomorphism trip from a Boolean if-algebra a to a Boolean
if-algebra 63 is called a if-isomorphism if there is a u-homomorphism
NIf from 63 to a such that
. 43 and 43 . 4, are the identity mappings
on a and 63. The following theorem is a generalization of a theorem
of von Neumann, who assumed the unnecessary hypothesis that
be measure preserving with respect to suitable measures on a and B.
12. Theorem: Let X and Y be complete separable metric spaces,
a and cB their Borel sets, and let T-c and oz be o--ideals in a and (B.
If cl) is a if-isomorphism of On onto (B/gt then there are sets X 0 e
and Yo e at and a one-to-one mapping cp of Y — Yo onto X — X 0
and cp — ' are measurable and 4)(A) = p 1 [A] modulo M. suchtayo
Proof: By Theorem 11 there is a set Y 1 e 91 and a mapping cp
of Y — Y 1 into X such that 4)(A) = cp 1 [A] modulo oz. Similarly,
Mappings of Measure Spaces
330
[Ch. 15
there is a set X1 e OR and a mapping ;I/ of X X 1 into Y such that
*(B) = irt [B]. Since 430 4/ is the identity on 03/M, the mapping
0 so can differ from the identity mapping on Y — Y1 only on a
subset Y2 e DI. Let Yo = Y1 u Y2. Then so is one-to-one on
Y — Yo, and so—a [A ] — 43(A) modulo Dz, so[B] — *(B) modulo Mt
Thus so[ Y — Yo] differs from X by a set X0 e NC. I
—
Unfortunately, the exceptional sets X0 and Yo in the preceding
theorem may be unavoidable (see Problem 17), but they may be
eliminated in special cases. One is given in Problem 18, another by the
following theorem.
13. Theorem: Let X be a complete separable metric space, ea the
family of Borel sets of X, and 91 a a-ideal of 63. If 4) is any a-isomorphism of 03/1t onto itself, then there is a one-to-one mapping ie, of X
onto itself such that ie, and ie, are Borel measurable and 4)(A) =
—so 1 LA] modulo N.
By Theorem 12 there are sets X0 and Yo in Dz. and a one-toone Borel measurable mapping ‘G of X
Yo onto X
Xo. Let
Zo = X0 u Yo, Z1 = ;G[Z o] u 1G — 'Vol Zn+i = ‘qZ.] U 0-1 [Zn]Since ;G and 1,G -1 take sets in 9t into sets in at, we have Z„ e 9z. Set
Z = U Zr,. Then Z e Di, 44Z] C Z, and V i [Z] C Z. Thus 1,1' and
are one-to-one mappings of X Z into X Z, and since they are
inverses of each other they must map X Z onto X Z. Define so
by letting so(x) = ;G(x) if x eX—Z and so(x) = x if x e Z. Then so
is a Borel equivalence of X onto itself which agrees with L. except
on the set Z e Di. 1
Proof:
—
—
—
—
—
—
Problems
17. a. Show that the exceptional set X0 in Theorem 11 may be unavoidable. [Let Y be a countable space with the discrete topology, X
the unit interval with 9-1. the family of all sets which contain no rational
numbers.]
b. Show that the exceptional sets X0 and Yo in Theorem 12 may be
unavoidable.
18. Show that we can dispense with the exceptional sets X0 and Yo
in Theorem 12 Won and 9-t are o--ideals of (P(X) and aci( Y) and each contains
a set with as many elements as there are real numbers.
Sec. 5]
The Isometries of LP
331
19. Let X, Y, a, ea, 51 t, at be as in Theorem 12, I. a a-homomorphism
of a/ort into oalgt, and let so be the mapping given by Theorem 11.
a. The mapping so is one-to-one except on a set Yo e 9l if and only
if there is a cr-homomorphism IF of 33/97, into a/mt with 4)o x1f the identity
on 63/91.
b. The mapping so is onto all of Xexcept for a set in 511 if and only if
cNA) = 0 implies A = O.
5 The lsometries of
LP
We illustrate the use of the theorems in the preceding sections by
deriving a characterization of the isometries of LP[O, 1] onto itself,
that is, of those linear mappings U of /2[0, 1] into itself such that
Uf =11. We begin by establishing two inequalities, the first
concerning real (or complex) numbers and the second elements in LP.
14. Lemma: Let and n be real numbers. Then if 2 < p < o ,
IE
ni P +
2 (l+
It
while if 0 <p < 2,
2(j2
IE + nIP + E — nIP
If p
I nr).
2, equality can only hold if or n is zero.
Proof: If p = 2, we have equality for all and n. If 2 < p <
then 1 < p/2, and applying the Holder inequality with exponents
p/2 and p/(p
2) to a 2 )3 2 we obtain
—
a2 +
0 2 < ( a l) + 0P)2/P(1 + 1)(p-2)/P
aP
0 13
or
2(2-1))/2( a 2 + 0 2f/2 .
(1)
If 0 < p < 2, replace p by 4/p. Then (1) becomes
a 4/ 7'
+ 13 41p > 2( -2)/ (a2
7'
If we replace a by a 7" 2 and S by
a2 +
13 112 ,
7'
02)21 .
7'
this becomes
(3 2 > 2(p-2)/p( ap
op‘2123
)
332
Mappings of Measure Spaces
Or
ap
[Ch. 15
+ 0 P < 2(2—p)/2(a 2 + 0 2)p/2 .
Since
E2
0<
<1
we have
(2
11 2)
<
_I_ 2)p/2 --
- 2 1'
E2
(E2
IEI P
>
_I_ 0 2)p/2 --
t2 1'
n2
2< p
n2
p < 2.
and
E2
Forming similar inequalities for n and adding, we get
! Er
+ i n r < ( E2 d_
n 2)p/ 2
2
<p
and
l Er + i nr .?. ( t2 ± 712)p/ 2
p < 2.
We verify that equality can hold in (3) or (4) and hence in (5) or
(6) only if t = 0 or n = 0.
Assume p > 2, and replace a and 13 in (1) by ;E + ni and I t —
Then
1 E ± n ip ± 1 E _ o r ..... 2(2—p)/2 (i t + 7112 + i E
ni 2)p12
_ 4 E2 ± n 2y/ 2
2(ItIP d-
Inn
by (5). This establishes the first inequality in the lemma, and the
second follows similarly using (2) and (6). We see that in either case
for p
2 equality can occur in (3) and (4) only if t = 0 or n = O. 1
By integration we have the following lemma as a consequence of
Lemma 14.
15. Lemma: Let 1 < p < oo, p
are in L. Then
2, and suppose that f and g
111+ glIP + Ii! — gir ==. 2(llf1I9 + iiglIP)
Sec. 5]
The lsometries of
LP
333
if and only if f • g = 0 almost everywhere.
16. Theorem (Lamperti): Let 1 < p < oo, p
2, and let U be a
of
LP[0,1]
into
itself
such
that
linear transformation
11U:flip = If
Then there is a Borel measurable mapping cp of [0, 1] onto (almost all
of) [0, 1] and an h c LP such that
Uf h (f
The function h is uniquely determined (to within a.e. equivalence) and
is uniquely determined (to within a.e. equivalence) on the set where
0. For any Borel set E we have
h
dt = f dt.
Proof: We define the support of a function f to be the set
ft: f(t)
01. Iff E LP, then the support off is only defined modulo
null sets. Thus for f e LP the support of f is an element in the
a--algebra 133/9z, of Borel sets modulo sets of measure zero. We define
a mapping 43, of the Borel sets of [0, 1] into Borel sets modulo null
sets by setting 4,(A) equal to the support of UXA. If A and B are
disjoint, we have by Lemma 15 that
Ilx A + xn 1 I P + I lx A —
xn 11'
2( 11x A
+ 11xB
Since U is linear and preserves norms, we have, again using
Lemma 15,
(UXA) • ( UXB)
= 0 a.e.
Thus cic, takes disjoint sets into disjoint sets. If A and B are disjoint,
XAug = XA
Xg, and so UX A uB = UX A UXB. Since the two
functions on the right have disjoint support, we see that 4,(A U B) =
4)(A) u 4, ( B) for disjoint sets and hence for any pair for Borel sets.
If CO, 1]] = E, we have (PO) = E 4,(A). Thus (i) is a homomorphism of 133 into the algebra of Borel subsets of E. If A =
Û Ai
1
with the A i disjoint, we have
XA = lim
E XA i
i=1
334
Mappings of Measure Spaces
[Ch. 15
in LP, and so by the continuity of U,
n
UX A = Jim
E ux A ,.
Hence (1)(A) = U 4)(A 2), and ci. is a u-homomorphism.
By Theorem 10 there is a Borel-measurable mapping cp of E into
[0, 1] such that 43(A) = („o-1 [A]. Since 4, can only take sets of
measure zero into sets of measure zero, the mapping cp must be onto
all of [0, 1] except possibly for a set of measure zero. Extend (p.
to be defined on all of [0, 1] by setting cp(t) = 0 for t y E.
The function 1 = X [0, 11 is in L. Let h = U(1). Then h e D' and
the support of h is E. If A is any Borel set in [0, 1], we have 1 --=
XA + XA. Hence h = UXA ± UX2. But the functions on the right
have disjoint support, and so UX A must equal h on the support of
UX A . Thus UX A = h - X 4,(A) = h - (X A . (p). Hence, if 1,/, is any simple
function, we must have Uip = h • (tp 0 (p). Since every function in
D' can be approximated in norm by a simple function and U is norm
preserving, we have Uf = h • (f . (p) for all f g D'.
The remaining statements in the theorem now follow easily. I
Problems
20. a. Show that if the linear transformation U is onto LP[0, 1], then
X ,-- E has measure zero.
b. Show in this case that q, is essentially one-to-one (that is, one-toone except on a set of measure zero), that (p-1 is measurable, and that
io and cp-1 take sets of measure zero into sets of measure zero. [Hint:
Apply Theorem 16 also to the transformation U— ', and the uniqueness
part of the theorem to / = UU-1 and/ = U-1 U.]
c. Show in this case that if we define a measure A by AA = m( se[AD,
"
1 ]
then (hi') = [±dm •
21. What can you say about the isometries of LP(X, 12) for general finite
measure spaces? (Here you cannot get a point mapping but must be content with the set mapping 43.)
22. Show that the characterization given in Theorem 16 is false for
L2[0,1].
23. A Banach space X is called uniformly convex if given E > 0 there is
an n < 1 such that whenever u and y are elements of X with i lull = Ilyil = 1
and llu — vil > a we have Ilu ± vil < 2n. Use the integrated form of
Lemma 14 to show that LP is uniformly convex if p > 2 and that we can
t
take n = (1 — EPP.
Epilogue
In the first part of the book we considered most of the central theory
of functions of a real variable. This material falls roughly into three
classes: sets of real numbers and their properties; measure and integration;
and properties of real valued functions such as continuity and differentiability. The books of Saks [20] and Hobson [12] give an excellent
account of this classical theory, particularly of those results which depend
strongly on special properties of the real numbers. A very elegant presentation of this classical theory is given by Part I of Riesz-Nagy [18]. The books
by Carathéodory [3] and de la Vallée Poussin [24] will give the reader some
feeling for the earlier treatments of this subject. One topic from the classical
theory of functions of a real variable which has been omitted is the theory
of Fourier series, and a good reference here is Zygmund [25]. Another
topic we have omitted is the classical theory of functions of several real
variables. Although much of this theory is subsumed under the general
theory of integration in Part Three, especially that dealing with Fubini's
Theorem, there are many properties depending on the special structure
of R. These include much of the theory of partial derivatives, change of
variables, and integration by parts. Some of this material is covered in
Saks [20] and Edwards [6].
In Chapters 7, 8, and 9 we considered various aspects of general
topology. For a more detailed treatment of this subject Kelley [14] is an
excellent reference, and you will find there further references to the
literature. In addition to the topics we have discussed briefly, there is a
chapter on uniform spaces which generalize the metric spaces and the
topology of linear spaces so that the notions of completeness and uniform
continuity are meaningful. For deeper properties of metric spaces and Borel
sets, Kuratowski [16] is good.
335
336
Epilogue
In Chapter 10 we discussed briefly some of the geometrical aspects of
Banach spaces, and topological linear spaces were touched upon to the
extent needed for the weak and weak* topologies in Banach spaces. The
theorems (Hahn-Banach, closed graph, uniform boundedness, Alaoglu,
Krein-Milman) are some of the most useful general tools at the disposal
of the analyst. Useful theorems that we did not mention here are the
fixed-point theorems of Schauder and Leray. A thorough catalogue of the
"geometrical" properties of such spaces is given in Day [4], which also
includes an excellent guide to the literature. The book by Banach [1] is
very readable, although it suffers from the fact that general topology was
not in existence at the time it was written and the weak topological notions
are treated entirely in terms of sequential convergence. A thorough
treatment of topological vector spaces can be found in Kelly-Namioka
[15] or in Schaefer [21]. A central topic in modern analysis that we have
not mentioned here is the theory of linear operators on a Banach or Hilbert
space. Discussion of these from various points of view may be found in
Dunford-Schwartz [5], Edwards [6], Hille-Phillips [11], Riesz-Nagy [18].
In Part Three we have given a reasonably comprehensive discussion of
abstract measure and integration. The central theorems are the convergence theorems, the Radon-Nikodym theorem, the extension theorem
for measures on an algebra, product measures and the Fubini-Tonelli
theorems, the equivalence of abstract integration with that based on a
measure, and the representation of bounded linear functionals on C(X).
For a full discussion of measure theory based on o--rings and of measure
theory in locally compact spaces the reader should consult Halmos [8],
which gives a very readable and thorough treatment of measure theory.
For systematic treatment of the Daniell theory of integration in locally
compact spaces, the book by Loomis [17] is a good reference.
An important branch of measure theory which we have not touched
upon is the theory of Haar measure, which is the theory of measure
invariant under a group of transformations on the measure space. A nice
introduction is given in Banach's appendix to Saks [20], a full treatment in
Halmos [8], and a treatment in terms of the Daniell integral in Loomis
[17].
Another important aspect of measure theory is ergodic theory, which is
the study of the properties of a measure preserving transformation and its
iterates. The reader will find readable treatment in Hopf [12] and Halmos
[10].
There are a number of applications of real analysis to the theory of
functions of a complex variable, and the reader will find a nice treatment
of many of them in Rudin [19].
Bibliography
Théorie des Opérations Linéaires (Monografje Matematyczne, Vol. 1), Warsaw, 1932.
[2] G. BIRKHOFF and S. MAC LANE, A Survey of Modern Algebra (3rd ed.),
New York, Macmillan, 1965.
[3] C. CARATHÉODORY, Vorlesungen fiber reelle Funktionen, LeipzigBerlin, 1927.
[4] M. M. DAY, Normed Linear Spaces (Ergebnisse der Mathematik und
ihrer Grenzgebiete, No. 21), Berlin, Julius Springer, 1958.
[5] N. DUNFORD and J. SCHWARTZ, Linear Operators, New York,
Interscience, 1958.
[6] R. E. EDWARDS, Functional Analysis, Theory and Applications, New
York, Holt, Rinehart and Winston, 1965.
[7] A. M. GLEASON, Fundamentals of Abstract Analysis, Reading, Mass.,
Addison-Wesley, 1966.
[8] P. R. HALMOS, Measure Theory, New York, Van Nostrand, 1950.
[9] P. R. HALMOS, Naive Set Theory, Princeton, Van Nostrand, 1960.
[10] P. R. HALMOS, Entropy in Ergodic Theory, Chicago, Univ. of Chicago
Press, 1959.
[11] E. HILLE and R. PHILLIPS, Functional Analysis and Semi-Groups
(American Mathematical Society Colloquium Publications, Vol. 31),
Providence, 1957.
[12] E. W. HonsoN, The Theory of Functions of a Real Variable and the
Theory of Fourier's Series (3rd ed.), Vol. 1, Cambridge, 1927.
[13] E. HOPE, Ergodentheorie (Ergebnisse der Mathematik und ihrer
Grenzgebiete, No. 5), Berlin, Julius Springer, 1937.
[14] J. L. KELLEY, General Topology, New York, Van Nostrand, 1955.
[1] S. BANACH,
338
Bibliography
[15] J. L. KELLEY, I. NAMIOKA, et al., Linear Topological Spaces, Princeton,
Van Nostrand, 1963.
[16] C. KURATOWSKI, Topologie I (Monografje Matematyczne, Vol. 3),
Warsaw, 1933.
[17] L. H. Loomis, An Introduction to Abstract Harmonic Analysis, New
York, Van Nostrand, 1953.
[18] F. RIESZ and B. NAGY, Functional Analysis (English ed.), New York,
Ungar, 1956.
[19] W. RUDIN, Real and Complex Analysis, New York, McGraw-Hill,
1966.
[20] S. SAKS, Theory of the Integral (Monografje Matematyczne, Vol. 7),
Warsaw, 1937.
[21] H. H. SCHAEFER, Topological Vector Spaces, New York, Macmillan,
1966.
[22] P. C. SUPPES, Introduction to Logic, Princeton, Van Nostrand, 1957.
[23] P. C. SUPPES, Axiomatic Set Theory, Princeton, Van Nostrand, 1960.
[24] C. J. DE LA VALLÉE POUSSIN, Intégrales de Lebesgue, Fonctions
d'ensemble, Classe de Baire, Paris, 1916.
[25] A. ZYGMUND, Trigonometrical Series (Monografje Matematyczne,
Vol. 5), Warsaw, 1935.
Index of Symbols
A&B
A V B
TA
A=B
A <=> B
(x)
(3x)
1
N
xe A
AcB
{x e X: P(x)}
—
0
{x}
{x, y}
-x, Y)
[a, b]
(a, b)
X X Y
A and B
A or B
not A
if A, then B
A if and only if B
for all x
there is an x
Q.E.D.
set of natural numbers
x is an element of A
A is a subset of B
set of x in X with P(x)
empty set
singleton
unordered pair
ordered pair
closed interval
open interval
direct product of X and Y
XX),
direct product
XA
g 0f
direct product
composition
restriction off to A
X
f IA
Page 2
2
2
2
2
2
2
4
6, 33
6
6
6
7
7
7
7
38
38
7, 128, 150, 264
18, 150
339
150
9
9
340
Index of Symbols
(x,) 1
finite sequence
infinite sequence
set of subsets of P
(P(X)
A nB
AuB
21
A AB
n
Ace
A
n {A: A
c e;
R
x V y
XA y
lim
50-1
9s)
1(1)
aft
XA
C[0, 1]
Rn
Sx , a
E
,u 1 y
a.e. [i.t]
1, <<,12
Idyl
Li
a,
/.4 X y
Page 9
10
11
11
11
12
12
intersection
union
complement of A
symmetric difference
1
intersection of A in
e
set of real numbers
max (x, y)
min (x, y)
limit superior
special Borel sets
13
29
32, 47, 237
32, 47, 237
36
50
length of /
52
60
class of Lebesgue measurable sets
68
characteristic function of A
space of continuous functions on [0, 1] 118, 301
Euclidean space
127
spheroid centered at x
129
41, 129
closure of E
236
mutually singular measures
238
except for a set E, with ,uE — 0
238
y absolutely continuous with respect to A
Radon-Nikodym derivative
240
countable union from a
product measure
255
265
Subject Index
Banach space, 116, 182
A
Absolute value of a measure, 236,
242 (36)
Absolutely continuous
function, 104
measures, 238
Absolutely summable, 116, 118
Accumulation point, 43 (27)
Algebra
of functions, 171
of sets, 6
Almost everywhere, 67, 238
Annihilator, 193 (23)
Antisymmetric relation, 23
Archimedes, Axiom of, 33
Ascoli Theorem, 179
Atom, 321
Base for a topology, 145
Bijective, 9
Binary expansion, 38 (21)
Bolzano-Weierstrass property, 160
Boolean algebra, 16, 319
Boolean-T-algebra, 320
Borel equivalence, 324
Borel extension of a measure, 314
Borel field, 17
Borel measurability, 70 (26)
Borel measure, 261, 302
Borel set, 50 (302)
Bounded Convergence Theorem, 81
Bounded linear functional, 119
Bounded operator, 184
Bounded set, 302
Bounded variation, 99
Boundedness, total, 137 (22b), 164
B
C
Baire Category Theorem, 139
Baire measure, 302
regular, 307
Baire sets, 301
Banach limit, 192 (20)
Cantor ternary function, 48 (46),
70 (28), 107 (11)
Cantor ternary set, 44 (36), 63 (14)
Carathéodory extension theorem, 257
Note: Numbers in parentheses indicate a problem number.
343
Subject Index
Carathéodory outer measure, 283
Category, 139
Cauchy criterion, 35
Cauchy sequence, 35, 115, 134, 136
in measure, 93 (25)
Change of variable, 107 (13)
Characteristic function, 68, 75
Convergence Theorem
Choice
Axiom of, 18
function, 18
Closed Graph Theorem, 196
Closed set, 41, 129, 143
Closure, point of, 40, 129, 145
Cluster point, 35, 134, 143
of a net, 155
Collection, 5
Compactification
Alexandroff, 168
Stone-tech, 170
Compactness, 157
countable, 159
local, 168
metric, 163
sequential, 160
Complement, 12
Complete measure, 220
Complete orthonormal system, 212
Complete space, 31, 115, 134
Completely regular space, 148
Complex measures, 241 (36)
Composition, 9
Conjugate space, 190
Connected space, 152
Continuity, absolute, 104, 238
Continuous functions, 44, 132, 144
Contraposition, 3
Convergence, 46, 133, 143
in mean, 115
in measure, 91, 230 (19)
pointwise, 46, 72, 115, 151, 179
(37)
setwise, 231
uniform, 46, 162
Correspondence, one-to-one, 19
Countable compactness, 159
Countable set, 9, 19
Countability axioms, 148
Counting measure, 53 (4)
Cover, measurable, 280 (29)
Covering, 42
Cube, 151, 167 (13)
Cumulative distribution function, 261
Bounded, 81
Lebesgue, 88, 229
Monotone, 84, 227
Convex function, 108
Convex hull, 207
Convex set, 203
D
Daniell
functional, 287
integral, 287
Decimal expansion, 38 (21)
Decomposable measure, 243 (39),
249 (45)
Dense subset, 44 (30)
Derivates, 96
Dimension, 213, 214 (52d)
Dini Theorem, 162
Direct product, 7, 18, 128, 150, 264
Directed system, 155
Discrete topology, 143
Disjoint, 13
Domain, 8
Dual Space, 190
Egoroff's Theorem, 72 (30)
Empty set, 7
Envelopes of a function, 49 (49)
Subject Index
345
Equicontinuous, 177
Equivalence
of metrics, 132
of norms, 182 (2), 195
Equivalence, uniform, 136
Equivalence relation, 22
Extended real numbers, 34
Extended real-valued function, 24
.Extreme point, 206
F
Family, 5
Fatou's Lemma, 83, 226, 231
Field, 30
Finite intersection property, 158
Finite measure, 221
Finite set, 9, 19
First axiom of countability, 146
First category, 139
First element, 23
Fubini Theorem, 269
Function, 8
Cantor ternary, 70 (8)
characteristic, 68, 75
choice, 18
continuous, 44, 132, 144
absolutely continuous, 104
H
Hahn-Banach Theorem, 187
Hahn decomposition, 236
Half-open interval topology, 147 (12)
Hausdorff Maximal Principle, 24
Hausdorff space, 148
Heine-Borel Theorem, 42, 157
Hilbert cube, 151
Hilbert space, 210
Holder Inequality, 113
Homeomorphism, 47 (44), 132, 144
uniform, 136
Homomorphism, lattice, 318
I
Iff, 2
Induction, 6
Inf, 31
Interior, 44 (33), 131 (5)
Internal point, 204
Intersection, 11
Interval, 38
Intrinsic property, 138
Inverse image, 9
Isolated set, 44 (29)
Isometry, 132, 333
Isomorphism, 184, 199, 221
semicontinuous, 48 (48), 161
convex, 108
measurable, 65, 223, 317
set, 52, 217
simple, 69, 75, 225
Functionals, bounded linear, 119
J
Jensen Inequality, 110
Jordan decomposition, 236
K
G
Generalized Cantor set, 63 (146)
Graph, 8, 196
Kernel, measurable, 280 (29)
Kernel (of an operator), 186 (14)
Krein-Milman Theorem, 207
Subject Index
346
LP Spaces, 111, 243
Lamperti Theorem, 333
Lattice, 171, 286, 309
homomorphism, 318
Least element, 23
Lebesgue Convergence Theorem, 88,
229
Lebesgue decomposition, 240
Lebesgue integral, 79
Lebesgue measure, 60
Lebesgue-Stieltjes Integral, 261
Lebesgue's Theorem, 82 (2b)
Length, 52
Limit, 35, 115, 133, 143
Banach, 192 (20)
of a net, 155
superior, 36, 48 (47)
Lindelof Theorem, 130
Linear functional, 119, 186, 246,
304, 310
Linear operator, 184
Linear ordering, 23
Linear space, 111, 181
Linear transformation, 184
Local base, 198
Locally compact spaces, 168
Locally convex space, 205
Locally measurable, 221
Lusin's Theorem, 72 (31)
Manifold, 169-170 (24)
Manifold, linear, 182
Measurable
cover, 280 (29)
function, 65, 223, 295
kernel, 280 (29)
mapping, 317
set, 56, 251, 296
space, 217
Measure, 53, 217
on an algebra, 253
Bake, 302
Borel, 261, 302
complex, 241 (36)
counting, 53
inner, 274
Lebesgue, 60
outer, 54, 250
Carathéodory, 283
product, 296
signed, 233
Measure algebra, 320
separable, 321
Metric, 127
Metric space, 127
Metrizable space, 143
Minimal element, 23
Minkowski Inequality, 114
Monotone
function, 47 (44), 96
sequence, 10, 35
Monotone Convergence Theorem,
84, 227
Monotonicity, 53 (1)
n-tuple, 9
Natural number, 6
Negative variation, 99
Net, 155
Norm, 111, 181, 184
Normal space, 148
Nowhere dense, 139
Null function, 294
Null set, 233
0
One-to-one, 9
correspondence, 19
Onto, 8, 19, 131
Subject Index
Open mapping, 193
theorem, 195
Open set, 38, 129, 142
Operator, linear, 184
Ordered field, 30
Ordered pair, 7
Ordering, 23
Ordinal, 25
Orthogonal, 210
Orthornormal system, 211
Outer measure, 54, 250
Carathéodory, 283
regular, 260 (8)
Partial ordering, 23
Point of closure, 40, 129, 145 (2)
Pointwise convergence, 46, 72, 115,
151, 179 (37)
Polygonal function, 47 (45)
Positive set, 233
Positive variation, 99
Product
direct, 7, 18, 128, 150, 264
measure, 265
topology, 150
Projection, 151
Proper map, 169 (22)
Pseudometric, 129 (3)
Pseudonorm, 183 (10)
Quotient, 22, 183 (11), 320
Radon-Nikodym derivative, 240,
241 (33)
347
Radon-Nikodym Theorem, 238,
242 (37)
Range, 8
Real numbers, 29
Recursive definition, 9
Reductio ad absurdum, 3
Reflexive partial order, 24
Reflexive relation, 22
Reflexive space, 191
Regular Baire measure, 307
Regular outer measure, 260 (8)
Regular space, 148
Relation, 22
Relative properties, 138
Residual set, 139
Restriction
of a function, 9
of a measure, 223 (4)
Riemann integral, 74
Riemann-Lebesgue Theorem, 90 (16)
Riesz-Fischer Theorem, 117
Riesz Representation Theorem, 121,
246, 310
a-algebra, 17, 320
a-bounded set, 302
a-compact, 302
a-finite measure, 220
a-homomorphism, 318
a-ideal, 320
a-isomorphism, 329
a-ring, 222 (9)
Saturated measure, 221
Schwarz inequality, 210
Second axiom of countability, 146
Semialgebra, 259
Semicontinuous function, 48 (48),
161, 162 (4)
Semifinite measure, 220
Separable measure algebra, 321
Subject Index
348
Separable space, 130
Separated, linearly, 204
Separation, 152
Sequence, 9
Sequentially compact, 160
Set, 5
Set function, 52, 217
Set mapping, 317
Setwise convergent, 231
Signed measure, 233
Simple function, 69, 75, 225
Singleton, 7
Singular function, 107 (12)
Singular measures, 238
Sphere, 129 (2)
Spheroid, 129 (2)
Step function, 49 (48f), 74
Stone representation theorem, 297
Stone-Weierstrass Theorem, 174,
313 (16)
Strict partial order, 24
Strong topology, 201
Stronger topology, 144
Subadditivity, 53 (2)
Subsequence (10)
Subspace, 137, 143, 182
Summable sequence, 37 (17), 116
Sup, 31
Support (of a measure), 308 (11)
Surjective, 8
Symmetric difference, 12
Symmetric relation, 22
System, directed, 155
Tonelli Theorem, 270
Topological property, 132
Topological space, 142
Topological vector space, 197
Topology, 142
Topology of pointwise convergence,
151
Total boundedness, 137, 164
Total variation
of a function, 99
of a measure, 236, 242 (36)
Transitive relation, 22
Translation invariance, 52, 63, 198
Trivial topology, 143
Tychonoff Theorem, 166, 199
U
Uncountable set, 21 (23)
Uniform boundedness principle,
140, 196
Uniform continuity, 135
Uniform convergence, 46
Uniform homeomorphism, 136
Uniform properties, 136, 198
Uniformly convex, 334 (23)
Uniformly equivalent, 136
Union, 11
Unordered pair, 7
Upper and lower envelopes of a
function, 49 (49)
Upper bound, 31
Ursohn's Lemma, 148
T
V
Th T2, T3, T4,
147
Ternary
expansion, 38 (21)
function, 48 (46), 107 (11)
set, 44 (36), 63 (14)
Tietze's Extension Theorem, 148
Variation, total, 99
Vector lattice, 286
Vector space, 111, 181
topological, 197
Vitali covering, 94, 95
349
Subject Index
W
Weak topology, 148, 150 (20), 200,
201
Weak* topology, 201
Weaker topology, 144
Weakest topology, 147 (10)
Well ordering, 24
Download