Texts and Readings in Mathematics 77
S. Kesavan
Measure and
Integration
Texts and Readings in Mathematics
Volume 77
Advisory Editor
C. S. Seshadri, Chennai Mathematical Institute, Chennai
Managing Editor
Rajendra Bhatia, Ashoka University, Sonepat
Editors
Manindra Agrawal, Indian Institute of Technology, Kanpur
V. Balaji, Chennai Mathematical Institute, Chennai
R. B. Bapat, Indian Statistical Institute, New Delhi
V. S. Borkar, Indian Institute of Technology, Mumbai
Apoorva Khare, Indian Institute of Sciences, Bangalore
T. R. Ramadas, Chennai Mathematical Institute, Chennai
V. Srinivas, Tata Institute of Fundamental Research, Mumbai
Technical Editor
P. Vanchinathan, Vellore Institute of Technology, Chennai
The Texts and Readings in Mathematics series publishes high-quality textbooks,
research-level monographs, lecture notes and contributed volumes. Undergraduate
and graduate students of mathematics, research scholars, and teachers would find
this book series useful. The volumes are carefully written as teaching aids and
highlight characteristic features of the theory. The books in this series are
co-published with Hindustan Book Agency, New Delhi, India.
More information about this series at http://www.springer.com/series/15141
S. Kesavan (emeritus)
Measure and Integration
S. Kesavan (emeritus)
Institute of Mathematical Sciences
Chennai, Tamil Nadu, India
ISSN 2366-8725 (electronic)
Texts and Readings in Mathematics
ISBN 978-981-13-6678-9 (eBook)
https://doi.org/10.1007/978-981-13-6678-9
Library of Congress Control Number: 2019932597
This work is a co-publication with Hindustan Book Agency, New Delhi, licensed for sale in all countries
in electronic form only. Sold and distributed in print across the world by Hindustan Book Agency, P-19
Green Park Extension, New Delhi 110016, India. ISBN: 978-93-86279-77-4 © Hindustan Book Agency
2019.
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
This work is subject to copyright. All rights are reserved by the Publishers, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publishers, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publishers nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Dedicated to
Professor Philippe G. Ciarlet,
to whom I owe more than I can
possibly express,
on the occasion of his eightieth
birthday.
Preface
A course on the theory of measure and the Lebesgue integral is now
an essential component in any masters or graduate programme in mathematics in Indian universities. It is part of the training a student receives
in analysis. The most interesting examples of Banach spaces are function spaces of various kinds and the Lebesgue spaces, also known as L^p
spaces, are amongst the most important of these. A knowledge of the
theory of measure and integration is essential for the study of several
advanced topics in functional analysis like the theory of distributions
and Sobolev spaces, which constitute the functional analytic framework
for the modern study of partial differential equations. Of course, the
theory of measure and integration is vital to the study of probability
and stochastic processes.
This book grew out of the notes I prepared for lectures on measure
theory and the theory of integration. These lectures were delivered,
over the past four decades, to masters and graduate students in several leading institutions like the Centre for Applicable Mathematics,
Tata Institute of Fundamental Research, Bangalore, The Institute of
Mathematical Sciences, Chennai, the Chennai Mathematical Institute,
Siruseri, and the Indian Institute of Technology, Madras. Portions of
the book were also taught at numerous refresher or summer courses.
In particular, it was taught by me at several refresher courses at the
Ramanujan Institute for Advanced Study in Mathematics, of the University of Madras, Chennai. I am indeed thankful to these institutions
and organizers of refresher courses for having given me the opportunity
to deliver these lectures.
The book starts with a preamble, where the Riemann integral is
briefly discussed. Some of the shortcomings of this theory of integration
motivate the need to develop the theory of measure and the Lebesgue
integral.
Chapter 1 develops the abstract theory of a measure defined over
classes of subsets of a non-empty set, like rings, σ-rings and σ-algebras.
The extension of a measure from a smaller class (like a ring) to a larger
class (typically, a σ-algebra), is done via the method of Carathéodory,
using outer-measures. The completion of a measure is also discussed.
Chapter 2 is devoted to the construction and the study of the important properties of the Lebesgue measure on the euclidean space R^N.
Chapter 3 studies important properties of measurable functions.
Chapter 4 introduces various notions of convergence like pointwise
convergence, almost uniform convergence and convergence in measure
and studies their inter-relationships.
Chapter 5 is the core of this book. It develops the theory of the
Lebesgue integral and proves the important limit theorems. It also compares the Riemann and Lebesgue integrals on the real line.
Chapter 6 is devoted to the fundamental theorem of calculus, viz.
the relationship between differentiation and integration. Various classes
of functions, which are differentiable almost everywhere, are studied and
the relationship between the integrand and the derivative of its indefinite integral is explored.
Chapter 7 is devoted to the change of variable formula, viz. the effect
on the integral under the action of transformations of the domain.
Chapter 8 studies product spaces and Fubini’s theorem is proved.
Polar coordinates in R^N are discussed.
Chapter 9 concerns signed measures and the main result of this chapter is the Radon-Nikodym theorem.
Finally, Chapter 10 studies L^p spaces. Density theorems and duality
are discussed. The notion of the convolution product is introduced.
Most of the material in this book can be covered in a one semester
introductory course. The pre-requisite for following this book is familiarity with basic real analysis and elementary topological notions, with
special emphasis on the topology of the euclidean space R^N. The instructor may omit certain sections or results if (s)he feels it may be too
heavy for the students taking the course. Each chapter is provided with
a variety of exercises, which, it is hoped, the students will try to solve.
No originality is claimed regarding the contents and the presentation
of the material in this book. I have learnt from, and have been influenced by, many earlier works on this topic, especially those of Halmos,
Royden and Rudin, to mention a few. These appear in the bibliographic
references.
Since this book is meant to serve as a text book for an introductory
course, I have kept the bibliographic references to a minimum.
I wish to thank the Director of the Institute of Mathematical Sciences
for the excellent facilities accorded to me during the preparation of this
work. I also wish to thank Prof. R. Bhatia, Managing Editor of the
TRIM Series, and Shri J. K. Jain of the Hindustan Book Agency, for
their support. I thank the anonymous referee who went through the entire manuscript with such great care and pointed out several misprints
and other slips. Eliminating these has certainly made the book much
better. Finally, I wish to thank several students of the Indian Institute
of Technology, Madras, who followed my lectures and made my sojourn
there as Visiting (and then Adjunct) Professor a very enjoyable experience. In particular, I wish to mention Ashok Kumar and Nirjan Biswas.
Chennai
November, 2018
S. Kesavan
Notations
Certain general conventions followed throughout the text regarding
notations are described below. All other specific notations are explained
as and when they appear in the text.
• The set of natural numbers {1, 2, 3, · · ·}, is denoted by the symbol
N, the integers by Z, the rationals by Q, the reals by R and the
complex numbers by C.
• If A and B are two sets, then by A ⊂ B, we mean that every
element of A is also an element of B, i.e. A is a subset of B. The
inclusion is not necessarily strict.
• If X is a non-empty set and A ⊂ X, then A^c denotes the complement of A in X, i.e. the set of elements in X which do not belong
to A.
• The empty set is denoted by the symbol ∅.
• The union and intersection of sets are denoted using the usual
symbols ∪ and ∩ respectively.
• If X is a non-empty set and if A and B are subsets of X, then
A\B = A ∩ B^c, and A∆B = (A\B) ∪ (B\A).
• If a, b ∈ R ∪ {±∞}, then
(a, b) = {x ∈ R | a < x < b},
[a, b] = {x ∈ R | a ≤ x ≤ b},
[a, b) = {x ∈ R | a ≤ x < b},
(a, b] = {x ∈ R | a < x ≤ b}.
• The symbol R^N, N ∈ N, stands for the N-dimensional euclidean
space. If x = (x_1, · · · , x_N) ∈ R^N, then
\[ |x| = \left( \sum_{i=1}^{N} |x_i|^2 \right)^{1/2}. \]
Contents
Preamble
1 Measure
1.1 Algebras of sets
1.2 Measures on rings
1.3 Outer-measure and measurable sets
1.4 Completion of a measure
1.5 Exercises
2 The Lebesgue measure
2.1 Construction of the Lebesgue measure
2.2 Approximation
2.3 Translation invariance
2.4 Non-measurable sets
2.5 Exercises
3 Measurable functions
3.1 Basic properties
3.2 The Cantor function
3.3 Almost everywhere
3.4 Exercises
4 Convergence
4.1 Egorov’s theorem
4.2 Convergence in measure
4.3 Exercises
5 Integration
5.1 Non-negative simple functions
5.2 Non-negative functions
5.3 Integrable functions
5.4 The Riemann and Lebesgue integrals
5.5 Weierstrass’ theorem
5.6 Probability
5.7 Exercises
6 Differentiation
6.1 Monotonic functions
6.2 Functions of bounded variation
6.3 Differentiation of an indefinite integral
6.4 Absolute Continuity
6.5 Exercises
7 Change of variable
7.1 The Fréchet derivative
7.2 Sard’s theorem
7.3 Diffeomorphisms
8 Product spaces
8.1 Measurability in the product space
8.2 The product measure
8.3 Fubini’s theorem
8.4 Polar coordinates in R^N
8.5 Exercises
9 Signed measures
9.1 Hahn and Jordan decompositions
9.2 Absolute continuity
9.3 The Radon-Nikodym theorem
9.4 Singularity
9.5 Exercises
10 L^p spaces
10.1 Basic properties
10.2 Approximation
10.3 Some applications
10.4 Duality
10.5 Convolutions
10.6 Exercises
Bibliography
About the Author
S. KESAVAN is former professor at the Institute of Mathematical
Sciences, Chennai, and adjunct faculty at the Indian Institute of
Technology Madras, Chennai, India. He started his career at the Tata
Institute of Fundamental Research Centre for Applicable Mathematics
(TIFR-CAM), Bangalore, India, in 1973. He has also been associated
with the Chennai Mathematical Institute, where he was deputy director
during 2007–2010 and he is currently adjunct professor at the Indian
Institute of Technology, Madras. He earned his PhD from Université
Pierre-et-Marie-Curie, Paris, in 1979. He is a fellow of the National
Academy of Sciences, India, at Allahabad and the Indian Academy of
Sciences, Bangalore, India. He is a life member of the National Board
for Higher Mathematics, since 2000, Indian Mathematical Society,
International Society for the Interaction of Mechanics and Mathematics
(ISIMM), Indian Society of Industrial and Applicable Mathematics
(ISIAM), Ramanujan Mathematical Society, American Mathematical
Society and an elected fellow of the Forum d’Analystes, Chennai.
He was also the Secretary (Grant Selection) of the Commission for
Developing Countries of the International Mathematical Union, during
2011–2014 and 2015–2018. He has published four books and authored
over 50 research articles, apart from several contributions to conference
proceedings and popular articles. His research interests are in partial
differential equations, homogenization, control theory, and isoperimetric
inequalities.
Preamble
From the time of the Greeks, the problem of computing the area
enclosed by a curve had been exercising the minds of scientific thinkers.
This crucial question, at the base of the theory of integral calculus, was
treated as early as the third century B.C. by Archimedes, who calculated the area of a circular disc, the area of a segment of a parabola
and other such figures. He used the ‘method of exhaustion’. The basic
idea was to exhaust the given area by a sequence of polygonal domains
and calculate the area as the limit of the area of the inscribed polygons.
During the seventeenth century, many such areas were calculated and in
each case the problem was solved by an ingenious device specially suited
for the case in hand. One of the achievements of calculus was to develop
a general and powerful method to replace these special restricted procedures.
From the time of Archimedes until the time of Gauss, the attitude
was that the area was an intuitively obvious entity which need not be
defined, but which had to be computed. Before Cauchy, there was no
definition of the integral in the precise sense of the term. One was often
limited to saying which areas one had to add, or subtract, to get the
integral.
Cauchy, with his concern for rigour, which is characteristic of modern
mathematics, defined continuous functions and their integrals in much
the same way as we do now. To arrive at the integral of a continuous
function f defined on an interval [a, b] of the real line, he looked at sums
of the form
\[ S = \sum_{i} f(\xi_i)(x_{i+1} - x_i) \]
where a = x_0 < x_1 < ... < x_i < x_{i+1} < ... < x_N = b is a partition of
[a, b] and ξ_i ∈ [x_i, x_{i+1}]. He then deduced the value of the integral
\[ \int_a^b f(x)\,dx \]
by a suitable passage to the limit.
For a long time, certain discontinuous functions were integrated by
showing that Cauchy’s definition still applied to these integrals. It was
Riemann who systematically investigated the exact scope of this definition.
In what follows, we will briefly recall the salient features of the Riemann integral and examine its principal drawbacks, which motivate the study of Lebesgue’s theory of measure and integration.
The Riemann Integral
Let [a, b] ⊂ R be a finite interval and let f : [a, b] → R be a bounded
function. Let P = {a = x_0 < x_1 < ... < x_N = b} be a partition of the
interval. Set
\[ m_i = \inf_{x \in [x_{i-1}, x_i]} f(x) \quad \text{and} \quad M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \quad \text{for } 1 \le i \le N. \]
Then, we define the lower and upper (Darboux) sums associated to the
function f and the partition P by
\[ L(P, f) = \sum_{i=1}^{N} m_i (x_i - x_{i-1}), \qquad U(P, f) = \sum_{i=1}^{N} M_i (x_i - x_{i-1}). \]
Then, we define the lower and upper integrals of f by
\[ \underline{\int_a^b} f(x)\,dx = \sup_{P} L(P, f), \qquad \overline{\int_a^b} f(x)\,dx = \inf_{P} U(P, f), \]
where the supremum and infimum are taken over all possible partitions
of [a, b]. The function f is said to be Riemann integrable over [a, b]
if its lower and upper integrals are equal and the common value, called
the Riemann integral of f over [a, b], is denoted by the symbol
\[ \int_a^b f(x)\,dx. \]
Since f is bounded, there exist constants m and M such that m ≤ f(x) ≤ M for all x ∈ [a, b], and it
is immediate to see that
m(b − a) ≤ L(P, f ) ≤ U (P, f ) ≤ M (b − a)
for all partitions P. Thus, the lower and upper integrals of f always
exist but the question of their being equal is a delicate one.
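The sandwich between lower and upper sums can be observed numerically. The following is a minimal sketch, not taken from the text: it assumes an increasing function, so that the infimum and supremum on each subinterval are attained at the left and right endpoints, and it uses f(x) = x^2 on [0, 1] with uniform partitions as an illustrative choice. The helper names are hypothetical.

```python
from fractions import Fraction


def darboux_sums_increasing(f, partition):
    """Lower and upper Darboux sums of an increasing function f.

    Assumption: f is increasing, so on [x_{i-1}, x_i] the infimum is
    f(x_{i-1}) and the supremum is f(x_i).
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(lo) * (hi - lo) for lo, hi in pairs)
    upper = sum(f(hi) * (hi - lo) for lo, hi in pairs)
    return lower, upper


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length (exact rationals)."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * Fraction(i, n) for i in range(n + 1)]


def square(x):
    """Illustrative integrand, increasing on [0, 1]."""
    return x * x


for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(square, uniform_partition(0, 1, n))
    # The gap upper - lower shrinks as the partition is refined.
    print(n, float(lower), float(upper), float(upper - lower))
```

In this particular case the gap U(P, f) − L(P, f) equals 1/n, so the lower and upper integrals coincide; for a general bounded function no such control is available, which is the delicate point mentioned above.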
Given a partition P as above, we set
\[ \mu(P) = \max_{1 \le i \le N} (x_i - x_{i-1}). \]
Let t_i ∈ [x_{i-1}, x_i] for 1 ≤ i ≤ N. Denote
\[ S(P, f) = \sum_{i=1}^{N} f(t_i)(x_i - x_{i-1}). \]
The above notation is incomplete. The sum S(P, f ) depends not
only on the partition P and the function f , but also on the choice of the
points ti . But in order to avoid cumbersome notation, we will leave it
as it is.
Definition We say that
\[ \lim_{\mu(P) \to 0} S(P, f) = A \]
if, for every ε > 0, there exists a δ > 0 such that, for all partitions P
such that µ(P) < δ, and for all choices of points t_i compatible with the
partition, we have
\[ |S(P, f) - A| < \varepsilon. \]
Theorem 1 (cf. Rudin [7])
The function f is Riemann integrable, if and only if, the limit defined
in the above definition exists and, in this case,
\[ \int_a^b f(x)\,dx = \lim_{\mu(P) \to 0} S(P, f). \]
Thus, we see that the requirement that a function be Riemann integrable is a very strong one. We have the following result.
Theorem 2 (cf. Rudin [7])
If f is continuous, or if f has at most a countable number of discontinuities, then f is Riemann integrable.
Example Let us consider the unit interval [0, 1]. Let us choose some
numbering of all the rational numbers in this interval and write them as
r_1, r_2, .... Define
\[ f_n(x) = \begin{cases} 1, & \text{if } x = r_1, r_2, \ldots, r_n, \\ 0, & \text{otherwise.} \end{cases} \]
The function fn is discontinuous only at the points r1 , ..., rn which are
finite in number and so, by the previous theorem, fn is Riemann integrable. In fact, it is a simple exercise to check this fact directly using the
definition of Riemann integrability and show that the integral is equal
to zero.
Let us now consider the function f(x) = lim_{n→∞} f_n(x). It is easy to
see that
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational,} \\ 0, & \text{if } x \text{ is irrational.} \end{cases} \]
This function is discontinuous everywhere. Given any partition P, it
is easy to see that mi = 0 and that Mi = 1 for all 1 ≤ i ≤ N . Thus
L(P, f ) = 0 and U (P, f ) = 1. Thus the lower integral is zero while the
upper integral is unity and so f fails to be Riemann integrable.
This brings us to a major drawback of the Riemann integral. The
limit of a sequence of Riemann integrable functions need not be Riemann integrable. Even if the limit is a Riemann integrable function,
the limit of the integrals need not be the integral of the limit, as the
following example shows.
Example Let f_n(x) = n^2 x(1 − x^2)^n for x ∈ [0, 1]. Then f_n(x) → 0 as
n → ∞ (why?). Now,
\[ \int_0^1 x(1 - x^2)^n\,dx = \frac{1}{2n + 2}. \]
Thus,
\[ \int_0^1 f_n(x)\,dx = \frac{n^2}{2n + 2} \to \infty \]
while, since f ≡ 0, we have ∫_0^1 f(x) dx = 0. Similarly, if we define
\[ f_n(x) = n x (1 - x^2)^n, \]
again f_n → f ≡ 0 pointwise but ∫_0^1 f_n(x) dx → 1/2 ≠ 0.
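The failure of the two limits to commute in this example can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the closed form 1/(2n + 2) is the one quoted in the example, and a simple midpoint rule is used as an independent check on it.

```python
from fractions import Fraction


def fn_value(n, power_of_n, x):
    """Value of n**power_of_n * x * (1 - x**2)**n at the point x."""
    return n ** power_of_n * x * (1 - x * x) ** n


def exact_integral(n, power_of_n):
    """Integral over [0, 1], using the closed form 1/(2n + 2) from the example."""
    return Fraction(n ** power_of_n, 2 * n + 2)


def midpoint_rule(g, a, b, steps=20000):
    """Plain midpoint quadrature, used only to cross-check the closed form."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


for n in (5, 50, 500):
    print(
        n,
        fn_value(n, 2, 0.5),              # pointwise value at x = 1/2, tends to 0
        float(exact_integral(n, 2)),      # integral of n^2 x (1 - x^2)^n, grows without bound
        float(exact_integral(n, 1)),      # integral of n x (1 - x^2)^n, tends to 1/2
    )

# Cross-check the closed form against quadrature for n = 5.
print(midpoint_rule(lambda x: fn_value(5, 2, x), 0.0, 1.0), float(exact_integral(5, 2)))
```

The pointwise values collapse to zero while the integrals either blow up or settle at 1/2, so neither sequence of integrals approaches the integral of the limit function.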
So, when do the two limit processes - the pointwise limit of functions
and Riemann integration (which has been defined as a limit of sums as
shown in Theorem 1) - commute?
Definition We say that fn → f uniformly on [a, b] if, for every ε > 0,
there exists a positive integer N such that, for all x ∈ [a, b] and for all
n ≥ N , we have
|fn (x) − f (x)| < ε.
Theorem 3 (cf. Rudin [7])
If fn → f uniformly on [a, b], and if all the fn are Riemann integrable,
then f is Riemann integrable and, further,
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]
In the preceding example, the sequence {fn } failed to converge uniformly. In fact, the non-commutativity of the operations of taking the
pointwise limit and the Riemann integral is a useful test to prove that
a sequence of functions is not uniformly convergent.
Thus, a sequence of functions which does not converge uniformly
may converge to a function which is not integrable or it can happen
that the limit function is Riemann integrable but the limit of the integrals is not the integral of the limit function. But uniform convergence
is a very strong condition as well.
We thus feel the need for a theory of integration, wherein a larger
class of functions is integrable and such that the process of taking pointwise limits of functions commutes with the process of integration under
fairly easily verifiable hypotheses. This is where the alternative approach
of Lebesgue comes in useful.
The way the Riemann integral is defined, a certain amount of continuity is forced on integrable functions. As we saw in Theorem 1, if f is
Riemann integrable, then, for all admissible choices of points ti , the value
of S(P, f ) cannot vary too much, since the limit exists as µ(P) → 0.
Thus, nearby points must have nearby values ‘to a large extent’ and this
is what Theorem 2 is all about. We can excuse a countable number of
discontinuities. But the function which takes the value 1 on the rationals
and the value 0 on irrationals is discontinuous everywhere and it fails to
be Riemann integrable.
The idea of Riemann in formulating the definition of the integral is
to consider the function following the abscissa. We take the values of the
function as we proceed along the x-axis. Thus, we are forced to consider
and compare the values of the function at nearby points and hence we
are dependent on some amount of continuity.
The idea of Lebesgue is to work, not from the domain, but from the
range of a function. We take a particular value and consider the set of
all points where this value is assumed when we define the integral. Let
us illustrate this via an example.
Example Let P be a partition of the interval [a, b]. Let
\[ f(x) = \sum_{i=1}^{N} \alpha_i \chi_{E_i}(x) \]
where E_i = [x_{i-1}, x_i] and, for any subset A of R,
\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]
This function has a finite number of discontinuities and the Riemann
integral is easily seen (Exercise!) to be
\[ \int_a^b f(x)\,dx = \sum_{i=1}^{N} \alpha_i (x_i - x_{i-1}). \]
By Lebesgue’s method, we will be looking at sets of the form E_α = {x ∈
[a, b] | f(x) = α} for each α ∈ R and multiply α by the ‘length’ of the
set E_α and ‘add’ all these products. In our example, E_α = ∅ if α ≠ α_i
for any 1 ≤ i ≤ N and E_{α_i} = E_i. Thus the (Lebesgue) integral is given
again by the same expression as the Riemann integral, in this case.
Imagine a merchant in a shop wanting to add all the money he has
collected from sales during a particular day. He has two methods. First,
he can take the money one at a time from the till and add the amounts
as he takes them out. The other is for him to sort out all the money
according to each denomination, count the number of coins or notes in
each denomination, multiply the number by the value of the denomination and add all these products. Both procedures will yield the same
result, but the latter is more efficient, especially if it involves large quantities of money (have you seen how they count the Hundi collections in a
large temple, say, Tirumala?). The approach of Riemann is like the first
method where we take a function as it comes along the x-axis, while the
approach of Lebesgue is like the second, where we sort it out according
to the values in the range. Obviously, this does not say anything about
the values of nearby points, and so, hopefully, will not depend on the
continuity of the function.
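The merchant's two methods can be written out as a small sketch. The list of takings below is a hypothetical example, and the function names are illustrative only; the point is that the second function never looks at the order in which the amounts arrive, only at how often each value occurs.

```python
from collections import Counter


def total_in_order(takings):
    """First method: add the amounts one at a time, in the order they come out of the till."""
    total = 0
    for amount in takings:
        total += amount
    return total


def total_by_denomination(takings):
    """Second method: sort by value, count each denomination, then add value * count."""
    counts = Counter(takings)
    return sum(value * count for value, count in counts.items())


# Hypothetical day's sales, listed in the order the money was received.
takings = [10, 50, 10, 100, 5, 50, 10, 5, 100, 10]

print(total_in_order(takings))
print(total_by_denomination(takings))
```

Here the count of each denomination plays the role of the ‘length’ of the set on which a given value is assumed.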
The Riemann integral approximates a function by another of the
form
\[ \sum_{i=1}^{N} f(t_i) \chi_{E_i}(x) \]
where t_i ∈ E_i and P = {E_i | 1 ≤ i ≤ N} is a partition of [a, b] and
passes to the limit in sums of the form S(P, f).
The Lebesgue integral approximates a function by one of the form
\[ \sum_{i=1}^{N} \alpha_i \chi_{A_i}(x) \]
where A_i, 1 ≤ i ≤ N, are ‘more general’ sets than just intervals. It then
defines the integral of the simpler function by
\[ \sum_{i=1}^{N} \alpha_i \mu(A_i) \]
where µ(A) is the ‘length’ of the set A, and then passes to the limit
suitably to get the integral of f.
Here is the catch. What do we mean by the ‘length’ of a set A which
is not an interval? This brings us to the theory of measures which will
generalize the notion of length (area or volume, in higher dimensions)
to a fairly large class of sets.
Chapter 1
Measure
1.1 Algebras of sets
Throughout this section X will stand for a non-empty set. We will define
various classes of subsets of X. The power set of X, i.e. the collection
of all subsets of X, will be denoted by P(X).
Definition 1.1.1 A non-empty collection R of subsets of X is called a
ring if it is closed under the formation of unions and differences, i.e. if
E and F are members of R, then so are E ∪ F and E\F . A ring is said
to be an algebra if, in addition, X itself is a member of R.
Remark 1.1.1 By induction, it is clear that every ring is closed under
the formation of finite unions.
Remark 1.1.2 The empty set always belongs to any ring since, if E is
a member, so is ∅ = E\E.
Remark 1.1.3 If R is a ring and if E, F ∈ R, then
\[ \begin{aligned} E \Delta F &= (E \setminus F) \cup (F \setminus E) \in R, \\ E \cap F &= E \setminus (E \setminus F) \in R. \end{aligned} \]
Thus, a ring is closed under the formation of symmetric differences and
intersections as well.
Remark 1.1.4 Conversely, if a non-empty collection of subsets R is
closed under the formation of unions and symmetric differences, then it
is a ring. Indeed, if E, F ∈ R, we have, by hypothesis, that E ∪ F ∈ R
and, further,
E\F = (E ∪ F )∆F ∈ R.
Similarly, if R is closed under the formation of symmetric differences
and intersections, then also it is a ring, for
\[ \begin{aligned} E \cup F &= (E \Delta F) \Delta (E \cap F) \in R, \\ E \setminus F &= (E \cup F) \Delta F \in R. \end{aligned} \]
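The set identities used in Remarks 1.1.3 and 1.1.4 can be sanity-checked by brute force on a small universe. The following sketch is only a finite check, not a proof; the four-element universe and the function names are illustrative assumptions. Python's `-`, `|`, `&` and `^` on sets correspond to difference, union, intersection and symmetric difference.

```python
from itertools import combinations


def all_subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


def check_ring_identities(universe):
    """Verify the identities for every ordered pair (E, F) of subsets.

    Returns the number of pairs checked; raises AssertionError on failure.
    """
    subsets = all_subsets(universe)
    for E in subsets:
        for F in subsets:
            assert E & F == E - (E - F)          # intersection from differences
            assert E - F == (E | F) ^ F          # difference from union and symmetric difference
            assert E | F == (E ^ F) ^ (E & F)    # union from symmetric difference and intersection
    return len(subsets) ** 2


print(check_ring_identities(range(4)))
```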
Remark 1.1.5 If R is an algebra, then it is closed under complementation since E ∈ R implies that E^c = X\E ∈ R. Conversely, if a non-empty collection of subsets R is closed under the formation of unions
and complementation, it is an algebra. To see this, notice that if E ∈ R,
then E^c ∈ R and so X = E ∪ E^c ∈ R. Further, if E, F ∈ R, then
E\F = E ∩ F^c = (E^c ∪ F)^c ∈ R.
Example 1.1.1 The collections R = {∅} and R = P(X) are trivial
examples of rings for any non-empty set X.
Example 1.1.2 Let X = Z, the set of all integers. Define
R = {A ⊂ Z | A is a non-empty finite set, or A = ∅}.
Then R defines a ring.
The next example is one which we will deal with in detail in this book
since it will be the starting point of the construction of the Lebesgue
measure.
Example 1.1.3 Let X = R, the real line. Define
P = {[a, b) | a, b ∈ R, a ≤ b}
where [a, b) = {x ∈ R | a ≤ x < b} with the convention that this stands
for the empty set if a = b. Define R to be the collection of all finite
unions of members of P. Then R is a ring. To see this, first of all, R
is closed under the formation of finite unions, by definition. Further,
[a, b)\[c, d) will be [a, b) if the two intervals are disjoint, or empty if
[a, b) ⊂ [c, d). If a < c < b ≤ d, then [a, b)\[c, d) = [a, c) and if
a ≤ c < d < b, we have
[a, b)\[c, d) = [a, c) ∪ [d, b) ∈ R.
Finally, if c < a < d < b, we have
[a, b)\[c, d) = [d, b).
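The case analysis for [a, b)\[c, d) can be condensed into a short routine. This is a minimal sketch under the convention of the example that [a, b) is empty when a = b; the function name and the representation of an interval as a pair (lo, hi) are assumptions made for illustration.

```python
def diff_half_open(a, b, c, d):
    """Return [a, b) \\ [c, d) as a list of disjoint half-open intervals (lo, hi).

    The result has at most two pieces, matching the cases in the example.
    """
    if a >= b:
        return []                # [a, b) is empty
    if c >= d:
        return [(a, b)]          # nothing is removed
    pieces = []
    left_hi = min(b, c)          # part of [a, b) lying to the left of c
    if a < left_hi:
        pieces.append((a, left_hi))
    right_lo = max(a, d)         # part of [a, b) lying at or to the right of d
    if right_lo < b:
        pieces.append((right_lo, b))
    return pieces


print(diff_half_open(0, 10, 3, 5))   # middle removed: two pieces
print(diff_half_open(0, 5, 3, 8))    # right end removed: one piece
print(diff_half_open(4, 9, 1, 6))    # left end removed: one piece
print(diff_half_open(2, 3, 0, 5))    # contained in [c, d): empty
```

Since every case returns a finite list of members of P, the difference of two members of P is a finite union of members of P.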
From this it easily follows that R is closed under the formation of differences as well. It is also clear that each member of R can, in fact, be
written as a finite disjoint union of members of P.
Definition 1.1.2 A non-empty collection S of subsets of a non-empty
set X is said to be a σ-ring if it is closed under the formation of
differences and countable unions. In other words, if E, F ∈ S, then
E\F ∈ S and if {E_i}_{i=1}^∞ is a countable collection of members of S, then
∪_{i=1}^∞ E_i ∈ S. A σ-ring S is called a σ-algebra if, in addition, X ∈ S.
Remark 1.1.6 Thus, a σ-ring is a ring which is closed under the formation of countable unions. If {E_i}_{i=1}^∞ is a countable collection in a σ-ring
S, then
\[ \bigcap_{i=1}^{\infty} E_i = E \setminus \bigcup_{i=1}^{\infty} (E \setminus E_i) \in S, \]
where E = ∪_{i=1}^∞ E_i. Thus, a σ-ring is closed under the formation of
countable intersections as well. It is also easy to see that a σ-algebra
can be described as a non-empty collection of subsets which is closed
under the formation of countable unions and complementation.
Let X be a non-empty set and let E be a non-empty collection of subsets of X. Clearly, the power set of X, i.e. P(X), is a ring (respectively,
σ-ring) containing E. Now, it is immediate to see that the intersection of
a collection of rings (respectively, σ-rings) is again a ring (respectively, a
σ-ring). Consequently, there exists a smallest ring (respectively, σ-ring)
containing E. This is called the ring (respectively, σ-ring) generated by
E and is denoted by R(E) (respectively, S(E)).
The collection of all sets in X which can be covered by finite (respectively, countable) unions of members of E is clearly a ring (respectively, a
σ-ring) containing E. Thus, every member of R(E) (respectively, S(E))
can be covered by a finite (respectively, countable) union of members of
E.
1.2 Measures on rings
Let X be a non-empty set and let R be a ring of subsets of X.
Definition 1.2.1 A measure, µ, on the ring R, is an extended real-valued function on R such that
(i) µ(E) ≥ 0, for all E ∈ R,
(ii) µ(∅) = 0, and,
(iii) µ is countably additive, i.e. if {E_i}_{i=1}^∞ is a sequence of pairwise
disjoint sets in R such that E = ∪_{i=1}^∞ E_i ∈ R, then
\[ \mu(E) = \sum_{i=1}^{\infty} \mu(E_i). \tag{1.2.1} \]
Remark 1.2.1 Since µ is an extended real-valued function on R, it is
possible that µ(E) = +∞ for some E ∈ R. If there exists at least one
E ∈ R such that µ(E) < +∞, then (ii) in the above definition will
follow as a consequence of (iii), since we can write
E = E ∪ ∅ ∪ ∅ ∪ ∅ · · ·
so that µ(E) = µ(E) + µ(∅) + µ(∅) + · · ·, which forces µ(∅) = 0 because µ(E) is finite.
Remark 1.2.2 Since µ(E_i) ≥ 0 for all i, the order of the summands in
(1.2.1) is unimportant.
Remark 1.2.3 A measure is always finitely additive as well. If
{E_i}_{i=1}^N is a finite collection of mutually disjoint sets whose union is E
(which will be automatically in R), then since
E = ∪_{i=1}^N E_i ∪ ∅ ∪ ∅ ∪ ∅ · · · ,
we have
\[ \mu(E) = \sum_{i=1}^{N} \mu(E_i). \]
Example 1.2.1 Let X be any non-empty set and let R = P(X). If E ⊂ X, define
\[
\mu(E) =
\begin{cases}
0, & \text{if } E = \emptyset,\\
\text{number of elements in } E, & \text{if } E \text{ is a finite non-empty set},\\
+\infty, & \text{otherwise.}
\end{cases}
\]
We need to check only the countable additivity. Let $\{E_i\}_{i=1}^{\infty}$ be a collection of mutually disjoint subsets whose union is E. If E is a finite set, then at most finitely many of the $E_i$ will be non-empty and they will also be finite sets. Then (1.2.1) is obviously true. If E is an infinite set, then either at least one of the $E_i$ is an infinite set, or there are infinitely many $E_i$ which are non-empty finite sets. In either case, both sides of (1.2.1) take the value +∞ and so countable additivity is established.
This measure is called the counting measure on the set X.
Example 1.2.2 Let X and R be as in the previous example. Let $x_0 \in X$ be fixed. Let E ⊂ X. Define
\[
\mu(E) =
\begin{cases}
1, & \text{if } x_0 \in E,\\
0, & \text{if } x_0 \notin E.
\end{cases}
\]
Again, we need only to check the countable additivity. Let $E = \cup_{i=1}^{\infty} E_i$ be the union of a sequence of mutually disjoint sets in X. If $x_0 \in E$, then $x_0 \in E_{i_0}$ for exactly one index $i_0$. Consequently, both sides of (1.2.1) will be unity. If $x_0 \notin E$, then $x_0 \notin E_i$ for every index i and so both sides of (1.2.1) are zero in this case.
This measure is called the Dirac measure concentrated at the point $x_0$.
Example 1.2.3 Let X be any non-empty set. Let R be the ring of finite subsets of X. Let $f : X \to \mathbb{R}$ be a given non-negative real-valued function. Define µ to be zero on the empty set and set
\[
\mu(\{x_1, \cdots, x_n\}) = \sum_{i=1}^{n} f(x_i).
\]
It is easy to check that this defines a measure on R.
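The three examples above can be tried out on small finite sets. The sketch below is only illustrative: the function names and the sample weight function are assumptions made here, and only finite additivity on one particular disjoint family is checked, which is far weaker than the countable additivity verified in the text.

```python
import math

def counting_measure(E):
    """Example 1.2.1 restricted to finite sets: the number of elements of E."""
    return len(E)

def dirac_measure(x0):
    """Example 1.2.2: returns 1 if x0 lies in E and 0 otherwise."""
    return lambda E: 1 if x0 in E else 0

def weighted_measure(f):
    """Example 1.2.3: the sum of f over the (finite) set E, with f non-negative."""
    return lambda E: sum(f(x) for x in E)

def is_additive(mu, pieces):
    """Compare mu of the union with the sum of mu over pairwise disjoint pieces."""
    union = set().union(*pieces)
    return math.isclose(mu(union), sum(mu(p) for p in pieces))

pieces = [{1, 2}, {3}, {4, 5, 6}]        # pairwise disjoint sample sets
checks = [
    is_additive(counting_measure, pieces),
    is_additive(dirac_measure(3), pieces),
    is_additive(weighted_measure(lambda x: 1.0 / x), pieces),
]
```

The comparison uses `math.isclose` because the weighted sums are floating point numbers added in different orders.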
The most interesting example will be the Lebesgue measure, to be
defined on the euclidean space RN , N ≥ 1, which we will study in detail
later.
We will now prove some basic, but important, properties of measures.
Proposition 1.2.1 Let µ be a measure on a ring R of subsets of a non-empty set X. Then,
(i) µ is monotone, i.e. if E, F ∈ R and if E ⊂ F , we have µ(E) ≤
µ(F ), and
(ii) µ is subtractive, i.e. if E, F ∈ R, E ⊂ F and if µ(E) < +∞, then
µ(F \E) = µ(F ) − µ(E).
Proof: If E ⊂ F , then, by finite additivity, we have that µ(F ) =
µ(F \E) + µ(E). Then (i) is a consequence of the non-negativity of the
measure and (ii) follows from the fact that µ(E) is finite and hence we
can subtract it from both sides of the above relation.
Proposition 1.2.2 (Subadditivity) Let µ be a measure on a ring R of subsets of a non-empty set X. Let $\{E_i\}$ be a finite, or infinite, sequence of sets in R and let E ∈ R such that $E \subset \cup_i E_i$. Then
\[
\mu(E) \leq \sum_i \mu(E_i).
\]
Proof: Set $F_i = E \cap E_i$. Define $G_1 = F_1$ and, for $i \geq 2$, define $G_i = F_i \setminus (\cup_{j=1}^{i-1} F_j)$. Then the sets $G_i$ are all mutually disjoint and $G_i \subset F_i$ for all i. Further,
\[
\cup_i G_i = \cup_i F_i = E.
\]
Thus, by countable additivity and monotonicity, we have
\[
\mu(E) = \sum_i \mu(G_i) \leq \sum_i \mu(F_i) \leq \sum_i \mu(E_i).
\]
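The disjointification step used in this proof, passing from the sets $F_i$ to the mutually disjoint sets $G_i$, can be sketched for finite sets as follows. The function name `disjointify` and the sample data are hypothetical, and the counting measure (`len`) stands in for a general µ.

```python
def disjointify(sets):
    """Return G_i = F_i minus (F_1 ∪ ... ∪ F_{i-1}) for a list of sets F_i."""
    seen, result = set(), []
    for F in sets:
        result.append(set(F) - seen)     # keep only the points not seen so far
        seen |= set(F)
    return result

E = {1, 2, 3, 4, 5}
cover = [{1, 2, 3}, {3, 4}, {4, 5, 6}]   # E is contained in the union of these sets
F = [E & Ei for Ei in cover]             # F_i = E ∩ E_i
G = disjointify(F)                       # mutually disjoint, with the same union E

mu = len                                 # counting measure on finite sets
chain = (mu(E), sum(mu(g) for g in G), sum(mu(f) for f in F), sum(mu(e) for e in cover))
```

Here `chain` holds the four quantities of the final display of the proof, in order: the first two agree and the sequence is non-decreasing from there on.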
Proposition 1.2.3 Let µ be a measure on a ring R of subsets of a non-empty set X. Let $\{E_i\}$ be a finite, or infinite, sequence of mutually disjoint sets in R such that $\cup_i E_i \subset E$, where E ∈ R. Then,
\[
\sum_i \mu(E_i) \leq \mu(E).
\]
Proof: For any positive integer n, we have $\cup_{i=1}^{n} E_i \subset E$. By the finite additivity and monotonicity of the measure, we have
\[
\sum_{i=1}^{n} \mu(E_i) = \mu(\cup_{i=1}^{n} E_i) \leq \mu(E)
\]
from which we deduce the result immediately.
Proposition 1.2.4 (Continuity from below) Let µ be a measure on a ring R of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$ be an increasing sequence of sets in R such that $\cup_{i=1}^{\infty} E_i \in R$. Then
\[
\mu(\cup_{i=1}^{\infty} E_i) = \lim_{n \to \infty} \mu(E_n). \tag{1.2.2}
\]
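Proof: Set $E_0 = \emptyset$. Then
\[
\begin{aligned}
\mu(\cup_{i=1}^{\infty} E_i) &= \mu(\cup_{i=1}^{\infty} (E_i \setminus E_{i-1}))\\
&= \sum_{i=1}^{\infty} \mu(E_i \setminus E_{i-1})\\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \mu(E_i \setminus E_{i-1})\\
&= \lim_{n \to \infty} \mu(E_n),
\end{aligned}
\]
since the sets $E_i \setminus E_{i-1}$ are all mutually disjoint. This completes the proof.
Proposition 1.2.5 (Continuity from above) Let µ be a measure on a ring R of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$ be a decreasing sequence of sets in R such that $\cap_{i=1}^{\infty} E_i \in R$ and such that for some positive integer m, we have $\mu(E_m) < +\infty$. Then
\[
\mu(\cap_{i=1}^{\infty} E_i) = \lim_{n \to \infty} \mu(E_n). \tag{1.2.3}
\]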
Proof: Since the sequence of sets is decreasing, we have that $\mu(E_n) < +\infty$ for all n ≥ m. Further $\{E_m \setminus E_n\}_{n \geq m}$ is an increasing sequence of sets. Hence by the preceding proposition and by the subtractive property of the measure, we have
\[
\begin{aligned}
\mu(E_m) - \mu(\cap_{i=1}^{\infty} E_i) &= \mu(E_m) - \mu(\cap_{i=m}^{\infty} E_i) = \mu(E_m \setminus (\cap_{i=m}^{\infty} E_i))\\
&= \mu(\cup_{i=m}^{\infty} (E_m \setminus E_i)) = \lim_{n \to \infty} \mu(E_m \setminus E_n)\\
&= \mu(E_m) - \lim_{n \to \infty} \mu(E_n).
\end{aligned}
\]
The result now follows on subtracting $\mu(E_m)$, which is finite, from both sides of the above relation.
Example 1.2.4 The preceding proposition is not valid without the assumption that $\mu(E_m)$ is finite for some m. Consider the set of natural numbers, ℕ, equipped with the counting measure (cf. Example 1.2.1). Let $E_n = \{m \in \mathbb{N} \mid m \geq n\}$. Then $\mu(E_n) = +\infty$ for all n while $\cap_{n=1}^{\infty} E_n = \emptyset$.
Definition 1.2.2 Let µ be a measure on a ring R of subsets of a non-empty set X. We say that µ is finite if µ(E) < +∞ for every E ∈ R. We say that µ is σ-finite if every set E in R can be covered by a sequence $\{E_i\}_{i=1}^{\infty}$ of sets in R with $\mu(E_i) < +\infty$ for every i.
Thus, the Dirac measure (cf. Example 1.2.2) and the measure defined in Example 1.2.3 are both finite measures. The counting measure
on N is a σ-finite measure, since any subset of N can be covered by a
countable number of singleton sets, and each singleton set has measure
unity.
We conclude this section with a very useful result.
Proposition 1.2.6 (Borel-Cantelli Lemma) Let µ be a measure on a σ-algebra S of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of sets in S such that
\[
\sum_{i=1}^{\infty} \mu(E_i) < +\infty.
\]
Then, except for a set of measure zero, every point x ∈ X belongs to at most finitely many of the sets $E_i$.
Proof: Let E be the set of all points x ∈ X which belong to infinitely many of the sets $E_i$. Then,
\[
E = \cap_{n=1}^{\infty} \cup_{i=n}^{\infty} E_i.
\]
Then, for every positive integer n, we have
\[
\mu(E) \leq \mu(\cup_{i=n}^{\infty} E_i) \leq \sum_{i=n}^{\infty} \mu(E_i).
\]
But the sum on the extreme right is the tail of a convergent series and hence can be made arbitrarily small for large n. Thus it follows that µ(E) = 0.
1.3
Outer-measure and measurable sets
In the sequel, we will really be interested only in measures defined on
σ-algebras. However, as we shall see in the construction of the Lebesgue
measure, it will be simpler to explicitly construct it on a ring and then
try to extend it to larger collections like the σ-ring generated by the ring
itself. We now investigate the possibility of extending a measure defined
on a ring to the σ-ring generated by it or to even larger classes of sets.
Definition 1.3.1 Let X be a non-empty set and let S be a σ-ring of
subsets of X. It is said to be a hereditary σ-ring if, whenever E ∈ S,
we have that every subset of E is also a member of S.
The power set of X is clearly a hereditary σ-ring and the intersection of any collection of hereditary σ-rings is again a hereditary σ-ring. It then follows that given any collection E of subsets of X, there is a smallest hereditary σ-ring, denoted H(E), containing E. This is called the hereditary σ-ring generated by the class E.
Notice that the collection of all sets in X which can be covered by
a countable union of members of E is a hereditary σ-ring containing E.
Thus, every member of H(E) can be covered by a countable union of
members of E.
Definition 1.3.2 Let X be a non-empty set and let H be a hereditary
σ-ring of subsets of X. An extended real-valued set function µ∗ defined
on H is said to be an outer-measure if the following properties hold:
(i) (non-negativity) µ∗ (E) ≥ 0 for every E ∈ H;
(ii) (monotonicity) if E, F ∈ H such that E ⊂ F , then µ∗ (E) ≤ µ∗ (F );
(iii) µ∗ (∅) = 0;
(iv) (countable subadditivity) if $\{E_n\}_{n=1}^{\infty}$ is a sequence of sets in H, then
\[
\mu^*(\cup_{n=1}^{\infty} E_n) \leq \sum_{n=1}^{\infty} \mu^*(E_n). \tag{1.3.1}
\]
An outer-measure is said to be σ-finite if every set in the hereditary σ-ring can be covered by a countable union of sets of finite outer-measure.
Outer-measures occur naturally when we try to extend a measure
defined on a ring.
Proposition 1.3.1 Let X be a non-empty set and let R be a ring of subsets of X. Let µ be a measure on R. For any set E ∈ H(R), the hereditary σ-ring generated by R, define
\[
\mu^*(E) = \inf \left\{ \sum_{n=1}^{\infty} \mu(E_n) \;\middle|\; E \subset \cup_{n=1}^{\infty} E_n,\ E_n \in R \right\}.
\]
Then, µ∗ is an outer-measure on H(R) which extends µ. Further, if µ is σ-finite, so is µ∗.
Proof: Step 1: The non-negativity of µ∗ is obvious. Now, let E ∈ R. Then since E covers itself, we have µ∗(E) ≤ µ(E). On the other hand, given any countable cover of E, say $\{E_n\}_{n=1}^{\infty}$ with $E_n \in R$ for all n, we have, by subadditivity of the measure, $\mu(E) \leq \sum_{n=1}^{\infty} \mu(E_n)$ and so, by definition of µ∗, we have µ(E) ≤ µ∗(E). Thus, for E ∈ R, we have µ(E) = µ∗(E) and so µ∗ extends µ. In particular, we have that µ∗(∅) = 0.
Step 2: Let F ⊂ E, where E ∈ H(R). Then every countable cover
of E by sets from R also covers F . Thus, it follows immediately that
µ∗ (F ) ≤ µ∗ (E).
Step 3: We now prove the subadditivity of µ∗. Let E ∈ H(R) and assume that $E \subset \cup_{i=1}^{\infty} E_i$, where each $E_i$ is also a member of H(R). If there exists even a single index i such that $\mu^*(E_i) = +\infty$, there is nothing to prove. Assume, therefore, that $\mu^*(E_i) < +\infty$ for each i. Then, given ε > 0, there exist, by definition, sets $E_{ij} \in R$ such that $E_i \subset \cup_{j=1}^{\infty} E_{ij}$ and such that
\[
\sum_{j=1}^{\infty} \mu(E_{ij}) < \mu^*(E_i) + \frac{\varepsilon}{2^i}.
\]
Then $E \subset \cup_{i=1}^{\infty} \cup_{j=1}^{\infty} E_{ij}$ and
\[
\mu^*(E) \leq \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \mu(E_{ij}) \leq \sum_{i=1}^{\infty} \mu^*(E_i) + \varepsilon.
\]
Since ε > 0 was arbitrarily chosen, we deduce that
\[
\mu^*(E) \leq \sum_{i=1}^{\infty} \mu^*(E_i).
\]
Step 4: Assume now that µ is σ-finite. Let E ∈ H(R). Then, there exists a countable cover $\{E_i\}_{i=1}^{\infty}$ of E such that $E_i \in R$ for each i. Since µ is σ-finite, we can find $\{E_{ij}\}_{i,j=1}^{\infty}$ in R such that $E_i \subset \cup_{j=1}^{\infty} E_{ij}$ and $\mu(E_{ij}) < +\infty$ for all $1 \leq i, j < \infty$. Now, we have $E \subset \cup_{i=1}^{\infty} \cup_{j=1}^{\infty} E_{ij}$ and $\mu^*(E_{ij}) = \mu(E_{ij}) < +\infty$. This completes the proof.
Example 1.3.1 Let X = ℕ and consider the ring R of all finite subsets, with the counting measure. Then µ is finite. Since any countable union of singletons has to be in H(R), it follows that ℕ is in H(R) and by heredity, it follows that H(R) = P(ℕ), the power set of ℕ. It is now immediate to see that if E is any infinite subset of ℕ, then µ∗(E) = +∞.
Thus, even though µ is a finite measure, we can only say that µ∗ is σ-finite.
Definition 1.3.3 Let X be a non-empty set and let H be a hereditary σ-ring of subsets of X. Let µ∗ be an outer-measure defined on H. A set E ∈ H is said to be µ∗-measurable if for every A ∈ H, we have
\[
\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^c). \tag{1.3.2}
\]
Remark 1.3.1 Since µ∗ is subadditive, the µ∗-measurability of E is equivalent to verifying
\[
\mu^*(A) \geq \mu^*(A \cap E) + \mu^*(A \cap E^c).
\]
Proposition 1.3.2 Let µ∗ be an outer-measure on a hereditary σ-ring H of subsets of a non-empty set X. Then the collection of all µ∗-measurable sets, denoted S, is a ring.
Proof: Let A ∈ H and let E and F be µ∗-measurable sets. Then, by definition, we have
\[
\begin{aligned}
\mu^*(A) &= \mu^*(A \cap E) + \mu^*(A \cap E^c),\\
\mu^*(A \cap E) &= \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c),\\
\mu^*(A \cap E^c) &= \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c).
\end{aligned}
\]
Thus,
\[
\mu^*(A) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.3}
\]
If we replace A by A ∩ (E ∪ F) in the above relation, we get
\[
\mu^*(A \cap (E \cup F)) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F). \tag{1.3.4}
\]
It then follows from (1.3.3) that
\[
\mu^*(A) = \mu^*(A \cap (E \cup F)) + \mu^*(A \cap (E \cup F)^c).
\]
Thus, E ∪ F ∈ S. Similarly, replacing A in (1.3.3) by $A \cap (E \setminus F)^c = A \cap (E^c \cup F)$, we get
\[
\mu^*(A \cap (E \setminus F)^c) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.5}
\]
It now follows from (1.3.3) that
\[
\begin{aligned}
\mu^*(A) &= \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap E \cap F^c)\\
&= \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap (E \setminus F)).
\end{aligned}
\]
This shows that E\F ∈ S. Clearly ∅ ∈ S. This completes the proof.
In fact, we can prove much more.
Proposition 1.3.3 Let µ∗ be an outer-measure on a hereditary σ-ring H of subsets of a non-empty set X. Then, S, the collection of all µ∗-measurable sets, is a σ-ring. Further, if $\{E_i\}_{i=1}^{\infty}$ is a sequence of mutually disjoint sets in S whose union is E and if A is any arbitrary set in H, we have
\[
\mu^*(A \cap E) = \sum_{i=1}^{\infty} \mu^*(A \cap E_i). \tag{1.3.6}
\]
Proof: It follows from (1.3.4) that
\[
\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1 \cap E_2) + \mu^*(A \cap E_1 \cap E_2^c) + \mu^*(A \cap E_1^c \cap E_2).
\]
Since $E_1 \cap E_2 = \emptyset$, it follows that $E_1 \subset E_2^c$ and that $E_2 \subset E_1^c$. Consequently, the above relation yields
\[
\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1) + \mu^*(A \cap E_2).
\]
By induction, it follows that for any n mutually disjoint sets $\{E_i\}_{i=1}^{n}$, we have
\[
\mu^*(A \cap (\cup_{i=1}^{n} E_i)) = \sum_{i=1}^{n} \mu^*(A \cap E_i). \tag{1.3.7}
\]
Set $F_n = \cup_{i=1}^{n} E_i$. Since S is a ring, we have that $F_n \in S$. Further, since $F_n \subset E$, we have that $E^c \subset F_n^c$. Consequently we have, by (1.3.7) and the monotonicity of µ∗,
\[
\begin{aligned}
\mu^*(A) &= \mu^*(A \cap F_n) + \mu^*(A \cap F_n^c)\\
&\geq \sum_{i=1}^{n} \mu^*(A \cap E_i) + \mu^*(A \cap E^c).
\end{aligned}
\]
Since n was arbitrarily fixed, we deduce that
\[
\mu^*(A) \geq \sum_{i=1}^{\infty} \mu^*(A \cap E_i) + \mu^*(A \cap E^c).
\]
Replacing A by A ∩ E and by the subadditivity of the outer-measure, we get
\[
\mu^*(A \cap E) \geq \sum_{i=1}^{\infty} \mu^*(A \cap E_i) \geq \mu^*(A \cap E).
\]
This proves (1.3.6) and we also immediately see that
µ∗ (A) ≥ µ∗ (A ∩ E) + µ∗ (A ∩ E c )
from which we deduce that E ∈ S (cf. Remark 1.3.1). Thus, S is closed
under countable disjoint unions. But since S is a ring, we can express
any countable union of sets in it as a countable disjoint union of sets in
it. Thus it follows that S is indeed a σ-ring. This completes the proof.
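On a small finite set the condition (1.3.2) can be tested by brute force over all sets A. The sketch below is only an illustration under assumptions made here: X has three points, H = P(X), and `outer` is a toy outer-measure chosen for this example (0 on the empty set, 2 on X, 1 on every other set); none of these names come from the text.

```python
from itertools import combinations

X = frozenset({1, 2, 3})

def subsets(s):
    """All subsets of a finite set, listed by increasing size."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(E):
    """A toy outer-measure on P(X): 0 on the empty set, 2 on X, 1 otherwise."""
    if not E:
        return 0
    return 2 if E == X else 1

def is_measurable(E, mu_star):
    """Check mu*(A) = mu*(A ∩ E) + mu*(A ∩ E^c) for every subset A of X."""
    return all(mu_star(A) == mu_star(A & E) + mu_star(A - E) for A in subsets(X))

measurable = [E for E in subsets(X) if is_measurable(E, outer)]
measurable_for_counting = [E for E in subsets(X) if is_measurable(E, len)]
```

For the toy outer-measure only the empty set and X pass the test, whereas for the counting measure (passed in as `len`) every subset does. In both cases the measurable sets are closed under unions and differences, as the propositions above assert.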
Definition 1.3.4 A measure µ, defined on a σ-ring S of subsets of a non-empty set X, is said to be complete if, whenever a set E ∈ S has measure zero, then every subset of E is also a member of S.
Theorem 1.3.1 Let µ∗ be an outer-measure on a hereditary σ-ring H of subsets of a non-empty set X and let S be the σ-ring of all µ∗-measurable subsets. For E ∈ S, define
\[
\mu(E) = \mu^*(E).
\]
Then µ is a complete measure on S.
Proof: The fact that µ is a measure on S follows immediately from the
preceding proposition. Now, let µ∗ (E) = 0 for some E ∈ H. Let A ∈ H.
Then
µ∗ (A) = µ∗ (E) + µ∗ (A) ≥ µ∗ (A ∩ E) + µ∗ (A ∩ E c )
by monotonicity and hence (cf. Remark 1.3.1) we deduce that E ∈ S.
Thus S contains all sets of outer-measure zero. If E ∈ S such that
µ(E) = 0, then µ∗ (E) = 0 and so µ∗ (F ) = 0 for all subsets F of E.
Thus every subset F of E is in S, which completes the proof.
We say that the measure µ is the measure induced by the outer-measure µ∗.
Now, let µ be a measure on a ring R of subsets of a non-empty set X. Then we can define the hereditary σ-ring H(R), generated by R and also define the outer-measure µ∗ on H(R), as described in Proposition 1.3.1. This outer-measure will now induce a complete measure on S, the σ-ring of all µ∗-measurable sets.
Proposition 1.3.4 Let R be a ring of subsets of a non-empty set X and let µ be a measure on R. Let µ∗ be the induced outer-measure and let $\bar{\mu}$ denote the measure induced by µ∗ (cf. Theorem 1.3.1). Then S(R) ⊂ S, where S(R) is the σ-ring generated by R and S is the σ-ring of all µ∗-measurable subsets. In particular $\bar{\mu}$ is an extension of the measure µ to the σ-rings S(R) and S and it is complete on the latter.
Proof: Let E ∈ R and let A ∈ H(R). Assume that µ∗(A) < +∞. Then, given ε > 0, there exists a sequence of sets $\{E_i\}_{i=1}^{\infty}$ in R such that $A \subset \cup_{i=1}^{\infty} E_i$ and such that
\[
\sum_{i=1}^{\infty} \mu(E_i) < \mu^*(A) + \varepsilon.
\]
But µ is a measure on R and so
\[
\sum_{i=1}^{\infty} \mu(E_i) = \sum_{i=1}^{\infty} (\mu(E_i \cap E) + \mu(E_i \cap E^c)).
\]
Now, the sets $E_i \cap E$ and $E_i \cap E^c = E_i \setminus E$ belong to R and cover A ∩ E and $A \cap E^c$, respectively. Since µ∗ extends µ, by the subadditivity of the outer-measure, we get
\[
\mu^*(A) + \varepsilon > \mu^*(A \cap E) + \mu^*(A \cap E^c).
\]
Since ε > 0 has been arbitrarily chosen, we deduce that
\[
\mu^*(A) \geq \mu^*(A \cap E) + \mu^*(A \cap E^c).
\]
This inequality is trivially true if µ∗(A) = +∞. Thus we have shown that R ⊂ S (cf. Remark 1.3.1). Since S is a σ-ring containing R, it also contains S(R).
Remark 1.3.2 If µ on R is σ-finite, then we saw that µ∗ on H(R) is σ-finite as well, and that, in fact, each set in H(R) can be covered by countably many sets from R with finite measure. In particular, it follows that the extension $\bar{\mu}$ of µ, on S(R) and on S, is a σ-finite measure.
Proposition 1.3.5 Let µ be a measure on a ring R of subsets of a non-empty set X and let $\bar{\mu}$ be its extension to S(R) and S as described above. Let E ∈ H(R). Then
\[
\begin{aligned}
\mu^*(E) &= \inf\{\bar{\mu}(F) \mid F \in S,\ E \subset F\}\\
&= \inf\{\bar{\mu}(F) \mid F \in S(R),\ E \subset F\}.
\end{aligned}
\]
Proof: The proof follows from the following chain of inequalities, in which $\bar{\mu}$ denotes the extension of µ to S(R) and S.
\[
\begin{aligned}
\mu^*(E) &= \inf\left\{\sum_{i=1}^{\infty} \mu(E_i) \;\middle|\; E \subset \cup_{i=1}^{\infty} E_i,\ E_i \in R\right\}\\
&= \inf\left\{\sum_{i=1}^{\infty} \bar{\mu}(E_i) \;\middle|\; E \subset \cup_{i=1}^{\infty} E_i,\ E_i \in R\right\}\\
&\geq \inf\left\{\sum_{i=1}^{\infty} \bar{\mu}(E_i) \;\middle|\; E \subset \cup_{i=1}^{\infty} E_i,\ E_i \in S(R)\right\}\\
&\geq \inf\{\bar{\mu}(\cup_{i=1}^{\infty} E_i) \mid E \subset \cup_{i=1}^{\infty} E_i,\ E_i \in S(R)\}\\
&\geq \inf\{\bar{\mu}(F) \mid E \subset F,\ F \in S(R)\}\\
&\geq \inf\{\bar{\mu}(F) \mid E \subset F,\ F \in S\}\\
&= \inf\{\mu^*(F) \mid E \subset F,\ F \in S\}\\
&\geq \mu^*(E).
\end{aligned}
\]
Remark 1.3.3 Let µ be a measure on a ring R of subsets of a non-empty set X and let $\bar{\mu}$ be its extension, as described in this section, to S(R). Starting from this, one could again try to define an outer-measure $(\bar{\mu})^*$ on H(R) and try to extend it further. The proof of the preceding proposition shows that $(\bar{\mu})^* = \mu^*$ and so the σ-ring of measurable sets will still be S and so the induced measure will also only be $\bar{\mu}$.
Definition 1.3.5 Let µ be a measure on a ring R of subsets of a non-empty set X, with extension $\bar{\mu}$. Let E ∈ H(R) and let F ∈ S(R). We say that F is a measurable cover of E if E ⊂ F and for all G ∈ S(R) such that G ⊂ F\E, we have $\bar{\mu}(G) = 0$.
Proposition 1.3.6 Let µ be a measure on a ring R of subsets of a non-empty set X. Let E ∈ H(R) be such that µ∗(E) < +∞. Then, there exists a measurable cover F of E such that $\bar{\mu}(F) = \mu^*(E)$.
Proof: By Proposition 1.3.5, for every positive integer n, there exists $F_n \in S(R)$ such that $E \subset F_n$ and
\[
\mu^*(F_n) < \mu^*(E) + \frac{1}{n}.
\]
Set $F = \cap_{n=1}^{\infty} F_n$. Then F ∈ S(R) and E ⊂ F. Thus,
\[
\mu^*(E) \leq \mu^*(F) \leq \mu^*(F_n) < \mu^*(E) + \frac{1}{n}.
\]
Since this is true for all n, we deduce that
\[
\mu^*(E) = \mu^*(F) = \bar{\mu}(F),
\]
where $\bar{\mu}$ denotes the extension of µ to S(R). Let G ∈ S(R) be such that G ⊂ F\E. Then E ⊂ F\G and $\bar{\mu}(G)$ is finite. Thus,
\[
\bar{\mu}(F) = \mu^*(E) \leq \mu^*(F \setminus G) = \bar{\mu}(F \setminus G) = \bar{\mu}(F) - \bar{\mu}(G)
\]
from which it follows that $\bar{\mu}(G) = 0$. This completes the proof.
1.4
Completion of a measure
In the previous section we started with a measure on a ring R of subsets
of a non-empty set X and extended it to a complete measure µ on S,
the σ-ring of all µ∗ -measurable sets.
Given a measure on a σ-ring, it is always possible to extend it to a
complete measure by another process, which we now describe.
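Theorem 1.4.1 Let S be a σ-ring of subsets of a non-empty set X and let µ be a measure on S. Let
\[
\widetilde{S} = \{E \Delta N \mid E \in S,\ N \subset A,\ A \in S,\ \mu(A) = 0\}.
\]
Then, $\widetilde{S}$ is a σ-ring. Define $\widetilde{\mu}$ on $\widetilde{S}$ by
\[
\widetilde{\mu}(E \Delta N) = \mu(E).
\]
Then $\widetilde{\mu}$ is a complete measure on $\widetilde{S}$ and it extends µ.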
Proof: Let E, A ∈ S. Let µ(A) = 0 and let N ⊂ A. Then E\A ∈ S. Consequently,
\[
\begin{aligned}
E \cup N &= (E \setminus A) \Delta (A \cap (E \cup N)),\\
E \Delta N &= (E \setminus A) \cup (A \cap (E \Delta N)).
\end{aligned}
\tag{1.4.1}
\]
Thus,
\[
\widetilde{S} = \{E \cup N \mid E \in S,\ N \subset A,\ A \in S,\ \mu(A) = 0\}.
\]
It is now clear that $\widetilde{S}$ is closed under the formation of countable unions. Let $E_1, E_2 \in S$ and let $N_1 \subset A_1$, $N_2 \subset A_2$ where $A_i \in S$, $\mu(A_i) = 0$ for i = 1, 2. Since the symmetric difference is an associative operation (check!!) which is also obviously commutative, we have that
\[
(E_1 \Delta N_1) \Delta (E_2 \Delta N_2) = (E_1 \Delta E_2) \Delta (N_1 \Delta N_2)
\]
with $E_1 \Delta E_2 \in S$ and $N_1 \Delta N_2 \subset A_1 \cup A_2$ where $A_1 \cup A_2 \in S$, $\mu(A_1 \cup A_2) = 0$. Thus $\widetilde{S}$ is closed under the formation of symmetric differences as well. Thus it follows (cf. Remark 1.1.4) that $\widetilde{S}$ is a σ-ring.
Again let $E_i, A_i, N_i$, i = 1, 2 be as above. Assume that $E_1 \Delta N_1 = E_2 \Delta N_2$. Again, by the commutativity and associativity of the symmetric difference, we easily see that
\[
E_1 \Delta E_2 = N_1 \Delta N_2.
\]
It then follows that $E_1 \Delta E_2 \subset A_1 \cup A_2$ and so $\mu(E_1 \Delta E_2) = 0$. We then deduce that $\mu(E_1) = \mu(E_2) = \mu(E_1 \cap E_2)$. This shows that $\widetilde{\mu}$ is well-defined. Using the latter definition of $\widetilde{S}$ in terms of unions rather than symmetric differences, it is very easy to check that $\widetilde{\mu}$ defines a measure on $\widetilde{S}$. Also it is clear that $\widetilde{\mu}$ extends µ.
Now let E, A ∈ S such that µ(E) = µ(A) = 0 and let N ⊂ A. Then $\widetilde{\mu}(E \cup N) = \mu(E \setminus A) = 0$ (cf. (1.4.1)). If F ⊂ E ∪ N, then F ⊂ E ∪ A and µ(E ∪ A) = 0. Consequently, $F \in \widetilde{S}$. This shows that $\widetilde{\mu}$ is complete.
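The completion can be computed explicitly when X is finite. The following sketch rests on assumptions made only for illustration: X = {1, 2, 3, 4}, the σ-ring S consists of all unions of the three hypothetical atoms {1}, {2, 3} and {4}, and the atom {2, 3} is given measure zero. The dictionary `completion` then plays the role of the completed measure, built from the description of the completed σ-ring in terms of unions E ∪ N.

```python
from itertools import combinations

def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical atoms of S together with their measures; {2, 3} is a null set.
atom_list = [(frozenset({1}), 1.0), (frozenset({2, 3}), 0.0), (frozenset({4}), 2.0)]

# S maps each union of atoms to its measure (additivity over the atoms).
S = {}
for r in range(len(atom_list) + 1):
    for combo in combinations(atom_list, r):
        E = frozenset().union(*(a for a, _ in combo))
        S[E] = sum(w for _, w in combo)

# Completion: all sets E ∪ N with E in S and N a subset of a null set A in S.
completion = {}
for E, mE in S.items():
    for A, mA in S.items():
        if mA == 0:
            for N in powerset(A):
                value = completion.setdefault(E | N, mE)
                # Different representations of the same set must give one value.
                assert value == mE, "the completed measure would not be well defined"
```

In this toy case the sets {2} and {3}, which do not belong to S, acquire measure zero, and the completed σ-ring is the whole power set of X.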
Now, if µ is a measure defined on a ring R of subsets of a non-empty set X, we have two complete extensions of µ: first, the measure $\bar{\mu}$ defined on the σ-ring S of µ∗-measurable sets, described in the previous section, and, second, the measure $\widetilde{\mu}$ on the σ-ring $\widetilde{S}$ obtained by adjoining subsets of sets of measure zero to sets in the σ-ring S(R). If µ is σ-finite, these two processes yield the same measure, as the following theorem shows.
Theorem 1.4.2 Let µ be a σ-finite measure on a ring R of subsets of a non-empty set X. Let $\bar{\mu}$ be its extension to S(R), the σ-ring generated by R, and to S, the set of all µ∗-measurable sets. Let $\widetilde{\mu}$ be the measure on the σ-ring $\widetilde{S}$ obtained by adjoining subsets of sets of measure zero in S(R) to sets in S(R). Then $\widetilde{S} = S$ and $\widetilde{\mu} = \bar{\mu}$.
Proof: Since S is a σ-ring containing S(R) and since $\bar{\mu}$ is complete, it follows that $\widetilde{S} \subset S$. Further, if E, A ∈ S(R) with $\bar{\mu}(A) = 0$ and if N ⊂ A, we have, by definition,
\[
\widetilde{\mu}(E \cup N) = \bar{\mu}(E) = \bar{\mu}(E \cup N)
\]
since
\[
\begin{aligned}
\bar{\mu}(E \cup N) = \mu^*(E \cup N) &\leq \mu^*(E) + \mu^*(N) = \mu^*(E)\\
&= \bar{\mu}(E) \leq \bar{\mu}(E \cup N).
\end{aligned}
\]
The proof will, therefore, be complete if we show that $S \subset \widetilde{S}$. Let E ∈ S such that $\bar{\mu}(E) = \mu^*(E) < +\infty$. Then, by Proposition 1.3.6, there exists a measurable cover F of E such that $\bar{\mu}(F) = \mu^*(F) = \mu^*(E) = \bar{\mu}(E)$. Recall that F ∈ S(R) and that E ⊂ F. Then, since $\bar{\mu}$ is a measure, we have that $\mu^*(F \setminus E) = \bar{\mu}(F \setminus E) = 0$. Now, let G ∈ S(R) be a measurable cover of F\E, with $\bar{\mu}(G) = 0$. Then,
\[
E = (F \setminus G) \cup (E \cap G).
\]
Since F\G ∈ S(R) and since E ∩ G ⊂ G where G ∈ S(R) with measure zero, it follows that $E \in \widetilde{S}$. Since µ is σ-finite, we have that $\bar{\mu}$ is σ-finite and so any set E ∈ S can be expressed as the countable union of sets of finite measure and so it follows that $S \subset \widetilde{S}$ and the proof is complete.
1.5
Exercises
1.1 Let $X = \mathbb{R}^N$, N ≥ 2. Let
\[
P = \left\{ \Pi_{i=1}^{N} [a_i, b_i) \mid a_i, b_i \in \mathbb{R},\ a_i \leq b_i,\ 1 \leq i \leq N \right\}.
\]
Let R be the collection of all finite unions of members of P. Show that
R is a ring.
1.2 Let X be an uncountable set. Let R be the collection of all at most
countable subsets of X, including the empty set. Show that R is a ring.
Is it a σ-ring? Is it a σ-algebra?
1.3 Let R be a ring of subsets of a non-empty set X. Define
S = {F ⊂ X | F ∈ R or F c ∈ R}.
Show that S is the smallest algebra containing R.
1.4 Let X be a non-empty set and let E ⊂ X. Let E = {E}. Compute
R(E).
1.5 Let X be a non-empty set and let E ⊂ X. Let E = {F ⊂ X | E ⊂
F }. Compute R(E).
1.6 Let X = N. Let R be the collection of all finite subsets (including
the empty set) and their complements. Show that R is an algebra.
1.7 Let R be a ring of subsets of a non-empty set X. Let µ be a measure
on R. If E, F ∈ R, show that
µ(E) + µ(F ) = µ(E ∪ F ) + µ(E ∩ F ).
1.8 Let R be a ring of subsets of a non-empty set X. Let µ be a non-negative, finite and additive set function on R. If either µ is continuous from below for every set E ∈ R or if it is continuous from above for E = ∅, show that µ is a measure on R.
1.9 Let X = ℕ and let R be the ring described in Exercise 1.6 above. Define, for E ∈ R,
\[
\mu(E) =
\begin{cases}
+\infty, & \text{if } E \text{ is an infinite set},\\
0, & \text{otherwise.}
\end{cases}
\]
Show that µ is continuous from above for E = ∅, but that µ is not a measure. (This shows that finiteness is essential in the previous exercise.)
1.10 Let S be a σ-ring of subsets of a non-empty set X and let µ be a measure on S. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of sets in S. Define
\[
\liminf_{n \to \infty} E_n = \cup_{n=1}^{\infty} \cap_{i=n}^{\infty} E_i, \qquad
\limsup_{n \to \infty} E_n = \cap_{n=1}^{\infty} \cup_{i=n}^{\infty} E_i.
\]
(a) Show that
\[
\mu(\liminf_{n \to \infty} E_n) \leq \liminf_{n \to \infty} \mu(E_n).
\]
(b) If, for some n ∈ ℕ, we have $\mu(\cup_{i=n}^{\infty} E_i) < +\infty$, show that
\[
\mu(\limsup_{n \to \infty} E_n) \geq \limsup_{n \to \infty} \mu(E_n).
\]
1.11 Let H be a hereditary σ-ring of subsets of a non-empty set X and
let µ∗ be an outer-measure on H. If E, F ∈ H, and if at least one of
them is µ∗ -measurable, show that
µ∗ (E) + µ∗ (F ) = µ∗ (E ∪ F ) + µ∗ (E ∩ F ).
1.12 Let E ⊂ ℝ. E is said to have an infinite condensation point if E has uncountably many points outside every finite interval. Let H = P(ℝ). Define, for E ⊂ ℝ,
\[
\mu^*(E) =
\begin{cases}
0, & \text{if } E \text{ is empty, finite or countable},\\
1, & \text{if } E \text{ is uncountable, but without an infinite condensation point},\\
+\infty, & \text{if } E \text{ has an infinite condensation point.}
\end{cases}
\]
Show that
(i) µ∗ is a σ-finite outer-measure on H;
(ii) the only µ∗-measurable sets are at most countable sets or their complements;
(iii) the induced measure µ is not σ-finite.
1.13 Let X be a non-empty set and let H = P(X). Let $\mu_i^*$, i = 1, 2 be two finite outer-measures on H. Let $S_i$, i = 1, 2 be the respective σ-rings of measurable sets. If $\mu^* = \mu_1^* + \mu_2^*$, show that µ∗ is an outer-measure and that the class of µ∗-measurable sets is $S_1 \cap S_2$.
1.14 Let µ be a measure on a σ-ring S of subsets of a non-empty set X. Let $\bar{\mu}$ be the induced measure defined on $\bar{S}$, the σ-ring of µ∗-measurable sets. Let $A, B \in \bar{S}$ be such that $\bar{\mu}(B \setminus A) = 0$. If A ⊂ E ⊂ B, show that $E \in \bar{S}$.
1.15 Let X be an uncountable set and let S be the collection of at most
countable sets (including the empty set) and their complements.
(a) Show that S is a σ-algebra.
(b) If µ is the counting measure on S, show that it is complete.
(c) Show that every subset of X is µ∗ -measurable.
(Thus, without σ-finiteness, the completion via µ∗ -measurability and
the completion as in Section 1.4 need not coincide.)
1.16 Let R be a ring of subsets of a non-empty set X and let µ be a
σ-finite measure on R. Show that for every set E ∈ S(R) and for every
ε > 0, there exists E0 ∈ R such that
µ(E∆E0 ) ≤ ε.
Chapter 2
The Lebesgue measure
2.1
Construction of the Lebesgue measure
We will now study in detail the construction and properties of the
Lebesgue measure on the euclidean space RN . We will start with a
measure on the ring which arises from the notion of the length of an
interval (area or volume of a box in higher dimensions) and extend it
to a complete measure on the class of measurable sets, as described in
Section 1.3. To simplify the exposition we will describe in detail the construction on the real line, R. The generalization to higher dimensions
will be obvious.
Let P denote the class of all intervals of the form [a, b), where a ≤
b, a, b ∈ R. Let R be the ring of all finite unions of members of P (cf.
Example 1.1.3). As observed earlier, we can express each member of R
as a finite disjoint union of members of P. Let us define
µ([a, b)) = b − a.
If a = b, then [a, b) = ∅ and we have µ(∅) = 0. We will now construct a
measure on R starting from µ. The definition is almost obvious: if E ∈ R is expressed as the disjoint union of intervals, i.e. if $E = \cup_{j=1}^{k} I_j$, where $I_j$, 1 ≤ j ≤ k, are mutually disjoint members of P, then, necessarily, we must have
\[
\mu(E) = \sum_{j=1}^{k} \mu(I_j).
\]
However, we need to check that this is well-defined and also that it satisfies the properties of a measure. In particular, we need to verify that this set function is countably additive on R.
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_2
We start by formally proving a few fairly obvious properties of µ,
defined on P.
Lemma 2.1.1 (a) Let $\{E_i\}_{i=1}^{n}$ be a finite set of mutually disjoint intervals in P, such that each of them is contained in $E_0 \in P$. Then
\[
\sum_{i=1}^{n} \mu(E_i) \leq \mu(E_0). \tag{2.1.1}
\]
(b) Let $F = [a_0, b_0]$ be a finite closed interval contained in the finite union of open intervals $U_i$, where $U_i = (a_i, b_i)$, 1 ≤ i ≤ n. Then
\[
b_0 - a_0 < \sum_{i=1}^{n} (b_i - a_i). \tag{2.1.2}
\]
Proof: (a) Let $E_i = [a_i, b_i)$, 0 ≤ i ≤ n. Since the $E_i$, 1 ≤ i ≤ n, are disjoint, we have (by renumbering the sets, if necessary) that
\[
a_0 \leq a_1 < b_1 \leq a_2 < b_2 \leq \cdots < b_{i-1} \leq a_i < b_i \leq a_{i+1} < \cdots < b_n \leq b_0.
\]
Thus,
\[
\begin{aligned}
\sum_{i=1}^{n} \mu(E_i) = \sum_{i=1}^{n} (b_i - a_i) &\leq \sum_{i=1}^{n} (b_i - a_i) + \sum_{i=1}^{n-1} (a_{i+1} - b_i)\\
&= b_n - a_1 \leq b_0 - a_0 = \mu(E_0).
\end{aligned}
\]
This proves (2.1.1).
(b) We may renumber the intervals $U_i$, after getting rid of the superfluous ones, so that we have
\[
b_i \in (a_{i+1}, b_{i+1}) = U_{i+1}, \quad 1 \leq i \leq m - 1,
\]
where m ≤ n. Also $a_0 \in U_1$ and $b_0 \in U_m$. Then
\[
b_0 - a_0 < b_m - a_1 = (b_1 - a_1) + \sum_{i=1}^{m-1} (b_{i+1} - b_i) \leq \sum_{i=1}^{m} (b_i - a_i),
\]
which proves (2.1.2), since m ≤ n.
Proposition 2.1.1 If $\{E_i\}_{i=0}^{\infty}$ is a sequence in P such that $E_0 \subset \cup_{i=1}^{\infty} E_i$, then,
\[
\mu(E_0) \leq \sum_{i=1}^{\infty} \mu(E_i). \tag{2.1.3}
\]
Proof: The result is trivially true if $E_0 = \emptyset$. Let $E_i = [a_i, b_i)$, 0 ≤ i < ∞, with $a_i < b_i$ for all i, and choose ε > 0 such that $0 < \varepsilon < b_0 - a_0$. Let δ > 0 be an arbitrarily small positive quantity. Set
\[
F_0 = [a_0, b_0 - \varepsilon], \qquad U_i = \left(a_i - \frac{\delta}{2^i},\ b_i\right), \quad 1 \leq i < \infty.
\]
Then, $F_0 \subset E_0$ and $E_i \subset U_i$, 1 ≤ i < ∞. Thus $F_0 \subset \cup_{i=1}^{\infty} U_i$. Since $F_0$ is compact, there exists a positive integer n such that $F_0 \subset \cup_{i=1}^{n} U_i$. It now follows from the preceding lemma (cf. (2.1.2)) that
\[
b_0 - a_0 - \varepsilon < \sum_{i=1}^{n} \left(b_i - a_i + \frac{\delta}{2^i}\right) \leq \sum_{i=1}^{\infty} (b_i - a_i) + \delta.
\]
In other words,
\[
\mu(E_0) - \varepsilon < \sum_{i=1}^{\infty} \mu(E_i) + \delta.
\]
The result now follows since ε and δ are arbitrarily small quantities.
Proposition 2.1.2 The set function µ is countably additive on P.
Proof: Let $E_0 = \cup_{i=1}^{\infty} E_i$, where $\{E_i\}_{i=1}^{\infty}$ is a sequence of mutually disjoint members of P. Assume that $E_0 \in P$ as well. By the preceding proposition, we have
\[
\mu(E_0) \leq \sum_{i=1}^{\infty} \mu(E_i).
\]
On the other hand, for any positive integer n, we have by Lemma 2.1.1 (cf. (2.1.1)),
\[
\sum_{i=1}^{n} \mu(E_i) \leq \mu(E_0)
\]
which yields
\[
\sum_{i=1}^{\infty} \mu(E_i) \leq \mu(E_0)
\]
from which the result follows.
Theorem 2.1.1 There exists a unique finite measure µ on R which
extends µ defined on P.
Proof: Let E ∈ R. Then $E = \cup_{i=1}^{n} E_i$, where $\{E_i\}_{i=1}^{n}$ is a collection of mutually disjoint intervals in P. Then the only possible way we can define a measure on R is by setting
\[
\mu(E) = \sum_{i=1}^{n} \mu(E_i).
\]
This is obviously an extension of µ defined on P and so, in particular, µ(∅) = 0 and µ(E) ≥ 0 for all E ∈ R. However, we need to verify that µ is well-defined. Let
\[
E = \cup_{i=1}^{n} E_i = \cup_{j=1}^{m} F_j,
\]
where $\{E_i\}_{i=1}^{n}$ and $\{F_j\}_{j=1}^{m}$ are two collections of mutually disjoint intervals in P. Then, for each 1 ≤ i ≤ n, we can write
\[
E_i = \cup_{j=1}^{m} (E_i \cap F_j).
\]
Each set $E_i \cap F_j$ is either empty, or is a non-empty interval in P. Hence, by Proposition 2.1.2, since µ is also finitely additive in P, we have
\[
\mu(E_i) = \sum_{j=1}^{m} \mu(E_i \cap F_j).
\]
Similarly, for each 1 ≤ j ≤ m, we have
\[
\mu(F_j) = \sum_{i=1}^{n} \mu(F_j \cap E_i).
\]
Thus,
\[
\sum_{i=1}^{n} \mu(E_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} \mu(E_i \cap F_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} \mu(E_i \cap F_j) = \sum_{j=1}^{m} \mu(F_j).
\]
This establishes that µ is well-defined on R. It is also clear, from the definition of µ, that it is finitely additive on R.
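The well-definedness just established can be illustrated numerically on the real line. The sketch below is an assumption-laden illustration rather than the construction of the text: intervals [a, b) are represented as pairs `(a, b)`, the hypothetical helper `normalize` rewrites a finite union as a disjoint union (it also accepts overlapping intervals, which the definition in the text does not need), and `mu` sums the lengths.

```python
def normalize(intervals):
    """Rewrite a finite union of intervals [a, b) as a sorted disjoint union."""
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):   # drop empty ones
        if merged and a <= merged[-1][1]:
            # [a, b) touches or overlaps the last interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def mu(intervals):
    """Total length of the union of the given intervals [a, b)."""
    return sum(b - a for a, b in normalize(intervals))

# Two different disjoint decompositions of the same set [0, 3) ∪ [5, 6).
first = [(0, 1), (1, 3), (5, 6)]
second = [(0, 2), (2, 3), (5, 5.5), (5.5, 6)]
same_value = mu(first) == mu(second)
```

Both decompositions normalize to the same pair of intervals, so the two sums of lengths agree, which is the content of the computation above in this special case.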
We now need to show that µ is countably additive on R. Let $E = \cup_{i=1}^{\infty} E_i$, where E ∈ R and $\{E_i\}_{i=1}^{\infty}$ is a collection of mutually disjoint sets in R. Since $E_i \in R$, we can write
\[
E_i = \cup_{k=1}^{n_i} E_{ik}, \quad 1 \leq i < \infty,
\]
where $\{E_{ik}\}_{k=1}^{n_i}$ is a finite collection of mutually disjoint intervals in P. Notice that, for each 1 ≤ i < ∞, we have, by definition of µ,
\[
\mu(E_i) = \sum_{k=1}^{n_i} \mu(E_{ik}).
\]
Case 1: E ∈ P.
In this case, since
\[
E = \cup_{i=1}^{\infty} \cup_{k=1}^{n_i} E_{ik}
\]
is a countable disjoint union within P, we have, by Proposition 2.1.2, that
\[
\mu(E) = \sum_{i=1}^{\infty} \sum_{k=1}^{n_i} \mu(E_{ik}) = \sum_{i=1}^{\infty} \mu(E_i).
\]
Case 2: $E = \cup_{j=1}^{n} F_j$, where $F_j \in P$ for each 1 ≤ j ≤ n, and the $F_j$'s are all mutually disjoint. Then, for each 1 ≤ j ≤ n, we have
\[
F_j = \cup_{i=1}^{\infty} (F_j \cap E_i).
\]
Thus $\{F_j \cap E_i\}_{i=1}^{\infty}$ is a collection of mutually disjoint sets in R whose union is $F_j \in P$. Thus, by Case 1 above, we have
\[
\mu(F_j) = \sum_{i=1}^{\infty} \mu(F_j \cap E_i).
\]
Hence, by definition of µ on R, we deduce that
\[
\mu(E) = \sum_{j=1}^{n} \mu(F_j) = \sum_{j=1}^{n} \sum_{i=1}^{\infty} \mu(F_j \cap E_i) = \sum_{i=1}^{\infty} \sum_{j=1}^{n} \mu(F_j \cap E_i).
\]
On the other hand, we have, for each 1 ≤ i < ∞,
\[
E_i = \cup_{j=1}^{n} (E_i \cap F_j).
\]
Now, $\{E_i \cap F_j\}_{j=1}^{n}$ is a finite collection of mutually disjoint sets in R and so, by finite additivity, we have
\[
\mu(E_i) = \sum_{j=1}^{n} \mu(E_i \cap F_j)
\]
and so we deduce that
\[
\mu(E) = \sum_{i=1}^{\infty} \mu(E_i)
\]
which proves the countable additivity of µ in the general case as well.
This completes the proof.

Remark 2.1.1 In the same way, when N ≥ 2, if
\[ \mathcal{P} = \left\{ \prod_{i=1}^{N} [a_i, b_i) \;\middle|\; a_i \le b_i,\ a_i, b_i \in \mathbb{R},\ 1 \le i \le N \right\}, \]
and R is the ring of finite (disjoint) unions of members of P (cf. Exercise 1.1), we can define a unique measure on R which, when restricted to P, is given by
\[ \mu\left( \prod_{i=1}^{N} [a_i, b_i) \right) = \prod_{i=1}^{N} (b_i - a_i). \]

Let B = S(R) be the σ-ring generated by R. Sets in B are called
Borel sets and B is called the Borel σ-algebra. Since
\[ \mathbb{R} = \cup_{n \in \mathbb{Z}} [n, n + 1), \]
we see that B is, indeed, a σ-algebra. Thus, the hereditary σ-ring generated by the ring \(\mathcal{R}\) is clearly the power set of \(\mathbb{R}\). We can now define the
induced outer-measure µ∗ for all subsets of R. The collection L of all
µ∗-measurable sets is thus a σ-algebra which is called the Lebesgue σ-algebra and its members are called the Lebesgue measurable sets;
the induced measure on this σ-algebra is called the Lebesgue measure
on R. It is clear that the Lebesgue measure is σ-finite and complete.
Thus the Lebesgue measure is the completion of the measure induced
on the Borel σ-algebra (cf. Theorem 1.4.2) by µ.
Remark 2.1.2 In an identical manner, we can define the Borel and
Lebesgue σ-algebras in RN , N ≥ 2, starting from the class P, the ring
R and the measure µ defined in Remark 2.1.1. We will then get the
Lebesgue measure in RN which will be a σ-finite complete measure, and
which is also the completion of the measure induced on the Borel σ-algebra by µ.

Notation: We will denote the Lebesgue measure on RN by the symbol mN. In particular, the Lebesgue measure on R will be denoted by m1. The Borel and Lebesgue σ-algebras on RN will be denoted, respectively, BN and LN.

Definition 2.1.1 Any measure defined on the Borel σ-algebra, BN, will be called a Borel measure on RN.

Proposition 2.1.3 Every countable set in R is a Borel set of measure zero.
Proof: Let a ∈ R. Then
\[ \{a\} = \cap_{n=1}^{\infty} \left[ a, a + \frac{1}{n} \right) \in \mathcal{B}_1. \]
Thus singletons are all in B1 and, consequently, any countable set is in
B1 . Further, by Proposition 1.2.5, it follows that
\[ m_1(\{a\}) = \lim_{n \to \infty} m_1\left( \left[ a, a + \frac{1}{n} \right) \right) = \lim_{n \to \infty} \frac{1}{n} = 0. \]
Thus, the measure of any countable set will also be zero.

Proposition 2.1.4 The Borel σ-algebra is also the σ-algebra generated by all the open sets in R.
Proof: Let a, b ∈ R. Then (a, b) = [a, b)\{a} ∈ B1 . Since any open
set can be written as the countable union of such intervals, it follows
that every open set is contained in the Borel σ-algebra, B1 . Hence, the
σ-algebra generated by open sets is also contained in B1 . Conversely,
[a, b) = (a, b) ∪ {a} and
\[ \{a\} = \cap_{n=1}^{\infty} \left( a - \frac{1}{n}, a + \frac{1}{n} \right). \]
Consequently, P, and hence, R and B1 are all contained in the σ-algebra generated by the open sets. This completes the proof.

Remark 2.1.3 It is an easy exercise to adapt the proofs of the preceding two propositions to the case RN, N ≥ 2 as well. Thus countable sets in
RN are Borel sets of measure zero and BN is the σ-algebra generated by the open sets in RN (with its usual topology).

Example 2.1.1 Since singletons have measure zero (Proposition 2.1.3), it follows that
\[ m_1((a, b)) = m_1([a, b)) = m_1([a, b]) = b - a. \]
Since R is the countable disjoint union of the intervals [n, n + 1), as n
varies over Z, it follows that m1 (R) = +∞.
Now consider any interval [a, b) as the set [a, b) × {0} ⊂ R2. Then
\[ [a, b) \times \{0\} = \cap_{n=1}^{\infty} \left( [a, b) \times \left[ 0, \frac{1}{n} \right) \right). \]
Then, again, it follows, from Proposition 1.2.5, that
\[ m_2([a, b) \times \{0\}) = \lim_{n \to \infty} \frac{1}{n} (b - a) = 0. \]
Since the real line, considered as a coordinate axis in R2, can be written as the countable disjoint union of sets of the form [n, n + 1) × {0}, it follows that
\[ m_2(\mathbb{R} \times \{0\}) = 0. \]
More generally, the Lebesgue measure in RN of any proper linear subspace will be zero.

Example 2.1.2 (i) Since any non-empty open set contains a non-empty
open interval, it follows that the Lebesgue measure of any non-empty
open set is strictly positive.
(ii) The set of rationals, Q, being countable, has Lebesgue measure zero.
Thus Q is an example of a dense set which has measure zero. Its complement, the set of irrationals, is a dense set of infinite measure.
(iii) If E ⊂ R is a measurable set of measure zero, then it cannot contain
any non-empty open set. Thus, every non-empty open set will intersect
E c and so E c will be dense.
(iv) If K ⊂ R is a compact set, then it is bounded and closed and so it
has finite measure.

Example 2.1.3 The Cantor Set
Let X = [0, 1]. Set X1 = (1/3, 2/3). Let X2 be the union of the open
middle-thirds of the subintervals of X\X1, i.e.
\[ X_2 = \left( \frac{1}{9}, \frac{2}{9} \right) \cup \left( \frac{7}{9}, \frac{8}{9} \right). \]
Now let X3 be the union of the open middle-thirds of the four subintervals of X\(X1 ∪ X2) and so on. Set
\[ C = X \setminus \cup_{n=1}^{\infty} X_n. \]
The set C is called the Cantor set.
(i) Since each Xn is open, it follows that C is a closed set.
(ii) It is easy to see that
\[ m_1(X_n) = \frac{2^{n-1}}{3^n}. \]
Since all these sets Xn are disjoint, we have
\[ m_1\left( \cup_{n=1}^{\infty} X_n \right) = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1 = m_1([0, 1]). \]
Consequently, m1 (C) = 0.
(iii) Since C has measure zero, it cannot contain a non-empty open set. Since C is also closed, it follows that C is nowhere dense.
(iv) Let x ∈ C. If (a, b) is any interval containing x, then, for n sufficiently large, it must also contain a sub-interval of Xn . End-points of all
such sub-intervals are in C. Thus no point of C is isolated. Since C is
closed as well, it follows that C is a perfect set and so it is uncountable
(cf. Rudin [7]). Thus C is an example of an uncountable set of measure
zero.
(v) Another way of proving the uncountability of the Cantor set is as follows. Consider the ternary expansion of real numbers in [0, 1]:
\[ x = \sum_{n=1}^{\infty} a_n 3^{-n}, \]
where an = 0, 1 or 2. To determine an , we proceed in the following
manner. First we divide the interval [0, 1] into three equal sub-intervals.
Then a1 = 0, 1 or 2 according as x falls in the first, second or third
sub-interval, respectively. If it falls at one of the nodes of the partition,
then the expansion of x has only one term. Next, we divide each of
the above subintervals into three equal parts and again a2 = 0, 1 or 2,
according as x falls in the first, second or third of the intervals of the
block in which it falls. We proceed inductively to determine all the an .
The expansion will terminate at a finite stage, if x falls at a node of the
partition at some level.
It is now clear that the Cantor set contains the set of all points in
[0, 1] with infinite ternary expansion such that the digits in its ternary
expansion are either 0 or 2. Now a simple Cantor diagonalisation argument shows that C is uncountable.

We have three distinguished collections of subsets of RN, viz. the
Borel σ-algebra, BN , the Lebesgue σ-algebra, LN and, the power set of
RN , P(RN ). We have
BN ⊂ LN ⊂ P(RN ).
One would, naturally, like to know if these inclusions are strict. We will
show, in the sequel, that these inclusions are, indeed, strict. To start
with, since the Lebesgue measure is complete, we have that every subset
of C is Lebesgue measurable. Since C is uncountable, the cardinality of
L1 is \(2^c\), where c is the cardinality of the continuum. It can be shown
that the cardinality of B1 is just c. Consequently the inclusion B1 ⊂ L1
is strict. We will later define the Cantor function and use it to prove the
existence of a Lebesgue measurable set which is not Borel measurable.
We will also prove the existence of sets in R which are not Lebesgue
measurable.
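The measure computation in Example 2.1.3 (ii) can be verified with exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, and `survives` uses the standard self-similarity of the construction (rescaling the left or right third back to [0, 1]), which is an assumption not spelled out in the text.

```python
from fractions import Fraction

def removed_length(n):
    """m_1(X_n): X_n consists of 2^(n-1) open intervals, each of length 3^(-n)."""
    return Fraction(2 ** (n - 1), 3 ** n)

def partial_sum(k):
    """Total length removed in the first k steps."""
    return sum(removed_length(n) for n in range(1, k + 1))

def survives(x, k):
    """True if x in [0, 1] is not removed in the first k steps."""
    for _ in range(k):
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False          # x lies in an open middle third
        # rescale the remaining left or right third back to [0, 1]
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True

# The partial sums equal 1 - (2/3)^k, which tends to 1 = m_1([0, 1]).
for k in (1, 2, 5, 20):
    print(k, partial_sum(k), 1 - Fraction(2, 3) ** k)

# 1/4 has ternary expansion 0.020202..., so it is never removed; 1/2 is removed at once.
print(survives(Fraction(1, 4), 30), survives(Fraction(1, 2), 30))
```

The point 1/4 is not an end-point of any removed interval, which illustrates that C contains more than the end-points mentioned in (iv).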
2.2 Approximation
In this section, we will present various results on the approximation of
measurable sets by topological sets in the context of the Lebesgue measure.
By a box in RN, N ≥ 1, we will mean a set of the form
\[ B = \prod_{j=1}^{N} I_j, \]
where each set Ij, 1 ≤ j ≤ N, is a finite interval in R. If all the Ij are
open intervals, we will call it an open box and if they are all closed,
we will call it a closed box. If all the Ij are of the form [aj, bj), where
aj < bj , we will call it a half-open box.
Clearly, if B ⊂ RN is a box as above, we have
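\[ m_N(B) = \prod_{j=1}^{N} m_1(I_j). \tag{2.2.1} \]
It is also clear that given ε > 0, we can always find boxes \(B_1^{\varepsilon}\) and \(B_2^{\varepsilon}\) which are of any kind (open, closed or half-open) such that
\[ B_1^{\varepsilon} \subset B \subset B_2^{\varepsilon} \]
and such that
\[ m_N(B \setminus B_1^{\varepsilon}) < \varepsilon \quad \text{and} \quad m_N(B_2^{\varepsilon} \setminus B) < \varepsilon. \]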
Recall that the class P is the collection of all half-open boxes and R
is the ring generated by P. The measure µ on R is the same as mN ,
given by (2.2.1) on P, and extended by finite additivity to R. We will
denote the corresponding outer-measure defined on all subsets of RN by
µ∗ . Thus µ∗ restricted to LN is mN . Unless specified otherwise, the
term measurable set will mean a Lebesgue measurable set.
Proposition 2.2.1 Let E ⊂ RN , N ≥ 1. Then
µ∗ (E) = inf{µ∗ (U ) |E ⊂ U, U an open set}.
Proof: The result is obvious if µ∗ (E) = +∞. So let us assume that
µ∗ (E) < +∞. Since E ⊂ U implies that µ∗ (E) ≤ µ∗ (U ), it is immediate
to see that
µ∗ (E) ≤ inf{µ∗ (U ) | E ⊂ U, U an open set}.
Let ε > 0. Then, by the definition of the outer-measure (cf. Proposition 1.3.1), there exist half-open boxes Bn such that \(E \subset \cup_{n=1}^{\infty} B_n\) and such that
\[ \sum_{n=1}^{\infty} m_N(B_n) < \mu^*(E) + \frac{\varepsilon}{2}. \]
Now, construct open boxes \(\{B_n'\}_{n=1}^{\infty}\) such that \(B_n \subset B_n'\) and \(m_N(B_n' \setminus B_n) < \varepsilon / 2^{n+1}\). Set \(U = \cup_{n=1}^{\infty} B_n'\). Then U is an open set and E ⊂ U. Further
\[ \mu^*(U) = m_N(U) \le \sum_{n=1}^{\infty} m_N(B_n) + \frac{\varepsilon}{2}. \]
Consequently, we have E ⊂ U and µ∗(U) < µ∗(E) + ε. This completes the proof.
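The ε/2^(n+1) bookkeeping in the proof of Proposition 2.2.1 can be illustrated in dimension one. This is a sketch under stated assumptions: the list `boxes` is a hypothetical finite family of half-open intervals, and each [a, b) is enlarged to the open interval (a − δ, b) with δ = ε/2^(n+2), which is one admissible choice among many.

```python
from fractions import Fraction

def open_enlargements(boxes, eps):
    """Replace the n-th half-open interval [a, b) by the open interval (a - delta_n, b)."""
    enlarged = []
    for n, (a, b) in enumerate(boxes, start=1):
        delta = eps / 2 ** (n + 2)        # strictly less than eps / 2^(n+1)
        enlarged.append((a - delta, b))   # contains [a, b); the extra length is delta
    return enlarged

def total_length(intervals):
    return sum(b - a for a, b in intervals)

eps = Fraction(1, 10)
# Hypothetical cover: the intervals [n, n + 1/n) for n = 1, ..., 49.
boxes = [(Fraction(n), Fraction(n) + Fraction(1, n)) for n in range(1, 50)]
enlarged = open_enlargements(boxes, eps)

extra = total_length(enlarged) - total_length(boxes)
print(extra, extra < eps / 2)
```

However many boxes are used, the total extra length stays below ε/2, which is why the union of the enlarged boxes still has outer-measure less than µ∗(E) + ε.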
Proposition 2.2.2 Let E ⊂ RN . The following statements are equivalent.
(i) The set E is Lebesgue measurable.
(ii) Given any ε > 0, there exists an open set U such that E ⊂ U and
such that µ∗ (U \E) < ε.
(iii) Given any ε > 0, there exists a closed set F such that F ⊂ E and
such that µ∗ (E\F ) < ε.
(iv) There exists a Gδ set G such that E ⊂ G and such that µ∗ (G\E) =
0.
(v) There exists an Fσ set F such that F ⊂ E and such that µ∗ (E\F ) =
0.
Proof: (i) ⇒ (ii): If µ∗ (E) < +∞, then, by the previous proposition,
there exists an open set U containing E such that µ∗ (U ) < µ∗ (E) + ε.
Since E is Lebesgue measurable, we have that µ∗ = mN and since a
measure is subtractive, we deduce that µ∗ (U \E) < ε. If µ∗ (E) = +∞,
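then since µ∗ = mN is σ-finite, we can find disjoint measurable sets En such that each of them has finite measure and such that \(E = \cup_{n=1}^{\infty} E_n\). Then, we can find open sets Un such that En ⊂ Un and such that \(\mu^*(U_n \setminus E_n) < \varepsilon / 2^n\). Then \(U = \cup_{n=1}^{\infty} U_n\) is open, contains E and
\[ \mu^*(U \setminus E) \le \mu^*\left( \cup_{n=1}^{\infty} (U_n \setminus E_n) \right) \le \sum_{n=1}^{\infty} \mu^*(U_n \setminus E_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon. \]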
(ii) ⇒ (iv): For each positive integer n, choose Un open such that E ⊂ Un and µ∗(Un\E) < 1/n. Set \(G = \cap_{n=1}^{\infty} U_n\). Then G is a Gδ set containing E and
\[ \mu^*(G \setminus E) \le \mu^*(U_n \setminus E) < \frac{1}{n}, \]
from which we deduce that µ∗(G\E) = 0.
(iv) ⇒ (i): By completeness of the Lebesgue measure, if µ∗ (G\E) = 0,
it follows that G\E is measurable. Since G is a Gδ set, it is measurable
as well. Now
E = G\(G\E)
and so E is measurable as well.
(i) ⇒ (iii): If E is measurable, so is E^c. Hence, there exists an open set
U containing E^c and such that µ∗(U\E^c) < ε. Then F = U^c is closed
and is contained in E. Further
µ∗ (E\F ) = µ∗ (E ∩ F c ) = µ∗ (E ∩ U ) = µ∗ (U \E c ) < ε.
(iii) ⇒ (v): For every positive integer n, choose Fn, a closed set contained in E, such that µ∗(E\Fn) < 1/n. Set \(F = \cup_{n=1}^{\infty} F_n\). Then F is an Fσ set contained in E. Further,
\[ \mu^*(E \setminus F) \le \mu^*(E \setminus F_n) < \frac{1}{n} \]
for each n, from which it follows that µ∗(E\F) = 0.
(v) ⇒ (i): Since µ∗ (E\F ) = 0, once again, by the completeness of the
Lebesgue measure, we have that E\F is measurable. Every Fσ set is
measurable as well. Thus
E = F ∪ (E\F )
is measurable as well.

Proposition 2.2.3 Let E ⊂ RN be a measurable set of finite measure.
Given ε > 0, there exists a compact set K ⊂ E such that mN (E\K) < ε.
Proof: Step 1: Let η > 0 be an arbitrary positive number. By Proposition 2.2.2, there exists an open set V such that E ⊂ V and mN(V\E) < η. Let B(0; r) denote the open ball in RN with centre at 0 and of radius r > 0; let the corresponding closed ball be denoted by \(\overline{B}(0; r)\). If n is a positive integer, set Vn = B(0; n) ∩ V. Then the sequence of open sets \(\{V_n\}_{n=1}^{\infty}\) increases to V and since V also has finite measure, there exists a positive integer m such that mN(V\Vm) < η. Then E\Vm ⊂ V\Vm and so mN(E\Vm) < η.
Step 2: Now, Vm is a bounded open set and again, by Proposition 2.2.2,
there exists a closed set F ⊂ Vm such that mN (Vm \F ) < η. Since F is
bounded and closed, it is compact.
Step 3: Thus, for any ε > 0, and a set E ⊂ RN of finite measure, we
can find a bounded open set W such that mN(E\W) < ε/3. Further, by
Step 2, there exists a compact set K1 ⊂ W such that mN(W\K1) < ε/3.
Finally, once again by Proposition 2.2.2, there exists a closed set F1 ⊂ E
such that mN(E\F1) < ε/3. Then K = F1 ∩ K1 is a compact set contained in E and
\[ E \setminus K = (E \setminus W) \cup ((E \cap W) \setminus F_1) \cup ((W \cap F_1) \setminus K_1) \subset (E \setminus W) \cup (E \setminus F_1) \cup (W \setminus K_1). \]
Thus, mN(E\K) < ε. This completes the proof.

Remark 2.2.1 Let E ⊂ RN be a measurable set. Then, by Proposition 2.2.1, we have
\[ m_N(E) = \inf\{ m_N(U) \mid E \subset U,\ U \text{ is an open set} \}. \tag{2.2.2} \]
If mN (E) < +∞, then, by Proposition 2.2.3, we have
\[ m_N(E) = \sup\{ m_N(K) \mid K \subset E,\ K \text{ is a compact set} \}. \tag{2.2.3} \]
Any Borel measure µ (cf. Definition 2.1.1) satisfying (2.2.2) with µ in
the place of mN , is called outer-regular. If it satisfies (2.2.3) with µ
in place of mN , when µ(E) < +∞, it is said to be inner-regular. If
both are valid, we say that the measure is regular. Thus, the Lebesgue
measure is a regular measure.

Definition 2.2.1 Given any set X and a subset A thereof, the characteristic function of A is the function χA : X → R defined by
\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]

Definition 2.2.2 Let Ω ⊂ RN be an open set. A step function defined on Ω is a function f of the form
\[ f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}, \]
where the αj, 1 ≤ j ≤ k are constants and the sets Ij, 1 ≤ j ≤ k are boxes contained in Ω.

Proposition 2.2.4 Let I ⊂ RN be a box. Let ε > 0 be an arbitrary positive number. Then, there exists a function ϕ ∈ Cc(RN), the space of continuous real-valued functions with compact support, such that 0 ≤ ϕ(x) ≤ 1 for all x and such that
\[ m_N(\{ x \in \mathbb{R}^N \mid \varphi(x) \neq \chi_I(x) \}) < \varepsilon. \]
Further, the support of ϕ will be contained in I.
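In dimension one, a function with the properties stated in Proposition 2.2.4 can be written down explicitly. The following sketch is only an illustration for I = [a, b): the piecewise linear formula, the name `make_phi` and the choice δ = min(ε, b − a)/8 are assumptions made here and are not taken from the text, whose proof relies on Urysohn's lemma.

```python
from fractions import Fraction

def make_phi(a, b, eps):
    """Return (phi, delta) for I = [a, b), with
    J1 = [a + 2*delta, b - 2*delta] (closed) and J2 = (a + delta, b - delta) (open)."""
    delta = min(eps, b - a) / 8

    def phi(x):
        # distance from x to the complement of J2, scaled by delta and clipped to [0, 1]
        d = min(x - (a + delta), (b - delta) - x) / delta
        return max(0, min(1, d))

    return phi, delta

a, b, eps = Fraction(0), Fraction(1), Fraction(1, 10)
phi, delta = make_phi(a, b, eps)

# phi equals 1 on J1 and 0 outside J2, so it can differ from the
# characteristic function of I only on I \ J1, a set of measure 4 * delta.
print(phi(Fraction(1, 2)), phi(a), phi(b), phi(a + 3 * delta / 2))
print(4 * delta, 4 * delta < eps)
```

The function is continuous, takes values in [0, 1], and its support is the closure of J2, which is a compact subset of I.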
Proof: We can find a closed box J1 and an open box J2, such that \(J_1 \subset J_2 \subset \overline{J_2} \subset I\) and such that mN(I\J1) < ε. By Urysohn's lemma, there exists a continuous function ϕ such that 0 ≤ ϕ(x) ≤ 1 for all x and such that ϕ(x) = 1 for all x ∈ J1 and ϕ(x) = 0 for all x ∉ J2. Then, the support of ϕ is contained in \(\overline{J_2} \subset I\), and so it is compact. Thus ϕ ∈ Cc(RN). Now,
\[ \{ x \in \mathbb{R}^N \mid \varphi(x) \neq \chi_I(x) \} \subset I \setminus J_1 \]
from which the result follows immediately.

Corollary 2.2.1 Let Ω ⊂ RN be an open set and let f : Ω → R be a
step function. Let ε > 0 be an arbitrary positive number. Then there
exists ϕ ∈ Cc (Ω) such that
\[ m_N(\{ x \in \Omega \mid \varphi(x) \neq f(x) \}) < \varepsilon, \]
and such that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{x \in \Omega} |f(x)|. \]
Proof: Let \(f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}\) be a step function. Without loss of generality, we can assume that the boxes are disjoint. By the preceding proposition, there exist functions ϕj ∈ Cc(RN), 1 ≤ j ≤ k, such that, for each such j, we have 0 ≤ ϕj ≤ 1, the support of ϕj is contained in the box Ij, and
\[ m_N(\{ x \in \mathbb{R}^N \mid \varphi_j(x) \neq \chi_{I_j}(x) \}) < \frac{\varepsilon}{k}. \]
Set \(\varphi = \sum_{j=1}^{k} \alpha_j \varphi_j\). Then
\[ \{ x \in \Omega \mid \varphi(x) \neq f(x) \} \subset \cup_{j=1}^{k} \{ x \in \Omega \mid \varphi_j(x) \neq \chi_{I_j}(x) \} \subset \cup_{j=1}^{k} \{ x \in \mathbb{R}^N \mid \varphi_j(x) \neq \chi_{I_j}(x) \} \]
and so
\[ m_N(\{ x \in \Omega \mid \varphi(x) \neq f(x) \}) < \varepsilon. \]
Since the supports of the ϕj, 1 ≤ j ≤ k, are all disjoint, it follows that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{1 \le j \le k} |\alpha_j| = \max_{x \in \Omega} |f(x)|. \]
Finally, the function ϕ has compact support contained in \(\cup_{j=1}^{k} I_j \subset \Omega\). Thus ϕ ∈ Cc(Ω). This completes the proof.

We conclude this section with one more approximation result. In order to prove it, we need a topological result.
Lemma 2.2.1 Every open set in RN can be written as a countable disjoint union of half-open boxes.
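As an illustration of the lemma, the decomposition can be carried out for an open interval (a, b) in R, using the dyadic grids that appear in the proof below. This is a sketch only: the function name, the cut-off `max_level` and the sample interval are hypothetical, and only finitely many levels are processed, so a thin layer near the end-points remains uncovered.

```python
from fractions import Fraction
from math import floor

def dyadic_decomposition(a, b, max_level):
    """Disjoint half-open dyadic intervals [k/2^n, (k+1)/2^n), 1 <= n <= max_level,
    contained in the open interval (a, b), selected level by level."""
    selected = []
    for n in range(1, max_level + 1):
        h = Fraction(1, 2 ** n)
        k = floor(a / h) + 1                 # smallest k with k * h > a
        while (k + 1) * h <= b:              # [k*h, (k+1)*h) is contained in (a, b)
            left, right = k * h, (k + 1) * h
            # discard boxes lying inside a box selected at a coarser level
            if not any(l <= left and right <= r for l, r in selected):
                selected.append((left, right))
            k += 1
    return selected

a, b, levels = Fraction(1, 3), Fraction(2), 10
boxes = dyadic_decomposition(a, b, levels)
total = sum(r - l for l, r in boxes)
# The uncovered part lies within 2^(-levels) of the end-points a and b.
print(len(boxes), total, b - a - total)
```

Letting the number of levels grow exhausts the interval, which is the countable disjoint union asserted by the lemma.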
Proof: For a fixed positive integer n, let Fn denote the set of all points in RN whose coordinates are all integral multiples of \(2^{-n}\). Let Gn denote the collection of all half-open boxes with each edge of length \(2^{-n}\) and with vertices at the points of Fn. The following conclusions are obvious:
• For a fixed positive integer n, each point x ∈ RN belongs to exactly one box in Gn.
• Let n > m. If Q ∈ Gm and if Q′ ∈ Gn, then either Q′ ⊂ Q or Q ∩ Q′ = ∅.
Let Ω ⊂ RN be an open set. Let x ∈ Ω. Then x lies in an open ball
contained in Ω and so, for sufficiently large n, we can find a box Q ∈ Gn
such that
x ∈ Q ⊂ Ω.
In other words, Ω is the union of all boxes contained within it and
belonging to the collection Gn , for some n. This collection of boxes is
clearly countable but may not be disjoint.
Now choose all those boxes in this collection which belong to G1 and
discard those of Gk , k ≥ 2, which are contained inside these selected
boxes. From the remaining collection of boxes in Ω, select those in G2
and discard those which are in Gk , k ≥ 3 and contained within these
selected boxes. Proceeding iteratively like this, we can express Ω as the
countable disjoint union of (half-open) boxes, as is obvious from the two
observations made above.

Proposition 2.2.5 Let Ω ⊂ RN be an open set and let E ⊂ Ω be a
measurable set with finite measure. Then, for any ε > 0, there exists a
set F , which is a finite disjoint union of boxes, such that
mN (E∆F ) < ε.
Proof: Let G ⊂ RN be an open set such that E ⊂ G and such that mN(G\E) < ε/2 (cf. Proposition 2.2.2). Set G′ = Ω ∩ G. Then G′ is also open, E ⊂ G′ ⊂ Ω and mN(G′\E) < ε/2.
Now, G′ can be written as the countable disjoint union of half-open boxes, say, \(\{I_j\}_{j=1}^{\infty}\), and so, for each j, we have Ij ⊂ Ω. Since E has finite measure, so have G and G′. Thus
\[ \sum_{j=1}^{\infty} m_N(I_j) < +\infty. \]
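Choose a positive integer k such that
\[ \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
Set \(F = \cup_{j=1}^{k} I_j\), which is a finite disjoint union of boxes. Since F ⊂ G′, we have
\[ m_N(F \setminus E) \le m_N(G' \setminus E) < \frac{\varepsilon}{2} \]
and
\[ m_N(E \setminus F) \le m_N(G' \setminus F) \le \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
This completes the proof.

2.3 Translation invariance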
We will now study a very important property of the Lebesgue measure.
Lemma 2.3.1 Let Ω and Ω′ be open sets in RN. Let T : Ω → Ω′ be a bijection which is a homeomorphism. Then E ⊂ Ω is a Borel set if, and only if, T(E) is a Borel set.
Proof: Set
\[ \mathcal{S} = \{ E \subset \Omega \mid T(E) \text{ is a Borel set} \}. \]
Clearly Ω and ∅ are in S. Also, if E ⊂ Ω, \(T(E^c) = (T(E))^c\) and if \(\{E_i\}_{i=1}^{\infty}\) is a sequence of subsets of Ω, we have that \(T(\cup_{i=1}^{\infty} E_i) = \cup_{i=1}^{\infty} T(E_i)\). It follows from these observations that S is closed under the formation of countable unions and under complementation. Thus, S is a σ-algebra on Ω. Since open sets get mapped onto open sets, we get that all open sets are in S. Then it follows that all Borel sets are in S as well. The converse follows by applying this reasoning to the map \(T^{-1}\).

As a special case, let us consider Ω = Ω′ = RN and let T x = x + x0,
where x0 is a fixed point in RN . Thus E is a Borel set if, and only if,
T (E) is a Borel set.
If I is a half-open box, then T (I) = I +x0 is also a half-open box and
both of them have the same measure. It is now immediate to see from
the definition of the induced outer-measure, µ∗ , that µ∗ (T (E)) = µ∗ (E),
for any set E ⊂ RN . The same is obviously true for T −1 as well.
Now let E be a Lebesgue measurable subset of RN . Let A ⊂ RN .
Then
\[ \begin{aligned} \mu^*(A \cap T(E)) + \mu^*(A \cap (T(E))^c) &= \mu^*(T(T^{-1}(A) \cap E)) + \mu^*(T(T^{-1}(A) \cap E^c)) \\ &= \mu^*(T^{-1}(A) \cap E) + \mu^*(T^{-1}(A) \cap E^c) \\ &= \mu^*(T^{-1}(A)) \\ &= \mu^*(A). \end{aligned} \]
This shows that T (E) is Lebesgue measurable. Applying this to T −1 ,
we deduce the following result.
Theorem 2.3.1 Let x0 ∈ RN be a fixed point. Let T (x) = x + x0 , x ∈
RN . Then E ⊂ RN is Lebesgue measurable if, and only if, T (E) is
Lebesgue measurable, and in this case,
mN (T (E)) = mN (E). (2.3.1)
Definition 2.3.1 Let µ be a Borel measure defined on RN . We say
that it is translation invariant if for every Borel set E, and for every
mapping T : RN → RN such that T (x) = x + x0 for some x0 ∈ RN , we
have that µ(T(E)) = µ(E).

Thus, the Lebesgue measure is translation invariant. In fact the
properties of outer-regularity, translation invariance and finiteness of
the measure for compact sets characterize the Lebesgue measure, as
the following theorem shows.
Theorem 2.3.2 Let ν be a Borel measure on RN such that
(i) ν(K) < +∞ for every compact set K ⊂ RN ;
(ii) ν(E) = inf{ν(V ) | E ⊂ V, V an open set}, for every Borel set E ⊂
RN ; and
(iii) it is translation invariant.
Then, there exists a constant c > 0 such that ν(E) = cmN (E) for every
Borel set E ⊂ RN .
Proof: Let Q = [0, 1) × [0, 1) × · · · × [0, 1) (N times). Then mN(Q) = 1.
Let n ≥ 2 be an arbitrary positive integer. Q can be written as the
disjoint union of \(2^{Nn}\) boxes in the collection Gn described in the proof
of Lemma 2.2.1. Assume that ν(Q) = c > 0. Since ν is translation invariant, all the \(2^{Nn}\) boxes of side \(2^{-n}\) which make up Q will have the same measure. Let \(\tilde{Q}\) be one such box. Then
\[ 2^{Nn} \nu(\tilde{Q}) = \nu(Q) = c = c\, m_N(Q) = c\, 2^{Nn} m_N(\tilde{Q}). \]
Thus, \(\nu(\tilde{Q}) = c\, m_N(\tilde{Q})\) as well. Since any open set can be written as the countable disjoint union of such boxes (cf. Lemma 2.2.1), it follows that if V is any open set, then ν(V) = cmN(V). Then, for any Borel set E, it follows, from condition (ii) in the statement of this theorem, that ν(E) = cmN(E).

As an application of this result, we have the following theorem.
Theorem 2.3.3 Let A : RN → RN be a linear transformation. Let E
be any Borel set in RN. Then
\[ m_N(A(E)) = |\det(A)|\, m_N(E). \tag{2.3.2} \]
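Before the proof, formula (2.3.2) can be sanity-checked in the plane when E is the unit square, whose image under A is a parallelogram. The sketch below assumes that the Lebesgue measure of a polygon agrees with its elementary area, computed here by the shoelace formula; the matrices and helper names are hypothetical.

```python
def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

def apply(A, p):
    """Image of the point p = (x, y) under the 2 x 2 matrix A."""
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y, c * x + d * y)

def shoelace_area(vertices):
    """Area of a simple polygon whose vertices are listed in order."""
    s = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

A = ((2, 1), (-1, 3))      # hypothetical non-singular matrix, det(A) = 7
image = [apply(A, p) for p in unit_square]
print(shoelace_area(image), abs(det2(A)) * shoelace_area(unit_square))

S = ((1, 2), (2, 4))       # hypothetical singular matrix: the image lies on a line
print(shoelace_area([apply(S, p) for p in unit_square]))
```

The singular case gives a degenerate polygon of area zero, in agreement with Step 1 of the proof.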
Proof: Step 1: Let A be singular. Then det(A) = 0. Also, the range of
A will be a proper subspace of RN . Then, if E is a Borel set, A(E) is
contained in a proper subspace of RN and hence will have measure zero
(cf. Example 2.1.1). Thus (2.3.2) is valid in this case.
Step 2: Let A be non-singular. Then, by Lemma 2.3.1, A(E) is Borel
measurable whenever E is so. Define
ν(E) = mN (A(E)).
It is easy to see that ν is a Borel measure. If K ⊂ RN is compact, then
so is A(K). Thus, ν takes only finite values on compact sets.
If E ⊂ V , where V is an open set, then A(V ) is an open set and
A(E) ⊂ A(V ) and vice-versa. Thus
inf{ν(V ) | E ⊂ V, V an open set}
= inf{mN (A(V )) | E ⊂ V, V an open set}
= inf{mN (U ) | A(E) ⊂ U, U an open set}
= mN (A(E)) = ν(E).
Finally,
ν(E + x0 ) = mN (A(E + x0 )) = mN (A(E) + Ax0 ) = mN (A(E)) = ν(E).
Thus, ν is translation invariant.
Thus, by the preceding theorem, it follows that there exists a constant cA such that mN (A(E)) = ν(E) = cA mN (E) for every Borel set
E ⊂ RN .
Step 3: It is easy to see that if A and B are two non-singular linear
transformations of RN onto itself, then cAB = cBA = cA cB .
Step 4: Let A be an orthogonal transformation. Then, if E is the
unit ball in RN , we have that A(E) = E. It follows from this that
cA = 1 = |det(A)| whenever A is orthogonal.
Step 5: Let A be represented by a diagonal matrix diag(λ1 , · · · , λN ),
where λi > 0, 1 ≤ i ≤ N . Let E = [0, 1] × · · · × [0, 1] (N times). Then
\[ A(E) = \prod_{i=1}^{N} [0, \lambda_i]. \]
Thus
\[ m_N(A(E)) = \prod_{i=1}^{N} \lambda_i. \]
Once again, in this case, we have cA = det(A) = |det(A)|.
Step 6: Given any non-singular matrix A, we can decompose it as A =
RQ, where R is a positive definite matrix and Q is orthogonal. (It
suffices to find a positive definite matrix R such that \(R^2 = AA^T\) and set \(Q = R^{-1}A\).) The positive definite matrix R can, in turn, be decomposed as \(R = PDP^T\), where P is orthogonal and D is diagonal with positive diagonal entries. Notice then that det(D) = |det(A)|. The result now follows from the observations made in Steps 3 to 5.

2.4 Non-measurable sets
We will now prove the existence of a subset of R which is not Lebesgue
measurable. The construction is essentially a consequence of the translation invariance of the Lebesgue measure and also uses the axiom of
choice. We will follow the treatment given in Royden [6].
Let x, y ∈ [0, 1). Define
\[ x \overset{\circ}{+} y = \begin{cases} x + y, & \text{if } x + y < 1, \\ x + y - 1, & \text{if } x + y \ge 1. \end{cases} \]
If E is a subset of [0, 1) and if y ∈ [0, 1), we set
\[ E \overset{\circ}{+} y = \{ x \overset{\circ}{+} y \mid x \in E \}. \]
Lemma 2.4.1 Let E ⊂ [0, 1) and let y ∈ [0, 1). If E is measurable, then, so is \(E \overset{\circ}{+} y\) and
\[ m_1(E \overset{\circ}{+} y) = m_1(E). \]
Proof: Set E1 = E ∩ [0, 1 − y) and E2 = E ∩ [1 − y, 1). Then E1 and E2 are measurable and are disjoint. Thus, m1(E) = m1(E1) + m1(E2). By definition, we clearly have \(E_1 \overset{\circ}{+} y = E_1 + y\) and \(E_2 \overset{\circ}{+} y = E_2 + (y - 1)\). Thus, \(E_i \overset{\circ}{+} y\), i = 1, 2, are measurable, and by the translation invariance of the Lebesgue measure, we have \(m_1(E_i \overset{\circ}{+} y) = m_1(E_i)\), i = 1, 2. Further, the sets \(E_i \overset{\circ}{+} y\), i = 1, 2, are disjoint. (If not, we will have a, b ∈ [0, 1) such that a + y = b + y − 1 which implies that b − a = 1, which is impossible.)
Since \(E \overset{\circ}{+} y\) is the disjoint union of \(E_1 \overset{\circ}{+} y\) and \(E_2 \overset{\circ}{+} y\), the result follows immediately.

Let x, y ∈ [0, 1). We say that x ∼ y if x − y is rational. It is easy
to verify that this defines an equivalence relation and hence partitions
[0, 1) into equivalence classes. Let P be a set containing exactly one
element from each equivalence class (axiom of choice!).
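Lemma 2.4.1 can be checked by hand when E is an interval. The following sketch, with hypothetical helper names and sample values, implements addition modulo 1 on [0, 1) and splits the shifted interval into the two translates used in the proof of the lemma. The set P itself cannot be computed this way, since it depends on the axiom of choice.

```python
from fractions import Fraction

def circ_add(x, y):
    """Addition modulo 1 on [0, 1)."""
    s = x + y
    return s if s < 1 else s - 1

def shift_interval(a, b, y):
    """Image of E = [a, b), a subset of [0, 1), under x -> circ_add(x, y),
    returned as a list of disjoint half-open intervals."""
    pieces = []
    cut = 1 - y                                   # E1 = E ∩ [0, 1-y), E2 = E ∩ [1-y, 1)
    lo, hi = a, min(b, cut)
    if lo < hi:
        pieces.append((lo + y, hi + y))           # E1 + y
    lo, hi = max(a, cut), b
    if lo < hi:
        pieces.append((lo + y - 1, hi + y - 1))   # E2 + (y - 1)
    return pieces

a, b, y = Fraction(1, 2), Fraction(9, 10), Fraction(1, 4)
pieces = shift_interval(a, b, y)
print(pieces, sum(r - l for l, r in pieces), b - a)
```

The two pieces are disjoint and their lengths add up to b − a, as the lemma asserts.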
Proposition 2.4.1 The set P ⊂ [0, 1) defined above is not Lebesgue
measurable.
Proof: Let \(\{r_i\}_{i=0}^{\infty}\) be an enumeration of the rationals in [0, 1) with r0 = 0. Set \(P_i = P \overset{\circ}{+} r_i\), 0 ≤ i < ∞. Thus, P0 = P. If x ∈ Pi ∩ Pj, where i ≠ j, then \(x = p_1 \overset{\circ}{+} r_i = p_2 \overset{\circ}{+} r_j\), where p1, p2 ∈ P. If p1 = p2, then, since ri ≠ rj, we must have |ri − rj| = 1, which is not possible. Thus, p1 ≠ p2. But this means that p1 − p2 is rational, i.e. p1 ∼ p2, which is not possible. Thus, Pi ∩ Pj = ∅ whenever i ≠ j. Further, since P contains a representative of each equivalence class, it follows that \([0, 1) = \cup_{i=0}^{\infty} P_i\).
Now, if P were Lebesgue measurable, then so is each Pi and m1 (Pi ) =
m1 (P ) for each i. In that case
\[ 1 = m_1([0, 1)) = \sum_{i=0}^{\infty} m_1(P_i) \]
and the last sum is either zero or infinity depending on whether m1 (P )
is zero or non-zero, which gives us a contradiction. Thus P cannot be
measurable.

Let E ⊂ P be a measurable set. Then \(E_i = E \overset{\circ}{+} r_i\) is measurable
for each 0 ≤ i < ∞ and m1 (Ei ) = m1 (E) for each such i. Once again,
since E ⊂ P , we see that the Ei are all mutually disjoint. Since their
union is contained in [0, 1), we have that
\[ \sum_{i=0}^{\infty} m_1(E_i) \le 1. \]
Thus it follows that we must have that m1 (E) = 0. Thus the only measurable subsets of P are those of measure zero. The same is true for any
Pi , 1 ≤ i < ∞.
Now let A ⊂ [0, 1) be a measurable set such that m1 (A) > 0. Set
Ei = A ∩ Pi . If Ei is measurable, then m1 (Ei ) = 0. Thus if all the Ei
are measurable, we have, since \(A = \cup_{i=0}^{\infty} E_i\),
\[ 0 < m_1(A) \le \sum_{i=0}^{\infty} m_1(E_i) = 0, \]
a contradiction. Thus, there exists at least one i such that Ei is not
measurable. Thus every subset of strictly positive measure in [0, 1) contains a non-measurable subset.
We can draw the same conclusion for any interval of the form [n, n +
1). Thus, if A ⊂ R is a measurable set with strictly positive measure,
then there exists an integer n such that A ∩ [n, n + 1) has strictly
positive measure and hence will contain a non-measurable subset. Thus,
we conclude that every measurable set in R with strictly positive measure
will contain a non-measurable subset.
2.5 Exercises
2.1 Let g : R → R be a continuous and increasing function. Define, for
[a, b) ∈ P,
µg ([a, b)) = g(b) − g(a).
Show that there exists a unique complete measure µg , on a σ-ring containing all the Borel sets, which extends µg . (This measure is called the
Lebesgue-Stieltjes measure induced by g.)
2.2 Let S 1 denote the unit circle in the plane. The Borel sets in S 1 are
the members of the σ-algebra generated by all open arcs. Show that
there exists a Borel measure µ on S 1 such that µ(S 1 ) = 1 and such that
µ is invariant under all rotations of S 1 .
2.3 Show that every subset of the plane {(x, y, z) | 2x + 3y + 4z + 1 = 0}
in R3 is Lebesgue measurable.
2.4 Show that the plane R2 cannot be expressed as the countable union
of straight lines.
2.5 Let ωN = mN (BN ), where BN denotes the unit ball in RN . Show
that the Lebesgue measure of any ball of radius r > 0 in RN is \(\omega_N r^N\).
2.6 Compute m2 (S 1 ), where S 1 is the unit circle in the plane R2 .
2.7 (a) Let T be a triangle in the plane R2 with vertices at the points
(0, 0), (1, 0) and (0, 1). What is the value of m2 (T )?
(b) Let T be a triangle in the plane R2 with vertices at the points
(xi, yi), i = 1, 2, 3. Show that m2(T) = |A|, where
\[ A = \frac{1}{2} \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}. \]
2.8 Let A : RN → RN be a non-singular linear transformation. Let
E ⊂ RN . With the notations of Section 2.2, show that
\[ \mu^*(A(E)) = |\det(A)|\, \mu^*(E). \]
Deduce that E is Lebesgue measurable if, and only if A(E) is Lebesgue
measurable.
2.9 Let T : R → R be a bijection such that both T and T −1 map
Lebesgue measurable sets onto Lebesgue measurable sets. Define
µ(E) = m1 (T (E)),
for each Lebesgue measurable set E. Show that µ is a complete measure
on L1 .
2.10 Let µ1 and µ2 be two measures defined on a σ-algebra S. Then
µ1 is said to be absolutely continuous with respect to µ2 if µ1 (E) = 0
whenever µ2 (E) = 0. Show that the measure µ defined in Exercise 2.9
is absolutely continuous with respect to the Lebesgue measure.
2.11 Does there exist a non-measurable subset of [0, 1] consisting only
of irrational numbers?
Chapter 3
Measurable functions
3.1 Basic properties
Let X be a non-empty set and let S be a σ-algebra of subsets of X. We
then say that (X, S) is a measurable space. The members of S are
called measurable sets.
An extended real-valued function on X is a function defined on X
which takes values in the set R ∪ {±∞}.
Definition 3.1.1 Let (X, S) be a measurable space and let f be an extended real-valued function defined on X. We say that f is a measurable function if \(f^{-1}((\alpha, +\infty]) \in \mathcal{S}\) for every α ∈ R. When X = RN, if the above condition is satisfied by f with S = BN, we say that f is a Borel measurable function and if it is satisfied with S = LN, we say that f is a Lebesgue measurable function.

Remark 3.1.1 Evidently, if a function, f, defined on RN, is Borel measurable, it is also Lebesgue measurable.

Proposition 3.1.1 Let (X, S) be a measurable space. Let f be an extended real-valued function defined on X. The following statements are
equivalent:
(i) for every α ∈ R, f −1 ((α, +∞]) ∈ S, i.e. f is measurable;
(ii) For every α ∈ R, f −1 ([α, +∞]) ∈ S;
(iii) For every α ∈ R, f −1 ([−∞, α)) ∈ S;
(iv) For every α ∈ R, f −1 ([−∞, α]) ∈ S.
Proof: (i) ⇒ (ii):
\[ f^{-1}([\alpha, +\infty]) = \cap_{n=1}^{\infty} f^{-1}\left( \left( \alpha - \frac{1}{n}, +\infty \right] \right) \in \mathcal{S}. \]
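(ii) ⇒ (iii):
\[ f^{-1}([-\infty, \alpha)) = (f^{-1}([\alpha, +\infty]))^c \in \mathcal{S}. \]
(iii) ⇒ (iv):
\[ f^{-1}([-\infty, \alpha]) = \cap_{n=1}^{\infty} f^{-1}\left( \left[ -\infty, \alpha + \frac{1}{n} \right) \right) \in \mathcal{S}. \]
(iv) ⇒ (i):
\[ f^{-1}((\alpha, +\infty]) = (f^{-1}([-\infty, \alpha]))^c \in \mathcal{S}. \]

Corollary 3.1.1 (i) Let (X, S) be a measurable space. Let f be a measurable function on X. Then, for every α ∈ R ∪ {±∞}, we have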
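\[ f^{-1}(\{\alpha\}) \in \mathcal{S}. \]
(ii) If U ⊂ R is an open set, then \(f^{-1}(U) \in \mathcal{S}\).
Proof: (i) If α ∈ R, then
\[ f^{-1}(\{\alpha\}) = \cap_{n=1}^{\infty} f^{-1}\left( \left( \alpha - \frac{1}{n}, +\infty \right] \cap \left[ -\infty, \alpha + \frac{1}{n} \right) \right) \in \mathcal{S}. \]
Next,
\[ f^{-1}(\{+\infty\}) = \cap_{n=1}^{\infty} f^{-1}((n, +\infty]) \in \mathcal{S} \]
and
\[ f^{-1}(\{-\infty\}) = \cap_{n=1}^{\infty} f^{-1}([-\infty, -n)) \in \mathcal{S}. \]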
(ii) Given (a, b) ⊂ R, we have
f −1 ((a, b)) = f −1 ([−∞, b)) ∩ f −1 ((a, +∞]) ∈ S.
The result now follows since every open set can be written as the countable union of open intervals.

Example 3.1.1 Let X = RN. Then, every continuous real-valued function, f, defined on RN will be both Borel and Lebesgue measurable since \(f^{-1}((\alpha, +\infty]) = f^{-1}((\alpha, \infty))\) which is open and hence belongs to both the Borel and Lebesgue σ-algebras.

Example 3.1.2 Let (X, S) be a measurable space and let f be a real-valued function defined on X. It is clear that if \(f^{-1}(U)\) is measurable
for every open set U ⊂ R, then f is measurable. However, the converse
of part (i) of the preceding corollary is not true. Let E ⊂ [0, 1) be a
non-measurable subset (cf. Section 2.4), i.e. $E \notin \mathcal{L}_1$. Define
$$f(x) = \begin{cases} x, & \text{if } x \in E,\\ -x, & \text{if } x \in [0, 1) \setminus E,\\ -2, & \text{if } x \notin [0, 1). \end{cases}$$
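Then
$$f^{-1}(\{\alpha\}) = \begin{cases} \mathbb{R} \setminus [0, 1), & \text{if } \alpha = -2,\\ \{-\alpha\}, & \text{if } -\alpha \in [0, 1) \setminus E,\\ \{\alpha\}, & \text{if } \alpha \in E,\\ \emptyset, & \text{otherwise.} \end{cases}$$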
Thus, it follows that f −1 ({α}) is measurable for every α ∈ R. However
f −1 ((0, ∞)) = E, which is not Lebesgue measurable and so f is not
Lebesgue measurable.

Example 3.1.3 Let (X, S) be a measurable space and let A ⊂ X. Let $f = \chi_A$, the characteristic function of the set A. Then
$$f^{-1}((\alpha, +\infty]) = \begin{cases} X, & \text{if } \alpha < 0,\\ A, & \text{if } 0 \le \alpha < 1,\\ \emptyset, & \text{if } \alpha \ge 1. \end{cases}$$
Thus, $\chi_A$ is measurable if, and only if, A ∈ S.

Example 3.1.4 Let (X, S) be a measurable space. Any constant function is measurable. Let f(x) = c for all x ∈ X. If α ∈ R, then $f^{-1}((\alpha, +\infty]) = X$ if α < c and is equal to the empty set if α ≥ c.

Proposition 3.1.2 Let (X, S) be a measurable space and let f and g be measurable real-valued functions defined on X. Let c ∈ R. Then f + c, cf, f ± g and fg are all measurable functions.
Proof: (i) Let α ∈ R. Let c > 0. Then
$$\{x \in X \mid cf(x) < \alpha\} = \left\{x \in X \mid f(x) < \tfrac{\alpha}{c}\right\} \in S.$$
If c < 0, then
$$\{x \in X \mid cf(x) < \alpha\} = \left\{x \in X \mid f(x) > \tfrac{\alpha}{c}\right\} \in S.$$
If c = 0, then cf is the constant function taking the value zero. Thus, it
follows that cf is measurable for all c ∈ R.
(ii) Let α ∈ R. Then
{x ∈ X | f (x) + g(x) < α} = {x ∈ X | f (x) < α − g(x)}
= ∪r∈Q ({x ∈ X | f (x) < r} ∩ {x ∈ X | g(x) < α − r}).
Since the rationals are countable, it follows that (f +g)−1 ([−∞, α)) ∈ S.
Thus f + g is measurable. Since f − g = f + (−1)g, it follows from (i)
above that f − g is also measurable.
(iii) Since constant functions are measurable, it follows from (ii) above
that f + c is measurable.
(iv) Let α ∈ R. If α ≥ 0, then
$$\{x \in X \mid (f(x))^2 > \alpha\} = \{x \in X \mid f(x) > \sqrt{\alpha}\} \cup \{x \in X \mid f(x) < -\sqrt{\alpha}\}.$$
If α < 0, then $\{x \in X \mid (f(x))^2 > \alpha\} = X$. Thus, it follows that the function $f^2$ is measurable.
Now,
$$fg = \tfrac{1}{4}\left((f + g)^2 - (f - g)^2\right).$$
Thus, by the preceding assertions, it follows that fg is also measurable.
Remark 3.1.2 Whenever the concerned functions are well-defined, the
preceding proposition holds for extended real-valued functions as well.
For instance, f + g is not defined at points x ∈ X where f (x) = +∞
and g(x) = −∞.

Proposition 3.1.3 Let (X, S) be a measurable space and let f be a real-valued measurable function defined on X. Then |f| is measurable.
Proof: Let α ∈ R. Then,
$$\{x \in X \mid |f(x)| < \alpha\} = \{x \in X \mid f(x) > -\alpha\} \cap \{x \in X \mid f(x) < \alpha\}$$
if α > 0 and is the empty set if α ≤ 0. Thus, |f| is measurable.
Corollary 3.1.2 Let (X, S) be a measurable space and let f and g be
measurable real-valued functions defined on X. Then, max{f, g} and
min{f, g} are measurable. In particular, if f is a measurable real-valued
function defined on X, then
f + = max{f, 0} and f − = − min{f, 0}
are measurable.
Proof: The result follows from the previous propositions and the following relations:
$$\max\{f, g\} = \tfrac{1}{2}(f + g + |f - g|), \qquad \min\{f, g\} = \tfrac{1}{2}(f + g - |f - g|).$$

Remark 3.1.3 The functions $f^+$ and $f^-$ are called the positive and negative parts of the function f. We have $f = f^+ - f^-$ and $|f| = f^+ + f^-$. Notice that both $f^+$ and $f^-$ are non-negative functions.

Lemma 3.1.1 Let (X, S) be a measurable space and let f be a real-valued measurable function defined on X. Then $f^{-1}(E) \in S$ whenever E is a Borel set.
Proof: Consider
$$\widetilde{S} = \{E \subset \mathbb{R} \mid f^{-1}(E) \in S\}.$$
Clearly, $\mathbb{R} \in \widetilde{S}$. Now $f^{-1}(E^c) = (f^{-1}(E))^c$ and so if $E \in \widetilde{S}$, then $E^c \in \widetilde{S}$ as well. Similarly, if $\{E_i\}_{i=1}^{\infty}$ is a sequence in $\widetilde{S}$, we have $f^{-1}(\bigcup_{i=1}^{\infty} E_i) = \bigcup_{i=1}^{\infty} f^{-1}(E_i)$ and so $\bigcup_{i=1}^{\infty} E_i \in \widetilde{S}$ as well. Thus $\widetilde{S}$ is a σ-algebra and, by the definition of measurability, it contains all the open sets of R (cf. Corollary 3.1.1). Thus, $\widetilde{S}$ contains all the Borel sets, and this completes the proof.

Corollary 3.1.3 Let (X, S) be a measurable space and let f be a real-valued function defined on X. Then f is measurable if, and only if, $f^{-1}(U) \in S$ whenever U is a Borel set.
Proof: If f is measurable, the above lemma shows that the inverse image of every Borel set is measurable. Conversely, if the inverse image of
every Borel set is measurable, then f −1 ((α, +∞)) ∈ S for every α ∈ R,
and since the function is real-valued, it follows that f is measurable, by
definition.
Proposition 3.1.4 Let (X, S) be a measurable space and let f be a
measurable real-valued function defined on X. Let ϕ : R → R be a Borel
measurable function. Then ϕ ◦ f is a measurable function on X.
Proof: Let α ∈ R. We have
{x ∈ X | (ϕ ◦ f )(x) > α} = f −1 (ϕ−1 ((α, +∞))).
Now ϕ−1 ((α, +∞)) is a Borel set and the proof now follows from the
preceding lemma.

Remark 3.1.4 In general, the composition of two measurable functions can fail to be measurable. We will see an example of this in the next section.

Proposition 3.1.5 Let (X, S) be a measurable space and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on X. Define, for x ∈ X,
$$h(x) = \sup_n f_n(x) \quad \text{and} \quad g(x) = \inf_n f_n(x).$$
Then h and g are measurable.
Proof: Let α ∈ R. Then
$$\{x \in X \mid h(x) > \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) > \alpha\},$$
$$\{x \in X \mid g(x) < \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) < \alpha\},$$
and the result follows immediately.

Corollary 3.1.4 Let (X, S) be a measurable space. Let $\{f_n\}$ be a sequence of real-valued measurable functions defined on X. We have that $\limsup_{n\to\infty} f_n$ and $\liminf_{n\to\infty} f_n$ are measurable. Hence, if $f_n(x) \to f(x)$ for all x ∈ X, then f is a measurable function.
Proof: Notice that $g_n = \sup_{m \ge n} f_m$ is measurable. Then
$$\limsup_{n\to\infty} f_n = \inf_n g_n$$
is measurable. Similarly, $h_n = \inf_{m \ge n} f_m$ is measurable and so
$$\liminf_{n\to\infty} f_n = \sup_n h_n$$
is measurable. If $f_n(x) \to f(x)$ for all x ∈ X, then
$$f = \liminf_{n\to\infty} f_n = \limsup_{n\to\infty} f_n,$$
and the result follows.
Definition 3.1.2 Let (X, S) be a measurable space. A simple function defined on X is a function of the form
$$f = \sum_{i=1}^{k} \alpha_i \chi_{A_i},$$
where the $\alpha_i$, 1 ≤ i ≤ k, are real constants and the $A_i$, 1 ≤ i ≤ k, are measurable sets.

Remark 3.1.5 By definition, a simple function is measurable.

Simple functions are the building blocks with which we develop Lebesgue’s theory of integration, just as Riemann’s theory of integration was based on step functions. As a first step towards this, we have the following result.
Theorem 3.1.1 Let (X, S) be a measurable space and let f be a non-negative measurable extended real-valued function defined on X. Then f is the increasing limit of a sequence of non-negative simple functions defined on X.
Proof: Let n be a fixed positive integer. For $1 \le i \le n2^n$, define
$$E_{n,i} = f^{-1}\left(\left[\frac{i-1}{2^n}, \frac{i}{2^n}\right)\right) \quad \text{and} \quad F_n = f^{-1}([n, +\infty]).$$
In other words, we divide the interval [0, n) into subintervals of length $\frac{1}{2^n}$ and consider the inverse images under f of each of these to define the sets $E_{n,i}$. The sets $E_{n,i}$ and $F_n$ are all clearly measurable. Now define
$$f_n = n\chi_{F_n} + \sum_{i=1}^{n2^n} \frac{i-1}{2^n} \chi_{E_{n,i}}.$$
Thus, $f_n$ is a non-negative simple function. If f(x) ≥ n, then $f_n(x) = n$. If f(x) < n and if
$$\frac{i-1}{2^n} \le f(x) < \frac{i}{2^n},$$
then $f_n(x) = \frac{i-1}{2^n}$.
Thus, fn (x) ≤ f (x) for all x ∈ X.
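The construction of $f_n$ can be tried out numerically. The following is a minimal Python sketch, not part of the text: the function name `dyadic_approx` and the sample values are assumptions chosen for illustration, and the function takes the value f(x) as input rather than the point x, since $f_n(x)$ depends on x only through f(x).

```python
import math

def dyadic_approx(value, n):
    """Return f_n(x), given value = f(x) in [0, +inf] (hypothetical helper)."""
    if value >= n:
        # x lies in F_n = f^{-1}([n, +inf]), where f_n takes the value n
        return float(n)
    # x lies in E_{n,i} with (i-1)/2^n <= f(x) < i/2^n, so f_n(x) = (i-1)/2^n
    return math.floor(value * 2**n) / 2**n

# Illustrative sample values of f(x), including +infinity
samples = [0.0, 0.3, 1.0, math.pi, 7.25, math.inf]
for v in samples:
    approximations = [dyadic_approx(v, n) for n in range(1, 10)]
    print(v, approximations)
```

For each finite sample value the printed list is non-decreasing and stays below the value itself, while for +∞ it is simply 1, 2, 3, and so on, which matches the behaviour established in the remainder of the proof.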
We claim that $f_n(x) \le f_{n+1}(x)$ for each positive integer n and for each x ∈ X. Indeed, if f(x) ≥ n + 1, then $f_{n+1}(x) = n + 1$ and $f_n(x) = n$. If n ≤ f(x) < n + 1, then $f_{n+1}(x) = \frac{i-1}{2^{n+1}}$ for some i such that
$$f(x) \in \left[\frac{i-1}{2^{n+1}}, \frac{i}{2^{n+1}}\right) \subset [n, n+1).$$
In this case, $f_n(x) = n$ and so we still have $f_n(x) \le f_{n+1}(x)$. Finally, if f(x) < n, then for some $1 \le i \le n2^n$, we have
$$f(x) \in \left[\frac{i-1}{2^n}, \frac{i}{2^n}\right) = \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i}{2^{n+1}}\right).$$
Consequently, $f_n(x) = \frac{i-1}{2^n}$ while
$$f_{n+1}(x) = \frac{2(i-1)}{2^{n+1}} = \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i-1}{2^{n+1}}\right),$$
or
$$f_{n+1}(x) = \frac{2i-1}{2^{n+1}} > \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2i-1}{2^{n+1}}, \frac{2i}{2^{n+1}}\right).$$
This establishes the claim.
Thus, $\{f_n\}$ is an increasing sequence of simple functions bounded above by f. If f(x) = +∞, then $f_n(x) = n$ for all n, and so $f_n(x) \to +\infty = f(x)$. If f(x) < +∞, then there exists a positive integer N such that f(x) < N. Then, for all n ≥ N, we see, from the construction above, that $|f(x) - f_n(x)| = f(x) - f_n(x) \le \frac{1}{2^n}$. Thus we have established that $f_n \uparrow f$.

Corollary 3.1.5 Let (X, S) be a measurable space and let f be a real-valued measurable function defined on X. Then f is the limit of a sequence of simple functions.
Proof: We can split f into its positive and negative parts. Thus f =
f + − f − , where f ± are non-negative measurable functions. We can find
sequences of non-negative simple functions {ϕn } and {ψn } such that
ϕn ↑ f + and ψn ↑ f − . Thus, fn = ϕn − ψn gives a sequence of simple
functions converging pointwise to f.

3.2 The Cantor function

The Cantor function, like the Cantor set, provides a lot of interesting examples, or counter-examples, to illustrate fine points in the theory
of measure and integration. Several constructions are possible but the
essential properties are the same for all these functions and they serve
the same purpose. In this section, we will present one such construction.
Before we construct the function, which will be the uniform limit
of a sequence of piecewise linear functions, we will present a basic construction which will be used iteratively, at different scales, to produce
the next member of the desired sequence from the current one.
Consider an interval [a, b] in the real line and let f : [a, b] → R be a linear function, i.e.
$$f(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a), \quad x \in [a, b].$$
We then divide the interval [a, b] into three equal parts. Let us denote the two interior points of this partition by $c_j$, j = 1, 2. Thus,
$$c_j = a + j\,\frac{b - a}{3}, \quad j = 1, 2.$$
The value of f at the point $c_1$ will, therefore, be given by the relation
$$f(c_1) = \frac{2f(a) + f(b)}{3}.$$
We then define the ‘next iterate’, g, of f as follows:
$$g(x) = \begin{cases} f(x), & \text{if } x \in [a, c_1],\\ f(c_1), & \text{if } x \in [c_1, c_2],\\ f(c_1) + \dfrac{f(b) - f(c_1)}{b - c_2}(x - c_2), & \text{if } x \in [c_2, b]. \end{cases}$$
In other words, we move along f in the first third of the interval, then
move horizontally along the second third, and finally climb up to f (b)
in a straight line on the third interval (cf. Figure 3.2.1 below).
[Figure 3.2.1; labels in the figure: a, b, f(a), f(b)]
A simple computation shows that the slope of g is the same as that of
f in the first third of the interval, equal to zero in the middle third and
twice the slope of f in the final third of the interval.
Let us consider the interval [0, 1] and the function f0 (x) = x defined
on it. If we apply the procedure described above to this function, we
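will get the function $f_1$ given by
$$f_1(x) = \begin{cases} x, & \text{if } x \in [0, \tfrac{1}{3}],\\ \tfrac{1}{3}, & \text{if } x \in [\tfrac{1}{3}, \tfrac{2}{3}],\\ 2x - 1, & \text{if } x \in [\tfrac{2}{3}, 1]. \end{cases}$$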
We now apply the iteration procedure described above in each of the intervals $[0, \tfrac{1}{3}]$ and $[\tfrac{2}{3}, 1]$ to get the next function $f_2$ and so on (cf. Figure 3.2.2 below).
[Figure 3.2.2]
For each positive integer n, we apply the iteration procedure described earlier only to those sub-intervals where fn is not constant. Thus,
if fn is constant on any sub-interval, we will have fm = fn = the same
constant on that sub-interval for all m ≥ n. Notice that the union of the
sub-intervals where fn is a constant is precisely the set Xn described in
the construction of the Cantor set (cf. Example 2.1.3).
In this manner, we can construct a sequence of continuous piecewise
linear functions {fn }. By construction, we have a decreasing sequence
of functions, each of which is monotonically non-decreasing.
The maximum slope occurs in the last sub-interval and, as seen earlier, each time we apply the iteration procedure, the slope doubles. Thus,
in the last sub-interval of length $\frac{1}{3^n}$, the slope of $f_n$ will be $2^n$. By the mean value theorem, we thus see that for any x ∈ [0, 1],
$$|f_n(x) - f_{n+1}(x)| \le |f_{n+1}(1) - f_{n+1}(1 - 3^{-(n+1)})| \le \left(\frac{2}{3}\right)^{n+1}.$$
Since the series $\sum_{n=1}^{\infty} \left(\frac{2}{3}\right)^n$ is a convergent geometric series, it follows that $\{f_n\}$ is uniformly Cauchy and so it converges uniformly to a continuous function f. This function is called the Cantor function.
Since each fn is non-decreasing, so is f . We also have that f (0) = 0
and f (1) = 1. If C is the Cantor set (cf. Example 2.1.3), then, by
construction, f is constant on each sub-interval of C c , since in the construction of fn+1 from fn , we set the value in each middle third interval
as a constant and once fixed thus, it remains unaltered in the construction of fm , m ≥ n + 1.
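The iteration described in this section can be carried out exactly with rational arithmetic. The sketch below is an illustration under assumptions of its own: a piecewise linear function is stored as a list of (x, y) breakpoints, and the names `next_iterate`, `evaluate`, `cantor_iterates`, `max_slope` and `sup_distance` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def next_iterate(nodes):
    """Apply the 'next iterate' step to every non-constant linear piece.

    nodes: list of (x, y) breakpoints of a continuous piecewise linear
    function, in increasing order of x.
    """
    new_nodes = [nodes[0]]
    for (a, fa), (b, fb) in zip(nodes, nodes[1:]):
        if fa != fb:  # constant pieces are left untouched
            c1 = a + (b - a) / 3
            c2 = a + 2 * (b - a) / 3
            fc1 = (2 * fa + fb) / 3  # value of the linear piece at c1
            new_nodes += [(c1, fc1), (c2, fc1)]
        new_nodes.append((b, fb))
    return new_nodes

def evaluate(nodes, x):
    """Evaluate the piecewise linear function with the given breakpoints at x."""
    for (a, fa), (b, fb) in zip(nodes, nodes[1:]):
        if a <= x <= b:
            return fa + (fb - fa) * (x - a) / (b - a)
    raise ValueError("x lies outside the domain")

def cantor_iterates(n):
    """Return [f_0, ..., f_n] as breakpoint lists, starting from f_0(x) = x."""
    iterates = [[(Fraction(0), Fraction(0)), (Fraction(1), Fraction(1))]]
    for _ in range(n):
        iterates.append(next_iterate(iterates[-1]))
    return iterates

def max_slope(nodes):
    """Largest slope among the linear pieces."""
    return max((fb - fa) / (b - a) for (a, fa), (b, fb) in zip(nodes, nodes[1:]))

def sup_distance(coarse, fine):
    """sup |coarse - fine| on [0, 1], where fine refines coarse.

    The difference is piecewise linear with breakpoints among those of
    fine, so the supremum is attained at one of them.
    """
    return max(abs(evaluate(coarse, x) - y) for x, y in fine)

fs = cantor_iterates(5)
for n in range(5):
    print(n, max_slope(fs[n]), sup_distance(fs[n], fs[n + 1]))
```

In this sketch the maximum slope of $f_n$ and the distance between consecutive iterates can be compared directly with the bound $(2/3)^{n+1}$ used in the uniform Cauchy estimate.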
Let us now define ψ(y) = y + f (y) for y ∈ [0, 1]. Then ψ is strictly
monotonic increasing and continuous. We have ψ(0) = 0 and ψ(1) = 2.
Thus ψ is a continuous bijection of [0, 1] onto [0, 2].
Let ϕ denote the inverse of ψ. Then ϕ is also monotonic increasing.
We have that x = ϕ(x) + f (ϕ(x)), for every x ∈ [0, 2]. If x ≥ y, then
ϕ(x) ≥ ϕ(y) and
x − y = ϕ(x) − ϕ(y) + f (ϕ(x)) − f (ϕ(y)).
Since f is non-decreasing, we deduce, from the above relation, that
ϕ(x) − ϕ(y) ≤ x − y
whenever x ≥ y from which we have
|ϕ(x) − ϕ(y)| ≤ |x − y|.
Thus, ϕ is continuous as well.
Now, ψ is a bijection and so it maps disjoint sets into disjoint sets.
If I is an interval contained in C c , where C is the Cantor set, then f is a
constant, cI , on I and so ψ(x) = x + cI on I. Thus, ψ(I) just translates
I and so m1 (ψ(I)) = m1 (I). Since C c is made up of disjoint intervals,
it follows that m1 (ψ(C c )) = m1 (C c ) = 1. Since the range of ψ is [0, 2],
we thus conclude that m1 (ψ(C)) = 1 as well. Thus, ψ maps C, a set of
measure zero, onto a set of measure one.
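The measure count behind this conclusion can be checked with exact arithmetic. The sketch below assumes the standard description of the Cantor set, in which stage k removes $2^{k-1}$ open middle thirds, each of length $3^{-k}$; the helper name `removed_length` is hypothetical.

```python
from fractions import Fraction

def removed_length(n):
    """Total length of the middle thirds removed in the first n stages."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

for n in (1, 2, 5, 10, 20):
    removed = removed_length(n)
    # psi translates each removed interval, so the images have the same
    # total length; the rest of the range [0, 2] has length 2 - removed.
    print(n, float(removed), float(2 - removed))
```

The removed length equals $1 - (2/3)^n$ after n stages, so it tends to 1, and the length left over in [0, 2] for the image of C also tends to 1.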
Since ψ(C) has positive measure, it contains a non-measurable set,
say, S. Let M = ψ −1 (S) = ϕ(S). Then M ⊂ C and by the completeness
of the Lebesgue measure, it follows that M is Lebesgue measurable. If
M were Borel measurable, then it would follow that S = ϕ−1 (M ) is also
Borel measurable, since ϕ is continuous and hence a Borel measurable
function. But that would imply that S is also Lebesgue measurable,
which contradicts our assumption on S.
Thus, the set M described above is an example of a Lebesgue measurable set which is not Borel measurable.
Finally, let Φ = χM , which is a Lebesgue measurable function. Set
ζ = Φ ◦ ϕ. Thus ζ is the composition of a Lebesgue measurable function
and a continuous (and hence, Lebesgue measurable) function. Now
ζ −1 ({1}) = {x ∈ [0, 2] | ζ(x) = 1} = ϕ−1 (M ) = S,
which, by choice, is not Lebesgue measurable. Thus ζ is not a Lebesgue
measurable function (cf. Corollary 3.1.1(i)). Thus, we have constructed
an example to show that the composition of two measurable functions
need not be measurable (cf. Proposition 3.1.4 and Remark 3.1.4).
Remark 3.2.1 Notice, however, that by Proposition 3.1.4, the mapping
ϕ ◦ Φ will be measurable.

3.3 Almost everywhere
Let (X, S) be a measurable space. Let µ be a measure defined on S. We
say that (X, S, µ) is a measure space.
Given a measure space (X, S, µ), we say that a measurable function,
or a collection of measurable functions, enjoys a certain property almost
everywhere if that property is valid at all points in X except, possibly,
on a set of measure zero. We abbreviate ‘almost everywhere’ by a.e.
More specifically, we will frequently encounter the following situations
in the sequel.
• A measurable extended real-valued function f defined on X is finite
a.e. if there exists a set E ∈ S such that µ(E) = 0 and such that
f (x) ∈ R for all x ∈ E c .
• Given two measurable functions f and g defined on X, we say that
f = g a.e. if there exists a set E ∈ S such that µ(E) = 0 and such
that f (x) = g(x) for all x ∈ E c .
• Given a sequence of measurable functions {fn } and a measurable
function f defined on X, we say that fn converges to f a.e. if there
exists a set E ∈ S such that µ(E) = 0 and such that fn (x) → f (x)
for every x ∈ E c .
Definition 3.3.1 Let (X, S, µ) be a measure space and let f : X → R
be a measurable function. We say that f is essentially bounded if
there exists M > 0 such that the set
{x ∈ X | |f (x)| > M }
has measure zero. The essential supremum of f is the infimum of all such M, and is denoted $\|f\|_\infty$, i.e.
$$\|f\|_\infty = \inf\{M \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}.$$

3.4 Exercises
3.1 Let (X, S) be a measurable space. Let f : X → R be a measurable
function. Define
$$g(x) = \begin{cases} \dfrac{1}{f(x)}, & \text{if } f(x) \ne 0,\\ 0, & \text{if } f(x) = 0. \end{cases}$$
Show that g is measurable.
3.2 Let (X, S) be a measurable space. Let f : X → R be a function
such that |f | is measurable. Is it necessary that f be measurable?
3.3 Let (X, S) be a measurable space. Let f : X → R be a function
such that f −1 ((r, +∞)) ∈ S for every rational number r. Show that f
is measurable.
3.4 Let (X, S) be a measurable space. Let {fn } be a sequence of measurable functions defined on X. Show that the set of all points x ∈ X,
where the sequence {fn (x)} is not Cauchy, is a measurable set.
3.5 Let (X, S) be a measurable space. Let {fn } be a sequence of measurable functions defined on X converging to a function f a.e. Is it
necessary that f is measurable?
3.6 Let (X, S) be a measurable space. Let f : X × [0, 1] → R be a
function such that, for each fixed y ∈ [0, 1], the mapping $x \mapsto f(x, y)$ is measurable, and, for each fixed x ∈ X, the mapping $y \mapsto f(x, y)$ is continuous. Define
$$h(x) = \min_{y \in [0, 1]} f(x, y), \quad \text{for } x \in X.$$
Show that h : X → R is measurable.
3.7 Let f : R → R be a Lebesgue measurable function. Show that there
exists a Borel measurable function g : R → R such that g = f a.e.
Chapter 4
Convergence

4.1 Egorov’s theorem
Theorem 4.1.1 (Egorov) Let (X, S, µ) be a finite measure space, i.e. µ(X) < +∞. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, converging almost everywhere to a real-valued measurable function f. Then, given any ε > 0, there exists a measurable set F ⊂ X such that µ(F) < ε and such that $f_n \to f$ uniformly on $F^c$.
Proof: Let E ∈ S be such that µ(E) = 0 and such that fn → f
pointwise on $E^c$. Set $Y = E^c$. Given positive integers m and n, define
$$E_{n,m} = \bigcap_{i=n}^{\infty} \left\{x \in Y \mid |f_i(x) - f(x)| < \frac{1}{m}\right\}.$$
Then, clearly,
E1,m ⊂ E2,m ⊂ · · · ⊂ En,m ⊂ En+1,m ⊂ · · · .
Further, for every x ∈ Y, we have $f_n(x) \to f(x)$ and so for any m, there exists N such that for all i ≥ N, we have $|f_i(x) - f(x)| < \frac{1}{m}$, i.e. $x \in E_{N,m}$. Thus,
$$Y = \bigcup_{n=1}^{\infty} E_{n,m}.$$
Consequently (cf. Proposition 1.2.4),
$$\mu(Y) = \lim_{n\to\infty} \mu(E_{n,m}).$$
Since µ(Y) = µ(X) < +∞, given ε > 0, there exists $n_0(m) \in \mathbb{N}$ such that
$$\mu(Y \setminus E_{n_0(m),m}) = \mu(Y) - \mu(E_{n_0(m),m}) < \frac{\varepsilon}{2^m}.$$
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_4
Set $G = \bigcup_{m=1}^{\infty} (Y \setminus E_{n_0(m),m})$. Then G is measurable and
$$\mu(G) < \sum_{m=1}^{\infty} \frac{\varepsilon}{2^m} = \varepsilon.$$
Now set F = G ∪ E so that µ(F) = µ(G) < ε. Observe that
$$F^c = \bigcap_{m=1}^{\infty} E_{n_0(m),m}.$$
Given any η > 0, choose m such that $\frac{1}{m} < \eta$. If $x \in F^c$, then $x \in E_{n_0(m),m} \subset E_{n,m}$ for all $n \ge n_0(m)$. Thus, for all $x \in F^c$, and for all $n \ge n_0(m)$, we have
$$|f_n(x) - f(x)| < \frac{1}{m} < \eta.$$
Since the choice of m depended only on η, this shows that we have uniform convergence of the sequence {fn } to f on F c . This completes the
proof.

Example 4.1.1 The result of the theorem does not hold, in general, in infinite measure spaces. For instance, consider the set N of natural numbers with the counting measure defined on the σ-algebra of all subsets of N. If F ⊂ N is such that µ(F) < ε < 1, then, clearly, F = ∅. Thus uniform convergence on $F^c$ means uniform convergence on N. Now consider the sequence $\{f_n\}_{n=1}^{\infty}$ defined by $f_n = \chi_{\{1, 2, \cdots, n\}}$. Then $f_n \to f$ on N, where f(i) = 1 for all i ∈ N, but this convergence is not uniform.

Inspired by the statement of Egorov’s theorem, we can formulate the following definition.
Definition 4.1.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. We say that this sequence converges almost uniformly to a real-valued measurable function f defined on X, if for every ε > 0, there exists a measurable set F such that µ(F) < ε and such that $f_n \to f$ uniformly on $F^c$.

The converse of Egorov’s theorem holds for any measure space.
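Before turning to that converse, Definition 4.1.1 can be illustrated on a concrete sequence. The example below is an assumption made for illustration and is not taken from the text: X = [0, 1) with the Lebesgue measure and $f_n(x) = x^n$, which converges to 0 pointwise but not uniformly, since the supremum of $x^n$ over [0, 1) equals 1 for every n. Removing the set F = [1 − ε/2, 1), of measure ε/2 < ε, restores uniform convergence. The helper names are hypothetical.

```python
def sup_off_exceptional_set(n, eps):
    """sup of |x**n - 0| over F^c = [0, 1 - eps/2), where F = [1 - eps/2, 1)."""
    return (1 - eps / 2) ** n

def first_index_below(eta, eps):
    """Smallest n for which the sup over F^c drops below eta."""
    n = 1
    while sup_off_exceptional_set(n, eps) >= eta:
        n += 1
    return n

eps = 0.1
for eta in (0.5, 0.1, 0.01):
    n0 = first_index_below(eta, eps)
    print(eta, n0, sup_off_exceptional_set(n0, eps))
```

Since the supremum over $F^c$ decreases with n, every index beyond the one returned also satisfies the bound, which is the uniform convergence on $F^c$ required by the definition.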
Proposition 4.1.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging almost uniformly to a real-valued measurable function f defined on X. Then $f_n \to f$ a.e. on X.
Proof: Let m ∈ N. Choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such that $f_n \to f$ uniformly on $F_m^c$. Set $F = \bigcap_{m=1}^{\infty} F_m$. Then µ(F) = 0. Since $F^c = \bigcup_{m=1}^{\infty} F_m^c$, we have that $f_n(x) \to f(x)$ for every $x \in F^c$.

Definition 4.1.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. We say that this sequence is almost uniformly Cauchy if, for every ε > 0, there exists a set F ∈ S such that µ(F) < ε and such that $\{f_n\}$ is a uniformly Cauchy sequence on $F^c$.

Clearly, if a sequence of real-valued measurable functions defined on X, $\{f_n\}_{n=1}^{\infty}$, converges almost uniformly, then it is almost uniformly Cauchy. We now prove the converse.
Proposition 4.1.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be an almost uniformly Cauchy sequence of real-valued measurable functions defined on X. Then there exists a real-valued measurable function f defined on X such that $f_n \to f$ almost uniformly.
Proof: For each m ∈ N, choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such that the sequence $\{f_n\}$ is uniformly Cauchy on $F_m^c$. Set $F = \bigcap_{m=1}^{\infty} F_m$. Then µ(F) = 0. Since $F^c = \bigcup_{m=1}^{\infty} F_m^c$, we have that $\{f_n(x)\}$ is a Cauchy sequence for every $x \in F^c$. Define
$$f(x) = \begin{cases} \lim_{n\to\infty} f_n(x), & \text{if } x \in F^c,\\ 0, & \text{if } x \in F. \end{cases}$$
Set $g_n = \chi_{F^c} f_n$. Then $g_n$ is measurable for each positive integer n. Further if x ∈ F, we have $g_n(x) = f(x) = 0$ for all n and if $x \in F^c$, we have $g_n(x) = f_n(x) \to f(x)$. Thus $g_n \to f$ everywhere and so (cf. Corollary 3.1.4) f is measurable. In particular, $f_n \to f$ on $F^c$ and $f_n \to f$ uniformly on $F_m^c$, where $\mu(F_m) < \frac{1}{m}$. Thus, we see that the sequence $\{f_n\}$ converges to f almost uniformly.

4.2 Convergence in measure
In this section, we will investigate a new notion of convergence of measurable functions defined on a measure space and compare it with the
notions of pointwise convergence a.e. and almost uniform convergence.
Definition 4.2.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. Let f be a real-valued measurable function defined on X. We say that the sequence $\{f_n\}$ converges in measure to the function f if for every ε > 0, we have
$$\lim_{n\to\infty} \mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) = 0.$$
We say that the sequence $\{f_n\}$ is Cauchy in measure if for every ε > 0 and for every δ > 0, there exists N ∈ N such that for all n, m ≥ N, we have
$$\mu(\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}) < \delta.$$

Notation Let (X, S, µ) be a measure space. If a sequence of real-valued measurable functions $\{f_n\}$ defined on X converges in measure to a real-valued measurable function f, we write
$$f_n \xrightarrow{\mu} f.$$
Proposition 4.2.1 Let (X, S, µ) be a finite measure space, i.e. µ(X) < +∞. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, converging a.e. to a real-valued measurable function f. Then $f_n \xrightarrow{\mu} f$.
Proof: Let D denote the set of all points x ∈ X such that the sequence
{fn (x)} fails to converge to f (x). Thus µ(D) = 0. Let ε > 0. If we set
Em (ε) = {x ∈ X | |fm (x) − f (x)| ≥ ε},
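then,
$$D = \bigcup_{\varepsilon > 0} \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} E_m(\varepsilon) = \bigcup_{\varepsilon > 0} \limsup_{n\to\infty} E_n(\varepsilon).$$
Thus, $\mu(\limsup_{n\to\infty} E_n(\varepsilon)) = 0$. Since µ(X) < +∞, we have (cf. Exercise 1.10),
$$0 = \mu\left(\limsup_{n\to\infty} E_n(\varepsilon)\right) \ge \limsup_{n\to\infty} \mu(E_n(\varepsilon)).$$
Thus,
$$0 \le \liminf_{n\to\infty} \mu(E_n(\varepsilon)) \le \limsup_{n\to\infty} \mu(E_n(\varepsilon)) \le 0,$$
from which we deduce that $\lim_{n\to\infty} \mu(E_n(\varepsilon)) = 0$, i.e. $f_n \xrightarrow{\mu} f$.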
Example 4.2.1 This result is not valid, in general, in infinite measure spaces. If we consider the set N with the σ-algebra of all subsets equipped with the counting measure, then it is easy to verify that convergence in measure is just uniform convergence, since µ(E) < 1 implies that E = ∅. Once again, the same sequence as in Example 4.1.1 gives a sequence converging pointwise everywhere, but not in measure.

The converse is not true, even in finite measure spaces, i.e. convergence in measure does not imply pointwise convergence as the following example shows.
Example 4.2.2 Consider X = [0, 1) equipped with the Lebesgue measure. Consider the functions
$$\chi_n^i = \chi_{\left[\frac{i-1}{n}, \frac{i}{n}\right)}, \quad 1 \le i \le n.$$
Consider the sequence
$$\{\chi_1^1, \chi_2^1, \chi_2^2, \chi_3^1, \chi_3^2, \chi_3^3, \cdots\}.$$
Let x ∈ [0, 1). For each n ∈ N, there exists exactly one i such that $\chi_n^i(x) = 1$, 1 ≤ i ≤ n, while $\chi_n^j(x) = 0$ for all 1 ≤ j ≤ n, j ≠ i. Thus, we see that the above sequence fails to converge at every point x ∈ [0, 1).
On the other hand, if 0 < ε < 1, we have
$$m_1(\{x \in [0, 1) \mid |\chi_n^i(x)| \ge \varepsilon\}) = m_1\left(\left[\frac{i-1}{n}, \frac{i}{n}\right)\right) = \frac{1}{n},$$
from which we deduce that this sequence converges to zero in measure.

In the above example, notice that for any x ∈ (0, 1), we have $\chi_n^1(x) = 0$ for every $n \ge x^{-1}$. Thus, the subsequence $\{\chi_n^1\}$ converges to the zero function pointwise at every point other than x = 0, and hence almost everywhere. This behaviour is generic, as the following
proposition shows.
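The sequence of Example 4.2.2 can be enumerated and evaluated exactly. The sketch below is an illustration only; the helper names `typewriter`, `chi` and `support_measure`, and the choice of the sample point x = 1/3, are assumptions made here.

```python
from fractions import Fraction

def typewriter(k):
    """Return (n, i) such that the k-th term (k = 1, 2, ...) is chi_n^i."""
    n = 1
    while k > n:  # block n consists of the n terms chi_n^1, ..., chi_n^n
        k -= n
        n += 1
    return n, k

def chi(n, i, x):
    """Indicator function of [(i-1)/n, i/n) evaluated at x."""
    return 1 if Fraction(i - 1, n) <= x < Fraction(i, n) else 0

def support_measure(k):
    """Lebesgue measure of the set where the k-th term is non-zero."""
    n, _ = typewriter(k)
    return Fraction(1, n)

x = Fraction(1, 3)
values = [chi(*typewriter(k), x) for k in range(1, 56)]  # blocks n = 1, ..., 10
print(values)
print([support_measure(k) for k in (1, 10, 55)])
print([chi(n, 1, x) for n in range(1, 10)])  # the subsequence chi_n^1 at x
```

The first list contains exactly one 1 in each block, so the values at x keep oscillating between 0 and 1, while the measures of the supports shrink and the subsequence $\chi_n^1$ vanishes at x from n = 3 onwards.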
Proposition 4.2.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X which converges in measure to a real-valued measurable function f defined on X. Then, there exists a subsequence of $\{f_n\}$ which converges to f almost everywhere.
Proof: Set
$$E_{n,m} = \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{1}{m}\right\}.$$
Then, for every m ∈ N, we can find a positive integer $n_0(m)$, which we may take to be strictly increasing in m, such that
$$\mu(E_{n_0(m),m}) < \frac{1}{2^m}.$$
Thus,
$$\sum_{m=1}^{\infty} \mu(E_{n_0(m),m}) < +\infty.$$
Hence, by the Borel-Cantelli lemma (cf. Proposition 1.2.6), there exists a measurable set E, with measure zero, such that every point of $E^c$ belongs to at most finitely many of the sets $E_{n_0(m),m}$. In other words, for every $x \in E^c$, there exists a positive integer N such that $x \notin E_{n_0(m),m}$, for all m ≥ N, i.e. for all m ≥ N, we have
$$|f_{n_0(m)}(x) - f(x)| < \frac{1}{m}.$$
This shows that $f_{n_0(m)}(x) \to f(x)$ for all $x \in E^c$. This completes the proof.

The next result shows that the limit function, under convergence in measure, is defined uniquely up to a set of measure zero.
Proposition 4.2.3 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. Let f and g be real-valued measurable functions defined on X such that $f_n \xrightarrow{\mu} f$ and $f_n \xrightarrow{\mu} g$. Then f = g almost everywhere.
Proof: Let ε > 0. Then,
$$\{x \in X \mid |f(x) - g(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_n(x) - g(x)| \ge \frac{\varepsilon}{2}\right\},$$
since
$$|f(x) - g(x)| \le |f_n(x) - f(x)| + |f_n(x) - g(x)|.$$
Since $f_n \xrightarrow{\mu} f$ and $f_n \xrightarrow{\mu} g$, we deduce that
$$\mu(\{x \in X \mid |f(x) - g(x)| \ge \varepsilon\}) = 0.$$
The result now follows from the relation
$$\{x \in X \mid |f(x) - g(x)| > 0\} = \bigcup_{n=1}^{\infty} \left\{x \in X \mid |f(x) - g(x)| \ge \frac{1}{n}\right\}.$$
We now investigate the relationship between a sequence being Cauchy
in measure and its convergence in measure.
Proposition 4.2.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. If the sequence converges in measure, then it is Cauchy in measure.
Proof: Let $f_n \xrightarrow{\mu} f$. Then, by a similar reasoning as in the preceding proof, we have, for ε > 0,
$$\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_m(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}.$$
Given δ > 0, we can then find a positive integer N such that for n and m greater than, or equal to, N, we have that the measure of each of the two sets on the right-hand side of the above relation will be less than $\frac{\delta}{2}$. Thus for m, n ≥ N, we have
$$\mu(\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}) < \delta.$$
Thus the sequence $\{f_n\}$ is Cauchy in measure.

Proposition 4.2.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, which is Cauchy in measure. If there exists a subsequence $\{f_{n_k}\}$ which converges in measure to a real-valued measurable function f defined on X, then $f_n \xrightarrow{\mu} f$.
Proof: Let ε > 0. Then
$$\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}.$$
Let δ > 0 be given. Then, there exists N ∈ N such that for all n ≥ N and for all $n_k \ge N$, we have
$$\mu\left(\left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}, \qquad \mu\left(\left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}.$$
Thus, for all n ≥ N,
$$\mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) < \delta,$$
which shows that $f_n \xrightarrow{\mu} f$.

Proposition 4.2.6 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, which converges almost uniformly to a real-valued measurable function f defined on X. Then $f_n \xrightarrow{\mu} f$.
Proof: Let ε > 0 and δ > 0 be given. There exists F ∈ S such that
µ(F ) < δ and such that fn → f uniformly on F c . Thus, there exists
n0 ∈ N (which depends on ε and also on δ since we have chosen F
based on δ) such that, for all x ∈ F c , and for all n ≥ n0 , we have
$|f_n(x) - f(x)| < \varepsilon$. Hence, for all $n \ge n_0$,
$$\mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) \le \mu(F) < \delta,$$
which proves that $f_n \xrightarrow{\mu} f$.

Proposition 4.2.7 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, which is Cauchy in measure. Then, there exists a subsequence which is almost uniformly Cauchy.
Proof: Since the sequence is Cauchy in measure, given k ∈ N, there exists n(k) ∈ N such that for all n, m ≥ n(k), we have
$$\mu\left(\left\{x \in X \mid |f_n(x) - f_m(x)| \ge \frac{1}{2^k}\right\}\right) < \frac{1}{2^k}.$$
Choose $n_1 = n(1) + 1$, $n_2 = \max\{n(2), n_1 + 2\}$, $n_3 = \max\{n(3), n_2 + 3\}$ and so on. Thus, $n_k = \max\{n(k), n_{k-1} + k\}$. Hence we get a strictly increasing sequence $\{n_k\}$ which also satisfies $n_k > k$ for each k. Thus, we have a subsequence $\{f_{n_k}\}$ of $\{f_n\}$.
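The index selection in this step is a simple recursion. In the sketch below, `n_of` stands for the map k ↦ n(k) supplied by the Cauchy condition; the concrete choice `lambda k: 2 ** k` is a placeholder assumed purely for illustration, as is the helper name `choose_indices`.

```python
def choose_indices(n_of, count):
    """Build n_1 < n_2 < ... via n_1 = n(1) + 1, n_k = max(n(k), n_{k-1} + k)."""
    indices = [n_of(1) + 1]
    for k in range(2, count + 1):
        indices.append(max(n_of(k), indices[-1] + k))
    return indices

# Placeholder threshold function; any map from positive integers to
# positive integers could be used here.
print(choose_indices(lambda k: 2 ** k, 8))
```

Each index is at least n(k), so the Cauchy estimate with $2^{-k}$ applies to the pair $n_k$, $n_{k+1}$, and each index exceeds its predecessor by at least k, which gives the strict monotonicity.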
Set
$$E_k = \left\{x \in X \mid |f_{n_k}(x) - f_{n_{k+1}}(x)| \ge \frac{1}{2^k}\right\}.$$
Then $\mu(E_k) < 2^{-k}$. Given δ > 0, choose k such that $2^{-(k-1)} < \delta$. Set $F = \bigcup_{i=k}^{\infty} E_i$. Then
$$\mu(F) \le \sum_{i=k}^{\infty} \mu(E_i) < \frac{1}{2^{k-1}} < \delta.$$
Given ε > 0, choose N ≥ k such that $2^{-(N-1)} < \varepsilon$. Now $F^c = \bigcap_{i=k}^{\infty} E_i^c$. Thus, for all $x \in F^c$ and for all m ≥ ℓ ≥ N, we have
$$|f_{n_\ell}(x) - f_{n_m}(x)| \le \sum_{j=\ell}^{m} |f_{n_j}(x) - f_{n_{j+1}}(x)| < \sum_{j=\ell}^{\infty} \frac{1}{2^j} = \frac{1}{2^{\ell-1}} \le \frac{1}{2^{N-1}} < \varepsilon.$$
Thus $\{f_{n_k}\}$ is a uniformly Cauchy sequence on $F^c$ and µ(F) < δ, i.e. $\{f_{n_k}\}$ is almost uniformly Cauchy.

Proposition 4.2.8 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on X, which is Cauchy in measure. Then, there exists a real-valued measurable function f defined on X such that $f_n \xrightarrow{\mu} f$.
Proof: Let $\{f_{n_k}\}$ be a subsequence which is almost uniformly Cauchy. Then (cf. Proposition 4.1.2), there exists a real-valued measurable function f, defined on X, such that $\{f_{n_k}\}$ converges almost uniformly to f. By Proposition 4.2.6, we deduce that $f_{n_k} \xrightarrow{\mu} f$ which implies that $f_n \xrightarrow{\mu} f$, by Proposition 4.2.5.

Let us summarize the results on the inter-relationships of the various convergences studied in this chapter so far. We have defined pointwise convergence a.e., almost uniform convergence and convergence in measure.
• If a sequence of real-valued measurable functions converges almost uniformly, then it converges pointwise a.e. (cf. Proposition 4.1.1) as well as in measure (cf. Proposition 4.2.6).
• If a sequence of real-valued measurable functions converges pointwise a.e., then it converges almost uniformly (cf. Egorov’s theorem) and in measure (cf. Proposition 4.2.1), provided the space is a finite measure space.
• If a sequence of real-valued measurable functions converges in measure, then there is a subsequence which converges pointwise a.e. (cf. Proposition 4.2.2) and almost uniformly (cf. Propositions 4.2.7 and 4.1.2). In fact, Propositions 4.2.7, 4.1.2 and 4.1.1 yield another proof of Proposition 4.2.2.
• Almost uniform convergence of a sequence of real-valued measurable functions obviously implies that the sequence is almost uniformly Cauchy and vice-versa (cf. Proposition 4.1.2).
• Convergence in measure of a sequence of real-valued measurable functions implies that the sequence is Cauchy in measure (cf. Proposition 4.2.4) and vice-versa (cf. Proposition 4.2.8).
We will conclude this section by studying the behaviour of convergence in measure with respect to basic algebraic operations on functions.
Proposition 4.2.9 Let (X, S, µ) be a measure space and let {fn }∞
n=1
and {gn }∞
n=1 be sequences of real-valued measurable functions defined on
µ
µ
X. Let fn → f and let gn → g, where f and g are real-valued measurable
functions defined on X. Let α and β be non-zero real scalars.Then
µ
µ
αfn + βgn → αf + βg. We also have that |fn | → |f |.
Proof: Let ε > 0. The result follows immediately from the following relations:
\[
\{x \in X \mid |(\alpha f_n + \beta g_n)(x) - (\alpha f + \beta g)(x)| \geq \varepsilon\}
\subset \left\{x \in X \mid |f_n(x) - f(x)| \geq \frac{\varepsilon}{2|\alpha|}\right\}
\cup \left\{x \in X \mid |g_n(x) - g(x)| \geq \frac{\varepsilon}{2|\beta|}\right\}
\]
and
\[
\{x \in X \mid |\,|f_n(x)| - |f(x)|\,| \geq \varepsilon\} \subset \{x \in X \mid |f_n(x) - f(x)| \geq \varepsilon\}.
\]
Proposition 4.2.10 Let (X, S, µ) be a finite measure space and let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be sequences of real-valued measurable functions defined on X. Let $f_n \xrightarrow{\mu} f$ and let $g_n \xrightarrow{\mu} g$, where f and g are real-valued measurable functions defined on X. Then $f_n g_n \xrightarrow{\mu} fg$.
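Proof: It follows from the relation
\[
fg = \frac{1}{4}\left[(f+g)^2 - (f-g)^2\right],
\]
together with Proposition 4.2.9, that it is enough to show that if $f_n \xrightarrow{\mu} f$, then $f_n^2 \xrightarrow{\mu} f^2$.
Step 1: Let f = 0. Then, since
\[
\{x \in X \mid |f_n(x)|^2 \geq \varepsilon\} = \{x \in X \mid |f_n(x)| \geq \sqrt{\varepsilon}\},
\]
it follows that $f_n^2 \xrightarrow{\mu} 0$ as well.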
Step 2: If $f_n \xrightarrow{\mu} f$, then $f_n - f \xrightarrow{\mu} 0$. Now, let
\[
E_n = \{x \in X \mid |f(x)| > n\},
\]
so that $E_n \downarrow \emptyset$. Since µ(X) < +∞, it follows that (cf. Proposition 1.2.5) $\mu(E_n) \downarrow 0$. Given δ > 0, choose m ∈ N such that $\mu(E_m) < \delta$. Now,
\[
\{x \in X \mid |f_n f(x) - f^2(x)| \geq \varepsilon\}
= \left(\{x \in X \mid |f_n f(x) - f^2(x)| \geq \varepsilon\} \cap E_m\right)
\cup \left(\{x \in X \mid |f_n f(x) - f^2(x)| \geq \varepsilon\} \cap E_m^c\right).
\]
The measure of the first set on the right-hand side of the above relation is, evidently, less than δ. Since $|f(x)| \leq m$ for $x \in E_m^c$, it follows that for all points x in the second set on the right-hand side of the above relation, we have
\[
\varepsilon \leq m|f_n(x) - f(x)|, \quad \text{i.e.} \quad |f_n(x) - f(x)| \geq \frac{\varepsilon}{m}.
\]
Since $f_n \xrightarrow{\mu} f$, we can find N ∈ N such that, for all n ≥ N, we have
\[
\mu\left(\{x \in X \mid |f_n f(x) - f^2(x)| \geq \varepsilon\} \cap E_m^c\right) < \delta.
\]
Thus, for all n ≥ N, we have
\[
\mu\left(\{x \in X \mid |f_n f(x) - f^2(x)| \geq \varepsilon\}\right) < 2\delta,
\]
which shows that $f_n f \xrightarrow{\mu} f^2$.
Step 3: Now
\[
f_n^2 - f^2 = (f_n - f)^2 + 2(f_n f - f^2).
\]
Since $f_n - f \xrightarrow{\mu} 0$, we have $(f_n - f)^2 \xrightarrow{\mu} 0$ by Step 1. Combining this with the result of Step 2 above, we deduce that $f_n^2 \xrightarrow{\mu} f^2$, which completes the proof.
Example 4.2.3 The above result is not true, in general, in infinite measure spaces. Consider the set N equipped with the counting measure defined on the σ-algebra of all subsets. Let
\[
f_n(k) = \begin{cases} \frac{1}{n}, & \text{if } 1 \leq k \leq n, \\ 0, & \text{if } k > n. \end{cases}
\]
Then $f_n \to 0$ uniformly on N and hence, as already observed earlier, $f_n \xrightarrow{\mu} 0$. Let g(n) = n for all n ∈ N. Now $(f_n g)(n) = 1$ for all n ∈ N and so $f_n g$ does not converge uniformly to zero and so it does not converge to zero in measure.
4.3 Exercises
4.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging in measure to a real-valued measurable function f defined on X. If g is a real-valued measurable function defined on X such that f = g a.e., show that $f_n \xrightarrow{\mu} g$.
4.2 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be two sequences of real-valued measurable functions defined on X such that $f_n = g_n$ a.e. for every n. If $f_n \xrightarrow{\mu} f$, show that $g_n \xrightarrow{\mu} f$, where f is a real-valued measurable function defined on X.
4.3 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging in measure to a real-valued measurable function f defined on X. If the sequence $\{f_n\}$ is almost uniformly Cauchy, show that it converges to f almost uniformly.
4.4 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging in measure to a real-valued measurable function f defined on X. If the sequence $\{f_n\}$ is pointwise Cauchy a.e., show that $f_n \to f$ almost everywhere.
4.5 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X such that every subsequence has a further subsequence which converges in measure to a fixed real-valued measurable function f defined on X. Show that $f_n \xrightarrow{\mu} f$.
4.6 (a) Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X. Let f be a real-valued measurable function defined on X. Show that the following statements are equivalent:
(i) $f_n \xrightarrow{\mu} f$.
(ii) Every subsequence of $\{f_n\}$ has a further subsequence converging to f almost uniformly.
(b) If, in addition, µ(X) < +∞, show that the above statements are equivalent to the following statement:
(iii) Every subsequence of $\{f_n\}$ has a further subsequence converging to f almost everywhere.
4.7 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X such that $f_n \xrightarrow{\mu} 0$. Let $\{a_n\}$ be a sequence of real numbers such that $a_n \downarrow 0$. Show that there exists a subsequence $\{f_{n_k}\}$ such that for almost every x ∈ X, we have $|f_{n_k}(x)| < a_k$ for sufficiently large k.
4.8 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging in measure to a real-valued measurable function f defined on X. If, for every n ∈ N, we have that $f_n \geq 0$ a.e., show that f ≥ 0 almost everywhere. Deduce that
(i) if $f_n \xrightarrow{\mu} f$ and if for every n, $f_n \leq g$ a.e., then f ≤ g a.e., where g is a real-valued measurable function defined on X;
(ii) if $f_n \xrightarrow{\mu} f$ and if for every n, $|f_n| \leq g$ a.e., then |f| ≤ g a.e., where g is a real-valued measurable function defined on X.
4.9 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on X converging in measure to a real-valued measurable function f defined on X. If $f_n \leq f_{n+1}$ for every n, show that $f_n \uparrow f$ almost everywhere.
Chapter 5
Integration
5.1
Non-negative simple functions
Let (X, S, µ) be a measure space. Let ϕ : X → R be a (measurable) simple function which is non-negative. Let $\{\alpha_i\}_{i=1}^{n}$ be the set of non-zero values assumed by ϕ. Set
\[
A_i = \varphi^{-1}(\{\alpha_i\}), \quad 1 \leq i \leq n.
\]
The sets $\{A_i\}_{i=1}^{n}$ are mutually disjoint and we can write
\[
\varphi = \sum_{i=1}^{n} \alpha_i \chi_{A_i}. \tag{5.1.1}
\]
In order to define the integral of ϕ, over the set X, with respect to the measure µ, we imitate what one does in order to define the Riemann integral of a step function. Given a step function of the form
\[
\varphi = \sum_{i=1}^{n} \alpha_i \chi_{I_i},
\]
where $\{I_i\}_{i=1}^{n}$ is a finite collection of disjoint intervals and the $\alpha_i$ are all non-negative, the Riemann integral of ϕ is nothing but the area under the graph of ϕ, i.e.
\[
\int_{\mathbb{R}} \varphi(x)\, dx = \sum_{i=1}^{n} \alpha_i\, m_1(I_i).
\]
Imitating this, we can define, when ϕ is of the form given in (5.1.1),
\[
\int_X \varphi\, d\mu = \sum_{i=1}^{n} \alpha_i\, \mu(A_i).
\]
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_5
However, a simple function may be written in more than one way as a
finite linear combination of characteristic functions. For instance, each
set Ai may be partitioned into subsets and ϕ could be written in terms
of the characteristic functions of those subsets. It is also possible to
express ϕ in the form (5.1.1), with the sets Ai not being mutually disjoint. Thus, we would like to define the integral in a manner which is
independent of the way the function is written.
Let us assume that we can write ϕ, given by (5.1.1), in the form
\[
\varphi = \sum_{j=1}^{m} \beta_j \chi_{B_j},
\]
where the sets $\{B_j\}_{j=1}^{m}$ are also mutually disjoint. Then, it follows that each $\beta_j$ is equal to $\alpha_i$ for some unique index i. In that case, we have that $B_j \subset A_i$. Further, we have that
\[
A_i = \bigcup_{\{j \mid \beta_j = \alpha_i\}} B_j.
\]
Since µ is finitely additive, we have that
\[
\mu(A_i) = \sum_{\{j \mid \beta_j = \alpha_i\}} \mu(B_j).
\]
It is now immediate to see that
\[
\sum_{j=1}^{m} \beta_j\, \mu(B_j) = \sum_{i=1}^{n} \alpha_i\, \mu(A_i). \tag{5.1.2}
\]
Now let us assume that we write
\[
\varphi = \sum_{i=1}^{k} \gamma_i \chi_{E_i}, \tag{5.1.3}
\]
where the sets $\{E_i\}_{i=1}^{k}$ are not necessarily disjoint.
Let $\sigma = (\sigma_1, \cdots, \sigma_k)$ be a k-tuple, where $\sigma_i = \pm 1$ for each 1 ≤ i ≤ k. Define, for A ⊂ X,
\[
A^{\sigma_i} = \begin{cases} A, & \text{if } \sigma_i = 1, \\ A^c, & \text{if } \sigma_i = -1. \end{cases}
\]
Set
\[
E^{\sigma} = \bigcap_{i=1}^{k} E_i^{\sigma_i}.
\]
Thus, if $\sigma_0 = (-1, \cdots, -1)$, then
\[
E^{\sigma_0} = \bigcap_{i=1}^{k} E_i^c = \left(\bigcup_{i=1}^{k} E_i\right)^c.
\]
Given two such k-tuples σ and σ′ which are not equal, there must exist i with 1 ≤ i ≤ k such that $\sigma_i \neq \sigma'_i$. Without loss of generality, assume that $\sigma_i = +1$ and $\sigma'_i = -1$. In that case, by definition, $E^{\sigma} \subset E_i$ while $E^{\sigma'} \subset E_i^c$. Thus, if σ ≠ σ′, we have that $E^{\sigma}$ and $E^{\sigma'}$ are disjoint.
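Lemma 5.1.1 With the preceding notations, we have, for each 1 ≤ i ≤ k,
\[
E_i = \bigcup_{\{\sigma \mid \sigma_i = +1\}} E^{\sigma}. \tag{5.1.4}
\]
Proof: If $\sigma_i = +1$, then $E^{\sigma} \subset E_i$. Thus the set on the right-hand side of (5.1.4) is contained in $E_i$. Conversely, let $x \in E_i$. Define σ as follows: $\sigma_j = +1$ if $x \in E_j$ and $\sigma_j = -1$ if $x \notin E_j$, where 1 ≤ j ≤ k. In particular, $\sigma_i = +1$ and $x \in E^{\sigma}$. This establishes the reverse inclusion in (5.1.4) and thus completes the proof.
Now let us assume that ϕ is given in the form (5.1.3). For any 1 ≤ i ≤ k, we have, since the $E^{\sigma}$ are disjoint,
\[
\chi_{E_i} = \chi_{\bigcup_{\{\sigma \mid \sigma_i = +1\}} E^{\sigma}} = \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^{\sigma}}.
\]
Consequently, we have
\[
\varphi = \sum_{i=1}^{k} \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^{\sigma}} = \sum_{\sigma \neq \sigma_0} \; \sum_{\{i \mid \sigma_i = +1\}} \gamma_i\, \chi_{E^{\sigma}}.
\]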
By virtue of (5.1.2), we get, since the $E^{\sigma}$ are disjoint,
\[
\sum_{i=1}^{n} \alpha_i\, \mu(A_i)
= \sum_{\sigma \neq \sigma_0} \; \sum_{\{i \mid \sigma_i = +1\}} \gamma_i\, \mu(E^{\sigma})
= \sum_{i=1}^{k} \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \mu(E^{\sigma})
= \sum_{i=1}^{k} \gamma_i\, \mu(E_i).
\]
The last equality comes from the finite additivity of the measure and from the result of Lemma 5.1.1 above.
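The construction of the sets $E^{\sigma}$ can be checked on a small example. The following is a minimal sketch, assuming a six-point set X with hypothetical point masses (the set, the weights and all function names are illustrative, not taken from the text). It builds the sets $E^{\sigma}$ for an overlapping representation, checks (5.1.4), and confirms that the weighted sum computed from the overlapping representation agrees with the one computed from the disjoint representation (5.1.1).

```python
from fractions import Fraction
from itertools import product

X = frozenset(range(6))
# Hypothetical point masses: mu({x}) = (x + 1) / 2.
mu = {x: Fraction(x + 1, 2) for x in X}


def measure(A):
    """Measure of a subset A of X (sum of its point masses)."""
    return sum((mu[x] for x in A), Fraction(0))


def atoms(sets):
    """Map each sign tuple sigma to the set E^sigma."""
    result = {}
    for sigma in product((+1, -1), repeat=len(sets)):
        atom = set(X)
        for s, E in zip(sigma, sets):
            atom &= E if s == +1 else X - E
        result[sigma] = atom
    return result


def weighted_sum(rep):
    """Sum of gamma * mu(E) over a representation [(gamma, E), ...]."""
    return sum((gamma * measure(E) for gamma, E in rep), Fraction(0))


def canonical(rep):
    """Disjoint representation (5.1.1): non-zero values with their level sets."""
    values = {x: sum(g for g, E in rep if x in E) for x in X}
    levels = sorted({v for v in values.values() if v != 0})
    return [(a, {x for x in X if values[x] == a}) for a in levels]


# phi = 2 * chi_{E_1} + 3 * chi_{E_2}, with E_1 and E_2 overlapping.
overlapping = [(2, {0, 1, 2, 3}), (3, {2, 3, 4})]
sets = [E for _, E in overlapping]
E_sigma = atoms(sets)
```

Here ϕ takes the values 2, 5 and 3, so its disjoint representation has three terms, while the overlapping one has two; both weighted sums come out to 28.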
Thus we can now make the following definition, which is independent
of the representation of a simple function.
Definition 5.1.1 Let (X, S, µ) be a measure space and let ϕ be a non-negative simple function given by
\[
\varphi = \sum_{i=1}^{k} \gamma_i \chi_{E_i}.
\]
The (Lebesgue) integral of ϕ, over the set X, with respect to the measure µ, is given by
\[
\int_X \varphi\, d\mu = \sum_{i=1}^{k} \gamma_i\, \mu(E_i).
\]
Remark 5.1.1 Notice that the measure of some (or all) of the sets $E_i$ could be +∞. Thus the integral of ϕ could be +∞ as well. This is the reason why we only consider non-negative functions. If the $\gamma_i$ were of different signs and if the corresponding sets were of infinite measure, then we cannot add them meaningfully. For consistency, if $\gamma_i = 0$ and the set $E_i$ has infinite measure, we adopt the convention that $0 \cdot \infty = 0$.
Remark 5.1.2 Let (X, S, µ) be a measure space and let E be a measurable subset of X. Then we can consider the σ-algebra $S_E$ of sets of the form A ∩ E, where A ∈ S, defined on E, and the restriction of the measure µ to this σ-algebra. If ϕ is a non-negative simple function defined on X given by (5.1.3), then its restriction to E is given by
\[
\varphi|_E = \sum_{i=1}^{k} \gamma_i \chi_{E_i \cap E}.
\]
We define the integral of $\varphi|_E$, over E, with respect to the measure µ restricted to E, as the integral of ϕ, over the set E, with respect to the measure µ, and denote it by $\int_E \varphi\, d\mu$. Clearly we have
\[
\int_E \varphi\, d\mu = \sum_{i=1}^{k} \gamma_i\, \mu(E_i \cap E) = \int_X \varphi \chi_E\, d\mu.
\]
Remark 5.1.3 Let (X, S, µ) be a measure space and let ϕ be a non-negative simple function given by (5.1.1). Let ψ be another non-negative simple function such that ψ ≤ ϕ. Then, clearly, we can write
\[
\psi = \sum_{j=1}^{m} \beta_j \chi_{F_j},
\]
where each $F_j$ is a subset of some unique $A_i$ and, in that case, $0 \leq \beta_j \leq \alpha_i$. Then it is clear that we have
\[
\int_X \psi\, d\mu \leq \int_X \varphi\, d\mu.
\]
5.2 Non-negative functions
Let (X, S, µ) be a measure space. Let f be a non-negative, extended
real-valued measurable function defined on X. Recall that (cf. Theorem
3.1.1) f is the increasing limit of a sequence of non-negative simple
functions.
Definition 5.2.1 Let (X, S, µ) be a measure space and let f be a non-negative, extended real-valued measurable function defined on X. Then the (Lebesgue) integral of f, over the set X, with respect to the measure µ, is defined by
\[
\int_X f\, d\mu = \sup_{\substack{0 \leq \varphi \leq f \\ \varphi \text{ simple}}} \int_X \varphi\, d\mu.
\]
If E ⊂ X is a measurable set, then we define the integral of f, over the set E, with respect to the measure µ, by
\[
\int_E f\, d\mu = \int_X f \chi_E\, d\mu.
\]
Remark 5.2.1 In view of Remarks 5.1.2 and 5.1.3, it is clear that the
above definition is consistent with the definitions made in the previous
section in case f is itself a non-negative simple function. Again, if ϕ is a
simple function such that 0 ≤ ϕ ≤ f , then ϕχE is a non-negative simple
function defined on E which is bounded above by f |E . Conversely, if ϕ
is a non-negative simple function defined on E bounded above by f |E ,
then its extension to all of X by setting it to be zero outside E is also
a non-negative simple function defined on X and is bounded above by
f . Thus, we can easily see that the integral of f , over the set E, with
respect to the measure µ, defined above is the same as the integral of
the function f |E , over the set E, with respect to the restriction of the
measure µ to E.
Remark 5.2.2 Notice that the integral of a non-negative real-valued function may be infinite.
The following proposition is an immediate consequence of the definition of the integral for non-negative functions.
Proposition 5.2.1 Let (X, S, µ) be a measure space and let f be a non-negative, extended real-valued measurable function defined on X.
(a) If g is a measurable function defined on X such that 0 ≤ g ≤ f, and if E is a measurable subset of X, then
\[
\int_E g\, d\mu \leq \int_E f\, d\mu. \tag{5.2.1}
\]
(b) If E and F are measurable subsets of X such that E ⊂ F, then
\[
\int_E f\, d\mu \leq \int_F f\, d\mu. \tag{5.2.2}
\]
(c) If c is a non-negative real number, and if E is a measurable subset of X, then
\[
\int_E cf\, d\mu = c \int_E f\, d\mu. \tag{5.2.3}
\]
(d) If E is a measurable subset of X such that f(x) = 0 for all x ∈ E, then
\[
\int_E f\, d\mu = 0.
\]
(e) If E is a measurable subset of X such that µ(E) = 0, then
\[
\int_E f\, d\mu = 0.
\]
Proposition 5.2.2 Let (X, S, µ) be a measure space and let f : X → R be a non-negative measurable function. If $\int_X f\, d\mu = 0$, then f = 0 almost everywhere.
Proof: Set
\[
F_n = \left\{x \in X \mid f(x) > \frac{1}{n}\right\}, \quad n \in \mathbb{N}.
\]
Then
\[
\{x \in X \mid f(x) \neq 0\} = \bigcup_{n=1}^{\infty} F_n.
\]
Then, by virtue of (5.2.1) and (5.2.2), we get
\[
\frac{1}{n}\, \mu(F_n) \leq \int_{F_n} f\, d\mu \leq \int_X f\, d\mu = 0.
\]
Thus, $\mu(F_n) = 0$ for each n ∈ N and the result follows.
Proposition 5.2.3 Let (X, S, µ) be a measure space and let ϕ : X → R be a non-negative simple function. Define, for E ∈ S,
\[
\nu(E) = \int_E \varphi\, d\mu.
\]
Then, ν defines a measure on S.
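Proof: Clearly ν is non-negative and ν(∅) = 0. We just need to check countable additivity. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of measurable sets in X which are mutually disjoint. Let $E = \bigcup_{i=1}^{\infty} E_i$. Let $\varphi = \sum_{j=1}^{k} \alpha_j \chi_{A_j}$. Then
\[
\nu(E) = \int_E \varphi\, d\mu = \sum_{j=1}^{k} \alpha_j\, \mu(A_j \cap E)
= \sum_{j=1}^{k} \alpha_j \sum_{i=1}^{\infty} \mu(A_j \cap E_i)
= \sum_{i=1}^{\infty} \sum_{j=1}^{k} \alpha_j\, \mu(A_j \cap E_i)
= \sum_{i=1}^{\infty} \int_{E_i} \varphi\, d\mu
= \sum_{i=1}^{\infty} \nu(E_i).
\]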
This completes the proof.
Proposition 5.2.4 Let (X, S, µ) be a measure space and let ϕ and ψ be non-negative simple functions defined on X. Then
\[
\int_X (\varphi + \psi)\, d\mu = \int_X \varphi\, d\mu + \int_X \psi\, d\mu. \tag{5.2.4}
\]
Proof: Let $\varphi = \sum_{i=1}^{n} \alpha_i \chi_{A_i}$ and let $\psi = \sum_{j=1}^{m} \beta_j \chi_{B_j}$, where the $\{A_i\}_{i=1}^{n}$ and the $\{B_j\}_{j=1}^{m}$ are collections of mutually disjoint sets (here we allow the value zero among the $\alpha_i$ and the $\beta_j$, so that each collection covers X). Set $E_{ij} = A_i \cap B_j$. Then
\[
\varphi + \psi = \sum_{i=1}^{n} \sum_{j=1}^{m} (\alpha_i + \beta_j) \chi_{E_{ij}}.
\]
Then,
\[
\int_{E_{ij}} (\varphi + \psi)\, d\mu = (\alpha_i + \beta_j)\, \mu(E_{ij}) = \int_{E_{ij}} \varphi\, d\mu + \int_{E_{ij}} \psi\, d\mu.
\]
The result now follows immediately from the preceding proposition since the $E_{ij}$ are all disjoint.
We are now in a position to prove the first important theorem which shows how the (Lebesgue) integral handles limit processes.
Theorem 5.2.1 (Monotone convergence theorem) Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative measurable functions defined on X such that, for every x ∈ X,
(i) $0 \leq f_1(x) \leq f_2(x) \leq \cdots \leq f_n(x) \leq \cdots$, and,
(ii) $\lim_{n \to \infty} f_n(x) = f(x)$.
Then
\[
\lim_{n \to \infty} \int_X f_n\, d\mu = \int_X f\, d\mu.
\]
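Proof: Let
\[
\alpha = \sup_n \int_X f_n\, d\mu.
\]
By Proposition 5.2.1 (a), we have
\[
\int_X f_1\, d\mu \leq \int_X f_2\, d\mu \leq \cdots \leq \int_X f_n\, d\mu \leq \cdots,
\]
so that the limit of the integrals exists and equals α. Since $f_n \leq f$, we also have $\int_X f_n\, d\mu \leq \int_X f\, d\mu$. Thus, it follows that
\[
\alpha \leq \int_X f\, d\mu.
\]
We now need to prove the reverse inequality, which will complete the proof.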
Let 0 < c < 1 be any fixed constant. Let ϕ be a simple function such that 0 ≤ ϕ ≤ f. For n ∈ N, define
\[
E_n = \{x \in X \mid f_n(x) \geq c\varphi(x)\}.
\]
Then each set $E_n$ is measurable. Since the sequence $\{f_n\}_{n=1}^{\infty}$ is increasing, we have that $E_1 \subset E_2 \subset \cdots \subset E_n \subset \cdots$. Further, given any x ∈ X, we have two possibilities.
• Either, f(x) = 0 which implies that $f_n(x) = 0$ for all n and also that ϕ(x) = 0. In this case $x \in E_1$.
• Or, f(x) > 0 which implies that f(x) > cϕ(x), since 0 < c < 1. In this case, there exists n ∈ N such that $c\varphi(x) < f_n(x) \leq f(x)$ and so $x \in E_n$.
Thus, $X = \bigcup_{n=1}^{\infty} E_n$. Now,
\[
\int_X f_n\, d\mu \geq \int_{E_n} f_n\, d\mu \geq c \int_{E_n} \varphi\, d\mu \overset{\text{def}}{=} c\,\nu(E_n).
\]
But, by Proposition 5.2.3, ν is a measure and so, since the sets $E_n$ increase to X,
\[
\alpha \geq c \lim_{n \to \infty} \nu(E_n) = c\,\nu(X) = c \int_X \varphi\, d\mu.
\]
Since this is true for any simple function ϕ satisfying 0 ≤ ϕ ≤ f, we get, by definition of the integral, that
\[
\alpha \geq c \int_X f\, d\mu.
\]
Since 0 < c < 1 was arbitrarily chosen, it now follows, on letting c tend to unity, that
\[
\alpha \geq \int_X f\, d\mu,
\]
which completes the proof.
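The mechanism of the preceding proof can be followed on a small finite example. The sketch below assumes a five-point space with hypothetical point masses and a hypothetical function f (all names are illustrative); on a finite space every function is simple, so the integral is just a weighted sum. It takes $f_n = \min(f, n)$, which increases to f, and lists both the integrals of the $f_n$ and the sets $E_n = \{f_n \geq c\varphi\}$ with ϕ = f and c = 9/10.

```python
from fractions import Fraction

X = range(5)
# Hypothetical point masses and a hypothetical non-negative function f.
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1), 3: Fraction(3), 4: Fraction(1, 2)}
f = {0: Fraction(0), 1: Fraction(1, 2), 2: Fraction(2), 3: Fraction(7, 2), 4: Fraction(5)}


def integral(h):
    """Integral of h over X: every function on a finite set is simple."""
    return sum((h[x] * mu[x] for x in X), Fraction(0))


def f_n(n):
    """Truncation min(f, n); these functions increase pointwise to f."""
    return {x: min(f[x], Fraction(n)) for x in X}


def E_n(n, c, phi):
    """The set {x : f_n(x) >= c * phi(x)} used in the proof."""
    fn = f_n(n)
    return {x for x in X if fn[x] >= c * phi[x]}


integrals = [integral(f_n(n)) for n in range(1, 7)]
c = Fraction(9, 10)
levels = [E_n(n, c, f) for n in range(1, 7)]
```

The integrals increase to the integral of f, and the sets $E_n$ increase until they exhaust X, which is exactly what the proof uses.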
Remark 5.2.3 It is possible that $\int_X f\, d\mu$ is infinite. In that case, we conclude from the preceding theorem that the limit of $\int_X f_n\, d\mu$, as n → ∞, is also infinite.
Proposition 5.2.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative extended real-valued measurable functions defined on X. Set
\[
f(x) = \sum_{n=1}^{\infty} f_n(x), \quad x \in X.
\]
Then f is a non-negative measurable function and
\[
\int_X f\, d\mu = \sum_{n=1}^{\infty} \int_X f_n\, d\mu.
\]
Proof: Any finite sum of non-negative measurable functions is measurable (cf. Remark 3.1.2). If gn = f1 + · · · + fn , then gn increases to f .
Thus, for any α ∈ R, we have
{x ∈ X | f (x) ≤ α} = ∩∞
n=1 {x ∈ X | gn (x) ≤ α}
which shows that f is measurable.
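Let $\{\varphi_n\}_{n=1}^{\infty}$ and $\{\psi_n\}_{n=1}^{\infty}$ be increasing sequences of non-negative simple functions increasing to $f_1$ and $f_2$ respectively (cf. Theorem 3.1.1). By Proposition 5.2.4,
\[
\int_X (\varphi_n + \psi_n)\, d\mu = \int_X \varphi_n\, d\mu + \int_X \psi_n\, d\mu.
\]
We also have that $\varphi_n + \psi_n$ increases to $f_1 + f_2$. Hence, by the monotone convergence theorem, we get
\[
\lim_{n \to \infty} \int_X (\varphi_n + \psi_n)\, d\mu = \int_X (f_1 + f_2)\, d\mu,
\]
and
\[
\lim_{n \to \infty} \int_X \varphi_n\, d\mu = \int_X f_1\, d\mu, \qquad \lim_{n \to \infty} \int_X \psi_n\, d\mu = \int_X f_2\, d\mu.
\]
We thus conclude that
\[
\int_X (f_1 + f_2)\, d\mu = \int_X f_1\, d\mu + \int_X f_2\, d\mu.
\]
It now follows by induction that, for any n ∈ N,
\[
\int_X (f_1 + \cdots + f_n)\, d\mu = \sum_{k=1}^{n} \int_X f_k\, d\mu.
\]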
Setting $g_n = f_1 + \cdots + f_n$, we get that $g_n$ is non-negative, measurable and increases to f. Thus, the result follows, once again, by an application of the monotone convergence theorem.
Example 5.2.1 (Integration with respect to the counting measure) Let X = N be equipped with the counting measure. Let f : X → R be a given non-negative function. Let $f(k) = a_k \geq 0$, k ∈ N. If we define
\[
f_n(k) = \begin{cases} a_k, & \text{if } 1 \leq k \leq n, \\ 0, & \text{if } k > n, \end{cases}
\]
then $f_n$ increases to f. Notice that
\[
f_n = \sum_{k=1}^{n} a_k \chi_{\{k\}}
\]
is a non-negative simple function and so, by definition,
\[
\int_X f_n\, d\mu = \sum_{k=1}^{n} a_k\, \mu(\{k\}) = \sum_{k=1}^{n} a_k.
\]
Thus, by the monotone convergence theorem, it follows that
\[
\int_X f\, d\mu = \sum_{k=1}^{\infty} a_k.
\]
The integral of a non-negative function over N with respect to the counting measure is just the summation of the values of the function.
Example 5.2.2 (Integration with respect to the Dirac measure) Let X be a non-empty set and let $x_0 \in X$. Let µ be the Dirac measure concentrated at the point $x_0$ (cf. Example 1.2.2). Let ϕ be a simple function defined as in (5.1.1). Then $x_0$ can belong to at most one set $A_i$. If $x_0 \notin A_j$ for any 1 ≤ j ≤ n, then $\mu(A_j) = 0$ for all 1 ≤ j ≤ n and we have
\[
\int_X \varphi\, d\mu = 0 = \varphi(x_0).
\]
If $1 \leq i_0 \leq n$ is such that $x_0 \in A_{i_0}$, then we see that
\[
\int_X \varphi\, d\mu = \alpha_{i_0} = \varphi(x_0).
\]
Now, if f is any non-negative extended real-valued measurable function defined on X, it is the increasing limit of non-negative simple functions and, by the monotone convergence theorem, it immediately follows that
\[
\int_X f\, d\mu = f(x_0).
\]
Thus, integration of a non-negative function with respect to the Dirac measure is just evaluation of the function at the point where the measure is concentrated.
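The two cases in the Dirac measure example can be checked directly. This is a minimal sketch, assuming a hypothetical four-point set X = {a, b, c, d} and a hypothetical simple function given in the disjoint form (5.1.1); the function names are illustrative.

```python
def dirac(x0):
    """Dirac measure concentrated at x0: mu(A) = 1 if x0 is in A, else 0."""
    def mu(A):
        return 1 if x0 in A else 0
    return mu


def integral_simple(rep, mu):
    """Sum of alpha_i * mu(A_i) for a representation [(alpha_i, A_i), ...]."""
    return sum(alpha * mu(A) for alpha, A in rep)


def evaluate(rep, x):
    """Value at x of the simple function with the given representation."""
    return sum(alpha for alpha, A in rep if x in A)


# Hypothetical simple function on X = {"a", "b", "c", "d"}: it takes the
# value 3 on {a, b}, the value 7 on {c}, and vanishes at d.
phi = [(3, {"a", "b"}), (7, {"c"})]

at_c = integral_simple(phi, dirac("c"))  # x0 lies in one of the sets A_i
at_d = integral_simple(phi, dirac("d"))  # x0 lies in none of the sets A_i
```

In both cases the integral coincides with the value of the function at the point where the measure is concentrated.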
Example 5.2.3 Let $\{a_{ij}\}_{i,j=1}^{\infty}$ be a double sequence of non-negative real numbers. Let X = N be equipped with the counting measure. Define $f_i(j) = a_{ij}$, $1 \leq i, j < \infty$. Define $f = \sum_{i=1}^{\infty} f_i$. Then
\[
f(j) = \sum_{i=1}^{\infty} a_{ij}.
\]
Now, by Proposition 5.2.5, we get
\[
\int_X f\, d\mu = \sum_{i=1}^{\infty} \int_X f_i\, d\mu.
\]
Using the result of Example 5.2.1, this translates into the following relation:
\[
\sum_{j=1}^{\infty} f(j) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} f_i(j).
\]
Substituting the values for f(j) and $f_i(j)$, we get
\[
\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij} = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij}.
\]
Thus, we have shown that, for a non-negative double sequence of reals, the order of summation can be reversed. (Of course, both sums could be infinite.) This result is not true in general for sequences which change sign. We will later see sufficient conditions which ensure that the order of summation is immaterial.
Theorem 5.2.2 (Fatou's lemma) Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative extended real-valued measurable functions defined on X. Then
\[
\int_X \left(\liminf_{n \to \infty} f_n\right) d\mu \leq \liminf_{n \to \infty} \int_X f_n\, d\mu.
\]
Proof: Set $g_n(x) = \inf_{i \geq n} f_i(x)$ for x ∈ X. Then $\{g_n\}_{n=1}^{\infty}$ is an increasing sequence of non-negative measurable functions whose limit is $\liminf_{n \to \infty} f_n$. Since $g_n \leq f_i$ for all i ≥ n, we get, by the monotone convergence theorem,
\[
\int_X \left(\liminf_{n \to \infty} f_n\right) d\mu = \lim_{n \to \infty} \int_X g_n\, d\mu
\leq \lim_{n \to \infty} \inf_{i \geq n} \int_X f_i\, d\mu = \liminf_{n \to \infty} \int_X f_n\, d\mu.
\]
This completes the proof.
Example 5.2.4 We can have strict inequality in Fatou's lemma. Let X = R be equipped with the Lebesgue measure, $m_1$. Set
\[
f_n = \chi_{[n, n+1)}.
\]
Then $f_n(x) \to 0$ for each x ∈ R and so $\int_{\mathbb{R}} (\liminf_{n \to \infty} f_n)\, dm_1 = 0$. On the other hand, for every n ∈ N, we have $\int_{\mathbb{R}} f_n\, dm_1 = m_1([n, n+1)) = 1$.
The following result is a variation of the monotone convergence theorem.
Proposition 5.2.6 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ and f be non-negative, extended real-valued measurable functions defined on X. Assume that, for all n ∈ N, and for all x ∈ X, we have $0 \leq f_n(x) \leq f(x)$, and that $f_n(x) \to f(x)$ as n → ∞. Then
\[
\lim_{n \to \infty} \int_X f_n\, d\mu = \int_X f\, d\mu.
\]
Proof: By Fatou's lemma and the fact that the $f_n$ are all bounded above by f, we get
\[
\int_X f\, d\mu \leq \liminf_{n \to \infty} \int_X f_n\, d\mu \leq \limsup_{n \to \infty} \int_X f_n\, d\mu \leq \int_X f\, d\mu,
\]
from which the desired result follows immediately.
We conclude this section with a generalization of Proposition 5.2.3.
Proposition 5.2.7 Let (X, S, µ) be a measure space and let f be a non-negative extended real-valued measurable function defined on X. Define
\[
\nu(E) = \int_E f\, d\mu, \quad E \in S.
\]
Then ν defines a measure on S. If g is any non-negative extended real-valued measurable function defined on X, then we have
\[
\int_X g\, d\nu = \int_X gf\, d\mu. \tag{5.2.5}
\]
Proof: Clearly ν(∅) = 0 and ν(E) ≥ 0 for all E ∈ S. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of mutually disjoint sets in S whose union is E. Then
\[
\chi_E = \sum_{i=1}^{\infty} \chi_{E_i}.
\]
Thus, using Proposition 5.2.5, we get
\[
\nu(E) = \int_E f\, d\mu = \int_X f \chi_E\, d\mu
= \sum_{i=1}^{\infty} \int_X f \chi_{E_i}\, d\mu
= \sum_{i=1}^{\infty} \int_{E_i} f\, d\mu
= \sum_{i=1}^{\infty} \nu(E_i).
\]
This establishes the countable additivity and thus ν is a measure.
Now, let $g = \chi_E$, where E ∈ S. Then,
\[
\int_X g\, d\nu = \nu(E) = \int_E f\, d\mu = \int_X fg\, d\mu.
\]
Thus (5.2.5) is true when g is a characteristic function. By the linearity of the integral with respect to the integrand (cf. (5.2.3) and (5.2.4)), it follows that (5.2.5) is true when g is any non-negative simple function. Then, by the monotone convergence theorem, (5.2.5) is true when g is any non-negative extended real-valued measurable function.
Remark 5.2.4 The above method of proof is very useful in proving several identities involving integrals of non-negative measurable functions. We first prove an identity for characteristic functions, and then, by linearity, for simple functions and then, by the monotone convergence theorem, for arbitrary non-negative measurable functions.
Remark 5.2.5 The result of (5.2.5) is often symbolically written as dν = f dµ. An important converse of this result is known as the Radon-Nikodym theorem which we will see much later.
5.3 Integrable functions
Let (X, S, µ) be a measure space. We now consider an arbitrary measurable function f defined on X. Since any function f can be split into its positive and negative parts (cf. Remark 3.1.3) as $f = f^+ - f^-$, we may, in view of the linearity of the integral with respect to the integrand, try to define the integral of f as the difference between the integrals of $f^+$ and $f^-$, which are well-defined, since these functions are non-negative. However, if both these integrals turn out to be infinite, we cannot define their difference. So we need that at least one of the two quantities, $\int_X f^+\, d\mu$ or $\int_X f^-\, d\mu$, be finite. In view of this requirement, we make the following definition.
Definition 5.3.1 Let (X, S, µ) be a measure space and let f be a measurable function defined on X. The function f is said to be (Lebesgue) integrable if
\[
\int_X |f|\, d\mu < +\infty.
\]
Since $|f| = f^+ + f^-$, it now follows from the definition of integrability that both $\int_X f^+\, d\mu$ and $\int_X f^-\, d\mu$ are finite and so we are now in a position to define the integral of an integrable function.
Definition 5.3.2 Let (X, S, µ) be a measure space and let f be an integrable function defined on X. Then the (Lebesgue) integral of f over X with respect to the measure µ, is defined by
\[
\int_X f\, d\mu = \int_X f^+\, d\mu - \int_X f^-\, d\mu. \tag{5.3.1}
\]
At this point, it is easy for us to consider complex-valued functions as well. Let f : X → C be a given function, written in terms of its real and imaginary parts, as f = u + iv.
Definition 5.3.3 Let (X, S, µ) be a measure space and let f be a complex-valued function defined on X. It is said to be measurable if its real and imaginary parts are measurable real-valued functions. It is said to be integrable if, in addition, |f| is integrable.
If f is an integrable complex-valued function, then, clearly, its real and imaginary parts are also integrable, for, if f = u + iv, then |u| ≤ |f| and |v| ≤ |f|. Thus we may now define, in this case,
\[
\int_X f\, d\mu = \int_X u\, d\mu + i \int_X v\, d\mu. \tag{5.3.2}
\]
Notation Let (X, S, µ) be a measure space. We generally use the symbol $\int_X f\, d\mu$ to denote the (Lebesgue) integral of an integrable function defined over X, with respect to the measure µ. However, there may arise situations when f depends on more than one variable. For instance, if f is a function of two variables x and y varying over different measure spaces, say, (X, S, µ) and (Y, S′, ν), and if we wish to integrate f as a function of x with y fixed in Y, we will write the integral as
\[
\int_X f(x, y)\, d\mu(x).
\]
We now prove the full linearity of the Lebesgue integral with respect to the integrand.
Theorem 5.3.1 Let (X, S, µ) be a measure space and let f and g be integrable, complex-valued functions defined on X. Let α and β be complex constants. Then
\[
\int_X (\alpha f + \beta g)\, d\mu = \alpha \int_X f\, d\mu + \beta \int_X g\, d\mu. \tag{5.3.3}
\]
Proof: By definition, αf + βg is clearly measurable. Further, since |αf + βg| ≤ |α||f| + |β||g|, it follows, from the linearity of the Lebesgue integral with respect to non-negative integrands and non-negative constants (cf. (5.2.3) and the proof of Proposition 5.2.5), that αf + βg is integrable as well.
We will now show that
\[
\int_X (f + g)\, d\mu = \int_X f\, d\mu + \int_X g\, d\mu. \tag{5.3.4}
\]
Again, by definition of the integral for complex-valued functions, it is enough to prove (5.3.4) when f and g are real-valued. Assuming this, let h = f + g. Then
\[
h^+ - h^- = f + g = f^+ - f^- + g^+ - g^-,
\]
which implies that
\[
h^+ + f^- + g^- = f^+ + g^+ + h^-.
\]
Since all the functions involved in the above relation are non-negative, we deduce that (cf. the proof of Proposition 5.2.5)
\[
\int_X h^+\, d\mu + \int_X f^-\, d\mu + \int_X g^-\, d\mu = \int_X f^+\, d\mu + \int_X g^+\, d\mu + \int_X h^-\, d\mu.
\]
Since all the quantities in the above relation are finite, we can rearrange the terms to get
\[
\int_X h^+\, d\mu - \int_X h^-\, d\mu = \int_X f^+\, d\mu - \int_X f^-\, d\mu + \int_X g^+\, d\mu - \int_X g^-\, d\mu,
\]
which is exactly (5.3.4).
Finally, we show that if c ∈ C and if f is a complex-valued integrable function defined on X, we have
\[
\int_X cf\, d\mu = c \int_X f\, d\mu, \tag{5.3.5}
\]
which will complete the proof. If c ≥ 0, then (5.3.5) follows from the definition of the integral and from (5.2.3). If c = −1, then the result is again true since, for a real-valued function f, we have $(-f)^+ = f^-$ and $(-f)^- = f^+$, and we can again use the definition of the integral. Thus, (5.3.5) is true for all real constants c. The proof will be complete if we can prove the relation when c = i. Let f = u + iv be the decomposition of f into its real and imaginary parts. Then, by definition,
\[
\int_X if\, d\mu = \int_X (-v + iu)\, d\mu = -\int_X v\, d\mu + i \int_X u\, d\mu = i \int_X f\, d\mu.
\]
This completes the proof.
The next result is very important for estimating integrals.
Theorem 5.3.2 Let (X, S, µ) be a measure space and let f be a complex-valued integrable function defined on X. Then
\[
\left|\int_X f\, d\mu\right| \leq \int_X |f|\, d\mu. \tag{5.3.6}
\]
Proof: If z ∈ C, then $z = |z|e^{i\theta}$, where 0 ≤ θ < 2π. Thus, we can write |z| = αz, where |α| = 1. Let
\[
\left|\int_X f\, d\mu\right| = \alpha \int_X f\, d\mu,
\]
where |α| = 1. Let u denote the real part of the function αf. Then u ≤ |αf| = |f|. Then, by the preceding theorem, we have
\[
\left|\int_X f\, d\mu\right| = \alpha \int_X f\, d\mu = \int_X \alpha f\, d\mu = \int_X u\, d\mu \leq \int_X |f|\, d\mu.
\]
(In the above chain of equalities, we used the fact that the integral of αf is, in fact, a real quantity and so $\int_X \alpha f\, d\mu = \int_X u\, d\mu$.) This completes the proof.
We will now prove a result, which, without exaggeration, could be called the high point of the theory of the Lebesgue integral. It proves that we can interchange the processes of limits and integration under fairly simple conditions.
Theorem 5.3.3 (Dominated convergence theorem) Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of (complex-valued) integrable functions defined on X, converging pointwise to a function f. Assume, further, that for all x ∈ X, and for all n ∈ N, we have $|f_n(x)| \leq g(x)$, where g is a non-negative integrable function defined on X. Then, f is integrable. Further,
\[
\lim_{n \to \infty} \int_X |f_n - f|\, d\mu = 0. \tag{5.3.7}
\]
In particular, we have
\[
\lim_{n \to \infty} \int_X f_n\, d\mu = \int_X f\, d\mu. \tag{5.3.8}
\]
Proof: By the preceding theorem, (5.3.8) is an immediate consequence of (5.3.7).
Since we also have |f(x)| ≤ g(x), we deduce that f is integrable. Now, $|f_n - f| \leq 2g$ and so $2g - |f_n - f|$ is non-negative and converges to 2g as n → ∞. Consequently, by Fatou's lemma, we get
\[
2 \int_X g\, d\mu \leq \liminf_{n \to \infty} \int_X (2g - |f_n - f|)\, d\mu
= 2 \int_X g\, d\mu - \limsup_{n \to \infty} \int_X |f_n - f|\, d\mu.
\]
Since $\int_X g\, d\mu < +\infty$, we deduce from the above that
\[
0 \leq \liminf_{n \to \infty} \int_X |f_n - f|\, d\mu \leq \limsup_{n \to \infty} \int_X |f_n - f|\, d\mu \leq 0.
\]
This proves (5.3.7).
Remark 5.3.1 Since the integral of a function over a set of measure zero is zero, in all the convergence theorems proved up to now (say, the monotone convergence theorem and the dominated convergence theorem), the theorems remain valid even if we assume that $f_n \to f$ almost everywhere. If E is the set of measure zero where convergence fails, we can work with X\E in the proofs and the results remain valid for integrals over X since the addition of the integrals over E does not alter anything.
Example 5.3.1 Let X = N be equipped with the counting measure. Define
\[
f_n(k) = \begin{cases} \frac{1}{n}, & \text{if } 1 \leq k \leq n, \\ 0, & \text{if } k > n. \end{cases}
\]
Then $f_n \to f \equiv 0$ uniformly. However, while $\int_X f\, d\mu = 0$, we have $\int_X f_n\, d\mu = \frac{1}{n} \cdot n = 1$ for each n. This is because the $f_n$ are not bounded above by any integrable function.
Definition 5.3.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of integrable functions defined on X. Then we say that the sequence converges in the mean to an integrable function f if (5.3.7) holds.
Proposition 5.3.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of integrable functions defined on X, converging in the mean to an integrable function f defined on X. Then $f_n \xrightarrow{\mu} f$. In particular, there exists a subsequence $\{f_{n_k}\}$ which converges to f pointwise, almost everywhere.
Proof: Let ε > 0. Set
\[
E_n(\varepsilon) = \{x \in X \mid |f_n(x) - f(x)| \geq \varepsilon\}.
\]
Then,
\[
0 \leq \int_{E_n(\varepsilon)} |f_n - f|\, d\mu \leq \int_X |f_n - f|\, d\mu.
\]
It now follows from the definition of $E_n(\varepsilon)$ that (cf. (5.2.1)),
\[
\mu(E_n(\varepsilon)) \leq \frac{1}{\varepsilon} \int_X |f_n - f|\, d\mu,
\]
from which we deduce that $\mu(E_n(\varepsilon)) \to 0$ as n → ∞. This proves that $f_n \xrightarrow{\mu} f$. The other conclusion now follows from Proposition 4.2.2.
We now prove a generalization of the dominated convergence theorem.
Theorem 5.3.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on X converging pointwise, almost everywhere, to an integrable function f defined on X. Assume that for all x ∈ X, and for all n ∈ N, we have that $|f_n(x)| \le g_n(x)$, where $\{g_n\}_{n=1}^\infty$ is a sequence of non-negative integrable functions defined on X converging to a non-negative integrable function g almost everywhere in X. Finally, assume that
$$\lim_{n\to\infty}\int_X g_n\,d\mu = \int_X g\,d\mu.$$
Then, (5.3.8) holds.
Proof: Assume that the functions fn and f are all real-valued. By hypotheses, we have that for each n ∈ N, $g_n \pm f_n \ge 0$ and $g_n \pm f_n \to g \pm f$ as n → ∞. Applying Fatou’s lemma, we get
$$\int_X (g - f)\,d\mu \le \liminf_{n\to\infty}\int_X (g_n - f_n)\,d\mu = \int_X g\,d\mu - \limsup_{n\to\infty}\int_X f_n\,d\mu,$$
and
$$\int_X (g + f)\,d\mu \le \liminf_{n\to\infty}\int_X (g_n + f_n)\,d\mu = \int_X g\,d\mu + \liminf_{n\to\infty}\int_X f_n\,d\mu.$$
Since $\int_X g\,d\mu < +\infty$, we deduce that
$$\int_X f\,d\mu \le \liminf_{n\to\infty}\int_X f_n\,d\mu \le \limsup_{n\to\infty}\int_X f_n\,d\mu \le \int_X f\,d\mu,$$
from which (5.3.8) follows.
If the functions fn and f are complex valued, then the inequality $|f_n(x)| \le g_n(x)$ also implies that the same is valid for the sequences of the real and imaginary parts of the fn. Hence, the theorem applies to these two sequences, from which we easily deduce (5.3.8).

As a corollary of the preceding theorem, we have the following useful result.
Theorem 5.3.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on X converging pointwise (almost everywhere) to an integrable function f defined on X. Then, the sequence converges to f in the mean if, and only if,
$$\lim_{n\to\infty}\int_X |f_n|\,d\mu = \int_X |f|\,d\mu. \tag{5.3.9}$$
Proof: Assume that the sequence converges to f in the mean. Since
$$\bigl|\,|f_n| - |f|\,\bigr| \le |f_n - f|,$$
we see that $\{|f_n|\}_{n=1}^\infty$ converges in the mean to |f| and then (5.3.9) is an immediate consequence.
Conversely, assume that (5.3.9) holds. Define $F_n = |f_n - f|$, which converges to zero (almost everywhere). Let $G_n = |f_n| + |f|$ and $G = 2|f|$. Then $G_n$ and G are non-negative integrable functions, $|F_n| \le G_n$, and $G_n \to G$ as n → ∞. Finally, thanks to (5.3.9), we have that $\int_X G_n\,d\mu \to \int_X G\,d\mu$ as n → ∞. Consequently, by the preceding theorem, we have that $\int_X F_n\,d\mu \to 0$ as n → ∞, which is the same as saying that the sequence $\{f_n\}_{n=1}^\infty$ converges to f in the mean.

We will now see a few examples of application of the dominated convergence theorem.
Example 5.3.2 (Fourier transform) Let $\mathbb{R}^N$ be equipped with the Lebesgue measure, $m_N$. Given two vectors $x = (x_1, \cdots, x_N)$ and $\xi = (\xi_1, \cdots, \xi_N)$ in $\mathbb{R}^N$, we define the usual euclidean inner-product between them by
$$x.\xi = \sum_{i=1}^N x_i \xi_i.$$
Let $f : \mathbb{R}^N \to \mathbb{R}$ be an integrable function. The Fourier transform of the function f, denoted $\hat{f}$, is defined by
$$\hat{f}(\xi) = \int_{\mathbb{R}^N} e^{-2\pi i x.\xi} f(x)\,dm_N(x).$$
By Theorem 5.3.2, we have
$$|\hat{f}(\xi)| \le \int_{\mathbb{R}^N} |f|\,dm_N < +\infty,$$
since the exponential has absolute value equal to unity. Thus $\hat{f}$ is a well-defined and bounded function. Let $\{\xi^{(n)}\}_{n=1}^\infty$ be a sequence in $\mathbb{R}^N$ converging to a vector ξ. Then, since the exponential is continuous, we have that
$$e^{-2\pi i \xi^{(n)}.x} f(x) \to e^{-2\pi i \xi.x} f(x)$$
as n → ∞. Further,
$$\bigl|e^{-2\pi i \xi^{(n)}.x} f(x)\bigr| \le |f(x)|$$
for all $x \in \mathbb{R}^N$ and for all n ∈ N. Since f is integrable, it now follows, from the dominated convergence theorem, that
$$\lim_{n\to\infty} \hat{f}(\xi^{(n)}) = \hat{f}(\xi).$$
Thus, the Fourier transform of an integrable function is a continuous and bounded function.

Example 5.3.3 Let $\{a_{ij}\}_{i,j=1}^\infty$ be a double sequence of real numbers.
Assume that, for each j ∈ N,
$$\sum_{i=1}^\infty |a_{ij}| \le b_j,$$
where $\sum_{j=1}^\infty b_j < +\infty$.
Let X = N be equipped with the counting measure. Define
$$f_i(j) = a_{ij},\ j \in \mathbb{N}, \quad\text{and}\quad f = \sum_{i=1}^\infty f_i.$$
Thus,
$$f(j) = \sum_{i=1}^\infty a_{ij},\ j \in \mathbb{N}.$$
Define, for j ∈ N, $g(j) = b_j$. Then g is a non-negative integrable function. By hypothesis, for each n ∈ N,
$$\Bigl|\sum_{i=1}^n f_i(j)\Bigr| \le \sum_{i=1}^\infty |f_i(j)| = \sum_{i=1}^\infty |a_{ij}| \le b_j = g(j).$$
Since $\sum_{i=1}^n f_i \to f$, we have, by the dominated convergence theorem,
$$\lim_{n\to\infty}\sum_{i=1}^n \sum_{j=1}^\infty a_{ij} = \lim_{n\to\infty}\sum_{j=1}^\infty \sum_{i=1}^n a_{ij} = \sum_{j=1}^\infty f(j) = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$
In other words, we have
$$\sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij} = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$
Thus, the given conditions are sufficient to ensure that the order of summation can be reversed when the double sequence is not necessarily non-negative (cf. Example 5.2.3).

Proposition 5.3.2 Let (X, S, µ) be a measure space and let f be an integrable function defined on X. Then, given ε > 0, there exists δ > 0 such that, whenever µ(E) < δ, E ∈ S, we have
$$\int_E |f|\,d\mu < \varepsilon. \tag{5.3.10}$$
Proof: Step 1: Let us assume that f is bounded. Let |f(x)| ≤ M for (almost) every x ∈ X. Then, if E ∈ S, we have
$$\int_E |f|\,d\mu \le M\mu(E).$$
The result follows on choosing $\delta < \frac{\varepsilon}{M}$.
Step 2: Given an arbitrary integrable function f, define, for n ∈ N,
$$f_n(x) = \begin{cases} |f(x)|, & \text{if } |f(x)| \le n,\\ n, & \text{if } |f(x)| > n. \end{cases}$$
Then $\{f_n\}_{n=1}^\infty$ is a non-negative sequence of bounded functions increasing to |f|. Thus, by the monotone convergence theorem, there exists N ∈ N such that, for all n ≥ N, we have
$$\int_X \bigl|\,|f| - f_n\bigr|\,d\mu = \int_X |f|\,d\mu - \int_X f_n\,d\mu < \frac{\varepsilon}{2}.$$
Hence, for every E ∈ S, we have, for all n ≥ N,
$$\int_E |f|\,d\mu - \int_E f_n\,d\mu = \int_E \bigl|\,|f| - f_n\bigr|\,d\mu \le \int_X \bigl|\,|f| - f_n\bigr|\,d\mu < \frac{\varepsilon}{2}.$$
Since $f_N$ is bounded, choose, by Step 1, a δ > 0 such that, whenever µ(E) < δ, E ∈ S, we have
$$\int_E f_N\,d\mu < \frac{\varepsilon}{2}.$$
Thus, for every set E ∈ S such that µ(E) < δ, (5.3.10) holds.

Remark 5.3.2 Let (X, S, µ) be a measure space. We know that (cf. Proposition 5.2.7) if f is an integrable function defined on X, then
$$\nu(E) = \int_E |f|\,d\mu,\quad E \in S,$$
defines a measure on S. The above proposition states that if ε > 0 is given, we can find δ > 0 such that ν(E) < ε whenever µ(E) < δ. We say that the measure ν is absolutely continuous with respect to the measure µ. The Radon-Nikodym theorem (which we shall see much later) states that, for σ-finite measure spaces, every measure ν which is absolutely continuous with respect to a measure µ occurs in this form (see also Remark 5.2.5).

5.4 The Riemann and Lebesgue integrals
On the real line, R, we have two notions of integrals. The first is that of
the Riemann integral, defined primarily for bounded functions defined
over bounded intervals. If f : [a, b] → R is a bounded function, then the
Riemann integral, if it exists, is denoted by the symbol
$$\int_a^b f(x)\,dx.$$
The integral when f and/or the interval is unbounded, is then defined by appropriate limit processes and, again, may or may not exist. When the integral exists, it is always a real number.
The Lebesgue integral, on the other hand, is defined for all non-negative measurable functions defined on any (Lebesgue) measurable subset of R. The value of the integral may be finite or infinite. If it is finite, we say the function is Lebesgue integrable. The Lebesgue integral is then defined for any measurable function such that |f| is Lebesgue integrable. In this case, of course, the integral will be again a real number. The Lebesgue integral of a non-negative or an integrable function, f, over a (Lebesgue) measurable set E, is denoted by the symbol
$$\int_E f\,dm_1.$$
We have seen, in the Preamble, that there exist bounded functions
which are not Riemann integrable. An example is that of the characteristic function of the rationals in the interval [0, 1]. On the other hand,
the set of rationals is countable and hence is a measurable set of measure
zero. Thus, its characteristic function is integrable and the value of the
Lebesgue integral is zero.
We now ask ourselves if a Riemann integrable function is always
Lebesgue integrable. Our aim in this section is to show that this is indeed the case when the function and the interval are both bounded and
that, in fact, the two theories yield the same value for the integral.
Let [a, b] be a finite interval in R. Let
P = {a = x0 < x1 < · · · < xn = b}
be a partition of the interval [a, b]. The points $\{x_i\}_{i=0}^n$ are called the nodes of the partition. The mesh size of the partition, denoted $\Delta(\mathcal{P})$, is defined as follows:
$$\Delta(\mathcal{P}) = \max_{1 \le i \le n} (x_i - x_{i-1}).$$
A partition $\mathcal{P}'$ is said to be a refinement of $\mathcal{P}$ if the nodes of $\mathcal{P}$ form a subset of those of $\mathcal{P}'$.
Let f : [a, b] → R be a bounded function. Let us consider a sequence $\{\mathcal{P}_k\}_{k=1}^\infty$ of partitions of the interval [a, b] such that, for each k, we have that $\mathcal{P}_{k+1}$ is a refinement of $\mathcal{P}_k$ and such that $\Delta(\mathcal{P}_k) \to 0$, as k → ∞. If
$$\mathcal{P}_k = \{a = x_0 < x_1 < \cdots < x_n = b\},$$
define the functions $U_k$ and $L_k$ as follows:
$$U_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n M_i \chi_{(x_{i-1},x_i]}, \quad\text{and}\quad L_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n m_i \chi_{(x_{i-1},x_i]},$$
where, for 1 ≤ i ≤ n,
$$M_i = \sup_{[x_{i-1},x_i]} f(x), \quad\text{and}\quad m_i = \inf_{[x_{i-1},x_i]} f(x).$$
Then the upper and lower (Darboux) sums associated to f and this partition (cf. Preamble) are given by
$$U(\mathcal{P}_k, f) = \int_{[a,b]} U_k\,dm_1, \quad\text{and}\quad L(\mathcal{P}_k, f) = \int_{[a,b]} L_k\,dm_1. \tag{5.4.1}$$
Since, for each k ∈ N, we have that the partition $\mathcal{P}_{k+1}$ is a refinement of the partition $\mathcal{P}_k$, it follows that, for each x ∈ [a, b],
$$L_1(x) \le L_2(x) \le \cdots \le f(x) \le \cdots \le U_2(x) \le U_1(x). \tag{5.4.2}$$
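The behaviour of the upper and lower Darboux sums along a sequence of refinements can be observed numerically. The following is a minimal sketch, under the assumption that f is increasing (so that the supremum and infimum on each subinterval are attained at the end-points) and that the partitions are the dyadic ones, each of which refines the previous one. The names `darboux_sums_increasing` and `square` are illustrative and do not come from the text.

```python
# Darboux sums of an increasing function on nested dyadic partitions.
# Assumption: f is increasing, so on [x_{i-1}, x_i] we have
# M_i = f(x_i) and m_i = f(x_{i-1}).

def darboux_sums_increasing(f, a, b, k):
    """Return (L(P_k, f), U(P_k, f)) for the partition of [a, b] into 2**k equal pieces."""
    n = 2 ** k
    h = (b - a) / n
    nodes = [a + i * h for i in range(n + 1)]
    lower = sum(f(nodes[i - 1]) * h for i in range(1, n + 1))
    upper = sum(f(nodes[i]) * h for i in range(1, n + 1))
    return lower, upper


def square(x):
    return x * x


# Each dyadic partition refines the previous one, as required in the text.
sums = [darboux_sums_increasing(square, 0.0, 1.0, k) for k in range(1, 11)]
for k, (lower, upper) in enumerate(sums, start=1):
    print(k, lower, upper, upper - lower)
```

For f(x) = x² on [0, 1] the lower sums increase, the upper sums decrease, and their difference is (b − a)(f(b) − f(a))/2^k, so both squeeze the common value 1/3 of the Riemann and Lebesgue integrals.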
Theorem 5.4.1 Let f : [a, b] → R be a bounded function which is Riemann integrable. Then, it is also Lebesgue integrable and
$$\int_{[a,b]} f\,dm_1 = \int_a^b f(x)\,dx.$$
Proof: With the notations established above, the sequence $\{U_k(x)\}_{k=1}^\infty$ is monotonic decreasing and bounded below and the sequence $\{L_k(x)\}_{k=1}^\infty$ is monotonic increasing and bounded above, for each x ∈ [a, b]. Thus both sequences are convergent. Let their respective limits be U(x) and L(x). Then
$$L(x) \le f(x) \le U(x),\quad x \in [a,b].$$
Since f is bounded, assume that |f(x)| ≤ M for all x ∈ [a, b]. Then, for all x ∈ [a, b], and for all k ∈ N, we also have $|L_k(x)| \le M$ and $|U_k(x)| \le M$. Consequently, by the dominated convergence theorem, we have
$$\lim_{k\to\infty}\int_{[a,b]} U_k\,dm_1 = \int_{[a,b]} U\,dm_1, \quad\text{and}\quad \lim_{k\to\infty}\int_{[a,b]} L_k\,dm_1 = \int_{[a,b]} L\,dm_1. \tag{5.4.3}$$
Since f is Riemann integrable, the upper and lower Darboux sums converge to the Riemann integral of f. Thus, in view of (5.4.1), we get
$$\int_{[a,b]} U\,dm_1 = \int_{[a,b]} L\,dm_1 = \int_a^b f(x)\,dx.$$
But U ≥ L and so, by the above result, it follows from Proposition 5.2.2 that U(x) = L(x) = f(x) almost everywhere. Thus f is Lebesgue integrable and
$$\int_{[a,b]} f\,dm_1 = \int_{[a,b]} U\,dm_1 = \int_{[a,b]} L\,dm_1 = \int_a^b f(x)\,dx.$$
This completes the proof.
Theorem 5.4.2 Let f : [a, b] → R be a bounded function. Then f is
Riemann integrable if, and only if, it is continuous almost everywhere.
Proof: With the preceding notations, assume that x ∈ [a, b] is not a
node of any of the partitions Pk , k ∈ N. (The set of all nodes is countable
and hence is of measure zero.) Now, f is continuous at such a point x
if, and only if,
U (x) = f (x) = L(x).
Thus, from the proof of the preceding theorem, we see that if f is a
bounded and Riemann integrable function, then it is continuous almost
everywhere.
Conversely, assume that f : [a, b] → R is a bounded function which
is continuous almost everywhere. Then U = f = L almost everywhere.
Let ε > 0 be given. Then, by virtue of (5.4.1) and (5.4.3), it follows that
we can find k, sufficiently large, such that
$$\int_{[a,b]} U_k\,dm_1 - \int_{[a,b]} L_k\,dm_1 < \varepsilon,$$
i.e.
$$|U(\mathcal{P}_k, f) - L(\mathcal{P}_k, f)| < \varepsilon,$$
which proves that f is Riemann integrable.

Example 5.4.1 The result of Theorem 5.4.1 is not true for unbounded intervals. Consider the interval (0, ∞) ⊂ R. Let
$$f(x) = \frac{\sin x}{x}.$$
It is well known (a standard exercise on contour integration) that the Riemann integral $\int_0^\infty f(x)\,dx$ is well-defined and that its value is, in fact, $\frac{\pi}{2}$ (cf., for example, Ahlfors [1]). However, f is not Lebesgue integrable. To see this, consider the intervals
$$I_n = \Bigl[n\pi + \frac{\pi}{4},\ n\pi + \frac{\pi}{2}\Bigr],\quad n \in \mathbb{N},$$
which are all disjoint. On $I_n$, we have
$$|\sin x| \ge \frac{1}{\sqrt{2}} \quad\text{and}\quad x = |x| \le (2n+1)\frac{\pi}{2}.$$
Thus, for $x \in I_n$, we have
$$\Bigl|\frac{\sin x}{x}\Bigr| \ge \frac{\sqrt{2}}{\pi}\,\frac{1}{2n+1}.$$
Thus, for any N ∈ N, we have
$$\int_{(0,\infty)} |f|\,dm_1 \ge \sum_{n=1}^N \int_{[n\pi+\frac{\pi}{4},\,n\pi+\frac{\pi}{2}]} |f|\,dm_1 \ge \frac{1}{2\sqrt{2}}\sum_{n=1}^N \frac{1}{2n+1},$$
and the sum on the right becomes arbitrarily large, for large N, since it is a partial sum of a divergent series.

We can use Theorem 5.4.1 to study the Lebesgue integrability of functions on subsets of the real line.
Example 5.4.2 Let $f(x) = \frac{1}{\sqrt{x}}$ on the interval (0, 1). This is a non-negative function and its Lebesgue integral is well-defined. Define
$$f_n(x) = \begin{cases} 0, & \text{if } x \in (0, \frac{1}{n}),\\ \frac{1}{\sqrt{x}}, & \text{if } x \in [\frac{1}{n}, 1). \end{cases}$$
Then the sequence of non-negative functions $\{f_n\}_{n=1}^\infty$ increases to f and so
$$\int_{(0,1)} f\,dm_1 = \lim_{n\to\infty}\int_{(0,1)} f_n\,dm_1.$$
Now
$$\int_{(0,1)} f_n\,dm_1 = \int_{[\frac{1}{n},1)} f\,dm_1.$$
But, on the interval $[\frac{1}{n}, 1)$, the function f is a bounded and continuous function. Hence it is Riemann integrable and we can easily calculate the Riemann integral using the familiar rules of the calculus as
$$\int_{\frac{1}{n}}^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_{\frac{1}{n}}^{1} = 2\Bigl(1 - \frac{1}{\sqrt{n}}\Bigr).$$
Thus, it follows that
$$\int_{(0,1)} f\,dm_1 = 2$$
and so f is integrable over (0, 1).
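The computation in Example 5.4.2 can be checked numerically. The sketch below compares the closed form 2(1 − 1/√n) of the truncated integral with a midpoint Riemann sum on [1/n, 1) and lists the values increasing towards 2. The function names and the choice of the midpoint rule are assumptions made for this illustration only.

```python
import math


def truncated_integral(n):
    """Closed form of the integral of 1/sqrt(x) over [1/n, 1), as in Example 5.4.2."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(n))


def midpoint_riemann_sum(n, pieces=20_000):
    """Midpoint Riemann sum of 1/sqrt(x) over [1/n, 1]; a numerical cross-check."""
    a, b = 1.0 / n, 1.0
    h = (b - a) / pieces
    return sum(h / math.sqrt(a + (i + 0.5) * h) for i in range(pieces))


# The integrals of f_n increase to the Lebesgue integral of f, which is 2.
values = [truncated_integral(n) for n in (4, 100, 10_000, 1_000_000)]
print(values)
print(midpoint_riemann_sum(4), truncated_integral(4))
```

The monotone convergence theorem is what justifies passing from these truncated integrals to the integral over (0, 1); the code only illustrates the resulting numbers.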
Example 5.4.3 Let
$$f(x) = \begin{cases} \bigl(\frac{\sin x}{x}\bigr)^2, & \text{if } x \in (0, \infty),\\ 1, & \text{if } x = 0. \end{cases}$$
Again, this is a non-negative function and its Lebesgue integral is well-defined. We can write (using Proposition 5.2.7)
$$\int_{[0,\infty)} f\,dm_1 = \int_{[0,1]} f\,dm_1 + \int_{(1,\infty)} f\,dm_1.$$
On the interval [0, 1], we have that f is a bounded and continuous function and so it is Riemann and hence Lebesgue integrable. Now, set
$$f_n(x) = \frac{1}{x^2}\chi_{(1,n)}(x).$$
Then for each x ∈ (1, ∞), $f_n(x)$ increases to $\frac{1}{x^2}$. Since $f_n$ is a bounded and continuous function on the interval (1, n), it is Riemann integrable there. Consequently,
$$\int_{(1,\infty)} \frac{1}{x^2}\,dm_1(x) = \lim_{n\to\infty}\int_1^n \frac{1}{x^2}\,dx = \lim_{n\to\infty}\Bigl(1 - \frac{1}{n}\Bigr) = 1.$$
Thus, the function $x \mapsto \frac{1}{x^2}$ is integrable on the interval (1, ∞). Since
$$\Bigl(\frac{\sin x}{x}\Bigr)^2 \le \frac{1}{x^2},$$
it follows that f is integrable on (1, ∞) as well and, therefore, f is integrable on [0, ∞).

5.5 Weierstrass’ theorem
In this section, we will prove the famous theorem of Weierstrass, on
the uniform approximation of a continuous function on a compact interval by means of polynomials, using the notion of the Lebesgue integral.
Without loss of generality, we work on the interval [0, 1].
If x0 ∈ [0, 1], we denote the Dirac measure concentrated at x0 by δx0 .
Let t ∈ [0, 1] and n ∈ N be fixed. Let X = [0, 1]. On the σ-algebra of all subsets of X, define the measure
$$\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1-t)^{n-k}\,\delta_{\frac{k}{n}},$$
where
$$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
is the usual binomial coefficient.
Let $f_i(x) = x^i$, i = 0, 1, 2. Now,
$$\mu_n^t(X) = \int_X f_0\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1-t)^{n-k} = 1.$$
Next,
$$\int_X f_1\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1-t)^{n-k}\,\frac{k}{n} = t\sum_{k=1}^n \binom{n-1}{k-1} t^{k-1} (1-t)^{(n-1)-(k-1)} = t.$$
In the same way, a similar computation yields
$$\int_X f_2\,d\mu_n^t = \frac{1}{n}\bigl((n-1)t^2 + t\bigr).$$
Setting $f(x) = (x-t)^2 = f_2(x) - 2t f_1(x) + t^2 f_0(x)$, we get, on simplification,
$$\int_X f\,d\mu_n^t = \frac{t - t^2}{n}. \tag{5.5.1}$$
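The three moment computations leading to (5.5.1) can be verified in exact rational arithmetic. The following is a small sketch, with the illustrative names `bernstein_weights` and `integrate` (not taken from the text); integration against the measure reduces to a finite weighted sum over the points k/n, and the values n = 12, t = 1/3 are arbitrary sample choices.

```python
from fractions import Fraction
from math import comb


def bernstein_weights(n, t):
    """Masses that the measure places at the points k/n, k = 0, ..., n."""
    return [comb(n, k) * t**k * (1 - t) ** (n - k) for k in range(n + 1)]


def integrate(g, n, t):
    """Integral of g with respect to the measure: a finite weighted sum."""
    return sum(w * g(Fraction(k, n)) for k, w in enumerate(bernstein_weights(n, t)))


n, t = 12, Fraction(1, 3)
total_mass = integrate(lambda x: 1, n, t)
first_moment = integrate(lambda x: x, n, t)
second_moment = integrate(lambda x: x * x, n, t)
variance = integrate(lambda x: (x - t) ** 2, n, t)

print(total_mass, first_moment, second_moment, variance)
```

Since `Fraction` arithmetic is exact, the printed values agree with 1, t, ((n − 1)t² + t)/n and (t − t²)/n with no rounding error.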
Lemma 5.5.1 Let t ∈ [0, 1] and n ∈ N be fixed. Let ε > 0 be given. Set
$$A_\varepsilon = \{x \in X \mid |x - t| \ge \varepsilon\}.$$
Then $\mu_n^t(A_\varepsilon)$ converges uniformly to zero (with respect to t) as n → ∞.
Proof: By the definition of the set $A_\varepsilon$, we get
$$\varepsilon^2 \mu_n^t(A_\varepsilon) \le \int_{A_\varepsilon} (x-t)^2\,d\mu_n^t(x) \le \int_X (x-t)^2\,d\mu_n^t(x) \le \frac{1}{4n},$$
using (5.5.1) and the fact that $t(1-t) \le \frac{1}{4}$ when t ∈ [0, 1]. This completes the proof.
Lemma 5.5.2 Let f ∈ C[0, 1]. Let t ∈ [0, 1]. Then
$$\lim_{n\to\infty}\int_X f\,d\mu_n^t = f(t),$$
and the limit is uniform with respect to t.
Proof: Since f is continuous over a compact interval, it is uniformly continuous. Given ε > 0, let δ > 0 be such that whenever |x − y| < δ, we have |f(x) − f(y)| < ε. Set
$$A_\delta = \{x \in X \mid |x - t| \ge \delta\}.$$
Now, since $\mu_n^t(X) = 1$, we have
$$\Bigl|\int_X f\,d\mu_n^t - f(t)\Bigr| = \Bigl|\int_X (f(x) - f(t))\,d\mu_n^t(x)\Bigr| \le \int_X |f(x) - f(t)|\,d\mu_n^t(x).$$
We can write
$$\int_X |f(x) - f(t)|\,d\mu_n^t(x) = I_1 + I_2,$$
where
$$I_1 = \int_{A_\delta} |f(x) - f(t)|\,d\mu_n^t(x), \quad\text{and}\quad I_2 = \int_{A_\delta^c} |f(x) - f(t)|\,d\mu_n^t(x).$$
Let $M = \max_{x\in[0,1]} |f(x)|$. Then, by Lemma 5.5.1 (applied with δ in place of ε), we have
$$I_1 \le 2M\mu_n^t(A_\delta) \le \frac{2M}{4n\delta^2}.$$
On the other hand, for $x \in A_\delta^c$, we have that |f(x) − f(t)| < ε and so
$$I_2 \le \varepsilon\mu_n^t(A_\delta^c) \le \varepsilon\mu_n^t(X) \le \varepsilon.$$
Now, given any η > 0, choose $\varepsilon < \frac{\eta}{2}$, fix the corresponding δ, and choose N ∈ N such that, for all n ≥ N, we have
$$\frac{2M}{4n\delta^2} < \frac{\eta}{2}.$$
Then, for all n ≥ N and for all t ∈ [0, 1], we have
$$\Bigl|\int_X f\,d\mu_n^t - f(t)\Bigr| < \eta,$$
which completes the proof.

Remark 5.5.1 In the above proof, to estimate the integral of |f(x) − f(t)|, we split the integral over two sets. On one of the sets ($A_\delta^c$), we had information which controlled the integrand while on the other ($A_\delta$), we had minimal information on the integrand, but the measure of the set was small. This method of ‘divide and rule’ is often useful in estimating integrals.

Now, by the definition of the measure $\mu_n^t$, we get
$$\int_X f\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1-t)^{n-k} f\Bigl(\frac{k}{n}\Bigr), \tag{5.5.2}$$
which is a polynomial in t converging uniformly to f(t). Thus, we have proved the following theorem:
Theorem 5.5.1 (Weierstrass approximation theorem) Every continuous function on a compact interval can be uniformly approximated by a sequence of polynomials.
The polynomials occurring on the right-hand side of (5.5.2) are called the Bernstein polynomials.
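The uniform convergence of the Bernstein polynomials can be observed on a simple example. The sketch below evaluates the right-hand side of (5.5.2) for the continuous function |x − 1/2| and estimates the uniform error on a finite grid. The grid only approximates the supremum over [0, 1], and the names `bernstein`, `sup_error` and `kink` are hypothetical choices for this illustration.

```python
from math import comb


def bernstein(f, n, t):
    """Value at t of the n-th Bernstein polynomial of f, as in (5.5.2)."""
    return sum(comb(n, k) * t**k * (1 - t) ** (n - k) * f(k / n) for k in range(n + 1))


def sup_error(f, n, grid=201):
    """Maximum of |B_n f - f| over an equally spaced grid in [0, 1] (an estimate only)."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, t) - f(t)) for t in points)


def kink(x):
    # Continuous on [0, 1] but not differentiable at 1/2.
    return abs(x - 0.5)


errors = {n: sup_error(kink, n) for n in (10, 40, 160)}
for n, err in errors.items():
    print(n, err)
```

The errors decrease slowly as n grows, which is consistent with the bound obtained in the proof of Lemma 5.5.2: the guaranteed rate depends on the modulus of continuity of f through the choice of δ.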
5.6
Probability
We are now in a position to indicate the connections between the theory
of probability and that of measure and integration.
A probability space is a measure space $(\Omega, \mathcal{B}, p)$ such that p(Ω) = 1. The set Ω is called a sample space and the σ-algebra $\mathcal{B}$ is said to be the collection of events. Thus, the measure p(A) of a set $A \in \mathcal{B}$ is called the probability of the event A.
Let $B \in \mathcal{B}$ be such that p(B) > 0. Consider the σ-algebra of subsets of B, given by
$$\mathcal{B}_B = \{A \cap B \mid A \in \mathcal{B}\}.$$
We define the measure $p_B$ on $\mathcal{B}_B$ by
$$p_B(A \cap B) = \frac{p(A \cap B)}{p(B)},\quad A \in \mathcal{B},$$
so that $p_B(B) = 1$. Then $(B, \mathcal{B}_B, p_B)$ is a probability space. The conditional probability of an event A, given B, denoted p(A | B), is nothing but $p_B(A \cap B)$. The events A and B are said to be independent if p(A | B) = p(A). In this case, we deduce that
$$p(A \cap B) = p(A)p(B).$$
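These definitions can be made concrete on a finite sample space. The following sketch assumes a fair six-sided die, with every subset of Ω = {1, ..., 6} being an event and p the normalised counting measure; the event names are illustrative.

```python
from fractions import Fraction

# Sample space of a fair die; every subset is an event.
omega = frozenset(range(1, 7))


def p(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))


def conditional(a, b):
    """p(a | b) = p(a ∩ b) / p(b); requires p(b) > 0."""
    return p(a & b) / p(b)


even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
at_most_three = frozenset({1, 2, 3})

print(conditional(even, at_most_four), p(even))   # equal: the events are independent
print(conditional(even, at_most_three), p(even))  # different: the events are not independent
```

In the first case p(A ∩ B) = 1/3 = p(A)p(B), while in the second p(A ∩ B) = 1/6 differs from p(A)p(B) = 1/4.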
A random variable, X, on Ω, is a measurable real-valued function
defined on Ω. The expected value (also called expectation or mean)
of the random variable X, denoted E(X), is defined by
$$E(X) = \int_\Omega X\,dp.$$
Pointwise convergence, almost everywhere, of a sequence of random variables is referred to as convergence almost surely and convergence in measure is referred to as convergence in probability.
A distinguishing feature of probability theory, which does not have
a parallel in the theory of measure and integration, is the study of independent and identically distributed random variables.
Two random variables X and Y defined on Ω are said to be independent if for any pair of Borel sets in R, say, A and B, we have
$$p(X^{-1}(A) \cap Y^{-1}(B)) = p(X^{-1}(A))\,p(Y^{-1}(B)).$$
The distribution function of a random variable X defined on Ω is defined by
$$F(t) = p(X^{-1}((-\infty, t])).$$
In other words, it is the probability that the random variable takes a value less than, or equal to t. Two random variables are said to be identically distributed if they have the same distribution function.
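As an illustration of the notions of expectation, distribution function and independence, consider two throws of a fair die. The sketch below takes Ω to be the set of 36 ordered pairs with the uniform probability, and X, Y to be the outcomes of the first and second throws; these choices, and all the function names, are assumptions made only for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of outcomes of two throws of a fair die.
omega = list(product(range(1, 7), repeat=2))


def p(event):
    """Probability of {w : event(w) is true} under the uniform measure."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def X(w):
    return w[0]


def Y(w):
    return w[1]


def expectation(Z):
    """E(Z): the integral of Z over omega, here a finite sum."""
    return sum(Fraction(Z(w), len(omega)) for w in omega)


def distribution(Z, t):
    """F(t) = p(Z <= t)."""
    return p(lambda w: Z(w) <= t)


print(expectation(X), [distribution(X, t) for t in range(0, 8)])
```

Here X and Y have the same distribution function, and events of the form {X ≤ s} and {Y ≤ t} satisfy the product rule, so the two random variables are independent and identically distributed.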
5.7 Exercises
5.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of measurable functions defined on X converging to a measurable function f almost everywhere. Assume that
$$f_1 \ge f_2 \ge \cdots \ge f_n \ge \cdots \ge 0,$$
and that $f_1$ is integrable. Show that
$$\lim_{n\to\infty}\int_X f_n\,d\mu = \int_X f\,d\mu.$$
5.2 Let (X, S, µ) be a measure space such that µ(X) < +∞. Let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on X converging uniformly to a function f, on X. Show that
$$\lim_{n\to\infty}\int_X f_n\,d\mu = \int_X f\,d\mu.$$
(This is not true, in general, if µ(X) = +∞; cf. Example 5.3.1.)
5.3 Let f : R → R be integrable. Let t ∈ R be fixed. Define g(x) = f(x + t), x ∈ R. If [a, b] ⊂ R is an interval, show that
$$\int_{[a,b]} g\,dm_1 = \int_{[a+t,b+t]} f\,dm_1.$$
5.4 Let f : [0, 1] × [0, 1] → R be (Lebesgue) measurable as a function of x, for each fixed t, and let g : [0, 1] → R be an integrable function. Assume that for each (x, t) ∈ [0, 1] × [0, 1],
$$|f(x,t)| \le g(x).$$
If $\lim_{t\to 0} f(x,t) = h(x)$, show that
$$\lim_{t\to 0}\int_{[0,1]} f(x,t)\,dm_1(x) = \int_{[0,1]} h\,dm_1.$$
5.5 (Differentiation under the integral sign) Let f : [0, 1] × [0, 1] → R be a function such that, for each t ∈ [0, 1], the function $x \mapsto f(x,t)$ is integrable and for each x ∈ [0, 1], the map $t \mapsto f(x,t)$ is differentiable and that the derivative $\frac{\partial f}{\partial t}(x,t)$ is uniformly bounded. Show that
$$\frac{d}{dt}\int_{[0,1]} f(x,t)\,dm_1(x) = \int_{[0,1]} \frac{\partial f}{\partial t}(x,t)\,dm_1(x).$$
5.6 In each of the following cases, check the function f for (Lebesgue) integrability over the indicated domain.
(a) $f(x) = \frac{1}{1+x^2}$ on R.
(b) $f(x) = \frac{1}{x}$ on (0, 1).
(c) $f(x) = \frac{1}{x}\sin\frac{1}{x}$ on (0, 1).
5.7 Show that the function defined by
$$f(x) = \frac{x^{n-1}}{(1+x^2)^k}$$
is integrable over (0, ∞) if $k > \frac{n}{2}$.
5.8 Let (X, S, µ) be a measure space. Show that the dominated convergence theorem is true if we replace ‘fn → f almost everywhere’ by ‘$f_n \xrightarrow{\mu} f$’.
5.9 (a) Let f : [0, ∞) → R be uniformly continuous. If f is integrable, show that $\lim_{x\to+\infty} f(x) = 0$.
(b) Show by means of an example that this result is not true if we replace ‘uniformly continuous’ by ‘continuous’.
5.10 Let f ∈ C[0, 1]. Assume that, for each n ∈ N, we have
$$\int_0^1 x^n f(x)\,dx = 0.$$
Show that f ≡ 0.
5.11 Let (X, S, µ) be a measure space and let f be a non-negative integrable function defined on X. Let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to f. Show that
$$\lim_{n\to\infty}\int_X |\varphi_n - f|\,d\mu = 0.$$
5.12 (Korovkin’s theorem) Let ϕ : [0, +∞) → [0, +∞) be a continuous function such that ϕ(t) > 0 for t > 0. Let X = [0, 1] and let $\{\mu_n^x\}_{n\in\mathbb{N},\,x\in[0,1]}$ be a collection of finite Borel measures (cf. Definition 2.1.1) on X. Define, for x ∈ X,
$$\psi_n(x) = \int_X \varphi(|x - y|)\,d\mu_n^x(y).$$
Assume that (i) $\mu_n^x(X) \to 1$, uniformly with respect to x, as n → ∞, and (ii) $\psi_n \to 0$, as n → ∞, uniformly on [0, 1]. Show that, for any f ∈ C[0, 1], we have
$$\int_X f(y)\,d\mu_n^x(y) \to f(x),\ \text{as } n \to \infty,$$
uniformly on X.
5.13 Show that the Weierstrass approximation theorem, as proved in Section 5.5, is a particular case of Korovkin’s theorem, as stated above.
5.14 Let (X, S, µ) be a measure space such that µ(X) = 1. Let g : R → R be a bounded and uniformly continuous function. Let $\{f_n\}_{n=1}^\infty$ be a sequence of measurable functions defined on X such that $f_n \xrightarrow{\mu} f$, where f is a measurable function defined on X. Show that
$$\lim_{n\to\infty}\int_X g \circ f_n\,d\mu = \int_X g \circ f\,d\mu,$$
where, for x ∈ X, and for any measurable function h defined on X,
$$(g \circ h)(x) = g(h(x)).$$
5.15 Let (X, S, µ) be a measure space and let M denote the collection of all equivalence classes of real-valued measurable functions defined on X, modulo equality almost everywhere. If f : X → R is a measurable function, denote the equivalence class containing f by $\bar{f}$. Let ϕ : [0, +∞) → [0, 1] be a strictly monotonic increasing continuous function such that ϕ(0) = 0. Assume, further, that, for all x, y ∈ [0, +∞), we have
$$\varphi(x + y) \le \varphi(x) + \varphi(y).$$
Let µ(X) = 1. For $\bar{f}, \bar{g} \in M$, define
$$d(\bar{f}, \bar{g}) = \int_X \varphi(|f - g|)\,d\mu.$$
(a) Show that d(·, ·) is well-defined.
(b) Show that d(·, ·) defines a metric on M.
(c) Show that a sequence $\{\bar{f}_n\}_{n=1}^\infty$ converges to $\bar{f}$ with respect to this metric if, and only if, $f_n \xrightarrow{\mu} f$.
(d) Show that the function
$$\varphi(x) = \frac{x}{1+x},\quad 0 \le x < +\infty,$$
satisfies the conditions stated above.
Chapter 6
Differentiation

6.1 Monotonic functions
One of the important features of differential and integral calculus is that differentiation and integration are essentially two sides of the same coin. More precisely, the fundamental theorem of calculus states that, if f is a Riemann integrable function which is the derivative of a function F on an interval [a, b], then
$$F(b) - F(a) = \int_a^b f(x)\,dx. \tag{6.1.1}$$
We would like to investigate how far such a result is true when we deal with functions which are only differentiable almost everywhere and with the Lebesgue integral of the derivative of that function.
Consider the Cantor function (cf. Section 3.2), f, defined on the interval [0, 1]. It is continuous and monotonic increasing, with f(0) = 0 and f(1) = 1. If C is the Cantor set, then f is constant on each subinterval of $C^c$. Thus, f is differentiable almost everywhere and its derivative, wherever it exists, is zero. Thus we have f(1) − f(0) = 1, while the integral of the derivative of f vanishes. In other words, (6.1.1) is not true for this function.
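The behaviour of the Cantor function described above can be explored with a short computation. The sketch below uses the standard description via ternary digits (read the digits of x up to the first digit 1, replace every 2 by 1, and interpret the result in base 2), which is assumed here to agree with the construction of Section 3.2. The expansion is truncated at a fixed depth, so the values are exact only for points whose ternary expansion terminates or meets a digit 1 within that depth.

```python
from fractions import Fraction


def cantor(x, depth=40):
    """Cantor function on [0, 1] via ternary digits, truncated after `depth` digits."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)  # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            # x lies in (the closure of) a removed middle third: f is constant there.
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value


print(cantor(0), cantor(1))                            # 0 and 1
print(cantor(Fraction(2, 5)), cantor(Fraction(3, 5)))  # both 1/2
```

On the middle third (1/3, 2/3) the function is constantly 1/2, so its difference quotients vanish there, and yet f(1) − f(0) = 1.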
Our aim in this chapter is to give necessary and sufficient conditions on a function f which is differentiable almost everywhere, with the derivative f′ being integrable, such that (6.1.1) is true.
We will first study, briefly, various classes of functions which are differentiable almost everywhere. Following the treatment as in Royden [6], we start with monotonic functions.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_6

Definition 6.1.1 Let $\mathcal{I}$ be a collection of intervals covering a set E ⊂ R. It is said to be a Vitali covering of E if, for every ε > 0, and for every x ∈ E, there exists an interval $I \in \mathcal{I}$ such that x ∈ I and $m_1(I) < \varepsilon$.

In other words, every point in E can be covered by intervals of arbitrarily small length. Such coverings occur naturally when we study local properties of functions, especially, differentiability.
In the sequel, when we use the word ‘outer-measure’, we will mean the outer-measure which generates the Lebesgue measure on R, which will be denoted by the symbol $\mu^*$.
Lemma 6.1.1 (Vitali covering lemma) Let E ⊂ R be a set of finite outer-measure and let $\mathcal{I}$ be a Vitali covering of E. Then, given ε > 0, we can find a finite collection of disjoint intervals $\{I_1, \cdots, I_N\}$ in $\mathcal{I}$ such that
$$\mu^*\Bigl(E \setminus \bigcup_{j=1}^N I_j\Bigr) < \varepsilon.$$
Proof: We observe, first of all, that the intervals can be of any type:
open, closed, or half-open. The addition, or removal, of end-points does
not change the results since these points constitute a set of measure
zero. Consequently, without loss of generality, we will assume that the
intervals are all closed.
Since E has finite outer-measure, we can find (cf. Proposition 2.2.1)
an open set U , which contains E and is such that m1 (U ) < +∞. Since
I is a Vitali covering of E, we can also assume, again, without loss of
generality, that all the intervals in I are also contained in U .
Let us choose an interval $I_1$ arbitrarily from the collection $\mathcal{I}$. We will now inductively choose disjoint intervals $I_k$, for k > 1, as follows. Assume that the intervals $\{I_1, \cdots, I_n\}$ have been chosen.
• If $E \subset \bigcup_{k=1}^n I_k$, then we are through.
• If not, let $x \in E \setminus \bigl(\bigcup_{k=1}^n I_k\bigr)$. Since the intervals are closed, the distance, d, of x from $\bigcup_{k=1}^n I_k$ is strictly positive. Since $\mathcal{I}$ is a Vitali covering, there exists an interval $I \in \mathcal{I}$, containing x and of length less than, say, $\frac{d}{2}$. Then, it is clear that I will not intersect any of the intervals $I_k$, 1 ≤ k ≤ n. Set
$$k_n = \sup\{\, m_1(I) \mid I \in \mathcal{I},\ I \cap I_k = \emptyset \text{ for all } 1 \le k \le n \,\}.$$
Since all intervals of $\mathcal{I}$ are contained in U, we have that $k_n \le m_1(U) < +\infty$. Thus, we can find an interval $I_{n+1} \in \mathcal{I}$ such that $I_{n+1} \cap I_k = \emptyset$ for all 1 ≤ k ≤ n and such that
$$\frac{1}{2}k_n < m_1(I_{n+1}).$$
Thus, if the process does not terminate at any finite stage, we will have a sequence of disjoint intervals $\{I_k\}_{k=1}^\infty$ in $\mathcal{I}$. Since they are all contained in U, we have that $\sum_{k=1}^\infty m_1(I_k) \le m_1(U) < +\infty$. Consequently, $m_1(I_k) \to 0$ as k → ∞. Now, choose N ∈ N such that
$$\sum_{k=N+1}^\infty m_1(I_k) < \frac{\varepsilon}{5}.$$
Set
$$R = E \setminus \bigcup_{k=1}^N I_k.$$
We complete the proof by showing that $\mu^*(R) < \varepsilon$.
Let x ∈ R. Then, as observed earlier, there exists $I \in \mathcal{I}$ such that x ∈ I and $I \cap I_k = \emptyset$ for all 1 ≤ k ≤ N. Assume, if possible, that $I \cap I_n = \emptyset$ for all n ∈ N. Then, by definition, $0 < m_1(I) \le k_n \le 2m_1(I_{n+1})$ for all n, which is impossible, since $m_1(I_n) \to 0$ as n → ∞. Thus, there exists n ∈ N, such that n > N, $I \cap I_n \ne \emptyset$, and $I \cap I_k = \emptyset$ for all 1 ≤ k < n. Notice that
$$m_1(I) \le k_{n-1} \le 2m_1(I_n).$$
Let $c_n$ denote the mid-point of the interval $I_n$. Then, since x ∈ I and $I \cap I_n \ne \emptyset$, we have
$$|x - c_n| \le m_1(I) + \frac{1}{2}m_1(I_n) \le \frac{5}{2}m_1(I_n).$$
Set
$$J_n = \Bigl[c_n - \frac{5}{2}m_1(I_n),\ c_n + \frac{5}{2}m_1(I_n)\Bigr].$$
Then, $x \in J_n$ and $m_1(J_n) \le 5m_1(I_n)$. Thus, $R \subset \bigcup_{k=N+1}^\infty J_k$ and so
$$\mu^*(R) \le \sum_{k=N+1}^\infty m_1(J_k) \le 5\sum_{k=N+1}^\infty m_1(I_k) < \varepsilon.$$
This completes the proof.

Remark 6.1.1 There are several similar results in the literature, each of them being called a Vitali covering lemma. The spirit of all of these is the same: a set of finite outer-measure in $\mathbb{R}^N$ is covered by basic open sets of arbitrarily small size and we can find a finite disjoint sub-collection which almost completely covers the given set, i.e. the outer-measure of the uncovered portion can be made as small as we wish.

Let f : [a, b] → R be a given measurable function. We can define the following ‘one-sided derivatives’ of f at any point x ∈ (a, b):
$$D^+ f(x) = \limsup_{h\downarrow 0}\frac{f(x+h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h\downarrow 0}\frac{f(x) - f(x-h)}{h},$$
$$D_+ f(x) = \liminf_{h\downarrow 0}\frac{f(x+h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h\downarrow 0}\frac{f(x) - f(x-h)}{h}.$$
We have that f is differentiable at x if, and only if, these four values coincide and are finite, and, in that case, we denote the common value as f′(x), which is the derivative of f at x. Notice that we always have $D^+ f(x) \ge D_+ f(x)$ and $D^- f(x) \ge D_- f(x)$ at any point x ∈ (a, b).
We say that f is differentiable almost everywhere on (a, b) if the derivative exists for almost every x ∈ (a, b).
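The four one-sided derivatives can differ at a point of continuity. As a purely numerical illustration, take f(x) = x sin(1/x) with f(0) = 0; at x = 0 the right difference quotient equals sin(1/h) and the left one equals −sin(1/h), so the upper derivatives are 1 and the lower derivatives are −1. The sketch below samples the difference quotients at finitely many small h; such sampling only suggests the values of the lim sup and lim inf and does not prove them. The function and variable names are illustrative.

```python
import math


def f(x):
    # Continuous at 0, but not differentiable there.
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0


def right_quotient(g, x, h):
    return (g(x + h) - g(x)) / h


def left_quotient(g, x, h):
    return (g(x) - g(x - h)) / h


# A finite sample of step sizes h in (0, 1e-3].
steps = [1e-3 * i / 20000 for i in range(1, 20001)]
right = [right_quotient(f, 0.0, h) for h in steps]
left = [left_quotient(f, 0.0, h) for h in steps]

# Sampled stand-ins for the upper and lower one-sided derivatives at 0.
print(max(right), min(right))
print(max(left), min(left))
```

The sampled maxima are close to 1 and the sampled minima close to −1 on both sides, in agreement with the exact values of the four one-sided derivatives at 0.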
Theorem 6.1.1 Let f : [a, b] → R be a monotonic increasing real-valued function. Then f is differentiable almost everywhere on (a, b). The derivative f′ is measurable and
$$\int_{(a,b)} f'\,dm_1 \le f(b) - f(a). \tag{6.1.2}$$
Proof: Step 1. Consider the set
$$E = \{x \in (a,b) \mid D^+ f(x) > D_- f(x)\}.$$
We will show that $m_1(E) = 0$. All other sets involving inequalities between the various one-sided derivatives can be handled in a similar manner. This will establish the proof.
We can write
$$E = \bigcup_{r,s\in\mathbb{Q},\ r>s} E_{rs},$$
where
$$E_{rs} = \{x \in (a,b) \mid D^+ f(x) > r > s > D_- f(x)\}.$$
Since Q is countable, again, it suffices to show that $m_1(E_{rs}) = 0$.
Let $m = m_1(E_{rs})$. Let ε > 0 be arbitrary. Then, there exists U, an open set, such that $E_{rs} \subset U$ and $m_1(U) < m + \varepsilon$. Let $x \in E_{rs}$. Since $D_- f(x) < s$, there exist arbitrarily small h > 0 such that [x − h, x] ⊂ U and
$$f(x) - f(x-h) < sh.$$
The collection of all such closed intervals, as x varies over $E_{rs}$, forms a Vitali covering of $E_{rs}$ and so, by the Vitali covering lemma, we can find a disjoint collection of such intervals $\{I_1, \cdots, I_N\}$ such that the union of their interiors covers a set $A \subset E_{rs}$ such that
$$\mu^*(A) > m - \varepsilon.$$
If, for 1 ≤ k ≤ N, we have $I_k = [x_k - h_k, x_k]$, then
$$\sum_{k=1}^N (f(x_k) - f(x_k - h_k)) < s\sum_{k=1}^N h_k < s\,m_1(U) < s(m + \varepsilon). \tag{6.1.3}$$
Now, let y ∈ A. Then there exist arbitrarily small h′ > 0 such that (y, y + h′) is contained in an interval $I_k$, where 1 ≤ k ≤ N, and
$$f(y + h') - f(y) > rh'.$$
Again the collection of such intervals, as y varies over A, is a Vitali covering of A and so there exists a finite collection of disjoint intervals $\{J_1, \cdots, J_M\}$ which cover a set B ⊂ A of outer-measure greater than m − 2ε. If $J_i = (y_i, y_i + h_i')$, 1 ≤ i ≤ M, then
$$\sum_{i=1}^M (f(y_i + h_i') - f(y_i)) > r\sum_{i=1}^M h_i' > r(m - 2\varepsilon). \tag{6.1.4}$$
Since each $J_i$ is contained in some $I_k$ and since f is monotonic increasing, we have, on summing over those i for which $J_i \subset I_k$, that
$$\sum_{J_i \subset I_k} (f(y_i + h_i') - f(y_i)) \le f(x_k) - f(x_k - h_k).$$
Thus,
$$\sum_{i=1}^M (f(y_i + h_i') - f(y_i)) \le \sum_{k=1}^N (f(x_k) - f(x_k - h_k)).$$
From (6.1.3) and (6.1.4), we then deduce that
$$r(m - 2\varepsilon) < s(m + \varepsilon).$$
Since ε > 0 was arbitrarily chosen, we get that mr ≤ ms. Since r > s, this is possible only if m = 0. This completes the proof of the differentiability, almost everywhere, of f.
Step 2. Define f (x) = f (b) for x ≥ b. Define
1
gn (x) = n f x +
− f (x) .
n
Since f is differentiable almost everywhere, f 0 is defined almost every0
where and the sequence {gn }∞
n=1 converges to f , wherever it is defined.
Since the Lebesgue measure is complete, it follows that f 0 is measurable.
Also, since f is monotonically increasing, we have gn (x) ≥ 0 for all x.
Thus, by Fatou’s lemma, we have that
Z
Z
0
f dm1 ≤ lim inf
gn dm1 .
n→∞ (a,b)
(a,b)
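By Exercise 5.3, we get
\[ \begin{aligned}
\int_{(a,b)} g_n\,dm_1 &= n\int_{(a+\frac{1}{n},\,b+\frac{1}{n})} f\,dm_1 - n\int_{(a,b)} f\,dm_1 \\
&= n\int_{[b,\,b+\frac{1}{n})} f\,dm_1 - n\int_{(a,\,a+\frac{1}{n}]} f\,dm_1 \\
&= f(b) - n\int_{(a,\,a+\frac{1}{n}]} f\,dm_1.
\end{aligned} \]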
Thus,
\[ \int_{(a,b)} f'\,dm_1 \le f(b) - \limsup_{n\to\infty}\, n\int_{(a,\,a+\frac{1}{n}]} f\,dm_1 \le f(b) - f(a), \]
since f(x) ≥ f(a) for all x ∈ (a, a + 1/n]. This completes the proof.

Remark 6.1.2 As the example of the Cantor function shows, we can have strict inequality in (6.1.2).

6.2 Functions of bounded variation
Let f : [a, b] → R be a given function. Consider a partition
P = {a = x0 < x1 < · · · < xn−1 < xn = b}.
Define
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|. \tag{6.2.1} \]
Definition 6.2.1 Let f : [a, b] → R be a given function. The total variation of f, over the interval [a, b], is defined as
\[ T_a^b(f) = \sup_{P}\, t(P, f), \]
where the supremum is taken over all possible partitions of the interval [a, b]. The function f is said to be of bounded variation over the interval [a, b] if T_a^b(f) < +∞.

Example 6.2.1 Let f : [a, b] → R be Lipschitz continuous, i.e. there exists L > 0 such that, for all x, y ∈ [a, b], we have
|f(x) − f(y)| ≤ L|x − y|.
Then, given any partition P as in (6.2.1), we have
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \le L \sum_{i=1}^{n} (x_i - x_{i-1}) = L(b - a). \]
Thus, f is of bounded variation over [a, b] and Tab (f ) ≤ L(b − a).
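The estimate of Example 6.2.1 can be checked numerically. The following Python sketch is only an illustration: the helper names `variation_sum` and `uniform_partition`, and the choice f = sin on [0, 2π] (Lipschitz with constant L = 1), are assumptions made for this example and do not come from the text.

```python
import math

def variation_sum(f, points):
    """Return t(P, f) = sum of |f(x_i) - f(x_{i-1})| over the partition `points`."""
    return sum(abs(f(points[i]) - f(points[i - 1])) for i in range(1, len(points)))

def uniform_partition(a, b, n):
    """Hypothetical helper: the partition of [a, b] into n equal subintervals."""
    return [a + (b - a) * i / n for i in range(n + 1)]

a, b, L = 0.0, 2.0 * math.pi, 1.0   # sin is Lipschitz with constant 1
for n in (4, 16, 64, 256):
    t = variation_sum(math.sin, uniform_partition(a, b, n))
    # Each partition sum stays below the Lipschitz bound L(b - a).
    print(n, t, t <= L * (b - a))
```

Every partition sum is bounded by L(b − a), so the supremum over all partitions is bounded as well; the sketch samples only uniform partitions and therefore illustrates the bound rather than proving it.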
In particular, if f is differentiable on (a, b) and if |f′(x)| ≤ L for all x ∈ (a, b), then, by the mean value theorem, f is Lipschitz continuous (with Lipschitz constant L) and hence is of bounded variation over [a, b].

Example 6.2.2 Let f : [a, b] → R be monotonic. Then, for any partition P as in (6.2.1), we have
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \Bigl| \sum_{i=1}^{n} \bigl(f(x_i) - f(x_{i-1})\bigr) \Bigr| = |f(b) - f(a)|, \]
by the monotonicity of f. Thus f is of bounded variation over the interval [a, b] and T_a^b(f) = |f(b) − f(a)|.

Example 6.2.3 Define
\[ f(x) = \begin{cases} x^2 \sin\dfrac{1}{x^2}, & \text{if } x \in (0, 1], \\ 0, & \text{if } x = 0. \end{cases} \]
Then f is a continuous function which is not of bounded variation. To see this, consider the partition P of the interval [0, 1] defined by the set of points
\[ \{0, 1\} \cup \left\{ \sqrt{\frac{2}{\pi(2k+1)}} \right\}_{k=0}^{n}. \]
Let us denote the points of the partition which are not on the boundary by {x_k}_{k=0}^n. Then
\[ |f(x_k) - f(x_{k-1})| = \frac{2}{\pi}\,\frac{1}{2k+1} + \frac{2}{\pi}\,\frac{1}{2k-1} = \frac{2}{\pi}\,\frac{4k}{4k^2-1} \ge \frac{2}{\pi}\,\frac{1}{k} \]
for 1 ≤ k ≤ n. Thus,
\[ t(P, f) \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k}, \]
and the right-hand side will become arbitrarily large for large n.

Proposition 6.2.1 Let [a, b] ⊂ R be a given bounded interval. If f is a
function of bounded variation on [a, b], then f is bounded. The function
|f | is also of bounded variation. If f and g are functions of bounded
variation on [a, b] and if α and β are real constants, then αf + βg and
f g are also of bounded variation on [a, b].
Proof: Let x ∈ (a, b]. Consider the partition defined by {a < x ≤ b}.
Then |f (x) − f (a)| ≤ Tab (f ). Thus, for any x ∈ [a, b], we have
|f (x)| ≤ |f (a)| + Tab (f ).
The rest of the proof is an immediate consequence of the following inequalities:
| |f (x)| − |f (y)| | ≤ |f (x) − f (y)|,
|(αf (x)+βg(x))−(αf (y)+βg(y))| ≤ |α| |f (x)−f (y)|+|β| |g(x)−g(y)|,
and
|f(x)g(x) − f(y)g(y)| ≤ |f(x)| |g(x) − g(y)| + |g(y)| |f(x) − f(y)|.

Given a real number r, define
r⁺ = max{r, 0} and r⁻ = − min{r, 0},
so that r = r⁺ − r⁻ and |r| = r⁺ + r⁻. Given a partition P as in (6.2.1) of an interval [a, b] and a function f : [a, b] → R, define
\[ p(P, f) = \sum_{i=1}^{n} \bigl(f(x_i) - f(x_{i-1})\bigr)^{+}, \quad\text{and}\quad n(P, f) = \sum_{i=1}^{n} \bigl(f(x_i) - f(x_{i-1})\bigr)^{-}. \]
Thus,
t(P, f ) = p(P, f ) + n(P, f ),
f (b) − f (a) = p(P, f ) − n(P, f ).
(6.2.2)
Define
\[ P_a^b(f) = \sup_{P}\, p(P, f), \quad\text{and}\quad N_a^b(f) = \sup_{P}\, n(P, f). \]
Proposition 6.2.2 Let f : [a, b] → R be a function of bounded variation. Then
\[ T_a^b(f) = P_a^b(f) + N_a^b(f), \tag{6.2.3} \]
and
\[ f(b) - f(a) = P_a^b(f) - N_a^b(f). \tag{6.2.4} \]
Proof: Since f is of bounded variation, we have that Tab (f ), Pab (f ) and
Nab (f ) are all finite. Now, for any partition P of [a, b], we have, using
(6.2.2),
p(P, f ) = n(P, f ) + f (b) − f (a) ≤ Nab (f ) + f (b) − f (a).
We then immediately deduce that Pab (f ) ≤ Nab (f ) + f (b) − f (a), or,
equivalently,
Pab (f ) − Nab (f ) ≤ f (b) − f (a).
Interchanging the roles of p(P, f ) and n(P, f ) in the use of (6.2.2), we
deduce, in the same way, that
Nab (f ) − Pab (f ) ≤ f (a) − f (b).
Thus, we have established (6.2.4).
Since t(P, f ) = p(P, f ) + n(P, f ) for any partition P, it follows that
Tab (f ) ≤ Pab (f ) + Nab (f ).
Again, for any partition P,
Tab (f ) ≥ p(P, f ) + n(P, f ) = p(P, f ) + p(P, f ) − (f (b) − f (a))
= 2p(P, f ) + Nab (f ) − Pab (f ),
by virtue of (6.2.4). Thus,
Tab (f ) ≥ 2Pab (f ) + Nab (f ) − Pab (f ) = Pab (f ) + Nab (f ),
which gives the reverse inequality, thereby establishing (6.2.3). This completes the proof.

Theorem 6.2.1 A function f : [a, b] → R is of bounded variation if,
and only if, it is the difference of two monotonic functions.
Proof: Since monotonic functions are of bounded variation, and since
the sum, or difference, of functions of bounded variation is also of
bounded variation, we see that the difference of two monotonic functions is of bounded variation.
Conversely, let f be of bounded variation on the interval [a, b]. Let x ∈ (a, b]. Define
g(x) = P_a^x(f) and h(x) = N_a^x(f),
and set g(a) = h(a) = 0. By definition, it is easy to see that both g and h are monotonic increasing functions. Consequently, the function x ↦ h(x) − f(a) is also monotonic increasing. Further, by the preceding proposition, we have
f(x) − f(a) = g(x) − h(x).
Thus,
f(x) = g(x) − (h(x) − f(a)),
which completes the proof.
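The decomposition used in the proof of Theorem 6.2.1 can be imitated on a grid. The sketch below is an illustration under stated assumptions: the function name `jordan_parts` is hypothetical, f = sin on [0, 2π] is an arbitrary choice, and the sums p(P, f) and n(P, f) are taken over one fixed fine partition, which only approximates the suprema P_a^x(f) and N_a^x(f).

```python
import math

def jordan_parts(f, a, b, n):
    """Approximate g(x) = P_a^x(f) and h(x) = N_a^x(f) on a uniform grid.

    The running sums of positive and negative parts of the increments are
    the partition sums p(P, f) and n(P, f) restricted to [a, x].
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    g, h = [0.0], [0.0]
    for i in range(1, n + 1):
        d = f(xs[i]) - f(xs[i - 1])
        g.append(g[-1] + max(d, 0.0))    # positive part of the increment
        h.append(h[-1] + max(-d, 0.0))   # negative part of the increment
    return xs, g, h

xs, g, h = jordan_parts(math.sin, 0.0, 2.0 * math.pi, 1000)
# f(x) - f(a) = g(x) - h(x) at every grid point, with g and h non-decreasing.
print(g[-1], h[-1], max(abs(math.sin(x) - (g[i] - h[i])) for i, x in enumerate(xs)))
```

Both running sums are non-decreasing by construction, and their difference telescopes to f(x) − f(a), which is the discrete counterpart of (6.2.4).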
Corollary 6.2.1 Let f : [a, b] → R be a function of bounded variation.
Then, it is differentiable almost everywhere.
Proof: This is a direct consequence of the preceding theorem and Theorem 6.1.1.

Proposition 6.2.3 Let f : [a, b] → R be a function of bounded variation. Then f′ is integrable and
\[ \int_{[a,b]} |f'|\,dm_1 \le T_a^b(f). \]
In addition, if f ∈ C¹[a, b], then we have equality in the above relation.
Proof: The functions x ↦ P_a^x(f), x ↦ N_a^x(f) and x ↦ T_a^x(f) are all monotonic increasing and so are differentiable almost everywhere. Since we have
f(x) − f(a) = P_a^x(f) − N_a^x(f),
we have
f′(x) = (P_a^x(f))′ − (N_a^x(f))′ a.e.
Since (P_a^x(f))′ and (N_a^x(f))′ are non-negative (the functions concerned being monotonic increasing), we have
|f′(x)| ≤ |(P_a^x(f))′| + |(N_a^x(f))′| = (P_a^x(f))′ + (N_a^x(f))′ = (T_a^x(f))′,
since T_a^x(f) = P_a^x(f) + N_a^x(f). Since T_a^x(f) is monotonic increasing, we have, by Theorem 6.1.1, that
\[ \int_{[a,b]} |f'|\,dm_1 \le \int_{[a,b]} \bigl(T_a^x(f)\bigr)'\,dm_1 \le T_a^b(f) - T_a^a(f) = T_a^b(f). \]
If f is continuously differentiable, and if P is any partition as in (6.2.1), we have, for 1 ≤ i ≤ n,
\[ f(x_i) - f(x_{i-1}) = \int_{x_{i-1}}^{x_i} f'(t)\,dt. \]
Thus, it follows that
\[ t(P, f) \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |f'(t)|\,dt = \int_a^b |f'(t)|\,dt = \int_{[a,b]} |f'|\,dm_1. \]
Consequently, we have the reverse inequality
\[ T_a^b(f) \le \int_{[a,b]} |f'|\,dm_1, \]
which completes the proof.

Let us now consider a vector-valued map f : [a, b] → R^N, where, for
x ∈ [a, b], we have f(x) = (f_1(x), · · · , f_N(x)). We write
\[ |f(x)| = \left( \sum_{i=1}^{N} |f_i(x)|^2 \right)^{1/2}. \]
We then say that f is of bounded variation over [a, b] if
\[ T_a^b(f) = \sup_{P}\, t(P, f) < +\infty, \]
where the supremum is taken over all partitions P of [a, b] and, if P is a partition of [a, b] as in (6.2.1), we set
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|. \]
If each f_i, 1 ≤ i ≤ N, is integrable over [a, b], we define
\[ \int_{[a,b]} f\,dm_1 = \left( \int_{[a,b]} f_i\,dm_1 \right)_{i=1}^{N}. \]
If each f_i, 1 ≤ i ≤ N, is differentiable in (a, b), we define
\[ f'(x) = \bigl(f_i'(x)\bigr)_{i=1}^{N}. \]
Lemma 6.2.1 Let f : [a, b] → R^N be integrable. Then
\[ \left| \int_{[a,b]} f\,dm_1 \right| \le \int_{[a,b]} |f|\,dm_1. \tag{6.2.5} \]
Proof: Set y = ∫_{[a,b]} f dm_1, and y_i = ∫_{[a,b]} f_i dm_1, 1 ≤ i ≤ N. The result is trivially true if y = 0. Assume that y ≠ 0. Then, using the Cauchy-Schwarz inequality in R^N for the inequality below,
\[ \begin{aligned}
|y|^2 = \sum_{i=1}^{N} y_i^2 &= \sum_{i=1}^{N} y_i \int_{[a,b]} f_i\,dm_1 \\
&= \int_{[a,b]} \sum_{i=1}^{N} y_i f_i\,dm_1 \le \int_{[a,b]} |y|\,|f|\,dm_1 \\
&= |y| \int_{[a,b]} |f|\,dm_1,
\end{aligned} \]
from which (6.2.5) follows on dividing throughout by |y|.
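Lemma 6.2.1 can be sanity-checked on a concrete curve. The sketch below is an illustration only: the map f(t) = (cos t, sin t) on [0, π], the helper name `midpoint_nodes` and the use of the midpoint rule in place of the Lebesgue integral are all assumptions made for this example.

```python
import math

def midpoint_nodes(a, b, n):
    """Hypothetical helper: midpoints and common width of n equal subintervals."""
    w = (b - a) / n
    return [a + (i + 0.5) * w for i in range(n)], w

def f(t):
    return (math.cos(t), math.sin(t))

nodes, w = midpoint_nodes(0.0, math.pi, 2000)
# Componentwise integral of f, then its euclidean norm (left side of (6.2.5)).
integral = [w * sum(f(t)[i] for t in nodes) for i in range(2)]
lhs = math.hypot(*integral)
# Integral of the euclidean norm |f| (right side of (6.2.5)).
rhs = w * sum(math.hypot(*f(t)) for t in nodes)
print(lhs, rhs)
```

Here the exact values are |(0, 2)| = 2 on the left and π on the right, so the inequality is strict for this curve.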
Proposition 6.2.4 Let f : [a, b] → R^N be a continuously differentiable map. Then f is of bounded variation and
\[ T_a^b(f) = \int_{[a,b]} |f'|\,dm_1. \]
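As a numerical illustration of this statement (not part of the proof), the sums t(P, f) can be computed for a specific smooth curve and compared with the integral of |f′|. The sketch assumes the unit circle γ(t) = (cos t, sin t) on [0, 2π], for which |γ′| = 1 and the integral equals 2π; the function name `chord_sum` is hypothetical.

```python
import math

def chord_sum(gamma, a, b, n):
    """t(P, gamma) for the uniform partition of [a, b] into n subintervals."""
    pts = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.hypot(pts[i][0] - pts[i - 1][0], pts[i][1] - pts[i - 1][1])
               for i in range(1, n + 1))

def circle(t):
    return (math.cos(t), math.sin(t))

for n in (4, 16, 64, 1024):
    # The chord sums increase towards the integral of |gamma'|, namely 2*pi.
    print(n, chord_sum(circle, 0.0, 2.0 * math.pi, n), 2.0 * math.pi)
```

The chord sums never exceed 2π and approach it as the partition is refined, in line with the two inequalities established in the proof.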
Proof: Proceeding exactly as in the latter part of the proof of Proposition 6.2.3, and using Lemma 6.2.1, we can easily see that, owing to the continuous differentiability of f,
\[ T_a^b(f) \le \int_{[a,b]} |f'|\,dm_1 < +\infty. \]
Thus, f is of bounded variation.
We now prove the reverse inequality. Since f′ is continuous over [a, b], it is uniformly continuous. Given ε > 0, let δ > 0 be such that, whenever |x − y| < δ, we have |f′(x) − f′(y)| < ε. Let P be a partition of [a, b] as in (6.2.1) and such that
\[ \max_{1 \le i \le n} (x_i - x_{i-1}) < \delta. \]
Thus, if x_{i−1} ≤ t ≤ x_i, we have
|f′(t)| < |f′(x_i)| + ε.
Therefore,
\[ \begin{aligned}
\int_{x_{i-1}}^{x_i} |f'(t)|\,dt - \varepsilon(x_i - x_{i-1}) &\le |f'(x_i)|(x_i - x_{i-1}) \\
&= \left| \int_{x_{i-1}}^{x_i} \bigl(f'(t) + f'(x_i) - f'(t)\bigr)\,dt \right| \\
&\le \left| \int_{x_{i-1}}^{x_i} f'(t)\,dt \right| + \left| \int_{x_{i-1}}^{x_i} \bigl(f'(x_i) - f'(t)\bigr)\,dt \right| \\
&\le |f(x_i) - f(x_{i-1})| + \varepsilon(x_i - x_{i-1}).
\end{aligned} \]
Summing over all 1 ≤ i ≤ n, we get
\[ \int_a^b |f'(t)|\,dt - \varepsilon(b - a) \le t(P, f) + \varepsilon(b - a). \]
Thus,
\[ \int_a^b |f'(t)|\,dt \le T_a^b(f) + 2\varepsilon(b - a). \]
Since ε > 0 was arbitrarily chosen, we get
\[ \int_{[a,b]} |f'|\,dm_1 = \int_a^b |f'(t)|\,dt \le T_a^b(f), \]
which completes the proof.

Example 6.2.4 (Rectifiable arcs) An arc, or a curve, in the plane, is a
continuous map of the form γ : [a, b] → R². To compute the ‘length’ of the curve, we partition [a, b] as in (6.2.1) and get the approximate length as
\[ \sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})|, \]
which is none other than the sum of the lengths of the chords connecting the successive points of the set {γ(x_i)}_{i=0}^n lying on the curve γ. We say that the arc is rectifiable, i.e. its length is well-defined, if the supremum of the above sum, taken over all partitions, is finite, and that number is called the length of the arc. In other words, an arc given by the mapping γ is rectifiable if, and only if, the map γ is of bounded variation and the length of the arc is, in fact, T_a^b(γ). Let us assume that the arc is given by the parametric equations: x = x(t), y = y(t), where t ∈ [a, b]. If the functions x(t) and y(t) are continuously differentiable, then, by the preceding proposition, we get that the length, L, of the curve is given by the formula
\[ L = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2}\,dt, \]
which is precisely what we define as the length of a curve in an undergraduate calculus course.

6.3 Differentiation of an indefinite integral

In this section, we will show that the derivative of the indefinite integral of an integrable function is equal to the integrand, almost everywhere.
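A small numerical sketch may help to see what "almost everywhere" allows here. It assumes a specific integrand on [0, 1], equal to 1 on [0, 1/2) and to 3 on [1/2, 1], whose indefinite integral is available in closed form; the function names are hypothetical and the example is not taken from the text.

```python
def f(x):
    """A bounded integrand with a jump at x = 1/2 (an assumed example)."""
    return 1.0 if x < 0.5 else 3.0

def F(x):
    """Closed form of the indefinite integral of f over [0, x], for 0 <= x <= 1."""
    return x if x < 0.5 else 0.5 + 3.0 * (x - 0.5)

def quotient(x, h):
    """Difference quotient (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h

for x in (0.25, 0.75):
    # Away from the jump the difference quotient reproduces the integrand.
    print(x, quotient(x, 1e-6), f(x))
# At the jump the one-sided quotients disagree, so F is not differentiable there.
print(quotient(0.5, 1e-6), quotient(0.5, -1e-6))
```

The single point x = 1/2 where F fails to be differentiable has measure zero, which is consistent with the statement of this section.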
Proposition 6.3.1 Let f : [a, b] → R be an integrable function. The indefinite integral of f, defined by
\[ F(x) = \int_{[a,x]} f\,dm_1, \quad x \in [a, b], \]
is a uniformly continuous function of bounded variation over [a, b].
Proof: Let x, y ∈ [a, b], with x < y. Then
\[ |F(x) - F(y)| = \left| \int_{[x,y]} f\,dm_1 \right| \le \int_{[x,y]} |f|\,dm_1. \]
Given ε > 0, we know that (cf. Proposition 5.3.2) there exists δ > 0 such that, whenever |x − y| < δ, we have
\[ \int_{[x,y]} |f|\,dm_1 < \varepsilon, \]
since f is integrable. This proves the uniform continuity of F.
If
P = {a = x_0 < x_1 < · · · < x_n = b}
is any partition of [a, b], then
\[ \sum_{i=1}^{n} |F(x_i) - F(x_{i-1})| \le \sum_{i=1}^{n} \int_{[x_{i-1},x_i]} |f|\,dm_1 = \int_{[a,b]} |f|\,dm_1. \]
Thus, it follows that
\[ T_a^b(F) \le \int_{[a,b]} |f|\,dm_1 < +\infty, \]
which proves that F is of bounded variation over [a, b].

Remark 6.3.1 The above result shows that in order that a function
be the indefinite integral of an integrable function, it must be at least a
function of bounded variation. In fact, we will see in the next section that
even more will be required. Thus, in general, a differentiable function
may not be the indefinite integral of its derivative, even if the derivative
is integrable. It needs to be at least a uniformly continuous function of
bounded variation, with additional properties.

Proposition 6.3.2 Let f : [a, b] → R be an integrable function such that
\[ \int_{[a,x]} f\,dm_1 = 0, \]
for all x ∈ [a, b]. Then f(x) = 0 for almost every x ∈ [a, b].
Proof: Set
E+ = {x ∈ [a, b] | f (x) > 0} and E− = {x ∈ [a, b] | f (x) < 0}.
Assume that m1 (E+ ) > 0. Then (cf. Proposition 2.2.2), there exists
a closed set F ⊂ E+ such that m1 (F ) > 0. Set U = (a, b)\F , which
is open. Then, U can be written as the disjoint union of a countable
number of half-open intervals (cf. Lemma 2.2.1):
\[ U = \bigcup_{n=1}^{\infty} [a_n, b_n). \]
Set
\[ f_n = f\,\chi_{\cup_{k=1}^{n} [a_k, b_k)}. \]
Then f_n → f in U and |f_n| ≤ |f|, which is integrable. Thus, by the dominated convergence theorem, we have that
\[ \int_U f\,dm_1 = \sum_{n=1}^{\infty} \int_{[a_n,b_n)} f\,dm_1. \]
Since f > 0 on F and since F has positive measure, we have (cf. Proposition 5.2.2) ∫_F f dm_1 > 0 and so, since
\[ 0 = \int_{[a,b]} f\,dm_1 = \int_U f\,dm_1 + \int_F f\,dm_1, \]
we deduce that ∫_U f dm_1 ≠ 0. Consequently, there exists n ∈ N such that ∫_{[a_n,b_n)} f dm_1 ≠ 0. But this is a contradiction, since
\[ \int_{[a_n,b_n)} f\,dm_1 = \int_{[a,b_n)} f\,dm_1 - \int_{[a,a_n)} f\,dm_1 = 0, \]
by hypothesis. Thus, it follows that m_1(E_+) = 0. Similarly, we can show that m_1(E_−) = 0. This shows that f = 0 almost everywhere.

Proposition 6.3.3 Let f : [a, b] → R be a bounded measurable function.
Define
\[ F(x) = F(a) + \int_{[a,x]} f\,dm_1, \quad x \in [a, b], \tag{6.3.1} \]
where F(a) is an arbitrary constant. Then F is differentiable almost everywhere and F′(x) = f(x) for almost every x ∈ [a, b].
Proof: By Proposition 6.3.1, we know that F is uniformly continuous and of bounded variation and hence that it is differentiable almost everywhere. Let |f(x)| ≤ M for all x ∈ [a, b]. For n ∈ N, define
\[ f_n(x) = n\left( F\left(x + \frac{1}{n}\right) - F(x) \right) = n \int_{[x,\,x+\frac{1}{n}]} f\,dm_1. \]
Then |f_n(x)| ≤ M for all x ∈ [a, b] and f_n → F′, almost everywhere, as n → ∞. Hence, by the dominated convergence theorem, for any c ∈ (a, b), we have
\[ \begin{aligned}
\int_{[a,c]} F'\,dm_1 &= \lim_{n\to\infty} \int_{[a,c]} f_n\,dm_1 \\
&= \lim_{n\to\infty} n \int_{[a,c]} \left( F\left(x + \frac{1}{n}\right) - F(x) \right) dm_1(x) \\
&= \lim_{n\to\infty} \left( n \int_{[c,\,c+\frac{1}{n}]} F\,dm_1 - n \int_{[a,\,a+\frac{1}{n}]} F\,dm_1 \right).
\end{aligned} \]
(We have used the result of Exercise 5.3.) Now, since F is uniformly continuous, given ε > 0, there exists N ∈ N such that for all n ≥ N, and for all x ∈ [a, b), we have |F(t) − F(x)| < ε whenever x ≤ t ≤ x + 1/n. Consequently, for any x ∈ [a, b) and for any n ≥ N, we have
\[ \left| n \int_{[x,\,x+\frac{1}{n}]} F\,dm_1 - F(x) \right| = \left| n \int_x^{x+\frac{1}{n}} \bigl(F(t) - F(x)\bigr)\,dt \right| \le \varepsilon. \]
Thus, we deduce that, for any c ∈ [a, b),
\[ \int_{[a,c]} F'\,dm_1 = F(c) - F(a) = \int_{[a,c]} f\,dm_1. \]
We then conclude, by applying the result of Proposition 6.3.2 to F′ − f, that F′ = f almost everywhere.

We can discard the hypothesis of boundedness of the integrand.
Theorem 6.3.1 Let f : [a, b] → R be an integrable function. Let F be defined as in (6.3.1). Then F is differentiable almost everywhere and F′(x) = f(x) for almost every x ∈ [a, b].
Proof: Let us first assume that f ≥ 0. Define
\[ f_n(x) = \begin{cases} f(x), & \text{if } f(x) \le n, \\ n, & \text{if } f(x) > n. \end{cases} \]
Then each f_n is bounded and the sequence {f_n}_{n=1}^∞ increases to f. Thus, f − f_n ≥ 0. Define
\[ G_n(x) = \int_{[a,x]} (f - f_n)\,dm_1. \]
Then G_n is monotonic increasing and is hence differentiable almost everywhere and its derivative is non-negative. Further, since f_n is bounded, by the preceding proposition, we have
\[ \frac{d}{dx} \int_{[a,x]} f_n\,dm_1 = f_n(x), \]
for almost every x ∈ [a, b]. Now, we can write
\[ F(x) = F(a) + G_n(x) + \int_{[a,x]} f_n\,dm_1, \]
and so, for almost every x ∈ [a, b], we have
F′(x) = G_n′(x) + f_n(x) ≥ f_n(x).
Since n ∈ N was arbitrarily chosen, we have
\[ F'(x) \ge f(x) \quad \text{a.e.}, \tag{6.3.2} \]
which implies that
\[ \int_{[a,b]} F'\,dm_1 \ge \int_{[a,b]} f\,dm_1 = F(b) - F(a). \tag{6.3.3} \]
On the other hand, since f is non-negative, we have that F is monotonic increasing and hence, by Theorem 6.1.1,
\[ \int_{[a,b]} F'\,dm_1 \le F(b) - F(a). \tag{6.3.4} \]
It now follows, from (6.3.3) and (6.3.4), that
\[ \int_{[a,b]} F'\,dm_1 = \int_{[a,b]} f\,dm_1 = F(b) - F(a). \]
Since by (6.3.2), we have that F′ ≥ f almost everywhere, the above relation implies that F′ = f almost everywhere (cf. Proposition 5.2.2).
This completes the proof for non-negative f.
In the general case, we write f = f⁺ − f⁻. Then
\[ F(x) = F(a) + \int_{[a,x]} f^{+}\,dm_1 - \int_{[a,x]} f^{-}\,dm_1. \]
It now follows, from the preceding arguments, that for almost every x ∈ [a, b], we have
F′(x) = f⁺(x) − f⁻(x) = f(x).

6.4 Absolute Continuity
From the previous section, we see that in order that a given function f :
[a, b] → R be written as an indefinite integral of an integrable function, it
must be at least uniformly continuous and of bounded variation. We will
now introduce a new concept which will provide both a necessary and
sufficient condition for a function to be written as an indefinite integral.
Definition 6.4.1 A function f : [a, b] → R is said to be absolutely continuous if, for every ε > 0, there exists δ > 0 such that the following holds: given any finite collection of disjoint intervals {(x_k, y_k)}_{k=1}^n in [a, b] such that
\[ \sum_{k=1}^{n} |y_k - x_k| < \delta, \tag{6.4.1} \]
we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| < \varepsilon. \tag{6.4.2} \]
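To see what this definition rules out, one can test it on the Cantor function, which is uniformly continuous but, as is well known, not absolutely continuous. The sketch below is an illustration under assumptions: the Cantor function is evaluated through the standard ternary-digit description (the function name `cantor` and the helper `stage_intervals` are hypothetical), and exact rational arithmetic is used for the interval endpoints.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor function at a rational x in [0, 1], read off the ternary digits of x."""
    if x >= 1:
        return 1.0
    value, weight = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # the function is constant on the middle third reached here
            return value + weight
        value += weight * (digit // 2)
        weight *= 0.5
    return value

def stage_intervals(n):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            nxt += [(lo, lo + third), (hi - third, hi)]
        intervals = nxt
    return intervals

for n in (2, 5, 8):
    ivs = stage_intervals(n)
    total_length = sum(hi - lo for lo, hi in ivs)                 # equals (2/3)**n
    total_jump = sum(cantor(hi) - cantor(lo) for lo, hi in ivs)   # stays equal to 1
    print(n, float(total_length), total_jump)
```

The interiors of these intervals are disjoint, their total length (2/3)^n can be made smaller than any δ, and yet the sum in (6.4.2) remains equal to 1, so no δ works for ε ≤ 1.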
Remark 6.4.1 Clearly an absolutely continuous function is uniformly continuous.

Example 6.4.1 If f : [a, b] → R is Lipschitz continuous, then it is absolutely continuous. If L is the Lipschitz constant (cf. Example 6.2.1), then choose δ = ε/L. Then, if {(x_k, y_k)}_{k=1}^n is a collection of disjoint intervals in [a, b], satisfying (6.4.1), we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| \le L \sum_{k=1}^{n} |y_k - x_k| < \varepsilon. \]
Thus, every differentiable function whose derivative is bounded on [a, b] will be absolutely continuous.

Example 6.4.2 Let f : [a, b] → R be an integrable function and let F
be defined as in (6.3.1). Then F is absolutely continuous. To see this, let {(x_k, y_k)}_{k=1}^n be a collection of disjoint intervals in [a, b], satisfying (6.4.1). Then
\[ \sum_{k=1}^{n} |F(y_k) - F(x_k)| \le \int_{\cup_{k=1}^{n} (x_k, y_k)} |f|\,dm_1. \]
The existence of δ > 0, given ε > 0, such that (6.4.2) is true follows from Proposition 5.3.2. (See also Remark 5.3.2.)

Our aim now is to prove the converse of the result in the above
example, thereby establishing absolute continuity as the necessary and
sufficient condition for a function to be written as an indefinite integral
(of its derivative).
Proposition 6.4.1 Let f : [a, b] → R be absolutely continuous. Then,
f is of bounded variation over [a, b]. In particular, f is differentiable
almost everywhere in [a, b].
Proof: Let δ > 0 correspond to ε = 1 in the definition of absolute continuity of f. Let K be the integral part of 1 + (b − a)/δ. Then, given any partition P of [a, b], we can refine it to a partition P′ such that the constituent intervals of P′ can be grouped into K sets of intervals, each with total length less than δ. Then,
t(P, f) ≤ t(P′, f) ≤ K.
Thus T_a^b(f) ≤ K < +∞.

Proposition 6.4.2 Let f : [a, b] → R be an absolutely continuous function such that f′ = 0 almost everywhere in [a, b]. Then f is a constant function.
Proof: Let c ∈ (a, b] be arbitrarily chosen. Set
E = {x ∈ (a, c) | f′(x) = 0}.
By hypothesis, we have that m_1(E) = c − a. Let ε and η be arbitrarily small positive numbers, and let δ > 0 correspond to ε in the definition of absolute continuity of f.
If x ∈ E, then, for sufficiently small h > 0, we have [x, x + h] ⊂ (a, c) and also that |f(x + h) − f(x)| < ηh. Then, by the Vitali covering lemma, we can find a finite disjoint set of intervals, {(x_k, x_k + h_k)}_{k=1}^n, such that
\[ m_1\Bigl(E \setminus \bigcup_{k=1}^{n} (x_k, y_k)\Bigr) < \delta, \]
where we have set y_k = x_k + h_k, 1 ≤ k ≤ n. Without loss of generality, we may assume that the {x_k}_{k=1}^n have been labelled in increasing order of magnitude. Thus we have
y_0 = a ≤ x_1 < y_1 ≤ x_2 < y_2 ≤ · · · ≤ x_n < y_n ≤ c = x_{n+1},
and
\[ \sum_{k=0}^{n} |x_{k+1} - y_k| < \delta. \]
Now, on one hand, we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| < \eta \sum_{k=1}^{n} |y_k - x_k| \le \eta(c - a). \]
On the other hand, by absolute continuity, we have
\[ \sum_{k=0}^{n} |f(x_{k+1}) - f(y_k)| < \varepsilon. \]
Together, these relations yield
|f(c) − f(a)| < ε + η(c − a).
Since ε and η were arbitrarily chosen, it follows that f(c) = f(a), and this completes the proof, since c was fixed arbitrarily in (a, b].

Remark 6.4.2 The Cantor function is an example of a non-constant
function whose derivative vanishes almost everywhere. Thus, the Cantor
function is not absolutely continuous.

Theorem 6.4.1 A function F : [a, b] → R can be written as an indefinite integral of an integrable function if, and only if, it is absolutely continuous.
Proof: We have already seen, in Example 6.4.2, that if F is an indefinite integral, then it is absolutely continuous.
Conversely, let F be absolutely continuous. Since it is of bounded variation, it can be written as the difference of two monotonic increasing functions (cf. Theorem 6.2.1). Let F = F_1 − F_2, where F_i, i = 1, 2, are monotonic increasing. Then F′ = F_1′ − F_2′ and so, by Theorem 6.1.1, we have
\[ \begin{aligned}
\int_{[a,b]} |F'|\,dm_1 &\le \int_{[a,b]} |F_1'|\,dm_1 + \int_{[a,b]} |F_2'|\,dm_1 \\
&= \int_{[a,b]} F_1'\,dm_1 + \int_{[a,b]} F_2'\,dm_1 \\
&\le \sum_{i=1}^{2} \bigl(F_i(b) - F_i(a)\bigr) < +\infty.
\end{aligned} \]
Thus, F′ is integrable. Let
\[ G(x) = \int_{[a,x]} F'\,dm_1. \]
Then G is absolutely continuous and hence so is the function f = F − G. Now, by Theorem 6.3.1, we have that G′ = F′ almost everywhere, i.e. f is an absolutely continuous function whose derivative f′ = F′ − G′ vanishes almost everywhere. Thus, f is a constant, equal to f(a) = F(a). Thus, we have
\[ F(a) = F(x) - \int_{[a,x]} F'\,dm_1, \]
or, equivalently,
\[ F(x) = F(a) + \int_{[a,x]} F'\,dm_1. \]
This completes the proof.

6.5 Exercises
6.1 Let f : [a, b] → R be a monotonic function. For x ∈ (a, b), define
\[ f(x+) = \lim_{h \downarrow 0} f(x + h) \quad\text{and}\quad f(x-) = \lim_{h \downarrow 0} f(x - h). \]
Show that f(x+) and f(x−) always exist. Deduce that the set of discontinuities of f is at most countable.
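6.2 For x ∈ [−1, 1], define
\[ f(x) = \begin{cases} x \sin\dfrac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases} \]
Compute D^+ f(0), D_+ f(0), D^- f(0) and D_- f(0).
6.3 Show that the function f defined on [0, 1] by
\[ f(x) = \begin{cases} x^2 \sin\dfrac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0, \end{cases} \]
is of bounded variation.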
6.4 Let f : [a, b] → R be a function of bounded variation. Let a ≤ c ≤ b.
Show that
Tab (f ) = Tac (f ) + Tcb (f ).
6.5 Let f : [a, b] → R be an absolutely continuous function. Show that
\[ T_a^b(f) = \int_{[a,b]} |f'|\,dm_1, \quad P_a^b(f) = \int_{[a,b]} (f')^{+}\,dm_1, \quad\text{and}\quad N_a^b(f) = \int_{[a,b]} (f')^{-}\,dm_1. \]
6.6 Let f : [a, b] → R be a function of bounded variation. Define, for x ∈ [a, b],
v_f(x) = T_a^x(f).
(a) If f is continuous, show that v_f is also continuous.
(b) If f is absolutely continuous, show that v_f is also absolutely continuous.
6.7 A monotone function is said to be singular if its derivative vanishes
almost everywhere. Show that if f : [a, b] → R is a monotonic increasing
function, then it can be written as the sum of a singular function and
an absolutely continuous function.
6.8 Let f : [0, 1] → R be a continuous function which is absolutely continuous on [ε, 1] for every 0 < ε < 1.
(a) Show, by means of an example, that f need not be absolutely continuous on [0, 1].
(b) If, in addition, f is of bounded variation on [0, 1], show that it is
absolutely continuous on [0, 1].
6.9 (Another example of a Cantor function) Consider the Cantor set C (cf. Example 2.1.3). For each n ∈ N, let E_n = [0, 1]\X_n, where X_n is as described in Example 2.1.3. Define, for x ∈ [0, 1],
\[ g_n(x) = \left(\frac{3}{2}\right)^{n} \chi_{E_n}(x) \quad\text{and}\quad f_n(x) = \int_{[0,x]} g_n\,dm_1. \]
(a) Show that, for each n ∈ N, f_n is a monotonic increasing function such that f_n(0) = 0, f_n(1) = 1 and that f_n is constant on each constituent interval of X_n = E_n^c.
(b) If I is any constituent interval of E_n, n ∈ N, show that
\[ \int_I g_n\,dm_1 = \int_I g_{n+1}\,dm_1 = 2^{-n}. \]
(c) Let n ∈ N. Show that f_{n+1}(x) = f_n(x) if x ∉ E_n, and that
\[ |f_{n+1}(x) - f_n(x)| \le \frac{3}{2^{n}} \]
for all x ∈ E_n.
(d) Deduce the existence of a continuous function f : [0, 1] → R which is monotonic increasing, whose derivative vanishes at every point x ∉ C, where C is the Cantor set, and such that f(0) = 0, f(1) = 1.
Chapter 7
Change of variable
7.1 The Fréchet derivative
Let A : R^N → R^N be a linear transformation. We have seen (cf. Theorem 2.3.3) that if E ⊂ R^N is a measurable set, then
m_N(A(E)) = |det(A)| m_N(E).
Now, using the procedure outlined in Remark 5.2.4, it is a simple exercise to see that, if A is invertible and f : R^N → R is an integrable function, then
\[ \int_{\mathbb{R}^N} f\,dm_N = |\det(A)| \int_{\mathbb{R}^N} (f \circ A)\,dm_N, \]
where f ◦ A stands for the composition of the two mappings A and f.
The aim of this chapter is to generalize this result to suitable transformations on open sets in RN . In order to do this, we need the tools of
differential calculus in RN , which we recall in this section. For proofs of
all assertions made in this section, see, for example, Kesavan [4].
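The linear change of variable formula above can be checked numerically in the plane. The following sketch rests on assumptions chosen for illustration only: the Gaussian f(x, y) = exp(−(x² + y²)), whose integral over R² is π, a hypothetical invertible matrix A with determinant 6, and a midpoint rule on the square [−6, 6]² standing in for the Lebesgue integral over R².

```python
import math

def grid_integral(func, radius, n):
    """Midpoint-rule approximation of the integral of func over [-radius, radius]^2."""
    h = 2.0 * radius / n
    nodes = [-radius + (i + 0.5) * h for i in range(n)]
    return h * h * sum(func(x, y) for x in nodes for y in nodes)

A = ((2.0, 1.0), (0.0, 3.0))                     # hypothetical invertible matrix
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # determinant of A, here 6

def f(x, y):
    return math.exp(-(x * x + y * y))

def f_after_A(x, y):
    """The composition f o A evaluated at (x, y)."""
    u = A[0][0] * x + A[0][1] * y
    v = A[1][0] * x + A[1][1] * y
    return f(u, v)

lhs = grid_integral(f, 6.0, 240)
rhs = abs(det_A) * grid_integral(f_after_A, 6.0, 240)
print(lhs, rhs, math.pi)
```

Both sides agree with π up to the quadrature error, which is very small here because the integrands are smooth and decay rapidly.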
Definition 7.1.1 Let U ⊂ R^N be an open set and let T : U → R^M be a given mapping. The mapping is said to be differentiable at a point a ∈ U, if there exists a linear transformation A : R^N → R^M such that
\[ \lim_{h \to 0} \frac{|T(a + h) - T(a) - A(h)|}{|h|} = 0, \tag{7.1.1} \]
where | · | denotes the euclidean length of a vector in the appropriate euclidean space. The linear map A is called the Fréchet derivative of T at the point a ∈ U and is denoted by T′(a).

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_7

Remark 7.1.1 The following facts are immediate consequences of the
definition:
(i) If T is differentiable at a point a ∈ U , then it is continuous at that
point.
(ii) The derivative at a point, if it exists, is unique. It is for this purpose
that we work in an open set.
(iii) If T is itself a linear map, then it is differentiable at every point of
U and, for every a ∈ U , we have
T′(a) = T.

Remark 7.1.2 The relation (7.1.1) can be written in an equivalent fashion as follows:
T(a + h) = T(a) + T′(a)(h) + ε(h),
where the error term ε(h) satisfies the condition
\[ \lim_{h \to 0} \frac{|\varepsilon(h)|}{|h|} = 0. \]
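The behaviour of the error term can be observed numerically. The sketch below assumes a specific smooth map, T(x, y) = (x² − y², 2xy), whose derivative is computed by hand from its partial derivatives; the names `dT` and `error_ratio` and the base point a = (1, 2) are hypothetical choices for illustration.

```python
import math

def T(x, y):
    return (x * x - y * y, 2.0 * x * y)

def dT(x, y, h1, h2):
    """Action of the Jacobian matrix of T at (x, y) on the vector (h1, h2)."""
    return (2.0 * x * h1 - 2.0 * y * h2, 2.0 * y * h1 + 2.0 * x * h2)

def error_ratio(a, h):
    """|eps(h)| / |h| with eps(h) = T(a + h) - T(a) - T'(a)(h)."""
    Ta, Tah, lin = T(*a), T(a[0] + h[0], a[1] + h[1]), dT(*a, *h)
    eps = (Tah[0] - Ta[0] - lin[0], Tah[1] - Ta[1] - lin[1])
    return math.hypot(*eps) / math.hypot(*h)

a = (1.0, 2.0)
for s in (1e-1, 1e-2, 1e-3):
    # For this map eps(h) = (h1^2 - h2^2, 2 h1 h2), so the ratio equals |h|.
    print(s, error_ratio(a, (s, -s)))
```

For this quadratic map the ratio |ε(h)|/|h| is exactly |h|, so it tends to zero linearly as h → 0.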
Definition 7.1.2 Let U ⊂ R^N be an open set. A mapping T : U → R^M is said to be differentiable on U if T′(x) exists for every x ∈ U. The mapping T is said to be of class C¹ on U, or, equivalently, T is said to be continuously differentiable on U, if it is differentiable on U and, in addition, the mapping x ↦ T′(x) from U into the space of all linear transformations from R^N into R^M, denoted L(R^N, R^M), is continuous when the latter space is endowed with its usual topology. If V ⊂ R^N is an open set, a mapping T : U → V is said to be a diffeomorphism if T is a bijection and if both T and T⁻¹ are continuously differentiable maps.

Example 7.1.1 If N = M = 1, then the Fréchet derivative is the familiar derivative that we define when studying the calculus of functions of a single variable. In this case, the derivative T′(a) at a point a ∈ U is a real number, which can be visualised as a linear map from R into itself, acting on R by multiplication, i.e. T′(a)(h) = T′(a)h.

Example 7.1.2 Let N > 1 and let M = 1. Then T′(a) is a linear functional on R^N, and so, it can be represented by a vector in R^N. Indeed, we have
\[ T'(a) = \nabla T(a) = \left( \frac{\partial T}{\partial x_1}(a), \cdots, \frac{\partial T}{\partial x_N}(a) \right), \]
where {∂T/∂x_i (a)}_{i=1}^N are the usual partial derivatives of T at the point a. It can be shown that if T is differentiable at a ∈ U, then all partial derivatives of T exist at that point. Thus, the action of T′(a) on a vector h = (h_1, · · · , h_N) is given by
\[ T'(a)(h) = \sum_{i=1}^{N} \frac{\partial T}{\partial x_i}(a)\,h_i. \]
Example 7.1.3 Consider the mapping T : R² → R defined, for (x, y) ∈ R², by
\[ T(x, y) = \begin{cases} \dfrac{x^5}{(y - x^2)^2 + x^4}, & \text{if } (x, y) \ne (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]
Then one can verify that both partial derivatives exist at the point (0, 0): since T(t, 0) = t/2 and T(0, t) = 0 for t ≠ 0, they are equal to 1/2 and 0 respectively. If T were differentiable at the origin, then it would follow, from the preceding example, that T′((0, 0))(h) = h_1/2 for h = (h_1, h_2). In particular, we must then have
\[ \lim_{h \to 0} \frac{|T(h) - h_1/2|}{|h|} = 0. \]
However, if we take h = (t, t²), where t → 0, then T(h) = t and we can easily see that the above limit is 1/2.
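A numerical sketch shows why no linear map can serve as a derivative of this T at the origin. For a candidate T′((0, 0))(h) = αh_1 + βh_2, the quotient |T(h) − αh_1 − βh_2|/|h| tends to |1 − α| along h = (t, t²) and to |1/2 − α| along h = (t, 0), and these cannot both vanish. The function name `quotient` and the sampled values of t are assumptions made for illustration.

```python
import math

def T(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 5 / ((y - x * x) ** 2 + x ** 4)

def quotient(alpha, beta, h):
    """|T(h) - (alpha*h1 + beta*h2)| / |h| for a candidate derivative (alpha, beta)."""
    return abs(T(*h) - (alpha * h[0] + beta * h[1])) / math.hypot(*h)

for t in (1e-1, 1e-2, 1e-3):
    along_parabola = quotient(0.0, 0.0, (t, t * t))   # candidate derivative 0
    along_axis = quotient(0.0, 0.0, (t, 0.0))         # same candidate, along the x-axis
    print(t, along_parabola, along_axis)
```

With the zero candidate the quotient along the parabola approaches 1; with the candidate (1/2, 0) it approaches 1/2 instead, so differentiability at the origin fails in either case.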
Thus, while the partial derivatives will all exist at a point if the
mapping is differentiable there, the converse is not true. The partial
derivatives may exist at a point, but the mapping can still fail to be
differentiable there.

Example 7.1.4 Let N > 1 and M > 1. Let T(x) = (T_1(x), · · · , T_M(x)), where T_i is a mapping of U ⊂ R^N into R. If T is differentiable at a point a ∈ U, then all the T_i, 1 ≤ i ≤ M, are also differentiable at a. The derivative T′(a) can now be represented by an M × N matrix. We have, for h = (h_1, · · · , h_N) ∈ R^N,
\[ T'(a)(h) = \begin{pmatrix} \dfrac{\partial T_1}{\partial x_1}(a) & \cdots & \dfrac{\partial T_1}{\partial x_N}(a) \\ \vdots & & \vdots \\ \dfrac{\partial T_M}{\partial x_1}(a) & \cdots & \dfrac{\partial T_M}{\partial x_N}(a) \end{pmatrix} \begin{pmatrix} h_1 \\ \vdots \\ h_N \end{pmatrix}. \]
Definition 7.1.3 Let U ⊂ R^N be an open set and let T : U → R^N be a differentiable map defined on U. The Jacobian of T at a point a ∈ U is denoted by J_T(a) and is equal to det(T′(a)).

We will now recall (without proof) some important results from the differential calculus in R^N, which are well known when N = 1.
Theorem 7.1.1 (Chain rule) Let N_i ∈ N, 1 ≤ i ≤ 3. Let U ⊂ R^{N_1} and V ⊂ R^{N_2} be open sets. Let f : U → R^{N_2} and let g : V → R^{N_3} be continuous mappings. Let a ∈ U be such that f(a) ∈ V. Assume that f is differentiable at a and that g is differentiable at f(a). Then the composition h = g ◦ f, which is defined on the open set U′ = f⁻¹(V) ⊂ U, is differentiable at a and
h′(a) = g′(f(a)) ◦ f′(a).

Notation
• Let a and b be vectors in R^N. We denote the (closed) line segment connecting these points by [a, b]. Thus,
[a, b] = {ta + (1 − t)b | 0 ≤ t ≤ 1}.
• Let A : R^N → R^N be a linear transformation. We denote by ‖A‖ its norm, i.e.
\[ \|A\| = \max_{|x| = 1} |A(x)|. \]
Theorem 7.1.2 (Mean value theorem) Let U ⊂ R^N be an open set and let T : U → R^M be a given mapping. Let a, b ∈ U be such that [a, b] ⊂ U. If T is differentiable in U, then
\[ |T(b) - T(a)| \le \sup_{x \in [a,b]} \|T'(x)\|\,|b - a|. \tag{7.1.2} \]
Remark 7.1.3 The mean value theorem has numerous applications. For instance, let T : U → R be a mapping such that its partial derivatives exist at all points of U. Assume that the mappings x ↦ ∂T/∂x_i (x) are continuous at a point a ∈ U for all 1 ≤ i ≤ N. Then, using the mean value theorem, we can show that T is differentiable at a (cf. Example 7.1.3).

Theorem 7.1.3 Let U ⊂ R^N be an open set and let T : U → R^M be a mapping of class C¹. Then, if [a, a + h] ⊂ U, we have
\[ T(a + h) = T(a) + \int_0^1 T'(a + th)(h)\,dt. \tag{7.1.3} \]
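The integral formula (7.1.3) can be verified numerically for a specific C¹ map. In the sketch below, the map T(x, y) = (y sin x, eˣ + y²), its hand-computed derivative `dT`, the points a and h, and the use of the midpoint rule for the integral over [0, 1] are all assumptions chosen for illustration.

```python
import math

def T(x, y):
    return (y * math.sin(x), math.exp(x) + y * y)

def dT(x, y, h1, h2):
    """Action of T'(x, y) on (h1, h2), from the partial derivatives computed by hand."""
    return (y * math.cos(x) * h1 + math.sin(x) * h2,
            math.exp(x) * h1 + 2.0 * y * h2)

def integral_form(a, h, n=2000):
    """T(a) plus a midpoint-rule approximation of the integral in (7.1.3)."""
    acc = [0.0, 0.0]
    for i in range(n):
        t = (i + 0.5) / n
        d = dT(a[0] + t * h[0], a[1] + t * h[1], h[0], h[1])
        acc[0] += d[0] / n
        acc[1] += d[1] / n
    Ta = T(*a)
    return (Ta[0] + acc[0], Ta[1] + acc[1])

a, h = (0.3, -1.2), (0.8, 0.5)
# The two printed pairs should agree up to the quadrature error.
print(integral_form(a, h), T(a[0] + h[0], a[1] + h[1]))
```

The agreement reflects the fundamental theorem of calculus applied to t ↦ T(a + th), whose derivative is T′(a + th)(h) by the chain rule.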
7.2 Sard’s theorem
Definition 7.2.1 Let U ⊂ R^N be an open set. Let T : U → R^N be a C¹ map. Let x ∈ U. We say that the point x is a singular point, or a critical point, if the rank of T′(x) is strictly less than N, i.e. the linear transformation T′(x) is singular. The value T(x) is called a singular value or critical value of T. If T⁻¹({y}) does not contain any critical point, we say that y is a regular value of T.

Theorem 7.2.1 (Sard) Let U ⊂ R^N be an open set and let T : U → R^N be a C¹ map. Then the set of critical values of T has measure zero.
Proof: Let S be the set of critical points of T, i.e.
S = {x ∈ U | J_T(x) = 0}.
We need to show that m_N(T(S)) = 0.
Step 1. Let C be a closed cube of side a, with sides parallel to the coordinate axes, contained in U. Since T′ is bounded and uniformly continuous on C (since T is continuously differentiable), given ε > 0, there exists δ > 0 such that
‖T′(x) − T′(y)‖ < ε,
whenever |x − y| < δ and x, y ∈ C. Let us divide C into k^N similar cubes, each of side a/k, with k being chosen large enough such that the diameter of each sub-cube is less than δ, i.e. √N a/k < δ. If ‖T′(x)‖ ≤ L, for all x ∈ C, we have
|T(x) − T(y)| ≤ L|x − y|,
for all x, y ∈ C, by virtue of the mean value theorem (cf. Theorem 7.1.2).
Step 2. Let x ∈ C ∩ S. Then, x must belong to one of the sub-cubes, say, C̃. Given any y ∈ C̃, we have, on one hand,
\[ |T(x) - T(y)| \le L\sqrt{N}\,\frac{a}{k}. \tag{7.2.1} \]
On the other hand, by virtue of Theorem 7.1.3, we have
\[ T(y) - T(x) - T'(x)(y - x) = \int_0^1 \bigl(T'(x + t(y - x)) - T'(x)\bigr)(y - x)\,dt. \]
7.3 Diffeomorphisms
147
Using the uniform continuity of T in C, we get
√ a
|T (y) − T (x) − T 0 (x)(y − x)| ≤ ε|y − x| ≤ ε N .
k
(7.2.2)
Step 3. Set $H = T'(x)(\mathbb{R}^N)$. Since $x$ is a critical point, it follows that the dimension of the subspace $H$ is at most $N - 1$. Hence by (7.2.2), we deduce that
\[
\operatorname{dist}(T(y), T(x) + H) \le \varepsilon\sqrt{N}\,\frac{a}{k}, \tag{7.2.3}
\]
for every $y \in \widetilde{C}$. Combining (7.2.1) and (7.2.3), we deduce that $T(\widetilde{C})$ is contained within a cylindrical block of radius $L\sqrt{N}\,a/k$ and height $2\varepsilon\sqrt{N}\,a/k$. If $\omega_{N-1}$ is the measure of the unit ball in $\mathbb{R}^{N-1}$, we have
\[
m_N(T(\widetilde{C})) \le \omega_{N-1}\Big(L\sqrt{N}\,\frac{a}{k}\Big)^{N-1}\, 2\varepsilon\sqrt{N}\,\frac{a}{k} = K(N, C)\,\varepsilon k^{-N}.
\]
Thus
\[
m_N(T(C \cap S)) \le \sum_{\substack{\widetilde{C} \subset C \\ \widetilde{C} \cap S \ne \emptyset}} m_N(T(\widetilde{C})) \le k^N K(N, C)\,\varepsilon k^{-N} = K(N, C)\,\varepsilon.
\]
Since $\varepsilon$ can be chosen arbitrarily small, we deduce that $m_N(T(C \cap S)) = 0$. Now $U$ can be covered by a countable number of such cubes and so the result follows.
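To make the covering argument concrete, the following Python sketch (our own illustration, not part of the proof) takes the one-dimensional map $T(x) = x^3 - 3x$ on $[-2, 2]$, whose critical points are $x = \pm 1$, and adds up the lengths of the images of those sub-intervals which contain a critical point. The names `T`, `critical_points` and `cover_length` are hypothetical choices made for this example. The total shrinks as the number $k$ of sub-intervals grows, which is the behaviour the estimate $K(N, C)\varepsilon$ predicts.

```python
def T(x):
    """Illustrative C^1 map on the real line (N = 1); T'(x) = 3x^2 - 3."""
    return x**3 - 3.0 * x

# Critical points of T, i.e. the zeros of T'.
critical_points = [-1.0, 1.0]

def cover_length(k, a=-2.0, b=2.0):
    """Total length of T(I) over the sub-intervals I of [a, b] that meet S.

    [a, b] is cut into k equal sub-intervals. For k >= 10 each sub-interval
    holds at most one critical point c and T is monotone on either side of c,
    so the extreme values of T on I occur at the endpoints or at c.
    """
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        lo, hi = a + i * h, a + (i + 1) * h
        inside = [c for c in critical_points if lo <= c <= hi]
        if inside:
            values = [T(lo), T(hi)] + [T(c) for c in inside]
            total += max(values) - min(values)
    return total

for k in (10, 100, 1000):
    print(k, cover_length(k))
```

The printed totals decrease roughly like $1/k^2$, since $T$ is flat to first order at each critical point.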
7.3 Diffeomorphisms

Lemma 7.3.1 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Let $x \in U$ be such that $T'(x)$ is non-singular. Let $C \subset U$ be a closed cube with centre at $x$, its sides parallel to the coordinate axes and of length $\nu$. Then, given $\varepsilon > 0$, there exists $\delta > 0$ such that, if $\nu < \delta$, and if $\widetilde{C}$ is any sub-cube of $C$ with sides parallel to the coordinate axes, we have
\[
m_N(T(\widetilde{C})) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{\widetilde{C}} |J_T|\,dm_N. \tag{7.3.1}
\]
Proof: Let $B \subset U$ be a closed and bounded neighbourhood of $x$ so that $T'$ is bounded and uniformly continuous on $B$. We will only work with cubes $C$ contained in $B$. Let
\[
L = \max_{\xi \in B} \|T'(\xi)\|.
\]
If we consider any sub-cube $\widetilde{C}$ of $C$, with sides parallel to the coordinate axes and with centre at $x'$, we have, by the mean value theorem,
\[
|T(y) - T(x')| \le L|y - x'|,
\]
for every $y \in \widetilde{C}$. Thus, it follows that for any such sub-cube,
\[
m_N(T(\widetilde{C})) \le L^N m_N(\widetilde{C}). \tag{7.3.2}
\]
Let $\varepsilon > 0$ be arbitrarily chosen. Then, by the uniform continuity of $T'$, we can find $\delta > 0$ such that, if $|y - x| < \delta$, we have
\[
\|(T'(x))^{-1} T'(y)\| < 1 + \varepsilon, \qquad |J_T(y)| > (1 - \varepsilon)|J_T(x)|. \tag{7.3.3}
\]
Let $\nu < \delta$ so that the above relations are valid throughout $C$. Let $\widetilde{C}$ be any sub-cube of $C$, as described earlier. Then, by virtue of Theorem 2.3.3, we have
\[
|J_T(x)|^{-1} m_N(T(\widetilde{C})) = m_N\big((T'(x))^{-1} T(\widetilde{C})\big).
\]
Now, $T'(x)$ is a fixed linear transformation. Hence, by Remark 7.1.1 (iii) and the chain rule (cf. Theorem 7.1.1), we have that $(T'(x))^{-1} T$ is differentiable and its derivative at any point $\xi$ is $(T'(x))^{-1} T'(\xi)$. Since we have that
\[
\max_{\xi \in C} \|(T'(x))^{-1} T'(\xi)\| < 1 + \varepsilon,
\]
we deduce from (7.3.2) that
\[
m_N\big((T'(x))^{-1} T(\widetilde{C})\big) < (1 + \varepsilon)^N m_N(\widetilde{C}).
\]
Thus,
\[
m_N(T(\widetilde{C})) < (1 + \varepsilon)^N |J_T(x)|\, m_N(\widetilde{C}). \tag{7.3.4}
\]
On the other hand, we also have, from (7.3.3), that
\[
\int_{\widetilde{C}} |J_T(y)|\,dm_N(y) > (1 - \varepsilon)|J_T(x)|\, m_N(\widetilde{C}). \tag{7.3.5}
\]
Thus, combining (7.3.4) and (7.3.5), we deduce (7.3.1).
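Lemma 7.3.2 Let $U \subset \mathbb{R}^N$ be open and let $K \subset U$ be a compact set. Let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Then
\[
\int_K |J_T|\,dm_N = \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N. \tag{7.3.6}
\]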
Proof: Since $T$ is continuously differentiable and since $K$ is compact, $|J_T|$ is integrable over $K$ and over any open set $W$ whose closure is compact. Clearly,
\[
\int_K |J_T|\,dm_N \le \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N.
\]
To prove the reverse inequality, observe that by absolute continuity (cf. Proposition 5.3.2), if we restrict our attention to subsets of a relatively compact open set $W_0$ containing $K$ and contained in $U$, given any $\varepsilon > 0$, we can find $\delta > 0$ such that, if $F \subset W_0$ satisfies $m_N(F) < \delta$, then
\[
\int_F |J_T|\,dm_N < \varepsilon.
\]
Now we can find $W$ open such that $K \subset W \subset \overline{W} \subset W_0 \subset \overline{W_0} \subset U$ and such that $m_N(W \setminus K) < \delta$ (cf. Proposition 2.2.2 (ii)). Thus,
\[
\int_W |J_T|\,dm_N - \int_K |J_T|\,dm_N < \varepsilon.
\]
In other words, we have found $W$, a relatively compact open set contained in $U$ and containing $K$, such that
\[
\int_W |J_T|\,dm_N < \int_K |J_T|\,dm_N + \varepsilon.
\]
This establishes the reverse inequality and completes the proof.

Proposition 7.3.1 Let $U$ and $V$ be open subsets of $\mathbb{R}^N$ and let $T : U \to V$ be a $C^1$ map, which is also a homeomorphism. Then
\[
m_N(T(E)) \le \int_E |J_T|\,dm_N, \tag{7.3.7}
\]
where $E \subset U$ is either a compact set or an open set.
Proof: Step 1. Let $K \subset U$ be a compact set and let $W$ be a relatively compact open set such that
\[
K \subset W \subset \overline{W} \subset U.
\]
Let
\[
L = \max_{x \in \overline{W}} \|T'(x)\|.
\]
Let $\varepsilon > 0$. Let $\delta_1 > 0$ be such that, whenever $|x - y| < \delta_1$, $x, y \in \overline{W}$, we have
\[
\|T'(x) - T'(y)\| < \varepsilon.
\]
If $x \in K$ is such that $T'(x)$ is singular, set $\nu(x) = \delta_1$. If $x \in K$ is such that $T'(x)$ is non-singular, set $\nu(x) = \min\{\delta_1, \delta(x)\}$, where $\delta(x)$ is the number $\delta$ chosen in the proof of Lemma 7.3.1 such that (7.3.3) is valid. Now cover $K$ by cubes $\{C(x)\}_{x \in K}$, where $C(x)$ is centred at $x \in K$ and has sides of length $\nu(x)$, the sides being parallel to the coordinate axes. Since $K$ is compact, there exists a finite subcover. We can further ensure that we have a finite collection of disjoint (half-open) cubes whose union covers $K$ and each of these is a sub-cube of one of the $C(x)$. Thus we have a finite cover of $K$ consisting of disjoint cubes $\{C'(x)\}_{x \in S}$, where $S$ is a finite set and $x$ is the centre of the cube $C'(x)$. Then $S = J_s \cup J_{ns}$, where
\[
J_s = \{x \in S \mid T'(x) \text{ is singular}\}, \qquad J_{ns} = \{x \in S \mid T'(x) \text{ is non-singular}\}.
\]
If $x \in J_s$, then we can proceed exactly as in the proof of Sard's theorem (Theorem 7.2.1) to get
\[
m_N(T(C'(x))) \le 2\omega_{N-1} L^{N-1} \varepsilon\, m_N(C'(x)). \tag{7.3.8}
\]
If $x \in J_{ns}$, then, by Lemma 7.3.1, we get
\[
m_N(T(C'(x))) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{C'(x)} |J_T|\,dm_N. \tag{7.3.9}
\]
(We must remember that each $C'(x)$ is a sub-cube of one of the original cubes. The estimates in Sard's theorem and Lemma 7.3.1 were observed to be valid for any sub-cube of the cube of admissible size.) Now,
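\begin{align*}
m_N(T(K)) &\le m_N\Big(T\Big(\bigcup_{x \in S} C'(x)\Big)\Big) \\
&\le \sum_{x \in J_s} m_N(T(C'(x))) + \sum_{x \in J_{ns}} m_N(T(C'(x))) \\
&\le C(K)\,\varepsilon \sum_{x \in J_s} m_N(C'(x)) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \sum_{x \in J_{ns}} \int_{C'(x)} |J_T|\,dm_N \\
&\le C(K)\,\varepsilon\, m_N(W) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_W |J_T|\,dm_N,
\end{align*}
where
\[
C(K) = 2\omega_{N-1} L^{N-1}.
\]
We have used here the disjointness of the cubes $C'(x)$ when using $m_N(W)$ as an upper bound for $\sum_{x \in J_s} m_N(C'(x))$ and when using the integral over $W$ as an upper bound for the sum of the integrals over the cubes $C'(x)$.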
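Since $\varepsilon$ can be chosen arbitrarily small, we get
\[
m_N(T(K)) \le \int_W |J_T|\,dm_N.
\]
Consequently, we have, by Lemma 7.3.2,
\[
m_N(T(K)) \le \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N = \int_K |J_T|\,dm_N.
\]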
Let $W \subset U$ be an open set. Then, $W$ can be written as the countable increasing union of compact sets, i.e.
\[
W = \bigcup_{n=1}^{\infty} K_n,
\]
where the sets $K_n$, $n \in \mathbb{N}$, are all compact and $K_n \subset K_{n+1}$ for all $n \in \mathbb{N}$. Then, since $T$ is a homeomorphism,
\[
T(W) = \bigcup_{n=1}^{\infty} T(K_n),
\]
and for all $n \in \mathbb{N}$, we have that $T(K_n) \subset T(K_{n+1})$ and $T(K_n)$ is compact. Thus,
\[
m_N(T(W)) = \lim_{n \to \infty} m_N(T(K_n)) \le \lim_{n \to \infty} \int_{K_n} |J_T|\,dm_N \le \int_W |J_T|\,dm_N.
\]
This completes the proof.
Corollary 7.3.1 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$. Let $T$ be a $C^1$ map which maps $U$ homeomorphically onto $V$. Let $E \subset U$ be a Borel set. Then (7.3.7) holds.

Proof: By Lemma 2.3.1, $T(E)$ is a Borel set. Since $V$ is bounded, it follows that $T(E)$ has finite measure. Consequently (cf. Proposition 2.2.3 and Remark 2.2.1), since $T$ is a homeomorphism, we have
\[
m_N(T(E)) = \sup\{m_N(T(K)) \mid K \subset E,\ K \text{ compact}\}.
\]
But
\[
m_N(T(K)) \le \int_K |J_T|\,dm_N \le \int_E |J_T|\,dm_N,
\]
from which (7.3.7) follows.

Remark 7.3.1 We needed the boundedness of $V$ to ensure that $T(E)$ has finite measure. If $T : U \to V$ is such that $|J_T|$ is integrable over $U$, then, if $E \subset U$ is any Borel set, we can find an open set $W$ such that $E \subset W \subset U$ (cf. Proposition 2.2.2). Then, by Proposition 7.3.1, $T(W)$ will have finite measure and so the measure of $T(E)$ will also be finite. Then the proof of Corollary 7.3.1 will go through.

The following result is an immediate consequence of the preceding corollary.
Corollary 7.3.2 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a homeomorphism of $U$ onto $V$, which is also a $C^1$ map. If $E \subset U$ is a Borel set of measure zero, then so is $T(E)$.

Proposition 7.3.2 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a homeomorphism of $U$ onto $V$, which is also a $C^1$ map. Let $f : V \to \mathbb{R}$ be a non-negative Borel measurable function. Then
\[
\int_V f\,dm_N \le \int_U (f \circ T)|J_T|\,dm_N. \tag{7.3.10}
\]
Proof: Let $F \subset V$ be a Borel set. Then $F = T(E)$, where $E \subset U$ is a Borel set. Then $\chi_E = \chi_F \circ T$. Thus, if $f = \chi_F$, then (7.3.10) is just a restatement of (7.3.7). The result is now true for any non-negative simple function and, hence, by the monotone convergence theorem, for any non-negative Borel measurable function.

Henceforth, we will work with diffeomorphisms.
Proposition 7.3.3 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. Let $f$ be a non-negative Borel measurable function defined on $V$. Then
\[
\int_V f\,dm_N = \int_U (f \circ T)|J_T|\,dm_N. \tag{7.3.11}
\]
Proof: We apply (7.3.10) to the function $(f \circ T)|J_T|$, defined on $U$, and to the diffeomorphism $T^{-1} : V \to U$. Set $Tx = y$. We then get
\begin{align*}
\int_U (f \circ T)(x)|J_T(x)|\,dm_N(x) &\le \int_V f(y)\,|J_T(T^{-1}(y))|\,|J_{T^{-1}}(y)|\,dm_N(y) \\
&= \int_V f(y)\,dm_N(y),
\end{align*}
since $|J_T(T^{-1}(y))|\,|J_{T^{-1}}(y)| = 1$ for all $y \in V$. This gives the reverse inequality of (7.3.10), thereby establishing (7.3.11).

Corollary 7.3.3 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. If $E$ is any Borel subset of $U$, then
\[
m_N(T(E)) = \int_E |J_T|\,dm_N. \tag{7.3.12}
\]
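As a numerical sanity check of (7.3.12), the following Python sketch compares both sides for one concrete map. The map $T(x, y) = (e^x \cos y, e^x \sin y)$ on $E = U = (0, 1) \times (0, \pi)$ is our own illustrative choice and does not come from the text. It is a diffeomorphism onto the open upper half of the annulus $1 < |z| < e$, whose area is $\pi(e^2 - 1)/2$, and $|J_T(x, y)| = e^{2x}$. The function names are hypothetical.

```python
import math

def abs_jacobian(x, y):
    """|J_T| for T(x, y) = (e^x cos y, e^x sin y).

    det [[e^x cos y, -e^x sin y], [e^x sin y, e^x cos y]] = e^{2x}.
    """
    return math.exp(2.0 * x)

def integral_of_abs_jacobian(n=400):
    """Midpoint rule for the integral of |J_T| over E = (0, 1) x (0, pi)."""
    hx, hy = 1.0 / n, math.pi / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            total += abs_jacobian(x, y) * hx * hy
    return total

# Left-hand side of (7.3.12): area of the half annulus T(E), in closed form.
area_of_image = 0.5 * math.pi * (math.e ** 2 - 1.0)
# Right-hand side of (7.3.12), approximated numerically.
approx = integral_of_abs_jacobian()
print(area_of_image, approx)
```

The two printed numbers agree up to the discretisation error of the midpoint rule.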
We can now extend Lemma 2.3.1 to Lebesgue measurable sets.
Proposition 7.3.4 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. If $E \subset U$ is Lebesgue measurable, then $T(E)$ is a Lebesgue measurable subset of $V$.

Proof: We can write (cf. Theorem 1.4.1) $E = F \cup N$, where $F$ is a Borel set and $N$ is a subset of a Borel set $A$, where $m_N(A) = 0$. Then $T(E) = T(F) \cup T(N)$ and $T(F)$ is a Borel set. We also have that $T(A)$ is a Borel set of measure zero (cf. Corollary 7.3.2) and $T(N) \subset T(A)$. Thus, $T(E)$ is Lebesgue measurable.

Theorem 7.3.1 (Change of variable formula) Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. Let $f : V \to \mathbb{R}$ be an integrable function. Then (7.3.11) holds.
Proof: Let $E \subset U$ be a Lebesgue measurable set. Then if we write $E = F \cup N$ as in the proof of the preceding proposition, we may assume, without loss of generality, that $F \cap N = \emptyset$. Then, since $T(N)$ has measure zero,
\[
m_N(T(E)) = m_N(T(F)) = \int_F |J_T|\,dm_N = \int_{F \cup N} |J_T|\,dm_N = \int_E |J_T|\,dm_N.
\]
If $G \subset V$ is a Lebesgue measurable set, then we can write $G = T(E)$, where $E \subset U$ is Lebesgue measurable. The preceding considerations then show that (7.3.11) holds for $f = \chi_G$. Consequently, the relation remains valid for any non-negative simple function, and hence, by the monotone convergence theorem, for any non-negative measurable function. If $f$ is an integrable function, then (7.3.11) holds for both $f^+$ and $f^-$, and hence for $f$ as well.

Example 7.3.1 Consider a continuous function $f : [-1, 1] \to \mathbb{R}$. When
studying the change of variable $y = -x$ in an undergraduate class, we usually set $dy = -dx$ and we get
\[
\int_{-1}^{1} f(x)\,dx = -\int_{1}^{-1} f(-y)\,dy.
\]
We then declare that
\[
\int_{1}^{-1} f(-y)\,dy = -\int_{-1}^{1} f(-y)\,dy,
\]
to get
\[
\int_{-1}^{1} f(x)\,dx = \int_{-1}^{1} f(-y)\,dy. \tag{7.3.13}
\]
The correct way of interpreting this is to use (7.3.11). We set $T(x) = -x$. Then $|J_T(x)| = |T'(x)| = 1$ for all $x \in (-1, 1)$. We also have $T((-1, 1)) = (-1, 1)$. Thus (7.3.11) gives us
\[
\int_{(-1,1)} f(x)\,dm_1(x) = \int_{(-1,1)} f(-y)\,dm_1(y),
\]
which is the same as (7.3.13).

Example 7.3.2 (Polar coordinates) Let $D \subset \mathbb{R}^2$ denote the open disc of radius $a > 0$, i.e.
\[
D = \{(x, y) \in \mathbb{R}^2 \mid |x|^2 + |y|^2 < a^2\}.
\]
Consider the open set $V = D \setminus \{(x, 0) \mid 0 \le x < a\}$. Let $U = (0, a) \times (0, 2\pi) \subset \mathbb{R}^2$. Then the mapping $T : U \to V$ defined by $T(r, \theta) = (x, y)$, where
\[
x = r\cos\theta, \qquad y = r\sin\theta,
\]
defines a diffeomorphism between $U$ and $V$. We have
\[
J_T = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r.
\]
Thus, if $f : V \to \mathbb{R}$ is an integrable function, we have
\[
\int_V f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta).
\]
Since $D$ and $V$ differ by a set of measure zero, we have
\[
\int_D f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta).
\]
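The identity for the disc can be checked numerically. The sketch below is only an illustration under our own assumptions: we take $a = 1$ and the integrand $f(x, y) = (a^2 - x^2 - y^2)^2$ on $D$, extended by zero, for which both sides equal $\pi a^6/3$. The function names and the grid sizes are hypothetical choices.

```python
import math

a = 1.0  # radius of the disc D (illustrative choice)

def f(x, y):
    """Test integrand on D, extended by zero outside D."""
    s = a * a - x * x - y * y
    return s * s if s > 0.0 else 0.0

def cartesian_integral(n=400):
    """Midpoint rule for the integral of f over the square [-a, a]^2, which contains D."""
    h = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        for j in range(n):
            y = -a + (j + 0.5) * h
            total += f(x, y) * h * h
    return total

def polar_integral(n_r=200, n_theta=200):
    """Midpoint rule for the integral of r f(r cos t, r sin t) over (0, a) x (0, 2 pi)."""
    hr, ht = a / n_r, 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for j in range(n_theta):
            t = (j + 0.5) * ht
            total += r * f(r * math.cos(t), r * math.sin(t)) * hr * ht
    return total

exact = math.pi * a ** 6 / 3.0  # closed form of both integrals
print(cartesian_integral(), polar_integral(), exact)
```

The factor $r$ in `polar_integral` is exactly the Jacobian $J_T = r$ computed above; dropping it would make the two sums disagree.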
If $f : \mathbb{R}^2 \to \mathbb{R}$ is a non-negative measurable function, then, letting $a \to \infty$, by the monotone convergence theorem, we have
\[
\int_{\mathbb{R}^2} f\,dm_2 = \int_{(0,+\infty) \times (0,2\pi)} r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \tag{7.3.14}
\]
By considering the positive and negative parts of $f$, we have that (7.3.14) is valid for any integrable function $f : \mathbb{R}^2 \to \mathbb{R}$. In the next chapter, we will write (7.3.14) in a more familiar form.

Chapter 8
Product spaces

8.1 Measurability in the product space
Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be two measure spaces. We would like to define a σ-algebra and a measure on the product $X \times Y$ which are compatible with the structures given on $X$ and $Y$, and also relate the process of integration with respect to this measure with the processes of integration on $X$ and $Y$.

Definition 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be two measurable spaces. A measurable rectangle is a subset of $X \times Y$ of the form $A \times B$, where $A \in \mathcal{S}$ and $B \in \mathcal{T}$. An elementary set is a finite disjoint union of measurable rectangles. The σ-algebra generated by the collection of all elementary sets is denoted by $\mathcal{S} \times \mathcal{T}$.

Definition 8.1.2 Let $X$ and $Y$ be non-empty sets. Let $E \subset X \times Y$. Let $x \in X$. Then the $x$-section of $E$, denoted $E_x$, is defined by
\[
E_x = \{y \in Y \mid (x, y) \in E\}.
\]
Similarly, for $y \in Y$, the $y$-section of $E$, denoted $E^y$, is defined by
\[
E^y = \{x \in X \mid (x, y) \in E\}.
\]
Thus, $E_x \subset Y$ and $E^y \subset X$.

Proposition 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $E \in \mathcal{S} \times \mathcal{T}$. Then $E_x \in \mathcal{T}$ and $E^y \in \mathcal{S}$ for every $x \in X$ and for every $y \in Y$.
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_8
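Proof: Let $\mathcal{U}$ denote the collection of all subsets $E$ of $X \times Y$ such that $E_x \in \mathcal{T}$ for every $x \in X$. If $E = A \times B$ is a measurable rectangle, then
\[
E_x = \begin{cases} B, & \text{if } x \in A, \\ \emptyset, & \text{if } x \notin A. \end{cases}
\]
Thus, every measurable rectangle belongs to $\mathcal{U}$. In particular, $X \times Y \in \mathcal{U}$. Now, if $E \subset X \times Y$, and if $x \in X$, we have
\[
(E_x)^c = \{y \in Y \mid (x, y) \notin E\} = (E^c)_x.
\]
Thus, if $E \in \mathcal{U}$, then $E^c \in \mathcal{U}$. Similarly, if $\{E_i\}_{i=1}^{\infty}$ is a sequence of sets in $X \times Y$ and if $E = \bigcup_{i=1}^{\infty} E_i$, then, for any $x \in X$, we have
\[
E_x = \bigcup_{i=1}^{\infty} (E_i)_x.
\]
This shows that if $\{E_i\}_{i=1}^{\infty}$ is a countable collection of sets in $\mathcal{U}$, then $E = \bigcup_{i=1}^{\infty} E_i \in \mathcal{U}$. Thus, $\mathcal{U}$ is a σ-algebra on $X \times Y$ which contains all measurable rectangles and so it must contain all members of $\mathcal{S} \times \mathcal{T}$. This shows that if $E \in \mathcal{S} \times \mathcal{T}$, then, for every $x \in X$, we have that $E_x \in \mathcal{T}$. In the same way we can show that if $E$ is in $\mathcal{S} \times \mathcal{T}$, then $E^y \in \mathcal{S}$ for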
every $y \in Y$. This completes the proof.

Definition 8.1.3 Let $X$ be any non-empty set. A monotone class is a collection $\mathcal{M}$ of subsets of $X$ which is closed under countable increasing unions and countable decreasing intersections, i.e. if $\{A_i\}_{i=1}^{\infty}$ and $\{B_i\}_{i=1}^{\infty}$ are two countable collections of subsets of $X$ in $\mathcal{M}$ such that, for all $i \in \mathbb{N}$, we have $A_i \subset A_{i+1}$ and $B_i \supset B_{i+1}$, then
\[
\bigcup_{i=1}^{\infty} A_i \in \mathcal{M} \quad \text{and} \quad \bigcap_{i=1}^{\infty} B_i \in \mathcal{M}.
\]
Remark 8.1.1 Any σ-ring, and so, in particular, any σ-algebra, is a monotone class.

Remark 8.1.2 The intersection of monotone classes is obviously a monotone class. The collection $\mathcal{P}(X)$, of all subsets of a non-empty set $X$, is obviously a monotone class. Thus, if $\mathcal{A}$ is any collection of subsets of $X$, there exists a smallest monotone class containing $\mathcal{A}$. We will denote it by $\mathcal{M}(\mathcal{A})$ and will call it the monotone class generated by $\mathcal{A}$.

Lemma 8.1.1 Let $X$ be a non-empty set and let $\mathcal{M}$ be a monotone class of subsets of $X$. Let $P \subset X$. Define
\[
\mathcal{U}(P) = \{Q \subset X \mid P \cup Q,\ P \setminus Q \text{ and } Q \setminus P \text{ are all in } \mathcal{M}\}.
\]
Then $\mathcal{U}(P)$ is a monotone class.
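Proof: Let $\{Q_i\}_{i=1}^{\infty}$ be an increasing sequence of sets in $\mathcal{U}(P)$. Then $\{P \cup Q_i\}_{i=1}^{\infty}$ and $\{Q_i \setminus P\}_{i=1}^{\infty}$ are increasing sequences of sets in $\mathcal{M}$. Consequently,
\[
P \cup \Big(\bigcup_{i=1}^{\infty} Q_i\Big) = \bigcup_{i=1}^{\infty} (P \cup Q_i) \in \mathcal{M},
\]
and
\[
\Big(\bigcup_{i=1}^{\infty} Q_i\Big) \setminus P = \bigcup_{i=1}^{\infty} (Q_i \setminus P) \in \mathcal{M}.
\]
Finally, $\{P \setminus Q_i\}_{i=1}^{\infty}$ is a decreasing sequence of sets in $\mathcal{M}$. Hence
\[
P \setminus \Big(\bigcup_{i=1}^{\infty} Q_i\Big) = \bigcap_{i=1}^{\infty} (P \setminus Q_i) \in \mathcal{M}.
\]
Thus, we see that $\bigcup_{i=1}^{\infty} Q_i \in \mathcal{U}(P)$. In the same way, it is easy to see that if $\{Q_i\}_{i=1}^{\infty}$ is a decreasing sequence of sets in $\mathcal{U}(P)$, then $\bigcap_{i=1}^{\infty} Q_i \in \mathcal{U}(P)$ as well. This completes the proof.

Lemma 8.1.2 Let $X$ be a non-empty set and let $\mathcal{R}$ be an algebra of subsets of $X$. Let $\mathcal{M}(\mathcal{R})$ denote the monotone class generated by $\mathcal{R}$. Then $\mathcal{M}(\mathcal{R}) = \mathcal{S}(\mathcal{R})$, the σ-algebra generated by $\mathcal{R}$.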
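Proof: Let $P \in \mathcal{R}$. If $Q \in \mathcal{R}$, then $P \cup Q$, $P \setminus Q$ and $Q \setminus P$ belong to $\mathcal{R}$ and hence to $\mathcal{M}(\mathcal{R})$ as well. Thus, if $\mathcal{U}(P)$ is as defined in the preceding lemma (with $\mathcal{M} = \mathcal{M}(\mathcal{R})$), we have that $Q \in \mathcal{U}(P)$. Since $\mathcal{U}(P)$ is a monotone class containing $\mathcal{R}$, it follows that $\mathcal{U}(P) \supset \mathcal{M}(\mathcal{R})$.
Now, let $Q \in \mathcal{M}(\mathcal{R})$. Then, we have just seen that $Q \in \mathcal{U}(P)$. By symmetry of the definition of $\mathcal{U}(P)$, we immediately deduce that $P \in \mathcal{U}(Q)$. Thus, since $\mathcal{U}(Q)$ is a monotone class containing $\mathcal{R}$, we have that $\mathcal{U}(Q) \supset \mathcal{M}(\mathcal{R})$. Thus, for all $P$ and $Q$ in $\mathcal{M}(\mathcal{R})$, we have that $P \cup Q$ and $P \setminus Q$ belong to $\mathcal{M}(\mathcal{R})$. Thus, we deduce that $\mathcal{M}(\mathcal{R})$ is an algebra as well.
Now let $\{E_i\}_{i=1}^{\infty}$ be a countable collection of members of $\mathcal{M}(\mathcal{R})$. Then $F_n = \bigcup_{i=1}^{n} E_i \in \mathcal{M}(\mathcal{R})$, since the latter collection is an algebra. Since $\{F_n\}_{n=1}^{\infty}$ is an increasing sequence of sets in $\mathcal{M}(\mathcal{R})$, which is a monotone class, we have
\[
\bigcup_{i=1}^{\infty} E_i = \bigcup_{n=1}^{\infty} F_n \in \mathcal{M}(\mathcal{R}).
\]
Thus $\mathcal{M}(\mathcal{R})$ is a σ-algebra containing $\mathcal{R}$, from which we deduce that $\mathcal{M}(\mathcal{R}) \supset \mathcal{S}(\mathcal{R})$. Since every σ-algebra is a monotone class, we also have that $\mathcal{M}(\mathcal{R}) \subset \mathcal{S}(\mathcal{R})$. This completes the proof.

Proposition 8.1.2 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Then $\mathcal{S} \times \mathcal{T}$ is the smallest monotone class containing all elementary sets.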
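Proof: Let us denote by $\mathcal{E}$ the class of elementary sets. Let $A_i \in \mathcal{S}$ and $B_i \in \mathcal{T}$ for $i = 1, 2$. Then (check!)
\begin{align*}
(A_1 \times B_1) \cap (A_2 \times B_2) &= (A_1 \cap A_2) \times (B_1 \cap B_2), \\
(A_1 \times B_1) \setminus (A_2 \times B_2) &= ((A_1 \setminus A_2) \times B_1) \cup ((A_1 \cap A_2) \times (B_1 \setminus B_2)).
\end{align*}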
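The two identities marked "check!" can at least be tested on small finite sets. The following Python sketch does this for one arbitrary choice of $A_1, A_2, B_1, B_2$ (our own sample data, with a hypothetical helper `rect`); it is a spot check, not a proof.

```python
from itertools import product

def rect(A, B):
    """The rectangle A x B as a set of ordered pairs."""
    return set(product(A, B))

# Arbitrary sample sets, chosen only for illustration.
A1, B1 = {1, 2, 3}, {"a", "b", "c"}
A2, B2 = {2, 3, 4}, {"b", "c", "d"}

# First identity: the intersection of two rectangles is a rectangle.
inter_lhs = rect(A1, B1) & rect(A2, B2)
inter_rhs = rect(A1 & A2, B1 & B2)

# Second identity: the difference is a union of two rectangles ...
diff_lhs = rect(A1, B1) - rect(A2, B2)
piece1 = rect(A1 - A2, B1)
piece2 = rect(A1 & A2, B1 - B2)

print(inter_lhs == inter_rhs)        # first identity
print(diff_lhs == piece1 | piece2)   # second identity
print(piece1.isdisjoint(piece2))     # ... and the two pieces are disjoint
```

The disjointness of the two pieces holds in general because their first coordinates lie in the disjoint sets $A_1 \setminus A_2$ and $A_1 \cap A_2$.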
Thus, the intersection of two measurable rectangles is a measurable rectangle and their difference is the disjoint union of two measurable rectangles. It then follows that if $P$ and $Q$ are in $\mathcal{E}$, we have that $P \cup Q$ and $P \setminus Q$ are also in $\mathcal{E}$. Thus $\mathcal{E}$ is an algebra. The result now follows from Lemma 8.1.2.

Definition 8.1.4 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f : X \times Y \to \mathbb{R}$ be a given function. Let $x \in X$ and $y \in Y$. The $x$-section, $f_x$, and the $y$-section, $f^y$, are defined by
\[
f_x(y) = f(x, y) = f^y(x).
\]
We are now dealing with three σ-algebras, viz. the σ-algebra $\mathcal{S}$ on $X$, the σ-algebra $\mathcal{T}$ on $Y$, and the σ-algebra $\mathcal{S} \times \mathcal{T}$ on $X \times Y$. To avoid confusion, we will say a set (respectively, function) is $\mathcal{S}$-measurable, $\mathcal{T}$-measurable or $\mathcal{S} \times \mathcal{T}$-measurable according to the context.
Proposition 8.1.3 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f$ be an $\mathcal{S} \times \mathcal{T}$-measurable function defined on $X \times Y$. Then, for every $x \in X$ and for every $y \in Y$, the section $f_x$ is $\mathcal{T}$-measurable and the section $f^y$ is $\mathcal{S}$-measurable.

Proof: Let $c \in \mathbb{R}$. Then
\[
Q = \{(x, y) \mid f(x, y) > c\}
\]
is $\mathcal{S} \times \mathcal{T}$-measurable. Then, for $x \in X$,
\[
\{y \in Y \mid f_x(y) > c\} = Q_x
\]
is $\mathcal{T}$-measurable by Proposition 8.1.1. Thus, $f_x$ is $\mathcal{T}$-measurable. Similarly, $f^y$ is $\mathcal{S}$-measurable.

Example 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f : X \to \mathbb{R}$ be an $\mathcal{S}$-measurable function. Define $F : X \times Y \to \mathbb{R}$ by $F(x, y) = f(x)$ for every $(x, y) \in X \times Y$. Then, if $c \in \mathbb{R}$, we have
\[
\{(x, y) \in X \times Y \mid F(x, y) > c\} = \{x \in X \mid f(x) > c\} \times Y,
\]
which is $\mathcal{S} \times \mathcal{T}$-measurable. Thus, $F$ is $\mathcal{S} \times \mathcal{T}$-measurable. Similarly, if $g : Y \to \mathbb{R}$ is $\mathcal{T}$-measurable, we also have that the function $(x, y) \mapsto g(y)$ is $\mathcal{S} \times \mathcal{T}$-measurable. Since the product of measurable functions is measurable, we have that the function $\varphi : X \times Y \to \mathbb{R}$ defined by
$\varphi(x, y) = f(x)g(y)$ is also $\mathcal{S} \times \mathcal{T}$-measurable.

Example 8.1.2 Let $\mathbb{R}$ be equipped with the Borel or the Lebesgue σ-algebra. Let $(X, \mathcal{S})$ be a measurable space. Let $f : \mathbb{R} \times X \to \mathbb{R}$ be a function such that
• for every $x \in X$, the function $t \mapsto f(t, x)$ is continuous on $\mathbb{R}$;
• for every $t \in \mathbb{R}$, the function $x \mapsto f(t, x)$ is $\mathcal{S}$-measurable.
(Such a function is called a Carathéodory function.) Then, $f$ is measurable on the product space $\mathbb{R} \times X$.
To see this, first observe that for any fixed $n \in \mathbb{N}$,
\[
\mathbb{R} = \bigcup_{k \in \mathbb{Z}} \Big(\frac{k-1}{2^n}, \frac{k}{2^n}\Big].
\]
Define $f_n(t, x) = f\big(\frac{k}{2^n}, x\big)$ if $t \in \big(\frac{k-1}{2^n}, \frac{k}{2^n}\big]$. Thus, we can write
\[
f_n(t, x) = \sum_{k \in \mathbb{Z}} f\Big(\frac{k}{2^n}, x\Big)\, \chi_{\left(\frac{k-1}{2^n}, \frac{k}{2^n}\right]}(t).
\]
By the previous example, and by hypothesis, we see that each term in the above series is a measurable function on $\mathbb{R} \times X$, from which we deduce that $f_n$ is a measurable function on $\mathbb{R} \times X$.
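The step functions $f_n$ are easy to experiment with. In the Python sketch below, the particular function $f(t, x) = \sin(tx) + x$ and the evaluation point are our own illustrative choices (with $X = \mathbb{R}$), and the names `f` and `f_n` are hypothetical. The index $k$ with $(k-1)/2^n < t \le k/2^n$ is obtained as the ceiling of $2^n t$.

```python
import math

def f(t, x):
    """Illustrative Caratheodory function: continuous in t for each fixed x."""
    return math.sin(t * x) + x

def f_n(n, t, x):
    """Dyadic step approximation: f(k/2^n, x) for t in ((k-1)/2^n, k/2^n]."""
    k = math.ceil(t * 2 ** n)  # the unique k with (k-1)/2^n < t <= k/2^n
    return f(k / 2 ** n, x)

t0, x0 = 0.3, 2.0
errors = [abs(f_n(n, t0, x0) - f(t0, x0)) for n in (2, 6, 10, 14)]
print(errors)
```

For this choice of $f$ the error at level $n$ is at most $|x_0|/2^n$, because $t \mapsto \sin(t x_0)$ is Lipschitz with constant $|x_0|$ and the grid point $k/2^n$ lies within $2^{-n}$ of $t_0$.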
Now, let $(t_0, x_0) \in \mathbb{R} \times X$. Let $\varepsilon > 0$ be given. Then, there exists $\delta > 0$ such that, if $|t - t_0| < \delta$, we have, by hypothesis, that $|f(t, x_0) - f(t_0, x_0)| < \varepsilon$. If $n$ is large enough so that $\frac{1}{2^n} < \delta$, it then follows that $|f_n(t_0, x_0) - f(t_0, x_0)| < \varepsilon$, by construction of the function $f_n$. Thus, $f_n \to f$ pointwise and so $f$ is measurable on $\mathbb{R} \times X$.

8.2 The product measure
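Theorem 8.2.1 Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be σ-finite measure spaces. Let $Q \in \mathcal{S} \times \mathcal{T}$. Define, for $x \in X$ and $y \in Y$,
\[
\varphi(x) = \lambda(Q_x) \quad \text{and} \quad \psi(y) = \mu(Q^y).
\]
Then $\varphi$ is $\mathcal{S}$-measurable and $\psi$ is $\mathcal{T}$-measurable. Further,
\[
\int_X \varphi\,d\mu = \int_Y \psi\,d\lambda. \tag{8.2.1}
\]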
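Proof: Let $\mathcal{U}$ be the collection of all sets in $\mathcal{S} \times \mathcal{T}$ such that (8.2.1) holds.
Step 1. Let $Q = A \times B$, where $A \in \mathcal{S}$ and $B \in \mathcal{T}$. Then (cf. the proof of Proposition 8.1.1), $\varphi = \lambda(B)\chi_A$ and $\psi = \mu(A)\chi_B$. Thus $\varphi$ is $\mathcal{S}$-measurable and $\psi$ is $\mathcal{T}$-measurable. We also have
\[
\int_X \varphi\,d\mu = \mu(A)\lambda(B) = \int_Y \psi\,d\lambda.
\]
Thus, every measurable rectangle is in $\mathcal{U}$.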
Step 2. Let $\{Q_i\}_{i=1}^{\infty}$ be an increasing sequence of sets in $\mathcal{U}$. Set $Q = \bigcup_{i=1}^{\infty} Q_i$. Let $\varphi_i(x) = \lambda((Q_i)_x)$ and let $\psi_i(y) = \mu((Q_i)^y)$ for $x \in X$ and $y \in Y$. Since $\{(Q_i)_x\}_{i=1}^{\infty}$ is an increasing sequence of sets whose union is $Q_x$ and, similarly, since $\{(Q_i)^y\}_{i=1}^{\infty}$ is an increasing sequence of sets whose union is $Q^y$, we have that $\{\varphi_i\}_{i=1}^{\infty}$ and $\{\psi_i\}_{i=1}^{\infty}$ are increasing sequences of non-negative functions such that (cf. Proposition 1.2.4)
\begin{align*}
\lim_{i \to \infty} \varphi_i(x) &= \lambda(Q_x) \overset{\text{def}}{=} \varphi(x), \text{ and} \\
\lim_{i \to \infty} \psi_i(y) &= \mu(Q^y) \overset{\text{def}}{=} \psi(y).
\end{align*}
Since $\int_X \varphi_i\,d\mu = \int_Y \psi_i\,d\lambda$ for each $i$, it follows from the monotone convergence theorem that (8.2.1) holds. Thus $Q \in \mathcal{U}$.
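Step 3. It is very easy to see that if $\{Q_i\}_{i=1}^{n}$ is a finite collection of disjoint sets in $\mathcal{U}$, then $\bigcup_{i=1}^{n} Q_i \in \mathcal{U}$ as well. Now, given any countable collection of disjoint sets $\{Q_i\}_{i=1}^{\infty}$ in $\mathcal{U}$, we set $R_n = \bigcup_{i=1}^{n} Q_i$ so that $R_n \in \mathcal{U}$ for each $n \in \mathbb{N}$. Since $\{R_n\}_{n=1}^{\infty}$ is an increasing sequence, it follows, from Step 2, that
\[
\bigcup_{i=1}^{\infty} Q_i = \bigcup_{n=1}^{\infty} R_n \in \mathcal{U}.
\]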
Step 4. Let $A \in \mathcal{S}$ and $B \in \mathcal{T}$ be such that $\mu(A) < +\infty$ and $\lambda(B) < +\infty$. Let $\{Q_i\}_{i=1}^{\infty}$ be a sequence of sets in $\mathcal{U}$ such that
\[
A \times B \supset Q_1 \supset Q_2 \supset \cdots \supset Q_i \supset Q_{i+1} \supset \cdots.
\]
Then in the same manner as in Step 2 (using the dominated convergence theorem instead of the monotone convergence theorem), we can show that $\bigcap_{i=1}^{\infty} Q_i \in \mathcal{U}$.
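Step 5. Since both the measure spaces are σ-finite, we can write
\[
X = \bigcup_{n=1}^{\infty} X_n \quad \text{and} \quad Y = \bigcup_{m=1}^{\infty} Y_m,
\]
where $\{X_n\}_{n=1}^{\infty}$ and $\{Y_m\}_{m=1}^{\infty}$ are sequences of disjoint sets such that for all $n$ and $m$ we have $\mu(X_n) < +\infty$ and $\lambda(Y_m) < +\infty$. Let $Q \in \mathcal{S} \times \mathcal{T}$. Set $Q_{nm} = Q \cap (X_n \times Y_m)$. Let $\mathcal{M}$ be the collection of all sets $Q$ in $\mathcal{S} \times \mathcal{T}$ such that $Q_{nm} \in \mathcal{U}$ for all $n$ and $m$. By Steps 2 and 4, it follows that $\mathcal{M}$ is a monotone class. By Steps 1 and 3, it follows that all elementary sets are in $\mathcal{M}$. Thus, $\mathcal{M}$ is a monotone class containing all elementary sets and is contained in $\mathcal{S} \times \mathcal{T}$. It now follows, from Proposition 8.1.2, that $\mathcal{M} = \mathcal{S} \times \mathcal{T}$.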
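Step 6. If $Q \in \mathcal{S} \times \mathcal{T}$, then by Step 5, $Q_{nm} \in \mathcal{U}$ for all $n$ and $m$. Then, since the $Q_{nm}$ are all disjoint and since
\[
Q = \bigcup_{n=1}^{\infty} \bigcup_{m=1}^{\infty} Q_{nm},
\]
it follows, from Step 3, that $Q \in \mathcal{U}$ as well. Thus $\mathcal{U} = \mathcal{S} \times \mathcal{T}$. This completes the proof.

We can use the preceding theorem to define the product measure on $\mathcal{S} \times \mathcal{T}$.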
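Definition 8.2.1 Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be σ-finite measure spaces. The product measure, denoted $\mu \times \lambda$, is defined for $Q \in \mathcal{S} \times \mathcal{T}$ by
\[
(\mu \times \lambda)(Q) = \int_X \lambda(Q_x)\,d\mu(x) = \int_Y \mu(Q^y)\,d\lambda(y).
\]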
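On finite sets with point masses, the two integrals in Definition 8.2.1 reduce to finite sums and can be compared directly. The following Python sketch uses made-up weights `mu` and `lam` and a made-up set `Q` (all hypothetical, chosen only for illustration) and exact rational arithmetic, so that the equality can be tested without rounding.

```python
from fractions import Fraction

# Hypothetical finite measure spaces: every subset is measurable and the
# measure of a set is the sum of the weights of its points.
mu = {"x1": Fraction(1, 2), "x2": Fraction(3, 2), "x3": Fraction(2)}
lam = {"y1": Fraction(1), "y2": Fraction(1, 3)}

# A subset Q of X x Y that is not a rectangle.
Q = {("x1", "y1"), ("x2", "y1"), ("x2", "y2"), ("x3", "y2")}

def x_section(Q, x):
    """Q_x = {y : (x, y) in Q}."""
    return {b for (a, b) in Q if a == x}

def y_section(Q, y):
    """Q^y = {x : (x, y) in Q}."""
    return {a for (a, b) in Q if b == y}

def measure(weights, S):
    """Measure of the finite set S with respect to the given point masses."""
    return sum((weights[p] for p in S), Fraction(0))

phi = {x: measure(lam, x_section(Q, x)) for x in mu}   # phi(x) = lambda(Q_x)
psi = {y: measure(mu, y_section(Q, y)) for y in lam}   # psi(y) = mu(Q^y)

integral_phi = sum(phi[x] * mu[x] for x in mu)         # integral of phi d(mu)
integral_psi = sum(psi[y] * lam[y] for y in lam)       # integral of psi d(lambda)
direct = sum(mu[x] * lam[y] for (x, y) in Q)           # sum over the points of Q

print(integral_phi, integral_psi, direct)
```

All three quantities equal $19/6$ for this data, which is the value of $(\mu \times \lambda)(Q)$.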
Example 8.2.1 Let $X = Y = \mathbb{R}$ be equipped with the Lebesgue measure. The $x$-axis can be written as the disjoint union of measurable rectangles:
\[
\{(x, 0) \mid x \in \mathbb{R}\} = \bigcup_{n \in \mathbb{Z}} [n, n+1) \times \{0\},
\]
and so its measure, for the product measure $m_1 \times m_1$, is zero. Let $E \subset [0, 1] \subset \mathbb{R}$ be a non-measurable set, i.e. $E \notin \mathcal{L}_1$. Then (cf. Proposition 8.1.1), the set $E \times \{0\} \notin \mathcal{L}_1 \times \mathcal{L}_1$. However,
\[
E \times \{0\} \subset [0, 1] \times \{0\},
\]
and (cf. Example 2.1.1)
\[
m_2([0, 1] \times \{0\}) = 0 = (m_1 \times m_1)([0, 1] \times \{0\}).
\]
Since $m_2$ is complete, it follows that $E \times \{0\} \in \mathcal{L}_2$. This shows that even though $m_1$ is complete, it does not follow that $m_1 \times m_1$ is complete, and also shows that $\mathcal{L}_1 \times \mathcal{L}_1 \ne \mathcal{L}_2$.
Remark 8.2.1 In view of the above example, we can ask ourselves what is the relationship between the product of Lebesgue measures and the Lebesgue measure of the product space. We sketch the argument below.
Let $\ell = k + n$ and let us consider $\mathbb{R}^\ell$ as the product space $\mathbb{R}^k \times \mathbb{R}^n$. We have the Borel sets $\mathcal{B}_\ell$, $\mathcal{B}_k$ and $\mathcal{B}_n$ in $\mathbb{R}^\ell$, $\mathbb{R}^k$ and $\mathbb{R}^n$ respectively. Similarly we have the Lebesgue measurable sets $\mathcal{L}_\ell$, $\mathcal{L}_k$ and $\mathcal{L}_n$ as well.
Now, any open set in $\mathbb{R}^\ell$ can be expressed as the countable disjoint union of (half-open) boxes (cf. Lemma 2.2.1). Thus, all open sets are in $\mathcal{L}_k \times \mathcal{L}_n$ and so we deduce that
\[
\mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n.
\]
If $E \subset \mathbb{R}^k$ is a Lebesgue measurable subset, then (cf. Proposition 2.2.2) it can be approximated from above by a $G_\delta$ set and from below by an $F_\sigma$ set. It follows from this that $E \times \mathbb{R}^n$ is Lebesgue measurable in $\mathbb{R}^\ell$. Similarly, if $F \subset \mathbb{R}^n$ is Lebesgue measurable, then $\mathbb{R}^k \times F$ will be Lebesgue measurable in $\mathbb{R}^\ell$. Then, their intersection $E \times F$ will be Lebesgue measurable in $\mathbb{R}^\ell$. Thus, $\mathcal{L}_\ell$ contains all measurable rectangles and it follows from this that
\[
\mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n \subset \mathcal{L}_\ell.
\]
We know that $m_\ell$ and $m_k \times m_n$ both agree on all boxes. Both these measures are also easily seen to be translation invariant, outer-regular (cf. Remark 2.2.1) and finite on all compact sets. Thus (cf. Theorem 2.3.2), it will follow that these measures agree on $\mathcal{B}_\ell$.
Now, if $Q$ is $\mathcal{L}_k \times \mathcal{L}_n$-measurable, it is also $\mathcal{L}_\ell$-measurable and so there exist $P_i \in \mathcal{B}_\ell$, $i = 1, 2$, such that $P_1 \subset Q \subset P_2$ and such that $m_\ell(P_2 \setminus P_1) = 0$ (cf. Proposition 2.2.2). Thus,
\[
(m_k \times m_n)(Q \setminus P_1) \le (m_k \times m_n)(P_2 \setminus P_1) = m_\ell(P_2 \setminus P_1) = 0.
\]
Consequently,
\[
(m_k \times m_n)(Q) = (m_k \times m_n)(P_1) = m_\ell(P_1) = m_\ell(Q).
\]
Thus, $m_k \times m_n$ and $m_\ell$ agree on $\mathcal{L}_k \times \mathcal{L}_n$ as well. Since the Lebesgue measure is the completion of the same measure on Borel sets, it follows that $m_\ell$ is the completion of $m_k \times m_n$ as well.

8.3 Fubini's theorem
Theorem 8.3.1 (Fubini's theorem) Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be two σ-finite measure spaces. Let $f$ be an extended real-valued function defined on $X \times Y$ which is $\mathcal{S} \times \mathcal{T}$-measurable.
(a) Let $f$ be non-negative. Define, for $x \in X$ and $y \in Y$,
\[
\varphi(x) = \int_Y f_x\,d\lambda \quad \text{and} \quad \psi(y) = \int_X f^y\,d\mu. \tag{8.3.1}
\]
Then $\varphi$ is $\mathcal{S}$-measurable, $\psi$ is $\mathcal{T}$-measurable and
\[
\int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \times \lambda) = \int_Y \psi\,d\lambda. \tag{8.3.2}
\]
(b) Assume that $\varphi^*$ is integrable over $X$, with respect to the measure $\mu$, where, for $x \in X$,
\[
\varphi^*(x) = \int_Y |f|_x\,d\lambda.
\]
Then $f$ is integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$.
(c) Let $f$ be integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$. Then, for almost every $x \in X$, the function $f_x$ is integrable over $Y$, with respect to the measure $\lambda$, and, for almost every $y \in Y$, the function $f^y$ is integrable over $X$, with respect to the measure $\mu$. Further, the functions $\varphi$ and $\psi$ defined by (8.3.1) above are integrable over $X$, with respect to the measure $\mu$, and over $Y$, with respect to the measure $\lambda$, respectively, and (8.3.2) holds.
Proof: (a) Since $f_x$ and $f^y$ are non-negative, the functions $\varphi$ and $\psi$ are defined. By definition of the product measure, (8.3.2) is exactly the conclusion of Theorem 8.2.1, when $f = \chi_Q$, where $Q \in \mathcal{S} \times \mathcal{T}$. By linearity, the result holds for all non-negative simple functions. Let $f$ be a non-negative $\mathcal{S} \times \mathcal{T}$-measurable function. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative simple functions increasing to $f$. Let $\varphi_n(x) = \int_Y (f_n)_x\,d\lambda$. Then, by the monotone convergence theorem, $\varphi_n \uparrow \varphi$. Further,
\[
\int_{X \times Y} f_n\,d(\mu \times \lambda) = \int_X \varphi_n\,d\mu.
\]
Once again, we can pass to the limit, as $n$ tends to infinity, to get
\[
\int_{X \times Y} f\,d(\mu \times \lambda) = \int_X \varphi\,d\mu.
\]
This proves one part of (8.3.2). The proof of the other part is similar.
(b) We apply the result of (a) to the function $|f|$. Thus, by hypothesis and by (8.3.2), we get
\[
\int_{X \times Y} |f|\,d(\mu \times \lambda) = \int_X \varphi^*\,d\mu < +\infty.
\]
This shows that $f$ is integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$.
(c) We write $f = f^+ - f^-$. Since $f$ is integrable, we have that $f^{\pm}$ are integrable non-negative functions. Let
\[
\varphi_{\pm}(x) = \int_Y (f^{\pm})_x\,d\lambda, \quad \text{and} \quad \psi_{\pm}(y) = \int_X (f^{\pm})^y\,d\mu.
\]
Then (8.3.2) holds for the triples $(f^{\pm}, \varphi_{\pm}, \psi_{\pm})$ replacing the triple $(f, \varphi, \psi)$. All the integrals are now finite and so, subtracting the relations for $f^-$ from those of $f^+$, we deduce (8.3.2) for the function $f$. This completes
the proof.

Remark 8.3.1 When $f$ is non-negative (case (a) of Theorem 8.3.1), all the integrals could be infinite. If even one of them is finite, all are finite and will be equal.

Remark 8.3.2 In the case (b) of the preceding theorem, of course, an analogous statement involving $\psi^*(y) = \int_X |f|^y\,d\mu$ is valid. Thus, if one of the iterated integrals
\[
\int_Y \int_X |f|^y\,d\mu\,d\lambda \quad \text{or} \quad \int_X \int_Y |f|_x\,d\lambda\,d\mu
\]
is finite, then we have that $f$ is integrable over $X \times Y$ and that (8.3.2) holds.

Remark 8.3.3 The relation (8.3.2) can also be written as
\[
\int_X \int_Y f(x, y)\,d\lambda(y)\,d\mu(x) = \int_{X \times Y} f\,d(\mu \times \lambda) = \int_Y \int_X f(x, y)\,d\mu(x)\,d\lambda(y).
\]
The first and the last term are referred to as iterated integrals.

Example 8.3.1 Let $X = Y = \mathbb{N}$, and let $\mu = \lambda$ be the counting measure.
Let $f(m, n) = a_{mn}$ be non-negative for all $m$ and $n$. Then, by case (a) of Theorem 8.3.1, we get that
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}.
\]
This was proved earlier as a consequence of the monotone convergence theorem (cf. Example 5.2.3). The same result is true without the non-negativity condition if we assume the extra condition
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}| < +\infty,
\]
as an application of case (b) of Theorem 8.3.1. We proved this earlier,
using the dominated convergence theorem (cf. Example 5.3.3).

Example 8.3.2 Once again, let $X = Y = \mathbb{N}$ and let $\mu = \lambda$ be the counting measure. Define
\[
f(m, n) = a_{m,n} = \begin{cases} 1, & \text{if } n = 1, \\ -1, & \text{if } n = m + 1, \\ 0, & \text{otherwise.} \end{cases}
\]
Then
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{m=1}^{\infty} (a_{m,1} + a_{m,m+1}) = 0,
\]
while
\[
\sum_{m=1}^{\infty} a_{m,1} = +\infty
\]
and so
\[
\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n} = +\infty.
\]
Note that in this case,
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{m,n}| = +\infty.
\]
Thus, the integrability of $f$ cannot be relaxed for a general function for
the validity of (8.3.2).

Example 8.3.3 Let $X = Y = [0, 1]$. Let $\mathcal{S} = \mathcal{T} = \mathcal{L}_1$. Let $\mu = m_1$ and let $\lambda$ be the counting measure. Thus $\lambda$ is not a σ-finite measure. Let
\[
D = \{(x, x) \mid x \in [0, 1]\} \subset X \times Y.
\]
Since $D$ is a closed set, it is Borel measurable and so $D \in \mathcal{L}_1 \times \mathcal{L}_1$ (cf. Remark 8.2.1). Let $f = \chi_D$. Then
\begin{align*}
\int_Y \int_X f(x, y)\,d\mu(x)\,d\lambda(y) &= 0, \\
\int_X \int_Y f(x, y)\,d\lambda(y)\,d\mu(x) &= 1.
\end{align*}
Thus, σ-finiteness cannot be dispensed with.

Example 8.3.4 (Integration by parts for absolutely continuous functions) Let $[a, b] \subset \mathbb{R}$ and let $f, g : [a, b] \to \mathbb{R}$ be absolutely continuous functions. Then, $f'$ and $g'$ exist almost everywhere and are integrable. Let us consider the integral of $\varphi(x, y) = f'(x)g'(y)$ on the set
\[
E = \{(x, y) \in [a, b] \times [a, b] \mid x \le y\},
\]
which is closed and hence is Borel measurable. Consequently, it also belongs to $\mathcal{L}_1 \times \mathcal{L}_1$. By case (a) of Theorem 8.3.1, we have
\begin{align*}
\int_{[a,b] \times [a,b]} |\varphi|\,d(m_1 \times m_1) &= \int_{[a,b]} \int_{[a,b]} |f'(x)|\,|g'(y)|\,dm_1(x)\,dm_1(y) \\
&= \int_{[a,b]} |f'|\,dm_1 \int_{[a,b]} |g'|\,dm_1 < +\infty.
\end{align*}
Thus $\varphi$ is integrable on $[a, b] \times [a, b]$ with respect to the measure $m_1 \times m_1$ and so we can apply Fubini's theorem. Consequently, we have
\[
\int_{[a,b]} \int_{[a,b]} f'(x)g'(y)\chi_E(x, y)\,dm_1(y)\,dm_1(x) = \int_{[a,b]} \int_{[a,b]} f'(x)g'(y)\chi_E(x, y)\,dm_1(x)\,dm_1(y). \tag{8.3.3}
\]
The left-hand side of (8.3.3) is equal to
\[
\int_{[a,b]} \Big(\int_{[x,b]} g'(y)\,dm_1(y)\Big) f'(x)\,dm_1(x),
\]
which yields
\[
\int_{[a,b]} (g(b) - g(x)) f'(x)\,dm_1(x) = g(b)f(b) - g(b)f(a) - \int_{[a,b]} g f'\,dm_1.
\]
We have used here the fact that both $f$ and $g$ are absolutely continuous and so the integral of the derivative is the difference of the values of the function at the end points of the interval (cf. Theorem 6.4.1). Similarly, the right-hand side of (8.3.3) is equal to
\[
\int_{[a,b]} \Big(\int_{[a,y]} f'(x)\,dm_1(x)\Big) g'(y)\,dm_1(y),
\]
which yields
\[
-g(b)f(a) + g(a)f(a) + \int_{[a,b]} f g'\,dm_1.
\]
Equating these two, we get
\[
\int_{[a,b]} f g'\,dm_1 = g(b)f(b) - g(a)f(a) - \int_{[a,b]} f' g\,dm_1,
\]
which is the formula for integration by parts. Example 8.3.5 (Polar coordinates.) In Example 7.3.2, we described
the transformation which led to polar coordinates. Let f : R2 → R be
an integrable function which is Borel measurable. Since the Lebesgue
measure m2 agrees with m1 × m1 on Borel sets, we can apply Fubini’s
theorem. Then, we can write (7.3.14) in the form
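$$\int_{\mathbb{R}^2} f\,dm_2 = \int_{(0,2\pi)} \int_{(0,+\infty)} r f(r\cos\theta, r\sin\theta)\,dm_1(r)\,dm_1(\theta).$$
If the integrand on the right-hand side is Riemann integrable, then we
may write
$$\int_{\mathbb{R}^2} f\,dm_2 = \int_0^{2\pi} \int_0^{\infty} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.$$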
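The polar formula can be checked numerically on a simple integrand. The sketch below is only an illustration under stated assumptions: the test function f(x, y) = exp(-(x^2 + y^2)), the midpoint rule, the truncation of the infinite ranges at 10 and the helper name `midpoint` are all our own choices and are not part of the text.

```python
import math

def midpoint(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Assumed test function: f(x, y) = exp(-(x^2 + y^2)), which is radial.
# Polar side: the theta-integral contributes 2*pi, leaving
# 2*pi * int_0^inf exp(-r^2) r dr (truncated at r = 10).
polar = 2.0 * math.pi * midpoint(lambda r: math.exp(-r * r) * r, 0.0, 10.0, 20000)

# Cartesian side: f factorises, so the double integral is the square
# of a one-dimensional integral (truncated to [-10, 10]).
one_dim = midpoint(lambda x: math.exp(-x * x), -10.0, 10.0, 20000)
cartesian = one_dim ** 2

print(polar, cartesian)
```

Both numbers should agree up to the quadrature error, which is what the change of variables predicts for this particular integrand.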
Example 8.3.6 Let x ∈ R^N and consider the function
$$f(x) = e^{-|x|^2},$$
which is continuous and hence is Borel measurable. We wish to show
that this function is integrable over R^N and evaluate that integral. Let
$I_N = \int_{\mathbb{R}^N} f\,dm_N$. By repeated use of Fubini’s theorem (case (a)), we
get
$$I_N = \prod_{i=1}^{N} \int_{\mathbb{R}} e^{-|x_i|^2}\,dm_1(x_i) = I_1^N.$$
Now, on one hand, $I_2 = I_1^2$, by the above reasoning. On the other hand,
by the previous example,
$$I_2 = \int_0^{\infty} \int_0^{2\pi} e^{-r^2}\,r\,d\theta\,dr = 2\pi \int_0^{\infty} e^{-r^2}\,r\,dr.$$
The last integral is easily evaluated to give $I_1^2 = I_2 = \pi$. Thus
$$I_1 = \sqrt{\pi} \quad \text{and} \quad I_N = \pi^{N/2}, \quad N \geq 2.$$

Example 8.3.7 Let (X, S, µ) be a measure space and let f : X → R
be an integrable function. The distribution function of f is a function
F : [0, +∞) → [0, +∞], defined by
F (t) = µ(E(t)),
where E(t) = {x ∈ X | |f (x)| > t}. We have
$$\int_{[0,+\infty)} F\,dm_1 = \int_{[0,+\infty)} \int_X \chi_{E(t)}(x)\,d\mu(x)\,dm_1(t).$$
Since we are dealing with a non-negative integrand, we can change the
order of integration to get
$$\int_{[0,+\infty)} F\,dm_1 = \int_X \int_{[0,|f(x)|]} dm_1(t)\,d\mu(x) = \int_X |f(x)|\,d\mu(x).$$
Thus,
$$\int_X |f|\,d\mu = \int_{[0,+\infty)} F\,dm_1.$$
Two functions defined on X are said to be equimeasurable, or rearrangements of each other, if they have the same distribution function.
Thus, the integrals of the absolute values of functions which are rearrangements of each other are equal.
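The identity between the integral of |f| and the integral of the distribution function can be verified exactly when µ is supported on finitely many points, since F is then a step function. The following is a minimal sketch under that assumption; the data `values` and `weights` and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def distribution_function(values, weights, t):
    """F(t) = mu({x : |f(x)| > t}) for a measure carried by finitely many points."""
    return sum(w for v, w in zip(values, weights) if abs(v) > t)

def integral_of_F(values, weights):
    """Integrate the step function F over [0, +inf) exactly."""
    # F is constant on each interval between consecutive values of |f|,
    # and vanishes beyond the largest value.
    levels = sorted({abs(v) for v in values} | {Fraction(0)})
    total = Fraction(0)
    for lo, hi in zip(levels, levels[1:]):
        total += (hi - lo) * distribution_function(values, weights, lo)
    return total

# Hypothetical data: f takes these values on four points carrying these masses.
values = [Fraction(3), Fraction(-1), Fraction(1, 2), Fraction(-3)]
weights = [Fraction(1, 4), Fraction(2), Fraction(1), Fraction(1, 3)]

lhs = sum(abs(v) * w for v, w in zip(values, weights))  # integral of |f| with respect to mu
rhs = integral_of_F(values, weights)                    # integral of F with respect to m_1
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.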
Example 8.3.8 (Convolutions) Let f and g be Borel measurable real-valued functions defined on R^N. Consider the mappings ϕ and ψ defined
on RN × RN taking values in RN defined by
ϕ(x, y) = x − y, ψ(x, y) = y.
These are continuous and hence Borel measurable. Thus (cf. Proposition
3.1.4), we have that the mappings
(x, y) ↦ f(x − y) and (x, y) ↦ g(y)
are Borel measurable and so their product (x, y) ↦ f(x − y)g(y) is also
Borel measurable. We would like to know if the integral
$$h(x) \stackrel{\text{def}}{=} \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y)$$
exists and is finite.
Assume that f and g are integrable as well. Since the Lebesgue measure agrees with the product measure on Borel measurable sets, we can
apply Fubini’s theorem. We have
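$$\begin{aligned}
\int_{\mathbb{R}^N} \left| \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y) \right| dm_N(x)
&\leq \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x - y)|\,|g(y)|\,dm_N(y)\,dm_N(x) \\
&= \int_{\mathbb{R}^N} |g(y)| \left( \int_{\mathbb{R}^N} |f(x - y)|\,dm_N(x) \right) dm_N(y).
\end{aligned}$$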
Since the Lebesgue measure is translation invariant, we have that
$$\int_{\mathbb{R}^N} |f(x - y)|\,dm_N(x) = \int_{\mathbb{R}^N} |f|\,dm_N$$
for every fixed y. Consequently, we get
$$\int_{\mathbb{R}^N} \left| \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y) \right| dm_N(x) \leq \int_{\mathbb{R}^N} |f|\,dm_N \int_{\mathbb{R}^N} |g|\,dm_N.$$
Since, by hypothesis, the right-hand side is finite, it follows from Fubini’s theorem, that the function h is defined for almost every x and is
integrable and, in fact we also have that
$$\int_{\mathbb{R}^N} |h|\,dm_N \leq \int_{\mathbb{R}^N} |f|\,dm_N \int_{\mathbb{R}^N} |g|\,dm_N. \tag{8.3.4}$$
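A discrete analogue of (8.3.4) may help to fix ideas. In the sketch below we replace R^N with the Lebesgue measure by the integers with the counting measure, which is an assumption made purely for illustration; the functions `convolve` and `l1_norm` and the sample data are hypothetical.

```python
def convolve(f, g):
    """h(x) = sum over y of f(x - y) g(y), for finitely supported f, g on the integers."""
    h = {}
    for y, gy in g.items():
        for z, fz in f.items():  # z plays the role of x - y, so x = z + y
            h[z + y] = h.get(z + y, 0) + fz * gy
    return h

def l1_norm(f):
    """Integral of |f| with respect to the counting measure."""
    return sum(abs(v) for v in f.values())

# Hypothetical finitely supported functions, stored as {point: value}.
f = {0: 1, 1: -2, 3: 4}
g = {-1: 3, 0: 1, 2: 2}

h = convolve(f, g)
print(l1_norm(h), l1_norm(f) * l1_norm(g))
```

In this setting the sums are finite, so no measurability or integrability question arises; the inequality itself is obtained exactly as above, by interchanging the order of summation.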
Now, if f and g are Lebesgue measurable functions, then we can
find Borel measurable functions f0 and g0 such that f = f0 and g = g0
almost everywhere (cf. Exercise 3.7). Since the integrals of functions
which are equal almost everywhere are the same, it follows that if f and
g are integrable functions (with respect to the Lebesgue measure) on
RN , then h is well-defined almost everywhere. The function h is called
the convolution of f and g and is denoted by the symbol f ∗ g.

8.4 Polar coordinates in R^N
We saw that the transformation
x = r cos θ, y = r sin θ,
in the plane R2 allowed us to write the integral of a non-negative function, or an integrable function in the form (cf. Example 7.3.2 and Example 8.3.5)
$$\int_{\mathbb{R}^2} f\,dm_2 = \int_0^{2\pi} \int_0^{\infty} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.$$
In the same way the transformation in R3 defined by the spherical polar
coordinate system
x = r sin θ cos ϕ, y = r sin θ sin ϕ, z = r cos θ,
will convert $\int_{\mathbb{R}^3} f\,dm_3$ into the multiple integral
$$\int_0^{2\pi} \int_0^{\pi} \int_0^{\infty} f(r\sin\theta\cos\varphi, r\sin\theta\sin\varphi, r\cos\theta)\,r^2 \sin\theta\,dr\,d\theta\,d\varphi,$$
when f is a Borel measurable function defined on R^3, which is non-negative or which is integrable.
When N > 3, it is difficult to write down explicitly the ‘polar coordinates’ and the computation of the Jacobian will surely be horrendous. We will describe below the transformation of the integral when
f : RN → R is a radial function.
Definition 8.4.1 We say that a function f : R^N → R is radial, if there
exists a function $\tilde f$ : R → R such that, for x ∈ R^N, we have
$$f(x) = \tilde f(|x|).$$
Let B be the unit ball in RN . Let us set
ωN = mN (B).
If R > 0, then the linear map T (x) = Rx maps the open unit ball
diffeomorphically onto B(0; R), the open ball of radius R, and so we
have (cf. Theorem 2.3.)
$$m_N(B(0; R)) = \omega_N R^N.$$
By translation invariance, any ball in R^N with radius R will have measure $\omega_N R^N$.
Let us denote the closed ball centred at the origin and of radius
R by $\overline{B}(0; R)$. Let us assume that we have a continuous function f :
$\overline{B}(0; R)$ → R which is radial. Thus, $f(x) = \tilde f(|x|)$, where $\tilde f$ : [0, R] → R
is continuous. Consider a partition of the interval [0, R]:
$$P = \{0 = r_0 < r_1 < \cdots < r_n = R\}.$$
For 1 ≤ i ≤ n, let us set
$$A_i = \{x \in \mathbb{R}^N \mid r_{i-1} \leq |x| < r_i\},$$
so that $B(0; R) = \cup_{i=1}^{n} A_i$.
Now, by the mean value theorem, there exists $\xi_i \in (r_{i-1}, r_i)$ such
that
$$r_i^N - r_{i-1}^N = N \xi_i^{N-1} (r_i - r_{i-1}), \tag{8.4.1}$$
for each 1 ≤ i ≤ n. Let us choose $y_i \in A_i$ such that $|y_i| = \xi_i$, 1 ≤ i ≤ n.
Now define the function
$$f_P = \sum_{i=1}^{n} f(y_i)\chi_{A_i}.$$
Let
$$\Delta(P) = \max_{1 \leq i \leq n} (r_i - r_{i-1}).$$
Now, if $x \in A_i$, we have for any 1 ≤ i ≤ n,
$$|f(x) - f_P(x)| = |\tilde f(|x|) - \tilde f(|y_i|)| = |\tilde f(|x|) - \tilde f(\xi_i)|.$$
Since $\tilde f$ is uniformly continuous, given ε > 0, we can find δ > 0 such
that, if ∆(P) < δ, we have
$$|f(x) - f_P(x)| < \varepsilon,$$
for every x ∈ B(0; R). Thus, as ∆(P) → 0, we have that $f_P \to f$
uniformly on B(0; R). Consequently (cf. Exercise 5.2),
$$\lim_{\Delta(P) \to 0} \int_{B(0;R)} f_P\,dm_N = \int_{B(0;R)} f\,dm_N.$$
On the other hand, we have
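$$\begin{aligned}
\int_{B(0;R)} f_P\,dm_N &= \sum_{i=1}^{n} \tilde f(\xi_i)\,m_N(A_i) \\
&= \sum_{i=1}^{n} \tilde f(\xi_i)\,\omega_N (r_i^N - r_{i-1}^N) \\
&= \sum_{i=1}^{n} \tilde f(\xi_i)\,N\omega_N \xi_i^{N-1} (r_i - r_{i-1}),
\end{aligned}$$
in view of (8.4.1). Since $\tilde f$ is continuous, we have that
$$\lim_{\Delta(P) \to 0} \sum_{i=1}^{n} \tilde f(\xi_i)\,N\omega_N \xi_i^{N-1} (r_i - r_{i-1}) = N\omega_N \int_0^R \tilde f(r)\,r^{N-1}\,dr.$$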
Thus we have
$$\int_{B(0;R)} f\,dm_N = N\omega_N \int_0^R \tilde f(r)\,r^{N-1}\,dr. \tag{8.4.2}$$
If f is a continuous non-negative, or an integrable, radial function defined
on R^N, we then have
$$\int_{\mathbb{R}^N} f\,dm_N = N\omega_N \int_0^{\infty} \tilde f(r)\,r^{N-1}\,dr. \tag{8.4.3}$$
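The passage from the sums over the shells to the integral in (8.4.2) can be observed numerically. The sketch below is an illustration under stated assumptions: we take N = 3, use the classical value 4π/3 for the volume of the unit ball in R^3, choose the radial profile r ↦ r^2 and a uniform partition; the function name `shell_sum` is hypothetical.

```python
import math

def shell_sum(f_tilde, N, omega_N, R, n):
    """Integral of the step function f_P for the uniform partition of [0, R] into n pieces (N >= 2)."""
    total = 0.0
    for i in range(1, n + 1):
        r_prev, r_i = (i - 1) * R / n, i * R / n
        shell = omega_N * (r_i ** N - r_prev ** N)  # measure of the shell A_i
        # xi_i is the mean value point: r_i^N - r_{i-1}^N = N * xi^(N-1) * (r_i - r_{i-1})
        xi = ((r_i ** N - r_prev ** N) / (N * (r_i - r_prev))) ** (1.0 / (N - 1))
        total += f_tilde(xi) * shell
    return total

N, R = 3, 1.0
omega_3 = 4.0 * math.pi / 3.0  # assumed known volume of the unit ball in R^3

approx = shell_sum(lambda r: r * r, N, omega_3, R, 2000)
# Right-hand side of (8.4.2) for this profile: N * omega_N * int_0^R r^2 * r^(N-1) dr
exact = N * omega_3 * R ** 5 / 5.0
print(approx, exact)
```

Refining the partition (increasing `n`) brings the shell sum closer to the value given by the one-dimensional integral.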
The formula (8.4.3) is a particular case of a more general result known
in the literature as the coarea formula.
Theorem 8.4.1 There exists a unique Borel measure $\sigma_{N-1}$ on the unit
sphere $S^{N-1}$ in R^N such that, if f : R^N → R is a Borel measurable
function which is either non-negative or integrable over R^N (with respect
to the Lebesgue measure), then
$$\int_{\mathbb{R}^N} f\,dm_N = \int_{[0,\infty)} \int_{S^{N-1}} f(rx')\,r^{N-1}\,d\sigma_{N-1}(x')\,dm_1(r),$$
where r = |x| and $x' = \frac{x}{r} \in S^{N-1}$.
The interested reader is referred to the books of Evans and Gariepy [2] or
Folland [3]. Essentially, the coarea formula says that when we integrate
over R^N, we first integrate over the surface of the sphere of radius r, centred
at the origin, and then integrate over r. If R > 0, we have
$$\int_{B(0;R)} f\,dm_N = \int_{[0,R)} \int_{S^{N-1}} f(rx')\,r^{N-1}\,d\sigma_{N-1}(x')\,dm_1(r).$$
Setting R = 1 and f ≡ 1, we get
$$\omega_N = \sigma_{N-1}(S^{N-1}) \int_0^1 r^{N-1}\,dr,$$
which yields
$$\sigma_{N-1}(S^{N-1}) = N\omega_N.$$
The quantity $\sigma_{N-1}(S^{N-1})$ is the natural ‘(N − 1)-dimensional surface
measure’ of the unit sphere. Indeed, if N = 2, we have that the area of
the unit disc is $\omega_2 = \pi$ while its perimeter is $\sigma_1(S^1) = 2\pi = 2\omega_2$. If
N = 3, the volume of the unit ball is $\omega_3 = \frac{4}{3}\pi$ and its surface area is
$\sigma_2(S^2) = 4\pi = 3\omega_3$.
Remark 8.4.1 There is a rich theory of measures defined on surfaces,
or more generally, lower dimensional manifolds, in R^N. In fact there are
several methods to do it, depending on how we wish to handle singularities in the geometry of these sets. The main theory is that of Hausdorff
measures. See Evans and Gariepy [2], or Folland [3], for a treatment of
these notions.

Example 8.4.1 (Volume of the unit ball) We now compute the value of
$\omega_N = m_N(B(0; 1))$. We start with the function $f(x) = e^{-|x|^2}$. We saw
earlier (cf. Example 8.3.6) that
$$\int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = \pi^{N/2}.$$
Since f is a radial function, we can also compute it using polar coordinates. By (8.4.3), we get
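$$\int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = N\omega_N \int_0^{\infty} e^{-r^2}\,r^{N-1}\,dr.$$
Setting $s = r^2$, we get
$$\int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = \frac{N}{2}\,\omega_N \int_0^{\infty} e^{-s}\,s^{\frac{N}{2}-1}\,ds = \frac{N}{2}\,\omega_N\,\Gamma\!\left(\frac{N}{2}\right),$$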
where Γ(s) is the familiar gamma function. Thus, equating the two
expressions we got for the integral, we obtain,
$$\omega_N = \frac{\pi^{N/2}}{\frac{N}{2}\,\Gamma\!\left(\frac{N}{2}\right)} = \frac{\pi^{N/2}}{\Gamma\!\left(\frac{N}{2}+1\right)},$$
since sΓ(s) = Γ(s + 1).
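The closed form just obtained is easy to evaluate for small N. The sketch below uses the gamma function from the Python standard library; the function name `unit_ball_volume` is our own.

```python
import math

def unit_ball_volume(N):
    """omega_N = pi^(N/2) / Gamma(N/2 + 1), the Lebesgue measure of the unit ball in R^N."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)

for N in range(1, 6):
    print(N, unit_ball_volume(N))
```

For N = 1 the formula gives the length of the interval (−1, 1), which is a convenient sanity check.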
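Using the property sΓ(s) = Γ(s + 1) of the gamma function and the
fact that $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$, we can easily verify that
$$\omega_2 = \pi \quad \text{and} \quad \omega_3 = \frac{4}{3}\pi.$$
We can also see that
$$\omega_4 = \frac{1}{2}\pi^2 \quad \text{and that} \quad \omega_5 = \frac{8}{15}\pi^2,$$
and so on.

8.5 Exercises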
8.1 Give an example of a non-empty set X and a monotone class M of
subsets of X which contains X and ∅ and which is not a σ-algebra.
8.2 Let p ≥ 1. Let (X, S, µ) be a measure space. Let f be a real-valued
function defined on X such that $\int_X |f|^p\,d\mu < +\infty$. Show that
$$\int_X |f|^p\,d\mu = p \int_{[0,\infty)} t^{p-1}\mu(E(t))\,dm_1(t),$$
where
E(t) = {x ∈ X | |f (x)| > t}.
8.3 (a) For x > 0, show that
$$\int_{[0,+\infty)} e^{-xt}\,dm_1(t) = \frac{1}{x}.$$
(b) Use the above relation and Fubini’s theorem to show that
$$\lim_{R \to +\infty} \int_0^R \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$
8.4 Let f, g and h be integrable real-valued functions defined on RN .
(a) Show that f ∗ g = g ∗ f.
(b) Show that f ∗ (g ∗ h) and (f ∗ g) ∗ h are well-defined and that they
are equal.
8.5 Let f and g be integrable real-valued functions defined on RN . Show
that (cf. Example 5.3.2)
$$\widehat{f * g} = \hat f \cdot \hat g.$$
8.6 Let (X, S) be a measurable space. Let f be a real-valued, non-negative function defined on X. Define the upper and lower ordinate
sets of f by
V ∗ (f ) = {(x, t) ∈ X × R | 0 ≤ t ≤ f (x)}, and
V∗ (f ) = {(x, t) ∈ X × R | 0 ≤ t < f (x)},
respectively.
(a) If f is a non-negative simple function, show that V ∗ (f ) and V∗ (f ) are
measurable in X × R (where R is equipped with the Lebesgue measure).
(b) If f and g are non-negative functions such that f (x) ≤ g(x) for all
x ∈ X, show that V ∗ (f ) ⊂ V ∗ (g) and that V∗ (f ) ⊂ V∗ (g).
(c) Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative measurable functions
defined on X. If $f_n \uparrow f$, show that $\{V_*(f_n)\}_{n=1}^{\infty}$ is an increasing sequence
of sets whose union is $V_*(f)$. If $f_n \downarrow f$, show that $\{V^*(f_n)\}_{n=1}^{\infty}$ is a
decreasing sequence of sets whose intersection is $V^*(f)$.
(d) If f is a non-negative measurable function defined on X, show that
V ∗ (f ) and V∗ (f ) are measurable subsets of X × R.
(e) If f is any measurable real-valued function defined on X, show that
its graph, G(f ), is a measurable subset of X × R, where
G(f ) = {(x, t) ∈ X × R | f (x) = t}.
(f) Let (X, S, µ) be a σ-finite measure space. Set λ = µ × m1 . Show
that, if f is a non-negative measurable function defined on X, then,
$$\lambda(V^*(f)) = \lambda(V_*(f)) = \int_X f\,d\mu.$$
(This is a generalization of the notion that the (Riemann) integral of a
non-negative real-valued function defined on (a sub-interval of) R, is the
area under the graph of the function.)
8.7 Let A be a real, symmetric and positive definite N × N matrix.
Show that
$$\int_{\mathbb{R}^N} e^{-x^T A x}\,dm_N(x) = \sqrt{\frac{\pi^N}{\det(A)}},$$
where xT denotes the transpose of (the column vector) x ∈ RN .
Chapter 9
Signed measures
9.1 Hahn and Jordan decompositions
Let (X, S) be a measurable space. Let µi , i = 1, 2, be two measures
defined on this space. Let αi , i = 1, 2, be non-negative real numbers.
Then α1 µ1 + α2 µ2 defines a measure on this space. We now consider
the possibility that αi , i = 1, 2, be arbitrary real numbers. Thus, it is
possible that certain sets have negative measure. The principal difficulty
in doing this is that if µi (E), i = 1, 2, are both infinite for some E ∈ S,
then we cannot define µ1 (E) − µ2 (E). The situation is similar to the one
we encountered when defining the integral of a function. In that case
we needed that at least one of the functions, f⁺ or f⁻, be integrable.
In the same way, if we assume that one of µ1 or µ2 is a finite measure,
then, at least formally, we can define the set function µ1 − µ2 , which will
still be countably additive.
Motivated by these remarks, we make the following definition.
Definition 9.1.1 Let (X, S) be a measurable space and let µ be an extended real-valued set function defined on S. It is said to be a signed
measure if
(i) µ(∅) = 0,
(ii) µ takes at most one of the values +∞ or −∞, and
(iii) µ is countably additive.
A signed measure, µ, is said to be finite if |µ(E)| < +∞ for every
E ∈ S. It is said to be σ-finite if $X = \cup_{n=1}^{\infty} E_n$, with $|\mu(E_n)| < +\infty$ for
each n ∈ N.

Example 9.1.1 As already observed, if µi , i = 1, 2, are two measures
on a measurable space (X, S), and if at least one of them is finite, then
µ1 − µ2 is a signed measure. One of our objectives in this section will
be to show that every signed measure can be written as the difference
of two measures, one of them finite.

S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_9

Example 9.1.2 Let (X, S, µ) be a measure space. Let f be an integrable
function defined on X. Define
$$\nu(E) = \int_E f\,d\mu, \quad E \in S.$$
Then ν defines a signed measure on (X, S).

Remark 9.1.1 A signed measure is clearly finitely additive. If µ(E) is
finite, then µ(F \E) = µ(F ) − µ(E), where E, F ∈ S and E ⊂ F.

Proposition 9.1.1 Let (X, S) be a measurable space and let µ be a
signed measure defined on it. Let E and F be measurable sets such that
E ⊂ F . If µ(F ) is finite, then µ(E) is also finite.
Proof: We have F = (F \E) ∪ E and the two sets on the right-hand side
are disjoint. Thus, µ(F ) = µ(F \E) + µ(E). If both the summands on
the right-hand side of this equation are infinite, then so is µ(F ), since we
have assumed that µ can take at most one of the two infinite values +∞
or −∞. If one of them alone is finite, then again, µ(F ) will be infinite.
Thus both summands have to be finite, which completes the proof.

Proposition 9.1.2 Let (X, S) be a measurable space and let µ be a
signed measure defined on it. Let $\{E_n\}_{n=1}^{\infty}$ be a sequence of disjoint
measurable sets such that $|\mu(\cup_{n=1}^{\infty} E_n)| < +\infty$. Then, the series $\sum_{n=1}^{\infty} \mu(E_n)$
is absolutely convergent.
Proof: Set
$$E_n^+ = \begin{cases} E_n, & \text{if } \mu(E_n) \geq 0, \\ \emptyset, & \text{if } \mu(E_n) < 0, \end{cases} \quad \text{and} \quad E_n^- = \begin{cases} E_n, & \text{if } \mu(E_n) \leq 0, \\ \emptyset, & \text{if } \mu(E_n) > 0. \end{cases}$$
Then
$$\mu(\cup_{n=1}^{\infty} E_n^+) = \sum_{n=1}^{\infty} \mu(E_n^+), \qquad \mu(\cup_{n=1}^{\infty} E_n^-) = \sum_{n=1}^{\infty} \mu(E_n^-), \tag{9.1.1}$$
and the sum of the two series is $\sum_{n=1}^{\infty} \mu(E_n) = \mu(\cup_{n=1}^{\infty} E_n)$. Since µ can
take at most one of the two values +∞ or −∞, and since $\sum_{n=1}^{\infty} \mu(E_n)$
is convergent by the given hypothesis, it follows that both the series in
(9.1.1) are finite. But these are the series of positive terms and the series
of negative terms of $\sum_{n=1}^{\infty} \mu(E_n)$ and so this latter series is absolutely
convergent.

Proposition 9.1.3 Let (X, S) be a measurable space equipped with a
signed measure µ. If $\{E_n\}_{n=1}^{\infty}$ is an increasing sequence of measurable
sets, then
$$\mu(\cup_{n=1}^{\infty} E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.2}$$
If $\{E_n\}_{n=1}^{\infty}$ is a decreasing sequence of measurable sets such that $\mu(E_m)$
is finite for some m ∈ N, then
$$\mu(\cap_{n=1}^{\infty} E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.3}$$
Proof: The proof is exactly as in the case of measures (cf. Propositions 1.2.4 and 1.2.5). Proposition 9.1.1 ensures that subsets of sets of
finite measure are also of finite measure and we can use the subtractive
property of signed measures (cf. Remark 9.1.1) to get (9.1.3).

Definition 9.1.2 Let (X, S) be a measurable space equipped with a signed
measure µ. Let E be a measurable subset of X. We say that E is a positive set (respectively, a negative set), if for every measurable set F ,
we have µ(E ∩ F ) ≥ 0 (respectively, µ(E ∩ F ) ≤ 0). Equivalently, E
is a positive (respectively, negative) set if for every measurable subset
F ⊂ E, we have µ(F ) ≥ 0 (respectively, µ(F ) ≤ 0).

Remark 9.1.2 The empty set is both a positive and a negative set.

Remark 9.1.3 Any subset of a positive (respectively, negative) set is
positive (respectively, negative). Any (finite or countable) disjoint union
of positive (respectively, negative) sets is positive (respectively, negative). If Ai , i = 1, 2 are positive (respectively, negative) sets, then so
is A1 \A2 . Consequently, any countable union of positive (respectively,
negative) sets is positive (respectively, negative).

Proposition 9.1.4 Let (X, S) be a measurable space equipped with a
signed measure µ. Let $\{B_i\}_{i=1}^{n}$ be a finite collection of negative sets. Let
$B = \cup_{i=1}^{n} B_i$. Then
$$\mu(B) \leq \min_{1 \leq i \leq n} \mu(B_i). \tag{9.1.4}$$
Proof: We have B1 ∪ B2 = B1 ∪ (B2 \B1 ) and the latter is a disjoint
union. By the preceding remark, we have
µ(B1 ∪ B2 ) = µ(B1 ) + µ(B2 \B1 ) ≤ µ(B1 ).
Similarly, µ(B1 ∪ B2 ) ≤ µ(B2 ). This proves (9.1.4) when n = 2. The
general case now follows by induction on n.

Theorem 9.1.1 (Hahn decomposition) Let (X, S) be a measurable space
equipped with a signed measure µ. There exist two disjoint sets A and
B such that X = A ∪ B, and such that A is a positive set and B is a
negative set.
Proof: Without loss of generality, let us assume that for all E ∈ S, we
have
−∞ < µ(E) ≤ +∞.
Step 1. Let us denote by $\mathcal{N}$ the collection of all negative subsets of X.
Set
$$\beta = \inf_{B \in \mathcal{N}} \mu(B).$$
Let $\{B_i\}_{i=1}^{\infty}$ be a sequence of sets in $\mathcal{N}$ such that $\mu(B_i) \downarrow \beta$. If $B = \cup_{i=1}^{\infty} B_i$,
we have seen that $B \in \mathcal{N}$ and so β ≤ µ(B). On the other hand,
if we set $C_n = \cup_{i=1}^{n} B_i$, then $\mu(B) = \lim_{n \to \infty} \mu(C_n)$, by Proposition
9.1.3. But, by Proposition 9.1.4, we have that $\mu(C_n) \leq \mu(B_n)$ and so
µ(B) ≤ β. Thus, µ(B) = β. In particular, by our assumption on µ, we
have that β is finite.
Step 2. Let A = X\B. We will show that A is a positive set. If not,
there exists a measurable set E0 ⊂ A such that µ(E0 ) < 0.
Assume, if possible, that E0 is a negative set. Then, B ∪ E0 is a
negative set and, since B and E0 are disjoint, µ(B ∪ E0 ) = µ(B) +
µ(E0 ) < β, which is impossible by the definition of β. Thus, there exists
a measurable subset of E0 with positive measure. Since µ(E0 ), being
negative, is finite, so is the measure of any subset of E0 . Let k1 be the
smallest positive integer such that there exists a measurable set $E_1 \subset E_0$
with $\mu(E_1) \geq \frac{1}{k_1}$. Now,
$$\mu(E_0 \setminus E_1) = \mu(E_0) - \mu(E_1) \leq \mu(E_0) - \frac{1}{k_1} < 0.$$
Step 3. We can now apply the procedure of Step 2 to the set $E_0 \setminus E_1$.
Then there exist measurable subsets of $E_0 \setminus E_1$ with positive measure,
and let $k_2$ be the smallest positive integer with the property that there
exists such a set $E_2$ with $\mu(E_2) \geq \frac{1}{k_2}$. (In other words, at each stage,
we choose a set with positive measure with the measure being as large
as possible.)
Proceeding in this way, for each positive integer i, there exists a
measurable set of positive measure contained in $E_0 \setminus \cup_{k=1}^{i-1} E_k$ and let $k_i$
be the smallest positive integer with the property that there exists such
a measurable subset $E_i$ with $\mu(E_i) \geq \frac{1}{k_i}$. Then, since $E_i \subset E_0 \setminus \cup_{\ell=1}^{i-1} E_\ell$,
the sets $\{E_i\}_{i=1}^{\infty}$ are clearly disjoint, and so
$$\sum_{i=1}^{\infty} \mu(E_i) = \mu(\cup_{i=1}^{\infty} E_i) < +\infty,$$
since $\cup_{i=1}^{\infty} E_i \subset E_0$, which has finite measure. In particular, it follows
that $\mu(E_i) \to 0$ as i → ∞ and so $k_i \to \infty$.
Step 4. Let F be a measurable set such that $F \subset E_0 \setminus \cup_{i=1}^{\infty} E_i$. Then
µ(F ) ≤ 0. If not, let $k_n$ be such that $\mu(F) \geq \frac{1}{k_n}$, which is possible,
since $k_i \to \infty$. Then, for all m ≥ n, we have
$$F \subset E_0 \setminus \cup_{i=1}^{\infty} E_i \subset E_0 \setminus \cup_{i=1}^{m} E_i,$$
which yields, by definition of the $k_i$, that $k_m \leq k_n$, which is a contradiction. Thus,
$$F_0 = E_0 \setminus \cup_{i=1}^{\infty} E_i$$
is a negative set and $\mu(F_0) \leq \mu(E_0) < 0$. Once again, this is a contradiction since we then have that $B \cup F_0$ is a negative set and $\mu(B \cup F_0) < \beta$.
Thus A is a positive set. This completes the proof.
Remark 9.1.4 The decomposition of X into two disjoint sets, one positive and the other negative, is called a Hahn decomposition of X. Such a
decomposition is not unique. Let X = A ∪ B be a Hahn decomposition
and assume that there exists N ⊂ B such that µ(N ) = 0. Let F ⊂ N .
Then µ(F ) ≤ 0. If µ(F ) < 0, then 0 = µ(N ) = µ(F ) + µ(N \F ), which
implies that µ(N \F ) > 0, which is not possible since B and, hence, N ,
is a negative set. Thus, µ(F ) = 0, for all F ⊂ N . Then it is clear that
X = (A ∪ N ) ∪ (B\N ) gives another Hahn decomposition of X.

The situation described in the preceding remark is, in fact, the only
way non-uniqueness can occur for the Hahn decomposition. More precisely, we have the following result.
Proposition 9.1.5 Let (X, S) be a measurable space equipped with a
signed measure µ. Let X = Ai ∪Bi , i = 1, 2, be two Hahn decompositions
of X. Then, for every E ∈ S, we have
µ(E ∩ A1 ) = µ(E ∩ A2 ) and µ(E ∩ B1 ) = µ(E ∩ B2 ).
Proof: Let E ∈ S. We have E ∩(A1 \A2 ) ⊂ A1 and so µ(E ∩(A1 \A2 )) ≥
0. On the other hand,
$$E \cap (A_1 \setminus A_2) = E \cap A_1 \cap A_2^c = E \cap A_1 \cap B_2 \subset B_2,$$
and so µ(E ∩ (A1 \A2 )) ≤ 0. Thus µ(E ∩ (A1 \A2 )) = 0 and, similarly,
µ(E ∩ (B1 \B2 )) = 0 as well. Consequently,
µ(E ∩ (A1 ∪ A2 )) = µ(E ∩ A2 ) + µ(E ∩ (A1 \A2 )) = µ(E ∩ A2 ).
Interchanging the roles of A1 and A2 , we get µ(E ∩ (A1 ∪ A2 )) = µ(E ∩
A1 ). Thus
µ(E ∩ A1 ) = µ(E ∩ A2 ).
This proves the first relation in the statement of the proposition. The
proof of the other one is similar.

Let (X, S) be a measurable space equipped with a signed measure
µ. Let us now define two set functions on S by
$$\mu^+(E) = \mu(E \cap A), \qquad \mu^-(E) = -\mu(E \cap B), \tag{9.1.5}$$
where E ∈ S and X = A ∪ B is a Hahn decomposition of X. By the
preceding proposition, µ± are well-defined, since they do not depend on
the Hahn decomposition chosen. Further, it is clear that they are both
measures. Also, since µ takes at most one of the two infinite values +∞
or −∞, it follows that one of these two measures is finite and, we have
µ(E) = µ+ (E) − µ− (E), E ∈ S.
If µ is finite (respectively, σ-finite), then the same is true for µ± .
We have thus proved the following result.
Theorem 9.1.2 (Jordan decomposition) Let (X, S) be a measurable space
equipped with a signed measure µ. Then µ is the difference of two (positive) measures µ+ and µ− , at least one of which is finite. If µ is finite
(respectively, σ-finite), then so are µ± .
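On a finite set, a Hahn decomposition and the associated Jordan decomposition can be written down and checked by brute force. The sketch below assumes a signed measure given by hypothetical point masses on X = {0, 1, 2, 3, 4}; all names in it are our own.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed measure on X = {0, 1, 2, 3, 4}, given by point masses.
mass = {0: Fraction(2), 1: Fraction(-3), 2: Fraction(0), 3: Fraction(1, 2), 4: Fraction(-1)}

def mu(E):
    """Signed measure of a subset E of X."""
    return sum((mass[x] for x in E), Fraction(0))

def subsets(E):
    """All subsets of the finite set E."""
    items = sorted(E)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

X = frozenset(mass)
A = frozenset(x for x in X if mass[x] >= 0)  # candidate positive set
B = X - A                                     # candidate negative set

def mu_plus(E):
    return mu(E & A)

def mu_minus(E):
    return -mu(E & B)

is_positive = all(mu(F) >= 0 for F in subsets(A))
is_negative = all(mu(F) <= 0 for F in subsets(B))
jordan_ok = all(mu(E) == mu_plus(E) - mu_minus(E) for E in subsets(X))
print(is_positive, is_negative, jordan_ok)
```

The point 2 carries zero mass and could equally well have been placed in B, which is the kind of non-uniqueness described in Remark 9.1.4.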
Definition 9.1.3 Let (X, S) be a measurable space equipped with a signed
measure µ. The relation µ = µ⁺ − µ⁻ is called the Jordan decomposition of µ. The measures µ⁺ and µ⁻ are respectively called the
upper and lower variations of the signed measure µ. The measure
|µ| = µ⁺ + µ⁻ is called the total variation of the signed measure µ.

Definition 9.1.4 Let (X, S) be a measurable space. A complex measure is a set function µ defined on S which can be written as µ =
µ1 + iµ2 , where µj , j = 1, 2, are signed measures and i is a square root
of −1.

Proposition 9.1.6 Let (X, S) be a measurable space equipped with a
signed measure µ. Let E ∈ S. Let µ± be the upper and lower variations
of µ. Then
$$\mu^+(E) = \sup\{\mu(F) \mid F \subset E,\ F \in S\}, \qquad \mu^-(E) = -\inf\{\mu(F) \mid F \subset E,\ F \in S\}. \tag{9.1.6}$$
Proof: Let X = A ∪ B be a Hahn decomposition of X. Let E, F ∈ S
such that F ⊂ E. Since µ+ is a measure, we have that µ(F ∩ A) ≤
µ(E ∩ A). Thus
$$\begin{aligned}
\mu(F) &= \mu(F \cap A) + \mu(F \cap B) \\
&\leq \mu(F \cap A) \\
&\leq \mu(E \cap A) \\
&= \mu^+(E).
\end{aligned}$$
It then follows that
sup{µ(F ) | F ⊂ E, F ∈ S} ≤ µ+ (E).
Since µ+ (E) = µ(E ∩ A) and E ∩ A ⊂ E, the reverse inequality is obvious. This proves the first relation in (9.1.6). The proof of the second
relation is similar.

Example 9.1.3 Let (X, S, µ) be a measure space and let f be an integrable function defined on X. Consider the signed measure ν defined
by
$$\nu(E) = \int_E f\,d\mu, \quad E \in S.$$
Then we have a Hahn decomposition X = A ∪ B where
$$A = \{x \in X \mid f(x) > 0\}, \qquad B = \{x \in X \mid f(x) \leq 0\}.$$
The upper and lower variations ν± of ν are given by
$$\nu^+(E) = \int_E f^+\,d\mu, \quad \text{and} \quad \nu^-(E) = \int_E f^-\,d\mu,$$
and the total variation |ν| is given by
$$|\nu|(E) = \int_E |f|\,d\mu,$$
for E ∈ S.

Let (X, S) be a measurable space equipped with a signed measure
µ. It is clear that a measurable function f defined on X is integrable
with respect to |µ| if, and only if, it is integrable with respect to both
µ+ and µ− . In that case we can define the integral of f over X, with
respect to µ.
Definition 9.1.5 Let (X, S) be a measurable space equipped with a signed
measure µ. Let f be a measurable function defined on X which is integrable with respect to |µ|. Then, we say that f is integrable with respect
to the signed measure µ and we define
$$\int_X f\,d\mu = \int_X f\,d\mu^+ - \int_X f\,d\mu^-,$$
where µ = µ⁺ − µ⁻ is the Jordan decomposition of µ. If µi , i = 1, 2 are
two signed measures defined on the measurable space (X, S), and if µ is
the complex measure defined by µ = µ1 + iµ2 , we say that a measurable
function f defined on X is integrable with respect to µ if it is integrable
with respect to both µ1 and µ2 and we define
$$\int_X f\,d\mu = \int_X f\,d\mu_1 + i \int_X f\,d\mu_2.$$

9.2 Absolute continuity
Definition 9.2.1 Let (X, S) be a measurable space and let µ and ν be
signed measures defined on it. We say that ν is absolutely continuous
with respect to µ if ν(E) = 0 whenever |µ|(E) = 0, E ∈ S. In this case
we write ν << µ.

Example 9.2.1 Let (X, S, µ) be a measure space and let f be an integrable function defined on X. For E ∈ S, define the signed measure
$$\nu(E) = \int_E f\,d\mu.$$
Then ν << µ.

Example 9.2.2 Let (X, S) be a measurable space and let µ and ν be measures defined on it. Then µ << µ + ν and ν << µ + ν.

Example 9.2.3 Let (X, S) be a measurable space equipped with a
signed measure µ. Then µ⁺ << µ, µ⁻ << µ. We also have µ << |µ|
and |µ| << µ.

Example 9.2.4 Let X = [0, 1] be equipped with the Lebesgue measure
m1 . Let $F = [0, \frac{1}{2}]$ and, for x ∈ X, set
$$f_1(x) = 2\chi_F(x) - 1, \qquad f_2(x) = x.$$
Let, for E ∈ L1 ,
$$\mu_i(E) = \int_E f_i\,dm_1, \quad i = 1, 2.$$
Then, since |f1 | ≡ 1, we have |µ1 | = m1 . Consequently (cf. Example
9.1.1), µ2 << µ1 . However,
$$\mu_1(X) = \int_X f_1\,dm_1 = 0,$$
while
$$\mu_2(X) = \int_0^1 x\,dx = \frac{1}{2} \neq 0.$$
Thus, if µ2 << µ1 and if µ1 (E) = 0, it does not imply that µ2 (E) = 0.

Proposition 9.2.1 Let (X, S) be a measurable space and let µ and ν be
signed measures defined on it. The following statements are equivalent.
(i) ν << µ.
(ii) ν ± << µ.
(iii) |ν| << |µ|.
Proof: (i) ⇒ (ii). Let X = A ∪ B be a Hahn decomposition of X
with respect to ν, so that, for E ∈ S, we have ν + (E) = ν(E ∩ A) and
ν − (E) = −ν(E ∩ B). Let E ∈ S such that |µ|(E) = 0. Then
0 ≤ |µ|(E ∩ A) ≤ |µ|(E) = 0,
0 ≤ |µ|(E ∩ B) ≤ |µ|(E) = 0.
Then ν(E ∩ A) = ν(E ∩ B) = 0. Thus, ν ± << µ.
(ii) ⇒ (iii). Clearly, ν ± << µ implies that ν ± << |µ|. Thus, if
|µ|(E) = 0, then ν ± (E) = 0 and so |ν|(E) = 0 as well. Thus, |ν| << |µ|.
(iii) ⇒ (i). If |µ|(E) = 0, then |ν|(E) = 0 and so ν ± (E) = 0 from which
we get
ν(E) = ν + (E) − ν − (E) = 0.
This completes the proof.

Definition 9.2.2 Let (X, S) be a measurable space and let µ and ν be
signed measures defined on it. We say that µ is equivalent to ν if both
the relations µ << ν and ν << µ hold. In this case we write µ ≡ ν.

Example 9.2.5 If µ is a signed measure defined on a measurable space
(X, S), then µ ≡ |µ|.

We defined a notion of absolute continuity of a measure with respect
to another in Remark 5.3.2, based on the result of Proposition 5.3.2. In
that case, the measure ν is also absolutely continuous with respect to µ
according to Definition 9.2.1 above. We reconcile these two definitions
in the proposition below.
Proposition 9.2.2 Let (X, S) be a measurable space and let µ and ν be
signed measures defined on it. Assume that ν is finite and that ν << µ.
Then, given ε > 0, we can find δ > 0 such that, whenever |µ|(E) <
δ, E ∈ S, we have |ν|(E) < ε.
Proof: Assume the contrary. Then, there exists ε > 0 such that, for
every n ∈ N, there exists a set $E_n \in S$ with $|\mu|(E_n) < \frac{1}{2^n}$ and $|\nu|(E_n) \geq \varepsilon$.
Set $E = \limsup_{n \to \infty} E_n$. Then, for every n ∈ N,
$$|\mu|(E) \leq \sum_{m=n}^{\infty} |\mu|(E_m) < \frac{1}{2^{n-1}},$$
and so |µ|(E) = 0. Since |ν| is finite (cf. Exercise 1.10 (b)), we have
|ν|(E) ≥ ε, which contradicts the absolute continuity of ν with respect
to µ.

Example 9.2.6 The above result is not true, in general, if ν is not finite.
Let X = N and let S = P(N). Let $\mu(\{n\}) = 2^{-n}$ and let $\nu(\{n\}) = 2^{n}$.
Then µ and ν define measures and since the only set, in either case, with
measure zero is the empty set, we have that ν << µ. However, for any
δ > 0, we can find n0 ∈ N such that for all n ≥ n0 , we have µ({n}) < δ,
while $\{\nu(\{n\})\}_{n \geq n_0}$ is unbounded.

Proposition 9.2.3 Let (X, S) be a measurable space and let µ and ν be
finite measures on it such that ν << µ. Assume that ν is not identically
zero. Then, there exist ε > 0 and a measurable set A with µ(A) > 0,
such that A is a positive set for the signed measure ν − εµ.
Proof: For each n ∈ N, consider the signed measure $\nu - \frac{1}{n}\mu$ and let
$X = A_n \cup B_n$ be a Hahn decomposition for this signed measure. Set
$A_0 = \cup_{n=1}^{\infty} A_n$ and $B_0 = \cap_{n=1}^{\infty} B_n$. Since $B_0 \subset B_n$ for each n, and since
$B_n$ is a negative set for the signed measure $\nu - \frac{1}{n}\mu$, we have
$$0 \leq \nu(B_0) \leq \frac{1}{n}\,\mu(B_0).$$
Thus, $\nu(B_0) = 0$. Since $A_0 = B_0^c$, and since ν is not identically zero, it
follows that $\nu(A_0) > 0$. By absolute continuity, it follows that $\mu(A_0) > 0$
as well. Then, there exists n ∈ N such that $\mu(A_n) > 0$. We can now set
$A = A_n$ and $\varepsilon = \frac{1}{n}$.

9.3 The Radon-Nikodym theorem
Let (X, S, µ) be a measure space and let f be a non-negative integrable
function defined on X. If we define the measure ν by
$$\nu(E) = \int_E f\,d\mu, \quad E \in S,$$
then ν is absolutely continuous with respect to µ. The Radon-Nikodym
theorem states that in the σ-finite case, every signed measure ν, which
is absolutely continuous with respect to µ, arises in this fashion. We
will first prove this for finite measures and then extend it to the general
case.
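When X is a finite set, the statement that an absolutely continuous measure arises from a density can be made completely explicit: the density at a point is the ratio of the two point masses. The sketch below is an illustration under that assumption only; the measures `mu_mass` and `nu_mass` and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite measures on X = {0, 1, 2, 3}, given by point masses.
mu_mass = {0: Fraction(1, 2), 1: Fraction(0), 2: Fraction(3), 3: Fraction(1)}
nu_mass = {0: Fraction(2), 1: Fraction(0), 2: Fraction(1), 3: Fraction(0)}

# Absolute continuity: every point of mu-mass zero also has nu-mass zero.
absolutely_continuous = all(nu_mass[x] == 0 for x in mu_mass if mu_mass[x] == 0)

# Candidate density; its value on mu-null points is arbitrary (set to 0 here).
f = {x: (nu_mass[x] / mu_mass[x] if mu_mass[x] != 0 else Fraction(0)) for x in mu_mass}

def nu(E):
    return sum((nu_mass[x] for x in E), Fraction(0))

def integral_f_dmu(E):
    """Integral of f over E with respect to mu."""
    return sum((f[x] * mu_mass[x] for x in E), Fraction(0))

points = sorted(mu_mass)
all_sets = [set(c) for k in range(len(points) + 1) for c in combinations(points, k)]
density_ok = all(nu(E) == integral_f_dmu(E) for E in all_sets)
print(absolutely_continuous, density_ok)
```

The freedom in choosing f on the µ-null point 1 reflects the fact that a density can only be determined up to sets of µ-measure zero.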
Theorem 9.3.1 (Radon-Nikodym) Let (X, S, µ) be a finite measure space
and let ν be a finite measure defined on S such that ν << µ. Then, there
exists a non-negative function f defined on X, which is integrable with respect to µ, and such that
$$\nu(E) = \int_E f\,d\mu,$$
for every E ∈ S. The function f is unique in the sense that, if g is
another integrable function such that $\nu(E) = \int_E g\,d\mu$ for every E ∈ S, then f = g almost
everywhere with respect to the measure µ.
Proof: Step 1. Uniqueness. If f and g are two functions such that
$$\nu(E) = \int_E f\,d\mu = \int_E g\,d\mu,$$
for every E ∈ S, then, for every n ∈ N, we have that $\mu(E_n) = 0$, where
$$E_n = \left\{ x \in X \;\middle|\; f(x) - g(x) > \frac{1}{n} \right\},$$
since $\frac{1}{n}\mu(E_n) \leq \int_{E_n} (f - g)\,d\mu = 0$.
It then immediately follows that
$$\mu(\{x \in X \mid f(x) - g(x) > 0\}) = 0.$$
Similarly, we have
$$\mu(\{x \in X \mid f(x) - g(x) < 0\}) = 0,$$
whence we deduce that
$$\mu(\{x \in X \mid f(x) \neq g(x)\}) = 0.$$
Step 2. Let us denote by L(µ), the set of all measurable functions defined
on X which are integrable with respect to µ. Define
$$K = \left\{ f \in L(\mu) \;\middle|\; f \geq 0 \text{ and } \int_E f\,d\mu \leq \nu(E) \text{ for every } E \in S \right\}.$$
Then K ≠ ∅. To see this, let ε > 0 and A ∈ S be as in the statement of
Proposition 9.2.3. Then, if we set $f = \varepsilon\chi_A$, we have
$$\int_E f\,d\mu = \varepsilon\mu(E \cap A) \leq \nu(E \cap A) \leq \nu(E),$$
for every E ∈ S. Thus f ∈ K and $\int_X f\,d\mu = \varepsilon\mu(A) > 0$. Now set
$$\alpha = \sup\left\{ \int_X f\,d\mu \;\middle|\; f \in K \right\}.$$
Then,
$$0 < \alpha \leq \nu(X) < +\infty.$$
Step 3. Let $\{g_n\}_{n=1}^{\infty}$ be a sequence of functions in K such that
$$\int_X g_n\,d\mu > \alpha - \frac{1}{n}.$$
Let $f_n = \max\{g_1, \cdots, g_n\} \geq 0$. We claim that $f_n \in K$.
Indeed, let $E_i^n = \{x \in X \mid f_n(x) = g_i(x)\}$, for 1 ≤ i ≤ n. Then
$X = \cup_{i=1}^{n} E_i^n$. Set $F_1^n = E_1^n$ and, for 2 ≤ i ≤ n,
$$F_i^n = E_i^n \setminus (\cup_{k=1}^{i-1} F_k^n).$$
Then the $F_i^n$, 1 ≤ i ≤ n are disjoint, $F_i^n \subset E_i^n$ for 1 ≤ i ≤ n and
$X = \cup_{i=1}^{n} F_i^n$. Thus, if E ∈ S, we have
$$\int_E f_n\,d\mu = \sum_{i=1}^{n} \int_{E \cap F_i^n} f_n\,d\mu = \sum_{i=1}^{n} \int_{E \cap F_i^n} g_i\,d\mu \leq \sum_{i=1}^{n} \nu(E \cap F_i^n) = \nu(E).$$
This proves that $f_n \in K$.
Step 4. We have that $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of non-negative
functions. Let $f = \lim_{n \to \infty} f_n$. Then, by the monotone convergence
theorem,
$$\int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu \leq \nu(E),$$
for every E ∈ S. Thus, f ∈ K as well. Consequently, $\int_X f\,d\mu \leq \alpha$. On
the other hand,
$$\int_X f\,d\mu \geq \int_X f_n\,d\mu \geq \int_X g_n\,d\mu > \alpha - \frac{1}{n},$$
for every n ∈ N. Consequently, $\int_X f\,d\mu \geq \alpha$ as well and so we have that
$$f \in K \quad \text{and} \quad \int_X f\,d\mu = \alpha.$$
Step 5. Define the measure ν1 by
\[ \nu_1(E) = \int_E f \, d\mu, \quad E \in S. \]
Then ν1 is a finite measure and ν1 << µ. Set ν0 = ν − ν1 , which is
a well-defined signed measure since both ν and ν1 are finite measures.
Since f ∈ K, we have that ν1 (E) ≤ ν(E) for every E ∈ S. Thus ν0 is a
measure and we also have that ν0 << µ.
Step 6. We claim that ν0 ≡ 0. This will complete the proof. Assume
the contrary. Then, since ν0 is a finite measure which is absolutely
continuous with respect to µ, we may again apply Proposition 9.2.3.
Thus, there exist η > 0 and a set F ∈ S, which is a positive set for
the signed measure ν0 − ηµ, and such that µ(F ) > 0. Hence, for every
E ∈ S, we have
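\[ \eta \mu(E \cap F) \le \nu_0(E \cap F) = \nu(E \cap F) - \int_{E \cap F} f \, d\mu. \]
Now, set h = f + ηχ_F. Then, if E ∈ S, we have
\[
\begin{aligned}
\int_E h \, d\mu &= \int_E f \, d\mu + \eta \mu(E \cap F) \\
&\le \int_E f \, d\mu - \int_{E \cap F} f \, d\mu + \nu(E \cap F) \\
&= \int_{E \cap F^c} f \, d\mu + \nu(E \cap F) \\
&\le \nu(E \cap F^c) + \nu(E \cap F) \\
&= \nu(E).
\end{aligned}
\]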
Thus h ∈ K. But
\[ \int_X h \, d\mu = \int_X f \, d\mu + \eta \mu(F) > \alpha, \]
which is a contradiction. Thus, ν0 ≡ 0. Hence f is the required function
and this completes the proof.

If µ and ν are σ-finite measures, then we can write X as the disjoint
union of a countable number of measurable sets, such that on each of
them µ and ν are finite. Thus we can ‘patch up’ the function f obtained
on each of these sets to get f defined on X as in the theorem. Next, if
ν is a σ-finite signed measure, we can write ν = ν^+ − ν^−. Since ν^± will
also be σ-finite, the preceding argument gives two functions f_+ and f_−
such that, for every E ∈ S, we have
\[ \nu^{\pm}(E) = \int_E f_{\pm} \, d\mu. \]
Then we can set f = f+ − f− to get the result of the theorem in this
case. Finally, let us assume that µ is also a σ-finite signed measure. Let
X = A ∪ B be a Hahn decomposition with respect to µ. Then, if E ⊂ A,
we have |µ|(E) = µ^+(E) and if E ⊂ B, we have |µ|(E) = µ^−(E). Thus,
when restricted to A or to B, since we have that ν << µ, we deduce
that ν << µ^+ on A and ν << µ^− on B. Hence, there exist functions
f_A and f_B such that
\[
\begin{aligned}
\nu(E) &= \int_E f_A \, d\mu^+, \quad \text{for every } E \in S, \ E \subset A, \\
\nu(E) &= \int_E f_B \, d\mu^-, \quad \text{for every } E \in S, \ E \subset B.
\end{aligned}
\]
Now, if E ∈ S, we have
\[
\begin{aligned}
\nu(E) &= \nu(E \cap A) + \nu(E \cap B) \\
&= \int_{E \cap A} f_A \, d\mu^+ + \int_{E \cap B} f_B \, d\mu^- \\
&= \int_E (f_A \chi_A - f_B \chi_B) \, d\mu.
\end{aligned}
\]
Combining all these, we get the following result.
Theorem 9.3.2 (Radon-Nikodym) Let (X, S) be a measurable space
and let µ and ν be σ-finite signed measures defined on S such that
ν << µ. Then, there exists a measurable function f defined on X such
that
\[ \nu(E) = \int_E f \, d\mu, \tag{9.3.1} \]
for every E ∈ S.

Definition 9.3.1 Let (X, S) be a measurable space and let µ and ν be
σ-finite signed measures defined on S such that ν << µ. The function
f occurring in (9.3.1) is called the Radon-Nikodym derivative of ν
with respect to µ and we formally write f = dν/dµ.
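
On a finite set, where every measure is determined by its point masses, the Radon-Nikodym derivative can be written down explicitly. The following Python sketch is only an illustration under that assumption; the names `radon_nikodym_derivative` and `integrate` and the sample masses are hypothetical and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations


def radon_nikodym_derivative(mu, nu):
    """Return f with nu({x}) = f(x) * mu({x}) for every point x.

    mu and nu map the points of a finite set to non-negative point masses.
    The function requires nu << mu: a point of mu-mass zero must have
    nu-mass zero as well.
    """
    f = {}
    for x, m in mu.items():
        n = nu.get(x, Fraction(0))
        if m == 0:
            if n != 0:
                raise ValueError("nu is not absolutely continuous w.r.t. mu")
            f[x] = Fraction(0)  # any value is admissible on a mu-null set
        else:
            f[x] = n / m
    return f


def integrate(f, mu, E):
    """Integral of f over the subset E with respect to mu (a finite sum)."""
    return sum(f[x] * mu[x] for x in E)


# Illustrative point masses (assumed data, exact rational arithmetic).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0)}
nu = {"a": Fraction(1, 3), "b": Fraction(1), "c": Fraction(0)}
f = radon_nikodym_derivative(mu, nu)

# Check nu(E) = integral of f over E for every subset E.
points = list(mu)
all_subsets = [E for r in range(len(points) + 1)
               for E in combinations(points, r)]
identity_holds = all(integrate(f, mu, E) == sum(nu[x] for x in E)
                     for E in all_subsets)
print(f, identity_holds)
```

The value of f at the µ-null point "c" is arbitrary, which mirrors the fact that the derivative is determined only up to sets of µ-measure zero.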
Proposition 9.3.1 Let (X, S) be a measurable space and let λ and µ be
σ-finite measures defined on S. Let µ << λ. Let ν be a σ-finite signed
measure defined on S such that ν << µ. Then
\[ \frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu} \, \frac{d\mu}{d\lambda}, \]
almost everywhere (with respect to the measure λ).
Proof: As usual, by considering the upper and lower variations ν^±
separately, we may assume that ν is a measure, without loss of generality.
Let f = dν/dµ ≥ 0 and let g = dµ/dλ ≥ 0. By virtue of Proposition 5.2.7,
we have, for E ∈ S,
\[ \nu(E) = \int_E f \, d\mu = \int_E f g \, d\lambda, \]
which completes the proof.

9.4 Singularity
If a measure is absolutely continuous with respect to another, then the
former vanishes whenever the latter vanishes. We now consider the
opposite notion.
Definition 9.4.1 Let (X, S) be a measurable space and let µ and ν be
two measures defined on S. We say that ν is singular with respect to
µ, and we write ν ⊥ µ, if there exists E ∈ S such that µ(E) = 0 and
ν ≡ 0 on E^c, i.e. if F ⊂ E^c, F ∈ S, then ν(F) = 0.

Example 9.4.1 Let m_1 be the Lebesgue measure on R and let δ be the
Dirac measure concentrated at the origin. Then, if we set E = {0}, we
have m_1(E) = 0 and δ ≡ 0 on E^c. Thus δ ⊥ m_1.

Example 9.4.2 Let (X, S) be a measurable space equipped with a
signed measure µ. Then µ^+ ⊥ µ^− and µ^− ⊥ µ^+.

Theorem 9.4.1 (Lebesgue decomposition) Let (X, S) be a measurable
space and let µ and ν be two σ-finite measures defined on S. Then,
there exist two uniquely determined measures ν0 and ν1 such that ν =
ν0 + ν1, ν0 ⊥ µ and ν1 << µ.
Proof: Since ν << µ + ν, there exists a non-negative function f such
that
\[ \nu(E) = \int_E f \, d(\mu + \nu), \]
for every E ∈ S. Set
\[ A = \{ x \in X \mid f(x) \ge 1 \} \quad \text{and} \quad B = A^c. \]
Then,
\[ \nu(A) \ge \int_A d(\mu + \nu) = \mu(A) + \nu(A), \]
whence we deduce that µ(A) = 0. Define, for E ∈ S,
\[ \nu_0(E) = \nu(E \cap A) \quad \text{and} \quad \nu_1(E) = \nu(E \cap B). \]
Then ν = ν0 + ν1 and ν0 ⊥ µ. We will now show that ν1 << µ, which
will establish the decomposition of ν.
Let E ∈ S be such that µ(E) = 0. Then
\[ \nu_1(E) = \int_{E \cap B} d\nu = \int_{E \cap B} f \, d\mu + \int_{E \cap B} f \, d\nu. \]
Since µ(E) = 0, we have that ∫_{E∩B} f dµ = 0. Thus we get
\[ \int_{E \cap B} (1 - f) \, d\nu = 0. \]
But on B, we have that 0 ≤ f < 1. Hence it follows that ν(E ∩ B) = 0,
i.e. ν1(E) = 0. This shows that ν1 << µ.
To complete the proof, we need to show that ν0 and ν1 are uniquely
determined. Let ν = ν0 + ν1 = ν̃0 + ν̃1 be two Lebesgue decompositions of
ν with respect to µ. Let A (respectively, Ã) be such that µ(A) = µ(Ã) = 0
and ν0 ≡ 0 on A^c (respectively, ν̃0 ≡ 0 on Ã^c). Then µ(A ∪ Ã) = 0 and,
on A^c ∩ Ã^c, both ν0 and ν̃0 are zero. Now, set
\[ \lambda = \nu_0 - \tilde{\nu}_0 = \tilde{\nu}_1 - \nu_1. \]
Then λ = ν̃1 − ν1 is clearly absolutely continuous with respect to µ and
so we have that λ vanishes on A ∪ Ã and on all of its measurable subsets.
On the other hand, we have just seen above that λ = ν0 − ν̃0 vanishes
identically on the complement of A ∪ Ã. Thus λ ≡ 0 which proves the
uniqueness of the Lebesgue decomposition.

9.5 Exercises
9.1 Let (X, S) be a measurable space equipped with a signed measure
µ. If µ is finite, show that, for every E ∈ S,
\[ |\mu|(E) = \sup \left\{ \int_E f \, d\mu \;\middle|\; f : X \to \mathbb{R} \text{ measurable}, \ |f| \le 1 \right\}. \]
9.2 Let (X, S) be a measurable space equipped with a signed measure
µ. Let µi , i = 1, 2 be two measures defined on S such that µ = µ1 − µ2 .
Show that µ+ ≤ µ1 and that µ− ≤ µ2 .
9.3 Let (X, S) be a measurable space equipped with a signed measure
µ. Let E ∈ S. Show that |µ|(E) = 0 if, and only if, µ(F ) = 0 for every
measurable subset F of E.
9.4 Let (X, S) be a measurable space equipped with a signed measure
µ. Compute dµ^+/d|µ| and dµ^−/d|µ|.
9.5 Let (X, S) be a measurable space. Let µ and ν be finite measures
defined on S. Let f = dν/d(µ + ν). Show that, for every E ∈ S, we have
\[ \nu(E) = \int_E \frac{f}{1 - f} \, d\mu. \]
9.6 Let ν be any σ-finite signed measure on the measurable space
(N, P(N)). Let µ be the counting measure on this measurable space.
Show that ν << µ and compute dν/dµ.
9.7 Let (X, S) be a measurable space. Let µ and ν be σ-finite measures
defined on S such that µ ≡ ν. Show that
\[ \frac{d\mu}{d\nu} = \left( \frac{d\nu}{d\mu} \right)^{-1} \]
almost everywhere (with respect to µ, and hence, with respect to ν as
well).
9.8 Let (X, S) be a measurable space. Let µ and ν be σ-finite measures
defined on S such that ν << µ. Show that
\[ \nu \left( \left\{ x \in X \;\middle|\; \frac{d\nu}{d\mu}(x) = 0 \right\} \right) = 0. \]
Chapter 10
L^p spaces

10.1 Basic properties

The Lebesgue spaces, also known as the L^p spaces, constitute a rich
source of examples and counter-examples in functional analysis. They
also form an important class of function spaces when studying the applications of mathematical analysis.
Definition 10.1.1 Let (X, S, µ) be a measure space. Let f : X → R be
a measurable function. Let 1 ≤ p < ∞. We define
\[ \|f\|_p = \left( \int_X |f|^p \, d\mu \right)^{\frac{1}{p}} \tag{10.1.1} \]
and we say that f is p-integrable (integrable, if p = 1 and square
integrable, if p = 2) if ‖f‖_p < +∞. Let M > 0. We set
\[ \{ |f| > M \} = \{ x \in X \mid |f(x)| > M \}. \]
We now define (cf. Definition 3.3.1)
\[ \|f\|_\infty = \inf \{ M > 0 \mid \mu(\{ |f| > M \}) = 0 \}. \tag{10.1.2} \]

Definition 10.1.2 Let (X, S, µ) be a measure space. Let f : X → R be
a measurable function. We say that f is essentially bounded if ‖f‖_∞ <
+∞.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_10

Definition 10.1.3 Let 1 < p < ∞. The conjugate exponent of p,
denoted p′, is given by the relation:
\[ \frac{1}{p} + \frac{1}{p'} = 1. \]
If p = 1 we define its conjugate exponent as ∞ and vice-versa.

Lemma 10.1.1 Let 1 < p < ∞. Let p′ be its conjugate exponent. Then,
if a and b are non-negative real numbers, we have
\[ a^{\frac{1}{p}} b^{\frac{1}{p'}} \le \frac{a}{p} + \frac{b}{p'}. \tag{10.1.3} \]
Proof: Let t ≥ 1. Consider the function
\[ f(t) = k(t - 1) - t^k + 1, \]
where k ∈ (0, 1). Then f′(t) = k(1 − t^{k−1}) ≥ 0 since 0 < k < 1. Thus,
f is an increasing function on [1, +∞) and, since f(1) = 0, we deduce
that
\[ t^k \le k(t - 1) + 1, \tag{10.1.4} \]
for t ≥ 1 and for 0 < k < 1.
If a or b is zero, then (10.1.3) is obviously true. Let us assume,
without loss of generality (since p and p′ are conjugate exponents of
each other), that a ≥ b > 0. Then (10.1.3) follows from (10.1.4) on
setting t = a/b, k = 1/p and using the relation between p and p′.

Proposition 10.1.1 (Hölder’s inequality) Let 1 ≤ p < ∞ and let p′
be the conjugate exponent. If f is p-integrable and g is p′-integrable
(essentially bounded, if p = 1), then
\[ \int_X |fg| \, d\mu \le \|f\|_p \|g\|_{p'}. \tag{10.1.5} \]
Proof: If p = 1, then p′ = ∞. Then
\[ |f(x)g(x)| \le |f(x)| \cdot \|g\|_\infty \]
for almost every x ∈ X and then (10.1.5) follows on integrating this
inequality over X.
Let us now assume that 1 < p < ∞ so that 1 < p′ < ∞ as well. The
relation (10.1.5) is trivially true if ‖f‖_p (respectively, ‖g‖_{p′}) equals zero,
for then f (respectively, g) will be equal to zero almost everywhere. So
we assume further that ‖f‖_p ≠ 0 and that ‖g‖_{p′} ≠ 0. Then, by Lemma
10.1.1,
\[ |f(x)g(x)| \le \frac{1}{p} |f(x)|^p + \frac{1}{p'} |g(x)|^{p'} \]
for all x ∈ X. Assume now that ‖f‖_p = ‖g‖_{p′} = 1. Then, integrating
the above inequality over X, we get
\[ \int_X |fg| \, d\mu \le \frac{1}{p} + \frac{1}{p'} = 1. \]
For the general case, apply the preceding result to the functions f/‖f‖_p
and g/‖g‖_{p′} to get (10.1.5).

Remark 10.1.1 When p = p′ = 2, the inequality (10.1.5) is known as
the Cauchy-Schwarz inequality.

Proposition 10.1.2 (Minkowski’s Inequality) Let 1 ≤ p ≤ ∞. Let
f and g be p-integrable (essentially bounded, if p = ∞). Then f + g is
also p-integrable (essentially bounded, if p = ∞) and
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{10.1.6} \]
Proof: Let 1 < p < ∞. We assume that ‖f + g‖_p ≠ 0, since, otherwise,
the result is trivially true. Since the function t ↦ |t|^p is convex for
1 ≤ p < ∞, we have that
\[ |f(x) + g(x)|^p \le 2^{p-1} \left( |f(x)|^p + |g(x)|^p \right) \]
from which it follows that f + g is also p-integrable. Thus, if 1 < p < ∞,
we have
\[ \int_X |f + g|^p \, d\mu \le \int_X |f + g|^{p-1} |f| \, d\mu + \int_X |f + g|^{p-1} |g| \, d\mu. \]
We apply Hölder’s inequality to each of the terms on the right-hand side.
Notice that |f(x) + g(x)|^{(p−1)p′} = |f(x) + g(x)|^p by the definition of p′.
Thus |f + g|^{p−1} is p′-integrable and
\[ \left\| \, |f + g|^{p-1} \right\|_{p'} = \|f + g\|_p^{\frac{p}{p'}}. \]
Thus,
\[ \|f + g\|_p^p \le \|f + g\|_p^{\frac{p}{p'}} \left( \|f\|_p + \|g\|_p \right). \]
Dividing both sides by ‖f + g‖_p^{p/p′} and using, once again, the definition
of p′, we get (10.1.6). The cases where p = 1 and p = ∞ follow trivially
from the inequality
\[ |f(x) + g(x)| \le |f(x)| + |g(x)|. \]
This completes the proof.

It is now easy to see that the space of all p-integrable functions
(1 ≤ p < ∞) and that of all essentially bounded functions are vector
spaces and that the map f ↦ ‖f‖_p for 1 ≤ p ≤ ∞ verifies all the properties of the norm, except that ‖f‖_p = 0 does not imply that f = 0, but
that f = 0 almost everywhere.
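
The inequalities (10.1.3), (10.1.5) and (10.1.6) can be checked numerically in the simplest setting, a finite set with the counting measure, where every integral is a finite sum. The Python sketch below is an illustration only; the helper names `p_norm` and `conjugate` and the sample vectors are assumptions, not part of the text.

```python
def p_norm(values, p):
    """The p-norm for the counting measure on a finite set, 1 <= p < infinity."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def conjugate(p):
    """Conjugate exponent p' defined by 1/p + 1/p' = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


# Illustrative data (assumed): two functions on a four-point set.
f = [1.0, -2.0, 3.0, 0.5]
g = [0.5, 4.0, -1.0, 2.0]
p = 3.0
q = conjugate(p)  # q plays the role of p'

# Lemma 10.1.1 with a = 2 and b = 5.
a, b = 2.0, 5.0
young_lhs = a ** (1.0 / p) * b ** (1.0 / q)
young_rhs = a / p + b / q

# Hoelder's inequality (10.1.5).
holder_lhs = sum(abs(s * t) for s, t in zip(f, g))
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski's inequality (10.1.6).
minkowski_lhs = p_norm([s + t for s, t in zip(f, g)], p)
minkowski_rhs = p_norm(f, p) + p_norm(g, p)

print(young_lhs <= young_rhs,
      holder_lhs <= holder_rhs,
      minkowski_lhs <= minkowski_rhs)
```

A numerical check of this kind proves nothing, but it is a quick way to catch a misplaced exponent when working with conjugate pairs.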
Given two measurable functions f and g, we say that f ∼ g if f = g
almost everywhere. This defines an equivalence relation. If f ∼ g,
then for 1 ≤ p ≤ ∞, we have that ‖f‖_p = ‖g‖_p. Further, the set of
all equivalence classes forms a vector space with respect to pointwise
addition and scalar multiplication defined via arbitrary representatives
of the equivalence classes. In other words, if f_1 ∼ f_2 and if g_1 ∼ g_2, then
f_1 + g_1 ∼ f_2 + g_2 and, for any scalar α, we also have αf_1 ∼ αf_2 and so
on. Since ‖·‖_p is also constant on any equivalence class, we can define
the ‘norm’ of an equivalence class via any representative function of that
class. Further, if ‖f‖_p = 0, then f will belong to the equivalence class
of the function which is identically zero. Thus the set of all equivalence
classes, with ‖·‖_p, becomes a normed linear space.
Definition 10.1.4 Let (X, S, µ) be a measure space. Let 1 ≤ p < ∞.
The space of all equivalence classes, under the equivalence relation defined by equality of functions almost everywhere, of all p-integrable functions is a normed linear space with the norm of an equivalence class
being the ‖·‖_p-‘norm’ of any representative of that class. This space
is denoted L^p(µ). The space of all equivalence classes of all essentially
bounded functions with the norm of an equivalence class being defined as
the ‖·‖_∞-‘norm’ of any representative of that class, is denoted L^∞(µ).

While we may often talk of ‘L^p-functions’, we must keep in mind that
we are really talking about equivalence classes of functions and that we
carry out computations via representatives of those equivalence classes.

Notation If X = Ω, a non-empty open set of R^N, provided with the
Lebesgue measure m_N, then the corresponding L^p spaces will be denoted L^p(Ω). In particular, if R is provided with the Lebesgue measure
m_1, and if (a, b) is an interval, where −∞ ≤ a < b ≤ +∞, then the L^p
spaces on (a, b) will be denoted L^p(a, b).
Example 10.1.1 Let X = {1, 2, ..., N}. Let S be the collection of all
subsets of X and let µ be the counting measure. Then a measurable
function can be identified with an N-tuple (a_1, a_2, ..., a_N). In this case
L^p(µ) = R^N equipped with the norm
\[ \|x\|_p = \left( \sum_{i=1}^N |x_i|^p \right)^{\frac{1}{p}}, \]
if 1 ≤ p < ∞, and with the norm
\[ \|x\|_\infty = \max_{1 \le i \le N} |x_i|, \]
if p = ∞, where x = (x_1, ..., x_N) ∈ R^N. Notice that in this example, equality almost everywhere is the same as equality everywhere and so every
equivalence class is a singleton.

Example 10.1.2 Let X = N, and let S be the collection of all subsets of
X. Let µ be the counting measure. In this case, functions are identified
with real sequences and L^p(µ) = ℓ^p, the space of all real sequences
of finite norm, equipped with the norm
\[ \|x\|_p = \left( \sum_{k=1}^\infty |x_k|^p \right)^{\frac{1}{p}}, \]
if 1 ≤ p < ∞ and with the norm
\[ \|x\|_\infty = \sup_{k \in \mathbb{N}} |x_k|, \]
if p = ∞. Again, in this example, equivalence classes are just singletons.
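
For the finite-dimensional case of Example 10.1.1 the norms can be computed directly. The following Python sketch is a minimal illustration; the function name `lp_norm` and the use of `math.inf` to encode p = ∞ are conventions chosen here, not notation from the text.

```python
import math


def lp_norm(x, p):
    """Norm of the tuple x in R^N for the counting measure.

    p may be any real number >= 1, or math.inf for the maximum norm.
    """
    if p == math.inf:
        return max(abs(t) for t in x)
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


x = [3.0, -4.0, 0.0]
norm_1 = lp_norm(x, 1)           # |3| + |-4| + |0| = 7
norm_2 = lp_norm(x, 2)           # sqrt(9 + 16) = 5
norm_inf = lp_norm(x, math.inf)  # max(3, 4, 0) = 4
print(norm_1, norm_2, norm_inf)
```

For a sequence in ℓ^p, as in Example 10.1.2, the same function applied to the first n terms yields the partial sums whose limit defines the norm.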
Proposition 10.1.3 Let (X, S, µ) be a finite measure space. Then
\[ L^p(\mu) \subset L^q(\mu) \]
with the inclusion being continuous, whenever 1 ≤ q ≤ p.
Proof: The result is trivial if p = ∞. Let 1 ≤ q < p < ∞ and let
f ∈ L^p(µ). Then, by Hölder’s inequality, we have
\[
\begin{aligned}
\int_X |f|^q \, d\mu &\le \left( \int_X (|f|^q)^{\frac{p}{q}} \, d\mu \right)^{\frac{q}{p}} \left( \int_X d\mu \right)^{1 - \frac{q}{p}} \\
&= \left( \int_X |f|^p \, d\mu \right)^{\frac{q}{p}} (\mu(X))^{1 - \frac{q}{p}} \\
&= \|f\|_p^q \, (\mu(X))^{1 - \frac{q}{p}},
\end{aligned}
\]
which yields
\[ \|f\|_q \le C \|f\|_p \]
where
\[ C = (\mu(X))^{\frac{1}{q} - \frac{1}{p}}. \]
This completes the proof.

Example 10.1.3 No such inclusions hold in infinite measure spaces.
For instance, the sequence (1/n) belongs to ℓ^2 but not to ℓ^1.

Example 10.1.4 Nothing can be said about the reverse inclusions. For
example, if f(x) = 1/√x, then f ∈ L^1(0, 1) but f ∉ L^2(0, 1).

Example 10.1.5 While we cannot say anything, in general, about the
reverse inclusions, as mentioned in the previous example, nevertheless,
if 1 ≤ p < q ≤ ∞, we do have the continuous inclusion
\[ \ell^p \hookrightarrow \ell^q. \]
In fact, we have that, if x ∈ ℓ^p, then
\[ \|x\|_q \le \|x\|_p. \]
Let q = ∞. If x ∈ ℓ^p, where x = (x_1, ..., x_i, ...), we have
\[ |x_i| \le \|x\|_p, \]
for each i. Consequently,
\[ \|x\|_\infty = \sup_i |x_i| \le \|x\|_p. \]
Let 1 ≤ p < q < ∞. Let x ∈ ℓ^p. Assume, for the moment, that ‖x‖_p = 1.
Then, if x = (x_1, ..., x_i, ...), we have seen that |x_i| ≤ 1 for all i. Now
\[ \sum_{i=1}^\infty |x_i|^q = \sum_{i=1}^\infty |x_i|^p |x_i|^{q-p} \le \sum_{i=1}^\infty |x_i|^p = 1. \]
Thus, x ∈ ℓ^q and ‖x‖_q ≤ 1 = ‖x‖_p. Now, if x ∈ ℓ^p is non-zero, consider y = x/‖x‖_p.
Then ‖y‖_p = 1. Consequently, y ∈ ℓ^q and ‖y‖_q ≤ 1. Thus, it follows
that x ∈ ℓ^q as well (since it is a constant multiple of y) and
\[ \frac{\|x\|_q}{\|x\|_p} \le 1, \]
which establishes our claim.

Remark 10.1.2 Let (X, S, µ) be a finite measure space and let f ∈
L^∞(µ), f ≠ 0. Then, we have seen that f ∈ L^p(µ) for all 1 ≤ p ≤ ∞.
Let 1 ≤ p < ∞. Then
\[ \int_X |f|^p \, d\mu \le \|f\|_\infty^{p-1} \|f\|_1. \]
Thus,
\[ \|f\|_p \le \|f\|_\infty^{\frac{p-1}{p}} \|f\|_1^{\frac{1}{p}}. \]
Consequently,
\[ \limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty. \]
Now let 0 < ε < ‖f‖_∞ and let
\[ E = \{ x \in X \mid |f(x)| > \|f\|_\infty - \varepsilon \}. \]
Then µ(E) > 0 and
\[ \int_X |f|^p \, d\mu \ge \int_E |f|^p \, d\mu > (\|f\|_\infty - \varepsilon)^p \mu(E). \]
Then
\[ \|f\|_p \ge (\|f\|_\infty - \varepsilon) (\mu(E))^{\frac{1}{p}}, \]
from which we get, since ε was arbitrary,
\[ \liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty. \]
Thus, we have
\[ \lim_{p \to \infty} \|f\|_p = \|f\|_\infty. \]
This justifies, to some extent, the notation ‖f‖_∞ for the norm given by
the essential supremum.

A Cauchy sequence {f_n}_{n=1}^∞ in L^p(µ) is a sequence such that, given
any ε > 0, there exists N ∈ N satisfying ‖f_n − f_m‖_p < ε whenever n and
m are larger than N. If f ∈ L^p(µ), we say that the sequence converges
to f in L^p(µ) if ‖f_n − f‖_p → 0 as n → ∞.
Lemma 10.1.2 Let 1 ≤ p < ∞. Let (X, S, µ) be a measure space. If
{f_n}_{n=1}^∞ is a Cauchy sequence in L^p(µ), then the sequence is Cauchy in
measure.
Proof: Let ε > 0 be fixed. For positive integers n and m, set
\[ A_{n,m}(\varepsilon) = \{ x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon \}. \]
Then
\[ \int_X |f_n - f_m|^p \, d\mu \ge \int_{A_{n,m}(\varepsilon)} |f_n - f_m|^p \, d\mu \ge \varepsilon^p \mu(A_{n,m}(\varepsilon)). \]
Thus,
\[ \mu(A_{n,m}(\varepsilon)) \le \frac{\|f_n - f_m\|_p^p}{\varepsilon^p}, \]
and, since the right-hand side of the above inequality can be made arbitrarily small for large n and m, the same is true for µ(A_{n,m}(ε)) as well
and that completes the proof.

Theorem 10.1.1 Let (X, S, µ) be a measure space. Let 1 ≤ p ≤ ∞.
Then L^p(µ) is a Banach space.
Proof: We need to show that every Cauchy sequence in L^p(µ) is convergent.
Case 1. Let 1 ≤ p < ∞. Let {f_n}_{n=1}^∞ be a Cauchy sequence in L^p(µ).
Then we saw, in the preceding lemma, that the sequence is Cauchy in
measure. Then, by Proposition 4.2.7, there exists a subsequence {f_{n_k}}
which is almost uniformly Cauchy and, hence, by Proposition 4.1.2,
there exists a measurable function f such that f_{n_k} → f pointwise almost
everywhere.
Let ε > 0. Let N ∈ N be such that ‖f_n − f_m‖_p < ε for all n, m ≥ N.
We have, by Fatou’s lemma,
\[ \int_X |f - f_n|^p \, d\mu \le \liminf_{k \to \infty} \int_X |f_{n_k} - f_n|^p \, d\mu \le \varepsilon^p, \]
for all n ≥ N. Thus, it follows that f − f_n ∈ L^p(µ) and so f ∈ L^p(µ) as
well. Further, it also shows that f_n → f in L^p(µ).
Case 2. Let p = ∞. Let {f_n}_{n=1}^∞ be Cauchy in L^∞(µ). Then, for each
k, there exists a positive integer N_k such that
\[ \|f_m - f_n\|_\infty < \frac{1}{k} \]
for all m, n ≥ N_k. Thus, there exists a set E_k of measure zero, such that
\[ |f_m(x) - f_n(x)| \le \frac{1}{k} \]
for all m, n ≥ N_k and for all x ∈ X\E_k. Setting E = ∪_{k=1}^∞ E_k, we see
that E is of measure zero and for all x ∈ X\E, the sequence {f_n(x)} is
a Cauchy sequence in R. Thus, for all such x, f_n(x) → f(x). Passing to
the limit as m → ∞, we see that, for all x ∈ X\E, and for all n ≥ N_k,
\[ |f(x) - f_n(x)| \le \frac{1}{k}. \]
Hence, it follows that f is essentially bounded and that f_n → f in
L^∞(µ). This completes the proof.

Corollary 10.1.1 Let (X, S, µ) be a measure space and let f_n → f in
L^p(µ) for some 1 ≤ p ≤ ∞. Then, there exists a subsequence {f_{n_k}} such
that f_{n_k}(x) → f(x) almost everywhere.
Proof: If 1 ≤ p < ∞, we have already proved this in the course of the
proof of the preceding theorem. If p = ∞, then we have that f_n → f
pointwise almost everywhere.

Remark 10.1.3 An explicit construction of a subsequence, which converges pointwise almost everywhere, of a Cauchy sequence in L^p(µ), 1 ≤
p < ∞, can be found in Kesavan [5] and in Rudin [8]. This has the added
advantage that it also shows that the subsequence is bounded above by
a fixed function belonging to L^p(µ).

Just as in Theorem 5.3.5, we can prove the L^p-convergence of a
sequence converging pointwise almost everywhere and whose L^p-norm
also converges.
Theorem 10.1.2 Let 1 ≤ p < ∞. Let {f_n}_{n=1}^∞ be a sequence in L^p(µ)
converging pointwise almost everywhere to a function f ∈ L^p(µ). Then
f_n → f in L^p(µ) if, and only if, ‖f_n‖_p → ‖f‖_p as n → ∞.
Proof: Since the norm defines a continuous function on any normed linear space, it follows that if f_n → f in L^p(µ), we have that ‖f_n‖_p → ‖f‖_p.
Conversely, let f_n → f pointwise almost everywhere and let ‖f_n‖_p →
‖f‖_p. As in the case of Theorem 5.3.5, we apply the generalised dominated convergence theorem (cf. Theorem 5.3.4). Let F_n = |f_n − f|^p,
which converges to zero almost everywhere. Since the function t ↦ |t|^p
is convex, we get F_n ≤ 2^{p−1}(|f_n|^p + |f|^p) = G_n, say. Then G_n is integrable and G_n → G = 2^p|f|^p pointwise almost everywhere. Further, by
hypothesis, we get that ∫_X G_n dµ → ∫_X G dµ < +∞. Thus, by Theorem
5.3.4, we deduce that ∫_X F_n dµ → 0, which is the same as saying that
f_n → f in L^p(µ).

10.2 Approximation
Let Ω ⊂ R^N be a non-empty open set. Let S denote the set of all
real-valued simple functions defined on Ω which vanish outside a set of
finite (Lebesgue) measure. If 1 ≤ p < ∞, a simple function ϕ belongs
to L^p(Ω) if, and only if, ϕ ∈ S.

Lemma 10.2.1 Let Ω ⊂ R^N be a non-empty open set and let 1 ≤ p <
∞. Then S is dense in L^p(Ω).
Proof: Let f ∈ L^p(Ω) be a non-negative function and let {ϕ_n}_{n=1}^∞ be a
sequence of non-negative simple functions increasing to f. Then, clearly
ϕ_n ∈ L^p(Ω) for each n and so ϕ_n ∈ S as well.
Now, |ϕ_n − f|^p ≤ 2^p|f|^p, and since |f|^p is integrable, it follows,
from the dominated convergence theorem, that ϕ_n → f in L^p(Ω). If
f ∈ L^p(Ω) is any generic function, then we can write f = f^+ − f^− and
f^± are non-negative functions in L^p(Ω). Then, we can find sequences
{ϕ_n}_{n=1}^∞ and {ψ_n}_{n=1}^∞ in S such that ϕ_n → f^+ and ψ_n → f^− in L^p(Ω).
Thus ϕ_n − ψ_n ∈ S for each positive integer n and ϕ_n − ψ_n → f in L^p(Ω).
This completes the proof.

Lemma 10.2.2 Let Ω ⊂ R^N be a non-empty open set and let 1 ≤ p <
∞. Let f ∈ S. Then, f can be approximated by step functions in L^p(Ω).
Proof: Let E ⊂ Ω be a measurable set of finite measure. By Proposition
2.2.5, given ε > 0, we can find a set F ⊂ Ω which is a finite disjoint
union of boxes, such that m_N(E∆F) < ε^p. Then
\[ \|\chi_E - \chi_F\|_p^p = m_N(E \Delta F) < \varepsilon^p \]
and so ‖χ_E − χ_F‖_p < ε. Now, let f ∈ S. Then we can write f =
∑_{j=1}^k α_j χ_{E_j}, where the α_j are all non-zero and the E_j are all mutually
disjoint sets of finite measure. For each 1 ≤ j ≤ k, we can find F_j, a
finite disjoint union of boxes, such that
\[ \|\chi_{E_j} - \chi_{F_j}\|_p < \frac{\varepsilon}{k |\alpha_j|}. \]
Then ϕ = ∑_{j=1}^k α_j χ_{F_j} is a step function, and, by the triangle inequality,
we have ‖f − ϕ‖_p < ε. This completes the proof.

Theorem 10.2.1 Let Ω ⊂ R^N be a non-empty open set and let 1 ≤
p < ∞. Let C_c(Ω) denote the space of continuous real-valued functions
defined on Ω, having compact support contained in Ω. Then, C_c(Ω) is
dense in L^p(Ω).
Proof: By Lemmas 10.2.1 and 10.2.2, it follows that step functions are
dense in L^p(Ω). So it suffices to show that any step function in L^p(Ω)
can be approximated, in L^p(Ω), by a continuous function with compact
support. Let f ≠ 0 be such a step function and let ε > 0 be given. Then, by Corollary 2.2.1, there exists
ϕ ∈ C_c(Ω), such that
\[ m_N(\{ x \in \Omega \mid \varphi(x) \ne f(x) \}) < \left( \frac{\varepsilon}{2 \|f\|_\infty} \right)^p \]
and such that
\[ \|\varphi\|_\infty \le \|f\|_\infty. \]
Then
\[ \|\varphi - f\|_p^p \le 2^p \|f\|_\infty^p \, m_N(\{ x \in \Omega \mid \varphi(x) \ne f(x) \}) < \varepsilon^p \]
so that ‖ϕ − f‖_p < ε. This completes the proof.

Remark 10.2.1 The above result is not true when p = ∞. In fact, we
can show that the closure of C_c(Ω) under the sup-norm (i.e. the norm
‖·‖_∞), is the space of continuous functions which vanish at infinity, i.e.
functions such that, given any ε > 0, there exists a compact set K ⊂ Ω
such that |f(x)| < ε for all x ∈ Ω\K.

There are several interesting applications of the fact that C_c(Ω) is
dense in L^p(Ω) for 1 ≤ p < ∞. We will see a few in the next section.
To conclude this section, we discuss separability properties of the spaces
L^p(Ω).

Proposition 10.2.1 Let Ω ⊂ R^N be a non-empty open set and let 1 ≤
p < ∞. Then the space L^p(Ω) is separable.
Proof: The set Ω can be expressed as the increasing union of compact sets K_n, n ∈ N. Given f ∈ L^p(Ω), we can approximate it by a
continuous function with compact support, say ϕ, and the support of
ϕ will lie in some K_n. By the Weierstrass approximation theorem, we
can approximate ϕ uniformly in K_n by means of a polynomial and, in
fact, by means of a polynomial whose coefficients are rational. Since
K_n is compact, it also means that these polynomials approximate ϕ in
the L^p-norm over K_n. Since the set of all polynomials with rational
coefficients is countable, let us number them as {p_{m,n}}_{m=1}^∞. Denote by
p̃_{m,n} the extension of p_{m,n} to Ω by setting it to be equal to zero outside
K_n. Thus
\[ \bigcup_{n=1}^\infty \bigcup_{m=1}^\infty \{ \tilde{p}_{m,n} \} \]
is a countable and dense set in L^p(Ω). This completes the proof.

Proposition 10.2.2 Let Ω ⊂ R^N be a non-empty open set. The space
L^∞(Ω) is not separable.
Proof: Let x ∈ Ω. Let r > 0 be such that B(x; r) ⊂ Ω, where B(x; r)
is the open ball of radius r which is centered at x. Let
\[ \varphi_x = \chi_{B(x;r)}. \]
Set
\[ U_x = \left\{ f \in L^\infty(\Omega) \;\middle|\; \|f - \varphi_x\|_\infty < \frac{1}{2} \right\}. \]
Then U_x is an open set in L^∞(Ω).
Let x and y be distinct points in Ω. Then, by definition, it follows
that ‖ϕ_x − ϕ_y‖_∞ = 1. Consequently, if x ≠ y, then U_x ∩ U_y = ∅. Thus
{U_x}_{x∈Ω} is an uncountable collection of disjoint open sets in L^∞(Ω).
Hence, given any countable set {f_n}_{n=1}^∞, there exist open sets U_x which
do not contain any of the f_n. Thus, no countable set in L^∞(Ω) can be
dense. This completes the proof.

10.3 Some applications
In the previous section, we proved the density of C_c(Ω), the space of
continuous functions with compact support, in L^p(Ω), where Ω is an
open subset of R^N and 1 ≤ p < ∞. We used it to examine the separability of L^p(Ω). In this section we will deduce a few more important
and interesting consequences of this density theorem.

Theorem 10.3.1 (Lusin’s theorem) Let E ⊂ R^N be a measurable set of
finite measure. Let f : E → R be a measurable function. Let ε > 0 be
given. Then, there exists ϕ ∈ C_c(R^N) such that
\[ m_N(\{ x \in E \mid \varphi(x) \ne f(x) \}) < \varepsilon. \]
Further, if f is bounded, we can ensure that
\[ \|\varphi\|_\infty \le \|f\|_\infty. \]
Proof: Step 1. For each positive integer n, define
\[ E_n = \{ x \in E \mid |f(x)| \le n \}. \]
Then E_n ↑ E. Since E has finite measure, we can choose m such that
m_N(E\E_m) < ε/3. Now, define f̃ : R^N → R by
\[ \tilde{f}(x) = \begin{cases} f(x), & \text{if } x \in E_m, \\ 0, & \text{if } x \in \mathbb{R}^N \setminus E_m. \end{cases} \]
Since f̃ is bounded and since E_m has finite measure, it follows that f̃
is integrable on R^N. Hence, there exists a sequence {ϕ_n}_{n=1}^∞ in C_c(R^N)
such that ϕ_n → f̃ in L^1(R^N). Then, there exists a subsequence {ϕ_{n_k}}
which converges to f̃ pointwise almost everywhere on R^N.
Step 2. Since E_m has finite measure, we can find F ⊂ E_m such that
m_N(E_m\F) < ε/3 and such that ϕ_{n_k} → f̃ uniformly on F, by virtue of
Egorov’s theorem (cf. Theorem 4.1.1). Again, since F has finite measure, we can find a compact set K ⊂ F such that m_N(F\K) < ε/3 (cf.
Proposition 2.2.3). Clearly, m_N(E\K) < ε.
Step 3. Since {ϕ_{n_k}} converges uniformly to f̃ on K, it follows that
the restriction of f̃ to K is continuous. But K ⊂ F ⊂ E_m and so
f̃(x) = f(x) for every x ∈ K. Thus, we deduce that the restriction of f
to K is continuous and |f(x)| ≤ m for x ∈ K.
Step 4. Now, by the Tietze extension theorem (cf., for instance, Simmons [9]) we can find a continuous function g : R^N → R such that
‖g‖_∞ ≤ m and such that g = f on K.
Step 5. Finally, let ψ ∈ C_c(R^N) be such that 0 ≤ ψ ≤ 1 and such that
ψ ≡ 1 on K (which exists, by Urysohn’s lemma, cf. Simmons [9]). Let
ϕ = ψg. Then ϕ ∈ C_c(R^N), and
\[ \{ x \in E \mid \varphi(x) \ne f(x) \} \subset E \setminus K, \]
which has measure less than ε. Also ‖ϕ‖_∞ ≤ m and, if f is bounded,
m ≤ ‖f‖_∞. This completes the proof.

Remark 10.3.1 Usually, in most texts, Lusin’s theorem is used to prove
the density of continuous functions with compact support, in L^p(Ω).
Here we have proved the density result directly and used it to prove
Lusin’s theorem. This presentation appeared in the expository article A
note on some approximation theorems in measure theory, by S. Kesavan
and M. T. Nair, in the Mathematics Newsletter, 27, No.2, 2016, of the
Ramanujan Mathematical Society.

Theorem 10.3.2 (Hardy’s inequality) Let 1 < p < ∞. Let f ∈ L^p(0, ∞).
For 0 < x < ∞, define
\[ F(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1. \tag{10.3.1} \]
Then F ∈ L^p(0, ∞) and
\[ \|F\|_p \le \frac{p}{p-1} \|f\|_p. \tag{10.3.2} \]
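
Before the proof, the statement can be sanity-checked on one explicit function. Take f to be the indicator function of (0, 1), an example chosen here and not taken from the text. Then ‖f‖_p = 1, while (10.3.1) gives F(x) = 1 for 0 < x ≤ 1 and F(x) = 1/x for x ≥ 1, so that ‖F‖_p^p = 1 + 1/(p − 1) = p/(p − 1). The Python sketch below evaluates these closed forms; the function name `hardy_norms` is hypothetical.

```python
def hardy_norms(p):
    """Closed-form norms for f = indicator of (0, 1) and its average F.

    Returns (||f||_p, ||F||_p), where F is defined by (10.3.1).
    """
    if p <= 1:
        raise ValueError("Hardy's inequality requires p > 1")
    norm_f = 1.0
    # ||F||_p^p = integral over (0,1) of 1, plus integral over (1,oo) of x^(-p)
    #           = 1 + 1/(p - 1) = p/(p - 1).
    norm_F = (p / (p - 1.0)) ** (1.0 / p)
    return norm_f, norm_F


results = []
for p in (1.5, 2.0, 4.0):
    norm_f, norm_F = hardy_norms(p)
    bound = p / (p - 1.0) * norm_f  # right-hand side of (10.3.2)
    results.append((p, norm_F, bound))
    print(p, norm_F, bound, norm_F <= bound)
```

For this particular f the inequality is strict for every p > 1, since (p/(p − 1))^{1/p} < p/(p − 1) whenever p/(p − 1) > 1.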
Proof: Step 1. Let f ∈ C_c((0, ∞)) be a non-negative function. Then,
if F is defined as in (10.3.1), we have xF(x) = ∫_0^x f(t) dt. Further
(xF(x))′ = f(x) for all x > 0. Since f has compact support in (0, ∞),
it follows that f ≡ 0 near the origin and so F ≡ 0 near the origin. So
if we define F(0) = 0, then F is continuous on [0, ∞). If the support
of f is contained in a finite interval [a, b], where 0 < a < b, then f ≡ 0
on [b, ∞) and so ∫_0^x f(t) dt is constant for x ≥ b. Hence F(x) → 0 as
x → ∞. Also, since F is continuous on [0, b] and since x ↦ x^{−1} is in
L^p(b, ∞) for 1 < p < ∞, it follows that F ∈ L^p(0, ∞) as well and it is a
non-negative function. By our earlier observation, F′(x) exists for x > 0
and, for such x,
\[ f(x) = F(x) + x F'(x). \tag{10.3.3} \]
Multiplying both sides of this equation by F^{p−1}(x) and integrating, we
get
\[ \int_{(0,\infty)} F^{p-1} f \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 + \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x). \tag{10.3.4} \]
Since f and F are non-negative, we deduce from (10.3.4) and the monotone convergence theorem, that
\[ \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x) = \lim_{n \to \infty} \int_{(0,n)} x F^{p-1}(x) F'(x) \, dm_1(x). \]
By virtue of (10.3.3), the integrand xF^{p−1}(x)F′(x) is continuous and
so the integral in the extreme right in the above relation is, in fact, a
Riemann integral. Now,
\[ \int_0^n x F^{p-1}(x) F'(x) \, dx = \frac{1}{p} \int_0^n x \frac{d}{dx} \left( F^p(x) \right) dx. \]
Now, since F(x) = c/x for x ≥ b, where c = ∫_0^b f(t) dt, and since p > 1,
we have that xF^p(x) → 0 as x → ∞. Consequently, by integration by
parts, we get, on passing to the limit as n → ∞,
\[ \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x) = -\frac{1}{p} \int_{(0,\infty)} F^p \, dm_1. \]
Using this, and the fact that F ≥ 0, in (10.3.4), and applying Hölder’s
inequality, we get
\[ \frac{p-1}{p} \|F\|_p^p = \int_{(0,\infty)} F^{p-1} f \, dm_1 \le \|f\|_p \, \|F^{p-1}\|_{p'}, \]
where p′ is the conjugate exponent of p. But
\[ \|F^{p-1}\|_{p'}^{p'} = \int_{(0,\infty)} F^{(p-1)p'} \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 = \|F\|_p^p, \]
by the definition of p′. Thus,
\[ \frac{p-1}{p} \|F\|_p^p \le \|f\|_p \cdot \|F\|_p^{\frac{p}{p'}}, \]
which yields (10.3.2), once again, by using the definition of p′. Thus
the result is true for non-negative continuous functions with compact
support.
Step 2. If f ∈ C_c((0, ∞)) is an arbitrary function, set
\[ T(f)(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1, \]
for x > 0. Then, clearly, |f| is a non-negative continuous function with
compact support and we have |T(f)(x)| ≤ T(|f|)(x) and so T(f) ∈
L^p(0, ∞) and
\[ \|T(f)\|_p \le \|T(|f|)\|_p \le \frac{p}{p-1} \| \, |f| \, \|_p = \frac{p}{p-1} \|f\|_p. \]
Thus the result is proven for all continuous functions with compact support in (0, ∞).
Step 3. Let f ∈ Lp (0, ∞). Let {fn }∞
n=1 be a sequence of continuous
functions with compact support in (0, ∞) converging to f in Lp (0, ∞).
If Fn = T (fn ), then Fn ∈ Lp (0, ∞) for all n ∈ N and
kFn − Fm kp ≤
p
kfn − fm kp ,
p−1
p
which implies that {Fn }∞
n=1 is a Cauchy sequence in L (0, ∞). Let
p
Fn → G in L (0, ∞).
Step 4. Now, for any $x > 0$, $f_n \to f$ in $L^p(0,x)$ and so, since $(0,x)$ has
finite measure, $f_n \to f$ in $L^1(0,x)$. Consequently,
\[
\int_{(0,x)} f_n \, dm_1 \to \int_{(0,x)} f \, dm_1,
\]
and so $F_n(x) \to F(x)$ for each $x > 0$. But, for a subsequence, $F_{n_k} \to G$
pointwise, almost everywhere, and so $F = G$ almost everywhere. Thus,
it follows that $F \in L^p(0,\infty)$ and that $F_n \to F$ in $L^p(0,\infty)$. Since
\[
\|F_n\|_p \le \frac{p}{p-1}\, \|f_n\|_p,
\]
for each $n \in \mathbb{N}$, we get (10.3.2) on passing to the limit as $n \to \infty$. This
completes the proof.

Example 10.3.1 The preceding theorem is not true when $p = 1$. Consider the function $f(x) = e^{-x}$, which belongs to $L^1(0,\infty)$. Then, by
(10.3.1), we get
\[
F(x) = \frac{1 - e^{-x}}{x}.
\]
For $x \ge 1$, we have $1 - e^{-x} \ge 1 - e^{-1}$. Thus,
\[
F(x) \ge \frac{1 - e^{-1}}{x}, \quad x \ge 1,
\]
and so $F$ is not integrable on $(1,\infty)$, and hence it cannot be integrable over
$(0,\infty)$ either.

Remark 10.3.2 Hardy's inequality is valid in $\ell_p$ as well, when $1 < p < \infty$.
If $x \in \ell_p$, where $x = (x_i)$, then the sequence $y = (y_i)$ defined by
\[
y_n = \frac{x_1 + \cdots + x_n}{n}
\]
is also in $\ell_p$ and
\[
\|y\|_p \le \frac{p}{p-1}\, \|x\|_p.
\]
For a proof, see, for instance, Kesavan [5].

We conclude this section with another useful result.
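As a numerical illustration of the discrete Hardy inequality of Remark 10.3.2, the following minimal sketch compares both sides for a truncated sequence. The helper names `lp_norm` and `running_means`, the choice $p = 2$ and the test sequence $x_n = 1/n$ are assumptions made for illustration only; a finite check of this kind is of course not a proof.

```python
def lp_norm(seq, p):
    """l_p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)


def running_means(x):
    """Return y with y_n = (x_1 + ... + x_n) / n."""
    means, total = [], 0.0
    for n, t in enumerate(x, start=1):
        total += t
        means.append(total / n)
    return means


p = 2.0
# Truncation of the sequence (1/n), which lies in l_2 (illustrative choice).
x = [1.0 / n for n in range(1, 2001)]
y = running_means(x)

lhs = lp_norm(y, p)                      # norm of the running means
rhs = p / (p - 1.0) * lp_norm(x, p)      # Hardy bound p/(p-1) * ||x||_p
print(lhs, rhs, lhs <= rhs)
```

Since only finitely many terms of $y$ are summed, the left-hand side underestimates $\|y\|_p$ for the zero-padded sequence; the comparison is therefore only a sanity check of the inequality on an initial segment.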
Proposition 10.3.1 Let $1 \le p < \infty$. Let $f \in L^p(\mathbb{R}^N)$. For $h \in \mathbb{R}^N$,
define
\[
\tau_h(f)(x) = f(x - h), \quad x \in \mathbb{R}^N.
\]
Then
\[
\lim_{h \to 0} \|\tau_h(f) - f\|_p = 0. \tag{10.3.5}
\]
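Proof: By the translation invariance of the Lebesgue measure, it is clear
that $\tau_h(f) \in L^p(\mathbb{R}^N)$ whenever $f \in L^p(\mathbb{R}^N)$, and also that
$\|\tau_h(f)\|_p = \|f\|_p$.
Let $\varepsilon > 0$ be given. Choose $\varphi \in C_c(\mathbb{R}^N)$ such that
\[
\|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.6}
\]
Then, we also have
\[
\|\tau_h(f) - \tau_h(\varphi)\|_p = \|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.7}
\]
Let the support of $\varphi$ be contained in the box $[-a, a]^N$. Since $\varphi$ is uniformly continuous, there exists $0 < \delta < 1$ such that, whenever $|h| < \delta$,
we have
\[
|\varphi(x - h) - \varphi(x)| < \frac{\varepsilon}{3}\, (2(a+1))^{-N/p},
\]
for all $x \in \mathbb{R}^N$. Then, for $|h| < \delta$,
\[
\int_{\mathbb{R}^N} |\tau_h(\varphi) - \varphi|^p \, dm_N = \int_{[-(a+1),(a+1)]^N} |\varphi(x-h) - \varphi(x)|^p \, dm_N < \left(\frac{\varepsilon}{3}\right)^p,
\]
so that
\[
\|\tau_h(\varphi) - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.8}
\]
The result now follows on combining the relations (10.3.6)-(10.3.8) by means of the triangle inequality.

10.4 Duality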
When studying normed linear spaces, one of the important objectives is
to identify the dual space of a normed linear space, i.e. the space of all
continuous linear functionals on that space. In this section, we will try
to identify the dual spaces of the spaces Lp (µ).
Let $(X, \mathcal{S}, \mu)$ be a measure space and let $1 \le p \le \infty$. Let $p'$ be the
conjugate exponent of $p$. Let $g \in L^{p'}(\mu)$. Define $T_g : L^p(\mu) \to \mathbb{R}$ by
\[
T_g(f) = \int_X f g \, d\mu, \quad f \in L^p(\mu).
\]
Then, clearly, $T_g$ is a linear functional defined on $L^p(\mu)$. It is also
continuous. Indeed, by Hölder's inequality, we have
\[
|T_g(f)| \le \|f\|_p \, \|g\|_{p'},
\]
which establishes the continuity of $T_g$. We also have
\[
\|T_g\| \le \|g\|_{p'}. \tag{10.4.1}
\]
In this section, we will show that if $(X, \mathcal{S}, \mu)$ is a σ-finite measure
space, then every continuous linear functional on $L^p(\mu)$ occurs in this
way, if $1 \le p < \infty$, and that we have equality in (10.4.1). Thus, we have
an isometric isomorphism (i.e. an isomorphism which preserves norms)
between the dual of $L^p(\mu)$ and the space $L^{p'}(\mu)$ and so we can identify
the latter space with the dual of the former, when $1 \le p < \infty$. The
result does not hold when $p = \infty$.
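The claimed equality in (10.4.1) can be previewed in the simplest setting: the counting measure on a finite set, where functions are just vectors. The sketch below is an illustration under that assumption, not the general argument. It tests the candidate $f = |g|^{p'-1}\,\mathrm{sign}(g)$, for which $T_g(f) = \|f\|_p \|g\|_{p'}$; the names `pairing` and `extremal` and the sample vector are hypothetical.

```python
def lp_norm(v, p):
    """p-norm of a finite vector (counting measure on a finite set)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def pairing(f, g):
    """T_g(f): the integral of f*g with respect to the counting measure."""
    return sum(a * b for a, b in zip(f, g))


def extremal(g, p):
    """Candidate maximiser f = |g|^(p'-1) * sign(g), with 1/p + 1/p' = 1."""
    q = p / (p - 1.0)  # q plays the role of the conjugate exponent p'
    sign = lambda t: 1.0 if t > 0 else (-1.0 if t < 0 else 0.0)
    return [abs(t) ** (q - 1.0) * sign(t) for t in g]


p = 3.0
q = p / (p - 1.0)
g = [2.0, -1.0, 0.0, 0.5]          # sample vector (illustrative)
f = extremal(g, p)

ratio = pairing(f, g) / lp_norm(f, p)   # |T_g(f)| / ||f||_p
print(ratio, lp_norm(g, q))             # the two numbers agree up to rounding

# Hoelder's inequality for another, non-extremal, vector.
f2 = [1.0, 1.0, 1.0, 1.0]
print(abs(pairing(f2, g)) <= lp_norm(f2, p) * lp_norm(g, q))
```

The same choice of $f$, truncated to the sets where $|g| \le n$, is what drives Step 4 of the proof of Theorem 10.4.1 below.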
Proposition 10.4.1 (Uniqueness) Let $(X, \mathcal{S}, \mu)$ be a σ-finite measure
space. Let $1 \le p < \infty$. If $g_i$, $i = 1, 2$, are in $L^{p'}(\mu)$, where $p'$ is the
conjugate exponent of $p$, such that $T_{g_1} = T_{g_2}$, then $g_1 = g_2$ almost
everywhere.
Proof: If $f \in L^p(\mu)$, we have
\[
\int_X f (g_1 - g_2) \, d\mu = 0.
\]
Let $E \subset X$ be a measurable set of finite measure. Then $\chi_E \in L^p(\mu)$ and
so we deduce that
\[
\int_E (g_1 - g_2) \, d\mu = 0,
\]
for all $E \in \mathcal{S}$, $\mu(E) < +\infty$. If $E \in \mathcal{S}$, then, by the σ-finiteness, we can
find a collection of disjoint sets $\{E_i\}_{i=1}^{\infty}$ in $\mathcal{S}$ such that $E = \cup_{i=1}^{\infty} E_i$ and
such that $\mu(E_i) < +\infty$ for each $i$. Then, it follows that
\[
\int_E (g_1 - g_2) \, d\mu = 0,
\]
for all $E \in \mathcal{S}$, from which we easily deduce that $g_1 = g_2$ almost everywhere.

It follows from the above proposition that the mapping $g \mapsto T_g$ from
$L^{p'}(\mu)$ into the dual of $L^p(\mu)$ is injective. It is also continuous, by virtue
of (10.4.1). We now need to show that it is surjective and that it is an
isometry. We will first prove this when the measure space is finite and
then deduce the general case of a σ-finite measure space.
Lemma 10.4.1 Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $g : X \to \mathbb{R}$
be a measurable function such that
\[
\left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le K, \tag{10.4.2}
\]
for all $E \in \mathcal{S}$ with $\mu(E) > 0$. Then $|g| \le K$ almost everywhere.
Proof: Let $U = \{t \in \mathbb{R} \mid |t| > K\}$. This is an open set in $\mathbb{R}$ and hence
can be written as the countable union of open intervals. Let $(a-r, a+r)$
be one such interval. Let
\[
E = \{x \in X \mid g(x) \in (a-r, a+r)\}.
\]
If $\mu(E) > 0$, then set
\[
A_E(g) = \frac{1}{\mu(E)} \int_E g \, d\mu.
\]
Then
\[
|A_E(g) - a| = \left| \frac{1}{\mu(E)} \int_E (g - a) \, d\mu \right| < r.
\]
Thus, $A_E(g) \in (a-r, a+r)$ as well and so $|A_E(g)| > K$, which is a
contradiction. Thus, $\mu(E) = 0$ and since the set $\{x \in X \mid |g(x)| > K\}$
can be covered by a countable number of sets like $E$, we deduce that
$|g(x)| \le K$ except on a set of measure zero. This completes the proof.
Theorem 10.4.1 Let $(X, \mathcal{S}, \mu)$ be a finite measure space. Let $1 \le p < \infty$.
Let $T$ be a continuous linear functional on $L^p(\mu)$. Then, there exists
a unique $g \in L^{p'}(\mu)$ such that $T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.
Proof: Step 1. For $E \in \mathcal{S}$, define
\[
\lambda(E) = T(\chi_E).
\]
This is well-defined, since $\mu(E) < +\infty$ and so $\chi_E \in L^p(\mu)$. If $A$ and $B$
are disjoint measurable sets, then
\[
\chi_{A \cup B} = \chi_A + \chi_B,
\]
and so, by the linearity of $T$, we get that $\lambda$ is finitely additive. Let
$\{E_i\}_{i=1}^{\infty}$ be a countable collection of disjoint sets in $\mathcal{S}$. Let their union
be $E$. Set $F_k = \cup_{i=1}^{k} E_i$. Then $F_k$ increases to $E$. Since the measure
space is finite, we have
\[
\mu(E \setminus F_k) = \sum_{i=k+1}^{\infty} \mu(E_i) \to 0 \quad \text{as } k \to \infty,
\]
since $\mu(E) = \sum_{i=1}^{\infty} \mu(E_i) < +\infty$. Now,
\[
\|\chi_E - \chi_{F_k}\|_p = \mu(E \setminus F_k)^{1/p},
\]
and so $\chi_{F_k} \to \chi_E$ in $L^p(\mu)$. Then $T(\chi_{F_k}) \to T(\chi_E)$, which shows that
\[
\lambda(E) = \sum_{i=1}^{\infty} \lambda(E_i).
\]
Thus $\lambda$ defines a signed measure on $(X, \mathcal{S})$. Further, if $\mu(E) = 0$, then
$\chi_E = 0$ in $L^p(\mu)$ and so $\lambda(E) = 0$ as well. Thus, $\lambda \ll \mu$. Hence, by
the Radon-Nikodym theorem, there exists a measurable function $g$ such
that, for every $E \in \mathcal{S}$, we have
\[
\lambda(E) = \int_E g \, d\mu,
\]
or, in other words,
\[
T(\chi_E) = \int_X \chi_E \, g \, d\mu.
\]
Since both $\lambda$ and $\mu$ are finite, it also follows that $g \in L^1(\mu)$. Further, if
$\varphi$ is any simple function, we get, by the linearity of $T$, that
\[
T(\varphi) = \int_X \varphi g \, d\mu.
\]
Step 2. Let $f \in L^\infty(\mu)$ be a non-negative function. Since we are
in a finite measure space, it follows that $f \in L^p(\mu)$ as well for every
$1 \le p < \infty$. Let $\{\varphi_n\}_{n=1}^{\infty}$ be a sequence of non-negative simple functions
increasing to $f$. Then (cf. Lemma 10.2.1) $\varphi_n \to f$ in $L^p(\mu)$ for $1 \le p < \infty$
and so $T(\varphi_n) \to T(f)$. On the other hand, $\varphi_n g \to f g$ pointwise and
\[
|\varphi_n g| \le f |g| \le \|f\|_\infty |g|.
\]
Since $g$ is integrable, it follows, from the dominated convergence theorem, that
$\int_X \varphi_n g \, d\mu \to \int_X f g \, d\mu$. Thus, for all non-negative bounded
functions $f$, we have
\[
T(f) = \int_X f g \, d\mu. \tag{10.4.3}
\]
By splitting any bounded function $f$ into its positive and negative parts,
$f^{\pm}$, we deduce that (10.4.3) is true for any bounded measurable function.
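Step 3. Let $p = 1$. Let $E \in \mathcal{S}$ with $\mu(E) > 0$. Then
\[
\left| \int_E g \, d\mu \right| = \left| \int_X \chi_E \, g \, d\mu \right| = |T(\chi_E)| \le \|T\| \cdot \|\chi_E\|_1 = \|T\| \, \mu(E).
\]
Thus,
\[
\left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le \|T\|.
\]
It then follows, from Lemma 10.4.1, that $|g| \le \|T\|$ almost everywhere.
Thus, in this case $g \in L^\infty(\mu)$ and $\|g\|_\infty \le \|T\|$.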
Step 4. Let $1 < p < \infty$. Let $\psi$ be a measurable function (taking values
$\pm 1$) such that $\psi g = |g|$. Let
\[
E_n = \{x \in X \mid |g(x)| \le n\}, \quad n \in \mathbb{N}.
\]
Set $f = \chi_{E_n} |g|^{p'-1} \psi$. Then
\[
|f|^p = \chi_{E_n} |g|^{p'p - p} = \chi_{E_n} |g|^{p'},
\]
by the definition of the conjugate exponent. Further, by the definition
of $E_n$, it follows that $f$ is bounded as well. Since $f g = \chi_{E_n} |g|^{p'}$, we get,
from Step 2, that
\[
\int_{E_n} |g|^{p'} \, d\mu = \int_X f g \, d\mu = T(f),
\]
and so
\[
\int_{E_n} |g|^{p'} \, d\mu \le \|T\| \cdot \|f\|_p = \|T\| \left( \int_{E_n} |g|^{p'} \, d\mu \right)^{1/p},
\]
which yields
\[
\left( \int_{E_n} |g|^{p'} \, d\mu \right)^{1/p'} \le \|T\|.
\]
But $E_n \uparrow X$ and so, by the monotone convergence theorem, we have
\[
\left( \int_X |g|^{p'} \, d\mu \right)^{1/p'} \le \|T\|.
\]
Thus, $g \in L^{p'}(\mu)$ and $\|g\|_{p'} \le \|T\|$.
Step 5. Let $1 \le p < \infty$. Then, we have seen that $g \in L^{p'}(\mu)$ and that
$\|g\|_{p'} \le \|T\|$. Further, since $\mu(X)$ is finite, simple functions are dense
in $L^p(\mu)$ (cf. Lemma 10.2.1) and so $L^\infty(\mu)$ is also dense in $L^p(\mu)$. Both
sides of (10.4.3) define continuous linear functionals on $L^p(\mu)$ and agree
on the dense subspace $L^\infty(\mu)$ and so, they agree on all of $L^p(\mu)$. Thus,
we get that, in fact, $T = T_g$, in which case, we have that
\[
\|T\| = \|T_g\| \le \|g\|_{p'} \le \|T\|.
\]
Thus $T = T_g$ and $\|T\| = \|T_g\| = \|g\|_{p'}$. This completes the proof.

Let $(X, \mathcal{S}, \mu)$ be a σ-finite measure space. Let $X = \cup_{n=1}^{\infty} X_n$, where
the $X_n$ are all disjoint and $0 < \mu(X_n) < +\infty$ for each $n \in \mathbb{N}$. Define
\[
h(x) = \sum_{n=1}^{\infty} \frac{1}{n^2 \mu(X_n)} \, \chi_{X_n}(x).
\]
Then $h \in L^1(\mu)$ and $h > 0$. Define, for $E \in \mathcal{S}$,
\[
\nu(E) = \int_E h \, d\mu.
\]
Then $(X, \mathcal{S}, \nu)$ is a finite measure space and $\nu \ll \mu$.
Let $1 \le p < \infty$. Then $f \in L^p(\nu)$ if, and only if, $h^{1/p} f \in L^p(\mu)$ and
\[
\|f\|_{L^p(\nu)} = \|h^{1/p} f\|_{L^p(\mu)}.
\]
This defines an isometric isomorphism between the spaces $L^p(\nu)$ and
$L^p(\mu)$, when $1 \le p < \infty$. Since $\nu \ll \mu$ and, because $h > 0$, also $\mu \ll \nu$,
the two measures have the same null sets; hence $L^\infty(\nu) = L^\infty(\mu)$
and $\|f\|_{L^\infty(\nu)} = \|f\|_{L^\infty(\mu)}$.
Now, let $T$ be a continuous linear functional defined on $L^p(\mu)$, for
$1 \le p < \infty$. Define $S$, a linear functional on $L^p(\nu)$, by
\[
S(f) = T(h^{1/p} f), \quad f \in L^p(\nu).
\]
Thus,
\[
\begin{aligned}
|S(f)| &= |T(h^{1/p} f)| \le \|T\| \cdot \|h^{1/p} f\|_{L^p(\mu)} = \|T\| \cdot \|f\|_{L^p(\nu)}, \\
|T(f)| &= |T(h^{1/p} h^{-1/p} f)| \le \|S\| \cdot \|h^{-1/p} f\|_{L^p(\nu)} = \|S\| \cdot \|f\|_{L^p(\mu)},
\end{aligned}
\]
where $f \in L^p(\nu)$ in the first line above and $f \in L^p(\mu)$ in the second.
This shows that $S$ defines a continuous linear functional on $L^p(\nu)$ and
that $\|S\| = \|T\|$.
Theorem 10.4.2 (Riesz representation theorem) Let $(X, \mathcal{S}, \mu)$ be a σ-finite
measure space. Let $1 \le p < \infty$. Let $T$ be a continuous linear
functional on $L^p(\mu)$. Then, there exists a unique $g \in L^{p'}(\mu)$ such that
$T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.
Proof: We write $X = \cup_{n=1}^{\infty} X_n$ as a disjoint union of sets of finite
measure. We adopt the notation and definitions made in the preceding
paragraphs and define the function $h$ and the measure $\nu$ as before. Given
a continuous linear functional $T$ on $L^p(\mu)$, we define, as before, the linear functional
$S$ on $L^p(\nu)$. Then, by the preceding theorem, there exists $\tilde{g}$ in $L^{p'}(\nu)$
such that
\[
S(f) = \int_X f \tilde{g} \, d\nu, \quad \text{for all } f \in L^p(\nu),
\]
and such that $\|S\| = \|\tilde{g}\|_{L^{p'}(\nu)}$.
Define $g = h^{1/p'} \tilde{g}$ if $1 < p < \infty$ and $g = \tilde{g}$ if $p = 1$.
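Case 1. Let $1 < p < \infty$. Then
\[
\|g\|_{L^{p'}(\mu)}^{p'} = \int_X |g|^{p'} \, d\mu = \int_X |\tilde{g}|^{p'} h \, d\mu = \int_X |\tilde{g}|^{p'} \, d\nu = \|S\|^{p'} = \|T\|^{p'}.
\]
Further, if $f \in L^p(\mu)$, we have
\[
\int_X f g \, d\mu = \int_X f \tilde{g} \, h^{1/p'} \, d\mu = \int_X f \tilde{g} \, h^{1 - 1/p} \, d\mu = \int_X h^{-1/p} f \tilde{g} \, d\nu = S(h^{-1/p} f) = T(f).
\]
This proves the result when $1 < p < \infty$.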
Case 2. Let $p = 1$. Then
\[
\|g\|_{L^\infty(\mu)} = \|\tilde{g}\|_{L^\infty(\nu)} = \|S\| = \|T\|.
\]
If $f \in L^1(\mu)$, then
\[
\int_X f g \, d\mu = \int_X f \tilde{g} \, h^{-1} \, d\nu = S(h^{-1} f) = T(f).
\]
This proves the result for $p = 1$ and the proof of the theorem is complete.

Example 10.4.1 The above result is not true when $p = \infty$. For example, consider the interval $(0,1)$. Then $C[0,1]$ is a subspace of $L^\infty(0,1)$.
If $f \in C[0,1]$, define
\[
T(f) = f(0).
\]
This defines a linear functional on $C[0,1]$ and
\[
|T(f)| = |f(0)| \le \|f\|_\infty.
\]
Thus $T$ is a continuous linear functional on $C[0,1]$ equipped with the
norm from $L^\infty(0,1)$. By the Hahn-Banach theorem, we can extend it to
a continuous linear functional on $L^\infty(0,1)$.
We claim that this functional cannot be represented in the form $T_g$
where $g \in L^1(0,1)$. (Recall that the conjugate exponent of $p = \infty$ is
$p' = 1$.) Assume the contrary. Now consider the sequence of functions
$\{f_n\}_{n=1}^{\infty}$ in $C[0,1]$ given by
\[
f_n(t) =
\begin{cases}
1 - nt, & \text{if } 0 \le t \le \frac{1}{n}, \\
0, & \text{if } \frac{1}{n} \le t \le 1.
\end{cases}
\]
Then $T(f_n) = f_n(0) = 1$ for all $n \in \mathbb{N}$. On the other hand, if $T = T_g$,
where $g \in L^1(0,1)$, then
\[
T(f_n) = \int_{(0,1)} f_n g \, dm_1.
\]
But $|f_n g| \le |g|$ for all $n$ and $g$ is integrable, while $(f_n g)(t) \to 0$ for all
$0 < t \le 1$. Thus, by the dominated convergence theorem, we have that
$T(f_n) \to 0$, which is a contradiction.

Remark 10.4.1 The spaces $L^p(\mu)$ are reflexive (cf., for instance, Kesavan [5]) if $1 < p < \infty$ since we have the canonical identifications
$(L^p(\mu))' = L^{p'}(\mu)$ and $(L^{p'}(\mu))' = L^p(\mu)$. The spaces $L^1(\mu)$ and $L^\infty(\mu)$
are not reflexive. We have that $(L^1(\mu))' = L^\infty(\mu)$ but the reverse identity fails.
Remark 10.4.2 We can prove an inequality called Clarkson's inequality
when $2 \le p < \infty$ (see the exercises at the end of this chapter) which will
show that the spaces $L^p(\mu)$ are uniformly convex (cf. Kesavan [5]), which
is a geometric property of the norm. Then, it follows, from a theorem in
functional analysis, that the spaces $L^p(\mu)$ are reflexive when $2 \le p < \infty$.
One can then easily prove that the spaces $L^p(\mu)$, when $1 < p < 2$, are
also reflexive, using functional analytic arguments. Then it is very easy
to show that the dual space of $L^p(\mu)$ is $L^{p'}(\mu)$ for $1 < p < \infty$. The
advantage of this proof is that it does not need the measure space to be
σ-finite. See Kesavan [5] for details.

10.5 Convolutions
In this section, we will study some properties of a very important tool
in analysis called the convolution product. We will assume throughout
that RN is equipped with the Lebesgue measure mN .
Definition 10.5.1 Let $f$ and $g$ be integrable functions defined on $\mathbb{R}^N$.
The convolution or convolution product of $f$ and $g$, denoted $f * g$,
is given by
\[
(f * g)(x) = \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y), \quad \text{for } x \in \mathbb{R}^N. \tag{10.5.1}
\]
We have already encountered this earlier (cf. Example 8.3.8): we have
seen that the convolution is well-defined and that it is a commutative
and associative binary operation on integrable functions (cf. Exercise
8.4).
We also saw that $f * g \in L^1(\mathbb{R}^N)$ and that
\[
\|f * g\|_1 \le \|f\|_1 \, \|g\|_1. \tag{10.5.2}
\]
We will now try to extend the definition of the convolution product to
other classes of functions as well.
Theorem 10.5.1 Let $1 < p < \infty$. Let $f \in L^1(\mathbb{R}^N)$ and let $g \in L^p(\mathbb{R}^N)$.
Then $f * g$ is well-defined. Further, $f * g \in L^p(\mathbb{R}^N)$ and
\[
\|f * g\|_p \le \|f\|_1 \, \|g\|_p. \tag{10.5.3}
\]
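Proof: Let $p'$ be the conjugate exponent of $p$. Let $h \in L^{p'}(\mathbb{R}^N)$. Then
\[
(x, y) \mapsto f(x - y) g(y) h(x)
\]
is measurable and, using Hölder's inequality and the translation invariance of the Lebesgue measure, we have
\[
\begin{aligned}
\int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x-y) g(y) h(x)| \, dm_N(x) \, dm_N(y)
&= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(x-y) g(y)| \, dm_N(y) \, dm_N(x) \\
&= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(w) g(x-w)| \, dm_N(w) \, dm_N(x) \\
&= \int_{\mathbb{R}^N} |f(w)| \int_{\mathbb{R}^N} |h(x)| \, |g(x-w)| \, dm_N(x) \, dm_N(w) \\
&\le \|h\|_{p'} \, \|g\|_p \, \|f\|_1 < +\infty.
\end{aligned}
\]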
Thus, by Fubini's theorem, the integral
\[
\int_{\mathbb{R}^N} h(x) f(x - y) g(y) \, dm_N(y)
\]
exists for almost all $x$. We can choose $h(x) \ne 0$ for all $x$ (for instance,
$h(x) = \exp(-|x|^2)$, which belongs to all $L^p$ spaces) and so we deduce
that $f * g$ defined via (10.5.1) is well-defined. Further, by the preceding
computation, it follows that
\[
h \mapsto \int_{\mathbb{R}^N} (f * g) h \, dm_N
\]
is a continuous linear functional on $L^{p'}(\mathbb{R}^N)$ whose norm is bounded by
the quantity $\|g\|_p \|f\|_1$, which shows, by the Riesz representation theorem, that $f * g \in L^p(\mathbb{R}^N)$ and that (10.5.3) holds.

Remark 10.5.1 Notice that (10.5.2) is the same as (10.5.3) for the case
$p = 1$. The relation (10.5.3) is a particular case of Young's inequality.
Let $1 \le p, q, r < \infty$ be such that
\[
\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}.
\]
If $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$, then $f * g$ is well-defined via (10.5.1),
$f * g \in L^r(\mathbb{R}^N)$ and
\[
\|f * g\|_r \le \|f\|_p \, \|g\|_q.
\]
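Young's inequality has a discrete counterpart for finitely supported sequences on the integers (counting measure in place of $m_N$), which is easy to test numerically. The sketch below is only an illustration in that discrete setting; the function `convolve`, the exponents and the sample sequences are assumptions chosen for the example.

```python
def lp_norm(v, p):
    """p-norm of a finitely supported sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def convolve(f, g):
    """Discrete convolution: (f*g)[n] = sum over i + j = n of f[i] * g[j]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


p, q = 1.5, 1.2
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)   # from 1/p + 1/q = 1 + 1/r, so r is about 2

f = [1.0, -2.0, 3.0]                   # sample sequences (illustrative)
g = [0.5, 0.5, -1.0, 2.0]

lhs = lp_norm(convolve(f, g), r)
rhs = lp_norm(f, p) * lp_norm(g, q)
print(lhs, rhs, lhs <= rhs)
```

Taking `p = 1.0` and `q = r` in the same script gives the discrete analogue of (10.5.3).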
If $f$ and $g$ are continuous real valued functions on $\mathbb{R}^N$ and if at
least one of them has compact support, then the integral in (10.5.1)
makes sense and the convolution $f * g$ is well-defined. More generally,
if $f_1, \cdots, f_n$ are continuous real valued functions on $\mathbb{R}^N$ such that all
but at most one of them have compact support, then we can define the
convolution product $f_1 * \cdots * f_n$ by taking the products two at a time.
For instance we can have
\[
f_1 * ((f_2 * f_3) * \cdots * (f_{n-1} * f_n)).
\]
The actual pairing and order will be unimportant since we have commutativity and associativity. The convolution product of functions occurring
within any pair of parentheses is well-defined since at least one of them
will have compact support, as shown by the following result.
Theorem 10.5.2 Let $f$ and $g$ be continuous real valued functions on
$\mathbb{R}^N$ and let one of them have compact support so that $f * g$ is well-defined. Then
\[
\mathrm{supp}(f * g) \subset \mathrm{supp}(f) + \mathrm{supp}(g)
\]
where $\mathrm{supp}(\varphi)$ denotes the support of a continuous function $\varphi : \mathbb{R}^N \to \mathbb{R}$,
and, for subsets $A$ and $B$ of $\mathbb{R}^N$, we define
\[
A + B = \{x + y \mid x \in A, \ y \in B\}.
\]
In particular, if both $f$ and $g$ have compact support, then $f * g$ has
compact support.
Proof: Let $A = \mathrm{supp}(f)$ and $B = \mathrm{supp}(g)$ and, without loss of generality, assume that $B$ is compact. Then $A + B$ is closed. To see this, let
$x_n + y_n \in A + B$ be such that $x_n + y_n \to z$ in $\mathbb{R}^N$. Since $B$ is compact,
for a subsequence, we have $y_{n_k} \to y \in B$. Then $x_{n_k} \to z - y = x$, which
will belong to $A$, since $A$ is closed. Thus, $z = x + y \in A + B$, which
establishes our claim.
We clearly need to consider the integral in (10.5.1) only on the set
$B = \mathrm{supp}(g)$. In order that $(f * g)(x) \ne 0$, it is necessary that $x - y \in A = \mathrm{supp}(f)$
for $y$ varying over a subset of $B$ with positive measure. In
particular, it follows that $x \in \mathrm{supp}(f) + \mathrm{supp}(g)$ and the result follows
since this set is closed.
If both functions have compact supports, then $\mathrm{supp}(f) + \mathrm{supp}(g)$ is
also compact (why?) and so $f * g$ has compact support.
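The inclusion of Theorem 10.5.2 can be seen in miniature for finitely supported functions on the integers. This is a discrete analogue used purely for illustration, with functions stored as dictionaries from index to value; the names `support` and `minkowski_sum` are hypothetical. The example also shows that the inclusion can be strict, because of cancellation.

```python
def convolve(f, g):
    """Convolution of finitely supported functions on Z, stored as dicts."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0.0) + a * b
    return out


def support(f):
    """Set of indices where f is non-zero."""
    return {k for k, v in f.items() if v != 0}


def minkowski_sum(A, B):
    """A + B = {a + b : a in A, b in B}."""
    return {a + b for a in A for b in B}


f = {0: 1.0, 1: -1.0}
g = {3: 2.0, 4: 2.0}

conv = convolve(f, g)
print(sorted(support(conv)))                              # [3, 5]
print(sorted(minkowski_sum(support(f), support(g))))      # [3, 4, 5]
print(support(conv) <= minkowski_sum(support(f), support(g)))
```

Here the value at index 4 vanishes because the two contributing products cancel, so the support of the convolution is a proper subset of the sum of the supports.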
If f is continuous with compact support and if g is integrable on RN ,
then also it is easy to see that f ∗ g is well-defined. One of the important
properties of the convolution product is that it has a smoothing effect
on functions. More precisely, we have the following result.
Theorem 10.5.3 Let $f$ be a continuous real valued function on $\mathbb{R}^N$ with
compact support and let $g$ be integrable. Then $f * g$ is continuous. If $f$
is $C^\infty$, then so is $f * g$.
Proof: We will show that $f * g$ is continuous and that, for any $1 \le i \le N$,
\[
\frac{\partial}{\partial x_i}(f * g) = \frac{\partial f}{\partial x_i} * g \tag{10.5.4}
\]
if $f$ is differentiable. Iterating this, we can complete the proof.
(i) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}^N$ be such that $|h| \le 1$. Then
\[
|(f * g)(x + h) - (f * g)(x)| \le \int_{\mathbb{R}^N} |f(x + h - y) - f(x - y)| \, |g(y)| \, dm_N(y).
\]
The above integral needs to be taken only over a compact set $K(x)$
containing the supports of the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h - y)$.
For example, we can take $K(x) = x + B(0; R) + B(0; 1)$, where $B(0; r)$
denotes the closed ball in $\mathbb{R}^N$ with centre at the origin and radius $r$, and
$R > 0$ is such that $\mathrm{supp}(f) \subset B(0; R)$. Since $f$ has compact support, it
is uniformly continuous and so, given $\varepsilon > 0$, there exists $\eta > 0$ such that
$|f(u) - f(v)| < \varepsilon$ whenever $|u - v| < \eta$. Thus, if $|h| < \eta$ (we can assume
that $0 < \eta < 1$), we get
\[
|(f * g)(x + h) - (f * g)(x)| \le \varepsilon \int_{K(x)} |g| \, dm_N,
\]
which proves the continuity of $f * g$ at an arbitrary point $x \in \mathbb{R}^N$.
(ii) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}$, $|h| \le 1$. Let $e_i$ be the $i$-th
standard basis vector of $\mathbb{R}^N$, i.e. the vector with 1 in the $i$-th coordinate
and zero elsewhere. Then
\[
\frac{\partial}{\partial x_i}(f * g)(x) = \lim_{h \to 0} \frac{(f * g)(x + h e_i) - (f * g)(x)}{h}
\]
if the limit exists. If $K(x)$ is a compact set containing the supports of
the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h e_i - y)$, then
\[
\begin{aligned}
\frac{(f * g)(x + h e_i) - (f * g)(x)}{h}
&= \frac{1}{h} \int_{K(x)} \bigl( f(x + h e_i - y) - f(x - y) \bigr) g(y) \, dy \\
&= \int_{K(x)} \frac{\partial f}{\partial x_i}(x - y + \theta h e_i) \, g(y) \, dy,
\end{aligned}
\]
where $\theta \in (0, 1)$ (and depends on $x$, $y$ and $h$). Since $\frac{\partial f}{\partial x_i}$ is assumed to be
continuous, it is bounded on the compact set $K(x)$ and so the integrand
in the last integral above is bounded by $M |g(y)|$, which is integrable on
the compact set $K(x)$. Further, as $h \to 0$, the integrand converges to
$\frac{\partial f}{\partial x_i}(x - y) g(y)$. Thus, by the dominated convergence theorem, we deduce
the validity of (10.5.4).

Notation Let $N$ be a fixed positive integer. A multi-index $\alpha$ (of size
$N$) is an $N$-tuple of non-negative integers. If $\alpha = (\alpha_1, \cdots, \alpha_N)$ is a
multi-index, we denote by $|\alpha|$ the sum of its components, i.e.
\[
|\alpha| = \alpha_1 + \cdots + \alpha_N.
\]
If $f$ is a sufficiently smooth real-valued function defined on $\mathbb{R}^N$ (or an
open set thereof), we define
\[
D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}.
\]
For example, if $N = 3$ and if $\alpha = (3, 0, 1)$, then
\[
D^\alpha f = \frac{\partial^4 f}{\partial x_1^3 \, \partial x_3}.
\]
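To make the multi-index notation concrete, the following small sketch applies $D^\alpha$ to a monomial $x^\beta = x_1^{\beta_1} \cdots x_N^{\beta_N}$, using the elementary rule that each factor $x_i^{\beta_i}$ is differentiated $\alpha_i$ times. The function names `order` and `diff_monomial`, and the representation of a monomial by its exponent tuple, are assumptions made for this illustration.

```python
from math import factorial


def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_N."""
    return sum(alpha)


def diff_monomial(alpha, beta):
    """Apply D^alpha to the monomial x^beta.

    Returns (coefficient, exponent tuple) of the resulting monomial;
    the coefficient is 0 if some alpha_i exceeds beta_i.
    """
    if any(a > b for a, b in zip(alpha, beta)):
        return 0, tuple(0 for _ in beta)
    coeff = 1
    for a, b in zip(alpha, beta):
        coeff *= factorial(b) // factorial(b - a)   # b (b-1) ... (b-a+1)
    return coeff, tuple(b - a for a, b in zip(alpha, beta))


alpha = (3, 0, 1)
print(order(alpha))                       # 4
# D^alpha applied to x1^4 * x2^2 * x3 gives 24 * x1 * x2^2.
print(diff_monomial(alpha, (4, 2, 1)))    # (24, (1, 2, 0))
```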
Remark 10.5.2 It is easy to write down a similar proof when $f$ is $C^\infty$
and $g$ is continuous with compact support. More generally, if $f$ is $C^\infty$
and if one of the two functions has compact support, we have
\[
D^\alpha (f * g)(x) = ((D^\alpha f) * g)(x)
\]
for any multi-index $\alpha$. It is also easy to see that if $f$ is only $C^k$, then
the above relation is valid for all multi-indices $\alpha$ such that $|\alpha| \le k$. If
$g$ also has differentiability properties, then any derivative of $f * g$ could
be got by taking the convolution product of appropriate derivatives of
$f$ and $g$.

Thus, the convolution of a smooth function with compact support
with any integrable function produces a smooth function. This fact, used
together with the mollifiers defined below, provides us with a powerful
technique to prove a variety of density and approximation theorems.
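Lemma 10.5.1 Define $f : \mathbb{R} \to \mathbb{R}$ by
\[
f(x) =
\begin{cases}
\exp(-x^{-2}), & \text{if } x > 0, \\
0, & \text{if } x \le 0.
\end{cases}
\]
Then $f \in C^\infty(\mathbb{R})$.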
Proof: We only need to check the smoothness at $x = 0$. As $x \uparrow 0$, the
function and all the derivatives are zero. As $x \downarrow 0$, the derivatives are
all finite linear combinations of terms of the form $x^{-k} \exp(-x^{-2})$, where
$k$ is a non-negative integer.
Consider the function $g(t) = t^k e^{-t}$. Then
\[
g'(t) = t^{k-1} e^{-t} (k - t),
\]
which is non-positive for $t \ge k$. Thus for all such $t$, we have that
$g(t) \le g(k)$.
Now
\[
x^{-k} e^{-x^{-2}} = x^{k} \left( \frac{1}{x^2} \right)^{k} e^{-x^{-2}} \le x^k k^k e^{-k}
\]
for $\frac{1}{x^2} \ge k$, i.e. $x \le \frac{1}{\sqrt{k}}$. It then follows that $x^{-k} e^{-x^{-2}} \to 0$ as $x \downarrow 0$.
This completes the proof.
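The key estimate in the proof above, $x^{-k} e^{-x^{-2}} \le x^k k^k e^{-k}$ for $0 < x \le 1/\sqrt{k}$, can be spot-checked numerically. The sketch below does this for one value of $k$ on a few sample points; the names `term` and `bound` and the chosen values are assumptions for illustration, and a finite check does not replace the argument.

```python
import math


def term(x, k):
    """The quantity x^(-k) * exp(-x^(-2)) appearing in the derivatives."""
    return x ** (-k) * math.exp(-(x ** (-2)))


def bound(x, k):
    """The upper bound x^k * k^k * exp(-k), valid for 0 < x <= 1/sqrt(k)."""
    return x ** k * k ** k * math.exp(-k)


k = 3
xs = [0.5, 0.4, 0.3, 0.2, 0.1]     # all below 1/sqrt(3), roughly 0.577
for x in xs:
    print(x, term(x, k), bound(x, k))

print(all(term(x, k) <= bound(x, k) for x in xs))
```

The printed values of `term` decrease extremely fast as $x$ decreases, which is the decay that makes every derivative of $f$ vanish at the origin.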
We can use the above lemma to construct examples of C ∞ functions
with compact support.
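Example 10.5.1 Consider the function
\[
\rho(x) =
\begin{cases}
\exp(-a^2/(a^2 - x^2)), & \text{if } |x| < a, \\
0, & \text{if } |x| \ge a.
\end{cases}
\]
A simple application of the preceding lemma shows that $\rho$ is a $C^\infty$ function and that its support is the interval $[-a, a]$.

Example 10.5.2 This is a slight, but very useful, variation of the preceding example. Let $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$. Let
\[
|x| = \left( \sum_{i=1}^{N} |x_i|^2 \right)^{1/2}. \tag{10.5.5}
\]
Given $\varepsilon > 0$, define
\[
\rho_\varepsilon(x) =
\begin{cases}
\kappa \, \varepsilon^{-N} \exp(-\varepsilon^2/(\varepsilon^2 - |x|^2)), & \text{if } |x| < \varepsilon, \\
0, & \text{if } |x| \ge \varepsilon,
\end{cases} \tag{10.5.6}
\]
where
\[
\kappa^{-1} = \int_{|x| \le 1} \exp(-1/(1 - |x|^2)) \, dx.
\]
It follows then that $\rho_\varepsilon$ is a $C^\infty$ function with support in the closed ball
$B(0; \varepsilon)$, centered at the origin and of radius $\varepsilon$. Further, $\rho_\varepsilon \ge 0$ and, by
the change of variable $y = x/\varepsilon$, we see that
\[
\begin{aligned}
\int_{\mathbb{R}^N} \rho_\varepsilon \, dm_N
&= \frac{\kappa}{\varepsilon^N} \int_{|x| \le \varepsilon} \exp(-\varepsilon^2/(\varepsilon^2 - |x|^2)) \, dm_N(x) \\
&= \kappa \int_{|y| \le 1} \exp(-1/(1 - |y|^2)) \, dm_N(y) \\
&= 1.
\end{aligned}
\]
Thus, as $\varepsilon \to 0$, the functions $\rho_\varepsilon$ have decreasing supports, but preserve
the volume contained under the graph and so will be concentrated near
the origin.

Definition 10.5.2 The family of functions $\{\rho_\varepsilon\}_{\varepsilon > 0}$ is called the family
of mollifiers.

Theorem 10.5.4 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers.
(i) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous, then $\rho_\varepsilon * f \to f$ pointwise, as $\varepsilon \to 0$.
(ii) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous with compact support, then $\rho_\varepsilon * f \to f$
uniformly, as $\varepsilon \to 0$.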
Proof: (i) Let $x \in \mathbb{R}^N$. Then, given $\eta > 0$, there exists $\delta > 0$ such that
for all $|y| < \delta$, we have $|f(x - y) - f(x)| < \eta$. Thus, if $\varepsilon < \delta$, we have,
on observing that the integral of $\rho_\varepsilon$ is unity and that this function is
supported on $B(0; \varepsilon)$,
\[
(\rho_\varepsilon * f)(x) - f(x) = \int_{|y| \le \varepsilon} (f(x - y) - f(x)) \rho_\varepsilon(y) \, dm_N(y),
\]
which yields (since $\rho_\varepsilon \ge 0$)
\[
\begin{aligned}
|(\rho_\varepsilon * f)(x) - f(x)|
&\le \int_{|y| \le \varepsilon} |f(x - y) - f(x)| \, \rho_\varepsilon(y) \, dm_N(y) \\
&< \eta \int_{|y| \le \varepsilon} \rho_\varepsilon(y) \, dm_N(y) = \eta.
\end{aligned}
\]
This proves the first statement.
(ii) If $\mathrm{supp}(f) = K$, which is compact, then
\[
\mathrm{supp}(\rho_\varepsilon * f) \subset K + B(0; \varepsilon),
\]
which is compact and is contained within a fixed compact set, say, $K + B(0; 1)$,
if we restrict $\varepsilon$ to be less than or equal to unity. Since $f$ has
compact support, it is uniformly continuous and the $\delta$ corresponding
to $\eta$ in the previous step is now independent of the point $x$ and so the
pointwise convergence is now uniform.

Corollary 10.5.1 Let $f$ be a continuous real-valued function on $\mathbb{R}^N$
with compact support. Then $\rho_\varepsilon * f \to f$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$ for all
$1 \le p \le \infty$.
Proof: The case $p = \infty$ is already covered in the previous theorem.
If $1 \le p < \infty$, then let $K$ be the compact set containing the support
of $f$ and all the functions $\rho_\varepsilon * f$. Then, on this set we have uniform
convergence, which automatically implies convergence in $L^p(\mathbb{R}^N)$.

Remark 10.5.3 Notice that in the above case, $\rho_\varepsilon * f$ is a $C^\infty$ function
with compact support in $\mathbb{R}^N$.

Theorem 10.5.5 Let $1 \le p < \infty$. Then, the space of $C^\infty$ functions
with compact support in $\mathbb{R}^N$ is dense in $L^p(\mathbb{R}^N)$.
Proof: By Corollary 10.5.1 and Remark 10.5.3 above, continuous functions with compact support can be approximated in $L^p(\mathbb{R}^N)$ by $C^\infty$
functions with compact support in $\mathbb{R}^N$. This completes the proof, since
continuous functions with compact support are dense in $L^p(\mathbb{R}^N)$.

Corollary 10.5.2 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers. If $f \in L^p(\mathbb{R}^N)$,
then $\rho_\varepsilon * f \to f$ as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$, for $1 \le p < \infty$.
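Proof: Given $f \in L^p(\mathbb{R}^N)$, we can find, for every $\eta > 0$, a continuous
function $g$ with compact support such that
\[
\|f - g\|_p < \frac{\eta}{3}.
\]
Then, for $\varepsilon$ sufficiently small, we have, by Corollary 10.5.1, that
\[
\|\rho_\varepsilon * g - g\|_p < \frac{\eta}{3}.
\]
Then
\[
\|\rho_\varepsilon * f - f\|_p \le \|f - g\|_p + \|g - \rho_\varepsilon * g\|_p + \|\rho_\varepsilon * (g - f)\|_p.
\]
But, by (10.5.3) (or (10.5.2) when $p = 1$),
\[
\|\rho_\varepsilon * (g - f)\|_p \le \|\rho_\varepsilon\|_1 \, \|g - f\|_p < \frac{\eta}{3},
\]
since the integral of $\rho_\varepsilon$ is unity and the result now follows immediately.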
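The approximation results above can be observed numerically in dimension $N = 1$. The following sketch, under stated assumptions, builds $\rho_\varepsilon$ from (10.5.6) with $\kappa$ computed by a midpoint rule, and mollifies the continuous, compactly supported "hat" function $f(x) = \max(0, 1 - |x|)$. The quadrature rule, the step count `N_STEPS`, the test function and all function names are choices made for this illustration and are not taken from the text.

```python
import math

N_STEPS = 1000  # number of midpoint-rule subintervals (illustrative choice)


def bump(t):
    """exp(-1/(1 - t^2)) for |t| < 1, and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def midpoint(func, a, b, n=N_STEPS):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Numerical stand-in for kappa, so that rho_eps has (approximately) unit integral.
KAPPA = 1.0 / midpoint(bump, -1.0, 1.0)


def rho(eps, x):
    """One-dimensional mollifier rho_eps(x) = kappa / eps * bump(x / eps)."""
    return KAPPA / eps * bump(x / eps)


def mollify(f, eps, x):
    """(rho_eps * f)(x), computed over the support [-eps, eps] of rho_eps."""
    return midpoint(lambda y: f(x - y) * rho(eps, y), -eps, eps)


def hat(x):
    """Continuous function with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


EPSILONS = (0.5, 0.1, 0.02)
grid = [i / 20.0 for i in range(-40, 41)]
errors = [max(abs(mollify(hat, eps, x) - hat(x)) for x in grid) for eps in EPSILONS]
for eps, err in zip(EPSILONS, errors):
    print(eps, err)   # the sup-norm error on the grid shrinks with eps
```

Because the hat function is Lipschitz with constant 1, the estimate in the proof of Theorem 10.5.4 bounds the error by $\varepsilon$, which is consistent with the printed values.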
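Theorem 10.5.6 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $1 \le p < \infty$. Then
the space of $C^\infty$ functions with compact support in $\Omega$ is dense in $L^p(\Omega)$.
Proof: We know that continuous functions with compact support in $\Omega$
are dense in $L^p(\Omega)$ for $1 \le p < \infty$. Thus, given $\eta > 0$ and $f \in L^p(\Omega)$,
there exists $g$, a continuous function with compact support in $\Omega$, such
that $\|f - g\|_p < \eta/2$. Now, let $\tilde{g}$ be the extension of $g$ by zero outside
$\Omega$. Then $\rho_\varepsilon * \tilde{g}$ is a $C^\infty$ function and since its support is compact and is
contained in $B(0; \varepsilon) + \mathrm{supp}(\tilde{g}) = B(0; \varepsilon) + \mathrm{supp}(g) \subset \Omega$, for $\varepsilon$ sufficiently
small, we have that $(\rho_\varepsilon * \tilde{g})|_\Omega$ is a $C^\infty$ function with compact support in
$\Omega$. But $\rho_\varepsilon * \tilde{g} \to \tilde{g}$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$. Hence, for sufficiently small $\varepsilon$,
we have
\[
\|(\rho_\varepsilon * \tilde{g})|_\Omega - g\|_p < \frac{\eta}{2},
\]
which yields
\[
\|(\rho_\varepsilon * \tilde{g})|_\Omega - f\|_p < \eta,
\]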
which completes the proof.

Bibliographical comment: Apart from the books cited in the text,
the following are highly recommended for further study of the $L^p$-spaces
as well as their applications:
1. Brézis, H. Functional Analysis, Sobolev Spaces and Partial Differential Equations, Springer, Universitext, 2011.
2. Ciarlet, P. G. Linear and Nonlinear Functional Analysis with Applications, SIAM, 2013.
3. Lieb, E. H. and Loss, M. Analysis, Graduate Studies in Mathematics, Volume 14, American Mathematical Society, 1997. (Indian Edition:
Narosa, 1998.)
The five volume treatise entitled A Comprehensive Course in Analysis
by Barry Simon is an excellent reference for all topics in Analysis. In
particular, Part 1 of this set has material relevant to topics treated in
this book:
Real Analysis: A Comprehensive Course in Analysis (Part 1), American
Mathematical Society, 2015. (Indian Edition: Universities Press, 2017.)

10.6 Exercises
10.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p, q, r < \infty$ be such that
\[
\frac{1}{p} + \frac{1}{q} = \frac{1}{r}.
\]
(a) If $f \in L^p(\mu)$ and $g \in L^q(\mu)$, show that $fg \in L^r(\mu)$ and that
\[
\|fg\|_r \le \|f\|_p \, \|g\|_q.
\]
(b) If $f_n \to f$ in $L^p(\mu)$ and if $g_n \to g$ in $L^q(\mu)$, show that $f_n g_n \to fg$ in
$L^r(\mu)$.
10.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^{\infty}$
be a sequence in $L^p(\mu)$. Assume that there exists $g \in L^p(\mu)$ such that
$|f_n| \le g$ for each $n \in \mathbb{N}$. If $f_n \to f$ pointwise, show that $f \in L^p(\mu)$ and
that $f_n \to f$ in $L^p(\mu)$.
10.3 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $f_n \to f$ in
$L^p(\mu)$. Let $\{g_n\}_{n=1}^{\infty}$ be a sequence of measurable real-valued functions
which are uniformly bounded by $M > 0$ and which converge almost everywhere to a measurable function $g$ on $X$. Show that $f_n g_n \to fg$ in
$L^p(\mu)$.
10.4 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 < p < \infty$. Let $f : X \times X \to \mathbb{R}$
be such that, for every $y \in X$, the section $f^y$ is $p$-integrable and that
\[
\int_X \|f^y\|_p \, d\mu(y) < +\infty.
\]
Define, for $x \in X$,
\[
g(x) = \int_X f(x, y) \, d\mu(y).
\]
Show that $g \in L^p(\mu)$ and that
\[
\|g\|_p \le \int_X \|f^y\|_p \, d\mu(y).
\]
10.5 (Riemann-Lebesgue lemma) Let $h : (0, \infty) \to \mathbb{R}$ be a bounded and
(Lebesgue) measurable function such that
\[
\lim_{c \to \infty} \frac{1}{c} \int_{(0,c)} h \, dm_1 = 0.
\]
(a) Let $[c, d] \subset (0, \infty)$ and let $f = \chi_{[c,d]}$. Show that
\[
\lim_{\omega \to \infty} \int_{(0,\infty)} f(t) h(\omega t) \, dm_1(t) = 0. \tag{10.6.1}
\]
(b) Show that (10.6.1) holds for all $f \in L^1(0, \infty)$.
(c) If $f \in L^1(a, b)$, where $(a, b) \subset (0, \infty)$, show that
\[
\lim_{n \to \infty} \int_{(a,b)} f(t) \cos nt \, dm_1(t) = \lim_{n \to \infty} \int_{(a,b)} f(t) \sin nt \, dm_1(t) = 0.
\]
10.6 (a) Consider the trigonometric series
\[
\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt).
\]
Show that it can be written as
\[
\frac{a_0}{2} + \sum_{n=1}^{\infty} d_n \cos(nt - \phi_n).
\]
(This is called the amplitude-phase form of the series.) Write down the
relations between $a_n$, $b_n$ and $d_n$, $\phi_n$.
(b) Using the amplitude-phase form of a trigonometric series, show that
if the series converges pointwise over a set $E$ whose (Lebesgue) measure
is strictly positive, then $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. (This is called
the Cantor-Lebesgue theorem.)
In Exercises 10.7-10.11, which follow, give direct proofs of the results,
without appealing to the general results proved in this chapter.
10.7 Show that the spaces $\ell_p$ are complete, for $1 \le p \le \infty$.
10.8 Show that $\ell_p$ is separable if $1 \le p < \infty$ and that $\ell_\infty$ is not separable.
10.9 Let $V'$ denote the dual space of a Banach space $V$. If $p'$ is the
conjugate exponent of $p$, show that $(\ell_p)'$ is isometrically isomorphic to $\ell_{p'}$,
when $1 \le p < \infty$.
10.10 Give an example of a continuous linear functional on $\ell_\infty$ which
does not arise from any element of $\ell_1$.
10.11 Let $c_0$ denote the space of all real sequences which converge to
zero, equipped with the norm $\|\cdot\|_\infty$.
(a) Show that $c_0$ is complete.
(b) Show that $(c_0)'$ is isometrically isomorphic to $\ell_1$.
10.12 (Clarkson's inequality)
(a) Let $2 \le p < \infty$. If $x \ge 0$, show that
\[
(x^2 + 1)^{p/2} \ge x^p + 1.
\]
Deduce that, if $\alpha$ and $\beta$ are positive real numbers, then
\[
(\alpha^2 + \beta^2)^{p/2} \ge \alpha^p + \beta^p.
\]
(b) Combining the above with the fact that the map $t \mapsto t^{p/2}$ is convex
on the set $\{t \in \mathbb{R} \mid t \ge 0\}$, show that, if $f$ and $g$ are in $L^p(\mu)$, where
$(X, \mathcal{S}, \mu)$ is a measure space, then
\[
\left\| \frac{1}{2}(f + g) \right\|_p^p + \left\| \frac{1}{2}(f - g) \right\|_p^p \le \frac{1}{2} \left( \|f\|_p^p + \|g\|_p^p \right).
\]
(d) Deduce that if $(X, \mathcal{S}, \mu)$ is a measure space and if $2 \le p < \infty$, then
the space $L^p(\mu)$ is uniformly convex, i.e. given $\varepsilon > 0$, there exists $\delta > 0$
such that, whenever $f$ and $g$ are in $L^p(\mu)$ with
\[
\|f\|_p = \|g\|_p = 1, \quad \|f - g\|_p > \varepsilon,
\]
we have
\[
\left\| \frac{1}{2}(f + g) \right\|_p < 1 - \delta.
\]
10.13 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $1 < p < \infty$. Let $p'$
denote the conjugate exponent of $p$.
(a) If $g \in L^{p'}(\mu)$, define
\[
f(x) =
\begin{cases}
|g(x)|^{p'-2} g(x), & \text{if } g(x) \ne 0, \\
0, & \text{if } g(x) = 0.
\end{cases}
\]
Show that $f \in L^p(\mu)$.
(b) Define the continuous linear functional (as in Section 10.4)
\[
T_g(f) = \int_X f g \, d\mu, \quad \text{for all } f \in L^p(\mu).
\]
Show that $\|T_g\| = \|g\|_{p'}$.
(c) Given that every uniformly convex Banach space is reflexive (cf. Kesavan [5], Theorem 5.5.1), deduce that $L^p(\mu)$ is reflexive for all $p$ such
that $1 < p < \infty$.
(d) Deduce that the dual of $L^p(\mu)$ is isometrically isomorphic to $L^{p'}(\mu)$
for every $1 < p < \infty$.
Remark 10.6.1 This gives a proof of the Riesz representation theorem
when $1 < p < \infty$, without the assumption of σ-finiteness on the measure
space.
10.14 Let $f \in L^1(\mathbb{R}^N)$ be such that for every non-negative $C^\infty$ function
with compact support, $\varphi$, we have
\[
\int_{\mathbb{R}^N} f \varphi \, dm_N \ge 0.
\]
Show that $f \ge 0$ almost everywhere on $\mathbb{R}^N$.
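10.15 Let $(a, b) \subset \mathbb{R}$ be a finite interval. Let $\{\varphi_k\}_{k=1}^{\infty}$ be an orthonormal
sequence in $L^2(a, b)$, i.e.
\[
\int_{(a,b)} \varphi_j \varphi_k \, dm_1 =
\begin{cases}
1, & \text{if } j = k, \\
0, & \text{if } j \ne k.
\end{cases}
\]
Let $\{c_k\}_{k=1}^{\infty}$ be scalars such that $\sum_{k=1}^{\infty} |c_k|^2 < +\infty$. Show that there
exists $f \in L^2(a, b)$ such that:
(i) $\int_{(a,b)} f \varphi_k \, dm_1 = c_k$ for every $k \in \mathbb{N}$.
(ii)
\[
\|f\|_2^2 = \sum_{k=1}^{\infty} |c_k|^2.
\]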
Remark 10.6.2 This result is known as the Riesz-Fischer theorem. The
completeness of the $L^p$ spaces (cf. Theorem 10.1.1) is also known by the
same name.

Remark 10.6.3 Let $(X, \mathcal{S}, \mu)$ be a measure space. The space $L^2(\mu)$ is
a Hilbert space with the inner-product defined by
\[
(f, g) = \int_X f g \, d\mu.
\]
(If we are dealing with complex-valued functions then $g$ above should
be replaced by its complex conjugate.) The Riesz-Fischer theorem, as
stated in the preceding exercise, is valid in any Hilbert space.

Bibliography
[1] Ahlfors, L. V. Complex Analysis, International Student Edition,
Third Edition, McGraw-Hill, 1979.
[2] Evans, L. C. and Gariepy, R. F. Measure Theory and Fine Properties of Functions, CRC Press, 1992.
[3] Folland, G. B. Real Analysis: Modern Techniques and their Applications, John Wiley and Sons Inc., 1984.
[4] Kesavan, S. Nonlinear Functional Analysis, A First Course,
Texts and Readings in Mathematics (TRIM), 28, Hindustan Book
Agency, 2004.
[5] Kesavan, S. Functional Analysis, Texts and Readings in Mathematics (TRIM), 52, Hindustan Book Agency, 2009.
[6] Royden, H. L. Real Analysis, 2nd Edition, Macmillan, 1964.
[7] Rudin, W. Principles of Mathematical Analysis, Third Edition,
McGraw-Hill International Edition, 1976.
[8] Rudin, W. Real and Complex Analysis, Tata McGraw-Hill, 1974.
[9] Simmons, G. F. Introduction to Topology and Modern Analysis,
McGraw-Hill, 1963.

Index
σ-algebra, 11
σ-ring, 11
    hereditary, 16
absolute continuity, 136, 185
absolutely continuous
    measure, 104
algebra, 9
almost everywhere, 65
almost uniform convergence, 69
Bernstein polynomial, 112
Borel
    σ-algebra, 35
    measure, 36
    set, 35
Borel-Cantelli lemma, 16
bounded variation, 124
Cantor
    function, 64, 118, 141
    set, 37
Cantor-Lebesgue theorem, 231
Cauchy in measure, 71
Cauchy-Schwarz inequality, 198
chain rule, 145
change of variable, 153
characteristic function, 43, 56
Clarkson’s inequality, 221, 232
coarea formula, 173
complex measure, 184
conjugate exponent, 197
convergence
    almost surely, 113
    in mean, 99
    in measure, 71
    in probability, 113
convolution, 170, 171, 221
countable additivity, 12
counting measure, 90
critical
    point, 146
    value, 146
diffeomorphism, 143
differentiable mapping, 142
Dirac measure, 91
distribution function, 169
dominated convergence theorem, 98
Egorov’s theorem, 68
elementary set, 156
equimeasurable, 169
essential supremum, 66
essentially bounded, 66, 196
events, 112
expectation, 113
Fatou’s lemma, 92
finite additivity, 12
Fourier transform, 101
Fréchet derivative, 142
Fubini’s theorem, 164
function
    Cantor, 118
    Carathéodory, 160
    characteristic, 43, 56
    complex-valued, 95
    integrable, 95
    negative part, 58
    positive part, 58
    simple, 60
    step, 43
Hölder’s inequality, 197
Hahn decomposition, 181
Hardy’s inequality, 209
independent
    events, 113
    random variables, 113
inequality
    Cauchy-Schwarz, 198
    Clarkson, 221, 232
    Hölder, 197
    Hardy, 209
    Minkowski, 198
    Young, 222
integrable, 196
    function, 95
    Riemann, 2
integral
    Lebesgue, 84, 85, 95
    Riemann, 2
iterated integrals, 166
Jacobian, 144
Jordan decomposition, 183
Lebesgue
    σ-algebra, 35
    integral, 84, 85, 95
    measurable sets, 35
    measure, 35
lemma
    Borel-Cantelli, 16
    Fatou, 92
    Riemann-Lebesgue, 231
    Vitali covering, 119
Lipschitz continuous, 124
Lusin’s theorem, 208
mapping
    continuously differentiable, 143
    differentiable, 142
mean value theorem, 145
measurable
    cover, 23
    function, 54
    rectangle, 156
    set, 19, 54
    space, 54
measure, 12
    σ-finite, 15
    absolute continuity, 53, 185
    absolutely continuous, 104
    complete, 21
    completion of, 24
    complex, 184
    continuity from above, 15
    continuity from below, 14
    counting, 13
    Dirac, 13
    equivalent, 187
    finite, 15
    inner-regular, 43
    Jordan decomposition, 184
    Lebesgue decomposition, 193
    Lebesgue-Stieltjes, 52
    outer-regular, 43
    product, 162
    regular, 43
    signed, 178
    singular, 193
    space, 65
    subadditivity, 14
    translation invariance, 47
Minkowski’s inequality, 198
mollifiers, 225, 227
monotone class, 157
monotone convergence theorem, 88
negative set, 180
outer-measure, 17
polar coordinates, 168
positive set, 180
probability
    conditional, 113
    space, 112
product measure, 162
Radon-Nikodym
    derivative, 192
    theorem, 94, 104, 189, 192
random variable, 113
    distribution function, 113
    identically distributed, 113
    independence, 113
rearrangement, 169
rectifiable arc, 131
regular value, 146
Riemann-Lebesgue lemma, 231
Riesz representation theorem, 219
Riesz-Fischer theorem, 234
ring, 9
sample space, 112
Sard’s theorem, 146
section
    function, 159
    set, 156
signed measure, 178
    lower variation, 184
    total variation, 184
    upper variation, 184
simple function, 60
singular
    function, 140
    point, 146
    value, 146
step function, 43
subadditivity
    countable, 17
theorem
    Cantor-Lebesgue, 231
    dominated convergence, 98
    Egorov, 68
    Fubini, 164
    Hahn decomposition, 181
    Jordan decomposition, 183
    Lusin, 208
    mean value, 145
    monotone convergence, 88
    Radon-Nikodym, 94, 104, 189, 192
    Riesz representation, 219
    Riesz-Fischer, 234
    Sard, 146
    Weierstrass, 109
total variation, 124
translation invariant, 47
Vitali covering, 119
Weierstrass’ theorem, 109
Young’s inequality, 222

Texts and Readings in Mathematics
1. R. B. Bapat: Linear Algebra and Linear Models (3/E)
2. Rajendra Bhatia: Fourier Series (2/E)
3. C. Musili: Representations of Finite Groups
4. Henry Helson: Linear Algebra (2/E)
5. Donald Sarason: Complex Function Theory (2/E)
6. M. G. Nadkarni: Basic Ergodic Theory (3/E)
7. Henry Helson: Harmonic Analysis (2/E)
8. K. Chandrasekharan: A Course on Integration Theory
9. K. Chandrasekharan: A Course on Topological Groups
10. Rajendra Bhatia (ed.): Analysis, Geometry and Probability
11. K. R. Davidson: C*-Algebras by Example
12. Meenaxi Bhattacharjee et al.: Notes on Infinite Permutation Groups
13. V. S. Sunder: Functional Analysis - Spectral Theory
14. V. S. Varadarajan: Algebra in Ancient and Modern Times
15. M. G. Nadkarni: Spectral Theory of Dynamical Systems
16. A. Borel: Semi-Simple Groups and Symmetric Spaces
17. Matilde Marcolli: Seiberg-Witten Gauge Theory
18. Albrecht Böttcher: Toeplitz Matrices, Asymptotic Linear Algebra and Functional Analysis
19. A. Ramachandra Rao and P. Bhimasankaram: Linear Algebra (2/E)
20. C. Musili: Algebraic Geometry for Beginners
21. A. R. Rajwade: Convex Polyhedra with Regularity Conditions and Hilbert’s Third Problem
22. S. Kumaresan: A Course in Differential Geometry and Lie Groups
23. Stef Tijs: Introduction to Game Theory
24. B. Sury: The Congruence Subgroup Problem - An Elementary Approach Aimed at Applications
25. Rajendra Bhatia (ed.): Connected at Infinity - A Selection of Mathematics by Indians
26. Kalyan Mukherjea: Differential Calculus in Normed Linear Spaces (2/E)
27. Satya Deo: Algebraic Topology - A Primer (2/E)
28. S. Kesavan: Nonlinear Functional Analysis - A First Course
29. Sandor Szabo: Topics in Factorization of Abelian Groups
30. S. Kumaresan and G. Santhanam: An Expedition to Geometry
31. David Mumford: Lectures on Curves on an Algebraic Surface (Reprint)
32. John W. Milnor and James D. Stasheff: Characteristic Classes (Reprint)
33. K. R. Parthasarathy: Introduction to Probability and Measure
34. Amiya Mukherjee: Topics in Differential Topology
35. K. R. Parthasarathy: Mathematical Foundation of Quantum Mechanics (Corrected Reprint)
36. K. B. Athreya and S. N. Lahiri: Measure Theory
37. Terence Tao: Analysis - I (3/E)
38. Terence Tao: Analysis - II (3/E)
39. Wolfram Decker and Christoph Lossen: Computing in Algebraic Geometry
40. A. Goswami and B. V. Rao: A Course in Applied Stochastic Processes
41. K. B. Athreya and S. N. Lahiri: Probability Theory
42. A. R. Rajwade and A. K. Bhandari: Surprises and Counterexamples in Real Function Theory
43. Gene H. Golub and Charles F. Van Loan: Matrix Computations (Reprint of the 4/E)
44. Rajendra Bhatia: Positive Definite Matrices
45. K. R. Parthasarathy: Coding Theorems of Classical and Quantum Information Theory (2/E)
46. C. S. Seshadri: Introduction to the Theory of Standard Monomials (2/E)
47. Alain Connes and Matilde Marcolli: Noncommutative Geometry, Quantum Fields and Motives
48. Vivek S. Borkar: Stochastic Approximation - A Dynamical Systems Viewpoint
49. B. J. Venkatachala: Inequalities - An Approach Through Problems (2/E)
50. Rajendra Bhatia: Notes on Functional Analysis
51. A. Clebsch: Jacobi’s Lectures on Dynamics (2/E)
52. S. Kesavan: Functional Analysis
53. V. Lakshmibai and Justin Brown: Flag Varieties - An Interplay of Geometry, Combinatorics and Representation Theory (2/E)
54. S. Ramasubramanian: Lectures on Insurance Models
55. Sebastian M. Cioaba and M. Ram Murty: A First Course in Graph Theory and Combinatorics
56. Bamdad R. Yahaghi: Iranian Mathematics Competitions 1973-2007
57. Aloke Dey: Incomplete Block Designs
58. R. B. Bapat: Graphs and Matrices (2/E)
59. Hermann Weyl: Algebraic Theory of Numbers (Reprint)
60. C. L. Siegel: Transcendental Numbers (Reprint)
61. Steven J. Miller and Ramin Takloo-Bighash: An Invitation to Modern Number Theory (Reprint)
62. John Milnor: Dynamics in One Complex Variable (3/E)
63. R. P. Pakshirajan: Probability Theory: A Foundational Course
64. Sharad S. Sane: Combinatorial Techniques
65. Hermann Weyl: The Classical Groups - Their Invariants and Representations (Reprint)
66. John Milnor: Morse Theory (Reprint)
67. Rajendra Bhatia (ed.): Connected at Infinity II - A Selection of Mathematics by Indians
68. Donald Passman: A Course in Ring Theory (Reprint)
69. Amiya Mukherjee: Atiyah-Singer Index Theorem - An Introduction
70. Fumio Hiai and Denes Petz: Introduction to Matrix Analysis and Applications
71. V. S. Sunder: Operators on Hilbert Space
72. Amiya Mukherjee: Differential Topology
73. David Mumford and Tadao Oda: Algebraic Geometry II
74. Kalyan B. Sinha and Sachi Srivastava: Theory of Semigroups and Applications
75. Arup Bose and Snigdhansu Chatterjee: U-Statistics, M_m-Estimators and Resampling
76. Rajeeva L. Karandikar and B. V. Rao: Introduction to Stochastic Calculus