Texts and Readings in Mathematics 77

S. Kesavan

Measure and Integration

Texts and Readings in Mathematics, Volume 77

Advisory Editor
C. S. Seshadri, Chennai Mathematical Institute, Chennai

Managing Editor
Rajendra Bhatia, Ashoka University, Sonepat

Editors
Manindra Agrawal, Indian Institute of Technology, Kanpur
V. Balaji, Chennai Mathematical Institute, Chennai
R. B. Bapat, Indian Statistical Institute, New Delhi
V. S. Borkar, Indian Institute of Technology, Mumbai
Apoorva Khare, Indian Institute of Sciences, Bangalore
T. R. Ramadas, Chennai Mathematical Institute, Chennai
V. Srinivas, Tata Institute of Fundamental Research, Mumbai

Technical Editor
P. Vanchinathan, Vellore Institute of Technology, Chennai

The Texts and Readings in Mathematics series publishes high-quality textbooks, research-level monographs, lecture notes and contributed volumes. Undergraduate and graduate students of mathematics, research scholars, and teachers would find this book series useful. The volumes are carefully written as teaching aids and highlight characteristic features of the theory. The books in this series are co-published with Hindustan Book Agency, New Delhi, India.

More information about this series at http://www.springer.com/series/15141

S. Kesavan (emeritus)
Institute of Mathematical Sciences
Chennai, Tamil Nadu, India

ISSN 2366-8725 (electronic)
Texts and Readings in Mathematics
ISBN 978-981-13-6678-9 (eBook)
https://doi.org/10.1007/978-981-13-6678-9
Library of Congress Control Number: 2019932597

This work is a co-publication with Hindustan Book Agency, New Delhi, licensed for sale in all countries in electronic form only. Sold and distributed in print across the world by Hindustan Book Agency, P-19 Green Park Extension, New Delhi 110016, India. ISBN: 978-93-86279-77-4 © Hindustan Book Agency 2019.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019

This work is subject to copyright.
All rights are reserved by the Publishers, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publishers, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publishers nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publishers remain neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Dedicated to Professor Philippe G. Ciarlet, to whom I owe more than I can possibly express, on the occasion of his eightieth birthday.

Preface

A course on the theory of measure and the Lebesgue integral is now an essential component in any masters or graduate programme in mathematics in Indian universities. It is part of the training a student receives in analysis. The most interesting examples of Banach spaces are function spaces of various kinds and the Lebesgue spaces, also known as $L^p$ spaces, are amongst the most important of these.
A knowledge of the theory of measure and integration is essential for the study of several advanced topics in functional analysis like the theory of distributions and Sobolev spaces, which constitute the functional analytic framework for the modern study of partial differential equations. Of course, the theory of measure and integration is vital to the study of probability and stochastic processes.

This book grew out of the notes I prepared for lectures on measure theory and the theory of integration. These lectures were delivered, over the past four decades, to masters and graduate students in several leading institutions like the Centre for Applicable Mathematics, Tata Institute of Fundamental Research, Bangalore, The Institute of Mathematical Sciences, Chennai, the Chennai Mathematical Institute, Siruseri, and the Indian Institute of Technology, Madras. Portions of the book were also taught at numerous refresher or summer courses. In particular, it was taught by me at several refresher courses at the Ramanujan Institute for Advanced Study in Mathematics, of the University of Madras, Chennai. I am indeed thankful to these institutions and organizers of refresher courses for having given me the opportunity to deliver these lectures.

The book starts with a preamble, where the Riemann integral is briefly discussed. Some of the shortcomings of this theory of integration motivate the need to develop the theory of measure and the Lebesgue integral.

Chapter 1 develops the abstract theory of a measure defined over classes of subsets of a non-empty set, like rings, σ-rings and σ-algebras. The extension of a measure from a smaller class (like a ring) to a larger class (typically, a σ-algebra), is done via the method of Carathéodory, using outer-measures. The completion of a measure is also discussed.

Chapter 2 is devoted to the construction and the study of the important properties of the Lebesgue measure on the euclidean space $\mathbb{R}^N$.
Chapter 3 studies important properties of measurable functions.

Chapter 4 introduces various notions of convergence like pointwise convergence, almost uniform convergence and convergence in measure and studies their inter-relationships.

Chapter 5 is the core of this book. It develops the theory of the Lebesgue integral and proves the important limit theorems. It also compares the Riemann and Lebesgue integrals on the real line.

Chapter 6 is devoted to the fundamental theorem of calculus, viz. the relationship between differentiation and integration. Various classes of functions, which are differentiable almost everywhere, are studied and the relationship between the integrand and the derivative of its indefinite integral is explored.

Chapter 7 is devoted to the change of variable formula, viz. the effect on the integral under the action of transformations of the domain.

Chapter 8 studies product spaces and Fubini’s theorem is proved. Polar coordinates in $\mathbb{R}^N$ are discussed.

Chapter 9 concerns signed measures and the main result of this chapter is the Radon-Nikodym theorem.

Finally, Chapter 10 studies $L^p$ spaces. Density theorems and duality are discussed. The notion of the convolution product is introduced.

Most of the material in this book can be covered in a one semester introductory course. The pre-requisite for following this book is familiarity with basic real analysis and elementary topological notions, with special emphasis on the topology of the euclidean space $\mathbb{R}^N$. The instructor may omit certain sections or results if (s)he feels it may be too heavy for the students taking the course. Each chapter is provided with a variety of exercises, which, it is hoped, the students will try to solve.

No originality is claimed regarding the contents and the presentation of the material in this book. I have learnt from, and have been influenced by, many earlier works on this topic, especially those of Halmos, Royden and Rudin, to mention a few.
These appear in the bibliographic references. Since this book is meant to serve as a text book for an introductory course, I have kept the bibliographic references to a minimum.

I wish to thank the Director of the Institute of Mathematical Sciences for the excellent facilities accorded to me during the preparation of this work. I also wish to thank Prof. R. Bhatia, Managing Editor of the TRIM Series, and Shri J. K. Jain of the Hindustan Book Agency, for their support. I thank the anonymous referee who went through the entire manuscript with such great care and pointed out several misprints and other slips. Eliminating these has certainly made the book much better. Finally, I wish to thank several students of the Indian Institute of Technology, Madras, who followed my lectures and made my sojourn there as Visiting (and then Adjunct) Professor a very enjoyable experience. In particular, I wish to mention Ashok Kumar and Nirjan Biswas.

Chennai, November 2018
S. Kesavan

Notations

Certain general conventions followed throughout the text regarding notations are described below. All other specific notations are explained as and when they appear in the text.

• The set of natural numbers {1, 2, 3, · · ·} is denoted by the symbol $\mathbb{N}$, the integers by $\mathbb{Z}$, the rationals by $\mathbb{Q}$, the reals by $\mathbb{R}$ and the complex numbers by $\mathbb{C}$.
• If A and B are two sets, then by A ⊂ B, we mean that every element of A is also an element of B, i.e. A is a subset of B. The inclusion is not necessarily strict.
• If X is a non-empty set and A ⊂ X, then $A^c$ denotes the complement of A in X, i.e. the set of elements in X which do not belong to A.
• The empty set is denoted by the symbol ∅.
• The union and intersection of sets are denoted using the usual symbols ∪ and ∩ respectively.
• If X is a non-empty set and if A and B are subsets of X, then $A \setminus B = A \cap B^c$, and $A \Delta B = (A \setminus B) \cup (B \setminus A)$.
• If $a, b \in \mathbb{R} \cup \{\pm\infty\}$, then
  $(a, b) = \{x \in \mathbb{R} \mid a < x < b\}$,
  $[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}$,
  $[a, b) = \{x \in \mathbb{R} \mid a \le x < b\}$,
  $(a, b] = \{x \in \mathbb{R} \mid a < x \le b\}$.
• The symbol $\mathbb{R}^N$, $N \in \mathbb{N}$, stands for the N-dimensional euclidean space. If $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$, then
\[ |x| = \left( \sum_{i=1}^{N} |x_i|^2 \right)^{\frac{1}{2}}. \]

Contents

Preamble

1 Measure
  1.1 Algebras of sets
  1.2 Measures on rings
  1.3 Outer-measure and measurable sets
  1.4 Completion of a measure
  1.5 Exercises

2 The Lebesgue measure
  2.1 Construction of the Lebesgue measure
  2.2 Approximation
  2.3 Translation invariance
  2.4 Non-measurable sets
  2.5 Exercises

3 Measurable functions
  3.1 Basic properties
  3.2 The Cantor function
  3.3 Almost everywhere
  3.4 Exercises

4 Convergence
  4.1 Egorov’s theorem
  4.2 Convergence in measure
  4.3 Exercises

5 Integration
  5.1 Non-negative simple functions
  5.2 Non-negative functions
  5.3 Integrable functions
  5.4 The Riemann and Lebesgue integrals
  5.5 Weierstrass’ theorem
  5.6 Probability
  5.7 Exercises

6 Differentiation
  6.1 Monotonic functions
  6.2 Functions of bounded variation
  6.3 Differentiation of an indefinite integral
  6.4 Absolute Continuity
  6.5 Exercises

7 Change of variable
  7.1 The Fréchet derivative
  7.2 Sard’s theorem
  7.3 Diffeomorphisms

8 Product spaces
  8.1 Measurability in the product space
  8.2 The product measure
  8.3 Fubini’s theorem
  8.4 Polar coordinates in $\mathbb{R}^N$
  8.5 Exercises

9 Signed measures
  9.1 Hahn and Jordan decompositions
  9.2 Absolute continuity
  9.3 The Radon-Nikodym theorem
  9.4 Singularity
  9.5 Exercises

10 $L^p$ spaces
  10.1 Basic properties
  10.2 Approximation
  10.3 Some applications
  10.4 Duality
  10.5 Convolutions
  10.6 Exercises

Bibliography

About the Author

S.
KESAVAN is a former professor at the Institute of Mathematical Sciences, Chennai, and adjunct faculty at the Indian Institute of Technology Madras, Chennai, India. He started his career at the Tata Institute of Fundamental Research Centre for Applicable Mathematics (TIFR-CAM), Bangalore, India, in 1973. He has also been associated with the Chennai Mathematical Institute, where he was deputy director during 2007–2010 and he is currently adjunct professor at the Indian Institute of Technology, Madras. He earned his PhD from Université Pierre-et-Marie-Curie, Paris, in 1979. He is a fellow of the National Academy of Sciences, India, at Allahabad and the Indian Academy of Sciences, Bangalore, India. He is a life member of the National Board for Higher Mathematics, since 2000, Indian Mathematical Society, International Society for the Interaction of Mechanics and Mathematics (ISIMM), Indian Society of Industrial and Applicable Mathematics (ISIAM), Ramanujan Mathematical Society, American Mathematical Society and an elected fellow of the Forum d’Analystes, Chennai. He was also the Secretary (Grant Selection) of the Commission for Developing Countries of the International Mathematical Union, during 2011–2014 and 2015–2018. He has published four books and authored over 50 research articles, apart from several contributions to conference proceedings and popular articles. His research interests are in partial differential equations, homogenization, control theory, and isoperimetric inequalities.

Preamble

From the time of the Greeks, the problem of computing the area enclosed by a curve had been exercising the minds of scientific thinkers. This crucial question, at the base of the theory of integral calculus, was treated as early as the third century B.C. by Archimedes, who calculated the area of a circular disc, the area of a segment of a parabola and other such figures. He used the ‘method of exhaustion’.
The basic idea was to exhaust the given area by a sequence of polygonal domains and calculate the area as the limit of the areas of the inscribed polygons. During the seventeenth century, many such areas were calculated and in each case the problem was solved by an ingenious device specially suited to the case in hand. One of the achievements of calculus was to develop a general and powerful method to replace these special restricted procedures.

From the time of Archimedes until the time of Gauss, the attitude was that the area was an intuitively obvious entity which need not be defined, but which had to be computed. Before Cauchy, there was no definition of the integral in the precise sense of the term. One was often limited to saying which areas one had to add, or subtract, to get the integral.

Cauchy, with his concern for rigour, which is characteristic of modern mathematics, defined continuous functions and their integrals in much the same way as we do now. To arrive at the integral of a continuous function $f$ defined on an interval $[a, b]$ of the real line, he looked at sums of the form
\[ S = \sum_{i} f(\xi_i)(x_{i+1} - x_i) \]
where $a = x_0 < x_1 < \cdots < x_i < x_{i+1} < \cdots < x_N = b$ is a partition of $[a, b]$ and $\xi_i \in [x_i, x_{i+1}]$. He then deduced the value of the integral
\[ \int_a^b f(x)\,dx \]
by a suitable passage to the limit.

For a long time, certain discontinuous functions were integrated by showing that Cauchy’s definition still applied to these integrals. It was Riemann who systematically investigated the exact scope of this definition. In what follows, we will briefly recall the salient features of the Riemann integral and see what its principal drawbacks are; these will motivate the study of Lebesgue’s theory of measure and integration.

The Riemann Integral

Let $[a, b] \subset \mathbb{R}$ be a finite interval and let $f : [a, b] \to \mathbb{R}$ be a bounded function. Let $\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_N = b\}$ be a partition of the interval. Set
\[ m_i = \inf_{x \in [x_{i-1}, x_i]} f(x) \quad \text{and} \quad M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \quad \text{for } 1 \le i \le N. \]
Then, we define the lower and upper (Darboux) sums associated to the function $f$ and the partition $\mathcal{P}$ by
\[ L(\mathcal{P}, f) = \sum_{i=1}^{N} m_i (x_i - x_{i-1}), \qquad U(\mathcal{P}, f) = \sum_{i=1}^{N} M_i (x_i - x_{i-1}). \]
Then, we define the lower and upper integrals of $f$ by
\[ \underline{\int_a^b} f(x)\,dx = \sup_{\mathcal{P}} L(\mathcal{P}, f), \qquad \overline{\int_a^b} f(x)\,dx = \inf_{\mathcal{P}} U(\mathcal{P}, f), \]
where the supremum and infimum are taken over all possible partitions of $[a, b]$. The function $f$ is said to be Riemann integrable over $[a, b]$ if its lower and upper integrals are equal and the common value, called the Riemann integral of $f$ over $[a, b]$, is denoted by the symbol
\[ \int_a^b f(x)\,dx. \]
Since $f$ is bounded, there exist real numbers $m$ and $M$ such that $m \le f(x) \le M$ for all $x \in [a, b]$ and it is immediate to see that
\[ m(b - a) \le L(\mathcal{P}, f) \le U(\mathcal{P}, f) \le M(b - a) \]
for all partitions $\mathcal{P}$. Thus, the lower and upper integrals of $f$ always exist but the question of their being equal is a delicate one.

Given a partition $\mathcal{P}$ as above, we set
\[ \mu(\mathcal{P}) = \max_{1 \le i \le N} (x_i - x_{i-1}). \]
Let $t_i \in [x_{i-1}, x_i]$ for $1 \le i \le N$. Denote
\[ S(\mathcal{P}, f) = \sum_{i=1}^{N} f(t_i)(x_i - x_{i-1}). \]
The above notation is incomplete. The sum $S(\mathcal{P}, f)$ depends not only on the partition $\mathcal{P}$ and the function $f$, but also on the choice of the points $t_i$. But in order to avoid cumbersome notation, we will leave it as it is.

Definition We say that
\[ \lim_{\mu(\mathcal{P}) \to 0} S(\mathcal{P}, f) = A \]
if, for every $\varepsilon > 0$, there exists a $\delta > 0$ such that, for all partitions $\mathcal{P}$ such that $\mu(\mathcal{P}) < \delta$, and for all choices of points $t_i$ compatible with the partition, we have
\[ |S(\mathcal{P}, f) - A| < \varepsilon. \]

Theorem 1 (cf. Rudin [7]) The function $f$ is Riemann integrable if, and only if, the limit defined in the above definition exists and, in this case,
\[ \int_a^b f(x)\,dx = \lim_{\mu(\mathcal{P}) \to 0} S(\mathcal{P}, f). \]

Thus, we see that the requirement that a function be Riemann integrable is a very strong one. We have the following result.

Theorem 2 (cf. Rudin [7]) If $f$ is continuous, or if $f$ has at most a countable number of discontinuities, then $f$ is Riemann integrable.

Example Let us consider the unit interval [0, 1].
Let us choose some numbering of all the rational numbers in this interval and write them as $r_1, r_2, \ldots$. Define
\[ f_n(x) = \begin{cases} 1, & \text{if } x = r_1, r_2, \ldots, r_n, \\ 0, & \text{otherwise.} \end{cases} \]
The function $f_n$ is discontinuous only at the points $r_1, \ldots, r_n$, which are finite in number and so, by the previous theorem, $f_n$ is Riemann integrable. In fact, it is a simple exercise to check this fact directly using the definition of Riemann integrability and show that the integral is equal to zero.

Let us now consider the function $f(x) = \lim_{n \to \infty} f_n(x)$. It is easy to see that
\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational.} \end{cases} \]
This function is discontinuous everywhere. Given any partition $\mathcal{P}$, it is easy to see that $m_i = 0$ and that $M_i = 1$ for all $1 \le i \le N$. Thus $L(\mathcal{P}, f) = 0$ and $U(\mathcal{P}, f) = 1$. Thus the lower integral is zero while the upper integral is unity and so $f$ fails to be Riemann integrable.

This brings us to a major drawback of the Riemann integral. The limit of a sequence of Riemann integrable functions need not be Riemann integrable. Even if the limit is a Riemann integrable function, the limit of the integrals need not be the integral of the limit, as the following example shows.

Example Let $f_n(x) = n^2 x (1 - x^2)^n$ for $x \in [0, 1]$. Then $f_n(x) \to 0$ as $n \to \infty$ (why?). Now,
\[ \int_0^1 x(1 - x^2)^n\,dx = \frac{1}{2n + 2}. \]
Thus,
\[ \int_0^1 f_n(x)\,dx = \frac{n^2}{2n + 2} \to \infty \]
while, since $f \equiv 0$, we have $\int_0^1 f(x)\,dx = 0$. Similarly, if we define $f_n(x) = n x (1 - x^2)^n$, again $f_n \to f \equiv 0$ pointwise but $\int_0^1 f_n(x)\,dx \to 1/2 \neq 0$.

So, when do the two limit processes, namely the pointwise limit of functions and Riemann integration (which has been defined as a limit of sums as shown in Theorem 1), commute?

Definition We say that $f_n \to f$ uniformly on $[a, b]$ if, for every $\varepsilon > 0$, there exists a positive integer $N$ such that, for all $x \in [a, b]$ and for all $n \ge N$, we have
\[ |f_n(x) - f(x)| < \varepsilon. \]

Theorem 3 (cf. Rudin [7]) If $f_n \to f$ uniformly on $[a, b]$, and if all the $f_n$ are Riemann integrable, then $f$ is Riemann integrable and, further,
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]

In the preceding example, the sequence $\{f_n\}$ failed to converge uniformly. In fact, the non-commutativity of the operations of taking the pointwise limit and the Riemann integral is a useful test to prove that a sequence of functions is not uniformly convergent.

Thus, a sequence of functions which does not converge uniformly may converge to a function which is not integrable, or it can happen that the limit function is Riemann integrable but the limit of the integrals is not the integral of the limit function. But uniform convergence is a very strong condition as well. We thus feel the need for a theory of integration, wherein a larger class of functions is integrable and such that the process of taking pointwise limits of functions commutes with the process of integration under fairly easily verifiable hypotheses. This is where the alternative approach of Lebesgue comes in useful.

The way the Riemann integral is defined, a certain amount of continuity is forced on integrable functions. As we saw in Theorem 1, if $f$ is Riemann integrable, then, for all admissible choices of points $t_i$, the value of $S(\mathcal{P}, f)$ cannot vary too much, since the limit exists as $\mu(\mathcal{P}) \to 0$. Thus, nearby points must have nearby values ‘to a large extent’ and this is what Theorem 2 is all about. We can excuse a countable number of discontinuities. But the function which takes the value 1 on the rationals and the value 0 on irrationals is discontinuous everywhere and it fails to be Riemann integrable.

The idea of Riemann in formulating the definition of the integral is to consider the function following the abscissa. We take the values of the function as we proceed along the x-axis.
Thus, we are forced to consider and compare the values of the function at nearby points and hence we are dependent on some amount of continuity. The idea of Lebesgue is to work, not from the domain, but from the range of a function. We take a particular value and consider the set of all points where this value is assumed when we define the integral. Let us illustrate this via an example.

Example Let $\mathcal{P}$ be a partition of the interval $[a, b]$. Let
\[ f(x) = \sum_{i=1}^{N} \alpha_i \chi_{E_i}(x) \]
where $E_i = [x_{i-1}, x_i]$ and, for any subset $A$ of $\mathbb{R}$,
\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]
This function has a finite number of discontinuities and the Riemann integral is easily seen (Exercise!) to be
\[ \int_a^b f(x)\,dx = \sum_{i=1}^{N} \alpha_i (x_i - x_{i-1}). \]
By Lebesgue’s method, we will be looking at sets of the form $E_\alpha = \{x \in [a, b] \mid f(x) = \alpha\}$ for each $\alpha \in \mathbb{R}$, multiply $\alpha$ by the ‘length’ of the set $E_\alpha$ and ‘add’ all these products. In our example, $E_\alpha = \emptyset$ if $\alpha \neq \alpha_i$ for every $1 \le i \le N$ and $E_{\alpha_i} = E_i$. Thus the (Lebesgue) integral is given again by the same expression as the Riemann integral, in this case.

Imagine a merchant in a shop wanting to add all the money he has collected from sales during a particular day. He has two methods. First, he can take the money one at a time from the till and add the amounts as he takes them out. The other is for him to sort out all the money according to each denomination, count the number of coins or notes in each denomination, multiply the number by the value of the denomination and add all these products. Both procedures will yield the same result, but the latter is more efficient, especially if it involves large quantities of money (have you seen how they count the Hundi collections in a large temple, say, Tirumala?). The approach of Riemann is like the first method where we take a function as it comes along the x-axis, while the approach of Lebesgue is like the second, where we sort it out according to the values in the range.
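The two ways of counting can be tried out on a small case. The following is only a sketch in Python: the coin values, the partition points and the heights $\alpha_i$ are made-up data, and the function names are illustrative rather than anything from the text. It adds up a till in both ways, and then integrates a step function first by following the x-axis and then by grouping the intervals according to the value taken on them.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def total_in_order(coins):
    """First method: add the amounts one at a time, as they come out of the till."""
    total = 0
    for coin in coins:
        total += coin
    return total


def total_by_denomination(coins):
    """Second method: sort by denomination, then add value * count."""
    counts = Counter(coins)
    return sum(value * count for value, count in counts.items())


def integral_along_axis(points, alphas):
    """Sum of alpha_i * (x_i - x_{i-1}), taking the intervals in order."""
    return sum(a * (points[i + 1] - points[i]) for i, a in enumerate(alphas))


def integral_by_values(points, alphas):
    """Sum over values alpha of alpha * (total length of the set where f = alpha)."""
    length_of_level_set = defaultdict(Fraction)  # missing keys start at 0
    for i, a in enumerate(alphas):
        length_of_level_set[a] += points[i + 1] - points[i]
    return sum(a * length for a, length in length_of_level_set.items())


# Hypothetical data: a till, and a step function on [0, 1] taking the
# values 2, 3, 2 on [0, 1/4], [1/4, 1/2], [1/2, 1] respectively.
coins = [5, 10, 5, 1, 10, 10, 2]
points = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
alphas = [2, 3, 2]

print(total_in_order(coins), total_by_denomination(coins))
print(integral_along_axis(points, alphas), integral_by_values(points, alphas))
```

In this sketch the value 2 is taken on two separate intervals, so the second method first adds their lengths and only then multiplies by 2. Exact fractions are used so that the two results can be compared without rounding.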
Obviously, Lebesgue’s approach says nothing about the values at nearby points, and so, hopefully, will not depend on the continuity of the function.

The Riemann integral approximates a function by another of the form
\[ \sum_{i=1}^{N} f(t_i) \chi_{E_i}(x) \]
where $t_i \in E_i$ and $\mathcal{P} = \{E_i \mid 1 \le i \le N\}$ is a partition of $[a, b]$, and passes to the limit in sums of the form $S(\mathcal{P}, f)$. The Lebesgue integral approximates a function by one of the form
\[ \sum_{i=1}^{N} \alpha_i \chi_{A_i}(x) \]
where $A_i$, $1 \le i \le N$, are ‘more general’ sets than just intervals. It then defines the integral of the simpler function by
\[ \sum_{i=1}^{N} \alpha_i \mu(A_i) \]
where $\mu(A)$ is the ‘length’ of the set $A$, and then passes to the limit suitably to get the integral of $f$.

Here is the catch. What do we mean by the ‘length’ of a set $A$ which is not an interval? This brings us to the theory of measures which will generalize the notion of length (area or volume, in higher dimensions) to a fairly large class of sets.

Chapter 1

Measure

1.1 Algebras of sets

Throughout this section $X$ will stand for a non-empty set. We will define various classes of subsets of $X$. The power set of $X$, i.e. the collection of all subsets of $X$, will be denoted by $\mathcal{P}(X)$.

Definition 1.1.1 A non-empty collection $\mathcal{R}$ of subsets of $X$ is called a ring if it is closed under the formation of unions and differences, i.e. if $E$ and $F$ are members of $\mathcal{R}$, then so are $E \cup F$ and $E \setminus F$. A ring is said to be an algebra if, in addition, $X$ itself is a member of $\mathcal{R}$.

Remark 1.1.1 By induction, it is clear that every ring is closed under the formation of finite unions.

Remark 1.1.2 The empty set always belongs to any ring since, if $E$ is a member, so is $\emptyset = E \setminus E$.

Remark 1.1.3 If $\mathcal{R}$ is a ring and if $E, F \in \mathcal{R}$, then
\[ E \Delta F = (E \setminus F) \cup (F \setminus E) \in \mathcal{R}, \qquad E \cap F = E \setminus (E \setminus F) \in \mathcal{R}. \]
Thus, a ring is closed under the formation of symmetric differences and intersections as well.

Remark 1.1.4 Conversely, if a non-empty collection of subsets $\mathcal{R}$ is closed under the formation of unions and symmetric differences, then it is a ring.
Indeed, if $E, F \in \mathcal{R}$, we have, by hypothesis, that $E \cup F \in \mathcal{R}$ and, further,
\[ E \setminus F = (E \cup F) \Delta F \in \mathcal{R}. \]
Similarly, if $\mathcal{R}$ is closed under the formation of symmetric differences and intersections, then also it is a ring, for
\[ E \cup F = (E \Delta F) \Delta (E \cap F) \in \mathcal{R}, \qquad E \setminus F = (E \cup F) \Delta F \in \mathcal{R}. \]

Remark 1.1.5 If $\mathcal{R}$ is an algebra, then it is closed under complementation since $E \in \mathcal{R}$ implies that $E^c = X \setminus E \in \mathcal{R}$. Conversely, if a non-empty collection of subsets $\mathcal{R}$ is closed under the formation of unions and complementation, it is an algebra. To see this, notice that if $E \in \mathcal{R}$, then $E^c \in \mathcal{R}$ and so $X = E \cup E^c \in \mathcal{R}$. Further, if $E, F \in \mathcal{R}$, then
\[ E \setminus F = E \cap F^c = (E^c \cup F)^c \in \mathcal{R}. \]

Example 1.1.1 The collections $\mathcal{R} = \{\emptyset\}$ and $\mathcal{R} = \mathcal{P}(X)$ are trivial examples of rings for any non-empty set $X$.

Example 1.1.2 Let $X = \mathbb{Z}$, the set of all integers. Define
\[ \mathcal{R} = \{A \subset \mathbb{Z} \mid A \text{ is a non-empty finite set, or } A = \emptyset\}. \]
Then $\mathcal{R}$ is a ring.

The next example is one which we will deal with in detail in this book since it will be the starting point of the construction of the Lebesgue measure.

Example 1.1.3 Let $X = \mathbb{R}$, the real line. Define
\[ \mathcal{P} = \{[a, b) \mid a, b \in \mathbb{R},\ a \le b\} \]
where $[a, b) = \{x \in \mathbb{R} \mid a \le x < b\}$, with the convention that this stands for the empty set if $a = b$. Define $\mathcal{R}$ to be the collection of all finite unions of members of $\mathcal{P}$. Then $\mathcal{R}$ is a ring. To see this, first of all, $\mathcal{R}$ is closed under the formation of finite unions, by definition. Further, $[a, b) \setminus [c, d)$ will be $[a, b)$ if the two intervals are disjoint, or empty if $[a, b) \subset [c, d)$. If $a < c < b \le d$, then
\[ [a, b) \setminus [c, d) = [a, c) \]
and if $a \le c < d < b$, we have
\[ [a, b) \setminus [c, d) = [a, c) \cup [d, b) \in \mathcal{R}. \]
Finally, if $c < a < d < b$, we have
\[ [a, b) \setminus [c, d) = [d, b). \]
From this it easily follows that $\mathcal{R}$ is closed under the formation of differences as well.
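The case analysis for differences of half-open intervals can be checked mechanically. The following is a minimal sketch in Python, under the assumption (made here only for illustration) that a finite union of intervals $[a, b)$ is stored as a list of pairs `(a, b)`; the names `normalize`, `union` and `difference` are hypothetical helpers. All the cases listed above are covered by the single identity $[a, b) \setminus [c, d) = [a, \min(b, c)) \cup [\max(a, d), b)$, where an interval whose left end is not smaller than its right end is read as empty.

```python
def normalize(intervals):
    """Rewrite a finite union of [a, b) as a sorted list of disjoint pieces."""
    pieces = sorted((a, b) for a, b in intervals if a < b)  # drop empty [a, a)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            # The piece overlaps or touches the previous one: glue them together.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def union(E, F):
    """Union of two finite unions of half-open intervals."""
    return normalize(list(E) + list(F))


def difference(E, F):
    """Set difference of two finite unions of half-open intervals."""
    result = normalize(E)
    for c, d in normalize(F):
        remaining = []
        for a, b in result:
            # [a, b) \ [c, d) = [a, min(b, c)) U [max(a, d), b);
            # normalize() discards whichever piece turns out to be empty.
            remaining.append((a, min(b, c)))
            remaining.append((max(a, d), b))
        result = normalize(remaining)
    return result


# The cases of the example, with made-up endpoints.
print(difference([(0, 1)], [(2, 3)]))   # disjoint intervals
print(difference([(1, 2)], [(0, 5)]))   # [a, b) contained in [c, d)
print(difference([(0, 5)], [(3, 8)]))   # a < c < b <= d
print(difference([(0, 5)], [(1, 2)]))   # a <= c < d < b
print(difference([(2, 5)], [(0, 3)]))   # c < a < d < b
```

Because `normalize` always returns disjoint pieces, every result is again a finite union of members of the generating collection, so the representation stays inside the ring under both operations.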
It is also clear that each member of $\mathcal{R}$ can, in fact, be written as a finite disjoint union of members of $\mathcal{P}$.

Definition 1.1.2 A non-empty collection $\mathcal{S}$ of subsets of a non-empty set $X$ is said to be a σ-ring if it is closed under the formation of differences and countable unions. In other words, if $E, F \in \mathcal{S}$, then $E \setminus F \in \mathcal{S}$ and if $\{E_i\}_{i=1}^{\infty}$ is a countable collection of members of $\mathcal{S}$, then $\cup_{i=1}^{\infty} E_i \in \mathcal{S}$. A σ-ring $\mathcal{S}$ is called a σ-algebra if, in addition, $X \in \mathcal{S}$.

Remark 1.1.6 Thus, a σ-ring is a ring which is closed under the formation of countable unions. If $\{E_i\}_{i=1}^{\infty}$ is a countable collection in a σ-ring $\mathcal{S}$, then
\[ \cap_{i=1}^{\infty} E_i = E \setminus \cup_{i=1}^{\infty} (E \setminus E_i) \in \mathcal{S}, \]
where $E = \cup_{i=1}^{\infty} E_i$. Thus, a σ-ring is closed under the formation of countable intersections as well. It is also easy to see that a σ-algebra can be described as a non-empty collection of subsets which is closed under the formation of countable unions and complementation.

Let $X$ be a non-empty set and let $\mathcal{E}$ be a non-empty collection of subsets of $X$. Clearly, the power set of $X$, i.e. $\mathcal{P}(X)$, is a ring (respectively, σ-ring) containing $\mathcal{E}$. Now, it is immediate to see that the intersection of a collection of rings (respectively, σ-rings) is again a ring (respectively, a σ-ring). Consequently, there exists a smallest ring (respectively, σ-ring) containing $\mathcal{E}$. This is called the ring (respectively, σ-ring) generated by $\mathcal{E}$ and is denoted by $\mathcal{R}(\mathcal{E})$ (respectively, $\mathcal{S}(\mathcal{E})$). The collection of all sets in $X$ which can be covered by finite (respectively, countable) unions of members of $\mathcal{E}$ is clearly a ring (respectively, a σ-ring) containing $\mathcal{E}$. Thus, every member of $\mathcal{R}(\mathcal{E})$ (respectively, $\mathcal{S}(\mathcal{E})$) can be covered by a finite (respectively, countable) union of members of $\mathcal{E}$.

1.2 Measures on rings

Let $X$ be a non-empty set and let $\mathcal{R}$ be a ring of subsets of $X$.

Definition 1.2.1 A measure, $\mu$, on the ring $\mathcal{R}$, is an extended real-valued function on $\mathcal{R}$ such that
(i) $\mu(E) \ge 0$, for all $E \in \mathcal{R}$,
(ii) $\mu(\emptyset) = 0$, and,
(iii) $\mu$ is countably additive, i.e.
if $\{E_i\}_{i=1}^{\infty}$ is a sequence of pairwise disjoint sets in $\mathcal{R}$ such that $E = \cup_{i=1}^{\infty} E_i \in \mathcal{R}$, then
\[ \mu(E) = \sum_{i=1}^{\infty} \mu(E_i). \tag{1.2.1} \]

Remark 1.2.1 Since $\mu$ is an extended real-valued function on $\mathcal{R}$, it is possible that $\mu(E) = +\infty$ for some $E \in \mathcal{R}$. If there exists at least one $E \in \mathcal{R}$ such that $\mu(E) < +\infty$, then (ii) in the above definition will follow as a consequence of (iii), since we can write
\[ E = E \cup \emptyset \cup \emptyset \cup \emptyset \cup \cdots \]

Remark 1.2.2 Since $\mu(E_i) \ge 0$ for all $i$, the order of the summands in (1.2.1) is unimportant.

Remark 1.2.3 A measure is always finitely additive as well. If $\{E_i\}_{i=1}^{N}$ is a finite collection of mutually disjoint sets whose union is $E$ (which will be automatically in $\mathcal{R}$), then since
\[ E = \cup_{i=1}^{N} E_i \cup \emptyset \cup \emptyset \cup \emptyset \cup \cdots, \]
we have
\[ \mu(E) = \sum_{i=1}^{N} \mu(E_i). \]

Example 1.2.1 Let $X$ be any non-empty set and let $\mathcal{R} = \mathcal{P}(X)$. If $E \subset X$, define
\[ \mu(E) = \begin{cases} 0, & \text{if } E = \emptyset, \\ \text{number of elements in } E, & \text{if } E \text{ is a finite non-empty set}, \\ +\infty, & \text{otherwise.} \end{cases} \]
We need to check only the countable additivity. Let $\{E_i\}_{i=1}^{\infty}$ be a collection of mutually disjoint subsets whose union is $E$. If $E$ is a finite set, then at most finitely many of the $E_i$ will be non-empty and they will also be finite sets. Then (1.2.1) is obviously true. If $E$ is an infinite set, then either at least one of the $E_i$ is an infinite set, or there are infinitely many $E_i$ which are non-empty finite sets. In either case, both sides of (1.2.1) take the value $+\infty$ and so countable additivity is established. This measure is called the counting measure on the set $X$.

Example 1.2.2 Let $X$ and $\mathcal{R}$ be as in the previous example. Let $x_0 \in X$ be fixed. Let $E \subset X$. Define
\[ \mu(E) = \begin{cases} 1, & \text{if } x_0 \in E, \\ 0, & \text{if } x_0 \notin E. \end{cases} \]
Again, we need only to check the countable additivity. Let $E = \cup_{i=1}^{\infty} E_i$ be the union of a sequence of mutually disjoint sets in $X$. If $x_0 \in E$, then $x_0 \in E_{i_0}$ for exactly one index $i_0$. Consequently, both sides of (1.2.1) will be unity. If $x_0 \notin E$, then $x_0 \notin E_i$ for every index $i$ and so both sides of (1.2.1) are zero in this case.
This measure is called the Dirac measure concentrated at the point $x_0$.

Example 1.2.3 Let $X$ be any non-empty set. Let $\mathcal{R}$ be the ring of finite subsets of $X$. Let $f : X \to \mathbb{R}$ be a given non-negative real-valued function. Define $\mu$ to be zero on the empty set and set
$$\mu(\{x_1, \cdots, x_n\}) = \sum_{i=1}^n f(x_i).$$
It is easy to check that this defines a measure on $\mathcal{R}$.

The most interesting example will be the Lebesgue measure, to be defined on the euclidean space $\mathbb{R}^N$, $N \ge 1$, which we will study in detail later. We will now prove some basic, but important, properties of measures.

Proposition 1.2.1 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Then, (i) $\mu$ is monotone, i.e. if $E, F \in \mathcal{R}$ and if $E \subset F$, we have $\mu(E) \le \mu(F)$, and (ii) $\mu$ is subtractive, i.e. if $E, F \in \mathcal{R}$, $E \subset F$ and if $\mu(E) < +\infty$, then $\mu(F \setminus E) = \mu(F) - \mu(E)$.

Proof: If $E \subset F$, then, by finite additivity, we have that
$$\mu(F) = \mu(F \setminus E) + \mu(E).$$
Then (i) is a consequence of the non-negativity of the measure and (ii) follows from the fact that $\mu(E)$ is finite and hence we can subtract it from both sides of the above relation.

Proposition 1.2.2 (Subadditivity) Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $\{E_i\}$ be a finite, or infinite, sequence of sets in $\mathcal{R}$ and let $E \in \mathcal{R}$ be such that $E \subset \cup_i E_i$. Then
$$\mu(E) \le \sum_i \mu(E_i).$$

Proof: Set $F_i = E \cap E_i$. Define $G_1 = F_1$ and define $G_i = F_i \setminus (\cup_{j=1}^{i-1} F_j)$ for $i \ge 2$. Then the sets $G_i$ are all mutually disjoint and $G_i \subset F_i$ for all $i$. Further $\cup_i G_i = \cup_i F_i = E$. Thus, by countable additivity and monotonicity, we have
$$\mu(E) = \sum_i \mu(G_i) \le \sum_i \mu(F_i) \le \sum_i \mu(E_i).$$

Proposition 1.2.3 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $\{E_i\}$ be a finite, or infinite, sequence of mutually disjoint sets in $\mathcal{R}$ such that $\cup_i E_i \subset E$, where $E \in \mathcal{R}$. Then,
$$\sum_i \mu(E_i) \le \mu(E).$$

Proof: For any positive integer $n$, we have $\cup_{i=1}^n E_i \subset E$.
By the finite additivity and monotonicity of the measure, we have
$$\sum_{i=1}^n \mu(E_i) = \mu(\cup_{i=1}^n E_i) \le \mu(E),$$
from which we deduce the result immediately.

Proposition 1.2.4 (Continuity from below) Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $\{E_i\}_{i=1}^\infty$ be an increasing sequence of sets in $\mathcal{R}$ such that $\cup_{i=1}^\infty E_i \in \mathcal{R}$. Then
$$\mu(\cup_{i=1}^\infty E_i) = \lim_{n \to \infty} \mu(E_n). \tag{1.2.2}$$

Proof: Set $E_0 = \emptyset$. Then
$$\begin{aligned} \mu(\cup_{i=1}^\infty E_i) &= \mu(\cup_{i=1}^\infty (E_i \setminus E_{i-1})) = \sum_{i=1}^\infty \mu(E_i \setminus E_{i-1}) \\ &= \lim_{n \to \infty} \sum_{i=1}^n \mu(E_i \setminus E_{i-1}) = \lim_{n \to \infty} \mu(E_n), \end{aligned}$$
since the sets $E_i \setminus E_{i-1}$ are all mutually disjoint. This completes the proof.

Proposition 1.2.5 (Continuity from above) Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $\{E_i\}_{i=1}^\infty$ be a decreasing sequence of sets in $\mathcal{R}$ such that $\cap_{i=1}^\infty E_i \in \mathcal{R}$ and such that for some positive integer $m$, we have $\mu(E_m) < +\infty$. Then
$$\mu(\cap_{i=1}^\infty E_i) = \lim_{n \to \infty} \mu(E_n). \tag{1.2.3}$$

Proof: Since the sequence of sets is decreasing, we have that $\mu(E_n) < +\infty$ for all $n \ge m$. Further $\{E_m \setminus E_n\}_{n \ge m}$ is an increasing sequence of sets. Hence by the preceding proposition and by the subtractive property of the measure, we have
$$\begin{aligned} \mu(E_m) - \mu(\cap_{i=1}^\infty E_i) &= \mu(E_m) - \mu(\cap_{i=m}^\infty E_i) = \mu(E_m \setminus (\cap_{i=m}^\infty E_i)) \\ &= \mu(\cup_{i=m}^\infty (E_m \setminus E_i)) = \lim_{n \to \infty} \mu(E_m \setminus E_n) \\ &= \mu(E_m) - \lim_{n \to \infty} \mu(E_n). \end{aligned}$$
The result now follows on subtracting $\mu(E_m)$, which is finite, from both sides of the above relation.

Example 1.2.4 The preceding proposition is not valid without the assumption that $\mu(E_m)$ is finite for some $m$. Consider the set of natural numbers, $\mathbb{N}$, equipped with the counting measure (cf. Example 1.2.1). Let $E_n = \{m \in \mathbb{N} \mid m \ge n\}$. Then $\mu(E_n) = +\infty$ for all $n$ while $\cap_{n=1}^\infty E_n = \emptyset$.

Definition 1.2.2 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. We say that $\mu$ is finite if $\mu(E) < +\infty$ for every $E \in \mathcal{R}$. We say that $\mu$ is σ-finite if every set $E$ in $\mathcal{R}$ can be covered by a sequence $\{E_i\}_{i=1}^\infty$ of sets in $\mathcal{R}$ with $\mu(E_i) < +\infty$ for every $i$.

Thus, the Dirac measure (cf.
Example 1.2.2) and the measure defined in Example 1.2.3 are both finite measures. The counting measure on $\mathbb{N}$ is a σ-finite measure, since any subset of $\mathbb{N}$ can be covered by a countable number of singleton sets, and each singleton set has measure unity.

We conclude this section with a very useful result.

Proposition 1.2.6 (Borel-Cantelli Lemma) Let $\mu$ be a measure on a σ-algebra $\mathcal{S}$ of subsets of a non-empty set $X$. Let $\{E_i\}_{i=1}^\infty$ be a sequence of sets in $\mathcal{S}$ such that
$$\sum_{i=1}^\infty \mu(E_i) < +\infty.$$
Then, except for a set of measure zero, every point $x \in X$ belongs to at most finitely many of the sets $E_i$.

Proof: Let $E$ be the set of all points $x \in X$ which belong to infinitely many of the sets $E_i$. Then,
$$E = \cap_{n=1}^\infty \cup_{i=n}^\infty E_i.$$
Then, for every positive integer $n$, we have
$$\mu(E) \le \mu(\cup_{i=n}^\infty E_i) \le \sum_{i=n}^\infty \mu(E_i).$$
But the sum on the extreme right is the tail of a convergent series and hence can be made arbitrarily small for large $n$. Thus it follows that $\mu(E) = 0$.

1.3 Outer-measure and measurable sets

In the sequel, we will really be interested only in measures defined on σ-algebras. However, as we shall see in the construction of the Lebesgue measure, it will be simpler to explicitly construct it on a ring and then try to extend it to larger collections like the σ-ring generated by the ring itself. We now investigate the possibility of extending a measure defined on a ring to the σ-ring generated by it or to even larger classes of sets.

Definition 1.3.1 Let $X$ be a non-empty set and let $\mathcal{S}$ be a σ-ring of subsets of $X$. It is said to be a hereditary σ-ring if, whenever $E \in \mathcal{S}$, we have that every subset of $E$ is also a member of $\mathcal{S}$.

The power set of $X$ is clearly a hereditary σ-ring and the intersection of any collection of hereditary σ-rings is again a hereditary σ-ring. It then follows that given any collection $\mathcal{E}$ of subsets of $X$, there is a smallest hereditary σ-ring, denoted $\mathcal{H}(\mathcal{E})$, containing $\mathcal{E}$. This is called the hereditary σ-ring generated by the class $\mathcal{E}$.
Notice that the collection of all sets in $X$ which can be covered by a countable union of members of $\mathcal{E}$ is a hereditary σ-ring containing $\mathcal{E}$. Thus, every member of $\mathcal{H}(\mathcal{E})$ can be covered by a countable union of members of $\mathcal{E}$.

Definition 1.3.2 Let $X$ be a non-empty set and let $\mathcal{H}$ be a hereditary σ-ring of subsets of $X$. An extended real-valued set function $\mu^*$ defined on $\mathcal{H}$ is said to be an outer-measure if the following properties hold: (i) (non-negativity) $\mu^*(E) \ge 0$ for every $E \in \mathcal{H}$; (ii) (monotonicity) if $E, F \in \mathcal{H}$ are such that $E \subset F$, then $\mu^*(E) \le \mu^*(F)$; (iii) $\mu^*(\emptyset) = 0$; (iv) (countable subadditivity) if $\{E_n\}_{n=1}^\infty$ is a sequence of sets in $\mathcal{H}$, then
$$\mu^*(\cup_{n=1}^\infty E_n) \le \sum_{n=1}^\infty \mu^*(E_n). \tag{1.3.1}$$
An outer-measure is said to be σ-finite if every set in the hereditary σ-ring can be covered by a countable union of sets of finite outer-measure.

Outer-measures occur naturally when we try to extend a measure defined on a ring.

Proposition 1.3.1 Let $X$ be a non-empty set and let $\mathcal{R}$ be a ring of subsets of $X$. Let $\mu$ be a measure on $\mathcal{R}$. For any set $E \in \mathcal{H}(\mathcal{R})$, the hereditary σ-ring generated by $\mathcal{R}$, define
$$\mu^*(E) = \inf\left\{ \sum_{n=1}^\infty \mu(E_n) \;\middle|\; E \subset \cup_{n=1}^\infty E_n,\ E_n \in \mathcal{R} \right\}.$$
Then, $\mu^*$ is an outer-measure on $\mathcal{H}(\mathcal{R})$ which extends $\mu$. Further, if $\mu$ is σ-finite, so is $\mu^*$.

Proof: Step 1: The non-negativity of $\mu^*$ is obvious. Now, let $E \in \mathcal{R}$. Then since $E$ covers itself, we have $\mu^*(E) \le \mu(E)$. On the other hand, given any countable cover of $E$, say $\{E_n\}_{n=1}^\infty$ with $E_n \in \mathcal{R}$ for all $n$, we have, by subadditivity of the measure, $\mu(E) \le \sum_{n=1}^\infty \mu(E_n)$ and so, by definition of $\mu^*$, we have $\mu(E) \le \mu^*(E)$. Thus, for $E \in \mathcal{R}$, we have $\mu(E) = \mu^*(E)$ and so $\mu^*$ extends $\mu$. In particular, we have that $\mu^*(\emptyset) = 0$.

Step 2: Let $F \subset E$, where $E \in \mathcal{H}(\mathcal{R})$. Then every countable cover of $E$ by sets from $\mathcal{R}$ also covers $F$. Thus, it follows immediately that $\mu^*(F) \le \mu^*(E)$.

Step 3: We now prove the subadditivity of $\mu^*$. Let $E \in \mathcal{H}(\mathcal{R})$ and assume that $E \subset \cup_{i=1}^\infty E_i$, where each $E_i$ is also a member of $\mathcal{H}(\mathcal{R})$.
If there exists even a single index $i$ such that $\mu^*(E_i) = +\infty$, there is nothing to prove. Assume, therefore, that $\mu^*(E_i) < +\infty$ for each $i$. Then, given $\varepsilon > 0$, there exist, by definition, sets $E_{ij} \in \mathcal{R}$ such that $E_i \subset \cup_{j=1}^\infty E_{ij}$ and such that
$$\sum_{j=1}^\infty \mu(E_{ij}) < \mu^*(E_i) + \frac{\varepsilon}{2^i}.$$
Then $E \subset \cup_{i=1}^\infty \cup_{j=1}^\infty E_{ij}$ and
$$\mu^*(E) \le \sum_{i=1}^\infty \sum_{j=1}^\infty \mu(E_{ij}) \le \sum_{i=1}^\infty \mu^*(E_i) + \varepsilon.$$
Since $\varepsilon > 0$ was arbitrarily chosen, we deduce that
$$\mu^*(E) \le \sum_{i=1}^\infty \mu^*(E_i).$$

Step 4: Let $E \in \mathcal{H}(\mathcal{R})$. Then, there exists a countable cover $\{E_i\}_{i=1}^\infty$ of $E$ such that $E_i \in \mathcal{R}$ for each $i$. Since $\mu$ is σ-finite, we can find $\{E_{ij}\}_{i,j=1}^\infty$ in $\mathcal{R}$ such that $E_i \subset \cup_{j=1}^\infty E_{ij}$ and $\mu(E_{ij}) < +\infty$ for each $1 \le i, j < \infty$. Now, we have $E \subset \cup_{i=1}^\infty \cup_{j=1}^\infty E_{ij}$ and $\mu^*(E_{ij}) = \mu(E_{ij}) < +\infty$. This completes the proof.

Example 1.3.1 Let $X = \mathbb{N}$ and consider the ring $\mathcal{R}$ of all finite subsets, with the counting measure. Then $\mu$ is finite. Since any countable union of singletons has to be in $\mathcal{H}(\mathcal{R})$, it follows that $\mathbb{N}$ is in $\mathcal{H}(\mathcal{R})$ and by heredity, it follows that $\mathcal{H}(\mathcal{R}) = \mathcal{P}(\mathbb{N})$, the power set of $\mathbb{N}$. It is now immediate to see that if $E$ is any infinite subset of $\mathbb{N}$, then $\mu^*(E) = +\infty$. Thus, even though $\mu$ is a finite measure, we can only say that $\mu^*$ is σ-finite.

Definition 1.3.3 Let $X$ be a non-empty set and let $\mathcal{H}$ be a hereditary σ-ring of subsets of $X$. Let $\mu^*$ be an outer-measure defined on $\mathcal{H}$. A set $E \in \mathcal{H}$ is said to be $\mu^*$-measurable if for every $A \in \mathcal{H}$, we have
$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^c). \tag{1.3.2}$$

Remark 1.3.1 Since $\mu^*$ is subadditive, the $\mu^*$-measurability of $E$ is equivalent to verifying, for every $A \in \mathcal{H}$,
$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$

Proposition 1.3.2 Let $\mu^*$ be an outer-measure on a hereditary σ-ring $\mathcal{H}$ of subsets of a non-empty set $X$. Then the collection of all $\mu^*$-measurable sets, denoted $\mathcal{S}$, is a ring.

Proof: Let $A \in \mathcal{H}$ and let $E$ and $F$ be $\mu^*$-measurable sets.
Then, by definition, we have
$$\begin{aligned} \mu^*(A) &= \mu^*(A \cap E) + \mu^*(A \cap E^c), \\ \mu^*(A \cap E) &= \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c), \\ \mu^*(A \cap E^c) &= \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \end{aligned}$$
Thus,
$$\mu^*(A) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.3}$$
If we replace $A$ by $A \cap (E \cup F)$ in the above relation, we get
$$\mu^*(A \cap (E \cup F)) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F). \tag{1.3.4}$$
It then follows from (1.3.3) that
$$\mu^*(A) = \mu^*(A \cap (E \cup F)) + \mu^*(A \cap (E \cup F)^c).$$
Thus, $E \cup F \in \mathcal{S}$. Similarly, replacing $A$ in (1.3.3) by $A \cap (E \setminus F)^c = A \cap (E^c \cup F)$, we get
$$\mu^*(A \cap (E \setminus F)^c) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.5}$$
It now follows from (1.3.3) that
$$\mu^*(A) = \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap E \cap F^c) = \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap (E \setminus F)).$$
This shows that $E \setminus F \in \mathcal{S}$. Clearly $\emptyset \in \mathcal{S}$. This completes the proof.

In fact, we can prove much more.

Proposition 1.3.3 Let $\mu^*$ be an outer-measure on a hereditary σ-ring $\mathcal{H}$ of subsets of a non-empty set $X$. Then, $\mathcal{S}$, the collection of all $\mu^*$-measurable sets, is a σ-ring. Further, if $\{E_i\}_{i=1}^\infty$ is a sequence of mutually disjoint sets in $\mathcal{S}$ whose union is $E$ and if $A$ is any arbitrary set in $\mathcal{H}$, we have
$$\mu^*(A \cap E) = \sum_{i=1}^\infty \mu^*(A \cap E_i). \tag{1.3.6}$$

Proof: It follows from (1.3.4) that
$$\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1 \cap E_2) + \mu^*(A \cap E_1 \cap E_2^c) + \mu^*(A \cap E_1^c \cap E_2).$$
Since $E_1 \cap E_2 = \emptyset$, it follows that $E_1 \subset E_2^c$ and that $E_2 \subset E_1^c$. Consequently, the above relation yields,
$$\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1) + \mu^*(A \cap E_2).$$
By induction, it follows that for any $n$ mutually disjoint sets $\{E_i\}_{i=1}^n$, we have
$$\mu^*(A \cap (\cup_{i=1}^n E_i)) = \sum_{i=1}^n \mu^*(A \cap E_i). \tag{1.3.7}$$
Set $F_n = \cup_{i=1}^n E_i$. Since $\mathcal{S}$ is a ring, we have that $F_n \in \mathcal{S}$. Further, since $F_n \subset E$, we have that $E^c \subset F_n^c$. Consequently we have, by (1.3.7) and the monotonicity of $\mu^*$,
$$\mu^*(A) = \mu^*(A \cap F_n) + \mu^*(A \cap F_n^c) \ge \sum_{i=1}^n \mu^*(A \cap E_i) + \mu^*(A \cap E^c).$$
Since $n$ was arbitrarily fixed, we deduce that
$$\mu^*(A) \ge \sum_{i=1}^\infty \mu^*(A \cap E_i) + \mu^*(A \cap E^c).$$
Replacing $A$ by $A \cap E$ and by the subadditivity of the outer-measure, we get
$$\mu^*(A \cap E) \ge \sum_{i=1}^\infty \mu^*(A \cap E_i) \ge \mu^*(A \cap E).$$
This proves (1.3.6) and we also immediately see that
$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c)$$
from which we deduce that $E \in \mathcal{S}$ (cf. Remark 1.3.1). Thus, $\mathcal{S}$ is closed under countable disjoint unions. But since $\mathcal{S}$ is a ring, we can express any countable union of sets in it as a countable disjoint union of sets in it. Thus it follows that $\mathcal{S}$ is indeed a σ-ring. This completes the proof.

Definition 1.3.4 A measure $\mu$, defined on a σ-ring $\mathcal{S}$ of subsets of a non-empty set $X$, is said to be complete if, whenever a set $E \in \mathcal{S}$ has measure zero, every subset of $E$ is also a member of $\mathcal{S}$.

Theorem 1.3.1 Let $\mu^*$ be an outer-measure on a hereditary σ-ring $\mathcal{H}$ of subsets of a non-empty set $X$ and let $\mathcal{S}$ be the σ-ring of all $\mu^*$-measurable subsets. For $E \in \mathcal{S}$, define $\bar{\mu}(E) = \mu^*(E)$. Then $\bar{\mu}$ is a complete measure on $\mathcal{S}$.

Proof: The fact that $\bar{\mu}$ is a measure on $\mathcal{S}$ follows immediately from the preceding proposition. Now, let $\mu^*(E) = 0$ for some $E \in \mathcal{H}$. Let $A \in \mathcal{H}$. Then
$$\mu^*(A) = \mu^*(E) + \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c)$$
by monotonicity and hence (cf. Remark 1.3.1) we deduce that $E \in \mathcal{S}$. Thus $\mathcal{S}$ contains all sets of outer-measure zero. If $E \in \mathcal{S}$ is such that $\bar{\mu}(E) = 0$, then $\mu^*(E) = 0$ and so $\mu^*(F) = 0$ for all subsets $F$ of $E$. Thus every subset $F$ of $E$ is in $\mathcal{S}$, which completes the proof.

We say that the measure $\bar{\mu}$ is the measure induced by the outer-measure $\mu^*$.

Now, let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Then we can define the hereditary σ-ring $\mathcal{H}(\mathcal{R})$, generated by $\mathcal{R}$, and also define the outer-measure $\mu^*$ on $\mathcal{H}(\mathcal{R})$, as described in Proposition 1.3.1. This outer-measure will now induce a complete measure $\bar{\mu}$ on $\mathcal{S}$, the σ-ring of all $\mu^*$-measurable sets.

Proposition 1.3.4 Let $\mathcal{R}$ be a ring of subsets of a non-empty set $X$ and let $\mu$ be a measure on $\mathcal{R}$. Let $\mu^*$ be the induced outer-measure.
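Then $\mathcal{S}(\mathcal{R}) \subset \mathcal{S}$, where $\mathcal{S}(\mathcal{R})$ is the σ-ring generated by $\mathcal{R}$ and $\mathcal{S}$ is the σ-ring of all $\mu^*$-measurable subsets. In particular, the induced measure $\bar{\mu}$ is an extension of the measure $\mu$ to the σ-rings $\mathcal{S}(\mathcal{R})$ and $\mathcal{S}$, and it is complete on the latter.

Proof: Let $E \in \mathcal{R}$ and let $A \in \mathcal{H}(\mathcal{R})$. Assume that $\mu^*(A) < +\infty$. Then, given $\varepsilon > 0$, there exists a sequence of sets $\{E_i\}_{i=1}^\infty$ in $\mathcal{R}$ such that $A \subset \cup_{i=1}^\infty E_i$ and such that
$$\sum_{i=1}^\infty \mu(E_i) < \mu^*(A) + \varepsilon.$$
But $\mu$ is a measure on $\mathcal{R}$ and so
$$\sum_{i=1}^\infty \mu(E_i) = \sum_{i=1}^\infty \left( \mu(E_i \cap E) + \mu(E_i \cap E^c) \right).$$
Now, $\mu^*$ extends $\mu$ and the sets $E_i \cap E$ and $E_i \cap E^c = E_i \setminus E$ cover $A \cap E$ and $A \cap E^c$, respectively, and so, by the subadditivity of the outer-measure, we get
$$\mu^*(A) + \varepsilon > \mu^*(A \cap E) + \mu^*(A \cap E^c).$$
Since $\varepsilon > 0$ has been arbitrarily chosen, we deduce that
$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$
This inequality is trivially true if $\mu^*(A) = +\infty$. Thus we have shown that $\mathcal{R} \subset \mathcal{S}$ (cf. Remark 1.3.1). Since $\mathcal{S}$ is a σ-ring containing $\mathcal{R}$, it also contains $\mathcal{S}(\mathcal{R})$.

Remark 1.3.2 If $\mu$ on $\mathcal{R}$ is σ-finite, then we saw that $\mu^*$ on $\mathcal{H}(\mathcal{R})$ is σ-finite as well, and that, in fact, each set in $\mathcal{H}(\mathcal{R})$ can be covered by countably many sets from $\mathcal{R}$ with finite measure. In particular, it follows that $\bar{\mu}$ on $\mathcal{S}(\mathcal{R})$ and on $\mathcal{S}$ are σ-finite measures.

Proposition 1.3.5 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$ and let $\bar{\mu}$ be its extension to $\mathcal{S}(\mathcal{R})$ and $\mathcal{S}$ as described above. Let $E \in \mathcal{H}(\mathcal{R})$. Then
$$\mu^*(E) = \inf\{\bar{\mu}(F) \mid F \in \mathcal{S},\ E \subset F\} = \inf\{\bar{\mu}(F) \mid F \in \mathcal{S}(\mathcal{R}),\ E \subset F\}.$$

Proof: The proof follows from the following chain of inequalities.
$$\begin{aligned} \mu^*(E) &= \inf\left\{ \textstyle\sum_{i=1}^\infty \mu(E_i) \mid E \subset \cup_{i=1}^\infty E_i,\ E_i \in \mathcal{R} \right\} \\ &= \inf\left\{ \textstyle\sum_{i=1}^\infty \bar{\mu}(E_i) \mid E \subset \cup_{i=1}^\infty E_i,\ E_i \in \mathcal{R} \right\} \\ &\ge \inf\left\{ \textstyle\sum_{i=1}^\infty \bar{\mu}(E_i) \mid E \subset \cup_{i=1}^\infty E_i,\ E_i \in \mathcal{S}(\mathcal{R}) \right\} \\ &\ge \inf\left\{ \bar{\mu}(\cup_{i=1}^\infty E_i) \mid E \subset \cup_{i=1}^\infty E_i,\ E_i \in \mathcal{S}(\mathcal{R}) \right\} \\ &\ge \inf\{ \bar{\mu}(F) \mid E \subset F,\ F \in \mathcal{S}(\mathcal{R}) \} \\ &\ge \inf\{ \bar{\mu}(F) \mid E \subset F,\ F \in \mathcal{S} \} \\ &= \inf\{ \mu^*(F) \mid E \subset F,\ F \in \mathcal{S} \} \\ &\ge \mu^*(E). \end{aligned}$$

Remark 1.3.3 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$ and let $\bar{\mu}$ be its extension, as described in this section, to $\mathcal{S}(\mathcal{R})$.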
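Starting from this, one could again try to define an outer-measure, say $\bar{\mu}^*$, on $\mathcal{H}(\mathcal{R})$ and try to extend it further. The proof of the preceding proposition shows that $\bar{\mu}^* = \mu^*$ and so the σ-ring of measurable sets will still be $\mathcal{S}$ and so the induced measure will also only be $\bar{\mu}$.

Definition 1.3.5 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$, with induced measure $\bar{\mu}$. Let $E \in \mathcal{H}(\mathcal{R})$ and let $F \in \mathcal{S}(\mathcal{R})$. We say that $F$ is a measurable cover of $E$ if $E \subset F$ and for all $G \in \mathcal{S}(\mathcal{R})$ such that $G \subset F \setminus E$, we have $\bar{\mu}(G) = 0$.

Proposition 1.3.6 Let $\mu$ be a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $E \in \mathcal{H}(\mathcal{R})$ be such that $\mu^*(E) < +\infty$. Then, there exists a measurable cover $F$ of $E$ such that $\bar{\mu}(F) = \mu^*(E)$.

Proof: By Proposition 1.3.5, for every positive integer $n$, there exists $F_n \in \mathcal{S}(\mathcal{R})$ such that $E \subset F_n$ and
$$\mu^*(F_n) < \mu^*(E) + \frac{1}{n}.$$
Set $F = \cap_{n=1}^\infty F_n$. Then $F \in \mathcal{S}(\mathcal{R})$ and $E \subset F$. Thus,
$$\mu^*(E) \le \mu^*(F) \le \mu^*(F_n) < \mu^*(E) + \frac{1}{n}.$$
Since this is true for all $n$, we deduce that $\mu^*(E) = \mu^*(F) = \bar{\mu}(F)$. Let $G \in \mathcal{S}(\mathcal{R})$ be such that $G \subset F \setminus E$. Then $E \subset F \setminus G$ and $\bar{\mu}(G)$ is finite. Thus,
$$\bar{\mu}(F) = \mu^*(E) \le \mu^*(F \setminus G) = \bar{\mu}(F \setminus G) = \bar{\mu}(F) - \bar{\mu}(G)$$
from which it follows that $\bar{\mu}(G) = 0$. This completes the proof.

1.4 Completion of a measure

In the previous section we started with a measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$ and extended it to a complete measure $\bar{\mu}$ on the σ-ring of all $\mu^*$-measurable sets. Given a measure on a σ-ring, it is always possible to extend it to a complete measure by another process, which we now describe.

Theorem 1.4.1 Let $\mathcal{S}$ be a σ-ring of subsets of a non-empty set $X$ and let $\mu$ be a measure on $\mathcal{S}$. Let
$$\widetilde{\mathcal{S}} = \{E \Delta N \mid E \in \mathcal{S},\ N \subset A,\ A \in \mathcal{S},\ \mu(A) = 0\}.$$
Then, $\widetilde{\mathcal{S}}$ is a σ-ring. Define $\tilde{\mu}$ on $\widetilde{\mathcal{S}}$ by $\tilde{\mu}(E \Delta N) = \mu(E)$. Then $\tilde{\mu}$ is a complete measure on $\widetilde{\mathcal{S}}$ and it extends $\mu$.

Proof: Let $E, A \in \mathcal{S}$. Let $\mu(A) = 0$ and let $N \subset A$. Then $E \setminus A \in \mathcal{S}$. Consequently,
$$\begin{aligned} E \cup N &= (E \setminus A) \Delta (A \cap (E \cup N)), \\ E \Delta N &= (E \setminus A) \cup (A \cap (E \Delta N)). \end{aligned} \tag{1.4.1}$$
Thus,
$$\widetilde{\mathcal{S}} = \{E \cup N \mid E \in \mathcal{S},\ N \subset A,\ A \in \mathcal{S},\ \mu(A) = 0\}.$$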
It is now clear that $\widetilde{\mathcal{S}}$ is closed under the formation of countable unions. Let $E_1, E_2 \in \mathcal{S}$ and let $N_1 \subset A_1$, $N_2 \subset A_2$ where $A_i \in \mathcal{S}$, $\mu(A_i) = 0$ for $i = 1, 2$. Since the symmetric difference is an associative operation (check!) which is also obviously commutative, we have that
$$(E_1 \Delta N_1) \Delta (E_2 \Delta N_2) = (E_1 \Delta E_2) \Delta (N_1 \Delta N_2)$$
with $E_1 \Delta E_2 \in \mathcal{S}$ and $N_1 \Delta N_2 \subset A_1 \cup A_2$ where $A_1 \cup A_2 \in \mathcal{S}$, $\mu(A_1 \cup A_2) = 0$. Thus $\widetilde{\mathcal{S}}$ is closed under the formation of symmetric differences as well. Thus it follows (cf. Remark 1.1.4) that $\widetilde{\mathcal{S}}$ is a σ-ring.

Again let $E_i, A_i, N_i$, $i = 1, 2$ be as above. Assume that $E_1 \Delta N_1 = E_2 \Delta N_2$. Again, by the commutativity and associativity of the symmetric difference, we easily see that $E_1 \Delta E_2 = N_1 \Delta N_2$. It then follows that $E_1 \Delta E_2 \subset A_1 \cup A_2$ and so $\mu(E_1 \Delta E_2) = 0$. We then deduce that $\mu(E_1) = \mu(E_2) = \mu(E_1 \cap E_2)$. This shows that $\tilde{\mu}$ is well-defined. Using the latter definition of $\widetilde{\mathcal{S}}$ in terms of unions rather than symmetric differences, it is very easy to check that $\tilde{\mu}$ defines a measure on $\widetilde{\mathcal{S}}$. Also it is clear that $\tilde{\mu}$ extends $\mu$.

Now let $E, A \in \mathcal{S}$ be such that $\mu(E) = \mu(A) = 0$ and let $N \subset A$. Then $\tilde{\mu}(E \cup N) = \mu(E \setminus A) = 0$ (cf. (1.4.1)). If $F \subset E \cup N$, then $F \subset E \cup A$ and $\mu(E \cup A) = 0$. Consequently, $F \in \widetilde{\mathcal{S}}$. This shows that $\tilde{\mu}$ is complete.

Now, if $\mu$ is a measure defined on a ring $\mathcal{R}$ of subsets of a non-empty set $X$, we have two complete extensions of $\mu$: first, the measure $\bar{\mu}$ defined on the σ-ring $\mathcal{S}$ of $\mu^*$-measurable sets, described in the previous section, and, second, the measure $\tilde{\mu}$ on the σ-ring $\widetilde{\mathcal{S}}$ obtained by adjoining subsets of sets of measure zero to sets in the σ-ring $\mathcal{S}(\mathcal{R})$. If $\mu$ is σ-finite, these two processes yield the same measure, as the following theorem shows.

Theorem 1.4.2 Let $\mu$ be a σ-finite measure on a ring $\mathcal{R}$ of subsets of a non-empty set $X$. Let $\bar{\mu}$ be its extension to $\mathcal{S}(\mathcal{R})$, the σ-ring generated by $\mathcal{R}$, and to $\mathcal{S}$, the set of all $\mu^*$-measurable sets.
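Let $\tilde{\mu}$ be the measure on the σ-ring $\widetilde{\mathcal{S}}$ obtained by adjoining subsets of sets of measure zero in $\mathcal{S}(\mathcal{R})$ to sets in $\mathcal{S}(\mathcal{R})$. Then $\widetilde{\mathcal{S}} = \mathcal{S}$ and $\tilde{\mu} = \bar{\mu}$.

Proof: Since $\mathcal{S}$ is a σ-ring containing $\mathcal{S}(\mathcal{R})$ and since $\bar{\mu}$ is complete, it follows that $\widetilde{\mathcal{S}} \subset \mathcal{S}$. Further, if $E, A \in \mathcal{S}(\mathcal{R})$ with $\bar{\mu}(A) = 0$ and if $N \subset A$, we have, by definition,
$$\tilde{\mu}(E \cup N) = \bar{\mu}(E) = \bar{\mu}(E \cup N)$$
since
$$\bar{\mu}(E \cup N) = \mu^*(E \cup N) \le \mu^*(E) + \mu^*(N) = \mu^*(E) = \bar{\mu}(E) \le \bar{\mu}(E \cup N).$$
The proof will, therefore, be complete if we show that $\mathcal{S} \subset \widetilde{\mathcal{S}}$. Let $E \in \mathcal{S}$ be such that $\bar{\mu}(E) = \mu^*(E) < +\infty$. Then, by Proposition 1.3.6, there exists a measurable cover $F$ of $E$ such that
$$\bar{\mu}(F) = \mu^*(F) = \mu^*(E) = \bar{\mu}(E).$$
Recall that $F \in \mathcal{S}(\mathcal{R})$ and that $E \subset F$. Then, since $\bar{\mu}$ is a measure, we have that $\mu^*(F \setminus E) = \bar{\mu}(F \setminus E) = 0$. Now, let $G \in \mathcal{S}(\mathcal{R})$ be a measurable cover of $F \setminus E$, with $\bar{\mu}(G) = 0$. Then,
$$E = (F \setminus G) \cup (E \cap G).$$
Since $F \setminus G \in \mathcal{S}(\mathcal{R})$ and since $E \cap G \subset G$ where $G \in \mathcal{S}(\mathcal{R})$ has measure zero, it follows that $E \in \widetilde{\mathcal{S}}$. Since $\mu$ is σ-finite, we have that $\bar{\mu}$ is σ-finite and so any set $E \in \mathcal{S}$ can be expressed as the countable union of sets of finite measure and so it follows that $\mathcal{S} \subset \widetilde{\mathcal{S}}$ and the proof is complete.

1.5 Exercises

1.1 Let $X = \mathbb{R}^N$, $N \ge 2$. Let
$$\mathcal{P} = \left\{ \Pi_{i=1}^N [a_i, b_i) \mid a_i, b_i \in \mathbb{R},\ a_i \le b_i,\ 1 \le i \le N \right\}.$$
Let $\mathcal{R}$ be the collection of all finite unions of members of $\mathcal{P}$. Show that $\mathcal{R}$ is a ring.

1.2 Let $X$ be an uncountable set. Let $\mathcal{R}$ be the collection of all at most countable subsets of $X$, including the empty set. Show that $\mathcal{R}$ is a ring. Is it a σ-ring? Is it a σ-algebra?

1.3 Let $\mathcal{R}$ be a ring of subsets of a non-empty set $X$. Define
$$\mathcal{S} = \{F \subset X \mid F \in \mathcal{R} \text{ or } F^c \in \mathcal{R}\}.$$
Show that $\mathcal{S}$ is the smallest algebra containing $\mathcal{R}$.

1.4 Let $X$ be a non-empty set and let $E \subset X$. Let $\mathcal{E} = \{E\}$. Compute $\mathcal{R}(\mathcal{E})$.

1.5 Let $X$ be a non-empty set and let $E \subset X$. Let $\mathcal{E} = \{F \subset X \mid E \subset F\}$. Compute $\mathcal{R}(\mathcal{E})$.

1.6 Let $X = \mathbb{N}$. Let $\mathcal{R}$ be the collection of all finite subsets (including the empty set) and their complements. Show that $\mathcal{R}$ is an algebra.

1.7 Let $\mathcal{R}$ be a ring of subsets of a non-empty set $X$.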
Let $\mu$ be a measure on $\mathcal{R}$. If $E, F \in \mathcal{R}$, show that
$$\mu(E) + \mu(F) = \mu(E \cup F) + \mu(E \cap F).$$

1.8 Let $\mathcal{R}$ be a ring of subsets of a non-empty set $X$. Let $\mu$ be a non-negative, finite and additive set function on $\mathcal{R}$. If either $\mu$ is continuous from below for every set $E \in \mathcal{R}$ or if it is continuous from above for $E = \emptyset$, show that $\mu$ is a measure on $\mathcal{R}$.

1.9 Let $X = \mathbb{N}$ and let $\mathcal{R}$ be the ring described in Exercise 1.6 above. Define, for $E \in \mathcal{R}$,
$$\mu(E) = \begin{cases} +\infty, & \text{if } E \text{ is an infinite set}, \\ 0, & \text{otherwise.} \end{cases}$$
Show that $\mu$ is continuous from above for $E = \emptyset$, but that $\mu$ is not a measure. (This shows that finiteness is essential in the previous exercise.)

1.10 Let $\mathcal{S}$ be a σ-ring of subsets of a non-empty set $X$ and let $\mu$ be a measure on $\mathcal{S}$. Let $\{E_i\}_{i=1}^\infty$ be a sequence of sets in $\mathcal{S}$. Define
$$\liminf_{n \to \infty} E_n = \cup_{n=1}^\infty \cap_{i=n}^\infty E_i, \qquad \limsup_{n \to \infty} E_n = \cap_{n=1}^\infty \cup_{i=n}^\infty E_i.$$
(a) Show that
$$\mu(\liminf_{n \to \infty} E_n) \le \liminf_{n \to \infty} \mu(E_n).$$
(b) If, for some $n \in \mathbb{N}$, we have $\mu(\cup_{i=n}^\infty E_i) < +\infty$, show that
$$\mu(\limsup_{n \to \infty} E_n) \ge \limsup_{n \to \infty} \mu(E_n).$$

1.11 Let $\mathcal{H}$ be a hereditary σ-ring of subsets of a non-empty set $X$ and let $\mu^*$ be an outer-measure on $\mathcal{H}$. If $E, F \in \mathcal{H}$, and if at least one of them is $\mu^*$-measurable, show that
$$\mu^*(E) + \mu^*(F) = \mu^*(E \cup F) + \mu^*(E \cap F).$$

1.12 Let $E \subset \mathbb{R}$. $E$ is said to have an infinite condensation point if $E$ has uncountably many points outside every finite interval. Let $\mathcal{H} = \mathcal{P}(\mathbb{R})$. Define, for $E \subset \mathbb{R}$,
$$\mu^*(E) = \begin{cases} 0, & \text{if } E \text{ is empty, finite or countable}, \\ 1, & \text{if } E \text{ is uncountable, but without an infinite condensation point}, \\ +\infty, & \text{if } E \text{ has an infinite condensation point.} \end{cases}$$
Show that (i) $\mu^*$ is a σ-finite outer-measure on $\mathcal{H}$; (ii) the only $\mu^*$-measurable sets are at most countable sets or their complements; (iii) the induced measure $\bar{\mu}$ is not σ-finite.

1.13 Let $X$ be a non-empty set and let $\mathcal{H} = \mathcal{P}(X)$. Let $\mu_i^*$, $i = 1, 2$ be two finite outer-measures on $\mathcal{H}$. Let $\mathcal{S}_i$, $i = 1, 2$ be the respective classes of measurable sets. If $\mu^* = \mu_1^* + \mu_2^*$, show that $\mu^*$ is an outer-measure and that the class of $\mu^*$-measurable sets is $\mathcal{S}_1 \cap \mathcal{S}_2$.
1.14 Let $\mu$ be a measure on a σ-ring $\mathcal{S}$ of subsets of a non-empty set $X$. Let $\bar{\mu}$ be the induced measure defined on $\overline{\mathcal{S}}$, the σ-ring of $\mu^*$-measurable sets. Let $A, B \in \mathcal{S}$ be such that $\mu(B \setminus A) = 0$. If $A \subset E \subset B$, show that $E \in \overline{\mathcal{S}}$.

1.15 Let $X$ be an uncountable set and let $\mathcal{S}$ be the collection of at most countable sets (including the empty set) and their complements. (a) Show that $\mathcal{S}$ is a σ-algebra. (b) If $\mu$ is the counting measure on $\mathcal{S}$, show that it is complete. (c) Show that every subset of $X$ is $\mu^*$-measurable. (Thus, without σ-finiteness, the completion via $\mu^*$-measurability and the completion as in Section 1.4 need not coincide.)

1.16 Let $\mathcal{R}$ be a ring of subsets of a non-empty set $X$ and let $\mu$ be a σ-finite measure on $\mathcal{R}$, with extension $\bar{\mu}$ to $\mathcal{S}(\mathcal{R})$. Show that for every set $E \in \mathcal{S}(\mathcal{R})$ with $\bar{\mu}(E) < +\infty$ and for every $\varepsilon > 0$, there exists $E_0 \in \mathcal{R}$ such that $\bar{\mu}(E \Delta E_0) \le \varepsilon$.

Chapter 2 The Lebesgue measure

2.1 Construction of the Lebesgue measure

We will now study in detail the construction and properties of the Lebesgue measure on the euclidean space $\mathbb{R}^N$. We will start with a measure on the ring which arises from the notion of the length of an interval (area or volume of a box in higher dimensions) and extend it to a complete measure on the class of measurable sets, as described in Section 1.3. To simplify the exposition we will describe in detail the construction on the real line, $\mathbb{R}$. The generalization to higher dimensions will be obvious.

Let $\mathcal{P}$ denote the class of all intervals of the form $[a, b)$, where $a \le b$, $a, b \in \mathbb{R}$. Let $\mathcal{R}$ be the ring of all finite unions of members of $\mathcal{P}$ (cf. Example 1.1.3). As observed earlier, we can express each member of $\mathcal{R}$ as a finite disjoint union of members of $\mathcal{P}$. Let us define
$$\mu([a, b)) = b - a.$$
If $a = b$, then $[a, b) = \emptyset$ and we have $\mu(\emptyset) = 0$. We will now construct a measure on $\mathcal{R}$ starting from $\mu$. The definition is almost obvious: if $E \in \mathcal{R}$ is expressed as the disjoint union of intervals, i.e.
if $E = \cup_{j=1}^k I_j$, where $I_j$, $1 \le j \le k$, are mutually disjoint members of $\mathcal{P}$, then, necessarily, we must have
$$\mu(E) = \sum_{j=1}^k \mu(I_j).$$
However, we need to check that this is well-defined and also that it satisfies the properties of a measure. In particular, we need to verify that this set function is countably additive on $\mathcal{R}$. We start by formally proving a few fairly obvious properties of $\mu$, defined on $\mathcal{P}$.

Lemma 2.1.1 (a) Let $\{E_i\}_{i=1}^n$ be a finite set of mutually disjoint intervals in $\mathcal{P}$, such that each of them is contained in $E_0 \in \mathcal{P}$. Then
$$\sum_{i=1}^n \mu(E_i) \le \mu(E_0). \tag{2.1.1}$$
(b) Let $F = [a_0, b_0]$ be a finite closed interval contained in the finite union of open intervals $U_i$, where $U_i = (a_i, b_i)$, $1 \le i \le n$. Then
$$b_0 - a_0 < \sum_{i=1}^n (b_i - a_i). \tag{2.1.2}$$

Proof: (a) Let $E_i = [a_i, b_i)$, $0 \le i \le n$. Since the $E_i$, $1 \le i \le n$, are disjoint, we have (by renumbering the sets, if necessary) that
$$a_0 \le a_1 < b_1 \le a_2 < b_2 \le \cdots < b_{i-1} \le a_i < b_i \le a_{i+1} < \cdots < b_n \le b_0.$$
Thus,
$$\begin{aligned} \sum_{i=1}^n \mu(E_i) = \sum_{i=1}^n (b_i - a_i) &\le \sum_{i=1}^n (b_i - a_i) + \sum_{i=1}^{n-1} (a_{i+1} - b_i) \\ &= b_n - a_1 \le b_0 - a_0 = \mu(E_0). \end{aligned}$$
This proves (2.1.1).

(b) We may renumber the intervals $U_i$, after getting rid of the superfluous ones, so that we have $b_i \in (a_{i+1}, b_{i+1}) = U_{i+1}$, $1 \le i \le m - 1$, where $m \le n$. Also $a_0 \in U_1$ and $b_0 \in U_m$. Then
$$b_0 - a_0 < b_m - a_1 = (b_1 - a_1) + \sum_{i=1}^{m-1} (b_{i+1} - b_i) \le \sum_{i=1}^m (b_i - a_i),$$
which proves (2.1.2), since $m \le n$.

Proposition 2.1.1 If $\{E_i\}_{i=0}^\infty$ is a sequence in $\mathcal{P}$ such that $E_0 \subset \cup_{i=1}^\infty E_i$, then,
$$\mu(E_0) \le \sum_{i=1}^\infty \mu(E_i). \tag{2.1.3}$$

Proof: The result is trivially true if $E_0 = \emptyset$. Let $E_i = [a_i, b_i)$, $0 \le i < \infty$, with $a_i < b_i$ for all $i$, and choose $\varepsilon > 0$ such that $0 < \varepsilon < b_0 - a_0$. Let $\delta > 0$ be an arbitrarily small positive quantity.
Set
$$F_0 = [a_0, b_0 - \varepsilon], \qquad U_i = \left( a_i - \frac{\delta}{2^i},\ b_i \right), \quad 1 \le i < \infty.$$
Then, $F_0 \subset E_0$ and $E_i \subset U_i$, $1 \le i < \infty$. Thus $F_0 \subset \cup_{i=1}^\infty U_i$. Since $F_0$ is compact, there exists a positive integer $n$ such that $F_0 \subset \cup_{i=1}^n U_i$. It now follows from the preceding lemma (cf. (2.1.2)) that
$$b_0 - a_0 - \varepsilon < \sum_{i=1}^n \left( b_i - a_i + \frac{\delta}{2^i} \right) \le \sum_{i=1}^\infty (b_i - a_i) + \delta.$$
In other words,
$$\mu(E_0) - \varepsilon < \sum_{i=1}^\infty \mu(E_i) + \delta.$$
The result now follows since $\varepsilon$ and $\delta$ are arbitrarily small quantities.

Proposition 2.1.2 The set function $\mu$ is countably additive on $\mathcal{P}$.

Proof: Let $E_0 = \cup_{i=1}^\infty E_i$, where $\{E_i\}_{i=1}^\infty$ is a sequence of mutually disjoint members of $\mathcal{P}$. Assume that $E_0 \in \mathcal{P}$ as well. By the preceding proposition, we have
$$\mu(E_0) \le \sum_{i=1}^\infty \mu(E_i).$$
On the other hand, for any positive integer $n$, we have by Lemma 2.1.1 (cf. (2.1.1)),
$$\sum_{i=1}^n \mu(E_i) \le \mu(E_0)$$
which yields
$$\sum_{i=1}^\infty \mu(E_i) \le \mu(E_0)$$
from which the result follows.

Theorem 2.1.1 There exists a unique finite measure $\mu$ on $\mathcal{R}$ which extends $\mu$ defined on $\mathcal{P}$.

Proof: Let $E \in \mathcal{R}$. Then $E = \cup_{i=1}^n E_i$, where $\{E_i\}_{i=1}^n$ is a collection of mutually disjoint intervals in $\mathcal{P}$. Then the only possible way we can define a measure on $\mathcal{R}$ is by setting
$$\mu(E) = \sum_{i=1}^n \mu(E_i).$$
This is obviously an extension of $\mu$ defined on $\mathcal{P}$ and so, in particular, $\mu(\emptyset) = 0$ and $\mu(E) \ge 0$ for all $E \in \mathcal{R}$. However, we need to verify that $\mu$ is well-defined. Let
$$E = \cup_{i=1}^n E_i = \cup_{j=1}^m F_j,$$
where $\{E_i\}_{i=1}^n$ and $\{F_j\}_{j=1}^m$ are two collections of mutually disjoint intervals in $\mathcal{P}$. Then, for each $1 \le i \le n$, we can write
$$E_i = \cup_{j=1}^m (E_i \cap F_j).$$
Each set $E_i \cap F_j$ is either empty, or is a non-empty interval in $\mathcal{P}$. Hence, by Proposition 2.1.2, since $\mu$ is also finitely additive in $\mathcal{P}$, we have
$$\mu(E_i) = \sum_{j=1}^m \mu(E_i \cap F_j).$$
Similarly, for each $1 \le j \le m$, we have
$$\mu(F_j) = \sum_{i=1}^n \mu(F_j \cap E_i).$$
Thus,
$$\sum_{i=1}^n \mu(E_i) = \sum_{i=1}^n \sum_{j=1}^m \mu(E_i \cap F_j) = \sum_{j=1}^m \sum_{i=1}^n \mu(E_i \cap F_j) = \sum_{j=1}^m \mu(F_j).$$
This establishes that $\mu$ is well-defined on $\mathcal{R}$.
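The well-definedness argument just given can be illustrated on a concrete set. The following is a minimal sketch in Python with exact rational arithmetic; the helper names `length` and `intersect` and the two sample decompositions of $[0, 3)$ are illustrative assumptions, not part of the text. It computes the two sums and the double sum over the intersections $E_i \cap F_j$ that links them.

```python
from fractions import Fraction


def length(interval):
    """mu([a, b)) = b - a, with [a, b) regarded as empty when a >= b."""
    a, b = interval
    return max(b - a, Fraction(0))


def intersect(first, second):
    """Intersection of two intervals [a, b), again of the form [a, b)."""
    return (max(first[0], second[0]), min(first[1], second[1]))


# Two decompositions of E = [0, 3) into pairwise disjoint members of P.
E_parts = [(Fraction(0), Fraction(1)), (Fraction(1), Fraction(3))]
F_parts = [(Fraction(0), Fraction(2)), (Fraction(2), Fraction(5, 2)),
           (Fraction(5, 2), Fraction(3))]

sum_E = sum(length(I) for I in E_parts)
sum_F = sum(length(J) for J in F_parts)
# Double sum over all intersections E_i ∩ F_j, as in the proof.
double_sum = sum(length(intersect(I, J)) for I in E_parts for J in F_parts)
```

Here `sum_E`, `sum_F` and `double_sum` all come out equal to 3, which is the chain of equalities in the proof written out for one particular pair of decompositions.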
It is also clear, from the definition of $\mu$, that it is finitely additive on $\mathcal{R}$.

We now need to show that $\mu$ is countably additive on $\mathcal{R}$. Let $E = \cup_{i=1}^\infty E_i$, where $E \in \mathcal{R}$ and $\{E_i\}_{i=1}^\infty$ is a collection of mutually disjoint sets in $\mathcal{R}$. Since $E_i \in \mathcal{R}$, we can write
$$E_i = \cup_{k=1}^{n_i} E_{ik}, \quad 1 \le i < \infty,$$
where $\{E_{ik}\}_{k=1}^{n_i}$ is a finite collection of mutually disjoint intervals in $\mathcal{P}$. Notice that, for each $1 \le i < \infty$, we have, by definition of $\mu$,
$$\mu(E_i) = \sum_{k=1}^{n_i} \mu(E_{ik}).$$

Case 1: $E \in \mathcal{P}$. In this case, since
$$E = \cup_{i=1}^\infty \cup_{k=1}^{n_i} E_{ik}$$
is a countable disjoint union within $\mathcal{P}$, we have, by Proposition 2.1.2, that
$$\mu(E) = \sum_{i=1}^\infty \sum_{k=1}^{n_i} \mu(E_{ik}) = \sum_{i=1}^\infty \mu(E_i).$$

Case 2: $E = \cup_{j=1}^n F_j$, where $F_j \in \mathcal{P}$ for each $1 \le j \le n$, and the $F_j$'s are all mutually disjoint. Then, for each $1 \le j \le n$, we have
$$F_j = \cup_{i=1}^\infty (F_j \cap E_i).$$
Thus $\{F_j \cap E_i\}_{i=1}^\infty$ is a collection of mutually disjoint sets in $\mathcal{R}$ whose union is $F_j \in \mathcal{P}$. Thus, by Case 1 above, we have
$$\mu(F_j) = \sum_{i=1}^\infty \mu(F_j \cap E_i).$$
Hence, by definition of $\mu$ on $\mathcal{R}$, we deduce that
$$\mu(E) = \sum_{j=1}^n \mu(F_j) = \sum_{j=1}^n \sum_{i=1}^\infty \mu(F_j \cap E_i) = \sum_{i=1}^\infty \sum_{j=1}^n \mu(F_j \cap E_i).$$
On the other hand, we have, for each $1 \le i < \infty$,
$$E_i = \cup_{j=1}^n (E_i \cap F_j).$$
Now, $\{E_i \cap F_j\}_{j=1}^n$ is a finite collection of mutually disjoint sets in $\mathcal{R}$ and so, by finite additivity, we have
$$\mu(E_i) = \sum_{j=1}^n \mu(E_i \cap F_j)$$
and so we deduce that
$$\mu(E) = \sum_{i=1}^\infty \mu(E_i)$$
which proves the countable additivity of $\mu$ in the general case as well. This completes the proof.

Remark 2.1.1 In the same way, when $N \ge 2$, if
$$\mathcal{P} = \left\{ \Pi_{i=1}^N [a_i, b_i) \mid a_i \le b_i,\ a_i, b_i \in \mathbb{R},\ 1 \le i \le N \right\},$$
and $\mathcal{R}$ is the ring of finite (disjoint) unions of members of $\mathcal{P}$ (cf. Exercise 1.1), we can define a unique measure on $\mathcal{R}$ which, when restricted to $\mathcal{P}$, is given by
$$\mu\left( \Pi_{i=1}^N [a_i, b_i) \right) = \Pi_{i=1}^N (b_i - a_i).$$

Let $\mathcal{B} = \mathcal{S}(\mathcal{R})$ be the σ-ring generated by $\mathcal{R}$. Sets in $\mathcal{B}$ are called Borel sets and $\mathcal{B}$ is called the Borel σ-algebra. Since $\mathbb{R} = \cup_{n \in \mathbb{Z}} [n, n + 1)$, we see that $\mathcal{B}$ is, indeed, a σ-algebra.
Thus, the hereditary σ-ring generated by $\mathcal{R}$ is clearly the power set of $\mathbb{R}$. We can now define the induced outer-measure $\mu^*$ for all subsets of $\mathbb{R}$. The collection $\mathcal{L}$ of all $\mu^*$-measurable sets is thus a σ-algebra which is called the Lebesgue σ-algebra and its members are called the Lebesgue measurable sets; the induced measure on this σ-algebra is called the Lebesgue measure on $\mathbb{R}$. It is clear that the Lebesgue measure is σ-finite and complete. Thus the Lebesgue measure is the completion of the measure induced on the Borel σ-algebra (cf. Theorem 1.4.2) by $\mu$.

Remark 2.1.2 In an identical manner, we can define the Borel and Lebesgue σ-algebras in $\mathbb{R}^N$, $N \ge 2$, starting from the class $\mathcal{P}$, the ring $\mathcal{R}$ and the measure $\mu$ defined in Remark 2.1.1. We will then get the Lebesgue measure in $\mathbb{R}^N$ which will be a σ-finite complete measure, and which is also the completion of the measure induced on the Borel σ-algebra by $\mu$.

Notation: We will denote the Lebesgue measure on $\mathbb{R}^N$ by the symbol $m_N$. In particular, the Lebesgue measure on $\mathbb{R}$ will be denoted by $m_1$. The Borel and Lebesgue σ-algebras on $\mathbb{R}^N$ will be denoted, respectively, $\mathcal{B}_N$ and $\mathcal{L}_N$.

Definition 2.1.1 Any measure defined on the Borel σ-algebra, $\mathcal{B}_N$, will be called a Borel measure on $\mathbb{R}^N$.

Proposition 2.1.3 Every countable set in $\mathbb{R}$ is a Borel set of measure zero.

Proof: Let $a \in \mathbb{R}$. Then
$$\{a\} = \cap_{n=1}^\infty \left[ a, a + \frac{1}{n} \right) \in \mathcal{B}_1.$$
Thus singletons are all in $\mathcal{B}_1$ and, consequently, any countable set is in $\mathcal{B}_1$. Further, by Proposition 1.2.5, it follows that
$$m_1(\{a\}) = \lim_{n \to \infty} m_1\left( \left[ a, a + \frac{1}{n} \right) \right) = \lim_{n \to \infty} \frac{1}{n} = 0.$$
Thus, the measure of any countable set will also be zero.

Proposition 2.1.4 The Borel σ-algebra is also the σ-algebra generated by all the open sets in $\mathbb{R}$.

Proof: Let $a, b \in \mathbb{R}$. Then $(a, b) = [a, b) \setminus \{a\} \in \mathcal{B}_1$. Since any open set can be written as the countable union of such intervals, it follows that every open set belongs to the Borel σ-algebra, $\mathcal{B}_1$. Hence, the σ-algebra generated by open sets is also contained in $\mathcal{B}_1$.
Conversely, $[a, b) = (a, b) \cup \{a\}$ and
\[ \{a\} = \cap_{n=1}^{\infty} \left( a - \frac{1}{n}, a + \frac{1}{n} \right). \]
Consequently, $\mathcal{P}$, and hence, $\mathcal{R}$ and $\mathcal{B}_1$ are all contained in the σ-algebra generated by the open sets. This completes the proof.

Remark 2.1.3 It is an easy exercise to adapt the proofs of the preceding two propositions to the case $\mathbb{R}^N$, $N \ge 2$, as well. Thus countable sets in $\mathbb{R}^N$ are Borel sets of measure zero and $\mathcal{B}_N$ is the σ-algebra generated by the open sets in $\mathbb{R}^N$ (with its usual topology).

Example 2.1.1 It follows from the above proposition that
\[ m_1((a, b)) = m_1([a, b)) = m_1([a, b]) = b - a. \]
Since $\mathbb{R}$ is the countable disjoint union of the intervals $[n, n+1)$, as $n$ varies over $\mathbb{Z}$, it follows that $m_1(\mathbb{R}) = +\infty$. Now consider any interval $[a, b)$ as the set $[a, b) \times \{0\} \subset \mathbb{R}^2$. Then
\[ [a, b) \times \{0\} = \cap_{n=1}^{\infty} \left( [a, b) \times \left[ 0, \frac{1}{n} \right) \right). \]
Then, again, it follows, from Proposition 1.2.5, that
\[ m_2([a, b) \times \{0\}) = \lim_{n \to \infty} \frac{1}{n} (b - a) = 0. \]
Since the real line, considered as a coordinate axis in $\mathbb{R}^2$, can be written as the countable disjoint union of sets of the form $[n, n+1) \times \{0\}$, it follows that $m_2(\mathbb{R} \times \{0\}) = 0$. More generally, the Lebesgue measure in $\mathbb{R}^N$ of any proper linear subspace will be zero.

Example 2.1.2 (i) Since any non-empty open set contains a non-empty open interval, it follows that the Lebesgue measure of any non-empty open set is strictly positive.

(ii) The set of rationals, $\mathbb{Q}$, being countable, has Lebesgue measure zero. Thus $\mathbb{Q}$ is an example of a dense set which has measure zero. Its complement, the set of irrationals, is a dense set of infinite measure.

(iii) If $E \subset \mathbb{R}$ is a measurable set of measure zero, then it cannot contain any non-empty open set. Thus, every non-empty open set will intersect $E^c$ and so $E^c$ will be dense.

(iv) If $K \subset \mathbb{R}$ is a compact set, then it is bounded and closed and so it has finite measure.

Example 2.1.3 The Cantor Set

Let $X = [0, 1]$. Set $X_1 = (\frac{1}{3}, \frac{2}{3})$.
Let $X_2$ be the union of the open middle-thirds of the subintervals of $X \setminus X_1$, i.e.
\[ X_2 = \left( \frac{1}{9}, \frac{2}{9} \right) \cup \left( \frac{7}{9}, \frac{8}{9} \right). \]
Now let $X_3$ be the union of the open middle-thirds of the four subintervals of $X \setminus (X_1 \cup X_2)$ and so on. Set
\[ C = X \setminus \cup_{n=1}^{\infty} X_n. \]
The set $C$ is called the Cantor set.

(i) Since each $X_n$ is open, it follows that $C$ is a closed set.

(ii) It is easy to see that
\[ m_1(X_n) = \frac{2^{n-1}}{3^n}. \]
Since all these sets $X_n$ are disjoint, we have
\[ m_1(\cup_{n=1}^{\infty} X_n) = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1 = m_1([0, 1]). \]
Consequently, $m_1(C) = 0$.

(iii) Since $C$ has measure zero, it cannot contain a non-empty open set. Since it is closed as well, $C$ is nowhere dense.

(iv) Let $x \in C$. If $(a, b)$ is any interval containing $x$, then, for $n$ sufficiently large, it must also contain a sub-interval of $X_n$. End-points of all such sub-intervals are in $C$. Thus no point of $C$ is isolated. Since $C$ is closed as well, it follows that $C$ is a perfect set and so it is uncountable (cf. Rudin [7]). Thus $C$ is an example of an uncountable set of measure zero.

(v) Another way of proving the uncountability of the Cantor set is as follows. Consider the ternary expansion of real numbers in $[0, 1]$:
\[ x = \sum_{n=1}^{\infty} a_n 3^{-n}, \]
where $a_n = 0, 1$ or $2$. To determine $a_n$, we proceed in the following manner. First we divide the interval $[0, 1]$ into three equal sub-intervals. Then $a_1 = 0, 1$ or $2$ according as $x$ falls in the first, second or third sub-interval, respectively. If it falls at one of the nodes of the partition, then the expansion of $x$ has only one term. Next, we divide each of the above subintervals into three equal parts and again $a_2 = 0, 1$ or $2$, according as $x$ falls in the first, second or third of the intervals of the block in which it falls. We proceed inductively to determine all the $a_n$. The expansion will terminate at a finite stage, if $x$ falls at a node of the partition at some level.
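The subdivision procedure just described is easy to carry out with exact rational arithmetic. The following Python lines are only an illustrative sketch: the function names `ternary_digits` and `removed_length` are ours and do not come from the text, and the input is assumed to be a rational number in $[0, 1)$. The second function adds up the lengths $2^{k-1}/3^k$ of the removed sets $X_1, \ldots, X_n$ from item (ii).

```python
from fractions import Fraction

def ternary_digits(x, depth):
    """First `depth` ternary digits a_1, a_2, ... of x in [0, 1).

    Follows the subdivision procedure: at each level the current block is
    cut into three equal parts and the digit records which part holds x.
    The list is shorter than `depth` if x lands on a node of the partition.
    """
    digits = []
    left, width = Fraction(0), Fraction(1)
    for _ in range(depth):
        if x == left:          # x is a node: the expansion terminates here
            break
        width /= 3
        a = int((x - left) // width)   # 0, 1 or 2
        digits.append(a)
        left += a * width
    return digits

def removed_length(n):
    """m_1(X_1) + ... + m_1(X_n), i.e. the sum of 2^(k-1) / 3^k for k = 1..n."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

print(ternary_digits(Fraction(1, 4), 6))   # digits of 1/4
print(ternary_digits(Fraction(1, 3), 6))   # 1/3 is a node of the first partition
print(1 - removed_length(10))              # length remaining after ten stages
```

The digits of $1/4$ produced in this way are alternately 0 and 2, and the length remaining after $n$ stages is $(2/3)^n$, which tends to zero in agreement with $m_1(C) = 0$.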
It is now clear that the Cantor set contains the set of all points in $[0, 1]$ with infinite ternary expansion such that the digits in its ternary expansion are either 0 or 2. Now a simple Cantor diagonalisation argument shows that $C$ is uncountable.

We have three distinguished collections of subsets of $\mathbb{R}^N$, viz. the Borel σ-algebra, $\mathcal{B}_N$, the Lebesgue σ-algebra, $\mathcal{L}_N$, and the power set of $\mathbb{R}^N$, $\mathcal{P}(\mathbb{R}^N)$. We have
\[ \mathcal{B}_N \subset \mathcal{L}_N \subset \mathcal{P}(\mathbb{R}^N). \]
One would, naturally, like to know if these inclusions are strict. We will show, in the sequel, that these inclusions are, indeed, strict. To start with, since the Lebesgue measure is complete, we have that every subset of $C$ is Lebesgue measurable. Since $C$ is uncountable, the cardinality of $\mathcal{L}_1$ is $2^c$, where $c$ is the cardinality of the continuum. It can be shown that the cardinality of $\mathcal{B}_1$ is just $c$. Consequently the inclusion $\mathcal{B}_1 \subset \mathcal{L}_1$ is strict. We will later define the Cantor function and use it to prove the existence of a Lebesgue measurable set which is not Borel measurable. We will also prove the existence of sets in $\mathbb{R}$ which are not Lebesgue measurable.

2.2 Approximation

In this section, we will present various results on the approximation of measurable sets by topological sets in the context of the Lebesgue measure.

By a box in $\mathbb{R}^N$, $N \ge 1$, we will mean a set of the form
\[ B = \Pi_{j=1}^{N} I_j, \]
where each set $I_j$, $1 \le j \le N$, is a finite interval in $\mathbb{R}$. If all the $I_j$ are open intervals, we will call it an open box and if they are all closed, we will call it a closed box. If all the $I_j$ are of the form $[a_j, b_j)$, where $a_j < b_j$, we will call it a half-open box. Clearly, if $B \subset \mathbb{R}^N$ is a box as above, we have
\[ m_N(B) = \Pi_{j=1}^{N} m_1(I_j). \qquad (2.2.1) \]
It is also clear that given $\varepsilon > 0$, we can always find boxes $B_1^\varepsilon$ and $B_2^\varepsilon$ which are of any kind (open, closed or half-open) such that
\[ B_1^\varepsilon \subset B \subset B_2^\varepsilon \]
and such that
\[ m_N(B \setminus B_1^\varepsilon) < \varepsilon \quad \text{and} \quad m_N(B_2^\varepsilon \setminus B) < \varepsilon. \]
Recall that the class $\mathcal{P}$ is the collection of all half-open boxes and $\mathcal{R}$ is the ring generated by $\mathcal{P}$.
The measure $\mu$ on $\mathcal{R}$ is the same as $m_N$, given by (2.2.1) on $\mathcal{P}$, and extended by finite additivity to $\mathcal{R}$. We will denote the corresponding outer-measure defined on all subsets of $\mathbb{R}^N$ by $\mu^*$. Thus $\mu^*$ restricted to $\mathcal{L}_N$ is $m_N$. Unless specified otherwise, the term measurable set will mean a Lebesgue measurable set.

Proposition 2.2.1 Let $E \subset \mathbb{R}^N$, $N \ge 1$. Then
\[ \mu^*(E) = \inf\{\mu^*(U) \mid E \subset U,\ U \text{ an open set}\}. \]

Proof: The result is obvious if $\mu^*(E) = +\infty$. So let us assume that $\mu^*(E) < +\infty$. Since $E \subset U$ implies that $\mu^*(E) \le \mu^*(U)$, it is immediate that
\[ \mu^*(E) \le \inf\{\mu^*(U) \mid E \subset U,\ U \text{ an open set}\}. \]
Let $\varepsilon > 0$. Then, by the definition of the outer-measure (cf. Proposition 1.3.1), there exist half-open boxes $B_n$ such that $E \subset \cup_{n=1}^{\infty} B_n$ and such that
\[ \sum_{n=1}^{\infty} m_N(B_n) < \mu^*(E) + \frac{\varepsilon}{2}. \]
Now, construct open boxes $\{B_n'\}_{n=1}^{\infty}$ such that $B_n \subset B_n'$ and $m_N(B_n' \setminus B_n) < \frac{\varepsilon}{2^{n+1}}$. Set $U = \cup_{n=1}^{\infty} B_n'$. Then $U$ is an open set and $E \subset U$. Further
\[ \mu^*(U) = m_N(U) \le \sum_{n=1}^{\infty} m_N(B_n) + \frac{\varepsilon}{2}. \]
Consequently, we have $E \subset U$ and $\mu^*(U) < \mu^*(E) + \varepsilon$. This completes the proof.

Proposition 2.2.2 Let $E \subset \mathbb{R}^N$. The following statements are equivalent.
(i) The set $E$ is Lebesgue measurable.
(ii) Given any $\varepsilon > 0$, there exists an open set $U$ such that $E \subset U$ and such that $\mu^*(U \setminus E) < \varepsilon$.
(iii) Given any $\varepsilon > 0$, there exists a closed set $F$ such that $F \subset E$ and such that $\mu^*(E \setminus F) < \varepsilon$.
(iv) There exists a $G_\delta$ set $G$ such that $E \subset G$ and such that $\mu^*(G \setminus E) = 0$.
(v) There exists an $F_\sigma$ set $F$ such that $F \subset E$ and such that $\mu^*(E \setminus F) = 0$.

Proof: (i) ⇒ (ii): If $\mu^*(E) < +\infty$, then, by the previous proposition, there exists an open set $U$ containing $E$ such that
\[ \mu^*(U) < \mu^*(E) + \varepsilon. \]
Since $E$ is Lebesgue measurable, we have that $\mu^* = m_N$ and since a measure is subtractive, we deduce that $\mu^*(U \setminus E) < \varepsilon$. If $\mu^*(E) = +\infty$, then since $\mu^* = m_N$ is σ-finite, we can find disjoint measurable sets $E_n$ such that each of them has finite measure and such that $E = \cup_{n=1}^{\infty} E_n$.
Then, we can find open sets $U_n$ such that $E_n \subset U_n$ and such that $\mu^*(U_n \setminus E_n) < \frac{\varepsilon}{2^n}$. Then $U = \cup_{n=1}^{\infty} U_n$ is open, contains $E$ and
\[ \mu^*(U \setminus E) \le \mu^*(\cup_{n=1}^{\infty} (U_n \setminus E_n)) \le \sum_{n=1}^{\infty} \mu^*(U_n \setminus E_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon. \]

(ii) ⇒ (iv): For each positive integer $n$, choose $U_n$ open such that $E \subset U_n$ and $\mu^*(U_n \setminus E) < \frac{1}{n}$. Set $G = \cap_{n=1}^{\infty} U_n$. Then $G$ is a $G_\delta$ set containing $E$ and, for each $n$,
\[ \mu^*(G \setminus E) \le \mu^*(U_n \setminus E) < \frac{1}{n}, \]
from which we deduce that $\mu^*(G \setminus E) = 0$.

(iv) ⇒ (i): By completeness of the Lebesgue measure, if $\mu^*(G \setminus E) = 0$, it follows that $G \setminus E$ is measurable. Since $G$ is a $G_\delta$ set, it is measurable as well. Now $E = G \setminus (G \setminus E)$ and so $E$ is measurable as well.

(i) ⇒ (iii): If $E$ is measurable, so is $E^c$. Hence, there exists an open set $U$ containing $E^c$ and such that $\mu^*(U \setminus E^c) < \varepsilon$. Then $F = U^c$ is closed and is contained in $E$. Further
\[ \mu^*(E \setminus F) = \mu^*(E \cap F^c) = \mu^*(E \cap U) = \mu^*(U \setminus E^c) < \varepsilon. \]

(iii) ⇒ (v): For every positive integer $n$, choose $F_n$, a closed set contained in $E$, such that $\mu^*(E \setminus F_n) < \frac{1}{n}$. Set $F = \cup_{n=1}^{\infty} F_n$. Then $F$ is an $F_\sigma$ set contained in $E$. Further,
\[ \mu^*(E \setminus F) \le \mu^*(E \setminus F_n) < \frac{1}{n} \]
for each $n$, from which it follows that $\mu^*(E \setminus F) = 0$.

(v) ⇒ (i): Since $\mu^*(E \setminus F) = 0$, once again, by the completeness of the Lebesgue measure, we have that $E \setminus F$ is measurable. Every $F_\sigma$ set is measurable as well. Thus $E = F \cup (E \setminus F)$ is measurable as well.

Proposition 2.2.3 Let $E \subset \mathbb{R}^N$ be a measurable set of finite measure. Given $\varepsilon > 0$, there exists a compact set $K \subset E$ such that $m_N(E \setminus K) < \varepsilon$.

Proof: Step 1: Let $\eta > 0$ be an arbitrary positive number. By Proposition 2.2.2, there exists an open set $V$ such that $E \subset V$ and $m_N(V \setminus E) < \eta$. Let $B(0; r)$ denote the open ball in $\mathbb{R}^N$ with centre at 0 and of radius $r > 0$; let the corresponding closed ball be denoted by $\overline{B}(0; r)$. If $n$ is a positive integer, set
\[ V_n = B(0; n) \cap V. \]
Then the sequence of open sets $\{V_n\}_{n=1}^{\infty}$ increases to $V$ and since $V$ also has finite measure, there exists a positive integer $m$ such that
\[ m_N(V \setminus V_m) < \eta. \]
Then $E \setminus V_m \subset V \setminus V_m$ and so $m_N(E \setminus V_m) < \eta$.
Step 2: Now, $V_m$ is a bounded open set and again, by Proposition 2.2.2, there exists a closed set $F \subset V_m$ such that $m_N(V_m \setminus F) < \eta$. Since $F$ is bounded and closed, it is compact.

Step 3: Thus, for any $\varepsilon > 0$, and a set $E \subset \mathbb{R}^N$ of finite measure, we can find a bounded open set $W$ such that $m_N(E \setminus W) < \frac{\varepsilon}{3}$. Further, by Step 2, there exists a compact set $K_1 \subset W$ such that $m_N(W \setminus K_1) < \frac{\varepsilon}{3}$. Finally, once again by Proposition 2.2.2, there exists a closed set $F_1 \subset E$ such that $m_N(E \setminus F_1) < \frac{\varepsilon}{3}$. Then $K = F_1 \cap K_1$ is a compact set contained in $E$ and
\[ E \setminus K = (E \setminus W) \cup ((E \cap W) \setminus F_1) \cup ((W \cap F_1) \setminus K_1) \subset (E \setminus W) \cup (E \setminus F_1) \cup (W \setminus K_1). \]
Thus, $m_N(E \setminus K) < \varepsilon$. This completes the proof.

Remark 2.2.1 Let $E \subset \mathbb{R}^N$ be a measurable set. Then, by Proposition 2.2.1, we have
\[ m_N(E) = \inf\{m_N(U) \mid E \subset U,\ U \text{ is an open set}\}. \qquad (2.2.2) \]
If $m_N(E) < +\infty$, then, by Proposition 2.2.3, we have
\[ m_N(E) = \sup\{m_N(K) \mid K \subset E,\ K \text{ is a compact set}\}. \qquad (2.2.3) \]
Any Borel measure $\mu$ (cf. Definition 2.1.1) satisfying (2.2.2) with $\mu$ in the place of $m_N$ is called outer-regular. If it satisfies (2.2.3) with $\mu$ in place of $m_N$, when $\mu(E) < +\infty$, it is said to be inner-regular. If both are valid, we say that the measure is regular. Thus, the Lebesgue measure is a regular measure.

Definition 2.2.1 Given any set $X$ and a subset $A$ thereof, the characteristic function of $A$ is the function $\chi_A : X \to \mathbb{R}$ defined by
\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]

Definition 2.2.2 Let $\Omega \subset \mathbb{R}^N$ be an open set. A step function defined on $\Omega$ is a function $f$ of the form
\[ f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}, \]
where the $\alpha_j$, $1 \le j \le k$, are constants and the sets $I_j$, $1 \le j \le k$, are boxes contained in $\Omega$.

Proposition 2.2.4 Let $I \subset \mathbb{R}^N$ be a box. Let $\varepsilon > 0$ be an arbitrary positive number. Then, there exists a function $\varphi \in C_c(\mathbb{R}^N)$, the space of continuous real-valued functions with compact support, such that $0 \le \varphi(x) \le 1$ for all $x$ and such that
\[ m_N(\{x \in \mathbb{R}^N \mid \varphi(x) \ne \chi_I(x)\}) < \varepsilon. \]
Further, the support of $\varphi$ will be contained in $I$.
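In dimension one, and for a box of the form $I = [a, b)$ with $a < b$, a function with the properties asserted in Proposition 2.2.4 can be written down explicitly as a trapezoid. The sketch below is only an illustration under these assumptions; the names `make_phi` and `delta` are ours and do not appear in the text. The function equals 1 on $[a + \delta, b - \delta]$, vanishes outside $(a + \delta/2, b - \delta/2)$ and is linear in between, so that it differs from $\chi_I$ only on a set of measure at most $2\delta < \varepsilon$.

```python
def make_phi(a, b, eps):
    """Trapezoidal function approximating the indicator of [a, b), a < b.

    Returns the function phi together with 2 * delta, an upper bound for
    the measure of the set where phi differs from the indicator.
    """
    delta = min(eps / 4, (b - a) / 4)

    def phi(x):
        if x <= a + delta / 2 or x >= b - delta / 2:
            return 0.0                      # outside the support
        if a + delta <= x <= b - delta:
            return 1.0                      # plateau, where phi agrees with chi_I
        if x < a + delta:
            return (x - (a + delta / 2)) / (delta / 2)   # rising edge
        return ((b - delta / 2) - x) / (delta / 2)       # falling edge

    return phi, 2 * delta

phi, bound = make_phi(0.0, 1.0, 0.1)
print(bound)                                # bound on the measure of {phi != chi_I}
print([phi(x) for x in (-1.0, 0.0, 0.02, 0.5, 1.0)])
```

The support of this function is the closed interval $[a + \delta/2, b - \delta/2]$, which is compact and contained in $I$.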
Proof: We can find a closed box $J_1$ and an open box $J_2$ such that
\[ J_1 \subset J_2 \subset \overline{J_2} \subset I \]
and such that $m_N(I \setminus J_1) < \varepsilon$. By Urysohn's lemma, there exists a continuous function $\varphi$ such that $0 \le \varphi(x) \le 1$ for all $x$ and such that $\varphi(x) = 1$ for all $x \in J_1$ and $\varphi(x) = 0$ for all $x \notin J_2$. Then, the support of $\varphi$ is contained in $\overline{J_2} \subset I$, and so it is compact. Thus $\varphi \in C_c(\mathbb{R}^N)$. Now,
\[ \{x \in \mathbb{R}^N \mid \varphi(x) \ne \chi_I(x)\} \subset I \setminus J_1, \]
from which the result follows immediately.

Corollary 2.2.1 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $f : \Omega \to \mathbb{R}$ be a step function. Let $\varepsilon > 0$ be an arbitrary positive number. Then there exists $\varphi \in C_c(\Omega)$ such that
\[ m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \varepsilon, \]
and such that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{x \in \Omega} |f(x)|. \]

Proof: Let $f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}$ be a step function. Without loss of generality, we can assume that the boxes are disjoint. By the preceding proposition, there exist functions $\varphi_j \in C_c(\mathbb{R}^N)$, $1 \le j \le k$, such that, for each such $j$, we have $0 \le \varphi_j \le 1$, the support of $\varphi_j$ is contained in the box $I_j$, and
\[ m_N(\{x \in \mathbb{R}^N \mid \varphi_j(x) \ne \chi_{I_j}(x)\}) < \frac{\varepsilon}{k}. \]
Set $\varphi = \sum_{j=1}^{k} \alpha_j \varphi_j$. Then
\[ \{x \in \Omega \mid \varphi(x) \ne f(x)\} \subset \cup_{j=1}^{k} \{x \in \Omega \mid \varphi_j(x) \ne \chi_{I_j}(x)\} \subset \cup_{j=1}^{k} \{x \in \mathbb{R}^N \mid \varphi_j(x) \ne \chi_{I_j}(x)\} \]
and so
\[ m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \varepsilon. \]
Since the supports of the $\varphi_j$, $1 \le j \le k$, are all disjoint, it follows that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{1 \le j \le k} |\alpha_j| = \max_{x \in \Omega} |f(x)|. \]
Finally, the function $\varphi$ has compact support contained in $\cup_{j=1}^{k} I_j \subset \Omega$. Thus $\varphi \in C_c(\Omega)$. This completes the proof.

We conclude this section with one more approximation result. In order to prove it, we need a topological result.

Lemma 2.2.1 Every open set in $\mathbb{R}^N$ can be written as a countable disjoint union of half-open boxes.

Proof: For a fixed positive integer $n$, let $\mathcal{F}_n$ denote the set of all points in $\mathbb{R}^N$ whose coordinates are all integral multiples of $2^{-n}$. Let $\mathcal{G}_n$ denote the collection of all half-open boxes with each edge of length $2^{-n}$ and with vertices at the points of $\mathcal{F}_n$.
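The collection of half-open boxes of edge $2^{-n}$ just introduced can be made concrete: the box containing a given point has, as its lower corner, the point obtained by rounding each coordinate down to an integral multiple of $2^{-n}$. The following sketch illustrates this with exact rational arithmetic; the function names `dyadic_box` and `contains` are hypothetical and serve only this illustration.

```python
import math
from fractions import Fraction

def dyadic_box(point, n):
    """Lower corner and edge length of the box of edge 2^-n containing `point`.

    The box is the product of the intervals [c_i, c_i + 2^-n), where each
    c_i is an integral multiple of 2^-n.
    """
    edge = Fraction(1, 2 ** n)
    corner = tuple(math.floor(Fraction(c) / edge) * edge for c in point)
    return corner, edge

def contains(corner, edge, point):
    """True if `point` lies in the half-open box with this corner and edge."""
    return all(c <= p < c + edge for c, p in zip(corner, point))

x = (Fraction(5, 8), Fraction(-1, 8))
for n in (1, 2, 3):
    corner, edge = dyadic_box(x, n)
    print(n, corner, edge, contains(corner, edge, x))
```

For the sample point, the boxes obtained for $n = 1, 2, 3$ are nested, each one lying inside the preceding one.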
The following conclusions are obvious:

• For a fixed positive integer $n$, each point $x \in \mathbb{R}^N$ belongs to exactly one box in $\mathcal{G}_n$.

• Let $n > m$. If $Q \in \mathcal{G}_m$ and if $Q' \in \mathcal{G}_n$, then either $Q' \subset Q$ or $Q \cap Q' = \emptyset$.

Let $\Omega \subset \mathbb{R}^N$ be an open set. Let $x \in \Omega$. Then $x$ lies in an open ball contained in $\Omega$ and so, for sufficiently large $n$, we can find a box $Q \in \mathcal{G}_n$ such that $x \in Q \subset \Omega$. In other words, $\Omega$ is the union of all boxes contained within it and belonging to the collection $\mathcal{G}_n$, for some $n$. This collection of boxes is clearly countable but may not be disjoint. Now choose all those boxes in this collection which belong to $\mathcal{G}_1$ and discard those of $\mathcal{G}_k$, $k \ge 2$, which are contained inside these selected boxes. From the remaining collection of boxes in $\Omega$, select those in $\mathcal{G}_2$ and discard those which are in $\mathcal{G}_k$, $k \ge 3$, and contained within these selected boxes. Proceeding iteratively like this, we can express $\Omega$ as the countable disjoint union of (half-open) boxes, as is obvious from the two observations made above.

Proposition 2.2.5 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $E \subset \Omega$ be a measurable set with finite measure. Then, for any $\varepsilon > 0$, there exists a set $F$, which is a finite disjoint union of boxes, such that
\[ m_N(E \Delta F) < \varepsilon. \]

Proof: Let $G \subset \mathbb{R}^N$ be an open set such that $E \subset G$ and such that $m_N(G \setminus E) < \frac{\varepsilon}{2}$ (cf. Proposition 2.2.2). Set $G' = \Omega \cap G$. Then $G'$ is also open, $E \subset G' \subset \Omega$ and $m_N(G' \setminus E) < \frac{\varepsilon}{2}$. Now, $G'$ can be written as the countable disjoint union of half-open boxes, say, $\{I_j\}_{j=1}^{\infty}$, and so, for each $j$, we have $I_j \subset \Omega$. Since $E$ has finite measure, so have $G$ and $G'$. Thus
\[ \sum_{j=1}^{\infty} m_N(I_j) < +\infty. \]
Choose a positive integer $k$ such that
\[ \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
Set $F = \cup_{j=1}^{k} I_j$, which is a finite disjoint union of boxes. Since $F \subset G'$, we have
\[ m_N(F \setminus E) \le m_N(G' \setminus E) < \frac{\varepsilon}{2} \]
and
\[ m_N(E \setminus F) \le m_N(G' \setminus F) \le \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
This completes the proof.

2.3 Translation invariance

We will now study a very important property of the Lebesgue measure.
Lemma 2.3.1 Let $\Omega$ and $\Omega'$ be open sets in $\mathbb{R}^N$. Let $T : \Omega \to \Omega'$ be a bijection which is a homeomorphism. Then $E \subset \Omega$ is a Borel set if, and only if, $T(E)$ is a Borel set.

Proof: Set
\[ \mathcal{S} = \{E \subset \Omega \mid T(E) \text{ is a Borel set}\}. \]
Clearly $\Omega$ and $\emptyset$ are in $\mathcal{S}$. Also, if $E \subset \Omega$, $T(E^c) = (T(E))^c$ and if $\{E_i\}_{i=1}^{\infty}$ is a sequence of subsets of $\Omega$, we have that $T(\cup_{i=1}^{\infty} E_i) = \cup_{i=1}^{\infty} T(E_i)$. It follows from these observations that $\mathcal{S}$ is closed under the formation of countable unions and under complementation. Thus, $\mathcal{S}$ is a σ-algebra on $\Omega$. Since open sets get mapped onto open sets, we get that all open sets are in $\mathcal{S}$. Then it follows that all Borel sets are in $\mathcal{S}$ as well. The converse follows by applying this reasoning to the map $T^{-1}$.

As a special case, let us consider $\Omega = \Omega' = \mathbb{R}^N$ and let $Tx = x + x_0$, where $x_0$ is a fixed point in $\mathbb{R}^N$. Thus $E$ is a Borel set if, and only if, $T(E)$ is a Borel set. If $I$ is a half-open box, then $T(I) = I + x_0$ is also a half-open box and both of them have the same measure. It is now immediate to see from the definition of the induced outer-measure, $\mu^*$, that
\[ \mu^*(T(E)) = \mu^*(E), \]
for any set $E \subset \mathbb{R}^N$. The same is obviously true for $T^{-1}$ as well. Now let $E$ be a Lebesgue measurable subset of $\mathbb{R}^N$. Let $A \subset \mathbb{R}^N$. Then
\[ \begin{aligned} \mu^*(A \cap T(E)) + \mu^*(A \cap (T(E))^c) &= \mu^*(T(T^{-1}(A) \cap E)) + \mu^*(T(T^{-1}(A) \cap E^c)) \\ &= \mu^*(T^{-1}(A) \cap E) + \mu^*(T^{-1}(A) \cap E^c) \\ &= \mu^*(T^{-1}(A)) \\ &= \mu^*(A). \end{aligned} \]
This shows that $T(E)$ is Lebesgue measurable. Applying this to $T^{-1}$, we deduce the following result.

Theorem 2.3.1 Let $x_0 \in \mathbb{R}^N$ be a fixed point. Let $T(x) = x + x_0$, $x \in \mathbb{R}^N$. Then $E \subset \mathbb{R}^N$ is Lebesgue measurable if, and only if, $T(E)$ is Lebesgue measurable, and in this case,
\[ m_N(T(E)) = m_N(E). \qquad (2.3.1) \]

Definition 2.3.1 Let $\mu$ be a Borel measure defined on $\mathbb{R}^N$. We say that it is translation invariant if for every Borel set $E$, and for every mapping $T : \mathbb{R}^N \to \mathbb{R}^N$ such that $T(x) = x + x_0$ for some $x_0 \in \mathbb{R}^N$, we have that
\[ \mu(T(E)) = \mu(E). \]
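For sets which are finite unions of half-open intervals in $\mathbb{R}$, the identity (2.3.1) can be checked by direct computation. The sketch below is an illustration only: the helpers `length_of_union` and `translate` are names of our own choosing, and the intervals are allowed to overlap, in which case the overlap is counted once.

```python
from fractions import Fraction

def length_of_union(intervals):
    """Total length of a finite union of half-open intervals [a, b).

    Overlapping pieces are counted once; pairs with b <= a are empty.
    """
    total = Fraction(0)
    current_end = None                    # right end of the part covered so far
    for a, b in sorted(intervals):
        if b <= a:
            continue
        if current_end is None or a > current_end:
            total += b - a                # a new, separate piece
            current_end = b
        elif b > current_end:
            total += b - current_end      # extends the current piece
            current_end = b
    return total

def translate(intervals, x0):
    """Image of the union under the map x -> x + x0."""
    return [(a + x0, b + x0) for a, b in intervals]

E = [(Fraction(0), Fraction(1)), (Fraction(1, 2), Fraction(2)), (Fraction(3), Fraction(4))]
print(length_of_union(E))
print(length_of_union(translate(E, Fraction(7, 3))))
```

Both computed lengths coincide, as they must, since translating each interval does not change its length or the way the intervals overlap.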
Thus, the Lebesgue measure is translation invariant. In fact the properties of outer-regularity, translation invariance and finiteness of the measure for compact sets characterize the Lebesgue measure, as the following theorem shows.

Theorem 2.3.2 Let $\nu$ be a Borel measure on $\mathbb{R}^N$ such that
(i) $\nu(K) < +\infty$ for every compact set $K \subset \mathbb{R}^N$;
(ii) $\nu(E) = \inf\{\nu(V) \mid E \subset V,\ V \text{ an open set}\}$, for every Borel set $E \subset \mathbb{R}^N$; and
(iii) it is translation invariant.
Then, there exists a constant $c > 0$ such that $\nu(E) = c\, m_N(E)$ for every Borel set $E \subset \mathbb{R}^N$.

Proof: Let $Q = [0, 1) \times [0, 1) \times \cdots \times [0, 1)$ ($N$ times). Then $m_N(Q) = 1$. Let $n \ge 2$ be an arbitrary positive integer. $Q$ can be written as the disjoint union of $2^{Nn}$ boxes in the collection $\mathcal{G}_n$ described in the proof of Lemma 2.2.1. Assume that $\nu(Q) = c > 0$. Since $\nu$ is translation invariant, all the $2^{Nn}$ boxes of side $2^{-n}$ which make up $Q$ will have the same measure. Let $\widetilde{Q}$ be one such box. Then
\[ 2^{Nn} \nu(\widetilde{Q}) = \nu(Q) = c = c\, m_N(Q) = c\, 2^{Nn} m_N(\widetilde{Q}). \]
Thus, $\nu(\widetilde{Q}) = c\, m_N(\widetilde{Q})$ as well. Since any open set can be written as the countable disjoint union of such boxes (cf. Lemma 2.2.1), it follows that if $V$ is any open set, then $\nu(V) = c\, m_N(V)$. Then, for any Borel set $E$, it follows, from condition (ii) in the statement of this theorem, that $\nu(E) = c\, m_N(E)$.

As an application of this result, we have the following theorem.

Theorem 2.3.3 Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear transformation. Let $E$ be any Borel set in $\mathbb{R}^N$. Then
\[ m_N(A(E)) = |\det(A)|\, m_N(E). \qquad (2.3.2) \]

Proof: Step 1: Let $A$ be singular. Then $\det(A) = 0$. Also, the range of $A$ will be a proper subspace of $\mathbb{R}^N$. Then, if $E$ is a Borel set, $A(E)$ is contained in a proper subspace of $\mathbb{R}^N$ and hence will have measure zero (cf. Example 2.1.1). Thus (2.3.2) is valid in this case.

Step 2: Let $A$ be non-singular. Then, by Lemma 2.3.1, $A(E)$ is Borel measurable whenever $E$ is so. Define
\[ \nu(E) = m_N(A(E)). \]
It is easy to see that $\nu$ is a Borel measure. If $K \subset \mathbb{R}^N$ is compact, then so is $A(K)$.
Thus, $\nu$ takes only finite values on compact sets. If $E \subset V$, where $V$ is an open set, then $A(V)$ is an open set and $A(E) \subset A(V)$ and vice-versa. Thus
\[ \begin{aligned} \inf\{\nu(V) \mid E \subset V,\ V \text{ an open set}\} &= \inf\{m_N(A(V)) \mid E \subset V,\ V \text{ an open set}\} \\ &= \inf\{m_N(U) \mid A(E) \subset U,\ U \text{ an open set}\} \\ &= m_N(A(E)) = \nu(E). \end{aligned} \]
Finally,
\[ \nu(E + x_0) = m_N(A(E + x_0)) = m_N(A(E) + Ax_0) = m_N(A(E)) = \nu(E). \]
Thus, $\nu$ is translation invariant. Thus, by the preceding theorem, it follows that there exists a constant $c_A$ such that
\[ m_N(A(E)) = \nu(E) = c_A\, m_N(E) \]
for every Borel set $E \subset \mathbb{R}^N$.

Step 3: It is easy to see that if $A$ and $B$ are two non-singular linear transformations of $\mathbb{R}^N$ onto itself, then
\[ c_{AB} = c_{BA} = c_A c_B. \]

Step 4: Let $A$ be an orthogonal transformation. Then, if $E$ is the unit ball in $\mathbb{R}^N$, we have that $A(E) = E$. It follows from this that $c_A = 1 = |\det(A)|$ whenever $A$ is orthogonal.

Step 5: Let $A$ be represented by a diagonal matrix $\mathrm{diag}(\lambda_1, \cdots, \lambda_N)$, where $\lambda_i > 0$, $1 \le i \le N$. Let $E = [0, 1] \times \cdots \times [0, 1]$ ($N$ times). Then $A(E) = \Pi_{i=1}^{N} [0, \lambda_i]$. Thus
\[ m_N(A(E)) = \Pi_{i=1}^{N} \lambda_i. \]
Once again, in this case, we have $c_A = \det(A) = |\det(A)|$.

Step 6: Given any non-singular matrix $A$, we can decompose it as $A = RQ$, where $R$ is a positive definite matrix and $Q$ is orthogonal. (It suffices to find a positive definite matrix $R$ such that $R^2 = AA^T$ and set $Q = R^{-1}A$.) The positive definite matrix $R$ can, in turn, be decomposed as $R = PDP^T$, where $P$ is orthogonal and $D$ is diagonal with positive diagonal entries. Notice then that $\det(D) = |\det(A)|$. The result now follows from the observations made in Steps 3 to 5.

2.4 Non-measurable sets

We will now prove the existence of a subset of $\mathbb{R}$ which is not Lebesgue measurable. The construction is essentially a consequence of the translation invariance of the Lebesgue measure and also uses the axiom of choice. We will follow the treatment given in Royden [6].

Let $x, y \in [0, 1)$. Define
\[ x \overset{\circ}{+} y = \begin{cases} x + y, & \text{if } x + y < 1, \\ x + y - 1, & \text{if } x + y \ge 1. \end{cases} \]
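The operation just defined is addition modulo 1 on $[0, 1)$. A minimal sketch, with the hypothetical name `circ_add` and exact rational inputs, checks on a few sample points that the result stays in $[0, 1)$, that the operation is commutative and that 0 is a neutral element.

```python
from fractions import Fraction

def circ_add(x, y):
    """Sum modulo 1 of x and y, both assumed to lie in [0, 1)."""
    s = x + y
    return s if s < 1 else s - 1

samples = [Fraction(k, 12) for k in range(12)]
for x in samples:
    assert circ_add(x, Fraction(0)) == x          # 0 is neutral
    for y in samples:
        assert 0 <= circ_add(x, y) < 1            # the result stays in [0, 1)
        assert circ_add(x, y) == circ_add(y, x)   # commutativity

print(circ_add(Fraction(3, 4), Fraction(1, 2)))   # prints 1/4
```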
If $E$ is a subset of $[0, 1)$ and if $y \in [0, 1)$, we set
\[ E \overset{\circ}{+} y = \{x \overset{\circ}{+} y \mid x \in E\}. \]

Lemma 2.4.1 Let $E \subset [0, 1)$ and let $y \in [0, 1)$. If $E$ is measurable, then, so is $E \overset{\circ}{+} y$ and
\[ m_1(E \overset{\circ}{+} y) = m_1(E). \]

Proof: Set $E_1 = E \cap [0, 1 - y)$ and $E_2 = E \cap [1 - y, 1)$. Then $E_1$ and $E_2$ are measurable and are disjoint. Thus, $m_1(E) = m_1(E_1) + m_1(E_2)$. By definition, we clearly have $E_1 \overset{\circ}{+} y = E_1 + y$ and $E_2 \overset{\circ}{+} y = E_2 + (y - 1)$. Thus, $E_i \overset{\circ}{+} y$, $i = 1, 2$, are measurable, and by the translation invariance of the Lebesgue measure, we have
\[ m_1(E_i \overset{\circ}{+} y) = m_1(E_i), \quad i = 1, 2. \]
Further, the sets $E_i \overset{\circ}{+} y$, $i = 1, 2$, are disjoint. (If not, we will have $a, b \in [0, 1)$ such that $a + y = b + y - 1$ which implies that $b - a = 1$, which is impossible.) Since $E \overset{\circ}{+} y$ is the disjoint union of $E_1 \overset{\circ}{+} y$ and $E_2 \overset{\circ}{+} y$, the result follows immediately.

Let $x, y \in [0, 1)$. We say that $x \sim y$ if $x - y$ is rational. It is easy to verify that this defines an equivalence relation and hence partitions $[0, 1)$ into equivalence classes. Let $P$ be a set containing exactly one element from each equivalence class (axiom of choice!).

Proposition 2.4.1 The set $P \subset [0, 1)$ defined above is not Lebesgue measurable.

Proof: Let $\{r_i\}_{i=0}^{\infty}$ be an enumeration of the rationals in $[0, 1)$ with $r_0 = 0$. Set $P_i = P \overset{\circ}{+} r_i$, $0 \le i < \infty$. Thus, $P_0 = P$. If $x \in P_i \cap P_j$, where $i \ne j$, then $x = p_1 \overset{\circ}{+} r_i = p_2 \overset{\circ}{+} r_j$, where $p_1, p_2 \in P$. If $p_1 = p_2$, then, since $r_i \ne r_j$, we must have $|r_i - r_j| = 1$, which is not possible. Thus, $p_1 \ne p_2$. But this means that $p_1 - p_2$ is rational, i.e. $p_1 \sim p_2$, which is not possible. Thus, $P_i \cap P_j = \emptyset$ whenever $i \ne j$. Further, since $P$ contains a representative of each equivalence class, it follows that
\[ [0, 1) = \cup_{i=0}^{\infty} P_i. \]
Now, if $P$ were Lebesgue measurable, then so is each $P_i$ and $m_1(P_i) = m_1(P)$ for each $i$.
In that case
\[ 1 = m_1([0, 1)) = \sum_{i=0}^{\infty} m_1(P_i) \]
and the last sum is either zero or infinity depending on whether $m_1(P)$ is zero or non-zero, which gives us a contradiction. Thus $P$ cannot be measurable.

Let $E \subset P$ be a measurable set. Then $E_i = E \overset{\circ}{+} r_i$ is measurable for each $0 \le i < \infty$ and $m_1(E_i) = m_1(E)$ for each such $i$. Once again, since $E \subset P$, we see that the $E_i$ are all mutually disjoint. Since their union is contained in $[0, 1)$, we have that
\[ \sum_{i=0}^{\infty} m_1(E_i) \le 1. \]
It follows that $m_1(E) = 0$. Thus the only measurable subsets of $P$ are those of measure zero. The same is true for any $P_i$, $1 \le i < \infty$.

Now let $A \subset [0, 1)$ be a measurable set such that $m_1(A) > 0$. Set $E_i = A \cap P_i$. If $E_i$ is measurable, then $m_1(E_i) = 0$. Thus if all the $E_i$ are measurable, we have, since $A = \cup_{i=0}^{\infty} E_i$,
\[ 0 < m_1(A) \le \sum_{i=0}^{\infty} m_1(E_i) = 0, \]
a contradiction. Thus, there exists at least one $i$ such that $E_i$ is not measurable. Thus every subset of strictly positive measure in $[0, 1)$ contains a non-measurable subset. We can draw the same conclusion for any interval of the form $[n, n+1)$. Thus, if $A \subset \mathbb{R}$ is a measurable set with strictly positive measure, then there exists an integer $n$ such that $A \cap [n, n+1)$ has strictly positive measure and hence will contain a non-measurable subset. Thus, we conclude that every measurable set in $\mathbb{R}$ with strictly positive measure will contain a non-measurable subset.

2.5 Exercises

2.1 Let $g : \mathbb{R} \to \mathbb{R}$ be a continuous and increasing function. Define, for $[a, b) \in \mathcal{P}$,
\[ \mu_g([a, b)) = g(b) - g(a). \]
Show that there exists a unique complete measure $\mu_g$, on a σ-ring containing all the Borel sets, which extends $\mu_g$. (This measure is called the Lebesgue-Stieltjes measure induced by $g$.)

2.2 Let $S^1$ denote the unit circle in the plane. The Borel sets in $S^1$ are the members of the σ-algebra generated by all open arcs.
Show that there exists a Borel measure $\mu$ on $S^1$ such that $\mu(S^1) = 1$ and such that $\mu$ is invariant under all rotations of $S^1$.

2.3 Show that every subset of the plane
\[ \{(x, y, z) \mid 2x + 3y + 4z + 1 = 0\} \]
in $\mathbb{R}^3$ is Lebesgue measurable.

2.4 Show that the plane $\mathbb{R}^2$ cannot be expressed as the countable union of straight lines.

2.5 Let $\omega_N = m_N(B_N)$, where $B_N$ denotes the unit ball in $\mathbb{R}^N$. Show that the Lebesgue measure of any ball of radius $r > 0$ in $\mathbb{R}^N$ is $\omega_N r^N$.

2.6 Compute $m_2(S^1)$, where $S^1$ is the unit circle in the plane $\mathbb{R}^2$.

2.7 (a) Let $T$ be a triangle in the plane $\mathbb{R}^2$ with vertices at the points $(0, 0)$, $(1, 0)$ and $(0, 1)$. What is the value of $m_2(T)$?
(b) Let $T$ be a triangle in the plane $\mathbb{R}^2$ with vertices at the points $(x_i, y_i)$, $i = 1, 2, 3$. Show that $m_2(T) = |A|$, where
\[ A = \frac{1}{2} \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}. \]

2.8 Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a non-singular linear transformation. Let $E \subset \mathbb{R}^N$. With the notations of Section 2.2, show that
\[ \mu^*(A(E)) = |\det(A)|\, \mu^*(E). \]
Deduce that $E$ is Lebesgue measurable if, and only if, $A(E)$ is Lebesgue measurable.

2.9 Let $T : \mathbb{R} \to \mathbb{R}$ be a bijection such that both $T$ and $T^{-1}$ map Lebesgue measurable sets onto Lebesgue measurable sets. Define
\[ \mu(E) = m_1(T(E)), \]
for each Lebesgue measurable set $E$. Show that $\mu$ is a complete measure on $\mathcal{L}_1$.

2.10 Let $\mu_1$ and $\mu_2$ be two measures defined on a σ-algebra $\mathcal{S}$. Then $\mu_1$ is said to be absolutely continuous with respect to $\mu_2$ if $\mu_1(E) = 0$ whenever $\mu_2(E) = 0$. Show that the measure $\mu$ defined in Exercise 2.9 is absolutely continuous with respect to the Lebesgue measure.

2.11 Does there exist a non-measurable subset of $[0, 1]$ consisting only of irrational numbers?

Chapter 3 Measurable functions

3.1 Basic properties

Let $X$ be a non-empty set and let $\mathcal{S}$ be a σ-algebra of subsets of $X$. We then say that $(X, \mathcal{S})$ is a measurable space. The members of $\mathcal{S}$ are called measurable sets. An extended real-valued function on $X$ is a function defined on $X$ which takes values in the set $\mathbb{R} \cup \{\pm\infty\}$.
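Definition 3.1.1 Let $(X, \mathcal{S})$ be a measurable space and let $f$ be an extended real-valued function defined on $X$. We say that $f$ is a measurable function if $f^{-1}((\alpha, +\infty]) \in \mathcal{S}$ for every $\alpha \in \mathbb{R}$. When $X = \mathbb{R}^N$, if the above condition is satisfied by $f$ with $\mathcal{S} = \mathcal{B}_N$, we say that $f$ is a Borel measurable function and if it is satisfied with $\mathcal{S} = \mathcal{L}_N$, we say that $f$ is a Lebesgue measurable function.

Remark 3.1.1 Evidently, if a function, $f$, defined on $\mathbb{R}^N$, is Borel measurable, it is also Lebesgue measurable.

Proposition 3.1.1 Let $(X, \mathcal{S})$ be a measurable space. Let $f$ be an extended real-valued function defined on $X$. The following statements are equivalent:
(i) For every $\alpha \in \mathbb{R}$, $f^{-1}((\alpha, +\infty]) \in \mathcal{S}$, i.e. $f$ is measurable;
(ii) For every $\alpha \in \mathbb{R}$, $f^{-1}([\alpha, +\infty]) \in \mathcal{S}$;
(iii) For every $\alpha \in \mathbb{R}$, $f^{-1}([-\infty, \alpha)) \in \mathcal{S}$;
(iv) For every $\alpha \in \mathbb{R}$, $f^{-1}([-\infty, \alpha]) \in \mathcal{S}$.

Proof: (i) ⇒ (ii):
\[ f^{-1}([\alpha, +\infty]) = \cap_{n=1}^{\infty} f^{-1}\left( \left( \alpha - \frac{1}{n}, +\infty \right] \right) \in \mathcal{S}. \]
(ii) ⇒ (iii):
\[ f^{-1}([-\infty, \alpha)) = (f^{-1}([\alpha, +\infty]))^c \in \mathcal{S}. \]
(iii) ⇒ (iv):
\[ f^{-1}([-\infty, \alpha]) = \cap_{n=1}^{\infty} f^{-1}\left( \left[ -\infty, \alpha + \frac{1}{n} \right) \right) \in \mathcal{S}. \]
(iv) ⇒ (i):
\[ f^{-1}((\alpha, +\infty]) = (f^{-1}([-\infty, \alpha]))^c \in \mathcal{S}. \]

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_3

Corollary 3.1.1 (i) Let $(X, \mathcal{S})$ be a measurable space. Let $f$ be a measurable function on $X$. Then, for every $\alpha \in \mathbb{R} \cup \{\pm\infty\}$, we have $f^{-1}(\{\alpha\}) \in \mathcal{S}$.
(ii) If $U \subset \mathbb{R}$ is an open set, then $f^{-1}(U) \in \mathcal{S}$.

Proof: (i) If $\alpha \in \mathbb{R}$, then
\[ f^{-1}(\{\alpha\}) = \cap_{n=1}^{\infty} f^{-1}\left( \left( \alpha - \frac{1}{n}, +\infty \right] \cap \left[ -\infty, \alpha + \frac{1}{n} \right) \right) \in \mathcal{S}. \]
Next,
\[ f^{-1}(\{+\infty\}) = \cap_{n=1}^{\infty} f^{-1}((n, +\infty]) \in \mathcal{S} \]
and
\[ f^{-1}(\{-\infty\}) = \cap_{n=1}^{\infty} f^{-1}([-\infty, -n)) \in \mathcal{S}. \]
(ii) Given $(a, b) \subset \mathbb{R}$, we have
\[ f^{-1}((a, b)) = f^{-1}([-\infty, b)) \cap f^{-1}((a, +\infty]) \in \mathcal{S}. \]
The result now follows since every open set can be written as the countable union of open intervals.

Example 3.1.1 Let $X = \mathbb{R}^N$.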
Then, every continuous real-valued function, $f$, defined on $\mathbb{R}^N$ will be both Borel and Lebesgue measurable since
\[ f^{-1}((\alpha, +\infty]) = f^{-1}((\alpha, \infty)), \]
which is open and hence belongs to both the Borel and Lebesgue σ-algebras.

Example 3.1.2 Let $(X, \mathcal{S})$ be a measurable space and let $f$ be a real-valued function defined on $X$. It is clear that if $f^{-1}(U)$ is measurable for every open set $U \subset \mathbb{R}$, then $f$ is measurable. However, the converse of part (i) of the preceding corollary is not true. Let $E \subset [0, 1)$ be a non-measurable subset (cf. Section 2.4), i.e. $E \notin \mathcal{L}_1$. Define
\[ f(x) = \begin{cases} x, & \text{if } x \in E, \\ -x, & \text{if } x \in [0, 1) \setminus E, \\ -2, & \text{if } x \notin [0, 1). \end{cases} \]
Then
\[ f^{-1}(\{\alpha\}) = \begin{cases} \mathbb{R} \setminus [0, 1), & \text{if } \alpha = -2, \\ \{-\alpha\}, & \text{if } -\alpha \in [0, 1) \setminus E, \\ \{\alpha\}, & \text{if } \alpha \in E, \\ \emptyset, & \text{otherwise.} \end{cases} \]
Thus, it follows that $f^{-1}(\{\alpha\})$ is measurable for every $\alpha \in \mathbb{R}$. However $f^{-1}((0, \infty)) = E \setminus \{0\}$, which is not Lebesgue measurable since $E$ is not, and so $f$ is not Lebesgue measurable.

Example 3.1.3 Let $(X, \mathcal{S})$ be a measurable space and let $A \subset X$. Let $f = \chi_A$, the characteristic function of the set $A$. Then
\[ f^{-1}((\alpha, \infty]) = \begin{cases} X, & \text{if } \alpha < 0, \\ A, & \text{if } 0 \le \alpha < 1, \\ \emptyset, & \text{if } \alpha \ge 1. \end{cases} \]
Thus, $\chi_A$ is measurable if, and only if, $A \in \mathcal{S}$.

Example 3.1.4 Let $(X, \mathcal{S})$ be a measurable space. Any constant function is measurable. Let $f(x) = c$ for all $x \in X$. If $\alpha \in \mathbb{R}$, then $f^{-1}((\alpha, +\infty]) = X$ if $\alpha < c$ and is equal to the empty set if $\alpha \ge c$.

Proposition 3.1.2 Let $(X, \mathcal{S})$ be a measurable space and let $f$ and $g$ be measurable real-valued functions defined on $X$. Let $c \in \mathbb{R}$. Then $f + c$, $cf$, $f \pm g$ and $fg$ are all measurable functions.

Proof: (i) Let $\alpha \in \mathbb{R}$. Let $c > 0$. Then
\[ \{x \in X \mid cf(x) < \alpha\} = \left\{ x \in X \mid f(x) < \frac{\alpha}{c} \right\} \in \mathcal{S}. \]
If $c < 0$, then
\[ \{x \in X \mid cf(x) < \alpha\} = \left\{ x \in X \mid f(x) > \frac{\alpha}{c} \right\} \in \mathcal{S}. \]
If $c = 0$, then $cf$ is the constant function taking the value zero. Thus, it follows that $cf$ is measurable for all $c \in \mathbb{R}$.

(ii) Let $\alpha \in \mathbb{R}$. Then
\[ \begin{aligned} \{x \in X \mid f(x) + g(x) < \alpha\} &= \{x \in X \mid f(x) < \alpha - g(x)\} \\ &= \cup_{r \in \mathbb{Q}} (\{x \in X \mid f(x) < r\} \cap \{x \in X \mid g(x) < \alpha - r\}). \end{aligned} \]
Since the rationals are countable, it follows that $(f + g)^{-1}([-\infty, \alpha)) \in \mathcal{S}$. Thus $f + g$ is measurable. Since $f - g = f + (-1)g$, it follows from (i) above that $f - g$ is also measurable.

(iii) Since constant functions are measurable, it follows from (ii) above that $f + c$ is measurable.

(iv) Let $\alpha \in \mathbb{R}$. If $\alpha > 0$, then
\[ \{x \in X \mid (f(x))^2 > \alpha\} = \{x \in X \mid f(x) > \sqrt{\alpha}\} \cup \{x \in X \mid f(x) < -\sqrt{\alpha}\}. \]
If $\alpha \le 0$, then
\[ \{x \in X \mid (f(x))^2 > \alpha\} = X. \]
Thus, it follows that the function $f^2$ is measurable. Now,
\[ fg = \frac{1}{4}((f + g)^2 - (f - g)^2). \]
Thus, by the preceding assertions, it follows that $fg$ is also measurable.

Remark 3.1.2 Whenever the concerned functions are well-defined, the preceding proposition holds for extended real-valued functions as well. For instance, $f + g$ is not defined at points $x \in X$ where $f(x) = +\infty$ and $g(x) = -\infty$.

Proposition 3.1.3 Let $(X, \mathcal{S})$ be a measurable space and let $f$ be a real-valued measurable function defined on $X$. Then $|f|$ is measurable.

Proof: Let $\alpha \in \mathbb{R}$. Then,
\[ \{x \in X \mid |f(x)| < \alpha\} = \{x \in X \mid f(x) > -\alpha\} \cap \{x \in X \mid f(x) < \alpha\} \]
if $\alpha > 0$ and is the empty set if $\alpha \le 0$. Thus, $|f|$ is measurable.

Corollary 3.1.2 Let $(X, \mathcal{S})$ be a measurable space and let $f$ and $g$ be measurable real-valued functions defined on $X$. Then, $\max\{f, g\}$ and $\min\{f, g\}$ are measurable. In particular, if $f$ is a measurable real-valued function defined on $X$, then
\[ f^+ = \max\{f, 0\} \quad \text{and} \quad f^- = -\min\{f, 0\} \]
are measurable.

Proof: The result follows from the previous propositions and the following relations:
\[ \max\{f, g\} = \frac{1}{2}(f + g + |f - g|), \qquad \min\{f, g\} = \frac{1}{2}(f + g - |f - g|). \]

Remark 3.1.3 The functions $f^+$ and $f^-$ are called the positive and negative parts of the function $f$. We have
\[ f = f^+ - f^- \quad \text{and} \quad |f| = f^+ + f^-. \]
Notice that both $f^+$ and $f^-$ are non-negative functions.

Lemma 3.1.1 Let $(X, \mathcal{S})$ be a measurable space and let $f$ be a real-valued measurable function defined on $X$. Then $f^{-1}(E) \in \mathcal{S}$ whenever $E$ is a Borel set.

Proof: Consider
\[ \widetilde{\mathcal{S}} = \{E \subset \mathbb{R} \mid f^{-1}(E) \in \mathcal{S}\}. \]
Clearly, $\mathbb{R} \in \widetilde{S}$. Now $f^{-1}(E^c) = (f^{-1}(E))^c$ and so if $E \in \widetilde{S}$, then $E^c \in \widetilde{S}$ as well. Similarly, if $\{E_i\}_{i=1}^{\infty}$ is a sequence in $\widetilde{S}$, we have $f^{-1}(\cup_{i=1}^{\infty} E_i) = \cup_{i=1}^{\infty} f^{-1}(E_i)$ and so $\cup_{i=1}^{\infty} E_i \in \widetilde{S}$ as well. Thus $\widetilde{S}$ is a σ-algebra and, by the definition of measurability, it contains all the open sets of $\mathbb{R}$ (cf. Corollary 3.1.1). Thus, $\widetilde{S}$ contains all the Borel sets, and this completes the proof.

Corollary 3.1.3 Let $(X, S)$ be a measurable space and let $f$ be a real-valued function defined on $X$. Then $f$ is measurable if, and only if, $f^{-1}(U) \in S$ whenever $U$ is a Borel set.

Proof: If $f$ is measurable, the above lemma shows that the inverse image of every Borel set is measurable. Conversely, if the inverse image of every Borel set is measurable, then $f^{-1}((\alpha, +\infty)) \in S$ for every $\alpha \in \mathbb{R}$, and since the function is real-valued, it follows that $f$ is measurable, by definition.

Proposition 3.1.4 Let $(X, S)$ be a measurable space and let $f$ be a measurable real-valued function defined on $X$. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a Borel measurable function. Then $\varphi \circ f$ is a measurable function on $X$.

Proof: Let $\alpha \in \mathbb{R}$. We have
$$\{x \in X \mid (\varphi \circ f)(x) > \alpha\} = f^{-1}(\varphi^{-1}((\alpha, +\infty))).$$
Now $\varphi^{-1}((\alpha, +\infty))$ is a Borel set and the proof follows from the preceding lemma.

Remark 3.1.4 In general, the composition of two measurable functions can fail to be measurable. We will see an example of this in the next section.

Proposition 3.1.5 Let $(X, S)$ be a measurable space and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on $X$. Define, for $x \in X$,
$$h(x) = \sup_n f_n(x) \quad \text{and} \quad g(x) = \inf_n f_n(x).$$
Then $h$ and $g$ are measurable.

Proof: Let $\alpha \in \mathbb{R}$. Then
$$\{x \in X \mid h(x) > \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) > \alpha\}, \qquad \{x \in X \mid g(x) < \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) < \alpha\},$$
and the result follows immediately.

Corollary 3.1.4 Let $(X, S)$ be a measurable space. Let $\{f_n\}$ be a sequence of real-valued measurable functions defined on $X$. Then $\limsup_{n \to \infty} f_n$ and $\liminf_{n \to \infty} f_n$ are measurable.
Hence, if $f_n(x) \to f(x)$ for all $x \in X$, then $f$ is a measurable function.

Proof: Notice that $g_n = \sup_{m \ge n} f_m$ is measurable. Then
$$\limsup_{n \to \infty} f_n = \inf_n g_n$$
is measurable. Similarly, $h_n = \inf_{m \ge n} f_m$ is measurable and so
$$\liminf_{n \to \infty} f_n = \sup_n h_n$$
is measurable. If $f_n(x) \to f(x)$ for all $x \in X$, then
$$f = \liminf_{n \to \infty} f_n = \limsup_{n \to \infty} f_n,$$
and the result follows.

Definition 3.1.2 Let $(X, S)$ be a measurable space. A simple function defined on $X$ is a function of the form
$$f = \sum_{i=1}^{k} \alpha_i \chi_{A_i},$$
where the $\alpha_i$, $1 \le i \le k$, are real constants and the $A_i$, $1 \le i \le k$, are measurable sets.

Remark 3.1.5 By definition, a simple function is measurable. Simple functions are the building blocks with which we develop Lebesgue's theory of integration, just as Riemann's theory of integration was based on step functions. As a first step towards this, we have the following result.

Theorem 3.1.1 Let $(X, S)$ be a measurable space and let $f$ be a non-negative extended real-valued measurable function defined on $X$. Then $f$ is the increasing limit of a sequence of non-negative simple functions defined on $X$.

Proof: Let $n$ be a fixed positive integer. For $1 \le i \le n2^n$, define
$$E_{n,i} = f^{-1}\left(\left[\frac{i-1}{2^n}, \frac{i}{2^n}\right)\right) \quad \text{and} \quad F_n = f^{-1}([n, +\infty]).$$
In other words, we divide the interval $[0, n)$ into subintervals of length $\frac{1}{2^n}$ and consider the inverse images under $f$ of each of these to define the sets $E_{n,i}$. The sets $E_{n,i}$ and $F_n$ are all clearly measurable. Now define
$$f_n = n\chi_{F_n} + \sum_{i=1}^{n2^n} \frac{i-1}{2^n} \chi_{E_{n,i}}.$$
Thus, $f_n$ is a non-negative simple function. If $f(x) \ge n$, then $f_n(x) = n$. If $f(x) < n$ and if
$$\frac{i-1}{2^n} \le f(x) < \frac{i}{2^n},$$
then $f_n(x) = \frac{i-1}{2^n}$. Thus, $f_n(x) \le f(x)$ for all $x \in X$.

We claim that $f_n(x) \le f_{n+1}(x)$ for each positive integer $n$ and for each $x \in X$. Indeed, if $f(x) \ge n + 1$, then $f_{n+1}(x) = n + 1$ and $f_n(x) = n$. If $n \le f(x) < n + 1$, then $f_{n+1}(x) = \frac{i-1}{2^{n+1}}$ for some $i$ such that
$$f(x) \in \left[\frac{i-1}{2^{n+1}}, \frac{i}{2^{n+1}}\right) \subset [n, n + 1).$$
In this case, $f_n(x) = n \le \frac{i-1}{2^{n+1}}$ and so we still have $f_n(x) \le f_{n+1}(x)$.
Finally, if $f(x) < n$, then for some $1 \le i \le n2^n$, we have
$$f(x) \in \left[\frac{i-1}{2^n}, \frac{i}{2^n}\right) = \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i}{2^{n+1}}\right).$$
Consequently, $f_n(x) = \frac{i-1}{2^n}$, while
$$f_{n+1}(x) = \frac{2(i-1)}{2^{n+1}} = \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i-1}{2^{n+1}}\right),$$
or
$$f_{n+1}(x) = \frac{2i-1}{2^{n+1}} > \frac{2(i-1)}{2^{n+1}} = \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2i-1}{2^{n+1}}, \frac{2i}{2^{n+1}}\right).$$
This establishes the claim.

Thus, $\{f_n\}$ is an increasing sequence of simple functions bounded above by $f$. If $f(x) = +\infty$, then $f_n(x) = n$ for all $n$, so that $f_n(x) \to +\infty$. If $f(x) < +\infty$, then there exists a positive integer $N$ such that $f(x) < N$. Then, for all $n \ge N$, we see, from the construction above, that
$$|f(x) - f_n(x)| = f(x) - f_n(x) \le \frac{1}{2^n}.$$
Thus we have established that $f_n \uparrow f$.

Corollary 3.1.5 Let $(X, S)$ be a measurable space and let $f$ be a real-valued measurable function defined on $X$. Then $f$ is the limit of a sequence of simple functions.

Proof: We can split $f$ into its positive and negative parts. Thus $f = f^+ - f^-$, where $f^{\pm}$ are non-negative measurable functions. We can find sequences of non-negative simple functions $\{\varphi_n\}$ and $\{\psi_n\}$ such that $\varphi_n \uparrow f^+$ and $\psi_n \uparrow f^-$. Thus, $f_n = \varphi_n - \psi_n$ gives a sequence of simple functions converging pointwise to $f$.

3.2 The Cantor function

The Cantor function, like the Cantor set, provides a lot of interesting examples, or counter-examples, to illustrate fine points in the theory of measure and integration. Several constructions are possible but the essential properties are the same for all these functions and they serve the same purpose. In this section, we will present one such construction.

Before we construct the function, which will be the uniform limit of a sequence of piecewise linear functions, we will present a basic construction which will be used iteratively, at different scales, to produce the next member of the desired sequence from the current one.

Consider an interval $[a, b]$ in the real line and let $f : [a, b] \to \mathbb{R}$ be a linear function, i.e.
$$f(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a), \quad x \in [a, b].$$
We then divide the interval $[a, b]$ into three equal parts. Let us denote the two interior points of this partition by $c_j$, $j = 1, 2$. Thus,
$$c_j = a + j\,\frac{b - a}{3}, \quad j = 1, 2.$$
The value of $f$ at the point $c_1$ will, therefore, be given by the relation
$$f(c_1) = \frac{2f(a) + f(b)}{3}.$$
We then define the 'next iterate', $g$, of $f$ as follows:
$$g(x) = \begin{cases} f(x), & \text{if } x \in [a, c_1], \\ f(c_1), & \text{if } x \in [c_1, c_2], \\ f(c_1) + \dfrac{f(b) - f(c_1)}{b - c_2}(x - c_2), & \text{if } x \in [c_2, b]. \end{cases}$$
In other words, we move along $f$ in the first third of the interval, then move horizontally along the second third, and finally climb up to $f(b)$ in a straight line on the third interval (cf. Figure 3.2.1 below).

[Figure 3.2.1: axis labels $a$, $b$ and $f(a)$, $f(b)$]

A simple computation shows that the slope of $g$ is the same as that of $f$ in the first third of the interval, equal to zero in the middle third and twice the slope of $f$ in the final third of the interval.

Let us consider the interval $[0, 1]$ and the function $f_0(x) = x$ defined on it. If we apply the procedure described above to this function, we will get the function $f_1$ given by
$$f_1(x) = \begin{cases} x, & \text{if } x \in [0, \tfrac{1}{3}], \\ \tfrac{1}{3}, & \text{if } x \in [\tfrac{1}{3}, \tfrac{2}{3}], \\ 2x - 1, & \text{if } x \in [\tfrac{2}{3}, 1]. \end{cases}$$
We now apply the iteration procedure described above in each of the intervals $[0, \tfrac{1}{3}]$ and $[\tfrac{2}{3}, 1]$ to get the next function $f_2$ and so on (cf. Figure 3.2.2 below).

[Figure 3.2.2]

For each positive integer $n$, we apply the iteration procedure described earlier only to those sub-intervals where $f_n$ is not constant. Thus, if $f_n$ is constant on any sub-interval, we will have $f_m = f_n =$ the same constant on that sub-interval for all $m \ge n$. Notice that the union of the sub-intervals where $f_n$ is a constant is precisely the set $X_n$ described in the construction of the Cantor set (cf. Example 2.1.3). In this manner, we can construct a sequence of continuous piecewise linear functions $\{f_n\}$. By construction, we have a decreasing sequence of functions, each of which is monotonically non-decreasing.
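The iteration just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `iterate` and its parameters are hypothetical (they do not appear in the text), and exact rational arithmetic is used so that the comparisons below are not disturbed by rounding. The recursion splits $[a, b]$ into thirds, fixes the value $f(c_1) = (2f(a) + f(b))/3$ on the middle third, and recurses on the two outer thirds.

```python
from fractions import Fraction


def iterate(n, x, a=Fraction(0), b=Fraction(1), fa=Fraction(0), fb=Fraction(1)):
    """Value at x of the n-th iterate on [a, b], starting from the linear
    function that joins (a, fa) to (b, fb). Names are illustrative only."""
    if n == 0:
        # No refinement left: evaluate the linear function itself.
        return fa + (fb - fa) * (x - a) / (b - a)
    c1 = a + (b - a) / 3
    c2 = a + 2 * (b - a) / 3
    v = (2 * fa + fb) / 3  # f(c1), the constant value on the middle third
    if x <= c1:
        return iterate(n - 1, x, a, c1, fa, v)
    if x <= c2:
        return v
    return iterate(n - 1, x, c2, b, v, fb)


# f_1 agrees with the explicit formula given above.
print(iterate(1, Fraction(1, 2)))   # value on the middle third
print(iterate(1, Fraction(9, 10)))  # value of 2x - 1 at x = 9/10
```

With `n = 0` the call returns $f_0(x) = x$, and each further level reproduces one application of the 'next iterate' construction on every sub-interval where the previous function is not constant.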
The maximum slope occurs in the last sub-interval and, as seen earlier, each time we apply the iteration procedure, the slope doubles. Thus, in the last sub-interval of length $\frac{1}{3^n}$, the slope of $f_n$ will be $2^n$. By the mean value theorem, we thus see that for any $x \in [0, 1]$,
$$|f_n(x) - f_{n+1}(x)| \le |f_{n+1}(1) - f_{n+1}(1 - 3^{-(n+1)})| \le \left(\frac{2}{3}\right)^{n+1}.$$
Since the series $\sum_{n=1}^{\infty} \left(\frac{2}{3}\right)^n$ is a convergent geometric series, it follows that $\{f_n\}$ is uniformly Cauchy and so it converges uniformly to a continuous function $f$. This function is called the Cantor function.

Since each $f_n$ is non-decreasing, so is $f$. We also have that $f(0) = 0$ and $f(1) = 1$. If $C$ is the Cantor set (cf. Example 2.1.3), then, by construction, $f$ is constant on each sub-interval of $C^c$, since in the construction of $f_{n+1}$ from $f_n$, we set the value in each middle third interval as a constant and once fixed thus, it remains unaltered in the construction of $f_m$, $m \ge n + 1$.

Let us now define $\psi(y) = y + f(y)$ for $y \in [0, 1]$. Then $\psi$ is strictly monotonic increasing and continuous. We have $\psi(0) = 0$ and $\psi(1) = 2$. Thus $\psi$ is a continuous bijection of $[0, 1]$ onto $[0, 2]$. Let $\varphi$ denote the inverse of $\psi$. Then $\varphi$ is also monotonic increasing. We have that
$$x = \varphi(x) + f(\varphi(x)),$$
for every $x \in [0, 2]$. If $x \ge y$, then $\varphi(x) \ge \varphi(y)$ and
$$x - y = \varphi(x) - \varphi(y) + f(\varphi(x)) - f(\varphi(y)).$$
Since $f$ is non-decreasing, we deduce, from the above relation, that $\varphi(x) - \varphi(y) \le x - y$ whenever $x \ge y$, from which we have
$$|\varphi(x) - \varphi(y)| \le |x - y|.$$
Thus, $\varphi$ is continuous as well.

Now, $\psi$ is a bijection and so it maps disjoint sets into disjoint sets. If $I$ is an interval contained in $C^c$, where $C$ is the Cantor set, then $f$ is a constant, $c_I$, on $I$ and so $\psi(x) = x + c_I$ on $I$. Thus, $\psi$ just translates $I$ and so $m_1(\psi(I)) = m_1(I)$. Since $C^c$ is made up of disjoint intervals, it follows that $m_1(\psi(C^c)) = m_1(C^c) = 1$. Since the range of $\psi$ is $[0, 2]$, we thus conclude that $m_1(\psi(C)) = 1$ as well.
Thus, $\psi$ maps $C$, a set of measure zero, onto a set of measure one.

Since $\psi(C)$ has positive measure, it contains a non-measurable set, say, $S$. Let $M = \psi^{-1}(S) = \varphi(S)$. Then $M \subset C$ and by the completeness of the Lebesgue measure, it follows that $M$ is Lebesgue measurable. If $M$ were Borel measurable, then it would follow that $S = \varphi^{-1}(M)$ is also Borel measurable, since $\varphi$ is continuous and hence a Borel measurable function. But that would imply that $S$ is also Lebesgue measurable, which contradicts our assumption on $S$. Thus, the set $M$ described above is an example of a Lebesgue measurable set which is not Borel measurable.

Finally, let $\Phi = \chi_M$, which is a Lebesgue measurable function. Set $\zeta = \Phi \circ \varphi$. Thus $\zeta$ is the composition of a Lebesgue measurable function and a continuous (and hence, Lebesgue measurable) function. Now
$$\zeta^{-1}(\{1\}) = \{x \in [0, 2] \mid \zeta(x) = 1\} = \varphi^{-1}(M) = S,$$
which, by choice, is not Lebesgue measurable. Thus $\zeta$ is not a Lebesgue measurable function (cf. Corollary 3.1.1(i)). Thus, we have constructed an example to show that the composition of two measurable functions need not be measurable (cf. Proposition 3.1.4 and Remark 3.1.4).

Remark 3.2.1 Notice, however, that by Proposition 3.1.4, the mapping $\varphi \circ \Phi$ will be measurable.

3.3 Almost everywhere

Let $(X, S)$ be a measurable space. Let $\mu$ be a measure defined on $S$. We say that $(X, S, \mu)$ is a measure space. Given a measure space $(X, S, \mu)$, we say that a measurable function, or a collection of measurable functions, enjoys a certain property almost everywhere if that property is valid at all points in $X$ except, possibly, on a set of measure zero. We abbreviate 'almost everywhere' by a.e. More specifically, we will frequently encounter the following situations in the sequel.

• A measurable extended real-valued function $f$ defined on $X$ is finite a.e. if there exists a set $E \in S$ such that $\mu(E) = 0$ and such that $f(x) \in \mathbb{R}$ for all $x \in E^c$.
• Given two measurable functions $f$ and $g$ defined on $X$, we say that $f = g$ a.e. if there exists a set $E \in S$ such that $\mu(E) = 0$ and such that $f(x) = g(x)$ for all $x \in E^c$.

• Given a sequence of measurable functions $\{f_n\}$ and a measurable function $f$ defined on $X$, we say that $f_n$ converges to $f$ a.e. if there exists a set $E \in S$ such that $\mu(E) = 0$ and such that $f_n(x) \to f(x)$ for every $x \in E^c$.

Definition 3.3.1 Let $(X, S, \mu)$ be a measure space and let $f : X \to \mathbb{R}$ be a measurable function. We say that $f$ is essentially bounded if there exists $M > 0$ such that the set $\{x \in X \mid |f(x)| > M\}$ has measure zero. The essential supremum of $f$ is the infimum of all such $M$, and is denoted $\|f\|_\infty$, i.e.
$$\|f\|_\infty = \inf\{M \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}.$$

3.4 Exercises

3.1 Let $(X, S)$ be a measurable space. Let $f : X \to \mathbb{R}$ be a measurable function. Define
$$g(x) = \begin{cases} \dfrac{1}{f(x)}, & \text{if } f(x) \ne 0, \\ 0, & \text{if } f(x) = 0. \end{cases}$$
Show that $g$ is measurable.

3.2 Let $(X, S)$ be a measurable space. Let $f : X \to \mathbb{R}$ be a function such that $|f|$ is measurable. Is it necessary that $f$ be measurable?

3.3 Let $(X, S)$ be a measurable space. Let $f : X \to \mathbb{R}$ be a function such that $f^{-1}((r, +\infty)) \in S$ for every rational number $r$. Show that $f$ is measurable.

3.4 Let $(X, S)$ be a measurable space. Let $\{f_n\}$ be a sequence of measurable functions defined on $X$. Show that the set of all points $x \in X$, where the sequence $\{f_n(x)\}$ is not Cauchy, is a measurable set.

3.5 Let $(X, S)$ be a measurable space. Let $\{f_n\}$ be a sequence of measurable functions defined on $X$ converging to a function $f$ a.e. Is it necessary that $f$ is measurable?

3.6 Let $(X, S)$ be a measurable space. Let $f : X \times [0, 1] \to \mathbb{R}$ be a function such that, for each fixed $y \in [0, 1]$, the mapping $x \mapsto f(x, y)$ is measurable, and, for each fixed $x \in X$, the mapping $y \mapsto f(x, y)$ is continuous. Define
$$h(x) = \min_{y \in [0, 1]} f(x, y), \quad \text{for } x \in X.$$
Show that $h : X \to \mathbb{R}$ is measurable.

3.7 Let $f : \mathbb{R} \to \mathbb{R}$ be a Lebesgue measurable function.
Show that there exists a Borel measurable function $g : \mathbb{R} \to \mathbb{R}$ such that $g = f$ a.e.

Chapter 4 Convergence

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019. S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_4

4.1 Egorov's theorem

Theorem 4.1.1 (Egorov) Let $(X, S, \mu)$ be a finite measure space, i.e. $\mu(X) < +\infty$. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, converging almost everywhere to a real-valued measurable function $f$. Then, given any $\varepsilon > 0$, there exists a measurable set $F \subset X$ such that $\mu(F) < \varepsilon$ and such that $f_n \to f$ uniformly on $F^c$.

Proof: Let $E \in S$ be such that $\mu(E) = 0$ and such that $f_n \to f$ pointwise on $E^c$. Set $Y = E^c$. Given positive integers $m$ and $n$, define
$$E_{n,m} = \bigcap_{i=n}^{\infty} \left\{x \in Y \mid |f_i(x) - f(x)| < \frac{1}{m}\right\}.$$
Then, clearly,
$$E_{1,m} \subset E_{2,m} \subset \cdots \subset E_{n,m} \subset E_{n+1,m} \subset \cdots.$$
Further, for every $x \in Y$, we have $f_n(x) \to f(x)$ and so for any $m$, there exists $N$ such that for all $i \ge N$, we have $|f_i(x) - f(x)| < \frac{1}{m}$, i.e. $x \in E_{N,m}$. Thus, $Y = \cup_{n=1}^{\infty} E_{n,m}$. Consequently (cf. Proposition 1.2.4),
$$\mu(Y) = \lim_{n \to \infty} \mu(E_{n,m}).$$
Since $\mu(Y) = \mu(X) < +\infty$, given $\varepsilon > 0$, there exists $n_0(m) \in \mathbb{N}$ such that
$$\mu(Y \setminus E_{n_0(m),m}) = \mu(Y) - \mu(E_{n_0(m),m}) < \frac{\varepsilon}{2^m}.$$
Set
$$G = \bigcup_{m=1}^{\infty} (Y \setminus E_{n_0(m),m}).$$
Then $G$ is measurable and
$$\mu(G) < \sum_{m=1}^{\infty} \frac{\varepsilon}{2^m} = \varepsilon.$$
Now set $F = G \cup E$ so that $\mu(F) = \mu(G) < \varepsilon$. Observe that
$$F^c = \bigcap_{m=1}^{\infty} E_{n_0(m),m}.$$
Given any $\eta > 0$, choose $m$ such that $\frac{1}{m} < \eta$. If $x \in F^c$, then $x \in E_{n_0(m),m} \subset E_{n,m}$ for all $n \ge n_0(m)$. Thus, for all $x \in F^c$, and for all $n \ge n_0(m)$, we have
$$|f_n(x) - f(x)| < \frac{1}{m} < \eta.$$
Since the choice of $m$ depended only on $\eta$, this shows that we have uniform convergence of the sequence $\{f_n\}$ to $f$ on $F^c$. This completes the proof.

Example 4.1.1 The result of the theorem does not hold, in general, in infinite measure spaces.
For instance, consider the set $\mathbb{N}$ of natural numbers with the counting measure defined on the σ-algebra of all subsets of $\mathbb{N}$. If $F \subset \mathbb{N}$ is such that $\mu(F) < \varepsilon < 1$, then, clearly, $F = \emptyset$. Thus uniform convergence on $F^c$ means uniform convergence on $\mathbb{N}$. Now consider the sequence $\{f_n\}_{n=1}^{\infty}$ defined by $f_n = \chi_{\{1, 2, \cdots, n\}}$. Then $f_n \to f$ on $\mathbb{N}$, where $f(i) = 1$ for all $i \in \mathbb{N}$, but this convergence is not uniform.

Inspired by the statement of Egorov's theorem, we can formulate the following definition.

Definition 4.1.1 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. We say that this sequence converges almost uniformly to a real-valued measurable function $f$ defined on $X$, if for every $\varepsilon > 0$, there exists a measurable set $F$ such that $\mu(F) < \varepsilon$ and such that $f_n \to f$ uniformly on $F^c$.

The converse of Egorov's theorem holds for any measure space.

Proposition 4.1.1 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ converging almost uniformly to a real-valued measurable function $f$ defined on $X$. Then $f_n \to f$ a.e. on $X$.

Proof: Let $m \in \mathbb{N}$. Choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such that $f_n \to f$ uniformly on $F_m^c$. Set $F = \cap_{m=1}^{\infty} F_m$. Then $\mu(F) = 0$. Since $F^c = \cup_{m=1}^{\infty} F_m^c$, we have that $f_n(x) \to f(x)$ for every $x \in F^c$.

Definition 4.1.2 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. We say that this sequence is almost uniformly Cauchy if, for every $\varepsilon > 0$, there exists a set $F \in S$ such that $\mu(F) < \varepsilon$ and such that $\{f_n\}$ is a uniformly Cauchy sequence on $F^c$.

Clearly, if a sequence $\{f_n\}_{n=1}^{\infty}$ of real-valued measurable functions defined on $X$ converges almost uniformly, then it is almost uniformly Cauchy. We now prove the converse.
Proposition 4.1.2 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be an almost uniformly Cauchy sequence of real-valued measurable functions defined on $X$. Then there exists a real-valued measurable function $f$ defined on $X$ such that $f_n \to f$ almost uniformly.

Proof: For each $m \in \mathbb{N}$, choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such that the sequence $\{f_n\}$ is uniformly Cauchy on $F_m^c$. Set $F = \cap_{m=1}^{\infty} F_m$. Then $\mu(F) = 0$. Since $F^c = \cup_{m=1}^{\infty} F_m^c$, we have that $\{f_n(x)\}$ is a Cauchy sequence for every $x \in F^c$. Define
$$f(x) = \begin{cases} \lim_{n \to \infty} f_n(x), & \text{if } x \in F^c, \\ 0, & \text{if } x \in F. \end{cases}$$
Set $g_n = \chi_{F^c} f_n$. Then $g_n$ is measurable for each positive integer $n$. Further, if $x \in F$, we have $g_n(x) = f(x) = 0$ for all $n$ and if $x \in F^c$, we have $g_n(x) = f_n(x) \to f(x)$. Thus $g_n \to f$ everywhere and so (cf. Corollary 3.1.4) $f$ is measurable. In particular, $f_n \to f$ on $F^c$. Since $\{f_n\}$ is uniformly Cauchy on $F_m^c$ and converges pointwise to $f$ there, $f_n \to f$ uniformly on $F_m^c$, where $\mu(F_m) < \frac{1}{m}$. Thus, we see that the sequence $\{f_n\}$ converges to $f$ almost uniformly.

4.2 Convergence in measure

In this section, we will investigate a new notion of convergence of measurable functions defined on a measure space and compare it with the notions of pointwise convergence a.e. and almost uniform convergence.

Definition 4.2.1 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. Let $f$ be a real-valued measurable function defined on $X$. We say that the sequence $\{f_n\}$ converges in measure to the function $f$ if for every $\varepsilon > 0$, we have
$$\lim_{n \to \infty} \mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) = 0.$$
We say that the sequence $\{f_n\}$ is Cauchy in measure if for every $\varepsilon > 0$ and for every $\delta > 0$, there exists $N \in \mathbb{N}$ such that for all $n, m \ge N$, we have
$$\mu(\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}) < \delta.$$

Notation Let $(X, S, \mu)$ be a measure space. If a sequence of real-valued measurable functions $\{f_n\}$ defined on $X$ converges in measure to a real-valued measurable function $f$, we write
$$f_n \xrightarrow{\mu} f.$$
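To make the definition concrete, here is a small numerical sketch in Python. The example is an assumption chosen for illustration and is not taken from the text: $X = [0, 1)$ with the Lebesgue measure, $f_n(x) = x^n$ and $f = 0$. For $0 < \varepsilon < 1$ the set $\{x \mid |f_n(x) - f(x)| \ge \varepsilon\}$ is the interval $[\varepsilon^{1/n}, 1)$, whose measure $1 - \varepsilon^{1/n}$ tends to zero. The function names are hypothetical.

```python
def measure_of_deviation(n, eps):
    """Lebesgue measure of {x in [0, 1) : x**n >= eps}, for n >= 1, eps > 0."""
    if eps >= 1.0:
        return 0.0  # x**n < 1 on [0, 1), so the set is empty
    return 1.0 - eps ** (1.0 / n)  # the set is the interval [eps**(1/n), 1)


def grid_estimate(n, eps, points=100_000):
    """Crude cross-check: fraction of grid points k/points with (k/points)**n >= eps."""
    hits = sum(1 for k in range(points) if (k / points) ** n >= eps)
    return hits / points


for n in (1, 10, 100, 1000):
    print(n, measure_of_deviation(n, 0.5))
```

The printed measures decrease towards zero as $n$ grows, which is exactly the condition in Definition 4.2.1 for this particular choice of $\varepsilon$.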
Proposition 4.2.1 Let $(X, S, \mu)$ be a finite measure space, i.e. $\mu(X) < +\infty$. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, converging a.e. to a real-valued measurable function $f$. Then $f_n \xrightarrow{\mu} f$.

Proof: Let $D$ denote the set of all points $x \in X$ such that the sequence $\{f_n(x)\}$ fails to converge to $f(x)$. Thus $\mu(D) = 0$. Let $\varepsilon > 0$. If we set
$$E_m(\varepsilon) = \{x \in X \mid |f_m(x) - f(x)| \ge \varepsilon\},$$
then,
$$D = \bigcup_{\varepsilon > 0} \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} E_m(\varepsilon) = \bigcup_{\varepsilon > 0} \limsup_{n \to \infty} E_n(\varepsilon).$$
Thus, $\mu(\limsup_{n \to \infty} E_n(\varepsilon)) = 0$. Since $\mu(X) < +\infty$, we have (cf. Exercise 1.10),
$$0 = \mu(\limsup_{n \to \infty} E_n(\varepsilon)) \ge \limsup_{n \to \infty} \mu(E_n(\varepsilon)).$$
Thus,
$$0 \le \liminf_{n \to \infty} \mu(E_n(\varepsilon)) \le \limsup_{n \to \infty} \mu(E_n(\varepsilon)) \le 0,$$
from which we deduce that $\lim_{n \to \infty} \mu(E_n(\varepsilon)) = 0$, i.e. $f_n \xrightarrow{\mu} f$.

Example 4.2.1 This result is not valid, in general, in infinite measure spaces. If we consider the set $\mathbb{N}$ with the σ-algebra of all subsets equipped with the counting measure, then it is easy to verify that convergence in measure is just uniform convergence, since $\mu(E) < 1$ implies that $E = \emptyset$. Once again, the same sequence as in Example 4.1.1 gives a sequence converging pointwise everywhere, but not in measure.

The converse is not true, even in finite measure spaces, i.e. convergence in measure does not imply pointwise convergence as the following example shows.

Example 4.2.2 Consider $X = [0, 1)$ equipped with the Lebesgue measure. Consider the function
$$\chi_n^i = \chi_{[\frac{i-1}{n}, \frac{i}{n})}, \quad 1 \le i \le n.$$
Consider the sequence
$$\{\chi_1^1, \chi_2^1, \chi_2^2, \chi_3^1, \chi_3^2, \chi_3^3, \cdots\}.$$
Let $x \in [0, 1)$. For each $n \in \mathbb{N}$, there exists exactly one $i$, $1 \le i \le n$, such that $\chi_n^i(x) = 1$, while $\chi_n^j(x) = 0$ for all $1 \le j \le n$, $j \ne i$. Thus, we see that the above sequence fails to converge at every point $x \in [0, 1)$. On the other hand, if $0 < \varepsilon < 1$, we have
$$m_1(\{x \in [0, 1) \mid |\chi_n^i(x)| \ge \varepsilon\}) = m_1\left(\left[\frac{i-1}{n}, \frac{i}{n}\right)\right) = \frac{1}{n},$$
from which we deduce that this sequence converges to zero in measure.

In the above example, notice that for any $x \in (0, 1)$, we have $\chi_n^1(x) = 0$ for every $n \ge x^{-1}$ (at $x = 0$ we have $\chi_n^1(0) = 1$ for every $n$).
Thus, there exists a subsequence converging to the zero function pointwise almost everywhere. This behaviour is generic, as the following proposition shows.

Proposition 4.2.2 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ which converges in measure to a real-valued measurable function $f$ defined on $X$. Then, there exists a subsequence of $\{f_n\}$ which converges to $f$ almost everywhere.

Proof: Set
$$E_{n,m} = \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{1}{m}\right\}.$$
Then, for every $m \in \mathbb{N}$, we can find a positive integer $n_0(m)$ such that
$$\mu(E_{n_0(m),m}) < \frac{1}{2^m}.$$
Thus,
$$\sum_{m=1}^{\infty} \mu(E_{n_0(m),m}) < +\infty.$$
Hence, by the Borel-Cantelli lemma (cf. Proposition 1.2.6), there exists a measurable set $E$, with measure zero, such that every point of $E^c$ belongs to at most finitely many of the sets $E_{n_0(m),m}$. In other words, for every $x \in E^c$, there exists a positive integer $N$ such that $x \notin E_{n_0(m),m}$ for all $m \ge N$, i.e. for all $m \ge N$, we have
$$|f_{n_0(m)}(x) - f(x)| < \frac{1}{m}.$$
This shows that $f_{n_0(m)}(x) \to f(x)$ for all $x \in E^c$. This completes the proof.

The next result shows that the limit function, under convergence in measure, is defined uniquely up to a set of measure zero.

Proposition 4.2.3 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. Let $f$ and $g$ be real-valued measurable functions defined on $X$ such that $f_n \xrightarrow{\mu} f$ and $f_n \xrightarrow{\mu} g$. Then $f = g$ almost everywhere.

Proof: Let $\varepsilon > 0$. Then,
$$\{x \in X \mid |f(x) - g(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_n(x) - g(x)| \ge \frac{\varepsilon}{2}\right\},$$
since
$$|f(x) - g(x)| \le |f_n(x) - f(x)| + |f_n(x) - g(x)|.$$
Since $f_n \xrightarrow{\mu} f$ and $f_n \xrightarrow{\mu} g$, we deduce, on letting $n \to \infty$, that
$$\mu(\{x \in X \mid |f(x) - g(x)| \ge \varepsilon\}) = 0.$$
The result now follows from the relation
$$\{x \in X \mid |f(x) - g(x)| > 0\} = \bigcup_{n=1}^{\infty} \left\{x \in X \mid |f(x) - g(x)| \ge \frac{1}{n}\right\}.$$

We now investigate the relationship between a sequence being Cauchy in measure and its convergence in measure.
Proposition 4.2.4 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. If the sequence converges in measure, then it is Cauchy in measure.

Proof: Let $f_n \xrightarrow{\mu} f$. Then, by a similar reasoning as in the preceding proof, we have, for $\varepsilon > 0$,
$$\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_m(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}.$$
Given $\delta > 0$, we can then find a positive integer $N$ such that for $n$ and $m$ greater than, or equal to, $N$, the measure of each of the two sets on the right-hand side of the above relation is less than $\frac{\delta}{2}$. Thus for $m, n \ge N$, we have
$$\mu(\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}) < \delta.$$
Thus the sequence $\{f_n\}$ is Cauchy in measure.

Proposition 4.2.5 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, which is Cauchy in measure. If there exists a subsequence $\{f_{n_k}\}$ which converges in measure to a real-valued measurable function $f$ defined on $X$, then $f_n \xrightarrow{\mu} f$.

Proof: Let $\varepsilon > 0$. Then
$$\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}.$$
Let $\delta > 0$ be given. Then, there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and for all $n_k \ge N$, we have
$$\mu\left(\left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}, \qquad \mu\left(\left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}.$$
Thus, for all $n \ge N$,
$$\mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) < \delta,$$
which shows that $f_n \xrightarrow{\mu} f$.

Proposition 4.2.6 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, which converges almost uniformly to a real-valued measurable function $f$ defined on $X$. Then $f_n \xrightarrow{\mu} f$.

Proof: Let $\varepsilon > 0$ and $\delta > 0$ be given. There exists $F \in S$ such that $\mu(F) < \delta$ and such that $f_n \to f$ uniformly on $F^c$. Thus, there exists $n_0 \in \mathbb{N}$ (which depends on $\varepsilon$ and also on $\delta$ since we have chosen $F$ based on $\delta$) such that, for all $x \in F^c$, and for all $n \ge n_0$, we have $|f_n(x) - f(x)| < \varepsilon$.
Hence, for all $n \ge n_0$,
$$\mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) \le \mu(F) < \delta,$$
which proves that $f_n \xrightarrow{\mu} f$.

Proposition 4.2.7 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, which is Cauchy in measure. Then, there exists a subsequence which is almost uniformly Cauchy.

Proof: Since the sequence is Cauchy in measure, given $k \in \mathbb{N}$, there exists $n(k) \in \mathbb{N}$ such that for all $n, m \ge n(k)$, we have
$$\mu\left(\left\{x \in X \mid |f_n(x) - f_m(x)| \ge \frac{1}{2^k}\right\}\right) < \frac{1}{2^k}.$$
Choose $n_1 = n(1) + 1$, $n_2 = \max\{n(2), n_1 + 2\}$, $n_3 = \max\{n(3), n_2 + 3\}$ and so on. Thus, $n_k = \max\{n(k), n_{k-1} + k\}$. Hence we get a strictly increasing sequence $\{n_k\}$ which also satisfies $n_k > k$ for each $k$. Thus, we have a subsequence $\{f_{n_k}\}$ of $\{f_n\}$. Set
$$E_k = \left\{x \in X \mid |f_{n_k}(x) - f_{n_{k+1}}(x)| \ge \frac{1}{2^k}\right\}.$$
Then $\mu(E_k) < 2^{-k}$. Given $\delta > 0$, choose $k$ such that $2^{-(k-1)} < \delta$. Set $F = \cup_{i=k}^{\infty} E_i$. Then
$$\mu(F) \le \sum_{i=k}^{\infty} \mu(E_i) < \frac{1}{2^{k-1}} < \delta.$$
Given $\varepsilon > 0$, choose $N \ge k$ such that $2^{-(N-1)} < \varepsilon$. Now $F^c = \cap_{i=k}^{\infty} E_i^c$. Thus, for all $x \in F^c$ and for all $m \ge \ell \ge N$, we have
$$|f_{n_\ell}(x) - f_{n_m}(x)| \le \sum_{j=\ell}^{m} |f_{n_j}(x) - f_{n_{j+1}}(x)| < \sum_{j=\ell}^{m} \frac{1}{2^j} < \frac{1}{2^{\ell-1}} \le \frac{1}{2^{N-1}} < \varepsilon.$$
Thus $\{f_{n_k}\}$ is a uniformly Cauchy sequence on $F^c$ and $\mu(F) < \delta$, i.e. $\{f_{n_k}\}$ is almost uniformly Cauchy.

Proposition 4.2.8 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions, defined on $X$, which is Cauchy in measure. Then, there exists a real-valued measurable function $f$ defined on $X$ such that $f_n \xrightarrow{\mu} f$.

Proof: Let $\{f_{n_k}\}$ be a subsequence which is almost uniformly Cauchy. Then (cf. Proposition 4.1.2), there exists a real-valued measurable function $f$, defined on $X$, such that $\{f_{n_k}\}$ converges almost uniformly to $f$. By Proposition 4.2.6, we deduce that $f_{n_k} \xrightarrow{\mu} f$, which implies that $f_n \xrightarrow{\mu} f$, by Proposition 4.2.5.

Let us summarize the results on the inter-relationships of the various convergences studied in this chapter so far.
We have defined pointwise convergence a.e., almost uniform convergence and convergence in measure.

• If a sequence of real-valued measurable functions converges almost uniformly, then it converges pointwise a.e. (cf. Proposition 4.1.1) as well as in measure (cf. Proposition 4.2.6).

• If a sequence of real-valued measurable functions converges pointwise a.e., then it converges almost uniformly (cf. Egorov's theorem) and in measure (cf. Proposition 4.2.1), provided the space is a finite measure space.

• If a sequence of real-valued measurable functions converges in measure, then there is a subsequence which converges pointwise a.e. (cf. Proposition 4.2.2) and almost uniformly (cf. Propositions 4.2.7 and 4.1.2). In fact, Propositions 4.2.7, 4.1.2 and 4.1.1 yield another proof of Proposition 4.2.2.

• Almost uniform convergence of a sequence of real-valued measurable functions obviously implies that the sequence is almost uniformly Cauchy and vice-versa (cf. Proposition 4.1.2).

• Convergence in measure of a sequence of real-valued measurable functions implies that the sequence is Cauchy in measure (cf. Proposition 4.2.4) and vice-versa (cf. Proposition 4.2.8).

We will conclude this section by studying the behaviour of convergence in measure with respect to basic algebraic operations on functions.

Proposition 4.2.9 Let $(X, S, \mu)$ be a measure space and let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be sequences of real-valued measurable functions defined on $X$. Let $f_n \xrightarrow{\mu} f$ and let $g_n \xrightarrow{\mu} g$, where $f$ and $g$ are real-valued measurable functions defined on $X$. Let $\alpha$ and $\beta$ be non-zero real scalars. Then $\alpha f_n + \beta g_n \xrightarrow{\mu} \alpha f + \beta g$. We also have that $|f_n| \xrightarrow{\mu} |f|$.

Proof: Let $\varepsilon > 0$. The result follows immediately from the following relations:
$$\{x \in X \mid |(\alpha f_n + \beta g_n)(x) - (\alpha f + \beta g)(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2|\alpha|}\right\} \cup \left\{x \in X \mid |g_n(x) - g(x)| \ge \frac{\varepsilon}{2|\beta|}\right\}$$
and
$$\{x \in X \mid |\,|f_n(x)| - |f(x)|\,| \ge \varepsilon\} \subset \{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}.$$
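The first inclusion in the proof rests on the triangle inequality: if $|f_n(x) - f(x)| < \frac{\varepsilon}{2|\alpha|}$ and $|g_n(x) - g(x)| < \frac{\varepsilon}{2|\beta|}$, then $|(\alpha f_n + \beta g_n)(x) - (\alpha f + \beta g)(x)| < \varepsilon$. The Python sketch below is a sanity check of that pointwise statement on randomly generated values; it is not part of the proof, and the function name, the sampling ranges and the seed are all arbitrary choices made for illustration.

```python
import random


def check_inclusion(alpha, beta, eps, trials=10_000, seed=0):
    """Return True if no sampled point lies in the left-hand set of the
    inclusion without also lying in the right-hand union."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Hypothetical values of f(x), g(x), f_n(x), g_n(x) at one point x;
        # deviations are sampled on the scale of the two thresholds.
        f = rng.uniform(-5, 5)
        g = rng.uniform(-5, 5)
        fn = f + rng.uniform(-1, 1) * eps / abs(alpha)
        gn = g + rng.uniform(-1, 1) * eps / abs(beta)
        in_left = abs((alpha * fn + beta * gn) - (alpha * f + beta * g)) >= eps
        in_right = (abs(fn - f) >= eps / (2 * abs(alpha))
                    or abs(gn - g) >= eps / (2 * abs(beta)))
        if in_left and not in_right:
            return False
    return True


print(check_inclusion(2.0, -3.0, 0.1))
```

Since the measure of the left-hand set is then bounded by the sum of the measures of the two sets on the right, both of which tend to zero, the convergence in measure of $\alpha f_n + \beta g_n$ follows.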
Proposition 4.2.10 Let $(X, S, \mu)$ be a finite measure space and let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be sequences of real-valued measurable functions defined on $X$. Let $f_n \xrightarrow{\mu} f$ and let $g_n \xrightarrow{\mu} g$, where $f$ and $g$ are real-valued measurable functions defined on $X$. Then $f_n g_n \xrightarrow{\mu} fg$.

Proof: It follows from the relation
$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right],$$
that it is enough to show that if $f_n \xrightarrow{\mu} f$, then $f_n^2 \xrightarrow{\mu} f^2$.

Step 1: Let $f = 0$. Then, since
$$\{x \in X \mid |f_n(x)|^2 \ge \varepsilon\} = \{x \in X \mid |f_n(x)| \ge \sqrt{\varepsilon}\},$$
it follows that $f_n^2 \xrightarrow{\mu} 0$ as well.

Step 2: If $f_n \xrightarrow{\mu} f$, then $f_n - f \xrightarrow{\mu} 0$. Now, let
$$E_n = \{x \in X \mid |f(x)| > n\},$$
so that $E_n \downarrow \emptyset$. Since $\mu(X) < +\infty$, it follows that (cf. Proposition 1.2.5) $\mu(E_n) \downarrow 0$. Given $\delta > 0$, choose $m \in \mathbb{N}$ such that $\mu(E_m) < \delta$. Now,
$$\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} = \big(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m\big) \cup \big(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m^c\big).$$
The measure of the first set on the right-hand side of the above relation is, evidently, less than $\delta$. Since $|f(x)| \le m$ for $x \in E_m^c$, it follows that for all points $x$ in the second set on the right-hand side of the above relation, we have
$$\varepsilon \le m|f_n(x) - f(x)|, \quad \text{i.e.} \quad |f_n(x) - f(x)| \ge \frac{\varepsilon}{m}.$$
Thus, we can find $N \in \mathbb{N}$ such that, for all $n \ge N$, we have
$$\mu(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m^c) < \delta.$$
Thus, for all $n \ge N$, we have
$$\mu(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\}) < 2\delta,$$
which shows that $f_n f \xrightarrow{\mu} f^2$.

Step 3: Now
$$f_n^2 - f^2 = (f_n - f)^2 + 2(f_n f - f^2).$$
Since $f_n - f \xrightarrow{\mu} 0$, we have $(f_n - f)^2 \xrightarrow{\mu} 0$. Combining this with the result of Step 2 above, we deduce that $f_n^2 \xrightarrow{\mu} f^2$, which completes the proof.

Example 4.2.3 The above result is not true, in general, in infinite measure spaces. Consider the set $\mathbb{N}$ equipped with the counting measure defined on the σ-algebra of all subsets. Let
$$f_n(k) = \begin{cases} \dfrac{1}{n}, & \text{if } 1 \le k \le n, \\ 0, & \text{if } k > n. \end{cases}$$
Then $f_n \to 0$ uniformly on $\mathbb{N}$ and hence, as already observed earlier, $f_n \xrightarrow{\mu} 0$. Let $g(n) = n$ for all $n \in \mathbb{N}$.
Now $(f_n g)(n) = 1$ for all $n \in \mathbb{N}$ and so $f_n g$ does not converge uniformly to zero and so it does not converge to zero in measure.

4.3 Exercises

4.1 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ converging in measure to a real-valued measurable function $f$ defined on $X$. If $g$ is a real-valued measurable function defined on $X$ such that $f = g$ a.e., show that $f_n \xrightarrow{\mu} g$.

4.2 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be two sequences of real-valued measurable functions defined on $X$ such that $f_n = g_n$ a.e. for every $n$. If $f_n \xrightarrow{\mu} f$, show that $g_n \xrightarrow{\mu} f$, where $f$ is a real-valued measurable function defined on $X$.

4.3 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ converging in measure to a real-valued measurable function $f$ defined on $X$. If the sequence $\{f_n\}$ is almost uniformly Cauchy, show that it converges to $f$ almost uniformly.

4.4 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ converging in measure to a real-valued measurable function $f$ defined on $X$. If the sequence $\{f_n\}$ is pointwise Cauchy a.e., show that $f_n \to f$ almost everywhere.

4.5 Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$ such that every subsequence has a further subsequence which converges in measure to a fixed real-valued measurable function $f$ defined on $X$. Show that $f_n \xrightarrow{\mu} f$.

4.6 (a) Let $(X, S, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on $X$. Let $f$ be a real-valued measurable function defined on $X$. Show that the following statements are equivalent:
(i) $f_n \xrightarrow{\mu} f$.
(ii) Every subsequence of $\{f_n\}$ has a further subsequence converging to $f$ almost uniformly.
(b) If, in addition, $\mu(X) < +\infty$, show that the above statements are equivalent to the following statement:
(iii) Every subsequence of $\{f_n\}$ has a further subsequence converging to $f$ almost everywhere.

4.7 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of real-valued measurable functions defined on $X$ such that $f_n \xrightarrow{\mu} 0$. Let $\{a_n\}$ be a sequence of real numbers such that $a_n \downarrow 0$. Show that there exists a subsequence $\{f_{n_k}\}$ such that for almost every $x \in X$, we have $|f_{n_k}(x)| < a_k$ for sufficiently large $k$.

4.8 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of real-valued measurable functions defined on $X$ converging in measure to a real-valued measurable function $f$ defined on $X$. If, for every $n \in \mathbb{N}$, we have that $f_n \geq 0$ a.e., show that $f \geq 0$ almost everywhere. Deduce that
(i) if $f_n \xrightarrow{\mu} f$ and if for every $n$, $f_n \leq g$ a.e., then $f \leq g$ a.e., where $g$ is a real-valued measurable function defined on $X$;
(ii) if $f_n \xrightarrow{\mu} f$ and if for every $n$, $|f_n| \leq g$ a.e., then $|f| \leq g$ a.e., where $g$ is a real-valued measurable function defined on $X$.

4.9 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of real-valued measurable functions defined on $X$ converging in measure to a real-valued measurable function $f$ defined on $X$. If $f_n \leq f_{n+1}$ for every $n$, show that $f_n \uparrow f$ almost everywhere.

Chapter 5
Integration

5.1 Non-negative simple functions

Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $\varphi : X \to \mathbb{R}$ be a (measurable) simple function which is non-negative. Let $\{\alpha_i\}_{i=1}^n$ be the set of non-zero values assumed by $\varphi$. Set $A_i = \varphi^{-1}(\{\alpha_i\})$, $1 \leq i \leq n$. The sets $\{A_i\}_{i=1}^n$ are mutually disjoint and we can write
$$\varphi = \sum_{i=1}^n \alpha_i \chi_{A_i}. \tag{5.1.1}$$
In order to define the integral of $\varphi$, over the set $X$, with respect to the measure $\mu$, we imitate what one does in order to define the Riemann integral of a step function.
Given a step-function of the form
$$\varphi = \sum_{i=1}^n \alpha_i \chi_{I_i},$$
where $\{I_i\}_{i=1}^n$ is a finite collection of disjoint intervals and the $\alpha_i$ are all non-negative, the Riemann integral of $\varphi$ is nothing but the area under the graph of $\varphi$, i.e.
$$\int_{\mathbb{R}} \varphi(x)\,dx = \sum_{i=1}^n \alpha_i m_1(I_i).$$
Imitating this, we can define, when $\varphi$ is of the form given in (5.1.1),
$$\int_X \varphi\,d\mu = \sum_{i=1}^n \alpha_i \mu(A_i).$$

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019. S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_5

However, a simple function may be written in more than one way as a finite linear combination of characteristic functions. For instance, each set $A_i$ may be partitioned into subsets and $\varphi$ could be written in terms of the characteristic functions of those subsets. It is also possible to express $\varphi$ as a linear combination of characteristic functions of sets which are not mutually disjoint. Thus, we would like to define the integral in a manner which is independent of the way the function is written.

Let us assume that we can write $\varphi$, given by (5.1.1), in the form
$$\varphi = \sum_{j=1}^m \beta_j \chi_{B_j},$$
where the sets $\{B_j\}_{j=1}^m$ are also mutually disjoint (and, as we may assume, non-empty, with each $\beta_j$ non-zero). Then, it follows that each $\beta_j$ is equal to $\alpha_i$ for some unique index $i$. In that case, we have that $B_j \subset A_i$. Further, we have that
$$A_i = \bigcup_{\{j \mid \beta_j = \alpha_i\}} B_j.$$
Since $\mu$ is finitely additive, we have that
$$\mu(A_i) = \sum_{\{j \mid \beta_j = \alpha_i\}} \mu(B_j).$$
It is now immediate that
$$\sum_{j=1}^m \beta_j \mu(B_j) = \sum_{i=1}^n \alpha_i \mu(A_i). \tag{5.1.2}$$
Now let us assume that we write
$$\varphi = \sum_{i=1}^k \gamma_i \chi_{E_i}, \tag{5.1.3}$$
where the sets $\{E_i\}_{i=1}^k$ are not necessarily disjoint. Let $\sigma = (\sigma_1, \cdots, \sigma_k)$ be a $k$-tuple, where $\sigma_i = \pm 1$ for each $1 \leq i \leq k$. Define, for $A \subset X$,
$$A^{\sigma_i} = \begin{cases} A, & \text{if } \sigma_i = 1, \\ A^c, & \text{if } \sigma_i = -1. \end{cases}$$
Set
$$E^\sigma = \bigcap_{i=1}^k E_i^{\sigma_i}.$$
Thus, if $\sigma_0 = (-1, \cdots, -1)$, then
$$E^{\sigma_0} = \bigcap_{i=1}^k E_i^c = \left(\bigcup_{i=1}^k E_i\right)^c.$$
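The construction of the sets $E^\sigma$ above can be tried out on a small finite example. The following Python sketch is only an illustration: the function name `atoms`, the ground set `X` and the three overlapping sets in `E` are hypothetical choices made here, not taken from the text. It enumerates every sign pattern $\sigma \in \{+1, -1\}^k$, forms $E^\sigma$, and checks numerically that $E^{\sigma_0}$ is the complement of the union of the $E_i$, and that the sets $E^\sigma$ are pairwise disjoint and cover $X$.

```python
from itertools import combinations, product


def atoms(X, sets):
    """Return a dict mapping each sign pattern sigma to the set E^sigma.

    E^sigma is the intersection over i of E_i (if sigma_i == +1)
    or of the complement X - E_i (if sigma_i == -1).
    """
    result = {}
    for sigma in product((+1, -1), repeat=len(sets)):
        atom = set(X)
        for sign, E_i in zip(sigma, sets):
            atom &= E_i if sign == +1 else (X - E_i)
        result[sigma] = atom
    return result


# Hypothetical example: X = {1, ..., 8} with three overlapping subsets.
X = set(range(1, 9))
E = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
A = atoms(X, E)

sigma0 = (-1,) * len(E)
complement_of_union = X - set().union(*E)

# Distinct sign patterns give disjoint sets, and together they cover X.
pairwise_disjoint = all(not (A[s] & A[t]) for s, t in combinations(A, 2))
covers_X = set().union(*A.values()) == X

print(A[sigma0] == complement_of_union, pairwise_disjoint, covers_X)
```

Some of the $2^k$ sets $E^\sigma$ may be empty; this is harmless, since empty sets contribute nothing to any sum of measures.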
Given two such $k$-tuples $\sigma$ and $\sigma'$ which are not equal, there must exist $i$ with $1 \leq i \leq k$ such that $\sigma_i \neq \sigma_i'$. Without loss of generality, assume that $\sigma_i = +1$ and $\sigma_i' = -1$. In that case, by definition, $E^\sigma \subset E_i$ while $E^{\sigma'} \subset E_i^c$. Thus, if $\sigma \neq \sigma'$, we have that $E^\sigma$ and $E^{\sigma'}$ are disjoint.

Lemma 5.1.1 With the preceding notations, we have, for each $1 \leq i \leq k$,
$$E_i = \bigcup_{\{\sigma \mid \sigma_i = +1\}} E^\sigma. \tag{5.1.4}$$

Proof: If $\sigma_i = +1$, then $E^\sigma \subset E_i$. Thus the set on the right-hand side of (5.1.4) is contained in $E_i$. Conversely, let $x \in E_i$. Define $\sigma$ as follows: $\sigma_j = +1$ if $x \in E_j$ and $\sigma_j = -1$ if $x \notin E_j$, where $1 \leq j \leq k$. In particular, $\sigma_i = +1$ and $x \in E^\sigma$. This establishes the reverse inclusion in (5.1.4) and thus completes the proof.

Now let us assume that $\varphi$ is given in the form (5.1.3). For any $1 \leq i \leq k$, we have
$$\chi_{E_i} = \chi_{\bigcup_{\{\sigma \mid \sigma_i = +1\}} E^\sigma} = \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^\sigma}.$$
Consequently, we have
$$\varphi = \sum_{i=1}^k \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^\sigma} = \sum_{\sigma \neq \sigma_0} \Bigg(\sum_{\{i \mid \sigma_i = +1\}} \gamma_i\Bigg) \chi_{E^\sigma}.$$
By virtue of (5.1.2), we get, since the $E^\sigma$ are disjoint,
$$\begin{aligned}
\sum_{i=1}^n \alpha_i \mu(A_i) &= \sum_{\sigma \neq \sigma_0} \Bigg(\sum_{\{i \mid \sigma_i = +1\}} \gamma_i\Bigg) \mu(E^\sigma) \\
&= \sum_{i=1}^k \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \mu(E^\sigma) \\
&= \sum_{i=1}^k \gamma_i \mu(E_i).
\end{aligned}$$
The last equality comes from the finite additivity of the measure and from the result of Lemma 5.1.1 above. Thus we can now make the following definition, which is independent of the representation of a simple function.

Definition 5.1.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\varphi$ be a non-negative simple function given by
$$\varphi = \sum_{i=1}^k \gamma_i \chi_{E_i}.$$
The (Lebesgue) integral of $\varphi$, over the set $X$, with respect to the measure $\mu$, is given by
$$\int_X \varphi\,d\mu = \sum_{i=1}^k \gamma_i \mu(E_i).$$

Remark 5.1.1 Notice that the measure of some (or all) of the sets $E_i$ could be $+\infty$. Thus the integral of $\varphi$ could be $+\infty$ as well. This is the reason why we only consider non-negative functions. If the $\gamma_i$ were of different signs and if the corresponding sets were of infinite measure, then we could not add the terms meaningfully.
For consistency, if $\gamma_i = 0$ and the set $E_i$ has infinite measure, we adopt the convention that $0 \cdot \infty = 0$.

Remark 5.1.2 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $E$ be a measurable subset of $X$. Then we can consider the $\sigma$-algebra $\mathcal{S}_E$ of sets of the form $A \cap E$, where $A \in \mathcal{S}$, defined on $E$, and the restriction of the measure $\mu$ to this $\sigma$-algebra. If $\varphi$ is a non-negative simple function defined on $X$ given by (5.1.3), then its restriction to $E$ is given by
$$\varphi|_E = \sum_{i=1}^k \gamma_i \chi_{E_i \cap E}.$$
We define the integral of $\varphi|_E$, over $E$, with respect to the measure $\mu$ restricted to $E$, as the integral of $\varphi$, over the set $E$, with respect to the measure $\mu$, and denote it by $\int_E \varphi\,d\mu$. Clearly we have
$$\int_E \varphi\,d\mu = \sum_{i=1}^k \gamma_i \mu(E_i \cap E) = \int_X \varphi \chi_E\,d\mu.$$

Remark 5.1.3 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\varphi$ be a non-negative simple function given by (5.1.1). Let $\psi$ be another non-negative simple function such that $\psi \leq \varphi$. Then, clearly, we can write
$$\psi = \sum_{j=1}^m \beta_j \chi_{F_j},$$
where each $F_j$ is a subset of some unique $A_i$ and, in that case, $0 \leq \beta_j \leq \alpha_i$. Then it is clear that we have
$$\int_X \psi\,d\mu \leq \int_X \varphi\,d\mu.$$

5.2 Non-negative functions

Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f$ be a non-negative, extended real-valued measurable function defined on $X$. Recall that (cf. Theorem 3.1.1) $f$ is the increasing limit of a sequence of non-negative simple functions.

Definition 5.2.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative, extended real-valued measurable function defined on $X$. Then the (Lebesgue) integral of $f$, over the set $X$, with respect to the measure $\mu$, is defined by
$$\int_X f\,d\mu = \sup_{\substack{0 \leq \varphi \leq f \\ \varphi \text{ simple}}} \int_X \varphi\,d\mu.$$
If $E \subset X$ is a measurable set, then we define the integral of $f$, over the set $E$, with respect to the measure $\mu$, by
$$\int_E f\,d\mu = \int_X f \chi_E\,d\mu.$$

Remark 5.2.1 In view of Remarks 5.1.2 and 5.1.3, it is clear that the above definition is consistent with the definitions made in the previous section in case $f$ is itself a non-negative simple function.
Again, if $\varphi$ is a simple function such that $0 \leq \varphi \leq f$, then $\varphi \chi_E$ is a non-negative simple function defined on $E$ which is bounded above by $f|_E$. Conversely, if $\varphi$ is a non-negative simple function defined on $E$ bounded above by $f|_E$, then its extension to all of $X$ by setting it to be zero outside $E$ is also a non-negative simple function defined on $X$ and is bounded above by $f$. Thus, we can easily see that the integral of $f$, over the set $E$, with respect to the measure $\mu$, defined above is the same as the integral of the function $f|_E$, over the set $E$, with respect to the restriction of the measure $\mu$ to $E$.

Remark 5.2.2 Notice that the integral of a non-negative real-valued function may be infinite.

The following proposition is an immediate consequence of the definition of the integral for non-negative functions.

Proposition 5.2.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative, extended real-valued measurable function defined on $X$.
(a) If $g$ is a measurable function defined on $X$ such that $0 \leq g \leq f$, and if $E$ is a measurable subset of $X$, then
$$\int_E g\,d\mu \leq \int_E f\,d\mu. \tag{5.2.1}$$
(b) If $E$ and $F$ are measurable subsets of $X$ such that $E \subset F$, then
$$\int_E f\,d\mu \leq \int_F f\,d\mu. \tag{5.2.2}$$
(c) If $c$ is a non-negative real number, and if $E$ is a measurable subset of $X$, then
$$\int_E cf\,d\mu = c \int_E f\,d\mu. \tag{5.2.3}$$
(d) If $E$ is a measurable subset of $X$ such that $f(x) = 0$ for all $x \in E$, then
$$\int_E f\,d\mu = 0.$$
(e) If $E$ is a measurable subset of $X$ such that $\mu(E) = 0$, then
$$\int_E f\,d\mu = 0.$$

Proposition 5.2.2 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f : X \to \mathbb{R}$ be a non-negative measurable function. If $\int_X f\,d\mu = 0$, then $f = 0$ almost everywhere.

Proof: Set
$$F_n = \left\{x \in X \;\Big|\; f(x) > \frac{1}{n}\right\}, \quad n \in \mathbb{N}.$$
Then
$$\{x \in X \mid f(x) \neq 0\} = \bigcup_{n=1}^\infty F_n.$$
Then, by virtue of (5.2.1) and (5.2.2), we get
$$\frac{1}{n}\mu(F_n) \leq \int_{F_n} f\,d\mu \leq \int_X f\,d\mu = 0.$$
Thus, $\mu(F_n) = 0$ for each $n \in \mathbb{N}$ and the result follows.
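The chain of inequalities used in the proof above can be checked numerically on a toy example. The sketch below is a hedged illustration, not part of the text: it assumes a hypothetical measure space consisting of five points with point masses `mu[x]`, so that every non-negative function is simple and its integral over a set is a finite weighted sum. The names `mu`, `f`, `integral`, `measure` and `F` are all introduced here for illustration only.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..4, with mu({x}) = mu[x].
mu = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(1, 3),
      3: Fraction(1), 4: Fraction(5)}

# A non-negative function on X (automatically simple, since X is finite).
f = {0: Fraction(0), 1: Fraction(1, 4), 2: Fraction(3),
     3: Fraction(0), 4: Fraction(1, 10)}

X = set(mu)


def measure(E):
    """mu(E) for a subset E of X."""
    return sum((mu[x] for x in E), Fraction(0))


def integral(g, E):
    """Integral of g over E: the sum of g(x) * mu({x}) over x in E."""
    return sum((g[x] * mu[x] for x in E), Fraction(0))


def F(n):
    """The set F_n = {x in X : f(x) > 1/n} from the proof."""
    return {x for x in X if f[x] > Fraction(1, n)}


# For each n, record (mu(F_n)/n, integral of f over F_n, integral of f over X).
checks = [(measure(F(n)) / n, integral(f, F(n)), integral(f, X))
          for n in range(1, 21)]

print(all(a <= b <= c for a, b, c in checks))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparisons are not affected by floating-point rounding. In this example the integral of $f$ over $X$ is not zero, so the check only illustrates the inequalities $\frac{1}{n}\mu(F_n) \leq \int_{F_n} f\,d\mu \leq \int_X f\,d\mu$, which hold regardless of the value of the integral.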
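Proposition 5.2.3 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\varphi : X \to \mathbb{R}$ be a non-negative simple function. Define, for $E \in \mathcal{S}$,
$$\nu(E) = \int_E \varphi\,d\mu.$$
Then, $\nu$ defines a measure on $\mathcal{S}$.

Proof: Clearly $\nu$ is non-negative and $\nu(\emptyset) = 0$. We just need to check countable additivity. Let $\{E_i\}_{i=1}^\infty$ be a sequence of measurable sets in $X$ which are mutually disjoint. Let $E = \bigcup_{i=1}^\infty E_i$. Let $\varphi = \sum_{j=1}^k \alpha_j \chi_{A_j}$. Then
$$\begin{aligned}
\nu(E) = \int_E \varphi\,d\mu &= \sum_{j=1}^k \alpha_j \mu(A_j \cap E) \\
&= \sum_{j=1}^k \alpha_j \sum_{i=1}^\infty \mu(A_j \cap E_i) = \sum_{i=1}^\infty \sum_{j=1}^k \alpha_j \mu(A_j \cap E_i) \\
&= \sum_{i=1}^\infty \int_{E_i} \varphi\,d\mu = \sum_{i=1}^\infty \nu(E_i).
\end{aligned}$$
This completes the proof.

Proposition 5.2.4 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\varphi$ and $\psi$ be non-negative simple functions defined on $X$. Then
$$\int_X (\varphi + \psi)\,d\mu = \int_X \varphi\,d\mu + \int_X \psi\,d\mu. \tag{5.2.4}$$

Proof: Let $\varphi = \sum_{i=1}^n \alpha_i \chi_{A_i}$ and let $\psi = \sum_{j=1}^m \beta_j \chi_{B_j}$, where the $\{A_i\}_{i=1}^n$ and the $\{B_j\}_{j=1}^m$ are collections of mutually disjoint sets, each with union $X$ (allowing, for this purpose, some of the coefficients to be zero). Set $E_{ij} = A_i \cap B_j$. Then
$$\varphi + \psi = \sum_{i=1}^n \sum_{j=1}^m (\alpha_i + \beta_j) \chi_{E_{ij}}.$$
Then,
$$\int_{E_{ij}} (\varphi + \psi)\,d\mu = (\alpha_i + \beta_j)\mu(E_{ij}) = \int_{E_{ij}} \varphi\,d\mu + \int_{E_{ij}} \psi\,d\mu.$$
The result now follows immediately from the preceding proposition since the $E_{ij}$ are all disjoint and their union is $X$.

We are now in a position to prove the first important theorem which shows how the (Lebesgue) integral handles limit processes.

Theorem 5.2.1 (Monotone convergence theorem) Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of non-negative measurable functions defined on $X$ such that, for every $x \in X$,
(i) $0 \leq f_1(x) \leq f_2(x) \leq \cdots \leq f_n(x) \leq \cdots$, and,
(ii) $\lim_{n \to \infty} f_n(x) = f(x)$.
Then
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$

Proof: Let
$$\alpha = \sup_n \int_X f_n\,d\mu.$$
By Proposition 5.2.1 (a), we have
$$\int_X f_1\,d\mu \leq \int_X f_2\,d\mu \leq \cdots \leq \int_X f_n\,d\mu \leq \cdots,$$
so that $\alpha$ is the limit of the integrals $\int_X f_n\,d\mu$ as $n \to \infty$. Since $f_n \leq f$, we also have $\int_X f_n\,d\mu \leq \int_X f\,d\mu$. Thus, it follows that
$$\alpha \leq \int_X f\,d\mu.$$
We now need to prove the reverse inequality, which will complete the proof. Let $0 < c < 1$ be any fixed constant. Let $\varphi$ be a simple function such that $0 \leq \varphi \leq f$. For $n \in \mathbb{N}$, define
$$E_n = \{x \in X \mid f_n(x) \geq c\varphi(x)\}.$$
Then each set $E_n$ is measurable.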
Since the sequence $\{f_n\}_{n=1}^\infty$ is increasing, we have that $E_1 \subset E_2 \subset \cdots \subset E_n \subset \cdots$. Further, given any $x \in X$, we have two possibilities.
• Either $f(x) = 0$, which implies that $f_n(x) = 0$ for all $n$ and also that $\varphi(x) = 0$. In this case $x \in E_1$.
• Or $f(x) > 0$, which implies that $f(x) > c\varphi(x)$, since $0 < c < 1$. In this case, there exists $n \in \mathbb{N}$ such that $c\varphi(x) < f_n(x) \leq f(x)$ and so $x \in E_n$.
Thus, $X = \bigcup_{n=1}^\infty E_n$. Now,
$$\int_X f_n\,d\mu \geq \int_{E_n} f_n\,d\mu \geq c \int_{E_n} \varphi\,d\mu \stackrel{\text{def}}{=} c\nu(E_n).$$
But, by Proposition 5.2.3, $\nu$ is a measure and so, since the sets $E_n$ increase to $X$,
$$\alpha \geq c \lim_{n \to \infty} \nu(E_n) = c\nu(X) = c \int_X \varphi\,d\mu.$$
Since this is true for any simple function $\varphi$ satisfying $0 \leq \varphi \leq f$, we get, by definition of the integral, that
$$\alpha \geq c \int_X f\,d\mu.$$
Since $0 < c < 1$ was arbitrarily chosen, it now follows, on letting $c$ tend to unity, that
$$\alpha \geq \int_X f\,d\mu,$$
which completes the proof.

Remark 5.2.3 It is possible that $\int_X f\,d\mu$ is infinite. In that case, we conclude from the preceding theorem that the limit of $\int_X f_n\,d\mu$, as $n \to \infty$, is also infinite.

Proposition 5.2.5 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of non-negative extended real-valued measurable functions defined on $X$. Set
$$f(x) = \sum_{n=1}^\infty f_n(x), \quad x \in X.$$
Then $f$ is a non-negative measurable function and
$$\int_X f\,d\mu = \sum_{n=1}^\infty \int_X f_n\,d\mu.$$

Proof: Any finite sum of non-negative measurable functions is measurable (cf. Remark 3.1.2). If $g_n = f_1 + \cdots + f_n$, then $g_n$ increases to $f$. Thus, for any $\alpha \in \mathbb{R}$, we have
$$\{x \in X \mid f(x) \leq \alpha\} = \bigcap_{n=1}^\infty \{x \in X \mid g_n(x) \leq \alpha\},$$
which shows that $f$ is measurable.

Let $\{\varphi_n\}_{n=1}^\infty$ and $\{\psi_n\}_{n=1}^\infty$ be increasing sequences of non-negative simple functions increasing to $f_1$ and $f_2$ respectively (cf. Theorem 3.1.1). By Proposition 5.2.4,
$$\int_X (\varphi_n + \psi_n)\,d\mu = \int_X \varphi_n\,d\mu + \int_X \psi_n\,d\mu.$$
We also have that $\varphi_n + \psi_n$ increases to $f_1 + f_2$. Hence, by the monotone convergence theorem, we get
$$\lim_{n \to \infty} \int_X (\varphi_n + \psi_n)\,d\mu = \int_X (f_1 + f_2)\,d\mu,$$
and
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X f_1\,d\mu, \qquad \lim_{n \to \infty} \int_X \psi_n\,d\mu = \int_X f_2\,d\mu.$$
We thus conclude that
$$\int_X (f_1 + f_2)\,d\mu = \int_X f_1\,d\mu + \int_X f_2\,d\mu.$$
It now follows by induction that, for any $n \in \mathbb{N}$,
$$\int_X (f_1 + \cdots + f_n)\,d\mu = \sum_{k=1}^n \int_X f_k\,d\mu.$$
Setting $g_n = f_1 + \cdots + f_n$, we get that $g_n$ is non-negative, measurable and increases to $f$. Thus, the result follows, once again, by an application of the monotone convergence theorem.

Example 5.2.1 (Integration with respect to the counting measure) Let $X = \mathbb{N}$ be equipped with the counting measure. Let $f : X \to \mathbb{R}$ be a given non-negative function. Let $f(k) = a_k \geq 0$, $k \in \mathbb{N}$. If we define
$$f_n(k) = \begin{cases} a_k, & \text{if } 1 \leq k \leq n, \\ 0, & \text{if } k > n, \end{cases}$$
then $f_n$ increases to $f$. Notice that
$$f_n = \sum_{k=1}^n a_k \chi_{\{k\}}$$
is a non-negative simple function and so, by definition,
$$\int_X f_n\,d\mu = \sum_{k=1}^n a_k \mu(\{k\}) = \sum_{k=1}^n a_k.$$
Thus, by the monotone convergence theorem, it follows that
$$\int_X f\,d\mu = \sum_{k=1}^\infty a_k.$$
The integral of a non-negative function over $\mathbb{N}$ with respect to the counting measure is just the summation of the values of the function.

Example 5.2.2 (Integration with respect to the Dirac measure) Let $X$ be a non-empty set and let $x_0 \in X$. Let $\mu$ be the Dirac measure concentrated at the point $x_0$ (cf. Example 1.2.2). Let $\varphi$ be a simple function defined as in (5.1.1). Then $x_0$ can belong to at most one set $A_i$. If $x_0 \notin A_j$ for every $1 \leq j \leq n$, then $\mu(A_j) = 0$ for all $1 \leq j \leq n$ and we have
$$\int_X \varphi\,d\mu = 0 = \varphi(x_0).$$
If $1 \leq i_0 \leq n$ is such that $x_0 \in A_{i_0}$, then we see that
$$\int_X \varphi\,d\mu = \alpha_{i_0} = \varphi(x_0).$$
Now, if $f$ is any non-negative extended real-valued measurable function defined on $X$, it is the increasing limit of non-negative simple functions and, by the monotone convergence theorem, it immediately follows that
$$\int_X f\,d\mu = f(x_0).$$
Thus, integration of a non-negative function with respect to the Dirac measure is just evaluation of the function at the point where the measure is concentrated.

Example 5.2.3 Let $\{a_{ij}\}_{i,j=1}^\infty$ be a double sequence of non-negative real numbers.
Let $X = \mathbb{N}$ be equipped with the counting measure. Define $f_i(j) = a_{ij}$, $1 \leq i, j < \infty$. Define $f = \sum_{i=1}^\infty f_i$. Then
$$f(j) = \sum_{i=1}^\infty a_{ij}.$$
Now, by Proposition 5.2.5, we get
$$\int_X f\,d\mu = \sum_{i=1}^\infty \int_X f_i\,d\mu.$$
Using the result of Example 5.2.1, this translates into the following relation:
$$\sum_{j=1}^\infty f(j) = \sum_{i=1}^\infty \sum_{j=1}^\infty f_i(j).$$
Substituting the values for $f(j)$ and $f_i(j)$, we get
$$\sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij} = \sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij}.$$
Thus, we have shown that, for a non-negative double sequence of reals, the order of summation can be reversed. (Of course, both sums could be infinite.) This result is not true in general for sequences which change sign. We will later see sufficient conditions which ensure that the order of summation is immaterial.

Theorem 5.2.2 (Fatou's lemma) Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of non-negative extended real-valued measurable functions defined on $X$. Then
$$\int_X \left(\liminf_{n \to \infty} f_n\right) d\mu \leq \liminf_{n \to \infty} \int_X f_n\,d\mu.$$

Proof: Set $g_n(x) = \inf_{i \geq n} f_i(x)$ for $x \in X$. Then $\{g_n\}_{n=1}^\infty$ is an increasing sequence of non-negative measurable functions whose limit is $\liminf_{n \to \infty} f_n$. Thus, by the monotone convergence theorem, and since $g_n \leq f_i$ for all $i \geq n$, we get
$$\begin{aligned}
\int_X \left(\liminf_{n \to \infty} f_n\right) d\mu &= \lim_{n \to \infty} \int_X g_n\,d\mu \\
&\leq \lim_{n \to \infty} \inf_{i \geq n} \int_X f_i\,d\mu = \liminf_{n \to \infty} \int_X f_n\,d\mu.
\end{aligned}$$
This completes the proof.

Example 5.2.4 We can have strict inequality in Fatou's lemma. Let $X = \mathbb{R}$ be equipped with the Lebesgue measure, $m_1$. Set $f_n = \chi_{[n,n+1)}$. Then $f_n(x) \to 0$ for each $x \in \mathbb{R}$ and so $\int_{\mathbb{R}} (\liminf_{n \to \infty} f_n)\,dm_1 = 0$. On the other hand, for every $n \in \mathbb{N}$, we have $\int_{\mathbb{R}} f_n\,dm_1 = m_1([n, n+1)) = 1$.

The following result is a variation of the monotone convergence theorem.

Proposition 5.2.6 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ and $f$ be non-negative, extended real-valued measurable functions defined on $X$. Assume that, for all $n \in \mathbb{N}$, and for all $x \in X$, we have $0 \leq f_n(x) \leq f(x)$, and that $f_n(x) \to f(x)$ as $n \to \infty$. Then
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$

Proof: By Fatou's lemma and the fact that the $f_n$ are all bounded above by $f$, we get
$$\int_X f\,d\mu \leq \liminf_{n \to \infty} \int_X f_n\,d\mu \leq \limsup_{n \to \infty} \int_X f_n\,d\mu \leq \int_X f\,d\mu,$$
from which the desired result follows immediately.

We conclude this section with a generalization of Proposition 5.2.3.

Proposition 5.2.7 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative extended real-valued measurable function defined on $X$. Define
$$\nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{S}.$$
Then $\nu$ defines a measure on $\mathcal{S}$. If $g$ is any non-negative extended real-valued measurable function defined on $X$, then we have
$$\int_X g\,d\nu = \int_X gf\,d\mu. \tag{5.2.5}$$

Proof: Clearly $\nu(\emptyset) = 0$ and $\nu(E) \geq 0$ for all $E \in \mathcal{S}$. Let $\{E_i\}_{i=1}^\infty$ be a sequence of mutually disjoint sets in $\mathcal{S}$ whose union is $E$. Then
$$\chi_E = \sum_{i=1}^\infty \chi_{E_i}.$$
Thus, using Proposition 5.2.5, we get
$$\begin{aligned}
\nu(E) = \int_E f\,d\mu &= \int_X f \chi_E\,d\mu \\
&= \sum_{i=1}^\infty \int_X f \chi_{E_i}\,d\mu = \sum_{i=1}^\infty \int_{E_i} f\,d\mu \\
&= \sum_{i=1}^\infty \nu(E_i).
\end{aligned}$$
This establishes the countable additivity and thus $\nu$ is a measure. Now, let $g = \chi_E$, where $E \in \mathcal{S}$. Then,
$$\int_X g\,d\nu = \nu(E) = \int_E f\,d\mu = \int_X fg\,d\mu.$$
Thus (5.2.5) is true when $g$ is a characteristic function. By the linearity of the integral with respect to the integrand (cf. (5.2.3) and (5.2.4)), it follows that (5.2.5) is true when $g$ is any non-negative simple function. Then, by the monotone convergence theorem, (5.2.5) is true when $g$ is any non-negative extended real-valued measurable function.

Remark 5.2.4 The above method of proof is very useful in proving several identities involving integrals of non-negative measurable functions. We first prove an identity for characteristic functions, then, by linearity, for simple functions and then, by the monotone convergence theorem, for arbitrary non-negative measurable functions.

Remark 5.2.5 The result of (5.2.5) is often symbolically written as
$$d\nu = f\,d\mu.$$
An important converse of this result is known as the Radon-Nikodym theorem, which we will see much later.

5.3 Integrable functions

Let $(X, \mathcal{S}, \mu)$ be a measure space.
We now consider an arbitrary measurable function $f$ defined on $X$. Since any function $f$ can be split into its positive and negative parts (cf. Remark 3.1.3) as $f = f^+ - f^-$, we may, in view of the linearity of the integral with respect to the integrand, try to define the integral of $f$ as the difference between the integrals of $f^+$ and $f^-$, which are well-defined, since these functions are non-negative. However, if both these integrals turn out to be infinite, we cannot define their difference. So we need that at least one of the two quantities, $\int_X f^+\,d\mu$ or $\int_X f^-\,d\mu$, be finite. In view of this requirement, we make the following definition.

Definition 5.3.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a measurable function defined on $X$. The function $f$ is said to be (Lebesgue) integrable if
$$\int_X |f|\,d\mu < +\infty.$$

Since $|f| = f^+ + f^-$, it now follows from the definition of integrability that both $\int_X f^+\,d\mu$ and $\int_X f^-\,d\mu$ are finite and so we are now in a position to define the integral of an integrable function.

Definition 5.3.2 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be an integrable function defined on $X$. Then the (Lebesgue) integral of $f$ over $X$ with respect to the measure $\mu$ is defined by
$$\int_X f\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu. \tag{5.3.1}$$

At this point, it is easy for us to consider complex-valued functions as well. Let $f : X \to \mathbb{C}$ be a given function, written in terms of its real and imaginary parts, as $f = u + iv$.

Definition 5.3.3 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a complex-valued function defined on $X$. It is said to be measurable if its real and imaginary parts are measurable real-valued functions. It is said to be integrable if, in addition, $|f|$ is integrable.

If $f$ is an integrable complex-valued function, then, clearly, its real and imaginary parts are also integrable, for, if $f = u + iv$, then $|u| \leq |f|$ and $|v| \leq |f|$. Thus we may now define, in this case,
$$\int_X f\,d\mu = \int_X u\,d\mu + i \int_X v\,d\mu. \tag{5.3.2}$$

Notation Let $(X, \mathcal{S}, \mu)$ be a measure space. We generally use the symbol $\int_X f\,d\mu$ to denote the (Lebesgue) integral of an integrable function defined over $X$, with respect to the measure $\mu$. However, there may arise situations when $f$ depends on more than one variable. For instance, if $f$ is a function of two variables $x$ and $y$ varying over different measure spaces, say, $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{S}', \nu)$, and if we wish to integrate $f$ as a function of $x$ with $y$ fixed in $Y$, we will write the integral as
$$\int_X f(x, y)\,d\mu(x).$$

We now prove the full linearity of the Lebesgue integral with respect to the integrand.

Theorem 5.3.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ and $g$ be integrable, complex-valued functions defined on $X$. Let $\alpha$ and $\beta$ be complex constants. Then
$$\int_X (\alpha f + \beta g)\,d\mu = \alpha \int_X f\,d\mu + \beta \int_X g\,d\mu. \tag{5.3.3}$$

Proof: By definition, $\alpha f + \beta g$ is clearly measurable. Further, since $|\alpha f + \beta g| \leq |\alpha||f| + |\beta||g|$, it follows, from the linearity of the Lebesgue integral with respect to non-negative integrands and non-negative constants (cf. (5.2.3) and the proof of Proposition 5.2.5), that $\alpha f + \beta g$ is integrable as well. We will now show that
$$\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \tag{5.3.4}$$
Again, by definition of the integral for complex-valued functions, it is enough to prove (5.3.4) when $f$ and $g$ are real-valued. Assuming this, let $h = f + g$. Then
$$h^+ - h^- = f + g = f^+ - f^- + g^+ - g^-,$$
which implies that
$$h^+ + f^- + g^- = f^+ + g^+ + h^-.$$
Since all the functions involved in the above relation are non-negative, we deduce that (cf. the proof of Proposition 5.2.5)
$$\int_X h^+\,d\mu + \int_X f^-\,d\mu + \int_X g^-\,d\mu = \int_X f^+\,d\mu + \int_X g^+\,d\mu + \int_X h^-\,d\mu.$$
Since all the quantities in the above relation are finite, we can rearrange the terms to get
$$\int_X h^+\,d\mu - \int_X h^-\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu + \int_X g^+\,d\mu - \int_X g^-\,d\mu,$$
which is exactly (5.3.4).
Finally, we show that if $c \in \mathbb{C}$ and if $f$ is a complex-valued integrable function defined on $X$, we have
$$\int_X cf\,d\mu = c \int_X f\,d\mu, \tag{5.3.5}$$
which will complete the proof. If $c \geq 0$, then (5.3.5) follows from the definition of the integral and from (5.2.3). If $c = -1$, then the result is again true since, for a real-valued function $f$, we have $(-f)^+ = f^-$ and $(-f)^- = f^+$, and we can again use the definition of the integral. Thus, (5.3.5) is true for all real constants $c$. The proof will be complete if we can prove the relation when $c = i$. Let $f = u + iv$ be the decomposition of $f$ into its real and imaginary parts. Then, by definition,
$$\int_X if\,d\mu = \int_X (-v + iu)\,d\mu = -\int_X v\,d\mu + i \int_X u\,d\mu = i \int_X f\,d\mu.$$
This completes the proof.

The next result is very important for estimating integrals.

Theorem 5.3.2 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a complex-valued integrable function defined on $X$. Then
$$\left|\int_X f\,d\mu\right| \leq \int_X |f|\,d\mu. \tag{5.3.6}$$

Proof: If $z \in \mathbb{C}$, then $z = |z|e^{i\theta}$, where $0 \leq \theta < 2\pi$. Thus, we can write $|z| = \alpha z$, where $\alpha = e^{-i\theta}$ satisfies $|\alpha| = 1$. Let
$$\left|\int_X f\,d\mu\right| = \alpha \int_X f\,d\mu,$$
where $|\alpha| = 1$. Let $u$ denote the real part of the function $\alpha f$. Then $u \leq |\alpha f| = |f|$. Then, by the preceding theorem, we have
$$\left|\int_X f\,d\mu\right| = \alpha \int_X f\,d\mu = \int_X \alpha f\,d\mu = \int_X u\,d\mu \leq \int_X |f|\,d\mu.$$
(In the above chain of equalities, we used the fact that the integral of $\alpha f$ is, in fact, a real quantity and so $\int_X \alpha f\,d\mu = \int_X u\,d\mu$.) This completes the proof.

We will now prove a result which, without exaggeration, could be called the high point of the theory of the Lebesgue integral. It proves that we can interchange the processes of limits and integration under fairly simple conditions.

Theorem 5.3.3 (Dominated convergence theorem) Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of (complex-valued) integrable functions defined on $X$, converging pointwise to a function $f$.
Assume, further, that for all $x \in X$, and for all $n \in \mathbb{N}$, we have
$$|f_n(x)| \leq g(x),$$
where $g$ is a non-negative integrable function defined on $X$. Then, $f$ is integrable. Further,
$$\lim_{n \to \infty} \int_X |f_n - f|\,d\mu = 0. \tag{5.3.7}$$
In particular, we have
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \tag{5.3.8}$$

Proof: By the preceding theorem, (5.3.8) is an immediate consequence of (5.3.7). Since we also have $|f(x)| \leq g(x)$, we deduce that $f$ is integrable. Now, $|f_n - f| \leq 2g$ and so $2g - |f_n - f|$ is non-negative and converges to $2g$ as $n \to \infty$. Consequently, by Fatou's lemma, we get
$$\begin{aligned}
2 \int_X g\,d\mu &\leq \liminf_{n \to \infty} \int_X (2g - |f_n - f|)\,d\mu \\
&= 2 \int_X g\,d\mu - \limsup_{n \to \infty} \int_X |f_n - f|\,d\mu.
\end{aligned}$$
Since $\int_X g\,d\mu < +\infty$, we deduce from the above that
$$0 \leq \liminf_{n \to \infty} \int_X |f_n - f|\,d\mu \leq \limsup_{n \to \infty} \int_X |f_n - f|\,d\mu \leq 0.$$
This proves (5.3.7).

Remark 5.3.1 Since the integral of a function over a set of measure zero is zero, in all the convergence theorems proved up to now (say, the monotone convergence theorem and the dominated convergence theorem), the theorems remain valid even if we assume that $f_n \to f$ almost everywhere. If $E$ is the set of measure zero where convergence fails, we can work with $X \setminus E$ in the proofs and the results remain valid for integrals over $X$ since the addition of the integrals over $E$ does not alter anything.

Example 5.3.1 Let $X = \mathbb{N}$ be equipped with the counting measure. Define
$$f_n(k) = \begin{cases} \frac{1}{n}, & \text{if } 1 \leq k \leq n, \\ 0, & \text{if } k > n. \end{cases}$$
Then $f_n \to f \equiv 0$ uniformly. However, while $\int_X f\,d\mu = 0$, we have $\int_X f_n\,d\mu = n \cdot \frac{1}{n} = 1$ for each $n$. This does not contradict the dominated convergence theorem, since the $f_n$ are not bounded above by any integrable function.

Definition 5.3.4 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on $X$. Then we say that the sequence converges in the mean to an integrable function $f$ if (5.3.7) holds.

Proposition 5.3.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on $X$, converging in the mean to an integrable function $f$ defined on $X$. Then $f_n \xrightarrow{\mu} f$. In particular, there exists a subsequence $\{f_{n_k}\}$ which converges to $f$ pointwise, almost everywhere.

Proof: Let $\varepsilon > 0$. Set
$$E_n(\varepsilon) = \{x \in X \mid |f_n(x) - f(x)| \geq \varepsilon\}.$$
Then,
$$0 \leq \int_{E_n(\varepsilon)} |f_n - f|\,d\mu \leq \int_X |f_n - f|\,d\mu.$$
It now follows from the definition of $E_n(\varepsilon)$ that (cf. (5.2.1)),
$$\mu(E_n(\varepsilon)) \leq \frac{1}{\varepsilon} \int_X |f_n - f|\,d\mu,$$
from which we deduce that $\mu(E_n(\varepsilon)) \to 0$ as $n \to \infty$. This proves that $f_n \xrightarrow{\mu} f$. The other conclusion now follows from Proposition 4.2.2.

We now prove a generalization of the dominated convergence theorem.

Theorem 5.3.4 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on $X$ converging pointwise, almost everywhere, to an integrable function $f$ defined on $X$. Assume that for all $x \in X$, and for all $n \in \mathbb{N}$, we have that $|f_n(x)| \leq g_n(x)$, where $\{g_n\}_{n=1}^\infty$ is a sequence of non-negative integrable functions defined on $X$ converging to a non-negative integrable function $g$ almost everywhere in $X$. Finally, assume that
$$\lim_{n \to \infty} \int_X g_n\,d\mu = \int_X g\,d\mu.$$
Then, (5.3.8) holds.

Proof: Assume that the functions $f_n$ and $f$ are all real-valued. By hypotheses, we have that for each $n \in \mathbb{N}$, $g_n \pm f_n \geq 0$ and $g_n \pm f_n \to g \pm f$ almost everywhere as $n \to \infty$. Applying Fatou's lemma, we get
$$\begin{aligned}
\int_X (g - f)\,d\mu &\leq \liminf_{n \to \infty} \int_X (g_n - f_n)\,d\mu \\
&= \int_X g\,d\mu - \limsup_{n \to \infty} \int_X f_n\,d\mu,
\end{aligned}$$
and
$$\begin{aligned}
\int_X (g + f)\,d\mu &\leq \liminf_{n \to \infty} \int_X (g_n + f_n)\,d\mu \\
&= \int_X g\,d\mu + \liminf_{n \to \infty} \int_X f_n\,d\mu.
\end{aligned}$$
Since $\int_X g\,d\mu < +\infty$, we deduce that
$$\int_X f\,d\mu \leq \liminf_{n \to \infty} \int_X f_n\,d\mu \leq \limsup_{n \to \infty} \int_X f_n\,d\mu \leq \int_X f\,d\mu,$$
from which (5.3.8) follows. If the functions $f_n$ and $f$ are complex-valued, then the inequality $|f_n(x)| \leq g_n(x)$ also implies that the same is valid for the sequences of the real and imaginary parts of the $f_n$. Hence, the theorem applies to these two sequences, from which we easily deduce (5.3.8).

As a corollary of the preceding theorem, we have the following useful result.
Theorem 5.3.5 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on $X$ converging pointwise (almost everywhere) to an integrable function $f$ defined on $X$. Then, the sequence converges to $f$ in the mean if, and only if,
$$\lim_{n \to \infty} \int_X |f_n|\,d\mu = \int_X |f|\,d\mu. \tag{5.3.9}$$

Proof: Assume that the sequence converges to $f$ in the mean. Since $|\,|f_n| - |f|\,| \leq |f_n - f|$, we see that $\{|f_n|\}_{n=1}^\infty$ converges in the mean to $|f|$ and then (5.3.9) is an immediate consequence. Conversely, assume that (5.3.9) holds. Define $F_n = |f_n - f|$, which converges to zero (almost everywhere). Let $G_n = |f_n| + |f|$ and $G = 2|f|$. Then $G_n$ and $G$ are non-negative integrable functions, $|F_n| \leq G_n$, and $G_n \to G$ (almost everywhere) as $n \to \infty$. Finally, thanks to (5.3.9), we have that $\int_X G_n\,d\mu \to \int_X G\,d\mu$, as $n \to \infty$. Consequently, by the preceding theorem, we have that $\int_X F_n\,d\mu \to 0$ as $n \to \infty$, which is the same as saying that the sequence $\{f_n\}_{n=1}^\infty$ converges to $f$ in the mean.

We will now see a few examples of application of the dominated convergence theorem.

Example 5.3.2 (Fourier transform) Let $\mathbb{R}^N$ be equipped with the Lebesgue measure, $m_N$. Given two vectors $x = (x_1, \cdots, x_N)$ and $\xi = (\xi_1, \cdots, \xi_N)$ in $\mathbb{R}^N$, we define the usual euclidean inner-product between them by
$$x.\xi = \sum_{i=1}^N x_i \xi_i.$$
Let $f : \mathbb{R}^N \to \mathbb{R}$ be an integrable function. The Fourier transform of the function $f$, denoted $\hat{f}$, is defined by
$$\hat{f}(\xi) = \int_{\mathbb{R}^N} e^{-2\pi i x.\xi} f(x)\,dm_N(x).$$
By Theorem 5.3.2, we have
$$|\hat{f}(\xi)| \leq \int_{\mathbb{R}^N} |f|\,dm_N < +\infty,$$
since the exponential has absolute value equal to unity. Thus $\hat{f}$ is a well-defined and bounded function. Let $\{\xi^{(n)}\}_{n=1}^\infty$ be a sequence in $\mathbb{R}^N$ converging to a vector $\xi$. Then, since the exponential is continuous, we have that
$$e^{-2\pi i \xi^{(n)}.x} f(x) \to e^{-2\pi i \xi.x} f(x)$$
as $n \to \infty$. Further,
$$\left|e^{-2\pi i \xi^{(n)}.x} f(x)\right| \leq |f(x)|$$
for all $x \in \mathbb{R}^N$ and for all $n \in \mathbb{N}$. Since $f$ is integrable, it now follows, from the dominated convergence theorem, that
$$\lim_{n \to \infty} \hat{f}(\xi^{(n)}) = \hat{f}(\xi).$$
Thus, the Fourier transform of an integrable function is a continuous and bounded function.

Example 5.3.3 Let $\{a_{ij}\}_{i,j=1}^\infty$ be a double sequence of real numbers. Assume that, for each $j \in \mathbb{N}$,
$$\sum_{i=1}^\infty |a_{ij}| \leq b_j,$$
where $\sum_{j=1}^\infty b_j < +\infty$. Let $X = \mathbb{N}$ be equipped with the counting measure. Define $f_i(j) = a_{ij}$, $j \in \mathbb{N}$, and
$$f = \sum_{i=1}^\infty f_i.$$
Thus,
$$f(j) = \sum_{i=1}^\infty a_{ij}, \quad j \in \mathbb{N}.$$
Define, for $j \in \mathbb{N}$, $g(j) = b_j$. Then $g$ is a non-negative integrable function. By hypothesis, for each $n \in \mathbb{N}$,
$$\left|\sum_{i=1}^n f_i(j)\right| \leq \sum_{i=1}^\infty |f_i(j)| = \sum_{i=1}^\infty |a_{ij}| \leq b_j = g(j).$$
Since $\sum_{i=1}^n f_i \to f$, we have, by the dominated convergence theorem,
$$\lim_{n \to \infty} \sum_{i=1}^n \sum_{j=1}^\infty a_{ij} = \lim_{n \to \infty} \sum_{j=1}^\infty \sum_{i=1}^n a_{ij} = \sum_{j=1}^\infty f(j) = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$
In other words, we have
$$\sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij} = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$
Thus, the given conditions are sufficient to ensure that the order of summation can be reversed when the double sequence is not necessarily non-negative (cf. Example 5.2.3).

Proposition 5.3.2 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be an integrable function defined on $X$. Then, given $\varepsilon > 0$, there exists $\delta > 0$ such that, whenever $\mu(E) < \delta$, $E \in \mathcal{S}$, we have
$$\int_E |f|\,d\mu < \varepsilon. \tag{5.3.10}$$

Proof: Step 1: Let us assume that $f$ is bounded. Let $|f(x)| \leq M$ for (almost) every $x \in X$. Then, if $E \in \mathcal{S}$, we have
$$\int_E |f|\,d\mu \leq M\mu(E).$$
The result follows on choosing $\delta < \frac{\varepsilon}{M}$.

Step 2: Given an arbitrary integrable function $f$, define, for $n \in \mathbb{N}$,
$$f_n(x) = \begin{cases} |f(x)|, & \text{if } |f(x)| \leq n, \\ n, & \text{if } |f(x)| > n. \end{cases}$$
Then $\{f_n\}_{n=1}^\infty$ is a non-negative sequence of bounded functions increasing to $|f|$. Thus, by the monotone convergence theorem, there exists $N \in \mathbb{N}$ such that, for all $n \geq N$, we have
$$\int_X |\,|f| - f_n|\,d\mu = \int_X |f|\,d\mu - \int_X f_n\,d\mu < \frac{\varepsilon}{2}.$$
Hence, for every $E \in \mathcal{S}$, we have, for all $n \geq N$,
$$\int_E |f|\,d\mu - \int_E f_n\,d\mu = \int_E |\,|f| - f_n|\,d\mu \leq \int_X |\,|f| - f_n|\,d\mu < \frac{\varepsilon}{2}.$$
Since $f_N$ is bounded, choose, by Step 1, a $\delta > 0$ such that, whenever $\mu(E) < \delta$, $E \in \mathcal{S}$, we have
$$\int_E f_N\,d\mu < \frac{\varepsilon}{2}.$$
Thus, for every set $E \in \mathcal{S}$ such that $\mu(E) < \delta$, (5.3.10) holds.

Remark 5.3.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. We know that (cf. Proposition 5.2.7) if $f$ is an integrable function defined on $X$, then
$$\nu(E) = \int_E |f|\,d\mu, \quad E \in \mathcal{S},$$
defines a measure on $\mathcal{S}$. The above proposition states that if $\varepsilon > 0$ is given, we can find $\delta > 0$ such that $\nu(E) < \varepsilon$ whenever $\mu(E) < \delta$. We say that the measure $\nu$ is absolutely continuous with respect to the measure $\mu$. The Radon-Nikodym theorem (which we shall see much later) states that, for $\sigma$-finite measure spaces, every measure $\nu$ which is absolutely continuous with respect to a measure $\mu$ occurs in this form (see also Remark 5.2.5).

5.4 The Riemann and Lebesgue integrals

On the real line, $\mathbb{R}$, we have two notions of integrals. The first is that of the Riemann integral, defined primarily for bounded functions defined over bounded intervals. If $f : [a, b] \to \mathbb{R}$ is a bounded function, then the Riemann integral, if it exists, is denoted by the symbol
$$\int_a^b f(x)\,dx.$$
The integral when $f$ and/or the interval is unbounded is then defined by appropriate limit processes and, again, may or may not exist. When the integral exists, it is always a real number.

The Lebesgue integral, on the other hand, is defined for all non-negative measurable functions defined on any (Lebesgue) measurable subset of $\mathbb{R}$. The value of the integral may be finite or infinite. If it is finite, we say the function is Lebesgue integrable. The Lebesgue integral is then defined for any measurable function $f$ such that $|f|$ is Lebesgue integrable. In this case, of course, the integral will again be a real number. The Lebesgue integral of a non-negative or an integrable function, $f$, over a (Lebesgue) measurable set $E$, is denoted by the symbol
$$\int_E f\,dm_1.$$
We have seen, in the Preamble, that there exist bounded functions which are not Riemann integrable.
An example is that of the characteristic function of the rationals in the interval $[0, 1]$. On the other hand, the set of rationals is countable and hence is a measurable set of measure zero. Thus, its characteristic function is integrable and the value of the Lebesgue integral is zero.

We now ask ourselves if a Riemann integrable function is always Lebesgue integrable. Our aim in this section is to show that this is indeed the case when the function and the interval are both bounded and that, in fact, the two theories yield the same value for the integral.

Let $[a, b]$ be a finite interval in $\mathbb{R}$. Let
\[ \mathcal{P} = \{a = x_0 < x_1 < \cdots < x_n = b\} \]
be a partition of the interval $[a, b]$. The points $\{x_i\}_{i=0}^n$ are called the nodes of the partition. The mesh size of the partition, denoted $\Delta(\mathcal{P})$, is defined as follows:
\[ \Delta(\mathcal{P}) = \max_{1 \le i \le n} (x_i - x_{i-1}). \]
A partition $\mathcal{P}'$ is said to be a refinement of $\mathcal{P}$ if the nodes of $\mathcal{P}$ form a subset of those of $\mathcal{P}'$.

Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Let us consider a sequence $\{\mathcal{P}_k\}_{k=1}^\infty$ of partitions of the interval $[a, b]$ such that, for each $k$, we have that $\mathcal{P}_{k+1}$ is a refinement of $\mathcal{P}_k$ and such that $\Delta(\mathcal{P}_k) \to 0$ as $k \to \infty$. If $\mathcal{P}_k = \{a = x_0 < x_1 < \cdots < x_n = b\}$, define the functions $U_k$ and $L_k$ as follows:
\[ U_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n M_i \chi_{(x_{i-1}, x_i]}, \quad \text{and} \quad L_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n m_i \chi_{(x_{i-1}, x_i]}, \]
where, for $1 \le i \le n$,
\[ M_i = \sup_{[x_{i-1}, x_i]} f(x), \quad \text{and} \quad m_i = \inf_{[x_{i-1}, x_i]} f(x). \]
Then the upper and lower (Darboux) sums associated to $f$ and this partition (cf. Preamble) are given by
\[ U(\mathcal{P}_k, f) = \int_{[a,b]} U_k\,dm_1, \quad \text{and} \quad L(\mathcal{P}_k, f) = \int_{[a,b]} L_k\,dm_1. \tag{5.4.1} \]
Since, for each $k \in \mathbb{N}$, we have that the partition $\mathcal{P}_{k+1}$ is a refinement of the partition $\mathcal{P}_k$, it follows that, for each $x \in [a, b]$,
\[ L_1(x) \le L_2(x) \le \cdots \le f(x) \le \cdots \le U_2(x) \le U_1(x). \tag{5.4.2} \]

Theorem 5.4.1 Let $f : [a, b] \to \mathbb{R}$ be a bounded function which is Riemann integrable. Then, it is also Lebesgue integrable and
\[ \int_{[a,b]} f\,dm_1 = \int_a^b f(x)\,dx. \]

Proof: With the notations established above, the sequence $\{U_k(x)\}_{k=1}^\infty$ is monotonic decreasing and bounded below and the sequence $\{L_k(x)\}_{k=1}^\infty$ is monotonic increasing and bounded above, for each $x \in [a, b]$. Thus both sequences are convergent. Let their respective limits be $U(x)$ and $L(x)$. Then
\[ L(x) \le f(x) \le U(x), \quad x \in [a, b]. \]
Since $f$ is bounded, assume that $|f(x)| \le M$ for all $x \in [a, b]$. Then, for all $x \in [a, b]$, and for all $k \in \mathbb{N}$, we also have $|L_k(x)| \le M$ and $|U_k(x)| \le M$. Consequently, by the dominated convergence theorem, we have
\[ \lim_{k\to\infty} \int_{[a,b]} U_k\,dm_1 = \int_{[a,b]} U\,dm_1, \quad \text{and} \quad \lim_{k\to\infty} \int_{[a,b]} L_k\,dm_1 = \int_{[a,b]} L\,dm_1. \tag{5.4.3} \]
Since $f$ is Riemann integrable, the upper and lower Darboux sums converge to the Riemann integral of $f$. Thus, in view of (5.4.1), we get
\[ \int_{[a,b]} U\,dm_1 = \int_{[a,b]} L\,dm_1 = \int_a^b f(x)\,dx. \]
But $U \ge L$ and so, by the above result, it follows from Proposition 5.2.2 that $U(x) = L(x) = f(x)$ almost everywhere. Thus $f$ is Lebesgue integrable and
\[ \int_{[a,b]} f\,dm_1 = \int_{[a,b]} U\,dm_1 = \int_{[a,b]} L\,dm_1 = \int_a^b f(x)\,dx. \]
This completes the proof.

Theorem 5.4.2 Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Then $f$ is Riemann integrable if, and only if, it is continuous almost everywhere.

Proof: With the preceding notations, assume that $x \in [a, b]$ is not a node of any of the partitions $\mathcal{P}_k$, $k \in \mathbb{N}$. (The set of all nodes is countable and hence is of measure zero.) Now, $f$ is continuous at such a point $x$ if, and only if, $U(x) = f(x) = L(x)$. Thus, from the proof of the preceding theorem, we see that if $f$ is a bounded and Riemann integrable function, then it is continuous almost everywhere.

Conversely, assume that $f : [a, b] \to \mathbb{R}$ is a bounded function which is continuous almost everywhere. Then $U = f = L$ almost everywhere. Let $\varepsilon > 0$ be given. Then, by virtue of (5.4.1) and (5.4.3), it follows that we can find $k$, sufficiently large, such that
\[ \int_{[a,b]} U_k\,dm_1 - \int_{[a,b]} L_k\,dm_1 < \varepsilon, \]
i.e.
\[ |U(\mathcal{P}_k, f) - L(\mathcal{P}_k, f)| < \varepsilon, \]
which proves that $f$ is Riemann integrable.

Example 5.4.1 The result of Theorem 5.4.1 is not true for unbounded intervals. Consider the interval $(0, \infty) \subset \mathbb{R}$. Let
\[ f(x) = \frac{\sin x}{x}. \]
It is well known (a standard exercise on contour integration) that the Riemann integral $\int_0^\infty f(x)\,dx$ is well-defined and that its value is, in fact, $\frac{\pi}{2}$ (cf., for example, Ahlfors [1]). However, $f$ is not Lebesgue integrable. To see this, consider the intervals
\[ I_n = \left[ n\pi + \frac{\pi}{4},\; n\pi + \frac{\pi}{2} \right], \quad n \in \mathbb{N}, \]
which are all disjoint. On $I_n$, we have
\[ |\sin x| \ge \frac{1}{\sqrt{2}} \quad \text{and} \quad x = |x| \le (2n + 1)\frac{\pi}{2}. \]
Thus, for $x \in I_n$, we have
\[ \left| \frac{\sin x}{x} \right| \ge \frac{\sqrt{2}}{\pi} \cdot \frac{1}{2n + 1}. \]
Since each $I_n$ has length $\frac{\pi}{4}$, for any $N \in \mathbb{N}$, we have
\[ \int_{(0,\infty)} |f|\,dm_1 \ge \sum_{n=1}^N \int_{[n\pi + \frac{\pi}{4},\, n\pi + \frac{\pi}{2}]} |f|\,dm_1 \ge \frac{1}{2\sqrt{2}} \sum_{n=1}^N \frac{1}{2n + 1}, \]
and the sum on the right becomes arbitrarily large, for large $N$, since it is a partial sum of a divergent series.

We can use Theorem 5.4.1 to study the Lebesgue integrability of functions on subsets of the real line.

Example 5.4.2 Let $f(x) = \frac{1}{\sqrt{x}}$ on the interval $(0, 1)$. This is a non-negative function and its Lebesgue integral is well-defined. Define
\[ f_n(x) = \begin{cases} 0, & \text{if } x \in (0, \frac{1}{n}), \\ \frac{1}{\sqrt{x}}, & \text{if } x \in [\frac{1}{n}, 1). \end{cases} \]
Then the sequence of non-negative functions $\{f_n\}_{n=1}^\infty$ increases to $f$ and so
\[ \int_{(0,1)} f\,dm_1 = \lim_{n\to\infty} \int_{(0,1)} f_n\,dm_1. \]
Now
\[ \int_{(0,1)} f_n\,dm_1 = \int_{[\frac{1}{n}, 1)} f\,dm_1. \]
But, on the interval $[\frac{1}{n}, 1)$, the function $f$ is a bounded and continuous function. Hence it is Riemann integrable and we can easily calculate the Riemann integral using the familiar rules of the calculus as
\[ \int_{\frac{1}{n}}^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_{\frac{1}{n}}^1 = 2\left(1 - \frac{1}{\sqrt{n}}\right). \]
Thus, it follows that
\[ \int_{(0,1)} f\,dm_1 = 2 \]
and so $f$ is integrable over $(0, 1)$.

Example 5.4.3 Let
\[ f(x) = \begin{cases} \left(\frac{\sin x}{x}\right)^2, & \text{if } x \in (0, \infty), \\ 1, & \text{if } x = 0. \end{cases} \]
Again, this is a non-negative function and its Lebesgue integral is well-defined. We can write (using Proposition 5.2.7)
\[ \int_{[0,\infty)} f\,dm_1 = \int_{[0,1]} f\,dm_1 + \int_{(1,\infty)} f\,dm_1. \]
On the interval $[0, 1]$, we have that $f$ is a bounded and continuous function and so it is Riemann and hence Lebesgue integrable. Now, set
\[ f_n(x) = \frac{1}{x^2}\chi_{(1,n)}(x). \]
Then for each $x \in (1, \infty)$, $f_n(x)$ increases to $\frac{1}{x^2}$. Since $f_n$ is a bounded and continuous function on the interval $(1, n)$, it is Riemann integrable there. Consequently, by the monotone convergence theorem,
\[ \int_{(1,\infty)} \frac{1}{x^2}\,dm_1(x) = \lim_{n\to\infty} \int_1^n \frac{1}{x^2}\,dx = \lim_{n\to\infty} \left(1 - \frac{1}{n}\right) = 1. \]
Thus, the function $x \mapsto \frac{1}{x^2}$ is integrable on the interval $(1, \infty)$. Since
\[ \left(\frac{\sin x}{x}\right)^2 \le \frac{1}{x^2}, \]
it follows that $f$ is integrable on $(1, \infty)$ as well and, therefore, $f$ is integrable on $[0, \infty)$.

5.5 Weierstrass' theorem

In this section, we will prove the famous theorem of Weierstrass, on the uniform approximation of a continuous function on a compact interval by means of polynomials, using the notion of the Lebesgue integral.

Without loss of generality, we work on the interval $[0, 1]$. If $x_0 \in [0, 1]$, we denote the Dirac measure concentrated at $x_0$ by $\delta_{x_0}$.

Let $t \in [0, 1]$ and $n \in \mathbb{N}$ be fixed. Let $X = [0, 1]$. On the $\sigma$-algebra of all subsets of $X$, define the measure
\[ \mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} \delta_{\frac{k}{n}}, \]
where
\[ \binom{n}{k} = \frac{n!}{k!(n-k)!} \]
is the usual binomial coefficient. Let $f_i(x) = x^i$, $i = 0, 1, 2$. Now,
\[ \mu_n^t(X) = \int_X f_0\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} = 1. \]
Next,
\[ \int_X f_1\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} \frac{k}{n} = t \sum_{k=1}^n \binom{n-1}{k-1} t^{k-1} (1 - t)^{(n-1)-(k-1)} = t. \]
In the same way, a similar computation yields
\[ \int_X f_2\,d\mu_n^t = \frac{1}{n}\left((n - 1)t^2 + t\right). \]
Setting $f(x) = (x - t)^2 = f_2(x) - 2t f_1(x) + t^2 f_0(x)$, we get, on simplification,
\[ \int_X f\,d\mu_n^t = \frac{t - t^2}{n}. \tag{5.5.1} \]

Lemma 5.5.1 Let $t \in [0, 1]$ and $n \in \mathbb{N}$ be fixed. Let $\varepsilon > 0$ be given. Set
\[ A_\varepsilon = \{x \in X \mid |x - t| \ge \varepsilon\}. \]
Then $\mu_n^t(A_\varepsilon)$ converges uniformly to zero (with respect to $t$) as $n \to \infty$.

Proof: By the definition of the set $A_\varepsilon$, we get
\[ \varepsilon^2 \mu_n^t(A_\varepsilon) \le \int_{A_\varepsilon} (x - t)^2\,d\mu_n^t(x) \le \int_X (x - t)^2\,d\mu_n^t(x) \le \frac{1}{4n}, \]
using (5.5.1) and the fact that $t(1 - t) \le \frac{1}{4}$ when $t \in [0, 1]$. Thus $\mu_n^t(A_\varepsilon) \le \frac{1}{4n\varepsilon^2}$, a bound independent of $t$. This completes the proof.

Lemma 5.5.2 Let $f \in C[0, 1]$. Let $t \in [0, 1]$. Then
\[ \lim_{n\to\infty} \int_X f\,d\mu_n^t = f(t), \]
and the limit is uniform with respect to $t$.

Proof: Since $f$ is continuous over a compact interval, it is uniformly continuous. Given $\varepsilon > 0$, let $\delta > 0$ be such that whenever $|x - y| < \delta$, we have $|f(x) - f(y)| < \varepsilon$. Set
\[ A_\delta = \{x \in X \mid |x - t| \ge \delta\}. \]
Now, since $\mu_n^t(X) = 1$, we have
\[ \left| \int_X f\,d\mu_n^t - f(t) \right| = \left| \int_X (f(x) - f(t))\,d\mu_n^t(x) \right| \le \int_X |f(x) - f(t)|\,d\mu_n^t(x). \]
We can write
\[ \int_X |f(x) - f(t)|\,d\mu_n^t(x) = I_1 + I_2, \]
where
\[ I_1 = \int_{A_\delta} |f(x) - f(t)|\,d\mu_n^t(x), \quad \text{and} \quad I_2 = \int_{A_\delta^c} |f(x) - f(t)|\,d\mu_n^t(x). \]
Let $M = \max_{x \in [0,1]} |f(x)|$. Then, by the estimate obtained in the proof of Lemma 5.5.1, we have
\[ I_1 \le 2M \mu_n^t(A_\delta) \le \frac{2M}{4n\delta^2}. \]
On the other hand, for $x \in A_\delta^c$, we have that $|f(x) - f(t)| < \varepsilon$ and so
\[ I_2 \le \varepsilon \mu_n^t(A_\delta^c) \le \varepsilon \mu_n^t(X) \le \varepsilon. \]
Now, given any $\eta > 0$, choose $\varepsilon < \frac{\eta}{2}$ (which fixes $\delta$, independently of $t$) and then choose $N \in \mathbb{N}$ such that, for all $n \ge N$, we have
\[ \frac{2M}{4n\delta^2} < \frac{\eta}{2}. \]
Then, for all $n \ge N$ and for all $t \in [0, 1]$, we have
\[ \left| \int_X f\,d\mu_n^t - f(t) \right| < \eta, \]
which completes the proof.

Remark 5.5.1 In the above proof, to estimate the integral of $|f(x) - f(t)|$, we split the integral over two sets. On one of the sets ($A_\delta^c$), we had information which controlled the integrand while on the other ($A_\delta$), we had minimal information on the integrand, but the measure of the set was small. This method of 'divide and rule' is often useful in estimating integrals.

Now, by the definition of the measure $\mu_n^t$, we get
\[ \int_X f\,d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} f\left(\frac{k}{n}\right), \tag{5.5.2} \]
which is a polynomial in $t$ converging uniformly to $f(t)$. Thus, we have proved the following theorem:

Theorem 5.5.1 (Weierstrass approximation theorem) Every continuous function on a compact interval can be uniformly approximated by a sequence of polynomials.

The polynomials occurring on the right-hand side of (5.5.2) are called the Bernstein polynomials.

5.6 Probability

We are now in a position to indicate the connections between the theory of probability and that of measure and integration.
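A first such connection is already visible in Section 5.5: the measure used there has total mass one and is concentrated on the points k/n with binomial weights. The following is a minimal numerical sketch, not part of the original text, which evaluates the integral of a function against that measure (that is, the Bernstein polynomial) and checks the moment identities and the uniform convergence on a sample function. The names `bernstein` and `sup_error`, the test function and the grid size are illustrative assumptions.

```python
from math import comb


def bernstein(f, n, t):
    """Integral of f against the measure mu_n^t of Section 5.5, i.e. the
    n-th Bernstein polynomial of f evaluated at t (hypothetical helper name)."""
    return sum(
        comb(n, k) * t**k * (1 - t) ** (n - k) * f(k / n)
        for k in range(n + 1)
    )


def sup_error(f, n, grid=101):
    """Maximum of |B_n f(t) - f(t)| over an equally spaced grid in [0, 1].
    The grid is an assumption: it only samples the true supremum."""
    ts = [j / (grid - 1) for j in range(grid)]
    return max(abs(bernstein(f, n, t) - f(t)) for t in ts)


if __name__ == "__main__":
    t, n = 0.3, 20
    # Total mass, first moment and the quantity in (5.5.1).
    print(bernstein(lambda x: 1.0, n, t))           # close to 1
    print(bernstein(lambda x: x, n, t))             # close to t
    print(bernstein(lambda x: (x - t) ** 2, n, t))  # close to (t - t^2)/n
    # Uniform error for a continuous, non-smooth sample function.
    g = lambda x: abs(x - 0.5)
    for m in (10, 100, 400):
        print(m, sup_error(g, m))
```

The sample function |x - 1/2| is chosen only because it is continuous but not differentiable; the observed decay of the error illustrates Lemma 5.5.2 without proving anything about its rate.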
A probability space is a measure space $(\Omega, \mathcal{B}, p)$ such that $p(\Omega) = 1$. The set $\Omega$ is called a sample space and the $\sigma$-algebra $\mathcal{B}$ is said to be the collection of events. Thus, the measure $p(A)$ of a set $A \in \mathcal{B}$ is called the probability of the event $A$.

Let $B \in \mathcal{B}$ be such that $p(B) > 0$. Consider the $\sigma$-algebra of subsets of $B$ given by
\[ \mathcal{B}_B = \{A \cap B \mid A \in \mathcal{B}\}. \]
We define the measure $p_B$ on $\mathcal{B}_B$ by
\[ p_B(A \cap B) = \frac{p(A \cap B)}{p(B)}, \quad A \in \mathcal{B}, \]
so that $p_B(B) = 1$. Then $(B, \mathcal{B}_B, p_B)$ is a probability space. The conditional probability of an event $A$, given $B$, denoted $p(A \mid B)$, is nothing but $p_B(A \cap B)$. The events $A$ and $B$ are said to be independent if $p(A \mid B) = p(A)$. In this case, we deduce that
\[ p(A \cap B) = p(A)p(B). \]

A random variable, $X$, on $\Omega$, is a measurable real-valued function defined on $\Omega$. The expected value (also called expectation or mean) of the random variable $X$, denoted $E(X)$, is defined by
\[ E(X) = \int_\Omega X\,dp. \]
Pointwise convergence, almost everywhere, of a sequence of random variables is referred to as convergence almost surely and convergence in measure is referred to as convergence in probability.

A distinguishing feature of probability theory, which does not have a parallel in the theory of measure and integration, is the study of independent and identically distributed random variables. Two random variables $X$ and $Y$ defined on $\Omega$ are said to be independent if for any pair of Borel sets in $\mathbb{R}$, say, $A$ and $B$, we have
\[ p(X^{-1}(A) \cap Y^{-1}(B)) = p(X^{-1}(A))\,p(Y^{-1}(B)). \]
The distribution function of a random variable $X$ defined on $\Omega$ is defined by
\[ F(t) = p(X^{-1}((-\infty, t])). \]
In other words, it is the probability that the random variable takes a value less than, or equal to, $t$. Two random variables are said to be identically distributed if they have the same distribution function.

5.7 Exercises

5.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of measurable functions defined on $X$ converging to a measurable function $f$ almost everywhere.
Assume that
\[ f_1 \ge f_2 \ge \cdots \ge f_n \ge \cdots \ge 0, \]
and that $f_1$ is integrable. Show that
\[ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]

5.2 Let $(X, \mathcal{S}, \mu)$ be a measure space such that $\mu(X) < +\infty$. Let $\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on $X$ converging uniformly to a function $f$ on $X$. Show that
\[ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]
(This is not true, in general, if $\mu(X) = +\infty$; cf. Example 5.3.1.)

5.3 Let $f : \mathbb{R} \to \mathbb{R}$ be integrable. Let $t \in \mathbb{R}$ be fixed. Define $g(x) = f(x + t)$, $x \in \mathbb{R}$. If $[a, b] \subset \mathbb{R}$ is an interval, show that
\[ \int_{[a,b]} g\,dm_1 = \int_{[a+t,\,b+t]} f\,dm_1. \]

5.4 Let $f : [0, 1] \times [0, 1] \to \mathbb{R}$ be (Lebesgue) measurable as a function of $x$, for each fixed $t$, and let $g : [0, 1] \to \mathbb{R}$ be an integrable function. Assume that for each $(x, t) \in [0, 1] \times [0, 1]$,
\[ |f(x, t)| \le g(x). \]
If $\lim_{t\to 0} f(x, t) = h(x)$, show that
\[ \lim_{t\to 0} \int_{[0,1]} f(x, t)\,dm_1(x) = \int_{[0,1]} h\,dm_1. \]

5.5 (Differentiation under the integral sign) Let $f : [0, 1] \times [0, 1] \to \mathbb{R}$ be a function such that, for each $t \in [0, 1]$, the function $x \mapsto f(x, t)$ is integrable and, for each $x \in [0, 1]$, the map $t \mapsto f(x, t)$ is differentiable and the derivative $\frac{\partial f}{\partial t}(x, t)$ is uniformly bounded. Show that
\[ \frac{d}{dt} \int_{[0,1]} f(x, t)\,dm_1(x) = \int_{[0,1]} \frac{\partial f}{\partial t}(x, t)\,dm_1(x). \]

5.6 In each of the following cases, check the function $f$ for (Lebesgue) integrability over the indicated domain.
(a) $f(x) = \frac{1}{1+x^2}$ on $\mathbb{R}$.
(b) $f(x) = \frac{1}{x}$ on $(0, 1)$.
(c) $f(x) = \frac{1}{x} \sin\frac{1}{x}$ on $(0, 1)$.

5.7 Show that the function defined by
\[ f(x) = \frac{x^{n-1}}{(1 + x^2)^k} \]
is integrable over $(0, \infty)$ if $k > \frac{n}{2}$.

5.8 Let $(X, \mathcal{S}, \mu)$ be a measure space. Show that the dominated convergence theorem is true if we replace '$f_n \to f$ almost everywhere' by '$f_n \xrightarrow{\mu} f$'.

5.9 (a) Let $f : [0, \infty) \to \mathbb{R}$ be uniformly continuous. If $f$ is integrable, show that
\[ \lim_{x\to+\infty} f(x) = 0. \]
(b) Show by means of an example that this result is not true if we replace 'uniformly continuous' by 'continuous'.

5.10 Let $f \in C[0, 1]$. Assume that, for each $n \in \mathbb{N}$, we have
\[ \int_0^1 x^n f(x)\,dx = 0. \]
Show that $f \equiv 0$.
5.11 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative integrable function defined on $X$. Let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to $f$. Show that
\[ \lim_{n\to\infty} \int_X |\varphi_n - f|\,d\mu = 0. \]

5.12 (Korovkin's theorem) Let $\varphi : [0, +\infty) \to [0, +\infty)$ be a continuous function such that $\varphi(t) > 0$ for $t > 0$. Let $X = [0, 1]$ and let $\{\mu_n^x\}_{n \in \mathbb{N},\, x \in [0,1]}$ be a collection of finite Borel measures (cf. Definition 2.1.1) on $X$. Define, for $x \in X$,
\[ \psi_n(x) = \int_X \varphi(|x - y|)\,d\mu_n^x(y). \]
Assume that (i) $\mu_n^x(X) \to 1$, uniformly with respect to $x$, as $n \to \infty$, and (ii) $\psi_n \to 0$, as $n \to \infty$, uniformly on $[0, 1]$. Show that, for any $f \in C[0, 1]$, we have
\[ \int_X f(y)\,d\mu_n^x(y) \to f(x), \quad \text{as } n \to \infty, \]
uniformly on $X$.

5.13 Show that the Weierstrass approximation theorem, as proved in Section 5.5, is a particular case of Korovkin's theorem, as stated above.

5.14 Let $(X, \mathcal{S}, \mu)$ be a measure space such that $\mu(X) = 1$. Let $g : \mathbb{R} \to \mathbb{R}$ be a bounded and uniformly continuous function. Let $\{f_n\}_{n=1}^\infty$ be a sequence of measurable functions defined on $X$ such that $f_n \xrightarrow{\mu} f$, where $f$ is a measurable function defined on $X$. Show that
\[ \lim_{n\to\infty} \int_X g \circ f_n\,d\mu = \int_X g \circ f\,d\mu, \]
where, for $x \in X$, and for any measurable function $h$ defined on $X$,
\[ (g \circ h)(x) = g(h(x)). \]

5.15 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $\mathcal{M}$ denote the collection of all equivalence classes of real-valued measurable functions defined on $X$, modulo equality almost everywhere. If $f : X \to \mathbb{R}$ is a measurable function, denote the equivalence class containing $f$ by $\tilde{f}$. Let $\varphi : [0, +\infty) \to [0, 1]$ be a strictly monotonic increasing continuous function such that $\varphi(0) = 0$. Assume, further, that, for all $x, y \in [0, +\infty)$, we have
\[ \varphi(x + y) \le \varphi(x) + \varphi(y). \]
Let $\mu(X) = 1$. For $\tilde{f}, \tilde{g} \in \mathcal{M}$, define
\[ d(\tilde{f}, \tilde{g}) = \int_X \varphi(|f - g|)\,d\mu. \]
(a) Show that $d(\cdot, \cdot)$ is well-defined.
(b) Show that $d(\cdot, \cdot)$ defines a metric on $\mathcal{M}$.
(c) Show that a sequence $\{\tilde{f}_n\}_{n=1}^\infty$ converges to $\tilde{f}$ with respect to this metric if, and only if, $f_n \xrightarrow{\mu} f$.
(d) Show that the function
\[ \varphi(x) = \frac{x}{1 + x}, \quad 0 \le x < +\infty, \]
satisfies the conditions stated above.

Chapter 6 Differentiation

6.1 Monotonic functions

One of the important features of differential and integral calculus is that differentiation and integration are essentially two sides of the same coin. More precisely, the fundamental theorem of calculus states that, if $f$ is a Riemann integrable function which is the derivative of a function $F$ on an interval $[a, b]$, then
\[ F(b) - F(a) = \int_a^b f(x)\,dx. \tag{6.1.1} \]
We would like to investigate how far such a result is true when we deal with functions which are only differentiable almost everywhere and with the Lebesgue integral of the derivative of that function.

Consider the Cantor function (cf. Section 3.2), $f$, defined on the interval $[0, 1]$. It is continuous and monotonic increasing, with $f(0) = 0$ and $f(1) = 1$. If $C$ is the Cantor set, then $f$ is constant on each subinterval of $C^c$. Thus, $f$ is differentiable almost everywhere and its derivative, wherever it exists, is zero. Thus we have $f(1) - f(0) = 1$, while the integral of the derivative of $f$ vanishes. In other words, (6.1.1) is not true for this function.

Our aim in this chapter is to give necessary and sufficient conditions on a function $f$ which is differentiable almost everywhere, with the derivative $f'$ being integrable, such that (6.1.1) is true.

We will first study, briefly, various classes of functions which are differentiable almost everywhere. Following the treatment in Royden [6], we start with monotonic functions.

Definition 6.1.1 Let $\mathcal{I}$ be a collection of intervals covering a set $E \subset \mathbb{R}$. It is said to be a Vitali covering of $E$ if, for every $\varepsilon > 0$, and for every $x \in E$, there exists an interval $I \in \mathcal{I}$ such that $x \in I$ and $m_1(I) < \varepsilon$.
In other words, every point in $E$ can be covered by intervals of arbitrarily small length. Such coverings occur naturally when we study local properties of functions, especially, derivability.

In the sequel, when we use the word 'outer-measure', we will mean the outer-measure which generates the Lebesgue measure on $\mathbb{R}$, which will be denoted by the symbol $\mu^*$.

Lemma 6.1.1 (Vitali covering lemma) Let $E \subset \mathbb{R}$ be a set of finite outer-measure and let $\mathcal{I}$ be a Vitali covering of $E$. Then, given $\varepsilon > 0$, we can find a finite collection of disjoint intervals $\{I_1, \cdots, I_N\}$ in $\mathcal{I}$ such that
\[ \mu^*\left(E \setminus \cup_{j=1}^N I_j\right) < \varepsilon. \]

Proof: We observe, first of all, that the intervals can be of any type: open, closed, or half-open. The addition, or removal, of end-points does not change the results since these points constitute a set of measure zero. Consequently, without loss of generality, we will assume that the intervals are all closed.

Since $E$ has finite outer-measure, we can find (cf. Proposition 2.2.1) an open set $U$, which contains $E$ and is such that $m_1(U) < +\infty$. Since $\mathcal{I}$ is a Vitali covering of $E$, we can also assume, again, without loss of generality, that all the intervals in $\mathcal{I}$ are also contained in $U$.

Let us choose an interval $I_1$ arbitrarily from the collection $\mathcal{I}$. We will now inductively choose disjoint intervals $I_k$, for $k > 1$, as follows. Assume that the intervals $\{I_1, \cdots, I_n\}$ have been chosen.

• If $E \subset \cup_{k=1}^n I_k$, then we are through.

• If not, let $x \in E \setminus (\cup_{k=1}^n I_k)$. Since the intervals are closed, the distance, $d$, of $x$ from $\cup_{k=1}^n I_k$ is strictly positive. Since $\mathcal{I}$ is a Vitali covering, there exists an interval $I \in \mathcal{I}$, containing $x$ and of length less than, say, $\frac{d}{2}$. Then, it is clear that $I$ will not intersect any of the intervals $I_k$, $1 \le k \le n$. Set
\[ k_n = \sup\{ m_1(I) \mid I \in \mathcal{I},\ I \cap I_k = \emptyset \text{ for } 1 \le k \le n \}. \]
Since all intervals of $\mathcal{I}$ are contained in $U$, we have that $k_n \le m_1(U) < +\infty$.
Thus, we can find an interval $I_{n+1} \in \mathcal{I}$ such that $I_{n+1} \cap I_k = \emptyset$ for all $1 \le k \le n$ and such that
\[ \frac{1}{2} k_n < m_1(I_{n+1}). \]

Thus, if the process does not terminate at any finite stage, we will have a sequence of disjoint intervals $\{I_k\}_{k=1}^\infty$ in $\mathcal{I}$. Since they are all contained in $U$, we have that $\sum_{k=1}^\infty m_1(I_k) \le m_1(U) < +\infty$. Consequently, $m_1(I_k) \to 0$ as $k \to \infty$. Now, choose $N \in \mathbb{N}$ such that
\[ \sum_{k=N+1}^\infty m_1(I_k) < \frac{\varepsilon}{5}. \]
Set $R = E \setminus \cup_{k=1}^N I_k$. We complete the proof by showing that $\mu^*(R) < \varepsilon$.

Let $x \in R$. Then, as observed earlier, there exists $I \in \mathcal{I}$ such that $x \in I$ and $I \cap I_k = \emptyset$ for all $1 \le k \le N$. Assume, if possible, that $I \cap I_n = \emptyset$ for all $n \in \mathbb{N}$. Then, by definition,
\[ 0 < m_1(I) \le k_n \le 2m_1(I_{n+1}) \]
for all $n$, which is impossible, since $m_1(I_n) \to 0$ as $n \to \infty$. Thus, there exists $n \in \mathbb{N}$ such that $n > N$, $I \cap I_n \ne \emptyset$, and $I \cap I_k = \emptyset$ for all $1 \le k < n$. Notice that $m_1(I) \le k_{n-1} \le 2m_1(I_n)$. Let $c_n$ denote the mid-point of the interval $I_n$. Then, since $x \in I$ and $I \cap I_n \ne \emptyset$, we have
\[ |x - c_n| \le m_1(I) + \frac{1}{2} m_1(I_n) \le \frac{5}{2} m_1(I_n). \]
Set
\[ J_n = \left[ c_n - \frac{5}{2} m_1(I_n),\; c_n + \frac{5}{2} m_1(I_n) \right]. \]
Then, $x \in J_n$ and $m_1(J_n) \le 5m_1(I_n)$. Thus, $R \subset \cup_{k=N+1}^\infty J_k$ and so
\[ \mu^*(R) \le \sum_{k=N+1}^\infty m_1(J_k) \le 5 \sum_{k=N+1}^\infty m_1(I_k) < \varepsilon. \]
This completes the proof.

Remark 6.1.1 There are several similar results in the literature, each of them being called a Vitali covering lemma. The spirit of all of these is the same: a set of finite outer-measure in $\mathbb{R}^N$ is covered by basic open sets of arbitrarily small size and we can find a finite disjoint sub-collection which almost completely covers the given set, i.e. the outer-measure of the uncovered portion can be made as small as we wish.

Let $f : [a, b] \to \mathbb{R}$ be a given measurable function. We can define the following 'one-sided derivatives' of $f$ at any point $x \in (a, b)$:
\[ D^+ f(x) = \limsup_{h \downarrow 0} \frac{f(x+h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \downarrow 0} \frac{f(x) - f(x-h)}{h}, \]
\[ D_+ f(x) = \liminf_{h \downarrow 0} \frac{f(x+h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \downarrow 0} \frac{f(x) - f(x-h)}{h}. \]
We have that $f$ is differentiable at $x$ if, and only if, these four values coincide, and, in that case, we denote the common value as $f'(x)$, which is the derivative of $f$ at $x$. Notice that we always have
\[ D^+ f(x) \ge D_+ f(x) \quad \text{and} \quad D^- f(x) \ge D_- f(x) \]
at any point $x \in (a, b)$. We say that $f$ is differentiable almost everywhere on $(a, b)$ if the derivative exists for almost every $x \in (a, b)$.

Theorem 6.1.1 Let $f : [a, b] \to \mathbb{R}$ be a monotonic increasing real-valued function. Then $f$ is differentiable almost everywhere on $(a, b)$. The derivative $f'$ is measurable and
\[ \int_{(a,b)} f'\,dm_1 \le f(b) - f(a). \tag{6.1.2} \]

Proof: Step 1. Consider the set
\[ E = \{x \in (a, b) \mid D^+ f(x) > D_- f(x)\}. \]
We will show that $m_1(E) = 0$. All other sets involving inequalities between the various one-sided derivatives can be handled in a similar manner. This will establish the proof. We can write
\[ E = \cup_{r, s \in \mathbb{Q},\ r > s} E_{rs}, \]
where
\[ E_{rs} = \{x \in (a, b) \mid D^+ f(x) > r > s > D_- f(x)\}. \]
Since $\mathbb{Q}$ is countable, again, it suffices to show that $m_1(E_{rs}) = 0$.

Let $m = \mu^*(E_{rs})$. Let $\varepsilon > 0$ be arbitrary. Then, there exists $U$, an open set, such that $E_{rs} \subset U$ and $m_1(U) < m + \varepsilon$. Let $x \in E_{rs}$. Since $D_- f(x) < s$, there exist arbitrarily small $h > 0$ such that $[x - h, x] \subset U$ and
\[ f(x) - f(x - h) < sh. \]
The collection of all such closed intervals, as $x$ varies over $E_{rs}$, forms a Vitali covering of $E_{rs}$ and so, by the Vitali covering lemma, we can find a disjoint collection of such intervals $\{I_1, \cdots, I_N\}$ such that the union of their interiors covers a set $A \subset E_{rs}$ such that $\mu^*(A) > m - \varepsilon$. If, for $1 \le k \le N$, we have $I_k = [x_k - h_k, x_k]$, then
\[ \sum_{k=1}^N (f(x_k) - f(x_k - h_k)) < s \sum_{k=1}^N h_k < s\,m_1(U) < s(m + \varepsilon). \tag{6.1.3} \]

Now, let $y \in A$. Then, since $D^+ f(y) > r$, there exist arbitrarily small $h' > 0$ such that $(y, y + h')$ is contained in an interval $I_k$, where $1 \le k \le N$, and
\[ f(y + h') - f(y) > rh'. \]
Again the collection of such intervals, as $y$ varies over $A$, is a Vitali covering of $A$ and so there exists a finite collection of disjoint intervals $\{J_1, \cdots, J_M\}$ which cover a set $B \subset A$ of outer-measure greater than $m - 2\varepsilon$. If $J_i = (y_i, y_i + h_i')$, $1 \le i \le M$, then
\[ \sum_{i=1}^M (f(y_i + h_i') - f(y_i)) > r \sum_{i=1}^M h_i' > r(m - 2\varepsilon). \tag{6.1.4} \]
Since each $J_i$ is contained in some $I_k$, since the intervals $J_i$ are disjoint, and since $f$ is monotonic increasing, we have, for each $1 \le k \le N$, that
\[ \sum_{i \,:\, J_i \subset I_k} (f(y_i + h_i') - f(y_i)) \le f(x_k) - f(x_k - h_k). \]
Thus,
\[ \sum_{i=1}^M (f(y_i + h_i') - f(y_i)) \le \sum_{k=1}^N (f(x_k) - f(x_k - h_k)). \]
From (6.1.3) and (6.1.4), we then deduce that
\[ r(m - 2\varepsilon) < s(m + \varepsilon). \]
Since $\varepsilon > 0$ was arbitrarily chosen, we get that $mr \le ms$. Since $r > s$, this is possible only if $m = 0$. This completes the proof of the differentiability, almost everywhere, of $f$.

Step 2. Define $f(x) = f(b)$ for $x \ge b$. Define
\[ g_n(x) = n\left( f\left(x + \frac{1}{n}\right) - f(x) \right). \]
Since $f$ is differentiable almost everywhere, $f'$ is defined almost everywhere and the sequence $\{g_n\}_{n=1}^\infty$ converges to $f'$, wherever it is defined. Since the Lebesgue measure is complete, it follows that $f'$ is measurable. Also, since $f$ is monotonically increasing, we have $g_n(x) \ge 0$ for all $x$. Thus, by Fatou's lemma, we have that
\[ \int_{(a,b)} f'\,dm_1 \le \liminf_{n\to\infty} \int_{(a,b)} g_n\,dm_1. \]
By Exercise 5.3, we get
\[ \int_{(a,b)} g_n\,dm_1 = n\left( \int_{(a+\frac{1}{n},\, b+\frac{1}{n})} f\,dm_1 - \int_{(a,b)} f\,dm_1 \right) = n\left( \int_{[b,\, b+\frac{1}{n})} f\,dm_1 - \int_{(a,\, a+\frac{1}{n}]} f\,dm_1 \right) = f(b) - n \int_{(a,\, a+\frac{1}{n}]} f\,dm_1. \]
Thus,
\[ \int_{(a,b)} f'\,dm_1 \le f(b) - \limsup_{n\to\infty}\, n \int_{(a,\, a+\frac{1}{n}]} f\,dm_1 \le f(b) - f(a), \]
since $f(x) \ge f(a)$ for all $x \in (a, a + \frac{1}{n}]$. This completes the proof.

Remark 6.1.2 As the example of the Cantor function shows, we can have strict inequality in (6.1.2).

6.2 Functions of bounded variation

Let $f : [a, b] \to \mathbb{R}$ be a given function. Consider a partition
\[ \mathcal{P} = \{a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b\}. \tag{6.2.1} \]
Define
\[ t(\mathcal{P}, f) = \sum_{i=1}^n |f(x_i) - f(x_{i-1})|. \]

Definition 6.2.1 Let $f : [a, b] \to \mathbb{R}$ be a given function.
The total variation of $f$, over the interval $[a, b]$, is defined as
\[ T_a^b(f) = \sup_{\mathcal{P}} t(\mathcal{P}, f), \]
where the supremum is taken over all possible partitions of the interval $[a, b]$. The function $f$ is said to be of bounded variation over the interval $[a, b]$ if $T_a^b(f) < +\infty$.

Example 6.2.1 Let $f : [a, b] \to \mathbb{R}$ be Lipschitz continuous, i.e. there exists $L > 0$ such that, for all $x, y \in [a, b]$, we have
\[ |f(x) - f(y)| \le L|x - y|. \]
Then, given any partition $\mathcal{P}$ as in (6.2.1), we have
\[ t(\mathcal{P}, f) = \sum_{i=1}^n |f(x_i) - f(x_{i-1})| \le L(b - a). \]
Thus, $f$ is of bounded variation over $[a, b]$ and $T_a^b(f) \le L(b - a)$. In particular, if $f$ is differentiable on $(a, b)$ and if $|f'(x)| \le L$ for all $x \in (a, b)$, then, by the mean value theorem, $f$ is Lipschitz continuous (with Lipschitz constant $L$) and hence is of bounded variation over $[a, b]$.

Example 6.2.2 Let $f : [a, b] \to \mathbb{R}$ be monotonic. Then, for any partition $\mathcal{P}$ as in (6.2.1), we have
\[ t(\mathcal{P}, f) = \sum_{i=1}^n |f(x_i) - f(x_{i-1})| = \left| \sum_{i=1}^n (f(x_i) - f(x_{i-1})) \right| = |f(b) - f(a)|, \]
since, by the monotonicity of $f$, all the differences have the same sign. Thus $f$ is of bounded variation over the interval $[a, b]$ and $T_a^b(f) = |f(b) - f(a)|$.

Example 6.2.3 Define
\[ f(x) = \begin{cases} x^2 \sin\frac{1}{x^2}, & \text{if } x \in (0, 1], \\ 0, & \text{if } x = 0. \end{cases} \]
Then $f$ is a continuous function which is not of bounded variation. To see this, consider the partition $\mathcal{P}$ of the interval $[0, 1]$ defined by the set of points
\[ \{0, 1\} \cup \left\{ \sqrt{\frac{2}{\pi(2k + 1)}} \right\}_{k=0}^n. \]
Let us denote the points of the partition which are not on the boundary by $\{x_k\}_{k=0}^n$. Then
\[ |f(x_k) - f(x_{k-1})| = \frac{2}{\pi} \cdot \frac{1}{2k + 1} + \frac{2}{\pi} \cdot \frac{1}{2k - 1} = \frac{2}{\pi} \cdot \frac{4k}{4k^2 - 1} \ge \frac{2}{\pi} \cdot \frac{1}{k} \]
for $1 \le k \le n$. Thus,
\[ t(\mathcal{P}, f) \ge \frac{2}{\pi} \sum_{k=1}^n \frac{1}{k}, \]
and the right-hand side will become arbitrarily large for large $n$.

Proposition 6.2.1 Let $[a, b] \subset \mathbb{R}$ be a given bounded interval. If $f$ is a function of bounded variation on $[a, b]$, then $f$ is bounded. The function $|f|$ is also of bounded variation.
If $f$ and $g$ are functions of bounded variation on $[a, b]$ and if $\alpha$ and $\beta$ are real constants, then $\alpha f + \beta g$ and $fg$ are also of bounded variation on $[a, b]$.

Proof: Let $x \in (a, b]$. Consider the partition defined by $\{a < x \le b\}$. Then
\[ |f(x) - f(a)| \le T_a^b(f). \]
Thus, for any $x \in [a, b]$, we have
\[ |f(x)| \le |f(a)| + T_a^b(f). \]
The rest of the proof is an immediate consequence of the following inequalities:
\[ |\,|f(x)| - |f(y)|\,| \le |f(x) - f(y)|, \]
\[ |(\alpha f(x) + \beta g(x)) - (\alpha f(y) + \beta g(y))| \le |\alpha|\,|f(x) - f(y)| + |\beta|\,|g(x) - g(y)|, \]
and
\[ |f(x)g(x) - f(y)g(y)| \le |f(x)|\,|g(x) - g(y)| + |g(y)|\,|f(x) - f(y)|. \]

Given a real number $r$, define
\[ r^+ = \max\{r, 0\} \quad \text{and} \quad r^- = -\min\{r, 0\}, \]
so that $r = r^+ - r^-$ and $|r| = r^+ + r^-$. Given a partition $\mathcal{P}$ as in (6.2.1) of an interval $[a, b]$ and a function $f : [a, b] \to \mathbb{R}$, define
\[ p(\mathcal{P}, f) = \sum_{i=1}^n (f(x_i) - f(x_{i-1}))^+, \quad \text{and} \quad n(\mathcal{P}, f) = \sum_{i=1}^n (f(x_i) - f(x_{i-1}))^-. \]
Thus,
\[ t(\mathcal{P}, f) = p(\mathcal{P}, f) + n(\mathcal{P}, f), \qquad f(b) - f(a) = p(\mathcal{P}, f) - n(\mathcal{P}, f). \tag{6.2.2} \]
Define
\[ P_a^b(f) = \sup_{\mathcal{P}} p(\mathcal{P}, f), \quad \text{and} \quad N_a^b(f) = \sup_{\mathcal{P}} n(\mathcal{P}, f). \]

Proposition 6.2.2 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Then
\[ T_a^b(f) = P_a^b(f) + N_a^b(f), \tag{6.2.3} \]
and
\[ f(b) - f(a) = P_a^b(f) - N_a^b(f). \tag{6.2.4} \]

Proof: Since $f$ is of bounded variation, we have that $T_a^b(f)$, $P_a^b(f)$ and $N_a^b(f)$ are all finite. Now, for any partition $\mathcal{P}$ of $[a, b]$, we have, using (6.2.2),
\[ p(\mathcal{P}, f) = n(\mathcal{P}, f) + f(b) - f(a) \le N_a^b(f) + f(b) - f(a). \]
We then immediately deduce that
\[ P_a^b(f) \le N_a^b(f) + f(b) - f(a), \]
or, equivalently,
\[ P_a^b(f) - N_a^b(f) \le f(b) - f(a). \]
Interchanging the roles of $p(\mathcal{P}, f)$ and $n(\mathcal{P}, f)$ in the use of (6.2.2), we deduce, in the same way, that
\[ N_a^b(f) - P_a^b(f) \le f(a) - f(b). \]
Thus, we have established (6.2.4).

Since $t(\mathcal{P}, f) = p(\mathcal{P}, f) + n(\mathcal{P}, f)$ for any partition $\mathcal{P}$, it follows that
\[ T_a^b(f) \le P_a^b(f) + N_a^b(f). \]
Again, for any partition $\mathcal{P}$,
\[ T_a^b(f) \ge p(\mathcal{P}, f) + n(\mathcal{P}, f) = p(\mathcal{P}, f) + p(\mathcal{P}, f) - (f(b) - f(a)) = 2p(\mathcal{P}, f) + N_a^b(f) - P_a^b(f), \]
by virtue of (6.2.4). Thus, taking the supremum over all partitions,
\[ T_a^b(f) \ge 2P_a^b(f) + N_a^b(f) - P_a^b(f) = P_a^b(f) + N_a^b(f), \]
which gives the reverse inequality, thereby establishing (6.2.3). This completes the proof.

Theorem 6.2.1 A function $f : [a, b] \to \mathbb{R}$ is of bounded variation if, and only if, it is the difference of two monotonic functions.

Proof: Since monotonic functions are of bounded variation, and since the sum, or difference, of functions of bounded variation is also of bounded variation, we see that the difference of two monotonic functions is of bounded variation.

Conversely, let $f$ be of bounded variation on the interval $[a, b]$. Let $x \in (a, b]$. Define
\[ g(x) = P_a^x(f) \quad \text{and} \quad h(x) = N_a^x(f). \]
By definition, it is easy to see that both $g$ and $h$ are monotonic increasing functions. Consequently, the function $x \mapsto h(x) - f(a)$ is also monotonic increasing. Further, by the preceding proposition, we have
\[ f(x) - f(a) = g(x) - h(x). \]
Thus, $f(x) = g(x) - (h(x) - f(a))$, which completes the proof.

Corollary 6.2.1 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Then, it is differentiable almost everywhere.

Proof: This is a direct consequence of the preceding theorem and Theorem 6.1.1.

Proposition 6.2.3 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Then $f'$ is integrable and
\[ \int_{[a,b]} |f'|\,dm_1 \le T_a^b(f). \]
In addition, if $f \in C^1[a, b]$, then we have equality in the above relation.

Proof: The functions $x \mapsto P_a^x(f)$, $x \mapsto N_a^x(f)$ and $x \mapsto T_a^x(f)$ are all monotonic increasing and so are differentiable almost everywhere. Since we have
\[ f(x) - f(a) = P_a^x(f) - N_a^x(f), \]
we have
\[ f'(x) = (P_a^x(f))' - (N_a^x(f))' \quad \text{a.e.} \]
Since $(P_a^x(f))'$ and $(N_a^x(f))'$ are non-negative (the functions concerned being monotonic increasing), we have
\[ |f'(x)| \le |(P_a^x(f))'| + |(N_a^x(f))'| = (P_a^x(f))' + (N_a^x(f))' = (T_a^x(f))', \]
since $T_a^x(f) = P_a^x(f) + N_a^x(f)$. Since $T_a^x(f)$ is monotonic increasing, we have, by Theorem 6.1.1, that
\[ \int_{[a,b]} |f'| \, dm_1 \le \int_{[a,b]} (T_a^x(f))' \, dm_1 \le T_a^b(f) - T_a^a(f) = T_a^b(f). \]

If $f$ is continuously differentiable, and if $\mathcal{P}$ is any partition as in (6.2.1), we have, for $1 \le i \le n$,
\[ f(x_i) - f(x_{i-1}) = \int_{x_{i-1}}^{x_i} f'(t) \, dt. \]
Thus, it follows that
\[ t(\mathcal{P}, f) \le \sum_{i=1}^n \int_{x_{i-1}}^{x_i} |f'(t)| \, dt = \int_a^b |f'(t)| \, dt = \int_{[a,b]} |f'| \, dm_1. \]
Consequently, we have the reverse inequality
\[ T_a^b(f) \le \int_{[a,b]} |f'| \, dm_1, \]
which completes the proof.

Let us now consider a vector-valued map $f : [a, b] \to \mathbb{R}^N$, where, for $x \in [a, b]$, we have $f(x) = (f_1(x), \cdots, f_N(x))$. We write
\[ |f(x)| = \left( \sum_{i=1}^N |f_i(x)|^2 \right)^{\frac{1}{2}}. \]
We then say that $f$ is of bounded variation over $[a, b]$ if
\[ T_a^b(f) = \sup_{\mathcal{P}} t(\mathcal{P}, f) < +\infty, \]
where the supremum is taken over all partitions $\mathcal{P}$ of $[a, b]$ and, if $\mathcal{P}$ is a partition of $[a, b]$ as in (6.2.1), we set
\[ t(\mathcal{P}, f) = \sum_{i=1}^n |f(x_i) - f(x_{i-1})|. \]
If each $f_i$, $1 \le i \le N$, is integrable over $[a, b]$, we define
\[ \int_{[a,b]} f \, dm_1 = \left( \int_{[a,b]} f_i \, dm_1 \right)_{i=1}^N. \]
If each $f_i$, $1 \le i \le N$, is differentiable in $(a, b)$, we define $f'(x) = (f_i'(x))_{i=1}^N$.

Lemma 6.2.1 Let $f : [a, b] \to \mathbb{R}^N$ be integrable. Then
\[ \left| \int_{[a,b]} f \, dm_1 \right| \le \int_{[a,b]} |f| \, dm_1. \tag{6.2.5} \]

Proof: Set $y = \int_{[a,b]} f \, dm_1$, and $y_i = \int_{[a,b]} f_i \, dm_1$, $1 \le i \le N$. The result is trivially true if $y = 0$. Assume that $y \ne 0$. Then, using the Cauchy-Schwarz inequality in $\mathbb{R}^N$,
\[ |y|^2 = \sum_{i=1}^N y_i^2 = \sum_{i=1}^N y_i \int_{[a,b]} f_i \, dm_1 = \int_{[a,b]} \sum_{i=1}^N y_i f_i \, dm_1 \le \int_{[a,b]} |y|\,|f| \, dm_1 = |y| \int_{[a,b]} |f| \, dm_1, \]
from which (6.2.5) follows on dividing throughout by $|y|$.

Proposition 6.2.4 Let $f : [a, b] \to \mathbb{R}^N$ be a continuously differentiable map. Then $f$ is of bounded variation and
\[ T_a^b(f) = \int_{[a,b]} |f'| \, dm_1. \]

Proof: Proceeding exactly as in the latter part of the proof of Proposition 6.2.3, we can easily see that, owing to the continuous differentiability of $f$,
\[ T_a^b(f) \le \int_{[a,b]} |f'| \, dm_1 < +\infty. \]
Thus, $f$ is of bounded variation. We now prove the reverse inequality. Since $f'$ is continuous over $[a, b]$, it is uniformly continuous. Given $\varepsilon > 0$, let $\delta > 0$ be such that, whenever $|x - y| < \delta$, we have $|f'(x) - f'(y)| < \varepsilon$. Let $\mathcal{P}$ be a partition of $[a, b]$ as in (6.2.1) and such that
\[ \max_{1 \le i \le n} (x_i - x_{i-1}) < \delta. \]
Thus, if $x_{i-1} \le t \le x_i$, we have $|f'(t)| < |f'(x_i)| + \varepsilon$. Therefore,
\[
\begin{aligned}
\int_{x_{i-1}}^{x_i} |f'(t)| \, dt - \varepsilon(x_i - x_{i-1}) &\le |f'(x_i)|(x_i - x_{i-1}) \\
&= \left| \int_{x_{i-1}}^{x_i} (f'(t) + f'(x_i) - f'(t)) \, dt \right| \\
&\le \left| \int_{x_{i-1}}^{x_i} f'(t) \, dt \right| + \left| \int_{x_{i-1}}^{x_i} (f'(x_i) - f'(t)) \, dt \right| \\
&\le |f(x_i) - f(x_{i-1})| + \varepsilon(x_i - x_{i-1}).
\end{aligned}
\]
Summing over all $1 \le i \le n$, we get
\[ \int_a^b |f'(t)| \, dt - \varepsilon(b - a) \le t(\mathcal{P}, f) + \varepsilon(b - a). \]
Thus,
\[ \int_a^b |f'(t)| \, dt \le T_a^b(f) + 2\varepsilon(b - a). \]
Since $\varepsilon > 0$ was arbitrarily chosen, we get
\[ \int_{[a,b]} |f'| \, dm_1 = \int_a^b |f'(t)| \, dt \le T_a^b(f), \]
which completes the proof.

Example 6.2.4 (Rectifiable arcs) An arc, or a curve, in the plane, is a continuous map of the form $\gamma : [a, b] \to \mathbb{R}^2$. To compute the 'length' of the curve, we partition $[a, b]$ as in (6.2.1) and get the approximate length as
\[ \sum_{i=1}^n |\gamma(x_i) - \gamma(x_{i-1})|, \]
which is none other than the sum of the lengths of the chords connecting the successive points of the set $\{\gamma(x_i)\}_{i=0}^n$ lying on the curve $\gamma$. We say that the arc is rectifiable, i.e. its length is well-defined, if the supremum of the above sum, taken over all partitions, is finite, and that number is called the length of the arc. In other words, an arc given by the mapping $\gamma$ is rectifiable if, and only if, the map $\gamma$ is of bounded variation and the length of the arc is, in fact, $T_a^b(\gamma)$. Let us assume that the arc is given by the parametric equations: $x = x(t)$, $y = y(t)$, where $t \in [a, b]$.
If the functions $x(t)$ and $y(t)$ are continuously differentiable, then, by the preceding proposition, we get that the length, $L$, of the curve is given by the formula
\[ L = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2} \, dt, \]
which is precisely what we define as the length of a curve in an undergraduate calculus course.

6.3 Differentiation of an indefinite integral

In this section, we will show that the derivative of the indefinite integral of an integrable function is equal to the integrand, almost everywhere.

Proposition 6.3.1 Let $f : [a, b] \to \mathbb{R}$ be an integrable function. The indefinite integral of $f$, defined by
\[ F(x) = \int_{[a,x]} f \, dm_1, \quad x \in [a, b], \]
is a uniformly continuous function of bounded variation over $[a, b]$.

Proof: Let $x, y \in [a, b]$, with $x < y$. Then
\[ |F(x) - F(y)| = \left| \int_{[x,y]} f \, dm_1 \right| \le \int_{[x,y]} |f| \, dm_1. \]
Given $\varepsilon > 0$, we know that (cf. Proposition 5.3.2) there exists $\delta > 0$ such that, whenever $|x - y| < \delta$, we have
\[ \int_{[x,y]} |f| \, dm_1 < \varepsilon, \]
since $f$ is integrable. This proves the uniform continuity of $F$. If $\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_n = b\}$ is any partition of $[a, b]$, then
\[ \sum_{i=1}^n |F(x_i) - F(x_{i-1})| \le \sum_{i=1}^n \int_{[x_{i-1},x_i]} |f| \, dm_1 = \int_{[a,b]} |f| \, dm_1. \]
Thus, it follows that
\[ T_a^b(F) \le \int_{[a,b]} |f| \, dm_1 < +\infty, \]
which proves that $F$ is of bounded variation over $[a, b]$.

Remark 6.3.1 The above result shows that in order that a function be the indefinite integral of an integrable function, it must be at least a function of bounded variation. In fact, we will see in the next section that even more will be required. Thus, in general, a differentiable function may not be the indefinite integral of its derivative, even if the derivative is integrable. It needs to be at least a uniformly continuous function of bounded variation, with additional properties.

Proposition 6.3.2 Let $f : [a, b] \to \mathbb{R}$ be an integrable function such that
\[ \int_{[a,x]} f \, dm_1 = 0, \]
for all $x \in [a, b]$. Then $f(x) = 0$ for almost every $x \in [a, b]$.
Proof: Set $E_+ = \{x \in [a, b] \mid f(x) > 0\}$ and $E_- = \{x \in [a, b] \mid f(x) < 0\}$. Assume that $m_1(E_+) > 0$. Then (cf. Proposition 2.2.2), there exists a closed set $F \subset E_+$ such that $m_1(F) > 0$. Set $U = (a, b) \setminus F$, which is open. Then, $U$ can be written as the disjoint union of a countable number of half-open intervals (cf. Lemma 2.2.1):
\[ U = \cup_{n=1}^\infty [a_n, b_n). \]
Set $f_n = f \chi_{\cup_{k=1}^n [a_k, b_k)}$. Then $f_n \to f$ in $U$ and $|f_n| \le |f|$, which is integrable. Thus, by the dominated convergence theorem, we have that
\[ \int_U f \, dm_1 = \sum_{n=1}^\infty \int_{[a_n,b_n)} f \, dm_1. \]
Since $f > 0$ on $F$ and since $F$ has positive measure, we have (cf. Proposition 5.2.2) $\int_F f \, dm_1 > 0$, and so, since
\[ 0 = \int_{[a,b]} f \, dm_1 = \int_U f \, dm_1 + \int_F f \, dm_1, \]
we deduce that $\int_U f \, dm_1 \ne 0$. Consequently, there exists $n \in \mathbb{N}$ such that $\int_{[a_n,b_n)} f \, dm_1 \ne 0$. But this is a contradiction, since
\[ \int_{[a_n,b_n)} f \, dm_1 = \int_{[a,b_n)} f \, dm_1 - \int_{[a,a_n)} f \, dm_1 = 0, \]
by hypothesis. Thus, it follows that $m_1(E_+) = 0$. Similarly, we can show that $m_1(E_-) = 0$. This shows that $f = 0$ almost everywhere.

Proposition 6.3.3 Let $f : [a, b] \to \mathbb{R}$ be a bounded measurable function. Define
\[ F(x) = F(a) + \int_{[a,x]} f \, dm_1, \quad x \in [a, b], \tag{6.3.1} \]
where $F(a)$ is an arbitrary constant. Then $F$ is differentiable almost everywhere and $F'(x) = f(x)$ for almost every $x \in [a, b]$.

Proof: By Proposition 6.3.1, we know that $F$ is uniformly continuous and of bounded variation and hence that it is differentiable almost everywhere. Let $|f(x)| \le M$ for all $x \in [a, b]$. For $n \in \mathbb{N}$, define
\[ f_n(x) = n \left( F\left(x + \frac{1}{n}\right) - F(x) \right) = n \int_{[x, x+\frac{1}{n}]} f \, dm_1. \]
Then $|f_n(x)| \le M$ for all $x \in [a, b]$ and $f_n \to F'$ almost everywhere, as $n \to \infty$. Hence, by the dominated convergence theorem, for any $c \in (a, b)$, we have
\[
\begin{aligned}
\int_{[a,c]} F' \, dm_1 &= \lim_{n \to \infty} \int_{[a,c]} f_n \, dm_1 = \lim_{n \to \infty} n \int_{[a,c]} \left( F\left(x + \tfrac{1}{n}\right) - F(x) \right) dm_1(x) \\
&= \lim_{n \to \infty} n \left( \int_{[c, c+\frac{1}{n}]} F \, dm_1 - \int_{[a, a+\frac{1}{n}]} F \, dm_1 \right).
\end{aligned}
\]
(We have used the result of Exercise 5.3.)
Now, since $F$ is uniformly continuous, given $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$, for all $x \in [a, b)$ and for all $t \in [x, x + \frac{1}{n}]$, we have $|F(t) - F(x)| < \varepsilon$. Consequently, for any $x \in [a, b)$ and for any $n \ge N$, we have
\[ \left| n \int_{[x, x+\frac{1}{n}]} F \, dm_1 - F(x) \right| = \left| n \int_x^{x+\frac{1}{n}} (F(t) - F(x)) \, dt \right| \le \varepsilon. \]
Thus, we deduce that, for any $c \in [a, b)$,
\[ \int_{[a,c]} F' \, dm_1 = F(c) - F(a) = \int_{[a,c]} f \, dm_1. \]
We then conclude, by applying the result of Proposition 6.3.2, that $F' = f$ almost everywhere.

We can discard the hypothesis of boundedness of the integrand.

Theorem 6.3.1 Let $f : [a, b] \to \mathbb{R}$ be an integrable function. Let $F$ be defined as in (6.3.1). Then $F$ is differentiable almost everywhere and $F'(x) = f(x)$ for almost every $x \in [a, b]$.

Proof: Let us first assume that $f \ge 0$. Define
\[ f_n(x) = \begin{cases} f(x), & \text{if } f(x) \le n, \\ n, & \text{if } f(x) > n. \end{cases} \]
Then each $f_n$ is bounded and the sequence $\{f_n\}_{n=1}^\infty$ increases to $f$. Thus, $f - f_n \ge 0$. Define
\[ G_n(x) = \int_{[a,x]} (f - f_n) \, dm_1. \]
Then $G_n$ is monotonic increasing and is hence differentiable almost everywhere and its derivative is non-negative. Further, since $f_n$ is bounded, by the preceding proposition, we have
\[ \frac{d}{dx} \int_{[a,x]} f_n \, dm_1 = f_n(x), \]
for almost every $x \in [a, b]$. Now, we can write
\[ F(x) = F(a) + G_n(x) + \int_{[a,x]} f_n \, dm_1, \]
and so, for almost every $x \in [a, b]$, we have
\[ F'(x) = G_n'(x) + f_n(x) \ge f_n(x). \]
Since $n \in \mathbb{N}$ was arbitrarily chosen, we have
\[ F'(x) \ge f(x) \quad \text{a.e.}, \tag{6.3.2} \]
which implies that
\[ \int_{[a,b]} F' \, dm_1 \ge \int_{[a,b]} f \, dm_1 = F(b) - F(a). \tag{6.3.3} \]
On the other hand, since $f$ is non-negative, we have that $F$ is monotonic increasing and hence, by Theorem 6.1.1,
\[ \int_{[a,b]} F' \, dm_1 \le F(b) - F(a). \tag{6.3.4} \]
It now follows, from (6.3.3) and (6.3.4), that
\[ \int_{[a,b]} F' \, dm_1 = \int_{[a,b]} f \, dm_1 = F(b) - F(a). \]
Since, by (6.3.2), we have that $F' \ge f$ almost everywhere, the above relation implies that $F' = f$ almost everywhere (cf. Proposition 5.2.2). This completes the proof for non-negative $f$.
In the general case, we write $f = f^+ - f^-$. Then
\[ F(x) = F(a) + \int_{[a,x]} f^+ \, dm_1 - \int_{[a,x]} f^- \, dm_1. \]
It now follows, from the preceding arguments, that for almost every $x \in [a, b]$, we have
\[ F'(x) = f^+(x) - f^-(x) = f(x). \]

6.4 Absolute Continuity

From the previous section, we see that in order that a given function $f : [a, b] \to \mathbb{R}$ be written as an indefinite integral of an integrable function, it must be at least uniformly continuous and of bounded variation. We will now introduce a new concept which will provide both a necessary and sufficient condition for a function to be written as an indefinite integral.

Definition 6.4.1 A function $f : [a, b] \to \mathbb{R}$ is said to be absolutely continuous if, for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds: given any finite collection of disjoint intervals $\{(x_k, y_k)\}_{k=1}^n$ in $[a, b]$ such that
\[ \sum_{k=1}^n |y_k - x_k| < \delta, \tag{6.4.1} \]
we have
\[ \sum_{k=1}^n |f(y_k) - f(x_k)| < \varepsilon. \tag{6.4.2} \]

Remark 6.4.1 Clearly an absolutely continuous function is uniformly continuous.

Example 6.4.1 If $f : [a, b] \to \mathbb{R}$ is Lipschitz continuous, then it is absolutely continuous. If $L$ is the Lipschitz constant (cf. Example 6.2.1), then, choose $\delta = \frac{\varepsilon}{L}$. Then, if $\{(x_k, y_k)\}_{k=1}^n$ is a collection of disjoint intervals in $[a, b]$, satisfying (6.4.1), we have
\[ \sum_{k=1}^n |f(y_k) - f(x_k)| \le L \sum_{k=1}^n |y_k - x_k| < \varepsilon. \]
Thus, every differentiable function whose derivative is bounded on $[a, b]$ will be absolutely continuous.

Example 6.4.2 Let $f : [a, b] \to \mathbb{R}$ be an integrable function and let $F$ be defined as in (6.3.1). Then $F$ is absolutely continuous. To see this, let $\{(x_k, y_k)\}_{k=1}^n$ be a collection of disjoint intervals in $[a, b]$, satisfying (6.4.1). Then
\[ \sum_{k=1}^n |F(y_k) - F(x_k)| \le \int_{\cup_{k=1}^n (x_k, y_k)} |f| \, dm_1. \]
The existence of $\delta > 0$, given $\varepsilon > 0$, such that (6.4.2) is true follows from Proposition 5.3.2. (See also Remark 5.3.2).
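As a numerical illustration of Example 6.4.2 (it is not part of the original text), the following Python sketch considers the integrable function $f(t) = 1/(2\sqrt{t})$ on $(0, 1]$, whose indefinite integral is $F(x) = \sqrt{x}$. This $F$ is not Lipschitz continuous near $0$, so Example 6.4.1 does not apply to it, but it is absolutely continuous. Since $f$ is decreasing, the sum in (6.4.2), taken over disjoint intervals of total length less than $\delta$, is at most $F(\delta) - F(0) = \sqrt{\delta}$, so that $\delta = \varepsilon^2$ is an admissible choice. The helper names `F`, `variation_sum` and `sample_intervals`, as well as the particular family of test intervals, are assumptions made only for this illustration.

```python
import math


def F(x):
    """Indefinite integral of f(t) = 1 / (2 * sqrt(t)) over [0, x], i.e. sqrt(x)."""
    return math.sqrt(x)


def variation_sum(intervals):
    """Left-hand side of (6.4.2): sum of |F(y_k) - F(x_k)| over the intervals."""
    return sum(abs(F(y) - F(x)) for x, y in intervals)


def sample_intervals(n, delta):
    """n disjoint intervals in [0, 1] of total length delta / 2 (hypothetical test family)."""
    length = delta / (2 * n)
    return [(k / n, k / n + length) for k in range(n)]


eps = 0.1
delta = eps ** 2  # admissible because the sum in (6.4.2) is at most sqrt(delta)

for n in (1, 10, 100, 1000):
    intervals = sample_intervals(n, delta)
    total_length = sum(y - x for x, y in intervals)
    # Both flags should be True: (6.4.1) holds and so does (6.4.2).
    print(n, total_length < delta, variation_sum(intervals) < eps)

# The difference quotients at 0 grow without bound, so F is not Lipschitz on [0, 1].
for h in (1e-2, 1e-4, 1e-6):
    print(h, F(h) / h)
```

The test family always contains an interval starting at $0$, where $F$ is steepest, which is the worst case for the sum in (6.4.2).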
Our aim now is to prove the converse of the result in the above example, thereby establishing absolute continuity as the necessary and sufficient condition for a function to be written as an indefinite integral (of its derivative).

Proposition 6.4.1 Let $f : [a, b] \to \mathbb{R}$ be absolutely continuous. Then, $f$ is of bounded variation over $[a, b]$. In particular, $f$ is differentiable almost everywhere in $[a, b]$.

Proof: Let $\delta > 0$ correspond to $\varepsilon = 1$ in the definition of absolute continuity of $f$. Let $K$ be the integral part of $1 + \frac{b-a}{\delta}$. Then, given any partition $\mathcal{P}$ of $[a, b]$, we can refine it to a partition $\mathcal{P}'$ such that the constituent intervals of $\mathcal{P}'$ can be grouped into $K$ sets of intervals, each with total length less than $\delta$. Then,
\[ t(\mathcal{P}, f) \le t(\mathcal{P}', f) \le K. \]
Thus $T_a^b(f) \le K < +\infty$.

Proposition 6.4.2 Let $f : [a, b] \to \mathbb{R}$ be an absolutely continuous function such that $f' = 0$ almost everywhere in $[a, b]$. Then $f$ is a constant function.

Proof: Let $c \in (a, b]$ be arbitrarily chosen. Set
\[ E = \{x \in (a, c) \mid f'(x) = 0\}. \]
By hypothesis, we have that $m_1(E) = c - a$. Let $\varepsilon$ and $\eta$ be arbitrarily small positive numbers, and let $\delta > 0$ correspond to $\varepsilon$ in the definition of absolute continuity of $f$. If $x \in E$, then, for sufficiently small $h > 0$, we have $[x, x + h] \subset (a, c)$ and also that $|f(x + h) - f(x)| < \eta h$. Then, by the Vitali covering lemma, we can find a finite disjoint set of intervals, $\{(x_k, x_k + h_k)\}_{k=1}^n$, such that
\[ m_1(E \setminus \cup_{k=1}^n (x_k, y_k)) < \delta, \]
where we have set $y_k = x_k + h_k$, $1 \le k \le n$. Without loss of generality, we may assume that the $\{x_k\}_{k=1}^n$ have been labelled in increasing order of magnitude. Thus we have
\[ y_0 = a \le x_1 < y_1 \le x_2 < y_2 \le \cdots \le x_n < y_n \le c = x_{n+1}, \]
and
\[ \sum_{k=0}^n |x_{k+1} - y_k| < \delta. \]
Now, on one hand, we have
\[ \sum_{k=1}^n |f(y_k) - f(x_k)| < \eta \sum_{k=1}^n |y_k - x_k| \le \eta(c - a). \]
On the other hand, by absolute continuity, we have
\[ \sum_{k=0}^n |f(x_{k+1}) - f(y_k)| < \varepsilon. \]
Together, these relations yield
\[ |f(c) - f(a)| < \varepsilon + \eta(c - a). \]
Since $\varepsilon$ and $\eta$ were arbitrarily chosen, it follows that $f(c) = f(a)$, and this completes the proof, since $c$ was fixed arbitrarily in $(a, b]$.

Remark 6.4.2 The Cantor function is an example of a non-constant function whose derivative vanishes almost everywhere. Thus, the Cantor function is not absolutely continuous.

Theorem 6.4.1 A function $F : [a, b] \to \mathbb{R}$ can be written as an indefinite integral of an integrable function if, and only if, it is absolutely continuous.

Proof: We have already seen, in Example 6.4.2, that if $F$ is an indefinite integral, then it is absolutely continuous. Conversely, let $F$ be absolutely continuous. Since it is of bounded variation, it can be written as the difference of two monotonic increasing functions (cf. Theorem 6.2.1). Let $F = F_1 - F_2$, where $F_i$, $i = 1, 2$, are monotonic increasing. Then $F' = F_1' - F_2'$ and so, by Theorem 6.1.1, we have
\[
\begin{aligned}
\int_{[a,b]} |F'| \, dm_1 &\le \int_{[a,b]} |F_1'| \, dm_1 + \int_{[a,b]} |F_2'| \, dm_1 \\
&= \int_{[a,b]} F_1' \, dm_1 + \int_{[a,b]} F_2' \, dm_1 \\
&\le \sum_{i=1}^2 (F_i(b) - F_i(a)) < +\infty.
\end{aligned}
\]
Thus, $F'$ is integrable. Let
\[ G(x) = \int_{[a,x]} F' \, dm_1. \]
Then $G$ is absolutely continuous and hence so is the function $f = F - G$. Now, by Theorem 6.3.1, we have that $G' = F'$ almost everywhere, i.e. $f$ is an absolutely continuous function whose derivative $f' = F' - G'$ vanishes almost everywhere. Thus, $f$ is a constant, equal to $f(a) = F(a)$. Thus, we have
\[ F(a) = F(x) - \int_{[a,x]} F' \, dm_1, \]
or, equivalently,
\[ F(x) = F(a) + \int_{[a,x]} F' \, dm_1. \]
This completes the proof.

6.5 Exercises

6.1 Let $f : [a, b] \to \mathbb{R}$ be a monotonic function. For $x \in (a, b)$, define
\[ f(x+) = \lim_{h \downarrow 0} f(x + h) \quad \text{and} \quad f(x-) = \lim_{h \downarrow 0} f(x - h). \]
Show that $f(x+)$ and $f(x-)$ always exist. Deduce that the set of discontinuities of $f$ is at most countable.

6.2 For $x \in [-1, 1]$, define
\[ f(x) = \begin{cases} x \sin \frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases} \]
Compute $D^+ f(0)$, $D_+ f(0)$, $D^- f(0)$ and $D_- f(0)$.
6.3 Show that the function $f$ defined on $[0, 1]$ by
\[ f(x) = \begin{cases} x^2 \sin \frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0, \end{cases} \]
is of bounded variation.

6.4 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Let $a \le c \le b$. Show that
\[ T_a^b(f) = T_a^c(f) + T_c^b(f). \]

6.5 Let $f : [a, b] \to \mathbb{R}$ be an absolutely continuous function. Show that
\[ T_a^b(f) = \int_{[a,b]} |f'| \, dm_1, \quad P_a^b(f) = \int_{[a,b]} (f')^+ \, dm_1, \quad \text{and} \quad N_a^b(f) = \int_{[a,b]} (f')^- \, dm_1. \]

6.6 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Define, for $x \in [a, b]$, $v_f(x) = T_a^x(f)$.
(a) If $f$ is continuous, show that $v_f$ is also continuous.
(b) If $f$ is absolutely continuous, show that $v_f$ is also absolutely continuous.

6.7 A monotone function is said to be singular if its derivative vanishes almost everywhere. Show that if $f : [a, b] \to \mathbb{R}$ is a monotonic increasing function, then it can be written as the sum of a singular function and an absolutely continuous function.

6.8 Let $f : [0, 1] \to \mathbb{R}$ be a continuous function which is absolutely continuous on $[\varepsilon, 1]$ for every $0 < \varepsilon < 1$.
(a) Show, by means of an example, that $f$ need not be absolutely continuous on $[0, 1]$.
(b) If, in addition, $f$ is of bounded variation on $[0, 1]$, show that it is absolutely continuous on $[0, 1]$.

6.9 (Another example of a Cantor function) Consider the Cantor set $C$ (cf. Example 2.1.3). For each $n \in \mathbb{N}$, let $E_n = [0, 1] \setminus X_n$, where $X_n$ is as described in Example 2.1.3. Define, for $x \in [0, 1]$,
\[ g_n(x) = \left( \frac{3}{2} \right)^n \chi_{E_n}(x) \quad \text{and} \quad f_n(x) = \int_{[0,x]} g_n \, dm_1. \]
(a) Show that, for each $n \in \mathbb{N}$, $f_n$ is a monotonic increasing function such that $f_n(0) = 0$, $f_n(1) = 1$ and that $f_n$ is constant on each constituent interval of $X_n = E_n^c$.
(b) If $I$ is any constituent interval of $E_n$, $n \in \mathbb{N}$, show that
\[ \int_I g_n \, dm_1 = \int_I g_{n+1} \, dm_1 = 2^{-n}. \]
(c) Let $n \in \mathbb{N}$. Show that $f_{n+1}(x) = f_n(x)$ if $x \notin E_n$, and that
\[ |f_{n+1}(x) - f_n(x)| \le \frac{3}{2^n} \]
for all $x \in E_n$.
(d) Deduce the existence of a continuous function $f : [0, 1] \to \mathbb{R}$ which is monotonic increasing, whose derivative vanishes at every point $x \notin C$, where $C$ is the Cantor set, and such that $f(0) = 0$, $f(1) = 1$.

Chapter 7 Change of variable

7.1 The Fréchet derivative

Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear transformation. We have seen (cf. Theorem 2.3.3) that if $E \subset \mathbb{R}^N$ is a measurable set, then
\[ m_N(A(E)) = |\det(A)| \, m_N(E). \]
Now, using the procedure outlined in Remark 5.2.4, it is a simple exercise to see that, if $f : \mathbb{R}^N \to \mathbb{R}$ is an integrable function, then
\[ \int_{\mathbb{R}^N} f \, dm_N = |\det(A)| \int_{\mathbb{R}^N} (f \circ A) \, dm_N, \]
where $f \circ A$ stands for the composition of the two mappings $A$ and $f$. The aim of this chapter is to generalize this result to suitable transformations on open sets in $\mathbb{R}^N$. In order to do this, we need the tools of differential calculus in $\mathbb{R}^N$, which we recall in this section. For proofs of all assertions made in this section, see, for example, Kesavan [4].

Definition 7.1.1 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^M$ be a given mapping. The mapping is said to be differentiable at a point $a \in U$, if there exists a linear transformation $A : \mathbb{R}^N \to \mathbb{R}^M$ such that
\[ \lim_{h \to 0} \frac{|T(a + h) - T(a) - A(h)|}{|h|} = 0, \tag{7.1.1} \]
where $|\cdot|$ denotes the euclidean length of a vector in the appropriate euclidean space. The linear map $A$ is called the Fréchet derivative of $T$ at the point $a \in U$ and is denoted by $T'(a)$.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_7

Remark 7.1.1 The following facts are immediate consequences of the definition:
(i) If $T$ is differentiable at a point $a \in U$, then it is continuous at that point.
(ii) The derivative at a point, if it exists, is unique. It is for this purpose that we work in an open set.
(iii) If $T$ is itself a linear map, then it is differentiable at every point of $U$ and, for every $a \in U$, we have $T'(a) = T$.

Remark 7.1.2 The relation (7.1.1) can be written in an equivalent fashion as follows:
\[ T(a + h) = T(a) + T'(a)(h) + \varepsilon(h), \]
where the error term $\varepsilon(h)$ satisfies the condition
\[ \lim_{h \to 0} \frac{|\varepsilon(h)|}{|h|} = 0. \]

Definition 7.1.2 Let $U \subset \mathbb{R}^N$ be an open set. A mapping $T : U \to \mathbb{R}^M$ is said to be differentiable on $U$ if $T'(x)$ exists for every $x \in U$. The mapping $T$ is said to be of class $C^1$ on $U$, or, equivalently, $T$ is said to be continuously differentiable on $U$, if it is differentiable on $U$ and, in addition, the mapping $x \mapsto T'(x)$ from $U$ into the space of all linear transformations from $\mathbb{R}^N$ into $\mathbb{R}^M$, denoted $\mathcal{L}(\mathbb{R}^N, \mathbb{R}^M)$, is continuous when the latter space is endowed with its usual topology. If $V \subset \mathbb{R}^N$ is an open set, a mapping $T : U \to V$ is said to be a diffeomorphism if $T$ is a bijection and if both $T$ and $T^{-1}$ are continuously differentiable maps.

Example 7.1.1 If $N = M = 1$, then the Fréchet derivative is the familiar derivative that we define when studying the calculus of functions of a single variable. In this case, the derivative $T'(a)$ at a point $a \in U$ is a real number, which can be visualised as a linear map from $\mathbb{R}$ into itself, acting on $\mathbb{R}$ by multiplication, i.e. $T'(a)(h) = T'(a)h$.

Example 7.1.2 Let $N > 1$ and let $M = 1$. Then $T'(a)$ is a linear functional on $\mathbb{R}^N$, and so, it can be represented by a vector in $\mathbb{R}^N$. Indeed, we have
\[ T'(a) = \nabla T(a) = \left( \frac{\partial T}{\partial x_1}(a), \cdots, \frac{\partial T}{\partial x_N}(a) \right), \]
where $\{\frac{\partial T}{\partial x_i}(a)\}_{i=1}^N$ are the usual partial derivatives of $T$ at the point $a$. It can be shown that if $T$ is differentiable at $a \in U$, then all partial derivatives of $T$ exist at that point. Thus, the action of $T'(a)$ on a vector $h = (h_1, \cdots, h_N)$ is given by
\[ T'(a)(h) = \sum_{i=1}^N \frac{\partial T}{\partial x_i}(a) h_i. \]

Example 7.1.3 Consider the mapping $T : \mathbb{R}^2 \to \mathbb{R}$ defined, for $(x, y) \in \mathbb{R}^2$, by
\[ T(x, y) = \begin{cases} \dfrac{x^5}{(y - x^2)^2 + x^4}, & \text{if } (x, y) \ne (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]
Then one can verify that both partial derivatives exist at the point $(0, 0)$: since $T(x, 0) = \frac{x}{2}$ and $T(0, y) = 0$, we have
\[ \frac{\partial T}{\partial x}(0, 0) = \frac{1}{2}, \qquad \frac{\partial T}{\partial y}(0, 0) = 0. \]
If $T$ were differentiable at the origin, then it would follow, from the preceding example, that $T'((0, 0))(h) = \frac{h_1}{2}$ for $h = (h_1, h_2)$. In particular, we must then have
\[ \lim_{h \to 0} \frac{|T(h) - \frac{h_1}{2}|}{|h|} = 0. \]
However, if we take $h = (t, t^2)$, where $t \to 0$, then $T(h) = t$ and we can easily see that the above limit is $\frac{1}{2}$. Thus, while the partial derivatives will all exist at a point if the mapping is differentiable there, the converse is not true. The partial derivatives may exist at a point, but the mapping can still fail to be differentiable there.

Example 7.1.4 Let $N > 1$ and $M > 1$. Let $T(x) = (T_1(x), \cdots, T_M(x))$, where $T_i$ is a mapping of $U \subset \mathbb{R}^N$ into $\mathbb{R}$. If $T$ is differentiable at a point $a \in U$, then all the $T_i$, $1 \le i \le M$, are also differentiable at $a$. The derivative $T'(a)$ can now be represented by an $M \times N$ matrix. We have, for $h = (h_1, \cdots, h_N) \in \mathbb{R}^N$,
\[ T'(a)(h) = \begin{pmatrix} \frac{\partial T_1}{\partial x_1}(a) & \cdots & \frac{\partial T_1}{\partial x_N}(a) \\ \vdots & & \vdots \\ \frac{\partial T_M}{\partial x_1}(a) & \cdots & \frac{\partial T_M}{\partial x_N}(a) \end{pmatrix} \begin{pmatrix} h_1 \\ \vdots \\ h_N \end{pmatrix}. \]

Definition 7.1.3 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^N$ be a differentiable map defined on $U$. The Jacobian of $T$ at a point $a \in U$ is denoted by $J_T(a)$ and is equal to $\det(T'(a))$.

We will now recall (without proof) some important results from the differential calculus in $\mathbb{R}^N$, which are well known when $N = 1$.

Theorem 7.1.1 (Chain rule) Let $N_i \in \mathbb{N}$, $1 \le i \le 3$. Let $U \subset \mathbb{R}^{N_1}$ and $V \subset \mathbb{R}^{N_2}$ be open sets. Let $f : U \to \mathbb{R}^{N_2}$ and let $g : V \to \mathbb{R}^{N_3}$ be continuous mappings. Let $a \in U$ be such that $f(a) \in V$. Assume that $f$ is differentiable at $a$ and that $g$ is differentiable at $f(a)$. Then the composition $h = g \circ f$, which is defined on the open set $U' = f^{-1}(V) \subset U$, is differentiable at $a$ and
\[ h'(a) = g'(f(a)) \circ f'(a). \]

Notation
• Let $a$ and $b$ be vectors in $\mathbb{R}^N$. We denote the (closed) line segment connecting these points by $[a, b]$. Thus, $[a, b] = \{ta + (1 - t)b \mid 0 \le t \le 1\}$.
• Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear transformation.
We denote by $\|A\|$ its norm, i.e.
\[ \|A\| = \max_{|x|=1} |A(x)|. \]

Theorem 7.1.2 (Mean value theorem) Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^M$ be a given mapping. Let $a, b \in U$ be such that $[a, b] \subset U$. If $T$ is differentiable in $U$, then
\[ |T(b) - T(a)| \le \sup_{x \in [a,b]} \|T'(x)\| \, |b - a|. \tag{7.1.2} \]

Remark 7.1.3 The mean value theorem has numerous applications. For instance, let $T : U \to \mathbb{R}$ be a mapping such that its partial derivatives exist at all points of $U$. Assume that the mappings $x \mapsto \frac{\partial T}{\partial x_i}(x)$ are continuous at a point $a \in U$ for all $1 \le i \le N$. Then, using the mean value theorem, we can show that $T$ is differentiable at $a$ (cf. Example 7.1.3).

Theorem 7.1.3 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^M$ be a mapping of class $C^1$. Then, if $[a, a + h] \subset U$, we have
\[ T(a + h) = T(a) + \int_0^1 T'(a + th)(h) \, dt. \tag{7.1.3} \]

7.2 Sard's theorem

Definition 7.2.1 Let $U \subset \mathbb{R}^N$ be an open set. Let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Let $x \in U$. We say that the point $x$ is a singular point, or a critical point, if the rank of $T'(x)$ is strictly less than $N$, i.e. the linear transformation $T'(x)$ is singular. The value $T(x)$ is called a singular value or critical value of $T$. If $T^{-1}\{y\}$ does not contain any critical point, we say that $y$ is a regular value of $T$.

Theorem 7.2.1 (Sard) Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Then the set of critical values of $T$ has measure zero.

Proof: Let $S$ be the set of critical points of $T$, i.e.
\[ S = \{x \in U \mid J_T(x) = 0\}. \]
We need to show that $m_N(T(S)) = 0$.

Step 1. Let $C$ be a closed cube of side $a$, with sides parallel to the coordinate axes, contained in $U$. Since $T'$ is bounded and uniformly continuous on $C$ (since $T$ is continuously differentiable), given $\varepsilon > 0$, there exists $\delta > 0$ such that $\|T'(x) - T'(y)\| < \varepsilon$, whenever $|x - y| < \delta$ and $x, y \in C$. Let us divide $C$ into $k^N$ similar cubes, each of side $\frac{a}{k}$, with $k$ being chosen large enough such that the diameter of each sub-cube is less than $\delta$, i.e. $\sqrt{N} \frac{a}{k} < \delta$.
If $\|T'(x)\| \le L$ for all $x \in C$, we have $|T(x) - T(y)| \le L|x - y|$ for all $x, y \in C$, by virtue of the mean value theorem (cf. Theorem 7.1.2).

Step 2. Let $x \in C \cap S$. Then, $x$ must belong to one of the sub-cubes, say, $\widetilde{C}$. Given any $y \in \widetilde{C}$, we have, on one hand,
\[ |T(x) - T(y)| \le L \sqrt{N} \frac{a}{k}. \tag{7.2.1} \]
On the other hand, by virtue of Theorem 7.1.3, we have
\[ T(y) - T(x) - T'(x)(y - x) = \int_0^1 (T'(x + t(y - x)) - T'(x))(y - x) \, dt. \]
Using the uniform continuity of $T'$ in $C$, we get
\[ |T(y) - T(x) - T'(x)(y - x)| \le \varepsilon |y - x| \le \varepsilon \sqrt{N} \frac{a}{k}. \tag{7.2.2} \]

Step 3. Set $H = T'(x)(\mathbb{R}^N)$. Since $x$ is a critical point, it follows that the dimension of the subspace $H$ is at most $N - 1$. Hence by (7.2.2), we deduce that
\[ \operatorname{dist}(T(y), T(x) + H) \le \varepsilon \sqrt{N} \frac{a}{k}, \tag{7.2.3} \]
for every $y \in \widetilde{C}$. Combining (7.2.1) and (7.2.3), we deduce that $T(\widetilde{C})$ is contained within a cylindrical block of radius $L \sqrt{N} \frac{a}{k}$ and height $2\varepsilon \sqrt{N} \frac{a}{k}$. If $\omega_{N-1}$ is the measure of the unit ball in $\mathbb{R}^{N-1}$, we have
\[ m_N(T(\widetilde{C})) \le \omega_{N-1} \left( L \sqrt{N} \frac{a}{k} \right)^{N-1} 2\varepsilon \sqrt{N} \frac{a}{k} = K(N, C) \varepsilon k^{-N}. \]
Thus
\[ m_N(T(C \cap S)) \le \sum_{\widetilde{C} \subset C,\ \widetilde{C} \cap S \ne \emptyset} m_N(T(\widetilde{C})) \le k^N K(N, C) \varepsilon k^{-N} = K(N, C) \varepsilon. \]
Since $\varepsilon$ can be chosen arbitrarily small, we deduce that $m_N(T(C \cap S)) = 0$. Now $U$ can be covered by a countable number of such cubes and so the result follows.

7.3 Diffeomorphisms

Lemma 7.3.1 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Let $x \in U$ be such that $T'(x)$ is non-singular. Let $C \subset U$ be a closed cube with centre at $x$, its sides parallel to the coordinate axes and of length $\nu$. Then, given $\varepsilon > 0$, there exists $\delta > 0$ such that, if $\nu < \delta$, and if $\widetilde{C}$ is any sub-cube of $C$ with sides parallel to the coordinate axes, we have
\[ m_N(T(\widetilde{C})) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{\widetilde{C}} |J_T| \, dm_N. \tag{7.3.1} \]

Proof: Let $B \subset U$ be a closed and bounded neighbourhood of $x$ so that $T'$ is bounded and uniformly continuous on $B$. We will only work with cubes $C$ contained in $B$. Let
\[ L = \max_{\xi \in B} \|T'(\xi)\|. \]
If we consider any sub-cube $\widetilde{C}$ of $C$, with sides parallel to the coordinate axes and with centre at $x'$, we have, by the mean value theorem,
\[ |T(y) - T(x')| \le L|y - x'|, \]
for every $y \in \widetilde{C}$. Thus, it follows that for any such sub-cube,
\[ m_N(T(\widetilde{C})) \le L^N m_N(\widetilde{C}). \tag{7.3.2} \]
Let $\varepsilon > 0$ be arbitrarily chosen. Then, by the uniform continuity of $T'$, we can find $\delta > 0$ such that, if $|y - x| < \delta$, we have
\[ \|(T'(x))^{-1} T'(y)\| < 1 + \varepsilon, \qquad |J_T(y)| > (1 - \varepsilon)|J_T(x)|. \tag{7.3.3} \]
Let $\nu < \delta$ so that the above relations are valid throughout $C$. Let $\widetilde{C}$ be any sub-cube of $C$, as described earlier. Then, by virtue of Theorem 2.3.3, we have
\[ |J_T(x)|^{-1} m_N(T(\widetilde{C})) = m_N((T'(x))^{-1} T(\widetilde{C})). \]
Now, $T'(x)$ is a fixed linear transformation. Hence, by Remark 7.1.1 (iii) and the chain rule (cf. Theorem 7.1.1), we have that $(T'(x))^{-1} T$ is differentiable and its derivative at any point $\xi$ is $(T'(x))^{-1} T'(\xi)$. Since we have that
\[ \max_{\xi \in C} \|(T'(x))^{-1} T'(\xi)\| < 1 + \varepsilon, \]
we deduce from (7.3.2), applied to the map $(T'(x))^{-1} T$, that
\[ m_N((T'(x))^{-1} T(\widetilde{C})) < (1 + \varepsilon)^N m_N(\widetilde{C}). \]
Thus,
\[ m_N(T(\widetilde{C})) < (1 + \varepsilon)^N |J_T(x)| \, m_N(\widetilde{C}). \tag{7.3.4} \]
On the other hand, we also have, from (7.3.3), that
\[ \int_{\widetilde{C}} |J_T(y)| \, dm_N(y) > (1 - \varepsilon)|J_T(x)| \, m_N(\widetilde{C}). \tag{7.3.5} \]
Thus, combining (7.3.4) and (7.3.5), we deduce (7.3.1).

Lemma 7.3.2 Let $U \subset \mathbb{R}^N$ be open and let $K \subset U$ be a compact set. Let $T : U \to \mathbb{R}^N$ be a $C^1$ map. Then
\[ \int_K |J_T| \, dm_N = \inf \left\{ \int_W |J_T| \, dm_N \;\Big|\; W \text{ open},\ \overline{W} \text{ compact},\ K \subset W \subset \overline{W} \subset U \right\}. \tag{7.3.6} \]

Proof: Since $T$ is continuously differentiable and since $K$ is compact, $|J_T|$ is integrable over $K$ and over any open set $W$ whose closure is compact. Clearly,
\[ \int_K |J_T| \, dm_N \le \inf \left\{ \int_W |J_T| \, dm_N \;\Big|\; W \text{ open},\ \overline{W} \text{ compact},\ K \subset W \subset \overline{W} \subset U \right\}. \]
To prove the reverse inequality, observe that by absolute continuity (cf.
Proposition 5.3.2), if we restrict our attention to subsets of a relatively compact open set $W_0$ containing $K$ and contained in $U$, given any $\varepsilon > 0$, we can find $\delta > 0$ such that, if $F \subset W_0$ satisfies $m_N(F) < \delta$, then
\[ \int_F |J_T| \, dm_N < \varepsilon. \]
Now we can find $W$ open such that
\[ K \subset W \subset \overline{W} \subset W_0 \subset \overline{W_0} \subset U \]
and such that $m_N(W \setminus K) < \delta$ (cf. Proposition 2.2.2 (ii)). Thus,
\[ \int_W |J_T| \, dm_N - \int_K |J_T| \, dm_N < \varepsilon. \]
In other words, we have found $W$, a relatively compact open set contained in $U$ and containing $K$, such that
\[ \int_W |J_T| \, dm_N < \int_K |J_T| \, dm_N + \varepsilon. \]
This establishes the reverse inequality and completes the proof.

Proposition 7.3.1 Let $U$ and $V$ be open subsets of $\mathbb{R}^N$ and let $T : U \to V$ be a $C^1$ map, which is also a homeomorphism. Then
\[ m_N(T(E)) \le \int_E |J_T| \, dm_N, \tag{7.3.7} \]
where $E \subset U$ is either a compact set or an open set.

Proof: Step 1. Let $K \subset U$ be a compact set and let $W$ be a relatively compact open set such that
\[ K \subset W \subset \overline{W} \subset U. \]
Let
\[ L = \max_{x \in \overline{W}} \|T'(x)\|. \]
Let $\varepsilon > 0$. Let $\delta_1 > 0$ be such that, whenever $|x - y| < \delta_1$, $x, y \in W$, we have $\|T'(x) - T'(y)\| < \varepsilon$. If $x \in K$ is such that $T'(x)$ is singular, set $\nu(x) = \delta_1$. If $x \in K$ is such that $T'(x)$ is non-singular, set $\nu(x) = \min\{\delta_1, \delta(x)\}$, where $\delta(x)$ is the number $\delta$ chosen in the proof of Lemma 7.3.1 such that (7.3.3) is valid. Now cover $K$ by cubes $\{C(x)\}_{x \in K}$, where $C(x)$ is centered at $x \in K$ and with sides of length $\nu(x)$, the sides being parallel to the coordinate axes. Since $K$ is compact, there exists a finite subcover. We can further ensure that we have a finite collection of disjoint (half-open) cubes whose union covers $K$ and each of these is a sub-cube of one of the $C(x)$. Thus we have a finite cover of $K$ consisting of disjoint cubes $\{C'(x)\}_{x \in S}$, where $S$ is a finite set and $x$ is the centre of the cube $C'(x)$. Then $S = J_s \cup J_{ns}$, where
\[ J_s = \{x \in S \mid T'(x) \text{ is singular}\}, \qquad J_{ns} = \{x \in S \mid T'(x) \text{ is non-singular}\}. \]
If $x \in J_s$, then we can proceed exactly as in the proof of Sard's theorem (Theorem 7.2.1) to get
\[ m_N(T(C'(x))) \le 2\omega_{N-1} L^{N-1} \varepsilon \, m_N(C'(x)). \tag{7.3.8} \]
If $x \in J_{ns}$, then, by Lemma 7.3.1, we get
\[ m_N(T(C'(x))) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{C'(x)} |J_T| \, dm_N. \tag{7.3.9} \]
(We must remember that each $C'(x)$ is a sub-cube of one of the original cubes. The estimates in Sard's theorem and Lemma 7.3.1 were observed to be valid for any sub-cube of the cube of admissible size.) Now,
\[
\begin{aligned}
m_N(T(K)) &\le m_N(T(\cup_{x \in S} C'(x))) \\
&\le \sum_{x \in J_s} m_N(T(C'(x))) + \sum_{x \in J_{ns}} m_N(T(C'(x))) \\
&\le C(K) \varepsilon \sum_{x \in J_s} m_N(C'(x)) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \sum_{x \in J_{ns}} \int_{C'(x)} |J_T| \, dm_N \\
&\le C(K) \varepsilon \, m_N(W) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_W |J_T| \, dm_N,
\end{aligned}
\]
where $C(K) = 2\omega_{N-1} L^{N-1}$. We have used here the disjointness of the cubes $C'(x)$ when using $m_N(W)$ as an upper bound for $\sum_{x \in J_s} m_N(C'(x))$ and when using the integral over $W$ as an upper bound for the sum of the integrals over the cubes $C'(x)$. Since $\varepsilon$ can be chosen arbitrarily small, we get
\[ m_N(T(K)) \le \int_W |J_T| \, dm_N. \]
Consequently, we have, by Lemma 7.3.2,
\[ m_N(T(K)) \le \inf \left\{ \int_W |J_T| \, dm_N \;\Big|\; W \text{ open},\ \overline{W} \text{ compact},\ K \subset W \subset \overline{W} \subset U \right\} = \int_K |J_T| \, dm_N. \]

Step 2. Let $W \subset U$ be an open set. Then, $W$ can be written as the countable increasing union of compact sets, i.e.
\[ W = \cup_{n=1}^\infty K_n, \]
where the sets $K_n$, $n \in \mathbb{N}$, are all compact and $K_n \subset K_{n+1}$ for all $n \in \mathbb{N}$. Then, since $T$ is a homeomorphism,
\[ T(W) = \cup_{n=1}^\infty T(K_n), \]
and, for all $n \in \mathbb{N}$, we have that $T(K_n) \subset T(K_{n+1})$ and $T(K_n)$ is compact. Thus,
\[ m_N(T(W)) = \lim_{n \to \infty} m_N(T(K_n)) \le \lim_{n \to \infty} \int_{K_n} |J_T| \, dm_N \le \int_W |J_T| \, dm_N. \]
This completes the proof.

Corollary 7.3.1 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$. Let $T$ be a $C^1$ map which maps $U$ homeomorphically onto $V$. Let $E \subset U$ be a Borel set. Then (7.3.7) holds.

Proof: By Lemma 2.3.1, $T(E)$ is a Borel set. Since $V$ is bounded, it follows that $T(E)$ has finite measure. Consequently (cf.
Proposition 2.2.3 and Remark 2.2.1), since $T$ is a homeomorphism, we have
\[ m_N(T(E)) = \sup\{m_N(T(K)) \mid K \subset E,\ K \text{ compact}\}. \]
But
\[ m_N(T(K)) \le \int_K |J_T|\,dm_N \le \int_E |J_T|\,dm_N, \]
from which (7.3.7) follows.

Remark 7.3.1 We needed the boundedness of $V$ to ensure that $T(E)$ has finite measure. If $T : U \to V$ is such that $|J_T|$ is integrable over $U$, then, if $E \subset U$ is any Borel set, we can find an open set $W$ such that $E \subset W \subset U$ (cf. Proposition 2.2.2). Then, by Proposition 7.3.1, $T(W)$ will have finite measure and so the measure of $T(E)$ will also be finite. Then the proof of Corollary 7.3.1 will go through.

The following result is an immediate consequence of the preceding corollary.

Corollary 7.3.2 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a homeomorphism of $U$ onto $V$, which is also a $C^1$ map. If $E \subset U$ is a Borel set of measure zero, then so is $T(E)$.

Proposition 7.3.2 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a homeomorphism of $U$ onto $V$, which is also a $C^1$ map. Let $f : V \to \mathbb{R}$ be a non-negative Borel measurable function. Then
\[ \int_V f\,dm_N \le \int_U (f \circ T)\,|J_T|\,dm_N. \tag{7.3.10} \]

Proof: Let $F \subset V$ be a Borel set. Then $F = T(E)$, where $E \subset U$ is a Borel set. Then $\chi_E = \chi_F \circ T$. Thus, if $f = \chi_F$, then (7.3.10) is just a restatement of (7.3.7). By linearity, the result is then true for any non-negative simple function and hence, by the monotone convergence theorem, for any non-negative Borel measurable function.

Henceforth, we will work with diffeomorphisms.

Proposition 7.3.3 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. Let $f$ be a non-negative Borel measurable function defined on $V$. Then
\[ \int_V f\,dm_N = \int_U (f \circ T)\,|J_T|\,dm_N. \tag{7.3.11} \]

Proof: We apply (7.3.10) to the function $(f \circ T)|J_T|$, defined on $U$, and to the diffeomorphism $T^{-1} : V \to U$. Set $Tx = y$.
We then get
\[
\begin{aligned}
\int_U (f \circ T)(x)\,|J_T(x)|\,dm_N(x) &\le \int_V f(y)\,|J_T(T^{-1}(y))|\,|J_{T^{-1}}(y)|\,dm_N(y) \\
&= \int_V f(y)\,dm_N(y),
\end{aligned}
\]
since $|J_T(T^{-1}(y))|\,|J_{T^{-1}}(y)| = 1$ for all $y \in V$. This gives the reverse inequality of (7.3.10), thereby establishing (7.3.11).

Corollary 7.3.3 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. If $E$ is any Borel subset of $U$, then
\[ m_N(T(E)) = \int_E |J_T|\,dm_N. \tag{7.3.12} \]

We can now extend Lemma 2.3.1 to Lebesgue measurable sets.

Proposition 7.3.4 Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. If $E \subset U$ is Lebesgue measurable, then $T(E)$ is a Lebesgue measurable subset of $V$.

Proof: We can write (cf. Theorem 1.4.1) $E = F \cup N$, where $F$ is a Borel set and $N$ is a subset of a Borel set $A$ with $m_N(A) = 0$. Then $T(E) = T(F) \cup T(N)$ and $T(F)$ is a Borel set. We also have that $T(A)$ is a Borel set of measure zero (cf. Corollary 7.3.2) and $T(N) \subset T(A)$. Thus, $T(E)$ is Lebesgue measurable.

Theorem 7.3.1 (Change of variable formula) Let $U$ and $V$ be bounded open subsets of $\mathbb{R}^N$ and let $T$ be a diffeomorphism of $U$ onto $V$. Let $f : V \to \mathbb{R}$ be an integrable function. Then (7.3.11) holds.

Proof: Let $E \subset U$ be a Lebesgue measurable set. If we write $E = F \cup N$ as in the proof of the preceding proposition, we may assume, without loss of generality, that $F \cap N = \emptyset$. Then
\[ m_N(T(E)) = m_N(T(F)) = \int_F |J_T|\,dm_N = \int_{F \cup N} |J_T|\,dm_N = \int_E |J_T|\,dm_N. \]
If $G \subset V$ is a Lebesgue measurable set, then we can write $G = T(E)$, where $E \subset U$ is Lebesgue measurable. The preceding considerations then show that (7.3.11) holds for $f = \chi_G$. Consequently, the relation remains valid for any non-negative simple function, and hence, by the monotone convergence theorem, for any non-negative measurable function. If $f$ is an integrable function, then (7.3.11) holds for both $f^+$ and $f^-$, and hence for $f$ as well.

Example 7.3.1 Consider a continuous function $f : [-1, 1] \to \mathbb{R}$.
When studying the change of variable $y = -x$ in an undergraduate class, we usually set $dy = -dx$ and we get
\[ \int_{-1}^{1} f(x)\,dx = -\int_{1}^{-1} f(-y)\,dy. \]
We then declare that
\[ \int_{1}^{-1} f(-y)\,dy = -\int_{-1}^{1} f(-y)\,dy, \]
to get
\[ \int_{-1}^{1} f(x)\,dx = \int_{-1}^{1} f(-y)\,dy. \tag{7.3.13} \]
The correct way of interpreting this is to use (7.3.11). We set $T(x) = -x$. Then $|J_T(x)| = |T'(x)| = 1$ for all $x \in (-1, 1)$. We also have $T((-1, 1)) = (-1, 1)$. Thus (7.3.11) gives us
\[ \int_{(-1,1)} f(x)\,dm_1(x) = \int_{(-1,1)} f(-y)\,dm_1(y), \]
which is the same as (7.3.13).

Example 7.3.2 (Polar coordinates) Let $D \subset \mathbb{R}^2$ denote the open disc of radius $a > 0$, i.e.
\[ D = \{(x, y) \in \mathbb{R}^2 \mid |x|^2 + |y|^2 < a^2\}. \]
Consider the open set $V = D \setminus \{(x, 0) \mid 0 \le x < a\}$. Let $U = (0, a) \times (0, 2\pi) \subset \mathbb{R}^2$. Then the mapping $T : U \to V$ defined by $T(r, \theta) = (x, y)$, where
\[ x = r\cos\theta, \qquad y = r\sin\theta, \]
defines a diffeomorphism between $U$ and $V$. We have
\[ J_T = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r. \]
Thus, if $f : V \to \mathbb{R}$ is an integrable function, we have
\[ \int_V f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \]
Since $D$ and $V$ differ by a set of measure zero, we have
\[ \int_D f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \]
If $f : \mathbb{R}^2 \to \mathbb{R}$ is a non-negative function, then, letting $a \to +\infty$, the monotone convergence theorem gives
\[ \int_{\mathbb{R}^2} f\,dm_2 = \int_{(0,+\infty) \times (0,2\pi)} r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \tag{7.3.14} \]
By considering the positive and negative parts of $f$, we have that (7.3.14) is valid for any integrable function $f : \mathbb{R}^2 \to \mathbb{R}$. In the next chapter, we will write (7.3.14) in a more familiar form.

Chapter 8
Product spaces

8.1 Measurability in the product space

Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be two measure spaces. We would like to define a σ-algebra and a measure on the product $X \times Y$ which are compatible with the structures given on $X$ and $Y$, and also to relate the process of integration with respect to this measure to the processes of integration on $X$ and $Y$.

Definition 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be two measurable spaces.
A measurable rectangle is a subset of $X \times Y$ of the form $A \times B$, where $A \in \mathcal{S}$ and $B \in \mathcal{T}$. An elementary set is a finite disjoint union of measurable rectangles. The σ-algebra generated by the collection of all elementary sets is denoted by $\mathcal{S} \times \mathcal{T}$.

Definition 8.1.2 Let $X$ and $Y$ be non-empty sets. Let $E \subset X \times Y$. Let $x \in X$. Then the $x$-section of $E$, denoted $E_x$, is defined by
\[ E_x = \{y \in Y \mid (x, y) \in E\}. \]
Similarly, for $y \in Y$, the $y$-section of $E$, denoted $E^y$, is defined by
\[ E^y = \{x \in X \mid (x, y) \in E\}. \]
Thus, $E_x \subset Y$ and $E^y \subset X$.

Proposition 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $E \in \mathcal{S} \times \mathcal{T}$. Then $E_x \in \mathcal{T}$ and $E^y \in \mathcal{S}$ for every $x \in X$ and for every $y \in Y$.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019. S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_8

Proof: Let $\mathcal{U}$ denote the collection of all subsets $E$ of $X \times Y$ such that $E_x \in \mathcal{T}$ for every $x \in X$. If $E = A \times B$ is a measurable rectangle, then
\[ E_x = \begin{cases} B, & \text{if } x \in A, \\ \emptyset, & \text{if } x \notin A. \end{cases} \]
Thus, every measurable rectangle belongs to $\mathcal{U}$. In particular, $X \times Y \in \mathcal{U}$. Now, if $E \subset X \times Y$, and if $x \in X$, we have
\[ (E_x)^c = \{y \in Y \mid (x, y) \notin E\} = (E^c)_x. \]
Thus, if $E \in \mathcal{U}$, then $E^c \in \mathcal{U}$. Similarly, if $\{E_i\}_{i=1}^\infty$ is a sequence of sets in $X \times Y$ and if $E = \bigcup_{i=1}^\infty E_i$, then, for any $x \in X$, we have
\[ E_x = \bigcup_{i=1}^\infty (E_i)_x. \]
This shows that if $\{E_i\}_{i=1}^\infty$ is a countable collection of sets in $\mathcal{U}$, then $E = \bigcup_{i=1}^\infty E_i \in \mathcal{U}$. Thus, $\mathcal{U}$ is a σ-algebra on $X \times Y$ which contains all measurable rectangles and so it must contain all members of $\mathcal{S} \times \mathcal{T}$. This shows that if $E \in \mathcal{S} \times \mathcal{T}$, then, for every $x \in X$, we have that $E_x \in \mathcal{T}$. In the same way we can show that if $E$ is in $\mathcal{S} \times \mathcal{T}$, then $E^y \in \mathcal{S}$ for every $y \in Y$. This completes the proof.

Definition 8.1.3 Let $X$ be any non-empty set. A monotone class is a collection $\mathcal{M}$ of subsets of $X$ which is closed under countable increasing unions and countable decreasing intersections, i.e.
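if $\{A_i\}_{i=1}^\infty$ and $\{B_i\}_{i=1}^\infty$ are two countable collections of subsets of $X$ in $\mathcal{M}$ such that, for all $i \in \mathbb{N}$, we have $A_i \subset A_{i+1}$ and $B_i \supset B_{i+1}$, then
\[ \bigcup_{i=1}^\infty A_i \in \mathcal{M} \quad \text{and} \quad \bigcap_{i=1}^\infty B_i \in \mathcal{M}. \]

Remark 8.1.1 Any σ-ring, and so, in particular, any σ-algebra, is a monotone class.

Remark 8.1.2 The intersection of monotone classes is obviously a monotone class. The collection $\mathcal{P}(X)$, of all subsets of a non-empty set $X$, is obviously a monotone class. Thus, if $\mathcal{A}$ is any collection of subsets of $X$, there exists a smallest monotone class containing $\mathcal{A}$. We will denote it by $\mathcal{M}(\mathcal{A})$ and will call it the monotone class generated by $\mathcal{A}$.

Lemma 8.1.1 Let $X$ be a non-empty set and let $\mathcal{M}$ be a monotone class of subsets of $X$. Let $P \subset X$. Define
\[ \mathcal{U}(P) = \{Q \subset X \mid P \cup Q,\ P \setminus Q \text{ and } Q \setminus P \text{ are all in } \mathcal{M}\}. \]
Then $\mathcal{U}(P)$ is a monotone class.

Proof: Let $\{Q_i\}_{i=1}^\infty$ be an increasing sequence of sets in $\mathcal{U}(P)$. Then $\{P \cup Q_i\}_{i=1}^\infty$ and $\{Q_i \setminus P\}_{i=1}^\infty$ are increasing sequences of sets in $\mathcal{M}$. Consequently,
\[ P \cup \Big(\bigcup_{i=1}^\infty Q_i\Big) = \bigcup_{i=1}^\infty (P \cup Q_i) \in \mathcal{M}, \]
and
\[ \Big(\bigcup_{i=1}^\infty Q_i\Big) \setminus P = \bigcup_{i=1}^\infty (Q_i \setminus P) \in \mathcal{M}. \]
Finally, $\{P \setminus Q_i\}_{i=1}^\infty$ is a decreasing sequence of sets in $\mathcal{M}$. Hence
\[ P \setminus \Big(\bigcup_{i=1}^\infty Q_i\Big) = \bigcap_{i=1}^\infty (P \setminus Q_i) \in \mathcal{M}. \]
Thus, we see that $\bigcup_{i=1}^\infty Q_i \in \mathcal{U}(P)$. In the same way, it is easy to see that if $\{Q_i\}_{i=1}^\infty$ is a decreasing sequence of sets in $\mathcal{U}(P)$, then $\bigcap_{i=1}^\infty Q_i \in \mathcal{U}(P)$ as well. This completes the proof.

Lemma 8.1.2 Let $X$ be a non-empty set and let $\mathcal{R}$ be an algebra of subsets of $X$. Let $\mathcal{M}(\mathcal{R})$ denote the monotone class generated by $\mathcal{R}$. Then $\mathcal{M}(\mathcal{R}) = \mathcal{S}(\mathcal{R})$, the σ-algebra generated by $\mathcal{R}$.

Proof: Let $P \in \mathcal{R}$. If $Q \in \mathcal{R}$, then $P \cup Q$, $P \setminus Q$ and $Q \setminus P$ belong to $\mathcal{R}$ and hence to $\mathcal{M}(\mathcal{R})$ as well. Thus, if $\mathcal{U}(P)$ is as defined in the preceding lemma (with $\mathcal{M} = \mathcal{M}(\mathcal{R})$), we have that $Q \in \mathcal{U}(P)$. Since $\mathcal{U}(P)$ is a monotone class containing $\mathcal{R}$, it follows that $\mathcal{U}(P) \supset \mathcal{M}(\mathcal{R})$. Now, let $Q \in \mathcal{M}(\mathcal{R})$. Then, we have just seen that $Q \in \mathcal{U}(P)$ for every $P \in \mathcal{R}$. By the symmetry of the definition of $\mathcal{U}(P)$, we immediately deduce that $P \in \mathcal{U}(Q)$. Since $\mathcal{U}(Q)$ is thus a monotone class containing $\mathcal{R}$, we have that $\mathcal{U}(Q) \supset \mathcal{M}(\mathcal{R})$.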
Thus, for all $P$ and $Q$ in $\mathcal{M}(\mathcal{R})$, we have that $P \cup Q$ and $P \setminus Q$ belong to $\mathcal{M}(\mathcal{R})$. Thus, we deduce that $\mathcal{M}(\mathcal{R})$ is an algebra as well. Now let $\{E_i\}_{i=1}^\infty$ be a countable collection of members of $\mathcal{M}(\mathcal{R})$. Then $F_n = \bigcup_{i=1}^n E_i \in \mathcal{M}(\mathcal{R})$, since the latter collection is an algebra. Since $\{F_n\}_{n=1}^\infty$ is an increasing sequence of sets in $\mathcal{M}(\mathcal{R})$, which is a monotone class, we have
\[ \bigcup_{i=1}^\infty E_i = \bigcup_{n=1}^\infty F_n \in \mathcal{M}(\mathcal{R}). \]
Thus $\mathcal{M}(\mathcal{R})$ is a σ-algebra containing $\mathcal{R}$, from which we deduce that $\mathcal{M}(\mathcal{R}) \supset \mathcal{S}(\mathcal{R})$. Since every σ-algebra is a monotone class, we also have that $\mathcal{M}(\mathcal{R}) \subset \mathcal{S}(\mathcal{R})$. This completes the proof.

Proposition 8.1.2 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Then $\mathcal{S} \times \mathcal{T}$ is the smallest monotone class containing all elementary sets.

Proof: Let us denote by $\mathcal{E}$ the class of elementary sets. Let $A_i \in \mathcal{S}$ and $B_i \in \mathcal{T}$ for $i = 1, 2$. Then (check!)
\[
\begin{aligned}
(A_1 \times B_1) \cap (A_2 \times B_2) &= (A_1 \cap A_2) \times (B_1 \cap B_2), \\
(A_1 \times B_1) \setminus (A_2 \times B_2) &= ((A_1 \setminus A_2) \times B_1) \cup ((A_1 \cap A_2) \times (B_1 \setminus B_2)).
\end{aligned}
\]
Thus, the intersection of two measurable rectangles is a measurable rectangle and their difference is the disjoint union of two measurable rectangles. It then follows that if $P$ and $Q$ are in $\mathcal{E}$, then $P \cup Q$ and $P \setminus Q$ are also in $\mathcal{E}$. Thus $\mathcal{E}$ is an algebra. The result now follows from Lemma 8.1.2.

Definition 8.1.4 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f : X \times Y \to \mathbb{R}$ be a given function. Let $x \in X$ and $y \in Y$. The $x$-section, $f_x$, and the $y$-section, $f^y$, are defined by
\[ f_x(y) = f(x, y) = f^y(x). \]

We are now dealing with three σ-algebras, viz. the σ-algebra $\mathcal{S}$ on $X$, the σ-algebra $\mathcal{T}$ on $Y$, and the σ-algebra $\mathcal{S} \times \mathcal{T}$ on $X \times Y$. To avoid confusion, we will say that a set (respectively, function) is $\mathcal{S}$-measurable, $\mathcal{T}$-measurable or $\mathcal{S} \times \mathcal{T}$-measurable according to the context.

Proposition 8.1.3 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f$ be an $\mathcal{S} \times \mathcal{T}$-measurable function defined on $X \times Y$. Then, for every $x \in X$ and for every $y \in Y$, the section $f_x$ is $\mathcal{T}$-measurable and the section $f^y$ is $\mathcal{S}$-measurable.

Proof: Let $c \in \mathbb{R}$.
Then $Q = \{(x, y) \mid f(x, y) > c\}$ is $\mathcal{S} \times \mathcal{T}$-measurable. Then, for $x \in X$,
\[ \{y \in Y \mid f_x(y) > c\} = Q_x \]
is $\mathcal{T}$-measurable by Proposition 8.1.1. Thus, $f_x$ is $\mathcal{T}$-measurable. Similarly, $f^y$ is $\mathcal{S}$-measurable.

Example 8.1.1 Let $(X, \mathcal{S})$ and $(Y, \mathcal{T})$ be measurable spaces. Let $f : X \to \mathbb{R}$ be an $\mathcal{S}$-measurable function. Define $F : X \times Y \to \mathbb{R}$ by $F(x, y) = f(x)$ for every $(x, y) \in X \times Y$. Then, if $c \in \mathbb{R}$, we have
\[ \{(x, y) \in X \times Y \mid F(x, y) > c\} = \{x \in X \mid f(x) > c\} \times Y, \]
which is $\mathcal{S} \times \mathcal{T}$-measurable. Thus, $F$ is $\mathcal{S} \times \mathcal{T}$-measurable. Similarly, if $g : Y \to \mathbb{R}$ is $\mathcal{T}$-measurable, we also have that the function $(x, y) \mapsto g(y)$ is $\mathcal{S} \times \mathcal{T}$-measurable. Since the product of measurable functions is measurable, we have that the function $\varphi : X \times Y \to \mathbb{R}$ defined by $\varphi(x, y) = f(x)g(y)$ is also $\mathcal{S} \times \mathcal{T}$-measurable.

Example 8.1.2 Let $\mathbb{R}$ be equipped with the Borel or the Lebesgue σ-algebra. Let $(X, \mathcal{S})$ be a measurable space. Let $f : \mathbb{R} \times X \to \mathbb{R}$ be a function such that
• for every $x \in X$, the function $t \mapsto f(t, x)$ is continuous on $\mathbb{R}$;
• for every $t \in \mathbb{R}$, the function $x \mapsto f(t, x)$ is $\mathcal{S}$-measurable.
(Such a function is called a Carathéodory function.) Then $f$ is measurable on the product space $\mathbb{R} \times X$. To see this, first observe that for any fixed $n \in \mathbb{N}$,
\[ \mathbb{R} = \bigcup_{k \in \mathbb{Z}} \Big(\frac{k-1}{2^n}, \frac{k}{2^n}\Big]. \]
Define $f_n(t, x) = f\big(\frac{k}{2^n}, x\big)$ if $t \in \big(\frac{k-1}{2^n}, \frac{k}{2^n}\big]$. Thus, we can write
\[ f_n(t, x) = \sum_{k \in \mathbb{Z}} f\Big(\frac{k}{2^n}, x\Big)\, \chi_{\left(\frac{k-1}{2^n}, \frac{k}{2^n}\right]}(t). \]
By the previous example, and by hypothesis, we see that each term in the above series is a measurable function on $\mathbb{R} \times X$, from which we deduce that $f_n$ is a measurable function on $\mathbb{R} \times X$. Now, let $(t_0, x_0) \in \mathbb{R} \times X$. Let $\varepsilon > 0$ be given. Then there exists $\delta > 0$ such that, if $|t - t_0| < \delta$, we have, by hypothesis, that $|f(t, x_0) - f(t_0, x_0)| < \varepsilon$. If $n$ is large enough so that $\frac{1}{2^n} < \delta$, it then follows that
\[ |f_n(t_0, x_0) - f(t_0, x_0)| < \varepsilon, \]
by construction of the function $f_n$. Thus, $f_n \to f$ pointwise and so $f$ is measurable on $\mathbb{R} \times X$.
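The dyadic approximation used in Example 8.1.2 can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function `f` below is a hypothetical choice that is continuous in `t`, the names `dyadic_approx`, `t0`, `x0` and `errors` are not from the text, and floating-point arithmetic stands in for exact real numbers. It evaluates $f_n(t, x) = f(k/2^n, x)$, where $k$ is the integer with $(k-1)/2^n < t \le k/2^n$, and records how the error at one fixed point shrinks as $n$ grows.

```python
import math


def dyadic_approx(f, n):
    """Return f_n, where f_n(t, x) = f(k / 2**n, x) and k is the integer
    satisfying (k - 1) / 2**n < t <= k / 2**n."""
    def f_n(t, x):
        # t * 2**n lies in (k - 1, k], so the ceiling recovers k.
        k = math.ceil(t * 2**n)
        return f(k / 2**n, x)
    return f_n


def f(t, x):
    # Hypothetical example: continuous in t for each fixed x.
    return math.sin(t) * x + t * t


t0, x0 = 0.3, 2.0
# Pointwise error |f_n(t0, x0) - f(t0, x0)| for increasing n.
errors = [abs(dyadic_approx(f, n)(t0, x0) - f(t0, x0)) for n in (2, 6, 10, 14)]
```

For this choice of `f` the recorded errors decrease with `n`, in line with the pointwise convergence $f_n \to f$ established in the example; the sketch says nothing about measurability itself, which is the actual content of the argument.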
8.2 The product measure

Theorem 8.2.1 Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be σ-finite measure spaces. Let $Q \in \mathcal{S} \times \mathcal{T}$. Define, for $x \in X$ and $y \in Y$,
\[ \varphi(x) = \lambda(Q_x) \quad \text{and} \quad \psi(y) = \mu(Q^y). \]
Then $\varphi$ is $\mathcal{S}$-measurable and $\psi$ is $\mathcal{T}$-measurable. Further,
\[ \int_X \varphi\,d\mu = \int_Y \psi\,d\lambda. \tag{8.2.1} \]

Proof: Let $\mathcal{U}$ be the collection of all sets in $\mathcal{S} \times \mathcal{T}$ for which the conclusions of the theorem, and in particular (8.2.1), hold.

Step 1. Let $Q = A \times B$, where $A \in \mathcal{S}$ and $B \in \mathcal{T}$. Then (cf. the proof of Proposition 8.1.1), $\varphi = \lambda(B)\chi_A$ and $\psi = \mu(A)\chi_B$. Thus $\varphi$ is $\mathcal{S}$-measurable and $\psi$ is $\mathcal{T}$-measurable. We also have
\[ \int_X \varphi\,d\mu = \mu(A)\lambda(B) = \int_Y \psi\,d\lambda. \]
Thus, every measurable rectangle is in $\mathcal{U}$.

Step 2. Let $\{Q_i\}_{i=1}^\infty$ be an increasing sequence of sets in $\mathcal{U}$. Set $Q = \bigcup_{i=1}^\infty Q_i$. Let $\varphi_i(x) = \lambda((Q_i)_x)$ and let $\psi_i(y) = \mu((Q_i)^y)$ for $x \in X$ and $y \in Y$. Since $\{(Q_i)_x\}_{i=1}^\infty$ is an increasing sequence of sets whose union is $Q_x$ and, similarly, since $\{(Q_i)^y\}_{i=1}^\infty$ is an increasing sequence of sets whose union is $Q^y$, we have that $\{\varphi_i\}_{i=1}^\infty$ and $\{\psi_i\}_{i=1}^\infty$ are increasing sequences of non-negative functions such that (cf. Proposition 1.2.4)
\[ \lim_{i \to \infty} \varphi_i(x) = \lambda(Q_x) \overset{\text{def}}{=} \varphi(x), \quad \text{and} \quad \lim_{i \to \infty} \psi_i(y) = \mu(Q^y) \overset{\text{def}}{=} \psi(y). \]
Since $\int_X \varphi_i\,d\mu = \int_Y \psi_i\,d\lambda$ for each $i$, it follows from the monotone convergence theorem that (8.2.1) holds. Thus $Q \in \mathcal{U}$.

Step 3. It is very easy to see that if $\{Q_i\}_{i=1}^n$ is a finite collection of disjoint sets in $\mathcal{U}$, then $\bigcup_{i=1}^n Q_i \in \mathcal{U}$ as well. Now, given any countable collection of disjoint sets $\{Q_i\}_{i=1}^\infty$ in $\mathcal{U}$, we set $R_n = \bigcup_{i=1}^n Q_i$, so that $R_n \in \mathcal{U}$ for each $n \in \mathbb{N}$. Since $\{R_n\}_{n=1}^\infty$ is an increasing sequence, it follows, from Step 2, that $\bigcup_{i=1}^\infty Q_i = \bigcup_{n=1}^\infty R_n \in \mathcal{U}$.

Step 4. Let $A \in \mathcal{S}$ and $B \in \mathcal{T}$ be such that $\mu(A) < +\infty$ and $\lambda(B) < +\infty$. Let $\{Q_i\}_{i=1}^\infty$ be a sequence of sets in $\mathcal{U}$ such that
\[ A \times B \supset Q_1 \supset Q_2 \supset \cdots \supset Q_i \supset Q_{i+1} \supset \cdots. \]
Then, in the same manner as in Step 2 (using the dominated convergence theorem instead of the monotone convergence theorem), we can show that $\bigcap_{i=1}^\infty Q_i \in \mathcal{U}$.

Step 5.
Since both the measure spaces are σ-finite, we can write
\[ X = \bigcup_{n=1}^\infty X_n \quad \text{and} \quad Y = \bigcup_{m=1}^\infty Y_m, \]
where $\{X_n\}_{n=1}^\infty$ and $\{Y_m\}_{m=1}^\infty$ are sequences of disjoint sets such that for all $n$ and $m$ we have $\mu(X_n) < +\infty$ and $\lambda(Y_m) < +\infty$. Let $Q \in \mathcal{S} \times \mathcal{T}$. Set $Q_{nm} = Q \cap (X_n \times Y_m)$. Let $\mathcal{M}$ be the collection of all sets $Q$ in $\mathcal{S} \times \mathcal{T}$ such that $Q_{nm} \in \mathcal{U}$ for all $n$ and $m$. By Steps 2 and 4, it follows that $\mathcal{M}$ is a monotone class. By Steps 1 and 3, it follows that all elementary sets are in $\mathcal{M}$. Thus, $\mathcal{M}$ is a monotone class containing all elementary sets and is contained in $\mathcal{S} \times \mathcal{T}$. It now follows, from Proposition 8.1.2, that $\mathcal{M} = \mathcal{S} \times \mathcal{T}$.

Step 6. If $Q \in \mathcal{S} \times \mathcal{T}$, then, by Step 5, $Q_{nm} \in \mathcal{U}$ for all $n$ and $m$. Then, since the $Q_{nm}$ are all disjoint and since
\[ Q = \bigcup_{n=1}^\infty \bigcup_{m=1}^\infty Q_{nm}, \]
it follows, from Step 3, that $Q \in \mathcal{U}$ as well. Thus $\mathcal{U} = \mathcal{S} \times \mathcal{T}$. This completes the proof.

We can use the preceding theorem to define the product measure on $\mathcal{S} \times \mathcal{T}$.

Definition 8.2.1 Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be σ-finite measure spaces. The product measure, denoted $\mu \times \lambda$, is defined for $Q \in \mathcal{S} \times \mathcal{T}$ by
\[ (\mu \times \lambda)(Q) = \int_X \lambda(Q_x)\,d\mu(x) = \int_Y \mu(Q^y)\,d\lambda(y). \]

Example 8.2.1 Let $X = Y = \mathbb{R}$ be equipped with the Lebesgue measure. The $x$-axis can be written as the disjoint union of measurable rectangles:
\[ \{(x, 0) \mid x \in \mathbb{R}\} = \bigcup_{n \in \mathbb{Z}} [n, n+1) \times \{0\}, \]
and so its measure, for the product measure $m_1 \times m_1$, is zero. Let $E \subset [0, 1] \subset \mathbb{R}$ be a non-measurable set, i.e. $E \notin \mathcal{L}_1$. Then (cf. Proposition 8.1.1), the set $E \times \{0\} \notin \mathcal{L}_1 \times \mathcal{L}_1$. However,
\[ E \times \{0\} \subset [0, 1] \times \{0\}, \]
and (cf. Example 2.1.1)
\[ m_2([0, 1] \times \{0\}) = 0 = (m_1 \times m_1)([0, 1] \times \{0\}). \]
Since $m_2$ is complete, it follows that $E \times \{0\} \in \mathcal{L}_2$. This shows that even though $m_1$ is complete, it does not follow that $m_1 \times m_1$ is complete, and it also shows that $\mathcal{L}_1 \times \mathcal{L}_1 \ne \mathcal{L}_2$.

Remark 8.2.1 In view of the above example, we can ask ourselves what the relationship is between the product of Lebesgue measures and the Lebesgue measure of the product space. We sketch the argument below.
Let $\ell = k + n$ and let us consider $\mathbb{R}^\ell$ as the product space $\mathbb{R}^k \times \mathbb{R}^n$. We have the Borel sets $\mathcal{B}_\ell$, $\mathcal{B}_k$ and $\mathcal{B}_n$ in $\mathbb{R}^\ell$, $\mathbb{R}^k$ and $\mathbb{R}^n$ respectively. Similarly, we have the Lebesgue measurable sets $\mathcal{L}_\ell$, $\mathcal{L}_k$ and $\mathcal{L}_n$ as well. Now, any open set in $\mathbb{R}^\ell$ can be expressed as the countable disjoint union of (half-open) boxes (cf. Lemma 2.2.1). Thus, all open sets are in $\mathcal{L}_k \times \mathcal{L}_n$ and so we deduce that
\[ \mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n. \]
If $E \subset \mathbb{R}^k$ is a Lebesgue measurable subset, then (cf. Proposition 2.2.2) it can be approximated from above by a $G_\delta$ set and from below by an $F_\sigma$ set. It follows from this that $E \times \mathbb{R}^n$ is Lebesgue measurable in $\mathbb{R}^\ell$. Similarly, if $F \subset \mathbb{R}^n$ is Lebesgue measurable, then $\mathbb{R}^k \times F$ will be Lebesgue measurable in $\mathbb{R}^\ell$. Then their intersection $E \times F$ will be Lebesgue measurable in $\mathbb{R}^\ell$. Thus, $\mathcal{L}_\ell$ contains all measurable rectangles and it follows from this that
\[ \mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n \subset \mathcal{L}_\ell. \]
We know that $m_\ell$ and $m_k \times m_n$ agree on all boxes. Both these measures are also easily seen to be translation invariant, outer-regular (cf. Remark 2.2.1) and finite on all compact sets. Thus (cf. Theorem 2.3.2), it will follow that these measures agree on $\mathcal{B}_\ell$. Now, if $Q$ is $\mathcal{L}_k \times \mathcal{L}_n$-measurable, it is also $\mathcal{L}_\ell$-measurable and so there exist $P_i \in \mathcal{B}_\ell$, $i = 1, 2$, such that $P_1 \subset Q \subset P_2$ and such that $m_\ell(P_2 \setminus P_1) = 0$ (cf. Proposition 2.2.2). Thus,
\[ (m_k \times m_n)(Q \setminus P_1) \le (m_k \times m_n)(P_2 \setminus P_1) = m_\ell(P_2 \setminus P_1) = 0. \]
Consequently,
\[ (m_k \times m_n)(Q) = (m_k \times m_n)(P_1) = m_\ell(P_1) = m_\ell(Q). \]
Thus, $m_k \times m_n$ and $m_\ell$ agree on $\mathcal{L}_k \times \mathcal{L}_n$ as well. Since the Lebesgue measure is the completion of the same measure on Borel sets, it follows that $m_\ell$ is the completion of $m_k \times m_n$ as well.

8.3 Fubini's theorem

Theorem 8.3.1 (Fubini's theorem) Let $(X, \mathcal{S}, \mu)$ and $(Y, \mathcal{T}, \lambda)$ be two σ-finite measure spaces. Let $f$ be an extended real-valued function defined on $X \times Y$ which is $\mathcal{S} \times \mathcal{T}$-measurable.

(a) Let $f$ be non-negative. Define, for $x \in X$ and $y \in Y$,
\[ \varphi(x) = \int_Y f_x\,d\lambda \quad \text{and} \quad \psi(y) = \int_X f^y\,d\mu. \tag{8.3.1} \]
Then $\varphi$ is $\mathcal{S}$-measurable, $\psi$ is $\mathcal{T}$-measurable and
\[ \int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \times \lambda) = \int_Y \psi\,d\lambda. \tag{8.3.2} \]

(b) Assume that $\varphi^*$ is integrable over $X$, with respect to the measure $\mu$, where, for $x \in X$,
\[ \varphi^*(x) = \int_Y |f|_x\,d\lambda. \]
Then $f$ is integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$.

(c) Let $f$ be integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$. Then, for almost every $x \in X$, the function $f_x$ is integrable over $Y$, with respect to the measure $\lambda$, and, for almost every $y \in Y$, the function $f^y$ is integrable over $X$, with respect to the measure $\mu$. Further, the functions $\varphi$ and $\psi$ defined by (8.3.1) above are integrable over $X$, with respect to the measure $\mu$, and over $Y$, with respect to the measure $\lambda$, respectively, and (8.3.2) holds.

Proof: (a) Since $f_x$ and $f^y$ are non-negative, the functions $\varphi$ and $\psi$ are defined. By definition of the product measure, (8.3.2) is exactly the conclusion of Theorem 8.2.1 when $f = \chi_Q$, where $Q \in \mathcal{S} \times \mathcal{T}$. By linearity, the result holds for all non-negative simple functions. Let $f$ be a non-negative $\mathcal{S} \times \mathcal{T}$-measurable function. Let $\{f_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to $f$. Let $\varphi_n(x) = \int_Y (f_n)_x\,d\lambda$. Then, by the monotone convergence theorem, $\varphi_n \uparrow \varphi$. Further,
\[ \int_{X \times Y} f_n\,d(\mu \times \lambda) = \int_X \varphi_n\,d\mu. \]
Once again, we can pass to the limit, as $n$ tends to infinity, to get
\[ \int_{X \times Y} f\,d(\mu \times \lambda) = \int_X \varphi\,d\mu. \]
This proves one part of (8.3.2). The proof of the other part is similar.

(b) We apply the result of (a) to the function $|f|$. Thus, by hypothesis and by (8.3.2), we get
\[ \int_{X \times Y} |f|\,d(\mu \times \lambda) = \int_X \varphi^*\,d\mu < +\infty. \]
This shows that $f$ is integrable over $X \times Y$, with respect to the measure $\mu \times \lambda$.

(c) We write $f = f^+ - f^-$. Since $f$ is integrable, we have that $f^\pm$ are integrable non-negative functions. Let
\[ \varphi_\pm(x) = \int_Y (f^\pm)_x\,d\lambda, \quad \text{and} \quad \psi_\pm(y) = \int_X (f^\pm)^y\,d\mu. \]
Then (8.3.2) holds for the triples $(f^\pm, \varphi_\pm, \psi_\pm)$ replacing the triple $(f, \varphi, \psi)$.
All the integrals are now finite and so, subtracting the relations for $f^-$ from those for $f^+$, we deduce (8.3.2) for the function $f$. This completes the proof.

Remark 8.3.1 When $f$ is non-negative (case (a) of Theorem 8.3.1), all the integrals could be infinite. If even one of them is finite, all are finite and will be equal.

Remark 8.3.2 In the case (b) of the preceding theorem, of course, an analogous statement involving $\psi^*(y) = \int_X |f|^y\,d\mu$ is valid. Thus, if one of the integrals
\[ \int_Y \psi^*\,d\lambda \quad \text{or} \quad \int_X \varphi^*\,d\mu, \qquad \text{where } \psi^*(y) = \int_X |f|^y\,d\mu \text{ and } \varphi^*(x) = \int_Y |f|_x\,d\lambda, \]
is finite, then we have that $f$ is integrable over $X \times Y$ and that (8.3.2) holds.

Remark 8.3.3 The relation (8.3.2) can also be written as
\[ \int_X \int_Y f(x, y)\,d\lambda(y)\,d\mu(x) = \int_{X \times Y} f\,d(\mu \times \lambda) = \int_Y \int_X f(x, y)\,d\mu(x)\,d\lambda(y). \]
The first and the last terms are referred to as iterated integrals.

Example 8.3.1 Let $X = Y = \mathbb{N}$, and let $\mu = \lambda$ be the counting measure. Let $f(m, n) = a_{mn}$ be non-negative for all $m$ and $n$. Then, by case (a) of Theorem 8.3.1, we get that
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty a_{mn} = \sum_{n=1}^\infty \sum_{m=1}^\infty a_{mn}. \]
This was proved earlier as a consequence of the monotone convergence theorem (cf. Example 5.2.3). The same result is true without the non-negativity condition if we assume the extra condition
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty |a_{mn}| < +\infty, \]
as an application of cases (b) and (c) of Theorem 8.3.1. We proved this earlier, using the dominated convergence theorem (cf. Example 5.3.3).

Example 8.3.2 Once again, let $X = Y = \mathbb{N}$ and let $\mu = \lambda$ be the counting measure. Define
\[ f(m, n) = a_{m,n} = \begin{cases} 1, & \text{if } n = 1, \\ -1, & \text{if } n = m + 1, \\ 0, & \text{otherwise.} \end{cases} \]
Then
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty a_{m,n} = \sum_{m=1}^\infty (a_{m,1} + a_{m,m+1}) = 0, \]
while
\[ \sum_{m=1}^\infty a_{m,1} = +\infty \]
and so
\[ \sum_{n=1}^\infty \sum_{m=1}^\infty a_{m,n} = +\infty. \]
Note that in this case,
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty |a_{m,n}| = +\infty. \]
Thus, the integrability of $f$ cannot be relaxed for a general function for the validity of (8.3.2).

Example 8.3.3 Let $X = Y = [0, 1]$. Let $\mathcal{S} = \mathcal{T} = \mathcal{L}_1$. Let $\mu = m_1$ and let $\lambda$ be the counting measure. Thus $\lambda$ is not a σ-finite measure. Let
\[ D = \{(x, x) \mid x \in [0, 1]\} \subset X \times Y. \]
Since $D$ is a closed set, it is Borel measurable and so $D \in \mathcal{L}_1 \times \mathcal{L}_1$ (cf. Remark 8.2.1). Let $f = \chi_D$. Then
\[
\begin{aligned}
\int_Y \int_X f(x, y)\,d\mu(x)\,d\lambda(y) &= 0, \\
\int_X \int_Y f(x, y)\,d\lambda(y)\,d\mu(x) &= 1.
\end{aligned}
\]
Thus, σ-finiteness cannot be dispensed with.

Example 8.3.4 (Integration by parts for absolutely continuous functions) Let $[a, b] \subset \mathbb{R}$ and let $f, g : [a, b] \to \mathbb{R}$ be absolutely continuous functions. Then $f'$ and $g'$ exist almost everywhere and are integrable. Let us consider the integral of $\varphi(x, y) = f'(x)g'(y)$ on the set
\[ E = \{(x, y) \in [a, b] \times [a, b] \mid x \le y\}, \]
which is closed and hence is Borel measurable. Consequently, it also belongs to $\mathcal{L}_1 \times \mathcal{L}_1$. By case (a) of Theorem 8.3.1, we have
\[
\begin{aligned}
\int_{[a,b] \times [a,b]} |\varphi|\,d(m_1 \times m_1) &= \int_{[a,b]} \Big( \int_{[a,b]} |f'(x)|\,|g'(y)|\,dm_1(x) \Big)\,dm_1(y) \\
&= \int_{[a,b]} |f'|\,dm_1 \int_{[a,b]} |g'|\,dm_1 < +\infty.
\end{aligned}
\]
Thus $\varphi$ is integrable on $[a, b] \times [a, b]$ with respect to the measure $m_1 \times m_1$ and so we can apply Fubini's theorem. Consequently, we have
\[ \int_{[a,b]} \int_{[a,b]} f'(x)g'(y)\chi_E(x, y)\,dm_1(y)\,dm_1(x) = \int_{[a,b]} \int_{[a,b]} f'(x)g'(y)\chi_E(x, y)\,dm_1(x)\,dm_1(y). \tag{8.3.3} \]
The left-hand side of (8.3.3) is equal to
\[ \int_{[a,b]} \Big( \int_{[x,b]} g'(y)\,dm_1(y) \Big) f'(x)\,dm_1(x), \]
which yields
\[ \int_{[a,b]} (g(b) - g(x)) f'(x)\,dm_1(x) = g(b)f(b) - g(b)f(a) - \int_{[a,b]} g f'\,dm_1. \]
We have used here the fact that both $f$ and $g$ are absolutely continuous and so the integral of the derivative is the difference of the values of the function at the end points of the interval (cf. Theorem 6.4.1). Similarly, the right-hand side of (8.3.3) is equal to
\[ \int_{[a,b]} \Big( \int_{[a,y]} f'(x)\,dm_1(x) \Big) g'(y)\,dm_1(y), \]
which yields
\[ -g(b)f(a) + g(a)f(a) + \int_{[a,b]} f g'\,dm_1. \]
Equating these two, we get
\[ \int_{[a,b]} f g'\,dm_1 = g(b)f(b) - g(a)f(a) - \int_{[a,b]} f' g\,dm_1, \]
which is the formula for integration by parts.

Example 8.3.5 (Polar coordinates) In Example 7.3.2, we described the transformation which led to polar coordinates. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be an integrable function which is Borel measurable.
Since the Lebesgue measure $m_2$ agrees with $m_1 \times m_1$ on Borel sets, we can apply Fubini's theorem. Then we can write (7.3.14) in the form
\[ \int_{\mathbb{R}^2} f\,dm_2 = \int_{(0,2\pi)} \int_{(0,+\infty)} r f(r\cos\theta, r\sin\theta)\,dm_1(r)\,dm_1(\theta). \]
If the integrand on the right-hand side is Riemann integrable, then we may write
\[ \int_{\mathbb{R}^2} f\,dm_2 = \int_0^{2\pi} \int_0^\infty f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \]

Example 8.3.6 Let $x \in \mathbb{R}^N$ and consider the function
\[ f(x) = e^{-|x|^2}, \]
which is continuous and hence is Borel measurable. We wish to show that this function is integrable over $\mathbb{R}^N$ and to evaluate that integral. Let $I_N = \int_{\mathbb{R}^N} f\,dm_N$. By repeated use of Fubini's theorem (case (a)), we get
\[ I_N = \prod_{i=1}^N \int_{\mathbb{R}} e^{-|x_i|^2}\,dm_1(x_i) = I_1^N. \]
Now, on one hand, $I_2 = I_1^2$, by the above reasoning. On the other hand, by the previous example,
\[ I_2 = \int_0^\infty \int_0^{2\pi} e^{-r^2} r\,d\theta\,dr = 2\pi \int_0^\infty e^{-r^2} r\,dr. \]
The last integral is easily evaluated (it equals $\frac{1}{2}$) to give $I_1^2 = I_2 = \pi$. Thus
\[ I_1 = \sqrt{\pi} \quad \text{and} \quad I_N = \pi^{N/2}, \quad N \ge 2. \]

Example 8.3.7 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f : X \to \mathbb{R}$ be an integrable function. The distribution function of $f$ is a function $F : [0, +\infty) \to [0, +\infty]$, defined by $F(t) = \mu(E(t))$, where
\[ E(t) = \{x \in X \mid |f(x)| > t\}. \]
We have
\[ \int_{[0,+\infty)} F\,dm_1 = \int_{[0,+\infty)} \int_X \chi_{E(t)}(x)\,d\mu(x)\,dm_1(t). \]
Since we are dealing with a non-negative integrand, we can change the order of integration to get
\[ \int_{[0,+\infty)} F\,dm_1 = \int_X \int_{[0,|f(x)|)} dm_1(t)\,d\mu(x) = \int_X |f(x)|\,d\mu(x). \]
Thus,
\[ \int_X |f|\,d\mu = \int_{[0,+\infty)} F\,dm_1. \]
Two functions defined on $X$ are said to be equimeasurable, or to be rearrangements of each other, if they have the same distribution function. Thus, the integrals of the absolute values of functions which are rearrangements of each other are equal.

Example 8.3.8 (Convolutions) Let $f$ and $g$ be Borel measurable real-valued functions defined on $\mathbb{R}^N$. Consider the mappings $\varphi$ and $\psi$ defined on $\mathbb{R}^N \times \mathbb{R}^N$, taking values in $\mathbb{R}^N$, defined by
\[ \varphi(x, y) = x - y, \qquad \psi(x, y) = y. \]
These are continuous and hence Borel measurable. Thus (cf.
Proposition 3.1.4), we have that the mappings $(x, y) \mapsto f(x - y)$ and $(x, y) \mapsto g(y)$ are Borel measurable and so their product $(x, y) \mapsto f(x - y)g(y)$ is also Borel measurable. We would like to know if the integral
\[ h(x) \overset{\text{def}}{=} \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y) \]
exists and is finite. Assume that $f$ and $g$ are integrable as well. Since the Lebesgue measure agrees with the product measure on Borel measurable sets, we can apply Fubini's theorem. We have
\[
\begin{aligned}
\int_{\mathbb{R}^N} \Big| \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y) \Big|\,dm_N(x) &\le \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x - y)|\,|g(y)|\,dm_N(y)\,dm_N(x) \\
&= \int_{\mathbb{R}^N} |g(y)| \Big( \int_{\mathbb{R}^N} |f(x - y)|\,dm_N(x) \Big)\,dm_N(y).
\end{aligned}
\]
Since the Lebesgue measure is translation invariant, we have that
\[ \int_{\mathbb{R}^N} |f(x - y)|\,dm_N(x) = \int_{\mathbb{R}^N} |f|\,dm_N \]
for every fixed $y$. Consequently, we get
\[ \int_{\mathbb{R}^N} \Big| \int_{\mathbb{R}^N} f(x - y)g(y)\,dm_N(y) \Big|\,dm_N(x) \le \int_{\mathbb{R}^N} |f|\,dm_N \int_{\mathbb{R}^N} |g|\,dm_N. \]
Since, by hypothesis, the right-hand side is finite, it follows from Fubini's theorem that the function $h$ is defined for almost every $x$ and is integrable and, in fact, we also have that
\[ \int_{\mathbb{R}^N} |h|\,dm_N \le \int_{\mathbb{R}^N} |f|\,dm_N \int_{\mathbb{R}^N} |g|\,dm_N. \tag{8.3.4} \]
Now, if $f$ and $g$ are Lebesgue measurable functions, then we can find Borel measurable functions $f_0$ and $g_0$ such that $f = f_0$ and $g = g_0$ almost everywhere (cf. Exercise 3.7). Since the integrals of functions which are equal almost everywhere are the same, it follows that if $f$ and $g$ are integrable functions (with respect to the Lebesgue measure) on $\mathbb{R}^N$, then $h$ is well-defined almost everywhere. The function $h$ is called the convolution of $f$ and $g$ and is denoted by the symbol $f * g$.

8.4 Polar coordinates in $\mathbb{R}^N$

We saw that the transformation
\[ x = r\cos\theta, \qquad y = r\sin\theta, \]
in the plane $\mathbb{R}^2$ allowed us to write the integral of a non-negative function, or of an integrable function, in the form (cf. Example 7.3.2 and Example 8.3.5)
\[ \int_{\mathbb{R}^2} f\,dm_2 = \int_0^{2\pi} \int_0^\infty f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \]
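The polar-coordinate formula just recalled can be sanity-checked numerically. The Python sketch below is an illustration only, under assumptions that are not part of the text: the test function $f(x, y) = x^2 e^{-(x^2 + y^2)}$, the truncation radius `R = 6.0`, the grid sizes, and the helper name `midpoint_2d` are all hypothetical choices, and a midpoint rule replaces the Lebesgue integral. For this $f$ the exact value of the integral over the plane is $\frac{\sqrt{\pi}}{2} \cdot \sqrt{\pi} = \frac{\pi}{2}$, by Fubini's theorem and the Gaussian integral of Example 8.3.6.

```python
import math


def midpoint_2d(g, a, b, c, d, n, m):
    """Midpoint-rule approximation of the integral of g over [a, b] x [c, d],
    using n cells in the first variable and m cells in the second."""
    hx, hy = (b - a) / n, (d - c) / m
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hx
        for j in range(m):
            v = c + (j + 0.5) * hy
            total += g(u, v)
    return total * hx * hy


def f(x, y):
    # Hypothetical, non-radial test function with rapidly decaying tails.
    return x * x * math.exp(-(x * x + y * y))


R = 6.0  # truncation: the mass of f outside this range is negligible

# Left-hand side: integrate f in Cartesian coordinates over [-R, R]^2.
cartesian = midpoint_2d(f, -R, R, -R, R, 300, 300)

# Right-hand side: integrate r * f(r cos(theta), r sin(theta)) over (0, R) x (0, 2 pi).
polar = midpoint_2d(
    lambda r, th: f(r * math.cos(th), r * math.sin(th)) * r,
    0.0, R, 0.0, 2.0 * math.pi, 300, 200,
)
```

Both `cartesian` and `polar` should agree with $\pi/2$ up to the discretisation and truncation error. Note that the factor `r` multiplying the integrand in the polar version is exactly the Jacobian $J_T = r$ computed in Example 7.3.2; leaving it out would give a different value.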
In the same way, the transformation in $\mathbb{R}^3$ defined by the spherical polar coordinate system
\[ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta, \]
will convert $\int_{\mathbb{R}^3} f\,dm_3$ into the multiple integral
\[ \int_0^{2\pi} \int_0^\pi \int_0^\infty f(r\sin\theta\cos\varphi, r\sin\theta\sin\varphi, r\cos\theta)\,r^2 \sin\theta\,dr\,d\theta\,d\varphi, \]
when $f$ is a Borel measurable function defined on $\mathbb{R}^3$, which is non-negative or which is integrable. When $N > 3$, it is difficult to write down explicitly the 'polar coordinates' and the computation of the Jacobian will surely be horrendous. We will describe below the transformation of the integral when $f : \mathbb{R}^N \to \mathbb{R}$ is a radial function.

Definition 8.4.1 We say that a function $f : \mathbb{R}^N \to \mathbb{R}$ is radial if there exists a function $\tilde{f} : \mathbb{R} \to \mathbb{R}$ such that, for $x \in \mathbb{R}^N$, we have $f(x) = \tilde{f}(|x|)$.

Let $B$ be the unit ball in $\mathbb{R}^N$. Let us set $\omega_N = m_N(B)$. If $R > 0$, then the linear map $T(x) = Rx$ maps the open unit ball diffeomorphically onto $B(0; R)$, the open ball of radius $R$, and so we have (cf. Theorem 2.3.)
\[ m_N(B(0; R)) = \omega_N R^N. \]
By translation invariance, any ball in $\mathbb{R}^N$ with radius $R$ will have measure $\omega_N R^N$. Let us denote the closed ball centred at the origin and of radius $R$ by $\overline{B}(0; R)$. Let us assume that we have a continuous function $f : \overline{B}(0; R) \to \mathbb{R}$ which is radial. Thus, $f(x) = \tilde{f}(|x|)$, where $\tilde{f} : [0, R] \to \mathbb{R}$ is continuous. Consider a partition of the interval $[0, R]$:
\[ \mathcal{P} = \{0 = r_0 < r_1 < \cdots < r_n = R\}. \]
For $1 \le i \le n$, let us set
\[ A_i = \{x \in \mathbb{R}^N \mid r_{i-1} \le |x| < r_i\}, \]
so that $B(0; R) = \bigcup_{i=1}^n A_i$. Now, by the mean value theorem, there exists $\xi_i \in (r_{i-1}, r_i)$ such that
\[ r_i^N - r_{i-1}^N = N \xi_i^{N-1} (r_i - r_{i-1}), \tag{8.4.1} \]
for each $1 \le i \le n$. Let us choose $y_i \in A_i$ such that $|y_i| = \xi_i$, $1 \le i \le n$. Now define the function
\[ f_{\mathcal{P}} = \sum_{i=1}^n f(y_i)\chi_{A_i}. \]
Let
\[ \Delta(\mathcal{P}) = \max_{1 \le i \le n} (r_i - r_{i-1}). \]
Now, if $x \in A_i$, we have, for any $1 \le i \le n$,
\[ |f(x) - f_{\mathcal{P}}(x)| = |\tilde{f}(|x|) - \tilde{f}(|y_i|)| = |\tilde{f}(|x|) - \tilde{f}(\xi_i)|. \]
Since $\tilde{f}$ is uniformly continuous, given $\varepsilon > 0$, we can find $\delta > 0$ such that, if $\Delta(\mathcal{P}) < \delta$, we have
\[ |f(x) - f_{\mathcal{P}}(x)| < \varepsilon, \]
for every $x \in B(0; R)$. Thus, as $\Delta(\mathcal{P}) \to 0$, we have that $f_{\mathcal{P}} \to f$ uniformly on $B(0; R)$. Consequently (cf. Exercise 5.2),
\[ \lim_{\Delta(\mathcal{P}) \to 0} \int_{B(0;R)} f_{\mathcal{P}}\,dm_N = \int_{B(0;R)} f\,dm_N. \]
On the other hand, we have
\[
\begin{aligned}
\int_{B(0;R)} f_{\mathcal{P}}\,dm_N &= \sum_{i=1}^n \tilde{f}(\xi_i)\, m_N(A_i) \\
&= \sum_{i=1}^n \tilde{f}(\xi_i)\, \omega_N (r_i^N - r_{i-1}^N) \\
&= \sum_{i=1}^n \tilde{f}(\xi_i)\, N \omega_N \xi_i^{N-1} (r_i - r_{i-1}),
\end{aligned}
\]
in view of (8.4.1). Since $\tilde{f}$ is continuous, the last expression is a Riemann sum and we have that
\[ \lim_{\Delta(\mathcal{P}) \to 0} \sum_{i=1}^n \tilde{f}(\xi_i)\, N \omega_N \xi_i^{N-1} (r_i - r_{i-1}) = N \omega_N \int_0^R \tilde{f}(r)\, r^{N-1}\,dr. \]
Thus we have
\[ \int_{B(0;R)} f\,dm_N = N \omega_N \int_0^R \tilde{f}(r)\, r^{N-1}\,dr. \tag{8.4.2} \]
If $f$ is a continuous non-negative, or an integrable, radial function defined on $\mathbb{R}^N$, we then have
\[ \int_{\mathbb{R}^N} f\,dm_N = N \omega_N \int_0^\infty \tilde{f}(r)\, r^{N-1}\,dr. \tag{8.4.3} \]
The formula (8.4.3) is a particular case of a more general result known in the literature as the coarea formula.

Theorem 8.4.1 There exists a unique Borel measure $\sigma_{N-1}$ on the unit sphere $S^{N-1}$ in $\mathbb{R}^N$ such that, if $f : \mathbb{R}^N \to \mathbb{R}$ is a Borel measurable function which is either non-negative or integrable over $\mathbb{R}^N$ (with respect to the Lebesgue measure), then
\[ \int_{\mathbb{R}^N} f\,dm_N = \int_{[0,\infty)} \int_{S^{N-1}} f(r x')\, r^{N-1}\,d\sigma_{N-1}(x')\,dm_1(r), \]
where $r = |x|$ and $x' = \frac{x}{r} \in S^{N-1}$.

The interested reader is referred to the books of Evans and Gariepy [2] or Folland [3]. Essentially, the coarea formula says that when we integrate over $\mathbb{R}^N$, we first integrate over the surface of the sphere of radius $r$, centred at the origin, and then integrate over $r$. If $R > 0$, we have
\[ \int_{B(0;R)} f\,dm_N = \int_{[0,R)} \int_{S^{N-1}} f(r x')\, r^{N-1}\,d\sigma_{N-1}(x')\,dm_1(r). \]
Setting $R = 1$ and $f \equiv 1$, we get
\[ \omega_N = \sigma_{N-1}(S^{N-1}) \int_0^1 r^{N-1}\,dr, \]
which yields
\[ \sigma_{N-1}(S^{N-1}) = N \omega_N. \]
The quantity $\sigma_{N-1}(S^{N-1})$ is the natural '$(N-1)$-dimensional surface measure' of the unit sphere. Indeed, if $N = 2$, we have that the area of the unit disc is $\omega_2 = \pi$ while its perimeter is $\sigma_1(S^1) = 2\pi = 2\omega_2$.
If $N = 3$, the volume of the unit ball is $\omega_3 = \frac{4}{3}\pi$ and its surface area is $\sigma_2(S^2) = 4\pi = 3\omega_3$.

Remark 8.4.1 There is a rich theory of measures defined on surfaces, or more generally, lower dimensional manifolds, in $\mathbb{R}^N$. In fact there are several methods to do it, depending on how we wish to handle singularities in the geometry of these sets. The main theory is that of Hausdorff measures. See Evans and Gariepy [2], or Folland [3], for a treatment of these notions.

Example 8.4.1 (Volume of the unit ball) We now compute the value of $\omega_N = m_N(B(0;1))$. We start with the function $f(x) = e^{-|x|^2}$. We saw earlier (cf. Example 8.3.6) that
\[ \int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = \pi^{N/2}. \]
Since $f$ is a radial function, we can also compute it using polar coordinates. By (8.4.3), we get
\[ \int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = N\omega_N \int_0^\infty e^{-r^2} r^{N-1}\,dr. \]
Setting $s = r^2$, we get
\[ \int_{\mathbb{R}^N} e^{-|x|^2}\,dm_N(x) = \frac{N}{2}\,\omega_N \int_0^\infty e^{-s} s^{\frac{N}{2}-1}\,ds = \frac{N}{2}\,\omega_N\,\Gamma\!\left(\frac{N}{2}\right), \]
where $\Gamma(s)$ is the familiar gamma function. Thus, equating the two expressions we got for the integral, we obtain
\[ \omega_N = \frac{\pi^{N/2}}{\frac{N}{2}\Gamma\!\left(\frac{N}{2}\right)} = \frac{\pi^{N/2}}{\Gamma\!\left(\frac{N}{2}+1\right)}, \]
since $s\Gamma(s) = \Gamma(s+1)$. Using the last mentioned property of the gamma function and the fact that $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$, we can easily verify that
\[ \omega_2 = \pi \quad\text{and}\quad \omega_3 = \frac{4}{3}\pi. \]
We can also see that
\[ \omega_4 = \frac{1}{2}\pi^2 \quad\text{and that}\quad \omega_5 = \frac{8}{15}\pi^2, \]
and so on.

8.5 Exercises

8.1 Give an example of a non-empty set $X$ and a monotone class $\mathcal{M}$ of subsets of $X$ which contains $X$ and $\emptyset$ and which is not a $\sigma$-algebra.

8.2 Let $p \ge 1$. Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f$ be a real-valued function defined on $X$ such that $\int_X |f|^p\,d\mu < +\infty$. Show that
\[ \int_X |f|^p\,d\mu = p \int_{[0,\infty)} t^{p-1}\mu(E(t))\,dm_1(t), \]
where $E(t) = \{x \in X \mid |f(x)| > t\}$.

8.3 (a) For $x > 0$, show that
\[ \int_{[0,+\infty)} e^{-xt}\,dm_1(t) = \frac{1}{x}. \]
(b) Use the above relation and Fubini's theorem to show that
\[ \lim_{R \to +\infty} \int_0^R \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]

8.4 Let $f$, $g$ and $h$ be integrable real-valued functions defined on $\mathbb{R}^N$. Show that
(a) $f * g = g * f$.
(b) $f * (g * h)$ and $(f * g) * h$ are well-defined and equal.

8.5 Let $f$ and $g$ be integrable real-valued functions defined on $\mathbb{R}^N$. Show that (cf. Example 5.3.2)
\[ \widehat{f * g} = \widehat{f} \cdot \widehat{g}. \]

8.6 Let $(X, \mathcal{S})$ be a measurable space. Let $f$ be a real-valued, non-negative function defined on $X$. Define the upper and lower ordinate sets of $f$ by
\[ V^*(f) = \{(x,t) \in X \times \mathbb{R} \mid 0 \le t \le f(x)\} \]
and
\[ V_*(f) = \{(x,t) \in X \times \mathbb{R} \mid 0 \le t < f(x)\}, \]
respectively.
(a) If $f$ is a non-negative simple function, show that $V^*(f)$ and $V_*(f)$ are measurable in $X \times \mathbb{R}$ (where $\mathbb{R}$ is equipped with the Lebesgue measure).
(b) If $f$ and $g$ are non-negative functions such that $f(x) \le g(x)$ for all $x \in X$, show that $V^*(f) \subset V^*(g)$ and that $V_*(f) \subset V_*(g)$.
(c) Let $\{f_n\}_{n=1}^\infty$ be a sequence of non-negative measurable functions defined on $X$. If $f_n \uparrow f$, show that $\{V_*(f_n)\}_{n=1}^\infty$ is an increasing sequence of sets whose union is $V_*(f)$. If $f_n \downarrow f$, show that $\{V^*(f_n)\}_{n=1}^\infty$ is a decreasing sequence of sets whose intersection is $V^*(f)$.
(d) If $f$ is a non-negative measurable function defined on $X$, show that $V^*(f)$ and $V_*(f)$ are measurable subsets of $X \times \mathbb{R}$.
(e) If $f$ is any measurable real-valued function defined on $X$, show that its graph, $G(f)$, is a measurable subset of $X \times \mathbb{R}$, where
\[ G(f) = \{(x,t) \in X \times \mathbb{R} \mid f(x) = t\}. \]
(f) Let $(X, \mathcal{S}, \mu)$ be a $\sigma$-finite measure space. Set $\lambda = \mu \times m_1$. Show that, if $f$ is a non-negative measurable function defined on $X$, then
\[ \lambda(V^*(f)) = \lambda(V_*(f)) = \int_X f\,d\mu. \]
(This is a generalization of the notion that the (Riemann) integral of a non-negative real-valued function defined on (a sub-interval of) $\mathbb{R}$ is the area under the graph of the function.)

8.7 Let $A$ be a real, symmetric and positive definite $N \times N$ matrix. Show that
\[ \int_{\mathbb{R}^N} e^{-x^T A x}\,dm_N(x) = \sqrt{\frac{\pi^N}{\det(A)}}, \]
where $x^T$ denotes the transpose of (the column vector) $x \in \mathbb{R}^N$.

Chapter 9 Signed measures

9.1 Hahn and Jordan decompositions

Let $(X, \mathcal{S})$ be a measurable space.
Let $\mu_i$, $i = 1, 2$, be two measures defined on this space. Let $\alpha_i$, $i = 1, 2$, be non-negative real numbers. Then $\alpha_1\mu_1 + \alpha_2\mu_2$ defines a measure on this space. We now consider the possibility that $\alpha_i$, $i = 1, 2$, are arbitrary real numbers. Thus, it is possible that certain sets have negative measure. The principal difficulty in doing this is that if $\mu_i(E)$, $i = 1, 2$, are both infinite for some $E \in \mathcal{S}$, then we cannot define $\mu_1(E) - \mu_2(E)$. The situation is similar to the one we encountered when defining the integral of a function. In that case we needed that at least one of the functions, $f^+$ or $f^-$, be integrable. In the same way, if we assume that one of $\mu_1$ or $\mu_2$ is a finite measure, then, at least formally, we can define the set function $\mu_1 - \mu_2$, which will still be countably additive. Motivated by these remarks, we make the following definition.

Definition 9.1.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ be an extended real-valued set function defined on $\mathcal{S}$. It is said to be a signed measure if
(i) $\mu(\emptyset) = 0$,
(ii) $\mu$ takes at most one of the values $+\infty$ or $-\infty$, and
(iii) $\mu$ is countably additive.
A signed measure, $\mu$, is said to be finite if $|\mu(E)| < +\infty$ for every $E \in \mathcal{S}$. It is said to be $\sigma$-finite if $X = \cup_{n=1}^\infty E_n$, with $|\mu(E_n)| < +\infty$ for each $n \in \mathbb{N}$.

Example 9.1.1 As already observed, if $\mu_i$, $i = 1, 2$, are two measures on a measurable space $(X, \mathcal{S})$, and if at least one of them is finite, then $\mu_1 - \mu_2$ is a signed measure. One of our objectives in this section will be to show that every signed measure can be written as the difference of two measures, one of them finite.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019. S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_9

Example 9.1.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f$ be an integrable function defined on $X$. Define
\[ \nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{S}. \]
Then $\nu$ defines a signed measure on $(X, \mathcal{S})$.
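As a rough illustration of Example 9.1.2, the integral reduces to a finite sum when $X$ is a finite set and every subset is measurable. The Python sketch below assumes a hypothetical four-point space; the point masses `mu` and the density `f` are made-up values chosen for illustration and do not come from the text. It only shows that $\nu$ can take negative values and is additive on disjoint sets.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0, 1, 2, 3}, every subset measurable.
# mu is given by its point masses; f is an (automatically integrable) density.
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2), 3: Fraction(3)}
f = {0: Fraction(2), 1: Fraction(-1), 2: Fraction(4), 3: Fraction(-1, 3)}

def nu(E):
    """nu(E) = integral of f over E with respect to mu (a finite sum here)."""
    return sum(f[x] * mu[x] for x in E)

E, F = {0, 2}, {1, 3}           # two disjoint measurable sets
print(nu(set()))                # value on the empty set
print(nu(E), nu(F))             # one positive value, one negative value
print(nu(E | F) == nu(E) + nu(F))  # additivity on disjoint sets
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the additivity check is not affected by rounding.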
Remark 9.1.1 A signed measure is clearly finitely additive. If $\mu(E)$ is finite, then $\mu(F \setminus E) = \mu(F) - \mu(E)$, where $E, F \in \mathcal{S}$ and $E \subset F$.

Proposition 9.1.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ be a signed measure defined on it. Let $E$ and $F$ be measurable sets such that $E \subset F$. If $\mu(F)$ is finite, then $\mu(E)$ is also finite.

Proof: We have $F = (F \setminus E) \cup E$ and the two sets on the right-hand side are disjoint. Thus,
\[ \mu(F) = \mu(F \setminus E) + \mu(E). \]
If both the summands on the right-hand side of this equation are infinite, then so is $\mu(F)$, since we have assumed that $\mu$ can take at most one of the two infinite values $+\infty$ or $-\infty$. If one of them alone is finite, then again, $\mu(F)$ will be infinite. Thus both summands have to be finite, which completes the proof.

Proposition 9.1.2 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ be a signed measure defined on it. Let $\{E_n\}_{n=1}^\infty$ be a sequence of disjoint measurable sets such that $|\mu(\cup_{n=1}^\infty E_n)| < +\infty$. Then, the series $\sum_{n=1}^\infty \mu(E_n)$ is absolutely convergent.

Proof: Set
\[ E_n^+ = \begin{cases} E_n, & \text{if } \mu(E_n) \ge 0, \\ \emptyset, & \text{if } \mu(E_n) < 0, \end{cases} \quad\text{and}\quad E_n^- = \begin{cases} E_n, & \text{if } \mu(E_n) \le 0, \\ \emptyset, & \text{if } \mu(E_n) > 0. \end{cases} \]
Then
\[ \mu(\cup_{n=1}^\infty E_n^+) = \sum_{n=1}^\infty \mu(E_n^+), \qquad \mu(\cup_{n=1}^\infty E_n^-) = \sum_{n=1}^\infty \mu(E_n^-), \tag{9.1.1} \]
and the sum of the two series is $\sum_{n=1}^\infty \mu(E_n) = \mu(\cup_{n=1}^\infty E_n)$. Since $\mu$ can take at most one of the two values $+\infty$ or $-\infty$, and since $\sum_{n=1}^\infty \mu(E_n)$ is convergent by the given hypothesis, it follows that both the series in (9.1.1) are finite. But these are the series of positive terms and the series of negative terms of $\sum_{n=1}^\infty \mu(E_n)$ and so this latter series is absolutely convergent.

Proposition 9.1.3 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. If $\{E_n\}_{n=1}^\infty$ is an increasing sequence of measurable sets, then
\[ \mu(\cup_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.2} \]
If $\{E_n\}_{n=1}^\infty$ is a decreasing sequence of measurable sets such that $\mu(E_m)$ is finite for some $m \in \mathbb{N}$, then
\[ \mu(\cap_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.3} \]

Proof: The proof is exactly as in the case of measures (cf.
Propositions 1.2.4 and 1.2.5). Proposition 9.1.1 ensures that subsets of sets of finite measure are also of finite measure and we can use the subtractive property of signed measures (cf. Remark 9.1.1) to get (9.1.3).

Definition 9.1.2 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $E$ be a measurable subset of $X$. We say that $E$ is a positive set (respectively, a negative set), if for every measurable set $F$, we have $\mu(E \cap F) \ge 0$ (respectively, $\mu(E \cap F) \le 0$). Equivalently, $E$ is a positive (respectively, negative) set if for every measurable subset $F \subset E$, we have $\mu(F) \ge 0$ (respectively, $\mu(F) \le 0$).

Remark 9.1.2 The empty set is both a positive and a negative set.

Remark 9.1.3 Any measurable subset of a positive (respectively, negative) set is positive (respectively, negative). Any (finite or countable) disjoint union of positive (respectively, negative) sets is positive (respectively, negative). If $A_i$, $i = 1, 2$, are positive (respectively, negative) sets, then so is $A_1 \setminus A_2$. Consequently, any countable union of positive (respectively, negative) sets is positive (respectively, negative).

Proposition 9.1.4 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $\{B_i\}_{i=1}^n$ be a finite collection of negative sets. Let $B = \cup_{i=1}^n B_i$. Then
\[ \mu(B) \le \min_{1 \le i \le n} \mu(B_i). \tag{9.1.4} \]

Proof: We have $B_1 \cup B_2 = B_1 \cup (B_2 \setminus B_1)$ and the latter is a disjoint union. By the preceding remark, $B_2 \setminus B_1$ is a negative set, and so we have
\[ \mu(B_1 \cup B_2) = \mu(B_1) + \mu(B_2 \setminus B_1) \le \mu(B_1). \]
Similarly, $\mu(B_1 \cup B_2) \le \mu(B_2)$. This proves (9.1.4) when $n = 2$. The general case now follows by induction on $n$.

Theorem 9.1.1 (Hahn decomposition) Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. There exist two disjoint sets $A$ and $B$ such that $X = A \cup B$, and such that $A$ is a positive set and $B$ is a negative set.

Proof: Without loss of generality, let us assume that for all $E \in \mathcal{S}$, we have
\[ -\infty < \mu(E) \le +\infty. \]
Step 1.
Let us denote by $\mathcal{N}$ the collection of all negative subsets of $X$. Set
\[ \beta = \inf_{B \in \mathcal{N}} \mu(B). \]
Let $\{B_i\}_{i=1}^\infty$ be a sequence of sets in $\mathcal{N}$ such that $\mu(B_i) \downarrow \beta$. If $B = \cup_{i=1}^\infty B_i$, we have seen that $B \in \mathcal{N}$ and so $\beta \le \mu(B)$. On the other hand, if we set $C_n = \cup_{i=1}^n B_i$, then $\mu(B) = \lim_{n \to \infty} \mu(C_n)$, by Proposition 9.1.3. But, by Proposition 9.1.4, we have that $\mu(C_n) \le \mu(B_n)$ and so $\mu(B) \le \beta$. Thus, $\mu(B) = \beta$. In particular, by our assumption on $\mu$, we have that $\beta$ is finite.

Step 2. Let $A = X \setminus B$. We will show that $A$ is a positive set. If not, there exists a measurable set $E_0 \subset A$ such that $\mu(E_0) < 0$. Assume, if possible, that $E_0$ is a negative set. Then, $B \cup E_0$ is a negative set and, since $B$ and $E_0$ are disjoint,
\[ \mu(B \cup E_0) = \mu(B) + \mu(E_0) < \beta, \]
which is impossible by the definition of $\beta$. Thus, there exists a measurable subset of $E_0$ with positive measure. Since $\mu(E_0)$, being negative, is finite, so is the measure of any measurable subset of $E_0$. Let $k_1$ be the smallest positive integer such that there exists a measurable set $E_1 \subset E_0$ with $\mu(E_1) \ge \frac{1}{k_1}$. Now,
\[ \mu(E_0 \setminus E_1) = \mu(E_0) - \mu(E_1) \le \mu(E_0) - \frac{1}{k_1} < 0. \]

Step 3. We can now apply the procedure of Step 2 to the set $E_0 \setminus E_1$. Then there exist measurable subsets of $E_0 \setminus E_1$ with positive measure, and let $k_2$ be the smallest positive integer with the property that there exists such a set $E_2$ with $\mu(E_2) \ge \frac{1}{k_2}$. (In other words, at each stage, we choose a set with positive measure with the measure being as large as possible.) Proceeding in this way, for each positive integer $i$, there exists a measurable set of positive measure contained in $E_0 \setminus \cup_{\ell=1}^{i-1} E_\ell$ and let $k_i$ be the smallest positive integer with the property that there exists such a measurable subset $E_i$ with $\mu(E_i) \ge \frac{1}{k_i}$. Then, since $E_i \subset E_0 \setminus \cup_{\ell=1}^{i-1} E_\ell$, the sets $\{E_i\}_{i=1}^\infty$ are clearly disjoint, and so
\[ \sum_{i=1}^\infty \mu(E_i) = \mu(\cup_{i=1}^\infty E_i) < +\infty, \]
since $\cup_{i=1}^\infty E_i \subset E_0$, which has finite measure. In particular, it follows that $\mu(E_i) \to 0$ as $i \to \infty$ and so $k_i \to \infty$.

Step 4.
Let $F$ be a measurable set such that $F \subset E_0 \setminus \cup_{i=1}^\infty E_i$. Then $\mu(F) \le 0$. If not, let $k_n$ be such that $\mu(F) \ge \frac{1}{k_n}$, which is possible, since $k_i \to \infty$. Then, for all $m \ge n$, we have
\[ F \subset E_0 \setminus \cup_{i=1}^\infty E_i \subset E_0 \setminus \cup_{i=1}^m E_i, \]
which yields, by definition of the $k_i$, that $k_m \le k_n$, which is a contradiction, since $k_m \to \infty$. Thus,
\[ F_0 = E_0 \setminus \cup_{i=1}^\infty E_i \]
is a negative set and $\mu(F_0) \le \mu(E_0) < 0$. Once again, this is a contradiction since we then have that $B \cup F_0$ is a negative set and $\mu(B \cup F_0) < \beta$. Thus $A$ is a positive set. This completes the proof.

Remark 9.1.4 The decomposition of $X$ into two disjoint sets, one positive and the other negative, is called a Hahn decomposition of $X$. Such a decomposition is not unique. Let $X = A \cup B$ be a Hahn decomposition and assume that there exists a measurable set $N \subset B$ such that $\mu(N) = 0$. Let $F \subset N$ be measurable. Then $\mu(F) \le 0$. If $\mu(F) < 0$, then
\[ 0 = \mu(N) = \mu(F) + \mu(N \setminus F), \]
which implies that $\mu(N \setminus F) > 0$, which is not possible since $B$ and, hence, $N$, is a negative set. Thus, $\mu(F) = 0$, for all measurable $F \subset N$. Then it is clear that
\[ X = (A \cup N) \cup (B \setminus N) \]
gives another Hahn decomposition of $X$.

The situation described in the preceding remark is, in fact, the only way non-uniqueness can occur for the Hahn decomposition. More precisely, we have the following result.

Proposition 9.1.5 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $X = A_i \cup B_i$, $i = 1, 2$, be two Hahn decompositions of $X$. Then, for every $E \in \mathcal{S}$, we have
\[ \mu(E \cap A_1) = \mu(E \cap A_2) \quad\text{and}\quad \mu(E \cap B_1) = \mu(E \cap B_2). \]

Proof: Let $E \in \mathcal{S}$. We have $E \cap (A_1 \setminus A_2) \subset A_1$ and so $\mu(E \cap (A_1 \setminus A_2)) \ge 0$. On the other hand,
\[ E \cap (A_1 \setminus A_2) = E \cap A_1 \cap A_2^c = E \cap A_1 \cap B_2 \subset B_2, \]
and so $\mu(E \cap (A_1 \setminus A_2)) \le 0$. Thus $\mu(E \cap (A_1 \setminus A_2)) = 0$ and, similarly, $\mu(E \cap (B_1 \setminus B_2)) = 0$ as well. Consequently,
\[ \mu(E \cap (A_1 \cup A_2)) = \mu(E \cap A_2) + \mu(E \cap (A_1 \setminus A_2)) = \mu(E \cap A_2). \]
Interchanging the roles of $A_1$ and $A_2$, we get
\[ \mu(E \cap (A_1 \cup A_2)) = \mu(E \cap A_1). \]
Thus $\mu(E \cap A_1) = \mu(E \cap A_2)$.
This proves the first relation in the statement of the proposition. The proof of the other one is similar.

Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let us now define two set functions on $\mathcal{S}$ by
\[ \mu^+(E) = \mu(E \cap A), \qquad \mu^-(E) = -\mu(E \cap B), \tag{9.1.5} \]
where $E \in \mathcal{S}$ and $X = A \cup B$ is a Hahn decomposition of $X$. By the preceding proposition, $\mu^\pm$ are well-defined, since they do not depend on the Hahn decomposition chosen. Further, it is clear that they are both measures. Also, since $\mu$ takes at most one of the two infinite values $+\infty$ or $-\infty$, it follows that one of these two measures is finite and we have
\[ \mu(E) = \mu^+(E) - \mu^-(E), \quad E \in \mathcal{S}. \]
If $\mu$ is finite (respectively, $\sigma$-finite), then the same is true for $\mu^\pm$. We have thus proved the following result.

Theorem 9.1.2 (Jordan decomposition) Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Then $\mu$ is the difference of two (positive) measures $\mu^+$ and $\mu^-$, at least one of which is finite. If $\mu$ is finite (respectively, $\sigma$-finite), then so are $\mu^\pm$.

Definition 9.1.3 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. The relation $\mu = \mu^+ - \mu^-$ is called the Jordan decomposition of $\mu$. The measures $\mu^+$ and $\mu^-$ are respectively called the upper and lower variations of the signed measure $\mu$. The measure $|\mu| = \mu^+ + \mu^-$ is called the total variation of the signed measure $\mu$.

Definition 9.1.4 Let $(X, \mathcal{S})$ be a measurable space. A complex measure is a set function $\mu$ defined on $\mathcal{S}$ which can be written as $\mu = \mu_1 + i\mu_2$, where $\mu_j$, $j = 1, 2$, are signed measures and $i$ is a square root of $-1$.

Proposition 9.1.6 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $E \in \mathcal{S}$. Let $\mu^\pm$ be the upper and lower variations of $\mu$. Then
\[ \mu^+(E) = \sup\{\mu(F) \mid F \subset E,\ F \in \mathcal{S}\}, \qquad \mu^-(E) = -\inf\{\mu(F) \mid F \subset E,\ F \in \mathcal{S}\}. \tag{9.1.6} \]

Proof: Let $X = A \cup B$ be a Hahn decomposition of $X$. Let $E, F \in \mathcal{S}$ be such that $F \subset E$. Since $\mu^+$ is a measure, we have that $\mu(F \cap A) \le \mu(E \cap A)$.
Thus
\[ \mu(F) = \mu(F \cap A) + \mu(F \cap B) \le \mu(F \cap A) \le \mu(E \cap A) = \mu^+(E). \]
It then follows that
\[ \sup\{\mu(F) \mid F \subset E,\ F \in \mathcal{S}\} \le \mu^+(E). \]
Since $\mu^+(E) = \mu(E \cap A)$ and $E \cap A \subset E$, the reverse inequality is obvious. This proves the first relation in (9.1.6). The proof of the second relation is similar.

Example 9.1.3 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be an integrable function defined on $X$. Consider the signed measure $\nu$ defined by
\[ \nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{S}. \]
Then we have a Hahn decomposition $X = A \cup B$ where
\[ A = \{x \in X \mid f(x) > 0\}, \qquad B = \{x \in X \mid f(x) \le 0\}. \]
The upper and lower variations $\nu^\pm$ of $\nu$ are given by
\[ \nu^+(E) = \int_E f^+\,d\mu \quad\text{and}\quad \nu^-(E) = \int_E f^-\,d\mu, \]
and the total variation $|\nu|$ is given by
\[ |\nu|(E) = \int_E |f|\,d\mu, \]
for $E \in \mathcal{S}$.

Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. It is clear that a measurable function $f$ defined on $X$ is integrable with respect to $|\mu|$ if, and only if, it is integrable with respect to both $\mu^+$ and $\mu^-$. In that case we can define the integral of $f$ over $X$, with respect to $\mu$.

Definition 9.1.5 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $f$ be a measurable function defined on $X$ which is integrable with respect to $|\mu|$. Then, we say that $f$ is integrable with respect to the signed measure $\mu$ and we define
\[ \int_X f\,d\mu = \int_X f\,d\mu^+ - \int_X f\,d\mu^-, \]
where $\mu = \mu^+ - \mu^-$ is the Jordan decomposition of $\mu$. If $\mu_i$, $i = 1, 2$, are two signed measures defined on the measurable space $(X, \mathcal{S})$, and if $\mu$ is the complex measure defined by $\mu = \mu_1 + i\mu_2$, we say that a measurable function $f$ defined on $X$ is integrable with respect to $\mu$ if it is integrable with respect to both $\mu_1$ and $\mu_2$ and we define
\[ \int_X f\,d\mu = \int_X f\,d\mu_1 + i\int_X f\,d\mu_2. \]

9.2 Absolute continuity

Definition 9.2.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be signed measures defined on it. We say that $\nu$ is absolutely continuous with respect to $\mu$ if $\nu(E) = 0$ whenever $|\mu|(E) = 0$, $E \in \mathcal{S}$. In this case we write $\nu \ll \mu$.
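The Jordan decomposition of Example 9.1.3, the formulas (9.1.6) and the condition of Definition 9.2.1 can all be checked by brute force when $X$ is finite. The following Python sketch is only an illustration under stated assumptions: the four-point space, the point masses `mu` and the density `f` are hypothetical, every subset is taken to be measurable, and the Hahn decomposition is chosen according to the sign of `f`.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical data: X = {0, 1, 2, 3}, a measure mu (point masses) and a density f.
# The point 2 has mu-measure zero, so it is the only non-empty mu-null set.
X = [0, 1, 2, 3]
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(0), 3: Fraction(3)}
f = {0: Fraction(2), 1: Fraction(-1), 2: Fraction(5), 3: Fraction(-1, 3)}

def subsets(E):
    """All subsets of the finite set E, as tuples (every subset is measurable)."""
    E = list(E)
    return chain.from_iterable(combinations(E, k) for k in range(len(E) + 1))

def mu_of(E):
    return sum(mu[x] for x in E)

def nu(E):
    """nu(E) = integral of f over E with respect to mu (a finite sum)."""
    return sum(f[x] * mu[x] for x in E)

# A Hahn decomposition for nu, chosen by the sign of f.
A = {x for x in X if f[x] > 0}
B = {x for x in X if f[x] <= 0}

def nu_plus(E):
    return nu(set(E) & A)       # upper variation, as in (9.1.5)

def nu_minus(E):
    return -nu(set(E) & B)      # lower variation, as in (9.1.5)

def total_variation(E):
    return nu_plus(E) + nu_minus(E)

# (9.1.6): nu_plus(E) is the sup, and nu_minus(E) minus the inf, of nu over subsets of E.
sup_formula_holds = all(
    nu_plus(E) == max(nu(F) for F in subsets(E))
    and nu_minus(E) == -min(nu(F) for F in subsets(E))
    for E in subsets(X)
)

# Definition 9.2.1 (here |mu| = mu): nu(E) = 0 whenever mu(E) = 0.
abs_continuous = all(nu(E) == 0 for E in subsets(X) if mu_of(E) == 0)

print(sup_formula_holds, abs_continuous)
print(nu_plus(X), nu_minus(X), total_variation(X))
```

Enumerating all subsets is exponential in the size of $X$, so this is meant purely as a sanity check on very small examples.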
Example 9.2.1 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be an integrable function defined on $X$. For $E \in \mathcal{S}$, define the signed measure
\[ \nu(E) = \int_E f\,d\mu. \]
Then $\nu \ll \mu$.

Example 9.2.2 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be measures defined on it. Then $\mu \ll \mu + \nu$ and $\nu \ll \mu + \nu$.

Example 9.2.3 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Then $\mu^+ \ll \mu$ and $\mu^- \ll \mu$. We also have $\mu \ll |\mu|$ and $|\mu| \ll \mu$.

Example 9.2.4 Let $X = [0,1]$ be equipped with the Lebesgue measure $m_1$. Let $F = [0, \frac{1}{2}]$ and, for $x \in X$, set
\[ f_1(x) = 2\chi_F(x) - 1, \qquad f_2(x) = x. \]
Let, for $E \in \mathcal{L}_1$,
\[ \mu_i(E) = \int_E f_i\,dm_1, \quad i = 1, 2. \]
Then, since $|f_1| \equiv 1$, we have $|\mu_1| = m_1$. Consequently (cf. Example 9.2.1), $\mu_2 \ll \mu_1$. However,
\[ \mu_1(X) = \int_X f_1\,dm_1 = 0, \]
while
\[ \mu_2(X) = \int_0^1 x\,dx = \frac{1}{2} \ne 0. \]
Thus, $\mu_2 \ll \mu_1$ and $\mu_1(E) = 0$ together do not imply that $\mu_2(E) = 0$; it is the vanishing of $|\mu_1|(E)$ that matters.

Proposition 9.2.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be signed measures defined on it. The following statements are equivalent.
(i) $\nu \ll \mu$.
(ii) $\nu^\pm \ll \mu$.
(iii) $|\nu| \ll |\mu|$.

Proof: (i) $\Rightarrow$ (ii). Let $X = A \cup B$ be a Hahn decomposition of $X$ with respect to $\nu$, so that, for $E \in \mathcal{S}$, we have
\[ \nu^+(E) = \nu(E \cap A) \quad\text{and}\quad \nu^-(E) = -\nu(E \cap B). \]
Let $E \in \mathcal{S}$ be such that $|\mu|(E) = 0$. Then
\[ 0 \le |\mu|(E \cap A) \le |\mu|(E) = 0, \qquad 0 \le |\mu|(E \cap B) \le |\mu|(E) = 0. \]
Then, since $\nu \ll \mu$, we get $\nu(E \cap A) = \nu(E \cap B) = 0$. Thus, $\nu^\pm \ll \mu$.
(ii) $\Rightarrow$ (iii). Clearly, $\nu^\pm \ll \mu$ implies that $\nu^\pm \ll |\mu|$. Thus, if $|\mu|(E) = 0$, then $\nu^\pm(E) = 0$ and so $|\nu|(E) = 0$ as well. Thus, $|\nu| \ll |\mu|$.
(iii) $\Rightarrow$ (i). If $|\mu|(E) = 0$, then $|\nu|(E) = 0$ and so $\nu^\pm(E) = 0$, from which we get $\nu(E) = \nu^+(E) - \nu^-(E) = 0$. This completes the proof.

Definition 9.2.2 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be signed measures defined on it. We say that $\mu$ is equivalent to $\nu$ if both the relations $\mu \ll \nu$ and $\nu \ll \mu$ hold. In this case we write $\mu \equiv \nu$.

Example 9.2.5 If $\mu$ is a signed measure defined on a measurable space $(X, \mathcal{S})$, then $\mu \equiv |\mu|$.
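The numbers in Example 9.2.4 can be reproduced exactly on intervals. The Python sketch below is an illustration only: it restricts attention to sets of the form $[a,b] \subset [0,1]$ (a simplifying assumption, since the example concerns all Lebesgue measurable sets), and the helper names `length`, `mu1`, `mu1_total_variation` and `mu2` are introduced here and do not appear in the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def length(a, b):
    """Lebesgue measure of the interval [a, b] (zero if b <= a)."""
    return max(b - a, Fraction(0))

def mu1(a, b):
    """Integral of f1 = 2*chi_[0,1/2] - 1 over [a, b], for 0 <= a <= b <= 1."""
    plus_part = length(a, min(b, HALF))    # portion of [a, b] where f1 = +1
    minus_part = length(max(a, HALF), b)   # portion of [a, b] where f1 = -1
    return plus_part - minus_part

def mu1_total_variation(a, b):
    """|mu1|([a, b]) = integral of |f1| over [a, b] = m1([a, b]), since |f1| = 1."""
    return length(a, b)

def mu2(a, b):
    """Integral of f2(x) = x over [a, b]."""
    return (b * b - a * a) / 2

zero, one = Fraction(0), Fraction(1)
print(mu1(zero, one))                  # mu1(X): the two halves cancel
print(mu1_total_variation(zero, one))  # |mu1|(X) = m1(X)
print(mu2(zero, one))                  # mu2(X)
```

The point of the example is visible in the output: $\mu_1(X)$ vanishes while $|\mu_1|(X)$ and $\mu_2(X)$ do not, which is why Definition 9.2.1 is phrased in terms of $|\mu|$.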
We defined a notion of absolute continuity of a measure with respect to another in Remark 5.3.2, based on the result of Proposition 5.3.2. In that case, the measure $\nu$ is also absolutely continuous with respect to $\mu$ according to Definition 9.2.1 above. We reconcile these two definitions in the proposition below.

Proposition 9.2.2 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be signed measures defined on it. Assume that $\nu$ is finite and that $\nu \ll \mu$. Then, given $\varepsilon > 0$, we can find $\delta > 0$ such that, whenever $|\mu|(E) < \delta$, $E \in \mathcal{S}$, we have $|\nu|(E) < \varepsilon$.

Proof: Assume the contrary. Then, there exists $\varepsilon > 0$ such that, for every $n \in \mathbb{N}$, there exists a set $E_n \in \mathcal{S}$ with $|\mu|(E_n) < \frac{1}{2^n}$ and $|\nu|(E_n) \ge \varepsilon$. Set $E = \limsup_{n \to \infty} E_n$. Then, for every $n \in \mathbb{N}$,
\[ |\mu|(E) \le \sum_{m=n}^\infty |\mu|(E_m) < \frac{1}{2^{n-1}}, \]
and so $|\mu|(E) = 0$. Since $|\nu|$ is finite (cf. Exercise 1.10 (b)), we have $|\nu|(E) \ge \varepsilon$, which contradicts the absolute continuity of $\nu$ with respect to $\mu$.

Example 9.2.6 The above result is not true, in general, if $\nu$ is not finite. Let $X = \mathbb{N}$ and let $\mathcal{S} = \mathcal{P}(\mathbb{N})$. Let $\mu(\{n\}) = 2^{-n}$ and let $\nu(\{n\}) = 2^n$. Then $\mu$ and $\nu$ define measures and since the only set, in either case, with measure zero is the empty set, we have that $\nu \ll \mu$. However, for any $\delta > 0$, we can find $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$, we have $\mu(\{n\}) < \delta$, while $\{\nu(\{n\})\}_{n \ge n_0}$ is unbounded.

Proposition 9.2.3 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be finite measures on it such that $\nu \ll \mu$. Assume that $\nu$ is not identically zero. Then, there exists $\varepsilon > 0$ and a measurable set $A$ with $\mu(A) > 0$, such that $A$ is a positive set for the signed measure $\nu - \varepsilon\mu$.

Proof: For each $n \in \mathbb{N}$, consider the signed measure $\nu - \frac{1}{n}\mu$ and let $X = A_n \cup B_n$ be a Hahn decomposition for this signed measure. Set
\[ A_0 = \cup_{n=1}^\infty A_n \quad\text{and}\quad B_0 = \cap_{n=1}^\infty B_n. \]
Since $B_0 \subset B_n$ for each $n$, and since $B_n$ is a negative set for the signed measure $\nu - \frac{1}{n}\mu$, we have
\[ 0 \le \nu(B_0) \le \frac{1}{n}\mu(B_0). \]
Thus, $\nu(B_0) = 0$.
Since $A_0 = B_0^c$, and since $\nu$ is not identically zero, it follows that $\nu(A_0) > 0$. By absolute continuity, it follows that $\mu(A_0) > 0$ as well. Then, there exists $n \in \mathbb{N}$ such that $\mu(A_n) > 0$. We can now set $A = A_n$ and $\varepsilon = \frac{1}{n}$.

9.3 The Radon-Nikodym theorem

Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative integrable function defined on $X$. If we define the measure $\nu$ by
\[ \nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{S}, \]
then $\nu$ is absolutely continuous with respect to $\mu$. The Radon-Nikodym theorem states that in the $\sigma$-finite case, every signed measure $\nu$, which is absolutely continuous with respect to $\mu$, arises in this fashion. We will first prove this for finite measures and then extend it to the general case.

Theorem 9.3.1 (Radon-Nikodym) Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $\nu$ be a finite measure defined on $\mathcal{S}$ such that $\nu \ll \mu$. Then, there exists a non-negative function $f$ defined on $X$, which is integrable with respect to $\mu$, and such that
\[ \nu(E) = \int_E f\,d\mu, \]
for every $E \in \mathcal{S}$. The function $f$ is unique in the sense that, if $g$ is another integrable function such that $\nu(E) = \int_E g\,d\mu$ for every $E \in \mathcal{S}$, then $f = g$ almost everywhere with respect to the measure $\mu$.

Proof: Step 1. Uniqueness. If $f$ and $g$ are two functions such that
\[ \nu(E) = \int_E f\,d\mu = \int_E g\,d\mu, \]
for every $E \in \mathcal{S}$, then, for every $n \in \mathbb{N}$, we have that $\mu(E_n) = 0$, where
\[ E_n = \left\{x \in X \;\Big|\; f(x) - g(x) > \frac{1}{n}\right\}. \]
It then immediately follows that
\[ \mu(\{x \in X \mid f(x) - g(x) > 0\}) = 0. \]
Similarly, we have
\[ \mu(\{x \in X \mid f(x) - g(x) < 0\}) = 0, \]
whence we deduce that
\[ \mu(\{x \in X \mid f(x) \ne g(x)\}) = 0. \]

Step 2. Let us denote by $\mathcal{L}(\mu)$ the set of all measurable functions defined on $X$ which are integrable with respect to $\mu$. Define
\[ \mathcal{K} = \left\{ f \in \mathcal{L}(\mu) \;\Big|\; f \ge 0 \text{ and } \int_E f\,d\mu \le \nu(E) \text{ for every } E \in \mathcal{S} \right\}. \]
Then $\mathcal{K} \ne \emptyset$. To see this, let $\varepsilon > 0$ and $A \in \mathcal{S}$ be as in the statement of Proposition 9.2.3. Then, if we set $f = \varepsilon\chi_A$, we have
\[ \int_E f\,d\mu = \varepsilon\mu(E \cap A) \le \nu(E \cap A) \le \nu(E), \]
for every $E \in \mathcal{S}$. Thus $f \in \mathcal{K}$ and $\int_X f\,d\mu = \varepsilon\mu(A) > 0$.
Now set
\[ \alpha = \sup\left\{ \int_X f\,d\mu \;\Big|\; f \in \mathcal{K} \right\}. \]
Then, $0 < \alpha \le \nu(X) < +\infty$.

Step 3. Let $\{g_n\}_{n=1}^\infty$ be a sequence of functions in $\mathcal{K}$ such that
\[ \int_X g_n\,d\mu > \alpha - \frac{1}{n}. \]
Let $f_n = \max\{g_1, \cdots, g_n\} \ge 0$. We claim that $f_n \in \mathcal{K}$. Indeed, let
\[ E_i^n = \{x \in X \mid f_n(x) = g_i(x)\}, \]
for $1 \le i \le n$. Then $X = \cup_{i=1}^n E_i^n$. Set $F_1^n = E_1^n$ and
\[ F_i^n = E_i^n \setminus (\cup_{k=1}^{i-1} F_k^n). \]
Then the $F_i^n$, $1 \le i \le n$, are disjoint, $F_i^n \subset E_i^n$ for $1 \le i \le n$ and $X = \cup_{i=1}^n F_i^n$. Thus, if $E \in \mathcal{S}$, we have
\[ \int_E f_n\,d\mu = \sum_{i=1}^n \int_{E \cap F_i^n} f_n\,d\mu = \sum_{i=1}^n \int_{E \cap F_i^n} g_i\,d\mu \le \sum_{i=1}^n \nu(E \cap F_i^n) = \nu(E). \]
This proves that $f_n \in \mathcal{K}$.

Step 4. We have that $\{f_n\}_{n=1}^\infty$ is an increasing sequence of non-negative functions. Let $f = \lim_{n \to \infty} f_n$. Then, by the monotone convergence theorem,
\[ \int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu \le \nu(E), \]
for every $E \in \mathcal{S}$. Thus, $f \in \mathcal{K}$ as well. Consequently, $\int_X f\,d\mu \le \alpha$. On the other hand,
\[ \int_X f\,d\mu \ge \int_X f_n\,d\mu \ge \int_X g_n\,d\mu > \alpha - \frac{1}{n}, \]
for every $n \in \mathbb{N}$. Consequently, $\int_X f\,d\mu \ge \alpha$ as well and so we have that
\[ f \in \mathcal{K} \quad\text{and}\quad \int_X f\,d\mu = \alpha. \]

Step 5. Define the measure $\nu_1$ by
\[ \nu_1(E) = \int_E f\,d\mu, \quad E \in \mathcal{S}. \]
Then $\nu_1$ is a finite measure and $\nu_1 \ll \mu$. Set $\nu_0 = \nu - \nu_1$, which is a well-defined signed measure since both $\nu$ and $\nu_1$ are finite measures. Since $f \in \mathcal{K}$, we have that $\nu_1(E) \le \nu(E)$ for every $E \in \mathcal{S}$. Thus $\nu_0$ is a measure and we also have that $\nu_0 \ll \mu$.

Step 6. We claim that $\nu_0 \equiv 0$. This will complete the proof. Assume the contrary. Then, since $\nu_0$ is a finite measure which is absolutely continuous with respect to $\mu$, we may again apply Proposition 9.2.3. Thus, there exist $\eta > 0$ and a set $F \in \mathcal{S}$, which is a positive set for the signed measure $\nu_0 - \eta\mu$, and such that $\mu(F) > 0$. Hence, for every $E \in \mathcal{S}$, we have
\[ \eta\mu(E \cap F) \le \nu_0(E \cap F) = \nu(E \cap F) - \int_{E \cap F} f\,d\mu. \]
Now, set $h = f + \eta\chi_F$. Then, if $E \in \mathcal{S}$, we have
\[ \int_E h\,d\mu = \int_E f\,d\mu + \eta\mu(E \cap F) \le \int_E f\,d\mu - \int_{E \cap F} f\,d\mu + \nu(E \cap F) = \int_{E \cap F^c} f\,d\mu + \nu(E \cap F) \le \nu(E \cap F^c) + \nu(E \cap F) = \nu(E). \]
Thus $h \in \mathcal{K}$. But
\[ \int_X h\,d\mu = \int_X f\,d\mu + \eta\mu(F) > \alpha, \]
which is a contradiction. Thus, $\nu_0 \equiv 0$.
Hence $f$ is the required function and this completes the proof.

If $\mu$ and $\nu$ are $\sigma$-finite measures, then we can write $X$ as the disjoint union of a countable number of measurable sets, such that on each of them $\mu$ and $\nu$ are finite. Thus we can 'patch up' the functions obtained on each of these sets to get $f$ defined on $X$ as in the theorem. Next, if $\nu$ is a $\sigma$-finite signed measure, we can write $\nu = \nu^+ - \nu^-$. Since $\nu^\pm$ will also be $\sigma$-finite, the preceding argument gives two functions $f_+$ and $f_-$ such that, for every $E \in \mathcal{S}$, we have
\[ \nu^\pm(E) = \int_E f_\pm\,d\mu. \]
Then we can set $f = f_+ - f_-$ to get the result of the theorem in this case. Finally, let us assume that $\mu$ is also a $\sigma$-finite signed measure. Let $X = A \cup B$ be a Hahn decomposition with respect to $\mu$. Then, if $E \subset A$, we have $|\mu|(E) = \mu^+(E)$ and if $E \subset B$, we have $|\mu|(E) = \mu^-(E)$. Thus, when restricted to $A$ or to $B$, since we have that $\nu \ll \mu$, we deduce that $\nu \ll \mu^+$ on $A$ and $\nu \ll \mu^-$ on $B$. Hence, there exist functions $f_A$ and $f_B$ such that
\[ \nu(E) = \int_E f_A\,d\mu^+ \ \text{ for every } E \in \mathcal{S},\ E \subset A, \qquad \nu(E) = \int_E f_B\,d\mu^- \ \text{ for every } E \in \mathcal{S},\ E \subset B. \]
Now, if $E \in \mathcal{S}$, we have
\[ \nu(E) = \nu(E \cap A) + \nu(E \cap B) = \int_{E \cap A} f_A\,d\mu^+ + \int_{E \cap B} f_B\,d\mu^- = \int_E (f_A\chi_A - f_B\chi_B)\,d\mu. \]
Combining all these, we get the following result.

Theorem 9.3.2 (Radon-Nikodym) Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be $\sigma$-finite signed measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. Then, there exists a measurable function $f$ defined on $X$ such that
\[ \nu(E) = \int_E f\,d\mu, \tag{9.3.1} \]
for every $E \in \mathcal{S}$.

Definition 9.3.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be $\sigma$-finite signed measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. The function $f$ occurring in (9.3.1) is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ and we formally write $f = \frac{d\nu}{d\mu}$.

Proposition 9.3.1 Let $(X, \mathcal{S})$ be a measurable space and let $\lambda$ and $\mu$ be $\sigma$-finite measures defined on $\mathcal{S}$. Let $\mu \ll \lambda$. Let $\nu$ be a $\sigma$-finite signed measure defined on $\mathcal{S}$ such that $\nu \ll \mu$.
Then
\[ \frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu}\,\frac{d\mu}{d\lambda}, \]
almost everywhere (with respect to the measure $\lambda$).

Proof: As usual, by considering the upper and lower variations $\nu^\pm$ separately, we may assume that $\nu$ is a measure, without loss of generality. Thus, let $f = \frac{d\nu}{d\mu} \ge 0$ and let $g = \frac{d\mu}{d\lambda} \ge 0$. By virtue of Proposition 5.2.7, we have, for $E \in \mathcal{S}$,
\[ \nu(E) = \int_E f\,d\mu = \int_E fg\,d\lambda, \]
which completes the proof.

9.4 Singularity

If a measure is absolutely continuous with respect to another, then the former vanishes whenever the latter vanishes. We now consider the opposite notion.

Definition 9.4.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be two measures defined on $\mathcal{S}$. We say that $\nu$ is singular with respect to $\mu$, and we write $\nu \perp \mu$, if there exists $E \in \mathcal{S}$ such that $\mu(E) = 0$ and $\nu \equiv 0$ on $E^c$, i.e. if $F \subset E^c$, $F \in \mathcal{S}$, then $\nu(F) = 0$.

Example 9.4.1 Let $m_1$ be the Lebesgue measure on $\mathbb{R}$ and let $\delta$ be the Dirac measure concentrated at the origin. Then, if we set $E = \{0\}$, we have $m_1(E) = 0$ and $\delta \equiv 0$ on $E^c$. Thus $\delta \perp m_1$.

Example 9.4.2 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Then $\mu^+ \perp \mu^-$ and $\mu^- \perp \mu^+$.

Theorem 9.4.1 (Lebesgue decomposition) Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be two $\sigma$-finite measures defined on $\mathcal{S}$. Then, there exist two uniquely determined measures $\nu_0$ and $\nu_1$ such that $\nu = \nu_0 + \nu_1$, $\nu_0 \perp \mu$ and $\nu_1 \ll \mu$.

Proof: Since $\nu \ll \mu + \nu$, there exists a non-negative function $f$ such that
\[ \nu(E) = \int_E f\,d(\mu + \nu), \]
for every $E \in \mathcal{S}$. Set $A = \{x \in X \mid f(x) \ge 1\}$ and $B = A^c$. Then,
\[ \nu(A) \ge \int_A d(\mu + \nu) = \mu(A) + \nu(A), \]
whence we deduce that $\mu(A) = 0$. Define, for $E \in \mathcal{S}$,
\[ \nu_0(E) = \nu(E \cap A) \quad\text{and}\quad \nu_1(E) = \nu(E \cap B). \]
Then $\nu = \nu_0 + \nu_1$ and $\nu_0 \perp \mu$. We will now show that $\nu_1 \ll \mu$, which will establish the decomposition of $\nu$. Let $E \in \mathcal{S}$ be such that $\mu(E) = 0$. Then
\[ \nu_1(E) = \int_{E \cap B} d\nu = \int_{E \cap B} f\,d\mu + \int_{E \cap B} f\,d\nu. \]
Since $\mu(E) = 0$, we have that $\int_{E \cap B} f\,d\mu = 0$. Thus we get
\[ \int_{E \cap B} (1 - f)\,d\nu = 0. \]
But on $B$, we have that $0 \le f < 1$. Hence it follows that $\nu(E \cap B) = 0$, i.e. $\nu_1(E) = 0$.
This shows that $\nu_1 \ll \mu$.

To complete the proof, we need to show that $\nu_0$ and $\nu_1$ are uniquely determined. Let $\nu = \nu_0 + \nu_1 = \widetilde{\nu}_0 + \widetilde{\nu}_1$ be two Lebesgue decompositions of $\nu$ with respect to $\mu$. Let $A$ (respectively, $\widetilde{A}$) be such that $\mu(A) = \mu(\widetilde{A}) = 0$ and $\nu_0 \equiv 0$ on $A^c$ (respectively, $\widetilde{\nu}_0 \equiv 0$ on $\widetilde{A}^c$). Then $\mu(A \cup \widetilde{A}) = 0$ and, on $A^c \cap \widetilde{A}^c$, both $\nu_0$ and $\widetilde{\nu}_0$ are zero. Now, set
\[ \lambda = \nu_0 - \widetilde{\nu}_0 = \widetilde{\nu}_1 - \nu_1. \]
Then $\lambda = \widetilde{\nu}_1 - \nu_1$ is clearly absolutely continuous with respect to $\mu$ and so we have that $\lambda$ vanishes on $A \cup \widetilde{A}$ and on all of its measurable subsets. On the other hand, we have just seen above that $\lambda = \nu_0 - \widetilde{\nu}_0$ vanishes identically on the complement of $A \cup \widetilde{A}$. Thus $\lambda \equiv 0$, which proves the uniqueness of the Lebesgue decomposition.

9.5 Exercises

9.1 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. If $\mu$ is finite, show that, for every $E \in \mathcal{S}$,
\[ |\mu|(E) = \sup\left\{ \int_E f\,d\mu \;\Big|\; f : X \to \mathbb{R} \text{ measurable},\ |f| \le 1 \right\}. \]

9.2 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $\mu_i$, $i = 1, 2$, be two measures defined on $\mathcal{S}$ such that $\mu = \mu_1 - \mu_2$. Show that $\mu^+ \le \mu_1$ and that $\mu^- \le \mu_2$.

9.3 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Let $E \in \mathcal{S}$. Show that $|\mu|(E) = 0$ if, and only if, $\mu(F) = 0$ for every measurable subset $F$ of $E$.

9.4 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Compute $\frac{d\mu^+}{d|\mu|}$ and $\frac{d\mu^-}{d|\mu|}$.

9.5 Let $(X, \mathcal{S})$ be a measurable space. Let $\mu$ and $\nu$ be finite measures defined on $\mathcal{S}$. Let $f = \frac{d\nu}{d(\mu + \nu)}$. Show that, for every $E \in \mathcal{S}$, we have
\[ \nu(E) = \int_E \frac{f}{1 - f}\,d\mu. \]

9.6 Let $\nu$ be any $\sigma$-finite signed measure on the measurable space $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$. Let $\mu$ be the counting measure on this measurable space. Show that $\nu \ll \mu$ and compute $\frac{d\nu}{d\mu}$.

9.7 Let $(X, \mathcal{S})$ be a measurable space. Let $\mu$ and $\nu$ be $\sigma$-finite measures defined on $\mathcal{S}$ such that $\mu \equiv \nu$. Show that
\[ \frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1} \]
almost everywhere (with respect to $\mu$, and hence, with respect to $\nu$ as well).

9.8 Let $(X, \mathcal{S})$ be a measurable space.
Let $\mu$ and $\nu$ be $\sigma$-finite measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. Show that
\[ \nu\left(\left\{ x \in X \;\Big|\; \frac{d\nu}{d\mu}(x) = 0 \right\}\right) = 0. \]

Chapter 10 $L^p$ spaces

10.1 Basic properties

The Lebesgue spaces, also known as the $L^p$ spaces, constitute a rich source of examples and counter-examples in functional analysis. They also form an important class of function spaces when studying the applications of mathematical analysis.

Definition 10.1.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f : X \to \mathbb{R}$ be a measurable function. Let $1 \le p < \infty$. We define
\[ \|f\|_p = \left( \int_X |f|^p\,d\mu \right)^{\frac{1}{p}} \tag{10.1.1} \]
and we say that $f$ is $p$-integrable (integrable, if $p = 1$ and square integrable, if $p = 2$) if $\|f\|_p < +\infty$.

Let $M > 0$. We set
\[ \{|f| > M\} = \{x \in X \mid |f(x)| > M\}. \]
We now define (cf. Definition 3.3.1)
\[ \|f\|_\infty = \inf\{M > 0 \mid \mu(\{|f| > M\}) = 0\}. \tag{10.1.2} \]

Definition 10.1.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f : X \to \mathbb{R}$ be a measurable function. We say that $f$ is essentially bounded if $\|f\|_\infty < +\infty$.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019. S. Kesavan, Measure and Integration, Texts and Readings in Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_10

Definition 10.1.3 Let $1 < p < \infty$. The conjugate exponent of $p$, denoted $p'$, is given by the relation
\[ \frac{1}{p} + \frac{1}{p'} = 1. \]
If $p = 1$ we define its conjugate exponent as $\infty$ and vice-versa.

Lemma 10.1.1 Let $1 < p < \infty$. Let $p'$ be its conjugate exponent. Then, if $a$ and $b$ are non-negative real numbers, we have
\[ a^{\frac{1}{p}} b^{\frac{1}{p'}} \le \frac{a}{p} + \frac{b}{p'}. \tag{10.1.3} \]

Proof: Let $t \ge 1$. Consider the function
\[ f(t) = k(t - 1) - t^k + 1, \]
where $k \in (0,1)$. Then
\[ f'(t) = k(1 - t^{k-1}) \ge 0, \]
since $0 < k < 1$. Thus, $f$ is an increasing function on $[1, +\infty)$ and, since $f(1) = 0$, we deduce that
\[ t^k \le k(t - 1) + 1, \tag{10.1.4} \]
for $t \ge 1$ and for $0 < k < 1$. If $a$ or $b$ is zero, then (10.1.3) is obviously true. Let us assume, without loss of generality (since $p$ and $p'$ are conjugate exponents of each other), that $a \ge b > 0$.
Then (10.1.3) follows from (10.1.4) on setting $t = \frac{a}{b}$, $k = \frac{1}{p}$ and using the relation between $p$ and $p'$.

Proposition 10.1.1 (Hölder's inequality) Let $1 \le p < \infty$ and let $p'$ be the conjugate exponent. If $f$ is $p$-integrable and $g$ is $p'$-integrable (essentially bounded, if $p = 1$), then
\[ \int_X |fg| \, d\mu \le \|f\|_p \|g\|_{p'}. \tag{10.1.5} \]

Proof: If $p = 1$, then $p' = \infty$. Then
\[ |f(x)g(x)| \le |f(x)| \cdot \|g\|_\infty \]
for almost every $x \in X$ and then (10.1.5) follows on integrating this inequality over $X$.

Let us now assume that $1 < p < \infty$ so that $1 < p' < \infty$ as well. The relation (10.1.5) is trivially true if $\|f\|_p$ (respectively, $\|g\|_{p'}$) equals zero, for then $f$ (respectively, $g$) will be equal to zero almost everywhere. So we assume further that $\|f\|_p \ne 0$ and that $\|g\|_{p'} \ne 0$. Then, by Lemma 10.1.1 (applied with $a = |f(x)|^p$ and $b = |g(x)|^{p'}$),
\[ |f(x)g(x)| \le \frac{1}{p} |f(x)|^p + \frac{1}{p'} |g(x)|^{p'} \]
for all $x \in X$. Assume now that $\|f\|_p = \|g\|_{p'} = 1$. Then, integrating the above inequality over $X$, we get
\[ \int_X |fg| \, d\mu \le \frac{1}{p} + \frac{1}{p'} = 1. \]
For the general case, apply the preceding result to the functions $f/\|f\|_p$ and $g/\|g\|_{p'}$ to get (10.1.5).

Remark 10.1.1 When $p = p' = 2$, the inequality (10.1.5) is known as the Cauchy-Schwarz inequality.

Proposition 10.1.2 (Minkowski's inequality) Let $1 \le p \le \infty$. Let $f$ and $g$ be $p$-integrable (essentially bounded, if $p = \infty$). Then $f + g$ is also $p$-integrable (essentially bounded, if $p = \infty$) and
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{10.1.6} \]

Proof: Let $1 < p < \infty$. We assume that $\|f + g\|_p \ne 0$, since, otherwise, the result is trivially true. Since the function $t \mapsto |t|^p$ is convex for $1 \le p < \infty$, we have that
\[ |f(x) + g(x)|^p \le 2^{p-1} \left( |f(x)|^p + |g(x)|^p \right), \]
from which it follows that $f + g$ is also $p$-integrable. Thus, if $1 < p < \infty$, we have
\[ \int_X |f + g|^p \, d\mu \le \int_X |f + g|^{p-1} |f| \, d\mu + \int_X |f + g|^{p-1} |g| \, d\mu. \]
We apply Hölder's inequality to each of the terms on the right-hand side. Notice that $|f(x) + g(x)|^{(p-1)p'} = |f(x) + g(x)|^p$ by the definition of $p'$. Thus $|f + g|^{p-1}$ is $p'$-integrable and
\[ \left\| \, |f + g|^{p-1} \right\|_{p'} = \|f + g\|_p^{\frac{p}{p'}}. \]
Thus,
\[ \|f + g\|_p^p \le \|f + g\|_p^{\frac{p}{p'}} \left( \|f\|_p + \|g\|_p \right). \]
Dividing both sides by $\|f + g\|_p^{\frac{p}{p'}}$ and using, once again, the definition of $p'$, we get (10.1.6).

The cases where $p = 1$ and $p = \infty$ follow trivially from the inequality
\[ |f(x) + g(x)| \le |f(x)| + |g(x)|. \]
This completes the proof.

It is now easy to see that the space of all $p$-integrable functions ($1 \le p < \infty$) and that of all essentially bounded functions are vector spaces and that the map $f \mapsto \|f\|_p$ for $1 \le p \le \infty$ verifies all the properties of the norm, except that $\|f\|_p = 0$ does not imply that $f = 0$, but only that $f = 0$ almost everywhere.

Given two measurable functions $f$ and $g$, we say that $f \sim g$ if $f = g$ almost everywhere. This defines an equivalence relation. If $f \sim g$, then for $1 \le p \le \infty$, we have that $\|f\|_p = \|g\|_p$. Further, the set of all equivalence classes forms a vector space with respect to pointwise addition and scalar multiplication defined via arbitrary representatives of the equivalence classes. In other words, if $f_1 \sim f_2$ and if $g_1 \sim g_2$, then $f_1 + g_1 \sim f_2 + g_2$ and, for any scalar $\alpha$, we also have $\alpha f_1 \sim \alpha f_2$ and so on. Since $\| \cdot \|_p$ is also constant on any equivalence class, we can define the 'norm' of an equivalence class via any representative function of that class. Further, if $\|f\|_p = 0$, then $f$ will belong to the equivalence class of the function which is identically zero. Thus the set of all equivalence classes, with $\| \cdot \|_p$, becomes a normed linear space.

Definition 10.1.4 Let $(X, S, \mu)$ be a measure space. Let $1 \le p < \infty$. The space of all equivalence classes, under the equivalence relation defined by equality of functions almost everywhere, of all $p$-integrable functions is a normed linear space with the norm of an equivalence class being the $\| \cdot \|_p$-'norm' of any representative of that class. This space is denoted $L^p(\mu)$.
The space of all equivalence classes of all essentially bounded functions, with the norm of an equivalence class being defined as the $\| \cdot \|_\infty$-'norm' of any representative of that class, is denoted $L^\infty(\mu)$.

While we may often talk of '$L^p$-functions', we must keep in mind that we are really talking about equivalence classes of functions and that we carry out computations via representatives of those equivalence classes.

Notation If $X = \Omega$, a non-empty open set of $\mathbb{R}^N$, provided with the Lebesgue measure $m_N$, then the corresponding $L^p$ spaces will be denoted $L^p(\Omega)$. In particular, if $\mathbb{R}$ is provided with the Lebesgue measure $m_1$, and if $(a, b)$ is an interval, where $-\infty \le a < b \le +\infty$, then the $L^p$ spaces on $(a, b)$ will be denoted $L^p(a, b)$.

Example 10.1.1 Let $X = \{1, 2, \cdots, N\}$. Let $S$ be the collection of all subsets of $X$ and let $\mu$ be the counting measure. Then a measurable function can be identified with an $N$-tuple $(a_1, a_2, \cdots, a_N)$. In this case $L^p(\mu) = \mathbb{R}^N$ equipped with the norm
\[ \|x\|_p = \left( \sum_{i=1}^N |x_i|^p \right)^{\frac{1}{p}}, \]
if $1 \le p < \infty$, and with the norm
\[ \|x\|_\infty = \max_{1 \le i \le N} |x_i|, \]
if $p = \infty$, where $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$. Notice that in this example, equality almost everywhere is the same as equality everywhere and so every equivalence class is a singleton.

Example 10.1.2 Let $X = \mathbb{N}$, and let $S$ be the collection of all subsets of $X$. Let $\mu$ be the counting measure. In this case, functions are identified with real sequences and $L^p(\mu) = \ell^p$, the space of all real sequences $x = (x_k)$ with $\|x\|_p < +\infty$, where
\[ \|x\|_p = \left( \sum_{k=1}^\infty |x_k|^p \right)^{\frac{1}{p}}, \]
if $1 \le p < \infty$, and
\[ \|x\|_\infty = \sup_{k \in \mathbb{N}} |x_k|, \]
if $p = \infty$. Again, in this example, equivalence classes are just singletons.

Proposition 10.1.3 Let $(X, S, \mu)$ be a finite measure space. Then $L^p(\mu) \subset L^q(\mu)$ with the inclusion being continuous, whenever $1 \le q \le p \le \infty$.

Proof: The result is trivial if $p = \infty$. Let $1 \le q < p < \infty$ and let $f \in L^p(\mu)$.
Then, by Hölder's inequality, we have
\[ \int_X |f|^q \, d\mu \le \left( \int_X (|f|^q)^{\frac{p}{q}} \, d\mu \right)^{\frac{q}{p}} \left( \int_X d\mu \right)^{1 - \frac{q}{p}} = \left( \int_X |f|^p \, d\mu \right)^{\frac{q}{p}} (\mu(X))^{1 - \frac{q}{p}} = \|f\|_p^q \, (\mu(X))^{1 - \frac{q}{p}}, \]
which yields
\[ \|f\|_q \le C \|f\|_p, \]
where
\[ C = (\mu(X))^{\frac{1}{q} - \frac{1}{p}}. \]
This completes the proof.

Example 10.1.3 No such inclusions hold, in general, in infinite measure spaces. For instance, the sequence $\left( \frac{1}{n} \right)$ belongs to $\ell^2$ but not to $\ell^1$.

Example 10.1.4 Nothing can be said about the reverse inclusions. For example, if $f(x) = 1/\sqrt{x}$, then $f \in L^1(0, 1)$ but $f \notin L^2(0, 1)$.

Example 10.1.5 While we cannot say anything, in general, about the reverse inclusions, as mentioned in the previous example, nevertheless, if $1 \le p < q \le \infty$, we do have the continuous inclusion
\[ \ell^p \hookrightarrow \ell^q. \]
In fact, we have that, if $x \in \ell^p$, then $\|x\|_q \le \|x\|_p$.

Let $q = \infty$. If $x \in \ell^p$, where $x = (x_1, \cdots, x_i, \cdots)$, we have
\[ |x_i| \le \|x\|_p \]
for each $i$. Consequently,
\[ \|x\|_\infty = \sup_i |x_i| \le \|x\|_p. \]

Let $1 \le p < q < \infty$. Let $x \in \ell^p$. Assume, for the moment, that $\|x\|_p = 1$. Then, if $x = (x_1, \cdots, x_i, \cdots)$, we have seen that $|x_i| \le 1$ for all $i$. Now
\[ \sum_{i=1}^\infty |x_i|^q = \sum_{i=1}^\infty |x_i|^p |x_i|^{q-p} \le \sum_{i=1}^\infty |x_i|^p = 1. \]
Thus, $x \in \ell^q$ and $\|x\|_q \le 1 = \|x\|_p$. Now, if $x \in \ell^p$ is non-zero, consider $y = \frac{x}{\|x\|_p}$. Then $\|y\|_p = 1$. Consequently, $y \in \ell^q$ and $\|y\|_q \le 1$. Thus, it follows that $x \in \ell^q$ as well (since it is a constant multiple of $y$) and
\[ \frac{\|x\|_q}{\|x\|_p} \le 1, \]
which establishes our claim.

Remark 10.1.2 Let $(X, S, \mu)$ be a finite measure space and let $f \in L^\infty(\mu)$, $f \ne 0$. Then, we have seen that $f \in L^p(\mu)$ for all $1 \le p \le \infty$. Let $1 \le p < \infty$. Then
\[ \int_X |f|^p \, d\mu \le \|f\|_\infty^{p-1} \|f\|_1. \]
Thus,
\[ \|f\|_p \le \|f\|_\infty^{\frac{p-1}{p}} \|f\|_1^{\frac{1}{p}}. \]
Consequently,
\[ \limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty. \]
Now let $0 < \varepsilon < \|f\|_\infty$ and let
\[ E = \{x \in X \mid |f(x)| > \|f\|_\infty - \varepsilon > 0\}. \]
Then $\mu(E) > 0$ and
\[ \int_X |f|^p \, d\mu \ge \int_E |f|^p \, d\mu > (\|f\|_\infty - \varepsilon)^p \mu(E). \]
Then
\[ \|f\|_p \ge (\|f\|_\infty - \varepsilon)(\mu(E))^{\frac{1}{p}}, \]
and, since $(\mu(E))^{\frac{1}{p}} \to 1$ as $p \to \infty$, we get $\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty - \varepsilon$. As $\varepsilon$ was arbitrary, it follows that
\[ \liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty. \]
Thus, we have
\[ \lim_{p \to \infty} \|f\|_p = \|f\|_\infty. \]
This justifies, to some extent, the notation $\|f\|_\infty$ for the norm given by the essential supremum.

A Cauchy sequence $\{f_n\}_{n=1}^\infty$ in $L^p(\mu)$ is a sequence such that, given any $\varepsilon > 0$, there exists $N \in \mathbb{N}$ satisfying
\[ \|f_n - f_m\|_p < \varepsilon \]
whenever $n$ and $m$ are larger than $N$. If $f \in L^p(\mu)$, we say that the sequence converges to $f$ in $L^p(\mu)$ if $\|f_n - f\|_p \to 0$ as $n \to \infty$.

Lemma 10.1.2 Let $1 \le p < \infty$. Let $(X, S, \mu)$ be a measure space. If $\{f_n\}_{n=1}^\infty$ is a Cauchy sequence in $L^p(\mu)$, then the sequence is Cauchy in measure.

Proof: Let $\varepsilon > 0$ be fixed. For positive integers $n$ and $m$, set
\[ A_{n,m}(\varepsilon) = \{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}. \]
Then
\[ \int_X |f_n - f_m|^p \, d\mu \ge \int_{A_{n,m}(\varepsilon)} |f_n - f_m|^p \, d\mu \ge \varepsilon^p \mu(A_{n,m}(\varepsilon)). \]
Thus,
\[ \mu(A_{n,m}(\varepsilon)) \le \frac{\|f_n - f_m\|_p^p}{\varepsilon^p}, \]
and, since the right-hand side of the above inequality can be made arbitrarily small for large $n$ and $m$, the same is true for $\mu(A_{n,m}(\varepsilon))$ as well, and that completes the proof.

Theorem 10.1.1 Let $(X, S, \mu)$ be a measure space. Let $1 \le p \le \infty$. Then $L^p(\mu)$ is a Banach space.

Proof: We need to show that every Cauchy sequence in $L^p(\mu)$ is convergent.

Case 1. Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^\infty$ be a Cauchy sequence in $L^p(\mu)$. Then we saw, in the preceding lemma, that the sequence is Cauchy in measure. Then, by Proposition 4.2.7, there exists a subsequence $\{f_{n_k}\}$ which is almost uniformly Cauchy and, hence, by Proposition 4.1.2, there exists a measurable function $f$ such that $f_{n_k} \to f$ pointwise almost everywhere. Let $\varepsilon > 0$. Let $N \in \mathbb{N}$ be such that
\[ \|f_n - f_m\|_p < \varepsilon \]
for all $n, m \ge N$. We have, by Fatou's lemma,
\[ \int_X |f - f_n|^p \, d\mu \le \liminf_{k \to \infty} \int_X |f_{n_k} - f_n|^p \, d\mu \le \varepsilon^p, \]
for all $n \ge N$. Thus, it follows that $f - f_n \in L^p(\mu)$ and so $f \in L^p(\mu)$ as well. Further, it also shows that $f_n \to f$ in $L^p(\mu)$.

Case 2. Let $p = \infty$. Let $\{f_n\}_{n=1}^\infty$ be Cauchy in $L^\infty(\mu)$. Then, for each positive integer $k$, there exists a positive integer $N_k$ such that
\[ \|f_m - f_n\|_\infty < \frac{1}{k} \]
for all $m, n \ge N_k$.
Thus, there exists a set $E_k$ of measure zero, such that
\[ |f_m(x) - f_n(x)| \le \frac{1}{k} \]
for all $m, n \ge N_k$ and for all $x \in X \backslash E_k$. Setting $E = \cup_{k=1}^\infty E_k$, we see that $E$ is of measure zero and for all $x \in X \backslash E$, the sequence $\{f_n(x)\}$ is a Cauchy sequence in $\mathbb{R}$. Thus, for all such $x$, $f_n(x) \to f(x)$, where $f(x)$ denotes the limit. Passing to the limit as $m \to \infty$, we see that, for all $x \in X \backslash E$, and for all $n \ge N_k$,
\[ |f(x) - f_n(x)| \le \frac{1}{k}. \]
Hence, it follows that $f$ is essentially bounded and that $f_n \to f$ in $L^\infty(\mu)$. This completes the proof.

Corollary 10.1.1 Let $(X, S, \mu)$ be a measure space and let $f_n \to f$ in $L^p(\mu)$ for some $1 \le p \le \infty$. Then, there exists a subsequence $\{f_{n_k}\}$ such that $f_{n_k}(x) \to f(x)$ almost everywhere.

Proof: If $1 \le p < \infty$, we have already proved this in the course of the proof of the preceding theorem. If $p = \infty$, then we have that $f_n \to f$ pointwise almost everywhere.

Remark 10.1.3 An explicit construction of a subsequence, which converges pointwise almost everywhere, of a Cauchy sequence in $L^p(\mu)$, $1 \le p < \infty$, can be found in Kesavan [5] and in Rudin [8]. This has the added advantage that it also shows that the subsequence is bounded above by a fixed function belonging to $L^p(\mu)$.

Just as in Theorem 5.3.5, we can prove the $L^p$-convergence of a sequence which converges pointwise almost everywhere and whose $L^p$-norms converge to the norm of the limit.

Theorem 10.1.2 Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^\infty$ be a sequence in $L^p(\mu)$ converging pointwise almost everywhere to a function $f \in L^p(\mu)$. Then $f_n \to f$ in $L^p(\mu)$ if, and only if, $\|f_n\|_p \to \|f\|_p$ as $n \to \infty$.

Proof: Since the norm defines a continuous function on any normed linear space, it follows that if $f_n \to f$ in $L^p(\mu)$, we have that $\|f_n\|_p \to \|f\|_p$. Conversely, let $f_n \to f$ pointwise almost everywhere and let $\|f_n\|_p \to \|f\|_p$. As in the case of Theorem 5.3.5, we apply the generalised dominated convergence theorem (cf. Theorem 5.3.4). Let $F_n = |f_n - f|^p$, which converges to zero almost everywhere.
Since the function $t \mapsto |t|^p$ is convex, we get
\[ F_n \le 2^{p-1} \left( |f_n|^p + |f|^p \right) = G_n, \text{ say}. \]
Then $G_n$ is integrable and $G_n \to G = 2^p |f|^p$ pointwise almost everywhere. Further, by hypothesis, we get that $\int_X G_n \, d\mu \to \int_X G \, d\mu < +\infty$. Thus, by Theorem 5.3.4, we deduce that $\int_X F_n \, d\mu \to 0$, which is the same as saying that $f_n \to f$ in $L^p(\mu)$.

10.2 Approximation

Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set. Let $S$ denote the set of all real-valued simple functions defined on $\Omega$ which vanish outside a set of finite (Lebesgue) measure. If $1 \le p < \infty$, a simple function $\varphi$ belongs to $L^p(\Omega)$ if, and only if, $\varphi \in S$.

Lemma 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Then $S$ is dense in $L^p(\Omega)$.

Proof: Let $f \in L^p(\Omega)$ be a non-negative function and let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to $f$. Then, clearly $\varphi_n \in L^p(\Omega)$ for each $n$ and so $\varphi_n \in S$ as well. Now,
\[ |\varphi_n - f|^p \le 2^p |f|^p, \]
and since $|f|^p$ is integrable, it follows, from the dominated convergence theorem, that $\varphi_n \to f$ in $L^p(\Omega)$.

If $f \in L^p(\Omega)$ is any generic function, then we can write $f = f^+ - f^-$ and $f^\pm$ are non-negative functions in $L^p(\Omega)$. Then, we can find sequences $\{\varphi_n\}_{n=1}^\infty$ and $\{\psi_n\}_{n=1}^\infty$ in $S$ such that $\varphi_n \to f^+$ and $\psi_n \to f^-$ in $L^p(\Omega)$. Thus $\varphi_n - \psi_n \in S$ for each positive integer $n$ and $\varphi_n - \psi_n \to f$ in $L^p(\Omega)$. This completes the proof.

Lemma 10.2.2 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Let $f \in S$. Then, $f$ can be approximated by step functions in $L^p(\Omega)$.

Proof: Let $E \subset \Omega$ be a measurable set of finite measure. By Proposition 2.2.5, given $\varepsilon > 0$, we can find a set $F \subset \Omega$ which is a finite disjoint union of boxes, such that
\[ m_N(E \Delta F) < \varepsilon^p. \]
Then
\[ \|\chi_E - \chi_F\|_p^p = m_N(E \Delta F) < \varepsilon^p \]
and so $\|\chi_E - \chi_F\|_p < \varepsilon$.

Now, let $f \in S$. Then we can write $f = \sum_{j=1}^k \alpha_j \chi_{E_j}$, where the $\alpha_j$ are all non-zero and the $E_j$ are all mutually disjoint sets of finite measure. For each $1 \le j \le k$, we can find $F_j$, a finite disjoint union of boxes, such that
\[ \|\chi_{E_j} - \chi_{F_j}\|_p < \frac{\varepsilon}{k |\alpha_j|}. \]
Then $\varphi = \sum_{j=1}^k \alpha_j \chi_{F_j}$ is a step function, and, by the triangle inequality, we have $\|f - \varphi\|_p < \varepsilon$. This completes the proof.

Theorem 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Let $C_c(\Omega)$ denote the space of continuous real-valued functions defined on $\Omega$, having compact support contained in $\Omega$. Then, $C_c(\Omega)$ is dense in $L^p(\Omega)$.

Proof: By Lemmas 10.2.1 and 10.2.2, it follows that step functions are dense in $L^p(\Omega)$. So it suffices to show that any step function in $L^p(\Omega)$ can be approximated, in $L^p(\Omega)$, by a continuous function with compact support. Let $f$ be a non-zero step function in $L^p(\Omega)$ and let $\varepsilon > 0$ be given. Then, by Corollary 2.2.1, there exists $\varphi \in C_c(\Omega)$, such that
\[ m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \left( \frac{\varepsilon}{2 \|f\|_\infty} \right)^p \]
and such that $\|\varphi\|_\infty \le \|f\|_\infty$. Then
\[ \|\varphi - f\|_p^p \le 2^p \|f\|_\infty^p \, m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \varepsilon^p \]
so that $\|\varphi - f\|_p < \varepsilon$. This completes the proof.

Remark 10.2.1 The above result is not true when $p = \infty$. In fact, we can show that the closure of $C_c(\Omega)$ under the sup-norm (i.e. the norm $\| \cdot \|_\infty$), is the space of continuous functions which vanish at infinity, i.e. functions such that, given any $\varepsilon > 0$, there exists a compact set $K \subset \Omega$ such that $|f(x)| < \varepsilon$ for all $x \in \Omega \backslash K$.

There are several interesting applications of the fact that $C_c(\Omega)$ is dense in $L^p(\Omega)$ for $1 \le p < \infty$. We will see a few in the next section. To conclude this section, we discuss separability properties of the spaces $L^p(\Omega)$.

Proposition 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Then the space $L^p(\Omega)$ is separable.

Proof: The set $\Omega$ can be expressed as the increasing union of compact sets $K_n$, $n \in \mathbb{N}$. Given $f \in L^p(\Omega)$, we can approximate it by a continuous function with compact support, say $\varphi$, and the support of $\varphi$ will lie in some $K_n$. By the Weierstrass approximation theorem, we can approximate $\varphi$ uniformly in $K_n$ by means of a polynomial and, in fact, by means of a polynomial whose coefficients are rational.
Since $K_n$ is compact, it also means that these polynomials approximate $\varphi$ in the $L^p$-norm over $K_n$. Since the set of all polynomials with rational coefficients is countable, let us number them as $\{p_{m,n}\}_{m=1}^\infty$. Denote by $\tilde{p}_{m,n}$ the extension of $p_{m,n}$ to $\Omega$ by setting it to be equal to zero outside $K_n$. Thus
\[ \cup_{n=1}^\infty \cup_{m=1}^\infty \{\tilde{p}_{m,n}\} \]
is a countable and dense set in $L^p(\Omega)$. This completes the proof.

Proposition 10.2.2 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set. The space $L^\infty(\Omega)$ is not separable.

Proof: Let $x \in \Omega$. Let $r > 0$ be such that $B(x; r) \subset \Omega$, where $B(x; r)$ is the open ball of radius $r$ which is centered at $x$. Let $\varphi_x = \chi_{B(x;r)}$. Set
\[ U_x = \left\{ f \in L^\infty(\Omega) \;\middle|\; \|f - \varphi_x\|_\infty < \frac{1}{2} \right\}. \]
Then $U_x$ is an open set in $L^\infty(\Omega)$. Let $x$ and $y$ be distinct points in $\Omega$. Then, by definition, it follows that $\|\varphi_x - \varphi_y\|_\infty = 1$. Consequently, if $x \ne y$, then $U_x \cap U_y = \emptyset$. Thus $\{U_x\}_{x \in \Omega}$ is an uncountable collection of disjoint open sets in $L^\infty(\Omega)$. Hence, given any countable set $\{f_n\}_{n=1}^\infty$, there exist open sets $U_x$ which do not contain any of the $f_n$. Thus, no countable set in $L^\infty(\Omega)$ can be dense. This completes the proof.

10.3 Some applications

In the previous section, we proved the density of $C_c(\Omega)$, the space of continuous functions with compact support, in $L^p(\Omega)$, where $\Omega$ is an open subset of $\mathbb{R}^N$ and $1 \le p < \infty$. We used it to examine the separability of $L^p(\Omega)$. In this section we will deduce a few more important and interesting consequences of this density theorem.

Theorem 10.3.1 (Lusin's theorem) Let $E \subset \mathbb{R}^N$ be a measurable set of finite measure. Let $f : E \to \mathbb{R}$ be a measurable function. Let $\varepsilon > 0$ be given. Then, there exists $\varphi \in C_c(\mathbb{R}^N)$ such that
\[ m_N(\{x \in E \mid \varphi(x) \ne f(x)\}) < \varepsilon. \]
Further, if $f$ is bounded, we can ensure that $\|\varphi\|_\infty \le \|f\|_\infty$.

Proof: Step 1. For each positive integer $n$, define
\[ E_n = \{x \in E \mid |f(x)| \le n\}. \]
Then $E_n \uparrow E$. Since $E$ has finite measure, we can choose $m$ such that $m_N(E \backslash E_m) < \frac{\varepsilon}{3}$. Now, define $\tilde{f} : \mathbb{R}^N \to \mathbb{R}$ by
\[ \tilde{f}(x) = \begin{cases} f(x), & \text{if } x \in E_m, \\ 0, & \text{if } x \in \mathbb{R}^N \backslash E_m. \end{cases} \]
Since $\tilde{f}$ is bounded and since $E_m$ has finite measure, it follows that $\tilde{f}$ is integrable on $\mathbb{R}^N$. Hence, there exists a sequence $\{\varphi_n\}_{n=1}^\infty$ in $C_c(\mathbb{R}^N)$ such that $\varphi_n \to \tilde{f}$ in $L^1(\mathbb{R}^N)$. Then, there exists a subsequence $\{\varphi_{n_k}\}$ which converges to $\tilde{f}$ pointwise almost everywhere on $\mathbb{R}^N$.

Step 2. Since $E_m$ has finite measure, we can find $F \subset E_m$ such that $m_N(E_m \backslash F) < \frac{\varepsilon}{3}$ and such that $\varphi_{n_k} \to \tilde{f}$ uniformly on $F$, by virtue of Egorov's theorem (cf. Theorem 4.1.1). Again, since $F$ has finite measure, we can find a compact set $K \subset F$ such that $m_N(F \backslash K) < \frac{\varepsilon}{3}$ (cf. Proposition 2.2.3). Clearly, $m_N(E \backslash K) < \varepsilon$.

Step 3. Since $\{\varphi_{n_k}\}$ converges uniformly to $\tilde{f}$ on $K$, it follows that the restriction of $\tilde{f}$ to $K$ is continuous. But $K \subset F \subset E_m$ and so $\tilde{f}(x) = f(x)$ for every $x \in K$. Thus, we deduce that the restriction of $f$ to $K$ is continuous and $|f(x)| \le m$ for $x \in K$.

Step 4. Now, by the Tietze extension theorem (cf., for instance, Simmons [9]) we can find a continuous function $g : \mathbb{R}^N \to \mathbb{R}$ such that $\|g\|_\infty \le m$ and such that $g = f$ on $K$.

Step 5. Finally, let $\psi \in C_c(\mathbb{R}^N)$ be such that $0 \le \psi \le 1$ and such that $\psi \equiv 1$ on $K$ (which exists, by Urysohn's lemma, cf. Simmons [9]). Let $\varphi = \psi g$. Then $\varphi \in C_c(\mathbb{R}^N)$, and
\[ \{x \in E \mid \varphi(x) \ne f(x)\} \subset E \backslash K, \]
which has measure less than $\varepsilon$. Also $\|\varphi\|_\infty \le m$ and, if $f$ is bounded, $m \le \|f\|_\infty$. This completes the proof.

Remark 10.3.1 Usually, in most texts, Lusin's theorem is used to prove the density of continuous functions with compact support in $L^p(\Omega)$. Here we have proved the density result directly and used it to prove Lusin's theorem. This presentation appeared in the expository article A note on some approximation theorems in measure theory, by S. Kesavan and M. T. Nair, in the Mathematics Newsletter, 27, No. 2, 2016, of the Ramanujan Mathematical Society.

Theorem 10.3.2 (Hardy's inequality) Let $1 < p < \infty$. Let $f \in L^p(0, \infty)$. For $0 < x < \infty$, define
\[ F(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1. \tag{10.3.1} \]
Then $F \in L^p(0, \infty)$ and
\[ \|F\|_p \le \frac{p}{p-1} \|f\|_p. \tag{10.3.2} \]

Proof: Step 1. Let $f \in C_c((0, \infty))$ be a non-negative function. Then, if $F$ is defined as in (10.3.1), we have $xF(x) = \int_0^x f(t) \, dt$. Further $(xF(x))' = f(x)$ for all $x > 0$. Since $f$ has compact support in $(0, \infty)$, it follows that $f \equiv 0$ near the origin and so $F \equiv 0$ near the origin. So if we define $F(0) = 0$, then $F$ is continuous on $[0, \infty)$. If the support of $f$ is contained in a finite interval $[a, b]$, where $0 < a < b$, then $f \equiv 0$ on $[b, \infty)$ and so $\int_0^x f(t) \, dt$ is constant for $x \ge b$. Hence $F(x) \to 0$ as $x \to \infty$. Also, since $F$ is continuous on $[0, b]$ and since $x \mapsto x^{-1}$ is in $L^p(b, \infty)$ for $1 < p < \infty$, it follows that $F \in L^p(0, \infty)$ as well and it is a non-negative function.

By our earlier observation, $F'(x)$ exists for $x > 0$ and, for such $x$,
\[ f(x) = F(x) + xF'(x). \tag{10.3.3} \]
Multiplying both sides of this equation by $F^{p-1}(x)$ and integrating, we get
\[ \int_{(0,\infty)} F^{p-1} f \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 + \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x). \tag{10.3.4} \]
Since $f$ and $F$ are non-negative, we deduce from (10.3.4) and the monotone convergence theorem, that
\[ \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x) = \lim_{n \to \infty} \int_{(0,n)} x F^{p-1}(x) F'(x) \, dm_1(x). \]
By virtue of (10.3.3), the integrand $x F^{p-1}(x) F'(x)$ is continuous and so the integral in the extreme right in the above relation is, in fact, a Riemann integral. Now,
\[ \int_0^n x F^{p-1}(x) F'(x) \, dx = \frac{1}{p} \int_0^n x \frac{d}{dx} \left( F^p(x) \right) dx. \]
Now, since $F(x) = \frac{c}{x}$ for $x \ge b$, where $c = \int_0^b f(t) \, dt$, and since $p > 1$, we have that $x F^p(x) \to 0$ as $x \to \infty$. Consequently, by integration by parts, we get, on passing to the limit as $n \to \infty$,
\[ \int_{(0,\infty)} x F^{p-1}(x) F'(x) \, dm_1(x) = -\frac{1}{p} \int_{(0,\infty)} F^p \, dm_1. \]
Using this, and the fact that $F \ge 0$, in (10.3.4), and applying Hölder's inequality, we get
\[ \frac{p-1}{p} \|F\|_p^p = \int_{(0,\infty)} F^{p-1} f \, dm_1 \le \|f\|_p \|F^{p-1}\|_{p'}, \]
where $p'$ is the conjugate exponent of $p$. But
\[ \|F^{p-1}\|_{p'}^{p'} = \int_{(0,\infty)} F^{(p-1)p'} \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 = \|F\|_p^p, \]
by the definition of $p'$.
Thus,
\[ \frac{p-1}{p} \|F\|_p^p \le \|f\|_p \cdot \|F\|_p^{\frac{p}{p'}}, \]
which yields (10.3.2), once again, by using the definition of $p'$. Thus the result is true for non-negative continuous functions with compact support.

Step 2. If $f \in C_c((0, \infty))$ is an arbitrary function, set
\[ T(f)(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1, \]
for $x > 0$. Then, clearly, $|f|$ is a non-negative continuous function with compact support and we have
\[ |T(f)(x)| \le T(|f|)(x) \]
and so $T(f) \in L^p(0, \infty)$ and
\[ \|T(f)\|_p \le \|T(|f|)\|_p \le \frac{p}{p-1} \| \, |f| \, \|_p = \frac{p}{p-1} \|f\|_p. \]
Thus the result is proven for all continuous functions with compact support in $(0, \infty)$.

Step 3. Let $f \in L^p(0, \infty)$. Let $\{f_n\}_{n=1}^\infty$ be a sequence of continuous functions with compact support in $(0, \infty)$ converging to $f$ in $L^p(0, \infty)$. If $F_n = T(f_n)$, then $F_n \in L^p(0, \infty)$ for all $n \in \mathbb{N}$ and
\[ \|F_n - F_m\|_p \le \frac{p}{p-1} \|f_n - f_m\|_p, \]
which implies that $\{F_n\}_{n=1}^\infty$ is a Cauchy sequence in $L^p(0, \infty)$. Let $F_n \to G$ in $L^p(0, \infty)$.

Step 4. Now, for any $x > 0$, $f_n \to f$ in $L^p(0, x)$ and so, since $(0, x)$ has finite measure, $f_n \to f$ in $L^1(0, x)$. Consequently,
\[ \int_{(0,x)} f_n \, dm_1 \to \int_{(0,x)} f \, dm_1, \]
and so $F_n(x) \to F(x)$ for each $x > 0$. But, for a subsequence, $F_{n_k} \to G$ pointwise almost everywhere, and so $F = G$ almost everywhere. Thus, it follows that $F \in L^p(0, \infty)$ and that $F_n \to F$ in $L^p(0, \infty)$. Since
\[ \|F_n\|_p \le \frac{p}{p-1} \|f_n\|_p, \]
for each $n \in \mathbb{N}$, we get (10.3.2) on passing to the limit as $n \to \infty$. This completes the proof.

Example 10.3.1 The preceding theorem is not true when $p = 1$. Consider the function $f(x) = e^{-x}$ which belongs to $L^1(0, \infty)$. Then, by (10.3.1), we get
\[ F(x) = \frac{1 - e^{-x}}{x}. \]
For $x \ge 1$, we have $1 - e^{-x} \ge 1 - e^{-1}$. Thus,
\[ F(x) \ge \frac{1 - e^{-1}}{x}, \quad x \ge 1, \]
and so $F$ is not integrable on $(1, \infty)$ and so it cannot be integrable over $(0, \infty)$ either.

Remark 10.3.2 Hardy's inequality is valid in $\ell^p$ as well, when $1 < p < \infty$. If $x \in \ell^p$, where $x = (x_i)$, then the sequence $y = (y_i)$ defined by
\[ y_n = \frac{x_1 + \cdots + x_n}{n} \]
is also in $\ell^p$ and
\[ \|y\|_p \le \frac{p}{p-1} \|x\|_p. \]
For a proof, see, for instance, Kesavan [5].
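As a quick sanity check of Hardy's inequality (10.3.2), the following worked computation evaluates both sides for one explicit function. The choice $f = \chi_{(0,1)}$ is an assumption made here purely for convenience; it is an illustration and not part of the proof.

```latex
% Worked check of Hardy's inequality (10.3.2), 1 < p < \infty,
% for the illustrative choice f = \chi_{(0,1)}.
\[
F(x) = \frac{1}{x}\int_{(0,x)} \chi_{(0,1)}\, dm_1 =
\begin{cases}
1, & \text{if } 0 < x \le 1,\\[2pt]
\dfrac{1}{x}, & \text{if } x > 1.
\end{cases}
\]
% Norms of f and F:
\[
\|f\|_p = 1, \qquad
\|F\|_p^p = \int_0^1 1\,dx + \int_1^\infty x^{-p}\,dx
          = 1 + \frac{1}{p-1} = \frac{p}{p-1}.
\]
% Since p/(p-1) > 1 and 0 < 1/p < 1, the inequality holds:
\[
\|F\|_p = \Big(\frac{p}{p-1}\Big)^{\frac{1}{p}} \le \frac{p}{p-1}
        = \frac{p}{p-1}\,\|f\|_p .
\]
```

For this $f$, the quantity $\|F\|_p^p = p/(p-1)$ is unbounded as $p$ decreases to $1$, which is consistent with the failure of the inequality at $p = 1$ seen in Example 10.3.1.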
We conclude this section with another useful result.

Proposition 10.3.1 Let $1 \le p < \infty$. Let $f \in L^p(\mathbb{R}^N)$. For $h \in \mathbb{R}^N$, define
\[ \tau_h(f)(x) = f(x - h), \quad x \in \mathbb{R}^N. \]
Then
\[ \lim_{h \to 0} \|\tau_h(f) - f\|_p = 0. \tag{10.3.5} \]

Proof: By the translation invariance of the Lebesgue measure, it is clear that $\tau_h(f) \in L^p(\mathbb{R}^N)$, whenever $f \in L^p(\mathbb{R}^N)$ and also that $\|\tau_h(f)\|_p = \|f\|_p$. Let $\varepsilon > 0$ be given. Choose $\varphi \in C_c(\mathbb{R}^N)$ such that
\[ \|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.6} \]
Then, we also have
\[ \|\tau_h(f) - \tau_h(\varphi)\|_p = \|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.7} \]
Let the support of $\varphi$ be contained in the box $[-a, a]^N$. Since $\varphi$ is uniformly continuous, there exists $0 < \delta < 1$ such that, whenever $|h| < \delta$, we have
\[ |\varphi(x - h) - \varphi(x)| < \frac{\varepsilon}{3} \left( 2(a + 1) \right)^{-\frac{N}{p}}, \]
for all $x \in \mathbb{R}^N$. Then, for $|h| < \delta$,
\[ \int_{\mathbb{R}^N} |\tau_h(\varphi) - \varphi|^p \, dm_N = \int_{[-(a+1),(a+1)]^N} |\varphi(x - h) - \varphi(x)|^p \, dm_N < \left( \frac{\varepsilon}{3} \right)^p, \]
so that
\[ \|\tau_h(\varphi) - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.8} \]
The result now follows on combining the relations (10.3.6)-(10.3.8).

10.4 Duality

When studying normed linear spaces, one of the important objectives is to identify the dual space of a normed linear space, i.e. the space of all continuous linear functionals on that space. In this section, we will try to identify the dual spaces of the spaces $L^p(\mu)$.

Let $(X, S, \mu)$ be a measure space and let $1 \le p \le \infty$. Let $p'$ be the conjugate exponent of $p$. Let $g \in L^{p'}(\mu)$. Define $T_g : L^p(\mu) \to \mathbb{R}$ by
\[ T_g(f) = \int_X f g \, d\mu, \quad f \in L^p(\mu). \]
Then, clearly, $T_g$ is a linear functional defined on $L^p(\mu)$. It is also continuous. Indeed, by Hölder's inequality, we have
\[ |T_g(f)| \le \|f\|_p \|g\|_{p'}, \]
which establishes the continuity of $T_g$. We also have
\[ \|T_g\| \le \|g\|_{p'}. \tag{10.4.1} \]

In this section, we will show that if $(X, S, \mu)$ is a $\sigma$-finite measure space, then every continuous linear functional on $L^p(\mu)$ occurs in this way, if $1 \le p < \infty$, and that we have equality in (10.4.1). Thus, we have an isometric isomorphism (i.e.
an isomorphism which preserves norms) between the dual of $L^p(\mu)$ and the space $L^{p'}(\mu)$ and so we can identify the latter space with the dual of the former, when $1 \le p < \infty$. The result does not hold when $p = \infty$.

Proposition 10.4.1 (Uniqueness) Let $(X, S, \mu)$ be a $\sigma$-finite measure space. Let $1 \le p < \infty$. If $g_i$, $i = 1, 2$, are in $L^{p'}(\mu)$, where $p'$ is the conjugate exponent of $p$, such that $T_{g_1} = T_{g_2}$, then $g_1 = g_2$ almost everywhere.

Proof: If $f \in L^p(\mu)$, we have
\[ \int_X f (g_1 - g_2) \, d\mu = 0. \]
Let $E \subset X$ be a measurable set of finite measure. Then $\chi_E \in L^p(\mu)$ and so we deduce that
\[ \int_E (g_1 - g_2) \, d\mu = 0, \]
for all $E \in S$, $\mu(E) < +\infty$. If $E \in S$, then, by the $\sigma$-finiteness, we can find a collection of disjoint sets $\{E_i\}_{i=1}^\infty$ in $S$ such that $E = \cup_{i=1}^\infty E_i$ and such that $\mu(E_i) < +\infty$ for each $i$. Then, it follows that
\[ \int_E (g_1 - g_2) \, d\mu = 0, \]
for all $E \in S$, from which we easily deduce that $g_1 = g_2$ almost everywhere.

It follows from the above proposition that the mapping $g \mapsto T_g$ from $L^{p'}(\mu)$ into the dual of $L^p(\mu)$ is injective. It is also continuous, by virtue of (10.4.1). We now need to show that it is surjective and that it is an isometry. We will first prove this when the measure space is finite and then deduce the general case of a $\sigma$-finite measure space.

Lemma 10.4.1 Let $(X, S, \mu)$ be a finite measure space and let $g : X \to \mathbb{R}$ be a measurable function such that
\[ \left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le K, \tag{10.4.2} \]
for all $E \in S$ with $\mu(E) > 0$. Then $|g| \le K$ almost everywhere.

Proof: Let $U = \{t \in \mathbb{R} \mid |t| > K\}$. This is an open set in $\mathbb{R}$ and hence can be written as the countable union of open intervals. Let $(a - r, a + r)$ be one such interval. Let
\[ E = \{x \in X \mid g(x) \in (a - r, a + r)\}. \]
If $\mu(E) > 0$, then set
\[ A_E(g) = \frac{1}{\mu(E)} \int_E g \, d\mu. \]
Then
\[ |A_E(g) - a| = \left| \frac{1}{\mu(E)} \int_E (g - a) \, d\mu \right| < r. \]
Thus, $A_E(g) \in (a - r, a + r)$ as well and so $|A_E(g)| > K$, which is a contradiction.
Thus, $\mu(E) = 0$ and since the set $\{x \in X \mid |g(x)| > K\}$ can be covered by a countable number of sets like $E$, we deduce that $|g(x)| \le K$ except on a set of measure zero. This completes the proof.

Theorem 10.4.1 Let $(X, S, \mu)$ be a finite measure space. Let $1 \le p < \infty$. Let $T$ be a continuous linear functional on $L^p(\mu)$. Then, there exists a unique $g \in L^{p'}(\mu)$ such that $T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.

Proof: Step 1. For $E \in S$, define
\[ \lambda(E) = T(\chi_E). \]
This is well-defined, since $\mu(E) < +\infty$ and so $\chi_E \in L^p(\mu)$. If $A$ and $B$ are disjoint measurable sets, then
\[ \chi_{A \cup B} = \chi_A + \chi_B, \]
and so, by the linearity of $T$, we get that $\lambda$ is finitely additive. Let $\{E_i\}_{i=1}^\infty$ be a countable collection of disjoint sets in $S$. Let their union be $E$. Set $F_k = \cup_{i=1}^k E_i$. Then $F_k$ increases to $E$. Since the measure space is finite, we have
\[ \mu(E \backslash F_k) = \sum_{i=k+1}^\infty \mu(E_i) \to 0 \quad \text{as } k \to \infty, \]
since $\mu(E) = \sum_{i=1}^\infty \mu(E_i) < +\infty$. Now,
\[ \|\chi_E - \chi_{F_k}\|_p = \mu(E \backslash F_k)^{\frac{1}{p}}, \]
and so $\chi_{F_k} \to \chi_E$ in $L^p(\mu)$. Then $T(\chi_{F_k}) \to T(\chi_E)$ which shows that
\[ \lambda(E) = \sum_{i=1}^\infty \lambda(E_i). \]
Thus $\lambda$ defines a signed measure on $(X, S)$. Further, if $\mu(E) = 0$, then $\chi_E = 0$ in $L^p(\mu)$ and so $\lambda(E) = 0$ as well. Thus, $\lambda \ll \mu$. Hence, by the Radon-Nikodym theorem, there exists a measurable function $g$ such that, for every $E \in S$, we have
\[ \lambda(E) = \int_E g \, d\mu, \]
or, in other words,
\[ T(\chi_E) = \int_X \chi_E \, g \, d\mu. \]
Since both $\lambda$ and $\mu$ are finite, it also follows that $g \in L^1(\mu)$. Further, if $\varphi$ is any simple function, we get by the linearity of $T$, that
\[ T(\varphi) = \int_X \varphi g \, d\mu. \]

Step 2. Let $f \in L^\infty(\mu)$ be a non-negative function. Since we are in a finite measure space, it follows that $f \in L^p(\mu)$ as well for every $1 \le p < \infty$. Let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to $f$. Then (cf. Lemma 10.2.1) $\varphi_n \to f$ in $L^p(\mu)$ for $1 \le p < \infty$ and so $T(\varphi_n) \to T(f)$. On the other hand, $\varphi_n g \to f g$ pointwise and
\[ |\varphi_n g| \le f |g| \le \|f\|_\infty |g|. \]
Since $g$ is integrable, it follows, from the dominated convergence theorem, that $\int_X \varphi_n g \, d\mu \to \int_X f g \, d\mu$.
Thus, for all non-negative bounded functions $f$, we have
\[ T(f) = \int_X f g \, d\mu. \tag{10.4.3} \]
By splitting any bounded function $f$ into its positive and negative parts, $f^\pm$, we deduce that (10.4.3) is true for any bounded measurable function.

Step 3. Let $p = 1$. Let $E \in S$ with $\mu(E) > 0$. Then
\[ \left| \int_E g \, d\mu \right| = \left| \int_X \chi_E \, g \, d\mu \right| = |T(\chi_E)| \le \|T\| \cdot \|\chi_E\|_1 = \|T\| \mu(E). \]
Thus,
\[ \left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le \|T\|. \]
It then follows, from Lemma 10.4.1, that $|g| \le \|T\|$ almost everywhere. Thus, in this case $g \in L^\infty(\mu)$ and $\|g\|_\infty \le \|T\|$.

Step 4. Let $1 < p < \infty$. Let $\psi$ be a measurable function (taking values $\pm 1$) such that $\psi g = |g|$. Let
\[ E_n = \{x \in X \mid |g(x)| \le n\}, \quad n \in \mathbb{N}. \]
Set $f = \chi_{E_n} |g|^{p'-1} \psi$. Then
\[ |f|^p = \chi_{E_n} |g|^{p'p - p} = \chi_{E_n} |g|^{p'}, \]
by the definition of the conjugate exponent. Further, by the definition of $E_n$, it follows that $f$ is bounded as well. Since $f g = \chi_{E_n} |g|^{p'}$, we get, from Step 2, that
\[ \int_{E_n} |g|^{p'} \, d\mu = \int_X f g \, d\mu = T(f), \]
and so
\[ \int_{E_n} |g|^{p'} \, d\mu \le \|T\| \cdot \|f\|_p = \|T\| \left( \int_{E_n} |g|^{p'} \, d\mu \right)^{\frac{1}{p}}, \]
which yields
\[ \left( \int_{E_n} |g|^{p'} \, d\mu \right)^{\frac{1}{p'}} \le \|T\|. \]
But $E_n \uparrow X$ and so, by the monotone convergence theorem, we have
\[ \left( \int_X |g|^{p'} \, d\mu \right)^{\frac{1}{p'}} \le \|T\|. \]
Thus, $g \in L^{p'}(\mu)$ and $\|g\|_{p'} \le \|T\|$.

Step 5. Let $1 \le p < \infty$. Then, we have seen that $g \in L^{p'}(\mu)$ and that $\|g\|_{p'} \le \|T\|$. Further, since $\mu(X)$ is finite, simple functions are dense in $L^p(\mu)$ (cf. Lemma 10.2.1) and so $L^\infty(\mu)$ is also dense in $L^p(\mu)$. Both sides of (10.4.3) define continuous linear functionals on $L^p(\mu)$ and agree on the dense subspace $L^\infty(\mu)$ and so, they agree on all of $L^p(\mu)$. Thus, we get that, in fact, $T = T_g$, in which case, we have that
\[ \|T\| = \|T_g\| \le \|g\|_{p'} \le \|T\|. \]
Thus $T = T_g$ and $\|T\| = \|T_g\| = \|g\|_{p'}$. This completes the proof.

Let $(X, S, \mu)$ be a $\sigma$-finite measure space. Let $X = \cup_{n=1}^\infty X_n$, where the $X_n$ are all disjoint and $0 < \mu(X_n) < +\infty$ for each $n \in \mathbb{N}$. Define
\[ h(x) = \sum_{n=1}^\infty \frac{1}{n^2 \mu(X_n)} \chi_{X_n}(x). \]
Then $h \in L^1(\mu)$ and $h > 0$. Define, for $E \in S$,
\[ \nu(E) = \int_E h \, d\mu. \]
Then $(X, S, \nu)$ is a finite measure space and $\nu \ll \mu$.

Let $1 \le p < \infty$. Then $f \in L^p(\nu)$ if, and only if, $h^{\frac{1}{p}} f \in L^p(\mu)$ and
\[ \|f\|_{L^p(\nu)} = \|h^{\frac{1}{p}} f\|_{L^p(\mu)}. \]
This defines an isometric isomorphism between the spaces $L^p(\nu)$ and $L^p(\mu)$, when $1 \le p < \infty$. Since $\nu \ll \mu$ (and, because $h > 0$, the two measures have the same sets of measure zero), we have that $L^\infty(\nu) = L^\infty(\mu)$ and that
\[ \|f\|_{L^\infty(\nu)} = \|f\|_{L^\infty(\mu)}. \]

Now, let $T$ be a continuous linear functional defined on $L^p(\mu)$, for $1 \le p < \infty$. Define $S$, a linear functional on $L^p(\nu)$, by
\[ S(f) = T(h^{\frac{1}{p}} f), \quad f \in L^p(\nu). \]
Thus,
\[ |S(f)| = |T(h^{\frac{1}{p}} f)| \le \|T\| \cdot \|h^{\frac{1}{p}} f\|_{L^p(\mu)} = \|T\| \cdot \|f\|_{L^p(\nu)}, \]
\[ |T(f)| = |T(h^{\frac{1}{p}} h^{-\frac{1}{p}} f)| = |S(h^{-\frac{1}{p}} f)| \le \|S\| \cdot \|h^{-\frac{1}{p}} f\|_{L^p(\nu)} = \|S\| \cdot \|f\|_{L^p(\mu)}, \]
where $f \in L^p(\nu)$ in the first line above and $f \in L^p(\mu)$ in the second. This shows that $S$ defines a continuous linear functional on $L^p(\nu)$ and that $\|S\| = \|T\|$.

Theorem 10.4.2 (Riesz representation theorem) Let $(X, S, \mu)$ be a $\sigma$-finite measure space. Let $1 \le p < \infty$. Let $T$ be a continuous linear functional on $L^p(\mu)$. Then, there exists a unique $g \in L^{p'}(\mu)$ such that $T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.

Proof: We write $X = \cup_{n=1}^\infty X_n$ as a disjoint union of sets of finite measure. We adopt the notation and definitions made in the preceding paragraphs and define the function $h$ and the measure $\nu$ as before. Given a continuous linear functional $T$ on $L^p(\mu)$, we define, as before, the linear functional $S$ on $L^p(\nu)$. Then, by the preceding theorem, there exists $\tilde{g}$ in $L^{p'}(\nu)$ such that
\[ S(f) = \int_X f \tilde{g} \, d\nu, \quad \text{for all } f \in L^p(\nu), \]
and such that
\[ \|S\| = \|\tilde{g}\|_{L^{p'}(\nu)}. \]
Define $g = h^{\frac{1}{p'}} \tilde{g}$ if $1 < p < \infty$ and $g = \tilde{g}$, if $p = 1$.

Case 1. Let $1 < p < \infty$. Then
\[ \|g\|_{L^{p'}(\mu)}^{p'} = \int_X |g|^{p'} \, d\mu = \int_X |\tilde{g}|^{p'} h \, d\mu = \int_X |\tilde{g}|^{p'} \, d\nu = \|S\|^{p'} = \|T\|^{p'}. \]
Further, if $f \in L^p(\mu)$, we have
\[ \int_X f g \, d\mu = \int_X f \tilde{g} h^{\frac{1}{p'}} \, d\mu = \int_X f \tilde{g} h^{1 - \frac{1}{p}} \, d\mu = \int_X h^{-\frac{1}{p}} f \tilde{g} \, d\nu = S(h^{-\frac{1}{p}} f) = T(f). \]
This proves the result when $1 < p < \infty$.

Case 2. Let $p = 1$. Then
\[ \|g\|_{L^\infty(\mu)} = \|\tilde{g}\|_{L^\infty(\nu)} = \|S\| = \|T\|. \]
If $f \in L^1(\mu)$, then
\[ \int_X f g \, d\mu = \int_X f \tilde{g} h^{-1} \, d\nu = S(h^{-1} f) = T(f). \]
This proves the result for $p = 1$ and the proof of the theorem is complete.
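To see the representation at work in the simplest possible setting, consider the finite counting-measure space of Example 10.1.1. The sketch below is only an illustration under that assumption; the standard basis vectors $e_i$ and the test vector $f$ are notation introduced here and do not appear in the text above.

```latex
% Setting (assumed for illustration): X = {1,...,N} with the counting measure,
% 1 < p < \infty, so that L^p(\mu) = (R^N, \|.\|_p).
% Given a linear functional T on R^N, put g_i = T(e_i), i = 1,...,N.
\[
T(x) = \sum_{i=1}^N x_i\, T(e_i) = \sum_{i=1}^N x_i\, g_i
     = \int_X x\, g \, d\mu = T_g(x).
\]
% Holder's inequality gives \|T\| \le \|g\|_{p'}, as in (10.4.1).
% For the reverse inequality (with g \ne 0), test with
% f_i = \operatorname{sgn}(g_i)\,|g_i|^{p'-1}, mirroring Step 4 of Theorem 10.4.1:
\[
T(f) = \sum_{i=1}^N |g_i|^{p'} = \|g\|_{p'}^{p'}, \qquad
\|f\|_p = \Big(\sum_{i=1}^N |g_i|^{(p'-1)p}\Big)^{\frac{1}{p}}
        = \|g\|_{p'}^{\frac{p'}{p}},
\]
% using (p'-1)p = p'. Since p' - p'/p = 1, the quotient is
\[
\frac{T(f)}{\|f\|_p} = \|g\|_{p'}^{\,p' - \frac{p'}{p}} = \|g\|_{p'},
\qquad \text{so that } \|T\| = \|g\|_{p'} .
\]
```

In this finite-dimensional case no Radon-Nikodym argument is needed, since $g$ can be read off from the values of $T$ on the basis vectors; the computation nevertheless shows why the particular test function used in Step 4 of Theorem 10.4.1 produces equality in (10.4.1).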
Example 10.4.1 The above result is not true when $p = \infty$. For example, consider the interval $(0, 1)$. Then $C[0, 1]$ is a subspace of $L^\infty(0, 1)$. If $f \in C[0, 1]$, define $T(f) = f(0)$. This defines a linear functional on $C[0, 1]$ and
\[ |T(f)| = |f(0)| \le \|f\|_\infty. \]
Thus $T$ is a continuous linear functional on $C[0, 1]$ equipped with the norm from $L^\infty(0, 1)$. By the Hahn-Banach theorem, we can extend it to a continuous linear functional on $L^\infty(0, 1)$. We claim that this functional cannot be represented in the form $T_g$ where $g \in L^1(0, 1)$. (Recall that the conjugate exponent of $p = \infty$ is $p' = 1$.) Assume the contrary. Now consider the sequence of functions $\{f_n\}_{n=1}^\infty$ in $C[0, 1]$ given by
\[ f_n(t) = \begin{cases} 1 - nt, & \text{if } 0 \le t \le \frac{1}{n}, \\ 0, & \text{if } \frac{1}{n} \le t \le 1. \end{cases} \]
Then $T(f_n) = f_n(0) = 1$ for all $n \in \mathbb{N}$. On the other hand, if $T = T_g$, where $g \in L^1(0, 1)$, then
\[ T(f_n) = \int_{(0,1)} f_n g \, dm_1. \]
But $|f_n g| \le |g|$ for all $n$ and $g$ is integrable, while $(f_n g)(t) \to 0$ for all $0 < t \le 1$. Thus, by the dominated convergence theorem, we have that $T(f_n) \to 0$, which is a contradiction.

Remark 10.4.1 The spaces $L^p(\mu)$ are reflexive (cf., for instance, Kesavan [5]) if $1 < p < \infty$ since we have the canonical identifications
\[ (L^p(\mu))' = L^{p'}(\mu) \quad \text{and} \quad (L^{p'}(\mu))' = L^p(\mu). \]
The spaces $L^1(\mu)$ and $L^\infty(\mu)$ are not reflexive. We have that $(L^1(\mu))' = L^\infty(\mu)$ but the reverse identity fails.

Remark 10.4.2 We can prove an inequality called Clarkson's inequality when $2 \le p < \infty$ (see the exercises at the end of this chapter) which will show that the spaces $L^p(\mu)$ are uniformly convex (cf. Kesavan [5]), which is a geometric property of the norm. Then, it follows, from a theorem in functional analysis, that the spaces $L^p(\mu)$ are reflexive when $2 \le p < \infty$. One can then easily prove that the spaces $L^p(\mu)$, when $1 < p < 2$, are also reflexive, using functional analytic arguments. Then it is very easy to show that the dual space of $L^p(\mu)$ is $L^{p'}(\mu)$ for $1 < p < \infty$.
The advantage of this proof is that it does not need the measure space to be $\sigma$-finite. See Kesavan [5] for details.

10.5 Convolutions

In this section, we will study some properties of a very important tool in analysis called the convolution product. We will assume throughout that $\mathbb{R}^N$ is equipped with the Lebesgue measure $m_N$.

Definition 10.5.1 Let $f$ and $g$ be integrable functions defined on $\mathbb{R}^N$. The convolution or convolution product of $f$ and $g$, denoted $f * g$, is given by
\[ (f * g)(x) = \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y), \quad \text{for } x \in \mathbb{R}^N. \tag{10.5.1} \]

We have already encountered this earlier (cf. Example 8.3.8): we have seen that the convolution is well-defined and that it is a commutative and associative binary operation on integrable functions (cf. Exercise 8.4). We also saw that $f * g \in L^1(\mathbb{R}^N)$ and that
\[ \|f * g\|_1 \le \|f\|_1 \|g\|_1. \tag{10.5.2} \]
We will now try to extend the definition of the convolution product to other classes of functions as well.

Theorem 10.5.1 Let $1 < p < \infty$. Let $f \in L^1(\mathbb{R}^N)$ and let $g \in L^p(\mathbb{R}^N)$. Then $f * g$ is well-defined. Further, $f * g \in L^p(\mathbb{R}^N)$ and
\[ \|f * g\|_p \le \|f\|_1 \|g\|_p. \tag{10.5.3} \]

Proof: Let $p'$ be the conjugate exponent of $p$. Let $h \in L^{p'}(\mathbb{R}^N)$. Then $(x, y) \mapsto f(x - y) g(y) h(x)$ is measurable and, using Hölder's inequality and the translation invariance of the Lebesgue measure, we have
\[ \begin{aligned} \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x - y) g(y) h(x)| \, dm_N(x) \, dm_N(y) &= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(x - y) g(y)| \, dm_N(y) \, dm_N(x) \\ &= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(w) g(x - w)| \, dm_N(w) \, dm_N(x) \\ &= \int_{\mathbb{R}^N} |f(w)| \int_{\mathbb{R}^N} |h(x)| |g(x - w)| \, dm_N(x) \, dm_N(w) \\ &\le \|h\|_{p'} \|g\|_p \|f\|_1 < +\infty. \end{aligned} \]
Thus, by Fubini's theorem, the integral
\[ \int_{\mathbb{R}^N} h(x) f(x - y) g(y) \, dm_N(y) \]
exists for almost all $x$. We can choose $h(x) \ne 0$ for all $x$ (for instance, $h(x) = \exp(-|x|^2)$, which belongs to all $L^p$ spaces) and so we deduce that $f * g$ defined via (10.5.1) is well-defined.
Further, by the preceding computation, it follows that
\[ h \mapsto \int_{\mathbb{R}^N} (f * g) h \, dm_N \]
is a continuous linear functional on $L^{p'}(\mathbb{R}^N)$ whose norm is bounded by the quantity $\|g\|_p \|f\|_1$, which shows, by the Riesz representation theorem, that $f * g \in L^p(\mathbb{R}^N)$ and that (10.5.3) holds.

Remark 10.5.1 Notice that (10.5.2) is the same as (10.5.3) for the case $p = 1$. The relation (10.5.3) is a particular case of Young's inequality. Let $1 \le p, q, r < \infty$ be such that
\[ \frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}. \]
If $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$, then $f * g$ is well-defined via (10.5.1), $f * g \in L^r(\mathbb{R}^N)$ and
\[ \|f * g\|_r \le \|f\|_p \|g\|_q. \]

If $f$ and $g$ are continuous real valued functions on $\mathbb{R}^N$ and if at least one of them has compact support, then the integral in (10.5.1) makes sense and the convolution $f * g$ is well-defined. More generally, if $f_1, \cdots, f_n$ are continuous real valued functions on $\mathbb{R}^N$ such that all but at most one of them have compact support, then we can define the convolution product $f_1 * \cdots * f_n$ by taking the products two at a time. For instance we can have
\[ f_1 * ((f_2 * f_3) * \cdots * (f_{n-1} * f_n)). \]
The actual pairing and order will be unimportant since we have commutativity and associativity. The convolution product of functions occurring within any pair of parentheses is well-defined since at least one of them will have compact support, as shown by the following result.

Theorem 10.5.2 Let $f$ and $g$ be continuous real valued functions on $\mathbb{R}^N$ and let one of them have compact support so that $f * g$ is well-defined. Then
\[ \operatorname{supp}(f * g) \subset \operatorname{supp}(f) + \operatorname{supp}(g) \]
where $\operatorname{supp}(\varphi)$ denotes the support of a continuous function $\varphi : \mathbb{R}^N \to \mathbb{R}$ and, for subsets $A$ and $B$ of $\mathbb{R}^N$, we define
\[ A + B = \{x + y \mid x \in A, \ y \in B\}. \]
In particular, if both $f$ and $g$ have compact support, then $f * g$ has compact support.

Proof: Let $A = \operatorname{supp}(f)$ and $B = \operatorname{supp}(g)$ and, without loss of generality, assume that $B$ is compact. Then $A + B$ is closed. To see this, let $x_n + y_n \in A + B$ such that $x_n + y_n \to z$ in $\mathbb{R}^N$.
Since $B$ is compact, for a subsequence, we have $y_{n_k} \to y \in B$. Then $x_{n_k} \to z - y = x$ which will belong to $A$, since $A$ is closed. Thus, $z = x + y \in A + B$, which establishes our claim.

We clearly need to consider the integral in (10.5.1) only on the set $B = \operatorname{supp}(g)$. In order that $(f * g)(x) \ne 0$, it is necessary that $x - y \in A = \operatorname{supp}(f)$ for $y$ varying over a subset of $B$ with positive measure. In particular, it follows that $x \in \operatorname{supp}(f) + \operatorname{supp}(g)$ and the result follows since this set is closed. If both functions have compact supports, then $\operatorname{supp}(f) + \operatorname{supp}(g)$ is also compact (why?) and so $f * g$ has compact support.

If $f$ is continuous with compact support and if $g$ is integrable on $\mathbb{R}^N$, then also it is easy to see that $f * g$ is well-defined. One of the important properties of the convolution product is that it has a smoothing effect on functions. More precisely, we have the following result.

Theorem 10.5.3 Let $f$ be a continuous real valued function on $\mathbb{R}^N$ with compact support and let $g$ be integrable. Then $f * g$ is continuous. If $f$ is $C^\infty$, then so is $f * g$.

Proof: We will show that $f * g$ is continuous and that, for any $1 \le i \le N$,
\[ \frac{\partial}{\partial x_i}(f * g) = \frac{\partial f}{\partial x_i} * g \tag{10.5.4} \]
if $f$ is differentiable. Iterating this, we can complete the proof.

(i) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}^N$ be such that $|h| \le 1$. Then
\[ |(f * g)(x + h) - (f * g)(x)| \le \int_{\mathbb{R}^N} |f(x + h - y) - f(x - y)| \, |g(y)| \, dm_N(y). \]
The above integral needs to be taken only over a compact set $K(x)$ containing the supports of the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h - y)$. For example, we can take $K(x) = x + B(0; R) + B(0; 1)$ where $B(0; r)$ denotes the closed ball in $\mathbb{R}^N$ with centre at the origin and radius $r$, and $R > 0$ is such that $\operatorname{supp}(f) \subset B(0; R)$. Since $f$ has compact support, it is uniformly continuous and so, given $\varepsilon > 0$, there exists $\eta > 0$ such that $|f(u) - f(v)| < \varepsilon$ whenever $|u - v| < \eta$.
Thus, if $|h| < \eta$ (we can assume that $0 < \eta < 1$), we get
\[ |(f * g)(x + h) - (f * g)(x)| \le \varepsilon \int_{K(x)} |g| \, dm_N \]
which proves the continuity of $f * g$ at any arbitrary point $x \in \mathbb{R}^N$.

(ii) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}$, $|h| \le 1$. Let $e_i$ be the $i$-th standard basis vector of $\mathbb{R}^N$, i.e. the vector with 1 in the $i$-th coordinate and zero elsewhere. Then
\[ \frac{\partial}{\partial x_i}(f * g)(x) = \lim_{h \to 0} \frac{(f * g)(x + h e_i) - (f * g)(x)}{h} \]
if the limit exists. If $K(x)$ is a compact set containing the supports of the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h e_i - y)$, then
\[ \begin{aligned} \frac{(f * g)(x + h e_i) - (f * g)(x)}{h} &= \frac{1}{h} \int_{K(x)} (f(x + h e_i - y) - f(x - y)) g(y) \, dy \\ &= \int_{K(x)} \frac{\partial f}{\partial x_i}(x - y + \theta h e_i) g(y) \, dy \end{aligned} \]
where $\theta \in (0, 1)$ (and depends on $x$, $y$ and $h$). Since $\frac{\partial f}{\partial x_i}$ is assumed to be continuous, it is bounded on the compact set $K(x)$ and so the integrand in the last integral above is bounded by $M |g(y)|$, which is integrable on the compact set $K(x)$. Further, as $h \to 0$, the integrand converges to $\frac{\partial f}{\partial x_i}(x - y) g(y)$. Thus, by the dominated convergence theorem, we deduce the validity of (10.5.4).

Notation Let $N$ be a fixed positive integer. A multi-index $\alpha$ (of size $N$) is an $N$-tuple of non-negative integers. If $\alpha = (\alpha_1, \cdots, \alpha_N)$ is a multi-index, we denote by $|\alpha|$ the sum of its components, i.e.
\[ |\alpha| = \alpha_1 + \cdots + \alpha_N. \]
If $f$ is a sufficiently smooth real-valued function defined on $\mathbb{R}^N$ (or an open set thereof), we define
\[ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}. \]
For example, if $N = 3$ and if $\alpha = (3, 0, 1)$, then
\[ D^\alpha f = \frac{\partial^4 f}{\partial x_1^3 \, \partial x_3}. \]

Remark 10.5.2 It is easy to write down a similar proof when $f$ is $C^\infty$ and $g$ is continuous with compact support. More generally, if $f$ is $C^\infty$ and if one of the two functions has compact support, we have
\[ D^\alpha (f * g)(x) = ((D^\alpha f) * g)(x) \]
for any multi-index $\alpha$. It is also easy to see that if $f$ is only $C^k$, then the above relation is valid for all multi-indices $\alpha$ such that $|\alpha| \le k$.
If $g$ also has differentiability properties, then any derivative of $f * g$ could be got by taking the convolution product of appropriate derivatives of $f$ and $g$.

Thus, the convolution of a smooth function with compact support with any integrable function produces a smooth function. This fact, used together with the mollifiers defined below, provides us with a powerful technique to prove a variety of density and approximation theorems.

Lemma 10.5.1 Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} \exp(-x^{-2}), & \text{if } x > 0, \\ 0, & \text{if } x \le 0. \end{cases} \]
Then $f \in C^\infty(\mathbb{R})$.

Proof: We only need to check the smoothness at $x = 0$. As $x \uparrow 0$, the function and all the derivatives are zero. As $x \downarrow 0$, the derivatives are all finite linear combinations of terms of the form $x^{-k} \exp(-x^{-2})$, where $k$ is a non-negative integer. Consider the function $g(t) = t^k e^{-t}$. Then $g'(t) = t^{k-1} e^{-t} (k - t)$ which is non-positive for $t \ge k$. Thus, for all such $t$, we have that $g(t) \le g(k)$. Now
\[ x^{-k} e^{-x^{-2}} = x^k \left( \frac{1}{x^2} \right)^k e^{-\frac{1}{x^2}} \le x^k k^k e^{-k} \]
for $\frac{1}{x^2} \ge k$, i.e. $x \le \frac{1}{\sqrt{k}}$. It then follows that $x^{-k} e^{-x^{-2}} \to 0$ as $x \downarrow 0$. This completes the proof.

We can use the above lemma to construct examples of $C^\infty$ functions with compact support.

Example 10.5.1 Consider the function
\[ \rho(x) = \begin{cases} \exp(-a^2 / (a^2 - x^2)), & \text{if } |x| < a, \\ 0, & \text{if } |x| \ge a. \end{cases} \]
A simple application of the preceding lemma shows that $\rho$ is a $C^\infty$ function and that its support is the interval $[-a, a]$.

Example 10.5.2 This is a slight, but very useful, variation of the preceding example. Let $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$. Let
\[ |x| = \left( \sum_{i=1}^N |x_i|^2 \right)^{\frac{1}{2}}. \tag{10.5.5} \]
Given $\varepsilon > 0$, define
\[ \rho_\varepsilon(x) = \begin{cases} \kappa \varepsilon^{-N} \exp(-\varepsilon^2 / (\varepsilon^2 - |x|^2)), & \text{if } |x| < \varepsilon, \\ 0, & \text{if } |x| \ge \varepsilon, \end{cases} \]
where
\[ \kappa^{-1} = \int_{|x| \le 1} \exp(-1 / (1 - |x|^2)) \, dx. \tag{10.5.6} \]
It follows then that $\rho_\varepsilon$ is a $C^\infty$ function with support in the closed ball $B(0; \varepsilon)$, centered at the origin and of radius $\varepsilon$.
Further, $\rho_\varepsilon \ge 0$ and, by the change of variable $y = \frac{x}{\varepsilon}$, we see that
\[ \int_{\mathbb{R}^N} \rho_\varepsilon \, dm_N = \frac{\kappa}{\varepsilon^N} \int_{|x| \le \varepsilon} \exp(-\varepsilon^2 / (\varepsilon^2 - |x|^2)) \, dm_N(x) = \kappa \int_{|y| \le 1} \exp(-1 / (1 - |y|^2)) \, dm_N(y) = 1. \]
Thus, as $\varepsilon \to 0$, the functions $\rho_\varepsilon$ have decreasing supports, but preserve the volume contained under the graph and so will be concentrated near the origin.

Definition 10.5.2 The family of functions $\{\rho_\varepsilon\}_{\varepsilon > 0}$ is called the family of mollifiers.

Theorem 10.5.4 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers.
(i) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous, then $\rho_\varepsilon * f \to f$ pointwise, as $\varepsilon \to 0$.
(ii) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous with compact support, then $\rho_\varepsilon * f \to f$ uniformly, as $\varepsilon \to 0$.

Proof: (i) Let $x \in \mathbb{R}^N$. Then, given $\eta > 0$, there exists $\delta > 0$ such that for all $|y| < \delta$, we have
\[ |f(x - y) - f(x)| < \eta. \]
Thus, if $\varepsilon < \delta$, we have, on observing that the integral of $\rho_\varepsilon$ is unity and that this function is supported on $B(0; \varepsilon)$,
\[ (\rho_\varepsilon * f)(x) - f(x) = \int_{|y| \le \varepsilon} (f(x - y) - f(x)) \rho_\varepsilon(y) \, dm_N(y) \]
which yields (since $\rho_\varepsilon \ge 0$)
\[ |(\rho_\varepsilon * f)(x) - f(x)| \le \int_{|y| \le \varepsilon} |f(x - y) - f(x)| \rho_\varepsilon(y) \, dm_N(y) < \eta \int_{|y| \le \varepsilon} \rho_\varepsilon(y) \, dm_N(y) = \eta. \]
This proves the first statement.

(ii) If $\operatorname{supp}(f) = K$ which is compact, then
\[ \operatorname{supp}(\rho_\varepsilon * f) \subset K + B(0; \varepsilon) \]
which is compact and is contained within a fixed compact set, say, $K + B(0; 1)$, if we restrict $\varepsilon$ to be less than or equal to unity. Since $f$ has compact support, it is uniformly continuous and the $\delta$ corresponding to $\eta$ in the previous step is now independent of the point $x$, and so the pointwise convergence is now uniform.

Corollary 10.5.1 Let $f$ be a continuous real-valued function on $\mathbb{R}^N$ with compact support. Then $\rho_\varepsilon * f \to f$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$ for all $1 \le p \le \infty$.

Proof: The case $p = \infty$ is already covered in the previous theorem. If $1 \le p < \infty$, then let $K$ be the compact set containing the support of $f$ and all the functions $\rho_\varepsilon * f$. Then, on this set we have uniform convergence, which automatically implies convergence in $L^p(\mathbb{R}^N)$.
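The behaviour described in Example 10.5.2 and Theorem 10.5.4 (ii) can be observed numerically. The following Python sketch treats the one-dimensional case $N = 1$; the midpoint quadrature rule, the test function `hat`, the grid and the chosen values of $\varepsilon$ are assumptions made purely for illustration. It approximates $\kappa$, checks that $\rho_\varepsilon$ has unit integral, and estimates $\sup_x |(\rho_\varepsilon * f)(x) - f(x)|$ for a continuous function with compact support.

```python
import math

def bump(t):
    """exp(-1/(1 - t^2)) for |t| < 1 and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Numerical approximation of kappa in (10.5.6) for N = 1.
KAPPA = 1.0 / midpoint(bump, -1.0, 1.0)

def rho(eps, x):
    """rho_eps(x) = kappa * eps^(-1) * exp(-eps^2 / (eps^2 - x^2)) for |x| < eps."""
    return KAPPA * bump(x / eps) / eps

def hat(x):
    """A continuous test function with compact support [-1, 1] (an assumption)."""
    return max(0.0, 1.0 - abs(x))

def mollify(f, eps, x, n=400):
    """(rho_eps * f)(x), written as the integral of f(x - y) rho_eps(y) over |y| <= eps."""
    return midpoint(lambda y: f(x - y) * rho(eps, y), -eps, eps, n)

def sup_error(f, eps, points):
    """Largest value of |(rho_eps * f)(x) - f(x)| over the sample points."""
    return max(abs(mollify(f, eps, x) - f(x)) for x in points)

grid = [-1.5 + 0.05 * k for k in range(61)]
errors = [sup_error(hat, eps, grid) for eps in (0.4, 0.2, 0.1)]
mass = midpoint(lambda x: rho(0.1, x), -0.1, 0.1)

print(mass, errors)
```

The sampled sup-norm errors should shrink as $\varepsilon$ decreases. For this particular `hat` the largest error occurs at the corners of the graph, where it is proportional to $\varepsilon$.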
Remark 10.5.3 Notice that in the above case, $\rho_\varepsilon * f$ is a $C^\infty$ function with compact support in $\mathbb{R}^N$.

Theorem 10.5.5 Let $1 \le p < \infty$. Then, the space of $C^\infty$ functions with compact support in $\mathbb{R}^N$ is dense in $L^p(\mathbb{R}^N)$.

Proof: By Corollary 10.5.1 and Remark 10.5.3 above, continuous functions with compact support can be approximated in $L^p(\mathbb{R}^N)$ by $C^\infty$ functions with compact support in $\mathbb{R}^N$. This completes the proof, since continuous functions with compact support are dense in $L^p(\mathbb{R}^N)$.

Corollary 10.5.2 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers. If $f \in L^p(\mathbb{R}^N)$, then $\rho_\varepsilon * f \to f$ as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$, for $1 \le p < \infty$.

Proof: Given $f \in L^p(\mathbb{R}^N)$, we can find, for every $\eta > 0$, a continuous function $g$ with compact support such that
\[ \|f - g\|_p < \frac{\eta}{3}. \]
Then, for $\varepsilon$ sufficiently small, we have, by Corollary 10.5.1, that
\[ \|\rho_\varepsilon * g - g\|_p < \frac{\eta}{3}. \]
Then
\[ \|\rho_\varepsilon * f - f\|_p \le \|f - g\|_p + \|g - \rho_\varepsilon * g\|_p + \|\rho_\varepsilon * (g - f)\|_p. \]
But, by (10.5.2) when $p = 1$ and by (10.5.3) when $1 < p < \infty$,
\[ \|\rho_\varepsilon * (g - f)\|_p \le \|\rho_\varepsilon\|_1 \|g - f\|_p < \frac{\eta}{3} \]
since the integral of $\rho_\varepsilon$ is unity, and the result now follows immediately.

Theorem 10.5.6 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $1 \le p < \infty$. Then the space of $C^\infty$ functions with compact support in $\Omega$ is dense in $L^p(\Omega)$.

Proof: We know that continuous functions with compact support in $\Omega$ are dense in $L^p(\Omega)$ for $1 \le p < \infty$. Thus, given $\eta > 0$ and $f \in L^p(\Omega)$, there exists $g$, a continuous function with compact support in $\Omega$, such that $\|f - g\|_p < \eta / 2$. Now, let $\tilde{g}$ be the extension of $g$ by zero outside $\Omega$. Then $\rho_\varepsilon * \tilde{g}$ is a $C^\infty$ function and, since its support is compact and is contained in
\[ B(0; \varepsilon) + \operatorname{supp}(\tilde{g}) = B(0; \varepsilon) + \operatorname{supp}(g) \subset \Omega \]
for $\varepsilon$ sufficiently small, we have that $(\rho_\varepsilon * \tilde{g})|_\Omega$ is a $C^\infty$ function with compact support in $\Omega$. But $\rho_\varepsilon * \tilde{g} \to \tilde{g}$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$. Hence, for sufficiently small $\varepsilon$, we have
\[ \|(\rho_\varepsilon * \tilde{g})|_\Omega - g\|_p < \frac{\eta}{2}, \]
which yields
\[ \|(\rho_\varepsilon * \tilde{g})|_\Omega - f\|_p < \eta \]
which completes the proof.
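Corollary 10.5.2 asserts convergence in $L^p$ for $1 \le p < \infty$ only, and a small experiment suggests why $p = \infty$ has to be excluded. The Python sketch below, again in dimension $N = 1$, mollifies the indicator function of $[0, 1]$, which lies in every $L^p$ space but is not continuous. The quadrature rule, the sampling window $[-0.5, 1.5]$, the exponent $p = 2$ and the values of $\varepsilon$ are all assumptions made for the illustration.

```python
import math

def bump(t):
    """exp(-1/(1 - t^2)) for |t| < 1 and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def midpoint(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

KAPPA = 1.0 / midpoint(bump, -1.0, 1.0, 2000)  # normalising constant for N = 1

def rho(eps, x):
    """One-dimensional mollifier rho_eps."""
    return KAPPA * bump(x / eps) / eps

def indicator(x):
    """Indicator function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def mollified_indicator(eps, x, n=200):
    """(rho_eps * indicator)(x): integrate rho_eps(y) over those y with 0 <= x - y <= 1."""
    lo, hi = max(-eps, x - 1.0), min(eps, x)
    if lo >= hi:
        return 0.0
    return midpoint(lambda y: rho(eps, y), lo, hi, n)

def errors(eps, p, n=400):
    """Approximate L^p error and sampled sup error on [-0.5, 1.5] (valid for eps <= 0.5)."""
    a, b = -0.5, 1.5
    h = (b - a) / n
    xs = [a + (k + 0.5) * h for k in range(n)]
    diffs = [abs(mollified_indicator(eps, x) - indicator(x)) for x in xs]
    lp = (h * sum(d ** p for d in diffs)) ** (1.0 / p)
    return lp, max(diffs)

results = [errors(eps, 2.0) for eps in (0.4, 0.2, 0.1)]
lp_errors = [r[0] for r in results]
sup_errors = [r[1] for r in results]

print(lp_errors, sup_errors)
```

In this experiment the $L^2$ errors decrease with $\varepsilon$, while the sampled sup errors stay close to $1/2$: near the jump at $0$ the mollified function takes values near $1/2$, whereas the indicator takes only the values $0$ and $1$.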
Bibliographical comment: Apart from the books cited in the text, the following are highly recommended for further study of the $L^p$-spaces as well as their applications:
1. Brézis, H. Functional Analysis, Sobolev Spaces and Partial Differential Equations, Springer, Universitext, 2011.
2. Ciarlet, P. G. Linear and Nonlinear Functional Analysis with Applications, SIAM, 2013.
3. Lieb, E. H. and Loss, M. Analysis, Graduate Studies in Mathematics, Volume 14, American Mathematical Society, 1997. (Indian Edition: Narosa, 1998.)

The five volume treatise entitled A Comprehensive Course in Analysis by Barry Simon is an excellent reference for all topics in analysis. In particular, Part 1 of this set has material relevant to topics treated in this book: Real Analysis: A Comprehensive Course in Analysis (Part 1), American Mathematical Society, 2015. (Indian Edition: Universities Press, 2017.)

10.6 Exercises

10.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p, q, r < \infty$ be such that
\[ \frac{1}{p} + \frac{1}{q} = \frac{1}{r}. \]
(a) If $f \in L^p(\mu)$ and $g \in L^q(\mu)$, show that $fg \in L^r(\mu)$ and that
\[ \|fg\|_r \le \|f\|_p \|g\|_q. \]
(b) If $f_n \to f$ in $L^p(\mu)$ and if $g_n \to g$ in $L^q(\mu)$, show that $f_n g_n \to fg$ in $L^r(\mu)$.

10.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^\infty$ be a sequence in $L^p(\mu)$. Assume that there exists $g \in L^p(\mu)$ such that $|f_n| \le g$ for each $n \in \mathbb{N}$. If $f_n \to f$ pointwise, show that $f \in L^p(\mu)$ and that $f_n \to f$ in $L^p(\mu)$.

10.3 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $f_n \to f$ in $L^p(\mu)$. Let $\{g_n\}_{n=1}^\infty$ be a sequence of measurable real-valued functions which are uniformly bounded by $M > 0$ and which converge almost everywhere to a measurable function $g$ on $X$. Show that $f_n g_n \to fg$ in $L^p(\mu)$.

10.4 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 < p < \infty$. Let $f : X \times X \to \mathbb{R}$ be such that, for every $y \in X$, the section $f^y$ is $p$-integrable and that
\[ \int_X \|f^y\|_p \, d\mu(y) < +\infty. \]
Define, for $x \in X$,
\[ g(x) = \int_X f(x, y) \, d\mu(y). \]
Show that $g \in L^p(\mu)$ and that
\[ \|g\|_p \le \int_X \|f^y\|_p \, d\mu(y). \]

10.5 (Riemann-Lebesgue lemma) Let $h : (0, \infty) \to \mathbb{R}$ be a bounded and (Lebesgue) measurable function such that
\[ \lim_{c \to \infty} \frac{1}{c} \int_{(0,c)} h \, dm_1 = 0. \]
(a) Let $[c, d] \subset (0, \infty)$ and let $f = \chi_{[c,d]}$. Show that
\[ \lim_{\omega \to \infty} \int_{(0,\infty)} f(t) h(\omega t) \, dm_1(t) = 0. \tag{10.6.1} \]
(b) Show that (10.6.1) holds for all $f \in L^1(0, \infty)$.
(c) If $f \in L^1(a, b)$, where $(a, b) \subset (0, \infty)$, show that
\[ \lim_{n \to \infty} \int_{(a,b)} f(t) \cos nt \, dm_1(t) = \lim_{n \to \infty} \int_{(a,b)} f(t) \sin nt \, dm_1(t) = 0. \]

10.6 (a) Consider the trigonometric series
\[ \frac{a_0}{2} + \sum_{n=1}^\infty (a_n \cos nt + b_n \sin nt). \]
Show that it can be written as
\[ \frac{a_0}{2} + \sum_{n=1}^\infty d_n \cos(nt - \phi_n). \]
(This is called the amplitude-phase form of the series.) Write down the relations between $a_n$, $b_n$ and $d_n$, $\phi_n$.
(b) Using the amplitude-phase form of a trigonometric series, show that if the series converges pointwise over a set $E$ whose (Lebesgue) measure is strictly positive, then $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. (This is called the Cantor-Lebesgue theorem.)

In Exercises 10.7-10.11, which follow, give direct proofs of the results, without appealing to the general results proved in this chapter.

10.7 Show that the spaces $\ell_p$ are complete, for $1 \le p \le \infty$.

10.8 Show that $\ell_p$ is separable if $1 \le p < \infty$ and that $\ell_\infty$ is not separable.

10.9 Let $V'$ denote the dual space of a Banach space $V$. If $p'$ is the conjugate exponent of $p$, show that $\ell_p'$ is isometrically isomorphic to $\ell_{p'}$, when $1 \le p < \infty$.

10.10 Give an example of a continuous linear functional on $\ell_\infty$ which does not arise from any element of $\ell_1$.

10.11 Let $c_0$ denote the space of all real sequences which converge to zero, equipped with the norm $\|\cdot\|_\infty$.
(a) Show that $c_0$ is complete.
(b) Show that $c_0'$ is isometrically isomorphic to $\ell_1$.

10.12 (Clarkson's inequality) (a) Let $2 \le p < \infty$. If $x \ge 0$, show that
\[ (x^2 + 1)^{\frac{p}{2}} \ge x^p + 1. \]
Deduce that, if $\alpha$ and $\beta$ are positive real numbers, then
\[ (\alpha^2 + \beta^2)^{\frac{p}{2}} \ge \alpha^p + \beta^p. \]
(b) Combining the above with the fact that the map $t \mapsto t^{\frac{p}{2}}$ is convex on the set $\{t \in \mathbb{R} \mid t \ge 0\}$, show that, if $f$ and $g$ are in $L^p(\mu)$, where $(X, \mathcal{S}, \mu)$ is a measure space, then
\[ \left\| \frac{1}{2}(f + g) \right\|_p^p + \left\| \frac{1}{2}(f - g) \right\|_p^p \le \frac{1}{2} \left( \|f\|_p^p + \|g\|_p^p \right). \]
(d) Deduce that if $(X, \mathcal{S}, \mu)$ is a measure space and if $2 \le p < \infty$, then the space $L^p(\mu)$ is uniformly convex, i.e. given $\varepsilon > 0$, there exists $\delta > 0$ such that, whenever $f$ and $g$ are in $L^p(\mu)$ with
\[ \|f\|_p = \|g\|_p = 1, \quad \|f - g\|_p > \varepsilon, \]
we have
\[ \left\| \frac{1}{2}(f + g) \right\|_p < 1 - \delta. \]

10.13 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $1 < p < \infty$. Let $p'$ denote the conjugate exponent of $p$.
(a) If $g \in L^{p'}(\mu)$, define
\[ f(x) = \begin{cases} |g(x)|^{p'-2} g(x), & \text{if } g(x) \ne 0, \\ 0, & \text{if } g(x) = 0. \end{cases} \]
Show that $f \in L^p(\mu)$.
(b) Define the continuous linear functional (as in Section 10.4)
\[ T_g(f) = \int_X f g \, d\mu, \quad \text{for all } f \in L^p(\mu). \]
Show that $\|T_g\| = \|g\|_{p'}$.
(c) Given that every uniformly convex Banach space is reflexive (cf. Kesavan [5], Theorem 5.5.1), deduce that $L^p(\mu)$ is reflexive for all $p$ such that $1 < p < \infty$.
(d) Deduce that the dual of $L^p(\mu)$ is isometrically isomorphic to $L^{p'}(\mu)$ for every $1 < p < \infty$.

Remark 10.6.1 This gives a proof of the Riesz representation theorem when $1 < p < \infty$, without the assumption of $\sigma$-finiteness on the measure space.

10.14 Let $f \in L^1(\mathbb{R}^N)$ be such that for every non-negative $C^\infty$ function with compact support, $\varphi$, we have
\[ \int_{\mathbb{R}^N} f \varphi \, dm_N \ge 0. \]
Show that $f \ge 0$ almost everywhere on $\mathbb{R}^N$.

10.15 Let $(a, b) \subset \mathbb{R}$ be a finite interval. Let $\{\varphi_k\}_{k=1}^\infty$ be an orthonormal sequence in $L^2(a, b)$, i.e.
\[ \int_{(a,b)} \varphi_j \varphi_k \, dm_1 = \begin{cases} 1, & \text{if } j = k, \\ 0, & \text{if } j \ne k. \end{cases} \]
Let $\{c_k\}_{k=1}^\infty$ be scalars such that $\sum_{k=1}^\infty |c_k|^2 < +\infty$. Show that there exists $f \in L^2(a, b)$ such that:
(i) $\int_{(a,b)} f \varphi_k \, dm_1 = c_k$ for every $k \in \mathbb{N}$.
(ii)
\[ \|f\|_2^2 = \sum_{k=1}^\infty |c_k|^2. \]

Remark 10.6.2 This result is known as the Riesz-Fischer theorem. The completeness of the $L^p$ spaces (cf. Theorem 10.1.1) is also known by the same name.

Remark 10.6.3 Let $(X, \mathcal{S}, \mu)$ be a measure space.
The space $L^2(\mu)$ is a Hilbert space with the inner-product defined by
\[ (f, g) = \int_X f g \, d\mu. \]
(If we are dealing with complex-valued functions then $g$ above should be replaced by its complex conjugate.) The Riesz-Fischer theorem, as stated in the preceding exercise, is valid in any Hilbert space.

Bibliography

[1] Ahlfors, L. V. Complex Analysis, International Student Edition, Third Edition, McGraw-Hill, 1979.
[2] Evans, L. C. and Gariepy, R. F. Measure Theory and Fine Properties of Functions, CRC Press, 1992.
[3] Folland, G. B. Real Analysis: Modern Techniques and their Applications, John Wiley and Sons Inc., 1984.
[4] Kesavan, S. Nonlinear Functional Analysis, A First Course, Texts and Readings in Mathematics (TRIM), 28, Hindustan Book Agency, 2004.
[5] Kesavan, S. Functional Analysis, Texts and Readings in Mathematics (TRIM), 52, Hindustan Book Agency, 2009.
[6] Royden, H. L. Real Analysis, 2nd Edition, Macmillan, 1964.
[7] Rudin, W. Principles of Mathematical Analysis, Third Edition, McGraw-Hill International Edition, 1976.
[8] Rudin, W. Real and Complex Analysis, Tata McGraw-Hill, 1974.
[9] Simmons, G. F. Introduction to Topology and Modern Analysis, McGraw-Hill, 1963.