CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

MEASURE THEORY AND APPLICATIONS

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Mathematics

by Keith A. Barker

June, 1993

The Thesis of Keith A. Barker is approved:

Peter Collas
Phillip Emig
David M. Klein, Chair

California State University, Northridge

ACKNOWLEDGMENTS

I would like to express my gratitude for the continued support and encouragement of several people who helped make this paper a reality. First, I would like to thank Dr. Norman Herr, Dr. Tung Po Lin, Dr. Barnabus Hughes, and Dr. Elena Marchisotto for their stimulation and guidance in my graduate studies. I would like to thank the members of my committee, Dr. Phillip Emig and Dr. Peter Collas, for their insights, which made the paper more logical and coherent. This project would not have come into existence without the patience, inspiration, and constant prodding given to me by the chair of my committee, Dr. David Klein. Thank you, David. I needed that.

I would also like to thank my brother, Chris, and his wife, Sue, for their constant support and encouragement throughout the pursuit of my studies. And, of course, I want to dedicate this effort to my wife, Kay, and my beautiful daughter, Christy, without whose patience, understanding, and reinforcement this project would never have been completed.

TABLE OF CONTENTS

Acknowledgments
Abstract
1. Introduction
   Historical Perspective
   Elementary Measures
2. The Area Under a Curve
   The Riemann Integral
   The Lebesgue Integral
3. Measure Spaces
   Definitions
   Outer Measure
   Measurable Sets
   Measure Spaces
4. Integration
   Riemann Integration
   Lebesgue Integration
   The Fundamental Theorem of Calculus
5. Applications
   Fourier Series
   Probability
Bibliography

ABSTRACT

MEASURE THEORY AND APPLICATIONS

by Keith A. Barker

Master of Science in Mathematics

This graduate paper is an excursion into the field of measure theory. It begins by focusing on the historical development of the measures of lengths, areas, and volumes, followed by the introduction of various methods to calculate these quantities. The area under a curve is discussed descriptively in terms of both the Riemann integral and the Lebesgue integral. A series of fundamental definitions leads to the development of the outer measure set function. An example of a family of sets on which outer measure is not a countably additive set function is then formulated. Measurable sets are defined and a series of theorems is proved to establish that outer measure, restricted to measurable sets, is a countably additive set function. After a discussion of measure spaces, Riemann integration is developed in terms of step functions, followed by a rigorous development of Lebesgue integration in terms of simple functions. The usefulness of Lebesgue integration is then illustrated by applying the Lebesgue Dominated Convergence Theorem to a sequence of functions that converge pointwise, but do not converge uniformly.
The Fundamental Theorem of Calculus is presented for the Riemann integral and is then generalized to the Lebesgue integral. The thesis concludes with an application indicating how Lebesgue integration is used in Fourier series and with an application of measure theory to the study of probability.

Chapter 1

Introduction

HISTORICAL PERSPECTIVE

Historians generally agree that mathematics arose from the necessity to solve the practical problems of counting and recording numbers. The annual flooding of the Nile River plain forced the Egyptians to develop a method for the relocation of land markings. Since, according to Herodotus (c. 485 - c. 430 B.C.), taxes were paid on the basis of land area, Egyptian surveyors developed, and used, various mensuration formulas. To convert the land along the Tigris and Euphrates rivers into a rich agricultural region, the Babylonians engineered structures which accomplished marsh drainage, irrigation, and flood control. The mathematics needed by the Babylonians was extensive. The word geometry is a compound of the two Greek words meaning "earth" and "measure," indicating that the subject arose from the necessity of land measure (Eves, 1).

As primitive counting evolved into mensuration and practical arithmetic, it became necessary to develop methods to teach, and to study, this developing science. The study of the empirical reasoning inherent in mensuration and practical arithmetic led to the abstraction needed for the beginnings of theoretical geometry and elementary algebra. When theoretical geometry and elementary algebra are studied in the abstract, formulas for both areas and volumes of various geometric shapes arise. Although many of the Egyptian and Babylonian formulas contained errors (Eves, 2), it was their groundwork that led to the generalized concepts of the length of a segment, the area of a plane figure, and the volume of a figure in space.
These notions are the intuitive values that are associated with the "measure" of some particular object.

Since the time of the ancient Egyptians and Babylonians, the problem of finding areas and volumes has captured the imagination of mathematicians. Archimedes (287 - 212 B.C.), through an ingenious method called "exhaustion," verified calculations of areas and volumes (Burton, 221). In this method, formal logic, not infinitesimal quantities, was used. Nicole Oresme (c. 1323 - 1382) anticipated an aspect of analytic geometry by graphing the dependent variable against the independent variable as the independent variable took on small increments. Using this method, he was able to approximate areas under curves (Eves, 93).

With the advent of the scientific revolution in the seventeenth century, interest in the calculation of area and volume reappeared. Johannes Kepler (1571 - 1630), in his second law of celestial mechanics, stated that the displacement vector from the sun to a planet sweeps out equal areas in equal time intervals (Burton, 341). Kepler treated the area swept out by the planet as a collection of an infinite number of small triangles with one vertex at the sun and the other two vertices at points infinitely close together along the planet's elliptical orbit. Using this unrefined form of calculus, Kepler was able to find the sum of the areas of these triangles, and verify his law (Burton, 337). Throughout the sixteenth and seventeenth centuries, mathematicians and scientists developed their own peculiar methods for finding areas and volumes. Isaac Newton (1642 - 1727) and Gottfried Leibniz (1646 - 1716) were the first to apply integral calculus, in a systematic way, to the determination of areas and volumes.

ELEMENTARY MEASURES

Area, as a measure of a plane figure, has evolved from the land measure of the Egyptians and Babylonians to the abstract Banach spaces of modern analysis. To develop area as a measure, it is requisite to start with a unit length.
By establishing an arbitrary length as unity, a unit square follows immediately. The measure of any square whose side is a multiple of unity can then be found by counting the number of unit squares contained in the original square. In a similar manner, areas of rectangles, whose sides are multiples of unity, can be determined by totaling the number of unit squares contained in the figure. If the areas of various squares and rectangles are collected, a general rule can be ascertained from the data relating area to the lengths of the sides of the respective figures. The concept of area, as a measure, can be further extended with the inclusion of rational and irrational numbers as lengths of sides of the squares and rectangles.

The area of a parallelogram is found by removing a triangular region from one end, and attaching it to the opposite end, to form a rectangle. Areas of triangular regions, which can be shown to be half of a parallelogram, can now be determined. The area of any shape which can be broken down into nonoverlapping squares, rectangles, and triangles can be found by summing the areas of the individual pieces.

By using the definition of $\pi$ (the ratio of the circumference of a circle to the diameter of the circle), the area of a circular region can now be ascertained. The first step is to determine the area of an n-sided regular polygon (which can be inscribed in a circle). In Figure 1, "x" represents the length of a side of the polygon, and "a" represents the length of the apothem, the distance from the center of the circumscribed circle to the side of the polygon. The area of the triangle is (1/2)(x)(a), and, since there are "n" triangles in the n-sided polygon, the area of the polygon becomes (1/2)(n)(x)(a). As the number of sides of the polygon becomes infinitely large, the term "(n)(x)", which is the perimeter of the polygon, approaches the circumference of the circle, $2\pi r$, and the length "a" approaches the radius of the circle, r.
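The limiting behavior of the polygon area (1/2)(n)(x)(a) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the standard trigonometric expressions $x = 2r\sin(\pi/n)$ for the side and $a = r\cos(\pi/n)$ for the apothem of a regular n-gon inscribed in a circle of radius r, and the function name `inscribed_polygon_area` is hypothetical.

```python
import math


def inscribed_polygon_area(n: int, r: float = 1.0) -> float:
    """Area (1/2)(n)(x)(a) of a regular n-gon inscribed in a circle of radius r.

    Assumption (not in the text): side x = 2 r sin(pi/n), apothem a = r cos(pi/n).
    """
    x = 2.0 * r * math.sin(math.pi / n)  # length of one side
    a = r * math.cos(math.pi / n)        # apothem
    return 0.5 * n * x * a


# The polygon areas increase toward the area of the unit circle.
for n in (4, 6, 24, 96, 10_000):
    print(n, inscribed_polygon_area(n))
print("pi * r**2 for r = 1:", math.pi)
```

Since the code uses the library value of $\pi$ to compute the side and apothem, it is only a consistency check of the limit, not an independent computation of $\pi$.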
The area of the circle now becomes $(1/2)(2\pi r)(r)$, which simplifies to $\pi r^2$.

Avoiding rigor, and for the illumination of the reader, the area of a sector of a circle can be found by using the following ratio:

$$\frac{\text{Area of Sector}}{\text{Area of Circle}} = \frac{\text{Arc Length of Sector}}{\text{Circumference of Circle}}$$

If $\theta$ signifies the angular measure, in radians, of the central angle formed by the radii of the arc, the formula can now be simplified to:

$$\text{Area of Sector} = \frac{(r\theta)(\pi r^2)}{2\pi r} = \frac{1}{2} r^2 \theta$$

The measure of any region composed of non-overlapping squares, rectangles, triangles, circles, or arcs of circles, can now be determined. As long as a figure is composed of straight line segments, arcs of circles, or combinations of these, it is relatively easy, although possibly tedious, to find its measure. If, however, the figure is composed of curves determined by relations other than circular arcs, the measure must be calculated through the use of integral calculus.

Chapter 2

The Area Under A Curve

Generally, the area under a curve is determined by finding the limit of the sum of inscribed rectangles, as the number of rectangles increases without bound. The area is found by two distinct methods. The first technique is called the Riemann integral, which is the familiar integral of elementary calculus, and the second is the Lebesgue integral, which is the keystone of modern analysis.

THE RIEMANN INTEGRAL

Given a function, f, continuous on the interval [a, b], we begin by partitioning the interval into n subintervals by inserting points $x_1, x_2, \ldots, x_{k-1}, x_k, \ldots, x_{n-1}$ into the interval as shown in Figures 2 and 3. For convenience, the figures shown are of positive valued functions, but the procedure is valid for any continuous function on the closed interval.
[Figures 2 and 3: the graph of y = f(x) over [a, b], with the partition points $x_1, x_2, \ldots, x_{k-1}, x_k, \ldots, x_{n-1}$ marked on the x-axis.]

These points subdivide [a, b] into n subintervals of lengths $\Delta x_1 = x_1 - a$, $\Delta x_2 = x_2 - x_1$, ..., $\Delta x_n = b - x_{n-1}$, which need not be equal. Since f is continuous, it has a minimum value, $m_k$, and a maximum value, $M_k$, in each subinterval (for a proof of this consequence of continuity, see Cronin-Scanlon, pages 69 - 70). The sum of the areas of the rectangles in Figure 2 is called the Lower Sum:

$$L = m_1 \Delta x_1 + m_2 \Delta x_2 + \cdots + m_n \Delta x_n = \sum m_k \Delta x_k,$$

and the sum of the areas of the rectangles in Figure 3 is called the Upper Sum:

$$U = M_1 \Delta x_1 + M_2 \Delta x_2 + \cdots + M_n \Delta x_n = \sum M_k \Delta x_k.$$

The difference in the areas, U - L, is of particular importance. As the number of subintervals increases, the tops of the rectangles fit the curve more closely, thus decreasing the difference, U - L. First, define the "norm" of the subdivision as the largest subinterval width; then, as this norm approaches zero, all of the subintervals must both decrease in width and become more numerous at the same time. As the number of rectangles increases, the difference U - L can be made smaller than any given positive number. This means that

$$\lim_{\text{norm} \to 0} (U - L) = 0, \quad \text{which implies that} \quad \lim_{\text{norm} \to 0} U = \lim_{\text{norm} \to 0} L.$$

The existence of these limits is a consequence of a special property that continuous functions have on a closed and bounded interval, called "uniform continuity." (For a discussion of uniform continuity, see Cronin-Scanlon, pages 71 - 74.)

In each interval $[x_{k-1}, x_k]$, a point $c_k$ is arbitrarily chosen and the sum $S = \sum f(c_k) \Delta x_k$ is formed. The sum S, like L and U, is a sum of products of functional values and interval widths, and is therefore a sum of areas of rectangles. However $c_k$ is chosen, we know that $m_k \le f(c_k) \le M_k$, which implies that $L \le S \le U$.
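The relationship $L \le S \le U$, and the shrinking of U - L as the norm decreases, can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions that are not in the text: the subintervals have equal width, and f is non-decreasing on [a, b], so that $m_k$ and $M_k$ are attained at the left and right endpoints of each subinterval. The names `riemann_sums` and `square` are hypothetical.

```python
import random


def riemann_sums(f, a, b, n, rng):
    """Return (L, S, U) for a non-decreasing f on [a, b] with n equal subintervals.

    Assumption: f is non-decreasing, so m_k = f(x_{k-1}) and M_k = f(x_k).
    """
    width = (b - a) / n
    lower = sample = upper = 0.0
    for k in range(1, n + 1):
        left = a + (k - 1) * width
        right = a + k * width
        # c_k is an arbitrary point of [x_{k-1}, x_k]; clamp against rounding.
        c_k = min(max(rng.uniform(left, right), left), right)
        lower += f(left) * width
        sample += f(c_k) * width
        upper += f(right) * width
    return lower, sample, upper


def square(x):
    return x * x


rng = random.Random(0)
for n in (10, 100, 1000):
    L, S, U = riemann_sums(square, 0.0, 1.0, n, rng)
    print(n, L, S, U, U - L)
```

For $f(x) = x^2$ on [0, 1], the difference U - L equals 1/n for this equal-width partition, so all three sums are squeezed toward the same value as n grows.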
By the Sandwich Theorem (Thomas/Finney, 65), it is concluded that

$$\lim_{\text{norm} \to 0} S = \lim_{\text{norm} \to 0} L = \lim_{\text{norm} \to 0} U,$$

which means that, no matter how the point $c_k$ is chosen, the sum $S = \sum f(c_k) \Delta x_k$ always has the same limit as the norm of the subdivisions approaches zero. This conclusion, without uniform continuity, was first reached in 1823 by Augustin-Louis Cauchy (1789 - 1857), and was then put on a solid logical foundation, with uniform continuity, by other nineteenth century mathematicians.

This limit, called the Riemann Integral of function f over the interval [a, b], is denoted by $\int_a^b f(x)\,dx$. This integral was named for Georg Friedrich Bernhard Riemann (1826 - 1866). It was Riemann's idea to trap the limit between the upper and lower sums. The sum $S = \sum f(c_k) \Delta x_k$ is called the approximating sum, or the Riemann sum, for the integral (Thomas/Finney, 267).

THE LEBESGUE INTEGRAL

The usual derivation of the Riemann integral is based on dividing the domain of the function f into smaller and smaller subintervals. Henri Lebesgue (1875 - 1941), in his doctoral thesis published in 1902, made a simple modification of the Riemann theory. Instead of dividing the domain, he divided the range into finer and finer pieces. This method, illustrated in Figure 4, depends more on the function than on the domain, so it has the possibility of working for more types of functions than does Riemann integration.

The first step in Lebesgue integration is to define a measure function $\mu$ which will assign a non-negative real number to sets that are deemed measurable. This function must have the property that the measure of an interval on the real line, $\mu([a, b])$, must be equal to (b - a). This requirement suggests that the measure function on the real line is actually the formula for the length of the interval between pairs of points on the line as defined in elementary algebra.

The next step is to formulate the "inverse image" of the set $[y_0, y_n]$. It is denoted as:

$$f^{-1}([y_0, y_n]) = \{x : f(x) \in [y_0, y_n]\}.$$

The inverse image is the set of all points of the domain of the function which are mapped into $[y_0, y_n]$ by f. As an example, let f be a function such that:

$$f(x) = \begin{cases} 1 & \text{if } x \in [2, 3) \\ 0 & \text{if } x \notin [2, 3) \end{cases}$$

then $f^{-1}(1) = [2, 3)$ and $f^{-1}(0) = (-\infty, 2) \cup [3, \infty)$.

Using the concepts of measure and inverse image, the Lebesgue Integral can now be developed. Referring to Figure 4 below, the range of f(x) must be partitioned into subintervals, each of width $\frac{1}{n}$, and note that portions of the graph of the function are contained within the "horizontal rectangle" bounded by $\frac{m}{n}$ and $\frac{m+1}{n}$.

[Figure 4: the graph of f over [a, b], with the horizontal strip between $\frac{m}{n}$ and $\frac{m+1}{n}$ marked on the y-axis.]

Let

$$L_n(f) = \sum_{m} \frac{m}{n}\, \mu\!\left(f^{-1}\!\left(\left[\frac{m}{n}, \frac{m+1}{n}\right)\right)\right).$$

Since $\frac{m}{n}$ is a y-coordinate, and $\mu\!\left(f^{-1}\!\left(\left[\frac{m}{n}, \frac{m+1}{n}\right)\right)\right)$ represents the measure, or the "length," of a set on the horizontal axis, the sum $L_n(f)$ denotes the aggregate of the areas of rectangles. As n increases without bound, so that the horizontal strips become finer, the limit of the sum will be the area under the curve. The Lebesgue Integral can now be defined in the following manner:

$$\int f(x)\,dx = \lim_{n \to \infty} L_n(f)$$

(Reed-Simon, 13-14). The existence of this limit can be proven using, for example, the Lebesgue Dominated Convergence Theorem.

To illustrate the Lebesgue Integral, define the function f(x), on the interval [0, 1], in the following manner:

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is irrational} \\ 0 & \text{if } x \text{ is rational} \end{cases}$$

Attempting to integrate this function by the Riemann integral, the limit of the lower sum, L, is found to be equal to zero, and the limit of the upper sum, U, is determined to be one, which means that the function is not Riemann integrable. The alternative is to now try the Lebesgue integration technique. By using the horizontal rectangles as defined by the Lebesgue integral, and as shown in Figure 5, it is concluded that:

$$\int_0^1 f(x)\,dx = (1)\{\mu[f^{-1}(1)]\}.$$

[Figure 5: the function f on [0, 1].]

An analysis of the measure of the interval [0, 1] is necessary to find $\mu[f^{-1}(1)]$.
The measure of the interval [0, 1] is equal to the length of the interval, which is one. Since the union of the disjoint sets, {Rationals $\cap$ [0, 1]} and {Irrationals $\cap$ [0, 1]}, is equal to the interval [0, 1],

$$\mu[0, 1] = \mu\{\text{Rationals} \cap [0, 1]\} + \mu\{\text{Irrationals} \cap [0, 1]\}.$$

The set of rational numbers is countably infinite, so the set {Rationals $\cap$ [0, 1]} can be written as the set $\{r_1, r_2, r_3, \ldots\}$, where each $r_k$ represents a unique rational number. The set $\{r_1, r_2, r_3, \ldots\}$ can be represented as the union of $\{r_1\}, \{r_2\}, \{r_3\}, \ldots$. Since $\{r_i\} \cap \{r_j\} = \emptyset$ for $i \ne j$,

$$\mu\{\text{Rationals} \cap [0, 1]\} = \mu\{r_1, r_2, r_3, \ldots\} = \mu\{r_1\} + \mu\{r_2\} + \mu\{r_3\} + \cdots,$$

which is the sum of the measures of sets whose only element is a single point. A set consisting of a single point is an interval with both endpoints being the same point. Since the measure of an interval is the length of the interval, $\mu\{r_k\} = r_k - r_k = 0$, which implies $\mu\{\text{Rationals} \cap [0, 1]\}$ must be 0. Therefore,

$$\mu\{\text{Irrationals} \cap [0, 1]\} = \mu[0, 1] - \mu\{\text{Rationals} \cap [0, 1]\} = 1 - 0 = 1.$$

But $f^{-1}(1)$ is equal to {Irrationals $\cap$ [0, 1]}, which means $\mu[f^{-1}(1)] = \mu\{\text{Irrationals} \cap [0, 1]\} = 1$, and

$$\int_0^1 f(x)\,dx = (1)\{\mu[f^{-1}(1)]\} = (1)(1) = 1.$$

To understand this kind of integration, it is necessary to comprehend the notion of length, i.e. the "measure," of a set. Do all sets have measure? If not, what are the necessary and sufficient conditions for a set to have a measure? What are the properties of a set that will make it measurable? The answers to these, and other questions, will be found as Lebesgue measure and measure theory are developed in the remainder of this paper.

Chapter 3

Measure Spaces

DEFINITIONS

To introduce the study of measure spaces, it is necessary to establish a few working definitions. To begin with, the real number system is not sufficient for this investigation and must be enlarged to include the two elements, $-\infty$ and $+\infty$. This set will be referred to as the extended real numbers.
It should be noted that, for any real number x, $-\infty < x < \infty$ and the following properties hold:
1) $x + \infty = \infty$ and $x - \infty = -\infty$
2) $x(\infty) = \infty$ and $x(-\infty) = -\infty$, for all $x > 0$
3) $\infty + \infty = \infty$ and $-\infty - \infty = -\infty$
4) $\infty(\pm\infty) = \pm\infty$, $-\infty(+\infty) = -\infty$, and $-\infty(-\infty) = +\infty$
5) $0(\infty) = 0$, and $(\infty - \infty)$ will be left undefined. (Royden, 36)

Let $\mathcal{A}$ be a collection of sets such that if A and B are elements of $\mathcal{A}$, then $(A \cup B) \in \mathcal{A}$ and $(\sim A) \in \mathcal{A}$, where $(\sim A)$ is the complement of set A. When these two conditions are met, the collection $\mathcal{A}$ is called an algebra of sets. Using DeMorgan's laws, it follows that $(A \cap B)$ is also an element of $\mathcal{A}$. If the above two conditions are satisfied, and if any union of a countable collection of sets in $\mathcal{A}$ is again in $\mathcal{A}$, then $\mathcal{A}$ is called a $\sigma$-algebra. It again follows that the intersection of a countable collection of sets in $\mathcal{A}$ is also an element of $\mathcal{A}$.

An interval I, on the real number line, can be open, closed, or half-open, and can be denoted in the following forms: (a, b), [a, b], (a, b], or [a, b). The length of I, $l(I)$, is defined by (b - a). $l(I)$ is an example of a set function. A set function is a function that assigns an extended real number to each set in some collection of sets. The domain of this length function is the collection of all intervals contained in the real number line.

If S is a set of real numbers, d is an upper bound for set S if, for each $x \in S$, $x \le d$. A number c is called the least upper bound of set S if c is an upper bound of set S and $c \le d$ for each upper bound d of set S. The least upper bound of set S is also referred to as the supremum of set S and denoted as sup S. The number a is a lower bound of set S if, for each $x \in S$, $a \le x$. If b is a lower bound for set S and $a \le b$ for each lower bound a of set S, then b is defined as the greatest lower bound or the infimum of set S, and denoted as inf S.
The Completeness Axiom states that every set S of real numbers which has an upper bound has a least upper bound. From this axiom, it follows that every set of real numbers with a lower bound has a greatest lower bound.

The concept of measure of a set is a natural generalization of the ideas of length, area, or volume. Other measures can be created, for example, by considering the increment $\phi(b) - \phi(a)$ of a non-decreasing function $\phi(x)$ on the interval [a, b), or the integral of a non-negative function over some line, plane surface or space region. In all cases, the notion of measure is a set function that assigns a non-negative extended real number to a given set.

Let $\mathfrak{M}$ be a collection of sets of real numbers, and let "m" be a set function that assigns non-negative extended real numbers to elements of $\mathfrak{M}$. If set E is an element of $\mathfrak{M}$, then the extended real number assigned to E by m, denoted "mE," is called the measure of E. In the ideal situation, the set function m should have the following properties:
1. mE should be defined for each set E of real numbers.
2. For an interval I, on the real number line, $mI = l(I)$.
3. If $\{E_n\}$ is a sequence of disjoint sets for which m is defined, then $m(\bigcup E_n) = \sum mE_n$. This means that "m" is a countably additive set function.
4. m is translation invariant. If E is a set for which m is defined and if E + y is the set $\{x + y : x \in E\}$ obtained by replacing each point x in E by the point (x + y), then m(E + y) = mE (Royden, 54).

As will be illustrated later, it is impossible to construct a set function having all of these properties and it is not known if a set function satisfying the first three properties exists. Consequently, one of the properties must be weakened and generally it is most useful to weaken the first property and retain the remaining three properties. Therefore, mE need not be defined for all sets E of real numbers; however, it is desired that mE be defined for as many sets as possible.
It is convenient to require that the family $\mathfrak{M}$ of sets for which m is defined be a $\sigma$-algebra. The set function "m" is said to be a countably additive measure if it is a non-negative extended real valued function whose domain is a $\sigma$-algebra $\mathfrak{M}$ of sets of real numbers and $m(\bigcup E_n) = \sum mE_n$ for each sequence $\{E_n\}$ of disjoint sets in $\mathfrak{M}$. It is now desired to construct a countably additive measure which is translation invariant and has the property that $mI = l(I)$ for each interval I (Royden, 55).

OUTER MEASURE

Consider a set of real numbers, A, and a countable collection $\{I_n\}$ of open intervals that covers A. Note that $A \subset \bigcup I_n$ and that the length of each interval is a positive number. Since the sum of the lengths of the intervals is uniquely defined independent of the order of terms, the outer measure, m*A, of A can be defined as the greatest lower bound or the infimum of all such sums of the lengths of the intervals that cover A. In symbols,

$$m^*A = \inf_{A \subset \bigcup I_n} \sum l(I_n).$$

The following three propositions arise from this definition.

Proposition 1: If $A \subset B$, then $m^*A \le m^*B$.
1. Suppose $A \subset B$.
2. Let $L_A = \{\sum l(I_n) : I_n \text{ is an open interval and } A \subset \bigcup I_n\}$. Let $L_B = \{\sum l(I_n) : I_n \text{ is an open interval and } B \subset \bigcup I_n\}$.
3. As an example, let set A = (0, 1), which implies $1 \in L_A$, and let set B = (-1, 2), which implies $3 \in L_B$.
4. Every element of $L_B$ is also an element of $L_A$, since every cover of B is also a cover of A, which implies that $L_B \subset L_A$.
5. Therefore, $\inf L_A \le \inf L_B$, which implies $m^*A \le m^*B$.

Proposition 2: Each set consisting of a single point has outer measure zero.
1. Suppose set A contains only the single point x.
2. $m^*A = \inf\{\sum l(I_n) : I_n \text{ is an open interval and } A \subset \bigcup I_n\}$.
3. Let $I_\varepsilon = (x - \frac{1}{2}\varepsilon,\ x + \frac{1}{2}\varepsilon)$, for some $\varepsilon > 0$.
4. Since $I_\varepsilon$ is an open covering of set A, $A \subset (x - \frac{1}{2}\varepsilon,\ x + \frac{1}{2}\varepsilon)$.
5. Therefore, $m^*A \le l(x - \frac{1}{2}\varepsilon,\ x + \frac{1}{2}\varepsilon) = \varepsilon$.
6. Since $\varepsilon$ is arbitrarily small, m*A must be zero.

Proposition 3: $m^*\emptyset = 0$.
1. Since $\emptyset \subset \{x\}$ for any $x \in \mathbb{R}$, $m^*\emptyset \le m^*\{x\}$, by Proposition 1.
2. By Proposition 2, $m^*\{x\} = 0$, which implies that $m^*\emptyset \le 0$.
3. By the definition of outer measure, $0 \le m^*\emptyset$.
4. By combining steps 2 and 3, $0 \le m^*\emptyset \le 0$, which implies $m^*\emptyset = 0$.

The following three properties concerning outer measure are direct consequences of the definitions presented here. Elegant proofs of these properties are provided in Royden, pages 56 - 58.

Property 1: The outer measure of an interval is its length.
Property 2: Let $\{A_n\}$ be a countable collection of sets of real numbers. Then $m^*(\bigcup_{n=1}^{\infty} A_n) \le \sum_{n=1}^{\infty} m^*A_n$. This is called countable subadditivity.
Property 3: Outer measure is translation invariant.

As a consequence of Property 2 above and Proposition 2, if $\{x_1, x_2, \ldots\}$ is a countable collection of real numbers, then

$$m^*\{x_1, x_2, \ldots\} \le \sum_{i=1}^{\infty} m^*\{x_i\} = \sum_{i=1}^{\infty} 0 = 0.$$

This establishes that the outer measure of any countable set of real numbers is equal to zero.

Outer measure is defined for all sets of real numbers. However, as indicated in Property 2 above, it is not necessarily a countably additive set function. To provide an example of a family of sets of real numbers on which outer measure is not a countably additive set function, it is first necessary to define the sum modulo one of two numbers. If x and y are real numbers in the interval [0, 1), then the sum modulo one of x and y, denoted by $(x \oplus y)$, is defined as follows:

$$x \oplus y = \begin{cases} x + y & \text{if } x + y < 1 \\ x + y - 1 & \text{if } x + y \ge 1 \end{cases}$$

The operation $\oplus$ is a commutative and associative operation that maps pairs of numbers in [0, 1) into numbers in [0, 1). As an example of this operation, consider the unit circle, and assign to each $x \in [0, 1)$ the angle $2\pi x$. The sum modulo one now corresponds to angle addition. For example, let x = 0.5 and let y = 0.75; then $(x \oplus y) = 0.5 + 0.75 - 1 = 0.25$, which corresponds to $2\pi(0.5) + 2\pi(0.75) = 2\pi(1.25) = 2.5\pi$, which is equivalent to $(0.5)\pi$ or $2\pi(0.25)$.
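The sum modulo one defined above can be sketched in a few lines of code. This is a minimal illustration, not part of the original development: the function name `add_mod_one` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is assumed so that the checks are not affected by floating-point rounding.

```python
import math
from fractions import Fraction


def add_mod_one(x, y):
    """Sum modulo one of x and y, both required to lie in [0, 1)."""
    if not (0 <= x < 1 and 0 <= y < 1):
        raise ValueError("both arguments must lie in [0, 1)")
    s = x + y
    return s if s < 1 else s - 1


x, y, z = Fraction(1, 2), Fraction(3, 4), Fraction(2, 3)

# The worked example from the text: 0.5 and 0.75 sum to 0.25 modulo one.
print(add_mod_one(x, y))

# Commutativity and associativity, checked on one sample triple only.
print(add_mod_one(x, y) == add_mod_one(y, x))
print(add_mod_one(add_mod_one(x, y), z) == add_mod_one(x, add_mod_one(y, z)))

# Angle interpretation: 2*pi*(x (+) y) agrees with 2*pi*x + 2*pi*y up to a full turn.
angle_sum = (2 * math.pi * 0.5 + 2 * math.pi * 0.75) % (2 * math.pi)
print(math.isclose(angle_sum, 2 * math.pi * 0.25))
```

A check on a single triple does not prove associativity, of course; it only illustrates the property stated in the text.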
Suppose set E is a subset of [0, 1); then the translation modulo one of set E is defined as $(E \oplus y) = \{z : z = (x \oplus y) \text{ for some } x \in E\}$. If the sum modulo one is viewed as angle addition, then translation modulo one by y corresponds to rotation of set E through the angle $2\pi y$. The following lemma establishes that outer measure is invariant under translation modulo one.

Lemma 1: If E is an outer measurable subset of [0, 1), then, for each $y \in [0, 1)$, $(y \oplus E)$ is outer measurable and $m^*(y \oplus E) = m^*E$.

An excellent example of the proof of this lemma is provided by Munroe on page 142.

If x and y are real numbers in [0, 1) and if (x - y) is a rational number, then x and y are defined to be equivalent and will be written as $(x \equiv y)$. This is an equivalence relation since
i) $(x \equiv x)$: $x - x = 0 \in Q$, the set of rational numbers.
ii) If $(x \equiv y)$, then $(y \equiv x)$: $(x - y) \in Q$ implies $(y - x) \in Q$.
iii) If $(x \equiv y)$ and $(y \equiv z)$, then $(x \equiv z)$: if $(x - y) \in Q$ and $(y - z) \in Q$, then $(x - y) + (y - z) = (x - z) \in Q$.

This equivalence relation partitions the interval [0, 1) into equivalence classes, that is, classes such that any two elements of one class differ by a rational number. The reader should note that a partition separates a set into a collection of disjoint subsets whose union is the original set. The equivalence class for the element x is the set of all y such that $(x \equiv y)$. The following are three examples of equivalence classes:
i) Q, the set of rational numbers
ii) $\sqrt{2} + Q$
iii) $\pi + Q$

The Axiom of Choice states that, given any collection of non-empty sets, it is possible to choose an element from each of the sets. Using this axiom, let P be the set consisting of one element from each equivalence class as defined above. If $\{r_0, r_1, r_2, \ldots\}$ is an enumeration of the rational numbers in [0, 1), with $r_0 = 0$, then define $P_i = r_i \oplus P$, where $P_0 = r_0 \oplus P = 0 \oplus P = P$. The following lemma establishes that $\{P_i\}$ is a pairwise disjoint sequence of sets.

Lemma 2: $P_i \cap P_j = \emptyset$, for $i \ne j$.
1. Suppose $x \in P_i \cap P_j$; then x is in both $P_i$ and $P_j$.
2. Then $x = p_i \oplus r_i = p_j \oplus r_j$, where both $p_i$ and $p_j \in P$.
3. But $p_i - p_j \in Q$.
4. Therefore, $p_i \equiv p_j$.
5. But P contains exactly one element from each equivalence class, which implies $p_i = p_j$ and $r_i = r_j$, which means $P_i = P_j$.
Therefore, if the intersection is non-empty, the sets are equal, and, if they are different, the intersection of $P_i$ and $P_j$ must be empty.

Lemma 3: $[0, 1) = \bigcup_{i=0}^{\infty} P_i$.
1. Let $x \in [0, 1)$; then x is contained in some equivalence class. Suppose $x \equiv p$, with $p \in P$.
2. Therefore $x = p \oplus r_i$, for some $r_i \in Q \cap [0, 1)$, which implies $x \in P_i$.
3. Thus, $[0, 1) = \bigcup_{i=0}^{\infty} P_i$.

From Lemmas 2 and 3, $m^*[0, 1) = m^*(\bigcup_{i=0}^{\infty} P_i) = 1$. If outer measure is countably additive on $\{P_i\}$, then $m^*(\bigcup_{i=0}^{\infty} P_i) = \sum_{i=0}^{\infty} m^*(P_i) = 1$. Since each $P_i$ is a translation modulo one of P, then, by Lemma 1, $m^*(P_i) = m^*(P)$ for i = 1, 2, 3, ... and $\sum_{i=0}^{\infty} m^*(P_i) = \sum_{i=0}^{\infty} m^*(P) = 1$. Suppose $m^*(P) = 0$; then $\sum_{i=0}^{\infty} m^*(P_i) = \sum_{i=0}^{\infty} m^*(P) = \sum_{i=0}^{\infty} 0 = 0 \ne 1$. Suppose $m^*(P) > 0$; then $\sum_{i=0}^{\infty} m^*(P_i) = \sum_{i=0}^{\infty} m^*(P) = \infty \ne 1$. It can now be concluded that $\sum_{i=0}^{\infty} m^*(P)$ is either 0 or $\infty$, depending on whether m*(P) is zero or a positive number. In either case, outer measure is not a countably additive set function on the pairwise disjoint sequence of sets, $\{P_i\}$ (Royden, 64-66).

It is clear from this example that in order to guarantee countable additivity of the outer measure set function, it is necessary to, at the minimum, restrict outer measure to a family of subsets of $\mathbb{R}$ which does not contain set P or its translates $\{P_i\}$. Set P will then be a "nonmeasurable" subset of $\mathbb{R}$. The notion of measurability is defined in the next section.

The construction of the "non-measurable" family of sets $\{P_i\}$ required the use of the Axiom of Choice. Robert M. Solovay, in 1970, showed that "... the existence of a non-Lebesgue measurable set cannot be proved in Zermelo-Frankel set theory (ZF) if use of the axiom of choice is disallowed.
In fact, even adjoining an axiom DC (principle of dependent choice) to ZF, which allows many consecutive choices, does not create a theory strong enough to construct a non-measurable set" (Solovay, 1). Solovay's result shows that the construction of a non-measurable set requires a form of the axiom of choice stronger than dependent choice.

MEASURABLE SETS

Since all subsets of the real number line have outer measure, to meet the condition of countable additivity, it will be necessary to decrease the number of sets on which outer measure is defined. To suitably reduce the set of outer measurable sets to a collection of measurable sets forming a $\sigma$-algebra on which outer measure will be countably additive, the definition first provided by Carathéodory (page 161), and used by Halmos (page 44) and Royden (page 58), will be employed.

Definition: A set E is said to be measurable if, for each set A, we have $m^*A = m^*(A \cap E) + m^*(A \cap \sim E)$, where the symbol $\sim E$ represents the complement of set E. A member of the $\sigma$-algebra formed by this definition will be called a measurable set.

It is difficult to get an intuitive understanding of this definition except through familiarity with its implications, as will be done in the proofs of the succeeding theorems. Since outer measure is not a countably additive set function on the collection of all subsets of $\mathbb{R}$, this definition creates a class of sets on which outer measure is, at the minimum, additive. Given any two sets of real numbers, A and E, set A is equal to $(A \cap E) \cup (A \cap \sim E)$. Applying the definition of measurability, a measurable set E splits set A in such a manner that $(A \cap E)$ and $(A \cap \sim E)$ have outer measures that add to the outer measure of set A. Since set E splits every set A in such a way that outer measure is additive on set A, set E is regarded as measurable. In other words, this definition achieves the rudimentary additive notion for outer measure.
Finally, it should be noted that the definition of measurability of the set $E$ starts with the outer measure of the set $A$, not with the outer measure of $E$. The set $A$ is an arbitrary "test set." The measurability of $E$ has nothing to do with the outer measure of $E$ itself; it depends on what $E$ does to the outer measure of other sets. A measurable set is one which separates no set in such a way as to abrogate additivity for outer measure (Munroe, 86).

From this foundation, the additivity of outer measure can now be extended to pairs of disjoint sets, provided one of the sets is measurable.

Proposition 4: If $E_1$ is measurable, and $E_1 \cap E_2 = \emptyset$, then $m^*(E_1 \cup E_2) = m^*(E_1) + m^*(E_2)$.
1. Since $E_1$ and $E_2$ are disjoint, $(E_1 \cup E_2) \cap E_1 = E_1$, and $(E_1 \cup E_2) \cap {\sim}E_1 = E_2$.
2. Since $E_1$ is measurable, $m^*(E_1 \cup E_2) = m^*[(E_1 \cup E_2) \cap E_1] + m^*[(E_1 \cup E_2) \cap {\sim}E_1]$.
3. By substitution, $m^*(E_1 \cup E_2) = m^*(E_1) + m^*(E_2)$.

This proposition is a special case of Theorem 5, which is proved later in this chapter. Using the definition of measurability, two other sets can now be classified as measurable.

Theorem 1: The empty set, $\emptyset$, and the set of real numbers, $\mathbb{R}$, are measurable.
1. Let $A$ be any set of real numbers.
2. By Proposition 1, $m^*\emptyset = 0$; therefore, $m^*A = m^*\emptyset + m^*A$.
3. Since $\emptyset = (A \cap \emptyset)$, and $A = (A \cap \mathbb{R}) = (A \cap {\sim}\emptyset)$, by substituting into statement 2, it can be inferred that $m^*A = m^*(A \cap \emptyset) + m^*(A \cap {\sim}\emptyset)$.
4. Therefore, the empty set, $\emptyset$, is measurable.
5. It should be noted that the definition of measurability is symmetric in $E$ and ${\sim}E$, which means that ${\sim}E$ is measurable whenever $E$ is measurable. Since ${\sim}\emptyset = \mathbb{R}$ and ${\sim}\mathbb{R} = \emptyset$, the set of real numbers is also measurable.
6. Combining the fact that, for any sets of real numbers $A$ and $E$, the set $A$ is equal to $(A \cap E) \cup (A \cap {\sim}E)$ with the proposition that outer measure is countably subadditive, it can be concluded that $m^*A \le m^*(A \cap E) + m^*(A \cap {\sim}E)$ is true for all sets of real numbers, $A$ and $E$.
It can now be inferred that, in order to prove that a set $E$ is measurable, it is only required to demonstrate that $m^*A \ge m^*(A \cap E) + m^*(A \cap {\sim}E)$.

The following theorems establish that outer measure, restricted to a class of measurable sets, is a translation invariant, countably additive set function, and that the measure of an interval is the length of the interval.

Theorem 2: If $E$ is a set of real numbers and $m^*E = 0$, then $E$ is measurable (Royden, 58).
1. Let $A$ be any set of real numbers; then $(A \cap E) \subset E$.
2. Therefore, $m^*(A \cap E) \le m^*E$ and, since $m^*E = 0$ and outer measure is never negative, it can be inferred that $m^*(A \cap E) = 0$.
3. Since $(A \cap {\sim}E) \subset A$, $m^*(A \cap {\sim}E) \le m^*A$.
4. Reversing this inequality and adding 0, $m^*A \ge m^*(A \cap {\sim}E) + 0$, and substituting from statement 2, $m^*A \ge m^*(A \cap {\sim}E) + m^*(A \cap E)$.
5. Therefore the set $E$ is measurable.

Theorem 3: If $E_1$ and $E_2$ are measurable sets of real numbers, so is $(E_1 \cup E_2)$ (Royden, 59).
1. Let $A$ be any set of real numbers.
2. Since $E_2$ is measurable, $m^*(A \cap {\sim}E_1) = m^*(A \cap {\sim}E_1 \cap E_2) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$.
3. Since $A \cap (E_1 \cup E_2) = (A \cap E_1) \cup (A \cap E_2 \cap {\sim}E_1)$, then $m^*(A \cap (E_1 \cup E_2)) \le m^*(A \cap E_1) + m^*(A \cap E_2 \cap {\sim}E_1)$.
4. Adding $m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$ to both sides of statement 3: $m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*(A \cap E_1) + m^*(A \cap E_2 \cap {\sim}E_1) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$.
5. Substituting statement 2 into statement 4, $m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*(A \cap E_1) + m^*(A \cap {\sim}E_1)$.
6. By definition of the measurability of $E_1$, $m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*A$.
7. Since $({\sim}E_1 \cap {\sim}E_2) = {\sim}(E_1 \cup E_2)$, statement 6 becomes $m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}(E_1 \cup E_2)) \le m^*A$.
8. By the definition of measurability, $(E_1 \cup E_2)$ is measurable.

Theorem 4: The family of measurable sets, $\mathcal{M}$, is an algebra of sets (Royden, 59).
1. Let the sets $E_1$ and $E_2$ be members of the family $\mathcal{M}$ of measurable sets.
2. The set $E_1$ being measurable implies that ${\sim}E_1$ is also measurable and is therefore contained in $\mathcal{M}$.
3.
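As demonstrated by Theorem 3, $(E_1 \cup E_2)$ is measurable and is also a member of $\mathcal{M}$.
4. By definition, the family of measurable sets $\mathcal{M}$ is an algebra of sets.

Theorem 5: If $A$ is any set, and if $E_1, \ldots, E_n$ is a finite sequence of disjoint measurable sets, then $m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right]\right) = \sum_{i=1}^{n} m^*(A \cap E_i)$ (Royden, 59).
This theorem will be proven by induction.
1. If $n = 1$, $m^*(A \cap E_1) = \sum_{i=1}^{1} m^*(A \cap E_i) = m^*(A \cap E_1)$.
2. Assume the induction hypothesis: $m^*\left(A \cap \left[\bigcup_{i=1}^{n-1} E_i\right]\right) = \sum_{i=1}^{n-1} m^*(A \cap E_i)$.
3. Since $E_n$ is measurable, $m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right]\right) = m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right] \cap E_n\right) + m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right] \cap {\sim}E_n\right)$.
4. Since the $E_i$ are disjoint, $\left[\bigcup_{i=1}^{n} E_i\right] \cap E_n = E_n$ and $\left[\bigcup_{i=1}^{n} E_i\right] \cap {\sim}E_n = \bigcup_{i=1}^{n-1} E_i$. Hence $m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right]\right) = m^*(A \cap E_n) + \sum_{i=1}^{n-1} m^*(A \cap E_i)$, by statement 2.
5. $m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right]\right) = \sum_{i=1}^{n} m^*(A \cap E_i)$.

Theorem 6: The collection $\mathcal{M}$ of measurable sets is a $\sigma$-algebra (Royden, 60).
1. Theorem 4 demonstrated that $\mathcal{M}$ is an algebra of sets. Thus, to prove that $\mathcal{M}$ is a $\sigma$-algebra, all that must be verified is that any union of a countable collection of measurable sets is measurable. In other words, it must be shown that if $A_1, A_2, \ldots \in \mathcal{M}$, then $A_1 \cup A_2 \cup \ldots \in \mathcal{M}$, or $E = \bigcup_{i=1}^{\infty} A_i \in \mathcal{M}$.
2. Let $E_1 = A_1$, $E_2 = A_2 \setminus A_1$, $E_3 = A_3 \setminus (A_1 \cup A_2)$, $E_4 = A_4 \setminus (A_1 \cup A_2 \cup A_3)$, ..., $E_n = A_n \setminus (A_1 \cup A_2 \cup \ldots \cup A_{n-1})$.
3. Each $E_i \in \mathcal{M}$, $E_i \cap E_j = \emptyset$ for $i \neq j$, $\bigcup_{i=1}^{n} E_i = \bigcup_{i=1}^{n} A_i$ for any $n$, and $\bigcup_{i=1}^{\infty} E_i = \bigcup_{i=1}^{\infty} A_i$.
4. Let $A$ be any subset of real numbers and let $F_n$ be defined as $\bigcup_{i=1}^{n} E_i$. (Note that $F_n \in \mathcal{M}$ and $F_n \subset E$, which implies that ${\sim}E \subset {\sim}F_n$.)
5. Since $F_n$ is measurable, $m^*A = m^*(A \cap F_n) + m^*(A \cap {\sim}F_n)$.
6. Since ${\sim}E \subset {\sim}F_n$, $m^*A \ge m^*(A \cap F_n) + m^*(A \cap {\sim}E)$, that is, $m^*A \ge m^*\left(A \cap \bigcup_{i=1}^{n} E_i\right) + m^*(A \cap {\sim}E)$.
7. Using Theorem 5, $m^*A \ge \sum_{i=1}^{n} m^*(A \cap E_i) + m^*(A \cap {\sim}E)$.
8. The left side of statement 7 does not depend on $n$, so letting $n \to \infty$ gives $m^*A \ge \sum_{i=1}^{\infty} m^*(A \cap E_i) + m^*(A \cap {\sim}E)$.
9. Using statements 1 and 3, $E = \bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} E_i$, it is concluded that $A \cap E = \left(\bigcup_{i=1}^{\infty} E_i\right) \cap A$, which implies that $m^*(A \cap E) = m^*\left(\left(\bigcup_{i=1}^{\infty} E_i\right) \cap A\right)$.
10.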
Since outer measure is countably subadditive, $m^*(A \cap E) = m^*\left(\left(\bigcup_{i=1}^{\infty} E_i\right) \cap A\right) \le \sum_{i=1}^{\infty} m^*(E_i \cap A)$; therefore, $m^*(A \cap E) \le \sum_{i=1}^{\infty} m^*(E_i \cap A)$.
11. By substitution into statement 8, $m^*A \ge m^*(A \cap E) + m^*(A \cap {\sim}E)$, which makes the set $E$ measurable, and it is concluded that $\mathcal{M}$ is a $\sigma$-algebra.

Theorem 7: The interval $(a, \infty)$ is measurable (Royden, 60).
1. Let $A$ be any set and define $A_1 = A \cap (a, \infty)$ and $A_2 = A \cap (-\infty, a]$. As an example, let $a = 1$ and $A = [-10, 10]$. Then $A_1 = [-10, 10] \cap (1, \infty) = (1, 10]$ and $A_2 = [-10, 10] \cap (-\infty, 1] = [-10, 1]$.
2. For $(a, \infty)$ to be measurable, it must be shown that $m^*A = m^*(A \cap (a, \infty)) + m^*(A \cap {\sim}(a, \infty))$, which is equivalent to $m^*A = m^*(A \cap (a, \infty)) + m^*(A \cap (-\infty, a])$, or $m^*A = m^*A_1 + m^*A_2$. Since $m^*A \le m^*A_1 + m^*A_2$, all that needs to be shown is that $m^*A \ge m^*A_1 + m^*A_2$.
3. If $m^*A = \infty$, then the theorem is proven.
4. If $m^*A < \infty$, then, given $\varepsilon > 0$, there is a countable collection $\{I_n\}$ of open intervals which cover $A$ and for which $\sum l(I_n) \le m^*A + \varepsilon$.
5. Let $I_n' = I_n \cap (a, \infty)$ and $I_n'' = I_n \cap (-\infty, a]$. Then $I_n'$ and $I_n''$ are intervals (possibly empty sets) and $l(I_n) = l(I_n') + l(I_n'') = m^*(I_n') + m^*(I_n'')$.
6. Since $A_1 \subset \bigcup I_n'$, $m^*A_1 \le m^*\left(\bigcup I_n'\right) \le \sum m^*(I_n')$. This inequality follows from the fact that outer measure is countably subadditive.
7. By a similar argument, $A_2 \subset \bigcup I_n''$ and $m^*A_2 \le m^*\left(\bigcup I_n''\right) \le \sum m^*(I_n'')$.
8. Combining statements 6 and 7, $\sum [m^*(I_n') + m^*(I_n'')] \ge m^*A_1 + m^*A_2$.
9. Substituting from statement 5, $\sum l(I_n) \ge m^*A_1 + m^*A_2$, and from statement 4, $m^*A + \varepsilon \ge m^*A_1 + m^*A_2$, which makes $m^*A \ge m^*A_1 + m^*A_2$, since $\varepsilon$ is arbitrary.
10. Therefore, the interval $(a, \infty)$ is measurable.

At this point in the development of measure theory, it is necessary to consider more general types of sets than the open and closed sets.
This is required because the intersection of any collection of closed sets is closed, and the union of any finite collection of closed sets is closed; however, the union of a countable collection of closed sets is not necessarily closed. As an example, consider the set of rational numbers. The set of rational numbers is the union of a countable collection of closed sets, each of which contains exactly one number. But the union of these closed sets is not closed. Thus, if $\sigma$-algebras of sets that contain all of the closed sets are to be considered, a more general type of set must be specified (Royden, 52).

Definition: The Borel sets, $\mathcal{B}$, of $\mathbb{R}$ are the smallest family of subsets of $\mathbb{R}$ with the following three properties:
(i) The family is closed under complements. If $B \in \mathcal{B}$, then ${\sim}B \in \mathcal{B}$.
(ii) The family is closed under countable unions. If $B_1, B_2, \ldots \in \mathcal{B}$, then $\bigcup_{i=1}^{\infty} B_i \in \mathcal{B}$.
(iii) The family contains each open interval.

Since the family of all subsets of $\mathbb{R}$ is a $\sigma$-algebra that satisfies (i), (ii), and (iii) above, the collection of all such $\sigma$-algebras is non-empty. To see that the smallest family exists, consider that if $\{\mathcal{B}_\alpha\}$ is a non-empty collection of families satisfying the three conditions stated above, then so does $\bigcap_\alpha \mathcal{B}_\alpha$, and the intersection of all such families is the smallest family that satisfies all the conditions (Reed and Simon, 14-15). The reader should note that, from statements (i) and (ii) of the definition, the Borel sets form a $\sigma$-algebra. The only properties of Borel sets needed in this paper follow from the fact that they form the smallest $\sigma$-algebra containing the open and closed sets (Royden, 53).

Theorem 8: Every Borel set is measurable; specifically, each open set and each closed set is measurable (Royden, 61).
1. By Theorem 6, the collection $\mathcal{M}$ of measurable sets is a $\sigma$-algebra. By Theorem 7, $(a, \infty) \in \mathcal{M}$, which means ${\sim}(a, \infty) = (-\infty, a] \in \mathcal{M}$.
2.
Since $\mathcal{M}$ is a $\sigma$-algebra, the union, as well as the intersection, of a countable collection of sets in $\mathcal{M}$ is also an element of $\mathcal{M}$. Each open interval of the form $(-\infty, b)$ can be represented as $\bigcup_{n=1}^{\infty} \left(-\infty, b - \tfrac{1}{n}\right]$, which makes $(-\infty, b)$ an element of $\mathcal{M}$.
3. Every open interval of the form $(a, b)$ can be represented as $(-\infty, b) \cap (a, \infty)$. This implies that all open intervals are contained in $\mathcal{M}$. Since $\mathcal{M}$ contains all of the open intervals, and since $\mathcal{M}$ is a $\sigma$-algebra, $\mathcal{M}$ contains the smallest $\sigma$-algebra containing all the open intervals; therefore $\mathcal{M}$ contains the Borel sets.

The following theorem states that outer measure is countably additive on any sequence of disjoint sets drawn from the family $\mathcal{M}$ of measurable sets.

Theorem 9: If $\{E_i\}$ is a sequence of measurable sets and $E_i \cap E_j = \emptyset$ for $i \neq j$, then $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} m^*E_i$ (Royden, 62).
1. Proposition 2 on page 14 demonstrated that outer measure is a subadditive set function, that is, $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) \le \sum_{i=1}^{\infty} m^*E_i$. Therefore, to complete the proof of this theorem, it is only necessary to prove that $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) \ge \sum_{i=1}^{\infty} m^*E_i$.
2. Theorem 5 proved that, for any set $A$, $m^*\left(A \cap \left[\bigcup_{i=1}^{n} E_i\right]\right) = \sum_{i=1}^{n} m^*(A \cap E_i)$. If $A$ is the set of real numbers, $\mathbb{R}$, then $m^*\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} m^*(E_i)$, and it can be inferred that outer measure, on $\mathcal{M}$, is a finitely additive set function.
3. If $\{E_i\}$ is an infinite sequence of disjoint measurable sets, then $\bigcup_{i=1}^{\infty} E_i \supset \bigcup_{i=1}^{n} E_i$, so that $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) \ge m^*\left(\bigcup_{i=1}^{n} E_i\right)$.
4. By substitution from statement 2, $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) \ge \sum_{i=1}^{n} m^*(E_i)$.
5. The left side of the inequality in statement 4 is independent of $n$, and the limit, as $n \to \infty$, of $\sum_{i=1}^{n} m^*(E_i)$ is $\sum_{i=1}^{\infty} m^*E_i$, which implies $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) \ge \sum_{i=1}^{\infty} m^*E_i$.
6. Combining statements 1 and 5, $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} m^*E_i$.
It can now be concluded that if outer measure is restricted to the family $\mathcal{M}$ of measurable sets, it will be a countably additive set function which is translation invariant (Royden, 58), and has the property that the measure of an interval is the length of the interval (Royden, 56). If $E$ is a measurable set, the Lebesgue measure, $mE$, is defined to be the outer measure of $E$. Lebesgue measure is, therefore, outer measure restricted to the collection of measurable sets.

MEASURE SPACES

A measurable space is usually represented as a couple, $(X, \mathcal{B})$, consisting of a set $X$ and a $\sigma$-algebra $\mathcal{B}$ of subsets of $X$. Since $\mathcal{B}$ is a $\sigma$-algebra, $\mathcal{B}$ is a family of subsets of $X$ which contains the empty set, and is closed with respect to complements and with respect to countable unions. A subset $A$ of $X$ is defined to be measurable if $A \in \mathcal{B}$. A measure, $\mu$, on the measurable space $(X, \mathcal{B})$ is a nonnegative set function defined for all sets of $\mathcal{B}$ and satisfying the following conditions:
(i) $\mu(\emptyset) = 0$
(ii) $\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu A_i$, for any sequence $A_i$ of disjoint measurable sets in $\mathcal{B}$.

A measure space, represented by the triple $(X, \mathcal{B}, \mu)$, is a measurable space with a measure $\mu$ defined on $\mathcal{B}$ (Royden, 254). An example of a measure space is $(\mathbb{R}, \mathcal{M}, m)$, where $\mathbb{R}$ is the set of real numbers, $\mathcal{M}$ is the family of measurable sets of real numbers, and $m$ is Lebesgue measure. Another example would be $(\mathbb{R}, \mathcal{B}, m)$, where $\mathbb{R}$ is again the set of real numbers, $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, and $m$ is again Lebesgue measure.

A measure space $(X, \mathcal{B}, \mu)$ is defined to be complete if $\mathcal{B}$ contains all subsets of sets of measure zero. This means that if $B \in \mathcal{B}$, $\mu B = 0$, and $A \subset B$, then $A \in \mathcal{B}$. The measure space $(\mathbb{R}, \mathcal{M}, m)$ is complete, since $\mathcal{M}$ contains all subsets of sets of measure zero, while the measure space $(\mathbb{R}, \mathcal{B}, m)$, where $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, is not.
To see this, the reader should note that, by Theorem 8, every Borel set on the real line is Lebesgue measurable and that there are Lebesgue measurable sets of measure zero which are not Borel sets (Munroe, 148-149). It can also be shown that any Lebesgue measurable set of measure zero is contained in a Borel set of measure zero (Munroe, 97-98). Thus, the $\sigma$-algebra $\mathcal{B}$ in $(\mathbb{R}, \mathcal{B}, m)$ does not contain all subsets of sets of measure zero, and the space is therefore not complete.

It can be proven that each measure space can be completed by the addition of subsets of sets of measure zero. This is called the completion of the measure space. Formally, the completion of the measure space $(X, \mathcal{B}, \mu)$ is the measure space $(X, \mathcal{B}_0, \mu_0)$ where
(i) $\mathcal{B} \subset \mathcal{B}_0$
(ii) If $E \in \mathcal{B}$, then $\mu E = \mu_0 E$
(iii) $E \in \mathcal{B}_0$ if and only if $E = A \cup B$, where $B \in \mathcal{B}$ and $A \subset C$, $C \in \mathcal{B}$, $\mu C = 0$.

Thus, the completion of the measure space $(\mathbb{R}, \mathcal{B}, m)$, where $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, is the measure space $(\mathbb{R}, \mathcal{M}, m)$, where $\mathcal{M}$ is the family of Lebesgue measurable sets (Royden, 257). This provides an alternative description of the Lebesgue measurable sets. A set is Lebesgue measurable if, and only if, it is the union of a Borel set and a subset of a Borel set with Lebesgue measure zero.

With the foundations of measure theory now complete, it is time to return for a closer look at the Riemann integral and its shortcomings, and to develop a more general form of integration as defined by the Lebesgue integral.

Chapter 4

Integration

Riemann Integration

An elegant formulation of the Riemann integral can be obtained by the use of step functions and, to introduce this formulation, it is necessary to establish a few working definitions. To begin with, let $f(x)$ be a real valued, bounded function on the interval $[a, b]$ and let $x_0, \ldots, x_n$ be a partition of $[a, b]$ such that $x_0 = a$ and $x_n = b$.
Let $M_i$ be defined as $\sup\{f(x) : x_{i-1} \le x \le x_i\}$ and let $m_i$ be defined as $\inf\{f(x) : x_{i-1} \le x \le x_i\}$; then the upper sum is designated as $S = \sum_{i=1}^{n} M_i (x_i - x_{i-1})$, and the lower sum is denoted as $s = \sum_{i=1}^{n} m_i (x_i - x_{i-1})$. The expressions $s$ and $S$ are called Darboux sums, and from their definitions, it can be seen that $s \le S$ (Cronin-Scanlon, 127-128). The upper Riemann integral of $f(x)$ is defined as
$$R\overline{\int_a^b} f(x)\,dx = \inf S,$$
and the lower Riemann integral of $f(x)$ is defined as
$$R\underline{\int_a^b} f(x)\,dx = \sup s,$$
with the infimum and supremum taken over all possible subdivisions of $[a, b]$. If the upper and lower Riemann integrals are equal, then $f(x)$ is defined to be Riemann integrable, and the common value is denoted as $R\int_a^b f(x)\,dx$ (Royden, 76).

If $x_0, \ldots, x_n$ is a partition of $[a, b]$ such that $x_0 = a$ and $x_n = b$, then a step function, $\psi$, has the form $\psi(x) = c_i$, where $x_{i-1} < x < x_i$ and $c_i$ is a constant. For example, suppose $n = 3$, $c_1 = 2$, $c_2 = 3$, $c_3 = 1$; then the graph of the step function formed would be represented as shown in Figure 6.

[Figure 6: graph of the step function with $c_1 = 2$, $c_2 = 3$, $c_3 = 1$ on $[a, b]$]

From the definition of the step function, it can be seen that $\int_a^b \psi(x)\,dx$ is equal to $\sum_{i=1}^{n} c_i (x_i - x_{i-1})$, which implies that
$$R\overline{\int_a^b} f(x)\,dx = \inf \int_a^b \psi(x)\,dx, \text{ for all step functions } \psi(x) \ge f(x),$$
and
$$R\underline{\int_a^b} f(x)\,dx = \sup \int_a^b \psi(x)\,dx, \text{ for all step functions } \psi(x) \le f(x)$$
(Royden, 76). This means that if $f(x)$ is Riemann integrable, its integral can be expressed in terms of the integrals of step functions, and this formulation represents the intuitive notion of the Riemann integral as presented in elementary calculus.

As demonstrated in Chapter 2, the function defined by
$$f(x) = \begin{cases} 1 & \text{if } x \text{ is irrational} \\ 0 & \text{if } x \text{ is rational} \end{cases}$$
for $x \in [0, 1]$ is not a Riemann integrable function. This example illustrates the shortcomings of Riemann integration and shows the need to develop a more general form of integration.
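The Darboux sums defined above can be sketched numerically. The following is a minimal illustration only, under the assumption that $f$ is nondecreasing on $[a, b]$, so that the infimum and supremum on each subinterval are attained at its left and right endpoints. The function name `darboux_sums`, the uniform partition, and the test function $x^2$ are choices made here for illustration and are not taken from the sources cited.

```python
def darboux_sums(f, a, b, n):
    """Return (s, S): lower and upper Darboux sums of f on [a, b].

    Assumes f is nondecreasing, so on each cell [x_{i-1}, x_i] the
    infimum m_i is f(x_{i-1}) and the supremum M_i is f(x_i).
    A uniform partition with n cells is used.
    """
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    lower = sum(f(points[i - 1]) * width for i in range(1, n + 1))
    upper = sum(f(points[i]) * width for i in range(1, n + 1))
    return lower, upper


def square(x):
    return x * x


# Refining the partition squeezes s and S together around the integral.
for n in (10, 100, 1000):
    s, S = darboux_sums(square, 0.0, 1.0, n)
    print(n, s, S, S - s)
```

For a nondecreasing function on a uniform partition the gap $S - s$ telescopes to $(f(b) - f(a))(b - a)/n$, which is why the two sums close in on a common value as $n$ grows. No such squeeze is possible for the function discussed next, whose lower sums are all 0 and whose upper sums are all 1.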
The required general form was first introduced by Henri Lebesgue in his doctoral thesis, presented at the Sorbonne in 1902 (Gillispie, 110).

Lebesgue Integration

Henri Léon Lebesgue was born in Beauvais, France, on June 28, 1875. Lebesgue studied at the École Normale Supérieure from 1894 to 1897, and his first university position was at Rennes, from 1902 to 1906. After his studies at the École Normale Supérieure, Lebesgue became acquainted with the research of another graduate of the École, René Baire. Baire's studies included the theory of discontinuous functions of a real variable, which helped focus Lebesgue on the deficiencies of Riemann integration. In his doctoral thesis, Lebesgue began the development of a theory of integration which, as he demonstrated in his presentation at the Sorbonne, included all the bounded, discontinuous functions introduced by Baire.

By 1922, Lebesgue had produced nearly ninety books and papers. Much of his work dealt with his theory of integration, but he also presented significant work in the calculus of variations, the theory of surface area, and the structures of sets and functions. In 1922, he was elected to the Académie des Sciences and, during the last twenty years of his life, he remained active in his studies of pedagogy, elementary geometry, and history. Lebesgue died in Paris on July 26, 1941 (Gillispie, 110-112, and Lebesgue, 1-5).

To begin the study of Lebesgue integration, it is necessary to have a function that has a value of one on a measurable set and zero elsewhere. The integral of this function should, therefore, be equal to the measure of the set. The function that satisfies these conditions, on the measurable set $E$, is called the characteristic function, $\chi_E$, and is defined as follows:
$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases}$$

As introduced in Chapter 2, the inverse image, $f^{-1}(A)$, is defined as the set $\{x : f(x) \in A\}$.
Equivalently, the inverse image is the set of points in the domain of the function which are mapped into $A$ by $f$. An extended real valued function, $f$, is defined to be measurable, or Lebesgue measurable, if its domain is measurable and $f$ satisfies one of the following conditions:
(i) For each real number $\alpha$, the set $\{x : f(x) > \alpha\}$ is measurable.
(ii) For each real number $\alpha$, the set $\{x : f(x) \ge \alpha\}$ is measurable.
(iii) For each real number $\alpha$, the set $\{x : f(x) < \alpha\}$ is measurable.
(iv) For each real number $\alpha$, the set $\{x : f(x) \le \alpha\}$ is measurable.
The reader should note that the above conditions are equivalent to one another (Royden, 66).

It should be observed that continuous functions with measurable domains are measurable. All step functions are also measurable. If $f$ is a measurable function and $E$ is a measurable subset of the domain of $f$, then the function obtained by restricting $f$ to $E$ is measurable (Royden, 77).

Suppose $E_1, \ldots, E_n$ is a finite class of disjoint, measurable sets, and $\{a_1, \ldots, a_n\}$ is a set of finite real numbers; then the real valued function $\varphi$, defined by the relation
$$\varphi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x),$$
is called a simple function. A simple function is a function that assumes a finite number of values and assumes each of these values on a measurable set (Munroe, 155). It should also be noted that a function $\varphi$ is simple if, and only if, it is measurable and assumes only a finite number of values (Royden, 77).

The basic idea of Lebesgue integration is to replace the step functions of Riemann integration with simple functions. But, before this can be accomplished, a few facts about simple functions need to be established. One of the more useful portrayals of the simple function is called the canonical representation and is defined as follows: Suppose $\varphi$ is a simple function and $\{a_1, \ldots, a_n\}$ is the set of nonzero values of $\varphi$; then $\varphi = \sum_{i=1}^{n} a_i \chi_{A_i}$, where $A_i = \{x : \varphi(x) = a_i\}$.
The canonical representation of $\varphi$ is characterized by the fact that the $A_i$ are disjoint and the $a_i$ are distinct and nonzero. If $\varphi$ vanishes outside a set of finite measure, then, by definition,
$$\int \varphi(x)\,dx = \sum_{i=1}^{n} a_i\,(mA_i),$$
where $mA_i$ is the Lebesgue measure of $A_i$, and $\varphi$ has the canonical representation $\varphi = \sum_{i=1}^{n} a_i \chi_{A_i}$. This integral is usually abbreviated as $\int \varphi$. If $E$ is any measurable set, then $\int_E \varphi$ is defined as $\int \varphi \cdot \chi_E$ (Royden, 77).

As an example, suppose $\varphi(x) = 3\chi_{[0,1]} + 2\chi_{[2,5]}$. If $[0, 1] \cup [2, 5] \subset E$, then
$$\int_E \varphi\,dx = \int_E \left(3\chi_{[0,1]} + 2\chi_{[2,5]}\right) dx = \int_E 3\chi_{[0,1]}\,dx + \int_E 2\chi_{[2,5]}\,dx = 3\,m[0, 1] + 2\,m[2, 5] = 3(1) + 2(3) = 9.$$

The following proposition brings the idea of simple functions a little closer to the more familiar step functions: Suppose $f$ is a bounded, real valued function defined on a measurable set $E$ of finite measure. If $\varphi$ and $\psi$ are simple functions, then, by analogy with the Riemann integral, it can be concluded that $\inf \int_E \psi(x)\,dx$, for $f \le \psi$, is equal to $\sup \int_E \varphi(x)\,dx$, for $f \ge \varphi$, if, and only if, $f$ is a measurable function. A proof of this proposition is provided by Royden, pages 79-81.

The Lebesgue integral for non-negative functions can now be defined as follows: If $f$ is a non-negative measurable function defined on a measurable set $E$ where $mE$ is finite, the Lebesgue integral of $f$ over $E$, denoted as $\int_E f(x)\,dx$, is defined as the supremum of $\int_E \varphi(x)\,dx$, where the supremum is over all simple functions $\varphi \le f$ which vanish outside of a set of finite measure. This integral is sometimes expressed as $\int_E f$, or, if $E = [a, b]$, the integral would be written as $\int_a^b f(x)\,dx$. Any non-negative measurable function is said to be "integrable" if $\int f(x)\,dx < \infty$.

Before defining a Lebesgue integrable function, two further definitions must be introduced. The positive part, $f^+(x)$, of the function $f(x)$, is defined as $f^+(x) = \max\{f(x), 0\}$.
In a similar manner, the negative part, $f^-(x)$, of the function $f(x)$, is defined as $f^-(x) = \max\{-f(x), 0\}$. It should be observed that both $f^+(x)$ and $f^-(x)$ are non-negative functions and that $f(x) = f^+(x) - f^-(x)$. Thus, if $f(x)$ is measurable, then $f^+(x)$ and $f^-(x)$ are both measurable and non-negative. This leads to the following definition:

Definition: A measurable function $f(x)$ is defined to be Lebesgue integrable over the set $E$ if $f^+(x)$ and $f^-(x)$ are both integrable over $E$. If they are integrable, then
$$\int_E f(x)\,dx = \int_E f^+(x)\,dx - \int_E f^-(x)\,dx$$
(Royden, 89-90).

In considering both Riemann and Lebesgue integration, what are the conditions for a function to be integrable? It has been shown that a necessary and sufficient condition for a function to be Riemann integrable is that the function must be bounded and continuous almost everywhere. This means that, for example, a bounded function with a countable number of discontinuities is Riemann integrable. A necessary and sufficient condition for a function $f$ to be Lebesgue integrable is that $f$ be measurable and both $f^+(x)$ and $f^-(x)$ have finite integrals. The following proposition establishes that the Lebesgue integral is actually a generalization of the Riemann integral (Munroe, 177).

Proposition: If a bounded function $f$ is Riemann integrable on $[a, b]$, then it is Lebesgue integrable on $[a, b]$, and $R\int_a^b f(x)\,dx = \int_a^b f(x)\,dx$.
1. Since $f$ is Riemann integrable, $R\underline{\int_a^b} f(x)\,dx = R\overline{\int_a^b} f(x)\,dx$.
2. But every step function is also a simple function, and
$$R\underline{\int_a^b} f(x)\,dx \le \sup_{\varphi \le f} \int \varphi(x)\,dx \le \inf_{f \le \psi} \int \psi(x)\,dx \le R\overline{\int_a^b} f(x)\,dx.$$
3. Therefore, $\sup_{\varphi \le f} \int \varphi(x)\,dx = \inf_{f \le \psi} \int \psi(x)\,dx$, so $f$ must be Lebesgue integrable, and $R\int_a^b f(x)\,dx = \int_a^b f(x)\,dx$.

An alternative proof of this proposition is provided in Cronin-Scanlon, on pages 197-198. Before illustrating some of the uses of Lebesgue integration, the following definitions from analysis should be reviewed.
1.
A sequence of functions $\{f_n\}$ defined on a set $E$ is said to converge pointwise on $E$ to a function $f(x)$ if, for all $x$ in $E$, $\lim_{n \to \infty} f_n(x) = f(x)$. This means that, given an $x \in E$ and an arbitrary $\varepsilon > 0$, there is an $N$ such that for all $n \ge N$, $|f_n(x) - f(x)| < \varepsilon$ (Royden, 49).
2. A sequence of bounded functions $\{f_n\}$ defined on a set $E$ is said to converge uniformly to $f(x)$ on $E$ if $\lim_{n \to \infty} \sup_{x \in E} |f_n(x) - f(x)| = 0$ (Beals, 47).

[Figure 7: graphs of $f_1(x) = x$, $f_2(x) = x^2$, and $f_{10}(x) = x^{10}$ on $[0, 1)$]

The sequence of functions $f_n(x) = x^n$ on the interval $[0, 1)$ is an example of a sequence of functions that converges pointwise but does not converge uniformly on $[0, 1)$. The functions $f_1(x) = x$, $f_2(x) = x^2$, and $f_{10}(x) = x^{10}$ are illustrated in Figure 7. The sequence $f_n(x) = x^n$ converges to 0 pointwise, since $\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} x^n = 0 = f(x)$ for all $x \in [0, 1)$. To converge uniformly, $\lim_{n \to \infty} \sup |f_n(x) - f(x)|$ must equal 0. However, $\lim_{n \to \infty} \sup |f_n(x) - f(x)| = \lim_{n \to \infty} \sup |x^n - 0| = \lim_{n \to \infty} \sup |x^n| = 1 \neq 0$, and it is concluded that the sequence does not converge uniformly. This example will be used to illustrate a use of Lebesgue integration after the introduction of one more definition and a theorem.

A property is said to hold almost everywhere (a.e.) if the set of points where it fails to hold is a set of measure zero (Royden, 69). For example, functions $f$ and $g$ are equal a.e. if $f$ and $g$ have the same domain, and the set of points where $f(x) \neq g(x)$ has measure zero.

The Lebesgue Dominated Convergence Theorem, first published by Lebesgue in 1908, states: If $f_n(x)$ is a sequence of measurable functions and there exists an integrable function $G(x)$ such that $|f_n(x)| \le G(x)$ a.e., and the limit of $f_n(x)$, as $n \to \infty$, is equal to $f(x)$ a.e., then the limit, as $n \to \infty$, of $\int_a^b f_n(x)\,dx$ is equal to $\int_a^b f(x)\,dx$ (Hawkins, 118, and Reed-Simon, 24).

Using the previous example, $f_n(x) = x^n$, it has been established that $f_n(x)$ converges pointwise to 0, but does not converge uniformly. Therefore, $f(x) = 0$. Since $|x^n| \le 1$ on $[0, 1]$, let $G(x) = 1$.
By the Lebesgue Dominated Convergence Theorem, $\lim_{n \to \infty} \int_0^1 x^n\,dx = \int_0^1 0\,dx = 0$.

The Fundamental Theorem of Calculus

Both Isaac Newton (1642-1727) and Gottfried Leibniz (1646-1716) were familiar with the idea of relating integrals as sums with the concept of integrals as anti-derivatives (Bell, 153). Newton even introduced the notion of changing the upper bound of an integral to a free variable (Burton, 368). However, it wasn't until 1823 that the first statement of the Fundamental Theorem of Calculus was presented by Augustin Cauchy (1789-1857). Using the ideas of continuity and uniform continuity, Cauchy defined the "definite integral" of a continuous function on the closed interval $[a, b]$, and coined the term. In his statement of the fundamental theorem, Cauchy considered a function $f$, continuous on the interval $[a, b]$, and defined the function $F(x)$ as $F(x) = \int_a^x f$. Cauchy also established that $F(x)$ is both continuous and differentiable. Cauchy's statement of the Fundamental Theorem of Calculus takes the form of three theorems:

Theorem 1: $F$ is a primitive function for $f$; that is, $F' = f$.
Theorem 2: All primitive functions of $f$ must be of the form $\int_a^x f + C$, where $C$ denotes a constant; that is, if $G$ is a function with a continuous derivative $G'$, then $\int_a^x G' = G(x) - G(a)$.
Theorem 3: If $G$ is a function such that $G'(x) = 0$ for all $x$ in $[a, b]$, then $G(x)$ remains constant there.

The reader should note that Theorem 3 is an immediate consequence of the Mean Value Theorem and was used by Cauchy to prove Theorem 2. In 1849, using the idea of limits, Cauchy refined his version of the fundamental theorem by including functions with a finite number of discontinuities on the interval $[a, b]$, and he extended his definition of the integral to unbounded functions (Hawkins, 9-12).

In analyzing the Fundamental Theorem of Calculus with regard to both Riemann and Lebesgue integration, two basic questions arise.
1.
When does $\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$?
If $f$ is Riemann integrable on $[a, b]$, then the relation holds at each point $x$ in $[a, b]$ at which $f$ is continuous. If $f$ is continuous on $[a, b]$, then the relation holds for all points of $[a, b]$ (Fulks, 156-157). If $f(x)$ is Lebesgue integrable, this relation holds for almost all values of $x$ (Royden, 97). The reader should note that, for Lebesgue integration, there is no requirement that $f$ be continuous at $x$. This relation leads to the key observation that, whenever $f$ is continuous, the integral $\int_a^x f(t)\,dt$ is a differentiable function of $x$. As illustrations of functions that are defined only in terms of an integral, consider first the natural logarithm of a positive number. The natural logarithm is defined in the following form:
$$\ln x = \int_1^x \frac{1}{t}\,dt \quad \text{for all } x > 0.$$
As a second example, consider the sine integral:
$$\operatorname{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt,$$
which appears in various engineering applications (Thomas, 278).

2. When does $\int_a^b f'(x)\,dx = f(b) - f(a)$?
For Riemann integration, this relation holds as long as $f(x)$ is differentiable on $[a, b]$ and $f'(x)$ is Riemann integrable on $[a, b]$ (Fulks, 158). For the Lebesgue integral, the answer to this question depends on the concept of "absolute continuity." A real valued function $f$, defined on the closed interval $[a, b]$, is defined to be absolutely continuous on $[a, b]$ if, given an $\varepsilon > 0$, there is a $\delta > 0$ such that
$$\sum_{i=1}^{n} |f(x_i') - f(x_i)| < \varepsilon$$
for every finite collection $\{(x_i, x_i')\}$ of non-overlapping intervals with
$$\sum_{i=1}^{n} |x_i' - x_i| < \delta.$$
An absolutely continuous function is clearly continuous. Since any continuous function on a closed interval is uniformly continuous, it follows that an absolutely continuous function is uniformly continuous. However, the converse is not true. Any absolutely continuous function $f$ on $[a, b]$ is differentiable almost everywhere (Royden, 108-109). The property of absolute continuity was first introduced by Axel Harnack (1851-1888) in 1884 (Hawkins, 77).
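The first question can be explored numerically for the natural logarithm. The sketch below is only an illustration under stated assumptions: the integral is approximated by a composite midpoint rule and the derivative by a central difference, and the helper names `integral`, `log_via_integral`, and `derivative` are hypothetical names introduced here.

```python
import math


def integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def reciprocal(t):
    return 1.0 / t


def log_via_integral(x):
    """ln x computed from its integral definition: the integral of 1/t from 1 to x."""
    return integral(reciprocal, 1.0, x)


def derivative(F, x, h=1e-4):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


x = 2.0
print(log_via_integral(x), math.log(x))            # the two values should nearly agree
print(derivative(log_via_integral, x), 1.0 / x)    # d/dx of the integral vs. the integrand
```

Since $1/t$ is continuous for $t > 0$, the relation in question 1 holds at every $x > 0$, and the numerical derivative of the integral should track the integrand $1/x$ to within the approximation error.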
The following proposition establishes the Fundamental Theorem of Calculus as applied to Lebesgue integration:

Proposition: $f(x) = f(a) + \int_a^x f'(t)\,dt$, for all $x \in [a, b]$, if, and only if, $f$ is absolutely continuous on $[a, b]$ (Royden, 110, and Munroe, 268).

As an illustration of the power of the Lebesgue integral over the Riemann integral, consider the function $f(x) = \frac{1}{\sqrt{x}}$ on the interval $(0, 1]$. Since $f(x)$ is unbounded, the Riemann integral does not exist, and, using the method first introduced by Cauchy, the improper Riemann integral must be evaluated using limits (Hawkins, 12). Thus,
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx \quad \text{becomes} \quad \lim_{\varepsilon \to 0} \int_\varepsilon^1 \frac{1}{\sqrt{x}}\,dx,$$
which, because of the limit, becomes a more complex problem than necessary. It can be shown that $F(x) = 2\sqrt{x}$ is absolutely continuous on $[0, 1]$, and therefore, it satisfies the Fundamental Theorem of Calculus for Lebesgue integration, and the integral becomes
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_0^1 = 2.$$
This integral is now a proper Lebesgue integral, and it can be integrated as indicated.

Chapter 5

Applications

Fourier Series

A trigonometric series is an infinite series of the form
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
where $a_0, a_1, \ldots$ and $b_1, b_2, \ldots$ are given constants. If the trigonometric series converges for all $x \in [0, 2\pi]$, then it converges for all real $x$, since, for all $n$, both $\cos nx$ and $\sin nx$ are periodic with period $2\pi$ (Cronin-Scanlon, 217).

In 1822, Joseph Fourier (1768-1830) proposed that any bounded function, $f(x)$, can be expressed in the form of a trigonometric series (Hawkins, 5). In his proofs, Fourier, a mathematical physicist, demonstrated to the pure mathematicians of the time that the notions of "arbitrary" functions, real numbers, and continuity needed rigorous clarification (Bell, 293). Fourier's development is based on the following: Suppose the trigonometric series converges uniformly on $[0, 2\pi]$ to a function $f(x)$, i.e.,
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
for all real $x$ (Cronin-Scanlon, 217).
If $\int_0^{2\pi} f(x)\,dx$ exists, then the Fourier series of $f(x)$ is the trigonometric series given by

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \]

where

\[ a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx \, dx \]

for $n = 0, 1, 2, \ldots$, and

\[ b_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin nx \, dx \]

for $n = 1, 2, \ldots$. The numbers $a_0$, $a_n$, $b_n$ are called the Fourier coefficients of $f(x)$. The Fourier series of $f(x)$ is sometimes specified by:

\[ f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx). \]

The symbol "$\sim$" is used to indicate that the coefficients $a_n$ and $b_n$ are obtained from $f$ by the formulas above. In general, the series to the right of the "$\sim$" may not converge to $f(x)$ for every $x$. However, if the Fourier series does converge to $f(x)$, then the "$\sim$" symbol can be replaced by the equal sign (Cronin-Scanlon, 219-20).

The Fourier series is, of course, a trigonometric series. However, given a trigonometric series, can it be determined whether there is a function $f(x)$ such that the Fourier series of $f(x)$ is the given trigonometric series? Solving this problem led to the discovery, and proof, of the well-known Riesz-Fischer Theorem. Before introducing this theorem, it is necessary that the reader be familiar with the properties of the $L^p$ spaces.

Suppose $p$ is a positive real number and $E$ is a measurable set of real numbers. Then $L^p(E)$ is defined as the set of measurable functions $f$ on $E$ such that $|f|^p$ is integrable. Thus $L^1[0, 1]$ consists of the Lebesgue integrable functions on $[0, 1]$ (Royden, 119). As another example, consider $f(x) = x^{-1/4}$ on $[0, 16]$. It should be recognized that $f \in L^1[0, 16]$ since

\[ \int_0^{16} \left[x^{-1/4}\right]^1 dx = \frac{4}{3}\, x^{3/4}\,\Big|_0^{16} = \frac{32}{3}. \]

However, $f \notin L^4[0, 16]$ since

\[ \int_0^{16} \left[x^{-1/4}\right]^4 dx = \int_0^{16} \frac{1}{x}\,dx \]

and $1/x$ is not integrable on $[0, 16]$.

Since any two functions that are equal almost everywhere on a measurable set $E$ have the same integral on $E$, the distinctions between such functions are insignificant. The relation "equal almost everywhere" is reflexive, symmetric, and transitive; thus it is an equivalence relation.
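The example $f(x) = x^{-1/4}$ on $[0, 16]$ can be pushed one step further to see exactly which exponents $p$ admit $f$. The computation below is a sketch added for illustration; the threshold $p < 4$ is derived here and is not quoted from the cited sources.

```latex
% Case 0 < p < 4, so that 1 - p/4 > 0:
\[
  \int_0^{16} \left| x^{-1/4} \right|^{p} dx
  = \int_0^{16} x^{-p/4} \, dx
  = \frac{x^{\,1 - p/4}}{1 - p/4} \,\Big|_0^{16}
  = \frac{4}{4 - p} \, 16^{\,1 - p/4} < \infty .
\]
% Check with p = 1: (4/3) * 16^{3/4} = (4/3) * 8 = 32/3.
%
% Case p \ge 4: on (0, 1] we have x^{-p/4} \ge x^{-1}, and
\[
  \int_{\varepsilon}^{1} \frac{dx}{x} = -\ln \varepsilon \longrightarrow \infty
  \quad (\varepsilon \to 0^{+}),
\]
% so the integral of x^{-p/4} over [0, 16] is infinite.
% Conclusion: x^{-1/4} belongs to L^p[0, 16] exactly when 0 < p < 4.
```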
The Lebesgue space, denoted as $L^p(E)$, is the collection of equivalence classes of the associated equivalence relation on the set of functions $L^p(E)$ defined above; the same symbol is used for both. Even though the elements of $L^p(E)$ are usually thought of as individual functions, they are actually equivalence classes of functions such that, in each class, any two functions are equal almost everywhere. Suppose $[f]$ denotes the equivalence class in $L^p = L^p(E)$ containing the function $f$. Define addition in $L^p$ by $[f] + [g] = [f + g]$, and scalar multiplication in $L^p$ by $a[f] = [af]$. Using the relation $|f + g|^p \le 2^p(|f|^p + |g|^p)$, it can be shown that the Lebesgue space, $L^p$, is a linear space, or a vector space.

A norm, $\|\cdot\|$, on a vector space $V$ over the real numbers is a function $\|\cdot\| : V \to \mathbb{R}$ which satisfies the following:

(i) $\|f\| \ge 0$, for all $f \in V$.
(ii) $\|f\| = 0$ if, and only if, $f = 0$.
(iii) $\|af\| = |a|\,\|f\|$, for all $a \in \mathbb{R}$ and $f \in V$.
(iv) $\|f + g\| \le \|f\| + \|g\|$, for all $f, g \in V$ (Beals, 70).

Define $\|[f]\|$ as

\[ \|f\| = \left[ \int_E |f|^p \right]^{1/p}; \]

thus $\|f\| \ge 0$, and condition (i) is met. Condition (ii) is satisfied since $L^p$ consists of equivalence classes of functions. For any constant $a \in \mathbb{R}$, condition (iii) is fulfilled, and condition (iv), called subadditivity, can be shown to hold provided $p \ge 1$. Thus, for $p \ge 1$, the $L^p$ spaces are normed linear spaces (Royden, 119).

Any normed linear space becomes a metric space with the metric $\rho(f, g) = \|f - g\|$ (Reed and Simon, 9). A Banach space is a normed linear space (over the real or complex numbers) which is a complete metric space. A normed linear space is complete if every Cauchy sequence in the space converges. A sequence $\{f_n\}$ in a normed linear space is a Cauchy sequence if, given any $\varepsilon > 0$, there is an $N$ such that, for all $n, m \ge N$, $\|f_n - f_m\| < \varepsilon$. A sequence $\{f_n\}$ in a normed linear space is said to converge to an element $f$ in the space if, given an $\varepsilon > 0$, there is an $N$ such that, for all $n > N$, $\|f - f_n\| < \varepsilon$. Convergence in the $L^p$ spaces is referred to as convergence in the mean of order $p$ (Royden, 123-4).

The Riesz-Fischer Theorem simply states that the $L^p$ spaces, $p \ge 1$, are complete; thus, they are Banach spaces. The theorem, first introduced in 1907, was proved by F. Riesz (1880-1956) and Ernst Fischer (1875-1959) independently of each other. Part of the proof employed by Riesz involved the special case for the interval $[0, 2\pi]$ in the $L^2$ (Hilbert) space. He deduced the convergence of trigonometric series as long as $a_n$ and $b_n$ are Fourier coefficients (Hawkins, 174-175). An elegant proof of the Riesz-Fischer Theorem is presented in Royden, on pages 124-125.

From the Riesz-Fischer Theorem, it has been established that if the sequences $\{a_n\}$ and $\{b_n\}$ are such that the series $\sum_{n=1}^{\infty} (a_n^2 + b_n^2)$ converges, then

\[ \frac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \]

is the Fourier series of a function $f \in L^2[0, 2\pi]$. It has also been shown that the partial sums of the Fourier series converge in the mean to $f$ (Cronin-Scanlon, 235). It can also be shown that any $f \in L^2[a, b]$ has a Fourier series which converges to $f$ in the mean. Furthermore, it can be shown that any trigonometric series such that $\sum_{n=1}^{\infty} (a_n^2 + b_n^2) < \infty$ converges in the mean to some $f \in L^2[a, b]$.

It should be noted that the integration required by the Riesz-Fischer Theorem is Lebesgue integration. One reason that the Riesz-Fischer Theorem uses Lebesgue integration is that the convergence properties of the Lebesgue integral are essential in the proof of the theorem (Cronin-Scanlon, 234-235). Another reason is that the class of Lebesgue integrable functions is larger than the class of Riemann integrable functions. The search for this theorem was one of the chief motivating forces behind the development of the Lebesgue integral (Munroe, 258).
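As a concrete instance of the statements above, the following sketch takes the coefficients $a_n = 0$ and $b_n = 1/n$, a choice made here purely for illustration and not drawn from the cited sources. The coefficients are square summable, and the function they determine can be written down explicitly.

```latex
% Illustrative choice (an assumption of this sketch): a_n = 0 for all n, b_n = 1/n.
\[
  \sum_{n=1}^{\infty} (a_n^2 + b_n^2) = \sum_{n=1}^{\infty} \frac{1}{n^2}
  = \frac{\pi^2}{6} < \infty ,
\]
% so \sum_{n \ge 1} \sin(nx)/n is the Fourier series of some f in L^2[0, 2\pi].
% Candidate: f(x) = (\pi - x)/2 on (0, 2\pi). Check b_n using integration by parts:
\begin{align*}
  \int_0^{2\pi} x \sin nx \, dx
    &= \left[ -\frac{x \cos nx}{n} \right]_0^{2\pi}
       + \frac{1}{n} \int_0^{2\pi} \cos nx \, dx
     = -\frac{2\pi}{n}, \\
  b_n = \frac{1}{\pi} \int_0^{2\pi} \frac{\pi - x}{2} \sin nx \, dx
    &= \frac{1}{2\pi} \left( 0 + \frac{2\pi}{n} \right) = \frac{1}{n}.
\end{align*}
% A similar computation gives a_n = 0 for every n \ge 0.
```

Each partial sum is continuous and vanishes at $x = 0$, whereas $(\pi - x)/2$ tends to $\pi/2$ as $x \to 0^+$, so the convergence cannot be uniform on $[0, 2\pi]$. What the theorem guarantees in this example is convergence in the mean.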
One crucial property of the $L^p$ spaces not shared by the corresponding spaces of Riemann integrable functions is that of completeness in the metric given above. This was pointed out in a note by Fischer in 1907 (Hawkins, 177). As an example, consider the set of continuous functions on $[a, b]$ with the corresponding metric

\[ \|f - g\| = \int_a^b |f(x) - g(x)|\,dx. \]

Let $C[0, 1]$ denote the set of continuous functions on the interval $[0, 1]$. Since each function in $C[0, 1]$ is bounded and continuous, each is Riemann integrable. To illustrate that $C[0, 1]$ is not complete, it is necessary to find a Cauchy sequence of functions in $C[0, 1]$ that does not converge to a function in $C[0, 1]$. Let $f_n(x)$ be defined as in Figure 8.

[Figure 8: graph of $f_n(x)$ on $[0, 1]$; the horizontal axis is marked at 1/4 and 3/4.]

$\{f_n\}$ is a Cauchy sequence of functions since, given an $\varepsilon > 0$, there exists an $N$ such that, if $n, m > N$, $\int_0^1 |f_n(x) - f_m(x)|\,dx < \varepsilon$. But $\{f_n\}$ does not converge to a function in $C[0, 1]$. It converges to the characteristic function of $[\frac{1}{4}, \frac{3}{4}]$, which is not contained in $C[0, 1]$. Thus, the set of continuous functions on $[0, 1]$ does not form a complete metric space. The reader should note that the completion of $C[a, b]$ is $L^1[a, b]$ (Reed and Simon, 13).

Probability

The purpose of this section is to provide the reader with an intuitive justification for the treatment of probability in terms of a measure space. This development is excerpted from Halmos, pages 184-191.

Using a common example, consider the rolling of an ordinary six-sided die with the number $x$ showing on the top face of the die. The number $x$ may, therefore, take on values of 1, 2, 3, 4, 5, or 6. This means that $x \in \{1, 2, 3, 4, 5, 6\}$. Possible events for this experiment include:

(i) the number $x$ is even, which is equivalent to $x \in \{2, 4, 6\}$
(ii) $x$ is less than 4, which is equivalent to $x \in \{1, 2, 3\}$
(iii) $x$ is equal to 6, which is equivalent to $x \in \{6\}$.
There are as many events associated with this experiment as there are combinations of the first six positive integers taken any number at a time. If the impossible event, $x$ not being equal to any of the first six positive integers, is included, then there are $2^6 = 64$ possible events. The impossible event will be symbolized as $\emptyset$, while the certain event, $x \in \{1, 2, 3, 4, 5, 6\}$, will be denoted as $X$.

If $E$ is an event, then the complementary event of $E$ will be written as $\sim E$. An experiment will be said to result in $\sim E$ if, and only if, its outcome does not lie in $E$. Thus, if $E = \{2, 4, 6\}$, then $\sim E$ would equal $\{1, 3, 5\}$. Given two events, $E$ and $F$, the union, $E \cup F$, occurs if, and only if, at least one of the two events occurs, and the intersection, $E \cap F$, occurs if, and only if, both $E$ and $F$ occur. Suppose $E = \{2, 4, 6\}$ and $F = \{1, 2, 3\}$; then $E \cup F = \{1, 2, 3, 4, 6\}$ and $E \cap F = \{2\}$.

The preceding examples, with generalizations to more complex experiments, justify the conclusion that probability theory consists of the study of an algebra of sets. The reader should recall that a collection $\mathcal{A}$ of subsets of $X$ is an algebra of sets if (i) $A \cup B$ is in $\mathcal{A}$ whenever $A$ and $B$ are, and (ii) $\sim A$ is in $\mathcal{A}$ whenever $A$ is. An event is, therefore, a set, and its opposite event is the complementary set. Mutually exclusive events are disjoint sets, and an event consisting of the simultaneous occurrence of two other events is the set obtained by intersecting the two corresponding sets.

For the study of probability where the total number of possible events is finite, the treatment of the class of possible events as an algebra of sets is adequate. For situations arising in modern probability theory and practice, it is necessary to make the additional assumption that the system [...] given in Halmos, Chapter IX. For example, a "random variable" in probability is just a measurable function.
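To connect the algebra of events with a measure, the sketch below assigns probabilities to the die events under the assumption, made here for illustration only, that the die is fair. The set function $P$ and the counting notation $|E|$ are introduced for this sketch and do not appear in the text above.

```latex
% Assumption of this sketch: a fair die, so each outcome in X = {1,...,6}
% carries probability 1/6. |E| denotes the number of elements of E.
\[
  P(E) = \frac{|E|}{6} \quad \text{for each of the } 2^6 = 64 \text{ events } E \subseteq X,
  \qquad P(\emptyset) = 0, \quad P(X) = 1 .
\]
% With E = {2, 4, 6} and F = {1, 2, 3} as in the text:
\begin{align*}
  P(E) &= \tfrac{3}{6}, \qquad P(F) = \tfrac{3}{6}, \qquad P(E \cap F) = \tfrac{1}{6}, \\
  P(E \cup F) &= P(E) + P(F) - P(E \cap F) = \tfrac{5}{6}
              = \frac{|\{1, 2, 3, 4, 6\}|}{6} .
\end{align*}
% Additivity on mutually exclusive events: {2,4,6} and {1,3,5} are disjoint
% with union X, and P(X) = 1/2 + 1/2 = 1.
```

In this finite setting $P$ is an additive set function on the algebra of all subsets of $X$ with $P(X) = 1$, which is the sense in which a probability behaves like a measure.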
It seems appropriate to finish this introductory account of measure theory and Lebesgue integration with a quote from Lebesgue himself:

"And now, gentlemen, I pause to thank you for your kind attention and to offer a final word of conclusion, if you will permit. It is that a generalization made not for the vain pleasure of generalizing but in order to solve previously existing problems is always a fruitful generalization. This is proved abundantly by the variety of applications of the ideas that we have just examined." (Lebesgue, 194).

BIBLIOGRAPHY

Beals, Richard. Advanced Mathematical Analysis. New York: Springer-Verlag, 1973.
Bell, Eric T. The Development of Mathematics. 2d ed. New York: McGraw-Hill Book Company, Inc., 1945.
Burton, David M. The History of Mathematics, An Introduction. Dubuque: William C. Brown Publishers, 1985.
Caratheodory, Constantin. Algebraic Theory of Measure and Integration. Reprint, New York: Chelsea Publishing Co., 1963.
Cronin-Scanlon, Jane. Advanced Calculus. Rev. ed. Lexington: D. C. Heath and Company, 1969.
Eves, Howard. Foundations and Fundamental Concepts of Mathematics. 3d ed. Boston: PWS-Kent Publishing Company, 1990.
Fulks, Watson. Advanced Calculus, An Introduction to Analysis. 3d ed. New York: John Wiley & Sons, 1978.
Gillispie, Charles C. (ed.). "Henri Lebesgue," Dictionary of Scientific Biography. vol. VIII. New York: Charles Scribner's Sons, 1973.
Halmos, Paul R. Measure Theory. New York: D. Van Nostrand Company, Inc., 1951.
Hawkins, Thomas. Lebesgue's Theory of Integration, Its Origins and Development. Madison: The University of Wisconsin Press, 1970.
Kolmogorov, A. N. and S. V. Fomin. Measure, Lebesgue Integrals, and Hilbert Space. Translated by Natascha Artin Brunswick and Allan Jeffrey. New York: Academic Press, 1961.
Lebesgue, Henri. Measure and the Integral. Edited by Kenneth O. May. San Francisco: Holden-Day, Inc., 1966.
Munroe, M. E. Introduction to Measure and Integration. Reading: Addison-Wesley Publishing Company, Inc., 1953.
Reed, Michael and Barry Simon. Methods of Modern Mathematical Physics. New York: Academic Press, 1972.
Royden, H. L. Real Analysis. 3d ed. New York: Macmillan Publishing Company, 1988.
Solovay, Robert M. "A model of set-theory in which every set of reals is Lebesgue measurable," Annals of Mathematics. 2d ser., vol. 92, 1970, 1-56.
Temple, G. The Structure of Lebesgue Integration Theory. London: Oxford University Press, 1971.
Thomas, George B. and Ross L. Finney. Elements of Calculus and Analytic Geometry. Menlo Park: Addison-Wesley Publishing Company, 1989.