A thesis submitted in partial fulfillment of the
requirements for the degree of
Master of Science in Mathematics
Keith A. Barker
June, 1993
The Thesis of Keith A. Barker is approved:
Peter Collas
Phillip Emig
David M. Klein, Chair
California State University, Northridge
I would like to express my gratitude for the continued support and
encouragement of several people who helped make this paper a reality. First, I
would like to thank Dr. Norman Herr, Dr. Tung Po Lin, Dr. Barnabus Hughes,
and Dr. Elena Marchisotto for their stimulation and guidance in my graduate
studies. I would like to thank the members of my committee, Dr. Phillip Emig
and Dr. Peter Collas for their insights which made the paper more logical and
coherent. This project would not have come into existence without the patience,
inspiration, and the constant prodding given to me by the chair of my committee,
Dr. David Klein. Thank you, David. I needed that.
I would also like to thank my brother, Chris, and his wife Sue, for their
constant support and encouragement throughout the pursuit of my studies.
And, of course, I want to dedicate this effort to my wife, Kay, and my beautiful
daughter, Christy, without whose patience, understanding, and reinforcement
this project would never have been completed.
Acknowledgements
Abstract
1. Introduction
Historical Perspective
Elementary Measures
2. The Area Under a Curve
The Riemann Integral
The Lebesgue Integral
3. Measure Spaces
Outer Measure
Measurable Sets
Measure Spaces
4. Integration
Riemann Integration
Lebesgue Integration
The Fundamental Theorem of Calculus
5. Applications
Fourier Series
Bibliography
Keith A. Barker
Master of Science in Mathematics
This graduate paper is an excursion into the field of measure theory. It
begins by focusing on the historical development of the measures of lengths,
areas, and volumes, followed by the introduction of various methods to
calculate these quantities. The area under a curve is discussed descriptively in
terms of both the Riemann integral and the Lebesgue integral. A series of
fundamental definitions leads to the development of the outer measure set
function. An example of a family of sets on which outer measure is not a
countably additive set function is then formulated. Measurable sets are defined
and a series of theorems is proved to establish that outer measure, restricted
to measurable sets, is a countably additive set function.
After a discussion of measure spaces, Riemann integration is developed
in terms of step functions, followed by a rigorous development of Lebesgue
integration in terms of simple functions. The usefulness of Lebesgue integration
is then illustrated by applying the Lebesgue Dominated Convergence Theorem
to a sequence of functions that converge pointwise, but do not converge
uniformly. The Fundamental Theorem of Calculus is presented for the Riemann
integral and is then generalized to the Lebesgue integral. The thesis concludes
with an application indicating how Lebesgue integration is used in Fourier
series and with an application of measure theory to the study of probability.
Chapter 1
Historians generally agree that mathematics arose from the necessity to
solve the practical problems of counting and recording numbers. The annual
flooding of the Nile River plain forced the Egyptians to develop a method for the
relocation of land markings. Since, according to Herodotus (c. 485 - c. 430 B.C.),
taxes were paid on the basis of land area, Egyptian surveyors developed, and
used, various mensuration formulas. To convert the land along the Tigris and
Euphrates rivers into a rich agricultural region, the Babylonians engineered
structures which accomplished marsh drainage, irrigation and flood control.
The mathematics needed by the Babylonians was extensive. The word
geometry is a compound of the two Greek words meaning "earth" and
"measure," indicating that the subject arose from the necessity of land measure.
(Eves, 1)
As primitive counting evolved into mensuration and practical arithmetic, it
became necessary to develop methods to instruct, and to study, this developing
science. The study of the empirical reasoning inherent in mensuration and
practical arithmetic led to the abstraction needed for the beginnings of
theoretical geometry and elementary algebra.
When theoretical geometry and elementary algebra are studied in the
abstract, formulas for both areas and volumes of various geometric shapes
arise. Although many of the Egyptian and Babylonian formulas contained
errors (Eves, 2), it was their groundwork that led to the generalized concepts of
length of a segment, the area of a plane figure, and the volume of a figure in
space. These notions are the intuitive values that are associated with the
"measure" of some particular object.
Since the time of the ancient Egyptians and Babylonians, the problem of
finding areas and volumes has captured the imagination of mathematicians.
Archimedes (287 - 212 B.C.), through an ingenious method called "exhaustion,"
verified calculations of areas and volumes (Burton, 221). In this method, formal
logic, not infinitesimal quantities, was used. Nicole Oresme (c. 1323 - 1382)
anticipated an aspect of analytic geometry by graphing the dependent variable
against the independent variable as the independent variable took on small
increments. Using this method, he was able to approximate areas under curves
(Eves, 93).
With the advent of the scientific revolution in the seventeenth century,
interest in the calculation of area and volume reappeared. Johannes Kepler
(1571 - 1630), in his second law of celestial mechanics, stated that the
displacement vector from the sun to a planet sweeps out equal areas in equal
time intervals (Burton, 341). Kepler treated the area swept out by the planet as
a collection of an infinite number of small triangles with one vertex at the sun
and the other two vertices at points infinitely close together along the planet's
elliptical orbit. Using this unrefined form of calculus, Kepler was able to find the
sum of the areas of these triangles, and verify his law (Burton, 337).
Throughout the sixteenth and seventeenth centuries, mathematicians
and scientists developed their own peculiar methods for finding areas and
volumes. Isaac Newton (1642- 1727) and Gottfried Leibniz (1646- 1716) were
the first to apply integral calculus, in a systematic way, for the determination of
areas and volumes.
Area, as a measure of a plane figure, has evolved from the land measure
of the Egyptians and Babylonians to the abstract Banach spaces of modern
analysis. To develop area as a measure, it is requisite to start with a unit length.
By establishing an arbitrary length as unity, a unit square follows immediately.
The measure of any square whose side is a multiple of unity can then be found
by counting the number of unit squares contained in the original square. In a
similar manner, areas of rectangles, whose sides are multiples of unity, can be
determined by totaling the number of unit squares contained in the figure. If the
areas of various squares and rectangles are collected, a general rule can be
ascertained from the data relating area to the lengths of the sides of the
respective figures. The concept of area, as a measure, can be further extended
with the inclusion of rational and irrational numbers as lengths of sides of the
squares and rectangles.
The area of a parallelogram is found by removing a triangular region
from one end, and attaching it to the opposite end, to form a rectangle. Areas of
triangular regions, which can be shown to be half of a parallelogram, can now
be determined. The area of any shape which can be broken down into
non-overlapping squares, rectangles, and triangles can be found by summing the
areas of the individual pieces.
By using the definition of $\pi$ (the ratio of the circumference of a circle to the
diameter of the circle), the area of a circular region can now be ascertained.
The first step is to determine the area of an n-sided regular polygon (which can
be inscribed in a circle). In Figure 1, "x" represents the length of a side of the
polygon, and "a" represents the length of the apothem, the distance from the
center of the circumscribed circle to the side of the polygon. The area of the
triangle is (1/2)(x)(a), and, since there are "n" triangles in the n-sided polygon,
the area of the polygon becomes (1/2)(n)(x)(a). As the number of sides of the
polygon becomes infinitely large, the term "(n)(x)", which is the perimeter of the
polygon, approaches the circumference of the circle, $2\pi r$, and the length "a"
approaches the radius of the circle, r. The area of the circle now becomes
$(1/2)(2\pi r)(r)$, which simplifies to $\pi r^2$.
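The limiting argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `inscribed_polygon_area` is hypothetical, and the side length and apothem are computed from standard trigonometry (which already presupposes $\pi$), so the code illustrates the convergence and does not prove it.

```python
import math

def inscribed_polygon_area(n, r=1.0):
    """Area (1/2)(n)(x)(a) of a regular n-gon inscribed in a circle of radius r."""
    x = 2 * r * math.sin(math.pi / n)   # length of one side
    a = r * math.cos(math.pi / n)       # apothem: center to midpoint of a side
    return 0.5 * n * x * a

# As n grows, the polygon area approaches pi * r**2 (here r = 1).
for n in (6, 60, 600, 6000):
    print(n, inscribed_polygon_area(n), math.pi)
```

The printed areas increase with n and stay below $\pi r^2$, as expected for inscribed polygons.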
Avoiding rigor, and for the illumination of the reader, the area of a sector
of a circle can be found by using the following ratio:
\[ \frac{\text{Area of Sector}}{\text{Area of Circle}} = \frac{\text{Arc Length of Sector}}{\text{Circumference of Circle}} \]
If $\theta$ signifies the angular measure, in radians, of the central angle formed by the
radii of the arc, the formula can now be simplified to:
\[ \text{Area of Sector} = \frac{1}{2} r^2 \theta \]
The measure of any region composed of non-overlapping squares, rectangles,
triangles, circles, or arcs of circles, can now be determined.
As long as a figure is composed of straight line segments, arcs of circles,
or combinations of these, it is relatively easy, although possibly tedious, to find
its measure. If, however, the figure is composed of curves determined by
relations other than circular arcs, the measure must be calculated through the
use of integral calculus.
Chapter 2
The Area Under A Curve
Generally, the area under a curve is determined by finding the limit of the
sum of inscribed rectangles, as the number of rectangles increases without
bound. The area is found by two distinct methods. The first technique is called
the Riemann integral, which is the familiar integral of elementary calculus, and
the second is the Lebesgue integral, which is the keystone of modern analysis.
Given a function, f, continuous on the interval [a, b], we begin by
partitioning the interval into n subintervals by inserting points
$x_1, x_2, \ldots, x_{k-1}, x_k, \ldots, x_{n-1}$ into the interval as shown in
Figures 2 and 3. For convenience, the
figures shown are of positive valued functions, but the procedure is valid for any
continuous function on the closed interval.
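[Figure 2: rectangles under the curve over the partition $a, x_1, x_2, \ldots, x_{k-1}, x_k, \ldots, x_{n-1}, b$]
[Figure 3: rectangles over the curve on the same partition]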
These points subdivide [a, b] into n subintervals of lengths $\Delta x_1 = x_1 - a$,
$\Delta x_2 = x_2 - x_1$, ..., $\Delta x_n = b - x_{n-1}$, which need not be equal. Since f is
continuous, it has a minimum value, $m_k$, and a maximum value, $M_k$, in each
subinterval (for a proof of this consequence of continuity, see Cronin-Scanlon,
pages 69 - 70). The sum of the areas of the rectangles in Figure 2 is called the
Lower Sum:
\[ L = m_1 \Delta x_1 + m_2 \Delta x_2 + \cdots + m_n \Delta x_n = \sum_{k=1}^{n} m_k \Delta x_k, \]
and the sum of the areas of the rectangles in Figure 3 is called the Upper Sum:
\[ U = M_1 \Delta x_1 + M_2 \Delta x_2 + \cdots + M_n \Delta x_n = \sum_{k=1}^{n} M_k \Delta x_k. \]
The difference in the areas, U - L, is of particular importance. As the
number of subintervals increases, the tops of the rectangles fit the curve more
closely, thus decreasing the difference, U - L. First, define the "norm" of
the subdivision as the largest subinterval width; then, as this norm
approaches zero, the subintervals must both decrease in width and
become more numerous at the same time. As the number of rectangles
increases, the difference U - L can be made as small as any given positive
number. This means that
\[ \lim_{\text{norm} \to 0} (U - L) = 0, \quad \text{which implies that} \quad \lim_{\text{norm} \to 0} U = \lim_{\text{norm} \to 0} L. \]
The existence of these limits is a consequence of a special property that
continuous functions have on a closed and bounded interval, called "uniform
continuity." (For a discussion of uniform continuity, see Cronin-Scanlon, pages
71 - 74).
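The shrinking gap between the upper and lower sums can be seen in a small computation. The sketch below makes two simplifying assumptions that are not in the text: the partition is uniform, and f is increasing on [a, b], so that the minimum and maximum on each subinterval occur at its left and right endpoints. The names `lower_upper_sums` and `square` are illustrative only.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums of an increasing function f on [a, b], n equal subintervals."""
    dx = (b - a) / n
    xs = [a + k * dx for k in range(n + 1)]
    # For increasing f: m_k = f(x_{k-1}) and M_k = f(x_k).
    lower = sum(f(xs[k - 1]) * dx for k in range(1, n + 1))
    upper = sum(f(xs[k]) * dx for k in range(1, n + 1))
    return lower, upper

def square(x):
    return x * x

for n in (10, 100, 1000):
    L, U = lower_upper_sums(square, 0.0, 1.0, n)
    print(n, L, U, U - L)
```

For this choice of f the difference U - L equals (f(b) - f(a))(b - a)/n, which is 1/n here, so it can be made smaller than any given positive number by taking n large.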
In each interval $[x_{k-1}, x_k]$, a point $c_k$ is arbitrarily chosen and the sum
$S = \sum f(c_k) \Delta x_k$ is formed. The sum S, like L and U, is a sum of products of
functional values and interval widths, and is therefore a sum of areas of
rectangles. Since each $c_k$ lies in the k-th subinterval, we know that
$m_k \le f(c_k) \le M_k$, which implies that $L \le S \le U$. By the Sandwich Theorem
(Thomas/Finney, 65), it is concluded that
\[ \lim S = \lim L = \lim U, \]
which means that, no matter how the point $c_k$ is chosen, the sum $S = \sum f(c_k) \Delta x_k$
always has the same limit as the norm of the subdivisions approaches zero.
This conclusion, without uniform continuity, was first reached in 1823 by
Augustin-Louis Cauchy (1789 - 1857), and was then put on a solid logical
foundation, with uniform continuity, by other nineteenth century mathematicians.
This limit, called the Riemann Integral of the function f over the interval [a, b], is
denoted by $\int_a^b f(x)\,dx$. This integral was named for Georg Friedrich Bernhard
Riemann (1826 - 1866). It was Riemann's idea to trap the limit between the
upper and lower sums. The sum $S = \sum f(c_k) \Delta x_k$ is called the approximating
sum, or the Riemann sum for the integral (Thomas/Finney, 267).
The usual derivation of the Riemann integral is based on dividing the
domain of the function f into smaller and smaller subintervals. Henri Lebesgue
(1875 - 1941), in his doctoral thesis published in 1902, made a simple
modification of the Riemann theory. Instead of dividing the domain, he divided
the range into finer and finer pieces. This method, illustrated in Figure 4,
depends more on the function than on the domain, so it has the possibility of
working for more types of functions than does Riemann integration.
The first step in Lebesgue integration is to define a measure function $\mu$
which will assign a non-negative real number to sets that are deemed
measurable. This function must have the property that the measure of an
interval on the real line, $\mu([a, b])$, must be equal to $(b - a)$. This requirement
suggests that the measure function on the real line is actually the formula for the
length of the interval between pairs of points on the line as defined in
elementary algebra.
The next step is to formulate the "inverse image" of the set $[y_0, y_n]$. It is
denoted as $f^{-1}([y_0, y_n]) = \{x : f(x) \in [y_0, y_n]\}$. The inverse image is the set of all
points of the domain of the function which are mapped into $[y_0, y_n]$ by f. As an
example, let f be a function such that
\[ f(x) = \begin{cases} 1 & \text{if } x \in [2, 3) \\ 0 & \text{if } x \notin [2, 3) \end{cases} \]
then $f^{-1}(1) = [2, 3)$ and $f^{-1}(0) = (-\infty, 2) \cup [3, \infty)$.
Using the concepts of measure and inverse image, the Lebesgue
Integral can now be developed. Referring to Figure 4 below, the range of f(x)
must be partitioned into subintervals, each of width $\frac{1}{n}$, and note that portions
of the graph of the function are contained within the "horizontal rectangle"
bounded by $\frac{m}{n}$ and $\frac{m+1}{n}$.
[Figure 4]
Define the sum
\[ \Sigma_n(f) = \sum_{m=0}^{\infty} \frac{m}{n}\, \mu\left( f^{-1}\left[ \left[ \frac{m}{n}, \frac{m+1}{n} \right) \right] \right). \]
Since $\frac{m}{n}$ is a y-coordinate, and $\mu\left( f^{-1}\left[ \left[ \frac{m}{n}, \frac{m+1}{n} \right) \right] \right)$
represents the measure, or the "length," of an interval on
the horizontal axis, the sum $\Sigma_n(f)$ therefore denotes the aggregate of the
areas of rectangles. As n, the number of horizontal subintervals, increases
without bound, the limit of the sum will be the area under the curve. The
Lebesgue Integral can now be defined in the following manner:
\[ \int f(x)\,dx = \lim_{n \to \infty} \Sigma_n(f) \]
(Reed-Simon, 13-14). The existence of this limit can be proven using, for
example, the Lebesgue Dominated Convergence Theorem.
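The idea of slicing the range rather than the domain can be illustrated with a short computation. The sketch below is an assumption-laden example, not the author's: it takes f(x) = x^2 on [0, 1], cuts the range into strips [m/n, (m+1)/n), and weights the lower level m/n of each strip by the length of its inverse image, which for this particular f is the interval from sqrt(m/n) to sqrt((m+1)/n). The function name `lebesgue_sum_square` is hypothetical.

```python
import math

def lebesgue_sum_square(n):
    """Horizontal-strip sum for f(x) = x**2 on [0, 1] with strips of height 1/n."""
    total = 0.0
    for m in range(n):
        # Inverse image of [m/n, (m+1)/n) under f is [sqrt(m/n), sqrt((m+1)/n)).
        measure = math.sqrt((m + 1) / n) - math.sqrt(m / n)
        total += (m / n) * measure
    return total

for n in (10, 100, 1000):
    print(n, lebesgue_sum_square(n))
```

Because each strip is weighted by its lower level, the sums approach the area 1/3 from below, and the shortfall is at most 1/n times the length of [0, 1].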
To illustrate the Lebesgue Integral, define the function f(x), on the
interval [0, 1], in the following manner:
\[ f(x) = \begin{cases} 1 & \text{if x is irrational} \\ 0 & \text{if x is rational} \end{cases} \]
Attempting to integrate this function by the Riemann integral, the limit of
the lower sum, L, is found to be equal to zero, and the limit of the upper sum, U,
is determined to be one, which means that the function is not Riemann
integrable. The alternative is to now try the Lebesgue integration technique.
By using the horizontal rectangles as defined by the Lebesgue integral,
and, as shown in Figure 5, it is concluded that
\[ \int f(x)\,dx = (1)\{\mu[f^{-1}(1)]\}. \]
[Figure 5]
An analysis of the measure of the interval [0, 1] is necessary to find $\mu[f^{-1}(1)]$. The
measure of the interval [0, 1] is equal to the length of the interval, which is one.
Since the union of the disjoint sets $\{\text{Rationals} \cap [0, 1]\}$ and $\{\text{Irrationals} \cap [0, 1]\}$
is equal to the interval [0, 1],
\[ \mu[0, 1] = \mu\{\text{Rationals} \cap [0, 1]\} + \mu\{\text{Irrationals} \cap [0, 1]\}. \]
The set of rational numbers is countably infinite, so the set $\{\text{Rationals} \cap [0, 1]\}$
can be written as the set $\{r_1, r_2, r_3, \ldots\}$, where each $r_k$ represents a unique
rational number. The set $\{r_1, r_2, r_3, \ldots\}$ can be represented as the union of $\{r_1\}$,
$\{r_2\}$, $\{r_3\}$, .... Since $\{r_i\} \cap \{r_j\} = \emptyset$ for $i \ne j$,
\[ \mu\{\text{Rationals} \cap [0, 1]\} = \mu\{r_1, r_2, r_3, \ldots\} = \mu\{r_1\} + \mu\{r_2\} + \mu\{r_3\} + \cdots, \]
which is the sum of the measures of sets whose only
element is a single point. A set consisting of a single point is an interval with
both endpoints being the same point. Since the measure of an interval is the
length of the interval, $\mu\{r_k\} = r_k - r_k = 0$, which implies $\mu\{\text{Rationals} \cap [0, 1]\}$
must be 0. Therefore,
\[ \mu\{\text{Irrationals} \cap [0, 1]\} = \mu[0, 1] - \mu\{\text{Rationals} \cap [0, 1]\} = 1 - 0 = 1. \]
The set $f^{-1}(1)$ is equal to $\{\text{Irrationals} \cap [0, 1]\}$, which means
$\mu[f^{-1}(1)] = \mu\{\text{Irrationals} \cap [0, 1]\} = 1$, and therefore
\[ \int f(x)\,dx = (1)\{\mu[f^{-1}(1)]\} = (1)(1) = 1. \]
To understand this kind of integration, it is necessary to comprehend the
notion of length, i.e. the "measure," of a set. Do all sets have measure? If not,
what are the necessary and sufficient conditions for a set to have a measure?
What are the properties of a set that will make it measurable? The answers to
these, and other questions, will be found as Lebesgue measure and measure
theory are developed in the remainder of this paper.
Chapter 3
Measure Spaces
To introduce the study of measure spaces, it is necessary to establish a
few working definitions. To begin with, the real number system is not sufficient
for this investigation and must be enlarged to include the two elements, $-\infty$ and
$+\infty$. This set will be referred to as the extended real numbers. It should be
noted that, for any real number x, $-\infty < x < \infty$ and the following properties hold:
$x + \infty = \infty$ and $x - \infty = -\infty$
$x(\infty) = \infty$ and $x(-\infty) = -\infty$, for all $x > 0$
$\infty + \infty = \infty$ and $-\infty - \infty = -\infty$
$\infty(\pm\infty) = \pm\infty$, $-\infty(+\infty) = -\infty$, and $-\infty(-\infty) = +\infty$
and $(\infty - \infty)$ will be left undefined (Royden, 36).
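These conventions closely mirror the behavior of IEEE floating-point infinities, which gives a quick way to experiment with them. The sketch below is only an analogy under that assumption: floating-point arithmetic is not the extended real number system, and it reports the undefined case as "not a number" rather than leaving it undefined.

```python
import math

inf = math.inf
x = 3.5  # any real number; positive so the multiplication rules apply

assert -inf < x < inf
assert x + inf == inf and x - inf == -inf
assert x * inf == inf and x * (-inf) == -inf      # requires x > 0
assert inf + inf == inf and -inf - inf == -inf
assert inf * inf == inf and (-inf) * inf == -inf and (-inf) * (-inf) == inf

# The one combination left undefined in the text has no numeric value here either.
print(math.isnan(inf - inf))  # True
```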
Let $\mathcal{A}$ be a collection of sets such that if A and B are elements of $\mathcal{A}$, then
$(A \cup B) \in \mathcal{A}$ and $(\sim A) \in \mathcal{A}$, where $\sim A$ is the complement of set A. When these
two conditions are met, the collection $\mathcal{A}$ is called an algebra of sets. Using
DeMorgan's laws, it follows that $(A \cap B)$ is also an element of $\mathcal{A}$. If the above
two conditions are satisfied, and if any union of a countable collection of sets in
$\mathcal{A}$ is again in $\mathcal{A}$, then $\mathcal{A}$ is called a $\sigma$-algebra. It again follows that the
intersection of a countable collection of sets in $\mathcal{A}$ is also an element of $\mathcal{A}$.
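On a finite universe the two closure conditions can be checked mechanically. The following is a minimal sketch with hypothetical names (`is_algebra`, `good`, `bad`); for a finite collection every countable union reduces to a finite one, so a finite algebra is automatically a σ-algebra.

```python
def is_algebra(universe, collection):
    """Check closure under complement and pairwise union within a finite universe."""
    universe = frozenset(universe)
    sets = {frozenset(s) for s in collection}
    closed_complement = all(universe - s in sets for s in sets)
    closed_union = all(a | b in sets for a in sets for b in sets)
    return closed_complement and closed_union

X = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, X]   # closed under both operations
bad = [set(), {1}, {2}, X]          # complement of {1} is missing

print(is_algebra(X, good), is_algebra(X, bad))  # True False
```

Closure under intersection does not need to be tested separately, since it follows from the two conditions by DeMorgan's laws.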
An interval I, on the real number line, can be expressed as open, closed,
or half-open, and can be denoted in the following forms: (a, b), [a, b], (a, b], or
[a, b). The length of I, $\ell(I)$, is defined by $(b - a)$. $\ell(I)$ is an example of a set
function. A set function is a function that assigns an extended real number to
each set in some collection of sets. The domain of this length function would be
the collection of all intervals contained in the real number line.
If S is a set of real numbers, d is an upper bound for set S if, for each
$x \in S$, $x \le d$. A number c is called the least upper bound of set S if c is an upper
bound of set S and $c \le d$ for each upper bound d of set S. The least upper
bound of set S is also referred to as the supremum of set S and denoted as sup
S. The number a is a lower bound of set S if, for each $x \in S$, $a \le x$. If b is a
lower bound for set S and $a \le b$ for each lower bound a of set S, then b is
defined as the greatest lower bound or the infimum of set S, and denoted as inf
S. The Completeness Axiom states that every set S of real numbers which has
an upper bound has a least upper bound. From this axiom, it follows that
every set of real numbers with a lower bound has a greatest lower bound.
The concept of measure of a set is a natural generalization of the ideas of
length, area, or volume. Other measures can be created, for example, by
considering the increment $\phi(b) - \phi(a)$ of a non-decreasing function $\phi(x)$ on the
interval [a, b), or the integral of a non-negative function over some line, plane
surface or space region. In all cases, the notion of measure is a set function that
assigns a non-negative extended real number to a given set.
Let $\mathfrak{M}$ be a collection of sets of real numbers, and let "m" be a set function
that assigns non-negative extended real numbers to elements of $\mathfrak{M}$. If set E is
an element of $\mathfrak{M}$, then the extended real number assigned to E by m, noted as
"mE," is called the measure of E. In the ideal situation, the set function m should
have the following properties:
1. mE should be defined for each set of real numbers.
2. For an interval I, on the real number line, $mI = \ell(I)$.
3. If $\{E_n\}$ is a sequence of disjoint sets, for which m is defined, then
$m(\bigcup E_n) = \sum mE_n$. This means that "m" is a countably additive
set function.
4. m is translation invariant. If E is a set for which m is defined and if
$E + y$ is the set $\{x + y : x \in E\}$ obtained by replacing each point
x in E by the point $(x + y)$, then $m(E + y) = mE$ (Royden, 54).
As will be illustrated later, it is impossible to construct a set function
having all of these properties and it is not known if a set function satisfying the
first three properties exists. Consequently, one of the properties must be
weakened and generally it is most useful to weaken the first property and retain
the remaining three properties. Therefore, mE need not be defined for all sets E
of real numbers; however, it is desired that mE be defined for as many sets as
possible. It is convenient to require that the family $\mathfrak{M}$ of sets for which m is
defined be a $\sigma$-algebra. The set function "m" is said to be a countably additive
measure if it is a non-negative extended real valued function whose domain is a
$\sigma$-algebra $\mathfrak{M}$ of sets of real numbers and $m(\bigcup E_n) = \sum mE_n$ for each sequence
$\{E_n\}$ of disjoint sets in $\mathfrak{M}$. It is now desired to construct a countably additive
measure which is translation invariant and has the property that $mI = \ell(I)$ for
each interval I (Royden, 55).
Consider a set of real numbers, A, and the countable collections $\{I_n\}$ of
open intervals that cover A. Note that $A \subset \bigcup I_n$ and that the length of each
interval is a positive number. Since the sum of the lengths of the intervals is
uniquely defined independent of the order of terms, the outer measure, $m^*A$, of
A can be defined as the greatest lower bound or the infimum of all such sums of
the lengths of the intervals that cover A. In symbols, $m^*A = \inf \sum \ell(I_n)$ for
$A \subset \bigcup I_n$. The following three propositions arise from this definition.
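Proposition 1:
If $A \subset B$, then $m^*A \le m^*B$.
Suppose $A \subset B$.
Let $L_A = \{\sum \ell(I_n) : I_n \text{ is an open interval and } A \subset \bigcup I_n\}$.
Let $L_B = \{\sum \ell(I_n) : I_n \text{ is an open interval and } B \subset \bigcup I_n\}$.
As an example, let set A = (0, 1), which implies $1 \in L_A$, and
let set B = (-1, 2), which implies $3 \in L_B$.
Since every cover of B is also a cover of A, every element of $L_B$ is also an
element of $L_A$, which implies that $L_B \subset L_A$.
Therefore, $\inf L_A \le \inf L_B$, which implies $m^*A \le m^*B$.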
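Proposition 2:
Each set consisting of a single point has outer measure zero.
Suppose set A contains only the single point x.
$m^*A = \inf \{\sum \ell(I_n) : I_n \text{ is an open interval and } A \subset \bigcup I_n\}$.
Let $I_1 = (x - \frac{\epsilon}{2}, x + \frac{\epsilon}{2})$, for some $\epsilon > 0$.
Since $I_1$ is an open covering of set A, $A \subset (x - \frac{\epsilon}{2}, x + \frac{\epsilon}{2})$.
Therefore, $m^*A \le \ell(x - \frac{\epsilon}{2}, x + \frac{\epsilon}{2}) = \epsilon$.
Since $\epsilon$ is arbitrarily small, $m^*A$ must be zero.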
Proposition 3:
$m^*\emptyset = 0$.
1. Since $\emptyset \subset \{x\}$ for any $x \in \mathbb{R}$, $m^*\emptyset \le m^*\{x\}$, by Proposition 1.
2. By Proposition 2, $m^*\{x\} = 0$, which implies that $m^*\emptyset \le 0$.
3. By the definition of outer measure, $0 \le m^*\emptyset$.
4. By combining steps 2 and 3, $0 \le m^*\emptyset \le 0$, which implies $m^*\emptyset = 0$.
The following three properties concerning outer measure are direct
consequences of the definitions presented here. Elegant proofs of these properties
are provided in Royden, pages 56 - 58.
Property 1:
The outer measure of an interval is its length.
Property 2:
Let $\{A_n\}$ be a countable collection of sets of real numbers. Then
$m^*(\bigcup_{n=1}^{\infty} A_n) \le \sum_{n=1}^{\infty} m^*A_n$. This is called countable subadditivity.
Property 3:
Outer measure is translation invariant.
As a consequence of Property 2 above and Proposition 2, if $\{x_1, x_2, \ldots\}$ is
a countable collection of real numbers, then
\[ m^*\{x_1, x_2, \ldots\} \le \sum m^*\{x_i\} = \sum 0 = 0. \]
This establishes that the outer measure of any countable set of real numbers is
equal to zero.
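The same conclusion can be reached directly from the covering definition of outer measure, which is a different route from the subadditivity argument above. The sketch below is an illustration under stated assumptions: it enumerates only finitely many rationals in [0, 1] (those with denominator at most 12), and gives the k-th point an open interval of length eps / 2^k, so the total length stays below eps. The helper names are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """Distinct rationals p/q in [0, 1] with q <= max_denominator, in order of discovery."""
    seen, ordered = set(), []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                ordered.append(r)
    return ordered

def cover(points, eps):
    """Give the k-th point (k = 1, 2, ...) an open interval of length eps / 2**k around it."""
    intervals = []
    for k, r in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        intervals.append((r - half, r + half))
    return intervals

points = rationals_in_unit_interval(12)
eps = Fraction(1, 100)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(len(points), float(total_length))
```

However many points are enumerated, the total length is eps(1 - 2^(-N)) for N points, which never reaches eps; since eps is arbitrary, the infimum over all covers is zero.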
Outer measure is defined for all sets of real numbers. However, as
indicated in Property 2 above, it is not necessarily a countably additive set
function. To provide an example of a family of sets of real numbers on which
outer measure is not a countably additive set function, it is first necessary to
define the sum modulo one of two numbers. If x and y are real numbers in the
interval [0, 1), then the sum modulo one of x and y, denoted by $(x \oplus y)$, is
defined as follows:
\[ x \oplus y = \begin{cases} x + y & \text{if } x + y < 1 \\ x + y - 1 & \text{if } x + y \ge 1 \end{cases} \]
The operation, $\oplus$, is a commutative and associative operation that maps pairs of
numbers in [0, 1) into numbers in [0, 1). As an example of this operation,
consider the unit circle, and assign to each $x \in [0, 1)$ the angle $2\pi x$. The sum
modulo one now corresponds to angle addition. For example, let x = 0.5 and let
y = 0.75; then $(x \oplus y) = 0.5 + 0.75 - 1 = 0.25$, which corresponds to $2\pi(0.5) +
2\pi(0.75) = 2\pi(1.25) = 2.5\pi$, which is equivalent to $(0.5)\pi$ or $2\pi(0.25)$.
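The sum modulo one is easy to experiment with. The following is a minimal sketch with a hypothetical function name, `add_mod_one`; exact fractions are used in place of decimals so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def add_mod_one(x, y):
    """Sum modulo one of x and y, both assumed to lie in [0, 1)."""
    s = x + y
    return s if s < 1 else s - 1

x, y = Fraction(1, 2), Fraction(3, 4)
print(add_mod_one(x, y))  # 1/4
```

Checking commutativity and associativity on a small grid of values such as k/8 for k = 0, ..., 7 is a useful sanity test, although it is of course not a proof.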
Let E be a subset of [0, 1); then the translation modulo one of
set E by y is defined as $(E \oplus y) = \{z : z = (x \oplus y) \text{ for some } x \in E\}$. If sum
modulo one is viewed as angle addition, then translation modulo one by y
corresponds to rotation of set E through the angle $2\pi y$. The following lemma
establishes that outer measure is invariant under translation modulo one.
Lemma 1:
If E is an outer measurable subset of [0, 1), then, for each $y \in [0, 1)$,
$(y \oplus E)$ is outer measurable and $m^*(y \oplus E) = m^*E$.
An excellent proof of this lemma is provided by Munroe on page 142.
If x and y are real numbers in [0, 1) and if $(x - y)$ is a rational number,
then x and y are defined to be equivalent and will be written as $(x \equiv y)$. This is
an equivalence relation since
$(x \equiv x)$: $x - x = 0 \in \mathbb{Q}$, the set of rational numbers.
If $(x \equiv y)$, then $(y \equiv x)$: $(x - y) \in \mathbb{Q}$ implies $(y - x) \in \mathbb{Q}$.
If $(x \equiv y)$, $(y \equiv z)$, then $(x \equiv z)$: if $(x - y) \in \mathbb{Q}$, $(y - z) \in \mathbb{Q}$,
then $(x - y) + (y - z) = (x - z) \in \mathbb{Q}$.
This equivalence relation partitions the interval [0, 1) into equivalence
classes, that is, classes such that any two elements of one class differ by a
rational number. The reader should note that a partition separates a set into a
collection of disjoint subsets whose union is the original set. The equivalence
class for the element x is the set of all y such that $(x \equiv y)$. The following are three
examples of equivalence classes:
$\mathbb{Q}$, the set of rational numbers
$\sqrt{2} + \mathbb{Q}$
The Axiom of Choice states that, given any collection of non-empty sets, it
is possible to choose an element from each of the sets. Using this axiom, let P
be the set consisting of one element from each equivalence class as defined
above. If $\{r_0, r_1, r_2, \ldots\}$ is an enumeration of the rational numbers in [0, 1), with
$r_0 = 0$, then define $P_i = r_i \oplus P$, where $P_0 = r_0 \oplus P = 0 \oplus P = P$. The following
lemma establishes that $\{P_i\}$ is a pairwise disjoint sequence of sets.
Lemma 2: $P_i \cap P_j = \emptyset$, for $i \ne j$.
Suppose $x \in P_i \cap P_j$; then x is in both $P_i$ and $P_j$.
Then $x = p_i \oplus r_i = p_j \oplus r_j$, where both $p_i$ and $p_j \in P$.
But $p_i - p_j \in \mathbb{Q}$.
Therefore, $p_i \equiv p_j$.
But P contains exactly one element from each equivalence class, which
implies $p_i = p_j$ and $r_i = r_j$, which means $P_i = P_j$. Therefore, if the
intersection is non-empty, the sets are equal, and, if they are different, the
intersection of $P_i$ and $P_j$ must be empty.
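Lemma 3: $[0, 1) = \bigcup_{i=0}^{\infty} P_i$.
Let $x \in [0, 1)$; then x is contained in some equivalence class. Suppose
$x \equiv p \in P$.
Therefore $x = p \oplus r_i$, for some $r_i \in \mathbb{Q} \cap [0, 1)$, which implies $x \in P_i$.
Thus, $[0, 1) = \bigcup_{i=0}^{\infty} P_i$.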
From lemmas 2 and 3, $m^*[0, 1) = m^*(\bigcup_{i=0}^{\infty} P_i) = 1$. If outer measure is
countably additive on $\{P_i\}$, then $m^*(\bigcup_{i=0}^{\infty} P_i) = \sum m^*(P_i) = 1$. Since each $P_i$ is a
translation modulo one of P, then, by lemma 1, $m^*(P_i) = m^*(P)$ for i = 1, 2, 3, ...
and $\sum m^*(P_i) = \sum m^*(P) = 1$. Suppose $m^*(P) = 0$; then $\sum m^*(P_i) = \sum m^*(P) =
\sum 0 = 0 \ne 1$. Suppose $m^*(P) > 0$; then $\sum m^*(P_i) = \sum m^*(P) = \infty \ne 1$.
It can now be concluded that $\sum m^*(P)$ is either 0 or $\infty$,
depending on whether $m^*(P)$ is zero or a positive number. In either case, outer
measure is not a countably additive set function on the pairwise disjoint
sequence of sets, $\{P_i\}$ (Royden, 64-66). It is clear from this example that in order
to guarantee countable additivity of the outer measure set function, it is
necessary to, at the minimum, restrict outer measure to a family of subsets of $\mathbb{R}$
which does not contain set P or its translates $\{P_i\}$. Set P will then be a
"non-measurable" subset of $\mathbb{R}$. The notion of measurability is defined in the next
section.
The construction of the "non-measurable" family of sets $\{P_i\}$ required the
use of the Axiom of Choice. Robert M. Solovay, in 1970, showed that "... the
existence of a non-Lebesgue measurable set cannot be proved in Zermelo-Frankel
set theory (ZF) if use of the axiom of choice is disallowed. In fact, even
adjoining an axiom DC (principle of dependent choice) to ZF, which allows
many consecutive choices, does not create a theory strong enough to construct
a non-measurable set" (Solovay, 1). Solovay's paper thus shows that, while the
axiom of choice implies the existence of non-measurable sets, their existence
cannot be established without some form of the axiom of choice stronger than
dependent choice.
Since all subsets of the real number line have outer measure, to meet the
condition of countable additivity, it will be necessary to decrease the number of
sets on which outer measure is defined. To suitably reduce the set of outer
measurable sets to a collection of measurable sets forming a $\sigma$-algebra on
which outer measure will be countably additive, the definition first provided by
Caratheodory (page 161), and used by Halmos (page 44) and Royden (page
58), will be employed.
A set E is said to be measurable if, for each set A, we have
\[ m^*A = m^*(A \cap E) + m^*(A \cap \sim E), \]
where the symbol $\sim E$ represents the complement of set E. A member of the
$\sigma$-algebra formed by this definition will be called a measurable set.
It is difficult to get an intuitive understanding of this definition except
through familiarity with its implications as will be done in the proofs of the
succeeding theorems. Since outer measure is not a countably additive set
function on the collection of all subsets of $\mathbb{R}$, this definition creates a class of
sets on which outer measure is, at the minimum, additive. Given any two sets of
real numbers, A and E, set A is equal to $(A \cap E) \cup (A \cap \sim E)$ and, applying the
definition of measurability, set E splits set A in such a manner that $(A \cap E)$
and $(A \cap \sim E)$ have outer measures that add to the outer measure of set
A. Since set E splits set A in such a way that outer measure is additive on set A,
set E will be regarded as measurable. In other words, this definition achieves
the rudimentary additive notion for outer measure.
Finally, it should be noted that the definition of measurability of set E
starts with the outer measure of set A, not with the outer measure of set E. Set A
is an arbitrary "test set." The measurability of set E has nothing to do with the
outer measure of set E itself, but it depends on what set E does to the outer
measure of other sets. A measurable set is one which separates no set in such
a way as to abrogate additivity for outer measure (Munroe, 86).
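The role of the "test set" can be made concrete with a small finite sketch. The example below is not Lebesgue outer measure; it assumes a hypothetical toy outer measure on the three-point set {0, 1, 2} (0 on the empty set, 2 on the whole set, 1 otherwise) and checks the measurability condition against every test set A. The names `outer`, `subsets`, and `is_measurable` are illustrative only.

```python
from itertools import combinations

# Assumption: a toy three-point space, not the real line.
X = frozenset({0, 1, 2})

def subsets(s):
    """Return every subset of s as a frozenset, smallest first."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(s):
    """Toy outer measure: 0 on the empty set, 2 on X, 1 on everything else."""
    if not s:
        return 0
    return 2 if s == X else 1

def is_measurable(e):
    """Check m*(A) = m*(A & E) + m*(A - E) for every test set A."""
    return all(outer(a) == outer(a & e) + outer(a - e) for a in subsets(X))

measurable = [e for e in subsets(X) if is_measurable(e)]
print(measurable)
```

In this toy setting only the empty set and the whole set pass the test. For instance, E = {0} fails because the test set A = {0, 1} has outer measure 1 while its two pieces have outer measures adding to 2.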
From this foundation, the additivity of outer measure can now be extended to
pairs of disjoint sets, provided one of the sets is measurable.
Proposition 4:
If $E_1$ is measurable, and $E_1 \cap E_2 = \emptyset$, then
$m^*(E_1 \cup E_2) = m^*(E_1) + m^*(E_2)$.
Since $E_1$ and $E_2$ are disjoint,
$(E_1 \cup E_2) \cap E_1 = E_1$, and $(E_1 \cup E_2) \cap {\sim}E_1 = E_2$.
Since $E_1$ is measurable,
$m^*(E_1 \cup E_2) = m^*[(E_1 \cup E_2) \cap E_1] + m^*[(E_1 \cup E_2) \cap {\sim}E_1]$.
By substitution, $m^*(E_1 \cup E_2) = m^*(E_1) + m^*(E_2)$.
This proposition is a special case of Theorem 5, which is proved later in this
chapter. Using the definition of measurability, two other sets can now be
classified as measurable.
Theorem 1: The empty set, $\emptyset$, and the set of real numbers, R, are measurable.
1. Let A be any set of real numbers.
2. By proposition 1, $m^*\emptyset = 0$, therefore, $m^*A = m^*\emptyset + m^*A$.
3. Since $\emptyset = A \cap \emptyset$, and $A = A \cap \mathbb{R} = A \cap {\sim}\emptyset$, by substituting into
statement 2, it can be inferred that $m^*A = m^*(A \cap \emptyset) + m^*(A \cap {\sim}\emptyset)$.
4. Therefore, the empty set, $\emptyset$, is measurable.
5. It should be noted that the definition of measurability is symmetric in E
and ${\sim}E$, which means that ${\sim}E$ is measurable whenever E is measurable.
6. Since ${\sim}\emptyset = \mathbb{R}$ and ${\sim}\mathbb{R} = \emptyset$, the set of real numbers is also measurable.
Combining the notion that, for any sets of real numbers, A and E, set A is
equal to $(A \cap E) \cup (A \cap {\sim}E)$ with the proposition that outer measure is countably
subadditive, it can be concluded that $m^*A \le m^*(A \cap E) + m^*(A \cap {\sim}E)$ is true for
all sets of real numbers, A and E. It can now be inferred that, in order to prove
that a set E is measurable, it is only required to demonstrate that
$m^*A \ge m^*(A \cap E) + m^*(A \cap {\sim}E)$.
The following theorems establish that outer measure, restricted to a class
of measurable sets, is a translation invariant, countably additive set function,
and that the measure of an interval is the length of the interval.
Theorem 2: If E is a set of real numbers and $m^*E = 0$, then E is measurable
(Royden, 58).
1. Let A be any set of real numbers, then $(A \cap E) \subset E$.
2. Therefore, $m^*(A \cap E) \le m^*E$ and, since $m^*E = 0$, it can be inferred that
$m^*(A \cap E) = 0$, since measure is never negative.
3. Since $(A \cap {\sim}E) \subset A$, $m^*(A \cap {\sim}E) \le m^*A$.
4. Reversing this inequality and adding 0, $m^*A \ge m^*(A \cap {\sim}E) + 0$, and
substituting from statement 2, $m^*A \ge m^*(A \cap {\sim}E) + m^*(A \cap E)$.
5. Therefore set E is measurable.
Theorem 3: If $E_1$ and $E_2$ are measurable sets of real numbers, so is $(E_1 \cup E_2)$
(Royden, 59).
1. Let A be any set of real numbers.
2. Since $E_2$ is measurable,
$m^*(A \cap {\sim}E_1) = m^*(A \cap {\sim}E_1 \cap E_2) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$.
3. Since $A \cap (E_1 \cup E_2) = (A \cap E_1) \cup (A \cap E_2 \cap {\sim}E_1)$, then
$m^*(A \cap (E_1 \cup E_2)) \le m^*(A \cap E_1) + m^*(A \cap E_2 \cap {\sim}E_1)$.
4. Adding $m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$ to both sides of statement 3:
$m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*(A \cap E_1) +
m^*(A \cap E_2 \cap {\sim}E_1) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2)$.
5. Substituting statement 2 into statement 4,
$m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*(A \cap E_1) + m^*(A \cap {\sim}E_1)$.
6. By definition of the measurability of $E_1$,
$m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}E_1 \cap {\sim}E_2) \le m^*A$.
7. Since $({\sim}E_1 \cap {\sim}E_2) = {\sim}(E_1 \cup E_2)$, statement 6 becomes
$m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap {\sim}(E_1 \cup E_2)) \le m^*A$.
8. By the definition of measurability, $(E_1 \cup E_2)$ is measurable.
Theorem 4: The family of measurable sets, $\mathcal{M}$, is an algebra of sets (Royden).
Let the sets $E_1$ and $E_2$ be members of the set $\mathcal{M}$ of measurable sets.
Set $E_1$ being measurable implies that ${\sim}E_1$ is also measurable and is
therefore contained in $\mathcal{M}$.
As demonstrated by Theorem 3, $(E_1 \cup E_2)$ is measurable and is also a
member of $\mathcal{M}$.
By definition, the family of measurable sets $\mathcal{M}$ is an algebra of sets.
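Theorem 5: If A is any set, and if $E_1, \ldots, E_n$ is a finite sequence of disjoint
measurable sets, then $m^*(A \cap [\bigcup_{i=1}^{n} E_i]) = \sum_{i=1}^{n} m^*(A \cap E_i)$ (Royden, 59).
This theorem will be proven by induction.
1. If n = 1, $m^*(A \cap E_1) = \sum_{i=1}^{1} m^*(A \cap E_i) = m^*(A \cap E_1)$.
2. Assume the induction hypothesis:
$m^*(A \cap [\bigcup_{i=1}^{n-1} E_i]) = \sum_{i=1}^{n-1} m^*(A \cap E_i)$.
3. Since $E_n$ is measurable,
$m^*(A \cap [\bigcup_{i=1}^{n} E_i]) = m^*(A \cap [\bigcup_{i=1}^{n} E_i] \cap E_n) + m^*(A \cap [\bigcup_{i=1}^{n} E_i] \cap {\sim}E_n)$.
4. Since the $E_i$ are disjoint, $[\bigcup_{i=1}^{n} E_i] \cap E_n = E_n$ and
$[\bigcup_{i=1}^{n} E_i] \cap {\sim}E_n = \bigcup_{i=1}^{n-1} E_i$, so
$m^*(A \cap [\bigcup_{i=1}^{n} E_i]) = m^*(A \cap E_n) + \sum_{i=1}^{n-1} m^*(A \cap E_i)$, by statement 2.
5. Therefore, $m^*(A \cap [\bigcup_{i=1}^{n} E_i]) = \sum_{i=1}^{n} m^*(A \cap E_i)$.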
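Theorem 6: The collection $\mathcal{M}$ of measurable sets is a $\sigma$-algebra (Royden, 60).
1. Theorem 4 demonstrated that $\mathcal{M}$ is an algebra of sets. Thus, to prove that $\mathcal{M}$
is a $\sigma$-algebra, all that must be verified is that any union of a countable
collection of measurable sets is measurable. In other words, it must be
shown that if $A_1, A_2, \ldots \in \mathcal{M}$, then $A_1 \cup A_2 \cup \cdots \in \mathcal{M}$, or $E = \bigcup_{i=1}^{\infty} A_i \in \mathcal{M}$.
2. Let $E_1 = A_1$, $E_2 = A_2 \setminus A_1$, $E_3 = A_3 \setminus (A_1 \cup A_2)$, $E_4 = A_4 \setminus (A_1 \cup A_2 \cup A_3)$, ...,
$E_n = A_n \setminus (A_1 \cup A_2 \cup \cdots \cup A_{n-1})$.
3. Each $E_i \in \mathcal{M}$, $E_i \cap E_j = \emptyset$ for $i \ne j$,
$\bigcup_{i=1}^{n} E_i = \bigcup_{i=1}^{n} A_i$ for any n, and
$\bigcup_{i=1}^{\infty} E_i = \bigcup_{i=1}^{\infty} A_i$.
4. Let A be any subset of real numbers and $F_n$ be defined as $\bigcup_{i=1}^{n} E_i$. (Note
that $F_n \in \mathcal{M}$ and $F_n \subset E$, which implies that ${\sim}E \subset {\sim}F_n$.)
5. Since $F_n$ is measurable, $m^*A = m^*(A \cap F_n) + m^*(A \cap {\sim}F_n)$.
6. Since ${\sim}E \subset {\sim}F_n$, $m^*A \ge m^*(A \cap F_n) + m^*(A \cap {\sim}E)$, that is,
$m^*A \ge m^*(A \cap \bigcup_{i=1}^{n} E_i) + m^*(A \cap {\sim}E)$.
7. Using Theorem 5, $m^*A \ge \sum_{i=1}^{n} m^*(A \cap E_i) + m^*(A \cap {\sim}E)$.
8. As $n \to \infty$, $\lim\,[m^*A] \ge \lim\,[\sum_{i=1}^{n} m^*(A \cap E_i) + m^*(A \cap {\sim}E)]$, which
becomes $m^*A \ge \sum_{i=1}^{\infty} m^*(A \cap E_i) + m^*(A \cap {\sim}E)$.
9. Using statements 1 and 3, $E = \bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} E_i$, it is concluded that
$A \cap E = (\bigcup_{i=1}^{\infty} E_i) \cap A$, which implies that $m^*(A \cap E) = m^*((\bigcup_{i=1}^{\infty} E_i) \cap A)$.
10. Since outer measure is countably subadditive,
$m^*(A \cap E) = m^*((\bigcup_{i=1}^{\infty} E_i) \cap A) \le \sum_{i=1}^{\infty} m^*(E_i \cap A)$, therefore,
$m^*(A \cap E) \le \sum_{i=1}^{\infty} m^*(E_i \cap A)$.
11. By substitution into statement 8, $m^*A \ge m^*(A \cap E) + m^*(A \cap {\sim}E)$, which
makes set E measurable, and it is concluded that $\mathcal{M}$ is a $\sigma$-algebra.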
Theorem 7: The interval $(a, \infty)$ is measurable (Royden, 60).
1. Let A be any set and define $A_1 = A \cap (a, \infty)$ and $A_2 = A \cap (-\infty, a]$.
As an example, let a = 1, and A = [-10, 10]. Then $A_1 = [-10, 10] \cap (1, \infty)
= (1, 10]$ and $A_2 = [-10, 10] \cap (-\infty, 1] = [-10, 1]$.
2. For $(a, \infty)$ to be measurable, it must be shown that
$m^*A = m^*(A \cap (a, \infty)) + m^*(A \cap {\sim}(a, \infty))$, which is equivalent to
$m^*A = m^*(A \cap (a, \infty)) + m^*(A \cap (-\infty, a])$, or $m^*A = m^*A_1 + m^*A_2$.
3. Since $m^*A \le m^*A_1 + m^*A_2$, all that needs to be shown is that
$m^*A \ge m^*A_1 + m^*A_2$. If $m^*A = \infty$, then the theorem is proven.
4. If $m^*A < \infty$, then, given any $\varepsilon > 0$, there is a countable collection, $\{I_n\}$, of open intervals
which cover A and for which $\sum l(I_n) \le m^*A + \varepsilon$.
5. Let $I_n' = I_n \cap (a, \infty)$ and $I_n'' = I_n \cap (-\infty, a]$. Then $I_n'$ and $I_n''$ are intervals
(possibly empty sets) and $l(I_n) = l(I_n') + l(I_n'') = m^*(I_n') + m^*(I_n'')$.
6. Since $A_1 \subset \bigcup I_n'$, $m^*A_1 \le m^*(\bigcup I_n') \le \sum m^*(I_n')$. This inequality
follows from the notion that outer measure is countably subadditive.
7. By a similar argument, $A_2 \subset \bigcup I_n''$, and
$m^*A_2 \le m^*(\bigcup I_n'') \le \sum m^*(I_n'')$.
8. Combining statements 6 and 7, $\sum [m^*(I_n') + m^*(I_n'')] \ge m^*A_1 + m^*A_2$.
9. Substituting from statement 5, $\sum l(I_n) \ge m^*A_1 + m^*A_2$ and from statement
4, $m^*A + \varepsilon \ge m^*A_1 + m^*A_2$, which makes $m^*A \ge m^*A_1 + m^*A_2$, since $\varepsilon$ is
arbitrary.
10. Therefore, the interval $(a, \infty)$ is measurable.
At this point in the development of measure theory, it is necessary to
consider more general types of sets than the open and closed sets. This is
required because the intersection of any collection of closed sets is closed, and
the union of any finite collection of closed sets is closed, however, the union of a
countable collection of closed sets is not necessarily closed. As an example,
consider the set of rational numbers. The set of rational numbers is the union of
a countable collection of closed sets, each of which contains exactly one
number. But the union of these closed sets is not closed. Thus, if $\sigma$-algebras of
sets that contain all of the closed sets are to be considered, a more general type
of set must be specified (Royden, 52).
The Borel sets, $\mathcal{B}$, of R are the smallest family of subsets of R with
the following three properties:
(i) The family is closed under complements. If $B \in \mathcal{B}$, then ${\sim}B \in \mathcal{B}$.
(ii) The family is closed under countable unions. If $B_1, B_2, \ldots \in \mathcal{B}$, then
$\bigcup_{i=1}^{\infty} B_i \in \mathcal{B}$.
(iii) The family contains each open interval.
Since the family of all subsets of R is a $\sigma$-algebra that satisfies (i), (ii),
and (iii) above, the collection of all such $\sigma$-algebras is non-empty. To see that
the smallest family exists, consider that if $\{\mathcal{B}_\alpha\}$ is a non-empty collection of
families satisfying the three conditions stated above, then so does $\bigcap_\alpha \mathcal{B}_\alpha$, and
this intersection is the smallest family that satisfies all the conditions (Reed and
Simon, 14-15). The reader should note that, from statements (i) and (ii) of the
definition, the Borel sets form a $\sigma$-algebra. The only properties of Borel sets
needed in this paper follow from the fact that they form the smallest $\sigma$-algebra
containing the open and closed sets (Royden, 53).
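The idea of a "smallest family closed under complements and unions" can be illustrated on a finite set, where the closure can be computed directly. The sketch below is only a finite analogue, not a construction of the Borel sets; the function name `generate_sigma_algebra` and the four-point universe are assumptions made for illustration.

```python
def generate_sigma_algebra(universe, generators):
    """Close a family of subsets of a finite universe under complement and union."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        # Add any missing complements.
        for s in list(family):
            comp = universe - s
            if comp not in family:
                family.add(comp)
                changed = True
        # Add any missing pairwise unions (finite case: this covers all unions).
        for s in list(family):
            for t in list(family):
                u = s | t
                if u not in family:
                    family.add(u)
                    changed = True
    return family

# Hypothetical example: generators {1} and {2} inside the universe {1, 2, 3, 4}.
family = generate_sigma_algebra(range(1, 5), [{1}, {2}])
print(sorted(sorted(s) for s in family))
```

Here the generated family has eight members, built from the three "atoms" {1}, {2}, and {3, 4}. The set {3} never appears, which mirrors the fact that the smallest family need not contain every subset.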
Theorem 8: Every Borel set is measurable, specifically each open set and each
closed set is measurable (Royden, 61).
By Theorem 6, the collection $\mathcal{M}$ of measurable sets is a $\sigma$-algebra. By
Theorem 7, $(a, \infty) \in \mathcal{M}$, which means ${\sim}(a, \infty) = (-\infty, a] \in \mathcal{M}$.
Since $\mathcal{M}$ is a $\sigma$-algebra, the union, as well as the intersection, of a
countable collection of sets in $\mathcal{M}$ is also an element of $\mathcal{M}$. Each open interval of
the form $(-\infty, b)$ can be represented as $\bigcup_{n=1}^{\infty} (-\infty, b - \frac{1}{n}]$, which makes $(-\infty, b)$
an element of $\mathcal{M}$.
Every open interval of the form (a, b) can be represented as
$(-\infty, b) \cap (a, \infty)$. This implies that all open intervals are contained in $\mathcal{M}$.
Since $\mathcal{M}$ contains all of the open intervals, and since $\mathcal{M}$ is a $\sigma$-algebra,
$\mathcal{M}$ contains the smallest $\sigma$-algebra containing all the open intervals, therefore
$\mathcal{M}$ contains the Borel sets.
The following theorem states that if outer measure is applied to a
sequence of disjoint members of the collection $\mathcal{M}$ of measurable sets, then outer
measure is a countably additive set function.
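Theorem 9: If $\{E_i\}$ is a sequence of measurable sets and $E_i \cap E_j = \emptyset$ for $i \ne j$,
then $m^*(\bigcup_{i=1}^{\infty} E_i) = \sum_{i=1}^{\infty} m^*E_i$ (Royden, 62).
1. Proposition 2 on page 14 demonstrated that outer measure is a
subadditive set function, that is, $m^*(\bigcup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} m^*E_i$. Therefore, to complete
the proof of this theorem, it is only necessary to prove that $m^*(\bigcup_{i=1}^{\infty} E_i) \ge \sum_{i=1}^{\infty} m^*E_i$.
2. Theorem 5 proved that, for any set A, $m^*(A \cap [\bigcup_{i=1}^{n} E_i]) = \sum_{i=1}^{n} m^*(A \cap E_i)$.
If set A is the set of real numbers, R, then $m^*(\bigcup_{i=1}^{n} E_i) = \sum_{i=1}^{n} m^*(E_i)$ and it can be
inferred that outer measure, on $\mathcal{M}$, is a finitely additive set function.
3. If $\{E_i\}$ is an infinite sequence of disjoint measurable sets, then
$\bigcup_{i=1}^{\infty} E_i \supset \bigcup_{i=1}^{n} E_i$, so $m^*(\bigcup_{i=1}^{\infty} E_i) \ge m^*(\bigcup_{i=1}^{n} E_i)$.
4. By substitution from statement 2, $m^*(\bigcup_{i=1}^{\infty} E_i) \ge \sum_{i=1}^{n} m^*(E_i)$.
5. The left side of the inequality in statement 4 is independent of n, and the
limit, as $n \to \infty$, of $\sum_{i=1}^{n} m^*(E_i)$ is $\sum_{i=1}^{\infty} m^*E_i$, which implies $m^*(\bigcup_{i=1}^{\infty} E_i) \ge \sum_{i=1}^{\infty} m^*E_i$.
6. Combining statements 1 and 5, $m^*(\bigcup_{i=1}^{\infty} E_i) = \sum_{i=1}^{\infty} m^*E_i$.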
It can now be concluded that if outer measure is restricted to the set $\mathcal{M}$ of
measurable sets, it will be a countably additive set function which is translation
invariant (Royden, 58), and has the property that the measure of an interval will
be the length of the interval (Royden, 56). If set E is a measurable set, the
Lebesgue measure, mE, is defined to be the outer measure of E. Lebesgue
measure is, therefore, outer measure, restricted to the collection of measurable
sets.
A measurable space is usually represented as a couple, $(X, \mathcal{B})$,
consisting of a set X and a $\sigma$-algebra $\mathcal{B}$ of subsets of X. Since $\mathcal{B}$ is a
$\sigma$-algebra, $\mathcal{B}$ is a family of subsets of X which contains the empty set, and is closed
with respect to complements and with respect to countable unions. A subset A
of X is defined to be measurable if $A \in \mathcal{B}$. A measure, $\mu$, on the measurable
space $(X, \mathcal{B})$, is a nonnegative set function defined for all sets of $\mathcal{B}$, and
satisfying the following conditions:
$\mu(\emptyset) = 0$, and
$\mu(\bigcup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mu A_i$, for any sequence $A_i$ of disjoint measurable
sets in $\mathcal{B}$.
A measure space, represented by the triple $(X, \mathcal{B}, \mu)$, is a measurable space
with a measure $\mu$ defined on $\mathcal{B}$ (Royden, 254).
An example of a measure space is $(\mathbb{R}, \mathcal{M}, m)$ where $\mathbb{R}$ is the set of real
numbers, $\mathcal{M}$ is the set of measurable sets of real numbers, and m is Lebesgue
measure. Another example would be $(\mathbb{R}, \mathcal{B}, m)$ where $\mathbb{R}$ is again the set of real
numbers, $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, and m is again Lebesgue measure.
A measure space $(X, \mathcal{B}, \mu)$ is defined to be complete if $\mathcal{B}$ contains all
subsets of sets of measure zero. This means that if $B \in \mathcal{B}$, $\mu B = 0$, and $A \subset B$,
then $A \in \mathcal{B}$. The measure space defined by $(\mathbb{R}, \mathcal{M}, m)$ is complete, since, by
Theorem 2, $\mathcal{M}$ contains all subsets of sets of measure 0, while the measure space $(\mathbb{R}, \mathcal{B}, m)$,
where $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, is not. To see this, the reader should
note that, by Theorem 8, every Borel set on the real line is Lebesgue
measurable and that there are Lebesgue measurable sets of measure zero
which are not Borel sets (Munroe, 148-149). It can also be shown that any
Lebesgue measurable set of measure zero is contained in a Borel set of
measure zero (Munroe, 97-98). Thus, the Borel $\sigma$-algebra does not contain all subsets
of sets of measure zero, and $(\mathbb{R}, \mathcal{B}, m)$ is, therefore, not complete. It can be proven that
each measure space can be completed by the addition of subsets of sets of
measure zero. This is called the completion of the measure space.
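Formally, the completion of measure space $(X, \mathcal{B}, \mu)$ is the measure
space $(X, \mathcal{B}_0, \mu_0)$ where
$\mathcal{B} \subset \mathcal{B}_0$,
if $E \in \mathcal{B}$, then $\mu E = \mu_0 E$, and
$E \in \mathcal{B}_0$ if and only if $E = A \cup B$, where $B \in \mathcal{B}$ and $A \subset C$, $C \in \mathcal{B}$, $\mu C = 0$.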
Thus, the completion of measure space $(\mathbb{R}, \mathcal{B}, m)$, where $\mathcal{B}$ is the
$\sigma$-algebra of Borel sets, is the measure space $(\mathbb{R}, \mathcal{M}, m)$, where $\mathcal{M}$ is the set of
Lebesgue measurable sets (Royden, 257). This provides an alternative
description of the Lebesgue measurable sets. A set is Lebesgue measurable if,
and only if, it is the union of a Borel set and a subset of a Borel set with
Lebesgue measure zero.
With the foundations of measure theory now complete, it is time to return
for a closer look at the Riemann integral and its shortcomings, and to develop a more
general form of integration as defined by the Lebesgue integral.
Chapter 4
Riemann Integration
An elegant formulation of the Riemann integral can be obtained by the
use of step functions and, to introduce this formulation, it is necessary to
establish a few working definitions. To begin with, let f(x) be a real valued,
bounded function on the interval [a, b] and let $x_0, \ldots, x_n$ be a partition of [a, b]
such that $x_0 = a$ and $x_n = b$. Let $M_i$ be defined as $\sup\{f(x) : x_{i-1} \le x \le x_i\}$ and
let $m_i$ be defined as $\inf\{f(x) : x_{i-1} \le x \le x_i\}$, then the upper sum is designated
as $S = \sum_{i=1}^{n} M_i(x_i - x_{i-1})$, and the lower sum denoted as $s = \sum_{i=1}^{n} m_i(x_i - x_{i-1})$. The
expressions "s" and "S" are called Darboux Sums, and from their definitions,
it can be seen that $s \le S$ (Cronin-Scanlon, 127-128). The Upper Riemann
integral of f(x) is defined as
$$R\overline{\int_a^b} f(x)\,dx = \inf S,$$
and the Lower Riemann integral of f(x) is defined as
$$R\underline{\int_a^b} f(x)\,dx = \sup s,$$
with the infimum and supremum taken over all possible subdivisions of [a, b]. If
the Upper and Lower Riemann integrals are equal, then f(x) is defined to be
Riemann integrable, and the common value is denoted as $R\int_a^b f(x)\,dx$ (Royden, 76).
If $x_0, \ldots, x_n$ is a partition of [a, b] such that $x_0 = a$ and $x_n = b$, then the step
function, $\psi$, has the form $\psi(x) = c_i$, where $x_{i-1} < x < x_i$ and $c_i$ is a constant. For
example, suppose n = 3, $c_1 = 2$, $c_2 = 3$, $c_3 = 1$, then the graph of the step
function formed would be represented as shown in Figure 6.
From the definition of the step function, it can be seen that the
$\int_a^b \psi(x)\,dx$ is equal to $\sum_{i=1}^{n} c_i(x_i - x_{i-1})$, which implies that
$$R\overline{\int_a^b} f(x)\,dx = \inf \int_a^b \psi(x)\,dx, \text{ for all step functions } \psi(x) \ge f(x),$$
and
$$R\underline{\int_a^b} f(x)\,dx = \sup \int_a^b \psi(x)\,dx, \text{ for all step functions } \psi(x) \le f(x)$$
(Royden, 76). This means that if f(x) is Riemann integrable, its integral can be
expressed in terms of the integral of step functions and this formulation
represents the intuitive notion of the Riemann integral as presented in
elementary calculus.
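As a worked illustration of the Darboux sums, the sketch below computes the lower and upper sums exactly for an increasing function on a uniform partition, where the supremum on each subinterval is the value at the right endpoint and the infimum is the value at the left endpoint. The choice of $f(x) = x^2$ on [0, 1] and the helper name `darboux_sums` are assumptions made for illustration, and the shortcut for the supremum and infimum is valid only because the function is increasing.

```python
from fractions import Fraction

def darboux_sums(f, a, b, n):
    """Lower and upper Darboux sums of an increasing f on a uniform partition of [a, b]."""
    a, b = Fraction(a), Fraction(b)
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    # For increasing f: inf on [x_{i-1}, x_i] is f(x_{i-1}), sup is f(x_i).
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return lower, upper

for n in (4, 16, 64):
    s, S = darboux_sums(lambda x: x * x, 0, 1, n)
    print(n, s, S, S - s)
```

For this function the gap S - s equals 1/n, so both sums squeeze toward the common value 1/3 as the partition is refined.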
As demonstrated in chapter 2, the function f, defined for $x \in [0, 1]$ by one
constant where x is irrational and a different constant where x is rational,
is not a Riemann integrable function. This example illustrates the shortcomings
of Riemann integration and shows the need to develop a more general form of
integration. The required general form was first introduced by Henri Lebesgue,
in his doctoral thesis presented at the Sorbonne in 1902 (Gillispie, 110).
Lebesgue Integration
Henri Leon Lebesgue was born in Beauvais, France on June 28, 1875.
Lebesgue studied at the Ecole Normale Superieure from 1894 to 1897 and his
first university position was at Rennes, from 1902 to 1906. After his studies at
the Ecole Normale Superieure, Lebesgue became acquainted with the
research of another graduate of the Ecole, Rene Baire. Baire's studies included
the theory of discontinuous functions of a real variable which helped focus
Lebesgue on the deficiencies of Riemann integration. In his doctoral thesis,
Lebesgue began the development of a theory of integration which, as he
demonstrated in his presentation at the Sorbonne, included all the bounded,
discontinuous functions introduced by Baire.
By 1922, Lebesgue had produced nearly ninety books and papers.
Much of his work dealt with his theory of integration, but he also presented
significant work in the calculus of variations, the theory of surface area, and in
the structures of sets and functions. In 1922, he was elected to the Academie
des Sciences and, during the last twenty years of his life, he remained active in
his studies of pedagogy, elementary geometry, and history. Lebesgue died in
Paris on July 26, 1941 (Gillispie, 110- 112, and Lebesgue, 1 - 5).
To begin the study of Lebesgue integration, it is necessary to have a
function that has a value of one on a measurable set and zero elsewhere. The
integral of this function should, therefore, be equal to the measure of the set.
The function that satisfies these conditions, on the measurable set E, is called
the characteristic function, $\chi_E$, and is defined as follows:
$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases}$$
As introduced in Chapter 2, the inverse image, $f^{-1}(A)$, is defined as the
set $\{x : f(x) \in A\}$. Equivalently, the inverse image is the set of points in the
domain of the function which are mapped into A by f.
An extended real valued function, f, is defined to be measurable, or
Lebesgue measurable, if its domain is measurable and, whenever set A is a
Borel set, $f^{-1}(A)$ is measurable. Equivalently, f has a measurable domain and
satisfies one of the following conditions:
For each real number a, the set $\{x : f(x) > a\}$ is measurable.
For each real number a, the set $\{x : f(x) \ge a\}$ is measurable.
For each real number a, the set $\{x : f(x) < a\}$ is measurable.
For each real number a, the set $\{x : f(x) \le a\}$ is measurable.
The reader should note that the above conditions are equivalent to one another
(Royden, 66). It should be observed that continuous functions, with measurable
domains, are measurable. All step functions are also measurable. If f is a
measurable function and E is a measurable subset of the domain of f, then the
function obtained by restricting f to E is measurable (Royden, 77).
Suppose $E_1, \ldots, E_n$ is a finite class of disjoint, measurable sets, and
$\{a_1, \ldots, a_n\}$ is a set of finite real numbers. Then the real valued function $\varphi$,
defined by the relation
$$\varphi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x),$$
is called a simple function. A simple function is a function that assumes a finite
number of values and assumes each of these values on a measurable set
(Munroe, 155). It should also be noted that a function $\varphi$ is simple if, and only if,
it is measurable and assumes only a finite number of values (Royden, 77).
The basic idea of Lebesgue integration is to replace the step functions of
Riemann integration with simple functions. But, before this can be
accomplished, a few facts about simple functions need to be established. One
of the more useful portrayals of the simple function is called the canonical
representation and is defined as follows: Suppose $\varphi$ is a simple function and
$\{a_1, \ldots, a_n\}$ is the set of nonzero values of $\varphi$, then $\varphi = \sum_{i=1}^{n} a_i \chi_{A_i}$ where
$A_i = \{x : \varphi(x) = a_i\}$. The canonical representation of $\varphi$ is characterized by the
fact that the $A_i$ are disjoint and the $a_i$ are distinct and nonzero. If $\varphi$ vanishes
outside a set of finite measure, then, by definition
$$\int \varphi(x)\,dx = \sum_{i=1}^{n} a_i\,(mA_i),$$
where $(mA_i)$ is the Lebesgue measure of $A_i$, and $\varphi = \sum_{i=1}^{n} a_i \chi_{A_i}$ is the canonical
representation. This integral is usually abbreviated as $\int \varphi$. If E is
any measurable set, then $\int_E \varphi$ is defined as $\int \varphi \cdot \chi_E$
(Royden, 77).
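As an example, suppose $\varphi(x) = 3\chi_{[0, 1]} + 2\chi_{[2, 5]}$. If $[0, 1] \cup [2, 5] \subset E$, then
$$\int_E \varphi\,dx = \int_E (3\chi_{[0, 1]} + 2\chi_{[2, 5]})\,dx
= \int_E 3\chi_{[0, 1]}\,dx + \int_E 2\chi_{[2, 5]}\,dx
= 3m[0, 1] + 2m[2, 5]
= 3(1) + 2(3)
= 9.$$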
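The arithmetic of this example can be sketched in a few lines. The snippet below assumes the simple function is given as a list of (value, interval) pairs with disjoint closed intervals, so that the measure of each set is just the interval length; the name `integrate_simple` and this representation are hypothetical conveniences, not notation from the sources cited.

```python
from fractions import Fraction

def integrate_simple(terms):
    """Integrate sum(a_i * chi_[l_i, r_i]) over disjoint intervals: sum of a_i * length."""
    return sum(Fraction(a) * (Fraction(r) - Fraction(l)) for a, (l, r) in terms)

# The simple function 3*chi_[0,1] + 2*chi_[2,5] from the example above.
phi = [(3, (0, 1)), (2, (2, 5))]
print(integrate_simple(phi))
```

The result agrees with the value 3(1) + 2(3) = 9 obtained above.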
The following proposition brings the idea of simple functions a little closer
to the more familiar step functions: Suppose f is a bounded, real valued
function defined on set E, a measurable set of finite measure. If $\varphi$ and $\psi$ are
simple functions, then, by analogy with the Riemann integral, it can be
concluded that
$$\inf \int_E \psi(x)\,dx, \text{ for } f \le \psi, \text{ is equal to } \sup \int_E \varphi(x)\,dx, \text{ for } f \ge \varphi,$$
if, and only if, f is a measurable function. A proof of this proposition is provided
by Royden, pages 79-81.
The Lebesgue integral for non-negative functions can now be defined as
follows: If f is a non-negative measurable function defined on a measurable set
E where mE is finite, the Lebesgue integral of f over E, denoted as
$\int_E f(x)\,dx$, is defined as the supremum of $\int_E \varphi(x)\,dx$, where the supremum
is over all simple functions, $\varphi \le f$, which vanish outside of a set of finite measure. This
integral is sometimes expressed as $\int_E f$, or, if E = [a, b], the integral would be
written as $\int_a^b f(x)\,dx$. Any non-negative measurable function is said to be "integrable" if
$\int_E f(x)\,dx < \infty$. Before
defining a Lebesgue integrable function, two further definitions must be
introduced. The positive part, $f^+(x)$, of function f(x), is defined as
$$f^+(x) = \max\{f(x), 0\}.$$
In a similar manner, the negative part, $f^-(x)$, of function f(x), is defined as
$$f^-(x) = \max\{-f(x), 0\}.$$
It should be observed that both $f^+(x)$ and $f^-(x)$ are non-negative functions and
that $f(x) = f^+(x) - f^-(x)$.
Thus, if f(x) is measurable, then $f^+(x)$ and $f^-(x)$ are
both measurable and non-negative. This leads to the following definition:
A measurable function f(x) is defined to be Lebesgue integrable
over set E if $f^+(x)$ and $f^-(x)$ are both integrable over E. If they
are integrable, then
$$\int_E f(x)\,dx = \int_E f^+(x)\,dx - \int_E f^-(x)\,dx$$
(Royden, 89-90).
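The decomposition into positive and negative parts can be checked pointwise with a short sketch. The helper names `positive_part` and `negative_part`, and the sample function $x^2 - 4$, are illustrative assumptions only.

```python
def positive_part(f):
    """Return the function x -> max(f(x), 0)."""
    return lambda x: max(f(x), 0)

def negative_part(f):
    """Return the function x -> max(-f(x), 0)."""
    return lambda x: max(-f(x), 0)

f = lambda x: x * x - 4          # negative on (-2, 2), non-negative elsewhere
f_plus, f_minus = positive_part(f), negative_part(f)

for x in (-3, 0, 1, 3):
    # Columns: x, f(x), positive part, negative part
    print(x, f(x), f_plus(x), f_minus(x))
```

At every sample point both parts are non-negative, their difference recovers f(x), and their sum gives |f(x)|.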
In considering both Riemann and Lebesgue integration, what are the
conditions for a function to be integrable? It has been shown that a necessary
and sufficient condition for a function to be Riemann integrable is that the
function must be bounded and continuous almost everywhere. This means that,
for example, a bounded function with a countable number of discontinuities is Riemann
integrable. A necessary and sufficient condition for a function f to be Lebesgue
integrable is that f be measurable and both $f^+(x)$ and $f^-(x)$ have finite integrals.
The following proposition establishes that the Lebesgue integral is
actually a generalization of the Riemann integral (Munroe, 177).
Proposition: If a bounded function f is Riemann integrable on [a, b], then it is
Lebesgue integrable on [a, b], and $R\int_a^b f(x)\,dx = \int_a^b f(x)\,dx$.
Since f is Riemann integrable, $R\underline{\int_a^b} f(x)\,dx = R\overline{\int_a^b} f(x)\,dx$.
But every step function is also a simple function, and
$$R\underline{\int_a^b} f(x)\,dx \le \sup_{\varphi \le f} \int_a^b \varphi(x)\,dx \le \inf_{f \le \psi} \int_a^b \psi(x)\,dx \le R\overline{\int_a^b} f(x)\,dx.$$
Therefore, $\sup_{\varphi \le f} \int_a^b \varphi(x)\,dx = \inf_{f \le \psi} \int_a^b \psi(x)\,dx$, and f must be Lebesgue
integrable, and $R\int_a^b f(x)\,dx = \int_a^b f(x)\,dx$.
An alternative proof of this proposition is provided in Cronin-Scanlon, on pages
197-198.
Before illustrating some of the uses of Lebesgue integration, the
following definitions from analysis should be reviewed.
A sequence of functions $\{f_n\}$ defined on a set E is said to converge
pointwise on E to a function f(x) if, for all x in E, $\lim_{n \to \infty} f_n(x) = f(x)$. This means
that, given an $x \in E$ and an arbitrary $\varepsilon > 0$, there is an N such that for all $n \ge N$,
$|f_n(x) - f(x)| < \varepsilon$ (Royden, 49).
A sequence of bounded functions $\{f_n\}$ defined on a set E is said to
converge uniformly to f(x) on E if $\lim_{n \to \infty} \sup_{x \in E} |f_n(x) - f(x)| = 0$ (Beals, 47).
Figure 7
The sequence of functions, $f_n(x) = x^n$, on the interval [0, 1), is an example
of a sequence of functions that converges pointwise, but the sequence does not
converge uniformly on [0, 1). The functions $f_1(x) = x^1$, $f_2(x) = x^2$, and
$f_{10}(x) = x^{10}$ are illustrated in Figure 7.
The sequence $f_n(x) = x^n$ converges to 0 pointwise since $\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} x^n = 0
= f(x)$, for all $x \in [0, 1)$. To converge uniformly, $\lim_{n \to \infty} \sup |f_n(x) - f(x)|$ must
equal 0. However, $\lim_{n \to \infty} \sup |f_n(x) - f(x)| = \lim_{n \to \infty} \sup |x^n - 0| = \lim_{n \to \infty} \sup |x^n|
= 1 \ne 0$, and it is concluded that the sequence does not converge uniformly.
This example will be used to illustrate a use of Lebesgue integration after the
introduction of one more definition and a theorem.
A property is said to hold almost everywhere (a.e.), if the set of points
where it fails to hold is a set of measure zero (Royden, 69). For example,
f and g are equal a.e. if f and g have the same domain, and the set of
points where $f(x) \ne g(x)$ has measure zero.
The Lebesgue Dominated Convergence Theorem, first published by
Lebesgue in 1908, states: If $f_n(x)$ is a sequence of measurable functions and there exists an
integrable function G(x), such that $|f_n(x)| \le G(x)$ a.e. and the limit of $f_n(x)$, as
$n \to \infty$, is equal to f(x) a.e., then
$$\lim_{n \to \infty} \int f_n(x)\,dx = \int f(x)\,dx$$
(Hawkins, 118, and Reed-Simon, 24).
Using the previous example, $f_n(x) = x^n$, it has been established that $f_n(x)$
converges pointwise to 0 on [0, 1), but does not converge uniformly. Therefore, f(x) = 0
a.e. on [0, 1]. Since $|x^n| \le 1$ on [0, 1], let G(x) = 1. By the Lebesgue Dominated Convergence
Theorem,
$$\lim_{n \to \infty} \int_0^1 x^n\,dx = \int_0^1 0\,dx = 0.$$
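The two halves of this example can be checked numerically. The sketch below uses the closed form 1/(n + 1) for the integral of $x^n$ over [0, 1], and, for each n, exhibits a point of [0, 1) at which $x^n$ equals 1/2, which shows that the supremum of $|x^n - 0|$ never drops below 1/2. The helper names are hypothetical, and the witness point is computed in floating point, so equality holds only up to rounding.

```python
from fractions import Fraction

def integral_of_power(n):
    """Exact value of the integral of x**n over [0, 1], namely 1/(n + 1)."""
    return Fraction(1, n + 1)

def witness_point(n):
    """A point x in [0, 1) with x**n = 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)

for n in (1, 10, 100, 1000):
    x = witness_point(n)
    print(n, float(integral_of_power(n)), x, x ** n)
```

The integrals tend to 0, as the Dominated Convergence Theorem predicts, while the witness points drift toward 1 and keep the supremum away from 0, which is why the convergence is not uniform.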
The Fundamental Theorem of Calculus
Both Isaac Newton (1642- 1727) and Gottfried Leibniz (1646- 1716)
were familiar with the idea of relating integrals as sums with the concept of
integrals as anti-derivatives (Bell, 153). Newton even introduced the notion of
changing the upper bound of an integral to a free variable (Burton, 368).
However, it wasn't until 1823 that the first statement of the Fundamental
Theorem of Calculus was presented by Augustin Cauchy (1789- 1857). Using
the ideas of continuity and uniform continuity, Cauchy defined, and coined the
term "definite integral" of a continuous function on the closed interval [a, b].
In his statement of the fundamental theorem, Cauchy considered a
function f, continuous on the interval [a, b] and defined the function F(x) as
$$F(x) = \int_a^x f.$$
Cauchy also established that F(x) is both continuous and differentiable.
Cauchy's statement of the Fundamental Theorem of Calculus takes the
form of three theorems:
Theorem 1: F is a primitive function for f; that is, $F' = f$.
Theorem 2: All primitive functions of f must be in the form $\int_a^x f + C$,
where C denotes a constant; that is, if G is a function with a
continuous derivative G', then
$\int_a^x G' = G(x) - G(a)$.
Theorem 3: If G is a function such that G'(x) = 0 for all x in [a, b], then G(x) remains constant there.
The reader should note that Theorem 3 is an immediate consequence of
the Mean Value Theorem and was used by Cauchy to prove Theorem 2. In
1849, using the idea of limits, Cauchy refined his version of the fundamental
theorem by including discontinuous functions with a finite number of
discontinuities on the interval [a, b], and he extended his definition of the
integral to unbounded functions (Hawkins, 9- 12).
In analyzing the Fundamental Theorem of Calculus with regard to both
Riemann and Lebesgue integration, two basic questions arise.
1. When does $\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$?
If f is Riemann integrable on [a, b], then the relation holds at each point x
in [a, b] at which f is continuous. If f is continuous on [a, b], then the relation
holds for all points of [a, b] (Fulks, 156-157). If f(x) is Lebesgue integrable,
this relation holds for almost all values of x (Royden, 97). The reader should
note that, for Lebesgue integration, there is no requirement that f be continuous
at x.
This relation leads to the key observation that whenever f is continuous,
the integral $\int_a^x f(t)\,dt$ is a differentiable function of x. As illustrations of
functions whose definitions are only in terms of an integral, consider first, the
natural logarithm of a positive number. The natural logarithm is defined in the
following form:
$$\ln x = \int_1^x \frac{1}{t}\,dt, \text{ for all } x > 0.$$
As a second example, consider the sine integral:
$$\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt,$$
which appears in various engineering applications (Thomas, 278).
2. When does $\int_a^b f'(x)\,dx = f(b) - f(a)$?
For Riemann integration, this relation holds as long as f(x) is
differentiable on [a, b] and f'(x) is Riemann integrable on [a, b] (Fulks, 158). For
the Lebesgue integral, the answer to this question depends on the concept of
"absolute continuity."
A real valued function f, defined on the closed interval [a, b], is defined to
be absolutely continuous on [a, b] if, given an $\varepsilon > 0$, there is a $\delta > 0$ such that
$$\sum_{i=1}^{n} |f(x_i') - f(x_i)| < \varepsilon$$
for every finite collection $\{(x_i, x_i')\}$ of non-overlapping intervals with
$$\sum_{i=1}^{n} |x_i' - x_i| < \delta.$$
An absolutely continuous function is clearly continuous. Since any
continuous function on a closed interval is uniformly continuous, it follows that
an absolutely continuous function is uniformly continuous. However, the
converse is not true. Any absolutely continuous function f, on [a, b], is
differentiable almost everywhere (Royden, 108-109). The property of absolute
continuity was first introduced by Axel Harnack (1851-1888) in 1884 (Hawkins).
The following proposition establishes the Fundamental Theorem of
Calculus as applied to Lebesgue integration:
$$f(x) = f(a) + \int_a^x f'(t)\,dt, \text{ for all } x \in [a, b],$$
if, and only if, f is
absolutely continuous on [a, b] (Royden, 110, and Munroe, 268).
As an illustration of the power of the Lebesgue integral over the Riemann
integral, consider the function $f(x) = \frac{1}{\sqrt{x}}$ on the interval (0, 1].
Since f(x) is
unbounded, the Riemann integral does not exist, and, using the method first
introduced by Cauchy, the improper Riemann integral must be evaluated using
limits (Hawkins, 12). Thus,
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{\varepsilon \to 0^+} \int_\varepsilon^1 \frac{1}{\sqrt{x}}\,dx,$$
which, because of the limit, becomes a more complex problem than necessary.
It can be shown that $F(x) = 2\sqrt{x}$ is absolutely continuous on [0, 1], and therefore, it
satisfies the Fundamental Theorem of Calculus for Lebesgue integration, and
the integral becomes
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_0^1 = 2.$$
This integral is now a proper
Lebesgue integral, and it can be integrated as indicated.
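The limiting process behind the improper Riemann integral can be traced numerically. The sketch below evaluates the truncated integral over $[\varepsilon, 1]$ from the antiderivative $2\sqrt{x}$ and prints it for shrinking values of $\varepsilon$; the function name `truncated_integral` is an illustrative assumption.

```python
import math

def truncated_integral(eps):
    """Integral of 1/sqrt(x) over [eps, 1], from the antiderivative 2*sqrt(x)."""
    return 2.0 * math.sqrt(1.0) - 2.0 * math.sqrt(eps)

for k in (2, 4, 8, 16):
    eps = 10.0 ** -k
    print(eps, truncated_integral(eps))
```

The truncated values increase toward 2, the same value obtained directly from the Fundamental Theorem of Calculus for the Lebesgue integral.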
Chapter 5
Fourier Series
A trigonometric series is an infinite series of the form:
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$$
where $a_0, a_1, \ldots$ and $b_1, b_2, \ldots$ are given constants. If the trigonometric series
converges for all $x \in [0, 2\pi]$ then it converges for all real x, since, for all n, both
cos nx and sin nx have periods of $2\pi$ (Cronin-Scanlon, 217). In 1822, Joseph
Fourier (1768-1830) proposed that any bounded function, f(x), can be
expressed in the form of a trigonometric series (Hawkins, 5). In his proofs,
Fourier, a mathematical physicist, demonstrated to the pure mathematicians of
the time, that the notions of "arbitrary" functions, real numbers, and continuity
needed rigorous clarification (Bell, 293).
Fourier's development is based on the following: Suppose the
trigonometric series converges uniformly on $[0, 2\pi]$ to a function f(x), i.e.,
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
for all real x (Cronin-Scanlon, 217). If $\int_0^{2\pi} f(x)\,dx$ exists, then the Fourier
series of f(x) is the trigonometric series given by
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
where
$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(x)(\cos nx)\,dx, \text{ for } n = 0, 1, 2, \ldots,$$
and
$$b_n = \frac{1}{\pi} \int_0^{2\pi} f(x)(\sin nx)\,dx, \text{ for } n = 1, 2, \ldots.$$
The numbers $a_0$, $a_n$, $b_n$ are called the Fourier coefficients of f(x). The
Fourier series of f(x) is sometimes specified by:
$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$
The symbol "~" is used to indicate that the coefficients $a_n$ and $b_n$ are
obtained from the integrals defined above. In general, the series to the right of
the "~" may not converge to f(x) for every x. However, if the Fourier series does
converge to f(x), then the "~" symbol can be replaced by the equal sign
(Cronin-Scanlon, 219-20).
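The coefficient formulas can be tried out numerically. The sketch below approximates $a_n$ and $b_n$ on $[0, 2\pi]$ with a midpoint rule; the sample function f(x) = x, the sample count, and the name `fourier_coefficients` are assumptions chosen for illustration. For this f, integration by parts gives $a_0 = 2\pi$, $a_n = 0$ for $n \ge 1$, and $b_n = -2/n$, which the approximation should reproduce to several decimal places.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Approximate a_n and b_n of f on [0, 2*pi] using the midpoint rule."""
    h = 2.0 * math.pi / samples
    a_n = 0.0
    b_n = 0.0
    for k in range(samples):
        x = (k + 0.5) * h           # midpoint of the k-th subinterval
        a_n += f(x) * math.cos(n * x) * h
        b_n += f(x) * math.sin(n * x) * h
    return a_n / math.pi, b_n / math.pi

for n in range(4):
    print(n, fourier_coefficients(lambda x: x, n))
```

Note that $b_0$ is returned as 0 simply because sin(0) = 0; only $b_n$ for $n \ge 1$ enters the series.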
The Fourier series is, of course, a trigonometric series. However, if given
a trigonometric series, can it be determined if there is a function, f(x), such that
the Fourier series of f(x) is the given trigonometric series? Solving this problem
led to the discovery, and proof, of the well-known Riesz-Fischer Theorem.
Before introducing this theorem, it is necessary that the reader be familiar with
the properties of the $L^p$ spaces.
Suppose $p$ is a positive real number and $E$ is a measurable set of real
numbers. Then $L^p(E)$ is defined as the set of measurable functions $f$ on $E$
such that $|f|^p$ is integrable. Thus $L^1[0, 1]$ would consist of the set of
Lebesgue integrable functions on $[0, 1]$ (Royden, 119). As another example,
consider $f(x) = x^{-1/4}$ on $[0, 16]$. It should be recognized that
$f(x) \in L^1[0, 16]$ since
$$\int_0^{16} \left[x^{-1/4}\right]^1 dx = \frac{4}{3}\,x^{3/4}\,\Big|_0^{16} = \frac{32}{3}.$$
However, $f(x) \notin L^4[0, 16]$ since
$$\int_0^{16} \left[x^{-1/4}\right]^4 dx = \int_0^{16} \frac{1}{x}\,dx,$$
and $1/x$ is not integrable on $[0, 16]$.
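The contrast between the two exponents can be seen by truncating the interval
at a small $\varepsilon > 0$ and letting $\varepsilon$ shrink. The sketch below
uses the antiderivatives $\frac{4}{3}x^{3/4}$ and $\ln x$; the function names
are hypothetical and serve only to illustrate the example.

```python
import math

def truncated_integral_p1(eps):
    """Integral of x**(-1/4) over [eps, 16], via the antiderivative (4/3) x**(3/4)."""
    return (4.0 / 3.0) * (16.0 ** 0.75 - eps ** 0.75)

def truncated_integral_p4(eps):
    """Integral of (x**(-1/4))**4 = 1/x over [eps, 16], via the antiderivative ln x."""
    return math.log(16.0) - math.log(eps)

if __name__ == "__main__":
    # The first column settles near 32/3; the second grows without bound.
    for eps in (1e-2, 1e-4, 1e-8, 1e-16):
        print(eps, truncated_integral_p1(eps), truncated_integral_p4(eps))
```

For $p = 1$ the truncated integrals increase to $32/3$, while for $p = 4$ they
grow like $\ln(16/\varepsilon)$, so no finite value is approached.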
Since any two functions that are equal almost everywhere on a measurable set
$E$ have the same integral on $E$, the distinctions between such functions are
insignificant. The relation "equal almost everywhere" is reflexive, symmetric,
and transitive; thus it is an equivalence relation. The Lebesgue space, also
denoted $L^p(E)$, is the collection of equivalence classes of this relation on
the set of measurable functions $f$ on $E$ for which $|f|^p$ is integrable.
Even though the elements of $L^p(E)$ are usually thought of as individual
functions, they are actually equivalence classes of functions such that, in
each class, any two functions are equal almost everywhere.
Suppose $[f]$ denotes the equivalence class in $L^p = L^p(E)$ containing $f$.
Define addition in $L^p$ by $[f] + [g] = [f + g]$, and scalar multiplication in
$L^p$ by $a[f] = [af]$. Using the relation
$|f + g|^p \le 2^p(|f|^p + |g|^p)$, it can be shown that the Lebesgue space,
$L^p$, is a linear space, or a vector space.
A norm, $\|\cdot\|$, on a vector space $V$ over the real numbers is a function
$\|\cdot\|: V \to \mathbb{R}$ which satisfies the following:
(i) $\|f\| \ge 0$, for all $f \in V$.
(ii) $\|f\| = 0$ if and only if $f = 0$.
(iii) $\|af\| = |a|\,\|f\|$, for all $a \in \mathbb{R}$ and $f \in V$.
(iv) $\|f + g\| \le \|f\| + \|g\|$, for all $f, g \in V$.
(Beals, 70).
For $f \in L^p$, define
$$\|f\| = \left[\int_E |f|^p\right]^{1/p}.$$
Then $\|f\| \ge 0$, and condition (i) is met. Condition (ii) is satisfied since
$L^p$ consists of equivalence classes of functions. For any constant
$a \in \mathbb{R}$, condition (iii) is fulfilled, and condition (iv), called
subadditivity, can be shown to hold provided $p \ge 1$. Thus, for $p \ge 1$,
the $L^p$ spaces are normed linear spaces (Royden, 119).
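The restriction $p \ge 1$ in the subadditivity condition can be illustrated
with two step functions on $[0, 1]$. The following sketch is an assumption-laden
illustration: the helper names and the choice of the characteristic functions
of $[0, \frac{1}{2})$ and $[\frac{1}{2}, 1]$ are hypothetical and are not taken
from Royden.

```python
def lp_norm(values, p):
    """L^p norm on [0, 1] of a step function taking values[k] on the k-th
    of len(values) equal subintervals."""
    width = 1.0 / len(values)
    return sum(abs(v) ** p * width for v in values) ** (1.0 / p)

def triangle_gap(f, g, p):
    """Return ||f|| + ||g|| - ||f + g||; a negative value means subadditivity fails."""
    total = [u + v for u, v in zip(f, g)]
    return lp_norm(f, p) + lp_norm(g, p) - lp_norm(total, p)

if __name__ == "__main__":
    f = [1.0, 0.0]  # characteristic function of [0, 1/2)
    g = [0.0, 1.0]  # characteristic function of [1/2, 1]
    for p in (0.5, 1.0, 2.0):
        print(p, triangle_gap(f, g, p))
```

For $p = \frac{1}{2}$ the two norms are each $\frac{1}{4}$ while the norm of the
sum is 1, so the gap is negative and subadditivity fails; for $p = 1$ and
$p = 2$ the gap is nonnegative.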
Any normed linear space is a metric space under the metric
$$\rho(f, g) = \|f - g\|$$
(Reed and Simon, 9). A Banach space is a normed linear space (over the real or
complex numbers) which is a complete metric space. A normed linear space is
complete if every Cauchy sequence in the space converges to an element of the
space. A sequence $\{f_n\}$ in a normed linear space is a Cauchy sequence if,
given any $\varepsilon > 0$, there is an $N$ such that, for all $n, m \ge N$,
$$\|f_n - f_m\| < \varepsilon.$$
A sequence $\{f_n\}$ in a normed linear space is said to converge to an $f$ in
the space if, given an $\varepsilon > 0$, there is an $N$ such that, for all
$n > N$,
$$\|f - f_n\| < \varepsilon.$$
Convergence in the $L^p$ spaces is referred to as convergence in the mean of
order $p$ (Royden, 123-4).
The Riesz-Fischer Theorem simply states that the $L^p$ spaces are complete;
thus, they are Banach spaces. The theorem, first introduced in 1907, was proved
by F. Riesz (1880-1956) and Ernst Fischer (1875-1954) independently of each
other. Part of the proof employed by Riesz involved the special case for the
interval $[0, 2\pi]$ in the $L^2$ (Hilbert) space. He deduced the convergence
of trigonometric series as long as $a_n$ and $b_n$ are Fourier coefficients
(Hawkins, 174-175). An elegant proof of the Riesz-Fischer Theorem is presented
in Royden, on pages 124-125. From the Riesz-Fischer Theorem, it has been
established that if the sequences $\{a_n\}$ and $\{b_n\}$ are such that the
series
$$\sum_{n=1}^{\infty} (a_n^2 + b_n^2)$$
converges, then
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$$
is the Fourier series of a function $f \in L^2[0, 2\pi]$. It has also been
shown that the partial sums of the Fourier series converge in the mean to $f$
(Cronin-Scanlon, 235). It can also be shown that any $f \in L^2[a, b]$ has a
Fourier series which converges to $f$ in the mean. Furthermore, it can be shown
that any Fourier series such that
$$\sum_{n=1}^{\infty} (a_n^2 + b_n^2) < \infty$$
converges in the mean to some $f \in L^2[a, b]$.
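Convergence in the mean can be observed numerically for a specific series. The
sketch below assumes the coefficients $a_n = 0$ and $b_n = 1/n$, for which
$\sum b_n^2$ converges and the series is the Fourier series of
$f(x) = (\pi - x)/2$ on $(0, 2\pi)$. This choice, the helper names, and the
midpoint rule are illustrative assumptions and are not drawn from the cited
texts.

```python
import math

def partial_sum(x, terms):
    """Partial sum of the trigonometric series with a_n = 0 and b_n = 1/n."""
    return sum(math.sin(n * x) / n for n in range(1, terms + 1))

def limit_function(x):
    """Sum of the series on (0, 2*pi): (pi - x) / 2."""
    return (math.pi - x) / 2.0

def mean_square_error(terms, steps=4_000):
    """Midpoint-rule estimate of the integral of |f - S_N|^2 over [0, 2*pi]."""
    width = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * width
        total += (limit_function(x) - partial_sum(x, terms)) ** 2 * width
    return total

if __name__ == "__main__":
    # The printed errors should shrink toward zero as more terms are used.
    for terms in (1, 4, 16, 64):
        print(terms, mean_square_error(terms))
```

The squared $L^2$ distance between $f$ and the $N$-th partial sum equals
$\pi \sum_{n > N} 1/n^2$, which tends to zero even though the partial sums do
not converge to $f$ at the endpoint $x = 0$.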
It should be noted that the integration required by the Riesz-Fischer
Theorem is Lebesgue integration. One reason that the Riesz-Fischer Theorem
uses Lebesgue integration is that the convergence properties of the Lebesgue
integral are essential in the proof of the theorem (Cronin-Scanlon, 234-235).
Another reason that Lebesgue integration is used is that the class of Lebesgue
integrable functions is larger than the class of Riemann integrable functions.
The search for this theorem was one of the chief motivating forces behind the
development of the Lebesgue integral (Munroe, 258).
One crucial property of the $L^p$ spaces not shared by the corresponding
spaces of Riemann integrable functions is that of completeness in the metric
given above. This was pointed out in a note by Fischer in 1907 (Hawkins, 177).
As an example, consider the set of continuous functions on $[a, b]$ with the
corresponding metric
$$\|f - g\| = \int_a^b |f(x) - g(x)|\,dx.$$
Let $C[0, 1]$ denote the set of continuous functions on the interval $[0, 1]$.
Since each function in $C[0, 1]$ is bounded and continuous, each is Riemann
integrable. To illustrate that $C[0, 1]$ is not complete, it is necessary to
find a Cauchy sequence of functions in $C[0, 1]$ that does not converge to a
function in $C[0, 1]$. Let $f_n(x)$ be defined as in Figure 8.
Figure 8
$\{f_n(x)\}$ is a Cauchy sequence of functions since, given an
$\varepsilon > 0$, there exists an $N$ such that, if $n, m > N$,
$$\int_0^1 |f_n(x) - f_m(x)|\,dx < \varepsilon.$$
But $f_n(x)$ does not converge to a function in $C[0, 1]$. It converges to the
characteristic function of $[\frac{1}{2}, 1]$, which is not contained in
$C[0, 1]$. Thus, the set of continuous functions on $[0, 1]$ does not form a
complete metric space under this metric. The reader should note that the
completion of $C[a, b]$ is $L^1[a, b]$ (Reed and Simon, 13).
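The failure of completeness can be explored numerically. Since Figure 8 is not
reproduced here, the sketch below assumes a particular sequence of continuous
"ramp" functions: $f_n$ is 0 up to $\frac{1}{2} - \frac{1}{n}$, rises linearly
to 1 at $\frac{1}{2}$, and equals 1 afterwards (for $n \ge 2$). This sequence
and all function names are hypothetical stand-ins and may differ from the
functions in the figure.

```python
def ramp(n, x):
    """Hypothetical f_n: 0 up to 1/2 - 1/n, linear on [1/2 - 1/n, 1/2], then 1."""
    left = 0.5 - 1.0 / n
    if x <= left:
        return 0.0
    if x >= 0.5:
        return 1.0
    return (x - left) * n

def step(x):
    """Characteristic function of [1/2, 1], which is not continuous."""
    return 1.0 if x >= 0.5 else 0.0

def l1_distance(f, g, steps=100_000):
    """Midpoint-rule estimate of the integral of |f - g| over [0, 1]."""
    width = 1.0 / steps
    return width * sum(abs(f((k + 0.5) * width) - g((k + 0.5) * width))
                       for k in range(steps))

if __name__ == "__main__":
    for n in (2, 10, 100):
        dist_to_step = l1_distance(lambda x: ramp(n, x), step)
        dist_to_next = l1_distance(lambda x: ramp(n, x), lambda x: ramp(2 * n, x))
        print(n, dist_to_step, dist_to_next)
```

For this assumed sequence the distance from $f_n$ to the step function is
$1/(2n)$ and the distance from $f_n$ to $f_{2n}$ is $1/(4n)$, so the sequence
is Cauchy in the integral metric while its limit lies outside $C[0, 1]$.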
The purpose of this section is to provide the reader with an intuitive
justification for the treatment of probability in terms of a measure space. This
development is excerpted from Halmos, pages 184- 191.
Using a common example, consider the rolling of an ordinary six-sided die with
the number $x$ showing on the top face of the die. The number $x$ may,
therefore, take on values of 1, 2, 3, 4, 5, or 6. This means that
$x \in \{1, 2, 3, 4, 5, 6\}$. Possible events associated with this experiment
could be
$x$ is even, which is equivalent to $x \in \{2, 4, 6\}$
$x$ is less than 4, which is equivalent to $x \in \{1, 2, 3\}$
$x$ is equal to 6, which is equivalent to $x \in \{6\}$.
There are as many events associated with this experiment as there are
combinations of the first six positive integers taken any number at a time. If
the impossible event, $x$ not being equal to any of the first six positive
integers, is considered, then there are $2^6 = 64$ possible events. The
impossible event will be symbolized as $\emptyset$, while the certain event,
$x \in \{1, 2, 3, 4, 5, 6\}$, will be denoted as $X$.
If $E$ is an event, then the complementary event of $E$ will be written as
$\sim E$. An experiment whose outcome may lie in $E$ will be said to result in
$\sim E$ if, and only if, the outcome does not lie in $E$. Thus, if
$E = \{2, 4, 6\}$, then $\sim E$ would equal $\{1, 3, 5\}$.
Given two events, $E$ and $F$, the union, $E \cup F$, occurs if, and only if,
at least one of the two events occurs, and the intersection, $E \cap F$, occurs
if, and only if, both $E$ and $F$ occur. Suppose $E = \{2, 4, 6\}$ and
$F = \{1, 2, 3\}$; then $E \cup F = \{1, 2, 3, 4, 6\}$ and $E \cap F = \{2\}$.
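These event operations correspond directly to set operations, as the following
short Python sketch shows. The names `SAMPLE_SPACE` and `complement` are
hypothetical and are introduced only for this illustration.

```python
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})  # the certain event X

def complement(event):
    """Complementary event: the outcomes of X that do not lie in event."""
    return SAMPLE_SPACE - event

E = frozenset({2, 4, 6})  # x is even
F = frozenset({1, 2, 3})  # x is less than 4

if __name__ == "__main__":
    print(sorted(complement(E)))  # [1, 3, 5]
    print(sorted(E | F))          # [1, 2, 3, 4, 6]
    print(sorted(E & F))          # [2]
```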
The preceding examples, with generalizations to more complex experiments,
justify the conclusion that probability theory consists of the study of an
algebra of sets. The reader should recall that a collection $\mathcal{A}$ of
subsets of $X$ is an algebra of sets if
$A \cup B$ is in $\mathcal{A}$ whenever $A$ and $B$ are, and
the complement $\sim A$ is in $\mathcal{A}$ whenever $A$ is.
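For the die experiment, the two closure conditions can be verified by brute
force. The sketch below builds every subset of $\{1, \ldots, 6\}$ and checks
both conditions; the helper names `power_set` and `is_algebra` are assumptions
made for illustration.

```python
from itertools import combinations

def power_set(sample_space):
    """All subsets of sample_space, returned as a set of frozensets."""
    items = sorted(sample_space)
    return {frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)}

def is_algebra(collection, sample_space):
    """Check closure under pairwise union and under complement in sample_space."""
    whole = frozenset(sample_space)
    closed_under_union = all((a | b) in collection
                             for a in collection for b in collection)
    closed_under_complement = all((whole - a) in collection for a in collection)
    return closed_under_union and closed_under_complement

if __name__ == "__main__":
    events = power_set({1, 2, 3, 4, 5, 6})
    print(len(events), is_algebra(events, {1, 2, 3, 4, 5, 6}))  # 64 True
```

The count of 64 includes both the impossible event and the certain event.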
An event is, therefore, a set, and its opposite event is the complementary
set. Mutually exclusive events are disjoint sets, and an event consisting of
the simultaneous occurrence of two other events is the set obtained by
intersecting the two corresponding sets. For the study of probability where the
total number of possible events is finite, the treatment of the class of
possible events as an algebra of sets is adequate. For situations arising in
modern probability theory and practice, it is necessary to make the additional
assumption that the system of events is closed under the formation of countable
unions, that is, that it is a $\sigma$-algebra. The translation of probability
into the language of measure spaces is given in Halmos, Chapter IX. For
example, a "random variable" in probability is just a measurable function.
It seems appropriate to finish this introductory account of measure theory
and Lebesgue integration with a quote given by Lebesgue himself:
"And now, gentlemen, I pause to thank you for your kind attention and to
offer a final word of conclusion, if you will permit. It is that a generalization
made not for the vain pleasure of generalizing but in order to solve previously
existing problems is always a fruitful generalization. This is proved abundantly
by the variety of applications of the ideas that we have just examined."
(Lebesgue, 194).
Beals, Richard. Advanced Mathematical Analysis. New York: Springer-Verlag,
Bell, Eric T. The Development of Mathematics. 2d ed. New York: McGraw-Hill
Book Company, Inc., 1945.
Burton, David M. The History of Mathematics, An Introduction. Dubuque:
William C. Brown Publishers, 1985.
Caratheodory, Constantin. Algebraic Theory of Measure and Integration.
Reprint, New York: Chelsea Publishing Co., 1963.
Cronin-Scanlon, Jane. Advanced Calculus. Rev. ed. Lexington: D. C. Heath
and Company, 1969.
Eves, Howard. Foundations and Fundamental Concepts of Mathematics. 3d
ed. Boston: PWS-Kent Publishing Company, 1990.
Fulks, Watson. Advanced Calculus, An Introduction to Analysis. 3d ed. New
York: John Wiley & Sons, 1978.
Gillispie, Charles C. (ed.). "Henri Lebesgue," Dictionary of Scientific
Biography. vol. VIII. New York: Charles Scribner's Sons, 1973.
Halmos, Paul R. Measure Theory. New York: D. Van Nostrand Company, Inc.
Hawkins, Thomas. Lebesgue's Theory of Integration, Its Origins and
Development. Madison: The University of Wisconsin Press, 1970.
Kolmogorov, A. N. and S. V. Fomin. Measure, Lebesgue Integrals, and Hilbert
Space. translated by Natascha Artin Brunswick and Allan Jeffrey. New
York: Academic Press, 1961.
Lebesgue, Henri. Measure and the Integral. edited by Kenneth O. May. San
Francisco: Holden-Day, Inc., 1966.
Munroe, M. E. Introduction to Measure and Integration. Reading:
Addison-Wesley Publishing Company, Inc., 1953.
Reed, Michael and Barry Simon. Methods of Modern Mathematical Physics.
New York: Academic Press, 1972.
Royden, H. L. Real Analysis. 3d ed. New York: Macmillan Publishing
Company, 1988.
Solovay, Robert M. "A model of set-theory in which every set of reals is
Lebesgue measurable," Annals of Mathematics. 2d ser., vol. 92, 1970,
Temple, G. The Structure of Lebesgue Integration Theory. London: Oxford
University Press, 1971.
Thomas, George B. and Ross L. Finney. Elements of Calculus and Analytic
Geometry. Menlo Park: Addison-Wesley Publishing Company, 1989.