Solutions to Principles of Mathematical Analysis (Walter Rudin)
Jason Rosendale
jason.rosendale@gmail.com
December 19, 2011
This work was done as an undergraduate student: if you really don’t understand something in one of these
proofs, it is very possible that it doesn’t make sense because it’s wrong. Any questions or corrections can be
directed to jason.rosendale@gmail.com.
Exercise 1.1a
Let r be a nonzero rational number. We’re asked to show that x ∉ Q implies that (r + x) ∉ Q. Proof of the
contrapositive:
7→ r + x is rational          assumed
→ (∃p ∈ Q)(r + x = p)          definition of rational
→ (∃p, q ∈ Q)(q + x = p)          we’re told that r is rational
→ (∃p, q ∈ Q)(x = −q + p)          existence of additive inverses in Q
Because −q and p are members of Q, which is closed under addition, we know that their sum is a member of Q.
→ (∃u ∈ Q)(x = u)
→ x is rational          definition of rational
By assuming that r +x is rational, we prove that x must be rational. By contrapositive, then, if x is irrational
then r + x is irrational, which is what we were asked to prove.
Exercise 1.1b
Let r be a nonzero rational number. We’re asked to show that x ∉ Q implies that rx ∉ Q. Proof of the
contrapositive:
7→ rx is rational          assumed
→ (∃p ∈ Q)(rx = p)          definition of rational
→ (∃p ∈ Q)(x = r⁻¹p)          existence of multiplicative inverses in Q
(Note that we can assume that r⁻¹ exists only because we are told that r is nonzero.) Because r⁻¹
and p are members of the closed multiplicative group of Q, we know that their product is also a
member of Q.
→ (∃u ∈ Q)(x = u)
→ x is rational          definition of rational
By assuming that rx is rational, we prove that x must be rational. By contrapositive, then, if x is irrational
then rx is irrational, which is what we were asked to prove.
Exercise 1.2
Proof by contradiction. If √12 were rational, then we could write it as a reduced-form fraction p/q
where p and q are nonzero integers with no common divisors.
7→ p/q = √12          assumed
→ p²/q² = 12
→ p² = 12q²
It’s clear that 3|12q², which means that 3|p². By some theorem I can’t remember (possibly the
definition of ‘prime’ itself), if a is a prime number and a|mn, then a|m ∨ a|n. Therefore, since 3|pp
and 3 is prime,
→ 3|p
→ 9|p²
→ (∃m ∈ N)(p² = 9m)          definition of divisibility
→ (∃m ∈ N)(12q² = 9m)          substitution from p² = 12q²
→ (∃m ∈ N)(4q² = 3m)          divide both sides by 3
→ (3|4q²)          definition of divisibility
From the same property of prime divisors that we used previously, we know that 3|4 ∨ 3|q²: it
clearly doesn’t divide 4, so it must be the case that 3|q². But if 3|qq, then 3|q ∨ 3|q. Therefore:
→ (3|q)
And this establishes a contradiction. We began by assuming that p and q had no common divisors, but we
have shown that 3|p and 3|q. So our assumption must be wrong: there is no reduced-form rational number such
that p/q = √12.
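To make the contradiction fully explicit, one can substitute p = 3k into the chain above and watch the same divisibility argument land on q:
\[ p = 3k \;\Rightarrow\; 12q^2 = p^2 = 9k^2 \;\Rightarrow\; 4q^2 = 3k^2 \;\Rightarrow\; 3 \mid q^2 \;\Rightarrow\; 3 \mid q, \]
so 3 divides both p and q, contradicting the reduced-form assumption.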
Exercise 1.3 a
If x ≠ 0 and xy = xz, then
y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹(xz) = (x⁻¹x)z = 1z = z
Exercise 1.3 b
If x ≠ 0 and xy = x, then
y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹x = 1
Exercise 1.3 c
If x ≠ 0 and xy = 1, then
y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹1 = x⁻¹ = 1/x
Exercise 1.3 d
If x ≠ 0, then the fact that x⁻¹x = 1 means that x is the inverse of x⁻¹: that is, x = (x⁻¹)⁻¹ = 1/(1/x).
Exercise 1.4
We are told that E is nonempty, so there exists some e ∈ E. By the definition of lower bound, (∀x ∈ E)(α ≤ x):
so α ≤ e. By the definition of upper bound, (∀x ∈ E)(x ≤ β): so e ≤ β. Together, these two inequalities tell us
that α ≤ e ≤ β. We’re told that S is ordered, so by the transitivity of order relations this implies α ≤ β.
Exercise 1.5
We’re told that A is bounded below. The field of real numbers has the greatest lower bound property, so we’re
guaranteed to have a greatest lower bound for A. Let β be this greatest lower bound. To prove that −β is
the least upper bound of −A, we must first show that it’s an upper bound. Let −x be an arbitrary element in −A:
7→ −x ∈ −A          assumed
→ x ∈ A          definition of membership in −A
→ β ≤ x          β = inf(A)
→ −β ≥ −x          consequence of 1.18(a)
We began with an arbitrary choice of −x, so this proves that (∀ −x ∈ −A)(−β ≥ −x), which by definition
tells us that −β is an upper bound for −A. To show that −β is the least such upper bound for −A, we choose
some arbitrary element less than −β:
7→ α < −β          assumed
→ −α > β          consequence of 1.18(a)
Remember that β is the greatest lower bound of A. If −α is larger than inf(A), there must be some
element of A that is smaller than −α.
→ (∃x ∈ A)(x < −α)          (see above)
→ (∃ −x ∈ −A)(−x > α)          consequence of 1.18(a)
→ ¬(∀ −x ∈ −A)(−x ≤ α)
→ α is not an upper bound of −A          definition of upper bound
This proves that any element less than −β is not an upper bound of −A. Together with the earlier proof
that −β is an upper bound of −A, this proves that −β is the least upper bound of −A.
Exercise 1.6a
The difficult part of this proof is deciding which, if any, of the familiar properties of exponents are considered
axioms and which properties we need to prove. It seems impossible to make any progress on this proof unless we
can assume that (b^m)^n = b^(mn). On the other hand, it seems clear that we can’t simply assume that (b^m)^(1/n) = b^(m/n):
this would make the proof trivial (and is essentially assuming what we’re trying to prove).
As I understand this problem, we have defined x^n in such a way that it is trivial to prove that (x^a)^b = x^(ab)
when a and b are integers. And we’ve declared in theorem 1.21 that, by definition, the symbol x^(1/n) is the element
such that (x^(1/n))^n = x. But we haven’t defined exactly what it might mean to combine an integer power like n
and some arbitrary inverse like 1/r. We are asked to prove that these two elements do, in fact, combine in the
way we would expect them to: (x^n)^(1/r) = x^(n/r).
Unless otherwise noted, every step of the following proof is justified by theorem 1.21.
7→ b^m = b^m          assumed
→ ((b^m)^(1/n))^n = b^m          definition of x^(1/n)
→ ((b^m)^(1/n))^(nq) = b^(mq)          raise both sides to the power q
We were told that m/n = p/q which, by the definition of the equality of rational numbers, means that
mq = np. Therefore:
→ ((b^m)^(1/n))^(nq) = b^(np)
→ ((b^m)^(1/n))^(qn) = b^(pn)          commutativity of multiplication
From theorem 1.21, we can take the n-th root of each side to get:
→ ((b^m)^(1/n))^q = b^p
From theorem 1.21, we can take the q-th root of each side to get:
→ (b^m)^(1/n) = (b^p)^(1/q)
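As a quick sanity check of part (a), take b = 16, m = 2, n = 4 and p = 1, q = 2, so that m/n = p/q:
\[ (b^m)^{1/n} = (16^2)^{1/4} = 256^{1/4} = 4 = 16^{1/2} = (b^p)^{1/q}. \]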
Exercise 1.6b
As in the last proof, we assume that b^(r+s) = b^r b^s when r and s are integers and try to prove that the operation
works in a similar way when r and s are rational numbers. Let r = m/n and let s = p/q where m, n, p, q ∈ Z and
n, q ≠ 0.
7→ b^(r+s) = b^(m/n + p/q)
→ b^(r+s) = b^((mq+pn)/nq)          definition of addition for rationals
→ b^(r+s) = (b^(mq+pn))^(1/nq)          from part a
→ b^(r+s) = (b^(mq) b^(pn))^(1/nq)          legal because mq and pn are integers
→ b^(r+s) = (b^(mq))^(1/nq) (b^(pn))^(1/nq)          corollary of 1.21
→ b^(r+s) = (b^(m/n))(b^(p/q))          from part a
→ b^(r+s) = (b^r)(b^s)
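A quick numeric check of part (b): take b = 4, r = 1/2, s = 3/2:
\[ b^{r+s} = 4^{2} = 16 = 2 \cdot 8 = 4^{1/2} \cdot 4^{3/2} = b^{r} b^{s}. \]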
Exercise 1.6c
We’re given that b > 1. Let r be a rational number. Proof by contradiction that b^r is an upper bound of B(r):
7→ b^r is not an upper bound of B(r)          hypothesis of contradiction
→ (∃x ∈ B(r))(x > b^r)          formalization of the hypothesis
By the definition of membership in B(r), x = b^t where t is rational and t ≤ r.
→ (∃t ∈ Q)(b^t > b^r ∧ t ≤ r)
It can be shown that b^(−t) > 0 (see theorem S1, below) so we can multiply this term against both
sides of the inequality.
→ (∃t ∈ Q)(b^t b^(−t) > b^r b^(−t) ∧ t ≤ r)          theorem S2
→ (∃t ∈ Q)(b^(t−t) > b^(r−t) ∧ t ≤ r)          from part b
→ (∃t ∈ Q)(1 > b^(r−t) ∧ r − t ≥ 0)
→ (∃t ∈ Q)(1^(1/(r−t)) > b ∧ r − t ≥ 0)
→ 1 > b
And this establishes our contradiction, since we were given that b > 1. Our initial assumption must have
been incorrect: b^r is, in fact, an upper bound of B(r). We must still prove that it is the least upper bound of
B(r), though. To do so, let α represent an arbitrary rational number such that b^α < b^r. From this, we need to
prove that α < r.
7→ b^α < b^r          hypothesis of contradiction
→ b^α b^(−r) < b^r b^(−r)          theorem S2
→ b^(α−r) < b^(r−r)          from part b
→ b^(α−r) < 1          from part b
Since b > 1, the fact that b^(α−r) < 1 means that α − r < 0, that is, α < r.
Exercise 1.7 a
Proof by induction. Let S = {n : b^n − 1 ≥ n(b − 1)}. We can easily verify that 1 ∈ S. Now, assume that k ∈ S:
7→ k ∈ S          hypothesis of induction
→ b^k − 1 ≥ k(b − 1)          definition of membership in S
→ b·b^k − b ≥ bk(b − 1) ≥ k(b − 1)          multiply by b; we’re told that b > 1
→ b^(k+1) − b ≥ k(b − 1)
→ b^(k+1) ≥ k(b − 1) + b
→ b^(k+1) − 1 ≥ k(b − 1) + b − 1
→ b^(k+1) − 1 ≥ (k + 1)(b − 1)
→ k + 1 ∈ S          definition of membership in S
By induction, this proves that (∀n ∈ N)(b^n − 1 ≥ n(b − 1)).
Alternatively, we could prove this using the same identity that Rudin used in the proof of 1.21. From the
distributive property we can verify that b^n − a^n = (b − a)(b^(n−1)a^0 + b^(n−2)a^1 + . . . + b^0 a^(n−1)). So when a = 1, this
becomes b^n − 1 = (b − 1)(b^(n−1) + b^(n−2) + . . . + b^0). And since b > 1, each term b^(n−k) in the series is at least
1, so b^n − 1 ≥ (b − 1)(1^(n−1) + 1^(n−2) + . . . + 1^0) = (b − 1)n.
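As a concrete check of the identity used in the alternative argument, take n = 3 and a = 1:
\[ (b-1)(b^2 + b + 1) = b^3 + b^2 + b - b^2 - b - 1 = b^3 - 1, \]
and for b > 1 each of the three terms b², b, 1 is at least 1, giving b³ − 1 ≥ 3(b − 1).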
Exercise 1.7 b
7→ n(b^(1/n) − 1) = n(b^(1/n) − 1)
→ n(b^(1/n) − 1) = (1 + 1 + . . . + 1)(b^(1/n) − 1)          [n times]
→ n(b^(1/n) − 1) = (1^(n−1) + 1^(n−2) + . . . + 1^0)(b^(1/n) − 1)
It can be shown that b^k > 1 when b > 1, k > 0 (see theorem S4). Replacing 1 with b^(1/n) gives us the
inequality:
→ n(b^(1/n) − 1) ≤ ((b^(1/n))^(n−1) + (b^(1/n))^(n−2) + . . . + (b^(1/n))^0)(b^(1/n) − 1)
Now we can use the algebraic identity b^n − a^n = (b^(n−1)a^0 + b^(n−2)a^1 + . . . + b^0 a^(n−1))(b − a):
→ n(b^(1/n) − 1) ≤ (b^(1/n))^n − 1
→ n(b^(1/n) − 1) ≤ b − 1
Exercise 1.7 c
7→ n > (b − 1)/(t − 1)          assumed
→ n(t − 1) > (b − 1)          this holds because n, t, and b are greater than 1
→ n(t − 1) > (b − 1) ≥ n(b^(1/n) − 1)          from part b
→ n(t − 1) > n(b^(1/n) − 1)          transitivity of order relations
→ (t − 1) > (b^(1/n) − 1)          n > 0 → n⁻¹ > 0 would be a trivial proof
→ t > b^(1/n)
Exercise 1.7 d
We’re told that b^w < y, which means that 1 < yb^(−w). Using the substitution yb^(−w) = t with part (c), we’re led
directly to the conclusion that we can select n such that yb^(−w) > b^(1/n). From this we get y > b^(w+1/n), which is what
we were asked to prove. As a corollary, the fact that b^(1/n) > 1 means that b^(w+1/n) > b^w.
Exercise 1.7 e
We’re told that b^w > y, which means that b^w y⁻¹ > 1. Using the substitution b^w y⁻¹ = t with part (c), we’re
led directly to the conclusion that we can select n such that b^w y⁻¹ > b^(1/n). Multiplying both sides by y gives us
b^w > b^(1/n) y. Multiplying this by b^(−1/n) gives us b^(w−1/n) > y, which is what we were asked to prove. As a corollary,
the fact that b^(1/n) > 1 > 0 means that, upon taking the reciprocals, we have b^(−1/n) < 1 and therefore b^(w−1/n) < b^w.
Exercise 1.7 f
We’ll prove that b^x = y by showing that the assumptions b^x > y and b^x < y lead to contradictions.
If b^x > y, then from part (e) we can choose n such that b^x > b^(x−1/n) > y. From this we see that x − 1/n is an
upper bound of A that is smaller than x. This is a contradiction, since we’ve assumed that x = sup(A).
If b^x < y, then from part (d) we can choose n such that y > b^(x+1/n) > b^x. From this we see that x is not an
upper bound of A. This is a contradiction, since we’ve assumed that x is the least upper bound of A.
Having ruled out these two possibilities, the trichotomy property of ordered fields forces us to conclude that
b^x = y.
Exercise 1.7 g
Assume that there are two elements such that b^w = y and b^x = y. Then by the transitivity of equality relations,
b^w = b^x, although this seems suspiciously simple. To conclude that w = x, we also need the fact that the map
w ↦ b^w is strictly increasing when b > 1, so that b^w = b^x forces w = x.
Exercise 1.8
In any ordered set, all elements of the set must be comparable (the trichotomy rule, definition 1.5). We will
show by contradiction that (0, 1) is not comparable to (0, 0) in any potential ordered field containing C. First,
we assume that (0, 1) > (0, 0) :
7→ (0, 1) > (0, 0)
hypothesis of contradiction
→ (0, 1)(0, 1) > (0, 0)
definition 1.17(ii) of ordered fields
We assumed here that (0, 0) can take the role of 0 in definition 1.17 of an ordered field. This is
a safe assumption because the uniqueness property of the additive identity shows us immediately
that (0, 0) + (a, b) = (a, b) → (0, 0) = 0.
→ (−1, 0) > (0, 0)
definition of complex multiplication
→ (−1, 0)(0, 1) > (0, 0)
definition 1.17(ii) of ordered fields, since we initially assumed (0, 1) > 0
→ (0, −1) > (0, 0)
definition of complex multiplication
It might seem that we have established our contradiction as soon as we concluded that (−1, 0) > 0
or (0, −1) > 0. However, we’re trying to show that the complex field cannot be an ordered field
under any ordered relation, even a bizarre one in which −1 > −i > 0. However, we’ve shown that
(0, 1) and (0, −1) are both greater than zero. Therefore:
→ (0, −1) + (0, 1) > (0, 0) + (0, 1)
definition 1.17(i) of ordered fields
→ (0, 0) > (0, 1)
definition of complex addition
This conclusion is in contradiction of trichotomy, since we initially assumed that (0, 0) < (0, 1). Next, we
assume that (0, 1) < (0, 0):
7→ (0, 0) > (0, 1)
hypothesis of contradiction
→ (0, 0) + (0, −1) > (0, 1) + (0, −1)
definition 1.17(i) of ordered fields
→ (0, −1) > (0, 0)
definition of complex addition
→ (0, −1)(0, −1) > (0, 0)
definition 1.17(ii) of ordered fields
→ (−1, 0) > (0, 0)
definition of complex multiplication
→ (−1, 0)(0, −1) > (0, 0)
definition 1.17(ii) of ordered fields, since we’ve established (0, −1) > (0, 0)
→ (0, 1) > (0, 0)
definition of complex multiplication
Once again trichotomy has been violated.
Proof by contradiction that (0, 1) ≠ (0, 0): if we assume that (0, 1) = (0, 0) we’re led to the conclusion that
(a, b) = (0, 0) for every complex number, since (a, b) = a(0, 1)^4 + b(0, 1) = a(0, 0) + b(0, 0) = (0, 0). By the
transitivity of equivalence relations, this would mean that every element is equal to every other. And this is
in contradiction of definition 1.12 of a field, where we’re told that there are at least two distinct elements: the
additive identity (’0’) and the multiplicative identity (’1’).
Exercise 1.9a
To prove that this relation turns C into an ordered set, we need to show that it satisfies the two requirements
in definition 1.5. Proof of transitivity:
7→ (a, b) < (c, d) ∧ (c, d) < (e, f )
assumption
→ [a < c ∨ (a = c ∧ b < d)] ∧ [c < e ∨ (c = e ∧ d < f )]
definition of this order relation
→ (a < c ∧ c < e) ∨ (a < c ∧ c = e ∧ d < f )
∨(a = c ∧ b < d ∧ c < e) ∨ (a = c ∧ b < d ∧ c = e ∧ d < f )
distributivity of logical operators
→ (a < e) ∨ (a < e ∧ d < f ) ∨ (a < e ∧ b < d) ∨ (a = e ∧ b < f )
transitivity of order relation on R
Although we’re falling back on the transitivity of an order relation, we are not assuming what
we’re trying to prove. We’re trying to prove the transitivity of the dictionary order relation on C,
and this relation is defined in terms of the standard order relation on R. This last step is using the
transitivity of this standard order relation on R and is not assuming that transitivity holds for the
dictionary order relation.
→ (a < e) ∨ (a < e) ∨ (a < e) ∨ (a = e ∧ b < f )
p∧q →p
→ a < e ∨ (a = e ∧ b < f )
p∨p→p
→ (a, b) < (e, f )
definition of this order relation
To prove that the trichotomy property holds for the dictionary relation on C, we rely on the trichotomy
property of the underlying standard order relation on R. Let (a, b) and (c, d) be two elements in C. From the
standard order relation, we know that
7→ (a, b) ∈ C ∧ (c, d) ∈ C
assumed
→ a, b, c, d ∈ R
definition of a complex number
→ (a < c) ∨ (a > c) ∨ (a = c)
trichotomy of the order relation on R
→ (a < c) ∨ (a > c) ∨ (a = c ∧ (b < d ∨ b > d ∨ b = d))
trichotomy of the order relation on R
→ (a < c) ∨ (a > c) ∨ (a = c ∧ b < d) ∨ (a = c ∧ b > d)
∨(a = c ∧ b = d)
distributivity of the logical operators
→ (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) < (c, d) ∨ (a, b) > (c, d)
∨(a, b) = (c, d)
definition of the dictionary order relation
→ (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) = (c, d)
And this is the definition of the trichotomy law, so we have proven that the dictionary order turns the
complex numbers into an ordered set.
Exercise 1.9b
C does not have the least upper bound property under the dictionary order. Let E = {(0, a) : a ∈ R}. This subset
is just the imaginary axis in the complex plane. This subset clearly has an upper bound, since (x, 0) > (0, a) for
any x > 0. But it does not have a least upper bound: any upper bound (x, y) must have x > 0, and then
(0, a) < (x/2, y) < (x, y)    for every a ∈ R
So (x/2, y) is an upper bound less than our proposed least upper bound, which is a contradiction.
Exercise 1.10
This is just straightforward algebra, and is too tedious to write out.
Exercise 1.11
If we choose w = z/|z| and choose r = |z|, then we can easily verify that |w| = 1 and that rw = |z| · (z/|z|) = z.
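For a concrete instance, take z = 3 + 4i:
\[ r = |z| = 5, \qquad w = \tfrac{3+4i}{5}, \qquad |w| = \sqrt{\tfrac{9}{25}+\tfrac{16}{25}} = 1, \qquad rw = 5 \cdot \tfrac{3+4i}{5} = z. \]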
Exercise 1.12
Set a_i = √(z_i) and b_i = √(z̄_i) and use the Cauchy-Schwarz inequality (theorem 1.35). This gives us
|Σ_{j=1}^n √(z_j) √(z̄_j)|² ≤ (Σ_{j=1}^n |√(z_j)|²)(Σ_{j=1}^n |√(z̄_j)|²)
which is equivalent to
|z1 + z2 + . . . + zn|² ≤ (|z1| + |z2| + . . . + |zn|)²
Taking the square root of each side shows that
|z1 + z2 + . . . + zn | ≤ |z1 | + |z2 | + . . . + |zn |
which is what we were asked to prove.
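A worked numeric instance of the conclusion: with z1 = 3 + 4i and z2 = 5 − 12i,
\[ |z_1 + z_2| = |8 - 8i| = 8\sqrt{2} \approx 11.3 \le |z_1| + |z_2| = 5 + 13 = 18. \]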
Exercise 1.13
|x − y|² = (x − y)(x̄ − ȳ)          definition 1.32 and Theorem 1.31(a)
= xx̄ − xȳ − yx̄ + yȳ
= xx̄ − (xȳ + yx̄) + yȳ
= |x|² − 2Re(xȳ) + |y|²          Theorem 1.31(c), definition 1.32
≥ |x|² − 2|Re(xȳ)| + |y|²          Re(xȳ) ≤ |Re(xȳ)|, so −2Re(xȳ) ≥ −2|Re(xȳ)|
≥ |x|² − 2|xȳ| + |y|²          Theorem 1.33(d)
= |x|² − 2|x||ȳ| + |y|²          Theorem 1.33(c)
= |x|² − 2|x||y| + |y|²          Theorem 1.33(b)
= (|x| − |y|)(|x| − |y|)
= ||x| − |y||²
This chain of inequalities shows us that ||x| − |y||2 ≤ |x − y|2 . Taking the square root of both sides results
in the claim we wanted to prove.
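A worked numeric instance: with x = 3 and y = 4i,
\[ \big||x| - |y|\big| = |3 - 4| = 1 \le |x - y| = |3 - 4i| = 5. \]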
Exercise 1.14
|1 + z|² + |1 − z|² = (1 + z)(1̄ + z̄) + (1 − z)(1̄ − z̄)          definition 1.32 and Theorem 1.31(a)
= (1 + z)(1 + z̄) + (1 − z)(1 − z̄)          The conjugate of 1 = 1 + 0i is just 1 − 0i = 1.
= (1 + z̄ + z + zz̄) + (1 − z̄ − z + zz̄)
= 2 + 2zz̄
= 2 + 2          We are told that zz̄ = 1
= 4
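A quick check with a specific point on the unit circle: take z = i, so that zz̄ = 1:
\[ |1 + i|^2 + |1 - i|^2 = 2 + 2 = 4. \]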
Exercise 1.15
Using the logic and the notation from Rudin’s proof of theorem 1.35, we see that equality holds in the Schwarz
inequality when AB = |C|². This occurs when we have B(AB − |C|²) = 0, and from the given chain of equalities
we see that this occurs when Σ |Ba_j − Cb_j|² = 0. For this to occur we must have Ba_j = Cb_j for all j, which
occurs only when B = 0 or when
a_j = (C/B) b_j    for all j
That is, each a_j must be a constant multiple of b_j.
Exercise 1.16
We know that |z − x|² = |z − y|² = r². Expanding these terms out, we have
|z − x|² = (z − x)·(z − x) = |z|² − 2z·x + |x|²
|z − y|² = (z − y)·(z − y) = |z|² − 2z·y + |y|²
For these to be equal, we must have
−2z·x + |x|² = −2z·y + |y|²
which happens when
z·(x − y) = (1/2)[|x|² − |y|²]    (1)
We also want |z − x| = r, which occurs when z = x + rû where |û| = 1. Using this representation of z, the
requirement that r² = |z − y|² becomes
r² = |z − y|² = |x + rû − y|² = |(x − y) + rû|² = |x − y|² + 2rû·(x − y) + |rû|² = d² + 2rû·(x − y) + r²
Rearranging some terms, this becomes
û·(y − x) = d²/(2r) = (d/(2r))|y − x|    (2)
A quick, convincing, and informal proof would be to appeal to the relationship a·b = |a||b| cos(θ) where θ is the
angle between the two vectors; the previous equation then becomes
|û||y − x| cos(θ) = (d/(2r))|y − x|
Dividing by |y − x| and remembering that |û| = 1, this becomes
cos(θ) = d/(2r)
where θ is the angle between the fixed vector (y − x) and the variable vector û. It’s easy to see that this equation
will hold for exactly one û when d = 2r; it will hold for no û when d > 2r; it will hold for two values of û when
d < 2r and n = 2; and it will hold for infinitely many values of û when d < 2r and n > 2. Each value of û
corresponds with a unique value of z. More formal proofs follow.
part (a)
When d < 2r, equation (2) is satisfied for all û for which
û·(y − x) = (d/(2r))|y − x| < |y − x|
By the definition of the dot product, this is equivalent to
u1(y1 − x1) + u2(y2 − x2) + . . . + uk(yk − xk) = (d/(2r))|y − x|    (3)
The only other requirement for the values of ui is that
√(u1² + u2² + . . . + uk²) = 1    (4)
This gives us a system of two equations with k variables. As long as k ≥ 3 we have more variables than equations
and therefore the system will have infinitely many solutions.
part (b)
Evaluating d², we have:
d² = |x − y|²
= |(x − z) + (z − y)|²
= [(x − z) + (z − y)] · [(x − z) + (z − y)]          definition of inner product ·
= (x − z)·(x − z) + 2(x − z)·(z − y) + (z − y)·(z − y)          inner products are distributive
= |x − z|² + 2(x − z)·(z − y) + |z − y|²          definition of inner product ·
Evaluating (2r)², we have:
(2r)² = (r + r)² = (|z − x| + |z − y|)²
= |z − x|² + 2|z − x||z − y| + |z − y|²          commutativity of multiplication
If 2r = d then d² = (2r)² and therefore, by the above evaluations, we have
|x − z|² + 2(x − z)·(z − y) + |z − y|² = |z − x|² + 2|z − x||z − y| + |z − y|²
which occurs if and only if
2(x − z)·(z − y) = 2|x − z||z − y|
From exercise 1.15 we saw that this equality holds only if (x − z) = c(z − y) for some constant c; we know that
|x − z| = |z − y| so c = ±1; we know x ≠ y so c = 1. Therefore we have x − z = z − y, from which we have
z = (x + y)/2
and there is clearly only one such z that satisfies this equation.
part (c)
If 2r < d then we have
|x − y| > |x − z| + |z − y|
which is equivalent to
|(x − z) + (z − y)| > |x − z| + |z − y|
which violates the triangle inequality (1.37e) and is therefore false for all z.
Exercise 1.17
First, we need to prove that a · (b + c) = a · b + a · c and that (a + b) · c = a · c + b · c: that is, we need to prove
that the distributive property holds between the inner product operation and addition.
a · (b + c) = Σ ai(bi + ci)          definition 1.36 of inner product
= Σ (ai bi + ai ci)          distributive property of R
= Σ ai bi + Σ ai ci          associative property of R
= a · b + a · c          definition 1.36 of inner product
(a + b) · c = Σ (ai + bi)ci          definition 1.36 of inner product
= Σ (ai ci + bi ci)          distributive property of R
= Σ ai ci + Σ bi ci          associative property of R
= a · c + b · c          definition 1.36 of inner product
The rest of the proof follows directly:
|x + y|² + |x − y|² = (x + y)·(x + y) + (x − y)·(x − y)
= (x + y)·x + (x + y)·y + (x − y)·x − (x − y)·y
= x·x + y·x + x·y + y·y + x·x − y·x − x·y + y·y
= x·x + y·y + x·x + y·y
= 2|x|² + 2|y|²
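A quick numerical check of the parallelogram law just proved: take x = (1, 2) and y = (3, −1) in R²:
\[ |x+y|^2 + |x-y|^2 = |(4,1)|^2 + |(-2,3)|^2 = 17 + 13 = 30 = 2\cdot 5 + 2\cdot 10 = 2|x|^2 + 2|y|^2. \]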
Exercise 1.18
If x = 0, then x · y = 0 for any y ∈ R^k. If x ≠ 0, then at least one of the elements x1, . . . , xk must be nonzero:
let this element be represented by x_a. Let x_b represent any other element of x. Choose y such that:
y_i = x_b/x_a    if i = a
y_i = −1    if i = b
y_i = 0    otherwise
We can now see that x·y = x_a(x_b/x_a) + x_b(−1) = x_b − x_b = 0. We began with an arbitrary vector x and demonstrated
a method of construction for y such that x · y = 0: therefore, we can always find a nonzero y such that x · y = 0.
This is not true in R^1 because the nonzero elements of R are closed with respect to multiplication.
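For instance, in R² take x = (3, 5) and use a = 1, b = 2 in the construction above:
\[ y = \left(\tfrac{5}{3},\, -1\right), \qquad x \cdot y = 3 \cdot \tfrac{5}{3} + 5 \cdot (-1) = 5 - 5 = 0, \]
and y is nonzero because its second coordinate is −1.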
Exercise 1.19
We need to determine the circumstances under which |x − a| = 2|x − b| and |x − c| = r. To do this, we need to
manipulate these equalities until they have a common term that we can use to compare them.
|x − a| = 2|x − b|
|x − a|² = 4|x − b|²
|x|² − 2x·a + |a|² = 4|x|² − 8x·b + 4|b|²
3|x|² = |a|² − 2x·a + 8x·b − 4|b|²
|x − c| = r
|x − c|² = r²
|x|² − 2x·c + |c|² = r²
|x|² = r² + 2x·c − |c|²
3|x|² = 3r² + 6x·c − 3|c|²
Combining these last two equalities together, we have
7→ |a|² − 2x·a + 8x·b − 4|b|² = 3r² + 6x·c − 3|c|²
→ |a|² − 2x·a + 8x·b − 4|b|² − 3r² − 6x·c + 3|c|² = 0
→ |a|² − 4|b|² + 3|c|² − 2x·(a − 4b + 3c) − 3r² = 0
We can zero out the dot product in this equation by letting 3c = 4b − a. Of course, this also
determines a specific value of c. This substitution gives us the new equality:
→ |a|² − 4|b|² + 3|(4b − a)/3|² − 3r² = 0
→ |a|² − 4|b|² + (3·16/9)|b|² − (3·8/9)a·b + (3/9)|a|² − 3r² = 0
→ 3|a|² − 12|b|² + 16|b|² − 8a·b + |a|² − 9r² = 0
→ 4|a|² + 4|b|² − 8a·b − 9r² = 0
→ 4|a − b|² = 9r²
→ 2|a − b| = 3r
By choosing 3c = 4b − a and 3r = 2|a − b|, we guarantee that |x − a| = 2|x − b| iff |x − c| = r.
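As a sanity check in R², take a = (0, 0) and b = (3, 0), so that 3c = 4b − a gives c = (4, 0) and 3r = 2|a − b| gives r = 2:
\[ |x| = 2|x - (3,0)| \iff x_1^2 + x_2^2 = 4(x_1^2 - 6x_1 + 9 + x_2^2) \iff (x_1 - 4)^2 + x_2^2 = 4, \]
which is exactly the circle |x − c| = r.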
Exercise 1.20
We’re trying to show that R has the least upper bound property. The elements of R are certain subsets of Q,
and < is defined to be “is a proper subset of”. To say that an element α ∈ R has a least upper bound, then,
is to say that the subset α has some “smallest” superset β such that α ⊂ β. We’re asked to omit property III,
which told us that each α ∈ R has no largest element.
With this restriction lifted, we have a new definition of
“cut” that includes cuts such as (−∞, 1] and (−∞, √2].
To prove that each subset of R with an upper bound must have a least upper bound, we will follow Step 3
in the book almost exactly. We will define two subsets A ⊂ R and γ ⊂ R, show that the subset γ is also an
element of R, and then show that the subset/element γ is the least upper bound of A.
Let A be a nonempty subset of R with an upper bound of β (Note: A is a subset of R, not a subset of R.
The elements of A are cuts, which are subsets of Q). Define γ to be the union of all cuts in A. This means that
γ contains every element from every cut in A: γ consists of elements from Q.
proof that γ is a cut:
The proof of criterion (I) has two parts. (i) γ is the union of elements of A, and we were told that A is nonempty.
Therefore γ is nonempty. (ii) We are told that β is an upper bound for A, so x ∈ A → x ⊂ β (remember that
< is defined as ⊂). But γ is just the union of cut elements in A, so γ is the union of proper subsets of β. This
means that γ itself is a proper subset of β. This shows that γ ⊂ β ⊆ Q, so γ 6= Q. So criterion (I) in the
definition of “cut” has been met.
To prove part (II), pick some arbitrary rational p ∈ γ. γ is the union of all the cuts in A, so p ∈ α for some cut α ∈ A.
Choose a second rational q such that q < p: by the definition of cut, the fact that p ∈ α and q < p means that
q ∈ α and therefore q ∈ γ. And we’re being asked to disregard part (III), so this is sufficient to prove that γ is
a cut.
proof that γ is the least upper bound of A
(i) Choose any arbitrary cut α ∈ A. We’ve defined γ as the union of all cuts in A, so it’s clear that every rational
number in α is also contained in γ: that is, α ⊆ γ. And by the definition of < for this set, this tells us that
α ≤ γ for every α ∈ A.
(ii) Suppose that δ < γ. By the definition of < for this set, this means that δ ⊂ γ. This is a proper subset,
so there must be some rational s such that s ∉ δ but s ∈ γ. In order for s ∈ γ, it must be the case that s ∈ α
for some cut α ∈ A. We can now show that δ ⊂ α. For every r ∈ δ, it’s also true that s ∉ δ. By (II), this means
that r < s (using standard rational order). And since s ∈ α, (II) also shows that r ∈ α. This shows that δ ⊂ α,
which means δ < α ∈ A (using the subset order on R), and therefore δ is not an upper bound of A.
Together, these two facts show that γ is the least upper bound of A.
definition of addition
Following the example in the book, we define the addition of two cuts α + β to be the set of all sums r + s where
r ∈ α and s ∈ β. The book defined 0∗ to be the set of rational numbers < 0, but by omitting requirement (III)
we are forced to redefine our zero. Let 0# be the set of all rational numbers ≤ 0.
The original definition had to omit 0 from 0∗ in order to keep R, the set of cuts, closed under addition
(otherwise we’d have 0∗ + 0∗ = (−∞, 0] which is not a cut because of criterion (III)). Our new zero 0# must
include 0 as an element because our newly defined cuts can have a greatest element. The set 0∗ can no longer
function as our zero since (−∞, x] + 0∗ = (−∞, x); these two cuts are not equal ((−∞, x) < (−∞, x]), so 0∗ has
not functioned as the additive identity.
field axioms A1,A2, and A3
The proofs from the book for closure, commutativity, and associativity are directly applicable to our new
definition of cut.
field axiom A4: existence of an additive identity
Let α be a cut in R. Let r and s be rationals such that r ∈ α and s ∈ 0# . Then r + s ≤ r, which means that
r + s ∈ α. This shows us that α + 0# ⊆ α. Now let p and r be rationals such that p ∈ α, r ∈ α, and p ≤ r. This
means that p − r ≤ 0, so p − r ∈ 0# . Therefore r + (p − r) ∈ α + 0# , which means that p ∈ α + 0# . This shows
us that α ⊆ α + 0# . Having shown that α + 0# ⊆ α and that α ⊆ α + 0# , we conclude that α = α + 0# for any
α ∈ R.
field axiom A5: existence of additive inverses
The book constructed the definition of inverse so that the inverse of (−∞, x) would be (−∞, −x). This works
under the original definition of “cut”, since (−∞, x) + (−∞, −x) = (−∞, 0) = 0∗ . It is tempting to just define
the inverse for our new definition of cut so that the inverse of (−∞, x] is (−∞, −x]. This overlooks the fact that
both (−∞, x] and (−∞, x) are cuts under our new definition: both of these elements must have additive inverses.
To show that A5 is not satisfied, we need to demonstrate that at least one element of R has no additive inverse. Let α = (−∞, r) for some rational r. Assume that there was some cut β such that α + β = 0# = (−∞, 0]:
we will show that this assumption leads to a contradiction.
(i) Assume that β does not contain any elements greater than −r. Let p and q be arbitrary rationals such
that p ∈ α and q ∈ β. From our definitions of α and β, we know that p < r and q ≤ −r. Combining these two
inequalities tells us that p + q < −r + r, or p + q < 0. But p and q were arbitrary members of α and β, so this
shows that 0 6∈ α + β. So it cannot be the case that α + β = 0# .
(ii) Assume that β contains at least one element q that is greater than −r. Let s represent the difference
q − (−r): then q = −r + s (note that s must be a positive rational). Because the rationals are dense, we can
choose a rational p with r − s < p < r; by the definition of α, this p is an element of α.
Adding q = −r + s to each part of this inequality gives us 0 < p + q < s. And p + q is an element of α + β, so
we see that α + β contains some positive rational: it cannot be the case that α + β = 0# .
Whether or not β contains an element greater than −r, we find ourselves in contradiction with the initial
assumption that β might be the inverse of α. Therefore there can be no inverse for α. In conclusion, we see
that omitting (III) forces us to redefine the additive identity, and this new definition results in the existence of
elements in R that have no inverse.
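As a concrete illustration of this failure, take r = 0, so α = (−∞, 0), which is just the old 0∗. The most natural candidate inverse, β = 0# = (−∞, 0], falls under case (i):
\[ \alpha + \beta = \{\, p + q : p < 0,\ q \le 0 \,\} = (-\infty, 0) \ne (-\infty, 0] = 0^{\#}, \]
and any β containing some positive rational falls under case (ii) instead, so this α has no additive inverse.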
Exercise 2.1
This is immediately justified by noting that the definition of subset, x ∈ ∅ → x ∈ A, is satisfied for any set A
because of the false antecedent. A more formal proof:
7→ ¬(∃x)(x ∈ ∅)          definition of an empty set
→ (∀x)(x ∉ ∅)          negation of the ∃ quantifier
→ (∀x)(x ∉ A → x ∉ ∅)          Hilbert’s PL1
This previous step is justified by the argument p → (q → p): if something is true (like x ∉ ∅), then
everything implies it.
→ (∀x)(x ∈ ∅ → x ∈ A)          contrapositive
→ ∅ ⊆ A          definition of subset
Exercise 2.2
The set of integers is countable, so by theorem 2.13 the set of all (k + 1)-tuples (a0, a1, . . . , ak) with a0 ≠ 0 is also
countable. Let this set be represented by Zk. For each a ∈ Zk consider the polynomial a0 z^k + a1 z^(k−1) + . . . + ak = 0.
From the fundamental theorem of algebra, we know that there are exactly k complex roots for this polynomial.
We now have a series of nested sets that encompass every possible root for every possible polynomial with
integer coefficients. More specifically, we have a countable number of Zk s, each containing a countable number
of (k + 1)-tuples, each of which corresponds with k roots of a k-degree polynomial. So our set of complex roots
(call it R) is a countable union of countable unions of finite sets. This only tells us that R is at most countable:
it is either countable or finite.
To show that R is not finite, consider the roots for 2-tuples in Z1 . Each 2-tuple of the form (−1, n) corresponds with the polynomial −z + n = 0 whose solution is z = n. There is clearly a unique solution for each
n ∈ Z, so R is an infinite set. Because R is also at most countable, this proves that R is countable.
I did not use the hint provided in the text, which either means that this proof is invalid or that there is an
alternate (simpler?) proof.
Exercise 2.3
The set of real numbers is uncountable, so if every real number were algebraic then the set of real algebraic
numbers would be uncountable. However, exercise 2.2 showed that the algebraic complex numbers were countable. The real numbers are a subset of the complex numbers, so the set of algebraic real numbers is at most
countable.
Exercise 2.4
The rational real numbers are countable. If the irrational real numbers were countable, then the union of rational
and irrational reals would be countable. But this union is just R, which is not countable. So the irrational reals
are not countable.
Exercise 2.x: The Cantor Set
The idea behind the Cantor set is that each term in the sequence E1 ⊃ E2 ⊃ . . . ⊃ En removes the middle third of
each remaining line segment, so that you get a sequence of increasingly smaller segments (figure omitted).
We can see that E0 has one segment of length 1, E1 has two segments of length 1/3, and Em has 2^m segments
of length 1/3^m. Notice that for any possible segment (α, β), we can choose m to be large enough so that the
maximum segment length is less than the length of (α, β) and therefore (α, β) ⊄ Em. And if the segment is not
contained in Em for some m, it’s not contained in the intersection P. This is the reasoning behind Rudin’s statement
that “P contains no segment”.
To justify the claim that no point in P is an isolated point, we need to show that for every point p1 ∈ P
and any arbitrarily small radius r, we can find some second point p2 ∈ P such that d(p1, p2) < r. P is defined to
be ∩ Ei, so if p1 ∈ P then it is a member of every Ei. Choose some Em where m is so large that the
segments in Em are all shorter than r. We are assured that this is possible by the Archimedean property of the
rationals, since we need only find m such that 3^(−m) < r.
So we’ve found an Em with a line segment containing p1. Let p2 represent an endpoint of this segment other than
p1. The length of this segment is less than r, so d(p1, p2) < r. In each subsequent term in the series {Ei}, the
endpoint p2 will never be “cut off”: it will always be the endpoint of a segment from which the middle third is
lost. So p2 ∈ P . And since this was true for an arbitrary point p1 ∈ P and an arbitrarily small r, this shows
that every neighborhood of every point in P is a limit point and not an isolated point.
Exercise 2.5
Let E be the subset of (0, 1] consisting of all rational numbers of the form 1/m, m ∈ N. No point in E is a limit
point, but the limit points of E need not be members of E: we will demonstrate that the point 0 is a limit point
of E.
(i) Proof that no point in E is a limit point: let pm be an arbitrary member of E. Then pm = 1/m for some
m ∈ N. The closest point to pm is therefore pm+1 = 1/(m+1). So we just choose r such that r = (1/2)d(pm, pm+1) and
we see that there are no other points in this neighborhood of pm, which shows that pm is an isolated point.
(ii) Proof that 0 is a limit point: choose an arbitrarily small radius r. From the Archimedean property, we
can find some m ∈ N such that 1/m < r. This pm = 1/m is a member of E, and d(0, pm) = 1/m < r. This shows that
0 is a limit point.
(iii) Proof that no other points are limit points: let x be some point that is neither zero nor a member
of E. This point must be either less than zero, greater than 1, or it must lie between two sequential points
pm, pm+1 ∈ E. Let dx represent the smallest distance from the set {d(x, 0), d(x, 1), d(x, pm), d(x, pm+1)}.
Choose r such that r = (1/2)dx. There can be no points of E in the neighborhood Nr(x). If there were, then this
would indicate the existence of a point of E less than zero, greater than 1, or between 1/(m+1) and 1/m. None of
these can exist in E as we defined it, so no points other than 0 are limit points.
Now let F be the subset of (2, 3] containing rational numbers of the form 2 + 1/m and let G be the subset of
(4, 5] consisting of rational numbers of the form 4 + 1/m. Just as E was shown to have a limit point of only zero,
these sets can be shown to have limit points of only 2 and 4. Therefore the union E ∪ F ∪ G is bounded and
has three limit points {0, 2, 4}.
Exercise 2.6
(i) We’re asked to prove that E′ is closed. This is equivalent to proving that every limit point of E′ is a point
of E′. And by the definition of E′, this is equivalent to proving that every limit point of E′ is a limit point of E.
To prove this, let x represent an arbitrary limit point of E′. Choose any arbitrarily small r and let
s = r/2, t = r/4 (t was chosen to be less than r to guarantee x ≠ z). Since x is a limit point of E′, we can find a
point y ∈ E′ in the neighborhood Ns(x). And y, by virtue of being in E′, is a limit point of E: so we can find a
point z ∈ E in the neighborhood Nt(y).
The distance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us
that d(x, z) ≤ d(x, y) + d(y, z) < s + t < r, so d(x, z) < r and therefore z is in the neighborhood Nr(x) for any
arbitrarily small r. So the point x is a limit point for E. But x was an arbitrarily chosen limit point of E′, so
we have proven that every limit point of E′ is a limit point of E, which is what we were asked to prove.
(ii) We can use a similar technique to show that every limit point of Ē is a limit point of E. Let x represent
an arbitrary limit point of Ē (which we won’t assume to be a member of Ē). Choose any arbitrarily small r
and let s = r/2, t = r/4. Since x is a limit point of Ē, theorem 2.20 tells us that the neighborhood Ns(x) contains
infinitely many points of Ē. Because Ē is defined to be E′ ∪ E, each of these infinitely many points in Ns(x) is
either a member of E′ or a member of E.
(ii.a) First, assume that there exists at least one point y in Ns(x) such that y ∈ E′. By definition, this
means that y is a limit point of E, so there is at least one point z ∈ E in the neighborhood Nt(y). The
distance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us that
d(x, z) ≤ d(x, y) + d(y, z) < s + t < r, so d(x, z) < r and therefore we can find some z ∈ E in the neighborhood
Nr(x) for any arbitrarily small r. So x is a limit point for E.
(ii.b) Next, assume that none of the points in Ns(x) are members of E′. This means that all of the infinitely
many points of E′ ∪ E in Ns(x) are members of E. From this, we see that every neighborhood of x
contains elements of E. For any neighborhood Nt(x) with t < s, the fact that x is a limit point of Ē and
that Nt(x) ⊂ Ns(x) means that Nt(x) contains infinitely many points in E. But this means that x is a limit
point of E. For any neighborhood Nt(x) with t > s, the fact that Ns(x) ⊂ Nt(x) means that Nt(x) contains
infinitely many points in E. And since every neighborhood of x contains an element of E, x is a limit point for E.
The second part of the proof is to show that every limit point of E is a limit point of Ē, and is relatively trivial.
Every element of E is also an element of E ∪ E′ = Ē. If x is a limit point of E, then every neighborhood of x
contains an element of E. Therefore every neighborhood of x contains an element of Ē. So x is a limit point of Ē.
We’ve shown that every limit point of Ē is a limit point of E and vice-versa, which is what we were asked
to prove.
(iii) We’re asked if E and E′ always have the same limit points. The answer is “no”, and a counterexample
can be found in the previous question. The limit points of the set in exercise 2.5 (the union E ∪ F ∪ G) formed
the set E′ = {0, 2, 4}. And E′ clearly has no limit points whatsoever.
Exercise 2.7a
We’ll prove the equality of these sets by showing that each is a subset of the other.
(A) Assume that x ∈ B̄n. Then x ∈ Bn or x is a limit point of Bn.
(A1) If x ∈ Bn, then x ∈ Ak for some k ≤ n. And if x ∈ Ak, then x ∈ Āk. And this means that x ∈ ∪ Āi.
(A2) If x is a limit point of Bn, then it must be a limit point of at least one specific Ak. Proof by
contradiction: assume that x is not a limit point for any of the Ai. Then for each i, there is some neighborhood
Nri(x) that contains no elements of Ai. Let s = min(ri) (which exists because there are only finitely many ri).
Then Ns(x) contains no elements from any Ai, since Ns(x) ⊆ Nri(x) for each i. And if this neighborhood contains
no points from any Ai, then it contains no points of Bn = ∪ Ai. And this is a contradiction, since x is a limit
point of Bn. Therefore, by contradiction, x must be a limit point of at least one specific Ak. So x ∈ Āk, which
means that x ∈ ∪ Āi.
In either of these two cases, x ∈ ∪ Āi. This proves that x ∈ B̄n → x ∈ ∪ Āi.
(B) Assume that x ∈ ∪ Āi. Then x ∈ Āk for some k. And this means that either x ∈ Ak or x is a limit
point of Ak.
(B1) If x ∈ Ak, then x ∈ ∪ Ai, which means that x ∈ Bn and therefore x ∈ B̄n.
(B2) If x is a limit point of Ak, then every neighborhood of x contains an element of Ak. But every element
of Ak is an element of ∪ Ai = Bn, so this means that every neighborhood of x contains an element of Bn. By
definition, this means that x is a limit point of Bn, so that x ∈ B̄n.
In either of these two cases, x ∈ B̄n. This proves that x ∈ ∪ Āi → x ∈ B̄n. And we’ve already shown that
x ∈ B̄n → x ∈ ∪ Āi, so we have proven that B̄n = ∪ Āi.
Exercise 2.7b
Nothing in part B of the above proof required that the number of sets be finite, so it’s still the case that
∪_{i=1}^∞ Āi ⊆ B̄. But part A of the proof assumed that we could choose a least element from the finite set of
radii {ri}, which we can’t do for an infinite set. So we can’t assume that B̄ ⊆ ∪ Āi. And it’s a good thing that
we can’t assume this, because it’s false.
Consider the set from exercise 2.5, the set E consisting of all rational numbers of the form 1/m, m ∈ N. We’ll
construct this set by defining Ai = {1/i} and defining B = ∪_{i=1}^∞ Ai. We saw in exercise 2.5 that 0 ∈ B̄. But
0 ∉ Āi for any i, so 0 ∉ ∪_{i=1}^∞ Āi. This shows us that B̄ ≠ ∪_{i=1}^∞ Āi for these sets. And we’ve already shown
that ∪ Āi ⊆ B̄, so for this particular choice of sets we see that ∪ Āi ⊂ B̄.
Exercise 2.8a
Every point in an open set E in R2 is a limit point of E. Proof: let p be an arbitrary point in E. We can
say two things about the neighborhoods of p. First, since we’re dealing with R2 , every neighborhood of p contains infinitely many points. Second, p is an interior point (since E is open) and so there is some r such that
Nr (p) ⊂ E. Together, these two facts tell us that Nr (p) contains infinitely many points of E.
We now show that this guarantees that every neighborhood of p contains a point of E other than p. Choose any
s such that s > r. Because the neighborhood Nr(p) contains some q ∈ E with q ≠ p and Nr(p) ⊂ Ns(p), we know
that q ∈ Ns(p). Now choose s such that s ≤ r. Ns(p) contains infinitely many points, and Ns(p) ⊆ Nr(p) ⊂ E, so
Ns(p) contains infinitely many points of E, at least one of which is not p itself. We’ve shown that for any
arbitrary s, Ns(p) contains a point of E other than p. Thus p is a limit point of E.
Exercise 2.8b
Consider a closed set consisting of a single point, such as E = {(0, 0)}. This point is clearly not a limit point of
E.
Exercise 2.9a
To prove that E ◦ is open, we will show that every point in E ◦ must be an interior point of E ◦ .
Let x be an arbitrary point in E ◦ . By the definition of E ◦ , we know that x is an interior point of E. This
means that we can choose some r such that Nr (x) ⊂ E: i.e., every point of Nr (x) is a point of E. To show that
x is an interior point of E ◦ , though, we will need to show that every point in the neighborhood Nr (x) is also a
point of E ◦ .
Let y be an arbitrary point in Nr (x). Clearly, 0 ≤ d(x, y) < r. Choose a radius s such that d(x, y) + s = r
and let z be an arbitrary point in Ns (y). Every point in Ns (y) is a point in E, since d(x, z) ≤ d(x, y) + d(y, z) <
d(x, y) + s = r implies d(x, z) < r, and we’ve established that every point in Nr (x) is a member of E. And since
every point in Ns (y) is a point in E, by definition we know that y is an interior point of E. But y was an arbitrary
point in Nr (x), so we know that every point in Nr (x) is an interior point of E. And this means that x is itself
an interior point of the set of interior points of E: that is, x is an interior point of E ◦ . And since x was an arbitrary point in E ◦ , this means that every point in E ◦ is an interior point of E ◦ . And this proves that E ◦ is open.
Exercise 2.9b
From exercise 2.9(a), we know that E ◦ is always open: so E is open if E = E ◦ . And if E is open, then every
point of E is an interior point and therefore E ◦ = E: so E = E ◦ if E is open. Together, these two statements
show that E is open if and only if E = E ◦ .
Exercise 2.9c
Let p be an arbitrary element of G. Because G is open, p is an interior point of G. This means that for every
point p ∈ G there is some neighborhood Nr (p) such that Nr (p) ⊂ G ⊂ E. This chain of subsets tells us that
every point in G is an interior point not only of G, but also of E. And if every point in G is an interior point
of E, then this shows us that G ⊂ E ◦ .
Exercise 2.9d
Write Ẽ for the complement of E.
Proof that the complement of E◦ is contained in the closure of Ẽ: let x be a member of the complement of E◦.
From the definition of E◦ we know that x is not an interior point of E. From the definition of “interior point”,
this means that every neighborhood of x contains some element y in the complement of E. For each neighborhood,
it must be true that either x = y or x ≠ y. If x = y for one or more of these neighborhoods, then x is in Ẽ (since
y is in Ẽ and x = y). If x ≠ y for every neighborhood, then by definition we know that x is a limit point of Ẽ.
So we conclude that either x ∈ Ẽ or x is a limit point of Ẽ: that is, x is a member of the closure of Ẽ.
Proof that the closure of Ẽ is contained in the complement of E◦: this proof is only a very slight modification
of the previous one. Let x be a member of the closure of Ẽ, the closure of the complement of E. From the
definition of closure, either x is a limit point of Ẽ or x itself is a member of Ẽ. In either case, every neighborhood
of x contains some point of Ẽ. By the definition of “interior point”, this means that x is not an interior point of
E and therefore x is in the complement of E◦.
Together, these two proofs show us that the complement of E◦ equals the closure of Ẽ.
Exercise 2.9e
No, E and Ē do not always have the same interiors. Consider the set E = (−1, 0) ∪ (0, 1) in R1. The point 0 is
not a member of the interior of E, but it is a limit point of E. So the point 0 is an interior point of Ē = [−1, 1].
Exercise 2.9f
No, E and E◦ do not always have the same closures. Let E be the subset of (0, 1] consisting of all rational
numbers of the form 1/m, m ∈ N. As we saw in exercise 2.5, the closure of E is E ∪ {0}. But every neighborhood
of every point in E contains a point not in E, so E has no interior points. So E◦ = ∅. And the empty set
is already closed, so the closure of E◦ is ∅. Clearly E ∪ {0} ≠ ∅, so E and E◦ do not always have the same
closures.
Exercise 2.10
Which sets are closed? Let E be an arbitrary subset of X and let p be an arbitrary point in E. The distance
between any two distinct points is always 1, so by choosing r = .5 we guarantee that the neighborhood Nr (p)
contains only p itself. Because this is true for any point in E, this shows us that E contains no limit points.
It’s then vacuously true that all of the (nonexistent) limit points of E are points of E: we conclude that E is
closed. But our choice of E was arbitrary, so this shows that every subset of X is closed.
Which sets are open? Let E be an arbitrary set in X. We’ve shown that every set is closed, so it must be
the case that Ẽ is closed: and from theorem 2.23, this means that E is open. But our choice of E was arbitrary,
so this shows that every subset of X is open.
Which sets are compact? Under this metric, subsets of X are compact if and only if they are finite.
Proof: let E be an arbitrary subset of X. This set is either finite or infinite.
If E is finite with cardinality k, then we can always generate a finite subcover for any open cover. For each
ei ∈ E, we select some gi in our (arbitrary) open cover {Gα} such that ei ∈ gi (note that E is finite, so we don’t
need to use the axiom of choice). This gives us a finite collection {g1, . . . , gk} ⊆ {Gα} such that E ⊆ g1 ∪ . . . ∪ gk.
By definition, then, {g1, . . . , gk} is a finite subcover of E. Our choices for E, k, and {Gα} were arbitrary, so this
shows that any finite subset of X is compact.
If E is infinite, then we can always generate an open cover that has no finite subcover. As we saw previously,
every subset of X is open: this includes subsets that consist of a single element. So we can create an infinite
open cover of E by letting each set in {Gα } be a single element of E. Any subcover of E will need to contain one
element of {Gα } for each element of E: and since E is infinite, this means that any subcover must be infinite.
Therefore there is no finite subcover for this particular open cover, and E is not compact.
Exercise 2.11
d1 is not a metric: let x = −1, y = 0, z = 1. Then d1 (x, y) + d1 (y, z) = 2 while d1 (x, z) = 4. And this tells us
that d1 (x, z) > d1 (x, y) + d1 (y, z) which violates definition 2.15(c) of metric spaces.
d2 is a metric. It’s trivial to verify that d2 has the properties 2.15(a) and 2.15(b). To show that it has the
property 2.15(c):
7→ |x − z| ≤ |x − y| + |y − z|          true by the triangle inequality, theorem 1.37(f)
→ |x − z| ≤ |x − y| + |y − z| + 2√(|x − y|)√(|y − z|)          this additional term is ≥ 0
→ |x − z| ≤ (√(|x − y|) + √(|y − z|))²
→ √(|x − z|) ≤ √(|x − y|) + √(|y − z|)          legal because all terms are ≥ 0
→ d2(x, z) ≤ d2(x, y) + d2(y, z)
d3 is not a metric. d3 (1, −1) = 0, which violates definition 2.15(a) of metric spaces.
d4 is not a metric. d4 (2, 1) = 0, which violates definition 2.15(a) of metric spaces.
d5 is a metric. It’s trivial to verify that d5 has the properties 2.15(a) and 2.15(b). To show that it has the
property 2.15(c):
7→ |x − z| ≤ |x − y| + |y − z|
This first step is always true because of the triangle inequality (theorem 1.37(f)).
→ |x − z| ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y||y − z|
Adding nonnegative terms to the right side does not change the direction of the inequality.
→ |x − z| + |x − z||x − y| + |x − z||y − z| + |x − z||x − y||y − z|
   ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y| + 2|x − z||x − y||y − z| + |x − z||y − z|
It’s hard to verify from this ugly mess, but we’ve just added the same terms to each side.
→ (|x − z|)(1 + |x − y| + |y − z| + |x − y||y − z|) ≤ (1 + |x − z|)(|x − y| + 2|x − y||y − z| + |y − z|)
It’s still hard to verify, but we’ve just factored out terms from each side.
→ |x − z| / (1 + |x − z|) ≤ (|x − y| + 2|x − y||y − z| + |y − z|) / (1 + |x − y| + |y − z| + |x − y||y − z|)
And now we’ve divided each side by the two factored terms.
→ |x − z| / (1 + |x − z|) ≤ |x − y| / (1 + |x − y|) + |y − z| / (1 + |y − z|)
→ d5(x, z) ≤ d5(x, y) + d5(y, z)
The logic behind this seemingly arbitrary series of algebraic steps becomes clear if one starts at the end and
works back to the beginning (which is, of course, how I initially derived the proof).
Exercise 2.12
Let {Gα } be any open cover of K. At least one open set in {Gα } must contain the point 0: let G0 be this
set, open in R, that contains 0 as an interior point. As an interior point of G0 , there is some neighborhood
Nr(0) ⊂ G0. This neighborhood contains every 1/n ∈ K such that 1/n < r: equivalently, this neighborhood contains
one element for every natural number n such that n > 1/r. Not only does this mean that G0 contains an infinite
number of elements of K, it means that it contains all but a finite number (at most 1/r of them) of elements of K.
Let Gi represent an element of {Gα} containing 1/i. We can now see that K ⊂ G0 ∪ G1 ∪ . . . ∪ G⌊1/r⌋, which is a
finite subcover of {Gα}.
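For a concrete instance of this argument, suppose the neighborhood around 0 has radius r = 0.3:
\[ N_{0.3}(0) \supset \{\tfrac{1}{4}, \tfrac{1}{5}, \tfrac{1}{6}, \ldots\}, \]
so only the three points 1, 1/2, 1/3 remain uncovered by G0, and G0 ∪ G1 ∪ G2 ∪ G3 is a finite subcover.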
Exercise 2.13
For i ≥ 2, let Ai represent the set of all points of the form 1/i + 1/n that are contained in the open interval
(1/i, 1/(i−1)). Each Ai can be shown to have a single limit point of 1/i (see exercise 2.5).
Now consider the union of sets S = ∪_{i=2}^∞ Ai. The same reasoning used in exercise 2.5 shows that the set of
limit points for S is just the collection of limit points from the Ai: that is, the rationals of the form 1/i for i ≥ 2.
This shows us that S has one limit point for each natural number greater than 1, which is a countable number.
To show that S is compact, we return to the reasoning found in exercise 2.12. Let {Gα} be any open cover
of S. For each i ≥ 2, let Gi represent the element of {Gα} that contains 1/i. As we saw in exercise 2.12, Gi
contains all but a finite number of elements from the interval (1/i, 1/(i−1)). If we let Gij represent an element of
{Gα} containing 1/i + 1/j, we see that all of the points of S in the interval (1/i, 1/(i−1)) can be covered by the finite
union of sets Gi ∪ Gi1 ∪ . . . ∪ Gi,ri.
Now, let G0 represent the element of {Gα} that contains 0 as an interior point. As an interior point, there
is some neighborhood Nr(0) ⊂ G0. This neighborhood contains every limit point of the form 1/n ∈ R such that
1/n < r. As in 2.12, this means that G0 contains all but a finite number of the limit points of S. More than that,
though, it means that G0 contains all but a finite number of the intervals of the form (1/i, 1/(i−1)).
And from these sets we can construct our finite subcover. Our subcover contains G0. For each of the finitely
many intervals not covered by G0, we include the finite union of sets Gi ∪ Gi1 ∪ . . . ∪ Gi,ri. This is a finite collection of a
finite number of elements from {Gα}, and each element of S is included in one of these elements, so we have
constructed a finite subcover for an arbitrary cover {Gα}. This proves that S is compact.
Exercise 2.14
Let Gn represent the interval (1/n, 1). The union {Gα} = ∪_{n=1}^∞ Gn is a cover for the interval (0, 1) (Proof: for
any x ∈ (0, 1), there is some n > 1/x and therefore some 1/n < x. So x ∈ Gn+1). But there is no finite subcover
for (0, 1). Let H be a finite subcover of {Gα}, and consider the set {1/(i+1) : Gi ∈ H}. This set is finite, so there
is a least element. And this least element is in the interval (0, 1) but is not a member of any Gi ∈ H. So our
assumption that a finite subcover exists must be false.
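For example, if the proposed finite subcover were H = {G2, G5, G9}, the set considered above is {1/3, 1/6, 1/10}, whose least element is 1/10:
\[ \tfrac{1}{10} \in (0,1), \qquad \tfrac{1}{10} \notin (\tfrac12, 1) \cup (\tfrac15, 1) \cup (\tfrac19, 1), \]
so 1/10 is left uncovered by H.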
Exercise 2.15
For each i ∈ N, define Ai to be:
Ai = { p ∈ Q : √(2 − 1/i) ≤ p ≤ √(2 + 1/i) }
Because the endpoints are irrational, each of these intervals Ai is both bounded and closed (see exercise 2.16
for the proofs). From theorem 1.20, we also know that each interval is nonempty.
The intersection of any finite collection of these intervals will be nonempty, since the intersection ∩_{i∈∇} Ai is equal to
Ak, where k is the largest index in ∇. But the infinite intersection ∩_{i=1}^∞ Ai is equal to {p ∈ Q : √2 ≤ p ≤ √2}, which
is obviously empty.
This proves that the collection {Ai : i ∈ N} is a counterexample to theorem 2.36 if the word “compact” is
replaced by either “closed” or “bounded”. This same collection works as a counterexample to the corollary of
2.36, since An+1 ⊂ An.
Exercise 2.16
lemma 1: For any real number x ∉ Q, the intervals [x, ∞) and (x, ∞) are open in Q. Proof: let x be a real
number such that x ∉ Q. Let p be a rational number in the interval [x, ∞) or (x, ∞). It cannot be the case
that p = x (since p is rational while x is not), so p is in the interval (x, ∞) (i.e., [x, ∞) is identical to (x, ∞)
in Q). Because x < p, we can rewrite this interval as (p − d(p, x), ∞). Choose r = d(x, p). Then we see that
(p − r, p + r) ⊆ (p − d(p, x), ∞) = (x, ∞). This means that p is an interior point of (x, ∞): but p was an
arbitrary point chosen from [x, ∞), so every point of this interval is an interior point. This proves that [x, ∞)
and (x, ∞) are open in Q.
lemma 2: For any real number x 6∈ Q, the intervals (−∞, x] and (−∞, x) are open in Q. Proof: the proof
is nearly identical to that of lemma 1.
lemma 3: All of these open intervals ([x, ∞),(x, ∞),(−∞, x], and (−∞, x)) are also closed. Proof: Choose
any of these four open sets. Note that its complement is also one of these four open sets. Since its complement
is open, the set must be closed.
lemma 4: Every interval of the form (x, y) with x, y ∉ Q is both open and closed in Q. Proof: choose an
arbitrary interval E = (x, y). Its complement is Ẽ = (−∞, x] ∪ [y, ∞). From lemma 3, we see that Ẽ is a finite
union of closed sets, so Ẽ is closed (theorem 2.24) and E is open (theorem 2.23). But Ẽ is also a finite union
of open sets (lemmas 1 and 2), so Ẽ is open (theorem 2.24) and E is closed (theorem 2.23). This proves that
(x, y) is both open and closed. The proofs for [x, y], (x, y], and [x, y) are identical to this one.
E is bounded: If |p| ≥ √3, then p² ≥ 3 and p ∉ E. So E is bounded by the interval (−√3, √3).
2
E is open and closed in Q: E is√
the set
rational
< 3, which means that
√ of all √
√ numbers p such that
√ 2<p √
E is the set of rational numbers in (− 3, − 2) ∪ ( 2, 3). We know that ± 2 and ± 3 are not rational, so
by lemma 4 we know that E is both a union of closed sets and also a union of open sets. So by theorem 2.24 we
know that E is both open and closed.
E is not compact: Drawing from the example in exercise 2.14, define the interval A_i to be:

    A_i = (−√3, −√2)           if i = 1
    A_i = (√(2 + 1/i), √3)     if i > 1

Because √(2 + 1/i) is irrational for every i ∈ N, we know by lemma 4 that each A_i is an open set. Define an open cover of E to be

    G = {A_i : i ∈ N}

This has no finite subcover, for the same reason that the cover in exercise 2.14 has no finite subcover: given any finite collection of elements of G, we can always find an element of E sufficiently close to √2 that is not contained in any of those elements of G.
Exercise 2.17
E is not countable: This is proven directly by theorem 2.14.
E is not dense: Let x = .660... and let r = .01. Then N_r(x) = (.65, .67), and every real number in this interval has a digit other than 4 or 7 in its decimal expansion. Therefore N_r(x) contains no points of E. This shows that x is an element of [0, 1] that is neither a point in E nor a limit point of E, which proves that E is not dense in [0, 1].
E is compact: By theorem 2.41, we can prove that E is compact by showing that it is bounded and closed.
E ⊂ [0, 1], so E is bounded. To show that E is closed, we need to show that every limit point of E is a member
of E.
Proof by contrapositive that every limit point of E is a member of E: let x be an element of [0, 1] such that
x 6∈ E. We will show that x is not a limit point. By the definition of membership in E, the fact that x 6∈ E
means that
    x = ∑_{i=1}^{∞} a_i / 10^i

where some a_i ∉ {4, 7}. Let k represent some index such that a_k ∉ {4, 7}. Choose r = 10^{−(k+1)}. Then the
neighborhood Nr (x) does not contain any points in E:
i) If ak+1 = 0, then the k + 1st decimal place will be either 9, 0, or 1 for every element of (x − r, x + r) and so
no element of Nr (x) is a member of E.
ii) If ak+1 = 9, then the k + 1st decimal place will be either 8, 9, or 0 for every element of (x − r, x + r) and so
no element of Nr (x) is a member of E.
iii) If a_{k+1} is neither 0 nor 9, then the kth decimal place will be a_k for every element of (x − r, x + r) and so no element of N_r(x) is a member of E.
This proves that there is some neighborhood Nr (x) that contains no points in E. This proves that x is not
a limit point of E. But x was an arbitrary point such that x 6∈ E, so this proves that (x 6∈ E → x is not a
limit point of E). By contrapositive, this proves that every limit point of E is a member of E. And this, by
definition, means that E is closed.
E is perfect: We have already shown that E is closed in the course of proving that E is compact. To prove
that E is perfect, we need to show that every point in E is a limit point of E.
Let x be an arbitrary point in E. By the definition of membership in E, we know that

    x = ∑_{i=1}^{∞} a_i / 10^i

where each a_i is either 4 or 7. Define a second number, x_k = ∑_{i=1}^{∞} b_i / 10^i, where

    b_i = a_i   if i ≠ k,
    b_k = 4     if a_k = 7,
    b_k = 7     if a_k = 4.
From this definition, we see that x_k differs from x only in the kth decimal place, and d(x, x_k) = 3 × 10^{−k}. We can use this information to show that x is a limit point of E. Let N_r(x) be any arbitrary neighborhood of x. From the Archimedean principle, we can find some integer k such that k > log₁₀(3/r). And this is algebraically equivalent to finding some integer k such that 3 × 10^{−k} < r. This means that we can find some x_k ∈ E (as defined above) in N_r(x). And this was an arbitrary neighborhood of an arbitrary point in E, so we have proven that every neighborhood of every point in E contains a second point in E: by definition, this means that every point in E is a limit point.
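The two arguments above (E misses the whole neighborhood (.65, .67), yet every point of E has arbitrarily close neighbors in E) can be checked numerically. The sketch below is my own illustration; it truncates decimal expansions at a fixed depth purely for convenience.

    from itertools import product

    # Approximate members of E: numbers in [0, 1] whose first `depth` decimal digits are all 4s and 7s.
    def members_of_E(depth=6):
        for digits in product("47", repeat=depth):
            yield int("".join(digits)) / 10 ** depth

    # Non-density: no (truncated) member of E lies in the neighborhood (.65, .67) around x = .66
    print(any(0.65 < e < 0.67 for e in members_of_E()))      # False

    # Perfectness: flipping the kth digit of a member of E gives another member at distance 3 * 10^-k
    print(round(abs(0.474747 - 0.474447), 10))               # 0.0003 = 3 * 10^-4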
Exercise 2.18
Section 2.44 describes the Cantor set. The Cantor set is a nonempty perfect subset of R¹. Each point in the Cantor set is an endpoint of some segment of the form (n/3^k, (n+1)/3^k) with n, k ∈ N, so each point in the Cantor set is rational.

Let E be the set {x + √2 : x is in the Cantor set} (that is, we’re shifting every element of the Cantor set √2 units to the right). Each element in E is irrational (exercise 1.1). The Cantor set was bounded by [0, 1], so E is clearly bounded by [√2, 1 + √2]. The proof that E is perfect is identical to the proof that the Cantor set is perfect (given in the book in section 2.44 and also in this document after exercise 2.4).
Exercise 2.19 a
We’re told that A and B are disjoint, so A ∩ B = ∅. And if A and B are closed, then by theorem 2.27 we know that Ā = A and B̄ = B. So we conclude that Ā ∩ B = A ∩ B̄ = A ∩ B = ∅. By definition, this means that A and B are separated.
Exercise 2.19 b
Let A and B be disjoint open sets. Assume that A ∩ B̄ is not empty. This assumption leads to a contradiction:

↦ A ∩ B̄ is not empty                                                        hypothesis to be contradicted
→ (∃x)(x ∈ A ∩ B̄)                                                           definition of non-emptiness
→ (∃x)(x ∈ A ∩ (B ∪ B′))                                                    definition of closure
→ (∃x)(x ∈ (A ∩ B) ∪ (A ∩ B′))                                              distributivity (section 2.11)
→ (∃x)(x ∈ ∅ ∪ (A ∩ B′))                                                    A and B are disjoint
→ (∃x)(x ∈ A ∩ B′)
→ (∃x)(x ∈ A ∧ x ∈ B′)                                                      definition of set intersection
→ (∃x)(x is an interior point of A ∧ x ∈ B′)                                A is an open set
→ (∃x)(x is an interior point of A ∧ x is a limit point of B)               definition of B′
→ (∃x)[(∃r ∈ R)(N_r(x) ⊆ A) ∧ (x is a limit point of B)]                    definition of interior point
→ (∃x)[(∃r ∈ R)(N_r(x) ⊆ A) ∧ (∀s ∈ R)(∃y ∈ B)(y ∈ N_s(x))]                 definition of limit point

If something is true for all s ∈ R, it’s true for any particular s. So choose s = r.

→ (∃x)[(∃r ∈ R)((N_r(x) ⊆ A) ∧ (∃y ∈ B)(y ∈ N_r(x)))]                       substitution of s = r
→ (∃x)[(∃r ∈ R)(∃y ∈ B)(y ∈ N_r(x) ∧ N_r(x) ⊆ A)]                           rearrangement of terms for clarity
→ (∃x)[(∃r ∈ R)(∃y ∈ B)(y ∈ A)]                                             definition of subset

This last step establishes a contradiction. The sets A and B are disjoint, so there cannot be any possible choice of variables such that y ∈ B and y ∈ A. Our assumption must have been incorrect: A ∩ B̄ is, in fact, empty. If we swap the roles of A and B, this same proof also shows us that Ā ∩ B is empty. By definition, then, A and B are separated.
Exercise 2.19 c
A is open in X: The set A is, by definition 2.18(a), a neighborhood of p. By theorem 2.19, then, A is an open
subset of X.
B is open in X: Let x be an arbitrary point in B. Let r = d(p, x) − δ. Let y be an arbitrary element in N_r(x). Proof that y ∈ B:

↦ d(p, y) + d(y, x) ≥ d(p, x)                                               property 2.15(c) of metric spaces
→ d(p, y) + d(y, x) − d(p, x) ≥ 0
→ d(p, y) + d(y, x) − d(p, x) ≥ 0  ∧  r > d(x, y)                           we know d(x, y) < r because y ∈ N_r(x)
→ d(p, y) + d(y, x) − d(p, x) ≥ 0  ∧  d(p, x) − δ > d(x, y)                 definition of r
→ d(p, y) + d(y, x) − d(p, x) ≥ 0  ∧  d(p, x) − δ − d(x, y) > 0
→ (d(p, y) + d(y, x) − d(p, x)) + (d(p, x) − δ − d(x, y)) > 0               the sum of a nonnegative and a positive number is positive
→ d(p, y) − δ > 0                                                           cancellation of like terms
→ d(p, y) > δ
→ y ∈ B                                                                     definition of membership in B
Our choice for y was arbitrary, so every point in this neighborhood of x is a member of B: by definition,
then, x is an interior point of B. But our choice of x ∈ B was also arbitrary, so this proves that every x ∈ B is
an interior point of B. By definition, this shows that B is open in X.
A and B are disjoint: If there were some x ∈ A ∩ B, then d(x, p) > δ and d(x, p) < δ, which violates the trichotomy rule for order relations.

A and B are separated: We’ve shown that A and B are disjoint open sets, so by exercise 2.19(b) we know that A and B are separated.
Exercise 2.19 d
If we are given any metric space X with a countable or finite number of elements, we can always find some
distance δ such that there are no elements x, y ∈ X with d(x, y) = δ (proof follows). This allows us to choose
some arbitrary p ∈ X and then use the results from part (c) to completely partition X into separated sets
A = {x ∈ X : d(x, p) > δ} and B = {x ∈ X : d(x, p) < δ}. This proves that (X is at most countable → X is not
connected). By contrapositive, this proves that (X is connected → X is uncountable), which is what we were
asked to prove.
Proof that an at most countable metric space X has some distance δ such that, for all x, y ∈ X, d(x, y) ≠ δ: Let X be an at most countable metric space with elements a₁, a₂, . . .. Then from theorem 2.13, we know that the set of all ordered pairs {(a_i, a_j) : a_i, a_j ∈ X} is at most countable. And there is a clear surjection from this set onto the set {d(a_i, a_j) : a_i, a_j ∈ X}: so the set of all distances between all combinations of points in X is at most countable.

Distances in metric spaces are always real numbers (definition 2.15), which are of course uncountable. Because there is an at most countable number of distances in the set {d(a_i, a_j) : a_i, a_j ∈ X}, we can choose a real number δ that is not in this set (otherwise we would have an at most countable set with R as a proper subset).

But this isn’t quite enough: if δ is so large that there are no elements in the set {x ∈ X : d(x, p) > δ}, then one of our partitions will be empty. We can avoid this problem (as long as X has at least two elements) by picking arbitrary x, y ∈ X and then choosing δ from the interval (0, d(x, y)). This interval is still uncountable, so we are still able to choose a δ that is not in the set of distances. And we know that d(x, y) > δ and d(x, x) < δ, so our partitions A and B will both be nonempty.
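The following sketch (my own, with made-up point data) mirrors the construction for a finite metric space sitting inside R: it picks a δ that is not an attained distance and splits the space around p into the two nonempty separated pieces A and B from part (c). Halving the smallest positive distance is just one convenient way to pick such a δ in the finite case.

    points = [0.0, 1.0, 2.5, 4.0]          # a hypothetical finite metric space (usual metric on R)
    p = points[0]

    distances = {abs(x - y) for x in points for y in points}
    delta = min(d for d in distances if d > 0) / 2   # strictly between 0 and the closest pair, never attained
    assert delta not in distances

    A = [x for x in points if abs(x - p) < delta]    # contains p itself
    B = [x for x in points if abs(x - p) > delta]
    print(A, B)                            # both nonempty, and no point is at distance exactly delta from p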
Exercise 2.20 a
Closures of connected sets are always connected. If A and B are not separated, then there is either some p in Ā ∩ B or some q in A ∩ B̄. Clearly, then, we have either p ∈ (A′ ∪ A) ∩ B̄ or q ∈ Ā ∩ (B′ ∪ B). Therefore Ā ∩ B̄ is nonempty, so the two closed sets Ā and B̄ are not separated.
Exercise 2.20 b
Consider the segment (0, 1) of the real line. Although this segment is open in R¹, it contains no interior points in R², since every neighborhood N_r(x, 0) contains the point (x, r/2). So let r = 1/4 and let E be the set N_r(0, 0) ∪ {(x, 0) : 0 < x < 1} ∪ N_r(1, 0).

Since the line segment {(x, 0) : 0 < x < 1} contains no interior points while every point in a neighborhood is an interior point (theorem 2.19), the interior of E is just N_r(0, 0) ∪ N_r(1, 0), and it is trivial to show that this is the union of two nonempty separated sets.
Exercise 2.21 a
The function p(t) can be thought of as the parameterization of a straight line connecting the points a and b.
For instance, one can picture two separated sets A and B in R², joined by the segment p(t) running from a to b (the figure accompanying this example is omitted).
The set A0 is the set of all t such that p(t) ∈ A, and the set B0 is the set of all t such that p(t) ∈ B. We’re
told that A and B are separated and are asked to prove that A0 and B0 are separated. Proof by contrapositive:
• There is some element t_A that is an element of A0 and a limit point of B0.

Assume that A0 and B0 are not separated. By definition 2.45, either Ā0 ∩ B0 or A0 ∩ B̄0 is nonempty. Assume that A0 ∩ B̄0 is nonempty (the sets are interchangeable, so the proof for Ā0 ∩ B0 ≠ ∅ is almost identical). Let t_A be one element of A0 ∩ B̄0: then either t_A ∈ A0 ∩ B0 or t_A ∈ A0 ∩ B0′. It can’t be the case that t_A ∈ A0 ∩ B0, because this would imply p(t_A) ∈ A ∩ B, which is impossible because A and B are separated sets. So it must be the case that t_A ∈ A0 ∩ B0′.
• There is a proportional relationship between d(t, t + r) and d(p(t), p(t + r)).
The distance between p(t) and p(t + r) is the vector norm of p(t + r) − p(t):
d(p(t), p(t+r)) = |p(t+r)−p(t)| = |[(1−(t+r))a+(t+r)b]−[(1−t)a+tb]| = |r(b−a)| = |r||(b−a)| = d(t, t+r)d(a, b)
• p(tA ) is an element of A and a limit point of B.
We’ve shown that tA is a limit point of B0 . So for any arbitrarily small r, there is some tB ∈ B0 such
that d(tA , tB ) < r. And this means that for any arbitrarily small r, there is some p(tB ) ∈ B such that
d(p(tA ), p(tB )) < |r(b − a)|. And since p(tA ) is in A and each p(tB ) is in B, this means that there is an element
of A that is a limit point of B: So A ∩ B is nonempty, which means that A and B are not separated.
This shows that if A0 and B0 are not separated, then A and B are not separated. By contrapositive, then,
if A and B are separated then A0 and B0 are separated. And this is what we were asked to prove.
Exercise 2.21 b
We are asked to find α ∈ (0, 1) such that p(α) ∉ A ∪ B: from the definition of the function p, this is equivalent to finding α ∈ (0, 1) such that α ∉ A0 ∪ B0. Proof that such an α exists:
Let E be defined as E = {d(0, x) : x ∈ B0 }. That is, E is the set of all distances between the point 0 (which
is in A0 , since p(0) = a) and elements of B0 . The set R has the greatest lower bound property and E is a subset
of R with a lower bound of 0, so the set E has a greatest lower bound. Let α represent this greatest lower bound.
• 0 < α < 1:

We know that α ≥ 0 because α is a greatest lower bound of distances. We know that α ≠ 0, because this would mean that d(0, 0) ∈ E, which means that 0 ∈ B0, which is false. And we know that α < 1, because 1 ∈ B0 and B0 is an open set: so 1 is an interior point of B0, which means that there is some small r such that 1 − r ∈ B0, and hence α ≤ 1 − r < 1.
We now need to show that we can use α to construct elements that are not in A0 ∪ B0 . We know that either
α ∈ B0 ,α ∈ A0 , or α 6∈ A0 ∪ B0 :
• if α ∈ B0:

We know that α is not a limit point of A0 (otherwise, α ∈ B0 ∩ Ā0, which contradicts the fact from part (a) that B0 and A0 are separated). So there is some neighborhood N_r(α) that contains no points in A0. Now consider the points p in the range (α − r, α). Each p is in the neighborhood N_r(α), so they aren’t members of A0. And for each p we see that d(0, p) < α, so they aren’t members of B0 (otherwise α wouldn’t have been a lower bound of E). So we see that for every p ∈ (α − r, α), we have p ∉ A0 ∪ B0.
• if α ∈ A0 :
This assumption leads to a contradiction. We’ve shown in part (a) that A0 and B0 are separated, so it can’t
be the case that α is a limit point of B0 (otherwise A0 ∩ B0 would not be empty). So there is some neighborhood Nr (α) that contains no points of B0 . But if the interval (0, α) contains no points of B0 and the interval
(α − r, α + r) contains no points of B0 , then their union (0, α + r) contains no points of B0 . But this contradicts
our definition of α as the greatest lower bound of E. So our initial assumption must have been incorrect: it
cannot have been the case that α ∈ A0 .
• if α 6∈ A0 and α 6∈ B0 :
Under this assumption, we clearly have α 6∈ A0 ∪ B0 .
Whatever assumption we make about the set containing α, we see that there will always be at least one
element α such that 0 < α < 1 and α 6∈ A0 ∪ B0 . And, from the definition of the function p, this means that
p(α) 6∈ A ∪ B. And this is what we were asked to demonstrate.
Exercise 2.21 c
Proof by contrapositive. Assume that E is not connected: then E could be described as the union of two
separated sets (i.e, E = A ∪ B). From part (b), we could then choose a, b, and t such that (1 − t)a + (t)b 6∈
A ∪ B = E. And this would mean that E is not convex. By contrapositive, if E is convex then E is connected.
Exercise 2.22
The metric space Rk clearly contains Qk as a subset. We know that Qk is countable from theorem 2.13. To
prove that Qk is dense in Rk , we need to show that every point in Rk is a limit point of Qk :
Let a = (a₁, a₂, . . . , a_k) be an arbitrary point in R^k and let N_r(a) be an arbitrary neighborhood of a. Let b = (b₁, b₂, . . . , b_k), where b_i is chosen to be a rational number such that a_i < b_i < a_i + r/√k (possible via theorem 1.20(b)). The point b is clearly in Q^k, and

    d(a, b) = √((a₁ − b₁)² + . . . + (a_k − b_k)²) < √(r²/k + . . . + r²/k) = √(kr²/k) = r

This shows that every point in R^k is a limit point of Q^k, which completes the proof that R^k is separable.
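A minimal numerical sketch (mine) of this construction: choose each coordinate b_i as a rational number within r/√k of a_i and confirm that d(a, b) < r. Rounding to a Fraction with a fixed denominator is just one convenient way to land a rational in the required interval.

    from fractions import Fraction
    from math import sqrt

    def rational_approximation(a, r):
        """Return a point b with rational coordinates such that d(a, b) < r."""
        k = len(a)
        step = r / sqrt(k)
        # pick b_i as a rational strictly between a_i and a_i + r/sqrt(k)
        return [Fraction(round((x + step / 2) * 10 ** 6), 10 ** 6) for x in a]

    a = [0.1234567, 2.718281, -1.414213]
    r = 0.01
    b = rational_approximation(a, r)
    print(sqrt(sum((x - float(q)) ** 2 for x, q in zip(a, b))) < r)   # True: b is a rational point in N_r(a)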
Exercise 2.23
Let X be a separable metric space. From the definition of separable in the previous exercise, we know that
X contains a countable dense subset E. For each ei ∈ E, let Nq (ei ) be a neighborhood with rational radius
q around point ei . Let {Vα } = {Nq (ei ) : q ∈ Q, i ∈ N} be the collection of all neighborhoods with rational
radius centered around members of E. This is a countable collection of countable sets, and therefore {Vα } has
countably many elements.
Let x be an arbitrary point in X, and let G be an arbitrary open set in X such that x ∈ G. Because G is open, we know that x is an interior point of G. So there is some neighborhood N_r(x) such that N_r(x) ⊆ G. But we can choose a rational q such that 0 < q < r/2, so that x ∈ N_q(x) ⊆ N_r(x) ⊆ G.

Because E is dense in X, every neighborhood of x contains some e ∈ E. So e ∈ N_q(x), which means that d(e, x) < q. But, on the other hand, this also means that d(x, e) < q, so that x ∈ N_q(e). And N_q(e) ∈ {V_α}, by definition.

Having shown that x ∈ N_q(e), we need to prove that N_q(e) ⊆ G. Let y be any point in N_q(e). We know that d(x, e) < q and d(e, y) < q. So d(x, y) ≤ d(x, e) + d(e, y) < 2q by definition 2.15(c) of metric spaces. But we defined q so that 0 < q < r/2, so we know that d(x, y) < r. This means that every y ∈ N_q(e) is also in N_r(x), i.e., N_q(e) ⊆ N_r(x). And we chose r so that N_r(x) ⊆ G, so by transitivity we know that N_q(e) ⊆ G.
We started by choosing an arbitrary element x in an arbitrary open set G ⊆ X, and proved that there was an
element Nq (e) ∈ {Vα } such that x ∈ Nq (e) ⊂ G. We’ve shown that {Vα } has a countable number of elements,
and each element is an open neighborhood. This proves that {Vα } is a base for X.
Exercise 2.24
We’re told that X is a metric space in which every infinite subset has a limit point. Choose some arbitrary δ
and construct a set {xi } as described in the exercise.
• The set {xi } must be finite: if this set were infinite, then it would have a limit point. And if it has a
limit point, then there would be multiple points of {xi } in some neighborhood of radius δ/2, which contradicts
our assumption that d(xi , xj ) > δ for each pair of points in {xi }.
• The neighborhoods of {xi } form a cover of X: Every x ∈ X must be contained in Nδ (xi ) for some
xi ∈ {xi }. If it weren’t, this would imply that d(x, xi ) > δ for each xi ∈ {xi }, so that we could have chosen x
to be an additional element in {xi }.
Let E_i represent the set {x_i} constructed in the above way when δ = 1/i, and consider the set E = ⋃_{i=1}^{∞} E_i.
We will show that E is a countable dense subset of X. This union E is a countable union of nonempty finite
sets, so E is countable. Each element of E was chosen from X, so clearly E ⊂ X. Proof that E is dense in X:
Choose an arbitrary x ∈ X. Choose an arbitrarily small radius r. From the Archimedean principle, we can find an integer n such that 1/n < r. We’ve shown that the neighborhoods of E_n cover X, so there is some e ∈ E_n such that x ∈ N_{1/n}(e). But if d(e, x) < 1/n, then d(x, e) < 1/n and so e ∈ N_{1/n}(x). And since 1/n < r, this implies that e ∈ N_r(x).
So we’ve chosen an arbitrary x and an arbitrary radius r, and shown that Nr (x) will always contain some
e ∈ En ⊂ E. This proves that every x is a limit point of E, which by definition means that E is dense in X.
This proves that X contains a countable dense subset, so by the definition in exercise 22 we have proven that
X is separable.
Exercise 2.25 a
Theorem 2.41 tells us that the set K, by virtue of being a compact space, has the property that every infinite
subset of K has a limit point. By exercise 24, this means that K is separable. And exercise 23 proves that every separable metric space has a countable base.
I’m not sure that this an appropriate proof, though. The question wants us to prove that K has a countable
base and that therefore K is separable, whereas this is a proof that K is separable and therefore has a countable
base. An alternate proof follows.
Exercise 2.25 b
We’re told that K is compact. Consider the set E_n = {N_{1/n}(k) : k ∈ K}, the set of neighborhoods of radius 1/n around every element of K. This is clearly an open cover of K, since x ∈ K → x ∈ N_{1/n}(x) ∈ E_n. And K is compact, so the open cover E_n must have some finite subcover: i.e., K must be covered by some finite number of neighborhoods from E_n. Let {V_n} represent a finite subcover of the open cover E_n.

Now consider the union V = ⋃_{n=1}^{∞} V_n, a countable collection of finite covers of K. We will prove that V is a base for K.
Choose an arbitrary x ∈ K, and an arbitrary open set G such that x ∈ G ⊂ K. Because G is open and x ∈ G, x must be an interior point of G. And since x is an interior point, there is some neighborhood N_r(x) such that N_r(x) ⊂ G. Now choose an integer m such that 1/m < r/2. Because V_m is an open cover of K, there is some k ∈ K and neighborhood N_{1/m}(k) in the open cover V_m such that x ∈ N_{1/m}(k). Proof that this neighborhood N_{1/m}(k) is a subset of G:

Assume that y is an element of N_{1/m}(k). From the fact that x ∈ N_{1/m}(k), we know that d(x, k) < 1/m. And from the fact that y ∈ N_{1/m}(k), we know that d(k, y) < 1/m. And from the definition of metric spaces, we know that d(x, y) ≤ d(x, k) + d(k, y), which means that d(x, y) < 2/m < r (since we chose 1/m < r/2). This proves that every element of N_{1/m}(k) is in the neighborhood N_r(x), so N_{1/m}(k) ⊆ N_r(x) ⊆ G. And this shows that N_{1/m}(k) ⊆ G.
We’ve chosen an arbitrary element x and an arbitrary element G, and have shown that there is some
N1/m (k) ∈ V such that x ∈ N1/m (k) ⊂ G. By the definition in exercise 23, this proves that K has a base. And
this base is a countable collection of finite sets, so it’s a countable base.
This countable base V is identical to the set E constructed in exercise 24. We saw there that this base was
a dense subset, so by the definition of “separable” in exercise 23 we know that K is separable.
Exercise 2.26
Let {Vn } be the countable base we constructed in exercises 24 and 25.
{Vn } acts as a countable subcover to any open cover of X. Proof: let G be an arbitrary open set from an
arbitrary open cover of X. Choose any x ∈ G. We can find an element N1/m (k) ∈ V such that x ∈ N1/m (k) ⊆ G
by the same method used in the second half of exercise 2.25(b).
We must now prove that a finite subset of {Vn } can always be chosen to act as a subcover for any open cover.
We’ll do this with a proof by contradiction.
Let {Wα } be an open cover of X and assume that there is no finite subset of {Vn } that acts as a subcover
of {Wα }. Construct Fn and E as described in the exercise. Because E is an infinite subset, it has a limit point:
let x be this limit point. Every neighborhood Nr (x) contains infinitely many points of E (theorem 2.20), and
we can prove that this arbitrary neighborhood Nr (x) must contain every point of E:
We have constructed the Fn sets in such a way that Fn+1 ⊆ Fn for each n. So if Nr (x) doesn’t contain any
points from Fj , then it doesn’t contain any points from Fj+1 , Fj+2 ,etc because y ∈ Fj+1 → y ∈ Fj . So if Nr (x)
doesn’t contain any points from Fj , then it has no more than j − 1 points of E: i.e, Nr (x) ∩ E would be finite.
But x is a limit point of E, so Nr (x) must contain infinitely many points of E. So Nr (x) ∩ E contains one point
from each Fn .
But this means that E = {x}, a set with one element. If it contained a second point, we could find a
neighborhood Nr (x) that failed to contain this second point, which would contradict our finding that every
neighborhood of x contains E. And this means that we chose the same element from each F_n: i.e., x was in each F_n. So x ∈ ⋂F_n, which contradicts our assumption that ⋂F_n was empty.
Exercise 2.27
To prove that P is perfect, we must show that every limit point of P is a point of P , and vice-versa.
Proof that every limit point of P is a point of P : assume that x is a limit point of P . Choose some arbitrary
r and let s = 2r . By the definition of limit point, every neighborhood Ns (x) contains some y ∈ P . And from the
fact that y ∈ P , we know that Ns (y) contains uncountably many points of P . But for each p ∈ Ns (y), we see
that d(x, p) ≤ d(x, y) + d(y, p) so that d(x, p) ≤ 2s = r. So the neighborhood Nr (x) contains uncountably many
points of P . But Nr (x) was an arbitrary neighborhood of x, so this proves that x ∈ P .
Proof that every point of P is a limit point of P : assume that x ∈ P . By the definition of P , this means that
every neighborhood of x contains uncountably many points of P . So clearly every neighborhood of x contains
at least one point of P , which means that x is a limit point of P .
This proves that P is perfect. To prove that P c ∩ E is countable, we let {Vn } be a countable base for X and
let W be the union of all Vn for which E ∩ Vn is countable. We will show that W c = P :
Proof that P ⊆ W c : Assume that x ∈ P . By the definition of membership in P , we know that x is a
condensation point of E. By the definition of condensation point, then, we know that every neighborhood
Nr (x) contains uncountably many points of E. And this means that every open set Vn containing x contains
uncountably many points of E. So x is not a member of any countable Vn ∩ E, which means that x ∈ W c .
Proof that W c ⊆ P : Assume that x ∈ W c . Choose any arbitrary neighborhood Nr (x): by the definition
of the base, there is some Vn ⊂ Nr (x) such that x ∈ Vn . By definition of W c , the fact that x ∈ Vn means
that E ∩ Vn is uncountable, which of course means that Vn has uncountably many elements of E. And we’ve
established that Vn ⊆ Nr (x), which means that Nr (x) has uncountably many elements of E. But Nr (x) was an
arbitrary neighborhood of x, so this proves that every neighborhood of x has uncountably many elements of E:
by definition, then, x is a condensation point and therefore x ∈ P .
So we have proven that W^c = P. Taking the complement of each of these, we see that W = P^c. And the set W ∩ E is the union of countably many sets of the form E ∩ V_n, each of which is at most countable: so W ∩ E is at most countable. And W = P^c, so E ∩ P^c is at most countable, which is what we were asked to prove.
Note that this proof assumed only that X had a countable base, so it is valid for any separable metric space.
Exercise 2.28
Let E be a closed set in a separable metric space X. Let P be the set of all condensation points for E.
Proof that E ∩ P is perfect: assume that x ∈ E ∩ P . Clearly x is a point of P , which means that x is a limit
point of E (from the definition of P ) and a limit point of P (because P was shown to be perfect in exercise 27).
So x is a limit point of E ∩ P . Now, assume that x is a limit point of E ∩ P . From this, we know that x is a
point of E (because E is closed) and a point of P (because P is perfect). So we have shown that every point of
E ∩ P is a limit point of E ∩ P and vice-versa: this proves that E ∩ P is perfect.
Proof that E ∩ P c is at most countable: this was proven in exercise 27.
And E = (E ∩ P ) ∪ (E ∩ P c ): that is, E is the union of a perfect set and a countable set. And this is what
we were asked to prove.
Exercise 2.29
Let E be an arbitrary open set in R¹ and let {G_i} be an arbitrary collection of disjoint segments such that ⋃G_i = E. Each G_i is open and nonempty, so each G_i contains at least one rational point. Therefore there can’t be more G_i elements than rational numbers, which means that there are at most a countable number of elements in {G_i}. But {G_i} was an arbitrary set of disjoint segments whose union is E; therefore E cannot be the union of an uncountable number of disjoint segments.
Exercise 2.30
Proof by contradiction: Let {F_n} be a collection of closed sets such that ⋃F_n = R, and suppose that each F_n has an empty interior.

↦ (∀n)(F_n° = ∅)                            hypothesis of contradiction
→ ⋃F_n° = ∅
→ (⋃F_n°)^c = R                             taking the complement of both sides
→ ⋂(F_n°)^c = R                             theorem 2.22
→ ⋂ (F_n^c)‾ = R                            exercise 2.9d (the complement of the interior is the closure of the complement)
→ (∀n)((F_n^c)‾ = R)

This last step tells us that F_n^c is dense in R for every n. And F_n was closed, so F_n^c is open. Appealing to the proof of Baire’s theorem in exercise 3.22 (which uses only terms and concepts introduced in chapter 2), we see that ⋂F_n^c is nonempty. Therefore:

↦ ⋂F_n^c ≠ ∅
→ (⋃F_n)^c ≠ ∅                              theorem 2.22
→ ⋃F_n ≠ R                                  taking the complement of both sides

And this contradicts our original claim that ⋃F_n = R, so one of our initial assumptions must be wrong. And our only assumption was that each F_n has an empty interior, so by contradiction we have proven that at least one F_n must have a nonempty interior.
Exercise 3.1
The exercise does not explicitly say that we’re operating in the metric space Rk , but there are two reasons to
assume that we are. First, we are supposed to understand the meaning of absolute value in this metric space,
and Rk is the only metric space we’ve encountered so far for which absolute value has been defined. Second,
Rudin appears to use sn to represent series in Rk and pn to represent series in arbitrary metric spaces. So we’ll
assume that we’re operating in Rk for this exercise.
We’re told that the sequence {s_n} converges to some value s: that is, for any arbitrarily small ε there is some integer N such that n > N implies d(s, s_n) < ε. But it’s always the case that d(|s|, |s_n|) ≤ d(s, s_n) (exercise 1.13), so for this same ε and N we see that d(|s|, |s_n|) ≤ d(s, s_n) < ε. By transitivity, this means that for any choice of ε there is some integer N such that n > N implies d(|s|, |s_n|) < ε. By definition of convergence, this means that the sequence {|s_n|} converges to |s|.
The converse is not true. If we let sk = (−1)k , the sequence {|sn |} clearly converges while the sequence {sn }
clearly does not.
Exercise 3.2
The exercise asks us to calculate the limit rather than prove it rigorously, so we might be able to manipulate
the expression algebraically:
    √(n² + n) − n = (√(n² + n) − n) · (√(n² + n) + n)/(√(n² + n) + n) = n/(√(n² + n) + n) = n/(n√(1 + 1/n) + n) = 1/(√(1 + 1/n) + 1)

and then use basic calc 2 techniques to show that this limit is 1/2. If we need to prove it more rigorously, we can use theorem 3.14. We will first need to prove that this sequence has a limit, which theorem 3.14 tells us can be done by proving that the sequence is bounded and monotonically increasing. We can then find the limit by finding the least upper bound of the sequence.
The sequence is bounded

The quantity (√(n² + n) − n) is bounded below: from the fact that √(n² + n) > √(n²) = n, we know that √(n² + n) − n > 0. And it’s bounded above by 1/2:

↦ √(n² + n) − n ≥ 1/2                       hypothesis to be contradicted
→ √(n² + n) ≥ 1/2 + n
→ n² + n ≥ 1/4 + n + n²                     both sides are positive, so squaring preserves the inequality
→ 0 ≥ 1/4                                   subtract n + n² from both sides

This last step is clearly false, so our initial assumption must be false: this shows that (√(n² + n) − n) is bounded above by 1/2.
The sequence is monotonically increasing

↦ s_{n+1} > s_n
↔ √((n+1)² + (n+1)) − (n+1) > √(n² + n) − n                      definition of s_n
↔ √((n+1)² + (n+1)) − √(n² + n) > (n+1) − n                      rearrange terms
↔ √(n² + 3n + 2) − √(n² + n) > 1                                 some algebra
↔ (n² + 3n + 2) + (n² + n) − 2√(n² + 3n + 2)·√(n² + n) > 1        square both sides
↔ (2n² + 4n + 2) − 2√(n⁴ + 4n³ + 5n² + 2n) > 1                   simplify
↔ −2√(n⁴ + 4n³ + 5n² + 2n) > 1 − (2n² + 4n + 2)                  rearrange terms
↔ 2√(n⁴ + 4n³ + 5n² + 2n) < 2n² + 4n + 1                         multiply both sides by −1 and simplify
↔ 4n⁴ + 16n³ + 20n² + 8n < 4n⁴ + 16n³ + 20n² + 8n + 1             square both sides
↔ 0 < 1                                                          subtract 4n⁴ + 16n³ + 20n² + 8n from each side
The sequence has a least upper bound of 1/2

We’ve already shown that 1/2 is an upper bound. To show that it is the least such upper bound, choose any x ∈ [0, 1/2) and assume that it is an upper bound for {s_n}.

↦ (∀n ∈ N)(s_n ≤ x)                         hypothesis to be contradicted
→ (∀n ∈ N)(√(n² + n) − n ≤ x)               definition of s_n
→ (∀n ∈ N)(√(n² + n) ≤ x + n)
→ (∀n ∈ N)(n² + n ≤ x² + 2xn + n²)          square both sides
→ (∀n ∈ N)(0 ≤ x² + (2x − 1)n)              subtract n² + n from both sides
→ (∀n ∈ N)(−x² ≤ (2x − 1)n)                 subtract x² from both sides
→ (∀n ∈ N)(n ≤ −x²/(2x − 1))                dividing by 2x − 1 < 0 (since x < 1/2) reverses the inequality

This last step must be false: by the Archimedean principle, we can always find some n ∈ N greater than any specific quantity. So, by contradiction, we can find s_n > x for every x ∈ [0, 1/2): this means that the upper bound cannot be less than 1/2. But 1/2 is an upper bound, so we see that 1/2 is the least such upper bound.
We have shown that {s_n} is a monotonically increasing bounded sequence with a least upper bound of 1/2: by theorem 3.14, this proves that the limit of {s_n} is 1/2.
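A quick numerical check (not part of the proof) that s_n = √(n² + n) − n does increase toward 1/2 and never exceeds it, matching the bound and monotonicity arguments above.

    from math import sqrt

    for n in (1, 10, 100, 10_000, 1_000_000):
        s = sqrt(n * n + n) - n
        print(n, s)     # roughly 0.414, 0.488, 0.4988, 0.4999875, 0.4999999 -- approaching 1/2 from below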
Exercise 3.3
Note that we are not asked to find the limit of this sequence. We are only asked to show that it converges and
that 2 is an upper bound. By theorem 3.14, we can prove convergence by proving that the sequence is bounded
and monotonically increasing.
The sequence is bounded above by √(2 + √2)

Although we’re asked to show that the sequence has an upper bound of 2, we can prove the stronger result that it has an upper bound of √(2 + √2). We can prove this by induction. We see that 0 < s₁ = √2 < 2. Now assume that 0 < s_n < 2:

↦ 0 < s_n < 2                               hypothesis of induction
→ 0 < √s_n < √2
→ 0 < 2 + √s_n < 2 + √2
→ √(2 + √s_n) < √(2 + √2)
→ s_{n+1} < √(2 + √2)

By induction, then, s_n < √(2 + √2) < 2 for all n.
The sequence is monotonically increasing

Proof by induction. We can immediately see that s₁ = √2 < √(2 + √2) = s₂. Now, assume that we have proven s_{i−1} < s_i for i = 1 . . . n.

↦ s_n = √(2 + √(s_{n−1}))                   definition of s_n
→ s_n > √(2 + √(s_{n−2}))                   our hypothesis of induction tells us s_{n−1} > s_{n−2}
→ s_n > s_{n−1}                             definition of s_{n−1}

We’ve shown that s_n is a monotonically increasing sequence that is bounded above. By theorem 3.14, this is sufficient to prove that it converges.
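A short numerical sketch (mine) of the recursion s₁ = √2, s_{n+1} = √(2 + √s_n): the iterates increase and stay below the bound √(2 + √2) ≈ 1.848 established above.

    from math import sqrt

    bound = sqrt(2 + sqrt(2))       # the upper bound from the induction argument
    s = sqrt(2)                     # s_1
    for _ in range(7):
        assert s < bound
        s_next = sqrt(2 + sqrt(s))  # the recursion from the exercise
        assert s_next > s           # monotonically increasing
        s = s_next
    print(s, bound)                 # the iterates settle near 1.831, below 1.8478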
Exercise 3.4
From the recursive definition we are given, we can see that s₂ = 0 and s_{2m} = 1/4 + (1/2)s_{2m−2}. So we can use theorem 3.26 to find the limit of s_{2m}:

    lim_{m→∞} s_{2m} = 1/4 + 1/8 + 1/16 + . . . = (∑_{n=0}^{∞} (1/2)ⁿ) − 1 − 1/2 = 1/(1 − 1/2) − 1 − 1/2 = 1/2

From the same recursive definition, we see that s₁ = 0 and s_{2m+1} = 1/2 + (1/2)s_{2m−1}. So the limit of s_{2m+1} is given by

    lim_{m→∞} s_{2m+1} = 1/2 + 1/4 + 1/8 + . . . = (∑_{n=0}^{∞} (1/2)ⁿ) − 1 = 1/(1 − 1/2) − 1 = 1

This shows that as n increases, the terms of {s_n} come arbitrarily close to either 1/2 or 1, so the upper and lower limits of the sequence are lim sup s_n = 1 and lim inf s_n = 1/2.
Exercise 3.5
Let a* = lim sup_{n→∞} a_n, let b* = lim sup_{n→∞} b_n, and let s* = lim sup_{n→∞}(a_n + b_n).
If a∗ and b∗ are both finite:
From theorem 3.17(a) we know that there is some subsequence {a_{n_j} + b_{n_j}} of {a_n + b_n} such that lim_{j→∞}(a_{n_j} + b_{n_j}) = s* (that is, the supremum of E as defined there is also a member of E: E is a closed set). Passing to a further subsequence if necessary, we may assume that lim_{j→∞} a_{n_j} and lim_{j→∞} b_{n_j} both exist. The limit of a_{n_j} must be less than or equal to the supremum a*, and the limit of b_{n_j} must be less than or equal to the supremum b*, so

    s* = lim_{j→∞} a_{n_j} + lim_{j→∞} b_{n_j} ≤ lim sup_{n→∞} a_n + lim sup_{n→∞} b_n = a* + b*

By transitivity, this shows that s* ≤ a* + b*: and, by the definitions of these terms, this proves that

    lim sup_{n→∞}(a_n + b_n) ≤ lim sup_{n→∞} a_n + lim sup_{n→∞} b_n

If either a* or b* is infinite:
We’re asked to discount the possibility that a∗ = ∞ and b∗ = −∞. In all other cases when one or both of these
values is infinite, the inequality can be easily shown to resolve to ∞ = ∞ or −∞ = −∞.
Exercise 3.6a
    a_n = √(n+1) − √n = (√(n+1) − √n) · (√(n+1) + √n)/(√(n+1) + √n) = 1/(√(n+1) + √n)

We can compare this last term to a known series:

    a_n = 1/(√(n+1) + √n) > 1/(2√(n+1)) = (1/2) · 1/(n+1)^{1/2}

We know that the series ∑ 1/n^{1/2} diverges by theorem 3.28 (and any constant multiple of its tail diverges as well), so the terms of {a_n} are larger than the terms of a divergent series. By the comparison theorem 3.25, this tells us that ∑ a_n is divergent.
Exercise 3.6b
    a_n = (√(n+1) − √n)/n = ((√(n+1) − √n)/n) · ((√(n+1) + √n)/(√(n+1) + √n)) = 1/(n(√(n+1) + √n)) = 1/(√(n³ + n²) + √(n³))

We can compare this last term to a known series:

    1/(√(n³ + n²) + √(n³)) < 1/√(n³) = 1/n^{3/2}

We know that the series ∑ 1/n^{3/2} converges by theorem 3.28, so the terms of {a_n} are smaller than the terms of a convergent series. By the comparison theorem 3.25, this tells us that ∑ a_n is convergent.
Exercise 3.6c
For n ≥ 1, we can show that 0 ≤ (ⁿ√n − 1) < 1:

↦ (∀n ∈ N)(1 ≤ n < 2ⁿ)                      the n < 2ⁿ can easily be proven via induction
→ (∀n ∈ N)(1 ≤ ⁿ√n < 2)                     take the nth root of each term
→ (∀n ∈ N)(0 ≤ ⁿ√n − 1 < 1)                 subtract 1 from each term

We know that the series ∑ xⁿ converges when 0 ≤ x < 1, so by the comparison theorem 3.25 we know that ∑ a_n is convergent.
Exercise 3.6d
This probably is more difficult than it looks at first, mainly because for complex z it’s not true that 1 + zⁿ > zⁿ – in fact, without absolute value signs the inequality z₁ > z₂ has no meaning whatsoever (see exercise 1.8). It’s also not true that |1 + zⁿ| > |zⁿ|. The best we can do is appeal to the triangle inequality.
If |z| < 1 then by the triangle inequality we have

    lim_{n→∞} |1/(1 + zⁿ)| ≥ lim_{n→∞} 1/(|1| + |zⁿ|) = 1

and therefore by theorem 3.23 the series doesn’t converge. If |z| = 1 then we similarly have

    lim_{n→∞} |1/(1 + zⁿ)| ≥ lim_{n→∞} 1/(|1| + |zⁿ|) = 1/2

and by theorem 3.23 the series again fails to converge. If |z| > 1 then we can use the ratio test:

    lim_{n→∞} |(1 + zⁿ)/(1 + zⁿ⁺¹)| = lim_{n→∞} |(z⁻ⁿ/z⁻ⁿ) · (1 + zⁿ)/(1 + zⁿ⁺¹)| = lim_{n→∞} |(z⁻ⁿ + 1)/(z⁻ⁿ + z)| = |1/z| < 1

and by theorem 3.34 the series converges. I’m not totally satisfied with this proof because I can’t totally justify the claim that lim_{n→∞} z⁻ⁿ = 0 without appealing to the polar representation z⁻ⁿ = r⁻ⁿe^{−inθ} and the fact that |e^{ixθ}| = 1 for all x.
Exercise 3.7
Define the partial sum t_n to be

    t_n = ∑_{k=1}^{n} √(a_k)/k

We can show that the series ∑ √(a_n)/n converges by showing that the sequence {t_n} converges; we can do this by showing that {t_n} is bounded and monotonically increasing (theorem 3.14).

the sequence is monotonically increasing:
From the definition of partial sums, we know that t_n = √(a_n)/n + t_{n−1} for all n. And we’re told a_n > 0 for every n, so t_n > t_{n−1} for every n.

the sequence is bounded:
From the Cauchy–Schwarz inequality, we know that ∑ a_k b_k ≤ √(∑ a_k² · ∑ b_k²). Applying this to the given series (with √(a_k) and 1/k in the roles of a_k and b_k), we see that

    ∑ √(a_k) · (1/k) ≤ √( ∑ a_k · ∑ 1/k² )

We are told that ∑ a_n converges, and from theorem 3.28 we know that ∑ 1/n² converges. Therefore (by theorem 3.50) we know that their product converges to some α, and so the right-hand side of the above inequality converges to √α. This shows us that the series ∑ √(a_n)/n is bounded by √α, and therefore the sequence {t_n} is bounded by √α.

By assuming that ∑ a_n is convergent and that a_n > 0, we’ve shown that t_n is bounded and monotonically increasing. By theorem 3.14 this is sufficient to show that t_n converges. And this is what we were asked to prove.
Exercise 3.8
We’re told that {b_n} is bounded. Let α be an upper bound of {|b_n|}. We’re also told that ∑ a_n converges: so for any arbitrarily small ε, we can find an integer N such that

    |∑_{k=m}^{n} a_k| ≤ ε/α    for all n, m such that n ≥ m ≥ N

which is algebraically equivalent to

    |∑_{k=m}^{n} a_k| · α ≤ ε    for all n, m such that n ≥ m ≥ N

which, since |b_k| ≤ α for every k, means that

    |∑_{k=m}^{n} a_k b_k| ≤ |∑_{k=m}^{n} a_k| · α ≤ ε    for all n, m such that n ≥ m ≥ N

By theorem 3.22, this is sufficient to prove that ∑ a_n b_n converges.
Exercise 3.9a
    α = lim sup_{n→∞} ⁿ√|n³| = lim sup_{n→∞} (ⁿ√n)³ = 1

So the radius of convergence is R = 1/α = 1.
Exercise 3.9b
Example 3.40(b) mentions that “the ratio test is easier to apply than the root test”, although we’re never
explicitly told what the ratio test is, how it might relate to the ratio test for series convergence, or how to apply
the ratio test to example (b). According to some course handouts I found online 2 , we can’t simply apply the
ratio test in the same way we use the root test. That is, we can’t use the fact that
    β = lim sup_{n→∞} |a_{n+1}/a_n| = lim sup_{n→∞} (2ⁿ⁺¹/(n+1)!) · (n!/2ⁿ) = lim sup_{n→∞} 2/(n+1) = 0

to assume that the radius of convergence is 1/β = ∞. Instead, we look at theorem 3.37, which tells us that α ≤ β, so that

    R = 1/α ≥ 1/β = ∞
Exercise 3.9c
    α = lim sup_{n→∞} ⁿ√(2ⁿ/n²) = lim sup_{n→∞} 2/(ⁿ√n)² = 2/(lim sup_{n→∞} ⁿ√n)²

The last step of this equality is justified by theorem 3.3(c). From theorem 3.20(c), we know the limit of this last term:

    2/(lim sup_{n→∞} ⁿ√n)² = 2/1 = 2

So α = 2, which gives us a radius of convergence of 1/2.
Exercise 3.9d
    α = lim sup_{n→∞} ⁿ√|n³/3ⁿ| = lim sup_{n→∞} (ⁿ√n)³/3 = (lim sup_{n→∞} ⁿ√n)³/3

The last step of this equality is justified by theorem 3.3(c). From theorem 3.20(c), we know the limit of this last term:

    (lim sup_{n→∞} ⁿ√n)³/3 = 1/3

So α = 1/3, which gives us a radius of convergence of 3.
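As a rough numerical cross-check (my own, not in the original), the nth-root quantities whose lim sups were computed in parts (a), (c), and (d) can be evaluated at a large n; convergence is slow, but the values drift toward 1, 2, and 1/3 respectively. Part (b) is omitted because its root tends to 0 and its radius is infinite.

    from math import exp, log

    n = 10_000
    print("3.9a  (a_n = n^3):     ", exp(3 * log(n) / n))                 # ~1.003 -> 1,   so R = 1
    print("3.9c  (a_n = 2^n/n^2): ", exp((n * log(2) - 2 * log(n)) / n))  # ~1.996 -> 2,   so R = 1/2
    print("3.9d  (a_n = n^3/3^n): ", exp((3 * log(n) - n * log(3)) / n))  # ~0.334 -> 1/3, so R = 3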
Exercise 3.10
We’re told that infinitely many of the coefficients of ∑ a_n zⁿ are positive integers. This means that for every N, there is some n > N such that a_n ≥ 1. Therefore

↦ lim sup_{n→∞} |a_n| ≥ 1
→ lim sup_{n→∞} ⁿ√|a_n| ≥ 1                 from the fact that k ≥ 1 → k^{1/n} ≥ 1
→ 1/R ≥ 1                                   theorem 3.39
→ 1 ≥ R

This final step indicates that the radius of convergence is at most 1.

2 http://math.berkeley.edu/∼gbergman/
Exercise 3.11a
Proof by contrapositive: we will assume that ∑ a_n/(1 + a_n) converges and show that this implies that ∑ a_n converges.

To simplify the form of ∑ a_n/(1 + a_n), we can multiply the numerator and denominator of each term of the series by 1/a_n to get

    ∑ 1/(1/a_n + 1)

For this to converge, it is necessary that

    lim_{n→∞} 1/(1/a_n + 1) = 0
which can only happen if the limit of the denominator is ∞; and this can only happen if the limit of 1/a_n is ∞. That is, it must be the case that

    lim_{n→∞} a_n = 0

This alone is not sufficient to prove that ∑ a_n converges. It does, however, allow us to make the terms of a_n arbitrarily small. For this proof, it’s helpful to recognize that there is some integer N₁ such that a_n < 1 for all n > N₁.
From the assumption that ∑ a_n/(1 + a_n) converges, we know that for every ε there is some N_ε such that

    ∑_{k=m}^{n} a_k/(1 + a_k) < ε    for every n ≥ m ≥ N_ε

Let N be the larger of {N₁, N_ε}. For any ε and any n ≥ m ≥ N, we can produce two inequalities:

    ∑_{k=m}^{n} a_k/(1 + 1) < ∑_{k=m}^{n} a_k/(1 + a_k)
    ∑_{k=m}^{n} a_k/(1 + a_k) < ε    for every n ≥ m ≥ N

The first is true because each k is larger than N₁ (so that a_k < 1); the second is true because each k is larger than N_ε. Together, these inequalities tell us that

    ∑_{k=m}^{n} a_k/2 < ε    for every n ≥ m ≥ N

or, equivalently, that

    ∑_{k=m}^{n} a_k < 2ε    for every n ≥ m ≥ N

And our choice of ε was arbitrary, so we have shown that for every ε there is some N that makes the last statement true: and this is the definition of convergence for the series ∑ a_n.

We’ve shown that the convergence of ∑ a_n/(1 + a_n) implies the convergence of ∑ a_n. By contrapositive, the fact that ∑ a_n diverges is proof that ∑ a_n/(1 + a_n) diverges.
Exercise 3.11b
From the definition of sn , we can see that
sn+1 = a1 + . . . + an+1 = sn + an+1
and we’re told that every a_n > 0, so we know that s_{n+1} > s_n. By induction, we also know that s_n ≥ s_m whenever n ≥ m. And from this, we know that 1/s_m ≥ 1/s_n whenever n ≥ m. Therefore:
    a_{N+1}/s_{N+1} + . . . + a_{N+k}/s_{N+k} ≥ a_{N+1}/s_{N+k} + . . . + a_{N+k}/s_{N+k} = (a_{N+1} + . . . + a_{N+k})/s_{N+k}

A bit of algebraic manipulation shows us that a_{N+1} + . . . + a_{N+k} = s_{N+k} − s_N, so this last inequality is equivalent to

    a_{N+1}/s_{N+1} + . . . + a_{N+k}/s_{N+k} ≥ (s_{N+k} − s_N)/s_{N+k} = 1 − s_N/s_{N+k}

which is what we were asked to prove.
To show that ∑ a_n/s_n is divergent, we once again look at the sequence {s_n}. We’ve already determined that {s_n} is an increasing sequence, and we know that it’s not convergent (otherwise, by the definition 3.21 of “convergent series”, ∑ a_n would be convergent). Therefore, we know that {s_n} is not bounded (from theorem 3.14, which says that a bounded monotonic sequence is convergent).

Now let s_N be an arbitrary element of {s_n} and let ε be an arbitrarily small real. From the fact that {s_n} is not bounded, we can make s_{N+k} arbitrarily large by choosing a sufficiently large k. This means that, for any N and any arbitrarily small ε, we can make s_N/s_{N+k} < ε by choosing a sufficiently large k. And we’ve established that

    a_{N+1}/s_{N+1} + . . . + a_{N+k}/s_{N+k} ≥ 1 − s_N/s_{N+k}

which means that, by choosing sufficiently large k, we get the inequality

    ∑_{j=N+1}^{N+k} a_j/s_j ≥ 1 − s_N/s_{N+k} ≥ 1 − ε

We’ve now established everything we need to show that ∑ a_n/s_n is divergent. Choose any ε such that 0 < ε < 1/2. From theorem 3.22, in order for ∑ a_n/s_n to be convergent we must be able to find some integer N such that

    ∑_{k=m}^{n} a_k/s_k < ε    for all n ≥ m ≥ N

But there can be no such N, because for every N we’ve shown that there is some k such that

    ∑_{j=N+1}^{N+k} a_j/s_j > 1 − ε > ε

(note that 1 − ε > ε because 0 < ε < 1/2). And this is sufficient to show that the series does not converge.
Exercise 3.11c
To prove the inequality:

↦ s_n ≥ s_{n−1}                             {s_n} is an increasing sequence (see part b)
→ 1/s_n ≤ 1/s_{n−1}
→ a_n/s_n² ≤ a_n/(s_n s_{n−1})              multiply both sides by a_n/s_n
→ a_n/s_n² ≤ (s_n − s_{n−1})/(s_n s_{n−1})  s_n − s_{n−1} = a_n (see part b)
→ a_n/s_n² ≤ 1/s_{n−1} − 1/s_n              algebra
Now consider the summation

    ∑_{k=2}^{n} (1/s_{k−1} − 1/s_k) = (1/s₁ − 1/s₂) + (1/s₂ − 1/s₃) + . . . + (1/s_{n−1} − 1/s_n)

Most of the terms in this summation cancel one another out: the summation “telescopes down” and simplifies to

    ∑_{k=2}^{n} (1/s_{k−1} − 1/s_k) = 1/s₁ − 1/s_n
As we saw in part (b), the terms of {s_n} increase without bound as n → ∞, so that

    ∑_{k=2}^{∞} (1/s_{k−1} − 1/s_k) = 1/s₁

Now, by the inequality we proved above, we know that

    ∑_{k=2}^{∞} a_k/s_k² ≤ ∑_{k=2}^{∞} (1/s_{k−1} − 1/s_k) = 1/s₁
We can add one term to each side so that our summation starts at 1 instead of at 2 to get

    ∑_{k=1}^{∞} a_k/s_k² ≤ a₁/s₁² + 1/s₁

We can now show that the series converges. Define the partial sum of this series to be

    t_n = ∑_{k=1}^{n} a_k/s_k²

We’ve shown that {t_n} is bounded above by a₁/s₁² + 1/s₁, and we know that {t_n} is monotonically increasing because each a_n is positive. Therefore, by theorem 3.14 we know that {t_n} converges; and so by definition 3.21 we know that its associated series converges. And this is what we were asked to prove.
Exercise 3.11d
The series ∑ a_n/(1 + n²a_n) always converges. From the fact that a_n > 0, we can establish the following chain of inequalities:

    a_n/(1 + n²a_n) = ((1/a_n)·a_n)/((1/a_n)·(1 + n²a_n)) = 1/(1/a_n + n²) < 1/n²

From this, we see that

    ∑_{n=1}^{∞} a_n/(1 + n²a_n) < ∑_{n=1}^{∞} 1/n²

We know that ∑ 1/n² converges (theorem 3.28), and therefore ∑ a_n/(1 + n²a_n) converges by the comparison test of theorem 3.25.
The series ∑ a_n/(1 + na_n) may or may not converge. If a_n = 1/n, for instance, the summation becomes

    ∑ a_n/(1 + na_n) = ∑ (1/n)/2 = (1/2) ∑ 1/n

which is divergent by theorem 3.28.

To construct a convergent series, let a_n be defined as

    a_n = 1    if n = 2^m − 1 (m ∈ Z)
    a_n = 0    otherwise

The series ∑ a_n is divergent, since there are infinitely many integers of the form 2^m − 1. But the series ∑ a_n/(1 + na_n) is convergent:

    ∑_{n=0}^{∞} a_n/(1 + na_n) = ∑_{m=0}^{∞} 1/2^m = 2

This series is convergent to 2 by theorem 3.26.
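A numerical sketch (mine) contrasting the two behaviours described above for ∑ a_n/(1 + n a_n): the harmonic choice a_n = 1/n gives partial sums that keep growing, while the sparse 0/1 choice gives partial sums that stay bounded. (The bit test (n + 1) & n == 0 just detects n = 2^m − 1; the sum here starts at n = 1, whereas the series written above also includes an n = 0 term and converges to 2.)

    N = 100_000

    # a_n = 1/n: each term equals 1/(2n), so the partial sums grow like (1/2) log N
    harmonic_case = sum((1 / n) / (1 + n * (1 / n)) for n in range(1, N + 1))

    # a_n = 1 when n = 2^m - 1 and 0 otherwise: the nonzero terms are 1/2^m
    sparse_case = sum(1 / (1 + n) for n in range(1, N + 1) if (n + 1) & n == 0)

    print(harmonic_case)   # ~6.0 and still climbing as N grows: divergent
    print(sparse_case)     # ~0.99998, bounded: the series converges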
Exercise 3.12a
establishing the inequality
From the definition of r_n, we see that

    r_k = ∑_{m=k}^{∞} a_m = a_k + ∑_{m=k+1}^{∞} a_m = a_k + r_{k+1}

so that a_k = r_k − r_{k+1}. And because each a_k is positive, we see that r_k > r_{k+1}, which (from transitivity) means that r_m > r_n when n > m (that is, {r_n} is a decreasing sequence). With these equalities, we can form our proof.
Take m ≤ n and consider the sum

    ∑_{k=m}^{n} a_k/r_k = a_m/r_m + . . . + a_n/r_n

From the fact that r_m ≥ r_k ≥ r_n for m ≤ k ≤ n, we know that a_k/r_m ≤ a_k/r_k ≤ a_k/r_n for all such k, so that

    (a_m + . . . + a_n)/r_m ≤ a_m/r_m + . . . + a_n/r_n ≤ (a_m + . . . + a_n)/r_n

Now, from the fact that a_k = r_k − r_{k+1}, we see that this inequality is equivalent to

    ((r_m − r_{m+1}) + (r_{m+1} − r_{m+2}) + . . . + (r_n − r_{n+1}))/r_m ≤ a_m/r_m + . . . + a_n/r_n ≤ ((r_m − r_{m+1}) + . . . + (r_n − r_{n+1}))/r_n

Notice that most of the terms of these numerators cancel one another out, leaving us with

    (r_m − r_{n+1})/r_m ≤ a_m/r_m + . . . + a_n/r_n ≤ (r_m − r_{n+1})/r_n

Taking the two leftmost terms of this inequality and performing some simple algebra then gives us

    1 − r_{n+1}/r_m ≤ a_m/r_m + . . . + a_n/r_n
This is close to what we want to prove, but our index is off by one. But each a_n is positive and each r_n is the sum of positive terms, so a_{n+1}/r_{n+1} is strictly positive. By adding this term to the right-hand side of the inequality, we accomplish two things: we correct our index and we make this a strict inequality (<) instead of a non-strict inequality (≤).

    1 − r_{n+1}/r_m < a_m/r_m + . . . + a_n/r_n + a_{n+1}/r_{n+1}

This last statement is true whenever n ≥ m, which means that it’s true whenever n + 1 > m. By a simple replacement of variables (n′ = n + 1), then, we know that

    1 − r_{n′}/r_m < a_m/r_m + . . . + a_{n′}/r_{n′}

whenever n′ > m, which is what we were asked to prove.
proof of divergence
We’re told that ∑ a_n converges, so ∑ a_n = α for some α ∈ R. From the definition of r_n, we see that there is a relationship between {r_n} and α:

    r_n = ∑_{k=n}^{∞} a_k = ∑_{k=1}^{∞} a_k − ∑_{k=1}^{n−1} a_k = α − ∑_{k=1}^{n−1} a_k

As n → ∞, this last term approaches α − α = 0, and so we see that lim_{n→∞} r_n = 0.
Now choose any arbitrarily small ε from the interval (0, 1/2) and choose any arbitrary integer N. From the inequality we verified above, we see that

    a_N/r_N + . . . + a_{N+n}/r_{N+n} > 1 − r_{N+n}/r_N

We can choose n to be arbitrarily large; in doing so, r_{N+n} approaches zero while r_N remains fixed. So, by choosing a sufficiently large n, we can guarantee that r_{N+n}/r_N < ε. This choice of n gives us the inequality

    a_N/r_N + . . . + a_{N+n}/r_{N+n} > 1 − r_{N+n}/r_N > 1 − ε > ε

where the last step of this chain of inequalities is justified by our choice of 0 < ε < 1/2.

We’ve shown that there is some ε such that, for every N, we can find an n with

    ∑_{k=N}^{N+n} a_k/r_k > ε

From theorem 3.22, this is sufficient to show that ∑ a_n/r_n diverges (since we’ve proven the negation of the “for every ε > 0 there is some N . . . ” statement of the theorem).
Exercise 3.12b
↦ r_n > r_{n+1} > 0                                                     established in part (a)
→ √r_n > √r_{n+1}                                                       take the square root of each side
→ 2√r_n > √r_n + √r_{n+1}                                               add √r_n to each side
→ 2 > (√r_n + √r_{n+1})/√r_n                                            divide by √r_n
→ 2(√r_n − √r_{n+1}) > (√r_n + √r_{n+1})(√r_n − √r_{n+1})/√r_n          multiply each side by a positive term
→ 2(√r_n − √r_{n+1}) > (r_n − r_{n+1})/√r_n                             simplify the right-hand side
→ 2(√r_n − √r_{n+1}) > a_n/√r_n                                         we established a_n = r_n − r_{n+1} in part (a)

Having established this inequality, we see that

    ∑_{k=1}^{n} a_k/√r_k < ∑_{k=1}^{n} 2(√r_k − √r_{k+1})

and many of the terms in the right-hand summation cancel one another out, so that the series “telescopes down” and simplifies to

    ∑_{k=1}^{n} a_k/√r_k < 2(√r_1 − √r_{n+1})

so that

    lim_{n→∞} ∑_{k=1}^{n} a_k/√r_k ≤ lim_{n→∞} 2(√r_1 − √r_{n+1}) = 2√r_1

This shows that this sum of nonnegative terms is bounded, which is sufficient to show that it converges by theorem 3.24.
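A numeric illustration (mine) of both halves of exercise 3.12 for the concrete convergent series a_n = 1/n²: the partial sums of a_n/r_n keep growing, while the partial sums of a_n/√r_n stay below 2√r₁. The tail r_n is approximated by summing up to a large cutoff, which is an implementation shortcut rather than part of the argument.

    from math import sqrt

    def a(n):
        return 1.0 / (n * n)

    CUTOFF = 2_000_000
    r_n = sum(a(k) for k in range(1, CUTOFF))     # r_1, approximately pi^2/6

    bound = 2 * sqrt(r_n)                         # the bound 2*sqrt(r_1) from part (b)
    ratio_sum = 0.0
    sqrt_ratio_sum = 0.0
    for n in range(1, 2_000):
        ratio_sum += a(n) / r_n
        sqrt_ratio_sum += a(n) / sqrt(r_n)
        r_n -= a(n)                               # r_{n+1} = r_n - a_n

    print(ratio_sum)              # keeps growing as the range increases: sum a_n/r_n diverges
    print(sqrt_ratio_sum, bound)  # stays below 2*sqrt(r_1) ~ 2.565: sum a_n/sqrt(r_n) converges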
Exercise 3.13
Let ∑ a_n be a series that converges absolutely to α and let ∑ b_n be a series that converges absolutely to β. We can prove that the Cauchy product of these two series (definition 3.48) converges absolutely by showing that its partial sums of absolute values are bounded.

↦ ∑_{k=0}^{n} |c_k| = ∑_{k=0}^{n} |∑_{j=0}^{k} a_j b_{k−j}|          definition 3.48 of c_k
  ≤ ∑_{k=0}^{n} ∑_{j=0}^{k} |a_j b_{k−j}|                            triangle inequality
  = ∑_{k=0}^{n} ∑_{j=0}^{k} |a_j| |b_{k−j}|                          1.33(c)

We can expand this summation out to get:

  = |a_0||b_0| + (|a_0||b_1| + |a_1||b_0|) + . . . + (|a_0||b_n| + |a_1||b_{n−1}| + . . . + |a_n||b_0|)

Define B_n to be ∑_{k=0}^{n} |b_k| and A_n to be ∑_{k=0}^{n} |a_k|. We can rearrange these terms to get:

  = |a_0|B_n + |a_1|B_{n−1} + |a_2|B_{n−2} + . . . + |a_n|B_0

Now we can add several nonnegative terms to get

  ≤ |a_0|B_n + |a_1|(B_{n−1} + |b_n|) + |a_2|(B_{n−2} + |b_{n−1}| + |b_n|) + . . . + |a_n|(B_0 + |b_1| + . . . + |b_n|)
  = |a_0|B_n + |a_1|B_n + |a_2|B_n + . . . + |a_n|B_n
  = A_n B_n
  = (∑_{k=0}^{n} |a_k|)(∑_{k=0}^{n} |b_k|)

and this last product is bounded independently of n because both series converge absolutely. This shows that each partial sum of ∑ |c_k| is bounded above, and bounded below by 0 (since it’s a series of nonnegative terms). By theorem 3.24, this is sufficient to prove that ∑ |c_k| converges; and since ∑ c_k is the Cauchy product, we have proved that the Cauchy product ∑ c_k converges absolutely.
Exercise 3.14a
Choose any arbitrarily small ε > 0. We’re told that lim{s_n} = s, which means that there is some N such that d(s, s_n) < ε whenever n > N. So we’ll rearrange the terms of σ_n a bit:

    σ_n = (∑_{k=0}^{n} s_k)/(n + 1) = (∑_{k=0}^{N} s_k + ∑_{k=N+1}^{n} s_k)/(n + 1)

Whenever n > N we know that d(s, s_n) < ε, which means that −ε < s_n − s < ε, or that s − ε < s_n < s + ε. This gives us the inequality

    (∑_{k=0}^{N} s_k + ∑_{k=N+1}^{n} (s − ε))/(n + 1) < σ_n < (∑_{k=0}^{N} s_k + ∑_{k=N+1}^{n} (s + ε))/(n + 1)

The terms in some of these summations don’t depend on k, so we can further rewrite this as

    (∑_{k=0}^{N} s_k + (n − (N + 1))(s − ε))/(n + 1) < σ_n < (∑_{k=0}^{N} s_k + (n − (N + 1))(s + ε))/(n + 1)

    (∑_{k=0}^{N} s_k)/(n + 1) + n(s − ε)/(n + 1) − (N + 1)(s − ε)/(n + 1) < σ_n < (∑_{k=0}^{N} s_k)/(n + 1) + n(s + ε)/(n + 1) − (N + 1)(s + ε)/(n + 1)

For many of these terms, the numerators are constant with respect to n. Therefore many of these terms will approach zero as n → ∞.

    lim_{n→∞} n(s − ε)/(n + 1) ≤ lim_{n→∞} σ_n ≤ lim_{n→∞} n(s + ε)/(n + 1)
    s − ε ≤ lim_{n→∞} σ_n ≤ s + ε

And this is just another way of saying d(s, lim σ_n) ≤ ε. And ε was arbitrarily small, so we have shown that lim_{n→∞} σ_n = s.
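A small sketch (mine) of this averaging in action: for a concrete convergent sequence, the Cesàro means σ_n land next to the same limit s, just as proved above.

    # Cesaro mean sigma_n = (s_0 + ... + s_n)/(n + 1) for s_n = 1 + (-1)^n/(n + 1), which converges to 1
    N = 100_000
    total = 0.0
    for n in range(N):
        total += 1.0 + (-1.0) ** n / (n + 1)
    print(total / N)    # very close to 1, the limit of s_n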
Exercise 3.14b
Let s_n = (−1)ⁿ. Then ∑_{k=0}^{n} s_k is either equal to 0 or 1, depending on n, and

    0/(n + 1) ≤ σ_n ≤ 1/(n + 1)

Taking the limit as n → ∞ gives us

    0 ≤ lim_{n→∞} σ_n ≤ 0

which can only be true if

    lim_{n→∞} σ_n = 0
Exercise 3.14c
Define s_n to be

    s_n = (1/2)ⁿ + k    if n = k³ (k ∈ Z)
    s_n = (1/2)ⁿ        otherwise

There are no more than ∛n perfect cubes among the first n integers, so the list s₀, . . . , s_n contains no more than ⌊∛n⌋ + 1 terms of the form (1/2)ⁿ + k. This gives us the inequality

    ∑_{k=0}^{n} s_k ≤ ∑_{k=0}^{n} (1/2)^k + ∑_{k=0}^{⌊∛n⌋} k ≤ 2 + ∛n(∛n + 1)/2

where the last step in this chain of inequalities is justified by the common summation ∑_{k=1}^{m} k = m(m + 1)/2. Continuing, we see that (for n ≥ 1)

    ∑_{k=0}^{n} s_k ≤ 2 + ∛n(∛n + 1)/2 ≤ 2 + ∛n(∛n + ∛n)/2 = 2 + n^{2/3}

We can now analyze the value of σ_n:

    σ_n = (∑_{k=0}^{n} s_k)/(n + 1) ≤ (2 + n^{2/3})/(n + 1) ≤ (2 + n^{2/3})/n = 2/n + 1/n^{1/3}

Taking the limit of each side as n → ∞ gives us

    lim_{n→∞} σ_n ≤ lim_{n→∞} (2/n + 1/n^{1/3}) = 0

Every term of {s_n} is greater than zero, so the arithmetic average σ_n is greater than zero. Therefore 0 ≤ lim σ_n ≤ 0, and therefore lim σ_n = 0.
Exercise 3.14
proving the equality

    ∑_{k=1}^{n} k a_k = (s₁ − s₀) + 2(s₂ − s₁) + . . . + n(s_n − s_{n−1})
                      = −s₀ + (1 − 2)s₁ + (2 − 3)s₂ + . . . + ((n − 1) − n)s_{n−1} + n·s_n
                      = −s₀ − s₁ − . . . − s_n + (n + 1)s_n

Therefore, if we divide this by n + 1, we get

    (1/(n + 1)) ∑_{k=1}^{n} k a_k = (−s₀ − s₁ − . . . − s_n)/(n + 1) + s_n = −σ_n + s_n

establishing convergence

We’re told that lim_{n→∞} n a_n = 0, so by part (a) we know that

    lim_{n→∞} (∑_{k=1}^{n} k a_k)/(n + 1) = 0

We’re also told that {σ_n} converges to some value σ, so by theorem 3.3(a) we know that

    lim_{n→∞} ((∑_{k=1}^{n} k a_k)/(n + 1) + σ_n) = 0 + σ = σ

And since s_n = (∑_{k=1}^{n} k a_k)/(n + 1) + σ_n, this is sufficient to prove that lim_{n→∞} s_n = σ.
Exercise 3.15
If you think I’m going to go through this tedious exercise, you can lick me where I shit.
Exercise 3.16a
{xn } is monotonically decreasing
From the fact that 0 < √a < x_n we know that a < x_n², and therefore

    x_{n+1} = (1/2)(x_n + a/x_n) < (1/2)(x_n + x_n²/x_n) = x_n

This shows that x_n > x_{n+1} for all n, which proves that {x_n} is monotonically decreasing.
The limit of {x_n} exists and is not less than √a

First, we show that {x_n} is bounded below by √a. We know that x₁ > √a because we chose it to be. And if x_n > √a for any n, we have

↦ x_n ≠ √a                                  assumed
→ x_n − √a ≠ 0
→ (x_n − √a)² > 0                           1.18(d)
→ x_n² − 2x_n√a + a > 0
→ x_n² + a > 2x_n√a
→ (x_n² + a)/(2x_n) > √a
→ (1/2)(x_n + a/x_n) > √a
→ x_{n+1} > √a                              definition of x_{n+1}

So by induction, we know that x_n > √a for all n. We’ve now demonstrated that {x_n} is monotonically decreasing and is bounded below by √a, so we’re guaranteed that the limit of {x_n} exists and that lim{x_n} ≥ √a.
The limit of {x_n} is not greater than √a

Now we show that lim{x_n} ≤ √a. Assume that lim{x_n} = b for some b > √a. By the definition of “limit”, we can find N such that n > N implies d(x_n, b) < ε for any arbitrarily small ε. We’ll choose ε = √(b − a) (which is positive since b > √a ≥ 1 implies b > a).

↦ d(x_n, b) < ε
→ d(x_n, b) < √(b − a)                      chosen value of ε
→ x_n − b < √(b − a)                        metric on R¹
→ x_n < √(b − a) + √b
We can then calculate x_{n+1} in terms of x_n to get a chain of inequalities:

    x_{n+1} = (x_n² + a)/(2x_n) < ((√(b − a) + √b)² + a)/(2(√(b − a) + √b)) = ((b − a) + 2√b·√(b − a) + b + a)/(2(√(b − a) + √b))
            = (2b + 2√b·√(b − a))/(2(√(b − a) + √b)) = √b·(√b + √(b − a))/(√b + √(b − a)) = √b

This shows us that if the distance between x_n and b becomes less than √(b − a) (which it is guaranteed to do eventually because lim{x_n} = b), then the next term in the sequence {x_n} will be less than √b. And since we’ve already shown that the sequence is monotonically decreasing, this means that every subsequent term will be even farther from b, which contradicts our assumption that lim{x_n} = b. Therefore the limit is not b: but b was an arbitrary value greater than √a, so we’ve shown that the limit cannot be any value greater than √a.

Having shown that lim{x_n} exists, that the limit is not less than √a, and that the limit is not greater than √a, we have therefore proven that lim{x_n} = √a.
Exercise 3.16b
From the definition of x_{n+1} and e_n, we have
e_{n+1} = x_{n+1} − √α = (x_n² + α)/(2x_n) − √α = (x_n² − 2x_n√α + α)/(2x_n) = (x_n − √α)²/(2x_n) = e_n²/(2x_n) < e_n²/(2√α)
where the last step is justified by the fact that x_n > √α (see part (a)).
The next part can be proven by induction. Setting n = 1 in the above inequality and writing β = 2√α, we have
e_2 < e_1²/(2√α) = e_1²/β = β(e_1/β)²
And, if the statement e_n < β(e_1/β)^{2^{n−1}} is true for n, then
e_{n+1} < e_n²/(2√α) = (1/β)(e_n)² < (1/β)[β(e_1/β)^{2^{n−1}}]² = β(e_1/β)^{2^n}
which shows that it is therefore true for n + 1. By induction, this is sufficient to show that it is true for all n ∈ N.
Exercise 3.16c
When α = 3 and x_1 = 2, we have
e_1/β = (x_1 − √α)/(2√α) = (2 − √3)/(2√3)
which can be shown to be less than 1/10 through simple algebra. Then, by part (b), we have
e_n < β(e_1/β)^{2^{n−1}} < 2√3·(1/10)^{2^{n−1}} < 4·(1/10)^{2^{n−1}}
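As a quick numerical illustration (a minimal Python sketch of my own, not part of the original solution), we can run the recursion x_{n+1} = (x_n + α/x_n)/2 with α = 3 and x_1 = 2 and compare the error e_n = x_n − √3 against the bound 4·(1/10)^{2^{n−1}} derived above; the loop stops at n = 4 only because double precision runs out of digits beyond that.

import math

alpha, x = 3.0, 2.0                          # alpha = 3, x1 = 2 as in part (c)
for n in range(1, 5):
    e_n = x - math.sqrt(alpha)               # error term e_n = x_n - sqrt(alpha)
    bound = 4 * 0.1 ** (2 ** (n - 1))        # claimed bound 4 * (1/10)^(2^(n-1))
    print(n, e_n, bound, e_n < bound)        # the bound holds at every step
    x = (x + alpha / x) / 2                  # the recursion x_{n+1} = (x_n + alpha/x_n)/2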
Exercise 3.17
Lemma 1: x_n > √a → x_{n+1} < √a, and x_n < √a → x_{n+1} > √a
↦ x_n > √a
↔ x_n(√a − 1) > √a(√a − 1)                    multiplying by √a − 1 > 0 (recall a > 1)
↔ x_n√a − x_n > a − √a
↔ x_n√a + √a > a + x_n
↔ √a(x_n + 1) > a + x_n
↔ √a > (a + x_n)/(x_n + 1)
↔ √a > x_{n+1}                                definition of x_{n+1}
This shows that x_n > √a → x_{n+1} < √a. If ">" is replaced with "<" in each of the above steps, we will have also constructed a proof that x_n < √a → x_{n+1} > √a.
Lemma 2: x_n > √a → x_{n+2} < x_n, and x_n < √a → x_{n+2} > x_n
↦ x_n > √a
↔ x_n − √a > 0
↔ 2(x_n − √a)(x_n + √a) > 0                   multiplied by positive terms, so it's still > 0
↔ 2(x_n² − a) > 0
At this point, the algebraic steps become bizarre and seemingly nonsensical. But bear with me.
↔ 2x_n² + 0·x_n − 2a > 0
↔ 2x_n² + (1 + a − 1 − a)x_n − 2a > 0
↔ x_n + x_n² + a·x_n + x_n² > a + a·x_n + a + x_n
↔ x_n(1 + x_n) + x_n(a + x_n) > a(1 + x_n) + (a + x_n)
↔ [x_n(1 + x_n) + x_n(a + x_n)]/(1 + x_n) > [a(1 + x_n) + (a + x_n)]/(1 + x_n)     division by a positive term
↔ x_n[1 + (a + x_n)/(1 + x_n)] > a + (a + x_n)/(1 + x_n)
↔ x_n(1 + x_{n+1}) > a + x_{n+1}
↔ x_n > (a + x_{n+1})/(1 + x_{n+1})
↔ x_n > x_{n+2}                               definition of x_{n+2}
This shows that x_n > √a → x_{n+2} < x_n. If ">" is replaced with "<" in each of the above steps, we will have also constructed a proof that x_n < √a → x_{n+2} > x_n.
Exercise 3.17a
We're forced to choose x_1 such that x_1 > √a, so we can use induction with lemma 2 to show that {x_{2k+1}} (the subsequence of {x_n} consisting of elements with odd indices) is a monotonically decreasing sequence.
Exercise 3.17b
Because we chose x_1 such that x_1 > √a, lemma 1 tells us that x_2 < √a. We can then use induction with lemma 2 to show that {x_{2k}} (the subsequence of {x_n} consisting of elements with even indices) is a monotonically increasing sequence.
Exercise 3.17c
We can show that {x_n} converges to √a by showing that we can make d(x_n, √a) arbitrarily small. We do this by demonstrating the relationship between d(x_n, √a) and d(x_1, √a).
↦ e_{n+2} = x_{n+2} − √a = (a + x_{n+1})/(1 + x_{n+1}) − √a = (a + x_{n+1} − √a − x_{n+1}√a)/(1 + x_{n+1})
→ e_{n+2}(1 + x_{n+1}) = a + x_{n+1} − √a − x_{n+1}√a
→ e_{n+2}(1 + x_{n+1}) = x_{n+1}(1 − √a) − √a(1 − √a)
→ e_{n+2}(1 + x_{n+1}) = (x_{n+1} − √a)(1 − √a)
→ e_{n+2}(1 + x_{n+1}) = e_{n+1}(1 − √a)
→ e_{n+2} = e_{n+1}·(1 − √a)/(1 + x_{n+1})
This tells us how to express e_{n+2} in terms of e_{n+1}. We can use this same equality to express e_{n+1} in terms of e_n, giving us
e_{n+2} = e_{n+1}·(1 − √a)/(1 + x_{n+1}) = e_n·(1 − √a)/(1 + x_n)·(1 − √a)/(1 + x_{n+1})
        = e_n·(1 − √a)/(1 + x_n)·(1 − √a)/(1 + (a + x_n)/(1 + x_n))
        = e_n·(1 − √a)/(1 + x_n)·(1 − √a)(1 + x_n)/(1 + x_n + a + x_n)
        = e_n·(1 − √a)²/(1 + 2x_n + a)
        < e_n·(√a − 1)²/(a − 1)
The last step in this chain of inequalities is justified by the fact that 1 + 2x_n + a > a − 1 > 0 (recall that a > 1 and x_n > 0). Continuing, we have
e_{n+2} < e_n·(√a − 1)²/(a − 1) = e_n·(√a − 1)/(√a + 1) = c·e_n,   0 < c < 1
This tells us how to express e_{n+2} in terms of e_n. We can use this same inequality to express e_{2n+2} in terms of e_2 and e_{2n+1} in terms of e_1:
e_{2n+2} < c·e_{2n} < c(c·e_{2n−2}) < … < cⁿ·e_2
e_{2n+1} < c·e_{2n−1} < c(c·e_{2n−3}) < … < cⁿ·e_1
And since 0 < c < 1, this means that we can make cⁿ arbitrarily small by taking sufficiently large n; and e_n < c^{⌊(n−1)/2⌋}·max{e_1, e_2}, so we can make e_n arbitrarily small by taking sufficiently large n; and this shows that lim_{n→∞} e_n = 0. Finally, remember that we defined e_n to be d(x_n, √a). So lim_{n→∞} d(x_n, √a) = 0, which is sufficient to prove that lim_{n→∞} x_n = √a.
Exercise 3.17d
As shown above, every two iterations reduce the error term by a factor of (√a − 1)/(√a + 1) (linear convergence). For the algorithm in exercise 16, the error term after n iterations was on the order of 10^{−2ⁿ} (quadratic convergence).
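A hedged numeric sketch of my own (not part of the solution) makes the linear-versus-quadratic distinction visible; here a = 2 and both recursions start at 2 > √2:

import math

a = 2.0
x, y = 2.0, 2.0                          # x follows exercise 3.17, y follows exercise 3.16
for n in range(1, 7):
    x = (a + x) / (1 + x)                # 3.17: error shrinks by a roughly constant factor
    y = (y + a / y) / 2                  # 3.16: error is roughly squared at each step
    print(n, abs(x - math.sqrt(a)), abs(y - math.sqrt(a)))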
Exercise 3.18
Lemma 1: {x_n} is decreasing.
We're asked to choose x_1 > a^{1/p}. And if x_n > a^{1/p}, we have
↦ x_n > a^{1/p}
→ x_n^p > a
→ 0 > a − x_n^p
→ p·x_n^p > p·x_n^p + a − x_n^p
→ p·x_n^p > (p − 1)x_n^p + a
→ (p·x_n^p)/(p·x_n^{p−1}) > ((p − 1)x_n^p + a)/(p·x_n^{p−1})
→ x_n > ((p − 1)/p)·x_n + a/(p·x_n^{p−1})
→ x_n > x_{n+1}                               definition of x_{n+1}
Lemma 2: 0 < k < 1 → p(1 − k) > 1 − k^p
Let k be a positive number less than 1 and let p be an integer greater than 1.
↦ p = 1⁰ + 1¹ + 1² + … + 1^{p−1}              (p terms, each equal to 1)
→ p > 1 + k + k² + … + k^{p−1}                (each k^j ≤ 1, and k¹ < 1)
→ p > (k^p − 1)/(k − 1)                        geometric sum
→ p(k − 1) < k^p − 1                           multiplying by k − 1 < 0 reverses the inequality
→ p(1 − k) > 1 − k^p
Lemma 3: x_n > a^{1/p}
We know that x_1 > a^{1/p} because we chose it. And if x_n > a^{1/p}, we have:
↦ x_n > a^{1/p} > 0
→ 1 > a^{1/p}/x_n > 0
→ p(1 − a^{1/p}/x_n) > 1 − a/x_n^p                            from lemma 2
→ p − 1 > p·a^{1/p}/x_n − a/x_n^p                              expand and rearrange the terms
→ (p − 1)x_n^p > p·a^{1/p}·x_n^{p−1} − a                       multiply both sides by x_n^p
→ (p − 1)x_n^p + a > p·a^{1/p}·x_n^{p−1}                       rearrange the terms and simplify
→ ((p − 1)x_n^p + a)/(p·x_n^{p−1}) > (p·a^{1/p}·x_n^{p−1})/(p·x_n^{p−1})      divide both sides by p·x_n^{p−1}
→ ((p − 1)/p)·x_n + a/(p·x_n^{p−1}) > a^{1/p}                  simplify
→ x_{n+1} > a^{1/p}                                            definition of x_{n+1}
Finally: lim_{n→∞} x_n = a^{1/p}
We’ve shown that {xn } is decreasing (lemma 1)√and that it’s bounded below (lemma 3), which is sufficient
to show that it has some limit x∗ with x∗ ≥ p a. From the definition of limit, we can find N such that
n > N → d(xn , x∗ ) < 2p
. From this, we see that we can find xn and xn+1 such that:
↦ d(x_n, x*) < ε/(2p) ∧ d(x_{n+1}, x*) < ε/(2p)
→ d(x_n, x_{n+1}) ≤ d(x_n, x*) + d(x*, x_{n+1}) < ε/p          triangle inequality
→ x_n − x_{n+1} < ε/p                                          lemma 1: x_n > x_{n+1}
→ x_n − [((p − 1)/p)·x_n + a/(p·x_n^{p−1})] < ε/p              definition of x_{n+1}
→ x_n/p − a/(p·x_n^{p−1}) < ε/p
→ (x_n^p − a)/(p·x_n^{p−1}) < ε/p
→ x_n − a/x_n^{p−1} < ε
This last statement tells us that lim_{n→∞} (x_n − a/x_n^{p−1}) = 0, or that
↦ lim_{n→∞} (x_n − a/x_n^{p−1}) = 0                            assumed
→ lim_{n→∞} x_n(1 − a/x_n^p) = 0
→ x*(1 − a/(x*)^p) = 0                                          theorem 3.3c
→ a/(x*)^p = 1                                                  since x* ≥ a^{1/p} > 0
→ a^{1/p} = x*
From the definition of x* as the limit of {x_n}, this tells us that lim_{n→∞} x_n = a^{1/p}, which is what we wanted to prove.
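A minimal Python sketch of the recursion analyzed above (my own illustration, assuming p ≥ 2 and a starting value above the p-th root):

def pth_root(a, p, x=None, steps=20):
    """Iterate x_{n+1} = ((p-1)*x_n + a / x_n**(p-1)) / p, i.e. Newton's method for x^p = a."""
    x = max(a, 1.0) if x is None else x      # any starting point above the p-th root works
    for _ in range(steps):
        x = ((p - 1) * x + a / x ** (p - 1)) / p
    return x

print(pth_root(32.0, 5))   # should be close to 2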
Exercise 3.19
The idea behind this proof is to consider the elements of the line segment [0, 1] in the form of their base-3
expansion. When we do this, we notice that the mth iteration of the Cantor set eliminates every number with
a 1 as the mth digit of its ternary decimal expansion.
From equation 3.24 in the book, we know that a real number r is not in the Cantor set iff it lies in some interval of the form
((3k + 1)/3^m, (3k + 2)/3^m),    m, k ∈ N
Let {αn } represent an arbitrary ternary sequence (that is, each digit is either 0,1, or 2). We can determine the
necessary and sufficient conditions for x(α) to fall into such an interval:
x(α) ∉ Cantor   iff   x(α) ∈ ((3k + 1)/3^m, (3k + 2)/3^m) for some m, k
                iff   (∃m) Σ_{n=1}^{∞} α_n/3^n ∈ ((3k + 1)/3^m, (3k + 2)/3^m)
                iff   (∃m) Σ_{n=1}^{∞} α_n/3^{n−m} ∈ (3k + 1, 3k + 2)
We then split up the summation into three distinct parts.
iff   (∃m) Σ_{n=1}^{m−1} α_n/3^{n−m} + Σ_{n=m}^{m} α_n/3^{n−m} + Σ_{n=m+1}^{∞} α_n/3^{n−m} ∈ (3k + 1, 3k + 2)
iff   (∃m) Σ_{n=1}^{m−1} α_n·3^{m−n} + α_m + Σ_{n=m+1}^{∞} α_n/3^{n−m} ∈ (3k + 1, 3k + 2)
Every term in the leftmost sum is divisible by three, so the leftmost sum is itself divisible by three. This gives
us, for some j ∈ N,
iff   (∃m) 3j + α_m + Σ_{n=m+1}^{∞} α_n/3^{n−m} ∈ (3k + 1, 3k + 2)
Without specifying a certain sequence {αn }, we can’t evaluate the rightmost summation. But we can establish
bounds for it. Each αn is either 0,1, or 2. So the summation is largest whenever αn = 2 for all n, and it’s
smallest when αn = 0 for all n. This gives us the bound
0 = Σ_{n=m+1}^{∞} 0/3^{n−m} < Σ_{n=m+1}^{∞} α_n/3^{n−m} < Σ_{n=m+1}^{∞} 2/3^{n−m} = 2·Σ_{n=1}^{∞} 1/3^n = 1
Note that the upper bound is 1, not 3, since the index starts at 1 instead of 0. Continuing with our chain of "iff" statements, we conclude that x(α) is not a member of the Cantor set iff:
x(α) ∉ Cantor   iff   (∃m) 3j + α_m + δ ∈ (3k + 1, 3k + 2),   j ∈ N, 0 < δ < 1
From the bounds on αm and δ, we know that 0 < αm + δ < 3: their sum is never a multiple of 3. So the only
way that the set membership in the previous statement can be true is
iff (∃m) αm + δ ∈ (1, 2),
0<δ<1
iff (∃m) αm ∈ (0, 2)
We know that αm is an integer, and there’s only one integer in the open interval (0, 2):
iff (∃m) αm = 1
iff x(α) 6∈ x(a)
where x(a) is as defined in the exercise. This shows that any real number x(α) is a member of the Cantor set if
and only if it is a member of x(a): this is sufficient to prove that the Cantor set is equal to x(a).
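The digit characterization can be sanity-checked numerically. The sketch below is my own (not part of the proof): it builds a point from a finite list of ternary digits and tests whether the point falls into one of the deleted middle thirds during the first few stages of the construction. Exact fractions are used to avoid floating-point boundary errors.

from fractions import Fraction as F

def from_ternary(digits):
    """Build x = sum(d_n / 3^n) exactly, from a finite list of ternary digits."""
    return sum(F(d, 3 ** n) for n, d in enumerate(digits, start=1))

def removed_by_stage(x, m):
    """True if x lies in some deleted interval ((3k+1)/3^m, (3k+2)/3^m)."""
    return any(F(3 * k + 1, 3 ** m) < x < F(3 * k + 2, 3 ** m) for k in range(3 ** (m - 1)))

x_good = from_ternary([0, 2, 2, 0, 2])   # no digit equal to 1
x_bad  = from_ternary([0, 2, 1, 0, 2])   # digit 1 in position 3
print(any(removed_by_stage(x_good, m) for m in range(1, 6)))  # False: never removed
print(any(removed_by_stage(x_bad, m) for m in range(1, 6)))   # True: removed at stage m = 3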
Exercise 3.20
Choose some arbitrarily small ε. From the definition of Cauchy convergence, there is some N such that i, n > N implies d(p_i, p_n) < ε/2. From the convergence of the subsequence to p*, there is some M such that i > M implies d(p*, p_i) < ε/2.
So, for i, n > max(M, N) we have
d(p*, p_n) ≤ d(p*, p_i) + d(p_i, p_n) < ε
which, because ε is arbitrarily small, is sufficient to prove that {p_n} converges to p*.
Exercise 3.21
We know that each E_i is nonempty (see note 1), so we can construct at least one sequence whose ith element is an arbitrary element of E_i. Let {s_n} be an arbitrary sequence constructed in this way. We can immediately say three things about this sequence.
1) The sequence is Cauchy. This comes from the fact that lim diam E_n = 0: see the text below definition 3.9.
2) The sequence is convergent. This comes from the fact that it's a sequence in a complete metric space X: see definition 3.12.
3) The sequence converges to some s* ∈ ∩E_n. Each E_i is closed, and the tail of {s_n} (every term with index ≥ i) lies in E_i, so the limit s* belongs to E_i for every i; therefore s* ∈ ∩E_n (which is in any case a closed set, being an intersection of closed sets, thm 2.24b).
This shows that ∩E_n is nonempty: it contains at least one element s*. We can then follow the proof of theorem 3.10b to conclude that ∩E_n contains only one point.
note 1
We’re told that Ei ⊃ Ei+1 for all i. If Ei were empty, then Ek would be empty for all k > i. This would mean
that
lim diam E_n = diam ∅ = sup {d(p, q) : p, q ∈ ∅} = sup ∅
To find the supremum of the empty set in R, we need to rely on the definition of supremum:
sup ∅ = least upper bound of ∅ in R = min {x ∈ R : a ∈ ∅ → a ≤ x}
The set for which we’re seeking a minimum contains every x ∈ R (because of false antecedent a ∈ ∅), so the
supremum of the empty set in R is the minimum of R itself. Regardless of how we define this minimum (or if
it’s defined at all), it certainly isn’t equal to zero. But we’re told that
lim diam En = 0
so our initial assumption must have been wrong: there is no empty E_i in the collection {E_n}.
Exercise 3.22
The set G_1 is an open set. Choose p_1 ∈ G_1 and find some open neighborhood N_{r1}(p_1) ⊆ G_1. Choose a smaller neighborhood N_{s1}(p_1) with s_1 < r_1. Not only do we have N_{s1} ⊂ N_{r1} ⊆ G_1, we can take the closure of N_{s1} and still have cl(N_{s1}) ⊂ N_{r1} ⊆ G_1 (since s_1 < r_1). Define E_1 to be the neighborhood N_{s1} (not its closure). At this point, we have
E_1 ⊂ cl(E_1) ⊆ G_1
The set G_2 is dense in X, so p_1 is either an element of G_2 or is a limit point of G_2. In either case, every neighborhood of p_1 contains some point p_2 ∈ G_2. More specifically, the neighborhood E_1 contains some point p_2 ∈ G_2. Since p_2 is in both E_1 and G_2, both of which are open sets, there is some neighborhood N_{r2}(p_2) that's a subset of both E_1 and G_2. We now choose an even smaller radius s_2 < r_2. Not only do we have N_{s2} ⊂ N_{r2} ⊆ G_2 and N_{s2} ⊂ N_{r2} ⊆ E_1, we can take the closure of N_{s2} and still have cl(N_{s2}) ⊂ N_{r2}. Define E_2 to be the neighborhood N_{s2} (not its closure). At this point, we have
E_2 ⊂ cl(E_2) ⊂ G_2,    E_2 ⊂ cl(E_2) ⊂ E_1 ⊂ cl(E_1)
We can continue in this same way (choosing each radius s_n < 1/n, so that diam cl(E_n) → 0), constructing a sequence of nested sets
cl(E_1) ⊃ cl(E_2) ⊃ ⋯ ⊃ cl(E_n) ⊃ ⋯
This is a sequence of closed, bounded, nonempty, nested sets whose diameters tend to zero. So (from exercise 21) we know that ∩cl(E_n) contains a single point x. This single point x must be in every cl(E_n), and cl(E_n) ⊂ G_n, so x must be in every G_n. Therefore ∩G_n is nonempty.
Exercise 3.23
We’re aren’t given enough information about the metric space X to assume that the Cauchy sequences converge
to elements in X. All we can say is that, for any arbitrarily small , there exists some M, N ∈ R such that
n, m > M → d(pn , pm ) ≤ /2
n, m > N → d(qn , qm ) ≤ /2
From multiple applications of the triangle inequality, we have
d(pn , qn ) ≤ d(pn , pm ) + d(pm , qn ) ≤ d(pn , pm ) + d(pm , qm ) + d(qm , qn )
which, for n, m > max{N, M }, becomes
d(pn , qn ) ≤ + d(pm , qm )
or
d(pn , qn ) − d(pm , qm ) ≤ 51
Exercise 3.24a
reflexivity
For all sequences {pn }, we have
lim_{n→∞} d(p_n, p_n) = lim_{n→∞} |p_n − p_n| = lim_{n→∞} 0 = 0
so {pn } = {pn }.
symmetry
If {pn } = {qn }, we have
lim_{n→∞} d(p_n, q_n) = lim_{n→∞} |p_n − q_n| = lim_{n→∞} |q_n − p_n| = lim_{n→∞} d(q_n, p_n)
so {qn } = {pn }.
transitivity
If {pn } = {qn } and {qn } = {rn }, the triangle inequality gives us
lim_{n→∞} d(p_n, r_n) ≤ lim_{n→∞} [d(p_n, q_n) + d(q_n, r_n)] = 0 + 0
This tells us that limn→∞ d(pn , rn ) = 0, so {pn } = {rn }.
Exercise 3.24b
Let {an } = {pn } and let {bn } = {qn }. From the triangle inequality, we have
Δ(P, Q) = lim_{n→∞} d(p_n, q_n) ≤ lim_{n→∞} [d(p_n, a_n) + d(a_n, b_n) + d(b_n, q_n)]
Which, from the definition of equality established in part (a), gives us
Δ(P, Q) = lim_{n→∞} d(p_n, q_n) ≤ lim_{n→∞} d(a_n, b_n)    (5)
A similar application of the triangle inequality gives us
lim_{n→∞} d(a_n, b_n) ≤ lim_{n→∞} [d(a_n, p_n) + d(p_n, q_n) + d(q_n, b_n)]
Which, from the definition of equality established in part (a), gives us
lim_{n→∞} d(a_n, b_n) ≤ lim_{n→∞} d(p_n, q_n)    (6)
Combining equations (5) and (6), we have
Δ(P, Q) = lim_{n→∞} d(p_n, q_n) = lim_{n→∞} d(a_n, b_n)
Exercise 3.24c
Define X ∗ to be the set of equivalence classes from part (b). Let {Pn } be a Cauchy sequence in (X ∗ , ∆). This is
an unusual metric space, so to be clear: The sequence {Pn } is a sequence {P0 , P1 , P2 . . .} of equivalence classes
in X ∗ that get “closer” to one another with respect to distance function ∆. Each Pi ∈ {Pn } is an equivalence
class of Cauchy sequences of the form {pi0 , pi1 , pi2 , . . .} containing elements of X that get “closer together” with
respect to distance function d.
Each Pi ∈ {Pn } is a set of equivalent Cauchy sequences in X. From each of these, choose some sequence
{pin } ∈ Pi . For each i we have
(∃N_i ∈ R)  m, n > N_i → d(p_{im}, p_{in}) < 1/i
We'll define a new sequence {q_n} by letting q_i = p_{im} for some m > N_i. Let Q be the equivalence class containing {q_n}. We can show that lim_{n→∞} P_n = Q:
0 ≤ lim_{n→∞} Δ(P_n, Q) = lim_{n→∞} lim_{k→∞} d(p_{nk}, q_k) ≤ lim_{n→∞} 1/n
So, by the squeeze theorem, we have
lim_{n→∞} Δ(P_n, Q) = 0    (7)
which, by the definition of equality in X ∗ , means that limn→∞ Pn = Q.
Proof that Q ∈ X ∗ :
Choose any arbitrarily small values ε_1, ε_2, ε_3. From (7) we have
∃X : n, k > X → d(p_{nk}, q_n) < ε_1
And {P_n} is Cauchy, so we have
∃Y : n > Y → Δ(P_n, P_{n+m}) < ε_2
which, after choosing appropriately large n, gives us
∃Z : k > Z → d(p_{nk}, p_{(n+m)k}) < ε_3
Therefore, by the triangle inequality:
d(q_n, q_{n+m}) ≤ d(q_n, p_{nk}) + d(p_{nk}, p_{(n+m)k}) + d(p_{(n+m)k}, q_{n+m})
Taking the limits of each side gives us
lim_{n→∞} d(q_n, q_{n+m}) ≤ ε_1 + ε_3 + ε_1
But these epsilon values were arbitrarily small and m was an arbitrary integer. This shows that for every ε there exists some integer N such that n, n + m > N implies d(q_n, q_{n+m}) < ε. And this is the definition of {q_n} being a Cauchy sequence. Therefore Q ∈ X*.
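To make the construction a little more concrete, here is a small Python sketch of my own (with X = Q under the usual metric, which is only an illustration and not part of the proof): a point of X* is represented by a rational Cauchy sequence, ϕ(q) by a constant sequence, and Δ is estimated by evaluating d(p_n, q_n) at a moderately large index.

from fractions import Fraction as F

def sqrt2_approx(n):
    # Rational Cauchy sequence converging to sqrt(2): Newton iteration starting at 1.
    x = F(1)
    for _ in range(n):
        x = (x + F(2) / x) / 2
    return x

def constant(q):
    # The image phi(q) of a rational q: the constant sequence q, q, q, ...
    return lambda n: F(q)

def delta(P, Q, n=8):
    # Estimate Delta(P, Q) = lim d(p_n, q_n) by a single large-index term.
    return abs(P(n) - Q(n))

print(float(delta(sqrt2_approx, constant(F(3, 2)))))  # about |sqrt(2) - 3/2| = 0.0857...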
Exercise 3.24d
Let {pn } represent the sequence whose terms are all p, and let {qn } represent the sequence whose terms are all
q.
Δ(P_p, P_q) = lim_{n→∞} d(p_n, q_n) = lim_{n→∞} d(p, q) = d(p, q)
Exercise 3.24e : proof of density
Let {an } be an arbitrary Cauchy sequence in X and let A ∈ X ∗ be the equivalence class containing {an }. If
the sequence {an } converges to some element x ∈ X, then A = Px = ϕ(x) and therefore A ∈ ϕ(X) (see exercise
3.24 for the definition of Px ).
If the sequence {a_n} does not converge to some element x ∈ X, then choose an arbitrarily small ε. The sequence {a_n} is Cauchy, so we're guaranteed the existence of K such that
j, k > K → d(a_j, a_k) < ε
From this, we can consider the sequence whose terms are all a_k (for some fixed k > K). This sequence is a member of the equivalence class P_{a_k} = ϕ(a_k) and
Δ(A, P_{a_k}) = lim_{n→∞} d(a_n, a_k) ≤ ε
This shows that we can find some element of ϕ(X) to be arbitrarily close to A, which means that A is a limit
point of ϕ(X).
We have shown that an arbitrary Cauchy sequence A ∈ X ∗ is either an element of ϕ(X) or a limit point of
ϕ(X). By definition, this means that ϕ(X) is dense in X ∗ .
Exercise 3.24e : if X is complete
If X is complete then every arbitrary Cauchy sequence A ∈ X ∗ converges to some point a ∈ X, so that
A = Pa = ϕ(a) ∈ ϕ(X). This shows that X ∗ ⊆ ϕ(X). And for every ϕ(b) ∈ ϕ(X) there is some Cauchy
sequence in X ∗ whose every element is b, so that ϕ(b) = Pb ∈ X ∗ . This shows that ϕ(X) ⊆ X ∗ .
This shows that X ∗ ⊆ ϕ(X) ⊆ X ∗ , or that ϕ(X) = X ∗ .
Exercise 3.25
The completion of the set of rational numbers is a set that’s isomorphic to R.
Exercise 4.1
Consider the function
f(x) = 1 if x = 0,    f(x) = 0 if x ≠ 0
This function satisfies the condition that
lim_{h→0} [f(x + h) − f(x − h)] = 0
for all x, but the function is not continuous at x = 0: we can choose ε < 1, and every neighborhood N_δ(0) will contain a point p for which
d(f(p), f(0)) = 1
Therefore we can't pick δ such that d(p, 0) < δ → d(f(p), f(0)) < ε, which means that f is not continuous by definition 4.5.
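A short numeric check of this counterexample (a sketch of my own): the symmetric difference f(x + h) − f(x − h) vanishes for every h even at x = 0, where f jumps.

def f(x):
    return 1 if x == 0 else 0     # the counterexample: 1 at the origin, 0 elsewhere

for h in [0.1, 0.01, 0.001]:
    print(h, f(0 + h) - f(0 - h))  # always 0, yet f is discontinuous at x = 0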
Exercise 4.2
Let X be a metric space, let E be an arbitrary subset of X, and let E represent the closure of E. We want to
prove that f (E) ⊆ f (E). To do this, assume y ∈ f (E). This means that y = f (e) for some e ∈ (E ∪ E 0 ).
case 1: e ∈ E
If e ∈ E, then y ∈ f (E) and therefore y ∈ f (E).
case 2: e ∈ E′
If e ∈ E′, then every neighborhood of e contains infinitely many points of E. Choose an arbitrarily small neighborhood N_ε(y). We're told that f is continuous so, by definition 4.1, we're guaranteed the existence of δ such that f(x) ∈ N_ε(y) whenever x ∈ N_δ(e). But there are infinitely many elements of E in the neighborhood N_δ(e), so there are infinitely many elements of f(E) in N_ε(y). This means that y is a limit point of f(E).
We’ve shown that every arbitrary element y ∈ f (E) is either a member of f (E) or a limit point of f (E),
which means that y ∈ (f (E) ∪ f (E)0 ) = f (E). This proves that f (E) ⊆ f (E).
A function f for which f(cl(E)) is a proper subset of the closure of f(E)
Let X be the metric space consisting of the interval (0, 1) with the standard distance metric. Let Y be the metric space R¹. Define the function f : X → Y as f(x) = x. The interval (0, 1) is closed in X (it is the whole space) but not closed in Y, so we have
f(cl_X(X)) = f(X) = (0, 1), while the closure of f(X) in Y is [0, 1] ≠ (0, 1)
Exercise 4.3
If we consider the image of Z(f ) under f , we have f (Z(f )) = {0}. This range is a finite set, and is therefore a
closed set. By the corollary of theorem 4.8, we know that f −1 ({0}) = Z(f ) must also be a closed set.
Exercise 4.4: f (E) is dense in f (X)
To show that f (E) is dense in f (X) we must show that every element of f (X) is either an element of f (E) or
a limit point of f (E).
Assume y ∈ f(X). Then y = f(p) for some p ∈ X. We're told that E is dense in X, so either p ∈ E or p ∈ E′.
case 1: p ∈ E
If p ∈ E, then y = f(p) ∈ f(E).
case 2: p ∈ E′
If p is a limit point of E, then there is a sequence {e_n} of elements of E such that e_n ≠ p and lim_{n→∞} e_n = p. We're told that f is continuous, so by theorem 4.2 we know that lim_{n→∞} f(e_n) = f(p) = y. The terms f(e_n) form a sequence of elements of f(E) converging to y, so y is either an element of f(E) or a limit point of f(E).
We've shown that every element y ∈ f(X) is either an element of f(E) or a limit point of f(E). By definition, this means that f(E) is dense in f(X).
Exercise 4.4b
Choose an arbitrary p ∈ X. We’re told E is dense in X, so p is either an element of E or a limit point of E.
Case 1: p ∈ E
If p ∈ E, then we’re told that f (p) = g(p).
Case 2: p ∈ E 0
If p is a limit point of E, then there is a sequence {en } of elements of E such that en 6= p and limn→∞ en =
p. We’re told that f and g are continuous, so by theorem 4.2 we know that limn→∞ f (en ) = f (p) and
limn→∞ g(en ) = g(p). But each en is an element of E, so we have f (en ) = g(en ) for all n. This tells us
that
g(p) = lim g(en ) = lim f (en ) = f (p)
n→∞
n→∞
We see that f (p) = g(p) in either case. This proves that f (p) = g(p) for all p ∈ X.
Exercise 4.5
The set E C can be formed from an at most countable number of disjoint open intervals
We’re told that E is closed, so E C is open. Exercise 2.29 tells us that E C contains an at-most countable number
of disjoint segments. Each of these segments must be open (if any of them contained a non-interior point, it
would be a non-interior point of E C , but open sets have no non-interior points). Let {(an , bn )} be the at-most
countable collection of disjoint open segments.
constructing the function
We must separately consider the cases for x ∈ E and x 6∈ E; and if x 6∈ E we must consider the possibility that
E^C contains an interval of the form (−∞, b) or (a, ∞). We'll define the function to be
g(x) = f(x)                                                    if x ∈ E
g(x) = f(b_i)                                                  if x ∈ (a_i, b_i) and a_i = −∞
g(x) = f(a_i)                                                  if x ∈ (a_i, b_i) and b_i = ∞
g(x) = f(a_i) + [(f(b_i) − f(a_i))/(b_i − a_i)](x − a_i)       if x ∈ (a_i, b_i) and −∞ < a_i < b_i < ∞        (8)
This function is the one mentioned in the hint: the graph of g is a straight line segment on the closure [a_i, b_i] of each bounded segment of E^C (and is constant on an unbounded segment). This function can easily (albeit tediously) be shown to be continuous on R¹.
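A minimal sketch of the extension in equation (8), under the simplifying assumption (mine, not the general case) that E is given as a finite union of closed intervals:

import bisect

def extend(f, intervals):
    """intervals: sorted disjoint closed intervals [(l1, r1), ...] whose union is E."""
    ends = [r for _, r in intervals]
    def g(x):
        i = bisect.bisect_left(ends, x)
        if i == len(intervals):                 # x lies to the right of E
            return f(intervals[-1][1])
        l, r = intervals[i]
        if x >= l:                              # x is inside E
            return f(x)
        if i == 0:                              # x lies to the left of E
            return f(l)
        a, b = intervals[i - 1][1], l           # gap (a, b) between two components of E
        return f(a) + (f(b) - f(a)) * (x - a) / (b - a)   # straight line on the gap
    return g

g = extend(lambda x: x * x, [(0, 1), (2, 3)])
print(g(-1), g(0.5), g(1.5), g(4))   # 0, 0.25, 2.5, 9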
failure if “closed” is omitted
Consider the following function defined on the open set (−∞, 0) ∪ (0, ∞):
f(x) = 1 if x > 0,    f(x) = −1 if x < 0
Any extension of this function to R¹ will be discontinuous at x = 0 no matter how we define the function at that point.
Extending this to vector-valued functions
Let E be a closed subset of R and let f : E → Rk be a vector-valued function defined by
f (x) = (f1 (x), f2 (x), . . . , fk (x))
For each fi we can define a function gi as in equation (8) to create a new vector-valued function g : R → Rk
defined by
g(x) = (g1 (x), g2 (x), . . . , gk (x))
We’ve shown that each of these gi functions are continuous, therefore by theorem 4.10 the function g(x) is
continuous.
Exercise 4.6
Let G represent the graph of f. Let g : E → G be defined as g(x) = (x, f(x)). We can make G into a metric space by defining a distance function: the set G is a subset of R × R, so the natural choice is to use the metric
d(g(x), g(y)) = d((x, f(x)), (y, f(y))) = √((x − y)² + (f(x) − f(y))²)
This choice allows us to treat G as a subset of R².
If f is continuous
Both x 7→ x and x 7→ f (x) are continuous mappings, so g(x, f (x)) is a continuous mapping by theorem 4.10.
The domain of g is a compact metric space so by theorem 4.14 (or theorem 4.15) G is compact.
If the graph is compact
Define g as above. The inverse of g is g −1 (x, f (x)) = x. It’s clear that g −1 is a one-to-one and onto function
from G to E. Therefore by theorem 4.17 the inverse of g −1 – that is, g itself – is continuous. We can then appeal
to theorem 4.10 once again to conclude that f is continuous.
Exercise 4.7
f is bounded
If x = 0, then f(x, y) = 0 for any value of y. If x > 0:
(x − y²)² ≥ 0                                 squares are nonnegative
→ x² − 2xy² + y⁴ ≥ 0                          expanding the squared term
→ x² − xy² + y⁴ ≥ 0                           LHS remains nonnegative after adding the nonnegative term xy²
→ x² + y⁴ ≥ xy²                               add xy² to both sides
→ 1 ≥ xy²/(x² + y⁴)                           divide both sides by the positive term x² + y⁴
If x < 0:
(x + y²)² ≥ 0                                 squares are nonnegative
→ x² + 2xy² + y⁴ ≥ 0                          expanding the squared term
→ x² − xy² + y⁴ ≥ 0                           LHS remains nonnegative after adding the nonnegative term (−3xy²)
→ x² + y⁴ ≥ xy²                               add xy² to both sides
→ 1 ≥ xy²/(x² + y⁴)                           divide both sides by the positive term x² + y⁴
g is unbounded near (0,0)
If we let x = n^α and y = n^β, we have
g(n^α, n^β) = n^{α+2β}/(n^{2α} + n^{6β})
We can divide the numerator and denominator by n^{α+2β} to get
g(n^α, n^β) = 1/(n^{α−2β} + n^{4β−α})
If we let α = −3, β = −1 this becomes
g(1/n³, 1/n) = 1/(n^{−1} + n^{−1}) = n/2
Taking limits, we have
lim_{n→∞} g(1/n³, 1/n) = lim_{n→∞} n/2 = +∞
so g is unbounded in every neighborhood of (0, 0).
f is not continuous at (0,0)
Choose 0 < ε < 1/2. For f to be continuous at (0, 0) we must be able to choose some δ > 0 such that
d((0, 0), (x, y)) < δ → d(f(0, 0), f(x, y)) < ε
But there can be no such δ. To see this, choose an arbitrary δ > 0, choose x > 0 such that x² + x < δ², and then choose y = √x. This gives us
d((0, 0), (x, y)) = √(x² + y²) = √(x² + x) < δ
but
d(f(0, 0), f(x, y)) = xy²/(x² + y⁴) = x²/(2x²) = 1/2 > ε
The restriction of f to a straight line is continuous
Any straight line that doesn’t pass through (0, 0) doesn’t encounter any of the irregularities that occur at the
origin; it’s trivial but tedious to show that the restriction of f to such a line is continuous. So we need only
consider lines that pass through the origin: that is, lines of the form y = cx or x = 0 for some constant c.
For the line y = cx (with c ≠ 0): let ε > 0 be given and let δ = ε/c². Choose x such that 0 < x < δ. Then:
d(f(0, 0), f(x, y)) = xy²/(x² + y⁴) = c²x³/(x² + c⁴x⁴) = c²x/(1 + c⁴x²) < c²x < c²δ = ε
And ε was arbitrary, so this proves that f is continuous on this line. (When c = 0 the restriction of f to the line is identically zero, so there is nothing to prove.)
For the line x = 0: let ε > 0 be given and choose any y ≠ 0. Regardless of our choice of δ, we have
d(f(0, 0), f(x, y)) = xy²/(x² + y⁴) = 0/y⁴ = 0 < ε
And ε was arbitrary, so this proves that f is continuous on this line.
The restriction of g to a straight line is continuous
The proof for the continuity of the restriction of g is almost identical to that of the continuity of the restriction
of f .
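A quick numeric illustration of the two claims (my own sketch): f stays bounded on every path into the origin, while g blows up along the path (1/n³, 1/n).

def f(x, y):
    return 0.0 if (x, y) == (0, 0) else x * y**2 / (x**2 + y**4)

def g(x, y):
    return 0.0 if (x, y) == (0, 0) else x * y**2 / (x**2 + y**6)

for n in [10, 100, 1000]:
    print(f(1/n**2, 1/n), g(1/n**3, 1/n))   # f stays at 1/2, g grows like n/2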
Exercise 4.8
To show that f is bounded, we need to show that there exists some M such that |f(p)| < M for every p ∈ E. Let p be an arbitrary element of E. We're told that E is a bounded subset of R¹, so E has a lower bound α and an upper bound β. Choose any ε > 0. From the definition of uniform continuity there exists some δ > 0 such that, for all p, q ∈ E,
d(p, q) < δ → d(f(p), f(q)) < ε
Let n be the smallest integer such that nδ > p − α (which exists from the Archimedean property of the reals) and define γ = (p − α)/n. This allows us to divide the interval (α, p) into n intervals of length γ, each of which is smaller than δ. We can then apply the triangle inequality multiple times:
d(f(α), f(p)) ≤ d(f(α), f(α + γ)) + d(f(α + γ), f(α + 2γ)) + … + d(f(α + (n − 1)γ), f(α + nγ))
             < ε + ε + … + ε   (n terms)
             = nε
This shows us that |f(p)| ≤ |f(α)| + nε; and since p − α ≤ β − α, the integer n is at most (β − α)/δ + 1, a bound that does not depend on p. So by definition 4.13 we know that f is bounded.
if E is not bounded
If E is not bounded below, then we can revise the previous proof using the upper bound β in place of the lower
bound α. If E is not bounded above or below, then we will not be able to use the Archimedean property to find
n such that nδ > p − (−∞) and the proof fails. For an example of a real uniformly continuous function that is
not bounded, we can simply look to f (x) = x.
Exercise 4.9
Definition 4.18 says that f is uniformly continuous on X if for every ε > 0 there exists δ > 0 such that
d_X(p, q) < δ → d_Y(f(p), f(q)) < ε    (9)
We want to show that this conditional statement is true iff for every ε > 0 there exists δ > 0 such that
diam E < δ → diam f(E) < ε    (10)
(10) implies (9)
Let equation (10) hold and suppose d(p, q) < δ.
d(p, q) < δ                                         assumed
→ diam {p, q} < δ                                   we're letting E be the set containing only p and q
→ diam {f(p), f(q)} < ε                             from (10)
→ d(f(p), f(q)) < ε                                 definition of diameter
Therefore (9) holds.
(9) implies (10)
Let equation (9) hold, let E be a nonempty subset of X, and suppose diam E < δ.
diam E < δ                                           assumed
→ (∀p, q ∈ E)(d(p, q) < δ)                           definition of diameter
→ (∀p, q ∈ E)(d(f(p), f(q)) < ε)                     from (9)
→ (∀f(p), f(q) ∈ f(E))(d(f(p), f(q)) < ε)            p ∈ E → f(p) ∈ f(E)
→ diam f(E) ≤ ε                                      definition of diameter
Therefore (10) holds (apply the above with ε/2 in place of ε if a strict inequality is wanted).
Exercise 4.10
We want to prove the contrapositive of theorem 4.19: that is, we want to prove that if f is not uniformly continuous on X then either X is not a compact metric space or f is not a continuous function.
If f is not uniformly continuous, then the negation of definition 4.18 tells us that there exists some ε such that, for every δ > 0, we can find p, q ∈ X such that d(p, q) < δ but d(f(p), f(q)) ≥ ε. We'll be contradicting this claim, so we'll restate it formally in a numbered equation:
(∃ε > 0)(∀δ > 0)(∃p, q ∈ X)  d(p, q) < δ ∧ d(f(p), f(q)) ≥ ε    (11)
From the fact that we can find such p, q for every value of δ, we can set δ_n = 1/n and construct two sequences {p_n} and {q_n} where d(p_n, q_n) < 1/n but d(f(p_n), f(q_n)) ≥ ε.
If the set X is compact (supposition 1) the sequence {p_n} must have at least one subsequential limit p (theorem 3.6a or 2.37). And, since d(p_n, q_n) < 1/n, the sequence {q_n} must also have p as a subsequential limit, since along that subsequence
d(q_n, p) ≤ d(q_n, p_n) + d(p_n, p) < 1/n + d(p_n, p) → 0
But then, from theorem 4.2, it must be the case that both {f(p_n)} and {f(q_n)} have subsequential limits of f(p). If the function f is continuous (supposition 2) this would imply that, for some sufficiently large n in the subsequence,
d(f(p_n), f(q_n)) ≤ d(f(p_n), f(p)) + d(f(p), f(q_n)) < ε/2 + ε/2 = ε
which contradicts (11).
We've established a contradiction from our assumption that f is not uniformly continuous, so one of our suppositions must be incorrect: either X is not compact or f is not continuous. By contrapositive, if X is compact and f is continuous then f is uniformly continuous. And this is what we were asked to prove.
Exercise 4.11
Choose an arbitrary ε > 0. We're told that f is uniformly continuous, so
(∃δ > 0) : d(x_m, x_n) < δ → d(f(x_m), f(x_n)) < ε
And we're told that {x_n} is Cauchy, so
(∃N) : n, m > N → d(x_m, x_n) < δ
Together, these two conditional statements tell us that, for any ε,
(∃N) : n, m > N → d(f(x_m), f(x_n)) < ε
which, by definition, shows that {f(x_n)} is Cauchy. See exercise 4.13 for the proof that uses this result.
Exercise 4.12
We’re asked to prove that the composition of uniformly continuous functions is uniformly continuous. Let
f : X → Y and g : Y → Z be uniformly continuous functions. Choose an arbitrary > 0. Because g is
uniformly continuous, we have
(∃α > 0) : d(y1 , y2 ) < α → d(g(y1 ), g(y2 )) < Because f is uniformly continuous, we have
(∃δ > 0) : d(x1 , x2 ) < δ → d(f (x1 ), f (x2 )) < α
Since the elements f (x1 ), f (x2 ) are elements of Y these two conditional statements together tell us that, for
arbitrary > 0,
(∃δ > 0) : d(x1 , x2 ) < δ → d(f (x1 ), f (x2 )) < α → d(g(f (x1 )), g(f (x2 ))) < which, by definition means that the composite function (g ◦ f ) : X → Z is uniformly continuous.
Exercise 4.13 (Proof using exercise 4.11)
For every p ∈ X it is either the case that p ∈ E or p ∉ E. If p ∉ E, then p is a limit point of E (because of density) and therefore we can construct a sequence {e_n} of elements of E that converges to p. We now define the function g : X → Y as
g(p) = f(p) if p ∈ E,    g(p) = lim_{n→∞} f(e_n) if p ∉ E
We need to prove that this is actually a well-defined function and that it's continuous.
We need to prove that this is actually a well-defined function and that it’s continuous.
The function g is well-defined
It’s clear that every element of X is mapped to at least one element of Y , but it’s not immediately clear that
each element of X is mapped to only one element of Y . We need to show that if {pn } and {qn } are sequences
in E that converge to p, then {f (pn )} and {f (qn )} both converge to the same element.
Let {pn } and {qn } be arbitrary sequences in E that converge to p. From the definition of convergence (or
from exercise 4.11), we know that
(∀δ > 0)(∃N) : n > N → d(p_n, p) < δ/2
(∀δ > 0)(∃M) : m > M → d(q_m, p) < δ/2
We can then use the triangle inequality to show that
(∀δ > 0)(∃M, N) : n > max{M, N} → d(p_n, q_n) ≤ d(p_n, p) + d(p, q_n) < δ    (12)
We know that the function f is continuous on E, so for any ε > 0 there exists some δ > 0 such that
d(p_n, q_n) < δ → d(f(p_n), f(q_n)) < ε    (13)
Combining equations (12) and (13) we see that for any ε > 0 we can find some integer N such that
n > N → (d(p_n, q_n) < δ) → (d(f(p_n), f(q_n)) < ε)
This tells us that {f (pn )} and {f (qn )} don’t converge to different points, but without being able to conclude
that Y is compact it’s entirely possible that {f (pn )} and {f (qn )} are merely Cauchy without actually converging
to any point in Y at all. The exercise asks us to assume that Y is R1 , but we’ll try for a more general solution
and assume only that Y is a complete metric space (see definition 3.12). This allows us to conclude that {f (pn )}
and {f (qn )} both converge to the same point of Y , and we call this point g(p). Therefore the function g(p) has
a unique value for every p ∈ X.
The function g is continuous
↦ d(p, q) < δ/2                                                               assumed
Let {p_n} be a sequence in E that converges to p and let {q_n} be a sequence in E that converges to q.
→ d(p_n, q_n) ≤ d(p_n, p) + d(p, q) + d(q, q_n)                               triangle inequality
→ d(p_n, q_n) < d(p_n, p) + δ/2 + d(q, q_n)                                   from our initial assumption
This holds true for any n, even when n is sufficiently large to make d(q, q_n) < δ/4 and d(p, p_n) < δ/4, in which case we have
→ d(p_n, q_n) < δ
We're told that f is continuous on E, so we can make d(f(p_n), f(q_n)) arbitrarily small if we can make d(p_n, q_n) arbitrarily small. And we can make δ arbitrarily small, so for any ε we can choose p_n, q_n such that
→ d(f(p_n), f(q_n)) < ε/2
By definition, we have g(p_n) = f(p_n) and g(q_n) = f(q_n) for p_n, q_n ∈ E, so
→ d(g(p_n), g(q_n)) < ε/2
→ d(g(p), g(q)) ≤ d(g(p), g(p_n)) + d(g(p_n), g(q_n)) + d(g(q_n), g(q))       triangle inequality
→ d(g(p), g(q)) < d(g(p), g(p_n)) + ε/2 + d(g(q_n), g(q))                     established bound for d(g(p_n), g(q_n))
This holds true for any n, even when n is sufficiently large to make d(g(q), g(q_n)) < ε/4 and d(g(p), g(p_n)) < ε/4 (we know that {g(p_n)} converges to g(p) and that {g(q_n)} converges to g(q) because we defined g(p), g(q) specifically so that this would be true). For such values of n we have
d(g(p), g(q)) < ε
This shows that we can constrain the range of g by constraining its domain, and therefore g is continuous.
Could we replace the range with Rk or any compact metric space? By any complete metric space?
Yes. Definition 3.12 tells us that all compact metric spaces and all Euclidean metric spaces are also complete
metric spaces. Our proof didn’t depend on the range of g being R1 : we assumed only that the range was a
complete metric space.
Could we replace the range with any metric space?
No. Consider the function f(x) = 1/x with domain E = {1/n : n ∈ N}. The set E is dense in X = E ∪ {0}. This function is continuous at every x ∈ E (we can find a neighborhood N_δ(x) that contains only x, which guarantees that d(f(y), f(x)) = 0 for any y ∈ E in N_δ(x)). However, there is no way to define f(0) to make this a continuous function on X. This can be seen intuitively by recognizing that the range of f(x) = 1/x is unbounded in every neighborhood of x = 0. If you're satisfied with an "intuitive proof", we're done. Otherwise, buckle up and get ready for a few more paragraphs of inequalities and Greek letters.
To prove that f cannot be continuously extended to x = 0, choose any arbitrary neighborhood around 0 with radius δ. We can find an integer n such that n > 1/δ (Archimedean property of the reals), so that d(1/n, 0) < δ. For any positive integer k we also have d(1/(n + k), 0) < δ. Now, by the triangle inequality, we have
d(f(1/n), f(1/(n + k))) ≤ d(f(1/n), f(0)) + d(f(0), f(1/(n + k)))
We haven't specified a value for f(0) yet, but we do know that |f(1/n) − f(1/(n + k))| = |n − (n + k)| = k. This last inequality is therefore equivalent to
k ≤ d(f(1/n), f(0)) + d(f(0), f(1/(n + k)))
Every term of this inequality is nonnegative, so one of the two terms on the right-hand side of this inequality must be ≥ k/2. But this is true for arbitrarily large k and arbitrarily small δ.
We can now show that f is not continuous at x = 0. Choose any ε > 0. For every possible choice of δ > 0 we can find integers n > 1/δ and k > 2ε, so that by the previous method we can be sure that either d(f(1/n), f(0)) ≥ k/2 > ε or d(f(1/(n + k)), f(0)) ≥ k/2 > ε. By definition 4.5 of "continuous", this proves that f is not continuous at 0. And we never specified the value of f(0), so we've proven that f will be discontinuous however we define f(0). This means that there is no continuous extension from E to X.
Exercise 4.14 proof 1
If f (0) = 0 or f (1) = 1, we’re done. Otherwise, we know that f (0) > 0 and f (1) < 1.
We’ll now inductively define a sequence of intervals {[ai , bi ]} as follows3 : Let [a0 , b0 ] = [0, 1]. Now assume
that we have defined {[ai , bi ]} for i = 0, 1, . . . , n. Let m = (an + bn )/2. We define [an+1 , bn+1 ] as
[an , m],
[m, bn ],
if f (m) ≤ m
if f (m) ≥ m
(This procedure is underdefined for the case of f (m) = m, but when f (m) = m we’ve found the point we’re
looking for anyway). The interval [an+1 , bn+1 ] has diameter 2−(n+1) with f (an+1 ) ≥ an+1 , f (bn+1 ) ≤ bn+1 . We
therefore haveT
a nested sequence of compact sets {[an , bn ]} whose diameter converges to zero, so by exercise 3.21
we know that {[an , bn ]} consists of just one point i ∈ I. Our method of constructing {[an , bn ]} guarantees that
f (i) ≤ i and f (i) ≥ i, therefore f (i) = i and this is the point we were asked to find.
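A sketch of this interval-halving argument as code (my own illustration, assuming f is continuous from [0, 1] into [0, 1]):

import math

def fixed_point(f, steps=50):
    """Bisection on [0, 1]: keep the half on which f(a) >= a and f(b) <= b still hold."""
    a, b = 0.0, 1.0
    for _ in range(steps):
        m = (a + b) / 2
        if f(m) <= m:
            b = m          # keep [a, m]: f(a) >= a, f(m) <= m
        else:
            a = m          # keep [m, b]: f(m) >= m, f(b) <= b
    return (a + b) / 2

print(fixed_point(lambda x: math.cos(x)))   # about 0.7390851, where cos(x) = x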
Exercise 4.14 proof 2
This proof, which is much clearer and more concise than mine, was taken from the homework of Men-Gen Tsai
(b89902089@ntu.edu.tw).
Define a new function g(x) = f (x) − x. Both f (x) and x are continuous, so g(x) is continuous by theorem
4.9. If g(0) = 0 or g(1) = 0, then we’re done. Otherwise, we have g(0) > 0 and g(1) < 0. Therefore by theorem
4.23 there must be some intermediate point in the interval [0, 1] for which g(x) = 0: at this point, f (x) = x.
Exercise 4.15
Although the details of this proof might be ugly, the general idea is simple. If the function f is continuous
but not monotonic then we can find some open interval (x1 , x2 ) on which f (x) obtains a local maximum or
minimum. This point f (x) is not an interior point of f ( (x1 , x2 ) ), so f ( (x1 , x2 ) ) is not an open set.
Let f : R1 → R1 be a continuous mapping and assume that f is not monotonically increasing or decreasing.
Then we can find x1 < x2 < x3 such that f (x2 ) > f (x1 ) and f (x2 ) > f (x3 ), or such that f (x2 ) < f (x1 ) and
f (x2 ) < f (x3 ). The proofs for either case are analogous, so we’ll focus only on the case that f (x2 ) > f (x1 ) and
f (x2 ) > f (x3 ).
We can’t assume that the supremum of f ( (x1 , x3 ) ) occurs at f (x2 ): there could be some x2.5 for which
f (x2.5 ) > f (x2 ). We want to construct a closed subinterval [xa , xb ] ⊂ (x1 , x3 ) containing the x for which f (x)
obtains its local maximum. To do this, we define ε to be
ε = min{d(f(x_2), f(x_1)), d(f(x_2), f(x_3))}
Because f is continuous, we know that there exist some δ_a and δ_b such that
d(x_1, x) < δ_a → d(f(x_1), f(x)) < ε
d(x_3, x) < δ_b → d(f(x_3), f(x)) < ε
We can now let x_a = x_1 + ½δ_a and x_b = x_3 − ½δ_b. We know that f doesn't obtain its local maximum for any x in the intervals (x_1, x_a) or (x_b, x_3), because every such x is within δ_a of x_1 or within δ_b of x_3, so that f(x) is within ε of f(x_1) or of f(x_3) and is therefore less than f(x_2). And [x_a, x_b] is a closed and bounded subset of R¹ and is therefore a compact set, so we
know that f ([xa , xb ]) is a closed and bounded subset of R1 (theorem 4.15), and therefore f ([xa , xb ]) contains its
supremum (which may or may not occur at x2 , but we only care that the supremum exists for some x ∈ [xa , xb ]).
This is all getting a bit muddled, so to recap our results so far:
1) [xa , xb ] ⊂ (x1 , x3 )
2) f ([xa , xb ]) ⊂ f ((x1 , x3 ))
3) (x ∈ (x1 , xa ) ∪ (xb , x3 )) → f (x) < f (x2 )
4) f (x) has a local maximum for some x ∈ [xa , xb ]
Together these four facts tell us that f ( (x1 , x3 ) ) contains its own supremum for some x ∈ [xa , xb ]. But the
supremum of a set can’t be an interior point of the set, so f ( (x1 , x3 ) ) is not open even though (x1 , x3 ) is
open, so f is not an open mapping (theorem 4.8) We’ve shown that “f is not monotonic” implies “f is not a
continuous open mapping from R1 to R1 ”. By converse, then, we have shown that if f is a continuous open
mapping from R1 to R1 then f is monotonic. And this is what we were asked to prove.
Exercise 4.16
The functions [x] and (x) are both discontinuous at each x ∈ N. If x ∈ N then, for every 0 < δ < ½, we have
d([x − δ], [x]) = |(x − 1) − x| = 1
d((x − δ), (x)) = |(1 − δ) − 0| = 1 − δ > ½
This shows that we can't constrain the range of these functions by restraining the domain around x ∈ N, which by definition means that the functions are discontinuous for integer values of x.
Note that the sum of these discontinuous functions, [x] + (x), is the continuous function f(x) = x.
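For reference, a tiny sketch of my own of the two functions and the cancellation of their jumps:

import math

def bracket(x):            # [x]: the largest integer <= x
    return math.floor(x)

def frac(x):               # (x) = x - [x]: the fractional part
    return x - math.floor(x)

for x in [2.9, 2.999, 3.0, 3.001]:
    print(x, bracket(x), frac(x), bracket(x) + frac(x))   # the sum is always x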
Exercise 4.17
Define E as in the hint. To the three criteria listed in the exercise, we add a fourth:
(d)
a<q<x<r<b
Lemma 1: For each x ∈ E we can find p, q, r such that the four criteria are met
We’re assuming that f (x−) < f (x+), so by the density of Q in R (theorem 1.20b) we can find some rational p
such that f (x−) < p < f (x+). From our definition of the set E we know that f (x−) exists for all x ∈ E, so we
can find some neighborhood Nδ (x) containing a rational q that fulfills criteria (b). More formally,
(∃q ∈ Nδ (x) ∩ (a, x))(a < q < x ∧ [q < t < x → f (t) < p]
Similarly, the existence of f (x+) tells us that there is some neighborhood Nδ2 (x) containing a rational r such
that
(∃r ∈ Nδ2 (x) ∩ (x, b))(x < r < b ∧ [x < t < r → f (t) > p]
Therefore each x ∈ E can be associated with at least one (p, q, r) rational triple with q < x < r.
Lemma 2: If a given (p, q, r) triple fulfills the criteria for x and y, then x = y
Suppose x, y are two elements of E that both meet the four criteria. Assume, without loss of generality, that
x < y. The density of the reals guarantees that there exists some rational w such that x < w < y. But this
means that f (w) < p (by criteria (b) since q < w < y) and that f (w) > p (by criteria (c) since x < w < r).
Reversing the roles of x and y establishes the same results for the case of x > y. This is clearly a contradiction,
so one of our assumptions must be wrong: for any given triple (p, q, r) it’s not possible to find two distinct
elements of E that fulfill all four criteria. Note that we haven’t proven that each triple can be associated with
a unique x ∈ E: we’ve only shown that each triple can be associated with at most one x ∈ E.
Lemma 3: E is at most countable
The set of rational triples is countable (theorem 2.13). Lemma 2 tells us that we can create a function from
the set of triples onto E, so the cardinality of E is not more than the cardinality of the set of rational triples.
Therefore the cardinality of E is at most countable.
Lemma 4: F is at most countable
If we define F to be the set of x for which f (x+) < f (x−) we can prove that F is at most countable with only
trivial modifications to the previous lemmas.
Lemma 5: G is at most countable
Define G to be the set of x for which f(x+) = f(x−) = α but f(x) ≠ α. Let x be an arbitrary element of G. If every neighborhood of x contained another element of G, we could construct a sequence {t_n} for which {t_n} → x but {f(t_n)} does not converge to α. But this would mean that either f(x−) or f(x+) wouldn't exist, which contradicts our definition of G. Therefore each x is an isolated point of G: we can find some radius δ such that N_δ(x) ∩ G = {x} and N_δ(x) contains rational numbers q, r such that q < x < r.
Now consider all of the (q, r) pairs of rational numbers. Each of these pairs will have either one, more than one, or zero elements of G in the associated open interval (q, r). Let G′ be the set of rational pairs (q, r) for which there is exactly one element of G in the open interval (q, r). The set G′ is a subset of Q × Q, so it's at most countable (theorem 2.13). And each x ∈ G can be associated with at least one (q, r) ∈ G′, so we can create a function from G′ onto G. This proves that G is at most countable.
The sets E, F, and G exhaust all the different types of simple discontinuities, and all of these sets are at most countable. Therefore their union is at most countable (theorem 2.12). And this is what we were asked to prove.
Exercise 4.18
f is continuous at every irrational point
Let x be irrational. Choose any ε > 0 and let n be the smallest integer such that n > 1/ε. If we want to constrain the range of f to f(x) ± ε, we will need to find a neighborhood of x that contains no rational numbers in the set
{p/q : p, q ∈ Z, 0 < q ≤ n}
For each i ≤ n the product xi is irrational, so there exists some integer m_i such that
m_i < xi < m_i + 1
which means that
m_i/i < x < (m_i + 1)/i
So we can define a neighborhood around x with radius δ_i where
δ_i = min{d(x, m_i/i), d(x, (m_i + 1)/i)}
The neighborhood (x − δ_i, x + δ_i) contains no rationals of the form k/i. So if we have constructed neighborhoods of radii δ_1, δ_2, …, δ_n we can let δ = min{δ_i} (this minimum exists because n is finite), and we will have constructed a neighborhood (x − δ, x + δ) that contains no rational numbers with denominators ≤ n. Therefore, for all rational numbers y in this neighborhood, we have
d(x, y) < δ → d(f(x), f(y)) < 1/n < ε
and for irrational numbers y we have
d(x, y) < δ → d(f(x), f(y)) = 0 < ε
Which, by definition, means that f is continuous at x.
f has a simple discontinuity at every rational point
Let x be a rational point. If we follow the previous method of constructing δ we immediately see that f (x+) =
f (x−) = 0. When x is rational, though, we no longer have f (x) = 0 and therefore f has a simple discontinuity
at x.
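The continuity at irrationals can be seen numerically: rationals close to an irrational point are forced to have large denominators, so f is small near that point. A hedged sketch of my own (the helper sup_f_near is an illustration, not part of the exercise):

import math

def sup_f_near(x0, delta, max_q=10**6):
    """sup of f over rationals p/q within delta of the irrational x0, where f(p/q) = 1/q."""
    for q in range(1, max_q + 1):
        if abs(x0 - round(x0 * q) / q) < delta:   # nearest fraction with denominator q
            return 1.0 / q
    return 0.0

x0 = math.sqrt(2)
for delta in [1e-1, 1e-3, 1e-5, 1e-7]:
    print(delta, sup_f_near(x0, delta))   # the sup shrinks toward 0 as delta shrinks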
Exercise 4.19
Suppose that f is discontinuous at some point p. By definition 4.5 of "continuity", we know that there exists some ε > 0 such that, for every δ > 0, we can find some x such that d(x, p) < δ but d(f(x), f(p)) ≥ ε. Therefore we are able to construct a sequence {x_n} where each term of {x_n} gets closer to p even though every term of {f(x_n)} differs from f(p) by at least ε. More formally, we choose each element of {x_n} so that d(p, x_n) < 1/n but d(f(p), f(x_n)) ≥ ε. By constructing the sequence in this way we see that lim x_n = p but d(f(x_n), f(p)) ≥ ε for all n.
Assume, without loss of generality, that f (xn ) < f (p) for every term of {xn } (see note 1 for justification of
WLOG). From the density property of the reals we can find a rational number r such that f (xn ) < r < f (p).
We’re told that f has the intermediate value property, so for each xn there is some yn between xn and p such
that f (yn ) = r. This sequence {yn } converges to p (since the terms of {yn } are squeezed between the terms of
{xn } and p) and {f (yn )} clearly converges to r (since f (yn ) = r for all n). But the fact that {yn } converges to
p means that p is a limit point of {yn }, which means that p is a limit point of the closed set described in the
exercise (the set of all x with f (x) = r). Therefore p is an element of this set, which means that f (p) = r. And
this is a contradiction since we specifically chose r so that f (p) 6= r.
We’ve established a contradiction, so our initial supposition must be false: f is not discontinous at any point
p. Therefore f is continuous.
Note 1: justification for WLOG claim
For each term of the sequence {f (xn )} either f (p) < f (xn ) or f (xn ) < f (p). Therefore {f (xn )} either contains
an infinite subsequence where f (xn ) < f (p) for all n or f (xn ) > f (p) for all n (or both). If it contains an infinite
subsequence where f (xn ) < f (p) for all n then we use this subsequence in place of {xn } in the proof. If it does
not contain such a subsequence, then it contains an infinite subsequence where f (xn ) > f (p) for all n: the proof
for this case requires only the trivial twiddling of a few inequality signs. Therefore we can assume, without loss
of generality, that f (xn ) < f (p) for each term of {xn }.
Exercise 4.20a
If x is not in the closure of E, then x is an element of the open set (cl E)^C, therefore there is some radius ε such that d(x, y) < ε → y ∉ E, and therefore inf_{z∈E} d(x, z) ≥ ε > 0.
If x is in the closure of E, then for every ε > 0 we can find some z ∈ E such that d(x, z) < ε, and therefore inf_{z∈E} d(x, z) = 0.
Exercise 4.20b
Case 1: x, y ∈ E
If x and y are both elements of E then from part (a) we have ρE (x) = ρE (y) = 0 and therefore
|ρE (x) − ρE (y)| = 0 ≤ d(x, y)
Case 2: y ∈ E, x ∉ E (the case x ∈ E, y ∉ E is symmetric)
If y ∈ E but x ∉ E then ρ_E(y) = 0 and d(x, y) ∈ {d(x, z) : z ∈ E}. Therefore
|ρ_E(x) − ρ_E(y)| = |ρ_E(x)| = inf_{z∈E} d(x, z) ≤ d(x, y)
Case 3: x, y 6∈ E
If neither x nor y is an element of E then, for arbitrary z ∈ E, we have
ρ_E(x) ≤ d(x, z) ≤ d(x, y) + d(y, z)
This must hold for any choice of z. By choosing z to make d(y, z) arbitrarily close to inf_{z∈E} d(y, z) = ρ_E(y), we have, for any ε > 0,
ρ_E(x) ≤ d(x, y) + ρ_E(y) + ε
Our choice of ε can be made arbitrarily small (possibly even zero, if y is a limit point of E), so this is equivalent to
ρ_E(x) ≤ d(x, y) + ρ_E(y)   →   ρ_E(x) − ρ_E(y) ≤ d(x, y)
By changing the roles of x and y we can similarly show that
ρE (y) − ρE (x) ≤ d(x, y)
Together, these last two inequalities show us that
|ρE (y) − ρE (x)| ≤ d(x, y)
These three cases exhaust the possibilities for x and y. This shows us that for every ε > 0 we have d(ρ_E(x), ρ_E(y)) < ε whenever d(x, y) < ε. And this is just definition 4.18 for uniform continuity with δ = ε.
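A tiny sketch of my own of ρ_E for a finite set E, checking the Lipschitz bound |ρ_E(x) − ρ_E(y)| ≤ d(x, y) that gives the uniform continuity:

def rho(E, x):
    """Distance from the point x to the set E (here E is a finite set of reals)."""
    return min(abs(x - z) for z in E)

E = {0.0, 1.0, 2.5}
for x, y in [(-1.0, -0.5), (0.3, 0.9), (1.7, 2.2)]:
    print(abs(rho(E, x) - rho(E, y)) <= abs(x - y))   # always True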
Exercise 4.21
K is compact and ρF is continuous on K, so ρF (K) is compact (and is therefore both closed and bounded from
theorem 2.41). The results of exercise 20 tell us that 0 is not an element of ρF (K), and therefore 0 is not a limit
point of ρF (K) (since ρF (K) is closed). Therefore we can find some neighborhood around 0 with radius δ such
that none of the elements in the interval (0 − δ, 0 + δ) are in ρF (K).
Choose p ∈ K, q ∈ F . The distance d(p, q) is an element of ρF (K) and therefore d(p, q) 6∈ (0 − δ, 0 + δ).
Therefore d(p, q) > δ.
The conclusion fails if neither set is compact
Consider the sets (this counterexample was taken from the homework of Men-Gen Tsai, b89902089@ntu.edu.tw)
K = {n + 1/n : n ∈ N}        F = {n : n ∈ N} − {2}
Both K and F are closed, neither is compact, they are disjoint, and we can choose p ∈ K, q ∈ F such that d(p, q) = 1/n for arbitrarily large n; so there is no δ > 0 with d(p, q) > δ for all p ∈ K, q ∈ F.
Exercise 4.22
Exercise 20a shows us that ρ_A(p) = 0 iff p ∈ A and ρ_B(p) = 0 iff p ∈ B (both sets are closed), so it's clear that f(p) = 0 iff p ∈ A and f(p) = 1 iff p ∈ B. Because A and B are disjoint, ρ_A(p) + ρ_B(p) is never zero, so f is continuous from theorem 4.9.
The intervals [0, 21 ) and ( 12 , 1] are open in [0, 1]; therefore V and W are open by theorem 4.8. The sets V
and W are clearly disjoint since f (p) can have only a single value for any given p. From the definition of A and
B we have
A = f⁻¹(0) ⊆ f⁻¹([0, ½)) = V
B = f⁻¹(1) ⊆ f⁻¹((½, 1]) = W
Exercise 4.23a: proof of continuity
Choose x ∈ (a, b). Assume without loss of generality that f(x) ≠ f(a) and f(x) ≠ f(b) (see below for justification of WLOG). Choose some ε > 0, and choose ε′ such that
0 < ε′ < min{ε, |f(b) − f(x)|, |f(a) − f(x)|}
Choose δ such that
0 < δ < min{ε′(x − a)/|f(a) − f(x)|, ε′(b − x)/|f(b) − f(x)|}
Define λ_a, λ_b as
λ_a = 1 − δ/(x − a)
λ_b = 1 − δ/(b − x)
The method we used to select ε′ and δ ensures that both of these λ values are in the interval (0, 1). Notice that we can express δ in terms of λ_a or λ_b as
δ = (1 − λ_a)(x − a) = (1 − λ_b)(b − x)
If we restrict the domain of f to (x − δ, x + δ) we have
|f(x + δ) − f(x)|
  = |f(x + (1 − λ_b)(b − x)) − f(x)|                      from our definition of δ
  = |f(λ_b x + (1 − λ_b)b) − f(x)|                        algebraic rearrangement
  ≤ |λ_b f(x) + (1 − λ_b)f(b) − f(x)|                     from the convexity of f
  = |(1 − λ_b)(f(b) − f(x))|                              algebraic rearrangement
  = |(δ/(b − x))(f(b) − f(x))|                            from our definition of λ_b
  < (ε′(b − x)/((b − x)|f(b) − f(x)|))·|f(b) − f(x)|      from our definition of δ
  = ε′                                                    algebra
  < ε                                                     from our definition of ε′
and also
|f(x − δ) − f(x)|
  = |f(x − (1 − λ_a)(x − a)) − f(x)|                      from our definition of δ
  = |f(λ_a x + (1 − λ_a)a) − f(x)|                        algebraic rearrangement
  ≤ |λ_a f(x) + (1 − λ_a)f(a) − f(x)|                     from the convexity of f
  = |(1 − λ_a)(f(a) − f(x))|                              algebraic rearrangement
  = |(δ/(x − a))(f(a) − f(x))|                            from our definition of λ_a
  < (ε′(x − a)/((x − a)|f(a) − f(x)|))·|f(a) − f(x)|      from our definition of δ
  = ε′                                                    algebra
  < ε                                                     from our definition of ε′
We chose an arbitrarily small ε and showed how to find a nonzero upper bound for δ such that d(x, y) < δ → d(f(x), f(y)) < ε. By definition, this means that f is continuous.
Justifying “without loss of generality”: special case of f (a) = f (x) or f (b) = f (x)
Let x be an arbitrary point in the interval (a, b). Choose α ∈ (a, x) such that f (α) 6= f (x): if this is not possible,
then choose α = x. Choose β ∈ (x, b) such that f (β) 6= f (x): if this is not possible, then choose β = x.
If α < x < β, we can use the previous method to prove that f is continuous on an interval (α, β) containing x
such that (α, β) ⊆ (a, b). Since x was an arbitrary element of (a, b) this is sufficient to prove that f is continuous
on (a, b).
If α = x < β, we can use the previous method to prove that f is continuous on an interval (x, β) containing
x such that (x, β) ⊆ (a, b). We only have α = x if there was no α ∈ (a, x) such that f (α) 6= f (x): this can
only happen if f is constant (and therefore continuous) on the interval (a, x]. Therefore f is continuous on the
interval (a, β) containing our arbitrary x; this is sufficient to prove that f is continuous on (a, b).
If α < x = β, we can use the previous method to prove that f is continuous on an interval (α, x) containing
x such that (α, x) ⊆ (a, b). We only have β = x if there was no β ∈ (x, b) such that f (β) 6= f (x): this can
only happen if f is constant (and therefore continuous) on the interval [x, b). Therefore f is continuous on the
interval (α, b) containing our arbitrary x; this is sufficient to prove that f is continuous on (a, b).
If α = x = β then f is a constant function on the entire interval (a, b) and is therefore trivially continuous.
Exercise 4.23b: increasing convex functions of convex functions are convex
Let g be an increasing convex function, let f be a convex function, and define their composite to be h = g ◦ f .
f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)
from convexity of f
→ g(f (λx + (1 − λ)y)) ≤ g(λf (x) + (1 − λ)f (y))
g is an increasing function
→ g(f (λx + (1 − λ)y)) ≤ g(λf (x) + (1 − λ)f (y)) ≤ λg(f (x)) + (1 − λ)g(f (y))
from convexity of g
→ g(f (λx + (1 − λ)y)) ≤ λg(f (x)) + (1 − λ)g(f (y))
transitivity
→ h(λx + (1 − λ)y) ≤ λh(x) + (1 − λ)h(y)
definition of h
x and y were arbitrary, so this shows that h is a convex function.
Exercise 4.23c: the ugly-looking inequality
We’re given that s < t < u, so we know that 0 < t − s < u − s. Define λ to be
t−s
u−s
λ=
From this definition we immediately see that 0 < λ < 1. We can also see that
(1 − λ) = 1 −
t−s
u−t
=
u−s
u−s
From this, we can establish an inequality:
λf (u) + (1 − λ)f (s) ≥ f (λu + (1 − λ)s)
t−s
u−t
→ λf (u) + (1 − λ)f (s) ≥ f u−s
u + u−s
s
→ λf (u) + (1 − λ)f (s) ≥ f ut−us+su−st
u−s
→ λf (u) + (1 − λ)f (s) ≥ f t(u−s)
u−s
from convexity of f
→ λf (u) + (1 − λ)f (s) ≥ f (t)
algebra
definition of λ, (1 − λ)
algebra
algebra
From this last inequality, we can derive two results.
λf (u) + (1 − λ)f (s) ≥ f (t)
→ λf (u) − λf (s) ≥ f (t) − f (s)
→
t−s
u−s
(f (u) − λf (s)) ≥ f (t) − f (s)
previously derived
subtract f (s) from each side
definition of λ
68
Dividing both sides of this last inequality by t − s gives us
f (t) − f (s)
f (u) − f (s)
≥
u−s
t−s
(14)
Similarly, we have:
λf (u) + (1 − λ)f (s) ≥ f (t)
previously derived
→ −λf (u) − (1 − λ)f (s) ≤ −f (t)
multiply both sides by −1
→ (1 − λ)f (u) − (1 − λ)f (s) ≤ f (u) − f (t)
add f (u) to both sides
→ (1 − λ)(f (u) − f (s)) ≤ f (u) − f (t)
add f (u) to both sides
→
u−t
u−s
(f (u) − f (s)) ≤ f (u) − f (t)
definition of λ
Dividing both sides of this last equation by u − t gives us
f (u) − f (s)
f (u) − f (t)
≤
u−s
u−t
(15)
Combining (14) and (15) gives us
f (u) − f (s)
f (u) − f (t)
f (t) − f (s)
≤
≤
t−s
u−s
u−t
which is what we were asked to prove.
Exercise 4.24
To prove that f is convex on (a, b) we must show that
(∀x, y ∈ (a, b), λ ∈ [0, 1]) : f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)    (16)
To do this, choose an arbitrary x, y ∈ (a, b) (this choice of x and y will remain fixed for the majority of this proof). Let Λ represent the set of values of λ for which (16) holds. It's immediately clear that 0 ∈ Λ and 1 ∈ Λ. We're also told that f has the property
f((x + y)/2) ≤ (f(x) + f(y))/2   for all x, y ∈ (a, b)    (17)
which is just (16) with λ = ½; therefore ½ ∈ Λ. To prove that f is convex we must show that [0, 1] ⊂ Λ.
Lemma 1: If j, k ∈ Λ then (j + k)/2 ∈ Λ
Assume that j, k ∈ Λ and let m = (j + k)/2. Proof that m ∈ Λ:
f(mx + (1 − m)y) = f(((j + k)/2)x + ((2 − j − k)/2)y)                     definition of m
  = f((jx + kx + 2y − jy − ky)/2)                                          algebra
  = f((jx − jy + y)/2 + (kx − ky + y)/2)                                   algebra
  = f((jx + (1 − j)y)/2 + (kx + (1 − k)y)/2)                               algebra
j is in Λ, so j ∈ [0, 1], so (jx + (1 − j)y) ∈ (a, b). The same is true for k. So we can apply (17).
  ≤ [f(jx + (1 − j)y) + f(kx + (1 − k)y)]/2                                from (17)
  ≤ [jf(x) + (1 − j)f(y) + kf(x) + (1 − k)f(y)]/2                          we can apply (16) because j, k ∈ Λ
  = ((j + k)/2)f(x) + ((2 − j − k)/2)f(y)                                  algebra
  = mf(x) + (1 − m)f(y)                                                    definition of m
This shows that f(mx + (1 − m)y) ≤ mf(x) + (1 − m)f(y), therefore m ∈ Λ.
Lemma 2: all rationals of the form m/2^n with 0 ≤ m ≤ 2^n are members of Λ
This can be proven by induction. Let E be the set of all n for which the lemma is true. We know that
{0, 1/2, 1} ⊂ Λ, so we have 0 ∈ E and 1 ∈ E. Now assume that E contains 1, 2, . . . , k. Choose an arbitrary m such
that 0 ≤ m ≤ 2^(k+1).
case 1: m is even
If m is even then m = 2α for some α ∈ Z with 0 ≤ α ≤ 2^k, and therefore m/2^(k+1) = α/2^k. By the hypothesis of
induction α/2^k ∈ Λ, therefore m/2^(k+1) ∈ Λ.
case 2: m is odd
If m is odd then (m − 1)/2 = α and (m + 1)/2 = β for some α, β ∈ Z with α, β ∈ [0, 2^k]. From this we have

m/2^(k+1) = (m − 1)/2^(k+2) + (m + 1)/2^(k+2) = (1/2)( α/2^k + β/2^k )

By the hypothesis of induction α/2^k ∈ Λ and β/2^k ∈ Λ. Therefore, by lemma 1, m/2^(k+1) ∈ Λ.
This covers all possible cases for m, so k + 1 ∈ E. This completes the inductive step, therefore E = N.
Lemma 3: every element of [0, 1] is a member of Λ
Choose any p ∈ [0, 1]. From lemma 2, we see that Λ is dense in [0, 1] so we can construct a sequence {λn } in Λ
such that limn→∞ λn = p. Each λn is an element of Λ, so for each n we have
f (λn x + (1 − λn )y) ≤ λn f (x) + (1 − λn )f (y)
The function f is continuous, so by theorem 4.2 we can take the limit of both sides as n → ∞ to get
f (px + (1 − p)y) ≤ pf (x) + (1 − p)f (y)
which means that p ∈ Λ.
By lemma 3 we know that
(∀λ ∈ [0, 1])f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)
But we chose x and y arbitrarily from the interval (a, b), so we have proven that
(∀x, y ∈ (a, b), λ ∈ [0, 1])f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)
which is (16), the definition of convexity. This shows that f is convex, which is what we were asked to prove.
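To make the dyadic-rational argument above concrete, here is a small Python illustration (my own addition; the midpoint-convex continuous function and the endpoints are arbitrary choices): it checks inequality (16) at every λ of the form m/2^n for a few values of n, exactly the set handled by lemma 2.

    import math

    # f is continuous and midpoint-convex (in fact convex); chosen only for illustration.
    def f(x):
        return math.exp(x)

    x, y = -1.0, 2.0
    for n in range(1, 6):
        for m in range(2 ** n + 1):
            lam = m / 2 ** n            # dyadic rational in [0, 1], as in lemma 2
            lhs = f(lam * x + (1 - lam) * y)
            rhs = lam * f(x) + (1 - lam) * f(y)
            assert lhs <= rhs + 1e-12   # inequality (16) holds at every dyadic lambda
    print("inequality (16) verified at all dyadic rationals up to denominator 32")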
Exercise 4.25a
The “hint” given for this problem is actually a proof. The set F = z − C is clearly closed because z is a fixed
element and C is closed. The sets K and F are disjoint:

k ∈ K ∩ F                                assumed
→ k ∈ K ∧ k ∈ F                          def. of intersection
→ k ∈ K ∧ (∃c ∈ C)(k = z − c)            def. of F
→ k ∈ K ∧ (∃c ∈ C)(k + c = z)            algebra
→ (∃k ∈ K, c ∈ C)(k + c = z)             rearrangement of quantifiers
→ z ∈ K + C                              def. of K + C

But we chose z ∉ K + C, so no such k can exist and therefore K ∩ F = ∅.
From the definition of the set F, we see that (∀c ∈ C)(z − c ∈ F). And we’ve established that F is closed
and that K is compact, so by exercise 21 we know that there exists some δ > 0 such that

(∀c ∈ C, k ∈ K)(d(z − c, k) > δ)         by exercise 21
→ (∀c ∈ C, k ∈ K)(|z − c − k| > δ)       def. of distance function in R^k
→ (∀c ∈ C, k ∈ K)(|z − (c + k)| > δ)     algebra
→ (∀c ∈ C, k ∈ K)(d(z, c + k) > δ)       def. of distance function in R^k

The set of c + k for all c ∈ C, k ∈ K is simply the set C + K, so we have

(∀y ∈ C + K)(d(z, y) > δ)

This shows us that z is an interior point of the complement of C + K. But z was an arbitrary point of that
complement, so we have shown that the complement of C + K is open. By theorem 2.23 the complement of an
open set is closed, so C + K is closed. And this is what we were asked to prove.
Exercise 4.25b
Let α be an irrational number and let C1 + C2 be defined as the set

C1 + C2 = {m + nα : m, n ∈ N}            (18)

We’ll first prove that C1 + C2 is dense in [0, 1) and then extend this proof to show that C1 + C2 is dense in
R1. Define ∆ to be the set of radii δ such that, for every x ∈ [0, 1), every neighborhood of radius greater than δ
contains a point of C1 + C2. More formally, we define it to be

∆ = {δ : (∀x ∈ [0, 1))(Nδ(x) ∩ (C1 + C2) ≠ ∅)}
We want to prove that C1 + C2 is dense in [0, 1): we can do this by proving that ∆ has a greatest lower bound
of 0.
Lemma 1: Each element of C1 + C2 has a unique representation of the form m + nα
Assume that m + nα and p + qα are two ways of describing the same element of C1 + C2.

m + nα = p + qα                          assumption of equality
→ (m − p) + (n − q)α = 0                 algebra
→ (n − q)α = (p − m)                     algebra

If both sides of this equation are zero, then n = q and m = p, so the two representations are in fact the same.
If both sides are nonzero, we can divide by n − q:

→ α = (p − m)/(n − q)                    algebra

But m, n, p, q are integers, so this last statement implies that α is a rational number. By contradiction, each
element of C1 + C2 has a unique representation of the form m + nα.
Lemma 2: For each n ∈ N, there is exactly one m ∈ N such that m + nα ∈ [0, 1).
Assume that p + nα and q + nα are both in the interval [0, 1). Then |(p + nα) − (q + nα)| = |p − q| < 1. And
p and q are both integers, so it must be the case that p = q.
Lemma 3: If d(x, y) = δ for some x, y ∈ [0, 1) ∩ (C1 + C2), then δ ∈ ∆
Assume d(x, y) = δ for some 0 ≤ x < y < 1 with x, y ∈ C1 + C2. Then we have

d(x, y) = y − x = (p + mα) − (q + nα) = (p − q) + (m − n)α = δ

so that δ is itself an element of C1 + C2. It’s also clear that any integer multiple of δ is an element of C1 + C2.
Now, choose an arbitrary p ∈ [0, 1) that is not a multiple of δ. Every real number lies between two consecutive
integers, so there exists some integer a such that

a < p/δ < a + 1

which implies

aδ < p < (a + 1)δ

which shows that p is within a distance δ of some element of C1 + C2. But p was an arbitrary element
of [0, 1), so every element of [0, 1) lies in such a neighborhood of radius δ, and therefore δ ∈ ∆.
Lemma 4: if δ ∈ ∆, then (1/2)δ ∈ ∆
Proof by contradiction. Assume that δ ∈ ∆ but (1/2)δ ∉ ∆. By lemma 3 we know that d(x, y) > δ for all
x, y ∈ [0, 1) ∩ (C1 + C2). This gives us a maximum size for the set [0, 1) ∩ (C1 + C2):

|[0, 1) ∩ (C1 + C2)| ≤ 1/δ

But this contradicts lemmas 1 and 2, which tell us that there is one unique element of [0, 1) ∩ (C1 + C2) for each
n ∈ N. By contradiction, then, we must have (1/2)δ ∈ ∆ whenever δ ∈ ∆.
The set C1 + C2 is dense in [0, 1)
We can now use induction to show that inf ∆ = 0. We know that 1 ∈ ∆ because a neighborhood of radius δ = 1
around any x ∈ [0, 1) will contain 0 ∈ C1 + C2 . Therefore 1 ∈ ∆. Using induction with lemma 4 tells us that
(∀n ∈ N)(2−n ∈ ∆). This is sufficient to prove that inf ∆ ≤ 0. Each element of ∆ is a distance, so inf ∆ ≥ 0.
Therefore inf ∆ = 0.
By the definition of ∆, this means that every neighborhood of every element of [0, 1) has an element of
C1 + C2 . This allows us to conclude that every element of [0, 1) is a limit point of C1 + C2 , which means that
C1 + C2 is dense in [0, 1).
The set C1 + C2 is dense in R1
Choose an arbitrary p ∈ R1 . Let m be the integer such that m ≤ p < m + 1. This means that 0 ≤ p − m < 1, and
we know that C1 +C2 is dense in [0, 1). Therefore we can construct a sequence {cn } of elements of [0, 1)∩C1 +C2
that converges to p − m. The definition of C1 + C2 guarantees that {cn + m} is also a sequence in C1 + C2 ,
and theorem 3.3 tells us that {cn + m} converges to p. Therefore p is a limit point of C1 + C2 . But p was an
arbitrary element of R1 , so every element of R1 is a limit point of C1 + C2 . And this proves that C1 + C2 is
dense in R1 .
The sets C1 and C2 are closed, but C1 + C2 is not closed.
The sets C1 and C2 are closed because they have no limit points, so it’s trivially true that they contain all of
their limit points. The set C1 + C2 doesn’t contain any non-integer rational numbers, but every real number is
a limit point of C1 + C2 . Therefore C1 + C2 doesn’t contain all of its limit points which means it is not closed.
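The density claim is easy to see numerically. The following Python snippet is my own illustration (using α = √2 and the range of coefficients as arbitrary choices): it shows that the fractional parts of the numbers n·α fill out [0, 1) with no sizable gaps.

    import math

    alpha = math.sqrt(2)                       # any irrational works; sqrt(2) is an arbitrary choice
    # fractional parts of n*alpha are exactly the points of C1 + C2 that land in [0, 1)
    points = sorted((n * alpha) % 1.0 for n in range(1, 2001))
    gaps = [b - a for a, b in zip(points, points[1:])]
    print(max(gaps))                           # the largest gap is already tiny (on the order of 1/1000)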
Exercise 4.26
We’re told that Y is compact and that g is continuous and one-to-one. We conclude that g(Y) is a compact
subset of Z (theorem 4.14) and therefore g is uniformly continuous on Y (theorem 4.19). The fact that g is
one-to-one tells us that g is a one-to-one mapping of Y onto g(Y), so by theorem 4.17 we conclude that g−1 is a
continuous mapping from the compact space g(Y) onto the compact space Y. So by theorem 4.19 we see that g−1 is
uniformly continuous on g(Y).
In exercise 12 we proved that the composition of uniformly continuous functions is uniformly continuous,
therefore f(x) = g−1(h(x)) is uniformly continuous if h is uniformly continuous. Theorem 4.7 tells us that the
composition of continuous functions is continuous, therefore f(x) = g−1(h(x)) is continuous if h is continuous.
To construct the counterexample, define X = Z = [0, 1] and Y = [0, 1/2) ∪ [1, 2]. Define the functions
f : X → Y and g : Y → Z as

f(x) = x    if x < 1/2                   g(y) = y      if y < 1/2
f(x) = 2x   if x ≥ 1/2                   g(y) = y/2    if y ≥ 1/2

We can easily demonstrate that f fails to be continuous at x = 1/2 and that g is continuous at every point of Y. The
composite function h : X → Z is just h(x) = x, which is clearly continuous.
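A quick Python check of this counterexample (my own addition; the function definitions simply mirror the ones above) confirms that h = g ∘ f is the identity even though f jumps at x = 1/2.

    def f(x):
        # maps [0,1] onto [0,1/2) U [1,2]; jumps at x = 1/2
        return x if x < 0.5 else 2 * x

    def g(y):
        # continuous on [0,1/2) U [1,2] because the two pieces are far apart
        return y if y < 0.5 else y / 2

    for x in [0.0, 0.25, 0.499, 0.5, 0.75, 1.0]:
        assert abs(g(f(x)) - x) < 1e-12        # h(x) = g(f(x)) = x
    print(f(0.499), f(0.5))                    # 0.499 versus 1.0: the jump in f at 1/2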
Exercise 5.1
Choose arbitrary elements x, y ∈ R1 with x ≠ y. We’re told that

|f(x) − f(y)| ≤ (x − y)^2 = |x − y|^2

Dividing both sides by |x − y| we have

| ( f(x) − f(y) ) / (x − y) | ≤ |x − y|

Taking the limit of each side as y → x gives us

|f′(x)| ≤ 0

But our choice of x was arbitrary, so the derivative of f is zero at every point. By theorem 5.11b, this proves
that f is a constant function.
Exercise 5.2
Lemma 1: g(f (x)) = x
We’re told that f 0 (x) > 0 for all x, so f is strictly increasing in (a, b) (this result is a trivial extension of the
proofs for theorem 5.11). By definition, this means that x < y ↔ f (x) < f (y) and x > y ↔ f (x) > f (y): from
this we conclude
x ≠ y ↔ f(x) ≠ f(y)
By contrapositive this is equivalent to
x = y iff f (x) = f (y)
which, by definition, means that f is one-to-one. Therefore g(f (x)) = x (theorem 4.17 ).
g′ exists at every point of f( (a, b) )
Using lemma 1 and the substitution x − t = g(f(x)) − g(f(t)), we see that f′(x) and g′(f(x)) are reciprocals of
one another:

f′(x) = lim_{t→x} ( f(x) − f(t) ) / (x − t)
      = lim_{t→x} ( f(x) − f(t) ) / ( g(f(x)) − g(f(t)) )
      = lim_{t→x} 1 / [ ( g(f(x)) − g(f(t)) ) / ( f(x) − f(t) ) ]
      = 1 / g′(f(x))

We’re told that the derivative f′(x) is defined and nonzero for all x ∈ (a, b), so g′(f(x)) = 1/f′(x) is also defined
at every point f(x) of the range of f.
bonus proofs: a big wad of properties for f and g
We can show that both f and g are uniformly continuous, injective, differentiable, strictly increasing functions
whose domains and ranges are compact.
From lemma 1, we know that f is a one-to-one function that is strictly increasing on (a, b). We’re told
that f is differentiable on (a, b), therefore f is continuous on [a, b] (theorem 5.2). Because the domain of f is
a compact metric space we can conclude that f is uniformly continuous (theorem 4.19). From the continuity
and injectiveness of f we can conclude that g is continuous on f ( [a, b] ) (theorem 4.17). The domain of g is
the range of f, so the domain of g is a compact space (theorem 4.14) and therefore g is uniformly continuous
(theorem 4.19). From lemma 1 we know that g(f (x)) = x for all x, therefore the inequalities in lemma 1 are
equivalent to
g(f (x)) > g(f (y)) ↔ f (x) > f (y)
g(f (x)) < g(f (y)) ↔ f (x) < f (y)
g(f (x)) = g(f (y)) ↔ f (x) = f (y)
Which means that g is a one-to-one function and is strictly increasing on f ( (a, b) ).
Exercise 5.3
Choose ε such that |ε| < 1/M. The function f(x) = x + εg(x) is the sum of two differentiable functions, so by
theorem 5.3 f is itself differentiable:

f′(x) = lim_{t→x} (x − t)/(x − t) + lim_{t→x} ε( g(x) − g(t) )/(x − t) = 1 + εg′(x)

From our bounds on ε and g′(x) this gives us

1 − (1/M)·M < f′(x) < 1 + (1/M)·M,    that is,    0 < f′(x) < 2

These are strict inequalities, so we conclude that f′(x) is always positive for this choice of ε. Therefore f is an
increasing function and is one-to-one (see lemma 1 of the previous exercise).
Exercise 5.4
Consider the function

f(x) = C0·x + (C1·x^2)/2 + (C2·x^3)/3 + · · · + (Cn·x^(n+1))/(n + 1)

When x = 1 this evaluates to the expression given to us in the exercise, so f(1) = 0. When x = 0 every term
evaluates to zero, so f(0) = 0. From example 5.4 we know that f(x) is differentiable and its derivative is given
by

f′(x) = C0 + C1·x + C2·x^2 + · · · + Cn·x^n

By theorem 5.10, the fact that f(0) = f(1) = 0 means that f′(x) = 0 for some x ∈ (0, 1). Therefore the
polynomial C0 + C1·x + · · · + Cn·x^n has a real root in (0, 1), and this is what we were asked to prove.
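As a numerical illustration of this argument (my own addition, with arbitrarily chosen coefficients satisfying the hypothesis), the snippet below builds coefficients with C0 + C1/2 + · · · + Cn/(n+1) = 0 and locates a sign change of the polynomial inside (0, 1).

    # Coefficients C0..C3 chosen so that C0 + C1/2 + C2/3 + C3/4 = 0 (the exercise's hypothesis).
    C = [1.0, 2.0, -1.0, -(1.0 + 2.0 / 2 - 1.0 / 3) * 4]   # last entry solves for C3
    assert abs(sum(c / (k + 1) for k, c in enumerate(C))) < 1e-12

    def p(x):
        return sum(c * x ** k for k, c in enumerate(C))

    # scan (0,1) for a root via sign change; Rolle's theorem guarantees a root exists in (0,1)
    xs = [i / 1000 for i in range(1, 1000)]
    sign_change = any(p(a) * p(b) <= 0 for a, b in zip(xs, xs[1:]))
    print(sign_change)                                      # True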
Exercise 5.5
Choose an arbitrary ε > 0. We’re told that f′(t) → 0 as t → ∞, so there exists some N such that

t > N → |f′(t)| < ε

With some algebraic manipulation and the use of the mean value theorem (theorem 5.10) we can express g as

g(x) = f(x + 1) − f(x) = ( f(x + 1) − f(x) ) / ( (x + 1) − x ) = f′(t)    for some t ∈ (x, x + 1)

This must be true for all possible values of x, so choose x > N. We now have t > x > N, so the f′(t) term in
the previous equation is less than ε in absolute value:

|g(x)| = |f′(t)| < ε

which means that |g(x)| < ε for all x > N. And ε was an arbitrary positive real, which by definition means that
g(x) → 0 as x → ∞. And this is what we were asked to prove.
Exercise 5.6
The function g is differentiable (theorem 5.3c) and its derivative is

g′(x) = ( x·f′(x) − f(x) ) / x^2

We want to prove that g is monotonically increasing on (0, ∞). It suffices to show that g′(x) > 0 for all x > 0,
which is true iff

( x·f′(x) − f(x) ) / x^2 > 0

which is true iff

x·f′(x) > f(x)

which is true iff

f′(x) > f(x)/x                           (19)

To show that (19) holds for all x > 0, choose an arbitrary x > 0. From the fact that f(0) = 0 we know from
theorem 5.10 that

f(x)/x = ( f(x) − f(0) ) / (x − 0) = f′(c)    for some c ∈ (0, x)

We’re told that f′ is monotonically increasing, so f′(x) > f′(c). Therefore:

f′(x) > f′(c) = f(x)/x

Therefore (19) holds, and we’ve shown that g′(x) > 0 for all x > 0, which means that g is monotonically
increasing. And this is what we were asked to prove.
Exercise 5.7
From the fact that f(x) = g(x) = 0 we see that

lim_{t→x} f(t)/g(t) = lim_{t→x} ( f(t) − f(x) ) / ( g(t) − g(x) )
                    = lim_{t→x} [ ( f(t) − f(x) )/(t − x) ] / [ ( g(t) − g(x) )/(t − x) ]
                    = f′(x)/g′(x)
Exercise 5.8
We’re told that f′ is a continuous function on the compact space [a, b], therefore f′ is uniformly continuous
(theorem 4.19). Choose any ε > 0: by the definition of uniform continuity there exists some δ > 0 such that
0 < |t − x| < δ → |f′(t) − f′(x)| < ε. Now choose any t, x ∈ [a, b] with 0 < |t − x| < δ: by the mean value
theorem there exists c between t and x such that

f′(c) = ( f(t) − f(x) ) / (t − x)

From the fact that c lies between t and x we know that |c − x| < |t − x| < δ. Therefore |f′(c) − f′(x)| < ε.
Therefore, from our definition of c, we have

| ( f(t) − f(x) ) / (t − x) − f′(x) | = |f′(c) − f′(x)| < ε

Our initial choice of ε was arbitrary, and the corresponding δ > 0 makes this inequality true for all t, x ∈ [a, b]
with 0 < |t − x| < δ. And this is what we were asked to prove.
Does this hold for vector-valued functions?
Yes. Choose an arbitrary ε > 0 and define the vector-valued function f to be

f(x) = ( f1(x), f2(x), . . . , fn(x) )

Assume that f′ is continuous on [a, b]. Then each fi′ is continuous on [a, b] (remark 5.16) and [a, b] is compact,
so by the preceding proof we know that for each fi there exists some δi > 0 such that

0 < |t − x| < δi  →  | ( fi(t) − fi(x) ) / (t − x) − fi′(x) | < ε/n

Define δ = min{δ1, δ2, . . . , δn}. For 0 < |t − x| < δ we now have

| ( f(t) − f(x) ) / (t − x) − f′(x) | ≤ Σ_{i=1}^{n} | ( fi(t) − fi(x) ) / (t − x) − fi′(x) | < ε/n + ε/n + · · · + ε/n = ε

where the first inequality holds because the norm of a vector is at most the sum of the absolute values of its
components.
Exercise 5.9
We’re asked to show that f′(0) exists. From the definition of the derivative, we know that

f′(0) = lim_{x→0} ( f(x) − f(0) ) / (x − 0)

The function f is continuous, so lim_{x→0} ( f(x) − f(0) ) = 0 and lim_{x→0} (x − 0) = 0. Therefore we can use
L’Hopital’s rule:

f′(0) = lim_{x→0} ( f(x) − f(0) ) / x = lim_{x→0} f′(x) / 1 = lim_{x→0} f′(x)

We’re told that the right-hand limit exists and is equal to 3, therefore the leftmost term (f′(0)) exists and is
equal to 3. And this is what we were asked to prove.
Exercise 5.9 : Alternate proof
If lim_{x→0} f′(x) = 3 and f′(0) ≠ 3, then f′ would have a simple discontinuity at x = 0. This is impossible by
the corollary to theorem 5.12, so f′(0) = 3.
Exercise 5.10
Let f1, g1 represent the real parts of the functions f, g and let f2, g2 represent their imaginary parts: that is,
f(x) = f1(x) + i·f2(x) and g(x) = g1(x) + i·g2(x). We’re told that f and g are differentiable, therefore each of
these component functions is differentiable (see Rudin’s remark 5.16). Applying the hint given in the exercise,
we have

lim_{x→0} f(x)/g(x) = lim_{x→0} [ ( (f1(x) + i·f2(x))/x − A ) · x/(g1(x) + i·g2(x)) + A · x/(g1(x) + i·g2(x)) ]

Each of these component functions is differentiable and the numerators and denominators of the real-valued
ratios all tend to 0, so we can apply L’Hopital’s rule to them:

lim_{x→0} f(x)/g(x) = lim_{x→0} [ ( (f1′(x) + i·f2′(x))/1 − A ) · 1/(g1′(x) + i·g2′(x)) + A · 1/(g1′(x) + i·g2′(x)) ]
                    = lim_{x→0} [ ( f′(x)/1 − A ) · 1/g′(x) + A · 1/g′(x) ]
                    = {A − A}·(1/B) + A·(1/B)
                    = A/B
Exercise 5.11
The numerator and denominator of the given ratio both tend to 0 as h → 0, so we can use L’Hopital’s rule
(differentiating with respect to h):

lim_{h→0} ( f(x + h) + f(x − h) − 2f(x) ) / h^2 = lim_{h→0} ( f′(x + h) − f′(x − h) ) / (2h)
    = lim_{h→0} (1/2) [ ( f′(x + h) − f′(x) ) / h + ( f′(x) − f′(x − h) ) / h ]

We’re told that f′′(x) exists, so this limit exists and is equal to

(1/2)( f′′(x) + f′′(x) ) = f′′(x)

A function for which this limit exists although f′′(x) does not
For a counterexample, we need only find a differentiable odd function (so that f(h) + f(−h) − 2f(0) = 0 for every
h and the symmetric limit at x = 0 is trivially 0) for which f′′(0) does not exist. These criteria are met by

f(x) = x^2     if x > 0
f(x) = −x^2    if x < 0
f(x) = 0       if x = 0

This function is continuous and differentiable with f′(x) = 2|x|, so the symmetric limit above equals 0 at x = 0,
but f′′(x) does not exist at x = 0.
Exercise 5.12
From the definition of f′(x), we have

f′(x) = lim_{h→0} ( |x + h|^3 − |x|^3 ) / h                               (20)

If x > 0 then the absolute values in the numerator can be dropped and (20) resolves to

f′(x) = lim_{h→0} ( (x + h)^3 − x^3 ) / h = lim_{h→0} ( 3x^2 + 3xh + h^2 ) = 3x^2

If x < 0 then |x + h| = |x| − h for small h and (20) resolves to

f′(x) = lim_{h→0} ( (|x| − h)^3 − |x|^3 ) / h = lim_{h→0} −( 3|x|^2 − 3|x|h + h^2 ) = −3|x|^2

It’s clear from the above results that f′(x) → 0 as x → 0, and this agrees with f′(0):

f′(0) = lim_{h→0} |h|^3 / h = 0

So f′(x) = 3x|x| for all x.
From the definition of f′′(x), we have

f′′(x) = lim_{h→0} ( 3(x + h)|x + h| − 3x|x| ) / h                        (21)

If x > 0 then the absolute values can be dropped and (21) resolves to

f′′(x) = lim_{h→0} ( 3(x + h)^2 − 3x^2 ) / h = lim_{h→0} ( 6x + 3h ) = 6x

If x < 0 then (21) resolves to

f′′(x) = lim_{h→0} ( −3(x + h)^2 + 3x^2 ) / h = lim_{h→0} −( 6x + 3h ) = −6x = 6|x|

It’s clear from the above results that f′′(x) → 0 as x → 0, and this agrees with f′′(0):

f′′(0) = lim_{h→0} 3h|h| / h = lim_{h→0} 3|h| = 0

So f′′(x) = 6|x| for all x.
From the definition of f′′′(x), we have

f′′′(x) = lim_{h→0} ( |6(x + h)| − |6x| ) / h                             (22)

If x = 0, then when h > 0 we have

|6h| / h = 6

and when h < 0 we have

|6h| / h = −6

So the limit in (22) (which is f^(3)(0)) doesn’t exist.
Exercise 5.13a
f is continuous when x ≠ 0
The proof that x^a is continuous when x ≠ 0 is trivial. The sin function hasn’t been well-defined yet, but we can
assume that it’s a continuous function5. Therefore the product x^a sin(x^{−c}) is continuous wherever it’s defined
(theorem 4.9), which is everywhere but x = 0.
5 Example 5.6 says that we should assume without proof that sin is differentiable, so we can also assume that it’s continuous
(theorem 5.2).
f is continuous at x = 0 if a > 0
We have f(0) = 0 by definition. For f to be continuous at x = 0 it must be the case that lim_{x→0} f(x) = 0. The
range of the sin function is [−1, 1], so if a > 0 we have

−|x^a| ≤ x^a sin(x^{−c}) ≤ |x^a|

Taking the limit of each of these terms as x → 0 gives us

0 ≤ lim_{x→0} x^a sin(x^{−c}) ≤ 0

which shows that lim_{x→0} f(x) = 0, and therefore f is continuous at x = 0.
f is discontinuous at x = 0 if a = 0
To show that f is not continuous at x = 0, it’s sufficient to construct a sequence {xn} such that lim xn = 0 but
lim f(xn) ≠ f(0) (theorem 4.2). Define the terms of {xn} to be

xn = ( 1 / (2nπ + π/2) )^{1/c}

This sequence clearly has a limit of 0, but

f(xn) = xn^0 sin(xn^{−c}) = sin(2nπ + π/2) = 1

so that lim{f(xn)} = 1 ≠ 0 = f(0). Note that we’re making lots of unjustified assumptions about the sin function
and the properties of the as-of-yet undefined symbol π.
f is discontinuous at x = 0 if a < 0
Define the terms of {xn} to be

xn = ( 1 / (2nπ + π/2) )^{1/c}

This sequence clearly has a limit of 0, but

f(xn) = xn^a sin(xn^{−c}) = ( 1 / (2nπ + π/2) )^{a/c} sin(2nπ + π/2) = ( 1 / (2nπ + π/2) )^{a/c}

We’re told that c > 0 and here a < 0, therefore a/c < 0 and we have

f(xn) = ( 1 / (2nπ + π/2) )^{a/c} = ( 2nπ + π/2 )^{−a/c}

where −a/c > 0. By theorem 3.20a we see that lim f(xn) = ∞, which means that lim xn = 0 but lim f(xn) ≠ 0
and therefore f is not continuous at x = 0.
These cases show that f is continuous iff a > 0.
Exercise 5.13b
From the definition of the derivative we have

f′(0) = lim_{h→0} ( f(0 + h) − f(0) ) / h = lim_{h→0} ( h^a sin(h^{−c}) − 0 ) / h = lim_{h→0} h^{a−1} sin(h^{−c})        (23)

We can evaluate the rightmost term by noting that sin is bounded by [−1, 1] so that

−|h^{a−1}| ≤ h^{a−1} sin(h^{−c}) ≤ |h^{a−1}|                                                                             (24)
Lemma 1: f is differentiable when x ≠ 0
The proof that x^a and x^{−c} are differentiable when x ≠ 0 is trivial. The sin function hasn’t been well-defined yet,
but Rudin asks us in example 5.6 to assume that it’s differentiable. Therefore sin(x^{−c}) is differentiable when
x ≠ 0 (theorem 5.5, the chain rule) and therefore x^a sin(x^{−c}) is differentiable when x ≠ 0 (theorem 5.3b, the
product rule).
case 1: f′(0) exists when a > 1
When a > 1 we have a − 1 > 0, and therefore taking the limits of (24) as h → 0 gives us

0 ≤ lim_{h→0} |h^{a−1} sin(h^{−c})| ≤ 0

which means that (23) becomes

f′(0) = lim_{h→0} h^{a−1} sin(h^{−c}) = 0

which shows that f′(0) is defined.
case 2: f′(0) does not exist when a = 1
Define the sequences {hn} and {jn} such that

hn = ( 1 / (2nπ + π/2) )^{1/c},        jn = ( 1 / ((2n + 1)π + π/2) )^{1/c}

When a = 1 we have a − 1 = 0, and therefore evaluating the quantity inside the limit (23) along these two
sequences gives us

lim_{n→∞} hn^{a−1} sin(hn^{−c}) = lim_{n→∞} sin(hn^{−c}) = 1
lim_{n→∞} jn^{a−1} sin(jn^{−c}) = lim_{n→∞} sin(jn^{−c}) = −1

Both sequences tend to 0, but the difference quotients along them converge to different values. This means that
the limit in (23) (and therefore f′(0) itself) does not exist.
case 3: f′(0) does not exist when a < 1
Define the sequences {hn} and {jn} as in case 2. When a < 1 we have a − 1 < 0, and therefore evaluating the
quantity inside the limit (23) along these sequences gives us

lim_{n→∞} hn^{a−1} sin(hn^{−c}) = lim_{n→∞} hn^{a−1} = ∞
lim_{n→∞} jn^{a−1} sin(jn^{−c}) = lim_{n→∞} −jn^{a−1} = −∞

Both sequences tend to 0, but the difference quotients along them diverge in opposite directions. This means
that the limit in (23) (and therefore f′(0) itself) does not exist.
These cases show that f′(0) exists iff a > 1.
Exercise 5.13c
Note that we’ve only defined f on the domain [−1, 1], so we only need to determine whether f′ is bounded on this domain.
case 1: f′ is unbounded when a < 1
We saw in case (3) of part (b) that f′ is unbounded near 0 when a < 1.
case 2: f′ is unbounded when 1 ≤ a < c + 1
By the lemma of part (b) we know that f′(x) is defined for all x ∈ [−1, 1] except for possibly x = 0. By the chain
rule and product rule we know that the derivative of f when x ≠ 0 is

f′(x) = a·x^{a−1} sin(x^{−c}) − c·x^{a−(c+1)} cos(x^{−c})                  (25)

Define the sequence {hn} such that

hn = (2nπ)^{−1/c}

Evaluating the derivative in (25) at x = hn gives us

f′(hn) = a·(2nπ)^{(1−a)/c} sin(2nπ) − c·(2nπ)^{((1+c)−a)/c} cos(2nπ) = −c·(2nπ)^{((1+c)−a)/c}

We’re assuming in this case that a < c + 1, so the exponent ((1 + c) − a)/c is positive, and taking the limit of this
last equation as n → ∞ gives us

lim_{n→∞} f′(hn) = lim_{n→∞} −c·(2nπ)^{((1+c)−a)/c} = −∞

This doesn’t prove anything about f′(0) itself, but it does show that f′(x) is unbounded near 0.
case 3: f′ is bounded when a ≥ c + 1
If a ≥ c + 1 then clearly a > 1, so by the lemma of part (b) we know that f′(x) is defined for all x ∈ [−1, 1],
including x = 0 (where f′(0) = 0). By the chain rule and product rule we know that the derivative of f for x ≠ 0 is

f′(x) = a·x^{a−1} sin(x^{−c}) − c·x^{a−(c+1)} cos(x^{−c})                  (26)

Because x ∈ [−1, 1] and the exponents a − 1 and a − (c + 1) are nonnegative, the factors x^{a−1}, x^{a−(c+1)}, sin,
and cos are all bounded by [−1, 1]. Therefore (26) is bounded by

−(a + c) ≤ a·x^{a−1} sin(x^{−c}) − c·x^{a−(c+1)} cos(x^{−c}) ≤ a + c

Which, of course, means that f′(x) is bounded by [−(a + c), a + c] for x ∈ [−1, 1]. We could find stricter
bounds for f′(x), but it’s not necessary.
These three cases show that f′ is bounded iff a ≥ c + 1.
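The dichotomy in this part is easy to observe numerically. The snippet below is my own illustration (the parameter values and sample points are arbitrary): it evaluates the derivative formula above at the points hn = (2nπ)^(−1/c) used in case 2, once with a below c + 1 and once with a above it.

    import math

    def fprime(x, a, c):
        # derivative of x**a * sin(x**(-c)) for x > 0, as in equation (25)/(26)
        return a * x ** (a - 1) * math.sin(x ** (-c)) - c * x ** (a - (c + 1)) * math.cos(x ** (-c))

    c = 2.0
    for a in (2.0, 4.0):                       # a < c + 1 = 3 versus a >= c + 1
        values = [abs(fprime((2 * n * math.pi) ** (-1 / c), a, c)) for n in range(1, 200)]
        print(a, max(values))                  # keeps growing with n when a = 2, stays small when a = 4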
Exercise 5.13d
lemma: f′ is continuous when x ≠ 0
From lemma 1 of part (b) we know that f′ exists for all x ≠ 0 and is given by

f′(x) = a·x^{a−1} sin(x^{−c}) − c·x^{a−(c+1)} cos(x^{−c})                  (27)

Rudin asks us to assume that sin and cos are continuous functions, and it’s trivial to show that x^{±α} is continuous
when x ≠ 0 for any α, so we can use the chain rule (theorem 5.5) and product rule (theorem 5.3b) to show that
f′ is continuous when x ≠ 0.
case 1: f′ is continuous at x = 0 when a > 1 + c
We’ve shown that f′(0) = 0 when a > 1 (case 1 of part (b)). For f′ to be continuous at x = 0 it must be the
case that lim_{x→0} f′(x) = 0. We can algebraically rearrange (27) to obtain

f′(x) = x^{a−(c+1)} [ a·x^c sin(x^{−c}) − c·cos(x^{−c}) ]

The ranges of the cosine and sine functions are [−1, 1], so we can establish a bound on this function:

|f′(x)| = |x^{a−(c+1)}| · |a·x^c sin(x^{−c}) − c·cos(x^{−c})| ≤ |x^{a−(c+1)}| · ( |a·x^c| + |c| )

Because a > c + 1, we have |x^{a−(c+1)}| → 0 and |a·x^c| → 0 as x → 0. Taking the limits of this last inequality as
x → 0 therefore gives us

lim_{x→0} |f′(x)| ≤ 0 · (0 + c) = 0

This shows that lim_{x→0} f′(x) = f′(0), therefore f′ is continuous at x = 0.
case 2: f′ is not continuous at x = 0 when a = 1 + c
To show that f′ is not continuous at x = 0 it’s sufficient to construct a sequence {xn} such that lim xn = 0 but
lim f′(xn) ≠ f′(0) = 0 (theorem 4.2). Define the terms of xn to be

xn = ( 1 / (2nπ) )^{1/c}

This sequence clearly has a limit of 0, but sin(xn^{−c}) = 0 and cos(xn^{−c}) = 1, so that the terms of {f′(xn)} are

f′(xn) = a·xn^{a−1}·(0) − c·xn^{a−(c+1)}·(1) = −c·xn^0 = −c

so that lim{f′(xn)} = −c ≠ f′(0).
case 3: f′ is not continuous at x = 0 when a < 1 + c
If a < 1 + c we know that f′ is not bounded on [−1, 1] (part (c)), therefore f′ is not continuous on [−1, 1] (theorem
4.15). From the lemma above, we know that the discontinuity must occur at the point x = 0.
For an alternative proof we could use the sequence {xn} established in case 2 and show that |f′(xn)| → ∞ when
a < 1 + c, so that f′(xn) does not converge to f′(0) = 0 even though xn → 0.
Exercise 5.13e
Lemma 1: f′ is differentiable when x ≠ 0
We established in part (b) that f′ exists for x ≠ 0 and is given by

f′(x) = a·x^{a−1} sin(x^{−c}) − c·x^{a−(c+1)} cos(x^{−c})                  (28)

We know that all of the power functions of x are differentiable when x ≠ 0 and Rudin asks us to assume that
sin and cos are differentiable, so we can use theorem 5.5 (the chain rule) and theorem 5.3 (the product rule) to
show that f′ is differentiable when x ≠ 0.
case 1: f′′(0) exists when a > 2 + c
From the definition of the derivative we know that

f′′(0) = lim_{h→0} ( f′(0 + h) − f′(0) ) / h
       = lim_{h→0} ( a·h^{a−1} sin(h^{−c}) − c·h^{a−(c+1)} cos(h^{−c}) ) / h
       = lim_{h→0} [ a·h^{a−2} sin(h^{−c}) − c·h^{a−(c+2)} cos(h^{−c}) ]                   (29)

The range of the sin and cos functions is [−1, 1] so we can establish bounds for the limited term:

−( |a·h^{a−2}| + |c·h^{a−(c+2)}| ) ≤ a·h^{a−2} sin(h^{−c}) − c·h^{a−(c+2)} cos(h^{−c}) ≤ |a·h^{a−2}| + |c·h^{a−(c+2)}|

When a > (2 + c) > 2, these powers of h tend to zero as h → 0. Taking the limits of the previous inequality
as h → 0, we have

0 ≤ lim_{h→0} [ a·h^{a−2} sin(h^{−c}) − c·h^{a−(c+2)} cos(h^{−c}) ] ≤ 0

This shows that the limit in (29) (and therefore f′′(0)) exists and equals 0.
case 2: f′′(0) does not exist when a = 2 + c
Define the sequences {hn} and {jn} such that

hn = ( 1 / (2nπ) )^{1/c},        jn = ( 1 / ((2n + 1)π) )^{1/c}

When a = 2 + c we have a − (c + 2) = 0, and therefore evaluating the quantity inside the limit (29) along these
two sequences gives us

lim_{n→∞} [ a·hn^{a−2} sin(hn^{−c}) − c·hn^{a−(c+2)} cos(hn^{−c}) ] = 0 − c·(1)·(1) = −c
lim_{n→∞} [ a·jn^{a−2} sin(jn^{−c}) − c·jn^{a−(c+2)} cos(jn^{−c}) ] = 0 − c·(1)·(−1) = c

Both sequences tend to 0, but the difference quotients along them converge to different values. This means that
the limit in (29) (and therefore f′′(0) itself) does not exist.
case 3: f′′(0) does not exist when a < 2 + c
Define the sequences {hn} and {jn} as in case 2. When a < 2 + c we have a − (c + 2) < 0, and therefore
hn^{a−(c+2)} → ∞. This means that evaluating the quantity inside the limit (29) along these sequences gives us

lim_{n→∞} [ a·hn^{a−2} sin(hn^{−c}) − c·hn^{a−(c+2)} cos(hn^{−c}) ] = lim_{n→∞} [ 0 − c·hn^{a−(c+2)}·(1) ] = −∞
lim_{n→∞} [ a·jn^{a−2} sin(jn^{−c}) − c·jn^{a−(c+2)} cos(jn^{−c}) ] = lim_{n→∞} [ 0 − c·jn^{a−(c+2)}·(−1) ] = ∞

Both sequences tend to 0, but the difference quotients along them diverge in opposite directions. This means
that the limit in (29) (and therefore f′′(0) itself) does not exist.
These three cases show that f′′(0) exists iff a > 2 + c.
Exercise 5.13f
Lemma 1: to hell with this
By the lemma of part (e) we know that f′′(x) is defined for all x ∈ [−1, 1] except for possibly x = 0. By the chain
rule and product rule we know that the second derivative of f when x ≠ 0 is

f′′(x) = [ (a^2 − a)·x^{a−2} − c^2·x^{a−(2+2c)} ] sin(x^{−c}) + [ (c^2 + c − 2ac)·x^{a−(2+c)} ] cos(x^{−c})        (30)

And I’m not going to screw around with limits and absolute values of something so annoying to type out, so I’ll
conclude with “the proof is similar to that of part (c)”.
Exercise 5.13g
The proof is similar to that of part (d), and the sentiment is similar to that of lemma (1) of part (f).
Exercise 5.14
Lemma 1: If f is not convex then the convexity condition fails for all λ ∈ (0, 1) for some s, t ∈ (a, b)
This could also be stated as “if f is not convex on (a, b) then it is strictly concave on some interval (s, t) ⊂ (a, b)”.
Assume that f is continuous on the interval (a, b) and that f is not convex. By definition, this means that we
can choose c, d ∈ (a, b) and λ ∈ (0, 1) such that

f(λc + (1 − λ)d) > λf(c) + (1 − λ)f(d)                                   (31)

Having fixed our choice of c, d, and λ so that the previous equation is true, we define the function g(x) to be

g(x) = f( xc + (1 − x)d ) − [ xf(c) + (1 − x)f(d) ]

We know that g is continuous on [0, 1] because f is continuous on the segment with endpoints c and d, which lies
in (a, b) (theorem 4.9). And we know that there is at least one p ∈ [0, 1] such that g(p) > 0, because choosing p = λ
causes the previous equation to simplify to

g(p) = f(λc + (1 − λ)d) − [ λf(c) + (1 − λ)f(d) ]

which is > 0 from (31). Let Z1 be the set of all p ∈ [0, λ) for which g(p) = 0 and let Z2 be the set of all
p ∈ (λ, 1] for which g(p) = 0. It’s immediately clear that these sets are nonempty since g(0) = g(1) = 0. These
sets are closed (exercise 4.3) and therefore contain their supremums and infimums. Let α = sup{Z1} and let
β = inf{Z2}. We can now claim that g(p) > 0 for all p ∈ (α, β). And this is the same as saying that

f(λs + (1 − λ)t) > λf(s) + (1 − λ)f(t)     for all λ ∈ (0, 1)

for

s = αc + (1 − α)d,        t = βc + (1 − β)d

which, by definition, means that the convexity condition fails on the interval with endpoints s and t for all λ ∈ (0, 1).
The “if ” case: f′ is monotonically increasing if f is convex
Let f be a function that is convex on the interval (a, b) (see exercise 4.23) and choose x, y ∈ (a, b) with y > x.
By definition, convexity means that for all λ ∈ (0, 1) it must be the case that

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)                                   (32)

We want to express the left-hand side of this inequality as f(x + h): we can do this by defining h such that

h = (λ − 1)(x − y) = (1 − λ)(y − x)

Rearranging this algebraically allows us to express λ and 1 − λ as

λ = 1 − h/(y − x),        (1 − λ) = h/(y − x)

Substituting these values of λ and 1 − λ into (32) results in

f(x + h) ≤ f(x) − h·f(x)/(y − x) + h·f(y)/(y − x)

which is algebraically equivalent to

( f(x + h) − f(x) ) / h ≤ ( f(y) − f(x) ) / (y − x)

Equation (32) had to be true for any value of λ ∈ (0, 1). As λ → 1 we see that h → 0. Taking the limit of both
sides of the previous equation as h → 0 gives us

f′(x) ≤ ( f(y) − f(x) ) / (y − x)                                        (33)

Having established (33), we now want to express the left-hand side of (32) as f(y − h): we can do this by
redefining h such that

h = −λ(x − y) = λ(y − x)

Rearranging this algebraically allows us to express λ and 1 − λ as

λ = h/(y − x),        (1 − λ) = 1 − h/(y − x)

Substituting these values of λ and 1 − λ into (32) results in

f(y − h) ≤ h·f(x)/(y − x) + f(y) − h·f(y)/(y − x)

which is algebraically equivalent to

( f(y) − f(y − h) ) / h ≥ ( f(y) − f(x) ) / (y − x)

Equation (32) had to be true for any value of λ ∈ (0, 1). As λ → 0 we see that h → 0. Taking the limit of both
sides of the previous equation as h → 0 gives us

f′(y) ≥ ( f(y) − f(x) ) / (y − x)                                        (34)

Combining equations (33) and (34), we have shown that

f′(y) ≥ f′(x)

We assumed only that f was convex and that y > x; we concluded that f′(y) ≥ f′(x). By definition, this means
that f′ is monotonically increasing if f is convex.
The “only if ” case: f′ is monotonically increasing only if f is convex
Assume that f is not convex. By lemma 1, we can find some subinterval with endpoints s < t in (a, b) such that

f(λs + (1 − λ)t) > λf(s) + (1 − λ)f(t)                                   (35)

is true for all λ ∈ (0, 1). We can now follow the logic of the “if” case and define

h = (λ − 1)(s − t) = (1 − λ)(t − s)

Rearranging this algebraically allows us to express λ and 1 − λ as

λ = 1 − h/(t − s),        (1 − λ) = h/(t − s)

Substituting these values of λ and 1 − λ into (35) results in

f(s + h) > f(s) − h·f(s)/(t − s) + h·f(t)/(t − s)

which is algebraically equivalent to

( f(s + h) − f(s) ) / h > ( f(t) − f(s) ) / (t − s)

Equation (35) had to be true for any value of λ ∈ (0, 1). As λ → 1 we see that h → 0. Taking the limit of both
sides of the previous equation as h → 0 gives us

f′(s) > ( f(t) − f(s) ) / (t − s)                                        (36)

Having established (36), we now redefine h such that

h = −λ(s − t) = λ(t − s)

Rearranging this algebraically allows us to express λ and 1 − λ as

λ = h/(t − s),        (1 − λ) = 1 − h/(t − s)

Substituting these values of λ and 1 − λ into (35) results in

f(t − h) > h·f(s)/(t − s) + f(t) − h·f(t)/(t − s)

which is algebraically equivalent to

( f(t) − f(t − h) ) / h < ( f(t) − f(s) ) / (t − s)

Equation (35) had to be true for any value of λ ∈ (0, 1). As λ → 0 we see that h → 0. Taking the limit of both
sides of the previous equation as h → 0 gives us

f′(t) < ( f(t) − f(s) ) / (t − s)                                        (37)

Combining equations (36) and (37), we have shown that

f′(t) < f′(s)

for some t > s. We assumed only that f was not convex; we concluded that f′ was not monotonically increasing.
By contrapositive, this means that f′ is monotonically increasing only if f is convex.
f is convex iff f′′(x) ≥ 0 for all x ∈ (a, b)
We’ve shown that f is convex iff f′ is monotonically increasing, and theorem 5.11 tells us that f′ is monotonically
increasing iff f′′(x) ≥ 0 for all x ∈ (a, b) (assuming f′′ exists on (a, b)).
Exercise 5.15
Note on the bounds of f and its derivatives
When Rudin asks us to assume that |f|, |f′|, and |f′′| have upper bounds of M0, M1, and M2, it appears
that we must assume that these bounds are finite. Otherwise the function f(x) = x would be a counterexample
to the claim that M1^2 ≤ 4·M0·M2.
Proof that M1^2 ≤ 4·M0·M2 for real-valued functions
Choose any h > 0. Using Taylor’s theorem (theorem 5.15) we can express f(x + 2h) as

f(x + 2h) = f(x) + 2h·f′(x) + (4h^2/2)·f′′(ξ) = f(x) + 2h·f′(x) + 2h^2·f′′(ξ)     for some ξ ∈ (x, x + 2h)

which can be algebraically rearranged to give us

f′(x) = f(x + 2h)/(2h) − f(x)/(2h) − h·f′′(ξ)

|f′(x)| = | f(x + 2h)/(2h) − f(x)/(2h) − h·f′′(ξ) | ≤ | f(x + 2h)/(2h) | + | f(x)/(2h) | + | h·f′′(ξ) |

We’re given upper bounds for |f(x)| and |f′′(x)|; these bounds give us the inequality

|f′(x)| ≤ M0/h + h·M2                                                    (38)

This inequality must hold for all x, even when |f′(x)| is arbitrarily close to its supremum M1, so we have

M1 ≤ M0/h + h·M2

We can multiply both sides by h and then algebraically rearrange this into a quadratic inequality in h:

h^2·M2 − h·M1 + M0 ≥ 0                                                   (39)

The roots of the corresponding quadratic equation are

h = ( M1 ± sqrt( M1^2 − 4·M0·M2 ) ) / (2·M2)                             (40)

We want to make sure that there are not two distinct real solutions to (40): we need (39) to hold for all h > 0, and
if (40) had two distinct real solutions then we would have h^2·M2 − h·M1 + M0 < 0 on some interval of positive
values of h (both roots in (40) would be positive, since their sum and product are positive). To make sure that
there is at most a single real solution we need the discriminant in (40) to be either zero (one solution) or negative
(no real solutions). This occurs exactly when

M1^2 ≤ 4·M0·M2
Does this apply to vector-valued functions?
Yes. Let f be a vector-valued function that is twice-differentiable on (a, ∞) and is defined by

f(x) = ( f1(x), f2(x), . . . , fn(x) )

Assume that |f|, |f′|, and |f′′| have finite upper bounds of (respectively) M0, M1, and M2. If we evaluate f at
x + 2h we have

f(x + 2h) = ( f1(x + 2h), f2(x + 2h), . . . , fn(x + 2h) )

Taking the Taylor expansion of each of these components, we have

f(x + 2h) = ( f1(x) + 2h·f1′(x) + 2h^2·f1′′(ξ1), . . . , fn(x) + 2h·fn′(x) + 2h^2·fn′′(ξn) )
          = f(x) + 2h·f′(x) + 2h^2·( f1′′(ξ1), f2′′(ξ2), . . . , fn′′(ξn) )

This tells us that equation (38) holds for vector-valued functions. But nothing in the proof following equation
(38) requires us to assume that f is real-valued instead of vector-valued. So the proof following (38) still suffices
to prove that

M1^2 ≤ 4·M0·M2
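A quick numerical illustration of the inequality (my own addition; the sample function and grid are arbitrary choices): for f(x) = sin(x) on (0, ∞) we have M0 = M1 = M2 = 1, and the snippet below estimates the three suprema on a grid and checks that M1^2 ≤ 4·M0·M2.

    import math

    xs = [0.001 * k for k in range(1, 100_000)]      # a grid on (0, 100)
    M0 = max(abs(math.sin(x)) for x in xs)            # approximates sup |f|
    M1 = max(abs(math.cos(x)) for x in xs)            # approximates sup |f'|
    M2 = max(abs(-math.sin(x)) for x in xs)           # approximates sup |f''|
    print(M0, M1, M2, M1 ** 2 <= 4 * M0 * M2)         # approximately 1, 1, 1, True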
Exercise 5.16 proof 1
In exercise 5.15 we established the inequality

|f′(x)| ≤ | f(x + 2h)/(2h) | + | f(x)/(2h) | + | h·f′′(ξ) |

We are given an upper bound of M2 for |f′′(ξ)|, so this inequality can be expressed as

|f′(x)| ≤ | f(x + 2h)/(2h) | + | f(x)/(2h) | + h·M2

We’re told that f is twice-differentiable, so we know that both f and f′ are continuous (theorem 5.2), so we can
take the limit of both sides of this inequality as x → ∞:

lim_{x→∞} |f′(x)| ≤ lim_{x→∞} [ | f(x + 2h)/(2h) | + | f(x)/(2h) | + h·M2 ]

We’re told that f(x) → 0 as x → ∞, so this becomes

lim_{x→∞} |f′(x)| ≤ lim_{x→∞} h·M2 = h·M2

This must be true for all h > 0, so we can take the limit of both sides as h → 0 (we can do this because both
sides are continuous functions of h):

lim_{h→0} lim_{x→∞} |f′(x)| ≤ lim_{h→0} lim_{x→∞} h·M2

The left-hand side of the previous inequality is independent of h and therefore doesn’t change; the right-hand
side becomes 0:

lim_{x→∞} |f′(x)| = lim_{h→0} lim_{x→∞} |f′(x)| ≤ lim_{h→0} lim_{x→∞} h·M2 = 0

lim_{x→∞} |f′(x)| ≤ 0

This shows that f′(x) → 0 as x → ∞, which is what we were asked to prove.
Exercise 5.16 proof 2
In exercise 5.15 we established the inequality

M1^2 ≤ 4·M0·M2

where each of the M terms represents a supremum over the interval (a, ∞). This inequality was proven to hold
for every a such that f is twice-differentiable on (a, ∞). To show more explicitly that the value of
these M terms depends on a, we might express the previous inequality as

( sup_{x>a} |f′(x)| )^2 ≤ 4 ( sup_{x>a} |f(x)| ) ( sup_{x>a} |f′′(x)| )

Each of these suprema is a monotone function of a, so we can take the limit of both sides as a → ∞:

lim_{a→∞} ( sup_{x>a} |f′(x)| )^2 ≤ lim_{a→∞} 4 ( sup_{x>a} |f(x)| ) ( sup_{x>a} |f′′(x)| )

We’re told that |f′′(x)| has an upper bound of M2, and we’re told that f(x) → 0 as x → ∞, so that
sup_{x>a} |f(x)| → 0 as a → ∞. The last inequality therefore allows us to conclude that

lim_{a→∞} ( sup_{x>a} |f′(x)| )^2 ≤ lim_{a→∞} 4 ( sup_{x>a} |f(x)| ) · M2 = 0

For any x > a, |f′(x)| is at most the supremum sup_{y>a} |f′(y)|, so this last limit forces

lim_{x→∞} |f′(x)| = 0

This shows that f′(x) → 0 as x → ∞, which is what we were asked to prove.
Exercise 5.17
We’re told that f is three-times differentiable on the interval (−1, 1), so we can take the Taylor expansions of
f(−1) and f(1) around x = 0:

f(−1) = f(0) + f′(0)(−1) + f′′(0)(−1)^2/2 + f′′′(ξ1)(−1)^3/6,        ξ1 ∈ (−1, 0)
f(1)  = f(0) + f′(0)(1)  + f′′(0)(1)^2/2  + f′′′(ξ2)(1)^3/6,         ξ2 ∈ (0, 1)

When we evaluate f(1) − f(−1), many of these terms cancel out and we’re left with

f(1) − f(−1) = 2f′(0) + ( f′′′(ξ1) + f′′′(ξ2) )/6,        ξ1 ∈ (−1, 0), ξ2 ∈ (0, 1)

We’re given the values of f(1), f(−1), and f′(0), so this last equation becomes

1 = ( f′′′(ξ1) + f′′′(ξ2) )/6,        ξ1 ∈ (−1, 0), ξ2 ∈ (0, 1)

which is algebraically equivalent to

f′′′(ξ2) = 6 − f′′′(ξ1),        ξ1 ∈ (−1, 0), ξ2 ∈ (0, 1)

If f′′′(ξ1) ≤ 3 then f′′′(ξ2) ≥ 3, and vice-versa: so f′′′(x) ≥ 3 for at least one of x = ξ1 or x = ξ2 in (−1, 1). And
this is what we were asked to prove.
Exercise 5.18
The nth derivative of f(t)
The exercise gives us the following formula for f(t):

f(t) = f(β) − (β − t)Q(t)

Since β is a fixed constant, the first two derivatives of f with respect to t are

f′(t) = Q(t) − (β − t)Q′(t)
f′′(t) = 2Q′(t) − (β − t)Q′′(t)

And, in general, we have

f^(n)(t) = n·Q^(n−1)(t) − (β − t)Q^(n)(t)

which, after multiplying by (β − α)^n / n! and setting t = α, becomes

[ f^(n)(α) / n! ] (β − α)^n = [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n − [ Q^(n)(α) / n! ] (β − α)^(n+1)        (41)
Modifying the Taylor formula
The formula for the Taylor expansion of f(β) around the point α (theorem 5.15) includes a P(β) term defined
as

P(β) = Σ_{k=0}^{n−1} [ f^(k)(α) / k! ] (β − α)^k

If we isolate the k = 0 term this becomes

P(β) = f(α) + Σ_{k=1}^{n−1} [ f^(k)(α) / k! ] (β − α)^k

We can now use equation (41) to express the terms of the summation as functions of Q:

P(β) = f(α) + Σ_{k=1}^{n−1} [ Q^(k−1)(α) / (k − 1)! ] (β − α)^k − Σ_{k=1}^{n−1} [ Q^(k)(α) / k! ] (β − α)^(k+1)

If we extract the k = 1 term from the leftmost summation and the k = n − 1 term from the rightmost summation,
we have

P(β) = f(α) + Q(α)(β − α) + Σ_{k=2}^{n−1} [ Q^(k−1)(α) / (k − 1)! ] (β − α)^k
       − Σ_{k=1}^{n−2} [ Q^(k)(α) / k! ] (β − α)^(k+1) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

We can then re-index the leftmost summation (replacing k with k + 1) to obtain

P(β) = f(α) + Q(α)(β − α) + Σ_{k=1}^{n−2} [ Q^(k)(α) / k! ] (β − α)^(k+1)
       − Σ_{k=1}^{n−2} [ Q^(k)(α) / k! ] (β − α)^(k+1) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

The two summations cancel one another, leaving us with

P(β) = f(α) + Q(α)(β − α) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

If we replace Q(α) with the definition of Q given in the exercise, we see that this previous equation evaluates to

P(β) = f(α) + [ ( f(α) − f(β) ) / (α − β) ] (β − α) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

This simplifies to

P(β) = f(α) + ( f(β) − f(α) ) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n = f(β) − [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

A simple algebraic rearrangement of these terms gives us

f(β) = P(β) + [ Q^(n−1)(α) / (n − 1)! ] (β − α)^n

which is the equation we were asked to derive.
Exercise 5.19a
The given expression for Dn is algebraically equivalent to

Dn = [ ( f(βn) − f(0) ) / (βn − 0) ] · [ βn / (βn − αn) ] + [ ( f(0) − f(αn) ) / (0 − αn) ] · [ −αn / (βn − αn) ]

We really want to be able to evaluate this by taking the limit of each side of this previous equation in the
following manner:

lim_{n→∞} Dn = lim_{n→∞} ( f(βn) − f(0) ) / (βn − 0) · lim_{n→∞} βn / (βn − αn)
             + lim_{n→∞} ( f(0) − f(αn) ) / (0 − αn) · lim_{n→∞} −αn / (βn − αn)                          (42)

There are two conditions that must be met in order for this last step to be justified:
condition 1: theorem 3.3 tells us that each of the limits must actually exist (and must not be ±∞)
condition 2: and theorem 4.2 tells us that we must have lim f(αn) = lim f(βn) = f(0).
The fact that αn < 0 < βn guarantees that 0 < βn/(βn − αn) < 1 and 0 < −αn/(βn − αn) < 1, which tells us
that 2 of the 4 limits in (42) exist. The other two limits exist because they’re equal to f′(0), and we’re told
that f′(0) exists. Therefore condition 1 is met. The fact that f′(0) exists tells us that f is continuous at x = 0
(theorem 5.2) and therefore condition 2 is met (theorem 4.2). Therefore we’re justified in taking the limits in
(42), giving us

lim_{n→∞} Dn = f′(0) · lim_{n→∞} βn / (βn − αn) + f′(0) · lim_{n→∞} −αn / (βn − αn)
             = lim_{n→∞} f′(0) · (βn − αn) / (βn − αn)
             = f′(0)
Exercise 5.19b
The given expression for Dn is algebraically equivalent to

Dn = [ ( f(βn) − f(0) ) / (βn − 0) ] · [ βn / (βn − αn) ] + [ ( f(αn) − f(0) ) / (αn − 0) ] · [ 1 − βn / (βn − αn) ]

We want to evaluate this by taking the limits of the individual terms as we did in part (a):

lim_{n→∞} Dn = lim_{n→∞} ( f(βn) − f(0) ) / (βn − 0) · lim_{n→∞} βn / (βn − αn)
             + lim_{n→∞} ( f(αn) − f(0) ) / (αn − 0) · lim_{n→∞} [ 1 − βn / (βn − αn) ]                    (43)

In order for this to be justified we must once again meet the two conditions mentioned in part (a). We’re told
that 0 < βn/(βn − αn) < M for some M, which tells us that 2 of the 4 limits in (43) exist. The other
two limits exist because they’re equal to f′(0), and we’re told that f′(0) exists. Therefore condition 1 is met.
The fact that f′(0) exists tells us that f is continuous at x = 0 (theorem 5.2) and therefore condition 2 is met
(theorem 4.2). Therefore we’re justified in taking the limits in (43), giving us

lim_{n→∞} Dn = f′(0) · lim_{n→∞} βn / (βn − αn) + f′(0) · lim_{n→∞} [ 1 − βn / (βn − αn) ]
             = f′(0) · lim_{n→∞} [ βn / (βn − αn) + 1 − βn / (βn − αn) ]
             = f′(0)
Exercise 5.19c proof 1
Define the sequence {hn} where hn = βn − αn. We can now express Dn as

Dn = ( f(αn + hn) − f(αn) ) / hn

We know that αn → 0 and βn → 0 as n → ∞, so clearly hn → 0 as n → ∞. We’re also told that f′ is continuous
on (−1, 1), so we know that f′ is defined on this interval. Therefore we have

lim_{n→∞} Dn = lim_{n→∞} lim_{h→0} ( f(αn + h) − f(αn) ) / h = lim_{n→∞} f′(αn)

We’re told that f′ is continuous on the interval (−1, 1) and that lim αn = 0, so by theorem 4.2 we have

lim_{n→∞} Dn = lim_{n→∞} f′(αn) = f′(0)
Exercise 5.19c proof 2
The mean value theorem (theorem 5.10) allows us to construct a sequence {γn} as follows: for each n, choose
some γn ∈ (αn, βn) such that

Dn = ( f(βn) − f(αn) ) / (βn − αn) = f′(γn)                              (44)

Each γn is in the interval (αn, βn). We’re told that lim αn = lim βn = 0. Therefore we can use the squeeze
theorem to determine lim γn:

0 = lim_{n→∞} αn ≤ lim_{n→∞} γn ≤ lim_{n→∞} βn = 0

which means that lim γn = 0. We’re told that f′ is continuous, so by theorem 4.2 we can take the limit of
equation (44):

lim_{n→∞} Dn = lim_{n→∞} f′(γn) = f′(0)
Exercise 5.20
Exercise 5.21
We saw in exercise 2.29 that any open set in R1 can be represented as a countable union of disjoint open
segments. Let {(ai, bi)} represent such a countable collection of disjoint segments whose union is the complement
of E. Define f : R1 → R1 as

f(x) = 0                          if x ∈ E
f(x) = (x − bi)^2                 if x ∈ (ai, bi) and ai = −∞
f(x) = (x − ai)^2                 if x ∈ (ai, bi) and bi = ∞
f(x) = (x − ai)^2 (x − bi)^2      if x ∈ (ai, bi) and −∞ < ai < bi < ∞

It’s easy but tedious to verify that this is a continuous function which is differentiable and that f(x) = 0 iff
x ∈ E. Note also that the derivative satisfies f′(x) = 0 at every x ∈ E. For the second part of the
exercise we can define a function f to be

f(x) = 0                                  if x ∈ E
f(x) = (x − bi)^(n+1)                     if x ∈ (ai, bi) and ai = −∞
f(x) = (x − ai)^(n+1)                     if x ∈ (ai, bi) and bi = ∞
f(x) = (x − ai)^(n+1) (x − bi)^(n+1)      if x ∈ (ai, bi) and −∞ < ai < bi < ∞
It’s similarly easy but tedious to verify that this is a continuous function that is n times differentiable and
that f(x) = 0 iff x ∈ E. It can also be seen that when k ≤ n we have f^(k)(x) = 0 at every x ∈ E. Finally6, we can
pretend that we’ve defined the exponential function and define f to be

f(x) = 0                                        if x ∈ E
f(x) = exp( −1 / (x − bi)^2 )                   if x ∈ (ai, bi) and ai = −∞
f(x) = exp( −1 / (x − ai)^2 )                   if x ∈ (ai, bi) and bi = ∞
f(x) = exp( −1 / [ (x − ai)^2 (x − bi)^2 ] )    if x ∈ (ai, bi) and −∞ < ai < bi < ∞

It’s fairly easy to confirm that this function is continuous and that f(x) = 0 iff x ∈ E. Looking at the
derivative with respect to x on each open segment, we have

f′(x) = 0                                                                                      if x ∈ E
f′(x) = [ 2 / (x − bi)^3 ] exp( −1 / (x − bi)^2 )                                              if x ∈ (ai, bi) and ai = −∞
f′(x) = [ 2 / (x − ai)^3 ] exp( −1 / (x − ai)^2 )                                              if x ∈ (ai, bi) and bi = ∞
f′(x) = [ 2/( (x − ai)^3 (x − bi)^2 ) + 2/( (x − ai)^2 (x − bi)^3 ) ] exp( −1 / [ (x − ai)^2 (x − bi)^2 ] )
                                                                                               if x ∈ (ai, bi) and −∞ < ai < bi < ∞
To calculate the limit of f′(x) as x approaches the endpoint bi in the first case, we rewrite it as a ratio and use
L’Hopital’s rule:

lim_{x→bi} f′(x) = lim_{x→bi} 2(x − bi)^{−3} / exp( 1 / (x − bi)^2 )

The numerator and denominator of this last term both tend to ±∞, so L’Hopital’s rule is applicable.
Repeated applications of L’Hopital’s rule will eventually give us a bounded term in the numerator and a term
that tends to ±∞ in the denominator, so we see that lim_{x→bi} f′(x) = 0. Similar results hold for the limits at the
endpoints in the other two cases. The general idea is that the exponential factor exp( −1/(x − bi)^2 ) tends to
zero faster than any rational function of (x − bi) can blow up as x → bi, so any product of the form
(rational function)·exp( −1/(x − bi)^2 ) tends to zero. Every term of every derivative of f(x) consists only of
rational-function multiples of the exponential term, so it will hold for all k that

lim_{x→ai} f^(k)(x) = lim_{x→bi} f^(k)(x) = 0

It seems clear in a vague, poorly-defined Calc II-ish way that f is infinitely differentiable and that, for all n,
f^(n)(x) = 0 at every x ∈ E. I have no idea how to prove this.
6 This example was provided by Boris Shekhtman, University of South Florida.
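The key analytic fact used above — that exp(−1/(x − bi)^2) beats any rational blow-up — can be seen numerically. The snippet below is my own illustration (the endpoint b = 0 and the sample exponents are arbitrary): it evaluates (x − b)^(−k) · exp(−1/(x − b)^2) as x approaches b and watches it collapse to 0.

    import math

    b = 0.0                                   # stands in for an endpoint b_i; chosen arbitrarily
    for k in (1, 3, 5):                       # rational blow-ups of increasing severity
        for x in (0.5, 0.2, 0.1, 0.05):
            value = (x - b) ** (-k) * math.exp(-1.0 / (x - b) ** 2)
            print(k, x, value)                # for each k the values shrink rapidly toward 0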
Exercise 5.22a
Assume that f(x1) = x1 and f(x2) = x2 with x1 ≠ x2. Then

( f(x2) − f(x1) ) / (x2 − x1) = (x2 − x1) / (x2 − x1) = 1

By theorem 5.10 this means that f′(x) = 1 for some x between x1 and x2. This contradicts the presumption
that f′(x) ≠ 1, so our initial assumption must be wrong: there cannot be two fixed points.
Exercise 5.22b
For f(t) to have a fixed point it would be necessary that f(t) = t, in which case we would have

t = t + (1 − e^t)^{−1}    −→    1/(1 − e^t) = 0

This statement is not true for any t, so f has no fixed point.
Exercise 5.22c
Choose an arbitrary value for x1 and let {xn} be the sequence recursively defined by xn+1 = f(xn). We’re told
that |f′(t)| ≤ A for all real t. So for any n, the mean value theorem gives us

| ( f(xn−1) − f(xn−2) ) / (xn−1 − xn−2) | ≤ A    −→    |f(xn−1) − f(xn−2)| ≤ A |xn−1 − xn−2|

which, since f(xn) = xn+1, gives us

|xn − xn−1| ≤ A|xn−1 − xn−2|

But this holds for all n. So:

|xn − xn−1| ≤ A|xn−1 − xn−2| ≤ A^2|xn−2 − xn−3| ≤ A^3|xn−3 − xn−4| ≤ · · · ≤ A^{n−2}|x2 − x1|

We can use this general formula to bound the difference between xn+k and xn:

|xn+k − xn| = |(xn+k − xn+k−1) + (xn+k−1 − xn+k−2) + · · · + (xn+1 − xn)|
            ≤ |xn+k − xn+k−1| + |xn+k−1 − xn+k−2| + · · · + |xn+1 − xn|
            ≤ ( A^{n+k−2} + A^{n+k−3} + · · · + A^{n−1} ) |x2 − x1|
            ≤ [ A^{n−1} / (1 − A) ] |x2 − x1|

where the last step sums the geometric series, using A < 1. Taking the limit of each side of this inequality as
n → ∞ gives us

lim_{n→∞} |xn+k − xn| ≤ lim_{n→∞} [ A^{n−1} / (1 − A) ] |x2 − x1| = 0

And this is just the Cauchy criterion for convergence of the sequence {xn}. We’re converging in R1, so by
theorem 3.11 we know that the sequence converges to some element x ∈ R1:

∃x ∈ R : lim_{n→∞} xn = x

But then xn+1 = f(xn) also converges to x, and f is continuous (since it is differentiable), so

f(x) = f( lim_{n→∞} xn ) = lim_{n→∞} f(xn) = lim_{n→∞} xn+1 = x

and therefore x is a fixed point of f.
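A small Python sketch of this contraction argument (my own addition; the particular contraction f(t) = cos(t), the starting point, and the tolerance are arbitrary choices — any f with |f′| ≤ A < 1 near its fixed point behaves the same way):

    import math

    def f(t):
        # |f'(t)| = |sin(t)| stays below 1 on a neighborhood of the fixed point, so the iteration contracts
        return math.cos(t)

    x = 1.0                                   # arbitrary starting value x1
    for n in range(100):
        x_next = f(x)
        if abs(x_next - x) < 1e-12:           # successive differences shrink geometrically
            break
        x = x_next
    print(x, abs(f(x) - x))                   # approximately 0.7390851332, with f(x) - x essentially 0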
Exercise 5.23a
If x < α, then we can express x as x = α − δ for some δ > 0. This gives us

f(x) = f(α − δ)                                              definition of x
     = ( (α − δ)^3 + 1 ) / 3                                 definition of f
     = ( α^3 − 3α^2·δ + 3α·δ^2 − δ^3 + 1 ) / 3               algebra
     = (α^3 + 1)/3 + (−3α^2·δ + 3α·δ^2)/3 − δ^3/3            algebra
     = f(α) − αδ(α − δ) − δ^3/3                              algebra
     = f(α) − αδx − δ^3/3                                    definition of x
     = α − αδx − δ^3/3                                       α is a fixed point

We know that x < α < −1, so αx > 1 and therefore αδx > δ. From this we have an inequality:

f(x) = α − αδx − δ^3/3 < α − δ − δ^3/3 < α − δ = x           definition of x

This establishes that x < α → f(x) < x. Therefore if x1 < α then the sequence {xn} will be a decreasing
sequence, and it cannot converge: if it converged, its limit would be a fixed point of f less than α, and we’re
told that f has no fixed points less than α.
Exercise 5.23b
The first derivative of f is f′(x) = x^2. This is nonnegative, so we know that f is monotonically increasing
(theorem 5.11). The second derivative is f′′(x) = 2x. This is negative for x < 0 and positive for x > 0: therefore
f is strictly convex on (0, γ) and strictly concave on (α, 0). Now let xk be chosen from the interval (α, γ).
Case 1: xk ∈ (α, 0)
Let xk ∈ (α, 0) be chosen. We can express this xk as xk = λα + (1 − λ)·0 for some λ ∈ (0, 1). The second derivative
f′′(x) = 2x is negative on the interval (α, 0) and so the function is concave on this interval (corollary of exercise
5.14). Therefore we have

xk+1 = f(xk)                             definition of xk+1
     = f(λα + (1 − λ)·0)                 definition of xk
     > λf(α) + (1 − λ)f(0)               f is concave on (α, 0)
     = λα + (1 − λ)(1/3)                 α is a fixed point, f(0) = 1/3
     = xk + (1 − λ)(1/3)                 definition of xk
     > xk

We see that xk < xk+1; from our choice of xk we have α < xk; from the monotonic nature of f we have
xk < β → xk+1 = f(xk) < f(β) = β. Combining these inequalities yields

α < xk < xk+1 < β

Therefore our initial choice of xk gives us an increasing sequence {xn} with an upper bound of β. Therefore it
converges to some fixed point in the interval (xk, β], and this point must be β.
Case 2: xk ∈ (β, γ)
Let xk ∈ (β, γ) be chosen. We can express this as xk = λβ + (1 − λ)γ for some λ ∈ (0, 1). The second derivative
f′′(x) = 2x is positive on the interval (β, γ) and so the function is strictly convex on this interval (exercise 5.14).
Therefore we have

xk+1 = f(xk)                             definition of xk+1
     = f(λβ + (1 − λ)γ)                  definition of xk
     < λf(β) + (1 − λ)f(γ)               f is strictly convex on (β, γ)
     = λβ + (1 − λ)γ                     β and γ are fixed points
     = xk                                definition of xk

We see that xk+1 < xk; from our choice of xk we have xk < γ; from the monotonic nature of f we have
β < xk → β = f(β) < f(xk) = xk+1. Combining these inequalities yields

β < xk+1 < xk < γ

Therefore our initial choice of xk gives us a decreasing sequence {xn} with a lower bound of β. Therefore it
converges to some fixed point in the interval [β, xk), and this point must be β.
Case 3: xk ∈ (0, β)
Let xk ∈ (0, β) be chosen. Since 0 < xk < β < γ, we can express β as β = μxk + (1 − μ)γ for some μ ∈ (0, 1). The
second derivative f′′(x) = 2x is positive on the interval (0, γ), so the function is strictly convex there (exercise
5.14), and applying convexity to the points xk and γ gives us

β = f(β)                                 β is a fixed point
  = f(μxk + (1 − μ)γ)                    definition of μ
  < μf(xk) + (1 − μ)f(γ)                 f is strictly convex on (0, γ)
  = μxk+1 + (1 − μ)γ                     definition of xk+1, γ is a fixed point

Subtracting (1 − μ)γ = β − μxk from both sides and dividing by μ gives us xk+1 > xk. From our choice of xk we
have 0 < xk; from the monotonic nature of f we have xk < β → xk+1 = f(xk) < f(β) = β. Combining these
inequalities yields

0 < xk < xk+1 < β

Therefore our initial choice of xk gives us an increasing sequence {xn} with an upper bound of β. Therefore it
converges to some fixed point in the interval (xk, β], and this point must be β.
Case 4: xk = β or xk = 0
If xk = β then every term of {xn } is β, so this sequence clearly converges to β. If xk = 0 then xk+1 = f (0) = (1/3)
and the remainder of the sequence {xn } converges to β by one of the previous cases.
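A quick numerical look at this iteration (my own addition; the starting points are arbitrary samples from the cases above) shows the sequence xn+1 = (xn^3 + 1)/3 settling on the middle fixed point β ≈ 0.3473 whenever x1 lies strictly between α and γ:

    def f(x):
        return (x ** 3 + 1) / 3

    for x1 in (-1.5, -0.5, 0.0, 0.2, 1.0, 1.5):   # sample starting points inside (alpha, gamma)
        x = x1
        for _ in range(200):                       # iterate x_{n+1} = f(x_n)
            x = f(x)
        print(x1, round(x, 6))                     # each run ends near beta = 0.347296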
Exercise 5.23c
If x > γ then we can express x as x = γ + δ for some δ > 0. This gives us

f(x) = f(γ + δ)                                              definition of x
     = ( (γ + δ)^3 + 1 ) / 3                                 definition of f
     = ( γ^3 + 3γ^2·δ + 3γ·δ^2 + δ^3 + 1 ) / 3               algebra
     = (γ^3 + 1)/3 + (3γ^2·δ + 3γ·δ^2)/3 + δ^3/3             algebra
     = f(γ) + γδ(γ + δ) + δ^3/3                              algebra
     = f(γ) + γδx + δ^3/3                                    definition of x
     = γ + γδx + δ^3/3                                       γ is a fixed point

We know that γ > 1 and x > 1, so γδx > δ. From this we have an inequality:

f(x) = γ + γδx + δ^3/3 > γ + δ + δ^3/3 > γ + δ = x           definition of x

This establishes that x > γ → f(x) > x. Therefore if x1 > γ then the sequence {xn} will be an increasing
sequence, and it cannot converge: if it converged, its limit would be a fixed point of f greater than γ, and we’re
told that f has no fixed points greater than γ.
Exercise 5.24
The function f(x) has a derivative of zero at its fixed point √a, so when xk and xk+1 are both close to √a the
mean value theorem guarantees us that f(xk) and f(xk+1) will be very near one another:

|f(xk) − f(xk+1)| = |(xk − xk+1) f′(ξ)| ≈ 0

The left-hand term of this equation converges very rapidly to 0 because both |xk − xk+1| and f′(ξ) are converging
toward zero. The function g(x) does not have a derivative of zero at its fixed point and therefore does not have
this property (although, as we saw in exercise 3.17, it still converges, albeit more slowly).
Exercise 5.25a
Each xn+1 is chosen to be the point where the line tangent to the graph of f at (xn, f(xn)) crosses the x-axis.
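For concreteness, here is a minimal Python sketch of the iteration being described (my own addition; the sample function f(x) = x^2 − 2, its derivative, and the starting point are arbitrary choices that satisfy the exercise’s hypotheses on an interval around the zero):

    def f(x):
        return x ** 2 - 2          # simple example with a single zero xi = sqrt(2) in (1, 2)

    def fprime(x):
        return 2 * x               # f' >= delta > 0 and f'' >= 0 on (1, 2)

    x = 2.0                        # choose x1 > xi, as in part (b)
    for n in range(6):
        x = x - f(x) / fprime(x)   # x_{n+1} = x_n - f(x_n)/f'(x_n)
        print(n + 1, x, abs(x - 2 ** 0.5))   # the error roughly squares at each step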
Exercise 5.25b
Lemma 1: xn+1 < xn if xn > ξ
We’re told that f(b) > 0 and that f(x) = 0 only at x = ξ. We know that f is continuous since it is differentiable
(theorem 5.2), so by the intermediate value theorem (theorem 4.23) we know that f(x) > 0 when x > ξ (otherwise
we’d have f(x) = 0 for some second x ∈ (ξ, b)). We’re also told that f′(x) ≥ δ > 0 for all x ∈ (a, b) (without the
δ it might be the case that f′(x) ≠ 0 but lim f′(x) = 0). Therefore the ratio f(xn)/f′(xn) is defined at each xn
and is positive when xn > ξ. Therefore we have

xn − f(xn)/f′(xn) < xn        when xn > ξ

This of course means that xn+1 < xn when xn > ξ.
Lemma 2: xn+1 ≥ ξ if xn > ξ
We’re told that f′′(x) ≥ 0, which means that f′(x) is monotonically increasing, which means that c < xn implies
f′(c) ≤ f′(xn), which means that

( f(xn) − f(ξ) ) / (xn − ξ) ≤ f′(xn)

because the LHS of this inequality is equal to f′(c) for some c ∈ (ξ, xn) by the mean value theorem. Using the
fact that f(ξ) = 0 and a bit of algebraic manipulation, this inequality is equivalent to

xn − f(xn)/f′(xn) ≥ ξ

which of course means that xn+1 ≥ ξ.
Lemma 3: if {xn} → κ then f(κ) = 0
Suppose that lim_{n→∞} xn = κ. Then lim_{n→∞} (xn − xn+1) = 0, which is equivalent to

lim_{n→∞} [ xn − ( xn − f(xn)/f′(xn) ) ] = 0,        or        lim_{n→∞} f(xn)/f′(xn) = 0

For this to hold it must either be the case that f(xn) → 0 or f′(xn) → ±∞: but we know that f′(xn)
is bounded from the fact that f′′ is bounded (mean value theorem: if f′(xn) were unbounded, then
( f′(xn) − f′(x1) ) / (xn − x1) = f′′(c) would be unbounded). Therefore it must be the case that f(xn) → 0. So we
have lim_{n→∞} xn = κ and lim_{n→∞} f(xn) = 0. Therefore by theorem 4.2 we have

f(κ) = lim_{x→κ} f(x) = 0
The actual proof
Choose x1 > ξ. By induction, using lemma 2, we know that every element of {xn } will be > ξ. Therefore,
by lemma 1, xn+1 < xn for all n. This means that {xn } is a decreasing sequence with a lower bound of ξ.
Therefore, by theorem 3.14, the sequence converges to some point κ. By lemma 3 we have f (κ) = 0. But we’re
told that f has only one zero, so it must be the case that κ = ξ. This means that {xn } converges to ξ.
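A minimal numerical sketch of this behaviour (a hypothetical example satisfying the exercise's hypotheses, not
from the text): f(x) = x² − 2 on [1, 3] has f(1) < 0 < f(3), f′ ≥ 2 > 0 and 0 ≤ f″ ≤ 2, and starting at x1 > ξ = √2
the Newton iterates decrease monotonically to ξ, as lemmas 1 and 2 predict.

import math

f = lambda x: x * x - 2
df = lambda x: 2 * x
xi = math.sqrt(2)

x = 3.0                       # x1 > xi
for n in range(1, 7):
    x = x - f(x) / df(x)      # x_{n+1} = x_n - f(x_n)/f'(x_n)
    print(f"n={n}: x = {x:.12f}   x - xi = {x - xi:.3e}")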
Exercise 5.25c
Using Taylor's theorem to expand f about the point xn and evaluating at ξ, we have

f(ξ) = f(xn) + f′(xn)(ξ − xn) + f″(tn)(ξ − xn)²/2

Subtracting f(xn) from both sides and then dividing by f′(xn) (we're told f′(x) ≥ δ > 0) gives us

[f(ξ) − f(xn)]/f′(xn) = (ξ − xn) + f″(tn)(ξ − xn)²/[2f′(xn)]

Rearranging some terms and recognizing that f(ξ) = 0, we have

xn − f(xn)/f′(xn) − ξ = f″(tn)(xn − ξ)²/[2f′(xn)]

which, by the definition of xn+1, is equivalent to

xn+1 − ξ = f″(tn)(xn − ξ)²/[2f′(xn)]
Exercise 5.25d
We're told that f′(x) ≥ δ and f″(x) ≤ M, so the identity in part (c) guarantees us that

xn+1 − ξ ≤ M(xn − ξ)²/(2δ) = A(xn − ξ)²

This allows us to recursively construct a chain of inequalities:

xn+1 − ξ ≤ A(xn − ξ)² ≤ A·A²(x_{n−1} − ξ)⁴ ≤ A·A²·A⁴(x_{n−2} − ξ)⁸ ≤ · · · ≤ A^{1+2+···+2^{n−1}} (x1 − ξ)^{2^n}

Collapsing the exponents of the rightmost term, we have

xn+1 − ξ ≤ A^{2^n − 1} (x1 − ξ)^{2^n} = (1/A) [A(x1 − ξ)]^{2^n}
Exercise 5.25e
How does g behave near ξ?
We’re told f and f 0 are differentiable, and clearly x is differentiable, therefore g is differentiable (theorem 5.3).
Using the quotient rule (theorem 5.3 again), the derivative of g is given by
g′(x) = 1 − [f′(x)² − f(x)f″(x)]/f′(x)² = f(x)f″(x)/f′(x)²

Taking the absolute value of each side, we have

|g′(x)| = |f(x)f″(x)|/f′(x)²

Using the inequality we established in part (d) we have

|g′(x)| ≤ A|f(x)|

When we take the limit of each side of this inequality as x → ξ, the RHS tends to A|f(ξ)| = 0 because f is
continuous:

lim_{x→ξ} |g′(x)| ≤ lim_{x→ξ} A|f(x)| = 0

which immediately implies

lim_{x→ξ} g′(x) = 0
and this describes the behavior of g 0 as x → ξ, which is what we were asked to describe.
Show that Newton’s method involves finding a fixed point of g
This is a slightly modified version of lemma 3 for part (b). Suppose that we have found a fixed point κ such
that g(κ) = κ. From the definition of g this would mean that
g(κ) − κ = −f(κ)/f′(κ) = 0

For this to hold it must either be the case that f(κ) = 0 or f′(κ) = ±∞: but we know that f′ is bounded, from
the fact that f″ is bounded (mean value theorem: if f′ were unbounded, then [f′(x) − f′(y)]/(x − y) = f″(c) would be
unbounded for some x, y). Therefore it must be the case that f(κ) = 0, and therefore κ is the unique
point at which f vanishes, namely κ = ξ.
Exercise 5.25f
We'll consider the more general case in which f(x) = x^m; in this case the single real zero occurs at x = 0.
Newton's formula gives us the step function

x_{n+1} = x_n − f(x_n)/f′(x_n) = x_n − x_n^m/(m x_n^{m−1}) = x_n − x_n/m = [(m − 1)/m] x_n

This gives us a recursive definition of x_{n+1}:

x_{n+1} = [(m − 1)/m] x_n = [(m − 1)/m]² x_{n−1} = [(m − 1)/m]³ x_{n−2} = · · · = [(m − 1)/m]^n x_1

Taking the limit of each side as n → ∞, we have

lim_{n→∞} x_{n+1} = lim_{n→∞} [(m − 1)/m]^n x_1        (45)

In order to have {x_n} converge to 0 we must either have x_1 = 0 or |(m − 1)/m| < 1. The latter condition is
equivalent to

|m − 1| < |m|

which holds iff m > 1/2. If m > 1/2 the limit in equation (45) exists and equals zero, so {x_n} → 0; if m < 1/2
then |(m − 1)/m| > 1 and the right-hand side of equation (45) is unbounded, so {x_n} diverges; if m = 1/2 then
equation (45) becomes

lim_{n→∞} x_{n+1} = lim_{n→∞} (−1)^n x_1

which clearly doesn't exist when x_1 ≠ 0.

In the specific case given in the exercise we have m = 1/3, and therefore the sequence {x_n} fails to converge.
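A tiny numerical sketch of this behaviour (assumed code, not from the text): for f(x) = x^{1/3} each Newton step
multiplies the iterate by (m − 1)/m = −2, so the sequence oscillates in sign and doubles in magnitude at every
step.

import math

def newton_step_cuberoot(x):
    # f(x) = signed cube root of x;  f'(x) = (1/3) |x|^(-2/3)
    fx = math.copysign(abs(x) ** (1.0 / 3.0), x)
    dfx = (1.0 / 3.0) * abs(x) ** (-2.0 / 3.0)
    return x - fx / dfx       # equals -2 * x

x = 1.0
for n in range(1, 8):
    x = newton_step_cuberoot(x)
    print(f"n={n}: x = {x:g}")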
Exercise 5.26
Following the hint given in the exercise, let M0 = sup |f (x)| and let M1 = sup |f 0 (x)| for x ∈ [a, x0 ]. We know
that
|f (x)| ≤ M1 (x0 − a)
(46)
because otherwise, at a point x ∈ [a, x0] where |f(x)| > M1(x0 − a), the mean value theorem would give us

M1 < |f(x)|/(x0 − a) ≤ |f(x) − f(a)|/(x − a) = |f′(c)|  for some c ∈ (a, x)
which is clearly contradictory. Additionally, we know that
M1 = sup |f 0 (x)| ≤ sup |Af (x)| = A sup |f (x)| = AM0
(47)
Combining (46) and (47) gives us
|f (x)| ≤ M1 (x0 − a) ≤ AM0 (x0 − a) , x ∈ [a, x0 ]
(48)
From the fact that f (a) = 0 we know that M1 (x0 − a) is the maximum possible value that f could possibly
obtain at f (x0 ) (otherwise, if f (x0 ) > M1 (x0 −a), then the mean value theorem would give us some f 0 (c) > M1 ).
In comparison, M0 is the maximum value that f actually does obtain at some x ∈ [a, b]. Therefore we have
M0 ≤ M1 (x0 − a)
(49)
Suppose 0 < A(x0 − a) < δ < 1. Then (48) would give us
|f (x)| ≤ M1 (x0 − a) ≤ AM0 (x0 − a) ≤ δM0
which contradicts (49) unless M1 = M0 = 0. And since we can force 0 < A(x0 − a) < 1 by choosing an
appropriate x0 , it must be the case that M1 = M0 = 0. This turns (48) into
|f (x)| ≤ 0 , x ∈ [a, x0 ]
which shows that f(x) = 0 on the interval [a, x0]. We can now repeat these steps on the interval [x0, b]. We
need only perform this a finite number of times (at most [(b − a)/(x0 − a)] + 1 times) before
we have covered the entire interval. Therefore f(x) = 0 on the entire interval [a, b].
Exercise 5.27
The given hint is pretty much the entire solution. Let y1 and y2 be two solutions to the initial-value problem
and define the function f (x) = y2 (x) − y1 (x). The function f meets all of the prerequisites spelled out in exercise
5.26: it’s differentiable, f (a) = 0, and we’re told that there is a real number A such that
|f 0 (x)| = |y20 − y10 | = |φ(x, y2 ) − φ(x, y1 )| ≤ A|y2 − y1 | = A|f (x)|
Therefore, by exercise 5.26, we have y2(x) − y1(x) = f(x) = 0 for all x; therefore y1 = y2. But y1 and y2 were
arbitrary solutions of the initial-value problem, so the solution is unique.
Exercise 5.28
Let ya and yb be two solution vectors to the initial-value problem and define the vector-valued function f(x) =
yb(x) − ya(x). As in the previous problem, we see that f(x) is differentiable and that f(a) = 0. Therefore, if
there is a real number A such that

|f′(x)| = |yb′ − ya′| = |φ(x, yb) − φ(x, ya)| ≤ A|yb − ya| = A|f(x)|

then, by exercises 5.26 and 5.27, the initial-value problem has a unique solution.
Exercise 5.29
Note: there’s probably a more subtle answer here, probably involving Taylor’s theorem.
Let v and w be two solution vectors to the initial-value problem and define the vector-valued function
f(x) = w(x) − v(x):

f(x) = (w1(x) − v1(x), w2(x) − v2(x), . . . , w_{k−1}(x) − v_{k−1}(x), w_k(x) − v_k(x))        (50)

The derivative of this is given by

f′(x) = (w1′(x) − v1′(x), w2′(x) − v2′(x), . . . , w_{k−1}′(x) − v_{k−1}′(x), w_k′(x) − v_k′(x))

which, by the equivalences given in the exercise, becomes

f′(x) = ( w2(x) − v2(x), w3(x) − v3(x), . . . , w_k(x) − v_k(x), Σ_{j=1}^{k} g_j(x)(w_j(x) − v_j(x)) )        (51)

From exercises 5.26-5.28 we know that we want an inequality of the form |f′(x)| ≤ A|f(x)|. Using (50) and
(51) we see that this inequality will hold if there exists some A such that

|(w1 − v1, w2 − v2, . . . , w_{k−1} − v_{k−1}, w_k − v_k)| ≤ A |(w2 − v2, w3 − v3, . . . , w_k − v_k, Σ_{j=1}^{k} g_j(w_j − v_j))|

The order of the components doesn't matter when we take the norm, and the last component of the right-hand
vector is a dot product, so this is equivalent to

|(w2 − v2, . . . , w_k − v_k, w1 − v1)| ≤ A |(w2 − v2, . . . , w_k − v_k, g(x)·(w(x) − v(x)))|
Exercise 6.1 (Proof 1)
Outline of the proof
We'll see that L(P, f) = 0 for every partition; therefore sup L(P, f) = 0. By choosing an appropriate partition
we can make U(P, f) arbitrarily small; therefore inf U(P, f) = 0. We conclude that ∫ f dα = 0.
sup L(P,f ) = 0
Although it’s easy to construct a partition such that L(P, f ) = 0, we have to show that 0 is the supremum of the
set of all L(P, f ). To do this, let P be an arbitrary partition of [a, b] and let mi be an arbitrary interval of this
arbitrary partition. If mi contains any point other than x0 then inf f (x) = 0 on this interval, so mi ∆αi = 0. If
mi contains only the point x0 (that is, if the interval is [x0 , x0 ]) then ∆αi = [α(x0 ) − α(x0 )] = 0 and therefore
mi ∆αi = 0. Therefore mi ∆αi = 0 for all i, which means L(P, f ) = 0. But P was an arbitrary partition, so
L(P, f ) = 0 for all P . Therefore
0 = sup{L(P, f) : P is a partition of [a, b]} = ∫_a^b f dα
inf U(P,f )=0
We need to show that for any ε > 0 we can find some partition P such that 0 ≤ U(P, f) < ε. So let ε > 0 be given.
We're told that α is continuous at x0, so by the definition of continuity we can find some δ > 0 such that

d(x, x0) < δ → d(α(x), α(x0)) < ε/2        (52)

Let 0 < µ < δ/2 and let P be the partition {a, x0 − µ, x0 + µ, b}. We now calculate each Mi ∆αi:

M1 ∆α1 = M1 [α(x0 − µ) − α(a)] = 0 · [α(x0 − µ) − α(a)] = 0
M2 ∆α2 = M2 [α(x0 + µ) − α(x0 − µ)] = 1 · [α(x0 + µ) − α(x0 − µ)]
M3 ∆α3 = M3 [α(b) − α(x0 + µ)] = 0 · [α(b) − α(x0 + µ)] = 0

To bound M2 ∆α2 we apply (52), which allows us to conclude that α(x0 + µ) − α(x0 − µ) < ε from the fact that
d(x0 − µ, x0 + µ) = 2µ < δ:

M2 ∆α2 = α(x0 + µ) − α(x0 − µ) < ε

From this we have

U(P, f) = Σ_{i=1}^{3} Mi ∆αi < 0 + ε + 0 = ε

But ε is arbitrarily small, so

0 = inf{U(P, f) : P is a partition of [a, b]} = ∫_a^b f dα
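The shrinking upper sums are easy to see numerically (an assumed concrete choice of f, α and x0, not from the
text): take [a, b] = [0, 1], x0 = 0.5, α(x) = x, and f = 1 at x0, 0 elsewhere.

# For P = {a, x0 - mu, x0 + mu, b}, the upper sum is alpha(x0+mu) - alpha(x0-mu),
# which shrinks with mu, while every lower sum is 0.
alpha = lambda x: x
a, b, x0 = 0.0, 1.0, 0.5

for mu in (0.1, 0.01, 0.001, 1e-6):
    upper = 1 * (alpha(x0 + mu) - alpha(x0 - mu))   # M2 * delta_alpha2;  M1 = M3 = 0
    print(f"mu = {mu:g}:  U(P, f) = {upper:g},  L(P, f) = 0")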
Exercise 6.1 (Proof 2)
Thanks to Helen Barclay (hbarcla2@mail.usf.edu) for this proof. The function f is discontinuous at only one
point and α is continuous at that point, so f ∈ R(α) by theorem 6.10. Every interval of every partition of [a, b]
contains a point other than x0, so mi = 0 for every interval; therefore L(P, f) = 0 for every partition and the
lower integral sup_P L(P, f) is zero; therefore

∫_a^b f dα = sup_P L(P, f) = 0
Exercise 6.2
Outline of the proof
If we assume that f(x) ≠ 0 for some x ∈ [a, b], then we can construct a partition P for which L(P, f) > 0. From
this we conclude that sup L(P, f) > 0 and therefore ∫ f ≠ 0.
The proof
Suppose that f(x0) = κ > 0 for some x0 ∈ [a, b]. We're told that f is continuous on a closed set, therefore f is
uniformly continuous (theorem 4.19). Therefore there exists some δ > 0 such that |x0 − x| < δ → |f(x0) − f(x)| < κ/2.
Now let 0 < µ < δ and let P be the four-element partition {a, x0 − µ, x0 + µ, b}. For clarity, let X = (x0 − µ, x0 + µ).

Since |x0 − x| ≤ µ < δ for all x ∈ X, we have |f(x0) − f(x)| = |κ − f(x)| < κ/2 for all x ∈ X. For this
inequality to hold it must be the case that inf f(X) ≥ κ/2. This means that

L(P, f) = Σ mi ∆xi ≥ m2 ∆x2 = inf f(X) · [(x0 + µ) − (x0 − µ)] = inf f(X) · (2µ) ≥ (κ/2)(2µ) > 0

Since L(P, f) > 0 for this particular P, it must be the case that sup L(P, f) > 0 and therefore ∫ f dx ≠ 0.

We assumed that f(x) ≠ 0 for some x ∈ [a, b] and concluded that ∫ f dx ≠ 0: by contrapositive, if ∫ f dx = 0
then f(x) = 0 for all x ∈ [a, b].
Exercise 6.3
Outline of the proof
We’ll see that U (P, f ) − L(P, f ) is equivalent to Mi − mi on a single arbitrarily small interval around 0. To
make this difference arbitrarily small we will need to make sup f (x) − inf f (x) arbitrarily small by restricting x
to a sufficiently small neighborhood of 0: this is possible iff f is continuous at 0.
Lemma 1: For each of the β functions, the upper and lower Riemann sums depend entirely on
the intervals containing 0
Let P be an arbitrary partition of [−1, 1]. By theorem 6.4 we can assume, without loss of generality, that 0 is an
element of this partition. We define α to be the partition element immediately preceding 0 and define ω to be
the partition element immediately following 0 (so our partition has the form P = {x0 < x1 < . . . < α < 0 < ω <
. . . < xn < xn+1 }). Note that α and ω must both exist since P is a finite set, although it’s possible that α = −1
or ω = 1. For every xi−1 < xi < 0 we have β(xi−1 ) = β(xi ) = 0, therefore ∆β = 0, therefore Mi ∆β = mi ∆β = 0.
Similarly, for every 0 < xi−1 < xi we have β(xi−1 ) = β(xi ) = 1, therefore ∆β = 0, therefore Mi ∆β = mi ∆β = 0.
This shows that Mi ∆β = mi ∆β = 0 for every interval that doesn’t contain 0. Therefore the values of U (P, f )
and L(P, f ) depend entirely on the intervals [α, 0] and [0, ω]. This holds for arbitrary P .
The general form of Mi ∆β and mi ∆β on [α, 0] and [0, ω]
Let P be an arbitrary partition. Let Mα denote the supremum of f (x) on the interval [α, 0]; similarly define
mα , Mω , mω , etc. For all three of the β functions we have
Mα ∆βα = Mα [β(0) − β(α)] = Mα β(0)            (53)
Mω ∆βω = Mω [β(ω) − β(0)] = Mω [1 − β(0)]       (54)
mα ∆βα = mα [β(0) − β(α)] = mα β(0)             (55)
mω ∆βω = mω [β(ω) − β(0)] = mω [1 − β(0)]       (56)

Case 1: β(0) = 0

When β(0) = 0 we have

Σ Mi ∆β = Mα β(0) + Mω [1 − β(0)] = Mω
Σ mi ∆β = mα β(0) + mω [1 − β(0)] = mω
Proof that continuity of f implies integrability: let ε > 0 be given. We're told that f(0+) = f(0), so by definition
of continuity at this point there exists some δ > 0 such that

d(0, x) < δ → d(f(0), f(x)) < ε/2

Construct a partition P = {−1, 0, δ, 1}. The interval [0, δ] is compact, therefore by theorem 4.16 the points
f⁻¹(Mω) and f⁻¹(mω) both exist in [0, δ] and therefore:

d(f⁻¹(Mω), 0) < δ → d(Mω, f(0)) < ε/2
d(f⁻¹(mω), 0) < δ → d(mω, f(0)) < ε/2

and therefore, by the triangle inequality,

d(Mω, mω) ≤ d(Mω, f(0)) + d(f(0), mω) < ε

and therefore

U(P, f) − L(P, f) = Mω − mω < ε

And since ε was arbitrary, this is sufficient to show that ∫ f dβ exists.

Proof by contrapositive that integrability of f implies right-hand continuity: Assume that f(0+) ≠ f(0).
From the negation of the definition of continuity we therefore know that there exists some ε > 0 such that for
all δ we can find some 0 < x < δ such that |f(x) − f(0)| ≥ ε. So for any partition P let a be a point at which
f(a) = Mω and let b be a point for which |f(b) − f(0)| ≥ ε. This gives us f(b) ≤ f(a) = Mω and mω ≤ f(0),
therefore

U(P, f) − L(P, f) = Mω − mω = f(a) − mω ≥ f(a) − f(0) ≥ f(b) − f(0) ≥ ε

The partition P was arbitrary, so this inequality must be true for all possible partitions, so we can never find P
such that

U(P, f) − L(P, f) < ε

and therefore ∫ f dβ does not exist.
Proof that this integral, if it exists, is equal to f(0): from the fact that 0 is an element of [0, δ] we know that
Mω ≥ f(0). We also know that L(P, f) ≤ ∫ f dβ (by theorem 6.4 and/or the definition of ∫ f dβ). This gives us the
inequalities

Mω = U(P, f) ≥ f(0)
L(P, f) ≤ ∫ f dβ

from which we have

U(P, f) − L(P, f) ≥ f(0) − ∫ f dβ        (58)

But we also know that mω ≤ f(0) and that U(P, f) ≥ ∫ f dβ. This gives us

U(P, f) ≥ ∫ f dβ
mω = L(P, f) ≤ f(0)

from which we have

U(P, f) − L(P, f) ≥ ∫ f dβ − f(0)        (59)

Combining (58) and (59) gives us

| ∫ f dβ − f(0) | ≤ U(P, f) − L(P, f) < ε

If this is to be true for all ε > 0 we must have ∫ f dβ = f(0).
Case 2: β(0) = 1

When β(0) = 1 we have

Σ Mi ∆β = Mα β(0) + Mω [1 − β(0)] = Mα
Σ mi ∆β = mα β(0) + mω [1 − β(0)] = mα

From here the proof that ∫ f dβ = f(0) iff f(0) = f(0−) is almost identical to the previous proof, with α in the
place of ω.

Case 3: β(0) = 1/2

When β(0) = 1/2 we have

Σ Mi ∆β = Mα β(0) + Mω [1 − β(0)] = (1/2)[Mα + Mω]
Σ mi ∆β = mα β(0) + mω [1 − β(0)] = (1/2)[mα + mω]

Therefore

U(P, f) − L(P, f) = (1/2)[Mα − mα] + (1/2)[Mω − mω]

The function f is integrable iff this term can be made arbitrarily small; we saw in case (1) that [Mω − mω] can
be made arbitrarily small iff f(0) = f(0+), and we saw in case (2) that [Mα − mα] can be made arbitrarily small
iff f(0) = f(0−). Therefore we can make U(P, f) − L(P, f) arbitrarily small in this case iff f(0+) = f(0−) = f(0);
that is, iff f is continuous at 0.
Proof that this integral, if it exists, is equal to f(0): from the fact that 0 is an element of both [α, 0] and [0, δ]
we know that Mα ≥ f(0) and Mω ≥ f(0). We also know that L(P, f) ≤ ∫ f dβ (by theorem 6.4 and/or the definition
of ∫ f dβ). This gives us the inequalities

(1/2)[Mα + Mω] = U(P, f) ≥ (1/2)[f(0) + f(0)] = f(0)
L(P, f) ≤ ∫ f dβ

from which we have

U(P, f) − L(P, f) ≥ f(0) − ∫ f dβ        (60)

But we also know that mω ≤ f(0) and mα ≤ f(0) and that U(P, f) ≥ ∫ f dβ. This gives us

U(P, f) ≥ ∫ f dβ
(1/2)[mα + mω] = L(P, f) ≤ (1/2)[f(0) + f(0)] = f(0)

from which we have

U(P, f) − L(P, f) ≥ ∫ f dβ − f(0)        (61)

Combining (60) and (61) gives us

| ∫ f dβ − f(0) | ≤ U(P, f) − L(P, f) < ε

If this is to be true for all ε > 0 we must have ∫ f dβ = f(0).
Part (d) of the exercise

If f is continuous at 0 then we have f(0) = f(0+) = f(0−), so by case (1) we have ∫ f dβ1 = f(0), by case
(2) we have ∫ f dβ2 = f(0), and by case (3) we have ∫ f dβ3 = f(0).
Exercise 6.4
Let P be an arbitrary partition of [a, b]. Let n = |P | − 1, so that n represents the number of intervals in the
partition P . Every interval will contain at least one rational number (theorem 1.20b) and at least one irrational
number (by pigeonhole principle, since |(xi , xi+1 )| = |R| > |Q|). Therefore we have Mi = 1, mi = 0 for all i.
This gives us

U(P, f) = Σ_{i=1}^{n} Mi ∆xi = Σ_{i=1}^{n} 1 · (xi − x_{i−1}) = xn − x0 = b − a

L(P, f) = Σ_{i=1}^{n} mi ∆xi = Σ_{i=1}^{n} 0 · (xi − x_{i−1}) = 0

But P was an arbitrary partition, so this holds for all partitions, and therefore

sup L(P, f) = 0,    inf U(P, f) = b − a > 0

and therefore f ∉ R.
Exercise 6.5
Is f ∈ R if f 2 ∈ R?
No. Consider the function

f(x) = −1 if x ∈ Q,    f(x) = 1 if x ∉ Q

The argument of exercise 6.4 shows that f ∉ R (every interval of every partition gives Mi = 1 and mi = −1),
but clearly f²(x) = 1, which is in R.
Is f ∈ R if f³ ∈ R?

Yes. The function φ(x) = x^{1/3} is a continuous function, so we let h(x) = φ(f³(x)) = f(x) and appeal to theorem
6.11 to claim that

f³ ∈ R → h ∈ R → f ∈ R

Note that the claim that h(x) = φ(f³(x)) = f(x) relied on the fact that x → x³ is a one-to-one mapping, so
that (x³)^{1/3} = x. In contrast, the mapping x → x² is not one-to-one, as can be seen by the fact that
((−1)²)^{1/2} ≠ −1.
Exercise 6.6
Outline of the proof
We can define a partition for which f is discontinuous on 2n intervals each of which has length 3−n , so the
Riemann sum across these intervals is proportional to (2/3)n . This sum approaches 0 as n → ∞.
The proof
Let En be defined as in sec. 2.44. Note that E0 consists of one interval of length 1; E1 consists of two intervals
of length 1/3; E2 consists of 4 intervals of length 1/9; and, in general, En will consist of 2^n intervals of length
3^{−n}. With each En we can associate a set Fn that contains the endpoints of the intervals contained in En.
That is, we define

Fn = {a_1 < b_1 < a_2 < b_2 < . . . < a_{2^n} < b_{2^n}}

where each [a_i, b_i] is an interval in En. Since every point of the Cantor set – and therefore every point at which
f might be discontinuous – is in an interval of the form [a_i, b_i], we'll choose a partition that lets us isolate these
intervals.
Let m and M represent the lower and upper bounds of f on [0, 1]. Choose an arbitrary ε > 0. Choose n
large enough that

3^{n−1} > 2^{n+1}(M − m)/ε

and choose δ such that

0 < δ < 1/3^{n+1}

These choices will be justified later. Define Pn to be

Pn = {0 = a_1 < (b_1 + δ) < (a_2 − δ) < (b_2 + δ) < (a_3 − δ) < (b_3 + δ) < . . . < (a_{2^n} − δ) < b_{2^n} = 1}

This partition contains 2^{n+1} points and therefore 2^{n+1} − 1 segments. We must now show that we can
make U(Pn, f) − L(Pn, f) arbitrarily small. We have

U(Pn, f) − L(Pn, f) = Σ_{i=1}^{2^{n+1}−1} (M_i − m_i) ∆x_i

We separate this sum into the segments on which f is continuous (the even-numbered segments, of the form
[b_i + δ, a_{i+1} − δ]) and the segments which might contain discontinuities (the odd-numbered segments, of the
form [a_i − δ, b_i + δ]):

U(Pn, f) − L(Pn, f) = Σ_{even i} (M_i − m_i) ∆x_i + Σ_{odd i} (M_i − m_i) ∆x_i

The function f is continuous on every interval of the form [b_i + δ, a_{i+1} − δ], so by refining Pn on those
intervals we can make the first sum smaller than ε/2:

U(Pn, f) − L(Pn, f) ≤ ε/2 + Σ_{odd i} (M_i − m_i) ∆x_i

We know the exact value of the remaining ∆x terms: for Pn we have b_i − a_i = 3^{−n}, so each odd segment has
length at most 3^{−n} + 2δ. Since M_i ≤ M and m_i ≥ m, and there are 2^n such segments,

U(Pn, f) − L(Pn, f) ≤ ε/2 + 2^n (M − m)(3^{−n} + 2δ)

From our choice of δ < 1/3^{n+1} we have 3^{−n} + 2δ < 3^{−n} + 2·3^{−(n+1)} < 3^{−(n−1)}, so this becomes

U(Pn, f) − L(Pn, f) ≤ ε/2 + 2^n (M − m)/3^{n−1}

From our choice of n we have 3^{n−1} > 2^{n+1}(M − m)/ε and therefore 1/3^{n−1} < ε/[2^{n+1}(M − m)]:

U(Pn, f) − L(Pn, f) ≤ ε/2 + 2^n (M − m) · ε/[2^{n+1}(M − m)] = ε/2 + ε/2 = ε

Since ε was arbitrary, f ∈ R.
Exercise 6.7a
If f ∈ R, then from theorem 6.12 we have

∫_0^1 f dx = ∫_0^c f dx + ∫_c^1 f dx        (62)

Now consider the partition P of [0, c] defined by P = {0, c}. This partition has only a single interval, so we have

inf_{0<x<c} f(x) · c = L(P, f) ≤ ∫_0^c f dx ≤ U(P, f) = sup_{0<x<c} f(x) · c

From (62), adding ∫_c^1 f dx to each term of this inequality gives us

∫_c^1 f dx + inf_{0<x<c} f(x) · c ≤ ∫_0^1 f dx ≤ ∫_c^1 f dx + sup_{0<x<c} f(x) · c

Taking the limit of each term of this inequality as c → 0, we see that both outer correction terms tend to 0
(f is bounded, so |f(x) · c| is at most a constant times c) while the center term remains constant. The resulting
inequality is

lim_{c→0} ∫_c^1 f dx ≤ ∫_0^1 f dx ≤ lim_{c→0} ∫_c^1 f dx

which of course implies that

lim_{c→0} ∫_c^1 f dx = ∫_0^1 f dx

which is what we wanted to prove.
Exercise 6.7b
Outline of the proof

We construct a function f for which ∫_c^1 f dx is a partial sum of Σ (−1)^i/i: this is the alternating harmonic
series, which converges (see Rudin's remark 3.46). We then see that ∫_c^1 |f| dx is a partial sum of Σ 1/i: this is
the harmonic series, which diverges (theorem 3.28).

The proof

Choose any c ∈ (0, 1) and consider the following function defined on [c, 1]:

f(x) = (−1)^n (n + 1)  if 1/(n + 1) < x < 1/n for some n ∈ N,  and f(x) = 0 otherwise.
Claim 1 : f is integrable on any interval [c, 1]
Choose an arbitrarily small > 0. Let N1 be the smallest harmonic number greater than c. We want to choose
δ such that each harmonic number in [c, 1] is contained in a distinct interval of radius δ: so choose δ such that
for reasons that will become clear later. Let
0 < δ < 21 [ N1 − c]. We must also make sure that δ < 2(N +1)(N
1 +2)
Hc represent the partition containing the elements {c} ∪ n ± δ : n ∈ N, n < 1c ∩ [0, 1].
• The partition Hc contains one interval of the form [c, N1 − δ].
1
1
M1 ∆x1 = sup f (x)∆x1 = (−1) (N + 1)∆x1 = (−1) (N + 1)
−δ−
N
N +1
1
1
− δ = (−1)N
− (N + 1)δ
= (−1)N (N + 1)
N (N + 1)
N
N
N
The function is constant over this interval, so we have
m1 ∆x1 = inf f (x)∆x1 = sup f (x)∆x1 = M1
therefore
|M1 − m1 |∆x1 = 0
(63)
• The partition Hc contains one interval of the form [1 − δ, 1] for which
Mn ∆xn = sup f (x)∆xn = 0∆x = 0
mn ∆xn = inf f (x)∆xn = −2∆x = −2δ
therefore
|Mn − mn |∆xn = 2δ
• The partition contains N − 1 intervals of the form 1i − δ, 1i + δ for which
(64)
Mi ∆xi = sup f (x)∆xi ≤ sup |f (x)|2δ ≤ |i + 1|2δ
mi ∆xi = inf f (x)∆xi ≥ inf −|f (x)|2δ ≥ −|i + 1|2δ
therefore
|Mi − mi |∆xi ≤ |Mi | + |mi | ≤ 4δ|i + 1|
h
i
1
• The partition contains N − 2 intervals of the form j+1
+ δ, 1j − δ for which
Mj ∆xj = sup f (x)∆xj = (−1)j (j+1)
1
1
−δ−
−δ
j
j+1
= (−1)j (j+1)
(65)
1
− 2δ
j(j + 1)
= (−1)j
1
− 2δ(j + 1)
j
The function is constant over this interval, so we have
mj ∆xj = inf f (x)∆xj = sup f (x)∆x = M = (−1)
j
1
− 2δ(j + 1)
j
therefore
(Mj − mj )∆xj = 0
(66)
Summing equations (63)-(66), we have
|U (P, f ) − L(P, f )| =
|(M1 − m1 )∆x1 + (Mn − mn )∆xn +
N
N
−1
X
X
(Mi − mi )∆xi +
(Mj − mj )∆xj |
i=2
≤
|(M1 − m1 )|∆x1 + |(Mn − mn )|∆xn +
j=2
N
X
|Mi − mi |∆xi +
i=2
≤ 0 + 2δ +
N
X
4δ|i + 1| +
N
−1
X
i=2
=
=
N
−1
X
|Mj − Mj |∆xj
j=2
0
j=2
(N + 1)(N + 2) − 1
2
2δ(N + 1)(N + 2)
2δ + 4δ
2(N +1)(N +2) ,
Earlier in the proof we chose δ such that δ <
so we obtain the inequality
|U (P, f ) − L(P, f )| ≤ And was arbitrary, so this proves that f ∈ R.
Claim 2 : The value of limc→0
R1
c
f is defined
Adding the values for the M terms, we have
U (P, f )
=
M1 + Mn +
N
X
i=2
≤
=
(−1)
N
(−1)
N
Mi +
N
−1
X
Mj
i=2
N
−1
N
X
X
1
1
j
(−1)
|i + 1|2δ +
− (N + 1)δ + 0 +
− 2δ(j + 1)
N
j
j=2
i=2
N
−1
X
1
1
j
(−1)
− (N + 1)δ δ[(N + 1)(N + 2) − 6] +
− 2δ(j + 1)
N
j
j=2
Remember that we first chose a value for c, which forced us to use a particular value for N ; our choice of δ came
afterward, so we’re still free to select δ small enough to make this last inequality become:
N
−1
X
(−1)N
1
+
(−1)j + N
j
j=2
U (Pc , f ) ≤
We defined N1 to be the smallest harmonic number greater than c, so
taking the limit of both sides as c → 0 gives us
lim U (Pc , f ) =
c→0
∞
X
(−1)j
j=2
1
N
→ 0 and N → ∞ as c → 0. Therefore
1
+ = 1 − ln(2) + j
which proves that
Z
c→0
1
f dx ≥ 1 − ln(2)
lim
c
The values for the m terms are almost identical to the M terms, and adding them gives us the inequality
Z 1
lim
f dx ≤ 1 − ln(2)
c→0
c
Together, of course, these two inequalities prove that
Z 1
lim
f dx = 1 − ln(2)
c→0
c
Claim 3 : The value of limc→0
R1
c
|f | is not defined
If we follow the logic from claims 1 and 2, we find that adding the values for the M terms gives us
U (P, f )
=
M1 + Mn +
N
X
Mi +
i=2
Mj
i=2
N
N
−1 X
1 X
1
1
− (N + 1)δ + +
− 2δ(j + 1)
|i + 1|2δ +
N
2 i=2
j
j=2
N
−1 X
1
1
1
− (N + 1)δ + δ[(N + 1)(N + 2) − 6] +
− 2δ(j + 1)
N
2
j
j=2
≤
=
N
−1
X
Again following the logic of part 2, we select δ small enough to gives us
Z 1
∞
X
1
f dx =
lim
+
c→0 c
j
j=2
But weRknow from chapter 3 that this series doesn’t converge, so the lefthand limit doesn’t exist and therefore
1
limc→0 c f dx does not exist.
Exercise 6.8
Choose an arbitrary real number c > 1 and let n be the greatest integer such that n < c. Define the partition Pn
of [1, n] to be Pn = {x0 = 1, x1 = 2, x2 = 3, . . . , x_{n−1} = n}.

First, assume that Σ f(n) diverges. Because f(x) ≥ 0 this means that the partial sums of Σ f(n) are unbounded
(theorem 3.24). The fact that f is monotonically decreasing allows us to derive the following chain of inequalities:

∫_1^c f(x) dx ≥ ∫_1^n f(x) dx ≥ L(Pn, f) = Σ_{i=1}^{n−1} mi ∆xi = Σ_{i=1}^{n−1} f(i + 1)

(the partition Pn has n − 1 unit intervals [i, i + 1], and on each one the infimum of the decreasing function f is
f(i + 1)). Taking the limit as c → ∞, we have

∫_1^∞ f(x) dx = lim_{c→∞} ∫_1^c f(x) dx ≥ lim_{n→∞} Σ_{i=1}^{n−1} f(i + 1)

The right-hand side is unbounded (it is the series Σ f(n) minus its first term), so the integral does not converge.

Next assume that Σ f(n) converges to some κ ∈ R. Choose c and n as above and let P_{n+1} be the analogous
partition of [1, n + 1]. The fact that f is monotonically decreasing allows us to derive the following chain of
inequalities:

∫_1^c f(x) dx ≤ ∫_1^{n+1} f(x) dx ≤ U(P_{n+1}, f) = Σ_{i=1}^{n} Mi ∆xi = Σ_{i=1}^{n} f(i)

Taking the limit as c → ∞, we have

∫_1^∞ f(x) dx = lim_{c→∞} ∫_1^c f(x) dx ≤ lim_{n→∞} Σ_{i=1}^{n} f(i) = κ

We're told that f(x) ≥ 0, so ∫_1^c f(x) dx is a monotonically increasing function of c. The previous inequality
tells us that it is bounded above. Therefore, by theorem 3.14, lim_{c→∞} ∫_1^c f dx exists.
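The bracketing used in this proof is easy to see numerically (an assumed example with f(x) = 1/x², not from the
text): the lower sums sit below the integral, which sits below the upper sums.

f = lambda x: 1.0 / (x * x)

def integral(a, b, steps=100000):
    # simple midpoint rule, adequate for this illustration
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

N = 50
lower_sum = sum(f(i + 1) for i in range(1, N))   # sum of inf over each unit interval
upper_sum = sum(f(i) for i in range(1, N))       # sum of sup over each unit interval
I = integral(1, N)
print(f"sum f(i+1) = {lower_sum:.6f} <= int_1^{N} f = {I:.6f} <= sum f(i) = {upper_sum:.6f}")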
Exercise 6.9
By the definition given in exercise 6.8:

∫_0^∞ f(x) g′(x) dx = lim_{c→∞} ∫_0^c f(x) g′(x) dx

The integral from 0 to c is a proper integral, so we can use integration by parts:

∫_0^∞ f(x) g′(x) dx = lim_{c→∞} [ f(c)g(c) − f(0)g(0) − ∫_0^c f′(x) g(x) dx ]

Applying this to the given integral with f(x) = (1 + x)^{−1} and g(x) = sin x, we have f′(x) = −(1 + x)^{−2} and
g′(x) = cos x. Integration by parts therefore yields

∫_0^∞ cos x/(1 + x) dx = lim_{c→∞} [ sin c/(1 + c) − sin 0/(1 + 0) + ∫_0^c sin x/(1 + x)² dx ]

Since −1 ≤ sin c ≤ 1, the non-integral terms both tend to zero as c → ∞. This leaves us with

∫_0^∞ cos x/(1 + x) dx = lim_{c→∞} ∫_0^c sin x/(1 + x)² dx

By the definition in exercise 6.8 this is equivalent to

∫_0^∞ cos x/(1 + x) dx = ∫_0^∞ sin x/(1 + x)² dx
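A numeric comparison of the two sides (illustrative code; both improper integrals are truncated at a large c, so
they agree only up to an error of order 1/c):

import math

def midpoint(g, a, b, steps=200000):
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

c = 2000.0
lhs = midpoint(lambda x: math.cos(x) / (1 + x), 0.0, c)
rhs = midpoint(lambda x: math.sin(x) / (1 + x) ** 2, 0.0, c)
print(f"truncated at c={c}:  lhs = {lhs:.4f}   rhs = {rhs:.4f}")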
Exercise 6.10 a: method 1
This proof was due to Helen Barclay (hbarcla2@mail.usf.edu). We rewrite uv as the exponential of a natural
log:

uv = (u^p)^{1/p} (v^q)^{1/q} = exp( ln((u^p)^{1/p} (v^q)^{1/q}) ) = exp( (1/p) ln u^p + (1/q) ln v^q )

The exponential function f(x) = e^x has a strictly positive second derivative; therefore its first derivative is
monotonically increasing (theorem 5.11); therefore f(x) is a convex function (proof in exercise 5.14, definition
of convex in exercise 4.23). By definition of convexity with λ = 1/p, 1 − λ = 1/q we have

exp( (1/p) ln u^p + (1/q) ln v^q ) ≤ (1/p) exp(ln u^p) + (1/q) exp(ln v^q) = u^p/p + v^q/q

Combining these last two relations, we have

uv ≤ u^p/p + v^q/q

which is what we were asked to prove.
Exercise 6.10 a: method 2
Using the variable substitutions s = u^p, t = v^q, a = 1/p, (1 − a) = 1/q we can rewrite

uv ≤ u^p/p + v^q/q        (67)

as

s^a t^{1−a} ≤ a s + (1 − a) t

Multiplying both sides by s^{−a} t^{a−1} gives us

1 ≤ a s^{1−a} t^{a−1} + (1 − a) s^{−a} t^{a}

We can rewrite this previous inequality in two equivalent ways:

1 ≤ a (t/s)^{a−1} + (1 − a) (t/s)^{a}        (68)
1 ≤ a (s/t)^{1−a} + (1 − a) (s/t)^{−a}       (69)

If s ≤ t, set r = t/s ≥ 1 and consider h(r) = a r^{a−1} + (1 − a) r^{a}: we have h(1) = 1 and
h′(r) = a(1 − a) r^{a−2}(r − 1) ≥ 0 for r ≥ 1, so h(r) ≥ 1 and the inequality in (68) holds. If s ≥ t then
s/t ≥ 1, and the same argument applied to a(s/t)^{1−a} + (1 − a)(s/t)^{−a} shows that the inequality in (69)
holds. But both of these inequalities are just (67) after a change of variables, so we see that (67) holds when
s ≤ t and when s ≥ t – that is, it always holds. And this is what we were asked to prove.
Exercise 6.10 a: method 3
This proof was due to Boris Shektman (don't email him). Define the function f(u) to be

f(u) = u^p/p + v^q/q − uv

This function has the derivative

f′(u) = u^{p−1} − v

Note that f′(u) = 0 has only one positive real solution, which occurs at u = v^{1/(p−1)}. We can show algebraically
that 1/(p − 1) = q/p, so this critical point is u = v^{q/p}. For u < v^{q/p} we have f′(u) < 0 and for u > v^{q/p}
we have f′(u) > 0, so the point u = v^{q/p} is the unique global minimum of f(u). Evaluating the function at this
value of u gives us

f(v^{q/p}) = v^q/p + v^q/q − v^{q/p} v = (1/p + 1/q) v^q − v^{(p+q)/p}

We're given that 1/p + 1/q = 1 and, from this, we also know that p + q = pq. Making these substitutions gives us

f(v^{q/p}) = v^q − v^{pq/p} = v^q − v^q = 0

Therefore the unique minimum of f is f(v^{q/p}) = 0, and so f(u) ≥ 0 for all u > 0, which is the desired inequality.
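All three proofs establish the same elementary inequality, which is easy to spot-check numerically (illustrative
code, not part of the proofs):

import random

random.seed(0)
p = 3.0
q = p / (p - 1)                      # conjugate exponent, so 1/p + 1/q = 1
for _ in range(5):
    u, v = random.uniform(0, 10), random.uniform(0, 10)
    lhs = u * v
    rhs = u ** p / p + v ** q / q
    print(f"u={u:.3f} v={v:.3f}:  uv = {lhs:.3f} <= {rhs:.3f}")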
Exercise 6.10b
From part (a), we have

∫_a^b f g dα ≤ ∫_a^b [ f^p/p + g^q/q ] dα = (1/p) ∫_a^b f^p dα + (1/q) ∫_a^b g^q dα = 1/p + 1/q = 1
Exercise 6.10c
Define κ and λ to be

κ = ∫_a^b |f|^p dα,    λ = ∫_a^b |g|^q dα

Define f̂ and ĝ to be

f̂(x) = f(x)/κ^{1/p},    ĝ(x) = g(x)/λ^{1/q}

These two functions are “normalized” in the sense that

∫_a^b |f̂|^p dα = ∫_a^b (|f|^p/κ) dα = (1/κ) ∫_a^b |f|^p dα = κ/κ = 1

∫_a^b |ĝ|^q dα = ∫_a^b (|g|^q/λ) dα = (1/λ) ∫_a^b |g|^q dα = λ/λ = 1

We know that f̂ and ĝ are Riemann integrable by theorem 6.11, and therefore by theorem 6.13 and part (b) of
this exercise we know that |f̂| and |ĝ| are Riemann integrable and that

∫_a^b f̂ ĝ dα ≤ ∫_a^b |f̂||ĝ| dα ≤ 1

By the definition of f̂ and ĝ this becomes

∫_a^b f g/(κ^{1/p} λ^{1/q}) dα ≤ ∫_a^b |f||g|/(κ^{1/p} λ^{1/q}) dα ≤ 1

Multiplying through by the constant κ^{1/p} λ^{1/q} gives us

∫_a^b f g dα ≤ ∫_a^b |f g| dα ≤ κ^{1/p} λ^{1/q}

which, by the definition of κ and λ, is equivalent to

∫_a^b f g dα ≤ ∫_a^b |f g| dα ≤ { ∫_a^b |f|^p dα }^{1/p} { ∫_a^b |g|^q dα }^{1/q}

which is what we wanted to prove.
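A discretized sanity check of the inequality just proved (an assumed example with α(x) = x and arbitrarily chosen
f, g and p; not part of the solution):

import math

def midpoint(h, a, b, steps=100000):
    w = (b - a) / steps
    return sum(h(a + (k + 0.5) * w) for k in range(steps)) * w

a, b = 0.0, 1.0
p = 3.0
q = p / (p - 1)
f = lambda x: math.sin(x) + 2
g = lambda x: x

lhs = midpoint(lambda x: abs(f(x) * g(x)), a, b)
rhs = midpoint(lambda x: abs(f(x)) ** p, a, b) ** (1 / p) * \
      midpoint(lambda x: abs(g(x)) ** q, a, b) ** (1 / q)
print(f"int |fg| = {lhs:.6f} <= {rhs:.6f}")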
Exercise 6.10d
Proof by contrapositive. If the inequality were false for the improper integrals, we would be able to find functions
f and g for which

lim_{c→∞} ∫_a^c f g dα > lim_{c→∞} { ∫_a^c |f|^p dα }^{1/p} { ∫_a^c |g|^q dα }^{1/q}

or for which

lim_{c→x0} ∫_c^b f g dα > lim_{c→x0} { ∫_c^b |f|^p dα }^{1/p} { ∫_c^b |g|^q dα }^{1/q}

But these are strict inequalities, so in either case we could find some sufficiently large c (or some c sufficiently
close to x0) for which the corresponding inequality between proper integrals still held. And this would give us a
proper integral for which Hölder's inequality doesn't hold. By contrapositive, the fact that Hölder's inequality is
valid for proper integrals shows that it is also valid for improper integrals.
Exercise 6.11
If we can prove that ||f + g|| ≤ ||f|| + ||g||, then the given inequality follows by applying this with f − g and
g − h in place of f and g.

||f + g||² = ∫_a^b |f + g|² dα                                                   (definition of || · ||)
           ≤ ∫_a^b (|f| + |g|)² dα                                               (triangle inequality for | · |)
           = ∫_a^b ( |f|² + 2|f g| + |g|² ) dα
           = ∫_a^b |f|² dα + 2 ∫_a^b |f g| dα + ∫_a^b |g|² dα                    (properties of integrals: theorem 6.12)
           ≤ ∫_a^b |f|² dα + 2 { ∫_a^b |f|² dα }^{1/2} { ∫_a^b |g|² dα }^{1/2} + ∫_a^b |g|² dα    (Hölder's inequality, exercise 6.10)
           = ||f||² + 2 ||f|| ||g|| + ||g||²                                     (definition of || · ||)
           = ( ||f|| + ||g|| )²

Taking the square root of the first and last terms of this chain of inequalities, we have

||f + g|| ≤ ||f|| + ||g||

This inequality must hold for any f, g ∈ R(α). Replacing f by f − g and g by g − h, it becomes

||f − h|| ≤ ||f − g|| + ||g − h||

which is what we were asked to prove.
Exercise 6.12
We're told that f is continuous, so it has upper and lower bounds m ≤ f(x) ≤ M. Choose any ε > 0. Since
f ∈ R(α), we can find some partition P such that

U(P, f, α) − L(P, f, α) < ε²/(M − m)

Having now determined a particular partition P we can define g as suggested in the hint. This function
is simply a series of straight lines connecting f(x_{i−1}) to f(x_i). This becomes clearer if we rewrite g in an
algebraically equivalent form:

g(t) = f(x_{i−1}) + [f(x_i) − f(x_{i−1})]/(x_i − x_{i−1}) · (t − x_{i−1}),    t ∈ [x_{i−1}, x_i]

We see that on any interval [x_{i−1}, x_i] the function g(t) is bounded between f(x_{i−1}) and f(x_i). That is,

m_i ≤ min{f(x_{i−1}), f(x_i)} ≤ g(t) ≤ max{f(x_{i−1}), f(x_i)} ≤ M_i,    t ∈ [x_{i−1}, x_i]

And of course we have similar bounds on f:

m_i ≤ f(t) ≤ M_i

and therefore, since f(t), g(t) ≤ M_i and −f(t), −g(t) ≤ −m_i, we have

|f(t) − g(t)| ≤ |M_i − m_i| = M_i − m_i        (70)

Similarly, since M_i ≤ M and −m_i ≤ −m we have

M_i − m_i ≤ M − m        (71)

We've now established all of the inequalities we need to complete our proof:

||f − g||² = ∫_a^b |f(x) − g(x)|² dα                         (definition of || · ||)
           = Σ_i ∫_{x_{i−1}}^{x_i} |f(x) − g(x)|² dα          (integral property 6.12c)
           ≤ Σ_i ∫_{x_{i−1}}^{x_i} |M_i − m_i|² dα            (from (70))
           = Σ_i |M_i − m_i|² ∆α_i                            (integral property 6.12a)
           ≤ Σ_i (M − m)(M_i − m_i) ∆α_i                      (from (71))
           = (M − m) Σ_i (M_i − m_i) ∆α_i                     (integral property 6.12a)
           = (M − m)[U(P, f, α) − L(P, f, α)]                 (definition of U and L)
           ≤ (M − m) · ε²/(M − m) = ε²                        (we chose P so that this would hold)

Taking the square root of the first and last terms gives us

||f − g|| ≤ ε

which is what we were asked to prove.
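The construction is easy to visualize numerically (an assumed example with α(x) = x, not part of the proof):
interpolate f linearly through the partition points and watch ||f − g|| shrink as the partition is refined.

import math

f = lambda x: math.sin(3 * x)
a, b = 0.0, math.pi

def norm_of_difference(n_intervals, samples_per_interval=200):
    total = 0.0
    xs = [a + (b - a) * i / n_intervals for i in range(n_intervals + 1)]
    for i in range(n_intervals):
        x0, x1 = xs[i], xs[i + 1]
        h = (x1 - x0) / samples_per_interval
        for k in range(samples_per_interval):
            t = x0 + (k + 0.5) * h
            g = f(x0) + (f(x1) - f(x0)) / (x1 - x0) * (t - x0)   # the linear interpolant
            total += (f(t) - g) ** 2 * h
    return math.sqrt(total)

for n in (4, 8, 16, 32):
    print(f"{n:2d} intervals:  ||f - g|| = {norm_of_difference(n):.5f}")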
Exercise 6.13a
Letting u = t2 we have du/dt = 2t and therefore
dt =
du
du
= √
2t
2 u
Using the change of variables theorem (6.19) we see that the given integral is equivalent to
Z (x+1)2
sin u
√ du
f (x) =
2 u
x2
√
Integration by parts (theorem 6.22) with F = 1/2 u and g = sin u du gives us
− cos u
√
f (x) =
2 u
(x+1)2
Z
(x+1)2
−
x2
x2
cos u
du
4u3/2
which expands to
cos x2
− cos(x + 1)2
+
−
2(x + 1)
2x
By the triangle inequality this gives us
(x+1)2
Z
f (x) =
|f (x)| =
≤
<
=
=
=
=
− cos(x + 1)2
cos x2
+
−
2(x + 1)
2x
x2
Z
(x+1)2
x2
cos u
du
4u3/2
cos u
du
4u3/2
Z (x+1)2
− cos(x + 1)2
cos x2
cos u
+
+
du
2(x + 1)
2x
4u3/2
x2
Z (x+1)2
1
1
1
+
+
du
3/2
2(x + 1)
2x
4u
2
x
1
1
1
1
1
+
+
−
2(x + 1)
2x
2 x+1 x
1
1
1
+
+
2(x + 1) 2x 2x(x + 1)
x + (x + 1) + 1
2x(x + 1)
2(x + 1)
1
=
2x(x + 1)
x
(72)
Note that the strict inequality is justified by the fact that cos u ≤ 1 for all u ∈ [x2 , (x + 1)2 ] but cos is not a
constant function so cos u < 1 for some u ∈ [x2 , (x + 1)2 ].
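A numeric check of this bound is immediate (illustrative code with an assumed midpoint quadrature, not part of
the solution):

import math

def f(x, steps=200000):
    # f(x) = int_x^{x+1} sin(t^2) dt, approximated by the midpoint rule
    h = 1.0 / steps
    return sum(math.sin((x + (k + 0.5) * h) ** 2) for k in range(steps)) * h

for x in (1.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}:  |f(x)| = {abs(f(x)):.6f}  <  1/x = {1/x:.6f}")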
Exercise 6.13b
In the previous problem we used integration by parts to determine that
− cos(x + 1)2
cos x2
f (x) =
+
−
2(x + 1)
2x
(x+1)2
Z
cos u
4u3/2
x2
Multiplying by 2x gives us
−x cos(x + 1)2
2xf (x) =
+ cos x2 − 2x
x+1
Z
(x+1)2
x2
cos u
du
4u3/2
which is algebraically equivalent to
2xf (x) = cos x2 − cos[(x + 1)2 ] +
cos(x + 1)2
− 2x
x+1
Z
(x+1)2
x2
cos u
du
4u3/2
Letting r(x) be defined as
r(x) =
cos(x + 1)2
− 2x
x+1
Z
(x+1)2
x2
cos u
du
4u3/2
we have, by the triangle inequality,
|r(x)|
≤
≤
=
=
=
=
<
Z (x+1)2
cos(x + 1)2
cos u
+ 2x
du
3/2
x+1
4u
2
x
Z (x+1)2
1
1
+ 2x
du
x+1
4u3/2
x2
1
1
1
1
+ 2x
−
x+1
2 x+1 x
1
1
+x
x+1
x(x + 1)
1
1
+
x+1 x+1
2
x+1
2
x
So we see that
2xf (x) = cos x2 − cos[(x + 1)2 ] + r(x)
where |r(x)| < 2/x.
Exercise 6.13c
We’re asked to find the lim sup and lim inf of the function
xf (x) =
1
cos(x2 ) − cos([x + 1]2 ) + r(x)
2
We established in part (b) that r(x) → 0 as x → ∞. It’s also clear that this function is bounded above by 1
and bounded below by −1, but it’s not immediately clear that these bounds are the lim sup and the lim inf.
Remark: The supremum and infimum of cos(x2 ) − cos([x + 1]2 ) are never obtained
The supremum would be obtained if we could find x such that
cos(x2 ) − cos([x + 1]2 ) = 2
(73)
which would occur precisely when
x2 = 2kπ,
[x + 1]2 = (2j + 1)π,
j, k ∈ N
After some tedious algebra, we can see that these two equalities hold when
2
π[2(j − k) + 1] − 1
√
=k
2 2π
(74)
(75)
But this equation can’t hold. If it did, we could rearrange it algebraically to give us
π 2 [2(j − k) + 1]2 − 2π[2j + 3k + 1] + 1 = 0
This would allow us to use the quadratic formula to give an algebraic expression for π; but this is impossible as
π is a transcendental number. The infimum of cos(x2 ) − cos([x + 1]2 ) is not obtained for the same reason.
Proof: The lim sup of xf (x) is 1
Let 1 > > 0 and N ∈ N be given. Let δ be chosen so that
0<δ<
cos−1 (1 − )
2π
Define the function p(m) to be
p(m) =
Its derivative with respect to m is
p0 (m) =
√
π[2m + 1] − 1
√
2 2π
2
2π(π[2m + 1] − 1)
which is strictly increasing. Therefore we can make the derivative as large as we want by choosing a sufficiently
large m. Specifically, we are able to choose M ∈ N such that p0 (M )δ > 1 and M > N . By the mean value
theorem we have
p(M + δ) − p(M ) = p0 (ξ)δ, ξ ∈ (M, M + δ)
From the strictly increasing nature of p0 and our choice of M we have
p(M + δ) − p(M ) = p0 (ξ)δ > p0 (M )δ > 1
Therefore p(x) must take an integer value for at least one x ∈ [M, M + δ]. Let κ represent one such x. The
two important properties of κ are that p(κ) is an integer and that κ − M < δ (so that κ itself is “almost” an
integer). We now have
2
π[2(M + δ) + 1] − 1
√
=k∈N
p(κ) = p(M + δ) =
2 2π
If we define j to be j = k + M this becomes
2
π[2(j − k + δ) + 1] − 1
√
=k∈N
2 2π
Reversing the algebraic steps that led from (74) to (75) tells us that there exists some x ∈ R such that
x2 = 2kπ, [x + 1]2 = (2(j + δ) + 1)π,
j, k ∈ N
Using this value of x in our original function, we have
xf (x) =
1
1
cos(x2 ) − cos([x + 1]2 ) + r(x) = [cos(2kπ) − cos(2(j + 1)π + 2δπ) + r(x)]
2
2
Using trig identities, this becomes
xf (x) =
1
[1 − {cos(2(j + 1)π) cos(2π) − sin(2(j + 1)π) sin(2π)} + r(x)]
2
1
[1 − (−1) cos(2π) − 0 + r(x)]
2
Finally, from our original choice of δ as an inverse cosine, this becomes
=
xf (x) =
1
[1 + (1 − ) + r(x)]
2
1
[r(x) − ]
2
=1+
From part (b), we know that r(x) → 0 as x → ∞:
lim xf (x) = 1 −
x→∞
2
And was arbitrary, so the supremum is
lim sup xf (x) = 1
x→∞
The proof that lim inf xf (x) = −1 is similar.
Exercise 6.13d
Z
∞
2
sin(t ) dt
=
0
∞ Z
X
√
0
=
∞ Z
X
∞
X
(n+1)π
sin(t2 ) dt
(n+1)π
(−1)n | sin(t2 )| dt
nπ
(−1)n
0
≤
∞
X
(−1)n
0
=
(76)
nπ
√
√
0
=
√
Z √(n+1)π
√
| sin(t2 )| dt
nπ
Z √(n+1)π
√
1 dt
(and is similarly bounded below)
nπ
∞
√
√ X
√ π
(−1)n
n+1− n
(77)
0
(78)
√
√
We saw in exercise 3.6.a that the sequence { n + 1 − n} is decreasing. Therefore, by the alternating series
theorem (3.43) the series in (77) converges. Therefore, by the comparison test (3.25) the series in (76) converges
and therefore the integral in (76) converges.
Exercise 6.14a
Letting u = et we have du/dt = et = u and therefore
dt =
du
du
=
et
u
Using the change of variables theorem (6.19) we see that the given integral is equivalent to
Z
ex+1
f (x) =
ex
sin(u)
du
u
(79)
We use integration by parts as we did in 6.13(a).
f (x) =
− cos u
u
ex+1
ex+1
Z
−
ex
ex
cos u
du
u2
which expands to
f (x) =
cos ex
cos ex+1
−
−
x
e
ex+1
ex+1
Z
cos u
du
u2
ex
By the triangle inequality this gives us
|f (x)| =
≤
<
<
=
=
=
cos ex
cos ex+1
−
−
ex
ex+1
Z
ex+1
cos u
du
u2
ex
Z ex+1
cos ex
cos ex+1
cos u
+
+
du
ex
ex+1
u2
ex
Z ex+1
1
1
1
+ x+1 +
du
2
ex
e
u
x
e
1
1
1
1
+ x+1 + x − x+1
ex
e
e
e
1
1
e−1
+ x+1 + x+1
x
e
e
e
2e
ex+1
2
ex
Therefore ex |f (x)| < 2. Note that the strictness of the inequality is justified by the fact that cos u ≤ 1 for all
u ∈ [ex , ex+1 ] but cos is not a constant function so cos u < 1 for some u ∈ [ex , ex+1 ].
Exercise 6.14b
In the previous exercise we used integration by parts to determine that
f (x) =
cos ex
cos ex+1
−
−
x
e
ex+1
Z
ex+1
cos u
du
u2
ex
Multiplying by ex gives us
x
x
e f (x) = cos e − e
−1
x+1
cos e
−e
x
Z
ex+1
ex
Letting r(x) be defined as
r(x) = −e
x
Z
ex+1
ex
we have, by the triangle inequality,
cos u
du
u2
cos u
du
u2
|r(x)|
−e
≤
x
Z
ex+1
ex
−ex
=
Z
ex+1
ex
Z
x
ex+1
|e |
=
ex
cos u
du
u2
1
du
u2
1
du
u2
1
1
− x+1
ex
e
e−1
e
2
e
|ex |
=
=
<
So we see that
ex f (x) = cos ex − e−1 cos ex+1 + r(x)
where |r(x)| < 2e−1 .
Exercise 6.15a
Using integration by parts (theorem 6.22), with f²(x) and x as the two factors, gives us

1 = ∫_a^b f²(x) dx = [ x f²(x) ]_a^b − ∫_a^b 2 f(x) f′(x) x dx

which evaluates to

1 = b f²(b) − a f²(a) − ∫_a^b 2 f(x) f′(x) x dx

which, since f(a) = f(b) = 0, becomes

1 = − ∫_a^b 2 f(x) f′(x) x dx

Dividing both sides by −2 gives us

∫_a^b x f(x) f′(x) dx = −1/2

which is the desired equality.
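To see this identity in a concrete case (an assumed example, not from the text), take f(x) = √(2/π)·sin x on
[0, π], which satisfies f(0) = f(π) = 0 and ∫ f² dx = 1:

import math

a, b = 0.0, math.pi
c = math.sqrt(2 / math.pi)
f = lambda x: c * math.sin(x)
df = lambda x: c * math.cos(x)

def midpoint(g, a, b, steps=100000):
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

print("int f^2    =", round(midpoint(lambda x: f(x) ** 2, a, b), 6))
print("int x f f' =", round(midpoint(lambda x: x * f(x) * df(x), a, b), 6))   # should be -0.5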
Exercise 6.15b
Applying Hölder's inequality (with p = q = 2) to the equation from part (a) gives us

1/2 = | ∫_a^b x f(x) f′(x) dx | ≤ { ∫_a^b |x² f²(x)| dx }^{1/2} · { ∫_a^b |[f′(x)]²| dx }^{1/2}

Since f(x)f′(x) must be negative at some point (by the mean value theorem, since f(a) = f(b) = 0) while x² f²(x)
and [f′(x)]² are nonnegative, this inequality must be strict:

1/2 < { ∫_a^b |x² f²(x)| dx }^{1/2} · { ∫_a^b |[f′(x)]²| dx }^{1/2}

Squaring both sides of this inequality, we have

1/4 < ∫_a^b |x² f²(x)| dx · ∫_a^b |[f′(x)]²| dx

All of the integrands are squares, so the absolute values are redundant:

1/4 < ∫_a^b x² f²(x) dx · ∫_a^b [f′(x)]² dx
Exercise 6.16a
We can integrate the given function separately over infinitely many intervals of length 1:

s ∫_1^∞ [x]/x^{s+1} dx = s Σ_{n=1}^{∞} ∫_n^{n+1} [x]/x^{s+1} dx

We have [x] = n on the interval [n, n + 1), so this becomes

= Σ_{n=1}^{∞} s n ∫_n^{n+1} x^{−s−1} dx = Σ_{n=1}^{∞} s n [ −1/(s x^s) ]_{x=n}^{n+1} = Σ_{n=1}^{∞} n ( 1/n^s − 1/(n+1)^s )

Working with the partial sums, we can split the sum and reindex:

Σ_{n=1}^{N} n ( 1/n^s − 1/(n+1)^s ) = Σ_{n=1}^{N} n/n^s − Σ_{n=2}^{N+1} (n − 1)/n^s
                                     = 1 + Σ_{n=2}^{N} [n − (n − 1)]/n^s − N/(N+1)^s
                                     = Σ_{n=1}^{N} 1/n^s − N/(N+1)^s

Since N/(N+1)^s → 0 as N → ∞ (recall s > 1), letting N → ∞ gives

s ∫_1^∞ [x]/x^{s+1} dx = Σ_{n=1}^{∞} 1/n^s = ζ(s)
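A truncated numerical illustration of this identity (assumed code, not from the text): with s = 2, both
s ∫_1^N [x] x^{−s−1} dx and the partial sums of ζ(s) approach π²/6 as N grows.

import math

s = 2.0
N = 2000
# integrate exactly over each [n, n+1):  s*n * int x^{-s-1} dx = n*(n^-s - (n+1)^-s)
integral_part = sum(n * (n ** -s - (n + 1) ** -s) for n in range(1, N))
zeta_partial = sum(n ** -s for n in range(1, N))
print(f"s * int_1^{N} [x]/x^(s+1) dx = {integral_part:.6f}")
print(f"partial zeta({s})            = {zeta_partial:.6f}   (pi^2/6 = {math.pi**2/6:.6f})")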
Exercise 6.16b
We're asked to evaluate

s/(s − 1) − s ∫_1^∞ (x − [x])/x^{s+1} dx

Having determined in part (a) that [x]/x^{s+1} is integrable, we can split up the integral as follows:

= s/(s − 1) − s ∫_1^∞ x^{−s} dx + s ∫_1^∞ [x]/x^{s+1} dx

Elementary calculus allows us to calculate the middle term, s ∫_1^∞ x^{−s} dx = s/(s − 1), so this becomes

= s/(s − 1) − s/(s − 1) + s ∫_1^∞ [x]/x^{s+1} dx = s ∫_1^∞ [x]/x^{s+1} dx

which, as we saw in part (a), is equal to ζ(s).
Exercise 6.17
I’m going to change the notation of this problem a bit to make it clearer (to me, at least) by letting f = G and
f 0 = g. We’re told that α is a monotonically increasing function on [a, b] and that f is continuous. We’re asked
to prove that
Z b
Z b
0
α(x)f (x) dx = f (b)α(b) − f (a)α(a) −
f (x) dα
a
a
Most of the work is done for us by the theorem 6.22 (integration by parts) which tells us that
Z
b
α(x)f 0 (x) dx = f (b)α(b) − f (a)α(a) −
b
Z
a
f (x)α0 (x) dx
a
So our proof is reduced to simply proving that
b
Z
b
Z
f (x)α0 (x) dx
f (x) dα =
a
(80)
a
It’s tempting to appeal to elementary calculus to show that dα = α0 (x) dx, but we can prove this more formally.
Proving this more formally
Let > 0 be given. We’re told that f is continuous and that α is monotonically inreasing; therefore f ∈ R(α).
Let P be an arbitrary partition of [a, b] such that U (P, f, α) − L(P, f, α) < 2 .
On any interval [xi−1 , xi ] the mean value theorem tells us that there is some ti ∈ [xi−1 , xi ] such that
α(xi−1 ) − α(xi ) = α0 (ti )[xi−1 − xi ]
or, to express the same equation with different notation,
∆αi = α0 (ti )∆xi
(81)
By theorem 6.7 (c) we have
n
X
f (x) dα <
a
i=1
and also
n
X
b
Z
f (ti )∆αi −
f (ti )α0 (ti )∆xi −
2
b
Z
f (x)α0 (x) dx <
a
i=1
By the inequality (81) we have
n
X
n
X
f (ti )∆αi =
i=1
f (ti )α0 (ti )∆xi
i=1
Therefore, by (82) and (83) and the triangle inequality, we have
b
Z
b
Z
f (x)α0 (x) dx < f (x) dα −
a
a
which, in order to be true for any > 0, requires that
Z
b
Z
b
f (x) dα =
a
a
This proves (80), and this completes the proof.
f (x)α0 (x) dx
(82)
2
(83)
Exercise 6.18
The derivative γ10 (t) = ieit is continuous, therefore γ1 (t) is rectifiable (theorem 6.27) and its length is given by
Z 2π
Z 2π
1 = 2π
|ieit | =
0
0
The derivative γ20 (t) = 2ieit is continuous, therefore γ2 (t) is rectifiable and its length is given by
Z 2π
Z 2π
it
|2ie | =
2 = 4π
0
0
The arc γ3 is not rectifiable
Proof by contradiction. If γ3 were rectifiable then its length would be given by the integral
Z 2π
Z 2π 1
1
−1
0
2πi sin
|γ (t)| dt =
+ 2 2πit cos
e2πit sin(1/t) dt
t
t
t
0
0
Z 2π
1
1
1
=
2π sin
dt
(84)
− cos
t
t
t
0
But this integral isn’t defined, since Riemann integrals are defined only for bounded functions and this function
is unbounded:
1
0
0
= lim 2π |sin (2kπ) − 2kπ cos (2kπ)|
lim |γ (t) dt| = lim γ
t→0
k→∞
k→∞
2kπ
= lim 2π |±2kπ|
k→∞
= lim 4π 2 k = ∞
k→∞
Therefore the integral in (84) doesn’t exist, which means the arc length of γ3 is not defined, which means that
γ3 is not rectifiable.
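For a quick sanity check of the first two computations (assumed code, not in the text), the polygonal sums
Λ(P, γ) of definition 6.26 can be computed directly; they approach 2π for γ1 and 4π for γ2.

import cmath, math

def polygonal_length(gamma, a, b, n):
    pts = [gamma(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(abs(pts[k + 1] - pts[k]) for k in range(n))

gamma1 = lambda t: cmath.exp(1j * t)
gamma2 = lambda t: cmath.exp(2j * t)

for n in (10, 100, 1000):
    L1 = polygonal_length(gamma1, 0.0, 2 * math.pi, n)
    L2 = polygonal_length(gamma2, 0.0, 2 * math.pi, n)
    print(f"n={n:4d}:  Lambda(gamma1) = {L1:.5f} (2pi = {2*math.pi:.5f}),  "
          f"Lambda(gamma2) = {L2:.5f} (4pi = {4*math.pi:.5f})")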
Exercise 6.19
Lemma 1: If f : A → B and G : B → C are both one-to-one, then g ◦ f : A → C is one-to-one
By definition of “one-to-one”, we have
f (x) = f (y) → x = y,
g(s) = g(t) → s = t
Therefore
g(f (x)) = g(f (y)) → f (x) = f (y) → x = y
and so g ◦ f is one-to-one.
Lemma 2: If g ◦ f : A → C is one-to-one and f : A → B is one-to-one and onto, then g : B → C is
one-to-one.
Proof by contrapositive. Suppose that g is not one-to-one. Then we could find x, y ∈ B such that g(x) = g(y)
but x 6= y. But f is one-to-one and onto, so there exist unique s, t ∈ A such that f (s) = x, f (t) = y. So we have
g(f (s)) = g(f (t)) but f (s) 6= f (t) and therefore s 6= t. So g ◦ f is not one-to-one. By contrapositive, the lemma
is proven.
Lemma 3: If f : [a, b] → [c, d] is a continuous, one-to-one, real function then f is either strictly
decreasing or strictly increasing
Proof by contrapositive. Let f be a continuous real function. If f were not strictly increasing and not strictly
decreasing then we could find some x < y < z such that either f (y) ≤ f (x) and f (y) ≤ f (z) or such that
f (y) ≥ f (x) and f (y) ≥ f (z). From the intermediate value property of continuous functions we know that we
must then be able to find x0 , y 0 such that
x ≤ x0 < y < z 0 < z,
f (x0 ) = f (z 0 )
so that f is not one-to-one. By contrapositive, the lemma is proven.
Lemma 4: If P = {xi } is a partition of γ2 , then P 0 = {φ(xi )} is a partition of γ1
From lemma 3 we know that φ is strictly increasing or decreasing, and since φ(c) = a it must be strictly
increasing. And φ is onto, so φ(d) = b. Let P be an arbitrary partition of [c, d]:
P = {xi } = {c = x0 , x1 , . . . , xn , xn+1 = d}
From the properties of φ we have
a = φ(c) = φ(x0 ) < φ(x1 ) < φ(x2 ) < . . . < φ(xn ) < φ(xn+1 ) = φ(d) = b
and therefore
{φ(xi )} = {a = φ(x0 ), φ(x1 ), φ(x2 ), . . . , φ(xn ), φ(xn+1 ) = b}
which is a partition of [a, b].
Lemma 5: If P = {xi } is a partition of γ1 , then P 0 = φ−1 (xi ) is a partition of γ2
We’re told that φ is a continuous one-to-one function from [a, b] to [c, d], so its inverse exists and is a continuous
mapping from [a, b] onto [c, d] (theorem 4.17). By lemma 3, this means that φ−1 is either strictly increasing or
strictly decreasing, and since φ−1 (a) = c it must be strictly increasing. And φ−1 is onto, so φ−1 (b) = d. Let P
be an arbitrary partition of [a, b]:
P = {xi } = {a = x0 , x1 , . . . , xn , xn+1 = b}
From the properties of φ−1 we have
c = φ−1 (a) = φ−1 (x0 ) < φ−1 (x1 ) < φ−1 (x2 ) < . . . < φ−1 (xn ) < φ−1 (xn+1 ) = φ−1 (b) = d
and therefore
{φ−1 (xi )} = {c = φ−1 (x0 ), φ−1 (x1 ), φ−1 (x2 ), . . . , φ−1 (xn ), φ−1 (xn+1 ) = d}
which is a partition of [c, d].
Proof 1: γ1 is an arc iff γ2 is an arc
If γ1 is an arc then γ1 is a one-to-one function. We’re told that φ is one-to-one, therefore by lemma 1 γ1 ◦ φ is
one-to-one. And γ2 = γ1 ◦ φ, so γ2 is one-to-one. Therefore γ2 is an arc.
If γ2 is an arc then γ2 = γ1 ◦ φ is one-to-one. We’re told that φ is a one-to-one and onto function, therefore
by lemma 2 we know that γ1 is one-to-one. Therefore γ1 is an arc.
Proof 2: γ1 is a closed curve iff γ2 is a closed curve
Assume that γ1 is a closed curve so that γ1 (a) = γ1 (b). We saw in lemma 4 that φ(c) = a and φ(d) = b, so we
have
γ2 (c) ≡ γ1 (φ(c)) = γ1 (a) = γ1 (b) = γ1 (φ(d)) ≡ γ2 (d)
Therefore γ2 (c) = γ2 (d), which means that γ2 is a closed curve.
Now assume that γ2 is a closed curve so that γ2 (c) = γ2 (d). We saw in lemma 5 that φ−1 (a) = c and
φ (b) = d so we have
−1
γ1 (a) = γ1 (φ(φ−1 (a))) ≡ γ2 (φ−1 (a)) = γ2 (c) = γ2 (d) = γ2 (φ−1 (b)) ≡ γ1 (φ(φ−1 (b))) = γ1 (b)
Therefore γ1 (a) = γ1 (b), which means that γ1 is a closed curve.
Proof 3: γ1 and γ2 have the same length
Let P be an arbitrary partitioning of [a, b]. Using the notation from definition 6.26, we have
Λ(P, γ2 ) =
n
X
|(γ2 (xi ) − γ2 (xi−1 )| < M
i=1
From the definition of γ2 , this becomes
Λ(P, γ2 ) =
n
X
|(γ1 (φ(xi )) − γ1 (φ(xi−1 ))| < M
i=1
From lemma 4 we see that {φ(xi )} describes a partitioning of [c, d]:
Λ(P, γ2 ) = Λ(φ(P ), γ1 )
Similarly, if we let P 0 be an arbitrary partition of [a, b] we can use lemma 5 to show that
Λ(P 0 , γ1 ) = Λ(φ−1 (P ), γ2 )
This means that the set {Λ(P, γ1 )} is identical to the set of {Λ(P, γ2 )} so clearly these sets have the same
supremums, which means that Λ(γ1 ) = Λ(γ2 ).
Proof 4: γ1 is rectifiable iff γ2 is rectifiable
Having shown previously that Λ(γ1 ) = Λ(γ2 ), it’s clear that Λ(γ1 ) is finite iff Λ(γ2 ) is finite and so γ1 is rectifiable
iff γ2 is rectifiable.
Exercise 7.1
Let {fn } be a sequence of functions that converges uniformly to f . Let Mn = sup |fn | and let M = sup |f |. Let
> 0 be given. Because of the uniform convergence of {fn } → f we can find some N such that
n > N → |fn (x)| = |fn (x) − f (x) + f (x)| ≤ |fn (x) − f (x)| + |f (x)| ≤ + M
This tells us that for all n > N the function fn is bounded above by M + . There are only finitely many n ≤ N ,
each of which is bounded by some Mn . So by choosing
max{M + , M1 , M2 , . . . , MN }
we have found a bound for all fn .
Exercise 7.2a
Let > 0 be given. We’re told that {fn } and {gn } converge uniformly on E so we can find N, M such that
n, m > N → |fn (x) − fm (x)| <
,
2
n, m > M → |gn (x) − gm (x)| <
2
By choosing N ∗ > max{N, M } we have
n, m > N ∗ → |(fn (x) + gn (x)) − (fm (x) − gm (x))| ≤ |fn (x) − fm (x)| + |gn (x) − gm (x)| < which shows that fn + gn converges uniformly on E.
Exercise 7.2b
If {fn } and {gn } are bounded functions then by exercise 7.1 they are uniformly bounded, say by F and G. Let
> 0 be given and choose δ such that
r
,
,
δ < min ,
3 3F 3G
We’re told that {fn } and {gn } converge uniformly on E so we can find N, M such that
n, m > N → |fn (x) − fm (x)| < δ,
n, m > M → |gn (x) − gm (x)| < δ
|fn gn − fm gm | = |(fn − fm )(gn − gm ) + fm (gn − gm ) + gm (fn − fm )|
≤ |(fn − fm )||(gn − gm )| + |fm ||(gn − gm )| + |gm ||(fn − fm )| ≤ δ 2 + |fm |δ + |gm |δ ≤ δ 2 F δ + Gδ
r 2
+G
=
≤
+F
3
3F
3G
Exercise 7.3
Let {fn } be a sequence such that fn (x) = x. This function obviously converges uniformly to the function
f (x) = x on the set R. Let {gn } be a sequence of constant functions such that gn (x) = n1 . This sequence obviously
converges uniformly to the function g(x) = 0. Their product is the sequence {fn gn } where fn (x)gn (x) = x/n.
It's clear that this sequence converges pointwise to the zero function f(x)g(x) = 0.
To show that fn gn is not uniformly convergent, let > 0 be given. Choose an arbitrarily large n ∈ Z and
choose t ∈ R such that t > (n(n + 1)). We now have
|fn (t)gn (t) − fn+1 (t)gn+1 (t)| =
t
t
(n(n + 1))
t
−
=
>
=
n n+1
n(n + 1)
n(n + 1)
This shows that the necessary requirements for uniform convergence given in theorem 7.8 do not hold.
Exercise 7.4
Holy shit this exercise is a mess.
For what values of x does the series converge absolutely? (Incomplete)
For values of x > 0 we have
X
1 X
1
1 X 1
1
=
≤
1 + n2 x
x
1/x + n2
x
n2
By the comparison test this shows that f (x) converges absolutely. If x = 0 then we have
X
X
1
=
|1| = ∞
1 + n2 x
This series clearly doesn’t converge, absolutely or otherwise. If x < 0 things get more complicated: If x = −1/n2
for any n ∈ N then the nth term of the series is undefined and therefore f (x) is undefined. If x 6= −1/n2 for any
n ∈ N then we can use the fact that
For what intervals does the function converge uniformly?
If E is any interval of the form [a, b] with a > 0 then we have
sup |fn (x)| = fn (a) =
1
1 + n2 a
And therefore we have
X
sup |fn (x)| =
X
1
1X 1
≤
2
1+n a
a
n2
P
This shows that
sup |fn (x)| converges by the comparison test, and so by theorem 7.10 we see that the series
P
fn (x) converges uniformly on E.
If E is any interval of the form [a, b] with b < 0 that does not contain any elements of the form −1/n2 , n ∈ N
For what intervals does the function fail to converge uniformly?
Define the set X = {xn } with xn = −1/n2 . The function f will fail to converge uniformly on any interval that
contains an element of X ∪ 0 or has an element of X ∪ 0 as a limit point of E.
Proof: Let E be an arbitrary interval. If E contains
P any xn ∈ X then f (xn ) undefined and so f fails to
converge uniformly on E. If E contains 0 then f (0) = 1 = ∞, but we will never find some finite N such that
PN
1 1 − f (0) < so it’s clear that f fails to converge uniformly on E.
Now suppose that some xn ∈ X is a limit point of E. The nth term of f is unbounded near xn , so f is
unbounded near xn , and therefore limt→xn f (t) = ∞. From this we have
lim lim f (t) = lim ∞ = ∞
n→∞ t→xn
n→∞
On the other hand, if we first fix a value of t and take the limit of f as n → ∞ we have
lim f (t) = lim
n→∞
n→∞
1
=0
1 + n2 t
and therefore
lim lim f (t) = lim 0 = 0
t→xn n→∞
t→xn
Exercise 7.5
If x ≤ 0 or x > 1 then fn(x) = 0 for all n, and therefore in these cases limn→∞ fn(x) = 0. For any other x, choose
an integer N large enough so that N > 1/x. For all n > N we now have x > 1/n and therefore fn (x) = 0, so
for this case we have limn→∞ fn (x) = 0. This exhausts all possible values of x, so {fn } converges pointwise to
the continuous function f (x) = 0.
To show that this function doesn’t converge uniformly to f (x) = 0 let 1 > > 0 be given and let n be an
arbitrarily large integer. We can easily verify that
1
fn
= sin2 (nπ + π/2) = 1
n + 1/2
and therefore the definition of uniform convergence is not satisfied.
P
The last part of the question asks us to use the series
fn to show that absolute convergence for all x does
not imply uniform
convergence.
The
proof
of
this
is
simple:
P
Pwe’ve already shown that {fn } is not uniformly
convergent but
|fn (x)| converges to f (x) = | sin2 (π/x)| so
|fn | is absolutely convergent.
Exercise 7.6
To show that the series doesn’t converge absolutely for any x:
X
(−1)n
X x2
X 1
x2 + n
1
=
+
≥
2
2
n
n
n
n
The rightmost sum is the harmonic series, which is known to diverge. By comparison test the leftmost series
must also diverge.
To show that the series converges uniformly in every bounded interval [a, b], let > 0 be given. Define X to
be sup{|a|, |b|}. Define the partial sum fm to be
fm =
m
X
(−1)n
n=1
x2 + n
n2
We can rearrange this algebraically to form
fm =
m
X
(−1)n
n=1
m
X
1
1
+ x2
(−1)n 2
n
n
n=1
We know that the alternating harmonic series converges (theorem 3.44, or example 3.40(d)) and that
converges (theorem 3.28). Therefore by the Cauchy criterion for convergence we can find N such that
p>q>N →|
P
1/n2
p
X
1
(−1)n | <
n
2
n=q
and we can also find M such that
p>q>M →|
p
X
(−1)n
n=q
1
|<
n2
2|X|2
So, by choosing p > q > max{N, M } we have (for all x ∈ [a, b]):
|fp (x) − fq (x)| =
p
X
(−1)n
p
X
1
1
+ x2
(−1)n 2
n
n
n=q
(−1)n
p
X
1
1
(−1)n 2
+ x2
n
n
n=q
n=q
≤
p
X
n=q
≤
≤
x2
+
2 2X 2
+ =
2 2
Exercise 7.7
We can establish a global maximum for |fn (x)|
The derivative of fn (x) is
fn0 (x) =
1 + nx2 − 2nx2
1 − nx2
=
(1 + nx2 )2
(1 + nx2 )2
√
This derivative is zero only when nx2 = 1, which occurs only when x = ±1/ n, at which point we have
±1
±1
fn √
= √
n
2 n
These extrema must be the global√extrema for the function, since fn has no asymptotes and fn (x) → 0 as
x → ±∞. Therefore |fn (x)| ≤ 1/(2 n).
fn converges uniformly to f (x) = 0
Clearly fn converges pointwise to f (x) = 0, since for any x we have
lim fn (x) = lim
n→∞
n→∞
x
=0
1 + nx2
√
Now let > 0 be given and choose n sufficiently large so that 1/(2 n) < : this can be done by choosing
1
n > 42 . From the previously established bounds, we now have
1
|fn (x) − f (x)| = |fn (x) − 0| = |fn (x)| ≤ √ < 2 n
By theorem 7.9 this is sufficient to prove that fn converges uniformly to f (x) = 0.
When does f′(x) = lim f_n′(x)?
We've established that f(x) = 0, so clearly f′(x) = 0 for all x. The limit of f_n′ is given by:

lim_{n→∞} f_n′(x) = lim_{n→∞} (1 − nx²)/(n²x⁴ + 2nx² + 1) = 0 if x ≠ 0, and 1 if x = 0

This shows that it's not necessarily true that [lim f_n]′ = lim[f_n′] even if f_n converges uniformly.
Exercise 7.8
Proof of uniform convergence
Let ε > 0 be given. Let f_m represent the partial sum

f_m(x) = Σ_{n=1}^{m} c_n I(x − x_n)

We're told that Σ |c_n| converges, so Σ |c_n| satisfies the Cauchy convergence criterion, and we can find N such that p > q > N implies

Σ_{n=q}^{p} |c_n| < ε

For the same values of p > q > N we also have, by the triangle inequality and the fact that I(x − x_n) ≤ 1,

|f_p(x) − f_q(x)| = |Σ_{n=q}^{p} c_n I(x − x_n)| ≤ Σ_{n=q}^{p} |c_n I(x − x_n)| ≤ Σ_{n=q}^{p} |c_n| < ε

so the sequence of partial sums {f_m} converges uniformly by theorem 7.8.
Proof of continuity
Let t be an arbitrary point with t ≠ x_n for every n (t need not lie in the interval (a, b)). This t either is or isn't a limit point of {x_n}.
If t is not a limit point of {x_n} then (since t is not equal to any x_n) we can find some neighborhood around t that contains no points of {x_n}; each function I(x − x_n) is constant on this neighborhood, and therefore f(x) = Σ c_n I(x − x_n) is constant on this neighborhood (see below for a more thorough justification of this claim). And if f is constant on an interval around t then it is clearly continuous at t.
Now suppose that t is a limit point of {x_n}. By the Cauchy convergence of Σ |c_n| we can find N such that Σ_{n=N}^{∞} |c_n| < ε. Choose a neighborhood around t small enough that it does not contain any of the first N terms of {x_n}; let δ be the radius of this neighborhood. If we choose s such that |s − t| < δ, then I(s − x_n) = I(t − x_n) for n = 1, 2, …, N. From this we have

|s − t| < δ → |f(s) − f(t)| = |Σ_{n=N+1}^{∞} (c_n I(s − x_n) − c_n I(t − x_n))| ≤ Σ_{n=N+1}^{∞} |c_n[I(s − x_n) − I(t − x_n)]|
Each I(x − x_n) is either 0 or 1, so |I(s − x_n) − I(t − x_n)| ≤ 1, and we have

|s − t| < δ → |f(s) − f(t)| ≤ Σ_{n=N+1}^{∞} |c_n[I(s − x_n) − I(t − x_n)]| ≤ Σ_{n=N+1}^{∞} |c_n| < ε

That is, we've found δ such that

|s − t| < δ → |f(s) − f(t)| < ε

which, by definition, means that f is continuous at t.
We have therefore shown that f is continuous at t in both cases. But t was an arbitrary point with t ≠ x_n for every n (not necessarily confined to (a, b)), and therefore f is continuous at every point other than the x_n, as the exercise requires.
Justification of I(x − xn ) being constant on some interval
As mentioned above, if t is not a limit point of {x_n} (and t ≠ x_n for every n) then we can find some neighborhood around t that contains no points of {x_n}; let δ be its radius. Let A be the set of elements of {x_n} that are greater than t, and let B be the set of elements of {x_n} that are smaller than t. If |s − t| < δ then there are no elements of {x_n} between s and t, so every element of A is greater than both t and s and every element of B is smaller than both t and s. From this we see that I(s − x_n) = I(t − x_n) = 0 if x_n ∈ A and I(s − x_n) = I(t − x_n) = 1 if x_n ∈ B. This means that each I(x − x_n) has the same value for every x ∈ N_δ(t); so Σ c_n I(x − x_n) has the same value for every x ∈ N_δ(t); and therefore f(x) has the same value for every x ∈ N_δ(t).
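As a small numerical aside (a sketch only; the coefficients c_n = 2⁻ⁿ, the points x_n = 1/n, and the test point t are arbitrary choices), the partial sums of f barely move near a point that is not one of the x_n:

# f(x) = sum c_n I(x - x_n) with I(t) = 0 for t <= 0 and 1 for t > 0.
def I(t):
    return 1.0 if t > 0 else 0.0

N_TERMS = 60
cs = [2.0 ** (-n) for n in range(1, N_TERMS + 1)]   # sum |c_n| converges
xs = [1.0 / n for n in range(1, N_TERMS + 1)]

def f(x):
    return sum(c * I(x - xn) for c, xn in zip(cs, xs))

# At a point t that is not one of the x_n, f stops moving once the step is smaller
# than the distance from t to the nearest x_n -- consistent with continuity at t.
t = 0.123456
for delta in (1e-2, 1e-4, 1e-8):
    print(delta, abs(f(t + delta) - f(t)), abs(f(t - delta) - f(t)))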
Exercise 7.9
We’re told that {fn } → f uniformly on E, so we can find N such that
n > N → |fn (t) − f (t)| <
2
for all t ∈ E
This holds for all t ∈ E, so it must also hold for the elements of {xn }, which means that
n > N → |fn (xn ) − f (xn )| <
2
(85)
We’re also told that {fn } is a uniformly convergent sequence of continuous functions, so f is continuous; and
since {xn } → x we can find M such that
n > M → |f (xn ) − f (x)| <
2
(86)
Using the triangle inequality and equations (85) and (86), we have
|fn (xn ) − f (x)| ≤ |fn (xn ) − f (xn )| + |f (xn ) − f (x)| <
+ =
2 2
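As a trivial numerical illustration (a sketch only; the choices f_n(x) = x + 1/n on E = [0, 1] and x_n = 1/2 + 1/(n + 1) → 1/2 are arbitrary):

# f_n -> f(x) = x uniformly on [0, 1], and x_n -> 1/2, so f_n(x_n) should approach f(1/2) = 0.5.
def f_n(n, x):
    return x + 1.0 / n

for n in (10, 1000, 100000):
    x_n = 0.5 + 1.0 / (n + 1)
    print(n, f_n(n, x_n))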
Converse statement 1
Suppose the converse is “Let {f_n} be a sequence of continuous functions that converges uniformly to f. Is it true that if lim f_n(x_n) = f(x) then {x_n} → x?” The answer to this question is “no”. Consider the function f_n(x) = x/n and the sequence {x_n} = {1} (an infinite sequence of 1s). The sequence {f_n} converges uniformly to f(x) = 0 on the set [0, 1], so we have

lim_{n→∞} f_n(x_n) = lim_{n→∞} f_n(1) = 0 = f(0)

It's clear that {x_n} does not converge to 0, so the converse does not hold.
Converse statement 2
Suppose the converse is “Let {f_n} be a sequence such that lim f_n(x_n) = f(x) for some function f and for all sequences of points x_n ∈ E such that {x_n} → x ∈ E. Is it true that {f_n} converges uniformly to f on E?”
The answer to this question is “no”. Let f_n(x) = 1/(nx). Let E be the set {1/n : n ∈ N}. Then f_n converges pointwise on E to f(x) = 0. Although 0 is a limit point of E, 0 is not an element of E, and this converse statement is only concerned with sequences {x_n} that converge to some x ∈ E. Every point of E is isolated in E, so the only sequences of points x_n ∈ E with {x_n} → x ∈ E are sequences that are eventually constant (that is, sequences for which there is some N such that n > N → x_n = x). So clearly for every such sequence we must have lim f_n(x_n) = lim f_n(x) = f(x). But, despite the fact that lim f_n(x_n) = f(x) whenever {x_n} → x ∈ E, it's clear that f_n does not converge uniformly to f(x) = 0 on E (to prove this, choose any ε > 0 and any n, and then choose x ∈ E with x < (εn)⁻¹, so that f_n(x) = 1/(nx) > ε).
Exercise 7.10a
f is continuous at every irrational number
We know that Σ 1/n² converges, so by the Cauchy criterion we can choose N such that

b > a > N → Σ_{n=a}^{b} 1/n² < ε/4

The fractional part (nx) is never zero because x is irrational, so we can choose δ such that

0 < δ < (1/(2N)) min{(nx), 1 − (nx) : n ≤ N}

This guarantees that 0 < (n[x − δ]) < (nx) and (nx) < (n[x + δ]) < 1 for all n ≤ N. More importantly, this choice of δ guarantees that (n[x ± δ]) = (nx) ± nδ for n ≤ N. We now derive the following chain of inequalities:
|f(x) − f(x ± δ)| = |Σ_{n=1}^{N} [(nx)/n² − (nx ± nδ)/n²] + Σ_{n=N+1}^{∞} [(nx)/n² − (nx ± nδ)/n²]|

≤ |Σ_{n=1}^{N} [(nx)/n² − (nx ± nδ)/n²]| + Σ_{n=N+1}^{∞} (nx)/n² + Σ_{n=N+1}^{∞} (nx ± nδ)/n²

< |Σ_{n=1}^{N} [(nx)/n² − (nx ± nδ)/n²]| + ε/4 + ε/4

= |Σ_{n=1}^{N} [(nx)/n² − (nx)/n² ∓ nδ/n²]| + ε/2

= Σ_{n=1}^{N} nδ/n² + ε/2
We chose our value of δ after we fixed a particular value of N, so we could have chosen δ small enough to make this remaining sum less than ε/2. In doing so, we have

|f(x) − f(x ± δ)| < ε/2 + ε/2 = ε

And this would hold not just for x ± δ, but for all y such that |x − y| < δ. That is:

|x − y| < δ → |f(x) − f(y)| < ε

which means that f is continuous at x.
Lemma 1: Σ (nx)/n² converges uniformly to f
This is an immediate consequence of the fact that (nx) < 1 for all n, x and that Σ 1/n² converges. Let ε > 0 be given. By the Cauchy criterion we can choose N such that

b > a > N → Σ_{n=a}^{b} 1/n² < ε

and therefore we have

|f(x) − Σ_{n=1}^{N} (nx)/n²| = |Σ_{n=1}^{∞} (nx)/n² − Σ_{n=1}^{N} (nx)/n²| = |Σ_{n=N+1}^{∞} (nx)/n²| ≤ Σ_{n=N+1}^{∞} (nx)/n² ≤ Σ_{n=N+1}^{∞} 1/n² ≤ ε
f is discontinuous at every rational number
When nx is not an integer the following limits are identical:

lim_{t→x+} (nt)/n² = (nx)/n² = lim_{t→x−} (nt)/n²

When nx is an integer this equality no longer holds. Instead, we have

lim_{t→x+} (nt)/n² = 0/n²,   lim_{t→x−} (nt)/n² = 1/n²

We know by lemma 1 that Σ (nx)/n² converges uniformly to f, so we also have (by theorem 7.11):

Σ_{n=1}^{∞} f_n(x+) = f(x+),   Σ_{n=1}^{∞} f_n(x−) = f(x−)

where f_n(x) = (nx)/n². We know that Σ 1/n² converges, so by the Cauchy criterion we can choose N such that

b > a > N → Σ_{n=a}^{b} 1/n² < ε/4
If f were continuous at our rational point x, then we would have |f (x+) − f (x−)| = 0. But when we actually
calculate this difference we find:
|f(x+) − f(x−)| = |Σ_{n=1}^{∞} f_n(x+) − Σ_{n=1}^{∞} f_n(x−)|
We determined that f_n(x+) = f_n(x−) unless nx is an integer. Writing x = p/q in lowest terms, this occurs exactly when n is a multiple of q, at which point nx = [mq]x = mq[p/q] = mp. Most of the terms of the summations cancel out, leaving us with
|f(x+) − f(x−)| = |Σ_{m=1}^{∞} f_{mq}(x+) − Σ_{m=1}^{∞} f_{mq}(x−)| = |Σ_{m=1}^{∞} (lim_{t→x+} (mqt)/[mq]² − lim_{t→x−} (mqt)/[mq]²)| = |Σ_{m=1}^{∞} (0 − 1)/[mq]²| = Σ_{m=1}^{∞} 1/[mq]²
This sum equals π²/(6q²), which is strictly positive; it cannot be made arbitrarily small because we have no freedom to select particular values for m or q. Therefore f(x+) ≠ f(x−), and therefore f is not continuous at x. But x was an arbitrary rational number, and therefore f is not continuous at any rational number.
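As a numerical check of this jump (a sketch only; the truncation length and step size are arbitrary): at x = 1/2 we have q = 2, so the calculation above predicts f(x+) − f(x−) = −π²/24 ≈ −0.411.

import math

def frac(t):
    return t - math.floor(t)

def f_partial(x, N=100000):
    # partial sum of f(x) = sum (nx)/n^2, where ( ) denotes the fractional part
    return sum(frac(n * x) / (n * n) for n in range(1, N + 1))

h = 1e-7
print(f_partial(0.5 + h) - f_partial(0.5 - h), -math.pi ** 2 / 24)   # both about -0.411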
Exercise 7.10b
We’ve shown that the discontinuities of f are the rational numbers, and these are clearly countable and clearly
dense in R.
Exercise 7.10c
To prove that f ∈ R we need only show that (nx)/n² ∈ R for any fixed value of n. We can then use the fact that Σ (nx)/n² converges uniformly to f (lemma 1) and theorem 7.16 to show that ∫ f = ∫ Σ (nx)/n² = Σ ∫ (nx)/n².
Let n ∈ N be given. Let [a, b] be an arbitrary interval. Assume without loss of generality that a and b are integers.

∫_a^b (nx)/n² dx = Σ_{k=a}^{b−1} ∫_k^{k+1} (nx)/n² dx
For 0 ≤ δ < 1 we have (n(k + δ)) = (nk + nδ) = (nδ), so we can integrate over (0, 1) instead of (k, k + 1) without changing the value of the integral.

∫_a^b (nx)/n² dx = Σ_{k=a}^{b−1} ∫_0^1 (nx)/n² dx
As x ranges from 0 to 1, nx ranges from 0 to n. So we split up the interval (0, 1) into n intervals of length 1/n:

∫_a^b (nx)/n² dx = Σ_{k=a}^{b−1} Σ_{j=0}^{n−1} ∫_{j/n}^{[j+1]/n} (nx)/n² dx
To make this integral a bit more manageable we make the variable substitution u = nx. When x = j/n we have u = j; when x = [j + 1]/n we have u = j + 1; and we have dx = (1/n) du. Using this variable substitution in the previous integral:

∫_a^b (nx)/n² dx = Σ_{k=a}^{b−1} Σ_{j=0}^{n−1} ∫_j^{j+1} (u)/n³ du
Again, there is no difference between the value of (u) on the interval (j, j + 1) and the value of (u) on the interval (0, 1):

∫_a^b (nx)/n² dx = Σ_{k=a}^{b−1} Σ_{j=0}^{n−1} ∫_0^1 (u)/n³ du
On the interval (0, 1) we have (u) = u, so we can finally integrate this thing. The double sum has (b − a) · n identical terms, each equal to ∫_0^1 u/n³ du = 1/(2n³), so

∫_a^b (nx)/n² dx = [b − a][n] · 1/(2n³) = (b − a)/(2n²)
And therefore

∫_a^b f(x) dx = ∫_a^b Σ_{n=1}^{∞} (nx)/n² dx = Σ_{n=1}^{∞} ∫_a^b (nx)/n² dx = Σ_{n=1}^{∞} (b − a)/(2n²)

We know this rightmost sum converges (specifically, it converges to π²(b − a)/12), therefore ∫_a^b f(x) dx exists and therefore f ∈ R.
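As a numerical sanity check of the value (b − a)/(2n²) (a sketch only; the midpoint rule and the sample triples (n, a, b) are arbitrary):

import math

def frac(t):
    return t - math.floor(t)

def integral(n, a, b, steps=200000):
    # midpoint rule for the integral of (nx)/n^2 over [a, b] with a, b integers
    h = (b - a) / steps
    return sum(frac(n * (a + (k + 0.5) * h)) / (n * n) for k in range(steps)) * h

for n, a, b in ((1, 0, 1), (3, -2, 4), (7, 1, 5)):
    print(n, a, b, integral(n, a, b), (b - a) / (2 * n * n))   # the two columns agree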
Exercise 7.11
Let ε > 0 be given. We're told that Σ f_n has uniformly bounded partial sums: let F_n = Σ_{k=1}^{n} f_k and let M be an upper bound for |F_n(x)| over all n and x. We're told that g_n → 0 uniformly (and, since the g_n decrease monotonically to 0, each g_n ≥ 0), so we can find N such that

n > N → g_n(x) < ε/(2M) for all x

Let A_n represent the partial sum Σ_{k=1}^{n} f_k g_k. Our result is proven if we can show that {A_n} satisfies the Cauchy convergence criterion uniformly in x. To do this, we choose q > p > N. Following the logic of theorem 3.42 (summation by parts), we have

|A_q − A_{p−1}| = |Σ_{k=p}^{q} f_k g_k| = |Σ_{k=p}^{q−1} F_k(g_k − g_{k+1}) + F_q g_q − F_{p−1} g_p| ≤ M [Σ_{k=p}^{q−1} (g_k − g_{k+1}) + g_q + g_p]

The (g_k − g_{k+1}) terms telescope, leaving us with

|A_q − A_{p−1}| ≤ M [(g_p − g_q) + g_q + g_p] = 2M g_p < 2M · ε/(2M) = ε
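As a numerical illustration of the bound |A_q − A_{p−1}| ≤ 2M g_p (a sketch only; the choices f_n(x) = (−1)ⁿ, so that M = 1, and g_n(x) = 1/(n + x²) are arbitrary):

# Block sums of sum (-1)^k / (k + x^2) versus the bound 2 M sup_x g_p(x) = 2/p.
def block(p, q, x):
    return sum(((-1) ** k) / (k + x * x) for k in range(p, q + 1))

for p in (10, 100, 1000):
    worst = max(abs(block(p, 5 * p, x / 10.0)) for x in range(-50, 51))
    print(p, worst, 2.0 / p)   # the observed worst case sits comfortably under 2/p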
Exercise 7.12
Let ε > 0 be given. We're told that 0 ≤ f_n, f ≤ g and that ∫_0^∞ g < ∞. Let M = ∫_0^∞ g. Define G(t) to be

G(t) = ∫_0^t g(x) dx

Since G(t) is continuous (theorem 6.30) and increasing (because g ≥ 0) and has an upper bound of M, we know that

lim_{t→∞} G(t) = M

so we can find some N such that n > N → M − G(n) < ε/2. In particular, for any c > N,

∫_0^∞ g − ∫_0^c g = ∫_c^∞ g < ε/2   (87)
We’re given that fn → f uniformly, so for all x and any specified value for c we can find some M such that
m > M → |fm (x) − f (x)| <
2c
(88)
from which we can conclude that
c
Z
c
Z
fm − f ≤
m>M →
|fm − f | ≤ c
0
0
=
2c
2
(89)
So, for the given value of ε > 0 we choose c large enough that (87) holds and then, based on this choice of c, we choose n large enough that (89) holds. Since 0 ≤ f_n, f ≤ g we also have |f_n − f| ≤ g, so by the triangle inequality and (87) we then have

|∫_0^∞ (f_n − f)| ≤ |∫_0^c (f_n − f)| + |∫_c^∞ (f_n − f)| ≤ ∫_0^c |f_n − f| + ∫_c^∞ |f_n − f| ≤ ε/2 + ∫_c^∞ g < ε/2 + ε/2 = ε
We can make this hold for arbitrary ε by taking n sufficiently large, which is simply saying that

lim_{n→∞} ∫_0^∞ (f_n − f) = 0

from which we conclude

lim_{n→∞} ∫_0^∞ f_n = ∫_0^∞ f
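As a numerical illustration (a sketch only, with the arbitrary example g(x) = e⁻ˣ and f_n(x) = e⁻ˣ(1 − e⁻ⁿˣ), which satisfies 0 ≤ f_n ≤ g and converges to f(x) = e⁻ˣ uniformly on compact subsets of (0, ∞); here ∫_0^∞ f_n = 1 − 1/(n + 1) → 1 = ∫_0^∞ f):

import math

def integral_fn(n, upper=50.0, steps=200000):
    # midpoint rule for the rapidly decaying integrand e^-x (1 - e^{-n x}) on (0, upper)
    h = upper / steps
    return sum(math.exp(-x) * (1.0 - math.exp(-n * x))
               for x in (h * (k + 0.5) for k in range(steps))) * h

for n in (1, 10, 100):
    print(n, integral_fn(n), 1.0 - 1.0 / (n + 1))   # the two columns agree and tend to 1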
Exercise 7.13a
Let {x_i} be an enumeration of the rational numbers together with the points of discontinuity of the f_n (this set is countable by theorems 4.30, 2.12, and 2.13). For x_1, the sequence {f_n(x_1)} is a bounded sequence of real numbers in the compact interval [0, 1]. Therefore there exists some subsequence of functions, call it {f_n^(1)}, for which the sequence of real numbers {f_n^(1)(x_1)} converges to some point in [0, 1] (theorem 3.6a). Now consider x_2: the sequence {f_n^(1)(x_2)} is still a bounded sequence of real numbers in [0, 1], so we can find a further subsequence {f_n^(2)} for which {f_n^(2)(x_1)} and {f_n^(2)(x_2)} both converge. This can be repeated a countable number of times, and the diagonal sequence (as in theorem 7.23) gives a sequence of functions {F_n} for which {F_n(x)} converges at every rational number and at every point of discontinuity of any f_n.
Now let t be a rational number or a point of discontinuity of some f_n. We have constructed {F_n} so that {F_n(t)} converges, so we simply define f(t) for such t to be

f(t) = lim_{n→∞} F_n(t),   if t is rational or a point of discontinuity

Now let t be an irrational number at which every f_n is continuous. The rational numbers are dense in R, so we can construct a sequence {r_n} of rational numbers such that {r_n} → t. Each F_n is continuous at t, so lim_{k→∞} F_n(r_k) = F_n(t) for every n. Therefore we simply define the value of f(t) to be

f(t) = lim_{n→∞} F_n(t),   if t is irrational and each F_n is continuous at t
Exercise 7.13: example
An example where a sequence {f_n} of monotonically increasing functions converges pointwise to f and f is not continuous:

f_n(x) = 0 for x ≤ 0,   f_n(x) = 1 − 1/(1 + nx) for 0 < x < 1,   f_n(x) = 1 − 1/(1 + n) for x ≥ 1

lim_{n→∞} f_n(x) = 0 for x ≤ 0,   and 1 for x > 0
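As a quick evaluation of this example (a sketch only; the sample points and the large value of n are arbitrary):

# Each f_n is monotonically increasing with values in [0, 1],
# but the pointwise limit jumps from 0 to 1 at x = 0.
def f_n(n, x):
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0 - 1.0 / (1.0 + n)
    return 1.0 - 1.0 / (1.0 + n * x)

n = 10 ** 7
for x in (-0.1, 0.0, 1e-3, 0.5, 2.0):
    print(x, f_n(n, x))   # approximately 0, 0, 1, 1, 1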
Exercise 7.13b
Let α = inf f(x) and let β = sup f(x) (both values are finite since 0 ≤ f ≤ 1). Let ε > 0 be given. The function f must come arbitrarily close to its supremum and infimum on R. Let a be a point at which |f(a) − α| < ε and let b be a point at which |f(b) − β| < ε. From the monotonicity of f we have

x < a → |f(x) − α| < ε,   x > b → |f(x) − β| < ε   (90)

Proving uniform convergence is now a matter of proving uniform convergence on [a, b].
We're given that f is continuous on [a, b]; therefore it's uniformly continuous on [a, b] (theorem 4.19); therefore we can find some δ such that

|x − y| < δ → |f(x) − f(y)| < ε   (91)

We can cover [a, b] with finitely many intervals of length δ/2 ([a, b] is bounded, so we don't even need to rely on compactness for this). From each interval we choose a rational number q_i. We know that {F_n} converges pointwise to f at each rational number, and the set of q_i's is finite, so we can find some integer N such that

n > N → |F_n(q_i) − f(q_i)| < ε for each q_i   (92)
Now choose any x ∈ [a, b]. If x = q_i for some i then by (92) we have n > N → |F_n(x) − f(x)| < ε. If x ≠ q_i then we can find q_i, q_{i+1} such that q_i < x < q_{i+1} and |q_{i+1} − q_i| < δ. By the triangle inequality:

|F_n(x) − f(x)| ≤ |F_n(x) − F_n(q_{i+1})| + |F_n(q_{i+1}) − f(q_{i+1})| + |f(q_{i+1}) − f(x)|

Each F_n is monotonic, so we have |F_n(x) − F_n(q_{i+1})| ≤ |F_n(q_i) − F_n(q_{i+1})|, and this last inequality becomes

|F_n(x) − f(x)| ≤ |F_n(q_i) − F_n(q_{i+1})| + |F_n(q_{i+1}) − f(q_{i+1})| + |f(q_{i+1}) − f(x)|

The term |F_n(q_{i+1}) − f(q_{i+1})| is < ε by (92). The term |f(q_{i+1}) − f(x)| is < ε by (91). This leaves us with

|F_n(x) − f(x)| ≤ |F_n(q_i) − F_n(q_{i+1})| + 2ε

Additional applications of the triangle inequality give us

|F_n(x) − f(x)| ≤ |F_n(q_i) − f(q_i)| + |f(q_i) − f(q_{i+1})| + |f(q_{i+1}) − F_n(q_{i+1})| + 2ε

The terms |F_n(q_i) − f(q_i)| and |f(q_{i+1}) − F_n(q_{i+1})| are both < ε by (92). The term |f(q_i) − f(q_{i+1})| is < ε by (91). This leaves us with

|F_n(x) − f(x)| ≤ 5ε for all n > N and all x ∈ [a, b]

Since ε was arbitrary and N does not depend on x, this proves that {F_n} converges uniformly to f on [a, b].
Exercise 7.14
Exercise 7.15
Let ε > 0 be given. The family {f_n} is equicontinuous on [0, 1], which means that there exists some δ such that, for all n,

|0 − y| < δ → |f_n(0) − f_n(y)| < ε

Since f_n(t) = f(nt), this means that for all y with |y| < δ and all n ∈ N we have |f(0) − f(ny)| < ε. By taking n sufficiently large and choosing y appropriately we can make ny take on any value in (0, ∞). Since ε was arbitrary, f(t) = f(0) for every t ≥ 0; that is, f is constant on [0, ∞).
Exercise 7.16
See part (b) of exercise 7.13
Exercise 7.17
Exercise 7.18
Lemma 1: {Fn } is uniformly bounded
We’re told that {fn } is a uniformly bounded sequence: let M be the upper bound of {|fn |}. For each n, we have
Z x
Z x
Z b
|Fn (x)| =
fn (t) dt ≤
|fn (t)| dt ≤
|fn (t)| dt ≤ (b − a)M
a
a
a
and therefore {|Fn |} is uniformly bounded.
Lemma 2: {Fn } is equicontinuous
Let ε > 0 be given. Let δ = ε/M. For any n and any x, y ∈ [a, b],

|y − x| < δ → |F_n(y) − F_n(x)| = |∫_x^y f_n(t) dt| ≤ |∫_x^y |f_n(t)| dt| ≤ |y − x| M < δM = ε

Therefore, by theorem 7.25, we know that {F_n} contains a uniformly convergent subsequence.
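As a numerical illustration of the two lemmas (a sketch only, with the arbitrary example f_n(t) = sin(nt) on [a, b] = [0, 1], so that M = 1 and F_n(x) = (1 − cos(nx))/n):

import math

def F(n, x):
    return (1.0 - math.cos(n * x)) / n   # integral of sin(nt) from 0 to x

xs = [k / 1000.0 for k in range(1001)]
for n in (1, 10, 1000):
    bounded = all(abs(F(n, x)) <= 1.0 + 1e-12 for x in xs)            # lemma 1 bound (b - a)M = 1
    lipschitz = all(abs(F(n, y) - F(n, x)) <= abs(y - x) + 1e-12      # lemma 2 bound M|y - x|
                    for x, y in zip(xs, xs[1:]))
    print(n, bounded, lipschitz)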
Exercise 7.19
This was solved during class
Exercise 7.21
We know that A separates points because f (eiθ ) = eiθ ∈ A . For this same function, |f (eiθ )| = |eiθ | = 1 and so
A vanishes at no point of K. But the function f (eiθ ) = −eiθ is a continuous function that is not in the closure
of A .
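As a numerical sketch of the integral computation cited above (the Riemann-sum resolution is arbitrary):

import cmath

def integral_against_e_itheta(fn, steps=20000):
    # approximates the integral over [0, 2*pi] of fn(theta) * e^{i theta} d(theta)
    h = 2 * cmath.pi / steps
    return sum(fn(k * h) * cmath.exp(1j * k * h) for k in range(steps)) * h

for n in (0, 1, 2, 5):
    # members of A are finite sums of e^{i n theta} with n >= 0; each integrates to ~0
    print(n, abs(integral_against_e_itheta(lambda t, n=n: cmath.exp(1j * n * t))))
print("e^{-i theta}:", abs(integral_against_e_itheta(lambda t: cmath.exp(-1j * t))))  # ~2*pi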