Solutions to Principles of Mathematical Analysis (Walter Rudin)
Jason Rosendale, jason.rosendale@gmail.com
December 19, 2011

This work was done while I was an undergraduate student: if you really don't understand something in one of these proofs, it is very possible that it doesn't make sense because it's wrong. Any questions or corrections can be directed to jason.rosendale@gmail.com.

Exercise 1.1a
Let r be a nonzero rational number. We're asked to show that x ∉ Q implies that (r + x) ∉ Q. Proof of the contrapositive:

  r + x is rational                      (assumed)
→ (∃p ∈ Q)(r + x = p)                    (definition of rational)
→ (∃p, q ∈ Q)(q + x = p)                 (we're told that r is rational)
→ (∃p, q ∈ Q)(x = (−q) + p)              (existence of additive inverses in Q)

Because −q and p are both rational and Q is closed under addition, their sum is a member of Q.

→ (∃u ∈ Q)(x = u)
→ x is rational                          (definition of rational)

By assuming that r + x is rational, we proved that x must be rational. By contrapositive, then, if x is irrational then r + x is irrational, which is what we were asked to prove.

Exercise 1.1b
Let r be a nonzero rational number. We're asked to show that x ∉ Q implies that rx ∉ Q. Proof of the contrapositive:

  rx is rational                         (assumed)
→ (∃p ∈ Q)(rx = p)                       (definition of rational)
→ (∃p ∈ Q)(x = r⁻¹p)                     (existence of multiplicative inverses in Q)

(Note that we can assume that r⁻¹ exists only because we are told that r is nonzero.) Because r⁻¹ and p are both rational and Q is closed under multiplication, their product is also a member of Q.

→ (∃u ∈ Q)(x = u)
→ x is rational                          (definition of rational)

By assuming that rx is rational, we proved that x must be rational. By contrapositive, then, if x is irrational then rx is irrational, which is what we were asked to prove.

Exercise 1.2
Proof by contradiction. If √12 were rational, then we could write it as a reduced-form fraction p/q, where p and q are integers with no common divisors and q ≠ 0.

  p/q = √12                              (assumed)
→ p²/q² = 12
→ p² = 12q²

It's clear that 3 | 12q², which means that 3 | p². By Euclid's lemma, if a is a prime number and a | mn, then a | m or a | n. Therefore, since 3 | p·p and 3 is prime,

→ 3 | p
→ 9 | p²                                 (p = 3k implies p² = 9k²)
→ (∃m ∈ N)(p² = 9m)                      (definition of divisibility)
→ (∃m ∈ N)(12q² = 9m)                    (substitution from p² = 12q²)
→ (∃m ∈ N)(4q² = 3m)                     (divide both sides by 3)
→ 3 | 4q²                                (definition of divisibility)

From the same property of prime divisors that we used previously, we know that 3 | 4 or 3 | q²: it clearly doesn't divide 4, so it must be the case that 3 | q². And since 3 | q·q and 3 is prime, 3 | q. Therefore:

→ 3 | q

And this establishes a contradiction. We began by assuming that p and q had no common divisors, but we have shown that 3 | p and 3 | q. So our assumption must be wrong: there is no reduced-form fraction p/q such that p/q = √12.

Exercise 1.3a
If x ≠ 0 and xy = xz, then y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹(xz) = (x⁻¹x)z = 1z = z.

Exercise 1.3b
If x ≠ 0 and xy = x, then y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹x = 1.

Exercise 1.3c
If x ≠ 0 and xy = 1, then y = 1y = (x⁻¹x)y = x⁻¹(xy) = x⁻¹·1 = x⁻¹ = 1/x.

Exercise 1.3d
If x ≠ 0, then the fact that x⁻¹x = 1 means that x is the inverse of x⁻¹: that is, x = (x⁻¹)⁻¹ = 1/(1/x).

Exercise 1.4
We are told that E is nonempty, so there exists some e ∈ E. By the definition of lower bound, (∀x ∈ E)(α ≤ x), so α ≤ e. By the definition of upper bound, (∀x ∈ E)(x ≤ β), so e ≤ β. Together, these two inequalities tell us that α ≤ e ≤ β.
We’re told that S is ordered, so by the transitivity of order relations this implies α ≤ β. 2 Exercise 1.5 We’re told that A is bounded below. The field of real numbers has the greatest lower bound property, so we’re guaranteed to have a greatest lower bound for A. Let β be this greatest lower bound. To prove that −β is the least upper bound of −A, we must first show that it’s an upper bound. Let −x be an arbitrary element in −A: 7→ −x ∈ −A assumed →x∈A definition of membership in −A →β≤x β = inf(A) → −β ≥ −x consequence of 1.18(a) We began with an arbitrary choice of −x, so this proves that (∀ − x ∈ −A)(−β ≥ −x), which by definition tells us that −β is an upper bound for −A. To show that −β is the least such upper bound for −A, we choose some arbitrary element less than −β: 7→ α < −β assumed → −α > β consequence of 1.18(a) Remember that β is the greatest lower bound of A. If −α is larger than inf(A), there must be some element of A that is smaller than −α. → (∃x ∈ A)(x < −α) (see above) → (∃ − x ∈ −A)(−x > α) → !(∀ − x ∈ −A)(−x ≤ α) consequence of 1.18(a) → α is not an upper bound of −A definition of upper bound This proves that any element less than −β is not an upper bound of −A. Together with the earlier proof that −β is an upper bound of −A, this proves that −β is the least upper bound of −A. Exercise 1.6a The difficult part of this proof is deciding which, if any, of the familiar properties of exponents are considered axioms and which properties we need to prove. It seems impossible to make any progress on this proof unless we 1 can assume that (bm )n = bmn . On the other hand, it seems clear that we can’t simply assume that (bm ) n = bm/n : this would make the proof trivial (and is essentially assuming what we’re trying to prove). As I understand this problem, we have defined xn in such a way that it is trivial to prove that (xa )b = xab 1 when a and b are integers. And we’ve declared in theorem 1.21 that, by definition, the symbol x n is the element 1 n n such that (x ) = x. But we haven’t defined exactly what it might mean to combine an integer power like n and some arbitrary inverse like 1/r. We are asked to prove that these two elements do, in fact, combine in the way we would expect them to: (xn )1/r = xn/r . Unless otherwise noted, every step of the following proof is justified by theorem 1.21. 3 7→ bm = bm assumed 1 n 1 → ((bm ) )n = bm definition of x n 1 → ((bm ) n )nq = bmq We were told that m n = mq = np. Therefore: p q which, by the definition of the equality of rational numbers, means that 1 → ((bm ) n )nq = bnp 1 → ((bm ) n )qn = bpn commutativity of multiplication From theorem 12.1, we can take the n root of each side to get: 1 → ((bm ) n )q = bp From theorem 12.1, we can take the q root of each side to get: 1 1 → (bm ) n = (bp ) q Exercise 1.6b As in the last proof, we assume that br+s = br bs when r and s are integers and try to prove that the operation p works in a similar way when r and s are rational numbers. Let r = m n and let s = q where m, n, p, q ∈ Z and n, q 6= 0. p m 7→ br+s = b n + q → br+s = b r+s →b mq+pn nq = (b definition of addition for rationals mq+pn ) → br+s = (bmq bpn ) r+s →b r+s →b = (b = (b r+s mq mq nq ) 1 nq )(b m n 1 nq from part a 1 nq pn (b ) pn nq legal because mq and pn are integers 1 nq ) corollary of 1.21 from part a p q →b = (b )(b ) r+s →b = (br )(bs ) Exercise 1.6c We’re given that b > 1. Let r be a rational number. 
Proof by contradiction that br is an upper bound of B(r): 7→ br is not an upper bound of B(r) hypothesis of contradiction → (∃x ∈ B(r))(x > br ) formalization of the hypothesis By the definition of membership in B(r), x = bt where t is rational and t ≤ r. → (∃t ∈ Q)(bt > br ∧ t ≤ r) It can be shown that b−t > 0 (see theorem S1, below) so we can multiply this term against both sides of the inequality. → (∃t ∈ Q)(bt b−t > br b−t ∧ t ≤ r) t−t → (∃t ∈ Q)(b r−t >b ∧ t ≤ r) theorem S2 from part b → (∃t ∈ Q)(1 > br−t ∧ r − t ≥ 0) → (∃t ∈ Q)(1−(r−t) > b ∧ r − t ≥ 0) →1>b And this establishes our contradiction, since we were given that b > 1. Our initial assumption must have been incorrect: br is, in fact, an upper bound of B(r). We must still prove that it is the least upper bound of B(r), though. To do so, let α represent an arbitrary rational number such that bα < br . From this, we need to 4 prove that α < r. 7→ bα < br α −r →b b α−r α−r →b →b hypothesis of contradiction r −r <b b theorem S2 r−r <b from part b <1 from part b Exercise 1.7 a Proof by induction. Let S = {n : bn − 1 ≥ n(b − 1)}. We can easily verify that 1 ∈ S. Now, assume that k ∈ S: 7→ k ∈ S hypothesis of induction → bk − 1 ≥ k(b − 1) definition of membership in S → bbk − 1 ≥ k(b − 1) we’re told that b > 1. k+1 − b ≥ k(b − 1) k+1 ≥ k(b − 1) + b →b →b → bk+1 − 1 ≥ k(b − 1) + b − 1 → bk+1 − 1 ≥ (k + 1)(b − 1) → k+1∈S definition of membership in S By induction, this proves that (∀n ∈ N)(bn − 1 ≥ n(b − 1)). Alternatively, we could prove this using the same identity that Rudin used in the proof of 1.21. From the distributive property we can verify that bn − an = (b − a)(bn−1 a0 + bn−2 a1 + . . . + b0 an−1 ). So when a = 1, this becomes bn − 1 = (b − 1)(bn−1 + bn−2 + . . . + b0 ). And since b > 1, each term in the bn−k series is greater than 1, so bn − 1 ≥ (b − 1)(1n−1 + 1n−2 + . . . + 10 ) = (b − 1)n. Exercise 1.7 b 1 1 7→ n(b n − 1) = n(b n − 1) 1 1 → n(b n − 1) = (1 + 1 + . . . + 1)(b n − 1) | {z } 1 n → n(b − 1) = (1 n times n−1 n−2 +1 1 + . . . + 10 )(b n − 1) 1 It can be shown that bk > 1 when b > 1, k > 0 (see theorem S4). Replacing 1 with b n gives us the inequality: 1 1 1 1 1 → n(b n − 1) ≤ ((b n )n−1 + (b n )n−2 + . . . + (b n )0 )(b n − 1) Now we can use the algebraic identity bn − an = (bn−1 a0 + bn−2 a1 + . . . + b0 an−1 )(b − a): 1 1 → n(b n − 1) ≤ ((b n )n − 1) 1 → n(b n − 1) ≤ (b − 1) Exercise 1.7 c 7→ n > (b − 1)/(t − 1) assumed → n(t − 1) > (b − 1) this holds because n, t, and b are greater than 1 1 n → n(t − 1) > (b − 1) ≥ n(b − 1) 1 n → n(t − 1) > n(b − 1) 1 n → (t − 1) > (b − 1) →t>b from part b transitivity of order relations n > 0 → n−1 > 0 would be a trivial proof 1 n 5 Exercise 1.7 d We’re told that bw < y, which means that 1 < yb−w . Using the substitution yb−w = t with part (c), we’re lead 1 1 directly to the conclusion that we can select n such that yb−w > b n . From this we get y > bw+ n , which is what 1 1 w+ w we were asked to prove. As a corollary, the fact that b n > 1 means that b n > b . Exercise 1.7 e We’re told that bw > y, which means that bw y −1 > 1. Using the substitution bw y −1 = t with part (c), we’re 1 lead directly to the conclusion that we can select n such that bw y −1 > b n . Multiplying both sides by y gives us −1 1 1 w w− b > b n y. Multiplying this by b n gives us b n > y, which is what we were asked to prove. As a corollary, −1 1 1 the fact that b n > 1 > 0 means that, upon taking the reciprocals, we have b n < 1 and therefore bw− n < bw . 
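Before moving on to part (f), here is a quick numerical spot-check of the inequalities from parts (b) and (c). This sketch is mine (the particular values of b and t are arbitrary choices), not part of the solution; it only confirms that the algebra above behaves as claimed.

```python
import math

# Spot-check of Exercise 1.7 parts (b) and (c) for some b > 1 (values chosen arbitrarily):
#   (b)  n * (b**(1/n) - 1) <= b - 1  for every positive integer n
#   (c)  if t > 1 and n > (b - 1)/(t - 1), then b**(1/n) < t
b = 7.3
for n in (1, 2, 5, 50, 1000):
    assert n * (b ** (1 / n) - 1) <= (b - 1) + 1e-12   # part (b), with float slack

t = 1.001
n = math.floor((b - 1) / (t - 1)) + 1                  # smallest integer n > (b - 1)/(t - 1)
assert b ** (1 / n) < t                                # part (c)
print("checked parts (b) and (c) for b =", b, ", t =", t, ", n =", n)
```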
Exercise 1.7f
We'll prove that b^x = y by showing that the assumptions b^x > y and b^x < y both lead to contradictions.

If b^x > y, then from part (e) we can choose n such that b^x > b^(x−1/n) > y. For any w ∈ A we then have b^w < y < b^(x−1/n), and since u ≤ v implies b^u ≤ b^v (because B(u) ⊆ B(v)), this forces w < x − 1/n. So x − 1/n is an upper bound of A that is smaller than x. This is a contradiction, since we've assumed that x = sup(A).

If b^x < y, then from part (d) we can choose n such that y > b^(x+1/n) > b^x. From this we see that x + 1/n ∈ A, so x is not an upper bound of A. This is a contradiction, since we've assumed that x is the least upper bound of A.

Having ruled out these two possibilities, the trichotomy property of ordered fields forces us to conclude that b^x = y.

Exercise 1.7g
Assume that there are two real numbers w and x such that b^w = y and b^x = y, so that b^w = b^x. (Transitivity of equality alone isn't quite enough: we still need to know that equal powers force equal exponents.) Suppose, toward a contradiction, that w < x, and choose rationals p, q with w < p < q < x. Every element b^t of B(w) has t ≤ w < p, so b^t < b^p by part (b), since b^p = b^t · b^(p−t) with b^(p−t) > 1; thus b^p is an upper bound of B(w), and b^w ≤ b^p. Also b^q ∈ B(x), so b^q ≤ b^x. Since b^p < b^q, this gives b^w ≤ b^p < b^q ≤ b^x, contradicting b^w = b^x. So w = x, and the real number x with b^x = y is unique.

Exercise 1.8
In any ordered set, all elements of the set must be comparable (the trichotomy rule, definition 1.5). We will show by contradiction that (0, 1) is not comparable to (0, 0) in any potential ordering of the complex field C.

First, we assume that (0, 1) > (0, 0):

  (0, 1) > (0, 0)                           (hypothesis of contradiction)
→ (0, 1)(0, 1) > (0, 0)                     (definition 1.17(ii) of ordered fields)

We assumed here that (0, 0) can take the role of 0 in definition 1.17 of an ordered field. This is a safe assumption because the uniqueness of the additive identity shows immediately that (0, 0) + (a, b) = (a, b), so (0, 0) = 0.

→ (−1, 0) > (0, 0)                          (definition of complex multiplication)
→ (−1, 0)(0, 1) > (0, 0)                    (definition 1.17(ii), since we initially assumed (0, 1) > 0)
→ (0, −1) > (0, 0)                          (definition of complex multiplication)

It might seem that we established our contradiction as soon as we concluded that (−1, 0) > 0 or (0, −1) > 0, but we're trying to show that the complex field cannot be ordered under any order relation, even a bizarre one in which −1 > −i > 0. What we have shown is that (0, 1) and (0, −1) are both greater than zero. Therefore:

→ (0, −1) + (0, 1) > (0, 0) + (0, 1)        (definition 1.17(i) of ordered fields)
→ (0, 0) > (0, 1)                           (definition of complex addition)

This conclusion contradicts trichotomy, since we initially assumed that (0, 0) < (0, 1).

Next, we assume that (0, 1) < (0, 0):

  (0, 0) > (0, 1)                           (hypothesis of contradiction)
→ (0, 0) + (0, −1) > (0, 1) + (0, −1)       (definition 1.17(i) of ordered fields)
→ (0, −1) > (0, 0)                          (definition of complex addition)
→ (0, −1)(0, −1) > (0, 0)                   (definition 1.17(ii) of ordered fields)
→ (−1, 0) > (0, 0)                          (definition of complex multiplication)
→ (−1, 0)(0, −1) > (0, 0)                   (definition 1.17(ii), since we've established (0, −1) > (0, 0))
→ (0, 1) > (0, 0)                           (definition of complex multiplication)

Once again trichotomy has been violated.

Finally, proof by contradiction that (0, 1) ≠ (0, 0): if we assume that (0, 1) = (0, 0), we're led to the conclusion that (a, b) = (0, 0) for every complex number, since (a, b) = a(0, 1)⁴ + b(0, 1) = a(0, 0) + b(0, 0) = (0, 0). By the transitivity of equality, this would mean that every element is equal to every other. And this contradicts definition 1.12 of a field, where we're told that there are at least two distinct elements: the additive identity ('0') and the multiplicative identity ('1').

Exercise 1.9a
To prove that this relation turns C into an ordered set, we need to show that it satisfies the two requirements in definition 1.5.
Proof of transitivity: 7→ (a, b) < (c, d) ∧ (c, d) < (e, f ) assumption → [a < c ∨ (a = c ∧ b < d)] ∧ [c < e ∨ (c = e ∧ d < f )] definition of this order relation → (a < c ∧ c < e) ∨ (a < c ∧ c = e ∧ d < f ) ∨(a = c ∧ b < d ∧ c < e) ∨ (a = c ∧ b < d ∧ c = e ∧ d < f ) distributivity of logical operators → (a < e) ∨ (a < e ∧ d < f ) ∨ (a < e ∧ b < d) ∨ (a = e ∧ b < f ) transitivity of order relation on R Although we’re falling back on the the transitivity of an order relation, we are not assuming what we’re trying to prove. We’re trying to prove the transitivity of the dictionary order relation on C, and this relation is defined in terms of the standard order relation on R. This last step is using the transitivity of this standard order relation on R and is not assuming that transitivity holds for the dictionary order relation. → (a < e) ∨ (a < e) ∨ (a < e) ∨ (a = e ∧ b < f ) p∧q →p → a < e ∨ (a = e ∧ b < f ) p∨p→p → (a, b) < (e, f ) definition of this order relation To prove that the trichotomy property holds for the dictionary relation on Q, we rely on the trichotomy property of the underlying standard order relation on R. Let (a, b) and (c, d) be two elements in C. From the standard order relation, we know that 7→ (a, b) ∈ C ∧ (c, d) ∈ C assumed → a, b, c, d ∈ R definition of a complex number → (a < c) ∨ (a > c) ∨ (a = c) trichotomy of the order relation on R → (a < c) ∨ (a > c) ∨ (a = c ∧ (b < d ∨ b > d ∨ b = d)) trichotomy of the order relation on R → (a < c) ∨ (a > c) ∨ (a = c ∧ b < d) ∨ (a = c ∧ b > d) ∨(a = c ∧ b = d) distributivity of the logical operators → (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) < (c, d) ∨ (a, b) > (c, d) ∨(a, b) = (c, d) definition of the dictionary order relation → (a, b) < (c, d) ∨ (a, b) > (c, d) ∨ (a, b) = (c, d) And this is the definition of the trichotomy law, so we have proven that the dictionary order turns the 7 complex numbers into an ordered set. Exercise 1.9b C does not have the least upper bound property under the dictionary order. Let E = {(0, a) : a ∈ R}. This subset is just the imaginary axis in the complex plane. This subset clearly has an upper bound, since (x, 0) > (0, a) for any x > 0. But it does not have a least upper bound: for any proposed upper bound (x, y) with x > 0, we see that x (x, y) < ( , y) < (0, a) 2 So that ( x2 , y) is an upper bound less than our proposed least upper bound, which is a contradiction. Exercise 1.10 This is just straightforward algebra, and is too tedious to write out. Exercise 1.11 If we choose w = z |z| z and choose r = |z|, then we can easily verify that |w| = 1 and that rw = |z| |z| = z. Exercise 1.12 √ √ Set ai zi and bi = z¯i and use the Cauchy-Schwarz inequality (theorem 1.35). This gives us n X √ p zj z¯j 2 n n X √ 2X p 2 ≤ | zj | | z¯j | j=1 j=1 j=1 which is equivalent to 2 |z1 + z2 + . . . + zn |2 ≤ (|z1 | + |z2 | + . . . + |zn |) Taking the square root of each side shows that |z1 + z2 + . . . + zn | ≤ |z1 | + |z2 | + . . . + |zn | which is what we were asked to prove. Exercise 1.13 |x − y|2 = (x − y)(x − y) = (x − y)(x̄ − ȳ) Theorem 1.31(a) = xx̄ − xȳ − yx̄ + y ȳ = xx̄ − (xȳ + yx̄) + y ȳ = |x|2 − 2Re(xȳ) + |y|2 Theorem 1.31(c), definition 1.32 ≥ |x|2 − 2|Re(xȳ)| + |y|2 x ≤ |x|, so −|x| ≥ |x|. ≥ |x|2 − 2|xȳ| + |y|2 Theorem 1.33(d) 2 2 Theorem 1.33(c) = |x|2 − 2|x||y| + |y|2 Theorem 1.33(b) = |x| − 2|x||ȳ| + |y| = (|x| − |y|)(|x| − |y|) = (|x| − |y|)(|x̄| − |ȳ|) Theorem 1.33(b) = (|x| − |y|)(|x| − |y|) Theorem 1.31(a) 2 = ||x| − |y|| This chain of inequalities shows us that ||x| − |y||2 ≤ |x − y|2 . 
Taking the square root of both sides results in the claim we wanted to prove. 8 Exercise 1.14 |1 + z|2 + |1 − z|2 = (1 + z)(1 + z) + (1 − z)(1 − z) = (1 + z)(1̄ + z̄) + (1 − z)(1̄ − z̄) Theorem 1.31(a) = (1 + z)(1 + z̄) + (1 − z)(1 − z̄) The conjugate of 1 = 1 + 0i is just 1 − 0i = 1. = (1 + z̄ + z + z z̄) + (1 − z̄ − z + z z̄) = (2 + 2z z̄) = (2 + 2) We are told that z z̄ = 1 =4 Exercise 1.15 Using the logic and the notation from Rudin’s proof of theorem 1.35, we see that equality holds in the Schwarz inequality when AB = |C|2 . This have B(AB − |C|2 ) = 0, and from the given chain of equalities P occurs when 2 we see that this occurs when |Baj − Cbj | = 0. For this to occur we must have Baj = Cbj for all j, which occurs only when B = 0 or when C aj = bj for all j B That is, each aj must be a constant multiple of bj . Exercise 1.16 We know that |z − x|2 = |z − y|2 = r2 . Expanding these terms out, we have |z − x|2 = (z − x) · (z − x) = |z|2 − 2z · x + |x|2 |z − y|2 = (z − y) · (z − y) = |z|2 − 2z · y + |y|2 For these to be equal, we must have −2z · x + |x|2 = −2z · y + |y|2 which happens when 1 [|x|2 − |y|2 ] (1) 2 We also want |z − x| = r, which occurs when z = x + rû where |û| = 1. Using this representation of z, the requirement that r2 = |z − y|2 becomes z · (x − y) = r2 = |z − y|2 = |x + rû − y|2 = |(x − y) + rû2 | = |x − y|2 + 2rû · (x − y) + |rû|2 = d2 + 2rû · (x − y) + r2 Rearranging some terms, this becomes û · (y − x) = d2 d = |y − x| 2r 2r (2) A quick, convincing, and informal proof would be to appeal to the relationship a · b = |a||b| cos(θ) where θ is the angle between the two vectors; the previous equation then becomes |û||y − x| cos(θ) = d |y − x| 2r Dividing by |y − x| and remembering that û = 1, this becomes cos(θ) = d 2r where θ is the angle between the fixed vector (y − x) and the variable vector û. It’s easy to see that this equation will hold for exactly one û when d = 2r; it will hold for no û when d > 2r; it will hold for two values of û when d < 2r and n = 2; and it will hold for infinitely many values of û when d < 2r and n > 2. Each value of û corresponds with a unique value of z. More formal proofs follow. 9 part (a) When d < 2r, equation (2) is satisfied for all û for which û · (y − x) = d |y − x| < |y − x| 2r By the definition of the dot product, this is equivalent to u1 (y1 + x1 ) + u2 (y2 + x2 ) + . . . + uk (yk + xk ) = d |y − x| 2r The only other requirement for the values of ui is that q u21 + u22 + . . . + u2k = 1 (3) (4) This gives us a system of two equations with k variables. As long as k ≥ 3 we have more variables than equations and therefore the system will have infinitely many solutions. part (b) Evaluating d2 , we have: d2 = |x − y|2 = |(x − z) + (z − y)|2 = [(x − z) + (z − y)] · [(x − z) + (z − y)] definition of inner product · = (x − z) · (x − z) + 2(x − z) · (z − y) + (z − y) · (z − y) inner products are distributive 2 2 = |x − z| + 2(x − z) · (z − y) + |z − y| definition of inner product · Evaluating (2r)2 , we have: (2r)2 = (r + r)2 = (|z − x| + |z − y|)2 = |z − x|2 + 2|z − x||z − y| + |z − y|2 commutativity of multiplication If 2r = d then d2 = (2r)2 and therefore, by the above evaluations, we have |x − z|2 + 2(x − z) · (z − y) + |z − y|2 = |z − x|2 + 2|z − x||z − y| + |z − y|2 which occurs if and only if 2(x − z) · (z − y) = 2|x − z||z − y| From exercise 14 we saw that this equality held only if (x − z) = c(z − y) for some constant c; we know that |x − z| = |z − y| so c = ±1; we know x 6= y so c = 1. 
Therefore we have x − z = z − y, from which we have z= x+y 2 and there is clearly only one such z that satisfies this equation. part (c) If 2r < d then we have |x − y| > |x − z| + |z − y| which is equivalent to |(x − z) + (z − y)| > |x − z| + |z − y| which violates the triangle inequality (1.37e) and is therefore false for all z. 10 Exercise 1.17 First, we need to prove that a · (b + c) = a · b + a · c and that (a + b) · c = a · c + b · c: that is, we need to prove that the distributive property holds between the inner product operation and addition. P a · (b + c) = ai (bi + ci ) P = (ai bi + ai ci ) P P = ai bi + ai ci =a·b+a·c P (a + b) · c = (ai + bi )ci P = (ai ci + bi ci ) P P = ai ci + bi ci =a·c+b·c definition 1.36 of inner product distributive property of R associative property of R definition 1.36 of inner product definition 1.36 of inner product distributive property of R associative property of R definition 1.36 of inner product The rest of the proof follows directly: |x + y|2 + |x − y|2 = (x + y) · (x + y) + (x − y) · (x − y) = (x + y) · x + (x + y) · y + (x − y) · x − (x − y) · y =x·x+y·x+x·y+y·y+x·x−y·x−x·y+y·y =x·x+y·y+x·x+y·y = 2|x|2 + 2|y|2 Exercise 1.18 If x = 0, then x · y = 0 for any y ∈ Rk . If x 6= 0, then at least one of the elements x1 , . . . , xk must be nonzero: let this element be represented by xa . Let xb represent any other element of x. Choose y such that: x b i=a xa yi = −1 i = b 0 otherwise We can now see that x·y = xa xxab +xb (−1) = xb −xb = 0. We began with an arbitrary vector x and demonstrated a method of construction for y such that x · y = 0: therefore, we can always find a nonzero y such that x · y = 0. This is not true in R1 because the nonzero elements of R are closed with respect to multiplication. Exercise 1.19 We need to determine the circumstances under which |x − a| = 2|x − b| and |x − c| = r. To do this, we need to manipulate these equalities until they have a common term that we can use to compare them. |x − a| = 2|x − b| |x − a|2 = 4|x − b|2 |x|2 − 2x · a + |a|2 = 4|x|2 − 8x · b + 4|b|2 3|x|2 = |a|2 − 2x · a + 8x · b − 4|b|2 |x − c| = r |x − c|2 = r2 |x|2 − 2x · c + |c|2 = r2 |x|2 = r2 + 2x · c − |c|2 3|x|2 = 3r2 + 6x · c − 3|c|2 Combining these last two equalities together, we have 11 7→ |a|2 − 2x · a + 8x · b − 4|b|2 = 3r2 + 6x · c − 3|c|2 → |a|2 − 2x · a + 8x · b − 4|b|2 − 3r2 − 6x · c + 3|c|2 = 0 → |a|2 − 4|b|2 + 3|c|2 − 2x · (a − 4b − 3c) − 3r2 = 0 We can zero out the dot product in this equation by letting 3c = 4b − a. Of course, this also determines a specific value of c. This substitution gives us the new equality: 2 4b a − 3r2 = 0 − 3 3 3 · 16 2 3 · 8 3 → |a|2 − 4|b|2 + |b| − a · b + |a|2 − 3r2 = 0 9 9 9 → 3|a|2 − 12|b|2 + 16|b|2 − 8a · b + |a|2 − 9r2 = 0 → |a|2 − 4|b|2 + 3 → 4|a|2 4|b|2 − 8a · b − 9r2 = 0 → 4|a − b|2 = 9r2 → 2|a − b| = 3r By choosing 3c = 4b − a and 3r = 2|a − b|, we guarantee that |x − a| = 2|x − b| iff |x − c| = r. Exercise 1.20 We’re trying to show that R has the least upper bound property. The elements of R are certain subsets of Q, and < is defined to be “is a proper subset of”. To say that an element α ∈ R has a least upper bound, then, is to say that the subset α has some “smallest” superset β such that α ⊂ β. We’re asked to omit property III, which told us that each α ∈ R has no largest element. √ With this restriction lifted, we have a new definition of “cut” that includes cuts such as (−∞, 1] and (−∞, 2]. 
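(An aside before continuing with Exercise 1.20: here is a quick numerical spot-check of the Exercise 1.19 identities 3c = 4b − a and 3r = 2|a − b|. The script, including its helper functions, is mine and not part of the solution; it just samples points on the sphere |x − c| = r and confirms that |x − a| = 2|x − b| holds for them.)

```python
import math
import random

# Spot-check of Exercise 1.19: with 3c = 4b - a and 3r = 2|a - b|, every point x
# on the sphere |x - c| = r should satisfy |x - a| = 2|x - b|.  (a and b are random.)
def norm(v): return math.sqrt(sum(t * t for t in v))
def sub(u, v): return [p - q for p, q in zip(u, v)]

random.seed(0)
a = [random.uniform(-5, 5) for _ in range(3)]
b = [random.uniform(-5, 5) for _ in range(3)]
c = [(4 * bi - ai) / 3 for ai, bi in zip(a, b)]
r = 2 * norm(sub(a, b)) / 3

for _ in range(5):
    u = [random.gauss(0, 1) for _ in range(3)]
    u = [t / norm(u) for t in u]                  # random unit vector
    x = [ci + r * ui for ci, ui in zip(c, u)]     # a point with |x - c| = r
    print(round(norm(sub(x, a)) - 2 * norm(sub(x, b)), 10))   # ~0 every time
```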
To prove that each subset of R with an upper bound must have a least upper bound, we will follow Step 3 in the book almost exactly. We will define two subsets A ⊂ R and γ ⊂ R, show that the subset γ is also an element of R, and then show that the subset/element γ is the least upper bound of A. Let A be a nonempty subset of R with an upper bound of β (Note: A is a subset of R, not a subset of R. The elements of A are cuts, which are subsets of Q). Define γ to be the union of all cuts in A. This means that γ contains every element from every cut in A: γ consists of elements from Q. proof that γ is a cut: The proof of criterion (I) has two parts. (i) γ is the union of elements of A, and we were told that A is nonempty. Therefore γ is nonempty. (ii) We are told that β is an upper bound for A, so x ∈ A → x ⊂ β (remember that < is defined as ⊂). But γ is just the union of cut elements in A, so γ is the union of proper subsets of β. This means that γ itself is a proper subset of β. This shows that γ ⊂ β ⊆ Q, so γ 6= Q. So criterion (I) in the definition of “cut” has been met. To prove part (II), pick some arbitrary rational p ∈ γ. γ is the set of cuts in A, so p ∈ α for some cut α ∈ A. Choose a second rational q such that q < p: by the definition of cut, the fact that p ∈ α and q < p means that q ∈ α and therefore q ∈ γ. And we’re being asked to disregard part (III), so this is sufficient to prove that γ is a cut. proof that γ is the least upper bound of A (i) Choose any arbitrary cut α ∈ A. We’ve defined γ as the union of all cuts in A, so it’s clear that every rational number in α is also contained in γ: that is, α ⊆ γ. And by the definition of < for this set, this tells us that α ≤ γ for every α ∈ A. (ii) Suppose that δ < γ. By the definition of < for this set, this means that δ ⊂ γ. This is a proper subset, so there must be some rational s such that s 6∈ δ but s ∈ γ. In order for s ∈ γ, it must be the case that s ∈ α for some cut α ∈ A. We can now show that δ ⊂ α. For every r ∈ δ,it’s also true that s 6∈ δ. By (II), this means 12 that r < s (using standard rational order). And since s ∈ α, (II) also shows that r ∈ α. This shows that δ ⊂ α, which means δ < α ∈ A (using the subset order on R), and therefore δ is not an upper bound of A. Together, these two facts show that γ is the least upper bound of A. definition of addition Following the example in the book, we define the addition of two cuts α + β to be the set of all sums r + s where r ∈ α and s ∈ β. The book defined 0∗ to be the set of rational numbers < 0, but by omitting requirement (III) we are forced to redefine our zero. Let 0# be the set of all rational numbers ≤ 0. The original definition had to omit 0 from 0∗ in order to keep R, the set of cuts, closed under addition (otherwise we’d have 0∗ + 0∗ = (−∞, 0] which is not a cut because of criterion (III)). Our new zero 0# must include 0 as an element because our newly defined cuts can have a greatest element. The set 0∗ can no longer function as our zero since (−∞, x] + 0∗ = (−∞, x); these two cuts are not equal ((−∞, x) < (−∞, x]), so 0∗ has not functioned as the additive identity. field axioms A1,A2, and A3 The proofs from the book for closure, commutativity, and associativity are directly applicable to our new definition of cut. field axiom A4: existence of an additive identity Let α be a cut in R. Let r and s be rationals such that r ∈ α and s ∈ 0# . Then r + s ≤ r, which means that r + s ∈ α. This shows us that α + 0# ⊆ α. 
Now let p and r be rationals such that p ∈ α, r ∈ α, and p ≤ r. This means that p − r ≤ 0, so p − r ∈ 0# . Therefore r + (p − r) ∈ α + 0# , which means that p ∈ α + 0# . This shows us that α ⊆ α + 0# . Having shown that α + 0# ⊆ α and that α ⊆ α + 0# , we conclude that α = α + 0# for any α ∈ R. field axiom A5: existence of additive inverses The book constructed the definition of inverse so that the inverse of (−∞, x) would be (−∞, −x). This works under the original definition of “cut”, since (−∞, x) + (−∞, −x) = (−∞, 0) = 0∗ . It is tempting to just define the inverse for our new definition of cut so that the inverse of (−∞, x] is (∞, −x]. This overlooks the fact that both (−∞, x] and (−∞, x) are cuts under our new definition: both of these elements must have additive inverses. To show that A5 is not satisfied, we need to demonstrate that at least one element of R has no additive inverse. Let α = (−∞, r) for some rational r. Assume that there was some cut β such that α + β = 0# = (−∞, 0]: we will show that this assumption leads to a contradiction. (i) Assume that β does not contain any elements greater than −r. Let p and q be arbitrary rationals such that p ∈ α and q ∈ β. From our definitions of α and β, we know that p < r and q ≤ −r. Combining these two inequalities tells us that p + q < −r + r, or p + q < 0. But p and q were arbitrary members of α and β, so this shows that 0 6∈ α + β. So it cannot be the case that α + β = 0# . (ii) Assume that β contains at least one element q that is greater than −r. Let s represent the difference q − (−r): then q = −r + s (note that s must be a positive rational). Let p be an arbitrary rational such that p ∈ α. By the definition of α, we know that p < r. And from property (II) of cuts, we know that r − s < p < r. Adding q = −r + s to each element in this equality gives us 0 < p + q < s. And p + q is an element of α + β, so we see that α + β contains some positive rational: it cannot be the case that α + β = 0# . Whether or not β contains an element greater than −r, we find ourselves in contradiction with the initial assumption that β might be the inverse of α. Therefore there can be no inverse for α. In conclusion, we see that omitting (III) forces us to redefine the additive identity, and this new definition results in the existence of elements in R that have no inverse. 13 Exercise 2.1 This is immediately justified by noting that the definition of subset, x ∈ ∅ → x ∈ A, is satisfied for any set A because of the false antecedent. A more formal proof: 7→ ¬(∃x ∈ ∅) → (∀x)(x ∈ 6 ∅) definition of an empty set negation of the ∃ quantifier → (∀x)(x 6∈ A → x 6∈ ∅) Hilbert’s PL1 This previous step is justified by the argument p → (q → p): if something is true (like x 6∈ ∅), then everything implies it. → (∀x)(x ∈ ∅ → x ∈ A) contrapositive →∅⊆A definition of subset Exercise 2.2 The set of integers is countable, so by theorem 2.13 the set of all (k + 1)-tuples (a0 , a1 , . . . , ak ) with a0 6= 0 is also countable. Let this set be represented by Zk . For each a∈ Zk consider the polynomial a0 z k +a1 z k−1 +. . .+ak = 0. From the fundamental theorem of algebra, we know that there are exactly k complex roots for this polynomial. We now have a series of nested sets that encompass every possible root for every possible polynomial with integer coefficients. More specifically, we have a countable number of Zk s, each containing a countable number of (k + 1)-tuples, each of which corresponds with k roots of a k-degree polynomial. 
So our set of complex roots (call it R) is a countable union of countable unions of finite sets. This only tells us that R is at most countable: it is either countable or finite. To show that R is not finite, consider the roots for 2-tuples in Z1 . Each 2-tuple of the form (−1, n) corresponds with the polynomial −z + n = 0 whose solution is z = n. There is clearly a unique solution for each n ∈ Z, so R is an infinite set. Because R is also at most countable, this proves that R is countable. I did not use the hint provided in the text, which either means that this proof is invalid or that there is an alternate (simpler?) proof. Exercise 2.3 The set of real numbers is uncountable, so if every real number were algebraic then the set of real algebraic numbers would be uncountable. However, exercise 2.2 showed that the algebraic complex numbers were countable. The real numbers are a subset of the complex numbers, so the set of algebraic real numbers is at most countable. Exercise 2.4 The rational real numbers are countable. If the irrational real numbers were countable, then the union of rational and irrational reals would be countable. But this union is just R, which is not countable. So the irrational reals are not countable. Exercise 2.x: The Cantor Set The idea behind the Cantor set is that each term in the series E1 ⊃ E2 . . . ⊃ En removes the middle third of each remaining line segment, so that you get a series of increasingly smaller segments that look like: 14 We can see that E0 has one segment of length 1, E1 has two segments of length 1/3, and Em has 2m segments of length 1/3m . Notice that for any possible segment (α, β), we can choose m to be large enough so that the maximum segment length is less than the length of (α, β) and therefore (α, β) 6∈ Em . And if the segment is not in Em for some m, it’s not in the union P . This is the reasoning behind Rudin’s statement that “P contains no segment”. To justify the claim that no point in P is an isolated point, we need to show that for every point p1 ∈ P andTany arbitrarily small radius r, we can find some second point p2 such that d(p1 , p2 ) < r. P is defined to be Ei , so if p1 ∈ P then it is a member of every Ei . Choose some element Em where m is so large that the segments in Em are all shorter than r. We are assured that this is possible by the Archimedean property of the rationals, since we need only find m such that 3−m < r. So we’ve found an Em with a line segment containing p1 . Let p2 represent one of the endpoints of this line. The length of this segment is less than r, so d(p1 , p2 ) < r. In each subsequent term in the series {Ei }, the endpoint p2 will never be “cut off”: it will always be the endpoint of a segment from which the middle third is lost. So p2 ∈ P . And since this was true for an arbitrary point p1 ∈ P and an arbitrarily small r, this shows that every neighborhood of every point in P is a limit point and not an isolated point. Exercise 2.5 1 , m ∈ N. No point in E is a limit Let E be a subset of (0, 1] consisting only all rational numbers of the form m point, but the limit points of E need not be members of E: we will demonstrate that the point 0 is a limit point of E. 1 (i) Proof that no point in E is a limit point: let pm be an arbitrary member of E. Then pm = m for some 1 1 m ∈ N. The closest point to pm is therefore pm+1 = m+1 . So we just choose r such that r = 2 d(pm , pm+1 ) and we see that there are no other points in this neighborhood of pm , which shows that pm is an isolated point. 
(ii) Proof that 0 is a limit point: choose an arbitrarily small radius r. From the Archimedian property, we 1 1 1 < r. This pm = m is a member of E, and d(0, pm ) = m < r. This shows that can find some m ∈ N such that m 0 is a limit point. (iii) Proof that no other points are limit points: let x be some point that is neither zero nor a member of E. This point must be either less than zero, greater than 1, or it must lie between two sequential points pm , pm+1 ∈ E. Let the dx represent the smallest distance from the set {d(x, 0), d(x, 1), d(x, pm ), d(x, pm+1 )}. Choose r such that r = 21 dx . There can be no points of E in the neighborhood Nr (x). If there were, then this 1 1 would indicate the existence of a point of E less than zero, greater than 1, or between m and m+1 . None of these can exist in E as we defined it, so no points other than 0 are limit points. 1 Now let F be the subset of (2, 3] containing rational numbers of the form 2 + m and let G be the subset of 1 (4, 5] consisting of rational numbers of the form 4 + m . Just as E was shown to have a limit point of only zero, these sets can be shown to have limit points of only 2 and 4. Therefore the union E ∪ F ∪ G is bounded and has three limit points {0, 2, 4}. Exercise 2.6 (i) We’re asked to prove that E 0 is closed. This is equivalent to proving that every limit point of E 0 is a point of E. And by the definition of E 0 , this is equivalent to proving that every limit point of E 0 is a limit point of E. To prove this, let x represent an arbitrary limit point of E 0 . Choose any arbitrarily small r and let s = 2r , t = 4r 1 . Since x is a limit point of E 0 , we can find a point y ∈ E 0 in the neighborhood Ns (x). And y, by virtue of being in E 0 , is a limit point of E: so we can find a point z ∈ E in the neighborhood Nt (y). The distance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us that d(x, z) ≤ d(x, y) + d(y, z) < s + t < r, so d(x, z) < r and therefore z is in the neighborhood Nr (x) for any 1t was chosen to be less than r to guarantee x 6= z. 15 arbitrarily small r. So the point x is a limit point for E. But x was an arbitrarily chosen limit point of E 0 , so we have proven that every limit point of E 0 is a limit point of E, which is what we were asked to prove. The following image is helpful for imagining these points in R2 . (ii) We can use a similar technique to show that every limit point of Ē is a limit point E. Let x represent an arbitrary limit point of Ē (which we won’t assume to be a member of Ē). Choose any arbitrarily small r and let s = 2r , t = 4r . Since x is a limit point of Ē, theorem 2.20 tells us that the neighborhood Ns (x) contains infinitely many points of Ē. Because Ē is defined to be E 0 ∪ E, each of these infinitely many points in Ns (x) is either a member of E 0 or a member of E. (ii.a) First, assume that there exists at least one point y in Ns (x) such that y ∈ E 0 . By definition, this means that y is a limit point of E, so there is at least one point z ∈ E in the neighborhood Nt (y). The distance d(x, y) is less than s and d(y, z) is less than t, so definition 2.15 of metric spaces assures us that d(x, z) ≤ d(x, y) + d(y, z) < s + t < r, so d(x, z) < r and therefore we can find some z ∈ E in the neighborhood Nr (x) for any arbitrarily small r. So x is a limit point for E. (ii.b) Next, assume that none of the points in Ns (x) are members of E. This means that all of the infinitely many points of E 0 ∪ E in Ns (x) are members of E. 
From this, we see that every neighborhood of x contains elements of E. For any neighborhood Nt (x) with t < s, the fact that x is a limit point of Ē and that Nt (x) ⊂ Ns (x) means that Nt (x) contains infinitely many points in E. But this means that x is a limit point of E. For any neighborhood Nt (x) with t > s, the fact that Ns (x) ⊂ Nt (x) means that Nt (x) contains infinitely many points in E. And since every neighborhood of x contains an element of E, x is a limit point for E. The second part of the proof is to show that every limit point of E is a limit point of Ē and is relatively trivial. Every element of E is also an element of E ∪ E 0 = Ē. If x is a limit point of E, then every neighborhood of x contains an element of E. Therefore every neighborhood of x contains an element of Ē. So x is a limit point of Ē. We’ve shown that every limit point of Ē is a limit point of E and vice-versa, which is what we were asked to prove. (iii) We’re asked if E and E 0 always have the same limit points. The answer is “no”, and a counterexample can be found in the previous question. The limit points of E in exercise 2.5 were E 0 = {0, 2, 4}. And E 0 clearly has no limit points whatsoever. Exercise 2.7a We’ll prove the equality of these sets by showing that each is a subset of the other. (A) Assume that x ∈ B̄n . Then, x ∈ Bn or x is a limit point of Bn . (A1) If x ∈ Bn , then x ∈ Ak for some Ak ∈ S Ai . And if x ∈ Ak , then x ∈ A¯k . And this means that x ∈ S Āi . S (A2) If x is a limit point of Bn , then it must be a limitSpoint at least one specific Ak ∈ Ai . Proof by contradiction: assume that x is not a limit point for any Ak ∈ Ai . Then for each i, there is some neighborhood Nri (x) that contains no elements of Ai . Let s =min(ri ) (which exists because i is finite). Then Ns (x) contains 16 no elements from any Ai since Ns (x) ⊆ NriS (x) for each i. And if this neighborhood contains no points for any Ai , then it contains no points of Bn = Ai . And this is a contradiction, S since x is a limit point of Bn . Therefore, by contradiction, x must be a limit point of at least one specific A ∈ Ai . So x ∈ Āk , which means k S that x ∈ Āi . In either of these two cases, x ∈ (B) Assume that x ∈ point of Ak . S S Āi . This proves that x ∈ B̄n → x ∈ S Āi . Āi . Then x ∈ Āk for some k. And this means that either x ∈ Ak or x is a limit (B1) If x ∈ Ak , then x ∈ S Ai , which means that x ∈ Bn and therefore x ∈ B̄N . (B2) If x is a limit S point of Ak , then every neighborhood of x contain an element of Ak . But every element of Ak is an element of Ai = Bn , so this means that every neighborhood of x contains an element of Bn . By definition, this means that x is a limit point of Bn , so that x ∈ B̄n . In either ofSthese two cases, x ∈ B̄n . This proves S that x ∈ x ∈ B̄n → x ∈ Āi , so we have proven that B̄n = Āi . S Āi → x ∈ B̄n . And we’ve already shown that Exercise 2.7b S∞ Nothing in part B of the above proof required that i be finite, so it’s still the case that i=1 Āi ⊂ B̄. But part A of the proof assumed that we could choose a least element from the set of size i, which we can’t do for an S infinite set. So we can’t assume that B ⊆ Āi . And it’s a good thing that we can’t assume this, because it’s false. 1 , m ∈ N. We’ll Consider the set from exercise 2.5, the set E consisting S of all rational numbers of the form m ∞ 1 construct this set by defining A = and defining B = A . We saw in exercise 2.5 that 0 ∈ B̄. But i i i S∞ S∞i=1 0S6∈ Āi for any i, so 0 6∈ i=1 Āi . 
This shows us that B̄ 6= Si=1 Ai for these sets. And we’ve already shown that Āi ⊆ B, so for this particular choice of sets we see that Āi ⊂ B. Exercise 2.8a Every point in an open set E in R2 is a limit point of E. Proof: let p be an arbitrary point in E. We can say two things about the neighborhoods of p. First, since we’re dealing with R2 , every neighborhood of p contains infinitely many points. Second, p is an interior point (since E is open) and so there is some r such that Nr (p) ⊂ E. Together, these two facts tell us that Nr (p) contains infinitely many points of E. We now show that this guarantees that every neighborhood of p contains a point of E. Choose any s such that r < s. Because the neighborhood Nr (p) contains some q ∈ E and Nr (p) ⊂ Ns (p), we know that q ∈ Ns (p). Now choose s such that s < r. Ns (p) contains infinitely many points, and Ns (p) ⊂ Nr (p) ⊂ E, so Ns (p) contains infinitely many points of E. We’ve shown that whether r < s or r > s for any arbitrary r, Ns (p) contains a point of E. Thus p is a limit point of E. Exercise 2.8b Consider a closed set consisting of a single point, such as E = {(0, 0)}. This point is clearly not a limit point of E. Exercise 2.9a To prove that E ◦ is open, we will show that every point in E ◦ must be an interior point of E ◦ . Let x be an arbitrary point in E ◦ . By the definition of E ◦ , we know that x is an interior point of E. This means that we can choose some r such that Nr (x) ⊂ E: i.e., every point of Nr (x) is a point of E. To show that x is an interior point of E ◦ , though, we will need to show that every point in the neighborhood Nr (x) is also a 17 point of E ◦ . Let y be an arbitrary point in Nr (x). Clearly, 0 < d(x, y) < r. Choose a radius s such that d(x, y) + s = r and let z be an arbitrary point in Ns (y). Every point in Ns (y) is a point in E, since d(x, z) ≤ d(x, y) + d(y, z) < d(x, y) + s = r implies d(x, z) < r, and we’ve established that every point in Nr (x) is a member of E. And since every point in Ns (y) is a point in E, by definition we know that y is an interior point of E. But y was an arbitrary point in Nr (x), so we know that every point in Nr (x) is an interior point of E. And this means that x is itself an interior point of the set of interior points of E: that is, x is an interior point of E ◦ . And since x was an arbitrary point in E ◦ , this means that every point in E ◦ is an interior point of E ◦ . And this proves that E ◦ is open. Exercise 2.9b From exercise 2.9(a), we know that E ◦ is always open: so E is open if E = E ◦ . And if E is open, then every point of E is an interior point and therefore E ◦ = E: so E = E ◦ if E is open. Together, these two statements show that E is open if and only if E = E ◦ . Exercise 2.9c Let p be an arbitrary element of G. Because G is open, p is an interior point of G. This means that for every point p ∈ G there is some neighborhood Nr (p) such that Nr (p) ⊂ G ⊂ E. This chain of subsets tells us that every point in G is an interior point not only of G, but also of E. And if every point in G is an interior point of E, then this shows us that G ⊂ E ◦ . Exercise 2.9d f◦ ⊂ E: e let x be a member of E f◦ , the complement of E ◦ . From the definition of E ◦ we know that x Proof that E is not an interior point of E. From the definition of “interior point”, this means that every neighborhood of x contains some element y in the complement of E. For each neighborhood, It must be true that either x = y or e (since y is in E e and x = y). If x 6= y for x 6= y. 
If x = y for one or more of these neighborhoods, then x is in E e e every neighborhood, then by definition we know that x is a limit point of E. So we conclude that either x ∈ E e that is, x is a member of E, e the closure of E. e or x is a limit point for E: e⊂E f◦ : this proof is only a very slight modification of the previous one. Let x be a member of Proof that E e the closure of the complement of E. From the definition of closure, either x is a limit point for E e or x itself E, e In either case, every neighborhood of x contains some point of E. e By definition of “interior is a member of E. f◦ , the complement of E ◦ . point”, this means that x is not an interior point of E and therefore x ∈ E f◦ = E. e Together, these two proofs show us that E 18 Exercise 2.9e No, E and E do not always have the same interiors. Consider the set E = (−1, 0) ∪ (0, 1) in R1 . The point 0 is not a member of the interior of E, but it is a limit point of E. So the point 0 is an interior point of E = (−1, 1). Exercise 2.9f No, E and E ◦ do not always have the same closures. Let E be a subset of (0, 1] consisting only all rational 1 , m ∈ N. As we saw in exercise 2.5, the closure of E is E ∪ {0}. But every neighborhood numbers of the form m of every point in E contains a point not in E, so E has no interior points. So E ◦ = ∅. And the empty set is already closed, so the closure of E ◦ is ∅. Clearly E ∪ {0} 6= ∅, so E and E ◦ do not always have the same closures. Exercise 2.10 Which sets are closed? Let E be an arbitrary subset of X and let p be an arbitrary point in E. The distance between any two distinct points is always 1, so by choosing r = .5 we guarantee that the neighborhood Nr (p) contains only p itself. Because this is true for any point in E, this shows us that E contains no limit points. It’s then vacuously true that all of the (nonexistant) limit points of E are points of E: we conclude that E is closed. But our choice of E was arbitrary, so this shows that every subset of X is closed. Which sets are open? Let E be an arbitrary set in X. We’ve shown that every set is closed, so it must be e is closed: and from theorem 2.23, this means that E is open. But our choice of E was arbitrary, the case that E so this shows that every subset of X is open. Which sets are compact? Under this metric, subsets of X are compact if and only if they are finite. Proof: let E be an arbitrary subset of X. This set is either finite or infinite. If E is finite with cardinality k, then we can always generate a finite subcover for any open cover. For each ei ∈ E, we select some gi in our (arbitrary) open cover {Gα } such that ei ∈ gi (note that E is finite, so we don’t Sk Sk need to use the axiom of choice). This gives us a collection of sets i=1 gi such that E ⊆ i=1 gi ⊆ {Gα }. By Sk definition, then, i=1 gi is a finite subcover of E. Our choices for E, k, and {Gα } were arbitrary, so this shows that any finite subset of X is compact. If E is infinite, then we can always generate an open cover that has no finite subcover. As we saw previously, every subset of X is open: this includes subsets that consist of a single element. So we can create an infinite open cover of E by letting each set in {Gα } be a single element of E. Any subcover of E will need to contain one element of {Gα } for each element of E: and since E is infinite, this means that any subcover must be infinite. Therefore there is no finite subcover for this particular open cover, and E is not compact. 
Exercise 2.11 d1 is not a metric: let x = −1, y = 0, z = 1. Then d1 (x, y) + d1 (y, z) = 2 while d1 (x, z) = 4. And this tells us that d1 (x, z) > d1 (x, y) + d1 (y, z) which violates definition 2.15(c) of metric spaces. d2 is a metric. It’s trivial to verify that d2 has the properties 2.15(a) and 2.15(b). To show that it has the property 2.15(c): 7→ |x − z| ≤ |x − y| + |y − z| true by the triangle inequality, theorem 1.37(f) p → |x − z| ≤ |x − y| + |y − z| + 2 |x − y||y − z| p 2 p p 2 → |x − z| ≤ |x − y| + |y − z| p p p → |x − z| ≤ |x − y| + |y − z| this additional term is > 0 legal because all terms are > 0 → d2 (x, z) ≤ d2 (x, y) + d2 (y, z) 19 d3 is not a metric. d3 (1, −1) = 0, which violates definition 2.15(a) of metric spaces. d4 is not a metric. d4 (2, 1) = 0, which violates definition 2.15(a) of metric spaces. d5 is a metric. It’s trivial to verify that d2 has the properties 2.15(a) and 2.15(b). To show that it has the property 2.15(c): 7→ |x − z| ≤ |x − y| + |y − z| This first step is always true because of the triangle inequality (theorem 1.37(f). → |x − z| ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y||x − z| Adding positive terms to the right side does not change the sign of the equality. → |x − z| + |x − z||x − y| + |x − z||y − z| + |x − z||x − y||y − z| ≤ |x − y| + 2|x − y||y − z| + |y − z| + |x − z||x − y| + 2|x − z||x − y||y − z| + |x − z||y − z| It’s hard to verify from this ugly mess, but we’ve just added the same terms to each side. → (|x − z|)(1 + |x − y| + |y − z| + |x − y||y − z|) ≤ (1 + |x − z|)(|x − y| + 2|x − y||y − z| + |y − z|) It’s still hard to verify, but we’ve just factored out terms from each side. → |x − z| |x − y| + 2|x − y||y − z| + |y − z| ≤ 1 + |x − z| 1 + |x − y| + |y − z| + |x − y||y − z| And now we’ve divided each side by two of these factored terms. |x − z| |x − y| |y − z| ≤ + 1 + |x − z| 1 + |x − y| 1 + |y − z| → d2 (x, z) ≤ d2 (x, y) + d2 (y, z) → The logic behind this seemingly arbitrary series of algebraic steps becomes clear if one starts at the end and works back to the beginning (which is, of course, how I initially derived the proof). Exercise 2.12 Let {Gα } be any open cover of K. At least one open set in {Gα } must contain the point 0: let G0 be this set, open in R, that contains 0 as an interior point. As an interior point of G0 , there is some neighborhood Nr (0) ⊂ G0 . This neighborhood contains every n1 ∈ K such that n1 < r: equivalently, this neighborhood contains one element for every natural number n such that n > 1r . Not only does this mean that G0 contains an infinite number of elements of K, it means that it contains all but a finite number ( 1r , to be exact) of elements of K. Let Gi represent an element of {Gα } containing 1i . We can now see that K ⊂ G0 ∪ subcover of {Gα }. S1/r i=1 Gi , which is a finite Exercise 2.13 For i ≥ 2, let Ai represent the set of all points of the form Each Ai can be shown to have a single limit point of 1 i 1 i + n1 that are contained in the open interval (see exercise 2.5). 20 1 1 i , i−1 . S∞ Now consider the union of sets S = i=2 Ai . The same reasoning used in exercise 2.5 shows that the set of limit points for S is just the collection of limit points from Ai : that is, the rationals of the form 1i for i ≥ 2. This shows us that S has one limit point for each natural number greater than 1, which is a countable number. To show that S is compact, we return to the reasoning found in exercise 2.12. Let {Gα } be any open cover 1 of S. 
For each i ≥ 2, Let Gi represent the element of {Gα } that i . As we saw in exercise 2.12, Gi contains 1 1 i , i−1 . If we let Gij represent an element of 1 {Gα } containing 1i + 1j , we see that all of the points of S in the interval 1i , i−1 can be covered by the finite S ri union of sets Gi ∪ i=1 . contains all but a finite number of elements from the interval Now, let G0 represent the element of {Gα } that contains 0 as an interior point. As an interior point, there is some neighborhood Nr (0) ⊂ G0 . This neighborhood contains every limit point of the form n1 ∈ R such that 1 n < r. As in 2.12, this means that G0 contains all but a finite number of the limit points of S. More than that, 1 ). though, it means that G0 contains all but a finite number of the intervals of the form ( 1i , i−1 And from these sets we can construct our finite subcover. Our subcover S contains G0 . For each of the finitely ri many intervals not covered by G0 , we include the finite union of sets Gi ∪ i=1 . This is a finite collection of a finite number of elements from {Gα }, and each element of S is included in one of these elements, so we have constructed a finite subcover for an arbitrary cover {Gα }. This proves that S is compact. Exercise 2.14 S∞ Let Gn represent the interval n1 , 1 . The union {Gα } = i=1 Gn is a cover for the interval (0, 1) (Proof: for any x ∈ (0, 1), there is some n > 1/x and therefore some n1 < x. So x ∈ Gn+1 ). But there is no finite subcover 1 for (0, 1). Let H be a finite subcover of {Gα }, and consider the set { i+1 : Gi ∈ H}. This set is finite, so there is a least element. And this least element is in the interval (0, 1) but is not a member of any Gi ∈ H. So our assumption that a finite subcover exists must be false. Exercise 2.15 For each i ∈ N, define Ai to be: ( Ai = r p∈Q: 1 2− ≤p≤ i r 1 2+ i ) Because the endpoints are irrational, each of these intervals Ai are both bounded and closed (see exercise 2.16 for the proofs). From theorem 1.20, we also know that each interval is nonempty. T The union of any finite collection of these intervals will be nonempty, since the union i∈∇ Ai is equal to √ √ T∞ Ak , where k is the largest index in ∇. But the infinite union i=1 Ai is equal to {p ∈ Q : 2 ≤ p ≤ 2}, which is obviously empty. This proves that the collection {Ai : i ∈ N} is a counterexample to theorem 2.36 if the word “compact” is replaced by either “closed” or “bounded”. This same collection works as a counterexample to the corollary of 2.36, since An+1 ⊂ An . Exercise 2.16 lemma 1: For any real number x 6∈ Q, the intervals [x, ∞) and (x, ∞) are open in Q. Proof: let x be a real number such that x 6∈ Q. Let p be a rational number in the interval [x, ∞) or (x, ∞). It cannot be the case that p = x (since p is rational while x is not), so p is in the interval (x, ∞) (i.e, [x, ∞) is identical to (x, ∞) in Q). Because x < p, we can rewrite this interval as (p − d(p, x), ∞). Choose r = d(x, p). Then we see that (p − r, p + r) ⊆ (p − d(p, x), ∞) = (x, ∞). This means that Nr (p) is an interior point of (x, ∞): but p was an arbitrary point chosen from [x, ∞), so every point of this interval is an interior point. This proves that [x, ∞) and (x, ∞) are open in Q. 21 lemma 2: For any real number x 6∈ Q, the intervals (−∞, x] and (−∞, x) are open in Q. Proof: the proof is nearly identical to that of lemma 1. lemma 3: All of these open intervals ([x, ∞),(x, ∞),(−∞, x], and (−∞, x)) are also closed. Proof: Choose any of these four open sets. 
Note that its complement is also one of these four open sets. Since its complement is open, the set must be closed. lemma 4: Every interval of the form (x, y) with x, y 6∈ Q is both open and closed in Q. Proof: choose an e = (−∞, x] ∪ [y, ∞). From lemma 3, we see that E e is a finite arbitrary interval E = (x, y). It’s complement is E e e union of closed sets, so E is closed (theorem 2.24) and E is open (theorem 2.23). But E is also a finite union e is open (theorem 2.24) and E is closed (theorem 2.23). This proves that of open sets (lemmas 1 and 2), so E (x, y) is both open and closed. The proofs for [x, y], (x, y], and [x, y) are identical to this one. √ √ √ E is bounded: If |p| ≥ ± 3, then p2 > 3 and p 6∈ E. So E is bounded by the interval (− 3, 3). 2 E is open and closed in Q: E is√ the set rational < 3, which means that √ of all √ √ numbers p such that √ 2<p √ E is the set of rational numbers in (− 3, − 2) ∪ ( 2, 3). We know that ± 2 and ± 3 are not rational, so by lemma 4 we know that E is both a union of closed sets and also a union of open sets. So by theorem 2.24 we know that E is both open and closed. E is not compact: Drawing from the example in exercise 2.14, define the interval Ai to be: √ √ ( 3, − 2) if i = 1 (− q √ Ai = 2 + 1i , 3 if i > 1 q Because 2 + 1i is irrational for every i ∈ N, we know by lemma 4 that each Ai is an open set. Define an open cover of E to be G = {Ai : i ∈ N} This has no finite subcover for the same reason that the set in exercise 2.14 has no infinite subcover: given any √ finite collection of elements of G, we can always find an element sufficiently close to 2 that is not contained in any of those elements of G. Exercise 2.17 E is not countable: This is proven directly by theorem 2.14. E is not dense: Let x = .660... and let r = .01. Then Nr (x) = (.65, .67), and every real number in this interval contains a number other than 4 or 7. Therefore Nr (x) contains no points of E. This shows that x is an element of [0, 1] that is neither a point in E nor a limit point of E, which proves that E is not dense in [0, 1]. E is compact: By theorem 2.41, we can prove that E is compact by showing that it is bounded and closed. E ⊂ [0, 1], so E is bounded. To show that E is closed, we need to show that every limit point of E is a member of E. Proof by contrapositive that every limit point of E is a member of E: let x be an element of [0, 1] such that x 6∈ E. We will show that x is not a limit point. By the definition of membership in E, the fact that x 6∈ E means that ∞ X ai x= 10i i=1 where some ai 6∈ {4, 7}. Let k represent some index such that ak 6∈ {4, 7}. Choose r = 10−(k+1) . Then the neighborhood Nr (x) does not contain any points in E: i) If ak+1 = 0, then the k + 1st decimal place will be either 9, 0, or 1 for every element of (x − r, x + r) and so no element of Nr (x) is a member of E. 22 ii) If ak+1 = 9, then the k + 1st decimal place will be either 8, 9, or 0 for every element of (x − r, x + r) and so no element of Nr (x) is a member of E. iii) If ak+1 is neither 0 nor 9, then the kth decimal place will be be ak for every element of (x − r, x + r) and so no element of Nr (x) is a member of E. This proves that there is some neighborhood Nr (x) that contains no points in E. This proves that x is not a limit point of E. But x was an arbitrary point such that x 6∈ E, so this proves that (x 6∈ E → x is not a limit point of E). By contrapositive, this proves that every limit point of E is a member of E. And this, by definition, means that E is closed. 
E is perfect: We have already shown that E is closed in the course of proving that E is compact. To prove that E is perfect, we need to show that every point in E is a limit point of E. Let x be an arbitrary point in E. By the definition of membership in E, we know that x= ∞ X ai 10i i=1 where each ai is either 4 or 7. Define a second number, xk , such that ∞ bi = ai if i 6= k X bi bk = 4 if ak = 7 xk = , where 10i i=1 bk = 7 if ak = 4 From this definition, we see that xk differs from x only in the kth decimal place, and d(x, xk ) = 3 × 10−k . We can use this information to show that x is a limit point of E. Let Nr (x) be any arbitrary neighborhood of x. From the archimedian principle, we can find some integer k such that k > log1 0 3r . And this is algebraically equivalent to finding some integer k such that 3 × 10−k < r. This means that we can find some xk ∈ E (as defined above) in Nr (x). And this was an arbitrary neighborhood of an arbitrary point in E, so we have proven that every neighborhood of every point in E contains a second point in E: by definition, this means that every point in E is a limit point. Exercise 2.18 Section 2.44 describes the Cantor set. The Cantor set is a nonempty perfect subset of R1 . Each point in the n n+1 Cantor set is an endpoint of some segment of the form 3k , 3k with n, k ∈ N, so each point in the Cantor set is rational. √ √ Let E be the set {x + 2 : x is in the Cantor set} (that is, we’re shifting every element of the Cantor set 2 units to the right). Each √ element √ in E is irrational (exercise 1.1). The Cantor set was bounded by [0, 1] so E is clearly bounded by [ 2, 1 + 2]. The proof that E is perfect is identical to the proof that the Cantor set is perfect (given in the book in section 2.44 and also in this document after exercise 2.4). Exercise 2.19 a We’re told that A and B are disjoint, so A ∩ B = ∅. And if A and B are closed, then by theorem 2.27 we know that A = A and B = B. So we conclude that A ∩ B = A ∩ B = A ∩ B = ∅. By definition, this means that A and B are separated. Exercise 2.19 b Let A and B be disjoint open sets. Assume that A ∩ B is not empty. This assumption leads to a contradiction: 23 7→ A ∩ B is not empty hypothesis to be contradicted → (∃x)(x ∈ A ∩ B) definition of non-emptiness 0 → (∃x)(x ∈ A ∩ (B ∪ B )) → (∃x)(x ∈ (A ∩ B) ∪ (A ∩ B 0 )) definition of closure distributivity (section 2.11) → (∃x)(x ∈ ∅ ∪ (A ∩ B 0 )) A and B are disjoint 0 → (∃x)(x ∈ A ∩ B ) → (∃x)(x ∈ A ∧ x ∈ B 0 ) definition of set intersection 0 → (∃x)(x is an interior point of A ∧ x ∈ B ) A is an open set → (∃x)(x is an interior point of A ∧ x is a limit point of B) definition of B 0 → (∃x) [(∃r ∈ R)(Nr (x) ⊆ A) ∧ (x is a limit point of B)] definition of interior point → (∃x) [(∃r ∈ R)(Nr (x) ⊆ A) ∧ (∀s ∈ R)(∃y ∈ B)(y ∈ Ns (x))] definition of limit point If something is true for all s ∈ R, it’s true for any particular R. So choose s = r. → (∃x) [(∃r ∈ R)( (Nr (x) ⊆ A) ∧ (∃y ∈ B)(y ∈ Nr (x)) )] substitution of s = r → (∃x) [(∃r ∈ R)(∃y ∈ B)(y ∈ Nr (x) ∧ Nr (x) ⊆ A)] rearrangement of terms for clarity → (∃x) [(∃r ∈ R)(∃y ∈ B)(y ∈ A)] definition of subset This last step establishes a contradiction. The sets A and B are disjoint, so there cannot be any possible choice of variables such that y ∈ B and y ∈ A. Our assumption must have been incorrect: A ∩ B is, in fact, empty. If we swap the roles of A and B, this same proof also shows us that A∩ B is empty. By definition, then, A and B are separated. 
Exercise 2.19 c

A is open in X: The set A = {x ∈ X : d(p, x) < δ} is, by definition 2.18(a), exactly the neighborhood Nδ(p). By theorem 2.19, then, A is an open subset of X.

B is open in X: Let x be an arbitrary point in B, so that d(p, x) > δ. Let r = d(p, x) − δ (which is positive), and let y be an arbitrary element of Nr(x). Proof that y ∈ B:

↦ d(p, y) + d(y, x) ≥ d(p, x)  property 2.15(c) of metric spaces
→ d(p, y) + d(y, x) − d(p, x) ≥ 0
→ d(p, y) + d(y, x) − d(p, x) ≥ 0 ∧ r > d(x, y)  d(x, y) < r because y ∈ Nr(x)
→ d(p, y) + d(y, x) − d(p, x) ≥ 0 ∧ d(p, x) − δ > d(x, y)  definition of r
→ d(p, y) + d(y, x) − d(p, x) ≥ 0 ∧ d(p, x) − δ − d(x, y) > 0
→ (d(p, y) + d(y, x) − d(p, x)) + (d(p, x) − δ − d(x, y)) > 0  a nonnegative number plus a positive number is positive
→ d(p, y) − δ > 0  cancellation of like terms
→ d(p, y) > δ
→ y ∈ B  definition of membership in B

Our choice of y was arbitrary, so every point in this neighborhood of x is a member of B: by definition, then, x is an interior point of B. But our choice of x ∈ B was also arbitrary, so every x ∈ B is an interior point of B. By definition, this shows that B is open in X.

A and B are disjoint: If there were some x ∈ A ∩ B, then d(x, p) > δ and d(x, p) < δ, which violates the trichotomy rule for order relations.

A and B are separated: We've shown that A and B are disjoint open sets, so by exercise 2.19(b) we know that A and B are separated.

Exercise 2.19 d

If we are given any metric space X with a countable or finite number of elements, we can always find some distance δ such that there are no elements x, y ∈ X with d(x, y) = δ (proof follows). This allows us to choose some arbitrary p ∈ X and then use the results from part (c) to partition X completely into the separated sets A = {x ∈ X : d(x, p) > δ} and B = {x ∈ X : d(x, p) < δ}. This proves that (X is at most countable → X is not connected). By contrapositive, this proves that (X is connected → X is uncountable), which is what we were asked to prove. (A small computational sketch of this construction appears after exercise 2.20(a) below.)

Proof that an at most countable metric space X has some distance δ such that d(x, y) ≠ δ for all x, y ∈ X: Let X be an at most countable metric space with elements a1, a2, a3, . . . . From theorem 2.13, the set of all ordered pairs {(ai, aj) : ai, aj ∈ X} is at most countable. The map (ai, aj) ↦ d(ai, aj) carries this set onto the set of distances {d(ai, aj) : ai, aj ∈ X}, so the set of all distances between points of X is also at most countable. Distances in metric spaces are always real numbers (definition 2.15), and R is of course uncountable. Because there are at most countably many distances in {d(ai, aj) : ai, aj ∈ X}, we can choose a real number δ that is not in this set (otherwise an at most countable set would contain an uncountable set as a subset). But this isn't quite enough: if δ is so large that the set {x ∈ X : d(x, p) > δ} is empty, then one of our partitions will be empty. We can avoid this problem (as long as X has at least two elements) by picking arbitrary x, y ∈ X and then choosing δ from the interval (0, d(x, y)). This interval is still uncountable, so we are still able to choose a δ in it that is not a realized distance. Taking p = x, we then have d(x, y) > δ (so y ∈ A) and d(x, x) = 0 < δ (so x ∈ B), so our partitions A and B will both be nonempty.

Exercise 2.20 a

Closures of connected sets are always connected. If the union A ∪ B is connected, then A and B are not separated (definition 2.45), so there is either some p ∈ A ∩ B̄ or some q ∈ Ā ∩ B. Clearly, then, we have either p ∈ (A ∪ A′) ∩ B̄ or q ∈ Ā ∩ (B ∪ B′). In either case Ā ∩ B̄ is nonempty, so the two closed sets Ā and B̄ are not separated either.
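As promised above, here is a small computational sketch of the δ-selection argument from exercise 2.19(d), for the simplest case of a finite set of points in R2 with the usual metric. It is an illustration under those assumptions, not part of the proof; the names points, dist, and delta are mine. Only the distances measured from the chosen base point matter for the partition itself, so the sketch avoids exactly those finitely many values.

    import math

    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]  # a finite metric space

    def dist(u, v):
        return math.hypot(u[0] - v[0], u[1] - v[1])

    p = points[0]
    realized = sorted({dist(p, q) for q in points if q != p})  # finitely many distances from p
    delta = realized[0] / 2  # strictly between 0 and the smallest realized distance

    A = [q for q in points if dist(p, q) > delta]  # the "far" piece
    B = [q for q in points if dist(p, q) < delta]  # the "near" piece; contains p itself
    assert A and B and len(A) + len(B) == len(points)  # every point lands in exactly one piece
    print(A, B)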
Exercise 2.20 b

Interiors of connected sets need not be connected. Consider the segment (0, 1) of the real line. Although this segment is open in R1, it contains no interior points as a subset of R2, since every neighborhood Nr((x, 0)) contains the point (x, r/2). So let r = 1/4 and let E be the set Nr((0, 0)) ∪ {(x, 0) : 0 < x < 1} ∪ Nr((1, 0)). The set E is connected (the two neighborhoods are joined by the segment), but since the line segment {(x, 0) : 0 < x < 1} contains no interior points while every point of a neighborhood is an interior point (theorem 2.19), the interior of E is just Nr((0, 0)) ∪ Nr((1, 0)), and it is trivial to show that this is the union of two nonempty separated sets.

Exercise 2.21 a

The function p(t) can be thought of as the parameterization of the straight line connecting the points a and b. The set A0 is the set of all t such that p(t) ∈ A, and the set B0 is the set of all t such that p(t) ∈ B. We're told that A and B are separated and are asked to prove that A0 and B0 are separated. Proof by contrapositive:

• There is some element tA that is an element of A0 and a limit point of B0.
Assume that A0 and B0 are not separated. By definition 2.45, either Ā0 ∩ B0 or A0 ∩ B̄0 is nonempty. Assume that A0 ∩ B̄0 is nonempty (the sets are interchangeable, so the proof for Ā0 ∩ B0 ≠ ∅ is almost identical). Let tA be one element of A0 ∩ B̄0: then either tA ∈ A0 ∩ B0 or tA ∈ A0 ∩ B0′. It can't be the case that tA ∈ A0 ∩ B0, because this would imply p(tA) ∈ A ∩ B, which is impossible because A and B are separated sets. So it must be the case that tA ∈ A0 ∩ B0′.

• There is a proportional relationship between d(t, t + r) and d(p(t), p(t + r)).
The distance between p(t) and p(t + r) is the norm of the vector p(t + r) − p(t):
d(p(t), p(t + r)) = |p(t + r) − p(t)| = |[(1 − (t + r))a + (t + r)b] − [(1 − t)a + tb]| = |r(b − a)| = |r| |b − a| = d(t, t + r) d(a, b)

• p(tA) is an element of A and a limit point of B.
We've shown that tA is a limit point of B0. So for any arbitrarily small r, there is some tB ∈ B0 such that d(tA, tB) < r. And this means that for any arbitrarily small r, there is some p(tB) ∈ B such that d(p(tA), p(tB)) < |r(b − a)|. And since p(tA) is in A and each p(tB) is in B, this means that there is an element of A that is a limit point of B: so A ∩ B̄ is nonempty, which means that A and B are not separated.

This shows that if A0 and B0 are not separated, then A and B are not separated. By contrapositive, then, if A and B are separated then A0 and B0 are separated. And this is what we were asked to prove.

Exercise 2.21 b

We are asked to find α ∈ (0, 1) such that p(α) ∉ A ∪ B: from the definition of the function p, this is equivalent to finding α ∈ (0, 1) such that α ∉ A0 ∪ B0. Proof that such an α exists:

Let E be defined as E = {d(0, x) : x ∈ B0}. That is, E is the set of all distances between the point 0 (which is in A0, since p(0) = a) and elements of B0. The set R has the greatest lower bound property and E is a subset of R with a lower bound of 0, so the set E has a greatest lower bound. Let α represent this greatest lower bound.

• 0 < α < 1: We know that α ≥ 0 because α is the infimum of a set of distances, all of which are nonnegative. And α ≠ 0: if α were 0, then there would be elements of B0 arbitrarily close to 0, so 0 (which is in A0) would be a limit point of B0, and A0 ∩ B̄0 would be nonempty, contradicting part (a). And we know that α < 1, because 1 ∈ B0 and B0 is an open set: so 1 is an interior point of B0, which means that there is some small r such that 1 − r ∈ B0, and therefore α ≤ 1 − r < 1.

We now need to show that we can use α to construct elements that are not in A0 ∪ B0.
We know that either α ∈ B0 ,α ∈ A0 , or α 6∈ A0 ∪ B0 : • if α ∈ B0 : We know that α is not a limit point of A0 (otherwise, xα ∈ B0 ∩ A0 which contradicts the fact from part (a) that B0 and A0 are separated). So there is some neighborhood Nr (α) that contains no points in A0 . Now consider the points p in the range (α − r, α). Each p is in the neighborhood Nr (α), so they aren’t members of A0 . And for each p we see that d(0, p) < α, so they aren’t members of B0 (otherwise α wouldn’t have been a lower bound of E). So we see that for every p ∈ (α − r, α), we have p 6∈ A0 ∪ B0 . • if α ∈ A0 : This assumption leads to a contradiction. We’ve shown in part (a) that A0 and B0 are separated, so it can’t be the case that α is a limit point of B0 (otherwise A0 ∩ B0 would not be empty). So there is some neighborhood Nr (α) that contains no points of B0 . But if the interval (0, α) contains no points of B0 and the interval (α − r, α + r) contains no points of B0 , then their union (0, α + r) contains no points of B0 . But this contradicts our definition of α as the greatest lower bound of E. So our initial assumption must have been incorrect: it cannot have been the case that α ∈ A0 . • if α 6∈ A0 and α 6∈ B0 : Under this assumption, we clearly have α 6∈ A0 ∪ B0 . Whatever assumption we make about the set containing α, we see that there will always be at least one element α such that 0 < α < 1 and α 6∈ A0 ∪ B0 . And, from the definition of the function p, this means that p(α) 6∈ A ∪ B. And this is what we were asked to demonstrate. Exercise 2.21 c Proof by contrapositive. Assume that E is not connected: then E could be described as the union of two separated sets (i.e, E = A ∪ B). From part (b), we could then choose a, b, and t such that (1 − t)a + (t)b 6∈ A ∪ B = E. And this would mean that E is not convex. By contrapositive, if E is convex then E is connected. Exercise 2.22 The metric space Rk clearly contains Qk as a subset. We know that Qk is countable from theorem 2.13. To prove that Qk is dense in Rk , we need to show that every point in Rk is a limit point of Qk : Let a = (a1 , a2 , . . . , ak ) be an arbitrary point in Rk and let Nr (a) be an arbitrary neighborhood of a. Let b = (b1 , b2 , . . . , bk ) where bi is chosen to be a rational number such that ai < bi < ai + √rk (possible via theorem 1.20(b)). The point b is clearly in Qk , and r r 2 2 p r r kr2 d(a, b) = (a1 − bi )2 + . . . + (ak − bk )2 < + ... + = =r k k k This shows that every point in Rk is a limit point of Qk , which completes the proof that Rk is separable. Exercise 2.23 Let X be a separable metric space. From the definition of separable in the previous exercise, we know that X contains a countable dense subset E. For each ei ∈ E, let Nq (ei ) be a neighborhood with rational radius q around point ei . Let {Vα } = {Nq (ei ) : q ∈ Q, i ∈ N} be the collection of all neighborhoods with rational radius centered around members of E. This is a countable collection of countable sets, and therefore {Vα } has 27 countably many elements. Let x be an arbitrary point in X, and let G be an arbitrary open set in X such that x ∈ G. Because G is open, we know that x is an interior point of G. So there is some neighborhood Nr (x) such that Nr (x) ⊆ G. But we can choose a rational q such that 0 < q < 2r , so that x ∈ Nq (x) ⊆ Nr (x) ⊆ G. Because E is dense in x, every neighborhood of x contains some e ∈ E. So e ∈ Nq (x), which means that d(e, x) < q. But, on the other hand, this also means that d(x, e) < q so that x ∈ Nq (e). 
And Nq (e) ∈ {Vα }, by definition. Having shown that x ∈ Nq (e), we need to prove that Nq (e) ⊆ G. Let y be any point in Nq (e). We know that d(x, e) < q and d(e, y) < q. So d(x, y) < d(x, e) + d(e, y) = 2q by definition 2.15(c) of metric spaces. But we defined q so that 0 < q < 2r , so we know that d(x, y) < r. This means that every y ∈ Nq (e) → y ∈ Nr (x), or Nq (e) ⊆ Nr (x). And we chose r so that Nr (x) ⊆ G, so by transitivity we know that Nq (e) ⊆ G. We started by choosing an arbitrary element x in an arbitrary open set G ⊆ X, and proved that there was an element Nq (e) ∈ {Vα } such that x ∈ Nq (e) ⊂ G. We’ve shown that {Vα } has a countable number of elements, and each element is an open neighborhood. This proves that {Vα } is a base for X. Exercise 2.24 We’re told that X is a metric space in which every infinite subset has a limit point. Choose some arbitrary δ and construct a set {xi } as described in the exercise. • The set {xi } must be finite: if this set were infinite, then it would have a limit point. And if it has a limit point, then there would be multiple points of {xi } in some neighborhood of radius δ/2, which contradicts our assumption that d(xi , xj ) > δ for each pair of points in {xi }. • The neighborhoods of {xi } form a cover of X: Every x ∈ X must be contained in Nδ (xi ) for some xi ∈ {xi }. If it weren’t, this would imply that d(x, xi ) > δ for each xi ∈ {xi }, so that we could have chosen x to be an additional element in {xi }. S∞ Let Ei represent the set of {xi } constructed in the above way when δ = 1i , and consider the set E = i=1 Ei . We will show that E is a countable dense subset of X. This union E is a countable union of nonempty finite sets, so E is countable. Each element of E was chosen from X, so clearly E ⊂ X. Proof that E is dense in X: Choose an arbitrary x ∈ X. Choose an arbitrarily small radius r. From the archimedian principle, we can find an integer n such that n1 < r. We’ve shown that the neighborhoods of En cover X, so there is some e ∈ En such that x ∈ N1/n (e). But if d(e, x) < n1 , then d(x, e) < n1 and so e ∈ N1/n (x). And since n1 < r, this implies that e ∈ Nr (x). So we’ve chosen an arbitrary x and an arbitrary radius r, and shown that Nr (x) will always contain some e ∈ En ⊂ E. This proves that every x is a limit point of E, which by definition means that E is dense in X. 28 This proves that X contains a countable dense subset, so by the definition in exercise 22 we have proven that X is separable. Exercise 2.25 a Theorem 2.41 tells us that the set K, by virtue of being a compact space, has the property that every infinite subset of K has a limit point. By exercise 24, this means that K is separable. And exercise 23 proves that every separable metric spaces has a countable base. I’m not sure that this an appropriate proof, though. The question wants us to prove that K has a countable base and that therefore K is separable, whereas this is a proof that K is separable and therefore has a countable base. An alternate proof follows. Exercise 2.25 b We’re told that K is compact. Consider the set En = {N1/n (k) : k ∈ K}, the set of neighborhoods of radius 1/n around every element in K. This is clearly an open cover of K, since x ∈ K → x ∈ N( 1/n)(x) → x ∈ En . And K is compact, so the open cover En must have some finite subcover: i.e., K must be covered by some finite number of neighborhoods from En . Let {Vn } represent a finite subcover of the open cover En . Now consider the union V = a base for K. 
S∞ n=1 Vn , a countable collection of finite covers of K. We will prove that V is Choose an arbitrary x ∈ K, and an arbitrary open set G such that x ∈ G ⊂ K. Because G is open and x ∈ G, x must be an interior point of G. And since x is an interior point, there is some neighborhood Nr (x) such 1 < 2r . Because Vm is an open cover of K, there is some that Nr (x) ⊂ G. Now choose an integer m such that m k ∈ K and neighborhood N1/m (k) in the open cover Vm such that x ∈ N1/m (k). Proof that this neighborhood N1/m (k) is a subset of G: Assume that y is an element of N1/m (k). From the fact that x ∈ N1/m (k), we know that d(x, k) < 1/m. And from the fact that y ∈ N1/m (k), we know that d(k, y) < 1/m. And from the definition of metric spaces, we 1 know that d(x, y) ≤ d(x, k) + d(k, y), which means that d(x, y) ≤ 2/m ≤ r (since we chose m < 2r ). This proves that every element of N1/m (k) is in the neighborhood Nr (x), so N1/m (k) ⊆ Nr (x) ⊆ G. And this shows that N1/m (k) ⊆ G. We’ve chosen an arbitrary element x and an arbitrary element G, and have shown that there is some N1/m (k) ∈ V such that x ∈ N1/m (k) ⊂ G. By the definition in exercise 23, this proves that K has a base. And this base is a countable collection of finite sets, so it’s a countable base. This countable base V is identical to the set E constructed in exercise 24. We saw there that this base was a dense subset, so by the definition of “separable” in exercise 23 we know that K is separable. Exercise 2.26 Let {Vn } be the countable base we constructed in exercises 24 and 25. {Vn } acts as a countable subcover to any open cover of X. Proof: let G be an arbitrary open set from an arbitrary open cover of X. Choose any x ∈ G. We can find an element N1/m (k) ∈ V such that x ∈ N1/m (k) ⊆ G by the same method used in the second half of exercise 2.25(b). We must now prove that a finite subset of {Vn } can always be chosen to act as a subcover for any open cover. We’ll do this with a proof by contradiction. Let {Wα } be an open cover of X and assume that there is no finite subset of {Vn } that acts as a subcover of {Wα }. Construct Fn and E as described in the exercise. Because E is an infinite subset, it has a limit point: let x be this limit point. Every neighborhood Nr (x) contains infinitely many points of E (theorem 2.20), and we can prove that this arbitrary neighborhood Nr (x) must contain every point of E: 29 We have constructed the Fn sets in such a way that Fn+1 ⊆ Fn for each n. So if Nr (x) doesn’t contain any points from Fj , then it doesn’t contain any points from Fj+1 , Fj+2 ,etc because y ∈ Fj+1 → y ∈ Fj . So if Nr (x) doesn’t contain any points from Fj , then it has no more than j − 1 points of E: i.e, Nr (x) ∩ E would be finite. But x is a limit point of E, so Nr (x) must contain infinitely many points of E. So Nr (x) ∩ E contains one point from each Fn . But this means that E = {x}, a set with one element. If it contained a second point, we could find a neighborhood Nr (x) that failed to contain this second point, which would contradict our finding that every neighborhood of T x contains E. And this means that we choseTthe same element from each Fn : i.e., x was in each Fn . So x ∈ Fn , which contradicts our assumption that Fn was empty. Exercise 2.27 To prove that P is perfect, we must show that every limit point of P is a point of P , and vice-versa. Proof that every limit point of P is a point of P : assume that x is a limit point of P . Choose some arbitrary r and let s = 2r . 
By the definition of limit point, every neighborhood Ns (x) contains some y ∈ P . And from the fact that y ∈ P , we know that Ns (y) contains uncountably many points of P . But for each p ∈ Ns (y), we see that d(x, p) ≤ d(x, y) + d(y, p) so that d(x, p) ≤ 2s = r. So the neighborhood Nr (x) contains uncountably many points of P . But Nr (x) was an arbitrary neighborhood of x, so this proves that x ∈ P . Proof that every point of P is a limit point of P : assume that x ∈ P . By the definition of P , this means that every neighborhood of x contains uncountably many points of P . So clearly every neighborhood of x contains at least one point of P , which means that x is a limit point of P . This proves that P is perfect. To prove that P c ∩ E is countable, we let {Vn } be a countable base for X and let W be the union of all Vn for which E ∩ Vn is countable. We will show that W c = P : Proof that P ⊆ W c : Assume that x ∈ P . By the definition of membership in P , we know that x is a condensation point of E. By the definition of condensation point, then, we know that every neighborhood Nr (x) contains uncountably many points of E. And this means that every open set Vn containing x contains uncountably many points of E. So x is not a member of any countable Vn ∩ E, which means that x ∈ W c . Proof that W c ⊆ P : Assume that x ∈ W c . Choose any arbitrary neighborhood Nr (x): by the definition of the base, there is some Vn ⊂ Nr (x) such that x ∈ Vn . By definition of W c , the fact that x ∈ Vn means that E ∩ Vn is uncountable, which of course means that Vn has uncountably many elements of E. And we’ve established that Vn ⊆ Nr (x), which means that Nr (x) has uncountably many elements of E. But Nr (x) was an arbitrary neighborhood of x, so this proves that every neighborhood of x has uncountably many elements of E: by definition, then, x is a condensation point and therefore x ∈ P . So we have proven that W c = P . Taking the complement of each of these, we see that W = P c . And the set W is the union of countably many sets of the form E ∩ Vn , each of which contains countably many elements: so W is countable. And W = P c , so P c is countable: this proves that countably many points are P c . And from this, it’s trivial to show that there are countably many points in E ∩ P c , which is what we were asked to prove. Note that this proof assumed only that X had a countable base, so it is valid for any separable metric space. Exercise 2.28 Let E be a closed set in a separable metric space X. Let P be the set of all condensation points for E. Proof that E ∩ P is perfect: assume that x ∈ E ∩ P . Clearly x is a point of P , which means that x is a limit point of E (from the definition of P ) and a limit point of P (because P was shown to be perfect in exercise 27). So x is a limit point of E ∩ P . Now, assume that x is a limit point of E ∩ P . From this, we know that x is a point of E (because E is closed) and a point of P (because P is perfect). So we have shown that every point of E ∩ P is a limit point of E ∩ P and vice-versa: this proves that E ∩ P is perfect. 30 Proof that E ∩ P c is at most countable: this was proven in exercise 27. And E = (E ∩ P ) ∪ (E ∩ P c ): that is, E is the union of a perfect set and a countable set. And this is what we were asked to prove. Exercise 2.29 1 Let S E be an arbitrary open set in R and let {Gi } be an arbitrary collection of disjoint segments such that Gi = E. Each Gi is open and nonempty, so each Gi contains at least one rational point. 
Therefore there can’t be more Gi elements than rational numbers, which means that there are at most a countable number of elements in {Gi }. But {Gi } was an arbitrary set of disjoint segements whose union is E, therefore E cannot be the union of an uncountable number of disjoint segments. Exercise 2.30 Proof by contradiction: Let {Fn } be a collection of sets such that empty interior. 7→ (∀n)(Fn◦ = ∅) S → Fn◦ = ∅ S c → ( Fn◦ ) = R T ◦ c → (Fn ) = R T → Fnc = R → (∀n)(Fnc S Fn = R and suppose that each Fn has an hypothesis of contradiction taking the complement of both sides theorem 2.22 exercise 2.9d = R) This last step tells us that Fnc is dense in R for every n. And Fn was closed, so Fnc is open. Appealing to the proofTof Baire’s theorem in exercise 3.22 (which uses only terms and concepts introduced in chapter 2), we see that Fnc is nonempty. Therefore: Fnc 6= ∅ S c → ( Fn ) 6= ∅ S → Fn 6= R → T theorem 2.22 taking the complement of both sides S And this contradicts our original claim that Fn = R so one of our initial assumptions must be wrong. And our only assumption was that each Fn has an empty interior, so by contradiction we have proven that at least one Fn must have a nonempty interior. Exercise 3.1 The exercise does not explicitly say that we’re operating in the metric space Rk , but there are two reasons to assume that we are. First, we are supposed to understand the meaning of absolute value in this metric space, and Rk is the only metric space we’ve encountered so far for which absolute value has been defined. Second, Rudin appears to use sn to represent series in Rk and pn to represent series in arbitrary metric spaces. So we’ll assume that we’re operating in Rk for this exercise. We’re told that the sequence {sn } converges to some value s: that is, for any arbitrarily small there is some integer N such that n > N implies d(s, sn ) < . But it’s always the case that d(|s|, |sn |) ≤ d(s, sn ) (exercise 1.13), so for this same and N we see that d(|s|, |sn |) ≤ d(s, sn ) < . By transitivity, this means that for any choice of there is some integer N such that n > N implies d(|s|, |sn |) < . By definition of convergence, this means that the sequence {|sn |} converges to |s|. The converse is not true. If we let sk = (−1)k , the sequence {|sn |} clearly converges while the sequence {sn } clearly does not. 31 Exercise 3.2 The exercise asks us to calculate the limit rather than prove it rigorously, so we might be able to manipulate the expression algebraically: ! √ p p 2+n+n n n n 1 n2 + n − n = ( n2 + n − n) √ =√ = p =p 2 2 n +n+n n +n+n n 1 + 1/n + n 1 + 1/n + 1 and then use basic calc 2 techniques to show that this limit is 12 . If we need to prove it more rigorously, we can use theorem 3.14. We will first need to prove that this sequence has a limit, which theorem 3.14 tells us can be done by proving that the sequence is bounded and monotonically increasing. We can then find the limit by finding the least upper bound of the sequence. The sequence √ √ is bounded √ 2 The from the fact that n2 + n > n2 = n, we know that √ quantitity ( n + n − n) is bounded below: n2 + n − n > 0. And it’s bounded above by 12 : 7→ → √ √ n2 + n − n ≥ n2 + n ≥ 2 → n +n≥ →0≥ 1 4 1 2 1 2 hypothesis to be contradicted +n + n + n2 1 4 both sides are positive, so the sign doesn’t change subtract n + n2 from both sides √ This last step is clearly false, so our initial assumption must be false: this shows that ( n2 + n − n) is bounded above by 21 . 
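As a quick numerical sanity check before the monotonicity argument (an illustration only, not part of the proof), the values of √(n² + n) − n do appear to approach 1/2 from below:

    import math

    for n in (10, 1_000, 100_000):
        print(n, math.sqrt(n * n + n) - n)
    # prints roughly 0.4881, 0.49988, 0.4999988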
The sequence is monotonically increasing 7→ sn+1 > sn p √ ↔ (n + 1)2 + (n + 1) − (n + 1) > n2 + n − n p √ ↔ (n + 1)2 + (n + 1) − n2 + n > (n + 1) − n √ √ ↔ n2 + 3n + 2 − n2 + n > 1 √ √ ↔ (n2 + 3n + 2) + (n2 + n) − 2 n2 + 3n + 2 n2 + n > 1 √ ↔ (2n2 + 4n + 2) − 2 n4 + 4n3 + 5n2 + 2n > 1 √ ↔ −2 n4 + 4n3 + 5n2 + 2n > 1 − (2n2 + 4n + 2) √ ↔ 2 n4 + 4n3 + 5n2 + 2n < 2n2 + 4n + 1 4 3 2 4 3 2 definition of sn rearrange terms some algebra square both sides simplify rearrange terms multiply both sides by −1 and simplify ↔ 4n + 16n + 20n + 8n < 4n + 16n + 20n + 8n + 1 square both sides ↔0<1 subtract 4n4 + 16n3 + 20n2 + 8n from each side The sequence has a least upper bound of 21 We’ve already shown that 12 is an upper bound. To show that it is the least such upper bound, choose any x ∈ [0, 12 ) and assume that it is an upper bound for {sn }. 7→ (∀n ∈ N)(sn ≤ x) hypothesis to be contradicted √ → (∀n ∈ N)( n2 + n − n ≤ x) definition of sn √ → (∀n ∈ N)( n2 + n ≤ x + n) → (∀n ∈ N)(n2 + n ≤ x2 + 2xn + n2 ) 2 square both sides → (∀n ∈ N)(0 ≤ x + (2x − 1)n) subtract n2 + n from both sides → (∀n ∈ N)(−x2 ≤ (2x − 1)n) subtract x2 from both sides 2 −x → (∀n ∈ N)( 2x−1 ≤ n) 2x − 1 < 0 when x < 1 2 This last step must be false: by the Archimedian principle, we can always find some n ∈ N greater than any specific quantity. So, by contradiction, we can find sn > x for every x ∈ [0, 21 ): this means that the upper bound cannot be less than 12 . But 21 is an upper bound, so we see that 12 is the least such upper bound. 32 We have shown that {sn } is a monotonically increasing bounded sequence with a least upper bound of by theorem 3.14, this proves that the limit of {sn } is 21 . 1 2: Exercise 3.3 Note that we are not asked to find the limit of this sequence. We are only asked to show that it converges and that 2 is an upper bound. By theorem 3.14, we can prove convergence by proving that the sequence is bounded and monotonically increasing. p √ The sequence is bounded above by 2 + 2 Although we’re asked to show the sequence has an upper bound of 2, we can prove the stronger result that p that √ √ it has an upper bound of 2 + 2. We can prove this by induction. We see that 0 < s1 = 2 < 2. Now assume that 0 < sn < 2: 7→ 0 < sn < 2 hypothesis of induction √ √ → 0 < sn < 2 √ √ → 0 < 2 + sn < 2 + 2 p √ p √ → 2 + sn < 2 + 2 p √ → sn+1 < 2 + 2 p √ By induction, then, sn < 2 + 2 for all n. The sequence is monotonically increasing p √ √ Proof by induction. We can immediately see that s1 = 2 < 2 + 2 = s2 . Now, assume that we have proven si−1 <psi for i = 1 . . . n. √ 7→ sn = 2 + sn−1 definition of sn p √ → sn > 2 + sn−2 our hypothesis of induction tells us sn−1 > sn−2 → sn > sn−1 definition of sn−1 We’ve shown that sn is a monotonically increasing function that is bounded above. By theorem 3.14, this is sufficient to prove that it converges. Exercise 3.4 From the recursive definition we are given, we can see that s2 = 0 and s2m = 14 + 12 s2m−2 . So we can use theorem 3.26 to find the limit of s2m . ! ∞ X 1 1 1 1 1 1 1 1 lim s2m = + + + ... = −1− = −1− = 1 n m→∞ 4 8 16 2 2 2 2 1− 2 n=0 From the same recursive definition, we see that s1 = 0 and s2m+1 = 12 + 21 s2m−1 . So the limit of s2m+1 is given by ! ∞ X 1 1 1 1 1 lim s2m+1 = + + + . . . = −1= 1 −1=1 n m→∞ 2 4 8 2 1 − 2 n=0 This shows that as n increases, the terms of {sn } are either arbitrarily close to these are our upper and lower bounds for the sequence. 
1 2 or arbitrarily close to 1, so Exercise 3.5 Let a∗ = limn→∞ sup an , let b∗ = limn→∞ sup bn , and let s∗ = limn→∞ sup(an + bn ). If a∗ and b∗ are both finite: From theorem 3.17(a) we know that there are some subsequences {aj }, {bk } such that limj→∞ aj + limk→∞ bk = 33 s∗ (that is, the supremum of E as defined above is also member of E: E is a closed set). The limit of aj must be less than or equal to the supremum a∗ and the limit of bj must be less than or equal to the supremum b∗ , so s∗ = lim aj + lim bk ≤ lim sup an + lim sup bn = a∗ + b∗ j→∞ n→∞ k→∞ n→∞ By transitivity, this shows that s∗ ≤ a∗ + b∗ : and, by the definitions of these terms, this proves that lim sup(an + bn ) ≤ lim sup an + lim sup bn n→∞ ∗ n→∞ n→∞ ∗ If either a or b is infinite: We’re asked to discount the possibility that a∗ = ∞ and b∗ = −∞. In all other cases when one or both of these values is infinite, the inequality can be easily shown to resolve to ∞ = ∞ or −∞ = −∞. Exercise 3.6a an = √ n+1− √ √ n= √ n+1− n √ √ n+1+ n 1 √ =√ √ √ n+1+ n n+1+ n We can compare this last term to a known series: an = √ 1 1 1 > √ > √ 2 n+1+ n 2 n+1 1/2 1 n P 1 1/2 We know that the series diverges by theorem 3.28, so the terms n P of {an } are larger than the terms of a divergent series. By the comparison theorem 3.25, this tells us that an is divergent. Exercise 3.6b √ an = n+1− n √ n √ = n+1− n √ √ √ n n+1+ n 1 √ √ =√ √ 3 n+1+ n n + n2 + n2 We can compare this last term to a known series: √ 1 n3 + n2 + √ n3 <√ 1 n3 + √ 1 = 3 2 n 3/2 1 n P 1 3/2 converges by theorem 3.28, so the terms We know that the series n Pof {an } are smaller than the terms of a convergent series. By the comparison theorem 3.25, this tells us that an is convergent. Exercise 3.6c √ For n ≥ 1, we can show that 0 ≤ ( n n − 1) < 1: 7→ (∀n ∈ N)(1 ≤ n < 2n ) the < 2n can easily be proven via induction √ → (∀n ∈ N)(1 ≤ n n < 2) take the nth root of each term √ → (∀n ∈ N)(0 ≤ n n − 1 < 1) subtract 1 from each term P n x converges when 0 ≤ x < 1, so by the comparison theorem 3.25 we know that P We know that the series an is convergent. Exercise 3.6d This probably is more difficult than it looks at first, mainly because for complex z it’s not true that 1 + z n > z n – in fact, without absolute value signs the inequality z1 > z2 has no meaning whatsoever (see exercise 1.8). It’s also not true that |1 + z n | > |z n |. The best we can do is appeal to the triangle inequality. 34 If |z| < 1 then by the triangle inequality we have lim n→∞ 1 1 ≥ lim =1 n n→∞ |1| + |z n | 1+z and therefore by theorem 3.23 the series doesn’t converge. If |z| = 1 then we similarly have lim n→∞ 1 1 1 ≥ lim = n→∞ |1| + |z n | 1 + zn 2 and by theorem 3.23 the series again fails to converge. If |z| > 1 then we can use the ratio test: lim n→∞ z −n 1 + z n z −n + 1 1 + zn 1 = lim = lim = <1 n+1 −n n+1 −n n→∞ z n→∞ z 1+z 1+z +z z and by theorem 3.34 the series converges. I’m not totally satisfied with this proof because I can’t totally justify the claim that limn→∞ z −n = 0 without appealing to the polar representation z −n = r−n e−inθ and the fact that |eixθ | = 1 for all x. Exercise 3.7 Define the partial sum tn to be tn = n √ X ak k=1 P√ k We can show that the series an /n converges by showing that the sequence {tn } converges; we can do this by showing that {tn } is bounded and monotonically increasing (theorem 3.14). the sequence is monotonically increasing: √ From the definition of partial sums, we know that tn = an /n + tn−1 for all n. 
And we’re told an > 0 for every n, so tn > tn−1 for every n. the sequence is bounded: pP P P a2 b2 . Applying this to the given series, From the Cauchy-Schwartz inequality, we know that (ab) ≤ we see that r X √ 1 X X 1 an an ≤ n n2 P P We are told that an converges, and from theorem 3.28 we know that that 1/n2 converges. Therefore (by theorem 3.50)√we know that their product converges P √to some α, and so the√right-hand side of the above inequality an /n is bounded by α, and therefore the sequence {tn } is converges to √ α. This shows us that the series bounded by α. P By assuming that an is convergent and that an > 0, we’ve shown that tn is bounded and monotonically increasing. By theorem 3.14 this is sufficient to show that tn converges. And this is what we were asked to prove. Exercise 3.8 We’re told that {bn } is bounded. Let α be the upper bound of {|bn |}. We’re also told that for any arbitrarily small , we can find an integer N such that n X k=m ak ≤ for all n, m such that n ≥ m ≥ N α which is algebraically equivalent to n X ak α ≤ for all n, m such that n ≥ m ≥ N k=m 35 P an converges: so which, since |bk | ≤ α for every k, means that n X ak bk ≤ n X ak α ≤ for all n, m such that n ≥ m ≥ N k=m k=m By theorem 3.22, this is sufficient to prove that P an bn converges. Exercise 3.9a p n α = lim sup |n3 | = lim sup |n3/n | = |n0 | = 1 So the radius of convergence is R = 1/α = 1. Exercise 3.9b Example 3.40(b) mentions that “the ratio test is easier to apply than the root test”, although we’re never explicitly told what the ratio test is, how it might relate to the ratio test for series convergence, or how to apply the ratio test to example (b). According to some course handouts I found online 2 , we can’t simply apply the ratio test in the same way we use the root test. That is, we can’t use the fact that β = lim sup 2 2n+1 n! = lim sup =0 (n + 1)! 2n n+1 to assume that the radius of convergence is 1/β = ∞. Instead, we look at theorem 3.37, which tells us that α ≤ β, so that 1 1 R= ≥ =∞ α β Exercise 3.9c s α = lim sup n 2 2n 2 √ 2 = lim sup √ = n n2 (lim sup n) 2 n The last step of this equality is justified by theorem 3.3c. From theorem 3.20c, we know the limit of this last term: 2 2 √ 2 = 1 (lim sup n) So α = 2, which gives us a radius of convergence of 1/2. Exercise 3.9d √ n √ n3 (lim sup n n)3 = 3 3 The last step of this equality is justified by theorem 3.3c. From theorem 3.20c, we know the limit of this last term: √ (lim sup n n)3 1 = 3 3 So α = 1/3, which gives us a radius of convergence of 3. α = lim sup p n |n3 3n | = lim sup Exercise 3.10 P We’re told that infinitely many of the coefficients of an z n are positive integers. This means that for every N , there is some n > N such that an ≥ 1. Therefore lim supn→∞ |an | ≥ 1 p → lim supn→∞ n |an | ≥ 1 from the fact thatk > 1 → k 1/n > 1 → 1 R ≥1 →1≥R This final step indicates that the radius of convergence is at most 1. 2 http://math.berkeley.edu/∼gbergman/ 36 theorem 3.39 Exercise 3.11a Proof by contrapositive: we will assume that converges. To simplify the form of series by 1/an to get P P an /(1 + an ) converges and show that this implies that P an an /(1 + an ), we can multiply the numerator and denominator of each term of the 1 +1 X 1 an For this to converge, it is necessary that lim n→∞ 1 an 1 =0 +1 which can only happen if the limit of the denominator is ∞; and this can only happen if the limit of 1/an is ∞. That is, it must be the case that lim an = 0 n→∞ P This alone is not sufficient to prove that an converges. 
It does, however, allow us to make the terms of an as arbitrarily small as we like. For this proof, we it’s helpful to recognize that there is some integer N1 such that an < 1 for all n > N1 . From the assumption that P an /(1 + an ) converges, we know that for every there is some N such that n X k=m ak < for every n ≥ m ≥ N 1 + ak Let N be the larger of {N1 , N }. For any and any n ≥ m ≥ N , we can produce two inequalities: n X k=m n X k=m n X ak ak < 1+1 1 + ak k=m ak < for every n ≥ m ≥ N 1 + ak The first is true because each k is larger than N1 (so that ak < 1); the second is true because each k is larger than N . Together, these inequalities tell us that n X ak < for every n ≥ m ≥ N 2 k=m or, equivalently, that n X ak < 2 for every n ≥ m ≥ N k=m And our choice of was arbitrary, so we have shown that for every P there is some N that makes the last statement true: and this is the definition of convergence for the series an . P an P We’ve an . By contrapositive, the fact 1+an implies the convergence of P shown that the convergence P anof diverges. that an diverges is proof that 1+an Exercise 3.11b From the definition of sn , we can see that sn+1 = a1 + . . . + an+1 = sn + an+1 37 and we’re told that every an > 0, so we know that sn+1 > sn . By induction, we also know that sn ≥ sm whenever n ≥ m. And from this, we know that s1m ≥ s1n whenever n ≥ m. Therefore: aN +1 aN +k aN +1 aN +k aN +1 + . . . + aN +k + ... + ≥ + ... + = sN +1 sN +k sN +k sN +k sN +k A bit of algebraic manipulation shows us that aN +1 +. . .+aN +k = sN +k −sN , so this last inequality is equivalent to aN +k sN +k − sN sN aN +1 + ... + ≥ =1− sN +1 sN +k sN +k sN +k which is what we were asked to prove. P an To show that sn is divergent, we once again look at the sequence {sn }. We’ve already determined that {sn } is an increasingPsequence, and we know that’s it’s not convergent (otherwise, by the definition 3.21 of “convergent series”, an would be convergent). Therefore, we know that {sn } is not bounded (from theorem 3.14, which says that a bounded monotonic series is convergent). Now let sN be an arbitrary element of {sn } and let be an arbitrarily small real. From the fact that {sn } is not bounded, we can make sN +k arbitrarily large by choosing a sufficiently large k. This means that, for any N and any arbitrarily small , we can make sNsN+k < by choosing a sufficiently large k. And we’ve established that aN +1 aN +k sN + ... + ≥1− sN +1 sN +k sN +k which means that, by choosing sufficiently large k, we get the inequality N +k X k=N +1 ak sN ≥1− ≥1− sk sN +k P an 1 We’ve now established everything sn is divergent. Choose any such that 0 < < 2 . P anwe need to show that From theorem 3.22, in order for sn to be convergent we must be able to find some integer N such that n X ak < for all n ≥ m ≥ N sk k=m But there can be no such N , because for every N we’ve shown that there is some k such that N +k X k=N +1 ak >1−> sk (note that 1 - > because 0 < < 21 ). And this is sufficient to show that the series does not converge. Exercise 3.11c To prove the inequality: sn ≥ sn−1 → → → → 1 sn an s2n an s2n an s2n ≤ ≤ ≤ ≤ {sn } is an increasing sequence (see part b) 1 sn−1 an sn sn−1 sn −sn−1 sn sn−1 1 1 sn−1 − sn multiply both sides by an /sn sn − sn−1 = an (see part b) algebra Now consider the summation n X k=2 1 sk−1 − 1 = sk 1 1 − s1 s2 + 1 1 − s2 s3 38 + ... 
+ 1 sn−1 − 1 sn Most of the terms in this summation cancel one another out: the summation “telescopes down” and simplifies to n X 1 1 1 1 − = − sk−1 sk s1 sk k=2 As we saw in part (b), the terms of {sn } increase without bound as n → ∞, so that ∞ X k=2 1 sk−1 − 1 1 = sk s1 Now, by the inequality we proved above, we know that ∞ X an k=2 s2n ≤ ∞ X k=2 1 1 1 − = sk−1 sk s1 We can add one term to each side so that our summation starts at 1 instead of at 2 to get ∞ X an k=1 s2n ≤ a1 1 + s21 s1 We can now show that the series converges. Define the partial sum of this series to be {tn } = n X k=1 1 sk−1 − 1 sk We’ve shown that {tn } is bounded above by a1 /s21 + 1/s1 , and we know that {tn } is monotonically increasing because each an is positive. Therefore, by theorem 3.14 we know that {tn } converges; and so by definition 3.21 we know that its associated series converges. And this is what we were asked to prove. Exercise 3.11d P The series an /(1 + n2 an ) always converges. From the fact that an > 0, we can establish the following chain of inequalities: an 1/an an 1 1 = = 1 < 2 2 2 2 1 + n an 1/an 1 + n an n an + n From this, we see that ∞ X ∞ X an 1 < 2a 2 1 + n n n n=0 n=0 P P We know that 1/n2 converges (theorem 3.28), and therefore an /(1 + n2 an ) converges by the comparison test of theorem 3.25. The series P an /(1 + nan ) may or may not converge. If an = 1/n, for instance, the summation becomes X X 1 1X1 an n = = 1 + nan 2 2 n which is divergent by theorem 3.28. To construct a convergent series, let an be defined as 1 if n = 2m − 1 (m ∈ Z) an = 0 otherwise P The series an is divergent, since there are infinitely many integers of the form 2m − 1. But the series P an /(1 + nan ) is convergent: ∞ ∞ X X X 1 m an 1 = = 1 + nan 2m 2 n=0 m=0 This series is convergent to 2 by theorem 3.26. 39 Exercise 3.12a establishing the inequality From the definition of rn , we see that rk = ∞ X am = ak + m=k ∞ X am = ak + rk+1 m=k+1 so that ak = rk − rk+1 . And because each ak is positive, we see that rk > rk+1 which (from transitivity) means that rm > rn when n > m (that is, {rn } is a decreasing sequence). With these equalities, we can form our proof. Take m ≤ n and consider the sum n X am an ak = + ... + rk rm rn k=m From the fact that rm ≥ rn , we know that ak /rm ≤ ak /rn for all k, so that am + . . . + an am an am + . . . + an ≤ + ... + ≤ rm rm rn rn Now, from the fact that ak = rk − rk+1 , we see that this inequality is equivalent to (rm − rm+1 ) + (rm+1 − rm+2 ) + . . . + (rn − rn+1 ) am an (rm − rm+1 ) + (rm+1 − rm+2 ) + . . . + (rn − rn+1 ) ≤ +. . .+ ≤ rm rm rn rn Notice that most of the terms of these numerators cancel one another out, leaving us with rm − rn+1 am an rm − rn+1 ≤ + ... + ≤ rm rm rn rn Taking the two leftmost terms of this inequality and performing some simple algebra then gives us 1− rn+1 am an ≤ + ... + rm rm rn This is close to what we want to prove, but our index is off by one. But each an is positive and each rn is the sum of positive terms, so an+1 /rn+1 is strictly positive. By adding this term to the right-hand side of the inequality, we accomplish two things: we correct our index and we make this a strict inequality (<) instead of a non-strict inequality (≤). rn+1 am an an+1 1− < + ... + + rm rm rn rn+1 This last statement is true whenever n ≥ m, which means that it’s true whenever n + 1 > m. By a simple replacement of variables, then, we know that 1− am a0 rn0 < + . . . + 0n rm rm rn whenever n0 > m, which is what we were asked to prove. 
proof of divergence P P We’re told that an converges, so an = α for some α ∈ R. From the definition of rn , we see that there is a relationship between {rn } and α: rn = ∞ X k=n ak = ∞ X ak − k=1 n−1 X k=1 ak = α − n−1 X ak k=1 As n → ∞, this last term approaches α − α = 0, and so we see that limn→∞ rn = 0. 40 Now choose any arbitrarly small from the interval (0, 12 ) and choose any arbitrary integer N . From the inequality we verified in part (a1), we see that aN aN +n rN +n + ... + >1− rN rN +n rN We can choose n to be arbitrarily large; in doing so, rN +n approaches zero while rN remains fixed. So, by choosing a sufficiently large n, we can guarantee that rN +n /rN < . This choice of n gives us the inequality aN +n rN +n aN + ... + >1− >1−> rN rN +n rN where the last step of this chain of inequalities is justified by our choice of 0 < < 12 . We’ve shown that there is some such that, for every N , we can find n X an > rn k=N P an /rn diverges (since we’ve proven the negation of the “for From theorem 3.22, this is sufficient to show that every > 0 there is some N . . . ” statement of the theorem). Exercise 3.12b rn > rn+1 > 0 √ √ → rn > rn+1 √ √ √ → 2 rn > rn + rn+1 √ established in part (a) take square root of each side √ add rn to each side √ divide by rn √ r + r → 2 > n√rn n+1 √ √ → 2( rn − rn+1 ) > √ √ → 2( rn − rn+1 ) > √ √ → 2( rn − rn+1 ) > √ √ √ √ ( rn + rn+1 )( rn − rn+1 ) √ rn rn −rn+1 √ rn an √ rn multiply each side by a positive term simply the right-hand side we established an = rn − rn+1 in part (a) Having established this inequality, we see that n n X X √ ak √ 2( rk − rk+1 ) √ < rk k=1 k=1 and many of the terms in the right-hand summation cancel one another out, so that the series “telescopes down” and simplifies to n X √ ak √ √ < 2 r1 − rk+1 rk k=1 so that n X √ √ ak √ √ < lim 2 r1 − rk+1 = 2 r1 n→∞ n→∞ rk lim k=1 This shows that this sum of nonnegative terms is bounded, which is sufficient to show that it converges by theorem 3.24. Exercise 3.13 P P Let an be a series that converges absolutely to α and let bn be a series that converges absolutely to β. We can prove that the Cauchy product of these two series (definition 3.48) is bounded. 41 7→ Pn = k=0 |ck | Pk Pn j=0 k=0 aj bk−j definition 3.48 of ck ≤ Pn Pk |aj bk−j | triangle inequality = Pn Pk |aj ||bk−j | 1.33(c) k=0 k=0 j=0 j=0 We can expand this summation out to get: = |a0 ||b0 | + (|a0 ||b1 | + |a1 ||b0 |) + . . . + (|a0 ||bn | + |a1 ||bn−1 | + . . . + |an ||b0 |) Define Bn to be Pn k=0 |bk | and An to be Pn k=0 |ak |. We can rearrange these terms to get: = |a0 |Bn + |a1 |Bn−1 + |a2 |Bn−2 + . . . + |an |B0 Now we can add several nonnegative terms to get ≤ |a0 |Bn + |a1 |(Bn−1 + |bn |) + |a2 |(Bn−2 + |bn−1 | + |bn |) + . . . + |an |(B0 + |b1 | + . . . + |bn |) = |a0 |Bn + |a1 |Bn + |a2 |Bn + . . . + |an |Bn = An B n Pn Pn = ( k=0 |ak |) ( k=0 |bk |) = αβ P This shows that each partial sum of |ck | is bounded above by αβ P and below by 0 (since it’sPa series of nonnegative terms). By theorem 3.24, this is sufficient to P prove that |ck | converges; and since ck is the Cauchy product, we have proved that the Cauchy product ck converges absolutely. Exercise 3.14a Choose any arbitrarily small > 0. We’re told that lim{sn } = s, which means that there is some N such that d(s, sn ) < whenever n > N . So we’ll rearrange the terms of σn a bit: PN Pn σn = k=0 sk n+1 = k=0 sk Pn + k=N +1 sk n+1 Whenever n > N we know that d(s, sn ) < , which means that − < sn − s < , or that s − < sn < s + . 
This gives us the inequality PN k=0 sk PN Pn Pn + k=N +1 (s − ) k=0 sk + k=N +1 (s + ) < σn < n+1 n+1 The terms in some of these summations don’t depend on k, so we can further rewrite this as PN k=0 sk + (n − (N + 1))(s − ) < σn < n+1 PN k=0 sk + (n − (N + 1))(s + ) n+1 PN PN sk n(s − ) (N + 1)(s − ) n(s + ) (N + 1)(s + ) + − < σn < k=0 + − n+1 n+1 n+1 n+1 n+1 n+1 For many of these terms, the numerators are constant with respect to n. Therefore many of these terms will approach zero as n → ∞. n(s − ) n(s + ) lim < lim σn < lim n→∞ n + 1 n→∞ n→∞ n + 1 s − < lim σn < s + k=0 sk n→∞ And this is just another way of saying d(s, lim σn ) < . And was arbitrarily small, so we have shown that limn→∞ σn = s. 42 Exercise 3.14b Let sn = (−1)n . Then P sn is either equal to 0 or 1, depending on n, and 1 0 ≤ σn ≤ n+1 n+1 Taking the limit as n → ∞ gives us 0 ≤ lim σn ≤ 0 n→∞ which can only be true if lim σn = 0 n→∞ Exercise 3.14c Define sn to be 1 n 2 n 1 2 sn = if n = k 3 (k ∈ Z) otherwise +k √ √ There are no more than 3 n perfect cubes within the first n integers, so {sn } will contain no more than b 3 nc terms of the form (1/2)n + k. This gives us the inequality n X sn = k=0 n n X 1 k=0 2 b + √ 3 nc X √ 3 k ≤2+ k=0 √ n( 3 n + 1) 2 where the last step in this chain of inequalities is justified by the common summation Continuing, we see that n X √ 3 sn ≤ 2 + k=0 Pn k=1 k = k(k + 1)/2. √ √ √ √ 3 √ n( 3 n + 1) n( 3 n + 3 n) 3 ≤2+ = 2 + n2 2 2 We can now analyze the value of σn . Pn σn = k=0 sn n+1 √ 2 3 √ 3 2 + 1 2 + n2 n ≤ = √ 1 3 n+1 n+ √ 3 2 n Taking the limit of each side as n → ∞ gives us 2 √ 3 2 n lim σn ≤ lim √ 3 n→∞ n→∞ +1 n+ 1 √ 3 2 n 1 = lim √ =0 n→∞ 3 n Every term of {sn } was greater than zero, so we know that the arithmetic average σn is greater than zero. Therefore 0 ≤ lim σn ≤ 0, and therefore lim σn = 0. Exercise 3.14 proving the equality n X kak = (s1 − s0 ) + 2(s2 − s1 ) + . . . + n(sn − sn−1 ) k=1 = −s0 + (1 − 2)s1 + (2 − 3)s2 + . . . + ((n − 1) − n)sn−1 + sn (n) = −s0 − s1 − . . . − sn + (n + 1)sn Therefore, if we divide this by n + 1, we get n 1 X −s0 − s1 − . . . − sn kak = + sn = −σn + sn n+1 n+1 k=1 43 establishing convergence We’re told that limn→∞ nan = 0, so by part a we know that Pn k=1 kak lim =0 n→∞ n+1 We’re also told that {σn } converges to some value σ, so by theorem 3.3(a) we know that Pn k=1 kak + σn = (0 + σ) = σ lim n→∞ n+1 And since sn = Pn kak n+1 k=1 + σn , this is sufficient to prove that limn→∞ sn = σ. Exercise 3.15 If you think I’m going to go through this tedious exercise, you can lick me where I shit. Exercise 3.16a {xn } is monotonically decreasing √ From the fact that 0 < α < xn we know that α < x2n , and therefore 1 α x2n 1 xn+1 = xn + xn + < = xn 2 xn 2 xn This shows that xn > xn+1 for all n, which proves that {xn } is monotonically decreasing. √ The limit of {xn } exists and is not less than a √ √ First, √ we show that {xn } is bounded below by a. We know that x0 > a because we chose it to be. And if xn > a for any n, we have 7→ xn 6= → xn − → → → → → → √ √ a a 6= 0 √ 2 (xn − a) ≥ 0 √ x2n − 2xn a + a ≥ 0 √ x2n + a ≥ 2xn a √ x2n +a a 2xn ≥ √ 1 a a 2 xn + xn ≥ √ xn+1 ≥ a assumed 1.18(d) definition of xn+1 √ So by induction, we know that √ xn ≥ a for all n. We’ve now demonstrated that {xn } is monotonically √ decreasing and is bounded below by a, so we’re guaranteed that the limit of {xn } exists and that lim{xn } ≥ a. √ The limit of {xn } is not greater than a √ √ √ √ Now we show that lim{xn } ≤ a. Assume that √ lim{xn } = b for some b > a. 
By the definition √ of “limit”, we can find N such that n > N implies d(x , b) < for any arbitrarily small . We’ll choose = b − a (which n √ √ is positive since b > a ≥ 1 implies b > a). √ 7→ d(xn , b) < √ √ → d(xn , b) < b − a √ √ → xn − b < b − a √ √ → xn < b − a + b chosen value of metric on R1 44 We can then calculate xn+1 in terms of xn to get a chain of inequalities: √ √ √ √ (b − a) + 2 b b − a + b + a x2n + a ( b − a + b)2 + a √ √ = xn+1 = < √ √ 2xn 2( b − a + b) 2( b − a + b) √ √ √ √ √ 2b + 2 b b − a b( b + b − a) √ √ = √ = √ = b √ 2( b − a + b) b+ b−a √ √ This shows us that if the distance guaranteed to √ between xn and b becomes less than b − a (which it is √ do eventually because lim{xn } = b), then the next term in the sequence {xn } will be less than b. And since we’ve already shown that √ that every subsequent term will √ √ the sequence is monotonically decreasing, this means b, which contradicts our assumption that lim{x } = b. Therefore the limit is not b: be even farther from n √ √ b was an arbitrary value greater than a, so we’ve shown that the limit cannot be any value greater than but √ a. √ Having shown that lim{xn } exists, that the√limit is not less than a, we have therefore proven that lim{xn } = a. √ a, and that the limit is not greater than Exercise 3.16b From the definition of xn+1 and en , we have en+1 = xn+1 − √ a= √ √ √ x2n + a √ x2 − 2xn a + a (xn − a)2 − a= n = 2xn 2xn 2xn e2n e2 < √n 2xn 2 a √ where the last step is justified by the fact that xn > a (see part (a)). = The next part can be proven by induction. Setting n = 1 in the above inequality, we have e2 e2 e2 < √1 = 1 = β β 2 a e1 β 21 And, if the statement is true for n, then en+1 " n−1 #2 " n# 2n 2 2 1 2 1 e1 e1 e2n 1 e1 2 β β < √ = (en ) < = =β β β β β β β 2 a which shows that it is therefore true for n + 1. By induction, this is sufficient to show that it is true for all n ∈ N. Exercise 3.16c When a = and x1 = 2, we have √ √ x1 − a 2− 3 e1 √ √ = = β 2 a 2 3 which can be shown to be less than 1/10 through simple algebra. Then, by part (b), we have en < β e1 β 2n √ <2 3 45 1 10 2n <4 1 10 2n Exercise 3.17 √ √ √ √ Lemma 1: xn > a → xn+1 < a, and xn < a → xn+1 > a √ 7→ xn > a √ √ √ ↔ xn ( a − 1) > a( a − 1) √ √ ↔ xn a − xn > a − a √ √ ↔ xn a + a > a + xn √ ↔ a(xn + 1) > a + xn √ n ↔ a > a+x xn +1 √ ↔ a > xn+1 definition of xn+1 √ √ This shows that xn > a → xn+1 <√ a. If “>” is√replaced with “<” in each of the above steps, we will have also constructed a proof that xn < a → xn+1 > a. √ √ Lemma 2: xn > a → xn+2 < xn and xn < a → xn+2 > xn √ 7→ xn > a √ ↔ xn − a > 0 √ √ multiplied by positive terms, so it’s still > 0 ↔ 2(xn − a)(xn + a) > 0 ↔ 2(x2n − a) > 0 At this point, the algebraic steps become bizarre and seemingly nonsensical. But bear with me. ↔ 2x2n + 0xn − 2a) > 0 ↔ 2x2n + (1 + a − 1 − a)xn − 2a) > 0 ↔ xn + x2n + axn + x2n > a + axn + a + xn ↔ xn (1 + xn ) + xn (a + xn ) > a(1 + xn ) + a + xn ↔ xn (1+xn )+xn (a+xn ) 1+xn ↔ xn 1 + a+xn 1+xn a(1+xn )+a+xn 1+xn > >a+ division by a positive term a+xn 1+xn ↔ xn (1 + xn+1 ) > a + xn+1 ↔ xn > a+xn+1 1+xn+1 ↔ xn > xn+2 definition of xn+2 √ This shows that xn > a → xn+2 √ < xn . If “>” is replaced with “<” in each of the above steps, we will have also constructed a proof that xn < a → xn+2 > xn . Exercise 3.17a √ We’re forced to choose x1 such that x1 > a, so we can use induction with lemma 2 to show than {x2k+1 } (the subsequence of {xn } consisting of elements with odd indices) is a monotonically decreasing sequence. 
Exercise 3.17b √ √ Because we chose x1 such that x1 > a, lemma 1 tells us that x2 < a. We can then use induction with lemma 2 to show that {x2k } (the subsequence of {xn } consisting of elements with even indices) is a monotonically increasing sequence. Exercise 3.17c √ √ We can show than {xn } converges to a by showing √ that we can √ make d(xn , a) arbitrarily small. We do this by demonstrating the relationship between d(xn , a) and d(x1 , a). 46 en+2 = xn+2 − 7→ en+2 = xn+2 − → en+2 = → en+2 = a+xn+1 1+xn+1 √ − a √ √ √ a= √ √ a + xn+1 √ a + xn+1 − a − xn+1 a − a= 1 + xn+1 1 + xn+1 a a+xn+1 − a−xn+1 1+xn+1 √ a √ √ → en+2 (1 + xn+1 ) = a + xn+1 − a − xn+1 a √ √ √ → en+2 (1 + xn+1 ) = xn+1 (1 − a) − a(1 − a) √ √ → en+2 (1 + xn+1 ) = (xn+1 − a)(1 − a) √ → en+2 (1 + xn+1 ) = en+1 (1 − a) √ 1− a → en+2 = en+1 1+x n+1 This tells us how to express en+2 in terms of en+1 . We can use this same equality to express en+1 in terms of en , giving us √ √ √ 1− a 1− a 1− a en+2 = en+1 = en 1 + xn+1 1 + xn 1 + xn+1 ! ! √ √ √ √ 1− a 1− a 1− a 1− a = en = en 1+xn +a+xn n 1 + xn 1 + xn 1 + a+x 1+xn 1+xn √ √ √ (1 + xn )(1 − a) (1 − a)2 1− a = en = en 1 + xn 1 + 2xn + a 1 + 2xn + a √ ( a − 1)2 < en a−1 The last step in this chain of inequalities is justified by the fact that a + 2xn + 1 > a − 1 > 1. Continuing, we have √ √ ( a − 1)2 a−1 en+2 < en = cen , 0 < c < 1 = en √ a−1 a+1 This tells us how to express en+2 in terms of en . We can use this same inequality to express e2n+2 in terms e2 and e2n+1 in terms of e1 : e2n+2 < ce2n < c(ce2n−2 ) < . . . < cn e2 e2n+1 < ce2n−1 < c(ce2n−3 ) < . . . < cn e1 And since 0 < c < 1, this means that we can make cn arbitrarily small by taking sufficiently large n; and en < c2 [maxe1 , e2 ], so we can make en arbitrarily small by taking √ sufficiently large n; and √ this shows that a). So lim d(x , a) = 0 which is limn→∞ en = 0. Finally, remember that we defined e to be d(x , n→∞ n n n √ sufficient to prove that limn→∞ {xn } = a. Exercise 3.17d √ √ As shown above, every two iterations reduces the error term by a factor of ( a−1)/( a+1) (linear convergence). n For the algorithm in exercise 16, the nth iteration reduced the error term by a factor of 10−2 (quadratic convergence). Exercise 3.18 Lemma 1 : {xn } is decreasing. √ √ We’re asked to choose that x1 > p a. And if xn > p a, we have 47 7→ xn > → xpn √ p a >a → 0 > a − xpn → pxpn > pxpn + a − xpn → pxpn > (p − 1)xpn + a → pxp n pxp−1 n → xn > (p−1)xp n +a pxp−1 n (p−1)xn + ap xp−1 n p > → xn > xn+1 Lemma 2 : 0 < k < 1 → p(1 − k) > 1 − k p Let k be a positive number less than 1 and let p be a positive integer. 7→ p = 10 + 12 + 13 + . . . + 1p−1 → p > k + k 2 + k 3 + . . . + k p−1 →p> →p> (k−1)(k+k2 +k3 +...+kp−1 ) k−1 kp −1 k−1 p → p(k − 1) > k − 1 Lemma 3 : xn > √ p We know that x1 > 7→ xn > √ p a √ p a because we chose it. And if xn > √ p a, we have: a>0 √ p → 1 > xna > 0 √ p √ p p → p 1 − xna > 1 − xna √ p a → p − 1 > p xn − a p xn √ p p p → xn (p − 1) > xn p xna − xapn √ → pxpn − xpn > p p axp−1 −a n √ p p → (p − 1)xn + a > p axp−1 n → → √ (p−1)xp p p axp−1 n +a n > pxp−1 pxp−1 n n √ (p−1)xn a + pxp−1 > pa p n → xn+1 > √ p a Finally : limn→∞ {xn } = from lemma 2 expand and rearrange the terms multiply both sides by xpn rearrange the terms and simplify divide both sides by pxp−1 n simplify definition of xn+1 √ p a We’ve shown that {xn } is decreasing (lemma 1)√and that it’s bounded below (lemma 3), which is sufficient to show that it has some limit x∗ with x∗ ≥ p a. 
From the definition of limit, we can find N such that n > N → d(xn , x∗ ) < 2p . From this, we see that we can find xn and xn+1 such that: 48 7→ d(xn , x∗ ) < ∧ d(xn+1 , x∗ ) < 2p ∗ → d(xn , xn+1 ) < d(xn , x∗ ) + d(x , xn+1 ) = → xn − xn+1 < p h → xn − xn − xpn + → xn p → xp n −a pxp−1 n − → xn − a p−1 pxn < < p triangle inequality lemma 1: xn > xn+1 a i < pxp−1 n p definition of xn+1 p p a xp−1 n < This last statement tells us that limn→∞ xn − 7→ limn→∞ xn − a xp−1 n =0 a xp−1 n = 0, or that assumed → limn→∞ xn (1 − a/xpn ) = 0 p √ p =0 → x∗ 1 − x∗a √ → p a/x∗ = 1 √ → p a = x∗ theorem 3.3c theorem 3.3c From the definition of x∗ as the limit of {xn }, this tells us that limn→∞ {xn } = to prove. √ p a which is what we wanted Exercise 3.19 The idea behind this proof is to consider the elements of the line segment [0, 1] in the form of their base-3 expansion. When we do this, we notice that the mth iteration of the Cantor set eliminates every number with a 1 as the mth digit of its ternary decimal expansion. From equation 3.24 in the book, we know that a real number r is not in the Cantor set iff it is in any interval of the form 3k + 1 3k + 2 , m, k ∈ N 3m 3m Let {αn } represent an arbitrary ternary sequence (that is, each digit is either 0,1, or 2). We can determine the necessary and sufficient conditions for x(α) to fall into such an interval: 3k + 1 3k + 2 x(α) 6∈ Cantor iff x(α) ∈ , 3m 3m ∞ X αn 3k + 1 3k + 2 iff (∃m) ∈ , 3n 3m 3m n=1 iff (∃m) ∞ X αn ∈ (3k + 1, 3k + 2) n−m 3 n=1 We then split up the summation into three distinct parts. iff (∃m) m−1 X n=1 iff (∃m) αn 3n−m m−1 X n=1 + m X n=m αn 3n−m + ∞ X n=m+1 ∞ X αn 3m−n + αm + n=m+1 αn n−m 3 αn n−m 3 ∈ (3k + 1, 3k + 2) ∈ (3k + 1, 3k + 2) Every term in the leftmost sum is divisible by three, so the leftmost sum is itself divisible by three. This gives us, for some j ∈ N, ∞ X αn iff (∃m) 3j + αm + ∈ (3k + 1, 3k + 2), j ∈ N n−m 3 n=m+1 49 Without specifying a certain sequence {αn }, we can’t evaluate the rightmost summation. But we can establish bounds for it. Each αn is either 0,1, or 2. So the summation is largest whenever αn = 2 for all n, and it’s smallest when αn = 0 for all n. This gives us the bound 0= ∞ X n=m+1 0 3n−m < ∞ X n=m+1 αn 3n−m < ∞ X 2 n=m+1 3n−m =2 ∞ X 1 =1 n 3 n=1 Note that the upper bound is 1, not 3, since the index is from 1 instead of 0. Continuing with our chain of “iff” statements, we conclude that x(a) is not member of the Cantor set iff: x(a) 6∈ Cantor iff (∃m) 3j + αm + δ ∈ (3k + 1, 3k + 2), j ∈ N, 0 < δ < 1 From the bounds on αm and δ, we know that 0 < αm + δ < 3: their sum is never a multiple of 3. So the only way that the set membership in the previous statement can be true is iff (∃m) αm + δ ∈ (1, 2), 0<δ<1 iff (∃m) αm ∈ (0, 2) We know that αm is an integer, and there’s only one integer in the open interval (0, 2): iff (∃m) αm = 1 iff x(α) 6∈ x(a) where x(a) is as defined in the exercise. This shows that any real number x(α) is a member of the Cantor set if and only if it is a member of x(a): this is sufficient to prove that the Cantor set is equal to x(a). Exercise 3.20 Choose some arbitrarily small . From the definition of Cauchy convergence, there is some N such that i, n > N implies d(pi , pn ) < /2. From the convergence of the subsequence to p∗ , there is some M such that i > M implies d(p∗ , pi) < /2. So, for i, n > max(M, N ) we have d(p∗ , pn ) ≤ d(p∗ , pi ) + d(pi , pn ) = which, because is arbitrarily small, is sufficient to prove that {pn } converges to p∗ . 
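Before moving on: the digit criterion from exercise 3.19 is easy to experiment with. Below is a small Python sketch (my own illustration, not part of the original solution) that inspects the leading ternary digits of a rational x ∈ [0, 1] and reports whether x avoids the digit 1. The cutoff of 60 digits and the use of exact fractions are arbitrary choices, and the test is only as reliable as the number of digits examined.

    from fractions import Fraction

    def in_cantor(x, digits=60):
        # Greedily extract ternary digits of x in [0, 1].  A digit 1 followed by a
        # nonzero remainder means x lies strictly inside a removed middle third.
        # (A digit 1 followed only by zeros is a left endpoint such as 1/3, which
        # has the alternative expansion 0.0222..._3 and stays in the set.)
        for _ in range(digits):
            x *= 3
            d = int(x)
            x -= d
            if d == 1 and x != 0:
                return False
        return True

    print(in_cantor(Fraction(1, 4)))   # True:  1/4 = 0.020202..._3
    print(in_cantor(Fraction(1, 3)))   # True:  1/3 = 0.0222..._3
    print(in_cantor(Fraction(1, 2)))   # False: 1/2 = 0.111..._3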
Exercise 3.21 T We know that each Ei ∈ En is nonempty (see note 1), so we can construct at least one sequence whose ith element is an arbitrary element of Ei . Let {sn } be an arbitrary sequence constructed in this way. We can immediately say three things about this sequence. 1) The sequence is Cauchy. This comes from the fact that lim diam En = 0: see the text below definition 3.9. 2) The sequence is convergent. This comes from the fact that it’s a sequence in a complete metric space X: see definition 3.12. T T 3) The sequence converges to some s∗ ∈ EnT . The set En is T a intersection of closed sets, and is therefore closed (thm 2.24b), so any limit point of E is a point of En , and therefore s∗ (being a limit point of n T T En ) is an element of En . T This shows that En isTnonempty: it contains at least one element s∗ . We can then follow the proof of theorem 3.10b to conclude that En contains only one point. 50 note 1 We’re told that Ei ⊃ Ei+1 for all i. If Ei were empty, then Ek would be empty for all k > i. This would mean that lim diam En = diam∅ = diam sup {d(p, q) : p, q ∈ ∅} = sup ∅ To find the supremum of the empty set in R, we need to rely on the definition of supremum: sup ∅ = least upper bound of ∅ in R = min {x ∈ R : a ∈ ∅ → a ≤ x} The set for which we’re seeking a minimum contains every x ∈ R (because of false antecedent a ∈ ∅), so the supremum of the empty set in R is the minimum of R itself. Regardless of how we define this minimum (or if it’s defined at all), it certainly isn’t equal to zero. But we’re told that lim diam En = 0 so our initial assumption must have been wrong: there is no empty Ei ∈ T En . Exercise 3.22 The set G1 is an open set. Choose p1 ∈ G1 and find some open neighborhood Nr1 (p1 ) ⊆ G1 . Choose a smaller neighborhood Ns1 (p1 ). Not only do we have Ns1 ⊂ Nr1 ⊆ G1 , we can take the closure of Ns1 and still have Ns1 ⊂ Nr1 ⊆ G1 (since δ ≤ s1 < r1 ). Define E1 to be the neighborhood Ns1 (not its closure). At this point, we have E1 ⊂ E1 ⊆ G1 The set G2 is dense in X, so p1 is either an element of G2 or is a limit point for G2 . In either case, every neighborhood of p1 contains some point p2 ∈ G2 . More specifically, the neighborhood E1 contains some point p2 ∈ G2 . Since p2 is in both E1 and G2 , both of which are open sets, there is some neighborhood Nr2 (p2 ) that’s a subset of both E1 and G2 . We now choose an even smaller radius Ns2 (p2 ). Not only do we have Ns2 ⊂ Nr2 ⊆ G2 and Ns2 ⊂ Nr2 ⊆ E1 , we could take the closure Ns2. Define E2 to be the neighborhood Ns2 (not its closure). At this point, we have E2 ⊂ E2 ⊂ G2 , E2 ⊂ E2 ⊂ E1 ⊂ E1 We can continue in this same way, constructing a series of nested sets E1 ⊃ E2 ⊃ · · · ⊃ En T This is a series of closed, bounded, nonempty, nested sets. T So (from exercise 21) we know that En contains T a single pointTx. This single point x must be in every Ei ∈ En, and Ei ⊂ Gi , and x must be in every Gi ∈ Gn . Therefore Gn is nonempty. Exercise 3.23 We’re aren’t given enough information about the metric space X to assume that the Cauchy sequences converge to elements in X. 
All we can say is that, for any arbitrarily small , there exists some M, N ∈ R such that n, m > M → d(pn , pm ) ≤ /2 n, m > N → d(qn , qm ) ≤ /2 From multiple applications of the triangle inequality, we have d(pn , qn ) ≤ d(pn , pm ) + d(pm , qn ) ≤ d(pn , pm ) + d(pm , qm ) + d(qm , qn ) which, for n, m > max{N, M }, becomes d(pn , qn ) ≤ + d(pm , qm ) or d(pn , qn ) − d(pm , qm ) ≤ 51 Exercise 3.24a reflexivity For all sequences {pn }, we have lim d(pn , pn ) = lim |pn − pn | = lim 0 = 0 n→∞ n→∞ n→∞ so {pn } = {pn }. symmetry If {pn } = {qn }, we have lim d(pn , qn ) = lim |pn − qn | = lim |qn − pn | = lim d(qn , pn ) n→∞ n→∞ n→∞ n→∞ so {qn } = {pn }. transitivity If {pn } = {qn } and {qn } = {rn }, the triangle inequality gives us lim d(pn , rn ) ≤ lim d(pn , qn ) + d(qn , rn ) = 0 + 0 n→∞ n→∞ This tells us that limn→∞ d(pn , rn ) = 0, so {pn } = {rn }. Exercise 3.24b Let {an } = {pn } and let {bn } = {qn }. From the triangle inequality, we have ∆(P, Q) = lim d(pn , qn ) ≤ lim d(pn , an ) + d(an , bn ) + d(bn , qn ) n→∞ n→∞ Which, from the definition of equality established in part (a), gives us ∆(P, Q) = lim d(pn , qn ) ≤ lim d(an , bn ) n→∞ n→∞ (5) A similar application of the triangle inequality gives us lim d(an , bn ) ≤ lim d(an , pn ) + d(pn , qn ) + d(qn , bn ) n→∞ n→∞ Which, from the definition of equality established in part (a), gives us lim d(an , bn ) ≤ lim d(pn , qn ) n→∞ n→∞ (6) Combining equations (5) and (6), we have ∆(P, Q) = lim d(pn , qn ) = lim d(an , bn ) n→∞ n→∞ Exercise 3.24c Define X ∗ to be the set of equivalence classes from part (b). Let {Pn } be a Cauchy sequence in (X ∗ , ∆). This is an unusual metric space, so to be clear: The sequence {Pn } is a sequence {P0 , P1 , P2 . . .} of equivalence classes in X ∗ that get “closer” to one another with respect to distance function ∆. Each Pi ∈ {Pn } is an equivalence class of Cauchy sequences of the form {pi0 , pi1 , pi2 , . . .} containing elements of X that get “closer together” with respect to distance function d. Each Pi ∈ {Pn } is a set of equivalent Cauchy sequences in X. From each of these, choose some sequence {pin } ∈ Pi . For each i we have (∃Ni ∈ R) m, n > Ni → d(pim , pin ) < i 52 We’ll define a new sequence {qn } by letting qi = pim for some m > Ni . Let Q be the equivalence class containing {qn }. We can show that limn→∞ Pn = Q. 0 ≤ lim ∆(Pn , Q) = lim lim d(pnk , qk ) ≤ lim n→∞ n→∞ k→∞ n→∞ n So, by the squeeze theorem, we have lim ∆(Pn , Q) = 0 n→∞ (7) which, by the definition of equality in X ∗ , means that limn→∞ Pn = Q. Proof that Q ∈ X ∗ : Choose any arbitrarily small values 1 , 2 , 3 . From (7) we have ∃X : n, k > X → d(pnk , qn ) < 1 And {Pn } is Cauchy, so we have ∃Y : n > Y → ∆(Pn , Pn+m ) < 2 which, after choosing appropriately large n, gives us ∃Z : k > Z → d(pnk , p(n+m)k ) < 3 Therefore, by the triangle inequality: d(qn , qn+m ) ≤ d(qn , pnk ) + d(pnk , p(n+m)k ) + d(p(n+m)k , qn+m ) Taking the limits of each side gives us lim d(qn , qn+m ) ≤ 1 + 3 + 1 n→∞ But these epsilon values were arbitrarily small and m was an arbitrary integer. This shows that for every there exists some integer N such that n, n + m > N implies d(qn , qn+m ) < . And this is the definition of {qn } being a Cauchy sequence. Therefore Q ∈ X ∗ . Exercise 3.24d Let {pn } represent the sequence whose terms are all p, and let {qn } represent the sequence whose terms are all q. 
∆(Pp, Pq) = lim_{n→∞} d(pn, qn) = lim_{n→∞} d(p, q) = d(p, q)

Exercise 3.24e: proof of density

Let {an} be an arbitrary Cauchy sequence in X and let A ∈ X∗ be the equivalence class containing {an}. If the sequence {an} converges to some element x ∈ X, then A = Px = ϕ(x) and therefore A ∈ ϕ(X) (see exercise 3.24d for the definition of Px).

If the sequence {an} does not converge to any element of X, then choose an arbitrarily small ε. The sequence {an} is Cauchy, so we're guaranteed the existence of K such that

j, k > K → d(aj, ak) < ε

Fix any k > K and consider the constant sequence whose terms are all ak. This sequence is a member of the equivalence class Pak = ϕ(ak), and

∆(A, Pak) = lim_{n→∞} d(an, ak) ≤ ε

This shows that we can find an element of ϕ(X) arbitrarily close to A, so A is a limit point of ϕ(X). We have shown that an arbitrary A ∈ X∗ is either an element of ϕ(X) or a limit point of ϕ(X). By definition, this means that ϕ(X) is dense in X∗.

Exercise 3.24e: if X is complete

If X is complete, then every Cauchy sequence {an} in X converges to some point a ∈ X, so its equivalence class A satisfies A = Pa = ϕ(a) ∈ ϕ(X). This shows that X∗ ⊆ ϕ(X). Conversely, for every ϕ(b) ∈ ϕ(X) the constant sequence whose every term is b is Cauchy, so ϕ(b) = Pb ∈ X∗. This shows that ϕ(X) ⊆ X∗. Together, X∗ ⊆ ϕ(X) ⊆ X∗, so ϕ(X) = X∗.

Exercise 3.25

The completion of the metric space Q (with d(x, y) = |x − y|) is isometric to R1: every real number is the limit of a Cauchy sequence of rationals, and two such sequences are equivalent exactly when they have the same real limit.

Exercise 4.1

Consider the function

f(x) = 1 if x = 0, and f(x) = 0 if x ≠ 0

This function satisfies the condition that lim_{h→0} [f(x + h) − f(x − h)] = 0 for every x, but it is not continuous at x = 0. Choose ε < 1: every neighborhood Nδ(0) contains a point p ≠ 0 for which d(f(p), f(0)) = 1 > ε. Therefore we can't pick δ such that d(p, 0) < δ → d(f(p), f(0)) < ε, which means that f is not continuous at 0 by definition 4.5.

Exercise 4.2

Let X be a metric space, let E be an arbitrary subset of X, and let Ē represent the closure of E. We want to prove that f(Ē) is a subset of the closure of f(E). To do this, assume y ∈ f(Ē). This means that y = f(e) for some e ∈ (E ∪ E′).

case 1: e ∈ E
If e ∈ E, then y ∈ f(E) and therefore y belongs to the closure of f(E).

case 2: e ∈ E′
If e ∈ E′, then every neighborhood of e contains infinitely many points of E. Choose an arbitrarily small neighborhood Nε(y). We're told that f is continuous so, by definition 4.1, we're guaranteed the existence of δ such that f(x) ∈ Nε(y) whenever x ∈ Nδ(e). But Nδ(e) contains points of E, so Nε(y) contains points of f(E). Every neighborhood of y therefore meets f(E), which means that y is either a point of f(E) or a limit point of f(E).

We've shown that every element y ∈ f(Ē) is either a member of f(E) or a limit point of f(E), which means that y ∈ (f(E) ∪ f(E)′), the closure of f(E). This proves that f(Ē) is contained in the closure of f(E).

A function f for which f(Ē) is a proper subset of the closure of f(E)

Let X be the metric space consisting of the interval (0, 1) with the standard distance metric, and let Y be the metric space R1. Define f: X → Y as f(x) = x and take E = X. The set E is closed in X (it is the whole space), so Ē = E = (0, 1) and f(Ē) = (0, 1). But f(E) = (0, 1) is not closed in Y: its closure in Y is [0, 1]. Therefore f(Ē) = (0, 1) is a proper subset of [0, 1], the closure of f(E).

Exercise 4.3

If we consider the image of Z(f) under f, we have f(Z(f)) = {0}. This range is a finite set, and is therefore a closed set. By the corollary of theorem 4.8, we know that f⁻¹({0}) = Z(f) must also be a closed set.
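Returning to exercise 4.1 for a moment: its counterexample is easy to probe numerically. The short Python sketch below is my own illustration, not part of the original solution; it evaluates the symmetric difference f(x + h) − f(x − h) at a few sample points and step sizes, which is identically zero once 0 < h < |x| (and zero for every h ≠ 0 when x = 0), even though f itself jumps at 0. The particular sample points and step sizes are arbitrary choices.

    def f(x):
        # the counterexample from exercise 4.1: a single spike at the origin
        return 1.0 if x == 0 else 0.0

    for x in (0.0, 0.3, -1.7):
        diffs = [f(x + h) - f(x - h) for h in (0.1, 0.01, 0.001)]
        print(x, diffs)          # every symmetric difference is 0.0
    print(f(0.0), f(1e-9))       # 1.0 versus 0.0: the jump that breaks continuity at 0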
54 Exercise 4.4: f (E) is dense in f (X) To show that f (E) is dense in f (X) we must show that every element of f (X) is either an element of f (E) or a limit point of f (E). Assume y ∈ f (X). Then p = f −1 (y) ∈ X. We’re told that E is dense in X, so either p ∈ E or p ∈ E 0 . case 1: p ∈ E If p ∈ E, then y = f (p) ∈ f (E). case 2: f −1 (y) ∈ E 0 If p is a limit point of E, then there is a sequence {en } of elements of E such that en 6= p and limn→∞ en = p. We’re told that f is continuous, so by theorem 4.2 we know that limn→∞ f (en ) = f (p) = y. Using definition 4.2 again, we know that there is a sequence {f (en )} of elements of f (E) From theorem 4.2, this tells us that limx→p f (x) = f (p) = y. Therefore y is a limit point of f (E). We’ve shown that every element y ∈ f (X) is either an element of f (E) or a limit point of f (E). By definition, this means that f (X) is dense in f (E). Exercise 4.4b Choose an arbitrary p ∈ X. We’re told E is dense in X, so p is either an element of E or a limit point of E. Case 1: p ∈ E If p ∈ E, then we’re told that f (p) = g(p). Case 2: p ∈ E 0 If p is a limit point of E, then there is a sequence {en } of elements of E such that en 6= p and limn→∞ en = p. We’re told that f and g are continuous, so by theorem 4.2 we know that limn→∞ f (en ) = f (p) and limn→∞ g(en ) = g(p). But each en is an element of E, so we have f (en ) = g(en ) for all n. This tells us that g(p) = lim g(en ) = lim f (en ) = f (p) n→∞ n→∞ We see that f (p) = g(p) in either case. This proves that f (p) = g(p) for all p ∈ X. Exercise 4.5 The set E C can be formed from an at most countable number of disjoint open intervals We’re told that E is closed, so E C is open. Exercise 2.29 tells us that E C contains an at-most countable number of disjoint segments. Each of these segments must be open (if any of them contained a non-interior point, it would be a non-interior point of E C , but open sets have no non-interior points). Let {(an , bn )} be the at-most countable collection of disjoint open segments. constructing the function We must separately consider the cases for x ∈ E and x 6∈ E; and if x 6∈ E we must consider the possibility that E contains an interval of the form (−∞, b) or (a, ∞). We’ll define the function to be f (x), x∈E f (b ), x ∈ (ai , bi ) ∧ ai = −∞ i g(x) = (8) f (a ), x ∈ (ai , bi ) ∧ bi = ∞ i f (bi )−f (ai ) f (ai ) + bi −ai (x − ai ), x ∈ (ai , bi ) ∧ −∞ < ai < bi < ∞ This function is the one mentioned in the hint: the graph of g is a straight line on each closed interval [bi , ai+1 ] ∈ E C . This function can easily (albeit tediously) be shown to be continuous on R1 . 55 failure if “closed” is omitted Consider the following function defined on the open set (−∞, 0) ∩ (0, ∞): 1, x>0 f (x) = −1 x < 0 This function will be discontinuous at x = 0 no matter how we define the function at this point. Extending this to vector-valued functions Let E be a closed subset of R and let f : E → Rk be a vector-valued function defined by f (x) = (f1 (x), f2 (x), . . . , fk (x)) For each fi we can define a function gi as in equation (8) to create a new vector-valued function g : R → Rk defined by g(x) = (g1 (x), g2 (x), . . . , gk (x)) We’ve shown that each of these gi functions are continuous, therefore by theorem 4.10 the function g(x) is continuous. Exercise 4.6 Let G represent the graph of f . Let g : E → G be defined as as g(x) = (x, f (x)). 
We can make G into a metric space by defining a distance function: p the set G is a subset of R × R, so the natural choice is to use the metric d(g(x), g(y)) = d((x, f (x)), (y, f (y)) = (x − y)2 + (f (x) − f (y))2 . This choice allows us to treat G as a subset of R2 . If f is continuous Both x 7→ x and x 7→ f (x) are continuous mappings, so g(x, f (x)) is a continuous mapping by theorem 4.10. The domain of g is a compact metric space so by theorem 4.14 (or theorem 4.15) G is compact. If the graph is compact Define g as above. The inverse of g is g −1 (x, f (x)) = x. It’s clear that g −1 is a one-to-one and onto function from G to E. Therefore by theorem 4.17 the inverse of g −1 – that is, g itself – is continuous. We can then appeal to theorem 4.10 once again to conclude that f is continuous. Exercise 4.7 f is bounded If x = 0, then f (x, y) = 0 for any value of y. If x > 0: (x − y 2 )2 ≥ 0 2 2 squares are positive 4 → x − 2xy + y ≥ 0 2 2 LHS remains positive after adding the nonnegative term (xy 2 ) 2 add xy 2 to both sides → x − xy + y ≥ 0 2 4 → x + y ≥ xy →1≥ expanding the squared term 4 xy 2 x2 +y 4 divide both sides by positive term x2 + y 4 If x < 0: (x + y 2 )2 ≥ 0 squares are positive → x2 + 2xy 2 + y 4 ≥ 0 expanding the squared term → x2 − xy 2 + y 4 ≥ 0 LHS remains positive after adding the nonnegative term (−3xy 2 ) → x2 + y 4 ≥ xy 2 add xy 2 to both sides →1≥ 2 xy x2 +y 4 divide both sides by positive term x2 + y 4 56 g is unbounded near (0,0) If we let x = nα and let y = nβ , we have nα+2β n2α + n6β g(nα , nβ ) = We can divide the numerator and denominator by nα+2β to get g(nα , nβ ) = 1 nα−2β + n4β−α If we let α = −3, β = −1 this becomes g 1 1 , n3 n = n−1 n 1 = −1 +n 2 Taking limits, we have g(0, 0) = lim g n→∞ 1 1 , n3 n = lim n→∞ n 2 The rightmost limit is +∞. f is not continuous at (0,0) Choose 0 < δ < 1/2. For f to be continuous we must be able to choose some > 0 such that d ((0, 0), (x, y)) < → d (f (0, 0), f (x, y)) < δ But there can be no √ such epsilon. To see this, choose an arbitrary > 0, choose x > 0 such that x2 + x < 2 , and then choose y = x. This gives us p p d ((0, 0), (x, y)) = x2 + y 2 = x2 + x < but d (f (0, 0), f (x, y)) = xy 2 x2 1 = = 2 4 2 x +y 2x 2 The restriction of f to a straight line is continuous Any straight line that doesn’t pass through (0, 0) doesn’t encounter any of the irregularities that occur at the origin; it’s trivial but tedious to show that the restriction of f to such a line is continuous. So we need only consider lines that pass through the origin: that is, lines of the form y = cx or x = 0 for some constant c. For the line y = cx: let > 0 be given and let δ = /c2 . Choose x such that 0 < x < δ. Then: d(f (0, 0), f (x, y)) = xy 2 c2 x3 c2 x c2 x = = < < c2 δ = x2 + y 4 x 2 + c4 x 4 1 + c4 x2 1 And was arbitrary, so this proves that f is continuous on this line. For the line x = 0: let > 0 be given choose any y 6= 0. Regardless of our choice of δ, we have d(f (0, 0), f (x, y)) = 0 xy 2 = 4 =0< 2 4 x +y y And was arbitrary, so this proves that f is continuous on this line. The restriction of g to a straight line is continuous The proof for the continuity of the restriction of g is almost identical to that of the continuity of the restriction of f . 57 Exercise 4.8 To show that f is bounded, we need to show that there exists some M such that |f (p)| < M for every p ∈ R1 . Let p be an arbitrary element of R1 . We’re told that E is a bounded subset of R1 , so we know that E has a lower bound α. Choose any > 0. 
From the definition of uniform continuity there exists some δ > 0 such that, for all p, q ∈ R1 , d(p, q) < δ → d(f (p), f (q)) < Let n be the smallest integer such that nδ > p − α (which exists from the Archimedean property of the reals) and define γ = (p − α)/n. This allows us to divide the interval (α, p) into n intervals of length γ, each of which is smaller than δ. We can then apply the triangle inequality multiple times: d(f (α), f (p)) ≤ d(f (α), f (α + γ)) + d(f (α + γ), f (α + 2γ)) + . . . + d(f (α + (n − 1)γ), f (α + nγ)) < + + ... + (n terms) = n This shows us that |f (p)| is bounded by |f (α) ± n| for all p ∈ R1 , so by definition 4.13 we know that f is bounded. if E is not bounded If E is not bounded below, then we can revise the previous proof using the upper bound β in place of the lower bound α. If E is not bounded above or below, then we will not be able to use the Archimedean property to find n such that nδ > p − (−∞) and the proof fails. For an example of a real uniformly continuous function that is not bounded, we can simply look to f (x) = x. Exercise 4.9 Definition 4.18 says that f is uniformly continuous on X if for every > 0 there exists δ > 0 such that dX (p, q) < δ → dY (f (p), f (q)) < (9) We want to show that this conditional statement is true iff for every > 0 there exists δ > 0 such that diam E < δ → diam f (E) < (10) implies (9) Let equation (10) hold and suppose d(p, q) < δ. d(p, q) < δ assumed → diam {p, q} < δ we’re letting E be the set containing only p and q. → diam {f (p), f (q)} < from (10) → d(f (p), f (q)) < definition of diameter Therefore (9) holds. (9) implies (10) Let equation (9) hold, let E be a nonempty subset of X, and suppose diam E < δ. diam E < δ assumed → (∀p, q ∈ E)(d(p, q) < δ) definition of diameter → (∀p, q ∈ E)(d(f (p), f (q)) < ) definition of diameter → (∀f (p), f (q) ∈ f (E))(d(f (p), f (q)) < ) p ∈ E → f (p) ∈ f (E) → diam f (E) < definition of diameter Therefore (10) holds. 58 (10) Exercise 4.10 We want to prove the converse of theorem 4.19: that is, we want to prove that if f is not uniformly continuous on X then either X is not a compact metric space or f is not a continuous function. If f is not uniformly continuous, then the converse of definition 4.18 tells us that there exists some such that, for every δ > 0, we can find p, q ∈ X such that d(p, q) < δ but d(f (p), f (q)) ≥ . We’ll be contradicting this claim, so we’ll restate it formally in a numbered equation: (∃ > 0)(∀δ > 0) d(p, q) < δ ∧ d(f (p), f (q)) ≥ From the fact that we can find such p, q for every value of δ, we can set δn = {pn } and {qn } where d(pn , qn ) < n1 but d(f (pn ), f (qn )) ≥ . (11) 1 n and construct two sequences If the set X is compact (supposition 1) the sequence {pn } must have at least one subsequential limit p (theorem 3.6a or 2.37). And, since d(pn , qn ) < n1 , the sequence {qn } must also have p as a subsequential limit since 1 d(qn , p) ≤ d(qn , pn ) + d(pn , p) < + n But then, from theorem 4.2, it must be the case that both {f (pn )} and {f (qn )} have subsequential limits of f (p). If the function f is continuous without being uniformly continuous (supposition 2) this would imply that, for some sufficiently large n, d(f (pn ), f (qn )) ≤ d(f (pn ), f (p)) + d(f (p), f (qn )) < + = 2 2 which contradicts (11). We’ve established a contradiction from our initial assumption that f is uniformly continuous, so one of our suppositions must be incorrect: either X is not compact or f is not continuous. 
By converse, if X is compact and f is continuous then f is uniformly continuous. And this is what we were asked to prove. Exercise 4.11 Choose an arbitrary > 0. We’re told that f is uniformly continuous, so (∃δ > 0) : d(xm , xn ) < δ → d(f (xm ), f (xn )) < And we’re told that {xn } is Cauchy, so (∃N ) : n, m > N → d(xm , xn ) < δ Together, these two conditional statements tell us that, for any , (∃N ) : n, m > N → d(f (xm ), f (xn )) < which, by definition, shows that {f (xn )} is Cauchy. See exercise 4.13 for the proof that uses this result. Exercise 4.12 We’re asked to prove that the composition of uniformly continuous functions is uniformly continuous. Let f : X → Y and g : Y → Z be uniformly continuous functions. Choose an arbitrary > 0. Because g is uniformly continuous, we have (∃α > 0) : d(y1 , y2 ) < α → d(g(y1 ), g(y2 )) < Because f is uniformly continuous, we have (∃δ > 0) : d(x1 , x2 ) < δ → d(f (x1 ), f (x2 )) < α Since the elements f (x1 ), f (x2 ) are elements of Y these two conditional statements together tell us that, for arbitrary > 0, (∃δ > 0) : d(x1 , x2 ) < δ → d(f (x1 ), f (x2 )) < α → d(g(f (x1 )), g(f (x2 ))) < which, by definition means that the composite function (g ◦ f ) : X → Z is uniformly continuous. 59 Exercise 4.13 (Proof using exercise 4.11) For every p ∈ X it is either the case that p ∈ E or p 6∈ E. If p 6∈ E, then p is a limit point of E (because of density) and therefore we can construct a sequence {en } that converges to p. We now define the function g : X → Y as limn→∞ f (p), if p ∈ E g(p) = limn→∞ f (en ), if p 6∈ E We need to prove that this is actually a well-defined function and that it’s continuous. The function g is well-defined It’s clear that every element of X is mapped to at least one element of Y , but it’s not immediately clear that each element of X is mapped to only one element of Y . We need to show that if {pn } and {qn } are sequences in E that converge to p, then {f (pn )} and {f (qn )} both converge to the same element. Let {pn } and {qn } be arbitrary sequences in E that converge to p. From the definition of convergence (or from exercise 4.11), we know that (∀δ > 0)(∃N ) : n > N → d(pn , p) < (∀δ > 0)(∃M ) : m > M → d(qm , p) < δ 2 δ 2 We can then use the triangle inequality to show that (∀δ > 0)(∃M, N ) : n > max{M, N } → d(pn , qn ) ≤ d(pn , p) + d(p, qn ) < δ (12) We know that the function f is continuous on E, so for any > 0 there exists some δ > 0 such that d(pn , qn ) < δ → d(f (pn ), f (qn )) < (13) Combining equations (12) and (13) we see that for any > 0 we can find some integer N such that n > N → (d(pn , qn ) < δ) → (d(f (pn ), f (qn )) < ) This tells us that {f (pn )} and {f (qn )} don’t converge to different points, but without being able to conclude that Y is compact it’s entirely possible that {f (pn )} and {f (qn )} are merely Cauchy without actually converging to any point in Y at all. The exercise asks us to assume that Y is R1 , but we’ll try for a more general solution and assume only that Y is a complete metric space (see definition 3.12). This allows us to conclude that {f (pn )} and {f (qn )} both converge to the same point of Y , and we call this point g(p). Therefore the function g(p) has a unique value for every p ∈ X. 60 The function g is continuous d(p, q) < δ/2 assumed Let {pn } be a sequence that converges to p and let {qn } be a sequence that converges to q. 
→ d(pn , qn ) ≤ d(pn , p) + d(p, q) + d(q, qn ) → d(pn , qn ) < d(pn , p) + δ/2 + d(q, qn ) from our initial assumption This holds true for any n, even when n is sufficiently large to make d(q, qn ) < δ/4, d(p, pn ) < δ/4 in which case we have → d(pn , qn ) < δ We’re told that f is continuous on E, so we can make d(f (pn ), f (qn )) arbitrarily small if we can make d(pn , qn ) arbitrarily small. And we can make δ arbitrarily small, so for any we can choose pn , qn such that → d(f (pn ), f (qn )) < /2 By definition, we have g(p) = f (p) for p ∈ E, so → d(g(pn ), g(qn )) < /2 → d(g(p), g(q)) ≤ d(g(p), g(pn )) + d(g(pn ), g(qn )) + d(g(qn ), g(q)) triangle inequality → d(g(p), g(q)) < d(g(p), g(pn )) + /2 + d(g(qn ), g(q)) established bound for d(g(pn ), g(qn )) This holds true for any n, even when n is sufficiently large to make d(q, qn ) < /4, d(p, pn ) < /4 (we know that {g(pn )} converges to g(p) and that {g(qn )} converges to g(q) because we defined g(p), g(q) specifically so that this would be true). For such values of n we have d(g(p), g(q)) < This shows that we can constrain the range of g by constraining its domain, and therefore g is continuous. Could we replace the range with Rk or any compact metric space? By any complete metric space? Yes. Definition 3.12 tells us that all compact metric spaces and all Euclidean metric spaces are also complete metric spaces. Our proof didn’t depend on the range of g being R1 : we assumed only that the range was a complete metric space. Could we replace the range with any metric space? No. Consider the function f (x) = x1 with a domain of E = n1 : n ∈ N . The set E is dense in X = E ∪ {0}. This function is continuous at every x ∈ E (we can find a neighborhood Nδ (x) that contains only x, which guarantees that d(f (y), f (x)) = 0 for any y in Nδ (x)). However, there is no way to define f (0) to make this a continuous function. This can be seen intuitively by recognizing that the range of f (x) = 1/x is unbounded in every neighborhood of x = 0. If you’re satisfied with an “intuitive proof”, we’re done. Otherwise, buckle up and get ready for a few more paragraphs of inequalities and Greek letters. To prove that f is discontinous at x = 0, choose any arbitrary neighborhood around 0 with radius δ. We can find an integer n such that n > 1δ (Archimedian property of the reals) so that d( n1 , 0) < δ. For any positive 1 integer k we also have d( n+k , 0) < δ. Now, by the triangle inequality, we have 1 1 1 1 d f ,f ≤d f , f (0) + d f (0) , f n n+k n n+k We haven’t specificied a value for f (0) yet, but we do know that |f (1/n) − f (1/n + k)| = |n − (n + k)| = k. This last inequality is therefore equivalent to 1 1 k≤d f , f (0) + d f (0) , f n n+k 61 Every term of this inequality is nonnegative, so one of the two terms on the right-hand side of this inequality must be ≥ k/2. But this is true for arbitrarily large k and arbitrarily small δ. We can now show that f is not continuous at x = 0. Choose any > 0. For all possible choice of δ > 0 we can find integers n > 1δ and k > 2, so that by the previous method we can be sure that either d(f ( n1 ), f (0)) ≥ k/2 = 1 ), f (0)) ≥ k/2 > . By definition 4.5 of “continuous”, this proves that f is not continous at 0. And or d(f ( n+k we never specified the value of f (0), so we’ve proven that f will be discontinuous however we define f (0). This means that there is no continuous extension from E to X. Exercise 4.14 proof 1 If f (0) = 0 or f (1) = 1, we’re done. Otherwise, we know that f (0) > 0 and f (1) < 1. 
We'll now inductively define a sequence of intervals {[ai, bi]} (this is essentially the bisection method of root finding). Let [a0, b0] = [0, 1]. Now assume that we have defined [ai, bi] for i = 0, 1, . . . , n, and let m = (an + bn)/2. We define

[an+1, bn+1] = [an, m] if f(m) ≤ m,
[an+1, bn+1] = [m, bn] if f(m) ≥ m.

(This procedure is underdefined in the case f(m) = m, but if f(m) = m we've already found the point we're looking for.) The interval [an+1, bn+1] has diameter 2^−(n+1), and the choice made at each step preserves the property f(an+1) ≥ an+1 and f(bn+1) ≤ bn+1. We therefore have a nested sequence of compact sets {[an, bn]} whose diameters converge to zero, so by exercise 3.21 we know that ⋂ [an, bn] consists of exactly one point i ∈ I. Our construction guarantees f(an) ≥ an and f(bn) ≤ bn for every n; since an → i, bn → i, and f is continuous, this gives f(i) ≥ i and f(i) ≤ i. Therefore f(i) = i, and this is the point we were asked to find.

Exercise 4.14 proof 2

This proof, which is much clearer and more concise than mine, was taken from the homework of Men-Gen Tsai (b89902089@ntu.edu.tw). Define a new function g(x) = f(x) − x. Both f(x) and x are continuous, so g(x) is continuous by theorem 4.9. If g(0) = 0 or g(1) = 0, then we're done. Otherwise, we have g(0) > 0 and g(1) < 0, so by theorem 4.23 there must be some intermediate point x of [0, 1] for which g(x) = 0: at this point, f(x) = x.

Exercise 4.15

Although the details of this proof might be ugly, the general idea is simple. If the function f is continuous but not monotonic, then we can find some open interval on which f attains a local maximum or minimum; that extreme value is not an interior point of the image of the interval, so the image is not an open set.

Let f: R1 → R1 be a continuous mapping and assume that f is not monotonically increasing or decreasing. Then we can find x1 < x2 < x3 such that f(x2) > f(x1) and f(x2) > f(x3), or such that f(x2) < f(x1) and f(x2) < f(x3). The proofs for the two cases are analogous, so we'll focus on the case f(x2) > f(x1) and f(x2) > f(x3).

We can't assume that the supremum of f((x1, x3)) occurs at f(x2): there could be some point between x1 and x3 at which f is even larger. We want to construct a closed subinterval [xa, xb] ⊂ (x1, x3) containing a point at which f attains this supremum. To do this, define

ε = min{ d(f(x2), f(x1)), d(f(x2), f(x3)) }

Because f is continuous, there exist δa > 0 and δb > 0 such that

d(x1, x) < δa → d(f(x1), f(x)) < ε
d(x3, x) < δb → d(f(x3), f(x)) < ε

and we may shrink δa and δb if necessary so that x1 + δa < x3 − δb. We can now let xa = x1 + ½δa and xb = x3 − ½δb. The function f does not attain the supremum of f((x1, x3)) at any x in the intervals (x1, xa) or (xb, x3): every such x is within δa of x1 (or within δb of x3), so f(x) lies within ε of f(x1) (or of f(x3)), and by our choice of ε this forces f(x) < f(x2). And [xa, xb] is a closed and bounded subset of R1 and is therefore compact, so f([xa, xb]) is a closed and bounded subset of R1 (theorem 4.15), and therefore f([xa, xb]) contains its supremum (which may or may not occur at x2; we only care that this supremum is attained at some x ∈ [xa, xb]).

This is all getting a bit muddled, so to recap our results so far:
1) [xa, xb] ⊂ (x1, x3)
2) f([xa, xb]) ⊂ f((x1, x3))
3) x ∈ (x1, xa) ∪ (xb, x3) → f(x) < f(x2)
4) f attains its maximum over [xa, xb] at some point of [xa, xb]; moreover, x2 ∈ [xa, xb] by fact 3, so this maximum is at least f(x2)
Together these four facts tell us that f((x1, x3)) contains its own supremum, attained at some x ∈ [xa, xb].
But the supremum of a set can’t be an interior point of the set, so f ( (x1 , x3 ) ) is not open even though (x1 , x3 ) is open, so f is not an open mapping (theorem 4.8) We’ve shown that “f is not monotonic” implies “f is not a continuous open mapping from R1 to R1 ”. By converse, then, we have shown that if f is a continuous open mapping from R1 to R1 then f is monotonic. And this is what we were asked to prove. Exercise 4.16 The functions [x] and (x) are both discontinuous for each x ∈ N. If x ∈ N then, for every 0 < δ < 12 , we have d([x − δ], [x]) = |(x − 1) − x| = 1 1 2 This shows that we can’t constrain the range of these functions by restraining the domain around x ∈ N, which by definition means that the functions are discontinuous for integer values of x. d((x − δ), (x)) = |(1 − δ) − 0| = 1 − δ > Note that the the sum of these discontinuous functions, [x] + (x), is the continuous function f (x) = x. Exercise 4.17 Define E as in the hint. To the three criteria listed in the exercise, we add a fourth: (d) a<q<x<r<b Lemma 1: For each x ∈ E we can find p, q, r such that the four criteria are met We’re assuming that f (x−) < f (x+), so by the density of Q in R (theorem 1.20b) we can find some rational p such that f (x−) < p < f (x+). From our definition of the set E we know that f (x−) exists for all x ∈ E, so we can find some neighborhood Nδ (x) containing a rational q that fulfills criteria (b). More formally, (∃q ∈ Nδ (x) ∩ (a, x))(a < q < x ∧ [q < t < x → f (t) < p] Similarly, the existence of f (x+) tells us that there is some neighborhood Nδ2 (x) containing a rational r such that (∃r ∈ Nδ2 (x) ∩ (x, b))(x < r < b ∧ [x < t < r → f (t) > p] Therefore each x ∈ E can be associated with at least one (p, q, r) rational triple with q < x < r. Lemma 2: If a given (p, q, r) triple fulfills the criteria for x and y, then x = y Suppose x, y are two elements of E that both meet the four criteria. Assume, without loss of generality, that x < y. The density of the reals guarantees that there exists some rational w such that x < w < y. But this means that f (w) < p (by criteria (b) since q < w < y) and that f (w) > p (by criteria (c) since x < w < r). Reversing the roles of x and y establishes the same results for the case of x > y. This is clearly a contradiction, so one of our assumptions must be wrong: for any given triple (p, q, r) it’s not possible to find two distinct elements of E that fulfill all four criteria. Note that we haven’t proven that each triple can be associated with a unique x ∈ E: we’ve only shown that each triple can be associated with at most one x ∈ E. 63 Lemma 3: E is at most countable The set of rational triples is countable (theorem 2.13). Lemma 2 tells us that we can create a function from the set of triples onto E, so the cardinality of E is not more than the cardinality of the set of rational triples. Therefore the cardinality of E is at most countable. Lemma 4: F is at most countable If we define F to be the set of x for which f (x+) < f (x−) we can prove that F is at most countable with only trivial modifications to the previous lemmas. Lemma 5: G is at most countable Define G to be the set of x for which f (x+) = f (x−) = α but f (x) 6= α. Let x be an arbitrary element of G. If every neighborhood of x contained another element of G, we could construct a sequence {tn } for which {tn } → x but {f (tn )} 6→ α. But this would mean that either f (x−) or f (x+) wouldn’t exist, which contradicts our definition of G. 
Therefore each x is an isolated point of G: we can find some radius δ such that Nδ (x)∩G = x and Nδ (x) contains rational numbers q, r such that q < x < r. Now consider all of the (q, r) pairs of rational numbers. Each of these pairs will either have one, more than one, or zero elements of G in the associated open interval (q, r). Let G be the set of rational pairs (q, r) for which there is exactly one element of g in the open interval (q, r). The set G is a subset of Q × Q, so it’s at most countable (theorem 2.13). And each x ∈ G can be associated with at least one (p, q) ∈ G, so we can create a function from G onto G. This proves that G is at most countable. The sets E,F ,and G exhaust all the different types of simple discontinuities and all of these sets are countable. Therefore their union is countable (theorem 2.12). And this is what we were asked to prove. Exercise 4.18 f is continuous at every irrational point Let x be irrational. Choose any > 0 and let n be the smallest integer such that n > 1 . If we want to constrain the range of f to f (x) ± , we will need to find a neighborhood of x that contains no rational numbers in the set p : p, q ∈ Q, q ≤ n q For each i ≤ n the product xi is irrational, so there exists some integer mi such that m < xi < m + 1 which means that m m+1 <x< i i So we can define a neighborhood around x with radius δi where m+1 m δi = min d x, , d x, i i The neighborhood (x − δi , x + δi ) contains no rationals of the form ki . So if we have constructed neighborhoods δ1 , δ2 , . . . , δn we can let δ = min{δi } (this minimum exists because n is finite), and we will have constructed a neighborhood (x − δ, x + δ) that contains no rational numbers with denominators ≤ n. Therefore, for all rational numbers y, we have 1 d(x, y) < δ → d(f (x), f (y)) < < n and for irrational numbers y we have d(x, y) < δ → d(f (x), f (y)) = 0 < Which, by definition, means that f is continuous at x. 64 f has a simple discontinuity at every rational point Let x be a rational point. If we follow the previous method of constructing δ we immediately see that f (x+) = f (x−) = 0. When x is rational, though, we no longer have f (x) = 0 and therefore f has a simple discontinuity at x. Exercise 4.19 Suppose that f is discontinous at some point p. By definition 4.5 of “continuity”, we know that there exists some > 0 such that, for every δ > 0, we can find some x such that d(x, p) < δ but d(f (x), f (p)) ≥ . Therefore we are able construct a sequence {xn } where each term of {xn } gets closer to p even though every term of {f (xn )} differs from f (p) by at least . More formally, we choose each element of {xn } so that d(p, xn ) < n1 but d(f (p), f (xn )) ≥ . By constructing the sequence in this way we see that lim xn → p but d(f (xn ), f (p)) ≥ for all n. Assume, without loss of generality, that f (xn ) < f (p) for every term of {xn } (see note 1 for justification of WLOG). From the density property of the reals we can find a rational number r such that f (xn ) < r < f (p). We’re told that f has the intermediate value property, so for each xn there is some yn between xn and p such that f (yn ) = r. This sequence {yn } converges to p (since the terms of {yn } are squeezed between the terms of {xn } and p) and {f (yn )} clearly converges to r (since f (yn ) = r for all n). But the fact that {yn } converges to p means that p is a limit point of {yn }, which means that p is a limit point of the closed set described in the exercise (the set of all x with f (x) = r). 
Therefore p is an element of this set, which means that f(p) = r. And this is a contradiction, since we specifically chose r so that f(p) ≠ r. We've established a contradiction, so our initial supposition must be false: f is not discontinuous at any point p. Therefore f is continuous.

Note 1: justification for the WLOG claim

For each term of the sequence {f(xn)}, either f(xn) < f(p) or f(xn) > f(p). Therefore {f(xn)} contains an infinite subsequence with f(xn) < f(p) for all n, or an infinite subsequence with f(xn) > f(p) for all n (or both). If it contains an infinite subsequence with f(xn) < f(p) for all n, we use that subsequence in place of {xn} in the proof. If it does not, then it contains an infinite subsequence with f(xn) > f(p) for all n, and the proof for that case requires only the trivial twiddling of a few inequality signs. Therefore we can assume, without loss of generality, that f(xn) < f(p) for each term of {xn}.

Exercise 4.20a

If x is not in the closure Ē of E, then x is an element of the open set that is the complement of Ē, so there is some radius ε such that d(x, y) < ε → y ∉ E, and therefore ρE(x) = inf{d(x, z) : z ∈ E} ≥ ε > 0. If x ∈ Ē, then for every ε > 0 we can find some z ∈ E with d(x, z) < ε (either x itself belongs to E or x is a limit point of E), and therefore ρE(x) = 0. So ρE(x) = 0 if and only if x ∈ Ē.

Exercise 4.20b

Case 1: x, y ∈ E
If x and y are both elements of E, then from part (a) we have ρE(x) = ρE(y) = 0, and therefore |ρE(x) − ρE(y)| = 0 ≤ d(x, y).

Case 2: exactly one of the two points is in E
Say y ∈ E but x ∉ E. Then ρE(y) = 0 and d(x, y) ∈ {d(x, z) : z ∈ E}, so |ρE(x) − ρE(y)| = ρE(x) = inf{d(x, z) : z ∈ E} ≤ d(x, y).

Case 3: x, y ∉ E
For an arbitrary z ∈ E we have ρE(x) ≤ d(x, z) ≤ d(x, y) + d(y, z). This holds for every choice of z, and by choosing z to make d(y, z) arbitrarily close to ρE(y) we get, for every ε > 0, ρE(x) ≤ d(x, y) + ρE(y) + ε. Since ε can be made arbitrarily small, this gives ρE(x) ≤ d(x, y) + ρE(y), i.e. ρE(x) − ρE(y) ≤ d(x, y). Exchanging the roles of x and y similarly shows that ρE(y) − ρE(x) ≤ d(x, y). Together, these two inequalities give |ρE(x) − ρE(y)| ≤ d(x, y).

These three cases exhaust the possibilities for x and y. So for every ε > 0 we have |ρE(x) − ρE(y)| < ε whenever d(x, y) < ε, and this is just definition 4.18 of uniform continuity with δ = ε.

Exercise 4.21

K is compact and ρF is continuous on K, so ρF(K) is compact and is therefore both closed and bounded (theorem 2.41). Since K and F are disjoint and F is closed, part (a) of exercise 20 tells us that 0 is not an element of ρF(K); and 0 is not a limit point of ρF(K) either, since ρF(K) is closed. Therefore there is some δ > 0 such that no element of ρF(K) lies in the interval (−δ, δ). Now choose p ∈ K and q ∈ F. The number ρF(p) is an element of ρF(K), so ρF(p) ≥ δ, and therefore d(p, q) ≥ ρF(p) ≥ δ > 0.

The conclusion fails if neither set is compact
Consider the sets

K = {n + 1/n : n ∈ N},  F = {n : n ∈ N} − {2}

(This counterexample was taken from the homework of Men-Gen Tsai, b89902089@ntu.edu.tw.) Both sets are closed, neither is compact, and they are disjoint; yet for every n ≥ 3 we can choose p = n + 1/n ∈ K and q = n ∈ F with d(p, q) = 1/n, so there is no ε > 0 with d(p, q) > ε for all p ∈ K, q ∈ F.

Exercise 4.22

Exercise 20a shows us that ρA(p) = 0 iff p is in the closure of A and ρB(p) = 0 iff p is in the closure of B; since A and B are closed, this means ρA(p) = 0 iff p ∈ A and ρB(p) = 0 iff p ∈ B, so it's clear that f(p) = 0 iff p ∈ A and f(p) = 1 iff p ∈ B. Because A and B are disjoint, the denominator ρA(p) + ρB(p) is never zero, and ρA and ρB are continuous (exercise 20b), so f is continuous by theorem 4.9. The intervals [0, ½) and (½, 1] are open in [0, 1]; therefore V and W are open by theorem 4.8.
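As an aside before finishing exercise 4.22: the inequality |ρE(x) − ρE(y)| ≤ d(x, y) from exercise 4.20b, which is what gives the continuity of ρA and ρB used above, can be spot-checked numerically. The Python sketch below is my own illustration, not part of the original solution; it uses a finite set E (so the infimum is just a minimum), and the sample points and tolerance are arbitrary choices.

    import itertools, random

    def rho(x, E):
        # distance from the point x to the finite set E (inf = min here)
        return min(abs(x - z) for z in E)

    E = [0.0, 1.0, 2.5]
    pts = [random.uniform(-2.0, 4.0) for _ in range(200)]
    # exercise 4.20b: rho_E is 1-Lipschitz, hence uniformly continuous (delta = epsilon)
    ok = all(abs(rho(x, E) - rho(y, E)) <= abs(x - y) + 1e-12
             for x, y in itertools.combinations(pts, 2))
    print("1-Lipschitz bound verified on", len(pts), "random points:", ok)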
The sets V and W are clearly disjoint since f (p) can have only a single value for any given p. From the definition of A and B we have 1 −1 −1 A = f (0) ⊆ f 0, =V 2 1 B = f −1 (1) ⊆ f −1 ,1 =W 2 4 This counterexample was taken from the homework of Men-Gen Tsai, b89902089@ntu.edu.tw 66 Exercise 4.23a: proof of continuity Choose x ∈ (a, b). Assume without loss of generality that f (x) 6= f (a) and f (x) 6= f (b) (see below for justification of WLOG). Choose some > 0, and choose 0 such that 0 < 0 < min{, |f (b) − f (x)|, |f (a) − f (x)|} Choose δ such that 0 < δ < min 0 (x − a) 0 (b − x) , |f (b) − f (x)| |f (a) − f (x)| Define λa , λb as δ x−a δ λb = 1 − b−x λa = 1 − The method we used to select 0 and δ ensure that both of these λ values are in the interval (0, 1). Notice that we can express δ in terms of λa or λb as δ = (1 − λa )(x − a) = (1 − λb )(b − x) If we restrict the domain of f to (x − δ, x + δ) we have |f (x + δ) − f (x)| = |f (x + (1 − λb )(b − x)) − f (x)| from our definition of δ = |f (λb x + (1 − λb )b) − f (x)| algebraic rearrangement ≤ |λb f (x) + (1 − λb )f (b) − f (x)| from the convexity of f = |(1 − λb )(f (b) − f (x))| algebraic rearrangement = < δ | b−x (f (b) − f (x))| 0 (b−x) (b−x)|f (b)−f (x)| 0 |(f (b) − f (x))| from our definition of λ from our definition of δ = algebra < from our definition of and also |f (x − δ) − f (x)| = |f (x + (1 − λa )(x − a)) − f (x)| from our definition of δ = |f (λa x + (1 − λa )a) − f (x)| algebraic rearrangement ≤ |λa f (x) + (1 − λa )f (a) − f (x)| from the convexity of f = |(1 − λa )(f (a) − f (x))| algebraic rearrangement δ = | x−a (f (a) − f (x))| from our definition of λ < 0 (x−a) (x−a)|f (a)−f (x)| 0 |(f (a) − f (x))| from our definition of δ = algebra < from our definition of We chose an arbitrarily small and showed how to find a nonzero upper bound for δ such that d(x, y) < δ → d(f (x), f (y)) < . By definition, this means that f is continuous. Justifying “without loss of generality”: special case of f (a) = f (x) or f (b) = f (x) Let x be an arbitrary point in the interval (a, b). Choose α ∈ (a, x) such that f (α) 6= f (x): if this is not possible, then choose α = x. Choose β ∈ (x, b) such that f (β) 6= f (x): if this is not possible, then choose β = x. 67 If α < x < β, we can use the previous method to prove that f is continuous on an interval (α, β) containing x such that (α, β) ⊆ (a, b). Since x was an arbitrary element of (a, b) this is sufficient to prove that f is continuous on (a, b). If α = x < β, we can use the previous method to prove that f is continuous on an interval (x, β) containing x such that (x, β) ⊆ (a, b). We only have α = x if there was no α ∈ (a, x) such that f (α) 6= f (x): this can only happen if f is constant (and therefore continuous) on the interval (a, x]. Therefore f is continuous on the interval (a, β) containing our arbitrary x; this is sufficient to prove that f is continuous on (a, b). If α < x = β, we can use the previous method to prove that f is continuous on an interval (α, x) containing x such that (α, x) ⊆ (a, b). We only have β = x if there was no β ∈ (x, b) such that f (β) 6= f (x): this can only happen if f is constant (and therefore continuous) on the interval [x, b). Therefore f is continuous on the interval (α, b) containing our arbitrary x; this is sufficient to prove that f is continuous on (a, b). If α = x = β then f is a constant function on the entire interval (a, b) and is therefore trivially continuous. 
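One reason exercise 4.23a only claims continuity on the open interval (a, b) is that a convex function can jump at an endpoint of a closed interval. The Python sketch below is my own illustration, not part of the original solution: the function defined there is convex on [0, 1], which the code spot-checks on a finite grid, yet it is discontinuous at 0. The grid sizes and the tolerance are arbitrary choices.

    import itertools

    def jump(x):
        # convex on [0, 1]: the chord from (0, 1) to any (y, 0) lies above the graph
        return 1.0 if x == 0 else 0.0

    grid = [i / 20 for i in range(21)]      # sample points of [0, 1]
    lams = [j / 10 for j in range(11)]      # sample values of lambda in [0, 1]
    convex_ok = all(
        jump(l * x + (1 - l) * y) <= l * jump(x) + (1 - l) * jump(y) + 1e-12
        for x, y, l in itertools.product(grid, grid, lams))
    print("convexity inequality holds on the grid:", convex_ok)
    print("jump(0) =", jump(0.0), "but jump(x) = 0 for every x > 0, e.g. jump(1e-9) =", jump(1e-9))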
Exercise 4.23b: increasing convex functions of convex functions are convex Let g be an increasing convex function, let f be a convex function, and define their composite to be h = g ◦ f . f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) from convexity of f → g(f (λx + (1 − λ)y)) ≤ g(λf (x) + (1 − λ)f (y)) g is an increasing function → g(f (λx + (1 − λ)y)) ≤ g(λf (x) + (1 − λ)f (y)) ≤ λg(f (x)) + (1 − λ)g(f (y)) from convexity of g → g(f (λx + (1 − λ)y)) ≤ λg(f (x)) + (1 − λ)g(f (y)) transitivity → h(λx + (1 − λ)y) ≤ λh(x) + (1 − λ)h(y) definition of h x and y were arbitrary, so this shows that h is a convex function. Exercise 4.23c: the ugly-looking inequality We’re given that s < t < u, so we know that 0 < t − s < u − s. Define λ to be t−s u−s λ= From this definition we immediately see that 0 < λ < 1. We can also see that (1 − λ) = 1 − t−s u−t = u−s u−s From this, we can establish an inequality: λf (u) + (1 − λ)f (s) ≥ f (λu + (1 − λ)s) t−s u−t → λf (u) + (1 − λ)f (s) ≥ f u−s u + u−s s → λf (u) + (1 − λ)f (s) ≥ f ut−us+su−st u−s → λf (u) + (1 − λ)f (s) ≥ f t(u−s) u−s from convexity of f → λf (u) + (1 − λ)f (s) ≥ f (t) algebra definition of λ, (1 − λ) algebra algebra From this last inequality, we can derive two results. λf (u) + (1 − λ)f (s) ≥ f (t) → λf (u) − λf (s) ≥ f (t) − f (s) → t−s u−s (f (u) − λf (s)) ≥ f (t) − f (s) previously derived subtract f (s) from each side definition of λ 68 Dividing both sides of this last inequality by t − s gives us f (t) − f (s) f (u) − f (s) ≥ u−s t−s (14) Similarly, we have: λf (u) + (1 − λ)f (s) ≥ f (t) previously derived → −λf (u) − (1 − λ)f (s) ≤ −f (t) multiply both sides by −1 → (1 − λ)f (u) − (1 − λ)f (s) ≤ f (u) − f (t) add f (u) to both sides → (1 − λ)(f (u) − f (s)) ≤ f (u) − f (t) add f (u) to both sides → u−t u−s (f (u) − f (s)) ≤ f (u) − f (t) definition of λ Dividing both sides of this last equation by u − t gives us f (u) − f (s) f (u) − f (t) ≤ u−s u−t (15) Combining (14) and (15) gives us f (u) − f (s) f (u) − f (t) f (t) − f (s) ≤ ≤ t−s u−s u−t which is what we were asked to prove. Exercise 4.24 To prove that f is convex on (a, b) we must show that (∀x, y ∈ (a, b), λ ∈ [0, 1]) : f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) (16) To do this, choose an arbitrary x, y ∈ (a, b) (this choice of x and y will remain fixed for the majority of this proof). Let Λ represent the set of values for λ for which (16) holds. It’s immediately clear that 0 ∈ Λ and 1 ∈ Λ. We’re also told that f has the property f (x) + f (y) x+y ≤ for all x, y ∈ (a, b) (17) f 2 2 which is just (16) with λ = 21 ; therefore 1 2 ∈ Λ. To prove that f is convex we must show that [0, 1] ⊂ Λ. Lemma 1: If j, k ∈ Λ then (j + k)/2 ∈ Λ j+k Assume that j, k ∈ Λ and let m = 2 . Proof that m ∈ Λ: 2−j−k f (mx + (1 − m)y) = f j+k y definition of m 2 x+ 2 = f jx+kx+2y−jy−ky algebra 2 = f jx−jy+y + kx−ky+y algebra 2 2 = f jx+(1−j)y + kx+(1−k)y algebra 2 2 j is in Λ, so j ∈ [0, 1], so (jx + (1 − j)y) ∈ (a, b). The same is true for k. So we can apply (17). ≤ ≤ = f (jx+(1−j)y)+f (kx+(1−k)y) 2 jf (x)+(1−j)f (y)+kf (x)+(1−k)f (y) 2 j+k 2−j−k f (x) + f (y) 2 2 = mf (x) + (1 − m)f (y) We can apply (16) because j, k ∈ Λ algebra definition of m This shows that f (mx + (1 − m)y) ≤ mf (x) + (1 − m)f (y), therefore m ∈ Λ. 69 Lemma 2: all rationals of the form m/2n with 0 ≤ m ≤ 2n are members of Λ This can be proven by induction. Let E be the set of all n for which the lemma is true. We know that {0, 21 , 1} ⊂ Λ so we have 0 ∈ E, 1 ∈ E. Now assume that E contains 1, 2, . . . , k. 
Choose an arbitrary m such that 0 ≤ m ≤ 2k+1 . case 1: m is even If m is even then m = 2α for some α ∈ Z, 0 ≤ α ≤ 2k and therefore m/2k+1 = α/2k. By the hypothesis of induction α/2k ∈ Λ therefore m/2k+1 ∈ Λ. case 2: m is odd If m is odd then (m − 1)/2 = α and (m + 1)/2 = β for some α, β ∈ Z with α, β ∈ [0, 2k ]. From this we have m+1 m−1 1 α β m = k+2 + k+2 = + k 2k+1 2 2 2 2k 2 By the hypothesis of induction α/2k ∈ Λ and β/2k ∈ Λ. Therefore, by lemma 1, m/2k+1 ∈ Λ. This covers all possible cases for m, so k + 1 ∈ E. This completes the inductive step, therefore E = N. Lemma 3: every element of [0, 1] is a member of Λ Choose any p ∈ [0, 1]. From lemma 2, we see that Λ is dense in [0, 1] so we can construct a sequence {λn } in Λ such that limn→∞ λn = p. Each λn is an element of Λ, so for each n we have f (λn x + (1 − λn )y) ≤ λn f (x) + (1 − λn )f (y) The function f is continuous, so by theorem 4.2 we can take the limit of both sides as n → ∞ to get f (px + (1 − p)y) ≤ pf (x) + (1 − p)f (y) which means that p ∈ Λ. By lemma 3 we know that (∀λ ∈ [0, 1])f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) But we chose x and y arbitrarily from the interval (a, b), so we have proven that (∀x, y ∈ (a, b), λ ∈ [0, 1])f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) which is (16), the definition of convexity. This shows that f is convex, which is what we were asked to prove. Exercise 4.25a The “hint” given for this problem is actually a proof. The set F = z − C is clearly closed because z is a fixed element and C is closed. The sets K and F are disjoint: k ∈K ∩F → k ∈K ∧k ∈F → k ∈ K ∧ (∃c ∈ C)(k = z − c) assumed def. of intersection def. of F → k ∈ K ∧ (∃c ∈ C)(k + c = z) → (∃k ∈ K, c ∈ C)k + c = z rearrangement of quantifiers → z ∈K +C def. of K + C From the definition of the set F , we see that (∀c ∈ C)(z − c ∈ F ). And we’ve established that F is closed and that and K is compact, so by exercise 21 we know that there exists some δ > 0 such that 70 (∀c ∈ C, k ∈ K)(d(z − c, k) > δ) by exercise 21 → (∀c ∈ C, k ∈ K)(|z − c − k| > δ) def. of distance function in Rk → (∀c ∈ C, k ∈ K)(|z − (c + k)| > δ) algebra → (∀c ∈ C, k ∈ K)(d(z, c + k) > δ) def. of distance function in Rk The set of c + k for all c ∈ C, k ∈ K is simply the set C + K: so we have def. of distance function in Rk → (∀y ∈ C + K)(d(z, y) > δ) This shows us that z is an interior point of C + K. But z was an arbitrary point of C + K, so we have shown that C + K is open. By theorem 2.23 the complement of an open set is closed, so C + K is closed. And this is what we were asked to prove. Exercise 4.25b Let α be an irrational number and let C1 + C2 be defined as the set C1 + C2 = {m + nα : m, n ∈ N} (18) We’ll first prove that C1 + C2 is dense in [0, 1) and then extend this proof out to prove that C1 + C2 is dense in R1 . Define the set ∆ to be the set of radii δ such that, for every x ∈ [0, 1), every neighborhood of radius > δ contains a point of C1 + C2 . More formally, we define it to be ∆ = {δ : (∀x ∈ [0, 1))(Nδ (x) ∩ C1 + C2 6= ∅) We want to prove that C1 + C2 is dense in [0, 1): we can do this by proving that ∆ has a greatest lower bound of 0. Lemma 1: Each element of C1 + C2 has a unique representation of the form m + nα Assume that m + nα and p + qα are two ways of describing the same element of C1 + C2 . m + nα = p + qα assumption of equality → (m − p) + (n − q)α = 0 algebra → (n − q)α = (p − m) algebra If both sides of this equation are zero, then n = q and p = m so that our two representations are not unique. 
If both sides are nonzero, we can divide by p − m: →α= p−m n−q algebra But m, n, p, q are integers so this last statement implies that α is a rational number. By contradiction, each element of C1 + C2 has a unique representation of the form m + nα. Lemma 2: For each n ∈ N, there is exactly one m ∈ N such that m + nα ∈ [0, 1). Assume that p + nα and q + nα are both in the interval [0, 1) Then |(p + nα) − (q + nα| = |p − q| ∈ [0, 1). And p and q are both integers, so it must be the case that p = q. Lemma 3: If d(x, y) = δ for any x, y ∈ [0, 1) ∩ C1 + C2 then δ ∈ ∆ Assume d(x, y) = δ for some 0 ≤ x < y < 1 with x, y ∈ C1 + C2 . Then we have d(x, y) = y − x = p + mα − q + nα = (p − q) + (m − n)α = δ So that δ is itself an element of C1 + C2 . It’s also clear that any integer multiple of δ is an element of C1 + C2 . Now, choose an arbitrary p ∈ [0, 1) that is not a multiple of δ. Every real number lies between two integers, so there exists some a such that p a< <a+1 δ 71 which implies aδ < p < (a + 1)δ which shows that p is in a neighborhood of radius δ of some element of C1 + C2 . But p was an arbitrary element of [0, 1), so every element of [0, 1) lies in such a neighborhood of radius δ, and therefore δ ∈ ∆. Lemma 4: if δ ∈ ∆, then 1 2δ ∈∆ Proof by contradiction. Assume that δ ∈ ∆ but 21 δ 6∈ ∆. By lemma 3 we know that d(x, y) > δ for all x, y ∈ [0, 1) ∩ C1 + C2 . This gives us a maximum size for the set [0, 1) ∩ C1 + C2 : |[0, 1) ∩ C1 + C2 | ≤ 1 δ But this contradicts lemmas 1 and 2, which tell us that there is one unique element of [0, 1) ∩ C1 + C2 for each n ∈ N. By contradiction, then, we must have 21 δ ∈ ∆ whenever δ ∈ ∆. The set C1 + C2 is dense in [0, 1) We can now use induction to show that inf ∆ = 0. We know that 1 ∈ ∆ because a neighborhood of radius δ = 1 around any x ∈ [0, 1) will contain 0 ∈ C1 + C2 . Therefore 1 ∈ ∆. Using induction with lemma 4 tells us that (∀n ∈ N)(2−n ∈ ∆). This is sufficient to prove that inf ∆ ≤ 0. Each element of ∆ is a distance, so inf ∆ ≥ 0. Therefore inf ∆ = 0. By the definition of ∆, this means that every neighborhood of every element of [0, 1) has an element of C1 + C2 . This allows us to conclude that every element of [0, 1) is a limit point of C1 + C2 , which means that C1 + C2 is dense in [0, 1). The set C1 + C2 is dense in R1 Choose an arbitrary p ∈ R1 . Let m be the integer such that m ≤ p < m + 1. This means that 0 ≤ p − m < 1, and we know that C1 +C2 is dense in [0, 1). Therefore we can construct a sequence {cn } of elements of [0, 1)∩C1 +C2 that converges to p − m. The definition of C1 + C2 guarantees that {cn + m} is also a sequence in C1 + C2 , and theorem 3.3 tells us that {cn + m} converges to p. Therefore p is a limit point of C1 + C2 . But p was an arbitrary element of R1 , so every element of R1 is a limit point of C1 + C2 . And this proves that C1 + C2 is dense in R1 . The sets C1 and C2 are closed, but C1 + C2 is not closed. The sets C1 and C2 are closed because they have no limit points, so it’s trivially true that they contain all of their limit points. The set C1 + C2 doesn’t contain any non-integer rational numbers, but every real number is a limit point of C1 + C2 . Therefore C1 + C2 doesn’t contain all of its limit points which means it is not closed. Exercise 4.26 We’re told that Y is compact and that g is continuous and one-to-one. We conclude that g(Y ) is a compact subset of Z (theorem 4.14) and therefore g is uniformly continuous on Y (theorem 4.19). 
The fact that g is one-to-one tells us that g is a one-to-one map of Y onto g(Y), so by theorem 4.17 we conclude that g^{-1} is a continuous mapping from the compact space g(Y) onto the compact space Y. So by theorem 4.19 we see that g^{-1} is uniformly continuous on g(Y). In exercise 12 we proved that the composition of uniformly continuous functions is uniformly continuous, therefore f(x) = g^{-1}(h(x)) is uniformly continuous if h is uniformly continuous. Theorem 4.7 tells us that the composition of continuous functions is continuous, therefore f(x) = g^{-1}(h(x)) is continuous if h is continuous.

To construct the counterexample, define X = Z = [0, 1] and Y = [0, 1/2) ∪ [1, 2]. Define the functions f : X → Y and g : Y → Z as

f(x) = x if x < 1/2, and f(x) = 2x if x ≥ 1/2
g(y) = y if y < 1/2, and g(y) = y/2 if y ≥ 1/2

We can easily demonstrate that f fails to be continuous at x = 1/2 and that g is continuous at every point of Y. The composite function h : X → Z is just h(x) = x, which is clearly continuous.

Exercise 5.1
Choose arbitrary elements x, y ∈ R^1 with x ≠ y. We're told that f is defined for all real x and that

|f(x) − f(y)| ≤ (x − y)^2 = |x − y|^2

Dividing the leftmost and rightmost terms by |x − y| we have

|(f(x) − f(y))/(x − y)| ≤ |x − y|

Taking the limit of each side as y → x shows that f'(x) exists and satisfies |f'(x)| ≤ 0, that is, f'(x) = 0. But our choice of x was arbitrary, so the derivative of f is zero at every point. By theorem 5.11b, this proves that f is a constant function.

Exercise 5.2
Lemma 1: g(f(x)) = x
We're told that f'(x) > 0 for all x, so f is strictly increasing in (a, b) (this result is a trivial extension of the proofs for theorem 5.11). By definition, this means that x < y ↔ f(x) < f(y) and x > y ↔ f(x) > f(y); from this we conclude

x ≠ y ↔ f(x) ≠ f(y)

By contrapositive this is equivalent to x = y iff f(x) = f(y), which, by definition, means that f is one-to-one. Therefore g(f(x)) = x (theorem 4.17).

g' exists at every point of f((a, b))
Using lemma 1 we see that f'(x) is the reciprocal of g'(f(x)):

f'(x) = lim_{t→x} (f(x) − f(t))/(x − t) = lim_{t→x} (f(x) − f(t))/(g(f(x)) − g(f(t))) = lim_{t→x} 1 / [(g(f(x)) − g(f(t)))/(f(x) − f(t))] = 1/g'(f(x))

We're told that the derivative f'(x) is defined (and nonzero) for all x ∈ (a, b), so g'(f(x)) = 1/f'(x) must also be defined.

bonus proofs: a big wad of properties for f and g
We can show that both f and g are uniformly continuous, injective, differentiable, strictly increasing functions whose domains and ranges are compact. From lemma 1, we know that f is a one-to-one function that is strictly increasing on (a, b). We're told that f is differentiable on (a, b), therefore f is continuous on [a, b] (theorem 5.2). Because the domain of f is a compact metric space we can conclude that f is uniformly continuous (theorem 4.19). From the continuity and injectiveness of f we can conclude that g is continuous on f([a, b]) (theorem 4.17). The domain of g is the range of f, so the domain of g is a compact space (theorem 4.14) and therefore g is uniformly continuous (theorem 4.19). From lemma 1 we know that g(f(x)) = x for all x, therefore the inequalities in lemma 1 are equivalent to

g(f(x)) > g(f(y)) ↔ f(x) > f(y)
g(f(x)) < g(f(y)) ↔ f(x) < f(y)
g(f(x)) = g(f(y)) ↔ f(x) = f(y)

which means that g is a one-to-one function and is strictly increasing on f((a, b)).

Exercise 5.3
Choose ε such that |ε| < 1/M.
The function f is the sum of two differentiable functions, so by theorem 5.3 f is x−t g(x) − g(t) + lim = 1 + g 0 (x) t→x x−t x−t From our bounds on and g 0 (x) this gives us 1 1 0 M < f (x) < 1 + M 1− M M f 0 (x) = lim t→x These are strict inequalities, so we conclude that f 0 (x) is always positive for this choice of . Therefore f is an increasing function and is one-to-one (see lemma 1 of the previous exercise). Exercise 5.4 Consider the function C 1 x2 Cn xn+1 + ··· + 2 n+1 When x = 1 this evaluates to the function given to us in the exercise, so f (1) = 0. When x = 0 every term evaluates to zero, so f (0) = 0. From example 5.4 we know that f (x) is differentiable and its derivative is given by f 0 (x) = C0 + C1 x + C2 x2 + · · · + Cn xn P By theorem 5.10, the fact that f (0) = f (1) = 0 means that f 0 (x) = 0 for some x ∈ (0, 1). Therefore C n xn has a real root in (0, 1), and this is what we were asked to prove. f (x) = C0 x + Exercise 5.5 Choose an arbitrary > 0. We’re told that f 0 (t) → 0 as t → ∞, so there exists some N such that t > N → |f 0 (t)| < With some algebraic manipulation and the use of the the mean value theorem (theorem 5.10) we can express g as f (x + 1) − f (x) = f 0 (t) for some t ∈ (x, x + 1) g(x) = f (x + 1) − f (x) = (x + 1) − (x) This must be true for all possible values of x, so choose x > N . We now have t > x > N , so the f 0 (t) term in the previous equation is now less than . |g(x)| = |f 0 (t)| ≤ which means that |g(x)| < for all x > N . And was an arbitrary positive real, which by definition means that g(x) → 0 as x → ∞. And this is what we were asked to prove. Exercise 5.6 The function g is differentiable (theorem 5.3c) and its derivative is g 0 (x) = xf 0 (x) − f (x) x2 We want to prove that g is monotonically increasing. This is true iff g 0 (x) > 0 for all x, which is true iff xf 0 (x) − f (x) >0 x2 which is true iff xf 0 (x) > f (x) which is true iff f 0 (x) > 74 f (x) x (19) To show that (19) holds for all x, choose an arbitrary x ∈ R. From the fact that f (0) = 0 we know from theorem 5.10 that f (x) f (x) − f (0) = = f 0 (c) for some c ∈ (0, x) x x−0 We’re told that f 0 is monotonically increasing, so f 0 (x) > f 0 (c). Therefore: f 0 (x) > f 0 (c) = f (x) x Therefore (19) holds, and we’ve shown that this occurs iff g 0 (x) > 0, which means that g is monotonically increasing. And this is what we were asked to prove. Exercise 5.7 From the fact that f (x) = g(x) = 0 we see that f (t) f (t) − f (x) f (t) − f (x) t−x f 0 (x) = lim = lim = 0 t→x g(t) t→x g(t) − g(x) t→x t−x g(t) − g(x) g (x) lim Exercise 5.8 We’re told that f 0 is a continuous function on the compact space [a, b] therefore f 0 is uniformly continuous (theorem 4.19). Choose any > 0: by the definition of uniform continuity there exists some δ such that 0 < |t − x| < δ → |f 0 (t) − f 0 (x)| < . Choose t, x ∈ [a, b]: by the mean value theorem there exists c ∈ (t, x) such that f (t) − f (x) f 0 (c) = t−x From the fact that c ∈ (t, x) we know that |c − x| < |t − x|δ. Therefore |f 0 (c) − f 0 (x)| < . Therefore, from our definition of c, we have f (t) − f (x) |f 0 (c) − f 0 (x)| = − f 0 (x) < t−x Our initial choice of t, x, and was arbitrary, some δ > 0 must exist so that this previous inequality is true for all t, x, and . And this is what we were asked to prove. Does this hold for vector-valued functions? Yes. Choose an arbitrary > 0 and define the vector-valued function f to be f (x) = (f1 (x), f2 (x), . . . , fn (x)) Assume that f is differentiable on [a, b]. 
Then each fi is differentiable on [a, b] (remark 5.16) and [a, b] is compact so by the preceeding proof we know that for each fi there exists some δi > 0 such that |t − x| < δi → fi (t) − fi (x) − fi0 (x) < t−x n Define δ = min{δ1 , δ2 , . . . , δn }. For |t − x| < δ we now have f (t) − f (x) − f 0 (x) t−x = = ≤ < (f1 (t) + f2 (t) + · · · + fn (t)) − (f1 (x) + f2 (x) + · · · + fn (x)) − (f10 (x) + f20 (x) + · · · + fn0 (x)) t−x f1 (t) − f1 (x) f2 (t) − f2 (x) fn (t) − fn (x) − f10 (x) + − f20 (x) + · · · + − fn0 (x) t−x t−x t−x f1 (t) − f1 (x) f2 (t) − f2 (x) fn (t) − fn (x) − f10 (x) + − f20 (x) + · · · + − fn0 (x) t−x t−x t−x + + ··· + = n n n 75 Exercise 5.9 We’re asked to show that f 0 (0) exists. From the definition of the derivative, we know that f 0 (0) = lim x→0 f (x) − f (0) x−0 The function f is continuous, so limx→0 f (x) − f (0) = 0 and limx→0 x = 0. Therefore we can use L’Hopital’s rule. f (x) − f (0) f 0 (0) = lim = lim f 0 (x) − 01 − 0 = lim f 0 (x) x→0 x→0 x→0 x We’re told that the right-hand limit exists and is equal to 3, therefore the leftmost term (f 0 (0)) exists and is equal to 3. And this is what we were asked to prove. Exercise 5.9 : Alternate proof If limx→0 f 0 (0) = 3 and f 0 (0) 6= 3, then f 0 would have a simple discontinuity at x = 0. Therefore f 0 (0) = 3 as an immediate consequence of the corollary to theorem 5.12. Exercise 5.10 Let f1 , g1 represent the real parts of the functions f, g and let f2 , g2 represent their imaginary parts: that is, f (x) = f1 (x) + if2 (x) and g(x) = g1 (x) + ig2 (x). We’re told that f and g are differentiable, therefore each of these dependent functions is differentiable (see Rudin’s remark 5.16). Applying the hint given in the exercise, we have f1 (x) + if2 (x) x x f (x) = lim −A · +A· lim x→0 x→0 g(x) x g1 (x) + ig2 (x) g1 (x) + ig2 (x) Each of these functions is differentiable and the denominators all tend to 0, so we can apply L’Hopital’s rule. 0 f (x) f1 (x) + if20 (x) 1 1 lim = lim −A · 0 +A· 0 0 x→0 x→0 g(x) 1 g1 (x) + ig2 (x) g1 (x) + ig20 (x) 0 f (x) 1 1 = lim −A · 0 +A· 0 x→0 1 g (x) g (x) 1 1 A = {A − A} · + A · = B B B Exercise 5.11 The denominator of the given ratio tends to 0 as h → 0, so we can use L’Hopital’s rule (differentiating with respect to h): f (x + h) + f (x − h) − 2f (x) f 0 (x + h) − f 0 (x − h) lim = lim 2 h→0 h→0 h 2h 1 f 0 (x + h) − f 0 (x) f 0 (x) − f 0 (x − h) = lim + h→0 2 h h We’re told that f 00 (x) exists, so this limit exists and is equal to = lim h→0 1 00 (f (x) + f 00 (x)) = f 00 (x) 2 A function for which this limit exists although f 00 (x) does not For a counterexample, we need only find a differentiable function for which f 00 (x) = 1 when x > 0 and f 00 (x) = −1 when x < 0. These criteria are met by if x > 0 x2 , −(x2 ), if x < 0 f (x) = 0, if x = 0 This function is continuous and differentiable, but f 00 (x) does not exist at x = 0. 76 Exercise 5.12 From the definition of f 0 (x), we have |x + h|3 − |x|3 h→0 h If x > 0 then the terms in the numerator are positive and (20) resolves to f 0 (x) = lim (20) (x + h)3 − (x)3 = lim 3x2 + 3xh + h2 = 3x2 h→0 h→0 h f 0 (x) = lim If x < 0 then the terms in the numerator are negative and (20) resolves to −(|x| + h)3 + (|x|)3 = lim −(3|x|2 + 3|x|h + h2 ) = −(3|x2 |) h→0 h→0 h f 0 (x) = lim It’s clear from the above results that f 0 (x) → 0 as x → 0, and this agrees with f 0 (0): |h3 | =0 h→0 h f 0 (0) = lim So f 0 (x) = 3x|x| for all x. 
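As a quick numerical sanity check of the formula just derived (an aside, not part of Rudin's framework), a symmetric difference quotient of |x|^3 can be compared against 3x|x| at a few sample points; the Python sketch below is illustrative only.

# Aside (illustration only): compare a central difference quotient of
# f(x) = |x|**3 with the formula f'(x) = 3*x*|x| derived above.
def f(x):
    return abs(x) ** 3

def central_quotient(x, h=1e-6):
    # numerical approximation to f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, central_quotient(x), 3 * x * abs(x))

At each sample point the two printed values agree to several decimal places, consistent with f'(x) = 3x|x|.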
From the definition of f 00 (x), we have |3(x + h)2 | − |3x2 | h→0 h f 00 (x) = lim (21) If x > 0 then the terms in the numerator are positive and (21) resolves to 3(x + h)2 − 3x2 = lim 6x + 3h = 6x h→0 h→0 h f 00 (x) = lim If x < 0 then the terms in the numerator are negative and (21) resolves to −3(|x| + h)2 − 3|x2 | = lim 6|x| + 3h = 6|x| h→0 h→0 h f 00 (x) = lim It’s clear from the above results that f 00 (x) → 0 as x → 0, and this agrees with f 00 (0): |3h2 | =0 h→0 h f 00 (0) = lim So f 00 (x) = |6x| for all x. From the definition of f 000 (x), we have f 000 (x) = lim h→0 |6(x + h)| − |6x| h (22) If x = 0, then when h > 0 we have |6h| =6 h→0 h lim and when h < 0 we have lim h→0 |6h| = −6 h So the limit in (22) (which is f (3) (0)) doesn’t exist. Exercise 5.13a f is continuous when x 6= 0 The proof that xa is continuous when x 6= 0 is trivial. The sin function hasn’t been well-defined yet, but we can assume that it’s a continuous function5 . Therefore their product xa sin(x−c ) is continuous wherever it’s defined (theorem 4.9), which is everywhere but x = 0. 5 Example 5.6 says that we should assume without proof that sin is differentiable, so we can also assume that it’s continuous (theorem 5.2). 77 f is continous at x = 0 if a > 0 We have f (0) = 0 by definition. For f to be continuous at x = 0 it must be the case that limx→0 f (x) = 0. The range of the sin function is [−1, 1], so if a > 0 we have xa (−1) ≤ xa sin(x−c ) ≤ xa (1) Taking the limit of each of these terms as x → 0 gives us 0 ≤ lim xa sin(x−c ) ≤ 0 x→0 which shows that limx→0 f (x) = 0, and therefore f is continuous at x = 0. f is discontinous at x = 0 if a = 0 To show that f is not continuous at x = 0, it’s sufficient to construct a sequence {xn } such that lim xn = 0 but lim f (xn ) 6= f (0) (theorem 4.2). Define the terms of {xn } to be xn = 1 2nπ + π/2 1c This sequence clearly has a limit of 0, but f (xn ) = x0n sin(x−c n ) = sin(2nπ + π/2) = 1 so that lim{f (xn )} = 1. Note that we’re making lots of unjustified assumptions about the sin function and the properties of the as-of-yet undefined symbol π. f is discontinous at x = 0 if a < 0 Define the terms of {xn } to be xn = 1 2nπ + π/2 1c This sequence clearly has a limit of 0, but f (xn ) = xan sin(x−c n )= 1 2nπ + π/2 ac sin(2nπ + π/2) = 1 2nπ + π/2 ac We’re told that c > 0, therefore a/c < 0 and we have f (xn ) = 1 2nπ + π/2 ac = (2nπ + π/2) −a c where −a/c > 0. By theorem 3.20a we see that lim f (xn ) = ∞, which means that lim xn = 0 but lim f (xn ) 6= 0 and therefore f is not continuous at x = 0. These cases show that f is continous iff a > 0. Exercise 5.13b From the definition of limit we have f (0 + h) − f (0) ha sin(h−c ) − 0 = lim = lim ha−1 sin(h−c ) h→0 h→0 h→0 h h f 0 (0) = lim (23) We can evaluate the rightmost term by noting that sin is bounded by [−1, 1] so that |ha−1 |(−1) ≤ ha−1 sin(h−c ) ≤ |ha−1 |(1) 78 (24) Lemma 1: f is differentiable when x 6= 0 The proof that xa and x−c are differentiable when x 6= 0 is trivial. The sin function hasn’t been well-defined yet, but Rudin asks us in example 5.6 to assume that it’s differentiable. Therefore sin(x−c ) is differentiable when x 6= 0 (theorem 5.5, the chain rule) and therefore xa sin(x−c ) is differentiable when x = 0 (theorem 5.3b, the product rule). case 1: f 0 (0) exists when a > 1 When a > 1 we have a − 1 > 0 and therefore taking the limits of (24) as h → 0 gives us 0 ≤ lim |ha−1 sin(h−c )| ≤ 0 h→0 which means that (23) becomes f 0 (0) = lim ha−1 sin(h−c ) = 0 h→0 0 which shows that f (0) is defined. 
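Before moving to the remaining cases, here is a small numerical illustration of case 1 (an aside, not part of the proof): for the sample parameters a = 2, c = 1, chosen arbitrarily subject to a > 1, the difference quotient h^{a-1} sin(h^{-c}) visibly tends to 0 with h.

import math

# Aside (illustration only): the difference quotient [f(h) - f(0)]/h = h**(a-1) * sin(h**(-c))
# for sample parameters a = 2, c = 1 (any a > 1 would do), evaluated at shrinking h.
a, c = 2.0, 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5):
    print(h, h ** (a - 1) * math.sin(h ** (-c)))

The printed values are bounded in absolute value by h^{a-1}, so they shrink with h, in agreement with f'(0) = 0 when a > 1.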
case 2: f 0 (0) does not exist when a = 1 Define the sequences {hn } and {jn } such that hn = jn = 1 2nπ + 1/c π 2 1 (2n + 1)π + 1/c π 2 When a = 1 we have a − 1 = 0 and therefore equation (23) gives us lim ha−1 sin(h−c ) = lim sin(h−c n )=1 n→∞ h→0 lim j a−1 sin(j −c ) = lim sin(jn−c ) = −1 n→∞ j→0 0 0 We know the sequences {f (hn )} and {j (hn )} are well-defined because of lemma 1, therefore we have conflicting definitions of f 0 (0). This means that the limit in (23) (and therefore f 0 (0) itself) does not exist. case 3: f 0 (0) does not exist when a < 1 Define the sequences {hn } and {jn } such that hn = jn = 1 2nπ + 1/c π 2 1 (2n + 1)π + 1/c π 2 When a < 1 we have a − 1 < 0 and therefore equation (23) gives us lim ha−1 sin(h−c ) = lim ha−1 =∞ n n→∞ h→0 lim j a−1 sin(j −c ) = lim −jna−1 = −∞ j→0 n→∞ We know the sequences {f 0 (hn )} and {j 0 (hn )} are well-defined because of lemma 1, therefore we have conflicting definitions of f 0 (0). This means that the limit in (23) (and therefore f 0 (0) itself) does not exist. These cases show that f 0 (0) exists iff a > 1. 79 Exercise 5.13c Note that we’ve only defined f on the domain [−1, 1], so we only need to show that f is bounded on this domain. case 1: f 0 is unbounded when a < 1 We saw in case (3) of part (b) that f 0 is unbounded near 0 when a < 1. case 2: f 0 is unbounded when 1 ≤ a < c + 1 By the lemma of part (b) we know that f 0 (x) is defined for all x ∈ [1, 1] except for possibly x = 0. By the chain rule and product rule we know that the derivative of f when x 6= 0 is f 0 (x) = axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) Define the sequence {hn } such that (25) hn = (2nπ)−1/c Evaluating the derivative in (25) at x = hn gives us f 0 (hn ) = (2nπ) 1−a c sin(2nπ) − c(2nπ) (1+c)−a c cos(2nπ) = −c(2nπ) (1+c)−a c We’re assuming in this case that a < c + 1, so taking the limits of this last equation as n → ∞ gives us lim f 0 (hn ) = lim −c(2nπ) n→∞ n→∞ (1+c)−a c = −∞ This doesn’t prove anything about f 0 (0) itself, but it does show that f 0 (x) is unbounded near 0. case 3: f 0 is bounded when a ≥ c + 1 If a ≥ c + 1 then clearly a > 1, so by the lemma of part (b) we know that f 0 (x) is defined for all x ∈ [1, 1] including x = 0. By the chain rule and product rule we know that the derivative of f is f 0 (x) = axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) (26) Since f 0 (x) is defined for x = 0, we can take the limit of (26) as x → 0: f 0 (0) = lim f 0 (x) = lim axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) x→0 x→0 Because x is bounded by [−1, 1] we know that xa−1 , xa−(c+1) , sin, and cos are all bounded by [−1, 1]. Therefore the rightmost limit of the previous equation is bounded by −(a + c) ≤ lim axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) ≤ a + c x→0 Which, of course, means that f 0 (x) is also bounded by [−(a + c), a + c] for x ∈ [−1, 1]. We could find stricter bounds for f 0 (x), but it’s not necessary. These three cases show that f 0 is bounded iff a ≥ c + 1. Exercise 5.13d lemma: f 0 is continuous when x 6= 0 From lemma 1 of part (b) we know that f 0 exists for all x 6= 0 and its derivative is given by f 0 (x) = axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) (27) Rudin asks us to assume that sin and cos are continuous functions and it’s trivial to show that x±α is continuous when x 6= 0 for any α, so we can use the chain rule (theorem 5.5) and product rule (theorem 5.3b) to show that f 0 is continuous when x 6= 0. 80 case 1: f 0 is continuous at x = 0 when a > 1 + c We’ve shown that f 0 (0) = 0 when a > 1 (case 1 of part (b)). 
For f 0 to be continuous at x = 0 it must be the case that limx→0 f 0 (x) = 0. We can algebraically rearrange (27) to obtain f 0 (x) = xa−(c+1) axc sin(x−c ) − c cos(x−c ) The range of the cosine and sine functions are [−1, 1], so we can establish a bound on this function. |f 0 (x)| = |xa−(c+1) axc sin(x−c ) − c cos(x−c )| ≤ |xa−(c+1) | · |axc + c| ≤ |xa−(c+1) | · (|axc | + |c|) Because a > c + 1, we have |xa−(c+1) | → 0 and |axc | → 0 as x → 0. Taking the limits of this last inequality as x → 0 therefore gives us lim |f 0 (x)| ≤ 0 · (0 + c) = 0 x→0 0 0 This shows that limx→0 f (x) = f (0), therefore f 0 is continuous at x = 0. case 2: f 0 is not continuous at x = 0 when a = 1 + c To show that f 0 is not continuous at x = 0 it’s sufficient to construct a sequence {xn } such that lim xn = 0 but lim f 0 (xn ) 6= f 0 (0) = 0 (theorem 4.2). Define the terms of xn to be xn = 1 2nπ 1c This sequence clearly has a limit of 0, but sin(xn ) = 0 and cos(xn ) = 1 so that the terms of {f 0 (xn )} are f 0 (xn ) = axn (0) − cx0 (1) = −c so that lim{f 0 (xn )} = −c 6= f 0 (0). case 3: f 0 is not continuous at x = 0 when a < 1 + c If a < 1+c we know that f 0 is not bounded on [−1, 1] (part (c)) therefore f 0 is not continuous on [−1, 1] (theorem 4.15). From lemma 1, we know that the discontinuity must occur at the point x = 0. For an alternative proof we could use the sequence {xn } established in case 2 and show that f 0 (xn ) → ∞ = 6 f (0) as xn → 0 when a < 1 + c. 0 Exercise 5.13e Lemma 1: f 0 is differentiable when x 6= 0 We established in part (b) that f 0 exists for x 6= 0 and is given by f 0 (x) = axa−1 sin(x−c ) − cxa−(c+1) cos(x−c ) (28) We know that all of the exponential powers of x are differentiable when x 6= 0 and Rudin asks us to assume that sin and cos are differentiable, so we can use theorem 5.5 (the chain rule) and theorem 5.3(the product rule) to show that f 0 is differentiable when x 6= 0. case 1: f 00 (0) exists when a > 2 + c From the definition of limit we know that aha−1 sin(h−c ) − cha−(c+1) cos(h−c ) f 0 (0 + h) − f 0 (0) = lim h→0 h→0 h h h i = lim (aha−2 ) sin(h−c ) − (cha−(c+2) ) cos(h−c ) f 00 (0) = lim h→0 81 (29) The range of the sin and cos functions is [−1, 1] so we can establish bounds for the limited term. −(|aha−2 | + |cha−(c+2) |) ≤ aha−2 sin(h−c ) − cha−(c+2) cos(h−c ) ≤ |aha−2 | + |cha−(c+2) | When a > (2 + c) > 2, the powers of h tend tend to zero as h → 0. Taking the limits of the previous inequality as h → 0, we have 0 ≤ lim aha−2 sin(h) − cha−(c+2) cos(h−c ) ≤ 0 h→0 This shows that the limit in (29) (and therefore f 00 (0)) exists. case 2: f 00 (0) does not exist when a = 2 + c Define the sequences {hn } and {jn } such that hn = jn = 1 2nπ 1/c 1 (2n + 1)π 1/c When a = 2 + c we have a − (c + 2) = 0 and therefore equation (28) gives us lim f 00 (h) = lim f 00 (hn ) = 0 − (ch0n )(1) = −c n→∞ h→0 lim f 00 (j) = lim f 00 (jn ) = 0 − (cjn0 )(−1) = c n→∞ j→0 00 00 We know the sequences {f (hn )} and {j (hn )} are well-defined because of lemma 1, therefore we have conflicting definitions of f 00 (0). This means that the limit in (28) (and therefore f 00 (0) itself) does not exist. case 2: f 00 (0) does not exist when a < 2 + c Define the sequences {hn } and {jn } such that hn = jn = 1 2nπ 1/c 1 (2n + 1)π 1/c a−(c+2) → ∞. 
This means that equation (28) gives us When a < 2 + c we have a − (c + 2) < 0 and therefore hn h i lim f 00 (h) = lim f 00 (hn ) = 0 − (cha−(c+2) )(1) = −∞ n h→0 n→∞ h i lim f 00 (j) = lim f 00 (jn ) = 0 − (cjna−(c+2) )(−1) = ∞ j→0 n→∞ We know the sequences {f 00 (hn )} and {j 00 (hn )} are well-defined because of lemma 1, therefore we have conflicting definitions of f 00 (0). This means that the limit in (28) (and therefore f 00 (0) itself) does not exist. These three cases show that f 00 (0) exists iff a > 2 + c. Exercise 5.13f Lemma 1: to hell with this By the lemma of part (e) we know that f 00 (x) is defined for all x ∈ [1, 1] except for possibly x = 0. By the chain rule and product rule we know that the derivative of f when x 6= 0 is h i h i f 00 (x) = (a2 − a)xa−2 − c2 xa−(2+2c) sin(x−c ) + (c2 + c − ca)xa−(2+c) − caxa−(1+c) cos(x−c ) (30) And I’m not going to screw around with limits and absolute values of something so annoying to type out, so I’ll conclude with “the proof is similar to that of part(c)”. 82 Exercise 5.13g The proof is similar to that of part (d), and the sentiment is similar to that of lemma (1) of part (f). Exercise 5.14 Lemma 1: If f is not convex then the convexity condition fails for all λ for some s, t ∈ (a, b) This could also be stated as “if f is not convex on (a, b) then it is strictly concave on some interval (s, t) ⊂ (a, b)”. Assume that f is continuous on the interval (a, b) and that f is not convex. By definition, this means that we can choose c, d ∈ (a, b) and λ ∈ (0, 1) such that f (λc + (1 − λ)d) > λf (c) + (1 − λ)f (d) (31) Having fixed our choice of c, d, and λ so that the previous equation is true, we define the function g(x) to be g(x) = f ((x)c + (1 − x)d) − (x)f (c) + (1 − x)f (d) We know that g is continuous on [0, 1] because f is continuous on [c, d] ⊂ (a, b) (theorem 4.9). And we know that there is at least one p ∈ [0, 1] such that g(p) > 0 because we can choose p = λ which causes the previous equation to simplify to g(p) = f (λc + (1 − λ)d) − [λf (c) + (1 − λ)f (d)] which is > 0 from (31). Let Z1 be the set of all p ∈ [0, λ) for which g(p) = 0 and let Z2 be the set of all p ∈ (λ, 1] for which g(p) = 0 It’s immediately clear that these sets are nonempty since g(0) = g(1) = 0. These sets are closed (exercise 4.3) and therefore contain their supremums and infimums. Let α = sup{Z1 } and let β = inf{Z2 }. We can now claim that equation (31) holds for all λ ∈ (α, β). And this is the same as saying that f (λs + (1 − λ)t) > λf (s) + (1 − λ)f (t) for all λ ∈ (0, 1) for s = αc + (1 − α)d, t = βc + (1 − β)d which, by definition, means that the convexity conditions fails on the inteval (s, t) ∈ (a, b) for all λ ∈ (0, 1). The “if ” case: f 0 is monotonically increasing if f is convex Let f be a function that is convex on the interval (a, b) (see exercise 4.23) and choose x, y ∈ (a, b) with y ≥ x. By definition this means that for all x, y ∈ (a, b) and for all λ ∈ (0, 1) it must be the case that f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) (32) We want to express the left-hand side of this inequality as f (x + h): we can do this by defining h such that h = (λ − 1)(x − y) Rearranging this algebraically allows us to express λ and 1 − λ as λ=1− h , y−x (1 − λ) = h x−y Substituting these values of λ and λ − 1 into (32) results in f (x + h) ≤ f (x) − h f (x) h f (y) + y−x y−x which is algebraically equivalent to f (x + h) − f (x) f (y) − f (x) ≤ h y−x 83 Equation (32) had to be true for any value of λ ∈ (0, 1). As λ → 1 we see that h → 0. 
Taking the limit of both sides of the previous equation as h → 0 gives us f 0 (x) ≤ f (y) − f (x) y−x (33) Having established (33), we now want to express the left-hand side of (32) as f (y − h): we can do this by redefining h such that h = −(λ)(x − y) Rearranging this algebraically allows us to express λ and 1 − λ as λ= h , y−x (1 − λ) = 1 − h x−y Substituting these values of λ and 1 − λ into (32) results in f (y − h) ≤ h f (x) h f (y) + f (y) − y−x y−x which is algebraically equivalent to f (y) − f (y − h) f (y) − f (x) ≥ h y−x Equation (32) had to be true for any value of λ ∈ (0, 1). As λ → 0 we see that h → 0. Taking the limit of both sides of the previous equation as h → 0 gives us f 0 (y) ≥ f (y) − f (x) y−x (34) Combining equations (33) and (34), we have have shown that f 0 (y) ≥ f 0 (x) We assumed only that f was convex and that y > x; we concluded that f 0 (y) ≥ f 0 (x). By definition, this means that f 0 is monotonically increasing if f is convex. The “only if ” case: f 0 is monotonically increasing only if f is convex Assume that f is not convex. By lemma 1, we can find some subinterval (s, t) ∈ (a, b) such that f (λs + (1 − λ)t) > λf (s) + (1 − λ)f (t) (35) is true for all λ ∈ (0, 1). We can now follow the logic of the “if” case and define h = (λ − 1)(s − t) Rearranging this algebraically allows us to express λ and 1 − λ as λ=1− h , t−s (1 − λ) = h s−t Substituting these values of λ and λ − 1 into (35) results in f (s + h) > f (s) − h f (s) h f (t) + t−s t−s which is algebraically equivalent to f (s + h) − f (s) f (t) − f (t) > h t−s Equation (35) had to be true for any value of λ ∈ (0, 1). As λ → 1 we see that h → 0. Taking the limit of both sides of the previous equation as h → 0 gives us f 0 (s) > f (t) − f (s) t−s 84 (36) Having established (36), we now redefine h such that h = −(λ)(s − t) Rearranging this algebraically allows us to express λ and 1 − λ as λ= h , t−s (1 − λ) = 1 − h s−t Substituting these values of λ and 1 − λ into (35) results in f (t − h) > h f (s) h f (t) + f (t) − t−s t−s which is algebraically equivalent to f (t) − f (t − h) f (t) − f (s) < h t−s Equation (35) had to be true for any value of λ ∈ (0, 1). As λ → 0 we see that h → 0. Taking the limit of both sides of the previous equation as h → 0 gives us f 0 (t) < f (t) − f (s) t−s (37) Combining equations (36) and (37), we have have shown that f 0 (t) < f 0 (s) for some t > s. We assumed only that f was not convex; we concluded that f 0 was not monotonically increasing. By contrapositive, this means that f 0 is monotonically increasing only if f is convex. f is convex iff f 00 (x) ≥ 0 for all x ∈ (a, b) We’ve shown that f is convex iff f 0 is monotonically inreasing, and theorem 5.11 tells us that f 0 is monotonically increasing iff f 00 (x) ≥ 0 for all x ∈ (a, b). Exercise 5.15 Note on the bounds of f and its derivatives When Rudin asks us to assume that |f | and its derivatives have upper bounds of M0 , M1 , and M2 it appears that we must assume that these bounds are finite. Otherwise the function f (x) = x would be a counterexample to the claim that M12 ≤ 4M0 M2 . Proof that M12 ≤ 4M0 M2 for real-valued functions Choose any h > 0. 
Using Taylor’s theorem (theorem 5.15) we can express f (x + 2h) as f (x + 2h) = f (x) + 2hf 0 (x) + 4h2 f 00 (ξ) , 2 ξ>a which can be algebraically arranged to give us f 0 (x) = f (x + 2h) f (x) − − hf 00 (ξ) 2h 2h f (x + 2h) f (x) f (x + 2h) f (x) − − hf 00 (ξ)| ≤ | |+| | + |hf 00 (ξ)| 2h 2h 2h 2h We’re given upper bounds for |f (x)| and |f 00 (x)|; these bounds give us the inequality |f 0 (x)| = | |f 0 (x)| ≤ 2M0 + hM2 h 85 (38) This inequality must hold for all x, even when |f (x)| approaches its upper bound, so we have M1 ≤ 2M0 + hM2 h We can multiply both sides by h and then algebraically rearrange this into a quadratic equation in h. h2 M2 − hM1 + M0 ≥ 0 (39) The quadratic solution to this equation is h= M1 ± p M12 − 4M0 M2 2M2 (40) We want to make sure that there are not two solutions to (40): we want (39) to hold for all values of h, and if (40) had two solutions then we would have h2 M2 − hM1 + M0 < 0 on some interval of h. To make sure that there is at most a single real solution we need to make sure that the discriminant of (40) is either zero (one solution) or negative (zero solutions). This occurs exactly when M12 ≤ 4M0 M2 Does this apply to vector-valued functions? Yes. Let f be a vector-valued function that is continuous on (a, ∞) and is defined by f (x) = ( f1 (x), f2 (x), . . . , fn (x) ) Assume that |f |, |f 0 |, and |f 00 | have finite upper bounds of (respectively) M0 , M1 , and M2 . If we evaluate f at x + 2h we have f (x + 2h) = ( f1 (x + 2h), f2 (x + 2h), . . . , fn (x + 2h) ) Taking the Taylor expansion of each of these terms, we have f (x + 2h) = = [f1 (x) + 2hf10 (x) + 2h2 f100 (ξ1 )], [f2 (x) + 2hf20 (x) + 2h2 f200 (ξ1 )], . . . , [fn (x) + 2hfn0 (x) + 2h2 fn00 (ξ1 )] ( f1 (x), f2 (x), . . . , fn (x) ) + ( 2hf10 (x), 2hf20 (x), . . . , 2hfn0 (x) ) + 2h2 f100 (ξ), 2h2 f200 (ξ), . . . , 2h2 fn00 (ξ) = f (x) + 2hf 0 (x) + 2h2 f 00 (ξ) This tells us that equation (38) holds for vector-valued functions. But nothing in the proof following equation (38) requires us to assume that f is real-valued instead of vector-valued. So the proof following (38) still suffices to prove that M12 ≤ 4M0 M2 Exercise 5.16 proof 1 In exercise 5.15 we established the inequality |f 0 (x)| ≤ | f (x) f (x + 2h) |+| | + |hf 00 (ξ)| 2h 2h We are given an upper bound of M2 for |f 00 (ξ)|, so this inequality can be expressed as |f 0 (x)| ≤ | f (x + 2h) f (x) |+| | + hM2 2h 2h We’re told that f is twice-differentiable, so we know that both f and f 0 are continuous (theorem 5.2), so we can take the limit of both sides of this equation as x → 0 (theorem 4.2). f (x + 2h) f (x) 0 lim |f (x)| ≤ lim | |+| | + hM2 x→∞ x→∞ 2h 2h 86 We’re told that f (x) → 0 as x → 0 so this becomes lim |f 0 (x)| ≤ lim hM2 x→∞ x→∞ This must be true for all h, so we can take the limit of both sides of this as h → 0 (we can do this because both |f 0 (x)| and hM2 are continuous functions with respect to the variable h). lim lim |f 0 (x)| ≤ lim lim hM2 h→0 x→∞ h→0 x→∞ The left-hand side of the previous inequality is independent of h and therefore doesn’t change; the right-hand side becomes 0. lim |f 0 (x)| = lim lim |f 0 (x)| ≤ lim lim hM2 = 0 x→∞ h→0 x→∞ h→0 x→∞ 0 lim |f (x)| ≤ 0 x→∞ This show that f 0 (x) → 0 as x → ∞, which is what we were asked to prove. Exercise 5.16 proof 2 In exercise 5.15 we established the inequality M12 ≤ 4M0 M2 where each of the M terms represented a supremum on the interval (a, ∞). This inequality was proven to hold for all a such that f was twice-differentiable on the interval (a, ∞). 
To show more explicitly that the value of these M terms depends on a, we might express the previous inequality as sup |f 0 (x)|2 ≤ sup 4|f (x)||f 00 (x)| x>a x>a Each of these terms is continuous with respect to a (the proof of this claim is trivial but tedious) so we can take the limit of both sides as a → ∞. lim sup |f 0 (x)|2 ≤ lim sup 4|f (x)||f 00 (x)| a→∞ x>a a→∞ x>a We’re told that |f 00 (x)| has an upper bound of M2 . We’re also told that f (x) → 0 as x → ∞. The last inequality therefore allows us to conclude that lim sup |f 0 (x)|2 ≤ lim sup 4|0||M2 | = 0 a→∞ x>a a→∞ x>a It’s clear that |f 0 (x)| must be less than or equal to the supremum of a set containing |f 0 (x)|, so we have lim |f 0 (x)| ≤ lim sup |f 0 (x)|2 ≤ 0 x→∞ a→∞ x>a lim |f 0 (x)| ≤ 0 x→∞ This shows that f 0 (x) → 0 as x → ∞, which is what we were asked to prove. Exercise 5.17 We’re told that f is three-times differentiable on the interval (−1, 1) so we can take the Taylor expansions of f (−1) and f (1) around x = 0: f (−1) = f (0) + f 0 (0)(−1) + f 000 (ξ1 )(−1)3 f 00 (0)(−1)2 + , 2 6 f (1) = f (0) + f 0 (0)(1) + f 00 (0)(1)2 f 000 (ξ2 )(1)3 + , 2 6 87 ξ1 ∈ (−1, 0) ξ2 ∈ (0, 1) When we evaluate f (1) − f (−1), many of these terms cancel out and we’re left with f (1) − f (−1) = 2f 0 (0) + f 000 (ξ1 ) + f 000 (ξ2 ) , 6 ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1) We’re given the values of f (1), f (−1), and f 0 (0) so this last equation becomes f 000 (ξ1 ) + f 000 (ξ2 ) , 6 ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1) f 000 (ξ2 ) = 6 − f 000 (ξ1 ), ξ1 ∈ (−1, 0)ξ2 ∈ (0, 1) 1= which is algebraically equivalent to If f 000 (ξ1 ) ≤ 3 then f 000 (ξ2 ) ≥ 3, and vice-versa: so f 000 (x) ≥ 3 for either ξ1 or ξ2 ∈ (−1, 1). And this is what we were asked to prove. Exercise 5.18 The nth derivative of f (t) The exercise tells gives us the following formula for f (t): f (t) = f (β) − (β − t)Q(t) Since β is a fixed constant, the first two derivatives of f with respect to t are f 0 (t) = Q(t) − (β − t)Q00 (t) f 00 (t) = 2Q0 (t) − (β − t)Q000 (t) And, in general, we have f (n) (t) = nQ(n−1) (t) − (β − t)Q(n) (t) which, after multiplying by (β − α)n /n! and setting t = α, becomes Q(n−1) (t) Q(n) (t) f (n) (t) (β − α)n = (β − α)n − (β − α)n+1 n! (n − 1)! n! (41) Modifying the Taylor formula The formula for the Taylor expansion of f (β) around the point f (α) (theorem 5.15) includes a P (β) term defined as n−1 X f (k) (α) P (β) = (β − α)k k! k=0 If we isolate the k = 0 case this becomes P (β) = f (α) + n−1 X k=1 f (k) (α) (β − α)k k! We can now use equation (41) to express the terms of the summation as functions of Q. P (β) = f (α) + n−1 X k=1 n−1 X Q(k) (α) Q(k−1) (α) k k+1 (β − α) − (β − α) (k − 1)! k! k=1 If we extract the k = 1 term from the leftmost summation and the k = n−1 term from the rightmost summation, we have n−2 n−1 X Q(k−1) (α) X Q(k) (α) Q(n−1) (α) P (β) = f (α) + Q(α)(β − α) + (β − α)k − (β − α)k+1 − (β − α)n (k − 1)! k! (n − 1)! k=2 k=1 88 We can then re-index the leftmost summation to obtain n−2 n−2 X Q(k) (α) X Q(k) (α) Q(n−1) (α) k+1 k+1 P (β) = f (α) + Q(α)(β − α) + (β − α) (β − α) (β − α)n − − (k)! k! (n − 1)! k=1 k=1 The two summations cancel one another, leaving us with P (β) = f (α) + Q(α)(β − α) − Q(n−1) (α) (β − α)n (n − 1)! If we replace Q(α) with the definition of Q given in the exercise, we see that this previous equation evaluates to P (β) = f (α) + f (α) − f (β) Q(n−1) (α) (β − α) − (β − α)n α−β (n − 1)! This simplifies to P (β) = f (α) + (f (β) − f (α)) − Q(n−1) (α) (β − α)n (n − 1)! 
A simple algebraic rearrangement of these terms gives us f (β) = P (β) − Q(n−1) (α) (β − α)n (n − 1)! which is the equation we were asked to derive. Exercise 5.19a The given expression for Dn is algebraically equivalent to f (βn ) − f (0) βn f (0) − f (αn ) −αn Dn = + βn − 0 βn − α n 0 − αn βn − α n We really want to be able to evaluate this by taking the limit of each side of this previous equation in the following manner: f (βn ) − f (0) βn f (0) − f (αn ) −αn lim Dn = lim · lim + lim · lim (42) n→∞ n→∞ n→∞ βn − αn n→∞ n→∞ βn − αn βn − 0 0 − αn There two conditions that must be met in order for this last step to be justified: condition 1: theorem 3.3 tells us that each of the limits must actually exist (and must not be ±∞) condition 2: and theorem 4.2 tells us that we must have lim f (αn ) = lim f (βn ) = f (0). The fact that αn < 0 < βn guarantees that 0 < βn /(βn − αn ) < 1 and 0 < −αn /(βn − αn ) < 1, which tells us that at 2 of the 4 limits in (42) exist. The other two limits exist because they’re equal to f 0 (0), and we’re told that f 0 (0) exists. Therefore condition 1 is met. The fact that f 0 (0) exists tells us that f is continuous at x = 0 (theorem 5.2) and therefore condition 2 is met (theorem 4.2). Therefore we’re justified in taking the limits in (42), giving us βn −αn lim Dn = f 0 (0) · lim + f 0 (0) · lim n→∞ n→∞ βn − αn n→∞ βn − αn = lim f 0 (0) n→∞ βn − αn = f 0 (0) βn − αn 89 Exercise 5.19b The given expression for Dn is algebraically equivalent to f (βn ) − f (0) βn f (αn ) − f (0) βn Dn = + 1− βn − 0 βn − αn αn − 0 βn − αn We want to evaluate this by taking the limits of the individual terms as we did in part (a): f (βn ) − f (0) βn f (αn ) − f (0) βn lim Dn = lim · lim + lim lim 1− n→∞ n→∞ n→∞ βn − αn n→∞ n→∞ βn − 0 αn − 0 βn − αn (43) In order for this to be justified we must once again meet the two conditions mentioned in part (a). We’re told that 0 < βn /(βn − αn ) < M for some M , which tells us that at 2 of the 4 limits in (43) exist. The other two limits exist because they’re equal to f 0 (0), and we’re told that f 0 (0) exists. Therefore condition 1 is met. The fact that f 0 (0) exists tells us that f is continuous at x = 0 (theorem 5.2) and therefore condition 2 is met (theorem 4.2). Therefore we’re justified in taking the limits in (43), giving us βn βn + f 0 (0) · lim 1 − lim Dn = f 0 (0) · lim n→∞ n→∞ n→∞ βn − αn βn − αn βn βn = f 0 (0) · lim +1− n→∞ βn − αn βn − αn = f 0 (0) Exercise 5.19c proof 1 Define the sequence {hn } where hn = βn − αn . We can now express Dn as Dn = f (αn + hn ) − f (αn ) hn We know that αn → 0 and βn → 0 as n → ∞, so clearly hn → 0 as n → ∞. We’re also told that f 0 is continuous on (−1, 1), so we know that f 0 is defined on this interval. Therefore we have lim Dn n→∞ = = lim lim f (αn + hn ) − f (αn ) hn n→∞ h→0 lim f 0 (αn ) n→∞ We’re told that f 0 is continuous on the interval (−1, 1) and that lim αn = 0 so by theorem 4.2 we have lim Dn = lim f 0 (αn ) = f 0 (0) n→∞ n→∞ Exercise 5.19c proof 2 The mean value theorem (theorem 5.10) allows us to construct a sequence {γn } as follows: for each n, choose some γn ∈ (αn , βn ) such that f (βn ) − f (αn ) = f 0 (γn ) (44) Dn = βn − αn Each γn is in the interval (αn , βn ). We’re told that lim αn = lim βn = 0. Therefore we can use the squeeze theorem to determine lim γn . 0 = lim αn ≤ lim γn ≤ lim βn = 0 n→∞ n→∞ 0 n→∞ which means that lim γn → 0. We’re told that f is continuous so by theorem 5.2 we can take the limit of equation (44). 
lim Dn = lim f 0 (γn ) = f 0 (0) n→∞ n→∞ 90 Exercise 5.20 Exercise 5.21 We saw in exercise 2.29 that any open set in R1 can be represented as a countable number of disjoint open S segments. Let {(an , bn )} represent such a countable collection of disjoint sets such that (ai , bi ) = E C . Define f : R1 → R1 as 0, x∈E (x − bi )2 , x ∈ (ai , bi ) ∧ ai = −∞ f (x) = 2 (x − a ) , x ∈ (ai , bi ) ∧ bi = ∞ i (x − ai )2 (x − bi )2 , x ∈ (ai , bi ) ∧ −∞ < ai < bi < ∞ It’s easy but tedious to verify that this a continuous function which is differentiable and that f (x) = 0 iff x ∈ E. Note that the derivative also has the property that f 0 (x) = 0 iff x ∈ E. For the second part of the exercise we can define a function f to be 0, x∈E (x − bi )n+1 , x ∈ (ai , bi ) ∧ ai = −∞ f (x) = n+1 (x − a ) , x ∈ (ai , bi ) ∧ bi = ∞ i (x − ai )n+1 (x − bi )n+1 , x ∈ (ai , bi ) ∧ −∞ < ai < bi < ∞ It’s similarly easy but tedious to verify that this is a continuous function that is n times differentiable and that f (x) = 0 iff x ∈ E. It can also be seen that when k ≤ n we have f (k) (x) = 0 iff x ∈ E. Finally6 , we can pretend that we’ve defined the exponential function and define f to be 0, x∈E −1 x ∈ (ai , bi ) ∧ ai = −∞ exp (x−bi )2 , f (x) = −1 exp (x−ai )2 , x ∈ (ai , bi ) ∧ bi = ∞ −1 exp x ∈ (ai , bi ) ∧ −∞ < ai < bi < ∞ (x−ai )2 (x−bi )2 , It’s fairly easy to confirm that this function is continuous and that f (x) = 0 iff x ∈ E. Looking at the derivative with respect to x we have 0, x∈E −2 −1 x ∈ (ai , bi ) ∧ ai = −∞ (x−bi )3 exp (x−bi )2 , f 0 (x) = −2 −1 x ∈ (ai , bi ) ∧ bi = ∞ (x−ai )3 exp (x−ai )2 , −2(bi −ai ) −1 x ∈ (ai , bi ) ∧ −∞ < ai < bi < ∞ (x−ai )2 (x−bi )2 exp (x−ai )2 (x−bi )2 , To calculate the limit of f 0 (x) in the first case, we use L’Hopital’s rule. −1 exp (x−b 2 i) lim f 0 (x) = lim 3 x→a x→a (x − bi ) The numerator and denominator of this last term both tend to ±∞, so L’Hopital’s rule is applicable. Repeated applications of L’Hopital’s rule will eventually give us a constant term in the numerator and a term that tends to ±∞ in the denominator, so we see that lim f 0 (x) = 0. Similar results hold for the limits in the other two cases. The general idea is that, for any polynomial term p(n), the exponential limit limn→∞ exp(−1/n) will tend towards zero faster than limn→∞ p(n) tends towards ∞. Therefore p(n)exp(−1/n) will tend to zero as n → ∞. Every term of every derivative of f (x) will consist only of polynomial multiples of the exponential term, so it will hold that for all k: lim f (k) (x) = lim f (k) (x) = 0 x→ai x→bi It seems clear in a vague, poorly-defined Calc II-ish way that f is infinitely differentiable and that, for all n, f (n) (x) = 0 iff x ∈ E. I have no idea how to prove this. 6 this example was provided by Boris Shekhtman, University of South Florida 91 Exercise 5.22a Assume that f (x1 ) = x1 and f (x2 ) = x2 with x1 6= x2 . Then f (x2 ) − f (x1 ) x2 − x1 = =1 x2 − x1 x2 − x1 By theorem 5.10 this means that f 0 (x) = 1 for some x between x1 and x2 . This contradicts the presumption that f 0 (x) 6= 1, so our initial assumption must be wrong: there cannot be two fixed points. Exercise 5.22b For f (t) to have a fixed point it would be necessary that f (t) = t, in which case we would have t = t + (1 − et )−1 −→ 1 =0 1 − et This statement is not true for any t. Exercise 5.22c Choose an arbitrary value for x0 and let {xn } be the sequence recursively defined by xn+1 = f (xn ). We’re told that |f 0 (t)| ≤ A for all real t. 
So for any n we have f (xn−1 ) − f (xn−2 ) ≤A xn−1 − xn−2 −→ |f (xn−1 ) − f (xn−2 )| ≤ A |xn−1 − xn−2 | which, since f (xn ) = xn+1 , gives us |xn − xn−1 | ≤ A|xn−1 − xn−2 | But this holds for all n. So: |xn − xn−1 | ≤ A|xn−1 − xn−2 | ≤ A2 |xn−2 − xn−3 | ≤ A3 |xn−3 − xn−4 | ≤ · · · ≤ An−2 |x2 − x1 | We can use this general formula to determine the difference between xn+k and xn . |xn+k − xn | = |(xn+k − xn+k−1 ) + (xn+k−1 − xn+k−2 ) + · · · + (xn+2 − xn+1 ) + (xn+1 − xn )| ≤ |xn+k − xn+k−1 | + |xn+k−1 − xn+k−2 | + · · · + |xn+2 − xn+1 | + |xn+1 − xn )| ≤ An+k−2 |x2 − x1 | + An+k−3 |x2 − x1 | + · · · + An |x2 − x1 | + An−1 |x2 − x1 )| ≤ An |x2 − x1 | This shows us that |xn+k − xn | ≤ An |x2 − x1 | We’re told that A < 1, so taking the limit of eachs ide of this inequality as n → ∞ gives us lim |xn+k − xn | ≤ lim An |x2 − x1 | = 0 n→∞ n→∞ And this is just the Cauchy criterion for convergence for the sequence {xn }. And we’re converging in R1 so by theorem 3.11 we know that the sequence converges to some element x ∈ R1 . ∃x ∈ R : lim xn = x n→∞ But the elements of {xn } are just the elements of {f (xn )}, so we can also conclude that ∃x ∈ R : lim f (xn ) = x n→∞ And we’re told that f is continuous, so by theorem 4.6 we see that ∃x ∈ R : f (x) = x 92 Exercise 5.23a If x < α, then we can express x as x = α − δ for some δ > 0. This gives us f (x) = f (α − δ) definition of x = = = (α−δ)3 −1 3 α3 −3α2 δ+3αδ 2 −δ 3 +1 3 −3α2 δ+3αδ 2 α3 +1 + 3 3 = f (α) − αδ(α − δ) algebra 3 − δ3 − 13 δ 3 = f (α) − αδx − 13 δ 3 = α − αδx − definition of f 1 3 3δ algebra algebra definition of x α is a fixed point We know that x < α < −1, so αx > 1 and therefore αδx > δ. From this we have an inequality: < α − δ − 31 δ 3 <α−δ =x definition of x This establishes that x < α → f (x) < x. Therefore if x1 < α then the sequence {xn } will be a nonincreasing sequence. We know that this sequence doesn’t converge because we’re told that f has no fixed points less than α. Exercise 5.23b The first derivative of f is f 0 (x) = x2 . This is nonnegative, so we know that f is monotonically increasing (theorem 5.11). The second derivative is f 0 (x) = 2x. This is negative for x < 0 and positive for x > 0: therefore f is strictly convex on (0, γ) and is strictly concave on (α, 0). Now let xk be chosen from the interval (α, γ). Case 1: xk ∈ (α, 0) Let xk ∈ (α, 0) be chosen. We can express this x as xk = λα + (1 − λ)0 for some λ ∈ (0, 1). The second derivative of f 00 (x) = 2x is negative on the interval (α, 0) and so the function is concave on this interval (corollary of exercise 5.14). Therefore we have xk+1 = f (xk ) definition of xk+1 = f (λα + (1 − λ)0) definition of xk > λf (α) + (1 − λ)f (0) f is concave on (α, 0) = λα + (1 − λ)(1/3) α is a fixed point, f (0) = 1/3 = xk + (1 − λ)(1/3) definition of xk > xk We see that xk < xk+1 ; from our choice of xk we have α < xk ; from the monotonic nature of f we have xk < β → xk+1 = f (xk ) < f (β) = β. Combining these inequalities yields α < xk < xk+1 < β Therefore our initial choice of xk gives us an increasing sequence {xn } with an upper bound of β. Therefore it converges to a some fixed point in the interval (x1 , beta], and this point must be β. Case 2: x ∈ (β, γ) Let xk ∈ (β, γ) be chosen. We can express this as xk = λβ + (1 − λ)γ for some λ ∈ (0, 1). The second derivative of f 00 (x) = 2x is positive on the interval (β, γ) and so the function is convex on this interval (exercise 5.14). 
Therefore we have

x_{k+1} = f(x_k)   (definition of x_{k+1})
= f(λβ + (1 − λ)γ)   (definition of x_k)
< λ f(β) + (1 − λ) f(γ)   (f is convex on (β, γ))
= λβ + (1 − λ)γ   (β and γ are fixed points)
= x_k   (definition of x_k)

We see that x_{k+1} < x_k; from our choice of x_k we have x_k < γ; and from the monotonic nature of f we have β < x_k → β = f(β) < f(x_k) = x_{k+1}. Combining these inequalities yields

β < x_{k+1} < x_k < γ

Therefore our initial choice of x_k gives us a decreasing sequence {x_n} with a lower bound of β. The sequence therefore converges; because f is continuous, its limit is a fixed point of f in the interval [β, x_k), and this point must be β.

Case 3: x_k ∈ (0, β)
Let x_k ∈ (0, β) be chosen. The only fixed points of f are α, β, and γ, so the continuous function f(x) − x never vanishes on (0, β); since f(0) − 0 = 1/3 > 0, the intermediate value theorem tells us that f(x) − x > 0 throughout (0, β). Therefore

x_{k+1} = f(x_k) > x_k

We see that x_k < x_{k+1}; from our choice of x_k we have 0 < x_k; and from the monotonic nature of f we have x_k < β → x_{k+1} = f(x_k) < f(β) = β. Combining these inequalities yields

0 < x_k < x_{k+1} < β

Therefore our initial choice of x_k gives us an increasing sequence {x_n} with an upper bound of β. As in case 1, it converges to some fixed point in the interval (x_k, β], and this point must be β.

Case 4: x_k = β or x_k = 0
If x_k = β then every term of {x_n} is β, so this sequence clearly converges to β. If x_k = 0 then x_{k+1} = f(0) = 1/3 and the remainder of the sequence {x_n} converges to β by one of the previous cases.

Exercise 5.23c
If x > γ then we can express x as x = γ + δ for some δ > 0. This gives us

f(x) = f(γ + δ)   (definition of x)
= ((γ + δ)^3 + 1)/3   (definition of f)
= (γ^3 + 3γ^2 δ + 3γδ^2 + δ^3 + 1)/3   (algebra)
= (γ^3 + 1)/3 + (3γ^2 δ + 3γδ^2)/3 + (1/3)δ^3   (algebra)
= f(γ) + γδ(γ + δ) + (1/3)δ^3   (algebra)
= f(γ) + γδx + (1/3)δ^3   (definition of x)
= γ + γδx + (1/3)δ^3   (γ is a fixed point)

We know that γ > 1 and x > 1, so γδx > δ. From this we have an inequality:

f(x) > γ + δ + (1/3)δ^3 > γ + δ = x   (definition of x)

This establishes that x > γ → f(x) > x. Therefore if x_1 > γ then the sequence {x_n} will be an increasing sequence. This sequence cannot converge: by continuity its limit would be a fixed point of f greater than γ, and f has no fixed points greater than γ.

Exercise 5.24
The function f(x) has a derivative of zero at its fixed point, so when x_k and x_{k+1} are both close to √a the mean value theorem guarantees us that f(x_k) and f(x_{k+1}) will be very near one another:

|f(x_k) − f(x_{k+1})| = |(x_k − x_{k+1}) f'(ξ)| ≈ 0

The left-hand term of this equation converges very rapidly to 0 because both |x_k − x_{k+1}| and f'(ξ) are converging toward zero. The function g(x) does not have a derivative of zero at its fixed point and therefore does not have this property (although, as we saw in exercise 3.17, it still converges, albeit more slowly).

Exercise 5.25a
Each x_{n+1} is chosen to be the point where the line tangent to the graph of f at (x_n, f(x_n)) crosses the x-axis.

Exercise 5.25b
Lemma 1: x_{n+1} < x_n if x_n > ξ
We're told that f(b) > 0 and that f(x) = 0 only at x = ξ. We know that f is continuous since it is differentiable (theorem 5.2), so by the intermediate value theorem (theorem 4.23) we know that f(x) > 0 when x > ξ (otherwise we'd have f(x) = 0 for some second x in (ξ, b)). We're also told that f'(x) ≥ δ > 0 for all x ∈ (a, b) (without the δ it might be the case that f'(x) ≠ 0 but lim f'(x) = 0).
Therefore the ratio f (xn )/f 0 (xn ) is defined at each x and is positive when x > ξ. Therefore we have xn − f (xn ) < xn , f 0 (xn ) xn > ξ This of course means that xn+1 < xn when xn > ξ. Lemma 2: xn+1 > ξ if xn > ξ We’re told that f 00 (x) ≥ 0, which means that f 0 (x) is monotonically increasing, which means that c < xn implies f 0 (c) ≤ f 0 (xn ), which means that f 0 (xn ) − f (ξ) ≤ f 0 (xn ) xn − ξ because the LHS of this inequality is equal to f 0 (c) for some c < xn . Using the fact that f (ξ) = 0 and a bit of algebraic manipulation, this equation is equivalent to xn − f (xn ) ≥ξ f 0 (xn ) which of course means that xn+1 ≥ ξ. Lemma 3: if {xn } → κ then f (κ) = 0 Suppose that limn→∞ xn+1 = κ. Then by the Cauchy criterion we have limn→∞ xn − xn+1 = 0 which is equivalent to f (xn ) lim xn − xn − 0 =0 n→∞ f (xn ) or f (xn ) lim =0 n→∞ f 0 (xn ) For this to hold it must either be the case that f (xn ) → 0 or f 0 (xn ) → ±∞: but we know that f 0 (xn ) is bounded from the fact that f 00 (xn ) is bounded (mean value theorem : if f 0 (xn ) were unbounded, then 95 (f 0 (xn ) − f 0 (x1 ))/xn − x1 = f 00 (c) would be unbounded). Therefore it must be the case that f (xn ) → 0. So we have limn→∞ xn = κ and limn→∞ f (xn ) = 0. Therefore by theorem 4.2 we have f (κ) = lim f (x) = 0 x→κ The actual proof Choose x1 > ξ. By induction, using lemma 2, we know that every element of {xn } will be > ξ. Therefore, by lemma 1, xn+1 < xn for all n. This means that {xn } is a decreasing sequence with a lower bound of ξ. Therefore, by theorem 3.14, the sequence converges to some point κ. By lemma 3 we have f (κ) = 0. But we’re told that f has only one zero, so it must be the case that κ = ξ. This means that {xn } converges to ξ. Exercise 5.25c Using Taylor’s theorem to expand f (ξ) around f (xn ), we have f 00 (tn )(ξ − xn )2 2 f (ξ) = f (xn ) + f 0 (xn )(ξ − xn ) + Subtracting f (xn ) from both sides then dividing by f 0 (xn ) (we’re told f 0 (x) ≥ δ > 0) gives us f 00 (tn )(ξ − xn )2 f (ξ) − f (xn ) = (ξ − x ) + n f 0 (xn ) 2f 0 (xn ) Rearranging some terms and recognizing that f (ξ) = 0, we have xn − f (xn ) f 00 (tn )(xn − ξ)2 − ξ = f 0 (xn ) 2f 0 (xn ) Which, by the definition of xn+1 , is equivalent to xn+1 − ξ = f 00 (tn )(xn − ξ)2 2f 0 (xn ) Exercise 5.25d We’re told that f 0 (x) ≥ δ and f 00 (x) ≤ M , so the inequality in part (c) guarantees us that xn+1 − ξ ≤ M (xn − ξ)2 = A(xn − ξ)2 2δ This allows us to recursively construct a chain of inequalities. xn+1 − ξ ≤ A1 (xn − ξ)2 ≤ A1 A2 (xn−1 − ξ)4 ≤ A1 A2 A4 (xn − 2 − ξ)8 ≤ · · · ≤ A1 A2 . . . A2 n−1 (x1 − ξ)2 n Collapsing the exponents of the rightmost term, we have n xn+1 − ξ ≤ A2 −1 n (x1 − ξ)2 = n 1 [A(x1 − ξ)]2 A Exercise 5.25e How does g behave near ξ? We’re told f and f 0 are differentiable, and clearly x is differentiable, therefore g is differentiable (theorem 5.3). Using the quotient rule (theorem 5.3 again), the derivative of g is given by g 0 (x) = 1 − f 0 (x)2 − f (x)f 00 (x) f (x)f 00 (x) = 0 2 f (x) f 0 (x) 96 Taking the absolute value of each side, we have |g 0 (x)| = f (x)f 00 (x) f 0 (x) Using the inequality we established in part (d) we have |g 0 (x)| ≤ A|f (x)| When we take the limit of each side of this equation as x → ξ, the RHS reduces to f (ξ) = 0 because f is continuous. lim |g 0 (x)| ≤ lim A|f (x)| = 0 x→ξ x→ξ Which immediately implies lim g 0 (x) = 0 x→ξ and this describes the behavior of g 0 as x → ξ, which is what we were asked to describe. 
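Before turning to the fixed-point description of Newton's method, here is a minimal sketch of the iteration x_{n+1} = x_n − f(x_n)/f'(x_n) studied in this exercise. The function f(x) = x^2 − 2 on [1, 2] is an illustrative choice of my own (it is not specified by the exercise); it satisfies f(1) < 0 < f(2), f' ≥ 2 > 0 and f'' = 2 ≥ 0 there, so the hypotheses hold.

import math

# Aside (illustration only): Newton's method x_{n+1} = x_n - f(x_n)/f'(x_n).
# The sample function f(x) = x**2 - 2 on [1, 2] satisfies the hypotheses of
# exercise 5.25: f(1) < 0 < f(2), f'(x) = 2x >= 2 > 0, and f''(x) = 2 >= 0.
def newton(f, fprime, x1, steps=6):
    x = x1
    for n in range(1, steps + 1):
        x = x - f(x) / fprime(x)
        print(n, x, abs(x - math.sqrt(2)))  # distance to the zero xi = sqrt(2)
    return x

newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x1=2.0)

Starting from x_1 = b = 2, the printed error roughly squares at each step, which is the behaviour quantified by the bound in part (d).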
Show that Newton’s method involves finding a fixed point of g This is a slightly modified version of lemma 3 for part (b). Suppose that we have found a fixed point κ such that g(κ) = κ. From the definition of g this would mean that g(κ) − κ = − f (κ) =0 f 0 (κ) For this to hold it must either be the case that f (κ) = 0 or f 0 (κ) = ±∞: but we know that f 0 is bounded from 0 0 (y) the fact that f 00 is bounded (mean value theorem : if f 0 were unbounded, then f (x)−f = f 00 (c) would be x−y unbounded for some x, y). Therefore it must be the case that f (κ) = 0 and therefore κ must be the unique point for which f (κ) = 0. Exercise 5.25f We’ll consider the more general case in which f (x) = xm . In this case the single real zero occurs at x = 0. Newton’s formula gives us the step function f (xn ) xm m−1 xn n xn+1 = xn − 0 = xn − = = x − xn n f (xn ) m m mxm−1 n This gives us a recursive definition of xn . 2 3 n m−1 m−1 m−1 m−1 xn+1 = xn = xn−1 = xn−2 = · · · = x1 m m m m Taking the limit of each side as n → ∞, we have lim xn+1 = lim n→∞ n→∞ m−1 m n x1 (45) In order to have {xn } converge to 0 we must either have x1 = 0 or |m − 1/m| < 1. The latter case occurs when |m − 1| < |m| This inequality holds iff m > 21 . If m is larger than 21 the limit in equation (45) exists and is equal to zero and so {xn } → 0; if m is smaller than 12 then the limit in equation (45) is unbounded and so {xn } → ±∞; if m = 12 then equation (45) becomes lim xn+1 = lim (−1)n x1 n→∞ n→∞ This limit clearly doesn’t exist when x1 6= 0. In the specific case given in the exercise we have m = 97 1 3 and therefore the sequence {xn } fails to converge. Exercise 5.26 Following the hint given in the exercise, let M0 = sup |f (x)| and let M1 = sup |f 0 (x)| for x ∈ [a, x0 ]. We know that |f (x)| ≤ M1 (x0 − a) (46) because otherwise we’d have f (x) − f (a) f (x) = = f 0 (c) for some c ∈ (a, b) > M1 = sup |f 0 (x)| x0 − a x0 − a which is clearly contradictory. Additionally, we know that M1 = sup |f 0 (x)| ≤ sup |Af (x)| = A sup |f (x)| = AM0 (47) Combining (46) and (47) gives us |f (x)| ≤ M1 (x0 − a) ≤ AM0 (x0 − a) , x ∈ [a, x0 ] (48) From the fact that f (a) = 0 we know that M1 (x0 − a) is the maximum possible value that f could possibly obtain at f (x0 ) (otherwise, if f (x0 ) > M1 (x0 −a), then the mean value theorem would give us some f 0 (c) > M1 ). In comparison, M0 is the maximum value that f actually does obtain at some x ∈ [a, b]. Therefore we have M0 ≤ M1 (x0 − a) (49) Suppose 0 < A(x0 − a) < δ < 1. Then (48) would give us |f (x)| ≤ M1 (x0 − a) ≤ AM0 (x0 − a) ≤ δM0 which contradicts (49) unless M1 = M0 = 0. And since we can force 0 < A(x0 − a) < 1 by choosing an appropriate x0 , it must be the case that M1 = M0 = 0. This turns (48) into |f (x)| ≤ 0 , x ∈ [a, x0 ] which shows that f (x) = 0 on the interval [a, x0 ]. We can now repeat these steps using the interval [x0 , b]. We need only perform this a finite number of times (specifically, a maximum of of [(b − a)/(x0 − a)] + 1 times) before we have covered the entire interval. Therefore f (x) = 0 on the entire interval [a, b]. Exercise 5.27 The given hint is pretty much the entire solution. Let y1 and y2 be two solutions to the initial-value problem and define the function f (x) = y2 (x) − y1 (x). 
The function f meets all of the prerequisites spelled out in exercise 5.26: it’s differentiable, f (a) = 0, and we’re told that there is a real number A such that |f 0 (x)| = |y20 − y10 | = |φ(x, y2 ) − φ(x, y1 )| ≤ A|y2 − y1 | = A|f (x)| Therefore, by exericse 5.26, we have y2 (x) − y1 (x) = f (x) = 0 for all x; therefore y1 = y2 . But y1 and y2 were arbitrary solutions for the intial-value problem, so we have proven that there is only one unique solution. Exercise 5.28 Let ya and yb be two solution vectors to the initial-value problem and define the vector-valued function f (x) = yb (x) − ya (x). As in the previous problem, we see that f (x) is differentiable and that f (a) = a. Therefore, if there is a real number A such that |f 0 (x)| = |yb0 − ya0 | = |φ(x, yb ) − φ(x, ya )| ≤ A|y2 − y1 | = A|f (x)| then, by exercises 5.26 and 5.27 the initial-value problem has a unique solution. 98 Exercise 5.29 Note: there’s probably a more subtle answer here, probably involving Taylor’s theorem. Let v and v be two solution vectors to the initial value problem and define the vector valued function f (x) = w(x) − v(x): f (x) = [w1 (x) − v1 (x), w2 (x) − v2 (x), . . . , wk−1 (x) − vk−1 (x), wk (x) − vk (x)] (50) The derivative of this is given by 0 0 f 0 (x) = w10 (x) − v10 (x), w20 (x) − v20 (x), . . . , wk−1 (x) − vk−1 (x), wk0 (x) − vk0 (x) which, by the equivalences given in the exercise, becomes f 0 (x) = w2 (x) − v2 (x), w3 (x) − v3 (x), . . . , wk (x) − vk (x), k X gj (wj − vj ) (51) j=1 From exercises 5.26-5.28 we know that we want an inequality of the form |f 0 (x)| ≤ A|f 0 (x)|. Using (50) and (51) we see that this inequality will hold if there exists some A such that |w1 (x) − v1 (x), w2 (x) − v2 (x), . . . , wk−1 (x) − vk−1 (x), wk (x) − vk (x)| ≤ A w2 (x) − v2 (x), w3 (x) − v3 (x), . . . , wk (x) − vk (x), k X gj (wj − vj ) j=1 The order of the components don’t matter when we’re taking the norm, and the rightmost term is a dot product, so this is equivalent to |w2 (x) − v2 (x), w3 (x) − v3 (x), . . . , wk (x) − vk (x), w1 (x) − v1 (x)| ≤ A|w2 (x) − v2 (x), w3 (x) − v3 (x), . . . , wk (x) − vk (x), g(x) · (w(x Exercise 6.1 (Proof 1) Outline of the proof We’ll see that L(P, f ) = 0 for every partition; therefore sup L(P, f ) = 0. By choosing R an appropriate partition we can make U (P, f ) arbitrarily small; therefore inf U (P, f ) = 0. We conclude that f = 0. sup L(P,f ) = 0 Although it’s easy to construct a partition such that L(P, f ) = 0, we have to show that 0 is the supremum of the set of all L(P, f ). To do this, let P be an arbitrary partition of [a, b] and let mi be an arbitrary interval of this arbitrary partition. If mi contains any point other than x0 then inf f (x) = 0 on this interval, so mi ∆αi = 0. If mi contains only the point x0 (that is, if the interval is [x0 , x0 ]) then ∆αi = [α(x0 ) − α(x0 )] = 0 and therefore mi ∆αi = 0. Therefore mi ∆αi = 0 for all i, which means L(P, f ) = 0. But P was an arbitrary partition, so L(P, f ) = 0 for all P . Therefore Z b f dx 0 = sup{L(P, f ) : P is a partition of [a, b]} = a inf U(P,f )=0 We need to show that for any > 0 we can find some partition P such that 0 ≤ U (P, f ) < . So let > 0 be given. We’re told that α is continuous at x0 , so by the definition of continuity we can find some δ > 0 such that d(x, x0 ) < δ → d(α(x), α(x0 )) < 99 2 (52) Let 0 < µ < M1 ∆α1 M2 ∆α2 M3 ∆α3 δ 2 and let P be the partition {a, x0 − µ, x0 + µ, b}. 
We now calculate each Mi : = = = M1 [α(x0 − µ) − α(a)] M2 [α(x0 + µ) − α(x0 − µ)] M3 [α(b) − α(x0 + µ)] = = = 0 · [α(x0 − µ) − α(a)] = 0 1 · [α(x0 + µ) − α(x0 − µ)] 0 · [α(b) − α(x0 − µ)] = 0 To determine a bound for M2 ∆α2 we apply (52) which allows us to conclude that α(x0 + µ) − α(x0 − µ) < from the fact that d(x0 − µ, x0 + µ) = 2µ < δ. M2 ∆α2 = α(x0 + µ) − α(x0 − µ) ≤ From this we have U (P, f ) X Mi ∆αi = 0 + + 0 = i=13 But this epsilon is arbitrarily small, so Z b f dx 0 = inf{U (P, f ) : P is a partition of [a, b]} = a Exercise 6.1 (Proof 2) Thanks to Helen Barclay (hbarcla2@mail.usf.edu) for this proof. The function f is discontinuous at only one point and α is continous at that point, so f ∈ R by theorem 6.10. Every partition of [a, b] will contain a point other than x0 so mi = 0 for every interval; therefore the infimum of L(P, f ) is zero; therefore Z b b Z f dα = a f dα = 0 a Exercise 6.2 Outline of the proof If we assume that f (x) 6= 0 for some x ∈ [a, b], then we R can construct a partition P for which L(P, f ) > 0. From this we conclude that sup L(P, f ) > 0 and therefore f 6= 0. The proof Suppose, for purposes of contradiction, that f (x0 ) = κ for some x0 ∈ [a, b] where κ is some arbitrary nonzero number. We’re told that f is continuous on a closed set, therefore f is uniformly continuous (theorem 4.19). Therefore there exists some δ > 0 such that |x0 − x| < δ → |f (x0 ) − f (x)| < κ2 . Now let 0 < µ < δ and let P be the four-element partition {a, x0 − µ, x0 + µ, b}. For clarity, let X = (x0 − µ, x0 + µ). Since |x0 − x| ≤ µ < δ for all x ∈ X, we have |f (x0 ) − f (x)| = |κ − f (x)| < κ2 for all x ∈ X. For this inequality to hold for all f (x) it must be the case that min f (X) > κ2 . This means that κ (2µ) > 0 2 R Since L(P, f ) > 0 for this particular P , it must be the case that sup L(P, f ) > 0 and therefore f dx 6= 0. L(P, f ) = X mi ∆xi ≥ m2 ∆x2 = min f (X)[(x0 + µ) − (x0 − µ)] = min f (X)(2µ) > We assumed that f (x) 6= 0 for some x ∈ [a, b] and concluded that then f (x) = 0 for all x ∈ [a, b]. R f dx 6= 0: by contrapositive, if R f dx = 0 Exercise 6.3 Outline of the proof We’ll see that U (P, f ) − L(P, f ) is equivalent to Mi − mi on a single arbitrarily small interval around 0. To make this difference arbitrarily small we will need to make sup f (x) − inf f (x) arbitrarily small by restricting x to a sufficiently small neighborhood of 0: this is possible iff f is continuous at 0. 100 Lemma 1: For each of the β functions, the upper and lower Riemann sums depend entirely on the intervals containing 0 Let P be an arbitrary partition of [−1, 1]. By theorem 6.4 we can assume, without loss of generality, that 0 is an element of this partition. We define α to be the partition element immediately preceding 0 and define ω to be the partition element immediately following 0 (so our partition has the form P = {x0 < x1 < . . . < α < 0 < ω < . . . < xn < xn+1 }). Note that α and ω must both exist since P is a finite set, although it’s possible that α = −1 or ω = 1. For every xi−1 < xi < 0 we have β(xi−1 ) = β(xi ) = 0, therefore ∆β = 0, therefore Mi ∆β = mi ∆β = 0. Similarly, for every 0 < xi−1 < xi we have β(xi−1 ) = β(xi ) = 1, therefore ∆β = 0, therefore Mi ∆β = mi ∆β = 0. This shows that Mi ∆β = mi ∆β = 0 for every interval that doesn’t contain 0. Therefore the values of U (P, f ) and L(P, f ) depend entirely on the intervals [α, 0] and [0, ω]. This holds for arbitrary P . The general form of Mi ∆β and mi ∆β on [α, 0] and [0, ω] Let P be an arbitrary partition. 
Let Mα denote the supremum of f (x) on the interval [α, 0]; similarly define mα , Mω , mω , etc. For all three of the β functions we have Mα ∆βα = Mα [β(0) − β(α)] = Mα β(0) (53) Mω ∆βω = Mω [β(ω) − β(0)] = Mω [1 − β(0)] (54) mα ∆βα = mα [β(0) − β(α)] = mα β(0) (55) mω ∆βω = mω [β(ω) − β(0)] = mω [1 − β(0)] (56) (57) Case 1: β(0) = 0 When β(0) = 0 we have X Mi ∆β = Mα β(0) + Mω [1 − β(0)] = Mω X mi ∆β = mα β(0) + mω [1 − β(0)] = mω Proof that continuity of f implies integrability: let > 0 be given. We’re told that f (0+) = f (0), so by definition of continuity at this point there exists some δ > 0 such that d(0, x) < δ → d(f (0), f (x)) < 2 Construct a partition P = {−1, 0, δ, 1}. The interval [0, δ] is compact, therefore by theorem 4.16 the points f −1 (Mω ) and f −1 (mω ) both exist in [0, δ] and therefore: 2 d(f −1 (mω ), 0) < δ → d(mω , f (0)) < 2 and therefore, by the triangle inequality, d(f −1 (Mω ), 0) < δ → d(Mω , f (0)) < d(Mω , mω ) ≤ d(Mω , f (0)) + d(f (0), mω ) < and therefore U (P, f ) − L(P, f ) = Mω − mω < R And since was arbitrary, this is sufficient to show that f dx exists. Proof by contrapositive that integrability of f implies right-hand continuity: Assume that f (0+) 6= f (0). From the negation of the definition of continuity we therefore know that there exists some > 0 such that for all δ we can find some 0 < x < δ such that |f (x) − f (0)| ≥ . So for any partition P let a be a point at which f (a) = Mω and let b be a point for which |f (b) − f (0)| ≥ . This gives us f (b) ≤ f (a) = Mω and mω ≤ f (0), therefore U (P, f ) − L(P, f ) = Mω − mω = f (a) − mω ≥ f (a) − f (0) ≥ f (b) − f (0) ≥ 101 The partition P was arbitrary so this inequality must be true for all possible partitions, so we can never find P such that inf U (P, f ) − sup L(P, f ) ≤ R and therefore f dx does not exist. Proof that this integral, if it exists, is equal R to f (0): from the fact that 0 is an elementR of [0, δ] we know that Mω ≥ f (0). We also know that L(P, f ) ≤ f (by theorem 6.4 and/or the definition of f ). This gives us the inequalities Mω = U (P, f ) ≥ f (0) Z L(P, f ) ≤ f dβ from which we have Z U (P, f ) − L(P, f ) ≥ f (0) − But we also know that mω ≤ f (0) and that U (P, f ) ≥ f dβ (58) R f . This gives us Z U (P, f ) ≥ f dβ mω = L(P, f ) ≤ f (0) from which we have Z U (P, f ) − L(P, f ) ≥ f dβ − f (0) (59) Combining (58) and (59) gives us Z f dβ − f (0) ≤ U (P, f ) − L(P, f ) = If this is to be true for all > 0 we must have R f dβ = f (0). Case 2: β(0) = 1 When β(0) = 1 we have X Mi ∆β = Mα β(0) + Mω [1 − β(0)] = Mα X mi ∆β = mα β(0) + mω [1 − β(0)] = mα From here the proof that place of ω. Case 3: β(0) = R f dβ = f (0) iff f (0) = f (0−) is almost identical to the previous proof with α in the 1 2 When β(0) = 1/2 we have X Mi ∆β = Mα β(0) + Mω [1 − β(0)] = X mi ∆β = mα β(0) + mω [1 − β(0)] = Therefore 1 [Mα + Mω ] 2 1 [mα + mω ] 2 1 1 [Mα − mα ] + [Mω − mω ] 2 2 The function f is integrable iff this term can be made arbitrarily small; we saw in case (1) that [Mω − mω ] can be made arbitrarily small iff f (0) = f (0+), and we saw in case (2) that [Mα − mα ] can be made arbitrarily small iff f (0) = f (0−). Therefore we can minimize U (P, f ) − L(P, f ) in this case iff f (0+) = f (0−) = f (0); that is, iff f is continuous at 0. U (P, f ) − L(P, f ) = 102 Proof that this integral, if it exists, is equal to f (0): fromRthe fact that 0 is an element of [0, δ] we know Rthat Mω ≥ f (0) and Mα ≥ f (0). We also know that L(P, f ) ≤ f (by theorem 6.4 and/or the definition of f ). 
This gives us the inequalities 1 1 [Mα + Mω ] = U (P, f ) ≥ [f (0) + f (0)] = f (0) 2 2 Z L(P, f ) ≤ f dβ from which we have Z U (P, f ) − L(P, f ) ≥ f (0) − f dβ But we also know that mω ≤ f (0) and that mα ≤ f (0) and that U (P, f ) ≥ Z U (P, f ) ≥ f dβ from which we have (60) R f . This gives us 1 1 [mα + mω ] = L(P, f ) ≤ [f (0) + f (0)] = f (0) 2 2 Z U (P, f ) − L(P, f ) ≥ f dβ − f (0) (61) Combining (58) and (59) gives us Z f dβ − f (0) ≤ U (P, f ) − L(P, f ) = If this is to be true for all > 0 we must have R f dβ = f (0). Part (d) of the exercise If f is continuous at 0 then we have f (0) = f (0+) = f (0−), so by case (1) we have R (2) we have f dβ2 = f (0) and by case (3) we have f dβ3 = f (0). R f dβ1 = f (0) and by case Exercise 6.4 Let P be an arbitrary partition of [a, b]. Let n = |P | − 1, so that n represents the number of intervals in the partition P . Every interval will contain at least one rational number (theorem 1.20b) and at least one irrational number (by pigeonhole principle, since |(xi , xi+1 )| = |R| > |Q|). Therefore we have Mi = 1, mi = 0 for all i. This gives us n X X Mi ∆xi = 1 · (xi+1 − xi ) = xn+1 − x0 = b − a 1 X mi ∆xi = n X 0 · (xi+1 − xi ) = 0 1 But P was an arbitrary partition, so this holds for all partitions, and therefore sup L(P, f ) = 0, inf U (P, f ) = 1 and therefore f 6∈ R. Exercise 6.5 Is f ∈ R if f 2 ∈ R? No. Consider the function f (x) = -1, 1, x∈Q x 6∈ Q We saw that f 6∈ R in exercise 6.4, but clearly f 2 (x) = 1 ∈ R. 103 Is f ∈ R if f 2 ∈ R? Yes. The function φ(x) = 6.11 to claim that √ 3 x is a continuous function, so we let h(x) = φ(f 3 (x)) = f (x) and appeal to theorem f3 ∈ R → h ∈ R → f ∈ R Note √ that the claim that h(x) = φ(f 3 (x)) = f (x) relied on the fact that x → x3 is a one-to-one p mapping, so 3 2 3 that x = x. In contrast, the mapping x → x is not one-to-one as can be seen by the fact that (−1)2 6= −1. Exercise 6.6 Outline of the proof We can define a partition for which f is discontinuous on 2n intervals each of which has length 3−n , so the Riemann sum across these intervals is proportional to (2/3)n . This sum approaches 0 as n → ∞. The proof Let Ei be defined as in sec. 2.44. Note that E0 consists of one interval of length 1; E1 consists of two intervals of length 13 ; E2 consists of 4 intervals of length 19 ; and, in general, En will consist of 2n intervals of length 3−n . With each En we can associate a set Fn that contains the endpoints of the intervals contained in En . That is, we define Fn = {a1 < b1 < a2 < b2 < . . . < a2n < b2n } where each [ai , bi ] is an interval in En . Since every point of the Cantor set – and therefore every point at which f might be discontinuous – is in an interval of the form [ai , bi ] we’ll choose a partition that lets us isolate these intervals. Let m and M represent the lower and upper bounds of f on [0, 1]. Choose an arbitrary > 0. Choose n large enough that 2n+1 (M − m) 3n−1 > and choose δ such that 1 0 < δ < n+1 3 These choice will be justified later. Define Pn to be Pn = {0 = a1 < (b1 + δ) < (a2 − δ) < (b2 + δ) < (a3 − δ) < (b3 + δ) < . . . < (a2n − δ) < b2n = 1} This partition contains 2n+1 points and therefore contains 2n+1 − 1 segments. We must now show that we can make U (Pn , f ) − L(Pn , f ) arbitrarily small. U (Pn , f ) − L(Pn , f ) = 2n+1 X−1 (Mi − mi )∆xi i=1 We’ll separate this sum into the intervals on which f is continuous (i = 2, 4, 6, . . .) and the intervals on which f might contain discontinuities (i = 1, 3, 5, . . .). 
n = 2 X n 2 X (M2i − m2i )∆x2i + (M2i+1 − m2i+1 )∆x2i+1 i=1 i=0 The function f is continuous on every interval of the form [bi + δ, ai+1 − δ], and these intervals are represented by the lefthand summation. We can therefore refine Pn such that 2n ≤ X + (M2i+1 − m2i+1 )∆x2i+1 2 i=1 104 We know the exact value of the ∆x terms. For Pn , we have d(ai , bi ) = 3−n and therefore ∆x2i+1 = 3−n + 2δ. Similarly, we know that Mi ≤ M and mi ≥ m, so we have 2n X ≤ + (M − m) 2 i=1 From our choice of δ < 1 3n+1 < 1 3n 1 + 2δ 3n this becomes 2n X ≤ + (M − m) 2 i=1 1 3n−1 This summation is constant with respect to i so it becomes simply 1 n = + 2 (M − m) 2 3n−1 From our choice of n we have 3n−1 > 2n+1 (M −m) ≤ and therefore 3−(n−1) < /2n+1 (M − m). 2n (M − m) + n+1 2 2 (M − m) =≤ + = 2 2 Exercise 6.7a If f ∈ R, then from theorem 6.12 we have Z 1 c Z f dx = 1 Z f dx + 0 0 f dx (62) c Now consider the partition P of [0, c] defined by P = {0, c}. This partition has only a single interval, so we have Z c inf f (x) · c = L(P, f ) ≤ f dx ≤ U (P, f ) = sup f (x) · c 0<x<c From (62), adding R1 c 0<x<c 0 f dx to each term of this inequality gives us Z 1 Z f dx + inf f (x) · c ≤ c 0<x<c 1 Z 1 f dx ≤ 0 f dx + sup f (x) · c 0<x<c c Taking the limit of each term of this inequality as c → 0, we see that f (x) · c → 0 while the center term remains constant. The resulting inequality is Z 1 Z 1 Z 1 lim f dx ≤ f dx ≤ lim f dx c→0 c c→0 0 c which of course implies that Z lim c→0 1 Z f dx = c 1 f dx 0 which is what we wanted to prove. Exercise 6.7b Outline of the proof R i Pn We construct a function f for which f = R i=1 (−1) iP : this is the alternating harmonic series, which converges n (see Rudin’s remark 3.46). We then see that |f | = i=1 1i : this is the harmonic series, which diverges (theorem 3.28). 105 The proof Choose any c ∈ (0, 1) and consider the following function defined on [c, 1]. 1 (−1)n (n + 1), n+1 < x < n1 for some n ∈ N f (x) = 0, otherwise Claim 1 : f is integrable on any interval [c, 1] Choose an arbitrarily small > 0. Let N1 be the smallest harmonic number greater than c. We want to choose δ such that each harmonic number in [c, 1] is contained in a distinct interval of radius δ: so choose δ such that for reasons that will become clear later. Let 0 < δ < 21 [ N1 − c]. We must also make sure that δ < 2(N +1)(N 1 +2) Hc represent the partition containing the elements {c} ∪ n ± δ : n ∈ N, n < 1c ∩ [0, 1]. • The partition Hc contains one interval of the form [c, N1 − δ]. 
1 1 M1 ∆x1 = sup f (x)∆x1 = (−1) (N + 1)∆x1 = (−1) (N + 1) −δ− N N +1 1 1 − δ = (−1)N − (N + 1)δ = (−1)N (N + 1) N (N + 1) N N N The function is constant over this interval, so we have m1 ∆x1 = inf f (x)∆x1 = sup f (x)∆x1 = M1 therefore |M1 − m1 |∆x1 = 0 (63) • The partition Hc contains one interval of the form [1 − δ, 1] for which Mn ∆xn = sup f (x)∆xn = 0∆x = 0 mn ∆xn = inf f (x)∆xn = −2∆x = −2δ therefore |Mn − mn |∆xn = 2δ • The partition contains N − 1 intervals of the form 1i − δ, 1i + δ for which (64) Mi ∆xi = sup f (x)∆xi ≤ sup |f (x)|2δ ≤ |i + 1|2δ mi ∆xi = inf f (x)∆xi ≥ inf −|f (x)|2δ ≥ −|i + 1|2δ therefore |Mi − mi |∆xi ≤ |Mi | + |mi | ≤ 4δ|i + 1| h i 1 • The partition contains N − 2 intervals of the form j+1 + δ, 1j − δ for which Mj ∆xj = sup f (x)∆xj = (−1)j (j+1) 1 1 −δ− −δ j j+1 = (−1)j (j+1) (65) 1 − 2δ j(j + 1) = (−1)j 1 − 2δ(j + 1) j The function is constant over this interval, so we have mj ∆xj = inf f (x)∆xj = sup f (x)∆x = M = (−1) j 1 − 2δ(j + 1) j therefore (Mj − mj )∆xj = 0 106 (66) Summing equations (63)-(66), we have |U (P, f ) − L(P, f )| = |(M1 − m1 )∆x1 + (Mn − mn )∆xn + N N −1 X X (Mi − mi )∆xi + (Mj − mj )∆xj | i=2 ≤ |(M1 − m1 )|∆x1 + |(Mn − mn )|∆xn + j=2 N X |Mi − mi |∆xi + i=2 ≤ 0 + 2δ + N X 4δ|i + 1| + N −1 X i=2 = = N −1 X |Mj − Mj |∆xj j=2 0 j=2 (N + 1)(N + 2) − 1 2 2δ(N + 1)(N + 2) 2δ + 4δ 2(N +1)(N +2) , Earlier in the proof we chose δ such that δ < so we obtain the inequality |U (P, f ) − L(P, f )| ≤ And was arbitrary, so this proves that f ∈ R. Claim 2 : The value of limc→0 R1 c f is defined Adding the values for the M terms, we have U (P, f ) = M1 + Mn + N X i=2 ≤ = (−1) N (−1) N Mi + N −1 X Mj i=2 N −1 N X X 1 1 j (−1) |i + 1|2δ + − (N + 1)δ + 0 + − 2δ(j + 1) N j j=2 i=2 N −1 X 1 1 j (−1) − (N + 1)δ δ[(N + 1)(N + 2) − 6] + − 2δ(j + 1) N j j=2 Remember that we first chose a value for c, which forced us to use a particular value for N ; our choice of δ came afterward, so we’re still free to select δ small enough to make this last inequality become: N −1 X (−1)N 1 + (−1)j + N j j=2 U (Pc , f ) ≤ We defined N1 to be the smallest harmonic number greater than c, so taking the limit of both sides as c → 0 gives us lim U (Pc , f ) = c→0 ∞ X (−1)j j=2 1 N → 0 and N → ∞ as c → 0. Therefore 1 + = 1 − ln(2) + j which proves that Z c→0 1 f dx ≥ 1 − ln(2) lim c The values for the m terms are almost identical to the M terms, and adding them gives us the inequality Z 1 lim f dx ≤ 1 − ln(2) c→0 c Together, of course, these two inequalities prove that Z 1 lim f dx = 1 − ln(2) c→0 c 107 Claim 3 : The value of limc→0 R1 c |f | is not defined If we follow the logic from claims 1 and 2, we find that adding the values for the M terms gives us U (P, f ) = M1 + Mn + N X Mi + i=2 Mj i=2 N N −1 X 1 X 1 1 − (N + 1)δ + + − 2δ(j + 1) |i + 1|2δ + N 2 i=2 j j=2 N −1 X 1 1 1 − (N + 1)δ + δ[(N + 1)(N + 2) − 6] + − 2δ(j + 1) N 2 j j=2 ≤ = N −1 X Again following the logic of part 2, we select δ small enough to gives us Z 1 ∞ X 1 f dx = lim + c→0 c j j=2 But weRknow from chapter 3 that this series doesn’t converge, so the lefthand limit doesn’t exist and therefore 1 limc→0 c f dx does not exist. Exercise 6.8 Choose an arbitrary real number c > 1. Let n be the greatest integer such that n < c.Define the partition Pn of [0, n] to be Pn = {x0 =P1, 2, 3, 4, . . . , n = xn−1 }. P First, assume that f (n) diverges. Because f (x) > 0 this means that f (n) is unbounded (theorem 3.24). The fact that f is monotonically decreasing allows us to derive the following chain of inequalities. 
Z c Z n f (x) dx ≥ f (x) dx 1 1 ≥ L(Pn , f ) = n−2 X mi ∆xi i=1 = n−2 X f (i + 1) i=1 (Note: the index is going to n − 2 instead of n − 1 because of the awkward numbering of the partition: note that the partition only goes to xn−1 ). Taking the limit as c → ∞, we have Z ∞ Z c n X f (x) dx = lim f (x) dx ≥ lim f (i + 1) c→∞ 1 n→∞ 1 i=2 The integral on the left-hand side of this chain of inequalities is greater than the unbounded sum on the righthand side, so we conclude P that the integral does not converge. Next assume that f (n) converges to some κ ∈ R. Choose c, n, Pn+1 as defined above. The fact that f is monotonically decreasing allows us to derive the following chain of inequalities. Z c Z n+1 f (x) dx ≤ f (x) dx 1 1 ≤ U (Pn+1 , f ) = n−1 X Mi ∆xi i=1 = n−1 X i=1 108 f (i) Taking the limit as c → ∞, we have Z ∞ Z c f (x) dx ≤ lim → ∞ f (x) dx = lim c→∞ 1 n 1 n−1 X f (i) i=1 Rc We’re told that f (x) > 0, so we know that 0 f (x) is a monotonically increasing function of c. The previous Rc Rc inequality tells us that 1 is bounded above. Therefore by theorem 3.14 we know that limc→∞ 1 f dx is defined. Exercise 6.9 By the definition given in exercise 6.8: Z ∞ Z 0 f (x)g (x) dx = lim c→∞ 0 c f (x)g 0 (x) dx 0 The integral from 0 to c is finite, so we can use integration by parts: Z ∞ Z f (x)g 0 (x) dx = lim f (c)g(c) − f (0)g(0) − 0 c→∞ c f 0 (x)g(x) dx 0 −1 Applying this to the given function with f (x) = (1 + x) and g(x) = sin x, we have f 0 (x) = −(1 + x)−2 and g 0 (x) = cos x. Integration by parts therefore yields Z c Z ∞ sin c sin 0 sin x cos x dx = lim − + dx 2 c→∞ 1 + c 1+x 1 0 (1 + x) 0 Since −1 ≤ sin c ≤ 1, the non-integral terms both tend to zero as c → ∞. This leaves us with Z ∞ Z c cos x sin x dx = lim dx c→∞ 1 + x (1 + x)2 0 0 By the definition in exercise 6.8 this is equivalent to Z ∞ Z ∞ cos x sin x dx = dx 1 + x (1 + x)2 0 0 Exercise 6.10 a: method 1 This proof was due to Helen Barclay (hbarcla2@mail.usf.edu). We rewrite uv as the exponential of a natural log: 1 1 ln up + ln v q uv = (up )1/p (v q )1/q = exp(ln((up )1/p (v q )1/q )) = exp p q The exponential function f (x) = ex has a strictly positive second derivative; therefore its first derivative is monotonically increasing (theorem 5.11); therefore f (x) is a convex function (proof in exercise 5.14, definition of convex in exercise 4.23). By definition of convexity with λ = 1/p, 1 − λ = 1/q we have 1 1 1 1 up vq exp ln up + ln v q ≤ exp (ln up ) + exp (v q ) = + p q p q p q Combining these last two inequalities, we have uv ≤ up vq + p q which is what we were asked to prove. 109 Exercise 6.10 a: method 2 Using the variable substitutions s = up , t = v q , a = p1 , (1 − a) = uv ≤ 1 q we can rewrite up vq + p q (67) as sa t1−a ≤ as + (1 − a)t Multiplying both sides by s−a ta−1 gives us 1 ≤ as1−a ta−1 + (1 − a)s−a ta We can rewrite this previous inequality in two equivalent ways: 1≤a 1≤a a a−1 t t + (1 − a) s s s 1−a t + (1 − a) s −a t (68) (69) If s ≤ t then t/s ≥ 1 and therefore a a−1 a t t + (1 − a) ≥ a(1)a−1 + (1 − a)(1)a = 1 s s so the inequality in (68) holds. If s ≥ t then s/t ≥ 1, and therefore a s 1−a t + (1 − a) s −a t ≥ a(1)1−a + (1 − a)(1)−a = 1 so the inequality in (69) holds. But both of these inequalities are just (67) with a variable change, so we see that (67) holds when s ≥ t or when t ≥ s – that is, it always holds. And this is what we were asked to prove. Exercise 6.10 a: method 3 This proof was due to Boris Shektman (don’t email him). 
Define the function f (u) to be f (u) = up vq + − uv p q This function has the derivative f 0 (u) = up−1 − v Note that f 0 (u) = 0 has only one positive real solution, which occurs at u = v 1/(p−1) . We can show algebraically 1 = pq . For u < v q/p we have f 0 (u) < 0 and for u > v q/p we have f 0 (u) > 0, so the the point u = v q/p is that p−1 the unique global minimum of f (u). Evaluating the function at this value of u gives us vq vq + − v q/p v p q 1 1 = + v q − (v (p+q)/p ) p q f (v q/p ) = We’re given that 1/p + 1/q = 1 and, from this, we also know that p + q = pq. Making these substitutions gives us = v q − v pq/p = v q − v q 0 Therefore the unique minimum of f is f (v q/p ) = 0, and f (u) ≥ 0 for all other u. 110 Exercise 6.10b From part (a), we have Z b Z f g dα ≤ a a b fp gq 1 + dα = p q p b Z 1 f dα + q p a Z b g q dα = a 1 1 + =1 p q Exercise 6.10c Define κ and λ to be b Z Z |f |p dα, κ= λ= b |g|q dα a a Define fˆ and ĝ to be f (x) g(x) fˆ(x) = 1/p , ĝ(x) = 1/q κ λ These two functions are “normalized” in the sense that Z b p Z b Z |f | κ 1 b p |fˆ|p dα = |f | dα = = 1 dα = κ κ a κ a a Z b Z b Z b q 1 λ |g| dα = |ĝ|q dα = |g|q dα = = 1 λ λ a λ a a ˆ We know that f and ĝ are Riemann integrable by theorem 6.11, and therefore by theorem 6.13 and part (b) of this exercise we know that |fˆ| and |ĝ| are Riemann integrable and that Z b fˆĝ dα ≤ a By definition of fˆ and ĝ this becomes: Z b a f b Z |fˆ||ĝ| dα ≤ 1 a g κ1/p λ1/q b Z dα ≤ a |f | |g| dα ≤ 1 κ1/p λ1/q Multiplying both sides by the constant κ1/p λ1/q gives us Z b Z b f g dα ≤ |f g| dα ≤ κ1/p λ1/q a a which, by the definition of κ and λ, is equivalent to b Z Z f g dα ≤ a (Z b )1/p (Z b p |f g| dα ≤ q |f | dα a )1/q b |g| dα a a which is what we wanted to prove. Exercise 6.10d Proof by contrapositive. If the inequality were false for the improper integrals, we would be able to find some function f such that Z c 1/p Z c 1/q Z c lim f g dα > lim |f |p dα |g|q dα c→∞ a c→∞ a a or Z b lim c→x0 (Z b p c→x0 x0 )1/q b |f | dα f g dα > lim x0 )1/p (Z q |g| dα x0 But this is a strict inequality, so in either of these cases we would have to be able to find a neighborhood of ∞ or x0 for which this inequality still held. And this would give us a proper integral for which Holder’s inequality doesn’t hold. By contrapositive, the fact that Holder’s inequality is valid for proper integrals shows that it’s true for improper integrals. 111 Exercise 6.11 If we can prove that ||f +g|| ≤ ||f ||+||g||, then we can prove the given inequality by letting f = f −g and g = g−h. Rb ||f + g||2 = a |f + g|2 dα definition of || · || Rb 2 ≤ a (|f | + |g|) dα The triangle inequality is established for | · | Rb 2 2 = a |f | + 2|f g| + |g| dα Rb Rb Rb = a |f |2 dα + 2 a |f g| dα + a |g|2 dα properties of integrals: theorem 6.12 nR o1/2 nR o1/2 R Rb 2 b b b ≤ a |f | dα + 2 a |f |2 dα |g|2 dα + a |g|2 dα Holder’s inequality (exercise 6.10) a = ||f ||2 + 2||f || ||g|| + ||g||2 definition of || · || 2 = (||f || + ||g||) Taking the square root of the first and last terms of this chain of inequalities, we have ||f + g|| ≤ ||f || + ||g|| This inequality must hold for any f, g ∈ R. Letting f = f − g and g = g − h this becomes ||f − h|| ≤ ||f − g|| + ||g − h|| which is what we were asked to prove. Exercise 6.12 We’re told that f is continuous, so it has upper and lower bounds m ≤ f (x) ≤ M . Choose any > 0. 
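A hedged numerical spot-check (my own sketch) of Hölder's inequality from Exercise 6.10(c), which Exercises 6.11 and 6.12 both lean on. The exponents p = 3, q = 3/2 and the two sample functions are arbitrary choices, and the integrals are approximated by left-endpoint Riemann sums on [0, 1] with α(x) = x; the argument for Exercise 6.12 resumes immediately below.

```python
# Hedged spot-check of Holder's inequality from Exercise 6.10(c):
#   integral |f g| <= (integral |f|^p)^(1/p) * (integral |g|^q)^(1/q)
# with sample functions on [0, 1] and 1/p + 1/q = 1.

def riemann(h, n=100000):
    """Left-endpoint Riemann sum of h over [0, 1]."""
    return sum(h(k / n) for k in range(n)) / n

p, q = 3.0, 1.5                      # 1/3 + 2/3 = 1
f = lambda x: x * x + 1.0
g = lambda x: 1.0 - x

lhs = riemann(lambda x: abs(f(x) * g(x)))
rhs = (riemann(lambda x: abs(f(x)) ** p) ** (1 / p)
       * riemann(lambda x: abs(g(x)) ** q) ** (1 / q))
print(lhs, rhs, lhs <= rhs)          # expect True
```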
Since f ∈ R(α), we can find some partition P such that U (P, f, α) − L(P, f, α) < 2 M −m Having now determined a particular partition P we can define g as suggested in the hint. This function is simply a series of straight lines connecting f (xi−1 ) to f (xi ). This becomes clearer if we rewrite g in an algebraically equivalent form: f (xi ) − f (xi−1 ) g(t) = f (xi−1 ) + (t − xi−1 ) xi − xi−1 We see that on any interval [xi−1 , xi ] the function g(t) is bounded between f (xi−1 ) and f (xi ). That is, mi min{f (xi−1 ), f (xi )} ≤ g(t), t ∈ [xi−1 , xi ] ≤ max{f (xi−1 ), f (xi )} ≤ Mi And of course we have similar bounds on f : mi ≤ f (t) ≤ Mi and therefore, since f (t), g(t) ≤ Mi and −f (t), −g(t) ≤ −mi , we have |f (t) − g(t)| ≤ |Mi − mi | = Mi − mi (70) Similarly, since Mi ≤ M and −mi ≤ −m we have Mi − mi ≤ M − m We’ve now established all of the inequalities we need to complete our proof. 112 (71) Rb ||f − g||2 = a |f (x) − g(x)|2 dα Pn R xi+1 = i=0 xi |f (x) − g(x)|2 dα Pn R x ≤ i=0 xii+1 |Mi − mi |2 dα Rx Pn = i=0 |Mi − mi |2 xii+1 dα Pn = i=0 |Mi − mi |2 ∆α(xi ) Pn ≤ i=0 (M − m)(Mi − mi ) ∆α(xi ) Pn = (M − m) i=0 (Mi − mi ) ∆α(xi ) = (M − m)[U (P, f, α) − L(P, f, α)] ≤ (M − 2 m) M−m definition of || · || integral property 6.12c from (70) integral property 6.12a from (71) integral property 6.12a Definition of U and L we chose P so that this would hold 2 = we chose so that this would hold Taking the square root of the first and last terms gives us ||f − g|| ≤ which is what we were asked to prove. Exercise 6.13a Letting u = t2 we have du/dt = 2t and therefore dt = du du = √ 2t 2 u Using the change of variables theorem (6.19) we see that the given integral is equivalent to Z (x+1)2 sin u √ du f (x) = 2 u x2 √ Integration by parts (theorem 6.22) with F = 1/2 u and g = sin u du gives us − cos u √ f (x) = 2 u (x+1)2 Z (x+1)2 − x2 x2 cos u du 4u3/2 which expands to cos x2 − cos(x + 1)2 + − 2(x + 1) 2x By the triangle inequality this gives us (x+1)2 Z f (x) = |f (x)| = ≤ < = = = = − cos(x + 1)2 cos x2 + − 2(x + 1) 2x x2 Z (x+1)2 x2 cos u du 4u3/2 cos u du 4u3/2 Z (x+1)2 − cos(x + 1)2 cos x2 cos u + + du 2(x + 1) 2x 4u3/2 x2 Z (x+1)2 1 1 1 + + du 3/2 2(x + 1) 2x 4u 2 x 1 1 1 1 1 + + − 2(x + 1) 2x 2 x+1 x 1 1 1 + + 2(x + 1) 2x 2x(x + 1) x + (x + 1) + 1 2x(x + 1) 2(x + 1) 1 = 2x(x + 1) x 113 (72) Note that the strict inequality is justified by the fact that cos u ≤ 1 for all u ∈ [x2 , (x + 1)2 ] but cos is not a constant function so cos u < 1 for some u ∈ [x2 , (x + 1)2 ]. Exercise 6.13b In the previous problem we used integration by parts to determine that − cos(x + 1)2 cos x2 f (x) = + − 2(x + 1) 2x (x+1)2 Z cos u 4u3/2 x2 Multiplying by 2x gives us −x cos(x + 1)2 2xf (x) = + cos x2 − 2x x+1 Z (x+1)2 x2 cos u du 4u3/2 which is algebraically equivalent to 2xf (x) = cos x2 − cos[(x + 1)2 ] + cos(x + 1)2 − 2x x+1 Z (x+1)2 x2 cos u du 4u3/2 Letting r(x) be defined as r(x) = cos(x + 1)2 − 2x x+1 Z (x+1)2 x2 cos u du 4u3/2 we have, by the triangle inequality, |r(x)| ≤ ≤ = = = = < Z (x+1)2 cos(x + 1)2 cos u + 2x du 3/2 x+1 4u 2 x Z (x+1)2 1 1 + 2x du x+1 4u3/2 x2 1 1 1 1 + 2x − x+1 2 x+1 x 1 1 +x x+1 x(x + 1) 1 1 + x+1 x+1 2 x+1 2 x So we see that 2xf (x) = cos x2 − cos[(x + 1)2 ] + r(x) where |r(x)| < 2/x. Exercise 6.13c We’re asked to find the lim sup and lim inf of the function xf (x) = 1 cos(x2 ) − cos([x + 1]2 ) + r(x) 2 We established in part (b) that r(x) → 0 as x → ∞. 
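As a hedged numerical check (my own sketch, not part of the original solution) of the part (b) estimate just cited, namely 2x f(x) = cos(x²) − cos((x+1)²) + r(x) with |r(x)| < 2/x, the integral f(x) can be approximated with composite Simpson's rule:

```python
import math

# Hedged numerical check of the Exercise 6.13(b) estimate: with
# f(x) = integral of sin(t^2) from x to x+1, the remainder
# r(x) = 2x f(x) - (cos(x^2) - cos((x+1)^2)) should satisfy |r(x)| < 2/x.

def simpson(h, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    s = h(a) + h(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * h(a + k * (b - a) / n)
    return s * (b - a) / (3 * n)

for x in [2.0, 5.0, 10.0, 20.0]:
    fx = simpson(lambda t: math.sin(t * t), x, x + 1.0)
    r = 2 * x * fx - (math.cos(x * x) - math.cos((x + 1) ** 2))
    print(x, abs(r), 2.0 / x)        # |r(x)| stays below 2/x
```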
It’s also clear that this function is bounded above by 1 and bounded below by −1, but it’s not immediately clear that these bounds are the lim sup and the lim inf. 114 Remark: The supremum and infimum of cos(x2 ) − cos([x + 1]2 ) are never obtained The supremum would be obtained if we could find x such that cos(x2 ) − cos([x + 1]2 ) = 2 (73) which would occur precisely when x2 = 2kπ, [x + 1]2 = (2j + 1)π, j, k ∈ N After some tedious algebra, we can see that these two equalities hold when 2 π[2(j − k) + 1] − 1 √ =k 2 2π (74) (75) But this equation can’t hold. If it did, we could rearrange it algebraically to give us π 2 [2(j − k) + 1]2 − 2π[2j + 3k + 1] + 1 = 0 This would allow us to use the quadratic formula to give an algebraic expression for π; but this is impossible as π is a transcendental number. The infimum of cos(x2 ) − cos([x + 1]2 ) is not obtained for the same reason. Proof: The lim sup of xf (x) is 1 Let 1 > > 0 and N ∈ N be given. Let δ be chosen so that 0<δ< cos−1 (1 − ) 2π Define the function p(m) to be p(m) = Its derivative with respect to m is p0 (m) = √ π[2m + 1] − 1 √ 2 2π 2 2π(π[2m + 1] − 1) which is strictly increasing. Therefore we can make the derivative as large as we want by choosing a sufficiently large m. Specifically, we are able to choose M ∈ N such that p0 (M )δ > 1 and M > N . By the mean value theorem we have p(M + δ) − p(M ) = p0 (ξ)δ, ξ ∈ (M, M + δ) From the strictly increasing nature of p0 and our choice of M we have p(M + δ) − p(M ) = p0 (ξ)δ > p0 (M )δ > 1 Therefore p(x) must take an integer value for at least one x ∈ [M, M + δ]. Let κ represent one such x. The two important properties of κ are that p(κ) is an integer and that κ − M < δ (so that κ itself is “almost” an integer). We now have 2 π[2(M + δ) + 1] − 1 √ =k∈N p(κ) = p(M + δ) = 2 2π If we define j to be j = k + M this becomes 2 π[2(j − k + δ) + 1] − 1 √ =k∈N 2 2π Reversing the algebraic steps that led from (74) to (75) tells us that there exists some x ∈ R such that x2 = 2kπ, [x + 1]2 = (2(j + δ) + 1)π, j, k ∈ N Using this value of x in our original function, we have xf (x) = 1 1 cos(x2 ) − cos([x + 1]2 ) + r(x) = [cos(2kπ) − cos(2(j + 1)π + 2δπ) + r(x)] 2 2 115 Using trig identities, this becomes xf (x) = 1 [1 − {cos(2(j + 1)π) cos(2π) − sin(2(j + 1)π) sin(2π)} + r(x)] 2 1 [1 − (−1) cos(2π) − 0 + r(x)] 2 Finally, from our original choice of δ as an inverse cosine, this becomes = xf (x) = 1 [1 + (1 − ) + r(x)] 2 1 [r(x) − ] 2 =1+ From part (b), we know that r(x) → 0 as x → ∞: lim xf (x) = 1 − x→∞ 2 And was arbitrary, so the supremum is lim sup xf (x) = 1 x→∞ The proof that lim inf xf (x) = −1 is similar. Exercise 6.13d Z ∞ 2 sin(t ) dt = 0 ∞ Z X √ 0 = ∞ Z X ∞ X (n+1)π sin(t2 ) dt (n+1)π (−1)n | sin(t2 )| dt nπ (−1)n 0 ≤ ∞ X (−1)n 0 = (76) nπ √ √ 0 = √ Z √(n+1)π √ | sin(t2 )| dt nπ Z √(n+1)π √ 1 dt (and is similarly bounded below) nπ ∞ √ √ X √ π (−1)n n+1− n (77) 0 (78) √ √ We saw in exercise 3.6.a that the sequence { n + 1 − n} is decreasing. Therefore, by the alternating series theorem (3.43) the series in (77) converges. Therefore, by the comparison test (3.25) the series in (76) converges and therefore the integral in (76) converges. Exercise 6.14a Letting u = et we have du/dt = et = u and therefore dt = du du = et u Using the change of variables theorem (6.19) we see that the given integral is equivalent to Z ex+1 f (x) = ex 116 sin(u) du u (79) We use integration by parts as we did in 6.13(a). 
f (x) = − cos u u ex+1 ex+1 Z − ex ex cos u du u2 which expands to f (x) = cos ex cos ex+1 − − x e ex+1 ex+1 Z cos u du u2 ex By the triangle inequality this gives us |f (x)| = ≤ < < = = = cos ex cos ex+1 − − ex ex+1 Z ex+1 cos u du u2 ex Z ex+1 cos ex cos ex+1 cos u + + du ex ex+1 u2 ex Z ex+1 1 1 1 + x+1 + du 2 ex e u x e 1 1 1 1 + x+1 + x − x+1 ex e e e 1 1 e−1 + x+1 + x+1 x e e e 2e ex+1 2 ex Therefore ex |f (x)| < 2. Note that the strictness of the inequality is justified by the fact that cos u ≤ 1 for all u ∈ [ex , ex+1 ] but cos is not a constant function so cos u < 1 for some u ∈ [ex , ex+1 ]. Exercise 6.14b In the previous exercise we used integration by parts to determine that f (x) = cos ex cos ex+1 − − x e ex+1 Z ex+1 cos u du u2 ex Multiplying by ex gives us x x e f (x) = cos e − e −1 x+1 cos e −e x Z ex+1 ex Letting r(x) be defined as r(x) = −e x Z ex+1 ex we have, by the triangle inequality, 117 cos u du u2 cos u du u2 |r(x)| −e ≤ x Z ex+1 ex −ex = Z ex+1 ex Z x ex+1 |e | = ex cos u du u2 1 du u2 1 du u2 1 1 − x+1 ex e e−1 e 2 e |ex | = = < So we see that ex f (x) = cos ex − e−1 cos ex+1 + r(x) where |r(x)| < 2e−1 . Exercise 6.15a Using integration by parts with F = f 2 (x) and g = dx gives us Z b Z b b 1= f 2 (x) dx = xf 2 (x) a − 2f (x)f 0 (x)x dx a a which evaluates to 2 Z 2 1 = bf (b) − af (a) − b 2f (x)f 0 (x)x dx a which, since f (a) = f (b) = 0, becomes b Z 2f (x)f 0 (x)x dx 1=− a Dividing both sides by −2 gives us −1 = 2 Z b f (x)f 0 (x)x dx a which is the desired equality. Exercise 6.15b Applying Holder’s inequality to the last equation in part (a) gives us (Z )1/2 (Z )1/2 Z b b b −1 = f (x)f 0 (x)x dx ≤ |x2 f 2 (x)| dx · |[f 0 (x)]2 | dx 2 a a a Since f (x)f 0 (x) must be negative at some point (by mean value theorem, since f (a) = f (b) = 0) while x2 f 2 and [f 0 (x)]2 are strictly positive, this inequality must be strict: (Z )1/2 (Z )1/2 Z b b b −1 2 2 0 2 0 = f (x)f (x)x dx < |x f (x)| dx · |[f (x)] | dx 2 a a a Squaring the first and last term in this chain of inequalities, we have Z b Z b 1 < |x2 f 2 (x)| dx · |[f 0 (x)]2 | dx 4 a a 118 All of the normed values are squares, so the norms are redundant: Z b Z b 1 2 2 < x f (x) dx · [f 0 (x)]2 dx 4 a a Exercise 6.16a We can integrate the given function separately over infinitely many intervals of length 1: Z ∞ Z n+1 ∞ X [x] [x] s dx = dx s s+1 xs+1 x 1 n n=1 We have [x] = n on the interval [n, n + 1) so this becomes Z n+1 ∞ X = s n n=1 ∞ X = sn n=1 n dx xs+1 −1 sxs n+1 x=n ∞ X 1 1 = n s− n (n + 1)s n=1 We then split up the summation into three parts. = 1 X n=1 ∞ n ∞ X 1 X 1 1 + n s− n s n n (n + 1)s n=2 n=1 Evaluating the first summation and changing the index of the third gives us =1+ ∞ X n=2 ∞ n X 1 1 (n − 1) − ns n=2 (n)s ∞ X n − (n − 1) =1+ ns n=2 =1+ ∞ X 1 ns n=2 And since 1 is clearly equal to 1/ns when n = 1, this is equivalent to = ∞ X 1 s n n=1 Exercise 6.16b We’re asked to evaluate the integral Z ∞ x − [x] s −s dx s−1 xs+1 1 Having determined that [x] was integrable in part (a) we can split up the integral as follows: Z ∞ Z ∞ s 1 [x] = −s dx + s dx s s−1 x xs+1 1 1 Elementary calculus allows us to calculate the left integral: Z ∞ s s [x] − +s dx s−1 s−1 xs+1 1 Z ∞ [x] =s dx s+1 x 1 This, as we saw in part (a), is equivalent to = ζ(s) = 119 Exercise 6.17 I’m going to change the notation of this problem a bit to make it clearer (to me, at least) by letting f = G and f 0 = g. We’re told that α is a monotonically increasing function on [a, b] and that f is continuous. 
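Before continuing with Exercise 6.17, here is a hedged numerical check (my own sketch) of the Exercise 6.16(a) identity ζ(s) = s ∫_1^∞ [x] x^{−s−1} dx for s = 2; the improper integral is truncated at a finite N, so the agreement with ζ(2) = π²/6 is only approximate (the truncation shows up in the third decimal place).

```python
import math

# Hedged check of the Exercise 6.16(a) identity for s = 2:
# zeta(s) = s * integral from 1 to infinity of floor(x) / x^(s+1) dx.
# On [n, n+1] the integrand is n / x^(s+1), so each piece can be integrated
# exactly; the tail beyond x = N is simply dropped.

def zeta_via_integral(s, N=2000):
    total = 0.0
    for n in range(1, N):
        # s * integral_n^(n+1) n * x^(-s-1) dx = n * (n^(-s) - (n+1)^(-s))
        total += n * (n ** (-s) - (n + 1) ** (-s)) / s
    return s * total

def zeta_series(s, N=2000):
    return sum(n ** (-s) for n in range(1, N))

print(zeta_via_integral(2), zeta_series(2), math.pi ** 2 / 6)
```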
We’re asked to prove that Z b Z b 0 α(x)f (x) dx = f (b)α(b) − f (a)α(a) − f (x) dα a a Most of the work is done for us by the theorem 6.22 (integration by parts) which tells us that Z b α(x)f 0 (x) dx = f (b)α(b) − f (a)α(a) − b Z a f (x)α0 (x) dx a So our proof is reduced to simply proving that b Z b Z f (x)α0 (x) dx f (x) dα = a (80) a It’s tempting to appeal to elementary calculus to show that dα = α0 (x) dx, but we can prove this more formally. Proving this more formally Let > 0 be given. We’re told that f is continuous and that α is monotonically inreasing; therefore f ∈ R(α). Let P be an arbitrary partition of [a, b] such that U (P, f, α) − L(P, f, α) < 2 . On any interval [xi−1 , xi ] the mean value theorem tells us that there is some ti ∈ [xi−1 , xi ] such that α(xi−1 ) − α(xi ) = α0 (ti )[xi−1 − xi ] or, to express the same equation with different notation, ∆αi = α0 (ti )∆xi (81) By theorem 6.7 (c) we have n X f (x) dα < a i=1 and also n X b Z f (ti )∆αi − f (ti )α0 (ti )∆xi − 2 b Z f (x)α0 (x) dx < a i=1 By the inequality (81) we have n X n X f (ti )∆αi = i=1 f (ti )α0 (ti )∆xi i=1 Therefore, by (82) and (83) and the triangle inequality, we have b Z b Z f (x)α0 (x) dx < f (x) dα − a a which, in order to be true for any > 0, requires that Z b Z b f (x) dα = a a This proves (80), and this completes the proof. 120 f (x)α0 (x) dx (82) 2 (83) Exercise 6.18 The derivative γ10 (t) = ieit is continuous, therefore γ1 (t) is rectifiable (theorem 6.27) and its length is given by Z 2π Z 2π 1 = 2π |ieit | = 0 0 The derivative γ20 (t) = 2ieit is continuous, therefore γ2 (t) is rectifiable and its length is given by Z 2π Z 2π it |2ie | = 2 = 4π 0 0 The arc γ3 is not rectifiable Proof by contradiction. If γ3 were rectifiable then its length would be given by the integral Z 2π Z 2π 1 1 −1 0 2πi sin |γ (t)| dt = + 2 2πit cos e2πit sin(1/t) dt t t t 0 0 Z 2π 1 1 1 = 2π sin dt (84) − cos t t t 0 But this integral isn’t defined, since Riemann integrals are defined only for bounded functions and this function is unbounded: 1 0 0 = lim 2π |sin (2kπ) − 2kπ cos (2kπ)| lim |γ (t) dt| = lim γ t→0 k→∞ k→∞ 2kπ = lim 2π |±2kπ| k→∞ = lim 4π 2 k = ∞ k→∞ Therefore the integral in (84) doesn’t exist, which means the arc length of γ3 is not defined, which means that γ3 is not rectifiable. Exercise 6.19 Lemma 1: If f : A → B and G : B → C are both one-to-one, then g ◦ f : A → C is one-to-one By definition of “one-to-one”, we have f (x) = f (y) → x = y, g(s) = g(t) → s = t Therefore g(f (x)) = g(f (y)) → f (x) = f (y) → x = y and so g ◦ f is one-to-one. Lemma 2: If g ◦ f : A → C is one-to-one and f : A → B is one-to-one and onto, then g : B → C is one-to-one. Proof by contrapositive. Suppose that g is not one-to-one. Then we could find x, y ∈ B such that g(x) = g(y) but x 6= y. But f is one-to-one and onto, so there exist unique s, t ∈ A such that f (s) = x, f (t) = y. So we have g(f (s)) = g(f (t)) but f (s) 6= f (t) and therefore s 6= t. So g ◦ f is not one-to-one. By contrapositive, the lemma is proven. Lemma 3: If f : [a, b] → [c, d] is a continuous, one-to-one, real function then f is either strictly decreasing or strictly increasing Proof by contrapositive. Let f be a continuous real function. If f were not strictly increasing and not strictly decreasing then we could find some x < y < z such that either f (y) ≤ f (x) and f (y) ≤ f (z) or such that f (y) ≥ f (x) and f (y) ≥ f (z). 
From the intermediate value property of continuous functions we know that we must then be able to find x0 , y 0 such that x ≤ x0 < y < z 0 < z, f (x0 ) = f (z 0 ) so that f is not one-to-one. By contrapositive, the lemma is proven. 121 Lemma 4: If P = {xi } is a partition of γ2 , then P 0 = {φ(xi )} is a partition of γ1 From lemma 3 we know that φ is strictly increasing or decreasing, and since φ(c) = a it must be strictly increasing. And φ is onto, so φ(d) = b. Let P be an arbitrary partition of [c, d]: P = {xi } = {c = x0 , x1 , . . . , xn , xn+1 = d} From the properties of φ we have a = φ(c) = φ(x0 ) < φ(x1 ) < φ(x2 ) < . . . < φ(xn ) < φ(xn+1 ) = φ(d) = b and therefore {φ(xi )} = {a = φ(x0 ), φ(x1 ), φ(x2 ), . . . , φ(xn ), φ(xn+1 ) = b} which is a partition of [a, b]. Lemma 5: If P = {xi } is a partition of γ1 , then P 0 = φ−1 (xi ) is a partition of γ2 We’re told that φ is a continuous one-to-one function from [a, b] to [c, d], so its inverse exists and is a continuous mapping from [a, b] onto [c, d] (theorem 4.17). By lemma 3, this means that φ−1 is either strictly increasing or strictly decreasing, and since φ−1 (a) = c it must be strictly increasing. And φ−1 is onto, so φ−1 (b) = d. Let P be an arbitrary partion of [a, b]: P = {xi } = {a = x0 , x1 , . . . , xn , xn+1 = b} From the properties of φ−1 we have c = φ−1 (a) = φ−1 (x0 ) < φ−1 (x1 ) < φ−1 (x2 ) < . . . < φ−1 (xn ) < φ−1 (xn+1 ) = φ−1 (b) = d and therefore {φ−1 (xi )} = {c = φ−1 (x0 ), φ−1 (x1 ), φ−1 (x2 ), . . . , φ−1 (xn ), φ−1 (xn+1 ) = d} which is a partition of [c, d]. Proof 1: γ1 is an arc iff γ2 is an arc If γ1 is an arc then γ1 is a one-to-one function. We’re told that φ is one-to-one, therefore by lemma 1 γ1 ◦ φ is one-to-one. And γ2 = γ1 ◦ φ so γ2 is one-to-one. There γ2 is an arc. If γ2 is an arc then γ2 = γ1 ◦ φ is one-to-one. We’re told that φ is a one-to-one and onto function, therefore by lemma 2 we know that γ1 is one-to-one. Therefore γ1 is an arc. Proof 2: γ1 is a closed curve iff γ2 is a closed curve Assume that γ1 is a closed curve so that γ1 (a) = γ1 (b). We saw in lemma 4 that φ(c) = a and φ(d) = b, so we have γ2 (c) ≡ γ1 (φ(c)) = γ1 (a) = γ1 (b) = γ1 (φ(d)) ≡ γ2 (d) Therefore γ2 (c) = γ2 (d), which means that γ2 is a closed curve. Now assume that γ2 is a closed curve so that γ2 (c) = γ2 (d). We saw in lemma 5 that φ−1 (a) = c and φ (b) = d so we have −1 γ1 (a) = γ1 (φ(φ−1 (a))) ≡ γ2 (φ−1 (a)) = γ2 (c) = γ2 (d) = γ2 (φ−1 (b)) ≡ γ1 (φ(φ−1 (b))) = γ1 (b) Therefore γ1 (a) = γ1 (b), which means that γ1 is a closed curve. 122 Proof 3: γ1 and γ2 have the same length Let P be an arbitrary partitioning of [a, b]. Using the notation from definition 6.26, we have Λ(P, γ2 ) = n X |(γ2 (xi ) − γ2 (xi−1 )| < M i=1 From the definition of γ2 , this becomes Λ(P, γ2 ) = n X |(γ1 (φ(xi )) − γ1 (φ(xi−1 ))| < M i=1 From lemma 4 we see that {φ(xi )} describes a partitioning of [c, d]: Λ(P, γ2 ) = Λ(φ(P ), γ1 ) Similarly, if we let P 0 be an arbitrary partition of [a, b] we can use lemma 5 to show that Λ(P 0 , γ1 ) = Λ(φ−1 (P ), γ2 ) This means that the set {Λ(P, γ1 )} is identical to the set of {Λ(P, γ2 )} so clearly these sets have the same supremums, which means that Λ(γ1 ) = Λ(γ2 ). Proof 4: γ1 is rectifiable iff γ2 is rectifiable Having shown previously that Λ(γ1 ) = Λ(γ2 ), it’s clear that Λ(γ1 ) is finite iff Λ(γ2 ) is finite and so γ1 is rectifiable iff γ2 is rectifiable. Exercise 7.1 Let {fn } be a sequence of functions that converges uniformly to f . Let Mn = sup |fn | and let M = sup |f |. Let > 0 be given. 
Because of the uniform convergence of {fn } → f we can find some N such that n > N → |fn (x)| = |fn (x) − f (x) + f (x)| ≤ |fn (x) − f (x)| + |f (x)| ≤ + M This tells us that for all n > N the function fn is bounded above by M + . There are only finitely many n ≤ N , each of which is bounded by some Mn . So by choosing max{M + , M1 , M2 , . . . , MN } we have found a bound for all fn . Exercise 7.2a Let > 0 be given. We’re told that {fn } and {gn } converge uniformly on E so we can find N, M such that n, m > N → |fn (x) − fm (x)| < , 2 n, m > M → |gn (x) − gm (x)| < 2 By choosing N ∗ > max{N, M } we have n, m > N ∗ → |(fn (x) + gn (x)) − (fm (x) − gm (x))| ≤ |fn (x) − fm (x)| + |gn (x) − gm (x)| < which shows that fn + gn converges uniformly on E. 123 Exercise 7.2b If {fn } and {gn } are bounded functions then by exercise 7.1 they are uniformly bounded, say by F and G. Let > 0 be given and choose δ such that r , , δ < min , 3 3F 3G We’re told that {fn } and {gn } converge uniformly on E so we can find N, M such that n, m > N → |fn (x) − fm (x)| < δ, n, m > M → |gn (x) − gm (x)| < δ |fn gn − fm gm | = |(fn − fm )(gn − gm ) + fm (gn − gm ) + gm (fn − fm )| ≤ |(fn − fm )||(gn − gm )| + |fm ||(gn − gm )| + |gm ||(fn − fm )| ≤ δ 2 + |fm |δ + |gm |δ ≤ δ 2 F δ + Gδ r 2 +G = ≤ +F 3 3F 3G Exercise 7.3 Let {fn } be a sequence such that fn (x) = x. This function obviously converges uniformly to the function f (x) = x on the set R. Let {gn } be a sequence of constant functions such that gn (x) = n1 . This sequence obviously converges uniformly to the function g(x) = 0. Their product is the sequence {fn gn } where fn (x)gn (x) = x/n. It’s clear that this sequence converges pointwise to fn (x)gn (x) = 0. To show that fn gn is not uniformly convergent, let > 0 be given. Choose an arbitrarily large n ∈ Z and choose t ∈ R such that t > (n(n + 1)). We now have |fn (t)gn (t) − fn+1 (t)gn+1 (t)| = t t (n(n + 1)) t − = > = n n+1 n(n + 1) n(n + 1) This shows that the necessary requirements for uniform convergence given in theorem 7.8 do not hold. Exercise 7.4 Holy shit this exercise is a mess. For what values of x does the series converge absolutely? (Incomplete) For values of x > 0 we have X 1 X 1 1 X 1 1 = ≤ 1 + n2 x x 1/x + n2 x n2 By the comparison test this shows that f (x) converges absolutely. If x = 0 then we have X X 1 = |1| = ∞ 1 + n2 x This series clearly doesn’t converge, absolutely or otherwise. If x < 0 things get more complicated: If x = −1/n2 for any n ∈ N then the nth term of the series is undefined and therefore f (x) is undefined. If x 6= −1/n2 for any n ∈ N then we can use the fact that For what intervals does the function converge uniformly? If E is any interval of the form [a, b] with a > 0 then we have sup |fn (x)| = fn (a) = 124 1 1 + n2 a And therefore we have X sup |fn (x)| = X 1 1X 1 ≤ 2 1+n a a n2 P This shows that sup |fn (x)| converges by the comparison test, and so by theorem 7.10 we see that the series P fn (x) converges uniformly on E. If E is any interval of the form [a, b] with b < 0 that does not contain any elements of the form −1/n2 , n ∈ N For what intervals does the function fail to converge uniformly? Define the set X = {xn } with xn = −1/n2 . The function f will fail to converge uniformly on any interval that contains an element of X ∪ 0 or has an element of X ∪ 0 as a limit point of E. Proof: Let E be an arbitrary interval. If E contains P any xn ∈ X then f (xn ) undefined and so f fails to converge uniformly on E. 
If E contains 0 then f (0) = 1 = ∞, but we will never find some finite N such that PN 1 1 − f (0) < so it’s clear that f fails to converge uniformly on E. Now suppose that some xn ∈ X is a limit point of E. The nth term of f is unbounded near xn , so f is unbounded near xn , and therefore limt→xn f (t) = ∞. From this we have lim lim f (t) = lim ∞ = ∞ n→∞ t→xn n→∞ On the other hand, if we first fix a value of t and take the limit of f as n → ∞ we have lim f (t) = lim n→∞ n→∞ 1 =0 1 + n2 t and therefore lim lim f (t) = lim 0 = 0 t→xn n→∞ t→xn Exercise 7.5 If x ≤ 0 or x > 1 then x = 0 for all n and therefore in these cases limn→∞ fn (x) = 0. For any other x, choose an integer N large enough so that N > 1/x. For all n > N we now have x > 1/n and therefore fn (x) = 0, so for this case we have limn→∞ fn (x) = 0. This exhausts all possible values of x, so {fn } converges pointwise to the continuous function f (x) = 0. To show that this function doesn’t converge uniformly to f (x) = 0 let 1 > > 0 be given and let n be an arbitrarily large integer. We can easily verify that 1 fn = sin2 (nπ + π/2) = 1 n + 1/2 and therefore the definition of uniform convergence is not satisfied. P The last part of the question asks us to use the series fn to show that absolute convergence for all x does not imply uniform convergence. The proof of this is simple: P Pwe’ve already shown that {fn } is not uniformly convergent but |fn (x)| converges to f (x) = | sin2 (π/x)| so |fn | is absolutely convergent. Exercise 7.6 To show that the series doesn’t converge absolutely for any x: X (−1)n X x2 X 1 x2 + n 1 = + ≥ 2 2 n n n n The rightmost sum is the harmonic series, which is known to diverge. By comparison test the leftmost series must also diverge. 125 To show that the series converges uniformly in every bounded interval [a, b], let > 0 be given. Define X to be sup{|a|, |b|}. Define the partial sum fm to be fm = m X (−1)n n=1 x2 + n n2 We can rearrange this algebraically to form fm = m X (−1)n n=1 m X 1 1 + x2 (−1)n 2 n n n=1 We know that the alternating harmonic series converges (theorem 3.44, or example 3.40(d)) and that converges (theorem 3.28). Therefore by the Cauchy criterion for convergence we can find N such that p>q>N →| P 1/n2 p X 1 (−1)n | < n 2 n=q and we can also find M such that p>q>M →| p X (−1)n n=q 1 |< n2 2|X|2 So, by choosing p > q > max{N, M } we have (for all x ∈ [a, b]): |fp (x) − fq (x)| = p X (−1)n p X 1 1 + x2 (−1)n 2 n n n=q (−1)n p X 1 1 (−1)n 2 + x2 n n n=q n=q ≤ p X n=q ≤ ≤ x2 + 2 2X 2 + = 2 2 Exercise 7.7 We can establish a global maximum for |fn (x)| The derivative of fn (x) is fn0 (x) = 1 + nx2 − 2nx2 1 − nx2 = (1 + nx2 )2 (1 + nx2 )2 √ This derivative is zero only when nx2 = 1, which occurs only when x = ±1/ n, at which point we have ±1 ±1 fn √ = √ n 2 n These extrema must be the global√extrema for the function, since fn has no asymptotes and fn (x) → 0 as x → ±∞. Therefore |fn (x)| ≤ 1/(2 n). fn converges uniformly to f (x) = 0 Clearly fn converges pointwise to f (x) = 0, since for any x we have lim fn (x) = lim n→∞ n→∞ 126 x =0 1 + nx2 √ Now let > 0 be given and choose n sufficiently large so that 1/(2 n) < : this can be done by choosing 1 n > 42 . From the previously established bounds, we now have 1 |fn (x) − f (x)| = |fn (x) − 0| = |fn (x)| ≤ √ < 2 n By theorem 7.9 this is sufficient to prove that fn converges uniformly to f (x) = 0. When does f 0 (x) = lim fn0 (x)? We’ve established that f (x) = 0, so clearly f 0 (x) = 0 for all x. 
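(A hedged numerical check, my own addition: the bound sup_x |f_n(x)| = 1/(2√n) derived above can be verified on a grid; the uniform-convergence argument resumes immediately below.)

```python
import math

# Hedged check of the Exercise 7.7 bound: the maximum of |x / (1 + n x^2)|
# over all real x equals 1 / (2 sqrt(n)), attained at x = 1 / sqrt(n).
# The supremum is estimated by sampling a fine grid of positive x values.

def f_n(n, x):
    return x / (1.0 + n * x * x)

grid = [k / 10000.0 for k in range(1, 100001)]        # x in (0, 10]
for n in [1, 10, 100, 1000]:
    observed = max(abs(f_n(n, x)) for x in grid)
    predicted = 1.0 / (2.0 * math.sqrt(n))
    print(n, observed, predicted)
```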
The limit of fn0 is given by: 1 − nx2 0 x= 6 0 lim fn0 (x) = lim 2 4 = 1 x=0 n→∞ n→∞ n x + 2nx2 + 1 This shows that it’s not necessarily true that [lim fn ]0 = lim[fn0 ] even if fn converges uniformly. Exercise 7.8 Proof of uniform convergence Let > 0 be given. Let fm represent the partial sum fm = m X cn I(x − xi ) n=1 P P We’re told that |cn | converges, so |cn | satisfies the Cauchy convergence criterion, so we can find N such that p > q > N implies p X |cn | < n=q For the same values of p > q > N we also have, by the triangle inequality and the fact that I(x − xn ) ≤ 1, |fp − fq | = p X cn I(x − xn ) ≤ n=q p X |cn I(x − xn )| ≤ n=q p X |cn | < n=q so {fn } converges uniformly by theorem 7.8 Proof of continuity Let t be an arbitrary point (not necessarily in the interval (a, b)). This t either is or isn’t a limit of some subsequence of {xn }. If t is not a limit point of some subsequence then we can find some neighborhood around t that P contains no points of {xn }; the function I(t − xn ) is constant on this interval for all n and therefore f (x) = cn I(x − xn ) is constant on this interval for all n. (see below for a a more thorough justification of this claim). And if f is constant on an interval around t then it is clearly continuous at t. P |cn | we can find N such that P∞Now suppose that t a limit point of {xn }. By the Cauchy convergence of |c | < . Choose a neighborhood around t small enough that it does not contain the first N terms of {xn }; n=N n let δ represent the radius of this neighborhood. If we choose s such that |s − t| < δ, then I(s − xn ) = I(t − xn ) for n = 1, 2, . . . , N . From this we have |s − t| < δ → f (s) − f (t) ≤ ∞ X (cn I(s − xn ) − cn I(t − xn )) ≤ n=N +1 ∞ X n=N +1 127 |cn [I(s − xn ) − I(t − xn )]| Each I(x − xn ) is either 0 or 1 so |I(s − xn ) − I(t − xn )| ≤ 1, so we have |s − t| < δ → f (s) − f (t) ≤ ∞ X |cn [I(s − xn ) − I(t − xn )]| ≤ n=N +1 ∞ X |cn | < n=N +1 That is, we’ve found δ such that |s − t| < δ → f (s) − f (t) < which, by definition, means that f is continuous at t. We have therefore shown that f is continuous at t under all cases. But t was an arbitrary point (not necessarily confined to (a, b)) and therefore f is continuous at all points. Justification of I(x − xn ) being constant on some interval As mentioned above, if t is not a limit point of some subsequence then we can find some neighborhood around t that contains no points of {xn }. Let A be the set of elements of {xn } that are greater than t, and let B be the set of elements of {xn } that are smaller than t. If |s − t| < δ then there are no elements of {xn } between s and t and so every element of A is greater than both t and s and every element of B is smaller than both t and s. From this we see that I(s − xn ) = I(t − xn ) = 0 if xn ∈ A andP I(s − xn ) = I(t − xn ) = 1 if xn ∈ B. This means that I(x − xn ) has the same value for every x ∈ Nδ (t); so cn I(x − xn ) has the same value for every x ∈ Nδ (t); and therefore f (x) has the same value for every x ∈ Nδ (t). 
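As a hedged numerical sketch of the series just analyzed in Exercise 7.8 (my own illustration; the coefficients c_n = (−1)^n/n² and the points x_n below are arbitrary sample choices, not taken from the exercise):

```python
import math

# Hedged sketch for Exercise 7.8: partial sums of f(x) = sum of c_n * I(x - x_n),
# with sum |c_n| convergent.  Sample choices: c_n = (-1)^n / n^2 and x_n the
# fractional parts of n*sqrt(2), a convenient sequence that is dense in (0, 1).
# The gap between two partial sums is bounded by the tail of sum |c_n|,
# uniformly in x, which is the Cauchy estimate used in the proof above.

def I(t):
    """Rudin's unit step: 0 for t <= 0, 1 for t > 0."""
    return 1.0 if t > 0 else 0.0

c = [(-1) ** n / (n * n) for n in range(1, 2001)]
xs = [math.sqrt(2) * n % 1.0 for n in range(1, 2001)]

def partial(x, m):
    """Sum of the first m terms of the series at the point x."""
    return sum(c[n] * I(x - xs[n]) for n in range(m))

grid = [k / 500.0 for k in range(501)]
p, q = 2000, 200
gap = max(abs(partial(x, p) - partial(x, q)) for x in grid)
tail = sum(abs(c[n]) for n in range(q, p))
print(gap, tail, gap <= tail)        # expect True
```

The printed gap between the two partial sums stays below the tail of Σ|c_n| at every sampled x, which is exactly the uniform Cauchy bound used in the proof.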
Exercise 7.9 We’re told that {fn } → f uniformly on E, so we can find N such that n > N → |fn (t) − f (t)| < 2 for all t ∈ E This holds for all t ∈ E, so it must also hold for the elements of {xn }, which means that n > N → |fn (xn ) − f (xn )| < 2 (85) We’re also told that {fn } is a uniformly convergent sequence of continuous functions, so f is continuous; and since {xn } → x we can find M such that n > M → |f (xn ) − f (x)| < 2 (86) Using the triangle inequality and equations (85) and (86), we have |fn (xn ) − f (x)| ≤ |fn (xn ) − f (xn )| + |f (xn ) − f (x)| < + = 2 2 Converse statement 1 Suppose the converse is “Let {fn } be a sequence of continuous functions that converges uniformly to f . Is it true that if lim fn (xn ) = f (x) then {xn } → x?”. The answer to this question is “no”. Consider the function fn (x) = x/n and the sequence {xn } = {1} (an infinite sequence of 1s). The sequence {fn } converges uniformly to f (x) = 0 on the set [0, 1], so we have lim fn (xn ) = lim fn (1) = 0 = f (0) n→∞ n→∞ It’s clear that {xn } 6→ 0, so the converse does not hold. 128 Converse statement 2 Suppose the converse is “Let {fn } be a sequence such that lim fn (xn ) = f (x) for some function f and for all sequences of points xn ∈ E such that {xn } → x ∈ E. Is it true that {fn } converges uniformly to f on E? The answer to this question is “no”. Let fn (x) = 1/(nx). Let E be the harmonic set {1/n}. Then fn converges pointwise on E to f (x) = 0. Although 0 is a limit point of E, 0 is not an element of E and this converse statement is only concerned with sequences {xn } that converge to some x ∈ E. So the only sequences of points xn ∈ E such that {xn } → x ∈ E are sequences where every term of the sequence is eventually just x itself (that is, sequences for which there is some N such that n > N → xn = x). So clearly for every such sequence we must have lim fn (xn ) = lim fn (x) = f (x). But, despite the fact that lim fn (xn ) = f (x) whenever {xn } → x ∈ E, it’s clear that fn (x) does not converge uniformly to f (x) = 0 on E (to prove this, choose any and any n and then choose x < (n)−1 ). Exercise 7.10a f is continuous at every irrational number P We know that 1/n2 converges, so by the Cauchy criterion we can choose N such that b>a>N → b X 1 < 2 n 4 a The fractional part of (nx) will never be zero because x is irrational, so we can choose δ such that 0<δ< 1 min{(nx), 1 − (nx) : n < N } 2 This guarantees that 0 < (n[x − δ]) < (nx) for all n < N and (nx) < (n[x + δ]) < 1 for all n. More importantly, this choice of δ guarantees that (n[x − δ]) = (nx) − (nδ) for n < N . We now derive the following chain of inequalities: # "X N ∞ X (nx) (nx ± nδ) (nx) (nx ± nδ) |f (x) − f (x ± δ)| = − − + n2 n2 n2 n2 1 N +1 ≤ N X 1 (nx) (nx ± nδ) − n2 n2 < N X 1 < + ∞ ∞ X X (nx ± nδ) (nx) + n2 n2 N +1 N +1 (nx) (nx ± nδ) − n2 n2 N X (nx) n2 1 = + (nx) (nδ) − 2 ± 2 n n N X ±(nδ) 1 n2 + + 4 4 + 2 2 We chose our value of δ after we fixed a particular value of N , so we could have chosen δ small enough to make this sum less than /2. In doing so, we have |f (x) − f (x ± δ)| < + = 2 2 And this would hold not just for x ± δ, but for all y such that |x − y| < δ. That is: |x − y| < δ → |f (x) − f (y)| < which means that f is continuous at x. 129 Lemma 1: P (nx)/n2 converges uniformly to f This is an immediate consequence of the fact that (nx) < 1 for all n, x and that be given. By the Cauchy criterion we can choose N such that b>a>N → and therefore we have f (x) − 1/n2 converges. 
Let > 0 b X 1 < 2 n a ∞ N N X X (nx) X (nx) (nx) = − n2 n2 n2 n=1 n=1 n=1 ∞ ∞ X X (nx) ≤ n2 = P n=N +1 n=N +1 ∞ X (nx) < n2 n=N +1 1 < n2 f is discontinuous at every rational number When nx is not an integer the following limits are identical: lim t→x+ (nt) (nx) (nt) = 2 = lim t→x− n2 n2 n When nx is an integer then this equality no longer holds. Instead, we have lim t→x+ We know by lemma 1 that (nt) 0 = 2, n2 n ∞ X fn (x+) = f (x+), n=1 P (nt) 1 = 2 n2 n P (nx)/n2 converges uniformly to f , so we also have (by theorem 7.11): ∞ X We know that lim t→x− fn (x−) = f (x−) n=1 1/n2 converges, so by the Cauchy criterion we can choose N such that b>a>N → b X 1 < 2 n 4 a If f were continuous at our rational point x, then we would have |f (x+) − f (x−)| = 0. But when we actually calculate this difference we find: |f (x+) − f (x−)| = ∞ X fn (x+) − n=1 ∞ X fn (x−) n=1 We determined that fn (x+) = fn (x−) unless nx is an integer; this occurs when n is a multiple of q, at which point nx = [mq]x = mq[p/q] = mp. Most of the terms of the summations cancel out, leaving us with ∞ X |f (x+)−f (x−)| = m=1 fmq (x+) − ∞ X m=1 fmq (x−) = ∞ X ∞ ∞ X X 1 (mq) (mq) (0) 1 − lim − = = 2 2 2 t→x+ [mq]2 t→x− [mq]2 n [mq] [mq] m=1 m=1 m=1 lim This is clearly not equal to zero and it can’t be made arbitrarily small because we have no freedom to select particular values for m or q. Therefore f (x+) 6= f (x−) and therefore f is not rational at x. But x was an arbitrary rational number, and therefore f is not continuous at any rational number. Exercise 7.10b We’ve shown that the discontinuities of f are the rational numbers, and these are clearly countable and clearly dense in R. 130 Exercise 7.10c To f ∈ R we need only show that (nx)/n2 ∈ R for any fixed value P of n. then use the fact that R We can RP R P prove that 2 (nx)/n converges uniformly to f (lemma 1) and theorem 7.16 to show that (nx)/n2 = (nx)/n2 = f . Let n ∈ N be given. Let [a, b] be an arbitrary interval. Assume without loss of generality that a and b are integers. Z b b−1 Z k+1 2 X (nx) (nx) dx dx = n n2 a k k=a For 0 ≤ δ < 1 we have (n(k + δ)) = (nk + nδ) = (nδ), so we can integrate over (0, 1) instead of (k, k + 1) without changing the value of the integral. Z b b−1 Z 1 2 X (nx) (nx) dx dx = n n2 a 0 k=a As x ranges from 0 to 1, nx ranges from 0 to n. So we split up the interval (0, 1) into n intervals of length 1/n: Z b a b−1 n−1 2 X X Z [j+1]/n (nx) (nx) dx = dx n n2 j=0 j/n k=a To make this integral a bit more manageable we make the variable substitution u = nx. When x = j/n we have u = j; when x = [j + 1]/n we have u = j + 1; and we have dx = n1 du. Using this variable substitution in the previous integral: Z b b−1 n−1 2 X X Z j+1 (u) (nx) dx = du n n3 a j=0 j k=a Again, there is no difference between the value of (u) on the intervals (j, j + 1) and the value of (u) on the interval (0, 1): Z b b−1 n−1 2 X X Z 1 (u) (nx) dx = du n n3 a j=0 0 k=a On the interval (0, 1) we have (u) = u, so we can finally start integrating this thing. After a bunch of trivial calculus, we have Z b 2 u2 b−a−1 (nx) dx = [b − a − 1][n] 3 = n 2n n2 a And therefore Z b Z f (x) = a a ∞ Z ∞ (nx) X b (nx) X b − a − 1 = = n2 n2 n2 n=1 n=1 a n=1 ∞ bX We know this rightmost sum converges (specifically, it converges to π 2 [b − a − 1]/6), therefore therefore f ∈ R. Rb a f (x) exists and Exercise 7.11 P Let > 0 be given. We’re told fn has uniformly bounded partial sums, so let M represent the upper Pthat bound of the partial sums of | fn |. 
Exercise 7.11

Let ε > 0 be given. We're told that Σ fn has uniformly bounded partial sums, so let M be a bound such that the partial sums An(x) = Σ_{k=1}^{n} fk(x) satisfy |An(x)| ≤ M for all n and all x. We're also told that gn → 0 uniformly and that the gn decrease monotonically (so gn ≥ 0), so we can find N such that

n > N → gn(x) < ε/(2M) for all x

Let Sn represent the partial sum Σ_{k=1}^{n} fk gk. Our result is proven if we can show that {Sn} satisfies the Cauchy criterion for uniform convergence (theorem 7.8). To do this, we choose q > p > N. Following the logic of theorem 3.42 (summation by parts), we have

|Sq − Sp−1| = |Σ_{k=p}^{q} fk gk| = |Σ_{k=p}^{q−1} Ak(gk − gk+1) + Aq gq − Ap−1 gp| ≤ M [Σ_{k=p}^{q−1} (gk − gk+1) + gq + gp]

The (gk − gk+1) terms telescope, leaving us with

|Sq − Sp−1| ≤ M [(gp − gq) + gq + gp] = 2M gp < 2M · ε/(2M) = ε

for every x, which is what we needed.

Exercise 7.12

Let ε > 0 be given. We're told that 0 ≤ fn, f ≤ g and that ∫ g < ∞. Let M = ∫_0^∞ g. Define G(t) to be

G(t) = ∫_0^t g(x) dx

Since G(t) is continuous (theorem 6.20), is increasing (because g ≥ 0), and has an upper bound of M, we know that

lim_{t→∞} G(t) = M

so we can find some N such that

n > N → M − G(n) < ε/2

which is equivalent to saying that

n > N → ∫_0^∞ g − ∫_0^n g = ∫_n^∞ g < ε/2     (87)

We're given that fn → f uniformly, so for any specified value of c we can find some M′ such that, for all x,

m > M′ → |fm(x) − f(x)| < ε/(2c)     (88)

from which we can conclude that

m > M′ → |∫_0^c fm − ∫_0^c f| ≤ ∫_0^c |fm − f| ≤ c · ε/(2c) = ε/2     (89)

So, for the given value of ε > 0 we choose c > N so that (87) holds and then, based on this choice of c, we choose n > M′ so that (89) holds. Since 0 ≤ fn, f ≤ g we also have |fn − f| ≤ g, so by the triangle inequality together with (89) and (87) we then have

|∫_0^∞ fn − ∫_0^∞ f| ≤ |∫_0^c (fn − f)| + |∫_c^∞ (fn − f)| ≤ ∫_0^c |fn − f| + ∫_c^∞ |fn − f| ≤ ε/2 + ∫_c^∞ g < ε/2 + ε/2 = ε

We can make this hold for arbitrary ε by taking n sufficiently large, which is simply saying that

lim_{n→∞} (∫_0^∞ fn − ∫_0^∞ f) = 0

from which we conclude

lim_{n→∞} ∫_0^∞ fn = ∫_0^∞ f

Exercise 7.13a

For any choice of x1, the sequence {fn(x1)} is a bounded sequence of real numbers in the compact set [0, 1]. Therefore there exists some subsequence of functions, call it {fn^(1)}, for which the sequence of real numbers {fn^(1)(x1)} converges to some point in [0, 1] (theorem 3.6a). Now choose any x2. The sequence {fn^(1)(x2)} is still a bounded sequence of real numbers in [0, 1], so we can find a further subsequence {fn^(2)} for which both {fn^(2)(x1)} and {fn^(2)(x2)} converge. This can be repeated a countable number of times; taking the diagonal of these subsequences produces a sequence of functions {Fn} for which {Fn(x)} converges at every rational number and at every point of discontinuity of any fn (this set is countable by theorems 4.30, 2.12, and 2.13).

Now let t be a rational number or a point of discontinuity of some fn. We have constructed {Fn} so that {Fn(t)} converges to some point, so we simply define

f(t) = lim_{n→∞} Fn(t),   if t is rational or a point of discontinuity

Now let t be an irrational number at which every fn is continuous. The rational numbers are dense in R, so we can construct a sequence {rn} of rational numbers such that {rn} → t. Each Fn is continuous at t, so lim_{k→∞} Fn(rk) = Fn(t) for every n. Therefore we simply define the value of f(t) to be

f(t) = lim_{n→∞} Fn(t),   if t is irrational and each Fn is continuous at t

Exercise 7.13: example

An example where a sequence {fn} of monotonically increasing functions converges pointwise to f and f is not continuous:

fn(x) = 0 for x ≤ 0,
fn(x) = 1 − 1/(1 + nx) for 0 < x < 1,
fn(x) = 1 − 1/(1 + n) for x ≥ 1

lim_{n→∞} fn(x) = 0 for x ≤ 0, and lim_{n→∞} fn(x) = 1 for x > 0
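A brief numerical illustration of this example (not part of the original solution; the sample points and the value of n are arbitrary choices): the Python sketch below evaluates fn at a large n and shows the values approaching the discontinuous limit, 0 for x ≤ 0 and 1 for x > 0.

def f_n(n, x):
    # the monotonically increasing functions from the example above
    if x <= 0:
        return 0.0
    if x < 1:
        return 1 - 1 / (1 + n * x)
    return 1 - 1 / (1 + n)

for x in (-1.0, 0.0, 1e-3, 0.5, 2.0):
    print(x, f_n(10**6, x))   # near 0 for x <= 0, near 1 for x > 0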
Exercise 7.13b

Let α = inf f(x) and let β = sup f(x) (both values are finite, since 0 ≤ fn ≤ 1). Let ε > 0 be given. The function f must come arbitrarily close to its infimum and supremum on R, so let a be a point at which |f(a) − α| < ε and let b be a point at which |f(b) − β| < ε. From the monotonicity of f we have

x < a → |f(x) − α| < ε,   x > b → |f(x) − β| < ε     (90)

Proving uniform convergence is now a matter of proving uniform convergence on [a, b]. We're given that f is continuous; therefore it is uniformly continuous on [a, b] (theorem 4.19); therefore we can find some δ such that

|x − y| < δ → |f(x) − f(y)| < ε     (91)

We can cover [a, b] with finitely many intervals of length δ/2 ([a, b] is bounded, so we don't even need to rely on compactness for this). From each interval we choose a rational number qi. We know that {Fn} converges pointwise to f at each rational number, and the set of qi's is a finite set, so we can find some integer N such that

n > N → |Fn(qi) − f(qi)| < ε for each qi     (92)

Now choose any x ∈ [a, b]. If x = qi for some i then by (92) we have n > N → |Fn(x) − f(x)| < ε. If x ≠ qi then we can find qi, qi+1 such that qi < x < qi+1 and |qi+1 − qi| < δ. By the triangle inequality:

|Fn(x) − f(x)| ≤ |Fn(x) − Fn(qi+1)| + |Fn(qi+1) − f(qi+1)| + |f(qi+1) − f(x)|

Each Fn is monotonic, so we have |Fn(x) − Fn(qi+1)| ≤ |Fn(qi) − Fn(qi+1)|, and this last inequality becomes

|Fn(x) − f(x)| ≤ |Fn(qi) − Fn(qi+1)| + |Fn(qi+1) − f(qi+1)| + |f(qi+1) − f(x)|

The term |Fn(qi+1) − f(qi+1)| is < ε by (92). The term |f(qi+1) − f(x)| is < ε by (91). This leaves us with

|Fn(x) − f(x)| ≤ |Fn(qi) − Fn(qi+1)| + 2ε

Additional applications of the triangle inequality give us

|Fn(x) − f(x)| ≤ |Fn(qi) − f(qi)| + |f(qi) − f(qi+1)| + |f(qi+1) − Fn(qi+1)| + 2ε

The terms |Fn(qi) − f(qi)| and |f(qi+1) − Fn(qi+1)| are both < ε by (92). The term |f(qi) − f(qi+1)| is < ε by (91). This leaves us with

|Fn(x) − f(x)| ≤ 5ε

Exercise 7.14

Exercise 7.15

Let ε > 0 be given. The family {fn} is equicontinuous on [0, 1], which means that there exists some δ such that, for all n,

|0 − y| < δ → |fn(0) − fn(y)| < ε

By the definition of fn (namely fn(t) = f(nt)) this means that, for all y ∈ (0, 1) and all n ∈ N,

|y| < δ → |f(0) − f(ny)| < ε

By taking n sufficiently large and choosing y appropriately we can cause ny to take on any value in (0, ∞), so |f(0) − f(t)| < ε for every t > 0. Since ε was arbitrary, we conclude that f is a constant function on [0, ∞).

Exercise 7.16

See part (b) of exercise 7.13.

Exercise 7.17

Exercise 7.18

Lemma 1: {Fn} is uniformly bounded

We're told that {fn} is a uniformly bounded sequence: let M be an upper bound for {|fn|}. For each n and each x ∈ [a, b], we have

|Fn(x)| = |∫_a^x fn(t) dt| ≤ ∫_a^x |fn(t)| dt ≤ ∫_a^b |fn(t)| dt ≤ (b − a)M

and therefore {Fn} is uniformly bounded.

Lemma 2: {Fn} is equicontinuous

Let ε > 0 be given. Let δ = ε/M. For x ≤ y,

|y − x| < δ → |Fn(y) − Fn(x)| = |∫_x^y fn(t) dt| ≤ ∫_x^y |fn(t)| dt ≤ (y − x)M < δM = ε

Therefore, by theorem 7.25, we know that {Fn} contains a uniformly convergent subsequence.

Exercise 7.19

This was solved during class.

Exercise 7.21

We know that 𝒜 separates points because f(e^{iθ}) = e^{iθ} ∈ 𝒜. For this same function, |f(e^{iθ})| = |e^{iθ}| = 1, and so 𝒜 vanishes at no point of K. But the function f(e^{iθ}) = e^{−iθ} is a continuous function on K that is not in the closure of 𝒜: every g ∈ 𝒜 satisfies ∫_0^{2π} g(e^{iθ}) e^{iθ} dθ = 0, and this property is preserved under uniform limits, but ∫_0^{2π} e^{−iθ} e^{iθ} dθ = 2π ≠ 0.
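As a small numerical check of the orthogonality argument in exercise 7.21 (not part of the original solution; the range of n and the step count are arbitrary choices), the Python sketch below approximates ∫_0^{2π} g(e^{iθ}) e^{iθ} dθ for a few members g(e^{iθ}) = e^{inθ} of 𝒜, where it vanishes, and for e^{−iθ}, where it equals 2π.

import cmath
import math

def integrate(g, steps=20000):
    # midpoint-rule approximation of the integral of g over [0, 2*pi]
    h = 2 * math.pi / steps
    return sum(g((k + 0.5) * h) for k in range(steps)) * h

# members of the algebra: g(e^{i*theta}) = e^{i*n*theta} with n >= 0
for n in range(4):
    val = integrate(lambda t, n=n: cmath.exp(1j * n * t) * cmath.exp(1j * t))
    print(n, abs(val))            # all approximately 0

# the function e^{-i*theta}, which is not in the closure of the algebra
val = integrate(lambda t: cmath.exp(-1j * t) * cmath.exp(1j * t))
print(abs(val), 2 * math.pi)      # approximately 2*pi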