INTRODUCTION TO ANALYSIS
TIM TRAYNOR
UNIVERSITY OF WINDSOR

CONTENTS
Axioms for the real number system
Consequences of the field axioms
Consequences of the ordered field axioms
Intervals
Maximum and minimum
Absolute value and distance
The natural numbers
A little number theory
The rational numbers
Incompleteness of the rationals
The existence of roots: a consequence of completeness
The extended real number system
The Complex Number System
A bit about $\mathbb{R}^n$
Cantor’s Principle and the uncountability of $\mathbb{R}$
The uncountability of the reals
Countability of the Rationals
Cantor’s Principle in $\mathbb{R}^n$
Suprema, infima, and the Archimedean Property
Supremum and Infimum as operations
Supremum and infimum in the extended real number system, $\overline{\mathbb{R}}$
Exponents
Natural exponent
Integer exponent
Rational exponents
Arbitrary real exponents
The existence of
Topology in $\mathbb{R}$ and $\mathbb{R}^n$ and other metric spaces
Open and Closed sets
Balls, open sets, and closed sets in subspaces
Interior, boundary, and closure
Closure
Bounded sets
Boundedness
Accumulation and the Bolzano-Weierstrass Theorem (set form)
Accumulation points
Compactness and the Heine-Borel Theorem
Compactness in subspaces
Convergence of sequences: Definition and Examples
Examples
Limit theorems for sequences of reals
Convergence to $+\infty$ and $-\infty$
Existence: Monotone sequences
Cluster points and subsequences: The Bolzano-Weierstrass theorem (sequence form)
Subsequences
Existence: Cauchy sequences
The number $e$, an application of Monotone Convergence
Connection with the series $\sum_k \frac{1}{k!}$
Convergence of $((1 + 2/n)^n)$
Limit inferior and limit superior
Unbounded sequences
Series of numbers
Limits of functions
Uniqueness
Left and right limits
Infinite limits of functions and limits at $\pm\infty$
Continuity of functions
Discontinuities of a monotone function
Continuity and compactness
The Intermediate Value Theorem
Banach’s Contraction Mapping Theorem
Uniform Continuity
Differentiation
Mean Value Theorems
The Real Inverse Function Theorem
L’Hôpital’s Rule
Taylor’s Theorem
Convex Functions
The Riemann Integral
Existence of the Riemann integral
The Fundamental Theorem of Calculus
Lebesgue’s Criterion for Riemann Integrability (optional)
The Cantor set
Pointwise and uniform convergence
Distance interpretation
Uniform convergence of series of functions
Uniform convergence: Continuity, integral, derivative
Continuity
Integration
Differentiation
A continuous nowhere differentiable function on $\mathbb{R}$
Power series
The complex case
The product of two power series
The exponential and trigonometric functions
Details (proofs)
The connection with the angles of geometry
Differentiation of vector-valued and complex-valued functions
Integration of vector-valued functions
Rectifiable curves: arc length
Differentiation of vector functions of a vector variable
Connection with the usual definition for functions of a real variable
The case of real-valued functions of a vector variable
The space $L(\mathbb{R}^n; \mathbb{R}^m)$
Rules of differentiation
Directional derivatives and partial derivatives
The gradient
The matrix of the derivative
The Jacobian
Continuous differentiability
The Inverse Function Theorem
The Implicit Function Theorem
Appendix: Countability

Many thanks to Maria Pap, Aleksandra Katafiasz, and students of several years. T.T.

1. AXIOMS FOR THE REAL NUMBER SYSTEM

We begin by assuming the existence of a set $\mathbb{R}$, called the set of real numbers, with certain properties. We will see that everything is based on these. The system of real numbers forms a complete ordered field. This means that it satisfies the following axioms:

The field axioms. $\mathbb{R}$ is a field; that is, a set with two binary operations $(x, y) \mapsto x + y$ and $(x, y) \mapsto xy$, called addition and multiplication, with two members $0$ and $1$ and two unary operations $x \mapsto -x$ and $x \mapsto x^{-1}$ (for $x \ne 0$) such that:
(A1) For all $x, y \in \mathbb{R}$, $x + y = y + x$ (commutativity).
(A2) For all $x, y, z \in \mathbb{R}$, $(x + y) + z = x + (y + z)$ (associativity).
(A3) For all $x \in \mathbb{R}$, $x + 0 = x = 0 + x$ ($0$ is an identity for addition).
(A4) For all $x \in \mathbb{R}$, $x + (-x) = 0 = (-x) + x$ ($-x$ is an additive inverse of $x$).
(M1) For all $x, y \in \mathbb{R}$, $xy = yx$ (commutativity).
(M2) For all $x, y, z \in \mathbb{R}$, $(xy)z = x(yz)$ (associativity).
(M3) $1 \ne 0$ and for all $x \in \mathbb{R}$, $x1 = x = 1x$ ($1$ is an identity for multiplication).
(M4) For all $x \in \mathbb{R}$ with $x \ne 0$, $x x^{-1} = 1 = x^{-1} x$ ($x^{-1}$ is a multiplicative inverse of $x$).
(DL) For all $x, y, z \in \mathbb{R}$, $x(y + z) = xy + xz$ and $(y + z)x = yx + zx$ (distributive law).

The order axioms. In addition to the above there is a relation $<$ on $\mathbb{R}$ making it an ordered field, that is, it satisfies:
(O1) For all $x, y \in \mathbb{R}$, exactly one of $x < y$, $x = y$, $y < x$ holds (trichotomy).
(O2) If $x < y$ and $y < z$, then $x < z$ (transitivity).
(O3) $x < y$ implies $x + z < y + z$ (addition preserves order).
(O4) $x < y$ and $z > 0$ implies $xz < yz$ (multiplication by a number $> 0$ preserves order).

The Completeness Axiom. If $A$ and $B$ are non-empty subsets of $\mathbb{R}$ such that for all $a \in A$ and $b \in B$, $a < b$, then there is an $x \in \mathbb{R}$ such that $a \le x$ for all $a \in A$ and $x \le b$ for all $b \in B$. (Here $x \le y$ is an abbreviation for $x < y$ or $x = y$.)

(There are many other ways of expressing the completeness axiom and we will meet some of them. We have chosen one that can be stated with very little theory.) Notice, by the way, that the conclusion of the completeness axiom still holds if $a < b$ is replaced by $a \le b$. (Why?)

In analysis books, it is traditional to write only $x + 0 = x$ in the axiom for additive identity, since the other part, $x = 0 + x$, follows from commutativity. Similar statements apply for the multiplicative identity and the additive and multiplicative inverses. Here, however, we have elected to use the conventions usually used in algebra books, to facilitate the comparison with other algebraic systems.

The distributive law (DL) is described by saying that multiplication is distributive over addition. This distinguishes it from $x + (yz) = (x + y)(x + z)$, which does not hold. By contrast, in elementary set theory we do have two distributive laws (intersection over union and union over intersection).

From the field axioms we can deduce all the usual rules of algebraic manipulation that we know so well. The order axioms add the ability to handle inequalities in the way we are accustomed, and the completeness axiom adds the power of Calculus and much more.

Even though the field axioms give us all the rules of the algebra of numbers, they don’t guarantee the existence of very many numbers. In fact, the field could have only the members $0$ and $1$. (You might want to amuse yourself by figuring out what the addition and multiplication tables would have to look like then.) But, when we add the order axioms, we find that
$$0 < 1 < 1 + 1 < (1 + 1) + 1 < ((1 + 1) + 1) + 1 < \dots$$
This gives an infinite number of numbers.
We will be able to produce the set $\mathbb{N} = \{1, 2, \dots\}$ of natural numbers, complete with the Principle of Mathematical Induction. From them and their additive inverses and $0$ we will get the set $\mathbb{Z}$ of integers and, using multiplicative inverses, we can upgrade to $\mathbb{Q}$, the rational numbers. This set is again an ordered field.

But it is the Completeness Axiom that really gives us lots of numbers. You have probably heard that $\sqrt{2}$ and $\pi$ are not rational numbers. It is the completeness axiom that guarantees they exist as real numbers. It is completeness that will tell us that a continuous function that is negative at some point and positive somewhere else must be $0$ somewhere in between. (Without completeness, the graph of the cosine function would never intersect the $x$-axis.) It will tell us that many integrals exist, even if we can’t figure out their values. It will enable us to solve differential equations, etc.

Look closely at this axiom: it is of a different nature than the others. The others talk only about one, two, or three real numbers at a time. The Completeness Axiom, however, talks about two sets of real numbers. That is what gives it its power.

2. CONSEQUENCES OF THE FIELD AXIOMS

We won’t spend much time on these, since the student will have ample practice with them in Algebra courses; we just give a few examples which should convince you that all the usual algebraic properties will be deducible.

Before we begin, look at the first axiom. It starts “for all $x, y \in \mathbb{R}$”. What this really means is “for all $x \in \mathbb{R}$ and for all $y \in \mathbb{R}$”. This is a common way to abbreviate. Make sure you don’t get “for all $x, y$ . . . ” mixed up with “for all $(x, y)$ . . . ”; the latter is talking about an ordered pair of elements. An expression such as “for all $x$” is an example of a universal quantifier. Logically, “for each $x$” and “for every $x$” ($\forall x$) also mean the same thing.
The other type of quantifier is the existential quantifier, “there exists”, “for some” ($\exists$). Quantifiers play a very significant role in Analysis.

The first results we will establish are: uniqueness of the identity elements; cancellation laws for addition and multiplication; uniqueness of the inverses.

2.1 Theorem. (Uniqueness of the identity elements.)
(1) If $a \in \mathbb{R}$ and for all $x \in \mathbb{R}$, $x + a = x$, then $a = 0$.
(2) If $b \in \mathbb{R}$ and for all $x \in \mathbb{R}$, $xb = x$, then $b = 1$.

Proof. (1) Since $0$ is an identity for addition (A3), for all $x \in \mathbb{R}$, $x = 0 + x$. In particular,
$$a = 0 + a. \tag{1}$$
By hypothesis, for all $x \in \mathbb{R}$, $x + a = x$. Therefore, $0 + a = 0$, which combined with (1) gives $a = 0$, as required.
(2) The corresponding statement for multiplication and $1$ is proved in the same way.

Here is a more compact version of the proof of (1).

Proof.
$$a = 0 + a, \quad \text{by A3,} \tag{2}$$
$$\phantom{a} = 0, \quad \text{by hypothesis (taking } x = 0\text{);} \tag{3}$$
and hence $a = 0$.

2.2 Theorem. (Cancellation)
(1) For all $x, y, z \in \mathbb{R}$, if $x + z = y + z$, then $x = y$.
(2) Similarly, for all $x$, $y$ and $w$ in $\mathbb{R}$, if $w \ne 0$ and $xw = yw$, then $x = y$.

Proof. (1) Fix arbitrary $x, y, z \in \mathbb{R}$ for which $x + z = y + z$. Then,
$$(x + z) + (-z) = (y + z) + (-z), \tag{4}$$
$$x + (z + (-z)) = y + (z + (-z)), \quad \text{by A2, associativity,} \tag{5}$$
$$x + 0 = y + 0, \quad \text{by A4,} \tag{6}$$
$$x = y, \quad \text{by A3.} \tag{7}$$
The proof of (2) is left as an exercise.

Inverses, too, are unique:

2.3 Theorem. For $x \in \mathbb{R}$, the only additive inverse of $x$ is $-x$; if $x$ is not $0$, its only multiplicative inverse is $x^{-1}$.

Proof. Let $x \in \mathbb{R}$. Suppose $a$ is an additive inverse of $x$. Then, $x + a = 0$. But also $x + (-x) = 0$; hence $x + a = x + (-x)$. By commutativity, $a + x = (-x) + x$, and then $a = -x$, by cancellation (the previous result).
The proof of the corresponding result for multiplicative inverse is left to the reader.

The word inverse by itself usually refers to the multiplicative inverse, while $-x$ is called minus $x$, or “the negative of $x$”.
In the latter case, we must be aware that the negative of a negative number (to be defined later) is not negative, but positive. We define $x - y$ to be $x + (-y)$ and, if $y \ne 0$, $\frac{x}{y}$ to be $x y^{-1}$.

Once we get going, of course, we will not even bother with the justifications of simple algebraic calculations, since we will have much more immediate concerns. The student should, however, achieve a level of competence at which such tiny gaps in reasoning can be filled at any time.

Here are some more consequences of the field properties:

2.4 Theorem. Let $a, b, c, \dots$ be elements of $\mathbb{R}$ (or another fixed field). Then,
(1) $-0 = 0$;
(2) $1^{-1} = 1$;
(3) $-(-a) = a$;
(4) if $a \ne 0$, then $(a^{-1})^{-1} = a$;
(5) $a0 = 0$;
(6) $(-a)b = -(ab) = a(-b)$;
(7) $(-a)(-b) = ab$;
(8) $ab = 0$ implies either $a = 0$ or $b = 0$;
(9) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$;
(10) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$;
(11) if $b \ne 0$ and $d \ne 0$ then $\frac{a}{b} = \frac{c}{d}$ if and only if $ad = bc$.

We prove a few of these and leave the rest to the reader.

Proof. (1) Since $0 + 0 = 0$, by (A3), $0$ is an additive inverse of $0$. But such an inverse is unique, hence $-0 = 0$.
(5) Start with $0 + 0 = 0$. Multiply by $a$ on the left to obtain $a(0 + 0) = a0$. Now use the distributive law: $a0 + a0 = a0$. The right side here is also $0 + a0$, by A3 and A1, so $a0 + a0 = 0 + a0$, yielding $a0 = 0$ by cancellation.
(6) As with (1), we show $(-a)b$ is an additive inverse of $ab$:
$$ab + (-a)b = b(a + (-a)), \quad \text{by commutativity and distributivity,}$$
$$\phantom{ab + (-a)b} = b0, \quad \text{by A4,}$$
$$\phantom{ab + (-a)b} = 0, \quad \text{by the previous result.}$$
Thus $ab + ((-a)b) = 0$, so by uniqueness of additive inverse, $(-a)b = -(ab)$. That $a(-b) = -(ab)$ is proved in the same way.

3. CONSEQUENCES OF THE ORDERED FIELD AXIOMS

We begin with a very simple, but very striking, consequence of these seemingly innocent assumptions. It illustrates the role of the trichotomy axiom.

3.1 Theorem. $0 < 1$.

Proof. By trichotomy, $0 < 1$, $0 = 1$, or $1 < 0$. We know $0 \ne 1$, by axiom (M3).
Now, suppose $1 < 0$. Then, since addition preserves order (O3), $1 + (-1) < 0 + (-1)$. And, since $1 + (-1) = 0$ and $0 + (-1) = -1$, we have $0 < -1$. This and O4 yield $0(-1) < (-1)(-1)$. But $0(-1) = 0$ and $(-1)(-1) = 1 \cdot 1 = 1$, so $0 < 1$. This contradicts $1 < 0$. The only possibility remaining is $0 < 1$, as required.

Let us officially define
$$2 = 1 + 1;\ 3 = 2 + 1;\ 4 = 3 + 1;\ 5 = 4 + 1;\ 6 = 5 + 1;\ 7 = 6 + 1;\ 8 = 7 + 1;\ 9 = 8 + 1.$$
From the previous result we get $1 + 0 < 1 + 1$, so $1 < 2$. In general, for all $x \in \mathbb{R}$, $x < x + 1$, so $1 < 2 < 3 < 4$ and so on.

This shows that there are lots of numbers going out to the right. Since taking negatives reverses the order (as we will see), there are also lots of numbers going to the left. Here is how we can also get lots of numbers between two numbers. It is surprising how useful this fact is.

3.2 Theorem. For real numbers $a$ and $b$, if $a < b$, then $a < \frac{a + b}{2} < b$. Hence, if $a < b$, there exists $x \in \mathbb{R}$ with $a < x < b$.

Proof. (Practice exercise.)

As you are aware, $a < b < c$ means $a < b$ and $b < c$. One of the conveniences of this notation is that, by transitivity, we can drop the middle one, so it includes the statement $a < c$.

The expression $a \le b$ is an abbreviation for $a < b$ or $a = b$. The new relation $\le$ on $\mathbb{R}$ is still transitive and addition preserves this order:
$$a \le b \quad \text{implies} \quad a + c \le b + c.$$
And similarly, if $c \ge 0$,
$$a \le b \quad \text{implies} \quad ac \le bc.$$

The trichotomy axiom O1 breaks into two conditions:
(O1a) For all $x, y \in \mathbb{R}$, $x < y$ or $x = y$ or $y < x$,
and
(O1b) For all $x, y \in \mathbb{R}$, not ($x < y$ and $x = y$) and not ($x < y$ and $y < x$) and not ($x = y$ and $y < x$).
These can be stated in terms of the relation $\le$ as the following two statements:
(TOa) For all $x, y \in \mathbb{R}$, either $x \le y$ or $y \le x$.
(TOb) For all $x, y \in \mathbb{R}$, if $x \le y$ and $y \le x$ then $x = y$.
(A relation that satisfies TOa and TOb and O2 (transitivity) is called a total order.) The student should prove the above statements as an exercise in simple logic. We will assume them in what follows.
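As one possible route through the exercise just mentioned, here is a sketch of how (TOa) and (TOb) can be obtained from (O1a) and (O1b). It assumes only that $x \le y$ abbreviates “$x < y$ or $x = y$”; the layout of the cases is our own choice and not necessarily the argument the notes intend.

```latex
% (TOa) from (O1a): given x, y, one of x < y, x = y, y < x holds.
\begin{align*}
x < y \ \text{or}\ x = y &\implies x \le y, \\
y < x &\implies y \le x .
\end{align*}
% (TOb) from (O1b): assume x \le y and y \le x, and suppose x \ne y.
\begin{align*}
x \le y \ \text{and}\ x \ne y &\implies x < y, \\
y \le x \ \text{and}\ x \ne y &\implies y < x .
\end{align*}
% Having both x < y and y < x contradicts the clause
% "not (x < y and y < x)" of (O1b); hence x = y.
```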
The expression $a \le b < c$ means $a \le b$ and $b < c$, and we can conclude $a < c$; $a < b \le c$ is interpreted similarly and yields the same conclusion. (You should check this.)

As you expect, the expression $a > b$ is defined to mean $b < a$, and $a \ge b$ is defined to mean $b \le a$.

3.3 Theorem. (Our First Analysis Result) For real numbers $a, b, c, \varepsilon_0$:
(1) If for all $x > a$, $c \le x$, then $c \le a$.
(2) Let $a < b$. If for all $x$ such that $a < x < b$, $c \le x$, then $c \le a$.
(3) If for all $\varepsilon > 0$, $c \le a + \varepsilon$, then $c \le a$.
(4) Let $\varepsilon_0 > 0$. If $c \le a + \varepsilon$ whenever $0 < \varepsilon < \varepsilon_0$, then $c \le a$.

In the sequel, we will refer to this result as FAR.

Many students have difficulty believing this result, because they are thinking of $x$ as fixed. But the hypothesis is about all possible $x$ in the interval $(a, +\infty)$ (in version (1)) and all possible $x$ in the interval $(a, b)$ (in version (2)). (See the definitions of intervals below.) If you draw yourself the picture, the results become “obvious”: if $c$ is less than or equal to every element of $(a, b)$, it must be less than or equal to the left endpoint $a$.

Proof. (1) Assume that for all $x > a$, $c \le x$. By trichotomy, the negation of $c \le a$ is $c > a$. So, suppose that $c > a$. Then,
$$a < \frac{a + c}{2} < c.$$
Thus, choosing $x = \frac{a + c}{2}$ yields $x > a$ and $c > x$. This proves that there exists $x > a$ with $c > x$. This contradicts the hypothesis that for all $x > a$, $c \le x$. Thus, $c > a$ is false, so $c \le a$, as required.

By the way, even though the hypothesis is just that $\forall x > a$, $c \le x$, this actually implies $\forall x > a$, $c < x$. In fact, after the result has been proved, we even see $\forall x > a$, $c \le a < x$.

(2) Method 1: Let $a < b$ and assume $c \le x$, for all $x$ with $a < x < b$. Suppose $c > a$. There are two cases: either $c > b$ or $c \le b$.
If $c > b$, take $x = \frac{a + b}{2}$, so that $a < x < b$. By the hypothesis, $c \le x$, and by transitivity $c < b$. Since $c > b$, this contradicts trichotomy.
If, instead, $c \le b$, take $x = \frac{a + c}{2}$. Then, $a < x < c$. But then, by transitivity, $a < x < b$; so again, by hypothesis, $c \le x$, and again this contradicts trichotomy, since $x < c$.
Thus, $c > a$ is false, so $c \le a$.
Method 2 (reducing to version (1)): Let $a < b$. Assume $c \le x$, for all $x$ with $a < x < b$. Now, let $x > a$. If $x < b$, then $a < x < b$, so $c \le x$. But $a < \frac{a + b}{2} < b$, so if $x \ge b$, then
$$c \le \frac{a + b}{2} < b \le x;$$
so again $c \le x$. Thus, for all $x > a$, $c \le x$; hence, $c \le a$, by (1).

(3) Suppose $c \le a + \varepsilon$, for all $\varepsilon > 0$. Let $x > a$. Then $x - a > 0$, so $c \le a + (x - a) = x$. Thus, for all $x > a$, $c \le x$; hence, $c \le a$, by (1) again.

(4) This is an exercise. You can mimic the proof of version (2) or deduce it from (3) as (2) was deduced from (1).

Each of these versions of the First Analysis Result has a corresponding version for the opposite inequalities. For example, if $c \ge x$, for all $x < a$, then $c \ge a$. I suggest you prove these.

A real number $a$ is called positive if $a > 0$, negative if $a < 0$, non-negative if $a \ge 0$, and non-positive if $a \le 0$.

3.4 Theorem. For real numbers $a, b$:
(1) if $a < b$ then $-b < -a$;
(2) if $a$ is negative then $-a$ is positive;
(3) if $a$ is positive then $-a$ is negative.

Proof. (1) Let $a < b$. Adding $(-a) + (-b)$ to both sides gives
$$a + ((-a) + (-b)) < b + ((-a) + (-b)), \quad \text{by O3,}$$
$$(a + (-a)) + (-b) < (b + (-b)) + (-a), \quad \text{by A2 and A1,}$$
$$0 + (-b) < 0 + (-a),$$
$$-b < -a,$$
as required.
(2) Let $a$ be negative. Then $a < 0$, so by (1), $-0 < -a$. But $-0 = 0$. Thus, $-a$ is positive by definition.
Statement (3) is proved in the same way.

3.5 Theorem. Let $a, b, c, d$ be real numbers.
(1) If $a < b$ and $c < d$ then $a + c < b + d$.
(2) If $0 < a < b$ and $0 < c < d$ then $ac < bd$.

Proof. (1) Start with $a < b$. By O3 we may add $c$ to both sides, yielding
$$a + c < b + c. \tag{$*$}$$
But also $c < d$, so we may add $b$ to both sides and use commutativity, yielding
$$b + c < b + d. \tag{$**$}$$
Combining ($*$) and ($**$), using transitivity (O2), we have $a + c < b + d$, as required.
(2) This is proved similarly, using O4 instead of O3. Note carefully how the positivity is used.

Here are a few more familiar properties. The proofs of these should present no serious difficulties. They are left as exercises. Make sure you try some.
You should also state and prove versions of these results that use $\le$ and $\ge$ in place of one or both of $<$ and $>$.

3.6 Theorem. Let $a, b, c, d$ be real numbers. Then:
(1) $a < b$ and $c < 0$ implies $ac > bc$.
(2) $a > 0$ and $b > 0$ implies $ab > 0$.
(3) $a < 0$ and $b < 0$ implies $ab > 0$.
(4) $a < 0$ and $b > 0$ implies $ab < 0$.
(5) $a^2 \ge 0$, and $a^2 > 0$ if $a \ne 0$.
(6) If $a > 0$ then $a^{-1} > 0$.
(7) If $0 < a < b$ then $0 < b^{-1} < a^{-1}$.
(8) If $0 < a < 1$, then $a^2 < a$.
(9) If $a > 1$, then $a < a^2$.
(10) $0 < a < b$ implies $a^2 < b^2$.
(11) $0 < b, d$ implies $\frac{a}{b} < \frac{c}{d}$ iff $ad < bc$.
(12) $0 \le a < b$ implies $\frac{a}{a + 1} < \frac{b}{b + 1}$.
(13) $0 < b, d$ and $\frac{a}{b} < \frac{c}{d}$ implies $\frac{a}{b} < \frac{a + c}{b + d} < \frac{c}{d}$.

Intervals. If $a, b$ are real numbers, we define four bounded intervals by
$$(a, b) = \{x \in \mathbb{R} : a < x < b\}$$
$$[a, b] = \{x \in \mathbb{R} : a \le x \le b\}$$
$$(a, b] = \{x \in \mathbb{R} : a < x \le b\}$$
$$[a, b) = \{x \in \mathbb{R} : a \le x < b\}$$
We notice that if $b < a$ these all turn out to be empty. We usually assume $a \le b$ when we use this notation. If $a = b$, $[a, b] = \{a\}$, whereas the others are empty. We call $(a, b)$ an open interval and $[a, b]$ a closed interval; $(a, b]$ and $[a, b)$ are half-open intervals. In all four cases, $a$ is the left endpoint and $b$ is the right endpoint of the interval. If $a < b$, the interval is called non-degenerate, otherwise degenerate. Notice that each non-degenerate interval is non-empty, since $\frac{a + b}{2}$ belongs to it.

There are also unbounded intervals defined by
$$(a, +\infty) = \{x \in \mathbb{R} : x > a\} \tag{8}$$
$$[a, +\infty) = \{x \in \mathbb{R} : x \ge a\} \tag{9}$$
$$(-\infty, b) = \{x \in \mathbb{R} : x < b\} \tag{10}$$
$$(-\infty, b] = \{x \in \mathbb{R} : x \le b\} \tag{11}$$
$$(-\infty, +\infty) = \mathbb{R} \tag{12}$$
The symbols $+\infty$ and $-\infty$ are from the extended real number system. (See section 8.)

Every interval $J$ of $\mathbb{R}$ (bounded or unbounded) has the property that for all $x, y \in J$, if $x < t < y$ then $t \in J$. Once one has the concept of supremum and infimum (see section 12, Suprema, infima, and the Archimedean Property) one can show that this property characterizes intervals.

3.7 Theorem.
(Characterization of interval) A subset $J$ of $\mathbb{R}$ is an interval if and only if for all $x, y \in J$ and $t \in \mathbb{R}$, if $x < t < y$ then $t \in J$.

Maximum and minimum. If $x$ and $y$ are two real numbers, the maximum of $x$ and $y$ is
$$\max\{x, y\} = \begin{cases} x, & \text{if } x \ge y \\ y, & \text{if } y \ge x; \end{cases}$$
the minimum of $x$ and $y$ is
$$\min\{x, y\} = \begin{cases} x, & \text{if } x \le y \\ y, & \text{if } y \le x. \end{cases}$$

3.8 Theorem. For real numbers $x, y, a$,
(1) $\max\{x, y\} \le a$ if and only if $x \le a$ and $y \le a$, and
(2) $\min\{x, y\} \ge a$ if and only if $x \ge a$ and $y \ge a$.
(3) $\max\{x, y\} < a$ if and only if $x < a$ and $y < a$, and
(4) $\min\{x, y\} > a$ if and only if $x > a$ and $y > a$.

Proof. (Exercise)

Absolute value and distance. For each $x \in \mathbb{R}$, the absolute value of $x$ is defined to be $|x| = x$ if $x \ge 0$ and $|x| = -x$ if $x < 0$. An immediate consequence (which you should prove for practice) is:

3.9 Lemma. Let $x, a \in \mathbb{R}$. Then, $|x| \le a$ iff $-a \le x \le a$. (Similarly for $<$.)

3.10 Theorem. For real numbers:
(1) $|x| \ge 0$, for all $x$; $|x| = 0$ iff $x = 0$.
(2) $|x + y| \le |x| + |y|$.
(3) $|xy| = |x||y|$.

We are using the common abbreviation “iff” for “if and only if”.

Proof. (1) If $x \ge 0$ then $|x| = x \ge 0$. If $x < 0$, $|x| = -x > 0$. In both cases, $|x| \ge 0$. Now, if $x = 0$, $|x| = x = 0$ by definition. Conversely, suppose $x \ne 0$. Then either $x < 0$ or $x > 0$. If $x < 0$ then $|x| = -x > 0$, while if $x > 0$ then $|x| = x > 0$. Thus, in both cases $|x| \ne 0$.
(2) Since $|x| \le |x|$, the Lemma gives
$$-|x| \le x \le |x|$$
and similarly
$$-|y| \le y \le |y|;$$
and adding gives
$$-(|x| + |y|) \le x + y \le |x| + |y|.$$
Now using the Lemma again with $a = |x| + |y|$, we have $|x + y| \le |x| + |y|$, as required.

Alternate Proof. In case $x \ge 0$, $y \ge 0$, we have $|x| = x$, $|y| = y$, and $x + y \ge 0$, so that
$$|x + y| = x + y = |x| + |y|.$$
In case $x < 0$, $y < 0$, we have $|x| = -x$, $|y| = -y$ and $x + y < 0$, so
$$|x + y| = -(x + y) = -x + (-y) = |x| + |y|.$$
In case $x \ge 0$, $y < 0$, we have $|x| = x$, $|y| = -y > 0$. If $x + y \ge 0$, this gives
$$|x + y| = x + y < x + (-y) = |x| + |y|;$$
while if $x + y < 0$ it gives
$$|x + y| = -x - y \le x + (-y) = |x| + |y|.$$
The case $x < 0$, $y \ge 0$ is similar. In all 4 cases, then, we had $|x + y| \le |x| + |y|$.
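To make the case analysis above concrete, here are three numerical instances. The particular values are chosen only for illustration and are not taken from the notes; each line corresponds to one branch of the alternate proof.

```latex
\begin{align*}
% x >= 0, y < 0, x + y >= 0: strict inequality in this instance
x = 5,\ y = -3 &: \quad |x + y| = |2| = 2 < 8 = 5 + 3 = |x| + |y|, \\
% x >= 0, y < 0, x + y < 0: again strict in this instance
x = 3,\ y = -5 &: \quad |x + y| = |-2| = 2 < 8 = 3 + 5 = |x| + |y|, \\
% x < 0, y < 0: equality, as in the second case of the proof
x = -2,\ y = -5 &: \quad |x + y| = |-7| = 7 = 2 + 5 = |x| + |y|.
\end{align*}
```

In these examples the inequality is strict exactly when $x$ and $y$ have opposite signs, which matches the places where the alternate proof uses $<$ or $\le$ rather than $=$.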
The proof of (3) is similar to the alternative proof of (2), but easier.

Property (2) here is called the triangle inequality for absolute value. It gets its name from its analogue in the Euclidean plane, where it describes the relationship between the lengths of sides of a triangle. Often the following variant of the triangle inequality is useful.

3.11 Corollary. Let $x, y \in \mathbb{R}$. Then $\bigl|\,|x| - |y|\,\bigr| \le |x - y|$ and $\bigl|\,|x| - |y|\,\bigr| \le |x + y|$.

Proof. $|x| = |x - y + y| \le |x - y| + |y|$, by the triangle inequality, so
$$|x| - |y| \le |x - y|.$$
Similarly,
$$|y| - |x| \le |x - y|.$$
Now $\bigl|\,|x| - |y|\,\bigr|$ is one of $|x| - |y|$ or $|y| - |x| = -(|x| - |y|)$, depending on the sign, so in both cases
$$\bigl|\,|x| - |y|\,\bigr| \le |x - y|.$$
To obtain the second statement from the first, replace $y$ by $-y$ and use $|-y| = |y|$.

3.12 Remark. It is a common source of error to reverse the inequality in this corollary, writing $|x - y| \le \bigl|\,|x| - |y|\,\bigr|$, which is false in general.

For real numbers $x$ and $y$, the quantity $|x - y|$ is called the distance from $x$ to $y$. We will write $d(x, y)$ or $\operatorname{dist}(x, y)$ for this.

3.13 Theorem. For all $a, b, c \in \mathbb{R}$, $d(a, c) \le d(a, b) + d(b, c)$.

Proof. Let $a, b, c \in \mathbb{R}$. Then, $|a - c| = |(a - b) + (b - c)| \le |a - b| + |b - c|$, by the triangle inequality for absolute value. The rest is just translation into the distance notation.

The property stated in the above theorem is called the triangle inequality for distance. Again, the name really comes from the analogous result in two dimensions. It says that the length of one side of a triangle is less than or equal to the sum of the lengths of the other two sides.

3.1. Use the concept of minimum of two real numbers to simplify the proof of Our First Analysis Result (2).
3.2. Let $a, b, c$ be real with $b > 0$. If $\varepsilon_0 > 0$ and $c \le a + \varepsilon b$, whenever $0 < \varepsilon < \varepsilon_0$, then $c \le a$.
3.3. Let $I$ be a set of real numbers, and let $a, b \in \mathbb{R}$ with $a \le b$. Then $I$ is an interval with endpoints $a$ and $b$ if and only if $(a, b) \subseteq I \subseteq [a, b]$. (Check the 4 cases.)
3.4. Let $I = (a, b)$ and $J = (c, d)$ be bounded open intervals.
Find a formula for the intersection, $I \cap J$, and prove it. Under the additional hypothesis that $I \cap J \ne \emptyset$, find and prove a formula for $I \cup J$.
3.5. If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, all containing the point $c$, then $\bigcup_{\alpha \in A} I_\alpha$ is also an interval.
3.6. (Improving the previous problem) If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, such that for each $\alpha, \beta \in A$, $I_\alpha \cap I_\beta \ne \emptyset$, then $\bigcup_{\alpha \in A} I_\alpha$ is also an interval.
3.7. If $\{I_\alpha : \alpha \in A\}$ is a family of intervals of $\mathbb{R}$, $\bigcap_{\alpha \in A} I_\alpha$ is also an interval, though possibly empty.
3.8. Find and prove formulas for the intersection of two closed bounded intervals and for the union of two non-disjoint ones.
3.9. Find, with proof,
$$\bigcap_{x > 3} [2, x),$$
that is, the intersection of the family of intervals $\{[2, x) : x > 3\}$.
3.10. Find, with proof,
$$\bigcup_{x < 7} (3, x].$$
3.11. Let $I$ be an interval in $\mathbb{R}$. If $a \in I$ and $a < t \notin I$, then, for all $x \in I$, $x < t$.
3.12. Let $A$ and $B$ be disjoint intervals in $\mathbb{R}$. Suppose $a \in A$, $b \in B$, and $a < b$. Then, for all $x \in A$ and all $y \in B$, $x < y$.
3.13. If $|a| \le \varepsilon$ for all $\varepsilon > 0$, then $a = 0$.
3.14. For real $a, b, c$, $d(a, b) = d(a, c) + d(c, b)$ if and only if $a \le c \le b$ or $a \ge c \ge b$.
3.15. If $x, y \in [a, b]$, then $d(x, y) \le d(a, b)$.
3.16. If $x, y \in [a, b]$, then $d(x, y) < d(a, b)$, unless one of $x, y$ is $a$ and the other is $b$.
3.17. Let $|x - 2| < \delta$ and $\delta < 1$. Prove $|x + 7| < \delta + 9$ and $|x^2 + 5x - 14| < 10\delta$.

4. THE NATURAL NUMBERS

For a subset $A$ of $\mathbb{R}$, $A$ is called inductive (or inductively closed) if for all $x$,
$$x \in A \quad \text{implies} \quad x + 1 \in A.$$
One inductive set is $\mathbb{R}$ itself. The set of natural numbers is defined to be the set
$$\mathbb{N} = \bigcap \{A : A \text{ is inductive with } 1 \in A\}.$$

4.1 Theorem. $\mathbb{N}$ is an inductive subset of $\mathbb{R}$ containing $1$. If $A$ is any inductive set that contains $1$, then $A \supseteq \mathbb{N}$.

Proof. First, there is an inductive subset of $\mathbb{R}$ containing $1$, namely $\mathbb{R}$ itself, so $\mathbb{N}$ exists and is a subset of $\mathbb{R}$. From the definition of intersection, $1 \in \bigcap \{A : A \text{ is inductive and } 1 \in A\} = \mathbb{N}$.
To see that $\mathbb{N}$ is inductive, let $x \in \mathbb{N}$. Then, $x \in A$, for all inductive sets $A$ with $1 \in A$. Hence $x + 1 \in A$, for all inductive sets $A$ with $1 \in A$. Therefore,
$$x + 1 \in \bigcap \{A : A \text{ is inductive with } 1 \in A\} = \mathbb{N},$$
by definition. Thus $x \in \mathbb{N} \implies x + 1 \in \mathbb{N}$, so $\mathbb{N}$ is inductive.
For the second statement, let $A$ be inductive with $1$ belonging to it. If $x \in \mathbb{N}$, then $x \in A$, by definition of intersection. So $\mathbb{N} \subseteq A$.

The end of the proof is a special case of the general fact that the intersection of a family of sets is always contained in each set in the family.

The above immediately yields the Principle of Mathematical Induction:

4.2 Theorem. (PMI)
(1) If $A$ is an inductive set of natural numbers including the element $1$, then $A = \mathbb{N}$.
(2) If $P(n)$ is a statement about natural numbers $n$, such that
(i) $P(1)$ is true, and
(ii) whenever $P(n)$ is true, $P(n + 1)$ is true,
then $P(n)$ is true for all $n \in \mathbb{N}$.

Proof. (1) By hypothesis, $A \subseteq \mathbb{N}$. Also $A$ is inductive with $1 \in A$, so $\mathbb{N} \subseteq A$ by the previous result. Thus $A = \mathbb{N}$.
(2) Let $A = \{n \in \mathbb{N} : P(n) \text{ is true}\}$. By hypothesis, $1 \in A$ and $x \in A \implies x + 1 \in A$. Thus $A$ is inductive, so $A = \mathbb{N}$, by part (1). That is, for all $n \in \mathbb{N}$, $P(n)$ is true.

4.3 Theorem. For all $n \in \mathbb{N}$, $n \ge 1$; hence $0 \notin \mathbb{N}$.

Proof. Here the statement to be proved by induction on $n$ is: $n \ge 1$. Now, (i) $1 \ge 1$ is true of course and, (ii) if $n \ge 1$, then $n + 1 \ge 1 + 0 = 1$. Thus, by PMI, $n \ge 1$, for all $n \in \mathbb{N}$. Moreover, since $0 < 1$, $0 \notin \mathbb{N}$.

So, if $n = 1$, its predecessor $n - 1$ is not a natural number, but this is the only such case.

4.4 Theorem. If $n \in \mathbb{N}$ and $n > 1$, then $n - 1 \in \mathbb{N}$.

Proof. Since every natural number is greater than or equal to $1$, saying $n > 1$ is the same as saying $n \ne 1$. Let $A$ be $\{n \in \mathbb{N} : n = 1 \text{ or } n - 1 \in \mathbb{N}\}$. Then $A$ contains $1$, by definition. Suppose now $n \in A$. Then $n \in \mathbb{N}$; hence $n + 1 \in \mathbb{N}$, since $\mathbb{N}$ is inductive, and $(n + 1) - 1 = n \in \mathbb{N}$. Thus $n \in A \implies n + 1 \in A$, so $A$ is inductive. Hence $\mathbb{N} = A$, by PMI. Thus, for all $n \in \mathbb{N}$,
$$n = 1 \quad \text{or} \quad n - 1 \in \mathbb{N}.$$
In other words, $n \ne 1$ implies $n - 1 \in \mathbb{N}$.
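The inductive step in the proof of Theorem 4.4 rests on the identity $(n + 1) - 1 = n$. As a small sketch of how such a “tiny gap” can be filled from the field axioms of Section 1 (using the definition $x - y = x + (-y)$ from Section 2), one possible chain of justifications is:

```latex
\begin{align*}
(n + 1) - 1 &= (n + 1) + (-1) && \text{definition of subtraction} \\
            &= n + (1 + (-1)) && \text{by (A2), associativity} \\
            &= n + 0          && \text{by (A4)} \\
            &= n              && \text{by (A3).}
\end{align*}
```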
The last step of the proof used the fact, from elementary logic, that the statement $p \lor q$ is equivalent to the statement $(\text{not } p) \implies q$. ($\lor$ means “or” and $\implies$ means “implies”.)

4.5 Theorem. If $n \in \mathbb{N}$ and $n < m \in \mathbb{N}$, then $n + 1 \le m$.

Proof. We use induction on $n$, so let
$$A = \{n \in \mathbb{N} : \text{for all } m \in \mathbb{N}, \text{ if } n < m \text{ then } n + 1 \le m\}.$$
Let $m \in \mathbb{N}$. If $1 < m$, then $m - 1 \in \mathbb{N}$, so $1 \le m - 1$, hence $1 + 1 \le m$. Since $m$ was arbitrary, $1 \in A$.
Suppose $n \in A$, and let $m \in \mathbb{N}$ again be arbitrary. If $n + 1 < m$, then $1 < m$, so $m - 1 \in \mathbb{N}$ and $n < m - 1$; hence, $n + 1 \le m - 1$, by the inductive hypothesis (that is, since $n \in A$). Thus, $(n + 1) + 1 \le m$. And this was true for arbitrary $m$, so $A$ is inductive, and $A = \mathbb{N}$, as required.

In the above proof we used a general principle. If we have a statement to prove about two natural numbers $n$ and $m$, we try to do induction on just one of them (in the above case, $n$). We get more power by making the inductive hypothesis strong, saying “for all $m \in \mathbb{N}$”. The idea is that if you assume more, you can prove more.

4.6 Corollary. If $n \in \mathbb{N}$ and $n < a < n + 1$, then $a \notin \mathbb{N}$. In other words, there is no natural number between successive ones.

Proof. Let $n \in \mathbb{N}$ and $n < a < n + 1$. If $a \in \mathbb{N}$, then by the theorem $n + 1 \le a$, which contradicts $a < n + 1$.

4.7 Theorem (Well-ordering property). Each non-empty subset of $\mathbb{N}$ has a least element.

If $A$ is a set of real numbers, to say that $a_0$ is a least element of $A$ means $a_0 \in A$ and for each $a \in A$, $a_0 \le a$. Clearly, a set $A$ can have no more than one least element (why?).

Proof. Let $M \subseteq \mathbb{N}$ and let $M$ be non-empty. Suppose $M$ has no least element. We will show that this leads to a contradiction. Let $B$ be the set of $n \in \mathbb{N}$ for which $n \le m$, for all $m \in M$. We claim that $1 \in B$ and $B$ is inductive.
First, $1 \in B$. Indeed, $1 \le m$, for all $m \in M$, since $M$ is a set of natural numbers. Now let $n \in B$. Then $n \le m$, for all $m \in M$. But $n \notin M$, for otherwise $n$ would be the least element of $M$, which we are assuming doesn’t exist.
Thus, $n < m$, for all $m \in M$, so $n + 1 \le m$, for all $m \in M$, by the previous result; that is, $n + 1 \in B$. Thus, $n \in B$ implies $n + 1 \in B$. This shows $B$ is inductive, so it contains all the elements of $\mathbb{N}$. In other words, for all natural numbers $n$, $n \le m$, for all $m \in M$.
Since $M \ne \emptyset$ it contains some element, say $m_0$. But $m_0$ is a natural number, so $m_0 \le m$, for all $m \in M$. So $m_0$ is the least member of $M$ after all, a contradiction.

There is a strengthening of the Principle of Mathematical Induction which is often more convenient, known as the Principle of Strong Induction, the 2nd Principle of Mathematical Induction, or the Principle of Complete Induction. As usual, $\{1, \dots, n\}$ means the set of all natural numbers $k$ with $1 \le k \le n$.

4.8 Theorem (Strong Induction).
(1) Let $B$ be a set of natural numbers such that $1 \in B$ and such that, for all $n$, $\{1, \dots, n\} \subseteq B$ implies $n + 1 \in B$. Then $B = \mathbb{N}$.
(2) If $P(n)$ is a statement about natural numbers $n$ such that
(i) $P(1)$ is true, and
(ii) whenever $P(k)$ is true for all natural numbers $k \le n$, $P(n + 1)$ is also true,
then $P(n)$ is true for all $n \in \mathbb{N}$.

Proof. Let $A$ be the set of all $n \in \mathbb{N}$ such that $\{1, \dots, n\} \subseteq B$. We will show that $A$ is inductive. Since $1 \in B$, $\{1\} \subseteq B$, so $1 \in A$. Let $n \in A$. Then $\{1, \dots, n\} \subseteq B$, so $n + 1 \in B$. But then, $\{1, \dots, n + 1\} \subseteq B$, so that $n + 1 \in A$. Thus, by the Principle of Mathematical Induction, $A = \mathbb{N}$, so that for all $n \in \mathbb{N}$, $\{1, \dots, n\} \subseteq B$; in particular, $\mathbb{N} \subseteq B$. Since $B \subseteq \mathbb{N}$, $B = \mathbb{N}$.

4.9 Theorem. The set $\mathbb{N}$ is closed under addition.

The statement means that for all $m, n \in \mathbb{N}$, $m + n \in \mathbb{N}$. In the proof, we will use the general principle mentioned earlier: we will choose one of $m$ and $n$ and prove the statement with the other universally quantified.

Proof. Let
$$A = \{n \in \mathbb{N} : \text{for all } m \in \mathbb{N},\ m + n \in \mathbb{N}\}.$$
Then $1 \in A$, because this just says $m + 1 \in \mathbb{N}$ for all $m \in \mathbb{N}$, which we already know. Now let $n \in A$. Then, for each $m \in \mathbb{N}$, $m + n \in \mathbb{N}$, so
$$m + (n + 1) = (m + n) + 1 \in \mathbb{N},$$
so that $n + 1 \in A$.
Thus, $A$ is inductive, so each $n \in \mathbb{N}$ belongs to $A$, which says that for all $n \in \mathbb{N}$ and for all $m \in \mathbb{N}$, $m + n \in \mathbb{N}$, as required.

In the above proof, how did we know to add $n + 1$ to the $m$? Because we assumed $n \in A$ and we were trying to show $n + 1 \in A$. For that, we needed to show $n + 1$ satisfied the definition of the members of the set $A$. Thus, we fixed one $m$ and added $n + 1$ to it to see if $m + (n + 1)$ still belonged to $\mathbb{N}$. It did. We also followed a common custom of not bothering to write out the line: $n \in A$ implies $n + 1 \in A$.

4.10 Theorem. The set $\mathbb{N}$ is closed under multiplication.

Proof. Let
\[ A = \{ n \in \mathbb{N} : \text{for all } m \in \mathbb{N},\ mn \in \mathbb{N} \}. \]
Then, $1 \in A$, since 1 is a multiplicative identity, so suppose $n \in A$ and fix $m \in \mathbb{N}$. Then, $m(n + 1) = mn + m$, by the distributive law. Since $n \in A$, $mn \in \mathbb{N}$ and thus, by closure under addition, $mn + m \in \mathbb{N}$. This proves $n + 1 \in A$, so $A$ is inductive. Therefore by PMI, for all $n \in \mathbb{N}$ and for all $m \in \mathbb{N}$, $mn \in \mathbb{N}$.

For $n \in \mathbb{N}$ and $x \in \mathbb{R}$ we define $x^n$ using the idea of induction: $x^1 = x$; and assuming $x^n$ defined we let $x^{n+1} = x^n x$. A definition of this type is called a recursive definition or a definition by recursion. One also lets $x^0 = 1$. (In some books, $0^0$ is left undefined, but these still tend to use it as though its definition were $0^0 = 1$, for example in an expression such as $\sum_{k=0}^{n} x^k$, when $x$ is allowed to be 0. We take $0^0 = 1$ always. We have to be very careful, though, not to give this expression properties that it doesn't have.)

4.1. Sometimes the Principle of Strong Induction is stated in only one step: If $B$ is a set of natural numbers such that ($n \in \mathbb{N}$ and $k \in B$, for all $k \in \mathbb{N}$ with $k < n$) implies $n \in B$, then $B = \mathbb{N}$. Show that in using this version one still has to prove $1 \in B$.

4.2. If $m, n \in \mathbb{N}$, with $m > n$, then $m - n \in \mathbb{N}$.

4.3. The Binomial Theorem: Let $a, b \in \mathbb{R}$ (or in any field). For any $n \in \mathbb{N}$,
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}. \]
Here $\binom{n}{k}$ is the binomial coefficient $\frac{n!}{k!\,(n-k)!}$. It satisfies Pascal's Law:
\[ \binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}. \]

4.4.
The Bernoulli inequalities: For $n \in \mathbb{N}$, if $b > -1$, $(1 + b)^n \ge 1 + nb$; if $b < 1$, $(1 - b)^n \ge 1 - nb$.

4.5. For each $n \in \mathbb{N}$, $a^n - b^n = (a - b) \sum_{i=1}^{n} a^{n-i} b^{i-1}$.

4.6. If $S$ is a non-empty finite subset of $\mathbb{R}$, then $S$ has a maximum (that is, a largest element) and a minimum (smallest element). (This can be proved by induction on the number of elements of $S$.)

4.7. Let $a_n \in \mathbb{R}$, for all $n \in \mathbb{N}$. Suppose $a_n \ge a_{n+1}$, for all $n \in \mathbb{N}$. (Then $(a_n)$ is called a decreasing sequence.) Prove that for all $n, k \in \mathbb{N}$, if $n \le k$, then $a_n \ge a_k$.

4.8. Let $a_n \in \mathbb{R}$, for all $n \in \mathbb{N}$. Suppose $a_n \le a_{n+1}$, for all $n \in \mathbb{N}$. (Then $(a_n)$ is called an increasing sequence.) Prove that for all $n, k \in \mathbb{N}$, if $n \le k$, then $a_n \le a_k$.

4.9. (Extension of PMI) The set of integers is defined as $\mathbb{Z} = \{ n - m : m, n \in \mathbb{N} \}$. If $A$ is an inductive set of integers with $a \in A$, then $\{ n \in \mathbb{Z} : n \ge a \} \subseteq A$. Consequently, if $P(n)$ is a statement about integers $n$ for which $P(a)$ is true and $P(n)$ implies $P(n + 1)$, then $P(n)$ is true for all $n \ge a$.

5. A LITTLE NUMBER THEORY

We now define the set $\mathbb{Z}$ of integers by
\[ \mathbb{Z} = \mathbb{N} - \mathbb{N} = \{ n - m : n \in \mathbb{N} \text{ and } m \in \mathbb{N} \}. \]
From this, you can easily prove that $\mathbb{Z}$ is closed under addition and multiplication. It is almost as easy to show that $\mathbb{Z} = \mathbb{N} \cup \{0\} \cup (-\mathbb{N})$; this is based on the fact that, if $n, m \in \mathbb{N}$ and $n > m$, then $n - m \in \mathbb{N}$, which you can prove by induction.

For $n, d \in \mathbb{Z}$, if $n = cd$, for some $c \in \mathbb{Z}$, we say $d$ divides $n$, or $d$ is a divisor of $n$, or $n$ is a multiple of $d$, denoted $d \mid n$. One also says $n$ is divisible by $d$. Evidently 0 is divisible by every integer and $d$ divides $n$ if and only if it divides $-n$, so we often may assume $n > 0$. If $n$ is positive, then $d \mid n$ implies $d \le n$.

5.1 Theorem (The division algorithm). If $n, d$ are integers with $d > 0$, there exist unique integers $q, r$ with $n = dq + r$, $0 \le r < d$.

Proof. Let $A = \{ m \in \mathbb{N} \cup \{0\} : m = n - dq,\ q \in \mathbb{Z} \}$. Then, $A$ is not empty. (If $n \ge 0$, then $n = n - d \cdot 0 \in A$; if $n < 0$, $n - d(n) = (d - 1)(-n) \in A$.) Using the Well Ordering Property, let $r$ be the least element of $A$ and choose $q$ so that $r = n - dq$.
Then, $n = dq + r$ and $r \ge 0$. We need to show $r < d$. If, to the contrary, $r \ge d$, then $r - d = n - d(q + 1) \in A$, contradicting the fact that $r$ is the smallest element of $A$.

To prove uniqueness, let $n$ also be $dq_1 + r_1$, with $0 \le r_1 < d$. Then $dq + r = dq_1 + r_1$, so $d(q_1 - q) = r - r_1$ and $-d < r - r_1 < d$, so $-1 < q_1 - q < 1$. Since the only integer between $-1$ and $1$ is 0, $q = q_1$ and $r = r_1$, as required.

If $d$ is a divisor of both $n$ and $m$, we call $d$ a common divisor of $n$ and $m$.

5.2 Theorem. Each pair of integers $n, m$, not both 0, has a unique positive common divisor of the form $d = nx + my$, where $x, y \in \mathbb{Z}$. If $e$ is any other common divisor of $n, m$, then $e \mid d$, so $e \le d$. Thus, $d$ is the greatest common divisor of $n$ and $m$.

Proof. Without loss of generality, we may assume $n, m \in \mathbb{N}$. Let
\[ A = \{ k \in \mathbb{N} : k = nx + my,\ x \in \mathbb{Z},\ y \in \mathbb{Z} \}. \]
The set $A$ is not empty since $n + m$ belongs to it. Let $d = nx + my$ be the least element of $A$, where $x, y \in \mathbb{Z}$. To see that $d$ divides $n$, use the Division Algorithm to find $q, r$ with $n = dq + r$, $0 \le r < d$. If $r > 0$, then
\[ d > r = n - dq = n - (nx + my)q = n(1 - xq) + m(-yq), \]
so $r$ is an element of $A$ smaller than $d$. Thus, $r = 0$, so $d \mid n$. Similarly, $d \mid m$. Of course, if $e$ divides $n$ and $m$ it divides $nx + my = d$; hence, $e \le d$. Uniqueness follows from this.

A natural number $n > 1$ is called prime if the only positive divisors of $n$ are 1 and $n$. Otherwise it is called composite. Integers $a, b$ are called relatively prime if the greatest common divisor of $a$ and $b$ is 1, that is, they have no common factor $d > 1$.

5.3 Lemma. If $n, m$ are integers that are not relatively prime, and $d$ is their greatest common divisor, then $n/d$ and $m/d$ are relatively prime.

Proof. We may assume $n, m \in \mathbb{N}$. If a natural number $k > 1$ were to divide $n/d$ and $m/d$, then $kd$ would be a larger divisor of $n$ and $m$, a contradiction.

5.4 Euclid's Lemma. If $a, b, c$ are integers, $a$ divides $bc$ and $a$ and $b$ are relatively prime, then $a$ divides $c$.

Proof.
Since 1 is the greatest common divisor of $a$ and $b$, $1 = ax + by$, where $x, y \in \mathbb{Z}$. Thus, $c = axc + bcy$ is divisible by $a$, since both terms are.

5.5 Corollary. If a prime number $p$ divides a product $a_1 a_2 \cdots a_k$, then it divides one of the factors $a_i$.

Proof. This is a simple induction, using Euclid's Lemma.

5.6 Unique Factorization Theorem. Each natural number $n > 1$ can be represented as a product of primes
\[ n = p_1 p_2 \cdots p_m \]
in one and only one way, except for the order of the factors.

Proof. To clarify, if $n$ itself is prime, $n = n$ is the required product. Let $A = \{ n \in \mathbb{N} : n > 1 \text{ and } n \text{ is not a product of primes} \}$. We are to show that $A$ is empty. If it is not, it has a smallest element $a$. Since $a$ is not prime, it can be written as $a = bc$, where $1 < b < a$ and $1 < c < a$. Thus, both $b$ and $c$ are products of primes. Say $b = p_1 p_2 \cdots p_m$ and $c = q_1 q_2 \cdots q_k$, where each of the $p_i$ and $q_j$ are primes. Thus, $a = p_1 p_2 \cdots p_m q_1 q_2 \cdots q_k$ is a product of primes, a contradiction. So $A$ is empty and every natural number greater than 1 is a product of primes.

To prove uniqueness, we must show that if $n > 1$ is a natural number and
\[ n = p_1 p_2 \cdots p_m = q_1 q_2 \cdots q_k, \tag{$\ast$} \]
where all the $p_i$ and $q_j$ are primes, then $m = k$ and $q_1, q_2, \dots, q_k$ is just a rearrangement of $p_1, p_2, \dots, p_m$. We do this by induction on $n$. Since every product of two or more primes is greater than 2, 2 cannot be written in any other way as a product of primes. Now suppose the uniqueness holds for integers less than $n$ and that $(\ast)$ holds. Since $p_1$ is prime, we find from Euclid's lemma that $p_1$ must be $q_i$, for some $i = 1, \dots, k$. By rearranging the order, we can assume $p_1 = q_1$. Then,
\[ p_2 \cdots p_m = q_2 \cdots q_k. \]
Since this integer is smaller than $n$, the $p_2, \dots, p_m$ and $q_2, \dots, q_k$ are the same except for the order; hence $p_1, p_2, \dots, p_m$ and $q_1, q_2, \dots, q_k$ are the same except for the order.

6.
THE RATIONAL NUMBERS

We define the set $\mathbb{Q}$ of rational numbers by
\[ \mathbb{Q} = \left\{ \frac{m}{n} : m, n \in \mathbb{Z},\ n \ne 0 \right\}. \]
We find that $\mathbb{Q} = \left\{ \frac{m}{n} : m \in \mathbb{Z},\ n \in \mathbb{N} \right\}$. Moreover:

6.1 Theorem. $\mathbb{Q}$ is closed under addition and under multiplication and taking of additive and multiplicative inverses. Hence, $\mathbb{Q}$ satisfies the axioms of an ordered field.

The above unproved statements should be considered exercises. Do a few of them from time to time.

Incompleteness of the rationals. Although $\mathbb{Q}$ is an ordered field, it still has lots of holes. We expect there to be a number ($\sqrt{2}$) whose square is 2, for example. But:

6.2 Theorem. There is no rational number $a$ such that $a^2 = 2$.

Proof. Suppose that $a = m/n$, in lowest terms (meaning $m$ and $n$ are relatively prime, that is, have no common divisor $> 1$) and that $a^2 = 2$. Then,
\[ \left( \frac{m}{n} \right)^2 = 2, \quad \text{so that} \quad m^2 = 2n^2. \]
This says $m^2$ is even, so $m$ itself must be even. (If $m = 2k + 1$, odd, then $m^2 = 4k^2 + 4k + 1$, odd.) But if $m = 2k$, then $4k^2 = 2n^2$, so $2k^2 = n^2$, so $n^2$ is even, and hence $n$ is also even. Thus, $m$ and $n$ are both even, which contradicts the fact that $\frac{m}{n}$ was in lowest terms.

The above may be extended to show that no prime has a rational square root. This result will be often used in examples. If you would like to prove this, remember the words "even" and "odd" mean "divisible by 2" and "not divisible by 2".

In any case, we see that $\sqrt{2}$, if it exists, is irrational; that is, not rational. We will prove, however, that it does exist in $\mathbb{R}$ as a consequence of completeness. (See the section EXISTENCE OF ROOTS.)

6.1. No prime has a rational square root. Euclid's Lemma 5.4 from the section A LITTLE NUMBER THEORY is needed.

6.2. A natural number $n$ has a rational square root only if it is actually a perfect square: $n = m^2$, where $m$ is also a natural number. (Uses the Unique Factorization Theorem 5.6 of Number Theory.)

7.
THE EXISTENCE OF ROOTS: A CONSEQUENCE OF COMPLETENESS

We now find out that the irrational number $\sqrt{2}$ does indeed exist.

7.1 Theorem. Let $y$ be a positive real number. Then, for every $n \in \mathbb{N}$, there exists a unique positive real number $x$ such that $x^n = y$.

Proof. (1) First, note that for positive numbers $a, b$,
\[ a < b \implies a^n < b^n. \]
This is proved by induction on $n$. (Exercise.)

(2) This implies uniqueness: Suppose $x_1^n = y$ and $x_2^n = y$, with $x_1$ and $x_2$ positive but not equal. Then one must be smaller, by trichotomy. Say $x_1 < x_2$. Then $x_1^n < x_2^n$, so we can't have $x_1^n = x_2^n$. The contradiction shows $x_1 = x_2$.

(3) Let $A = \{ a > 0 : a^n \le y \}$, $B = \{ b > 0 : b^n \ge y \}$. I claim that $A$ and $B$ are not empty and every element of $A$ is $\le$ every element of $B$. Indeed, since $y > 0$, $0 < \frac{y}{y+1} < 1$; so we have
\[ 0 < \left( \frac{y}{y+1} \right)^n \le \frac{y}{y+1} < y. \]
Thus, $\frac{y}{y+1} \in A$. In the same way, $y + 1 > 1$, so $(y + 1)^n \ge y + 1 > y$ and hence $y + 1 \in B$. Now, if $a \in A$ and $b \in B$ we have $a^n \le y \le b^n$, so $a^n \le b^n$, and therefore $a \le b$. This comes from step 1, because if $a > b$ we would have $a^n > b^n$.

(4) Step 3 sets us up for our completeness axiom. There must exist an $x$ with $a \le x \le b$, for all $a \in A$ and all $b \in B$. We will now show that for this $x$, $x^n = y$. Let $0 < a < x < b$. Then $a^n < x^n < b^n$. But also, $a \notin B$ (since $x \le$ each element of $B$), so $a^n < y$. Similarly, $y < b^n$. Thus,
\[ a^n < x^n < b^n, \qquad a^n < y < b^n. \]
Since both $x^n$ and $y$ are in the interval $(a^n, b^n)$,
\[ |x^n - y| < b^n - a^n. \]
But,
\[ b^n - a^n = (b - a) \sum_{i=1}^{n} b^{n-i} a^{i-1} \le (b - a)\, n\, b^{n-1}. \]
So, $|x^n - y| \le (b - a)\, n\, b^{n-1}$. Now take any $\varepsilon$ such that $0 < \varepsilon < x$ and put $a = x - \varepsilon$, $b = x + \varepsilon$. Then $b - a = 2\varepsilon$ and $b < 2x$ and so
\[ |x^n - y| \le 2\varepsilon\, n\, (2x)^{n-1}. \]
Hence,
\[ \frac{|x^n - y|}{2n(2x)^{n-1}} \le \varepsilon. \]
Now $\varepsilon$ here was arbitrary satisfying $0 < \varepsilon < x$. So the non-negative number on the left is at most every such $\varepsilon$, and therefore
\[ \frac{|x^n - y|}{2n(2x)^{n-1}} = 0. \]
Hence $|x^n - y| = 0$. But then $x^n - y = 0$, so $x^n = y$, which completes the proof.

The proof just given may appear difficult at first, but will seem easy later in the course.
I just wanted to show how the axioms immediately give such an important result, without developing a big theory. Though the words aren't there, the proof gives a preview of arguments about limits and continuous functions. The result itself is a special case of the Intermediate Value Theorem 32.1.

8. THE EXTENDED REAL NUMBER SYSTEM

The extended real number system $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$ consists of the set of real numbers $\mathbb{R}$ together with two additional objects, $+\infty$ and $-\infty$. The ordering of $\mathbb{R}$ is extended to $\overline{\mathbb{R}}$ by setting
\[ -\infty < x < +\infty, \]
for each $x \in \mathbb{R}$. The resulting order still satisfies the trichotomy and transitivity axioms.

The arithmetic operations are partially extended to $\overline{\mathbb{R}}$ by setting, for $x \in \mathbb{R}$,
\[ x + (+\infty) = +\infty + x = +\infty, \qquad x - (+\infty) = -\infty \tag{13} \]
\[ x + (-\infty) = -\infty + x = -\infty, \qquad x - (-\infty) = +\infty \tag{14} \]
\[ x / {+\infty} = x / {-\infty} = 0 \tag{15} \]
\[ \text{if } x > 0, \quad x(+\infty) = (+\infty)x = +\infty, \qquad x(-\infty) = (-\infty)x = -\infty \tag{16} \]
\[ \text{if } x < 0, \quad x(+\infty) = (+\infty)x = -\infty, \qquad x(-\infty) = (-\infty)x = +\infty \tag{17} \]
\[ +\infty + (+\infty) = +\infty, \qquad (+\infty)(+\infty) = (-\infty)(-\infty) = +\infty \tag{18} \]
\[ -\infty + (-\infty) = -\infty, \qquad (+\infty)(-\infty) = (-\infty)(+\infty) = -\infty \tag{19} \]
The expressions $+\infty - (+\infty)$, $+\infty + (-\infty)$, and $-\infty + (+\infty)$ are left undefined, as are the corresponding quotients $+\infty / {+\infty}$, etc.

The expressions $0(+\infty)$ and $0(-\infty)$ are sometimes defined to be 0, since a "rectangle" of infinite length and 0 height should have area 0. We shall have little use for this in this course.

Be very careful not to assume that the axioms of the real number system hold in $\overline{\mathbb{R}}$. For example, $+\infty(2 + (-1)) = +\infty$, but $+\infty(2) + (+\infty)(-1)$ is not defined.

9. THE COMPLEX NUMBER SYSTEM

The system $\mathbb{C}$ of complex numbers is a field (that is, it satisfies the field axioms A1–A4, M1–M4, and DL listed in section 1, AXIOMS OF THE REAL NUMBER SYSTEM) which contains $\mathbb{R}$ as a subfield.$^1$ The set $\mathbb{C}$ contains an element $i$ for which $i^2 = -1$. This shows that $\mathbb{C}$ cannot be made into an ordered field, since the square of any member of an ordered field is non-negative.
We take as an axiom that every element of $\mathbb{C}$ is of the form
\[ z = x + iy, \quad \text{where } x, y \in \mathbb{R}. \]
This representation is unique. Indeed, if
\[ x + iy = x' + iy', \quad \text{with } x, y, x', y' \in \mathbb{R}, \]
then $x - x' = i(y' - y)$, so $(x - x')^2 = -(y' - y)^2$. The left side is non-negative and the right side is non-positive. Since there are no real numbers that are both positive and negative, both sides are 0, which shows $x = x'$ and $y = y'$, as required.

Because of the unique representation just shown, $x + iy$ can be identified with the ordered pair $(x, y)$. (This is often used as a way of constructing $\mathbb{C}$ from $\mathbb{R}$.)

If $z = x + iy$, where $x, y$ are real, then $x$ is called the real part of $z$, denoted $\operatorname{Re}(z)$, and $y$ is called the imaginary part of $z$, denoted $\operatorname{Im}(z)$; the number $\bar{z} = x - iy$ is called the complex conjugate of $z$. Thus, $z\bar{z} = x^2 + y^2$, and one defines the magnitude or absolute value of $z$ to be $|z| = \sqrt{z\bar{z}} = \sqrt{x^2 + y^2}$. (See 7. THE EXISTENCE OF ROOTS: A CONSEQUENCE OF COMPLETENESS.) Notice that when $z = x$, real, then $|z|$ is the same as $|x|$ as defined in section 3. CONSEQUENCES OF THE ORDERED FIELD AXIOMS.

9.1 Theorem. If $z$ and $w$ are complex numbers, then
(1) $\overline{z + w} = \bar{z} + \bar{w}$,
(2) $\overline{zw} = \bar{z}\,\bar{w}$,
(3) $z + \bar{z} = 2\operatorname{Re}(z)$, and $z - \bar{z} = 2i\operatorname{Im}(z)$.

9.2 Theorem. If $z$ and $w$ are complex numbers, then
(1) $|z| \ge 0$, and $|z| = 0$ if and only if $z = 0$,
(2) $|\bar{z}| = |z|$,
(3) $|zw| = |z||w|$,
(4) $|\operatorname{Re}(z)| \le |z|$, $|\operatorname{Im}(z)| \le |z|$, and
(5) $|z + w| \le |z| + |w|$ (triangle inequality).

All the proofs here are straightforward, except perhaps the proof of the triangle inequality: Since $\bar{z}w$ is the conjugate of $z\bar{w}$,
\[ z\bar{w} + \bar{z}w = 2\operatorname{Re}(z\bar{w}); \]
hence,
\[ |z + w|^2 = (z + w)\overline{(z + w)} = z\bar{z} + z\bar{w} + \bar{z}w + w\bar{w} = |z|^2 + 2\operatorname{Re}(z\bar{w}) + |w|^2 \]
\[ \le |z|^2 + 2|z\bar{w}| + |w|^2 = |z|^2 + 2|z||w| + |w|^2 = (|z| + |w|)^2, \]
from which the result follows.

$^1$ This means that the operations applied to complex numbers belonging to $\mathbb{R}$ yield the same results as in the real number system.

10.
A BIT ABOUT $\mathbb{R}^n$

As the reader will know from a first course in Linear Algebra, $\mathbb{R}^n$ (real Euclidean $n$-space) is the set of $n$-tuples $x = (x_1, \dots, x_n)$ of real numbers. We assume the reader is familiar with the basic algebraic properties of this space. There is an addition $x + y$ and a multiplication $cx$ defined for $x, y \in \mathbb{R}^n$ and $c \in \mathbb{R}$:
\[ x + y = (x_1 + y_1, \dots, x_n + y_n), \qquad cx = (cx_1, \dots, cx_n), \]
making $\mathbb{R}^n$ a vector space (we won't reproduce the definition here).

There is a dot product defined by $x \cdot y = \sum_{i=1}^{n} x_i y_i$ and a norm $|x| = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$. The notation $\|x\|$ is also used, especially if one wants to emphasize that some letters represent real numbers and others represent vectors.

After establishing simple properties of the dot product, one proves the Cauchy-Schwarz inequality:
\[ |x \cdot y| \le |x||y|, \]
as follows. If $x$ or $y$ is 0, both sides are 0. If $|x| = 1$ and $|y| = 1$,
\[ 0 \le |x - y|^2 = (x - y) \cdot (x - y) = x \cdot x - 2x \cdot y + y \cdot y = |x|^2 - 2x \cdot y + |y|^2 = 2 - 2x \cdot y, \]
so that $x \cdot y \le 1$. Similarly, using $0 \le |x + y|^2$, we have $-x \cdot y \le 1$. Thus, if $|x| = |y| = 1$,
\[ |x \cdot y| \le 1. \tag{$\ast$} \]
If $x \ne 0$ and $y \ne 0$, we can replace $x$ and $y$ in $(\ast)$ by $x/|x|$ and $y/|y|$ to obtain
\[ \left| \frac{x}{|x|} \cdot \frac{y}{|y|} \right| \le 1, \]
which is the same as $|x \cdot y| \le |x||y|$, as required.

The triangle inequality
\[ |x + y| \le |x| + |y| \]
follows from the Cauchy-Schwarz inequality via the calculation
\[ |x + y|^2 = (x + y) \cdot (x + y) = x \cdot x + 2x \cdot y + y \cdot y \le x \cdot x + 2|x||y| + y \cdot y = |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2. \]

In $\mathbb{R}^n$, the distance between two points $x$ and $y$ is $d(x, y) = |x - y|$.

10.1. Let $x, y \in \mathbb{R}^n$, $t_0 > 0$. If $|x + ty| \ge |x|$, for all real $t$ with $0 < t < t_0$, then $x \cdot y \ge 0$.

10.2. The line through $a$ in the direction $v$ is given by $L = \{ a + tv : t \in \mathbb{R} \}$. Find the distance between 2 arbitrary points $x = a + tv$ and $y = a + sv$ on the line.

11. CANTOR'S PRINCIPLE AND THE UNCOUNTABILITY OF $\mathbb{R}$

Recall that a bounded closed interval in $\mathbb{R}$ is a set of the form $[a, b] = \{ x : a \le x \le b \}$, where $a, b \in \mathbb{R}$.
(Often, when the context makes it clear, one omits the word "bounded".)

11.1 Theorem (Cantor's Principle of nested intervals). For each $n \in \mathbb{N}$, let $I_n$ be a (non-empty) bounded closed interval of the real number system and suppose $I_n \supseteq I_{n+1}$, for all $n$. Then, $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$.

Proof. Let $I_n = [a_n, b_n]$, for each $n$. The condition $I_n \supseteq I_{n+1}$ says that for each $n$, $a_n \le a_{n+1} \le b_{n+1} \le b_n$:
\[ a_1 \le a_2 \le \cdots \le a_n \le \cdots \le b_n \le \cdots \le b_2 \le b_1. \]
A simple induction yields that $n \le k$ implies $a_n \le a_k$ and $b_k \le b_n$. (See Exercises 4.7 and 4.8.) From this we see that for any $n, m \in \mathbb{N}$,
\[ a_n \le b_m. \]
Indeed, let $k = \max\{n, m\}$; then
\[ a_n \le a_k \le b_k \le b_m. \]
Therefore, for all $n, m$, $a_n \le b_m$; hence, by completeness there exists a real number $x$ such that for all $n, m$, $a_n \le x \le b_m$. Thus, for all $n$, $a_n \le x \le b_n$; that is, $x \in \bigcap_{n \in \mathbb{N}} I_n$, so the intersection is not empty.

With the concepts of supremum and infimum, we can prove a more precise result:
\[ \bigcap_{n \in \mathbb{N}} I_n = [a, b], \tag{20} \]
where $a = \sup\{ a_n : n \in \mathbb{N} \}$, and $b = \inf\{ b_m : m \in \mathbb{N} \}$. [See 12. SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY.]

The uncountability of the reals. We saw in the section on existence of roots that there are irrational numbers, $\sqrt{2}$ in particular. We are now going to see that there are many more irrational numbers than there are rationals. One can prove that it is possible to "count" the rational numbers; that is, to list all the rational numbers $r_1, r_2, r_3, r_4, \dots$, letting one rational number $r_n$ correspond to each natural number $n$. The theorem below will prove that this cannot be done for the real numbers. More explanation about the countability of $\mathbb{Q}$ and of the concepts will follow the proof.

An interval of $\mathbb{R}$ is called non-degenerate if it has at least 2 points in it. In particular, an interval $[a, b]$ of $\mathbb{R}$ is non-degenerate if $a < b$.

11.2 Lemma. If $x \in \mathbb{R}$ and $[a, b]$ is a non-degenerate closed interval of $\mathbb{R}$, then there exists a non-degenerate closed interval $I \subseteq [a, b]$ with $x \notin I$.

Proof. Let $a'$ and $b'$ be any two points with $a < a' < b' < b$, for example $a' = a + (b - a)/3$ and $b' = a + 2(b - a)/3$.
Then the intervals $[a, a']$ and $[b', b]$ are disjoint, so either $x \notin [a, a']$ or $x \notin [b', b]$.

Of course, it is quite possible that $x$ is in neither of the two intervals constructed in the above proof.

11.3 Theorem. Each non-degenerate interval $I \subseteq \mathbb{R}$, including $\mathbb{R}$ itself, is uncountable.

Proof. To say that $I$ is uncountable is to say that it cannot be written in the form
\[ I = \{ x_1, x_2, \dots \} = \{ x_n : n \in \mathbb{N} \}. \]
Suppose to the contrary, that we could write $I$ in that form. Start with 2 points $a < b$ in $I$. Then, by the lemma, there exists a non-degenerate closed interval $I_1 \subseteq [a, b]$ with $x_1 \notin I_1$. Again, there is a non-degenerate closed interval $I_2 \subseteq I_1$ with $x_2 \notin I_2$. Continuing recursively, assuming $I_k$ defined with $x_k \notin I_k$, we find a non-degenerate closed interval $I_{k+1}$ contained in $I_k$ with $x_{k+1} \notin I_{k+1}$. Thus, we have a nested sequence of closed intervals
\[ [a, b] \supseteq I_1 \supseteq I_2 \supseteq \cdots \supseteq I_k \supseteq I_{k+1} \supseteq \cdots, \]
with $x_k \notin I_k$, for all $k$. The intersection $\bigcap_{n \in \mathbb{N}} I_n$ of all these $I_n$ contains none of the points of $I$, yet by Cantor's Principle of Nested Intervals, $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$. This is a contradiction.

Let us go through that contradiction slowly. Cantor's principle says $\bigcap_{n \in \mathbb{N}} I_n \ne \emptyset$. But, if $c$ is a point of $\bigcap_n I_n$, then $c$ belongs to $I$ so is one of the $x_k$. But $x_k \notin I_k$ and $I_k \supseteq \bigcap_n I_n$; so $c = x_k \notin \bigcap_n I_n$.

The set of real numbers $\mathbb{R}$ itself can be viewed as the interval $(-\infty, +\infty)$, so the above argument applies.

The reader may have seen elsewhere a proof involving decimal expansions. Although it looks quite different, it is essentially the same type of proof. The reason is that if $x$ has decimal expansion $0.d_1 d_2 d_3 \dots$ then the digits $d_1, d_2, d_3, \dots$ determine a decreasing sequence $I_1, I_2, I_3, \dots$ of intervals to which $x$ belongs and can be chosen to ensure that $x_n \notin I_n$. You might try working this out.

Countability of the Rationals.
One famous way of counting the positive rationals is
\[ \tfrac{1}{1},\ \ \tfrac{2}{1}, \tfrac{1}{2},\ \ \tfrac{3}{1}, \tfrac{2}{2}, \tfrac{1}{3},\ \ \tfrac{4}{1}, \tfrac{3}{2}, \tfrac{2}{3}, \tfrac{1}{4},\ \ \tfrac{5}{1}, \tfrac{4}{2}, \tfrac{3}{3}, \tfrac{2}{4}, \tfrac{1}{5},\ \dots, \]
listing first those for which the numerator and denominator sum to 2, then those for which they sum to 3, then to 4, then to 5, etc. (If we want to list a number only once we would have to omit duplicates, namely those which are not in lowest terms, such as $\tfrac{2}{2}$, $\tfrac{3}{3}$, $\tfrac{2}{6}$, etc.) To list all the rationals, start with 0, then alternate positive and negative, thus:
\[ 0,\ \ \tfrac{1}{1}, -\tfrac{1}{1},\ \ \tfrac{2}{1}, -\tfrac{2}{1}, \tfrac{1}{2}, -\tfrac{1}{2},\ \ \tfrac{3}{1}, -\tfrac{3}{1}, \tfrac{2}{2},\ \dots \]
So, the set of rational numbers is countable.

A set is finite if it is empty or it can be written in the form $\{ x_1, x_2, \dots, x_n \}$ where $n$ is a fixed natural number. Finite sets are also considered countable. It is evident, then, that each subset of a countable set is countable. It turns out that a set (such as $\mathbb{Q}$) which is countable and infinite can always be written
\[ \{ x_1, x_2, x_3, \dots \} = \{ x_i : i \in \mathbb{N} \}, \]
with all of the $x_i$ distinct. Such sets are called denumerable (or countably infinite).

An interesting way to enumerate the positive rationals is given by the following recursive definition. Put $x_1 = 1$, then for $k \in \mathbb{N}$, put $x_{2k} = x_k + 1$ and $x_{2k+1} = 1/x_{2k}$. I learned about this from an article by Shen Yu-Ting (Shen You Ding) in the American Mathematical Monthly, vol 87, 1980, pages 25–29.

For more about these things, including rigorous proofs, see the section COUNTABILITY.

Cantor's Principle in $\mathbb{R}^n$. To say that $I$ is a bounded closed interval of $\mathbb{R}^n$ means
\[ I = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n], \]
where each of $[a_1, b_1], \dots, [a_n, b_n]$ is a bounded closed interval of $\mathbb{R}$. If $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ we write $[a, b]$ for this $I$. With these definitions we obtain the same result as before.

11.4 Theorem (Cantor's Principle of Nested Intervals in $\mathbb{R}^n$). Let $(I_k)$ be a sequence of non-empty bounded closed intervals in $\mathbb{R}^n$ with $I_k \supseteq I_{k+1}$ for all $k \in \mathbb{N}$.
Then $\bigcap_k I_k \ne \emptyset$.

The proof is an exercise. It consists of reducing the problem to the corresponding theorem in $\mathbb{R}$, using the fact that if $A = A_1 \times \cdots \times A_n$, then $c \in A$ if and only if $c_i \in A_i$, for $i = 1, \dots, n$.

11.1. Bounded intervals are required in Cantor's Principle: If $(I_k)$ is a sequence of unbounded nonempty closed intervals (of the form $[a_k, +\infty)$, say) with $I_k \supseteq I_{k+1}$, for all $k$, we can have $\bigcap_{k=1}^{\infty} I_k = \emptyset$. This requires the Archimedean property. See section 12: SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY.

11.2. Closed intervals are required in Cantor's Principle: If $(I_k)$ is a sequence of bounded non-empty intervals with $I_k \supseteq I_{k+1}$, for all $k$, we can have $\bigcap_{k=1}^{\infty} I_k = \emptyset$. This also requires the Archimedean property. See SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY.

11.3. Let $\{ I_\alpha : \alpha \in A \}$ be a family of closed bounded intervals $I_\alpha = [a_\alpha, b_\alpha]$ of $\mathbb{R}$. Assume that each pair of intervals of the family intersect. Then, $\bigcap_{\alpha \in A} I_\alpha \ne \emptyset$. (Show that for all $\alpha, \beta \in A$, $a_\alpha \le b_\beta$ and use completeness.) This is a special case of Helly's Theorem on compact convex sets in $\mathbb{R}^n$.

12. SUPREMA, INFIMA, AND THE ARCHIMEDEAN PROPERTY

For a subset $S$ of $\mathbb{R}$, a real number $c$ is the largest element of $S$ (also called greatest element or maximum of $S$), if
(1) $c \ge x$, for all $x \in S$, and
(2) $c \in S$.
We then write $c = \max S$. Similarly, $c$ is called the smallest element of $S$ (also called least element or minimum of $S$), and we write $c = \min S$, if
(1) $c \le x$, for all $x \in S$, and
(2) $c \in S$.

It is immediate that a set can only have one maximum and only one minimum. Indeed, if $c_1$ and $c_2$ are different elements of $S$ then either $c_1 < c_2$ or $c_2 < c_1$, so they can't both be the largest, nor can they both be the smallest.

Recall that a finite set $F$ is one which is either $\emptyset$ or can be written as $F = \{ x_1, \dots, x_n \}$, where $n$ is a natural number.
It is a nice exercise in induction to prove that every nonempty finite set of real numbers has a largest and a smallest element. We also know, from the Well-ordering Property, that each non-empty set of natural numbers has a least element. The existence of greatest or least elements in other situations is guaranteed by completeness.

For a subset $S$ of $\mathbb{R}$, a real number $u$ is said to be an upper bound for $S$ if $u \ge x$, for all $x \in S$. Similarly, $\ell$ is called a lower bound for $S$ if $\ell \le x$, for all $x \in S$.

12.1 Theorem. (a) If a non-empty set $S \subseteq \mathbb{R}$ has an upper bound, then it has a least one.
(b) If such an $S$ has a lower bound, then it has a greatest one.

Proof. (a) First suppose $S$ has an upper bound and let $U$ be the set of all the upper bounds of $S$. Our job is to prove that $U$ has a least element. Now $S$ is non-empty, by hypothesis, and $U$ is non-empty, since an upper bound is assumed to exist. By definition of upper bound, if $x \in S$ and $u \in U$, we have
\[ x \le u. \]
Thus, we are in the setting of the Completeness Axiom: $S$ plays the role of $A$ and $U$ plays the role of $B$. Therefore, there exists a real number $c$ such that
\[ x \le c \le u, \quad \text{for all } x \in S \text{ and } u \in U. \]
This is exactly what we want. That $x \le c$ for all $x \in S$ says that $c$ is an upper bound for $S$, and that $c \le u$, for all $u \in U$, says that $c$ is the least one.

(b) Let us briefly outline the proof for lower bounds. If $S \ne \emptyset$ and has a lower bound, then the set $L$ of all its lower bounds is non-empty, and by definition $\ell \le x$ for all $\ell \in L$ and all $x \in S$. By the completeness axiom, there is a $c$ with $\ell \le c \le x$ for all $\ell \in L$ and all $x \in S$. This $c$ is then the greatest lower bound of $S$.

The least upper bound, if it exists, is also called the supremum of $S$ ($\sup S$) and the greatest lower bound of a set $S$, if it exists, is also called the infimum of $S$ ($\inf S$). A set which has an upper bound $u$ is called bounded above by $u$ and a set which has a lower bound $\ell$ is called bounded below by $\ell$.

The theorem we just stated is often quoted as:

Theorem.
Every non-empty set bounded above has a supremum, and every non-empty set bounded below has an infimum.

In many books, one of these is taken as an axiom and is called completeness. We have shown these follow from our Completeness Axiom. It is also easy to show that each implies our axiom.

Going back to the meanings of the terms we see that $c = \sup S$ iff
(1) $c \ge x$, for all $x \in S$ ($c$ is an upper bound of $S$) and
(2) if $u \ge x$ for all $x \in S$, then $u \ge c$ (the least one).
It is often useful, mainly in checking individual examples, to use condition (2) in its contrapositive form:
(2$'$) if $u < c$ then there exists $x \in S$ with $u < x$.
(In other words, if $u$ is less than $c$, then it is no longer an upper bound of $S$.) You should compare (1) and (2) to the corresponding conditions for maximum, to see that if $c = \max S$, then $c = \sup S$.

In the case of lower bounds we see that $c = \inf S$ iff
(1) $c \le x$, for all $x \in S$ and
(2) if $\ell \le x$ for all $x \in S$, then $\ell \le c$,
or in contrapositive form
(2$'$) if $\ell > c$ then there exists $x \in S$ with $\ell > x$.
(In other words, if $\ell$ is greater than $c$, then it is no longer a lower bound of $S$.) Again a minimum of a set is always its infimum.

12.2 Examples. (a) For the set $A = \{ \tfrac{1}{2}, 3, 2, 1 \}$, 3 is the maximum of $A$ since it belongs to $A$ and is $\ge$ each of the members of $A$. It is therefore also the supremum of $A$. Similarly, the minimum (hence also the infimum) of $A$ is $\tfrac{1}{2}$.

(b) The infimum of the set $B = (3, \infty) = \{ x : x > 3 \}$ is 3. This requires proof. Certainly,
(1) $3 < x$, for all $x \in (3, \infty)$.
Now, suppose $\ell$ is a lower bound for $B$. That is, suppose $\ell \le x$ for all $x > 3$. Then, $\ell \le 3$. This was "our First Analysis Result", 3.3 (in section 3: CONSEQUENCES OF THE ORDERED FIELD PROPERTIES). Thus,
(2) if $\ell \le x$ for all $x \in B$, then $\ell \le 3$.
Hence, 3 is the greatest lower bound of $B$ (infimum).

(c) The set $\mathbb{N}$ of natural numbers has 1 as a minimum, hence infimum. However, as the following theorem shows, it is not bounded above, so has no supremum.
12.3 The Archimedean Property. If $\varepsilon, b \in \mathbb{R}$ and $\varepsilon > 0$, then there exists an $n \in \mathbb{N}$ such that $n\varepsilon > b$ (equivalently, that $\frac{b}{n} < \varepsilon$).

Proof. Let $\varepsilon > 0$, and $b$ any element of $\mathbb{R}$. Suppose the conclusion were false. Then we would have $n\varepsilon \le b$, for all $n \in \mathbb{N}$. This means that the set $S = \{ n\varepsilon : n \in \mathbb{N} \}$ is bounded above (by $b$). Thus, $S$ has a supremum (say, $c = \sup S$). Since $c$ is an upper bound for $S$,
\[ n\varepsilon \le c, \quad \text{for all } n \in \mathbb{N}. \]
But if $n \in \mathbb{N}$, so is $n + 1$, hence
\[ (n + 1)\varepsilon \le c, \quad \text{for all } n \in \mathbb{N}, \]
and therefore,
\[ n\varepsilon \le c - \varepsilon, \quad \text{for all } n \in \mathbb{N}. \]
But this shows that $c - \varepsilon$ is also an upper bound for $S$, even though it is smaller than $c$, since $\varepsilon > 0$. This is a contradiction, establishing the result.

Another way to look at the contradiction at the end of the proof is to conclude from
\[ n\varepsilon \le c - \varepsilon, \quad \text{for all } n \in \mathbb{N} \]
that $c \le c - \varepsilon$, by property (2) of supremum, and hence $c < c$.

12.4 Example. $\bigcap_{n \in \mathbb{N}} (0, 1/n] = \emptyset$. To prove this, suppose the set is not empty; say, $x \in \bigcap_{n \in \mathbb{N}} (0, 1/n]$. Then, for all $n \in \mathbb{N}$, $x \in (0, 1/n]$; that is, for all $n \in \mathbb{N}$, $0 < x \le 1/n$. But by the Archimedean Property, if $x > 0$, there exists $n \in \mathbb{N}$ with $1/n < x$, so this is a contradiction. Therefore, $\bigcap_{n \in \mathbb{N}} (0, 1/n] = \emptyset$.

12.5 Examples. Practice Problems. The solutions appear below in 12.12.
(1) Determine, with proof, the set $S = \bigcap_{n \in \mathbb{N}} \left( 1 - \frac{1}{n}, 1 + \frac{1}{n} \right)$, and find its supremum and infimum (if possible).
(2) Find the supremum, infimum, maximum and minimum of $\{ \frac{1}{n} : n \in \mathbb{N} \}$, if they exist.
(3) Do the same for $(1, 2]$.

12.6 Theorem (Order density of the rationals in $\mathbb{R}$). If $a$ and $b$ are real numbers with $a < b$, then there is a rational number $r$ with $a < r < b$.

The intuitive idea is that we can use the Archimedean property to get to the left of $a$ and then move in small steps of size $\frac{1}{n} < b - a$ to the right till we (or the rational number) land(s) in $(a, b)$.

Proof. Fix $a, b \in \mathbb{R}$ with $a < b$. By the Archimedean property, there exists $k \in \mathbb{N}$ with $k > -a$, so $-k < a$. Choose such a $k$.
Now, by the second form of the Archimedean property, we can choose $n \in \mathbb{N}$, with
\[ \frac{1}{n} < b - a. \]
Again, by the Archimedean property, there exists a natural number $m$ such that $m \cdot \frac{1}{n} > a + k$. By the Well-ordering Property of $\mathbb{N}$, there is a smallest such natural number. If $m$ is this one, we have
\[ \frac{m}{n} > a + k \quad \text{but} \quad \frac{m - 1}{n} \le a + k. \]
Thus,
\[ a < \frac{m}{n} - k, \]
and
\[ \frac{m}{n} - k = \frac{m - 1}{n} + \frac{1}{n} - k \le \frac{1}{n} + a < b - a + a = b, \]
since $\frac{1}{n} < b - a$. Summarizing, we have
\[ a < \frac{m}{n} - k < b, \]
that is, $a < r < b$, where $r = \frac{m}{n} - k$, a rational number.

12.7 Note. In the above proof, we were unable to use the Well-ordering Property directly to get a smallest integer with a certain property, so we translated $a$ and $b$ to the right by adding a suitable $k$, so that $a + k$ and $b + k$ are positive. Then, we were able to use the Well-ordering Property to get a least natural number $m$ with $m/n > a + k$, and translate back to get $m/n - k > a$. This translation technique is useful in many situations. In the next result we use it to deduce density of the irrational numbers from density of the rational numbers.

In general a subset $A$ of $\mathbb{R}$ is called order dense in $\mathbb{R}$ if whenever $a$ and $b$ are real numbers with $a < b$, then there is an element $x \in A$ with $a < x < b$.

12.8 Theorem (Order density of the irrationals in $\mathbb{R}$). If $a$ and $b$ are real numbers with $a < b$, then there is an irrational number $x$ with $a < x < b$. (That is, $\mathbb{Q}^c$ is order dense in $\mathbb{R}$.)

Proof. Let $a, b$ be real numbers with $a < b$. Then, $a - \sqrt{2} < b - \sqrt{2}$. By the density of the rationals, we can choose $r \in \mathbb{Q}$ with $a - \sqrt{2} < r < b - \sqrt{2}$. Then, $a < \sqrt{2} + r < b$. Take $x = \sqrt{2} + r$. Then, $a < x < b$ and $x$ is irrational (if $x$ were rational, then $\sqrt{2} = x - r$ would be rational too), as required.

Here are some more results that you can easily deduce from the Archimedean Property, using translation and the Well Ordering Property.

12.9 Corollary. For every real number $x$, there exists an integer $n$ with $n < x$.

12.10 Theorem. Every non-empty set of integers bounded below has a minimum and every non-empty set of integers bounded above has a maximum.
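The three choices made in the proof of Theorem 12.6 (first $k$, then $n$, then the least $m$) can be traced in a short computation. The following is only a sketch under stated assumptions: the function name `rational_between` is made up for this illustration, and the inputs are taken to be rationals (Python `Fraction` values) so that every comparison is exact, whereas the theorem itself is about arbitrary real numbers.

```python
from fractions import Fraction
import math


def rational_between(a, b):
    """Return a rational r with a < r < b, following the proof of 12.6.

    Assumption for this sketch: a and b are given as exact rationals.
    """
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    # Step 1: a natural number k with k > -a, so that a + k > 0.
    k = max(1, math.floor(-a) + 1)
    # Step 2: a natural number n with 1/n < b - a.
    n = math.floor(1 / (b - a)) + 1
    # Step 3: the least natural number m with m/n > a + k.
    m = math.floor((a + k) * n) + 1
    # Translate back: r = m/n - k.
    return Fraction(m, n) - k


r = rational_between(Fraction(1, 3), Fraction(1, 2))
print(r)  # a rational strictly between 1/3 and 1/2
```

Here the floor function stands in for the appeals to the Archimedean and Well-ordering Properties: each `math.floor(...) + 1` picks the least integer strictly greater than the given quantity.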
12.11 Example. For each $x \in \mathbb{R}$, there is a largest integer $n$ with $n \le x$. It is denoted $\lfloor x \rfloor$ (pronounced "floor of $x$"). Thus, $\lfloor x \rfloor \in \mathbb{Z}$ and
\[ \lfloor x \rfloor \le x < \lfloor x \rfloor + 1. \]
$\lfloor x \rfloor$ is also called the integral part of $x$ and the function $x \in \mathbb{R} \mapsto \lfloor x \rfloor$ is called the floor function or greatest integer function. The fractional part of $x$ is $\langle x \rangle = x - \lfloor x \rfloor$. (Some use this terminology for $x - \lfloor x \rfloor$, if $x \ge 0$, and $-x - \lfloor -x \rfloor$ if $x < 0$.)

Similarly, there is a least integer $m$ with $m \ge x$, denoted $\lceil x \rceil$, "ceiling of $x$":
\[ \lceil x \rceil - 1 < x \le \lceil x \rceil. \]

12.12 Examples (Solution of the practice problems 12.5).

(1) Determine $\bigcap_{n \in \mathbb{N}} (1 - \frac{1}{n}, 1 + \frac{1}{n})$ and find its supremum and infimum.

Call this set $S$. A point $x$ belongs to $S$ iff $x \in (1 - \frac{1}{n}, 1 + \frac{1}{n})$, for all $n \in \mathbb{N}$. Now, certainly 1 belongs to each of these intervals, so $1 \in S$. If $x \ne 1$, then either $x > 1$ or $x < 1$. If $x > 1$, then $x - 1 > 0$, so by the Archimedean property, there exists a natural number $n$ with $\frac{1}{n} < x - 1$, and hence $x > 1 + \frac{1}{n}$. Thus, for this $n$, $x \notin (1 - \frac{1}{n}, 1 + \frac{1}{n})$, so $x \notin S$. Similarly, if $x < 1$, then $1 - x > 0$ and we can find $n$ with $\frac{1}{n} < 1 - x$, so that $x < 1 - \frac{1}{n}$, and again $x \notin S$. Thus, 1 is the only element of $S$; $S = \{1\}$. So $1 = \inf S = \sup S = \max S = \min S$.

(2) Find the supremum, infimum, maximum and minimum of $\{\frac{1}{n} : n \in \mathbb{N}\}$, if they exist.

Let us call the set $W$. Since, for all $n \in \mathbb{N}$,
\[ 0 < \frac{1}{n} \le 1, \qquad (\ast) \]
we see that $1 \ge x$, for all $x \in W$; and since $1 \in W$, 1 is the maximum of $W$. Since, when it exists, the maximum of a set is always its supremum, this says also that $1 = \sup W$.

Again from $(\ast)$, we see that $0 \le x$, for all $x \in W$; that is, 0 is a lower bound of $W$. To show it is the greatest one, suppose $\ell > 0$. Then, by the Archimedean property, there exists $n \in \mathbb{N}$ with $\frac{1}{n} < \ell$; that is, there exists $x \in W$ with $x < \ell$, so that $\ell$ is not a lower bound of $W$, as required. Hence $\inf W = 0$, and since $0 \notin W$, the set $W$ has no minimum.

(3) Find the maximum, minimum, supremum and infimum of the set $(1, 2]$, provided these exist.

Proof. Let us put $A = (1, 2]$. Then, by definition, $x \in A$ iff $1 < x \le 2$. Thus, $2 \ge x$, for all $x \in A$, and $2 \in A$.
These are exactly the two conditions that 2 be the maximum of $A$. So,
\[ 2 = \max A. \]
Since, when it exists, the maximum of a set is always its supremum, this says also that
\[ 2 = \sup A. \]
Let us go over this in this special case. We need two things: that 2 is an upper bound of $A$ (i.e. $2 \ge x$, for all $x \in A$) and that it is the least one (if $u \ge x$, for all $x \in A$, then $u \ge 2$). The first of these was already part of 2 being the maximum of $A$. As for the second, since $2 \in A$, if $u \ge$ each element of $A$, it is certainly $\ge 2$.

Now let us investigate the question of minimum and infimum. It looks like 1 "would like to be" the minimum of $A$, but $1 \notin A$, so it cannot be the minimum. We do, however, have that $1 \le x$, for all $x \in A$; that is, 1 is a lower bound of $A$. Is it the greatest one? Well, suppose $\ell > 1$. There are two possibilities: $\ell > 2$ and $1 < \ell \le 2$. In the first case, $\ell$ is not a lower bound of $A$, since $2 \in A$. In the second, if we put $a = \frac{1 + \ell}{2}$, we have $1 < a < \ell \le 2$. In particular, $1 < a \le 2$, so $a \in A$, and also, $a < \ell$, so $\ell$ is not a lower bound for $A$. We have thus proved that the infimum of $A$ is 1:
\[ 1 = \inf A. \]
Now, if $A$ had a minimum, it would have to also be the infimum, so 1 would have to be the minimum, which it is not, since $1 \notin A$. Thus, $(1, 2]$ has no minimum.

Supremum and Infimum as operations. We can view sup and inf as operations acting on the elements of a set $S$, yielding a new real number. This point of view can be quite convenient, especially in theoretical calculations. The definition of supremum, for example, together with the theorem on its existence, could be stated:

Let $S$ be a non-empty set of real numbers. If $x \le u$, for all $x \in S$, then
(0) $\sup S$ exists,
(1) $x \le \sup S$, for all $x \in S$, and
(2) $\sup S \le u$.

Thus, (for all $x \in S$, $x \le u$) implies $\sup S \le u$.

"The supremum of a set of numbers, each $\le u$, is also $\le u$."

(Of course, there is nothing new here, but a change of emphasis.)
Similarly, for a non-empty set $S$ of real numbers, from $\ell \le x$, for all $x \in S$, we deduce
(0) $\inf S$ exists,
(1) $\inf S \le x$, for all $x \in S$,
(2) $\ell \le \inf S$.

"The infimum of a set of numbers, each $\ge \ell$, is also $\ge \ell$."

Let us look at these principles in action.

12.13 Examples. (1) If $S$ is a set of real numbers, and $c \in \mathbb{R}$, $cS$ denotes $\{cx : x \in S\}$. Now, if $S$ is a non-empty set of real numbers, bounded above, and $c > 0$, then $\sup(cS) = c \sup S$.

Proof. Let $S$ be non-empty and bounded above, so that $\sup S$ exists. Let $y \in cS$. Then, $y = cx$, for some choice of $x \in S$. Thus, $y/c = x \in S$. Since the supremum of $S$ is an upper bound for $S$,
\[ y/c \le \sup S. \]
Since $c > 0$, we may multiply by it and get
\[ y \le c \sup S. \]
But $y$ was an arbitrary element of $cS$, so for all $y \in cS$, $y \le c \sup S$. Hence, also,
\[ \sup(cS) \le c \sup S. \]
Now, suppose $x \in S$. Then, $cx \in cS$. Hence,
\[ cx \le \sup(cS). \]
Therefore, since $c$ is positive,
\[ x \le \frac{1}{c} \sup(cS). \]
The $x \in S$ here was arbitrary, so this statement is true for all $x \in S$. Thus again we may "take the supremum of the left side over all $x \in S$," obtaining
\[ \sup S \le \frac{1}{c} \sup(cS). \]
Thus,
\[ c \sup S \le \sup(cS). \]
This, with the earlier statement, yields $\sup(cS) = c \sup S$.

(2) Let $A \subset B$, where $A$ is non-empty and $B$ is bounded above. Then $\sup A \le \sup B$.

Proof. Let $x \in A$. Then, $x \in B$ and since $\sup B$ is an upper bound of $B$, $x \le \sup B$. Thus, for all $x \in A$, $x \le \sup B$. Hence, $\sup A \le \sup B$.

How would the above change if we talked about inf instead of sup?

Supremum and infimum in the extended real numbers system, $\overline{\mathbb{R}}$. The extended real numbers system consists of $\mathbb{R}$ together with the additional elements $-\infty$ and $+\infty$, satisfying $-\infty < x < +\infty$, for all real $x$. The definitions of supremum, infimum, maximum and minimum still make sense. Clearly every set in $\overline{\mathbb{R}}$ has $+\infty$ as an upper bound and $-\infty$ as a lower bound. If $A$ is a set which is not bounded above in $\mathbb{R}$, it will have $\sup A = +\infty$ in $\overline{\mathbb{R}}$; if it is not bounded below, then $\inf A = -\infty$ in $\overline{\mathbb{R}}$.

At this point the reader can prove the characterization of intervals mentioned in section 3, CONSEQUENCES OF THE ORDERED FIELD AXIOMS.

12.14 Theorem (Characterization of intervals). A subset $J$ of $\mathbb{R}$ is an interval if and only if for all $x, y \in J$ and $t \in \mathbb{R}$, if $x < t < y$ then $t \in J$. (Do the first two problems below first to get started.)

NOTE: For general statements below, we assume given sets are non-empty, even if we don't say so explicitly.

12.1. Let $A$ be a non-empty set of real numbers bounded above and below. Then $\inf A \le \sup A$.

12.2. If $I$ is a non-empty bounded interval of $\mathbb{R}$ with endpoints $a \le b$, then $a = \inf I$ and $b = \sup I$. (The possibilities are $I = [a, b]$, $(a, b)$, $(a, b]$, or $[a, b)$.)

12.3. The previous problem extends to unbounded intervals as well, using the conventions about supremum and infimum in $\overline{\mathbb{R}}$.

12.4. If $a < b$, then $\inf\big((a, b) \cap \mathbb{Q}\big) = a$ and $\sup\big((a, b) \cap \mathbb{Q}\big) = b$.

12.5. If $c = \sup A$ and $a$ is not the maximum of $A$, then $\sup(A \setminus \{a\}) = c$ also.

12.6. Let $A$ and $B$ be sets of real numbers with $A \subset B$. If $B$ is bounded above, then $\sup A \le \sup B$. If $B$ is bounded below, then $\inf A \ge \inf B$.

12.7. Let $A$ and $B$ be sets of real numbers bounded above. Then, $\sup(A \cup B) = \max\{\sup A, \sup B\}$.

12.8. Let $A$ and $B$ be non-empty sets of real numbers bounded below. Then, $\inf(A \cup B) = \min\{\inf A, \inf B\}$.

12.9. Let $A$ be a non-empty set of real numbers bounded above and let $c \in \mathbb{R}$. Then, $\sup(c + A) = c + \sup A$. ($c + A = \{c + a : a \in A\}$.)

12.10. Let $A$ be a non-empty set of real numbers bounded above and let $c \in \mathbb{R}$, $c > 0$. Then, $\sup(cA) = c \sup A$. ($cA = \{ca : a \in A\}$.)

12.11. Let $A$ be a set of real numbers bounded above. Then, $\inf(-A) = -\sup A$. ($-A = \{-a : a \in A\}$.)

12.12. Let $A$ be a set of real numbers bounded below and let $c \in \mathbb{R}$. Then, $\inf(c + A) = c + \inf A$. ($c + A = \{c + a : a \in A\}$.)

12.13. Let $A$ and $B$ be sets of real numbers bounded above. Then, $\sup(A + B) = \sup A + \sup B$. ($A + B = \{a + b : a \in A, b \in B\}$.)

12.14.
Let A and B be sets of real numbers bounded below. Then, inf.A C B/ D inf A C inf B. (A C B D fa C b W a 2 A; b 2 Bg.) 12.15. Let f and g be functions with domain X and values in R such that ff .x/ W x 2 X g and fg.x/ W x 2 X g are both bounded above. Then, supff .x/ C g.x/ W x 2 X g supff .x/ W x 2 X g C supfg.x/ W x 2 X g: Note: this is often written sup .f .x/ C g.x// sup f .x/ C sup fg.x/: x2X x2X x2X Simple examples show that equality need not hold. 12.16. Let f and g be functions with domain X and values in R such that ff .x/ W x 2 X g and fg.x/ W x 2 X g are both bounded below. Prove that infff .x/ C g.x/ W x 2 X g infff .x/ W x 2 X g C inffg.x/ W x 2 X g: Note: this is often written inf .f .x/ C g.x// inf f .x/ C inf fg.x/: x2X x2X x2X Equality need not hold. 12.17. Find (with proof) [ 2 (a) .2 C ; n. n n2N \ 1 (b) .2 ; 2/. n n2N [ 1 1 (c) ; . nC1 n n2N [ 1 2 (d) 2 C ; . n n n2N 12.18. Find the supremum, infimum, maximum, and minimum (if they exist) of the following sets. Prove your answer. (a) f3; 2; ; g 1 , for n 2 N. (b) fxn W n 1g, where xn D 1 C . 1/n C n (b0 ) fxn W n 2g, for the same xn as in (b). (b00 ) fx ˚ n W n 3g, for the same xn as in (b). (c) x 2 Q W x 2 13 . 1 1 (d) f n Cm W n; m 2 Ng. (e) Œa; b/, where a < b. x (f) f 1Cx W x 2 R; x > 0g. n p o (g) x 2 Q W x 7 . 12.19. If c is a fixed real number, then fx 2 R W x D c C r; where r 2 Qg is dense in R. 12.20. Prove the more precise version of Cantor’s Theorem referred to in equation 11.20: For each n 2 N , let In D Œan ; bn be a (non-empty) bounded closed interval of the real number system and suppose In InC1 , for all n. Then, \ In D Œa; b; n2N where a D supfan W n 2 Ng, and b D inffbm W m 2 Ng. 12.21. Every additive subgroup of R is either dense or consists of the integral multiples of one element. (Depending on whether or not it contains a least positive member.) [Proof sketch] If the infimum a of the set of positive elements of the group G is 0, then the group is dense. 
[If an interval $I$ has length $\ell > 0$, and $g \in G$ with $0 < g < \ell$, then some integer multiple of $g$ will belong to $I$.] Otherwise, $a > 0$. If $a \notin G$, there exists $g \in G$ with $a < g < a + a$ and again there is another $h \in G$ with $a < h < g$, so that $0 < g - h < a$. This is a smaller positive group element, which is impossible. Thus, $a \in G$. The claim is that $G = a\mathbb{Z}$. For if $g \in G$ and $na < g < (n+1)a$, we would have $0 < g - na < a$, again impossible.

12.22. Let $x$ be a real number and $N \in \mathbb{N}$. Then, there exist $m \in \{1, \dots, N\}$ and $n \in \mathbb{Z}$ with $|mx - n| < 1/N$. [The $N + 1$ numbers $[kx) = kx - \lfloor kx \rfloor$, for $k = 0, \dots, N$, lie in the interval $[0, 1)$. Since there are more than $N$ of these numbers, some pair of them must fall in one of the $N$ intervals $[\frac{i-1}{N}, \frac{i}{N})$ of length $1/N$.]

12.23. If $x$ is an irrational number, the set $x\mathbb{Z} + \mathbb{Z} = \{mx + n : m, n \in \mathbb{Z}\}$ is dense in $\mathbb{R}$. [Combine the 2 previous problems.]

12.24. If $x$ is irrational, then there exist an infinite number of rationals $p/q$ such that $|x - p/q| < 1/q^2$.

13. EXPONENTS

Here we will explain how to use completeness to define $a^x$ where $0 < a \in \mathbb{R}$ and $x \in \mathbb{R}$. What we want is that $a^1 = a$ and the laws of exponents hold, namely:
(1) $a^{x+y} = a^x a^y$
(2) $(a^x)^y = a^{xy}$
(3) $(ab)^x = a^x b^x$,
for as many values of $x$, $y$, $a$, $b$ as possible.

Natural exponent. Start by allowing $a$ to be any real number. For $n \in \mathbb{N}$, $a^n$ is defined recursively by $a^1 = a$ and $a^{n+1} = a^n a$. Thus, as we expect, $a^n$ becomes $\underbrace{a \cdots a}_{n \text{ times}}$. Using induction, we can prove that
\[ a^{n+m} = a^n a^m, \qquad (a^n)^m = a^{nm}, \qquad \text{and} \qquad (ab)^n = a^n b^n, \]
for all real $a, b$ and natural numbers $n, m$; that is, the laws of exponents for $a, b$ real and $x, y$ naturals.

Integer exponent. Now, restrict to the case $a \ne 0$. For $x$ an integer, $x = n - m$ where $n, m \in \mathbb{N}$.
We can prove that if $n - m = n' - m'$, where $n, m, n', m'$ are natural numbers, then
\[ \frac{a^n}{a^m} = \frac{a^{n'}}{a^{m'}}. \]
So it makes sense to define
\[ a^{n-m} = \frac{a^n}{a^m}. \]
Notice in particular that $a^0 = a^1/a^1 = 1$. With this definition, it is an easy exercise to show the laws of exponents hold for $a, b$ non-zero and $x, y$ integers.

Rational exponents. If $x$ is any positive real number and $n$ is a natural number, there exists a unique positive real number $y$ such that $y^n = x$, and we let $x^{1/n} = y$. See the section EXISTENCE OF ROOTS. Here, if $a > 0$ and $r$ is a rational number of the form $m/n$ where $m \in \mathbb{Z}$ and $n \in \mathbb{N}$, we put
\[ a^r = (a^m)^{1/n}. \]
To justify this, we need to show that if $r$ is also $p/q$, where $p \in \mathbb{Z}$ and $q \in \mathbb{N}$, then
\[ (a^m)^{1/n} = (a^p)^{1/q}. \]
By the uniqueness of roots, it is enough to show that when $(a^m)^{1/n}$ is raised to the $q$th power, the result is $a^p$. As a first step, notice that $(a^m)^{1/n}$ is also $(a^{1/n})^m$, since
\[ \big((a^{1/n})^m\big)^n = (a^{1/n})^{mn} = (a^{1/n})^{nm} = \big((a^{1/n})^n\big)^m = a^m. \]
But $m/n = p/q$ means $mq = pn$. Thus,
\[ \big((a^m)^{1/n}\big)^q = \big((a^{1/n})^m\big)^q = (a^{1/n})^{mq} = (a^{1/n})^{np} = \big((a^{1/n})^n\big)^p = a^p, \]
as required. Again, straightforward calculations allow us to prove the laws of exponents for $a, b > 0$ and $x, y$ rational.

Arbitrary real exponents. We now come to the point of all the above discussion. We would like to extend the definition of $a^x$ so it applies to all real $x$. For this, we look at three cases: $a > 1$, $a = 1$, and $a < 1$. Since we want $(1b)^x = 1^x b^x$, we have no choice but to take $1^x = 1$; and since we want $1 = (a a^{-1})^x = a^x (a^{-1})^x$, we will have to have $a^x = (a^{-1})^{-x}$. So we work with the case $a > 1$.

Let $a > 1$, $x \in \mathbb{R}$. Then, $a^x$ is defined to be the unique number satisfying
\[ a^r \le a^x \le a^s, \quad \text{for all rational numbers } r, s \text{ with } r < x < s. \qquad (\ast) \]
To establish that such a number exists, consider the two sets
\[ C = \{a^r : r \in \mathbb{Q},\ r < x\} \quad \text{and} \quad D = \{a^s : s \in \mathbb{Q},\ x < s\}. \]
We show
(a) $c \in C$ and $d \in D$ imply $c < d$.
(b) For each $\varepsilon > 0$ there exist $c \in C$ and $d \in D$ with $d - c < \varepsilon$.
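To get a feel for the sets $C$ and $D$, here is a small numerical sketch (an illustration only, under the assumption that floating-point powers are accurate enough for the purpose; the name `power_bracket` is hypothetical). For a given $n$ it picks rationals $r < x < s$ with denominators $n$ and returns the pair $(a^r, a^s)$, one element of $C$ and one of $D$. As $n$ grows, the two values close in on each other.

```python
from fractions import Fraction
from math import ceil, floor, sqrt


def power_bracket(a, x, n):
    """Return (a**r, a**s) for rationals r < x < s whose denominators divide n.

    Illustrative only: r and s are exact Fractions, but the powers themselves
    are computed in floating point, so the values are approximations.
    """
    if a <= 1:
        raise ValueError("this sketch assumes a > 1")
    r = Fraction(ceil(x * n) - 1, n)    # largest multiple of 1/n strictly below x
    s = Fraction(floor(x * n) + 1, n)   # smallest multiple of 1/n strictly above x
    return a ** float(r), a ** float(s)


if __name__ == "__main__":
    x = sqrt(2)
    for n in (10, 100, 1000, 10000):
        lower, upper = power_bracket(2.0, x, n)
        print(n, lower, upper, upper - lower)
```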
It follows that there is exactly one number between the elements of $C$ and $D$, namely, $\sup C = \inf D$ (this is an exercise), and the definition sets $a^x$ equal to this value.

(a) Since $a > 1$, for natural numbers $m, n$, $a^m > 1$, hence $a^{m/n} = (a^m)^{1/n} > 1$. That is, for all rational numbers $r > 0$, $a^r > 1$. From this we deduce that for rationals, $r < s$ implies $a^r < a^s$. Indeed, $s - r > 0$, so
\[ a^r < a^r (a^{s-r}) = a^{r+s-r} = a^s. \]
Thus, if $r < x$ and $x < s$, then $a^r < a^s$, which establishes (a).

(b) By induction one can show that for every natural number $n$, $b > 1$ implies
\[ b^n - 1 \ge n(b - 1); \]
hence, taking $b = a^{1/n}$,
\[ a - 1 \ge n(a^{1/n} - 1); \]
hence, if $0 < s - r < 1/n$, then
\[ a^s - a^r = a^r(a^{s-r} - 1) \le a^r(a^{1/n} - 1) \le \frac{a^r(a - 1)}{n}. \]
Let $M$ be any rational $> x$. Use the Archimedean property to find $n$ so large that $\frac{a^M(a-1)}{n} < \varepsilon$. Then choose $r, s$ rational with
\[ x - \frac{1}{2n} < r < x < s < x + \frac{1}{2n}. \]
Then, $r < M$ and $s - r < \frac{1}{n}$, so
\[ a^s - a^r \le \frac{a^r(a-1)}{n} \le \frac{a^M(a-1)}{n} < \varepsilon. \]
This is what we wanted: $a^r \in C$, $a^s \in D$ and $a^s - a^r < \varepsilon$.

13.1 Note. (i) If $x$ is rational, the new definition and the old definition of $a^x$ give the same value, because of $(\ast)$.
(ii) If $x < y$ and $a > 1$, we see that $a^x < a^y$. Indeed, by density we may choose rationals $u, v$ with $x < u < v < y$, so $a^x \le a^u < a^v \le a^y$.

As we stated earlier, if $a = 1$, we let $a^x = 1$, for all $x$; if $0 < a < 1$, we let $a^x = (a^{-1})^{-x}$.

13.2 Theorem. With the above definitions, the expression $a^x$ satisfies the laws of exponents: for all real $a, b > 0$, and for all $x, y \in \mathbb{R}$,
(1) $a^{x+y} = a^x a^y$,
(2) $(a^x)^y = a^{xy}$, and
(3) $(ab)^x = a^x b^x$.

Proof. (1) Start with $a > 1$ and $x, y \in \mathbb{R}$. Let $u$ and $v$ be rational with
\[ u < x + y < v. \]
Then $u - y < x$, so by density of the rationals, there exists a rational $r$ with $u - y < r < x$. Put $s = u - r$. Then, $s < y$ and $u = r + s$, so
\[ a^u = a^{r+s} = a^r a^s \le a^x a^y. \]
Similarly,
\[ a^x a^y \le a^v. \]
Thus, $u$ and $v$ rational with $u < x + y < v$ implies
\[ a^u \le a^x a^y \le a^v. \]
But there is only one number between all such $a^u$ and $a^v$, namely $a^{x+y}$. Thus $a^{x+y} = a^x a^y$.

Note now that $a^x a^{-x} = a^{x+(-x)} = a^0 = 1$, so $a^{-x} = (a^x)^{-1}$.

In case $a = 1$, $1^x 1^y = 1 \cdot 1 = 1 = 1^{x+y}$.

In case $0 < a < 1$, the result (1) follows easily from the case $a > 1$, by applying it to $a^{-1}$:
\[ a^{x+y} = (a^{-1})^{-(x+y)} = (a^{-1})^{-x+(-y)} = (a^{-1})^{-x} (a^{-1})^{-y} = a^x a^y. \]

(2) This is similar to the proof of (1), but uses multiplication instead of addition.

Case $a > 1$, $x, y > 0$. Let $u, v$ be rationals with $u < xy < v$. Since $u < xy$ and $y > 0$, $u/y < x$. Thus, there exists a rational $r$ with $u/y < r < x$. And then $u/r$ is a rational $s$ with $s < y$. Thus,
\[ a^u = a^{rs} = (a^r)^s \le (a^x)^s \le (a^x)^y. \]
Similarly $(a^x)^y \le a^v$. Hence,
\[ a^u \le (a^x)^y \le a^v, \]
for all rationals $u, v$ with $u < xy < v$, so by the uniqueness in the definition, $a^{xy} = (a^x)^y$.

The cases with $a > 1$ and one or both of $x$ and $y$ negative follow from the positive case by simple algebraic manipulation. For example, if $a > 1$, $x > 0$, $y < 0$, we have
\[ a^{xy} = (a^{-xy})^{-1} = (a^{x(-y)})^{-1} = \big((a^x)^{(-y)}\big)^{-1} = (a^x)^y. \]
We leave the other cases to the reader.

As in (1), the cases with $a = 1$ are trivial and the cases with $a < 1$ follow from the cases $a > 1$ by considering $a^{-1}$.

(3) is left as an exercise.

14. THE EXISTENCE OF $\pi$

Here we would like to prove the existence of the number $\pi$ and indicate how the trigonometric functions can be defined.

The most common intuitive definition given for the number $\pi$ is the ratio of the circumference of a circle to its diameter. Of course, that brings up the question as to what the circumference of a circle is exactly. One might say it is the distance a point travels around the circle in one direction till it gets back to the starting point. This all can be defined rigorously once one has the concept of a continuous parameterized curve. Here, let us do a simple version.
You will agree that any reasonable definition of the total distance around the circle will be twice the length of the upper semicircle, so consider the semicircle
\[ C := \{(x, y) : x^2 + y^2 = 1,\ y \ge 0\}. \]
For a given $x \in [-1, 1]$, the unique $y$ with $(x, y) \in C$ is $y = \sqrt{1 - x^2}$. For each list $(-1, 0) = a_0, a_1, \dots, a_n = (1, 0)$ of points $a_i = (x_i, y_i) \in C$, such that $x_{i-1} < x_i$, we calculate $\sum_{i=1}^n d(a_{i-1}, a_i)$. The supremum of all such sums, if it exists, may be interpreted as the length of this semicircular arc, and used to define $\pi$.

So consider the set of real numbers
\[ S = \Big\{ \sum_{i=1}^n d(a_{i-1}, a_i) : (-1, 0) = a_0, a_1, \dots, a_n = (1, 0) \in C \Big\}. \]
(Here we are assuming, as indicated above, that these $a_i = (x_i, y_i)$ are listed in order of their first coordinates $x_i$.) The set $S$ is non-empty (for example $2 = d((-1, 0), (1, 0)) \in S$). If we can show that $S$ is bounded above, then we will know that its supremum exists.

We will prove that $S$ is bounded above by 4. So, let $a_0, a_1, \dots, a_n$ be such a list. For each $i$,
\[ d(a_{i-1}, a_i) \le |x_{i-1} - x_i| + |y_{i-1} - y_i|. \]
Notice that if $i \ge 1$ and $x_i$ is negative, $x_{i-1} < x_i$ yields
\[ y_{i-1} = \sqrt{1 - x_{i-1}^2} < \sqrt{1 - x_i^2} = y_i. \]
Similarly, as soon as $x_{i-1} \ge 0$, $y_{i-1} > y_i$. As the $x$ values cross from negative to positive, we could have two values of $y$ the same. Let $k$ be the first index such that $y_k = \max\{y_0, \dots, y_n\}$. Then, since $x_{i-1} < x_i$, for all $i$, and $y_{i-1} < y_i$, for $i \le k$, and $y_{i-1} \ge y_i$, for $i > k$,
\begin{align*}
\sum_{i=1}^n d(a_{i-1}, a_i) &\le \sum_{i=1}^n |x_{i-1} - x_i| + \sum_{i=1}^n |y_{i-1} - y_i| \\
&= \sum_{i=1}^n (x_i - x_{i-1}) + \sum_{i=1}^k (y_i - y_{i-1}) + \sum_{i=k+1}^n (y_{i-1} - y_i) \\
&= 1 - (-1) + (y_k - 0) + (y_k - 0) \\
&\le 2 + 1 + 1 = 4.
\end{align*}
This shows that $\pi = \sup S$, the length of the semicircular arc, exists and is $\le 4$.
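The elements of $S$ are easy to compute numerically. The following sketch is an illustration only: the function name `polygon_length` and the choice of equally spaced $x$-coordinates are assumptions made here, not part of the text. It evaluates the sum $\sum d(a_{i-1}, a_i)$ for one particular list of points on the semicircle. Doubling the number of subdivisions only adds points to the list, so by the triangle inequality the sums cannot decrease, and each of them stays below the bound 4 proved above.

```python
from math import hypot, pi, sqrt


def polygon_length(n):
    """Length of the inscribed polygon through n + 1 points of the upper unit
    semicircle whose x-coordinates are equally spaced in [-1, 1].

    Hypothetical helper: it computes one element of the set S from the text.
    """
    xs = [-1 + 2 * i / n for i in range(n + 1)]
    # max(0.0, ...) guards against tiny negative values caused by rounding
    pts = [(x, sqrt(max(0.0, 1 - x * x))) for x in xs]
    return sum(hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:]))


if __name__ == "__main__":
    for j in range(11):
        n = 2 ** j
        print(n, polygon_length(n))
    # math.pi is printed only for comparison with the library constant
    print("library value of pi:", pi)
```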
Just to check that this is consistent with what you have learned, consider a semicircle of radius $r$:
\[ C_r = \{(x', y') : x'^2 + y'^2 = r^2,\ y' \ge 0\} = \{r(x, y) : x^2 + y^2 = 1,\ y \ge 0\}. \]
The distance between two points $ra$ and $rb$ on this arc is $d(ra, rb) = |ra - rb| = r|a - b| = r\,d(a, b)$ and, if we compute the arc length in a similar way, we obtain
\[ \sup\Big\{ \sum_{i=1}^n d(ra_{i-1}, ra_i) : (-1, 0) = a_0, a_1, \dots, a_n = (1, 0) \in C \Big\} = \sup\{rs : s \in S\} = \pi r, \]
as expected.

Now, let's go back to the unit circle. In a similar way, we can define arc length $\ell(a, b)$ along the unit circle $U = \{(x, y) : x^2 + y^2 = 1\}$ from any point $a \in U$ to another $b \in U$, traversing continuously in one direction. This is a bit cumbersome with the theory we have at the moment, but the idea is simple. We find that if $b$ is between $a$ and $c$ as we traverse the circle from $a$ to $c$, the lengths satisfy
\[ \ell(a, c) = \ell(a, b) + \ell(b, c). \qquad (21) \]
Assuming the traversal is in the counterclockwise direction, one can show that for each $\theta \in [0, 2\pi)$, there is exactly one point $u = (x, y) \in U$ with $\ell((1, 0), u) = \theta$, and of course, we put
\[ \cos\theta = x, \qquad \sin\theta = y. \]
Using the usual identification of points in $\mathbb{R}^2$ with vectors whose tails are at the origin, we think of $\theta$ as a measure of the angle between the vector $(1, 0)$ and the vector $(x, y)$, traversing in that direction. The rotation map
\[ \begin{pmatrix} u \\ v \end{pmatrix} \mapsto \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \]
preserves distance. It follows that it preserves arc length. This, together with the additivity formula (21), can be used to give the usual formulas for the sin and cos of sums and differences of angles. Other standard identities follow from $\sin^2\theta + \cos^2\theta = x^2 + y^2 = 1$.

15. TOPOLOGY IN $\mathbb{R}$ AND $\mathbb{R}^n$ AND OTHER METRIC SPACES.

A metric space is a set $X$ together with a function $d : X \times X \to \mathbb{R}$ (called a metric) such that for all $x, y, z \in X$,
(1) $d(x, y) \ge 0$
(2) $d(x, y) = 0$ if and only if $x = y$
(3) $d(x, y) = d(y, x)$
(4) $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).

15.1 Examples. (a) $\mathbb{R}$.
As we know, in $\mathbb{R}$, the distance between $x$ and $y$, defined by $d(x, y) = |x - y|$, satisfies these 4 properties, so $\mathbb{R}$ together with this distance forms a metric space.

(b) $\mathbb{R}^n$. These conditions are also satisfied by distances in $\mathbb{R}^n$. Recall that for points (vectors) in $\mathbb{R}^n$, the distance from $x$ to $y$ is $d(x, y) = |x - y|$, where $|x| = \big(\sum_{i=1}^n x_i^2\big)^{1/2}$ is the norm of $x$, and $d$ is called the Euclidean metric or Euclidean distance in $\mathbb{R}^n$; many authors use the notation $\|x\|$ instead of $|x|$. We will occasionally do this for emphasis.

(c) The complex number system $\mathbb{C}$ is also a metric space under the metric defined by $d(z, w) = |z - w|$. Here, if $z = x + iy$, $|z| = (x^2 + y^2)^{1/2}$. As a result, as a metric space, $\mathbb{C}$ can be regarded as the same as $\mathbb{R}^2$.

(d) Metric subspaces. The main examples we work with in this course are subspaces of $\mathbb{R}^n$ (or $\mathbb{R}$). If $X \subset S$, a metric space with metric $d$, we can define a distance function on $X$ simply by restriction:
\[ d_X(x, y) = d(x, y), \quad \text{for } x, y \in X. \]
This distance is still a metric, because the four properties were already satisfied for elements of $S$, so they are certainly satisfied for elements of $X$. The set $X$ with the metric $d_X$ is called a (metric) subspace of $S$.

If we are working in a metric space $X$, the complement of a set $A$ in $X$ is $A^c = X \setminus A$. Thus, in $\mathbb{R}$, $A^c = \mathbb{R} \setminus A$ and in $\mathbb{R}^n$, $A^c$ means $\mathbb{R}^n \setminus A$.

Open and Closed sets. Let $X$ be a metric space. For $x \in X$, $\varepsilon > 0$, the set
\[ B(x, \varepsilon) = \{y \in X : d(y, x) < \varepsilon\} \]
is called the $\varepsilon$-neighbourhood or (open) $\varepsilon$-ball about $x$. Here, $\varepsilon$ is called the radius of the ball. For each $x$, we will let $\mathcal{B}_x = \{B(x, \varepsilon) : \varepsilon > 0\}$, the family of all these basic neighbourhoods.
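For readers who like to experiment, the Euclidean metric and the open ball are straightforward to write down in code. This is a minimal sketch under the assumption that points of $\mathbb{R}^n$ are represented as tuples of floats; the names `dist` and `in_ball` are hypothetical. A numerical check on a few sample points is of course no proof of the metric axioms, only a sanity test.

```python
from math import sqrt


def dist(x, y):
    """Euclidean distance d(x, y) = |x - y| for points given as equal-length tuples."""
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def in_ball(y, x, eps):
    """True when y belongs to the open ball B(x, eps) = {y : d(y, x) < eps}."""
    return dist(y, x) < eps


if __name__ == "__main__":
    points = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.5), (2.0, -7.0)]
    # spot-check the triangle inequality on every triple of sample points
    for x in points:
        for y in points:
            for z in points:
                assert dist(x, y) <= dist(x, z) + dist(z, y) + 1e-12
    print(dist((0.0, 0.0), (3.0, 4.0)), in_ball((2.9, 0.0), (0.0, 0.0), 3.0))
```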
Now if $x \in X$ and $A \subset X$, exactly one of three things can happen:
- $A$ contains some $U \in \mathcal{B}_x$ ($x$ is an interior point of $A$);
- $A$ and $A^c$ both intersect each $U \in \mathcal{B}_x$ ($x$ is a boundary point of $A$);
- $A$ is disjoint from some $U \in \mathcal{B}_x$ ($x$ is an exterior point of $A$).

The sets of such points are, respectively, called the interior of $A$, the boundary of $A$, and the exterior of $A$, and denoted $\operatorname{int} A$, $\operatorname{bd} A$, $\operatorname{ext} A$. You can check right away that a point $x$ is an exterior point of $A$ if and only if it is an interior point of $A^c$.

A set is called open if all its points are interior points and is called closed if all points outside it are exterior points. Thus,

15.2 Theorem (Main test for open or closed set).
(1) $A$ is open iff for each $x \in A$, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \subset A$;
(2) $A$ is closed iff for each $x \notin A$, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \cap A = \emptyset$.

Since
\[ B(x, \varepsilon) \cap A = \emptyset \iff B(x, \varepsilon) \subset A^c, \]
(2) says $A$ is closed iff for all $x \in A^c$, there exists $\varepsilon > 0$ with $B(x, \varepsilon) \subset A^c$. That is:

15.3 Corollary. A set is closed if and only if its complement is open.

Since a set clearly contains its interior (proof?) we can say $A$ is open iff it is equal to its interior, and it is closed iff its complement is equal to its exterior.

15.4 Examples. Determine which of the following sets are open, closed, neither or both.
(a) $(1, 5)$.
(b) $[1, 5]$.
(c) $(1, 5]$.
(d) $[2, \infty)$.
(e) $\mathbb{R}$.
(f) $\{1, 2, 3\}$.
(g) $\{\frac{1}{n} : n \in \mathbb{N}\}$.
(h) $\{r \in \mathbb{Q} : r > \sqrt{2}\}$.
(i) $[2, 4] \times [3, 6]$, a closed interval of $\mathbb{R}^2$.

Proof. (a) Let $A$ be the open interval $(1, 5)$. If $x \in A$, then $1 < x < 5$, so $x - 1 > 0$ and $5 - x > 0$. Put $\varepsilon = \min\{x - 1, 5 - x\}$. Then, $\varepsilon > 0$. I claim $B(x, \varepsilon) \subset A$. Indeed, $y \in B(x, \varepsilon)$ implies $|y - x| = d(y, x) < \varepsilon$. Thus,
\[ 1 = x - (x - 1) \le x - \varepsilon < y < x + \varepsilon \le x + 5 - x = 5, \]
so $y \in (1, 5) = A$. We have shown that for each $x \in A$, there exists $\varepsilon > 0$ with $B(x, \varepsilon) \subset A$. So all of the points of $A$ are interior points, or in other words, $A$ is open.
On the other hand, $A$ is not closed because $1 \notin A$ and every neighbourhood of 1 contains points of $A$: indeed, if $\varepsilon > 0$ then the neighbourhood $B(1, \varepsilon) = (1 - \varepsilon, 1 + \varepsilon)$, and if $a$ is any point strictly between 1 and $\min\{5, 1 + \varepsilon\}$, such as $\min\{3, 1 + \varepsilon/2\}$, then $1 < a < 1 + \varepsilon$ and $a \in A$, so $a \in B(1, \varepsilon) \cap A$.

(b) Let $B = [1, 5]$. (Don't confuse this $B$ with a ball $B(1, \varepsilon)$.) Then $B$ is not open, since for example, $1 \in B$, but is not an interior point. Every neighbourhood of 1 contains points outside of $B$; indeed, if $\varepsilon > 0$, $B(1, \varepsilon) = (1 - \varepsilon, 1 + \varepsilon)$ contains the point $1 - \varepsilon/2$, which does not belong to $B$.

However, $B$ is closed. To prove this, we take an element $x \notin B$ and find a neighbourhood of $x$ which is disjoint from $B$. For such an $x$, either $x < 1$ or $x > 5$. In case $x < 1$, we let $\varepsilon = (1 - x)/2$, which is $> 0$. Then $B(x, \varepsilon) = (x - (1 - x)/2,\ x + (1 - x)/2)$. All of its elements are $< 1$, so $B(x, \varepsilon) \cap B = \emptyset$. In case $x > 5$, we put $\varepsilon = (x - 5)/2$. Then $t \in B(x, \varepsilon)$ implies
\[ t > x - (x - 5)/2 > x - (x - 5) = 5, \]
so $t \notin B$. In other words, $B(x, \varepsilon) \cap B = \emptyset$. Thus, in both possible cases, $x$ has a neighbourhood which doesn't intersect $B$. (Actually, we could have taken $\varepsilon = 1 - x$ in the first case and $x - 5$ in the second. The smaller choice of $\varepsilon$ was taken just to make us feel "safer": $B(x, \varepsilon)$ is well away from $B$.)

(c) If $C = (1, 5]$, then $C$ is not closed, because as for part (a), each neighbourhood of 1 intersects $C$. Use the same $a$. On the other hand, $C$ is not open, because each neighbourhood of 5 contains points of $C^c$, namely points $> 5$. (Indeed, if $\varepsilon > 0$, then $5 + \varepsilon/2 \in B(5, \varepsilon)$, but does not belong to $C$.)

(d) $[2, +\infty)$ means $\{x \in \mathbb{R} : 2 \le x < \infty\}$. Call this set $A$. Then $2 \in A$, but each neighbourhood of 2 intersects $A^c = (-\infty, 2)$, so $2 \notin \operatorname{int}(A)$. Thus, $A$ is not open. On the other hand, any point of $A^c$, that is, any point $x < 2$, is in the exterior. Indeed, let $\varepsilon = 2 - x$. Then $t \in B(x, \varepsilon)$ implies $t < x + \varepsilon = 2$, so $B(x, \varepsilon)$ is disjoint from $A$. Thus, $A$ is closed.
(e) If $x \in \mathbb{R}$, then each neighbourhood of $x$ is contained in $\mathbb{R}$, so $x \in \operatorname{int} \mathbb{R}$; thus, $\mathbb{R}$ is open. $\mathbb{R}$ is also closed, for there are no points outside of $\mathbb{R}$, hence all points outside of $\mathbb{R}$ are exterior! The condition is vacuously satisfied.

(f) Let $F = \{1, 2, 3\}$. Here we have only 3 points. If $x \notin F$, put $\varepsilon = \min\{d(x, 1), d(x, 2), d(x, 3)\}$. This minimum exists since the minimum of a finite set always exists, and it is $> 0$, since each of $d(x, 1), d(x, 2), d(x, 3)$ is $> 0$. Now, the ball $B(x, \varepsilon)$ does not intersect $F$, because $y \in B(x, \varepsilon)$ implies $d(x, y) < \varepsilon \le d(x, a)$, for each $a \in F$. This shows that each point outside of $F$ is an exterior point, so $F$ is closed.

On the other hand, if $a \in F$, then $a$ is not an interior point of $F$. Indeed, if $\varepsilon > 0$, then $B(a, \varepsilon) = (a - \varepsilon, a + \varepsilon)$ is an infinite set and $F$ is finite, so $B(a, \varepsilon)$ contains (infinitely many) points that are not in $F$. Thus, $B(a, \varepsilon) \not\subset F$.

(g) Let $M = \{\frac{1}{n} : n \in \mathbb{N}\}$. Then $M$ is neither open nor closed.

$M$ not closed: Certainly, $0 \notin M$. To see that 0 has no neighbourhood disjoint from $M$ involves the Archimedean property. If $\varepsilon > 0$, there exists $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. But then $\frac{1}{n} \in B(0, \varepsilon) \cap M$.

$M$ not open: $1 \in M$, but each neighbourhood of 1 contains many points of $M^c$ [Why?].

(h) Let $S = \{r \in \mathbb{Q} : r > \sqrt{2}\}$. Then the density of the irrationals will show $S$ is not open, and the density of the rationals will allow us to conclude $S$ is not closed. In more detail: suppose $x \in S$. Then $S$ does not contain a neighbourhood of $x$. Indeed, if $\varepsilon > 0$ then $B(x, \varepsilon)$ is an interval, so contains an irrational, and $S$ consists entirely of rationals. Thus, $S$ is not open.

On the other hand, if $x$ is any irrational $> \sqrt{2}$, then $x \notin S$ and if $\varepsilon > 0$, $B(x, \varepsilon)$ contains the interval $(x, x + \varepsilon)$. Any rational in this set belongs to $S$. Thus such an $x$ has no neighbourhood disjoint from $S$, so $S$ is not closed.

(i) Let $I = [2, 4] \times [3, 6]$. Then, $I$ is closed. To prove this, let $c = (c_1, c_2) \notin [2, 4] \times [3, 6]$. Then either $c_1 \notin [2, 4]$ or $c_2 \notin [3, 6]$ (or both).
In the case $c_1 \notin [2, 4]$, either $c_1 < 2$ or $c_1 > 4$. In the first case, let $\varepsilon = 2 - c_1$. We have to show that $B(c, \varepsilon) \cap I = \emptyset$. Let $x = (x_1, x_2) \in B(c, \varepsilon)$. Then $|x_1 - c_1| \le \sqrt{(x_1 - c_1)^2 + (x_2 - c_2)^2} = d(x, c) < \varepsilon$. Thus, $x_1 < c_1 + \varepsilon = 2$. Therefore $x_1 \notin [2, 4]$, so $x \notin [2, 4] \times [3, 6]$.

The case $c_1 > 4$ is proved similarly, using $\varepsilon = c_1 - 4$. And, of course, the case $c_2 \notin [3, 6]$ splits into 2 subcases, also proved in the same way.

We now prove $I$ is not open. There are lots of points that are not interior points. To show the way to more general situations, let us look at the point $a = (2, 5)$. Since $2 \in [2, 4]$ and $5 \in [3, 6]$, $(2, 5) \in I$. To show that no neighbourhood of $a$ is entirely contained in $I$, let $\varepsilon > 0$ be arbitrary. Let $x = (2 - \varepsilon/2, 5)$. Then, $x \notin I$, since $x_1 = 2 - \varepsilon/2 \notin [2, 4]$. But $x \in B(a, \varepsilon)$, since $d(x, a) = |x - a| = \sqrt{(2 - \varepsilon/2 - 2)^2 + (5 - 5)^2} = \varepsilon/2 < \varepsilon$. This shows $B(a, \varepsilon)$ is not contained in $I$. Since $\varepsilon$ was arbitrary, no ball around $a$ is contained in $I$. Thus, $a$ is not an interior point of $I$ and therefore, $I$ is not open.

Pay close attention to the following two theorems. You will notice that they depend only on the properties of distance. Thus, they are valid in any metric space.

15.5 Theorem. Every open ball is an open set.

Proof. Let $U = B(a, r)$, where $r > 0$. Let $x \in U$. Then $d(x, a) < r$. Put $\varepsilon = r - d(x, a)$. Then $B(x, \varepsilon) \subset U$. Indeed, if $y \in B(x, \varepsilon)$ then $d(y, x) < \varepsilon$, so
\[ d(y, a) \le d(y, x) + d(x, a) < \varepsilon + d(x, a) = r - d(x, a) + d(x, a) = r; \]
that is, $d(y, a) < r$. In other words, $y \in U$, as required.

15.6 Theorem. (1) The union of any family of open sets is open.
(2) The intersection of any finite family of open sets is open.

Proof. (1) Let $\{G_i : i \in I\}$ be a family of open sets and put $U = \bigcup_{i \in I} G_i$. If $x \in U$, then (by definition of union) there exists an $i \in I$ with $x \in G_i$. For such an $i$, since $G_i$ is open, there exists an $\varepsilon > 0$ with $B(x, \varepsilon) \subset G_i$. But $G_i \subset U$, so $B(x, \varepsilon) \subset U$.
Thus, for each $x \in U$ there is a neighbourhood of $x$ contained in $U$, so $U$ is open.

(2) Let $\{G_1, \dots, G_n\}$ be a finite family of open sets and $U = \bigcap_{i=1}^n G_i$. To prove $U$ is open, let $x \in U$. Then for each $i$, $x \in G_i$ and $G_i$ is open. Thus, we may choose $\varepsilon_i > 0$ such that $B(x, \varepsilon_i) \subset G_i$. Put $\varepsilon = \min\{\varepsilon_1, \dots, \varepsilon_n\}$. Then $\varepsilon > 0$ and, for each $i$, $B(x, \varepsilon)$ is contained in $G_i$, so $B(x, \varepsilon) \subset \bigcap_{i=1}^n G_i = U$. Thus, each point of $U$ has a neighbourhood contained in $U$, so $U$ is open.

The two properties in the above theorem form the basis for the concept of topology in more advanced studies. As a corollary to the previous theorem we see immediately that:

15.7 Theorem. The intersection of any family of closed sets is closed; the union of any finite family of closed sets is closed.

Proof. Recall (corollary 15.3) that a set is closed if and only if its complement is open. So let $\{F_i : i \in I\}$ be a family of closed sets and let $C = \bigcap_i F_i$. Then, for each $i \in I$, $F_i^c$ is open. Therefore,
\[ \bigcup_{i \in I} F_i^c \text{ is open.} \]
But, by De Morgan's laws,
\[ C^c = \Big( \bigcap_{i \in I} F_i \Big)^c = \bigcup_{i \in I} F_i^c, \]
so $C^c$ is open. Therefore, its complement, $C$, is closed. The proof of the second statement is similar and is left to the reader.

Balls, open sets, and closed sets in subspaces. Let $X \subset S$, a metric space with metric $d$. Then, $B_X(a, \varepsilon)$ denotes the open ball centred at $a$ and radius $\varepsilon$ in the subspace $X$. We calculate immediately that
\[ B_X(a, \varepsilon) = \{x \in X : d_X(x, a) < \varepsilon\} = \{x \in X : d(x, a) < \varepsilon\} = \{x : d(x, a) < \varepsilon\} \cap X = B(a, \varepsilon) \cap X. \]
Of course, a similar result holds for closed balls: $\overline{B}_X(a, \varepsilon) = \overline{B}(a, \varepsilon) \cap X$.

15.8 Theorem. Let $X \subset S$. Then,
(1) $U$ is open in $X$ iff there exists an open set $G$ in $S$ such that $U = G \cap X$;
(2) $C$ is closed in $X$ iff there exists a closed set $F$ in $S$ such that $C = F \cap X$.

Proof. (1) ($\Longleftarrow$) Let $G$ be open in $S$. We have to prove that $G \cap X$ is open in $X$. Let $a \in G \cap X$. Then $a \in G$. Since $G$ is open in $S$, there exists $\varepsilon > 0$ such that $B(a, \varepsilon) \subset G$.
Therefore $B_X(a,\varepsilon) = B(a,\varepsilon) \cap X \subset G \cap X$. Therefore, every point of $G \cap X$ is an interior point in $X$ and hence $G \cap X$ is open in $X$.

($\Longrightarrow$) Let $U$ be open in $X$. We have to find $G$ open in $S$ such that $U = G \cap X$. For each $a \in U$, there is an open ball in $X$ about $a$ contained in $U$. That is, there exists $\varepsilon_a > 0$ such that $B_X(a,\varepsilon_a) \subset U$. Then
\[ \bigcup_{a \in U} B_X(a,\varepsilon_a) = U \quad \text{(why?).} \]
But $B_X(a,\varepsilon_a) = B(a,\varepsilon_a) \cap X$, for all $a \in U$. Put
\[ G = \bigcup_{a \in U} B(a,\varepsilon_a). \]
Then $G$ is open (in the whole space) and
\[ G \cap X = \bigcup_{a \in U} \bigl(B(a,\varepsilon_a) \cap X\bigr) = \bigcup_{a \in U} B_X(a,\varepsilon_a) = U. \]
Thus, $U = G \cap X$, where $G$ is open in the whole space.

(2) is left as an exercise. (Hint: What is the relationship between open and closed sets?)

There are similar results about interior and about closure. (The corresponding statement about boundary is false. So is the corresponding statement about compact sets.) In the case of compact sets, it turns out that a subset of $X$ is compact in $X$ iff it is compact in the whole space. (See the section: Compactness in subspaces.)

15.9 Examples.
(a) If $S = \mathbb{R}$, $X = [a,b]$, a closed interval, and $G = \mathbb{R}$, we get that $G \cap X = [a,b]$ is open in $X$ but not in $\mathbb{R}$.
(b) The set $[0,1)$ is neither open nor closed in $\mathbb{R}$. However, it is open in the subspace $X_1 = [0,\infty)$, though it is still not closed in $X_1$. It is closed in the subspace $X_2 = (-\infty,1)$, but is not open in $X_2$.

15.1. Classify each of the following as open, closed, neither or both. (Provide proof, of course.)
(a) $\mathbb{Q} \cap (2,7)$.
(b) $\mathbb{Z}$, the set of integers.
(c) $\{40/n : n \in \mathbb{N}\}$.

15.2. Every finite set in a metric space is closed.

15.3. In $\mathbb{R}$, every open interval $(a,b)$ is an open set; one or both of $a, b$ can be infinite. Notice that $(-\infty,+\infty) = \mathbb{R}$ is one of these.

15.4. In $\mathbb{R}$, every closed interval is a closed set. Here we include $[a,b]$, when $a, b$ are real, and the infinite closed intervals $[a,+\infty)$, $(-\infty,b]$ (and $\mathbb{R}$ itself).

15.5. If $A_1$ and $A_2$ are closed sets in $\mathbb{R}$, then $A_1 \times A_2$ is a closed subset of $\mathbb{R}^2$.

15.6.
If $A_1$ and $A_2$ are open sets in $\mathbb{R}$, then $A_1 \times A_2$ is an open subset of $\mathbb{R}^2$.

15.7. In a metric space, each closed ball $\{x : d(x,a) \le r\}$ is a closed set.

15.8. If $t \in \mathbb{R}$ and $a, b \in \mathbb{R}^n$, let $x(t) = (1-t)a + tb$. Then, $d(x(t),a) = |t|\, d(a,b)$.

15.9. In $\mathbb{R}^n$, no open ball $\{x : d(x,a) < r\}$ is closed.

15.10. Give an example of a sequence $(A_n)$ of open sets such that the intersection $\bigcap_{n \in \mathbb{N}} A_n$ is not open.

15.11. The $\varepsilon$-neighbourhoods are called basic neighbourhoods. If $U$ is a set which contains some basic neighbourhood of $x$, then $U$ is still referred to as a neighbourhood. Our definitions were in terms of basic neighbourhoods, but prove that $x$ is
(a) an interior point of $A$ iff $A$ contains some neighbourhood of $x$;
(b) a boundary point of $A$ iff both $A$ and $A^c$ intersect each neighbourhood of $x$;
(c) an exterior point of $A$ iff $A$ is disjoint from some neighbourhood of $x$.

15.12. $A$ is open iff it contains a neighbourhood of each of its points iff $A$ is a neighbourhood of each of its points (in the general sense of the previous exercise). [You will see that in most general results, “basic neighbourhood” can be replaced by “neighbourhood”, without affecting their validity.]

16. INTERIOR, BOUNDARY, AND CLOSURE

Interior and boundary points were defined in the section TOPOLOGY IN $\mathbb{R}$ AND $\mathbb{R}^n$. It is worth noticing that the interior points are the points of the set that are not in the boundary.

16.1 Lemma. $\operatorname{int} A = A \setminus \operatorname{bd} A$.

Proof. If $x \in \operatorname{int} A$ then there exists a neighbourhood $U = B(x,\varepsilon)$ of $x$ with $U \subset A$. Since $x \in U$, $x \in A$. But $U \subset A$ also implies $U \cap A^c = \emptyset$, so $x \notin \operatorname{bd} A$. This shows $x \in \operatorname{int} A$ implies $x \in A \setminus \operatorname{bd} A$.

Conversely, if $x \in A \setminus \operatorname{bd} A$, then $x \in A$ and $x \notin \operatorname{bd} A$. The second statement means that there is a neighbourhood $U$ of $x$ which does not intersect both $A$ and $A^c$. But $x \in U$, so $U$ intersects $A$; hence $U$ doesn't intersect $A^c$. That is, $U \subset A$, so that $x \in \operatorname{int} A$.

You should check that all points of $A$ that are not interior points are in the boundary.

16.2 Theorem.
(1) $A$ is open iff it is disjoint from its boundary.
(2) $A$ and $A^c$ have the same boundary.
(3) $A$ is closed iff it contains its boundary.

Proof. (1) Once again, $\operatorname{int} A = A \setminus \operatorname{bd} A$, so
\[ A \text{ is open} \iff A = \operatorname{int} A \iff A = A \setminus \operatorname{bd} A \iff A \cap \operatorname{bd} A = \emptyset. \]
(2) and (3) are left to the reader.

16.3 Examples. Find the boundary and interior points for the following sets.
(a) $(1,5)$.
(b) $[1,5]$.
(c) $[1,5)$.
(d) $[2,\infty)$.
(e) $\mathbb{R}$.
(f) $\{1,2,3\}$.
(g) $\{\frac1n : n \in \mathbb{N}\}$.
(h) $\{r \in \mathbb{Q} : r > \sqrt2\}$.
These are the same sets as in examples 15.4 (a)–(h).

Soln. (a) Let $A$ be the open interval $(1,5)$. What are its boundary points and what are its interior points? As we have shown in example 15.4(a), $A$ is open: each point of $A$ is an interior point and, since $A$ can have no interior points outside, these are all of them: $A = \operatorname{int}(A)$.

If $\varepsilon > 0$ then the neighbourhood $B(1,\varepsilon) = (1-\varepsilon, 1+\varepsilon)$ contains points both of $A$ and of $A^c$. Indeed, if $a$ is any point between $1$ and $\min\{5, 1+\varepsilon\}$, such as $\min\{3, 1+\varepsilon/2\}$, then $1 < a < 1+\varepsilon$ and $a \in A$, so $a \in B(1,\varepsilon) \cap A$; also $1 \in B(1,\varepsilon) \cap A^c$. Thus, $1$ is a boundary point of $A$. In the same way, each neighbourhood of $5$ contains points of $A$ and of $A^c$. Thus, $5$ is also a boundary point of $A$.

So far we have shown that $\operatorname{bd} A \supset \{1,5\}$. Are there any other boundary points of $A$? The points of $A$ are interior points, so the only other place to look is outside $[1,5]$. The closed interval $[1,5]$ is a closed set, so if $x$ is any point not in $[1,5]$, it has a neighbourhood $U$ that doesn't intersect $[1,5]$. Such a neighbourhood doesn't intersect $A = (1,5)$ either, so $A$ has no boundary points outside of $[1,5]$; thus, $\operatorname{bd} A = \{1,5\}$.

Note: If we didn't want to use the fact that $[1,5]$ is a closed set we could go through the argument from basics. If $x > 5$, taking $\varepsilon = x - 5$ we find that $B(x,\varepsilon) \cap A = \emptyset$; so $x$ is an exterior point (hence not a boundary point of $A$). Similarly, if $x < 1$, $x$ is not a boundary point of $A$.
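The case analysis for $A = (1,5)$ can be mirrored in a few lines of Python. This is only an illustrative sketch, not part of the proof: the names `in_A`, `classify` and `point_of_A_near_1` are hypothetical, and exact rational arithmetic via `fractions.Fraction` is an assumption made here to avoid rounding. For an interior or exterior point the function returns a radius that witnesses the classification; for the boundary point 1 the helper returns the point $\min\{3, 1+\varepsilon/2\}$ used in the argument.

```python
from fractions import Fraction

LO, HI = Fraction(1), Fraction(5)  # endpoints of A = (1, 5)

def in_A(x):
    """Membership test for the open interval A = (1, 5)."""
    return LO < x < HI

def classify(x):
    """Classify x relative to A and return (kind, eps).

    For an interior point, B(x, eps) is contained in A;
    for an exterior point, B(x, eps) is disjoint from A;
    for a boundary point no such radius exists, so eps is None.
    """
    if in_A(x):
        return "interior", min(x - LO, HI - x)
    if x > HI:
        return "exterior", x - HI
    if x < LO:
        return "exterior", LO - x
    return "boundary", None

def point_of_A_near_1(eps):
    """A point of A inside B(1, eps): the choice min{3, 1 + eps/2}."""
    return min(Fraction(3), LO + eps / 2)
```

For instance, `classify(Fraction(7))` returns `("exterior", Fraction(2))`, matching the choice $\varepsilon = x - 5$ for $x > 5$.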
Notice, by the way, that $A$ contains none of its boundary points.

(b) Let $B = [1,5]$. Since $(1,5)$ is an open set, each point of $(1,5)$ has a neighbourhood $U$ contained in $(1,5)$, so also contained in $B$. Thus, $(1,5) \subset \operatorname{int} B$. This time, however, not all the points of $B$ are interior points: the points $1, 5$ are in $B$, but are boundary points rather than interior points. For example, if $\varepsilon > 0$, then $1 - \varepsilon/2 \in B(1,\varepsilon) \cap B^c$ and $B(1,\varepsilon) \cap B \ne \emptyset$, since we know $B(1,\varepsilon) \cap (1,5) \ne \emptyset$. Thus, $\operatorname{int} B = (1,5)$ and $\{1,5\} \subset \operatorname{bd} B$. But, as before, since $B$ is closed, no point outside it can be a boundary point of $B$, so in fact, $\{1,5\} = \operatorname{bd} B$.

(c) If $C = [1,5)$, then one more time we see that $\operatorname{bd} C = \{1,5\}$ and $\operatorname{int} C = (1,5)$, but now $C$ contains one of its boundary points but not the other.

(d) Let $D = [2,+\infty) = \{x \in \mathbb{R} : 2 \le x < \infty\}$. The interval $(2,\infty)$ is an open set, so if $x \in (2,\infty)$, $x$ has a neighbourhood $U$ contained in $(2,\infty)$. Such a $U$ is contained in $D$ also, so $(2,\infty) \subset \operatorname{int} [2,\infty)$. Each neighbourhood of $2$ intersects both $D$ and $D^c = (-\infty,2)$ (check this), so $2 \in \operatorname{bd} D$. Since there are no more points of $D$, $\operatorname{int} D = (2,\infty)$. And since $D$ is closed (or $(-\infty,2)$ is open) any point $< 2$ is in the exterior, so $\operatorname{bd} D = \{2\}$.

(e) If $x \in \mathbb{R}$, then each neighbourhood of $x$ is contained in $\mathbb{R}$, so $x \in \operatorname{int} \mathbb{R}$; there are no more points, so $\operatorname{int} \mathbb{R} = \mathbb{R}$ and $\operatorname{bd} \mathbb{R} = \emptyset$.

(f) Let $F = \{1,2,3\}$. As we have seen, $F$ is closed. Each point outside $F$ is an exterior point, so not a boundary point. On the other hand, if $a \in F$, then $a$ is a boundary point of $F$. Indeed, if $\varepsilon > 0$, then $a \in B(a,\varepsilon) \cap F$, so $B(a,\varepsilon) \cap F \ne \emptyset$. Also, $B(a,\varepsilon) = (a-\varepsilon, a+\varepsilon)$ is an infinite set and $F$ is finite, so $B(a,\varepsilon)$ contains (infinitely many) points that are not in $F$; that is, $B(a,\varepsilon) \cap F^c \ne \emptyset$. This shows that each point of $F$ is a boundary point of $F$.

(g) Let $G = \{\frac1n : n \in \mathbb{N}\}$. We find $\operatorname{bd} G = G \cup \{0\}$. That is, each of the points of $G$ is a boundary point, $0$ is another one, and there are no others. It follows that $G$ has no interior points: $\operatorname{int} G = \emptyset$.
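The claim $\operatorname{bd} G = G \cup \{0\}$ can be spot-checked with exact arithmetic. The sketch below is an illustration under stated assumptions, not a proof: the function names are hypothetical, inputs are assumed to be `fractions.Fraction` values, and only finitely many points $1/k$ are tested. The first function finds, for a given $\varepsilon > 0$, the smallest $n$ with $1/n < \varepsilon$ (so $1/n \in B(0,\varepsilon) \cap G$); the second takes a point $x \in (0,1)$ that is not of the form $1/n$ and returns a radius $\varepsilon$ for which $B(x,\varepsilon)$ misses $G$, by locating $x$ strictly between two consecutive points $1/n$ and $1/(n-1)$.

```python
from fractions import Fraction
import math

def archimedean_index(eps):
    """Smallest natural number n with 1/n < eps, for a Fraction eps > 0."""
    return math.floor(1 / eps) + 1

def gap_radius(x):
    """For a Fraction x with 0 < x < 1 and x not of the form 1/n,
    return eps > 0 such that (x - eps, x + eps) contains no point 1/k."""
    if not (0 < x < 1) or (1 / x).denominator == 1:
        raise ValueError("need 0 < x < 1 with x not of the form 1/n")
    n = math.ceil(1 / x)  # smallest n with 1/n < x, so 1/n < x < 1/(n-1)
    return min(x - Fraction(1, n), Fraction(1, n - 1) - x)
```

With $x = 2/5$ the second function picks $n = 3$ and returns $\varepsilon = 1/15$, the distance from $x$ to the nearer of $1/3$ and $1/2$.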
We will let the reader check that all points of $G$ are boundary points of $G$. To see that $0$ is a boundary point, we need the Archimedean property: if $\varepsilon > 0$, there exists $n \in \mathbb{N}$ with $\frac1n < \varepsilon$. But then $\frac1n \in B(0,\varepsilon) \cap G$. On the other hand, $0 \in B(0,\varepsilon) \cap G^c$.

That there are no other boundary points involves 3 cases: $x < 0$, $0 < x < 1$, and $x > 1$. In the first case, if we choose $\varepsilon = -x$, then $B(x,\varepsilon) \cap G = \emptyset$, since all elements of $G$ are $> 0$. The case $x > 1$ is similar: just use $\varepsilon = x - 1$. It is the case $0 < x < 1$ that is more interesting. Let $x \notin G$, but $0 < x < 1$. By the Archimedean property there is a natural number $n$ with $1/n < x$. Choose the smallest such $n$, so that
\[ \frac1n < x < \frac{1}{n-1} \]
($n \ne 1$, since $x < 1$). The interval $(1/n, 1/(n-1))$ is open, so it contains a neighbourhood $U$ of $x$. This neighbourhood cannot contain any elements of $G$, since there is no integer between $n-1$ and $n$. This shows that $x$ cannot be a boundary point of $G$, establishing the claim.

Note: Here $n - 1 = \lfloor 1/x \rfloor$ and $n = \lceil 1/x \rceil$.

(h) Let $H = \{r \in \mathbb{Q} : r > \sqrt2\}$. Then density of the rationals and of the irrationals will allow us to conclude $\operatorname{bd} H = [\sqrt2,\infty)$ and $\operatorname{int} H = \emptyset$.

To see this, first note that $[\sqrt2,\infty)$ is closed, so $H$ has no boundary points $< \sqrt2$. But, if $\varepsilon > 0$ and $x \ge \sqrt2$, then $B(x,\varepsilon) \cap (\sqrt2,\infty) \supset (x, x+\varepsilon)$; this contains a rational $r$, by density of the rationals. This $r$ is greater than $\sqrt2$, so belongs to $H$. Thus, $B(x,\varepsilon) \cap H \ne \emptyset$. Also, the irrationals are dense, so there is an irrational $t \in (x, x+\varepsilon)$, so $B(x,\varepsilon) \cap H^c \ne \emptyset$. This shows that each $x \ge \sqrt2$ is a boundary point of $H$. Thus, $\operatorname{bd} H = [\sqrt2,+\infty)$ and $\operatorname{int} H = \emptyset$.

In the following list of properties we will see the interior of $A$ is the largest open set contained in $A$.

16.4 Theorem (Properties of interior).
(0) $A$ is open iff $A = \operatorname{int} A$.
(1) $\operatorname{int} A \subset A$.
(2) $A \subset B$ implies $\operatorname{int} A \subset \operatorname{int} B$.
(3) If $G$ is open and $G \subset A$, then $G \subset \operatorname{int} A$.
(4) $\operatorname{int} A$ is open; that is, $\operatorname{int}(\operatorname{int} A) = \operatorname{int} A$.
(5) $\operatorname{int}(A \cap B) = (\operatorname{int} A) \cap (\operatorname{int} B)$.
Consequently, $\operatorname{int}(A)$ is the largest open set contained in $A$.

Proof.
You will see that we have already used some of the arguments in special cases.

(0) and (1) have been established before; we are just collecting them here together.

(2) If $x \in \operatorname{int} A$, then there exists an open ball $U = B(x,\varepsilon)$ with $U \subset A$. But $A \subset B$, so $U \subset B$ and hence $x \in \operatorname{int} B$.

(3) If $G$ is open and $G \subset A$, then $G = \operatorname{int} G \subset \operatorname{int} A$, by (0) and (2).

(4) Let $x \in \operatorname{int} A$. We have to show that there exists a neighbourhood of $x$ contained in $\operatorname{int} A$. By definition, we do know that there is an open ball $U = B(x,\varepsilon) \subset A$. But an open ball is an open set, so $U \subset \operatorname{int} A$, by (3). Thus $\operatorname{int} A$ is open. The formula $\operatorname{int}(\operatorname{int} A) = \operatorname{int} A$ follows by (0).

(5) Since $A \cap B \subset A$, (2) yields $\operatorname{int}(A \cap B) \subset \operatorname{int} A$ and similarly $\operatorname{int}(A \cap B) \subset \operatorname{int} B$, so
\[ \operatorname{int}(A \cap B) \subset (\operatorname{int} A) \cap (\operatorname{int} B). \]
On the other hand, $\operatorname{int} A$ and $\operatorname{int} B$ are open, so $(\operatorname{int} A) \cap (\operatorname{int} B)$ is open and $(\operatorname{int} A) \cap (\operatorname{int} B) \subset A \cap B$. Thus, $(\operatorname{int} A) \cap (\operatorname{int} B) \subset \operatorname{int}(A \cap B)$, by (3).

We said that $\operatorname{int} A$ is the largest open set contained in $A$. Well, (4) says it is open, (1) says it is contained in $A$ and (3) says it contains any other open set contained in $A$, so it is the largest.

Closure. There is an operator that does for closed sets what interior does for open ones. For $A \subset \mathbb{R}$ or $\mathbb{R}^n$ (or any other metric space) we define the closure of $A$ by
\[ \operatorname{cl} A = A \cup \operatorname{bd} A. \]
The elements of $\operatorname{cl} A$ are called closure points of $A$.

Here is a characterization of closure which is so important that many take it as the definition.

16.5 Theorem. A point $x$ is a closure point of a set $A$ iff every neighbourhood of $x$ intersects $A$.

You should prove this as an exercise, from the definition. We can also deduce this fact from the following formula.

16.6 Lemma. $(\operatorname{cl} A)^c = \operatorname{int}(A^c)$.

Proof. This is just a calculation.
Since $\operatorname{cl}(A) = A \cup \operatorname{bd} A$,
\[ (\operatorname{cl} A)^c = (A \cup \operatorname{bd} A)^c = A^c \cap (\operatorname{bd} A)^c = A^c \setminus \operatorname{bd}(A) = A^c \setminus \operatorname{bd}(A^c) = \operatorname{int}(A^c), \]
where the fourth equality holds since $\operatorname{bd}(A) = \operatorname{bd}(A^c)$ and the last is Lemma 16.1 applied to $A^c$.

To get theorem 16.5, then, we see that $x \in \operatorname{cl}(A)$ if and only if $x \notin \operatorname{int}(A^c)$; that is, iff there does not exist a neighbourhood of $x$ contained in $A^c$; in other words, iff every neighbourhood of $x$ intersects $A$.

16.7 Theorem (Properties of closure).
(0) $A$ is closed iff $A = \operatorname{cl} A$.
(1) $\operatorname{cl} A \supset A$.
(2) $A \subset B$ implies $\operatorname{cl} A \subset \operatorname{cl} B$.
(3) If $F$ is closed and $F \supset A$, then $F \supset \operatorname{cl} A$.
(4) $\operatorname{cl} A$ is closed; that is, $\operatorname{cl}(\operatorname{cl} A) = \operatorname{cl} A$.
(5) $\operatorname{cl}(A \cup B) = (\operatorname{cl} A) \cup (\operatorname{cl} B)$.
Hence, $\operatorname{cl}(A)$ is the smallest closed set containing $A$.

The proofs are left as exercises. They follow almost immediately from the corresponding results for interior. Alternatively, one may use similar proofs.

16.1. Find the interior and boundary of each of these sets. Give proof.
(a) $\{1 - 1/n : n \in \mathbb{N}\}$
(b) $(3,4] \cup \{7\}$
(c) $\{r \in \mathbb{Q}^c : r \in (3,4]\}$
(d) $\{n^2 : n \in \mathbb{N}\}$
(e) The line segment from $(0,0)$ to $(1,1)$ of $\mathbb{R}^2$.
Note: You can (and may) use theorems about open and closed sets. For example, every open interval is an open set.

16.2. Find the interior, boundary, and closure of the closed ball in $\mathbb{R}^n$: $\bar B(a,r) = \{x \in \mathbb{R}^n : d(x,a) \le r\}$ (with proof, of course).

16.3. The minimum of a set $A$ of real numbers is never an interior point of $A$.

16.4. When they exist (in $\mathbb{R}$), $\sup A$ and $\inf A$ are always boundary points of $A$.

17. BOUNDED SETS

As we know, in a metric space, the open ball centred at $a$, radius $r$, is
\[ B(a,r) = \{x : d(x,a) < r\}, \]
and the closed ball centred at $a$, radius $r$, is
\[ \bar B(a,r) = \{x : d(x,a) \le r\}. \]
One also says “ball about $a$” instead of “ball centred at $a$”.

Boundedness. In a metric space, a set $A$ is called bounded if it is contained in some ball. (It doesn't matter whether one uses open or closed balls. Why? See Problem 17.1.)

17.1 Theorem.
(a) If $A$ is a set in a metric space $S$, then $A$ is bounded iff for each point $c \in S$, $A$ is contained in some ball centred at $c$.
(b) In $\mathbb{R}^n$, $A$ is bounded in the metric space sense iff there exists $M$ such that $|x| \le M$, for all $x \in A$.
(c) In $\mathbb{R}$, $A$ is bounded in the metric space sense iff it is bounded above and bounded below, and iff there exists $M$ with $|x| \le M$, for all $x \in A$.

Proof. (a) If $A$ is contained in a ball centred at $c$, it is certainly contained in a ball, so it is bounded by definition. Now, suppose $A$ is contained in the ball $B(a,r)$ and $c$ is some other point of the metric space $S$. Then, for all $x \in A$, $d(x,a) < r$, so $d(x,c) \le d(x,a) + d(a,c) < r + d(a,c)$. In other words, $A$ is contained in the open ball centred at $c$, radius $M = r + d(a,c)$.

(b) For $A \subset \mathbb{R}^n$, we can take $c = 0$ in (a) and obtain: $A$ is bounded iff there exists $M > 0$ such that $A \subset B(0,M)$; that is, $|x| < M$ for all $x \in A$.

(c) Let $A \subset \mathbb{R}$. If $A$ is bounded in the metric space sense, then there exist a point $a$ and a radius $r$ such that $A \subset B(a,r)$. Thus, $x \in A$ implies $a - r < x < a + r$. This shows that $A$ is bounded below by $a - r$ and above by $a + r$.

Now suppose $A$ is bounded below by $b$ and above by $c$. Then $A \subset [b,c]$, and $[b,c]$ is a closed ball $\bar B(a,r)$, where $a = (b+c)/2$ and $r = (c-b)/2$. Alternatively, we could let $M = \max\{|b|,|c|\}$ and obtain $-M \le x \le M$, for all $x \in A$; that is, $|x| \le M$, for all $x \in A$. In other words, $A \subset \bar B(0,M)$.

17.2 Note. Be aware of the difference between the definition of bounded and the characterization (a) of the Theorem. The definition says there exists $a \in S$ and there exists $r > 0$ such that $A \subset B(a,r)$. The characterization says we can change the centre to any other point of $S$: for all $c \in S$, there exists $r$ such that $A \subset B(c,r)$. As we indicated in (b), if the space is $\mathbb{R}^n$ one often takes the centre to be $0$.

17.3 Example. By an interval in $\mathbb{R}^n$, we mean a cartesian product $I_1 \times \cdots \times I_n$ where each $I_i$ is an interval of the reals.
If $a = (a_1,\dots,a_n)$ and $b = (b_1,\dots,b_n)$ are points of $\mathbb{R}^n$, then $a \le b$ means $a_i \le b_i$ for $i = 1,\dots,n$. The closed interval $[a,b]$ then means
\[ [a,b] = \{x : a_i \le x_i \le b_i, \text{ for } i = 1,\dots,n\} = [a_1,b_1] \times \cdots \times [a_n,b_n]. \]
Every closed interval $I = [a_1,b_1] \times [a_2,b_2] \times \cdots \times [a_n,b_n]$ of $\mathbb{R}^n$ is bounded. Conversely, every bounded set in $\mathbb{R}^n$ is contained in some such closed interval. The reader should prove this. What is the radius of the smallest ball that contains $I$?

For a set $A$ of a metric space $S$, the diameter of $A$ is
\[ \operatorname{diam}(A) = \sup\{d(x,y) : x, y \in A\}, \]
with the convention that $\operatorname{diam}(A) = \infty$, if this set of distances is not bounded above. Also the diameter of the empty set is considered to be $0$.

17.1. Prove that a set $A$ in a metric space is contained in some open ball if and only if it is contained in some closed ball.

17.2. A set in a metric space is bounded if and only if its diameter is finite.

17.3. For a non-empty set $A$ in a metric space, $\operatorname{diam}(A) = \operatorname{diam}(\operatorname{cl} A)$.

17.4. A set is bounded if and only if its closure is bounded.

17.5. The diameter of the sphere in $\mathbb{R}^n$ of radius $r$, $S(a,r) = \{x \in \mathbb{R}^n : |x - a| = r\}$, is $2r$. The same is true for the closed ball and the open ball of radius $r$. This need not be true in other metric spaces.

17.6. If $I = [a,b] = [a_1,b_1] \times \cdots \times [a_n,b_n]$ is a closed interval of $\mathbb{R}^n$, then $\operatorname{diam}(I) = d(a,b) = |a - b|$, the “length of the diagonal”. ($a = (a_1,\dots,a_n)$, $b = (b_1,\dots,b_n)$.) What if $I$ is a non-closed interval with these endpoints?

17.7. (Bisection Procedure) Let $I = I_1 \times \cdots \times I_n$, $I_i = [a_i,b_i]$, $c_i = \frac{a_i + b_i}{2}$ for $i = 1,\dots,n$. For each $i$, put $I_i^0 = [a_i,c_i]$ and $I_i^1 = [c_i,b_i]$. Then, $I_i = I_i^0 \cup I_i^1$, so
\[ I = I_1 \times \cdots \times I_n = \bigcup_{e \in \{0,1\}^n} I_1^{e_1} \times \cdots \times I_n^{e_n}. \]
Each $e = (e_1,\dots,e_n)$ here is an $n$-tuple of 0's and 1's. There are $2^n$ of them.
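The splitting in the Bisection Procedure can be sketched in Python. This is a minimal illustration under stated assumptions rather than anything from the notes themselves: the function names `bisect_interval` and `diameter` are hypothetical, endpoints are given as tuples of floats, and the diameter of a closed interval is taken to be the Euclidean distance between its corner points $a$ and $b$ (as in problem 17.6).

```python
from itertools import product
import math

def bisect_interval(a, b):
    """Split [a1,b1] x ... x [an,bn] into 2^n closed subintervals.

    Each piece is returned as (lo, hi); the tuple e of 0's and 1's
    selects, coordinate by coordinate, the left half [a_i, c_i]
    (e_i = 0) or the right half [c_i, b_i] (e_i = 1).
    """
    n = len(a)
    c = [(ai + bi) / 2 for ai, bi in zip(a, b)]  # midpoints c_i
    pieces = []
    for e in product((0, 1), repeat=n):
        lo = tuple(a[i] if e[i] == 0 else c[i] for i in range(n))
        hi = tuple(c[i] if e[i] == 0 else b[i] for i in range(n))
        pieces.append((lo, hi))
    return pieces

def diameter(lo, hi):
    """Diameter of the closed interval [lo, hi]: the length of its diagonal."""
    return math.dist(lo, hi)
```

For the rectangle with corners `(0.0, 0.0)` and `(2.0, 4.0)` this produces four pieces, one of which is the rectangle with corners `(0.0, 0.0)` and `(1.0, 2.0)`; each piece has half the diameter of the original.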
Each product $I_1^{e_1} \times \cdots \times I_n^{e_n}$ is a closed interval with diameter
\[ \Bigl(\sum_i \bigl(\operatorname{length} I_i^{e_i}\bigr)^2\Bigr)^{1/2} = \Bigl(\sum_i \Bigl(\frac{b_i - a_i}{2}\Bigr)^2\Bigr)^{1/2} = \frac12\, d(a,b). \]
In the case $n = 2$, the representation looks like
\[ I = (I_1^0 \times I_2^0) \cup (I_1^0 \times I_2^1) \cup (I_1^1 \times I_2^0) \cup (I_1^1 \times I_2^1). \]

18. ACCUMULATION AND THE BOLZANO-WEIERSTRASS THEOREM (SET FORM)

Accumulation points. For a point $c$ and a set $A$, $c$ is an accumulation point of $A$ if for each $\varepsilon > 0$, $B(c,\varepsilon) \cap A \setminus \{c\} \ne \emptyset$. Thus, $c$ is an accumulation point of $A$ iff each neighbourhood of $c$ intersects $A$ in a point other than $c$. The set of all accumulation points of $A$ will be denoted here by $\operatorname{acc} A$. (In some books this is denoted $A'$ and is called the derived set of $A$. We won't use this terminology.)

A neighbourhood of $c$ with $c$ removed is called a deleted neighbourhood of $c$. In particular, the set $B'(c,\varepsilon) = B(c,\varepsilon) \setminus \{c\}$ is called the deleted neighbourhood of $c$ of radius $\varepsilon$. (Here $\varepsilon > 0$ as usual.) Using this language we could say that $c$ is an accumulation point of $A$ if each deleted neighbourhood of $c$ intersects $A$.

18.1 Theorem.
(a) $\operatorname{acc} A \setminus A = \operatorname{bd} A \setminus A = \operatorname{cl} A \setminus A$.
(b) $\operatorname{cl} A = A \cup \operatorname{acc} A$ $(= A \cup \operatorname{bd} A)$.

Proof. (a) Let $c \in (\operatorname{acc} A) \setminus A$. Then $c \notin A$ and every neighbourhood of $c$ intersects $A \setminus \{c\}$. Let $U$ be a neighbourhood of $c$. Since $U$ intersects $A \setminus \{c\}$ it intersects $A$, and it also intersects $A^c$, since $c \in U \cap A^c$. This shows that $c \in \operatorname{bd} A$. Since it is also in $A^c$, $c \in \operatorname{bd} A \setminus A$.

Conversely, let $c \in \operatorname{bd} A \setminus A$. Then $c \in \operatorname{bd} A$ and $c \notin A$. By definition of boundary, every neighbourhood $U$ of $c$ intersects $A$ (and $A^c$). And since $c \notin A$, $U \cap A \setminus \{c\} = U \cap A \ne \emptyset$, so that $c \in \operatorname{acc} A$.

The second equality is trivial: $\operatorname{cl} A \setminus A = (A \cup \operatorname{bd} A) \setminus A = \operatorname{bd} A \setminus A$.

(b) is of the same level of difficulty:
\[ \operatorname{cl} A = A \cup \operatorname{bd} A = A \cup (\operatorname{bd} A \setminus A) = A \cup (\operatorname{acc} A \setminus A) = A \cup \operatorname{acc} A, \]
by (a).

Many books use the formula $\operatorname{cl} A = A \cup \operatorname{acc} A$ as the definition of closure.

18.2 Examples.
(1) If $A$ is $(2,3]$, then $\operatorname{acc} A = [2,3]$.
(2) No finite set has an accumulation point.
Indeed, if $F$ is a finite set and $\varepsilon = \min\{d(c,a) : a \in F \setminus \{c\}\}$, then $\varepsilon > 0$ and $B(c,\varepsilon) \cap F \setminus \{c\} = \emptyset$.
(3) If $A = \{\frac1n : n \in \mathbb{N}\}$, then $\operatorname{acc} A = \{0\}$.
(4) $\operatorname{acc} \mathbb{N} = \emptyset$.

A point $x$ is called an isolated point of $A$ if it belongs to $A$, but is not an accumulation point of $A$. Thus, in the above examples (2), (3), and (4) all the points of the set are isolated.

The following is one of the reasons for the name “accumulation” point.

18.3 Theorem. A point $c$ is an accumulation point of $A$ iff every neighbourhood of $c$ contains infinitely many points of $A$.

Proof. ($\Longleftarrow$) This direction of the proof is immediate. If a neighbourhood contains infinitely many points of $A$, then it contains one other than $c$!

($\Longrightarrow$) Let $c$ be an accumulation point of $A$. Let $U = B(c,\varepsilon)$ be a neighbourhood of $c$. Then $U$ has at least one point of $A$ other than $c$. Suppose $U \cap A$ is $F$, finite, and put $\varepsilon' = \min\{d(c,a) : a \in F \setminus \{c\}\}$. Then $0 < \varepsilon' < \varepsilon$, so $B(c,\varepsilon') \subset U$, but $B(c,\varepsilon')$ contains no element of $A$ other than $c$, a contradiction.

18.4 The Bolzano-Weierstrass Theorem (set form). Every bounded infinite set in $\mathbb{R}$ or $\mathbb{R}^n$ has an accumulation point.

We will be giving several proofs of this, because of the various techniques that they teach. Many of our results are true in general metric spaces. This is not one of them. It depends on the order completeness of the reals.

Proof. Let $A$ be a bounded infinite set in $\mathbb{R}^n$. Since $A$ is bounded, there exists a closed bounded interval $I$ with $A \subset I$. Let $d$ be its diameter. (If $I = [a,b] = [a_1,b_1] \times \cdots \times [a_n,b_n]$, the diameter of $I$ is $|a - b|$.) Now, we may bisect $I$ using the midpoints $p_i = \frac{a_i + b_i}{2}$, obtaining it as the union of $2^n$ closed intervals of diameter $d/2$: see problem 17.7. (Each of these intervals is of the form $[x,y]$, where for each $i$, either $x_i = a_i$ and $y_i = p_i$, or $x_i = p_i$ and $y_i = b_i$.) At least one of these $2^n$ intervals must contain an infinite number of points of $A$; otherwise, $A$ would be finite.
So let $I_1$ be one of these: a closed interval contained in $I$ with $\operatorname{diam}(I_1) = d/2$ which contains infinitely many points of $A$. Continue this recursively: if $I_k$ has been chosen of diameter $d/2^k$ containing infinitely many points of $A$, choose $I_{k+1}$, a closed interval of diameter $d/2^{k+1}$ contained in $I_k$ and also containing infinitely many points of $A$. By Cantor's Principle of nested intervals, the intersection
\[ \bigcap_{k \in \mathbb{N}} I_k \]
contains some point $c$. (In fact it will be a singleton $\{c\}$, but that is not needed here.) We claim $c$ is an accumulation point of $A$. To see this, let $\varepsilon > 0$. Then there exists $k \in \mathbb{N}$ with $d/2^k < \varepsilon$ and we have $I_k \subset B(c,\varepsilon)$. Indeed, $c \in I_k$ and if $x \in I_k$, then $d(x,c) \le \operatorname{diam}(I_k) = d/2^k < \varepsilon$. Finally, since $I_k$ contains infinitely many points of $A$, the neighbourhood $B(c,\varepsilon)$ contains infinitely many points of $A$, so $c$ is an accumulation point of $A$.

18.1. If $A \subset B$, then $\operatorname{acc} A \subset \operatorname{acc} B$.

18.2. If $A$ is a non-empty set of real numbers bounded below with no minimum, then $\inf A$ is an accumulation point of $A$.

18.3. In $\mathbb{R}^n$, find the set of accumulation points, the boundary, and the closure of $B(a,r)$, $r > 0$.

18.4. In $\mathbb{R}^n$, find the set of accumulation points, the boundary, and the closure of the sphere $S(a,r) = \{x : |x - a| = r\}$, $r > 0$.

18.5. Find the set of accumulation points of these sets. Give proof.
(a) $\{1 - 1/n : n \in \mathbb{N}\}$
(b) $(3,4] \cup \{7\}$
(c) $\{r \in \mathbb{Q}^c : r \in (3,4]\}$
(d) $\{n^2 : n \in \mathbb{N}\}$
(e) The line segment from $(0,0)$ to $(1,1)$ of $\mathbb{R}^2$.

18.6. In $\mathbb{R}$ or $\mathbb{R}^n$ a set without accumulation points must have empty interior.

18.7. By contrast with problem 18.6: Let the metric space be $\mathbb{Z}$, the integers, with the usual distance. (Thus, it is a subspace of $\mathbb{R}$.) Then, for all $A \subset \mathbb{Z}$, $\operatorname{acc}(A) = \emptyset$, but $\operatorname{int}(A) = A$.

19. COMPACTNESS AND THE HEINE-BOREL THEOREM

The general statements and definitions we make here are valid in $\mathbb{R}$, or $\mathbb{R}^n$, or any metric space. The Heine-Borel Theorem, however, is only true in $\mathbb{R}$ or $\mathbb{R}^n$.

If $\mathcal{U}$ is a family of sets and $A$ is another set, $\mathcal{U}$ covers $A$ means every element of $A$ is in some member of $\mathcal{U}$.
This can be stated in terms of the union of the sets of $\mathcal{U}$:
\[ \bigcup \mathcal{U} = \bigcup_{U \in \mathcal{U}} U \supset A. \]
To say a set $K$ is compact means that whenever $\mathcal{U}$ is a family of open sets covering $K$, there is a finite $\mathcal{U}' \subset \mathcal{U}$ which also covers $K$.

19.1 Example. Every finite set is compact.

Proof. Let $K = \{x_1,\dots,x_n\}$. To prove $K$ is compact we must begin with an arbitrary family of open sets which covers $K$ and extract from it a finite subfamily which still covers $K$. So let $\mathcal{U}$ be such a family of open sets:
\[ \bigcup_{U \in \mathcal{U}} U \supset K. \]
Then for each $i = 1,\dots,n$, $x_i \in \bigcup_{U \in \mathcal{U}} U$; hence we may choose a $U_i \in \mathcal{U}$ with $x_i \in U_i$. Put $\mathcal{U}' = \{U_1,\dots,U_n\}$. Then
\[ \bigcup \mathcal{U}' = U_1 \cup \cdots \cup U_n \supset K, \]
and $\mathcal{U}'$ is finite, so $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$ which covers $K$. We have shown that every family of open sets which covers $K$ has a finite subfamily which also covers $K$. That is, by definition, $K$ is compact.

19.2 Example. Let $A = [1,\infty) = \{x \in \mathbb{R} : x \ge 1\}$. Then $A$ is not compact.

To prove this we must show that there exists $\mathcal{U}$, a family of open sets which covers $A$, such that there does not exist a finite subfamily $\mathcal{U}'$ of $\mathcal{U}$ which also covers $A$. For each $n \in \mathbb{N}$, put $U_n = (0,n)$, an open interval, hence an open set. Put $\mathcal{U} = \{U_n : n \in \mathbb{N}\}$. Then
\[ \bigcup \mathcal{U} = \bigcup_{U \in \mathcal{U}} U = \bigcup_{n \in \mathbb{N}} U_n. \]
This contains $A$. Indeed, $a \in A$ means $a \in \mathbb{R}$ with $a \ge 1$. By the Archimedean Property, there exists $N \in \mathbb{N}$ with $N > a$, hence $0 < a < N$. In other words $a \in U_N$, so $a \in \bigcup_{n \in \mathbb{N}} U_n$.

Now, suppose $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$. Then, we can write $\mathcal{U}' = \{U_{n_1},\dots,U_{n_k}\}$, where $k \in \mathbb{N}$. Let $M$ be the maximum of the numbers $n_1,\dots,n_k$. Since the $U_n$ are (increasingly) nested,
\[ \bigcup \mathcal{U}' = (0,M) = U_M. \]
This does not contain $A$, since $A$ contains many points $> M$.

Thus, $\mathcal{U}$ is a family of open sets covering $A$, which has no finite subfamily which covers $A$. Therefore, $A$ is not compact.

19.3 Example. $(2,5]$ is not compact.

To prove this notice that this interval has no minimum. It is the fact that the 2 is missing that will make it not compact.
We put $U_n = (2 + \frac1n, \infty)$, for each $n \in \mathbb{N}$. The family $\mathcal{U} = \{U_n : n \in \mathbb{N}\}$ consists of open sets and it covers $(2,5]$. Indeed, if $x \in (2,5]$ and $n$ is chosen so that $1/n < x - 2$, then $x > 2 + 1/n$, so $x \in U_n$. (Actually, $\bigcup_{n \in \mathbb{N}} U_n = (2,\infty)$, as this method of proof shows.)

Now, suppose $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$, say $\mathcal{U}' = \{U_{n_1},\dots,U_{n_k}\}$. If $M$ is the largest of the $n_i$, $i = 1,\dots,k$, then the union of $\mathcal{U}'$ is $U_M = (2 + \frac1M, \infty)$ and there is a point in
\[ (2,5] \setminus U_M = \bigl(2, 2 + \tfrac1M\bigr], \]
so that $\mathcal{U}'$ does not cover $(2,5]$. Thus, $(2,5]$ is not compact.

The examples here are indicative of a general theorem.

19.4 Theorem. In any metric space, every compact set is closed and bounded.

Proof. Let $K$ be a compact set in the metric space $S$. To prove $K$ is bounded, let $a$ be any point of $S$; we will find a ball centred at $a$ containing $K$.

For each positive $r \in \mathbb{R}$, let $U_r = B(a,r)$. Then, $U_r$ is an open set for each $r > 0$. Put $\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\}$. Now, each point $x$ is some distance $d(a,x)$ away from $a$. Thus, there exists an $r$ with $r > d(a,x)$, so $x \in B(a,r)$. Since this is true for all $x \in S$, in particular for all $x \in K$ there exists $r > 0$ such that $x \in U_r$. This shows that $\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\}$ covers $K$. Therefore, by compactness, there exists a finite subfamily $\mathcal{U}'$ of $\mathcal{U}$ which also covers $K$. Say, $\mathcal{U}' = \{U_{r_1},\dots,U_{r_k}\}$. If $M$ is the maximum of the $r_i$, we have, for all $i$, $U_{r_i} \subset U_M$, so every element of $K$ belongs to $U_M = B(a,M)$. Thus, $K$ is contained in the ball $B(a,M)$, so is bounded.

Now, to prove $K$ is closed, let $a \notin K$. We will find a neighbourhood of $a$ that is disjoint from $K$.

This time, for $0 < r \in \mathbb{R}$, we put $U_r = \bar B(a,r)^c$ and $\mathcal{U} = \{U_r : 0 < r \in \mathbb{R}\}$. Since every closed ball is a closed set and the complement of a closed set is open, $\mathcal{U}$ is a family of open sets. Now, let $x \in K$. Then, $x \ne a$, so $d(x,a) > 0$. Thus, if $r = d(x,a)/2$, then $r > 0$ and $x \notin \bar B(a,r)$; in other words, $x \in U_r$.
Thus, each $x \in K$ is in some $U \in \mathcal{U}$, so $\mathcal{U}$ covers $K$. Therefore, since $K$ is compact, there is a finite subfamily, say $\mathcal{U}' = \{U_{r_1},\dots,U_{r_n}\}$, which covers $K$. Thus, for each $x \in K$, there exists $i \in \{1,\dots,n\}$ with $x \in U_{r_i}$; in other words, with $x \notin \bar B(a,r_i)$. Let $\varepsilon = \min\{r_1,\dots,r_n\}$, so that for $i = 1,\dots,n$, $\bar B(a,\varepsilon) \subset \bar B(a,r_i)$. Thus, for all $x \in K$, $x \notin \bar B(a,\varepsilon)$; that is, $\bar B(a,\varepsilon) \cap K = \emptyset$, and in particular $B(a,\varepsilon) \cap K = \emptyset$. Thus, every point outside $K$ has a neighbourhood disjoint from $K$, so $K$ is closed.

A family $\mathcal{U}$ of open sets which covers $K$ is often referred to as an open cover of $K$, even though normally “open” refers to a set. A subfamily of a cover of $A$ which still covers $A$ is called a subcover. Sometimes one says ‘subcover of $A$’, but it is ‘sub’ of $\mathcal{U}$ and cover of $A$. In this language the definition of compact set becomes: $K$ is compact if and only if every open cover of $K$ has a finite subcover.

19.5 Theorem. Every closed subset of a compact set is compact.

Proof. Let $K$ be compact. Let $F$ be a closed set with $F \subset K$. Let $\mathcal{U}$ be a family of open sets covering $F$.

The plan is to construct an open cover $\tilde{\mathcal{U}}$ of $K$, extract a finite $\tilde{\mathcal{U}}' \subset \tilde{\mathcal{U}}$ which covers $K$, and then get from it the finite $\mathcal{U}' \subset \mathcal{U}$ which covers $F$. Please do not make the mistake many students make of starting with a cover of $K$.

We put $\tilde{\mathcal{U}} = \mathcal{U} \cup \{F^c\}$. Since $F$ is closed, the elements of $\tilde{\mathcal{U}}$ are still open. Since $\mathcal{U}$ covers $F$, every element of $F$ is in some $U \in \mathcal{U}$ and every other element of $K$ is in $F^c$, so $\tilde{\mathcal{U}}$ is a family of open sets which covers $K$. Thus, there is a finite $\tilde{\mathcal{U}}' \subset \tilde{\mathcal{U}}$ which covers $K$ and hence also covers $F$. Now, $F^c$ may or may not belong to $\tilde{\mathcal{U}}'$, but it contains no points of $F$, so is not really needed. We remove it if it is there: put
\[ \mathcal{U}' = \tilde{\mathcal{U}}' \setminus \{F^c\}. \]
Then, $\mathcal{U}'$ still covers $F$, for if $x \in F$, then there exists $U \in \tilde{\mathcal{U}}'$ with $x \in U$; but this $U$ cannot be $F^c$, since $x \notin F^c$, so it must belong to $\mathcal{U}'$.

As we said before, the general results above are actually true in any metric space.
The following, however, is not. The fact that we are dealing with $\mathbb{R}^m$ is essential.

19.6 The Heine-Borel Theorem. In $\mathbb{R}^m$, any closed bounded set is compact.

Proof. If a set is bounded, it is contained in a closed interval $I = [a,b]$. Since each closed subset of a compact set is compact, we need only prove that $I$ is compact.

For this, suppose to the contrary that $\mathcal{U}$ is a family of open sets that covers $I$ but no finite subfamily covers $I$. Put $I_1 = I$ and let $d = \operatorname{diam}(I)$. Assume a closed interval $I_n \subset I$ has been chosen such that $I_n$ is not covered by a finite subfamily of $\mathcal{U}$ and
\[ \operatorname{diam}(I_n) = \frac{1}{2^{n-1}}\, d. \]
Then, $I_n$ is the union of a finite number ($2^m$) of closed intervals of diameter $\frac12 \operatorname{diam}(I_n) = \frac{1}{2^n}\, d$. If each of these could be covered by finite subfamilies of $\mathcal{U}$, then so could $I_n$, so at least one of them, say $I_{n+1}$, cannot be covered by a finite subfamily of $\mathcal{U}$.

Thus, we have obtained, by recursion, a nested sequence $(I_n)$ of closed intervals, and by Cantor's Principle, $\bigcap_n I_n \ne \emptyset$. Let $c$ be a point of this intersection. Now, $\mathcal{U}$ is an open cover of $I$, so there exists $U \in \mathcal{U}$ with $c \in U$. For such a $U$, there exists $\varepsilon > 0$ with $B(c,\varepsilon) \subset U$. Choose such an $\varepsilon$ and then an $n$ so that
\[ \operatorname{diam}(I_n) = d/2^{n-1} < \varepsilon. \]
Then, $I_n \subset B(c,\varepsilon) \subset U$. Thus, $\{U\}$ is a finite subfamily of $\mathcal{U}$ which covers $I_n$, contrary to the construction: no finite subfamily of $\mathcal{U}$ covers $I_n$. This contradiction shows that $I$ is compact.

19.7 Theorem. Let $\mathcal{F}$ be a non-empty family of compact sets such that for each finite subfamily $\mathcal{F}'$ of $\mathcal{F}$, $\bigcap \mathcal{F}' \ne \emptyset$. Then $\bigcap \mathcal{F} \ne \emptyset$.

(Recall that $\bigcap \mathcal{F}$ and $\bigcap_{F \in \mathcal{F}} F$ both mean the same thing.)

Proof. Let $\mathcal{F}$ be a non-empty family of compact sets such that each finite subfamily has non-empty intersection. Let $\mathcal{U} = \{F^c : F \in \mathcal{F}\}$. Now, $F$ compact implies $F$ is closed and hence $F^c$ is open. Thus, $\mathcal{U}$ is a family of open sets.

Choose any $K \in \mathcal{F}$. If we suppose $\bigcap_{F \in \mathcal{F}} F = \emptyset$, we have
\[ \bigcup \mathcal{U} = \bigcup_{F \in \mathcal{F}} F^c = \Bigl(\bigcap_{F \in \mathcal{F}} F\Bigr)^c = \emptyset^c \supset K. \]
Thus, by compactness there exists a finite $\mathcal{U}' \subset \mathcal{U}$ which also covers $K$.
In other words, there exists a finite $\mathcal{F}' \subset \mathcal{F}$ such that
\[ \bigcup_{F \in \mathcal{F}'} F^c \supset K. \]
That is,
\[ \bigcap_{F \in \mathcal{F}'} F \subset K^c, \]
or, equivalently,
\[ \bigcap_{F \in \mathcal{F}'} F \cap K = \emptyset. \]
This shows that the intersection of the finite subfamily $\mathcal{F}' \cup \{K\}$ is empty. This is a contradiction.

19.8 Example (Practice). To see if you understand, show that Cantor's Principle of Nested Intervals can be regarded as a special case of the above theorem.

Compactness in subspaces. You will recall (see Theorem 15.8) that, in a subspace $X$ of a metric space $S$,
(1) a set $U$ is open iff there exists a set $G$, open in $S$, such that $U = G \cap X$;
(2) a set $C$ is closed iff there exists a set $F$, closed in $S$, such that $C = F \cap X$.
In the first case $U$ need not be open in $S$. For example, if $X$ is not open in $S$ we can take $U = X$ and $G = S$. (However, $U$ will be open in $S$ if $X$ is open in $S$. Indeed, $U = G \cap X$ is then the intersection of 2 open sets in $S$.) Similarly, in case (2), $C$ need not be closed in $S$, though it is so if $X$ is closed in $S$.

But compactness behaves differently.

19.9 Theorem. Let $X$ be a subspace of the metric space $S$. If $K \subset X$, then $K$ is compact in $X$ iff $K$ is compact in $S$.

Proof. Suppose $K$ is compact in $X$. To prove $K$ is compact in $S$, let $\mathcal{U}$ be a cover of $K$ by open sets in $S$. Since
\[ \bigcup_{U \in \mathcal{U}} U \supset K, \]
intersecting with $X$ gives
\[ \bigcup_{U \in \mathcal{U}} (U \cap X) \supset K \cap X = K. \]
Thus, $\{U \cap X : U \in \mathcal{U}\}$ is a cover of $K$ by open sets of $X$. But $K$ is compact in $X$, so there exists a finite subcover $\{U_1 \cap X, \dots, U_n \cap X\}$. Thus, $(U_1 \cap X) \cup \cdots \cup (U_n \cap X) \supset K$ and therefore, $U_1 \cup \cdots \cup U_n \supset K$, so that $\{U_1,\dots,U_n\}$ is the required subcover, showing that $K$ is compact in $S$.

Now, for the converse, suppose $K$ is compact in $S$. To show $K$ is compact in the subspace $X$, let $\mathcal{U}$ be a cover of $K$ by open sets of $X$. For each $U \in \mathcal{U}$, there is an open $G_U$ of $S$ with $U = G_U \cap X$. Then
\[ \bigcup_{U \in \mathcal{U}} G_U \supset \bigcup_{U \in \mathcal{U}} U \supset K, \]
so that $\{G_U : U \in \mathcal{U}\}$ is a cover of $K$ by open sets of $S$. Thus, there is a finite subcover.
Say, $G_{U_1} \cup \dots \cup G_{U_n} \supset K$. Intersect these all with $X$ and get
\[ (G_{U_1} \cap X) \cup \dots \cup (G_{U_n} \cap X) \supset K \cap X = K. \]
That is, $U_1 \cup \dots \cup U_n \supset K$, so $\{U_1, \dots, U_n\}$ is a subfamily of $\mathcal{U}$ covering $K$, the required subcover. Hence, $K$ is compact in $X$.

The conclusion to all of this is that, if a theorem refers to a compact subset of $X$, a subspace of $S$, it doesn't matter which of these spaces we think of the set being in. This frequently comes up if we are working with a function defined only on a subset of $\mathbb{R}$, or if we need to restrict a function to a subset of its domain.

19.1. Let $K$ be compact in a metric space. For each $x \in K$ let $\delta(x)$ be a positive real number. Prove that there are finitely many elements $x_1, \dots, x_k$ of $K$ such that $B(x_1, \delta(x_1)) \cup \dots \cup B(x_k, \delta(x_k)) \supset K$.

19.2. Find a metric space and a subset of it which is closed and bounded but not compact.

19.3. The set $A = [2, 4)$ is not compact.
(a) Prove this by explicitly finding a family of open sets which covers $A$ but has no finite subfamily which also covers $A$.
(b) Find another family of open sets which covers $A$ and does have a finite subfamily which covers $A$.

20. CONVERGENCE OF SEQUENCES: DEFINITION AND EXAMPLES

A sequence $(x_n)$ in a space $X$ is a function $n \mapsto x_n$ on $\{n \in \mathbb{Z} : n \geq p\}$, for some fixed $p$. To make more precise where the indexing starts one can use the notation $(x_n)_{n=p}^{\infty}$. One usually takes $p = 1$ as typical, which is what we do here. Sometimes it is convenient to use $p = 0$.

A sequence $(x_n)$ in a metric space is said to converge to the point $a$ provided

for every $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for $n > N$, $d(x_n, a) < \varepsilon$. (22)

In symbols,
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon. \tag{22a} \]
We then call $a$ the limit of the sequence $(x_n)$ and write
\[ \lim_n x_n = a \quad\text{or}\quad x_n \to a. \]
To say a sequence $(x_n)$ diverges simply means it doesn't converge. Thus each sequence in a metric space is either convergent or divergent.

Examples.
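The definition of convergence can be explored numerically before any proof is attempted. The following is a minimal Python sketch, not a proof: for the sample sequence $x_n = 1/n$ and the proposed limit $0$, it picks an $N$ with $1/N < \varepsilon$ and spot-checks finitely many terms beyond $N$. The helper names `index_for` and `tail_is_close` and the window of 1000 terms are hypothetical choices made only for illustration.

```python
import math

def index_for(eps):
    """Return a natural number N with 1/N < eps (one possible choice)."""
    return math.floor(1 / eps) + 1

def tail_is_close(x, a, eps, N, window=1000):
    """Spot-check d(x_n, a) < eps for n = N+1, ..., N+window.

    A finite check like this can expose a bad choice of N,
    but it can never replace the proof for all n > N.
    """
    return all(abs(x(n) - a) < eps for n in range(N + 1, N + window + 1))

def x(n):
    return 1 / n  # sample sequence, assumed for illustration

# One entry per tested epsilon: True means no counterexample was found.
results = {eps: tail_is_close(x, 0.0, eps, index_for(eps))
           for eps in (0.5, 0.1, 0.003)}
print(results)
```

A `False` entry would show that the chosen $N$ fails for that $\varepsilon$; entries that are all `True` are merely consistent with convergence.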
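Below we give the general term of each of a number of sequences. The object is to investigate whether the limits of the sequences exist or not, with proof.
(1) $\frac{1}{n}$
(2) $\frac{1}{\sqrt{n}}$
(3) $1 + \frac{1}{2^n}$
(4) $(-1)^n$
(5) $\frac{n^2}{1 + n^2}$
(6) $\frac{n^2 + 2n}{n^3 - 5}$
(7) $(-1)^n \left(\frac12 - \frac1n\right)$
(8) $n^2$

Remember that $|a + b| \leq |a| + |b|$ and $\bigl||a| - |b|\bigr| \leq |a - b|$, versions of the triangle inequality.

20.1 Example. $\lim_n \frac1n = 0$. In other words, $\left(\frac1n\right)$ converges to $0$ or, briefly, $\frac1n \to 0$.

Analysis. Let $x_n = \frac1n$. The definition of convergence involves showing that the distance from the general term $x_n$ to $0$ can be made small when $n$ is large. So we look at
\[ d\left(\tfrac1n, 0\right) = \left|\tfrac1n - 0\right|. \]
We have to make this small when $n$ is large. More precisely, we must show that
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d\left(\tfrac1n, 0\right) < \varepsilon. \]
Now,
\[ d\left(\tfrac1n, 0\right) = \left|\tfrac1n - 0\right| = \tfrac1n. \]
The Archimedean Property lets us find $N$ with $\frac1N < \varepsilon$, and any $n > N$ will satisfy $\frac1n < \varepsilon$. So let us put this into a formal proof.

Proof. Let $\varepsilon > 0$ be given (fixed, but arbitrary). Choose, by the Archimedean Property, $N \in \mathbb{N}$ with $\frac1N < \varepsilon$. Then, for $n > N$,
\[ \tfrac1n < \tfrac1N < \varepsilon; \]
therefore,
\[ d\left(\tfrac1n, 0\right) = \left|\tfrac1n - 0\right| = \tfrac1n < \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $d\left(\frac1n, 0\right) < \varepsilon$. That is, $\lim_n \frac1n = 0$.

20.2 Example. $\left(\frac{1}{\sqrt n}\right)$ converges to $0$.

Analysis. We guess that $\lim_n \frac{1}{\sqrt n} = 0$. The relevant distance is
\[ d\left(\tfrac{1}{\sqrt n}, 0\right) = \left|\tfrac{1}{\sqrt n} - 0\right| = \tfrac{1}{\sqrt n}. \]
For $\varepsilon > 0$ we need to find $N$ such that $n > N$ implies $\frac{1}{\sqrt n} < \varepsilon$. Now,
\[ \tfrac{1}{\sqrt n} < \varepsilon \iff \tfrac1n < \varepsilon^2. \]
So we see how the proof should go.

Proof. Let $\varepsilon > 0$ be given. Choose $N \in \mathbb{N}$ with
\[ \tfrac1N \leq \varepsilon^2 \quad\text{(possible by the Archimedean Property)}. \]
Then for $n > N$,
\[ \tfrac1n < \tfrac1N \leq \varepsilon^2; \quad\text{hence}\quad \tfrac{1}{\sqrt n} < \varepsilon. \]
Thus, for all $n > N$, we have $d\left(\frac{1}{\sqrt n}, 0\right) = \left|\frac{1}{\sqrt n} - 0\right| < \varepsilon$. But $\varepsilon$ was arbitrary, so for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $d\left(\frac{1}{\sqrt n}, 0\right) < \varepsilon$; that is, $\lim_n \frac{1}{\sqrt n} = 0$.

20.3 Example. Let $x_n = 1 + \frac{1}{2^n}$. Then $\lim_n x_n = 1$.

Analysis.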
We are to show that no matter what $\varepsilon > 0$ we are given, we can find an $N$ so large that $d(x_n, 1) < \varepsilon$, for $n > N$. Now,
\[ d(x_n, 1) = |x_n - 1| = \left|1 + \tfrac{1}{2^n} - 1\right| = \tfrac{1}{2^n}. \]
Since $2^n > n$, for all $n \in \mathbb{N}$, we have $\frac{1}{2^n} < \varepsilon$ provided $\frac1n < \varepsilon$.

Proof. Let $\varepsilon > 0$ be given. Choose $N \in \mathbb{N}$ with $\frac1N < \varepsilon$, by the Archimedean Property. Then for all $n \in \mathbb{N}$, $n > N$ implies
\[ d(x_n, 1) = |x_n - 1| = \left|1 + \tfrac{1}{2^n} - 1\right| = \tfrac{1}{2^n} < \tfrac1n < \varepsilon. \]
Thus, for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $n > N$ implies $d(x_n, 1) < \varepsilon$; that is, $x_n \to 1$.

20.4 Example. Let $a_n = (-1)^n$. Then $\lim_n a_n$ does not exist. That is, the sequence $(a_n)$ does not converge.

Notice that $a_1 = -1$, $a_2 = 1$, $a_3 = -1$, $a_4 = 1$, and in general
\[ a_n = \begin{cases} -1, & \text{if $n$ is odd} \\ \phantom{-}1, & \text{if $n$ is even.} \end{cases} \]
The successive values here never get closer together than 2. Imagine $a_n \to c$. When $n$ is even, $|a_n - c| < \varepsilon$ gives $|1 - c| < \varepsilon$, hence
\[ 1 - \varepsilon < c < 1 + \varepsilon. \]
When $n$ is odd, $|a_n - c| < \varepsilon$ gives $|(-1) - c| < \varepsilon$; hence
\[ -1 - \varepsilon < c < -1 + \varepsilon. \]
We take $\varepsilon = 1$. Then the even case yields $0 < c < 2$ and the odd one gives $-2 < c < 0$. Putting these things together properly will yield a contradiction.

Claim. $(a_n)$ diverges, that is, does not converge.

Proof. Suppose $(a_n)$ converged. Then there would exist $c$ with $a_n \to c$. Take $\varepsilon = 1$. Then there exists $N$ such that for all $n > N$, $|a_n - c| < 1$. For such $N$, we take an even $n > N$ and get
\[ 1 - \varepsilon < c < 1 + \varepsilon; \quad\text{that is,}\quad 0 < c < 2. \]
But we may also take an odd $n > N$, and get
\[ -1 - \varepsilon < c < -1 + \varepsilon; \quad\text{that is,}\quad -2 < c < 0. \]
Thus the existence of such an $N$ implies $0 < c$ and $c < 0$, a contradiction. Therefore, for $\varepsilon = 1$, no $N$ exists with $|a_n - c| < \varepsilon$, for $n > N$. Therefore
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ |a_n - c| < \varepsilon \]
is false: $(a_n)$ does not converge to $c$. Here $c$ was any proposed limit, so $(a_n)$ does not converge.

Proof. (Second method.) Here the idea is that if $(a_n)$ converges to some $c$, then the terms must be getting close together. Suppose $(a_n)$ converges to $c$. Take $\varepsilon = 1$ in the definition to find $N$ so that $n > N$ implies $|a_n - c| < 1$. Consider any $n > N$.
Then $n + 1$ is also $> N$. Thus,
\[ |a_n - a_{n+1}| \leq |a_n - c| + |c - a_{n+1}| < 1 + 1 = 2, \]
so $|a_n - a_{n+1}| < 2$; but also
\[ |a_n - a_{n+1}| = |(-1)^n - (-1)^{n+1}| = |1 + 1| = 2. \]
These two statements together yield $2 < 2$, a contradiction. Thus, $(a_n)$ does not converge to $c$. The $c$ here was arbitrary, so $(a_n)$ does not converge at all.

We will see that the method just used is quite general. It involves the idea of a Cauchy sequence. See section 24. EXISTENCE: CAUCHY SEQUENCES.

20.5 Example. We guess $\lim_n \frac{n^2}{1 + n^2} = 1$. One way to see this is to use the calculation:
\[ \frac{n^2}{1 + n^2} = \frac{1}{\frac{1}{n^2} + 1}. \]
The $\frac{1}{n^2}$ is very small when $n$ is large, so should be negligible.

Analysis.
\[ \left|\frac{n^2}{1 + n^2} - 1\right| = \left|\frac{n^2 - (1 + n^2)}{1 + n^2}\right| = \left|\frac{-1}{1 + n^2}\right| = \frac{1}{1 + n^2}. \]
Now $1 + n^2 > n$, so $\frac{1}{1 + n^2} < \varepsilon$ provided $\frac1n < \varepsilon$. And Archimedes can take care of that.

Proof. Let $x_n = \frac{n^2}{1 + n^2}$. Let $\varepsilon > 0$ be given. By the Archimedean property, there exists $N$ with $\frac1N \leq \varepsilon$. For such an $N$ let $n > N$. Then
\[ \frac{1}{1 + n^2} < \frac1n < \frac1N \leq \varepsilon. \]
Thus,
\[ |x_n - 1| = \left|\frac{n^2}{1 + n^2} - 1\right| = \left|\frac{n^2 - (1 + n^2)}{1 + n^2}\right| = \left|\frac{-1}{1 + n^2}\right| = \frac{1}{1 + n^2} < \varepsilon. \]
Therefore, for all $n > N$, $|x_n - 1| < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, for all $\varepsilon > 0$ there exists $N$ such that for all $n > N$, $|x_n - 1| < \varepsilon$. Therefore, $\lim_n x_n = 1$.

20.6 Example. $\lim_n \frac{n^2 + 2n}{n^3 - 5} = 0$.

Analysis.
\[ \left|\frac{n^2 + 2n}{n^3 - 5} - 0\right| = \left|\frac{n^2 + 2n}{n^3 - 5}\right|. \]
If $n \geq 2$, we may remove the absolute value signs, because then $n^3 - 5 \geq 2^3 - 5 > 0$. How much bigger must $n$ be to make $\frac{n^2 + 2n}{n^3 - 5} < \varepsilon$? Do not attempt to solve for $n$. If $n \geq 2$ we do have
\[ n^2 + 2n \leq n^2 + n^2 = 2n^2 \]
and also
\[ n^3 - 5 \geq \frac{n^3}{2}, \quad\text{whenever}\quad \frac{n^3}{2} \geq 5, \]
which will hold if $n^3 \geq 10$, which will hold if $n \geq 3$. Now, for $n \geq 3$, we have
\[ \frac{n^2 + 2n}{n^3 - 5} \leq \frac{2n^2}{n^3/2} = \frac4n, \]
which should be made $< \varepsilon$. We are ready to organize this into a proof.

Proof. Let us call this sequence $(x_n)$. Let $\varepsilon > 0$ be given. Choose $N_1 \in \mathbb{N}$ with $N_1 \geq 4/\varepsilon$. Let $N = \max\{3, N_1\}$.
Then for $n > N$, we have
\begin{align*}
|x_n - 0| &= \left|\frac{n^2 + 2n}{n^3 - 5} - 0\right| \\
&= \frac{n^2 + 2n}{n^3 - 5} && \text{(since $n \geq 2$)} \\
&\leq \frac{2n^2}{n^3/2} && \text{(because $n \geq 3 \implies n^3 - 5 \geq n^3/2$)} \\
&= \frac4n < \varepsilon.
\end{align*}
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$, such that for all $n > N$, $|x_n - 0| < \varepsilon$. Thus, $\lim_n x_n = 0$.

20.7 Example. Let $a_n = (-1)^n \left(\frac12 - \frac1n\right)$. When $n$ is large, we see that $a_n$ gets close to $\frac12$ when $n$ is even, and close to $-\frac12$ when $n$ is odd. We guess that this sequence does not converge.

Analysis. Suppose $(a_n)$ converges and let $c$ be its limit. For all $n \in \mathbb{N}$,
\[ d(a_{n+1}, a_n) \leq d(a_{n+1}, c) + d(c, a_n). \]
This can be made small for $n$ large, while also,
\begin{align*}
d(a_{n+1}, a_n) = |a_{n+1} - a_n| &= \left|(-1)^{n+1}\left(\frac12 - \frac{1}{n+1}\right) - (-1)^n\left(\frac12 - \frac1n\right)\right| \\
&= \left|(-1)^{n+1}\right| \left|\frac12 - \frac{1}{n+1} + \frac12 - \frac1n\right| \\
&= \left|1 - \left(\frac{1}{n+1} + \frac1n\right)\right|.
\end{align*}
If $n \geq 2$, we have
\[ \frac{1}{n+1} + \frac1n \leq \frac13 + \frac12 < 1, \]
so for these $n$ we may remove the absolute value signs:
\[ |a_{n+1} - a_n| = 1 - \left(\frac{1}{n+1} + \frac1n\right) > 1 - \frac2n. \]
If $n \geq 6$, we would get
\[ |a_{n+1} - a_n| > 1 - \frac26 = \frac23. \]
Now we see what to do for a proof. We will use $\varepsilon = \frac13$.

Proof. Suppose $(a_n)$ converges to $c$. Then there exists $N \in \mathbb{N}$ with $|a_n - c| < \frac13$, for all $n > N$. Fix such an $N$ and choose any $n \in \mathbb{N}$ greater than the maximum of $N$ and 6. Then $n + 1$ is also $> \max\{N, 6\}$. We have
\[ |a_{n+1} - a_n| \leq |a_{n+1} - c| + |c - a_n| < \frac13 + \frac13 = \frac23. \tag{$*$} \]
On the other hand,
\begin{align*}
|a_{n+1} - a_n| &= \left|(-1)^{n+1}\left(\frac12 - \frac{1}{n+1}\right) - (-1)^n\left(\frac12 - \frac1n\right)\right| \\
&= \left|(-1)^{n+1}\right| \left|\frac12 - \frac{1}{n+1} + \frac12 - \frac1n\right| \\
&= \left|1 - \left(\frac{1}{n+1} + \frac1n\right)\right|.
\end{align*}
Since $n \geq 6$, we may remove the absolute value signs and get
\[ |a_{n+1} - a_n| = 1 - \left(\frac{1}{n+1} + \frac1n\right) > 1 - \frac2n \geq 1 - \frac26 = \frac23. \]
Combining this with $(*)$ we have $\frac23 < \frac23$, a contradiction. Thus, $(a_n)$ does not converge.

20.8 Example. The sequence $(n^2)$ does not converge.

The reason this doesn't converge is that it becomes too large (unbounded), so can't get close to any fixed number. Let us go directly to the proof.

Proof. Suppose $\lim_n n^2 = c \in \mathbb{R}$.
Then, taking $\varepsilon = 1$ in the definition, there exists $N$ such that for all $n > N$, $|n^2 - c| < 1$. By the triangle inequality, we have
\[ n^2 \leq |n^2 - c| + |c| < 1 + |c|, \quad\text{for all } n > N. \]
But according to the Archimedean Property, there exists $n$ with $n > \max\{N, 1 + |c|\}$. For such an $n$ we have
\[ n < n^2 < 1 + |c| < n, \]
which is impossible. This contradiction shows that the limit $c$ did not actually exist.

20.1. For the following sequences $(x_n)$, prove they converge or diverge (that is, do not converge) by using the definition of limit directly. (a) xn D (b) xn D (c) (d) (e) (f) xn xn xn xn D D D D n2 1 . n2 C2n 2n 4n C1 1C. 1/n .n2 / . n2 C1 3n . 2n 17 . 1/n .1 n/ . p 1Cn 2 n C 4 C 1=n p n2 .

20.2. Let $S$ be a metric space with distance function $d$. Which of the following are logically equivalent to the definition of convergence to a point in $S$?
(a) $\exists a \in S,\ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(b) $\forall \varepsilon > 0,\ \exists a \in S,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(c) $\forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \exists a \in S,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(d) $\exists N \in \mathbb{N},\ \forall \varepsilon > 0,\ \exists a \in S,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(e) $\exists a \in S,\ \exists N \in \mathbb{N},\ \forall \varepsilon > 0,\ \forall n > N,\ d(x_n, a) < \varepsilon$.
(f) $\exists a \in S,\ \exists N \in \mathbb{N},\ \forall n > N,\ \forall \varepsilon > 0,\ d(x_n, a) < \varepsilon$.
(g) $\exists a \in S,\ \forall \varepsilon > 0,\ \forall n > N,\ \exists N \in \mathbb{N},\ d(x_n, a) < \varepsilon$.

21. LIMIT THEOREMS FOR SEQUENCES OF REALS

The limit of a sequence of real numbers is unique: if a sequence $(x_n)$ converges to a number $a$, then that is the only number it converges to. This justifies the notation $a = \lim_n x_n$.

21.1 Theorem (Uniqueness of limits). Let $(x_n)$ be a sequence converging to $a$ and to $b$. Then $a = b$.

Proof. Let $\varepsilon > 0$. Since $x_n \to a$, we may choose $N_a$ such that $|x_n - a| < \varepsilon/2$, for $n > N_a$. Since $x_n \to b$, we may choose $N_b$ such that $|x_n - b| < \varepsilon/2$, for $n > N_b$. Put $N = \max\{N_a, N_b\}$, and let $n > N$. Then, by the triangle inequality,
\[ |a - b| \leq |a - x_n| + |x_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus, for all $\varepsilon > 0$, $|a - b| < \varepsilon$. Hence, $a - b = 0$, by what we call "the first analysis result" (Theorem 3.3), so $a = b$.

A set $A \subset \mathbb{R}$ is called bounded if there exists $M \in \mathbb{R}$ with $|a| \leq M$, for all $a \in A$. As we saw in section 17. BOUNDED SETS, this is the same as "order bounded" (that is, bounded above and bounded below) or bounded in the metric space sense (that is, contained in some ball). A sequence $(a_n)$ in $\mathbb{R}$ is called bounded if its range $\{a_n : n \in \mathbb{N}\}$ is bounded; that is, if there exists $M \in \mathbb{R}$ with $|a_n| \leq M$, for all $n \in \mathbb{N}$.

21.2 Theorem. Every convergent sequence is bounded.

Proof. Let $(x_n)$ be a convergent sequence and let $a$ be its limit. From the definition of convergence, if we take $\varepsilon = 1$, we obtain an $N \in \mathbb{N}$ with $|x_n - a| < 1$, for all $n > N$. Then, by the triangle inequality,
\[ |x_n| \leq |x_n - a| + |a| < 1 + |a|, \quad\text{for } n > N. \]
Put $M = \max\{|x_1|, \dots, |x_N|, 1 + |a|\}$. Then $|x_n| \leq M$, for all $n \in \mathbb{N}$. Therefore, $(x_n)$ is a bounded sequence.

21.3 Theorem (Comparison). Let $(x_n)$ and $(c_n)$ be sequences of real numbers, $a \in \mathbb{R}$. If $\lim_n c_n = 0$ and there exist $k \geq 0$ and $m \in \mathbb{N}$ such that
\[ |x_n - a| \leq k c_n, \quad\text{for all } n > m, \]
then $\lim_n x_n = a$.

The proof is a good exercise.

21.4 Theorem. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}$, $a, b \in \mathbb{R}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$.
(2) If $x_n \to a$ and $c \in \mathbb{R}$, then $c x_n \to ca$.
(3) If $x_n \to a$ and $y_n \to b$, then $x_n y_n \to ab$.
(4) Let $x_n \to a$ and $y_n \to b$. If $y_n \neq 0$, for all $n \in \mathbb{N}$, and $b \neq 0$, then $x_n / y_n \to a/b$.

Thus,
\[ \lim_n (x_n + y_n) = \lim_n x_n + \lim_n y_n, \quad\text{provided the right side exists.} \]
You should write out similar formulations for (2), (3) and (4). Notice that you have the additional requirement, in the case of quotients, that none of the denominators be 0.

Proof of (1). Let $x_n \to a$, $y_n \to b$. Let $\varepsilon > 0$ be given. By definition, since $x_n \to a$, there exists $N_1 \in \mathbb{N}$ such that
\[ |x_n - a| < \frac{\varepsilon}{2}, \quad\text{for } n > N_1. \]
Also, since $y_n \to b$, there exists $N_2 \in \mathbb{N}$ such that
\[ |y_n - b| < \frac{\varepsilon}{2}, \quad\text{for } n > N_2. \]
Let $N = \max\{N_1, N_2\}$. Then $n > N$ implies both of these hold.
Therefore, for $n > N$,
\begin{align*}
|x_n + y_n - (a + b)| &= |x_n - a + y_n - b| \\
&\leq |x_n - a| + |y_n - b| && \text{by the triangle inequality} \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
We have shown that for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $|(x_n + y_n) - (a + b)| < \varepsilon$; that is, $(x_n + y_n)$ converges to $a + b$.

Proof of (2). Suppose $\lim_n x_n = a$, $c \in \mathbb{R}$. Let $\varepsilon > 0$ be given. Put $\varepsilon' = \dfrac{\varepsilon}{|c| + 1}$. Then there exists $N \in \mathbb{N}$ such that $n > N$ implies $|x_n - a| < \varepsilon'$. Then, $n > N$ implies
\[ |c x_n - ca| = |c|\,|x_n - a| \leq |c|\,\varepsilon' = |c| \frac{\varepsilon}{|c| + 1} < \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, this shows for all $\varepsilon > 0$ there exists $N$ such that for all $n > N$, $|c x_n - ca| < \varepsilon$. In other words, $c x_n \to ca$.

In the above proof, we used $|c| + 1$ instead of $|c|$ since $c$ could have been 0 and we cannot divide by 0. An alternative would have been to treat the case $c = 0$ separately. After all, if $c = 0$, $|c x_n - ca| = |0 - 0| = 0$, for all $n \in \mathbb{N}$.

Analysis of (3). Let $x_n \to a$ and $y_n \to b$. Then
\begin{align*}
|x_n y_n - ab| &= |x_n y_n - x_n b + x_n b - ab| \\
&\leq |x_n y_n - x_n b| + |x_n b - ab| \\
&= |x_n|\,|y_n - b| + |x_n - a|\,|b|.
\end{align*}
Since $|x_n - a|$ gets small as $n$ gets large and since $|b|$ stays fixed, the second term here will become small. As for the first term, we see that $|y_n - b|$ gets small as $n$ gets large. Without some control of $|x_n|$ the product $|x_n|\,|y_n - b|$ could get large. But since every convergent sequence is bounded, this won't be a problem: there is an $M$ such that $|x_n| < M$ for all $n \in \mathbb{N}$, and $M |y_n - b|$ can be made small.

Proof of (3). Since $(x_n)$ converges, there exists $M$ with $|x_n| < M$ for all $n \in \mathbb{N}$. Let $\varepsilon > 0$ be given. Since $x_n \to a$, there exists $N_1 \in \mathbb{N}$ such that
\[ |x_n - a| < \frac{\varepsilon}{2(|b| + 1)}, \quad\text{for } n > N_1. \]
Also, since $y_n \to b$, there exists $N_2 \in \mathbb{N}$ such that
\[ |y_n - b| < \frac{\varepsilon}{2M}, \quad\text{for } n > N_2. \]
Let $N = \max\{N_1, N_2\}$. Then $n > N$ implies both of these hold, and
\begin{align*}
|x_n y_n - ab| &= |x_n y_n - x_n b + x_n b - ab| \\
&\leq |x_n y_n - x_n b| + |x_n b - ab| \\
&= |x_n|\,|y_n - b| + |x_n - a|\,|b| \\
&< M \frac{\varepsilon}{2M} + |b| \frac{\varepsilon}{2(|b| + 1)} \\
&\leq \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
Thus, for all $n > N$, $|x_n y_n - ab| < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, this shows $x_n y_n \to ab$.

Proof of (4). Let $x_n \to a$ and $y_n \to b$, with $b \neq 0$ and $y_n \neq 0$ for all $n$. It will be enough to prove that
\[ \frac{1}{y_n} \to \frac1b, \tag{$*$} \]
because then, by the limit of products theorem (3),
\[ \frac{x_n}{y_n} = x_n \frac{1}{y_n} \to a \frac1b = \frac ab. \]
Accordingly, let $\varepsilon > 0$ be given. For all $n$,
\[ \left|\frac{1}{y_n} - \frac1b\right| = \frac{|b - y_n|}{|y_n|\,|b|}. \]
Now, since $y_n \to b$, there exists $N_1$ such that $|y_n - b| < |b|/2$, for $n > N_1$, so, by the triangle inequality, for $n > N_1$,
\[ |b| \leq |b - y_n| + |y_n| < |b|/2 + |y_n|. \]
So
\[ |y_n| \geq \frac{|b|}{2}, \quad\text{for } n > N_1. \]
Also, there exists $N_2$ such that $|y_n - b| < \varepsilon |b|^2/2$ for $n > N_2$. Let $N = \max\{N_1, N_2\}$. Then, for $n > N$,
\[ \left|\frac{1}{y_n} - \frac1b\right| = \frac{|b - y_n|}{|y_n|\,|b|} < \frac{\varepsilon |b|^2/2}{|b|^2/2} = \varepsilon. \]
Hence $\frac{1}{y_n} \to \frac1b$, which is $(*)$.

To understand the next two theorems, it is better to think of convergence in terms of neighbourhoods. Remember that $|x - a| < \varepsilon$ iff $a - \varepsilon < x < a + \varepsilon$, that is, iff $x \in B(a, \varepsilon) = (a - \varepsilon, a + \varepsilon)$. Thus, $x_n \to a$ iff for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n > N$, $x_n \in (a - \varepsilon, a + \varepsilon)$. The key to the squeeze theorem below is that if $x$ and $z$ belong to an interval $U$, such as $(a - \varepsilon, a + \varepsilon)$, and if $x \leq y \leq z$, then $y$ also belongs to $U$.

21.5 Squeeze Theorem. Let $(x_n)$, $(y_n)$ and $(z_n)$ be sequences of real numbers with $x_n \leq y_n \leq z_n$, for all $n$. If $(x_n)$ and $(z_n)$ both converge to $a$, then $(y_n)$ converges to $a$.

Proof. Let $\varepsilon > 0$ be given. Since $x_n \to a$, we may choose $N_1$ with $x_n \in (a - \varepsilon, a + \varepsilon)$, for $n > N_1$. Since $z_n \to a$, we may choose $N_2$ with $z_n \in (a - \varepsilon, a + \varepsilon)$, for $n > N_2$. Put $N = \max\{N_1, N_2\}$. Then for $n > N$, both $x_n$ and $z_n$ belong to $(a - \varepsilon, a + \varepsilon)$. But, by hypothesis, $x_n \leq y_n \leq z_n$, for all $n \in \mathbb{N}$. Hence, $y_n \in (a - \varepsilon, a + \varepsilon)$, for $n > N$. Explicitly, if $n > N$, then
\[ a - \varepsilon < x_n \leq y_n \leq z_n < a + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we have for all $\varepsilon > 0$ there exists $N$ with $|y_n - a| < \varepsilon$, for all $n > N$, as required.

21.6 Theorem (Preservation of inequalities). If $(x_n)$ and $(y_n)$ are sequences of reals with $x_n \to x$, $y_n \to y$ and $x_n \leq y_n$, for all $n \in \mathbb{N}$, then $x \leq y$.

Proof. Assume the hypothesis and suppose $x > y$. Put $\varepsilon = (x - y)/2$. Then, $x - \varepsilon = y + \varepsilon = (x + y)/2$. Now, since $x_n \to x$, we may choose $N_1$ such that $x_n \in (x - \varepsilon, x + \varepsilon)$ for $n > N_1$, and since $y_n \to y$, we may choose $N_2$ such that $y_n \in (y - \varepsilon, y + \varepsilon)$ for $n > N_2$. Let $n$ be any natural number $> \max\{N_1, N_2\}$. Then,
\[ y_n < y + \varepsilon = x - \varepsilon < x_n. \]
This contradicts the fact that $x_n \leq y_n$ for all $n \in \mathbb{N}$.

WARNING: This theorem does not say that the strict inequality is preserved. If $x_n < y_n$, for all $n$, and the limits exist, it is not true in general that $\lim_n x_n < \lim_n y_n$. We still only have $\lim_n x_n \leq \lim_n y_n$. For an example, take $x_n = 1 - \frac1n$ and $y_n = 1 + \frac1n$. Then $x_n < y_n$ but in the limit both become 1.

21.7 Basic limit examples.
(1) If $p > 0$, $\lim_n \dfrac{1}{n^p} = 0$.
(2) If $|a| < 1$, then $\lim_n a^n = 0$.
(3) $\lim_n n^{1/n} = 1$.
(4) If $a > 0$, $\lim_n a^{1/n} = 1$.

For example (1) we don't require that $p$ be an integer. But we need the fact that for all $x > 0$ and all $p \in \mathbb{R}$, $x^p$ is defined and the usual rules of exponents hold. (See Theorem 13.2.)

Proof of example (1). Let $\varepsilon > 0$. By the Archimedean property, there exists $N \in \mathbb{N}$ with $N \geq 1/(\varepsilon^{1/p})$. Then for $n > N$,
\[ \frac{1}{n^p} < \left(\varepsilon^{1/p}\right)^p = \varepsilon. \]
Thus,
\[ \left|\frac{1}{n^p} - 0\right| < \varepsilon, \quad\text{for } n > N, \]
and $\frac{1}{n^p} \to 0$.

Proof of example (2). Let $|a| < 1$; the case $a = 0$ is trivial, so assume $a \neq 0$. Then $\frac{1}{|a|} > 1$, so $\frac{1}{|a|} = 1 + b$, where $b > 0$, and for $n \in \mathbb{N}$,
\[ |a|^n = \frac{1}{(1 + b)^n}. \]
By the Bernoulli inequality, $(1 + b)^n \geq 1 + nb$. (This is proved by induction or from the Binomial Theorem.) Hence,
\[ |a^n| = |a|^n \leq \frac{1}{1 + nb} < \frac{1}{nb}. \]
Let $\varepsilon > 0$. Then there exists $N$ such that $\frac1N < \varepsilon b$. Then for $n > N$, $|a|^n < \frac{1}{nb} < \frac{\varepsilon b}{b} = \varepsilon$. Thus, for all $\varepsilon > 0$, there exists $N$ such that for all $n > N$, $|a^n - 0| < \varepsilon$. Hence $a^n \to 0$.

Proof of example (3). Let $x_n = n^{1/n} - 1$. Then, $x_n \geq 0$, for all $n$. It will be enough to prove $x_n \to 0$. By the Binomial Theorem, for $n > 1$,
\[ n = (1 + x_n)^n = 1 + n x_n + \frac{n(n-1)}{2} x_n^2 + \dots + x_n^n > \frac{n(n-1)}{2} x_n^2; \]
hence,
\[ x_n^2 < \frac{n}{n(n-1)/2} = \frac{2}{n - 1}, \quad\text{for } n > 1. \]
Now let $\varepsilon > 0$ be given. Choose any $N \geq \frac{2}{\varepsilon^2} + 1$. Then, for all $n > N$, $|x_n| = x_n < \varepsilon$. Hence, $x_n \to 0$.

Proof of (4). First assume $a \geq 1$. Then $1 \leq a^{1/n} \leq n^{1/n}$, for $n \geq a$.
So
\[ |a^{1/n} - 1| \leq n^{1/n} - 1, \quad\text{for } n > a. \]
By example (3), $(n^{1/n} - 1)$ converges to 0, so $a^{1/n} \to 1$, by comparison. (Alternatively we could have used the squeeze theorem.) In case $0 < a < 1$, we apply the above to the reciprocal: $1/a > 1$, so $(1/a)^{1/n} \to 1$ and hence $a^{1/n} = 1/(1/a)^{1/n} \to 1$ by the limit of quotients theorem.

21.8 Theorem. Let $(a_n)$ be a sequence of positive real numbers such that $\lim_n \frac{a_{n+1}}{a_n}$ exists and is $< 1$. Then $a_n \to 0$.

Proof. Let $r = \lim_n \frac{a_{n+1}}{a_n}$ and assume $r < 1$. Choose $c$ with $r < c < 1$ and put $\varepsilon = c - r$. Then, there exists $N \in \mathbb{N}$ such that, for all $n \geq N$, $\left|\frac{a_{n+1}}{a_n} - r\right| < \varepsilon$. Thus, for $n \geq N$,
\[ \frac{a_{n+1}}{a_n} < r + \varepsilon = c. \]
In other words, $a_{n+1} < c\,a_n$ for $n \geq N$. Thus,
\begin{align*}
a_{N+1} &\leq a_N c \\
a_{N+2} &\leq a_{N+1} c \leq a_N c^2, \\
&\ \ \vdots
\end{align*}
By a simple induction $a_{N+k} \leq a_N c^k$, for all $k \in \mathbb{N}$, or what is the same,
\[ a_n \leq a_N c^{\,n-N}, \quad\text{for } n > N. \]
Put $K = a_N c^{-N}$ and get
\[ a_n \leq K c^n, \quad\text{for } n > N. \]
Since $a_n \geq 0$ and $c^n \to 0$, it follows that $\lim_n a_n = 0$.

It would be a good idea to write out the proof of uniqueness of limits in terms of the distance function $d$, to see that the result is also valid in a general metric space. (See problem 21.1.) Another way to look at uniqueness of limits is via

21.9 Theorem (The Hausdorff Property). In any metric space, if $a \neq b$, there exist neighbourhoods $U_a$ of $a$ and $U_b$ of $b$ with $U_a \cap U_b = \emptyset$.

Proof. Since $a \neq b$, $d(a, b) > 0$. Take $\delta$ any positive number $\leq d(a, b)/2$, $U_a = B(a, \delta)$ and $U_b = B(b, \delta)$. Then $U_a$ and $U_b$ are neighbourhoods of $a$ and $b$, respectively, and $U_a \cap U_b = \emptyset$. Indeed, if $z \in U_a \cap U_b$, then $d(a, b) \leq d(a, z) + d(z, b) < \delta + \delta \leq d(a, b)$, an impossibility.

To prove uniqueness of limits, suppose $(x_n)$ converged to both $a$ and $b$, with $a \neq b$. By the Hausdorff property, there exist neighbourhoods $U_a$ of $a$ and $U_b$ of $b$ with $U_a \cap U_b = \emptyset$. Since $x_n \to a$, we may choose $N_a$ such that $x_n \in U_a$, for $n > N_a$. Since $x_n \to b$, we may choose $N_b$ such that $x_n \in U_b$, for $n > N_b$. Now, take a particular $n > \max\{N_a, N_b\}$. Then, $x_n \in U_a \cap U_b$, which is impossible.
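The estimates used in the proof of example 21.7(3) and in Theorem 21.8 can be checked numerically. The sketch below is an illustration only, not part of any proof: the sample sequence $a_n = n^2/2^n$ and the helper name `ratio` are assumptions introduced here. Its ratios $a_{n+1}/a_n$ approach $\frac12 < 1$, so Theorem 21.8 predicts $a_n \to 0$.

```python
def a(n):
    return n**2 / 2**n  # sample positive sequence (an assumption)

def ratio(n):
    """Return a_{n+1} / a_n, the quantity whose limit Theorem 21.8 uses."""
    return a(n + 1) / a(n)

# Ratios at a few indices; they decrease towards r = 1/2.
ratios = [ratio(n) for n in (10, 100, 1000)]

# Proof of 21.7(3): x_n = n**(1/n) - 1 satisfies x_n**2 < 2/(n - 1) for n > 1.
# Here the bound is spot-checked for n = 2, ..., 2000.
bound_holds = all((n ** (1 / n) - 1) ** 2 < 2 / (n - 1) for n in range(2, 2001))

print(ratios, a(1000), bound_holds)
```

Any $c$ strictly between the limiting ratio and 1 (for instance $c = 0.6$) plays the role of the constant $c$ in the proof of Theorem 21.8 for this sample sequence.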
Convergence to $+\infty$ and $-\infty$. For a sequence $(x_n)$ of real numbers, $\lim_n x_n = +\infty$ means

for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for all $n > N$, $x_n > M$.

Some people refer to this as $(x_n)$ converges to $+\infty$ (in the extended real number system). Others say $(x_n)$ diverges to $+\infty$, to emphasize that $(x_n)$ does not converge in the real number system. Similarly, $\lim_n x_n = -\infty$ means

for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for all $n > N$, $x_n < M$.

Here one says $(x_n)$ converges to $-\infty$ (in the extended real number system), or $(x_n)$ diverges to $-\infty$.

21.10 Theorem. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}$, $a, b \in \overline{\mathbb{R}}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$, provided $a + b$ is defined.
(2) If $x_n \to a$ and $y_n \to b$, then $x_n y_n \to ab$, unless one of $a$ and $b$ is 0 and the other is infinite.
(4) Let $x_n \to a$ and $y_n \to b$. If $y_n \neq 0$, for all $n \in \mathbb{N}$, and $b \neq 0$, then $x_n / y_n \to a/b$, provided the latter is defined.

These results are left as exercises. It is best to write out explicit cases, for example if $a > 0$ is real and $b = +\infty$, then $x_n y_n \to +\infty$, etc.

21.1. Write out the proof of uniqueness of limits in terms of the distance function $d$, to prove the result in a general metric space.

21.2. The theorem on limits of sums, products, and quotients of sequences extends (with no change in proof) to sequences of complex numbers.

21.3. A sequence $(a_n)$ of (non-zero) real numbers converges to 0 if and only if $1/|a_n| \to +\infty$.

21.4. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}$. Let $x_n \to \infty$ and $y_n \to c$, where $0 < c \in \mathbb{R}$. Prove that $x_n y_n \to \infty$. Find an improvement of this result, in which $(y_n)$ need not converge.

21.5. Let $(x_n)$ be a sequence of elements of $\mathbb{R}^m$; $x_n = (x_{n1}, \dots, x_{nm})$. Then, $(x_n)$ converges to some $a = (a_1, \dots, a_m)$ if and only if $x_{nj} \to a_j$, for each $j = 1, \dots, m$.

21.6. The theorem on limits of sums, products, and quotients of sequences extends to $\mathbb{R}^m$ as follows: Let $(x_n)$ and $(y_n)$ be sequences in $\mathbb{R}^m$, $(c_n)$ a sequence of reals. Let $a, b \in \mathbb{R}^m$ and $c \in \mathbb{R}$.
(1) If $x_n \to a$ and $y_n \to b$, then $x_n + y_n \to a + b$.
(2) If $x_n \to a$ and $c_n \to c$, then $c_n x_n \to ca$ (product of a vector by a scalar).
(3) If $x_n \to a$ and $y_n \to b$, then $x_n \cdot y_n \to a \cdot b$ (dot product).
(4) Let $x_n \to a$ and $c_n \to c$. If $c_n \neq 0$, for all $n \in \mathbb{N}$, and $c \neq 0$, then $x_n / c_n \to a/c$.

21.7. A set $A$ in a metric space is bounded if there exists a ball $B(a, r)$ (about some $a$) which contains $A$: $B(a, r) \supset A$. A sequence is bounded if its range is. Modify the proof for real numbers to prove every convergent sequence in a metric space is bounded.

21.8. By definition, for a sequence $(x_n)$ in a metric space, $x_n \to a$ if and only if
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n > N,\ d(x_n, a) < \varepsilon. \]
Prove that $x_n \to a$ if and only if
\[ \forall \varepsilon > 0,\ \exists N \in \mathbb{N},\ \forall n \geq N,\ d(x_n, a) < 2\varepsilon. \]
Notice the change to the weak inequality ($n \geq N$).

21.9. Let $G$ be an open set and let $(x_n)$ be a sequence converging to $a \in G$. Prove that there exists $N \in \mathbb{N}$ with $x_n \in G$, for all $n \geq N$.

21.10. Let $(a_n)$ and $(b_n)$ be sequences in a metric space $S$ such that $d(a_n, b_n) \to 0$. Then $(a_n)$ converges iff $(b_n)$ converges, and if they converge, they have the same limit.

21.11. Prove or disprove, for sequences of real numbers:
(a) If $(x_n)$ converges and $(y_n)$ diverges, then $(x_n + y_n)$ diverges.
(b) If $(x_n)$ diverges and $(y_n)$ diverges, then $(x_n y_n)$ diverges.
(c) If $(x_n)$ diverges and $(y_n)$ converges, then $(x_n y_n)$ diverges.

21.12. Let $(x_n)$ be a sequence in $\mathbb{R}$ converging to $c$ and let $a_n = \dfrac{x_1 + \dots + x_n}{n}$. Then $a_n \to c$, also. Give an example where $(a_n)$ converges, but $(x_n)$ does not.

21.13. (Connection with Linear Algebra) The set $c$ of convergent sequences $x = (x_n)$ in $\mathbb{R}$ is a vector space and the map $T : c \to \mathbb{R}$ defined by $T(x) = \lim_n x_n$ is a linear functional. The space $\mathbb{R}$ may be replaced by $\mathbb{C}$ or $\mathbb{R}^m$ or $\mathbb{C}^m$ in this statement.

22. EXISTENCE: MONOTONE SEQUENCES

In the definition and examples, we needed to know (or guess) the limit of a sequence in order to prove it converged.
Here and in a subsequent section, we will find conditions that guarantee the existence of a limit, without knowing it in advance. The first is the idea of a monotone sequence.

A sequence $(x_n)$ of real numbers is called increasing if
\[ x_n \leq x_{n+1}, \quad\text{for all } n \in \mathbb{N}; \]
it is called strictly increasing if
\[ x_n < x_{n+1}, \quad\text{for all } n \in \mathbb{N}; \]
it is called decreasing if
\[ x_n \geq x_{n+1}, \quad\text{for all } n \in \mathbb{N}; \]
and strictly decreasing if
\[ x_n > x_{n+1}, \quad\text{for all } n \in \mathbb{N}. \]
A sequence is called monotone if it is either increasing or decreasing and I guess you can figure out what "strictly monotone" means.

Alternate terminology: Some people use 'increasing' for 'strictly increasing' and say 'non-decreasing' for what is here called increasing. Be careful: a sequence which is not decreasing need not be nondecreasing! (E.g. our friend $((-1)^n)$.) Some authors use monotone increasing for increasing and monotone decreasing for decreasing. Sometimes isotone and antitone are used. Be sure which terminology a book you are looking at is using.

22.1 Monotone Convergence Theorem. Every bounded monotone sequence of real numbers converges.

Proof. We prove the increasing case and leave the decreasing case as an exercise. Suppose $(x_n)$ is an increasing sequence which is bounded. In particular, $\{x_n : n \in \mathbb{N}\}$ is bounded above, so has a supremum. Let $a = \sup_{n \in \mathbb{N}} x_n$. We claim $(x_n)$ converges to $a$. Indeed, fix $\varepsilon > 0$. By definition of supremum,
\[ x_n \leq a < a + \varepsilon, \quad\text{for all } n \in \mathbb{N}, \]
and since $a - \varepsilon < a$, there exists, and we choose, $N \in \mathbb{N}$ with
\[ a - \varepsilon < x_N. \]
But $(x_n)$ is increasing, so $x_n \geq x_N$, for all $n \geq N$. Thus, for all $n \geq N$,
\[ a - \varepsilon < x_n < a + \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ with $x_n$ in the ball $B(a, \varepsilon)$ for all $n > N$; in other words, $x_n \to a$.

You will notice that what one really proves is

22.2 Corollary.
(1) Each increasing sequence bounded above converges to its supremum.
(2) Each decreasing sequence bounded below converges to its infimum.
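The proof of the Monotone Convergence Theorem can be traced numerically. In the following Python sketch the increasing sequence $x_n = n/(n+1)$, whose supremum is 1, is an assumed example, and `first_index_above` is a hypothetical helper: for a given $\varepsilon$ it searches for an $N$ with $a - \varepsilon < x_N$, after which monotonicity keeps every later term in $(a - \varepsilon, a + \varepsilon)$.

```python
def x(n):
    return n / (n + 1)  # increasing and bounded above; sup is 1 (assumed example)

def first_index_above(seq, threshold, max_n=10**6):
    """Return the least n <= max_n with seq(n) > threshold."""
    for n in range(1, max_n + 1):
        if seq(n) > threshold:
            return n
    raise ValueError("no index found up to max_n")

a = 1.0      # the supremum of {x_n : n in N}
eps = 0.015
N = first_index_above(x, a - eps)

# Because the sequence is increasing, every later term stays above a - eps,
# and no term exceeds the supremum a; here we spot-check 1000 terms.
tail_ok = all(a - eps < x(n) <= a for n in range(N, N + 1000))
print(N, tail_ok)
```

The search mirrors the step "there exists, and we choose, $N$" in the proof; the existence of such an $N$ is exactly what the definition of supremum guarantees.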
Of course, an increasing sequence is bounded iff it is bounded above, and a decreasing sequence is bounded iff it is bounded below. (Why?)

22.3 Example. Let $(x_n)$ be the sequence defined by recursion by
\[ x_1 = 1; \quad x_{n+1} = \sqrt{1 + x_n}, \quad\text{for } n \in \mathbb{N}. \]
We will see that this is a bounded increasing sequence. If we try a few values, it appears that $x_n \leq 2$ for all $n$. So we try to prove that by induction. First, $x_1 = 1 < 2$. Now suppose $x_n < 2$. Then
\[ x_{n+1} = \sqrt{1 + x_n} < \sqrt{1 + 2} = \sqrt3 < 2, \]
so by induction $x_n < 2$ for all $n \in \mathbb{N}$.

To prove that $(x_n)$ is increasing, we also use induction. We have to show
\[ x_n \leq x_{n+1}, \quad\text{for all } n \in \mathbb{N}. \]
This is true for $n = 1$, since $x_1 = 1 < \sqrt{1 + 1} = \sqrt{1 + x_1} = x_2$. Now supposing it true for $n$ we have
\[ x_{n+1} = \sqrt{1 + x_n} \leq \sqrt{1 + x_{n+1}} = x_{(n+1)+1}. \]
Thus the result holds for all $n \in \mathbb{N}$. Since $(x_n)$ is bounded and increasing we can apply the Monotone Convergence Theorem to yield that $(x_n)$ converges to some $a \in \mathbb{R}$.

To decide what the limit $a$ is we need that $\lim_n x_{n+1}$ is also $a$. (Proof?) Now, $x_{n+1} = \sqrt{1 + x_n}$, so
\[ x_{n+1}^2 = 1 + x_n, \quad\text{for all } n. \]
Hence, using the limit theorems for sums and products, we have
\[ a^2 = \lim_n x_{n+1}^2 = 1 + \lim_n x_n = 1 + a. \]
Thus $a^2 = 1 + a$, and hence $a = (1 \pm \sqrt5)/2$. Also, $x_n \geq 0$ for all $n$, so in the limit $a \geq 0$. (Inequalities are preserved under limits.) This excludes the possibility that $a = (1 - \sqrt5)/2$; therefore $a = (1 + \sqrt5)/2$.

The above proof could have been shortened using the following result.

22.4 Lemma. If $(y_n)$ is a sequence of positive real numbers with $y_n \to b$, then $\sqrt{y_n} \to \sqrt b$.

The proof is an exercise, based on $\sqrt y - \sqrt b = \dfrac{y - b}{\sqrt y + \sqrt b}$.

22.5 Example (Finding $\sqrt2$). Notice that
\[ x^2 = 2 \iff 2x^2 = x^2 + 2 \iff x = (x^2 + 2)/2x. \]
Let us use this formula as motivation to try to find $\sqrt2$. Let $x_1 = 2$, and for each natural number $n$, let
\[ x_{n+1} = \frac{x_n^2 + 2}{2x_n}. \tag{$*$} \]
We will show that $(x_n)$ is a bounded monotone sequence. For all $a, b$, $a^2 + b^2 \geq 2ab$, since $a^2 - 2ab + b^2 = (a - b)^2 \geq 0$. Thus, for $x > 0$,
\[ \frac{x^2 + (\sqrt2)^2}{2x} \geq \sqrt2, \tag{$**$} \]
so that, for all $n$, $x_{n+1} \geq \sqrt2$. Since also $x_1 = 2 > \sqrt2$, this shows $x_n \geq \sqrt2$, for all $n$, so $(x_n)$ is bounded below by $\sqrt2$.

Now, for all $n$, we have
\[ x_{n+1} \leq x_n \iff \frac{x_n^2 + 2}{2x_n} \leq x_n \iff x_n^2 + 2 \leq 2x_n^2 \iff x_n^2 \geq 2, \]
which we have just checked is true.

We have shown that $(x_n)$ is a decreasing sequence, which by $(**)$ is bounded below by $\sqrt2$. Thus $(x_n)$ converges to some $x$. Then taking a limit on both sides of $(*)$ yields
\[ x = \frac{x^2 + 2}{2x}. \]
The only positive solution to this equation is $x = \sqrt2$, so that $x_n \to \sqrt2$.

Note that, if we had started with $x_1$ any other positive number, we would still reach the same conclusion. If $0 < x_1 < \sqrt2$, the sequence is not monotone, but becomes so after the first step, and changing a finite number of terms does not affect convergence or divergence (or the value of the limit). What happens if $x_1 < 0$?

The Monotone Convergence Theorem can be extended to the unbounded case using the concept of infinite limits.

22.6 Theorem.
(a) If $(x_n)$ is unbounded and increasing, then $\lim_n x_n = +\infty$.
(b) If $(x_n)$ is unbounded and decreasing, then $\lim_n x_n = -\infty$.

Proof. (a) Let $(x_n)$ be increasing and unbounded. Recall that, by definition, $\lim_n x_n = +\infty$ means

for each $M \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for all $n > N$, $x_n > M$.

So let $M \in \mathbb{R}$ be given. Since $(x_n)$ is increasing, it is bounded below by $x_1$. But $(x_n)$ is unbounded, so it can't be bounded above. Thus, there must be an $N \in \mathbb{N}$ with $x_N > M$. Now, as in the case of finite limits, for $n \geq N$, we have $x_n \geq x_N$, since the sequence is increasing. Thus, for all $n \geq N$, $x_n > M$, as required.

The proof of (b) is similar.

In the extended real number system $\overline{\mathbb{R}}$, $+\infty$ is considered the supremum of any set which is not bounded above, and $-\infty$ is the infimum of any set which is not bounded below. Thus, in general,

22.7 Corollary.
(1) Each increasing sequence of real numbers converges to its supremum, possibly $+\infty$.
(2) Each decreasing sequence of real numbers converges to its infimum, possibly $-\infty$.

22.1.
For the following sequences $(x_n)$ of reals, use the monotone convergence theorem to decide whether they converge and, if so, find the limit.
(a) $x_n = 5^{1/n}$.
(b) $x_1 = 1$ and $x_{n+1} = \frac{1}{4}(x_n + 5)$, for all $n \in \mathbb{N}$.
(c) $x_1 = k > 0$, $x_{n+1} = k/(1 + x_n)$, for all $n$. (Careful. This requires a little more.)

22.2. Let $x_1 = 1$ and, for all $n$, $x_{n+1} = x_n + 1/x_n$. Prove $(x_n)$ is unbounded.

22.3. Let $(x_n)$ be a sequence of real numbers and $a_n = \max\{x_1, \dots, x_n\}$, for all $n$. Prove $(a_n)$ converges to $\sup\{x_n : n \in \mathbb{N}\}$. The convergence is in $\mathbb{R}$ if the sequence is bounded above and in $\overline{\mathbb{R}}$ otherwise.

23. CLUSTER POINTS AND SUBSEQUENCES: THE BOLZANO-WEIERSTRASS THEOREM (SEQUENCE FORM)

A point $c$ is called a cluster point of $(x_n)$ if for each $\varepsilon > 0$, for all $n \in \mathbb{N}$, there exists $m > n$ such that $x_m \in B(c, \varepsilon)$. Thus, the set $\{n \in \mathbb{N} : x_n \in B(c, \varepsilon)\}$ is infinite. In other words, $c$ is a cluster point of $(x_n)$ if each neighbourhood of $c$ contains $x_n$ for infinitely many $n$. We also say $(x_n)$ clusters at $c$.

23.1 Lemma. If $(x_n)$ converges to $c$ then $(x_n)$ clusters at $c$.

23.2 Examples.
(a) Let $x_n = (-1)^n$. Then $1$ and $-1$ are cluster points of $(x_n)$. Are there any others?
(b) Let $x_n = 1 + (-1)^n + 1/n$. Then $2$ and $0$ are the only two cluster points of $(x_n)$.
(c) Let $x_n = n + (-1)^n n$. Then $(x_n)$ has only one cluster point, yet does not converge.

Subsequences. The reader will have noticed that in examples where there is more than one cluster point, the sequence seems to 'converge' to that point over a subset of the indices.

If $(x_n)$ is a sequence and $(n_k)$ is a strictly increasing sequence of natural numbers
\[ n_1 < n_2 < \cdots < n_k < n_{k+1} < \cdots, \]
then the sequence $(y_k)$ defined by $y_k = x_{n_k}$, for all $k \in \mathbb{N}$, is called a subsequence of $(x_n)$.

In the above examples, if we take $n_k = 2k$, we obtain a subsequence $(y_k) = (x_{2k})$. In particular, for example (b), $x_{2k} = 1 + (-1)^{2k} + 1/(2k) = 2 + 1/(2k)$, so the subsequence $(y_k)$ is just $(2 + 1/(2k))$, which clearly converges to the cluster point $2$. Another subsequence, $(x_{2k-1})$, picks out the terms of $(x_n)$ which have odd indices. We see that $(x_{2k-1}) = (1/(2k-1))$, which converges to the other cluster point $0$.

It is important to realize that a sequence $(x_n)$ really stands for a function $f : n \in \mathbb{N} \mapsto x_n$. To construct a subsequence, we compose this with a map $g : k \in \mathbb{N} \mapsto n_k$, obtaining a new function whose value at $k$ is $f(g(k)) = x_{n_k}$. If $g$ is the function $k \mapsto 2k$, the resulting function (subsequence) becomes $(x_{2k})$. Since $k$ is just a "dummy variable" that runs through all the natural numbers, it can be replaced by any other letter, so $(x_{2n})$ denotes the same subsequence.

The following lemma is convenient to keep in mind when working with subsequences. The proof is an easy induction.

23.3 Lemma. If $(n_k)$ is a strictly increasing sequence of natural numbers, then $n_k \ge k$, for all $k$.

23.4 Theorem. A sequence $(x_n)$ clusters at $a$ if and only if it has a subsequence which converges to $a$.

Proof. Let $(x_n)$ cluster at $a$. From the definition, for each $n \in \mathbb{N}$ and for each $\varepsilon > 0$, there exists $m > n$ (which depends on both $n$ and $\varepsilon$) with $x_m \in B(a, \varepsilon)$. We define a subsequence recursively as follows. First take $n = 1$ and $\varepsilon = 1$ and obtain an $n_1 > 1$ with
\[ x_{n_1} \in B(a, 1). \]
Then take $n = n_1$ and $\varepsilon = 1/2$ and obtain an $n_2 > n_1$ with
\[ x_{n_2} \in B(a, 1/2). \]
In general, if $n_k$ has been chosen, we choose $n_{k+1} > n_k$ with
\[ x_{n_{k+1}} \in B\Bigl(a, \frac{1}{k+1}\Bigr). \]
We then have $x_{n_k} \in B(a, 1/k)$ for all $k \in \mathbb{N}$. The resulting subsequence $(x_{n_k})$ converges to $a$. Indeed, let $\varepsilon > 0$. Then there exists $K$ with $\frac{1}{K} < \varepsilon$. For $k > K$ we have $\frac{1}{k} < \varepsilon$ and $x_{n_k} \in B(a, 1/k) \subset B(a, \varepsilon)$, as required.

For the converse, let $(x_{n_k})$ be a subsequence of $(x_n)$ which converges to $a$. Then $(n_k)$ is a strictly increasing sequence of natural numbers, so that $n_k \ge k$, for all $k$. Fix $\varepsilon > 0$ and $N \in \mathbb{N}$. Since $x_{n_k} \to$
$a$, there exists $K$ such that for $k > K$, $x_{n_k} \in B(a, \varepsilon)$. Choose $k$ to be any natural number $> \max\{N, K\}$. Then $n_k \ge k > N$ and $x_{n_k} \in B(a, \varepsilon)$. Thus,
\[ \forall \varepsilon > 0,\ \forall N \in \mathbb{N},\ \exists m > N \text{ with } x_m \in B(a, \varepsilon). \]

Thus we have seen that the notions of cluster point and subsequential limit coincide in metric spaces. Now, let us restrict to the real number system.

23.5 Theorem. Every sequence of real numbers has a monotone subsequence.

Proof. Let $(x_n)$ be a sequence of real numbers. Call $n$ a dominant index if $x_n \ge x_m$, for all $m \ge n$. There are 2 cases: either the set $D$ of dominant indices is infinite or it is finite.

If $D$ is infinite, choose a sequence $(n_k)$ in $D$ with $n_k < n_{k+1}$, for all $k \in \mathbb{N}$. Then $x_{n_k} \ge x_{n_{k+1}}$, for all $k$, so the subsequence $(x_{n_k})$ is decreasing.

If $D$ is finite, then for all $n > \max D$ (for all $n$, if $D$ is empty), there exists $m > n$ with $x_m > x_n$. In this case, we can let $n_1 > \max D$ be arbitrary, and for each $k$ choose $n_{k+1} > n_k$ with $x_{n_{k+1}} > x_{n_k}$, obtaining a strictly increasing subsequence $(x_{n_k})$.

23.6 Bolzano-Weierstrass Theorem (Sequence form).
(1) Every bounded sequence of real numbers has a cluster point.
(2) Equivalently, every bounded sequence of real numbers has a convergent subsequence.

Proof. Let $(x_n)$ be a bounded sequence in $\mathbb{R}$. Then $(x_n)$ has a monotone subsequence. But every bounded monotone sequence converges, so $(x_n)$ has a convergent subsequence.

This result extends to Euclidean space $\mathbb{R}^m$.

23.7 Bolzano-Weierstrass Theorem (Sequence form in $\mathbb{R}^m$).
(1) Every bounded sequence in $\mathbb{R}^m$ has a cluster point.
(2) Equivalently, every bounded sequence in $\mathbb{R}^m$ has a convergent subsequence.

Proof. This can be deduced from the corresponding set form, but we prove it here by repeatedly applying the real case. Let $(x_n)$ be a bounded sequence in $\mathbb{R}^m$. Each $x_n$ is an $m$-tuple, $x_n = (x_{n1}, x_{n2}, \dots, x_{nm})$. Since $(x_n)$ is bounded, so is each of the real sequences $(x_{ni})_n$. First, the sequence $(x_n)$ has a subsequence $(x^1_k) = (x_{n^1_k})$ for which the sequence of first coordinates converges.
Indeed, the sequence $(x_{n1})$ of first coordinates is a bounded sequence of real numbers, so has a subsequence which converges. In this same way, the new sequence $(x^1_k)$ has a subsequence $(x^2_j)$ for which the sequence of second coordinates converges. But that sequence $(x^2_j)$ is also a subsequence of $(x^1_k)$, so its first coordinates also converge. If we continue in this way, in $m$ steps we reach a subsequence of $(x_n)$ which converges in all coordinates, so converges in $\mathbb{R}^m$.

23.1. If a sequence $(x_n)$ in a metric space does not converge to a point $a$, then there is a neighbourhood $U$ of $a$ and a subsequence $(x_{n_k})$, none of whose terms are in $U$, so that it does not cluster at $a$.

23.2. A bounded sequence in $\mathbb{R}^m$ converges if and only if it has exactly one cluster point. (Use problem 23.1.)

23.3. Prove the sequence form of the Bolzano-Weierstrass theorem in $\mathbb{R}$ by proving directly that, for a bounded sequence $(x_n)$,
\[ \sup\{t : x_n \ge t \text{ for infinitely many } n\} \]
exists in $\mathbb{R}$ and is a cluster point of $(x_n)$. (This is actually $\limsup_n x_n$, but we need not refer to that concept.)

23.4. Mimic the proof of the set form of the Bolzano-Weierstrass theorem to prove the sequence form.

23.5. From the set form of the Bolzano-Weierstrass theorem, deduce the sequence form. (Be careful: the range of a sequence could be finite.)

23.6. From the sequence form of the Bolzano-Weierstrass theorem, deduce the set form.

23.7. Create an example of a sequence of real numbers that:
(a) is not convergent, but has exactly one cluster point;
(b) has exactly 5 cluster points, but all terms are distinct;
(c) has an infinite number of cluster points;
(d) is not monotone, yet has limit $+\infty$.

23.8. Let $(a_n)$ and $(b_n)$ be sequences in a metric space $S$ such that $d(a_n, b_n) \to 0$. Then $(a_n)$ and $(b_n)$ have the same cluster points.

24.
EXISTENCE: CAUCHY SEQUENCES

A sequence $(x_n)$ in a metric space $(S, d)$ is called a Cauchy sequence if for every $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n, m > N$, $d(x_n, x_m) < \varepsilon$.

We see immediately that

24.1 Lemma. Every convergent sequence is Cauchy.

Proof. Suppose $(x_n)$ converges to $a$. Let $\varepsilon > 0$. By definition, there exists $N$ such that for all $n > N$, $d(x_n, a) < \varepsilon/2$. But then for $n, m > N$,
\[ d(x_n, x_m) \le d(x_n, a) + d(a, x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
as required.

The remarkable thing is that we will be able to prove that in $\mathbb{R}$ and $\mathbb{R}^m$ every Cauchy sequence converges, which will give us a second way of getting the existence of a limit without knowing it in advance.

24.2 Theorem. If a Cauchy sequence clusters at $a$, then it converges to $a$.

Proof. Let $(x_n)$ be a Cauchy sequence with cluster point $a$. We will show that $(x_n)$ also converges to $a$. Let $\varepsilon > 0$. Since $(x_n)$ is Cauchy, there exists $N$ such that for all $n, m > N$,
\[ d(x_n, x_m) < \varepsilon/2. \]
Fix such an $N$ and let $n > N$. Since $(x_n)$ clusters at $a$, we may choose $m > N$ such that $d(x_m, a) < \varepsilon/2$. Thus,
\[ d(x_n, a) \le d(x_n, x_m) + d(x_m, a) < \varepsilon. \]
We have shown, then, that for all $n > N$, $d(x_n, a) < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $x_n \to a$.

24.3 Theorem. Every Cauchy sequence in a metric space is bounded.

Proof. Let $(x_n)$ be a Cauchy sequence. Let $a$ be a fixed point of the space. Apply the definition with $\varepsilon = 1$ to obtain $N$ such that for $n, m > N$,
\[ d(x_n, x_m) < 1. \]
In particular, taking $m = N + 1$,
\[ d(x_n, a) \le d(x_n, x_{N+1}) + d(x_{N+1}, a) < 1 + d(x_{N+1}, a), \quad \text{for all } n > N. \]
So we put $M = \max\{d(x_1, a), \dots, d(x_N, a), 1 + d(x_{N+1}, a)\}$. Then,
\[ d(x_n, a) \le M, \quad \text{for all } n \in \mathbb{N}. \]
Therefore, $(x_n)$ is a bounded sequence.

24.4 Theorem (Cauchy Criterion). Every Cauchy sequence in $\mathbb{R}$ or $\mathbb{R}^m$ converges.

Proof. Let $(x_n)$ be a Cauchy sequence in $\mathbb{R}^m$. Then $(x_n)$ is bounded, by the above theorem, so it has a cluster point by the Bolzano-Weierstrass theorem. By Theorem 24.2, $(x_n)$ converges to that cluster point.
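To get a feel for the Cauchy condition, it can help to look at numbers. The following Python sketch takes the recursion $x_1 = 2$, $x_{n+1} = (x_n^2 + 2)/(2x_n)$ from the discussion of monotone sequences and prints, for several $N$, the largest distance $|x_n - x_m|$ among the computed terms with $n, m > N$. The function names `newton_sqrt2_terms` and `tail_diameter` are hypothetical helpers introduced only for this illustration, and only finitely many terms are inspected, so this is a plausibility check in floating point, not a proof.

```python
def newton_sqrt2_terms(count):
    """Return the first `count` terms of x_1 = 2, x_{n+1} = (x_n^2 + 2) / (2 x_n)."""
    terms = [2.0]
    while len(terms) < count:
        x = terms[-1]
        terms.append((x * x + 2.0) / (2.0 * x))
    return terms


def tail_diameter(terms, N):
    """Largest |x_n - x_m| over the computed terms with n, m > N (1-based indices)."""
    tail = terms[N:]  # terms[N] holds x_{N+1}
    return max(tail) - min(tail)


terms = newton_sqrt2_terms(8)
diameters = [tail_diameter(terms, N) for N in range(0, 6)]
for N, d in enumerate(diameters):
    print(f"N = {N}: diameter of the computed tail = {d:.3e}")
```

The printed diameters shrink quickly, which is what the definition asks for: given $\varepsilon$, one looks for an $N$ beyond which all pairwise distances are below $\varepsilon$.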
You will have noticed that the completeness of the reals was the essential ingredient in this proof. (This was what caused cluster points to exist.) The fact that the Cauchy criterion holds for sequences in $\mathbb{R}^m$ is called metric completeness of $\mathbb{R}^m$. A metric space $(S, d)$ is called a complete metric space if every Cauchy sequence converges.

The concept of Cauchy sequence is at the heart of the theory of convergence of series. See section 27, SERIES OF NUMBERS.

24.1. If $(x_n)$ is a sequence in a metric space such that for every $\varepsilon > 0$ there exists $N$ such that for all $n > N$, $d(x_n, x_N) < \varepsilon$, then $(x_n)$ is Cauchy.

24.2. A sequence $(x_n)$ in a metric space is called contractive if there exists $k \in (0, 1)$ such that $d(x_{n+2}, x_{n+1}) \le k\, d(x_{n+1}, x_n)$, for all $n$. Every contractive sequence is Cauchy.

24.3. Let $a_1 = 1$ and, for each $n$, put $a_{n+1} = 1 + \dfrac{1}{1 + a_n}$. Prove that $(a_n)$ is a Cauchy sequence, so converges. What is the limit? (Suggestion: prove the sequence is contractive.)

24.4. Let $(a_n)$ be a sequence in a metric space and for each $n \in \mathbb{N}$ let $A_n = \{a_k : k \ge n\}$. Then $(a_n)$ is a Cauchy sequence iff $\operatorname{diam} A_n \to 0$.

24.5. A metric space is complete iff whenever $(K_n)$ is a sequence of non-empty closed sets with $K_n \supset K_{n+1}$, for all $n \in \mathbb{N}$, and with $\operatorname{diam}(K_n) \to 0$, then $\bigcap_{n \in \mathbb{N}} K_n$ consists of a single point.

25. THE NUMBER $e$, AN APPLICATION OF MONOTONE CONVERGENCE

Here we use the Monotone Convergence Theorem for the real numbers to prove that
\[ \lim_n \Bigl(1 + \frac{1}{n}\Bigr)^n \]
exists. You know this from Calculus as the number $e$. Once we show it exists, we will make this the definition of $e$. Let
\[ a_n = \Bigl(1 + \frac{1}{n}\Bigr)^n. \]
Of course, $a_n \ge 0$. We will show that $(a_n)$ is bounded above and is increasing, so that it will converge by the Monotone Convergence Theorem. First, use the Binomial Theorem to obtain
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k. \]

Boundedness. Now,
\[ \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k = 1 \cdot \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{k-1}{n}\Bigr) \frac{1}{k!}. \tag{$*$} \]
Since each $\frac{i}{n} \ge 0$, each factor $1 - \frac{i}{n}$ is at most $1$ (and is non-negative, since $i < n$), so this is $\le \frac{1}{k!}$, hence
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k \le \sum_{k=0}^{n} \frac{1}{k!}. \]
Now, $a_1 = 2$ and, for $n \ge 2$, using $k! \ge k(k-1)$ for $k \ge 2$,
\[ a_n \le 1 + 1 + \sum_{k=2}^{n} \frac{1}{k(k-1)} = 2 + \sum_{k=2}^{n} \Bigl(\frac{1}{k-1} - \frac{1}{k}\Bigr) = 2 + 1 - \frac{1}{n} < 1 + 2 = 3. \]
Thus, $(a_n)$ is bounded above by $3$ (and below by $0$).

Increasing. To show $(a_n)$ is actually increasing, look at the expressions for $a_n$ and $a_{n+1}$:
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k, \qquad a_{n+1} = \sum_{k=0}^{n} \binom{n+1}{k} \Bigl(\frac{1}{n+1}\Bigr)^k + \frac{1}{(n+1)^{n+1}}. \]
So $a_n \le a_{n+1}$, provided that for each $k = 0, \dots, n$,
\[ \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k \le \binom{n+1}{k} \Bigl(\frac{1}{n+1}\Bigr)^k. \]
But the left side here is
\[ 1 \cdot \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{k-1}{n}\Bigr) \frac{1}{k!} \]
and the right side is obtained from it by replacing $n$ by $n + 1$, which makes it larger, since $\bigl(1 - \frac{i}{n}\bigr) < \bigl(1 - \frac{i}{n+1}\bigr)$.

This proves $(a_n)$ is a bounded monotone sequence of real numbers, so it converges. As we said, the limit is called $e$.

Connection with the series $\sum_k \frac{1}{k!}$. We noticed above that
\[ a_n = \Bigl(1 + \frac{1}{n}\Bigr)^n \le \sum_{k=0}^{n} \frac{1}{k!}. \]
And, if we let $b_n$ stand for the right side of this inequality,
\[ b_n \le 3. \]
Clearly, the sequence $(b_n)$ is increasing and bounded, so it too has a limit, denoted $\sum_{k=0}^{\infty} \frac{1}{k!}$. Let us temporarily call this limit $E$. Since
\[ a_n \le b_n, \]
in the limit we have
\[ e \le E. \]
But look at $(*)$ again:
\[ a_n = \sum_{k=0}^{n} \binom{n}{k} \Bigl(\frac{1}{n}\Bigr)^k = \sum_{k=0}^{n} 1 \cdot \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{k-1}{n}\Bigr) \frac{1}{k!}. \]
Fix $m \in \mathbb{N}$. Then, for $n \ge m$,
\[ a_n \ge \sum_{k=0}^{m} 1 \cdot \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{k-1}{n}\Bigr) \frac{1}{k!}. \]
But this is a finite sum, and each of the $\frac{i}{n}$ converges to $0$ as $n$ runs, so taking limits of both sides gives
\[ e \ge \sum_{k=0}^{m} \frac{1}{k!} = b_m. \]
Again, inequalities are preserved in the limit, so
\[ e \ge \lim_m b_m = E. \]
Since we had $e \le E$ before, this gives $e = E$. That is,
\[ \lim_n \Bigl(1 + \frac{1}{n}\Bigr)^n = \sum_{k=0}^{\infty} \frac{1}{k!}. \]
(See section 27, SERIES OF NUMBERS.)

Convergence of $((1 + 2/n)^n)$. A slight modification of the proof that $a_n = (1 + 1/n)^n$ increases with $n$ shows that $(1 + 2/n)^n$ also increases with $n$, so converges in $\mathbb{R}$, provided this sequence is bounded above. But,
\[ \Bigl(1 + \frac{2}{n}\Bigr)^n \le \Bigl(1 + \frac{2}{2n}\Bigr)^{2n} = a_n^2 \le e^2, \]
so it is indeed bounded above. Let $c_n = \bigl(1 + \frac{2}{n}\bigr)^n$ and let $w = \lim_n c_n$. Since $(c_{2n})$ is a subsequence of $(c_n)$ it must have the same limit.
Thus
\[ w = \lim_n \Bigl(1 + \frac{2}{2n}\Bigr)^{2n} = \lim_n a_n^2 = e^2. \]

More about the number $e$ will be found in section 27, SERIES OF NUMBERS.

25.1. For each $p \in \mathbb{N}$, the sequence $(y_n)$, where $y_n = (1 + p/n)^n$, is increasing, bounded above, and converges to $e^p$.

25.2. Let $r > 0$ be rational; say $r = p/q$ with $p, q \in \mathbb{N}$. Let $w_n = (1 + r/n)^n$. Then $(w_n)$ is increasing. By looking at $w_n^q$, one can see that $(w_n)$ is bounded above by $e^r$. If $z$ is its limit, $z^q = \lim_n w_n^q = e^p$, so $z = e^{p/q} = e^r$.

25.3. Using the fact that $e^x$ is the unique number such that $e^r \le e^x \le e^s$, for all rational $r, s$ with $r < x < s$, show that for $0 < x \in \mathbb{R}$, $(1 + x/n)^n \to e^x$.

25.4. For $p \in \mathbb{N}$,
\[ \Bigl(1 - \frac{p}{n}\Bigr)^n = \Bigl(\frac{n - p}{n}\Bigr)^n = \frac{1}{\bigl(1 + \frac{p}{n-p}\bigr)^n}, \quad \text{for } n > p. \]
Here the denominator decreases in $n$, and the expression has limit $1/e^p = e^{-p}$. Arguing with subsequences, obtain for rational $r < 0$ that $(1 + r/n)^n \to e^r$, and finally that $(1 + x/n)^n \to e^x$ for all negative real $x$, so that in fact this holds for all $x \in \mathbb{R}$.

25.5. Let $a_n = (1 + 1/n)^n$, $b_n = a_n (1 + 1/n) = (1 + 1/n)^{n+1}$. Then $a_n \le b_n$, $(a_n)$ is increasing and $(b_n)$ is decreasing, so both converge, and to the same limit, namely $e$.
Outline: Using Bernoulli's inequality, prove that for $n > 1$,
\[ \frac{(1 + 1/n)^n}{(1 + 1/(n-1))^n} = \Bigl(\frac{n^2 - 1}{n^2}\Bigr)^n \]
is between $1 - 1/n$ and $1/(1 + 1/n)$, from which $a_n / a_{n-1} \ge 1$ and $b_n / b_{n-1} \le 1$.

26. LIMIT INFERIOR AND LIMIT SUPERIOR.

If $(x_n)$ is a bounded sequence there are two important monotone sequences of real numbers associated with it. If we put, for each $n \in \mathbb{N}$,
\[ a_n = \inf_{k \ge n} x_k \quad \text{and} \quad b_n = \sup_{k \ge n} x_k, \]
then $(a_n)$ is an increasing sequence and $(b_n)$ is a decreasing sequence. Indeed, for each $n \in \mathbb{N}$,
\[ \{x_k : k \ge n\} \supset \{x_k : k \ge n + 1\}, \]
so
\[ a_n = \inf\{x_k : k \ge n\} \le \inf\{x_k : k \ge n + 1\} = a_{n+1} \]
and
\[ b_n = \sup\{x_k : k \ge n\} \ge \sup\{x_k : k \ge n + 1\} = b_{n+1}. \]
(The infima and suprema here exist in $\mathbb{R}$ because the sequences are bounded both below and above.)
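The monotonicity of the tail infima and suprema can be seen in a small computation. The Python sketch below uses the sample sequence $x_n = 1 + (-1)^n + 2^{-n}$, chosen here only for illustration, and approximates $a_n = \inf_{k \ge n} x_k$ and $b_n = \sup_{k \ge n} x_k$ by the minimum and maximum over the finite window $n \le k \le$ `HORIZON`. The names `x`, `tail_inf_sup`, and `HORIZON` are hypothetical, and truncating to a finite window is an assumption: for this sequence the windowed maximum equals the true supremum, while the windowed minimum only approximates the true infimum (which is $0$ and is not attained).

```python
from fractions import Fraction


def x(n):
    """Sample bounded sequence x_n = 1 + (-1)^n + 2^(-n), in exact arithmetic."""
    return 1 + (-1) ** n + Fraction(1, 2 ** n)


def tail_inf_sup(seq, n, horizon):
    """Min and max of seq(k) for n <= k <= horizon (finite stand-in for inf/sup)."""
    window = [seq(k) for k in range(n, horizon + 1)]
    return min(window), max(window)


HORIZON = 60  # truncation point; an assumption of this sketch
rows = [(n, *tail_inf_sup(x, n, HORIZON)) for n in range(1, 9)]
for n, a_n, b_n in rows:
    print(f"n = {n}: a_n ~ {float(a_n):.3e}, b_n = {float(b_n):.6f}")
```

In the output the second column never decreases and the third never increases, in line with $a_n \le a_{n+1}$ and $b_n \ge b_{n+1}$.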
Now, if $(x_n)$ is bounded, we see that $(a_n)$ is also bounded, and since it is increasing it converges to some $a$. In fact, it converges to $a = \sup_n a_n$. This $a$ is called the limit inferior of $(x_n)$, written $\liminf_n x_n$. Thus,
\[ \liminf_n x_n = \lim_n \inf_{k \ge n} x_k = \sup_n \inf_{k \ge n} x_k. \]
Similarly, $(b_n)$ converges to $b = \inf_n b_n$, which is called the limit superior of $(x_n)$, written $b = \limsup_n x_n$:
\[ \limsup_n x_n = \lim_n \sup_{k \ge n} x_k = \inf_n \sup_{k \ge n} x_k. \]
Other notation for these are: $\underline{\lim}_n\, x_n$ for $\liminf_n x_n$ and $\overline{\lim}_n\, x_n$ for $\limsup_n x_n$.

26.1 Example. Let $x_n = 1 + (-1)^n + \dfrac{1}{2^n}$. Then for each $n$,
\[ x_n = 2 + \frac{1}{2^n}, \text{ if } n \text{ is even, and } x_n = \frac{1}{2^n}, \text{ if } n \text{ is odd.} \]
Thus, if $b_n = \sup_{k \ge n} x_k$, then
\[ b_n = \begin{cases} 2 + \dfrac{1}{2^n}, & \text{if } n \text{ is even,} \\[1ex] 2 + \dfrac{1}{2^{n+1}}, & \text{if } n \text{ is odd,} \end{cases} \]
hence,
\[ \limsup_n x_n = \lim_n b_n = 2. \]
On the other hand, we find that $\inf_{k \ge n} x_k = 0$, for all $n$, so $\liminf_n x_n = 0$.

The following theorem gives characterizations of limit superior and limit inferior. These are often taken as the definitions of the concepts.

26.2 Theorem. For a bounded sequence $(x_n)$ of real numbers,
(1) $\liminf_n x_n = a$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k > a - \varepsilon$, for all $k \ge n$, and
(ii) for all $n$, there exists $k \ge n$ with $x_k < a + \varepsilon$.
(2) $\limsup_n x_n = b$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k < b + \varepsilon$, for all $k \ge n$, and
(ii) for all $n$, there exists $k \ge n$ with $x_k > b - \varepsilon$.

Proof. We do the case of lim sup and leave the lim inf case as an exercise.
Let $b = \limsup_n x_n$. Then $b = \inf_n b_n$, where $b_n = \sup_{k \ge n} x_k$.
(i) Let $\varepsilon > 0$. Since $b = \inf_n b_n$, we may choose $n$ with
\[ b_n < b + \varepsilon. \]
That is,
\[ \sup_{k \ge n} x_k < b + \varepsilon. \]
But if the supremum of a set is $<$ some number, each member of the set is also $<$ that number, so
\[ x_k < b + \varepsilon, \quad \text{for all } k \ge n. \]
Thus, for each $\varepsilon > 0$, there exists $n$ such that $x_k < b + \varepsilon$, for all $k \ge n$.
(ii) Again let $\varepsilon > 0$.
Then $b - \varepsilon < b = \inf_n b_n$, so if we fix an arbitrary $n$,
\[ b - \varepsilon < b_n. \]
But $b_n = \sup_{k \ge n} x_k$, and a number less than a least upper bound is no longer an upper bound, so there exists $k \ge n$ with
\[ b - \varepsilon < x_k. \]
Thus, we have shown that for each $\varepsilon > 0$ and each $n \in \mathbb{N}$, there exists $k \ge n$ with $x_k > b - \varepsilon$, completing the proof that the limit superior satisfies the two properties.

Conversely, suppose (i) and (ii) hold. Let $\varepsilon > 0$. Then, by (i), we may choose $n$ such that $x_k < b + \varepsilon$, for all $k \ge n$. Taking the supremum over all the $k \ge n$, we obtain
\[ \sup_{k \ge n} x_k \le b + \varepsilon. \]
(Be careful, this is $\le$, not $<$!) Thus, there exists $n$ such that
\[ \sup_{k \ge n} x_k \le b + \varepsilon. \]
Hence, taking infimum (or limit),
\[ \limsup_n x_n = \inf_n \sup_{k \ge n} x_k \le b + \varepsilon. \]
Since this inequality is true for arbitrary $\varepsilon > 0$,
\[ \limsup_n x_n \le b. \]
Again, let $\varepsilon > 0$ and fix $n \in \mathbb{N}$. Then, by (ii), there exists $k \ge n$ with
\[ x_k > b - \varepsilon, \]
so that
\[ \sup_{k \ge n} x_k > b - \varepsilon. \]
But then, since $n$ was arbitrary, we may take infimum and get
\[ \limsup_n x_n = \inf_n \sup_{k \ge n} x_k \ge b - \varepsilon. \]
Finally, since $\varepsilon$ was arbitrary, we get
\[ \limsup_n x_n \ge b. \]
Thus, $\limsup_n x_n \le b$ and $\limsup_n x_n \ge b$, so we have equality.

26.3 Note. Remember that the variables $k$ and $n$ in the definitions of limit inferior and limit superior and in the above theorem are dummy variables, so may be replaced by any others. Thus, $\limsup_n x_n$ is also $\lim_N \sup_{n \ge N} x_n = \inf_N \sup_{n \ge N} x_n$. And, if $b = \limsup_n x_n$ and $\varepsilon > 0$, then
(i) there exists $N$ such that $x_n < b + \varepsilon$, for all $n \ge N$;
(ii) for all $N$, there exists $n \ge N$ with $x_n > b - \varepsilon$.
Here, (i) can be summarized by saying $x_n < b + \varepsilon$ for all except a finite number of indices $n$, and (ii) by saying that $x_n > b - \varepsilon$ for an infinite number of $n$.
"All but finitely many terms are $< b + \varepsilon$ and infinitely many terms are $> b - \varepsilon$."

26.4 Theorem. Let $(x_n)$ be a bounded sequence of real numbers. Then $\liminf_n x_n$ and $\limsup_n x_n$ are each cluster points of $(x_n)$.

Proof.
(liminf case) This follows from the characterization in theorem 26.2: $\liminf_n x_n = a$ if and only if for each $\varepsilon > 0$,
(i) there exists $n$ such that $x_k > a - \varepsilon$, for all $k \ge n$, and
(ii) for all $n$, there exists $k \ge n$ with $x_k < a + \varepsilon$.
To see this, let $a = \liminf_n x_n$. By condition (i), all but finitely many terms $x_k$ are $> a - \varepsilon$, and by (ii), infinitely many are $< a + \varepsilon$. So, together, infinitely many terms are in $(a - \varepsilon, a + \varepsilon)$, showing that $a$ is a cluster point of $(x_n)$.

Here is a detailed argument: Let $\varepsilon > 0$ and $n \in \mathbb{N}$, and choose by (i) an $N \in \mathbb{N}$ such that for all $k \ge N$, $x_k > a - \varepsilon$. Now, by (ii), we can find $m \ge \max\{n + 1, N\}$ with $x_m < a + \varepsilon$. Thus, $m > n$ and $m$ satisfies $x_m > a - \varepsilon$ and $x_m < a + \varepsilon$; that is,
\[ a - \varepsilon < x_m < a + \varepsilon. \]
Thus, for all $\varepsilon > 0$ and all $n \in \mathbb{N}$, there exists $m > n$ with $x_m \in B(a, \varepsilon)$; that is, $a$ is a cluster point of $(x_n)$.
The proof for $\limsup_n x_n$ is similar.

Unbounded sequences. If a real sequence $(x_n)$ is unbounded, we can still define $\limsup_n x_n$ and $\liminf_n x_n$ in the same way, provided we use the conventions for supremum and infimum in $\overline{\mathbb{R}}$, namely that a (non-empty) set which is not bounded above has supremum $+\infty$, and one which is not bounded below has infimum $-\infty$.

We find that, if $(x_n)$ is not bounded above, then $\limsup_n x_n = +\infty$ and $\liminf_n x_n$ may or may not be infinite; if $(x_n)$ is not bounded below, then $\liminf_n x_n = -\infty$, and $\limsup_n x_n$ may or may not be infinite.

26.1. For a bounded sequence $(x_n)$, $\liminf_n x_n$ is the smallest cluster point of $(x_n)$ and $\limsup_n x_n$ is the largest.

26.2. If a bounded sequence $(x_n)$ has $\liminf_n x_n = \limsup_n x_n$, then the sequence converges to this common value.

26.3. If an unbounded sequence $(x_n)$ has $\liminf_n x_n = \limsup_n x_n$, then the sequence converges to this common value in $\overline{\mathbb{R}}$.

26.4. For a bounded sequence $(x_n)$ in $\mathbb{R}$,
$\limsup_n x_n = \sup\{t \in \mathbb{R} : x_n \ge t \text{ for infinitely many } n\}$;
$\liminf_n x_n = \inf\{t \in \mathbb{R} : x_n \le t \text{ for infinitely many } n\}$.

26.5.
For a sequence in a metric space, $x_n \to a$ iff $\limsup_n d(x_n, a) = 0$.

27. SERIES OF NUMBERS

Associated with a sequence $(a_n) = (a_1, a_2, \dots)$ of real (or complex) numbers is the series $\sum_n a_n = \sum_{n=1}^{\infty} a_n$.$^2$ For each $n \in \mathbb{N}$, the sum $s_n = \sum_{k=1}^{n} a_k$ is called the $n$th partial sum of the series and $a_n$ is called its $n$th term. The series $\sum_{n=1}^{\infty} a_n$ is said to converge if the sequence $(s_n)$ of partial sums converges, and to be Cauchy if $(s_n)$ is Cauchy. If $s$ is the limit of $(s_n)$ we call $s$ the sum of the series and write
\[ \sum_{n=1}^{\infty} a_n = s. \]
If $(s_n)$ diverges, the series is said to diverge.

27.1 Example (Geometric Series). The series $\sum_{n=1}^{\infty} x^{n-1}$ converges iff $|x| < 1$, in which case
\[ \sum_{n=1}^{\infty} x^{n-1} = \frac{1}{1 - x}. \]

Proof. If $x \ne 1$, the $n$th partial sum is
\[ s_n = \sum_{k=1}^{n} x^{k-1} = \frac{1 - x^n}{1 - x}. \]
Since $x^n \to 0$ if and only if $|x| < 1$, this converges to $\frac{1}{1 - x}$ if $|x| < 1$ and diverges if $|x| > 1$. If $x = 1$, we have $s_n = n$, so the series diverges (to $+\infty$), and if $x = -1$, the sequence $(s_n)$ is $(1, 0, 1, 0, \dots)$, which also diverges. If $x$ is a complex number with $|x| = 1$, the series still diverges, since $s_{n+1} - s_n = x^n \not\to 0$. (This is a special case of the "trivial test", also known as the "$n$th-term test", theorem 27.11 below.)

For convenience of notation one also works with series of the form
\[ \sum_{n=p}^{\infty} a_n, \]
whose terms form a sequence $(a_p, a_{p+1}, \dots)$ indexed on $\{p, p + 1, \dots\}$ and whose partial sums are of the form
\[ s_n = \sum_{k=p}^{n} a_k. \]
For example, the geometric series above is often considered to be $\sum_{n=0}^{\infty} x^n$. We have (if $x \ne 1$)
\[ \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}, \]
and if $|x| < 1$,
\[ \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}. \]

$^2$Technically, the series is the pair $((a_n), (s_n))$ consisting of the sequence of terms and the sequence of partial sums.

27.2 Theorem.
(Linearity)
(a) If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series, then $\sum_{n=1}^{\infty} (a_n + b_n)$ is convergent with
\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \]
(b) If $\sum_{n=1}^{\infty} a_n$ is convergent and $c \in \mathbb{R}$, then $\sum_{n=1}^{\infty} c\, a_n$ is convergent with
\[ \sum_{n=1}^{\infty} c\, a_n = c \sum_{n=1}^{\infty} a_n. \]

Proof. This follows from the corresponding facts for sequences. The details are left as an exercise.

27.3 Example (The harmonic series). The series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.

Proof. Suppose this series were to converge with sum $s$, and let $s_n$ be the $n$th partial sum. Then $s - s_n \to 0$. But, for all $n$,
\[ s - s_n \ge s_{2n} - s_n = \sum_{k=n+1}^{2n} \frac{1}{k} \ge \sum_{k=n+1}^{2n} \frac{1}{2n} = \frac{n}{2n} = \frac{1}{2}, \]
so $s - s_n$ does not converge to $0$, a contradiction.

27.4 Example (The number $e$). In section 25, we defined the number $e$ as the limit of the sequence whose terms are $(1 + 1/n)^n$. But we also proved that $e$ is the limit of the sums $s_n = \sum_{k=0}^{n} \frac{1}{k!}$; that is, that $e$ is the sum of the infinite series $\sum_{k=0}^{\infty} \frac{1}{k!}$.

The convergence of the partial sums $s_n$ to $e$ is very rapid. Indeed, the error in using $s_n$ to approximate $e$ is
\[ e - s_n = \sum_{k=n+1}^{\infty} \frac{1}{k!} < \frac{1}{(n+1)!} \sum_{k=0}^{\infty} \frac{1}{(n+1)^k} = \frac{1}{(n+1)!} \cdot \frac{1}{1 - \frac{1}{n+1}} = \frac{1}{n!\, n}. \]
Thus, $0 < e - s_n < \frac{1}{n!\, n}$. For $n = 10$, this error bound is $0.2755731922 \times 10^{-7}$, so the partial sum yields $e$ correct to 7 decimal places.

27.5 Theorem. The number $e$ is irrational.

Proof. Suppose otherwise that $e = m/n$, with $m, n \in \mathbb{N}$. By the estimate above,
\[ 0 < n!\,(e - s_n) < 1/n. \]
By our assumption, $n!\, e$ is an integer, and
\[ n!\, s_n = n! \Bigl(1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!}\Bigr) \]
is an integer, so $n!\,(e - s_n)$ is an integer between $0$ and $1$, which is impossible.

Actually, $e$ is transcendental, that is, it is not the root of any non-zero polynomial with rational coefficients, but that we do not prove here.

The monotone convergence theorem for convergence of sequences of reals becomes, in terms of series:

27.6 Theorem. A series of non-negative terms converges iff its partial sums are bounded.

Proof. Let $\sum_{n=1}^{\infty} a_n$ be a series with $a_n \ge 0$, for all $n$, and with partial sums $s_n = \sum_{k=1}^{n} a_k$.
Then, for all $n$,
\[ s_n \le s_n + a_{n+1} = s_{n+1}. \]
This shows that $(s_n)$ is an increasing sequence of real numbers, so if it is bounded above, it is convergent; if not, it diverges to $+\infty$.

27.7 Note. We can say more: for a series of non-negative terms,
\[ \sum_{k=1}^{n} a_k \le \sum_{k=1}^{\infty} a_k. \]
Indeed, using the notation of the previous proof, we know that $\lim_n s_n = \sup_n s_n$.
In case $(s_n)$ is not bounded above, we know that $s_n \to +\infty$ in $\overline{\mathbb{R}}$ and we write
\[ \sum_{n=1}^{\infty} a_n = +\infty. \]
However, we still say the series diverges (to infinity).

27.8 Theorem (Comparison test). For series $\sum_n a_n$ and $\sum_n b_n$ of non-negative terms, if $N_0 \in \mathbb{N}$ and $a_n \le b_n$, for all $n \ge N_0$, then
(a) if $\sum_n b_n$ converges then so does $\sum_n a_n$, and
(b) if $\sum_n a_n$ diverges, then so does $\sum_n b_n$.

Proof. (a) and (b) are contrapositives of each other, so we prove only the first. Changing a finite number of terms does not affect convergence (although it does affect the sum), so we may assume $a_n \le b_n$ for all $n$.
Let $\sum_n b_n$ converge and let $B = \sum_{n=1}^{\infty} b_n$. The partial sums $B_n = \sum_{k=1}^{n} b_k$ form an increasing sequence, bounded above by $B$. But then,
\[ 0 \le \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le B. \]
This shows that the partial sums of $\sum_n a_n$ are also bounded above by $B$, so they converge and
\[ \sum_{n=1}^{\infty} a_n \le B = \sum_{n=1}^{\infty} b_n. \]

The Cauchy condition can be restated in terms of series as:

27.9 Theorem. A series $\sum_n a_n$ is Cauchy (hence converges) iff for each $\varepsilon > 0$, there exists $N$ such that for $n \ge m \ge N$,
\[ \Bigl| \sum_{k=m}^{n} a_k \Bigr| < \varepsilon. \]

Proof. Let $s_n = \sum_{k=1}^{n} a_k$. Then, by definition, $(s_n)$ is Cauchy iff for each $\varepsilon > 0$, there exists $N$ such that for $n, m > N$,
\[ |s_n - s_m| < \varepsilon. \]
We notice that if $n \ge m$, then
\[ |s_n - s_m| = \Bigl| \sum_{k=1}^{n} a_k - \sum_{k=1}^{m} a_k \Bigr| = \Bigl| \sum_{k=m+1}^{n} a_k \Bigr|. \]
Suppose $(s_n)$ is Cauchy. Let $\varepsilon > 0$. Choose $N_1$ so that for $m, n > N_1$,
\[ |s_n - s_m| < \varepsilon. \]
Let $N = N_1 + 2$ and $n \ge m \ge N$. Then $n, m - 1 > N_1$, so
\[ |s_n - s_{m-1}| = \Bigl| \sum_{k=m}^{n} a_k \Bigr| < \varepsilon. \]
Thus, for each $\varepsilon > 0$, there exists $N$ such that $n \ge m \ge N$ implies $\bigl| \sum_{k=m}^{n} a_k \bigr| < \varepsilon$.
Conversely, suppose $\varepsilon > 0$ and $N$ is chosen so that $n \ge m \ge N$ implies $\bigl| \sum_{k=m}^{n} a_k \bigr| < \varepsilon$. Let $n, m > N$.
In case $n > m$, we have $n \ge m + 1 \ge N$ and $|s_n - s_m| = \bigl| \sum_{k=m+1}^{n} a_k \bigr| < \varepsilon$. The case $n < m$ is proved similarly, by interchanging the roles of $n$ and $m$; and in case $n = m$, $|s_n - s_m| = 0 < \varepsilon$. Thus, in all cases $n, m > N$ implies $|s_n - s_m| < \varepsilon$, so the sequence $(s_n)$ is Cauchy.

27.10 Corollary. If $\sum_{n=1}^{\infty} a_n$ converges, then the sequence of "tails" or "remainders"
\[ R_n := \sum_{k=1}^{\infty} a_k - \sum_{k=1}^{n} a_k = \sum_{k=n+1}^{\infty} a_k \]
converges to $0$.

Here is what some people call the $n$th-term test.

27.11 Theorem (trivial test for divergence).
(1) If $\sum_n a_n$ converges then $a_n \to 0$. Equivalently,
(2) if $(a_n)$ does not converge to $0$, then the series $\sum_n a_n$ diverges.

Proof. If the series converges it is Cauchy, so if $\varepsilon > 0$ is given, we can find an $N$ such that for $n \ge m \ge N$, $\bigl| \sum_{k=m}^{n} a_k \bigr| < \varepsilon$. Taking $n = m \ge N$ gives $|a_n| < \varepsilon$. This proves $a_n \to 0$.

We emphasize that the above result does not say that $a_n \to 0$ implies the series converges. Indeed, the harmonic series $\sum_n \frac{1}{n}$ diverges, yet $1/n \to 0$. However, there is a case where this does hold.

27.12 Theorem (Alternating series test).
If the sequence $(a_n)$ decreases to $0$, then the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges. Moreover, the remainder
\[ R_n = \sum_{k=1}^{\infty} (-1)^{k+1} a_k - \sum_{k=1}^{n} (-1)^{k+1} a_k = \sum_{k=n+1}^{\infty} (-1)^{k+1} a_k \]
satisfies $|R_n| \le a_{n+1}$, and $R_n$ has the same sign as the $(n+1)$st term $(-1)^{n+2} a_{n+1}$.
(Corresponding statements hold for $\sum_{n=1}^{\infty} (-1)^n a_n$.)

Thus, if the terms of a series have alternating signs and have absolute values which decrease with limit $0$, then the series converges and the $n$th "remainder", that is, the error in using the $n$th partial sum to approximate the sum, is bounded by the size of the first term omitted (the $(n+1)$st term) and is of the same sign.

Another way you could state this result for both cases (without explicitly writing the $(-1)^n$ or $(-1)^{n+1}$) is: Suppose $c_n \to 0$, $|c_n| \ge |c_{n+1}|$, for all $n$, and $c_n c_{n+1} \le 0$, for all $n$.
Then $\sum_n c_n$ converges and
\[ R_n = \sum_{k=1}^{\infty} c_k - \sum_{k=1}^{n} c_k = \sum_{k=n+1}^{\infty} c_k \]
satisfies $|R_n| \le |c_{n+1}|$ and $c_{n+1} R_n \ge 0$, for all $n$. The condition $c_n c_{n+1} \le 0$ indicates that the terms alternate in sign. The condition $c_{n+1} R_n \ge 0$ indicates that the remainder is of the same sign as $c_{n+1}$.

Proof. Let $m$ and $n$ be natural numbers with $m > n$. Then
\[ \sum_{k=n}^{m} (-1)^{k+1} a_k = (-1)^{n+1} \bigl( (a_n - a_{n+1}) + (a_{n+2} - a_{n+3}) + (a_{n+4} - a_{n+5}) + \cdots \bigr). \]
Since the sequence $(a_n)$ decreases, the terms $(a_n - a_{n+1})$, $(a_{n+2} - a_{n+3})$, $\dots$ are all $\ge 0$, and hence the sum
\[ (a_n - a_{n+1}) + (a_{n+2} - a_{n+3}) + (a_{n+4} - a_{n+5}) + \cdots \]
is also $\ge 0$. (Whether this sum ends with $a_m$ or with $a_{m-1} - a_m$ depends on whether $m - n$ is even or odd, but it is still $\ge 0$.) Therefore
\[ \Bigl| \sum_{k=n}^{m} (-1)^{k+1} a_k \Bigr| = (a_n - a_{n+1}) + (a_{n+2} - a_{n+3}) + (a_{n+4} - a_{n+5}) + \cdots = a_n - \bigl[ (a_{n+1} - a_{n+2}) + (a_{n+3} - a_{n+4}) + \cdots \bigr] \le a_n, \]
since again the terms $(a_{n+1} - a_{n+2})$, $(a_{n+3} - a_{n+4})$, $\dots$ are $\ge 0$.

Now, let $\varepsilon > 0$ and use the fact that $a_n \to 0$ to find $N$ such that $n \ge N$ implies $a_n < \varepsilon$. Then for $m > n \ge N$, we have
\[ \Bigl| \sum_{k=n}^{m} (-1)^{k+1} a_k \Bigr| < \varepsilon, \]
so the series is Cauchy, hence converges.

We have still to check the sign of the remainder and the estimate $|R_n| \le a_{n+1}$. We saw above that
\[ \sum_{k=n}^{m} (-1)^{k+1} a_k \]
is $(-1)^{n+1}$ times a non-negative quantity which is at most $a_n$; hence it has the same sign as $(-1)^{n+1} a_n$, the $n$th term, and absolute value at most $a_n$. When we let $m$ tend to infinity we obtain
\[ R_{n-1} = \sum_{k=n}^{\infty} (-1)^{k+1} a_k, \]
which still has the sign of the $n$th term and satisfies $|R_{n-1}| \le a_n$, as required. (Replace $n$ by $n + 1$ to obtain the result in the form stated.)

A series $\sum_n a_n$ is said to converge absolutely if $\sum_n |a_n|$ converges.

27.13 Theorem. If a series of real (or complex) numbers converges absolutely, then it converges.

Proof. Let $\sum_n a_n$ converge absolutely. Then, by definition, $\sum_n |a_n|$ converges. Thus, by the Cauchy criterion, if $\varepsilon > 0$ is fixed, we can choose $N$ such that $n \ge m \ge N$ implies
\[ \sum_{k=m}^{n} |a_k| < \varepsilon. \]
Hence,
\[ \Bigl| \sum_{k=m}^{n} a_k \Bigr| \le \sum_{k=m}^{n} |a_k| < \varepsilon. \]
Thus, the series $\sum_n a_n$ is also Cauchy, so converges.
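The remainder estimate of the alternating series test (Theorem 27.12) lends itself to a quick numerical check. The Python sketch below uses the alternating harmonic series $\sum_{k \ge 1} (-1)^{k+1}/k$, with $a_k = 1/k$ decreasing to $0$. It relies on the standard fact, not proved in these notes and treated here as an assumption, that the sum of this series is $\ln 2$. The helper names `partial_sum` and `remainder` are hypothetical, and the comparison is carried out in floating point, so it illustrates the estimate rather than proving it.

```python
import math


def partial_sum(n):
    """s_n = sum_{k=1}^{n} (-1)^(k+1) / k, accumulated with math.fsum for accuracy."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


def remainder(n):
    """R_n = (sum of the series) - s_n, assuming the sum equals ln 2."""
    return math.log(2.0) - partial_sum(n)


for n in (1, 2, 5, 10, 100, 1000):
    r = remainder(n)
    bound = 1.0 / (n + 1)          # a_{n+1}, the first omitted term in absolute value
    sign_next = (-1) ** (n + 2)    # sign of the (n+1)st term
    print(f"n = {n:4d}: R_n = {r:+.6e}, a_(n+1) = {bound:.6e}, "
          f"same sign as next term: {r * sign_next > 0}")
```

For each $n$ shown, $|R_n|$ stays below $a_{n+1} = 1/(n+1)$ and $R_n$ carries the sign of the first omitted term, as the theorem predicts.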
The reason we emphasize that the terms of the series are real or complex numbers is that the metric completeness of these spaces is responsible for the result. Remember, "metric completeness" refers to the fact that Cauchy sequences converge. (That absolute convergence implies convergence can actually be used to characterize completeness.)

27.14 Corollary (Absolute comparison test). If there exists $N_0$ such that $|a_n| \le |b_n|$ for all $n \ge N_0$, then $\sum_n a_n$ converges absolutely if $\sum_n b_n$ does.

Notice that, in this form, there is no test for divergence.

27.15 Theorem (Limit comparison test).$^3$
Let $\sum_n a_n$ and $\sum_n b_n$ be series of non-negative terms.
(a) If $\lim_n \frac{a_n}{b_n} = L < \infty$ and $\sum_n b_n$ converges, then $\sum_n a_n$ converges also.
(b) If $\lim_n \frac{a_n}{b_n} = L > 0$ and $\sum_n a_n$ converges, then so does $\sum_n b_n$.
In part (b), $L$ is allowed to be $+\infty$. In case $0 < L < \infty$, the result says $\sum_n a_n$ and $\sum_n b_n$ converge or diverge together. That is, both converge or both diverge.

Proof. (a) Suppose $a_n / b_n \to L < \infty$ and fix $K$ with $L < K < \infty$. Then there exists $N$ such that for $n \ge N$, $a_n / b_n < K$. Thus, $a_n \le K b_n$, for all $n \ge N$. Thus, if $\sum_n b_n$ converges, so does $\sum_n K b_n$, and hence so does $\sum_n a_n$, by the usual comparison test.
(b) Suppose $\lim_n a_n / b_n = L > 0$, and let $0 < c < L$. Then we may choose $N$ so that for $n \ge N$, $a_n / b_n > c$. Thus, $c\, b_n < a_n$, for all $n \ge N$, and so convergence of $\sum_n a_n$ implies that of $\sum_n c\, b_n$, and hence of $\sum_n b_n$, since $c$ is not $0$.

You will notice that the limit could have been replaced by lim sup in (a) and by lim inf in (b). As the proof shows, what is really involved is that the ratios $a_n / b_n$ be bounded above in (a) and be bounded below by a number $c > 0$ in (b).

The doubling idea used to show that the harmonic series diverges can be refined to give a surprisingly useful test. For a series whose terms are non-negative and decrease, a rather "thin" subsequence determines convergence.

27.16 Theorem (Cauchy's condensation test). Let $a_n \ge a_{n+1} \ge 0$, for all $n$. Then,
\[ \sum_{n=1}^{\infty} a_n \text{ converges} \iff \sum_{k=0}^{\infty} 2^k a_{2^k} \text{ converges.} \]
$^3$Some people call this the "ratio comparison test". See the exercises for another result with that name.

Proof. If $2^m > n$, we have
\[
\begin{aligned}
\sum_{i=1}^{n} a_i &\le a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + \cdots + (a_{2^m} + \cdots + a_{2^{m+1} - 1}) \\
&\le a_1 + (a_2 + a_2) + (a_4 + a_4 + a_4 + a_4) + \cdots + (a_{2^m} + \cdots + a_{2^m}) \\
&= 1 a_1 + 2 a_2 + 4 a_4 + \cdots + 2^m a_{2^m} \\
&\le \sum_{k=0}^{\infty} 2^k a_{2^k},
\end{aligned}
\]
so if $\sum_{k=0}^{\infty} 2^k a_{2^k}$ converges, so does $\sum_{n=1}^{\infty} a_n$, since its partial sums are bounded above.
Similarly,
\[
\begin{aligned}
\sum_{i=1}^{\infty} a_i &\ge a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + (a_9 + \cdots + a_{16}) + \cdots + (a_{2^{m-1} + 1} + \cdots + a_{2^m}) \\
&\ge a_1 + a_2 + (a_4 + a_4) + (a_8 + a_8 + a_8 + a_8) + (a_{16} + \cdots + a_{16}) + \cdots + (a_{2^m} + \cdots + a_{2^m}) \\
&= a_1 + a_2 + 2 a_4 + 4 a_8 + \cdots + 2^{m-1} a_{2^m} \\
&\ge \frac{1}{2} \bigl( a_1 + 2 a_2 + 4 a_4 + \cdots + 2^m a_{2^m} \bigr) \\
&= \frac{1}{2} \sum_{k=0}^{m} 2^k a_{2^k},
\end{aligned}
\]
so that if $\sum_{n=1}^{\infty} a_n$ converges, so does $\sum_{k=0}^{\infty} 2^k a_{2^k}$.

Let us apply this to the so-called $p$-hyperharmonic series, also called "$p$-series".

27.17 Theorem ($p$-series). For a real number $p$, $\sum_n \frac{1}{n^p}$ converges iff $p > 1$.

Proof. By the Cauchy condensation test, the series $\sum_n \frac{1}{n^p}$ converges iff
\[ \sum_{k=0}^{\infty} 2^k \frac{1}{(2^k)^p} = \sum_{k=0}^{\infty} \Bigl( \frac{1}{2^{p-1}} \Bigr)^k \]
does. But this is a geometric series; it converges iff $\frac{1}{2^{p-1}} < 1$, that is, iff $p > 1$.

27.18 Theorem (Cauchy's root test). The series $\sum_n a_n$
(a) converges absolutely if $\limsup_n |a_n|^{1/n} < 1$;
(b) diverges if $\limsup_n |a_n|^{1/n} > 1$.
If $\limsup_n |a_n|^{1/n} = 1$ the series could converge or diverge.

Proof. (a) Let $\alpha = \limsup_n |a_n|^{1/n} < 1$. Choose $r$ with $\alpha < r < 1$. Then there exists $N$ such that for $n > N$, $|a_n|^{1/n} < r$. Thus,
\[ |a_n| \le r^n \quad \text{for } n > N. \]
Hence $\sum_n a_n$ converges absolutely by comparison with the geometric series $\sum_n r^n$, $0 \le r < 1$.
(b) If $\limsup_n |a_n|^{1/n} > 1$, then for infinitely many $n$, $|a_n| > 1$, hence $(a_n)$ could not tend to $0$. Hence, the series $\sum_n a_n$ diverges.

Notice that $\lim_n (1/n)^{1/n} = 1$ and $\sum_n \frac{1}{n}$ diverges, while $\lim_n (1/n^2)^{1/n}$ is also $1$ and $\sum_n \frac{1}{n^2}$ converges.

27.19 Theorem (D'Alembert's ratio test).
The series $\sum_n a_n$, with $a_n \ne 0$,
(a) converges absolutely if $\limsup_n |a_{n+1}|/|a_n| < 1$ and
(b) diverges if $\liminf_n |a_{n+1}|/|a_n| > 1$ (more generally, if there exists $N$ such that for $n \ge N$, $|a_{n+1}|/|a_n| \ge 1$).

Proof. (a) We may assume $a_n > 0$, for all $n$. Let $\alpha = \limsup_n \frac{a_{n+1}}{a_n} < 1$ and let $1 > r > \alpha$. Then, there exists $N$ such that for $n \ge N$,
$$\frac{a_{n+1}}{a_n} < r.$$
Thus, for all $n \ge N$, $a_{n+1} < a_n r$; so that
$$a_{N+1} < a_N r, \qquad a_{N+2} < a_{N+1} r < a_N r^2,$$
and by induction $a_{N+k} < a_N r^k$. Thus, for all $n \ge N$,
$$a_n \le a_N r^{n-N} = a_N r^{-N} r^n.$$
Writing $K$ for the constant $a_N r^{-N}$, we have
$$a_n \le K r^n, \quad \text{for } n \ge N.$$
Since $0 < r < 1$, the series $\sum_n a_n$ converges, by comparison with a convergent geometric series.
(b) If $\liminf_n a_{n+1}/a_n > 1$, then there exists an $N$ such that for all $n \ge N$, $a_{n+1}/a_n > 1$. Now if we have even $a_{n+1}/a_n \ge 1$, for all $n \ge N$, we see that $a_n \le a_{n+1}$ for all $n \ge N$, so $a_n$ cannot converge to 0, hence the series $\sum_n a_n$ cannot converge.

Again, $\sum_n \frac{1}{n}$ diverges and $\sum_n \frac{1}{n^2}$ converges, yet in both cases $\lim_n \frac{a_{n+1}}{a_n} = 1$. Notice that both the ratio test and the root test deduce divergence only from the lack of convergence to 0 of the terms.

The relationship between the ratio and the root tests is brought out by the following result. It shows that whenever the ratio test shows convergence, the root test will also. If the $\liminf$ version of the ratio test shows divergence, so will the root test. There are series, however, for which the root test indicates convergence, but the ratio test does not apply.

27.20 Theorem. If $a_n > 0$ for all $n \in \mathbb{N}$,
$$\liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n} \le \limsup_n \frac{a_{n+1}}{a_n}.$$

Proof.
Recall that, for any sequence $(x_n)$,
$$\limsup_n x_n = \lim_n \sup_{k \ge n} x_k \quad \text{and} \quad \liminf_n x_n = \lim_n \inf_{k \ge n} x_k,$$
and that
$$\text{if } \beta > \limsup_n x_n, \text{ then there exists } N \text{ such that for } n \ge N,\ x_n < \beta. \qquad (*)$$
("All but finitely many terms are $< \beta$.") Now, for each $n$,
$$\inf_{k \ge n} x_k \le \sup_{k \ge n} x_k,$$
so in the limit
$$\liminf_n x_n \le \limsup_n x_n.$$
This is a general result which applies here to give
$$\liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n}.$$
The interesting part is the comparison with the ratios.

Let $\alpha = \limsup_n \frac{a_{n+1}}{a_n}$ and let $\beta > \alpha$. Then, by $(*)$ there exists $N$ such that for $n \ge N$,
$$\frac{a_{n+1}}{a_n} < \beta.$$
Thus, for all $n \ge N$, $a_{n+1} < a_n \beta$; so that
$$a_{N+1} < a_N \beta, \qquad a_{N+2} < a_{N+1}\beta < a_N \beta^2,$$
and by induction $a_{N+k} < a_N \beta^k$. Thus, for all $n \ge N$,
$$a_n \le a_N \beta^{n-N} = a_N \beta^{-N} \beta^n.$$
Writing $K$ for the constant $a_N \beta^{-N}$, we have
$$a_n \le K \beta^n, \quad \text{for } n \ge N.$$
Thus,
$$a_n^{1/n} \le K^{1/n} \beta, \quad \text{for } n \ge N.$$
Taking the $\limsup$ of both sides (recalling that the limit superior does not change if we change a finite number of terms) we have
$$\limsup_n a_n^{1/n} \le \limsup_n K^{1/n} \beta.$$
But if a limit exists, it is also the limit superior, so
$$\limsup_n K^{1/n} = \lim_n K^{1/n} = 1,$$
and hence,
$$\limsup_n a_n^{1/n} \le \beta.$$
But $\beta$ was arbitrary $> \alpha$. Hence,
$$\limsup_n a_n^{1/n} \le \alpha = \limsup_n \frac{a_{n+1}}{a_n}.$$
The inequality involving limit inferior is proved the same way.

27.1. (Ratio Comparison Test) For series $\sum_n a_n$ and $\sum_n b_n$ of positive terms, if $\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}$ for all $n$ sufficiently large, then convergence of $\sum_n b_n$ implies that of $\sum_n a_n$.

27.2. Use the corresponding results about sequences of real numbers to prove: (Linearity) Let $a_n, b_n \in \mathbb{R}$, for all $n \in \mathbb{N}$.
(a) If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series then $\sum_{n=1}^{\infty} (a_n + b_n)$ is convergent with
$$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n.$$
(b) If $\sum_{n=1}^{\infty} a_n$ is convergent and $c \in \mathbb{R}$, then $\sum_{n=1}^{\infty} c\,a_n$ is convergent with
$$\sum_{n=1}^{\infty} c\,a_n = c \sum_{n=1}^{\infty} a_n.$$

27.3. For the following series, determine whether the series converges or diverges. If it is convergent, find its sum.
(a) $\sum_{n=1}^{\infty} \frac{1}{n^{1/n}}$.
(b) $\sum_{n=1}^{\infty} \frac{3}{n(n+1)}$.
(c) $\sum_{k=1}^{\infty} \frac{k^2 + 5}{3k^3 + 2k - 1}$.

27.4. For the following series, determine whether the series converges or diverges. If it is convergent, find its sum.
(a) $\sum_{n=1}^{\infty} \frac{1}{2^n}$.
(b) $\sum_{n=1}^{\infty} \frac{1}{(3n-2)(3n+1)}$.
(c) $\sum_{n=1}^{\infty} \frac{n}{\sqrt{1+n^2}}$.
(d) $\sum_{n=1}^{\infty} \frac{3^n + 2^n}{6^n}$.

27.5. Test the following for convergence or divergence.
(a) $\sum_{n=1}^{\infty} \frac{2^n}{3^n + 1}$.
(b) $\sum_{n=1}^{\infty} (-1)^n \frac{n^2}{n^2 + 1}$.
(c) $\sum_{n=1}^{\infty} (-1)^n \frac{\sqrt{n} - 1}{n + 4}$.
(d) $\sum_{n=1}^{\infty} \frac{1 - \cos n}{n^{4/3}}$.

27.6. For a real number $x$, $x^+ = \max\{x, 0\}$ and $x^- = \max\{-x, 0\}$, so that $x = x^+ - x^-$ and $|x| = x^+ + x^-$.
(a) If $\sum_n a_n$ converges absolutely, then both $\sum_n a_n^+$ and $\sum_n a_n^-$ converge.
(b) If $\sum_n a_n$ converges, but not absolutely, then both $\sum_n a_n^+$ and $\sum_n a_n^-$ diverge.

27.7. A rearrangement of a series $\sum_{n=1}^{\infty} a_n$ is a series $\sum_{n=1}^{\infty} a_{k_n}$, where $n \mapsto k_n$ is a bijective map of $\mathbb{N}$ onto $\mathbb{N}$.
(a) If the series $\sum_{n=1}^{\infty} a_n$ converges absolutely with sum $s$, then every rearrangement of this series converges with the same sum.
(b) If a series of real numbers $\sum_{n=1}^{\infty} a_n$ converges, but not absolutely, then for each $x \in \mathbb{R}$, there is a rearrangement which converges to $x$. In fact, for each $\alpha, \beta$ with $-\infty \le \alpha \le \beta \le +\infty$, there is a rearrangement $\sum_n a_n'$ with partial sums $s_n'$ such that
$$\liminf_n s_n' = \alpha \quad \text{and} \quad \limsup_n s_n' = \beta.$$

28. LIMITS OF FUNCTIONS

Let $X$ be a set in a metric space $S$ and let $Y$ be another metric space and $f : X \to Y$. If $c \in S$, we say $f(x)$ converges to $L$ as $x$ tends to $c$, and write
$$f(x) \to L \text{ as } x \to c$$
iff for each $\varepsilon > 0$, there exists $\delta > 0$ such that for $x \in X$,
$$0 < d_S(x, c) < \delta \implies d_Y(f(x), L) < \varepsilon.$$
In other words, for all $\varepsilon > 0$ there exists $\delta > 0$ such that
$$f\bigl(B(c, \delta) \setminus \{c\}\bigr) \subset B(L, \varepsilon). \qquad (*)$$
One also writes $\lim_{x \to c} f(x) = L$. (This notation has a slight flaw, as we shall see shortly.) Here, $d_S$ denotes the distance in $S$ and $d_Y$ denotes the distance in $Y$. Often we may just use the letter $d$ in both places.
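As a numerical sanity check of the $\varepsilon$-$\delta$ definition, here is a minimal Python sketch. The helper `delta_works` and the test function $f(x) = 2x + 1$ at $c = 1$ (where $L = 3$ and $\delta = \varepsilon/2$ works) are our own illustrative choices, not part of the notes. A finite sample can refute a proposed $\delta$ but can never prove a limit; only the inequality argument does that.

```python
from typing import Callable

def delta_works(f: Callable[[float], float], c: float, L: float,
                eps: float, delta: float, samples: int = 10_000) -> bool:
    """Return False if some sampled x with 0 < |x - c| < delta has |f(x) - L| >= eps.

    This only samples finitely many points, so True is evidence, not a proof.
    """
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)      # 0 < h < delta
        for x in (c - h, c + h):           # points on both sides of c, never c itself
            if not abs(f(x) - L) < eps:
                return False
    return True

def f(x: float) -> float:
    """Hypothetical test function f(x) = 2x + 1, with limit 3 at c = 1."""
    return 2.0 * x + 1.0

eps = 0.1
print(delta_works(f, 1.0, 3.0, eps, eps / 2))  # delta = eps/2 is the choice the algebra suggests
print(delta_works(f, 1.0, 3.0, eps, eps))      # delta = eps is too generous for this f
```

Here $|f(x) - 3| = 2|x - 1|$, so $\delta = \varepsilon/2$ passes on every sample while $\delta = \varepsilon$ is rejected.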
By the way, in the expression $(*)$, since $f$ is only defined on $X$ and $c$ is excluded, it doesn't matter whether we use the ball in $S$, $B(c, \delta) = \{x \in S : d_S(c, x) < \delta\}$, or the corresponding ball in $X$, namely $B_X(c, \delta) = X \cap B(c, \delta)$.

28.1 Example. Let $f(x) = x^2 + 2x + 6$, for $x \in \mathbb{R}$. Then, $\lim_{x \to 3} f(x) = 21$.

Proof. Here, $X = S = Y = \mathbb{R}$: distances are given in terms of absolute values. Let $\varepsilon > 0$. Then,
$$\begin{aligned}
|f(x) - 21| < \varepsilon &\iff |x^2 + 2x + 6 - 21| < \varepsilon \\
&\iff |x^2 + 2x - 15| < \varepsilon \\
&\iff |x - 3|\,|x + 5| < \varepsilon.
\end{aligned}$$
Now, if $|x - 3| < 1$, we will have
$$|x + 5| \le |x - 3| + |3 + 5| \le 1 + 8 = 9,$$
so that
$$|f(x) - 21| \le |x - 3| \cdot 9.$$
Thus, $|f(x) - 21|$ will be $< \varepsilon$ provided $|x - 3| \cdot 9 < \varepsilon$ and $|x - 3| < 1$. Thus, we take $\delta = \min\{1, \varepsilon/9\}$. Then, $|x - 3| < \delta$ implies
$$|f(x) - 21| = |x - 3|\,|x + 5| \le |x - 3| \cdot 9 < \frac{\varepsilon}{9} \cdot 9 = \varepsilon,$$
as required to prove $\lim_{x \to 3} f(x) = 21$.

As you can see, the techniques we have developed for limits of sequences seem to apply for limits of functions. Actually, there is a close connection, as we will see, known as the sequential criterion for convergence.

The set $B'(c, \delta) = B(c, \delta) \setminus \{c\}$ is often called the deleted neighbourhood of $c$ of radius $\delta$, or the deleted open ball about $c$ of radius $\delta$. The definition of convergence of the function $f$ to $L$ as $x \to c$ becomes:

For each neighbourhood $V$ of $L$ there exists a deleted neighbourhood $U$ of $c$ with $f(U) \subset V$.

Recall that, by definition (section 18), a point $c$ is an accumulation point of a set $A$ iff $B(c, \delta) \cap (A \setminus \{c\}) \ne \emptyset$ for every $\delta > 0$, and this can be rewritten as $B'(c, \delta) \cap A \ne \emptyset$. (A point of $A$ which is not an accumulation point is called an isolated point of $A$.)

So points of $A$ come in 2 kinds: accumulation and isolated. But be careful: accumulation points need not be points of $A$; $A \cup \operatorname{acc}(A) = \operatorname{cl} A$.

Uniqueness. If $c$ is an isolated point of the domain of the function $f$, the limit of $f$ at $c$ has little meaning, as we will describe shortly, but if $c$ is an accumulation point of the domain, the limit is unique.
To see this, we recall a concept mentioned briefly in the discussion of convergence of sequences:

28.2 Theorem (The Hausdorff Property). In any metric space, if $x \ne y$, there exist neighbourhoods $U_x$ of $x$ and $U_y$ of $y$ with $U_x \cap U_y = \emptyset$.

The proof of this result, which we gave in theorem 21.9, is a triviality, but the result is enormously important, so let's go through it again.

Proof. Since $x \ne y$, $d(x, y) > 0$. Take $\delta$ any positive number $\le d(x, y)/2$, $U_x = B(x, \delta)$ and $U_y = B(y, \delta)$. Then $U_x$ and $U_y$ are neighbourhoods of $x$ and $y$, respectively, and $U_x \cap U_y = \emptyset$. Indeed, if $z \in U_x \cap U_y$, then $d(x, y) \le d(x, z) + d(z, y) < \delta + \delta \le d(x, y)$, an impossibility.

28.3 Theorem (Uniqueness of limits). Let $S$ and $Y$ be metric spaces. Suppose $c$ is an accumulation point of $X \subset S$, $f : X \to Y$. If $f(x) \to L_1$ as $x \to c$ and $f(x) \to L_2$ as $x \to c$, then $L_1 = L_2$.

Proof. Suppose $L_1 \ne L_2$. By the Hausdorff property, there exist neighbourhoods $V_1$ of $L_1$ and $V_2$ of $L_2$ such that $V_1 \cap V_2 = \emptyset$. By definition, there exist deleted neighbourhoods $U_1$ of $c$ and $U_2$ of $c$ with $f(U_1) \subset V_1$ and $f(U_2) \subset V_2$. Now $U_1 \cap U_2$ is still a deleted neighbourhood of $c$. (Take out your $\delta$'s and check.) Thus, $U_1 \cap U_2 \cap X \ne \emptyset$. But if $x \in U_1 \cap U_2 \cap X$, then $f(x) \in V_1 \cap V_2$, which is impossible.

The theorem we have just proved is the justification of the notation
$$\lim_{x \to c} f(x) = L.$$
However, limits of functions can be non-unique in a certain situation.

28.4 Theorem. If $c$ is not an accumulation point of $X$, the domain of the function $f : X \to Y$, then $f(x) \to y$ as $x \to c$, for all $y \in Y$.

Proof. That $c$ is not an accumulation point of $X$ means there exists a deleted neighbourhood $U$ of $c$ which does not intersect $X$. Thus, if $V$ is a neighbourhood of $y$,
$$f(U) = f(X \cap U) = f(\emptyset) = \emptyset \subset V.$$

This is so disturbing to some people that they refuse to talk about limits at an isolated point of $X$. They just define this behaviour away. We prefer to allow $c$ to be a non-accumulation point, so that a function continuous at $c$ will converge at $c$ to $f(c)$ even in this case.
See 30.1.

If we use a formula for something, it should have a unique definition. In case $c$ is an accumulation point of the domain of $f$, the limit is unique, so this is fine. If $c$ is not an accumulation point, it is not. Nevertheless, it is somewhat common to use this notation even in the latter case.

One more thing: you will notice that it is not necessary for $c$ to be a member of the domain of the function to define a limit at $c$.

28.5 Theorem (The sequential criterion for convergence). Let $S$, $Y$ be metric spaces, $c \in S$, $X \subset S$ and $f : X \to Y$. Then
$$f(x) \to L \text{ as } x \to c \iff f(x_n) \to L \text{ for each sequence } (x_n) \text{ in } X \setminus \{c\} \text{ converging to } c.$$
We sometimes abbreviate "sequential criterion" as "SC".

Proof. ($\Longrightarrow$) Suppose $f(x) \to L$ as $x \to c$. Let $(x_n)$ be a sequence in $X \setminus \{c\}$ with $x_n \to c$. We are to show $f(x_n) \to L$. Let $\varepsilon > 0$. Then, there exists $\delta > 0$ such that $f(x) \in B(L, \varepsilon)$, whenever $x \in X \cap B(c, \delta) \setminus \{c\}$. But $x_n \to c$ and $x_n \ne c$ for all $n$, so there exists $N$ such that $n > N$ implies $x_n \in B(c, \delta) \setminus \{c\}$. For this $N$, and for $n > N$, we therefore have $f(x_n) \in B(L, \varepsilon)$ as required.

($\Longleftarrow$) We prove this by establishing the contrapositive. Assume $f(x)$ does not converge to $L$ as $x \to c$. That is, there exists $\varepsilon > 0$ such that for all $\delta > 0$, there exists $x \in B(c, \delta) \cap X$, with $x \ne c$, and $f(x) \notin B(L, \varepsilon)$. Fix such an $\varepsilon > 0$. (Today it is delta's turn to take on a lot of identities.) For each $n \in \mathbb{N}$, take $\delta = 1/n$ in the above statement and choose
$$x_n \in B(c, 1/n) \cap X, \text{ with } x_n \ne c, \text{ but } f(x_n) \notin B(L, \varepsilon).$$
Since $1/n \to 0$, $(x_n)$ is a sequence in $X \setminus \{c\}$ for which $x_n \to c$, yet there is no $n$ for which $f(x_n)$ belongs to $B(L, \varepsilon)$; the sequence $(f(x_n))$ certainly doesn't converge to $L$.

For those who prefer distances to neighbourhoods: we constructed $(x_n)$ with $x_n \in X$,
$$0 < d(x_n, c) < \frac{1}{n}, \text{ but } d(f(x_n), L) \ge \varepsilon, \text{ for all } n \in \mathbb{N}.$$

28.6 Example. Let $f(x) = \sin(1/x)$, for $x \ne 0$, so the domain of $f$ is $X = \mathbb{R} \setminus \{0\}$. Then $f(x)$ does not converge to anything as $x \to 0$.
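To see numerically why Example 28.6 has to fail the sequential criterion of Theorem 28.5, the following Python sketch evaluates $f$ along two sequences tending to 0. The particular sequences `x_seq` and `y_seq` are illustrative assumptions of ours, chosen so that the values of $f$ sit near 0 and near 1 respectively; floating-point output is only approximate and is no substitute for a proof.

```python
import math

def f(x: float) -> float:
    """f(x) = sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)

def x_seq(n: int) -> float:
    """x_n = 1/(n*pi): nonzero, tends to 0, and sin(1/x_n) = sin(n*pi) = 0."""
    return 1.0 / (n * math.pi)

def y_seq(n: int) -> float:
    """y_n = 1/(2*n*pi + pi/2): nonzero, tends to 0, and sin(1/y_n) = 1."""
    return 1.0 / (2 * n * math.pi + math.pi / 2)

for n in (1, 10, 100, 1000):
    # Both arguments shrink to 0, but the function values stay near 0 and near 1.
    print(n, x_seq(n), f(x_seq(n)), y_seq(n), f(y_seq(n)))
```

If $f$ had a limit $L$ at 0, both sequences of values would have to converge to the same $L$, which they do not.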
[Figure: graph of $f(x) = \sin(1/x)$ for $-1.5 \le x \le 1.5$, with values between $-1$ and $1$.]

Proof. Let $x_n = 1/(n\frac{\pi}{2})$, for each $n$. Then $f(x_1) = 1$, $f(x_2) = 0$, $f(x_3) = -1$, $f(x_4) = 0$, .... We see that the sequence $(f(x_n))$ does not converge. But $x_n \ne 0$ and $x_n \to 0$, so the sequential criterion isn't satisfied. Thus, $f(x)$ does not converge as $x \to 0$.

Let $f, g$ be functions on a set $X$ to $\mathbb{R}$. Then the functions $f + g$, $fg$, $f/g$ are "defined pointwise" as follows.
(1) $f + g$ is defined on $X$ by $(f + g)(x) = f(x) + g(x)$.
(2) $fg$ is defined on $X$ by $(fg)(x) = f(x)g(x)$.
(3) $f/g$ is defined on $\{x \in X : g(x) \ne 0\}$ by $(f/g)(x) = f(x)/g(x)$.

28.7 Theorem. Let $X$ be a subset of a metric space, $f : X \to \mathbb{R}$, $g : X \to \mathbb{R}$, $c \in X$, with $f(x) \to L$ and $g(x) \to M$ as $x \to c$. Then
(1) $(f + g)(x) \to L + M$ as $x \to c$ (sum law)
(2) $(fg)(x) \to LM$ as $x \to c$ (product law)
(3) $(f/g)(x) \to L/M$ as $x \to c$ (quotient law), provided $M \ne 0$.

We will prove (1) using the definition of limit, and (2) using the sequential criterion (SC). (3) is left as an exercise. I think you will find it easiest using the SC, but it is instructive to try it both ways.

Proof. (1) Let $\varepsilon > 0$. Since $f(x) \to L$ as $x \to c$, there exists $\delta_1 > 0$ such that
$$x \in X \text{ and } 0 < d(x, c) < \delta_1 \text{ imply } |f(x) - L| < \varepsilon/2,$$
and similarly there exists $\delta_2 > 0$ such that
$$x \in X \text{ and } 0 < d(x, c) < \delta_2 \text{ imply } |g(x) - M| < \varepsilon/2.$$
Put $\delta = \min\{\delta_1, \delta_2\}$; then $x \in X$ and $0 < d(x, c) < \delta$ imply
$$|(f + g)(x) - (L + M)| \le |f(x) - L| + |g(x) - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary,
$$\forall \varepsilon > 0,\ \exists \delta > 0,\ \forall x \in X,\ 0 < d(x, c) < \delta \implies |(f + g)(x) - (L + M)| < \varepsilon.$$
That is, $(f + g)(x) \to L + M$ as $x \to c$.

(2) Let $(x_n)$ be an arbitrary sequence in $X \setminus \{c\}$ converging to $c$. Then $f(x_n) \to L$ and $g(x_n) \to M$, by the SC, and $(fg)(x_n) = f(x_n)g(x_n)$ by definition. Therefore, by the product law for sequences,
$$(fg)(x_n) \to LM.$$
Since $(x_n)$ was arbitrary, $fg$ satisfies the SC for convergence and $(fg)(x) \to LM$ as $x \to c$.

28.1.
The sequential criterion has a stronger form, without reference to a particular limit: If for each $(x_n)$ in $X \setminus \{c\}$ converging to $c$, $(f(x_n))$ converges, then $f(x)$ converges as $x \to c$. (The point here is that at first sight, the limit of $f(x_n)$ could be different for different sequences $(x_n)$, yet it is part of the conclusion that there is only one such limit, and then $f(x)$ converges to it as $x \to c$.)

Left and right limits. For a function $f$ defined on a subset $X$ of the reals, and $c \in \mathbb{R}$, $f(x)$ tends to $L$ as $x \to c$ from the left means the restriction of $f$ to $X \cap (-\infty, c]$ converges to $L$ as $x \to c$. Then $L$ is called the left limit or left-hand limit of $f$ at $c$, denoted
$$f(c-) = \lim_{\substack{x \to c \\ x < c}} f(x) \quad \text{or} \quad \lim_{x \to c-} f(x).$$
Be very careful using this notation. There is no point $c-$. $f(c-)$ makes sense, but $c-$ does not. (Also if $c$ is not an accumulation point of $X \cap (-\infty, c]$, one shouldn't use the limit notation, since the limit is not unique.)

Similarly, $f(x)$ converges to $L$ as $x \to c$ from the right means the restriction of $f$ to $X \cap [c, \infty)$ does. The corresponding notation is
$$f(c+) = \lim_{\substack{x \to c \\ x > c}} f(x) \quad \text{or} \quad \lim_{x \to c+} f(x).$$
You can prove that $\lim_{x \to c} f(x) = L$ if and only if $f(c-) = f(c+) = L$. (See exercise 28.2 below.)

A real-valued function $f$ is called increasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) \le f(x_2)$ and decreasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) \ge f(x_2)$. A real-valued function $f$ is called strictly increasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) < f(x_2)$ and strictly decreasing on $X$ if $x_1 < x_2$ in $X$ implies $f(x_1) > f(x_2)$. A function is called monotone if it is either increasing or decreasing and strictly monotone if it is either strictly increasing or strictly decreasing.

You should prove as an exercise:

28.8 Theorem (Convergence of monotone functions). If $f$ is a monotone function defined on an interval $I$ of the reals, then for each $c \in I$, the limits from the right and from the left exist.
In fact, if $f$ is increasing, $f(c-) = \sup_{x < c} f(x) \le f(c)$ if $c$ is not the left endpoint of $I$ and $f(c+) = \inf_{x > c} f(x) \ge f(c)$ if $c$ is not the right endpoint of $I$. If $f$ is decreasing, we have the same result with supremum and infimum interchanged and the inequalities reversed. The proofs follow the pattern of the corresponding results for monotone sequences.

28.2. Let $X, Y$ be metric spaces and $A$ and $B$ be subsets of $X$ with union $X$, $f : X \to Y$, $c \in X$. Prove that $\lim_{x \to c} f(x) = L$ iff $\lim_{x \to c} (f|A)(x) = L$ and $\lim_{x \to c} (f|B)(x) = L$. Often, in application, $X$ is an interval of $\mathbb{R}$ and $A = X \cap (-\infty, c]$, $B = X \cap [c, +\infty)$. So, we deduce $\lim_{x \to c} f(x) = L$ if and only if $f(c-) = f(c+) = L$.

28.3. Formulate and prove sum, product, and quotient laws (as far as possible) for functions from subsets of metric spaces to $\mathbb{R}^m$. Remember there is no quotient of vectors, and there is a multiplication by a scalar and a dot product of vectors available, depending on the case.

28.4. Prove the result on convergence of monotone functions. What changes would have to be made if $c$ does not belong to the interval?

28.5. (a) Use the definition to prove $\lim_{x \to 5} (x^2 - 3x + 1) = 11$.
(b) Use the definition to prove $\lim_{x \to 2} x^3 = 8$.

28.6. Let $f : X \to \mathbb{R}$, $c \in \mathbb{R}$ and suppose $\lim_{x \to c} f(x) = p < 0$. Prove there is a deleted neighbourhood $U$ of $c$ and an $r < 0$ such that $f(x) \le r$ for all $x \in U \cap X$.

28.7. Disprove: If $f(x) \to L$ as $x \to a$ and $g(x) \to a$ as $x \to c$, then $f(g(x)) \to L$ as $x \to c$. (See also problems 30.1 and 30.2.)

28.8. The problem with 28.7 goes away if there exists a deleted neighbourhood $U$ of $c$ with $g(x) \ne a$, for all $x \in U$. A particular case of this, of great use, is the case where $g$ is a one-to-one function.

28.9. Let $f : X \to \mathbb{R}$, $c \in \mathbb{R}$ and suppose $\lim_{x \to c} f(x) = p$. Prove $f$ is bounded on some neighbourhood of $c$; that is, there exists a neighbourhood $U$ of $c$ and $M > 0$ such that $|f(x)| \le M$, for all $x \in U \cap X$.

28.10. Let $f : X \subset S \to \mathbb{R}^n$, where $S$ is a metric space.
Find an analogue of the Cauchy condition for convergence of $f$ at a point $c \in S$. Using your definition, prove that if $f$ is Cauchy at $c$, then $f$ converges at $c$.

29. INFINITE LIMITS OF FUNCTIONS AND LIMITS AT $\pm\infty$

Limits of real-valued functions that involve infinities are not much different from those involving only real numbers, if we keep in mind the meanings of neighbourhoods of $+\infty$ and of $-\infty$.

For a real-valued function defined on a set $X \subset \mathbb{R}$ and $c, L$ in $\overline{\mathbb{R}} = [-\infty, +\infty]$, we say $f(x)$ converges to $L$ as $x$ tends to $c$, if for each neighbourhood $V$ of $L$, there exists a deleted neighbourhood $U$ of $c$ such that for $x \in X \cap U$, $f(x) \in V$.

Let's write out what that means in some special cases. The neighbourhoods of $+\infty$ are of the form $(M, +\infty]$ where $M$ is any real number; the neighbourhoods of $-\infty$ are of the form $[-\infty, M)$, where $M$ is any real number. The deleted neighbourhoods of $+\infty$ are of the form $(M, +\infty)$ and the deleted neighbourhoods of $-\infty$ are of the form $(-\infty, M)$.

Thus, if $c \in \mathbb{R}$, $f(x) \to +\infty$ as $x \to c$, or
$$\lim_{x \to c} f(x) = +\infty,$$
means for all $M \in \mathbb{R}$, there exists $\delta > 0$ such that, for all $x \in X \cap B(c, \delta) \setminus \{c\}$, $f(x) > M$:
$$x \in X \text{ and } 0 \ne |x - c| < \delta \implies f(x) > M.$$
Similarly, if $L \in \mathbb{R}$, $f(x) \to L$ as $x \to -\infty$,
$$\lim_{x \to -\infty} f(x) = L,$$
means for all $\varepsilon > 0$ there exists $r$ such that, for all $x \in X \cap (-\infty, r)$, $f(x) \in B(L, \varepsilon)$.

Most of the algebraic operations are preserved for infinite limits, provided the quantities are defined. (See the section THE EXTENDED REAL NUMBER SYSTEM, for the definition of the operations.) The only exceptions involve 0 times $+\infty$ or 0 times $-\infty$. Of course, $+\infty - \infty$, and $+\infty / {+\infty}$ and the like are not defined, so there is no corresponding theorem for such a situation.

29.1 Example. Let $f$ and $g$ be real-valued functions defined on $X$. If $\lim_{x \to c} f(x) = p \in \mathbb{R}$ and $\lim_{x \to c} g(x) = +\infty$, then $\lim_{x \to c} (f + g)(x) = +\infty$.

Proof. Let $M \in \mathbb{R}$. Since $f(x) \to p$, there is a deleted neighbourhood $U_1$ of $c$ such that
$$f(x) > p - 1 \quad \text{for } x \in U_1 \cap X.$$
Since $g(x) \to$
$+\infty$, there is a deleted neighbourhood $U_2$ of $c$ such that
$$g(x) > M - (p - 1) \quad \text{for } x \in U_2 \cap X.$$
Put $U = U_1 \cap U_2$. Then, for $x \in U \cap X$,
$$(f + g)(x) = f(x) + g(x) > p - 1 + M - (p - 1) = M.$$
Thus, for each $M \in \mathbb{R}$, there exists a deleted neighbourhood $U$ of $c$ such that for all $x \in U \cap X$, $(f + g)(x) > M$, as required to prove $(f + g)(x) \to +\infty$ as $x \to c$.

The above proof has been written so that it works whether $c$ is real or infinite. If $c \in \mathbb{R}$, you could use $B'(c, \delta_1)$ for $U_1$ and $B'(c, \delta_2)$ for $U_2$. Then, $U$ would become $B'(c, \delta)$, where $\delta = \min\{\delta_1, \delta_2\}$. If $c = -\infty$, you could use $U_1 = (-\infty, r_1)$ and $U_2 = (-\infty, r_2)$. Then $U = (-\infty, r)$, where $r = \min\{r_1, r_2\}$.

The reader can check that in the definitions of convergence to $+\infty$, it is sufficient to use neighbourhoods of the form $(M, +\infty]$, with $M > 0$, and for convergence to $-\infty$, one can use $[-\infty, M)$, with $M < 0$, so that it is easier to work with inequalities.

29.1. Formulate and prove the various limit theorems indicated in this discussion.

29.2. State and prove sequential criteria for infinite limits and limits at an infinity.

29.3. If $h$ is defined (at least) in $[a, \infty)$, $a > 0$, and $H(t) = h(1/t)$ for $0 < t < 1/a$, then
$$\lim_{x \to \infty} h(x) = \lim_{t \to 0+} H(t),$$
whenever one side exists in $\overline{\mathbb{R}}$.

29.4. Every unbounded monotone function on an interval of $\mathbb{R}$ has an infinite limit. Make more precise statements of this principle and prove them.

29.5. Give an example, with proof, of a function, defined on an interval, which is not monotone but converges to $+\infty$.

30. CONTINUITY OF FUNCTIONS

Let $X$ be a subset of a metric space $S$ and $f : X \to Y$, another metric space. Then $f$ is called continuous at $a \in X$ iff for every $\varepsilon > 0$, there exists $\delta > 0$ such that $d(f(x), f(a)) < \varepsilon$, whenever $x \in X$ and $d(x, a) < \delta$. In terms of neighbourhoods, $f : X \to Y$ is continuous at $a \in X$ if for each neighbourhood $V$ of $f(a)$, there exists a neighbourhood $U$ of $a$ with $f(U) \subset V$.
Here we remember that a $\delta$-neighbourhood in $X$ is of the form $B_X(a, \delta) = X \cap B_S(a, \delta)$.

If $A \subset X$, $f$ is called continuous on $A$ if it is continuous at each point of $A$; $f$ is called simply continuous if it is continuous at each point of its domain. We say $f$ is discontinuous at the point $a$, if $a \in X$ and $f$ is not continuous at $a$. If $a$ is not a point of the domain of $f$, $f$ is neither continuous nor discontinuous at $a$: it is not "at $a$" at all.

If $a$ is not an accumulation point of the domain $X$ of $f$ (that is, if $a$ is an isolated point of $X$), then we see that $f$ is automatically continuous at $a$. Indeed, if $a$ is isolated, then there exists $\delta > 0$ such that $B_X(a, \delta) = B_S(a, \delta) \cap X = \{a\}$; thus for $x \in B_X(a, \delta)$,
$$d(f(x), f(a)) = d(f(a), f(a)) = 0 < \varepsilon,$$
no matter what the given $\varepsilon > 0$ was. Looked at another way, if $V$ is a neighbourhood of $f(a)$, then $f(B_X(a, \delta)) = \{f(a)\} \subset V$.

30.1 Theorem. Let $S$, $Y$ be metric spaces, $X \subset S$, $f : X \to Y$ and $a \in S$. Then $f$ is continuous at $a$ iff $\lim_{x \to a} f(x) = f(a)$.

This is an immediate consequence of the definitions. We emphasize that the condition means 3 things: (1) the limit $\lim_{x \to a} f(x)$ exists, (2) $f(a)$ exists (that is, $a \in X$) and (3) the two sides are equal.

If $S$ is a metric space and $X$ is a subset of $S$, we know that $X$ becomes a metric space with the induced metric. Since continuity of a function $f$ doesn't involve points outside the domain of $f$, often we might as well state theorems assuming the whole space as the domain.

30.2 Theorem (Sequential criterion for continuity). Let $X$ and $Y$ be metric spaces, $f : X \to Y$, and $a \in X$. Then $f$ is continuous at $a$ iff for each sequence $(x_n)$ in $X \setminus \{a\}$ converging to $a$, $f(x_n) \to f(a)$, or alternatively, iff for each sequence $(x_n)$ in $X$ converging to $a$, $f(x_n) \to f(a)$.

Proof. Since $f$ is continuous at $a$ iff $\lim_{x \to a} f(x) = f(a)$, the first version is an immediate consequence of the sequential criterion for limits; namely,
$$\lim_{x \to a} f(x) = L$$
iff for each $(x_n)$ in $X \setminus \{a\}$ converging to $a$, $f(x_n) \to$
$L$.

The proof of the second form of this is essentially the same as the proof of the limit version; the difference is just that we don't have to pay special attention to $a$. Here is the detail:

Assume $f$ is continuous at $a$. Let $x_n \to a$, $x_n \in X$ for all $n \in \mathbb{N}$. Let $V$ be a neighbourhood of $f(a)$. Then there is a neighbourhood $U$ of $a$ such that $f(U) \subset V$. But since $x_n \to a$, there exists $N \in \mathbb{N}$ such that for $n > N$, $x_n \in U$ and hence $f(x_n) \in V$. Thus, for every neighbourhood $V$ of $f(a)$, there exists $N$ with $f(x_n) \in V$ for $n > N$. That is, $f(x_n) \to f(a)$.

For the converse, we assume $f$ is not continuous at $a$. Then, there exists a neighbourhood $V$ of $f(a)$ such that there is no neighbourhood $U$ of $a$ with $f(U) \subset V$. For each $n \in \mathbb{N}$, then, there exists $x_n \in B_X(a, \frac{1}{n})$ such that $f(x_n) \notin V$. Since $\frac{1}{n} \to 0$ we have $x_n \to a$. Thus, $(x_n)$ is a sequence in $X$ converging to $a$, yet $f(x_n)$ does not converge to $f(a)$.

30.3 Example. Let $f : \mathbb{R} \to \mathbb{R}$ be the indicator function of the rationals $1_{\mathbb{Q}}$:
$$f(x) = 1_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}$$
Then $f$ is discontinuous at each point of $\mathbb{R}$.

One way to prove this is by the sequential criterion. If $a \in \mathbb{Q}$, we use the fact that $\mathbb{Q}^c$ is dense in $\mathbb{R}$ to obtain a sequence $(x_n)$ of irrationals converging to $a$, but then $f(x_n) = 0 \to 0$ and $f(a) = 1$, so $(f(x_n))$ does not converge to $f(a)$. So $f$ is not continuous at $a$. On the other hand, if $a \notin \mathbb{Q}$ then (since $\mathbb{Q}$ is dense in $\mathbb{R}$) there is a sequence $(x_n)$ in $\mathbb{Q}$ with $x_n \to a$ but $f(x_n) = 1$, which doesn't converge to $0 = f(a)$, and again, by the sequential criterion, $f$ is not continuous at $a$.

Actually, a slight change in wording of the proof we have given shows that $f$ does not even have a limit at any point of $\mathbb{R}$.

30.4 Example (Dirichlet's function). We now give an example of a function on $[0, 1]$ which is continuous exactly on the irrationals of this interval.
We put, for $x \in [0, 1]$,
$$f(x) = \begin{cases} \dfrac{1}{n} & \text{if } x = \dfrac{m}{n} \text{ rational in lowest terms} \\ 0 & \text{if } x \text{ is irrational.} \end{cases}$$
(Lowest terms are necessary to make the function "well-defined", that is, to have a unique value for each $x$.)

[Figure: graph of the Dirichlet function on $[0, 1]$; labelled points include $f(0) = 1$, $f(1) = 1$, $f(1/2) = 1/2$, $f(1/3) = 1/3$, $f(2/3) = 1/3$, and two points at height $1/4$.]

We will prove that for each $a \in [0, 1]$, $\lim_{x \to a} f(x) = 0$. Since $f(a) = 0$ if $a$ is irrational, and $f(a) \ne 0$ if $a$ is rational, this shows $f$ is continuous at each irrational and discontinuous at each rational.

Fix $\varepsilon > 0$. By the Archimedean property, there exists $k$ such that $1/k < \varepsilon$. Now, let $F$ be the set of rationals in $[0, 1]$ with denominators $< k$. This set is finite, so there exists a $\delta > 0$ such that $B'(a, \delta) \cap F = \emptyset$. Thus, for each rational $\frac{m}{n}$ (in lowest terms) in $B'(a, \delta) \cap [0, 1]$, $n \ge k$ and hence,
$$\left| f\!\left(\frac{m}{n}\right) - 0 \right| = \left| \frac{1}{n} - 0 \right| = \frac{1}{n} \le \frac{1}{k} < \varepsilon.$$
For each irrational $x$ in $B(a, \delta) \cap [0, 1]$, $f(x) = 0$, so again $|f(x) - 0| < \varepsilon$. Thus, for all $x \in [0, 1]$, $0 < |x - a| < \delta$ implies $|f(x) - 0| < \varepsilon$, so that $\lim_{x \to a} f(x) = 0$.

The sum, product, and quotient of continuous functions are continuous:

30.5 Theorem. Let $f, g$ be functions on $X$ to $\mathbb{R}$. If $f, g$ are continuous at $a$ then
(a) the functions $f + g$ and $fg$ are continuous at $a$;
(b) $f/g$ is continuous at $a$, provided $g(a) \ne 0$.
We recall that $f/g$ is defined on the set of those $x \in X$ for which $g(x) \ne 0$.

Proof. This follows from the corresponding theorem about limits. For example,
$$\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = f(a) + g(a) = (f + g)(a).$$

30.6 Examples. (1) Each constant function defined by $f(x) = c$ for all $x \in \mathbb{R}$ is continuous on $\mathbb{R}$. Indeed, for each $\varepsilon > 0$, we can take any $\delta > 0$, and get $|f(x) - f(a)| = 0 < \varepsilon$ for all $x, a$, in particular when $|x - a| < \delta$.
(2) The identity function $i(x) = x$ is continuous. For a given $\varepsilon > 0$, the choice $\delta = \varepsilon$ satisfies the definition:
$$|x - a| < \delta \implies |i(x) - i(a)| = |x - a| < \varepsilon.$$
(3) Let $p$ be a polynomial function.
Then there exist $n \in \mathbb{N}$ and constants $a_0, a_1, \ldots, a_n$ such that
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$
Thus, using (1) and (2) and induction with the fact that the sum and product of continuous functions is continuous, we see that $p$ is continuous on $\mathbb{R}$.
(4) A rational function is by definition the quotient of two polynomial functions, and is thus continuous on its domain.

The composition of two continuous functions is continuous.

30.7 Theorem. Let $X, Y, Z$ be metric spaces. Let $f : X \to Y$ and $g : Y \to Z$. If $f$ is continuous at $a \in X$ and $g$ is continuous at $f(a)$ then $g \circ f$ is continuous at $a$.

Proof. Let $W$ be a neighbourhood of $(g \circ f)(a)$. Since $g$ is continuous at $f(a)$, there is a neighbourhood $V$ of $f(a)$ such that
$$g(V) \subset W.$$
But $f$ is continuous at $a$, so there exists a neighbourhood $U$ of $a$ such that
$$f(U) \subset V.$$
Hence,
$$(g \circ f)(U) = g(f(U)) \subset g(V) \subset W,$$
as required.

A function is continuous iff the inverse image of each open set is open.

30.8 Theorem. Let $X$ and $Y$ be metric spaces and $f : X \to Y$. Then $f$ is continuous iff for each $G$ open in $Y$, $f^{-1}(G)$ is open in $X$.

Proof. Suppose $f$ is continuous and let $G$ be open in $Y$. Let $a \in f^{-1}(G)$. Then, $f(a) \in G$, so there is an open ball $V$ centered at $f(a)$ contained in $G$. But then, since $f$ is continuous at $a$, there is a neighbourhood $U$ of $a$ such that $f(U) \subset V$. Thus, $f(U) \subset G$, and hence $U \subset f^{-1}(G)$. Thus each point of $f^{-1}(G)$ is an interior point, so $f^{-1}(G)$ is open.

Conversely, if the condition is satisfied and $V$ is an open ball around $f(a)$, then $f^{-1}(V)$ is an open set containing $a$, so there is an open ball $U$ about $a$ with $U \subset f^{-1}(V)$, in other words, with $f(U) \subset V$, showing $f$ is continuous at $a$.

30.9 Corollary. For $f : X \to Y$, $f$ is continuous iff the inverse image of each closed set in $Y$ is closed in $X$.

Discontinuities of a monotone function. A real function $f$ defined on an open interval $I$ is said to have a jump discontinuity at $c$ if $f(c+)$ and $f(c-)$ exist in $\mathbb{R}$ and $f(c-) \ne f(c+)$.
The difference $f(c+) - f(c-)$ is called the jump of $f$ at $c$. The only kind of discontinuity that a monotone function on an open interval can have is a jump discontinuity. (Why?) If the interval $I$ is not open, the jump at an interior point is defined in the same way, but if $c$ is an endpoint of $I$, one takes the jump to be $f(c+) - f(c)$, if $c$ is the left endpoint of $I$, and $f(c) - f(c-)$, if $c$ is the right endpoint.

30.10 Theorem. If $I$ is an interval of $\mathbb{R}$ and $\varphi : I \to \mathbb{R}$ is monotone, then $\varphi$ has at most a countable number of discontinuities.

Proof. We may assume $\varphi$ is increasing. Then, for all $x \in \operatorname{int}(I)$, $\varphi(x-)$ and $\varphi(x+)$ exist and
$$\varphi(x-) \le \varphi(x) \le \varphi(x+).$$
For $x$ an endpoint, we can only say $\varphi(x) \le \varphi(x+)$ or $\varphi(x-) \le \varphi(x)$. Now, if $x \in \operatorname{int}(I)$ is a point of discontinuity, put $I_x = (\varphi(x-), \varphi(x+))$. If $x$ is an endpoint of $I$ use $I_x = (\varphi(x), \varphi(x+))$, or $(\varphi(x-), \varphi(x))$. Then, $I_x \ne \emptyset$, so we may choose a rational number $r(x) \in I_x$, by density. If $x_1 < x_2$, then $\varphi(x_1+) \le \varphi(x_2-)$, so $I_{x_1} \cap I_{x_2} = \emptyset$, and hence, $r(x_1) \ne r(x_2)$. This shows that the set of discontinuities of $\varphi$ in $I$ is in one-to-one correspondence with a subset of $\mathbb{Q}$, so is countable.

30.1. Discuss how the composition of continuous functions theorem could lead to a formal substitution: if $f(x) \to b$, as $x \to a$, then
$$\lim_{x \to a} g(f(x)) = \lim_{y \to b} g(y).$$

30.2. Let $X, Y, Z$ be metric spaces, $A \subset X$, $B \subset Y$. Let $f : A \to B$ and $g : B \to Z$. If $f(x) \to b$ as $x \to a$ and $g(y) \to c$ as $y \to b$, then $\lim_{x \to a} g(f(x)) = \lim_{y \to b} g(y) = c$ if and only if either (1) $g(b) = c$ (so $g$ is continuous at $b$) or (2) there exists a deleted neighbourhood $U'$ of $a$ such that $f(x) \ne b$, for $x \in U'$. Note that (2) holds in particular if $f$ is one-to-one on $A$ onto $B$.

30.3. The function $g : \mathbb{R} \to \mathbb{R}$ with
$$g(x) = \begin{cases} x, & x \in \mathbb{Q} \\ x^2, & x \notin \mathbb{Q} \end{cases}$$
is continuous at exactly 2 points.

30.4. Give an example, with proof, of a function $f : \mathbb{R} \to \mathbb{R}$ which is continuous at exactly 3 points.

30.5. Prove that the function defined by
$$f(x) = \begin{cases} x \cos(1/x), & x \ne 0 \\ 0, & x = 0 \end{cases}$$
is continuous everywhere.

30.6.
Let $f : X \subset S \to \mathbb{R}^n$, where $S$ is a metric space. Find an analogue of the Cauchy condition for convergence of $f$ at a point $c \in S$. Using your definition, prove that if $f$ is Cauchy at $c$, then $f$ converges at $c$.

30.7. (Removable discontinuity) Let $f : X \to Y$, $a \in X$ and $\lim_{x \to a} f(x) = b \ne f(a)$, so that $f$ is discontinuous at $a$. Then the function $\bar{f}$ defined by $\bar{f}(x) = f(x)$, if $x \ne a$, and $\bar{f}(a) = b$ is continuous at $a$. (The discontinuity has been "removed".) If $f$ is only defined on $X \setminus \{a\}$, the same construction extends $f$ to $X$, in such a way that it is continuous at $a$. (In this case no discontinuity has been removed, but some authors still call $a$ a removable discontinuity.)

30.8. Let $\pi_i : \mathbb{R}^m \to \mathbb{R}$ be defined by $\pi_i(x) = \pi_i(x_1, \ldots, x_m) = x_i$. (This is called the $i$th coordinate map, or the projection onto the $i$th coordinate.) This map is continuous because $|\pi_i(x) - \pi_i(y)| \le |x - y|$. A function $f = (f_1, \ldots, f_m)$ from a metric space to $\mathbb{R}^m$ is continuous at $c$ if and only if the composites $f_i = \pi_i \circ f$ are continuous at $c$, for each $i = 1, \ldots, m$.

30.9. Assume known that the functions $x \mapsto e^x$ and $\sin$ are continuous. Prove the map $g$ on $\mathbb{R}^2$ to $\mathbb{R}$, defined by
$$g(x_1, x_2) = e^{x_1^2} \sin\bigl(1/(x_2^2 + 1)\bigr) + x_2,$$
is continuous everywhere. [Use continuity of the coordinate maps.]

30.10. Prove the function $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$$f(x_1, x_2) = \bigl(x_1^2 \sin(x_1 x_2^2),\ e^{x_2^2}\bigr)$$
is continuous at each point of $\mathbb{R}^2$.

30.11. Discuss the continuity at $(0, 0)$ of
$$g : (x_1, x_2) \in \mathbb{R}^2 \mapsto \begin{cases} \dfrac{\sin(x_1^3 + x_2^2)}{x_1^2 + x_2^2}, & x \ne 0 \\ 1, & x = 0. \end{cases}$$

31. CONTINUITY AND COMPACTNESS

The continuous image of a compact set is compact:

31.1 Theorem. Let $f : X \to Y$ be a continuous function from one metric space $X$ to another one $Y$. If $K$ is a compact subset of $X$, then $f(K)$ is a compact subset of $Y$.

Proof. Let $\mathcal{U}$ be a family of open sets covering $f(K)$. For each $U \in \mathcal{U}$, $f^{-1}(U)$ is open in $X$. Moreover, the family $\{f^{-1}(U) : U \in \mathcal{U}\}$ covers $K$. Indeed,
$$\bigcup_{U \in \mathcal{U}} U \supset f(K),$$
so
[ U 2U f 1 .U / D f 1 [ U K: U 2U 1 .U1 /; : : : ; f 1 Since K is compact, there is a finite subfamily ff .Un /g which also covers K. f 1 .U1 / [ [ f 1 .Un / K: That is f 1 .U1 [ [ Un / K; and so U1 [ [ Un f .K/: Thus, each family of open sets covering f .K/ has a finite subfamily which also covers f .K/, as required. 31.2 Extreme Value Theorem. Each continuous real-valued function on a non-empty compact set assumes a maximum and a minimum value. Notice that this result needs no reference to derivatives. Proof. Let K be compact and non-empty and let f be continuous on K to R. Then f .K/ is compact non-empty, hence it has a minimum and a maximum. This is actually all there is to the theorem, but let’s put it in familiar terms: if y1 D min f .K/ and y2 D max f .K/, then y1 2 f .K/ so there exists x1 2 K with f .x1 / D y1 and similarly there exists x2 2 K with f .x2 / D y2 : Finally for all x 2 K, we obtain f .x/ 2 f .K/, so f .x1 / f .x/ f .x2 /. By the way, avoid saying “x1 is the minimum” here. (It is f .x1 / that is the minimum.) Instead, say “x1 is a minimizer of f at x1 , or f assumes a minimum at x1 . Similarly, x2 is a maximizer of f . 31.1. If X is a compact metric space and f W X ! Y is a continuous bijection, then f continuous. 1 is also 31.2. Let Œa; b be a closed interval of R and W Œa; b ! Rn be one-to-one and continuous. Then, 1 is also continuous (on .Œa; b/. 31.3. The previous result also holds for open intervals .a; b/, but not half open ones .a; b. These problems 31.1, 31.2, 31.3 are important for the study of curves and their length. 132 Continuity and compactness Notes INTRODUCTION TO ANALYSIS 133 32. T HE I NTERMEDIATE VALUE T HEOREM Here is a result on which much of the application of Calculus is based. 32.1 Intermediate Value Theorem. Let I be an interval, f W I ! R be continuous and let a; b 2 I , with a < b. If y is a point strictly between f .a/ and f .b/, then there exists c 2 .a; b/ such that f .c/ D y. Proof. 
We may assume $I = [a, b]$, since the restriction of a continuous function is continuous. We may assume $f(a) < y < f(b)$. (For, if $f(a) > y > f(b)$, we may replace $f$ by $-f$ and $y$ by $-y$.) Let
$$A = f^{-1}((-\infty, y)) \quad\text{and}\quad C = f^{-1}((-\infty, y]).$$
Then $A$ is open in $[a, b]$ and $C$ is closed in $[a, b]$, since $(-\infty, y)$ is open, $(-\infty, y]$ is closed and $f$ is continuous. Now
$$A = \{x : f(x) < y\} \subset C = \{x : f(x) \le y\}.$$
Since $f(a) < y$, $A$ is not empty. Since $A \subset [a, b]$, $A$ is bounded above. Thus we may set $c = \sup A$. Then $c \in \operatorname{cl} A$, $A \subset C$ and $C$ is closed, so $c \in C$. That is, $f(c) \le y$. We know $c < b$, since $f(b) > y$.

Now, suppose $f(c) \ne y$. Then $f(c) < y$, so $c \in A$, which is open in $[a, b]$. Thus, there exists $\delta > 0$ such that $[a, b] \cap B(c, \delta) \subset A$. We know $c + \delta \le b$, since $f(b) > y$, so that $[c, c + \delta) \subset A$. If we let $d$ be any point of $(c, c + \delta)$ we have $c < d \in A$, contradicting the fact that $c$ is an upper bound of $A$ and establishing that $f(c) = y$.

32.2 Remark. Of course, if in the hypothesis of the above theorem we just have $y$ between $f(a)$ and $f(b)$ instead of strictly between, then there still is $c \in [a, b]$ with $f(c) = y$. Indeed, if $y$ is between them but not strictly between them, then either $f(a) = y$ or $f(b) = y$.

32.3 Example. Every 5th degree real polynomial has at least one real root. We use the limit behaviour of such polynomials at the infinities. If $P(x) = a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$, there are 2 cases, $a_5 > 0$ and $a_5 < 0$. Without loss of generality we may assume $a_5 > 0$, since $P(x) = 0$ if and only if $-P(x) = 0$. Then,
$$P(x) = a_5 x^5 \left(1 + \frac{a_4}{a_5 x} + \frac{a_3}{a_5 x^2} + \frac{a_2}{a_5 x^3} + \frac{a_1}{a_5 x^4} + \frac{a_0}{a_5 x^5}\right),$$
which converges to $+\infty$ as $x \to \infty$ and converges to $-\infty$ as $x \to -\infty$. In particular, there exists $s$ with $P(s) > 0$ and $t$ with $P(t) < 0$. But then, since $P$ is continuous, by the Intermediate Value Theorem, there exists $c$ between $s$ and $t$ with $P(c) = 0$, as required.

Of course, a slight modification will show that:

32.4 Theorem. Every odd degree real polynomial has a real root.
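The argument of Example 32.3 can be turned into a numerical procedure. The following Python sketch is an illustration only and is not part of the notes: the function names `poly` and `real_root_odd`, the tolerance and the iteration cap are all assumptions chosen for the example. It first finds points where the polynomial has opposite signs, as in the example, and then narrows the interval by repeated halving (bisection), relying on the Intermediate Value Theorem to guarantee that a root stays inside.

```python
def poly(coeffs, x):
    """Evaluate a polynomial by Horner's rule.

    coeffs lists the coefficients from the leading one down to the constant term.
    """
    value = 0.0
    for a in coeffs:
        value = value * x + a
    return value


def real_root_odd(coeffs, tol=1e-12, max_iter=200):
    """Approximate a real root of an odd-degree real polynomial by bisection."""
    if len(coeffs) < 2 or len(coeffs) % 2 != 0 or coeffs[0] == 0:
        raise ValueError("need odd degree and a nonzero leading coefficient")

    # Without loss of generality the leading coefficient is positive:
    # P(x) = 0 if and only if -P(x) = 0.
    sign = 1.0 if coeffs[0] > 0 else -1.0
    q = [sign * a for a in coeffs]

    # q(x) tends to +infinity as x -> +infinity and to -infinity as x -> -infinity,
    # so doubling r eventually gives q(-r) < 0 < q(r).
    r = 1.0
    while not (poly(q, -r) < 0.0 < poly(q, r)):
        r *= 2.0

    # Bisection keeps the invariant q(lo) < 0 <= q(hi).
    lo, hi = -r, r
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2.0
        if poly(q, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Usage: P(x) = x^5 - x - 1.
root = real_root_odd([1, 0, 0, 0, -1, -1])
print(root, poly([1, 0, 0, 0, -1, -1], root))
```

The sketch works in floating point, so it only approximates the root whose existence the theorem guarantees; the iteration cap is there because the requested tolerance may be finer than floating-point spacing for large roots.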
Here is a common application of the Intermediate Value Theorem.

32.5 Corollary. Let $f : [a, b] \to [a, b]$ be continuous. Then $f$ has a fixed point; that is, there exists $x \in [a, b]$ with $f(x) = x$.

Proof. Let $g(x) = f(x) - x$ for all $x \in [a, b]$. We have to show there exists $c$ with $g(c) = 0$. But
$$f(a) \in [a, b] \implies g(a) = f(a) - a \ge 0$$
and
$$f(b) \in [a, b] \implies g(b) = f(b) - b \le 0.$$
Thus, by the Intermediate Value Theorem, there exists $c \in [a, b]$ with $g(c) = 0$, that is, $f(c) = c$. (See the above remark.)

32.6 Note. This proof indicates a general technique. Given 2 functions, $f$ and $g$, to prove there exists a point $x$ with $f(x) = g(x)$, define a new function $h = f - g$ and prove there exists $x$ with $h(x) = 0$.

We recall that $J$ is an interval if and only if it contains all numbers between any 2 of its points.

Theorem (12.14. Characterization of interval). A set $J$ in $\mathbb{R}$ is an interval if and only if whenever $y_1$ and $y_2$ belong to $J$ and $y_1 < y < y_2$, then $y$ also belongs to $J$.

Proof. The proof is an exercise, which you have probably already done. The left endpoint of $J$ is $\inf J$ and the right endpoint is $\sup J$.

A continuous image of an interval is an interval.

32.7 Corollary. If $I$ is an interval of $\mathbb{R}$ and $f$ is a continuous real-valued function whose domain contains $I$, then $f(I)$ is an interval.

Proof. Because of the characterization of intervals, this is almost a restatement of the Intermediate Value Theorem. Indeed, let $I$ be an interval contained in the domain of $f$, $J = f(I)$, and let $y_1, y_2 \in J$ with $y_1 < y < y_2$. Then there exist $x_1, x_2 \in I$ with $y_1 = f(x_1)$, $y_2 = f(x_2)$ and $f(x_1) < y < f(x_2)$, so by the Intermediate Value Theorem there exists $x$ between $x_1$ and $x_2$ with $y = f(x)$. Since $I$ is an interval, $x \in I$, so $y \in J$.

Every continuous image of a compact interval is a compact interval.

32.8 Corollary. If $I$ is a compact interval of $\mathbb{R}$ and $f$ is a continuous real-valued function whose domain contains $I$, then $f(I)$ is a compact interval.

Proof.
The continuous image of a compact set is compact and the continuous image of an interval is an interval, so if $I$ is a compact interval and $f$ is continuous on $I$, then $f(I)$ is a compact interval.

To visualize this, notice that $f(I) = [m_1, m_2]$, where $m_1$ is the minimum value and $m_2$ is the maximum value of $f$ on $I$.

32.9 Remark (Reminder). A compact interval is the same as a closed bounded interval $[a, b]$. This language is a way of emphasizing its properties. As you know, one includes sets of the form $[a, \infty)$, $(a, \infty)$, $(-\infty, b)$, $(-\infty, b]$, $(-\infty, \infty)$ as intervals. Thus, $[a, \infty)$ is a closed set, which is an interval, but is not compact. You have to look at the context to understand what someone means when he or she talks about a “closed interval”. But a compact interval must be closed and bounded, so there is no ambiguity: it must be of the form $[a, b]$, where $a \le b$ are real numbers.

A function which satisfies the conclusion of the Intermediate Value Theorem is said to have the Intermediate Value Property (IVP) or Darboux Property, because Darboux proved derivatives have this property. A function with the Darboux property is also said to be a Darboux function. A function can have this property without being continuous.

32.10 Example (A discontinuous function with the Intermediate Value Property). The function defined on $\mathbb{R}$ by $f(x) = \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$ is not continuous at 0, but attains all values between any two of its values. Indeed, if $[a, b]$ with $a < b$ contains 0, then $f$ takes on all values between $-1$ and $1$, and if it does not contain 0, then $f$ takes all values between $f(a)$ and $f(b)$ by the IVT, because $f$ is continuous on that interval.

Notice that the graph of the function in the above example wiggles a lot. If it doesn’t wiggle too much, a function with the IVP must be continuous. The simplest case is that of a monotone function.

32.11 Theorem. If $f : I \to \mathbb{R}$ is monotone on the interval $I$ and if $f(I)$ is an interval, then $f$ is continuous on $I$.
Proof. The easiest way to see this is to use left and right limits. We may assume $f$ is increasing. Let $c \in I$. If $c$ is not the left endpoint of $I$, we know $f(c-) = \lim_{x \to c-} f(x) = \sup_{x < c} f(x)$. But, since $f(I)$ is an interval, $a \in I$ with $a < c$ implies $f$ takes on all values between $f(a)$ and $f(c)$. Thus $\{f(x) : x < c\}$ is an interval with right endpoint $f(c)$, so its supremum is $f(c)$. Thus, $f(c-) = f(c)$. Similarly, if $c$ is not the right endpoint of $I$, $f(c+) = f(c)$. Thus, $f$ is both right and left continuous, so is continuous at $c$.

Note, by the way, that the full force of the Intermediate Value Property was not assumed here. But, because of the monotonicity, it turns out to hold anyway. It is easy to construct a function $f$ such that both $I$ and $f(I)$ are intervals of $\mathbb{R}$, but $f$ doesn’t have the IVP (and, of course, is not continuous).

32.12 Corollary. If $I$ is an interval and $f$ is continuous and strictly increasing on $I$, then $f^{-1}$ is continuous and strictly increasing on $f(I)$.

Proof. If $I$ is an interval and $f$ is continuous, then $f(I)$ is also an interval. If $f$ is also strictly increasing, then so is $f^{-1}$. Since $f^{-1}$ is defined on an interval and has an interval as its range, it must be continuous, by the previous theorem.

32.13 Example. Existence of roots again. Let $f(x) = x^n$, defined for $x \in [0, \infty)$. Then $f$ is continuous and strictly increasing. Thus, $f$ has an inverse, which therefore must also be strictly increasing and continuous. By the IVT, the range of $f$ is an interval. Since $f(0) = 0$ and $x^n \to \infty$ as $x \to \infty$, the interval must be $[0, \infty)$. Thus the inverse of $f$ is defined on $[0, \infty)$. This is the function $y \mapsto y^{1/n}$.

It is interesting that the only way a continuous function on an interval can have an inverse is if it is strictly monotone:

32.14 Theorem. If $f$ is continuous and injective on an interval $I$, then $f$ is either strictly increasing or strictly decreasing on $I$.

Proof. Suppose $f$ is continuous and injective on the interval $I$. Claim.
If $a_1 < a_2 < a_3$ then $f$ is strictly monotone on $\{a_1, a_2, a_3\}$; that is, either $f(a_1) < f(a_2) < f(a_3)$ or $f(a_1) > f(a_2) > f(a_3)$.

Indeed, suppose $a_1 < a_2 < a_3$ and $f(a_1) < f(a_2)$; then we must also have $f(a_2) < f(a_3)$. Otherwise, $f(a_2) > f(a_3)$ and if we choose any $y$ strictly between $\max\{f(a_1), f(a_3)\}$ and $f(a_2)$, the Intermediate Value Theorem yields points $c_1 \in (a_1, a_2)$ and $c_2 \in (a_2, a_3)$ with $f(c_1) = f(c_2) = y$, contradicting injectivity. The case $f(a_1) > f(a_2)$ is similar, establishing the claim.

Now suppose $x_1 < x_2$ with $f(x_1) < f(x_2)$ and $x < x'$ are any two points of $I$. We have to show that $f(x) < f(x')$ as well. Let $A$ be the set $\{x_1, x_2, x, x'\}$. It is possible that $A$ consists of 2 or 3 points (for example, we could have $x_1 = x$). The first of these cases is trivial. The second case we have handled by our claim. The final case is that $A$ has 4 distinct points. If so, order them in increasing order
$$a_1 < a_2 < a_3 < a_4.$$
Since $f$ is monotone on sets of 3 elements, and since $\{a_1, a_2, a_3\}$ and $\{a_2, a_3, a_4\}$ have the pair $\{a_2, a_3\}$ in common, $f$ is monotone on the whole of $A$. Since $f(x_1) < f(x_2)$, $f$ is increasing on $A$ and we have $f(x) < f(x')$ as required.

32.1. Prove that the equation $\sqrt{x} - 5 = 1/(x + 3)$ has at least one real solution.

32.2. Prove that there is a real number $x$ with $e^x = x^2$.

32.3. Prove there exists a point $x$ such that $\sin(x - 1) = x$. (You are allowed the usual properties of the sine function, including continuity.)

32.4. Find intervals $I$ and $J$ of $\mathbb{R}$ and a function $f$ on $I$ onto $J$, but which doesn’t have the IVP.

32.5. Find a bijection of $[0, 1]$ onto itself which has no points of continuity or prove this impossible.

32.6. Prove the Intermediate Value Theorem using a bisection argument. Hint: If $y$ is between $f(a)$ and $f(b)$ and $c$ is the point $(a + b)/2$, then $y$ is either between $f(a)$ and $f(c)$ or between $f(c)$ and $f(b)$.
This allows one to construct a decreasing sequence of closed intervals. The one point $x$ in the intersection of these satisfies $f(x) = y$.

The following two notes give the true content of the Intermediate Value Theorem.

32.7. Two subsets $A$ and $B$ of a metric space $S$ are called separated if $\operatorname{cl}(A) \cap B$ and $A \cap \operatorname{cl}(B)$ are both empty; that is, $A$ contains no closure points of $B$ and $B$ contains no closure points of $A$. A set $E \subset S$ is called connected if there are no non-empty separated sets $A$, $B$ with $E = A \cup B$. A subset of $\mathbb{R}$ is connected if and only if it is an interval.

32.8. Let $f$ be a continuous function from one metric space $X$ to another, $Y$. Then the image under $f$ of any connected set in $X$ is connected.

32.9. A set $E$ in a metric space is called pathwise connected if for every $p$ and $q$ in $E$, there exists a continuous function $f$ defined on an interval $[a, b]$ with $f(a) = p$, $f(b) = q$ and $f(x) \in E$ for all $x \in [a, b]$. This is interpreted as saying “we can draw a continuous path in $E$ from any point to any other”. Every pathwise connected set is connected.

32.10. A set $A$ in $\mathbb{R}^n$ (or other vector space) is called convex if $a, b \in A$ implies $A$ contains the line segment $\{(1 - t)a + tb : 0 \le t \le 1\}$. Each convex set in $\mathbb{R}^n$ is (pathwise connected, hence) connected.

32.11. In $\mathbb{R}$ a set is convex iff it is an interval.

33. BANACH’S CONTRACTION MAPPING THEOREM

The result of the title, also known as Banach’s Fixed Point Theorem, is valid in a general complete metric space, such as $\mathbb{R}^n$. It provides conditions under which a mapping has a fixed point; its proof provides an algorithm for finding that point. Thus, the result complements the Intermediate Value Theorem in obtaining solutions to equations.

If $X$ is a metric space and $f : X \to X$, $f$ is called a contraction mapping if there is a number $c < 1$ with $d(f(x), f(y)) \le c\,d(x, y)$ for all $x, y \in X$.

33.1 Banach’s Fixed Point Theorem. Let $X$ be a complete metric space and let $f : X \to X$ be a contraction mapping.
Then, there exists exactly one point $x \in X$ such that $f(x) = x$.

Proof. Define a sequence of points $(x_n)$ as follows. Let $x_0$ be any point of $X$ and for $n = 0, 1, 2, \dots$, let $x_{n+1} = f(x_n)$. Choose $c < 1$ so that $d(f(x), f(y)) \le c\,d(x, y)$ for all $x, y \in X$. For $n \ge 1$, we have
$$d(x_n, x_{n+1}) = d(f(x_{n-1}), f(x_n)) \le c\,d(x_{n-1}, x_n).$$
By induction, we obtain
$$d(x_n, x_{n+1}) \le c^n d(x_0, x_1).$$
If $n < m$, this yields
$$d(x_n, x_m) \le \sum_{i=n}^{m-1} d(x_i, x_{i+1}) \le (c^n + c^{n+1} + \cdots + c^{m-1})\,d(x_0, x_1) \le \frac{c^n}{1 - c}\,d(x_0, x_1).$$
Since $c^n \to 0$, $(x_n)$ is a Cauchy sequence. Since $X$ is complete, $(x_n)$ converges to some $x \in X$. Since $f$ is a contraction mapping, it is (uniformly) continuous. Hence,
$$f(x) = \lim_n f(x_n) = \lim_n x_{n+1} = x,$$
as required. The uniqueness is immediate: if $x$ and $y$ are distinct fixed points, then
$$d(x, y) = d(f(x), f(y)) \le c\,d(x, y) < d(x, y),$$
a contradiction.

33.2 Example. The map $f : [1, +\infty) \to [1, +\infty)$ defined by $f(x) = 1 + 1/(1 + x)$ is a contraction. What is its fixed point?

34. UNIFORM CONTINUITY

Let $X$ and $Y$ be metric spaces, with distance functions $d_X$ and $d_Y$. (We often write simply $d$ in both cases, if there is no danger of confusion.) Let $f$ be a function on $X$ to $Y$. Then $f$ is continuous on $X$ iff it is continuous at each point $a$ of $X$, that is:
$$\forall a \in X,\ \forall \varepsilon > 0,\ \exists \delta > 0 \text{ such that } \forall x \in X,\ d(x, a) < \delta \implies d(f(x), f(a)) < \varepsilon.$$

For example, let $X = \mathbb{R}$ and $f(x) = x^2$ for $x \in X$. To show that $f$ is continuous, one takes $a \in X$, lets $\varepsilon > 0$ and calculates for $x \in X$:
$$d(f(x), f(a)) = |x^2 - a^2| = |x - a|\,|x + a|.$$
If we assume $|x - a| < 1$ we will have $|x + a| < 1 + 2|a|$, so
$$|f(x) - f(a)| \le |x - a|(1 + 2|a|),$$
which is $< \varepsilon$, provided
$$|x - a| < \frac{\varepsilon}{1 + 2|a|},$$
so we can take $\delta_a = \min\{1, \varepsilon/(1 + 2|a|)\}$, and get
$$|x - a| < \delta_a \text{ implies } |f(x) - f(a)| < \varepsilon.$$
Notice that the $\delta = \delta_a$ depends on $a$: for larger $|a|$ we need a smaller $\delta_a$.

Uniform continuity is different. The distances involved don’t depend on where in the domain of the function we are.
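The dependence of $\delta_a$ on $a$ in the example $f(x) = x^2$ can be checked numerically. The following Python sketch is an illustration only and is not part of the notes; the function names and the sampling scheme are assumptions made for the example. It evaluates $\delta_a = \min\{1, \varepsilon/(1 + 2|a|)\}$, samples points within $\delta_a$ of $a$ to confirm the bound $|x^2 - a^2| < \varepsilon$, and shows that the $\delta$ that works at $a = 0$ is too large at $a = 100$.

```python
def delta_for_square(a, eps):
    """Return delta_a = min{1, eps / (1 + 2|a|)} for f(x) = x^2."""
    return min(1.0, eps / (1.0 + 2.0 * abs(a)))


def check_delta(a, eps, samples=1000):
    """Sample points x with |x - a| < delta_a and test |x^2 - a^2| < eps."""
    d = delta_for_square(a, eps)
    for k in range(-samples + 1, samples):
        x = a + d * k / samples  # |x - a| < d because |k| < samples
        if abs(x * x - a * a) >= eps:
            return False
    return True


eps = 0.1
for a in (0.0, 1.0, 10.0, 100.0):
    print(a, delta_for_square(a, eps), check_delta(a, eps))

# The delta chosen at a = 0 does not work at a = 100:
d0 = delta_for_square(0.0, eps)
x = 100.0 + d0 / 2.0  # within d0 of a = 100
print(abs(x * x - 100.0 ** 2) >= eps)
```

Sampling is of course not a proof; the inequality chain in the text is what establishes the bound. The sketch only makes visible that no single $\delta$ from this recipe serves every point $a$.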
The order of quantifiers is changed. A function $f : X \to Y$ is said to be uniformly continuous on $X$ if
$$\forall \varepsilon > 0,\ \exists \delta > 0 \text{ such that } \forall x, y \in X,\ d(x, y) < \delta \implies d(f(x), f(y)) < \varepsilon.$$

34.1 Example. Let $f(x) = 3x$ for $x \in \mathbb{R}$. Fix $\varepsilon > 0$. Then for all $x, y$,
$$|f(x) - f(y)| < \varepsilon \text{ iff } 3|x - y| < \varepsilon.$$
So if we take $\delta = \varepsilon/3$, we have for all $x, y \in \mathbb{R}$,
$$|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon.$$
Thus, $f$ is uniformly continuous on $\mathbb{R}$.

At this point one should check that the function defined on $\mathbb{R}$ by $f(x) = x^2$ is not uniformly continuous on $\mathbb{R}$.

If one says $f : X \to Y$ is uniformly continuous, one means $f$ is uniformly continuous on its domain, $X$.

34.2 Theorem. Every continuous function on a compact set is uniformly continuous. That is, if $f : K \to Y$ is continuous and $K$ is compact, then $f$ is uniformly continuous on $K$.

Proof. Let $\varepsilon > 0$. Since $f$ is continuous on $K$, for each $a \in K$ there exists $\delta_a > 0$ such that
$$x \in B(a, \delta_a) \cap K \implies d(f(x), f(a)) < \varepsilon/2. \qquad (*)$$
Now, $\{B(a, \delta_a/2) : a \in K\}$ is a family of open sets covering $K$, so there exists a finite subfamily which also covers $K$; that is, there exist $a_1, \dots, a_n \in K$ such that
$$\bigcup_{i=1}^{n} B(a_i, \delta_{a_i}/2) \supset K. \qquad (**)$$
Let $\delta = \tfrac{1}{2} \min\{\delta_{a_1}, \dots, \delta_{a_n}\}$. Suppose $x, y \in K$ with $d(x, y) < \delta$. By $(**)$ there exists $i$ such that
$$d(x, a_i) < \delta_{a_i}/2.$$
Then we also have
$$d(y, a_i) \le d(y, x) + d(x, a_i) < \delta + \delta_{a_i}/2 \le 2\delta_{a_i}/2 = \delta_{a_i}.$$
Thus, by $(*)$, we have
$$d(f(x), f(y)) \le d(f(x), f(a_i)) + d(f(a_i), f(y)) < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
We have shown that for $x, y \in K$, $d(x, y) < \delta$ implies $d(f(x), f(y)) < \varepsilon$. (Notice that the dependence on $i$ disappeared.) Thus, $f$ is uniformly continuous on $K$.

34.3 Theorem. Let $f$ be uniformly continuous on $X$. If $(x_n)$ is a Cauchy sequence in $X$, then $(f(x_n))$ is also a Cauchy sequence.

Informally, “uniformly continuous functions map Cauchy sequences to Cauchy sequences”.

Proof. Let $(x_n)$ be a Cauchy sequence in $X$. Fix $\varepsilon > 0$. Then, by uniform continuity, there exists a $\delta > 0$ such that for $x, x' \in X$, $d(x, x') < \delta$ implies $d(f(x), f(x')) < \varepsilon$.
From the definition of Cauchy sequence we obtain $N$ such that for $n, m > N$, $d(x_n, x_m) < \delta$. Thus, for $n, m > N$,
$$d(f(x_n), f(x_m)) < \varepsilon.$$
Since $\varepsilon$ was arbitrary, this shows the sequence $(f(x_n))$ is Cauchy.

Recall that if $f$ is a function with domain $A$, $A \subset B$ and $\bar f$ is defined on $B$ with $\bar f(x) = f(x)$ for $x \in A$, we say that $\bar f$ is an extension of $f$ (to $B$) and $f$ is the restriction of $\bar f$ to $A$. To say that $f$ has an extension to a continuous function on $B$ means there exists an extension $\bar f$ of $f$ to $B$ which is continuous. (Similarly for “uniformly continuous”.)

34.4 Theorem. Let $A \subset B \subset X$. If $f : A \to Y$ and $\bar f : B \to Y$ is an extension of $f$ which is uniformly continuous on $B$, then $f$ itself is uniformly continuous on $A$.

Proof. Let $\varepsilon > 0$. Since $\bar f$ is uniformly continuous on $B$, we may choose $\delta > 0$ such that for all $x, x' \in B$, $d(x, x') < \delta$ implies $d(\bar f(x), \bar f(x')) < \varepsilon$. Now, let $x, x' \in A$ with $d(x, x') < \delta$. Then $x, x' \in B$, so
$$d(\bar f(x), \bar f(x')) < \varepsilon.$$
But $\bar f$ extends $f$, so $f(x) = \bar f(x)$ and $f(x') = \bar f(x')$. Hence,
$$d(f(x), f(x')) < \varepsilon.$$
Thus, for all $x, x' \in A$ with $d(x, x') < \delta$, $d(f(x), f(x')) < \varepsilon$. Since $\varepsilon$ was arbitrary, this shows $f$ is uniformly continuous on its domain, $A$.

34.5 Corollary. Let $f$ be defined on a subset $A$ of $[a, b]$ with values in $\mathbb{R}$. If $f$ can be extended to a continuous function on $[a, b]$, then $f$ is uniformly continuous.

Proof. If $\bar f$ is a continuous function on $[a, b]$ which extends $f$, then $\bar f$ is a continuous function on the compact set $[a, b]$, so is uniformly continuous. Thus, its restriction $f$ is also uniformly continuous, by the previous result.

34.6 Example. The function defined on $(-1, 0) \cup (0, 2)$ by
$$f(x) := \frac{\sin x}{x}$$
is uniformly continuous.

Proof. Define the function $\bar f$ on $[-1, 2]$ by
$$\bar f(x) = \begin{cases} \dfrac{\sin x}{x}, & x \in [-1, 2] \setminus \{0\} \\ 1, & x = 0. \end{cases}$$
Since $\sin$ is continuous on all of $\mathbb{R}$ and $x \mapsto x$ is continuous everywhere, the map $x \mapsto \frac{\sin x}{x}$ is continuous on all of $\mathbb{R} \setminus \{0\}$. Also $\lim_{x \to 0} \frac{\sin x}{x} = 1$.
So the function $\bar f$ is continuous on the compact set $[-1, 2]$, hence uniformly continuous. Furthermore, $\bar f(x) = f(x)$ on $(-1, 0) \cup (0, 2)$, so $f$ is also uniformly continuous.

We are now going to prove a generalization of the converse of this theorem.

34.7 Theorem (Extending a u.c. function). Let $X, Y$ be metric spaces, with $Y$ complete (say $\mathbb{R}$ or $\mathbb{R}^m$). Let $f : A \to Y$ be uniformly continuous on the set $A \subset X$. Then $f$ can be extended to a uniformly continuous function on $\operatorname{cl} A$.

The conclusion says there exists a function $\bar f : \operatorname{cl} A \to Y$ such that $\bar f$ is uniformly continuous on $\operatorname{cl} A$ and $\bar f$ extends $f$.

Proof. Let $f$ be uniformly continuous on $A$. Fix $x \in \operatorname{cl} A$. If there is a continuous function on $\operatorname{cl} A$ which extends $f$, then its formula must be
$$\bar f(x) = \lim_{a \to x} f(a), \qquad (23)$$
so we check that the limit on the right side exists. Let $(a_n)$ be any sequence in $A$ converging to $x$. Then $(a_n)$ is Cauchy. Since $f$ is uniformly continuous on $A$, $(f(a_n))$ is also Cauchy. Since $Y$ is complete, $(f(a_n))$ converges. Thus, for every sequence $(a_n)$ in $A$ converging to $x$, $(f(a_n))$ converges; hence, by the sequential criterion, $\lim_{a \to x} f(a)$ exists.

Define $\bar f$ on $\operatorname{cl}(A)$ by (23). First, $\bar f$ extends $f$, because if $x \in A$,
$$\bar f(x) = \lim_{a \to x} f(a) = f(x),$$
since $f$ is continuous on $A$.

We now prove $\bar f$ is uniformly continuous. Let $\varepsilon > 0$. Use the definition of the uniform continuity of $f$ to choose $\delta > 0$ such that
$$a, a' \in A,\ d(a, a') < \delta \implies d(f(a), f(a')) < \varepsilon. \qquad (24)$$
Let $x, x' \in \operatorname{cl} A$ with $d(x, x') < \delta$. Choose sequences $(a_n)$ and $(a_n')$ in $A$ such that $a_n \to x$ and $a_n' \to x'$. Then,
$$d(a_n, a_n') \to d(x, x') < \delta.$$
Thus, there exists $N$ so large that for all $n \ge N$,
$$d(a_n, a_n') < \delta,$$
so that by (24)
$$d(f(a_n), f(a_n')) < \varepsilon$$
for all $n \ge N$. But $f(a_n) \to \bar f(x)$ and $f(a_n') \to \bar f(x')$, so in the limit
$$d(\bar f(x), \bar f(x')) \le \varepsilon.$$
Thus, $x, x' \in \operatorname{cl} A$, $d(x, x') < \delta$ implies $d(\bar f(x), \bar f(x')) \le \varepsilon$, and hence $\bar f$ is uniformly continuous on $\operatorname{cl} A$.

34.8 Example.
The function defined on $[-1, 1] \setminus \{0\}$ by $f(x) = \sin(1/x)$ is continuous, but not uniformly continuous.

Proof. The closure of $[-1, 1] \setminus \{0\}$ is just the interval $[-1, 1]$. If $f$ had a continuous extension $\bar f$ to $[-1, 1]$, then $\bar f(0)$ would have to be $\lim_{x \to 0} \sin(1/x)$. But this limit doesn’t exist, since every neighbourhood of 0 contains points where $\sin(1/x)$ is $1$ and points where it is $-1$. Thus, there is no such extension and, according to the theorem, $f$ is not uniformly continuous.

34.9 Corollary. If $f$ is uniformly continuous on an interval $(a, b)$ of $\mathbb{R}$, with $a < b$ real, then $f$ can be extended to a uniformly continuous function on $[a, b]$.

The proof of this is immediate from the previous theorem, since $\operatorname{cl}(a, b) = [a, b]$.

34.1. Prove the sequential criterion for uniform continuity: For $f : X \to Y$, $f$ is uniformly continuous if and only if, for each pair of sequences $(x_n)$ and $(x_n')$ in $X$ with $d(x_n, x_n') \to 0$, $d(f(x_n), f(x_n')) \to 0$.

34.2. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be periodic if there exists a nonzero number $k$ (called the period of $f$) such that $f(x + k) = f(x)$ for all $x \in \mathbb{R}$. Prove that a continuous periodic function is bounded and uniformly continuous on all of $\mathbb{R}$.

34.3. Let $f$ be defined and continuous on an interval $(a, b)$, with one or both of $a, b$ infinite. If $\lim_{x \to a} f(x)$ and $\lim_{x \to b} f(x)$ exist in $\mathbb{R}$, then $f$ is uniformly continuous.

34.4. Let $A \subset X$, let $f : A \to Y$ be continuous and let $C$ be the set of all points $x$ of $X$ for which the limit $\bar f(x) = \lim_{a \to x} f(a)$ exists in $Y$. Then the function $\bar f$, defined on $C$ by this formula, is a continuous extension of $f$.

34.5. Prove that each function below is uniformly continuous on the given set by directly verifying the definition.
(a) $f : [0, 3] \to \mathbb{R}$, defined by $f(x) = \dfrac{x}{1 + x}$.
(b) $f : [0, 10) \to \mathbb{R}$, defined by $f(x) = 2x^2 - x$.

34.6. Which of the following are uniformly continuous? Use the theorems to justify your answers.
(a) $g : (0, 7) \setminus \{3\} \to \mathbb{R}$, defined by $g(x) = \dfrac{e^x}{(x - 3)^2}$.
(b) $h : (0, 7) \setminus \{3\} \to \mathbb{R}$, defined by $h(x) = e^{-1/(x - 3)^2}$.

34.7. Let $f : X \to \mathbb{R}$ and $g : X \to \mathbb{R}$. If $f$ and $g$ are uniformly continuous, so is $f + g$, but $fg$ need not be.

34.8. Use the definition to prove that the function $f : [0, \infty) \to \mathbb{R}$, defined by $f(x) = \dfrac{x - 1}{x + 1}$, is uniformly continuous.

34.9. Let $A, B \subset X \subset \mathbb{R}$ and $f : X \to Y$. If $f$ is uniformly continuous on $A$ and uniformly continuous on $B$, does $f$ have to be uniformly continuous on $A \cup B$?

34.10. Is the function $f : [0, +\infty) \to \mathbb{R}$ defined by $f(x) = \sqrt{x}$ uniformly continuous?

35. DIFFERENTIATION

Let $f$ be a real-valued function defined on an interval $I$ of $\mathbb{R}$ containing the point $c$. We say $f$ is differentiable at $c$ if the limit
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$
exists. If so, this limit is called the derivative of $f$ at $c$. The function $f'$, with domain the set of points where $f$ is differentiable, defined by
$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$
is called the derivative of $f$.

35.1 Theorem. For a function $f : I \to \mathbb{R}$ and $c \in I$, $f$ is differentiable at $c$ iff there exist a number $m$ and a function $\varepsilon : I \to \mathbb{R}$ with $\lim_{x \to c} \varepsilon(x) = \varepsilon(c) = 0$ such that for all $x \in I$,
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c).$$
In this case, $f'(c) = m$.

The graph of the function $x \mapsto f(c) + m(x - c)$ is tangent to the graph of $f$ at $(c, f(c))$, and we will sometimes call this result the tangent characterisation of differentiability. The slope of the tangent line is $m = f'(c)$, and one also calls this the slope of $f$ at $c$.

Proof. If such a number $m$ and a function $\varepsilon$ exist, we have
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} (m + \varepsilon(x)) = m + \lim_{x \to c} \varepsilon(x) = m,$$
so $f$ is differentiable at $c$ with derivative $m$. Conversely, suppose $f$ is differentiable at $c$ with $f'(c) = m$. Then
$$\lim_{x \to c} \left(\frac{f(x) - f(c)}{x - c} - m\right) = 0.$$
Put $\varepsilon(c) = 0$ and for all other $x$,
$$\varepsilon(x) = \frac{f(x) - f(c)}{x - c} - m.$$
Then $\lim_{x \to c} \varepsilon(x) = 0 = \varepsilon(c)$ and
$$\frac{f(x) - f(c)}{x - c} = m + \varepsilon(x),$$
so
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c),$$
as required.

35.2 Theorem. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

Proof.
If $f$ is differentiable at $c$ then it can be written as
$$f(x) = f(c) + m(x - c) + \varepsilon(x)(x - c),$$
where $\lim_{x \to c} \varepsilon(x) = \varepsilon(c) = 0$. This last condition says $\varepsilon$ is continuous at $c$, so the right-hand side is a sum of products of functions continuous at $c$, so is continuous at $c$. Hence $f$ is continuous at $c$.

The most important rule for calculation of derivatives is the chain rule, so we give it immediately.

35.3 The chain rule. Let $I$ be an interval and $g : I \to \mathbb{R}$ be differentiable at $x_0$. Let $g(I) \subset J$, $f : J \to \mathbb{R}$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0)g'(x_0) = f'(g(x_0))g'(x_0)$.

Proof. Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon$ which is continuous and 0 at $u_0 = g(x_0)$ with
$$f(u) = f(u_0) + f'(u_0)(u - u_0) + \varepsilon(u)(u - u_0), \qquad (25)$$
for all $u \in J$. Replacing $u$ by $g(x)$ and $u_0$ by $g(x_0)$ in (25) yields
$$f(g(x)) = f(g(x_0)) + f'(u_0)(g(x) - g(x_0)) + \varepsilon(g(x))(g(x) - g(x_0)).$$
If $x \ne x_0$, we may rearrange and divide by $x - x_0$, obtaining:
$$\frac{f(g(x)) - f(g(x_0))}{x - x_0} = f'(u_0)\,\frac{g(x) - g(x_0)}{x - x_0} + \varepsilon(g(x))\,\frac{g(x) - g(x_0)}{x - x_0}.$$
Now, since $g$ is differentiable at $x_0$, it is continuous there, and $\varepsilon$ is continuous at $u_0 = g(x_0)$, so the composite $\varepsilon \circ g$ is continuous at $x_0$. Thus, as $x \to x_0$, $\frac{g(x) - g(x_0)}{x - x_0}$ converges to $g'(x_0)$ and $\varepsilon(g(x))$ converges to $\varepsilon(g(x_0)) = \varepsilon(u_0) = 0$, so
$$\lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = f'(u_0)g'(x_0) + 0 \cdot g'(x_0) = f'(g(x_0))g'(x_0).$$
That is, $(f \circ g)'(x_0) = f'(g(x_0))g'(x_0)$, as required.

The following version of the above proof, using only the tangent characterization of differentiability, is a little messier, but is more directly generalizable to higher dimensions, where division is not defined.

2nd proof. Since $g$ is differentiable at $x_0$, there exists a function $\varepsilon_1$, continuous and 0 at $x_0$, with
$$g(x) = g(x_0) + g'(x_0)(x - x_0) + \varepsilon_1(x)(x - x_0), \qquad (26)$$
for all $x \in I$.
Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon_2$ which is continuous and 0 at $u_0 = g(x_0)$ with
$$f(u) = f(u_0) + f'(u_0)(u - u_0) + \varepsilon_2(u)(u - u_0), \qquad (27)$$
for all $u \in J$. Replacing $u$ by $g(x)$ in (27) gives
$$f(g(x)) = f(u_0) + f'(u_0)(g(x) - u_0) + \varepsilon_2(g(x))(g(x) - u_0).$$
But $u_0 = g(x_0)$, so by (26) we may replace $g(x) - u_0$ by $g'(x_0)(x - x_0) + \varepsilon_1(x)(x - x_0)$, yielding
$$\begin{aligned} f(g(x)) &= f(g(x_0)) + f'(u_0)\bigl[g'(x_0)(x - x_0) + \varepsilon_1(x)(x - x_0)\bigr] + \varepsilon_2(g(x))\bigl[g'(x_0)(x - x_0) + \varepsilon_1(x)(x - x_0)\bigr] \\ &= f(g(x_0)) + f'(u_0)g'(x_0)(x - x_0) + \bigl[f'(u_0)\varepsilon_1(x) + \varepsilon_2(g(x))\bigl(g'(x_0) + \varepsilon_1(x)\bigr)\bigr](x - x_0). \end{aligned}$$
Since $f'(u_0)\varepsilon_1(x) + \varepsilon_2(g(x))\bigl(g'(x_0) + \varepsilon_1(x)\bigr)$ is 0 at $x_0$ and converges to 0 as $x \to x_0$, this shows $f \circ g$ is differentiable at $x_0$ with derivative $f'(u_0)g'(x_0) = f'(g(x_0))g'(x_0)$.

35.4 Theorem. If $f$ is constant on the interval $I$, then $f$ is differentiable on $I$ with $f'(x) = 0$ for all $x \in I$.

To prove this, just do the calculation from the definition, or note that $m = 0$ and the function $\varepsilon = 0$ satisfy the tangent characterization of differentiability.

35.5 Theorem. Let $f : I \to \mathbb{R}$, $g : I \to \mathbb{R}$ be differentiable at $c \in I$ and let $k \in \mathbb{R}$. Then:
(a) $kf$ is differentiable at $c$ with $(kf)'(c) = kf'(c)$ (constant multiple rule).
(b) $f + g$ is differentiable at $c$ and $(f + g)'(c) = f'(c) + g'(c)$ (sum rule).
(c) $fg$ is differentiable at $c$ and $(fg)'(c) = f'(c)g(c) + f(c)g'(c)$ (product rule).
(d) If $g(c) \ne 0$, $\dfrac{f}{g}$ is differentiable at $c$ and $\left(\dfrac{f}{g}\right)'(c) = \dfrac{g(c)f'(c) - f(c)g'(c)}{g(c)^2}$ (quotient rule).

Proof. (a) and (b) are left as exercises. For (c), we merely calculate, for each $x \in I$ with $x \ne c$,
$$\begin{aligned} \frac{(fg)(x) - (fg)(c)}{x - c} &= \frac{f(x)g(x) - f(c)g(c)}{x - c} \\ &= \frac{f(x)g(x) - f(c)g(x) + f(c)g(x) - f(c)g(c)}{x - c} \\ &= \frac{f(x) - f(c)}{x - c}\,g(x) + f(c)\,\frac{g(x) - g(c)}{x - c}. \end{aligned}$$
Now, since $f$ is differentiable at $c$, $\frac{f(x) - f(c)}{x - c}$ converges to $f'(c)$ as $x \to c$. Similarly, $\frac{g(x) - g(c)}{x - c}$ converges to $g'(c)$. Moreover, since $g$ is differentiable at $c$, it is continuous there, so $g(x) \to g(c)$ as $x \to c$.
This shows $fg$ is differentiable at $c$ with derivative $f'(c)g(c) + f(c)g'(c)$.

(d) The proof of the quotient rule is similar. But before doing the calculation we recall that $f/g$ has as domain the set of points $x$ for which $g(x) \ne 0$. Moreover, since $g(c) \ne 0$ and $g$ is continuous at $c$, there is a neighbourhood of $c$ on which $g \ne 0$. Now, for $x$ in this neighbourhood, we calculate
$$\frac{f(x)}{g(x)} - \frac{f(c)}{g(c)} = \frac{g(c)f(x) - g(x)f(c)}{g(x)g(c)} = \frac{g(c)(f(x) - f(c)) - (g(x) - g(c))f(c)}{g(x)g(c)},$$
which when divided by $(x - c)$ gives
$$\frac{\dfrac{f(x)}{g(x)} - \dfrac{f(c)}{g(c)}}{x - c} = \frac{g(c)\,\dfrac{f(x) - f(c)}{x - c} - f(c)\,\dfrac{g(x) - g(c)}{x - c}}{g(x)g(c)}.$$
Now, let $x \to c$, and use the fact that $f$ is differentiable at $c$, $g$ is differentiable at $c$ and $g$ is continuous at $c$ to get the required result.

For a function $f : I \to \mathbb{R}$, a point $c$ is called a critical point of $f$ if one of the following three conditions is satisfied: (1) $c$ is an endpoint of $I$, (2) $f'(c)$ does not exist, or (3) $f'(c) = 0$.

35.6 Interior Extremum Theorem. Let $I$ be an interval of $\mathbb{R}$. If $f : I \to \mathbb{R}$ is differentiable at $c \in \operatorname{int}(I)$ and $f$ has a maximum or minimum at $c$, then $f'(c) = 0$.

Proof. We assume $f$ has a maximum at $c$; the case of minimum is similar. For elements $x \in I$ with $x < c$, we have $f(x) \le f(c)$ and $x - c < 0$, so
$$\frac{f(x) - f(c)}{x - c} \ge 0.$$
Thus,
$$f'(c) = \lim_{\substack{x \to c \\ x < c}} \frac{f(x) - f(c)}{x - c} \ge 0.$$
Similarly, for $x \in I$ with $x > c$,
$$\frac{f(x) - f(c)}{x - c} \le 0,$$
and again
$$f'(c) = \lim_{\substack{x \to c \\ x > c}} \frac{f(x) - f(c)}{x - c} \le 0.$$
Thus $f'(c) \ge 0$ and $f'(c) \le 0$, so $f'(c) = 0$.

In the proof of the Interior Extremum Theorem, we restricted the “difference quotient” $q(x) = \frac{f(x) - f(c)}{x - c}$ to $I \cap (-\infty, c)$, and then took a limit, the so-called limit from the left. Since the limit of $q$ is $f'(c)$, so is the limit from the left. Then we similarly used the limit from the right. The hypothesis that $c$ be an interior point of $I$ was needed. Where?

The limit from the left used above is called the left derivative of $f$ at $c$, denoted $f'_-(c)$.
Similarly the limit from the right is called the right derivative, $f'_+(c)$. Of course, if $f$ is differentiable at a point $c \in \operatorname{int} I$, then $f'_+(c) = f'_-(c) = f'(c)$. If $c$ is the left endpoint of $I$, then the derivative at $c$ and the right derivative at $c$ are the same thing. A similar statement holds for the right endpoint and left derivative. A closer look at the proof of the theorem shows:

35.7 Endpoint Extremum Theorem. If $f : I \to \mathbb{R}$ is differentiable at the left endpoint $c$ of $I$ and $f$ has a minimum at $c$, then $f'(c) \ge 0$; if it has a maximum there, then $f'(c) \le 0$. The inequalities reverse if the word “left” is replaced by “right” in this statement.

The function $f : I \to \mathbb{R}$ has a local maximum (or local minimum) at $c$ if there is a neighbourhood $U$ of $c$ in $I$ for which $f|_U$ has a maximum (minimum) at $c$. The point $c$ is then called a local maximizer (minimizer).

35.8 Local Extremum. Let $I$ be an interval and let $f : I \to \mathbb{R}$ have a local maximum or local minimum at a point $c \in I$. Then $c$ is a critical point of $f$.

Proof. We may assume there is a local maximum at $c$. There are 3 cases in the definition of critical point; if $c$ is not an endpoint and not a point where $f$ is non-differentiable, then $c$ must be an interior point of $I$ and $f'(c)$ exists. Thus there is a ball $B(c, \varepsilon) = (c - \varepsilon, c + \varepsilon) \subset I$, and there is a neighbourhood $U$ of $c$ such that the restriction of $f$ to $U$ has a maximum at $c$. By shrinking the radius, then, we find an open interval $(a, b)$ such that $c \in (a, b)$, $f$ is differentiable at $c$ and the restriction of $f$ to $(a, b)$ has a maximum at $c$; thus, we may assume without loss of generality that $I = (a, b)$ and $f$ has a maximum at $c$. Then $f'(c) = 0$ by the Interior Extremum Theorem.

35.1. Recall that a function $f$ is left differentiable at $x \in I$ if $x$ is not a left endpoint of $I$ and its left derivative, defined by
$$f'_-(x) = \lim_{u \to x-} \frac{f(u) - f(x)}{u - x},$$
exists (in $\mathbb{R}$), and $f$ is right differentiable at $x$ if $x$ is not a right endpoint of $I$ and
$$f'_+(x) = \lim_{u \to x+} \frac{f(u) - f(x)}{u - x}$$
exists. If $f$ is both left and right differentiable at a point $x \in \operatorname{int}(I)$, then $f$ is continuous at $x$.

35.2. The function $f : I \to \mathbb{R}$ is differentiable at $c$ with derivative $m$ if and only if for all $\varepsilon > 0$, there exists $\delta > 0$ such that for all $x \in I$ with $|x - c| < \delta$, $|f(x) - f(c) - m(x - c)| \le \varepsilon|x - c|$.

36. MEAN VALUE THEOREMS

36.1 Rolle’s Theorem. Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$ with $f(a) = f(b)$. Then, there exists $c \in (a, b)$ with $f'(c) = 0$.

Proof. Since $f$ is continuous on the compact set $[a, b]$, it attains a maximum and a minimum at points of $[a, b]$. If one of these is attained in the interior, that is, at some $c \in (a, b)$, then $f'(c) = 0$ by the Interior Extremum Theorem. If, on the other hand, both the maximum and the minimum are attained at the endpoints, then they are equal by hypothesis. Hence, in this case, $f$ is a constant and thus has derivative 0 at all points of $(a, b)$. Thus, any $c$ in $(a, b)$ produces the desired conclusion.

The following theorem is purportedly due to Lagrange. We will also study a generalization due to Cauchy.

36.2 Mean Value Theorem (MVT). Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then, there exists $c \in (a, b)$ with $f'(c) = \dfrac{f(b) - f(a)}{b - a}$. Equivalently, $f(b) - f(a) = f'(c)(b - a)$.

Proof. This is a generalization of Rolle’s Theorem. The method of proof is to create a new function $h$ which satisfies the hypotheses of Rolle’s Theorem and for which $h'(c) = 0$ gives the desired equality. For all $x \in [a, b]$, let
$$g(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a).$$
Then $g(a) = f(a)$, $g(b) = f(b)$ and
$$g'(x) = \frac{f(b) - f(a)}{b - a}, \quad\text{for all } x \in [a, b].$$
Thus, we let $h(x) = f(x) - g(x)$ for all $x \in [a, b]$, so that $h(a) = h(b) = 0$, $h$ is continuous on $[a, b]$ and differentiable on $(a, b)$, and $h' = f' - g'$. By Rolle’s Theorem, there is a point $c \in (a, b)$ where $h'(c) = 0$, that is, where $f'(c) = \dfrac{f(b) - f(a)}{b - a}$, as required.

The Mean Value Theorem can be rewritten in the useful form:

36.3 Mean Value Theorem (MVT, utility version).
If $I$ is an interval of the reals and $f$ is continuous on $I$ and differentiable on its interior, then for each distinct $x_1, x_2 \in I$,
$$f(x_2) = f(x_1) + f'(c)(x_2 - x_1),$$
for some $c$ strictly between $x_1$ and $x_2$.

This is because the quotient $\frac{f(x_1) - f(x_2)}{x_1 - x_2}$ does not change when you interchange $x_1$ and $x_2$. Of course, if $f$ is differentiable everywhere, the result is true also for $x_1 = x_2$, except that $c$ is then no longer strictly between $x_1$ and $x_2$.

It is an immediate consequence that only constant functions have derivative zero.

36.4 Theorem. If $f$ is continuous on an interval $I$, differentiable on its interior, with $f' = 0$ there, then $f$ is constant on $I$.

Proof. If $x_1, x_2$ belong to $I$, with $x_1 < x_2$, then there exists $c \in (x_1, x_2)$ with $f(x_2) - f(x_1) = f'(c)(x_2 - x_1) = 0$. Thus, $f(x_1) = f(x_2)$. This shows all the values of $f$ are the same. That is, $f$ is a constant function.

From this it follows that if two functions have the same derivative, they differ by a constant:

36.5 Corollary. Let $F, G$ be continuous on an interval $I$, differentiable on the interior with $F' = G'$. Then there exists a constant $C$ such that $G = F + C$ on $I$.

If $f$ is defined on the interior of an interval $I$ and $F$ is defined and continuous on $I$, with derivative $f$ in the interior, then $F$ is called a primitive, antiderivative or indefinite integral of $f$, and one writes $F(x) = \int f(x)\,dx$. Thus, if $G$ is another antiderivative of $f$, then $F = G + C$, where $C$ is a constant function.

36.6 Theorem. Let $f$ be continuous on the interval $I$, differentiable on its interior.
(a) If $f' > 0$ on $\operatorname{int} I$, then $f$ is strictly increasing on $I$.
(b) If $f' < 0$ on $\operatorname{int} I$, then $f$ is strictly decreasing on $I$.
(c) $f' \geq 0$ on $\operatorname{int} I$ iff $f$ is increasing on $I$.
(d) $f' \leq 0$ on $\operatorname{int} I$ iff $f$ is decreasing on $I$.

Proof.
(a) If $x_1 < x_2$ in $I$, then by the Mean Value Theorem 36.3, we may choose $c$ between $x_1$ and $x_2$ with
$$f(x_2) = f(x_1) + f'(c)(x_2 - x_1).$$
Since $f'(c) > 0$, this says $f(x_1) < f(x_2)$, as required. The proof of (b) is similar, as is one direction of (c) and (d).

We emphasise that in (a) and (b), the implication cannot be reversed. For example, if $f$ is defined on $\mathbb{R}$ by $f(x) = x^3$, then $f'(x) = 3x^2$, which is 0 at $x = 0$, yet $f$ is everywhere strictly increasing.

The following theorem, proved by Gaston Darboux in 1875, is often called the Intermediate Value Theorem for derivatives. It will lead to results about continuity of derivatives and differentiation of inverse functions (Inverse Function Theorem).

36.7 The Darboux Property of derivatives. Let $f$ be differentiable on the interval $I$. If $a, b \in I$ and $m$ is any number strictly between $f'(a)$ and $f'(b)$, then there exists $c$ between $a$ and $b$ with $f'(c) = m$.

Proof. Say $f'(a) < m < f'(b)$ and $a < b$. Let $h(x) = f(x) - mx$, for all $x \in I$. Since $h'(a) < 0$ and $h'(b) > 0$, $h$ cannot be monotone on $[a, b]$. Being continuous and not monotone, $h$ must have the same value at 2 different points of $[a, b]$; hence, by Rolle's Theorem, there is a point $c$ between them where $h'(c) = 0$; that is, $f'(c) = m$.

Before Darboux published his paper, it was believed that if a function $f$ is differentiable on an interval, the derivative $f'$ must be continuous. Darboux gave examples showing that this need not be the case. However, his theorem clarifies the situation: in the usual examples, $I$ can be broken into intervals on which $f'$ is monotone.

36.8 Corollary. Let $f$ be differentiable on an interval $I$, with a monotone derivative $f'$. Then $f'$ is continuous on $I$.

Proof. We recall that whenever a monotone function defined on an interval has the Darboux property (that is, satisfies the conclusion of the Intermediate Value Theorem), it is continuous. We have just proved that $f'$ is such a function.

36.9 Example. (A function, differentiable on $\mathbb{R}$, whose derivative is not continuous.)
For $x \in \mathbb{R}$, put
$$f(x) = \begin{cases} x^2 \sin(1/x), & x \neq 0; \\ 0, & x = 0. \end{cases}$$
The reader can check, using the usual rules, that $f$ is differentiable for $x \neq 0$ with $f'(x) = 2x \sin(1/x) - \cos(1/x)$. Notice that $\lim_{x \to 0} f'(x)$ does not exist. Nevertheless, $f$ is differentiable at 0. To prove this one must go back to the definition:
$$f'(0) = \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} x \sin(1/x) = 0,$$
since the sine function is bounded. Thus, $f'$ is defined on all of $\mathbb{R}$, yet $f'$ is not continuous at 0, since it doesn't even have a limit there.

In your study of uniform continuity, you will recall that one often tries to get an inequality of the form
$$|f(x) - f(y)| \leq M |x - y|;$$
for then $|x - y| < \varepsilon / M$ implies $|f(x) - f(y)| < \varepsilon$, so that $f$ is uniformly continuous. A function with this property is called Lipschitz (of order 1).

36.10 Theorem. Let $f$ be continuous on an interval $I$, differentiable in the interior. If $f'$ is bounded on the interior of $I$, then $f$ is Lipschitz on $I$, hence $f$ is uniformly continuous.

Proof. Let $|f'(x)| \leq M$, for all $x \in \operatorname{int} I$. Let $x_1, x_2 \in I$. If these are distinct, there exists $c$ between $x_1$ and $x_2$ with
$$\frac{f(x_1) - f(x_2)}{x_1 - x_2} = f'(c),$$
by the Mean Value Theorem. Thus,
$$\left| \frac{f(x_1) - f(x_2)}{x_1 - x_2} \right| = |f'(c)| \leq M,$$
so $|f(x_1) - f(x_2)| \leq M |x_1 - x_2|$. In case $x_1 = x_2$, this argument fails, but the inequality is still true, since both sides are then 0.

Here is the generalized mean value theorem mentioned earlier.

36.11 Cauchy's Mean Value Theorem. Let $f, g$ be real-valued functions continuous on $[a, b]$, differentiable on $(a, b)$. Then, there exists a point $c \in (a, b)$ such that
$$(f(b) - f(a))\,g'(c) = (g(b) - g(a))\,f'(c).$$

Proof. Let $h(x) = (f(b) - f(a))\,g(x) - (g(b) - g(a))\,f(x)$, for all $x \in [a, b]$. Then
$$h(a) = (f(b) - f(a))\,g(a) - (g(b) - g(a))\,f(a) = f(b)g(a) - g(b)f(a)$$
and
$$h(b) = (f(b) - f(a))\,g(b) - (g(b) - g(a))\,f(b) = -f(a)g(b) + g(a)f(b),$$
which is the same thing.
So by Rolle's theorem (or the MVT) there exists a point $c$ in $(a, b)$ with $h'(c) = 0$, that is, where
$$(f(b) - f(a))\,g'(c) - (g(b) - g(a))\,f'(c) = 0.$$
This is what is required.

A major consequence of Cauchy's MVT is L'Hôpital's rule, in section 38.

36.1. Let $f$ be continuous on the interval $I$ and differentiable on the interior with $f' > 0$ except at finitely many points in each bounded interval. Then $f$ is strictly increasing on $I$. (Note: At the finitely many points, we don't even need to assume the differentiability.)

36.2. If $f' \geq 0$ on $I$ and there is no interval $(a, b)$ with $a < b$ on which $f' = 0$, then $f$ is strictly increasing on $I$.

36.3. Use the Mean Value Theorem or Rolle's Theorem to prove that a real polynomial of degree $n$ can have at most $n$ real roots.

36.4. Let $f$ be differentiable on the interval $[1, 7]$ with $f(1) = 3$ and $f'(x) \leq 4$, for all $x \in (1, 7)$. What is the largest $f(7)$ could be?

36.5. Prove the following inequalities using the Mean Value Theorem.
(a) $e^x > x + 1$, for $x > 0$.
(b) If $0 < p < 1$ and $t \geq 1$, then $(1 + t)^p < 1 + t^p$.

36.6. Prove the "first derivative test" for local maximum: Let $f$ be continuous on $(a, b)$ and differentiable, except maybe at $c$, with $f'(x) > 0$ for $a < x < c$ and $f'(x) < 0$ for $c < x < b$. Then $f$ has a maximum at $c$.

36.7. Consider the function defined on $\mathbb{R}$ by $f(x) = x^2 \sin^2(1/x^2)$ for $x \neq 0$ and $0$ for $x = 0$. Prove that $f$ has a minimum at 0 and is differentiable there, but the first derivative test does not apply: the derivative does not simply change sign at 0. (It changes sign infinitely often in each neighbourhood of 0.)

36.8. Let $f, g$ be differentiable on $[a, +\infty)$ with $f'(x) \leq g'(x)$, for all $x \geq a$. Suppose $f(a) \leq g(a)$. Prove that $f(x) \leq g(x)$, for all $x \geq a$.

36.9. Let $f$ be differentiable on $\mathbb{R}$ with $f(-1) = 0$, $f(0) = 2$, and $f(1) = 50$. Prove that there is a point $c$ where $f'(c) = 7$.

36.10. Alternate proofs of the Darboux Property of Derivatives.

Proof.
(Lars Olsen, American Mathematical Monthly, 2004; I later found that this is also in Tom Apostol, 2nd Ed., page 112.) Assume without loss of generality that $a < b$ and $f'(a) < m < f'(b)$. For $x \in I$ put
$$q_a(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a}, & x \neq a; \\ f'(a), & x = a, \end{cases} \qquad \text{and} \qquad q_b(x) = \begin{cases} \dfrac{f(b) - f(x)}{b - x}, & x \neq b; \\ f'(b), & x = b. \end{cases}$$
Since $f'(a) = \lim_{x \to a} q_a(x)$ and $f'(b) = \lim_{x \to b} q_b(x)$, these functions are continuous on $I$. Notice also that $q_a(b) = q_b(a) = \frac{f(b) - f(a)}{b - a}$.

In case $m \leq q_a(b)$, the Intermediate Value Theorem gives us a point $x \in (a, b]$ with
$$m = q_a(x) = \frac{f(x) - f(a)}{x - a},$$
and then the Mean Value Theorem gives us a point $c \in (a, x)$ with $m = f'(c)$.

In case $m > q_a(b) = q_b(a)$, the analogous argument with $q_b$ gives us a point $x \in (a, b)$ and a point $c \in (x, b)$ with
$$m = \frac{f(b) - f(x)}{b - x} = f'(c).$$
This completes the proof.

Usual proof. Assume without loss of generality that $a < b$ and $f'(a) < m < f'(b)$. As before, let $g(x) = f(x) - mx$. Then $g'(a) = f'(a) - m < 0$ and $g'(b) = f'(b) - m > 0$. Since the restriction of $g$ to $[a, b]$ is continuous, it must have a minimum at some point $c$ of $[a, b]$. Now, $c$ cannot be $a$; otherwise, by the Endpoint Extremum Theorem, $g'(a) \geq 0$. Also, $c$ can't be $b$; otherwise, $g'(b) \leq 0$. Hence $c \in (a, b)$, and then $g'(c) = 0$ by the Interior Extremum Theorem; that is, $f'(c) = m$.

37. THE REAL INVERSE FUNCTION THEOREM

The following application of the Darboux Property of Derivatives is the key to differentiating the inverse of functions.

37.1 Inverse Function Theorem. Let $f$ be differentiable on the interval $I$ and $f'(x) \neq 0$, for all $x \in I$. Then,
(1) $f$ is injective, strictly increasing or strictly decreasing,
(2) $f^{-1}$ is differentiable on $J = f(I)$, and
(3) at each point $y \in J$,
$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}.$$

Note that if $x$ is the point of $I$ with $y = f(x)$, the final formula reads
$$(f^{-1})'(y) = \frac{1}{f'(x)}.$$
In calculus courses one often writes
$$\frac{dx}{dy} = \frac{1}{\ \dfrac{dy}{dx}\ },$$
but one has to remember that on the left side, the $x$ represents the function $f^{-1}$, and $\frac{dx}{dy}$ stands for the derivative of that function at the point $y$, whereas on the right side $y$ is representing the function $f$ and $\frac{dy}{dx}$ stands for the derivative of $f$ at $x$, yet we are still assuming that the $y$ at which we evaluate the left side is related to the $x$ at which we evaluate the right side by $y = f(x)$. The notation is a mess, though very convenient in calculations.

Proof. By the Darboux property of derivatives (Theorem 36.7), if $f' \neq 0$ on $I$, then there cannot exist $a, b \in I$ with $f'(a) > 0$ and $f'(b) < 0$. Thus $f' > 0$ on $I$ or $f' < 0$ on $I$. Thus, $f$ is strictly increasing on $I$ or strictly decreasing. In particular, $f$ is injective.

Say $f$ is strictly increasing on $I$. Then $f^{-1}$ is also strictly increasing. Since $f$ is continuous, it too has the Darboux property (intermediate value property), which makes $J = f(I)$ an interval, and $f^{-1}$ is continuous there.

Fix $y \in J$, $y = f(x)$ where $x \in I$. We will use the sequential criterion to show $f^{-1}$ is differentiable at $y$. Let $(y_n)$ be a sequence in $J \setminus \{y\}$ with $y_n \to y$ and put $x_n = f^{-1}(y_n)$. Since $f^{-1}$ is continuous, $x_n \to x$. Also $x_n \neq x$, for all $n$, since $f^{-1}$ is injective. Thus, we calculate
$$\frac{f^{-1}(y_n) - f^{-1}(y)}{y_n - y} = \frac{x_n - x}{f(x_n) - f(x)} = \frac{1}{\ \dfrac{f(x_n) - f(x)}{x_n - x}\ }.$$
Since $f$ is differentiable at $x$, this converges to $\frac{1}{f'(x)}$. Since the sequence $(y_n)$ was arbitrary,
$$\lim_{t \to y} \frac{f^{-1}(t) - f^{-1}(y)}{t - y} = \frac{1}{f'(x)}.$$
That is, $(f^{-1})'(y)$ exists and is $1/f'(x)$.

37.2 Example. The function $\tan$ is defined on the set of real numbers where $\cos$ is not 0, namely on $A := \mathbb{R} \setminus \{\pi/2 + n\pi : n \in \mathbb{Z}\}$, by $\tan x = \frac{\sin x}{\cos x}$. Since $\sin$ and $\cos$ are continuous, so is $\tan$ on $A$. The denominator is positive in the interval $(-\pi/2, \pi/2)$. As $x \to \frac{\pi}{2}-$, $\sin x \to 1$ and $\cos x \to 0$, so $\lim_{x \to \frac{\pi}{2}-} \tan x = +\infty$; similarly, $\lim_{x \to -\frac{\pi}{2}+} \tan x = -\infty$. From this, for any $y \in \mathbb{R}$, there exist $a, b$ with $\tan a < y$ and $\tan b > y$.
Thus, by the Intermediate Value Theorem, there exists $x$ with $\tan x = y$. This shows that $\tan$ maps $(-\pi/2, \pi/2)$ onto $\mathbb{R}$. Since $\sin' = \cos$ and $\cos' = -\sin$, the quotient rule gives, for $x \in A$,
$$\tan' x = \frac{\cos x \,\sin' x - \sin x \,\cos' x}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = 1 + \tan^2 x.$$
(We could say $\frac{1}{\cos^2 x} = \sec^2 x$, but we are anticipating the use below.) In any case, this shows $\tan' x > 0$ in $(-\pi/2, \pi/2)$, so $\tan$ is strictly increasing there, hence injective. The restriction of $\tan$ to $(-\pi/2, \pi/2)$ is thus invertible. Its inverse is defined to be $\arctan$. Since $\tan$ maps $(-\pi/2, \pi/2)$ onto $\mathbb{R}$, $\arctan$ maps $\mathbb{R}$ onto $(-\pi/2, \pi/2)$. By the Inverse Function Theorem for real functions, $\arctan$ is differentiable with
$$\arctan' y = \frac{1}{\tan'(\arctan y)}.$$
If $x = \arctan y$, then $\tan x = y$ and $\tan' x = 1 + \tan^2 x = 1 + y^2$. Hence,
$$\arctan' y = \frac{1}{1 + y^2},$$
for all $y$.

37.1. You are given that the exponential function $\exp(x) = e^x$ has $\exp' = \exp$. Prove that $\ln = \log_e$ is also differentiable, with the usual formula.

37.2. Let $\arcsin$ denote the inverse of the restriction of $\sin$ to the interval $[-\pi/2, \pi/2]$. Prove that $\arcsin$ is defined on $[-1, 1]$, that it is strictly increasing and continuous there, and that it is differentiable in $(-1, 1)$, and determine its derivative. Justify all steps (including that such an inverse exists).

37.3. Formulate and work out a similar result for $\cos$ restricted to $[0, \pi]$. Could any larger interval be used?

37.4. If $f$ is continuously differentiable on $I$ and $f'(a) \neq 0$, then $f$ is locally invertible at $a$; that is, there exists a neighbourhood $U$ of $a$ for which $f|_U$ is invertible, and if $g$ is the inverse of $f|_U$, then $g'(f(x)) = 1/f'(g(f(x)))$, for $x \in U$.

38. L'HÔPITAL'S RULE

This actually refers to several related results about calculating the limit of a quotient $f/g$ of two functions $f, g$ by calculating the limit of the quotient $f'/g'$ of their derivatives.

38.1 L'Hôpital's rule (0/0 form at a real).
Let $I$ be an interval of $\mathbb{R}$, $c \in I$; let $f, g$ be defined on $I$, except possibly at $c$, differentiable with $g'$ not 0 in a deleted neighbourhood of $c$, and $\lim_{x \to c} f(x) = \lim_{x \to c} g(x) = 0$.
$$\text{If } \lim_{x \to c} \frac{f'(x)}{g'(x)} = L \in \overline{\mathbb{R}}, \quad \text{then} \quad \lim_{x \to c} \frac{f(x)}{g(x)} = L. \tag{28}$$

Proof. We may assume $f$ and $g$ are continuous with value 0 at $c$. Indeed, if they are not we may define $\bar{f}$ and $\bar{g}$ on $I$ by
$$\bar{f}(x) = \begin{cases} f(x), & x \neq c \\ 0, & x = c, \end{cases} \qquad \bar{g}(x) = \begin{cases} g(x), & x \neq c \\ 0, & x = c. \end{cases}$$
Then, $\bar{f}$ and $\bar{g}$ are continuous with value 0 at $c$, and $\bar{f}' = f'$ and $\bar{g}' = g' \neq 0$ in a deleted neighbourhood $U$ of $c$. Since for $x \in U$,
$$\frac{\bar{f}'(x)}{\bar{g}'(x)} = \frac{f'(x)}{g'(x)} \qquad \text{and} \qquad \frac{\bar{f}(x)}{\bar{g}(x)} = \frac{f(x)}{g(x)},$$
it would be enough to prove the result for these new functions.

Assume then, that $f$ and $g$ are defined and continuous with value 0 at $c$. Let $U = (a, c) \cup (c, b)$ be the deleted neighbourhood of $c$ in $I$ such that $f'$ and $g'$ exist at all points of $U$ and $g'$ is not zero in $U$. Since $g'(x) \neq 0$ for $x \in U$, and $g(c) = 0$, $g \neq 0$ in $U$.

We use the sequential criterion for convergence. We must show that for each sequence $(x_n)$ in $U$ with $x_n \to c$, $\frac{f(x_n)}{g(x_n)} \to L$.

Fix $(x_n)$ a sequence in $U$ converging to $c$. For each $x_n$, either $x_n < c$ or $c < x_n$. Since $f$ and $g$ are continuous on $[x_n, c]$ (or $[c, x_n]$), and differentiable on the interior, Cauchy's Mean Value Theorem allows us to choose $c_n$ between $x_n$ and $c$ with
$$(f(x_n) - f(c))\,g'(c_n) = (g(x_n) - g(c))\,f'(c_n).$$
In other words, since $f(c) = g(c) = 0$,
$$\frac{f(x_n)}{g(x_n)} = \frac{f(x_n) - f(c)}{g(x_n) - g(c)} = \frac{f'(c_n)}{g'(c_n)}.$$
Now, since $c_n$ is between $x_n$ and $c$, $|c_n - c| \leq |x_n - c|$, so $c_n \to c$, and by hypothesis,
$$\frac{f'(c_n)}{g'(c_n)} \to L, \qquad \text{so} \qquad \frac{f(x_n)}{g(x_n)} \to L.$$
Since $(x_n)$ was an arbitrary sequence in $U$ converging to $c$,
$$\lim_{x \to c} \frac{f(x)}{g(x)} = L.$$

Recall the practice problem (Exercise 29.3).

38.2 Lemma. Let $h$ be defined (at least) in $[a, \infty)$, $a > 0$, and let $H(t) = h(1/t)$ for $0 < t < 1/a$. Then
$$\lim_{x \to \infty} h(x) = \lim_{t \to 0+} H(t),$$
whenever one side exists in $\overline{\mathbb{R}}$.

38.3 L'Hôpital's rule (0/0 form at $\infty$ or $-\infty$).
Let $f, g$ be defined (at least) on an interval $[a, \infty)$ of $\mathbb{R}$, differentiable there with $g'$ never 0 and $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$. Then the implication (28) holds (with $c = \infty$). A similar statement holds for limits at $-\infty$.

Proof. We may assume $a > 0$. Put $F(t) = f(1/t)$, $G(t) = g(1/t)$, for each $t$ with $1/t \in [a, \infty)$. Then $F$ and $G$ are differentiable in $(0, 1/a)$ and
$$\lim_{t \to 0+} \frac{F'(t)}{G'(t)} = \lim_{t \to 0+} \frac{f'(1/t)\,(-1/t^2)}{g'(1/t)\,(-1/t^2)} = \lim_{t \to 0+} \frac{f'(1/t)}{g'(1/t)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L.$$
Moreover, $F$ and $G$ converge to 0 as $t \to 0+$, so by the previous theorem,
$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{t \to 0+} \frac{F(t)}{G(t)} = L.$$

Here is a version where the denominator converges to $\infty$. Note that no assumption is made on the behaviour of the numerator.

38.4 L'Hôpital's rule ($?/\infty$ form). Let $f, g$ be defined and differentiable (at least) in a (possibly infinite) interval $(a, b) \subset \mathbb{R}$, except at some $c$, with $g'$ never 0 there and $\lim_{x \to c} g(x) = \infty$ (or $-\infty$). Then the implication (28) holds.

Proof. We may assume $b = c$, since the case $c = a$ is similar and the general case follows by considering right and left limits. We do the case $L$ is finite, but write the proof in such a way that it is easily changed to give the infinite cases.

Let $u < L < v$. We need only show that there is an interval $U = (r, c)$ with $u < \frac{f(x)}{g(x)} < v$, for all $x \in U$. Choose $u', v' \in \mathbb{R}$ with $u < u' < L < v' < v$. Since
$$\lim_{x \to c} \frac{f'(x)}{g'(x)} = L,$$
there exists an interval $U_1 = (r_1, c)$ with
$$u' < \frac{f'(t)}{g'(t)} < v' \tag{$*$}$$
for all $t \in U_1$. Fix $x, y \in U_1$ (distinct) and apply Cauchy's MVT to get a $t$ between $x$ and $y$ with
$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(t)}{g'(t)}.$$
There is no problem about dividing by zero here, since $g'$ is not 0 anywhere on $(a, c)$, and hence $g(x)$ is never $g(y)$. Since $t$ is between $x$ and $y$, it also belongs to $U_1$, hence by $(*)$,
$$u' < \frac{f(x) - f(y)}{g(x) - g(y)} < v'. \tag{$**$}$$
We need not make reference to $t$ any longer. This statement $(**)$ holds for all distinct $x$ and $y$ in $U_1$.
Hold $y$ fixed throughout the remainder of the proof. Since $g(x) \to \infty$, we may choose a smaller neighbourhood $U_2 = (r_2, c)$ with $g(x) > g(y)$ and $g(x) > 0$, for $x \in U_2$. Divide the numerator and denominator of the quotient in $(**)$ by $g(x)$ to obtain
$$u' < \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} < v'.$$
The denominator is $> 0$ here so we may multiply through by it, preserving the inequality. After adding $f(y)/g(x)$, this gives
$$u' \left( 1 - \frac{g(y)}{g(x)} \right) + \frac{f(y)}{g(x)} < \frac{f(x)}{g(x)} < v' \left( 1 - \frac{g(y)}{g(x)} \right) + \frac{f(y)}{g(x)}.$$
Now, as $x \to c$, since $g(x) \to \infty$, the left side here converges to $u'$ and the right side converges to $v'$. Thus, since $u < u'$ and $v > v'$, there is a smaller deleted neighbourhood $U_3 = (r_3, c)$ such that the left side is $> u$ and the right side is $< v$, for $x \in U_3$. Hence
$$u < \frac{f(x)}{g(x)} < v.$$
Thus, $U_3$ is the required $U$.

In the case the limit $L$ is $+\infty$, a proof is obtained by deleting the side of the inequalities above containing $v$ and $v'$, since the neighbourhoods of $+\infty$ are of the form $(u, +\infty)$. Similarly, the proof for the case $L = -\infty$ is obtained by deleting the side involving $u$ and $u'$.

Alternate Proof of both of the 0/0 forms. We may assume the interval $I$ has right endpoint $c$, where $c$ could be $+\infty$. The case that $c$ is a left endpoint is similar and the general case follows by considering right and left limits.

Let $V$ be a neighbourhood of $L$ and let $W$ be a neighbourhood of $L$ with $\operatorname{cl} W \subset V$. (For example, if $L$ is finite and $V = B(L, \varepsilon)$, we could take $W = B(L, \varepsilon/2)$.) Since
$$\lim_{x \to c} \frac{f'(x)}{g'(x)} = L,$$
there exists an interval $U = (r, c)$ with
$$\frac{f'(t)}{g'(t)} \in W \tag{$*$}$$
for all $t \in U$. By hypothesis we can take $U$ small enough that $g'$ is never 0 on $U$. Fix $x, y \in U$ (distinct) and apply Cauchy's MVT to get a $t$ between $x$ and $y$ with
$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(t)}{g'(t)}.$$
Since $g'$ is not 0 anywhere on $U$, $g(x)$ is never $g(y)$. Since $t$ is between $x$ and $y$, it also belongs to $U$; hence, by $(*)$,
$$\frac{f(x) - f(y)}{g(x) - g(y)} \in W.$$
Now, let $y \to c$; since $f(y) \to 0$ and $g(y) \to 0$, we get
$$\frac{f(x)}{g(x)} = \lim_{y \to c} \frac{f(x) - f(y)}{g(x) - g(y)} \in \operatorname{cl} W \subset V.$$
Thus, for each neighbourhood $V$ of $L$, we have found a deleted neighbourhood $U$ of $c$ such that for all $x \in U$, $\frac{f(x)}{g(x)} \in V$, which shows that $\lim_{x \to c} \frac{f(x)}{g(x)} = L$, as required.

38.1. Let $f(x) = 2x + \sin 2x$ and $g(x) = (x + \sin x \cos x)\,e^{2 \sin x}$. Investigate $\lim_{x \to +\infty} \frac{f(x)}{g(x)}$ and $\lim_{x \to +\infty} \frac{f'(x)}{g'(x)}$. Explain the behaviour you discover.

39. TAYLOR'S THEOREM

The following theorem, named after Brook Taylor (1685–1731), is an extension of the Mean Value Theorem to include higher derivatives.

Let $f$ be a function defined on an interval $I$. If $f$ is differentiable in a neighbourhood of $x$ and the derivative $f'$ is differentiable at $x$, then the second derivative of $f$ at $x$ is $f''(x) = (f')'(x)$. Similarly, $f'''(x)$ is the derivative of $f''$ at $x$, provided $f''$ exists in a neighbourhood of $x$, and $f''$ is differentiable at $x$. Other notations for these are $f^{(2)}(x)$ and $f^{(3)}(x)$. This notation can be extended recursively. We put $f^{(0)}(x) = f(x)$, and supposing $f^{(n)}$ exists in a neighbourhood of $x$ and is differentiable at $x$, we put $f^{(n+1)}(x) = (f^{(n)})'(x)$.

39.1 Taylor's Theorem (Taylor's formula with Lagrange's form of remainder). Let $I$ be an interval of the reals, and let $f : I \to \mathbb{R}$ and its first $n$ derivatives be continuous on $I$ and differentiable on the interior of $I$. Let $x_0 \in I$. Then for all $x \in I$,
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + R_n(x),$$
$$\text{where } R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - x_0)^{n+1}, \text{ for some } c \text{ between } x \text{ and } x_0.$$

We emphasise that $f^{(n+1)}$ is assumed to exist on the interior of $I$, but need not be continuous, and it is not assumed to exist at the endpoints of $I$. Remember that $f^{(0)} = f$ and $0! = 1$, $(x - x_0)^0 = 1$, even when $x = x_0$.

Proof. For each $x \in I$, put
$$P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k.$$
This is called the Taylor polynomial of order $n$ about $x_0$; it is defined for all $x$ in $I$, since $f$ is $n$-times differentiable.
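As a brief numerical aside, the Taylor polynomial $P_n$ just defined and the remainder bound promised by Theorem 39.1 can be sketched in Python. The choice $f = \exp$, $x_0 = 0$, $n = 4$, $x = 0.5$ and the helper name `taylor_polynomial` are illustrative assumptions; for this $f$ every derivative at $0$ equals $1$.

```python
import math


def taylor_polynomial(derivs_at_x0, x0, x):
    """Evaluate P_n(x) = sum_k f^(k)(x0) / k! * (x - x0)^k.

    derivs_at_x0[k] is assumed to hold f^(k)(x0) for k = 0, ..., n.
    """
    return sum(d / math.factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))


# Illustrative choice (an assumption): f = exp and x0 = 0, so f^(k)(0) = 1 for all k.
n = 4
x0, x = 0.0, 0.5
p_n = taylor_polynomial([1.0] * (n + 1), x0, x)
remainder = math.exp(x) - p_n

# Lagrange's form gives R_n(x) = e^c / (n+1)! * x^(n+1) for some c in (0, x),
# so the remainder lies strictly between the values obtained with c = 0 and c = x.
lower = x ** (n + 1) / math.factorial(n + 1)
upper = math.exp(x) * x ** (n + 1) / math.factorial(n + 1)
print(p_n, remainder, lower, upper)
```

The computed remainder falls between `lower` and `upper`, as the Lagrange form predicts for an increasing $(n+1)$-st derivative.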
Define the remainder $R_n(x)$ by subtraction:
$$R_n(x) = f(x) - P_n(x).$$
From the way it is defined, the remainder automatically satisfies
$$f(x) = P_n(x) + R_n(x).$$
We are to show it can be written in the stated form. Keep $x, x_0 \in I$ fixed, with $x \neq x_0$, and define a function
$$g(t) = \sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!} (x - t)^k + R_n(x) \frac{(x - t)^{n+1}}{(x - x_0)^{n+1}}.$$
Then, since $(x - x) = 0$ and $\frac{(x - x_0)^{n+1}}{(x - x_0)^{n+1}} = 1$,
$$g(x) = f(x) \qquad \text{and} \qquad g(x_0) = P_n(x) + R_n(x) = f(x).$$
Thus, by Rolle's Theorem (or the Mean Value Theorem), there exists a $c$ between $x$ and $x_0$ with $g'(c) = 0$. Using the rules of differentiation, we calculate:
$$\begin{aligned}
g'(t) &= f'(t) + \sum_{k=1}^{n} \left[ \frac{f^{(k+1)}(t)}{k!} (x - t)^k - \frac{f^{(k)}(t)}{k!}\, k (x - t)^{k-1} \right] - R_n(x) \frac{(n+1)(x - t)^n}{(x - x_0)^{n+1}} \\
&= f'(t) + \sum_{k=1}^{n} \left[ \frac{f^{(k+1)}(t)}{k!} (x - t)^k - \frac{f^{(k)}(t)}{(k-1)!} (x - t)^{k-1} \right] - R_n(x) \frac{(n+1)(x - t)^n}{(x - x_0)^{n+1}} \\
&= f'(t) + \frac{f^{(n+1)}(t)}{n!} (x - t)^n - \frac{f'(t)}{0!} (x - t)^0 - R_n(x) \frac{(n+1)(x - t)^n}{(x - x_0)^{n+1}} \\
&= \frac{f^{(n+1)}(t)}{n!} (x - t)^n - R_n(x) \frac{(n+1)(x - t)^n}{(x - x_0)^{n+1}}.
\end{aligned}$$
Now, at $t = c$ the left side is 0, so when we solve for $R_n(x)$, the factors $(x - c)^n$ cancel and
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)\,n!} (x - x_0)^{n+1},$$
which is equal to the required form.

The form given to the remainder here is generally attributed to Lagrange. Taylor didn't actually prove this theorem, but gave the infinite series expansion, known as Taylor's series, without discussing questions of convergence.

39.2 Example. If $f(x) = e^x$, for $x \in \mathbb{R}$, then $f'(x) = e^x$, for all $x \in \mathbb{R}$. Thus, for all $n$, $f^{(n)}(0) = e^0 = 1$ and the Taylor polynomial of order $n$ about 0 is
$$P_n(x) = \sum_{k=0}^{n} \frac{1}{k!} x^k.$$
Taylor's Theorem says
$$f(x) = P_n(x) + \frac{f^{(n+1)}(c)}{(n+1)!} x^{n+1} = P_n(x) + \frac{e^c}{(n+1)!} x^{n+1},$$
for some $c$ between 0 and $x$. If, for example, $x > 0$ and $n = 4$, we obtain (since $e^t$ increases with $t$) $1 < e^c < e^x$ and
$$P_4(x) + \frac{x^5}{5!} < e^x < P_4(x) + e^x \frac{x^5}{5!}.$$
For more applications of such approximation methods, see your calculus book.

39.1.
The remainder in Lagrange's form of Taylor's theorem could be written as
$$R_n(x) = \frac{f^{(n+1)}((1 - t)x + t x_0)}{(n+1)!} (x - x_0)^{n+1}, \quad \text{for some } t \in (0, 1).$$

39.2. (Taylor's formula with Cauchy form of remainder.) Let $I$ be an interval of the reals, and let $f : I \to \mathbb{R}$ and its first $n$ derivatives be continuous on $I$ and differentiable on the interior of $I$. Let $x_0 \in I$. Then for all $x \in I$,
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + R_n(x),$$
$$\text{where } R_n(x) = \frac{f^{(n+1)}(c)}{n!} (x - c)^n (x - x_0), \text{ for some } c \text{ between } x \text{ and } x_0.$$
Hint: use a slightly different form of auxiliary function.

39.3. (Taylor's formula with Schlömilch form of remainder.) Generalize both Cauchy's form and Lagrange's form of Taylor's Theorem, obtaining one for which the remainder involves $(x - x_0)$ and $(x - c)$ to different powers summing to $n + 1$. This general result was proved by Oskar Schlömilch in 1847.

39.4. (Young's form of Taylor's Theorem.) Let $f$ be $n$-times differentiable at a point $a$ of the interval $I$ and $P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k$. Then, for all $x \in I$,
$$f(x) = P_n(x) + R_n(x),$$
where $R_n(x)$ is of the form $\varepsilon(x)(x - a)^n$ with $\varepsilon(x) \to 0$ as $x \to a$. (One says $R_n(x)$ is $o((x - a)^n)$ as $x \to a$.)
Hint: One simply calculates the limit
$$\lim_{x \to a} \frac{f(x) - P_n(x)}{(x - a)^n}.$$
Use L'Hôpital's rule $n - 1$ times, possible since the hypothesis that $f$ is $n$-times differentiable at $a$ includes that it is $(n-1)$-times differentiable on a neighbourhood of $a$ in $I$. CAREFUL: $f$ is only $n$-times differentiable, and $f^{(n)}$ is not assumed continuous.

39.5. Let $f$ be $n$-times differentiable on the open interval $I$ containing the point $a$ and $f^{(k)}(a) = 0$ for $k = 1, \dots, n-1$, but $f^{(n)}(a) \neq 0$.
(a) If $n$ is even and $f^{(n)}(a) < 0$, then $f$ has a local maximum at $a$.
(b) If $n$ is even and $f^{(n)}(a) > 0$, then $f$ has a local minimum at $a$.
(c) If $n$ is odd, then $f$ has neither a local maximum nor a local minimum at $a$.

39.6. Show that if $x \in [0, 1]$, then
$$x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \leq \log(1 + x) \leq x - \frac{x^2}{2} + \frac{x^3}{3}.$$
Here $\log$ stands for $\log_e = \ln$.

39.7. Prove that the function
$$f : x \mapsto \begin{cases} \exp(-1/x^2), & x \neq 0 \\ 0, & x = 0 \end{cases}$$
has $f^{(n)}(0) = 0$, for all $n = 0, 1, 2, \dots$, so that the remainder in Taylor's theorem about 0 is the function $f$ itself.

40. CONVEX FUNCTIONS

Let $I$ be an interval of $\mathbb{R}$ and $f$ a real function whose domain contains $I$. Then $f$ is called convex (on $I$) if for all $a, b \in I$ and $0 < t < 1$,
$$f((1 - t)a + tb) \leq (1 - t) f(a) + t f(b). \tag{C0}$$
The function $f$ is called strictly convex if this holds with "$\leq$" replaced by "$<$" (for $a \neq b$). If we reverse the inequalities, we obtain the notions of concave and strictly concave function, respectively.

Since, for $a < b$, the points strictly between $a$ and $b$ are those of the form $x = (1 - t)a + tb$, where
$$t = \frac{x - a}{b - a}, \qquad 1 - t = \frac{b - x}{b - a},$$
(C0) becomes
$$f(x) \leq \frac{b - x}{b - a} f(a) + \frac{x - a}{b - a} f(b), \tag{C1}$$
for all $x$ between $a$ and $b$. Let $\ell(x)$ denote the right side of this inequality. Then, $\ell$ is a linear function whose value at $a$ is $f(a)$ and whose value at $b$ is $f(b)$, so it could also be written
$$\ell(x) = f(a) + \frac{f(b) - f(a)}{b - a} (x - a) \tag{L1}$$
$$\phantom{\ell(x)} = f(b) + \frac{f(b) - f(a)}{b - a} (x - b). \tag{L2}$$
Thus, $f$ is convex if and only if for each $a, b \in I$, the line segment joining the two points $(a, f(a))$ and $(b, f(b))$ is never below the graph of $f$.

Using the two representations (L1) and (L2) of the point $\ell(x)$, we see that, for $a < x < b$, the inequality (C1) can also be written as
$$\frac{f(x) - f(a)}{x - a} \leq \frac{f(b) - f(a)}{b - a} \tag{C2}$$
or as
$$\frac{f(b) - f(a)}{b - a} \leq \frac{f(b) - f(x)}{b - x}. \tag{C3}$$
Combining (C2) and (C3) gives also
$$\frac{f(x) - f(a)}{x - a} \leq \frac{f(b) - f(x)}{b - x}. \tag{C4}$$
But (C4) can also be rearranged to give (C1) back. Indeed, (C4) yields
$$f(x) = \frac{b - x}{b - a} f(x) + \frac{x - a}{b - a} f(x) \leq \frac{b - x}{b - a} f(a) + \frac{x - a}{b - a} f(b).$$
Hence, we obtain the following characterization.

40.1 Lemma. Let $I$ be an interval of $\mathbb{R}$ and $f : I \to \mathbb{R}$. Then $f$ is convex if and only if, for all $a, x, b \in I$ with $a < x < b$, one of the equivalent inequalities (C1), (C2), (C3), or (C4) holds.

40.2 Corollary. Every convex function on an open interval $I$ is continuous.
In fact, for each closed interval $[a, b] \subset I$, $f$ is Lipschitz on $[a, b]$.

Proof. Let $I$ be an open interval, $f$ convex on $I$. We actually show $f$ is Lipschitz on any closed interval $[a, b] \subset I$ with $a < b$. Choose $a', b' \in I$ such that $a' < a < b < b'$. Then, for distinct $x, y \in [a, b]$, we have, applying (C1)–(C4) several times,
$$\frac{f(a') - f(a)}{a' - a} \leq \frac{f(x) - f(y)}{x - y} \leq \frac{f(b) - f(b')}{b - b'}.$$
Thus, putting $K = \max\left\{ \left| \frac{f(a') - f(a)}{a' - a} \right|, \left| \frac{f(b) - f(b')}{b - b'} \right| \right\}$ yields
$$-K \leq \frac{f(x) - f(y)}{x - y} \leq K,$$
so that $|f(x) - f(y)| \leq K |x - y|$.

40.3 Corollary. Let $f$ be convex on $I$ and $a < x < b$, with $a, b \in I$.
(1) If $f(a) \leq f(x)$, then $f(x) \leq f(b)$.
(2) If $f(x) \geq f(b)$, then $f(a) \geq f(x)$.
(3) If $f(a) < f(x)$, then $f(x) < f(b)$.
(4) If $f(x) > f(b)$, then $f(a) > f(x)$.
In any case, $f(x) \leq \max\{f(a), f(b)\}$.

Proof. By (C4), if $f$ is convex and $a < x < b$, then $f(x) - f(a) \geq 0$ implies $f(b) - f(x) \geq 0$, which yields (1), and $f(b) - f(x) \leq 0$ implies $f(x) - f(a) \leq 0$, which yields (2). (3) and (4) are proved in the same way.

We will now see that every non-monotone convex function on an open interval decreases to a minimum and increases thereafter. Notice that there is no assumption involving differentiation here.

40.4 Theorem. Let $I$ be an open interval and $f$ be a convex function on $I$. Then, either $f$ is monotone, or there exists $x_0 \in I$ such that $f$ is decreasing on $\{x \in I : x \leq x_0\}$ and $f$ is increasing on $\{x \in I : x \geq x_0\}$.

Proof. Assume $f$ is not monotone. Then, we can find $a, z, b \in I$ with $a < z < b$ and either $f(a) < f(z)$ and $f(z) > f(b)$, or $f(a) > f(z)$ and $f(z) < f(b)$. The first possibility is excluded by the corollary; hence, the second must hold. But $f$ is continuous on the compact interval $[a, b]$, so has a minimum value at some $x_0$:
$$f(x_0) = \min\{f(x) : x \in [a, b]\}.$$
Since $f(x_0) \leq f(z)$, we have
$$f(a) > f(x_0) \quad \text{and} \quad f(x_0) < f(b).$$
From this, we actually have
$$f(x_0) = \min\{f(x) : x \in I\}.$$
For example, if $x < a$, we have $x < a < x_0$, and $f(a) > f(x_0)$, so by the corollary, $f(x) \geq f(a) > f(x_0)$.
A similar argument works for $x > b$.

Now, let $x_1 < x_2 \leq x_0$. Then $f(x_2) \geq f(x_0)$, so $f(x_1) \geq f(x_2)$; hence, $f$ is decreasing on $\{x \in I : x \leq x_0\}$. In the same way, $f$ is increasing on $\{x \in I : x \geq x_0\}$.

40.5 Theorem. If $I$ is an open interval and $f : I \to \mathbb{R}$ is differentiable on $I$, with $f'$ increasing on $I$ (in particular, if $f'' \geq 0$ on $I$), then $f$ is convex.

Proof. Let $f'$ be increasing and let $a, x, b \in I$ with $a < x < b$. Then the Mean Value Theorem yields $c_1 \in (a, x)$ and $c_2 \in (x, b)$ with
$$f'(c_1) = \frac{f(x) - f(a)}{x - a} \quad \text{and} \quad f'(c_2) = \frac{f(x) - f(b)}{x - b}.$$
Since $f'$ is increasing, $f'(c_1) \leq f'(c_2)$, so that inequality (C4) holds, for all such $a < x < b$. Thus, $f$ is convex. Since the condition $f'' \geq 0$ on $I$ entails $f'$ is increasing on $I$, we are done.

Recall that a function $f$ is left differentiable at $x \in I$ if $x$ is not a left endpoint of $I$ and its left derivative, defined by $f'_-(x) = \lim_{u \to x-} \frac{f(u) - f(x)}{u - x}$, exists (in $\mathbb{R}$), and $f$ is right differentiable at $x$ if $x$ is not a right endpoint of $I$ and $f'_+(x) = \lim_{u \to x+} \frac{f(u) - f(x)}{u - x}$ exists.

40.6 Theorem. Let $I$ be an open interval of $\mathbb{R}$ and let $f : I \to \mathbb{R}$ be convex. Then,
(1) $f$ is both left and right differentiable at each point $x \in I$, with
(2) $f'_-(x) \leq f'_+(x)$;
(3) each of $f'_-$ and $f'_+$ is an increasing function on $I$;
(4) $f$ is differentiable except at a countable number of points of $I$;
(5) if $[a, b] \subset I$, then for all $x, y \in [a, b]$,
$$|f(x) - f(y)| \leq M |x - y|, \quad \text{where } M = \max\{|f'_+(a)|, |f'_-(b)|\}.$$

Proof. Fix $x \in I$. Define for $u \neq x$,
$$\varphi(u) = \frac{f(u) - f(x)}{u - x}.$$
Then, $\varphi$ is an increasing function on $I \setminus \{x\}$. There are 3 cases to check, but they all come from (C2)–(C4) with different choices of the variables. For example, if $u_1 < u_2 < x$, replace $a, x, b$ by $u_1, u_2, x$ in (C3) to obtain
$$\varphi(u_1) = \frac{f(u_1) - f(x)}{u_1 - x} \leq \frac{f(u_2) - f(x)}{u_2 - x} = \varphi(u_2).$$
If $u_1 < x < u_2$, replace $a, x, b$ by $u_1, x, u_2$ in (C4) to get
$$\varphi(u_1) = \frac{f(u_1) - f(x)}{u_1 - x} \leq \frac{f(x) - f(u_2)}{x - u_2} = \varphi(u_2).$$
We'll leave the 3rd case for the reader.
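The monotonicity of the slope function $\varphi$ established above can be checked numerically. The following Python sketch does so for one convex function; the choice $f = \exp$, the base point $x = 0$, the sample grid and the helper name `slope` are illustrative assumptions.

```python
import math


def slope(f, x, u):
    """Return phi(u) = (f(u) - f(x)) / (u - x) for u != x."""
    return (f(u) - f(x)) / (u - x)


# Illustrative convex function (an assumption for this sketch): f = exp.
f = math.exp
x = 0.0

# Sample points on both sides of x, in increasing order, skipping u = x.
us = [k / 10 for k in range(-20, 21) if k != 0]
phis = [slope(f, x, u) for u in us]

# For a convex f the slopes phi(u) should be nondecreasing in u.
is_increasing = all(p <= q for p, q in zip(phis, phis[1:]))
print(is_increasing)
```

Here the slopes to the left of $x$ stay below $f'(0) = 1$ and those to the right stay above it, in line with the one-sided limits taken next in the proof.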
Now, if $u < x < v$, we have $\varphi(u) \le \varphi(v)$; so letting $u \to x-$ and $v \to x+$, we have
\[ \lim_{u \to x-} \varphi(u) = \sup_{u < x} \varphi(u) \le \inf_{v > x} \varphi(v) = \lim_{v \to x+} \varphi(v); \]
that is, $f'_-(x)$ and $f'_+(x)$ exist with $f'_-(x) \le f'_+(x)$, proving (1) and (2).

We now prove (3) in a strong form. Let $x_1 < x_2$ and choose $u$ with $x_1 < u < x_2$, obtaining, using (C4),
\[ f'_+(x_1) \le \frac{f(u) - f(x_1)}{u - x_1} \le \frac{f(u) - f(x_2)}{u - x_2} \le f'_-(x_2), \]
so
\[ f'_-(x_1) \le f'_+(x_1) \le f'_-(x_2) \le f'_+(x_2). \qquad (\ast) \]
Thus, both $f'_-$ and $f'_+$ are increasing.

Now, increasing functions on an interval have at most a countable number of discontinuities (Theorem 30.10). So to prove (4), we need only show that $f$ is differentiable at each point $x$ where $f'_-$ is continuous. But at such a point, $(\ast)$ yields for $u > x$,
\[ f'_-(x) \le f'_+(x) \le f'_-(u) \]
and letting $u \to x+$ gives
\[ f'_-(x) \le f'_+(x) \le f'_-(x); \]
so $f'_-(x) = f'_+(x)$, as required.

Now, to prove (5) let $a \le x, y \le b$ with $x \ne y$. Then, using consequences of (C1–C4) again,
\[ f'_+(a) \le \frac{f(x) - f(y)}{x - y} \le f'_-(b), \]
so putting $M = \max\{|f'_+(a)|, |f'_-(b)|\}$, we have
\[ -M \le \frac{f(x) - f(y)}{x - y} \le M; \]
that is, $\left| \frac{f(x) - f(y)}{x - y} \right| \le M$, so that (5) follows.

Finally, as we have noted, (5) says $f$ is Lipschitz on every compact subinterval of $I$, so is continuous in a strong way.

40.7 Warning. If the interval $I$ is not open, $f$ can be convex on $I$, without being continuous. For example, the function $f$ defined on $(-1, 1]$ by
\[ f(x) = \begin{cases} x^2, & x < 1 \\ 7, & x = 1 \end{cases} \]
is convex, with a jump discontinuity at $x = 1$.

Putting together the above results, we obtain the following commonly used characterizations.

40.8 Corollary. Let $I$ be an open interval.
(1) If $f$ is differentiable on $I$, then $f$ is convex iff $f'$ is increasing on $I$.
(2) If $f$ is twice differentiable on $I$, then $f$ is convex iff $f'' \ge 0$ on $I$.
(3) If $f$ is twice differentiable and $f''$ is never 0, then $f$ is (strictly) convex iff there exists $c \in I$ with $f''(c) > 0$.

Proof. (1) If $f$ is convex and differentiable on $I$, then $f' = f'_+ = f'_-$, so $f'$ is increasing by (3) of the theorem.
We already proved that if $f'$ is increasing, then $f$ is convex.
(2) If $f$ is twice differentiable, then $f'$ is increasing on $I$ iff $f'' \ge 0$ on $I$, so (1) applies.
(3) By the Darboux property of the derivative $(f')'$, since $f''$ is never 0, the positivity of $f''$ at one point implies positivity in the whole interval and (2) applies.

The one-sided derivatives of a convex function determine tangent lines. The graph of the function never falls below these lines.

40.9 Theorem. Let $I$ be an open interval of $\mathbb{R}$, $x_0 \in I$, and let $f : I \to \mathbb{R}$ be convex. Let $m$ be any real number with $f'_-(x_0) \le m \le f'_+(x_0)$. Then, the linear function $T : x \mapsto f(x_0) + m(x - x_0)$ satisfies
\[ T(x_0) = f(x_0) \quad\text{and}\quad T(x) \le f(x), \text{ for all } x \in I. \]

Proof. Of course for $x = x_0$, $T(x) = f(x_0)$. We have seen that for $x > x_0$,
\[ \frac{f(x) - f(x_0)}{x - x_0} \ge f'_+(x_0) \ge m. \]
So, $f(x) \ge f(x_0) + m(x - x_0) = T(x)$. Similarly, for $x < x_0$,
\[ \frac{f(x) - f(x_0)}{x - x_0} \le f'_-(x_0) \le m, \]
which again yields $f(x) \ge f(x_0) + m(x - x_0) = T(x)$, since $x - x_0 < 0$.

40.10 Corollary. Every convex function $f$ on an open interval $I$ is the pointwise supremum of linear functions:
\[ f(x_0) = \sup\{T(x_0) : T \text{ linear and } T \le f \text{ on } I\}. \]

Proof. If $T \le f$ on $I$ and $x_0 \in I$, then $T(x_0) \le f(x_0)$. So
\[ \sup\{T(x_0) : T \text{ linear and } T \le f \text{ on } I\} \le f(x_0). \]
On the other hand, the theorem shows that there exists a linear $T = T_{x_0}$ with $T \le f$ on $I$ and $T(x_0) = f(x_0)$, so we have equality.

Conversely, you can prove:

40.11 Theorem. If the real function $f$ is the pointwise supremum of a family of convex functions on the interval $I$, then $f$ is also convex.

40.1. Let $f$ be convex on $I$ and $a < x < b$, with $a, b \in I$. If $f(a) < f(x)$, then $f(x) < f(b)$. If $f(x) > f(b)$, then $f(a) > f(x)$.

40.2. Let $f$ be a convex function on the interval $I$. If $f$ has a local minimum at $x_0$, it is also a global minimum.

40.3. Give an example of a non-monotone convex function on an interval that has no minimum value.

40.4. Let $f$ be strictly convex on an interval $I$.
Then $f$ has at most one minimizer. If $f$ is not monotone on the interior of $I$, then $f$ has exactly one minimizer.

41. THE RIEMANN INTEGRAL

Here we introduce the Riemann integral using the original formulation of Riemann. The later one due to Darboux will arise naturally in the section on existence. You will have heard of Riemann sums from calculus.

In everything, if we don't say otherwise, $f$ is a real-valued function defined (at least) on a bounded interval $[a,b]$.

A Riemann partition of $[a,b]$ is a finite set of non-overlapping intervals
\[ P = \{I_1, I_2, \dots, I_n\} = \{[x_0, x_1], [x_1, x_2], \dots, [x_{n-1}, x_n]\} \]
obtained by subdividing the interval with points
\[ a = x_0 \le x_1 \le x_2 \le \dots \le x_n = b. \]
A tagged partition of $[a,b]$ is obtained from a partition $P = \{I_1, I_2, \dots, I_n\}$ by choosing points $t_i \in I_i$ (called tags):
\[ \pi = \{(t_1, I_1), (t_2, I_2), \dots, (t_n, I_n)\}. \]
The length of an interval $I = [u,v]$ is $\lambda(I) = v - u$. In this setting, if $I_i = [x_{i-1}, x_i]$ it is traditional to write $\Delta x_i$ for its length. The mesh of the partition $P$ or of the tagged partition $\pi$ is the length of the largest interval in it:
\[ \|\pi\| = \|P\| = \max_i \lambda(I_i) = \max_i (x_i - x_{i-1}) = \max_i \Delta x_i . \]
Notice that the sum of the lengths is $\sum_{i=1}^n \lambda(I_i) = \sum_i (x_i - x_{i-1}) = b - a$, the length of $[a,b]$.

Corresponding to each such tagged partition $\pi$ is a Riemann sum
\[ R(f, \pi) = \sum_{i=1}^n f(t_i)(x_i - x_{i-1}) = \sum_{(t,I) \in \pi} f(t)\lambda(I). \]
The function $f$ is called Riemann integrable over $[a,b]$ if a real number $r$ exists such that
\[ \lim_{\|\pi\| \to 0} R(f, \pi) = r, \]
in the sense that for all $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ |R(f, \pi) - r| < \varepsilon, \]
whenever $\pi$ is a tagged partition of $[a,b]$ with $\|\pi\| < \delta$.

Of course, $r$ is then referred to as the Riemann integral of $f$ over $[a,b]$; it is denoted $\int_a^b f = \int_a^b f(x)\,dx$.

One can show that if $f$ is integrable on $[a,b]$ then it is bounded on $[a,b]$ (see a sketch in the exercises). At first, I suggest you just accept it as part of the definition.

41.1 Example.
To see that you understand the definition, check that if $f$ is constantly $k$, then $\int_a^b f = \int_a^b f(x)\,dx = \int_a^b k\,dx = k(b - a)$, as expected.

As in the case of the other limit processes we have studied, the limit of the sum is the sum of the limits and order is preserved.

41.2 Theorem (linearity). The set of Riemann integrable functions on $[a,b]$ is a vector space on which the integral is a linear functional. Thus,
(1) If $f$ is integrable on $[a,b]$ and $k \in \mathbb{R}$, then $kf$ is integrable and $\int_a^b kf = k \int_a^b f$.
(2) If $f$ and $g$ are integrable on $[a,b]$ then so is $f + g$ and $\int_a^b (f + g) = \int_a^b f + \int_a^b g$.

Proof. Let us prove (2) and leave (1) to the reader.
Let $f$ and $g$ be Riemann integrable on $[a,b]$. Let $\varepsilon > 0$. Choose $\delta_1$ so that for each tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta_1$, $|R(f,\pi) - \int_a^b f| < \varepsilon/2$. Choose $\delta_2$ so that for each tagged partition $\pi$ with $\|\pi\| < \delta_2$, $|R(g,\pi) - \int_a^b g| < \varepsilon/2$. Let $\delta = \min\{\delta_1, \delta_2\}$. If $\pi$ is a tagged partition of $[a,b]$ with mesh $\|\pi\| < \delta$, then
\[ R(f + g, \pi) = \sum_{(t,I) \in \pi} (f(t) + g(t))\lambda(I) = R(f,\pi) + R(g,\pi). \]
Thus,
\[ \left| R(f + g, \pi) - \left( \int_a^b f + \int_a^b g \right) \right| \le \left| R(f,\pi) - \int_a^b f \right| + \left| R(g,\pi) - \int_a^b g \right| < \varepsilon. \]
This shows that $f + g$ is integrable on $[a,b]$ with $\int_a^b (f + g) = \int_a^b f + \int_a^b g$.

41.3 Theorem (monotonicity). If $f$ and $g$ are integrable on $[a,b]$ and $f \le g$, then $\int_a^b f \le \int_a^b g$.

Proof. This amounts to a “preservation of inequalities in limits” result. The method is the same as usual. Let $f \le g$ on $[a,b]$ with $\int_a^b f = r$ and $\int_a^b g = s$. Suppose $r > s$. Choose $\varepsilon = (r - s)/2$. Using the definition, choose $\delta > 0$ so that for $\|\pi\| < \delta$, both $|R(f,\pi) - r| < \varepsilon$ and $|R(g,\pi) - s| < \varepsilon$. Then, choose any tagged partition $\pi$ of $[a,b]$ with mesh less than $\delta$. Then, $R(g,\pi) < s + \varepsilon = r - \varepsilon < R(f,\pi)$, yet
\[ R(f,\pi) = \sum_{(t,I) \in \pi} f(t)\lambda(I) \le \sum_{(t,I) \in \pi} g(t)\lambda(I) = R(g,\pi), \]
a contradiction.
To find such a $\pi$, just choose $n$ with $(b - a)/n < \delta$ and create equally spaced division points: $x_i = a + (b - a)i/n$, for each $i$. Any choice of the tags will do.

41.4 Theorem (additivity over intervals).
If $a < c < b$ and $f$ is integrable on $[a,c]$ and on $[c,b]$, then $f$ is integrable on $[a,b]$ with $\int_a^b f = \int_a^c f + \int_c^b f$.

(We will soon show also that if $f$ is integrable over $[a,b]$ then it is integrable over each of $[a,c]$ and $[c,b]$, so the result applies. See Theorem 42.6.)

Proof. Since $f$ is integrable on $[a,c]$ and on $[c,b]$, it is bounded on each, hence is bounded on $[a,b]$. Say $|f(t)| < K < +\infty$, for all $t \in [a,b]$. Let $\varepsilon > 0$. Since $f$ is integrable on $[a,c]$ we can choose $\delta_1 > 0$ such that for each tagged partition $\pi_1$ of $[a,c]$ with $\|\pi_1\| < \delta_1$, $|R(f,\pi_1) - \int_a^c f| < \varepsilon/3$. Similarly, we can choose $\delta_2 > 0$ such that for each tagged partition $\pi_2$ of $[c,b]$ with $\|\pi_2\| < \delta_2$, $|R(f,\pi_2) - \int_c^b f| < \varepsilon/3$. Choose
\[ \delta = \min\{\delta_1, \delta_2, \varepsilon/6K\}. \]
Now, let $\pi = \{(t_1, I_1), \dots, (t_n, I_n)\}$ be a tagged partition of $[a,b]$ with $\|\pi\| < \delta$ where the subdivision points are $x_0, \dots, x_n$ arranged in increasing order. We create tagged partitions of $[a,c]$ and $[c,b]$ as follows. Choose $k$ such that $c \in I_k = [x_{k-1}, x_k]$ and put
\[ \pi_1 = \{(t_1, I_1), \dots, (t_{k-1}, I_{k-1}), (c, [x_{k-1}, c])\} \]
and
\[ \pi_2 = \{(c, [c, x_k]), (t_{k+1}, I_{k+1}), \dots, (t_n, I_n)\}. \]
Thus, we have taken the tagged intervals $(t_i, I_i)$ to the left of $c$ with one more obtained by splitting one with $c$ in it, and did a similar thing to the other side. The point $c$ becomes a tag for 2 intervals. Notice that $\|\pi_1\| < \delta_1$ and $\|\pi_2\| < \delta_2$ and since $f(c)(c - x_{k-1}) + f(c)(x_k - c) = f(c)(x_k - x_{k-1}) = f(c)\lambda(I_k)$,
\begin{align*}
R(f,\pi_1) + R(f,\pi_2) &= \sum_{i<k} f(t_i)\lambda(I_i) + f(c)(c - x_{k-1}) + f(c)(x_k - c) + \sum_{i>k} f(t_i)\lambda(I_i) \\
&= \sum_{i=1}^n f(t_i)\lambda(I_i) + f(c)\lambda(I_k) - f(t_k)\lambda(I_k) \\
&= R(f,\pi) + (f(c) - f(t_k))\lambda(I_k).
\end{align*}
Moreover, $|f(t_k) - f(c)| < 2K$, so
\begin{align*}
\left| R(f,\pi) - \left( \int_a^c f + \int_c^b f \right) \right| &\le \left| R(f,\pi_1) - \int_a^c f \right| + \left| R(f,\pi_2) - \int_c^b f \right| + |(f(t_k) - f(c))\lambda(I_k)| \\
&< \varepsilon/3 + \varepsilon/3 + 2K\varepsilon/6K = \varepsilon.
\end{align*}

41.5 Lemma (Sequential Criterion). A function $f$ is integrable on $[a,b]$, with $\int_a^b f = r$ iff for any choice of a sequence $(\pi_m)$ of tagged partitions with $\|\pi_m\| \to 0$, we have $\lim_m R(f, \pi_m) = r$.

Proof.
This should be a routine exercise by now.

The sequential criterion can be used to prove some of the earlier results a little more quickly.

Just as for limits of sequences, proving that a function is integrable directly from Riemann's definition can be tricky. What one needs is a good guess at the value of the integral.

41.6 Example. Let $f(x) = x$, for $x \in [a,b]$. We expect $\int_a^b f = \int_a^b x\,dx = (b^2 - a^2)/2$. Let's try to prove it. For a tagged partition $\pi = \{(t_1, [x_0, x_1]), \dots, (t_n, [x_{n-1}, x_n])\}$, the Riemann sum is
\[ R(f,\pi) = \sum_{i=1}^n t_i (x_i - x_{i-1}), \]
but it is difficult to see how this will converge to the expected limit as $\|\pi\| \to 0$. However, if $\|\pi\| < \delta$, and we create a new tagged partition $\pi^*$ by changing the tags $t_i$ to some other $t_i^* \in [x_{i-1}, x_i]$, we obtain
\begin{align*}
|R(f,\pi) - R(f,\pi^*)| &= \left| \sum_{i=1}^n t_i (x_i - x_{i-1}) - \sum_{i=1}^n t_i^* (x_i - x_{i-1}) \right| \\
&\le \sum_{i=1}^n |t_i - t_i^*| (x_i - x_{i-1}) \le \sum_{i=1}^n \delta (x_i - x_{i-1}) = \delta(b - a).
\end{align*}
Take $t_i^*$ to be the midpoint of the interval $[x_{i-1}, x_i]$: $t_i^* = (x_{i-1} + x_i)/2$. Then,
\[ R(f,\pi^*) = \sum_{i=1}^n t_i^* (x_i - x_{i-1}) = \frac{1}{2} \sum_{i=1}^n (x_{i-1} + x_i)(x_i - x_{i-1}) = \frac{1}{2} \sum_{i=1}^n (x_i^2 - x_{i-1}^2) = \frac{1}{2}(b^2 - a^2). \]
Thus, $\|\pi\| < \delta$ implies
\[ \left| R(f,\pi) - \tfrac{1}{2}(b^2 - a^2) \right| \le \delta(b - a), \]
from which we see that $\int_a^b f = \frac{1}{2}(b^2 - a^2)$.

What made this proof work? It was the fact that when the mesh of $\pi$ is small enough, all the Riemann sums $R(f,\pi^*)$, obtained by changing the tags of $\pi$, are close to $R(f,\pi)$ and that one can always choose the new tags to give the same value. Let us apply the same method to prove a special case of part of the Fundamental Theorem of Calculus.

41.7 Theorem (integrating a continuous derivative). Let $f$ be a continuous real function on the interval $[a,b]$. If there exists a function $F$ continuous on $[a,b]$ and differentiable on $(a,b)$ with $F' = f$ there, then $f$ is integrable with $\int_a^b f = F(b) - F(a)$.

Proof. Let $\varepsilon > 0$. Since $f$ is continuous on $[a,b]$, it is uniformly continuous, so we can choose $\delta > 0$ so that $|t - t^*| < \delta$ implies $|f(t) - f(t^*)| < \varepsilon$.
Let $\pi = \{(t_1, [x_0, x_1]), \dots, (t_n, [x_{n-1}, x_n])\}$ be a tagged partition with mesh $\|\pi\| < \delta$. If we create a new partition $\pi^*$ by choosing new tags $t_i^*$ in $(x_{i-1}, x_i)$, we have
\begin{align*}
|R(f,\pi) - R(f,\pi^*)| &= \left| \sum_{i=1}^n f(t_i)(x_i - x_{i-1}) - \sum_{i=1}^n f(t_i^*)(x_i - x_{i-1}) \right| \\
&\le \sum_{i=1}^n |f(t_i) - f(t_i^*)| (x_i - x_{i-1}) < \sum_{i=1}^n \varepsilon (x_i - x_{i-1}) = \varepsilon(b - a).
\end{align*}
By the Mean Value Theorem, we could choose the tags $t_i^*$ with
\[ f(t_i^*)(x_i - x_{i-1}) = F'(t_i^*)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}). \]
Sum this over $i$, using the telescoping property, and get
\[ R(f,\pi^*) = \sum_{i=1}^n f(t_i^*)(x_i - x_{i-1}) = F(b) - F(a). \]
Thus, $\|\pi\| < \delta$ implies
\[ |R(f,\pi) - (F(b) - F(a))| < \varepsilon(b - a). \]
Since $\varepsilon$ was arbitrary, this tells us $\int_a^b f = F(b) - F(a)$.

41.8 Remark. In the previous result, we actually only need assume $f$ is integrable rather than continuous. We are in a position to make the required minor modification of the proof just given, but we have chosen to postpone it to after the existence criteria of the next section so that it may be naturally grouped with related results. See Theorem 43.2.

41.1. If $S$ is a finite subset of $[a,b]$, $f : [a,b] \to \mathbb{R}$, and $f(x) = 0$ if $x \notin S$, then $f$ is integrable with $\int_a^b f = 0$.

41.2. If $f = g$ except at a finite number of points in $[a,b]$, then $f$ is integrable on $[a,b]$ iff $g$ is so, and then $\int_a^b f = \int_a^b g$.

41.3. Use the Riemann definition of integral to prove that the function $f : [0,1] \to \mathbb{R}$, defined by
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q} \end{cases} \]
is not integrable. [Every interval of positive length contains both rationals and irrationals that can be used as tags.]

41.4. Use the sequential criterion to deduce linearity of the integral from the usual limit theorems for sequences of reals.

41.5. Use the sequential criterion to deduce monotonicity of the integral from preservation of order in limits of sequences of reals.

41.6. If $f$ is Riemann integrable on $[a,b]$, then $f$ is bounded on $[a,b]$. Sketch: first check that there exists $\delta > 0$ such that for tagged partitions $\pi$ with $\|\pi\| < \delta$, $|R(f,\pi)| < 1 + |\int_a^b f| = M$.
Fix one particular partition $P = \{I_1, \dots, I_n\}$ with $\|P\| < \delta$. If $f$ were unbounded, it would be unbounded on one of the intervals $I_i$, say for $i = k$. Fix the tags on all the other intervals, then choose the tag for the $k$th so that $f(t_k)\lambda(I_k)$ overpowers the rest of the sum, making $|R(f,\pi)| > M$.

42. EXISTENCE OF THE RIEMANN INTEGRAL

The upper and lower Darboux sums for $f$ with respect to the partition $P = \{I_1, \dots, I_n\}$ are
\[ U(f,P) = \sum_{i=1}^n \sup f(I_i)\,\lambda(I_i) \quad\text{and}\quad L(f,P) = \sum_{i=1}^n \inf f(I_i)\,\lambda(I_i). \]
Notice that $f(I_i) = \{f(t) : t \in I_i\}$ is a set of numbers and the upper sum is defined in terms of the sup of this set.

It is worth noting immediately a connection between the Riemann sums and the Darboux sums: Let us write $\pi \sim P$ if $\pi$ is obtained from $P$ by tagging its intervals. Since, for each tag $t_i \in I_i$,
\[ \inf f(I_i) \le f(t_i) \le \sup f(I_i), \]
we see that each Riemann sum $R(f,\pi)$ satisfies
\[ L(f,P) \le R(f,\pi) \le U(f,P) \]
and since the tags $t_1, \dots, t_n$ move independently, you can check that in fact, for each partition $P$ of $[a,b]$,
\[ L(f,P) = \inf_{\pi \sim P} R(f,\pi) \quad\text{and}\quad U(f,P) = \sup_{\pi \sim P} R(f,\pi). \qquad (\ast) \]
If $P$ and $Q$ are partitions of the same interval, one says that $Q$ is finer than $P$, or $Q$ is a refinement of $P$, if each interval of $Q$ comes from subdividing an interval of $P$. The supremum norm of the function $f$ on $[a,b]$ is $\|f\| = \sup_{x \in [a,b]} |f(x)|$ and we recall that $\|P\|$ denotes the mesh of the partition $P$ (the maximum length of the intervals of $P$).

When we refine a partition, the lower sums go up and the upper ones come down, but not by much.

42.1 Lemma (Comparison of Darboux sums). Let $f : [a,b] \to \mathbb{R}$, and let $P$ and $Q$ be partitions of $[a,b]$, with $Q$ finer than $P$.
(a) Then,
\[ L(f,P) \le L(f,Q) \le U(f,Q) \le U(f,P). \]
(b) If $Q$ is obtained from $P$ by inserting $N$ new division points, then
\[ U(f,P) - U(f,Q) \le 2N\|f\|\|P\| \quad\text{and}\quad L(f,Q) - L(f,P) \le 2N\|f\|\|P\|. \]

Proof.
We may assume that $P = \{I_1, \dots, I_n\}$ and that $Q$ is obtained by dividing one interval $I_k$ using $x^*$, where $x^*$ is not one of the subdivision points of $P$. The general case follows by a simple induction. We work with the lower sums. The upper sums behave similarly. Say, $x^* \in I_k = [x_{k-1}, x_k]$ and $J_1 = [x_{k-1}, x^*]$ and $J_2 = [x^*, x_k]$ are the new intervals produced. Then,
\[ L(f,P) = \sum_{i=1}^n \inf f(I_i)\lambda(I_i) = \sum_{i \ne k} \inf f(I_i)\lambda(I_i) + \inf f(I_k)\lambda(I_k) \]
and the lower sum for $Q$ is the same, except that the $I_k$ term is replaced by two new ones
\[ \inf f(J_1)\lambda(J_1) + \inf f(J_2)\lambda(J_2). \]
Now, $f(J_1) \subset f(I_k)$, so $\inf f(J_1) \ge \inf f(I_k)$, and similarly, $\inf f(J_2) \ge \inf f(I_k)$, so
\begin{align*}
\inf f(J_1)\lambda(J_1) + \inf f(J_2)\lambda(J_2) &\ge \inf f(I_k)\lambda(J_1) + \inf f(I_k)\lambda(J_2) \\
&= \inf f(I_k)(\lambda(J_1) + \lambda(J_2)) \\
&= \inf f(I_k)\lambda(I_k).
\end{align*}
When we add in the terms with $i \ne k$, we obtain
\[ L(f,Q) \ge L(f,P), \]
which proves (a). To prove (b), just notice that all the values of $f$ are between $-\|f\|$ and $\|f\|$, so
\begin{align*}
L(f,Q) - L(f,P) &= \inf f(J_1)\lambda(J_1) + \inf f(J_2)\lambda(J_2) - \inf f(I_k)\lambda(I_k) \\
&\le \|f\|\lambda(J_1) + \|f\|\lambda(J_2) + \|f\|\lambda(I_k) \\
&= 2\|f\|\lambda(I_k) \le 2\|f\|\|P\|,
\end{align*}
as required.

42.2 Corollary. If $P$ and $Q$ are partitions of $[a,b]$ and $f : [a,b] \to \mathbb{R}$, then $L(f,P) \le U(f,Q)$.

Proof. If $P$ and $Q$ are any partitions of $[a,b]$, then they have a common refinement $P^*$ obtained by using the division points of both: $P^* = \{I \cap J : I \in P, J \in Q\}$. Then,
\[ L(f,P) \le L(f,P^*) \le U(f,P^*) \le U(f,Q). \]
Thus,
\[ L(f,P) \le U(f,Q), \]
for all partitions $P$ and $Q$ of $[a,b]$.

Now, the Corollary says that every lower sum is at most every upper sum. By completeness of the reals, there must be at least one number in between. The smallest and largest such numbers are
\[ \underline{\int_a^b} f = \sup_P L(f,P) \quad\text{and}\quad \overline{\int_a^b} f = \inf_P U(f,P), \]
where it is understood that $P$ is running over all possible partitions of $[a,b]$. These are known as the lower integral and upper integral, respectively.
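The behaviour of the Darboux sums under refinement can be checked numerically. The following is only a minimal sketch under stated assumptions: the helper name `darboux_sums_increasing` and the test function $f(x) = x^2$ on $[0,1]$ are illustrative choices, not part of these notes, and reading the infimum and supremum off the endpoints of each piece is valid only because $f$ is assumed increasing.

```python
from fractions import Fraction


def darboux_sums_increasing(f, a, b, n):
    """Return (L, U) for an increasing f on the uniform partition of [a, b] into n pieces.

    Assumption (not checked): f is increasing, so on each piece [x_{i-1}, x_i]
    the infimum of f is f(x_{i-1}) and the supremum is f(x_i).
    """
    a, b = Fraction(a), Fraction(b)
    # Division points x_0 = a, ..., x_n = b, kept as exact rationals.
    points = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(f(points[i - 1]) * (points[i] - points[i - 1]) for i in range(1, n + 1))
    upper = sum(f(points[i]) * (points[i] - points[i - 1]) for i in range(1, n + 1))
    return lower, upper


def square(x):
    """Illustrative integrand f(x) = x^2, increasing on [0, 1]."""
    return x * x


if __name__ == "__main__":
    # Each doubling of n refines the previous partition.
    for n in (1, 2, 4, 8, 16):
        lower, upper = darboux_sums_increasing(square, 0, 1, n)
        print(n, lower, upper, upper - lower)
```

Because each doubling of `n` refines the previous partition, the printed lower sums should not decrease and the upper sums should not increase, in line with Lemma 42.1(a). For this particular increasing function the gap telescopes to $(f(1) - f(0))/n = 1/n$.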
We will see in the next result that a function $f$ is Riemann integrable if and only if these are equal (that is, if and only if there is only one number between all the lower sums and the upper sums), and that number is the Riemann integral.

If $I$ is a subinterval of $[a,b]$, the oscillation of $f$ on $I$ is
\[ \omega f(I) = \sup_{s,t \in I} |f(s) - f(t)| = \sup f(I) - \inf f(I). \]
If $P$ is a partition of $[a,b]$, then
\[ U(f,P) - L(f,P) = \sum_{I \in P} \omega f(I)\lambda(I). \]
Let us denote this by $\Omega(f,P)$. Notice that if $Q$ is a refinement of $P$, then upper sums come down and lower sums go up and
\[ \Omega(f,Q) = U(f,Q) - L(f,Q) \le U(f,P) - L(f,P) = \Omega(f,P). \]

42.3 Theorem. Let $f$ be a bounded function on $[a,b]$ to $\mathbb{R}$. Then equivalent are:
(1) $f$ is Riemann integrable;
(2) for each $\varepsilon > 0$, there exists a partition $P$ of $[a,b]$ with $\Omega(f,P) < \varepsilon$;
(3) there is a unique number $D$ such that for every partition $P$ of $[a,b]$,
\[ L(f,P) \le D \le U(f,P); \]
namely, $\underline{\int_a^b} f = D = \overline{\int_a^b} f = \int_a^b f$.

We will refer to (2) as the Basic Integrability Criterion (BIC) and to (3) as Darboux's characterization.

Proof. Assume first that BIC holds. Let $D$ be a fixed number such that for every partition $P$ of $[a,b]$,
\[ L(f,P) \le D \le U(f,P). \]
For example, take $D$ to be $\underline{\int_a^b} f$ or $\overline{\int_a^b} f$. We will prove that $\int_a^b f = D$, establishing both (1) and (3).
Let $\varepsilon > 0$ and choose a partition $P_\varepsilon$ of $[a,b]$ such that
\[ \Omega(f,P_\varepsilon) < \varepsilon. \]
Let $N$ be the number of intervals in $P_\varepsilon$. Choose $\delta > 0$ so that
\[ 4N\|f\|\delta < \varepsilon. \]
Let $\pi$ be any tagged partition with $\|\pi\| < \delta$. Let $P = \{I_1, \dots, I_n\}$ be the corresponding partition without the tags and let $Q$ be the common refinement of $P_\varepsilon$ and $P$, obtained by using the division points of both.
Then,
\[ L(f,Q) \le D \le U(f,Q), \]
and $\Omega(f,Q) \le \Omega(f,P_\varepsilon)$, so
\[ \Omega(f,Q) < \varepsilon. \]
Moreover, fewer than $N$ points are inserted in $P$ to obtain $Q$ so
\[ U(f,P) - U(f,Q) \le 2N\|f\|\|P\| \le 2N\|f\|\delta < \varepsilon/2 \]
and
\[ L(f,Q) - L(f,P) \le 2N\|f\|\|P\| \le 2N\|f\|\delta < \varepsilon/2. \]
Putting this together with the fact that $L(f,P) \le R(f,\pi) \le U(f,P)$, we see that
\[ L(f,Q) - \varepsilon/2 \le R(f,\pi) \le U(f,Q) + \varepsilon/2. \]
Thus, both $R(f,\pi)$ and $D$ are in the interval
\[ [L(f,Q) - \varepsilon/2,\ U(f,Q) + \varepsilon/2]. \]
This interval has length $U(f,Q) - L(f,Q) + \varepsilon = \Omega(f,Q) + \varepsilon < 2\varepsilon$, so
\[ |R(f,\pi) - D| < 2\varepsilon. \]
We have shown that for all $\varepsilon > 0$, there exists $\delta > 0$ so that for every tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta$,
\[ |R(f,\pi) - D| < 2\varepsilon, \]
so $f$ is Riemann integrable with $\int_a^b f = D$.

Now, suppose that $f$ is Riemann integrable and let $\varepsilon > 0$. Choose $\delta > 0$ so that for every tagged partition $\pi$ of $[a,b]$ with $\|\pi\| < \delta$,
\[ \left| R(f,\pi) - \int_a^b f \right| \le \varepsilon/4. \]
Fix any partition $P$ with $\|P\| < \delta$. Each Riemann sum $R(f,\pi)$ with $\pi$ obtained by tagging $P$ is thus in the same closed interval of length $\varepsilon/2$. Since the upper sum $U(f,P)$ is the supremum of such Riemann sums and $L(f,P)$ is the infimum of them, $U(f,P)$ and $L(f,P)$ are also in that interval, so
\[ \Omega(f,P) = U(f,P) - L(f,P) \le \varepsilon/2 < \varepsilon \]
and (2) holds.

We have left to prove that (3) implies (2). So suppose (3) holds; that is, suppose $D = \underline{\int_a^b} f = \overline{\int_a^b} f$. Let $\varepsilon > 0$. Since $\overline{\int_a^b} f = \inf_P U(f,P)$, we can choose a partition $P_1$ of $[a,b]$ with $U(f,P_1) < D + \varepsilon/2$. Similarly, we choose a partition $P_2$ with $L(f,P_2) > D - \varepsilon/2$. Take $P$ to be a common refinement of $P_1$ and $P_2$ so that
\[ U(f,P) < D + \varepsilon/2 \quad\text{and}\quad L(f,P) > D - \varepsilon/2 \]
and hence
\[ \Omega(f,P) = U(f,P) - L(f,P) < \varepsilon, \]
as required.

42.4 Example. Let $g$ be the indicator function of the rationals on $[0,1]$:
\[ g(x) = \begin{cases} 1, & \text{if } x \text{ is rational} \\ 0, & \text{if } x \text{ is irrational.} \end{cases} \]
We will see that $g$ is not Riemann integrable. Indeed, let $P = \{I_1, \dots, I_n\}$ be a partition of $[0,1]$. We may assume that each interval $I_i$ is of positive length, since the others do not contribute to the relevant sums.
Each $I_i$ contains both a rational and an irrational, so the oscillation of $g$ on $I_i$ is $\omega g(I_i) = \sup_{s,t \in I_i} |g(s) - g(t)| = 1 - 0 = 1$. Thus,
\[ \Omega(g,P) = \sum_{i=1}^n 1 \cdot \lambda(I_i) = 1. \]
Since $P$ was arbitrary, this shows the Basic Integrability Criterion is not satisfied; hence, $g$ is not integrable.

We can also look at this in terms of upper and lower integrals.
\[ U(g,P) = \sum_{i=1}^n 1 \cdot (x_i - x_{i-1}) = x_n - x_0 = 1, \qquad L(g,P) = \sum_{i=1}^n 0 \cdot (x_i - x_{i-1}) = 0. \]
Since $P$ was an arbitrary partition of $[0,1]$ we have $\overline{\int_0^1} g = 1$ and $\underline{\int_0^1} g = 0$, so $g$ is not integrable.

42.5 Example. Let $f(x) = x^2$, defined for $x \in [0,1]$. Consider the partition $P$ determined by $\{x_0, x_1, \dots, x_n\}$ where $x_i = \frac{i}{n}$, for $i = 0, \dots, n$. Then, for each $i$,
\[ \sup f([x_{i-1}, x_i]) = x_i^2 = \left( \frac{i}{n} \right)^2, \]
so
\[ U(f,P) = \sum_{i=1}^n \sup f([x_{i-1}, x_i])(x_i - x_{i-1}) = \sum_{i=1}^n \frac{i^2}{n^2} \cdot \frac{1}{n} = \frac{n(n+1)(2n+1)}{6n^3} \]
and hence, since this is one of the sums in the definition of $\overline{\int_0^1} f$,
\[ \overline{\int_0^1} f \le \frac{n(n+1)(2n+1)}{6n^3} . \]
(The infimum of a set is a lower bound for it.) Since the left side does not depend on $n$, we may take a limit and get
\[ \overline{\int_0^1} f \le \frac{2}{6} = \frac{1}{3} . \]
Similarly, we find
\[ L(f,P) = \sum_{i=1}^n \inf f([x_{i-1}, x_i])(x_i - x_{i-1}) = \sum_{i=1}^n \frac{(i-1)^2}{n^2} \cdot \frac{1}{n} = \frac{(n-1)(n)(2(n-1)+1)}{6n^3}, \]
so
\[ \underline{\int_0^1} f \ge \frac{(n-1)(n)(2(n-1)+1)}{6n^3} \]
and in the limit
\[ \underline{\int_0^1} f \ge \frac{1}{3} . \]
Thus,
\[ \frac{1}{3} \le \underline{\int_0^1} f \le \overline{\int_0^1} f \le \frac{1}{3}, \]
so $f$ is Riemann integrable with $\int_0^1 f = \frac{1}{3}$ by Darboux's characterization.

Here is a general example of the use of the Basic Integrability Criterion to show integrability, when we don't know the value of the integral.

42.6 Theorem (integrability over subintervals). If $f$ is integrable over $[a,b]$ and $[c,d] \subset [a,b]$ then $f$ is integrable over $[c,d]$.

Proof. We use the Basic Integrability Criterion. Notice that $J \subset I$ implies $\omega f(J) \le \omega f(I)$, since this amounts to taking supremum over a smaller set. Let $\varepsilon > 0$. Use the fact that $f$ is integrable on $[a,b]$ to choose a partition $P_1$ of $[a,b]$ with $\Omega(f,P_1) < \varepsilon$. Let $P$ be the partition of $[c,d]$ obtained by intersecting each $I \in P_1$ with $[c,d]$.
Then
\[ \Omega(f,P) = \sum_{I \in P_1} \omega f(I \cap [c,d])\lambda(I \cap [c,d]) \le \sum_{I \in P_1} \omega f(I)\lambda(I) = \Omega(f,P_1) < \varepsilon. \]
Thus, $f$ also satisfies the BIC on $[c,d]$, as required.

42.7 Theorem. If $f$ is integrable on $[a,b]$ then so is $|f|$ and $\left| \int_a^b f \right| \le \int_a^b |f|$.

Proof. Let $f$ be integrable. If we can show $|f|$ integrable, we can use monotonicity:
\[ -|f| \le f \le |f|; \]
so noting that $-\int_a^b |f| = \int_a^b (-|f|)$, we obtain
\[ -\int_a^b |f| \le \int_a^b f \le \int_a^b |f|, \]
or in other words
\[ \left| \int_a^b f \right| \le \int_a^b |f|. \]
To prove that $|f|$ is integrable, we use the Basic Criterion. Let $\varepsilon > 0$. Choose, by the BIC for $f$, a partition $P$ of $[a,b]$ with
\[ \sum_{I \in P} \omega f(I)\lambda(I) < \varepsilon. \]
For each $I$, we have for $s, t \in I$
\[ \big| |f(t)| - |f(s)| \big| \le |f(t) - f(s)|. \]
Then
\[ \omega|f|(I) = \sup_{s,t} \big| |f(t)| - |f(s)| \big| \le \sup_{s,t} |f(t) - f(s)| = \omega f(I). \]
So multiplying by $\lambda(I)$ and summing gives
\[ \sum_I \omega|f|(I)\lambda(I) \le \sum_I \omega f(I)\lambda(I) < \varepsilon; \]
hence $|f|$ satisfies the Basic Criterion for integrability, as required.

In the above result we have shown integrability implies absolute integrability. The converse fails, as the reader can show.

Now we will apply the Basic Integrability Criterion to establish integrability of two of our favourite kinds of functions.

42.8 Theorem. Let $f$ be a real function on $[a,b]$.
(1) If $f$ is monotone, then $f$ is integrable on $[a,b]$.
(2) If $f$ is continuous, then $f$ is integrable on $[a,b]$.

Proof. We show in each case that the Basic Integrability Criterion is satisfied. We see that for any partition $P$ of $[a,b]$ determined by $x_0, x_1, \dots, x_n$,
\[ \Omega(f,P) = \sum_{i=1}^n \omega f([x_{i-1}, x_i])(x_i - x_{i-1}), \]
and we will use the hypotheses to make this small. In the first case this will be done by making the $\Delta x_i = x_i - x_{i-1}$ small enough to overpower the total change in $f$, and in the second by making the $\omega f([x_{i-1}, x_i])$ small enough to overpower the total change in $x$.

(1) We assume without loss of generality that $f$ is increasing. The decreasing case is similar. Let $\varepsilon > 0$.
Choose $\delta > 0$ such that $(f(b) - f(a))\delta < \varepsilon$ and then a partition $P$ determined by $\{x_0, \dots, x_n\}$ with $\|P\| < \delta$. Since $f$ is increasing, for each $i$, $f(x_i)$ is the maximum value of $f$ on $I_i = [x_{i-1}, x_i]$ and $f(x_{i-1})$ is the minimum value. Hence, $\omega f([x_{i-1}, x_i]) = f(x_i) - f(x_{i-1})$ and
\[ \Omega(f,P) = \sum_{i=1}^n (f(x_i) - f(x_{i-1}))(x_i - x_{i-1}) \le \sum_{i=1}^n (f(x_i) - f(x_{i-1}))\delta = (f(b) - f(a))\delta < \varepsilon. \]
Thus, the Basic Integrability Criterion is satisfied and $f$ is integrable.

(2) Suppose $f$ is continuous on $[a,b]$. Then $f$ is uniformly continuous. Let $\varepsilon > 0$. Then there exists $\delta > 0$ such that $|s - t| < \delta$ implies $|f(s) - f(t)| < \varepsilon/(b - a)$. Choose such a $\delta$ and then choose any partition $P$ with $\|P\| < \delta$. Then, for all $I \in P$, $s, t \in I$ implies $|f(s) - f(t)| < \varepsilon/(b - a)$, so $\omega f(I) \le \varepsilon/(b - a)$ and
\[ \Omega(f,P) = \sum_{i=1}^n \omega f([x_{i-1}, x_i])(x_i - x_{i-1}) \le \sum_{i=1}^n \frac{\varepsilon}{b - a}(x_i - x_{i-1}) = \varepsilon. \]
Since $\varepsilon$ was arbitrary, this again shows the Basic Integrability Criterion is satisfied, so $f$ is integrable.

Continuous and monotone functions are by no means the only ones that are integrable. Changing the value of a function on a finite number of points, for example, can destroy both properties, but leaves the function integrable with the same integral. Right now, if you haven't already done so, you should prove the special case:

42.9 Theorem. If $S$ is a finite subset of $[a,b]$, $f : [a,b] \to \mathbb{R}$, and $f(x) = 0$ if $x \notin S$, then $f$ is integrable with $\int_a^b f = 0$.

Here is a more dramatic example.

42.10 Example. The Dirichlet function, defined on $[0,1]$ by
\[ f(x) = \begin{cases} \frac{1}{q}, & \text{if } x = \frac{p}{q} \text{ is rational in lowest terms} \\ 0, & \text{otherwise} \end{cases} \]
is integrable with integral 0.

The point is that we have shown elsewhere in the course (Example 30.4) that this function is discontinuous at all rationals (and continuous at all irrationals) of $[0,1]$ and it is certainly not monotone. Nevertheless, it is integrable.

Proof. Since $f \ge 0$, $\underline{\int_0^1} f \ge 0$, so it is the upper integral that is of interest.
Now, let $\varepsilon > 0$.
Since there are only finitely many $q$ with $1/q > \varepsilon$, there are only finitely many rationals in $[0,1]$ of the form $p/q$, with $f(p/q) = 1/q > \varepsilon$. But there are no irrationals with $f(x) > \varepsilon$, since $f(x)$ is 0 for $x$ irrational. Thus, the set
\[ F := \{x : f(x) > \varepsilon\} \]
is finite. Say it has $N$ elements. Choose any partition $P$ of $[0,1]$ of mesh $\|P\| < \frac{\varepsilon}{2N}$. At most $2N$ of the intervals of $P$ can contain points of $F$. All values of $f$ are $\le 1$ and if $I \cap F = \emptyset$, $\sup f(I) \le \varepsilon$, so
\begin{align*}
U(f,P) = \sum_I \sup f(I)\lambda(I) &= \sum_{I \cap F \ne \emptyset} \sup f(I)\lambda(I) + \sum_{I \cap F = \emptyset} \sup f(I)\lambda(I) \\
&\le \sum_{I \cap F \ne \emptyset} 1 \cdot \lambda(I) + \sum_{I \cap F = \emptyset} \varepsilon\lambda(I).
\end{align*}
The first term here is at most $2N\|P\| < \varepsilon$ and the second is at most $\varepsilon$ times the total length of all the intervals of $P$, which is $1 - 0 = 1$. Thus,
\[ U(f,P) \le 2\varepsilon; \]
hence,
\[ \overline{\int_0^1} f \le 2\varepsilon, \quad\text{for all } \varepsilon > 0; \]
and therefore $\overline{\int_0^1} f \le 0$. Since $\underline{\int_0^1} f \ge 0$, we have $\int_0^1 f = 0$.

42.11 Theorem (Integral of products). If $f, g$ are integrable on $[a,b]$, then so is $fg$.

Proof. Let $f$ and $g$ be integrable on $[a,b]$. Then, $f$ and $g$ are both bounded. Say both $\|f\|$ and $\|g\|$ are less than $K$. Let $\varepsilon > 0$. Choose, by the BIC, partitions $P_1$ and $P_2$ so that $\Omega(f,P_1) < \varepsilon/2K$ and $\Omega(g,P_2) < \varepsilon/2K$. Take $P$ to be a common refinement of these. Then, for each $I \in P$, $s, t \in I$ implies
\[ |f(s)g(s) - f(t)g(t)| \le |f(s) - f(t)||g(s)| + |f(t)||g(s) - g(t)| \le K\omega f(I) + K\omega g(I). \]
Thus,
\[ \omega(fg)(I) \le K\omega f(I) + K\omega g(I), \]
so
\[ \Omega(fg,P) = \sum_{I \in P} \omega(fg)(I)\lambda(I) \le K\Omega(f,P) + K\Omega(g,P) < K\frac{\varepsilon}{2K} + K\frac{\varepsilon}{2K} = \varepsilon. \]
Thus, $fg$ satisfies the BIC, so is integrable.

42.12 Theorem (Integral of composites). Let $f : [a,b] \to [c,d]$ and $g : [c,d] \to \mathbb{R}$.
(a) If $f$ is integrable and $g$ is continuous then $g \circ f$ is integrable on $[a,b]$.
(b) If $f$ and $g$ are both integrable, the composite need not be integrable.

Proof. (a) Since $g$ is continuous on the compact interval $[c,d]$, it is bounded: $\|g\| = \sup_{x \in [c,d]} |g(x)| < \infty$. Let $\varepsilon > 0$. Choose $\varepsilon' > 0$ so that $\varepsilon'(b - a) + 2\|g\|\varepsilon' < \varepsilon$.
Since $g$ is continuous on $[c,d]$, $g$ is uniformly continuous, so we may choose $\delta > 0$ such that
\[ |g(s) - g(t)| \le \varepsilon' \quad\text{when } |s - t| < \delta. \]
Since $f$ is integrable, there exists a partition $P$ of $[a,b]$ with
\[ \Omega(f,P) < \delta\varepsilon'. \]
Fix such a $P = \{I_1, \dots, I_n\}$. Divide the index set into two parts,
\[ A = \{i : \omega f(I_i) < \delta\}, \qquad B = \{i : \omega f(I_i) \ge \delta\}. \]
Then, for $i$ in $A$, $x, y \in I_i$ implies
\[ |f(x) - f(y)| \le \omega f(I_i) < \delta \]
so
\[ |g \circ f(x) - g \circ f(y)| = |g(f(x)) - g(f(y))| \le \varepsilon'. \]
Hence, $\omega(g \circ f)(I_i)\lambda(I_i) \le \varepsilon'\lambda(I_i)$, for $i \in A$, and
\[ \sum_{i \in A} \omega(g \circ f)(I_i)\lambda(I_i) \le \sum_{i \in A} \varepsilon'\Delta x_i \le \varepsilon'(b - a). \]
For $i \in B$, $\omega f(I_i) \ge \delta$ so
\[ \sum_{i \in B} \delta\lambda(I_i) \le \sum_{i \in B} \omega f(I_i)\lambda(I_i) \le \Omega(f,P) < \delta\varepsilon', \]
which shows that the total length of the intervals indexed in $B$ is less than $\varepsilon'$ and hence,
\[ \sum_{i \in B} \omega(g \circ f)(I_i)\lambda(I_i) \le \sum_{i \in B} 2\|g\|\lambda(I_i) \le 2\|g\|\varepsilon'. \]
Thus, altogether,
\[ \Omega(g \circ f, P) \le \varepsilon'(b - a) + 2\|g\|\varepsilon' < \varepsilon. \]
This shows $g \circ f$ is integrable by the Basic Integrability Criterion.

(b) The composite need not be integrable if $f$ and $g$ are both integrable, but $g$ is not continuous. Let $f$ be the Dirichlet function on $[0,1]$ ($f(x) = 1/q$, if $x = p/q$ in lowest terms, $f(x) = 0$, if $x$ is irrational). Let $g$ on $[0,1]$ be given by $g(u) = 1$, if $u \ne 0$, and $g(0) = 0$. Then, $f$ is integrable and $g$ is integrable, but the composite $g \circ f$ is the indicator function of the rationals on $[0,1]$, which is the basic example of a non-integrable function.

One can also prove that $g \circ f$ need not be integrable if $g$ is integrable and $f$ is continuous. This requires a more sophisticated argument.

42.13 Examples. Here are some applications of the above results.
(1) If $f$ is integrable on $[a,b]$, then the function $\sin(f)$, which really means $\sin \circ f$, is also integrable on $[a,b]$, since the sine function is continuous everywhere.
(2) If $f$ is integrable on $[a,b]$ so is $|f|$. This has been proved earlier, but it is also a consequence of the present result since the absolute value map is continuous.
(3) If $f$ is integrable, so is $f^2$.
This follows from the integral of products result, but it is also a consequence of the composite result since the function $g : u \mapsto u^2$ is continuous and $g \circ f = f^2$.
(4) (Integral of Products again.) If $f, g$ are integrable on $[a,b]$, then so is $fg$. Indeed, $f + g$ is integrable, so $f^2$, $g^2$ and $(f + g)^2$ are integrable, and hence
\[ fg = \frac{1}{2}\left( (f + g)^2 - f^2 - g^2 \right) \]
is so also.
(5) If $f, g$ are integrable on $[a,b]$, so are
\[ f \vee g := \max\{f, g\} \quad\text{and}\quad f \wedge g := \min\{f, g\}. \]
Indeed, for real numbers $u, v$,
\[ u + v + |u - v| = \begin{cases} 2u, & \text{if } u \ge v \\ 2v, & \text{if } u < v \end{cases} = 2\max\{u, v\}. \]
Thus,
\[ f \vee g = \frac{1}{2}(f + g + |f - g|), \]
which is integrable since $f$, $g$, $|f - g|$ are. Similarly
\[ f \wedge g = \frac{1}{2}(f + g - |f - g|) \]
is integrable.

42.1. If $P$ is a partition of $[a,b]$ and $Q$ is a finer one, then for each $f : [a,b] \to \mathbb{R}$, $\Omega(f,Q) \le \Omega(f,P)$.

42.2. If $f$ is bounded on $[a,b]$ and integrable on $[c,b]$, for all $c \in (a,b)$, then $f$ is integrable on $[a,b]$.

42.3. A function $f : [a,b] \to \mathbb{R}$ is called of bounded variation on $[a,b]$ if there exists a finite $K$ such that $a = x_0 \le x_1 \le x_2 \le \dots \le x_n = b$ implies $\sum_{i=1}^n |f(x_{i-1}) - f(x_i)| < K$. Prove that such a function is Riemann integrable.

42.4. Consider the function $f : [0,1] \to \mathbb{R}$, defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x \text{ is rational} \\ 0, & \text{otherwise.} \end{cases} \]
Decide, with proof, whether $f$ is Riemann integrable.

42.5. Let $f$ be continuous on $[a,b]$ with $f(x) \ge 0$ for all $x$ and $\int_a^b f = 0$. Prove that $f$ is constantly 0. Hint: Contrapositive. What can you say about values of $f$ near a point $p$ with $f(p) > 0$?

42.6. Give an example of a function whose absolute value is integrable, but the function is not.

42.7. Let $f$ be bounded on $[a,b]$. If there exists a sequence $(P_n)$ of partitions of $[a,b]$ such that $\lim_{n \to \infty} (U(f,P_n) - L(f,P_n)) = 0$, then
(a) $f$ is integrable on $[a,b]$ and
(b) $\int_a^b f = \lim_n U(f,P_n) = \lim_n L(f,P_n)$.

42.8. Let $f$ be continuous on $[a,b]$ with $f(x) \ge 0$ for all $x$ and $\int_a^b f = 0$. Prove that $f$ is constantly 0. Hint: Contrapositive. What can you say about values of $f$ near a point $p$ with $f(p) > 0$?

42.9.
If $f$ is integrable on $[a,b]$ and there exists a number $m > 0$ such that $f(x) > m$, for all $x \in [a,b]$, then $1/f$ is integrable. (The same would hold if $m < 0$ and $f(x) < m$ for all $x \in [a,b]$.)

42.10. For each partition $P$ of $[a,b]$,
\[ U(f,P) = \sup_{\dot P} R(f,\dot P) \quad\text{and}\quad L(f,P) = \inf_{\dot P} R(f,\dot P), \]
where $\dot P$ ranges over the tagged partitions obtained from $P$.

43. THE FUNDAMENTAL THEOREM OF CALCULUS

If $f$ is a Riemann integrable function on an interval $[a,b]$, it is integrable over subintervals. This was proved in the section EXISTENCE OF THE RIEMANN INTEGRAL. Thus, we may define a new function $F$ on $[a,b]$ by integrating over subintervals. The result is a Lipschitz, hence (uniformly) continuous function.

43.1 Theorem. Let $f$ be Riemann integrable on $[a,b]$. If $F$ is defined on $[a,b]$ by
\[ F(x) = \int_a^x f(t)\,dt \]
then $F$ is Lipschitz, hence uniformly continuous on $[a,b]$.

(Of course, continuity on $[a,b]$ always implies uniform continuity, but the Lipschitz property yields it in a simple way.)

Proof. Recall that $f$ being integrable includes that $f$ is bounded. Say $|f(x)| \le K$, for all $x \in [a,b]$. Since $f$ is integrable on $[a,b]$, it is integrable on subintervals. Thus, if $x, y \in [a,b]$ with $x < y$,
\[ F(y) - F(x) = \int_a^y f - \int_a^x f = \int_x^y f, \]
so
\[ |F(y) - F(x)| \le \int_x^y |f| \le K(y - x) = K|x - y|. \]
If $y < x$, interchange the roles of $x$ and $y$ here; in either case, we get
\[ |F(x) - F(y)| \le K|x - y|, \quad\text{for all } x, y \in [a,b]. \]
This is the statement that $F$ is Lipschitz. From this uniform continuity follows, as we know.

The smallest possible $K$ in the above argument is the supremum norm of $f$ defined by $\|f\| = \sup_{x\in[a,b]} |f(x)|$. What we proved is that
\[ |F(x) - F(y)| \le \|f\|\,|x - y|. \]

43.2 The Fundamental Theorem of Calculus.
(1) (differentiating an integral) Let $f$ be Riemann integrable on $[a,b]$. If $F$ is defined on $[a,b]$ by
\[ F(x) = \int_a^x f(t)\,dt \]
and $f$ is continuous at $c \in [a,b]$, then $F'(c) = f(c)$.
(2) (integrating a derivative) Let $f$ be Riemann integrable on $[a,b]$.
If there exists a continuous function $F$ on $[a,b]$ with $F' = f$ on $(a,b)$, then
\[ \int_a^b f = F(b) - F(a). \]

Notice that in both cases we are assuming integrability of $f$. The second part is saying that if $F'$ is integrable, then $\int_a^b F'(t)\,dt = F(b) - F(a)$. Since this also applies for those $x$ between $a$ and $b$, we get $\int_a^x F' = F(x) - F(a)$.

Proof. (1) Suppose $f$ is continuous at $c$. Given $\varepsilon > 0$, choose $\delta > 0$ such that $|x - c| < \delta$ implies $|f(x) - f(c)| < \varepsilon$. Then since $f(c)$ is constant
\[ F(x) - F(c) - f(c)(x - c) = \int_c^x f - \int_c^x f(c) = \int_c^x \left(f(t) - f(c)\right) dt. \]
But for $|x - c| < \delta$, the absolute value of the right side is at most $\left|\int_c^x |f(t) - f(c)|\,dt\right| \le \varepsilon|x - c|$. Thus,
\[ \left|\frac{F(x) - F(c)}{x - c} - f(c)\right| \le \varepsilon, \quad\text{for } 0 < |x - c| < \delta. \]
This says
\[ F'(c) = \lim_{x\to c} \frac{F(x) - F(c)}{x - c} = f(c), \]
as claimed.

(2) Now, instead assume that there exists $F$ such that $F' = f$. Recall that $f$ is supposed integrable. So let $\varepsilon > 0$. Choose $\delta > 0$ so that for $\|\dot P\| < \delta$, $\left|R(f,\dot P) - \int_a^b f\right| < \varepsilon$. Create a tagged partition $\dot P$ of $[a,b]$ determined by $\{x_0,\dots,x_n\}$ with $\|\dot P\| < \delta$, using the mean value theorem to find tags $t_i$ in $(x_{i-1}, x_i)$ with
\[ f(t_i)(x_i - x_{i-1}) = F'(t_i)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}). \]
Sum this over $i$, using the telescoping property, and get
\[ R(f,\dot P) = \sum_{i=1}^n f(t_i)(x_i - x_{i-1}) = F(b) - F(a). \]
Thus,
\[ \left|F(b) - F(a) - \int_a^b f\right| < \varepsilon. \]
Since $\varepsilon$ was arbitrary, this tells us $\int_a^b f = F(b) - F(a)$.

43.3 Warning. It is quite possible that the map $F : x \mapsto \int_a^x f$ is differentiable at a point $c$, without $f$ being continuous at $c$. For example, if $f(x) = 1$, for $x \in [0,2]\setminus\{1\}$, and $f(1) = 14$, then $\int_0^x f = x$, for all $x \in [0,2]$. (Changing a value at one point does not change the integral.) Hence $F$ will then be differentiable everywhere, with $F'(x) = 1$.

The Dirichlet function is a more extreme example. It is discontinuous at every rational in $[0,1]$, but the integrated function is 0 everywhere, so is differentiable everywhere.

43.4 Example. Let $G(x) = \int_0^{x^3} \sin(\cos(t))\,dt$. Find $G'(x)$ if possible.

Soln.
Since $\sin$ and $\cos$ are continuous on $\mathbb{R}$, their composite is also, so for all $u \in \mathbb{R}$, the integral $\int_0^u \sin(\cos(t))\,dt$ exists and the function $F : u \mapsto \int_0^u \sin(\cos(t))\,dt$ is differentiable by the Fundamental Theorem of Calculus, with $F'(u) = \sin(\cos(u))$, for all $u$. The function in question is $G = F\circ g$, where $g(x) = x^3$. Thus, by the chain rule,
\[ G'(x) = F'(g(x))\,g'(x) = \sin(\cos(g(x)))\,3x^2 = 3x^2\sin(\cos(x^3)). \]
A similar question with both limits of integration "variable" is handled by subtraction. For example
\[ \int_{x^2}^{x^3} \sin(\cos(t))\,dt = \int_0^{x^3} \sin(\cos(t))\,dt - \int_0^{x^2} \sin(\cos(t))\,dt, \]
and each term can be treated separately.

43.5 Corollary (Integration by parts). Suppose that $f$ and $g$ are differentiable on $[a,b]$ with integrable derivatives. Then,
\[ \int_a^b f g' = f(b)g(b) - f(a)g(a) - \int_a^b f' g. \]

Proof. Since $f$ and $g$ are differentiable, they are continuous, hence integrable. Therefore both $fg'$ and $f'g$ are also integrable. By the product formula for differentiation, we have
\[ (fg)'(x) = f'(x)g(x) + f(x)g'(x) \]
for all $x$ in $[a,b]$. Thus, by the Fundamental Theorem of Calculus,
\[ \int_a^b f'g + \int_a^b g'f = \int_a^b (f'g + g'f) = \int_a^b (fg)' = f(b)g(b) - f(a)g(a). \]

The following is also known as integration by substitution. For examples of its use, look in the "Techniques of Integration" section of almost any Calculus book.

43.6 Corollary (Change of Variable). Let $g$ be a real function differentiable on the interval $[c,d]$, with integrable derivative. Let $f$ be a real function which is continuous on the range of $g$. Put $a = g(c)$ and $b = g(d)$. Then,
\[ \int_c^d (f\circ g)\, g' = \int_a^b f. \]

Written with dummy variables, the formula of this Corollary looks like
\[ \int_c^d f(g(x))\, g'(x)\,dx = \int_a^b f(u)\,du. \]
The right side can be formally obtained from the left side with the substitutions:
$u = g(x)$
$du = g'(x)\,dx$
$x$ runs from $c$ to $d$
$u$ runs from $a = g(c)$ to $b = g(d)$

Proof.
Under the hypotheses, $g$ is differentiable, hence continuous, so the range of $g$ is a closed interval $[u_0, u_1]$. Let $F(u) = \int_{u_0}^u f$, for all $u \in [u_0, u_1]$ and let $G = F\circ g$; that is, $G(x) = F(g(x))$ for $x \in [c,d]$. Since $f$ is continuous on $[u_0, u_1]$, $F$ is differentiable there with $F' = f$, by the Fundamental Theorem of Calculus (differentiating an integral). By the Chain Rule $G' = (F'\circ g)\, g' = (f\circ g)\, g'$. To use the other half of the Fundamental Theorem, we need to know that $G'$ is integrable. But, $f$ was given continuous, and $g$ is continuous, so $f\circ g$ is continuous, hence integrable. Since $g'$ was also assumed integrable, the product $(f\circ g)\, g'$ is also integrable. Thus,
\[ \int_c^d (f\circ g)\, g' = \int_c^d G' = G(d) - G(c) = F(g(d)) - F(g(c)) = F(b) - F(a) = \int_a^b f. \]

43.7 Remark. Actually the change of variable theorem is true without the hypothesis of continuity. One just needs $f$ integrable, but this method of proof fails. A reasonably easy proof is available if $g$ is monotone. It is much more difficult in general, relying on Lebesgue's Criterion for Riemann Integrability (Theorem 44.3) and a result known as the Bounded Convergence Theorem.

43.1. Let $f$ be integrable on $[a,b]$ and for each $x \in [a,b]$, put $H(x) = \int_x^b f$. Prove that, if $f$ is continuous at $c$, then $H$ is differentiable at $c$ with $H'(c) = -f(c)$. This shows that if one uses the convention $\int_b^x f = -\int_x^b f$, when $x < b$, we still get $\frac{d}{dx}\int_b^x f = f(x)$, when $f$ is continuous at $x$.

43.2. (Linear Change of Variable) Let $f : [a,b] \to \mathbb{R}$ and let $g : [c,d] \to [a,b]$ be the function whose graph is a straight line with $g(c) = a$, $g(d) = b$. Then $f$ is integrable if and only if $f\circ g$ is integrable. Moreover $\int_a^b f(u)\,du = m\int_c^d f(g(x))\,dx$, where $m$ is the slope of the line. (Suggestion: Use Riemann's definition of integral directly. Note: $f$ is not assumed continuous here.)

43.3. (Monotone Change of Variable.)
Let $g$ be a monotone real function differentiable on the interval $[c,d]$, with integrable derivative. Let $f$ be a real function which is integrable on the range of $g$. Put $a = g(c)$ and $b = g(d)$. Then,
\[ \int_c^d (f\circ g)\, g' = \int_a^b f. \]

43.4. Find a formula for the derivatives of the functions defined by
(a) $F(x) = \int_0^x \sqrt{1 + t^2}\,dt$.
(b) $F(x) = \int_0^{\sin x} \ln(5 + t)\,dt$.
(c) $F(x) = \int_x^{x^2} \sqrt{1 + t^2}\,dt$.

43.5. Let $f(t) = t$ for $0 \le t \le 2$ and $f(t) = -t$, for $2 < t \le 4$ and let $F(x) = \int_0^x f(t)\,dt$, for $x \in [0,4]$.
(a) Find an explicit expression for $F$.
(b) Sketch $F$. Determine where $F$ is differentiable and where not.
(c) Find a formula for $F'$ where $F$ is differentiable.

43.6. Let
\[ f(x) = \begin{cases} x^2\sin(1/x^2), & x \ne 0 \\ 0, & x = 0. \end{cases} \]
Prove that $f$ is differentiable everywhere, but $f'$ is not integrable on $[-1,1]$.

43.7. Let $f$ be Riemann integrable on $[a,b]$ and $F(x) = \int_a^x f$, for all $x \in [a,b]$. If $f$ has a jump discontinuity at $c$, then $F$ cannot be differentiable at $c$.

43.8. Let $f$ be Riemann integrable on $[a,b]$. If there exists a continuous function $F$ on $[a,b]$ with $F' = f$ except at a finite number of points, then
\[ \int_a^b f = F(b) - F(a). \]

43.9. For a real function $f$ defined (at least) on $[a,b)$ the improper integral $\int_a^b f$ is defined to be $\lim_{t\to b}\int_a^t f$, whenever this exists in $\overline{\mathbb{R}}$. It is said to converge if this is finite. Prove that if $f$ is defined and Riemann integrable on all of $[a,b]$, then the improper integral, defined this way, converges and has the same value as the Riemann integral. (It is for this reason that one usually just writes $\int_a^b f$ for the improper integral.)

44. LEBESGUE'S CRITERION FOR RIEMANN INTEGRABILITY (OPTIONAL)

Here we give Henri Lebesgue's characterization of those functions which are Riemann integrable.
Recall the example of the Dirichlet function, defined on $[0,1]$ by
\[ f(x) = \begin{cases} \frac1q, & \text{if } x = \frac pq \text{ is rational in lowest terms} \\ 0, & \text{otherwise.} \end{cases} \]
This function is continuous at all irrational numbers and discontinuous at the rational numbers. It is also Riemann-integrable (with integral 0). It turns out that there is a connection here. It is the nature of the set of discontinuities that determines integrability.

For a real-valued function $f$ defined on a set $X$, and $I \subset X$, let $\omega_f(I) = \sup_{s,t\in I} |f(s) - f(t)|$, the oscillation of $f$ on $I$, as usual. The oscillation of $f$ at a point $x$ is defined as
\[ \omega_f(x) = \inf\{\omega_f(B(x,\delta)) : \delta > 0\}. \]
It is easy to prove that $f$ is continuous at $x$ if and only if $\omega_f(x) = 0$.

44.1 Lemma. Let $f : [a,b] \to \mathbb{R}$. Then, for every $\alpha > 0$, $\{x : \omega_f(x) < \alpha\}$ is open in $[a,b]$ and $\{x : \omega_f(x) \ge \alpha\}$ is a closed set (in $\mathbb{R}$).

Proof. Let $G = \{x \in [a,b] : \omega_f(x) < \alpha\}$. Let $c \in G$. Then, $\omega_f(c) < \alpha$ and by definition, there is a $\delta > 0$ such that $\omega_f(B(c,\delta)\cap[a,b]) < \alpha$. If $x \in B(c,\delta)\cap[a,b]$, and $U$ is a neighbourhood of $x$ contained in $B(c,\delta)$, then $\omega_f(U) < \alpha$, so $\omega_f(x) \le \omega_f(U) < \alpha$, also. Thus, $G$ is open in $[a,b]$. Since $[a,b]$ is closed and $G$ is open in $[a,b]$, $\{x : \omega_f(x) \ge \alpha\} = [a,b]\setminus G$ is closed in $[a,b]$ and in $\mathbb{R}$.

Let $\lambda(I)$ denote the length of the interval $I$. A subset $N$ of $\mathbb{R}$ is said to have measure 0, if for each $\varepsilon > 0$, there exists a countable family $\mathcal{H} = \{I_1, I_2, \dots\}$ of intervals covering $N$, with total length $\sum_k \lambda(I_k) < \varepsilon$.

44.2 Lemma.
(1) Every countable set of reals has measure 0.
(2) If $B$ has measure 0 and $A \subset B$, then $A$ also has measure 0.
(3) If $A_k$ has measure 0, for all $k \in \mathbb{N}$, then $\bigcup_{k\in\mathbb{N}} A_k$ also has measure 0.

Proof. (1) Let $A = \{a_1, a_2, \dots\}$ be countable, $\varepsilon > 0$, and for every $k$, let $I_k$ be the interval $(a_k - \varepsilon/2^{k+1},\, a_k + \varepsilon/2^{k+1})$. Then, $A \subset \bigcup_k I_k$; that is, these intervals cover $A$. For each $k$, the length of $I_k$ is $\varepsilon/2^k$, and the total length is $\sum_k \lambda(I_k) \le \sum_{k=1}^\infty \varepsilon/2^k = \varepsilon$. Thus, $A$ has measure 0.
(2) is obvious, because a family of intervals that covers $B$ also covers $A$. To prove (3), one uses a modification of the proof of (1). Let $\varepsilon > 0$. For each $k$, let $\mathcal{H}_k$ be a countable family of intervals covering $A_k$ whose total length is less than $\varepsilon/2^k$. Then, $\bigcup_k \mathcal{H}_k$ is still a countable family of intervals, it covers $\bigcup_k A_k$, and its total length is less than $\sum_k \varepsilon/2^k = \varepsilon$.

Of course, we could have proved singletons have measure 0 and then deduced (1) from (3).

44.3 Theorem. (Lebesgue's Criterion for integrability) Let $f : [a,b] \to \mathbb{R}$. Then, $f$ is Riemann integrable if and only if $f$ is bounded and the set of discontinuities of $f$ has measure 0.

Notice that the Dirichlet function satisfies this criterion, since the set of discontinuities is the set of rationals in $[0,1]$, which is countable.

Proof. Let $f$ be Riemann integrable on $[a,b]$. Then, $f$ is certainly bounded. Let $D$ be the set of points of discontinuity of $f$. Then $D = \{x : \omega_f(x) > 0\}$. We are to show that $D$ has measure 0. For each $\alpha > 0$, let $N(\alpha) = \{x \in [a,b] : \omega_f(x) \ge \alpha\}$. Then, $D = \bigcup_{k=1}^\infty N(1/k)$. Thus, we need only prove that each $N(\alpha)$ has measure 0. Fix such an $\alpha$ and let $\varepsilon > 0$. By the Basic Integrability Criterion, we can choose a partition $P$ of $[a,b]$ determined by the set of division points $\{x_0, x_1, \dots, x_n\}$ with
\[ \sum_{i=1}^n \omega_f([x_{i-1}, x_i])(x_i - x_{i-1}) < \alpha\varepsilon/2. \]
Assume, as we may, that the $x_i$ are distinct. Let $F$ be the set of all $i$ for which $(x_{i-1}, x_i)$ intersects $N(\alpha)$. Then for each $i \in F$, $\omega_f([x_{i-1}, x_i]) \ge \alpha$. Thus,
\[ \alpha\sum_{i\in F} (x_i - x_{i-1}) \le \sum_{i\in F} \omega_f([x_{i-1}, x_i])(x_i - x_{i-1}) < \alpha\varepsilon/2, \]
so that the sum of the lengths of the intervals $(x_{i-1}, x_i)$, $i \in F$, is less than $\varepsilon/2$. These cover $N(\alpha)$ except for the elements of $\{x_0, x_1, \dots, x_n\}$. But these can be covered by intervals whose lengths total less than $\varepsilon/2$, so that $N(\alpha)$ can be covered with open intervals of total length less than $\varepsilon$, as required.

For the converse, let $f$ be bounded and suppose that the set $D$ of discontinuities of $f$ is of measure 0.
Fix $\varepsilon > 0$ and let $E = \{x : \omega_f(x) \ge \varepsilon\}$. Since $E \subset D$, $E$ has measure 0. Thus, $E$ can be covered by a countable family of open intervals, whose total length is less than $\varepsilon$. Since $E$ is closed (by Lemma 44.1) and bounded, it is compact, so a finite family of such intervals will do, say $E \subset \bigcup_{i=1}^m U_i$. For each $i$, let $I_i$ be the closure of $U_i$. For simplicity, by replacing pairs that intersect by their union, we may assume that no two $I_i$ intersect. Let $\mathcal{D} = \{I_1, \dots, I_m\}$. The set $K = [a,b]\setminus\bigcup_{i=1}^m U_i$ is compact (in fact, is the union of a finite number of disjoint closed intervals) and consists of points where $\omega_f(x) < \varepsilon$. For each $x \in K$, there is a closed interval $J$ with $x \in \operatorname{int} J$ and $\omega_f(J) < \varepsilon$. By compactness, a finite number of such intervals covers $K$. By intersecting with $K$, we can assume that they are all subsets of $K$. Thus, let $\mathcal{C} = \{J_1, \dots, J_k\}$ be closed intervals whose union is $K$ and such that $\omega_f(J_j) < \varepsilon$, for all $j$. We can (and do) assume that the intervals $J_j$ do not overlap. The family $\mathcal{D}\cup\mathcal{C} = \{[x_0,x_1], [x_1,x_2], \dots, [x_{n-1},x_n]\}$ partitions $[a,b]$ and
\[ \sum_{i=1}^n \omega_f([x_{i-1},x_i])(x_i - x_{i-1}) = \sum_{i=1}^m \omega_f(I_i)\lambda(I_i) + \sum_{j=1}^k \omega_f(J_j)\lambda(J_j) \le \sum_{i=1}^m 2\|f\|\lambda(I_i) + \sum_{j=1}^k \varepsilon\lambda(J_j) \le 2\|f\|\sum_{i=1}^m \lambda(I_i) + \varepsilon(b-a) \le 2\|f\|\varepsilon + \varepsilon(b-a), \]
which is arbitrarily small. Thus, the Basic Integrability Criterion is satisfied and $f$ is integrable.

You may have noticed that part of this argument is similar to that in the proof that the composition $g\circ f$ of a continuous function $g$ with an integrable function $f$ is integrable. We see now that the composition result is an immediate consequence of Lebesgue's criterion.

44.4 Lemma. Let $f : [a,b] \to [c,d]$ be integrable and $g : [c,d] \to \mathbb{R}$ be continuous. Then, $g\circ f$ is integrable.

Proof. The set of points of discontinuity of $f$ has measure 0, since $f$ is integrable. But $g\circ f$ is continuous wherever $f$ is, so the set of discontinuities of $g\circ f$ is contained in that of $f$, so has measure 0 also. Moreover, $g\circ f$ is bounded, since $g$ is continuous on the compact interval $[c,d]$.

The Cantor set.
We know that countable sets are of measure zero, but are there any others? Yes, indeed; the following is an example of an uncountable set of measure 0.

Let $I$ be the unit interval $[0,1]$. Let $G_{11} = (\frac13, \frac23)$, the "open middle third" of $I$. $I\setminus G_{11}$ is the disjoint union of the two compact intervals $K_{11} = [0, \frac13]$, $K_{12} = [\frac23, 1]$. Let $G_{21}$ and $G_{22}$ be the "open middle third" of $K_{11}$ and $K_{12}$, respectively, and $K_{21}$, $K_{22}$, $K_{23}$, $K_{24}$ be the 4 compact intervals obtained by removing these middle thirds, etc.

$K_{01} = [0,1]$
$G_{11} = (\frac13, \frac23)$; $K_{11} = [0, \frac13]$, $K_{12} = [\frac23, 1]$
$G_{21} = (\frac19, \frac29)$, $G_{22} = (\frac79, \frac89)$; $K_{21} = [0, \frac19]$, $K_{22} = [\frac29, \frac39]$, $K_{23} = [\frac69, \frac79]$, $K_{24} = [\frac89, 1]$
...

In general, $G_{ij}$ is the open interval of length $1/3^i$ concentric with $K_{i-1,j}$, while $K_{i,2j-1}$ and $K_{i,2j}$ are the two component intervals of $K_{i-1,j}\setminus G_{ij}$. The Cantor (ternary) set is defined to be
\[ C = \bigcap_{i\in\mathbb{N}} \left(\bigcup_{j=1}^{2^i} K_{ij}\right). \]
(The reader may check that $C$ is the set of those points of $[0,1]$ which have a ternary expansion using only the digits 0 and 2.)

Now, for a fixed $i$, the total length of the intervals $K_{ij}$, $j = 1, \dots, 2^i$ is
\[ \sum_{j=1}^{2^i} \lambda(K_{ij}) = \sum_{j=1}^{2^i} \frac{1}{3^i} = \left(\frac23\right)^i. \]
Since $(\frac23)^i \to 0$, $C$ has measure 0.

To see that $C$ is uncountable, suppose $C = \{x_1, x_2, x_3, \dots\}$. Then $x_1 \in K_{11}\cup K_{12}$, a union of two disjoint sets. Let $K_{1j_1}$ be the one of these for which $x_1 \notin K_{1j_1}$. Now take $K_{2j_2}$ to be such that $x_2 \notin K_{2j_2} \subset K_{1j_1}$, and so on. In this way obtain a decreasing sequence $(K_{ij_i})$ of compact intervals. The intersection $\bigcap_i K_{ij_i}$ is non-empty, since it is the intersection of a decreasing sequence of non-empty compact sets, yet contains none of the points $x_i$ of $C$. This is a contradiction, since $C \supset \bigcap_i K_{ij_i}$.

The indicator function of the Cantor set, defined by
\[ f(x) = 1_C(x) = \begin{cases} 1, & x \in C \\ 0, & x \notin C, \end{cases} \]
is continuous on $[0,1]\setminus C$ and discontinuous on the set $C$ of measure 0. It is integrable with $\int_0^1 f = 0$.

The set $C$ happens to be compact, with empty interior and all its points accumulation points.
Such a set is called "perfect". By removing smaller intervals one can modify this example to obtain a Cantor-type set $D$ with the same properties but not of measure 0. Moreover, this can be done in such a way that there is a continuous, strictly increasing bijection $g$ of $[0,1]$ onto itself such that $g(D) = C$. Then $f\circ g$ is discontinuous on $D$. Thus, this gives an integrable function $f$ and a continuous function $g$ such that $f\circ g$ is not integrable. We omit the details here.

45. POINTWISE AND UNIFORM CONVERGENCE

Let $(f_n)$ be a sequence of real functions defined on a set $X$. Then $(f_n)$ is said to converge pointwise to $f$ on $X$ if for all $x \in X$, $f_n(x)$ converges to $f(x)$. Notation: $f_n \to f$ pointwise on $X$, or simply $\lim_n f_n = f$.

45.1 Questions. Suppose $(f_n)$ converges pointwise to $f$. If $f_n$ is continuous at $x$ for each $n$, is $f$ continuous at $x$? If $f_n$ is differentiable at $x$, is $f$ differentiable? If $f_n$ is integrable is $f$ integrable?

Continuity at $x$ means (for example) $\lim_{t\to x} f(t) = f(x)$, so the first question is, does
\[ \lim_{t\to x}\lim_n f_n(t) = \lim_n\lim_{t\to x} f_n(t)? \]
We will see that the answer is, in general, no. For integrals, what we want to know really is whether
\[ \lim_n \int_a^b f_n(x)\,dx = \int_a^b \left(\lim_n f_n\right)(x)\,dx = \int_a^b f(x)\,dx. \]

45.2 Example (A limit of continuous functions which is not continuous). For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by $f_n(x) = x^n$. Then for each $x \in [0,1]$,
\[ \lim_n f_n(x) = \lim_n x^n = \begin{cases} 0, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x = 1. \end{cases} \]
Thus $f_n \to f$ pointwise, where $f(x) = 0$ for $x \in [0,1)$ and $f(1) = 1$. This is certainly not a continuous function.

45.3 Example (A limit of integrable functions with the "wrong integral"). For each $n$ let $f_n$ be the function defined on $[0,2]$ which is 0 at 0, $n$ at $\frac1n$, 0 again at $\frac2n$, linear in between, and 0 on the rest of the interval.
\[ f_n(x) = \begin{cases} n^2 x, & \text{if } 0 \le x \le \frac1n \\ -n^2\left(x - \frac2n\right), & \text{if } \frac1n < x \le \frac2n \\ 0, & \text{otherwise.} \end{cases} \]
Then $\int_0^2 f_n = 1$, for all $n$, but $\lim_n f_n(x) = 0$, for all $x \in [0,2]$. So
\[ \lim_n \int_0^2 f_n(x)\,dx \ne \int_0^2 \lim_n f_n(x)\,dx. \]
For another example, let $f_n(x) = n^2 x(1 - x^2)^n$, $x \in [0,1]$. Then $f_n(x) \to 0$, for all $x$, yet
\[ \lim_n \int_0^1 f_n(x)\,dx = \lim_n \frac{n^2}{2(n+1)} = +\infty, \]
which is certainly not $\int_0^1 0$.

The problems here will be rectified by using a stronger kind of convergence of sequences of functions: Let $(f_n)$ be a sequence of functions defined (at least) on a set $X$. Then $(f_n)$ is said to converge uniformly to $f$ on $X$ if for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and all $x \in X$, $|f_n(x) - f(x)| < \varepsilon$. Notation: $f_n \to f$ uniformly (on $X$).

45.4 Example. For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by
\[ f_n(x) = 1 - \frac xn. \]
Then $(f_n)$ converges uniformly on $[0,1]$.

Proof. Our first job is to identify the limit. For each fixed $x \in [0,1]$, the sequence $(\frac xn)$ converges to 0, so we put $f(x) = 1$, for all $x \in [0,1]$. For each $x \in [0,1]$ we have
\[ |f_n(x) - f(x)| = \left|1 - \frac xn - 1\right| = \frac xn \le \frac1n. \]
Now if $\varepsilon > 0$, then by the Archimedean property, there exists $N \in \mathbb{N}$ such that $\frac1N < \varepsilon$, thus for all $n \ge N$, and all $x \in [0,1]$,
\[ |f_n(x) - f(x)| \le \frac1n \le \frac1N < \varepsilon. \]
Thus, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and all $x \in [0,1]$, $|f_n(x) - f(x)| < \varepsilon$; that is, $(f_n)$ converges to $f$ uniformly on $[0,1]$.

45.5 Example. For each $n \in \mathbb{N}$, define $f_n : [0,1] \to \mathbb{R}$ by $f_n(x) = x^n$. Then (as we have seen) $f_n$ converges pointwise on $[0,1]$ to $f$ given by
\[ f(x) = \begin{cases} 0, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x = 1. \end{cases} \]
We claim that this convergence is not uniform. Indeed, choose $\varepsilon = \frac12$. Let $N \in \mathbb{N}$ and choose $n = N$. Put $x = (\frac34)^{1/n}$. Since $0 < \frac34 < 1$, $0 < x < 1$, and hence $x \in [0,1]$, yet $|f_n(x) - f(x)| = |x^n - 0| = ((\frac34)^{1/n})^n = \frac34 > \frac12 = \varepsilon$. We have shown that there exists an $\varepsilon > 0$ such that for all $N \in \mathbb{N}$, there exist $n \ge N$ and an $x \in [0,1]$ with $|f_n(x) - f(x)| \ge \varepsilon$. This is the negation of the definition of uniform convergence on $[0,1]$.
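The contrast between Examples 45.4 and 45.5 can be checked numerically. The following is a minimal sketch, assuming a finite grid of sample points stands in for the supremum over $[0,1]$ (so the computed values are only lower bounds for the true suprema); the names `sup_distance`, `GRID` and `witness` are hypothetical helpers introduced for this illustration only.

```python
def sup_distance(fn, f, xs):
    """Largest |fn(x) - f(x)| over the sample points xs.

    This is only a lower bound for the true supremum over the interval.
    """
    return max(abs(fn(x) - f(x)) for x in xs)


# Sample points of [0, 1]; the grid is an assumption of this sketch.
GRID = [k / 1000 for k in range(1001)]


def limit_45_4(x):
    """Pointwise limit of f_n(x) = 1 - x/n."""
    return 1.0


def limit_45_5(x):
    """Pointwise limit of f_n(x) = x**n on [0, 1]."""
    return 1.0 if x == 1 else 0.0


def witness(n):
    """The point x = (3/4)**(1/n) chosen in Example 45.5."""
    return 0.75 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Example 45.4: the distance to the limit is about 1/n, whatever x is.
        d_lin = sup_distance(lambda x: 1 - x / n, limit_45_4, GRID)
        # Example 45.5: at the witness point the gap stays near 3/4 for every n.
        x = witness(n)
        gap = abs(x ** n - limit_45_5(x))
        print(n, d_lin, x, gap)
```

In the first column of distances the bound shrinks with $n$ independently of $x$, while in the second the gap at the witness point does not shrink at all, which is exactly the failure of uniformity.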
Note that in the positive example of uniform convergence, we majorized the distance to the limit by a number $\frac1n$ which went to 0, independently of $x$. This is the key to proving uniform convergence to a known function:

45.6 Theorem. Let $(f_n)$ be a sequence of functions on $X$, $f$ another one. Then $f_n \to f$ uniformly iff there exists a sequence $(K_n)$ of extended real numbers converging to 0, such that for all $n \in \mathbb{N}$,
\[ |f_n(x) - f(x)| \le K_n, \quad\text{for all } x \in X. \]
In this setting, $\|f_n - f\|$ (supremum norm) always serves as a suitable $K_n$.

Proof. Assume $f_n \to f$ uniformly on $X$. Then, for all $\varepsilon > 0$, there exists $N$ such that for all $n \ge N$ and all $x \in X$, $|f_n(x) - f(x)| \le \varepsilon$. But then by the definition of supremum, for all $n \ge N$,
\[ \|f_n - f\| = \sup_{x\in X} |f_n(x) - f(x)| \le \varepsilon. \]
Thus, if we put $K_n = \|f_n - f\|$, then, for all $\varepsilon > 0$, there exists $N$ such that for $n \ge N$, $|K_n - 0| \le \varepsilon$; that is, $K_n \to 0$.

Conversely, suppose for each $n$, $K_n$ is an extended real number such that for all $x \in X$, $|f_n(x) - f(x)| \le K_n$, and $K_n \to 0$. Then, for all $\varepsilon > 0$, there exists $N$ such that for $n \ge N$,
\[ |f_n(x) - f(x)| \le K_n < \varepsilon, \quad\text{for all } x \in X. \]
Thus, $f_n \to f$, uniformly on $X$, by definition.

45.7 Example. If $f_n(x) = x^n$, for $x \in [0,1)$, $f_n \to f$ pointwise, where $f(x) = 0$, for $x \in [0,1)$. Then, $|f_n(x) - f(x)| = |x^n|$, for all $x \in [0,1)$. Thus, $\|f_n - f\| = \sup_{x\in[0,1)} |x^n| = 1$. Since this does not converge to 0, $(f_n)$ does not converge uniformly.

45.8 Theorem (Cauchy criterion for uniform convergence). Let $(f_n)$ be a sequence of functions on $X$ to $\mathbb{R}$. Then there exists a function $f$ such that $f_n \to f$ uniformly iff for all $\varepsilon > 0$ there exists $N$ such that for $n, m \ge N$,
\[ |f_n(x) - f_m(x)| \le \varepsilon, \quad\text{for all } x \in X. \tag{C} \]
In terms of supremum norm, (C) is saying $\|f_n - f_m\| \le \varepsilon$.

Proof. The proof of one direction is almost the same as the one for sequences of real numbers. Suppose $(f_n)$ converges uniformly to $f$. Let $\varepsilon > 0$. Then we can choose $N$ such that for all $n \ge N$ and all $x \in X$, $|f_n(x) - f(x)| < \varepsilon/2$. Let $n, m \ge N$.
Then, for each $x \in X$,
\[ |f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
Thus, the condition is satisfied.

Conversely, suppose the condition is satisfied. Namely, for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for $n, m \ge N$, $|f_n(x) - f_m(x)| \le \varepsilon$, for all $x \in X$. Now, fix a particular $x \in X$. Then, for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for $n, m \ge N$, $|f_n(x) - f_m(x)| \le \varepsilon$. This means that the sequence $(f_n(x))_n$ of real numbers is Cauchy. Since the set of real numbers is complete in the usual metric, this sequence converges. Since this held for an arbitrary $x \in X$, we define a function $f$ on $X$ by
\[ f(x) = \lim_n f_n(x), \quad\text{for all } x \in X. \]
So far we have $f_n \to f$ pointwise.

Now, going back to the Cauchy condition, fix $\varepsilon > 0$ and choose $N$ such that for all $n, m \ge N$, and all $x \in X$, $|f_n(x) - f_m(x)| \le \varepsilon$. Let $x \in X$, and $n \ge N$. Then for all $m \ge N$, $|f_n(x) - f_m(x)| \le \varepsilon$. Now let $m \to \infty$. Since $f_m(x) \to f(x)$, this yields
\[ |f_n(x) - f(x)| \le \varepsilon. \]
Since this was true for arbitrary $n \ge N$ and arbitrary $x \in X$, we can say for all $n \ge N$ and all $x \in X$, $|f_n(x) - f(x)| \le \varepsilon$. Thus, for all $\varepsilon > 0$ there exists $N$ such that for $n \ge N$,
\[ |f_n(x) - f(x)| \le \varepsilon, \quad\text{for all } x \in X. \]
That is, $f_n \to f$ uniformly.

Distance interpretation. We saw above that using the idea of supremum norm
\[ \|g\| = \sup_{x\in X} |g(x)|, \]
a sequence of functions $(f_n)$ converges uniformly to $f$ iff $\|f_n - f\| \to 0$, or in other words, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ with
\[ \|f_n - f\| \le \varepsilon, \quad\text{for all } n \ge N. \]
So we could define a distance (the uniform distance) on the set of functions on $X$ by
\[ d(f,g) = \|f - g\|. \]
This satisfies the metric properties:
(1) $d(f,g) \ge 0$,
(2) $d(f,g) = 0$ iff $f = g$,
(3) $d(f,g) = d(g,f)$,
(4) $d(f,g) \le d(f,h) + d(h,g)$,
and we see that $f_n \to f$ uniformly iff $(f_n)$ converges to $f$ in the uniform distance. Similarly, we see that the Cauchy criterion for uniform convergence is just the Cauchy criterion for this distance.

The only thing stopping $d$ from being a metric is the fact that $d(f,g)$ could be $+\infty$; this happens whenever $f - g$ is not a bounded function.
We could call $d$ an extended metric, as some authors do.

Uniform convergence of series of functions. As with sequences and series of numbers, if $(f_n)_{n=1}^\infty$ is a sequence of functions, the corresponding series $\sum_{n=1}^\infty f_n$ refers to the sequence $(s_n)$ of partial sums
\[ s_n = \sum_{k=1}^n f_k. \]
(And there is a similar notation for sequences and series indexed on, say $\{0, 1, 2, \dots\}$.) The series is said to converge (pointwise) if $(s_n)$ converges pointwise. We say that the series $\sum_n f_n$ converges absolutely if the series $\sum_n |f_n|$ converges. (Just as for series of numbers, absolute convergence implies convergence but not conversely.) We say that the series $\sum_n f_n$ converges uniformly if $(s_n)$ does, and the limit is called the sum of the series.

The Cauchy criterion for uniform convergence for series becomes

45.9 Theorem. For a series $\sum_n f_n$ of functions on a set $X$, $\sum_n f_n$ converges uniformly on $X$ iff for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for $n \ge m \ge N$,
\[ \left|\sum_{k=m}^n f_k(x)\right| \le \varepsilon, \quad\text{for all } x \in X. \]

As a corollary we have

45.10 The Weierstrass M-test. A series $\sum_n f_n$ of functions $f_n$ on $X$ is uniformly and absolutely convergent provided there exists a sequence $(M_n)$ of real numbers with $|f_n(x)| \le M_n$, for all $x \in X$ and all $n$, such that $\sum_n M_n$ converges.

The proof should now be an easy exercise. Notice also that if any $M_n$ works in the Weierstrass M-test, then $\|f_n\|$ must also work, since $|f_n(x)| \le M_n$ for all $x \in X$ implies
\[ |f_n(x)| \le \|f_n\| \le M_n \quad\text{for all } x \in X. \]
However, other choices of $M_n$ are often easier to work with.

45.11 Example. Let $f_n(x) = \dfrac{\sin(\cos x)}{2^n + x^2}$, for $x \in \mathbb{R}$. Then $\sum_n f_n$ converges uniformly on $\mathbb{R}$, by the Weierstrass M-test, since $|f_n(x)| \le \frac{1}{2^n}$ and $\sum_n \frac{1}{2^n}$ converges. ($M_n = \frac{1}{2^n}$.) It is not at all clear what the sum is.

45.12 Example (A series converging uniformly for which the Weierstrass M-test does not apply). Let $f_n(x) = (-1)^{n+1}\dfrac{x^2}{n}$ for $x \in [0,5]$. Here the Weierstrass M-test does not apply, since if we put
\[ M_n = \|f_n\| = \sup_{x\in[0,5]} \left|\frac{(-1)^{n+1}x^2}{n}\right| = \frac{5^2}{n}, \]
then $\sum_{n=1}^\infty M_n$ diverges.

However, since, for each $x \in [0,5]$, the sequence $(\frac{x^2}{n})$ is decreasing to 0, the series $\sum_{n=1}^\infty (-1)^{n+1}\frac{x^2}{n}$ converges by the alternating series test to some $f(x)$, and
\[ \left|f(x) - \sum_{k=1}^n (-1)^{k+1}\frac{x^2}{k}\right| \le \frac{x^2}{n+1} \le \frac{5^2}{n+1}. \]
Thus,
\[ \left\|f - \sum_{k=1}^n f_k\right\| = \sup_{x\in[0,5]} \left|f(x) - \sum_{k=1}^n (-1)^{k+1}\frac{x^2}{k}\right| \le \frac{5^2}{n+1} \to 0. \]
Hence, the series $\sum_{n=1}^\infty f_n$ converges uniformly on $[0,5]$.

45.1. Let $f_n(x) = nx^2/(1 + nx^2)$, for all $x \ge 0$.
(a) Prove that for each $t > 0$, $(f_n)$ converges uniformly on $[t, \infty)$, but
(b) prove that $(f_n)$ does not converge uniformly on $[0, \infty)$.

45.2. For a sequence $(f_n)$ of real functions on $X$ and $f$ another one, $f_n \to f$ uniformly on $X$ iff for each sequence $(x_n)$ in $X$, $f_n(x_n) - f(x_n) \to 0$.

45.3. Let $(f_n)$ be a sequence of functions on $X$ which converges pointwise on $X$. Let $F \subset X$ be finite. Prove $(f_n)$ converges uniformly on $F$.

45.4. Decide whether the following sequences of functions are uniformly convergent on their domains.
(a) $(f_n)$, where $f_n : [0,1] \to \mathbb{R}$ is defined by $f_n(x) = nx(1 - x^2)^n$.
(b) $(g_n)$, where $g_n : [0,1] \to \mathbb{R}$ is defined by $g_n(x) = x(1 - x)^n$.

45.5. Prove that the sum of two uniformly convergent sequences of functions converges uniformly.

45.6. Prove or disprove that the product of two uniformly convergent sequences of functions is also uniformly convergent.

45.7. If the series $\sum_{n=1}^\infty f_n$ converges uniformly on $X$, then $f_n \to 0$ uniformly on $X$. The converse doesn't hold.

45.8. Discuss pointwise and uniform convergence of the following series.
(a) $\sum_{n=1}^\infty \dfrac{1}{1 + n^2x^2}$ for $x \in [1, +\infty)$.
(b) $\sum_{n=1}^\infty \dfrac{1}{1 + n^2x^2}$ for $x \in \mathbb{R}$.
(c) $\sum_{n=1}^\infty \dfrac{\sin(nx)}{n^2}$, for $x \in \mathbb{R}$.

45.9. Formulate versions of the definitions and results on uniform convergence of sequences of functions for the metric space valued case. What changes have to be made?

46. UNIFORM CONVERGENCE: CONTINUITY, INTEGRAL, DERIVATIVE.
We know that under pointwise convergence, continuity can be lost in the limit. The same is true of integrability and the value of the integral of a limit function, and since integration is connected to differentiation by the Fundamental Theorem of Calculus, there are problems there as well. But, as promised earlier, the difficulties are to a large extent rectified by uniform convergence.

Continuity. The uniform limit of continuous functions is continuous.

46.1 Theorem. Let $(f_n)$ be a sequence of real functions on a metric space $X$, $a \in X$, and $f_n$ continuous at $a$ for all $n \in \mathbb{N}$. If $f_n \to f$ uniformly on $X$, then $f$ is also continuous at $a$.

Proof. Let $\varepsilon > 0$. By uniform convergence there exists $N$ such that for all $n \ge N$ and for all $x \in X$,
\[ |f_n(x) - f(x)| \le \frac{\varepsilon}{3}. \]
Fix such an $N$. Since $f_N$ is continuous at $a$, we can choose a neighbourhood $U$ of $a$ such that
\[ |f_N(x) - f_N(a)| < \frac{\varepsilon}{3}, \quad\text{for all } x \in U. \]
Thus, for all $x \in U$,
\[ |f(x) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \varepsilon, \]
which shows that $f$ is continuous at $a$.

Applying the above statement to each $a \in X$, we have:

46.2 Theorem. Let $(f_n)$ be a sequence of continuous functions on $X$ converging uniformly to $f$. Then, $f$ is also continuous.

46.3 Example. For each $n \in \mathbb{N}$ let $f_n(x) = \tan^{-1}(nx)$ for all $x \in \mathbb{R}$. Here $\tan^{-1}$ refers to $\arctan$, the inverse of the tangent function restricted to its principal domain $(-\pi/2, \pi/2)$. Now, $\tan^{-1}(0) = 0$; as $y \to \infty$, $\tan^{-1}(y) \to \pi/2$; and as $y \to -\infty$, $\tan^{-1}(y) \to -\pi/2$. Thus,
\[ \lim_n f_n(x) = \lim_n \tan^{-1}(nx) = \begin{cases} \pi/2, & \text{if } x > 0 \\ 0, & \text{if } x = 0 \\ -\pi/2, & \text{if } x < 0 \end{cases} \;=\; \frac{\pi}{2}\operatorname{sgn}(x), \]
where $\operatorname{sgn}$ is the function known as signum, which gives the "sign of $x$" ($+1$ if $x > 0$; $-1$ if $x < 0$; and 0 otherwise). Thus, $f_n \to \frac{\pi}{2}\operatorname{sgn}$ pointwise, but the convergence cannot be uniform, since the $f_n$ are all continuous, but the limit function is discontinuous at 0.

46.4 Corollary. Let $(f_n)$ be a sequence of continuous functions on $X$ such that the series $\sum_{n=1}^\infty f_n$ converges uniformly with sum $f$. Then, $f$ is also continuous.

Proof.
This follows because the convergence of $\sum_{n=1}^\infty f_n$ uniformly to $f$ really means that the sequence of partial sums $s_n = \sum_{k=1}^n f_k$ converges uniformly to $f$. Since the sum of a finite number of continuous functions is continuous, each $s_n$ is continuous, and since $s_n \to f$ uniformly, $f$ is also continuous.

A slight modification of the result about uniform limit of continuous functions gives a corresponding result for a sequence of functions, each of whose limits exist.

46.5 Theorem. Let $(f_n)$ be a sequence of functions on $X$ and let $a$ be an accumulation point of $X$. Suppose $(f_n)$ converges uniformly to $f$ on $X\setminus\{a\}$ and for each $n \in \mathbb{N}$,
\[ \lim_{x\to a} f_n(x) = y_n. \]
Then $\lim_n y_n$ exists and
\[ \lim_{x\to a} f(x) = \lim_n y_n. \]
In other words
\[ \lim_{x\to a}\lim_n f_n(x) = \lim_n\lim_{x\to a} f_n(x). \]

Can you see why $a$ is assumed to be an accumulation point of $X$? If $a$ is not an accumulation point of $X$, then any function on $X$ converges to anything as $x \to a$, so the result would be meaningless. Officially, the notation $\lim$ should not even be used in that case, since the limits would not be unique.

Proof. Let $\varepsilon > 0$. By the Cauchy condition for uniform convergence there exists $N$ such that for all $n, m \ge N$ and for all $x \in X\setminus\{a\}$,
\[ |f_n(x) - f_m(x)| \le \varepsilon. \]
Fix such an $N$. Fix $n, m \ge N$ for a moment and take the limit as $x \to a$. This yields
\[ |y_n - y_m| \le \varepsilon. \]
(Here we used the fact that $a \in \operatorname{acc} X$, so that the limits $y_n$ and $y_m$ are unique.) Thus, for all $n, m \ge N$, $|y_n - y_m| \le \varepsilon$, so the sequence $(y_n)$ is Cauchy, hence converges to some number $y$.

Now, we start again. Let $\varepsilon > 0$. Since $f_n$ converges uniformly on $X\setminus\{a\}$, and $y_n \to y$, there exists $N$ so large that $n \ge N$ implies
\[ |f_n(x) - f(x)| \le \varepsilon/3 \text{ for all } x \in X\setminus\{a\} \quad\text{and}\quad |y_n - y| \le \varepsilon/3. \]
Choose $n = N$. Since $\lim_{x\to a} f_N(x) = y_N$, there is a neighbourhood $U$ of $a$ such that
\[ |f_N(x) - y_N| < \varepsilon/3, \quad\text{for } x \in U\setminus\{a\}. \]
Thus, for $x \in U\setminus\{a\}$,
\[ |f(x) - y| \le |f(x) - f_N(x)| + |f_N(x) - y_N| + |y_N - y| < \varepsilon, \]
which shows that
\[ \lim_{x\to a} f(x) = y = \lim_n y_n, \]
as promised.
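The role of uniform convergence in interchanging the two limits can be illustrated numerically. The sketch below is only a rough illustration under stated assumptions: a single very large $n$ stands in for $\lim_n$, a single $x$ very close to $a = 1$ stands in for $\lim_{x\to 1}$, and the helper names `n_first` and `x_first` are hypothetical. It compares $x^n$ on $[0,1)$ (pointwise but not uniform convergence) with $x + x/n$ on $[0,1)$ (uniform convergence to $x$).

```python
def f(n, x):
    """f_n(x) = x**n: converges pointwise, but not uniformly, to 0 on [0, 1)."""
    return x ** n


def g(n, x):
    """g_n(x) = x + x/n: converges uniformly to x on [0, 1)."""
    return x + x / n


def n_first(h, x, n_big=10 ** 6):
    """Stand-in for lim_n h_n(x): evaluate at one very large n (an assumption)."""
    return h(n_big, x)


def x_first(h, n, x_near=1 - 1e-12):
    """Stand-in for lim_{x->1} h_n(x): evaluate at one x very close to 1."""
    return h(n, x_near)


XS = [1 - 10.0 ** (-k) for k in range(1, 5)]  # x approaching 1 from the left
NS = [10 ** k for k in range(0, 4)]           # n growing

if __name__ == "__main__":
    for name, h in (("x**n", f), ("x + x/n", g)):
        inner_n = [n_first(h, x) for x in XS]  # approximates lim_n, then x -> 1
        inner_x = [x_first(h, n) for n in NS]  # approximates lim_{x->1}, then n grows
        print(name, inner_n, inner_x)
```

For $x^n$ the two orders give values near 0 and near 1 respectively, so the iterated limits disagree; for $x + x/n$ both orders approach 1, as Theorem 46.5 predicts under uniform convergence.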
The reader should prove the corresponding result for series of functions:

46.6 Corollary (Limit past the summation sign). Let $(f_n)$ be a sequence of functions on $X$ and let $a$ be an accumulation point of $X$. Suppose the series $\sum_{n=1}^{\infty} f_n$ converges uniformly on $X$ with sum $f$, and for each $n \in \mathbb{N}$, $\lim_{x \to a} f_n(x) = y_n$. Then $\sum_{n=1}^{\infty} y_n$ converges and
\[ \lim_{x \to a} f(x) = \sum_{n=1}^{\infty} y_n. \]
In other words
\[ \lim_{x \to a} \sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} \lim_{x \to a} f_n(x). \]
This is also referred to as "interchanging limit and sum" under uniform convergence.

Integration. The uniform limit of integrable functions is also integrable, and the integral of the limit is the limit of the integrals. More precisely:

46.7 Theorem (Interchanging limit and integral). If for each $n \in \mathbb{N}$, $f_n$ is Riemann integrable on $[a, b]$ and the sequence $(f_n)$ converges uniformly to $f$ on $[a, b]$, then $f$ is Riemann integrable on $[a, b]$ and
\[ \lim_n \int_a^b f_n = \int_a^b \lim_n f_n = \int_a^b f. \]
(Use of this result is also referred to as "taking a limit under the integral sign".)

Before starting on the proof, recall the Basic Integrability Criterion 42.3(2): $f$ is integrable if and only if for each $\varepsilon > 0$, there exists a partition $P = \{I_1, \dots, I_n\}$ of $[a, b]$, with
\[ \sum_{I \in P} \omega_f(I)\,\lambda(I) < \varepsilon, \]
where $\omega_f(I) = \sup_{s,t \in I} |f(s) - f(t)|$ is the oscillation of $f$ on $I$ and $\lambda(I)$ is the length of the interval $I$.

Proof. For each $n$, let $K_n = \|f_n - f\|$. Since $f_n \to f$ uniformly, $K_n \to 0$. Let $\varepsilon > 0$ and choose $N$ so large that for $m \ge N$, $K_m (b - a) < \varepsilon$. Since $f_N$ is integrable, we can choose a partition $P = \{I_1, \dots, I_n\}$ of $[a, b]$ with
\[ \sum_{I \in P} \omega_{f_N}(I)\,\lambda(I) < \varepsilon. \]
But, for all $s, t$,
\[ |f(s) - f(t)| \le |f(s) - f_N(s)| + |f_N(s) - f_N(t)| + |f_N(t) - f(t)| \le |f_N(s) - f_N(t)| + 2K_N, \]
so
\[ \omega_f(I) \le \omega_{f_N}(I) + 2K_N. \]
Hence,
\[ \sum_{I \in P} \omega_f(I)\,\lambda(I) \le \sum_{I \in P} \omega_{f_N}(I)\,\lambda(I) + 2K_N (b - a) < 3\varepsilon. \]
Thus, $f$ also satisfies the Basic Integrability Criterion, so is integrable.
Finally,
\[ \left| \int_a^b f_n - \int_a^b f \right| = \left| \int_a^b (f_n - f) \right| \le \int_a^b |f_n - f| \le \int_a^b K_n = K_n (b - a) \to 0. \]
This shows
\[ \lim_n \int_a^b f_n = \int_a^b f, \]
as required.

46.8 Corollary (Integration term by term). For each $n \in \mathbb{N}$, let $f_n$ be Riemann integrable on $[a, b]$ and let the series $\sum_{n=1}^{\infty} f_n$ be uniformly convergent with sum $f$. Then, $f$ is also Riemann integrable and
\[ \int_a^b f = \sum_{n=1}^{\infty} \int_a^b f_n. \]

Proof. This follows because the convergence of $\sum_{n=1}^{\infty} f_n$ uniformly to $f$ really means that the sequence of sums $s_n = \sum_{k=1}^{n} f_k$ converges uniformly to $f$. Thus,
\[ \int_a^b f = \int_a^b \lim_n s_n = \lim_n \int_a^b s_n = \lim_n \int_a^b \sum_{k=1}^{n} f_k = \lim_n \sum_{k=1}^{n} \int_a^b f_k = \sum_{n=1}^{\infty} \int_a^b f_n, \]
as required.

Differentiation. The uniform limit of derivatives is a derivative, and we may interchange the limit operation with that of differentiation, provided one more obviously necessary condition is satisfied.

46.9 Theorem (Interchanging limit and derivative). Let $(f_n)$ be a sequence of differentiable functions on $[a, b]$. If
(1) the sequence $(f_n')$ converges uniformly on $[a, b]$ and
(2) there is one point $c \in [a, b]$ such that $(f_n(c))$ converges,
then $(f_n)$ converges uniformly to some $f$, $f$ is differentiable on $[a, b]$ and
\[ \lim_n f_n' = f'. \]

46.10 Note. If we use $D$ to stand for the differentiation operator, we can write this as
\[ Df = D(\lim_n f_n) = \lim_n D f_n. \]
If we were to assume each $f_n$ had a continuous derivative, we could deduce this from the corresponding result for integrals, using the Fundamental Theorem of Calculus. The version given here doesn't even assume that the derivatives are integrable.

Notice also that the uniform convergence of the sequence $(f_n)$ is part of the conclusion, not the hypothesis. It is the sequence of derivatives that is assumed to converge uniformly.

Proof. Let $n, m \in \mathbb{N}$ and $t, x \in [a, b]$.
The Mean Value Theorem, applied to $f_n - f_m$, yields a point $s$ such that
\[ f_n(t) - f_m(t) - (f_n(x) - f_m(x)) = (f_n'(s) - f_m'(s))(t - x). \]
Since $|f_n'(s) - f_m'(s)| \le \|f_n' - f_m'\|$, this yields
\[ |f_n(t) - f_m(t) - (f_n(x) - f_m(x))| \le \|f_n' - f_m'\|\,|t - x|. \tag{$*$} \]
Taking $x = c$, we obtain for all $t \in [a, b]$,
\[ |f_n(t) - f_m(t)| \le |f_n(c) - f_m(c)| + \|f_n' - f_m'\|\,|t - c|. \]
Since $(f_n(c))$ converges, $(f_n')$ converges uniformly, and $|t - c| \le |b - a|$, for each $\varepsilon > 0$ there exists $N$ such that $|f_n(t) - f_m(t)| < \varepsilon$, for $n, m \ge N$ and all $t \in [a, b]$. Thus, $(f_n)$ is uniformly Cauchy, so converges uniformly.

Now, $(*)$ can be written
\[ \left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f_m(t) - f_m(x)}{t - x} \right| \le \|f_n' - f_m'\|. \]
Let $f$ be the limit of $(f_n)$ and $g$ the limit of $(f_n')$. Fix $x \in [a, b]$ and fix $n$ for a moment. Then, letting $m \to \infty$ yields
\[ \left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f(t) - f(x)}{t - x} \right| \le \|f_n' - g\|. \]
Since $f_n' \to g$ uniformly, this shows that the sequence of functions $\varphi_n$ defined by
\[ \varphi_n(t) = \frac{f_n(t) - f_n(x)}{t - x} \]
converges uniformly, and we can interchange the order of limits, obtaining
\[ \lim_{t \to x} \frac{f(t) - f(x)}{t - x} = \lim_{t \to x} \lim_n \frac{f_n(t) - f_n(x)}{t - x} = \lim_n \lim_{t \to x} \frac{f_n(t) - f_n(x)}{t - x} = \lim_n f_n'(x) = g(x). \]
Thus, $f' = g$ and by hypothesis, $f_n' \to f'$ uniformly, as promised.

As in the case of integration, the result just proved yields a corresponding result about differentiating series.

46.11 Corollary (Differentiation term-by-term). Let $\sum_{n=1}^{\infty} f_n$ be a series of differentiable real functions defined on $[a, b]$ such that the series $\sum_{n=1}^{\infty} f_n'$ converges uniformly and $\sum_{n=1}^{\infty} f_n(c)$ converges for some point $c$. Then $\sum_{n=1}^{\infty} f_n$ converges uniformly and if $f$ denotes the sum, then
\[ f' = \sum_{n=1}^{\infty} f_n'. \]
The proof is a straightforward exercise.

46.1. Let $f$ be the Dirichlet function defined on $[0, 1]$ by
\[ f(x) = \begin{cases} \frac{1}{n}, & \text{if } x = \frac{m}{n} \text{ is rational in lowest terms} \\ 0, & \text{otherwise.} \end{cases} \]
For each $k$ let $g_k$ be the maximum of $f$ and $1/k$. Use uniform convergence to prove that $\int_0^1 f = 0$.

46.2. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f_n(x) = \begin{cases} \dfrac{\sin(nx)}{nx}, & x \ne 0 \\ 1, & x = 0. \end{cases} \]
Prove that $(f_n)$ does not converge uniformly, using limit interchange theorems, that is, the theorems of this section.

46.3. Prove that if $(f_n)$ is a sequence of bounded functions on $X$ converging uniformly to the function $f$, then $f$ is also bounded.

47. A CONTINUOUS NOWHERE DIFFERENTIABLE FUNCTION ON R.

This is an example of the use of the theorem on sums of a uniformly convergent series of functions to prove a remarkable result. It is a modification of an example of Weierstrass. He built his example out of trigonometric functions, but we will use the technically simpler "sawtooth" functions.

Start with a function $g : \mathbb{R} \to \mathbb{R}$ defined by setting $g(x) = |x|$, for $x \in [-2, 2]$, and extend this periodically by putting $g(x + 4) = g(x)$, for all $x \in \mathbb{R}$. You can check that $g(x)$ is the minimum distance from $x$ to the set $4\mathbb{Z} = \{4m : m \in \mathbb{Z}\}$, namely
\[ g(x) = \inf\{|x - y| : y \in 4\mathbb{Z}\} = \inf\{|x - 4m| : m \in \mathbb{Z}\}. \]
For each $n \in \mathbb{N}$, let $f_n(x) = \dfrac{g(4^n x)}{4^n}$ and define $f = \sum_{n=1}^{\infty} f_n$. Since $|f_n(x)| \le \dfrac{2}{4^n}$, for all $x$, and $\sum_{n=1}^{\infty} \dfrac{2}{4^n}$ converges, we know the series converges uniformly. Since each $f_n$ is continuous, the sum $f$ is also continuous. We will prove that $f$ is nowhere differentiable.

Graphs of $g, f_1, f_2, f_3$ appear here and graphs of the partial sums $s_2 = f_1 + f_2$ and $s_3 = f_1 + f_2 + f_3$ appear on the page following the proof (for technical typesetting reasons).

[Figure: graphs of $g$, $f_1$, $f_2$, $f_3$ for $x \in [-2, 2]$.]

Notice that if an open interval $(a, b)$ contains no even integer, then $|g(a) - g(b)| = |a - b|$.

Let $a \in \mathbb{R}$. We will prove that $f$ is not differentiable at $a$. Suppose otherwise. The interval $(4^k a - 1, 4^k a + 1)$ can contain at most one even integer. Put
\[ \delta_k = \begin{cases} 1, & \text{if there is no even integer in } (4^k a, 4^k a + 1) \\ -1, & \text{otherwise (so that there is no even integer in } (4^k a - 1, 4^k a)\text{)} \end{cases} \qquad \text{and} \qquad h_k = \frac{\delta_k}{4^k}. \]
Since $h_k \to 0$, the difference quotient
\[ q_k := \frac{f(a + h_k) - f(a)}{h_k} \]
converges to
$f'(a)$. And hence, $q_k - q_{k-1} \to 0$. However, for each $n$, $f_n$ has period $4^{1-n}$, so for $n > k$, $h_k = \delta_k / 4^k$ is a multiple of this, and $f_n(a + h_k) - f_n(a) = 0$. Therefore
\[ q_k = \frac{f(a + h_k) - f(a)}{h_k} = \sum_{n=1}^{k} \frac{f_n(a + h_k) - f_n(a)}{h_k}. \]
But there is no even integer between $4^k a$ and $4^k a + \delta_k$, so for $n \le k$, there is no even integer between $4^n a$ and $4^n a + 4^n h_k$ (multiplying such an integer by $4^{k-n}$ would give an even integer between $4^k a$ and $4^k a + \delta_k$), hence
\[ |f_n(a + h_k) - f_n(a)| = 4^{-n}\,|g(4^n a + 4^n h_k) - g(4^n a)| = |h_k|. \]
Thus, $\dfrac{f_n(a + h_k) - f_n(a)}{h_k}$ is either $1$ or $-1$ and
\[ q_k = \sum_{n=1}^{k} \pm 1 \]
is odd if $k$ is odd and even if $k$ is even. Hence $|q_k - q_{k-1}| \ge 1$, which doesn't converge to $0$, a contradiction.

Another formula for the function $g$ which gives the minimum distance to $4\mathbb{Z}$ is
\[ g(x) = \left| x - 4 \left\lfloor \frac{x + 2}{4} \right\rfloor \right|, \]
where, as usual, $\lfloor a \rfloor$ is the greatest integer less than or equal to $a$. To see this, let $m = \lfloor \frac{x + 2}{4} \rfloor$. Then,
\[ m \le \frac{x + 2}{4} < m + 1, \]
so that
\[ -2 \le x - 4m < 2. \]
If $n \in \mathbb{Z}$ with $n \ge m + 1$, we have
\[ x - 4n \le x - 4m - 4 < 2 - 4 = -2 \le x - 4m, \]
and if $n \le m - 1$, then
\[ x - 4n \ge x - 4m + 4 \ge -2 + 4 = 2 > x - 4m. \]
In both cases $|x - 4n| \ge 2 \ge |x - 4m|$, so $|x - 4m|$ is indeed the minimum distance.

47.1. Let $f = \sum_{n=0}^{\infty} f_n$, where $f_n(t) = \dfrac{\sin((n!)^2 t)}{n!}$. Then $f$ is continuous and nowhere differentiable.
The proof uses: (1) $\sum_{k=n}^{\infty} \dfrac{1}{k!} \le \dfrac{2}{n!}$. (2) $|\sin x - \sin y| \le |x - y|$. (3) for every integer $k \ge 3$, there exists $y$ with $|x - y| \le \dfrac{3}{k}$ and $|\sin kx - \sin ky| \ge 1$.

The original example of Weierstrass was $\sum_{n=1}^{\infty} \dfrac{\sin b^n x}{a^n}$, with $b$ an integer and $b/a$ and $a$ sufficiently large.

[Figure: graphs of the partial sums $s_2$ and $s_3$ for $x \in [-2, 2]$.]

48. POWER SERIES

A series of the form
\[ \sum_{n=0}^{\infty} a_n (x - c)^n \]
is called a power series about $c$. (The same name is given also to series where the summation starts at $n = 1$ (or larger).) This can be considered a series of functions, $\sum_n f_n$, where $f_n(x) = a_n (x - c)^n$. On what domains does this define a series which converges pointwise? On what domains does it converge uniformly?
Can we integrate this series, or differentiate it, by integrating or differentiating the nice polynomials $a_n (x - c)^n$? The answer is surprisingly often YES.

For the moment we consider power series of real numbers (and "real variables"). We will make some remarks about the complex case afterward.

The radius of convergence of the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ is
\[ R = \sup\Big\{ r : \sum_n a_n r^n \text{ converges} \Big\} \quad \text{(possibly } +\infty\text{)}. \]
The value $+\infty$ is given when the set $\{ r : \sum_n a_n r^n \text{ converges} \}$ is not bounded above. One should note that $R \ge 0$, since the series $\sum_{n=0}^{\infty} a_n r^n$ converges to $a_0$ when $r = 0$.

48.1 Lemma. For the power series $\sum_n a_n (x - c)^n$, put $\alpha = \limsup_n |a_n|^{1/n}$. Then the radius of convergence is $R = \frac{1}{\alpha}$ (interpreted as $+\infty$, if $\alpha = 0$, and as $0$, if $\alpha = +\infty$).

Proof. This is an application of the root test, which says that a series $\sum_n x_n$ converges absolutely if $\limsup_n |x_n|^{1/n} < 1$ and diverges if $\limsup_n |x_n|^{1/n} > 1$. Let $\alpha$ be as given in the statement. If $\alpha$ is finite, and $r \ge 0$, then
\[ \limsup_n |a_n r^n|^{1/n} = \limsup_n \big(|a_n|^{1/n} r\big) = r \big(\limsup_n |a_n|^{1/n}\big) = \alpha r. \]
Thus the series $\sum_n a_n r^n$ converges absolutely for $r < 1/\alpha$ and diverges for $r > 1/\alpha$. This shows that the radius of convergence is $1/\alpha$.

In case $\alpha = 0$, $r\alpha = 0 < 1$, so the series converges for all $r$ and hence the radius of convergence is $+\infty$.

Finally, in case $\alpha = +\infty$, $\limsup_n |a_n r^n|^{1/n} = +\infty > 1$, unless $r = 0$, in which case $\limsup_n |a_n r^n|^{1/n} = \limsup_n 0 = 0$. Thus, the radius of convergence is $0$.

Notice also that this proof actually showed that, in the definition of radius of convergence, one could use $|a_n|$ in place of $a_n$, obtaining absolute convergence.

48.2 Theorem. Let $R$ be the radius of convergence of the power series $\sum_n a_n (x - c)^n$.
(1) Then the series converges pointwise on $(c - R, c + R)$.
(2) If $0 \le r < R$, then the series $\sum_n a_n (x - c)^n$ converges uniformly on $[c - r, c + r]$.
(3) Put $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$, for $x \in (c - R, c + R)$.
Then:
(a) If $a, b \in (c - R, c + R)$ with $a \le b$, then $f$ is integrable on $[a, b]$ with
\[ \int_a^b f(x)\,dx = \sum_{n=0}^{\infty} \int_a^b a_n (x - c)^n\,dx = \sum_{n=0}^{\infty} a_n \left[ \frac{(x - c)^{n+1}}{n + 1} \right]_{x=a}^{b}. \]
(b) $f$ is differentiable with
\[ f'(x) = \sum_{n=0}^{\infty} n a_n (x - c)^{n-1}, \quad \text{for } x \in (c - R, c + R). \]

The interval of convergence of the series is the largest interval on which the series converges. It consists of $(c - R, c + R)$ together with those endpoints at which it converges.

Proof. (The statements are vacuously satisfied if $R = 0$.)

((1) and (2)) Convergence pointwise on $(c - R, c + R)$ means convergence for each fixed $x \in (c - R, c + R)$. But if $x$ belongs to this interval, equivalently if $|x - c| < R$, then there exists $r$ with $|x - c| < r < R$; hence, if we can show the series converges uniformly on $[c - r, c + r]$, it has to converge at $x$. Thus it is enough to prove (2).

So, suppose $0 \le r < R$. Then $\sum_n |a_n| r^n$ converges. But then, for all $n$ and all $x \in [c - r, c + r]$, $|a_n (x - c)^n| \le |a_n| r^n$, so by the Weierstrass $M$-test, the series converges absolutely, and uniformly for $x \in [c - r, c + r]$.

(3) Let $f_n(x) = a_n (x - c)^n$, defined for $x \in (c - R, c + R)$. Let $a, b \in (c - R, c + R)$ and choose $0 \le r < R$ so that $[a, b] \subset [c - r, c + r]$. Then $\sum_n f_n$ converges to $f$ uniformly on $[c - r, c + r]$, hence on $[a, b]$, so by the theorem on integrating uniformly convergent series,
\[ \int_a^b f = \sum_{n=0}^{\infty} \int_a^b f_n = \sum_{n=0}^{\infty} a_n \left[ \frac{(x - c)^{n+1}}{n + 1} \right]_{x=a}^{b}. \]
This proves part (a).

To prove part (b), we apply the result on differentiating the sum of a series. However, for that result, we need uniformity of the convergence of the series
\[ \sum_{n=0}^{\infty} f_n' \]
of derivatives in a neighbourhood of the point where we intend to differentiate. Now,
\[ \sum_{n=0}^{\infty} f_n'(x) = \sum_{n=0}^{\infty} a_n n (x - c)^{n-1}. \tag{D} \]
The radius of convergence of this "differentiated" series is determined by
\[ \limsup_n |n a_n|^{1/n} = \limsup_n |n|^{1/n} |a_n|^{1/n} = \limsup_n |a_n|^{1/n}, \]
since $n^{1/n} \to 1$. Thus, the series (D) has radius of convergence $R$.
Since also
\[ \sum_{n=0}^{\infty} f_n(c) \text{ converges to } a_0, \]
we can differentiate term by term and get
\[ f'(x) = \sum_{n=0}^{\infty} f_n'(x), \]
for all $x \in [c - r, c + r]$. If we are given a particular point $x$ in $(c - R, c + R)$, we simply choose $r$ so that $x$ belongs to $[c - r, c + r]$ to complete the proof.

48.3 Note. We see from the above arguments that the radii of convergence of the three series
\[ \sum_n a_n (x - c)^n, \qquad \sum_n a_n n (x - c)^{n-1} \qquad \text{and} \qquad \sum_n \frac{a_n}{n + 1} (x - c)^{n+1} \]
are the same.

48.4 Corollary. Suppose $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$, for $x \in (c - R, c + R)$. Then for each $k \in \mathbb{N}$, $f$ is $k$-times differentiable in this interval, with derivative
\[ f^{(k)}(x) = \sum_{n=k}^{\infty} \frac{n!}{(n - k)!}\, a_n (x - c)^{n-k}, \]
and in particular,
\[ f^{(k)}(c) = k!\,a_k, \quad \text{so} \quad a_k = \frac{f^{(k)}(c)}{k!}. \]
Thus, we obtain the representation of $f$ in terms of its so-called Taylor's series:
\[ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n, \quad \text{for } x \in (c - R, c + R). \]
The proof is a simple exercise.

48.5 Warning. The above theorem does not say that an infinitely differentiable function is always the sum of its Taylor's series. It only applies to functions which are already known to be expandable as the sum of a power series. The function defined on $\mathbb{R}$ by
\[ f(x) = \begin{cases} e^{-1/x^2}, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0 \end{cases} \]
has $f^{(n)}(0) = 0$, for all $n \in \mathbb{N}$, so its Taylor's series about $0$ is the series
\[ \sum_{n=0}^{\infty} \frac{0}{n!} x^n, \]
the "0 series". Its sum is $0$ for all $x$, which is certainly different from $f$. The remainder term in Taylor's Theorem is actually the entire function $f$.

We know that, if $R$ is the radius of convergence of the series $\sum_{n=0}^{\infty} a_n (x - c)^n$, then it converges pointwise on $(c - R, c + R)$ and uniformly on any closed interval contained in this, but not necessarily on the whole interval. (The geometric series $\sum_n x^n$ shows this.) It diverges at each point outside the closed interval $[c - R, c + R]$. What happens to the uniformity if there is convergence at an endpoint?

48.6 Abel's Theorem. If the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges at $c + R$, then it converges uniformly on $[c, c + R]$.
Similarly, if it converges at $c - R$, then it converges uniformly on $[c - R, c]$. Consequently, $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges uniformly on each closed interval contained in its interval of convergence.

Proof. Without loss of generality, we may (and do) assume $c = 0$ and $R = 1$. (Why is this?) Let $\varepsilon > 0$. Since $\sum_{n=0}^{\infty} a_n$ converges, it is Cauchy, so we may find $N$ such that
\[ \left| \sum_{i=N}^{m} a_i \right| < \varepsilon, \quad \text{for all } m \ge N, \]
or what is the same
\[ \left| \sum_{i=0}^{k} a_{N+i} \right| < \varepsilon, \quad \text{for all } k \ge 0. \]
Put $A_k = \sum_{i=0}^{k} a_{N+i}$, so $|A_k| < \varepsilon$ for all $k$. Then, for $x \in [0, 1]$,
\begin{align*}
\sum_{i=0}^{k} a_{N+i} x^{N+i} &= a_N x^N + a_{N+1} x^{N+1} + \dots + a_{N+k} x^{N+k} \\
&= A_0 x^N + (A_1 - A_0) x^{N+1} + (A_2 - A_1) x^{N+2} + \dots + (A_k - A_{k-1}) x^{N+k} \\
&= x^N \big[ A_0 + (A_1 - A_0) x + (A_2 - A_1) x^2 + \dots + (A_k - A_{k-1}) x^k \big] \\
&= x^N \big[ A_0 (1 - x) + A_1 (x - x^2) + \dots + A_{k-1} (x^{k-1} - x^k) + A_k x^k \big].
\end{align*}
Now the absolute value of this is at most
\begin{align*}
& x^N \big[ |A_0| (1 - x) + |A_1| (x - x^2) + \dots + |A_{k-1}| (x^{k-1} - x^k) + |A_k| x^k \big] \\
&\quad \le x^N \big[ \varepsilon (1 - x) + \varepsilon (x - x^2) + \dots + \varepsilon (x^{k-1} - x^k) + \varepsilon x^k \big] = x^N \varepsilon,
\end{align*}
since the sum telescopes. Thus,
\[ \left| \sum_{i=N}^{m} a_i x^i \right| \le \varepsilon, \quad \text{for all } m \ge N \text{ and } x \in [0, 1], \]
which shows that the series satisfies the Cauchy condition for uniform convergence on $[0, 1]$.

The proof for the case of convergence at $c - R$ is similar. Alternatively, it can be deduced from the previous case by replacing $x - c$ by $-(x - c)$; that is, by considering the series $\sum_n (-1)^n a_n (x - c)^n$.

48.7 Corollary. If $\sum_{n=0}^{\infty} a_n$ converges, then $\lim_{x \to 1} \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} a_n$.

48.8 Example. Let $f(x) = 1/(1 + x)$, for $x \ne -1$. Since $1/(1 - x) = \sum_{n=0}^{\infty} x^n$, for $|x| < 1$, we also have $f(x) = \sum_{n=0}^{\infty} (-1)^n x^n$, for $|x| < 1$. The interval of convergence of this series is just $(-1, 1)$. The Taylor's series for $f$ about $0$ is just this, but $f$ is not the sum of its Taylor's series, except on that interval.

Now, for each $x \in [0, 1)$, we may integrate term by term, yielding
\[ \int_0^x \frac{1}{1 + t}\,dt = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}. \]
This latter series also converges at $x = 1$, because it is an alternating series of decreasing terms.
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]
Thus, by Abel's Theorem,
\[ \lim_{x \to 1} \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}. \]
Assuming the properties of natural logarithms, we are now able to sum this series, because
\[ \lim_{x \to 1} \int_0^x \frac{1}{1 + t}\,dt = \int_0^1 \frac{1}{1 + t}\,dt = \ln(1 + 1) - \ln(1 + 0) = \ln 2. \]

The complex case. The definition of power series remains unchanged if the coefficients $a_n$ and the center $c$ are replaced by complex numbers. Traditionally, the variable $z$ is used instead of $x$. Again, using the same proofs, we find that the series $\sum_{n=0}^{\infty} a_n (z - c)^n$ converges absolutely in the ball $\{z : |z - c| < R\}$, uniformly on any smaller ball, and diverges for $|z - c| > R$, where again $R = 1/\alpha$, $\alpha = \limsup_n |a_n|^{1/n}$.

The statement about integrating the series term by term doesn't make sense with the definition of integral we are using: we have no concept of integration with respect to a complex variable.

The complex derivative of a function $f : Z \to \mathbb{C}$, where $Z \subset \mathbb{C}$, is defined as in the real case by $f'(z_0) = \lim_{z \to z_0} \dfrac{f(z) - f(z_0)}{z - z_0}$. If $f(z)$ is given by a power series, the theorem on differentiation term-by-term is still true, but our proof doesn't apply, because it depends on the Mean Value Theorem, which is not valid in the complex case. (See Example 50.6.)

The product of two power series. Given series $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$, the convolution or (Cauchy) product of these two series is $\sum_{n=0}^{\infty} c_n$, where $c_n = \sum_{k=0}^{n} a_k b_{n-k}$.

Power series give the motivation for this definition. If one formally multiplies the two power series $\sum_n a_n z^n$ and $\sum_n b_n z^n$ term by term, collecting terms containing the same power of $z$ (as if they were polynomials), one obtains
\begin{align*}
\Big( \sum_n a_n z^n \Big) \Big( \sum_n b_n z^n \Big) &= (a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \dots)(b_0 + b_1 z + b_2 z^2 + \dots) \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + \dots \\
&= c_0 + c_1 z + c_2 z^2 + \dots.
\end{align*}
Taking $z = 1$ gives the above definition.

48.9 Theorem.
Suppose
(a) $\sum_{n=0}^{\infty} a_n$ converges absolutely with sum $A$ and
(b) $\sum_{n=0}^{\infty} b_n$ converges with sum $B$.
Then, the series $\sum_{n=0}^{\infty} c_n$, with $c_n = \sum_{k=0}^{n} a_k b_{n-k}$, converges with sum $AB$.

Proof. Put $A_n = \sum_{k \le n} a_k$, $B_n = \sum_{k \le n} b_k$, and $C_n = \sum_{k \le n} c_k$. Then,
\begin{align*}
C_n &= a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + \dots + (a_0 b_n + a_1 b_{n-1} + \dots + a_n b_0) \\
&= a_0 B_n + a_1 B_{n-1} + a_2 B_{n-2} + \dots + a_n B_0 \\
&= (a_0 + a_1 + \dots + a_n) B + a_0 (B_n - B) + a_1 (B_{n-1} - B) + a_2 (B_{n-2} - B) + \dots + a_n (B_0 - B) \\
&= A_n B + W_n,
\end{align*}
where
\[ W_n = a_0 (B_n - B) + a_1 (B_{n-1} - B) + \dots + a_n (B_0 - B). \]
Since $A_n B \to AB$, our job is to show that $W_n \to 0$. Let $\bar{A} = \sum_{n=0}^{\infty} |a_n|$, which was assumed finite by (a). Let $\varepsilon > 0$ and choose $N$ so large that $|B_n - B| \le \varepsilon$, for $n \ge N$. Then, for $n \ge N$,
\begin{align*}
|W_n| &\le |a_0| \varepsilon + |a_1| \varepsilon + |a_2| \varepsilon + \dots + |a_{n-N}| \varepsilon \\
&\qquad + |a_{n-N+1} (B_{N-1} - B) + a_{n-N+2} (B_{N-2} - B) + \dots + a_n (B_0 - B)| \\
&\le \bar{A} \varepsilon + |a_{n-N+1} (B_{N-1} - B) + a_{n-N+2} (B_{N-2} - B) + \dots + a_n (B_0 - B)|.
\end{align*}
Here, there are $N$ terms, each of which tends to $0$ as $n \to \infty$, since $a_k \to 0$. Thus, this is less than $\bar{A} \varepsilon + \varepsilon$ for all $n$ large enough. This shows that $W_n \to 0$, and we are done.

The same conclusion holds also with the absolute convergence in (a) replaced by convergence, provided the product series is known to converge.

48.10 Theorem. If $\sum_{n=0}^{\infty} a_n$ converges with sum $A$, $\sum_{n=0}^{\infty} b_n$ converges with sum $B$ and $\sum_{n=0}^{\infty} c_n$ converges with sum $C$, where $c_n = \sum_{k=0}^{n} a_k b_{n-k}$, then $AB = C$.

Proof. Define $f(x) = \sum_{n=0}^{\infty} a_n x^n$, $g(x) = \sum_{n=0}^{\infty} b_n x^n$ and $h(x) = \sum_{n=0}^{\infty} c_n x^n$, for $x \in [0, 1]$. For $x < 1$, the series converge absolutely and hence may be multiplied using the Cauchy product, so that
\[ f(x) g(x) = h(x) \quad (0 \le x < 1). \]
Because of Abel's Theorem, these functions are continuous at $1$:
\[ f(x) \to A, \qquad g(x) \to B, \qquad h(x) \to C, \]
so that $AB = C$, as required.

48.1. If $\lim_n \left| \dfrac{a_{n+1}}{a_n} \right| = \alpha$, the radius of convergence of $\sum_n a_n (x - c)^n$ is $\dfrac{1}{\alpha}$ (again interpreted as $+\infty$, if $\alpha = 0$; $0$, if $\alpha = +\infty$).

48.2. Abel's theorem states: If the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges at $c + R$, then it converges uniformly on $[c, c + R]$.
Similarly, if it converges at $c - R$, then it converges uniformly on $[c - R, c]$. In our proof, we stated it is enough to prove this for the special case $R = 1$ and $c = 0$. Prove that the result is indeed deducible from this case.

48.3. Prove that there is exactly one function $F$ on $\mathbb{R}$ to $\mathbb{R}$ such that $F'' = F$, $F(0) = 1$, $F'(0) = 0$.

48.4. If $f(x) = \sum_{n=1}^{\infty} x^n / n$ on $[0, 1)$, find (the improper integral) $\int_0^1 f$. (With proof, of course.)

49. THE EXPONENTIAL AND TRIGONOMETRIC FUNCTIONS

Here we define the complex exponential function, establish its main properties, and use it to obtain other elementary functions. We begin in outline form and give details afterward.

(1) We define the exponential function for every complex number $z$ by
\[ \exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \]
The series converges absolutely for every $z$ and uniformly on each bounded subset of $\mathbb{C}$.
(2) $\exp(a + b) = \exp(a) \exp(b)$, for all $a, b \in \mathbb{C}$.
(3) $\exp(0) = 1$, $\exp(1) = e$, $\exp(-z) = \exp(z)^{-1}$.
(4) $\exp(z) \ne 0$, for all $z$.
(5) $\exp'(z) = \exp(z)$ (complex differentiation).
(6) The restriction of $\exp$ to $\mathbb{R}$ is strictly increasing, continuous and positive. It agrees with the map $x \mapsto e^x$, defined earlier, so one also writes $e^z$ for $\exp(z)$.
(7) $\lim_{x \to -\infty} \exp(x) = 0$, $\lim_{x \to +\infty} \exp(x) = +\infty$ and hence, $\exp$ maps $\mathbb{R}$ onto $(0, +\infty)$.
(8) The map $t \mapsto \exp(it)$ maps $\mathbb{R}$ into [actually onto, as we see in step (11)] the unit circle and we define the cosine and sine functions as the real and imaginary parts of this map. Thus,
\[ \cos t = \operatorname{Re}(\exp(it)), \qquad \sin t = \operatorname{Im}(\exp(it)), \]
or what is the same
\[ e^{it} = \cos t + i \sin t \quad \text{("the Euler identity")}. \]
(9) $\sin$ and $\cos$ are differentiable with
\[ \sin' = \cos, \qquad \cos' = -\sin. \]
(10) The functions $\cos$ and $\sin$ have power series representations
\[ \cos t = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \dots = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!}, \]
\[ \sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \dots = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k + 1)!}. \]
(11) There is a smallest positive number $\pi$ such that $\exp(i\pi/2) = i$. Then, the interval $[0, 2\pi)$ is mapped by $t \mapsto$
$\exp(it)$ onto the unit circle and $\exp(z) = 1$ if and only if $z = (2\pi i)k$, for some integer $k$.
(12) $\exp$ maps $\mathbb{C}$ onto $\mathbb{C} \setminus \{0\}$. Thus, for every complex number $w$ other than $0$, there exists $z$ such that $e^z = w$.

Details (proofs).

(1) Since
\[ \frac{z^{n+1}}{(n + 1)!} \Big/ \frac{z^n}{n!} = \frac{z}{n + 1} \to 0, \]
the defining series
\[ \sum_{n=0}^{\infty} \frac{z^n}{n!} \]
converges, by the ratio test, for all $z \in \mathbb{C}$. Thus, the definition is valid, the radius of convergence is $R = +\infty$, and hence on each disk $\{z : |z| \le r\}$, the series converges uniformly and absolutely.

(2) $\exp(a + b) = \exp(a) \exp(b)$, for all $a, b \in \mathbb{C}$. To see this, we use the Cauchy product (convolution) of the series. Since $\sum_n \frac{a^n}{n!}$ converges absolutely and $\sum_n \frac{b^n}{n!}$ converges,
\begin{align*}
\sum_{k=0}^{\infty} \frac{a^k}{k!} \sum_{m=0}^{\infty} \frac{b^m}{m!} &= \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{a^k b^{n-k}}{k!\,(n - k)!} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \frac{n!}{k!\,(n - k)!}\, a^k b^{n-k} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} (a + b)^n
\end{align*}
by the binomial theorem.

(3) As usual for power series, $0^0$ is defined to be $1$ and since all the other terms of the series for $\exp(0)$ vanish, $\exp(0) = 1$. Long ago we proved that $\exp(1) = \lim_n (1 + 1/n)^n = e$. From (2), $\exp(z) \exp(-z) = \exp(z - z) = \exp(0) = 1$, so that $\exp(-z) = \exp(z)^{-1}$.

(4) That $\exp(z) \ne 0$, for all $z$, is an immediate consequence.

(5) Using complex differentiation, $\exp'(z)$ means
\[ \lim_{w \to z} \frac{\exp(w) - \exp(z)}{w - z} = \lim_{h \to 0} \frac{\exp(z + h) - \exp(z)}{h}. \]
By (2) this is the same as
\[ \exp(z) \lim_{h \to 0} \frac{\exp(h) - \exp(0)}{h}. \]
But, using the series, we see that
\[ \frac{\exp(h) - \exp(0)}{h} = \frac{\sum_{n=0}^{\infty} \frac{h^n}{n!} - 1}{h} = \sum_{n=1}^{\infty} \frac{h^{n-1}}{n!} = 1 + h \sum_{n=2}^{\infty} \frac{h^{n-2}}{n!}, \]
which converges to $1$ as $h \to 0$. Thus, $\exp'(z) = \exp(z)$, as claimed.

(6) It follows from (5) that the restriction of $\exp$ to $\mathbb{R}$ also has $\exp'(x) = \exp(x)$. So we see that this function is differentiable, hence continuous. Since $\exp(0) = 1$ and since $\exp(x) \ne 0$, for all $x$, we know by the Intermediate Value Theorem that $\exp(x) > 0$, for all real $x$. But then, $\exp'(x) > 0$, so $\exp$ is strictly increasing on $\mathbb{R}$.
Applying (2) by induction, one finds that for $n$ a natural number, $\exp(n) = e^n$, and by similar arguments (see Section 13), $\exp(r) = e^r$, for rational $r$. By continuity, we find that, for real $x$,
\[ \exp(x) = \sup\{e^r : r < x,\ r \in \mathbb{Q}\} = \inf\{e^s : s > x,\ s \in \mathbb{Q}\}. \]
This is the way we defined $e^x$. Thus, $\exp(x) = e^x$, for all real $x$, and one also writes $e^z$ for $\exp(z)$.

(7) Since, for positive real $x$, $\exp(x) = \sum_n x^n / n! > x$, we have $\lim_{x \to +\infty} \exp(x) = +\infty$, and
\[ \lim_{x \to -\infty} \exp(x) = \lim_{x \to \infty} \exp(-x) = \lim_{x \to \infty} 1/\exp(x) = 0. \]
By the Intermediate Value Theorem, this entails that $\exp$ maps $\mathbb{R}$ onto $(0, +\infty)$.

(8) Recall that for a complex number $z = x + iy$, the complex conjugate is $\bar{z} = x - iy$ and $|z|^2 = z\bar{z}$. Now, let $t$ be real. From the series
\[ \exp(it) = \sum_n \frac{(it)^n}{n!} \]
we see that $\exp(-it)$ is the complex conjugate of $\exp(it)$. Indeed, those terms with even $n$ are real and don't change when we replace $i$ by $-i$. Those terms with odd $n$ are imaginary, and are negated when we replace $i$ by $-i$. Consequently
\[ |\exp(it)|^2 = \exp(it) \exp(-it) = 1. \]
This shows that for all $t \in \mathbb{R}$, $|\exp(it)| = 1$; that is, the map $t \mapsto \exp(it)$ maps $\mathbb{R}$ into the unit circle.

We define the cosine and sine functions by
\[ \cos t = \operatorname{Re}(\exp(it)), \qquad \sin t = \operatorname{Im}(\exp(it)); \]
that is,
\[ e^{it} = \cos t + i \sin t \quad \text{("the Euler identity")}. \]

(9) Differentiating $\exp(it)$ with respect to $t$ gives
\[ \cos' t + i \sin' t = i \exp(it) = i(\cos t + i \sin t) = -\sin t + i \cos t, \]
so that $\sin$ and $\cos$ are differentiable with
\[ \cos' = -\sin, \qquad \sin' = \cos. \]
Note: we should be careful here. We are comparing differentiation with respect to a complex variable with differentiation with respect to a real variable.
We know for each $z$,
\[ \lim_{w \to z} \frac{\exp(w) - \exp(z)}{w - z} = \exp z, \]
so
\[ \lim_{w \to it} \frac{\exp(w) - \exp(it)}{w - it} = \exp it. \]
Hence, if we restrict $w$ to only run through values of the form $w = is$, for $s \in \mathbb{R}$, then
\[ \lim_{s \to t} \frac{\exp(is) - \exp(it)}{is - it} = \exp it. \]
This gives
\[ \lim_{s \to t} \left( \frac{\cos s - \cos t}{s - t} + i\,\frac{\sin s - \sin t}{s - t} \right) = i \exp(it), \]
justifying the calculation at the beginning of this paragraph.

(10) As we noted above, the terms of the series for $\exp(it)$ with even powers of $it$ are real and those with odd powers of $it$ are imaginary. That is
\[ \cos t + i \sin t = \exp(it) = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k + 1)!}. \]
In other words,
\[ \cos t = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{(2k)!}, \qquad \sin t = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k+1}}{(2k + 1)!}. \]

(11) Now we are going to define $\pi$. We know that $\cos 0 + i \sin 0 = \exp(i0) = 1$, so $\cos 0 = 1$ and from the series representation of $\cos$,
\[ \cos 2 = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \frac{2^6}{6!} + \dots < 1 - \frac{2^2}{2!} + \frac{2^4}{4!} = -1/3. \]
Thus, by the Intermediate Value Theorem, there exists a $t > 0$ with $\cos t = 0$. Because of the continuity of $\cos$, we can take $t_0$ to be the smallest such $t$. Define $\pi$ to be $2t_0$; thus, $\pi$ is the smallest positive number such that $\cos \pi/2 = 0$.

Now, for $t \in [0, \pi/2)$, $\cos t > 0$. But $\sin' t = \cos t$, so $\sin$ is strictly increasing on $[0, \pi/2)$ and hence, since $\sin 0 = 0$, $\sin > 0$ on $(0, \pi/2]$. Since $\cos^2(\pi/2) + \sin^2(\pi/2) = 1$, $\sin^2(\pi/2) = 1$, and hence $\sin(\pi/2) = 1$. It follows that
\begin{align*}
e^{i\pi/2} &= \cos(\pi/2) + i \sin(\pi/2) = i \\
e^{i\pi} &= e^{i\pi/2} e^{i\pi/2} = i^2 = -1 \\
e^{i3\pi/2} &= -i \tag{$*$} \\
e^{i2\pi} &= 1.
\end{align*}

Now, let $z = u + iv$ be on the unit circle. If $u > 0$, $v \ge 0$, there exists $t \in [0, \pi/2)$ with $\cos t = u$. But then $v^2 = \sin^2 t$, so that $v = \sin t$, since both are non-negative. If $u \le 0$, $v > 0$, then $(u + iv)(-i) = v - iu = e^{it}$, for some $t$, and hence $z = e^{i(t + \pi/2)}$. Finally, if $v < 0$, then $-z = e^{it}$, for some $t$, by the previous 2 cases, so $z = e^{i(t + \pi)}$.

We now prove that $\exp(z) = 1$ if and only if $z = (2\pi i)k$, for some integer $k$.
Certainly from $(*)$, for each $k \in \mathbb{Z}$,
\[ e^{i 2\pi k} = 1. \]
To prove the converse, first observe that on $(0, \pi)$, $\cos' = -\sin < 0$, so $\cos$ is strictly decreasing there and on $(\pi, 2\pi)$, $\cos' > 0$, so $\cos$ is strictly increasing there. Hence, $0$ is the only $y \in [0, 2\pi)$ with $e^{iy} = 1$.

If $z = x + iy$ then $\exp(z) = e^x e^{iy}$. If $e^z = 1$, then $|e^z| = e^x = 1$, so $x = 0$ and $e^{iy} = 1$. Let $k = \lfloor y / 2\pi \rfloor$, the greatest integer less than or equal to $y / 2\pi$. Then, $\exp(i(y - 2\pi k)) = 1$ with $y - 2\pi k \in [0, 2\pi)$, so $y - 2\pi k = 0$, and thus $y = 2\pi k$. Thus, $z = (2\pi i)k$, as required.

(12) Let $w$ be a complex number other than $0$. Then, $|w| = e^x$, for some real $x$, and $w/|w|$ is on the unit circle, so is of the form $e^{iy}$, for some real $y$. Hence $w = e^{x + iy}$, as required to establish all the properties listed.

The connection with the angles of geometry. An angle $\theta$, in radian measure, is identified with the length of an arc: namely, the arc traced out on the unit circle as one rotates the point $(1, 0)$ through the angle $\theta$. If
\[ \gamma(t) = e^{it}, \quad t \in [0, 2\pi], \]
the curve $\gamma$ has range the unit circle and the length of this arc is, by Theorem 51.3,
\[ \int_0^{2\pi} |\gamma'(t)|\,dt = \int_0^{2\pi} 1\,dt = 2\pi, \]
and more generally, if $t$ moves from $0$ to $\theta$, $\gamma$ traces out an arc of length $\int_0^{\theta} 1\,dt = \theta$. So the approach we have taken is consistent with the angle interpretation, and we find the cosine and sine of an angle $\theta$, as defined in terms of right triangles, is also consistent with the definitions of $\cos$ and $\sin$ here.

50. DIFFERENTIATION OF VECTOR-VALUED AND COMPLEX-VALUED FUNCTIONS

The definition of derivative given for real-valued functions applies without change to those with values in $\mathbb{C}$ or $\mathbb{R}^n$. Thus, let $f$ be a function defined on an interval $I$ of $\mathbb{R}$ containing the point $c$ with values in $\mathbb{C}$ or in $\mathbb{R}^n$. We say $f$ is differentiable at $c$ if the limit
\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
exists. If so, this limit is called the derivative of $f$ at $c$.
The function $f'$, with domain the set of points where $f$ is differentiable, defined by
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
is called the derivative of $f$.

In the vector case, if $f = (f_1, \dots, f_n)$, that is, $f_i(x)$ is the $i$th component of $f(x)$, for each $i$, we see that $f$ is differentiable at $c$ if and only if $f_i$ is differentiable at $c$, for all $i$, and we have
\[ f'(c) = (f_1'(c), \dots, f_n'(c)). \]
As a vector space, $\mathbb{C}$ is identified with $\mathbb{R}^2$ through the correspondence
\[ x + iy \leftrightarrow (x, y) \]
and we see that for a function $f : I \to \mathbb{C}$, $f = f_1 + i f_2$, $f$ is differentiable at $c$ iff each of $f_1$ and $f_2$ is differentiable at $c$, and
\[ f'(c) = f_1'(c) + i f_2'(c). \]
The Tangent Characterization (35.1) holds true again in this setting, with the identical proof. The only change is that some numbers become vectors:

50.1 Theorem. For a function $f : I \to \mathbb{R}^n$ and $x \in I$, $f$ is differentiable at $x$ iff there exist a $v \in \mathbb{R}^n$ and a function $\varepsilon : I \to \mathbb{R}^n$ with $\lim_{t \to x} \varepsilon(t) = \varepsilon(x) = 0$ such that for all $t \in I$,
\[ f(t) = f(x) + v(t - x) + \varepsilon(t)(t - x). \]
In this case, $f'(x) = v$.

The product $v(t - x)$ here means multiplication of the vector $v$ by the scalar $t - x$. If $f'(x) \ne 0$, as $t$ moves along the real line, $f(x) + f'(x)(t - x)$ traces out a straight line tangent to the image of $f$ at $f(x)$. Since $\varepsilon(t) \to 0$, as $t$ gets close to $x$, the error in using the point on the tangent line to approximate the point $f(t)$ becomes small, even compared to $t - x$. [This is actually the meaning assigned to the word "tangent" in this setting.]

The formula in the tangent characterization can also be written:
\[ f(x + h) = f(x) + vh + \varepsilon_0(h) h, \]
where $\varepsilon_0(h) \to 0$ as $h \to 0$. Of course this is meaningful only for those $h$ for which $x + h \in I$.

The tangent characterization of derivative again immediately gives continuity.

50.2 Theorem. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

To obtain a chain rule for this notion of derivative, we have to be careful about the ranges of the two functions.
50.3 The chain rule. Let $I$ be an interval and $g : I \to \mathbb{R}$ be differentiable at $x_0$. Let $g(I) \subset J$, another interval of $\mathbb{R}$, $f : J \to \mathbb{R}^n$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.

This is proved just as before. It will also be deducible from the version given elsewhere for functions of a vector variable. Notice that here if $g$ had its values in $\mathbb{R}^n$, $f$ would have to be defined on a set of elements of $\mathbb{R}^n$ and the present definition of derivative would not apply.

The simple algebraic results now become:

50.4 Theorem. If $f$ is constant on the interval $I$, then $f$ is differentiable on $I$ with $f'(x) = 0$ for all $x \in I$.

50.5 Theorem. Let $f : I \to \mathbb{R}^n$, $g : I \to \mathbb{R}^n$ be differentiable at $c \in I$ and let $k \in \mathbb{R}$. Then:
(a) $kf$ is differentiable at $c$ with $(kf)'(c) = k f'(c)$;
(b) $f + g$ is differentiable at $c$ and $(f + g)'(c) = f'(c) + g'(c)$ (sum rule);
(c) $f \cdot g$ is differentiable at $c$ and $(f \cdot g)'(c) = f'(c) \cdot g(c) + f(c) \cdot g'(c)$ (product rule for dot product).

Of course, there is also a version of the product rule for one of the functions $f$ and $g$ scalar valued and the other vector valued, and a corresponding quotient rule.

The Mean Value Theorem does not hold for vector (or complex) valued functions defined on a real interval.

50.6 Example (Complex MVT fails). For each $t \in [0, 2\pi]$, let $f(t) = e^{it}$. Then, $f(2\pi) - f(0) = 1 - 1 = 0$; but $f'(t) = i e^{it}$ has absolute value $1$, for all $t$. Hence there is no point $c$ with $f(2\pi) - f(0) = f'(c)(2\pi - 0)$.

But there is a consequence of the Mean Value Theorem that does hold; it is the one that we used to show that a function with a bounded derivative is Lipschitz (Theorem 36.10).

50.7 Mean Value Inequality. Let $f : [a, b] \to \mathbb{R}^n$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then, there exists $x \in (a, b)$ such that
\[ |f(b) - f(a)| \le |f'(x)|\,|b - a|. \]

Proof. Put $z = f(b) - f(a)$ and define, for $t \in [a, b]$,
\[ \varphi(t) = z \cdot f(t). \]
Then, $\varphi$ is a real-valued function, continuous on $[a, b]$, differentiable in $(a, b)$. Therefore there exists $x \in (a, b)$ with
\[ \varphi(b) - \varphi(a) = \varphi'(x)(b - a). \]
But $z$ is constant, so by the product rule,
\[ \varphi'(x) = 0 \cdot f(x) + z \cdot f'(x) = z \cdot f'(x). \]
As for the other side of the equation,
\[ \varphi(b) - \varphi(a) = z \cdot f(b) - z \cdot f(a) = z \cdot (f(b) - f(a)) = |z|^2. \]
Thus,
\[ |z|^2 = z \cdot f'(x)(b - a) \le |z|\,|f'(x)|\,(b - a), \]
by the Cauchy-Schwarz inequality, and the result follows by cancelling the $|z|$ (if $z = 0$ there is nothing to prove).

We will have a further generalization of this result when we turn to functions of a vector variable (Theorem 53.4).

50.1. Prove the chain rule for vector functions of a real variable: Let $I$ be an interval and $g : I \to \mathbb{R}$ be differentiable at $x_0$. Let $g(I) \subset J$, another interval of $\mathbb{R}$, $f : J \to \mathbb{R}^n$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.

50.2. Let $\gamma : [a, b] \to \mathbb{R}^n$ be differentiable with constant norm; that is, $|\gamma(t)|$ is the same for all $t$. Prove $\gamma'(t)$ is orthogonal to $\gamma(t)$, for all $t$.

51. INTEGRATION OF VECTOR-VALUED FUNCTIONS

Let $f_1, \dots, f_n$ be functions on $[a, b] \subset \mathbb{R}$ to $\mathbb{R}$, and let $f = (f_1, \dots, f_n)$ be the corresponding function on $[a, b]$ to $\mathbb{R}^n$. The Riemann definition of integral as the limit of Riemann sums as the mesh of the tagged partitions tends to $0$ still makes sense here, and one finds that $f$ is integrable iff each of $f_1, \dots, f_n$ is integrable and
\[ \int_a^b f = \left( \int_a^b f_1, \dots, \int_a^b f_n \right). \]
When we need (or want) to show the variable of integration this becomes
\[ \int_a^b f(t)\, dt = \left( \int_a^b f_1(t)\, dt, \dots, \int_a^b f_n(t)\, dt \right). \]
It is clear this integral is still linear and additive on intervals, by just applying the real case to each coordinate.
The Fundamental Theorem of Calculus also carries over, coordinate by coordinate. Let us state the "integrating a derivative" form.

51.1 Theorem. If $F$ maps $[a, b]$ into $\mathbb{R}^n$, $F' = f$ on $[a, b]$, and $f$ is Riemann integrable, then
\[ \int_a^b f(t)\, dt = F(b) - F(a). \]

The result about the integral of the absolute value of an integrable function $f$ is also true, but the proof is a little trickier, since $f$ is vector-valued and $|f|$ is real-valued. (By the way, remember that $|f|$ means the function whose value at $x$ is $|f(x)|$, the norm of the vector $f(x)$, not really absolute value. We are reserving $\|f\|$ for the supremum norm of the function.)

51.2 Theorem. Let $f : [a, b] \to \mathbb{R}^n$ be Riemann integrable. Then $|f|$ is Riemann integrable and
\[ \left| \int_a^b f \right| \le \int_a^b |f|. \]

Proof. The hypothesis implies that each of the components $f_1, \dots, f_n$ is integrable, so each of the squares $f_1^2, \dots, f_n^2$ is integrable and so is their sum. The square root function is continuous, so $|f| = (\sum_i f_i^2)^{1/2}$ is also integrable.

Put $w_i = \int_a^b f_i$, so that $w = (w_1, \dots, w_n)$ is $\int_a^b f$. Then,
\[ |w|^2 = \sum_i w_i^2 = \sum_i w_i \int_a^b f_i = \int_a^b \sum_i w_i f_i. \]
By the Cauchy-Schwarz inequality, for all $t \in [a, b]$,
\[ \sum_i w_i f_i(t) \le |w|\,|f(t)|. \]
Thus,
\[ |w|^2 \le \int_a^b |w|\,|f(t)|\, dt = |w| \int_a^b |f|, \]
and the result follows by cancelling the $|w|$ (if $w = 0$ there is nothing to prove).

Rectifiable curves: arc length. A continuous mapping $\gamma$ of an interval $[a, b]$ into $\mathbb{R}^n$ is called a (parametrized) curve, because its range $C = \{\gamma(t) : t \in [a, b]\}$ can be considered a geometric curve, traced out by the point $\gamma(t)$ as $t$ moves from $a$ to $b$. The "distance traveled" by the point $\gamma(t)$ is thought of as the length of the curve. But one should keep in mind that the same set $C$ corresponds to many different maps $\gamma$ and these may have different lengths. (For example, the curve could wrap around a circle several times.) If $\gamma$ is one-to-one, it is often called an arc. If $\gamma(a) = \gamma(b)$, $\gamma$ is called a closed curve.
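The remark that one geometric curve can carry parametrizations of different lengths can be illustrated numerically. The following Python sketch is only an informal illustration: it approximates the "distance traveled" by summing chord lengths over a uniform partition, and the helper names `polygonal_length` and `circle`, as well as the number of subdivisions, are assumptions made for this example rather than anything from the text.

```python
import cmath
import math


def polygonal_length(gamma, a, b, k):
    """Sum of the chord lengths |gamma(x_i) - gamma(x_{i-1})| over a
    uniform partition of [a, b] into k pieces (k is an arbitrary choice)."""
    pts = [gamma(a + (b - a) * i / k) for i in range(k + 1)]
    return sum(abs(q - p) for p, q in zip(pts, pts[1:]))


def circle(t):
    """The curve t -> e^{it}, viewed as a point of C = R^2."""
    return cmath.exp(1j * t)


# Same range (the unit circle), two different parametrizations.
once = polygonal_length(circle, 0.0, 2 * math.pi, 10_000)   # one trip around
twice = polygonal_length(circle, 0.0, 4 * math.pi, 10_000)  # two trips around

print(once, 2 * math.pi)    # the two numbers should nearly agree
print(twice, 4 * math.pi)   # the two numbers should nearly agree
```

Both curves have the unit circle as range, yet the chord sums approach $2\pi$ and $4\pi$ respectively, which is why length is attached to the map $\gamma$ and not to the set $C$.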
For a firm definition of the length of a curve, associate to each partition $P$ of $[a, b]$, determined by points $a = x_0 \le x_1 \le \dots \le x_k = b$, the number
\[ \ell(\gamma, P) = \sum_{i=1}^{k} |\gamma(x_i) - \gamma(x_{i-1})|. \]
This is the sum of the distances between the points $\gamma(x_i)$ and $\gamma(x_{i-1})$, so is the length of a polygonal path with vertices $\gamma(x_0), \gamma(x_1), \dots, \gamma(x_k)$. We define the length of $\gamma$ to be
\[ \ell(\gamma) = \sup\{\ell(\gamma, P) : P \text{ a partition of } [a, b]\}. \]
It is easy to see that the approximations $\ell(\gamma, P)$ increase as the partition $P$ gets finer. If $\ell(\gamma)$ is finite, we call the curve $\gamma$ rectifiable. Notice that, although a process similar to integration is used, integration and differentiation are not involved in the definition.

A continuously differentiable curve is one for which $\gamma'$ is continuous on the parameter interval. Sometimes these are called "smooth" curves, but the geometric curve traced out by such a curve could have sharp points, so some authors reserve this word for continuously differentiable $\gamma$ for which $\gamma'(t) \ne 0$, for all $t$. (Those who call continuously differentiable curves "smooth" often call those which have no zero derivatives regular.)

For continuously differentiable curves, we do have a formula in terms of integration and differentiation.

51.3 Theorem. If $\gamma : [a, b] \to \mathbb{R}^n$ is a continuously differentiable curve, then $\gamma$ is rectifiable and
\[ \ell(\gamma) = \int_a^b |\gamma'(t)|\, dt. \]

Proof. Let $P = \{x_0, x_1, \dots, x_k\}$ be a partition of $[a, b]$. For each $i$ we have, by the Fundamental Theorem of Calculus,
\[ |\gamma(x_i) - \gamma(x_{i-1})| = \left| \int_{x_{i-1}}^{x_i} \gamma'(t)\, dt \right| \le \int_{x_{i-1}}^{x_i} |\gamma'(t)|\, dt. \]
Summing over $i$ gives
\[ \ell(\gamma, P) \le \int_a^b |\gamma'(t)|\, dt, \]
and taking the supremum over all such partitions yields
\[ \ell(\gamma) \le \int_a^b |\gamma'(t)|\, dt. \]
For the reverse inequality, let $\varepsilon > 0$, and use the fact that $\gamma'$ is uniformly continuous to choose $\delta > 0$ such that $|\gamma'(s) - \gamma'(t)| < \varepsilon$, whenever $|s - t| < \delta$. Then, choose any partition $P = \{x_0, \dots, x_k\}$ of $[a, b]$ with mesh $\|P\| < \delta$.
For each $t \in [x_{i-1}, x_i]$, we have
\[ |\gamma'(t)| \le |\gamma'(x_i)| + \varepsilon, \]
and integrating gives, with $\Delta x_i = x_i - x_{i-1}$,
\begin{align*}
\int_{x_{i-1}}^{x_i} |\gamma'(t)|\, dt
&\le |\gamma'(x_i)|\,\Delta x_i + \varepsilon\,\Delta x_i
 = |\gamma'(x_i)\,\Delta x_i| + \varepsilon\,\Delta x_i
 = \left| \int_{x_{i-1}}^{x_i} \gamma'(x_i)\, dt \right| + \varepsilon\,\Delta x_i \\
&= \left| \int_{x_{i-1}}^{x_i} \bigl( \gamma'(t) + (\gamma'(x_i) - \gamma'(t)) \bigr)\, dt \right| + \varepsilon\,\Delta x_i \\
&\le \left| \int_{x_{i-1}}^{x_i} \gamma'(t)\, dt \right| + \int_{x_{i-1}}^{x_i} |\gamma'(x_i) - \gamma'(t)|\, dt + \varepsilon\,\Delta x_i \\
&\le |\gamma(x_i) - \gamma(x_{i-1})| + 2\varepsilon\,\Delta x_i.
\end{align*}
Summing over $i$, and using $\sum_i \Delta x_i = b - a$, yields
\[ \int_a^b |\gamma'(t)|\, dt \le \ell(\gamma, P) + 2\varepsilon(b - a) \le \ell(\gamma) + 2\varepsilon(b - a). \]
Since $\varepsilon$ is arbitrary,
\[ \int_a^b |\gamma'(t)|\, dt \le \ell(\gamma), \]
which is all that was left to prove.

51.1. Let $I$ be an interval of $\mathbb{R}$ with $0 \in \operatorname{int}(I)$ and $\gamma(t) = (1 + t^3, t^2)$, for $t \in I$. Then, $\gamma$ is continuously differentiable, but $\gamma'(0) = 0$ and the geometric curve $\gamma(I)$ has a sharp point at $\gamma(0)$.

51.2. Let $\gamma_1$ be a rectifiable curve in $\mathbb{R}^n$ defined on $[a, b]$, $\varphi$ a continuous 1-1 mapping of $[c, d]$ onto $[a, b]$, and $\gamma_2(s) = \gamma_1(\varphi(s))$, for all $s \in [c, d]$. Prove $\gamma_1$ and $\gamma_2$ have the same length.

51.3. Let $\gamma : [a, b] \to \mathbb{R}^n$ be a one-to-one curve. Let $\sigma : [c, d] \to \mathbb{R}^n$ be another one such that $\gamma([a, b]) = \sigma([c, d])$. Prove that the two curves have the same length. [Hint: these curves have continuous inverses; see CONTINUITY AND COMPACTNESS, Exercise 31.2.] Note: Differentiation and integration are not involved.

51.4. Prove that if $\gamma_1 : [a, b] \to \mathbb{R}^n$ and $\gamma_2 : [b, c] \to \mathbb{R}^n$ are curves and $\gamma$ is the curve on $[a, c]$ that extends both of these (we call it $\gamma_1 + \gamma_2$), then $\ell(\gamma) = \ell(\gamma_1) + \ell(\gamma_2)$.

52. DIFFERENTIATION OF VECTOR FUNCTIONS OF A VECTOR VARIABLE

We now study the general case of mappings from a subset of $\mathbb{R}^n$ into $\mathbb{R}^m$. These are often called transformations or vector valued functions of several variables.

We recall that for real functions defined on an interval of $\mathbb{R}$, $f$ was differentiable at a point $c$ if and only if there was a straight line approximating $f$ closely at $c$, and a similar result held for vector valued functions of a real variable.
This is the clue for the definition in the vector-to-vector case.

A mapping $T$ of a vector space $E_1$ into another vector space $E_2$ is called a linear transformation if, for all $x, y \in E_1$, and all scalars $t$,
\[ T(x + y) = Tx + Ty, \qquad T(tx) = t\,Tx. \]
Note that for linear maps, one often writes $Tx$ instead of $T(x)$.

Definition. Let $G$ be an open set of $\mathbb{R}^n$ and let $f$ be a mapping into $\mathbb{R}^m$, defined (at least) on $G$. We say that $f$ is differentiable at $x_0 \in G$, provided there exist a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ and a function $\varepsilon : G \to \mathbb{R}^m$ such that $\lim_{x \to x_0} \varepsilon(x) = \varepsilon(x_0) = 0$ and, for all $x \in G$,
\[ f(x) = f(x_0) + T(x - x_0) + \varepsilon(x)\,|x - x_0|. \]
If this is satisfied, $T$ is denoted $f'(x_0)$ or $Df(x_0)$, and called the derivative of $f$ at $x_0$.

Of course the function $\varepsilon$ is given by
\[ \varepsilon(x) = \frac{f(x) - f(x_0) - T(x - x_0)}{|x - x_0|}, \]
for $x \ne x_0$ (and $0$, if $x = x_0$). It should be noticed that we had to use $|x - x_0|$, instead of $x - x_0$, because we cannot divide by a vector.

Another way we could write the defining condition is
\[ f(x) = f(x_0) + Df(x_0)(x - x_0) + R(x), \]
where the remainder $R(x)$ satisfies
\[ \lim_{x \to x_0} \frac{|R(x)|}{|x - x_0|} = 0. \]
Such a term $R(x)$ is said to be $o(|x - x_0|)$ as $x \to x_0$. At the risk of causing boredom, we also note that this could be written
\[ f(x_0 + h) = f(x_0) + Df(x_0)h + R_1(h), \qquad \text{where } \lim_{h \to 0} \frac{|R_1(h)|}{|h|} = 0. \]

As we shall reconfirm shortly, linear transformations on $\mathbb{R}^n$ are continuous, so it follows, as in the real setting, that differentiability implies continuity.

52.1 Theorem. If $G$ is an open subset of $\mathbb{R}^n$ and $f : G \to \mathbb{R}^m$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.

Connection with the usual definition for functions of a real variable. If $T : \mathbb{R} \to \mathbb{R}$ is a linear mapping, then its value at $h \in \mathbb{R}$ is $Th = T(1)h$. That is, if $a = T(1)$, then $Th = ah$, ordinary multiplication. Conversely, every function of that form is linear.
So, when we write
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + R(x), \]
it doesn't matter whether we think of $f'(x_0)$ as a number which is multiplied by $x - x_0$, or as a linear mapping, evaluated at $x - x_0$.

Similarly, if $v$ is a vector of $\mathbb{R}^m$, then the mapping $h \in \mathbb{R} \mapsto vh$ (multiplication of $v$ by the scalar $h$) is a linear mapping. Conversely, if $T : \mathbb{R} \to \mathbb{R}^m$ is a linear transformation, and $v$ is the vector $T(1)$, then $T$ at $h$ is again $h\,T(1) = vh$. Once more, then, it doesn't matter whether we think of $f'(x_0)(x - x_0)$ as a vector multiplied by the scalar $x - x_0$ or as a linear transformation acting on $x - x_0$.

The case of real-valued functions of a vector variable. From elementary linear algebra, we learn:

52.2 Theorem. If $T : \mathbb{R}^n \to \mathbb{R}$, then $T$ is a linear transformation if and only if there exists $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ such that $Tx = a_1 x_1 + a_2 x_2 + \dots + a_n x_n = a \cdot x$, for all $x \in \mathbb{R}^n$.

The linearity of such a map follows immediately from the properties of the dot product. For the converse, let $e_1, e_2, \dots, e_n$ be the standard basis vectors of $\mathbb{R}^n$,
\begin{align*}
e_1 &= (1, 0, 0, \dots, 0) \\
e_2 &= (0, 1, 0, \dots, 0) \\
&\;\;\vdots \\
e_n &= (0, 0, 0, \dots, 1),
\end{align*}
so the $j$th coordinate of $e_i$ is $1$, if $i = j$, and $0$ otherwise. Then, each $x \in \mathbb{R}^n$ is of the form
\[ x = (x_1, \dots, x_n) = \sum_{i=1}^{n} x_i e_i, \]
and we see that $Tx = \sum_i x_i\,Te_i = (a_1, \dots, a_n) \cdot (x_1, \dots, x_n) = a \cdot x$, where $a_i = T(e_i)$, for all $i$.

Thus, if a function $f$ on an open set $G$ of $\mathbb{R}^n$ with values in $\mathbb{R}$ is differentiable at $x_0$, then there is a vector $a \in \mathbb{R}^n$ such that $f'(x_0)h = a \cdot h$, for all $h \in \mathbb{R}^n$. The vector $a$ here will turn out to be what is called the gradient of $f$ at $x_0$.

Often we think of the graph of the mapping $f : G \subset \mathbb{R}^n \to \mathbb{R}$, namely $S = \{(x, f(x)) : x \in G\}$, as an $n$-dimensional surface in $\mathbb{R}^{n+1}$. The graph of the map $x \mapsto f(x_0) + f'(x_0)(x - x_0)$ then becomes a (hyper)plane tangent to $S$ at $(x_0, f(x_0))$.

The space $L(\mathbb{R}^n, \mathbb{R}^m)$.
The identification of the derivative as a real number or as a vector with the derivative as a linear transformation becomes clearer when we notice that the set $L(\mathbb{R}^n, \mathbb{R}^m)$ of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is itself a vector space under the usual operations
\[ (S + T)(x) = S(x) + T(x), \qquad (cT)(x) = c\,(T(x)). \]
The correspondences mentioned above are actually vector space isomorphisms. Thus, as vector spaces,
(1) $\mathbb{R}$ can be identified with $L(\mathbb{R}, \mathbb{R})$ under the isomorphism $a \mapsto T_a$, where $T_a h = ah$, for all real $h$;
(2) $\mathbb{R}^m$ can be identified with $L(\mathbb{R}, \mathbb{R}^m)$ under the correspondence $v \mapsto T_v$, where $T_v(h) = vh$, multiplication by a scalar;
(3) $\mathbb{R}^n$ can be identified with $L(\mathbb{R}^n, \mathbb{R})$ under the correspondence $a \mapsto T_a$, where $T_a(x) = a \cdot x$, for all $x \in \mathbb{R}^n$.

Of course, we are not only interested in the algebraic properties of derivatives, but also in related distances. To handle that, we introduce the norm of a linear transformation $T$,
\[ \|T\| = \sup_{|x| \le 1} |Tx|. \]
Notice that if $x \ne 0$, then $\bigl| x/|x| \bigr| = 1$, so
\[ \left| T\!\left( \frac{x}{|x|} \right) \right| = \frac{1}{|x|}\,|Tx| \le \|T\|, \]
and hence,
\[ |Tx| \le \|T\|\,|x|. \]
This evidently holds also when $x = 0$. On the other hand, if $K \ge 0$ satisfies $|Tx| \le K|x|$, for all $x \in \mathbb{R}^n$, we have by the definition of supremum as a least upper bound that $\|T\| \le K$. Thus,

52.3 Lemma. For $T \in L(\mathbb{R}^n, \mathbb{R}^m)$, $\|T\|$ is the least $K \ge 0$ such that $|Tx| \le K|x|$, for all $x \in \mathbb{R}^n$.

Now, let $e_1, \dots, e_n$ be the standard basis vectors of $\mathbb{R}^n$, and $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then, $x = \sum_{j=1}^{n} x_j e_j$, so
\[ |Tx| = \Bigl| \sum_{j=1}^{n} x_j\,Te_j \Bigr| \le \sum_{j=1}^{n} |x_j|\,|Te_j| \le |x| \Bigl( \sum_{j=1}^{n} |Te_j|^2 \Bigr)^{1/2}, \]
by the Cauchy-Schwarz inequality. This shows that
\[ \|T\| \le \Bigl( \sum_{j=1}^{n} |Te_j|^2 \Bigr)^{1/2} < +\infty. \]
It follows that $T$ is Lipschitz, hence uniformly continuous, because
\[ |Tx - Ty| = |T(x - y)| \le \|T\|\,|x - y|, \quad \text{for all } x, y \in \mathbb{R}^n. \]

52.4 Theorem. The map $T \mapsto$
$\|T\|$ turns $L(\mathbb{R}^n, \mathbb{R}^m)$ into a normed space; that is, for all $S, T \in L(\mathbb{R}^n, \mathbb{R}^m)$ and all scalars $c$,
(1) $0 \le \|T\| < \infty$;
(2) $\|S + T\| \le \|S\| + \|T\|$; and
(3) $\|cT\| = |c|\,\|T\|$.

One can check that in the 3 special cases above, the correspondences also preserve the norm, as follows.

52.5 Theorem.
(1) If $a \in \mathbb{R}$ and $T_a h = ah$, for all real $h$, then $\|T_a\| = |a|$.
(2) If $v \in \mathbb{R}^m$ and $T_v(h) = vh$, multiplication by the scalar $h$, then $\|T_v\| = |v|$.
(3) If $a \in \mathbb{R}^n$ and $T_a(x) = a \cdot x$, for all $x \in \mathbb{R}^n$, then $\|T_a\| = |a|$.

Rules of differentiation. We first consider the case where $f = T$ is already a linear mapping. Then,
\[ f(x) = f(x_0) + T(x - x_0) + 0, \]
so $T'(x_0) = f'(x_0) = T$. Thus,

52.6 Theorem. Every linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at each point $x$ of $\mathbb{R}^n$, with $T'(x) = T$.

We emphasize that we are not saying $T' = T$. In general, if $f : G \to \mathbb{R}^m$, then $f' = Df$ is not a transformation of $\mathbb{R}^n$ to $\mathbb{R}^m$, but rather a mapping that associates to each $x \in G$ a linear transformation $f'(x) \in L(\mathbb{R}^n, \mathbb{R}^m)$. In the case where $f = T$ is linear, the value of $f'(x)$ is $T$, for all $x$.

Derivatives of constant maps are $0$.

52.7 Theorem. Let $G$ be an open subset of $\mathbb{R}^n$ and let $f : G \to \mathbb{R}^m$ be defined by $f(x) = v$, for all $x \in G$. Then, $f'(x) = 0$, for all $x \in G$.

This is immediate from the definition and is left as an exercise.

52.8 The chain rule. Let $G$ be an open subset of $\mathbb{R}^n$ and $g : G \to \mathbb{R}^m$ be differentiable at $x_0$. Let $f$ map an open set $U$ containing $g(G)$ into $\mathbb{R}^k$, and let $f$ be differentiable at $u_0 = g(x_0)$. Then, $f \circ g$ is differentiable at $x_0$, with $(f \circ g)'(x_0) = f'(u_0) g'(x_0) = f'(g(x_0)) g'(x_0)$.

Here, if $T = g'(x_0)$ and $S = f'(g(x_0))$, we are talking about the composite mapping $S \circ T = ST$.

Proof. Since $g$ is differentiable at $x_0$, there exists a function $\varepsilon_1$, continuous and $0$ at $x_0$, with
\[ g(x) = g(x_0) + g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0|, \tag{29} \]
for all $x \in G$.
Since $f$ is differentiable at $u_0$, there exists a function $\varepsilon_2$ which is continuous and $0$ at $u_0 = g(x_0)$, with
\[ f(u) = f(u_0) + f'(u_0)(u - u_0) + \varepsilon_2(u)\,|u - u_0|, \tag{30} \]
for all $u$ in $U$. Replacing $u$ by $g(x)$ in (30) gives
\[ f(g(x)) = f(u_0) + f'(u_0)(g(x) - u_0) + \varepsilon_2(g(x))\,|g(x) - u_0|. \tag{31} \]
But $u_0 = g(x_0)$, so by (29) we may replace $g(x) - u_0$ by $g'(x_0)(x - x_0) + \varepsilon_1(x)|x - x_0|$, yielding
\begin{align*}
f(g(x)) &= f(g(x_0)) + f'(u_0)\bigl[ g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr] \\
&\qquad + \varepsilon_2(g(x))\,\bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr| \tag{32} \\
&= f(g(x_0)) + f'(u_0) g'(x_0)(x - x_0) + R(x),
\end{align*}
where
\[ R(x) = f'(u_0)\,\varepsilon_1(x)\,|x - x_0| + \varepsilon_2(g(x))\,\bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr|. \]
Thus, we need only show that the remainder $R(x)$ satisfies $|R(x)|/|x - x_0| \to 0$, as $x \to x_0$. Well,
\[ \bigl| g'(x_0)(x - x_0) + \varepsilon_1(x)\,|x - x_0| \bigr| \le \|g'(x_0)\|\,|x - x_0| + |\varepsilon_1(x)|\,|x - x_0|, \]
so
\[ \frac{|R(x)|}{|x - x_0|} \le \|f'(u_0)\|\,|\varepsilon_1(x)| + |\varepsilon_2(g(x))|\,\bigl( \|g'(x_0)\| + |\varepsilon_1(x)| \bigr) \to 0, \]
as required. Here we used the fact that $g$ is continuous at $x_0$, and $\varepsilon_2$ is continuous and $0$ at $g(x_0) = u_0$.

52.9 Remark. The fact that for linear transformations we write $Tx$ instead of $T(x)$ and $ST$ instead of $S \circ T$ hides some subtleties in the computation above. For example, if $S$ denotes $f'(u_0)$, $T$ denotes $g'(x_0)$, and $h$ denotes $x - x_0$, going from the first to the second line of (32), we calculate
\[ S\bigl( T(h) + \varepsilon_1(x)|h| \bigr) = S(T(h)) + S(\varepsilon_1(x)|h|) = (S \circ T)(h) + S(\varepsilon_1(x))\,|h|, \]
where we have "taken out" the scalar $|h|$.

52.10 Theorem. Let $f$ and $g$ be functions defined and differentiable on an open set containing $x_0 \in \mathbb{R}^n$ with values in $\mathbb{R}^m$ and let $c \in \mathbb{R}$. Then $f + g$ and $cf$ are differentiable at $x_0$ and $(f + g)'(x_0) = f'(x_0) + g'(x_0)$ and $(cf)'(x_0) = c f'(x_0)$.

52.11 Theorem. Let $f = (f_1, \dots, f_m)$ map an open set $G$ of $\mathbb{R}^n$ to $\mathbb{R}^m$. Then $f$ is differentiable at $x$ iff each $f_i$ is differentiable at $x$. In this case $f'(x) = (f_1'(x), \dots, f_m'(x))$.

Proof. Note that, for variety, we have fixed $x$ in the domain of $f$.
Assume $f_i$ is differentiable at $x$ for each $i = 1, \dots, m$ and let $T_i = f_i'(x)$. Then, there exist $\varepsilon_1, \dots, \varepsilon_m$ such that $\varepsilon_i(h) \to 0$ as $h \to 0$ and $f_i(x + h) - f_i(x) = T_i h + \varepsilon_i(h)|h|$. Let $e_1, \dots, e_m$ be the standard basis vectors of $\mathbb{R}^m$. Then,
\[ f(x + h) - f(x) = \sum_{i=1}^{m} \bigl( f_i(x + h) - f_i(x) \bigr) e_i = \sum_{i=1}^{m} (T_i h)\,e_i + \sum_{i=1}^{m} \varepsilon_i(h)\,|h|\,e_i. \]
Thus, $f(x + h) = f(x) + (T_1 h, \dots, T_m h) + \varepsilon(h)|h|$, where $\varepsilon(h) = (\varepsilon_1(h), \dots, \varepsilon_m(h)) \to 0$ as $h \to 0$. Thus, $f$ is differentiable at $x$ and $f'(x)$ is the linear transformation $T$ whose components are $T_1, \dots, T_m$, as required.

The converse can be simply established by direct calculation, but it is fun to see that it follows by the chain rule from other results. Indeed, if $f$ is differentiable at $x$, then $f_i = \pi_i \circ f$, where $\pi_i$ is the projection of $\mathbb{R}^m$ onto its $i$th coordinate. Thus,
\[ f_i'(x) = \pi_i'(f(x)) \circ f'(x). \]
But $\pi_i$ is linear, so $\pi_i'(f(x)) = \pi_i$, and we get $f_i'(x) = \pi_i \circ f'(x)$, the $i$th coordinate of $f'(x)$.

Directional derivatives and partial derivatives. A lot of information about derivatives of a vector function $f$ of a vector variable can be obtained from functions of a real variable, by looking at the behaviour of $f$ along a straight line.

Let $u$ be a non-zero element of $\mathbb{R}^n$, and $f$ a function defined in a neighbourhood of $x \in \mathbb{R}^n$ with values in $\mathbb{R}^m$. The $u$-directional derivative of $f$ at $x$, or derivative of $f$ at $x$ in the direction $u$, is
\[ D_u f(x) = \lim_{t \to 0} \frac{f(x + tu) - f(x)}{t}, \]
provided this limit exists. Notice that this is the derivative at $0$ of the function of a real variable $f \circ \ell$, where $\ell(t) = x + tu$, for all $t \in \mathbb{R}$, which parametrizes a straight line through the point $x$ in the direction $u$.

In the special case $u = e_j$, the $j$th basis vector, $D_{e_j} f(x)$ is called the $j$th partial derivative at $x$ and denoted $D_j f(x)$ or $\dfrac{\partial f(x)}{\partial x_j}$. Notice that, in this case,
\[ D_j f(x) = \lim_{t \to 0} \frac{f(x_1, \dots, x_{j-1}, x_j + t, x_{j+1}, \dots, x_n) - f(x_1, \dots, x_n)}{t}, \]
the derivative at $x_j$ of the function $s \mapsto$
$g(s)$ obtained from $f$ by fixing all the $x_k$, for $k \ne j$, and replacing $x_j$ by the variable $s$.

52.12 Theorem. Let $u$ be a non-zero element of $\mathbb{R}^n$, and $f$ a function defined in a neighbourhood of $x \in \mathbb{R}^n$ with values in $\mathbb{R}^m$. If $f$ is differentiable at $x$, then the $u$-directional derivative of $f$ at $x$ is $f'(x)u$, the value of the linear transformation $f'(x)$ at $u$.

Proof. Let $\ell(t) = x + tu$, for all $t \in \mathbb{R}$. Then, by the chain rule for differentiation of transformations,
\[ D_u f(x)\,1 = (f \circ \ell)'(0)\,1 = \bigl( f'(\ell(0)) \circ \ell'(0) \bigr)(1) = f'(\ell(0))\bigl( \ell'(0)\,1 \bigr). \]
But $\ell(0) = x$, and $\ell'(0)\,1 = u$, so
\[ D_u f(x) = f'(x)u. \]

The reason we put the $1$ in the computation in the above proof is that we were using the chain rule for transformations (i.e. vector functions of a vector variable), and the directional derivative was defined as a vector, not as a linear transformation. Multiplying by $1$ amounts to evaluating the corresponding linear transformation at $1$.

Proof using direct calculation.
\[ \frac{f(x + tu) - f(x)}{t} = \frac{f'(x)(tu) + \varepsilon_1(tu)\,|tu|}{t} = f'(x)u + \varepsilon_1(tu)\,\frac{|tu|}{t}, \]
where $\varepsilon_1(tu) \to 0$ as $t \to 0$. Since the absolute value of the scalar $\dfrac{|tu|}{t}$ is constantly $|u|$, we have
\[ D_u f(x) = \lim_{t \to 0} \frac{f(x + tu) - f(x)}{t} = f'(x)u. \]

52.13 Remark. It follows that if $s$ is a scalar, $D_{su} f(x) = s\,D_u f(x)$. Many people require that $u$ be a unit vector and leave directional derivatives undefined otherwise; others define the directional derivative in terms of $u/|u|$. Of course, this means that formulas must be adjusted by the factor $1/|u|$.

The gradient. In the case of a real valued function $f$ of a vector variable, we saw that there exists a vector $a \in \mathbb{R}^n$ such that $f'(x)h = a \cdot h$, for all $h \in \mathbb{R}^n$. We will now identify that $a$. If $a = (a_1, \dots, a_n)$, then the $j$th coordinate of $a$ is simply
\[ a_j = a \cdot e_j, \]
where $e_j$ is the $j$th standard basis vector. Thus, in the present case,
\[ a_j = f'(x)e_j = D_j f(x) = \frac{\partial f(x)}{\partial x_j}, \]
and $a = (D_1 f(x), \dots, D_n f(x))$.
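As a numerical sanity check of Theorem 52.12 and of the coordinate formula $a_j = D_j f(x)$, the following Python sketch compares a difference quotient along a direction $u$ with the dot product $a \cdot u$ built from the partial derivatives. The test function `f`, the point `x`, the direction `u`, the step size `t` and the helper names are all illustrative assumptions, not taken from the text, and symmetric difference quotients stand in for the limits.

```python
import math


def f(x):
    """An arbitrary smooth test function R^2 -> R (chosen for illustration)."""
    x1, x2 = x
    return x1 * x1 * x2 + math.sin(x2)


def directional(f, x, u, t=1e-6):
    """Symmetric difference quotient approximating D_u f(x)."""
    xp = [xi + t * ui for xi, ui in zip(x, u)]
    xm = [xi - t * ui for xi, ui in zip(x, u)]
    return (f(xp) - f(xm)) / (2 * t)


def partials(f, x, t=1e-6):
    """Approximate (D_1 f(x), ..., D_n f(x)) using the directions e_1, ..., e_n."""
    n = len(x)
    basis = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    return [directional(f, x, e, t) for e in basis]


x = [1.0, 2.0]
u = [3.0, -1.0]          # deliberately not a unit vector

a = partials(f, x)        # approximates the vector a with f'(x)h = a . h
lhs = directional(f, x, u)
rhs = sum(aj * uj for aj, uj in zip(a, u))

print(lhs, rhs)           # the two values should agree up to rounding
```

For this particular `f` the exact partial derivatives are $2x_1x_2$ and $x_1^2 + \cos x_2$, so the printed values can also be compared against $a \cdot u$ computed by hand.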
The vector $(D_1 f(x), \dots, D_n f(x))$, defined whenever each of the partials $D_1 f(x), \dots, D_n f(x)$ exists, is called the gradient of $f$ at $x$ and denoted $\nabla f(x)$ or $\operatorname{grad} f(x)$. Thus, if a real valued $f$ is differentiable at $x$, then $f'(x)h = \operatorname{grad} f(x) \cdot h$, for all $h \in \mathbb{R}^n$. Often, in this setting, one uses the notation $dx = (dx_1, \dots, dx_n)$ instead of $h$ and this becomes
\[ f'(x)\,dx = \operatorname{grad} f(x) \cdot dx = \frac{\partial f(x)}{\partial x_1}\,dx_1 + \dots + \frac{\partial f(x)}{\partial x_n}\,dx_n. \]

52.14 Example. The existence of the gradient does not imply differentiability, not even continuity. Let
\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2}, & \text{if } (x_1, x_2) \ne 0, \\[1ex] 0, & \text{if } (x_1, x_2) = 0. \end{cases} \]
Then, along the line $x_2 = 0$, $f(x)$ is constantly $0$, so $D_1 f(0) = 0$, and similarly $D_2 f(0) = 0$, so $\nabla f(0) = (0, 0)$; but as $x \to 0$ along the line $x_1 = x_2$, $f(x) \to 1/2$, so $f$ is not continuous at $0$. You can check this using $\varepsilon, \delta$ arguments, or you can look at the same phenomenon by using a composition with the function $\ell : t \mapsto t(1, 1)$, which describes the line $x_1 = x_2$:
\[ \lim_{t \to 0} f(t, t) = \lim_{t \to 0} \frac{t^2}{t^2 + t^2} = 1/2. \]
In this example, we see that the derivative in the direction of $(1, 1)$ does not exist. One might be tempted to believe that if all directional derivatives existed, then $f$ would be differentiable, but this fails also.

52.15 Example (The existence of all directional derivatives does not imply continuity). Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by
\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1^2 x_2}{x_1^4 + x_2^2}, & \text{if } (x_1, x_2) \ne 0, \\[1ex] 0, & \text{if } (x_1, x_2) = 0. \end{cases} \]
Fix $u = (u_1, u_2) \ne 0$. If $u_2 = 0$, we get
\[ D_u f(0, 0) = \lim_{t \to 0} \frac{1}{t} \cdot \frac{t^2 u_1^2 \cdot 0}{t^4 u_1^4 + 0} = 0, \]
while if $u_2 \ne 0$,
\[ D_u f(0, 0) = \lim_{t \to 0} \frac{t^3 u_1^2 u_2}{t\,(t^4 u_1^4 + t^2 u_2^2)} = \frac{u_1^2}{u_2}. \]
Thus, all directional derivatives exist at $(0, 0)$. This implies that the restriction of $f$ to each straight line through the origin is continuous at $0$. Nevertheless, $f$ is not continuous at $0$, for if we follow the curve $\gamma : t \mapsto$
$(t, t^2)$, we have
\[ \lim_{t \to 0} f(\gamma(t)) = \lim_{t \to 0} \frac{t^4}{t^4 + t^4} = 1/2 \ne 0. \]

The matrix of the derivative. The Jacobian. We recall from Linear Algebra that every vector $x \in \mathbb{R}^n$ has a column matrix representation
\[ [x] = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \]
where the $x_j$ are the coordinates of $x$ with respect to the standard basis: $x = (x_1, \dots, x_n) = \sum_{j=1}^{n} x_j e_j$. The mapping $x \mapsto [x]$ is a vector space isomorphism, and often one identifies $x$ with $[x]$. Moreover, every linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ has a matrix
\[ [T] = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix} \]
which satisfies
\[ [Tx] = [T][x] \quad \text{(matrix multiplication).} \]
The columns of $[T]$ are the column representations of the image vectors $Te_1, \dots, Te_n$.

For the reader who has not seen that development, or would like a review, here is a brief version. Let $e_1, \dots, e_n$ denote the standard basis vectors in $\mathbb{R}^n$, and let $\bar{e}_1, \dots, \bar{e}_m$ be the standard basis vectors in $\mathbb{R}^m$. Then, $x = \sum_{j=1}^{n} x_j e_j$, and
\[ Tx = \sum_{j=1}^{n} x_j\,Te_j. \]
For each $j$, $Te_j = \sum_{i=1}^{m} a_{ij} \bar{e}_i$, for some numbers $a_{ij}$, $i = 1, \dots, m$. Thus,
\[ Tx = \sum_{j=1}^{n} x_j \sum_{i=1}^{m} a_{ij} \bar{e}_i = \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{n} a_{ij} x_j \Bigr) \bar{e}_i. \]
Thus, the coordinates of $Tx$ are $\sum_{j=1}^{n} a_{ij} x_j$. In terms of matrix multiplication this says
\[ [Tx] = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}. \]

Now, in case $T$ is $f'(x)$, the derivative at $x$ of a transformation taking a neighbourhood of $x \in \mathbb{R}^n$ to $\mathbb{R}^m$, for each $j$, $f'(x)e_j$ is the $e_j$-directional derivative; that is, the partial derivative $D_j f(x)$. Its coordinates with respect to $\bar{e}_1, \dots, \bar{e}_m$ are just obtained by differentiating the coordinates of $f$ at $x$. Thus, the $i$th coordinate of $f'(x)e_j$ is $D_j f_i(x)$, and the matrix $[f'(x)]$ which represents the linear transformation $f'(x)$ is
\[ \begin{bmatrix} D_1 f_1(x) & \dots & D_n f_1(x) \\ \vdots & & \vdots \\ D_1 f_m(x) & \dots & D_n f_m(x) \end{bmatrix}. \]
This matrix is often called the Jacobian matrix of $f$ at $x$.
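The description of $[f'(x)]$ as the matrix with entries $D_j f_i(x)$ can be tried out numerically. The sketch below is an illustration under stated assumptions: the map `f` (the map $(x_1, x_2) \mapsto (e^{x_1}\cos x_2, e^{x_1}\sin x_2)$), the point `x`, the step size `t` and the helper name `jacobian` are choices made for this example, and each partial derivative is replaced by a symmetric difference quotient.

```python
import math


def f(x):
    """Illustrative map R^2 -> R^2: (x1, x2) -> (e^x1 cos x2, e^x1 sin x2)."""
    x1, x2 = x
    return [math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2)]


def jacobian(f, x, t=1e-6):
    """Approximate the m x n matrix whose (i, j) entry is D_j f_i(x).

    Column j is a symmetric difference quotient in the direction e_j,
    i.e. an approximation to the vector f'(x) e_j.
    """
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += t
        xm[j] -= t
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * t)
    return J


x = [0.5, 1.0]
J = jacobian(f, x)

# Partial derivatives computed by hand for this particular f.
e = math.exp(x[0])
exact = [[e * math.cos(x[1]), -e * math.sin(x[1])],
         [e * math.sin(x[1]),  e * math.cos(x[1])]]

for row_num, row_exact in zip(J, exact):
    print(row_num, row_exact)   # corresponding rows should nearly agree
```

Row $i$ of the result collects the partial derivatives of the coordinate function $f_i$, and column $j$ is the approximation to $f'(x)e_j$, matching the layout of the matrix displayed above.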
If $m = n$, the matrix is invertible if and only if its determinant is non-zero. This determinant $\det[f'(x)]$ is called the Jacobian of $f$ at $x$, sometimes denoted $\dfrac{\partial(f_1, \dots, f_n)}{\partial(x_1, \dots, x_n)}$.

Continuous differentiability. For a function $f : G \subset \mathbb{R}^n \to \mathbb{R}^m$, $G$ open, $f$ is continuously differentiable at $a$ if $f'$ exists in a neighbourhood of $a$ and $f'$ is continuous at $a$ as a map into $L(\mathbb{R}^n, \mathbb{R}^m)$, with the distance given by the operator norm: $d(T, S) = \|T - S\|$.

Recall that we showed that if $e_1, \dots, e_n$ are the standard basis vectors in $\mathbb{R}^n$, then $\|T\|^2 \le \sum_j |Te_j|^2$. If $[T] = (a_{ij})$, the right-hand side here is $\sum_j \sum_i a_{ij}^2$. Thus, if $T$ and $S$ are linear transformations with matrices $A = (a_{ij})$ and $B = (b_{ij})$, then
\[ \|T - S\| \le \Bigl( \sum_{i,j} (a_{ij} - b_{ij})^2 \Bigr)^{1/2}. \tag{33} \]

Even though existence of all the partials doesn't imply differentiability in general, it does so if the partials are continuous.

52.16 Theorem. For a function $f$ defined (at least) in an open set $G$ of $\mathbb{R}^n$, with values in $\mathbb{R}^m$, $f$ is continuously differentiable in $G$ if and only if for all $i, j$ the partial derivative $D_j f_i$ exists and is continuous in $G$.

Proof. ($\Longrightarrow$) Let $f$ be continuously differentiable in $G$ and let $a \in G$. Then, for all $x$ in $G$, each partial derivative exists with $D_j f_i(x) = (f'(x)e_j) \cdot \bar{e}_i$, where $\bar{e}_i$ denotes the $i$th basis vector in $\mathbb{R}^m$. Thus,
\begin{align*}
|D_j f_i(x) - D_j f_i(a)| &= |f'(x)e_j \cdot \bar{e}_i - f'(a)e_j \cdot \bar{e}_i| \\
&= |(f'(x) - f'(a))e_j \cdot \bar{e}_i| \\
&\le \|f'(x) - f'(a)\|\,|e_j|\,|\bar{e}_i|.
\end{align*}
As $x \to a$, $f'(x) \to f'(a)$, by continuity of $f'$, so $D_j f_i$ is continuous at $a$.

($\Longleftarrow$) Conversely, assume the partial derivatives are continuous in $G$. Once we show differentiability, the continuity of the derivative will follow from
\[ \|f'(x) - f'(a)\| \le \Bigl( \sum_{i,j} \bigl( D_j f_i(x) - D_j f_i(a) \bigr)^2 \Bigr)^{1/2}, \]
which is the application of (33) to this situation. Without loss of generality, assume $m = 1$. Let $a \in G$. Let $\varepsilon > 0$.
Choose $\delta > 0$ so small that $B(a, \delta) \subset G$ and, for all $j$,
\[ |D_j f(x) - D_j f(a)| < \varepsilon/n, \quad \text{for } |x - a| < \delta. \]
Take $h \in \mathbb{R}^n$ with $|h| < \delta$. Put $x_0 = a$, $x_k = a + \sum_{j=1}^{k} h_j e_j$ (so $x_n = a + \sum_{j=1}^{n} h_j e_j = a + h$). Then,
\[ f(a + h) - f(a) = \sum_{j=1}^{n} \bigl( f(x_j) - f(x_{j-1}) \bigr). \]
By the Mean Value Theorem (for real functions of a real variable),
\[ f(x_j) - f(x_{j-1}) = D_j f(c_j)\,h_j, \]
for some $c_j$ on the line from $x_{j-1}$ to $x_j$. Thus,
\[ f(a + h) - f(a) = \sum_{j=1}^{n} D_j f(c_j)\,h_j, \]
so
\begin{align*}
\Bigl| f(a + h) - f(a) - \sum_{j=1}^{n} D_j f(a)\,h_j \Bigr|
&= \Bigl| \sum_{j=1}^{n} \bigl( D_j f(c_j)\,h_j - D_j f(a)\,h_j \bigr) \Bigr| \\
&\le \sum_{j} |D_j f(c_j) - D_j f(a)|\,|h_j| \\
&\le \varepsilon\,|h|.
\end{align*}
Thus, by definition, $f$ is differentiable at $a$ with $f'(a)h = \sum_j D_j f(a)\,h_j = \nabla f(a) \cdot h$.

52.1. Let $f$ be defined in a neighbourhood of $x$ in $\mathbb{R}^n$ with values in $\mathbb{R}^m$. If $\gamma : t \mapsto a + tu$ and $\gamma(t_0) = x$, then $D_u f(x)$ is the derivative of $f \circ \gamma$ at $t_0$.

52.2. Directly from the definition, prove that the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x_1, x_2) = \sin x_1$ is differentiable at each point $a = (a_1, a_2)$ of $\mathbb{R}^2$.

52.3. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} \dfrac{x_1 |x_2|}{\sqrt{x_1^2 + x_2^2}}, & x \ne 0, \\[1ex] 0, & x = 0. \end{cases} \]
Prove that $f$ has all directional derivatives at $0$, but is not differentiable at $0$.

52.4. Let $\ast$ be a bilinear product from $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^k$. Prove there exists $M \in \mathbb{R}$ with $|u \ast v| \le M|u||v|$, for all $u \in \mathbb{R}^n$, $v \in \mathbb{R}^m$.

52.5. Prove the product rule for differentiation of vector functions of a vector variable: Let $G$ be an open subset of $\mathbb{R}^p$, $x_0 \in G$, $f : G \to \mathbb{R}^n$, $g : G \to \mathbb{R}^m$, and $\ast$ a bilinear product from $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^k$. If $f$ and $g$ are differentiable at $x_0$, so is $f \ast g$, with $D(f \ast g)(x_0)h = (Df(x_0)h) \ast g(x_0) + f(x_0) \ast (Dg(x_0)h)$.

52.6. Let $f$ be a real valued function on an open set $G$ of $\mathbb{R}^2$ such that the partial derivatives $D_1 f$ and $D_2 f$ exist and are bounded in $G$. Prove that $f$ is continuous. (Suggestion: mimic the proof that continuous partials imply differentiability.)

53.
THE INVERSE FUNCTION THEOREM

For real functions of a real variable, we learned that if $f'(x) \ne 0$, for all $x$ in an open interval $I$, then $f$ is strictly monotone, hence injective, so the inverse function $f^{-1}$ is defined everywhere on $f(I)$, and this inverse function itself is also differentiable. If we knew only that $f'(a)$ were non-zero at one point $a$, and if we assumed that $f'$ were continuous, then we could still find a neighbourhood $U$ of $a$ such that $f'(u) \ne 0$ for all $u \in U$, so the result would still hold for $f$ restricted to $U$. This is the version of the result we will develop for the higher dimensional case.

The general idea of the theorem is that a derivative gives a local approximation to a function at a point; if the derivative at the point is invertible, then the function is also invertible (near that point). The significance of the condition $f'(a) \ne 0$ is that it is equivalent to the invertibility of $f'(a)$. That is the condition we will have to impose in the general case.

53.1 The Inverse Function Theorem. Let $f$ be a continuously differentiable mapping of an open subset $G$ of $\mathbb{R}^n$ to $\mathbb{R}^n$. If $a \in G$ with $f'(a)$ invertible and $b = f(a)$, then
(1) there exist open sets $U$ and $V$ such that $a \in U$, $b \in V$, and $f$ maps $U$ one-to-one onto $V$;
(2) if $g$ is the inverse of the restriction of $f$ to $U$, then $g$ is continuously differentiable on $V$ and
\[ g'(y) = [f'(g(y))]^{-1}. \]

53.2 Note. Unlike the case of functions on $\mathbb{R}$ to $\mathbb{R}$, there is no hope for invertibility on all of $G$, even if $f'(x)$ is invertible for all $x \in G$. You can check this by looking at the mapping $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ f(x_1, x_2) = (e^{x_1} \cos x_2,\; e^{x_1} \sin x_2). \]
This is actually the exponential function, viewed as a map on $\mathbb{R}^2$ to $\mathbb{R}^2$ instead of $\mathbb{C}$ to $\mathbb{C}$. It is continuously differentiable at every point and its derivative (in the sense of transformation) is invertible, but $f$ is far from one-to-one. Its inverse, "the complex logarithm", has infinitely many "branches".

53.3 Theorem.
Let $\Omega$ be the set of all invertible linear transformations of $\mathbb{R}^n$ to itself.
(1) If $T \in \Omega$ and $S \in L(\mathbb{R}^n, \mathbb{R}^n)$ with
\[ \|S - T\| < 1/\|T^{-1}\|, \]
then $S$ is also invertible, so $\Omega$ is open.
(2) The map $T \mapsto T^{-1}$ on $\Omega$ onto itself is continuous.

Proof. (1) Let $T$ be invertible and $\alpha = 1/\|T^{-1}\|$. Then
\[ |x| = |T^{-1} T x| \le \|T^{-1}\|\,|Tx| = \|T^{-1}\|\,|(T - S)x + Sx| \le \|T^{-1}\| \bigl( \|T - S\|\,|x| + |Sx| \bigr). \]
Thus,
\[ (\alpha - \|T - S\|)\,|x| \le |Sx|. \qquad (*) \]
Now, for all $S$ with $\|S - T\| < \alpha$, we deduce from $(*)$ that $Sx = 0$ implies $x = 0$, so $S$ is invertible. Thus, $\Omega$ is open.

(2) Replacing $x$ by $S^{-1} y$ in $(*)$ gives
\[ |S^{-1} y| \le \frac{1}{\alpha - \|T - S\|}\,|y|, \]
so that
\[ \|S^{-1}\| \le 1/(\alpha - \|T - S\|) < 2/\alpha = 2\|T^{-1}\|, \quad \text{provided } \|T - S\| < \alpha/2. \]
Also,
\[ \|S^{-1} - T^{-1}\| = \|S^{-1} T T^{-1} - S^{-1} S T^{-1}\| \le \|S^{-1}\|\,\|T - S\|\,\|T^{-1}\| \le 2\|T^{-1}\|^2\,\|T - S\|, \]
when $\|T - S\| < \alpha/2$. It follows that the inversion map is continuous at $T$.

53.4 Mean Value Inequality. Let $U$ be an open subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}^m$ be differentiable. If $U$ contains the line segment from $a$ to $b$, then there is a point $c$ on that segment with
\[ |f(b) - f(a)| \le \|f'(c)\|\,|b - a|. \]
If $\|f'(x)\| \le K < +\infty$ for $x \in U$ and $U$ is convex, that is, contains the line segment joining each pair of its points, then the result entails that $f$ is Lipschitz.

Proof. We have proved elsewhere the corresponding result for vector functions of a real variable (see Theorem 50.7). We will reduce the present situation to that case. Let $\gamma(t) = (1 - t)a + tb$, so that as $t$ traverses the interval $[0, 1]$, $\gamma(t)$ traverses the line segment from $a$ to $b$. Let $g = f \circ \gamma$. According to the chain rule,
\[ g'(t) = f'(\gamma(t))\,\gamma'(t) = f'(\gamma(t))(b - a). \]
Then, $|g'(t)| \le \|f'(\gamma(t))\|\,|b - a|$, so according to the real variable Mean Value Inequality, for some $t \in (0, 1)$,
\[ |g(1) - g(0)| \le |g'(t)|\,|1 - 0|; \]
hence, putting $c = \gamma(t)$,
\[ |f(b) - f(a)| \le \|f'(c)\|\,|b - a|, \]
as required.

We are now ready for the proof of the inverse function theorem.

Proof of the Inverse Function Theorem. Let $f'(a)$ be invertible and denote it by $T$.
Since $f'$ is continuous at $a$, there exists an open ball $U$ centred at $a$ with
\[ \|f'(x) - T\| < \frac{1}{2\|T^{-1}\|} \quad \text{for all } x \in U. \]
Then, by Theorem 53.3(1), we see that $f'(x)$ is also invertible, though we won't make use of that yet. For a fixed $y \in \mathbb{R}^n$, define for all $x \in U$,
\[ \varphi(x) = \varphi_y(x) = x + T^{-1}(y - f(x)). \]
Notice that $y = f(x)$ if and only if $\varphi(x) = x$; that is, if and only if $x$ is a fixed point of $\varphi$. Differentiating gives
\[ \varphi'(x) = I + T^{-1}(-f'(x)) = T^{-1}(T - f'(x)); \]
hence,
\[ \|\varphi'(x)\| \le \|T^{-1}\|\,\|T - f'(x)\| < 1/2. \]
Since $U$ is convex, we can use the Mean Value Inequality, obtaining
\[ |\varphi(x_1) - \varphi(x_2)| \le \tfrac{1}{2}\,|x_1 - x_2|, \qquad (34) \]
for all $x_1, x_2 \in U$, showing that $\varphi$ is a contraction mapping. It follows that $\varphi$ can have at most one fixed point. Thus, there is at most one point $x$ with $y = f(x)$. This shows that $f$ is one-to-one on $U$.

Let $V = f(U)$. To show that $V$ is also open, let $y_0 \in V$. Let $x_0 \in U$ be such that $f(x_0) = y_0$. Choose $r > 0$ so small that the closed ball $\bar{B}(x_0, r)$ is contained in $U$. We will show that $V$ contains the ball centred at $y_0$ of radius $r/(2\|T^{-1}\|)$. So, let $|y - y_0| < r/(2\|T^{-1}\|)$. Using the contraction mapping $\varphi = \varphi_y$ defined above, we compute
\[ |\varphi(x_0) - x_0| = |T^{-1}(y - f(x_0))| \le \|T^{-1}\|\,|y - f(x_0)| \le r/2, \]
so that for $x \in \bar{B}(x_0, r)$,
\[ |\varphi(x) - x_0| \le |\varphi(x) - \varphi(x_0)| + |\varphi(x_0) - x_0| \le \tfrac{1}{2}|x - x_0| + \tfrac{r}{2} \le r. \]
Hence $\varphi$ maps $\bar{B}(x_0, r)$ into itself. The closed ball $\bar{B}(x_0, r)$ is a complete metric space, since it is closed and $\mathbb{R}^n$ is complete. Thus, by the Contraction Mapping Theorem (Banach's fixed point theorem), $\varphi$ has a fixed point $x$ in this ball. Thus, $f(x) = y$, so that $y \in V$. This completes the proof that $V$ is open.

Now, let $g$ be the inverse of the restriction of $f$ to $U$. Our job is to show that $g$ is continuously differentiable in $V$, with the expected formula for its derivative. The inequality (34) doesn't actually depend on the particular $y$.
Indeed,
\[ \varphi(x_1) - \varphi(x_2) = x_1 - x_2 - T^{-1}(f(x_1) - f(x_2)), \]
so (34) becomes
\[ |x_1 - x_2 - T^{-1}(f(x_1) - f(x_2))| \le \tfrac{1}{2}\,|x_1 - x_2|, \]
and hence
\[ \tfrac{1}{2}\,|x_1 - x_2| \le |T^{-1}(f(x_1) - f(x_2))|. \]
For $y_1, y_2 \in V$, we may replace $x_1$ and $x_2$ by $g(y_1)$, $g(y_2)$, obtaining
\[ |g(y_1) - g(y_2)| \le 2\,|T^{-1}(y_1 - y_2)| \le 2\|T^{-1}\|\,|y_1 - y_2|. \]

Now, to show $g$ is differentiable at any point $y_0 \in V$, let $y \in V$ also, and put $x_0 = g(y_0)$, $x = g(y)$, $S = [f'(x_0)]^{-1}$. Then, there is $\varepsilon(x) \to \varepsilon(x_0) = 0$ with $f(x) - f(x_0) - f'(x_0)(x - x_0) = \varepsilon(x)\,|x - x_0|$, so
\begin{align*}
g(y) - g(y_0) - S(y - y_0) &= x - x_0 - S(y - y_0) \\
&= -S\bigl(y - y_0 - f'(x_0)(x - x_0)\bigr) \\
&= -S\bigl(f(x) - f(x_0) - f'(x_0)(x - x_0)\bigr) \\
&= -S\bigl(\varepsilon(x)\,|x - x_0|\bigr) \\
&= -S\bigl(\varepsilon(g(y))\bigr)\,|g(y) - g(y_0)|.
\end{align*}
As $y \to y_0$, $S(\varepsilon(g(y))) \to 0$ and $|g(y) - g(y_0)| \le 2\|T^{-1}\|\,|y - y_0|$, so this shows $g$ is differentiable at $y_0$, with derivative
\[ g'(y_0) = S = [f'(g(y_0))]^{-1}. \]
Finally, $g$ is continuous, $f'$ is continuous by hypothesis, and the inversion map $T \mapsto T^{-1}$ is continuous on the set of all invertible linear operators on $\mathbb{R}^n$, so $g'$, which is the composite of these 3, is continuous on $V$. Thus $g$ is continuously differentiable, and we are done.

53.5 Corollary. If $f$ is a continuously differentiable mapping of an open set $G$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, with $f'(x)$ invertible for all $x$, then $f$ is an open mapping; that is, $f(W)$ is open, for each open subset $W$ of $G$.

The proof is an exercise.

54. THE IMPLICIT FUNCTION THEOREM

The Inverse Function Theorem gives conditions under which we can solve an equation of the form $y = f(x)$ "for $x$ in terms of $y$", where $x, y$ are variables in $\mathbb{R}^n$. Equivalently, for systems of $n$ equations
\begin{align*}
y_1 &= f_1(x_1, \dots, x_n) \\
&\ \ \vdots \\
y_n &= f_n(x_1, \dots, x_n),
\end{align*}
involving real variables $x_1, \dots, x_n, y_1, \dots, y_n$, it gives conditions under which we can solve for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$.
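The fixed-point map $\varphi_y(x) = x + T^{-1}(y - f(x))$ from the proof of the Inverse Function Theorem can also be iterated numerically to solve $y = f(x)$ near $a$. The following is only a minimal sketch under stated assumptions: it uses the map of Note 53.2 with $a = (0, 0)$, so $T = f'(a)$ is the identity, and the helper names (`jacobian`, `inverse_2x2`, `apply`, `local_inverse`), the target $y = (1.1, 0.1)$, and the fixed number of steps are illustrative choices, not part of the text.

```python
import math

def f(x):
    """The map of Note 53.2: (x1, x2) -> (e^x1 cos x2, e^x1 sin x2)."""
    x1, x2 = x
    return (math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2))

def jacobian(x):
    """Matrix of f'(x), written as a tuple of rows."""
    x1, x2 = x
    e = math.exp(x1)
    return ((e * math.cos(x2), -e * math.sin(x2)),
            (e * math.sin(x2), e * math.cos(x2)))

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def local_inverse(y, a, steps=60):
    """Iterate phi_y(x) = x + T^{-1}(y - f(x)) with T = f'(a), starting at a.

    Assumes y is close enough to f(a) for phi_y to be a contraction.
    """
    t_inv = inverse_2x2(jacobian(a))
    x = a
    for _ in range(steps):
        fx = f(x)
        corr = apply(t_inv, (y[0] - fx[0], y[1] - fx[1]))
        x = (x[0] + corr[0], x[1] + corr[1])
    return x

a = (0.0, 0.0)          # f(a) = (1, 0) and f'(a) is the identity
y = (1.1, 0.1)          # a point near b = f(a)
x = local_inverse(y, a)
residual = max(abs(u - v) for u, v in zip(f(x), y))
```

For this particular map the local inverse is known in closed form (a branch of the complex logarithm), so the iterate can be compared with $(\tfrac12 \ln(y_1^2 + y_2^2), \operatorname{atan2}(y_2, y_1))$.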
We now look for conditions under which we can do similarly for more general systems
\begin{align*}
f_1(x_1, \dots, x_n, y_1, \dots, y_m) &= 0 \\
&\ \ \vdots \qquad\qquad (*) \\
f_n(x_1, \dots, x_n, y_1, \dots, y_m) &= 0.
\end{align*}

For a moment, consider the case $n = m = 1$. If $f$ is a continuously differentiable real-valued function in the plane, $f(x, y) = 0$ can be solved for $x$ in terms of $y$ in some neighbourhood of a point $(a, b)$ such that $f(a, b) = 0$, provided $\frac{\partial f}{\partial x}(a, b) \neq 0$. A simple familiar example consists of the equation
\[ x^2 + y^2 - 1 = 0. \qquad (**) \]
Here $f(x, y) = x^2 + y^2 - 1$. Attempts to solve for $x$ in terms of $y$ yield
\[ x = \pm\sqrt{1 - y^2}. \qquad ({*}{*}{*}) \]
The partial derivative $\frac{\partial f}{\partial x}(x, y)$ is $2x$. This is $0$ if and only if $x = 0$. The corresponding points satisfying $(**)$ are $(0, 1)$ and $(0, -1)$. As long as we stay away from these points, we can choose one of the 2 formulas indicated by $({*}{*}{*})$ to obtain a valid function compatible with $(**)$. For example, if $a > 0$ and $a^2 + b^2 - 1 = 0$, then the function $g : (-1, 1) \to \mathbb{R}$ defined by $g(y) = \sqrt{1 - y^2}$ is such that $a = g(b)$ and, for an element $(x, y)$ in the open set $U = \{(x, y) : x > 0\}$,
\[ x^2 + y^2 - 1 = 0 \iff x = g(y). \]
Further, we notice that at such a point, $g'(y) = -y(1 - y^2)^{-1/2} = -y/x$. This is the same as one gets by a formal calculation called "implicit differentiation":
\begin{align*}
f'(x, y)(dx, dy) = \frac{\partial f(x, y)}{\partial x}\,dx + \frac{\partial f(x, y)}{\partial y}\,dy &= 0 \\
2x\,dx + 2y\,dy &= 0 \\
\frac{dx}{dy} = -\frac{2y}{2x} &= -\frac{y}{x}.
\end{align*}
Assuming we knew $g$ were differentiable at $y$, this actually reflects a correct calculation, based on the chain rule. Indeed, we know that in the open set $U$, $f(x, y) = 0$ if and only if $g(y) = x$, so $f(g(y), y) = 0$. Let $\gamma(y) = (g(y), y)$. Since $f'(u, v)(h, k) = 2uh + 2vk$, by the chain rule,
\[ (f \circ \gamma)'(y) = f'(g(y), y)\,\gamma'(y) = 2g(y)g'(y) + 2y \cdot 1. \]
Since $f \circ \gamma = 0$, we can solve for $g'(y)$, obtaining $g'(y) = -y/g(y)$.

Now, let's turn to the general case. Treat the system of equations $(*)$ as one equation
\[ f(x, y) = 0, \]
where $f$ is defined on an open subset of $\mathbb{R}^n \times \mathbb{R}^m$.
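The circle example above lends itself to a quick numerical sanity check: on the branch $x = g(y) = \sqrt{1 - y^2}$, a finite-difference estimate of $g'(y)$ should agree with the value $-y/x$ given by implicit differentiation. This is a hedged sketch only; the function names, the sample points, and the step size `h` are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def g(y):
    """Branch x = +sqrt(1 - y^2) of the unit circle, valid for -1 < y < 1."""
    return math.sqrt(1.0 - y * y)

def implicit_slope(y):
    """dx/dy = -y/x from implicit differentiation, evaluated at x = g(y)."""
    return -y / g(y)

def central_difference(func, y, h=1e-6):
    """Symmetric finite-difference estimate of func'(y)."""
    return (func(y + h) - func(y - h)) / (2.0 * h)

# Sample points kept away from y = -1 and y = 1, where x = 0 and df/dx vanishes.
samples = [-0.9, -0.5, 0.0, 0.3, 0.8]
errors = [abs(central_difference(g, y) - implicit_slope(y)) for y in samples]
on_circle = [abs(g(y) ** 2 + y ** 2 - 1.0) for y in samples]
```

The sample points deliberately avoid $y = \pm 1$, which correspond to the excluded points $(0, 1)$ and $(0, -1)$ where the formula $-y/x$ breaks down.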
The idea is once again that the derivative of $f$ at a point $(a, b)$ is a linear transformation which yields a local approximation to $f$ at that point. If $T$ is a linear transformation from $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^n$, it induces linear transformations $T_x : \mathbb{R}^n \to \mathbb{R}^n$ and $T_y : \mathbb{R}^m \to \mathbb{R}^n$ by
\[ T_x(h) = T(h, 0), \qquad T_y(k) = T(0, k). \]
These satisfy $T(h, k) = T_x h + T_y k$. If $T_x$ is invertible, then for each $k \in \mathbb{R}^m$, there exists a unique $h \in \mathbb{R}^n$ with $T(h, k) = 0$, given by $h = -T_x^{-1} T_y k$.

If $T = Df(a, b) = f'(a, b)$, then $T_x$ is the derivative at $a$ of the map $x \mapsto f(x, b)$ and $T_y$ is the derivative at $b$ of the map $y \mapsto f(a, y)$. We denote these (transformation-valued) partial derivatives by $D_1 f(a, b)$ and $D_2 f(a, b)$.

54.1 Implicit Function Theorem. Let $f$ be a continuously differentiable map on an open subset $G$ of $\mathbb{R}^n \times \mathbb{R}^m$ into $\mathbb{R}^n$, with $(a, b) \in G$ and $f(a, b) = 0$. If $D_1 f(a, b)$ is invertible, then there exist open sets $U \subset \mathbb{R}^n \times \mathbb{R}^m$ and $W \subset \mathbb{R}^m$ with $(a, b) \in U$, $b \in W$, and a map $g : W \to \mathbb{R}^n$ such that for $(x, y) \in U$,
\[ f(x, y) = 0 \iff x = g(y). \]
Moreover, $g$ is continuously differentiable in $W$ and for all $y \in W$,
\[ Dg(y) = -\bigl(D_1 f(g(y), y)\bigr)^{-1} \circ D_2 f(g(y), y). \]

Proof. Let $T = Df(a, b)$. Define a map $F : G \to \mathbb{R}^n \times \mathbb{R}^m$ by $F(x, y) = (f(x, y), y)$. Then $F$ is continuously differentiable, since both $f$ and $\pi : (x, y) \mapsto y$ are so. In fact, $DF(x, y) = (Df(x, y), D\pi(x, y)) = (Df(x, y), \pi)$, since we can differentiate each coordinate separately. In particular, $DF(a, b) = (T, \pi)$, the linear operator that maps $(h, k)$ to $(T(h, k), k)$.

Now, $DF(a, b)(h, k) = 0$ implies $(T(h, k), k) = 0$, so $k = 0$, and thus $T(h, 0) = 0$. In other words, $T_x h = 0$. But by hypothesis, $T_x = D_1 f(a, b)$ is invertible, so $h$ is also $0$. From $F'(a, b)(h, k) = 0$, we have deduced $(h, k) = 0$; hence, $DF(a, b)$ is invertible. This shows that $F$ satisfies the hypotheses of the Inverse Function Theorem. Therefore, there exist an open subset $U$ of $G$ containing $(a, b)$ and an open subset $V$ of $\mathbb{R}^n \times \mathbb{R}^m$ with $F$ a bijection of $U$ onto $V$.

Put $W = \{y \in \mathbb{R}^m : (0, y) \in V\}$. This is open, since $V$ is open.
Now, $y \in W$ iff $f(x, y) = 0$ for some $(x, y) \in U$. Since $F$ is one-to-one on $U$, this $x$ is unique. Thus, we define a map $g : W \to \mathbb{R}^n$ by letting $g(y)$ be the unique $x$ with $(x, y) \in U$ and $f(x, y) = 0$.

Let $H$ be the inverse of the restriction of $F$ to $U$. Then, we know from the Inverse Function Theorem that $H$ is also continuously differentiable and
\[ (g(y), y) = H(0, y), \]
so $g$ is also continuously differentiable.

To compute $Dg(y)$, put $\gamma(y) = (g(y), y)$, for all $y \in W$. Then $D\gamma(y)k = (Dg(y)k, k)$, for all $k \in \mathbb{R}^m$. Since $f(\gamma(y)) = 0$ for $y \in W$, the chain rule gives, at a point $(x, y)$ where $x = g(y)$,
\[ Df(x, y)(Dg(y)k, k) = 0. \]
Thus, $D_1 f(x, y)(Dg(y)k) + D_2 f(x, y)k = 0$. Solving this for the transformation $Dg(y)$ we have
\[ Dg(y)k = -(D_1 f(x, y))^{-1} D_2 f(x, y)k, \]
for all $k \in \mathbb{R}^m$. In other words,
\[ Dg(y) = -(D_1 f(x, y))^{-1} D_2 f(x, y). \]

54.2 Example. Let $f$ be given by $f(x, y) = (f_1(x_1, x_2, y_1, y_2), f_2(x_1, x_2, y_1, y_2))$, where
\[ \begin{cases} f_1(x_1, x_2, y_1, y_2) = x_1^2 - x_2^2 + y_1^2 y_2 + y_1 y_2^2, \\ f_2(x_1, x_2, y_1, y_2) = e^{y_1 + y_2} - x_2. \end{cases} \]
Then, $f(1, 1, 0, 0) = (0, 0)$. The derivative of $f$ at $(x, y)$ has matrix
\[ \left[\begin{array}{cc|cc} D_1 f_1(x, y) & D_2 f_1(x, y) & D_3 f_1(x, y) & D_4 f_1(x, y) \\ D_1 f_2(x, y) & D_2 f_2(x, y) & D_3 f_2(x, y) & D_4 f_2(x, y) \end{array}\right] = \left[\begin{array}{cc|cc} 2x_1 & -2x_2 & 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ 0 & -1 & e^{y_1 + y_2} & e^{y_1 + y_2} \end{array}\right]. \]
The matrix has been partitioned to emphasize
\[ D_1 f(x, y) = \begin{bmatrix} 2x_1 & -2x_2 \\ 0 & -1 \end{bmatrix} \quad \text{and} \quad D_2 f(x, y) = \begin{bmatrix} 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ e^{y_1 + y_2} & e^{y_1 + y_2} \end{bmatrix}. \]
At the point $(a, b) = (1, 1, 0, 0)$,
\[ D_1 f(a, b) = \begin{bmatrix} 2 & -2 \\ 0 & -1 \end{bmatrix}, \]
which is invertible, since its columns are independent (or since its determinant is not $0$). Thus, in a neighbourhood $W$ of $b$ there is a continuously differentiable function $g$ such that $f(g(y), y) = 0$ for all $y \in W$, and the matrix of $Dg(y)$ is
\[ [Dg(y)] = -[D_1 f(x, y)]^{-1} [D_2 f(x, y)] = -\begin{bmatrix} 2x_1 & -2x_2 \\ 0 & -1 \end{bmatrix}^{-1} \begin{bmatrix} 2y_1 y_2 + y_2^2 & y_1^2 + 2y_1 y_2 \\ e^{y_1 + y_2} & e^{y_1 + y_2} \end{bmatrix}, \quad \text{where } x = g(y). \]

54.1. Show that the continuity of $f'$ is needed in the inverse function theorem, even in the case $n = 1$: Let
\[ f(t) = \begin{cases} t + 2t^2 \sin(1/t), & t \neq 0, \\ 0, & t = 0. \end{cases} \]
Then, $f'(0) = 1$, $f'$ is bounded in $(-1, 1)$, but $f$ is not one-to-one in any neighbourhood of $0$.

54.2. Let $f$ be a real valued function on an open set $G$ of $\mathbb{R}^2$ such that the partial derivatives $D_1 f$ and $D_2 f$ exist and are bounded in $G$. Prove that $f$ is continuous.

54.3. Investigate how the inverse function theorem applies to the function $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[ f(x_1, x_2) = (x_1^2 + x_2^2,\ 2x_1 x_2). \]

55. APPENDIX: COUNTABILITY

Recall that a function $f : A \to B$ is called one-to-one or an injection if $f(a) = f(a')$ implies $a = a'$. It is called onto or a surjection if, for each $b \in B$, there exists $a \in A$ with $f(a) = b$, and is called a bijection if it is both an injection and a surjection (one-to-one and onto).

2 Definition. Two sets $A$ and $B$ are called equinumerous if there exists a bijection $f : A \to B$. Notation: $A \leftrightarrow B$. We also say $A$ is equinumerous with $B$, or that $A$ and $B$ are in one-to-one correspondence.

55.1 Example. The set $\mathbb{N} = \{1, 2, \dots\}$ is equinumerous with $\{m \in \mathbb{N} : m \ge 2\} = \{2, 3, 4, \dots\}$.

The idea of this example is basic and IMPORTANT; it is used over and over again! Let us put $A = \{m \in \mathbb{N} : m \ge 2\}$. Since we have no theorems to use yet, to show $\mathbb{N} \leftrightarrow A$, we must find a bijection from $\mathbb{N}$ onto $A$. We simply let $f(n) = n + 1$, for $n \in \mathbb{N}$. Certainly $f : \mathbb{N} \to A$, because if $n \in \mathbb{N}$ then $n + 1 \ge 1 + 1 = 2$. To show that $f$ is injective (i.e. one-to-one), let $f(n) = f(n')$. Then $n + 1 = n' + 1$, so $n = n'$. To prove $f$ is surjective, that is, $f$ maps onto $A$, let $m \in A$. Then $m \in \mathbb{N}$ and $m > 1$, so $m - 1 \in \mathbb{N}$ (one of our first theorems of natural numbers) and
\[ f(m - 1) = m - 1 + 1 = m. \]

55.2 Example. The set of natural numbers is equinumerous with the set of even natural numbers.

To see this, recall that a natural number $m$ is called even if it is divisible by $2$. That, in turn, means that there exists $n \in \mathbb{N}$ such that $m = 2n$.
Thus, the set of even numbers is exactly $\{2n : n \in \mathbb{N}\} = 2\mathbb{N}$. A bijection on $\mathbb{N}$ to $2\mathbb{N}$ is given by $f(n) = 2n$, for $n \in \mathbb{N}$. Certainly for all $n \in \mathbb{N}$, $f(n)$ is even and we have just checked that $f$ is onto. To see that $f$ is one-to-one, put $f(n_1) = f(n_2)$. Then, $2n_1 = 2n_2$, so $n_1 = n_2$ as required.

Equinumerosity is an equivalence relation on the class of all sets. That is:

55.3 Lemma. For sets $A, B, C$:
(a) $A \leftrightarrow A$.
(b) $A \leftrightarrow B$ implies $B \leftrightarrow A$.
(c) $A \leftrightarrow B$ and $B \leftrightarrow C$ imply $A \leftrightarrow C$.

Proof. (a) For any set $A$, the identity mapping defined by $f(x) = x$, for all $x \in A$, is a bijection. Indeed, if $f(x) = f(x')$ then $x = x'$, just by definition (so $f$ is an injection), and if $x \in A$ then $x = f(x)$ (so $f$ is a surjection).

(b) Suppose $f : A \to B$ is a bijection. Then $f^{-1} : B \to A$ is also a bijection. Let us review the proof of this. From the definitions, $f(x) = y$ iff $x = f^{-1}(y)$. [For a given $y$ in the range of $f$, the fact that $f$ is one-to-one gives us exactly one $x$ such that $f(x) = y$, and this $x$ is defined to be $f^{-1}(y)$.] Now if $y, y' \in B$ and $f^{-1}(y) = f^{-1}(y')$, then $f(f^{-1}(y)) = f(f^{-1}(y'))$, so $y = y'$. Thus, $f^{-1}$ is injective. To see that it is surjective, let $x \in A$. Then $f(x)$ is some point $y$ of $B$ and thus $x = f^{-1}(y)$.

(c) is left as an exercise.

55.4 Lemma. The only set equinumerous with $\emptyset$ is itself.

Proof. Suppose $A \leftrightarrow \emptyset$. Then there is a bijection $f : A \to \emptyset$. If $A \neq \emptyset$, choose $a \in A$; then $f(a) \in \emptyset$, which is impossible.

55.5 Lemma (Pigeonhole Principle). For natural numbers $m$ and $n$, if $m < n$, then there is no injection $f : \{1, \dots, n\} \to \{1, \dots, m\}$.

Proof. We prove this by induction on $m$. Thus we let $P(m)$ denote "For all $n \in \mathbb{N}$, if $m < n$ then there is no injection $f : \{1, \dots, n\} \to \{1, \dots, m\}$".

This is true for $m = 1$, because if $n \in \mathbb{N}$ and $f : \{1, \dots, n\} \to \{1\}$ is one-to-one, then $f(n) = f(1)$, which implies $n = 1$.

Assume it is true for $m$: Let $n \in \mathbb{N}$ and $n > m + 1$. Suppose there were an injection $f : \{1, \dots, n\} \to \{1, \dots, m + 1\}$.
Since there is no injection of $\{1, \dots, n\}$ into $\{1, \dots, m\}$, there must be a $p \in \{1, \dots, n\}$ with $f(p) = m + 1$. Fix such a $p$ and put $g(k) = f(k)$, for $k < p$, and $g(k) = f(k + 1)$, for $k = p, \dots, n - 1$. Then $g$ injects $\{1, \dots, n - 1\}$ into $\{1, \dots, m\}$, which is a contradiction. Thus, no such $f$ exists: the statement is true for $m + 1$. By the PMI, $P(m)$ holds for all $m \in \mathbb{N}$. That is, for all $m, n \in \mathbb{N}$, if $m < n$, then there is no injection $f : \{1, \dots, n\} \to \{1, \dots, m\}$.

55.6 Corollary. There is no injection of $\{1, \dots, n\}$ onto a proper subset of itself.

Proof. Suppose $h$ were an injection on $\{1, \dots, n\}$ to itself which is not surjective. Since $h$ cannot map $\{1, \dots, n\}$ into $\{1, \dots, n - 1\}$, there is some $i$ with $h(i) = n$, and since $h$ is not surjective, there is some other $k \in \{1, \dots, n\} \setminus (\text{range } h)$. Define
\[ f(j) = \begin{cases} h(j), & j \neq i, \\ k, & j = i. \end{cases} \]
Then $f$ injects $\{1, \dots, n\}$ into $\{1, \dots, n - 1\}$, which is impossible.

3 Definition. The empty set is said to have $0$ elements. For $n \in \mathbb{N}$, $A$ is said to have $n$ elements if $A \leftrightarrow \{1, \dots, n\}$.

We prove now that this $n$ is unique. It is called the cardinality of $A$ (or the number of elements in $A$), denoted $\operatorname{card}(A)$ or $\#(A)$.

55.7 Theorem. If $n, m \in \mathbb{N}$ and $A \leftrightarrow \{1, \dots, n\}$ and $A \leftrightarrow \{1, \dots, m\}$, then $m = n$. In other words, if $A$ has $m$ elements and $A$ has $n$ elements, then $m = n$.

Proof. Let $n, m \in \mathbb{N}$, and suppose $A \leftrightarrow \{1, \dots, n\}$ and $A \leftrightarrow \{1, \dots, m\}$. Then, from the equivalence relation properties of $\leftrightarrow$,
\[ \{1, \dots, n\} \leftrightarrow \{1, \dots, m\}. \]
If $m \neq n$, one must be smaller. Say $m < n$. But, by the Pigeon-hole Principle, there can be no injection of $\{1, \dots, n\}$ into $\{1, \dots, m\}$, so no bijection. This is a contradiction, yielding $m = n$.

4 Definition. A set is called
finite, if it has $n$ elements, for some $n \in W = \mathbb{N} \cup \{0\}$,
infinite, if it is not finite,
denumerable, if it is equinumerous with $\mathbb{N}$,
countable, if it is finite or denumerable.
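The Pigeonhole Principle (Lemma 55.5) and Corollary 55.6, on which the uniqueness of cardinality rests, are statements about finite sets, so small cases can be checked by exhaustive enumeration. The sketch below is illustrative only and proves nothing beyond the cases it enumerates; the function names and the bound on $n$ are assumptions made for the example.

```python
from itertools import product

def injections(n, m):
    """Yield every injective map {1..n} -> {1..m}, as the tuple (f(1), ..., f(n))."""
    for values in product(range(1, m + 1), repeat=n):
        if len(set(values)) == n:      # all n values distinct means injective
            yield values

def count_injections(n, m):
    """Number of injective maps {1..n} -> {1..m}."""
    return sum(1 for _ in injections(n, m))

# Lemma 55.5 for 1 <= m < n <= 5: no injection {1..n} -> {1..m} exists.
no_injection_when_m_lt_n = all(
    count_injections(n, m) == 0
    for n in range(1, 6) for m in range(1, n)
)

# Corollary 55.6 for n <= 5: every injection of {1..n} into itself is onto.
self_injections_are_onto = all(
    set(values) == set(range(1, n + 1))
    for n in range(1, 6) for values in injections(n, n)
)
```

When $m \ge n$ injections do exist; for instance `count_injections(2, 4)` counts the $4 \cdot 3$ ordered choices of two distinct values.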
55.8 Theorem (Also referred to as the Pigeon-hole Principle). If $m, n \in \mathbb{N}$, $A$ has $n$ elements, $B$ has $m$ elements and $m < n$, then there is no injection of $A$ into $B$.

Proof. Exercise.

We have shown that the set $\{2, 3, \dots\}$ is denumerable, as is the set of even numbers. The reader should check that the set of odd numbers is denumerable, and so is the set $\mathbb{Z}$ of integers $\{\dots, -2, -1, 0, 1, 2, \dots\}$.

55.9 Theorem.
(1) The set $\mathbb{N}$ is infinite; hence, all denumerable sets are infinite.
(2) Equivalent for a set $A$ are:
(a) $A$ is infinite;
(b) $A$ contains a denumerable subset;
(c) $A$ is equinumerous with a proper subset of itself.

Proof. (1) If $\mathbb{N}$ were not infinite, then since it is not $\emptyset$, there would be a bijection $f : \mathbb{N} \to \{1, \dots, n\}$, for some $n \in \mathbb{N}$. But then, if we let $g$ be the restriction of $f$ to $\{1, \dots, n + 1\}$:
\[ g(x) = f(x), \quad \text{for } x \in \{1, \dots, n + 1\}, \]
then $g$ is an injection of $\{1, \dots, n + 1\}$ into $\{1, \dots, n\}$, which violates the Pigeon-hole Principle.

For the second statement, suppose $A$ is denumerable, but finite. Then $\mathbb{N} \leftrightarrow A$. If $A$ were empty we would have $\mathbb{N} \leftrightarrow \emptyset$, hence $\mathbb{N} = \emptyset$ by Lemma 55.4, and if $n \in \mathbb{N}$ with $A \leftrightarrow \{1, \dots, n\}$, we would have $\mathbb{N} \leftrightarrow \{1, \dots, n\}$, which cannot happen since $\mathbb{N}$ is infinite.

(2) ((a) $\Rightarrow$ (b)) Assume (a) holds; that is, $A$ is infinite. Then, $A$ is not empty, so we may choose a point $x_1 \in A$. Suppose $x_1, \dots, x_k$ have been chosen as distinct (that is, with $x_i \neq x_j$ if $i \neq j$) elements of $A$. Then $\{x_1, \dots, x_k\} \subset A$, and $A \neq \{x_1, \dots, x_k\}$; otherwise $A$ would be finite. Thus $A \setminus \{x_1, \dots, x_k\} \neq \emptyset$, and we may choose another $x_{k+1} \in A \setminus \{x_1, \dots, x_k\}$. Thus, $x_1, \dots, x_{k+1}$ are distinct elements of $A$.

By recursion (definition by induction), we thus have proved the existence of a sequence of elements $x_i \in A$, $i \in \mathbb{N}$, such that $x_i \neq x_j$ if $i \neq j$. (Indeed, if $i < j$, all the elements $x_1, \dots, x_j$ are distinct.) This says that the map $g : i \mapsto x_i$ is one-to-one on $\mathbb{N}$ into $A$. Let $B$ be the range of $g$, that is, $B = \{x_i : i \in \mathbb{N}\}$.
Then $g$ is onto $B$, so $g$ is a bijection of $\mathbb{N}$ onto $B$. That is, $B$ is a denumerable subset of $A$.

((b) $\Rightarrow$ (c)) Suppose $B$ is a denumerable subset of $A$, say $i \mapsto x_i$ is an injection on $\mathbb{N}$ onto $B$. We now define a map $f$ on $A$ by
\[ f(x) = \begin{cases} x, & \text{if } x \in A \setminus B, \\ x_{i+1}, & \text{if } x = x_i, \text{ for some } i \in \mathbb{N}. \end{cases} \]
This $f$ is one-to-one on $A$ onto $A \setminus \{x_1\}$, a proper subset of $A$.

Indeed, first, notice that if $x \in A$, then either $x \in A \setminus B$, in which case $f(x) = x$ cannot be $x_1$, or $x$ is some $x_i \in B$, and $f(x) = x_{i+1}$. But $x_{i+1}$ cannot be $x_1$; otherwise $i + 1 = 1$, since $i \mapsto x_i$ is one-to-one. This is impossible, since each natural number is $\ge 1$. This proves the range of $f$ is contained in $A \setminus \{x_1\}$ (which is all we really need to say that $f$ maps to a proper subset of $A$; but we go on anyway, since the technique of proof is of use elsewhere).

If $a \in A \setminus \{x_1\}$, either $a \in A \setminus B$, so $f(a) = a \in A \setminus \{x_1\}$, or $a = x_{i+1}$ for some $i$, and then $a = f(x_i)$. This shows the range of $f$ is exactly $A \setminus \{x_1\}$.

To show $f$ is injective, we let $f(x) = f(x')$ and show $x = x'$. There are three cases: both $x, x' \in A \setminus B$, both are in $B$, or one is in $A \setminus B$ and the other is in $B$. In the first case, $f(x) = x$ and $f(x') = x'$ and $f(x) = f(x')$, so $x = x'$. In the second case, $x = x_i$ and $x' = x_j$, for some $i, j \in \mathbb{N}$. Thus, $f(x) = x_{i+1}$ and $f(x') = x_{j+1}$, so $x_{i+1} = x_{j+1}$. But the various $x_k$ are distinct, so $i + 1 = j + 1$, and therefore $i = j$. The third case cannot occur, since then (say) $f(x) \in A \setminus B$ and $f(x') \in B$.

((c) $\Rightarrow$ (a)) Let $A$ be equinumerous with a proper subset $B$ of itself. (Thus $B \subset A$, $A \setminus B \neq \emptyset$, and $A \leftrightarrow B$.) Suppose $A$ were finite. Then $A \neq \emptyset$, since $A \setminus B \neq \emptyset$, so $A \leftrightarrow \{1, \dots, n\}$ for some $n \in \mathbb{N}$. Fix such an $n$. Let $f : A \to B$ be a bijection and $g : \{1, \dots, n\} \to A$ a bijection. Then $g^{-1} \circ f \circ g$ is still one-to-one (why?). Moreover, it maps $\{1, \dots, n\}$ onto a proper subset of itself, which is impossible by the corollary to the Pigeon-hole Principle. Hence, $A$ is infinite.
In detail, since $B$ is a proper subset of $A$, there exists an $a \in A \setminus B$. Then, there is an $i$ with $g(i) = a$. If there existed $j$ with $g^{-1} \circ f \circ g(j) = i$, we would have $f(g(j)) = g(i) = a$, which contradicts the fact that $f$ maps to $B$. Thus the range of $g^{-1} \circ f \circ g$ is not all of $\{1, \dots, n\}$.

55.10 Example. If $a < b$, then $[a, b]$ is infinite.

One proof of this would use (b). For each $n \in \mathbb{N}$, let $x_n = a + \frac{b - a}{n}$. Then the set $\{x_n : n \in \mathbb{N}\}$ is denumerable and contained in $[a, b]$.

Another could use (c): Let $c = \frac{a + b}{2}$. Then $[a, c]$ is a proper subset of $[a, b]$, yet the function on $[a, c]$ defined by $f(x) = 2(x - a) + a$ maps $[a, c]$ bijectively onto $[a, b]$.

For convenience of reference, we now list the facts about countability most commonly used in Analysis, in the form of 4 theorems, and give the proofs afterward.

55.11 A.
(1) Every subset of a finite set is finite.
(2) Every subset of a countable set is countable.
(3) The image under any map of a finite set is finite.
(4) The image under any map of a countable set is countable.

55.12 B. For a non-empty set $A$, equivalent are:
(a) $A$ is countable.
(b) There is a surjection $f : \mathbb{N} \to A$.
(c) There is an injection $g : A \to \mathbb{N}$.

55.13 C. The following are countable:
$\mathbb{N} \times \mathbb{N}$, hence the product of two countable sets;
the union of a countable number of countable sets;
the integers;
the rationals;
the algebraic numbers.

55.14 D (Theorem 11.3 of text). Each non-degenerate interval of $\mathbb{R}$ is uncountable, including $\mathbb{R}$ itself.

Proof of A(1). Let $A$ be finite and $B \subset A$. If $B$ is infinite, then it contains a denumerable subset $D$. But then $D \subset A$ also, so $A$ is also infinite.

Proof of A(2). (Every subset of a countable set is countable.) Let $A$ be countable and $B \subset A$. If $A$ is finite, then so is $B$, by (1). So we may assume $A$ is denumerable. Also, if $B$ is finite we are done, so we also assume $B$ is infinite. Let $A = \{a_1, a_2, \dots\}$, where the $a_n$ are distinct.
Now $B$ is infinite, so the set $M := \{n \in \mathbb{N} : a_n \in B\}$ is not empty; in fact it is infinite (why?). By the Well-ordering Property of $\mathbb{N}$, $M$ has a least element. So let $n_1$ be the first $n$ in $M$. Thus
\[ i < n_1 \text{ implies } i \notin M. \qquad (*) \]
Continuing recursively, suppose $n_1, \dots, n_k$ have been chosen in $M$; we choose $n_{k+1}$ to be the first element $n$ of $M \setminus \{n_1, \dots, n_k\}$. This defines a map $k \mapsto n_k$ of $\mathbb{N}$ to $M$. At each stage,
\[ \text{if } n \le n_k \text{ and } n \in M, \text{ then } n \in \{n_1, \dots, n_k\}. \qquad (**) \]
Thus $n_{k+1} > n_k$ and we have that $k \mapsto n_k$ is one-to-one on $\mathbb{N}$ onto $M$; hence, $k \mapsto a_{n_k}$ is one-to-one on $\mathbb{N}$ onto $B$.

Here are some details of the last paragraph. The statement $(**)$ is proved by induction on $k$: It is true for $k = 1$, since $n_1$ is the first element of $M$ (see $(*)$). So suppose the statement is true for $k$. Then the elements of $M \setminus \{n_1, \dots, n_k\}$ are all $> n_k$. Since $n_{k+1}$ is one of these, $n_{k+1} > n_k$, and since it is the smallest one, if $n \in M$ with $n \le n_{k+1}$, it has to be $n_{k+1}$ or else $n \le n_k$. In the second case, $n \in \{n_1, \dots, n_k\}$, by the inductive hypothesis. Thus, in both cases, if $n \le n_{k+1}$ and $n \in M$, then $n \in \{n_1, \dots, n_{k+1}\}$, which is the statement for $k + 1$. Thus, by the PMI, $(**)$ is true for all $k$.

Now, the map $i \mapsto n_i$ is one-to-one, for if $i \neq k$, say $i < k$, then $n_i \in \{n_1, \dots, n_{k-1}\}$ and $n_k$ is not in this set.

Finally, to prove the map $i \mapsto n_i$ is onto $M$, let $m \in M$. Since $n_k < n_{k+1}$ for all $k$, we have $n_k \ge k$, for all $k \in \mathbb{N}$. (This is a simple induction exercise.) Thus $m \in M$ and $m \le n_m$, so $m \in \{n_1, \dots, n_m\}$, by $(**)$.

As we said, the results A, B, C, and D above are listed together for ease of reference, but it is not necessarily the best order to prove them. We will come back to the rest of Theorem A later, and for now turn to Theorem B.

Proof of Th. B. Let $A \neq \emptyset$.

(a) $\Rightarrow$ (b). Let $A$ be countable. Then either $A$ is denumerable, or $A$ is finite. If $A$ is denumerable, then there is a bijection $f : \mathbb{N} \to A$, and this is certainly a surjection. So, we are left with the case $A$ is finite.
Then, there is an $n \in \mathbb{N}$ and a bijection $i \mapsto a_i$ on $\{1, \dots, n\}$ onto $A$. Thus $A = \{a_1, \dots, a_n\}$. We simply define $f : \mathbb{N} \to A$ by
\[ f(j) = \begin{cases} a_j, & \text{if } j \le n, \\ a_1, & \text{if } j > n. \end{cases} \]
Clearly this map is onto, since each of the elements of $A$ was already one of the $a_i$ for $i \le n$.

(b) $\Rightarrow$ (c). Assume $f : \mathbb{N} \to A$ is surjective. For each $a \in A$, there is an element $n \in \mathbb{N}$ with $f(n) = a$. Choose one such $n$ (say the first one) and call it $n_a$. Then the map $a \mapsto n_a$ is on $A$ into $\mathbb{N}$. This map is injective because if $n_a = n_{a'}$, then $f(n_a) = f(n_{a'})$; that is, $a = a'$.

(c) $\Rightarrow$ (a). Now let $g : A \to \mathbb{N}$ be injective. Then $g[A] \subset \mathbb{N}$ and $g : A \to g[A]$ is one-to-one and onto. Thus $A$ is equinumerous with $g[A]$, and $g[A]$ is countable, since it is a subset of a countable set.

Proof of A(4). Suppose $A$ is countable and non-empty, and let $g$ be any mapping. We have to show that $g[A]$ is countable. By Theorem B(b) there is a surjection $f$ of $\mathbb{N}$ onto $A$, and then the composite map $g \circ f$ maps $\mathbb{N}$ onto $g[A]$, so $g[A]$ is also countable.

It is left as an exercise to state and prove an analogue of Theorem B for finite sets and use it to prove A(3), that the image of any finite set is finite.

Proof that $\mathbb{N} \times \mathbb{N}$ is countable.

Method 1: The map $f : (m, n) \mapsto m + (m + n - 2)(m + n - 1)/2$ is one-to-one on $\mathbb{N} \times \mathbb{N}$ onto $\mathbb{N}$. This is "diagonal enumeration".

Onto is proved by induction: $f(1, 1) = 1$. Supposing $f(m, n) = k$, either $n = 1$, in which case $f(1, m + 1) = k + 1$; otherwise $f(m + 1, n - 1) = k + 1$.

To prove one-to-one: suppose $f(m, n) = f(m', n')$. If $m + n = m' + n'$, then $m = m'$ and then $n = n'$; if $m + n > m' + n' = N$, then
\[ f(m, n) \ge m + \frac{(N + 1 - 2)(N + 1 - 1)}{2} = m + \frac{(N - 1)N}{2} = m + (N - 1) + \frac{(N - 2)(N - 1)}{2} > f(m', n'). \]

Method 2: By B(c), it is enough to show that there is an injection $g$ of $\mathbb{N} \times \mathbb{N}$ into $\mathbb{N}$. One such is given by
\[ g : (m, n) \mapsto 2^{m-1}(2n - 1), \quad \text{for } (m, n) \in \mathbb{N} \times \mathbb{N}. \]
If $g(m, n) = g(m', n')$ then
\[ 2^{m-1}(2n - 1) = 2^{m'-1}(2n' - 1). \]
From very basic number theory, this implies $2^{m-1} = 2^{m'-1}$ and $2n - 1 = 2n' - 1$ (otherwise you could prove that an even number was equal to an odd number), and then $m = m'$ and $n = n'$; that is, $(m, n) = (m', n')$.

The mapping $g$ is actually also onto. For a given $k \in \mathbb{N}$, $k = g(m, n)$ where $m$ is the greatest natural number for which $2^{m-1}$ divides $k$ (leaving the odd number $2n - 1$ as a quotient). It is interesting to draw a picture numbering pairs $(m, n)$ this way. Try it.

Another common choice for this method is to use $h(m, n) = 2^m 3^n$. It is still injective, but certainly not onto $\mathbb{N}$.

The reader can easily deduce from the countability of $\mathbb{N} \times \mathbb{N}$ that the cartesian product of two countable sets is countable; that is, if $A$ and $B$ are countable, so is $A \times B$.

Proof that a countable union of countable sets is countable. If $I$ is countable ($\neq \emptyset$) and for each $i \in I$, $A_i$ is countable (and $\neq \emptyset$), then there is a map $g$ on $\mathbb{N}$ onto $I$ and, for each $i \in I$, there is a map $h_i$ on $\mathbb{N}$ onto $A_i$. Then the map $f$ defined by $f(m, n) = h_{g(m)}(n)$ maps the countable set $\mathbb{N} \times \mathbb{N}$ onto $\bigcup_{i \in I} A_i$; hence the latter image is also countable.

Proof that the set of rationals is countable. The set $\mathbb{Z}$ of integers is the union of the non-negative and the negative integers, so is countable. The set of rationals is the image of the countable set $\mathbb{Z} \times \mathbb{N}$ under the map $(m, n) \mapsto m/n$, so is countable by A(4).
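The two enumerations of $\mathbb{N} \times \mathbb{N}$ used above (Method 1, diagonal enumeration, and Method 2, $g(m, n) = 2^{m-1}(2n - 1)$) can be spot-checked on a finite grid. This is a minimal sketch for experimentation, not a proof; the function names, the explicit inverse `power_odd_inverse`, and the grid bound `LIMIT` are assumptions introduced for the example.

```python
def diagonal(m, n):
    """Method 1: f(m, n) = m + (m + n - 2)(m + n - 1)/2."""
    return m + (m + n - 2) * (m + n - 1) // 2

def power_odd(m, n):
    """Method 2: g(m, n) = 2^(m-1) * (2n - 1)."""
    return 2 ** (m - 1) * (2 * n - 1)

def power_odd_inverse(k):
    """Recover (m, n) from k = 2^(m-1) * (2n - 1) by stripping factors of 2."""
    m = 1
    while k % 2 == 0:
        k //= 2
        m += 1
    return (m, (k + 1) // 2)

LIMIT = 40
pairs = [(m, n) for m in range(1, LIMIT + 1) for n in range(1, LIMIT + 1)]

# Method 1 is injective on the grid, and the diagonals m + n <= LIMIT + 1
# fill the initial segment 1, ..., LIMIT(LIMIT + 1)/2 with no gaps.
diag_values = [diagonal(m, n) for m, n in pairs]
diag_injective = len(set(diag_values)) == len(pairs)
diag_fills_segment = set(range(1, LIMIT * (LIMIT + 1) // 2 + 1)) <= set(diag_values)

# Method 2 round-trips in both directions on the ranges tested.
g_round_trip = all(power_odd_inverse(power_odd(m, n)) == (m, n) for m, n in pairs)
g_onto_segment = all(power_odd(*power_odd_inverse(k)) == k for k in range(1, 1001))
```

Printing `power_odd_inverse(k)` for $k = 1, 2, 3, \dots$ gives one way to draw the picture suggested in the text: the odd numbers run along $m = 1$, twice the odd numbers along $m = 2$, and so on.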